Currently Available Plugins

Basic Management and Visualization

class nomenclator(tag, subnames)

Alters the configuration.name of a Case by adding subnames.

This affects the creation of the subfolders in which the rest of the Pipeline will be executed.

Note

On execution, the plugin will not append new subnames when those already exist. For example, if configuration.name is 1QYS_experiment1_naive and subnames=['experiment1', 'naive'], the final configuration.name will still be 1QYS_experiment1_naive and not 1QYS_experiment1_naive_experiment1_naive. This avoids generating recursively nested folders when re-running a previous Pipeline.

Caution

Some keywords cannot be used as a subname because they generate their own first-level subfolders. These keywords are architecture, connectivity, images and summary. Trying to add one of these terms as a subname will generate a NodeDataError at check time.
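
The naming rules described above can be sketched in a few lines of Python. This is only an illustration, not the plugin's code: the function name add_subnames is hypothetical, ValueError stands in for the plugin's own error type, and treating a subname as "already present" when it matches an underscore-separated part of the name is an assumption.

```python
# Hypothetical stand-in for nomenclator.RESERVED_KEYWORDS (values taken from the text above).
RESERVED_KEYWORDS = ("architecture", "connectivity", "images", "summary")


def add_subnames(name, subnames):
    """Return `name` extended with any subnames it does not already contain."""
    if isinstance(subnames, str):
        subnames = [subnames]
    for sub in subnames:
        if sub in RESERVED_KEYWORDS:
            # The plugin raises its own error type; ValueError is a stand-in.
            raise ValueError(f"'{sub}' is a reserved keyword")
    parts = name.split("_")
    for sub in subnames:
        # Assumption: a subname counts as present if it is one of the
        # underscore-separated parts of the current name.
        if sub not in parts:
            parts.append(sub)
    return "_".join(parts)


print(add_subnames("1QYS", ["experiment1", "naive"]))
print(add_subnames("1QYS_experiment1_naive", ["experiment1", "naive"]))
```

Both calls print 1QYS_experiment1_naive, so re-running with the same subnames leaves the name, and therefore the folder layout, unchanged.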

To Developers

When developing a new plugin, if it is expected to create new first level subfolders, they should be listed in the class attribute nomenclator.RESERVED_KEYWORDS. See more on how to develop your own plugins.

Parameters:

subnames (Union[List[str], str]) – Subnames that will be added to the Case initial name.

Raises:
  • NodeOptionsError – On initialization. If a reserved key is provided as a subname.
  • NodeDataError – On check. If the required fields to be executed are not there.

class plotter(tag, outfile=None, prefix=None, plot_types=None, **kwargs)

Create visual representations of the FORM.

Note

Depends on the system.image configuration option

To Developers

This Node overrides the execute function to be able to generate multi-Case images.

Parameters:
  • outfile (Union[str, Path, None]) – Selected outfile/outdir for the plot.
  • prefix (Optional[str]) – File prefix if outfile is a dir.
  • plot_types (Optional[List[str]]) – Select the expected plot (from available options).

For each plot type, a dict with extra parameters can be provided to control plotting. The accepted parameters are those defined for plot_case_sketch().

Raises:
  • NodeOptionsError – On initialization. If an unknown plot is required.
  • NodeDataError – On check. If the required fields to be executed are not there.
  • NodeDataError – On execution. If the requested corrections do not meet the criteria imposed by the FORM.

FORM and Parametric

class builder(tag, connectivity=True, motif=False, pick_aa=None, write2disc=True)

Builds and adds a SKETCH from given FORM string to the Case using ideal SSE elements.

If corrections are available and specified, these will be applied onto the SKETCH.

Caution

In order to apply secondary structure or per layer corrections, the corrector plugin needs to be set in the Pipeline.

Parameters:
  • connectivity (Optional[bool]) – Expected secondary structure connectivity. Important: at the moment only a single connectivity is supported (default: True).
  • motif (Optional[bool]) – Expected Motif to be added to the SKETCH (default: False).
  • pick_aa (Optional[str]) – Desired amino acid type to use for the SKETCH sequence. If not specified, it will pseudorandomly assign amino acid types based on secondary structure propensity scores.
  • write2disc (Optional[bool]) – Write the SKETCH to disk (default: True).
Raises:
  • NodeDataError – On check. If the required fields to be executed are not there.
  • NodeDataError – On execution. If the Case contains anything other than one defined connectivity.

class make_topologies(tag, representatives=False, sampling=1)

Creates, explores and evaluates all possible connectivities (topologies) of a given architecture.

Note

Depends on the configuration.defaults.distance.max_loop configuration.

Parameters:
  • representatives (bool) – If True, it will evaluate the explored topologies (default: False).
  • sampling (float) – Fraction of the explored topologies to randomly sample. E.g. sampling = 1 will sample all topologies; sampling = .25 will randomly sample 25% of the topologies (default: 1).
Raises:
  • NodeDataError – On check. If the required fields to be executed are not there.
  • NodeDataError – On execution. If the Case contains anything other than one defined connectivity.

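
How the sampling parameter of make_topologies behaves as a fraction can be sketched as follows. This is a minimal illustration under stated assumptions: sample_topologies, its seed argument and the rounding rule are hypothetical, and the permutations of three made-up secondary structure names only stand in for the topologies the plugin actually explores.

```python
import itertools
import random


def sample_topologies(topologies, sampling=1.0, seed=None):
    """Return a random subset holding roughly `sampling` * 100% of the topologies."""
    if not 0 < sampling <= 1:
        raise ValueError("sampling must be in the interval (0, 1]")
    topologies = list(topologies)
    if sampling == 1 or not topologies:
        return topologies
    # Assumption: round to the nearest integer and keep at least one topology.
    k = max(1, round(len(topologies) * sampling))
    return random.Random(seed).sample(topologies, k)


# Stand-in for the explored connectivities: every ordering of three
# hypothetical secondary structure names.
explored = [".".join(p) for p in itertools.permutations(["A1E", "A2E", "B1H"])]
subset = sample_topologies(explored, sampling=0.5, seed=42)
print(len(explored), len(subset))
```

With six candidate topologies, sampling = 0.5 keeps three of them, while sampling = 1 returns the full list untouched.
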
class corrector(tag, corrections=None)

Applies corrections to the placements of the secondary structures in a FORM.

This affects the placement and angles of the different secondary structures. Corrections are defined through a controlled vocabulary.

Caution

Structural motifs imported through the Node motif_picker will impose constraints on the way structures can move with respect to each other, thus making some corrections impossible to fulfill.

To Developers

Whenever a plugin has to apply geometric and positional changes to the FORM, it should be done through this class.

Parameters:

corrections (Union[str, Dict[~KT, ~VT], Path, List[Union[str, Path]], None]) – Per secondary structure or per layer corrections to be applied.

Raises:
  • NodeDataError – On check. If the required fields to be executed are not there.
  • NodeDataError – On execution. If the requested corrections do not meet the criteria imposed by the FORM.

class motif_picker(tag, source, selection, hotspot, attach, binder=None, identifier=None)

Recovers a motif of interest from a protein structure to map upon a FORM.

Note

Adding a motif and linking it to one or more secondary structures adds several restrictions to the system. These may include relative positioning, sequence order and positioning, amongst others.

To Developers

Apart from adding the metadata.motif_picker field to the general Case, this Node adds information to the metadata field of selected secondary structures. Ignoring these metadata in a Node that alters positioning and orientation of the secondary structures will result in broken functional motifs. Corrections should be applied through the Node type corrector, in which these restrictions are already taken into account.

Parameters:
  • source (Union[Path, str]) – Path to the PDB file of interest containing the motif.
  • selection (str) – Range selection of the motif segments. Positions must be defined by chain and PDB identifier, that is, residue number and insertion code, if any. Thus, a motif could be selected as A:10-25 when it belongs to chain A, while multi-segment motifs are made of selections separated by commas (,).
  • hotspot (str) – Selection of a single position that belongs to the exposed part of the interface. This residue is used as a guide to understand the motif’s placement. This selection has to follow the same format as selection, but without a range.
  • attach (Union[str, List[str]]) – FORM name of the secondary structure to which the segments need to be attached. If there is more than one segment, names should be provided in a list.
  • binder (Optional[str]) – Range selection for the binder. It can be provided as a range in the same format as selection, or as one chain identifier or a comma-separated list of chain identifiers.
  • identifier (Optional[str]) – Name to give the motif. Otherwise a name is picked related to the position of the Node in the Pipeline.
Raises:
  • NodeOptionsError – On initialization. If the provided structure source file does not exist.
  • NodeOptionsError – On initialization. If the number of requested segments does not match the number of secondary structures to attach them.
  • NodeOptionsError – On initialization. If the number of requested segments does not match the shape definition.
  • NodeOptionsError – On initialization. If any secondary structure identifier in attach does not match the TopoBuilder naming system.
  • NodeDataError – On check. If the required fields to be executed are not there.
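
The selection format and the segment/attach consistency requirement described for motif_picker can be illustrated with a small parser. This is a sketch, not the plugin's implementation: parse_selection and pair_segments are hypothetical names, ValueError stands in for NodeOptionsError, the secondary structure names in the example are made up, and writing the insertion code directly after the residue number (e.g. 52B) is an assumption about the exact grammar.

```python
import re

# One segment: chain, then start and end positions with optional insertion codes.
_SEGMENT = re.compile(
    r"^(?P<chain>\w):(?P<start>-?\d+[A-Za-z]?)-(?P<end>-?\d+[A-Za-z]?)$"
)


def parse_selection(selection):
    """Split 'A:10-25,A:40-52B' into (chain, start, end) tuples."""
    segments = []
    for raw in selection.split(","):
        match = _SEGMENT.match(raw.strip())
        if match is None:
            raise ValueError(f"cannot parse segment '{raw}'")
        segments.append((match["chain"], match["start"], match["end"]))
    return segments


def pair_segments(selection, attach):
    """Pair each motif segment with the secondary structure it attaches to."""
    if isinstance(attach, str):
        attach = [attach]
    segments = parse_selection(selection)
    if len(segments) != len(attach):
        # Stand-in for the NodeOptionsError raised on initialization.
        raise ValueError("number of segments and attach names differ")
    return list(zip(segments, attach))


print(parse_selection("A:10-25"))
print(pair_segments("A:10-25,A:40-52B", ["A1H", "A2H"]))
```

Positions are kept as strings so that insertion codes survive the parsing step.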

Loop Building and Identification

Protein Backbone

class imaster(tag, correctives={'rotate': ['x'], 'translate': ['y']}, rmsd=5.0, bin='mid', step=None, subsampling=None, mirror_beta_twist=False, mirror_beta_shear=False, corrections={}, correction_type='network', correction_check=True)

Searches for per secondary structure or per layer corrections by matching layer-wise against a defined MASTER database. If a beta-layer is present, it will be the starting layer to be corrected. Multiple beta-layers are sorted by size (number of strands) and the largest one is set as the starting layer. From there on, all subsequent layers are corrected pairwise, i.e. the previously corrected layer together with the next layer.
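
The layer ordering described above can be sketched as follows. This only illustrates the selection of the starting layer and the pairwise walk; it is not the imaster code. The function names, the dict-based layer description and the explicitly supplied order of the remaining layers are all assumptions, since the text does not define how the layers following the starting one are ordered.

```python
def pick_start_layer(layers):
    """Return the name of the largest all-strand layer, or None if there is none.

    `layers` maps a layer name to the secondary structure types it holds
    ('E' for strand, 'H' for helix).
    """
    beta = {
        name: sses
        for name, sses in layers.items()
        if sses and all(sse == "E" for sse in sses)
    }
    if not beta:
        return None
    return max(beta, key=lambda name: len(beta[name]))


def correction_pairs(start, remaining):
    """Pair each layer with the one corrected just before it."""
    order = [start, *remaining]
    return list(zip(order, order[1:]))


layers = {"A": ["E", "E", "E", "E"], "B": ["H", "H"], "C": ["E", "E"]}
start = pick_start_layer(layers)
print(start)
print(correction_pairs(start, ["B", "C"]))
```

Here layer A (four strands) is chosen over layer C (two strands), and the corrections then proceed over the pairs (A, B) and (B, C).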

Note

Multiple correction schemes can be set. The default correction scheme is layer tilt around the x-axis and layer points (shifts) along the y-axis.

Caution

Depending on the size of the queried MASTER database, this may take a lot of time. If possible, please use the slurm.use configuration. If this is not possible, you may reduce the size of the database by pointing master.pds to fewer structures to search over.

To Developers

Because this Node may be executed externally, the main function is located in the imaster module.

Parameters:
  • rmsd (Optional[float]) – RMSD threshold for master search (default: 5.0).
  • bin (Optional[str]) – Starting bin for master corrections (default: mid).
  • step (Optional[int]) – Path of how the layers are searched for corrections.
  • subsampling (Optional[int]) – Subsample the matches to accelerate calculations.
  • corrections (Optional[Dict[~KT, ~VT]]) – File containing the corrections.
  • mirror_beta_twist (Optional[bool]) – Invert beta sheet twist (default: False).
  • mirror_beta_shear (Optional[bool]) – Invert beta sheet shear between layers (default: False).
  • correction_type (Optional[str]) – Analysis method [‘network’, ‘mode’] to use for corrections (default: ‘network’).
  • correction_check (Optional[bool]) – Check if corrections have been correctly applied across the layers (default: True).
Raises:
  • NodeOptionsError – On initialization. If a reserved key is provided as a subname.
  • NodeDataError – On check. If the required fields to be executed are not there.

class fragment_maker(tag, protocol, script=None)

Creates or mixes fragments that are needed in multiple Rosetta protocols. Multiple ways of creating fragments are possible through different protocols.

Note

Currently, only the loop_fragment protocol is implemented.

Caution

In order to create fragments with the loop_fragment protocol, the loop_fragments plugin needs to be set in the Pipeline.

Parameters:
  • protocol (str) – Fragment creation protocol to be used.
  • script (Union[str, Path, None]) – Rosetta script to pick fragments.
Raises:
  • NodeDataError – On check. If the required fields to be executed are not there.
  • NodeDataError – On execution. If the Case contains anything other than one defined connectivity.
  • NodeMissingError – On execution. If required variable inputs are not there.

Sequence Design

class funfoldes(tag, nstruct=2000, design_nstruct=10, natbias=2.5, profile=False, layer_design=True)

Runs Rosetta's FunFolDes protocol to generate designs.

Caution

Due to the FastDesignMover, this Node may take a lot of time. If possible, please use the slurm.use configuration. If this is not possible, you may reduce the number of decoys to be generated via the nstruct parameter option.

Parameters:
  • nstruct (Optional[int]) – Number of decoys to be generated during assembly stage (default: 2000).
  • design_nstruct – Number of decoys to be generated during design stage (default: 10).
  • natbias (Optional[float]) – Score function bias towards per secondary structure types (default: 2.5).
  • profile (Optional[bool]) – Use a sequence profile derived from structure fragments for design (default: False).
  • layer_design (Optional[bool]) – Whether funfoldes should use a layer design approach (default: True).
Raises:
  • NodeDataError – On check. If the required fields to be executed are not there.
  • NodeMissingError – On execution. If required variable inputs are not there.

class hybridize(tag, nstruct=2000, natbias=2.5, layer_design=True)

Runs Rosetta's hybridize protocol to generate designs.

Caution

Due to the FastDesignMover, this Node may take a lot of time. If possible, please use the slurm.use configuration. If this is not possible, you may reduce the number of decoys to be generated via the nstruct parameter option.

Parameters:
  • nstruct (Optional[int]) – Number of decoys to be generated (default: 2000).
  • natbias (Optional[float]) – Score function bias towards per secondary structure types (default: 2.5).
  • layer_design (Optional[bool]) – Whether the protocol should use a layer design approach (default: True).
Raises:
  • NodeDataError – On check. If the required fields to be executed are not there.
  • NodeMissingError – On execution. If required variable inputs are not there.

Analytics

class statistics(tag, source, stage, analysis, metric=None, **kwargs)

Various statistics on the sequence and structure level are computed depending on available scripts.

Note

Depends on the statistic.molprobity, statistic.tmalign, statistic.trrosetta_repo, statistic.trrosetta_wts and statistic.trrosetta_env configuration options.

Caution

In order to execute this Node, we highly recommend installing trRosetta with all dependencies. The external conda environment can be specified in the statistic.trrosetta_env configuration option.

To Developers

Because it is used in multiple Nodes, the functions that deal with this Node are mostly located in the respective module file, and external scripts are located in this Node's directory.

Parameters:
  • loop_range – Expected loop length is calculated from the Euclidean distance between two secondary structures. This attribute adds a window of loop_range residues below and above the calculated length.
  • source (str) – Plugin the designs come from, e.g. funfoldes.
  • stage (str) – The type of design, e.g. folded or designed.
  • analysis (str) – Geometric or quality assessment.
  • metric (Optional[str]) – Type of geometric or quality assessment.
Raises:
  • NodeDataError – On check. If the required fields to be executed are not there.
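
The loop_range window described in the parameter list can be illustrated with a short calculation. This is a sketch under stated assumptions: loop_length_window is a hypothetical name, and both the conversion from distance to residues (angstrom_per_residue, defaulting to the approximate spacing of consecutive C-alpha atoms) and the rounding up are assumptions, since the text does not say how the expected length is derived from the distance.

```python
import math


def loop_length_window(end_point, start_point, loop_range, angstrom_per_residue=3.8):
    """Return (shortest, longest) loop length allowed between two points.

    `end_point` is where one secondary structure ends and `start_point`
    where the next one begins, both as (x, y, z) coordinates in angstroms.
    """
    distance = math.dist(end_point, start_point)
    # Assumption: round up so the loop is always long enough to span the gap.
    expected = math.ceil(distance / angstrom_per_residue)
    # A loop needs at least one residue, so clamp the lower bound.
    return max(1, expected - loop_range), expected + loop_range


print(loop_length_window((0, 0, 0), (0, 0, 9), loop_range=2, angstrom_per_residue=3.0))
```

With a 9 Å gap and 3 Å per residue, the expected length is 3 residues, so loop_range = 2 yields the window (1, 5).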