Best Practices
This guide offers guidance for using and maintaining BioCompute Objects (BCOs) in bioinformatics and computational biology. It promotes consistency, data integrity, and collaboration among professionals, while supporting regulatory compliance and enhancing usability and adoption. It also aims to facilitate the ongoing maintenance and updating of BCOs so that they remain relevant and usable across computational domains.
BCO Creation and Versioning
Intended Audience: BCO authors
- BioCompute IDs are used as persistent URLs. A novel usability domain must result in the creation of a new BCO with a new BCO ID. BCO IDs are immutable upon creation and are never deleted or retired. If a BCO is changed but its usability domain (UD) remains the same, the change results in a new version of the existing BCO rather than a new BCO ID. BCO ID example: OMX_000001
- BCO major and minor versions can be incremented according to documented project or institution policies.
- The BioCompute consortium maintains a database of registered authorities. Registered authorities can assign their reserved prefixes to their own IDs in the object_id field, such as OMX_000001. We encourage everyone to register a prefix at biocomputeobject.org.
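The prefix and identifier structure described above can be sketched in Python. This is a minimal illustration only: the pattern (an uppercase alphanumeric prefix, an underscore, then digits) is inferred from the single example OMX_000001, and the names BCO_ID_PATTERN and split_bco_id are hypothetical, not part of any BioCompute tool.

```python
import re
from typing import Tuple

# Assumed shape, inferred from the example "OMX_000001":
# <PREFIX>_<digits>. The real rules for registered prefixes may differ.
BCO_ID_PATTERN = re.compile(r"^(?P<prefix>[A-Z0-9]+)_(?P<number>\d+)$")


def split_bco_id(bco_id: str) -> Tuple[str, str]:
    """Split a short-form BCO ID into its prefix and numeric part."""
    match = BCO_ID_PATTERN.match(bco_id)
    if match is None:
        raise ValueError(f"not a recognised short-form BCO ID: {bco_id!r}")
    return match["prefix"], match["number"]


prefix, number = split_bco_id("OMX_000001")
print(prefix, number)  # prints "OMX 000001"
```

A check like this only confirms the shape of an ID; whether a prefix is actually registered can only be determined against the consortium's database.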
BCO Metadata
The three metadata fields are filled out at the time of submission. The validity check fills in spec_version with the IEEE URL; etag can be generated by running SHA256 or by entering your own hash value; and object_id is assigned, with the option to choose any prefix associated with the account.
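One way to produce a SHA256 value for etag is sketched below. Two choices here are assumptions rather than anything stated in this guide: the hash is taken over the BCO contents with the three metadata fields removed, and the JSON is serialized with sorted keys and compact separators so that the same content always yields the same hash. The function name compute_etag is hypothetical.

```python
import hashlib
import json

# The three metadata fields named above; assumed to be excluded from the hash.
TOP_LEVEL_METADATA = ("object_id", "spec_version", "etag")


def compute_etag(bco: dict) -> str:
    """Return a SHA-256 hex digest of the BCO body (metadata fields excluded)."""
    body = {key: value for key, value in bco.items()
            if key not in TOP_LEVEL_METADATA}
    # Assumed canonical form: sorted keys, no insignificant whitespace.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


example_bco = {
    "object_id": "OMX_000001",
    "spec_version": "<IEEE URL>",  # placeholder, filled in by the validity check
    "etag": "",
    "usability_domain": ["Example analysis description."],
}
example_bco["etag"] = compute_etag(example_bco)
```

Because the metadata fields are left out of the hashed content, assigning object_id or writing the etag back into the object does not change the value.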
Domain-specific guidance
Provenance Domain
This domain serves as a repository for metadata describing the BCO.
Usability Domain
Authors have access to a text field where they can provide a comprehensive description of the analysis and relevant details.
Extension Domain
Format of how the schema would be defined: Execution domain
Description Domain
It includes a detailed breakdown of the individual steps involved, the external resources essential for each step, and the relationships between input and output objects.
Execution Domain
When recording manual curation, the script field of the execution_domain should link to a Google Document or GitHub markdown file that describes the steps, either programmatically or in a stepwise fashion. Manual curation steps should also be properly documented in the description_domain. An easy way to conceptualize this: the Description Domain is for people, and the Execution Domain is for machines (or programmers).
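As a rough illustration of recording manual curation in both places, the fragment below builds the two relevant parts of a BCO as Python dictionaries. The URL and step text are hypothetical placeholders, the nested layout of script and pipeline_steps reflects our reading of the IEEE-2791 schema and should be checked against the schema itself, and other fields those domains require are omitted.

```python
# Hypothetical link to a document describing the manual curation steps.
CURATION_DOC_URL = "https://example.org/manual-curation-steps.md"

# Execution domain (for machines and programmers): the script entry links
# to the curation document. Other execution_domain fields are omitted here.
execution_domain_fragment = {
    "script": [
        {"uri": {"uri": CURATION_DOC_URL}},
    ],
}

# Description domain (for people): the same manual step, written out in prose.
description_domain_fragment = {
    "pipeline_steps": [
        {
            "step_number": 1,
            "name": "Manual curation",
            "description": "Curator reviews each record and resolves conflicts "
                           "following the linked curation document.",
        },
    ],
}
```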
Parametric Domain
This domain captures any modifications made to parameters from their default values.
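The idea of recording only parameters that differ from their defaults can be sketched as follows. The function name changed_parameters and the example parameter names are hypothetical; the param, value, and step keys reflect our understanding of how parametric domain entries are laid out and should be confirmed against the schema.

```python
from typing import Any, Dict, List


def changed_parameters(defaults: Dict[str, Any],
                       used: Dict[str, Any],
                       step: int) -> List[Dict[str, str]]:
    """List the parameters whose used value differs from the default."""
    entries = []
    for name in sorted(used):
        # Record a parameter if it has no known default or was changed.
        if name not in defaults or defaults[name] != used[name]:
            entries.append(
                {"param": name, "value": str(used[name]), "step": str(step)}
            )
    return entries


defaults = {"seed": 14, "min_length": 50}
used = {"seed": 16, "min_length": 50}
print(changed_parameters(defaults, used, step=1))
# prints [{'param': 'seed', 'value': '16', 'step': '1'}]
```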
Input and Output Domain
This domain serves as a catalog of global input and output files used in the analysis.
Error Domain
This domain can support a “QA/QC rules” subdomain that defines the criteria an output file must meet; an output file that does not pass the appropriate criteria is flagged as an error.
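A QA/QC rule check of this kind might look like the sketch below. The rule format (a metric name with optional min and max bounds), the metric names, and the function check_qc_rules are all hypothetical; the standard does not prescribe this layout.

```python
from typing import Dict, List


def check_qc_rules(metrics: Dict[str, float],
                   rules: List[dict]) -> List[str]:
    """Return one message per rule that the output metrics fail."""
    errors = []
    for rule in rules:
        name = rule["metric"]
        if name not in metrics:
            errors.append(f"{name}: metric missing from output")
            continue
        value = metrics[name]
        if "min" in rule and value < rule["min"]:
            errors.append(f"{name}: {value} is below minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            errors.append(f"{name}: {value} is above maximum {rule['max']}")
    return errors


# Hypothetical rules and output metrics.
rules = [
    {"metric": "mean_coverage", "min": 30},
    {"metric": "duplication_rate", "max": 0.2},
]
metrics = {"mean_coverage": 25, "duplication_rate": 0.1}
for message in check_qc_rules(metrics, rules):
    print("ERROR:", message)
```

An empty result means the output passed every rule; any returned message marks the output as flagged.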
The following domains are optional under the IEEE-2791-2020 standard: Extension Domain, Parametric Domain, Error Domain.
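The split between required and optional parts can be checked with a few lines of Python. This is a sketch, not a substitute for validating against the IEEE-2791 schema: the key names below are our understanding of the JSON keys for each domain, and missing_required is a hypothetical helper.

```python
from typing import List

# Assumed JSON keys for the metadata fields and required domains.
REQUIRED_KEYS = (
    "object_id", "spec_version", "etag",
    "provenance_domain", "usability_domain", "description_domain",
    "execution_domain", "io_domain",
)
# Domains listed above as optional.
OPTIONAL_KEYS = ("extension_domain", "parametric_domain", "error_domain")


def missing_required(bco: dict) -> List[str]:
    """Return the required top-level keys that are absent from the BCO."""
    return [key for key in REQUIRED_KEYS if key not in bco]


draft = {"object_id": "OMX_000001", "usability_domain": ["Example."]}
print(missing_required(draft))
```

This only checks for the presence of top-level keys; the contents of each domain still need schema validation.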
BCO Form-based portal
Intended Audience: BCO tool developers and authors
BCOs can be created using any bioinformatics platform that has BCO read and write functionality. Users who do not have access to such a platform can use the BCO Builder in the BCO Portal, which offers some of the basic API functionality:
- Create a BCO that is conformant to IEEE-2791
- Download and install an instance within an organization’s firewall
- View videos and documentation on tool use