Data Management

The MBLWHOI Library is committed to assisting its patrons with data management.


Data Management

Data management is the organization and planning for data throughout the research cycle. It encompasses a set of activities that are essential to the short- and long-term access and use of research data. It involves planning for the creation, storage, use, security, and continued access to data.

Data are the raw, analyzed, or derived results of observations, experiments, and simulations. Data can be either analog or digital and can exist in different formats, including but not limited to text, numerical, multimedia, and instrument-specific.

Managing data throughout the research life cycle increases the efficiency of a research project, meets expectations for the ethical conduct of research, and is rapidly becoming mandatory for many funding agencies.


Three Things You Can Do Today to Help Manage Your Data

1. Back up, back up, back up.

Think of what it would take to reproduce your data. To make sure you don't lose them, strive to keep three copies: the original master file, a local backup (e.g., on an external hard drive), and an external backup (e.g., on a networked drive or a web-based storage service).
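The three-copy approach above can be sketched in a few lines of Python. This is a minimal illustration, not a Library-provided tool: the function names and the idea of verifying each copy with a SHA-256 checksum are assumptions added here, and the backup locations are whatever directories you pass in.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(master: Path, backup_dirs: list[Path]) -> list[Path]:
    """Copy the master file into each backup directory and verify each copy."""
    expected = sha256_of(master)
    copies = []
    for directory in backup_dirs:
        directory.mkdir(parents=True, exist_ok=True)
        copy = directory / master.name
        shutil.copy2(master, copy)  # copy2 also tries to preserve timestamps
        if sha256_of(copy) != expected:
            raise OSError(f"Backup at {copy} does not match the master file")
        copies.append(copy)
    return copies
```

Calling back_up with one directory on an external hard drive and one on a networked drive would give the master file plus two verified copies. The checksum comparison guards against a copy that was silently truncated or corrupted.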

2. Organize your data.

Plan the directory structure and file naming conventions before creating your data, taking into consideration the potential need to track versions of data sets and documents. Follow any existing project-specific conventions, disciplinary standards, or best practices.
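One way to keep file names consistent is to generate them from a small function rather than typing them by hand. The sketch below assumes a hypothetical pattern of project, description, ISO date, and two-digit version number; that pattern and the function names are illustrative only, so substitute your project's or discipline's own convention where one exists.

```python
import re
from datetime import date


def slug(text: str) -> str:
    """Lowercase the text and replace runs of other characters with a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project: str, description: str, created: date,
                   version: int, extension: str) -> str:
    """Build a name like project_description_YYYY-MM-DD_vNN.ext.

    The pattern is an illustrative assumption, not a prescribed standard.
    """
    if version < 1:
        raise ValueError("version numbers start at 1")
    parts = [slug(project), slug(description), created.isoformat(),
             f"v{version:02d}"]
    return "_".join(parts) + "." + extension.lstrip(".").lower()


# Hypothetical project and data set names, for illustration only.
print(build_filename("Cruise 42", "CTD profiles", date(2024, 3, 15), 2, ".CSV"))
# prints cruise-42_ctd-profiles_2024-03-15_v02.csv
```

Putting the date in ISO order and zero-padding the version number means an alphabetical directory listing also sorts the files chronologically and by version.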

3. Document your data.

Data documentation, also known as metadata, will help you use and understand your research data in the future. If you plan to share your data, it will also help others find, use, and properly cite them. At a minimum, create a readme.txt file that includes basic documentation such as title, creator, identifier, rights/access information, dates, location, and methodology.
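A readme.txt with the minimum fields listed above can be written by a short script, which also makes it easy to notice when a field has been left out. This is a sketch under stated assumptions: the field labels mirror the list above, but the "Label: value" layout and the function name are choices made here for illustration, not a required format.

```python
from pathlib import Path

# Field labels taken from the minimum documentation listed above.
REQUIRED_FIELDS = [
    "Title", "Creator", "Identifier", "Rights/Access",
    "Dates", "Location", "Methodology",
]


def write_readme(directory: Path, metadata: dict[str, str]) -> Path:
    """Write readme.txt into the directory, one 'Label: value' line per field."""
    missing = [f for f in REQUIRED_FIELDS if not metadata.get(f, "").strip()]
    if missing:
        raise ValueError("Missing documentation fields: " + ", ".join(missing))
    lines = [f"{field}: {metadata[field].strip()}" for field in REQUIRED_FIELDS]
    readme = directory / "readme.txt"
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme
```

Calling write_readme with the data directory and a dictionary of the seven fields produces the file; a missing or empty field raises an error instead of silently writing incomplete documentation.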


Assistance with Creating Data Management Plans

Many funders, such as the National Science Foundation, have requirements for data sharing and data management plans. Guidelines for writing a plan and additional resources are available at the WHOI Data Management and Publishing site.


Data Citation

The purpose of data citation is to give attribution to data creators and curators, track provenance and impact of data sets, and aid in reproducibility. The ESIP Federation has put together guidelines for data citation.



The assignment of persistent identifiers enables accurate data citation. The Library can assign a Digital Object Identifier (DOI) to appropriate data sets deposited in WHOAS (Woods Hole Open Access Server), the Institutional Repository (IR). We are particularly interested in working with authors to deposit data sets associated with published articles. Ideally, the DOI is assigned before submission and included in the published paper so readers can link directly to the data set, but DOIs are also assigned to data sets related to articles after publication. WHOAS metadata records link each article to its data sets and each data set to its article.
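A DOI becomes a direct link when it is appended to the https://doi.org/ resolver, which is how a reader gets from a citation to the data set. The sketch below shows that step. The function name and the loose format check are assumptions made for illustration, and the DOI in the example is a made-up placeholder, not a real WHOAS identifier.

```python
import re

# Loose check for the common "10.<registrant>/<suffix>" shape; a heuristic
# assumed for this sketch, not a full validation of DOI syntax.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Turn a DOI, with or without a common prefix, into a resolver URL."""
    cleaned = doi.strip()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"Not a recognizable DOI: {doi!r}")
    # Note: special characters in the suffix are not percent-encoded here.
    return "https://doi.org/" + cleaned


# Placeholder DOI for illustration only.
print(doi_to_url("doi:10.1234/example-dataset"))
# prints https://doi.org/10.1234/example-dataset
```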


The Library has also collaborated with Elsevier to enable linking from ScienceDirect articles to related data sets (when available) in WHOAS.


Other Data in WHOAS

The Library has partnered with researchers to deposit data sets in WHOAS that are not appropriate for national or domain-specific data repositories. These data sets currently include audio, video, text, and JPEG files.


Contact the Library if you are interested in DOIs, have questions about data appropriate to deposit in WHOAS, or have other data management or citation questions.