Data produced by biodiversity research projects that evaluate and monitor Good Environmental Status have high potential for use by stakeholders involved in [marine] environmental management. However, the lack of specific scientific objectives, poor organizational logic, and a characteristically disorganized collection of information lead to a decentralized data distribution that hampers environmental research. In such a heterogeneous system, spanning different organizations and data formats, it is difficult to harmonize the outputs efficiently, and few tools are available to assist.
The task of the newly created IndexMeed consortium is to index biodiversity data (and to provide an index of qualified existing open datasets) and to make it possible to build graphs that assist in the analysis of data and in the development of new ways to mine them. Standards (including TDWG) and specific protocols can be applied to interconnect databases. Such semantic approaches greatly increase data interoperability.
The aim of this talk is to present the 2016 IndexMed workshop results (https://indexmed2016.sciencesconf.org) and recent actions of the consortium (renamed “IndexMeed - Indexing for Mining Ecological and Environmental Data”): new approaches to investigate complex research questions and support the emergence of new scientific hypotheses. With one day of plenary sessions and two days of practical workshops, this event was dedicated to the sharing of experience and expertise, the acquisition of practical methods to construct graphs and value data through metadata and "data papers". Recent developments in data mining based on graphs, the potential for important contributions to environmental research, particularly about strategic decision-making, and new ways of organizing data were also discussed at the workshop.
In particular, this workshop promoted decisions on how (i) to analyze heterogeneous data distributed across different databases, (ii) to create matches and incorporate some approximations, (iii) to identify statistical relationships between observed data and the emergence of contextual patterns, and (iv) to encourage openness and the sharing of data, in order to enhance the value of data and their use.
The IndexMeed project participants are now exploring the ability of two scientific communities (ecology sensu lato and computer sciences) to work together. The uses of data from biodiversity research demonstrate the prototype functionalities and introduce new perspectives to analyze environmental and societal responses, including decision-making. The output of the seminar lists scientific questions that can be resolved by the new data mining approaches and proposes new ways to investigate heterogeneous environmental data with graph mining.