
Ground is a data context system under development at U.C. Berkeley. Ground is motivated by our experiences with users of open-source data processing packages like Spark, Hadoop, and Jupyter, and with the software vendors that provide tools and services in those ecosystems. We are aiming to build a flexible, open-source, vendor-neutral system that enables users to reason about what data they have, where that data is flowing to and from, who is using the data, when the data changed, and why and how the data is changing. Among other things, we believe a data context system is particularly useful for data inventory, data usage tracking, model-specific interpretation, reproducibility, interoperability, and collective governance.

Ground is a RESTful service written in Java. It provides APIs to interact with a data model based on a general graph design pattern. The goal of Ground is to provide a common data model and a common set of APIs to represent fully-versioned metadata in a graph structure, as well as lineage relationships between logical objects.
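To make the idea of versioned metadata with lineage more concrete, here is a minimal in-memory sketch in Java. It is not Ground's actual data model or API: the class `VersionedGraphSketch`, its nested types `Version` and `LineageEdge`, and the methods `commit`, `addLineage`, `versionsOf`, and `sourcesOf` are all hypothetical names chosen for illustration. The sketch assumes only that each logical object has an append-only history of versions and that lineage is recorded as directed edges between specific versions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not Ground's real data model or API. */
public class VersionedGraphSketch {

    /** An immutable snapshot of the metadata for one logical object. */
    public static final class Version {
        public final String objectName;        // logical object this version belongs to
        public final int number;               // 1-based position in the object's history
        public final Map<String, String> tags; // metadata captured at commit time

        Version(String objectName, int number, Map<String, String> tags) {
            this.objectName = objectName;
            this.number = number;
            // Defensive copy so later changes to the caller's map do not alter history.
            this.tags = Collections.unmodifiableMap(new HashMap<>(tags));
        }

        public String id() {
            return objectName + "@v" + number;
        }
    }

    /** A directed lineage edge: the version `toId` was derived from `fromId`. */
    public static final class LineageEdge {
        public final String fromId;
        public final String toId;

        LineageEdge(String fromId, String toId) {
            this.fromId = fromId;
            this.toId = toId;
        }
    }

    private final Map<String, List<Version>> history = new HashMap<>();
    private final List<LineageEdge> lineage = new ArrayList<>();

    /** Appends a new version of the object; earlier versions are never overwritten. */
    public Version commit(String objectName, Map<String, String> tags) {
        List<Version> versions = history.computeIfAbsent(objectName, k -> new ArrayList<>());
        Version v = new Version(objectName, versions.size() + 1, tags);
        versions.add(v);
        return v;
    }

    /** Records that `to` was derived from `from`. */
    public void addLineage(Version from, Version to) {
        lineage.add(new LineageEdge(from.id(), to.id()));
    }

    /** Returns every recorded version of the object, oldest first. */
    public List<Version> versionsOf(String objectName) {
        return Collections.unmodifiableList(
                history.getOrDefault(objectName, Collections.emptyList()));
    }

    /** Returns the ids of the versions that `v` was directly derived from. */
    public List<String> sourcesOf(Version v) {
        List<String> sources = new ArrayList<>();
        for (LineageEdge e : lineage) {
            if (e.toId.equals(v.id())) {
                sources.add(e.fromId);
            }
        }
        return sources;
    }

    public static void main(String[] args) {
        VersionedGraphSketch graph = new VersionedGraphSketch();
        Map<String, String> tags = new HashMap<>();
        tags.put("format", "csv");
        graph.commit("raw_events", tags);
        tags.put("format", "parquet");
        Version raw2 = graph.commit("raw_events", tags);
        Version report = graph.commit("daily_report", new HashMap<>());
        graph.addLineage(raw2, report);
        System.out.println(report.id() + " derived from " + graph.sourcesOf(report));
    }
}
```

Attaching lineage to individual versions rather than to the logical object is one possible design choice in this sketch: it lets a question like "which version of the input produced this output" be answered directly from the recorded edges.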

We are currently in a pre-alpha stage and are developing more comprehensive ingestion and query APIs. We are strongly oriented towards basing the design of our system and our APIs on real-world use cases. If you have any interesting use cases you'd like to share with us, please do reach out! We'd love to hear from you.

For more information about Ground or to get in touch, feel free to email vikrams at berkeley dot edu.