Seven years ago, when I was working for the International Livestock Research Institute (ILRI), our data management group started to promote the use of Open Data Kit (ODK) in household survey research. ODK gave us a first taste of digital data collection. Around the same time, the Sustainable Engineering Lab at Columbia University created FormHub, an excellent ODK aggregator. Although FormHub was great software and an enormous improvement over ODK Aggregate, it lacked data cleaning features, which were particularly important when groups of people in different locations accessed and curated the same dataset. To address this problem, I created ODK Tools, a set of command-line programs to manage ODK data in MySQL databases. Although ODK Tools allowed us to transfer the submissions from FormHub to MySQL, the process was “manual”: it required running several command-line tools, as well as technical know-how and time. However, once the data was in MySQL, it was easier to access and clean it safely and concurrently, regardless of where the user was located.
A few years later, the development of FormHub stopped and this brilliant software was copied and rebranded by Ona IO into “OnaData.” OnaData had the same features as FormHub and for a time it was good. However, a few months later, Ona IO decided to invest in a new closed-source interface connected to the FormHub internal functions and discontinued the original FormHub interface. Subsequent changes made to the code broke the integration with ODK Tools.
And this is the origin of FormShare: I copied OnaData and rebranded it with the intention of continuing the legacy of FormHub as a free, open-source, fully featured software package for collecting and managing ODK data. However, a few months into working on FormShare, I realized that it needed a lot of upgrading to capture my main ideas:
- To integrate it with ODK Tools to have a proper MySQL repository to centralize the data.
- To use the latest software technologies to decentralize data management.
I therefore embarked on the task of rewriting it again into a new version called “FormShare 2.” I did this because:
- I wanted to provide an open-source and free platform to private and public organizations to help them manage their data when using ODK.
- ODK Aggregate, in my opinion, is badly designed, buggy, and not interoperable. ODK Central is just not there yet.
- Forks based on FormHub suffer from the same ills as their predecessor: No proper repository, rudimentary data cleaning, no auditing, little interoperability, and poor or no extensibility, among many others.
Though FormShare 2 is based on the ideas and principles of FormHub (e.g., a simple user interface), I wrote it from scratch (not a single line of code comes from FormHub) using Python 3, Pyramid, MySQL, ElasticSearch, and PyUtilib to deliver a complete and extensible data management solution for ODK data collection. It took me four years, but it is finally here.
FormShare 2 is meant for organizations to install on their own server or cloud service, where it serves ODK XForms and collects and manages the submissions. FormShare 2 is also available as a service at https://formshare.org for organizations that lack the capacity or resources to run their own installation.
What does FormShare 2 offer?
- Simple and user-friendly interface.
- Multiple languages: English, Spanish, French and Portuguese.
- MySQL database as data repository.
- Data auditing. FormShare 2 records who changes what and when.
- No data duplication. Duplicate submissions are controlled by a “primary key”, e.g., Household ID.
- Sensitive data filtering. You can automatically remove fields like “Farmer’s name” from all data exports.
- Complete data interoperability with OData. You can connect your ODK data with tools like Power BI or clean it from Excel in real-time.
- API data cleaning. You can create scripts in R to clean data on arrival.
- Data collection in the web browser with Enketo.
- A pluggable mechanism to easily extend its functionality. For example, you can write plugins to:
  - Extend the metadata to support DDI and link datasets with Dataverse
  - Connect dictionary variables to ontologies
  - Use your Microsoft Account to log in
- Maps and much more… All for free!
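To make three of the features above more concrete (duplicate control through a primary key, auditing of changes, and filtering of sensitive fields on export), here is a minimal Python sketch. It is not FormShare 2's code or API: the `Repository` class, its methods, and the field names `household_id` and `farmer_name` are all hypothetical, and a plain in-memory dictionary stands in for the MySQL repository.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Repository:
    """Toy in-memory stand-in for a survey data repository (hypothetical)."""

    primary_key: str
    sensitive_fields: frozenset = frozenset()
    rows: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def submit(self, record: dict) -> bool:
        """Store a submission; reject it if its primary key already exists."""
        key = record[self.primary_key]
        if key in self.rows:
            return False  # duplicate, e.g. the same Household ID sent twice
        self.rows[key] = dict(record)
        return True

    def update(self, key, column, new_value, user) -> None:
        """Change one value and record who changed what and when."""
        old_value = self.rows[key].get(column)
        self.rows[key][column] = new_value
        self.audit_log.append(
            {
                "user": user,
                "key": key,
                "column": column,
                "old": old_value,
                "new": new_value,
                "when": datetime.now(timezone.utc),
            }
        )

    def export(self) -> list:
        """Return all rows with the sensitive fields removed."""
        return [
            {k: v for k, v in row.items() if k not in self.sensitive_fields}
            for row in self.rows.values()
        ]


# Usage with made-up data.
repo = Repository(
    primary_key="household_id",
    sensitive_fields=frozenset({"farmer_name"}),
)
first = repo.submit({"household_id": "HH-001", "farmer_name": "Jane", "cattle": 4})
second = repo.submit({"household_id": "HH-001", "farmer_name": "Jane", "cattle": 5})
repo.update("HH-001", "cattle", 6, user="cleaner@example.org")
exported = repo.export()
```

In this sketch the second submission is rejected because `HH-001` is already stored, the correction to `cattle` leaves one entry in `audit_log`, and `exported` no longer contains `farmer_name`. A real deployment would rely on database constraints and persistent audit tables rather than in-memory structures.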