Hi. Last week I reached out to the OpenCellId project to talk about concrete ways to collaborate. Our two projects have very similar goals, and we found a good deal of overlap in what we are doing and how we are doing it.
To that end we agreed on a closer collaboration going forward. The first two areas we are looking at are a clear bi-directional data exchange and sharing knowledge about how to handle the data (for example validation and position estimation/aggregation). Later we might also look at working on a shared client-side library for data collection.
The OpenCellId project has been around for a while, and when we started the MLS project it lay fairly dormant, like many of the other projects in this space. Since then a German company called Enaikoon (https://www.enaikoon.com/de/) has significantly stepped up its involvement in the project. They have completely rewritten the service side of the project and are working on many improvements. Among those is a drastic improvement in the quality of their data set, which has historically varied a lot.
For exchanging data we unfortunately run into a bit of a license problem, as the OpenCellId data is available under a CC-BY-SA/ODbL license, while we are using a public domain (CC-0) license. This means that on the MLS side we need to keep the two data sets separate from each other and can only combine them while answering individual position lookups. If we combined the data sources, the share-alike license would force us to restrict our data set to the same license.
For the actual data exchange, OpenCellId already offers their data for public download and we’ll do the same for our aggregated cell dataset. This is also finally the time we make good on our promise of open data and not just an open service, to the extent that privacy concerns allow this. We maintain our position that the individual observations contain personal data, as they allow you to reconstruct where people went, and so we won’t publish this data.
In order to make the process easier, we are currently aiming to agree on a common data format for the aggregated cell data.
OpenCellId just expanded their format last week to take into account different network types and additional fields for CDMA and LTE networks. There are still some details for us to agree on, like how to include a range field, the exact semantics of which fields are required, and how to signal unknown values. But the overall goal of using a common data format makes a lot of sense to me, even if we might have slightly different tastes about field naming ;-)
I hope this also addresses the questions raised in the ichnaea GitHub issues (#282, #283) we filed for tracking this work.
Cheers,
Hanno
dev-geolocation mailing list
https://lists.mozilla.org/listinfo/dev-geolocation
