Hi.

Last week I reached out to the OpenCellId project to talk about concrete 
ways to collaborate. Both of our projects have very similar goals and we found 
a good deal of overlap in what and how we are doing things.

To that end we agreed on a closer collaboration going forward. The first two 
areas we are looking at are a clear bi-directional data exchange and sharing 
knowledge about how to handle the data (for example validation, position 
estimation/aggregation). Later we might also look at working on a shared client 
side library for data collection.

The OpenCellId project has been around for a while, and when we started the MLS 
project it lay fairly dormant, like many of the other projects in this space. 
Since then a German company called Enaikoon (https://www.enaikoon.com/de/) has 
significantly stepped up its involvement in the project. They have completely 
rewritten the service side of the project and are working on many improvements. 
Amongst those is a drastic improvement in the quality of their data set, 
which historically has varied a lot.

For exchanging data we unfortunately run into a bit of a license problem, as 
the OpenCellId data is available under a CC-BY-SA/ODbL license, while we are 
using a public domain (CC0) license. This means that on the MLS side we need 
to keep the two data sets separate and can only combine them while answering 
individual position lookups. If we combined the data sources, the share-alike 
license would force us to restrict our own data set to the same license.
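
To illustrate what "keep separate, combine only per lookup" could look like, 
here is a minimal Python sketch. Everything in it is an assumption made for 
illustration: the key tuple, the in-memory dicts and the function name 
lookup_position are hypothetical and not how ichnaea is actually implemented.

```python
from typing import Dict, Optional, Tuple

# Hypothetical cell key: (mcc, mnc, lac, cid). Not the real ichnaea schema.
CellKey = Tuple[int, int, int, int]
Position = Tuple[float, float]  # (latitude, longitude)


def lookup_position(
    key: CellKey,
    own_cells: Dict[CellKey, Position],
    ocid_cells: Dict[CellKey, Position],
) -> Optional[Tuple[Position, str]]:
    """Answer a single lookup from two stores that are never merged.

    The CC0 store is consulted first; the share-alike store is only used
    as a fallback. The returned tag records which source answered.
    """
    position = own_cells.get(key)
    if position is not None:
        return position, "mls"
    position = ocid_cells.get(key)
    if position is not None:
        return position, "opencellid"
    return None


# Usage: the two dicts stay separate; only the answer draws on both.
own = {(262, 1, 100, 1): (52.5, 13.4)}
ocid = {(262, 1, 100, 1): (52.6, 13.5), (262, 2, 200, 2): (48.1, 11.6)}
result = lookup_position((262, 2, 200, 2), own, ocid)
```

Preferring our own data over the OpenCellId data is just one possible 
policy; the point is only that neither store is ever written into the other.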

For the actual data exchange, OpenCellId already offers their data for public 
download and we’ll do the same for our aggregated cell dataset. This is also 
finally the time we make good on our promise of open data and not just an open 
service, to the extent that privacy concerns allow this. We maintain our 
position that the individual observations contain personal data, as they allow 
you to reconstruct where people went, and so we won’t publish this data.

In order to make the process easier, we are currently aiming to agree on a 
common data format for the aggregated cell data. OpenCellId just expanded their 
format last week to take into account different network types and additional 
fields for CDMA and LTE networks. There are still some details for us to agree 
on, like how to include a range field and the exact semantics of which fields 
are required or how to signal unknown values. But the overall goal of using a 
common data format makes a lot of sense to me, even if we might have slightly 
different tastes about field naming ;-)
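
To make the open questions a bit more concrete, here is a small Python sketch 
of reading such an aggregated cell file. The column names, the choice of CSV 
and the "empty field means unknown" convention are all assumptions of mine for 
illustration; none of this is the agreed format.

```python
import csv
import io
from typing import Dict, Iterator, Optional

# Hypothetical column names, NOT the actual OpenCellId or MLS format.
REQUIRED = ("radio", "mcc", "net", "area", "cell", "lat", "lon")
OPTIONAL = ("range",)


def parse_cells(text: str) -> Iterator[Dict[str, Optional[str]]]:
    """Yield one record per row of a CSV export with a header line.

    Assumed convention: an empty field signals an unknown value and is
    mapped to None. Rows with an unknown required field are skipped.
    """
    for row in csv.DictReader(io.StringIO(text)):
        record = {name: (row.get(name) or None) for name in REQUIRED + OPTIONAL}
        if any(record[name] is None for name in REQUIRED):
            continue
        yield record


# Usage: the second row lacks a required field ("net") and is dropped.
sample = (
    "radio,mcc,net,area,cell,lat,lon,range\n"
    "GSM,262,1,100,1,52.5,13.4,\n"
    "LTE,262,,200,2,48.1,11.6,500\n"
)
cells = list(parse_cells(sample))
```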

I hope this also addresses the questions raised in the ichnaea GitHub issues 
(#282, #283) we filed for tracking this work.

Cheers,
Hanno
_______________________________________________
dev-geolocation mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-geolocation
