Hello!

We here in Data Stewardship have been receiving inquiries about how the Data Collection Review process[0] works now that more products are being built out of reusable components. What follows is a memo about how to approach Data Review when you're adding a data collection to a reusable component, or adding a reusable component to a product.

The current home for the living document version of this is here: https://mana.mozilla.org/wiki/pages/viewpage.action?spaceKey=DATAPRACTICES&title=Data+Review+in+Components

Data Review was designed assuming that the Product was responsible for both the data collection and reporting. The measurement code and the submission code all lived in the same place so the developer instrumenting the probe (using Telemetry.scalarSet or Telemetry::Accumulate or what-have-you) knew not only what they were instrumenting, but across what populations this probe would be reported. This is because the code being instrumented was only ever a part of Firefox[1].

With Android Components we radically shifted how we would build Firefox (and other things) on Android. Instead of having all the pieces live together and only ever being used for one product, we'd be developing the pieces separately and using them in any number of products.

This means that when a data collection is added, chances are it's being added to a Component, not a Product[2]. The developer adding the data collection may not be aware of all the Products currently using their Component, and can't know of future Products that might integrate it. This makes Data Collection Review difficult as Question 7 tries to ascertain what population is being measured with this new collection.

To solve this, the developer adding the data collection should list all the Products they know of that currently embed their Component, and a phrase like "Users of products that embed $MyComponent" (where $MyComponent is replaced with the name of their Component).
This will help the Data Steward understand where this collection is expected to be collected today, and help any interested person in the future learn what names they should use when looking these things up.

If a Product that submits data (usually by initializing the Glean SDK) adds a Component that collects data (these can be identified by their metrics documentation, usually in docs/metrics.md), then this is an expansion of the population of a data collection. This means the Product needs to submit a Data Collection Review to expand the scope of the Component's Data Collection to the population using the Product.

To complete the review, some questions (like why the data is being collected) will not need firm answers (as those will have been provided when the collections were added). The list of metrics can be found in the Component's documentation. The population is the population using the Product, and this is an answer the Product is most suited to give, as is the description of the opt-out mechanism.

With these small allowances, Data Review is adaptable to the new component-based development situation on Android and wherever reusable components are included.

This is new, and we will make mistakes. Please do ask questions of the Data Stewards along the way, and let them know if you find anything they've missed.

Things that require Data Collection Review:
1. A new data collection.
2. A Product integrating a Component that collects data.
3. A Product adding a new Data Collection System (by integrating the Glean SDK, for instance). In most cases merely integrating a new system will add collection, so this will be covered under (1). In other cases, you may need special permission to start using a new system.

Things that do not require Data Collection Review:
1. A Product upgrading an integrated Component to a new version that has new data collections. (This is covered by (1) above.
The Product could be included in the review by name, or as a product that embeds $MyComponent. If clarification is desired, we can amend the data collection review to specifically include the Product by name. No biggie.)

Assumptions:
* All of the Products and Components engaging in this process are subject to Mozilla's Privacy Policy.

If you have any questions, please find us at fx-datastewa...@mozilla.com, at #data-stewards on chat.mozilla.org (when available), or reach out to any Data Steward listed on the wiki[0].

Thanks!

[0]: https://wiki.mozilla.org/Firefox/Data_Collection
[1]: This isn't actually true. It could also be a part of Thunderbird or Geckoview, but let's keep it simple for now.
[2]: Data collections can be added to Products, too. In those cases, the old mental model from Firefox still applies.
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform