I should expand my request here to ask: are there any technical measures, policies, licenses, etc. that ought to be in place for Fedorans working on these datasets? (This also brings up the question of "Who *is* Fedora vs. who is *downstream of* Fedora?" Where do we draw the line in an open community?)

I ask this because we are discussing these privacy concerns internally and trying to find the best way forward. A few points here:

- It's fairly straightforward to "pseudonymize" user activity, meaning that we replace each username with a number (or similar).
  - However, *somebody* needs to perform this work, so we need to know under what conditions access to the original data can be granted.

- Even with pseudonymization, it may be possible to identify individuals by their activity. The only way to truly anonymize these datasets is to aggregate them.
  - However, we end up in the same position: *somebody* has to perform the aggregation. And this needs to be done very carefully (ideally, collaboratively) so that we can still extract the insights necessary to guide our community management decisions.
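
To make the two points above concrete, here is a rough sketch of what I mean by pseudonymizing and aggregating. Everything in it is an assumption for illustration only: the record fields (`user`, `action`, `month`), the function names, and especially the `min_users` threshold for suppressing small groups, which is one common approach and not something we have agreed on.

```python
from collections import defaultdict


def pseudonymize(events):
    """Replace each username with a sequential number.

    Returns the rewritten events plus the username-to-number mapping.
    The mapping is the sensitive part: whoever holds it can undo the
    pseudonymization, so access to it needs to be controlled.
    """
    mapping = {}
    rewritten = []
    for event in events:
        # Reuse the number already assigned to this user, or assign the next one.
        pseudonym = mapping.setdefault(event["user"], len(mapping) + 1)
        rewritten.append({**event, "user": pseudonym})
    return rewritten, mapping


def aggregate(events, min_users=5):
    """Count distinct users per (month, action) group.

    Groups with fewer than `min_users` distinct users are dropped,
    since very small groups can still point at individuals.
    The threshold is a hypothetical choice, not an agreed policy.
    """
    users_per_group = defaultdict(set)
    for event in events:
        users_per_group[(event["month"], event["action"])].add(event["user"])
    return {
        group: len(users)
        for group, users in users_per_group.items()
        if len(users) >= min_users
    }
```

Note that both functions need the original (or at least pseudonymized) records as input, which is exactly why the question of who may access the raw data comes first.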


Thanks,

Michael Winters