I should expand my request here to ask: are there any technical measures,
policies, licenses, etc. that ought to be in place for Fedorans
working on these datasets? (This also raises the question of "Who
*is* Fedora vs. who is *downstream of* Fedora?" Where do we draw the
line in an open community?)
I ask this because we are discussing these privacy concerns internally
and trying to find the best way forward. A few points here:
- It's fairly straightforward to "pseudonymize" user activity, meaning
that we replace each username with a number (or similar identifier).
- However, *somebody* needs to perform this work, so we need to know
under what conditions access to the original data can be granted.
- Even with pseudonymization, it may be possible to identify individuals
by their activity. The only way to truly anonymize these datasets is to
aggregate them.
- However, we end up in the same position: *somebody* has to perform
the aggregation. And this needs to be done very carefully (ideally,
collaboratively) so that we can still extract the insights necessary to
guide our community management decisions.
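
To make the two steps above concrete, here is a rough Python sketch. It
is only an illustration under assumptions of mine, not a proposal for
the actual pipeline: the event format (username, activity type, month),
the function names, the keyed-hash approach, and the minimum group size
of 5 are all hypothetical. Suppressing small groups reduces the risk of
re-identification but does not by itself guarantee anonymity.

```python
import hashlib
import hmac


def pseudonymize(username: str, secret_key: bytes) -> str:
    """Replace a username with a stable pseudonym (keyed hash).

    Assumption: whoever holds secret_key is the trusted party with
    access to the original data; without the key, the pseudonym
    cannot be recomputed from a guessed username.
    """
    digest = hmac.new(secret_key, username.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def aggregate(events, min_group_size=5):
    """Count distinct contributors per (activity, month).

    events is a hypothetical iterable of (username, activity, month)
    tuples. Groups with fewer than min_group_size contributors are
    dropped, since small groups are easier to trace back to a person.
    """
    contributors = {}
    for username, activity, month in events:
        contributors.setdefault((activity, month), set()).add(username)
    return {
        group: len(users)
        for group, users in contributors.items()
        if len(users) >= min_group_size
    }
```

With something like this, only the person running pseudonymize() or
aggregate() would ever need to see the raw usernames; everyone else
would work from the output.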
Thanks,
Michael Winters