Sorry for the delayed reply here. I have been swamped with other work and life items, and have been pondering what I can and should say about all this.
First, some disclaimers: I am definitely not a lawyer, although I was present when we set up our GDPR handling process 7+ years ago. This process was created in consultation with legal folks, as well as infrastructure folks for implementation. 7 years is a long while, so I could be misremembering or be less detailed than I might wish.

Second, the data we are talking about here is indeed completely public. You can subscribe to a mailing list and get all the posts locally, and you can pull a git repo and have all the commits. You can subscribe to the message bus and see messages going across it. I understand the point downthread about different aggregations of data, but I just wanted to make clear that this data is publicly available and anyone can get it.

My recollection is that we determined that the Fedora Project had a legitimate business interest in maintaining the integrity of this data. It's part of our core mission to create a community of open source developers that collaborate, organize, discuss and make changes in releasing a collection of open source software. If we remove chunks of data, our mission is compromised: we can no longer see how something was proposed, discussed and decided, and then how the changes were made.

So, except in exceptional cases, we do not remove this public data. (Mailing list posts have been deleted in exceptional cases.)

For applications that allow us to anonymize users, we do that on request. The only application I know of that handles this is Discourse.

...snip...

> In particular, many of these datasets include usernames and records of user
> activity tied to those usernames, e.g. the contents and exact timing of
> forum posts, git commits, group membership changes, etc. My current
> questions are:
>
> 1) Does an arbitrary username (not necessarily tied to a real name)
> constitute PII which must be protected / anonymized? It is not currently
> anonymized in Fedora datasets.

My understanding: no. The username is public. Other information attached to an account may be PII and can be removed on request, while the username is retained as part of our legitimate business needs.

> 2) Do current Fedora policies permit collecting user activity tied to
> usernames? This is not explicitly stated under "Information We Collect",
> though it is mentioned later under "Using (Processing) Your Personal Data."

Yes. This could be much clearer and more explicit.

> 3) Do current Fedora policies permit publishing user activity tied to
> usernames? Section "Sharing Your Personal Data" does mention "For research
> activities", but it does not specify that data must be shared *only* in
> aggregate.

IMHO, yes. This should be clearer and more explicit.

> 4) How does GDPR view downstream users of public data sources, i.e. Hatlas?
> Is Hatlas a "data processor"? Must Hatlas integrate with Fedora's Personal
> Data Removal process? We intend to do so, but there seems to be no
> obligation for either party.

I don't know the answer here.

> 5) Are there any data licenses applicable to downstream users such as
> Hatlas? I intend to apply one restricting the use of Hatlas data to
> non-commercial purposes, but there seem to be no restrictions coming from
> Fedora.

Or here.

However, there's a question of semantics here: is this a separate project? You are working on this in the context of Fedora with Fedora resources (once the POC is done), so a good argument could be made that it's just another Fedora application run by Fedora. That probably still doesn't answer your questions above, but I thought I would mention it.

Thanks for opening this discussion. I think we could definitely clarify things in our privacy policy and confirm other things. Unfortunately, I don't think that work can happen here; we will need to discuss it with internal legal folks.

--
_______________________________________________
legal mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/[email protected]
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
