NB, I'm answering simply as a Fedora contributor.

On Tue, Nov 11, 2025 at 11:27:59AM -0600, Michael Winters via legal wrote:
> Hello Legal,
>
> My name is Michael Winters, typically known here as @mwinters. I have some
> questions about Fedora's data privacy policies, which I'll provide a bit of
> context to first.
>
> There has been a long-standing desire within Fedora for better tools with
> which to analyze our user data and understand our community so that we can
> improve it. To this end, I have recently created a "Data Lakehouse" proof
> of concept known as "Hatlas", available at https://hatlas.mwinters.net .
> This technology consolidates data from existing public Fedora datasets and
> provides simplified tools to facilitate public access and analysis.

Looking at the FAQ

  https://hatlas.mwinters.net/docs/faq/

one item stands out to me

  "If you feel strongly that you want to be erased from Fedora
   datasets, please work through the existing Fedora Personal Data
   Removal request process. If you still see your data here after a
   reasonable amount of time, feel free to contact me."

AFAICT, this is essentially saying that if you don't want your
information to be processed by this Hatlas service, you need to
cease all participation in the Fedora project, then request removal
of your data, so that future Fedora data sources consumed by Hatlas
no longer have your info. Urgh :-(

With Hatlas run as a 3rd party service, as opposed to an official
Fedora service, I expect it could run into GDPR compliance problems
with this attempt to outsource data removal requirements to Fedora.

> In particular, many of these datasets include usernames and records of user
> activity tied to those usernames, e.g. the contents and exact timing of
> forum posts, git commits, group membership changes, etc. My current
> questions are:
>
> 1) Does an arbitrary username (not necessarily tied to a real name)
> constitute PII which must be protected / anonymized? It is not currently
> anonymized in Fedora datasets.

FWIW, the question of ties to a real name is explicitly mentioned
in GDPR guidance in the UK[1]

  "An individual’s social media ‘handle’ or username, which may seem
   anonymous or nonsensical, is still sufficient to identify them as
   it uniquely identifies that individual. The username is personal
   data if it distinguishes one individual from another regardless
   of whether it is possible to link the ‘online’ identity with a
   ‘real world’ named individual."

> 4) How does GDPR view downstream users of public data sources, i.e. Hatlas?
> Is Hatlas a "data processor"? Must Hatlas integrate with Fedora's Personal
> Data Removal process? We intend to do so, but there seems to be no
> obligation for either party.

If Hatlas is run independently of the Fedora project, my expectation
would be that it must directly provide a data removal process, and
cannot rely on outsourcing it to "upstream" data sources (Fedora).

With regards,
Daniel

[1] https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/personal-information-what-is-it/what-is-personal-data/what-are-identifiers-and-related-factors/
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
