Thanks Daniel. Your concern is entirely valid, and I share it, which is why 
I've started this thread :)

As far as I can tell, there is no license on this data. I'm not a lawyer, and 
certainly not one skilled in US + EU law, so I don't know what rights are 
granted by default. But since this data is currently "published" by Fedora, I 
believe that any entity is at minimum allowed to "read" this information, and 
that no obligations exist thereafter regarding what they've "learned". That 
means any evil entity (especially one outside of GDPR jurisdiction) can currently 
ingest this data and do whatever they want with it within their own system, and 
would be under zero obligation to execute Personal Data Removal requests (PDRs). 
Ironically, it's the re-publishing that Hatlas does which is most obviously 
protected by default copyright etc., to my understanding. It's easier to be evil 
than open, as it stands today.

This is exactly the sort of concern that I'd like clarification on.

I also want people to understand that if they see something in Hatlas they 
don't like, deleting it from Hatlas does nothing to protect it -- it has to get 
deleted "upstream". I'll make that more explicit in the FAQ.

Thanks again for raising your concern here. I believe it's helpful for others 
to see the sort of conversations that Hatlas is spurring.

Michael Winters



On November 12, 2025 3:42:25 AM CST, "Daniel P. Berrangé via legal" 
<[email protected]> wrote:
>NB, I'm answering simply as a Fedora contributor.
>
>On Tue, Nov 11, 2025 at 11:27:59AM -0600, Michael Winters via legal wrote:
>> Hello Legal,
>> 
>> My name is Michael Winters, typically known here as @mwinters.  I have some
>> questions about Fedora's data privacy policies, which I'll provide a bit of
>> context to first.
>> 
>> There has been a long-standing desire within Fedora for better tools with
>> which to analyze our user data and understand our community so that we can
>> improve it.  To this end, I have recently created a "Data Lakehouse" proof
>> of concept known as "Hatlas", available at https://hatlas.mwinters.net .
>> This technology consolidates data from existing public Fedora datasets and
>> provides simplified tools to facilitate public access and analysis.
>
>Looking at the FAQ
>
>  https://hatlas.mwinters.net/docs/faq/
>
>one item stands out to me
>
>  "If you feel strongly that you want to be erased from Fedora datasets,
>   please work through the existing Fedora Personal Data Removal request
>   process. If you still see your data here after a reasonable amount of
>   time, feel free to contact me."
>
>AFAICT, this is essentially saying that if you don't want your information
>to be processed by this Hatlas service, you need to cease all participation
>in the Fedora project, then request removal of your data, so that future
>Fedora data sources consumed by Hatlas no longer have your info. Urgh :-(
>
>With Hatlas run as a 3rd party service, as opposed to an official Fedora
>service, I expect it could run into GDPR compliance problems with this
>attempt to outsource data removal requirements to Fedora.
>
>> In particular, many of these datasets include usernames and records of user
>> activity tied to those usernames, e.g. the contents and exact timing of
>> forum posts, git commits, group membership changes, etc.  My current
>> questions are:
>> 
>> 1) Does an arbitrary username (not necessarily tied to a real name)
>> constitute PII which must be protected / anonymized?  It is not currently
>> anonymized in Fedora datasets.
>
>FWIW, the question of ties to a real name is explicitly mentioned in
>GDPR guidance in the UK[1]
>
>  "An individual’s social media ‘handle’ or username, which may
>   seem anonymous or nonsensical, is still sufficient to identify
>   them as it uniquely identifies that individual. The username
>   is personal data if it distinguishes one individual from another
>   regardless of whether it is possible to link the ‘online’ identity
>   with a ‘real world’ named individual."
>
>
>> 4) How does GDPR view downstream users of public data sources, i.e. Hatlas?
>> Is Hatlas a "data processor"?  Must Hatlas integrate with Fedora's Personal
>> Data Removal process?  We intend to do so, but there seems to be no
>> obligation for either party.
>
>If Hatlas is run independently of the Fedora project, my expectation would
>be that it must directly provide a data removal process, and cannot rely on
>outsourcing it to "upstream" data sources (Fedora).
>
>With regards,
>Daniel
>
>[1] 
>https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/personal-information-what-is-it/what-is-personal-data/what-are-identifiers-and-related-factors/
>-- 
>|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
>|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
>|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>
>-- 
>_______________________________________________
>legal mailing list -- [email protected]
>To unsubscribe send an email to [email protected]
>Fedora Code of Conduct: 
>https://docs.fedoraproject.org/en-US/project/code-of-conduct/
>List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
>List Archives: 
>https://lists.fedoraproject.org/archives/list/[email protected]
>Do not reply to spam, report it: 
>https://pagure.io/fedora-infrastructure/new_issue
