NB, I'm answering simply as a Fedora contributor.

On Tue, Nov 11, 2025 at 11:27:59AM -0600, Michael Winters via legal wrote:
> Hello Legal,
> 
> My name is Michael Winters, typically known here as @mwinters.  I have some
> questions about Fedora's data privacy policies, which I'll provide a bit of
> context to first.
> 
> There has been a long-standing desire within Fedora for better tools with
> which to analyze our user data and understand our community so that we can
> improve it.  To this end, I have recently created a "Data Lakehouse" proof
> of concept known as "Hatlas", available at https://hatlas.mwinters.net .
> This technology consolidates data from existing public Fedora datasets and
> provides simplified tools to facilitate public access and analysis.

Looking at the FAQ

  https://hatlas.mwinters.net/docs/faq/

one item stands out to me:

  "If you feel strongly that you want to be erased from Fedora datasets,
   please work through the existing Fedora Personal Data Removal request
   process. If you still see your data here after a reasonable amount of
   time, feel free to contact me."

AFAICT, this is essentially saying that if you don't want your information
to be processed by this Hatlas service, you need to cease all participation
in the Fedora project, then request removal of your data, so that future
Fedora data sources consumed by Hatlas no longer have your info. Urgh :-(

With Hatlas run as a third-party service, as opposed to an official Fedora
service, I expect it could run into GDPR compliance problems with this
attempt to outsource data removal requirements to Fedora.

> In particular, many of these datasets include usernames and records of user
> activity tied to those usernames, e.g. the contents and exact timing of
> forum posts, git commits, group membership changes, etc.  My current
> questions are:
> 
> 1) Does an arbitrary username (not necessarily tied to a real name)
> constitute PII which must be protected / anonymized?  It is not currently
> anonymized in Fedora datasets.

FWIW, the question of ties to a real name is explicitly addressed in
the UK's GDPR guidance [1]:

  "An individual’s social media ‘handle’ or username, which may
   seem anonymous or nonsensical, is still sufficient to identify
   them as it uniquely identifies that individual. The username
   is personal data if it distinguishes one individual from another
   regardless of whether it is possible to link the ‘online’ identity
   with a ‘real world’ named individual."


> 4) How does GDPR view downstream users of public data sources, i.e. Hatlas?
> Is Hatlas a "data processor"?  Must Hatlas integrate with Fedora's Personal
> Data Removal process?  We intend to do so, but there seems to be no
> obligation for either party.

If Hatlas is run independently of the Fedora project, my expectation would
be that it must directly provide a data removal process, and cannot rely on
outsourcing it to "upstream" data sources (Fedora).

With regards,
Daniel

[1] https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/personal-information-what-is-it/what-is-personal-data/what-are-identifiers-and-related-factors/
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

