Hi,

Unfortunately, this important topic of GDPR compliance has not seen much 
interest.

After I asked GitHub how they would handle requests to erase the 
author field, they changed their privacy policy, which now clarifies 
that this won't be done.

My guess is that this would ultimately rely on "overriding legitimate 
grounds for the processing" (Art. 17 (1) point (c) GDPR), which is one 
of the most fragile justifications available in the GDPR.

The GDPR emphasizes the importance of using state-of-the-art 
technology, including anonymization, as far as possible to ensure 
privacy.

At 
https://public-inbox.org/git/CA+dhYEViN4-boZLN+5QJyE7RtX+q6a92p0C2O6TA53==bzf...@mail.gmail.com/T/
 
there is already some discussion about transitioning to a different 
hashing algorithm to get more in line with the state of the art in 
hashing. 
(My clear favourite would be SHA-3.)

In the course of this, anonymization could also be added. My idea 
would be as follows:

Do not hash anything directly to obtain the commit ID. Instead, hash a 
list of hashes of [$random_number, $information] pairs. $information 
could be an author id, a commit date, a comment, or anything else. Then 
store the commit id, the list of hashes, and the list of pairs to form 
the commit.

If someone requests erasure, simply empty the corresponding pair in the 
list. All that would be left would be the hash of the pair, which is 
completely anonymous (no more useful than a random number) and thus 
not covered by the GDPR. The history could still be completely 
verified, and when displaying the log, the erased entry could be 
displayed as "<<ERASED>>".
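
To make the idea concrete, here is a minimal sketch in Python. It is 
only an illustration under assumptions, not a proposal for the actual 
object format: SHA3-256, the 32-byte random salt, the dict layout and 
the function names (field_hash, make_commit, erase, verify, show) are 
all hypothetical. Note that erasing a pair removes the random number 
together with the information; otherwise a low-entropy value such as 
an author name could be guessed and checked against the remaining 
hash.

```python
import hashlib
import secrets


def field_hash(salt: bytes, info: bytes) -> bytes:
    # Hash one [$random_number, $information] pair. The salt length is
    # prefixed so the boundary between salt and info is unambiguous.
    prefix = len(salt).to_bytes(4, "big")
    return hashlib.sha3_256(prefix + salt + info).digest()


def commit_id_of(hashes) -> str:
    # The commit ID depends only on the list of pair hashes.
    return hashlib.sha3_256(b"".join(hashes)).hexdigest()


def make_commit(fields):
    # fields: list of byte strings (author, date, message, ...).
    pairs = [(secrets.token_bytes(32), info) for info in fields]
    hashes = [field_hash(salt, info) for salt, info in pairs]
    return {"id": commit_id_of(hashes), "hashes": hashes, "pairs": pairs}


def erase(commit, index):
    # Drop both the salt and the information; only the hash remains.
    commit["pairs"][index] = None


def verify(commit) -> bool:
    # The ID must match the hash list, and every pair that is still
    # present must match its stored hash. Erased pairs are skipped.
    if commit_id_of(commit["hashes"]) != commit["id"]:
        return False
    for h, pair in zip(commit["hashes"], commit["pairs"]):
        if pair is not None and field_hash(*pair) != h:
            return False
    return True


def show(commit):
    # Render fields for a log, marking erased entries.
    return [p[1].decode() if p is not None else "<<ERASED>>"
            for p in commit["pairs"]]


commit = make_commit([b"Jane Doe <jane@example.org>",
                      b"2018-04-17",
                      b"Fix typo"])
old_id = commit["id"]
erase(commit, 0)  # erasure request for the author field
print(commit["id"] == old_id, verify(commit), show(commit))
```

In this sketch the commit ID never changes on erasure, so child 
commits that reference it stay valid, while any modification of a 
remaining pair is still detected by verify().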

What do you think about this?

Best wishes
Peter

-- 
Peter Backes, r...@helen.plasma.xg8.de
