Re: [liberationtech] [sunlightlabs] need advice on using hashes for preserving PII's utility for disambiguation while protecting sensitive info

Josh Tauberer Thu, 06 Feb 2014 13:28:45 -0800

On 02/06/2014 03:49 PM, Tom Lee wrote:

Obviously the input key space for DLNs and most other personal ID numbers is so small that reversing this with a dictionary attack would be trivial. You can add a salt, but only on a per-entity basis (not a per-record basis) if you want to preserve the capacity to disambiguate. That in turn calls for a lookup table in which the input keys are stored, which kind of defeats the point of using a hash (you might as well just assign random output IDs for each input ID). I would worry about government's ability to keep this lookup table secure, and I worry about the brittleness of such a system.

And yet a lookup table mapping inputs to random outputs might be the best worst option.

Even if the right cryptographic method (hash, encryption, etc.) can be found and is mathematically sound, I'd have /very/ low confidence that it would be implemented correctly. Maybe one office does it right, the next office says hey that's a great idea but forgets that hashing a four-digit PIN doesn't provide any obscurity, etc. (That's not a jab at government. Crypto is so hard.)
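
To make the four-digit PIN point concrete, here is a minimal sketch in Python. It assumes the office publishes an unsalted SHA-256 digest of the PIN; the function names and the sample PIN are made up for illustration. Since there are only 10,000 possible inputs, anyone can hash them all and compare.

```python
import hashlib
from typing import Optional


def hash_pin(pin: str) -> str:
    """Hypothetical 'anonymization': unsalted SHA-256 of the PIN."""
    return hashlib.sha256(pin.encode("ascii")).hexdigest()


def reverse_pin_hash(digest: str) -> Optional[str]:
    """Recover a four-digit PIN by hashing every possible input."""
    for n in range(10_000):
        candidate = f"{n:04d}"  # "0000" through "9999"
        if hash_pin(candidate) == digest:
            return candidate
    return None  # digest was not produced from a four-digit PIN


published = hash_pin("4821")             # what the office would release
recovered = reverse_pin_hash(published)  # what an attacker gets back
```

A salt shared across the whole dataset would not change this much if it leaks or ships with the data, because the attacker just includes it in each guess. The same loop works against any ID format with a small key space, only with more iterations.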

I'd ask, for a particular case, what data does the data source already have? If they /already/ have DLNs in their database, there's no added privacy concern in creating a random mapping to unique identifiers for public consumption. (Besides the mosaic effect, but that aside.) Assuming the data source can make the distinction at all internally, they must have /something/ already in their database.
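
A rough sketch of what such a random mapping could look like, again in Python. The class name, the in-memory dict, and the 128-bit ID length are all assumptions for illustration; a real data source would keep the table in whatever database already holds the DLNs, under the same access controls.

```python
import secrets


class PseudonymTable:
    """Maps each private ID to a random public ID, kept only by the data source."""

    def __init__(self) -> None:
        self._by_input = {}  # private ID -> public ID; this table must stay secret

    def public_id(self, private_id: str) -> str:
        # Reuse the existing public ID so the same person can still be
        # disambiguated across records; otherwise draw a fresh random one.
        if private_id not in self._by_input:
            # 128 random bits: not derived from the input, so there is
            # nothing to reverse, and collisions are negligibly unlikely.
            self._by_input[private_id] = secrets.token_hex(16)
        return self._by_input[private_id]


table = PseudonymTable()
a1 = table.public_id("D1234567")  # made-up DLN
a2 = table.public_id("D1234567")  # same person, second record
b1 = table.public_id("D7654321")  # different person
```

Here a1 and a2 are equal and b1 differs, which is all the public needs for disambiguation. The published IDs carry no information about the DLNs; the only thing to protect is the table itself, which holds nothing the data source did not already have.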


HTH,

- Josh Tauberer (@JoshData)

http://razor.occams.info

