On 02/06/2014 03:49 PM, Tom Lee wrote:
Obviously the input key space for DLNs and most other personal ID numbers is so small that reversing this with a dictionary attack would be trivial. You can add a salt, but only on a per-entity basis (not a per-record basis) if you want to preserve the capacity to disambiguate. That in turn calls for a lookup table in which the input keys are stored, which kind of defeats the point of using a hash (you might as well just assign random output IDs for each input ID). I would worry about government's ability to keep this lookup table secure, and I worry about the brittleness of such a system.


And yet a lookup table mapping inputs to random outputs might be the best worst option.

Even if the right cryptographic method (hash, encryption, etc.) can be found and is mathematically sound, I'd have /very/ low confidence that it would be implemented correctly. Maybe one office does it right, and the next office says "hey, that's a great idea" but forgets that hashing a four-digit PIN doesn't provide any obscurity, etc. (That's not a jab at government. Crypto is so hard.)
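
To make the four-digit PIN point concrete, here is a minimal sketch of the dictionary attack. It assumes an unsalted SHA-256 hash purely for illustration; the function names are made up for this example, and any fast unsalted hash would fall the same way, because there are only 10,000 possible inputs to try.

```python
import hashlib
from typing import Optional


def hash_pin(pin: str) -> str:
    """Hypothetical 'anonymizer': an unsalted SHA-256 of the PIN."""
    return hashlib.sha256(pin.encode("ascii")).hexdigest()


def recover_pin(digest: str) -> Optional[str]:
    """Try every four-digit PIN until one hashes to the published digest."""
    for n in range(10_000):
        candidate = f"{n:04d}"  # zero-padded: 0000 through 9999
        if hash_pin(candidate) == digest:
            return candidate
    return None  # digest did not come from a four-digit PIN


if __name__ == "__main__":
    published = hash_pin("4821")   # what the office would release
    print(recover_pin(published))  # the original PIN comes straight back
```

The same loop works against DLNs, just with a larger (but still small) range to enumerate.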

I'd ask, for a particular case, what data does the data source already have? If they /already/ have DLNs in their database, there's no added privacy concern in creating a random mapping to unique identifiers for public consumption. (Besides the mosaic effect, but that aside.) Assuming the data source can make the distinction at all internally, they must have /something/ already in their database.
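
As a rough sketch of what a random mapping could look like (the class and method names are hypothetical, and a real table would live in the data source's existing secured database rather than in memory): each private ID gets a random public ID the first time it is seen, and the same public ID on every later lookup. Since the public ID is not derived from the private one, there is nothing to reverse without the table itself.

```python
import secrets


class PseudonymTable:
    """Hypothetical in-memory lookup table from private IDs to random public IDs."""

    def __init__(self):
        self._by_private_id = {}  # private ID -> public ID (must stay secret)
        self._issued = set()      # public IDs already handed out

    def public_id(self, private_id):
        """Return the stable random public ID for this private ID."""
        existing = self._by_private_id.get(private_id)
        if existing is not None:
            return existing
        # Draw 128 random bits; retry on the (astronomically unlikely) collision.
        while True:
            candidate = secrets.token_hex(16)
            if candidate not in self._issued:
                break
        self._by_private_id[private_id] = candidate
        self._issued.add(candidate)
        return candidate


if __name__ == "__main__":
    table = PseudonymTable()
    first = table.public_id("D1234567")           # made-up DLN
    assert first == table.public_id("D1234567")   # same person, same public ID
    assert first != table.public_id("D7654321")   # different person, different ID
```

This keeps the ability to disambiguate records across releases, at the cost Tom mentions: the table has to be kept secure for as long as the public IDs are in use.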

HTH,

- Josh Tauberer (@JoshData)

http://razor.occams.info


