Re: [liberationtech] [sunlightlabs] need advice on using hashes for preserving PII's utility for disambiguation while protecting sensitive info

2014-03-20 Thread Tom Lee
Arggh. Wrong link. Apologies to all and thanks to James McKinney. That's what I get for having that many tabs open. https://sunlightfoundation.com/blog/2014/03/20/a-little-math-could-make-identifiers-a-whole-lot-better/ On Thu, Mar 20, 2014 at 5:44 PM, James McKinney ja...@opennorth.ca wrote:

2014-02-07 Thread Michael Rogers
On 06/02/14 20:56, Margie Roswell wrote: For all I know, the lack of implementations using this kind of one-way transformation isn't about government sluggishness but rather about its feasibility. I'd be very curious to hear folks' ideas on

2014-02-06 Thread Margie Roswell
PII = personally identifiable information (Anyone who can address the question probably already knows that... but I was curious, and figured I'd spare others the look-up.) -- http://FarmBillPrimer.org http://www.BaltimoreUrbanAg.org (Please send events; This site is hungry.)

2014-02-06 Thread Chris Dary
Just one thought to throw out: Something that sprang to mind is the idea of a check digit or simplified hash that would be redundant enough to collide very often if you were trying to reverse, but would still provide enough disambiguation that you'd be able to appropriately determine who you're

2014-02-06 Thread Chris Dary
It's been a while since I dug into it, but something like an 8-bit CRC (http://en.wikipedia.org/wiki/Cyclic_redundancy_check) would probably provide enough disambiguation but would collide often enough to not be much of a concern for reversing - 256 different values. On Thu, Feb 6, 2014 at 4:10 PM,
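
A minimal Python sketch of the 8-bit CRC idea, for illustration only. The polynomial (0x07), the `bucket_ids` helper, and the made-up identifier format are assumptions and do not come from the thread. It shows that every identifier maps to one of 256 values, so many identifiers share each value.

```python
from collections import defaultdict


def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8, MSB first, initial value 0 (polynomial is an assumption)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def bucket_ids(ids):
    """Group identifiers by their CRC-8 value (hypothetical helper)."""
    buckets = defaultdict(list)
    for ident in ids:
        buckets[crc8(ident.encode("utf-8"))].append(ident)
    return buckets


# Made-up identifiers; real ID formats will differ.
made_up_ids = [f"D{n:07d}" for n in range(1000)]
buckets = bucket_ids(made_up_ids)

# With 1000 inputs and at most 256 outputs, collisions are unavoidable.
largest = max(len(members) for members in buckets.values())
print(f"{len(buckets)} distinct CRC values, largest bucket holds {largest} IDs")
```

Two records with different CRC values cannot share an identifier, while a shared value is only weak evidence that they match. Whether that is enough disambiguation, or enough protection, depends on the data.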

2014-02-06 Thread Josh Tauberer
On 02/06/2014 03:49 PM, Tom Lee wrote: Obviously the input key space for DLNs and most other personal ID numbers is so small that reversing this with a dictionary attack would be trivial. You can add a salt, but only on a per-entity basis (not a per-record basis) if you want to preserve the
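
To illustrate the dictionary-attack concern, here is a rough Python sketch. The five-digit identifier format, the `hash_id` and `dictionary_attack` helpers, and the salt string are all hypothetical and chosen only to keep the example small. It hashes every possible identifier and compares the result against a published hash.

```python
import hashlib


def hash_id(identifier: str, salt: str = "") -> str:
    """SHA-256 of salt + identifier, hex encoded (hypothetical scheme)."""
    return hashlib.sha256((salt + identifier).encode("utf-8")).hexdigest()


def dictionary_attack(target_hash: str, candidates, salt: str = ""):
    """Return the candidate whose hash equals target_hash, or None."""
    for candidate in candidates:
        if hash_id(candidate, salt) == target_hash:
            return candidate
    return None


# Assumed toy key space: every five-digit identifier (100,000 values).
candidates = [f"{n:05d}" for n in range(100_000)]

published = hash_id("04217")  # what a released dataset might contain
recovered = dictionary_attack(published, candidates)
print(recovered)

# The same salt on every record keeps hashes comparable across records,
# but anyone who learns that salt can repeat the enumeration.
shared_salt = "example-salt"  # hypothetical value
same_person = hash_id("04217", shared_salt) == hash_id("04217", shared_salt)
recovered_salted = dictionary_attack(
    hash_id("04217", shared_salt), candidates, shared_salt
)
```

The enumeration finishes quickly because the key space is tiny. Real identifier spaces are larger, but the thread's point is that they are still small enough to enumerate.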

2014-02-06 Thread James McKinney
I don't know how these government databases are maintained in the US, but in Canada it's not infrequent for such databases to be more or less write-only: the government fills up a database with names, donation amounts, postcodes, etc. and then publishes it somewhere for others to consume. In a