On 9/9/13 4:25 PM, R. Jason Cronk wrote:
On 9/9/2013 5:58 PM, Chris Peterson wrote:
Our private database maps access point hash IDs to locations (and
other metadata). Assuming:

    H1 = Hash(AP1.MAC + AP1.SSID)
    H2 = Hash(AP2.MAC + AP2.SSID)

I assume + means concatenate. I might suggest XORing the values. SSID
names are usually human readable, not meant to be secure and thus follow
predictable patterns. I also hope you're not using the patterned MAC
notation but rather the 48 bit address space representation.

We currently use concatenation, but I see how XOR would make more sense. We are using the SSID as a weak protection against someone "polluting" our database results by submitting random MAC addresses. Our database still might have their junk data, but real location requests shouldn't hit them.

We are using the MAC string notation like "45:67:89:ab:cd:ef", but I see that this format has predictable patterns, too. I will recommend we use the 48-bit binary representation.
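As a rough illustration of the change discussed above, the sketch below converts the colon-separated MAC notation to its raw 48-bit form before hashing. The thread does not name the hash function, so SHA-256 is an assumption here, as are the helper names mac_to_bytes and ap_hash and the UTF-8 encoding of the SSID. It uses the concatenation currently in place, not the XOR variant suggested earlier.

```python
import hashlib


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' notation to the raw 6-byte (48-bit) address."""
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(raw) != 6:
        raise ValueError(f"expected a 48-bit MAC address, got {mac!r}")
    return raw


def ap_hash(mac: str, ssid: str) -> bytes:
    """Hash(MAC + SSID) with '+' read as concatenation.

    Assumptions: SHA-256 as the hash, UTF-8 for the SSID.
    """
    return hashlib.sha256(mac_to_bytes(mac) + ssid.encode("utf-8")).digest()


# Hypothetical access point, reusing the MAC quoted in the message.
digest = ap_hash("45:67:89:ab:cd:ef", "ExampleSSID")
```

Because the binary MAC always occupies exactly six bytes, placing it first keeps the boundary between MAC and SSID unambiguous in the concatenated input.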


What is the granularity of the lat/long?

This depends on the GPS of the device used to collect the data, but our database stores 7 decimal places (less than one meter resolution).
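For a sense of scale, the arithmetic behind the seven-decimal-place figure can be checked with a few lines. This is only a back-of-the-envelope sketch: it assumes a spherical Earth with a mean radius of 6,371 km, and the function name is made up for illustration. It describes the precision of the stored number, not the accuracy of the GPS fix behind it.

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumed mean Earth radius (spherical approximation)


def meters_per_step(decimals: int, latitude_deg: float = 0.0) -> tuple[float, float]:
    """Ground distance covered by one unit in the last stored decimal place.

    Returns (north_south_m, east_west_m) at the given latitude.
    """
    step_deg = 10.0 ** -decimals
    north_south = math.radians(step_deg) * EARTH_RADIUS_M
    # Longitude lines converge toward the poles, so east-west steps shrink.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


ns, ew = meters_per_step(7)
print(f"7 decimal places: about {ns * 100:.1f} cm north-south at any latitude")
```

Under these assumptions one step in the seventh decimal place is on the order of a centimeter, comfortably below the one-meter bound quoted above.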


Someone querying the published database would need to know the MAC
addresses and current SSIDs of two neighboring access points to look
up either's location.

When you say published, do you mean that the entire DB is published for
use by "researchers" or that it just has a publicly exposed API that
responds to queries?

We are investigating both a web service API and a downloadable database. We are collecting position data for both Wi-Fi access points and cell towers. Depending on privacy protections, if we can't publish the whole database to the world, we can publish just the cell tower data to the world and possibly make the Wi-Fi data available only to trusted researchers.


I'm assuming if AP3 through AP10 were all also in the vicinity that
Hash(H1+Hx) ==> Random1 where x is in {2,..,10}, correct?
If so, will whichever value Hy is the prefix in the concatenation
correspond to APy's random ID?

In the proposed scheme, yes. Since AP1 and AP2 have different (but close) latitude and longitude positions, Hash(H1+H2) would fetch the random row id for AP1's location and Hash(H2+H1) would fetch the row id for AP2's location.
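The order dependence described here can be sketched as a small in-memory lookup. Everything concrete in it is an assumption made for illustration: SHA-256 as the hash, the two access points and their coordinates, the dictionary layout, and the names h, pair_index, locations and lookup. It is not the actual database schema.

```python
import hashlib
import secrets


def h(data: bytes) -> bytes:
    """Stand-in for the thread's unspecified Hash(); SHA-256 is an assumption."""
    return hashlib.sha256(data).digest()


# Hypothetical access points: (48-bit MAC as bytes, SSID, latitude, longitude).
ap1 = (bytes.fromhex("456789abcdef"), "CoffeeShop", 37.3894000, -122.0819000)
ap2 = (bytes.fromhex("0123456789ab"), "Library", 37.3895000, -122.0818000)

h1 = h(ap1[0] + ap1[1].encode("utf-8"))
h2 = h(ap2[0] + ap2[1].encode("utf-8"))

# Each location row gets a random ID that reveals nothing about the AP.
row_ap1 = secrets.token_hex(8)
row_ap2 = secrets.token_hex(8)

# The prefix of the concatenation decides whose row the key points at.
pair_index = {h(h1 + h2): row_ap1, h(h2 + h1): row_ap2}
locations = {row_ap1: ap1[2:], row_ap2: ap2[2:]}


def lookup(h_target: bytes, h_neighbor: bytes):
    """Return (lat, lon) for the target AP, or None if the pair is unknown."""
    row_id = pair_index.get(h(h_target + h_neighbor))
    return locations.get(row_id) if row_id is not None else None
```

In this sketch lookup(h1, h2) returns AP1's coordinates and lookup(h2, h1) returns AP2's, while a pair that was never observed together returns None, which is the property that requires a querier to know two neighboring access points.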


chris
_______________________________________________
dev-security mailing list
dev-security@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-security
