Marc G. Fournier wrote:
> On Wed, 9 Aug 2006, Howard Jones wrote:
>
>> Marc G. Fournier wrote:
>>
>>> Right, and the bad thing is if you alias another IP on that device, the
>>> hash totally changes, so we see that one host now as being two different
>>> ones :) That's why we disqualified using ifconfig right at the
>>> beginning ...
>>
>> But didn't you say that you effectively wipe the database once a month,
>> (or expire entries over that age)? I can't find the post that mentioned
>> that now, naturally... :-) if you aren't using the 'key' as a database
>> key, then what do you care that it changes as long as it uniquely
>> identifies the system (which it definitely would)?
>>
>> I don't know how typical I am, but I don't really remember the last time
>> I added an IP alias on a running server, for our few dozen production
>> systems. I would imagine that those types of changes might well be lost
>> of systems coming and going.
>
> I add/remove IPs from our servers several times each week, as we add VPS
> and remove them, or move them between boxes ...

This problem is intractable: any scheme you can think of to generate a
unique identifying number on a random host out there on the net will
either fail to actually be unique, or suffer from mutating over time as
the machine's configuration changes.

How about the following? Have the bsdstats.hub.org server generate a
random token and hand it to the client. 128 bits of randomness gives a
sufficiently large domain (2^128, i.e.
340,282,366,920,938,463,463,374,607,431,768,211,456 possible values)
that, given a good RNG, collisions are not a problem. You can generate
that sort of token easily with, for example:

    % openssl rand -base64 16
    KSOWkPuK03Od99S5vaPGdQ==

Base64 encoded strings will have to be URL escaped if they are passed as
parameters in an HTTP GET -- perhaps encoding as a string of hex digits
might be a better idea:

    % openssl rand 16 | hexdump -e '16/1 "%02x" "\n"'
    566fc9f2374a7e999d9587dc143373fc

Anyhow, that's just implementation detail.

So the transaction would go like this the first time a client machine
tried to report its configuration:

    Client                                   Server
-----------------------------------------------------------------------------
  Check for cached ID token
      Not found
  Request new token from server  ------->  Generate token
                                           Record it in DB
                                 <-------  Return token to client
  Cache token in file

  Generate OS version info
  Send to server with ID token   ------->  If token is known, record
                                           data in DB

  Generate Driver info
  Send to server with ID token   ------->  If token is known, record
                                           data in DB

  etc. etc.
-----------------------------------------------------------------------------

Because the server generates the tokens, it knows which ones are valid,
and can discard any data sent to it without a valid token. That doesn't
prevent any vandal-minded person from requesting a metric butt-load of
tokens to spam the database with, but that's no worse than the current
situation.
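
To make the exchange above a bit more concrete, here is a minimal Python
sketch of it. This is not the actual bsdstats code: the TokenServer
class, its method names and get_cached_token() are all made up for
illustration, and the "server" is just an in-memory dict standing in for
a real HTTP service with a database behind it.

```python
import os
import secrets
import tempfile


class TokenServer:
    """In-memory stand-in for the server side (hypothetical names)."""

    def __init__(self):
        # Maps each issued token to the list of reports received for it.
        self.reports = {}

    def new_token(self):
        # 16 random bytes = 128 bits, rendered as 32 hex digits so the
        # token needs no URL escaping.
        token = secrets.token_hex(16)
        self.reports[token] = []
        return token

    def record(self, token, data):
        # Discard anything that arrives without a token we issued.
        if token not in self.reports:
            return False
        self.reports[token].append(data)
        return True


def get_cached_token(server, cache_path):
    """Return the cached token, requesting and caching one if absent."""
    try:
        with open(cache_path) as f:
            token = f.read().strip()
        if token:
            return token
    except FileNotFoundError:
        pass
    # No usable cache file: ask the server for a fresh token and save it.
    token = server.new_token()
    with open(cache_path, "w") as f:
        f.write(token + "\n")
    return token


if __name__ == "__main__":
    server = TokenServer()
    with tempfile.TemporaryDirectory() as tmp:
        cache = os.path.join(tmp, "bsdstats")  # stand-in for the cache file
        token = get_cached_token(server, cache)
        print("report accepted:", server.record(token, "OS version info"))
        print("bogus token accepted:", server.record("not-a-token", "spam"))
        print("token reused:", get_cached_token(server, cache) == token)
```

The point of the sketch is only that validity is decided by a lookup on
the server side, so the client never has to derive anything from its own
configuration.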

The neat thing is, the number of available tokens is so huge that it is
infeasible to guess or accidentally collide with someone else's token.
E.g. at 100Mb/s, sending nothing but 128-bit guesses, it would take on
the order of 10^33 seconds, or 10^25 years, to exhaustively search the
whole token space. Thus spammed data will just time out at the end of
the month without affecting anyone else's real data.

Stealing an existing ID token by breaking into a machine or snooping on
the net would be possible, but presumably sufficiently difficult to do
in a large enough quantity that it wouldn't have a significant effect on
the overall statistics. If snooping turns out to be a real problem, then
using HTTPS is a possibility, but that will ramp up the load on the
server quite a bit.

For subsequent updates, the client machine just reuses the same token
out of its cache file. If the cached token gets deleted, then the client
machine will just have to request a new one and rely on the old data
timing out at the end of the month.

Saving away the token should be simple -- just make the server return
the response to a 'get_token' query as MIME type text/plain and have
fetch dump it in a cache file somewhere: /var/db/bsdstats, for example.

I can code up the client side of this in about 5 minutes, but the server
end of things will take a little more work.

	Cheers,

	Matthew

--
Dr Matthew J Seaman MA, D.Phil.                       7 Priory Courtyard
                                                      Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey         Ramsgate
                                                      Kent, CT11 9PW