OK, results...

So, I used the test data set:

https://svn.apache.org/repos/asf/incubator/devicemap/trunk/openddr/test-data/src/main/resources/test-data/dmap_20130522.txt

leaving aside the desktop issue and the other stuff like bots and plain junk strings...

When everything was set up it *flew* through the 47k+ ua-strings in 35 seconds!

The result data can be found here:

http://www.ducis.net/static/result_20130625.zip

It is a pipe-separated file with a header:

Parser    : time taken in ms
DMap      : device claimed by DeviceMapClient
UserAgent : user-agent string
OpenDdr   : device claimed by the 'old' OpenDDR

The best thing to do is to import the lot into a database, then weed out the records WHERE DMap = 'unknown': these are devices which no longer occur in the current XML resources *OR* cases where DeviceMapClient.classify returned 'Nothing'.

This leaves 17,919 records to compare. Of these, 5,042 (28%) match, i.e. both DeviceMapClient and the 'old' OpenDDR agree on the DeviceId.

I picked out a few strings and ran them in the simple console app to double-check, and the results were identical.

Time to bring back some regex, I fear :-(

esjr
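For anyone who would rather not set up a database, the same weeding and comparison can probably be done in a few lines of Python. This is only a rough sketch, not the tooling used for the numbers above. It assumes the header row spells the columns exactly as DMap and OpenDdr, that 'unknown' is the literal value to drop, and that no user-agent string contains a pipe itself. The function name compare and the sample rows are made up for illustration.

```python
import csv
import io


def compare(lines):
    """Return (compared, matched) for a pipe-separated result file.

    Assumes a header row with columns named DMap and OpenDdr.
    Rows where DMap is 'unknown' are skipped before comparing.
    """
    # QUOTE_NONE: user-agent strings may contain quote characters.
    reader = csv.DictReader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    compared = matched = 0
    for row in reader:
        dmap = (row["DMap"] or "").strip()
        if dmap == "unknown":
            continue  # weeded out, same as the WHERE clause
        compared += 1
        if dmap == (row["OpenDdr"] or "").strip():
            matched += 1
    return compared, matched


if __name__ == "__main__":
    # Made-up sample rows; the device ids are placeholders, not real data.
    sample = io.StringIO(
        "Parser|DMap|UserAgent|OpenDdr\n"
        "12|deviceA|ExampleUA/1.0|deviceA\n"
        "9|unknown|ExampleBot/1.0|deviceX\n"
        "15|deviceB|ExampleUA/2.0|deviceC\n"
    )
    compared, matched = compare(sample)
    print(f"{compared} compared, {matched} matched ({matched / compared:.0%})")
```

To run it on the real file, unzip the archive and pass an open file handle instead of the sample, e.g. compare(open(path, newline="")), where path is wherever the extracted file ends up.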
