(c/o BoingBoing)

http://www.applefritter.com/bannedbooks


 Data Mining 101: Finding Subversives with Amazon Wishlists

Frequent Make contributor Tom Owad just published a mind-blowing how-to on
his website explaining how to mine Amazon's wish list database to uncover
"subversives."

    Using a pair of 5-year-old computers, two home DSL connections, 42 hours
of computer time, and 5 man hours, I now had documents describing the
reading preferences of 260,000 U.S. citizens.

    I downloaded all the files to an external 120 GB Firewire drive in UFS
format. The raw data occupied little more than 5 GB. I initially wanted to
move all the files into a single directory to facilitate searching, but as
the directory contents exceeded 100,000 items, the speed became glacially
slow, so I kept the data divided into chunks of 25,000 wishlists.
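
(A rough sketch, not from Owad's article, of how files might be split into
fixed-size chunk directories as described above. The chunk size of 25,000
comes from the excerpt; the directory naming scheme and the helper names
chunk_dir and store are assumptions made up for illustration.)

```python
from pathlib import Path

CHUNK_SIZE = 25_000  # wishlists per directory, the figure quoted in the excerpt


def chunk_dir(root: Path, index: int, chunk_size: int = CHUNK_SIZE) -> Path:
    """Return the subdirectory that should hold the index-th file (0-based).

    The "chunk_NNN" naming is hypothetical; any stable scheme would do.
    """
    return root / f"chunk_{index // chunk_size:03d}"


def store(root: Path, index: int, name: str, data: bytes) -> Path:
    """Write one downloaded file into its chunk directory and return its path."""
    target_dir = chunk_dir(root, index)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / name
    target.write_bytes(data)
    return target
```

Keeping each directory to a bounded number of entries avoids the slowdown
the excerpt reports once a single directory passed 100,000 items.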

    Next comes the fun part - what books are most dangerous? So many to
choose from. Here's a sample of the list I made. Feel free to make up your
own list if you decide to try some data mining. Send it to the FBI. I'm sure
they'll appreciate your help in fighting terrorism.
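
(For readers wondering how little code the search step needs, here is a
minimal sketch. It is not Owad's code. It assumes the saved wishlists are
plain text or HTML files somewhere under one root directory, and it does a
simple case-insensitive substring match. The function name find_matches
and the title list passed to it are hypothetical.)

```python
from pathlib import Path


def find_matches(root, titles):
    """Map each saved file's path to the list titles that appear in it.

    Assumes files are text/HTML; undecodable bytes are ignored.
    """
    wanted = [t.lower() for t in titles]
    hits = {}
    # rglob walks every subdirectory, so chunked storage needs no special handling
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        found = [t for t in wanted if t in text]
        if found:
            hits[str(path)] = found
    return hits
```

That a plain substring scan suffices is part of the article's point: once
the data is public and downloaded, the analysis itself is trivial.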



