Cool, yeah, I'm looking for something even larger, since this one is small enough that processing it easily fits on one computer. The chapter in question is about distributing the processing via Hadoop.
My current next-best option, if it can be used, is the LiveJournal network data here: http://snap.stanford.edu/data/index.html

On Fri, May 7, 2010 at 4:29 PM, Pedro Oliveira <[email protected]> wrote:
> This dataset seems to have a few million <user, artist, plays> triples from
> last.fm:
> http://mtg.upf.edu/node/1671
