On Tue, Mar 9, 2010 at 7:49 PM, Robin Anil <[email protected]> wrote:

> http://warsteiner.db.cs.cmu.edu/db-site/Datasets/graphData/
> Seems like there are plenty of interesting datasets here to try mahout on.
> There is even a p2p network graph. 790MB compressed Sounds like a good test
> matrix for the decomposer stuff
>

Three words: twitter social graph:
    http://an.kaist.ac.kr/traces/WWW2010.html
6GB compressed, 60M x 60M sparse matrix.

I've pulled the torrent and will put sequence files of vectors in some S3
buckets once I get them processed.  This is a matrix with a good 1.47B
nonzero entries, and it is publicly available.  Not record breaking, but
pretty darn huge.
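
For a rough sense of scale, here is a minimal sketch (plain Python, not
Mahout code) of turning an edge list into sparse row vectors and of the
density implied by the figures above.  The (source, destination) pair format
and the helper names are assumptions for illustration only; the actual
layout of the dataset files is not described here.

```python
from collections import defaultdict

def edges_to_sparse_rows(edges):
    """Group (src, dst) pairs into one sparse row per source node.

    Each row is a dict mapping column index -> 1.0; duplicate edges
    collapse into a single nonzero entry.
    """
    rows = defaultdict(dict)
    for src, dst in edges:
        rows[src][dst] = 1.0
    return dict(rows)

def density(nonzeros, n_rows, n_cols):
    """Fraction of matrix cells that are nonzero."""
    return nonzeros / (n_rows * n_cols)

# Toy edge list (hypothetical data, just to exercise the helper).
toy_edges = [(0, 1), (0, 2), (2, 0), (2, 0)]
rows = edges_to_sparse_rows(toy_edges)
nnz = sum(len(r) for r in rows.values())

# Figures quoted above: about 1.47B nonzeros in a 60M x 60M matrix.
full_density = density(1.47e9, 60e6, 60e6)
avg_per_row = 1.47e9 / 60e6
```

With those figures the matrix is only about 4e-7 dense, or roughly 24.5
nonzeros per row on average, so a sparse row representation is the only
practical one.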

  -jake
