+1, I'm ready. What do we need? Perf tuning? Cluster setup? Amazon credits?
Someone to pay for the machines, or do we pay from our own pockets?


Robin

On Fri, Feb 26, 2010 at 1:20 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> These guys:
>
>
> http://delivery.acm.org/10.1145/1460000/1459718/a18-vigna.pdf?key1=1459718&key2=4070317621&coll=GUIDE&dl=GUIDE&CFID=77555530&CFTOKEN=13940667
>
> say this:
>
>   > We present experiments over a collection with 3.6 billions of
> postings---two orders of magnitudes larger than any published experiment in
> the literature.
>
> My impression is that Mahout on about 100 machines is ready to break this
> record with Jake's latest code.  The stochastic decomposition should make it
> even more plausible.
>
> The hardest part will be to find reasonable data with > 4 billion non-zero
> entries.  At 0.01% sparsity, this is roughly a square matrix with 5 million
> rows and columns.
>
> Jake, your social graph should be much larger than that.
>
> --
> Ted Dunning, CTO
> DeepDyve
>
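
As a rough check on the sizing in the quoted message, here is a minimal sketch. It assumes that "0.01% sparsity" means 0.01% of the cells are non-zero and that the matrix is square; the function names are made up for illustration and are not part of Mahout.

```python
import math

# Assumed reading of "0.01% sparsity": the fraction of cells that are non-zero.
DENSITY = 0.01 / 100


def square_side_for_nonzeros(nonzeros: float, density: float) -> int:
    """Smallest side n of a square matrix holding at least `nonzeros`
    non-zero entries when a fraction `density` of its n*n cells is filled."""
    return math.ceil(math.sqrt(nonzeros / density))


def nonzeros_for_square(side: int, density: float) -> float:
    """Expected number of non-zero entries in a side x side matrix."""
    return side * side * density


# Side needed to reach the 4 billion non-zero target.
side = square_side_for_nonzeros(4e9, DENSITY)

# Non-zero count for the 5 million x 5 million matrix mentioned in the thread.
nnz_at_5m = nonzeros_for_square(5_000_000, DENSITY)

print(f"side for 4e9 non-zeros: {side:,}")
print(f"non-zeros at 5M x 5M:   {nnz_at_5m:,.0f}")
```

Under these assumptions a 5 million square matrix holds about 2.5 billion non-zeros, and reaching 4 billion takes a side of about 6.3 million, so the quoted figure is the right order of magnitude.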
