The link is http://www.occamslab.com/petricek/data/

The KDD or Netflix data are plenty big to play with. How big is big for your
purpose?

On Fri, Jul 8, 2011 at 7:05 AM, web service <wbs...@gmail.com> wrote:

> Is it taken offline as well?
>
> On Thu, Jul 7, 2011 at 10:40 PM, Alex Kozlov <ale...@cloudera.com> wrote:
>
> > There is still a libimseti dataset
> > http://www.occamslab.com/petricek/data with 17,359,346 ratings.  People
> > are scared after the Netflix lawsuit.
> >
> > On Thu, Jul 7, 2011 at 10:17 PM, Ted Dunning <ted.dunn...@gmail.com>
> > wrote:
> >
> > > Those are both reasonably large, but not commercial in scale.
> > >
> > > > At Veoh, we had about 10 non-zero elements in our raw data.  I think
> > > > Netflix has 100 million.
> > >
> > > On Thu, Jul 7, 2011 at 8:05 PM, Lance Norskog <goks...@gmail.com>
> > > wrote:
> > >
> > > > What recommendation datasets, that are available, are considered
> > > > "large" by Mahout testing standards? Yahoo KDD Cup is offline, the
> > > > Netflix data went under a cloud...
> > > >
> > > > --
> > > > Lance Norskog
> > > > goks...@gmail.com
> > > >
> > >
> >
>
