On Fri, Feb 26, 2010 at 1:36 AM, Robin Anil <robin.a...@gmail.com> wrote:
>
> My mind was wandering, and I was thinking of giving the record attempt a better
> purpose than just creating junk ngram data (it's good enough for a record
> attempt).
> There are a couple of datasets we can explore, like the genome dataset.

Another interesting dataset is the Wikipedia page traffic stats dataset:
http://www.datawrangling.com/wikipedia-page-traffic-statistics-dataset

I wonder if there's something interesting that can be done with that
and the frequent pattern mining code.

One advantage of this dataset is that it's already on EC2.
