Take a look at this repository: http://fimi.cs.helsinki.fi/data/
I am specifically talking about the retail and accidents datasets. I am
using a modified (comma-separated) version of them for FPGrowth testing.
The webdocs dataset looks good enough to be used for parallel FPGrowth
testing.

The question is: should I fetch them from that URL and then convert them to
the required format, or keep the already converted files somewhere like
people.apache.org/~robinanil/datasets/ or a location dedicated to Mahout?
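
For reference, the fetch-then-convert option could look roughly like the sketch below. It is only an illustration, not what I am actually running: it assumes the files are plain text with one transaction per line and whitespace-separated item ids, and that they are served gzip-compressed as <name>.dat.gz under the URL above. The function names and the output path are made up for the example.

```python
import gzip
import urllib.request

BASE_URL = "http://fimi.cs.helsinki.fi/data/"  # URL from this message


def to_comma_separated(lines):
    """Yield each whitespace-separated transaction as a comma-separated line."""
    for line in lines:
        items = line.split()
        if items:  # skip blank lines
            yield ",".join(items)


def fetch_and_convert(name, out_path):
    """Download one dataset and write it out in comma-separated form.

    Assumption: the dataset is available as BASE_URL + name + ".dat.gz".
    """
    url = BASE_URL + name + ".dat.gz"
    with urllib.request.urlopen(url) as resp:
        with gzip.open(resp, "rt", encoding="utf-8") as src, \
                open(out_path, "w", encoding="utf-8") as dst:
            for row in to_comma_separated(src):
                dst.write(row + "\n")
```

Something like fetch_and_convert("retail", "retail.csv") would then produce the file used by the tests. The catch is that every test run depends on the remote site being reachable, which is the argument for hosting the converted copies instead.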



On Thu, Oct 8, 2009 at 5:39 PM, Robin Anil <robin.a...@gmail.com> wrote:

> We need a central place for all sample datasets used for examples and unit
> tests. I am against putting them in the repo.
> Any suggestions?
>
> Robin
>
