Could we make the Maven test phase download this dataset once and share it
across all tests? Is that possible?
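
To make the question concrete, here is a rough sketch of what I have in mind, not a worked-out proposal. The class name DatasetCache and the method fetchOnce are hypothetical; nothing like this exists in Mahout today. It assumes the tests run in one JVM and that the cache directory sits somewhere like target/, so the file survives until the next mvn clean.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: fetches a dataset once and reuses the cached copy. */
public final class DatasetCache {

  private DatasetCache() {}

  /**
   * Returns the cached file if it is already present, otherwise downloads it.
   * Synchronized so that concurrent tests in the same JVM download only once.
   */
  public static synchronized Path fetchOnce(URL source, Path cacheDir, String fileName)
      throws IOException {
    Path target = cacheDir.resolve(fileName);
    if (Files.exists(target)) {
      return target; // an earlier test already fetched it
    }
    Files.createDirectories(cacheDir);
    // Download to a temporary file first so a failed download never
    // leaves a truncated file under the final name.
    Path partial = Files.createTempFile(cacheDir, fileName, ".part");
    try {
      try (InputStream in = source.openStream()) {
        Files.copy(in, partial, StandardCopyOption.REPLACE_EXISTING);
      }
      Files.move(partial, target, StandardCopyOption.REPLACE_EXISTING);
    } finally {
      Files.deleteIfExists(partial); // no-op after a successful move
    }
    return target;
  }
}
```

Each test class would call fetchOnce in its setup with the same cache directory, so only the first caller pays for the download. This does not cover tests forked into separate JVMs, which would need a file lock or a download step earlier in the build.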



On Tue, Feb 9, 2010 at 7:43 PM, Sean <sro...@gmail.com> wrote:

> I don't, but can offer alternatives --
>
> Just have the user download the data set. I don't think this is a big
> burden.
> Download the data set automatically.
>
> These are free of legal and tarball-size problems.
>
> On Tue, Feb 9, 2010 at 2:11 PM, Robin Anil <robin.a...@gmail.com> wrote:
> > I feel a need to check in a set of text documents to Mahout, maybe 3-4
> > categories of documents with 10 each. They can be used in clustering,
> > classification, vectorizer and collocation testing, and even frequent
> > pattern generation.
> >
> > And instead of doing artificial tests, each of these can use the documents
> > to test against a reference implementation written in the test class, like
> > what kmeans does.
> >
> > Plus we will have a baseline with which we can see improvements in these
> > algorithms. Any idea of some good (legally sound :)) dataset which we can
> > use?
> >
> > The same idea can be extended to CF also.
> >
> >
> > Robin
> >
>
