For redistributable data, we should definitely lock down a version in our
distribution or an associated one.  This is true if only to make sure that
we don't get surprised by somebody rearranging their web site.

For non-redistributable but available data, I think having a download
procedure that sucks the data down from a URL is fine.  There is probably
even a Maven lifecycle phase that is appropriate (process-test-resources or
some such).
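
A rough sketch of what such a download procedure might look like, just to
make the idea concrete.  The class name, method name and cache layout are
made up for illustration, not anything that exists in our code.  It fetches
a URL into a local cache directory once and reuses the copy afterwards:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of any existing code base.
public class DatasetFetcher {

    /**
     * Returns the local copy of the data set, downloading it from
     * source into cacheDir only if it is not already there.
     */
    public static Path fetch(URI source, Path cacheDir, String fileName)
            throws IOException {
        Files.createDirectories(cacheDir);
        Path target = cacheDir.resolve(fileName);
        if (Files.exists(target)) {
            return target; // already cached, do not hit the network again
        }
        // Download to a temporary file first so that an interrupted
        // transfer never leaves a truncated file under the final name.
        Path tmp = Files.createTempFile(cacheDir, fileName, ".part");
        try {
            try (InputStream in = source.toURL().openStream()) {
                Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            }
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            Files.deleteIfExists(tmp); // no-op once the move has succeeded
        }
        return target;
    }
}
```

Something like this could be called from test setup, or from whatever is
bound to the build phase, so that the data never has to live in the repo.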

On Thu, Oct 8, 2009 at 5:32 AM, Sean Owen <sro...@gmail.com> wrote:

> Several data sets I use have distribution clauses that forbid or
> complicate redistribution, so not sure I can do that. Of course we
> should check that on any other data set.
>
> On Thu, Oct 8, 2009 at 1:09 PM, Robin Anil <robin.a...@gmail.com> wrote:
> > We need a central place for all sample datasets used for examples and
> > unit tests? I am against putting it in the repo
> > Any suggestions?
> >
> > Robin
> >
>



-- 
Ted Dunning, CTO
DeepDyve
