For redistributable data, we should definitely lock down a version, either in our distribution or in an associated one, if only so that we aren't surprised when somebody rearranges their web site.
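As a rough illustration of what "locking down" could mean in practice, here is a minimal sketch that assumes we record a SHA-256 digest next to each data set and check it before tests use the file. The class and method names (`DatasetChecksum`, `sha256`, `matches`) are hypothetical, not anything that exists in the project today; only standard JDK calls are used.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: verifies that a data file is the exact version we pinned.
final class DatasetChecksum {

  private DatasetChecksum() {}

  /** Returns the lowercase hex SHA-256 digest of the given file. */
  static String sha256(Path file) throws IOException, NoSuchAlgorithmException {
    MessageDigest digest = MessageDigest.getInstance("SHA-256");
    // Stream the file so large data sets are not loaded into memory at once.
    try (InputStream in = Files.newInputStream(file)) {
      byte[] buffer = new byte[8192];
      int read;
      while ((read = in.read(buffer)) != -1) {
        digest.update(buffer, 0, read);
      }
    }
    StringBuilder hex = new StringBuilder();
    for (byte b : digest.digest()) {
      hex.append(String.format("%02x", b));
    }
    return hex.toString();
  }

  /** True if the file's digest equals the recorded (pinned) hex digest. */
  static boolean matches(Path file, String expectedHex)
      throws IOException, NoSuchAlgorithmException {
    return sha256(file).equalsIgnoreCase(expectedHex);
  }
}
```

A test setup step could call `DatasetChecksum.matches(path, recordedDigest)` and fail fast on a mismatch. The same check would work whether the file ships with us or is fetched from somewhere else.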
For non-redistributable but available data, I think having a download procedure that sucks the data down from a URL is fine. There is probably even a Maven life-cycle phase that is appropriate (process-test-resources or some such).

On Thu, Oct 8, 2009 at 5:32 AM, Sean Owen <sro...@gmail.com> wrote:

> Several data sets I use have distribution clauses that forbid or
> complicate redistribution, so not sure I can do that. Of course we
> should check that on any other data set.
>
> On Thu, Oct 8, 2009 at 1:09 PM, Robin Anil <robin.a...@gmail.com> wrote:
> > We need a central place for all sample datasets used for examples and
> > unit tests? I am against putting it in the repo
> > Any suggestions?
> >
> > Robin

--
Ted Dunning, CTO
DeepDyve