I don't think it is ideal to run tests on datasets that forbid redistribution and are essentially non-free. Are there alternatives? I would be in favor of using only free datasets.
On Sun, Jan 7, 2018 at 2:26 AM, Marco de Abreu <marco.g.ab...@googlemail.com> wrote:
> I have been thinking about creating a private s3 bucket, but this would
> render it impossible to run the tests locally. On the other hand, the
> licenses of many datasets like Movielens forbid redistribution, means
> setting the s3 bucket to public is not allowed. We could think about a
> hybrid solution which tries to query the s3 bucket and downloads the file
> from an alternative address (aka the original source) if the s3 bucket is
> not reachable.
>
> On Sun, Jan 7, 2018 at 12:29 AM, Marco de Abreu <marco.g.ab...@googlemail.com> wrote:
>
>> I could offer to download the dataset and create an S3 bucket to store all
>> used datasets. This would also reduce external dependencies.
>>
>> Wdyt?
>>
>> -Marco
>>
>> On 07.01.2018 at 12:26 AM, "kellen sunderland" <kellen.sunderl...@gmail.com> wrote:
>>
>>> FYI PRs are currently failing to build. The R "Matrix Factorization" test
>>> is failing to download this dataset:
>>> http://files.grouplens.org/datasets/movielens/ml-100k.zip . The site
>>> https://grouplens.org/ appears to be down.
>>>
>>> Issue here: https://github.com/apache/incubator-mxnet/issues/9332
>>> PR to skip the test here:
>>> https://github.com/apache/incubator-mxnet/pull/9333
>>>
>>> -Kellen
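
For reference, the hybrid lookup Marco describes in the quoted thread (try the mirror first, fall back to the original source if it is not reachable) could be sketched roughly as below. This is only an illustration in Python, not what the MXNet test suite does today: the function name fetch_with_fallback and the injectable opener parameter are hypothetical, and it assumes the mirror is reachable over plain HTTP(S), which a private S3 bucket would not be without credentials. It also does nothing about the licensing question.

```python
import urllib.request


def fetch_with_fallback(urls, dest, opener=urllib.request.urlopen, timeout=30):
    """Try each URL in order and save the first successful download to dest.

    Returns the URL that worked. Raises RuntimeError if every source fails.
    `opener` is injectable (hypothetical parameter) so the logic can be
    exercised without network access.
    """
    errors = []
    for url in urls:
        try:
            with opener(url, timeout=timeout) as response:
                data = response.read()
        except OSError as exc:
            # urllib.error.URLError, HTTPError and socket timeouts are all
            # OSError subclasses, so an unreachable host lands here.
            errors.append("%s: %s" % (url, exc))
            continue
        # Only write once the whole body has been read, so a failed
        # transfer does not leave a truncated file at dest.
        with open(dest, "wb") as out:
            out.write(data)
        return url
    raise RuntimeError("all sources failed: " + "; ".join(errors))
```

A test would pass the mirror URL first and http://files.grouplens.org/datasets/movielens/ml-100k.zip second, so an outage on either side alone does not break the build. The return value tells you which source was actually used, which might be worth logging in CI.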