I don't think it is ideal to run tests on datasets that prohibit
redistribution and are essentially non-free. Are there alternatives
for this? I would be in favor of using only free datasets.
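
For reference, the hybrid fallback described in the quoted message below
could be sketched roughly like this in Python. This is only an illustration
under my own assumptions, not existing MXNet code: fetch_with_fallback and
the mirror URL are hypothetical names, and only the grouplens.org URL comes
from this thread.

```python
import shutil
import urllib.request


def fetch_with_fallback(urls, dest, opener=urllib.request.urlopen, timeout=30):
    """Try each URL in order and save the first successful response to dest.

    Returns the URL that worked; raises RuntimeError if every source fails.
    The `opener` parameter exists only so the function can be tested offline.
    """
    errors = []
    for url in urls:
        try:
            with opener(url, timeout=timeout) as response, open(dest, "wb") as out:
                shutil.copyfileobj(response, out)
            return url
        except OSError as exc:
            # URLError, HTTPError and socket timeouts are all OSError subclasses.
            errors.append((url, exc))
    raise RuntimeError("all sources failed: %r" % (errors,))


# Hypothetical usage: private mirror first, original source second.
# MIRROR = "https://example-bucket.s3.amazonaws.com/ml-100k.zip"  # made-up URL
# ORIGIN = "http://files.grouplens.org/datasets/movielens/ml-100k.zip"
# fetch_with_fallback([MIRROR, ORIGIN], "ml-100k.zip")
```

Note that this only covers an outage on one side. It does not address the
licensing concern, since the test would still depend on a non-free dataset.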

On Sun, Jan 7, 2018 at 2:26 AM, Marco de Abreu
<marco.g.ab...@googlemail.com> wrote:
> I have been thinking about creating a private S3 bucket, but this would
> make it impossible to run the tests locally. On the other hand, the
> licenses of many datasets like MovieLens forbid redistribution, which means
> making the S3 bucket public is not allowed. We could think about a
> hybrid solution which tries the S3 bucket first and downloads the file
> from an alternative address (i.e. the original source) if the S3 bucket is
> not reachable.
>
> On Sun, Jan 7, 2018 at 12:29 AM, Marco de Abreu <
> marco.g.ab...@googlemail.com> wrote:
>
>> I could offer to download the dataset and create an S3 bucket to store all
>> used datasets. This would also reduce external dependencies.
>>
>> Wdyt?
>>
>> -Marco
>>
>> On 07.01.2018 at 12:26 AM, "kellen sunderland" <
>> kellen.sunderl...@gmail.com> wrote:
>>
>>> FYI PRs are currently failing to build.  The R "Matrix Factorization" test
>>> is failing to download this dataset:
>>> http://files.grouplens.org/datasets/movielens/ml-100k.zip .  The site
>>> https://grouplens.org/ appears to be down.
>>>
>>> Issue here: https://github.com/apache/incubator-mxnet/issues/9332
>>> PR to skip the test here:
>>> https://github.com/apache/incubator-mxnet/pull/9333
>>>
>>> -Kellen
>>>
>>
