[
https://issues.apache.org/jira/browse/MAHOUT-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018871#comment-14018871
]
ASF GitHub Bot commented on MAHOUT-1464:
----------------------------------------
Github user sscdotopen commented on the pull request:
https://github.com/apache/mahout/pull/8#issuecomment-45234269
It's not allowed to redistribute the MovieLens dataset.
On 06/05/2014 05:28 PM, Pat Ferrel wrote:
> I could use a little advice here. Should the epinions and movielens tests
> in the examples folder be put into the build?
>
> Pros: good example data.
> Cons: the reading and writing are not parallel, so the tests only work
> locally. It is easy to change the Spark context to use a cluster, but the
> data still has to be local. These tests would be easier to maintain if they
> were attached to the ItemSimilarityDriver, which will handle cluster storage
> and execution and will be better maintained.
>
> I'd rather move them out into an ItemSimilarityDriver examples folder and
> will do this if no one objects. They will not be build tests, obviously,
> since they take too long.
>
> Cooccurrence Analysis on Spark
> ------------------------------
>
> Key: MAHOUT-1464
> URL: https://issues.apache.org/jira/browse/MAHOUT-1464
> Project: Mahout
> Issue Type: Improvement
> Components: Collaborative Filtering
> Environment: hadoop, spark
> Reporter: Pat Ferrel
> Assignee: Sebastian Schelter
> Fix For: 1.0
>
> Attachments: MAHOUT-1464.patch, MAHOUT-1464.patch, MAHOUT-1464.patch,
> MAHOUT-1464.patch, MAHOUT-1464.patch, MAHOUT-1464.patch, run-spark-xrsj.sh
>
>
> Create a version of Cooccurrence Analysis (RowSimilarityJob with LLR) that
> runs on Spark. This should be compatible with the Mahout Spark DRM DSL so
> that a DRM can be used as input.
> Ideally this would extend to cover MAHOUT-1422. This cross-cooccurrence has
> several applications including cross-action recommendations.