The important thing here is that we test the code on a sufficiently large dataset on a real cluster. Take that on, if you want!

On 02.06.2014 20:08, "Pat Ferrel (JIRA)" <[email protected]> wrote:
>
> [ https://issues.apache.org/jira/browse/MAHOUT-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015667#comment-14015667 ]
>
> Pat Ferrel commented on MAHOUT-1464:
> ------------------------------------
>
> [~ssc] Should I reassign to me for now so we can get this committed?
>
> > Cooccurrence Analysis on Spark
> > ------------------------------
> >
> >          Key: MAHOUT-1464
> >          URL: https://issues.apache.org/jira/browse/MAHOUT-1464
> >      Project: Mahout
> >   Issue Type: Improvement
> >   Components: Collaborative Filtering
> >  Environment: hadoop, spark
> >     Reporter: Pat Ferrel
> >     Assignee: Sebastian Schelter
> >      Fix For: 1.0
> >
> >  Attachments: MAHOUT-1464.patch, MAHOUT-1464.patch, MAHOUT-1464.patch,
> >               MAHOUT-1464.patch, MAHOUT-1464.patch, MAHOUT-1464.patch,
> >               run-spark-xrsj.sh
> >
> >
> > Create a version of Cooccurrence Analysis (RowSimilarityJob with LLR)
> > that runs on Spark. This should be compatible with Mahout Spark DRM DSL
> > so a DRM can be used as input.
> > Ideally this would extend to cover MAHOUT-1422. This cross-cooccurrence
> > has several applications including cross-action recommendations.
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.2#6252)
>
