[ https://issues.apache.org/jira/browse/MAHOUT-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010653#comment-14010653 ]

Pat Ferrel commented on MAHOUT-1464:
------------------------------------

There have been no commits afaik. The status is for Sebastian to say but I've 
used the cooccurrence analysis and it works correctly. I can't verify Spark 
cluster execution with HDFS due to what I think is my own bad setup.

If someone else could test it on a cluster I'd say it should be committed. If 
we can wait, I'm trying to get my cluster upgraded to Hadoop 2 and Spark 
reconfigured for it, and then I'll try testing this on the new setup.

There are no Scala tests for this, though there are some in the patches. I'm 
adding some Scala tests that will cover this code as part of building a CLI in 
MAHOUT-1541, which is a few weeks from being ready to commit.

I'm not sure it's packaged correctly. The tests supplied here are really 
examples, since they run on large datasets and take a long time to execute.

Bottom line: it needs to be verified on a cluster and checked for package 
structure. I'm happy to do this if we don't need it committed right away. Both 
of these things need to be done as part of MAHOUT-1541, which I'm actively 
working on but which is not really ready for review yet.

> Cooccurrence Analysis on Spark
> ------------------------------
>
>                 Key: MAHOUT-1464
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1464
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>         Environment: hadoop, spark
>            Reporter: Pat Ferrel
>            Assignee: Sebastian Schelter
>             Fix For: 1.0
>
>         Attachments: MAHOUT-1464.patch, MAHOUT-1464.patch, MAHOUT-1464.patch, 
> MAHOUT-1464.patch, MAHOUT-1464.patch, MAHOUT-1464.patch, run-spark-xrsj.sh
>
>
> Create a version of Cooccurrence Analysis (RowSimilarityJob with LLR) that 
> runs on Spark. This should be compatible with Mahout Spark DRM DSL so a DRM 
> can be used as input. 
> Ideally this would extend to cover MAHOUT-1422. This cross-cooccurrence has 
> several applications including cross-action recommendations. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
