[jira] [Commented] (MAHOUT-1464) Cooccurrence Analysis on Spark

Pat Ferrel (JIRA) Tue, 15 Apr 2014 10:17:27 -0700

    [ 
https://issues.apache.org/jira/browse/MAHOUT-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969766#comment-13969766
 ]


Pat Ferrel commented on MAHOUT-1464:
------------------------------------

To sum up, Spark Cooccurrence seems to complete correctly on the Spark cluster 
in every configuration. Writing output, however, has failed in every case where 
the remote Spark cluster is used for computation. As far as I can tell, reading 
input from the local filesystem or HDFS works in all cases.

Next I'll try running my tests from the Spark master machine by installing IDEA 
there. Surely there is some way other than IDEA to run this?

> Cooccurrence Analysis on Spark
> ------------------------------
>
>                 Key: MAHOUT-1464
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1464
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>         Environment: hadoop, spark
>            Reporter: Pat Ferrel
>            Assignee: Sebastian Schelter
>             Fix For: 1.0
>
>         Attachments: MAHOUT-1464.patch, MAHOUT-1464.patch, MAHOUT-1464.patch, 
> MAHOUT-1464.patch, MAHOUT-1464.patch, MAHOUT-1464.patch, run-spark-xrsj.sh
>
>
> Create a version of Cooccurrence Analysis (RowSimilarityJob with LLR) that 
> runs on Spark. This should be compatible with Mahout Spark DRM DSL so a DRM 
> can be used as input. 
> Ideally this would extend to cover MAHOUT-1422. This cross-cooccurrence has 
> several applications including cross-action recommendations. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
