[ https://issues.apache.org/jira/browse/MAHOUT-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029374#comment-14029374 ]

ASF GitHub Bot commented on MAHOUT-1464:
----------------------------------------

Github user pferrel commented on the pull request:

    https://github.com/apache/mahout/pull/12#issuecomment-45917234
  
    I already fixed the header.
    
    I agree with Ted; this is kind of what functional programming is for. The 
reason I didn't use the Java aggregate is that it isn't distributed. Still, 
that's probably beyond this ticket. I'll refactor if a Scala journeyman wants 
to provide a general mechanism; I'm still on training wheels.
    
    This still needs to be tested in a distributed Spark+HDFS environment, and 
MAHOUT-1561 will make that testing easy. I'd be happy to merge this and move 
on, which will have the side effect of exercising larger datasets and clusters.
    
    If someone wants to test this now on a Spark+HDFS cluster, please do!


> Cooccurrence Analysis on Spark
> ------------------------------
>
>                 Key: MAHOUT-1464
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1464
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>         Environment: hadoop, spark
>            Reporter: Pat Ferrel
>            Assignee: Pat Ferrel
>             Fix For: 1.0
>
>         Attachments: MAHOUT-1464.patch, MAHOUT-1464.patch, MAHOUT-1464.patch, 
> MAHOUT-1464.patch, MAHOUT-1464.patch, MAHOUT-1464.patch, run-spark-xrsj.sh
>
>
> Create a version of Cooccurrence Analysis (RowSimilarityJob with LLR) that 
> runs on Spark. This should be compatible with the Mahout Spark DRM DSL so a 
> DRM can be used as input. 
> Ideally this would extend to cover MAHOUT-1422. This cross-cooccurrence has 
> several applications, including cross-action recommendations. 
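The LLR score the description refers to can be sketched in a few lines. Below is a minimal, self-contained version of Dunning's G² log-likelihood ratio, the same statistic Mahout's LogLikelihood helper computes for each item-pair count table; the class and method names here are illustrative, not the ticket's actual code.

```java
// Hedged sketch of the log-likelihood ratio (LLR) used to score cooccurrence.
// k11 = both events observed together, k12/k21 = one event without the other,
// k22 = neither event. Names are illustrative, not taken from the patch.
public class Llr {
    // x * ln(x), with the usual convention 0 * ln(0) = 0
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized Shannon entropy of a set of counts
    static double entropy(long... counts) {
        long sum = 0;
        double result = 0.0;
        for (long c : counts) {
            result += xLogX(c);
            sum += c;
        }
        return xLogX(sum) - result;
    }

    public static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double columnEntropy = entropy(k11 + k21, k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        if (rowEntropy + columnEntropy < matrixEntropy) {
            return 0.0; // guard against tiny negative values from rounding
        }
        return 2.0 * (rowEntropy + columnEntropy - matrixEntropy);
    }

    public static void main(String[] args) {
        // Perfectly correlated events score high (4 * ln 2 here)...
        System.out.println(logLikelihoodRatio(1, 0, 0, 1));
        // ...while independent events score 0.
        System.out.println(logLikelihoodRatio(10, 10, 10, 10));
    }
}
```

In the RowSimilarityJob setting these four counts come from comparing two item columns of the interaction matrix, and pairs with a high LLR become the indicators the recommender keeps.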



--
This message was sent by Atlassian JIRA
(v6.2#6252)
