[jira] [Commented] (MAHOUT-1464) Cooccurrence Analysis on Spark

Pat Ferrel (JIRA) Mon, 02 Jun 2014 12:39:26 -0700

    [ 
https://issues.apache.org/jira/browse/MAHOUT-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015788#comment-14015788
 ]


Pat Ferrel commented on MAHOUT-1464:
------------------------------------

Looks like DrmLike may have been refactored since this patch was written.

[~dlyubimov] The following patch code fails at "elem" with the error "Missing 
parameter type 'elem'". Looking at the scaladocs, I tracked this back to the 
DrmLike trait and see no way to call .mapBlock on it. Has something been 
refactored here? I think .nonZeroes() returns a Java sparse vector iterator. 
This worked about a month ago, so I thought you might have an idea of how 
things have changed.

{code:scala}
  def computeIndicators(drmBtA: DrmLike[Int], numUsers: Int, maxInterestingItemsPerThing: Int,
                        bcastNumInteractionsB: Broadcast[Vector], bcastNumInteractionsA: Broadcast[Vector],
                        crossCooccurrence: Boolean = true) = {
    drmBtA.mapBlock() {
      case (keys, block) =>

        val llrBlock = block.like()
        val numInteractionsB: Vector = bcastNumInteractionsB
        val numInteractionsA: Vector = bcastNumInteractionsA

        for (index <- 0 until keys.size) {

          val thingB = keys(index)

          // PriorityQueue to select the top-k items
          val topItemsPerThing = new mutable.PriorityQueue[(Int,Double)]()(orderByScore)

          // Error on the next line: "Missing parameter type 'elem'"
          block(index, ::).nonZeroes().foreach { elem =>
            val thingA = elem.index
            val cooccurrences = elem.get
{code}
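
One possible cause, offered as a guess rather than a diagnosis: if 
.nonZeroes() returns a java.lang.Iterable, Scala needs a Java-to-Scala 
collection conversion in scope before .foreach can take a closure, and an 
explicit parameter type on the closure avoids relying on inference. A minimal 
sketch that does not use Mahout at all; Element and nonZeroes() below are 
hypothetical stand-ins for Vector.Element and the real method:

```scala
import java.util.{ArrayList => JArrayList}
import scala.collection.JavaConverters._

object NonZeroesSketch {
  // Hypothetical stand-in for Mahout's Vector.Element (assumption, not the real class).
  final case class Element(index: Int, value: Double) {
    def get: Double = value
  }

  // Hypothetical stand-in for nonZeroes(): it hands back a plain java.lang.Iterable.
  def nonZeroes(): java.lang.Iterable[Element] = {
    val list = new JArrayList[Element]()
    list.add(Element(0, 1.0))
    list.add(Element(3, 2.5))
    list
  }

  // Convert the Java Iterable explicitly and annotate the closure parameter,
  // so the compiler does not have to infer the type of `elem`.
  def sumCooccurrences(): Double = {
    var total = 0.0
    nonZeroes().asScala.foreach { (elem: Element) =>
      total += elem.get
    }
    total
  }
}
```

If this is what is happening in the patch, either the explicit .asScala call 
or a collection-conversion import that used to be in scope would be the thing 
to check.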

> Cooccurrence Analysis on Spark
> ------------------------------
>
>                 Key: MAHOUT-1464
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1464
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>         Environment: hadoop, spark
>            Reporter: Pat Ferrel
>            Assignee: Sebastian Schelter
>             Fix For: 1.0
>
>         Attachments: MAHOUT-1464.patch, MAHOUT-1464.patch, MAHOUT-1464.patch, 
> MAHOUT-1464.patch, MAHOUT-1464.patch, MAHOUT-1464.patch, run-spark-xrsj.sh
>
>
> Create a version of Cooccurrence Analysis (RowSimilarityJob with LLR) that 
> runs on Spark. This should be compatible with Mahout Spark DRM DSL so a DRM 
> can be used as input. 
> Ideally this would extend to cover MAHOUT-1422. This cross-cooccurrence has 
> several applications including cross-action recommendations. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
