I'm a bit lost in this discussion. Why do we assume that getNumNonZeroElements() on a Vector only returns an upper bound? The code in AbstractVector clearly returns the non-zeros only:

    int count = 0;
    Iterator<Element> it = iterateNonZero();
    while (it.hasNext()) {
      if (it.next().get() != 0.0) {
        count++;
      }
    }
    return count;

On the other hand, the internal code seems broken here: why would iterateNonZero() potentially return zeros?
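
To make the question concrete, here is a minimal sketch of how a "non-zero" iterator could end up yielding zeros. It does not use Mahout's classes: StoredZeroSketch, set(), iterateStored(), storedCount() and exactNonZeroCount() are all hypothetical names, and the assumption that a sparse vector keeps an entry after it has been overwritten with 0.0 is mine, not something confirmed from the Mahout source.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a sparse vector; NOT Mahout's implementation.
public class StoredZeroSketch {
    // Only explicitly stored entries live here; absent indices are implicitly 0.0.
    private final Map<Integer, Double> stored = new LinkedHashMap<>();

    // Assumed behaviour: the entry is kept even when the new value is 0.0.
    public void set(int index, double value) {
        stored.put(index, value);
    }

    // Walks the stored entries, which may include explicit zeros.
    public Iterator<Double> iterateStored() {
        return stored.values().iterator();
    }

    // Cheap count of stored entries: only an upper bound on the non-zeros.
    public int storedCount() {
        return stored.size();
    }

    // Exact count, using the same filtering loop as the snippet quoted above.
    public int exactNonZeroCount() {
        int count = 0;
        Iterator<Double> it = iterateStored();
        while (it.hasNext()) {
            double value = it.next();
            if (value != 0.0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        StoredZeroSketch v = new StoredZeroSketch();
        v.set(0, 3.0);
        v.set(5, 1.5);
        v.set(5, 0.0); // overwrite with zero; the entry stays stored
        System.out.println("stored=" + v.storedCount()
            + " exact=" + v.exactNonZeroCount());
    }
}
```

Under that assumption the stored-entry count and the filtered count disagree as soon as a stored value is overwritten with zero, which would explain both the explicit != 0.0 check and the "upper bound" wording elsewhere in the thread.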

--sebastian





On 06/12/2014 06:38 PM, ASF GitHub Bot (JIRA) wrote:

     [ https://issues.apache.org/jira/browse/MAHOUT-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029345#comment-14029345 ]

ASF GitHub Bot commented on MAHOUT-1464:
----------------------------------------

Github user dlyubimov commented on the pull request:

     https://github.com/apache/mahout/pull/12#issuecomment-45915940

     fix header to say MAHOUT-1464, then hit close and reopen, it will restart 
the echo.


Cooccurrence Analysis on Spark
------------------------------

                 Key: MAHOUT-1464
                 URL: https://issues.apache.org/jira/browse/MAHOUT-1464
             Project: Mahout
          Issue Type: Improvement
          Components: Collaborative Filtering
         Environment: hadoop, spark
            Reporter: Pat Ferrel
            Assignee: Pat Ferrel
             Fix For: 1.0

         Attachments: MAHOUT-1464.patch, MAHOUT-1464.patch, MAHOUT-1464.patch, 
MAHOUT-1464.patch, MAHOUT-1464.patch, MAHOUT-1464.patch, run-spark-xrsj.sh


Create a version of Cooccurrence Analysis (RowSimilarityJob with LLR) that runs 
on Spark. This should be compatible with Mahout Spark DRM DSL so a DRM can be 
used as input.
Ideally this would extend to cover MAHOUT-1422. This cross-cooccurrence has 
several applications including cross-action recommendations.




