[ https://issues.apache.org/jira/browse/MAHOUT-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14396096#comment-14396096 ]
ASF GitHub Bot commented on MAHOUT-1622:
----------------------------------------

GitHub user avati opened a pull request:

    https://github.com/apache/mahout/pull/106

    MAHOUT-1622: MultithreadedBatchItemSimilarities output fix

    Rebased batchSimilarities.patch attached in MAHOUT-1622 and resolved
    conflicts. Tests pass on laptop and good for merge.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/avati/mahout MAHOUT-1622

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/mahout/pull/106.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #106

----
commit 5e15c62df3406cbe69465eee42fbddb0c439bb8a
Author: Anand Avati <av...@redhat.com>
Date:   2015-04-05T05:59:09Z

    MAHOUT-1622: MultithreadedBatchItemSimilarities output fix

    In some cases the Output class in MultithreadedBatchItemSimilarities
    does not output all of the similarity pairs that it should. It is very
    possible for the number of active workers to go to zero while in the
    while loop, in which case the remaining similarities for the finished
    workers will not be flushed to the output. This is because the while
    loop is only conditioned on whether there are active workers or not.
    An easy fix is to also check to make sure the results structure is not
    empty. This way both the number of active workers must be 0 and the
    result set must be empty to exit the while loop.

    On-behalf-of: Jesse Daniels <jessedanie...@gmail.com>
    Signed-off-by: Anand Avati <av...@redhat.com>

----

> MultithreadedBatchItemSimilarities outputs incorrect number of similarities.
> ----------------------------------------------------------------------------
>
>                 Key: MAHOUT-1622
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1622
>             Project: Mahout
>          Issue Type: Bug
>          Components: Collaborative Filtering
>    Affects Versions: 0.9
>            Reporter: Jesse Daniels
>            Assignee: Anand Avati
>            Priority: Minor
>              Labels: legacy
>             Fix For: 0.10.0
>
>         Attachments: batchSimilarities.patch
>
>
> In some cases the Output class in MultithreadedBatchItemSimilarities does not
> output all of the similarity pairs that it should. It is very possible for
> the number of active workers to go to zero while in the while loop, in which
> case the remaining similarities for the finished workers will not be flushed
> to the output. This is because the while loop is only conditioned on whether
> there are active workers or not. An easy fix is to also check to make sure
> the results structure is not empty. This way both the number of active
> workers must be 0 and the result set must be empty to exit the while loop.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
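
For illustration only, the loop condition described in the issue can be sketched in Java as below. This is not the Mahout source or the attached patch: the names DrainLoopSketch, drain, writtenCount and checkQueue are hypothetical, and a queue of integer batches stands in for the real results structure. The sketch assumes the workers have already queued their batches and finished before the output loop evaluates its condition, which is the case the issue says loses similarities.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the output loop described in MAHOUT-1622; not Mahout's code.
class DrainLoopSketch {

  /**
   * Drains queued batches and returns everything that was written.
   * checkQueue=false: loop only while workers are active (the reported behavior).
   * checkQueue=true: also keep looping while batches remain queued (the proposed fix).
   */
  static List<Integer> drain(BlockingQueue<List<Integer>> results,
                             AtomicInteger activeWorkers,
                             boolean checkQueue) {
    List<Integer> written = new ArrayList<>();
    while (activeWorkers.get() != 0 || (checkQueue && !results.isEmpty())) {
      List<Integer> batch;
      try {
        // Timed poll so the loop can re-check its condition when no batch arrives.
        batch = results.poll(10, TimeUnit.MILLISECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt and stop draining
        break;
      }
      if (batch != null) {
        written.addAll(batch);
      }
    }
    return written;
  }

  /** Simulates workers that queued two batches and finished before the loop ran. */
  static int writtenCount(boolean checkQueue) {
    BlockingQueue<List<Integer>> results = new LinkedBlockingQueue<>();
    results.add(Arrays.asList(1, 2, 3));
    results.add(Arrays.asList(4, 5));
    AtomicInteger activeWorkers = new AtomicInteger(0); // all workers already done
    return drain(results, activeWorkers, checkQueue).size();
  }

  public static void main(String[] args) {
    System.out.println("workers-only condition wrote: " + writtenCount(false));
    System.out.println("with queue check wrote: " + writtenCount(true));
  }
}
```

With the workers-only condition the loop body never runs in this scenario, so both queued batches are left unwritten; with the extra emptiness check the loop exits only after the queue has been drained.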