[ 
https://issues.apache.org/jira/browse/MAHOUT-881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149385#comment-13149385
 ] 

Ted Dunning commented on MAHOUT-881:
------------------------------------

{quote}
from 90 microseconds to 47 microseconds
{quote}
This is consistent with my experience in other efforts.  The priority queue is 
rarely the problem if you avoid inserting most elements.  Even if you do insert 
most elements due to pathological ordering of the original data, it isn't a big 
deal, since the cost is O(n log k), where n is the number of documents and k is 
the size of the queue.
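
As a minimal sketch of the point above (not the code in the attached patch), the 
following keeps the top k scores in a bounded min-heap and skips any element 
that cannot beat the current minimum.  It uses java.util.PriorityQueue rather 
than Lucene's class, and the names TopKSketch and topK are hypothetical, chosen 
only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

final class TopKSketch {

    /** Returns the k largest scores, in descending order. */
    static List<Double> topK(double[] scores, int k) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        // Min-heap: the head is always the smallest of the current top k.
        PriorityQueue<Double> heap = new PriorityQueue<>(k);
        for (double score : scores) {
            if (heap.size() < k) {
                // Queue not full yet: always insert, O(log k).
                heap.add(score);
            } else if (score > heap.peek()) {
                // Beats the current minimum: replace it, O(log k).
                heap.poll();
                heap.add(score);
            }
            // Otherwise the element is rejected with a single comparison.
        }
        List<Double> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder());
        return result;
    }
}
```

When the input arrives in roughly random order, most elements fall through to 
the cheap comparison.  In the pathological case (scores arriving in ascending 
order) every element is inserted, which is where the n log k bound comes from.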

One big difference that we can probably make, however, is to multi-thread some 
of these sequential programs.  This isn't very hard with the Executors in Java. 
This doesn't make things more efficient, since the total work is unchanged, but 
it can make them roughly 10x faster in wall-clock time on commonly available 
multi-core servers.  That is an effort for a different JIRA in any case.
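
A rough sketch of what this could look like with java.util.concurrent, assuming 
the work splits into independent slices whose partial results are cheap to 
merge.  The class ParallelMaxSketch and the method parallelMax are hypothetical 
and stand in for whatever per-slice computation a real program would do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelMaxSketch {

    /** Finds the maximum of data by scanning slices on a fixed thread pool. */
    static double parallelMax(double[] data, int threads) {
        if (data.length == 0 || threads <= 0) {
            throw new IllegalArgumentException("need data and at least one thread");
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Ceiling division so that every element lands in some slice.
            int chunk = (data.length + threads - 1) / threads;
            List<Callable<Double>> tasks = new ArrayList<>();
            for (int start = 0; start < data.length; start += chunk) {
                final int from = start;
                final int to = Math.min(start + chunk, data.length);
                // Each task scans its own slice and shares no mutable state.
                tasks.add(() -> {
                    double best = Double.NEGATIVE_INFINITY;
                    for (int i = from; i < to; i++) {
                        best = Math.max(best, data[i]);
                    }
                    return best;
                });
            }
            // invokeAll blocks until every task has completed.
            double best = Double.NEGATIVE_INFINITY;
            for (Future<Double> partial : pool.invokeAll(tasks)) {
                best = Math.max(best, partial.get());
            }
            return best;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("slice task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The total number of comparisons is the same as in a sequential scan, so nothing 
becomes more efficient.  Any speedup comes only from running slices on separate 
cores, and it depends on the core count and on how small the merge step is.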
                
> Refactor TopItems to use Lucene's PriorityQueue and remove excessive sorting
> ----------------------------------------------------------------------------
>
>                 Key: MAHOUT-881
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-881
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.6
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>            Priority: Minor
>         Attachments: Call_Tree.html, Call_Tree_2.html, MAHOUT-881.patch, 
> MAHOUT-881.patch, MAHOUT-881.patch
>
>
> TopItems.getTop*() all do a fair number of excessive operations that can be 
> replaced by switching to using Lucene's PriorityQueue implementation, which 
> is more efficient and faster than Java's built-in PQ implementation.
