I've been reading code and am wondering about TopItems.getTopUsers
Here's my pseudocode of it (lines 96-134 in TopItems)
Get a Priority Queue (PQ)
for all users
estimate the similarity of user i with our current user (the one we are
generating a rec for)
put a SimilarUser object on the PQ.
//HERE
Create a list of SimilarUser objects
Populate that list with the PQ objects, which also contains SimilarUser objects
Sort that list
Iterate over that list once more and grab the ids and put it into an array
//HERE
return the array
Why all that in between moving stuff marked by //HERE? Isn't that traversing
the same list 3 times? Can't we just pop from the priority queue in reverse
order and fill the array from back to front? Am I missing something?
Here's the same code in Lucene:
<snip>
protected void populateResults(ScoreDoc[] results, int howMany) {
for (int i = howMany - 1; i >= 0; i--) {
results[i] = pq.pop();
}
}
</snip>
Also, FWIW, Lucene's PQ implementation is faster than Java's, from what I
understand (which is why we wrote our own).
-Grant