> A simple solution if you only have 20,000 docs is
> just to iterate
> through the hits and count them up against each
> color etc,
The one thing to avoid is reader.document() calls in
such a tight loop. This is always a killer.
The best way I've found is to create one bitset for
all the matching docs then use TermEnum on the "group"
field(s) to find all the docids - then check each
docId against the "matches" bitset to accumulate
scores for each unique "group" field value:
TermEnum te = reader.terms(new
Term(groupFieldName, ""));
Term term = te.term();
while (term!=null)
{
if (term.field().equals(groupFieldName))
{
TermDocs termDocs =
reader.termDocs(term);
GroupTotal groupTotal = null;
boolean continueThisTerm = true;
while ((continueThisTerm) &&
(termDocs.next()))
{
int docID = termDocs.doc();
if (queryMmatchedDocs.get(docId))
{
if (groupTotal == null)
{
//look up the group key
and initialize
String termText =
term.text();
Object key = termText;
groupTotal = (GroupTotal)
totals.get(key);
if (groupTotal == null)
{
//no totals exist yet,
create new one.
groupTotal = new
GroupTotal((key);
totals.put(key,
groupTotal);
}
}
groupTotal.addQueryMatchDoc(docID);
}
}
} else
{
break;
}
if(te.next())
{
term=te.term();
}
else
{
break;
}
}
Cheers
Mark
___________________________________________________________
Win a BlackBerry device from O2 with Yahoo!. Enter now.
http://www.yahoo.co.uk/blackberry
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]