Hi Chew,
with Lucene you could try the following:
Make one query for each single value in each category (each Term):
1Q - Gender:M
2Q - Department:Accounting
3Q - Department:R&D
4Q - ...
with a custom HitCollector like the following example taken from
org.apache.lucene.search.HitCollector API Spec:
Searcher searcher = new IndexSearcher(indexReader);
final BitSet bits = new BitSet(indexReader.maxDoc());
searcher.search(query, new HitCollector() {
public void collect(int doc, float score) {
bits.set(doc);
}
});
Thus you'l get one BitSet for each Term with bits set at DocIDs containing the
Term.
Then do the combinatory part on these BitSets like:
BitSet FemaleAndRnD = ((BitSet) RnD.clone()).andNot(Male);
Cheers,
Fabian
________________________________
Von: Chew Yee Chuang [mailto:[EMAIL PROTECTED]
Gesendet: Mi 15.08.2007 10:31
An: [email protected]
Betreff: RE: High CPU usage duing index and search
Greetings,
I have tested with Mysql, the grouping is ok when there is not much records in
the table, but when I come across to performed grouping in a table which have 3
millions of records, It really take a very long time to finish. Thus, Im
looking at lucene and hope it can help.
Thank you
eChuang, Chew
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]