Hi Chew,
 
With Lucene you could try the following:
 
Run one query for each distinct value in each category (that is, one query per Term):
1Q - Gender:M
2Q - Department:Accounting
3Q - Department:R&D
4Q - ...
 
Run each query with a custom HitCollector, as in the following example taken 
from the org.apache.lucene.search.HitCollector API documentation:
 
   Searcher searcher = new IndexSearcher(indexReader);
   final BitSet bits = new BitSet(indexReader.maxDoc());
   searcher.search(query, new HitCollector() {
       public void collect(int doc, float score) {
         bits.set(doc);
       }
     });
 
This gives you one BitSet per Term, with a bit set at the ID of every document 
that contains the Term.
 
Then do the combinatorial part on these BitSets, for example:
 
  // andNot() modifies the BitSet in place and returns void, so clone first
  BitSet femaleAndRnD = (BitSet) RnD.clone();
  femaleAndRnD.andNot(Male);
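
To make the combination step concrete, here is a minimal sketch that uses only java.util.BitSet. The class name, the helper methods and the six-document sample data are made up for illustration. In practice each BitSet would come from a HitCollector run like the one above. It also assumes that every document has Gender M or F, so "not Male" can stand in for "Female". BitSet.and() and BitSet.andNot() change the receiver in place and return nothing, which is why the helpers work on a clone.

```java
import java.util.BitSet;

public class BitSetGrouping {

    // Documents set in both a and b. The inputs are left unchanged.
    static BitSet and(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.and(b);
        return result;
    }

    // Documents set in a but not in b. The inputs are left unchanged.
    static BitSet andNot(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.andNot(b);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical per-Term BitSets for an index with doc IDs 0..5.
        BitSet male = new BitSet(6);
        male.set(0);
        male.set(2);
        male.set(5);

        BitSet rnd = new BitSet(6);
        rnd.set(1);
        rnd.set(2);
        rnd.set(3);

        BitSet maleAndRnD = and(rnd, male);
        BitSet femaleAndRnD = andNot(rnd, male); // assumes Gender is always M or F

        // cardinality() is the number of set bits, i.e. the group size.
        System.out.println("Male, R&D:   " + maleAndRnD.cardinality());
        System.out.println("Female, R&D: " + femaleAndRnD.cardinality());
    }
}
```

The group count is simply cardinality() of the combined BitSet, so no documents need to be loaded to compute it.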

Cheers,
Fabian
 

________________________________

From: Chew Yee Chuang [mailto:[EMAIL PROTECTED]
Sent: Wed 15.08.2007 10:31
To: java-user@lucene.apache.org
Subject: RE: High CPU usage duing index and search



Greetings,

I have tested with MySQL. The grouping is fine when there are not many records 
in the table, but when I perform grouping on a table with 3 million records, it 
takes a very long time to finish. So I am looking at Lucene and hope it can 
help.

Thank you
eChuang, Chew



