Hi, You may also want to take a look at Carrot2: http://demo.carrot2.org/demo-stable/main
Lucene documentation references them, but I was disappointed to see that they had an open source version (really old) and one that you can buy. It may work for you. Also, take a look at SOLR's implementation of faceted searching: http://wiki.apache.org/solr/SolrFacetingOverview I have been dealing with a very similar problem. Iterating over all the hits may not be too slow if you do it right. One such brute-force way to deal with this is to use the FieldCache for the tag field to quickly iterate over all the document ids that came back with the search. The constant lookup time in an array makes it nice and easy, but if may not scale well with the number of documents. If you know that the number of distinct tags in your dataset is fairly small (5-10), you can always just run the query with each tag added as a constraint to find out if there are any matches or you can manage a filter for each value to deduce if it exists in the resultset even faster. Please share your experience as you find more clues on this problem. Regards, Khawaja Shams On Tue, Oct 14, 2008 at 8:15 PM, Chris Hostetter <[EMAIL PROTECTED]>wrote: > > : For example if I have 100 documents in my index and my set of tag = {A, > B, C}. > : Query Q on the text field returns 15 docs with tag A , 10 with tag B and > none > : with tag C (total of 25 hits). Is there a way to determine that the set > of > : tags for query Q = {A, B} without iterating through all 25 hits. > > what you are describing is is a subset of a rbaoder topic known as > "faceted searching" ... if you search the archives for that or "category > counts" you'll find quite a bit of discussion on the approaches that can > be used to ahieve this. > > > > -Hoss > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > >