On Apr 9, 2004, at 8:16 PM, Michael A. Schoen wrote:
I have an index of urls, and need to display the top 10 results for a given query, but want to display only 1 result per domain. It seems that using either Hits or a HitCollector, I'll need to access the doc, grab the domain field (I'll have it parse ahead of time) and only take/display documents that are unique.

A significant percentage of the time I expect I may have to access thousands of results before I find 10 in unique domains. Is there a faster approach that won't require accessing thousands of documents?

I have examples of this that I can post when I have more time, but a quick pointer... check out the overloaded IndexSearcher.search() methods which accept a Sort. You can do really really interesting slicing and dicing, I think, using it. Try this one on for size:


    example.displayHits(allBooks,
        new Sort(new SortField[]{
          new SortField("category"),
          SortField.FIELD_SCORE,
          new SortField("pubmonth", SortField.INT, true)
        }));

Be clever indexing the piece you want to group on - I think you may find this the solution you're looking for.

Erik


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]



Reply via email to