Sort is helpful. Maybe you should change your index structure if you think you need a group by.
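For what it's worth, the 'group by' asked about in the quoted message below can be sketched in plain Java once each hit has been reduced to a (client, score) pair. This is only a sketch under assumptions: ClientScores, Hit and rankClients are made-up names and not Lucene API, and how you get the client value for each hit is left out on purpose.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: sums hit scores per client.
final class ClientScores {

    // One search hit reduced to the two values needed here (illustrative type).
    static final class Hit {
        final String client;
        final float score;

        Hit(String client, float score) {
            this.client = client;
            this.score = score;
        }
    }

    // Returns clients ordered by the sum of their documents' scores, highest first.
    static List<Map.Entry<String, Double>> rankClients(List<Hit> hits) {
        Map<String, Double> totals = new HashMap<>();
        for (Hit h : hits) {
            // Add this hit's score to the running total for its client.
            totals.merge(h.client, (double) h.score, Double::sum);
        }
        List<Map.Entry<String, Double>> ranked = new ArrayList<>(totals.entrySet());
        ranked.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        return ranked;
    }
}
```

With this, a client that has several low scoring documents can rank above a client with a single higher scoring one, which is the behaviour asked for. The cost is one pass over all hits, so it only makes sense if the result set is small enough to walk.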

On Tue, Feb 17, 2009 at 9:30 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> Well, I can imagine several schemes, how suitable they are depends
> upon some as yet unspecified characteristics of your problem space.
>
> You don't want to iterate blindly over the responses in a
> HitCollector.collect method unless your index is quite small (see the
> API docs for an explanation).
>
> If you don't have very many users, you could consider creating a Filter
> at startup time, one for each user with a bit set for each document
> that user has (see TermDocs/TermEnum).
>
> You could *try* FieldSelector (aka Lazy Loading) to make document
> fetching more efficient in your collect method. If you try this be sure
> that your user field is indexed. Again, depending upon your index
> characteristics this may or may not be viable.
>
> Instead of FieldSelector you could try using TermDocs/TermEnum in
> your collect method to see if a user was indexed for a particular document.
>
> You could also supply some more details about your index, e.g. number
> of documents, number of users, whether more than one user is allowed
> per document. What response times you require. What the larger problem
> you're trying to solve, that is, what use case are you trying to solve.
> Which is another way of asking if this is an XY problem.
>
> Perhaps wiser heads than mine can come up with something clever with
> enough details.
>
> Best
> Erick
>
> On Tue, Feb 17, 2009 at 6:47 AM, AmigoProgrammer <m...@papaecho.com> wrote:
>
> >
> > A relevant client is one that is related to one or more documents found
> > by a search.
> >
> > I would store client as a keyword with a document and I would like the
> > query to return clients with the sum of relevant documents score. A client
> > with many low scoring documents could be as relevant as a client with few
> > high scoring documents. Basically I am looking for a 'group by'-like
> > functionality.
> >
> > Best,
> >
> > Michael
> >
> >
> > Erick Erickson wrote:
> > >
> > > What constitutes a "relevant client"? If you want
> > > to restrict the returned documents to a particular client
> > > (or even a set of clients) a simple +client:<client name>
> > > would do the trick.....
> > >
> > > Or you could create a Filter for "relevant clients".
> > >
> > > If neither of these helps, could you clarify your
> > > definition of a relevant client?
> > >
> > > Best
> > > Erick
> > >
> > >
> > > On Mon, Feb 16, 2009 at 3:00 PM, AmigoProgrammer <m...@papaecho.com> wrote:
> > >
> > >>
> > >> Hi,
> > >>
> > >> I have a number of documents that each relate to a client. I would
> > >> like to use an index and queries to answer two question:
> > >> - Find relevant documents
> > >> - Find relevant clients
> > >>
> > >> The first one is straight forward.
> > >> For the second one, I am wondering. Should I iterate over the hits and
> > >> compute the most relevant clients. Or is there a clever build-in way of
> > >> answering the question?
> > >>
> > >> Anyone that can help me crack the nut?
> > >>
> > >> Best,
> > >>
> > >> Michael
> > >> --
> > >> View this message in context:
> > >> http://www.nabble.com/Querying-for-a-catagory-tp22044596p22044596.html
> > >> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> > >>
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > >> For additional commands, e-mail: java-user-h...@lucene.apache.org
> > >>
> > >>
> > >
> > >
> >
> > --
> > View this message in context:
> > http://www.nabble.com/Querying-for-a-catagory-tp22044596p22055571.html
> > Sent from the Lucene - Java Users mailing list archive at Nabble.com.