Re: Group by in Lucene ?

2009-02-01 Thread Marcus Herou
Yep, you are correct, this is a lousy implementation, which I knew when I wrote it. I'm not interested in the entire document, just the grouping term and the docId it is connected to. So how do I get hold of the TermDocs for the grouping field? I mean I probably first need to perform the qu
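
To make the "grouping term plus docId" idea concrete, here is a minimal sketch in plain Java (Java 8+, no Lucene dependency) and not the implementation discussed in the thread. It assumes you already have two things: a hypothetical array termByDoc where termByDoc[docId] holds the grouping term of that document (however it is obtained from the index, for instance by walking the field's terms and their TermDocs), and the doc ids that matched the query. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupByField {
    /**
     * Groups matching doc ids by their grouping term.
     *
     * @param termByDoc    termByDoc[docId] is the grouping term of that doc, or null if it has none
     * @param matchingDocs doc ids collected from the query
     * @return grouping term -> doc ids, in order of first appearance
     */
    static Map<String, List<Integer>> group(String[] termByDoc, int[] matchingDocs) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (int docId : matchingDocs) {
            String term = termByDoc[docId];
            if (term == null) {
                continue; // document has no value in the grouping field
            }
            groups.computeIfAbsent(term, k -> new ArrayList<>()).add(docId);
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] termByDoc = {"red", "blue", "red", null, "blue"};
        int[] hits = {0, 2, 3, 4};
        System.out.println(group(termByDoc, hits)); // prints {red=[0, 2], blue=[4]}
    }
}
```

Only the term and the doc id are touched here, so no stored documents need to be loaded, which is the point made above.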

Re: Group by in Lucene ?

2009-02-01 Thread Marcus Herou
Yep. Probably an external sort should be used when flushing to disk. I have written such code, so that is probably a no-brainer; the problem is to get it speedy :) http://dev.tailsweep.com/projects/utils/apid
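
For readers unfamiliar with the term, an external sort can be sketched as follows. This is a generic illustration in plain Java, not the code behind the link above: it sorts fixed-size chunks in memory, flushes each chunk to a temporary file as a sorted run, then does a k-way merge of the runs with a priority queue. The class name, the chunkSize parameter, and the choice to return the merged result as a list (a real version would stream it to disk) are all assumptions made to keep the example short. It also assumes chunkSize is positive and that no line contains an embedded newline.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

class ExternalSort {
    /** One open sorted run plus the line currently at its head. */
    private static final class Run implements Comparable<Run> {
        final BufferedReader reader;
        String head;

        Run(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.head = reader.readLine();
        }

        @Override
        public int compareTo(Run other) {
            return head.compareTo(other.head);
        }
    }

    /** Sorts the lines while holding at most chunkSize input lines in memory per run. */
    static List<String> sort(Iterable<String> lines, int chunkSize) {
        try {
            List<Path> runs = new ArrayList<>();
            List<String> buffer = new ArrayList<>();
            for (String line : lines) {
                buffer.add(line);
                if (buffer.size() == chunkSize) {
                    runs.add(flush(buffer));
                }
            }
            if (!buffer.isEmpty()) {
                runs.add(flush(buffer));
            }
            return merge(runs);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sorts the buffer, writes it to a temp file as one run, then clears it. */
    private static Path flush(List<String> buffer) throws IOException {
        Collections.sort(buffer);
        Path run = Files.createTempFile("run", ".txt");
        Files.write(run, buffer, StandardCharsets.UTF_8);
        buffer.clear();
        return run;
    }

    /** K-way merge: repeatedly take the smallest head line among all runs. */
    private static List<String> merge(List<Path> runs) throws IOException {
        PriorityQueue<Run> queue = new PriorityQueue<>();
        for (Path path : runs) {
            Run run = new Run(Files.newBufferedReader(path, StandardCharsets.UTF_8));
            if (run.head != null) queue.add(run); else run.reader.close();
        }
        List<String> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            Run run = queue.poll();
            out.add(run.head);
            run.head = run.reader.readLine();
            if (run.head != null) queue.add(run); else run.reader.close();
        }
        for (Path path : runs) {
            Files.deleteIfExists(path);
        }
        return out;
    }
}
```

The speed concern raised above mostly comes down to the chunk size and the number of runs merged at once: larger chunks mean fewer runs and fewer disk reads during the merge.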

Set a field as required in a MultiFieldQueryParser

2009-02-01 Thread Sylvain
Hello everybody, I have a search app in which the user can specify which category the documents he's searching for belong to. So all my indexed documents have a "category" field as well as other fields such as title, description, etc. So when the user enters his query, only the documents that are in the

tfIdf weights

2009-02-01 Thread Rehan Abdulaziz
Hi, Is it possible to retrieve the tfidf weights (or relative weight calculated from any formula) instead of simple term frequencies through getTermFreqVector()? I am more interested in knowing the relative weight of each term in each field rather than just the frequency of terms. Thank you very m

Re: Set a field as required in a MultiFieldQueryParser

2009-02-01 Thread Erick Erickson
I think query.toString() is your friend here. I'm having a hard time figuring out, from your description, what you actually want, so maybe some examples would help too. If your users enter terms for each field, then it seems to me that you'd want MUST between clauses (in which case MultiFieldQuery
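
One possible reading of "MUST between clauses" for the category case, sketched in plain Java: wrap the user's text in a required group and add the category as a second required clause, using the `+` prefix of the Lucene query syntax, then hand the resulting string to the parser. This is only a guess at what is wanted, not something confirmed in the thread. The class and method names are hypothetical, and whether the quoted category actually matches depends on how the "category" field was indexed and analyzed.

```java
class RequiredFieldQuery {
    /** Escapes backslashes and double quotes so the value is safe inside a quoted phrase. */
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /**
     * Builds a query string in which both the field clause and the user's query are required.
     * If the user's query is blank, only the required field clause is returned.
     */
    static String require(String field, String value, String userQuery) {
        String required = "+" + field + ":" + quote(value);
        if (userQuery == null || userQuery.trim().isEmpty()) {
            return required;
        }
        return required + " +(" + userQuery.trim() + ")";
    }

    public static void main(String[] args) {
        // prints +category:"books" +(lucene search)
        System.out.println(require("category", "books", "lucene search"));
    }
}
```

Printing the parsed query with query.toString(), as suggested above, is a quick way to check that both clauses really came out as required.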

Re: Best Practice for Lucene Search

2009-02-01 Thread ilwes
I like the point about doing things the easiest way possible until it starts to become a problem. Thank you very much for your answers and for the insight into how you handle this issue. You helped me a lot. Ilwes

Re: tfIdf weights

2009-02-01 Thread dipesh
Hi, I used lucene-2.4.0 to get tf-idf. I'm not sure if the newer versions have direct methods to get tf-idf as well. This is lengthy but might help.

// Get Term Enum that contains all the terms in the index using FilterIndexReader
TermEnum e = freader.terms();
// find total number of do
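
Since the message above is cut off, here is a separate, minimal sketch of the arithmetic involved, in plain Java with no Lucene dependency. It is not dipesh's code. It assumes you already have three numbers per term: freq (the term's frequency in the document, e.g. from getTermFreqVector()), docFreq (the number of documents containing the term), and numDocs (the total number of documents). The tf and idf formulas follow what Lucene's DefaultSimilarity used in that era, to my understanding (tf = sqrt(freq), idf = ln(numDocs / (docFreq + 1)) + 1); swap in another formula if you want a different weighting. The class name is hypothetical.

```java
class TfIdf {
    /** Term-frequency factor: square root of the raw frequency. */
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    /** Inverse document frequency: rarer terms get a larger factor. */
    static double idf(int docFreq, int numDocs) {
        return Math.log((double) numDocs / (docFreq + 1)) + 1.0;
    }

    /** Relative weight of a term in one document (or one field of it). */
    static double weight(int freq, int docFreq, int numDocs) {
        return tf(freq) * idf(docFreq, numDocs);
    }

    public static void main(String[] args) {
        // A term occurring 4 times in a doc, present in 9 of 10 docs:
        // tf = sqrt(4) = 2.0, idf = ln(10 / 10) + 1 = 1.0
        System.out.println(weight(4, 9, 10)); // prints 2.0
    }
}
```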