Victor Lee wrote:

Hi,
 I want to search for an exact match of "good morning",
not a phrase match. How do I do that in Nutch? For example, a phrase match returns docs that contain "good
morning sir" or "hello good morning", but an exact match
returns only docs that contain just "good morning".

Many thanks.

I'm afraid this won't be possible without changing the index. If you can accept that, there are a couple of ways to do it. The first takes two steps:

* add this text to a field that is not tokenized (e.g. "spec" field), and contains just this value

* add a QueryPlugin, which will translate the query
      spec:"good morning"
   into a TermQuery over that field
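To make the idea concrete, here is a rough sketch in plain Java of why an untokenized field gives exact matching. It is a toy in-memory stand-in, not Nutch or Lucene code: the class SpecFieldSketch and its methods are hypothetical names, and the lowercase/trim normalization is my assumption about what you would want. The point is only that the whole value is stored as a single term, so the lookup either hits that term or misses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy stand-in for an index with one untokenized field per document.
// All names here are hypothetical; this is not the Nutch or Lucene API.
class SpecFieldSketch {
    // Maps the whole field value (kept as a single term) to the matching document ids.
    private final Map<String, List<Integer>> specTerms = new HashMap<>();

    // Untokenized: the entire text becomes one term.
    // Assumption: we still trim and lowercase so that case does not matter.
    static String asSingleTerm(String text) {
        return text.trim().toLowerCase(Locale.ROOT);
    }

    // Index time: store the complete text of the document as one term.
    void addDocument(int docId, String text) {
        specTerms.computeIfAbsent(asSingleTerm(text), k -> new ArrayList<>()).add(docId);
    }

    // Query time: the analogue of a single-term query on the "spec" field.
    // It is one dictionary lookup; no token positions are involved.
    List<Integer> searchSpec(String query) {
        return specTerms.getOrDefault(asSingleTerm(query), new ArrayList<>());
    }
}
```

With documents "good morning", "good morning sir" and "hello good morning" added, searchSpec("good morning") finds only the first one, because the other two were stored as different single terms.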

You could also go this way: modify your index to always include the start and end markers, e.g. __START__ __END__ (you can do this in NutchDocumentAnalyzer). Then write a query plugin which for exact matches rewrites the query to:

   "__START__ query __END__"

This will be parsed into a phrase query, but because the markers sit at the very start and end of each document's text, it will match only the documents that consist of exactly this phrase...
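The marker trick can be sketched in plain Java as follows. Again this is only an illustration under stated assumptions, not Nutch or Lucene code: MarkerPhraseSketch, tokenize, indexTokens, phraseMatch and exactMatch are hypothetical names, the whitespace tokenizer stands in for the real analyzer, and a phrase query is modelled as "the query tokens appear consecutively in the document tokens". In a real index you would also have to make sure the analyzer leaves the marker tokens intact.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Toy model of phrase matching with start/end marker tokens.
// All names here are hypothetical; this is not the Nutch or Lucene API.
class MarkerPhraseSketch {
    static final String START = "__START__";
    static final String END = "__END__";

    // Stand-in for the analyzer: lowercase and split on whitespace.
    static List<String> tokenize(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Wrap the token stream in the markers. The same wrapping is applied
    // to documents at index time and to the rewritten exact-match query.
    static List<String> indexTokens(String text) {
        List<String> tokens = new ArrayList<>();
        tokens.add(START);
        tokens.addAll(tokenize(text));
        tokens.add(END);
        return tokens;
    }

    // Phrase match: the phrase tokens occur consecutively somewhere in the document.
    static boolean phraseMatch(List<String> docTokens, List<String> phrase) {
        return Collections.indexOfSubList(docTokens, phrase) >= 0;
    }

    // Exact match: a phrase match for "__START__ query __END__" against the marked document.
    static boolean exactMatch(String docText, String query) {
        return phraseMatch(indexTokens(docText), indexTokens(query));
    }
}
```

An ordinary phrase match for "good morning" still hits "good morning sir", since the two tokens are adjacent there. Once both sides carry the markers, the query needs __START__ directly before "good" and __END__ directly after "morning", which only the document "good morning" provides.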

--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com




_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
