Hi,

I have a question about how the Boolean query is currently built in Nutch. While indexing documents, I add two fields, "fname" and "lname", for the author's first and last name (both as Field.Text). Now I want to search for 'adam smith'. I expect to get documents where 'adam' is the first or last name, documents where 'smith' is the first or last name, and documents that have these words somewhere in the content. I have boosted the first-name and last-name fields so that they carry more weight than the content.

I have modified the query-basic plugin accordingly, and the resulting query looks like this:

Query str = +((+url:adam^4.0 +url:smith^4.0 +url:"adam smith"~2147483647^4.0)
 (+fname:adam^4.0 +fname:smith^4.0 +fname:"adam smith"~2147483647^4.0)
 (+lname:adam^4.0 +lname:smith^4.0 +lname:"adam smith"~2147483647^4.0)
 (+summary:adam^4.0 +summary:smith^4.0 +summary:"adam smith"~2147483647^4.0)
 (+anchor:adam^2.0 +anchor:smith^2.0 +anchor:"adam smith"~4^2.0)
 (+content:adam +content:smith +content:"adam smith"~2147483647))
However, I get only one hit: the document that has 'adam smith' in the content. Can someone please explain how I can build a query that is essentially

(fname:adam OR fname:smith OR lname:adam OR lname:smith OR content:adam OR content:smith OR content:"adam smith"~SLOP_FACTOR) ?

Also, while on this topic: can we execute Lucene partial and wildcard queries directly from Nutch? I currently see that NutchDocumentAnalyzer strips off any special characters that I put in the query.

Thanks, and have a great long weekend,
Praveen.
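
P.S. To make the intended query shape concrete, here is a minimal sketch in plain Java. It only builds a query string in Lucene query syntax and does not use the Nutch or Lucene APIs. The class name OptionalFieldQuery, the build helper, and the slop value 10 are made up for illustration. The point is that no clause carries the '+' (required) prefix, so each clause is optional and a document can match on any single field/term pair, whereas in my current query every clause inside each field group is required.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class OptionalFieldQuery {

    // Hypothetical helper: emits one optional clause per (field, term) pair.
    // No clause gets a '+' prefix, so none of them is required on its own.
    public static String build(String[] terms, Map<String, Float> fieldBoosts,
                               String phraseField, int slop) {
        StringJoiner clauses = new StringJoiner(" ");
        for (Map.Entry<String, Float> e : fieldBoosts.entrySet()) {
            for (String term : terms) {
                String clause = e.getKey() + ":" + term;
                // Only print a boost when it differs from the default of 1.0.
                if (e.getValue() != 1.0f) {
                    clause += "^" + e.getValue();
                }
                clauses.add(clause);
            }
        }
        // Add one sloppy phrase clause on the chosen field for multi-word input.
        if (terms.length > 1) {
            clauses.add(phraseField + ":\"" + String.join(" ", terms) + "\"~" + slop);
        }
        return clauses.toString();
    }

    public static void main(String[] args) {
        // Boost values follow the ones in my query; slop 10 is an assumed placeholder.
        Map<String, Float> boosts = new LinkedHashMap<>();
        boosts.put("fname", 4.0f);
        boosts.put("lname", 4.0f);
        boosts.put("content", 1.0f);
        System.out.println(build(new String[] {"adam", "smith"}, boosts, "content", 10));
    }
}
```

This is the clause structure I would like the query-basic plugin to produce, with the whole group still wrapped in a single required outer clause.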
