Hi, In case you donot want to toss away any stop words and even preserve case, WhiteSpaceAnalyzer can be used, also using WhiteSpaceTokenizer would serve as a test (but need to reindex whole data set first), to make sure there is no other problems.
Best regards, Lisheng -----Original Message----- From: mark harwood [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 18, 2007 9:42 AM To: java-user@lucene.apache.org Subject: Re: Phrase Query Problem You could write a custom analyzer that drops stopwords but adds an extra 1 to the "positionIncrement" property for the next valid Token after each omiited stop word. This would retain the benefit of removing stopwords from your index and yet prevent your example phrases matching (because the remaining words are not recorded as being directly next to each other) Cheers Mark ----- Original Message ---- From: Sirish Vadala <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Tuesday, 18 December, 2007 5:10:19 PM Subject: RE: Phrase Query Problem Yes... If my query phrase is "Health Safety", docs with "Health and Safety", "Health or Safety" are being returned... So... Is there any other way to handle this situation... Especially in the above mentioned case, the user is expecting around 5 records and the query is fetching more than 550 records.8-O Thanks. Zhang, Lisheng wrote: > > Hi, > > Do you mean that your query phrase is "Health Safety", > but docs with "Health and Safety" returned? > > If that is the case, the reason is that StandardAnalyzer > filters out "and" (also "or, "in" and others) as stop > words during indexing, and the QueryParser filters those > words out also. > > Best regards, Lisheng > > -----Original Message----- > From: Sirish Vadala [mailto:[EMAIL PROTECTED] > Sent: Monday, December 17, 2007 9:49 AM > To: java-user@lucene.apache.org > Subject: Phrase Query Problem > > > > I have the following code for search: > > BooleanQuery bQuery = new BooleanQuery(); > Query queryAuthor; > queryAuthor = new TermQuery(new Term(IFIELD_LEAD_AUTHOR, > author.trim().toLowerCase())); > bQuery.add(queryAuthor, BooleanClause.Occur.MUST); > .................................................................... > .................................................................... > > PhraseQuery pQuery = new PhraseQuery(); > String[] phrase = txtWithPhrase.toLowerCase().split(" "); > for (int i = 0; i < phrase.length; i++) { > pQuery.add(new Term(IFIELD_TEXT, phrase[i])); > } > pQuery.setSlop(0); > bQuery.add(pQuery, BooleanClause.Occur.MUST); > .................................................................... > .................................................................... > > String[] sortOrder = {IFIELD_LEAD_AUTHOR, IFIELD_TEXT}; > Sort sort = new Sort(sortOrder); > hits = indexSearcher.search(bQuery, sort); > > Now My problem here is: If I do a search on a phrase with text Health > Safety, it is fetching me all the records where in the text is Health > and/or/in Safety. It is fetching me these records even after setting the > slop of the phrase query to zero for exact match. I am using standard > analyzer while indexing my records. > > Any help on this is greatly appreciated. > > Sirish Vadala > -- > View this message in context: > http://www.nabble.com/Phrase-Query-Problem-tp14373945p14373945.html > Sent from the Lucene - Java Users mailing list archive at Nabble.com. > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > > -- View this message in context: http://www.nabble.com/Phrase-Query-Problem-tp14373945p14401354.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] __________________________________________________________ Sent from Yahoo! Mail - a smarter inbox http://uk.mail.yahoo.com --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]