Great info Morus,

After making the "escape the dash" change to the QueryParser:

    Query query = QueryParser.parse("+category:HW\\-NCI_TOPICS AND SPACE",
                                    "description", analyzer);
    Hits hits = searcher.search(query);
    System.out.println("query.ToString = " + query.toString("description"));
    // Note: this assertion passes with the escape put in, so not "as-is".
    assertEquals("HW-NCI_TOPICS kept as-is",
                 "+category:HW\\-NCI_TOPICS +space",
                 query.toString("description"));
    assertEquals("doc found!", 1, hits.length());

I'm still getting this output:

    domain.lucenesearch.KeywordAnalyzer:
    [HW-NCI_TOPICS]
    query.ToString = +category:HW\-NCI_TOPICS +space
    junit.framework.AssertionFailedError: doc found! expected:<1> but was:<0>

It looks like the bug, http://issues.apache.org/bugzilla/show_bug.cgi?id=27491, was fixed today:

------- Additional Comments From Otis Gospodnetic <mailto:[EMAIL PROTECTED]> 2004-03-24 10:10 -------
Although tft-monitor should not really result in a phrase query "tft monitor", I agree that this is better than converting it to tft AND NOT monitor (tft -monitor). Moreover, I have seen query syntax where '-' characters are used for phrase queries instead or in addition to quotes, so one could use either morus-walter or "morus walter". I applied your change, as it doesn't look like it breaks anything, and I hope nobody relied on ill behaviour where tft-monitor would result in AND NOT query.
-----------

But I assume this fix won't come out for some time. Is there a way I can get this fix sooner? I'm up against a deadline and would very much like this functionality.

And to go one more step with the KeywordAnalyzer that I wrote, changing this method to skip the escape:

    protected boolean isTokenChar(char c) {
        if (c == '\\') {
            return false;
        } else {
            return true;
        }
    }

The test then returns with a space:

    healthecare.domain.lucenesearch.KeywordAnalyzer:
    [HW-NCI_TOPICS]
    query.ToString = +category:"HW -NCI_TOPICS" +space
    junit.framework.ComparisonFailure: HW-NCI_TOPICS kept as-is
    Expected:+category:HW\-NCI_TOPICS +space
    Actual :+category:"HW -NCI_TOPICS" +space    <---- note space where escape was

thanks,
chad.

-----Original Message-----
From: Morus Walter [mailto:[EMAIL PROTECTED]
Sent: Wed 3/24/2004 1:43 AM
To: Lucene Users List
Cc:
Subject: RE: Query syntax on Keyword field question

Chad Small writes:
> Here is my attempt at a KeywordAnalyzer - although is not working? Excuse the length of the message, but wanted to give actual code.
>
> With this output:
>
> Analzying "HW-NCI_TOPICS"
> org.apache.lucene.analysis.WhitespaceAnalyzer:
> [HW-NCI_TOPICS]
> org.apache.lucene.analysis.SimpleAnalyzer:
> [hw] [nci] [topics]
> org.apache.lucene.analysis.StopAnalyzer:
> [hw] [nci] [topics]
> org.apache.lucene.analysis.standard.StandardAnalyzer:
> [hw] [nci] [topics]
> healthecare.domain.lucenesearch.KeywordAnalyzer:
> [HW-NCI_TOPICS]
>
> query.ToString = category:HW -"nci topics" +space
>
> junit.framework.ComparisonFailure: HW-NCI_TOPICS kept as-is
> Expected:+category:HW-NCI_TOPICS +space
> Actual :category:HW -"nci topics" +space
>

Well query parser does not allow `-' within words currently. So before your analyzer is called, query parser reads one word HW, a `-' operator, one word NCI_TOPICS. The latter is analyzed as "nci topics" because it's not in field category anymore, I guess.

I suggested to change this. See
http://issues.apache.org/bugzilla/show_bug.cgi?id=27491

Either you escape the - using category:HW\-NCI_TOPICS in your query (untested. and I don't know where the escape character will be removed) or you apply my suggested change.

Another option for using keywords with query parser might be adding a keyword syntax to the query parser. Something like category:key("HW-NCI_TOPICS") or category="HW-NCI_TOPICS".

HTH
Morus

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
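
For reference, the "escape the dash" step discussed in this thread could be sketched roughly as below. This is only an illustration: `DashEscaper` and `escapeDashes` are hypothetical names, not part of Lucene, and the sketch assumes the input term is raw (not already escaped). It only builds the query string; as the thread shows, whether the escaped term then matches the indexed keyword depends on where the query parser removes the backslash.

```java
public class DashEscaper {

    // Hypothetical helper (not a Lucene API): puts a backslash in front of
    // every '-' so the query parser does not read it as the NOT operator.
    // Assumes the term has not been escaped already.
    static String escapeDashes(String term) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (c == '-') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Builds the same query string used in the test above.
        String queryText = "+category:" + escapeDashes("HW-NCI_TOPICS") + " AND SPACE";
        System.out.println(queryText); // prints +category:HW\-NCI_TOPICS AND SPACE
    }
}
```

The resulting string would then be handed to QueryParser.parse as in the test at the top of this message.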