Of course - you need to use the same analyzer for both indexing and query.
So, just reindex your data with this new analyzer.
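Why reindexing is needed can be sketched with a toy inverted index in plain Java. This is not Lucene's API: the class `ReindexDemo`, its methods, and the two-word stop list are assumptions made up for illustration. It only shows that a term dropped at index time cannot be found later, whatever analyzer the query side uses.

```java
import java.util.*;

// Toy model only: a hypothetical stand-in for an analyzer plus an inverted index.
public class ReindexDemo {

    // Lowercase, split on non-letters, and drop any stop words.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Map each surviving term to the ids of the documents that contain it.
    static Map<String, Set<Integer>> buildIndex(List<String> docs, Set<String> stopWords) {
        Map<String, Set<Integer>> index = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String term : analyze(docs.get(id), stopWords)) {
                index.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
            }
        }
        return index;
    }

    // Analyze the query with the query-side stop set, then count the
    // documents that contain every remaining query term.
    static int countHits(Map<String, Set<Integer>> index, String query, Set<String> stopWords) {
        Set<Integer> hits = null;
        for (String term : analyze(query, stopWords)) {
            Set<Integer> postings = index.getOrDefault(term, Collections.<Integer>emptySet());
            if (hits == null) {
                hits = new TreeSet<>(postings);
            } else {
                hits.retainAll(postings);
            }
        }
        return hits == null ? 0 : hits.size();
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("I am not happy", "no idea", "happy days");
        Set<String> oldStops = new HashSet<>(Arrays.asList("no", "not"));
        Set<String> noStops = Collections.emptySet();

        // Old index: built while "no" and "not" were still stop words.
        Map<String, Set<Integer>> oldIndex = buildIndex(docs, oldStops);
        // New index: rebuilt with the same empty stop set the query side uses.
        Map<String, Set<Integer>> newIndex = buildIndex(docs, noStops);

        System.out.println(countHits(oldIndex, "not", noStops)); // prints 0
        System.out.println(countHits(newIndex, "not", noStops)); // prints 1
    }
}
```

In this toy, changing only the query-side stop set leaves the old index without any entry for "not", which mirrors the "0 total matching documents" result reported in the quoted message.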
-- Jack Krupansky
-----Original Message-----
From: Natalia Connolly
Sent: Tuesday, March 18, 2014 10:37 AM
To: java-user@lucene.apache.org
Subject: Re: How to search for terms containing negation
I am afraid this did not work, Tri. Here's what I tried:
List<String> words = new ArrayList<String>();
boolean ignoreCase = true;
CharArraySet emptyset = new CharArraySet(Version.LUCENE_47, words, ignoreCase);
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_47, emptyset);
Here's what happens:
Searching for: no
0 total matching documents
Searching for: not
0 total matching documents
even though I know the documents contain plenty of occurrences of "no" and
"not". Could the problem be further upstream (i.e., words like these aren't
even indexed)?
Thank you,
Natalia
On Mon, Mar 17, 2014 at 3:57 PM, Tri Cao <tm...@me.com> wrote:
StandardAnalyzer has a constructor that takes a stop word set, so I guess
you can pass it an empty set:
http://lucene.apache.org/core/4_6_1/analyzers-common/org/apache/lucene/analysis/standard/StandardAnalyzer.html#StandardAnalyzer(org.apache.lucene.util.Version, org.apache.lucene.analysis.util.CharArraySet)
QueryParser is probably ok. I rarely use this parser but I don't think it
recognizes "not" in its grammar.
Hope this helps,
Tri
On Mar 17, 2014, at 12:46 PM, Natalia Connolly <
natalia.v.conno...@gmail.com> wrote:
Hi Tri,
Thank you so much for your message!
Yes, it looks like the negation terms have indeed been filtered out;
when I query on "no" or "not", I get no results. I am just using
StandardAnalyzer and the classic QueryParser:
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
QueryParser parser = new QueryParser(Version.LUCENE_47, field, analyzer);
Which analyzer/parser would you recommend?
Thank you again,
Natalia
On Mon, Mar 17, 2014 at 3:35 PM, Tri Cao <tm...@me.com> wrote:
Natalia,
First make sure that your analyzers (both index and query analyzers) do
not filter these out as stop words. I think the standard StopFilter list
has "no" and "not". You can check whether your index has these terms by
querying for "no" as a TermQuery. If there is no match for that query,
then you know for sure they have been filtered out.
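The check described above can be sketched with a toy term dictionary in plain Java. This is an illustration, not Lucene code: `TermLookupDemo`, `indexTerms`, and the two-word stop list are hypothetical. The exact lookup merely plays the role that a TermQuery plays in Lucene, where the term is matched as given without passing through an analyzer.

```java
import java.util.*;

// Toy model only: hypothetical names, not part of Lucene.
public class TermLookupDemo {

    // Collect the terms that survive index-time stop-word removal.
    static Set<String> indexTerms(List<String> docs, Set<String> stopWords) {
        Set<String> terms = new TreeSet<>();
        for (String doc : docs) {
            for (String token : doc.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!token.isEmpty() && !stopWords.contains(token)) {
                    terms.add(token);
                }
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("not happy", "no idea", "never available");
        // Assumed stop list for the sketch; a real analyzer's list is longer.
        Set<String> stopWords = new HashSet<>(Arrays.asList("no", "not"));
        Set<String> terms = indexTerms(docs, stopWords);

        // Exact lookups: the probe term is not analyzed or filtered.
        System.out.println(terms.contains("no"));    // prints false
        System.out.println(terms.contains("happy")); // prints true
    }
}
```

If the exact lookup for "no" fails while ordinary words are found, the term was removed before it ever reached the index, so the fix belongs on the indexing side.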
The next thing to check is your query parser. What query parser are you
using? Some parsers actually understand the "not" term and rewrite it to a
negation query.
Hope this helps,
Tri
On Mar 17, 2014, at 12:02 PM, Natalia Connolly <
natalia.v.conno...@gmail.com> wrote:
Hi All,
Is there any way I could construct a query that would not automatically
exclude negation terms (such as "no", "not", etc)? For example, I need to
find strings like "not happy", "no idea", "never available". I tried
using a simple analyzer with combinations such as "not AND happy", and
similar patterns, but it does not work.
Any help would be appreciated!
Natalia