I have a question about the meaning and behavior of grouping in Lucene
queries.
In particular, here is the scenario I am testing. I have indexed 1,000
documents.
|---+-------------------------------------------+---------------|
| # | Query String                              | Result (Hits) |
|---+-------------------------------------------+---------------|
| 1 | *:*                                       | 1000          |
| 2 | host:host_1                               | 46            |
| 3 | location:location_5                       | 100           |
| 4 | host:host_1 AND NOT location:location_5   | 37            |
| 5 | host:host_1 AND (NOT location:location_5) | 0             |
|---+-------------------------------------------+---------------|
I don't understand why the last query returns 0. I would expect queries 4
and 5 to return the same result.
Here's the interpretation based on running it through the Lucene
classic.QueryParser:
|-------------------------------------------+--------------------------------------|
| Query String                              | QueryParser.parse(qry).toString()    |
|-------------------------------------------+--------------------------------------|
| host:host_1 AND NOT location:location_5   | +host:host_1 -location:location_5    |
| host:host_1 AND (NOT location:location_5) | +host:host_1 +(-location:location_5) |
|-------------------------------------------+--------------------------------------|
I'd like some help understanding why I'm getting this unintuitive behavior.
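To make the question concrete, here is a toy model of how I am reading the
two parsed forms. It is plain Java, not Lucene code, and it rests on an
assumption: that a prohibited (-) clause only removes hits found by other
clauses, so a group containing nothing but a prohibited clause selects no
documents. The doc IDs and the names ClauseModel, HOST_1, and LOCATION_5 are
made up for illustration.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Toy model of Boolean clause evaluation over a tiny hypothetical index.
// This is NOT Lucene code; it only mimics the assumed rule that prohibited
// (-) clauses filter hits found by other clauses and select nothing alone.
class ClauseModel {

    // Hypothetical posting lists: the doc IDs matching each term.
    static final Set<Integer> HOST_1 = new TreeSet<>(Arrays.asList(1, 2, 3, 4));
    static final Set<Integer> LOCATION_5 = new TreeSet<>(Arrays.asList(3, 4, 5, 6));

    // "+required -prohibited": start from the required hits, drop the prohibited ones.
    static Set<Integer> requiredMinusProhibited(Set<Integer> required, Set<Integer> prohibited) {
        Set<Integer> result = new TreeSet<>(required);
        result.removeAll(prohibited);
        return result;
    }

    // "(-prohibited)" as a group by itself: there is no positive clause to
    // subtract from, so under the stated assumption it selects nothing.
    static Set<Integer> prohibitedOnly(Set<Integer> prohibited) {
        return new TreeSet<>();
    }

    // "+a +b": both clauses are required, so take the intersection.
    static Set<Integer> bothRequired(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    public static void main(String[] args) {
        // Query 4 shape: +host:host_1 -location:location_5
        System.out.println(requiredMinusProhibited(HOST_1, LOCATION_5)); // prints [1, 2]
        // Query 5 shape: +host:host_1 +(-location:location_5)
        System.out.println(bothRequired(HOST_1, prohibitedOnly(LOCATION_5))); // prints []
    }
}
```

If that assumption is right, I would expect a group with an explicit positive
clause, such as host:host_1 AND (*:* AND NOT location:location_5), to behave
like query 4, but I have not confirmed that this is the intended reading.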
Also, I see that the StandardSyntaxParser generates a different query
string:
|-------------------------------------------+-------------------------------------------------|
| Query String                              | StandardSyntaxParser.parse(qry).toQueryString() |
|-------------------------------------------+-------------------------------------------------|
| host:host_1 AND NOT location:location_5   | host:host_1 AND -location:location_5            |
| host:host_1 AND (NOT location:location_5) | host:host_1 AND ( -location:location_5 )        |
|-------------------------------------------+-------------------------------------------------|
Are these equivalent in Lucene? Should I stop using the classic.QueryParser?
*Details*
I am using Lucene 5.5.0 with classic.QueryParser. The query code is:
Directory directory = FSDirectory.open(getCurrentDirectory().toPath());
StandardAnalyzer analyzer = new StandardAnalyzer();
DirectoryReader reader = DirectoryReader.open(directory);
IndexSearcher searcher = new IndexSearcher(reader);
QueryParser parser = new QueryParser("ts", analyzer);
Query query = parser.parse("host:host_1 AND NOT location:location_5");
int limit = 1000;
TopDocs hits = searcher.search(query, limit);
System.out.println("hits.totalHits = " + hits.totalHits);
Thanks very much for your insights here.
-Michael Peterson