On Thu, Dec 16, 2010 at 12:42:53PM -0500, Robert Muir wrote:
> FWIW (definitely not trying to imply its the best!), NULL is what
> lucene-java does if the Analyzer returns zero tokens. This means it has to
> be careful too in all the other processing, for example the code that
> applies boost has to handle the query == null case
I had a discussion with my colleagues here at Eventful about what a query
like this ought to return:
foo AND ()
We concluded that if you built such a query programmatically (i.e. using the
OO constructors), then it should return nothing. That's how it works now,
because empty ANDQuery and empty ORQuery both compile down to NULL Matchers --
those empty parens fail to match, so the parent ANDQuery also fails.
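
To make that behavior concrete, here is a toy model in Python. It is not
Lucy's actual API: the class names mirror ANDQuery and ORQuery, but the
make_matcher() method, the dict-based index, and the use of sets of doc ids
as "Matchers" are all assumptions made for illustration. None stands in for a
NULL Matcher.

```python
class TermQuery:
    """Hypothetical leaf query: matches docs that contain one term."""

    def __init__(self, term):
        self.term = term

    def make_matcher(self, index):
        docs = index.get(self.term)
        # No postings for the term means a NULL (None) matcher.
        return set(docs) if docs else None


class ANDQuery:
    """Toy ANDQuery: every child must match."""

    def __init__(self, children=()):
        self.children = list(children)

    def make_matcher(self, index):
        if not self.children:
            return None  # empty AND compiles down to a NULL matcher
        result = None
        for child in self.children:
            matcher = child.make_matcher(index)
            if matcher is None:
                return None  # one failing child makes the whole AND fail
            result = matcher if result is None else result & matcher
        return result or None


class ORQuery:
    """Toy ORQuery: at least one child must match."""

    def __init__(self, children=()):
        self.children = list(children)

    def make_matcher(self, index):
        result = set()
        for child in self.children:
            matcher = child.make_matcher(index)
            if matcher is not None:
                result |= matcher
        return result or None  # empty OR also compiles down to NULL


# "foo AND ()" built programmatically: the empty parens match nothing,
# so the parent ANDQuery matches nothing either.
index = {"foo": [1, 2, 3]}
query = ANDQuery([TermQuery("foo"), ANDQuery()])
assert query.make_matcher(index) is None
```

In this model the parent never sees a NULL child *query*; it only sees a
child whose Matcher is NULL, which is the distinction drawn further down.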
In contrast, we agreed that it is ambiguous how a tolerant, user-facing query
parser such as Lucy::Search::QueryParser ought to handle such a query string.
It seems reasonable to pursue a resolution to the current bug with a minor
mod limited to the query parsing stage. I don't think we ought to touch the
Query classes themselves.
Put another way... Analysis, including application of stoplists, belongs to
the query-parsing stage of compilation, and not to the lower level of
compiling a Query object down to a Matcher. There's no way in Lucy to produce
a PolyQuery (the parent class for
ANDQuery/ORQuery/NOTQuery/RequiredOptionalQuery) which has one or more NULL
child queries. We don't have semantics for what a NULL child query would
mean, and I don't think we should add such semantics. We should avoid
following Lucene's example in this case.
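
As a rough sketch of what a parser-stage fix could look like, the Python
below prunes clauses that analyze to zero tokens before any Query object
would be built, so no NULL child is ever handed to a parent. Everything here
is hypothetical: the tuple-based parse tree, the STOPWORDS set, and the
analyze() and prune() helpers are stand-ins and not part of
Lucy::Search::QueryParser. Dropping the empty clause is also only one of
several defensible policies for a tolerant parser.

```python
STOPWORDS = {"the", "a", "of"}  # hypothetical stoplist


def analyze(text):
    """Toy analyzer: lowercase, split on whitespace, drop stopwords."""
    return [tok for tok in text.lower().split() if tok not in STOPWORDS]


def prune(node):
    """Drop clauses that yield no tokens.

    A node is either a string leaf or a tuple (op, children) where op is
    "AND" or "OR". Returns the pruned node, or None if nothing is left.
    """
    if isinstance(node, str):
        tokens = analyze(node)
        return " ".join(tokens) if tokens else None
    op, children = node
    pruned = (prune(child) for child in children)
    kept = [child for child in pruned if child is not None]
    if not kept:
        return None  # the whole group vanished; let the caller decide
    if len(kept) == 1:
        return kept[0]  # a group with one survivor collapses to it
    return (op, kept)


# "foo AND ()" under this policy is treated as just "foo".
tree = ("AND", ["foo", ("OR", [])])
assert prune(tree) == "foo"
```

Under this sketch the Query classes stay untouched; only the tree the parser
builds changes.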
Marvin Humphrey