You are not kidding. I keep finding this tougher and tougher. Originally
my method worked for most simple queries, but not all. I improved it to
cover some more ground, allowing some modestly complex queries, but now
even that improvement seems woefully inadequate for solving the general
case. I was handling some pretty complex queries, but losing the order of
operations in a proximity query once things got too exciting, among other
small bugs.
The parse tree handles the boolean stuff fine on its own, but the
proximity operator seems to require a distributive approach (think
algebra: the proximity clause gets multiplied out over each term) that
still maintains order of operations. It is a bit nasty.
Consider:
cop | fowl & (fowl | priest & man) ! helicopter ~8 (hillary | tom)
This must distribute to (roughly):
cop | ( (fowl & ( (fowl ~8 hillary | (priest ~8 hillary & man ~8
hillary) ) ! helicopter ~8 hillary) | (fowl ~8 tom | (priest ~8 tom &
man ~8 tom) ) ! helicopter ~8 tom))
I have gotten close, but instead of (fowl | (priest & man)) I might get
((fowl | priest) & man). A naive distribution ignores order of ops
and simply groups left to right.
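
One way to keep the grouping intact is to distribute over the parse tree itself rather than over a flat token list, so that every boolean node stays where the parser put it. The following is only a minimal sketch of that idea, not the parser discussed here: the tuple-based node layout and the names `distribute` and `show` are made up for illustration, it assumes both operands contain only terms and the boolean operators &, | and !, and it treats ! the same way as the rough expansion above does.

```python
# Hypothetical node layout: a term is a str, a boolean node is (op, left, right)
# with op in {"&", "|", "!"}. A finished proximity pair is ("~", slop, a, b).

def distribute(left, right, slop):
    """Push `left ~slop right` down to term pairs without regrouping anything."""
    if not isinstance(right, str):
        # Split on the right operand first, so its operator ends up outermost.
        op, a, b = right
        return (op, distribute(left, a, slop), distribute(left, b, slop))
    if not isinstance(left, str):
        # Recurse into the left subtree, reusing its operator at the same depth.
        op, a, b = left
        return (op, distribute(a, right, slop), distribute(b, right, slop))
    # Both sides are single terms: emit one proximity pair.
    return ("~", slop, left, right)

def show(node):
    """Render a tree back to query text, parenthesizing every boolean node."""
    if isinstance(node, str):
        return node
    if node[0] == "~":
        _, slop, a, b = node
        return f"{a} ~{slop} {b}"
    op, a, b = node
    return f"({show(a)} {op} {show(b)})"

left = ("|", "fowl", ("&", "priest", "man"))   # fowl | (priest & man)
right = ("|", "hillary", "tom")                # hillary | tom
print(show(distribute(left, right, 8)))
```

Because the recursion follows the tree, (priest & man) can never be regrouped as ((fowl | priest) & man): the "&" node is rebuilt at the same depth it had before distribution.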
my order of ops:
& "and"
| "or"
~ "within"
! "butnot"
<space between words>
I have a new plan of attack that I have begun, but who knows where it
will lead. I thought I was so close, but apparently just a tease...that
method could only take me so far. I hate to put so much work into this
since I doubt anyone will even use such complex queries (the queries I
monitor are always so basic) but I may give it a go just to see if my
new idea will solve the general case.
We will see if this parser actually has any life in it. Maybe I am no
closer than you were. I am very new at this.
- Mark
Mark --
Yes please! I'm very interested in the mixing of boolean and proximity
operators. I have also worked on a parser (using JavaCC) but haven't
managed to crack queries such as:
((a OR b) AND c) NEAR (d NOT e)
I can get the parse tree okay, but haven't figured out how to translate
that into a valid Lucene Query object. Simple queries such as:
(a OR b) NEXT (c OR d) // note the use of OR exclusively!
are okay, but nothing more complex. So: bring it on!
-- Robert
rwatkins at foo-bar.org
On Mon, 21 Aug 2006, Mark Miller wrote:
Is anyone interested in helping me test out a new query parser (i.e., is
anyone interested in using this, thereby helping me test it)?
The parser uses an intermediate parse tree representation, unlike
Lucene's Query Filter.
[ snipped ]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]