Paul Elschot wrote:
Mark,
On Thursday 31 August 2006 23:18, Mark Miller wrote:
I am not a huge fan of the queryparser's syntax so I have started an
open source project to create a viable alternative. I could really use
some help testing it out. The more I can get it tested the better
chance it has of serving the community. The parser is called Qsol. I am
right up against its initial release. So far it:
- offers a simple, clean syntax.
- allows arbitrary combinations/nesting of proximity and boolean queries.
Could you say in a few words how the combination of proximity and boolean
is implemented in Qsol?
I found this the most difficult thing to implement in surround. In surround,
every subquery that can be a proximity subquery has two (groups of) methods:
one for use as boolean and one for use as proximity.
I'd like to have a mechanism that allows mixing proximity and boolean queries
built into Lucene.
Did you also implement parsed phrases with Lucene's PhraseQuery?
Surround does not have that.
Regards,
Paul Elschot
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Hi Paul,
I'm afraid my programming is probably quite a ways behind yours, so I
doubt anything I have done will be of any help to you.
I also have to treat things differently depending on whether I am in a
proximity clause or a boolean clause. A wildcard in a boolean clause is
mapped to a wildcard query. A wildcard in a proximity clause is mapped to
a regex span that has been modified to only deal with * and ?. When I run
into a proximity, I collect a small tree for each clause and distribute
them against each other. For example, (old | map) ~3 big gets distributed
to old ~3 big | map ~3 big. This distribution method appears to handle
all boolean/proximity nesting/mixing cases for me, including: great !
"big old phrase search" ~5 (holy ~4 (big black bear)). The distribution
maintains order of operations, but it can obviously also create some
pretty large queries.
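
A minimal sketch of that distribution step, written in Python rather than taken from Qsol. The names Term, Or, Near, alternatives, distribute and show are hypothetical, and the sketch only handles OR inside a proximity; AND and NOT (the ! in the example above) are left out. It is meant to illustrate the idea, not to describe how Qsol actually does it.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Term:
    text: str


@dataclass(frozen=True)
class Or:
    clauses: tuple


@dataclass(frozen=True)
class Near:
    slop: int
    clauses: tuple


def alternatives(node):
    """Return the OR-free queries that `node` expands to."""
    if isinstance(node, Term):
        return [node]
    if isinstance(node, Or):
        # An OR contributes every alternative of every clause.
        out = []
        for clause in node.clauses:
            out.extend(alternatives(clause))
        return out
    if isinstance(node, Near):
        # Distribute: one proximity per combination of clause alternatives.
        per_clause = [alternatives(clause) for clause in node.clauses]
        return [Near(node.slop, combo) for combo in product(*per_clause)]
    raise TypeError(f"unsupported node: {node!r}")


def distribute(node):
    """Rewrite `node` so that any OR sits above the proximities."""
    alts = alternatives(node)
    return alts[0] if len(alts) == 1 else Or(tuple(alts))


def show(node, nested=False):
    """Render a query in the syntax used in this thread."""
    if isinstance(node, Term):
        return node.text
    if isinstance(node, Or):
        return " | ".join(show(clause) for clause in node.clauses)
    body = f" ~{node.slop} ".join(show(clause, True) for clause in node.clauses)
    return f"({body})" if nested else body


query = Near(3, (Or((Term("old"), Term("map"))), Term("big")))
print(show(distribute(query)))  # old ~3 big | map ~3 big
```

The number of proximities produced is the product of the number of alternatives in each clause, which is why the rewritten query can get large.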
I did not use the phrase search because I do not like how the slop works
(not in order, etc.), so phrases both inside and outside a proximity
clause use a nearspan instead. For a multiphrase search I use an OrSpan
on words in the same position.
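
To make the in-order behaviour concrete, here is a small Python model of an ordered proximity match over token positions. It is not Lucene code: ordered_near_match is a hypothetical helper, each slot is a set of alternative words (standing in for an OR over words at the same position), and slop is assumed to count the extra positions between the first and last matched word.

```python
def ordered_near_match(tokens, slots, slop=0):
    """True if the slots match in order with at most `slop` extra positions."""
    for start, token in enumerate(tokens):
        if token not in slots[0]:
            continue
        pos = start
        matched = True
        for slot in slots[1:]:
            # Taking the earliest later match keeps the span as short as possible.
            nxt = next(
                (p for p in range(pos + 1, len(tokens)) if tokens[p] in slot),
                None,
            )
            if nxt is None:
                matched = False
                break
            pos = nxt
        # Span width minus the number of matched words = extra positions.
        if matched and (pos - start + 1) - len(slots) <= slop:
            return True
    return False


tokens = "the big old black bear".split()
print(ordered_near_match(tokens, [{"big"}, {"bear"}], slop=2))          # True
print(ordered_near_match(tokens, [{"bear"}, {"big"}], slop=5))          # False
print(ordered_near_match(tokens, [{"big", "large"}, {"old"}], slop=0))  # True
```

The second call returns False however large the slop is, because the words appear in the opposite order; that is the behaviour wanted here and the one a sloppy phrase match does not guarantee.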
- Mark