Re: New query parser?

Roman Chyla Wed, 22 May 2013 17:30:29 -0700

Hello,
The new JIRA issue has been created -
https://issues.apache.org/jira/browse/LUCENE-5014
Thank you for trying it,


roman


On Wed, May 15, 2013 at 7:34 AM, Roman Chyla <roman.ch...@gmail.com> wrote:

> Hi Jan,
>
> Thanks for the thumbs up.
>
>
> On Tue, May 14, 2013 at 11:14 AM, Jan Høydahl <jan....@cominvent.com> wrote:
>
>> Hello :)
>>
>> I think it has been the intention of the dev community for a long time to
>> start using the flex parser framework, and in this regard this contribution
>> is much welcome as a kickstarter for that.
>> I have not looked much at the code, but I hope it could be a starting
>> point for writing future parsers in a less "spaghetti" way.
>>
>> One question. Say we want to add a new operator such as NEAR/N. Ideally
>> this should be added in Lucene, then all the Solr QParsers extending the
>> lucene flex parser would benefit from the same new operator. Would this be
>> easily achieved with your code you think? We also have a ton of
>>
>
>
> Adding a new operator is very simple on the syntax level -- i.e. when I
> want the NEAR/x operator, I just change the ANTLR grammar, which produces
> the appropriate abstract syntax tree. The flex parser then consumes this tree.
>
> Yet, imagine the following query
>
> dog NEAR/5 cat
>
> If you are using synonyms, an analyzer could have expanded dog with
> synonyms, so the query becomes something like
>
> (dog | canin) NEAR/5 cat
>
> and since Lucene cannot handle these queries, the flex builder must
> rewrite them, effectively producing
>
> SpanNear(SpanOr(dog | canin), SpanTerm(cat), 5)
>
> but you could also argue, that a better way to handle this query is:
>
> SpanNear(dog, cat, 5) OR SpanNear(canin, cat, 5)
>
> If that is the case, then a different builder will have to be used.
>
> This is just an example where the syntax is relatively simple, but the
> semantics are the hard part. But I believe the flex parser gives all the
> necessary tools to deal with that and avoid the spaghetti problem.
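
The two rewrites above can be sketched on a toy syntax tree. This is only an
illustration in Python, not Lucene or flex parser code: the Term, Or and Near
classes and the distribute_near and render helpers are hypothetical names made
up for this sketch. It assumes the second strategy simply distributes the
synonym alternatives over the NEAR operator.

```python
import itertools
from dataclasses import dataclass
from typing import Tuple, Union


# Hypothetical toy node types; they only mimic the shape of the queries above.
@dataclass(frozen=True)
class Term:
    text: str


@dataclass(frozen=True)
class Or:
    clauses: Tuple["Node", ...]


@dataclass(frozen=True)
class Near:
    clauses: Tuple["Node", ...]
    slop: int


Node = Union[Term, Or, Near]


def distribute_near(node: Node) -> Node:
    """Rewrite Near(Or(a, b), c) into Or(Near(a, c), Near(b, c))."""
    if not isinstance(node, Near):
        return node
    # Each clause contributes its alternatives (a single one for a plain term).
    alternatives = [c.clauses if isinstance(c, Or) else (c,) for c in node.clauses]
    combos = list(itertools.product(*alternatives))
    if len(combos) == 1:
        return node  # nothing was expanded, keep the original node
    return Or(tuple(Near(combo, node.slop) for combo in combos))


def render(node: Node) -> str:
    """Print a node in roughly the notation used in this thread."""
    if isinstance(node, Term):
        return node.text
    if isinstance(node, Or):
        return " OR ".join(render(c) for c in node.clauses)
    inner = ", ".join(render(c) for c in node.clauses)
    return f"SpanNear({inner}, {node.slop})"


# (dog | canin) NEAR/5 cat, as produced after synonym expansion
query = Near((Or((Term("dog"), Term("canin"))), Term("cat")), 5)
print(render(query))
print(render(distribute_near(query)))
```

The first print shows the nested form (one NEAR over an OR of synonyms), the
second the distributed form (an OR of NEARs). In the real parser that choice
would be made by whichever builder is plugged in.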
>
>
> --roman
>
>
>
>> feature requests on the eDisMax parser for new kinds of query syntax
>> support. Before we start implementing that on top of the
>> already-hard-to-maintain eDismax code, we should think about
>> re-implementing eDismax on top of flex, perhaps on top of Roman's contrib
>> here?
>>
>
> btw: I am using edismax in one of my grammars -- i.e. users can type: query
> AND edismax(foo OR (dog AND cat)) -- and the "edismax(....)" part will be
> parsed by edismax. But I hit problems there as well: it does not do such a
> nice job with operators, and of course it doesn't know how to handle
> multi-token synonym expansion. I think it could be nicely extracted
> into a flex processor and effectively become a plugin for a Solr parser
> (right now it is a parser of its own, which makes it hard to extend).
>
>
>
>
>
>>
>>  --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>>
>> 14. mai 2013 kl. 17:07 skrev Roman Chyla <roman.ch...@gmail.com>:
>>
>> Hello World!
>>
>> Following the recommended practice I'd like to let you know that I am
>> about to start porting our existing query parser into JIRA with the aim of
>> making it available to Lucene/SOLR community.
>>
>> The query parser is built on top of the flexible query parser, but it
>> separates the parsing (ANTLR) from the query building. It allows for very
>> sophisticated custom logic and has introspection methods, so one can
>> actually 'see' what is going on - I have had lots of FUN working with it
>> (which I consider to be a feature, not a shameless plug ;)).
>>
>> Some write up is here:
>> http://29min.wordpress.com/category/antlrqueryparser/
>>
>> You can see the source code at:
>>
>> https://github.com/romanchyla/montysolr/tree/master/contrib/antlrqueryparser
>>
>>
>> If you think this project is duplicating something or even being useless
>> (I hope not!) please let me know, stop me, say something...
>>
>> Thank you!
>>
>>   roman
>>
>>
>>
>
