[ 
https://issues.apache.org/jira/browse/LUCENE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818114#comment-13818114
 ] 

Michael McCandless commented on LUCENE-5336:
--------------------------------------------

This is AWESOME.  I love how the operators (even whitespace!) are
optional. And I love the name :)  And it's great that it NEVER throws
an exception no matter how awful the input is.  And I love that it does not
use a lexer/parser generator: this makes it much more approachable
to those devs that don't have experience with parser generators.

Small javadoc fix: instead of "any {@code -} characters beyond the
first character in a term may not need to be escaped," I think it
should say "any {@code -} characters beyond the first character do not
need to be escaped" (and the same for the * operator)?

How does it handle malformed input, e.g. a missing closing " for a
phrase query?  If I enter "foo bar (with no closing quote), will it just
make a term query for "foo and a term query for bar?  Or does it strip
that " and query foo instead?  (Same for a missing closing paren?)  It
looks like it drops the " and ( and does a simple term query (good).
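
To make the question concrete, here is a minimal sketch of the lenient
behavior described above: an opening " with no matching close is dropped
and the remaining text is read as plain terms.  This is NOT the code from
the patch; the class name LenientPhraseSketch, the method tokenize, and
the TERM(...)/PHRASE(...) output labels are all hypothetical and exist
only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the LUCENE-5336 patch.
final class LenientPhraseSketch {

    /**
     * Splits the input into TERM(...) and PHRASE(...) tokens.
     * An unbalanced opening quote is silently dropped, so this method
     * never throws, whatever the input looks like.
     */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        if (input == null) {
            return tokens; // treat null like an empty query
        }
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace only separates tokens
            } else if (c == '"') {
                int close = input.indexOf('"', i + 1);
                if (close < 0) {
                    // No closing quote: drop the stray quote and keep scanning,
                    // so the following words become ordinary terms.
                    i++;
                } else {
                    String phrase = input.substring(i + 1, close).trim();
                    if (!phrase.isEmpty()) {
                        tokens.add("PHRASE(" + phrase + ")");
                    }
                    i = close + 1;
                }
            } else {
                // Read a plain term up to the next whitespace or quote.
                int start = i;
                while (i < input.length()
                        && !Character.isWhitespace(input.charAt(i))
                        && input.charAt(i) != '"') {
                    i++;
                }
                tokens.add("TERM(" + input.substring(start, i) + ")");
            }
        }
        return tokens;
    }
}
```

With this sketch, tokenize("\"foo bar") yields [TERM(foo), TERM(bar)],
while the balanced tokenize("\"foo bar\"") yields [PHRASE(foo bar)].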

Maybe you could add fangs to the random test by more frequently mixing
in these operator characters ...
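
One way to do that, sketched under assumptions: build each random query
string character by character and pick from the operator set with a
tunable probability.  The class OperatorHeavyQuerySketch, the method
randomQuery, and the operatorProbability parameter are hypothetical names,
not part of the patch or of Lucene's test framework; the operator list is
taken from the issue description.

```java
import java.util.Random;

// Hypothetical illustration only; not part of the patch or Lucene's test utilities.
final class OperatorHeavyQuerySketch {

    // Operator, escape, and whitespace characters listed in the issue description.
    private static final char[] OPERATORS = {
        '+', '|', '-', '"', '*', '(', ')', '\\', ' ', '\n', '\r', '\t'
    };

    /**
     * Returns a random string of the given length. Each position holds an
     * operator character with probability operatorProbability, otherwise
     * a lowercase ASCII letter.
     */
    static String randomQuery(Random random, int length, double operatorProbability) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            if (random.nextDouble() < operatorProbability) {
                sb.append(OPERATORS[random.nextInt(OPERATORS.length)]);
            } else {
                sb.append((char) ('a' + random.nextInt(26)));
            }
        }
        return sb.toString();
    }
}
```

A test could then feed randomQuery(random, 20, 0.5) to the parser many
times and assert only that parsing never throws; passing a seeded Random
keeps any failure reproducible.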


> Add a simple QueryParser to parse human-entered queries.
> --------------------------------------------------------
>
>                 Key: LUCENE-5336
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5336
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Jack Conradson
>         Attachments: LUCENE-5336.patch
>
>
> I would like to add a new simple QueryParser to Lucene that is designed to 
> parse human-entered queries.  This parser will operate on an entire entered 
> query using a specified single field or a set of weighted fields (using term 
> boost).
> All features/operations in this parser can be enabled or disabled depending 
> on what is necessary for the user.  A default operator may be specified as 
> either 'MUST' representing 'and' or 'SHOULD' representing 'or.'  The 
> features/operations that this parser will include are the following:
> * AND specified as '+'
> * OR specified as '|'
> * NOT specified as '-'
> * PHRASE surrounded by double quotes
> * PREFIX specified as '*'
> * PRECEDENCE surrounded by '(' and ')'
> * WHITESPACE specified as ' ' '\n' '\r' and '\t' will cause the default 
> operator to be used
> * ESCAPE specified as '\' will allow operators to be used in terms
> The key differences between this parser and other existing parsers will be 
> the following:
> * No exceptions will be thrown, and errors in syntax will be ignored.  The 
> parser will do a best-effort interpretation of any query entered.
> * It uses minimal syntax to express queries.  All available operators are 
> single characters or pairs of single characters.
> * The parser is hand-written and in a single Java file making it easy to 
> modify.


