[jira] [Commented] (LUCENE-5336) Add a simple QueryParser to parse human-entered queries.
[ https://issues.apache.org/jira/browse/LUCENE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900662#comment-13900662 ]

Marcio Napoli commented on LUCENE-5336:

Would it be worthwhile to also support prefix/suffix wildcards (term* or *term*) and date ranges such as [20120910 TO 20130101]? Thanks!

Add a simple QueryParser to parse human-entered queries.

Key: LUCENE-5336
URL: https://issues.apache.org/jira/browse/LUCENE-5336
Project: Lucene - Core
Issue Type: Improvement
Reporter: Jack Conradson
Fix For: 5.0, 4.7
Attachments: LUCENE-5336.patch, LUCENE-5336.patch, LUCENE-5336.patch

I would like to add a new simple QueryParser to Lucene that is designed to parse human-entered queries. This parser will operate on an entire entered query using a specified single field or a set of weighted fields (using term boost).

All features/operations in this parser can be enabled or disabled depending on what is necessary for the user. A default operator may be specified as either 'MUST' representing 'and' or 'SHOULD' representing 'or'.

The features/operations that this parser will include are the following:
* AND specified as '+'
* OR specified as '|'
* NOT specified as '-'
* PHRASE surrounded by double quotes
* PREFIX specified as '*'
* PRECEDENCE surrounded by '(' and ')'
* WHITESPACE specified as ' ' '\n' '\r' and '\t' will cause the default operator to be used
* ESCAPE specified as '\' will allow operators to be used in terms

The key differences between this parser and other existing parsers will be the following:
* No exceptions will be thrown, and errors in syntax will be ignored. The parser will do a best-effort interpretation of any query entered.
* It uses minimal syntax to express queries. All available operators are single characters or pairs of single characters.
* The parser is hand-written and in a single Java file making it easy to modify.
-- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
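The issue description says every feature can be switched on or off. A minimal sketch of one way that could work is shown here, assuming one bit flag per operator. The class name, constant names, and bit values are hypothetical and are not taken from the attached patch; only the operator characters come from the description.

```java
// Hypothetical sketch: one bit flag per feature listed in the issue description.
final class FeatureFlags {
    static final int AND        = 1 << 0; // '+'
    static final int OR         = 1 << 1; // '|'
    static final int NOT        = 1 << 2; // '-'
    static final int PHRASE     = 1 << 3; // double quotes
    static final int PREFIX     = 1 << 4; // '*'
    static final int PRECEDENCE = 1 << 5; // '(' and ')'
    static final int WHITESPACE = 1 << 6; // ' ', '\n', '\r', '\t'
    static final int ESCAPE     = 1 << 7; // '\'

    /** Returns true if c is an operator character whose feature is enabled. */
    static boolean isOperator(char c, int enabled) {
        switch (c) {
            case '+':  return (enabled & AND) != 0;
            case '|':  return (enabled & OR) != 0;
            case '-':  return (enabled & NOT) != 0;
            case '"':  return (enabled & PHRASE) != 0;
            case '*':  return (enabled & PREFIX) != 0;
            case '(':
            case ')':  return (enabled & PRECEDENCE) != 0;
            case ' ':
            case '\n':
            case '\r':
            case '\t': return (enabled & WHITESPACE) != 0;
            case '\\': return (enabled & ESCAPE) != 0;
            default:   return false; // ordinary term character
        }
    }

    public static void main(String[] args) {
        int enabled = AND | OR; // only '+' and '|' act as operators
        System.out.println(isOperator('+', enabled)); // prints "true"
        System.out.println(isOperator('*', enabled)); // prints "false"
    }
}
```

With a disabled feature, the character simply falls through to the term text, which matches the description's point that unused syntax should not get in the user's way.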
[jira] [Commented] (LUCENE-5336) Add a simple QueryParser to parse human-entered queries.
[ https://issues.apache.org/jira/browse/LUCENE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867681#comment-13867681 ]

ASF subversion and git services commented on LUCENE-5336:

Commit 1557073 from [~mikemccand] in branch 'dev/branches/lucene5376'
[ https://svn.apache.org/r1557073 ]

LUCENE-5336, LUCENE-5376: expose SimpleQueryParser in lucene server
[jira] [Commented] (LUCENE-5336) Add a simple QueryParser to parse human-entered queries.
[ https://issues.apache.org/jira/browse/LUCENE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820091#comment-13820091 ]

Adrien Grand commented on LUCENE-5336:

Javadocs and code seem to disagree on the default operator: javadocs say {{The default operator is AND if no other operator is specified.}} while the code has {{private BooleanClause.Occur defaultOperator = BooleanClause.Occur.SHOULD;}}?

Otherwise I agree with Mike that this new query parser is awesome. I will certainly use it!
[jira] [Commented] (LUCENE-5336) Add a simple QueryParser to parse human-entered queries.
[ https://issues.apache.org/jira/browse/LUCENE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820196#comment-13820196 ]

Michael McCandless commented on LUCENE-5336:

+1, javadocs and the new test look great!
[jira] [Commented] (LUCENE-5336) Add a simple QueryParser to parse human-entered queries.
[ https://issues.apache.org/jira/browse/LUCENE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820218#comment-13820218 ]

Adrien Grand commented on LUCENE-5336:

+1
[jira] [Commented] (LUCENE-5336) Add a simple QueryParser to parse human-entered queries.
[ https://issues.apache.org/jira/browse/LUCENE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820251#comment-13820251 ]

ASF subversion and git services commented on LUCENE-5336:

Commit 1541151 from [~rcmuir] in branch 'dev/trunk'
[ https://svn.apache.org/r1541151 ]

LUCENE-5336: add SimpleQueryParser for human-entered queries
[jira] [Commented] (LUCENE-5336) Add a simple QueryParser to parse human-entered queries.
[ https://issues.apache.org/jira/browse/LUCENE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820254#comment-13820254 ]

ASF subversion and git services commented on LUCENE-5336:

Commit 1541158 from [~rcmuir] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1541158 ]

LUCENE-5336: add SimpleQueryParser for human-entered queries
[jira] [Commented] (LUCENE-5336) Add a simple QueryParser to parse human-entered queries.
[ https://issues.apache.org/jira/browse/LUCENE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819275#comment-13819275 ]

Jack Conradson commented on LUCENE-5336:

Thanks for the feedback.

To answer the malformed input question:

If "foo bar is given as the query, the double quote will be dropped, and if whitespace is an operator it will make term queries for both 'foo' and 'bar'; otherwise it will make a single term query 'foo bar'.

If foo"bar is given as the query, the double quote will be dropped, and term queries will be made for both 'foo' and 'bar'.

The reason it's done this way is that the parser only backtracks as far as the malformed input (in this case the extraneous double quote), so 'foo' would already be part of the query tree. This is because only a single pass is made for each query. The parser could be changed to do two passes to remove extraneous characters, but I believe that only makes the code more complex, and doesn't necessarily interpret the query any better for a user, since the malformed character gives no hint as to what he/she really intended to do.

I will try to post another patch today or tomorrow. I plan to do the following:
* Fix the Javadoc comment
* Add more tests for random operators
* Rename the class to SimpleQueryParser and rename the package to .simple
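The handling of an unmatched double quote described in this comment can be illustrated with a small stand-alone sketch. It is not the code from the patch: the class and method names are hypothetical, it treats whitespace as an operator, and a simple lookahead with indexOf stands in for the parser's backtracking. It only shows that dropping the stray quote yields the term queries described for "foo bar and foo"bar, without ever throwing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of best-effort scanning; not the SimpleQueryParser source.
final class BestEffortScanner {

    /** Splits a query into terms and phrases; malformed quotes are dropped. */
    static List<String> scan(String query) {
        List<String> out = new ArrayList<>();
        StringBuilder term = new StringBuilder();
        int i = 0;
        while (i < query.length()) {
            char c = query.charAt(i);
            if (c == '"') {
                flush(term, out); // a quote always ends the current term
                int close = query.indexOf('"', i + 1);
                if (close >= 0) {
                    // Balanced quotes: emit the enclosed text as one phrase.
                    String phrase = query.substring(i + 1, close).trim();
                    if (!phrase.isEmpty()) {
                        out.add("PHRASE(" + phrase + ")");
                    }
                    i = close + 1;
                } else {
                    i++; // unmatched quote: drop it and keep scanning
                }
            } else if (Character.isWhitespace(c)) {
                flush(term, out); // whitespace separates terms
                i++;
            } else {
                term.append(c);
                i++;
            }
        }
        flush(term, out);
        return out;
    }

    private static void flush(StringBuilder term, List<String> out) {
        if (term.length() > 0) {
            out.add("TERM(" + term + ")");
            term.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(scan("\"foo bar"));   // prints "[TERM(foo), TERM(bar)]"
        System.out.println(scan("foo\"bar"));    // prints "[TERM(foo), TERM(bar)]"
        System.out.println(scan("\"foo bar\"")); // prints "[PHRASE(foo bar)]"
    }
}
```

Because the stray quote is discarded at the point where it is found, 'foo' in foo"bar has already been emitted as its own term, which mirrors the single-pass reasoning given above.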
[jira] [Commented] (LUCENE-5336) Add a simple QueryParser to parse human-entered queries.
[ https://issues.apache.org/jira/browse/LUCENE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818431#comment-13818431 ]

Paul Elschot commented on LUCENE-5336:

A realistic query parser is not likely to be any simpler than this, so why not call it simple?
[jira] [Commented] (LUCENE-5336) Add a simple QueryParser to parse human-entered queries.
[ https://issues.apache.org/jira/browse/LUCENE-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818114#comment-13818114 ]

Michael McCandless commented on LUCENE-5336:

This is AWESOME.

I love how the operators (even whitespace!) are optional. And I love the name :) And it's great that it NEVER throws an exception no matter how awful the input is.

And I love that it does not use a lexer/parser generator: this makes it much more approachable to those devs that don't have experience with parser generators.

Small javadoc fix: instead of "any {@code -} characters beyond the first character in a term may not need to be escaped", I think it should say "any {@code -} characters beyond the first character do not need to be escaped" (and same for the * operator)?

How does it handle malformed input, e.g. a missing closing " for a phrase query? If I enter "foo bar will it just make a term query for foo and a term query for bar? Or, does it strip that and do query foo instead? (Same for a missing closing paren?) It looks like it drops the " and ( and does a simple term query (good).

Maybe you could add fangs to the random test by more frequently mixing in these operator characters ...