[ https://issues.apache.org/jira/browse/LUCENE-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737957#action_12737957 ]
Luis Alves edited comment on LUCENE-1567 at 8/1/09 3:07 PM:
------------------------------------------------------------

Hi Michael,

{quote}
OK, didn't know there was another patch coming.... I guess I'll redo my verification then...
{quote}

I added that comment when I created a block dependency on LUCENE-1486.
I'm still learning JIRA :). I didn't know the comment was going to get posted in this thread; I was assuming LUCENE-1486 would get the comment.

      was (Author: lafa):
    Hi Michael,

{quote}
OK, didn't know there was another patch coming.... I guess I'll redo my verification then...
{quote}

I added that comment when I created a block dependency on LUCENE-1486.
I'm still learning JIRA :), I didn't the comment know was going to get posted in this thread, I was assuming it was the LUCENE-1486, that was going to get the comment.

> New flexible query parser
> -------------------------
>
> Key: LUCENE-1567
> URL: https://issues.apache.org/jira/browse/LUCENE-1567
> Project: Lucene - Java
> Issue Type: New Feature
> Components: QueryParser
> Environment: N/A
> Reporter: Luis Alves
> Assignee: Michael Busch
> Fix For: 2.9
>
> Attachments: lucene-1567.patch,
> lucene_1567_adriano_crestani_07_13_2009.patch,
> lucene_trunk_FlexQueryParser_2009July09_v4.patch,
> lucene_trunk_FlexQueryParser_2009July10_v5.patch,
> lucene_trunk_FlexQueryParser_2009july15_v6.patch,
> lucene_trunk_FlexQueryParser_2009july16_v7.patch,
> lucene_trunk_FlexQueryParser_2009july23_v8.patch,
> lucene_trunk_FlexQueryParser_2009july27_v9.patch,
> lucene_trunk_FlexQueryParser_2009july28_v10.patch,
> lucene_trunk_FlexQueryParser_2009july30_v12.patch,
> lucene_trunk_FlexQueryParser_2009july31_v14.patch,
> lucene_trunk_FlexQueryParser_2009March24.patch,
> lucene_trunk_FlexQueryParser_2009March26_v3.patch, new_query_parser_src.tar,
> QueryParser_restructure_meetup_june2009_v2.pdf,
> wiki_switching_to_the_new_query_parser.txt
>
>
> From "New flexible query parser" thread by Michael Busch
> in my team at IBM we
> have used a different query parser than Lucene's in
> our products for quite a while. Recently we spent a significant amount
> of time refactoring the code and designing a very generic
> architecture, so that this query parser can be easily used for different
> products with varying query syntaxes.
> This work was originally driven by Andreas Neumann (who, however, left
> our team); most of the code was written by Luis Alves, who has been a
> bit active in Lucene in the past, and Adriano Campos, who joined our
> team at IBM half a year ago. Adriano is an Apache committer and PMC member
> on the Tuscany project and is getting familiar with Lucene now too.
> We think this code is much more flexible and extensible than the current
> Lucene query parser, and would therefore like to contribute it to
> Lucene. I'd like to give a very brief architecture overview here;
> Adriano and Luis can then answer more detailed questions as they're much
> more familiar with the code than I am.
> The goal was to separate syntax and semantics of a query. E.g. 'a AND
> b', '+a +b', 'AND(a,b)' could be different syntaxes for the same query.
> We distinguish the semantics of the different query components, e.g.
> whether and how to tokenize/lemmatize/normalize the different terms or
> which Query objects to create for the terms. We wanted to be able to
> write a parser with a new syntax, while reusing the underlying
> semantics, as quickly as possible.
> In fact, Adriano is currently working on a 100% Lucene-syntax compatible
> implementation to make it easy for people who are using Lucene's query
> parser to switch.
> The query parser has three layers and its core is what we call the
> QueryNodeTree. It is a tree that initially represents the syntax of the
> original query, e.g. for 'a AND b':
>    AND
>   /   \
>  A     B
> The three layers are:
> 1. QueryParser
> 2. QueryNodeProcessor
> 3. QueryBuilder
> 1.
> The upper layer is the parsing layer, which simply transforms the
> query text string into a QueryNodeTree. Currently our implementations of
> this layer use javacc.
> 2. The query node processors do most of the work. This layer is in fact a
> configurable chain of processors. Each processor can walk the tree and
> modify nodes or even the tree's structure. That makes it possible,
> e.g., to optimize the query before it is executed or to tokenize
> terms.
> 3. The third layer is also a configurable chain of builders, which
> transform the QueryNodeTree into Lucene Query objects.
> Furthermore the query parser uses flexible configuration objects, which
> are based on AttributeSource/Attribute. It also uses message classes that
> allow resource bundles to be attached. This makes it possible to translate
> messages, which is an important feature of a query parser.
> This design allows us to develop different query syntaxes very quickly.
> Adriano wrote the Lucene-compatible syntax in a matter of hours, and the
> underlying processors and builders in a few days. We now have a 100%
> compatible Lucene query parser, which means the syntax is identical and
> all query parser test cases also pass on the new one, using a wrapper.
> Recent posts show that there is demand for query syntax improvements,
> e.g. improved range query syntax or operator precedence. There are
> already different QP implementations in Lucene+contrib; however, I think
> we did not keep them all up to date and in sync. This is not too
> surprising, because usually when fixes and changes are made to the main
> query parser, people don't make the corresponding changes in the contrib
> parsers. (I'm guilty here too)
> With this new architecture it will be much easier to maintain different
> query syntaxes, as the first layer requires very little code.
> All syntaxes would benefit from patches and improvements we make to the
> underlying layers, which will make supporting different syntaxes much
> more manageable.
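
For readers who want a concrete picture of the three layers described in the issue (parsing layer, processor chain, builders), here is a minimal toy sketch in Java. It is not taken from the attached patches. Every name in it (FlexSketch, Node, parseInfix, parsePlus, lowercase, process, build, toQuery) is hypothetical, the parsers are hand-written rather than generated with javacc, and the builder emits a plain string as a stand-in for Lucene Query objects. It only tries to show how two syntaxes can share the same processors and builder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

public class FlexSketch {
    // Hypothetical tree node: a leaf holds a term, an operator node holds children.
    static final class Node {
        final String op;            // "AND" for operator nodes, null for leaves
        final String term;          // term text for leaves, null for operator nodes
        final List<Node> children;
        Node(String op, String term, List<Node> children) {
            this.op = op;
            this.term = term;
            this.children = children;
        }
        static Node leaf(String term) { return new Node(null, term, new ArrayList<>()); }
        static Node and(List<Node> kids) { return new Node("AND", null, kids); }
    }

    // Layer 1, syntax A: "a AND b" (flat conjunctions only, for illustration).
    static Node parseInfix(String text) {
        List<Node> kids = new ArrayList<>();
        for (String part : text.trim().split("\\s+AND\\s+")) kids.add(Node.leaf(part));
        return Node.and(kids);
    }

    // Layer 1, syntax B: "+a +b" (assumes every token starts with '+').
    static Node parsePlus(String text) {
        List<Node> kids = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) kids.add(Node.leaf(part.substring(1)));
        return Node.and(kids);
    }

    // Layer 2: one example processor that walks the tree and lowercases every term.
    static Node lowercase(Node n) {
        if (n.term != null) return Node.leaf(n.term.toLowerCase(Locale.ROOT));
        List<Node> kids = new ArrayList<>();
        for (Node c : n.children) kids.add(lowercase(c));
        return new Node(n.op, null, kids);
    }

    // Layer 2: applies a configurable chain of processors in order.
    static Node process(Node tree, List<UnaryOperator<Node>> chain) {
        for (UnaryOperator<Node> p : chain) tree = p.apply(tree);
        return tree;
    }

    // Layer 3: builds the final query; here a string stands in for a Query object.
    static String build(Node n) {
        if (n.term != null) return n.term;
        StringBuilder sb = new StringBuilder("(");
        for (Node c : n.children) {
            if (sb.length() > 1) sb.append(' ');
            sb.append('+').append(build(c));
        }
        return sb.append(')').toString();
    }

    // Runs layers 2 and 3 with a default chain containing only the lowercase processor.
    static String toQuery(Node tree) {
        List<UnaryOperator<Node>> chain = new ArrayList<>();
        chain.add(FlexSketch::lowercase);
        return build(process(tree, chain));
    }

    public static void main(String[] args) {
        // Two different syntaxes, one shared set of processors and builders.
        System.out.println(toQuery(parseInfix("A AND b")));
        System.out.println(toQuery(parsePlus("+a +B")));
    }
}
```

In this sketch, supporting a new syntax means writing only another parse method that returns the same tree; lowercase, process, and build are reused unchanged, which is the maintenance benefit the description argues for.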