I’ve been trying to do semi-structured queries and query parsing. In other words, a query could have XML snippets mixed in with plain terms, for example:
christmas tree <store loc="abc" close_hour="2200">

where you’re looking for a document with the terms “christmas” and “tree”, but also some structured data about where (practically) you could buy the tree.

Additionally, I’d like to be able to write functions relating multiple items, sort of like predicate logic or database-like queries:

christmas tree NEARBY( <store close_hour="2200">, <restaurant close_hour="2400"> )

which would only find you places to buy a christmas tree that had stores and restaurants in close proximity to each other.

Finally, we would eventually be interested in doing something similar to org.apache.lucene.queries.CustomScoreQuery, where you can put in several different criteria and weight them separately per document.

I’ve been poking around in a lot of places and would appreciate some help about where I should extend, or a pointer to an existing walkthrough or example. Here’s what I’ve been considering:

* org/apache/lucene/queryparser/flexible/standard/StandardQueryParser.java: modifying this to add another group-like QueryNode, modifying the processor pipeline to include it, and modifying the definition of a TERM so it can deal with attribute="value" pairs in pseudo-XML. I read through the QueryParser documentation but quickly got lost in the implementation.
* org/apache/lucene/queryparser/xml/CorePlusExtensionsParser.java: this seems to do a lot of what I want, but I can’t tell. I hadn’t originally thought of the query coming in as an XML stream. I think I would still need to define some new Query types... Perhaps a lot? One for each type of thing (“store”, in the above) I’d search for?

Thanks!
stephen
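
P.S. To make the first example concrete, here is a minimal sketch of splitting such a query into plain terms and pseudo-XML elements. It uses only plain Java regexes and no Lucene classes. The names (MixedQuerySketch, ParsedQuery, Element, parse) are made up for illustration, and it ignores the NEARBY(...) function syntax entirely. It is only meant to show the shape of the input, not how this should plug into the flexible query parser.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: splits a mixed query into plain terms and pseudo-XML elements. */
public class MixedQuerySketch {

    /** One pseudo-XML element, e.g. <store loc="abc" close_hour="2200">. */
    public static final class Element {
        public final String name;
        public final Map<String, String> attributes;

        Element(String name, Map<String, String> attributes) {
            this.name = name;
            this.attributes = attributes;
        }
    }

    /** Result of parsing: plain terms and elements, each in query order. */
    public static final class ParsedQuery {
        public final List<String> terms = new ArrayList<>();
        public final List<Element> elements = new ArrayList<>();
    }

    // Either a pseudo-XML tag (group 1 = name, group 2 = raw attribute text)
    // or a plain term (group 3). Malformed tags fall through as plain terms.
    private static final Pattern TOKEN = Pattern.compile(
            "<(\\w+)((?:\\s+\\w+=\"[^\"]*\")*)\\s*/?>|([^\\s<>]+)");

    // A single attribute="value" pair inside a tag.
    private static final Pattern ATTRIBUTE = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    public static ParsedQuery parse(String query) {
        ParsedQuery parsed = new ParsedQuery();
        Matcher token = TOKEN.matcher(query);
        while (token.find()) {
            if (token.group(1) != null) {
                // Tag: collect its attributes in the order they were written.
                Map<String, String> attributes = new LinkedHashMap<>();
                Matcher attribute = ATTRIBUTE.matcher(token.group(2));
                while (attribute.find()) {
                    attributes.put(attribute.group(1), attribute.group(2));
                }
                parsed.elements.add(new Element(token.group(1), attributes));
            } else {
                // Plain term.
                parsed.terms.add(token.group(3));
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        ParsedQuery q = parse("christmas tree <store loc=\"abc\" close_hour=\"2200\">");
        System.out.println("terms: " + q.terms);
        for (Element e : q.elements) {
            System.out.println("element: " + e.name + " " + e.attributes);
        }
    }
}
```

What I can’t work out is where the equivalent of this split belongs in Lucene: in the grammar and processor pipeline, or in an XML-based parser with custom Query types.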