I’ve been trying to do semi-structured queries & query parsing.  In other 
words, you could have XML snippets mixed in with plain terms, e.g. a query like:

      christmas tree <store  loc="abc" close_hour="2200">

where you’re looking for a document with the terms “christmas” and “tree” but 
also some structured data about where (practically) you could buy the tree.   
Additionally, I’d like to be able to write functions relating multiple items, 
sort of like predicate logic or database-like queries:

      christmas tree NEARBY( <store  close_hour="2200">, <restaurant close_hour="2400"> )

which would only find you places to buy a christmas tree that had stores and 
restaurants in close proximity to each other.  Finally, we would eventually be 
interested in doing something similar to 
org.apache.lucene.queries.CustomScoreQuery, where you can put in several 
different criteria and weight them separately per document.

I’ve been poking around in a lot of places and would appreciate pointers on 
where I should extend, or to an existing walkthrough or example.  Here’s what 
I’ve been considering:

  *   org/apache/lucene/queryparser/flexible/standard/StandardQueryParser.java 
— modifying this to add another group-like QueryNode, modifying the processor 
pipeline to include this, modifying the definition of a TERM so it can deal 
with attribute="value" pairs in pseudo-XML.  I read through the QueryParser 
documentation but quickly got lost in the implementation.
  *   org/apache/lucene/queryparser/xml/CorePlusExtensionsParser.java — this 
seems to do a lot of what I want, but I can’t tell for sure.  I hadn’t 
originally thought of the query coming in as an xml stream.  I think I would 
still need to define some new Query types... Perhaps a lot?  One for each type 
of thing (“store”, in the above) I’d search for?
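
To make the input format concrete, here is a minimal sketch of the split that either approach would need to perform on the first kind of query above: plain terms on one side, pseudo-XML elements with attribute="value" pairs on the other. It uses only the JDK and no Lucene classes. The names MixedQuerySplitter, Element and Result are hypothetical, invented for illustration, and the regular expressions are an assumption about the syntax, not an existing grammar. It does not handle functions such as NEARBY( ... ).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: separates plain terms from pseudo-XML elements. */
final class MixedQuerySplitter {

    /** One pseudo-XML element, e.g. <store loc="abc">. */
    static final class Element {
        final String name;
        final Map<String, String> attributes;

        Element(String name, Map<String, String> attributes) {
            this.name = name;
            this.attributes = attributes;
        }
    }

    /** Output of split(): terms and elements, each in query order. */
    static final class Result {
        final List<String> terms = new ArrayList<>();
        final List<Element> elements = new ArrayList<>();
    }

    // Either <name attr="value" ...> (groups 1 and 2) or a run of
    // characters that are neither whitespace nor '<' (group 3).
    private static final Pattern TOKEN = Pattern.compile(
            "<(\\w+)((?:\\s+\\w+=\"[^\"]*\")*)\\s*/?>|([^\\s<]+)");

    // A single attr="value" pair inside an element.
    private static final Pattern ATTR = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    static Result split(String query) {
        Result result = new Result();
        Matcher token = TOKEN.matcher(query);
        while (token.find()) {
            if (token.group(1) != null) {
                // Element: collect its attributes in the order written.
                Map<String, String> attrs = new LinkedHashMap<>();
                Matcher attr = ATTR.matcher(token.group(2));
                while (attr.find()) {
                    attrs.put(attr.group(1), attr.group(2));
                }
                result.elements.add(new Element(token.group(1), attrs));
            } else {
                // Plain term, passed through unchanged.
                result.terms.add(token.group(3));
            }
        }
        return result;
    }
}
```

Calling MixedQuerySplitter.split on the first example query would give the terms "christmas" and "tree" plus one "store" element carrying the loc and close_hour attributes. The terms could then go to the normal parser while each element is mapped to whatever structured Query type ends up being appropriate.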

Thanks!

stephen
