Hi Tom,

On Oct 19, 2011, at 6:22 AM, Thomas Bennett wrote:
> Hi,
>
> I'm trying out wildcards when using query_tool to run some queries on a
> lucene catalog and it's throwing exceptions left right and center.

Brave man indeed! :-)

>
> Any help/pointers welcome.
>
> This query:
> $ ./query_tool --url http://localhost:9000 --sql -query "SELECT Filename FROM
> KatFile WHERE Observer='ja*per'"
>
> Returns:
> Oct 19, 2011 3:19:53 PM org.apache.oodt.cas.filemgr.catalog.LuceneCatalog
> paginateQuery
> WARNING: Query: [q=Observer:ja*per] for Product Type: [urn:kat:KatFile]
> returned no results
> java.lang.NullPointerException
> at
> org.apache.oodt.cas.filemgr.system.XmlRpcFileManager.complexQuery(XmlRpcFileManager.java:602)
> [..snip...]

I think what you're seeing here is that we never really evolved the
FreeTextQuery in the File Manager's field-specific clauses to deal with
'*'s. So, even though the query is parsed correctly, the File Manager
translates it into a TermQuery, in Lucene terminology, with a '*' in it.
Lucene treats that '*' as a literal character rather than a wildcard, so
nothing matches (hence the "returned no results" warning above).

Really, if the FM sees a star, we should translate the query on the FM
end into a WildcardQuery when we're using the LuceneCatalog. We should
probably file a bug on this and fix it at some point.

In the meantime, one option if you need more complex searches is to take
a look at the SolrIndexer that I just checked in:

https://issues.apache.org/jira/browse/OODT-326

Paul Ramirez wrote this tool, and you can basically use it to dump the FM
catalog into Solr directly and then query using Solr's syntax, which is a
bit more powerful than the FM's. The FM's query syntax is a trimmed-down
version, usually suitable for production rules, for dumping metadata, and
for staging files.

If dumping to Solr is a bit much at this point, I can take a look at the
query issue you're seeing (once you file it) and give a hand towards
interpreting the WildcardQuery clauses more correctly.

Thanks!

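For what it's worth, here is a rough sketch of the dispatch I have in mind. It is not the actual File Manager code: QueryKind, classify, toRegex and matches are hypothetical names, and a plain regex stands in for Lucene so the sketch runs without any dependencies. In a real fix the two branches would presumably build Lucene's TermQuery and WildcardQuery instead. I'm also assuming '?' should count as a single-character wildcard, and "jasper" is just a made-up stored value.

```java
import java.util.regex.Pattern;

/** Sketch only: stand-in logic, not the OODT File Manager or Lucene classes. */
public class WildcardClauseSketch {

    /** Which kind of query a field-specific clause should be translated into. */
    enum QueryKind { TERM, WILDCARD }

    /** A clause value containing '*' or '?' needs wildcard handling. */
    static QueryKind classify(String value) {
        boolean hasWildcard = value.indexOf('*') >= 0 || value.indexOf('?') >= 0;
        return hasWildcard ? QueryKind.WILDCARD : QueryKind.TERM;
    }

    /** Turn a wildcard pattern into a regex: '*' is any run of characters, '?' is exactly one. */
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*' || c == '?') {
                // Flush the literal text seen so far, quoted so it is matched verbatim.
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '*' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return Pattern.compile(regex.toString());
    }

    /** Evaluate one field clause against a stored metadata value. */
    static boolean matches(String clauseValue, String storedValue) {
        if (classify(clauseValue) == QueryKind.TERM) {
            // Exact-term branch: roughly what a TermQuery does.
            return clauseValue.equals(storedValue);
        }
        // Wildcard branch: roughly what a WildcardQuery does.
        return toRegex(clauseValue).matcher(storedValue).matches();
    }

    public static void main(String[] args) {
        System.out.println(classify("ja*per"));          // WILDCARD
        System.out.println(matches("ja*per", "jasper")); // true
        // Treating the same value as an exact term finds nothing:
        System.out.println("ja*per".equals("jasper"));   // false
    }
}
```

The last line of main is essentially the situation in your log: the value is compared as an exact term, so 'ja*per' never equals any stored Observer value.
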
Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
