Hi Tom,

On Oct 19, 2011, at 6:22 AM, Thomas Bennett wrote:

> Hi,
> 
> I'm trying out wildcards when using query_tool to run some queries on a 
> lucene catalog and its throwing exceptions left right and center.

Brave man indeed! :-)

> 
> Any help/pointers welcome.
> 
> This query:
> $ ./query_tool --url http://localhost:9000 --sql -query "SELECT Filename FROM 
> KatFile WHERE Observer='ja*per'"
> 
> Returns:
> Oct 19, 2011 3:19:53 PM org.apache.oodt.cas.filemgr.catalog.LuceneCatalog 
> paginateQuery
> WARNING: Query: [q=Observer:ja*per] for Product Type: [urn:kat:KatFile] 
> returned no results
> java.lang.NullPointerException
>       at 
> org.apache.oodt.cas.filemgr.system.XmlRpcFileManager.complexQuery(XmlRpcFileManager.java:602)
> [..snip...]

I think what you're seeing here is that we never really evolved the 
FreeTextQuery in the File Manager's field-specific clauses to be able to 
deal with '*'s. So, even though the query is parsed correctly, the File 
Manager translates it into a TermQuery (in Lucene terminology) with a '*' 
in it. A TermQuery matches its term literally rather than as a wildcard, 
which would explain the "returned no results" warning you get just before 
the NullPointerException. Really, if the FM sees a star, we should make 
sure we translate the query on the FM end into a WildcardQuery when we're 
using the LuceneCatalog. We should probably file a bug on this and fix it 
at some point.
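
To make the idea concrete, here is a rough sketch of the check I have in 
mind. It is not the File Manager's actual code: the class and method names 
are made up for illustration, and the Lucene calls appear only in comments 
so that the snippet compiles without Lucene on the classpath.

```java
// Hypothetical sketch: decide which Lucene query type a field clause needs.
final class WildcardClauseSketch {

    /** The kind of Lucene query a field-specific clause should become. */
    enum ClauseKind { TERM, WILDCARD }

    /**
     * Returns WILDCARD if the value contains '*' or '?', the two wildcard
     * characters understood by Lucene's WildcardQuery; otherwise TERM.
     */
    static ClauseKind classify(String value) {
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '*' || c == '?') {
                return ClauseKind.WILDCARD;
            }
        }
        return ClauseKind.TERM;
    }

    public static void main(String[] args) {
        // With Lucene available, the two cases would map to:
        //   WILDCARD -> new WildcardQuery(new Term("Observer", "ja*per"))
        //   TERM     -> new TermQuery(new Term("Observer", "jasper"))
        System.out.println(classify("ja*per"));
        System.out.println(classify("jasper"));
    }
}
```

The real fix would live wherever the LuceneCatalog turns a term criteria 
into a Lucene query; the sketch only shows the branching decision.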

In the meantime, one option if you need more complex searches is to take a 
look at the SolrIndexer that I just checked in:

https://issues.apache.org/jira/browse/OODT-326

Paul Ramirez wrote this tool, and you can basically use it to dump the FM 
catalog into Solr directly and then query it using Solr's syntax, which is 
a bit more powerful than the FM's. The FM's query syntax is a trimmed-down 
version, usually suitable for production rules, for dumping metadata, and 
for staging files.
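
Once the catalog is in Solr, the same wildcard search becomes an ordinary 
request against Solr's select handler. The snippet below is only a sketch 
of building that request URL: the Solr base URL (default port 8983) and 
the assumption that the indexed fields keep the names Observer and 
Filename are mine, so adjust them to your setup.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: build a Solr select URL for a wildcard field query.
final class SolrQueryUrlSketch {

    /**
     * Builds "<solrBase>/select?q=<field>:<pattern>&fl=<returnField>"
     * with the parameter values URL-encoded.
     */
    static String selectUrl(String solrBase, String field,
                            String pattern, String returnField) {
        String q = URLEncoder.encode(field + ":" + pattern, StandardCharsets.UTF_8);
        String fl = URLEncoder.encode(returnField, StandardCharsets.UTF_8);
        return solrBase + "/select?q=" + q + "&fl=" + fl;
    }

    public static void main(String[] args) {
        // Assumed base URL and field names; not taken from a real deployment.
        System.out.println(
            selectUrl("http://localhost:8983/solr", "Observer", "ja*per", "Filename"));
    }
}
```

You can paste the resulting URL into a browser or fetch it with any HTTP 
client to see the matching documents.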

If dumping to Solr is a bit much at this point, I can take a look at the 
query issue you're seeing (once you file it) and lend a hand with getting 
the FM to interpret wildcard clauses correctly.

Thanks!

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
