Yes, I had done that. However, I'm beginning to see that what I am doing is called a "wildcard query", which goes through Lucene's query parser. Lucene's query parser doesn't support the regexp idea of character exclusion, i.e. I'm not trying to match "["; I'm trying to express "match as many characters as possible which are not underscores" with [^_]*
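
As a rough illustration of the intent only (plain Python regular expressions, not Lucene or Solr code; the pattern and the function name matches_vector are made up for this sketch), this is what [^_]* is meant to express for a single four-dimension vector:

```python
import re

# Hypothetical pattern: first dimension fixed to "Ox", the other three free.
# [^_]* consumes as many characters as possible that are not underscores,
# so each wildcard stays inside its own dimension.
PATTERN = re.compile(r"Ox_[^_]*_[^_]*_[^_]*")

def matches_vector(vector: str) -> bool:
    """Return True if the whole vector matches the pattern."""
    return PATTERN.fullmatch(vector) is not None

if __name__ == "__main__":
    print(matches_vector("Ox_B1_C1_D1"))
    print(matches_vector("A1_B1_C1_D1"))
```

Because the character class excludes underscores, a vector with the wrong number of dimensions cannot match.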

Perhaps I'm going about my whole problem in an ineffective way, but I'm not sure how I can sensibly describe what I'm doing without it becoming a long document.

The only other approach I can think of is to change what I'm indexing but I'm not sure how to achieve that.
I've tried explaining it once, and obviously failed, so I'll try again.

I'm given a string containing many vectors (where each dimension is separated by an underscore, and each vector is separated by a comma) e.g.

A1_B1_C1_D1,A2_B2_C2_D2,A3_B3_C3_D3

I want my facet query to tell me whether one of the vectors within that string matches the dimensions I'm interested in. Of the four dimensions in this example, I may choose to fix an arbitrary number of them with values and leave the rest as wildcards, e.g. I might look for a facet containing Ox_*_*_* so one of the vectors in the string must have its first dimension matching "Ox" and I don't care about the rest.
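
A minimal sketch of how such a pattern could be assembled, assuming four dimensions and zero-based positions (the helper build_wildcard is hypothetical and is not part of Solr or Lucene):

```python
from typing import Dict

def build_wildcard(fixed: Dict[int, str], dimensions: int = 4) -> str:
    """Build a per-vector wildcard pattern.

    `fixed` maps a zero-based dimension index to the value it must have;
    every dimension that is not listed becomes '*'.
    """
    return "_".join(fixed.get(i, "*") for i in range(dimensions))

if __name__ == "__main__":
    print(build_wildcard({0: "Ox"}))
    print(build_wildcard({1: "B2", 3: "D2"}))
```

Fixing only the first dimension to "Ox" gives the Ox_*_*_* pattern from the example.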

***Is there a way to break this string down on the commas so that I can apply a normal wildcard query and Solr applies it to each vector individually?*** That would solve all my problems,
e.g.
The string is internally represented in lucene/solr as
A1_B1_C1_D1
A2_B2_C2_D2
A3_B3_C3_D3

where it tries to match the wildcard query on each in turn?
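
The behaviour being asked for can be sketched outside Solr as follows. This is plain Python using the standard fnmatch module as a stand-in for a wildcard query, not Solr's actual matching, and any_vector_matches is a made-up name:

```python
from fnmatch import fnmatchcase

def any_vector_matches(field_value: str, wildcard: str) -> bool:
    """Split the field on commas and test the wildcard against each vector."""
    vectors = field_value.split(",")
    return any(fnmatchcase(vector, wildcard) for vector in vectors)

if __name__ == "__main__":
    value = "A1_B1_C1_D1,A2_B2_C2_D2,A3_B3_C3_D3"
    # Matched per vector: only the second vector starts with A2.
    print(any_vector_matches(value, "A2_*_*_*"))
    # A1 and D3 never occur in the same vector, so no single vector matches.
    print(any_vector_matches(value, "A1_*_*_D3"))
    # Against the whole unsplit string, '*' can run across the commas.
    print(fnmatchcase(value, "A1_*_*_D3"))
```

The last two calls show why the split matters: a plain '*' applied to the whole string can span several vectors, whereas matching each vector in turn cannot.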

Thanks for your help, I'm deeply confused about this at the moment...

Ben
