Hi Erik,
I'm not sure exactly how much context you need here, so I'll try to keep
it short and expand as needed.
The column I am faceting on contains a comma-delimited set of vectors.
Each vector is made up of {Make,Year,Model} e.g.
_ford_1996_focus,mercedes_1996_clk,ford_2000_focus
I have a custom request handler, where if I want to find all the cars
from 1996 I pass in a facet query for the Year (1996) which is
transformed into a wildcard facet query:
_*_1996_*
In other words, it'll match any record whose vector column contains,
somewhere in its string, a car from 1996.
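
To make the transformation concrete, here is a rough sketch in Python. It is
only an illustration: year_to_wildcard is a hypothetical name standing in for
what my request handler does, and fnmatch's glob matching is an assumed
stand-in for SOLR's wildcard matching, not the real thing.

```python
from fnmatch import fnmatchcase


def year_to_wildcard(year):
    """Hypothetical helper: turn a Year facet value into the wildcard query."""
    return f"_*_{year}_*"


record = "_ford_1996_focus,mercedes_1996_clk,ford_2000_focus"
pattern = year_to_wildcard(1996)

# Glob matching here only approximates how the wildcard query behaves.
print(pattern)
print(fnmatchcase(record, pattern))              # record contains a 1996 car
print(fnmatchcase("_ford_2000_focus", pattern))  # record has no 1996 car
```

Note that the match is on the whole column value, which is why the entire
string comes back rather than just the 1996 vectors.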
Why not put the Make, Year and Model in separate columns and do a facet
query over multiple columns? Because once we've selected 1996, we
should (in the above example) then be offering "ford and mercedes" as
further facet choices, and nothing more. If the parts were in their own
columns, there would be no way to tie the Makes and Models to specific
years, for example.
At any rate, the wildcard search returns the entire match
(_ford_1996_focus,mercedes_1996_clk,ford_2000_focus). I then have to do
another RegExp over it to extract only the two parts (the first ford and
mercedes) that were from 1996. This isn't using SOLR's cache very
effectively.
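
For reference, the second pass looks roughly like the sketch below. The
function name makes_for_year is made up for this mail, and it assumes that
makes and models never contain a comma or an underscore themselves.

```python
import re


def makes_for_year(vectors, year):
    """Pull out the makes whose vector carries the given year.

    Assumes each vector is make_year_model, that vectors are joined by
    commas, and that a vector may or may not start with an underscore.
    """
    # (?:^|,) pins the match to the start of a vector; [^,_]+ keeps a
    # match from running across a comma into the neighbouring vector.
    pattern = re.compile(rf"(?:^|,)_?([^,_]+)_{year}_[^,_]+")
    return pattern.findall(vectors)


column = "_ford_1996_focus,mercedes_1996_clk,ford_2000_focus"
print(makes_for_year(column, 1996))
```

The [^,_]+ character class is the kind of exclusion I was asking about in my
first question: without it, a greedy match can span several vectors.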
It would be excellent if SOLR could break up that comma separated list
into three different parts, and run the RegExp over each, returning
only those which match. Is that what you're implying with Analysis? If
that were the case, I'd not need to worry about character exclusion.
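
What I have in mind is something like the following, done per vector rather
than over the whole string. This is only a client-side sketch in Python to
show the behaviour I'm after; vectors_for_year is a hypothetical name, and I
don't know whether SOLR's analysis can be configured to do the equivalent.

```python
def vectors_for_year(column, year):
    """Split the column on commas and keep only vectors from the given year.

    Assumes every vector has exactly three underscore-separated parts
    (make, year, model), optionally preceded by a leading underscore.
    """
    matches = []
    for vector in column.split(","):
        make, vec_year, model = vector.lstrip("_").split("_")
        if vec_year == str(year):
            matches.append((make, vec_year, model))
    return matches


column = "_ford_1996_focus,mercedes_1996_clk,ford_2000_focus"
for make, vec_year, model in vectors_for_year(column, 1996):
    print(make, vec_year, model)
```

Because each vector is tested on its own, no pattern ever sees a comma, so
the character exclusion becomes unnecessary.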
Sorry if that's a bit fuzzy... it's hard trying to explain enough to be
useful, but not too much that it turns into an essay!!!
Thanks,
Ben
The solution I'm using is to form a vector
Erik Hatcher wrote:
Ben,
Could you post an example of the type of data you're dealing with and
how you want it handled? I suspect there is a way to accomplish what
you want using an analyzed field, or by preprocessing the data you're
indexing.
Erik
On Jun 29, 2009, at 9:29 AM, Ben wrote:
Hello,
I've been using SOLR for a while now, but am stuck for information on
two issues:
1) Is it possible to exclude characters in a SOLR facet wildcard query?
e.g.
[^,]* to match any character except an "," ?
2) Can one set up the facet wildcard query to return the exact
substrings it matched of the queried facet, rather than the whole string?
I hope somebody can help :)
Thanks,
Ben