On Wed, May 13, 2009 at 6:23 AM, Yonik Seeley
<yo...@lucidimagination.com> wrote:
> On Tue, May 12, 2009 at 7:19 PM, Geoffrey Young
> <ge...@modperlcookbook.org> wrote:
>> hi all :)
>>
>> I'm having trouble with camel-cased query strings and the dismax handler.
>>
>> a user query
>>
>>  LeAnn Rimes
>>
>> isn't matching the indexed term
>>
>>  Leann Rimes
>
> This is the camel-case case that can't currently be handled by a
> single WordDelimiterFilter.
>
> If the indexed doc had LeAnn, then it would be indexed as
> "le","ann"/"leann" and hence queries of both forms "le ann" and
> "leann" would match.
>
> However since the indexed term is simply "leann", a
> WordDelimiterFilter configured to split won't match (a search for
> "LeAnn" will be translated into a search for "le" "ann").

but the concatparts and/or concatall should handle splicing the tokens
back together, right?
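
To make the mismatch concrete, here is a toy Python model of the case
described in the quoted text. It is only a sketch and not Solr's
WordDelimiterFilter: the function `analyze` and its `generate_parts` and
`catenate` flags are made-up names that mimic splitting on a case change
and catenating the split parts, followed by lowercasing.

```python
import re

def analyze(text, generate_parts, catenate):
    """Toy analyzer (hypothetical, not Solr code): split words on a
    lower-to-upper case change, then emit the parts and/or their
    catenation, all lowercased."""
    tokens = []
    for word in text.split():
        parts = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", word).split()
        if len(parts) == 1:
            # Nothing to split, so the word passes through unchanged.
            tokens.append(word.lower())
            continue
        if generate_parts:
            tokens.extend(p.lower() for p in parts)
        if catenate:
            tokens.append("".join(parts).lower())
    return tokens

def all_terms_match(query_tokens, index_terms):
    """True if every query token exists as an indexed term."""
    return all(t in index_terms for t in query_tokens)

# The indexed text has no case change, so it stays a single term.
index_terms = set(analyze("Leann Rimes", generate_parts=True, catenate=True))

# Query analyzer that only splits: "LeAnn" becomes "le" and "ann".
split_query = analyze("LeAnn Rimes", generate_parts=True, catenate=False)

# Query analyzer that only catenates: "LeAnn" becomes "leann".
cat_query = analyze("LeAnn Rimes", generate_parts=False, catenate=True)

print(split_query, all_terms_match(split_query, index_terms))
print(cat_query, all_terms_match(cat_query, index_terms))
```

In this model the split-only query fails against the indexed "leann",
while the catenate-only query matches it. That is the idea behind the
second-field workaround quoted below.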

>
> One way to work around this now is to do a copyField into another
> field that catenates split terms in the query analyzer instead of
> generating/splitting, and then search across both fields.

yeah, unfortunately, that's not an option for me :)

>
> BTW, your parsed query below shows you turned on both catenation and
> generation (or perhaps preserveOriginal) for split subwords in your
> query analyzer.  Unfortunately this configuration doesn't work due to
> the ambiguity of what it means to have multiple terms at the same
> position (this is the same problem for multi-word synonyms at query
> time).  The query shown below looks for "leann" or "le" followed by
> "ann" and hence an indexed term of "leann" won't match.

ugh.  ok, thanks for letting me know.
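
For anyone following along, the position ambiguity can be sketched in a
few lines of Python. This is a simplified model under my own
assumptions, not Lucene's query code: the helper `phrase_matches` is
hypothetical, a query is a list of term alternatives per position, and
a document is a plain list with one term per position.

```python
def phrase_matches(query_positions, doc_tokens):
    """Hypothetical helper: True if the document contains consecutive
    tokens where each one is among the alternatives allowed at the
    corresponding query position."""
    n = len(query_positions)
    for start in range(len(doc_tokens) - n + 1):
        if all(doc_tokens[start + i] in alternatives
               for i, alternatives in enumerate(query_positions)):
            return True
    return False

# "LeAnn" analyzed with both generation and catenation: "le" and "leann"
# share the first position and "ann" takes the second, so the query
# reads as ("leann" or "le") followed by "ann".
query = [{"le", "leann"}, {"ann"}]

# A document indexed from "Leann Rimes" has no "ann" after "leann".
print(phrase_matches(query, ["leann", "rimes"]))

# A document whose terms are "le", "ann", "rimes" does satisfy it.
print(phrase_matches(query, ["le", "ann", "rimes"]))
```

The first check fails because the catenated alternative "leann" is
still required to be followed by "ann", which is the behaviour the
quoted explanation describes.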

following the solr wiki docs, I'm not using the same concat parameters
at index time as at query time.  but I've always wondered if that was a
good idea.  I'll see if matching them up helps at all.

thanks.  I'll let you know what I find.

--Geoff
