[jira] [Commented] (SOLR-219) Determine if prefix, wildcard, fuzzy queries should be lowercased

Robert Muir (JIRA) Thu, 16 Jun 2011 13:34:01 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050708#comment-13050708
 ]


Robert Muir commented on SOLR-219:
----------------------------------

A lot of analysis components, such as stemmers, are not prepared to deal with 
wildcard characters in the term, and returning multiple tokens (because a 
tokenizer splits on a * or whatever) makes no sense either.

In my opinion, a good solution here is to let you specify in your schema: 
this is the analysis chain for multi-term queries (prefix, wildcard, fuzzy). 
It would be a separate chain rather than "query" or "index" (similar to 
SOLR-2477, where I propose allowing you to specify one for "phrase"). The 
query parser (QP) would use this chain for things like wildcards, and throw 
an exception if the analyzer returns more than one token from a wildcard term.
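
As a rough illustration of the rule above (plain Python, not Solr or Lucene 
code), the single-token check could look like the sketch below. The names 
analyze_multiterm, MultiTermAnalysisError and the two toy chains are 
hypothetical, and rejecting zero tokens as well is an assumption of the 
sketch, not something the proposal states.

```python
from typing import Callable, List

# Hypothetical stand-in for an analysis chain: text in, list of tokens out.
Analyzer = Callable[[str], List[str]]


class MultiTermAnalysisError(ValueError):
    """Raised when a chain does not yield exactly one token (hypothetical name)."""


def analyze_multiterm(fragment: str, analyzer: Analyzer) -> str:
    """Run a wildcard/prefix/fuzzy term through the dedicated chain."""
    tokens = analyzer(fragment)
    # The proposal only asks to reject more than one token; treating zero
    # tokens as an error too is an assumption of this sketch.
    if len(tokens) != 1:
        raise MultiTermAnalysisError(
            f"expected exactly one token from {fragment!r}, got {tokens!r}"
        )
    return tokens[0]


def keyword_lowercase(text: str) -> List[str]:
    # Keeps the whole input as one token and lowercases it.
    return [text.lower()]


def whitespace_lowercase(text: str) -> List[str]:
    # Splits on whitespace, so it can return several tokens.
    return text.lower().split()
```

With these toy chains, analyze_multiterm("Foo*", keyword_lowercase) returns 
"foo*", while analyze_multiterm("Foo Bar*", whitespace_lowercase) raises 
MultiTermAnalysisError because the chain produced two tokens.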

This way you can use KeywordTokenizer + lowercase/fold characters or whatever, 
but in general doing things like WordDelimiterFilter (WDF) or synonyms makes 
no sense here.  If you want to do things like stemming, that's fine, you can 
shoot yourself in the foot this way and we won't stop you.
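
To make the contrast concrete, here is a small Python sketch (an analogy, not 
the actual Lucene classes) of a keyword-style chain that lowercases and folds 
accents, next to a chain that splits on punctuation. The function names are 
hypothetical, and the accent folding via Unicode decomposition is only an 
approximation of what a real folding filter does.

```python
import re
import unicodedata
from typing import List


def keyword_fold_chain(text: str) -> List[str]:
    """Keyword-style chain: one token, lowercased, accents folded."""
    lowered = text.lower()
    # Decompose characters, then drop combining marks (e.g. the accent in "é").
    decomposed = unicodedata.normalize("NFKD", lowered)
    folded = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return [folded]


def splitting_chain(text: str) -> List[str]:
    """Chain that splits on anything other than ASCII letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())
```

Here keyword_fold_chain("Café*") gives ["cafe*"], so the wildcard survives in 
a single token, whereas splitting_chain("Wi-Fi*") gives ["wi", "fi"]: the 
wildcard is gone and there are two tokens, which is the situation the 
proposal would reject.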

But in no case should we try to magically apply the analysis chain; it is too 
ambiguous what would happen.


> Determine if prefix, wildcard, fuzzy queries should be lowercased
> -----------------------------------------------------------------
>
>                 Key: SOLR-219
>                 URL: https://issues.apache.org/jira/browse/SOLR-219
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Yonik Seeley
>            Priority: Minor
>             Fix For: 3.3
>
>         Attachments: lowercase_prefix.patch, wildcardlowercase.patch
>
>
> Solr should be able to "do the right thing" when doing prefix/wildcard/fuzzy 
> queries on fields with respect to lowercasing or not.

