[ 
https://issues.apache.org/jira/browse/SOLR-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067437#comment-13067437
 ] 

Hoss Man commented on SOLR-2477:
--------------------------------

Having just looked at this code in SOLR-2663, I'm realizing that as we add more 
types of analyzers, we should really clean up the semantics of how analyzers 
w/o "type" attributes are treated, and how each of the analyzers defaults if 
it isn't specified.

Consider the following (contrived) example...

{code}
<fieldType name="hoss" class="solr.TextField" positionIncrementGap="100">
   <analyzer>
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
   </analyzer>
   <analyzer type="index">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
   </analyzer>
</fieldType>
{code}

Right now (on trunk and with this patch) that config will result in all of the 
analyzers (index/query[/phrase]) using KeywordTokenizerFactory because the 
type-less analyzer is ignored if there is an analyzer with type="index".  I 
don't think that makes much sense, and as we add more types of analyzers it 
makes even less sense -- an analyzer w/o a type attribute should really be the 
"default" for each other type.

I think we should change the overall flow to be (pseudo-code) ...

{code}

// exactly what is in the config
Analyzer defaultA = readAnalyzer(xpath("./analyzer[not(@type)]"));
Analyzer indexA = readAnalyzer(xpath("./analyzer[@type='index']"));
Analyzer queryA = readAnalyzer(xpath("./analyzer[@type='query']"));
Analyzer phraseA = readAnalyzer(xpath("./analyzer[@type='phrase']"));

if (null != defaultA) {
  // we have an explicit default
  if (null == indexA) indexA = defaultA;
  if (null == queryA) queryA = defaultA;
  if (null == phraseA) phraseA = defaultA;
} else {
  // implicit defaults, either historical or common sense
  if (null == queryA) queryA = indexA;
  if (null == phraseA) phraseA = queryA;
}
{code}
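
To make the proposed fallback rules concrete, here is a minimal, self-contained sketch of the same flow. It is only an illustration: AnalyzerDefaults, Resolved and resolve are hypothetical names that do not exist in Solr, and plain strings stand in for real Analyzer instances so the example runs without any Solr or Lucene classes.

```java
/** Hypothetical helper (not part of Solr) that applies the proposed fallback rules. */
final class AnalyzerDefaults {

  /** Holds the three resolved analyzers; T stands in for Lucene's Analyzer. */
  static final class Resolved<T> {
    final T index;
    final T query;
    final T phrase;

    Resolved(T index, T query, T phrase) {
      this.index = index;
      this.query = query;
      this.phrase = phrase;
    }
  }

  /** Each argument is null when the matching analyzer element is absent from the config. */
  static <T> Resolved<T> resolve(T defaultA, T indexA, T queryA, T phraseA) {
    if (defaultA != null) {
      // explicit default: the type-less analyzer fills in every missing type
      if (indexA == null) indexA = defaultA;
      if (queryA == null) queryA = defaultA;
      if (phraseA == null) phraseA = defaultA;
    } else {
      // implicit defaults: query falls back to index, phrase falls back to query
      if (queryA == null) queryA = indexA;
      if (phraseA == null) phraseA = queryA;
    }
    return new Resolved<>(indexA, queryA, phraseA);
  }

  public static void main(String[] args) {
    // The "hoss" fieldType above: a type-less analyzer plus a type="index" analyzer.
    Resolved<String> r = resolve("whitespace", "keyword", null, null);
    System.out.println(r.index + " " + r.query + " " + r.phrase);
    // prints "keyword whitespace whitespace"
  }
}
```

Under these rules the contrived "hoss" config would index with KeywordTokenizerFactory but use WhitespaceTokenizerFactory for query and phrase analysis, instead of KeywordTokenizerFactory everywhere.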

> add analyzer type="phrase"
> --------------------------
>
>                 Key: SOLR-2477
>                 URL: https://issues.apache.org/jira/browse/SOLR-2477
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Robert Muir
>             Fix For: 4.0
>
>         Attachments: SOLR-2477.patch
>
>
> This is just exposing LUCENE-2892, so you can easily configure things
> so that if users put things in double quotes they get a more precise search.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
