Re: Odd wildcard behavior

cjkadakia Mon, 22 Feb 2010 17:09:18 -0800

If stemming is the underlying issue here, then are there any suggestions?
Would I have to remove the SnowballPorterFilterFactory from both the index
AND the query?
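
To make the stemming hypothesis concrete, here is a minimal sketch (Python,
illustration only, not Solr code). It rests on two assumptions that are not
confirmed in this thread: that the English Snowball stemmer reduces
"international" to "intern" at index time, and that wildcard terms are not
run through the query analyzer. The function name is made up for this
example.

```python
# Assumption: after lowercasing and stemming, the indexed term for
# "International" is "intern". This value is hard-coded, not computed.
INDEXED_TERM = "intern"


def prefix_query_matches(query: str, indexed_term: str = INDEXED_TERM) -> bool:
    """Return True if a trailing-wildcard query such as 'inte*' would match.

    The query prefix is compared to the indexed term as-is, mimicking a
    wildcard term that bypasses analysis (assumed behavior).
    """
    prefix = query.rstrip("*")
    return indexed_term.startswith(prefix)


if __name__ == "__main__":
    for q in ["inte*", "intern*", "interna*", "internat*", "international*"]:
        print(q, prefix_query_matches(q))
```

Under those assumptions, any prefix longer than "intern" cannot match the
indexed term, which would line up with the results in my original post.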


Just to clarify, the ability to search on "foos" and return "foo" (and
vice-versa) is quite important, but the wildcard problem is the more
pressing issue right now. What would you suggest to handle both
requirements?

As for the output from the debugQuery, here you go:

<lst name="debug">
  <str name="rawquerystring">(name:(international*))</str>
  <str name="querystring">(name:(international*))</str>
  <str name="parsedquery">name:international*</str>
  <str name="parsedquery_toString">name:international*</str>
  <lst name="explain"/>
  <str name="QParser">LuceneQParser</str>
  <lst name="timing">
    <double name="time">0.0</double>
    <lst name="prepare">
      <double name="time">0.0</double>
      <lst name="org.apache.solr.handler.component.QueryComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.FacetComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.HighlightComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.StatsComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.SpellCheckComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.DebugComponent">
        <double name="time">0.0</double>
      </lst>
    </lst>
    <lst name="process">
      <double name="time">0.0</double>
      <lst name="org.apache.solr.handler.component.QueryComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.FacetComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.HighlightComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.StatsComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.SpellCheckComponent">
        <double name="time">0.0</double>
      </lst>
      <lst name="org.apache.solr.handler.component.DebugComponent">
        <double name="time">0.0</double>
      </lst>
    </lst>
  </lst>
</lst>


cjkadakia wrote:
> 
> I'm getting very odd behavior from a wildcard search.
> 
> For example, when I'm searching for docs with a name containing the word
> "International" the following occur:
> 
> q=name:(inte*) -- found "International"
> q=name:(intern*) -- found "International"
> q=name:(interna*) -- did not find "International"
> q=name:(internat*) -- did not find "International"
> .. adding 1 character at a time did not find "International"
> q=name:(international*) -- did not find "International"
> 
> As indicated, the behavior is quite bizarre and is causing issues with our
> use and test cases. Is there something I can set for the fieldType of text
> in order to make these kinds of searches work? Also, any insight as to
> why this is not working would be a big help as well.
> 
> Pasted for reference:
>     <fieldType name="text" class="solr.TextField"
> positionIncrementGap="100">
>       <analyzer type="index">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <!-- in this example, we will only use synonyms at query time
>         <filter class="solr.SynonymFilterFactory"
> synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
>         -->
>         <!-- Case insensitive stop word removal.
>           add enablePositionIncrements=true in both the index and query
>           analyzers to leave a 'gap' for more accurate phrase queries.
>         -->
>         <filter class="solr.StopFilterFactory"
>                 ignoreCase="true"
>                 words="stopwords.txt"
>                 enablePositionIncrements="true"
>                 />
>         <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.SnowballPorterFilterFactory"
> language="English" protected="protwords.txt"/>
>       </analyzer>
>       <analyzer type="query">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="true"/>
>         <filter class="solr.StopFilterFactory"
>                 ignoreCase="true"
>                 words="stopwords.txt"
>                 enablePositionIncrements="true"
>                 />
>         <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.SnowballPorterFilterFactory"
> language="English" protected="protwords.txt"/>
>       </analyzer>
>     </fieldType>
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Odd-wildcard-behavior-tp27695404p27697228.html
Sent from the Solr - User mailing list archive at Nabble.com.
