Hello!

I suggest trying PatternTokenizer with a regex that matches "." and
whitespace, in both the query and index analyzers for that fieldType.
Values will then be split into tokens on that pattern, so a query for
either component should match. Unfortunately, you will have to reindex
everything if you change your schema.
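
As a rough sketch of what I mean, and not a tested configuration: the
fieldType name "text_dot_split" is made up for illustration, and the
pattern is only an assumption about which characters you want to split on.
With the default group setting, PatternTokenizerFactory treats the pattern
as a delimiter, so "KWJ1112.MC2850" should come out as the two tokens
"KWJ1112" and "MC2850" before lowercasing.

```xml
<!-- Hypothetical fieldType name; adjust it to your schema. -->
<fieldType name="text_dot_split" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Split on runs of "." and whitespace (assumed separator set). -->
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[.\s]+"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- Same tokenizer at query time, so both sides analyze identically. -->
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[.\s]+"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The stop word and synonym filters from your current fieldType are left out
here only to keep the sketch short; you can add them back after the
tokenizer. The Analysis page will show whether the tokens come out as
expected.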

Regards,

- Luis Cappa
On 21/11/2012 19:13, "Jack Krupansky" <j...@basetechnology.com> wrote:

> Try the Solr Admin Analysis page and see how your failing examples analyze
> for both index and query.
>
> Also, if you experiment with analyzer settings, be sure to FULLY reindex
> your documents since a mismatch between how the documents were ORIGINALLY
> analyzed and the latest query analysis can cause mismatches. Changing an
> index analyzer does not force an automatic reindex.
>
> Also, check to see that there is not a delimiter character, such as a
> colon, immediately before a term with no white space.
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Sohail Aboobaker
> Sent: Wednesday, November 21, 2012 8:13 AM
> To: solr-user@lucene.apache.org
> Subject: Inconsistent search results.
>
> Hi,
>
> We have 500k+ documents indexed with many fields. One of the fields is a
> simple text field that is defined as the default search field, and we copy
> many field values into that field.
>
> Some values are composed of two components with a "." as separator. When we
> search for the partial terms for such values, we get inconsistent results.
> Following are some examples:
>
> Value: KWJ1112.MC2850
>
> we search on MC2850, it returns result.
> we search on KWJ1112, no results.
>
> Value: ACW9920.KL1230
>
> we search on ACW9920, gives results.
> we search on KL1230, gives results.
>
> The results are inconsistent. Sometimes, it will give results on both sides
> of partial search. For others, it would give results on only the last part
> of word. The last part search always works.
>
> We are using standard tokenizer as follows:
>
> <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
>     <!-- in this example, we will only use synonyms at query time
>     <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
>     -->
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> What should we use in order to get consistent results for both components
> of a value? Should we be using the whitespace tokenizer with
> WordDelimiterFilterFactory? Some examples would be helpful.
>
> Thanks
>
> Sohail
>
