Re: indexing '-

Ken Stanley Sun, 31 Oct 2010 09:24:28 -0700

On Sun, Oct 31, 2010 at 12:12 PM, PeterKerk <vettepa...@hotmail.com> wrote:


>
> I have a city named 's-Hertogenbosch
>
> I want it to be indexed exactly like that, so "'s-Hertogenbosch" (without
> "")
>
> But now I get:
> <lst name="city">
>        <int name="hertogenbosch">1</int>
>        <int name="s">1</int>
>        <int name="shertogenbosch">1</int>
> </lst>
>
> What filter should I add/remove from my field definition?
>
> I already tried a new fieldtype with just this, but no luck:
>    <fieldType name="exacttext" class="solr.TextField"
> positionIncrementGap="100" >
>      <analyzer>
>        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="false"/>
>      </analyzer>
>    </fieldType>
>
>
> My schema.xml
>
>    <fieldType name="textTight" class="solr.TextField"
> positionIncrementGap="100" >
>      <analyzer>
>        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="false"/>
>        <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords_dutch.txt" />
>        <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="0" generateNumberParts="0" catenateWords="1"
> catenateNumbers="1" catenateAll="0"/>
>        <filter class="solr.ISOLatin1AccentFilterFactory"/>
>        <filter class="solr.LowerCaseFilterFactory"/>
>        <filter class="solr.SnowballPorterFilterFactory" language="Dutch"
> protected="protwords.txt"/>
>        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>      </analyzer>
>    </fieldType>
>
> <field name="city" type="textTight" indexed="true" stored="true"/>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/indexing-tp1816969p1816969.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

For exact text, you should try using either the string type (solr.StrField),
or a TextField type whose analyzer uses only the KeywordTokenizer. Both keep
the whole value as a single token. Analyzed types such as your textTight split
and transform the text, which is what you are seeing.
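
As a rough sketch of what that could look like in schema.xml (untested against
your setup): the field name city_exact and the copyField are only my
assumptions for illustration, so that the existing city field can stay
analyzed for searching. The string type below is written the way it usually
appears in the example schema, so you may already have it.

```xml
<!-- Inside <types>: StrField does no analysis, the value is indexed verbatim. -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>

<!-- Inside <types>: alternative that keeps TextField but emits the whole
     value as one token. Add a LowerCaseFilterFactory here only if you want
     case-insensitive matching; it would also lowercase the facet values. -->
<fieldType name="exacttext" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>

<!-- Inside <fields>: hypothetical extra field holding the untouched city name.
     Use type="exacttext" instead of "string" if you prefer the second option. -->
<field name="city_exact" type="string" indexed="true" stored="true"/>

<!-- After </fields>: fill city_exact from city at index time. -->
<copyField source="city" dest="city_exact"/>
```

You would then facet on city_exact rather than city. Either way, documents
have to be reindexed after the schema change, since the terms already in the
index were produced by the old analyzer.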

- Ken
