You can use LetterTokenizerFactory instead.

Regards,
Dheerendra Kulkarni

On Wed, Oct 12, 2016 at 6:24 AM, Derek Poh <d...@globalsources.com> wrote:

> Hi
>
> How can I split words with period in between into separate tokens.
> Eg. "Co.Ltd" => "Co" "Ltd" .
>
> I am using StandardTokenizerFactory and it does notreplace periods (dots)
> that are not followed by whitespace are kept as part of the token,
> including Internet domain names.
>
> This is the field definition,
>
> <fieldType name="text_general" class="solr.TextField"
> positionIncrementGap="100">
>       <analyzer type="index">
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt" />
>         <filter class="solr.LowerCaseFilterFactory"/>
>       </analyzer>
>       <analyzer type="query">
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt" />
>         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="true"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>       </analyzer>
> </fieldType>
>
> Solr versionis 10.4.10.
>
> Derek
>
> ----------------------
> CONFIDENTIALITY NOTICE
> This e-mail (including any attachments) may contain confidential and/or
> privileged information. If you are not the intended recipient or have
> received this e-mail in error, please inform the sender immediately and
> delete this e-mail (including any attachments) from your computer, and you
> must not use, disclose to anyone else or copy this e-mail (including any
> attachments), whether in whole or in part.
> This e-mail and any reply to it may be monitored for security, legal,
> regulatory compliance and/or other appropriate reasons.




-- 
Regards,
Dheerendra

Reply via email to