You can use LetterTokenizerFactory instead.

Regards,
Dheerendra Kulkarni
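For illustration, here is a rough sketch of how the field type from the quoted message might look with the tokenizer swapped. The name "text_letters" is a placeholder I made up, and the filter chain is copied from the original definition, so treat it as a starting point rather than a tested configuration.

```xml
<!-- Sketch only: "text_letters" is a hypothetical name, not part of the original schema. -->
<fieldType name="text_letters" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- LetterTokenizer emits maximal runs of letters, so "Co.Ltd" yields "Co" and "Ltd". -->
    <tokenizer class="solr.LetterTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LetterTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

One caveat: LetterTokenizer splits on every non-letter character, not just periods, so digits are discarded as well (for example, "abc123" is indexed as "abc"). If numbers matter in this field, that trade-off is worth checking in the Analysis screen first. Existing documents also need to be reindexed after the index-time analyzer changes.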
On Wed, Oct 12, 2016 at 6:24 AM, Derek Poh <d...@globalsources.com> wrote:

> Hi
>
> How can I split words with a period in between into separate tokens?
> E.g. "Co.Ltd" => "Co" "Ltd".
>
> I am using StandardTokenizerFactory and it does not split on them: periods
> (dots) that are not followed by whitespace are kept as part of the token,
> including Internet domain names.
>
> This is the field definition:
>
> <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> Solr version is 10.4.10.
>
> Derek