Why didn't I think of that? That's another alternative. Thank you for your suggestion. I appreciate it.

On 10/13/2016 5:41 AM, Georg Sorst wrote:
You could use a PatternReplaceCharFilter before your tokenizer to replace
the dot with a space character.
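
A minimal sketch of what that could look like, assuming the text_general field type from the original question below. The pattern and replacement values are illustrative and have not been tested against this Solr version. The charFilter element has to appear before the tokenizer inside the analyzer, and the same line would need to be added to the query analyzer so both sides tokenize the same way.

```xml
<analyzer type="index">
  <!-- Assumption: replace every literal dot with a space before tokenizing,
       so "Co.Ltd" reaches the tokenizer as "Co Ltd" -->
  <charFilter class="solr.PatternReplaceCharFilterFactory"
              pattern="\." replacement=" "/>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true"
          words="stopwords.txt"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
```

Note that a bare `\.` pattern also splits decimals and domain names (for example "3.5" becomes "3" and "5"), so a narrower pattern may be preferable if the field holds such values.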

Derek Poh <d...@globalsources.com> wrote on Wed, 12 Oct 2016 11:38:

It seems LetterTokenizerFactory splits on numbers as well and discards them. The
field has values with numbers in them, so it is not applicable.
Thank you.


On 10/12/2016 4:22 PM, Dheerendra Kulkarni wrote:
You can use LetterTokenizerFactory instead.

Regards,
Dheerendra Kulkarni

On Wed, Oct 12, 2016 at 6:24 AM, Derek Poh <d...@globalsources.com>
wrote:
Hi

How can I split words with a period in between into separate tokens?
E.g. "Co.Ltd" => "Co" "Ltd".

I am using StandardTokenizerFactory, and it does not split on periods
(dots): periods that are not followed by whitespace are kept as part of
the token, including in Internet domain names.

This is the field definition,

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Solr version is 10.4.10.

Derek

----------------------
CONFIDENTIALITY NOTICE
This e-mail (including any attachments) may contain confidential and/or
privileged information. If you are not the intended recipient or have
received this e-mail in error, please inform the sender immediately and
delete this e-mail (including any attachments) from your computer, and you
must not use, disclose to anyone else or copy this e-mail (including any
attachments), whether in whole or in part.
This e-mail and any reply to it may be monitored for security, legal,
regulatory compliance and/or other appropriate reasons.


