Hi Rick,

Neither tokenizer splits on the hyphen in an email address like this: solr-user@lucene.apache.org
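For reference, here is a minimal sketch of the two analyzer chains being compared, as they might appear in the schema. The fieldType names (text_email_classic and text_email_uax29) are made up for illustration; only the tokenizer and filter factory classes come from this thread.

```xml
<!-- Hypothetical field type names; adjust to your own schema. -->

<!-- Chain currently in use: ClassicTokenizerFactory + LowerCaseFilterFactory -->
<fieldType name="text_email_classic" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.ClassicTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Alternative under consideration: UAX29URLEmailTokenizerFactory + LowerCaseFilterFactory -->
<fieldType name="text_email_uax29" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Running the sample address through the Analysis screen in the Solr Admin UI against each field type is one way to compare the tokens they emit.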
The entire email address remains intact for both of the tokenizers.

Regards,
Edwin

On 24 November 2017 at 20:19, Rick Leir <rl...@leirtech.com> wrote:

> Edwin
> There is a spec for which characters are acceptable in an email name, and
> another spec for chars in a domain name. I suspect you will have more
> success with a tokenizer which is specialized for email, but I have not
> looked at UAX29URLEmailTokenizerFactory. Does ClassicTokenizerFactory split
> on hyphens?
> Cheers --Rick
>
> On November 24, 2017 3:46:46 AM EST, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
> >Hi,
> >
> >I am indexing email addresses into Solr via EML files. Currently, I am
> >using ClassicTokenizerFactory with LowerCaseFilterFactory. However, I also
> >found that we can also use UAX29URLEmailTokenizerFactory with
> >LowerCaseFilterFactory.
> >
> >Does anyone have any recommendation on which Tokenizer is better?
> >
> >I am currently using Solr 6.5.1, and planning to upgrade to Solr 7.1.0.
> >
> >Regards,
> >Edwin
>
> --
> Sorry for being brief. Alternate email is rickleir at yahoo dot com