Classic is ... "classic" ... it exists largely for historical purposes, to provide a tokenizer that does exactly what the javadocs say it does (regarding punctuation, "product numbers", and email addresses), so that people who depend on that behavior can continue to rely on it.
Standard is ... "standard" ... it implements the Unicode Standard text segmentation rules.

: Date: Fri, 20 Oct 2017 18:58:35 +0530
: From: Chitra <chithu.r...@gmail.com>
: Reply-To: java-user@lucene.apache.org
: To: Lucene Users <java-user@lucene.apache.org>
: Subject: Re: ClassicAnalyzer Behavior on accent character
:
: Hi,
: I found the difference and understand the behavior of both
: tokenizers appropriately.
:
: Could you please suggest me which one is the better to use
: ClassicTokenizer/StandardTokenizer?
:
: --
: Regards,
: Chitra

-Hoss
http://www.lucidworks.com/
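
As a rough illustration of what "Unicode text segmentation" means for word boundaries, here is a minimal sketch that uses only the JDK's java.text.BreakIterator. It is a stand-in, not Lucene code: the JDK's word rules are similar in spirit to the Unicode ones but are not guaranteed to match StandardTokenizer token for token, and the class name WordBreakSketch and the helper words() are made up for this example.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: splits text at the JDK's
// word boundaries and keeps only the pieces that contain a letter or digit.
final class WordBreakSketch {

    static List<String> words(String text) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        it.setText(text);

        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end);
            // Boundaries also delimit spaces and punctuation; skip those pieces.
            if (piece.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(piece);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("Hello, world"));
        System.out.println(words("caf\u00e9 ol\u00e9"));
    }
}
```

To see how ClassicTokenizer and StandardTokenizer actually differ on your own data (for example on accented characters), run the same input through both and compare the tokens they emit.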