Hi, I indexed the term 'ⒶeŘꝋꝒɫⱯŋɇ' (aeroplane), and it was indexed as "er l n"; some of the characters were dropped during indexing.
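One quick way to see which of these characters might be treated differently is to list each code point, independent of Lucene, and check whether Java itself classifies it as a letter. This is only a sketch: the class name `CodePointReport` and its methods are made up for illustration, the escapes are my reading of the characters in the term, and the tokenizer's own notion of a letter may differ from what `Character.isLetter` reports.

```java
public class CodePointReport {

    // The term from the post, written with Unicode escapes so the source
    // file does not depend on the compiler's default encoding.
    // Assumed code points: U+24B6 e U+0158 U+A74B U+A752 U+026B U+2C6F U+014B U+0247
    static final String TERM =
            "\u24B6e\u0158\uA74B\uA752\u026B\u2C6F\u014B\u0247";

    /** True if the running JVM's Unicode tables classify the code point as a letter. */
    static boolean isLetter(int codePoint) {
        return Character.isLetter(codePoint);
    }

    /** Builds one line per code point: hex value, Unicode name, and letter flag. */
    static String describe(String term) {
        StringBuilder sb = new StringBuilder();
        term.codePoints().forEach(cp ->
                sb.append(String.format("U+%04X  %-45s letter=%b%n",
                        cp, Character.getName(cp), isLetter(cp))));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(TERM));
    }
}
```

The circled A (U+24B6) is in the Unicode category "Other Symbol", so it is not reported as a letter. The remaining characters are letters according to Unicode, so this check alone does not explain why they would be dropped. It only narrows down where to look in the analysis chain.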
Here is my analyzer code:

    protected Analyzer.TokenStreamComponents createComponents(final String fieldName, final Reader reader)
    {
        final ClassicTokenizer src = new ClassicTokenizer(getVersion(), reader);
        src.setMaxTokenLength(ClassicAnalyzer.DEFAULT_MAX_TOKEN_LENGTH);

        TokenStream tok = new ClassicFilter(src);
        tok = new LowerCaseFilter(getVersion(), tok);
        tok = new StopFilter(getVersion(), tok, stopwords);
        tok = new ASCIIFoldingFilter(tok); // to enable AccentInsensitive search

        return new Analyzer.TokenStreamComponents(src, tok)
        {
            @Override
            protected void setReader(final Reader reader) throws IOException
            {
                src.setMaxTokenLength(ClassicAnalyzer.DEFAULT_MAX_TOKEN_LENGTH);
                super.setReader(reader);
            }
        };
    }

Am I missing anything? Is this the expected behavior for my input, or is there some reason behind this abnormal behavior?

--
Regards,
Chitra