CJKTokenizer generates tokens with incorrect offsets
----------------------------------------------------
Key: LUCENE-2207
URL: https://issues.apache.org/jira/browse/LUCENE-2207
Project: Lucene - Java
Issue Type: Bug
Components: contrib/analyzers
Reporter: Koji Sekiguchi
If I index a Japanese *multi-valued* document with CJKTokenizer and highlight a
term with FastVectorHighlighter, the output snippets contain incorrectly
highlighted strings. I'll attach a program that reproduces the problem soon.
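
To make the symptom concrete, here is a minimal plain-Java sketch of the
property a highlighter relies on: each token's start and end offsets must
select exactly that token's text from the field content. It is not the
reproduction program mentioned above and does not use Lucene. The class
MultiValuedOffsets, its Token type, and its methods are hypothetical names
chosen for illustration, and the sketch assumes the values of a multi-valued
field are concatenated with no gap between them.

```java
import java.util.ArrayList;
import java.util.List;

public class MultiValuedOffsets {

    /** A token plus its character offsets into the concatenated field text. */
    static final class Token {
        final String text;
        final int start;
        final int end;

        Token(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /** Emits overlapping bigrams for one value, shifting offsets by base. */
    static List<Token> bigrams(String value, int base) {
        List<Token> out = new ArrayList<>();
        if (value.length() == 1) {
            // A lone character cannot form a bigram; emit it as a unigram.
            out.add(new Token(value, base, base + 1));
        }
        for (int i = 0; i + 2 <= value.length(); i++) {
            out.add(new Token(value.substring(i, i + 2), base + i, base + i + 2));
        }
        return out;
    }

    /** Tokenizes every value; each value starts where the previous one ended. */
    static List<Token> tokenize(String[] values) {
        List<Token> out = new ArrayList<>();
        int base = 0;
        for (String value : values) {
            out.addAll(bigrams(value, base));
            base += value.length(); // assumption: no offset gap between values
        }
        return out;
    }

    /** True if every token's offsets select exactly its text from the joined values. */
    static boolean offsetsConsistent(String[] values) {
        String joined = String.join("", values);
        for (Token t : tokenize(values)) {
            if (!joined.substring(t.start, t.end).equals(t.text)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] values = {"あいうえお", "かきくけこ"};
        for (Token t : tokenize(values)) {
            System.out.println(t.text + " [" + t.start + "," + t.end + ")");
        }
        System.out.println("offsets consistent: " + offsetsConsistent(values));
    }
}
```

If a tokenizer reports offsets for the second value that do not pass a check
like offsetsConsistent, any highlighter that cuts snippets by offset will mark
the wrong characters, which matches the behavior described in this report.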