Hi again,

I'm using Highlighter to highlight terms in Japanese text, but I can't get the output I expect.

If I use StandardAnalyzer or SnowballAnalyzer with English text, getBestFragment() returns the expected output:

Sample (SnowballAnalyzer):
  Text:        A meeting will be held in the City Hall
  TokenStream: [a][meet][will][be][held][in][the][citi][hall]
  Query Text:  meet
  Output:      A <B>meeting</B> will be held in the City Hall

But if I use JapaneseAnalyzer, which is the most popular Analyzer in Japan for getting a TokenStream from Japanese text, the Highlighter highlights the whole text:

Sample (JapaneseAnalyzer):
  Text:        AMeetingWillBeHeldInTheCityHall
  TokenStream: [A][Meeting][Will][Be][Held][In][The][City][Hall]
  Query Text:  Meeting
  Output:      <B>AMeetingWillBeHeldInTheCityHall</B>

Please note that I used the Latin alphabet for the Text in the second sample so that most people on this mailing list can read it; in reality, the Text was written in Japanese characters. As the sample shows, JapaneseAnalyzer, which uses a Japanese dictionary behind the scenes to extract tokens from the text stream, recognizes the tokens and produces the TokenStream. Even so, highlighter.getBestFragment() highlighted the whole text.

Do I need to implement a Fragmenter to highlight tokens correctly in Japanese text?

Thanks in advance,

Koji
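P.S. To make the expected and actual behaviour concrete, here is a minimal sketch in plain Java. It is not Lucene code: the names OffsetHighlightSketch, OffsetToken, highlight and tokenizeCamel are hypothetical and exist only for this illustration. It assumes only that highlighting is driven by each token's start and end character offsets. With per-token offsets, just the matching token is wrapped. If every token reported offsets spanning the whole text, the output would match the second sample. That is one possible explanation, not a verified diagnosis of JapaneseAnalyzer.

```java
import java.util.ArrayList;
import java.util.List;

final class OffsetHighlightSketch {

    /** Hypothetical token: a term plus its character offsets in the original text. */
    public static final class OffsetToken {
        final String term;
        final int start; // inclusive offset into the text
        final int end;   // exclusive offset into the text

        public OffsetToken(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    /** Wraps each token whose term equals queryTerm in <B>..</B>, using only its offsets. */
    public static String highlight(String text, List<OffsetToken> tokens, String queryTerm) {
        StringBuilder out = new StringBuilder();
        int last = 0; // end of the text already copied to the output
        for (OffsetToken t : tokens) {
            if (!t.term.equals(queryTerm) || t.start < last) {
                continue; // not a match, or overlaps text that was already emitted
            }
            out.append(text, last, t.start);
            out.append("<B>").append(text, t.start, t.end).append("</B>");
            last = t.end;
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    /** Stand-in tokenizer for the romanized sample: starts a new token at each capital letter. */
    public static List<OffsetToken> tokenizeCamel(String text) {
        List<OffsetToken> tokens = new ArrayList<>();
        int start = 0;
        for (int i = 1; i <= text.length(); i++) {
            if (i == text.length() || Character.isUpperCase(text.charAt(i))) {
                tokens.add(new OffsetToken(text.substring(start, i), start, i));
                start = i;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "AMeetingWillBeHeldInTheCityHall";

        // Per-token offsets: only the matching token is wrapped.
        System.out.println(highlight(text, tokenizeCamel(text), "Meeting"));

        // Same terms, but every token claims the whole text as its span.
        List<OffsetToken> wholeSpan = new ArrayList<>();
        for (OffsetToken t : tokenizeCamel(text)) {
            wholeSpan.add(new OffsetToken(t.term, 0, text.length()));
        }
        System.out.println(highlight(text, wholeSpan, "Meeting"));
    }
}
```

The first call prints the output I would like to get from the Highlighter; the second reproduces the whole-text highlighting I am actually seeing.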
