[GitHub] [lucenenet] NightOwl888 commented on issue #732: ICUTokenizer discrepancies

GitBox Wed, 02 Nov 2022 21:47:17 -0700


NightOwl888 commented on issue #732:
URL: https://github.com/apache/lucenenet/issues/732#issuecomment-1301635515


   Could you post the code for how you are constructing the analyzer, including
how you are setting up the `StopFilter`? Something in your token stream is
filtering out diacritics. We are most likely looking at some sort of gap
between how .NET and Java handle localization or normalization, but this
doesn't appear to be directly related to `ICUTokenizer` or `CharArraySet`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
