Github user NightOwl888 commented on the issue:
https://github.com/apache/lucenenet/pull/191
@conniey
I have nearly finished Highlighter on this branch:
https://github.com/NightOwl888/lucenenet/tree/netcoremigration-highlighter
There is still 1 failing test, which doesn't appear to be related to
BreakIterator. I was able to make [hacky solutions for the first 3
issues](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Highlighter/IcuBreakIterator.cs#L289-L306)
I mentioned
[above](https://github.com/apache/lucenenet/pull/191#issuecomment-266510336),
but after reading the link you provided I am convinced that none of them will
suffice for production. We will need a `RuleBasedBreakIterator` to be sure we
have all of the breaking rules set up correctly for international support.
You can pull this now if you wish - let me know if you need me to work on
that failing test.
icu-dotnet
----
Some classes that I have already ported that you may wish to migrate into
icu-dotnet:
- [BreakIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Core/Support/BreakIterator.cs)
  - Note that there is one change from the original: the Text property returns
  string instead of CharacterIterator. In hindsight, this change may not have
  been necessary.
- [CharacterIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Core/Support/CharacterIterator.cs)
  - A GetTextAsString() method was added, primarily so we can get the text to
  pass on to icu-dotnet. The documentation hasn't yet been migrated from Java.
- [StringCharacterIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Core/Support/StringCharacterIterator.cs)
  - Exactly like Java (but documentation not yet migrated).
- [IcuBreakIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Highlighter/IcuBreakIterator.cs)
  - If you remove the hacks I put in to emulate the RuleBasedBreakIterator, it
  would probably be a useful addition to icu-dotnet.
- [TestBreakIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Tests.Highlighter/TestBreakIterator.cs)
  - A partial set of tests that verify the RuleBasedBreakIterator sentence- and
  word-breaking rules.
Note that CharacterIterator is a dependency of
[CharArrayIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Analysis.Common/Analysis/Util/CharArrayIterator.cs),
but since that is in Analysis.Common (which already depends on icu-dotnet), we
can completely remove all of these classes from Lucene.Net once the
functionality is available in icu-dotnet.
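
For reference, here is a minimal sketch of the Java behavior that the ported BreakIterator, CharacterIterator and StringCharacterIterator classes are meant to mirror. It uses only the JDK's own `java.text` classes. The class name `SentenceBreakSketch` and the `sentences` helper are hypothetical names made up for illustration; they are not code from the branch or from icu-dotnet.

```java
import java.text.BreakIterator;
import java.text.CharacterIterator;
import java.text.StringCharacterIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class, not part of Lucene.Net or icu-dotnet.
public class SentenceBreakSketch {

    // Splits text into sentences using the JDK sentence BreakIterator.
    public static List<String> sentences(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.US);

        // In Java the iterator's text is a CharacterIterator, not a string.
        CharacterIterator chars = new StringCharacterIterator(text);
        it.setText(chars);

        List<String> out = new ArrayList<>();
        int start = it.first();
        // next() returns BreakIterator.DONE once the last boundary is passed.
        for (int end = it.next(); end != BreakIterator.DONE; end = it.next()) {
            out.add(text.substring(start, end));
            start = end;
        }
        return out;
    }

    public static void main(String[] args) {
        for (String s : sentences("This is one. This is two.")) {
            System.out.println("[" + s + "]");
        }
    }
}
```

Because the Java API hands the text over as a CharacterIterator, a port that exposes it as a string needs some conversion step at the boundary, which is presumably the role GetTextAsString() plays when passing text on to icu-dotnet.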