I have updated Itamar's LuceneNetDemo to the new API 
(https://github.com/NightOwl888/LuceneNetDemo/tree/update-api-format), but 
there is an issue with its API usage I am not quite sure about.

In the original demo code, there is an HtmlStripAnalyzerWrapper class 
(https://github.com/synhershko/LuceneNetDemo/blob/master/LuceneNetDemo/Analyzers/HtmlStripAnalyzerWrapper.cs)
that returns the result of _wrappedAnalyzer.CreateComponents(). However, 
CreateComponents() is a protected method in Java, so it has been made 
protected in .NET as well. As a result, the call to 
_wrappedAnalyzer.CreateComponents() from the wrapper class won't compile.

Since the purpose of the HtmlStripAnalyzerWrapper class is to apply a filter to 
the passed-in analyzer, I tried another approach. The InitReader() method is 
apparently designed for this specific purpose. So, I tried subclassing the 
StandardAnalyzer so I could override the InitReader() method. But 
StandardAnalyzer is sealed (as it was in Java).
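
To make the idea concrete, here is a minimal sketch of the InitReader() approach using stand-in types. Everything named Sketch* is hypothetical and is not part of Lucene.NET; the whitespace split and the crude tag stripper only stand in for real tokenization and a real HTML-stripping CharFilter. It assumes the base analyzer is not sealed, which is exactly what does not hold for StandardAnalyzer.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical stand-ins for illustration only; these are NOT the Lucene.NET types.
public abstract class SketchAnalyzer
{
    // Hook: subclasses may wrap or replace the reader before tokenization.
    protected virtual TextReader InitReader(string fieldName, TextReader reader)
    {
        return reader;
    }

    // Stand-in for tokenization: splits the (possibly filtered) text on whitespace.
    public string[] Analyze(string fieldName, string text)
    {
        using (TextReader reader = InitReader(fieldName, new StringReader(text)))
        {
            return reader.ReadToEnd().Split(
                new[] { ' ', '\t', '\r', '\n' },
                StringSplitOptions.RemoveEmptyEntries);
        }
    }
}

// Deliberately not sealed, so the hook can be overridden below.
public class SketchStandardAnalyzer : SketchAnalyzer
{
}

public class SketchHtmlStripAnalyzer : SketchStandardAnalyzer
{
    // Applies the character-level filter before the base analysis runs.
    protected override TextReader InitReader(string fieldName, TextReader reader)
    {
        return new StringReader(StripTags(reader.ReadToEnd()));
    }

    // Crude tag stripper: replaces each <...> run with a single space.
    private static string StripTags(string input)
    {
        var sb = new StringBuilder();
        bool inTag = false;
        foreach (char c in input)
        {
            if (c == '<') { inTag = true; sb.Append(' '); }
            else if (c == '>') { inTag = false; }
            else if (!inTag) { sb.Append(c); }
        }
        return sb.ToString();
    }
}
```

With these stand-ins, new SketchHtmlStripAnalyzer().Analyze("body", "<p>Hello <b>world</b></p>") yields the two tokens Hello and world, while SketchStandardAnalyzer leaves the markup in its tokens. Marking SketchStandardAnalyzer as sealed makes the subclass fail to compile, which is the situation described above.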

Is the StandardAnalyzer (or any other analyzer that is marked sealed) not 
intended to be used in conjunction with a CharFilter? Or is there a loophole in 
Java that makes this somehow possible?

Of course, the workaround is to duplicate most of what StandardAnalyzer does 
(https://github.com/NightOwl888/LuceneNetDemo/blob/update-api-format/LuceneNetDemo/Analyzers/HtmlStripAnalyzer.cs),
 but it seems like there should be another option here. Is this what the Lucene 
designers intended?

Thanks,
Shad Storhaug (NightOwl888)
