NightOwl888 commented on code in PR #801:
URL: https://github.com/apache/lucenenet/pull/801#discussion_r1160794004
##########
src/Lucene.Net.Analysis.Common/Tartarus/Snowball/SnowballProgram.cs:
##########
@@ -63,7 +63,9 @@ protected SnowballProgram()
/// <summary>
/// Set the current string.
/// </summary>
- public virtual void SetCurrent(string value)
+ // LUCENENET specific - S1699 - marked non-virtual because calling
+ // virtual members from the constructor is not a safe operation in .NET
+ public void SetCurrent(string value)
Review Comment:
This one has a C# implementation that we can get clues from:
https://github.com/snowballstem/snowball/blob/master/csharp/Snowball/Stemmer.cs#L144-L191.
Looks like they just made a private implementation that is called by both the
constructor and the virtual Current property. Let's just follow their lead.
Ditto for the other overload.
Of course, our code is a port from Lucene, which is a port of the Java
version of this, so there are several differences that have crept in.
##########
src/Lucene.Net.Analysis.Common/Analysis/Util/OpenStringBuilder.cs:
##########
@@ -56,7 +56,9 @@ public virtual int Length
set => m_len = value;
}
- public virtual void Set(char[] arr, int end)
+ // LUCENENET specific - S1699 - marked non-virtual because calling
+ // virtual members from the constructor is not a safe operation in .NET
+ public void Set(char[] arr, int end)
Review Comment:
I doubt anyone will need to extend this class, but see my comment on the
SnowballProgram for an idea on how to deal with this - I think having a shared
private implementation here is fine. I recently modified it so we can pass
strings into the J2N parsers using bounds instead of allocating substrings.
Side note: Perhaps we could find a way to replace usages of this with
[ValueStringBuilder](https://github.com/dotnet/runtime/blob/v7.0.4/src/libraries/Common/src/System/Text/ValueStringBuilder.cs),
which is a ref struct and lives on the stack. The initial allocation can even
be on the stack. It would need its own overloads to be passed around, though -
ref structs cannot have interfaces.
##########
src/Lucene.Net.Facet/Taxonomy/Directory/DirectoryTaxonomyReader.cs:
##########
@@ -241,15 +241,19 @@ protected override TaxonomyReader DoOpenIfChanged()
/// <summary>
/// Open the <see cref="DirectoryReader"/> from this <see
cref="Directory"/>.
/// </summary>
- protected virtual DirectoryReader OpenIndexReader(Directory directory)
+ // LUCENENET specific - S1699 - marked non-virtual because calling
+ // virtual members from the constructor is not a safe operation in .NET
+ protected DirectoryReader OpenIndexReader(Directory directory)
Review Comment:
This method is complementary to the
[DirectoryTaxonomyWriter.OpenIndexWriter]() method, so we should aim for a
common solution for them both. As you can see, that one has existing subclasses
and those are broken when trying to pass any state in through the constructor.
These were definitely meant for extension by end users.
These virtual members exist so in Java they can quickly declare an anonymous
class and override this implementation to return something else. This is the
reason why factory method is preferred over abstract factory in Lucene.
I have tried fixing `DirectoryTaxonomyWriter` with abstract factory and
there were issues trying to get the state of the subclass separated from the
factory. But there are a couple of things that might work I didn't try:
1. Moving all of the shared state from the subclass and into the factory and
sharing the state as properties of that class.
2. Using `LazyInitializer` to wait until the first use to open the reader
and some of the other state that isn't available in the constructor.
That being said, I am hesitant to try fixing this class until the
concurrency issues have been tracked down. There are 2 tests that have locking
contention issues that don't exist in Java Lucene:
1. `TestTaxonomyCombined.TestTaxonomyReaderRefreshRaces()`
2. `TestDirectoryTaxonomyWriter.TestConcurrency()`
You are welcome to take a stab at them if you like.
@eladmarg gave me a tip that it could be the `LurchTable` that we are using
as an LRU cache. There is an experimental implementation that doesn't use
locking in [LRUCache](https://github.com/JKirk865/LRUCache) I would like to
try. We unfortunately put too much stock in trying to make it conform to the
`IDictionary<TKey, TValue>` interface, which basically makes using any existing
c# implementation impossible. So, I think we should abandon that idea and
create our own `ILruCache<TKey, TValue>` interface with a minimal set of
members that Lucene.Net.Facet will need.
We ought to also make it easy to plug in a custom LRU cache implementation,
since Redis has a [distributed LRU
implementation](http://cndoc.github.io/redis-doc-cn/cn/topics/lru-cache.html)
that could get a lot of milage here.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]