NightOwl888 commented on code in PR #1138:
URL: https://github.com/apache/lucenenet/pull/1138#discussion_r1988416139
##########
src/Lucene.Net/Store/MMapDirectory.cs:
##########
@@ -314,7 +314,7 @@ internal virtual ByteBuffer[] Map(MMapIndexInput input,
FileStream fc, long offs
input.memoryMappedFile = MemoryMappedFile.CreateFromFile(
fileStream: fc,
mapName: null,
- capacity: length,
+ capacity: fc.Length, // use the full length of the file
Review Comment:
I reviewed #1090 and there are a few things that could be contributing to
the failure. It is specifically testing the file locking functionality and in
this case there was a failure on Linux.
1. Technically, Linux supports `FileStream.Lock()` which could be changed
[here](https://github.com/apache/lucenenet/blob/54505e8964c112119b83adf8bf9163d687765cbe/src/Lucene.Net/Store/NativeFSLockFactory.cs#L100-L107),
however after reading about it on dotnet/runtime and how it is possible to
bypass the CLR to change it, we went with using the same approach we used on
macOS, which is to use `SharingNativeFSLock`. We probably could have used
`FileStream.Lock()` (with `NativeFSLock`) at the time, however, it seems that
they have changed the native call in .NET 6 so it no longer gets an exclusive
lock on Linux: https://github.com/dotnet/runtime/issues/64634.
2. On Windows, the HResult values of IOException are known. However, on all
other platforms, these values depend on the underlying OS. We determine what
the value is for the current OS by provoking the exception during static
initialization and recording the value. There is a non-zero chance that this
logic could fail to get a value and be set to `null`. When that happens, the
implementation uses the `FallbackNativeFSLock`, which just does a best effort,
but is known to have concurrency problems. Note there is some debug code to
determine the HResult value that was found
[here](https://github.com/apache/lucenenet/blob/54505e8964c112119b83adf8bf9163d687765cbe/src/Lucene.Net.TestFramework/Util/LuceneTestCase.cs#L3184-L3234).
3. We added a ["hole" in the locking
implementation](https://github.com/apache/lucenenet/blob/54505e8964c112119b83adf8bf9163d687765cbe/src/Lucene.Net/Store/NativeFSLockFactory.cs#L496-L498)
specifically so `RAMDirectory` can copy the lock file. This "hole" could cause
the `SharingNativeFSLock` to fail. Ideally, the `share` setting would always be
set to `Share.None` to ensure the lock functions properly in all cases.
However, we would need to find some alternative for `RAMDirectory` to be able
to copy the contents of the file while there is a lock held on it. This test
failure could have occurred because the lock failed to throw its exception due
to using `Share.Read` instead of `Share.None`. It might be worth a try to see
if `RAMDirectory` can function without the lock file or with a fake lock file
when copying the contents of the directory.
Detecting the HResult code by provoking the exception is not ideal. Also,
due to the fact that MS changed the underlying native method for
`FileStream.Lock()` in .NET 6, it doesn't sound like that is a viable
alternative any longer on Linux. It may be that dropping to native code is the
only way to properly solve it on Linux 100% of the time. Note that RavenDB
changed the implementation to do all index file access using native code
(including the memory mapped bit and file locking bit). I suspect that we could
do the file locking in isolation without dropping to native code on everything,
though.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]