Github user NightOwl888 commented on the issue:

    https://github.com/apache/lucenenet/pull/191
  
    FYI - There are 2 branches that still need to be integrated before this is 
up to date with master:
    
    - https://github.com/NightOwl888/lucenenet/tree/core-tests
    - https://github.com/NightOwl888/lucenenet/tree/analysis-stempel
    
    But neither of those affects the `Lucene.Net.Analysis.Common` or the 
`Lucene.Net.Expressions` tests, so you are doing the right thing by finding 
those problems first.
    
    The core-tests branch adds ~120 missing tests and has several dozen bug 
fixes (including fixing the 2 failing tests in `Lucene.Net.Queries` and 32 
failing tests in `Lucene.Net.QueryParser`). The tests in `Lucene.Net.Tests` are 
now 100% complete (including parts of tests that were previously commented out, 
but excluding tests that only apply to JRE/JUnit).
    
    Test Context
    ====
    
    The test context issues have been "fixed" by creating stub methods to put 
the `[Test]` attribute on in the subclasses where the tests are supposed to be 
run. You probably already know that part, but there have been some additions 
and the new commit where these can be reverted (if needed) is: 
4dbc3590361814d13fae64c8d030820eb4987489
    
    That is, if xUnit properly pulls the base class tests into the right 
context (if or when it is put in place), this commit can be reverted to 
rollback those test method stubs.
    
    LuceneNetSpecificAttribute
    ====
    
    I added an attribute to apply to tests that were added as part of the port 
so they can be filtered (and compared with) the tests from Java Lucene. The 
filter can be applied in Visual Studio as `trait:LUCENENET`.
    
    SuppressCodecsAttribute
    ====
    
    Fixed this attribute to accept an array of strings and added the correct 
suppressions to all of the tests in `Lucene.Net.Core`. We still need to fix the 
Test Framework to randomize codecs and utilize this attribute to filter them.
    
    ChangeNotes
    ====
    
    I noticed you have begun [listing some of the questions you have and 
unfinished 
tasks](https://github.com/conniey/lucenenet/blob/netcoremigration/src/Lucene.Net.Core/ChangeNotes.txt)
 about certain areas of the project and I know a few of the answers and have 
some comments as well.
    
    > TODO: Confirm HashMap emulates java properly
    
    Although these already seem to be doing the job (tests are passing), we 
could raise confidence about this by porting some of the tests over from the 
JDK. I have already begun doing this for certain pieces that were ported from 
or are meant to emulate parts of Java.
    
    > TODO: Tests need to be written for WeakDictionary
    
    Looks like we [have some tests 
already](https://github.com/apache/lucenenet/blob/master/src/Lucene.Net.Tests/core/Support/TestWeakDictionary.cs)
 from Lucene.Net 3.0.3 that exist in the repo, but haven't yet been added to 
the project. I am not sure how well they match up against the JDK tests.
    
    > TODO: Tests need to be written for IdentityDictionary -> Verify behavior
    
    We should probably move this into the Support namespace. AFAIK, the Support 
namespace is for pieces that need to be sourced from the JDK and/or close 
approximations to what is in the JDK that don't exist in the .NET framework. We 
can then port over the applicable tests for IdentityDictionary from the JDK to 
verify it works as expected.
    
    > ParallelReader - extra data types, using SortedDictionary in place of 
TreeMap.  Confirm compatibility.  Looks okay, .NET uses a r/b tree just like 
Java, and it seems to perform/behave just about the same.
    
    In many cases `SortedDictionary` will function as a suitable replacement 
for `TreeMap`. The primary difference is that `TreeMap` allows duplicate keys, 
where `SortedDictionary` does not.
    
    That said, it is unlikely any of the tests cover this scenario because they 
would have all been written under the assumption that duplicates are allowed 
(and therefore the functionality doesn't need to be tested per se, since it is 
already covered under JDK tests).
    
    In `Lucene.Net.Grouping` (WIP), `TreeMap` is required to make it work. 
There is a suitable TreeMap in the [C5 library](https://github.com/sestoft/C5) 
(see [The Working Programmer - .NET Collections: Getting Started with 
C5](https://msdn.microsoft.com/en-us/magazine/jj883961.aspx)), which seems to 
do the job. I have ported `TreeMap` over to our Support namespace along with 
`TreeDictionary` and several other dependencies (as well as the tests).
    
    Do note that C5 currently doesn't have .NET core support, but someone has 
recently made a pull request for it. So whether to integrate it into the 
Support namespace and get rid of the external dependency or reference it in 
specific places is a matter of preference.
    
    > FuzzyQuery - uses java.util.PriorityQueue<T>, which .net does not have.  
Using SortedList<TKey, TVal> in it's place, which works, but a) isn't a perfect 
replacement (a SortedList<TKey, TVal> doesn't allow duplicate keys, which is 
what is sorted, where a PriorityQueue does) and b) it's likely slower than a 
PriorityQueue<T>. I can't tell if the PriorityQueue that is defined in 
Lucene.Net.Util would work in its place.
    
    There are 2 different `PriorityQueue` types in Lucene.Net:
    
    - Lucene.Net.Util.PriorityQueue
    - Lucene.Net.Support.PriorityQueue
    
    The one in the `Util` namespace is an exact match for the one in Lucene. 
The PriorityQueue in the `Support` namespace is intended to be an approximate 
match for `java.util.PriorityQueue` (it could use some tests to verify that, 
though).
    
    FuzzyQuery doesn't have a PriorityQueue, so you are likely referring to the 
wrong type where this issue exists.
    
    > Dispose needs to be implemented properly around the entire library.  IMO, 
that means that Close should be Obsoleted and the code in Close() moved to 
Dispose().
    
    AFAIK, `close()` is being universally replaced with `Dispose()` rather than 
dealing with the (confusing) redundancy. There are cases where it is required, 
though - for example, `TextWriter`. IMO, removing (where applicable) makes more 
sense than obsoleting the `Close()` method.
    
    > TODO: NamedThreadFactory.java - Is this needed?  What is it for, just for 
debugging?
    
    AFAIK, no it is not needed. It is used as a parameter to generate a 
concrete `ExecutorService` in Java. 
    
    ```java
        ExecutorService service = new ThreadPoolExecutor(4, 4, 0L, 
TimeUnit.MILLISECONDS,
                                       new LinkedBlockingQueue<Runnable>(),
                                       new 
NamedThreadFactory("TestIndexSearcher"));
    ```
    
    It looks like someone has already worked out that `TaskScheduler` is the 
rough equivalent in .NET (not confirmed). In all places where this is utilized 
in the tests, `TaskScheduler.Default`, or the 
[`Lucene.Net.Support.LimitedConcurrencyLevelTaskScheduler`](https://github.com/apache/lucenenet/blob/master/src/Lucene.Net.Core/Support/LimitedConcurrencyLevelTaskScheduler.cs)
 (which I grabbed from MSDN) seems to work well enough to get the tests to pass.
    
    While we *could* port the `ThreadPoolExecutor` from the JDK and make it a 
concrete `TaskScheduler` so we would have a place to utilize the 
`NamedThreadFactory`, it feels like that is a step away from .NETification 
rather than just allowing people to use the `TaskScheduler` natively. IMO, it 
makes more sense just to comment it out or remove the file from the project 
(but not from the repo - it would be best to keep the file around with comments 
explaining why it wasn't utilized).
    
    > TODO: LockStressTest.java - Not yet ported.
    > TODO: MMapDirectory.java - Port Issues
    > TODO: NIOFSDirectory.java - Port Issues
    
    These are now ported and most of the bugs they had were addressed.
    
    Additional Items to Add to the List
    -----
    
    **`AlreadyClosedException`** - In .NET, we already have an 
`ObjectDisposedException` so this is reinventing the wheel. The `IndexReader` 
and `IndexWriter` classes depend heavily on the right exception type being 
caught in order to provide the correct response. This may be an issue if a 
nested .NET type throws an `ObjectDisposedException` rather than the expected 
`AlreadyClosedException`. While we could partially address this by inheriting 
`ObjectDisposedException` from `AlreadyClosedException`, and always catch 
`ObjectDisposedException`, it still provides a window where consumers could 
incorrectly catch `AlreadyClosedException` and later be bitten by an uncaught 
`ObjectDisposedException`. We should probably just remove this type and use the 
native `ObjectDisposedException`.
    
    **`ICharSequence`** - Many types have replaced the use of `ICharSequence` 
with `string`. Its too bad that .NET doesn't have an equivalent interface as 
`ICharSequence` that is shared between `string` and `StringBuilder` that could 
be utilized. But using only `string` has limited the API to a certain extent. A 
few types within Lucene.Net implement `ICharSequence` (`CharTermAttribute`, 
`CharBlockArray`, `OpenStringBuilder`, `StringCharSequenceWrapper`). It could 
save a step to have overloads for `string`, `ICharSequence`, and 
`StringBuilder` in all places that were `ICharSequence` in Lucene rather than 
having to call `ToString()` on the latter two.
    
    -----
    
    Is there anything I can do to help you get this up to date?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

Reply via email to