RE: Bug in Lucene static initialization with multiple threads.

Van Den Berghe, Vincent Wed, 22 Mar 2017 07:50:12 -0700

"King of the one-liners". Hmmm... maybe I'll order a T-shirt with that 
description.


Regarding TestEarlyTerminationDifferentSorter, I think the error is in the 
test, not in the code being tested.
Namely, the following:

                AssertTopDocsEquals(collector1.GetTopDocs().ScoreDocs, 
collector2.GetTopDocs().ScoreDocs);

What happens is the following: At most numHits results are collected in the 
collector.  This is done with a priority queue of at most numHits size. That 
priority queued keeps track of the numHits (or less) "best matched" documents. 
If there are more than numHits total hits, and there's a better matching 
document than the lowest matching document in the priority queue, that lowest 
matching document in the queue is ousted and the better matching document is 
put in the queue instead (at its proper priority).
The fact that the queue is full is tracked with the TopFieldCollector.queueFull 
Boolean. Once it's true, the queue is full and the above process kicks in.
Now, the call to GetTopDocs() returns the at most numHits top documents. This 
is done by getting them from the queue in TopDocsCollector (TopFieldCollector's 
base class):

        protected virtual void PopulateResults(ScoreDoc[] results, int howMany)
        {
            for (int i = howMany - 1; i >= 0; i--)
            {
                results[i] = m_pq.Pop();
            }
        }

If the priority queue contains howMany or less documents, after that operation 
the queue is empty.
However, all the other state variables in the TopFieldCollector collector 
(totalHits, queueFull, numHits, docBase) are left as-is.  So, even though the 
queue is empty, queueFull is true from the previous run. The system will try to 
replace a nonexisting element, causing the null reference.
The moral of the story is "after calling collector.GetTopDocs, the collector 
object cannot be reused".

In fact, this is confirmed by the documentation: 
https://lucene.apache.org/core/4_2_0/core/org/apache/lucene/search/TopDocsCollector.html#topDocs()

"NOTE: you cannot call this method more than once for each search execution. If 
you need to call it more than once, passing each time a different start, you 
should call topDocs() and work with the returned TopDocs object, which will 
contain all the results this search execution collected."

What happens in the test is wrong: the collector should be re-created at every 
iterations.
Interestingly, if you look at the java source 
https://github.com/apache/lucene-solr/blob/master/lucene/core/src/test/org/apache/lucene/search/TestEarlyTerminatingSortingCollector.java
 you will see that the testEarlyTermination does exactly that, and that 
testEarlyTerminationDifferentSorter is quite different.

So I think this should be rewritten as:

        public void TestEarlyTerminationDifferentSorter()
        {
            // test that the collector works correctly when the index was 
sorted by a
            // different sorter than the one specified in the ctor.
            CreateRandomIndexes(5);

            IndexSearcher searcher = NewSearcher(reader);
            int iters = AtLeast(5);
            for (int i = 0; i < iters; ++i)
            {
                                int numHits = TestUtil.NextInt(Random(), 1, 
numDocs / 10);
                                Sort sort = new Sort(new SortField("ndv2", 
SortFieldType.INT64, false));
                                bool fillFields = Random().nextBoolean();
                                bool trackDocScores = Random().nextBoolean();
                                bool trackMaxScore = Random().nextBoolean();
                                bool inOrder = Random().nextBoolean();
                                TopFieldCollector collector1 = 
Search.TopFieldCollector.Create(sort, numHits, fillFields, trackDocScores, 
trackMaxScore, inOrder);
                                TopFieldCollector collector2 = 
Search.TopFieldCollector.Create(sort, numHits, fillFields, trackDocScores, 
trackMaxScore, inOrder);
                                TermQuery query = new TermQuery(new Term("s", 
RandomInts.RandomFrom(Random(), terms)));
                searcher.Search(query, collector1);
                Sort different = new Sort(new SortField("ndv2", 
SortFieldType.INT64));
                searcher.Search(query, new 
EarlyTerminatingSortingCollectorHelper(collector2, different, numHits));


                assertTrue(collector1.TotalHits >= collector2.TotalHits);
                AssertTopDocsEquals(collector1.GetTopDocs().ScoreDocs, 
collector2.GetTopDocs().ScoreDocs);
            }
        }

... and the same for the other test.

I'm getting something different from TestSearcherManager_Mem: 

Test Name:      TestSearcherManager_Mem
Test Outcome:   Failed
Result Message: 
System.Exception : MockDirectoryWrapper: cannot close: there are still open 
files: _1.cfs ,_0.cfs
  ----> System.Exception : unclosed IndexSlice: _5.cfs

... but perhaps Github is playing tricks on me again.

Vincent



-----Original Message-----
From: Shad Storhaug [mailto:[email protected]] 
Sent: Wednesday, March 22, 2017 2:00 PM
To: Van Den Berghe, Vincent <[email protected]>
Cc: [email protected]
Subject: RE: Bug in Lucene static initialization with multiple threads.

Vincent,

I am now going to have to call you "King of the one-liners", for such a quick 
fix to so many failing tests. I am not sure how I missed that one - I think I 
was staring at the screen for too long. I moved the return true statement 
inside the try block, but I suppose it doesn't really make any difference in 
this case.

So, here is what is left that is failing:

Lucene.Net.Tests.Util.Automaton.TestAgainstBrzozowski() (100% of the time)
Lucene.Net.Tests.Search.TestControlledRealTimeReopenThread.TestCRTReopen() 
(~100% of the time)
Lucene.Net.Tests.Misc.Index.Sorter.TestEarlyTermination.TestEarlyTerminationDifferentSorter()
 (~75% of the time - down from ~90% of the time after the recent fix)
Lucene.Net.Tests.Search.TestControlledRealTimeReopenThread.TestControlledRealTimeReopenThread_Mem()
 (~66% of the time - down from ~95% of the time after the recent fix)
Lucene.Net.Tests.Search.TestSearcherManager.TestSearcherManager_Mem() (~40% of 
the time - down from ~ 85% of the time after the recent fix)
Lucene.Net.Tests.Analysis.Common.Analysis.In.TestIndicNormalizer.TestBasic() 
(only seems to happen during a full run)

You read that correctly - there are now only 2 tests failing all of the time. 
There does seem to be light at the end of the tunnel!

I suspect whatever is causing the CRT tests to fail is also what is causing 
TestEarlyTerminationDifferentSorter() and TestSearcherManager_Mem() to fail. 
Most of these failures seem to be centered around 
IndexSearcher/Util.PriorityQueue<T> somehow.

I have already reviewed the ThreadedIndexingAndSearchingTestCase as well as the 
TestControlledRealTimeReopenThread class and fixed some minor differences from 
Lucene, but none of them affected the result. I also reviewed 
ControlledRealTimeReopenThread, ReferenceManager<T> and SearcherManager, fixed 
some bugs, but still no change.

As for TestAgainstBrzozowski(), here are my debug notes as well as a non-random 
test I made, and the result of the test in Java from my last attempt: 
https://github.com/apache/lucenenet/blob/api-work/src/Lucene.Net.Tests/Util/Automaton/TestMinimize.cs#L64-L274.
 There are several things that depend on Automaton to work, and since none of 
them have failing tests, I am not all that concerned about this "difference". 
But it would be nice to find an answer once all of the real problems are 
resolved.

Anyway, I am going to keep poking at it to see if I can spot the cause(s), and 
would appreciate any (more) help you can provide.


Thanks again,
Shad Storhaug (NightOwl888)



-----Original Message-----
From: Van Den Berghe, Vincent [mailto:[email protected]] 
Sent: Wednesday, March 22, 2017 3:46 PM
To: Shad Storhaug
Cc: [email protected]
Subject: RE: Bug in Lucene static initialization with multiple threads.

Hello again Shad,

I found another bug in ExecutionHelper<T>.MoveNext(): the method was always 
returning "false", even if an element is available.
The correction is to add "return true;" where indicated:


            public bool MoveNext()
            {
                if (numTasks > 0)
                {
                    try
                    {
                        var awaitable = service.Take();
                        awaitable.Wait();
                        current = awaitable.Result;

                    }
#if !NETSTANDARD
                    catch (System.Threading.ThreadInterruptedException e)
                    {
                        throw new 
System.Threading.ThreadInterruptedException(e.ToString(), e);
                    }
#endif
                    catch (Exception e)
                    {
                        // LUCENENET NOTE: We need to re-throw this as 
Exception to 
                        // ensure it is not caught in the wrong place
                        throw new Exception(e.ToString(), e);
                    }
                    finally
                    {
                        --numTasks;
                    }
                return true; // *********** here  *******************
                }

                return false;
            }

The TestDuringAddIndexes test passes now. No files, just the one line.

Vincent

-----Original Message-----
From: Shad Storhaug [mailto:[email protected]] 
Sent: Tuesday, March 21, 2017 7:41 PM
To: Van Den Berghe, Vincent <[email protected]>
Cc: [email protected]
Subject: RE: Bug in Lucene static initialization with multiple threads.

Thanks for this Vincent. It's a big help.

Unfortunately, these three lines didn't quite do the trick, but they did change 
the error message of TestDuringAddIndexes() to a null reference exception from 
the Util.PriorityQueue (which I have also been seeing randomly, most notably 
from 
Lucene.Net.Tests.Misc.Index.Sorter.TestEarlyTermination.TestEarlyTerminationDifferentSorter()).

Do note a much simpler test to work with is the 
Lucene.Net.Tests.Memory.Index.Memory.MemoryIndexTest.TestRandomQueries(), which 
(both before and after adding these 3 lines) is returning a 0 result when it 
should be 1 on the search for the term "Apache" (the first test) from 
IndexSearcher + TaskScheduler.

Since all of this is self-contained within the IndexSearcher, changing the 
design to be better is certainly not out of the question. I mean, there would 
be no public API change if we had to eliminate any or all of the private 
classes (including the ExecutorService<T>) in favor of a more .NET-friendly 
implementation. As long as the external behavior is the same, we don't much 
care how it is implemented internally. Would you be interested in providing a 
working solution?

The WeakIdentityTests are now passing. So, I will try swapping for the 
WeakDictionary to see if it eliminates your static initialization issue.

That said, there is one minor thing about the WeakIdentityMap that was 
overlooked. The enumerator was changed in Lucene 4.x to also "reap on read": 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.8.0/lucene/core/src/java/org/apache/lucene/util/WeakIdentityMap.java#L205,
 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.8.0/lucene/CHANGES.txt#L2489-L2492.
 This is one thing that made it tricky to work with because the original 
implementation only allows a single read() and then the object is no longer 
available. This would be extremely unorthodox for a property (i.e. Current). It 
is also supposed to exit out of the loop if the last entry is unavailable to 
the current thread in the case where 2 of them call MoveNext() and one beats 
the other to read Current and reap in a race condition (it does this by 
throwing an exception in Java). I will attempt to get it working this way, but 
at least now we can settle for just passing instead of ideal behavior if we 
can't get it to work exactly this way.


Thanks,
Shad Storhaug (NightOwl888)

-----Original Message-----
From: Van Den Berghe, Vincent [mailto:[email protected]] 
Sent: Tuesday, March 21, 2017 11:59 PM
To: Shad Storhaug
Cc: [email protected]
Subject: RE: Bug in Lucene static initialization with multiple threads.

Hello Shad,

Regarding TestDuringAddIndexes, you found 99.9% of the problem. Allow me to 
contribute to the remaining 0.1%.
There is indeed nothing wrong with the limited concurrencylevel task scheduler. 
The problem is the ExecutionHelper<T>'s method of obtaining the results.
I see this:

            public bool MoveNext()
            {
                if (numTasks > 0)
                {
                    try
                    {
                        current = service.Take().Result;
                    }
#if !NETSTANDARD
                    catch (System.Threading.ThreadInterruptedException)
                    {
                        throw;
                    }
#endif
                    catch (Exception e)
                    {
                        // LUCENENET NOTE: We need to re-throw this as 
Exception to 
                        // ensure it is not caught in the wrong place
                        throw new Exception(e.ToString(), e);
                    }
                    finally
                    {
                        --numTasks;
                    }
                }

                return false;
            }

The call to service.Take() returns an awaitable task, which is nothing but a 
promise that the result will be delivered at some future time.
Alas, the enumerator is strictly synchronous and expects the result to be 
available at each iteration. 
Accessing result without waiting for the task to end (as is done now) will 
yield incorrect result.
The shortest solution is to replace:

        current = service.Take().Result;

By:

        var awaitable = service.Take();
        awaitable.Wait();
        current = awaitable.Result;

The best solution would be to await al tasks in parallel (or at least as many 
as possible) and return their value as soon as they become available. This is 
EXTREMELY hard to do, since you will be essentially fighting a battle between 
your executor service and its execution helper, so I leave that as an exercise 
for the reader.

No files to send this time, it's just these 3 lines.

Vincent

-----Original Message-----
From: Shad Storhaug [mailto:[email protected]] 
Sent: Tuesday, March 21, 2017 3:42 PM
To: Van Den Berghe, Vincent <[email protected]>
Cc: [email protected]
Subject: RE: Bug in Lucene static initialization with multiple threads.

Hi Vincent,

Sure, you can email the files to me directly.

For a quick start on Git/GitHub, there is a fairly short book called Pragmatic 
Version Control Using Git 
(https://pragprog.com/book/tsgit/pragmatic-version-control-using-git) that gets 
you up and running quickly. I think you might be attempting to push to the main 
repo - and you won't have permission unless it is explicitly granted. What you 
need to do is to fork the repo to your own GitHub account, then you can 
read/write it as much as necessary. Once you get it to a point where you want 
to submit something, you can do a pull request (either through GitHub or just 
manually email a request) and someone else can then review and merge the 
changes.

Update

I found the source of the 
Lucene.Net.Tests.Index.TestIndexReaderWriter.TestDuringAddIndexes() problem - 
it always occurs when you call an overload of the IndexSearcher constructor 
that takes a TaskScheduler as a parameter and pass a non-null value. This is 
built into the test framework 
(https://github.com/apache/lucenenet/blob/api-work/src/Lucene.Net.TestFramework/Util/LuceneTestCase.cs#L1778)
 to happen rarely, which explains many of the random failures we are seeing. If 
you change the random code to never use a TaskScheduler, the test will always 
pass, change it to always use a TaskScheduler and it will always fail.

The implementation of TaskScheduler we are using for testing 
(https://github.com/apache/lucenenet/blob/api-work/src/Lucene.Net.Core/Support/LimitedConcurrencyLevelTaskScheduler.cs)
 was copied directly from MSDN, so I doubt that is the issue. In fact, there is 
a good chance that the issue is similar to the WeakIdentityMap issue in that 
there is an enumerator/call to enumerator that is not thread-safe see 
(https://github.com/apache/lucenenet/blob/api-work/src/Lucene.Net.Core/Search/IndexSearcher.cs#L474-L500)
 and 
(https://github.com/apache/lucenenet/blob/api-work/src/Lucene.Net.Core/Search/IndexSearcher.cs#L569-L590).

Anyway, I know of at least 3 tests that are failing as a result of this, so 
fixing it would be a big prize.


Thanks,
Shad Storhaug (NightOwl888)

-----Original Message-----
From: Van Den Berghe, Vincent [mailto:[email protected]] 
Sent: Tuesday, March 21, 2017 8:51 PM
To: Shad Storhaug
Cc: [email protected]
Subject: RE: Bug in Lucene static initialization with multiple threads.

Hello Shad,

I had a little time on my hands and looked into this WeakIdentityMap issue, 
more specifically TestConcurrentHashMap which fails for me as well (in 100% of 
the cases). Maybe I have something to contribute: I have 3 observations:

First, notice that in TestConcurrentHashMap , 8 threads are created and then 
all joined by doing the following:

            finally
            {
                foreach (var w in workers)
                {
                    w.Join(1000L);
                }
            }

This gives the first thread 1 second to end, the second one at most 2 seconds 
(1 second + whatever time the first thread needed to end) and so on.  Given the 
amount of work of each test thread, this is far too little time even on a fast 
machine. It takes 13 seconds for all threads to end here.
The corresponding java test has the following:

   while (!exec.awaitTermination(1000L, TimeUnit.MILLISECONDS));

... which in effect just loops until the execution of each thread is finished, 
in units of 1 second.
In TPL, threads would be tasks and we would just be able to call  Task.WaitAll. 
Since we're dealing with "real" threads here,  I would suggest just call 
w.Join() and be done with it.
This would align the test with the java behavior.

Second, there are various weaknesses in the WeakIdentityMap:
1) the implementation of the Keys enumerator 
(IteratorAnonymousInnerClassHelper) relies on the order of the elements in the 
keys collection (outerInstance.backingStore.Keys.ElementAt(position)). This is 
bad for two reasons:
- it is extremely slow (there is no indexed access on 
outerInstance.backingStore.Keys in any current implementation, so ElementAt 
needs to skip "position" elements to get at the correct one)
- it relies on the assumption that removing (or adding) a key in a dictionary 
doesn't change the previous relative key order, which is incorrect in any 
current .NET implementation I am aware of (dictionaries are hash tables with 
collision resolution through chaining, and reusing of slots through a free 
list: it's just asking for trouble).
It turns out that you can use the backing store enumerator to implement the 
keys enumerator directly. The main loop simply becomes:

                        public bool MoveNext()
                        {
                                while (enumerator.MoveNext())
                                {
                                        next = enumerator.Current.Key.Target;
                                        if (next != null)
                                        {
                                                // unfold "null" special value:
                                                if (next == NULL)
                                                        next = null;
                                                return true;
                                        }
                                }
                                return false;
                        }

This works in the non-concurrent case (because we don't touch the collection 
while the enumerator is running), and in the concurrent case as well (because 
the ConcurrentDictionary<K,V> enumerator works by design and handles concurrent 
modifications without a problem).

2) calling Reap() create objects on the heap, even when there are no elements 
to be removed. Sadly, not all of these allocation can be eliminated, but you 
can delay the creation of the keysToRemove list until it's really needed:

                        List<IdentityWeakReference> keysToRemove = null;
                        foreach (IdentityWeakReference zombie in 
backingStore.Keys)
                        {
                                if (!zombie.IsAlive)
                                {
                                        // create the list of keys to remove 
only if there are keys to remove.
                                        // this reduces heap pressure
                                        if (keysToRemove == null)
                                                keysToRemove = new 
List<IdentityWeakReference>();
                                        keysToRemove.Add(zombie);
                                }
                        }

                        if (keysToRemove != null)
                                foreach (var key in keysToRemove)
                                {
                                        backingStore.Remove(key);
                                }

Note that I don't iterate the Keys collection, but use the dictionary 
enumerator. Believe it or not, but this is slightly more efficient for reasons 
I won't explain here since this e-mail is already long enough.
It's sad but inevitable that a heap object is created for the dictionary 
enumerator, because we call it through an interface (IDictionary<K,V>): it we 
had the actual object, no enumerator object would be created on the heap.

3) Equality of weak identity references can be done using only one case (using 
"as" instead of "is"), which is more efficient.


Third, the test itself uses enumerators in a nonstandard manner. The two 
witness cases are:

        IEnumerator<string> iter = map.Keys.GetEnumerator();
        Assert.IsTrue(iter.MoveNext());
        Assert.IsNull(iter.Current);
        Assert.IsFalse(iter.MoveNext());
        Assert.IsFalse(iter.MoveNext());

And

            for (IEnumerator<string> iter = map.Keys.GetEnumerator(); 
iter.MoveNext();)
            {
                //Assert.IsTrue(iter.hasNext()); // try again, should return 
same result!
                string k = iter.Current;
        ...
            }

All the other instances are variants of these witnesses.
The correct way of using IEnumerator<T> is by calling IEnumerator<T>.Dispose() 
after you're finished with the instance. Note that in Lucene itself, foreach() 
is used which does it correctly (ByteBufferIndexInput.cs):

                    foreach (ByteBufferIndexInput clone in clones.Keys)
                    {
                        clone.UnsetBuffers();
                    }

All usages of enumerators in TestWeakIdentityMap.cs must be rewritten 
accordingly. For example:

        using (IEnumerator<string> iter = map.Keys.GetEnumerator())
        {
                Assert.IsTrue(iter.MoveNext());
                Assert.IsNull(iter.Current);
                Assert.IsFalse(iter.MoveNext());
                Assert.IsFalse(iter.MoveNext());
        }
And

            foreach (object k in map.Keys)
            {
        ...
            }

In case you are wondering why this is so important: you cannot guarantee that 
future implementations of an enumerator (especially one on a concurrent 
collection) doesn't have a cleanup to do to get rid of various synchronization 
objects. Right now this isn't the case, but you never know what the future will 
bring. And besides, nice guys Dispose() after their enumeration <g>.

The test passes now. Every time.

I've made the changes to api-work in my local repository, but when I tried to 
"push" or "sync" them, I get :

        Error encountered while pushing to the remote repository: Response 
status code does not indicate success: 403 (Forbidden).

I know next to nothing about GitHub. Can I e-mail the changed files to someone?

Vincent

-----Original Message-----
From: Shad Storhaug [mailto:[email protected]] 
Sent: Monday, March 20, 2017 9:19 PM
To: Van Den Berghe, Vincent <[email protected]>
Cc: [email protected]
Subject: RE: Bug in Lucene static initialization with multiple threads.

Hi Vincent,

Thanks for reporting this. In fact, thank you for all of your assistance 
tracking down bugs.

This issue boils down to being a failed attempt to replace Lucene's 
WeakIdentityMap with a new data structure called WeakDictionary. Since there 
are already tests to verify concurrency on WeakIdentityMap and it is used in a 
couple of other places in Lucene, it would be far better to get it working 
right than to try to fix this alternative version. I guess for the time being 
your workaround should suffice (though, a fix rather than a hack would be 
preferred).

I have spent quite a bit of time on this, but the best I have been able to do 
is to get the Lucene.Net.Tests.Util.TestWeakIdentityMap.TestConcurrentHashMap() 
test to pass about 50% of the time (and I can't seem to even get it back into 
that state). 

Here are a couple of attempts I have made:

https://github.com/NightOwl888/lucenenet/commits/api-work-weak-identity-map-1 - 
using a port of the original Java backing classes
https://github.com/NightOwl888/lucenenet/commits/api-work-weak-identity-map-2 - 
using the .NET WeakIdentity class

And here is the original Java version: 
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.8.0/lucene/core/src/java/org/apache/lucene/util/WeakIdentityMap.java
 

The complicated part is getting it to "reap" the elements in a thread-safe way 
so the counts are right on several concurrent enumerators. Any assistance you 
could provide to make WeakIdentityMap thread-safe would be much appreciated. Do 
note that the lead branch is now at 
https://github.com/apache/lucenenet/tree/api-work, so please do any work from 
that branch.

Also note there are also currently a few other concurrency tests that are 
failing:

Lucene.Net.Tests.Index.TestIndexReaderWriter.TestDuringAddIndexes()
Lucene.Net.Tests.Search.TestControlledRealTimeReopenThread.TestControlledRealTimeReopenThread_Mem()
Lucene.Net.Tests.Search.TestControlledRealTimeReopenThread.TestCRTReopen()

I am sure that getting to the bottom of these issues will probably fix most of 
the issues you are seeing. If you have any spare time, your help would be 
appreciated on these as well.

Thanks,
Shad Storhaug (NightOwl888)


-----Original Message-----
From: Van Den Berghe, Vincent [mailto:[email protected]] 
Sent: Tuesday, February 7, 2017 6:00 PM
To: [email protected]
Subject: Bug in Lucene static initialization with multiple threads.

Hello,

Every once in a while, I get an error when using Lucene in a multithreaded 
scenario (meaning: using a single IndexWriter in multiple threads, or using a 
distinct IndexWriter in each thread: it doesn't matter).
The exception chain thrown is:

Unhandled Exception: System.ArgumentException: Could not instantiate 
implementing class for Lucene.Net.Analysis.Tokenattributes.ICharTermAttribute
---> System.ArgumentException: Could not find implementing class for 
ICharTermAttribute
--->System.InvalidOperationException: Collection was modified; enumeration 
operation  may not execute.

I could not understand what was going on, especially because it only occurred 
"sometimes". It took me a while to figure out, but I think it's a bug.

Here's the stack trace of the exception when it occurs:

                [External Code]
>             
> Lucene.Net.dll!Lucene.Net.Support.HashMap<Lucene.Net.Support.WeakDictionary<System.Type,
>  System.WeakReference>.WeakKey<System.Type>, 
> System.WeakReference>.GetEnumerator() Line 229           C#
               [External Code]
               Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, 
System.WeakReference>.Clean() Line 59           C#
               Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, 
System.WeakReference>.CleanIfNeeded() Line 71         C#
               Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, 
System.WeakReference>.Add(System.Type key, System.WeakReference value) Line 134 
          C#
                
Lucene.Net.dll!Lucene.Net.Util.AttributeSource.AttributeFactory.DefaultAttributeFactory.GetClassForInterface<Lucene.Net.Analysis.Tokenattributes.ICharTermAttribute>()
 Line 90  C#
                
Lucene.Net.dll!Lucene.Net.Util.AttributeSource.AttributeFactory.DefaultAttributeFactory.CreateAttributeInstance<Lucene.Net.Analysis.Tokenattributes.ICharTermAttribute>()
 Line 70  C#
                
Lucene.Net.dll!Lucene.Net.Util.AttributeSource.AddAttribute<Lucene.Net.Analysis.Tokenattributes.ICharTermAttribute>()
 Line 350                C#
               
Lucene.Net.dll!Lucene.Net.Documents.Field.StringTokenStream.InitializeInstanceFields()
 Line 658         C#
               
Lucene.Net.dll!Lucene.Net.Documents.Field.StringTokenStream.StringTokenStream() 
Line 676                C#
               
Lucene.Net.dll!Lucene.Net.Documents.Field.GetTokenStream(Lucene.Net.Analysis.Analyzer
 analyzer) Line 629         C#
               
Lucene.Net.dll!Lucene.Net.Index.DocInverterPerField.ProcessFields(Lucene.Net.Index.IndexableField[]
 fields, int count) Line 105              C#
                
Lucene.Net.dll!Lucene.Net.Index.DocFieldProcessor.ProcessDocument(Lucene.Net.Index.FieldInfos.Builder
 fieldInfos) Line 279          C#
                
Lucene.Net.dll!Lucene.Net.Index.DocumentsWriterPerThread.UpdateDocument(System.Collections.Generic.IEnumerable<Lucene.Net.Index.IndexableField>
 doc, Lucene.Net.Analysis.Analyzer analyzer, Lucene.Net.Index.Term delTerm) 
Line 287                C#
                
Lucene.Net.dll!Lucene.Net.Index.DocumentsWriter.UpdateDocument(System.Collections.Generic.IEnumerable<Lucene.Net.Index.IndexableField>
 doc, Lucene.Net.Analysis.Analyzer analyzer, Lucene.Net.Index.Term delTerm) 
Line 574                C#
               
Lucene.Net.dll!Lucene.Net.Index.IndexWriter.UpdateDocument(Lucene.Net.Index.Term
 term, System.Collections.Generic.IEnumerable<Lucene.Net.Index.IndexableField> 
doc, Lucene.Net.Analysis.Analyzer analyzer) Line 1830          C#
                
Lucene.Net.dll!Lucene.Net.Index.IndexWriter.AddDocument(System.Collections.Generic.IEnumerable<Lucene.Net.Index.IndexableField>
 doc, Lucene.Net.Analysis.Analyzer analyzer) Line 1455   C#
                
Lucene.Net.dll!Lucene.Net.Index.IndexWriter.AddDocument(System.Collections.Generic.IEnumerable<Lucene.Net.Index.IndexableField>
 doc) Line 1436   C#

... and to wit, here are the threads just rushing in to do the same:

Not Flagged                        35428    17           Worker Thread <No 
Name>                
Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, 
System.WeakReference>.Clean                Normal
Not Flagged                        35444    11           Worker Thread <No 
Name>                
Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, 
System.WeakReference>.Clean                Normal
Not Flagged                        44124    12           Worker Thread <No 
Name>                
Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, 
System.WeakReference>.Clean                Normal
Not Flagged        >             44140    13           Worker Thread <No Name>  
              Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, 
System.WeakReference>.Clean                Normal
Not Flagged                        47700    14           Worker Thread <No 
Name>                
Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, 
System.WeakReference>.Clean                Normal
Not Flagged                        28168    15           Worker Thread <No 
Name>                
Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, 
System.WeakReference>.Clean                Normal
Not Flagged                        30988    16           Worker Thread <No 
Name>                
Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, 
System.WeakReference>.Clean                Normal
Not Flagged                        21828    6              Worker Thread <No 
Name>                
Lucene.Net.dll!Lucene.Net.Support.WeakDictionary<System.Type, 
System.WeakReference>.Clean                Normal

The reason why it only reproduces "sometimes" is because of this little nugget 
of code:

        private void CleanIfNeeded()
        {
            int currentColCount = GC.CollectionCount(0);
            if (currentColCount > _gcCollections)
            {
                Clean();
                _gcCollections = currentColCount;
            }
        }

If one thread does a Clean() operation in the middle of another Clean() 
operation on the same collection that replaces the object being enumerated on, 
you get the exception. Always.
To avoid the intermittence, create a bunch of threads like this and eliminate 
the test "if (currentColCount > _gcCollections)" so that the Clean() code is 
always executed. You'll get the exception every time.

I will not post the correction, but there's a simple workaround: just make sure 
the static initializers are performed in a single thread.
I.e. before creating your threads, do something like this:

new global::Lucene.Net.Documents.TextField("dummy", "dummyvalue", 
global::Lucene.Net.Documents.Field.Store.NO).GetTokenStream(new (some Analyzer 
object));

Replace "some Analyzer object" with an instance of an Analyzer object, it 
doesn't matter which one. It's meaningless, but it has the side effect of 
initializing the static fields without problems.


Vincent

RE: Bug in Lucene static initialization with multiple threads.

Reply via email to