John Griffin created COLLECTIONS-885:
----------------------------------------

             Summary: PatriciaTrie submap and range view operations silently 
mutate trie structure resulting in ConcurrentModificationException for any 
iterating threads
                 Key: COLLECTIONS-885
                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-885
             Project: Commons Collections
          Issue Type: Bug
          Components: Collection, Map
    Affects Versions: 4.5.0
            Reporter: John Griffin


*Problem:*

When iterating the PatriciaTrie (or a view of it), a 
ConcurrentModificationException will be thrown if a different thread creates 
another view of it (submap(), etc.) with a key which is not an exact match for 
an existing entry in the trie.

These methods

 
{code:java}
AbstractPatriciaTrie.ceilingEntry()
AbstractPatriciaTrie.floorEntry()
AbstractPatriciaTrie.higherEntry()
AbstractPatriciaTrie.lowerEntry(){code}
use a pattern where they:

 
 # insert a phantom node into the trie with the search key
 # increment the modCount
 # find the neighbor of that phantom node via nextEntry()/previousEntry()
 # remove the phantom node
 # increment the modCount again
 # then decrement modCount by 2 to hide the 2 modifications (add + remove)

The referenced modCount is used as a fail-fast mechanism in the iterators.

This behaviour is contrary to expectations as multiple threads which are 
performing read-only operations on a collection usually can share a collection 
without any CME expected. If it was a shared object which can be updated by 
another thread a user would ensure that there is some defensive external 
locking done to prevent any CMEs.

*Existing Code:*

 
{code:java}
TrieEntry<K, V> ceilingEntry(final K key) {
...
if (KeyAnalyzer.isValidBitIndex(bitIndex)) {
    final TrieEntry<K, V> added = new TrieEntry<>(key, null, bitIndex);
    addEntry(added, lengthInBits);
    incrementSize(); // must increment because remove will decrement
    final TrieEntry<K, V> ceil = nextEntry(added);
    removeEntry(added);
    modCount -= 2; // we didn't really modify it.
    return ceil;
} {code}
*Error log:*

 

 
{code:java}
java.util.ConcurrentModificationException
    at 
org.apache.commons.collections4.trie.AbstractPatriciaTrie$AbstractTrieIterator.nextEntry(AbstractPatriciaTrie.java:261)
    at 
org.apache.commons.collections4.trie.AbstractPatriciaTrie$TrieMapIterator.nextEntry(AbstractPatriciaTrie.java:1142)
    at 
org.apache.commons.collections4.trie.AbstractPatriciaTrie$TrieMapIterator.next(AbstractPatriciaTrie.java:1137)
    at java.util.Iterator.forEachRemaining(Iterator.java:116)
    at org.example.ScratchMain.lambda$main$1(ScratchMain.java:73)
    at java.lang.Thread.run(Thread.java:750){code}
 

*Proposed Fix:*
Replace the phantom add/remove with a read-only implementation that:
 # Finds the nearest entry via getNearestEntryForKey().
 # Determines whether the search key is less than or greater than the found 
entry using isBitSet() at the first differing bit.
 # Walks forward (nextEntry()) or backward (previousEntry()) to locate the 
correct ceiling/floor neighbor.

This prevents any structural changes in these methods and removes the 
*{{modCount-=2}}* hack entirely.

*N.B.* The existing *TODO: Cleanup* comments in the code acknowledged this 
change is needed.

 
{code:java}
TrieEntry<K, V> ceilingEntry(final K key) {
// Basically:
// Follow the steps of adding an entry, but instead...
//
// - If we ever encounter a situation where we found an equal
//   key, we return it immediately.
//
// - If we hit an empty root, return the first iterable item.
//
// - If we have to add a new item, we temporarily add it,
//   find the successor to it, then remove the added item.
//
// These steps ensure that the returned value is either the
// entry for the key itself, or the first entry directly after
// the key.

// TODO: Cleanup so that we don't actually have to add/remove from the
//       tree.  (We do it here because there are other well-defined
//       functions to perform the search.) 

...{code}
 

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to