Guys, I think we can definitely replace the current implementation of CLHM with a more stable one, but as a temporary solution I see nothing wrong with throwing an UnsupportedOperationException from the clear() method. Ilya has already provided a patch which replaces all clear() calls with the creation of a new map; semantically this has the same behavior as a correct clear() method.
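To make the proposal concrete, the combination of the two changes amounts to roughly the following (an illustrative sketch, not Ilya's actual patch; the newBoundedMap() factory and the generic parameters are placeholders):

    // In ConcurrentLinkedHashMap itself: fail fast until size accounting is fixed.
    public void clear() {
        throw new UnsupportedOperationException(
            "clear() leaves the size counter stale; recreate the map instead.");
    }

    // At the call sites (e.g. in IgniteH2Indexing): swap in a fresh map
    // instead of clearing the old one.
    private volatile ConcurrentLinkedHashMap<K, V> cache = newBoundedMap();

    void reset() {
        // Semantically equivalent to a correct clear(): readers racing with
        // reset() simply see either the old map or the new empty one.
        cache = newBoundedMap();
    }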
I suggest merging the provided PR, since IgniteH2Indexing is broken because of this bug, and then creating a ticket to replace/fix/do whatever feels right with the current CLHM. Thoughts?

Fri, Jul 27, 2018 at 16:43, Anton Vinogradov <[email protected]>:

> Is it possible to replace the current implementation with Google's [1] or
> some other?
> It's a bad idea to have our own CLHM and keep it outdated (this version is
> based on CHM from Java 7!) and broken.
>
> [1]
> https://mvnrepository.com/artifact/com.googlecode.concurrentlinkedhashmap/concurrentlinkedhashmap-lru
>
> Fri, Jul 27, 2018 at 16:40, Ilya Kasnacheev <[email protected]>:
>
> > Hello!
> >
> > Most of our code uses CLHM as a container with a fixed size. I can
> > surely fix LogThrottle, but my main concern is the H2 indexing code,
> > which uses the same CLHM with a cap.
> >
> > Regards,
> >
> > --
> > Ilya Kasnacheev
> >
> > 2018-07-27 16:38 GMT+03:00 Maxim Muzafarov <[email protected]>:
> >
> > > Ilya,
> > >
> > > As for me, the main cause of this problem is not CLHM itself but that
> > > we are using it in GridLogThrottle as a container with a fixed size.
> > > Suppose that at the moment we have no alternative; we should still
> > > start thinking about further steps.
> > >
> > > Can you create a clear reproducer for the CLHM issue described above?
> > >
> > > As a workaround for GridLogThrottle we can change clear() like this:
> > >
> > > public static void clear() {
> > >     errors.forEach((k, v) -> errors.remove(k));
> > > }
> > >
> > > Will that help you fix the test?
> > >
> > > Thoughts?
> > >
> > > On Wed, 25 Jul 2018 at 19:57 Ilya Kasnacheev <[email protected]> wrote:
> > >
> > > > Hello!
> > > >
> > > > LT stops throttling input as soon as it is unable to add any entries
> > > > to the underlying map, since it considers itself full with 0 entries.
> > > >
> > > > I have tried to implement both suggestions, and they both have a big
> > > > impact on the CLHM code. I am super uncomfortable making non-trivial
> > > > edits to it.
> > > >
> > > > If the first approach is chosen, there is a need to iterate over the
> > > > values instead of clearing the table, and if the second approach is
> > > > chosen, locking becomes non-trivial, since we would be grabbing
> > > > segment locks outside of the segment code.
> > > >
> > > > Changing LongAdder to AtomicLong also has potential implications
> > > > which are not readily understood.
> > > >
> > > > Note that entryQ is not cleared in clear() either, which can cause
> > > > further problems. My suggestion is still to either disallow clear()
> > > > or throw this class away in its entirety.
> > > >
> > > > Regards,
> > > >
> > > > --
> > > > Ilya Kasnacheev
> > > >
> > > > 2018-07-25 12:00 GMT+03:00 Maxim Muzafarov <[email protected]>:
> > > >
> > > > > Hello Ilya,
> > > > >
> > > > > Can you add more info about why and how LT prints the log message
> > > > > twice for this test case?
> > > > >
> > > > > From my point of view, making the clear() method throw
> > > > > UnsupportedOperationException is not the right fix for the flaky
> > > > > test issue. A brief search through CLHM led me to the thought that
> > > > > we simply forgot to decrement the LongAdder size counter while
> > > > > iterating over the HashEntry array. We increment and decrement
> > > > > this counter on put/remove operations, so why not in the clear()
> > > > > method? Am I right? So, replacing LongAdder with AtomicLong sounds
> > > > > good to me too, as suggested by Ilya Lantukh. But I may be
> > > > > mistaken here.
> > > > >
> > > > > In general, I think this is a good occasion to start thinking
> > > > > about how to get rid of CLHM. E.g. we could consider this
> > > > > implementation [1].
> > > > >
> > > > > [1] https://github.com/ben-manes/concurrentlinkedhashmap
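(To make the counter fix Maxim describes here concrete: together with the switch from LongAdder to AtomicLong that Ilya Lantukh suggests further down, it would look roughly like this. An untested sketch; the Segment internals are simplified from the Java 7 CHM layout and may not match our jsr166 copy exactly.)

    // Inside ConcurrentLinkedHashMap.Segment, which extends ReentrantLock:
    int clear() {
        lock();
        try {
            int removed = count; // entries currently held by this segment
            // ... null out the HashEntry table and unlink the queue nodes ...
            count = 0;
            return removed;
        }
        finally {
            unlock();
        }
    }

    // In the outer map, with the size field changed from LongAdder to AtomicLong:
    public void clear() {
        for (Segment<K, V> s : segments)
            size.addAndGet(-s.clear()); // keep the global size counter in sync
    }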
> > > > > Tue, Jul 24, 2018 at 16:45, Stanislav Lukyanov <[email protected]>:
> > > > >
> > > > > > It seems that we currently use the CLHM primarily as a FIFO
> > > > > > cache. I see two ways around that.
> > > > > >
> > > > > > First, we could implement such an LRU/FIFO cache based on
> > > > > > another, properly supported data structure from j.u.c.
> > > > > > ConcurrentSkipListMap seems OK. I have a draft implementation of
> > > > > > LruEvictionPolicy based on it that passes functional tests, but
> > > > > > I haven't benchmarked it yet.
> > > > > >
> > > > > > Second, Guava has a Cache API with a lot of configuration
> > > > > > options that we could use (the license is Apache, so it should
> > > > > > be OK).
> > > > > >
> > > > > > Stan
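(A minimal illustration of the first option Stanislav mentions above, not his actual draft, is to order entries by an insertion counter inside a ConcurrentSkipListMap and evict from the head. An untested sketch that only shows the eviction-order part; a real cache would also need an index by user key:)

    import java.util.concurrent.ConcurrentSkipListMap;
    import java.util.concurrent.atomic.AtomicLong;

    /** FIFO-bounded container sketch: the oldest insertion is evicted first. */
    class FifoBoundedMap<V> {
        private final ConcurrentSkipListMap<Long, V> map = new ConcurrentSkipListMap<>();
        private final AtomicLong seq = new AtomicLong();
        private final int maxSize;

        FifoBoundedMap(int maxSize) { this.maxSize = maxSize; }

        void add(V val) {
            map.put(seq.incrementAndGet(), val);

            // ConcurrentSkipListMap.size() is O(n), one more reason to benchmark first.
            while (map.size() > maxSize)
                map.pollFirstEntry(); // evicts the smallest (oldest) sequence number
        }
    }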
> > > > > > From: Nikolay Izhikov
> > > > > > Sent: July 24, 2018, 16:27
> > > > > > To: [email protected]
> > > > > > Subject: Re: ConcurrentLinkedHashMap works incorrectly after clear()
> > > > > >
> > > > > > Hello, Ilya.
> > > > > >
> > > > > > Maybe we should proceed with ticket [1], "Get rid of
> > > > > > org.jsr166.ConcurrentLinkedHashMap"?
> > > > > >
> > > > > > Especially if this class is broken and buggy.
> > > > > >
> > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-7516
> > > > > >
> > > > > > On Tue, 24/07/2018 at 16:20 +0300, Ilya Lantukh wrote:
> > > > > > > Thanks for revealing this issue!
> > > > > > >
> > > > > > > I don't understand why we should disallow calling clear().
> > > > > > >
> > > > > > > One way it can be re-implemented is:
> > > > > > > 1. acquire write locks on all segments;
> > > > > > > 2. clear them;
> > > > > > > 3. reset size to 0;
> > > > > > > 4. release locks.
> > > > > > >
> > > > > > > Another approach is to calculate inside
> > > > > > > ConcurrentLinkedHashMap.Segment.clear() how many entries you
> > > > > > > actually deleted and then call size.addAndGet(...).
> > > > > > >
> > > > > > > In both cases you will have to replace LongAdder with AtomicLong.
> > > > > > >
> > > > > > > On Tue, Jul 24, 2018 at 4:03 PM, Ilya Kasnacheev <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hello, Igniters!
> > > > > > > >
> > > > > > > > So I was working on a fix for
> > > > > > > > https://issues.apache.org/jira/browse/IGNITE-9056
> > > > > > > > The reason for the test flakiness turned out to be that our
> > > > > > > > ConcurrentLinkedHashMap (and its tautological cousin
> > > > > > > > GridBoundedConcurrentLinkedHashMap) is broken :(
> > > > > > > >
> > > > > > > > When you call clear(), its size counter is not updated. So
> > > > > > > > sizex() will return the old size after clear, and if maxCnt
> > > > > > > > is set, after several clear()s the map will immediately evict
> > > > > > > > entries right after they are inserted, maintaining its size
> > > > > > > > at 0.
> > > > > > > >
> > > > > > > > This is scary, since indexing internals make intense use of
> > > > > > > > ConcurrentLinkedHashMaps.
> > > > > > > >
> > > > > > > > My suggestion for this fix is to avoid ever calling clear(),
> > > > > > > > making it throw UnsupportedOperationException, and
> > > > > > > > recreating/replacing the map instead of clear()ing it. Unless
> > > > > > > > somebody is going to stand up and fix
> > > > > > > > ConcurrentLinkedHashMap.clear() properly. Frankly speaking,
> > > > > > > > I'm afraid of touching this code in any non-trivial way.
> > > > > > > >
> > > > > > > > --
> > > > > > > > Ilya Kasnacheev
> > > > >
> > > > > --
> > > > > Maxim Muzafarov
> > >
> > > --
> > > Maxim Muzafarov
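P.S. Regarding Maxim's question upthread about a clear reproducer, something along these lines should demonstrate the stale counter (an untested sketch; the constructor arguments are illustrative, with the last one standing for the maximum capacity):

    import org.jsr166.ConcurrentLinkedHashMap;

    public class ClhmClearRepro {
        public static void main(String[] args) {
            ConcurrentLinkedHashMap<Integer, Integer> m =
                new ConcurrentLinkedHashMap<>(16, 0.75f, 16, 10);

            for (int i = 0; i < 10; i++)
                m.put(i, i);

            m.clear();

            // Expected 0, but the counter still reports 10, so the map
            // believes it is full even though the table is empty.
            System.out.println("sizex() after clear(): " + m.sizex());

            m.put(42, 42);

            // Per the behavior described above, the new entry is evicted
            // immediately, so this prints null.
            System.out.println("get(42) after put: " + m.get(42));
        }
    }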
