I've made a benchmark
<https://gist.github.com/605804fb31572ff1abb916eb019ede36> with visualized
results
<https://jmh.morethan.io/?source=https://gist.githubusercontent.com/liach/605804fb31572ff1abb916eb019ede36/raw/4841b483365211f55e16925e18a876d39813b4dc/flat-results.json>
on putting a map of a few elements into a hash map via putAll, put in
forEach, put in for entrySet loop, and put in for keySet loop with Map.get
calls on JDK 17. Aside from regular "putAll", which is the fastest because
hash map preemptively expanded the table and has custom optimizations, the
rest of the result shows that forEach is largely the same for maps that
already store contents as Entry instances and significantly faster for
IdentityHashMap, the only one that does not store contents as Entry objects.

Hence, I believe the performance benefit from the replacement is real.
stuart, mind evaluate my claim?


On Fri, Mar 4, 2022 at 10:43 PM - <liangchenb...@gmail.com> wrote:

> Sounds reasonable. Then how about the claims on performance improvement,
> especially for maps that have to create entry objects on entry set
> iteration? In my few jmh benchmarks run, it seems adding from identity hash
> map at least is sped up, while most other maps have no significant
> difference.
>
> On Fri, Mar 4, 2022 at 6:58 PM Stuart Marks <stuart.ma...@oracle.com>
> wrote:
>
>> Hi, I have to say I'm pretty skeptical of this change overall.
>>
>> It sounds like the main issue is iterating over the entrySet of a map
>> that might be
>> modified concurrently. The bug report says
>>
>>  > for concurrent maps, regular iteration of entrySet may fail
>> spontaneously
>>
>> This isn't true for ConcurrentHashMap or ConcurrentSkipListMap, which
>> have weakly
>> consistent iterators that won't error out. It *is* true for
>> synchronizedMap, which
>> can throw ConcurrentModificationException if the map is modified during
>> iteration. I
>> guess that's mostly what you're referring to.
>>
>> The problem with synchronizedMap (and the other synchronized wrappers) is
>> that they
>> require external synchronization around iteration, and it's pretty much
>> impossible
>> to cover all the cases where iteration occurs. It's used everywhere
>> outside the JDK.
>> Even within the JDK, it appears quite frequently and is hard to avoid.
>> For example,
>> consider
>>
>>      collection.addAll(synchronizedMap.keySet())
>>
>> AbstractCollection::addAll iterates its argument, and this is inherited
>> in a bunch
>> of places.
>>
>> The root of the problem is that the synchronized wrappers support
>> concurrent
>> updating only in a very narrow range of use cases. If people have trouble
>> with
>> concurrent map updating, they need to use a real concurrent data
>> structure, or they
>> need to put everything inside their own class and have it do locking
>> around
>> higher-level operations. Replacing a few entrySet iteration cases with
>> Map::forEach
>> doesn't help anything.
>>
>> s'marks
>>
>>
>>
>> On 2/22/22 3:09 PM, - wrote:
>> > Hello,
>> > In the patch for 8281631 <
>> https://bugs.openjdk.java.net/browse/JDK-8281631>,
>> > xenoamess intentionally avoided
>> > <https://github.com/openjdk/jdk/pull/7431#discussion_r810666916>
>> repeatedly
>> > calling Map::size, for it may not be constant-timed due to being
>> > concurrent. This alerted me that the for loops on Map::entrySet may be
>> > similarly not thread-safe.
>> >
>> > I believe that we should migrate for loop on Map::entrySet to
>> Map::forEach
>> > calls; concurrent map callees usually have dedicated logic
>> > <
>> https://github.com/openjdk/jdk/blob/2557ef8a02fe19784bd5e605b11d6bd574cde2c2/src/java.base/share/classes/java/util/concurrent/ConcurrentMap.java#L119
>> >
>> > to mitigate such thread-safety issues. Synchronized map
>> > <
>> https://github.com/openjdk/jdk/blob/2557ef8a02fe19784bd5e605b11d6bd574cde2c2/src/java.base/share/classes/java/util/Collections.java#L2735
>> >
>> > callees are benefitted too.
>> >
>> > Another lesser benefit is reduced object allocation for iteration. While
>> > the most popular implementations (HashMap, LinkedHashMap, WeakHashMap,
>> > TreeMap) don't need to allocate entries for iteration, many other
>> (EnumMap,
>> > IdentityHashMap, concurrent maps) do, and iterating those maps with
>> forEach
>> > is less costly. (Note that 8170826
>> > <https://bugs.openjdk.java.net/browse/JDK-8170826> takes care of
>> > implementing proper forEach in EnumMap)
>> >
>> > A JBS issue has already been submitted at
>> > https://bugs.openjdk.java.net/browse/JDK-8282178, and I have prepared a
>> > patch. This patch modifies the putAll implementations of a few JDK maps
>> to
>> > let the callee feed entries through Map::forEach to be put into the
>> maps.
>> > Two places of Map::entrySet calls in Collectors has also been updated.
>> >
>> > For most other existing usages in the JDK, I don't think they will
>> benefit
>> > from such a migration: They mostly iterate on the entryset of the
>> popular
>> > implementations that already host entries and are single-threaded, and I
>> > don't think it's in our best interest to touch those use cases.
>> >
>> > So, here are my questions:
>> > 1. Is such a concept of replacing Map::entrySet calls with Map::forEach
>> > calls reasonable, or is there any fundamental flaw?
>> > If the concept sounds good, I will submit a patch, so we can answer
>> > 2. Is the changes implemented correct for this concept, i.e. targeting
>> code
>> > where users supply callee maps?
>> >
>> > Best regards
>> >
>>
>

Reply via email to