RE: RFR(trivial): 8222394: HashMap.compute() throws CME on an empty Map if clear() called concurrently

2019-05-08 Thread Patrick Zhang OS
> [1] http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-May/060068.html
Thanks for sharing this thread. 
I saw Pavel's comments: "I believe it should be incremented on an attempt to 
modify the collection, rather than on the effective result of that attempt"
It can be true for bulk modifications like addAll, clear, while for 
single-entry modifications like put, remove, merge, etc. it depends, some are 
unconditional ++ and most are conditional ++. 
So the clarification should be not specific to clear or addAll, but at a common 
place (if possible) in a higher level javadoc for modCount, or for "fail fast".

Regards
Patrick

-Original Message-
From: Stuart Marks  
Sent: Thursday, May 9, 2019 5:01 AM
To: Patrick Zhang OS 
Cc: core-libs-dev 
Subject: Re: RFR(trivial): 8222394: HashMap.compute() throws CME on an empty 
Map if clear() called concurrently



On 5/4/19 12:21 AM, Patrick Zhang OS wrote:
> Thank you very much for the detailed explanation and examples, which solved 
> most of my previous questions and confusion. Really appreciate that. I agree 
> with the opinion about "effort vs benefit".
> 
> Re-thought my original tests concerning CMEs that passed with jdk8u but 
> failed with jdk11 (should be 9+, but I only checked LTS builds), I was 
> struggling between "fixing" the "problems" in jdk11 and "making jdk8 fail 
> similarly". However so far it looks these tests might be over strict, or 
> "verifying CMEs" itself might be an invalid (or unreliable) operation, 
> perhaps I should drop all of them. Lastly, a suggestion would be: adding more 
> comments for this in case anyone else would revisit it with similar 
> confusions, e.g. HashMap.clear.

Great, I'm glad this was helpful.

Regarding the tests, yes, I'm afraid those might be overly strict. The 
specification is admittedly quite loose in this area. This means that the exact 
behavior may vary from release to release. If a collection is optimized, for 
example, sometimes it's quite difficult to preserve the exact same behavior in 
all cases. It's for this reason we avoid specifying the exact behaviors around 
CME. Of course, when making any changes, we usually do try to preserve the 
general behavior, just not every specific edge case.

I generally welcome code comments to improve clarity, but I'm not sure one is 
warranted for HashMap.clear(). The placement of modCount++ seems quite 
reasonable. There are other cases where the choice of 
conditional-vs-unconditional is made (see [1]) and I don't think we want to go 
sprinkling explanations of this all around the code.

[1] http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-May/060068.html

s'marks


Re: RFR(trivial): 8222394: HashMap.compute() throws CME on an empty Map if clear() called concurrently

2019-05-08 Thread Stuart Marks




On 5/4/19 12:21 AM, Patrick Zhang OS wrote:

Thank you very much for the detailed explanation and examples, which solved most of my 
previous questions and confusion. Really appreciate that. I agree with the opinion about 
"effort vs benefit".

Re-thought my original tests concerning CMEs that passed with jdk8u but failed with jdk11 (should be 9+, but I only 
checked LTS builds), I was struggling between "fixing" the "problems" in jdk11 and "making 
jdk8 fail similarly". However so far it looks these tests might be over strict, or "verifying CMEs" 
itself might be an invalid (or unreliable) operation, perhaps I should drop all of them. Lastly, a suggestion would be: 
adding more comments for this in case anyone else would revisit it with similar confusions, e.g. HashMap.clear.


Great, I'm glad this was helpful.

Regarding the tests, yes, I'm afraid those might be overly strict. The 
specification is admittedly quite loose in this area. This means that the exact 
behavior may vary from release to release. If a collection is optimized, for 
example, sometimes it's quite difficult to preserve the exact same behavior in 
all cases. It's for this reason we avoid specifying the exact behaviors around 
CME. Of course, when making any changes, we usually do try to preserve the 
general behavior, just not every specific edge case.


I generally welcome code comments to improve clarity, but I'm not sure one is 
warranted for HashMap.clear(). The placement of modCount++ seems quite 
reasonable. There are other cases where the choice of 
conditional-vs-unconditional is made (see [1]) and I don't think we want to go 
sprinkling explanations of this all around the code.


[1] http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-May/060068.html

s'marks


RE: RFR(trivial): 8222394: HashMap.compute() throws CME on an empty Map if clear() called concurrently

2019-05-04 Thread Patrick Zhang OS
Hi Stuart,

Thank you very much for the detailed explanation and examples, which solved 
most of my previous questions and confusion. Really appreciate that. I agree 
with the opinion about "effort vs benefit".

Re-thought my original tests concerning CMEs that passed with jdk8u but failed 
with jdk11 (should be 9+, but I only checked LTS builds), I was struggling 
between "fixing" the "problems" in jdk11 and "making jdk8 fail similarly". 
However so far it looks these tests might be over strict, or "verifying CMEs" 
itself might be an invalid (or unreliable) operation, perhaps I should drop all 
of them. Lastly, a suggestion would be: adding more comments for this in case 
anyone else would revisit it with similar confusions, e.g. HashMap.clear. 

Regards
Patrick

-Original Message-
From: Stuart Marks  
Sent: Thursday, May 2, 2019 8:39 AM
To: Patrick Zhang OS 
Cc: core-libs-dev 
Subject: Re: RFR(trivial): 8222394: HashMap.compute() throws CME on an empty 
Map if clear() called concurrently


>> ...merely to serve as a discussion point about the policy for 
>> throwing ConcurrentModificationException?

> Yes, for the time being, I want to see and welcome more ideas on this. 
> It seems to me that the policy for throwing CME here is not a unified 
> one, mostly based on experience and testing. Clear, compute, and 
> computeIfAbsent are more special as I described.

OK. For reference, here are some of the words from the 
ConcurrentModificationException specification: [1]


> This exception may be thrown by methods that have detected concurrent 
> modification of an object when such modification is not permissible.
> 
> For example, it is not generally permissible for one thread to modify 
> a Collection while another thread is iterating over it. In general, 
> the results of the iteration are undefined under these circumstances. 
> Some Iterator implementations (including those of all the general 
> purpose collection implementations provided by the JRE) may choose to 
> throw this exception if this behavior is detected. Iterators that do 
> this are known as fail-fast iterators, as they fail quickly and 
> cleanly, rather that risking arbitrary, non-deterministic behavior at an 
> undetermined time in the future.
> 
> Note that this exception does not always indicate that an object has 
> been concurrently modified by a different thread. If a single thread 
> issues a sequence of method invocations that violates the contract of 
> an object, the object may throw this exception. For example, if a 
> thread modifies a collection directly while it is iterating over the 
> collection with a fail-fast iterator, the iterator will throw this exception.
> 
> Note that fail-fast behavior cannot be guaranteed as it is, generally 
> speaking, impossible to make any hard guarantees in the presence of 
> unsynchronized concurrent modification. Fail-fast operations throw 
> ConcurrentModificationException on a best-effort basis. Therefore, it 
> would be wrong to write a program that depended on this exception for 
> its
> correctness: ConcurrentModificationException should be used only to 
> detect bugs.

[1]
https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/ConcurrentModificationException.html


Similar words are repeated in several different locations around the
specification, such as in the ArrayList and HashMap class specifications.

I'm not entirely sure what your concerns are with
ConcurrentModificationException (and the "fail-fast" concurrent modification
policy), but let me discuss a few points.

1. throwing of CME is not guaranteed - "best effort"

Unlike most Java specifications, the specification around CME is fairly 
indefinite. The wording is hedged -- "This exception may be thrown" This 
implies that CME might or might not be thrown, even in cases where one might 
expect it to be.

It also says that CME is thrown on a "best effort" basis. This doesn't mean 
that 
the library makes the maximum possible effort to throw CME in every possible 
situation. Maybe "best effort" is somewhat misleading. Perhaps "reasonable" 
effort is more descriptive.

For example, ArrayList keeps a modCount field and increments and checks it 
occasionally. No synchronization is done. If the ArrayList is modified by 
another thread, the update to modCount might not be visible to all threads, 
which might result in data corruption instead of a CME.

One way to "fix" this would be to make access to modCount synchronized (or to 
make it volatile, or to make it an AtomicInteger or something) to improve the 
reliability of detecting concurrent modifications from other threads. This 
would 
add complexity to the code and also slow down common operations. Making this 

Re: RFR(trivial): 8222394: HashMap.compute() throws CME on an empty Map if clear() called concurrently

2019-05-01 Thread Stuart Marks



...merely to serve as a discussion point about the policy for throwing 
ConcurrentModificationException?


Yes, for the time being, I want to see and welcome more ideas on this. It 
seems to me that the policy for throwing CME here is not a unified one, 
mostly based on experience and testing. Clear, compute, and computeIfAbsent 
are more special as I described.


OK. For reference, here are some of the words from the
ConcurrentModificationException specification: [1]



This exception may be thrown by methods that have detected concurrent
modification of an object when such modification is not permissible.

For example, it is not generally permissible for one thread to modify a
Collection while another thread is iterating over it. In general, the results
of the iteration are undefined under these circumstances. Some Iterator
implementations (including those of all the general purpose collection
implementations provided by the JRE) may choose to throw this exception if
this behavior is detected. Iterators that do this are known as fail-fast
iterators, as they fail quickly and cleanly, rather that risking arbitrary,
non-deterministic behavior at an undetermined time in the future.

Note that this exception does not always indicate that an object has been
concurrently modified by a different thread. If a single thread issues a
sequence of method invocations that violates the contract of an object, the
object may throw this exception. For example, if a thread modifies a
collection directly while it is iterating over the collection with a
fail-fast iterator, the iterator will throw this exception.

Note that fail-fast behavior cannot be guaranteed as it is, generally
speaking, impossible to make any hard guarantees in the presence of
unsynchronized concurrent modification. Fail-fast operations throw
ConcurrentModificationException on a best-effort basis. Therefore, it would
be wrong to write a program that depended on this exception for its
correctness: ConcurrentModificationException should be used only to detect
bugs.


[1] 
https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/ConcurrentModificationException.html



Similar words are repeated in several different locations around the
specification, such as in the ArrayList and HashMap class specifications.

I'm not entirely sure what your concerns are with
ConcurrentModificationException (and the "fail-fast" concurrent modification
policy), but let me discuss a few points.

1. throwing of CME is not guaranteed - "best effort"

Unlike most Java specifications, the specification around CME is fairly 
indefinite. The wording is hedged -- "This exception may be thrown" This 
implies that CME might or might not be thrown, even in cases where one might 
expect it to be.


It also says that CME is thrown on a "best effort" basis. This doesn't mean that 
the library makes the maximum possible effort to throw CME in every possible 
situation. Maybe "best effort" is somewhat misleading. Perhaps "reasonable" 
effort is more descriptive.


For example, ArrayList keeps a modCount field and increments and checks it 
occasionally. No synchronization is done. If the ArrayList is modified by 
another thread, the update to modCount might not be visible to all threads, 
which might result in data corruption instead of a CME.


One way to "fix" this would be to make access to modCount synchronized (or to 
make it volatile, or to make it an AtomicInteger or something) to improve the 
reliability of detecting concurrent modifications from other threads. This would 
add complexity to the code and also slow down common operations. Making this 
extra effort doesn't seem to be worthwhile.


2. throwing CME sometimes done even when not absolutely necessary

Another point is that the detection and throwing of a CME is an approximation of 
when concurrent modification would have any impact. In some cases CME will be 
thrown even when one wouldn't think it strictly necessary.


For example, consider a loop in the middle of iterating an ArrayList. 
ArrayList's iterator simply keeps an index to represent its current position. If 
an element is added to or removed from the front of the list, this would result 
in the iteration skipping or repeating elements. Thus, throwing CME seems 
warranted in this case.


Now consider an iteration in the middle of an ArrayList, and an addition or 
removal is made to the *end* of the list. This doesn't affect the current 
iteration; yet CME is thrown anyway. Why?


To avoid throwing CME in this case (but to throw it in the previous case) the 
ArrayList and its Iterators would have to keep track of more information about 
what changes were made and would have to do more checking at each iteration 
step. This could increase code complexity considerably. Again, this seems like 
it isn't worthwhile. Keeping a simple counter (modCount) and checking it at each 
iteration step is quite cheap, although it arguably does throw CME unnecess

RE: RFR(trivial): 8222394: HashMap.compute() throws CME on an empty Map if clear() called concurrently

2019-04-24 Thread Patrick Zhang OS
> merely to serve as a discussion point about the policy for throwing 
> ConcurrentModificationException?

Yes, for the time being, I want to see and welcome more ideas on this. It seems 
to me that the policy for throwing CME here is not a unified one, mostly based 
on experience and testing. Clear, compute, and computeIfAbsent are more special 
as I described.

Regards 
Patrick

-Original Message-
From: Stuart Marks  
Sent: Thursday, April 25, 2019 7:48 AM
To: Patrick Zhang OS 
Cc: core-libs-dev ; Martin Buchholz 

Subject: Re: RFR(trivial): 8222394: HashMap.compute() throws CME on an empty 
Map if clear() called concurrently

Hi Patrick,

I guess I'm not sure what you're proposing here. You've updated the patch; are 
you proposing that this change be integrated?

Or are you posting code changes, not as a proposal to integrate, but merely to 
serve as a discussion point about the policy for throwing 
ConcurrentModificationException?

Either way (or something else) is fine; I just don't want to run off in one 
direction if you had intended the other.

s'marks

On 4/15/19 3:51 AM, Patrick Zhang OS wrote:
> Hi Stuart,
> 
> Thanks. I intentionally modified the map in remapping functions, just like 
> those tests in 
> http://hg.openjdk.java.net/jdk/jdk/file/00c0906bf4d1/test/jdk/java/util/Map/FunctionalCMEs.java.
>  My original tests were: create two threads, modify or perform read-only 
> operations in each, and verify those CMEs. There seems some inconsistencies:
> 1. clear() would modify the map completely, so it can be more 'defensible' 
> (Martin mentioned so in Jira), while other modifying functions like 
> put()/remove()/merge() are 'weaker', so ++modCount happens conditionally, say 
> there would be really some structural modifications in map.
> 2. compute()/computeIfAbsent() throws CME almost unconditionally, while 
> functions like forEach()/computeIfPresent()/iterator.next() are touching the 
> map content practically so these are wrapped by if-clauses.
> 
> So the concern is how to undertand "throw CME on a best-effort basis", 
> if I want to try best to detect the risk of bugs in program, 
> unconditionlly throwing CME can be the right way to go, e.g. do ++modCount in 
> removeNode() without telling if the removing would really occur, if I want to 
> make it more logically rigorous, we might need this: 
> http://cr.openjdk.java.net/~qpzhang/8222394/webrev.02 for 
> compute()/computeIfAbsent(). Centainly I know it has to afford the risk of 
> missing bugs.
> 
> Regards
> Patrick
> 
> -----Original Message-
> From: Stuart Marks 
> Sent: Saturday, April 13, 2019 4:15 AM
> To: Patrick Zhang OS 
> Cc: core-libs-dev 
> Subject: Re: RFR(trivial): 8222394: HashMap.compute() throws CME on an 
> empty Map if clear() called concurrently
> 
> [I'm about to leave for a week's vacation, so I won't be able to 
> respond further until after I return]
> 
> Hi Patrick,
> 
> A bit of background about my thinking on this proposal.
> 
> The Collections Framework has a set of weird edge cases ("tricky", as you 
> say) that I'll describe as "state-dependent" behavior, where operations that 
> seem illegal on their face might or might not throw an exception depending on 
> the current state of the system. The current state potentially depends on 
> everything that happened previously, including input to the program.
> 
> As an example of this, consider the following code snippet:
> 
>   List list1 = Collections.emptyList();
>   List list2 = getStringsFromSomewhere();
>   list1.addAll(list2);
> 
> What happens on the third line? The spec for Collections.emptyList() [1] says 
> that the returned list is immutable. You might think, then, that addAll() 
> will throw UnsupportedOperationException.
> 
> What actually happens is, "it depends."
> 
> If list2 has elements, then addAll() will throw UOE as expected. 
> However, if
> list2 happens to be empty, then addAll() returns false, because nothing was 
> added.
> 
> To me, the code above clearly has a bug: it's calling a mutator method on an 
> immutable collection. The only time this doesn't throw an exception is if 
> list2 is empty, so it can't possibly have any useful effect in this case.
> Presumably getStringsFromSomewhere() will return some actual strings from 
> time to time. If this happens rarely, we might put this code into production, 
> and it might blow up unexpectedly when a different set of inputs causes list2 
> to be nonempty.
> 
> (Aside 1: the Java 9 unmodifiable collections throw UOE 
> unconditionally, so
> List.of().addAll(List.of()) will throw UOE.)
> 
> (Aside 2: e

Re: RFR(trivial): 8222394: HashMap.compute() throws CME on an empty Map if clear() called concurrently

2019-04-24 Thread Stuart Marks

Hi Patrick,

I guess I'm not sure what you're proposing here. You've updated the patch; are 
you proposing that this change be integrated?


Or are you posting code changes, not as a proposal to integrate, but merely to 
serve as a discussion point about the policy for throwing 
ConcurrentModificationException?


Either way (or something else) is fine; I just don't want to run off in one 
direction if you had intended the other.


s'marks

On 4/15/19 3:51 AM, Patrick Zhang OS wrote:

Hi Stuart,

Thanks. I intentionally modified the map in remapping functions, just like 
those tests in 
http://hg.openjdk.java.net/jdk/jdk/file/00c0906bf4d1/test/jdk/java/util/Map/FunctionalCMEs.java.
 My original tests were: create two threads, modify or perform read-only 
operations in each, and verify those CMEs. There seems some inconsistencies:
1. clear() would modify the map completely, so it can be more 'defensible' 
(Martin mentioned so in Jira), while other modifying functions like 
put()/remove()/merge() are 'weaker', so ++modCount happens conditionally, say 
there would be really some structural modifications in map.
2. compute()/computeIfAbsent() throws CME almost unconditionally, while 
functions like forEach()/computeIfPresent()/iterator.next() are touching the 
map content practically so these are wrapped by if-clauses.

So the concern is how to undertand "throw CME on a best-effort basis",
if I want to try best to detect the risk of bugs in program, unconditionlly 
throwing CME can be the right way to go, e.g. do ++modCount in removeNode() 
without telling if the removing would really occur,
if I want to make it more logically rigorous, we might need this: 
http://cr.openjdk.java.net/~qpzhang/8222394/webrev.02 for 
compute()/computeIfAbsent(). Centainly I know it has to afford the risk of 
missing bugs.

Regards
Patrick

-Original Message-
From: Stuart Marks 
Sent: Saturday, April 13, 2019 4:15 AM
To: Patrick Zhang OS 
Cc: core-libs-dev 
Subject: Re: RFR(trivial): 8222394: HashMap.compute() throws CME on an empty 
Map if clear() called concurrently

[I'm about to leave for a week's vacation, so I won't be able to respond 
further until after I return]

Hi Patrick,

A bit of background about my thinking on this proposal.

The Collections Framework has a set of weird edge cases ("tricky", as you say) that I'll 
describe as "state-dependent" behavior, where operations that seem illegal on their face 
might or might not throw an exception depending on the current state of the system. The current 
state potentially depends on everything that happened previously, including input to the program.

As an example of this, consider the following code snippet:

  List list1 = Collections.emptyList();
  List list2 = getStringsFromSomewhere();
  list1.addAll(list2);

What happens on the third line? The spec for Collections.emptyList() [1] says 
that the returned list is immutable. You might think, then, that addAll() will 
throw UnsupportedOperationException.

What actually happens is, "it depends."

If list2 has elements, then addAll() will throw UOE as expected. However, if
list2 happens to be empty, then addAll() returns false, because nothing was 
added.

To me, the code above clearly has a bug: it's calling a mutator method on an 
immutable collection. The only time this doesn't throw an exception is if list2 
is empty, so it can't possibly have any useful effect in this case.
Presumably getStringsFromSomewhere() will return some actual strings from time 
to time. If this happens rarely, we might put this code into production, and it 
might blow up unexpectedly when a different set of inputs causes list2 to be 
nonempty.

(Aside 1: the Java 9 unmodifiable collections throw UOE unconditionally, so
List.of().addAll(List.of()) will throw UOE.)

(Aside 2: even though it's inconsistent and arguably wrong, I don't think this 
behavior of emptyList() should be changed, for compatibility reasons.)

I think you can see the analogy with HashMap.compute(). The cases from the 
tests essentially do this:

  var m = new HashMap();
  // possible modifications to m
  m.compute(someKey, (k, v) -> {
  m.clear();
  return someValue;
  });

Looking at this code, and not knowing the state of m, it seems to me it has a bug. The 
spec for compute() [2] says "The remapping function should not modify this map 
during computation" and there's a call to clear() right there. Indeed, the current 
code always throws ConcurrentModificationException.

You're proposing that it not throw CME in the case where m is empty, when
clear() has no effect. This is similar to the case above; if m is often empty, then this 
code appears to "work". But if the program's input were to change and m becomes 
non-empty, it'll throw CME unexp

RE: RFR(trivial): 8222394: HashMap.compute() throws CME on an empty Map if clear() called concurrently

2019-04-15 Thread Patrick Zhang OS
Hi Stuart,

Thanks. I intentionally modified the map in remapping functions, just like 
those tests in 
http://hg.openjdk.java.net/jdk/jdk/file/00c0906bf4d1/test/jdk/java/util/Map/FunctionalCMEs.java.
 My original tests were: create two threads, modify or perform read-only 
operations in each, and verify those CMEs. There seems some inconsistencies: 
1. clear() would modify the map completely, so it can be more 'defensible' 
(Martin mentioned so in Jira), while other modifying functions like 
put()/remove()/merge() are 'weaker', so ++modCount happens conditionally, say 
there would be really some structural modifications in map.
2. compute()/computeIfAbsent() throws CME almost unconditionally, while 
functions like forEach()/computeIfPresent()/iterator.next() are touching the 
map content practically so these are wrapped by if-clauses. 

So the concern is how to undertand "throw CME on a best-effort basis", 
if I want to try best to detect the risk of bugs in program, unconditionlly 
throwing CME can be the right way to go, e.g. do ++modCount in removeNode() 
without telling if the removing would really occur,
if I want to make it more logically rigorous, we might need this: 
http://cr.openjdk.java.net/~qpzhang/8222394/webrev.02 for 
compute()/computeIfAbsent(). Centainly I know it has to afford the risk of 
missing bugs.

Regards
Patrick

-Original Message-
From: Stuart Marks  
Sent: Saturday, April 13, 2019 4:15 AM
To: Patrick Zhang OS 
Cc: core-libs-dev 
Subject: Re: RFR(trivial): 8222394: HashMap.compute() throws CME on an empty 
Map if clear() called concurrently

[I'm about to leave for a week's vacation, so I won't be able to respond 
further until after I return]

Hi Patrick,

A bit of background about my thinking on this proposal.

The Collections Framework has a set of weird edge cases ("tricky", as you say) 
that I'll describe as "state-dependent" behavior, where operations that seem 
illegal on their face might or might not throw an exception depending on the 
current state of the system. The current state potentially depends on 
everything that happened previously, including input to the program.

As an example of this, consider the following code snippet:

 List list1 = Collections.emptyList();
 List list2 = getStringsFromSomewhere();
 list1.addAll(list2);

What happens on the third line? The spec for Collections.emptyList() [1] says 
that the returned list is immutable. You might think, then, that addAll() will 
throw UnsupportedOperationException.

What actually happens is, "it depends."

If list2 has elements, then addAll() will throw UOE as expected. However, if
list2 happens to be empty, then addAll() returns false, because nothing was 
added.

To me, the code above clearly has a bug: it's calling a mutator method on an 
immutable collection. The only time this doesn't throw an exception is if list2 
is empty, so it can't possibly have any useful effect in this case.
Presumably getStringsFromSomewhere() will return some actual strings from time 
to time. If this happens rarely, we might put this code into production, and it 
might blow up unexpectedly when a different set of inputs causes list2 to be 
nonempty.

(Aside 1: the Java 9 unmodifiable collections throw UOE unconditionally, so
List.of().addAll(List.of()) will throw UOE.)

(Aside 2: even though it's inconsistent and arguably wrong, I don't think this 
behavior of emptyList() should be changed, for compatibility reasons.)

I think you can see the analogy with HashMap.compute(). The cases from the 
tests essentially do this:

 var m = new HashMap();
 // possible modifications to m
 m.compute(someKey, (k, v) -> {
 m.clear();
 return someValue;
 });

Looking at this code, and not knowing the state of m, it seems to me it has a 
bug. The spec for compute() [2] says "The remapping function should not modify 
this map during computation" and there's a call to clear() right there. Indeed, 
the current code always throws ConcurrentModificationException.

You're proposing that it not throw CME in the case where m is empty, when
clear() has no effect. This is similar to the case above; if m is often empty, 
then this code appears to "work". But if the program's input were to change and 
m becomes non-empty, it'll throw CME unexpectedly. On the other hand, it would 
allow clear() in exactly the cases where it has no effect.

Overall I don't see that the system is improved by this change. It allows a 
particular operation only when it has no effect (thus adding no value), and it 
increases the risk of bugs in programs going undetected.

s'marks


[1]
https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Collections.html#emptyList()

[2]
https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Map.html#c

Re: RFR(trivial): 8222394: HashMap.compute() throws CME on an empty Map if clear() called concurrently

2019-04-12 Thread Stuart Marks
[I'm about to leave for a week's vacation, so I won't be able to respond further 
until after I return]


Hi Patrick,

A bit of background about my thinking on this proposal.

The Collections Framework has a set of weird edge cases ("tricky", as you say) 
that I'll describe as "state-dependent" behavior, where operations that seem 
illegal on their face might or might not throw an exception depending on the 
current state of the system. The current state potentially depends on everything 
that happened previously, including input to the program.


As an example of this, consider the following code snippet:

List list1 = Collections.emptyList();
List list2 = getStringsFromSomewhere();
list1.addAll(list2);

What happens on the third line? The spec for Collections.emptyList() [1] says 
that the returned list is immutable. You might think, then, that addAll() will 
throw UnsupportedOperationException.


What actually happens is, "it depends."

If list2 has elements, then addAll() will throw UOE as expected. However, if 
list2 happens to be empty, then addAll() returns false, because nothing was added.


To me, the code above clearly has a bug: it's calling a mutator method on an 
immutable collection. The only time this doesn't throw an exception is if list2 
is empty, so it can't possibly have any useful effect in this case.
Presumably getStringsFromSomewhere() will return some actual strings from time 
to time. If this happens rarely, we might put this code into production, and it 
might blow up unexpectedly when a different set of inputs causes list2 to be 
nonempty.


(Aside 1: the Java 9 unmodifiable collections throw UOE unconditionally, so 
List.of().addAll(List.of()) will throw UOE.)


(Aside 2: even though it's inconsistent and arguably wrong, I don't think this 
behavior of emptyList() should be changed, for compatibility reasons.)


I think you can see the analogy with HashMap.compute(). The cases from the tests 
essentially do this:


var m = new HashMap();
// possible modifications to m
m.compute(someKey, (k, v) -> {
m.clear();
return someValue;
});

Looking at this code, and not knowing the state of m, it seems to me it has a 
bug. The spec for compute() [2] says "The remapping function should not modify 
this map during computation" and there's a call to clear() right there. Indeed, 
the current code always throws ConcurrentModificationException.


You're proposing that it not throw CME in the case where m is empty, when 
clear() has no effect. This is similar to the case above; if m is often empty, 
then this code appears to "work". But if the program's input were to change and 
m becomes non-empty, it'll throw CME unexpectedly. On the other hand, it would 
allow clear() in exactly the cases where it has no effect.


Overall I don't see that the system is improved by this change. It allows a 
particular operation only when it has no effect (thus adding no value), and it 
increases the risk of bugs in programs going undetected.


s'marks


[1] 
https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Collections.html#emptyList()


[2] 
https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Map.html#compute(K,java.util.function.BiFunction)





On 4/12/19 2:46 AM, Patrick Zhang OS wrote:

Created a ticket to track it, welcome any comments. Thanks.

JBS https://bugs.openjdk.java.net/browse/JDK-8222394
Webrev: http://cr.openjdk.java.net/~qpzhang/map.clear/webrev.01

Regards
Patrick

-Original Message-
From: core-libs-dev  On Behalf Of 
Patrick Zhang OS
Sent: Saturday, March 30, 2019 1:34 PM
To: core-libs-dev 
Subject: ConcurrentModificationException thrown by HashMap.compute() operating 
on an empty Map, expected/unexpected?

Hi,
Here I have a case simplified from a practical issue that throws ConcurrentModificationException (CME) 
unexpectedly (I think). [0] creates a HashMap, keeps it empty, and calls m.computeIfAbsent() or m.compute(), 
in which a "sneaky" m.clear() occurs, some of the test cases throw CME although there were no 
"structural" changes in fact. (A structural modification is defined as "any operation that 
adds or deletes one or more mappings...").

This case cannot be reproduced with jdk8u, while jdk9 and beyond can, after the 
bug [1] got fixed for computeIfAbsent() concurrent co-modification issues. A 
couple of test cases [2] were introduced at that time, and the focus was to 
verify the behaviors at resizing, while empty maps were not tested.

A possible "fix" for this issue is to move the unconditional "modCount++" [3] into the 
if-clause, which indicates that a "structural" change would be happening indeed.

public void clear() {
 Node[] tab;
-   modCount++;
 if ((tab = table) != null && size > 0) {
+modCount++;
   size = 0;
   for (int i = 0; i < tab.length; ++i)
 tab[i] = null;
 }
}

Therefore, a dilemma here is "modCount++ before-if-clause 

RE: RFR(trivial): 8222394: HashMap.compute() throws CME on an empty Map if clear() called concurrently

2019-04-12 Thread Patrick Zhang OS
Created a ticket to track it, welcome any comments. Thanks. 

JBS https://bugs.openjdk.java.net/browse/JDK-8222394
Webrev: http://cr.openjdk.java.net/~qpzhang/map.clear/webrev.01 

Regards
Patrick

-Original Message-
From: core-libs-dev  On Behalf Of 
Patrick Zhang OS
Sent: Saturday, March 30, 2019 1:34 PM
To: core-libs-dev 
Subject: ConcurrentModificationException thrown by HashMap.compute() operating 
on an empty Map, expected/unexpected?

Hi,
Here I have a case simplified from a practical issue that throws 
ConcurrentModificationException (CME) unexpectedly (I think). [0] creates a 
HashMap, keeps it empty, and calls m.computeIfAbsent() or m.compute(), in which 
a "sneaky" m.clear() occurs, some of the test cases throw CME although there 
were no "structural" changes in fact. (A structural modification is defined as 
"any operation that adds or deletes one or more mappings...").

This case cannot be reproduced with jdk8u, while jdk9 and beyond can, after the 
bug [1] got fixed for computeIfAbsent() concurrent co-modification issues. A 
couple of test cases [2] were introduced at that time, and the focus was to 
verify the behaviors at resizing, while empty maps were not tested.

A possible "fix" for this issue is to move the unconditional "modCount++" [3] 
into the if-clause, which indicates that a "structural" change would be 
happening indeed.

public void clear() {
Node[] tab;
-   modCount++;
if ((tab = table) != null && size > 0) {
+modCount++;
  size = 0;
  for (int i = 0; i < tab.length; ++i)
tab[i] = null;
}
}

Therefore, a dilemma here is "modCount++ before-if-clause but overkills some 
cases" vs. "modCount++ into-if-clause but weakens the CME checking 
potentially". I want to make balance regarding how to "throw CME on a 
best-effort basis" more appropriately. Any suggestion?

I understand that CME here in HashMap.java cannot guarantee much and may be 
only for debugging purpose, any concurrent modification needs to be typically 
accomplished by synchronizing on some object that naturally encapsulates the 
map. So the mentioned issue is a just a tricky case.

[0]http://cr.openjdk.java.net/~qpzhang/map.clear/webrev.01/test/jdk/java/util/concurrent/ConcurrentMap/ConcurrentModification.java.udiff.html
[1]https://bugs.openjdk.java.net/browse/JDK-8071667
[2]http://hg.openjdk.java.net/jdk/jdk/file/5a9d780eb9dd/test/jdk/java/util/Map/FunctionalCMEs.java
[3]http://hg.openjdk.java.net/jdk/jdk/file/1042cac8bc2a/src/java.base/share/classes/java/util/HashMap.java#l860

Regards
Patrick