Re: 8020860: cluster Hashtable/Vector field updates for better transactional memory behaviour

Mike Duigou Mon, 14 Apr 2014 15:56:34 -0700

Hello all;

Sorry for the delay in following up on this issue. I have collected responses 
to the various comments and will provide responses here.


- Regarding the performance impact of the changes and of RTM. Valdimir Kozlov 
provided the following results from a run on a Haswell CPU system:

> threads=4 Interval=10000 CPUs=4 MapSize=2048 Population=1024 P10G80R10
> 
> 
> Without RTM locking, without Hashtable changes:
> 
> 2080 iterations/msec
> 
> Without RTM locking, with Hashtable changes 
> (-Xbootclasspath/p:Hashtable124.jar):
> 
> 2140 iterations/msec
> 
> With RTM locking (-XX:+UseRTMLocking), without Hashtable changes:
> 
> 23500 iterations/ms
> 
> With RTM locking, with Hashtable changes:
> 
> 33100 iterations/ms
> 
> 
> Numbers are average from 3 runs. They v[a]ry about 6-8%.

The benchmark is a slightly adapted version of the Hashtable benchmark used in 
Dave Dice's ASPLOS 2009 "Rock" paper [1]

- Regarding hotspot or javac doing the desired code movements. Neither compiler 
will currently move assignments past conditional logic and it isn't likely this 
will change in the near future. While it would be foolish to restructure all of 
our code for "compiler behaviour of the week" it does seem prudent to do so 
very selectively when we know that the behaviour is not going to soon change 
and the benefits are significant.

- Regarding potential loss of fast-fail behaviour. Vector is unaffected because 
reads and co-mod checks are always done under synchronization. Enumerations 
from Hashtable elements() and keys() methods offer no fast-fail behaviour 
though they may see different behaviour as a result of this change. The 
Hashtable entrySet().iterator(), keySet().iterator() and values().iterator() 
will have the existing behaviour when the illegal modification occurs on the 
same thread as iteration. When the modification occurs on a different thread 
then it is possible that different behaviour may be observed. Since the 
Hashtable Iterator modCount check occurs without synchronization the ordering 
of visible writes is unspecified until the lock on Hashtable is released. In 
the meantime neither, either or both writes may be visible. In RTM, updates to 
modCount would not be visible to unsynchronized readers regardless of placement 
within the synchronized block. We may just have to accept this limitation of 
Hashtable Iterators--they should really be holding the lock during next() if 
the fast-fail is to be reliable.

Should we proceed forward despite these understood limitations? My vote is a 
very soft "Yes".

Mike


[1] 
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.143.8940&rep=rep1&type=pdf


On Apr 4 2014, at 02:07 , Paul Sandoz <paul.san...@oracle.com> wrote:

> On Apr 4, 2014, at 1:42 AM, Mike Duigou <mike.dui...@oracle.com> wrote:
>>> 
>>> I could live with that change in behaviour, but this change completely 
>>> breaks the fail-fast semantics of the iterators in some cases! If you don't 
>>> update modCount until after the change is complete, the iterator may access 
>>> the updated state and not throw CME!.
>> 
>> For Vector I don't see this. The Iterator accesses to the data structures is 
>> always done with the Vector.this lock held. The re-ordering would only be 
>> observable to another thread if it is reading the Vector fields without 
>> holding the lock. I am not sure we should worry about that case.
>> 
> 
> Agreed, i don't see how that can happen.
> 
> 
>> For Hashtable Iterator there is no synchronization on the owning Hashtable 
>> except during the remove() method. It is unclear why the Hashtable iterators 
>> were not written in the same way as Vector.
> 
> Dunno.
> 
> 
>> It seems like there would be massive disruption to adding synchronization to 
>> Hashtable's itertors. Are the Hashtable iterators actually fast-fail?
> 
> They are fail fast only from within the same thread when the control is 
> inverted via iterator (like that for non-synchronized HashMap etc), otherwise 
> it is necessary to explicitly synchronize on the iterator, much like that for 
> Collections.synchronized* methods, see the implementation:
> 
>    public Set<K> keySet() {
>        if (keySet == null)
>            keySet = Collections.synchronizedSet(new KeySet(), this);
>        return keySet;
>    }
> 
> The documentation for keySet etc. states:
> 
>     * reflected in the set, and vice-versa.  If the map is modified
>     * while an iteration over the set is in progress (except through
>     * the iterator's own <tt>remove</tt> operation), the results of
>     * the iteration are undefined.  The set supports element removal,
> 
> 
> The documentation on the enumeration methods does not say anything.
> 
> We should probably update the documentation to additionally say something 
> like that on Collections.synchronized* methods.
> 
> 
>> Without synchronization this is not guaranteed since the writes may not be 
>> visible and Hashtable iterator failure behaviour is already likely to vary 
>> between platforms/architectures. With RTM it's presumed that the writes will 
>> NOT be visible until the transaction completes. This implies that the 
>> failure mode from Hashtable iterators is likely to change just by turning 
>> RTM locking on whether we make this code change or not. :-( 
>> 
> 
>>> I think this change is misguided.
>> 
>> I think we are fine for Vector, but Hashtable gives me concerns even in it's 
>> current state.
>> 
> 
> I don't think the current situation is made any worse by your changes.
> 
> The are some subtle changes with regards parameter checking and throwing 
> exceptions, but that does not seems to be very important behaviour to 
> preserve.
> 
> Paul.

Re: 8020860: cluster Hashtable/Vector field updates for better transactional memory behaviour

Reply via email to