Re: [jira] Created: (HARMONY-6404) possible data-reordering in some hashCode-methods (e.g. String or URL)

Vijay Menon Fri, 11 Dec 2009 07:51:06 -0800

Egor,

Thanks for the link!  I think I see the original poster's issue & I retract
my statement.  The original example is basically identical to:


tmp1 = this.hashCode
if (tmp1 == 0) {
  // calculate hash
  this.hashCode = hash
}
tmp2 = this.hashCode
return tmp2


(You can think of tmp1 and tmp2 as internal registers introduced by the
compiler.)

The compiler is allowed to reorder the above to:

tmp2 = this.hashCode
tmp1 = this.hashCode
if (tmp1 == 0) {
  // calculate hash without accessing this.hashCode
  this.hashCode = hash
  tmp2 = this.hashCode
}
return tmp2


This is just fine in a single-threaded setting.  If tmp1 is 0, tmp2 is
always assigned the new hash.  If tmp1 is non-zero, then - in the
single-threaded code - the earlier read of tmp2 would have been non-zero as
well.

In a multithreaded setting, you can get the following:

tmp2 = this.hashCode  // reads zero
// <--- another thread writes 42 to this.hashCode
tmp1 = this.hashCode // reads 42, if clause not executed
if (tmp1 == 0) {
  // calculate hash without accessing this.hashCode
  this.hashCode = hash
  tmp2 = this.hashCode
}
return tmp2 // returns zero!


Now, you get zero.  Basically, a compiler is allowed to reorder reads if it
sees no (potentially aliasing) intervening writes.

Egor: are you sure that DRLVM doesn't reorder reads to the heap?  StarJIT
did do CSE of reads to the heap (do_memopt) - which is effectively the same
thing.  Did DRLVM remove that from Jitrino?

Vijay

On Fri, Dec 11, 2009 at 6:32 AM, Egor Pasko <[email protected]> wrote:

> On the 0x684 day of Apache Harmony Tim Ellison wrote:
> > On 11/Dec/2009 04:09, Vijay Menon wrote:
> >> Perhaps I'm missing some context, but I don't see any problem.  I don't
> >> believe that this:
> >>
> >>         if (hashCode == 0) {
> >>             // calculate hash
> >>             hashCode = hash;
> >>         }
> >>         return hashCode;
> >>
> >> can ever return 0 (assuming hash is non-zero), regardless of memory
> fences.
> >>  The JMM won't allow visible reordering in a single threaded program.
> >
> > I agree.  In the multi-threaded case there can be a data race on the
> > hashCode, with the effect that the same hashCode value is unnecessarily,
> > but harmlessly, recalculated.
>
> Vijay, Tim, you are not 100% correct here.
>
> 1. there should be an extra restriction that the part "calculate hash"
>   does not touch the hashCode field. Without that restriction more
>   trivial races can happen as discussed in LANG-481.
>
> So we should assume this code:
>
> if (this.hashCode == 0) {
>  int hash;
>  if (this.hashCode == 0) {
>    // Calculate using 'hash' only, not this.hashCode.
>    this.hashCode = hash;
>  }
>  return this.hashCode;
> }
>
> where 'this.*' is always a memory reference while 'hash' is a thread
> private var.
>
> 2. DRLVM JIT indeed does not privatize field access to threads. It
>   always loads fields from memory in original order. Hence this
>   potential bug does not affect DRLVM .. now. But potentially it can
>   start optimizing things this way because current behavior prevents
>   a bunch of high level optimizations.
>
> 3. Jeremy Manson, being an expert in Java Memory Model claims [1] that
>   such reordering is allowed theoretically. I.e. like this:
>
> int hash = this.hashCode;
> if (this.hashCode == 0) {
>  hash = this.hashCode = // calculate hash
> }
> return hash;
>
> This is a correct single-threaded code. What happened here is a
> lengthy thread privatization of this.hashCode (again, does not happen
> in DRLVM). That is incorrect in multithreaded environment and needs
> extra synchronization/volatile/etc.
>
> 4. I do not know why a JIT would want to do that, i.e. two sequential
>   reads from the same memory location. Sounds like a bit synthetic example.
>
>
> [1]
> http://jeremymanson.blogspot.com/2008/12/benign-data-races-in-java.html
>
> --
> Egor Pasko
>
>

Re: [jira] Created: (HARMONY-6404) possible data-reordering in some hashCode-methods (e.g. String or URL)

Reply via email to