[Bug c++/52558] write introduction incorrect wrt the C++11 memory model

2012-03-13 Thread torvald at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52558

torvald at gcc dot gnu.org changed:

   What|Removed |Added

  Component|tree-optimization   |c++

--- Comment #12 from torvald at gcc dot gnu.org 2012-03-13 21:17:48 UTC ---
(In reply to comment #11)
> Just one remark: in this case the write introduction is incorrect wrt the 
> C++11
> memory model because there are no previous write to the same location.  If
> there had been a previous write to the same location, then the discriminating
> context would have been racy and the code generated by -O2 sound.
> 
> I believe that the above argument generalises to all write introductions but I
> don't yet have a proof of this, so take it with a pinch of salt.

We follow a similar line of reasoning when thinking about which transformations
are allowed in transactional code (or other concurrent code).  In principle,
this also applies to loads to some extent even though racy loads are supposed
to be okay for as--if as long as a value obtained from a racy read is never
used.  Having a more precise yet practical understanding of this would
certainly help.


[Bug c++/52558] write introduction incorrect wrt the C++11 memory model

2012-03-12 Thread rguenther at suse dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52558

--- Comment #9 from rguenther at suse dot de  
2012-03-12 15:55:27 UTC ---
On Mon, 12 Mar 2012, amacleod at redhat dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52558
> 
> --- Comment #8 from Andrew Macleod  2012-03-12 
> 15:50:13 UTC ---
> We can still perform store motion out of a loop, we just can't put the store 
> on
> a path which is executed if the loop isn't executed.
> 
> In this case, we actually made the code *slower*.  Before LIM, there was a 
> load
> of g1, a compare and return.
> movlg_1(%rip), %edx
> xorl%eax, %eax
> testl   %edx, %edx
> jne .L1
> .L4:
> addl$1, %eax
> movl$0, g_2(%rip)
> cmpl$4, %eax
> jne .L4
> .L1:
> rep
> ret
> 
> 
>   LIM makes it have a load of g_1, a load of g_2 and a store to g_2 before
> returning.
> 
>.cfi_startproc
> movlg_1(%rip), %edx
> movlg_2(%rip), %eax
> testl   %edx, %edx
> jne .L2
> movl$0, g_2(%rip)
> ret
> .L2:
> movl%eax, g_2(%rip)
> xorl%eax, %eax
> ret
> 
> 
> 
> -O3 corrects this mistake and returns it to the more optimal results.
> 
> 
> I would argue this testcase shows LIM actually making the code worse in this
> case as well.

Usually loops are executed ;)  At least that is what we assume if
we don't know better.  But yes, splitting the exit block(s)
is a solution, so, if you fix this please go this way.

Richard.


[Bug c++/52558] write introduction incorrect wrt the C++11 memory model

2012-03-12 Thread aldyh at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52558

--- Comment #10 from Aldy Hernandez  2012-03-12 
15:56:30 UTC ---
On 03/12/12 10:45, rguenther at suse dot de wrote:

>> Just to get this straight, am I to assume that the default code
>> generation for GCC is a single threaded environment?  I just want to
>> make sure I get these variants right in my head, and can properly
>> separate what is and is not allowed in default GCC, and what only
>> pertains to the C11 memory model (or transactional stuff).
>
> It certainly is, though we are getting more careful about this stuff
> in recent years and generally only read data-races are ok.
>
> Richard.
>

Ah, ok.  Now it's clear.  I was confused because in a previous thread 
months ago you had mentioned that read data-races were ok, but write 
data-races were not, whereas here we are even allowing write data races.

Understood.

Thanks.


[Bug c++/52558] write introduction incorrect wrt the C++11 memory model

2012-03-12 Thread amacleod at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52558

--- Comment #8 from Andrew Macleod  2012-03-12 
15:50:13 UTC ---
We can still perform store motion out of a loop, we just can't put the store on
a path which is executed if the loop isn't executed.

In this case, we actually made the code *slower*.  Before LIM, there was a load
of g1, a compare and return.
movlg_1(%rip), %edx
xorl%eax, %eax
testl   %edx, %edx
jne .L1
.L4:
addl$1, %eax
movl$0, g_2(%rip)
cmpl$4, %eax
jne .L4
.L1:
rep
ret


  LIM makes it have a load of g_1, a load of g_2 and a store to g_2 before
returning.

   .cfi_startproc
movlg_1(%rip), %edx
movlg_2(%rip), %eax
testl   %edx, %edx
jne .L2
movl$0, g_2(%rip)
ret
.L2:
movl%eax, g_2(%rip)
xorl%eax, %eax
ret



-O3 corrects this mistake and returns it to the more optimal results.


I would argue this testcase shows LIM actually making the code worse in this
case as well.


[Bug c++/52558] write introduction incorrect wrt the C++11 memory model

2012-03-12 Thread rguenther at suse dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52558

--- Comment #7 from rguenther at suse dot de  
2012-03-12 15:45:39 UTC ---
On Mon, 12 Mar 2012, aldyh at redhat dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52558
> 
> --- Comment #6 from Aldy Hernandez  2012-03-12 
> 15:42:45 UTC ---
> On 03/12/12 10:32, rguenther at suse dot de wrote:
> es, but still cared about introducing write
> >> data races.  This test case has both.  I don't understand why we would 
> >> allow
> >> introducing writes on paths that did not have it, but I will defer to you.
> >
> > Why should we avoid them if we know they cannot cause problems?  This
> > will happen for every loop where we do not know if it iterates at least
> > once.  Store-motion is a very important optimization.
> >
> > Richard.
> >
> 
> They won't cause problems in a single threaded environment, but will 
> cause problems in a multi threaded environment.  Even if you're writing 
> the value g_2 originally had, another thread may have written to g_2 
> right before.

Yes, that's clear to me.

> Just to get this straight, am I to assume that the default code 
> generation for GCC is a single threaded environment?  I just want to 
> make sure I get these variants right in my head, and can properly 
> separate what is and is not allowed in default GCC, and what only 
> pertains to the C11 memory model (or transactional stuff).

It certainly is, though we are getting more careful about this stuff
in recent years and generally only read data-races are ok.

Richard.


[Bug c++/52558] write introduction incorrect wrt the C++11 memory model

2012-03-12 Thread aldyh at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52558

--- Comment #6 from Aldy Hernandez  2012-03-12 
15:42:45 UTC ---
On 03/12/12 10:32, rguenther at suse dot de wrote:
es, but still cared about introducing write
>> data races.  This test case has both.  I don't understand why we would allow
>> introducing writes on paths that did not have it, but I will defer to you.
>
> Why should we avoid them if we know they cannot cause problems?  This
> will happen for every loop where we do not know if it iterates at least
> once.  Store-motion is a very important optimization.
>
> Richard.
>

They won't cause problems in a single threaded environment, but will 
cause problems in a multi threaded environment.  Even if you're writing 
the value g_2 originally had, another thread may have written to g_2 
right before.

Just to get this straight, am I to assume that the default code 
generation for GCC is a single threaded environment?  I just want to 
make sure I get these variants right in my head, and can properly 
separate what is and is not allowed in default GCC, and what only 
pertains to the C11 memory model (or transactional stuff).

Thanks.


[Bug c++/52558] write introduction incorrect wrt the C++11 memory model

2012-03-12 Thread rguenther at suse dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52558

--- Comment #5 from rguenther at suse dot de  
2012-03-12 15:32:48 UTC ---
On Mon, 12 Mar 2012, aldyh at gcc dot gnu.org wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52558
> 
> --- Comment #4 from Aldy Hernandez  2012-03-12 
> 15:29:06 UTC ---
> 
> > No, we don't want to fix this for 4.7 as this is not a regression.
> > 
> > Yes, LIM only avoids introducing traps, not data-races.  This was discussed
> > in the past already, btw, and we do not want to generally disallow this
> > optimization.  [The C++ memory model is stupid here, it should not treat
> > every variable raceable but only specially marked ones, oh well ...]
> > 
> > There will be very many other passes that are affected by this, and even 
> > more
> > very many passes that will be affected by load data-races.
> > 
> > You will for example slow down SPEC CPU 2006 quite a bit (though technically
> > it does not include C++11 benchmarks).
> 
> I thought we ignored *load* data races, but still cared about introducing 
> write
> data races.  This test case has both.  I don't understand why we would allow
> introducing writes on paths that did not have it, but I will defer to you.

Why should we avoid them if we know they cannot cause problems?  This
will happen for every loop where we do not know if it iterates at least
once.  Store-motion is a very important optimization.

Richard.


[Bug c++/52558] write introduction incorrect wrt the C++11 memory model

2012-03-12 Thread aldyh at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52558

--- Comment #4 from Aldy Hernandez  2012-03-12 
15:29:06 UTC ---

> No, we don't want to fix this for 4.7 as this is not a regression.
> 
> Yes, LIM only avoids introducing traps, not data-races.  This was discussed
> in the past already, btw, and we do not want to generally disallow this
> optimization.  [The C++ memory model is stupid here, it should not treat
> every variable raceable but only specially marked ones, oh well ...]
> 
> There will be very many other passes that are affected by this, and even more
> very many passes that will be affected by load data-races.
> 
> You will for example slow down SPEC CPU 2006 quite a bit (though technically
> it does not include C++11 benchmarks).

I thought we ignored *load* data races, but still cared about introducing write
data races.  This test case has both.  I don't understand why we would allow
introducing writes on paths that did not have it, but I will defer to you.


[Bug c++/52558] write introduction incorrect wrt the C++11 memory model

2012-03-12 Thread amacleod at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52558

--- Comment #3 from Andrew Macleod  2012-03-12 
15:24:35 UTC ---
Created attachment 26881
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26881
Testcase for simulate-threads

I've modified the testcase so that it runs in gcc.dg/simulate-thread as a
pass/fail. 

Clearly it currently fails, although its interesting because at -O3 it passes
again!!  The second pass of DOM undoes the extra write!

It does fail the testsuite at -O2 as expected.
This diff can be applied to add it to simulate threads.


[Bug c++/52558] write introduction incorrect wrt the C++11 memory model

2012-03-12 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52558

--- Comment #2 from Richard Guenther  2012-03-12 
15:01:36 UTC ---
(In reply to comment #1)
> Richi, is this something that should also be fixed for 4.7 as well?  There is 
> a
> write to g_2 that is introduced on paths that did not have it.  So this is not
> just a load/load data race.

No, we don't want to fix this for 4.7 as this is not a regression.

Yes, LIM only avoids introducing traps, not data-races.  This was discussed
in the past already, btw, and we do not want to generally disallow this
optimization.  [The C++ memory model is stupid here, it should not treat
every variable raceable but only specially marked ones, oh well ...]

There will be very many other passes that are affected by this, and even more
very many passes that will be affected by load data-races.

You will for example slow down SPEC CPU 2006 quite a bit (though technically
it does not include C++11 benchmarks).


[Bug c++/52558] write introduction incorrect wrt the C++11 memory model

2012-03-12 Thread aldyh at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52558

Aldy Hernandez  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2012-03-12
 CC||aldyh at gcc dot gnu.org,
   ||amacleod at redhat dot com,
   ||rguenth at gcc dot gnu.org
 Ever Confirmed|0   |1

--- Comment #1 from Aldy Hernandez  2012-03-12 
14:52:57 UTC ---
Richi, is this something that should also be fixed for 4.7 as well?  There is a
write to g_2 that is introduced on paths that did not have it.  So this is not
just a load/load data race.