O2 Agressive Optimisation by GCC

2018-07-20 Thread Umesh Kalappa
Hi All ,

We are looking at the following C sample:

extern int i, j;

int test()
{
    while (1) {
        i++;
        j = 20;
    }
    return 0;
}

Command used (GCC 8.1.0):
gcc -S test.c -O2

The generated asm for x86:

.L2:
jmp .L2

We understand that the infinite loop is not deterministic, so the
compiler is free to treat it as UB and optimize aggressively, but we
need side effects such as j=20 to be left untouched by the optimizer.

Please note that using the volatile qualifier for i and j, or an empty
asm("") in the while loop, stops the optimizer, but we don't want
to do that.

Could anyone from the community share their insight into why the above
transformation is valid?

And, without using volatile or a memory barrier, how can we stop the
above transformation?


Thank you in advance.
~Umesh


Re: O2 Agressive Optimisation by GCC

2018-07-20 Thread Jakub Jelinek
On Fri, Jul 20, 2018 at 05:49:12PM +0530, Umesh Kalappa wrote:
> We are looking at the C sample i.e
> 
> extern int i,j;
> 
> int test()
> {
> while(1)
> {   i++;
> j=20;
> }
> return 0;
> }
> 
> command used :(gcc 8.1.0)
> gcc -S test.c -O2
> 
> the generated asm for x86
> 
> .L2:
> jmp .L2
> 
> we understand that,the infinite loop is not  deterministic ,compiler
> is free to treat as that as UB and do aggressive optimization ,but we
> need keep the side effects like j=20 untouched by optimization .

Don't invoke UB in your code and you won't be surprised; it is that
simple.  After you invoke UB, anything can happen.

Jakub


Re: O2 Agressive Optimisation by GCC

2018-07-20 Thread Martin Sebor

On 07/20/2018 06:19 AM, Umesh Kalappa wrote:

> Hi All ,
>
> We are looking at the C sample i.e
>
> extern int i,j;
>
> int test()
> {
> while(1)
> {   i++;
> j=20;
> }
> return 0;
> }
>
> command used :(gcc 8.1.0)
> gcc -S test.c -O2
>
> the generated asm for x86
>
> .L2:
> jmp .L2
>
> we understand that,the infinite loop is not  deterministic ,compiler
> is free to treat as that as UB and do aggressive optimization ,but we
> need keep the side effects like j=20 untouched by optimization .
>
> Please note that using the volatile qualifier for i and j  or empty
> asm("") in the while loop,will stop the optimizer ,but we don't want
> do  that.
>
> Anyone from the community ,please share their insights why above
> transformation is right ?


The loop isn't necessarily undefined (and compilers don't look
for undefined behavior as opportunities to optimize code), but
because it doesn't terminate it's not possible for a conforming
C program to detect the side-effects in its body.  The only way
to detect it is to examine the object code as you did.

Compilers are allowed (and expected) to transform source code
into efficient object code as long as the transformations don't
change the observable effects of the program.  That's just what
happens in this case.

Martin


Re: O2 Agressive Optimisation by GCC

2018-07-20 Thread Richard Biener
On July 20, 2018 7:59:10 PM GMT+02:00, Martin Sebor  wrote:
>On 07/20/2018 06:19 AM, Umesh Kalappa wrote:
>> Hi All ,
>>
>> We are looking at the C sample i.e
>>
>> extern int i,j;
>>
>> int test()
>> {
>> while(1)
>> {   i++;
>> j=20;
>> }
>> return 0;
>> }
>>
>> command used :(gcc 8.1.0)
>> gcc -S test.c -O2
>>
>> the generated asm for x86
>>
>> .L2:
>> jmp .L2
>>
>> we understand that,the infinite loop is not  deterministic ,compiler
>> is free to treat as that as UB and do aggressive optimization ,but we
>> need keep the side effects like j=20 untouched by optimization .
>>
>> Please note that using the volatile qualifier for i and j  or empty
>> asm("") in the while loop,will stop the optimizer ,but we don't want
>> do  that.
>>
>> Anyone from the community ,please share their insights why above
>> transformation is right ?
>
>The loop isn't necessarily undefined (and compilers don't look
>for undefined behavior as opportunities to optimize code), but

The variable i overflows.

>because it doesn't terminate it's not possible for a conforming
>C program to detect the side-effects in its body.  The only way
>to detect it is to examine the object code as you did.

I'm not sure we perform this kind of dead code elimination but yes, we could. 
Make i unsigned and check whether that changes behavior. 

>Compilers are allowed (and expected) to transform source code
>into efficient object code as long as the transformations don't
>change the observable effects of the program.  That's just what
>happens in this case.
>
>Martin
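The distinction behind "make i unsigned" can be sketched as follows. This is only an illustration, not code from the thread: the function names `next_unsigned` and `checked_next_int` are made up for the example. Unsigned arithmetic is defined to wrap around, whereas incrementing a signed int past INT_MAX is undefined, so a portable signed increment has to test for the limit first.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Unsigned arithmetic wraps modulo UINT_MAX + 1, so this is always defined. */
unsigned next_unsigned(unsigned x)
{
    return x + 1u;
}

/* Signed overflow is undefined behaviour, so check before incrementing.
   Returns 1 and stores x + 1 in *out on success; returns 0 and leaves
   *out untouched if x + 1 would overflow. */
int checked_next_int(int x, int *out)
{
    assert(out != NULL);
    if (x == INT_MAX)
        return 0;
    *out = x + 1;
    return 1;
}
```

As reported later in the thread, switching i to unsigned removes the overflow but does not bring the stores back, because the loop still never terminates.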



Re: O2 Agressive Optimisation by GCC

2018-07-20 Thread Martin Sebor

On 07/20/2018 12:17 PM, Richard Biener wrote:

> On July 20, 2018 7:59:10 PM GMT+02:00, Martin Sebor wrote:
>> On 07/20/2018 06:19 AM, Umesh Kalappa wrote:
>>> Hi All ,
>>>
>>> We are looking at the C sample i.e
>>>
>>> extern int i,j;
>>>
>>> int test()
>>> {
>>> while(1)
>>> {   i++;
>>> j=20;
>>> }
>>> return 0;
>>> }
>>>
>>> command used :(gcc 8.1.0)
>>> gcc -S test.c -O2
>>>
>>> the generated asm for x86
>>>
>>> .L2:
>>> jmp .L2
>>>
>>> we understand that,the infinite loop is not  deterministic ,compiler
>>> is free to treat as that as UB and do aggressive optimization ,but we
>>> need keep the side effects like j=20 untouched by optimization .
>>>
>>> Please note that using the volatile qualifier for i and j  or empty
>>> asm("") in the while loop,will stop the optimizer ,but we don't want
>>> do  that.
>>>
>>> Anyone from the community ,please share their insights why above
>>> transformation is right ?
>>
>> The loop isn't necessarily undefined (and compilers don't look
>> for undefined behavior as opportunities to optimize code), but
>
> The variable i overflows.


Good point!

It doesn't change the answer or the behavior of any compiler
I tested (although ICC and Oracle cc both emit the assignment
as well as the increment regardless of whether the variables
are signed).  I don't think it should change it either.

Going further, and as much value as I put on diagnosing bugs,
I also wouldn't see it as helpful to diagnose this kind of
eliminated undefined behavior (so long as the result of
the overflow wasn't used).  What might be helpful, though,
is diagnosing the infinite loop similarly to IBM xlc and
Oracle cc.  Maybe not in the constant case, but in the
non-constant cases it might help catch bugs.

Martin




>> because it doesn't terminate it's not possible for a conforming
>> C program to detect the side-effects in its body.  The only way
>> to detect it is to examine the object code as you did.
>
> I'm not sure we perform this kind of dead code elimination but yes, we could.
> Make i unsigned and check whether that changes behavior.
>
>> Compilers are allowed (and expected) to transform source code
>> into efficient object code as long as the transformations don't
>> change the observable effects of the program.  That's just what
>> happens in this case.
>>
>> Martin






Re: O2 Agressive Optimisation by GCC

2018-07-20 Thread Allan Sandfeld Jensen
On Freitag, 20. Juli 2018 14:19:12 CEST Umesh Kalappa wrote:
> Hi All ,
> 
> We are looking at the C sample i.e
> 
> extern int i,j;
> 
> int test()
> {
> while(1)
> {   i++;
> j=20;
> }
> return 0;
> }
> 
> command used :(gcc 8.1.0)
> gcc -S test.c -O2
> 
> the generated asm for x86
> 
> .L2:
> jmp .L2
> 
> we understand that,the infinite loop is not  deterministic ,compiler
> is free to treat as that as UB and do aggressive optimization ,but we
> need keep the side effects like j=20 untouched by optimization .
> 
> Please note that using the volatile qualifier for i and j  or empty
> asm("") in the while loop,will stop the optimizer ,but we don't want
> do  that.
> 
But you need to do that! If you want changes to a variable to be observable in
another thread, you need to use either volatile, atomics, or some kind of
memory barrier, implicit or explicit. The same would hold if the loop weren't
infinite: the compiler would keep the value in a register during the loop and
only write it to memory on exiting the test() function.

'Allan
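For reference, the explicit barrier mentioned above could look like the sketch below. It assumes GCC or Clang (the empty asm with a "memory" clobber is a compiler extension), and the names `COMPILER_BARRIER` and `test_with_barrier` are invented for the example. The loop is bounded so that the function can return and be checked. Note that this is a compiler barrier only: it emits no fence instruction and does not make a data race well defined in ISO C.

```c
#include <assert.h>

int i, j;   /* plain globals, as in the original sample */

/* GCC/Clang extension: an empty asm statement with a "memory" clobber.
   The compiler must assume memory is read and written here, so it
   cannot keep i and j only in registers across this point. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

/* Bounded variant of test(): the stores to i and j have to be
   completed before each barrier, i.e. once per iteration. */
int test_with_barrier(int iterations)
{
    assert(iterations >= 0);
    for (int k = 0; k < iterations; k++) {
        i++;
        j = 20;
        COMPILER_BARRIER();
    }
    return 0;
}
```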




Re: O2 Agressive Optimisation by GCC

2018-07-20 Thread Jonathan Wakely
On Fri, 20 Jul 2018 at 23:06, Allan Sandfeld Jensen wrote:
>
> On Freitag, 20. Juli 2018 14:19:12 CEST Umesh Kalappa wrote:
> > Hi All ,
> >
> > We are looking at the C sample i.e
> >
> > extern int i,j;
> >
> > int test()
> > {
> > while(1)
> > {   i++;
> > j=20;
> > }
> > return 0;
> > }
> >
> > command used :(gcc 8.1.0)
> > gcc -S test.c -O2
> >
> > the generated asm for x86
> >
> > .L2:
> > jmp .L2
> >
> > we understand that,the infinite loop is not  deterministic ,compiler
> > is free to treat as that as UB and do aggressive optimization ,but we
> > need keep the side effects like j=20 untouched by optimization .
> >
> > Please note that using the volatile qualifier for i and j  or empty
> > asm("") in the while loop,will stop the optimizer ,but we don't want
> > do  that.
> >
> But you need to do that! If you want changes to a variable to be observable in
> another thread, you need to use either volatile,

No, volatile doesn't work for that.

http://www.isvolatileusefulwiththreads.in/C/

> atomic, or some kind of
> memory barrier implicit or explicit. This is the same if the loop wasn't
> infinite, the compiler would keep the value in register during the loop and
> only write it to memory on exiting the test() function.
>
> 'Allan
>
>


Re: O2 Agressive Optimisation by GCC

2018-07-20 Thread Allan Sandfeld Jensen
On Samstag, 21. Juli 2018 00:21:48 CEST Jonathan Wakely wrote:
> On Fri, 20 Jul 2018 at 23:06, Allan Sandfeld Jensen wrote:
> > On Freitag, 20. Juli 2018 14:19:12 CEST Umesh Kalappa wrote:
> > > Hi All ,
> > > 
> > > We are looking at the C sample i.e
> > > 
> > > extern int i,j;
> > > 
> > > int test()
> > > {
> > > while(1)
> > > {   i++;
> > > 
> > > j=20;
> > > 
> > > }
> > > return 0;
> > > }
> > > 
> > > command used :(gcc 8.1.0)
> > > gcc -S test.c -O2
> > > 
> > > the generated asm for x86
> > > 
> > > .L2:
> > > jmp .L2
> > > 
> > > we understand that,the infinite loop is not  deterministic ,compiler
> > > is free to treat as that as UB and do aggressive optimization ,but we
> > > need keep the side effects like j=20 untouched by optimization .
> > > 
> > > Please note that using the volatile qualifier for i and j  or empty
> > > asm("") in the while loop,will stop the optimizer ,but we don't want
> > > do  that.
> > 
> > But you need to do that! If you want changes to a variable to be
> > observable in another thread, you need to use either volatile,
> 
> No, volatile doesn't work for that.
> 
It does, but you shouldn't use it for that, for many other reasons (though the
Linux kernel still does). But if the guy wants to code primitives without using
system calls or atomics, he might as well go traditional.

'Allan




Re: O2 Agressive Optimisation by GCC

2018-07-22 Thread Umesh Kalappa
Allan,
> he might as well go traditional

Do you mean using locks?

Thank you
~Umesh

On Sat, Jul 21, 2018 at 4:20 AM, Allan Sandfeld Jensen
 wrote:
> On Samstag, 21. Juli 2018 00:21:48 CEST Jonathan Wakely wrote:
>> On Fri, 20 Jul 2018 at 23:06, Allan Sandfeld Jensen wrote:
>> > On Freitag, 20. Juli 2018 14:19:12 CEST Umesh Kalappa wrote:
>> > > Hi All ,
>> > >
>> > > We are looking at the C sample i.e
>> > >
>> > > extern int i,j;
>> > >
>> > > int test()
>> > > {
>> > > while(1)
>> > > {   i++;
>> > >
>> > > j=20;
>> > >
>> > > }
>> > > return 0;
>> > > }
>> > >
>> > > command used :(gcc 8.1.0)
>> > > gcc -S test.c -O2
>> > >
>> > > the generated asm for x86
>> > >
>> > > .L2:
>> > > jmp .L2
>> > >
>> > > we understand that,the infinite loop is not  deterministic ,compiler
>> > > is free to treat as that as UB and do aggressive optimization ,but we
>> > > need keep the side effects like j=20 untouched by optimization .
>> > >
>> > > Please note that using the volatile qualifier for i and j  or empty
>> > > asm("") in the while loop,will stop the optimizer ,but we don't want
>> > > do  that.
>> >
>> > But you need to do that! If you want changes to a variable to be
>> > observable in another thread, you need to use either volatile,
>>
>> No, volatile doesn't work for that.
>>
> It does, but you shouldn't use for that due to many other reasons (though the
> linux kernel still does) But if the guy wants to code primitive without using
> system calls or atomics, he might as well go traditional
>
> 'Allan
>
>


Re: O2 Agressive Optimisation by GCC

2018-07-22 Thread Umesh Kalappa
Hi Richard,

Making i unsigned doesn't help; the optimization still takes effect.
And yes, test() is a thread routine, and since i and j are global,
we need side effects such as the assignment to take place so that
they are observed by other threads.

Making the variables volatile, or using thread-safe or atomic operations,
inhibits the optimization, but we still don't understand why it is a valid
optimization for UB. We also tried -fno-strict-overflow, with no luck.

Jakub, or anyone: can we inhibit this kind of optimization, which
assumes UB and optimizes on it?

Thank you
~Umesh

On Fri, Jul 20, 2018 at 11:47 PM, Richard Biener
 wrote:
> On July 20, 2018 7:59:10 PM GMT+02:00, Martin Sebor  wrote:
>>On 07/20/2018 06:19 AM, Umesh Kalappa wrote:
>>> Hi All ,
>>>
>>> We are looking at the C sample i.e
>>>
>>> extern int i,j;
>>>
>>> int test()
>>> {
>>> while(1)
>>> {   i++;
>>> j=20;
>>> }
>>> return 0;
>>> }
>>>
>>> command used :(gcc 8.1.0)
>>> gcc -S test.c -O2
>>>
>>> the generated asm for x86
>>>
>>> .L2:
>>> jmp .L2
>>>
>>> we understand that,the infinite loop is not  deterministic ,compiler
>>> is free to treat as that as UB and do aggressive optimization ,but we
>>> need keep the side effects like j=20 untouched by optimization .
>>>
>>> Please note that using the volatile qualifier for i and j  or empty
>>> asm("") in the while loop,will stop the optimizer ,but we don't want
>>> do  that.
>>>
>>> Anyone from the community ,please share their insights why above
>>> transformation is right ?
>>
>>The loop isn't necessarily undefined (and compilers don't look
>>for undefined behavior as opportunities to optimize code), but
>
> The variable i overflows.
>
>>because it doesn't terminate it's not possible for a conforming
>>C program to detect the side-effects in its body.  The only way
>>to detect it is to examine the object code as you did.
>
> I'm not sure we perform this kind of dead code elimination but yes, we could. 
> Make i unsigned and check whether that changes behavior.
>
>>Compilers are allowed (and expected) to transform source code
>>into efficient object code as long as the transformations don't
>>change the observable effects of the program.  That's just what
>>happens in this case.
>>
>>Martin
>


Re: O2 Agressive Optimisation by GCC

2018-07-22 Thread Allan Sandfeld Jensen
On Sonntag, 22. Juli 2018 17:01:29 CEST Umesh Kalappa wrote:
> Allan ,
> 
> >>he might as well go traditional
> 
> you mean using the locks ?
> 

No, I meant relying on undefined behavior. In your case I would recommend
using modern atomics, which are defined behavior, modern, and fast. I was
just reminded of all the nasty and theoretically wrong ways we used to do
stuff like that 20 years ago to implement fallback locks: for instance, using
-O0, asm declarations, relying on non-inlined function calls as memory
barriers, etc. All stuff that "worked", but relied on various degrees of
undefined behavior.

Still, if you are curious, it might be fun to play with stuff like that and
try to figure out for yourself why it works; just remember it is undefined
behavior and therefore not recommended.

'Allan
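A minimal sketch of the "modern atomics" route, using C11 <stdatomic.h>. Everything named here (`counter`, `flag_value`, `stop_requested`, `worker_loop`, `observe`) is hypothetical and only stands in for i, j and test() from the original sample. The loop takes an iteration cap and a stop flag so that it can terminate; the thread creation itself is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

atomic_uint counter;         /* plays the role of i; unsigned, so wraparound is defined */
atomic_int  flag_value;      /* plays the role of j */
atomic_bool stop_requested;  /* lets another thread end the loop */

/* Body a worker thread would run. Returns the number of iterations done. */
unsigned long worker_loop(unsigned long max_iterations)
{
    unsigned long done = 0;
    while (done < max_iterations &&
           !atomic_load_explicit(&stop_requested, memory_order_acquire)) {
        atomic_fetch_add_explicit(&counter, 1u, memory_order_relaxed);
        atomic_store_explicit(&flag_value, 20, memory_order_release);
        done++;
    }
    return done;
}

/* What an observing thread may rely on: once it reads 20 with an acquire
   load, the increment that preceded the release store is visible too
   (assuming the counter has not wrapped back to 0). */
void observe(void)
{
    if (atomic_load_explicit(&flag_value, memory_order_acquire) == 20)
        assert(atomic_load_explicit(&counter, memory_order_relaxed) >= 1u);
}
```

Because the accesses are atomic, the compiler has to assume other threads can observe them and cannot drop the stores, without any optimisation flags being changed.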




Re: O2 Agressive Optimisation by GCC

2018-07-23 Thread David Brown
Hi,

This has nothing to do with undefined behaviour; it is a matter of
scheduling of effects that are visible in different circumstances.  In
particular, i and j are declared in a way that tells the compiler that
the compiler, in its current thread of execution, has full control of
them.  The compiler knows that while it is executing the code in test,
nothing else can affect the value of i or j, nor can they be affected by
the values of i and j.  The compiler knows that code from elsewhere may
read or write them, but only before test() is called, during functions
called from test(), or after test() returns.  It knows for sure that
there are no other threads of execution that interact via i and j.

So how do you inhibit these kinds of optimisations?  Stop lying to your
compiler.

If you want them to be visible in other threads, tell your compiler that
they are visible in other threads.  You already know how to do that -
using volatile accesses, atomic accesses, other  features,
or operating system features.  You can also use somewhat "hack"
techniques such as Linux's "ACCESS_ONCE" macro or inline assembly
dependency controls, but it would be better to define and declare the
data correctly.

Messing around with optimisation settings is just a way of hiding your
coding and design errors until they get more subtle and harder to spot
in the future.

mvh.,

David
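The "volatile access" technique mentioned above can be sketched like this. The macro name `VOLATILE_ACCESS` is made up for the example; it is modelled on the cast-to-volatile idiom behind Linux's ACCESS_ONCE and relies on `__typeof__`, a GCC/Clang extension. It forces each access to be performed as written, but, as discussed elsewhere in this thread, volatile gives neither atomicity nor inter-thread ordering in ISO C.

```c
#include <assert.h>

int i, j;   /* declared as plain ints; only the accesses are volatile */

/* Hypothetical helper: view an object through a volatile-qualified lvalue
   so that the compiler must emit this particular load or store. */
#define VOLATILE_ACCESS(x) (*(volatile __typeof__(x) *)&(x))

/* Bounded variant of test() so that it can return and be checked. */
int test_volatile_access(int iterations)
{
    assert(iterations >= 0);
    for (int k = 0; k < iterations; k++) {
        VOLATILE_ACCESS(i) = VOLATILE_ACCESS(i) + 1;
        VOLATILE_ACCESS(j) = 20;
    }
    return 0;
}
```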


On 22/07/18 17:00, Umesh Kalappa wrote:
> Hi Richard,
> 
> making i unsigned still  the  optimization is effective ,no luck.
> and yes test() is the threaded  routine and since i and j are global
> ,we need the side effects take place like assignment etc ,that are
> observed by other threads .
> 
> By making volatile or thread safe or atomic operations ,the
> optimization inhibited ,but still we  didn't  get  why its valid
> optimization for UB and tried with -fno-strict-overflow too ,no luck
> here .
> 
> Jakub and anyone can we inhibit these kind optimizations,that consider
> the UB and optimize .
> 
> Thank you
> ~Umesh
> 
> On Fri, Jul 20, 2018 at 11:47 PM, Richard Biener
>  wrote:
>> On July 20, 2018 7:59:10 PM GMT+02:00, Martin Sebor  wrote:
>>> On 07/20/2018 06:19 AM, Umesh Kalappa wrote:
>>>> Hi All ,
>>>>
>>>> We are looking at the C sample i.e
>>>>
>>>> extern int i,j;
>>>>
>>>> int test()
>>>> {
>>>> while(1)
>>>> {   i++;
>>>> j=20;
>>>> }
>>>> return 0;
>>>> }
>>>>
>>>> command used :(gcc 8.1.0)
>>>> gcc -S test.c -O2
>>>>
>>>> the generated asm for x86
>>>>
>>>> .L2:
>>>> jmp .L2
>>>>
>>>> we understand that,the infinite loop is not  deterministic ,compiler
>>>> is free to treat as that as UB and do aggressive optimization ,but we
>>>> need keep the side effects like j=20 untouched by optimization .
>>>>
>>>> Please note that using the volatile qualifier for i and j  or empty
>>>> asm("") in the while loop,will stop the optimizer ,but we don't want
>>>> do  that.
>>>>
>>>> Anyone from the community ,please share their insights why above
>>>> transformation is right ?
>>>
>>> The loop isn't necessarily undefined (and compilers don't look
>>> for undefined behavior as opportunities to optimize code), but
>>
>> The variable i overflows.
>>
>>> because it doesn't terminate it's not possible for a conforming
>>> C program to detect the side-effects in its body.  The only way
>>> to detect it is to examine the object code as you did.
>>
>> I'm not sure we perform this kind of dead code elimination but yes, we 
>> could. Make i unsigned and check whether that changes behavior.
>>
>>> Compilers are allowed (and expected) to transform source code
>>> into efficient object code as long as the transformations don't
>>> change the observable effects of the program.  That's just what
>>> happens in this case.
>>>
>>> Martin
>>
> 



Re: O2 Agressive Optimisation by GCC

2018-07-23 Thread Segher Boessenkool
Hi!

On Mon, Jul 23, 2018 at 12:36:50PM +0200, David Brown wrote:
> This is nothing to do with undefined behaviour, but a matter of
> scheduling of effects that are visible in different circumstances.  In
> particular, i and j are declared in a way that tells the compiler that
> the compiler, in its current thread of execution has full control of
> them.  The compiler knows that while it is executing the code in test,
> nothing else can affect the value of i or j, nor can they be affected by
> the values of i and j.  The compiler knows that code from elsewhere may
> read or write them, but only before test() is called, during functions
> called from test(), or after test() returns.  It knows for sure that
> there are no other threads of execution that interact via i and j.

It could in theory know that, yes, but in this case it just hoists the
assignment to after the loop.  And it's an infinite loop, so it just
disappears.

> So how do you inhibit these kinds of optimisations?  Stop lying to your
> compiler.

Yup :-)


Segher
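The effect Segher describes can be illustrated on a bounded loop. This is a hand-written approximation, not actual GCC output, and the names `loop_as_written` and `loop_as_optimized` are invented: the optimizer may keep the values in registers during the loop and perform the stores once after it. Both functions below leave i and j in the same state on return, which is all a single-threaded caller can observe.

```c
#include <assert.h>
#include <limits.h>

int i, j;

/* Source form: the stores appear inside the loop. */
void loop_as_written(unsigned n)
{
    for (unsigned k = 0; k < n; k++) {
        i++;
        j = 20;
    }
}

/* Roughly what an optimizer is allowed to produce instead:
   one update of i and one store to j, placed after the loop. */
void loop_as_optimized(unsigned n)
{
    assert(n <= (unsigned)INT_MAX);
    if (n > 0) {
        i += (int)n;
        j = 20;
    }
}
```

With `while(1)` there is no point after the loop, so stores moved there are never executed and only the jump remains.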