[Bug rtl-optimization/82677] Many projects (linux, coreutils, GMP, gcrypt, openSSL, etc) are misusing asm(divq/divl) etc, potentially resulting in faulty/unintended optimisations

2017-11-08 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677

--- Comment #13 from rguenther at suse dot de  ---
On November 8, 2017 5:04:39 PM GMT+01:00, "nisse at lysator dot liu.se"
 wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677
>
>--- Comment #12 from Niels Möller  ---
>It would be nice with some way to annotate the asm to have it treated
>in the
>same as a possibly trapping division or pointer dereference.
>
>E.g., adding "trap" to the clobber list, somewhat similar to "memory".
>This
>would be a weaker constraint than volatile, since it would allow the
>compiler
>to reorder the asm relative to other instructions, including loads and
>stores,
>but it can't move it out of a conditional. 
>
>Do you think that's a reasonable addition?

Yes. I think something besides volatile would be useful.

[Bug rtl-optimization/82677] Many projects (linux, coreutils, GMP, gcrypt, openSSL, etc) are misusing asm(divq/divl) etc, potentially resulting in faulty/unintended optimisations

2017-11-08 Thread nisse at lysator dot liu.se
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677

--- Comment #12 from Niels Möller  ---
It would be nice with some way to annotate the asm to have it treated in the
same as a possibly trapping division or pointer dereference.

E.g., adding "trap" to the clobber list, somewhat similar to "memory". This
would be a weaker constraint than volatile, since it would allow the compiler
to reorder the asm relative to other instructions, including loads and stores,
but it can't move it out of a conditional. 

Do you think that's a reasonable addition?

[Bug rtl-optimization/82677] Many projects (linux, coreutils, GMP, gcrypt, openSSL, etc) are misusing asm(divq/divl) etc, potentially resulting in faulty/unintended optimisations

2017-10-26 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677

--- Comment #11 from rguenther at suse dot de  ---
On Thu, 26 Oct 2017, nisse at lysator dot liu.se wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677
> 
> Niels Möller  changed:
> 
>What|Removed |Added
> 
>  CC||nisse at lysator dot liu.se
> 
> --- Comment #10 from Niels Möller  ---
> Out of curiosity, how is this handled for division instructions generated by
> gcc, with no __asm__ involved? E.g., consider
> 
> int foo(int d) {
>   int r = 1234567;
>   if (d != 0)
> r = r / d;
>   return r;
> }
> 
> On an architecture where the div instruction doesn't raise any exception on
> divide by zero, this function could be compiled to a division instruction +
> conditional move, without any branch instruction. Right?
> 
> But on most architectures, that optimization would be invalid, and the 
> compiler
> must somehow know that. Is that a property on the representation of division
> expression? Or is it tied to some property of the instruction pattern for the
> divide instruction?
> 
> My question really is: What would it take to mark an __asm__ expression so 
> that
> it's treated in the same way as a plain C division?

The division is possibly trapping (due to special value of zero) and thus
is never hoisted out of a conditional region.

Similar to a pointer dereference where the pointer may be zero btw.

[Bug rtl-optimization/82677] Many projects (linux, coreutils, GMP, gcrypt, openSSL, etc) are misusing asm(divq/divl) etc, potentially resulting in faulty/unintended optimisations

2017-10-26 Thread nisse at lysator dot liu.se
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677

Niels Möller  changed:

   What|Removed |Added

 CC||nisse at lysator dot liu.se

--- Comment #10 from Niels Möller  ---
Out of curiosity, how is this handled for division instructions generated by
gcc, with no __asm__ involved? E.g., consider

int foo(int d) {
  int r = 1234567;
  if (d != 0)
r = r / d;
  return r;
}

On an architecture where the div instruction doesn't raise any exception on
divide by zero, this function could be compiled to a division instruction +
conditional move, without any branch instruction. Right?

But on most architectures, that optimization would be invalid, and the compiler
must somehow know that. Is that a property on the representation of division
expression? Or is it tied to some property of the instruction pattern for the
divide instruction?

My question really is: What would it take to mark an __asm__ expression so that
it's treated in the same way as a plain C division?

[Bug rtl-optimization/82677] Many projects (linux, coreutils, GMP, gcrypt, openSSL, etc) are misusing asm(divq/divl) etc, potentially resulting in faulty/unintended optimisations

2017-10-24 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677

--- Comment #9 from infinity0 at pwned dot gg ---
(In reply to rguent...@suse.de from comment #7)
> [..]
> 
> You still have to mark stmts with side-effects as volatile.
> 
> Conditional side-effects are tricky to get correct of course.

I think with the current asm() syntax and semantics, the only way to get this
correct in the C source code, is by using __volatile__. It is not possible to
even *express* "conditional side-effects" using the current syntax. To
demonstrate:

The "explicit zero check" that I mentioned in the first post does not actually
work, see the newer udiv.c test that I just attached. I think that is correct
behaviour from GCC as well.

AIUI, asm() without volatile, says to GCC: "this code will *never* cause side
effects under *any circumstance*, it depends only on its declared
inputs/outputs/clobbers". Under this assumption, it is correct to optimise

|   if(d) { asm(d); ... } ...

into

|   asm(d); if(d) { ... }

as long as the outputs to asm(d) don't clobber the inputs to if(d) or any
else-branches - I assume other parts of the optimiser will already check that.

However, it is *not* correct to perform the above, when the asm(d) has
conditional side-effects depending on d, that the if(d) checks for. But there
is no way to express this conditional-side-effect using the current asm()
syntax. So GCC will happily go ahead and perform the optimisation, since it
believes asm(d) will never have any side effects.

So, the pre-optimised code "looks correct", it's explicitly checking d, but
will in fact cause the types of unintended optimisations that we're looking at
here. The only way to avoid this is to use "volatile".

In conclusion: whilst making the optimiser more strict for this case as you
suggested, would fix this specific instance, I believe that it is not "the
proper fix" for GCC. A proper fix would have to involve changing the
*semantics* of a asm(div) call so that GCC understands that it has "side
effects depending on the divisor and n1[1]". Otherwise, more sophisticated
optimisations added to GCC in the future might make this issue come back.

If that is not done, then it would be necessary to fix these 20 projects's C
source code to properly express __volatile__. Actually this is probably
necessary regardless of any GCC fixes - from some quick searches it seems that
asm() semantics is not standardised, and also these projects might want to
support older GCC that has the existing semantics.

[1] it can also raise DE when n1 >= divisor.

[Bug rtl-optimization/82677] Many projects (linux, coreutils, GMP, gcrypt, openSSL, etc) are misusing asm(divq/divl) etc, potentially resulting in faulty/unintended optimisations

2017-10-24 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677

infinity0 at pwned dot gg changed:

   What|Removed |Added

  Attachment #42439|0   |1
is obsolete||
  Attachment #42440|0   |1
is obsolete||

--- Comment #8 from infinity0 at pwned dot gg ---
Created attachment 42461
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42461=edit
More specific and thorough test case

[Bug rtl-optimization/82677] Many projects (linux, coreutils, GMP, gcrypt, openSSL, etc) are misusing asm(divq/divl) etc, potentially resulting in faulty/unintended optimisations

2017-10-23 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677

--- Comment #7 from rguenther at suse dot de  ---
On Mon, 23 Oct 2017, infinity0 at pwned dot gg wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677
> 
> --- Comment #6 from infinity0 at pwned dot gg ---
> What I mean is, even if you do change GCC to fix the unintended optimisation,
> other projects' code are *still wrong* - it's only correct if you can assume
> the C compiler is optimising your code in a very specific way, that happens to
> safe for the specific sorts of "side effects" that div instructions do.

You still have to mark stmts with side-effects as volatile.

Conditional side-effects are tricky to get correct of course.

[Bug rtl-optimization/82677] Many projects (linux, coreutils, GMP, gcrypt, openSSL, etc) are misusing asm(divq/divl) etc, potentially resulting in faulty/unintended optimisations

2017-10-23 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677

--- Comment #6 from infinity0 at pwned dot gg ---
What I mean is, even if you do change GCC to fix the unintended optimisation,
other projects' code are *still wrong* - it's only correct if you can assume
the C compiler is optimising your code in a very specific way, that happens to
safe for the specific sorts of "side effects" that div instructions do.

[Bug rtl-optimization/82677] Many projects (linux, coreutils, GMP, gcrypt, openSSL, etc) are misusing asm(divq/divl) etc, potentially resulting in faulty/unintended optimisations

2017-10-23 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677

--- Comment #5 from infinity0 at pwned dot gg ---
(In reply to Richard Biener from comment #4)
> [..]
> It's still safe to move the asm in
> 
> int main() {
>   ulong d = 0;
>   for (ulong i = 0; i < 3; i++)
> for (ulong j = 0; j < 3; j++)
> {
>   ulong r;
>   __asm__("" : "=r"(d) : "rm"((ulong)0));
>   udiv_qrnnd(q, r, 0, 0, (i << d));
> }
> }
> 
> thus without the if ().

For the specific case of asm(div) I suppose it's safe because raising a DE is
pretty much similar to raising it several times in a loop, but it is correct to
assume that for all types of side effects?

[Bug rtl-optimization/82677] Many projects (linux, coreutils, GMP, gcrypt, openSSL, etc) are misusing asm(divq/divl) etc, potentially resulting in faulty/unintended optimisations

2017-10-23 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677

Richard Biener  changed:

   What|Removed |Added

   Keywords||wrong-code
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2017-10-23
 CC||rguenth at gcc dot gnu.org
  Component|inline-asm  |rtl-optimization
 Ever confirmed|0   |1

--- Comment #4 from Richard Biener  ---
This is RTL invariant motion hoisting the non-volatile asm out of the loop even
though it is _not_ unconditionally executed.

I think it is against the spirit of asm()s to be treated this way.  We're
handling possible NULL pointer dereferences correctly (just not in asm()s)
as well.

So I think LIM needs to be conservative with not always executed asm()s.

It's still safe to move the asm in

int main() {
  ulong d = 0;
  for (ulong i = 0; i < 3; i++)
for (ulong j = 0; j < 3; j++)
{
  ulong r;
  __asm__("" : "=r"(d) : "rm"((ulong)0));
  udiv_qrnnd(q, r, 0, 0, (i << d));
}
}

thus without the if ().