Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jakub Jelinek
On Sun, Nov 14, 2010 at 07:21:50PM -0800, Linus Torvalds wrote:
 So when Richard Gunther says a memory clobber doesn't cover automatic
 storage, to me that very clearly spells gcc is buggy as hell.
 Because automatic storage with its address taken _very_ much gets
 clobbered by things like memset etc. If the compiler doesn't
 understand that, the compiler is just broken.

I'll leave the discussion about meaning of memory clobber aside to
Richard,

 And now, if even the (superfluous) +m isn't working, it sounds like
 we have no sane options left. Except to say that gcc-4.5.1 is totally

just to say that of course there are sane options left.

:=a(rc), +m (*regs)
:a(regs)
:%ebx, %ecx, %edx, %esi, %edi, memory);

is simply too high register pressure for i386 if you force also
-fno-omit-frame-pointer, there is not a single register left.

Yes, reload should figure out it has address of regs already tied to %eax,
unfortunately starting with IRA it doesn't (I'll file a GCC bug about that;
so that leaves 4.4/4.5/4.6 currently not being able to compile it).

That said, changing the inline asm to just clobber one less register
would be completely sufficient to make it work well with all gccs out there,
just push/pop one of the register around the whole body.  I doubt calling
out SMM BIOS is actually so performance critical that one push and one pop
would ruin it.  Of course x86_64 version can stay as is, there are enough
registers left...

Jakub


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Andi Kleen
 That said, changing the inline asm to just clobber one less register
 would be completely sufficient to make it work well with all gccs out there,
 just push/pop one of the register around the whole body.  I doubt calling
 out SMM BIOS is actually so performance critical that one push and one pop
 would ruin it.  Of course x86_64 version can stay as is, there are enough
 registers left...

Yes traditionally clobbering all registers has been dangerous
and it clearly can be done inside the asm too.

Here's a untested patch to do some manual push/pops too. Could someone with
the hardware please test it? (running a 32bit kernel)

-Andi

---

i8k: Clobber less registers

gcc doesn't like inline assembler statements that clobber nearly
all registers. Save a few registers manually on i386 to avoid this
problem.

Fix suggested by Jakub Jelinek

Signed-off-by: Andi Kleen a...@linux.intel.com

diff --git a/drivers/char/i8k.c b/drivers/char/i8k.c
index f0863be..a2da38b 100644
--- a/drivers/char/i8k.c
+++ b/drivers/char/i8k.c
@@ -146,7 +146,10 @@ static int i8k_smm(struct smm_regs *regs)
:a(regs)
:%ebx, %ecx, %edx, %esi, %edi, memory);
 #else
-   asm(pushl %%eax\n\t
+   asm(pushl %%ebx\n\t
+   pushl %%ecx\n\t
+   pushl %%edx\n\t
+   pushl %%eax\n\t
movl 0(%%eax),%%edx\n\t
push %%edx\n\t
movl 4(%%eax),%%ebx\n\t
@@ -167,10 +170,13 @@ static int i8k_smm(struct smm_regs *regs)
movl %%edx,0(%%eax)\n\t
lahf\n\t
shrl $8,%%eax\n\t
-   andl $1,%%eax\n
+   andl $1,%%eax\n\t
+   popl %%edx\n\t
+   popl %%ecx\n\t
+   popl %%ebx\n
:=a(rc), +m (*regs)
:a(regs)
-   :%ebx, %ecx, %edx, %esi, %edi, memory);
+   :%esi, %edi, memory);
 #endif
if (rc != 0 || (regs-eax  0x) == 0x || regs-eax == eax)
return -EINVAL;


-- 
a...@linux.intel.com -- Speaking for myself only.


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jakub Jelinek
On Mon, Nov 15, 2010 at 09:56:05AM +0100, Jakub Jelinek wrote:
 Yes, reload should figure out it has address of regs already tied to %eax,
 unfortunately starting with IRA it doesn't (I'll file a GCC bug about that;

http://gcc.gnu.org/PR46479

Jakub


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jakub Jelinek
On Mon, Nov 15, 2010 at 09:56:05AM +0100, Jakub Jelinek wrote:
 On Sun, Nov 14, 2010 at 07:21:50PM -0800, Linus Torvalds wrote:
  So when Richard Gunther says a memory clobber doesn't cover automatic
  storage, to me that very clearly spells gcc is buggy as hell.
  Because automatic storage with its address taken _very_ much gets
  clobbered by things like memset etc. If the compiler doesn't
  understand that, the compiler is just broken.
 
 I'll leave the discussion about meaning of memory clobber aside to
 Richard,

And for this the starting point should be what has been requested,
i.e. preprocessed source + gcc options + gcc version and some hints what
actually misbehaves (with the , +m (*regs) change reverted)
in gcc bugzilla.  Only with that we can actually look at what has been
happening, see whether it is the tree optimizations or RTL and which one
makes a difference.
If I've missed a PR about this I apologize. 

Jakub


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Richard Guenther
On Mon, Nov 15, 2010 at 9:56 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Sun, Nov 14, 2010 at 07:21:50PM -0800, Linus Torvalds wrote:
 So when Richard Gunther says a memory clobber doesn't cover automatic
 storage, to me that very clearly spells gcc is buggy as hell.
 Because automatic storage with its address taken _very_ much gets
 clobbered by things like memset etc. If the compiler doesn't
 understand that, the compiler is just broken.

 I'll leave the discussion about meaning of memory clobber aside to
 Richard,

Of course GCC handles memset just fine.  Note that I was refering
to non-address taken automatic storage for memory (even though
when double-checking the current implementation GCC even thinks
that all address-taken memory is clobbered by asms as soon as
they have at least one memory operand or a memory clobber).

It's just that in future we might want to improve this and I think
not covering non-address taken automatic storage for memory
is sensible.  And I see that you don't see address-taken automatic
storage as a sensible choice to exclude from memory, and I
have noted that.

Btw, I still haven't seen an testcase for the actual problem we are
talking about.

Richard.


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Andi Kleen
 And for this the starting point should be what has been requested,
 i.e. preprocessed source + gcc options + gcc version and some hints what
 actually misbehaves (with the , +m (*regs) change reverted)
 in gcc bugzilla.  Only with that we can actually look at what has been
 happening, see whether it is the tree optimizations or RTL and which one
 makes a difference.
 If I've missed a PR about this I apologize. 

I tried to file one, but I can't reproduce it currently
(I don't have hardware, so have to rely on code reading and the 32bit
code looks correct to me even without the additional +m)

The preprocessed source is at
http://halobates.de/tmp/i8k.i

Options I used:

 -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs 
-fno-strict-aliasing -fno-common -Werror-implicit-function-declaration 
-Wno-format-security -fno-delete-null-pointer-checks -O2 -m32 -msoft-float 
-mregparm=3 -freg-struct-return -mpreferred-stack-boundary=2 -march=i686 
-mtune=pentium3 -mtune=generic -maccumulate-outgoing-args -Wa,-mtune=generic32 
-ffreestanding -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 
-DCONFIG_AS_CFI_SECTIONS=1 -pipe -Wno-sign-compare 
-fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow 
-Wframe-larger-than=2048 -fno-stack-protector -fno-omit-frame-pointer 
-fno-optimize-sibling-calls -pg -Wdeclaration-after-statement -Wno-pointer-sign 
-fno-strict-overflow -fconserve-stack

-Andi
-- 
a...@linux.intel.com -- Speaking for myself only.


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jakub Jelinek
On Mon, Nov 15, 2010 at 11:54:46AM +0100, Andi Kleen wrote:
  And for this the starting point should be what has been requested,
  i.e. preprocessed source + gcc options + gcc version and some hints what
  actually misbehaves (with the , +m (*regs) change reverted)
  in gcc bugzilla.  Only with that we can actually look at what has been
  happening, see whether it is the tree optimizations or RTL and which one
  makes a difference.
  If I've missed a PR about this I apologize. 
 
 I tried to file one, but I can't reproduce it currently
 (I don't have hardware, so have to rely on code reading and the 32bit
 code looks correct to me even without the additional +m)
 
 The preprocessed source is at
 http://halobates.de/tmp/i8k.i
 
 Options I used:
 
  -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs 
 -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration 
 -Wno-format-security -fno-delete-null-pointer-checks -O2 -m32 -msoft-float 
 -mregparm=3 -freg-struct-return -mpreferred-stack-boundary=2 -march=i686 
 -mtune=pentium3 -mtune=generic -maccumulate-outgoing-args 
 -Wa,-mtune=generic32 -ffreestanding -DCONFIG_AS_CFI=1 
 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -pipe 
 -Wno-sign-compare -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 
 -mno-3dnow -Wframe-larger-than=2048 -fno-stack-protector 
 -fno-omit-frame-pointer -fno-optimize-sibling-calls -pg 
 -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow 
 -fconserve-stack

Indeed, with this and 4.5.2 2010 (prerelease) from SVN as well as
gcc-4.5.1-5.fc14:

...
movl%eax, -16(%ebp) # regs, %sfp
movl(%eax), %eax# regs_2(D)-eax,
movl%eax, -20(%ebp) #, %sfp
movl-16(%ebp), %eax # %sfp,
#APP
# 149 /home/lsrc/git/linux-work2/drivers/char/i8k.c 1
...
#NO_APP
testl   %eax, %eax  #
movl$-22, %edx  #, D.18378
movl%eax, -24(%ebp) #, %sfp
je  .L7 #,
.L2:
movl-12(%ebp), %ebx #,
movl%edx, %eax  # D.18378,
movl-8(%ebp), %esi  #,
movl-4(%ebp), %edi  #,
movl%ebp, %esp  #,
popl%ebp#
ret
.p2align 4,,7
.p2align 3
.L7:
movl-16(%ebp), %eax # %sfp,
movl(%eax), %ecx# regs_2(D)-eax, D.18371
cmpw$-1, %cx#, D.18371
je  .L2 #,
cmpl%ecx, -20(%ebp) # D.18371, %sfp
cmovne  -24(%ebp), %edx # %sfp,, D.18378
jmp .L2 #
.size   i8k_smm, .-i8k_smm

I don't see any problems on the assembly level.  i8k_smm is
not inlined in this case and checks all 3 conditions.

Guess we need somebody who actually reported the problem, state what
gcc was actually used and post preprocessed source, gcc options
from his case.

Jakub


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Andi Kleen
 Guess we need somebody who actually reported the problem, state what
 gcc was actually used and post preprocessed source, gcc options
 from his case.

Jim Bos,
Can you please supply that? 

Please use 

rm drivers/char/i8k.o
make V=1 drivers/char/i8k.o
make drivers/char/i8k.i  

and supply the .i file and the output of the first make line

Thanks,
-Andi

-- 
a...@linux.intel.com -- Speaking for myself only.


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Linus Torvalds
On Mon, Nov 15, 2010 at 3:16 AM, Jakub Jelinek ja...@redhat.com wrote:

 I don't see any problems on the assembly level.  i8k_smm is
 not inlined in this case and checks all 3 conditions.

If it really is related to gcc not understanding that *regs has
changed due to the memory being an automatic variable, and passing in
regs itself as a pointer to that automatic variable together with
the memory clobber not being sufficient, than I think it's the lack
of inlining that will automatically hide the bug.

(Side note: and I think this does show how much of a gcc bug it is not
to consider memory together with passing in a pointer to an asm to
always be a clobber).

Because if it isn't inlined, then regs will be seen a a real pointer
to some external memory (the caller) rather than being optimized to
just be the auto structure on the stack. Because *mem is auto only
within the context of the caller.

Which actually points to a possible simpler:
 - remove the +m since it adds too much register pressure
 - mark the i8k_smm() as noinline instead.

Quite frankly, I'd hate to add even more crud to that inline asm (to
save/restore the registers manually). It's already not the prettiest
thing around.

So does the attached patch work for everybody?

 Linus
 drivers/char/i8k.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/char/i8k.c b/drivers/char/i8k.c
index f0863be..101011e 100644
--- a/drivers/char/i8k.c
+++ b/drivers/char/i8k.c
@@ -114,7 +114,7 @@ static inline const char *i8k_get_dmi_data(int field)
 /*
  * Call the System Management Mode BIOS. Code provided by Jonathan Buzzard.
  */
-static int i8k_smm(struct smm_regs *regs)
+static noinline int i8k_smm(struct smm_regs *regs)
 {
 	int rc;
 	int eax = regs-eax;
@@ -142,7 +142,7 @@ static int i8k_smm(struct smm_regs *regs)
 		lahf\n\t
 		shrl $8,%%eax\n\t
 		andl $1,%%eax\n
-		:=a(rc), +m (*regs)
+		:=a(rc)
 		:a(regs)
 		:%ebx, %ecx, %edx, %esi, %edi, memory);
 #else
@@ -168,7 +168,7 @@ static int i8k_smm(struct smm_regs *regs)
 	lahf\n\t
 	shrl $8,%%eax\n\t
 	andl $1,%%eax\n
-	:=a(rc), +m (*regs)
+	:=a(rc)
 	:a(regs)
 	:%ebx, %ecx, %edx, %esi, %edi, memory);
 #endif


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jim Bos
On 11/15/2010 05:04 PM, Linus Torvalds wrote:
 On Mon, Nov 15, 2010 at 3:16 AM, Jakub Jelinek ja...@redhat.com wrote:

 I don't see any problems on the assembly level.  i8k_smm is
 not inlined in this case and checks all 3 conditions.
 
 If it really is related to gcc not understanding that *regs has
 changed due to the memory being an automatic variable, and passing in
 regs itself as a pointer to that automatic variable together with
 the memory clobber not being sufficient, than I think it's the lack
 of inlining that will automatically hide the bug.
 
 (Side note: and I think this does show how much of a gcc bug it is not
 to consider memory together with passing in a pointer to an asm to
 always be a clobber).
 
 Because if it isn't inlined, then regs will be seen a a real pointer
 to some external memory (the caller) rather than being optimized to
 just be the auto structure on the stack. Because *mem is auto only
 within the context of the caller.
 
 Which actually points to a possible simpler:
  - remove the +m since it adds too much register pressure
  - mark the i8k_smm() as noinline instead.
 
 Quite frankly, I'd hate to add even more crud to that inline asm (to
 save/restore the registers manually). It's already not the prettiest
 thing around.
 
 So does the attached patch work for everybody?
 
  Linus

Hmm, that doesn't work.

[ Not sure if you read to whole thread but initial workaround was to
change the asm(..) to asm volatile(..) which did work. ]

Jim.


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jakub Jelinek
On Mon, Nov 15, 2010 at 06:36:06PM +0100, Jim Bos wrote:
 On 11/15/2010 12:37 PM, Andi Kleen wrote:
 See attached, note this is the vanilla 2.6.36 i8k.c (without any patch).
 And to be 100% sure, if I build this (make drivers/char/i8k.ko) it won't
 work.
 
 [ The i8k.i is rather big, even gzipped 80k, not sure if it'll bounce ]

Please also say which exact gcc you are using.

Note, I've compiled it with current 4.5 branch and made the function
always_inline and still didn't see any issues in the *.optimized dump,
regs.eax after the inline asm has always been compared to the constant
that has been stored into regs.eax before the inline asm.

Jakub


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Linus Torvalds
On Mon, Nov 15, 2010 at 9:40 AM, Jim Bos jim...@xs4all.nl wrote:

 Hmm, that doesn't work.

 [ Not sure if you read to whole thread but initial workaround was to
 change the asm(..) to asm volatile(..) which did work. ]

Since I have a different gcc than yours (and I'm not going to compile
my own), have you posted your broken .s file anywhere? In fact, with
the noinline (and the removal of the +m thing - iow just the patch
you tried), what does just the i8k_smm function assembly look like
for you after you've done a make drivers/char/i8k.s?

If the asm just doesn't exist AT ALL, that's just odd. Because every
single call-site of i8k_smm() clearly looks at the return value. So
the volatile really shouldn't make any difference from that
standpoint. Odd.

   Linus


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jim Bos
On 11/15/2010 06:44 PM, Jakub Jelinek wrote:
 On Mon, Nov 15, 2010 at 06:36:06PM +0100, Jim Bos wrote:
 On 11/15/2010 12:37 PM, Andi Kleen wrote:
 See attached, note this is the vanilla 2.6.36 i8k.c (without any patch).
 And to be 100% sure, if I build this (make drivers/char/i8k.ko) it won't
 work.

 [ The i8k.i is rather big, even gzipped 80k, not sure if it'll bounce ]
 
 Please also say which exact gcc you are using.
 
 Note, I've compiled it with current 4.5 branch and made the function
 always_inline and still didn't see any issues in the *.optimized dump,
 regs.eax after the inline asm has always been compared to the constant
 that has been stored into regs.eax before the inline asm.
 
   Jakub
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/

 # gcc -v
Reading specs from /usr/lib/gcc/i486-slackware-linux/4.5.1/specs
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/i486-slackware-linux/4.5.1/lto-wrapper
Target: i486-slackware-linux
Configured with: ../gcc-4.5.1/configure --prefix=/usr --libdir=/usr/lib
--mandir=/usr/man --infodir=/usr/info --enable-shared --enable-bootstrap
--enable-languages=ada,c,c++,fortran,java,objc --enable-threads=posix
--enable-checking=release --with-system-zlib
--with-python-dir=/lib/python2.6/site-packages
--disable-libunwind-exceptions --enable-__cxa_atexit --enable-libssp
--with-gnu-ld --verbose --with-arch=i486 --target=i486-slackware-linux
--build=i486-slackware-linux --host=i486-slackware-linux
Thread model: posix
gcc version 4.5.1 (GCC)

I'm re-reading this thread where I found the asm- asm volatine suggestion:
  https://bbs.archlinux.org/viewtopic.php?pid=752099#p752099
but nobody there reported their gcc version (but apparently first
people started complaining May 1st).

_
Jim


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jakub Jelinek
On Mon, Nov 15, 2010 at 07:17:31PM +0100, Jim Bos wrote:
  # gcc -v
 Reading specs from /usr/lib/gcc/i486-slackware-linux/4.5.1/specs
 COLLECT_GCC=gcc
 COLLECT_LTO_WRAPPER=/usr/libexec/gcc/i486-slackware-linux/4.5.1/lto-wrapper
 Target: i486-slackware-linux
 Configured with: ../gcc-4.5.1/configure --prefix=/usr --libdir=/usr/lib
 --mandir=/usr/man --infodir=/usr/info --enable-shared --enable-bootstrap
 --enable-languages=ada,c,c++,fortran,java,objc --enable-threads=posix
 --enable-checking=release --with-system-zlib
 --with-python-dir=/lib/python2.6/site-packages
 --disable-libunwind-exceptions --enable-__cxa_atexit --enable-libssp
 --with-gnu-ld --verbose --with-arch=i486 --target=i486-slackware-linux
 --build=i486-slackware-linux --host=i486-slackware-linux
 Thread model: posix
 gcc version 4.5.1 (GCC)

Does it have any patches applied?  The gcc options look the same as what
I've been already trying earlier.
Thus, can you run gcc with those options on i8k.i and add -fverbose-asm
to make it easier to read and post i8k.s you get?

Jakub


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jim Bos
On 11/15/2010 07:08 PM, Linus Torvalds wrote:
 On Mon, Nov 15, 2010 at 9:40 AM, Jim Bos jim...@xs4all.nl wrote:

 Hmm, that doesn't work.

 [ Not sure if you read to whole thread but initial workaround was to
 change the asm(..) to asm volatile(..) which did work. ]
 
 Since I have a different gcc than yours (and I'm not going to compile
 my own), have you posted your broken .s file anywhere? In fact, with
 the noinline (and the removal of the +m thing - iow just the patch
 you tried), what does just the i8k_smm function assembly look like
 for you after you've done a make drivers/char/i8k.s?
 
 If the asm just doesn't exist AT ALL, that's just odd. Because every
 single call-site of i8k_smm() clearly looks at the return value. So
 the volatile really shouldn't make any difference from that
 standpoint. Odd.
 
Linus
 

Attached version with plain 2.6.36 source and version with the committed
patch, i.e with the '+m (*regs)'


_
Jim


.file   i8k.c
# GNU C (GCC) version 4.5.1 (i486-slackware-linux)
#   compiled by GNU C version 4.5.1, GMP version 5.0.1, MPFR version 
2.4.2-p3, MPC version 0.8.2
# GGC heuristics: --param ggc-min-expand=81 --param ggc-min-heapsize=96817
# options passed:  -nostdinc -I/usr/src/linux-2.6.36/arch/x86/include
# -Iinclude -D__KERNEL__ -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1
# -DCONFIG_AS_CFI_SECTIONS=1 -DMODULE -DKBUILD_STR(s)=#s
# -DKBUILD_BASENAME=KBUILD_STR(i8k) -DKBUILD_MODNAME=KBUILD_STR(i8k)
# -isystem /usr/lib/gcc/i486-slackware-linux/4.5.1/include -include
# include/generated/autoconf.h -MD drivers/char/.i8k.s.d drivers/char/i8k.c
# -m32 -msoft-float -mregparm=3 -mpreferred-stack-boundary=2 -march=i686
# -mtune=pentium3 -mtune=generic -mno-sse -mno-mmx -mno-sse2 -mno-3dnow
# -auxbase-strip drivers/char/i8k.s -Os -Wall -Wundef -Wstrict-prototypes
# -Wno-trigraphs -Werror-implicit-function-declaration -Wno-format-security
# -Wno-sign-compare -Wframe-larger-than=1024 -Wdeclaration-after-statement
# -Wno-pointer-sign -fno-strict-aliasing -fno-common
# -fno-delete-null-pointer-checks -freg-struct-return -ffreestanding
# -fno-asynchronous-unwind-tables -fno-stack-protector -fomit-frame-pointer
# -fno-strict-overflow -fconserve-stack -fverbose-asm
# options enabled:  -falign-loops -fargument-alias -fauto-inc-dec
# -fbranch-count-reg -fcaller-saves -fcprop-registers -fcrossjumping
# -fcse-follow-jumps -fdefer-pop -fdwarf2-cfi-asm -fearly-inlining
# -feliminate-unused-debug-types -fexpensive-optimizations
# -fforward-propagate -ffunction-cse -fgcse -fgcse-lm
# -fguess-branch-probability -fident -fif-conversion -fif-conversion2
# -findirect-inlining -finline -finline-functions
# -finline-functions-called-once -finline-small-functions -fipa-cp
# -fipa-pure-const -fipa-reference -fipa-sra -fira-share-save-slots
# -fira-share-spill-slots -fivopts -fkeep-static-consts
# -fleading-underscore -fmath-errno -fmerge-constants -fmerge-debug-strings
# -fmove-loop-invariants -fomit-frame-pointer -foptimize-register-move
# -foptimize-sibling-calls -fpeephole -fpeephole2 -freg-struct-return
# -fregmove -freorder-blocks -freorder-functions -frerun-cse-after-loop
# -fsched-critical-path-heuristic -fsched-dep-count-heuristic
# -fsched-group-heuristic -fsched-interblock -fsched-last-insn-heuristic
# -fsched-rank-heuristic -fsched-spec -fsched-spec-insn-heuristic
# -fsched-stalled-insns-dep -fschedule-insns2 -fshow-column -fsigned-zeros
# -fsplit-ivs-in-unroller -fsplit-wide-types -fthread-jumps
# -ftoplevel-reorder -ftrapping-math -ftree-builtin-call-dce -ftree-ccp
# -ftree-ch -ftree-copy-prop -ftree-copyrename -ftree-cselim -ftree-dce
# -ftree-dominator-opts -ftree-dse -ftree-forwprop -ftree-fre
# -ftree-loop-im -ftree-loop-ivcanon -ftree-loop-optimize
# -ftree-parallelize-loops= -ftree-phiprop -ftree-pre -ftree-pta
# -ftree-reassoc -ftree-scev-cprop -ftree-sink -ftree-slp-vectorize
# -ftree-sra -ftree-switch-conversion -ftree-ter -ftree-vect-loop-version
# -ftree-vrp -funit-at-a-time -fvect-cost-model -fverbose-asm
# -fzero-initialized-in-bss -m32 -m96bit-long-double -malign-stringops
# -mfused-madd -mglibc -mieee-fp -mno-fancy-math-387 -mno-red-zone
# -mno-sse4 -mpush-args -msahf -mtls-direct-seg-refs

# Compiler executable checksum: 7ba2dc3c015559b9d16b297ee7f8d354

.text
.type   i8k_smm, @function
i8k_smm:
pushl   %ebp#
movl%eax, %ebp  # regs, regs
pushl   %edi#
pushl   %esi#
pushl   %ebx#
subl$8, %esp#,
movl(%eax), %eax# regs_2(D)-eax,
movl%eax, 4(%esp)   #, %sfp
movl%ebp, %eax  # regs,
#APP
# 148 drivers/char/i8k.c 1
pushl %eax
movl 0(%eax),%edx
push %edx
movl 4(%eax),%ebx
movl 8(%eax),%ecx
movl 12(%eax),%edx
movl 16(%eax),%esi
movl 20(%eax),%edi
popl %eax
out %al,$0xb2
out %al,$0x84
xchgl %eax,(%esp)
movl 

Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jim Bos
On 11/15/2010 07:30 PM, Jim Bos wrote:
 On 11/15/2010 07:08 PM, Linus Torvalds wrote:
 On Mon, Nov 15, 2010 at 9:40 AM, Jim Bos jim...@xs4all.nl wrote:

 Hmm, that doesn't work.

 [ Not sure if you read to whole thread but initial workaround was to
 change the asm(..) to asm volatile(..) which did work. ]

 Since I have a different gcc than yours (and I'm not going to compile
 my own), have you posted your broken .s file anywhere? In fact, with
 the noinline (and the removal of the +m thing - iow just the patch
 you tried), what does just the i8k_smm function assembly look like
 for you after you've done a make drivers/char/i8k.s?

 If the asm just doesn't exist AT ALL, that's just odd. Because every
 single call-site of i8k_smm() clearly looks at the return value. So
 the volatile really shouldn't make any difference from that
 standpoint. Odd.

Linus

 
 Attached version with plain 2.6.36 source and version with the committed
 patch, i.e with the '+m (*regs)'
 
 
 _
 Jim
 
 

And I just tried with your noninline patch which results in exactly the
same .s file as with plain 2.6.36 source, i.e. the noninline patch is
not doing anything here.

_
Jim




Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jeff Law

On 11/07/10 15:41, Andreas Schwab wrote:

Andi Kleena...@firstfloor.org  writes:


Jimjim...@xs4all.nl  writes:


After upgrading my Dell laptop, both OS+kernel the i8k interface was giving
nonsensical output. As it turned out it's not the kernel but compiler
upgrade which broke this.

Guys at Archlinux have found the underlying cause (but don't seem to have
submitted a patch yet):
   https://bbs.archlinux.org/viewtopic.php?pid=780692#p780692
gcc seems to optimize the assembly statements away.

And indeed, applying this patch makes the i8k interface work again,
i.e. replacing the asm(..) construct by  asm volatile(..)

The compiler really should not optimize the asm away, because
it has both input and output arguments which are later used.
asm volatile normally just means don't move significantly

The asm fails to mention that it modifies *regs.
But there's a memory clobber, that should be sufficient to indicate 
*regs is modified.


jeff



Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jeff Law

On 11/08/10 03:49, Richard Guenther wrote:

On Mon, Nov 8, 2010 at 12:03 AM, Andi Kleena...@firstfloor.org  wrote:

Andreas Schwabsch...@linux-m68k.org  writes:

The asm fails to mention that it modifies *regs.

It has a memory clobber, that should be enough, no?

No.  A memory clobber does not cover automatic storage.
A memory clobber should clobber anything in memory, including autos in 
memory; if it doesn't, then that seems like a major problem.  I'd like 
to see the rationale behind not clobbering autos in memory.


Jeff


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Linus Torvalds
On Mon, Nov 15, 2010 at 10:30 AM, Jim Bos jim...@xs4all.nl wrote:

 Attached version with plain 2.6.36 source and version with the committed
 patch, i.e with the '+m (*regs)'

Looks 100% identical in i8k_smm() itself, and I'm not seeing anything
bad. The asm has certainly not been optimized away as implied in the
archlinux thread.

There are differences, but they are with code generation *elsewhere*.

To me it is starting to look like the real problem is that gcc has
decided that the i8k_smm() function is __attribute__((const)).

Which is clearly totally bogus. If a function has an inline asm that
has a memory clobber, it is clearly *not* 'const'. But that does
explain the bug, and does explain why +m makes a difference and why
noinline does not.

So what I _think_ happens is that

 - gcc logic for the automatic 'const' attribute for functions is
broken, so it marks that function 'const'.

 - since the rule for a const function is that it only _looks_ at its
attributes and has no side effects, now the callers will decide that
'i8k_smm()' cannot change the passed-in structure, so they'll happily
optimize away all the accesses to it.

   Linus


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jakub Jelinek
On Mon, Nov 15, 2010 at 07:30:35PM +0100, Jim Bos wrote:
 On 11/15/2010 07:08 PM, Linus Torvalds wrote:
  On Mon, Nov 15, 2010 at 9:40 AM, Jim Bos jim...@xs4all.nl wrote:
 
  Hmm, that doesn't work.
 
  [ Not sure if you read to whole thread but initial workaround was to
  change the asm(..) to asm volatile(..) which did work. ]
  
  Since I have a different gcc than yours (and I'm not going to compile
  my own), have you posted your broken .s file anywhere? In fact, with
  the noinline (and the removal of the +m thing - iow just the patch
  you tried), what does just the i8k_smm function assembly look like
  for you after you've done a make drivers/char/i8k.s?
  
  If the asm just doesn't exist AT ALL, that's just odd. Because every
  single call-site of i8k_smm() clearly looks at the return value. So
  the volatile really shouldn't make any difference from that
  standpoint. Odd.
  
 Linus
  
 
 Attached version with plain 2.6.36 source and version with the committed
 patch, i.e with the '+m (*regs)'

Thanks, this actually helped to see the problem.
The problem is not inside of i8k_smm, which is not inlined, but in the
callers.
ipa-pure-const.c pass thinks i8k_smm is a pure function, thus
regs = {};
regs.eax = 166;
x = i8k_smm (regs);
if (!x) x = regs.eax;
in the callers is optimized into
regs = {}
regs.eax = 166;
x = i8k_smm (regs);
if (!x) x = 166;
Now, not sure why this happens, as there is
case GIMPLE_ASM:
  for (i = 0; i  gimple_asm_nclobbers (stmt); i++)
{
  tree op = gimple_asm_clobber_op (stmt, i);
  if (simple_cst_equal(TREE_VALUE (op), memory_identifier_string) == 1)
{
  if (dump_file)
fprintf (dump_file, memory asm clobber is not const/pure);
  /* Abandon all hope, ye who enter here. */
  local-pure_const_state = IPA_NEITHER;
}
}
Debugging...

Jakub


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Linus Torvalds
On Mon, Nov 15, 2010 at 10:45 AM, Jeff Law l...@redhat.com wrote:

 A memory clobber should clobber anything in memory, including autos in
 memory; if it doesn't, then that seems like a major problem.  I'd like to
 see the rationale behind not clobbering autos in memory.

Yes. It turns out that the asm optimized away was entirely wrong (we
never saw that, it was just a report on another mailing list).

Looking at the asm posted, it seems to me that gcc actually compiles
the asm itself 100% correctly, and the memory clobber is working
fine inside that function. So the code generated for i8k_smm() itself
is all good.

But _while_ generating the good code, gcc doesn't seem to realize that
it writes to anything, so it decides to mark the function
__attribute__((const)), which is obviously wrong (a memory clobber
definitely implies that it's not const). And as a result, the callers
will be mis-optimized, because they do things like

static int i8k_get_bios_version(void)
{
struct smm_regs regs = { .eax = I8K_SMM_BIOS_VERSION, };

return i8k_smm(regs) ? : regs.eax;
}

and since gcc has (incorrectly) decided that i8k_smm() is a const
function, it thinks that regs.eax hasn't changed, so it doesn't
bother to reload it: it knows that it is still I8K_SMM_BIOS_VERSION
that it initialized it with. So it will basically have rewritten that
final return statement as

return i8k_smm(regs) ? : I8K_SMM_BIOS_VERSION;

which obviously doesn't really work.

This also explains why adding volatile worked. The asm volatile
triggered this is not a const function.

Similarly, the +m works, because it also makes clear that the asm is
writing to memory, and isn't a const function.

Now, the memory clobber should clearly also have done that, but I'd
be willing to bet that some version of gcc (possibly extra slackware
patches) had forgotten the trivial logic to say a memory clobber also
makes the user function non-const.

 Linus


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jim Bos
On 11/15/2010 07:26 PM, Jakub Jelinek wrote:
 On Mon, Nov 15, 2010 at 07:17:31PM +0100, Jim Bos wrote:
  # gcc -v
 Reading specs from /usr/lib/gcc/i486-slackware-linux/4.5.1/specs
 COLLECT_GCC=gcc
 COLLECT_LTO_WRAPPER=/usr/libexec/gcc/i486-slackware-linux/4.5.1/lto-wrapper
 Target: i486-slackware-linux
 Configured with: ../gcc-4.5.1/configure --prefix=/usr --libdir=/usr/lib
 --mandir=/usr/man --infodir=/usr/info --enable-shared --enable-bootstrap
 --enable-languages=ada,c,c++,fortran,java,objc --enable-threads=posix
 --enable-checking=release --with-system-zlib
 --with-python-dir=/lib/python2.6/site-packages
 --disable-libunwind-exceptions --enable-__cxa_atexit --enable-libssp
 --with-gnu-ld --verbose --with-arch=i486 --target=i486-slackware-linux
 --build=i486-slackware-linux --host=i486-slackware-linux
 Thread model: posix
 gcc version 4.5.1 (GCC)
 
 Does it have any patches applied?  The gcc options look the same as what
 I've been already trying earlier.
 Thus, can you run gcc with those options on i8k.i and add -fverbose-asm
 to make it easier to read and post i8k.s you get?
 
   Jakub
 

Slackware is typically not patching much (and I'm just using the
pre-compiled binary).  Here is the link to how it's built:
 http://slackware.osuosl.org/slackware-current/source/d/gcc/
there doesn't appear to be anything relevant changed.

I already posted the .s files, plain 2.6.36 and the one with working
patch, I =think= that's already using -fverbose-asm, at least that shows
in the output.

_
Jim





Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jakub Jelinek
On Mon, Nov 15, 2010 at 07:58:48PM +0100, Jakub Jelinek wrote:
 Now, not sure why this happens, as there is
 case GIMPLE_ASM:
   for (i = 0; i  gimple_asm_nclobbers (stmt); i++)
 {
   tree op = gimple_asm_clobber_op (stmt, i);
   if (simple_cst_equal(TREE_VALUE (op), memory_identifier_string) == 
 1)
 {
   if (dump_file)
 fprintf (dump_file, memory asm clobber is not 
 const/pure);
   /* Abandon all hope, ye who enter here. */
   local-pure_const_state = IPA_NEITHER;
 }
 }
 Debugging...

Ah, the problem is that memory_identifier_string is only initialized in
ipa-reference.c's initialization, so it can be (and is in this case) NULL in
ipa-pure-const.c.

Two possible fixes (the latter is apparently what is used in
tree-ssa-operands.c, so is probably sufficient).  Guess ipa-reference.c
should be changed to do the same and just drop memory_identifier_string.

Jakub
--- gcc/ipa-pure-const.c.jj 2010-08-11 16:06:19.0 +0200
+++ gcc/ipa-pure-const.c2010-11-15 20:06:36.121310614 +0100
@@ -460,7 +460,10 @@ check_stmt (gimple_stmt_iterator *gsip, 
   for (i = 0; i  gimple_asm_nclobbers (stmt); i++)
{
  tree op = gimple_asm_clobber_op (stmt, i);
- if (simple_cst_equal(TREE_VALUE (op), memory_identifier_string) == 1)
+ if (TREE_CODE (TREE_VALUE (op)) == STRING_CST
+  TREE_STRING_LENGTH (TREE_VALUE (op)) == sizeof (memory)
+  memcmp (TREE_STRING_POINTER (TREE_VALUE (op)), memory,
+sizeof (memory)) == 0)
{
   if (dump_file)
 fprintf (dump_file, memory asm clobber is not 
const/pure);
--- gcc/ipa-pure-const.c.jj 2010-08-11 16:06:19.0 +0200
+++ gcc/ipa-pure-const.c2010-11-15 20:07:51.463716989 +0100
@@ -460,7 +460,7 @@ check_stmt (gimple_stmt_iterator *gsip, 
   for (i = 0; i  gimple_asm_nclobbers (stmt); i++)
{
  tree op = gimple_asm_clobber_op (stmt, i);
- if (simple_cst_equal(TREE_VALUE (op), memory_identifier_string) == 1)
+ if (strcmp (TREE_STRING_POINTER (TREE_VALUE (link)), memory) == 0)
{
   if (dump_file)
 fprintf (dump_file, memory asm clobber is not 
const/pure);


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Linus Torvalds
On Mon, Nov 15, 2010 at 11:12 AM, Jakub Jelinek ja...@redhat.com wrote:

 Ah, the problem is that memory_identifier_string is only initialized in
 ipa-reference.c's initialization, so it can be (and is in this case) NULL in
 ipa-pure-const.c.

Ok. And I guess you can verify that all versions of gcc do this
correctly for asm volatile?

Because since we'll have to work around this problem in the kernel, I
suspect the simplest solution is to remove the +m that causes
register pressure problems, and then use asm volatile to work around
the const-function bug.

And add a large comment about why asm volatile is probably always a
good idea when you have a memory clobber and don't have any other
visible memory modifications.

I do wonder if this explains some of the problems we had with the
bitop asms too.

Hmm?

   Linus


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jakub Jelinek
On Mon, Nov 15, 2010 at 11:21:30AM -0800, Linus Torvalds wrote:
 On Mon, Nov 15, 2010 at 11:12 AM, Jakub Jelinek ja...@redhat.com wrote:
 
  Ah, the problem is that memory_identifier_string is only initialized in
  ipa-reference.c's initialization, so it can be (and is in this case) NULL in
  ipa-pure-const.c.
 
 Ok. And I guess you can verify that all versions of gcc do this
 correctly for asm volatile?

Yes, reading 4.1/4.2/4.3/4.4/4.5/4.6 code ipa-pure-const.c handles
asm volatile correctly, in each case the function is no longer assumed to be
pure or const in the discovery (of course, user can still say the
function is const or pure). 4.0 and earlier didn't have ipa-pure-const.c.

Using the simplified

extern void abort (void);

__attribute__((noinline)) int
foo (int *p)
{
  int r;
  asm (movl $6, (%1)\n\txorl %0, %0 : =r (r) : r (p) : memory);
  return r;
}

int
main (void)
{
  int p = 8;
  if ((foo (p) ? : p) != 6)
abort ();
  return 0;
}

testcase shows that in 4.1/4.2/4.3/4.4 this is miscompiled only when using
-fno-ipa-reference, in 4.5 it is miscompiled always when optimizing
unless -fno-ipa-pure-const (as 4.5 added local-pure-const pass which is run
before ipa-reference) and in 4.6 this has been fixed by Honza when
doing ipa cleanups.

 Because since we'll have to work around this problem in the kernel, I
 suspect the simplest solution is to remove the +m that causes
 register pressure problems, and then use asm volatile to work around
 the const-function bug.

Yes.

Jakub


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Richard Henderson
On 11/15/2010 11:12 AM, Jakub Jelinek wrote:
 -   if (simple_cst_equal(TREE_VALUE (op), memory_identifier_string) == 1)
 +   if (strcmp (TREE_STRING_POINTER (TREE_VALUE (link)), memory) == 0)

I prefer this solution.  I think memory_identifier_string is over-engineering.
Patch to remove it entirely is pre-approved.


r~


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jakub Jelinek
On Mon, Nov 15, 2010 at 11:53:05AM -0800, Richard Henderson wrote:
 On 11/15/2010 11:12 AM, Jakub Jelinek wrote:
  - if (simple_cst_equal(TREE_VALUE (op), memory_identifier_string) == 1)
  + if (strcmp (TREE_STRING_POINTER (TREE_VALUE (link)), memory) == 0)
 
 I prefer this solution.  I think memory_identifier_string is over-engineering.
 Patch to remove it entirely is pre-approved.

Honza even committed this to the trunk in May, it is just release branches
that are broken (and only in 4.5 it matters a lot because it happens with
the default flags).

Jakub


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jim Bos
On 11/15/2010 08:51 PM, Jakub Jelinek wrote:
 On Mon, Nov 15, 2010 at 11:21:30AM -0800, Linus Torvalds wrote:
 On Mon, Nov 15, 2010 at 11:12 AM, Jakub Jelinek ja...@redhat.com wrote:

 Ah, the problem is that memory_identifier_string is only initialized in
 ipa-reference.c's initialization, so it can be (and is in this case) NULL in
 ipa-pure-const.c.

 Ok. And I guess you can verify that all versions of gcc do this
 correctly for asm volatile?
 
 Yes, reading 4.1/4.2/4.3/4.4/4.5/4.6 code ipa-pure-const.c handles
 asm volatile correctly, in each case the function is no longer assumed to be
 pure or const in the discovery (of course, user can still say the
 function is const or pure). 4.0 and earlier didn't have ipa-pure-const.c.
 
 Using the simplified
 
 extern void abort (void);
 
 __attribute__((noinline)) int
 foo (int *p)
 {
   int r;
   asm (movl $6, (%1)\n\txorl %0, %0 : =r (r) : r (p) : memory);
   return r;
 }
 
 int
 main (void)
 {
   int p = 8;
   if ((foo (p) ? : p) != 6)
 abort ();
   return 0;
 }
 
 testcase shows that in 4.1/4.2/4.3/4.4 this is miscompiled only when using
 -fno-ipa-reference, in 4.5 it is miscompiled always when optimizing
 unless -fno-ipa-pure-const (as 4.5 added local-pure-const pass which is run
 before ipa-reference) and in 4.6 this has been fixed by Honza when
 doing ipa cleanups.
 
 Because since we'll have to work around this problem in the kernel, I
 suspect the simplest solution is to remove the +m that causes
 register pressure problems, and then use asm volatile to work around
 the const-function bug.
 
 Yes.
 
   Jakub
 

Linus,

In case you didn't already fixed this, here's the follow-up patch.
---

The fix to work around the gcc miscompiling i8k.c to add +m (*regs)
caused register pressure problems. Changing the 'asm' statement to
'asm volatile' instead should prevent that and works around the gcc
bug as well.

Signed-off-by: Jim Bos jim...@xs4all.nl

--- linux/drivers/char/i8k.c.ORIG   2010-11-15 21:04:19.0 +0100
+++ linux/drivers/char/i8k.c2010-11-15 21:02:32.0 +0100
@@ -119,7 +119,7 @@
int eax = regs-eax;
 
 #if defined(CONFIG_X86_64)
-   asm(pushq %%rax\n\t
+   asm volatile(pushq %%rax\n\t
movl 0(%%rax),%%edx\n\t
pushq %%rdx\n\t
movl 4(%%rax),%%ebx\n\t
@@ -141,11 +141,11 @@
lahf\n\t
shrl $8,%%eax\n\t
andl $1,%%eax\n
-   :=a(rc), +m (*regs)
+   :=a(rc)
:a(regs)
:%ebx, %ecx, %edx, %esi, %edi, memory);
 #else
-   asm(pushl %%eax\n\t
+   asm volatile(pushl %%eax\n\t
movl 0(%%eax),%%edx\n\t
push %%edx\n\t
movl 4(%%eax),%%ebx\n\t
@@ -167,7 +167,7 @@
lahf\n\t
shrl $8,%%eax\n\t
andl $1,%%eax\n
-   :=a(rc), +m (*regs)
+   :=a(rc)
:a(regs)
:%ebx, %ecx, %edx, %esi, %edi, memory);
 #endif


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Richard Guenther
On Mon, Nov 15, 2010 at 7:45 PM, Jeff Law l...@redhat.com wrote:
 On 11/08/10 03:49, Richard Guenther wrote:

 On Mon, Nov 8, 2010 at 12:03 AM, Andi Kleena...@firstfloor.org  wrote:

 Andreas Schwabsch...@linux-m68k.org  writes:

 The asm fails to mention that it modifies *regs.

 It has a memory clobber, that should be enough, no?

 No.  A memory clobber does not cover automatic storage.

 A memory clobber should clobber anything in memory, including autos in
 memory; if it doesn't, then that seems like a major problem.  I'd like to
 see the rationale behind not clobbering autos in memory.

Non-address taken automatic storage.  (note that we don't excercise this
in optimization yet)

It's difficult to model thins kind of non-aliased memory with this kind
of aliasing mechanism (apart from taking all asms as clobbering
everything as we currently do).

Richard.

 Jeff



Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Andi Kleen
 testcase shows that in 4.1/4.2/4.3/4.4 this is miscompiled only when using
 -fno-ipa-reference, in 4.5 it is miscompiled always when optimizing
 unless -fno-ipa-pure-const (as 4.5 added local-pure-const pass which is run
 before ipa-reference) and in 4.6 this has been fixed by Honza when
 doing ipa cleanups.

Maybe it would be better to simply change the kernel Makefiles to pass
-fno-ipa-pure-const instead of adding volatiles everywhere.

-Andi


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jakub Jelinek
On Mon, Nov 15, 2010 at 11:43:22PM +0100, Andi Kleen wrote:
  testcase shows that in 4.1/4.2/4.3/4.4 this is miscompiled only when using
  -fno-ipa-reference, in 4.5 it is miscompiled always when optimizing
  unless -fno-ipa-pure-const (as 4.5 added local-pure-const pass which is run
  before ipa-reference) and in 4.6 this has been fixed by Honza when
  doing ipa cleanups.
 
 Maybe it would be better to simply change the kernel Makefiles to pass
 -fno-ipa-pure-const instead of adding volatiles everywhere.

If you do this, please do it for 4.5.[012] only.  If you disable all gcc
passes that ever had any bugs in it, you'd need to disable most of them if
not all.

Jakub


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jeff Law

On 11/15/10 15:07, Richard Guenther wrote:

On Mon, Nov 15, 2010 at 7:45 PM, Jeff Lawl...@redhat.com  wrote:

On 11/08/10 03:49, Richard Guenther wrote:

On Mon, Nov 8, 2010 at 12:03 AM, Andi Kleena...@firstfloor.orgwrote:

Andreas Schwabsch...@linux-m68k.orgwrites:

The asm fails to mention that it modifies *regs.

It has a memory clobber, that should be enough, no?

No.  A memory clobber does not cover automatic storage.

A memory clobber should clobber anything in memory, including autos in
memory; if it doesn't, then that seems like a major problem.  I'd like to
see the rationale behind not clobbering autos in memory.

Non-address taken automatic storage.  (note that we don't excercise this
in optimization yet)
If the address of the auto isn't taken, then why is the object in memory 
to begin with (with the obvious exception for aggregates).


Jeff


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Richard Guenther
On Mon, Nov 15, 2010 at 11:58 PM, Jeff Law l...@redhat.com wrote:
 On 11/15/10 15:07, Richard Guenther wrote:

 On Mon, Nov 15, 2010 at 7:45 PM, Jeff Lawl...@redhat.com  wrote:

 On 11/08/10 03:49, Richard Guenther wrote:

 On Mon, Nov 8, 2010 at 12:03 AM, Andi Kleena...@firstfloor.org
  wrote:

 Andreas Schwabsch...@linux-m68k.org    writes:

 The asm fails to mention that it modifies *regs.

 It has a memory clobber, that should be enough, no?

 No.  A memory clobber does not cover automatic storage.

 A memory clobber should clobber anything in memory, including autos in
 memory; if it doesn't, then that seems like a major problem.  I'd like to
 see the rationale behind not clobbering autos in memory.

 Non-address taken automatic storage.  (note that we don't excercise this
 in optimization yet)

 If the address of the auto isn't taken, then why is the object in memory to
 begin with (with the obvious exception for aggregates).

Exactly sort of my point.  If people pass the address of x to an asm
and modify x + 8 expecting the adjacent stack location to be changed
I want to tell them that's not a supported way to get to another stack
variable (even if they clobber memory).  Or consider the C-decl guy
who wants to access adjacent parameters by address arithmetic on
the address of the first param ...

Richard.

 Jeff



Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jeff Law

On 11/15/10 16:07, Richard Guenther wrote:


If the address of the auto isn't taken, then why is the object in memory to
begin with (with the obvious exception for aggregates).

Exactly sort of my point.  If people pass the address ofx to an asm
and modifyx + 8 expecting the adjacent stack location to be changed
I want to tell them that's not a supported way to get to another stack
variable (even if they clobber memory).  Or consider the C-decl guy
who wants to access adjacent parameters by address arithmetic on
the address of the first param ...
Well, in that case, I think we can easily say that the programmer has 
gone off the deep end and has entered the realm of undefined behavior.


Presumably we rooted out all relevant instances of the latter over the 
last 20 years...  It was fairly common in the past, but I doubt anyone 
worth caring about is still writing code assuming they can take the 
address of parameter A, offset it and get parameters B, C, D, etc.


jeff


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-14 Thread James Cloos
Gcc 4.5.1 running on an amd64 box cross-compiling for a P3 i8k fails
to compile the module since commit 6b4e81db2552bad04100e7d5ddeed7e848f53b48
with:

  CC  drivers/char/i8k.o
drivers/char/i8k.c: In function ‘i8k_smm’:
drivers/char/i8k.c:149:2: error: can't find a register in class ‘GENERAL_REGS’ 
while reloading ‘asm’
drivers/char/i8k.c:149:2: error: ‘asm’ operand has impossible constraints

-JimC
-- 
James Cloos cl...@jhcloos.com OpenPGP: 1024D/ED7DAEA6


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-14 Thread Linus Torvalds
On Sun, Nov 14, 2010 at 4:52 PM, James Cloos cl...@jhcloos.com wrote:
 Gcc 4.5.1 running on an amd64 box cross-compiling for a P3 i8k fails
 to compile the module since commit 6b4e81db2552bad04100e7d5ddeed7e848f53b48
 with:

  CC      drivers/char/i8k.o
 drivers/char/i8k.c: In function ‘i8k_smm’:
 drivers/char/i8k.c:149:2: error: can't find a register in class 
 ‘GENERAL_REGS’ while reloading ‘asm’
 drivers/char/i8k.c:149:2: error: ‘asm’ operand has impossible constraints

At this point, I think this falls clearly under unresolvable gcc bug.

Quite frankly, I think gcc was buggy to begin with: since we had a
memory clobber, the +m (*regs) should not have mattered. The fact
that *regs may be some local variable doesn't make any difference
what-so-ever, since we took the address of the variable. So the memory
clobber _clearly_ can change that variable.

So when Richard Gunther says a memory clobber doesn't cover automatic
storage, to me that very clearly spells gcc is buggy as hell.
Because automatic storage with its address taken _very_ much gets
clobbered by things like memset etc. If the compiler doesn't
understand that, the compiler is just broken.

And now, if even the (superfluous) +m isn't working, it sounds like
we have no sane options left. Except to say that gcc-4.5.1 is totally
broken wrt asms.

Can we just get gcc to realize that when you pass the address of
automatic storage to an asm, that means that memory really does
clobber it? Because clearly that is the case.

Linus


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-09 Thread Andi Kleen
 My speculation is, that the asm is not removed but rather that regs.eax 
 isn't reloaded after the asm because the memory clobber doesn't clobber 
 automatic variables.

Yes that makes sense. I wasn't able to verify it so far though.

Maybe the original poster could try the obvious patch 
instead of the volatile change.


i8k: tell gcc that regs gets clobbered

Signed-off-by: Andi Kleen a...@linux.intel.com

diff --git a/drivers/char/i8k.c b/drivers/char/i8k.c
index 3bc0eef..f3bbf73 100644
--- a/drivers/char/i8k.c
+++ b/drivers/char/i8k.c
@@ -142,7 +142,7 @@ static int i8k_smm(struct smm_regs *regs)
lahf\n\t
shrl $8,%%eax\n\t
andl $1,%%eax\n
-   :=a(rc)
+   :=a(rc), =m (*regs)
:a(regs)
:%ebx, %ecx, %edx, %esi, %edi, memory);
 #else
@@ -167,7 +167,7 @@ static int i8k_smm(struct smm_regs *regs)
movl %%edx,0(%%eax)\n\t
lahf\n\t
shrl $8,%%eax\n\t
-   andl $1,%%eax\n:=a(rc)
+   andl $1,%%eax\n:=a(rc), =m (*regs)
:a(regs)
:%ebx, %ecx, %edx, %esi, %edi, memory);
 #endif

-Andi


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-09 Thread Michael Matz
Hi,

On Mon, 8 Nov 2010, Dave Korn wrote:

 void foo (void)
 {
   int x, y, z;
   x = 23;
   asm (do something : =r (y) : r (x) );
   z = y + 1;
 }

The case in i8k.c really is different.  It does use the value by 
influencing the return value and the callers use the returned value in 
conditionals and the like.  It really, really _is_ used :-) and if GCC 
removes the asm (which up to now is only speculation) then it's a GCC bug.

The code outlines like so:

int i8k_smm (regs) {
  int rc;
  asm (... : =r(rc) ...);
  if (rc != 0 || ...)
return -EINVAL;
  return 0;
}

...
  struct regs regs = {.eax = ...}
  return i8k_smm(regs) ?: regs.eax;
...

My speculation is, that the asm is not removed but rather that regs.eax 
isn't reloaded after the asm because the memory clobber doesn't clobber 
automatic variables.


Ciao,
Michael.


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-09 Thread Andreas Schwab
Andi Kleen a...@firstfloor.org writes:

 @@ -142,7 +142,7 @@ static int i8k_smm(struct smm_regs *regs)
   lahf\n\t
   shrl $8,%%eax\n\t
   andl $1,%%eax\n
 - :=a(rc)
 + :=a(rc), =m (*regs)

I think this should be +m.

Andreas.

-- 
Andreas Schwab, sch...@redhat.com
GPG Key fingerprint = D4E8 DBE3 3813 BB5D FA84  5EC7 45C6 250E 6F00 984E
And now for something completely different.


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-09 Thread Jim
On 11/09/2010 02:57 PM, Andreas Schwab wrote:
 Andi Kleen a...@firstfloor.org writes:
 
 @@ -142,7 +142,7 @@ static int i8k_smm(struct smm_regs *regs)
  lahf\n\t
  shrl $8,%%eax\n\t
  andl $1,%%eax\n
 -:=a(rc)
 +:=a(rc), =m (*regs)
 
 I think this should be +m.
 
 Andreas.
 

Just tested Andi's patch with Andreas' suggestion to make it +m,
i.e. like attached and can confirm it solves the issue.

Thanks guys,
   Jim Bos


--- i8k.c.ORIG  2010-08-02 17:20:46.0 +0200
+++ i8k.c   2010-11-09 17:31:29.0 +0100
@@ -141,7 +141,7 @@
lahf\n\t
shrl $8,%%eax\n\t
andl $1,%%eax\n
-   :=a(rc)
+   :=a(rc), +m (*regs)
:a(regs)
:%ebx, %ecx, %edx, %esi, %edi, memory);
 #else
@@ -166,7 +166,7 @@
movl %%edx,0(%%eax)\n\t
lahf\n\t
shrl $8,%%eax\n\t
-   andl $1,%%eax\n:=a(rc)
+   andl $1,%%eax\n:=a(rc), +m (*regs)
:a(regs)
:%ebx, %ecx, %edx, %esi, %edi, memory);
 #endif


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-08 Thread Richard Guenther
On Mon, Nov 8, 2010 at 12:03 AM, Andi Kleen a...@firstfloor.org wrote:
 Andreas Schwab sch...@linux-m68k.org writes:

 The asm fails to mention that it modifies *regs.

 It has a memory clobber, that should be enough, no?

No.  A memory clobber does not cover automatic storage.

Btw, I can't see a testcase anywhere so I just assume Andreas got
it right as usual.

Richard.

 Besides in any case it cannot be eliminated because it has
 valid non dead inputs and outputs.

 -Andi
 --
 a...@linux.intel.com -- Speaking for myself only.



Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-08 Thread Andi Kleen
Richard Guenther richard.guent...@gmail.com writes:

 On Mon, Nov 8, 2010 at 12:03 AM, Andi Kleen a...@firstfloor.org wrote:
 Andreas Schwab sch...@linux-m68k.org writes:

 The asm fails to mention that it modifies *regs.

 It has a memory clobber, that should be enough, no?

 No.  A memory clobber does not cover automatic storage.

That's a separate problem.

 Btw, I can't see a testcase anywhere so I just assume Andreas got
 it right as usual.

An asm with live inputs and outputs should never be optimized
way. If 4.5.1 started doing that it's seriously broken.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only.


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-08 Thread Richard Guenther
On Mon, Nov 8, 2010 at 12:20 PM, Andi Kleen a...@firstfloor.org wrote:
 Richard Guenther richard.guent...@gmail.com writes:

 On Mon, Nov 8, 2010 at 12:03 AM, Andi Kleen a...@firstfloor.org wrote:
 Andreas Schwab sch...@linux-m68k.org writes:

 The asm fails to mention that it modifies *regs.

 It has a memory clobber, that should be enough, no?

 No.  A memory clobber does not cover automatic storage.

 That's a separate problem.

 Btw, I can't see a testcase anywhere so I just assume Andreas got
 it right as usual.

 An asm with live inputs and outputs should never be optimized
 way. If 4.5.1 started doing that it's seriously broken.

Please provide a testcase, such asms can be optimized if the
outputs are dead.

Richard.

 -Andi

 --
 a...@linux.intel.com -- Speaking for myself only.



Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-08 Thread Paul Koning

On Nov 8, 2010, at 6:20 AM, Richard Guenther wrote:

 On Mon, Nov 8, 2010 at 12:20 PM, Andi Kleen a...@firstfloor.org wrote:
 Richard Guenther richard.guent...@gmail.com writes:
 
 On Mon, Nov 8, 2010 at 12:03 AM, Andi Kleen a...@firstfloor.org wrote:
 Andreas Schwab sch...@linux-m68k.org writes:
 
 The asm fails to mention that it modifies *regs.
 
 It has a memory clobber, that should be enough, no?
 
 No.  A memory clobber does not cover automatic storage.
 
 That's a separate problem.
 
 Btw, I can't see a testcase anywhere so I just assume Andreas got
 it right as usual.
 
 An asm with live inputs and outputs should never be optimized
 way. If 4.5.1 started doing that it's seriously broken.
 
 Please provide a testcase, such asms can be optimized if the
 outputs are dead.

I don't know about 4.5, but I noticed that with 4.6 (trunk), testcasese like 
gcc.c-torture/compile/2804-1.c optimize away the asm and all the operand 
generation except for -O0.

paul



Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-08 Thread Jakub Jelinek
On Mon, Nov 08, 2010 at 06:47:59AM -0500, Paul Koning wrote:
 I don't know about 4.5, but I noticed that with 4.6 (trunk), testcasese
 like gcc.c-torture/compile/2804-1.c optimize away the asm and all the
 operand generation except for -O0.

That's fine, the asm isn't volatile and the output is not used.

Jakub


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-08 Thread Michael Matz
Hi,

On Mon, 8 Nov 2010, Andi Kleen wrote:

 Richard Guenther richard.guent...@gmail.com writes:
 
  On Mon, Nov 8, 2010 at 12:03 AM, Andi Kleen a...@firstfloor.org wrote:
  Andreas Schwab sch...@linux-m68k.org writes:
 
  The asm fails to mention that it modifies *regs.
 
  It has a memory clobber, that should be enough, no?
 
  No.  A memory clobber does not cover automatic storage.
 
 That's a separate problem.
 
  Btw, I can't see a testcase anywhere so I just assume Andreas got
  it right as usual.
 
 An asm with live inputs and outputs should never be optimized
 way. If 4.5.1 started doing that it's seriously broken.

You know the drill: testcase - gcc.gnu.org/bugzilla/

(In particular up to now it's only speculation in some forum that the asm 
really is optimized away, which I agree would be a bug, or if it isn't 
merely that regs-eax isn't reloaded after the asm(), which would be 
caused by the problem Andreas mentioned)


Ciao,
Michael.


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-08 Thread Dave Korn
On 08/11/2010 11:20, Andi Kleen wrote:

 An asm with live inputs and outputs should never be optimized
 way. If 4.5.1 started doing that it's seriously broken.

  I don't see that.  Consider:

void foo (void)
{
  int x, y, z;
  x = 23;
  y = x + 1;
  z = y + 1;
}

  So far, you'd agree the compiler may optimise the entire function away?  So
why not this:

void foo (void)
{
  int x, y, z;
  x = 23;
  asm (do something : =r (y) : r (x) );
  z = y + 1;
}

  ?

cheers,
  DaveK


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-07 Thread Andi Kleen
Jim jim...@xs4all.nl writes:

 After upgrading my Dell laptop, both OS+kernel the i8k interface was giving
 nonsensical output. As it turned out it's not the kernel but compiler
 upgrade which broke this.

 Guys at Archlinux have found the underlying cause (but don't seem to have
 submitted a patch yet):
   https://bbs.archlinux.org/viewtopic.php?pid=780692#p780692
 gcc seems to optimize the assembly statements away.

 And indeed, applying this patch makes the i8k interface work again,
 i.e. replacing the asm(..) construct by  asm volatile(..)

The compiler really should not optimize the asm away, because
it has both input and output arguments which are later used.
asm volatile normally just means don't move significantly

I tested it with gcc version 4.5.0 20100604 [gcc-4_5-branch revision
160292] (SUSE Linux) 
and the asm statement is there for both 32bit and 64bit
(with an allmodconfig, with both -O2 and -Os)

If gcc 4.5.1 broke that over 4.5.0 you should really file a bug report
for the compiler, it seems like a serious regression in 4.5.1

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only.


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-07 Thread Andreas Schwab
Andi Kleen a...@firstfloor.org writes:

 Jim jim...@xs4all.nl writes:

 After upgrading my Dell laptop, both OS+kernel the i8k interface was giving
 nonsensical output. As it turned out it's not the kernel but compiler
 upgrade which broke this.

 Guys at Archlinux have found the underlying cause (but don't seem to have
 submitted a patch yet):
   https://bbs.archlinux.org/viewtopic.php?pid=780692#p780692
 gcc seems to optimize the assembly statements away.

 And indeed, applying this patch makes the i8k interface work again,
 i.e. replacing the asm(..) construct by  asm volatile(..)

 The compiler really should not optimize the asm away, because
 it has both input and output arguments which are later used.
 asm volatile normally just means don't move significantly

The asm fails to mention that it modifies *regs.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-07 Thread Andi Kleen
Andreas Schwab sch...@linux-m68k.org writes:

 The asm fails to mention that it modifies *regs.

It has a memory clobber, that should be enough, no?

Besides in any case it cannot be eliminated because it has
valid non dead inputs and outputs.

-Andi
-- 
a...@linux.intel.com -- Speaking for myself only.