RE: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
> In that case it specifies that any evaluation of "*foo" in an rvalue > context specifies a read (with a few exceptions for G++ where the C++ > language generally confuses things). Specifically it mentions the > statement "*src;" and discusses the statement as providing "a void > context". In other words, a statement such as "(void)(expr);" is > redundant because the statement already implies void context and the > extra cast-to-void is just extra text. As such "(void)(*src);" on a > "volatile int *src;" is documented to force a read of "*src". Now, > if you actually _use_ the result over just casting it to void and > discarding it, then GCC can provide no _less_ guarantee with regards > to the read-and-store than it provides to the read-and-discard. I read over this section and didn't realize the implications of the void context. I now agree with you. DS - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
My apologies for the late response, I've had a lot on my schedule over the last week. On Dec 02, 2006, at 23:29:28, David Schwartz wrote: It comes down to just what those guarantees GCC provides actually are. This is the first correct statement in your email. In any case the documented GCC guarantees have always been much stronger than you have been trying to persuade us they should be. I would argue that the C standard somewhat indirectly specifies those guarantees but I really don't have the heart for any more language-lawyering so I'm going to leave it at that. I have tried to find any documentation of the guarantees gcc actually provides and have been unable to do so. Where are these "documented GCC guarantees" documented? Well, under "When is a Volatile Object Accessed" in the GCC manual (seems to be present for at least a few versions back): http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Volatiles.html#Volatiles In that case it specifies that any evaluation of "*foo" in an rvalue context specifies a read (with a few exceptions for G++ where the C++ language generally confuses things). Specifically it mentions the statement "*src;" and discusses the statement as providing "a void context". In other words, a statement such as "(void)(expr);" is redundant because the statement already implies void context and the extra cast-to-void is just extra text. As such "(void)(*src);" on a "volatile int *src;" is documented to force a read of "*src". Now, if you actually _use_ the result over just casting it to void and discarding it, then GCC can provide no _less_ guarantee with regards to the read-and-store than it provides to the read-and-discard. For example: /* This code is guaranteed to generate assembly to read the memory address into a register multiple times */ volatile int *foo = (expression); *foo; *foo; /* This code is guaranteed to do the same */ volatile int *foo = (expression); int x, y; x = *foo; y = *foo; If the result is actually the conditional expression for a loop as in the problem code then GCC would plainly have no choice about optimizing, not only is the volatile variable read and returned, but the result is the conditional for continued execution of a block of code with unknown side effects. Cheers, Kyle Moffett - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
My apologies for the late response, I've had a lot on my schedule over the last week. On Dec 02, 2006, at 23:29:28, David Schwartz wrote: It comes down to just what those guarantees GCC provides actually are. This is the first correct statement in your email. In any case the documented GCC guarantees have always been much stronger than you have been trying to persuade us they should be. I would argue that the C standard somewhat indirectly specifies those guarantees but I really don't have the heart for any more language-lawyering so I'm going to leave it at that. I have tried to find any documentation of the guarantees gcc actually provides and have been unable to do so. Where are these documented GCC guarantees documented? Well, under When is a Volatile Object Accessed in the GCC manual (seems to be present for at least a few versions back): http://gcc.gnu.org/onlinedocs/gcc-4.1.1/gcc/Volatiles.html#Volatiles In that case it specifies that any evaluation of *foo in an rvalue context specifies a read (with a few exceptions for G++ where the C++ language generally confuses things). Specifically it mentions the statement *src; and discusses the statement as providing a void context. In other words, a statement such as (void)(expr); is redundant because the statement already implies void context and the extra cast-to-void is just extra text. As such (void)(*src); on a volatile int *src; is documented to force a read of *src. Now, if you actually _use_ the result over just casting it to void and discarding it, then GCC can provide no _less_ guarantee with regards to the read-and-store than it provides to the read-and-discard. For example: /* This code is guaranteed to generate assembly to read the memory address into a register multiple times */ volatile int *foo = (expression); *foo; *foo; /* This code is guaranteed to do the same */ volatile int *foo = (expression); int x, y; x = *foo; y = *foo; If the result is actually the conditional expression for a loop as in the problem code then GCC would plainly have no choice about optimizing, not only is the volatile variable read and returned, but the result is the conditional for continued execution of a block of code with unknown side effects. Cheers, Kyle Moffett - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
In that case it specifies that any evaluation of *foo in an rvalue context specifies a read (with a few exceptions for G++ where the C++ language generally confuses things). Specifically it mentions the statement *src; and discusses the statement as providing a void context. In other words, a statement such as (void)(expr); is redundant because the statement already implies void context and the extra cast-to-void is just extra text. As such (void)(*src); on a volatile int *src; is documented to force a read of *src. Now, if you actually _use_ the result over just casting it to void and discarding it, then GCC can provide no _less_ guarantee with regards to the read-and-store than it provides to the read-and-discard. I read over this section and didn't realize the implications of the void context. I now agree with you. DS - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
> > It comes down to just what those guarantees GCC provides actually are. > This is the first correct statement in your email. In any case the > documented GCC guarantees have always been much stronger than you > have been trying to persuade us they should be. I would argue that > the C standard somewhat indirectly specifies those guarantees but I > really don't have the heart for any more language-lawyering so I'm > going to leave it at that. I have tried to find any documentation of the guarantees gcc actually provides and have been unable to do so. Where are these "documented GCC guarantees" documented? DS - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Dec 01, 2006, at 09:03:26, David Schwartz wrote: "David Schwartz" <[EMAIL PROTECTED]> writes: The problem is that '*(volatile unsigned int *)' results in a 'volatile unsigned int'. No, it doesn't. Values don't have qualifiers, only objects have. Qualifiers on rvalues are meaningless. Yeah. That's the problem here. The 'volatile' has no object to qualify. You are essentially lying to the compiler (telling it the pointer points to a volatile object when it doesn't) and hoping it does the right thing. No, the volatile has a perfectly valid object to qualify; the object pointed to by the provided pointer. Casting in C is explicitly defined so that you can change the properties of something. When I cast a "char *" pointer to "int *", I'm not telling the compiler to go through and magically convert a bunch of data, I'm telling it to treat the preexisting memory addresses, whatever they happened to be, as integer data. The _only_ place where that is not true is casting an "int" to a "float" or vice versa, and even then it doesn't apply to any levels of indirection (casting "int *" to "float *" is virtually guaranteed to cause headaches for first-year CS students). Likewise when I cast a pointer to "(volatile int *)", I am telling it that whatever happened before I want it to treat accesses through that new pointer as accesses to a volatile object, with all the implied confusion. Now if my code isn't careful about aliasing and I modified the data through a different pointer a few moments before (and I have strict-aliasing turned on) then I may get inconsistent results because the value has not been written to memory yet just stored in a register. Nothing in the standard requires any special behavior for accesses through volatile-qualified pointers. It only requires special behavior for access to objects that are in fact volatile. This is completely and utterly wrong. See this quote from the C99 standard pulled from the bugzilla page: 6.7.3: any expression referring to an object of volatile- qualified type must be evaluated strictly by the rules of the abstract machine, although precisely what constitutes an "access" to the object is implementation-defined. (Note, implementation- defined" behavior is required to be well-defined and documented*.) So if the reference in question is an "access", it must occur where the abstract machine says it should. Now GCC's documentation in multiple places refer to an empty statement containing an lvalue or rvalue as being "read" access. Example: The statement "int i = 0; i;" would cause "i" to be "read" in the second statement. Since "i" is not volatile then it may be optimized out, however if I stated "volatile int i = 0; i;", then the compiler would allocate "i" on the stack, assign 0 to it, then read it into a register. GCC could not optimize any part of that out without breaking the rules of the abstract machine. I think the technically right solution is some mechanism to define an object (which can be volatile-qualified) that exists at a particular address. Accessing this object would be accessing a volatile object and you'd get all the things the standard promises. Umm, that's exactly what "volatile" on a pointer means; treat this memory address as containing a volatile-qualified object. An adequate solution would probably be to make 'readl' return a volatile-qualified unsigned integer. No, you cannot return a volatile-qualified value because you a return value is an "rvalue" and rvalues cannot be "volatile". They can't be "const" or "restrict" either, it just doesn't make any sense. What would these mean? const int foo(float x); restrict int foo(float x); How can a return-value be const or restrict? It doesn't have an address I can take so I can't modify it where it is. If you have an expression for which you cannot get the address of with the "&" operator (in other words, an rvalue) then "volatile", "const", or "restrict" make no sense on the type of your expression. It comes down to just what those guarantees GCC provides actually are. This is the first correct statement in your email. In any case the documented GCC guarantees have always been much stronger than you have been trying to persuade us they should be. I would argue that the C standard somewhat indirectly specifies those guarantees but I really don't have the heart for any more language-lawyering so I'm going to leave it at that. Cheers, Kyle Moffett - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
>> Next you stick a my_other_func declaration in a header and use >> my_other_func instead of my_func() in the main function. Now the >> result is that the compiler has no damn clue what my_other_func() >> contains so it can't optimize it out of the loop with either >> version. You cannot treat "volatile" the way you are saying it is >> treated without severely violating both the C99 spec *and* common sense. > >The compiler *happens* to have no damn clue because such inter-module >optimizations don't exist. That doesn't make the code correct, just not >likely to demonstrate its brokenness. GCC has inter-module optimization, it's just not used everyday. I think I have seen a discussion on this. Right there it is -> http://lkml.org/lkml/2006/8/24/212 -`J' -- - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
Next you stick a my_other_func declaration in a header and use my_other_func instead of my_func() in the main function. Now the result is that the compiler has no damn clue what my_other_func() contains so it can't optimize it out of the loop with either version. You cannot treat volatile the way you are saying it is treated without severely violating both the C99 spec *and* common sense. The compiler *happens* to have no damn clue because such inter-module optimizations don't exist. That doesn't make the code correct, just not likely to demonstrate its brokenness. GCC has inter-module optimization, it's just not used everyday. I think I have seen a discussion on this. Right there it is - http://lkml.org/lkml/2006/8/24/212 -`J' -- - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Dec 01, 2006, at 09:03:26, David Schwartz wrote: David Schwartz [EMAIL PROTECTED] writes: The problem is that '*(volatile unsigned int *)' results in a 'volatile unsigned int'. No, it doesn't. Values don't have qualifiers, only objects have. Qualifiers on rvalues are meaningless. Yeah. That's the problem here. The 'volatile' has no object to qualify. You are essentially lying to the compiler (telling it the pointer points to a volatile object when it doesn't) and hoping it does the right thing. No, the volatile has a perfectly valid object to qualify; the object pointed to by the provided pointer. Casting in C is explicitly defined so that you can change the properties of something. When I cast a char * pointer to int *, I'm not telling the compiler to go through and magically convert a bunch of data, I'm telling it to treat the preexisting memory addresses, whatever they happened to be, as integer data. The _only_ place where that is not true is casting an int to a float or vice versa, and even then it doesn't apply to any levels of indirection (casting int * to float * is virtually guaranteed to cause headaches for first-year CS students). Likewise when I cast a pointer to (volatile int *), I am telling it that whatever happened before I want it to treat accesses through that new pointer as accesses to a volatile object, with all the implied confusion. Now if my code isn't careful about aliasing and I modified the data through a different pointer a few moments before (and I have strict-aliasing turned on) then I may get inconsistent results because the value has not been written to memory yet just stored in a register. Nothing in the standard requires any special behavior for accesses through volatile-qualified pointers. It only requires special behavior for access to objects that are in fact volatile. This is completely and utterly wrong. See this quote from the C99 standard pulled from the bugzilla page: 6.7.3: any expression referring to an object of volatile- qualified type must be evaluated strictly by the rules of the abstract machine, although precisely what constitutes an access to the object is implementation-defined. (Note, implementation- defined behavior is required to be well-defined and documented*.) So if the reference in question is an access, it must occur where the abstract machine says it should. Now GCC's documentation in multiple places refer to an empty statement containing an lvalue or rvalue as being read access. Example: The statement int i = 0; i; would cause i to be read in the second statement. Since i is not volatile then it may be optimized out, however if I stated volatile int i = 0; i;, then the compiler would allocate i on the stack, assign 0 to it, then read it into a register. GCC could not optimize any part of that out without breaking the rules of the abstract machine. I think the technically right solution is some mechanism to define an object (which can be volatile-qualified) that exists at a particular address. Accessing this object would be accessing a volatile object and you'd get all the things the standard promises. Umm, that's exactly what volatile on a pointer means; treat this memory address as containing a volatile-qualified object. An adequate solution would probably be to make 'readl' return a volatile-qualified unsigned integer. No, you cannot return a volatile-qualified value because you a return value is an rvalue and rvalues cannot be volatile. They can't be const or restrict either, it just doesn't make any sense. What would these mean? const int foo(float x); restrict int foo(float x); How can a return-value be const or restrict? It doesn't have an address I can take so I can't modify it where it is. If you have an expression for which you cannot get the address of with the operator (in other words, an rvalue) then volatile, const, or restrict make no sense on the type of your expression. It comes down to just what those guarantees GCC provides actually are. This is the first correct statement in your email. In any case the documented GCC guarantees have always been much stronger than you have been trying to persuade us they should be. I would argue that the C standard somewhat indirectly specifies those guarantees but I really don't have the heart for any more language-lawyering so I'm going to leave it at that. Cheers, Kyle Moffett - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
It comes down to just what those guarantees GCC provides actually are. This is the first correct statement in your email. In any case the documented GCC guarantees have always been much stronger than you have been trying to persuade us they should be. I would argue that the C standard somewhat indirectly specifies those guarantees but I really don't have the heart for any more language-lawyering so I'm going to leave it at that. I have tried to find any documentation of the guarantees gcc actually provides and have been unable to do so. Where are these documented GCC guarantees documented? DS - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
> "David Schwartz" <[EMAIL PROTECTED]> writes: > > > The problem is that '*(volatile unsigned int *)' results in a 'volatile > > unsigned int'. > > No, it doesn't. Values don't have qualifiers, only objects have. > Qualifiers on rvalues are meaningless. Yeah. That's the problem here. The 'volatile' has no object to qualify. You are essentially lying to the compiler (telling it the pointer points to a volatile object when it doesn't) and hoping it does the right thing. Nothing in the standard requires any special behavior for accesses through volatile-qualified pointers. It only requires special behavior for access to objects that are in fact volatile. I think the technically right solution is some mechanism to define an object (which can be volatile-qualified) that exists at a particular address. Accessing this object would be accessing a volatile object and you'd get all the things the standard promises. An adequate solution would probably be to make 'readl' return a volatile-qualified unsigned integer. However, I'd have no complaints if GCC provided stronger volatile guarantees than the C standard does, assuring that even subsequent casts or other changes still assure the access takes place where you expect it to. Just the guarantees in the standard get you only signals and longjmp. It comes down to just what those guarantees GCC provides actually are. DS - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
> No, it can't. If you leave the prototype alone and the function when > called in sequence returns the same list of values, then by > definition the internals can have no effect on the code which uses > that function. As further proof, if you wrapped my "my_func()" with > this in some C file: > int my_other_func(int a) > { > return my_func(a); > } This is an extremely strange position to take. If the function is in some sense broken or invokes undefined or platform-defined behavior, it most certainly can affect the code that uses that function. In this case, the code invokes implementation-defined behavior and is inlined. As a result, the outcome of the code is implementation-defined. > Next you stick a my_other_func declaration in a header and use > my_other_func instead of my_func() in the main function. Now the > result is that the compiler has no damn clue what my_other_func() > contains so it can't optimize it out of the loop with either > version. You cannot treat "volatile" the way you are saying it is > treated without severely violating both the C99 spec *and* common sense. The compiler *happens* to have no damn clue because such inter-module optimizations don't exist. That doesn't make the code correct, just not likely to demonstrate its brokenness. > > In some cases, it's very unlikely that compilers will ever become > > smart enough to demonstrate that our code is broken, but that > > doesn't make the code any less broken, just less likely to fail. > No, the code is not broken because the language simply isn't defined > that way. Essentially when the compiler is looking at any volatile > data it cannot ignore or optimize-away any operations on that data. Agreed. But it's not looking at any volatile data. (What data is volatile? A pointer is only *cast* to volatile, but then it's cast to a non-volatile type.) > On the other hand, when you cast volatile data into non-volatile > data, the compiler must preserve linearity of program execution. If > you call a function in a loop which dereferences a pointer to a > volatile then the compiler *MUST* always dereference the pointer, > even if it later discards the result and continues on its merry way. I agree, but that's not what happens. It does not discard the result, it casts the volatile away. > >> Actually, no. The reason for the volatile in the pointer > >> dereference is to force the memory access to *always* happen. > > > > That's why it was placed there, however it was thrown away right > > after it was placed, in the same step it was supposed to force a > > memory access. > Doesn't matter if you throw away the result. The C standard defines > this: >(void)( *(volatile int *)0xABCD1234 ); > to imply that the code reads an integer from that memory location and > then discards the result. The whole point of volatile is you still > MUST do the read. The C standard, AFAICT, doesn't cover this case. The cast to void means that no volatile object is ever accessed. One is named, but then the compiler is explicitly told to treat it like it's not volatile. The cast does not come after the reference, there is no sequence point here. > Feel free to read the bugzilla entry mentioned in > this thread as it even quotes all the pertinent sections of the C > standard for you. The bugzilla entry is about a different issue. If there are sections of the C standard that you think affect the case where the volatile is discarded by cast or conversion and never stored in any volatile-qualified variable, please tell me what they are. > > The problem is that '*(volatile unsigned int *)' results in a > > 'volatile unsigned int'. The *assignment* occurs in the return > > operation, after the 'volatile unsigned int' is *cast* to a plain > > 'unsigned int'. The assignment is *not* in any sense volatile or > > inviolate, so neither is the return value. > > No, the assignment is irrelevant but the pointer dereference in > rvalue context *is* relevant. The dereference forces a read operation. The dereference forces a read operation if and only if it results in a volatile value or is assigned to a volatile variable. That's what the standard says. In this case, you specifically ask the compiler to pretend the result is not volatile prior to *any* assignment. For example, the C standard talks about "accessing a volatile object", but there is no volatile object here. Only an object asked to be treated as volatile in the same sequence it is asked to be treated as not volatile. Accesses to volatile objects may not be reordered across sequence points and must be stable at sequence points, but there is no sequence point here between the cast to volatile and the coversion away from volatile. In fact, when the underlying objects are non-volatile, casting a pointer to volatile and dereferencing it does not provide *any* guarantees from the C standard. The C standard talks about accesses to volatile objects, not accesses no non-volatile objects
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
"David Schwartz" <[EMAIL PROTECTED]> writes: > The problem is that '*(volatile unsigned int *)' results in a 'volatile > unsigned int'. No, it doesn't. Values don't have qualifiers, only objects have. Qualifiers on rvalues are meaningless. Andreas. -- Andreas Schwab, SuSE Labs, [EMAIL PROTECTED] SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Dec 01, 2006, at 06:24:54, David Schwartz wrote: Imagine we change the code to read from some auto-increment hardware at a specific MMIO address instead of a global: static int my_func(int a) { return a + *(volatile int *)(0xABCD1234); } But you're telling me that the change in the header file (where the new function returns the exact same series of values with every call as the old) causes the program to enter an infinite loop? How do you rationalize this again? No, I'm not saying that. I'm saying it *can*. No, it can't. If you leave the prototype alone and the function when called in sequence returns the same list of values, then by definition the internals can have no effect on the code which uses that function. As further proof, if you wrapped my "my_func()" with this in some C file: int my_other_func(int a) { return my_func(a); } Next you stick a my_other_func declaration in a header and use my_other_func instead of my_func() in the main function. Now the result is that the compiler has no damn clue what my_other_func() contains so it can't optimize it out of the loop with either version. You cannot treat "volatile" the way you are saying it is treated without severely violating both the C99 spec *and* common sense. In some cases, it's very unlikely that compilers will ever become smart enough to demonstrate that our code is broken, but that doesn't make the code any less broken, just less likely to fail. No, the code is not broken because the language simply isn't defined that way. Essentially when the compiler is looking at any volatile data it cannot ignore or optimize-away any operations on that data. On the other hand, when you cast volatile data into non-volatile data, the compiler must preserve linearity of program execution. If you call a function in a loop which dereferences a pointer to a volatile then the compiler *MUST* always dereference the pointer, even if it later discards the result and continues on its merry way. Actually, no. The reason for the volatile in the pointer dereference is to force the memory access to *always* happen. That's why it was placed there, however it was thrown away right after it was placed, in the same step it was supposed to force a memory access. Doesn't matter if you throw away the result. The C standard defines this: (void)( *(volatile int *)0xABCD1234 ); to imply that the code reads an integer from that memory location and then discards the result. The whole point of volatile is you still MUST do the read. Feel free to read the bugzilla entry mentioned in this thread as it even quotes all the pertinent sections of the C standard for you. The problem is that '*(volatile unsigned int *)' results in a 'volatile unsigned int'. The *assignment* occurs in the return operation, after the 'volatile unsigned int' is *cast* to a plain 'unsigned int'. The assignment is *not* in any sense volatile or inviolate, so neither is the return value. No, the assignment is irrelevant but the pointer dereference in rvalue context *is* relevant. The dereference forces a read operation. One solution would be this: static inline unsigned int readl(const volatile void __iomem *addr) { volatile unsigned int j; j=*(volatile unsigned int __force *) addr; return j; } This is no different from the current code. You cast the pointer to a volatile unsigned int, you dereference that pointer, and you cast the result to an unsigned int. C does not have the same "assignment" distinctions that C++ has. (This may or may not fix the issue though. There is at least one known compiler issue that might be causing the breakage. However, correct compiler optimizations should be ruled out first.) Nope, any GCC behavior requiring code such as the above is broken, not just in my opinion but in the opinion of the GCC developers themselves, as you'd notice if you took the time to read down to the end of the bugzilla discussion. Cheers, Kyle Moffett - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
> Imagine we change the code to read from some auto-increment hardware > at a specific MMIO address instead of a global: > > static int my_func(int a) > > { > > return a + *(volatile int *)(0xABCD1234); > > } > But you're telling me that the change in the header file (where the > new function returns the exact same series of values with every call > as the old) causes the program to enter an infinite loop? > > How do you rationalize this again? No, I'm not saying that. I'm saying it *can*. We try to write code so that no matter what information the compiler has, it will still build correct code if the compiler is correct. When we don't do this, smarter compilers compile our code into executables that don't do what we want. In some cases, it's very unlikely that compilers will ever become smart enough to demonstrate that our code is broken, but that doesn't make the code any less broken, just less likely to fail. > > The 'readl' function should actually assign the value to a volatile > > variable. Assignments to volatiles cannot be cast away, but casts > > can and assignments to non-volatile variables can be optimized out. > Actually, no. The reason for the volatile in the pointer dereference > is to force the memory access to *always* happen. That's why it was placed there, however it was thrown away right after it was placed, in the same step it was supposed to force a memory access. > It's a guarantee > that the compiler will do that memory access every time it appears. Unless you throw it away before the memory access or in the same step as the memory access, say by casting it. > You have a volatile int afterwards and what you do with that nobody > cares. You have a volatile int unless you cast it so something else. > The point is the presence of the volatile in a single pointer- > dereference forcibly turns off any optimization of that specific > access, including loop unrolling and such. It did that in the past, because optimizers weren't smart enough. Now they're smarter, and so the breakage in the code is becoming apparent. The code is broken because it gets rid of the volatile in the same step that it expects the volatile to have effect. Only an assignment to a volatile variable cannot be elided by a cast. To put it another way: int j=*(int *)(volatile int *)f; is the same as int j=*(int *)f; Because the 'int *' cast removes the 'volatile int *' cast. This applies to whatever is cast, and whenever it is cast or assigned. Let's look back at 'readl': static inline unsigned int readl(const volatile void __iomem *addr) { return *(volatile unsigned int __force *) addr; } Notice that there is no assignment to anything volatile qualified. Notice also that before any assignment takes place, and in the same step as the access that you thing can't be eliminated, the result is cast to an 'unsigned int' and returned. The problem is that '*(volatile unsigned int *)' results in a 'volatile unsigned int'. The *assignment* occurs in the return operation, after the 'volatile unsigned int' is *cast* to a plain 'unsigned int'. The assignment is *not* in any sense volatile or inviolate, so neither is the return value. One solution would be this: static inline unsigned int readl(const volatile void __iomem *addr) { volatile unsigned int j; j=*(volatile unsigned int __force *) addr; return j; } This will probably result in an extra memory access though. There are probably better solutions but I can't think of any at the moment. (This may or may not fix the issue though. There is at least one known compiler issue that might be causing the breakage. However, correct compiler optimizations should be ruled out first.) DS - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
Imagine we change the code to read from some auto-increment hardware at a specific MMIO address instead of a global: static int my_func(int a) { return a + *(volatile int *)(0xABCD1234); } But you're telling me that the change in the header file (where the new function returns the exact same series of values with every call as the old) causes the program to enter an infinite loop? How do you rationalize this again? No, I'm not saying that. I'm saying it *can*. We try to write code so that no matter what information the compiler has, it will still build correct code if the compiler is correct. When we don't do this, smarter compilers compile our code into executables that don't do what we want. In some cases, it's very unlikely that compilers will ever become smart enough to demonstrate that our code is broken, but that doesn't make the code any less broken, just less likely to fail. The 'readl' function should actually assign the value to a volatile variable. Assignments to volatiles cannot be cast away, but casts can and assignments to non-volatile variables can be optimized out. Actually, no. The reason for the volatile in the pointer dereference is to force the memory access to *always* happen. That's why it was placed there, however it was thrown away right after it was placed, in the same step it was supposed to force a memory access. It's a guarantee that the compiler will do that memory access every time it appears. Unless you throw it away before the memory access or in the same step as the memory access, say by casting it. You have a volatile int afterwards and what you do with that nobody cares. You have a volatile int unless you cast it so something else. The point is the presence of the volatile in a single pointer- dereference forcibly turns off any optimization of that specific access, including loop unrolling and such. It did that in the past, because optimizers weren't smart enough. Now they're smarter, and so the breakage in the code is becoming apparent. The code is broken because it gets rid of the volatile in the same step that it expects the volatile to have effect. Only an assignment to a volatile variable cannot be elided by a cast. To put it another way: int j=*(int *)(volatile int *)f; is the same as int j=*(int *)f; Because the 'int *' cast removes the 'volatile int *' cast. This applies to whatever is cast, and whenever it is cast or assigned. Let's look back at 'readl': static inline unsigned int readl(const volatile void __iomem *addr) { return *(volatile unsigned int __force *) addr; } Notice that there is no assignment to anything volatile qualified. Notice also that before any assignment takes place, and in the same step as the access that you thing can't be eliminated, the result is cast to an 'unsigned int' and returned. The problem is that '*(volatile unsigned int *)' results in a 'volatile unsigned int'. The *assignment* occurs in the return operation, after the 'volatile unsigned int' is *cast* to a plain 'unsigned int'. The assignment is *not* in any sense volatile or inviolate, so neither is the return value. One solution would be this: static inline unsigned int readl(const volatile void __iomem *addr) { volatile unsigned int j; j=*(volatile unsigned int __force *) addr; return j; } This will probably result in an extra memory access though. There are probably better solutions but I can't think of any at the moment. (This may or may not fix the issue though. There is at least one known compiler issue that might be causing the breakage. However, correct compiler optimizations should be ruled out first.) DS - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Dec 01, 2006, at 06:24:54, David Schwartz wrote: Imagine we change the code to read from some auto-increment hardware at a specific MMIO address instead of a global: static int my_func(int a) { return a + *(volatile int *)(0xABCD1234); } But you're telling me that the change in the header file (where the new function returns the exact same series of values with every call as the old) causes the program to enter an infinite loop? How do you rationalize this again? No, I'm not saying that. I'm saying it *can*. No, it can't. If you leave the prototype alone and the function when called in sequence returns the same list of values, then by definition the internals can have no effect on the code which uses that function. As further proof, if you wrapped my my_func() with this in some C file: int my_other_func(int a) { return my_func(a); } Next you stick a my_other_func declaration in a header and use my_other_func instead of my_func() in the main function. Now the result is that the compiler has no damn clue what my_other_func() contains so it can't optimize it out of the loop with either version. You cannot treat volatile the way you are saying it is treated without severely violating both the C99 spec *and* common sense. In some cases, it's very unlikely that compilers will ever become smart enough to demonstrate that our code is broken, but that doesn't make the code any less broken, just less likely to fail. No, the code is not broken because the language simply isn't defined that way. Essentially when the compiler is looking at any volatile data it cannot ignore or optimize-away any operations on that data. On the other hand, when you cast volatile data into non-volatile data, the compiler must preserve linearity of program execution. If you call a function in a loop which dereferences a pointer to a volatile then the compiler *MUST* always dereference the pointer, even if it later discards the result and continues on its merry way. Actually, no. The reason for the volatile in the pointer dereference is to force the memory access to *always* happen. That's why it was placed there, however it was thrown away right after it was placed, in the same step it was supposed to force a memory access. Doesn't matter if you throw away the result. The C standard defines this: (void)( *(volatile int *)0xABCD1234 ); to imply that the code reads an integer from that memory location and then discards the result. The whole point of volatile is you still MUST do the read. Feel free to read the bugzilla entry mentioned in this thread as it even quotes all the pertinent sections of the C standard for you. The problem is that '*(volatile unsigned int *)' results in a 'volatile unsigned int'. The *assignment* occurs in the return operation, after the 'volatile unsigned int' is *cast* to a plain 'unsigned int'. The assignment is *not* in any sense volatile or inviolate, so neither is the return value. No, the assignment is irrelevant but the pointer dereference in rvalue context *is* relevant. The dereference forces a read operation. One solution would be this: static inline unsigned int readl(const volatile void __iomem *addr) { volatile unsigned int j; j=*(volatile unsigned int __force *) addr; return j; } This is no different from the current code. You cast the pointer to a volatile unsigned int, you dereference that pointer, and you cast the result to an unsigned int. C does not have the same assignment distinctions that C++ has. (This may or may not fix the issue though. There is at least one known compiler issue that might be causing the breakage. However, correct compiler optimizations should be ruled out first.) Nope, any GCC behavior requiring code such as the above is broken, not just in my opinion but in the opinion of the GCC developers themselves, as you'd notice if you took the time to read down to the end of the bugzilla discussion. Cheers, Kyle Moffett - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
David Schwartz [EMAIL PROTECTED] writes: The problem is that '*(volatile unsigned int *)' results in a 'volatile unsigned int'. No, it doesn't. Values don't have qualifiers, only objects have. Qualifiers on rvalues are meaningless. Andreas. -- Andreas Schwab, SuSE Labs, [EMAIL PROTECTED] SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 And now for something completely different. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
No, it can't. If you leave the prototype alone and the function when called in sequence returns the same list of values, then by definition the internals can have no effect on the code which uses that function. As further proof, if you wrapped my my_func() with this in some C file: int my_other_func(int a) { return my_func(a); } This is an extremely strange position to take. If the function is in some sense broken or invokes undefined or platform-defined behavior, it most certainly can affect the code that uses that function. In this case, the code invokes implementation-defined behavior and is inlined. As a result, the outcome of the code is implementation-defined. Next you stick a my_other_func declaration in a header and use my_other_func instead of my_func() in the main function. Now the result is that the compiler has no damn clue what my_other_func() contains so it can't optimize it out of the loop with either version. You cannot treat volatile the way you are saying it is treated without severely violating both the C99 spec *and* common sense. The compiler *happens* to have no damn clue because such inter-module optimizations don't exist. That doesn't make the code correct, just not likely to demonstrate its brokenness. In some cases, it's very unlikely that compilers will ever become smart enough to demonstrate that our code is broken, but that doesn't make the code any less broken, just less likely to fail. No, the code is not broken because the language simply isn't defined that way. Essentially when the compiler is looking at any volatile data it cannot ignore or optimize-away any operations on that data. Agreed. But it's not looking at any volatile data. (What data is volatile? A pointer is only *cast* to volatile, but then it's cast to a non-volatile type.) On the other hand, when you cast volatile data into non-volatile data, the compiler must preserve linearity of program execution. If you call a function in a loop which dereferences a pointer to a volatile then the compiler *MUST* always dereference the pointer, even if it later discards the result and continues on its merry way. I agree, but that's not what happens. It does not discard the result, it casts the volatile away. Actually, no. The reason for the volatile in the pointer dereference is to force the memory access to *always* happen. That's why it was placed there, however it was thrown away right after it was placed, in the same step it was supposed to force a memory access. Doesn't matter if you throw away the result. The C standard defines this: (void)( *(volatile int *)0xABCD1234 ); to imply that the code reads an integer from that memory location and then discards the result. The whole point of volatile is you still MUST do the read. The C standard, AFAICT, doesn't cover this case. The cast to void means that no volatile object is ever accessed. One is named, but then the compiler is explicitly told to treat it like it's not volatile. The cast does not come after the reference, there is no sequence point here. Feel free to read the bugzilla entry mentioned in this thread as it even quotes all the pertinent sections of the C standard for you. The bugzilla entry is about a different issue. If there are sections of the C standard that you think affect the case where the volatile is discarded by cast or conversion and never stored in any volatile-qualified variable, please tell me what they are. The problem is that '*(volatile unsigned int *)' results in a 'volatile unsigned int'. The *assignment* occurs in the return operation, after the 'volatile unsigned int' is *cast* to a plain 'unsigned int'. The assignment is *not* in any sense volatile or inviolate, so neither is the return value. No, the assignment is irrelevant but the pointer dereference in rvalue context *is* relevant. The dereference forces a read operation. The dereference forces a read operation if and only if it results in a volatile value or is assigned to a volatile variable. That's what the standard says. In this case, you specifically ask the compiler to pretend the result is not volatile prior to *any* assignment. For example, the C standard talks about accessing a volatile object, but there is no volatile object here. Only an object asked to be treated as volatile in the same sequence it is asked to be treated as not volatile. Accesses to volatile objects may not be reordered across sequence points and must be stable at sequence points, but there is no sequence point here between the cast to volatile and the coversion away from volatile. In fact, when the underlying objects are non-volatile, casting a pointer to volatile and dereferencing it does not provide *any* guarantees from the C standard. The C standard talks about accesses to volatile objects, not accesses no non-volatile objects through pointers that claim the object is volatile. One solution
RE: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
David Schwartz [EMAIL PROTECTED] writes: The problem is that '*(volatile unsigned int *)' results in a 'volatile unsigned int'. No, it doesn't. Values don't have qualifiers, only objects have. Qualifiers on rvalues are meaningless. Yeah. That's the problem here. The 'volatile' has no object to qualify. You are essentially lying to the compiler (telling it the pointer points to a volatile object when it doesn't) and hoping it does the right thing. Nothing in the standard requires any special behavior for accesses through volatile-qualified pointers. It only requires special behavior for access to objects that are in fact volatile. I think the technically right solution is some mechanism to define an object (which can be volatile-qualified) that exists at a particular address. Accessing this object would be accessing a volatile object and you'd get all the things the standard promises. An adequate solution would probably be to make 'readl' return a volatile-qualified unsigned integer. However, I'd have no complaints if GCC provided stronger volatile guarantees than the C standard does, assuring that even subsequent casts or other changes still assure the access takes place where you expect it to. Just the guarantees in the standard get you only signals and longjmp. It comes down to just what those guarantees GCC provides actually are. DS - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Fri, Dec 01, 2006 at 08:28:16AM +0100, Willy Tarreau wrote: > Oh, I'm perfectly aware of this. That's in part why I started the hotfix > branch in the past :-) But sometimes, fixes consist in merging all the > patches from the maintenance branch (eg: from 4.1.0 to 4.1.1), and if > this is the case, there would not be much justification not to simply > update the version. In fact, what's really missing is a "fixlevel" in > the packages, to inform the user that 4.1.0 as shipped by the distro > has the same level of fixes as 4.1.1. But this is what the version is > used for today. This is even more complicated by the fact that upstream GCC release branches (and also several Linux distributors) start announcing the upcoming version already a few days after a release is tagged. E.g. 14 days old gcc-4_1-branch says: ./xgcc -B ./ --version; ./xgcc -B ./ -dD -E -xc /dev/null | grep GNU xgcc (GCC) 4.1.2 20061114 (prerelease) Copyright (C) 2006 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. #define __GNUC__ 4 #define __GNUC_MINOR__ 1 #define __GNUC_PATCHLEVEL__ 2 but GCC 4.1.2 has not been released yet. In Fedora Core/RHEL and I think a few other distros the version number is only changed when it is officially released, e.g.: gcc --version; gcc -dD -E -xc /dev/null | grep GNU gcc (GCC) 4.1.1 20061011 (Red Hat 4.1.1-30) Copyright (C) 2006 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. #define __GNUC__ 4 #define __GNUC_MINOR__ 1 #define __GNUC_PATCHLEVEL__ 1 #define __GNUC_RH_RELEASE__ 30 Note, 4.1.1 was released end of May this year and 4.1.2 has not been released. So, using __GNUC_PATCHLEVEL__ to detect if a bug has been fixed or not isn't very useful (you'd need to rule out also __GNUC_PATCHLEVEL__ <= 1 because gcc-4_1-branch was announcing that patchlevel already since beggining of March, on the other side there is a lot of GCCs with __GNUC_PATCHLEVEL__ == 1 that certainly have that bug fixed). You perhaps could parse the prerelease vs. release vs. vendor strings, but that could be quite difficult, perhaps easier would be just parse the date in the --version output. Checking for the bug is best though, because that will catch even backports of the bugfix without rebasing from the release branch. Jakub - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Fri, Dec 01, 2006 at 05:32:42PM +1100, Keith Owens wrote: > Willy Tarreau (on Fri, 1 Dec 2006 06:26:53 +0100) wrote: > >On Fri, Dec 01, 2006 at 04:14:04PM +1100, Keith Owens wrote: > >> SuSE's SLES10 ships with gcc 4.1.0. There is nothing to stop a > >> distributor from backporting the bug fix from gcc 4.1.1 to 4.1.0, but > >> this patch would not allow the fixed compiler to build the kernel. > > > >Then maybe replace #error with #warning ? It's too dangerous to let people > >build their kernel with a known broken compiler without being informed. > > Agreed. > > >I think this shows the limit of backports to known broken versions. > >Providing a full update to 4.1.1 would certainly be cleaner for all > >customers than backporting 4.1.1 to 4.1.0 and calling it 4.1.0. > > Agreed, but Enterprise customers expect bug fixes, not wholesale > replacements of critical programs. Oh, I'm perfectly aware of this. That's in part why I started the hotfix branch in the past :-) But sometimes, fixes consist in merging all the patches from the maintenance branch (eg: from 4.1.0 to 4.1.1), and if this is the case, there would not be much justification not to simply update the version. In fact, what's really missing is a "fixlevel" in the packages, to inform the user that 4.1.0 as shipped by the distro has the same level of fixes as 4.1.1. But this is what the version is used for today. > >Another solution would be to be able to check gcc for known bugs in the > >makefile, just like we check it for specific options. But I don't know > >how we can check gcc for bad code, especially in cross-compile environments > > It is doable, but it is as ugly as hell. Note the lack of a > signed-off-by :) Yes it's ugly, but at least it's a proposal. We might need something like this as a more generic solution across all architectures to easily check for such known problems. It's somewhat comparable to runtime fixups but here it's build time fixups. Regards, Willy > --- > arch/i386/kernel/Makefile | 16 > 1 file changed, 16 insertions(+) > > Index: linux/arch/i386/kernel/Makefile > === > --- linux.orig/arch/i386/kernel/Makefile > +++ linux/arch/i386/kernel/Makefile > @@ -83,3 +83,19 @@ $(obj)/vsyscall-syms.o: $(src)/vsyscall. > k8-y += ../../x86_64/kernel/k8.o > stacktrace-y += ../../x86_64/kernel/stacktrace.o > > +# Some versions of gcc generate invalid code for hpet_timer, depending > +# on other config options. Make sure that the generated code is valid. > +# Invalid versions of gcc generate a tight loop in wait_hpet_tick, with > +# no 'cmp' instructions. Extract the generated object code for > +# wait_hpet_tick, down to the next function then check that the code > +# contains at least one comparison. > + > +ifeq ($(CONFIG_HPET_TIMER),y) > +$(obj)/built-in.o: $(obj)/.tmp_check_gcc_bug > + > +$(obj)/.tmp_check_gcc_bug: $(obj)/time_hpet.o > + $(Q)[ -n "`$(OBJDUMP) -Sr $< | grep -A40 ':' | sed -e > '1d; />:$$/,$$d;' | grep -w cmp`" ] || \ > + (echo gcc volatile bug detected, fix your gcc; exit 1) > + $(Q)touch $@ > + > +endif - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
Willy Tarreau (on Fri, 1 Dec 2006 06:26:53 +0100) wrote: >On Fri, Dec 01, 2006 at 04:14:04PM +1100, Keith Owens wrote: >> SuSE's SLES10 ships with gcc 4.1.0. There is nothing to stop a >> distributor from backporting the bug fix from gcc 4.1.1 to 4.1.0, but >> this patch would not allow the fixed compiler to build the kernel. > >Then maybe replace #error with #warning ? It's too dangerous to let people >build their kernel with a known broken compiler without being informed. Agreed. >I think this shows the limit of backports to known broken versions. >Providing a full update to 4.1.1 would certainly be cleaner for all >customers than backporting 4.1.1 to 4.1.0 and calling it 4.1.0. Agreed, but Enterprise customers expect bug fixes, not wholesale replacements of critical programs. >Another solution would be to be able to check gcc for known bugs in the >makefile, just like we check it for specific options. But I don't know >how we can check gcc for bad code, especially in cross-compile environments It is doable, but it is as ugly as hell. Note the lack of a signed-off-by :) --- arch/i386/kernel/Makefile | 16 1 file changed, 16 insertions(+) Index: linux/arch/i386/kernel/Makefile === --- linux.orig/arch/i386/kernel/Makefile +++ linux/arch/i386/kernel/Makefile @@ -83,3 +83,19 @@ $(obj)/vsyscall-syms.o: $(src)/vsyscall. k8-y += ../../x86_64/kernel/k8.o stacktrace-y += ../../x86_64/kernel/stacktrace.o +# Some versions of gcc generate invalid code for hpet_timer, depending +# on other config options. Make sure that the generated code is valid. +# Invalid versions of gcc generate a tight loop in wait_hpet_tick, with +# no 'cmp' instructions. Extract the generated object code for +# wait_hpet_tick, down to the next function then check that the code +# contains at least one comparison. + +ifeq ($(CONFIG_HPET_TIMER),y) +$(obj)/built-in.o: $(obj)/.tmp_check_gcc_bug + +$(obj)/.tmp_check_gcc_bug: $(obj)/time_hpet.o + $(Q)[ -n "`$(OBJDUMP) -Sr $< | grep -A40 ':' | sed -e '1d; />:$$/,$$d;' | grep -w cmp`" ] || \ + (echo gcc volatile bug detected, fix your gcc; exit 1) + $(Q)touch $@ + +endif - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Nov 29, 2006, at 20:04:28, David Schwartz wrote: Ask yourself this question: Can an assignment to a non-volatile variable be optimized out? Then ask yourself this question: Does casting away volatile make it not volatile any more? Hmm, let's put this in a C file: int my_global = 0; And put this in a header: extern int my_global; static int my_func(int a) { return a + my_global++; } Now put this in another source file #include int main() { while (my_func(3) < 5) printf("It hasn't happened yet!\n"); return 0; } The obvious result is that it prints "It hasn't happened yet!" twice. On the third iteration when my_global == 2 it breaks out of the loop. This is all fairly standard C so far. Imagine we change the code to read from some auto-increment hardware at a specific MMIO address instead of a global: static int my_func(int a) { return a + *(volatile int *)(0xABCD1234); } But you're telling me that the change in the header file (where the new function returns the exact same series of values with every call as the old) causes the program to enter an infinite loop? How do you rationalize this again? The 'readl' function should actually assign the value to a volatile variable. Assignments to volatiles cannot be cast away, but casts can and assignments to non-volatile variables can be optimized out. Actually, no. The reason for the volatile in the pointer dereference is to force the memory access to *always* happen. It's a guarantee that the compiler will do that memory access every time it appears. You have a volatile int afterwards and what you do with that nobody cares. The point is the presence of the volatile in a single pointer- dereference forcibly turns off any optimization of that specific access, including loop unrolling and such. Cheers, Kyle Moffett - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Fri, Dec 01, 2006 at 04:14:04PM +1100, Keith Owens wrote: > Andrew Morton (on Thu, 30 Nov 2006 21:05:51 -0800) wrote: > >On Wed, 29 Nov 2006 21:14:10 +0100 > >Willy Tarreau <[EMAIL PROTECTED]> wrote: > > > >> Then why not simply check for gcc 4.1.0 in compiler.h and refuse to build > >> with 4.1.0 if it's known to produce bad code ? > > > >Think so. I'll queue this and see how many howls it causes. > > > >--- a/init/main.c~gcc-4-1-0-is-bust > >+++ a/init/main.c > >@@ -75,6 +75,10 @@ > > #error Sorry, your GCC is too old. It builds incorrect kernels. > > #endif > > > >+#if __GNUC__ == 4 && __GNUC_MINOR__ == 1 && __GNUC_PATCHLEVEL__ == 0 > >+#error gcc-4.1.0 is known to miscompile the kernel. Please use a different > >compiler version. > >+#endif > >+ > > static int init(void *); > > > > extern void init_IRQ(void); > > SuSE's SLES10 ships with gcc 4.1.0. There is nothing to stop a > distributor from backporting the bug fix from gcc 4.1.1 to 4.1.0, but > this patch would not allow the fixed compiler to build the kernel. Then maybe replace #error with #warning ? It's too dangerous to let people build their kernel with a known broken compiler without being informed. I think this shows the limit of backports to known broken versions. Providing a full update to 4.1.1 would certainly be cleaner for all customers than backporting 4.1.1 to 4.1.0 and calling it 4.1.0. Another solution would be to be able to check gcc for known bugs in the makefile, just like we check it for specific options. But I don't know how we can check gcc for bad code, especially in cross-compile environments :-/ Willy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
Andrew Morton (on Thu, 30 Nov 2006 21:05:51 -0800) wrote: >On Wed, 29 Nov 2006 21:14:10 +0100 >Willy Tarreau <[EMAIL PROTECTED]> wrote: > >> Then why not simply check for gcc 4.1.0 in compiler.h and refuse to build >> with 4.1.0 if it's known to produce bad code ? > >Think so. I'll queue this and see how many howls it causes. > >--- a/init/main.c~gcc-4-1-0-is-bust >+++ a/init/main.c >@@ -75,6 +75,10 @@ > #error Sorry, your GCC is too old. It builds incorrect kernels. > #endif > >+#if __GNUC__ == 4 && __GNUC_MINOR__ == 1 && __GNUC_PATCHLEVEL__ == 0 >+#error gcc-4.1.0 is known to miscompile the kernel. Please use a different >compiler version. >+#endif >+ > static int init(void *); > > extern void init_IRQ(void); SuSE's SLES10 ships with gcc 4.1.0. There is nothing to stop a distributor from backporting the bug fix from gcc 4.1.1 to 4.1.0, but this patch would not allow the fixed compiler to build the kernel. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Wed, 29 Nov 2006 21:14:10 +0100 Willy Tarreau <[EMAIL PROTECTED]> wrote: > Then why not simply check for gcc 4.1.0 in compiler.h and refuse to build > with 4.1.0 if it's known to produce bad code ? Think so. I'll queue this and see how many howls it causes. --- a/init/main.c~gcc-4-1-0-is-bust +++ a/init/main.c @@ -75,6 +75,10 @@ #error Sorry, your GCC is too old. It builds incorrect kernels. #endif +#if __GNUC__ == 4 && __GNUC_MINOR__ == 1 && __GNUC_PATCHLEVEL__ == 0 +#error gcc-4.1.0 is known to miscompile the kernel. Please use a different compiler version. +#endif + static int init(void *); extern void init_IRQ(void); _ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Wed, 29 Nov 2006 21:14:10 +0100 Willy Tarreau [EMAIL PROTECTED] wrote: Then why not simply check for gcc 4.1.0 in compiler.h and refuse to build with 4.1.0 if it's known to produce bad code ? Think so. I'll queue this and see how many howls it causes. --- a/init/main.c~gcc-4-1-0-is-bust +++ a/init/main.c @@ -75,6 +75,10 @@ #error Sorry, your GCC is too old. It builds incorrect kernels. #endif +#if __GNUC__ == 4 __GNUC_MINOR__ == 1 __GNUC_PATCHLEVEL__ == 0 +#error gcc-4.1.0 is known to miscompile the kernel. Please use a different compiler version. +#endif + static int init(void *); extern void init_IRQ(void); _ - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
Andrew Morton (on Thu, 30 Nov 2006 21:05:51 -0800) wrote: On Wed, 29 Nov 2006 21:14:10 +0100 Willy Tarreau [EMAIL PROTECTED] wrote: Then why not simply check for gcc 4.1.0 in compiler.h and refuse to build with 4.1.0 if it's known to produce bad code ? Think so. I'll queue this and see how many howls it causes. --- a/init/main.c~gcc-4-1-0-is-bust +++ a/init/main.c @@ -75,6 +75,10 @@ #error Sorry, your GCC is too old. It builds incorrect kernels. #endif +#if __GNUC__ == 4 __GNUC_MINOR__ == 1 __GNUC_PATCHLEVEL__ == 0 +#error gcc-4.1.0 is known to miscompile the kernel. Please use a different compiler version. +#endif + static int init(void *); extern void init_IRQ(void); SuSE's SLES10 ships with gcc 4.1.0. There is nothing to stop a distributor from backporting the bug fix from gcc 4.1.1 to 4.1.0, but this patch would not allow the fixed compiler to build the kernel. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Fri, Dec 01, 2006 at 04:14:04PM +1100, Keith Owens wrote: Andrew Morton (on Thu, 30 Nov 2006 21:05:51 -0800) wrote: On Wed, 29 Nov 2006 21:14:10 +0100 Willy Tarreau [EMAIL PROTECTED] wrote: Then why not simply check for gcc 4.1.0 in compiler.h and refuse to build with 4.1.0 if it's known to produce bad code ? Think so. I'll queue this and see how many howls it causes. --- a/init/main.c~gcc-4-1-0-is-bust +++ a/init/main.c @@ -75,6 +75,10 @@ #error Sorry, your GCC is too old. It builds incorrect kernels. #endif +#if __GNUC__ == 4 __GNUC_MINOR__ == 1 __GNUC_PATCHLEVEL__ == 0 +#error gcc-4.1.0 is known to miscompile the kernel. Please use a different compiler version. +#endif + static int init(void *); extern void init_IRQ(void); SuSE's SLES10 ships with gcc 4.1.0. There is nothing to stop a distributor from backporting the bug fix from gcc 4.1.1 to 4.1.0, but this patch would not allow the fixed compiler to build the kernel. Then maybe replace #error with #warning ? It's too dangerous to let people build their kernel with a known broken compiler without being informed. I think this shows the limit of backports to known broken versions. Providing a full update to 4.1.1 would certainly be cleaner for all customers than backporting 4.1.1 to 4.1.0 and calling it 4.1.0. Another solution would be to be able to check gcc for known bugs in the makefile, just like we check it for specific options. But I don't know how we can check gcc for bad code, especially in cross-compile environments :-/ Willy - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Nov 29, 2006, at 20:04:28, David Schwartz wrote: Ask yourself this question: Can an assignment to a non-volatile variable be optimized out? Then ask yourself this question: Does casting away volatile make it not volatile any more? Hmm, let's put this in a C file: int my_global = 0; And put this in a header: extern int my_global; static int my_func(int a) { return a + my_global++; } Now put this in another source file #include my_header.h int main() { while (my_func(3) 5) printf(It hasn't happened yet!\n); return 0; } The obvious result is that it prints It hasn't happened yet! twice. On the third iteration when my_global == 2 it breaks out of the loop. This is all fairly standard C so far. Imagine we change the code to read from some auto-increment hardware at a specific MMIO address instead of a global: static int my_func(int a) { return a + *(volatile int *)(0xABCD1234); } But you're telling me that the change in the header file (where the new function returns the exact same series of values with every call as the old) causes the program to enter an infinite loop? How do you rationalize this again? The 'readl' function should actually assign the value to a volatile variable. Assignments to volatiles cannot be cast away, but casts can and assignments to non-volatile variables can be optimized out. Actually, no. The reason for the volatile in the pointer dereference is to force the memory access to *always* happen. It's a guarantee that the compiler will do that memory access every time it appears. You have a volatile int afterwards and what you do with that nobody cares. The point is the presence of the volatile in a single pointer- dereference forcibly turns off any optimization of that specific access, including loop unrolling and such. Cheers, Kyle Moffett - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
Willy Tarreau (on Fri, 1 Dec 2006 06:26:53 +0100) wrote: On Fri, Dec 01, 2006 at 04:14:04PM +1100, Keith Owens wrote: SuSE's SLES10 ships with gcc 4.1.0. There is nothing to stop a distributor from backporting the bug fix from gcc 4.1.1 to 4.1.0, but this patch would not allow the fixed compiler to build the kernel. Then maybe replace #error with #warning ? It's too dangerous to let people build their kernel with a known broken compiler without being informed. Agreed. I think this shows the limit of backports to known broken versions. Providing a full update to 4.1.1 would certainly be cleaner for all customers than backporting 4.1.1 to 4.1.0 and calling it 4.1.0. Agreed, but Enterprise customers expect bug fixes, not wholesale replacements of critical programs. Another solution would be to be able to check gcc for known bugs in the makefile, just like we check it for specific options. But I don't know how we can check gcc for bad code, especially in cross-compile environments It is doable, but it is as ugly as hell. Note the lack of a signed-off-by :) --- arch/i386/kernel/Makefile | 16 1 file changed, 16 insertions(+) Index: linux/arch/i386/kernel/Makefile === --- linux.orig/arch/i386/kernel/Makefile +++ linux/arch/i386/kernel/Makefile @@ -83,3 +83,19 @@ $(obj)/vsyscall-syms.o: $(src)/vsyscall. k8-y += ../../x86_64/kernel/k8.o stacktrace-y += ../../x86_64/kernel/stacktrace.o +# Some versions of gcc generate invalid code for hpet_timer, depending +# on other config options. Make sure that the generated code is valid. +# Invalid versions of gcc generate a tight loop in wait_hpet_tick, with +# no 'cmp' instructions. Extract the generated object code for +# wait_hpet_tick, down to the next function then check that the code +# contains at least one comparison. + +ifeq ($(CONFIG_HPET_TIMER),y) +$(obj)/built-in.o: $(obj)/.tmp_check_gcc_bug + +$(obj)/.tmp_check_gcc_bug: $(obj)/time_hpet.o + $(Q)[ -n `$(OBJDUMP) -Sr $ | grep -A40 'wait_hpet_tick:' | sed -e '1d; /:$$/,$$d;' | grep -w cmp` ] || \ + (echo gcc volatile bug detected, fix your gcc; exit 1) + $(Q)touch $@ + +endif - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Fri, Dec 01, 2006 at 05:32:42PM +1100, Keith Owens wrote: Willy Tarreau (on Fri, 1 Dec 2006 06:26:53 +0100) wrote: On Fri, Dec 01, 2006 at 04:14:04PM +1100, Keith Owens wrote: SuSE's SLES10 ships with gcc 4.1.0. There is nothing to stop a distributor from backporting the bug fix from gcc 4.1.1 to 4.1.0, but this patch would not allow the fixed compiler to build the kernel. Then maybe replace #error with #warning ? It's too dangerous to let people build their kernel with a known broken compiler without being informed. Agreed. I think this shows the limit of backports to known broken versions. Providing a full update to 4.1.1 would certainly be cleaner for all customers than backporting 4.1.1 to 4.1.0 and calling it 4.1.0. Agreed, but Enterprise customers expect bug fixes, not wholesale replacements of critical programs. Oh, I'm perfectly aware of this. That's in part why I started the hotfix branch in the past :-) But sometimes, fixes consist in merging all the patches from the maintenance branch (eg: from 4.1.0 to 4.1.1), and if this is the case, there would not be much justification not to simply update the version. In fact, what's really missing is a fixlevel in the packages, to inform the user that 4.1.0 as shipped by the distro has the same level of fixes as 4.1.1. But this is what the version is used for today. Another solution would be to be able to check gcc for known bugs in the makefile, just like we check it for specific options. But I don't know how we can check gcc for bad code, especially in cross-compile environments It is doable, but it is as ugly as hell. Note the lack of a signed-off-by :) Yes it's ugly, but at least it's a proposal. We might need something like this as a more generic solution across all architectures to easily check for such known problems. It's somewhat comparable to runtime fixups but here it's build time fixups. Regards, Willy --- arch/i386/kernel/Makefile | 16 1 file changed, 16 insertions(+) Index: linux/arch/i386/kernel/Makefile === --- linux.orig/arch/i386/kernel/Makefile +++ linux/arch/i386/kernel/Makefile @@ -83,3 +83,19 @@ $(obj)/vsyscall-syms.o: $(src)/vsyscall. k8-y += ../../x86_64/kernel/k8.o stacktrace-y += ../../x86_64/kernel/stacktrace.o +# Some versions of gcc generate invalid code for hpet_timer, depending +# on other config options. Make sure that the generated code is valid. +# Invalid versions of gcc generate a tight loop in wait_hpet_tick, with +# no 'cmp' instructions. Extract the generated object code for +# wait_hpet_tick, down to the next function then check that the code +# contains at least one comparison. + +ifeq ($(CONFIG_HPET_TIMER),y) +$(obj)/built-in.o: $(obj)/.tmp_check_gcc_bug + +$(obj)/.tmp_check_gcc_bug: $(obj)/time_hpet.o + $(Q)[ -n `$(OBJDUMP) -Sr $ | grep -A40 'wait_hpet_tick:' | sed -e '1d; /:$$/,$$d;' | grep -w cmp` ] || \ + (echo gcc volatile bug detected, fix your gcc; exit 1) + $(Q)touch $@ + +endif - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Fri, Dec 01, 2006 at 08:28:16AM +0100, Willy Tarreau wrote: Oh, I'm perfectly aware of this. That's in part why I started the hotfix branch in the past :-) But sometimes, fixes consist in merging all the patches from the maintenance branch (eg: from 4.1.0 to 4.1.1), and if this is the case, there would not be much justification not to simply update the version. In fact, what's really missing is a fixlevel in the packages, to inform the user that 4.1.0 as shipped by the distro has the same level of fixes as 4.1.1. But this is what the version is used for today. This is even more complicated by the fact that upstream GCC release branches (and also several Linux distributors) start announcing the upcoming version already a few days after a release is tagged. E.g. 14 days old gcc-4_1-branch says: ./xgcc -B ./ --version; ./xgcc -B ./ -dD -E -xc /dev/null | grep GNU xgcc (GCC) 4.1.2 20061114 (prerelease) Copyright (C) 2006 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. #define __GNUC__ 4 #define __GNUC_MINOR__ 1 #define __GNUC_PATCHLEVEL__ 2 but GCC 4.1.2 has not been released yet. In Fedora Core/RHEL and I think a few other distros the version number is only changed when it is officially released, e.g.: gcc --version; gcc -dD -E -xc /dev/null | grep GNU gcc (GCC) 4.1.1 20061011 (Red Hat 4.1.1-30) Copyright (C) 2006 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. #define __GNUC__ 4 #define __GNUC_MINOR__ 1 #define __GNUC_PATCHLEVEL__ 1 #define __GNUC_RH_RELEASE__ 30 Note, 4.1.1 was released end of May this year and 4.1.2 has not been released. So, using __GNUC_PATCHLEVEL__ to detect if a bug has been fixed or not isn't very useful (you'd need to rule out also __GNUC_PATCHLEVEL__ = 1 because gcc-4_1-branch was announcing that patchlevel already since beggining of March, on the other side there is a lot of GCCs with __GNUC_PATCHLEVEL__ == 1 that certainly have that bug fixed). You perhaps could parse the prerelease vs. release vs. vendor strings, but that could be quite difficult, perhaps easier would be just parse the date in the --version output. Checking for the bug is best though, because that will catch even backports of the bugfix without rebasing from the release branch. Jakub - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
Ask yourself this question: Can an assignment to a non-volatile variable be optimized out? Then ask yourself this question: Does casting away volatile make it not volatile any more? > The volatile'ness does not simply disappear the moment you > assign the result to some local variable which is not volatile. Yes, it does. That's what a cast does, it tells the compiler to, in all respects, pretend that a variable is of a different type than it 'actually is', such that it actually isn't anymore. > Half of our drivers would break if this were true. On the contrary, they'd break if it was true. If casting away volatile didn't make it go away, then casting in volatile wouldn't have to make it appear. A cast causes the compiler to act as if a variable really was the type you cast it to. If you cast volatile away, that has the reverse of the same affect casting to volatile has. The 'readl' function should actually assign the value to a volatile variable. Assignments to volatiles cannot be cast away, but casts can and assignments to non-volatile variables can be optimized out. DS - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Wed, Nov 29, 2006 at 04:08:00AM -0500, Jakub Jelinek wrote: > On Wed, Nov 29, 2006 at 02:56:20PM +1100, Keith Owens wrote: > > Nicholas Miell (on Tue, 28 Nov 2006 19:08:25 -0800) wrote: > > >On Wed, 2006-11-29 at 13:22 +1100, Keith Owens wrote: > > >> Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux), > > >> wait_hpet_tick is optimized away to a never ending loop and the kernel > > >> hangs on boot in timer setup. > > >> > > >> 001a : > > >> 1a: 55 push %ebp > > >> 1b: 89 e5 mov%esp,%ebp > > >> 1d: eb fe jmp1d > > >> > > >> This is not a problem with gcc 3.3.5. Adding barrier() calls to > > >> wait_hpet_tick does not help, making the variables volatile does. > > >> > > >> Signed-off-by: Keith Owens > > >> > > >> --- > > >> arch/i386/kernel/time_hpet.c |2 +- > > >> 1 file changed, 1 insertion(+), 1 deletion(-) > > >> > > >> Index: linux-2.6/arch/i386/kernel/time_hpet.c > > >> === > > >> --- linux-2.6.orig/arch/i386/kernel/time_hpet.c > > >> +++ linux-2.6/arch/i386/kernel/time_hpet.c > > >> @@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d, > > >> */ > > >> static void __devinit wait_hpet_tick(void) > > >> { > > >> -unsigned int start_cmp_val, end_cmp_val; > > >> +unsigned volatile int start_cmp_val, end_cmp_val; > > >> > > >> start_cmp_val = hpet_readl(HPET_T0_CMP); > > >> do { > > > > > >When you examine the inlined functions involved, this looks an awful lot > > >like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22278 > > > > > >Perhaps SUSE should fix their gcc instead of working around compiler > > >problems in the kernel? > > > > Firstly, the fix for 22278 is included in gcc 4.1.0. > > This actually sounds more like http://gcc.gnu.org/PR27236 > And that one is broken in 4.1.0, fixed in 4.1.1. Then why not simply check for gcc 4.1.0 in compiler.h and refuse to build with 4.1.0 if it's known to produce bad code ? I find it a bit annoying that we start fixing every place affected by buggy compiler versions, especially when the fix is already available. > Jakub Regards, Willy - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Wed, Nov 29, 2006 at 02:56:20PM +1100, Keith Owens wrote: > Nicholas Miell (on Tue, 28 Nov 2006 19:08:25 -0800) wrote: > >On Wed, 2006-11-29 at 13:22 +1100, Keith Owens wrote: > >> Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux), > >> wait_hpet_tick is optimized away to a never ending loop and the kernel > >> hangs on boot in timer setup. > >> > >> 001a : > >> 1a: 55 push %ebp > >> 1b: 89 e5 mov%esp,%ebp > >> 1d: eb fe jmp1d > >> > >> This is not a problem with gcc 3.3.5. Adding barrier() calls to > >> wait_hpet_tick does not help, making the variables volatile does. > >> > >> Signed-off-by: Keith Owens > >> > >> --- > >> arch/i386/kernel/time_hpet.c |2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> Index: linux-2.6/arch/i386/kernel/time_hpet.c > >> === > >> --- linux-2.6.orig/arch/i386/kernel/time_hpet.c > >> +++ linux-2.6/arch/i386/kernel/time_hpet.c > >> @@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d, > >> */ > >> static void __devinit wait_hpet_tick(void) > >> { > >> - unsigned int start_cmp_val, end_cmp_val; > >> + unsigned volatile int start_cmp_val, end_cmp_val; > >> > >>start_cmp_val = hpet_readl(HPET_T0_CMP); > >>do { > > > >When you examine the inlined functions involved, this looks an awful lot > >like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22278 > > > >Perhaps SUSE should fix their gcc instead of working around compiler > >problems in the kernel? > > Firstly, the fix for 22278 is included in gcc 4.1.0. This actually sounds more like http://gcc.gnu.org/PR27236 And that one is broken in 4.1.0, fixed in 4.1.1. Jakub - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Wed, Nov 29, 2006 at 02:56:20PM +1100, Keith Owens wrote: Nicholas Miell (on Tue, 28 Nov 2006 19:08:25 -0800) wrote: On Wed, 2006-11-29 at 13:22 +1100, Keith Owens wrote: Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux), wait_hpet_tick is optimized away to a never ending loop and the kernel hangs on boot in timer setup. 001a wait_hpet_tick: 1a: 55 push %ebp 1b: 89 e5 mov%esp,%ebp 1d: eb fe jmp1d wait_hpet_tick+0x3 This is not a problem with gcc 3.3.5. Adding barrier() calls to wait_hpet_tick does not help, making the variables volatile does. Signed-off-by: Keith Owens kaos@ocs.com.au --- arch/i386/kernel/time_hpet.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: linux-2.6/arch/i386/kernel/time_hpet.c === --- linux-2.6.orig/arch/i386/kernel/time_hpet.c +++ linux-2.6/arch/i386/kernel/time_hpet.c @@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d, */ static void __devinit wait_hpet_tick(void) { - unsigned int start_cmp_val, end_cmp_val; + unsigned volatile int start_cmp_val, end_cmp_val; start_cmp_val = hpet_readl(HPET_T0_CMP); do { When you examine the inlined functions involved, this looks an awful lot like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22278 Perhaps SUSE should fix their gcc instead of working around compiler problems in the kernel? Firstly, the fix for 22278 is included in gcc 4.1.0. This actually sounds more like http://gcc.gnu.org/PR27236 And that one is broken in 4.1.0, fixed in 4.1.1. Jakub - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Wed, Nov 29, 2006 at 04:08:00AM -0500, Jakub Jelinek wrote: On Wed, Nov 29, 2006 at 02:56:20PM +1100, Keith Owens wrote: Nicholas Miell (on Tue, 28 Nov 2006 19:08:25 -0800) wrote: On Wed, 2006-11-29 at 13:22 +1100, Keith Owens wrote: Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux), wait_hpet_tick is optimized away to a never ending loop and the kernel hangs on boot in timer setup. 001a wait_hpet_tick: 1a: 55 push %ebp 1b: 89 e5 mov%esp,%ebp 1d: eb fe jmp1d wait_hpet_tick+0x3 This is not a problem with gcc 3.3.5. Adding barrier() calls to wait_hpet_tick does not help, making the variables volatile does. Signed-off-by: Keith Owens kaos@ocs.com.au --- arch/i386/kernel/time_hpet.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: linux-2.6/arch/i386/kernel/time_hpet.c === --- linux-2.6.orig/arch/i386/kernel/time_hpet.c +++ linux-2.6/arch/i386/kernel/time_hpet.c @@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d, */ static void __devinit wait_hpet_tick(void) { -unsigned int start_cmp_val, end_cmp_val; +unsigned volatile int start_cmp_val, end_cmp_val; start_cmp_val = hpet_readl(HPET_T0_CMP); do { When you examine the inlined functions involved, this looks an awful lot like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22278 Perhaps SUSE should fix their gcc instead of working around compiler problems in the kernel? Firstly, the fix for 22278 is included in gcc 4.1.0. This actually sounds more like http://gcc.gnu.org/PR27236 And that one is broken in 4.1.0, fixed in 4.1.1. Then why not simply check for gcc 4.1.0 in compiler.h and refuse to build with 4.1.0 if it's known to produce bad code ? I find it a bit annoying that we start fixing every place affected by buggy compiler versions, especially when the fix is already available. Jakub Regards, Willy - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
Ask yourself this question: Can an assignment to a non-volatile variable be optimized out? Then ask yourself this question: Does casting away volatile make it not volatile any more? The volatile'ness does not simply disappear the moment you assign the result to some local variable which is not volatile. Yes, it does. That's what a cast does, it tells the compiler to, in all respects, pretend that a variable is of a different type than it 'actually is', such that it actually isn't anymore. Half of our drivers would break if this were true. On the contrary, they'd break if it was true. If casting away volatile didn't make it go away, then casting in volatile wouldn't have to make it appear. A cast causes the compiler to act as if a variable really was the type you cast it to. If you cast volatile away, that has the reverse of the same affect casting to volatile has. The 'readl' function should actually assign the value to a volatile variable. Assignments to volatiles cannot be cast away, but casts can and assignments to non-volatile variables can be optimized out. DS - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Wed, 2006-11-29 at 15:30 +1100, Keith Owens wrote: > David Miller (on Tue, 28 Nov 2006 20:04:53 -0800 (PST)) wrote: > >From: Keith Owens > >Date: Wed, 29 Nov 2006 14:56:20 +1100 > > > >> Secondly, I believe that this is a separate problem from bug 22278. > >> hpet_readl() is correctly using volatile internally, but its result is > >> being assigned to a pair of normal integers (not declared as volatile). > >> In the context of wait_hpet_tick, all the variables are unqualified so > >> gcc is allowed to optimize the comparison away. > >> > >> The same problem may exist in other parts of arch/i386/kernel/time_hpet.c, > >> where the return value from hpet_readl() is assigned to a normal > >> variable. Nothing in the C standard says that those unqualified > >> variables should be magically treated as volatile, just because the > >> original code that extracted the value used volatile. IOW, time_hpet.c > >> needs to declare any variables that hold the result of hpet_readl() as > >> being volatile variables. > > > >I disagree with this. > > > >readl() returns values from an opaque source, and it is declared > >as such to show this to GCC. It's like a function that GCC > >cannot see the implementation of, which it cannot determine > >anything about wrt. return values. > > > >The volatile'ness does not simply disappear the moment you > >assign the result to some local variable which is not volatile. > > > >Half of our drivers would break if this were true. > > This is definitely a gcc bug, 4.1.0 is doing something weird. Compile > with CONFIG_CC_OPTIMIZE_FOR_SIZE=n and the bug appears, > CONFIG_CC_OPTIMIZE_FOR_SIZE=y has no problem. > > Compile with CONFIG_CC_OPTIMIZE_FOR_SIZE=n and _either_ of the patches > below and the problem disappears. > My theory: gcc is inlining readl into hpet_readl (readl is an inline function, so it should be doing this no matter what), and inlining hpet_readl into wait_hpet_tick (otherwise, it can't possibly make any assumptions about the return values of hpet_readl -- this looks to be a SUSE-specific over-aggressive optimization), and somewhere along the way the volatile qualifier is getting lost. -- Nicholas Miell <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
David Miller (on Tue, 28 Nov 2006 20:04:53 -0800 (PST)) wrote: >From: Keith Owens >Date: Wed, 29 Nov 2006 14:56:20 +1100 > >> Secondly, I believe that this is a separate problem from bug 22278. >> hpet_readl() is correctly using volatile internally, but its result is >> being assigned to a pair of normal integers (not declared as volatile). >> In the context of wait_hpet_tick, all the variables are unqualified so >> gcc is allowed to optimize the comparison away. >> >> The same problem may exist in other parts of arch/i386/kernel/time_hpet.c, >> where the return value from hpet_readl() is assigned to a normal >> variable. Nothing in the C standard says that those unqualified >> variables should be magically treated as volatile, just because the >> original code that extracted the value used volatile. IOW, time_hpet.c >> needs to declare any variables that hold the result of hpet_readl() as >> being volatile variables. > >I disagree with this. > >readl() returns values from an opaque source, and it is declared >as such to show this to GCC. It's like a function that GCC >cannot see the implementation of, which it cannot determine >anything about wrt. return values. > >The volatile'ness does not simply disappear the moment you >assign the result to some local variable which is not volatile. > >Half of our drivers would break if this were true. This is definitely a gcc bug, 4.1.0 is doing something weird. Compile with CONFIG_CC_OPTIMIZE_FOR_SIZE=n and the bug appears, CONFIG_CC_OPTIMIZE_FOR_SIZE=y has no problem. Compile with CONFIG_CC_OPTIMIZE_FOR_SIZE=n and _either_ of the patches below and the problem disappears. Index: linux/arch/i386/kernel/time_hpet.c === --- linux.orig/arch/i386/kernel/time_hpet.c 2006-11-29 13:51:33.900462088 +1100 +++ linux/arch/i386/kernel/time_hpet.c 2006-11-29 15:25:47.853245938 +1100 @@ -35,7 +35,8 @@ static void __iomem * hpet_virt_address; int hpet_readl(unsigned long a) { - return readl(hpet_virt_address + a); + volatile int v = readl(hpet_virt_address + a); + return v; } static void hpet_writel(unsigned long d, unsigned long a) Index: linux-2.6/arch/i386/kernel/time_hpet.c === --- linux-2.6.orig/arch/i386/kernel/time_hpet.c +++ linux-2.6/arch/i386/kernel/time_hpet.c @@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d, */ static void __devinit wait_hpet_tick(void) { - unsigned int start_cmp_val, end_cmp_val; + unsigned volatile int start_cmp_val, end_cmp_val; start_cmp_val = hpet_readl(HPET_T0_CMP); do { - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
From: Keith Owens Date: Wed, 29 Nov 2006 14:56:20 +1100 > Secondly, I believe that this is a separate problem from bug 22278. > hpet_readl() is correctly using volatile internally, but its result is > being assigned to a pair of normal integers (not declared as volatile). > In the context of wait_hpet_tick, all the variables are unqualified so > gcc is allowed to optimize the comparison away. > > The same problem may exist in other parts of arch/i386/kernel/time_hpet.c, > where the return value from hpet_readl() is assigned to a normal > variable. Nothing in the C standard says that those unqualified > variables should be magically treated as volatile, just because the > original code that extracted the value used volatile. IOW, time_hpet.c > needs to declare any variables that hold the result of hpet_readl() as > being volatile variables. I disagree with this. readl() returns values from an opaque source, and it is declared as such to show this to GCC. It's like a function that GCC cannot see the implementation of, which it cannot determine anything about wrt. return values. The volatile'ness does not simply disappear the moment you assign the result to some local variable which is not volatile. Half of our drivers would break if this were true. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
Nicholas Miell (on Tue, 28 Nov 2006 19:08:25 -0800) wrote: >On Wed, 2006-11-29 at 13:22 +1100, Keith Owens wrote: >> Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux), >> wait_hpet_tick is optimized away to a never ending loop and the kernel >> hangs on boot in timer setup. >> >> 001a : >> 1a: 55 push %ebp >> 1b: 89 e5 mov%esp,%ebp >> 1d: eb fe jmp1d >> >> This is not a problem with gcc 3.3.5. Adding barrier() calls to >> wait_hpet_tick does not help, making the variables volatile does. >> >> Signed-off-by: Keith Owens >> >> --- >> arch/i386/kernel/time_hpet.c |2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> Index: linux-2.6/arch/i386/kernel/time_hpet.c >> === >> --- linux-2.6.orig/arch/i386/kernel/time_hpet.c >> +++ linux-2.6/arch/i386/kernel/time_hpet.c >> @@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d, >> */ >> static void __devinit wait_hpet_tick(void) >> { >> -unsigned int start_cmp_val, end_cmp_val; >> +unsigned volatile int start_cmp_val, end_cmp_val; >> >> start_cmp_val = hpet_readl(HPET_T0_CMP); >> do { > >When you examine the inlined functions involved, this looks an awful lot >like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22278 > >Perhaps SUSE should fix their gcc instead of working around compiler >problems in the kernel? Firstly, the fix for 22278 is included in gcc 4.1.0. Secondly, I believe that this is a separate problem from bug 22278. hpet_readl() is correctly using volatile internally, but its result is being assigned to a pair of normal integers (not declared as volatile). In the context of wait_hpet_tick, all the variables are unqualified so gcc is allowed to optimize the comparison away. The same problem may exist in other parts of arch/i386/kernel/time_hpet.c, where the return value from hpet_readl() is assigned to a normal variable. Nothing in the C standard says that those unqualified variables should be magically treated as volatile, just because the original code that extracted the value used volatile. IOW, time_hpet.c needs to declare any variables that hold the result of hpet_readl() as being volatile variables. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Wed, 2006-11-29 at 13:22 +1100, Keith Owens wrote: > Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux), > wait_hpet_tick is optimized away to a never ending loop and the kernel > hangs on boot in timer setup. > > 001a : > 1a: 55 push %ebp > 1b: 89 e5 mov%esp,%ebp > 1d: eb fe jmp1d > > This is not a problem with gcc 3.3.5. Adding barrier() calls to > wait_hpet_tick does not help, making the variables volatile does. > > Signed-off-by: Keith Owens > > --- > arch/i386/kernel/time_hpet.c |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > Index: linux-2.6/arch/i386/kernel/time_hpet.c > === > --- linux-2.6.orig/arch/i386/kernel/time_hpet.c > +++ linux-2.6/arch/i386/kernel/time_hpet.c > @@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d, > */ > static void __devinit wait_hpet_tick(void) > { > - unsigned int start_cmp_val, end_cmp_val; > + unsigned volatile int start_cmp_val, end_cmp_val; > > start_cmp_val = hpet_readl(HPET_T0_CMP); > do { When you examine the inlined functions involved, this looks an awful lot like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22278 Perhaps SUSE should fix their gcc instead of working around compiler problems in the kernel? -- Nicholas Miell <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux), wait_hpet_tick is optimized away to a never ending loop and the kernel hangs on boot in timer setup. 001a : 1a: 55 push %ebp 1b: 89 e5 mov%esp,%ebp 1d: eb fe jmp1d This is not a problem with gcc 3.3.5. Adding barrier() calls to wait_hpet_tick does not help, making the variables volatile does. Signed-off-by: Keith Owens --- arch/i386/kernel/time_hpet.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: linux-2.6/arch/i386/kernel/time_hpet.c === --- linux-2.6.orig/arch/i386/kernel/time_hpet.c +++ linux-2.6/arch/i386/kernel/time_hpet.c @@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d, */ static void __devinit wait_hpet_tick(void) { - unsigned int start_cmp_val, end_cmp_val; + unsigned volatile int start_cmp_val, end_cmp_val; start_cmp_val = hpet_readl(HPET_T0_CMP); do { - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux), wait_hpet_tick is optimized away to a never ending loop and the kernel hangs on boot in timer setup. 001a wait_hpet_tick: 1a: 55 push %ebp 1b: 89 e5 mov%esp,%ebp 1d: eb fe jmp1d wait_hpet_tick+0x3 This is not a problem with gcc 3.3.5. Adding barrier() calls to wait_hpet_tick does not help, making the variables volatile does. Signed-off-by: Keith Owens kaos@ocs.com.au --- arch/i386/kernel/time_hpet.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: linux-2.6/arch/i386/kernel/time_hpet.c === --- linux-2.6.orig/arch/i386/kernel/time_hpet.c +++ linux-2.6/arch/i386/kernel/time_hpet.c @@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d, */ static void __devinit wait_hpet_tick(void) { - unsigned int start_cmp_val, end_cmp_val; + unsigned volatile int start_cmp_val, end_cmp_val; start_cmp_val = hpet_readl(HPET_T0_CMP); do { - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Wed, 2006-11-29 at 13:22 +1100, Keith Owens wrote: Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux), wait_hpet_tick is optimized away to a never ending loop and the kernel hangs on boot in timer setup. 001a wait_hpet_tick: 1a: 55 push %ebp 1b: 89 e5 mov%esp,%ebp 1d: eb fe jmp1d wait_hpet_tick+0x3 This is not a problem with gcc 3.3.5. Adding barrier() calls to wait_hpet_tick does not help, making the variables volatile does. Signed-off-by: Keith Owens kaos@ocs.com.au --- arch/i386/kernel/time_hpet.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: linux-2.6/arch/i386/kernel/time_hpet.c === --- linux-2.6.orig/arch/i386/kernel/time_hpet.c +++ linux-2.6/arch/i386/kernel/time_hpet.c @@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d, */ static void __devinit wait_hpet_tick(void) { - unsigned int start_cmp_val, end_cmp_val; + unsigned volatile int start_cmp_val, end_cmp_val; start_cmp_val = hpet_readl(HPET_T0_CMP); do { When you examine the inlined functions involved, this looks an awful lot like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22278 Perhaps SUSE should fix their gcc instead of working around compiler problems in the kernel? -- Nicholas Miell [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
Nicholas Miell (on Tue, 28 Nov 2006 19:08:25 -0800) wrote: On Wed, 2006-11-29 at 13:22 +1100, Keith Owens wrote: Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux), wait_hpet_tick is optimized away to a never ending loop and the kernel hangs on boot in timer setup. 001a wait_hpet_tick: 1a: 55 push %ebp 1b: 89 e5 mov%esp,%ebp 1d: eb fe jmp1d wait_hpet_tick+0x3 This is not a problem with gcc 3.3.5. Adding barrier() calls to wait_hpet_tick does not help, making the variables volatile does. Signed-off-by: Keith Owens kaos@ocs.com.au --- arch/i386/kernel/time_hpet.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: linux-2.6/arch/i386/kernel/time_hpet.c === --- linux-2.6.orig/arch/i386/kernel/time_hpet.c +++ linux-2.6/arch/i386/kernel/time_hpet.c @@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d, */ static void __devinit wait_hpet_tick(void) { -unsigned int start_cmp_val, end_cmp_val; +unsigned volatile int start_cmp_val, end_cmp_val; start_cmp_val = hpet_readl(HPET_T0_CMP); do { When you examine the inlined functions involved, this looks an awful lot like http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22278 Perhaps SUSE should fix their gcc instead of working around compiler problems in the kernel? Firstly, the fix for 22278 is included in gcc 4.1.0. Secondly, I believe that this is a separate problem from bug 22278. hpet_readl() is correctly using volatile internally, but its result is being assigned to a pair of normal integers (not declared as volatile). In the context of wait_hpet_tick, all the variables are unqualified so gcc is allowed to optimize the comparison away. The same problem may exist in other parts of arch/i386/kernel/time_hpet.c, where the return value from hpet_readl() is assigned to a normal variable. Nothing in the C standard says that those unqualified variables should be magically treated as volatile, just because the original code that extracted the value used volatile. IOW, time_hpet.c needs to declare any variables that hold the result of hpet_readl() as being volatile variables. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
From: Keith Owens kaos@ocs.com.au Date: Wed, 29 Nov 2006 14:56:20 +1100 Secondly, I believe that this is a separate problem from bug 22278. hpet_readl() is correctly using volatile internally, but its result is being assigned to a pair of normal integers (not declared as volatile). In the context of wait_hpet_tick, all the variables are unqualified so gcc is allowed to optimize the comparison away. The same problem may exist in other parts of arch/i386/kernel/time_hpet.c, where the return value from hpet_readl() is assigned to a normal variable. Nothing in the C standard says that those unqualified variables should be magically treated as volatile, just because the original code that extracted the value used volatile. IOW, time_hpet.c needs to declare any variables that hold the result of hpet_readl() as being volatile variables. I disagree with this. readl() returns values from an opaque source, and it is declared as such to show this to GCC. It's like a function that GCC cannot see the implementation of, which it cannot determine anything about wrt. return values. The volatile'ness does not simply disappear the moment you assign the result to some local variable which is not volatile. Half of our drivers would break if this were true. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
David Miller (on Tue, 28 Nov 2006 20:04:53 -0800 (PST)) wrote: From: Keith Owens kaos@ocs.com.au Date: Wed, 29 Nov 2006 14:56:20 +1100 Secondly, I believe that this is a separate problem from bug 22278. hpet_readl() is correctly using volatile internally, but its result is being assigned to a pair of normal integers (not declared as volatile). In the context of wait_hpet_tick, all the variables are unqualified so gcc is allowed to optimize the comparison away. The same problem may exist in other parts of arch/i386/kernel/time_hpet.c, where the return value from hpet_readl() is assigned to a normal variable. Nothing in the C standard says that those unqualified variables should be magically treated as volatile, just because the original code that extracted the value used volatile. IOW, time_hpet.c needs to declare any variables that hold the result of hpet_readl() as being volatile variables. I disagree with this. readl() returns values from an opaque source, and it is declared as such to show this to GCC. It's like a function that GCC cannot see the implementation of, which it cannot determine anything about wrt. return values. The volatile'ness does not simply disappear the moment you assign the result to some local variable which is not volatile. Half of our drivers would break if this were true. This is definitely a gcc bug, 4.1.0 is doing something weird. Compile with CONFIG_CC_OPTIMIZE_FOR_SIZE=n and the bug appears, CONFIG_CC_OPTIMIZE_FOR_SIZE=y has no problem. Compile with CONFIG_CC_OPTIMIZE_FOR_SIZE=n and _either_ of the patches below and the problem disappears. Index: linux/arch/i386/kernel/time_hpet.c === --- linux.orig/arch/i386/kernel/time_hpet.c 2006-11-29 13:51:33.900462088 +1100 +++ linux/arch/i386/kernel/time_hpet.c 2006-11-29 15:25:47.853245938 +1100 @@ -35,7 +35,8 @@ static void __iomem * hpet_virt_address; int hpet_readl(unsigned long a) { - return readl(hpet_virt_address + a); + volatile int v = readl(hpet_virt_address + a); + return v; } static void hpet_writel(unsigned long d, unsigned long a) Index: linux-2.6/arch/i386/kernel/time_hpet.c === --- linux-2.6.orig/arch/i386/kernel/time_hpet.c +++ linux-2.6/arch/i386/kernel/time_hpet.c @@ -51,7 +51,7 @@ static void hpet_writel(unsigned long d, */ static void __devinit wait_hpet_tick(void) { - unsigned int start_cmp_val, end_cmp_val; + unsigned volatile int start_cmp_val, end_cmp_val; start_cmp_val = hpet_readl(HPET_T0_CMP); do { - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 2.6.19-rc6] Stop gcc 4.1.0 optimizing wait_hpet_tick away
On Wed, 2006-11-29 at 15:30 +1100, Keith Owens wrote: David Miller (on Tue, 28 Nov 2006 20:04:53 -0800 (PST)) wrote: From: Keith Owens kaos@ocs.com.au Date: Wed, 29 Nov 2006 14:56:20 +1100 Secondly, I believe that this is a separate problem from bug 22278. hpet_readl() is correctly using volatile internally, but its result is being assigned to a pair of normal integers (not declared as volatile). In the context of wait_hpet_tick, all the variables are unqualified so gcc is allowed to optimize the comparison away. The same problem may exist in other parts of arch/i386/kernel/time_hpet.c, where the return value from hpet_readl() is assigned to a normal variable. Nothing in the C standard says that those unqualified variables should be magically treated as volatile, just because the original code that extracted the value used volatile. IOW, time_hpet.c needs to declare any variables that hold the result of hpet_readl() as being volatile variables. I disagree with this. readl() returns values from an opaque source, and it is declared as such to show this to GCC. It's like a function that GCC cannot see the implementation of, which it cannot determine anything about wrt. return values. The volatile'ness does not simply disappear the moment you assign the result to some local variable which is not volatile. Half of our drivers would break if this were true. This is definitely a gcc bug, 4.1.0 is doing something weird. Compile with CONFIG_CC_OPTIMIZE_FOR_SIZE=n and the bug appears, CONFIG_CC_OPTIMIZE_FOR_SIZE=y has no problem. Compile with CONFIG_CC_OPTIMIZE_FOR_SIZE=n and _either_ of the patches below and the problem disappears. My theory: gcc is inlining readl into hpet_readl (readl is an inline function, so it should be doing this no matter what), and inlining hpet_readl into wait_hpet_tick (otherwise, it can't possibly make any assumptions about the return values of hpet_readl -- this looks to be a SUSE-specific over-aggressive optimization), and somewhere along the way the volatile qualifier is getting lost. -- Nicholas Miell [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/