Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
On Mon, 2007-07-23 at 23:30 +0200, Andi Kleen wrote:
> > gcc also tries to count the number of instructions, to guess how
> > large in bytes the asm block is, as it could make a difference for
> > near vs short jumps, etc.
>
> Are you sure? I doubt it. It would need a full asm parser to do this
> properly, and even then it could be wrong (e.g. when the sections are
> switched like Linux does extensively).
>
> gcc doesn't have such a parser.
>
> Also, on x86 gcc doesn't need to care about long/short anyway because
> the assembler takes care of this, and the other instruction that cared
> about this (loop) isn't generated anymore.
>
> You're probably confusing it with some other compiler; some do this,
> e.g. the Microsoft inline asm syntax requires assembler parsing in the
> compiler.
>
> > I wonder if it also affects the instruction count the inline
> > heuristics use?
>
> AFAIK it counts like one operand.
>
> -Andi

GCC counts newlines and semicolons and uses that number as the likely
instruction count. See asm_insn_count() in gcc/gcc/final.c.

--
Nicholas Miell <[EMAIL PROTECTED]>

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel"
in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
> gcc also tries to count the number of instructions, to guess how large
> in bytes the asm block is, as it could make a difference for near vs
> short jumps, etc.

Are you sure? I doubt it. It would need a full asm parser to do this
properly, and even then it could be wrong (e.g. when the sections are
switched like Linux does extensively).

gcc doesn't have such a parser.

Also, on x86 gcc doesn't need to care about long/short anyway because
the assembler takes care of this, and the other instruction that cared
about this (loop) isn't generated anymore.

You're probably confusing it with some other compiler; some do this,
e.g. the Microsoft inline asm syntax requires assembler parsing in the
compiler.

> I wonder if it also affects the instruction count the inline
> heuristics use?

AFAIK it counts like one operand.

-Andi
Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
On Mon, 23 Jul 2007, Jeremy Fitzhardinge wrote:
> Trent Piepho wrote:
> > gcc also tries to count the number of instructions, to guess how
> > large in bytes the asm block is, as it could make a difference for
> > near vs short jumps, etc.
>
> How does it do that? By looking for \n, ';', etc?

Yes:

    Some targets require that GCC track the size of each instruction
    used in order to generate correct code. Because the final length of
    an `asm' is only known by the assembler, GCC must make an estimate
    as to how big it will be. The estimate is formed by counting the
    number of statements in the pattern of the `asm' and multiplying
    that by the length of the longest instruction on that processor.
    Statements in the `asm' are identified by newline characters and
    whatever statement separator characters are supported by the
    assembler; on most processors this is the `;' character.
Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
Trent Piepho wrote:
> gcc also tries to count the number of instructions, to guess how large
> in bytes the asm block is, as it could make a difference for near vs
> short jumps, etc.

How does it do that? By looking for \n, ';', etc?

J
Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
On Mon, 23 Jul 2007, Jeremy Fitzhardinge wrote:
> Satyam Sharma wrote:
> > The (3) as I had originally written / meant was that multiple
> > instructions in a volatile asm would not get _individually_
> > interspersed with the rest of the code, i.e. be emitted out
> > _consecutively_. I don't think we need any such guarantees for the
> > non-atomic variants of those operations, so it's good to let the
> > compiler have a free hand with what it wants to do, and
> > optimize/combine multiple bitops as necessary / possible, which was
> > the original intention.
>
> No, a single asm statement is always emitted in one piece. Gcc doesn't
> parse the string other than to do %-substitution to insert arguments,
> so it has no way to meaningfully split it up.

gcc also tries to count the number of instructions, to guess how large
in bytes the asm block is, as it could make a difference for near vs
short jumps, etc.

I wonder if it also affects the instruction count the inline heuristics
use?
Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
Satyam Sharma wrote:
> The (3) as I had originally written / meant was that multiple
> instructions in a volatile asm would not get _individually_
> interspersed with the rest of the code, i.e. be emitted out
> _consecutively_. I don't think we need any such guarantees for the
> non-atomic variants of those operations, so it's good to let the
> compiler have a free hand with what it wants to do, and
> optimize/combine multiple bitops as necessary / possible, which was
> the original intention.

No, a single asm statement is always emitted in one piece. Gcc doesn't
parse the string other than to do %-substitution to insert arguments, so
it has no way to meaningfully split it up.

J
Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
On Mon, 23 Jul 2007, Jeremy Fitzhardinge wrote:
> I'm not quite sure what your point is.

Could be a case of terminology confusion ...

> The paragraph you quoted is pretty explicit in saying that volatile
> doesn't prevent an "asm volatile" from being interspersed with other
> code, and the example just before that is explicit in talking about
> how to use dependencies to control the ordering of asm volatiles with
> respect to surrounding code.

Yes, that was the (2).

> In fact nothing in that section precludes asm volatiles from being
> reordered with respect to each other either; you just have to make
> sure your dependency constraints are all correct.

The (3) as I had originally written / meant was that multiple
instructions in a volatile asm would not get _individually_ interspersed
with the rest of the code, i.e. be emitted out _consecutively_. I don't
think we need any such guarantees for the non-atomic variants of those
operations, so it's good to let the compiler have a free hand with what
it wants to do, and optimize/combine multiple bitops as necessary /
possible, which was the original intention.

Satyam
Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
Satyam Sharma wrote:
> Hi Jeremy,
>
> On Mon, 23 Jul 2007, Jeremy Fitzhardinge wrote:
>
>> Satyam Sharma wrote:
>>> From: Satyam Sharma <[EMAIL PROTECTED]>
>>>
>>> [7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
>>>
>>> Another oddity I noticed in this file. The semantics of __volatile__
>>> when used to qualify inline __asm__ are that the compiler will not
>>> (1) elide, or, (2) reorder, or, (3) intersperse, our inline asm with
>>> the rest of the generated code.
>>
>> "asm volatile" does not mean that at all. It only guarantees (1),
>
> Actually, you're probably right about (2), but (3)?
>
> From the gcc manual:
>
>     Similarly, you can't expect a sequence of volatile asm
>     instructions to remain perfectly consecutive. If you want
>     consecutive output, use a single asm. Also GCC will perform some
>     optimizations across a volatile asm instruction, GCC does not
>     "forget everything" when it encounters a volatile asm instruction
>     the way some other compilers do.
>
> I'm reading "Similarly, you can't expect a sequence of volatile asm
> instructions to remain perfectly consecutive" to mean they're talking
> about something like:
>
>     asm volatile(...);
>     asm volatile(...);
>     asm volatile(...);
>
> But "If you want consecutive output, use a single asm" probably means:
>
>     asm volatile(... multiple instructions here ...);
>
> would actually ensure the code written in there would not be
> interspersed ... at least that's how I read it.

I'm not quite sure what your point is. The paragraph you quoted is
pretty explicit in saying that volatile doesn't prevent an "asm
volatile" from being interspersed with other code, and the example just
before that is explicit in talking about how to use dependencies to
control the ordering of asm volatiles with respect to surrounding code.

In fact nothing in that section precludes asm volatiles from being
reordered with respect to each other either; you just have to make sure
your dependency constraints are all correct.

J
Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
Hi Jeremy,

On Mon, 23 Jul 2007, Jeremy Fitzhardinge wrote:
> Satyam Sharma wrote:
> > From: Satyam Sharma <[EMAIL PROTECTED]>
> >
> > [7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
> >
> > Another oddity I noticed in this file. The semantics of __volatile__
> > when used to qualify inline __asm__ are that the compiler will not
> > (1) elide, or, (2) reorder, or, (3) intersperse, our inline asm with
> > the rest of the generated code.
>
> "asm volatile" does not mean that at all. It only guarantees (1),

Actually, you're probably right about (2), but (3)?

From the gcc manual:

    Similarly, you can't expect a sequence of volatile asm instructions
    to remain perfectly consecutive. If you want consecutive output,
    use a single asm. Also GCC will perform some optimizations across a
    volatile asm instruction, GCC does not "forget everything" when it
    encounters a volatile asm instruction the way some other compilers
    do.

I'm reading "Similarly, you can't expect a sequence of volatile asm
instructions to remain perfectly consecutive" to mean they're talking
about something like:

    asm volatile(...);
    asm volatile(...);
    asm volatile(...);

But "If you want consecutive output, use a single asm" probably means:

    asm volatile(... multiple instructions here ...);

would actually ensure the code written in there would not be
interspersed ... at least that's how I read it.

[ BTW "Also GCC will perform some optimizations across a volatile asm
instruction ..." is exactly the kind of optimization we must allow the
compiler to do, but that's not related to the point at hand. ]

> and only then if the asm is ever reachable.

Yup, of course.

Satyam
Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
Hi,

On Mon, 23 Jul 2007, Andi Kleen wrote:
> On Monday 23 July 2007 18:06:03 Satyam Sharma wrote:
> > From: Satyam Sharma <[EMAIL PROTECTED]>
> >
> > [7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
> >
> > Another oddity I noticed in this file. The semantics of __volatile__
> > when used to qualify inline __asm__ are that the compiler will not
> > (1) elide, or, (2) reorder, or, (3) intersperse, our inline asm with
> > the rest of the generated code.
> >
> > However, we do not want these guarantees in the unlocked variants of
> > the bitops functions.
>
> I thought so too and did a similar transformation while moving some
> string functions out of line. After that recent misadventure I would
> be very very careful with this.

Ah, ok, so this must be the case we'd stumbled upon recently on the
other thread. I hadn't got your fix for this, so didn't know.

> Overall I'm sorry to say, but the risk:gain ratio of this patch is
> imho totally out of whack.

The patch does look correct to me; we're only killing the use of
__volatile__ from places that don't require it (the guarantee-less
variants). Without losing it, I really don't see how the compiler can
ever combine multiple bitops, which does sound beneficial when the
caller has already implemented higher-level locking around the usage of
these operations.

Satyam
Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
Satyam Sharma wrote: > From: Satyam Sharma <[EMAIL PROTECTED]> > > [7/8] i386: bitops: Kill needless usage of __asm__ __volatile__ > > Another oddity I noticed in this file. The semantics of __volatile__ > when used to qualify inline __asm__ are that the compiler will not > (1) elid, or, (2) reorder, or, (3) intersperse, our inline asm with > the rest of the generated code. > "asm volatile" does not mean that at all. It only guarantees (1), and only then if the asm is ever reachable. J - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__ II
BTW if you want to optimize inline asm code a bit -- find_first_bit /
find_first_zero_bit / for_each_cpu could really benefit from an
optimized version for cpumask_t sized bitmaps. That would save a lot of
cycles in some of the hotter paths of the kernel like the scheduler.

-Andi
Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
On Monday 23 July 2007 18:06:03 Satyam Sharma wrote:
> From: Satyam Sharma <[EMAIL PROTECTED]>
>
> [7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
>
> Another oddity I noticed in this file. The semantics of __volatile__
> when used to qualify inline __asm__ are that the compiler will not
> (1) elide, or, (2) reorder, or, (3) intersperse, our inline asm with
> the rest of the generated code.
>
> However, we do not want these guarantees in the unlocked variants of
> the bitops functions.

I thought so too and did a similar transformation while moving some
string functions out of line. After that recent misadventure I would be
very very careful with this.

Overall I'm sorry to say, but the risk:gain ratio of this patch is imho
totally out of whack. It took a long time to get bitops.h correct, and
as far as we know it is compiled correctly currently. You risk all that
for very dubious improvements.

-Andi
[PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
From: Satyam Sharma <[EMAIL PROTECTED]>

[7/8] i386: bitops: Kill needless usage of __asm__ __volatile__

Another oddity I noticed in this file. The semantics of __volatile__
when used to qualify inline __asm__ are that the compiler will not
(1) elide, or, (2) reorder, or, (3) intersperse, our inline asm with
the rest of the generated code.

However, we do not want these guarantees in the unlocked variants of
the bitops functions.

Also, note that test_bit() is *always* an unlocked operation on i386 --
the i386 instruction set disallows the use of the LOCK prefix with the
"btl" instruction anyway, and the CPU will throw an invalid opcode
exception otherwise. Hence, every caller of variable_test_bit() must
have all the required locking implemented at a higher level anyway, and
this operation would necessarily be guarantee-less. So let's remove the
__volatile__ qualification of the inline asm in the variable_test_bit()
function also -- and let's give the compiler a chance to optimize /
combine multiple bitops, if possible.
Signed-off-by: Satyam Sharma <[EMAIL PROTECTED]>
Cc: David Howells <[EMAIL PROTECTED]>
Cc: Nick Piggin <[EMAIL PROTECTED]>

---

 include/asm-i386/bitops.h |    8
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/asm-i386/bitops.h b/include/asm-i386/bitops.h
index f37b8a2..4f1fda5 100644
--- a/include/asm-i386/bitops.h
+++ b/include/asm-i386/bitops.h
@@ -96,7 +96,7 @@ static inline void clear_bit(int nr, unsigned long *addr)
  */
 static inline void __clear_bit(int nr, unsigned long *addr)
 {
-	__asm__ __volatile__(
+	__asm__(
 		"btrl %1,%0"
 		:"=m" (*addr)
 		:"r" (nr));
@@ -123,7 +123,7 @@ static inline void __clear_bit(int nr, unsigned long *addr)
  */
 static inline void __change_bit(int nr, unsigned long *addr)
 {
-	__asm__ __volatile__(
+	__asm__(
 		"btcl %1,%0"
 		:"=m" (*addr)
 		:"r" (nr));
@@ -251,7 +251,7 @@ static inline int __test_and_change_bit(int nr, unsigned long *addr)
 {
 	int oldbit;
 
-	__asm__ __volatile__(
+	__asm__(
 		"btcl %2,%1\n\tsbbl %0,%0"
 		:"=r" (oldbit),"=m" (*addr)
 		:"r" (nr));
@@ -297,7 +297,7 @@ static inline int variable_test_bit(int nr, const unsigned long *addr)
 {
 	int oldbit;
 
-	__asm__ __volatile__(
+	__asm__(
 		"btl %2,%1\n\tsbbl %0,%0"
 		:"=r" (oldbit)
 		:"m" (*addr),"r" (nr));