Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__

2007-07-23 Thread Nicholas Miell
On Mon, 2007-07-23 at 23:30 +0200, Andi Kleen wrote:
> > gcc also tries to count the number of instructions, to guess how large in
> > bytes the asm block is, as it could make a difference for near vs short
> > jumps, etc.
> 
> Are you sure? I doubt it. It would need a full asm parser to do this
> properly and then even it could be wrong 
> (e.g. when the sections are switched like Linux does extensively)  
> 
> gcc doesn't have such a parser.
> 
> Also on x86 gcc doesn't need to care about long/short anyways because
> the assembler takes care of this and the other instructions who cared
> about this (loop) isn't generated anymore.
> 
> You're probably confusing it with some other compiler, who sometimes
> do this. e.g. the Microsoft inline asm syntax requires assembler parsing
> in the compiler.
> 
> > I wonder it it also affects the instruction count the inline heuristics
> > use?
> 
> AFAIK it counts like one operand.
> 
> -Andi

GCC counts newlines and semicolons and uses that number as the likely
instruction count.

See asm_insn_count() in gcc/gcc/final.c

-- 
Nicholas Miell <[EMAIL PROTECTED]>

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__

2007-07-23 Thread Andi Kleen

> gcc also tries to count the number of instructions, to guess how large in
> bytes the asm block is, as it could make a difference for near vs short
> jumps, etc.

Are you sure? I doubt it. It would need a full asm parser to do this
properly and then even it could be wrong 
(e.g. when the sections are switched like Linux does extensively)  

gcc doesn't have such a parser.

Also on x86 gcc doesn't need to care about long/short anyways because
the assembler takes care of this and the other instructions who cared
about this (loop) isn't generated anymore.

You're probably confusing it with some other compiler, who sometimes
do this. e.g. the Microsoft inline asm syntax requires assembler parsing
in the compiler.

> I wonder it it also affects the instruction count the inline heuristics
> use?

AFAIK it counts like one operand.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__

2007-07-23 Thread Trent Piepho
On Mon, 23 Jul 2007, Jeremy Fitzhardinge wrote:
> Trent Piepho wrote:
> > gcc also tries to count the number of instructions, to guess how large in
> > bytes the asm block is, as it could make a difference for near vs short
> > jumps, etc.
> >
>
> How does it do that?  By looking for \n, ';', etc?

Yes:

Some targets require that GCC track the size of each instruction used in
order to generate correct code.  Because the final length of an `asm' is
only known by the assembler, GCC must make an estimate as to how big it
will be.  The estimate is formed by counting the number of statements in
the pattern of the `asm' and multiplying that by the length of the longest
instruction on that processor.  Statements in the `asm' are identified by
newline characters and whatever statement separator characters are
supported by the assembler; on most processors this is the ``;''
character.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__

2007-07-23 Thread Jeremy Fitzhardinge
Trent Piepho wrote:
> gcc also tries to count the number of instructions, to guess how large in
> bytes the asm block is, as it could make a difference for near vs short
> jumps, etc.
>   

How does it do that?  By looking for \n, ';', etc?

J
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__

2007-07-23 Thread Trent Piepho
On Mon, 23 Jul 2007, Jeremy Fitzhardinge wrote:
> Satyam Sharma wrote:
> > The (3) as I had originally written / meant was that multiple
> > instructions in a volatile asm would not get _individually_
> > interspersed with the rest of the code i.e. be emitted out
> > _consecutively_. I don't think we need any such guarantees for
> > the non-atomic variants of those operations, so it's good to
> > let the compiler have a free hand with what it wants to do,
> > and optimize/combine multiple bitops as necessary / possible,
> > which was the original intention.
> >
>
> No, a single asm statement is always emitted in one piece.  Gcc doesn't
> parse the string other than to do %-substitution to insert arguments, so
> it has no way to meaningfully split it up.

gcc also tries to count the number of instructions, to guess how large in
bytes the asm block is, as it could make a difference for near vs short
jumps, etc.

I wonder it it also affects the instruction count the inline heuristics
use?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__

2007-07-23 Thread Jeremy Fitzhardinge
Satyam Sharma wrote:
> The (3) as I had originally written / meant was that multiple
> instructions in a volatile asm would not get _individually_
> interspersed with the rest of the code i.e. be emitted out
> _consecutively_. I don't think we need any such guarantees for
> the non-atomic variants of those operations, so it's good to
> let the compiler have a free hand with what it wants to do,
> and optimize/combine multiple bitops as necessary / possible,
> which was the original intention.
>   

No, a single asm statement is always emitted in one piece.  Gcc doesn't
parse the string other than to do %-substitution to insert arguments, so
it has no way to meaningfully split it up.

J
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__

2007-07-23 Thread Satyam Sharma
On Mon, 23 Jul 2007, Jeremy Fitzhardinge wrote:

> I'm not quite sure what your point is.

Could be a case of terminology confusion ...

> The paragraph you quoted is
> pretty explicit in saying that volatile doesn't prevent an "asm
> volatile" from being interspersed with other code, and the example just
> before that is explicit in talking about how to use dependencies to
> control the ordering of asm volatiles with respect to surrounding code.

Yes, that was the (2).

> In fact nothing in that section precludes asm volatiles from being
> reordered with respect to each other either; you just have to make sure
> your dependency constraints are all correct.

The (3) as I had originally written / meant was that multiple
instructions in a volatile asm would not get _individually_
interspersed with the rest of the code i.e. be emitted out
_consecutively_. I don't think we need any such guarantees for
the non-atomic variants of those operations, so it's good to
let the compiler have a free hand with what it wants to do,
and optimize/combine multiple bitops as necessary / possible,
which was the original intention.

Satyam
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__

2007-07-23 Thread Jeremy Fitzhardinge
Satyam Sharma wrote:
> Hi Jeremy,
>
>
> On Mon, 23 Jul 2007, Jeremy Fitzhardinge wrote:
>
>   
>> Satyam Sharma wrote:
>> 
>>> From: Satyam Sharma <[EMAIL PROTECTED]>
>>>
>>> [7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
>>>
>>> Another oddity I noticed in this file. The semantics of __volatile__
>>> when used to qualify inline __asm__ are that the compiler will not
>>> (1) elid, or, (2) reorder, or, (3) intersperse, our inline asm with
>>> the rest of the generated code.
>>>   
>>>   
>> "asm volatile" does not mean that at all.  It only guarantees (1),
>> 
>
>
> Actually, you're probably right about (2), but (3)?
>
> From the gcc manual:
>
> 
>
> Similarly, you can't expect a sequence of volatile asm instructions to
> remain perfectly consecutive. If you want consecutive output, use a
> single asm. Also GCC will perform some optimizations across a volatile
> asm instruction, GCC does not "forget everything" when it encounters a
> volatile asm instruction the way some other compilers do.
>
> 
>
> I'm reading "Similarly, you can't expect a sequence of volatile asm
> instructions to remain perfectly consecutive" to mean they're talking
> about something like:
>
> asm volatile(...);
> asm volatile(...);
> asm volatile(...);
>
> But "If you want consecutive output, use a single asm" probably means:
>
> asm volatile(... multiple instructions here ...);
>
> would actually ensure the code written in there would not be
> interspersed ... at least that's how I read it.
>   

I'm not quite sure what your point is.  The paragraph you quoted is
pretty explicit in saying that volatile doesn't prevent an "asm
volatile" from being interspersed with other code, and the example just
before that is explicit in talking about how to use dependencies to
control the ordering of asm volatiles with respect to surrounding code. 
In fact nothing in that section precludes asm volatiles from being
reordered with respect to each other either; you just have to make sure
your dependency constraints are all correct.

J
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__

2007-07-23 Thread Satyam Sharma
Hi Jeremy,


On Mon, 23 Jul 2007, Jeremy Fitzhardinge wrote:

> Satyam Sharma wrote:
> > From: Satyam Sharma <[EMAIL PROTECTED]>
> >
> > [7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
> >
> > Another oddity I noticed in this file. The semantics of __volatile__
> > when used to qualify inline __asm__ are that the compiler will not
> > (1) elid, or, (2) reorder, or, (3) intersperse, our inline asm with
> > the rest of the generated code.
> >   
> 
> "asm volatile" does not mean that at all.  It only guarantees (1),


Actually, you're probably right about (2), but (3)?

>From the gcc manual:



Similarly, you can't expect a sequence of volatile asm instructions to
remain perfectly consecutive. If you want consecutive output, use a
single asm. Also GCC will perform some optimizations across a volatile
asm instruction, GCC does not "forget everything" when it encounters a
volatile asm instruction the way some other compilers do.



I'm reading "Similarly, you can't expect a sequence of volatile asm
instructions to remain perfectly consecutive" to mean they're talking
about something like:

asm volatile(...);
asm volatile(...);
asm volatile(...);

But "If you want consecutive output, use a single asm" probably means:

asm volatile(... multiple instructions here ...);

would actually ensure the code written in there would not be
interspersed ... at least that's how I read it.

[ BTW "Also GCC will perform some optimizations across a volatile
asm instruction ..." is exactly the kind of optimizations we must
allow the compiler to do, but that's not related to point at hand. ]


> and
> only then if the asm is ever reachable.

Yup, of course.

Satyam
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__

2007-07-23 Thread Satyam Sharma
Hi,

On Mon, 23 Jul 2007, Andi Kleen wrote:

> On Monday 23 July 2007 18:06:03 Satyam Sharma wrote:
> > From: Satyam Sharma <[EMAIL PROTECTED]>
> > 
> > [7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
> > 
> > Another oddity I noticed in this file. The semantics of __volatile__
> > when used to qualify inline __asm__ are that the compiler will not
> > (1) elid, or, (2) reorder, or, (3) intersperse, our inline asm with
> > the rest of the generated code.
> > 
> > However, we do not want these guarantees in the unlocked variants of the
> > bitops functions. 
> 
> I thought so too and did a similar transformation while moving
> some string functions out of line. After that recent misadventure
> I would be very very careful with this.

Ah, ok, so this must be the case we'd stumbled upon recently on the
other thread. I hadn't got your fix for this, so didn't know.

> Overall I'm sorry to say, but the risk:gain ratio of this
> patch is imho totally out of whack.

The patch does look correct to me, we're only killing the use of
__volatile__ from places that don't require it (the guarantee-less
variants). Without losing it, I really don't see how the compiler
can ever combine multiple bitops, which does sound beneficial when
the caller has already implemented higher-level locking around
the usage of these operations.

Satyam
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__

2007-07-23 Thread Jeremy Fitzhardinge
Satyam Sharma wrote:
> From: Satyam Sharma <[EMAIL PROTECTED]>
>
> [7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
>
> Another oddity I noticed in this file. The semantics of __volatile__
> when used to qualify inline __asm__ are that the compiler will not
> (1) elid, or, (2) reorder, or, (3) intersperse, our inline asm with
> the rest of the generated code.
>   

"asm volatile" does not mean that at all.  It only guarantees (1), and
only then if the asm is ever reachable.

J
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__ II

2007-07-23 Thread Andi Kleen

BTW if you want to optimize inline asm code a bit -- find_first_bit / 
find_first_zero_bit / for_each_cpu could really benefit from a optimized 
version 
for cpumask_t sized bitmaps. That would  save a lot of cycles in some of the 
hotter paths of the kernel like the scheduler.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__

2007-07-23 Thread Andi Kleen
On Monday 23 July 2007 18:06:03 Satyam Sharma wrote:
> From: Satyam Sharma <[EMAIL PROTECTED]>
> 
> [7/8] i386: bitops: Kill needless usage of __asm__ __volatile__
> 
> Another oddity I noticed in this file. The semantics of __volatile__
> when used to qualify inline __asm__ are that the compiler will not
> (1) elid, or, (2) reorder, or, (3) intersperse, our inline asm with
> the rest of the generated code.
> 
> However, we do not want these guarantees in the unlocked variants of the
> bitops functions. 

I thought so too and did a similar transformation while moving
some string functions out of line. After that recent misadventure
I would be very very careful with this.

Overall I'm sorry to say, but the risk:gain ratio of this
patch is imho totally out of whack.

It took a long time to get bitops.h correct and as far as we know
it is compiled correctly currently. You risk all at for very dubious
improvements.

-Andi



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 7/8] i386: bitops: Kill needless usage of __asm__ __volatile__

2007-07-23 Thread Satyam Sharma
From: Satyam Sharma <[EMAIL PROTECTED]>

[7/8] i386: bitops: Kill needless usage of __asm__ __volatile__

Another oddity I noticed in this file. The semantics of __volatile__
when used to qualify inline __asm__ are that the compiler will not
(1) elid, or, (2) reorder, or, (3) intersperse, our inline asm with
the rest of the generated code.

However, we do not want these guarantees in the unlocked variants of the
bitops functions. Also, note that test_bit() is *always* an unlocked
operation on i386 -- the i386 instruction set disallows the use of the LOCK
prefix with the "btl" instruction anyway, and the CPU will throw an invalid
opcode exception otherwise. Hence, every caller of variable_test_bit() must
have all the required locking implemented at a higher-level anyway and this
operation would necessarily be guarantee-less.

So let's remove the __volatile__ qualification of the inline asm in the
variable_test_bit() function also -- and let's give the compiler a chance
to optimize / combine multiple bitops, if possible.

Signed-off-by: Satyam Sharma <[EMAIL PROTECTED]>
Cc: David Howells <[EMAIL PROTECTED]>
Cc: Nick Piggin <[EMAIL PROTECTED]>

---

 include/asm-i386/bitops.h |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/asm-i386/bitops.h b/include/asm-i386/bitops.h
index f37b8a2..4f1fda5 100644
--- a/include/asm-i386/bitops.h
+++ b/include/asm-i386/bitops.h
@@ -96,7 +96,7 @@ static inline void clear_bit(int nr, unsigned long *addr)
  */
 static inline void __clear_bit(int nr, unsigned long *addr)
 {
-   __asm__ __volatile__(
+   __asm__(
"btrl %1,%0"
:"=m" (*addr)
:"r" (nr));
@@ -123,7 +123,7 @@ static inline void __clear_bit(int nr, unsigned long *addr)
  */
 static inline void __change_bit(int nr, unsigned long *addr)
 {
-   __asm__ __volatile__(
+   __asm__(
"btcl %1,%0"
:"=m" (*addr)
:"r" (nr));
@@ -251,7 +251,7 @@ static inline int __test_and_change_bit(int nr, unsigned 
long *addr)
 {
int oldbit;
 
-   __asm__ __volatile__(
+   __asm__(
"btcl %2,%1\n\tsbbl %0,%0"
:"=r" (oldbit),"=m" (*addr)
:"r" (nr));
@@ -297,7 +297,7 @@ static inline int variable_test_bit(int nr, const unsigned 
long *addr)
 {
int oldbit;
 
-   __asm__ __volatile__(
+   __asm__(
"btl %2,%1\n\tsbbl %0,%0"
:"=r" (oldbit)
:"m" (*addr),"r" (nr));
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/