[ Restricting discussion to the i386 bitops implementation. ]
Hi Nick,

On 7/23/07, Satyam Sharma <[EMAIL PROTECTED]> wrote:
Hi,

On 7/23/07, Nick Piggin <[EMAIL PROTECTED]> wrote:
> Linus Torvalds wrote:
> >
> > On Fri, 20 Jul 2007, Nick Piggin wrote:
> >
> >>So you did. Then to answer that, yes it could be faster because there are
> >>stupid volatiles sprinkled all over the bitops code so you could easily
> >>end up having to do more loads. Does it make a real difference? Unlikely,
> >>but David loves counting cycles :)
> >
> >
> > I thought we long long since removed the volatiles. They are buggy and
> > horrible, and we really want to let the compiler combine multiple
> > test-bits, and if they matter that implies locking is buggy or something
> > worse..
> >
> > Ie we'd *want*
> >
> >   if (test_bit(x, y) || test_bit(z,y))
> >
> > to be rewritten by the compiler as testing bits x/z at the same time.
>
> Yep. We'd also want __set_bit(x, y); __set_bit(z, y); and such to be
> combined.
BTW I'm also running some tests: writing test code, compiling it, and checking the code gcc generates. Curiously, the volatile cast on the passed bit-string address is not the only thing that prevents gcc's optimizer from combining operations such as the ones you listed above. There are also -O2 vs -Os (and constant_test_bit() vs variable_test_bit()) differences that I am observing, and sometimes it is just the inadequacy of gcc's optimizer. Note that constant_test_bit() seems to jump through extra hoops unnecessarily to avoid honouring @nr >= 32, whereas none of the other primitives in that file do that.

So the i386 kernel's stock constant_test_bit() implementation ends up differing from David's open-coded versions quite drastically, in subtle ways, and that again makes it difficult to combine the kind of operations you guys are discussing here. It's a given, of course, that the code gcc generates when it does combine such operations would be better than the simple btl-sbbl pair with test-and-conditional-jumps that would otherwise get generated.
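To make the comparison concrete, here is a minimal sketch of the kind of test code one could feed to gcc. It is not the kernel's implementation: the sketch_* names and SKETCH_BITS_PER_LONG are hypothetical, and the third helper simply assumes both bits live in the first word of the bit string. The idea is that the plain-pointer variant leaves the compiler free to merge two tests into one load and one mask, whereas the volatile-cast variant forces a separate load per call.

```c
#include <limits.h>

/* Hypothetical helper macro, not a kernel definition. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Plain pointer: the compiler is allowed to merge or reuse loads. */
static inline int sketch_test_bit(unsigned int nr, const unsigned long *addr)
{
    return (addr[nr / SKETCH_BITS_PER_LONG] >> (nr % SKETCH_BITS_PER_LONG)) & 1UL;
}

/* Volatile cast: every call must perform its own load from memory. */
static inline int sketch_test_bit_volatile(unsigned int nr, const unsigned long *addr)
{
    const volatile unsigned long *p = addr;

    return (p[nr / SKETCH_BITS_PER_LONG] >> (nr % SKETCH_BITS_PER_LONG)) & 1UL;
}

/*
 * Hand-combined form of test_bit(x, y) || test_bit(z, y): one load, one
 * mask test. Assumes x and z are both below SKETCH_BITS_PER_LONG, i.e.
 * both bits are in the first word.
 */
static inline int sketch_test_either(unsigned int x, unsigned int z,
                                     const unsigned long *addr)
{
    unsigned long mask = (1UL << x) | (1UL << z);

    return (addr[0] & mask) != 0;
}
```

Compiling callers of the first two variants with something like gcc -O2 -S and gcc -Os -S, and diffing the assembly against the hand-combined form, is one way to see how far the optimizer actually gets.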
> > But now I'm too scared to look.
>
> Not a chance :) Even the asm-generic "reference" implementation ratifies
> the volatile crapiness. Would you take a patch?

Coincidentally, I'm working on a cleanup of the bitops code just now -- I stumbled upon a lot of varied bogosity in there :-)
Such as bogus/invalid asm constraints being passed in the inline assembly. Probably gcc knows everybody gets its complicated extended asm wrong, so doesn't barf when parsing such stuff ... :-)
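As an illustration only (this message does not list which constraints are wrong), one classic pitfall with gcc extended asm is declaring a read-modify-write memory operand as write-only with "=m" when it should be "+m". The sketch below uses a hypothetical sketch_set_bit(), not the kernel's set_bit(); it assumes nr is in the range 0..31, and it falls back to plain C on non-x86 targets.

```c
/* Hypothetical sketch, not the kernel's set_bit(). Assumes 0 <= nr <= 31. */
static inline void sketch_set_bit(int nr, unsigned long *addr)
{
#if defined(__i386__) || defined(__x86_64__)
    /*
     * "+m": *addr is both read and written by btsl (a write-only "=m"
     *       would tell gcc the old value is dead).
     * "Ir": nr may be an immediate in 0..31 or a register.
     * "memory"/"cc": conservative clobbers for the store and the flags.
     */
    __asm__ __volatile__("btsl %1,%0"
                         : "+m" (*addr)
                         : "Ir" (nr)
                         : "memory", "cc");
#else
    /* Portable fallback so the sketch still builds elsewhere. */
    *addr |= 1UL << nr;
#endif
}
```

Usage is simply sketch_set_bit(3, &word); with a wrong constraint gcc may still accept the asm silently, so the breakage only shows up in the generated code.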
Intend to send it out in a couple of hours, probably.
Satyam