On Thu, 28 Jul 2005, Steven Rostedt wrote:
>
> In the thread "[RFC][PATCH] Make MAX_RT_PRIO and MAX_USER_RT_PRIO
> configurable" I discovered that a C version of find_first_bit is faster
> than the asm version now when compiled against gcc 3.3.6 and gcc 4.0.1
> (both from versions of Debian unstable).  I wrote a benchmark (attached)
> that runs the code 1,000,000 times.

I suspect the old "rep scas" has always been slower than 
compiler-generated code, at least under your test conditions. Many of the 
old asm's are actually _very_ old, and some of them come from pre-0.01 
days and are more about me learning the i386 (and gcc inline asm).

That said, I don't much like your benchmarking methodology. I suspect that 
quite often, the code in question runs from L2 cache, not in a tight loop, 
and so that "run a million times" approach is not necessarily the best 
one.

I'll apply this one as obvious: I doubt the compiler generates bigger code
or has any real downsides, but I just wanted to say that in general I just
wish people didn't always time the hot-cache case ;)

                Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to