Probably you can make it even faster by avoiding the multiplication, like

   unsigned int offset = 0;
   for (i = 0; i < image.height; i++) {
        dst[offset] = src[i];
        offset += pitch;
   }


More than two decades ago I learned to avoid mul and imul. Use shifts, add and lea instead, that was the credo those days. The name of the game was CP/M 80/86, a86, d86 and ddt ;-)

But let´s get serious again.

Your proposed change of the patch results in a 21 ms performance decrease on my system. Yes, I do know that this is hard to believe. I tested a similar variation before, and the results
were even worse.

Avoiding mul is a good idea in assembly language today, but often it is better to write a multiplication with the loop counter in C and not to introduce an extra variable instead. The compiler will optimize the code and it´s easier for gcc without that extra variable.

More interesting would be the question what should be done for idx==2 or idx==3. Probably fb_pad_aligned_buffer() is also slower for those cases. But does anybody use such fonts?

cu,
knut
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to