"Antti S. Lankila" <alank...@bel.fi> writes:

> Attached is a simple patch that produces around 20 % Mpix/s
> improvement for wide path processing due to significant optimization
> of pixman_expand. On my i7 laptop, we go from:
>
>> src_8888_2x10 = L1: 62.08 L2: 60.73 M: 59.61
>> ( 4.30%) HT: 46.81 VT: 42.17 R: 43.18 RT: 26.01 (
>> 325Kops/s)
>
> to
>
>> src_8888_2x10 = L1: 76.94 L2: 78.43 M: 75.87
>> ( 5.59%) HT: 56.73 VT: 52.39 R: 53.00 RT: 29.29 (
>> 363Kops/s)
>
> The key of the patch is the observation that unorm_to_unorm's work can
> more easily be done with a simple multiplication and shift, when the
> function is applied repeatedly and the parameters are not compile-time
> constants. For instance, converting from 0xfe to 0xfefe (expanding
> from 8 bits to 16 bits) can be done by calculating
>
> c = c * 0x101
>
> However, sometimes the result is not a neat replication of all the
> bits. For instance, going from 10 bits to 16 bits can be done by
> calculating
>
> c = c * 0x401UL >> 4
>
> where the intermediate result is 20 bit wide repetition of the 10-bit
> pattern followed by shifting off the unnecessary lowest bits.
>
> The patch has the algorithm to calculate the factor and the shift, and
> converts the code to use it.
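For readers following along, the multiply-and-shift idea in the quoted
message can be sketched roughly as below. This is not the code from the
patch: the type expand_params_t and the functions compute_expand_params()
and expand_channel() are hypothetical names made up for illustration, and
the sketch assumes 1 <= from_bits <= to_bits <= 16 so that every
intermediate value fits in 32 bits.

```c
#include <stdint.h>

/* Hypothetical parameter pair, not taken from the patch. */
typedef struct
{
    uint32_t factor;    /* multiplier that replicates the bit pattern */
    int      shift;     /* number of surplus low bits to drop afterwards */
} expand_params_t;

/* Assumes 1 <= from_bits <= to_bits <= 16. */
static expand_params_t
compute_expand_params (int from_bits, int to_bits)
{
    expand_params_t p;
    int width;

    p.factor = 0;
    width = 0;

    /* Place one copy of the pattern every from_bits bits until the
     * replicated value is at least to_bits wide.
     */
    while (width < to_bits)
    {
        p.factor |= (uint32_t) 1 << width;
        width += from_bits;
    }

    /* Whatever exceeds to_bits is shifted off the low end. */
    p.shift = width - to_bits;

    return p;
}

/* Expand one channel value using precomputed parameters. */
static uint32_t
expand_channel (uint32_t c, const expand_params_t *p)
{
    return (c * p->factor) >> p->shift;
}
```

Under these assumptions, compute_expand_params (8, 16) yields factor
0x101 and shift 0, and compute_expand_params (10, 16) yields factor 0x401
and shift 4, matching the two examples in the quoted text. The parameters
only need to be computed once per conversion, which is where the saving
over recomputing them per pixel would come from.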

This patch looks basically good to me provided that make check still
passes.

The comments I have are mainly about coding style (please see the
CODING_STYLE file). In particular:

- All the information in the mail would be useful to have in the
  commit message: the speed-up, how it works, etc.

- The function unorm_to_unorm_params() should be static

- Space before the left parenthesis

- Avoid variable declarations in the middle of the code

- Indent is four spaces

- Braces go on their own line

But other than that, this looks like a nice speedup.


Thanks,
Søren
_______________________________________________
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman