2012/4/5 Ronald S. Bultje <[email protected]>:
> This looks OK. Is there a performance benefit? I assume there isn't
> anything measurable, because the overhead is relatively low?

Yes, the split between prescaled/non-scaled cases shaves about 3 cycles
per call, which leaves the result within measurement noise. However, it
makes clear that such a distinction should be made (I'm thinking of NEON
code here). For the SSSE3 case, the non-scaled version is ~217 cycles,
and the prescaled one (producing identical results) is ~141.

> As for the code, do please document the arrays in rv34dsp.h, so we
> don't have to look at the code to figure out what the difference
> between [0][0] and [1][1] is.

Done. The commit message is also more verbose, but in the end, it would
be interesting to know whether it is good enough for someone implementing
code based on this.

Another (clearer?) solution would be to have:
    rv40_weight_func rv40_nonscaled_biweight[2];
    rv40_weight_func rv40_prescaled_biweight[2];
in RV34DSPContext and have function pointers set to the correct values
in RV34DecContext.

Christophe
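
To make the alternative layout a bit more concrete, here is a rough,
self-contained sketch of what I have in mind. It is not taken from the
patch: the rv40_weight_func signature, the cut-down RV34DecContext, the
meaning of the [2] index (assumed to be block size), and the names
select_biweight(), demo_init() and the two stub functions are all
assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed signature, for illustration only; see rv34dsp.h for the real one. */
typedef void (*rv40_weight_func)(uint8_t *dst, uint8_t *src1, uint8_t *src2,
                                 int w1, int w2, ptrdiff_t stride);

typedef struct RV34DSPContext {
    /* Index assumed to select the block size (e.g. 0: 16x16, 1: 8x8). */
    rv40_weight_func rv40_nonscaled_biweight[2];
    rv40_weight_func rv40_prescaled_biweight[2];
} RV34DSPContext;

/* Hypothetical, cut-down decoder context: only what the sketch needs. */
typedef struct RV34DecContext {
    RV34DSPContext rdsp;
    const rv40_weight_func *biweight; /* points at one of the two arrays */
} RV34DecContext;

/* Stubs standing in for real C/SIMD code; they only leave a marker in dst. */
void stub_nonscaled(uint8_t *dst, uint8_t *src1, uint8_t *src2,
                    int w1, int w2, ptrdiff_t stride)
{
    (void)src1; (void)src2; (void)w1; (void)w2; (void)stride;
    dst[0] = 1;
}

void stub_prescaled(uint8_t *dst, uint8_t *src1, uint8_t *src2,
                    int w1, int w2, ptrdiff_t stride)
{
    (void)src1; (void)src2; (void)w1; (void)w2; (void)stride;
    dst[0] = 2;
}

/* Hypothetical init: fill both arrays once, as a DSP init function would. */
void demo_init(RV34DecContext *r)
{
    for (int i = 0; i < 2; i++) {
        r->rdsp.rv40_nonscaled_biweight[i] = stub_nonscaled;
        r->rdsp.rv40_prescaled_biweight[i] = stub_prescaled;
    }
    r->biweight = r->rdsp.rv40_nonscaled_biweight;
}

/* The decoder picks the array once; callers then index it by block size. */
void select_biweight(RV34DecContext *r, int prescaled)
{
    assert(prescaled == 0 || prescaled == 1);
    r->biweight = prescaled ? r->rdsp.rv40_prescaled_biweight
                            : r->rdsp.rv40_nonscaled_biweight;
}
```

With something like this, the call site would only ever see
r->biweight[size], and the prescaled/non-scaled choice would be spelled
out by name in RV34DSPContext instead of hiding behind a [2][2] index.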
0001-rv40dsp-implement-prescaled-versions-for-biweight.patch
Description: Binary data
