Hi,

On Thu, Jul 9, 2015 at 9:15 AM, <shivraj.pa...@imgtec.com> wrote:
+void ff_idct_idct_16x16_add_msa(uint8_t *dst, ptrdiff_t stride,
+                                int16_t *block, int eob)
+{
+    vp9_idct16x16_colcol_addblk_msa(block, dst, stride);
+    memset(block, 0, 16 * 16 * sizeof(*block));
+}

This comment applies to all code in this file: you're not using the eob
parameter anywhere. Admittedly, for the iadst variants the eob value is
generally quite high, so checking it won't gain much. But for idct_idct, eob is
typically low (possibly even 1), and you can make use of that to do sub-idcts.
Look at the C code for an example of a dc-only idct_idct, and look at the x86
simd for examples of sub-idcts. They give great speedups on top of the regular
speedup expected from simd vectorization, especially for the bigger ones
(16x16, 32x32).
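
To illustrate the dc-only case, here is a minimal sketch in plain C of what an
eob check could look like for the 16x16 idct_idct. It is not the MSA code from
the patch: full_idct16x16_add() is a hypothetical stand-in for
vp9_idct16x16_colcol_addblk_msa(), the function names are made up for this
example, and the constant 11585 (cos(pi/4) in 14-bit fixed point) and the final
6-bit rounding shift are assumed to match the C reference for this block size.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the full 16x16 routine from the patch
 * (vp9_idct16x16_colcol_addblk_msa). It only counts how often it runs. */
static int full_idct_calls;

static void full_idct16x16_add(int16_t *block, uint8_t *dst, ptrdiff_t stride)
{
    (void)block; (void)dst; (void)stride;
    full_idct_calls++;
}

static uint8_t clip_uint8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* DC-only shortcut: with a single coefficient the inverse transform is a
 * constant, so scale block[0] once per 1-D pass and add it to every pixel.
 * Assumes arithmetic right shift for negative values, as most compilers do. */
static void idct16x16_dc_add(int16_t *block, uint8_t *dst, ptrdiff_t stride)
{
    int t = (block[0] * 11585 + (1 << 13)) >> 14;  /* first 1-D pass  */
    t = (t * 11585 + (1 << 13)) >> 14;             /* second 1-D pass */
    t = (t + 32) >> 6;                             /* final rounding (assumed 6 bits) */

    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            dst[x] = clip_uint8(dst[x] + t);
        dst += stride;
    }
    block[0] = 0;  /* only coefficient that was set */
}

/* Dispatch on eob: cheap path when only the DC coefficient is present,
 * full transform otherwise. Sub-idcts for small eob would slot in between. */
void idct_idct_16x16_add_sketch(uint8_t *dst, ptrdiff_t stride,
                                int16_t *block, int eob)
{
    if (eob == 1) {
        idct16x16_dc_add(block, dst, stride);
        return;
    }
    full_idct16x16_add(block, dst, stride);
    memset(block, 0, 16 * 16 * sizeof(*block));
}
```

With eob == 1 the whole block costs two multiplies plus 256 saturating adds,
and the full transform is never entered.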

Agreed, will incorporate the same.

Shivraj
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
