The affected code block accounted for around 4% of the runtime on x86_64
(measured using oprofile on a Penryn).
Timings for Arrandale (gcc 4.6.1 tdm64-1 for windows):
win32: 341 -> 331
win64: 321 -> 120
Part of the gain also comes from the adpcm values being converted to float
outside of the loops.
---
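For reviewers, here is a standalone sketch of the transformation, reduced to a
single subband. It is not the decoder code: the names predict_per_tap,
predict_accumulated, TAPS and BLOCK are made up for illustration, and the
history buffer is assumed to always be used (the real loop only reads it when
s->predictor_history is set). The first function updates the output sample and
divides once per tap; the second accumulates into a local float and does one
division and one store per sample. The two can differ by float rounding.

```c
#include <stdint.h>

#define TAPS  4   /* number of predictor coefficients (illustrative name) */
#define BLOCK 8   /* samples per subsubframe (illustrative name) */

/* Old form: read-modify-write of x[m] and one division for every tap. */
static void predict_per_tap(float *x, const float *hist, const int16_t *coef)
{
    for (int m = 0; m < BLOCK; m++)
        for (int n = 1; n <= TAPS; n++) {
            /* Assumption: history is always available in this sketch. */
            float prev = m >= n ? x[m - n] : hist[m - n + TAPS];
            x[m] += coef[n - 1] * prev / 8192;
        }
}

/* New form: accumulate in a local, then one division and one store. */
static void predict_accumulated(float *x, const float *hist, const int16_t *coef)
{
    for (int m = 0; m < BLOCK; m++) {
        float sum = 0;
        for (int n = 1; n <= TAPS; n++) {
            float prev = m >= n ? x[m - n] : hist[m - n + TAPS];
            sum += coef[n - 1] * prev;
        }
        x[m] += sum / 8192;
    }
}
```

Keeping the running sum in a local lets the compiler hold it in a register
instead of storing to the array after each tap.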
 libavcodec/dcadec.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/libavcodec/dcadec.c b/libavcodec/dcadec.c
index 12ff8b5..4b59cd5 100644
--- a/libavcodec/dcadec.c
+++ b/libavcodec/dcadec.c
@@ -1185,15 +1185,15 @@ static int dca_subsubframe(DCAContext *s, int base_channel, int block_index)
             if (s->prediction_mode[k][l]) {
                 int n;
                 for (m = 0; m < 8; m++) {
+                    float sum = 0;
                     for (n = 1; n <= 4; n++)
                         if (m >= n)
-                            subband_samples[k][l][m] +=
-                                (adpcm_vb[s->prediction_vq[k][l]][n - 1] *
-                                 subband_samples[k][l][m - n] / 8192);
+                            sum += adpcm_vb[s->prediction_vq[k][l]][n - 1] *
+                                   subband_samples[k][l][m - n];
                         else if (s->predictor_history)
-                            subband_samples[k][l][m] +=
-                                (adpcm_vb[s->prediction_vq[k][l]][n - 1] *
-                                 s->subband_samples_hist[k][l][m - n + 4] / 8192);
+                            sum += adpcm_vb[s->prediction_vq[k][l]][n - 1] *
+                                   s->subband_samples_hist[k][l][m - n + 4];
+                    subband_samples[k][l][m] += sum / 8192;
                 }
             }
         }
-- 
1.8.0.msysgit.0
