The affected code block accounted for around 4% of the runtime on x86_64
(measured using oprofile on a Penryn).
Timings for Arrandale (gcc 4.6.1 tdm64-1 for Windows):
win32: 341 -> 331
win64: 321 -> 120
Part of the gain also comes from converting the adpcm values to float
outside of the loops.
---
libavcodec/dcadec.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
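
For reviewers who want to check the restructuring in isolation, here is a
minimal standalone sketch of the two loop shapes. It is not the decoder code:
the function names, the flat 8-sample buffer, the 4-entry coefficient row and
the int16_t coefficient type are assumptions made for illustration only. Both
variants perform the same multiplications in the same order; the second one
accumulates into a local float and applies the division by 8192 once per
sample, so results can differ only by float rounding.

```c
#include <stdint.h>

#define TAPS    4   /* predictor order used by the loop in this patch */
#define SAMPLES 8   /* samples handled per pass of the outer loop */

/* Hypothetical stand-in for the old loop: multiply, divide and
 * update the output sample once per tap. */
static void predict_per_tap(float out[SAMPLES], const int16_t coef[TAPS],
                            const float hist[TAPS], int use_hist)
{
    int m, n;
    for (m = 0; m < SAMPLES; m++)
        for (n = 1; n <= TAPS; n++) {
            if (m >= n)
                out[m] += coef[n - 1] * out[m - n] / 8192;
            else if (use_hist)
                out[m] += coef[n - 1] * hist[m - n + 4] / 8192;
        }
}

/* Hypothetical stand-in for the new loop: accumulate in a local
 * float, then scale and update the output sample once. */
static void predict_accumulated(float out[SAMPLES], const int16_t coef[TAPS],
                                const float hist[TAPS], int use_hist)
{
    int m, n;
    for (m = 0; m < SAMPLES; m++) {
        float sum = 0;
        for (n = 1; n <= TAPS; n++) {
            if (m >= n)
                sum += coef[n - 1] * out[m - n];
            else if (use_hist)
                sum += coef[n - 1] * hist[m - n + 4];
        }
        out[m] += sum / 8192;
    }
}

/* Largest absolute element-wise difference between two buffers. */
static float max_abs_diff(const float *a, const float *b, int len)
{
    float worst = 0, d;
    int i;
    for (i = 0; i < len; i++) {
        d = a[i] > b[i] ? a[i] - b[i] : b[i] - a[i];
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

Running both functions on copies of the same input and comparing them with
max_abs_diff() is one way to confirm that the change only affects rounding.
Keeping the running sum in a local variable instead of updating the array
element on every tap is the likely reason the compiler can keep it in a
register, but that is an inference from the timings above, not something
this sketch measures.
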
diff --git a/libavcodec/dcadec.c b/libavcodec/dcadec.c
index 12ff8b5..4b59cd5 100644
--- a/libavcodec/dcadec.c
+++ b/libavcodec/dcadec.c
@@ -1185,15 +1185,15 @@ static int dca_subsubframe(DCAContext *s, int base_channel, int block_index)
if (s->prediction_mode[k][l]) {
int n;
for (m = 0; m < 8; m++) {
+ float sum = 0;
for (n = 1; n <= 4; n++)
if (m >= n)
- subband_samples[k][l][m] +=
- (adpcm_vb[s->prediction_vq[k][l]][n - 1] *
- subband_samples[k][l][m - n] / 8192);
+ sum += adpcm_vb[s->prediction_vq[k][l]][n - 1] *
+ subband_samples[k][l][m - n];
else if (s->predictor_history)
- subband_samples[k][l][m] +=
- (adpcm_vb[s->prediction_vq[k][l]][n - 1] *
- s->subband_samples_hist[k][l][m - n + 4] / 8192);
+ sum += adpcm_vb[s->prediction_vq[k][l]][n - 1] *
+ s->subband_samples_hist[k][l][m - n + 4];
+ subband_samples[k][l][m] += sum / 8192;
}
}
}
--
1.8.0.msysgit.0
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel