On 10/6/2015 4:40 PM, Paul B Mahol wrote:
> On 10/6/15, James Almer <jamr...@gmail.com> wrote:
>> Since AVFrame.extended_data is apparently not padded, simd functions
>> could in some cases overread, so make the decoder use a temp buffer
>> unconditionally.
>>
>> Signed-off-by: James Almer <jamr...@gmail.com>
>> ---
>>  libavcodec/alac.c | 18 +++++-------------
>>  1 file changed, 5 insertions(+), 13 deletions(-)
>>
>> diff --git a/libavcodec/alac.c b/libavcodec/alac.c
>> index 146668e..394bd19 100644
>> --- a/libavcodec/alac.c
>> +++ b/libavcodec/alac.c
>> @@ -80,7 +80,6 @@ typedef struct ALACContext {
>>      int extra_bits;     /**< number of extra bits beyond 16-bit */
>>      int nb_samples;     /**< number of samples in the current frame */
>>
>> -    int direct_output;
>>      int extra_bit_bug;
>>
>>      ALACDSPContext dsp;
>> @@ -278,10 +277,6 @@ static int decode_element(AVCodecContext *avctx, AVFrame *frame, int ch_index,
>>          return AVERROR_INVALIDDATA;
>>      }
>>      alac->nb_samples = output_samples;
>> -    if (alac->direct_output) {
>> -        for (ch = 0; ch < channels; ch++)
>> -            alac->output_samples_buffer[ch] = (int32_t *)frame->extended_data[ch_index + ch];
>> -    }
>>
>>      if (is_compressed) {
>>          int16_t lpc_coefs[2][32];
>> @@ -393,8 +388,9 @@ static int decode_element(AVCodecContext *avctx, AVFrame *frame, int ch_index,
>>          break;
>>      case 24: {
>>          for (ch = 0; ch < channels; ch++) {
>> +            int32_t *outbuffer = (int32_t *)frame->extended_data[ch_index + ch];
>>              for (i = 0; i < alac->nb_samples; i++)
>> -                alac->output_samples_buffer[ch][i] <<= 8;
>> +                *outbuffer++ = alac->output_samples_buffer[ch][i] << 8;
>>          }}
>>          break;
>>      }
>> @@ -468,8 +464,7 @@ static av_cold int alac_decode_close(AVCodecContext *avctx)
>>      int ch;
>>      for (ch = 0; ch < FFMIN(alac->channels, 2); ch++) {
>>          av_freep(&alac->predict_error_buffer[ch]);
>> -        if (!alac->direct_output)
>> -            av_freep(&alac->output_samples_buffer[ch]);
>> +        av_freep(&alac->output_samples_buffer[ch]);
>>          av_freep(&alac->extra_bits_buffer[ch]);
>>      }
>>
>> @@ -491,11 +486,8 @@ static int allocate_buffers(ALACContext *alac)
>>          FF_ALLOC_OR_GOTO(alac->avctx, alac->predict_error_buffer[ch],
>>                           buf_size, buf_alloc_fail);
>>
>> -        alac->direct_output = alac->sample_size > 16;
>> -        if (!alac->direct_output) {
>> -            FF_ALLOC_OR_GOTO(alac->avctx, alac->output_samples_buffer[ch],
>> -                             buf_size, buf_alloc_fail);
>> -        }
>> +        FF_ALLOC_OR_GOTO(alac->avctx, alac->output_samples_buffer[ch],
>> +                         buf_size, buf_alloc_fail);
>>
>>          FF_ALLOC_OR_GOTO(alac->avctx, alac->extra_bits_buffer[ch],
>>                           buf_size, buf_alloc_fail);
>> --
>> 2.5.2
>>
>> _______________________________________________
>> ffmpeg-devel mailing list
>> ffmpeg-devel@ffmpeg.org
>> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>
> 
> it should be padded and not introduce slowdown

If you mean the temp buffers, they will be padded alongside the SIMD functions
once I commit them.
But if you mean the AVFrame.extended_data buffer, could you take care of that?
I'm not familiar enough with AVFrame to change the relevant alloc functions.
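
To illustrate the idea, here is a rough standalone sketch (not the actual alac.c
code) of decoding into a padded temp buffer and then copying exactly nb_samples
to the output. The names sketch_alloc_padded, sketch_output_24 and
SKETCH_PADDING_BYTES are made up for this example, and the padding size is an
assumption, not whatever the real SIMD functions will need.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tail padding in bytes; the real value depends on how far
 * the SIMD loops may read past the last sample. */
#define SKETCH_PADDING_BYTES 32

/* Allocate room for nb_samples int32_t values plus zeroed tail padding,
 * so a vectorised loop can safely read past the last sample. */
static int32_t *sketch_alloc_padded(int nb_samples)
{
    size_t size;

    assert(nb_samples >= 0);
    size = (size_t)nb_samples * sizeof(int32_t) + SKETCH_PADDING_BYTES;
    return calloc(1, size);
}

/* Copy decoded 24-bit samples from the padded temp buffer to the caller's
 * (possibly unpadded) output buffer, shifted into the top bits of a 32-bit
 * sample. Only nb_samples elements of dst are written. */
static void sketch_output_24(int32_t *dst, const int32_t *tmp, int nb_samples)
{
    int i;

    for (i = 0; i < nb_samples; i++)
        dst[i] = (int32_t)((uint32_t)tmp[i] << 8);
}
```

The shift goes through uint32_t only to avoid left-shifting a negative value in
this sketch. The point is that any overread is confined to the temp buffer,
while the output buffer is touched by a plain scalar loop.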

running "time ffmpeg -v 0 -threads 1 -i INPUT -threads 1 -f null -" (implicit
pcm_s16le output)

Before
real    0m0.596s
user    0m0.000s
sys     0m0.000s

After
real    0m0.575s
user    0m0.000s
sys     0m0.000s


running "time ffmpeg -v 0 -threads 1 -i INPUT -threads 1 -c:a pcm_s24le -f null -"

Before
real    0m0.618s
user    0m0.000s
sys     0m0.000s

After
real    0m0.618s
user    0m0.000s
sys     0m0.000s

Both tests used a ~1 minute 24-bit 44.1 kHz stereo sample. Curious that it's
faster when the output is s16.
You'll probably have to do the same for the TAK decoder before you commit your
decorrelate SIMD patch, btw. It also uses the AVFrame.extended_data buffer
directly for 24-bit samples.