On 7 July 2022 at 09:34, Martijn van Beurden wrote:
On Thu 7 Jul 2022 at 09:07, olivier tristan <o.tris...@uvi.net> wrote:
> Hence even small optimizations are very welcome :)
I presume you use libFLAC directly then. Sadly there is little left to
optimize in the decoder. Below is an excerpt of the output of gprof on
flac decoding a track:
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  s/call   s/call  name
> 34.87      0.68     0.68   680925     0.00     0.00  FLAC__bitreader_read_rice_signed_block
> 25.64      1.18     0.50  6004826     0.00     0.00  FLAC__MD5Transform
> 14.36      1.46     0.28    46030     0.00     0.00  FLAC__lpc_restore_signal
>  8.72      1.63     0.17    23457     0.00     0.00  read_frame_
>  5.13      1.73     0.10    23457     0.00     0.00  write_callback
>  3.08      1.79     0.06    23457     0.00     0.00  FLAC__MD5Accumulate
>  3.08      1.85     0.06                             read
>  2.56      1.90     0.05    50901     0.00     0.00  FLAC__crc16_update_words32
>  1.03      1.92     0.02    23457     0.00     0.00  write_audio_frame_to_client_
>  0.51      1.93     0.01  2016520     0.00     0.00  bitreader_read_from_client_
>  0.51      1.94     0.01                             _IO_file_seekoff
>  0.51      1.95     0.01                             write
As you can see, the bitreader takes up most of the time. This is,
however, not something that can be optimized with SIMD/vector
instructions like SSE, AVX, NEON etc. It is also a strictly sequential
process. In the past there have been several attempts at improving the
speed of this call. You could try for yourself whether configuring with
./configure --enable-64-bit-words or cmake -DENABLE_64_BIT_WORDS=ON
brings any (small) improvement.
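To illustrate why this step is sequential, here is a minimal sketch of
Rice decoding in C. It is not libFLAC's code: BitCursor, next_bit and
rice_decode_block are hypothetical names, and the sketch reads one bit
at a time for clarity, whereas a real bitreader works on whole words.
The point is that the length of each code word is only known after it
has been decoded, so the start of the next value cannot be located in
advance.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical minimal MSB-first bit cursor (not libFLAC's FLAC__BitReader). */
typedef struct {
    const uint8_t *buf;
    size_t len;     /* buffer length in bytes */
    size_t bitpos;  /* index of the next unread bit */
} BitCursor;

/* Return the next bit (0 or 1), or -1 when the buffer is exhausted. */
static int next_bit(BitCursor *c)
{
    if (c->bitpos >= c->len * 8)
        return -1;
    int bit = (c->buf[c->bitpos >> 3] >> (7 - (c->bitpos & 7))) & 1;
    c->bitpos++;
    return bit;
}

/* Decode n signed Rice-coded values with parameter k into out.
 * Returns 0 on success, -1 on truncated input.
 * Sketch only: no guard against very long unary runs overflowing q << k. */
static int rice_decode_block(BitCursor *c, unsigned k, int32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t q = 0;
        int bit;

        /* Unary part: count zero bits up to the terminating one bit. */
        while ((bit = next_bit(c)) == 0)
            q++;
        if (bit < 0)
            return -1;

        /* Binary part: k low-order bits. */
        uint32_t r = 0;
        for (unsigned j = 0; j < k; j++) {
            bit = next_bit(c);
            if (bit < 0)
                return -1;
            r = (r << 1) | (uint32_t)bit;
        }

        /* Undo the zigzag folding: even -> non-negative, odd -> negative. */
        uint32_t u = (q << k) | r;
        out[i] = (int32_t)(u >> 1) ^ -(int32_t)(u & 1);
    }
    return 0;
}
```

With k = 2, the two bytes 0x95 0x94 hold the four values 0, -1, 3, -3
in 14 bits. Value i+1 can only be read once value i has been fully
consumed, which is the dependency that keeps SIMD from helping here.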
Next the MD5 transformation takes up a lot of time too, but I suppose
you do not use that anyway. It is disabled by default when decoding
using libFLAC directly.
Finally the lpc restore takes up some time and can be improved with
SSE, AVX, NEON etc., but it represents only a small part of the
decoding CPU load.
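For comparison, a minimal sketch of what LPC restoration computes,
again in C. The function name lpc_restore and its buffer layout are
assumptions made for illustration and do not match the actual
FLAC__lpc_restore_signal signature. Each output sample is the residual
plus a quantized linear prediction from the previous `order` samples.
The inner multiply-accumulate over the coefficients is the part that
lends itself to vector instructions, although each sample still depends
on the ones restored before it.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical sketch of LPC signal restoration.
 * data[0 .. order-1] must already hold the warm-up samples;
 * residual[i] corresponds to data[order + i].
 * Assumes >> on a negative value is an arithmetic shift, which is
 * implementation-defined in C but true on common compilers. */
static void lpc_restore(const int32_t *residual, size_t n,
                        const int32_t *coeff, unsigned order,
                        int shift, int32_t *data)
{
    for (size_t i = 0; i < n; i++) {
        int64_t sum = 0;

        /* Prediction: coeff[0] weights the most recent sample. */
        for (unsigned j = 0; j < order; j++)
            sum += (int64_t)coeff[j] * data[order + i - j - 1];

        data[order + i] = residual[i] + (int32_t)(sum >> shift);
    }
}
```

For example, with order 2, coefficients {2, -1}, shift 0, warm-up
samples 1 and 2 and residuals {0, 0, 1}, the predictor extrapolates
linearly and the restored signal is 1, 2, 3, 4, 6.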
We do indeed use libFLAC directly, so MD5 is not enabled in my case.
In the perf analyzer we do indeed see
FLAC__bitreader_read_rice_signed_block and FLAC__lpc_restore_signal.
Perhaps it is possible to add a switch to the encoder to create FLAC
files that are optimized for decoding speed instead of size. Would
that be something you would use? For example, trading 5% less
compression for 30% more decoding speed, assuming that MD5
checking is already off?
This would indeed be interesting.
The material we use is very well compressed by FLAC, as each file is
just a single note of an instrument as opposed to a song.
For example, in a piano library we can divide the sample size by 4.
--
Olivier Tristan
Research & Development
www.uvi.net
_______________________________________________
flac-dev mailing list
flac-dev@xiph.org
http://lists.xiph.org/mailman/listinfo/flac-dev