The function has some branches that get pruned by constant propagation if it is inlined. Some compilers miss this opportunity if inlining is not forced, producing much slower code.
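
As an illustration of the effect described above, here is a minimal sketch. It is not the libavcodec code: `ALWAYS_INLINE`, `Entry`, `toy_table`, `toy_read_vlc`, `read_depth1` and `read_depth2` are hypothetical names, and the table layout is a simplified assumption. When `max_depth` is a compile-time constant at the call site and the function is inlined, the `max_depth > 1` test can be folded away.

```c
#include <stdint.h>

/* Hypothetical stand-in for av_always_inline; falls back to plain
 * inline on compilers without the GCC/Clang attribute. */
#if defined(__GNUC__) || defined(__clang__)
#    define ALWAYS_INLINE __attribute__((always_inline)) inline
#else
#    define ALWAYS_INLINE inline
#endif

/* Toy table entry (assumed layout): len > 0 means "value is the decoded
 * symbol"; len < 0 means "value is the offset of a subtable that is
 * indexed with -len further bits". */
typedef struct Entry {
    int16_t value;
    int8_t  len;
} Entry;

/* 2-bit first level plus one 1-bit subtable at offset 4. */
static const Entry toy_table[6] = {
    { 10,  1 }, { 10,  1 },   /* codes 0x  -> 10          */
    { 20,  2 },               /* code  10  -> 20          */
    {  4, -1 },               /* code  11  -> subtable @4 */
    { 30,  1 }, { 40,  1 },   /* codes 110 -> 30, 111 -> 40 */
};

/* word holds the bitstream left-aligned in 32 bits; bits must be 1..31.
 * The max_depth branch is dead code whenever the caller passes 1. */
static ALWAYS_INLINE int toy_read_vlc(const Entry *table, uint32_t word,
                                      int bits, int max_depth)
{
    unsigned idx = word >> (32 - bits);
    int value    = table[idx].value;
    int len      = table[idx].len;

    if (max_depth > 1 && len < 0) {
        word <<= bits;                        /* drop the bits already used */
        idx    = (word >> (32 + len)) + value; /* -len bits index the subtable */
        value  = table[idx].value;
    }
    return value;
}

/* Constant max_depth at each call site: once inlined, read_depth1 needs
 * no second-level lookup code at all. */
int read_depth1(const Entry *t, uint32_t w) { return toy_read_vlc(t, w, 2, 1); }
int read_depth2(const Entry *t, uint32_t w) { return toy_read_vlc(t, w, 2, 2); }
```

With plain `inline` the compiler is free to emit one out-of-line copy that keeps the runtime check on `max_depth`; forcing the inline gives each caller its own specialised body. Whether that pays off on a given compiler has to be measured.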

Matches what get_vlc2 does.

Vorbis, converted to use bitstream_read_vlc, is affected as follows:

            gcc-6.3   clang-4.0
x86          0.0%       2.3%
x86_64       0.5%       1.6%
arm32        0.6%      -0.1%
arm64        0.4%       4.0%

Raw data and computation:
https://gist.github.com/lu-zero/171c854498ba934cdb7bae385f045e5b#some-benchmarks-using-perf-to-see-whats-going-on
https://docs.google.com/spreadsheets/d/1V0f-YNauz1SrRBO3jzJtOsA97Z1XnPGBMbvI0sYrhFA/edit#gid=0
---
 libavcodec/vlc.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/libavcodec/vlc.h b/libavcodec/vlc.h
index 8ac52381b9..7962c15145 100644
--- a/libavcodec/vlc.h
+++ b/libavcodec/vlc.h
@@ -65,8 +65,8 @@ void ff_free_vlc(VLC *vlc);
     } while (0)
 
 /* Return the LUT element for the given bitstream configuration. */
-static inline int set_idx(BitstreamContext *bc, int code, int *n, int *nb_bits,
-                          VLC_TYPE (*table)[2])
+static av_always_inline int set_idx(BitstreamContext *bc, int code, int *n, int *nb_bits,
+                                    VLC_TYPE (*table)[2])
 {
     unsigned idx;
 
@@ -87,8 +87,8 @@ static inline int set_idx(BitstreamContext *bc, int code, int *n, int *nb_bits,
  * If the VLC code is invalid and max_depth = 1, then no bits will be removed.
  * If the VLC code is invalid and max_depth > 1, then the number of bits removed
  * is undefined. */
-static inline int bitstream_read_vlc(BitstreamContext *bc, VLC_TYPE (*table)[2],
-                                     int bits, int max_depth)
+static av_always_inline int bitstream_read_vlc(BitstreamContext *bc, VLC_TYPE (*table)[2],
+                                               int bits, int max_depth)
 {
     int nb_bits;
     unsigned idx = bitstream_peek(bc, bits);
-- 
2.12.2