Re: [FFmpeg-devel] [RFC/PATCH] bitpacked_dec: Optimization for bitpacked_dec decoder performance

2023-06-12 Thread Paul B Mahol
On Fri, May 12, 2023 at 12:20 AM Marton Balint wrote: > > > On Wed, 10 May 2023, Lance Wang wrote: > > > On Sat, May 6, 2023 at 8:41 PM Devin Heitmueller < > > devin.heitmuel...@ltnglobal.com> wrote: > > > >> On Sat, May 6, 2023 at 8:16 AM James Almer wrote: > >> > Can you bench with the

Re: [FFmpeg-devel] [RFC/PATCH] bitpacked_dec: Optimization for bitpacked_dec decoder performance

2023-05-12 Thread Devin Heitmueller
On Thu, May 11, 2023 at 6:20 PM Marton Balint wrote: > Actually the cached bitstream reader was faster here than the manual > approach: > > ./ffmpeg -stream_loop 128 -threads 1 -f bitpacked -pix_fmt yuv422p10le -s > 3840x2160 -c:v bitpacked -i source.yuv -pix_fmt yuv422p10le -f null none >

Re: [FFmpeg-devel] [RFC/PATCH] bitpacked_dec: Optimization for bitpacked_dec decoder performance

2023-05-11 Thread Marton Balint
On Wed, 10 May 2023, Lance Wang wrote: On Sat, May 6, 2023 at 8:41 PM Devin Heitmueller < devin.heitmuel...@ltnglobal.com> wrote: On Sat, May 6, 2023 at 8:16 AM James Almer wrote: > Can you bench with the START_TIMER and STOP_TIMER macros in timer.h? > Also, define CACHED_BITSTREAM_READER

Re: [FFmpeg-devel] [RFC/PATCH] bitpacked_dec: Optimization for bitpacked_dec decoder performance

2023-05-10 Thread Lance Wang
On Sat, May 6, 2023 at 8:41 PM Devin Heitmueller < devin.heitmuel...@ltnglobal.com> wrote: > On Sat, May 6, 2023 at 8:16 AM James Almer wrote: > > Can you bench with the START_TIMER and STOP_TIMER macros in timer.h? > > Also, define CACHED_BITSTREAM_READER in bitpacked_dec.c before including > >

Re: [FFmpeg-devel] [RFC/PATCH] bitpacked_dec: Optimization for bitpacked_dec decoder performance

2023-05-06 Thread Devin Heitmueller
On Sat, May 6, 2023 at 8:16 AM James Almer wrote: > Can you bench with the START_TIMER and STOP_TIMER macros in timer.h? > Also, define CACHED_BITSTREAM_READER in bitpacked_dec.c before including > get_bits.h and test the actual implementation again, to see if it makes > any difference. Original

Re: [FFmpeg-devel] [RFC/PATCH] bitpacked_dec: Optimization for bitpacked_dec decoder performance

2023-05-06 Thread James Almer
On 5/6/2023 9:13 AM, Devin Heitmueller wrote: I added some instrumentation via the attached patch. You can see the benefits here: Before=1683378057.243350 After 1683378057.264239 Before=1683378083.335424 After 1683378083.356440 Before=1683378089.675400 After 1683378089.696512

Re: [FFmpeg-devel] [RFC/PATCH] bitpacked_dec: Optimization for bitpacked_dec decoder performance

2023-05-06 Thread Devin Heitmueller
I added some instrumentation via the attached patch. You can see the benefits here:
Before=1683378057.243350 After 1683378057.264239
Before=1683378083.335424 After 1683378083.356440
Before=1683378089.675400 After 1683378089.696512
Before=1683378151.792324 After 1683378151.813579
21 ms per run
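The "21 ms per run" figure can be checked directly from the four timestamp pairs quoted in this message. The sketch below assumes each pair is a wall-clock timestamp in seconds taken immediately before and after one decode call; the snippet does not say so explicitly. The names `SAMPLES` and `elapsed_ms` are illustrative, not taken from the patch. `Decimal` is used so that the microsecond digits are not lost to floating-point rounding.

```python
from decimal import Decimal

# (before, after) timestamps in seconds, copied from the message.
# Assumption: each pair brackets a single decode call.
SAMPLES = [
    ("1683378057.243350", "1683378057.264239"),
    ("1683378083.335424", "1683378083.356440"),
    ("1683378089.675400", "1683378089.696512"),
    ("1683378151.792324", "1683378151.813579"),
]


def elapsed_ms(before: str, after: str) -> Decimal:
    """Return the elapsed time between two timestamps, in milliseconds."""
    return (Decimal(after) - Decimal(before)) * 1000


deltas = [elapsed_ms(before, after) for before, after in SAMPLES]
mean_ms = sum(deltas) / len(deltas)

for delta in deltas:
    print(f"{delta:.3f} ms")
print(f"mean: {mean_ms:.3f} ms")
```

The individual runs come out at 20.889, 21.016, 21.112 and 21.255 ms, with a mean of 21.068 ms, which is consistent with the rounded figure in the message.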

Re: [FFmpeg-devel] [RFC/PATCH] bitpacked_dec: Optimization for bitpacked_dec decoder performance

2023-05-06 Thread Paul B Mahol
On Sat, May 6, 2023 at 1:32 PM Lance Wang wrote: > On Sat, May 6, 2023 at 4:58 AM Devin Heitmueller < > devin.heitmuel...@ltnglobal.com> wrote: > > > Rework the code a bit to speed up the 10-bit bitpacked decoding > > routine. This is probably about as fast as I can get it without > > switching

Re: [FFmpeg-devel] [RFC/PATCH] bitpacked_dec: Optimization for bitpacked_dec decoder performance

2023-05-06 Thread Devin Heitmueller
Hi Lance,
On Sat, May 6, 2023 at 7:32 AM Lance Wang wrote:
> FYI, on my development system, I ran it twice for the original and modified
> versions and saw no obvious difference:
Simply running "time" against the binary isn't an accurate way to measure a 60ms difference for a single frame being

Re: [FFmpeg-devel] [RFC/PATCH] bitpacked_dec: Optimization for bitpacked_dec decoder performance

2023-05-06 Thread Lance Wang
On Sat, May 6, 2023 at 4:58 AM Devin Heitmueller < devin.heitmuel...@ltnglobal.com> wrote: > Rework the code a bit to speed up the 10-bit bitpacked decoding > routine. This is probably about as fast as I can get it without > switching to assembly language. > > Demonstrable with: > > ./ffmpeg

[FFmpeg-devel] [RFC/PATCH] bitpacked_dec: Optimization for bitpacked_dec decoder performance

2023-05-05 Thread Devin Heitmueller
Rework the code a bit to speed up the 10-bit bitpacked decoding routine. This is probably about as fast as I can get it without switching to assembly language. Demonstrable with: ./ffmpeg -f lavfi -i "smptehdbars=size=3840x2160" -c bitpacked -f image2 -frames:v 1 source.yuv ./ffmpeg -f
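For readers unfamiliar with what a 10-bit bitpacked decoding routine has to do, here is a minimal sketch of the unpacking step. It is not the code from the patch, which is not included in this listing. It assumes four 10-bit samples are packed MSB-first into five consecutive bytes; the sample order within a group (for example Cb, Y0, Cr, Y1) is also an assumption. The function names `unpack_10bit_group` and `pack_10bit_group` are hypothetical.

```python
def unpack_10bit_group(group: bytes) -> tuple:
    """Split 5 bytes (40 bits) into four 10-bit samples, MSB first."""
    if len(group) != 5:
        raise ValueError("a group must be exactly 5 bytes")
    b0, b1, b2, b3, b4 = group
    return (
        (b0 << 2) | (b1 >> 6),           # bits 39..30
        ((b1 & 0x3F) << 4) | (b2 >> 4),  # bits 29..20
        ((b2 & 0x0F) << 6) | (b3 >> 2),  # bits 19..10
        ((b3 & 0x03) << 8) | b4,         # bits 9..0
    )


def pack_10bit_group(s0: int, s1: int, s2: int, s3: int) -> bytes:
    """Inverse of unpack_10bit_group, used here only to build test input."""
    for sample in (s0, s1, s2, s3):
        if not 0 <= sample <= 0x3FF:
            raise ValueError("samples must fit in 10 bits")
    word = (s0 << 30) | (s1 << 20) | (s2 << 10) | s3
    return word.to_bytes(5, "big")


if __name__ == "__main__":
    packed = pack_10bit_group(0x3FF, 0x000, 0x155, 0x2AA)
    print(packed.hex())
    print(unpack_10bit_group(packed))
```

Working on whole 5-byte groups with fixed shifts and masks avoids per-sample bit-position bookkeeping, which is the general kind of manual approach the thread compares against a generic bitstream reader.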