Andy Furniss wrote:

I do know that I have actually grabbed and encoded 1080p60 with my AMD
h/w, and including the nv12 conversion gives a sane-looking result:

gst-launch-1.0 -f ximagesrc use-damage=0 startx=0 starty=0 endx=1919 endy=1079 num-buffers=1000 ! queue ! videoconvert ! video/x-raw,framerate=100/1,format=NV12 ! fakesink
Setting pipeline to PAUSED ...
Pipeline is live and does not need PREROLL ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
Got EOS from element "pipeline0".
Execution ended after 0:00:14.419928745
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
Setting pipeline to NULL ...
Freeing pipeline ...

1000 frames / 14.419928745 s = 69.3 fps
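
The same arithmetic can be scripted so it is easy to repeat for each run. This is only a minimal sketch: the helper name fps is made up for illustration, and it assumes you copy the frame count (num-buffers) and the "Execution ended after" time, converted to seconds, by hand.

```shell
# fps FRAMES SECONDS: print the average frame rate to one decimal place.
# "fps" is a hypothetical helper name, not part of gstreamer or ffmpeg.
fps() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.1f\n", n / t }'
}

fps 1000 14.419928745   # prints 69.3
```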

Over the weekend I looked at the CSC aspect of this without using
x11grab, i.e. benchmarking the conversion of a bgr0 file on tmpfs to nv12,
and with a bit of luck managed to get ffmpeg to beat gstreamer.

Starting point: gstreamer bgr0 to nv12 = 70 fps, to I420 = 68 fps.

I benchmarked ffmpeg using -f null, as -f rawvideo to RAM or /dev/null is
slower. I suspect/hope that for my intended usage (vaapi upload) -f null
will be more representative, but of course I don't know that.

ffmpeg -f rawvideo -s 1920x1080 -pix_fmt bgr0 -i /mnt/ramdisk/out.bgr0 -pix_fmt nv12 -f null -

= 41 fps for nv12; yuv420p = 66 fps

So yuv420p is close to gstreamer but nv12 is poor.

By chance I wondered how much worse it would be if I used -sws_flags, as
I have done in the past. The result was that it got faster: it turns out
that +full_chroma_inp takes yuv420p from 66 to 84 fps and nv12 from 41 to
47 fps.

The reason is that with no flags time is spent in bgr32toUV_half_c,
whereas with the flag above that function is not used and I see various
SSE routines in use, such as ff_rgbatoUV_sse2.

nv12 is still too slow though. Looking with sysprof, I see that the time
is spent in yuv2nv12cX_c.

That seemed slow compared with the yuv420p -> nv12 conversions I remembered
from the past, so I benchmarked 1080p yuv420p -> nv12 and got > 500 fps.
That conversion didn't use yuv2nv12cX_c at all, so I made a new command
line that goes via yuv420p:

ffmpeg -f rawvideo -s 1920x1080 -pix_fmt bgr0 -i /mnt/ramdisk/out.bgr0 -vf scale=flags=+full_chroma_inp,format=yuv420p,format=nv12 -f null -

= 78 fps, nice.

So at least I can beat gstreamer on CSC now. Testing the new command line
with x11grab gets me close to gst when using the legacy x11grab = 65 fps.

libxcb x11grab is only 52 fps though, so it would be good if that could be
fixed up.




_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
