On 09/21/2017 03:03 AM, Mark Thompson wrote:
On 21/09/17 03:17, Leo Liu wrote:
On 09/20/2017 06:11 PM, Mark Thompson wrote:
On 19/09/17 20:04, Leo Liu wrote:
This series is for the VA-API state tracker postproc, including:
Deinterlacing interlaced video for transcode;
Scaling support in postproc for transcode;
Frame grabber in postproc.
Thanks to Andy Furniss <adf.li...@gmail.com> for lots of testing on these.
Leo Liu (14):
st/va/postproc: use video original size for postprocessing
vl/compositor: separate YUV part from shader video buffer function
vl/compositor: extend YUV deint function to do field deint
vl/compositor: add a new function for YUV deint
st/omx: use new vl_compositor_yuv_deint_full() to deint
st/va: use new vl_compositor_yuv_deint_full() to deint
vl/compositor: remove vl_compositor_yuv_deint() function
vl/compositor: add Bob top and bottom to YUV deint function
st/va/postproc: add a full NV12 deint support from buffer I to P
st/va: make internal func vlVaHandleSurfaceAllocate() call simpler
st/va/postproc: use progressive target buffer for scaling
vl/compositor: create RGB to YUV fragment shader
vl/compositor: convert RGB buffer to YUV with color conversion
st/va/postproc: implement the DRM prime grabber
src/gallium/auxiliary/vl/vl_compositor.c | 263 +++++++++++++++++-----
src/gallium/auxiliary/vl/vl_compositor.h | 50 +++-
src/gallium/state_trackers/omx_bellagio/vid_dec.c | 11 +-
src/gallium/state_trackers/va/picture.c | 16 +-
src/gallium/state_trackers/va/postproc.c | 69 +++++-
src/gallium/state_trackers/va/surface.c | 7 +-
src/gallium/state_trackers/va/va_private.h | 2 +-
7 files changed, 331 insertions(+), 87 deletions(-)
Looks good for import from a bit of testing so far (with the update today).
Something funny going on with RGB upload cases? With ffmpeg:
./ffmpeg_g -y -i in.mp4 -an -vaapi_device /dev/dri/renderD129 -vf
format=bgr0,hwupload,scale_vaapi=w=1920:h=1080:format=nv12 -c:v h264_vaapi
-profile:v 578 -bf 0 out.mp4
it crashes a few lines into copying to the image.
The mapping in vlVaMapBuffer() looks like:
(gdb) p *buf->derived_surface.resource
$9 = {reference = {count = 5}, screen = 0x555557829010, width0 = 1920, height0
= 1088, depth0 = 1, array_size = 1, format = PIPE_FORMAT_B8G8R8X8_UNORM, target
= PIPE_TEXTURE_2D, last_level = 0, nr_samples = 0, usage = 0, bind = 2097152,
flags = 0, next = 0x0}
(gdb) p *buf->derived_surface.transfer
$8 = {resource = 0x555557d8e2c0, level = 0, usage = PIPE_TRANSFER_WRITE, box =
{x = 0, y = 0, z = 0, width = 1920, height = 1, depth = 1}, stride = 7680,
layer_stride = 7680}
height = 1 looks suspicious, like it's only mapping the first line?
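As a rough check of the numbers in that dump, here is a minimal sketch of the arithmetic. The width, height, box height and stride are copied from the gdb output; the 4 bytes per pixel for B8G8R8X8 and the 1080-line frame size are assumptions added for illustration, and whether the box really bounds the mapped memory depends on the driver.

```python
# Values copied from the gdb dump above.
SURFACE_WIDTH = 1920
SURFACE_HEIGHT = 1088
BOX_HEIGHT = 1
STRIDE = 7680

# Assumptions (not in the dump): B8G8R8X8 has four 8-bit channels, so
# 4 bytes per pixel, and the frame being uploaded is 1080 lines tall.
BYTES_PER_PIXEL = 4
FRAME_LINES = 1080


def span_bytes(stride, lines):
    """Bytes covered by `lines` rows of a linear image with this stride."""
    return stride * lines


if __name__ == "__main__":
    # The stride matches one tightly packed row of the surface.
    print("row bytes:   ", SURFACE_WIDTH * BYTES_PER_PIXEL)
    # What the transfer box describes: a single row.
    print("mapped bytes:", span_bytes(STRIDE, BOX_HEIGHT))
    # What copying a whole frame would write.
    print("frame bytes: ", span_bytes(STRIDE, FRAME_LINES))
```

If the mapping really only covers the box, a copy of the whole frame would run past it after the first row, which would fit a crash a few lines into the copy.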
It looks like the command crashed at some point before it got to where you
wanted it to go, i.e. RGB->YUV in postproc.
I'm not quite understanding what you mean. Do you crash at a different point
rather than in the copy after mapping the image to upload to? Backtrace?
I haven't tried your command yet, but I know it won't work. If we want to
scale raw RGB in postproc, the raw data has to be in an RGB surface.
Currently the only RGB case we support in vaCreateSurface is passing in a
dma-buf handle; allocating an RGB surface is not supported.
Even if we got lucky and the command line got through, it would put RGB
data into an NV12 surface (the driver should explicitly return invalid
surface for this case).
A general question for the whole driver: why are surfaces interlaced by default?
I think it's what the firmware prefers, and they are also good for deinterlacing.
Can you be more specific?
Take a look at rvid_get_video_param() in radeon_video.c; that will tell you
which interlaced formats the HW supports and prefers.
I agree that it is required for deinterlacing, but that isn't a particularly
common case and will only become less so with time. E.g. is it somehow better
to decode even progressive video to interlaced frames? That seems like it
would have significantly worse locality of reference to me, but maybe the
hardware does something special.
I may be getting some things wrong here, but the relevant components which deal
with surfaces that I see are:
* Decoder: can write either format, the stream type doesn't seem to matter (?).
Normally, the HW decoder writes to NV12 or P016, and for MJPEG it can do YUYV as well.
The stream type depends on which codecs the HW supports.
All in interlaced and progressive forms? I didn't consider it earlier, but the
H.265 decoder seems to always produce progressive for me.
Again, it depends on the HW: the state tracker queries the driver for what it
supports and prefers, then decides how to create the surface.
Regards,
Leo
* Encoder: can only accept progressive surfaces.
* Deinterlacer: only works on interlaced surfaces (?).
Yes, if you would like to have a pretty picture for 'deinterlace_vaapi=mode=3'
* Scaler: can work on either.
* Import: will pretty much always be progressive unless forced not to be (no one
is going to make the interlaced format externally unless they get forced into
it).
If the import is for the encoder, it has to be progressive.
Typically it isn't directly for the encoder because few things directly produce
the necessary formats - a postproc step for some colour-conversion is very
likely to happen first.
* Export: works for either, but interlaced is likely much harder for others to
use.
For some use for playback, see <https://github.com/01org/libva/pull/85>,
<https://lists.freedesktop.org/archives/mesa-dev/2017-September/169953.html>. The
mpv use-case really wants just the native format without any copying (to not pointlessly
waste resources); if that has to be the interlaced form then that should be ok (though I
don't know if other projects might disagree on that one), but it would need changes to the
libva API to be able to express it.
The current copy cases (progressive -> interlaced) with the interlaced default,
then, are after most imports and on encode, and it was suggested that we want to do
it before export as well. If surfaces were instead progressive by default then I
think the only copy necessary would be before the deinterlacer. Since most streams
nowadays are progressive (including all >1080-line high-resolution streams, where
performance matters most), and that proportion is only going to increase, improving
other parts for a decrease in performance of the deinterlacer seems like a pretty
good tradeoff to me.
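To make that comparison concrete, here is a toy model that counts layout copies along a few pipelines under each default. It is only a sketch of the argument above, not driver code: the stage names and the `count_copies` helper are hypothetical, and treating export as needing progressive input is an assumption based on the suggestion to copy before export.

```python
PROGRESSIVE = "progressive"
INTERLACED = "interlaced"

# Input layout each stage needs; None means it accepts either.
REQUIRED_INPUT = {
    "encode": PROGRESSIVE,
    "deinterlace": INTERLACED,
    "scale": None,
    "export": PROGRESSIVE,  # assumption: copy before export, as suggested
}


def count_copies(default_layout, path):
    """Count layout-changing copies along `path` for a given default."""
    copies = 0
    layout = None
    for stage in path:
        if stage == "decode":
            # The decoder writes straight into a default-layout surface.
            layout = default_layout
        elif stage == "import":
            # Imports arrive progressive and are copied into the default
            # layout when that differs.
            layout = PROGRESSIVE
            if default_layout != PROGRESSIVE:
                copies += 1
                layout = default_layout
        else:
            needed = REQUIRED_INPUT[stage]
            if needed is not None and layout != needed:
                copies += 1
                layout = needed
            if stage == "deinterlace":
                # Deinterlacing produces progressive output.
                layout = PROGRESSIVE
    return copies


if __name__ == "__main__":
    paths = [
        ["decode", "encode"],
        ["import", "scale", "encode"],
        ["decode", "deinterlace", "encode"],
        ["decode", "export"],
    ]
    for path in paths:
        print(" -> ".join(path),
              "| interlaced default:", count_copies(INTERLACED, path),
              "| progressive default:", count_copies(PROGRESSIVE, path))
```

In this model the only path that gets worse with a progressive default is the one that goes through the deinterlacer.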
Thoughts?
Thanks,
- Mark
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev