On Fri, Aug 14, 2020 at 07:25:17PM +0200, Daniel Vetter wrote:
> On Fri, Aug 14, 2020 at 7:17 PM Daniel Stone <dan...@fooishbar.org> wrote:
> >
> > Hi,
> >
> > On Fri, 14 Aug 2020 at 17:22, Thierry Reding <thierry.red...@gmail.com> 
> > wrote:
> > > I suspect that the reason why this works in X but not in Wayland is
> > > because X passes the right usage flags, whereas Weston may not. But I'll
> > > have to investigate more in order to be sure.
> >
> > Weston allocates its own buffers for displaying the result of
> > composition through GBM with USE_SCANOUT, which is definitely correct.
> >
> > Wayland clients (common to all compositors, in Mesa's
> > src/egl/drivers/dri2/platform_wayland.c) allocate with USE_SHARED but
> > _not_ USE_SCANOUT, which is correct in that they are guaranteed to be
> > shared, but not guaranteed to be scanned out. The expectation is that
> > non-scanout-compatible buffers would be rejected by gbm_bo_import if
> > not drmModeAddFB2.
> >
> > One difference between Weston and all other compositors (GNOME Shell,
> > KWin, Sway, etc) is that Weston uses KMS planes for composition when
> > it can (i.e. when gbm_bo_import from dmabuf + drmModeAddFB2 from
> > gbm_bo handle + atomic check succeed), but the other compositors only
> > use the GPU. So if you have different assumptions about the layout of
> > imported buffers between the GPU and KMS, that would explain a fair
> > bit.
> 
> Yeah non-modifiered multi-gpu (of any kind) is pretty much hopeless I
> think. I guess the only option is if the tegra mesa driver forces
> linear and an extra copy on everything that's USE_SHARED or
> USE_SCANOUT.

I ended up trying this, but this fails for the X case, unfortunately,
because there doesn't seem to be a good synchronization point at which
the de-tiling blit could be done. Weston and kmscube end up calling a
gallium driver's ->flush_resource() implementation, but that never
happens for X and glamor.

But after looking into this some more, I don't think that's even the
problem that we're facing here. The root cause of the glxgears crash
that Karol originally reported is that we end up allocating the glxgears
pixmaps using the dri3 loader from Mesa, and the dri3 loader
unconditionally passes both __DRI_IMAGE_USE_SHARE and
__DRI_IMAGE_USE_SCANOUT, irrespective of whether the buffer will end up
being scanned out directly or whether it will be composited onto the
root window.

What exactly happens depends on whether I run glxgears in fullscreen
mode or windowed mode. In windowed mode, the glxgears buffers will be
composited onto the root window, so there's no need for the buffers to
be scanout-capable. If I modify the dri3 loader to not pass those flags,
I can make this work just fine.

When I run glxgears in fullscreen mode, the modesetting driver ends up
wanting to display the glxgears buffer directly on screen, without
compositing it onto the root window. This ends up working if I leave out
the _USE_SHARE and _USE_SCANOUT flags, but I notice that the kernel then
complains about being unable to create a framebuffer. That in turn
happens because those buffers are not exported, so Tegra DRM doesn't
have a valid handle for them. (The Tegra Mesa driver only
exports/imports buffers that are meant for scanout, under the assumption
that those are the only ones that will ever need to be used by KMS.)
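
To make the failure mode concrete, here is a minimal sketch of
allocation-time export/import, assuming a toy model in which a handle of
0 means "never exported". The eager_* names and the flag value are
hypothetical stand-ins, not Mesa or Tegra DRM code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for __DRI_IMAGE_USE_SCANOUT; the value is illustrative. */
#define EAGER_USE_SCANOUT (1u << 1)

/* Hypothetical buffer record; not a real Mesa or libdrm type. */
struct eager_bo {
    uint32_t usage;
    uint32_t kms_handle; /* 0 means "never exported to the display device" */
};

/* Allocation-time export/import: only scanout buffers get a KMS handle. */
static void eager_bo_init(struct eager_bo *bo, uint32_t usage)
{
    static uint32_t next_handle = 1; /* fake handle allocator */

    assert(bo != NULL);
    bo->usage = usage;
    bo->kms_handle = (usage & EAGER_USE_SCANOUT) ? next_handle++ : 0;
}

/* Models framebuffer creation: it can only succeed with a valid handle. */
static bool eager_bo_can_add_fb(const struct eager_bo *bo)
{
    assert(bo != NULL);
    return bo->kms_handle != 0;
}
```

A buffer allocated without the scanout flag has no handle in this model,
so a later attempt to create a framebuffer from it cannot succeed.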

So I think an ideal solution would probably be for glxgears to somehow
pass better usage information when allocating buffers, but I suspect
that that's just not possible, or would be way too much work and require
additional protocol at the DRI level, so it's not really a good option
when all we want to fix is backwards-compatibility with pre-modifiers
userspace.
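
As a rough sketch of what "better usage information" could mean, assume
the client somehow knew how its buffer would be presented. No such hint
exists at the DRI level today; the enum, the hint_* names and the flag
values below are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for __DRI_IMAGE_USE_SHARE and __DRI_IMAGE_USE_SCANOUT.
 * The numeric values are illustrative only. */
#define HINT_USE_SHARE   (1u << 0)
#define HINT_USE_SCANOUT (1u << 1)

/* Hypothetical presentation hint; nothing like this is in the protocol. */
enum hint_present_path {
    HINT_PATH_COMPOSITED,     /* windowed: copied onto the root window */
    HINT_PATH_DIRECT_SCANOUT, /* fullscreen: displayed directly */
};

/* Pick allocation flags from the hint instead of passing both always. */
static uint32_t hint_usage_for(enum hint_present_path path)
{
    assert(path == HINT_PATH_COMPOSITED ||
           path == HINT_PATH_DIRECT_SCANOUT);

    if (path == HINT_PATH_DIRECT_SCANOUT)
        return HINT_USE_SHARE | HINT_USE_SCANOUT;

    /* Mirrors the windowed experiment above, where both flags were
     * dropped; a real implementation may still need a share flag. */
    return 0;
}
```

The hard part is not this function but getting the hint to the
allocator in the first place, which is where the extra protocol would
come in.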

Given that glamor also doesn't have any synchronization points, I don't
see how I can implement the de-tiling blit reliably. I was wondering if
it shouldn't be possible to flush the framebuffer resource (and perform
the blit) at presentation time, but I couldn't find a good entry point
to do this.

One other solution that occurred to me was to reintroduce an old IOCTL
that we used to have in the Tegra DRM driver. That IOCTL was meant to
attach tiling metadata to an imported buffer and was basically a
simplified, driver-specific way of doing framebuffer modifiers. That's
a very ugly solution, but it would allow us to be backwards-compatible
with pre-modifiers userspace and even use an optimal path for rendering
and scanning out. The only prerequisite would be that the driver IOCTL
was implemented and that a recent enough Mesa was used to make use of
it. I don't like this very much because framebuffer modifiers are a much
more generic solution, but all of the other options above are pretty
much just as ugly.
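
For illustration, the idea of attaching tiling metadata to an imported
buffer can be modelled roughly like this. It is a user-space toy, not
the Tegra DRM uapi: the struct layouts, the mode names and the
sketch_gem_set_tiling() helper are all assumptions made up for this
sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tiling modes; names and values are illustrative. */
enum sketch_tiling_mode {
    SKETCH_TILING_PITCH, /* linear */
    SKETCH_TILING_TILED,
    SKETCH_TILING_BLOCK, /* 'value' carries a mode-specific parameter */
};

/* Hypothetical ioctl argument: which buffer, and how it is laid out. */
struct sketch_set_tiling {
    uint32_t handle;
    uint32_t mode;
    uint32_t value;
};

/* Hypothetical kernel-side record for an imported buffer. */
struct sketch_gem {
    uint32_t handle;
    uint32_t tiling_mode;
    uint32_t tiling_value;
};

/* Look up the buffer by handle and record its layout for scanout. */
static int sketch_gem_set_tiling(struct sketch_gem *objs, size_t count,
                                 const struct sketch_set_tiling *args)
{
    assert(objs != NULL && args != NULL);

    if (args->mode > SKETCH_TILING_BLOCK)
        return -EINVAL;

    for (size_t i = 0; i < count; i++) {
        if (objs[i].handle == args->handle) {
            objs[i].tiling_mode = args->mode;
            objs[i].tiling_value = args->value;
            return 0;
        }
    }

    return -ENOENT; /* no such imported buffer */
}
```

The display side would then consult the recorded mode when programming
scanout, which is the job a framebuffer modifier does generically.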

One other idea that I haven't explored yet is to be a little more clever
about the export/import dance that we do for buffers. Currently we
export/import at allocation time, and that seems to cause a bit of a
problem, like the lack of valid GEM handles for some buffers (such as in
the glxgears fullscreen use-case discussed above). I wonder if perhaps
deferring the export/import dance until the handles are actually
required may be a better way to do this. With such a solution, even if a
buffer is allocated for scanout, it won't actually be imported/exported
if the client ends up being composited onto the root window. Import and
export would be limited to buffers that truly are going to be used for
drmModeAddFB2(). I'll give that a shot and see if that gets me closer to
my goal.
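
The deferred variant could look roughly like the following sketch, in
which the export/import step runs only the first time a KMS handle is
requested. The lazy_* names are hypothetical, and the counter merely
stands in for the real export/import round trip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer record; not a real Mesa type. */
struct lazy_bo {
    uint32_t kms_handle;  /* 0 until the buffer is first needed by KMS */
    unsigned int exports; /* how many export/import round trips ran */
};

/* Allocation no longer touches the display device at all. */
static void lazy_bo_init(struct lazy_bo *bo)
{
    assert(bo != NULL);
    bo->kms_handle = 0;
    bo->exports = 0;
}

/* Export/import on first use, then reuse the cached handle. */
static uint32_t lazy_bo_kms_handle(struct lazy_bo *bo)
{
    static uint32_t next_handle = 1; /* fake handle allocator */

    assert(bo != NULL);
    if (bo->kms_handle == 0) {
        bo->kms_handle = next_handle++;
        bo->exports++;
    }

    return bo->kms_handle;
}
```

A buffer that only ever gets composited never calls
lazy_bo_kms_handle() and so never pays for the export/import, while a
buffer that reaches drmModeAddFB2() gets its handle on demand.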

Thierry

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
