On Tue, Aug 19, 2008 at 6:57 AM, Michel Dänzer
<[EMAIL PROTECTED]> wrote:
> On Mon, 2008-08-18 at 15:30 -0400, Kristian Høgsberg wrote:
>>
>> I have pushed the DRI2 update to the dri2proto, mesa, xserver, and
>> xf86-video-intel trees in ~krh. It's on the master branch in those repos.
>
> I don't see anything younger than 5 months in your xf86-video-intel
> repo.

Ah, forgot to push that one. Should be there now.

>> The way this works now, is that when ctx->Driver.Viewport is called
>> (and thus at least when binding a drawable to a context), the DRI
>> driver calls back to the loader, which then calls into the DRI2 module
>> to get the buffers associated with the drawable.  The DRI2 module in
>> turns calls into the DDX driver to allocate these and sends them back
>> to the DRI driver, which updates the renderbuffers to use the given
>> buffers.
>
> So after binding a drawable to a context, the buffer information will
> only be updated when the app calls glViewport()? Any idea if this scheme
> will be suitable for other APIs like GLES or OpenVG?

Yes.

GLES has the glViewport entrypoint with the same semantics.

OpenVG doesn't seem to have a way to communicate the surface size to
the library, but it relies on EGL for surface and context
management.  However, the EGL API is "open ended", in that it relies
on implementation-dependent types, specifically NativeWindowType,
where we can add API to update the window size.  For example, the
NativeWindowType could be a DRIWindow in the Mesa/DRI EGL
implementation that you create and pass to eglCreateWindowSurface(),
and applications would be required to call DRIWindowSetSize whenever
the underlying window changes size.
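Something like this, purely hypothetical (none of these names exist in Mesa today; DRIWindow and both functions are invented to illustrate the idea):

```c
/* A DRIWindow standing in for EGL's implementation-dependent
 * NativeWindowType, plus the DRIWindowSetSize call the application
 * would make from its resize handler. */
typedef struct DRIWindow {
    int width, height;
    int resized;   /* set by DRIWindowSetSize, cleared when EGL notices */
} DRIWindow;

void DRIWindowSetSize(DRIWindow *win, int width, int height)
{
    win->width = width;
    win->height = height;
    win->resized = 1;
}

/* Stand-in for what the EGL implementation could do the next time the
 * surface is validated: notice the pending resize and report that the
 * color buffers need reallocating.  Returns 1 if they do. */
int egl_surface_validate(DRIWindow *win)
{
    if (!win->resized)
        return 0;
    win->resized = 0;
    return 1;
}
```

The app would create the DRIWindow, hand it to eglCreateWindowSurface(), and call DRIWindowSetSize() on resize; the EGL implementation picks the new size up at validation time.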

Also of interest are the requirements of a potential direct-rendering
cairo.  The xlib surface type already requires the user to call
cairo_xlib_surface_set_size() when the window is resized, so a
cairo_dri_surface could do the same.

> Also, I'm wondering if xDRI2Buffer should have a buffer size field, or
> if any buffer size padding requirements beyond height * pitch can always
> be handled in the driver components.
>
>> When glXSwapBuffers is called, the loader calls into the DRI
>> driver to finish the frame (this part is missing currently) and then
>> calls into the DRI2 module to actually do the back buffer to front
>> buffer copy.  The DRI2 module again implements this using a hook into
>> the DDX driver.  The code right now just does a generic CopyArea, and
>> then flushes the batch buffer.  Swap buffer needs to be a round trip
>> so the swap buffer commands are emitted before the DRI driver proceeds
>> to render the next frame.
>
> Making SwapBuffers a round trip isn't ideal for sync to vblank
> (especially considering potentially using triple buffering, but see
> below).

Are you thinking that the DRI client will do the wait-for-vblank and
then post the swap buffer request?  That's clearly not feasible, but
my thinking was that the waiting will be done in the X server, thus
the flags argument to DRI2SwapBuffers.  What I think we should do here
is to use a dedicated command queue for the front buffer and block
that on vblank using a hw wait instruction.  This lets us just fire
and forget swap buffer commands, since any cliprect changes and
associated window contents blits end up in the front buffer queue
after the buffer swap commands.  It does mean that all other rendering
to the front buffer (very little, since most toolkits are double
buffered) gets delayed a bit (at most a frame), but hey, that's a
feature, and rendering *throughput* isn't affected.  Toolkits that
double buffer to a pixmap and double-buffered GLX apps can run
unhindered.

For hw that doesn't have multiple queue support, we can simulate this
using a software queue in the kernel.  This is similar to what we have
with the vblank tasklet now, but we don't construct the blitting
commands in the kernel and all front buffer rendering goes through it.
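A toy model of that per-front-buffer queue (this is the kernel software-queue variant for hardware without multiple rings; every name here is invented): a WAIT_VBLANK entry acts as a barrier, so the swap blit behind it, and any later front-buffer rendering, only retire once the vblank arrives. Ordering is preserved and the submitter never blocks.

```c
enum fq_cmd { WAIT_VBLANK, SWAP_BLIT, FRONT_RENDER };

#define FQ_LEN 16
struct front_queue { enum fq_cmd cmds[FQ_LEN]; int head, tail; };

/* Fire and forget: the X server just appends commands and moves on. */
static int fq_push(struct front_queue *q, enum fq_cmd c)
{
    if ((q->tail + 1) % FQ_LEN == q->head)
        return -1;                       /* queue full */
    q->cmds[q->tail] = c;
    q->tail = (q->tail + 1) % FQ_LEN;
    return 0;
}

/* Drain: execute commands in order, stopping at a WAIT_VBLANK barrier;
 * the barrier is consumed only once a vblank has arrived.  Returns the
 * number of commands executed. */
static int fq_drain(struct front_queue *q, int vblank_arrived)
{
    int executed = 0;
    while (q->head != q->tail) {
        if (q->cmds[q->head] == WAIT_VBLANK) {
            if (!vblank_arrived)
                break;          /* everything behind the swap waits too */
            vblank_arrived = 0; /* one barrier per vblank */
        }
        q->head = (q->head + 1) % FQ_LEN;
        executed++;
    }
    return executed;
}
```

With a hardware wait instruction the drain loop lives on the ring itself; in the software variant it would run from the vblank interrupt.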

The key idea is that we want to use the pipelining nature of the stack
and the hardware.  The one thing that will break this idea is software
fallbacks to the front buffer, since they force us to wait for the
queue to flush and the hardware to go idle.  Good news is that in a
composited world, this won't happen, but even in the non-composited
case, we can just automatically redirect a window that triggers a
software fallback.  In either case we end up with only hw access to
the front buffer and sync to vblank.

> Also, it looks like currently every glXCopySubBufferMESA() call
> is a roundtrip as well, which might incur a noticeable hit with compiz.

Yeah, that's a problem, but the glXCopySubBufferMESA() API doesn't let
us do much better - maybe we need glXCopyBufferRegionMESA()?

> About triple buffering, AFAICT this scheme makes that impossible, as
> well as implementing buffer swaps by just flipping the front and back
> buffers, because the clients won't know the mapping from API buffers to
> memory buffers changed.

One thing I had in mind was that DRI2SwapBuffers could return the new
color buffer, or maybe it should just return the full new set of
buffers always.  This lets us do page flips for fullscreen
non-redirected windows and for redirected windows (just swap the
pixmaps).  As it is, it doesn't return anything, but we can add that
in a later version easily.
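In sketch form, with made-up names (dri2_swap_buffers_reply is not a real request): the server implements the swap as a flip of the two handles and reports the new mapping back in the reply, so whether it page-flipped a fullscreen window or swapped pixmaps for a redirected one, the client knows which handles to render to next.

```c
struct dri2_buffer { unsigned attachment; unsigned handle; };

/* Server side: swap the front and back handles and hand the client
 * the new buffer set in the reply. */
static void
dri2_swap_buffers_reply(struct dri2_buffer *front, struct dri2_buffer *back,
                        struct dri2_buffer reply[2])
{
    unsigned tmp = front->handle;
    front->handle = back->handle;
    back->handle = tmp;

    reply[0] = *front;  /* the client must pick these up before */
    reply[1] = *back;   /* rendering its next frame */
}
```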

> Have you considered any other schemes, e.g. some kind of event triggered
> when a buffer swap actually takes effect, and which includes information
> about the new mapping from API buffers to memory buffers? Or is the idea
> to just leave any advanced SwapBuffers schemes to the drivers?

Right, the problem with triple buffering is that once we schedule a
swap, we don't know when the previous swap is finished and we can
start rendering again.  Is it actually different from the regular
double buffer case though?  You still need to block the client, which
we can just do by delaying the reply from DRI2SwapBuffers.  In the
triple buffering case you just have an extra buffer and you're
blocking on the previous buffer swap instead of the current.

Do we need to split DRI2SwapBuffers into two requests?  One async
request that schedules the swap, and one that blocks the client,
waits for the swap to complete, and returns the new buffers?  This
would allow clients to issue a non-blocking swap, then load and
decode the next frame of video, say, and only then block on the swap
completing...
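A sketch of that split, hypothetical names throughout (DRI2ScheduleSwap and DRI2WaitSwap don't exist in the protocol; the server is modeled as a completion counter just so the ordering is visible):

```c
struct swap_state { int scheduled, completed; };

static void DRI2ScheduleSwap(struct swap_state *s)
{
    s->scheduled++;               /* async: the client keeps running */
}

static int DRI2WaitSwap(struct swap_state *s)
{
    while (s->completed < s->scheduled)
        s->completed++;           /* stand-in for blocking on the server;
                                   * the reply would carry the new buffers */
    return s->completed;
}

/* The usage pattern described above: kick off the swap, do useful work
 * (decode the next video frame), and only then block. */
static int present_frame(struct swap_state *s)
{
    DRI2ScheduleSwap(s);
    /* ... load and decode the next frame here ... */
    return DRI2WaitSwap(s);
}
```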

Kristian
--
_______________________________________________
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel
