Swapbuffers [was: Re: DRI2 and lock-less operation]

Keith Whitwell Wed, 28 Nov 2007 02:14:56 -0800

Kristian Høgsberg wrote:
> On Nov 27, 2007 11:48 AM, Stephane Marchesin
> <[EMAIL PROTECTED]> wrote:
>> On 11/22/07, Kristian Høgsberg <[EMAIL PROTECTED]> wrote:
> ...
>>> It's all delightfully simple, but I'm starting to reconsider whether
>>> the "lockless" bullet point is realistic.   Note, the drawable lock is
>>> gone, we always render to private back buffers and do swap buffers in
>>> the kernel, so I'm "only" concerned with the DRI lock here.  The idea
>>> is that since we have the memory manager and the super-ioctl and the X
>>> server now can push cliprects into the kernel in one atomic operation,
>>> we would be able to get rid of the DRI lock.  My overall question,
>>> here is, is that feasible?
>> How do you plan to ensure that X didn't change the cliprects after you
>> emitted them to the DRM ?
> 
> The idea was that the buffer swap happens in the kernel, triggered by
> an ioctl. The kernel generates the command stream to execute the swap
> against the current set of cliprects.  The back buffers are always
> private so the cliprects only come into play when copying from the
> back buffer to the front buffer.  Single buffered visuals are secretly
> double buffered and implemented the same way.
> 
> I'm trying to figure now whether it makes more sense to keep cliprects
> and swapbuffer out of the kernel, which wouldn't change the above
> much, except the swapbuffer case.  I described the idea for swapbuffer
> in this case in my reply to Thomas: the X server publishes cliprects
> to the clients through a shared ring buffer, and clients parse the
> clip rect changes out of this buffer as they need it.  When posting a
> swap buffer request, the buffer head should be included in the
> super-ioctl so that the kernel can reject stale requests.  When that
> happens, the client must parse the new cliprect info and resubmit an
> updated swap buffer request.
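
For reference, the stale-request check described in the quoted text can
be pictured roughly as below. This is only a toy model in Python, not
DRM code: `CliprectLog`, `kernel_accepts_swap` and `client_swap` are
made-up names, and the "buffer head" is assumed to be a simple counter
that the X server bumps on every cliprect change.

```python
class CliprectLog:
    """Hypothetical stand-in for the shared cliprect ring buffer."""

    def __init__(self):
        self.head = 0        # assumed: incremented on every publish
        self.entries = []    # published cliprect lists, oldest first

    def publish(self, rects):
        # X server side: record a new set of cliprects.
        self.entries.append(list(rects))
        self.head += 1


def kernel_accepts_swap(log, client_head):
    # Kernel side: a request built against an old head is stale.
    return client_head == log.head


def client_swap(log, client_head):
    # Client side: on rejection, catch up to the current head and resubmit.
    attempts = 1
    while not kernel_accepts_swap(log, client_head):
        client_head = log.head   # "parse the new cliprect info"
        attempts += 1
    return client_head, attempts
```

The point of the sketch is the retry loop: every cliprect change that
lands between the client's read and the ioctl costs one more round trip.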


In my ideal world, the entity which knows and cares about cliprects 
should be the one that does the swapbuffers, or at least is in control 
of the process.  That entity is the X server.

Instead of tying ourselves into knots trying to figure out how to get 
some other entity a sufficiently up-to-date set of cliprects to make 
this work (which is what was wrong with DRI 1.0), maybe we should try 
to figure out how to get the X server to efficiently orchestrate 
swapbuffers.

In particular it seems like we have:

1) The X server knows about cliprects.
2) The kernel knows about IRQ reception.
3) The kernel knows how to submit rendering commands to hardware.
4) Userspace is where we want to craft rendering commands.

Given the above, what do we think about swapbuffers:

        a) Swapbuffers is a rendering command
        b) which depends on cliprect information
        c) that needs to be fired as soon as possible after an IRQ receipt.

So:
        swapbuffers should be crafted from userspace (a, 4)
        ... by the X server (b, 1)
        ... and should be actually fired by the kernel (c, 2, 3)


I propose something like:

0) 3D client submits rendering to the kernel and receives back a fence.

1) 3D client wants to do swapbuffers.  It sends a message to the X 
server asking it "please do me a swapbuffers after this fence has 
completed".

2) X server crafts (somehow) commands implementing swapbuffers for this 
drawable under the current set of cliprects and passes it to the kernel 
along with the fence.

3) The kernel keeps that batchbuffer to the side until
        a) the commands associated with the fence have been submitted to 
           hardware, and
        b) the next vblank IRQ arrives.

When both of these are true, the kernel simply submits the prepared 
swapbuffer commands through the lowest-latency path to hardware.
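
A rough sketch of step 3, as a toy model in Python rather than kernel
code. Everything here is an assumption made for illustration:
`ToyKernel` and `PendingSwap` are invented names, fences are modelled
as increasing integers, and "submitted to hardware" just means
"appended to a list standing in for the ring".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PendingSwap:
    fence: int            # fence the client received in step 0
    commands: List[str]   # swapbuffer batch crafted by the X server


@dataclass
class ToyKernel:
    submitted_fence: int = 0                       # highest fence submitted
    ring: List[str] = field(default_factory=list)  # stand-in for hardware
    pending: List[PendingSwap] = field(default_factory=list)

    def queue_swap(self, swap):
        # Step 2: the X server hands over a prepared batch plus a fence.
        self.pending.append(swap)

    def submit_rendering(self, fence):
        # Client rendering reaches the hardware; condition (a) may now hold.
        self.submitted_fence = max(self.submitted_fence, fence)
        self.ring.append(f"render(fence={fence})")

    def on_vblank(self):
        # Condition (b): fire every held swap whose fence has been submitted.
        ready = [s for s in self.pending if s.fence <= self.submitted_fence]
        self.pending = [s for s in self.pending if s.fence > self.submitted_fence]
        for swap in ready:
            self.ring.extend(swap.commands)
        return len(ready)
```

A swap queued before its rendering has been submitted simply sits out
the vblank and is picked up on a later one.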

But what happens if the cliprects change?  The 100% perfect solution 
looks like:

The X server knows all about cliprect changes, and can use fences or 
other mechanisms to keep track of which swapbuffers are outstanding.  At 
the time of a cliprect change, it must create new swapbuffer command sets 
for all pending swapbuffers and re-submit those to the kernel.

These new sets of commands must be tied to the progress of the X 
server's own rendering command stream so that the kernel fires the 
appropriate one to land the swapbuffers to the correct destination as 
the X server's own rendering flies by.
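
On the X server side, the bookkeeping might look something like the
following. Again this is a toy model in Python with invented names
(`ToyXServer`, `craft_swap`), it assumes at most one outstanding swap
per drawable, and it does not model the tie-in to the X server's own
rendering stream at all; "sending to the kernel" is just appending to
a log.

```python
from typing import Dict, List, Tuple

Rect = Tuple[int, int, int, int]  # x1, y1, x2, y2


def craft_swap(drawable: int, cliprects: List[Rect]) -> List[str]:
    # Hypothetical: one blit per cliprect, back buffer to front buffer.
    return [f"blit drawable={drawable} rect={r}" for r in cliprects]


class ToyXServer:
    def __init__(self):
        self.cliprects: Dict[int, List[Rect]] = {}
        # drawable -> (fence, commands) for swaps not yet completed
        self.outstanding: Dict[int, Tuple[int, List[str]]] = {}
        self.sent_to_kernel = []  # log of (drawable, fence, commands)

    def request_swap(self, drawable, fence):
        # Steps 1 and 2: craft the swap under the current cliprects.
        cmds = craft_swap(drawable, self.cliprects.get(drawable, []))
        self.outstanding[drawable] = (fence, cmds)
        self.sent_to_kernel.append((drawable, fence, cmds))

    def set_cliprects(self, drawable, rects):
        # Cliprect change: re-craft and re-submit any pending swap.
        self.cliprects[drawable] = list(rects)
        if drawable in self.outstanding:
            fence, _ = self.outstanding[drawable]
            self.request_swap(drawable, fence)

    def swap_completed(self, drawable):
        # Called once the kernel reports the swap has fired.
        self.outstanding.pop(drawable, None)
```

Nothing is re-sent for drawables with no pending swap, so the extra
work is proportional to the number of swaps actually in flight.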

In the simplest case, where the kernel puts commands onto the one true 
ring as it receives them, the kernel can simply discard the old 
swapbuffer command once a replacement arrives.  This is also true if the 
kernel has a ring-per-context and uses one of those rings to serialize 
the X server rendering and swapbuffers commands.
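
The "discard the old one" behaviour amounts to a replace-on-submit
queue. A minimal sketch in Python, assuming pending swaps are keyed by
drawable (the key and the name `ToySwapQueue` are my own, purely for
illustration):

```python
from typing import Dict, List, Tuple


class ToySwapQueue:
    """Holds at most one pending swapbuffer command set per drawable."""

    def __init__(self):
        self.pending: Dict[int, Tuple[int, List[str]]] = {}

    def submit(self, drawable, fence, commands):
        # A newer command set silently replaces the older one.
        replaced = drawable in self.pending
        self.pending[drawable] = (fence, list(commands))
        return replaced

    def fire(self, drawable):
        # Fence submitted and vblank arrived: hand back what to execute.
        fence, commands = self.pending.pop(drawable)
        return commands
```

Whatever is in the slot when the swap fires is by construction the most
recent set the X server crafted.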

Note that condition 3a) above is always true in the current i915.o 
one-true-ring/single-fifo approach to hardware serialization.

I think the above can work and seems more straightforward than many of 
the proposed alternatives.

Keith



_______________________________________________
Dri-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dri-devel
