Re: [PATCH] DRI2: Expose API to set drawable swap limit.

Mario Kleiner Mon, 25 Oct 2010 17:54:13 -0700

On Oct 25, 2010, at 6:59 PM, Jesse Barnes wrote:

On Mon, 25 Oct 2010 17:13:55 +0300
Pauli Nieminen <ext-pauli.niemi...@nokia.com> wrote:

This allows ddx to set swap_limit if there is more than one back
buffer for drawable. Setting swap_limit has to also check if change
affects a client that is blocked.

This can be used to implement N-buffering in driver with minimal
logic in allocation and selecting next back.

Signed-off-by: Pauli Nieminen <ext-pauli.niemi...@nokia.com>
Tested-by: Francisco Jerez <curroje...@riseup.net>
CC: Kristian HÃ¸gsberg <k...@bitplanet.net>
CC: Jesse Barnes <jbar...@virtuousgeek.org>
---
 hw/xfree86/dri2/dri2.c |   23 +++++++++++++++++++++++
 hw/xfree86/dri2/dri2.h |    1 +
 2 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
index 34f735f..c96eb35 100644
--- a/hw/xfree86/dri2/dri2.c
+++ b/hw/xfree86/dri2/dri2.c
@@ -190,6 +190,29 @@ DRI2AllocateDrawable(DrawablePtr pDraw)
     return pPriv;
 }

+Bool
+DRI2SwapLimit(DrawablePtr pDraw, int swap_limit)
+{
+    DRI2DrawablePtr pPriv = DRI2GetDrawable(pDraw);
+    if (!pPriv)
+       return FALSE;
+
+    pPriv->swap_limit = swap_limit;
+
+    /* Check throttling */
+    if (pPriv->swapsPending >= pPriv->swap_limit)
+       return TRUE;
+
+    if (pPriv->target_sbc == -1 && !pPriv->blockedOnMsc) {
+       if (pPriv->blockedClient) {
+           AttendClient(pPriv->blockedClient);
+           pPriv->blockedClient = NULL;
+       }
+    }
+
+    return TRUE;
+}
+
 typedef struct DRI2DrawableRefRec {
     XID                  id;
     XID                  dri2_id;
diff --git a/hw/xfree86/dri2/dri2.h b/hw/xfree86/dri2/dri2.h
index fe0bf6c..0e0bea4 100644
--- a/hw/xfree86/dri2/dri2.h
+++ b/hw/xfree86/dri2/dri2.h

@@ -251,6 +251,7 @@ extern _X_EXPORT DRI2BufferPtr*DRI2GetBuffersWithFormat(DrawablePtr pDraw,

        int *out_count);

extern _X_EXPORT void DRI2SwapInterval(DrawablePtr pDrawable, intinterval);+extern _X_EXPORT Bool DRI2SwapLimit(DrawablePtr pDraw, intswap_limit);extern _X_EXPORT int DRI2SwapBuffers(ClientPtr client,DrawablePtr pDrawable,

                                     CARD64 target_msc, CARD64 divisor,
                                     CARD64 remainder, CARD64 *swap_target,


I think this is ok, but I'd like Mario to ack it as well; I think we
can still get into trouble if multiple clients are sharing a drawable
and doing waits on it.
--
Jesse Barnes, Intel Open Source Technology Center

Ja, i think the patch is ok for doing what it does. But i think itisn't sufficient to implement n-buffering reliably. We'd need moreclever scheduling of swaps and vblank events if multiple swaps arepending.

The problem is wrong presentation timing for copyswaps if the gpu isloaded and really corrupt/wrong presentation in the pageflip case.

Assume a client A submits 10 render + swapbuffers requests in a row,for presentation at ten consecutive refresh intervals (and theswaplimit is >=10). The driver would happily queue 10 vblank eventsin the kernel for consecutive vblank counts.

Meanwhile client B submits a huge rendering request into the commandstream, worth multiple video refresh durations of rendering time.


1. BBBBBBBBB -> gpu

The 1st vblank event fires, gets processed by the server and acopyswap commandbuffer A gets submitted into the cs and aintel_swap_event or OML_sync_control timestamp delivered to theclient about the supposed time of swap completion.


2. ABBBBBBBB -> gpu

After 1st swap request is stuck behind client B's command buffers forat least 1 video refresh cycle, the next vblank event fires asscheduled and the next copyswap buffer gets queued, more wrongtimestamps get delivered:


3. AABBBBBBB -> gpu

In the end they all get executed at some point, but their true swaptiming has nothing to do with what was specified in theglXSwapBuffersMscOML() call and the (divisor, remainder) mechanismfor dealing with such delayed swaps fails to anything usefulcompletely. Also all delivered timestamps and msc counts are wrong -they indicate perfect timing which has nothing to do with reality.

Pageflipped swaps are worse. Step 1 and 2 would be the same forpageflip, but in step 2 a pageflip would be queued inside thekernel's pageflip ioctl(). One refresh cycle later, when the nextvblank fires, the ioctl() gets called again to schedule the nextpageflip while the 1st one hasn't completed yet. Result is that theioctl() returns EBUSY and the swap is discarded. Same for some of thefollowing swaps if the gpu is a bit busy. So with pageflipping, notonly the timting is wrong, but many of the rendered frames aren'tpresented at all and stuff gets really out of sequence.

These funny things can't happen as long as we only allow 1 pendingswaps (swaplimit == 1).

My idea was to solve it as follows for n-buffering. If somebody feelslike implementing that or something better now, instead of me in acouple of weeks or months, go for it:

1. We maintain a little fifo queue / list of pending swap requestsfor each drawable.2. The DRI2ScheduleSwap() function just converts the incomingswapbuffers request into a little struct with all the parameters(target_msc, divisor, remainder, ...) and inserts it at the tail ofthat queue, after checking that swaps_pending < current swaplimit.If (++swaps_pending) == 1, it kicks off a littleDRI2ReallyScheduleOneSwap() function which dequeues this firstrequest from the queue and calls into the ddx->ScheduleSwap()routine. At the end of the DRI2SwapComplete() function, when a swapreally got completed, that DRI2ReallyScheduleOneSwap() function getscalled and checks if more swap requests are pending in the queue anddequeues and processes the oldest request. If the queue is empty, itis a no op and the whole thing goes idle again. MaybeDRI2ReallyScheduleOneSwap() would also take care of msc jumps due todrawables being moved between crtc's etc.

This can't perfectly solve delays in one swapbuffer request due totraffic jams in the command stream, but it could prevent that onemisscheduled swap will trigger the followup ones to go ugly in randomways.

The problems with multiple clients blocking on glXWaitForMscOML orglXWaitForSbcOML() is a separate issue which would needimplementation of additional wait queues.

And then, as we discussed on xds, there's the need for propertimestamping of copyswaps in the kernel, otherwise this proposalwould only fix the pageflipped case, not the windowed case.


thanks,
-mario

*********************************************************************
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:    +49 (0)7071/601-616
www:    http://www.kyb.tuebingen.mpg.de/~kleinerm
*********************************************************************
"For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled."
(Richard Feynman)

_______________________________________________
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel

Re: [PATCH] DRI2: Expose API to set drawable swap limit.

Reply via email to