On 03/07/2013 01:19 PM, Owen Taylor wrote:
On Thu, 2013-02-28 at 16:55 -0800, Keith Packard wrote:

* It would be great if we could figure out a plan to get to the
   point where the exact same application code is going to work for
   proprietary and open source drivers. When you get down to the details
   of swap this isn't close to the case currently.

Agreed -- the problem here is that except for the nVidia closed drivers,
everything else implicitly serializes device access through the kernel,
providing a natural way to establish a defined order of
operations. Failing that, I'd love to know what mechanisms *could* work
with that design.

Fence syncs. Note that the original fence sync + multi-buffer proposal solved basically the same problems you're trying to solve here, as well as everything Owen's WM spec updates do, but more generally, at the cost of a little more implementation complexity. It also included proposals for minor updates to GLX/EGL to tie them into the newer model. There didn't seem to be much interest outside of NVIDIA, so apart from fence sync itself, the ideas are tabled internally at the moment.
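
For reference, the X-side fence usage looks roughly like this (just a
sketch; in the full proposal the trigger would come from the client's
rendering API once the GPU work actually lands, and error handling is
omitted):

    #include <X11/Xlib.h>
    #include <X11/extensions/sync.h>

    /* Order our later X requests after rendering to the window. */
    void ordered_copy(Display *dpy, Window win)
    {
        /* Create a fence on the window's screen, initially untriggered. */
        XSyncFence fence = XSyncCreateFence(dpy, win, False);

        /* ... submit rendering to the window here ... */

        /* Trigger the fence.  In the full scheme the GL/EGL
         * implementation would trigger it once its GPU work completes;
         * triggering from the X side like this only orders against
         * requests the server has already processed. */
        XSyncTriggerFence(dpy, fence);

        /* Ask the server to stall processing of our subsequent
         * requests until the fence has triggered. */
        XSyncAwaitFence(dpy, &fence, 1);

        /* Requests issued after this point - say, a CopyArea that
         * consumes the rendered contents - are ordered after the fence. */

        XSyncDestroyFence(dpy, fence);
    }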

I don't think serialization is actually the big issue - although it's
annoying to deal with fences that are no-ops for the open source
drivers, it's pretty well defined where you have to insert them, and
since they're no-ops there, there's little overhead.

Notification is more of an issue.

   - Because swap is handled client-side in some drivers, INTEL_swap_event
     is seen as awkward to implement.

I'm not sure what could be done here, other than to have some way for
the X server to get information about the swap and stuff it into the
event stream, of course. It could be as simple as having the client
send the event data to the X server itself.
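
For context, this is roughly what a client already does to consume
INTEL_swap_event where it's implemented (a sketch; assumes a GLX 1.3
drawable):

    #include <GL/glx.h>
    #include <GL/glxext.h>

    /* Sketch: receive swap-completion events for a drawable. */
    void watch_swaps(Display *dpy, GLXDrawable drawable)
    {
        int error_base, event_base;

        if (!glXQueryExtension(dpy, &error_base, &event_base))
            return;

        glXSelectEvent(dpy, drawable, GLX_BUFFER_SWAP_COMPLETE_INTEL_MASK);

        for (;;) {
            XEvent ev;
            XNextEvent(dpy, &ev);
            if (ev.type == event_base + GLX_BufferSwapComplete) {
                GLXBufferSwapComplete *swap =
                    (GLXBufferSwapComplete *) &ev;
                /* swap->ust/msc/sbc describe the completed swap;
                 * swap->event_type says whether it was an exchange,
                 * copy, or flip. */
                (void) swap;
            }
        }
    }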

It may be that a focus on redirection makes things easier - once the
compositor is involved, we can't get away from X server involvement. The
compositor itself is the main case where the X server can be completely
bypassed when swapping. And I'm less concerned about API divergence for
the compositor. (Not that I *invite* it...)

   - There is divergence on some basic behaviors, e.g., whether
     glFinish() after glXSwapBuffers() waits for the swap to complete or not.

glXSwapBuffers is pretty darn explicit in saying that it *does not* wait
for the swap to complete, and glFinish only promises to synchronize the
effects of rendering ("contents of the frame buffer"), not the actual
swap operation itself. I'm not sure how we're supposed to respond when
drivers ignore the spec and do their own thing?

I wish the GLX specification were clear enough that we actually knew who
was ignoring the spec and doing their own thing... ;-) The GLX
specification describes the swap operation by saying the contents of the
back buffer "become the contents of the front buffer" ... that seems
like an operation on the "contents of the frame buffer".

The GLX spec is plenty clear here.  It states:

"Subsequent OpenGL commands can be issued immediately, but will not be executed until the buffer swapping has completed..."

And glFinish, besides the fact that it counts as a GL command, isn't defined as simply waiting until effects on the framebuffer land. All rendering, client state, and server state (GL server, not X server) side effects from previous operations must settle before it returns. SwapBuffers affects all three of those. Same for fence syncs with condition GL_SYNC_GPU_COMMANDS_COMPLETE.

So if the swapped drawable is current to the thread calling SwapBuffers, and that thread issues any other GL commands afterwards, including glFinish, glFenceSync, etc., those commands can't complete until after the swap operation does. For glFinish, that means it can't return. For a fence, the fence won't trigger until the swap finishes. If implementations aren't behaving that way, it's a bug in the implementation. Not to say our implementation doesn't have bugs, but AFAIK, we don't have that one.
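
Concretely (a sketch; assumes a current ARB_sync-capable context):

    #define GL_GLEXT_PROTOTYPES
    #include <GL/glx.h>
    #include <GL/glext.h>

    /* Sketch of the ordering described above: GL commands issued after
     * SwapBuffers can't execute until the swap has completed. */
    void swap_and_wait(Display *dpy, GLXDrawable drawable)
    {
        glXSwapBuffers(dpy, drawable);

        /* By the reading above, glFinish can't return until the swap
         * itself has completed... */
        glFinish();

        /* ...and equivalently, a fence issued after the swap can't
         * signal until the swap completes. */
        GLsync fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
        glClientWaitSync(fence, GL_SYNC_FLUSH_COMMANDS_BIT,
                         GL_TIMEOUT_IGNORED);
        glDeleteSync(fence);
    }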

Thanks,
-James

But getting into the details here is a bit of a distraction - my goal is
to try to get us to convergence so we have only one API with well
defined behaviors.

   - When rendering with a compositor, the X server is innocent of
     relevant information about timing and when the application should
     draw additional new frames. I've been working on handling this
     via client <=> compositor protocols

With 'Swap', I think the X server should be involved as it is necessary
to be able to 'idle' buffers which aren't in use after the
compositor is done with them. I tried to outline a sketch of how that
would work before.

(https://mail.gnome.org/archives/wm-spec-list/2013-January/msg00000.html)

     But this adds a lot of complexity to the minimal client, especially
     when a client wants to work both redirected and unredirected.

Right, which is why I think fixing the X server to help here would be better.

If the goal is really to obsolete the proposed WM spec changes, rather
than just make existing GLX apps work better, then there's quite a bit
of stuff to get right. For example, from my perspective, the
UST timestamps defined by OML_sync_control are completely insufficient -
it's not even defined what the units are for these timestamps!
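
To illustrate, this is all an application can portably do today (a
sketch; the entry point is fetched at runtime since it's an extension
function):

    #include <stdint.h>
    #include <stdio.h>
    #include <GL/glx.h>
    #include <GL/glxext.h>

    /* Sketch: query the OML_sync_control counters.  UST is specified
     * only as a monotonic 64-bit counter - the units are left
     * implementation-defined. */
    void query_ust(Display *dpy, GLXDrawable drawable)
    {
        int64_t ust, msc, sbc;
        PFNGLXGETSYNCVALUESOMLPROC pGetSyncValuesOML =
            (PFNGLXGETSYNCVALUESOMLPROC) glXGetProcAddress(
                (const GLubyte *) "glXGetSyncValuesOML");

        if (pGetSyncValuesOML &&
            pGetSyncValuesOML(dpy, drawable, &ust, &msc, &sbc))
            printf("ust=%lld (units unspecified) msc=%lld sbc=%lld\n",
                   (long long) ust, (long long) msc, (long long) sbc);
    }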

   I think it would be great if we could sit down and figure out what
   the Linux-ecosystem API is for this in a way we could give to
   application authors.

Ideally, a GL application using simple GLX or EGL APIs would work
'perfectly', without the need to use additional X-specific APIs. My hope
with splitting DRI3000 into separate DRI3 and Swap extensions is to
provide those same semantics to simple double-buffered 2D applications
using core X and Render drawing as well, without requiring that they be
rewritten to use GL, and while providing all of the same functionality
over the network as local direct rendering applications get today.

The GLX APIs have some significant holes and poorly defined aspects. And
they don't properly take compositing into account, which is the norm
today. So providing those capabilities to 2D apps seems of limited
utility.

[...]

   The SwapComplete event is specified as - "This event is delivered
   when a SwapRegion operation completes" - but the specification
   of SwapRegion itself is fuzzy enough that I'm unclear exactly what
   that means.

   - The description SwapRegion needs to define "swap" since the
     operation has only a vague resemblance to the English-language
     meaning of "swap".

Right, SwapRegion can either be a copy operation or an actual swap. The
returned information about idle buffers tells the client what they
contain, so I think the only confusion here is over the name of the request?

The confusion to me is that we all have some idea of what a "swap" is,
and what "complete" means, but when we try to nail things down, the
details are not so clear. I'd rather we were precise about the meaning
than try to leave wriggle room for future stuff.

  * What do you get if you CopyArea/glReadPixels/draw with TFP from
    various targets.

  * What is scanned out from the front buffer to the output device?

Swap should be defined in terms of these basic concepts.
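
For instance, these two read-backs of "the same" window contents (a
sketch):

    #include <X11/Xlib.h>
    #include <X11/Xutil.h>
    #include <GL/glx.h>

    /* Sketch: the kind of probes whose results Swap needs to define.
     * Immediately after a swap, what do these see? */
    void probe_window(Display *dpy, Window win, int w, int h,
                      unsigned char *pixels)
    {
        /* A GL read-back from the front buffer... */
        glReadBuffer(GL_FRONT);
        glReadPixels(0, 0, w, h, GL_RGBA, GL_UNSIGNED_BYTE, pixels);

        /* ...versus a core X read of the window's contents. */
        XImage *img = XGetImage(dpy, win, 0, 0, w, h, AllPlanes, ZPixmap);
        if (img)
            XDestroyImage(img);
    }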

[...]

   - What happens when multiple SwapRegion requests are made with a
     swap-interval of zero. Are previous ones discarded?

Any time a SwapRegion request is made with one still pending, the server
may choose to skip the first contents and swap directly to the second
contents. I'm not sure how this would be visible to the application
though?

The application can tell by looking at events. My question was also
inspired by thinking about the question of what would happen in the
redirected case if the client and the compositor had side-band protocols
to control the rate of presentation.

[...]

   - What's the interaction between swap-interval and target-msc, etc?

I'm afraid I just copied these from the DRI2 spec without really
understanding the precise semantics. They originally came from the
related GL specs.

DRI2 combined together concepts from several overlapping specs. As long
as it was an implementation detail, the question of what happens when
things overlap wasn't a big issue. If we make Swap app-exposed, then a
hand-wave isn't sufficient.
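
For reference, the rule being inherited: OML_sync_control says the swap
occurs at the first field where msc >= target_msc, or, if target_msc is
0 and divisor is non-zero, where msc % divisor == remainder. How that
composes with swap-interval is exactly the open question. E.g. (a
sketch; entry point fetched at runtime):

    #include <stdint.h>
    #include <GL/glx.h>
    #include <GL/glxext.h>

    /* Sketch: schedule a swap for even frames only using the
     * OML_sync_control rule described above. */
    void schedule_swap(Display *dpy, GLXDrawable drawable)
    {
        PFNGLXSWAPBUFFERSMSCOMLPROC pSwapBuffersMscOML =
            (PFNGLXSWAPBUFFERSMSCOMLPROC) glXGetProcAddress(
                (const GLubyte *) "glXSwapBuffersMscOML");

        if (pSwapBuffersMscOML)
            pSwapBuffersMscOML(dpy, drawable,
                               0,  /* target_msc: none */
                               2,  /* divisor */
                               0); /* remainder: swap when msc is even */
    }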

   - When a window is redirected, what's the interpretation of
     swap-interval, target-msc, etc? Is it that the server performs the
     operation at the selected blanking interval (as if the window
     weren't redirected), and then damage/other events are generated
     and the compositor picks them up and renders to the real front
     buffer at the next opportunity - usually a frame later.

This sends us down a very deep hole, and one which I intend to resolve
at some point, but for now, I'd love to focus on getting the semantics
for non-redirected windows looking sane, and then try to figure out how
to replicate those semantics in a redirected world.

We have a set of semantics for GLX that we need to keep working, since
there are going to be piles of old GLX applications. But once we move
beyond that, then redirection is the normal case.

[...]

* In the definition of SWAPIDLE you say:

     If valid is TRUE, swap-hi/swap-lo form a 64-bit
     swap count value from the SwapRegion request which matches the
     data that the pixmap currently contains

  If I'm not misunderstanding things, this is a confusing statement
  because, leaving aside damage to the front buffer, pixmaps always
  contain the same contents (whatever the client rendered into it.)

No, the Swap operation may actually *replace* the pixmap contents with
the other buffer contents. That allows for efficient pointer swapping
instead of actual data copying. This number lets the client know what
the pixmap holds as a result of this operation, which may simply be the
previous pixmap contents or may be the contents from one or more frames
previous.
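
In code terms the client would do something like this (hypothetical
structure and field names - swap_hi/swap_lo mirror the swap-hi/swap-lo
in the draft, but the final encoding may differ):

    #include <stdint.h>

    /* Hypothetical SWAPIDLE event layout; the real encoding may differ. */
    struct swap_idle_event {
        int valid;
        uint32_t swap_hi;
        uint32_t swap_lo;
    };

    /* Reconstruct the 64-bit swap count.  Comparing it against the
     * client's own count of issued SwapRegion requests tells the
     * client which frame's contents the now-idle pixmap holds. */
    static uint64_t idle_swap_count(const struct swap_idle_event *idle)
    {
        return ((uint64_t) idle->swap_hi << 32) | idle->swap_lo;
    }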

So the association of pixmap ID to buffer can change as the result of a
swap operation? What's the motivation for this? - it seems to me that
once we start labeling buffers with pixmap IDs, it would be simpler to
keep the association - it wouldn't hinder the server from implementing a
swap as either a copy or an exchange.

* What control, if any, will applications have over the number of
   buffers used - and what the behavior will be when an application starts
   rendering another frame in terms of allocating a new buffer versus
   swapping?

The application is entirely in charge of allocating buffers; the server
never allocates anything. As such, the application may well choose to
pause until buffers go idle before continuing to render so as to limit
buffer use to a sane amount.
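
i.e., on the client side, something like this (entirely hypothetical
helpers - find_idle(), wait_for_idle_event(), draw_frame() and
swap_region() stand in for the eventual Swap machinery):

    #include <X11/Xlib.h>

    #define NUM_BUFFERS 3

    struct buffer { Pixmap pixmap; int idle; };

    /* Hypothetical helpers; not real API. */
    extern struct buffer *find_idle(struct buffer bufs[NUM_BUFFERS]);
    extern void wait_for_idle_event(struct buffer bufs[NUM_BUFFERS]);
    extern void draw_frame(struct buffer *b);
    extern void swap_region(struct buffer *b);

    void render_loop(struct buffer bufs[NUM_BUFFERS])
    {
        for (;;) {
            struct buffer *b;

            /* Cap buffer use: block until one of our buffers goes
             * idle rather than allocating a fourth. */
            while (!(b = find_idle(bufs)))
                wait_for_idle_event(bufs);

            b->idle = 0;
            draw_frame(b);   /* render the next frame into b->pixmap */
            swap_region(b);  /* present it */
        }
    }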

The question was really about whether we envisage exposing such a
control to apps rendering via GLX.

Switching to events for Idle notification should make this a lot more
logical and perhaps easier to understand, although the client implementation
will be a pain.

* Do we need to deal with stereo as part of this?

Probably? But I'm not sure how?

One thing I'll point out here is that texture_from_pixmap already
has some support for stereo written into it. As it is written, the pixmap
ID represents *all* the buffers for the window - left, right, and aux.

This poses some issues for viewing the pixmap as a "normal pixmap" in X
terms, but otherwise simplifies stereo to a question of buffer format.
An issue for the DRI3 extension rather than the Swap extension.

- Owen


_______________________________________________
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel
