Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem
> GL and GLES are not relevant. What is relevant is EGL, which defines
> interfaces to make things work on the native platform.

Yes and no. This is what the EGL spec says about sharing a texture
between contexts:

"OpenGL and OpenGL ES makes no attempt to synchronize access to texture
objects. If a texture object is bound to more than one context, then it
is up to the programmer to ensure that the contents of the object are
not being changed via one context while another context is using the
texture object for rendering. The results of changing a texture object
while another context is using it are undefined."

There are similar statements with regard to the lack of synchronisation
guarantees for EGL images, between GL and native rendering, etc.

But the main thing here is that EGL and Vulkan differ significantly.
eglSwapBuffers() is expected to post an unspecified "back buffer" to the
display system using some internal driver magic. The EGL driver is then
expected to obtain another back buffer at some unspecified point in the
future. Vulkan, on the other hand, is very specific and explicit.
vkQueuePresentKHR() is expected to post a specific VkImage with an
explicit set of semaphores. Another image is obtained through
vkAcquireNextImageKHR(), and it's the application's decision whether it
wants a fence, a semaphore, both or none with the acquired buffer.
Implicit synchronisation doesn't mix well with Vulkan drivers and
requires a lot of extra plumbing in the WSI code.

> If you are using EGL_WL_bind_wayland_display, then one of the things
> it is explicitly allowed/expected to do is to create a Wayland
> protocol interface between client and compositor, which can be used to
> pass buffer handles and metadata in a platform-specific way. Adding
> synchronisation is also possible.

Only one-way synchronisation is possible with this mechanism. There's a
standard protocol for recycling buffers, the wl_buffer.release event, so
buffer hand-over from the compositor to the client remains
unsynchronised (see below).

> > The most troublesome part was Wayland buffer release mechanism, as it only
> > involves a CPU signalling over Wayland IPC, without any 3D driver
> > involvement. The choices were: explicit synchronisation extension or a
> > buffer copy in the compositor (i.e. compositor textures from the copy, so
> > the client can re-write the original), or some implicit synchronisation in
> > kernel space (but that wasn't an option in Broadcom driver).
>
> You can add your own explicit synchronisation extension.

I could, but that requires implementing it in the driver and in a number
of compositors, so the standard zwp_linux_explicit_synchronization_v1
extension is a much better choice here than a custom one.

> In every cross-process and cross-subsystem usecase, synchronisation is
> obviously required. The two options for this are to implement kernel
> support for implicit synchronisation (as everyone else has done),

That would require major changes in the driver architecture, or a second
mechanism doing the same thing but in kernel space; both are
non-starters.

> or implement generic support for explicit synchronisation (as we have
> been working on with implementations inside Weston and Exosphere at
> least),

zwp_linux_explicit_synchronization_v1 is a good step forward. I'm using
this extension as the main synchronisation mechanism in the EGL and
Vulkan drivers whenever it is available. I remember that Gustavo Padovan
was working on explicit sync support in the display system some time
ago. I hope it has been merged into the kernel by now, but I don't know
to what extent it's actually being used.

> or implement private support for explicit synchronisation,

If everything else fails, that would be the last resort, but it is far
from ideal and very costly in terms of implementation and maintenance,
as it would require maintaining custom patches for various third-party
components or littering them with multiple custom explicit
synchronisation schemes.

> or do nothing and then be surprised at the lack of synchronisation.

Thank you, but no, thank you :)

Cheers,
Tomek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem
> vkAcquireNextImageKHR() [...] it's the application's decision whether it
> wants a fence, a semaphore, both or none

Correction: "or none" is not allowed. At least one of the fence and the
semaphore must be provided.
Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem
> That's not true; you can post back a sync token every time the client
> buffer is used by the compositor.

Technically yes, but it's very cumbersome and invasive, to the point
where it becomes impractical. Explicit sync is a much cleaner solution.

> For instance, Mesa adds the `wl_drm` extension, which is
> used for bidirectional communication between the EGL implementations
> in the client and compositor address spaces, without modifying either.

The Broadcom driver adds a "wl_nexus" extension, which serves a similar
purpose for both EGL and Vulkan WSI.

> OK. As it stands, everyone else has the kernel mechanism (e.g. via
> dmabuf resv), so in this case if you are reinventing the underlying
> platform in a proprietary stack, you get to solve the same problems
> yourselves.

That's an important point. In the explicit synchronisation scenario the
sync token is passed with the buffer. It becomes irrelevant where the
token originated, as long as it's a commonly used type of token, i.e.
dma_fence in kernel space or sync_fd in user space. That allows for
greater flexibility and works with and without dma reservation objects.

Cheers,
Tomek
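To illustrate why the origin of the token does not matter to the
consumer: in user space a sync_fd is an ordinary file descriptor that
can be waited on with poll(), which reports it readable once the fence
has signalled. The snippet below is a minimal Python sketch of that
consumer side, not code from any driver or compositor. The helper name
wait_sync_fd is hypothetical, and in the usage lines a pipe stands in
for a real sync_fd so that the snippet runs without graphics hardware.

```python
import os
import select


def wait_sync_fd(fd, timeout_ms):
    """Return True if fd reports POLLIN (fence signalled) within timeout_ms.

    The caller does not need to know which driver created the fence;
    it only needs a pollable file descriptor.
    """
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    events = poller.poll(timeout_ms)
    return any(revents & select.POLLIN for _fd, revents in events)


# Stand-in for a sync_fd (assumption for illustration): a pipe whose read
# end becomes readable once the "producer" writes to it.
read_end, write_end = os.pipe()
print("signalled before write:", wait_sync_fd(read_end, 0))
os.write(write_end, b"x")  # plays the role of the fence signalling
print("signalled after write:", wait_sync_fd(read_end, 0))
```

With a real sync_fd the same wait applies whether the fence came from a
GPU driver, a display driver or a camera, which is the flexibility
described above.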
Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem
> As long as we can fall back to not using fences then we should be fine.

Buffers written by the camera are trivial because you control what
happens: just don't attach a fence, so that the capture can be used
immediately. Recycled buffers need an extra bit of work, because it
won't be up to the camera driver to decide whether a buffer comes back
with or without a fence.
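The recycled-buffer case can be sketched as follows. This is a toy
Python model under stated assumptions, not V4L2 or driver code: the
Buffer class and wait_until_reusable() are hypothetical names, and a
threading.Event stands in for a release fence. The point is only that
the consumer of a recycled buffer has to handle both "no fence" and
"fence attached".

```python
import threading


class Buffer:
    """Toy buffer that may or may not carry a release fence."""

    def __init__(self, name, release_fence=None):
        self.name = name
        self.release_fence = release_fence  # None means "no fence attached"


def wait_until_reusable(buf, timeout=None):
    """Return True once the buffer may be written again.

    Without a fence the buffer is reusable immediately; with a fence we
    wait for it, and return False if it has not signalled in time.
    """
    if buf.release_fence is None:
        return True
    return buf.release_fence.wait(timeout)


fresh = Buffer("capture-0")                # handed out without a fence
fence = threading.Event()                  # stand-in for a release fence
recycled = Buffer("capture-1", release_fence=fence)

print(wait_until_reusable(fresh))                # no fence, usable now
print(wait_until_reusable(recycled, timeout=0))  # fence not signalled yet
fence.set()                                      # previous user is done
print(wait_until_reusable(recycled, timeout=0))
```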
Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem
Hi Jason,

I was wrestling with the sync problems in Wayland some time ago, but
only with regard to 3D drivers.

The guarantee given by the GL/GLES spec is limited to a single graphics
context. If the same buffer is accessed by two contexts, the outcome is
unspecified. Cross-context and cross-process synchronisation is not
guaranteed. It happens to work on Mesa because the read/write locking is
implemented in kernel space, but it didn't work on the Broadcom driver,
which has read/write interlocks in user space.

A Vulkan client makes it even worse because of conflicting requirements:
Vulkan's vkQueuePresentKHR() passes in a number of semaphores but
disallows waiting. The Wayland WSI requires wl_surface_commit() to be
called from vkQueuePresentKHR(), which does require a wait, unless a
synchronisation primitive representing the Vulkan semaphores is passed
between the Vulkan client and the compositor.

The most troublesome part was the Wayland buffer release mechanism, as
it only involves CPU signalling over Wayland IPC, without any 3D driver
involvement. The choices were: an explicit synchronisation extension, a
buffer copy in the compositor (i.e. the compositor textures from the
copy, so the client can re-write the original), or some implicit
synchronisation in kernel space (but that wasn't an option in the
Broadcom driver).

With regard to V4L2, I believe it could easily work the same way as 3D
drivers, i.e. pass a buffer+fence pair to the next stage. Encode always
succeeds, but for capture or decode the main problem is the uncertain
outcome, I believe? If we're fine with rendering or displaying an
occasional broken frame, then a buffer+fence pair would work too. The
broken frame will go into the pipeline, but the application can drain
the pipeline and start over once the capture works again.
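As a rough illustration of the buffer+fence idea for capture, here is a
toy Python model. It is a sketch under stated assumptions and not V4L2
code: Fence, start_capture(), good_frame() and bad_frame() are all
hypothetical names, and a background thread stands in for the capture
hardware. The buffer is handed to the next stage immediately, and the
fence is signalled when the capture finishes, whether it succeeded or
not.

```python
import threading


class Fence:
    """Toy stand-in for a fence: signalled exactly once, with a status."""

    def __init__(self):
        self._done = threading.Event()
        self.ok = None  # None until signalled, then True or False

    def signal(self, ok):
        self.ok = ok
        self._done.set()

    def wait(self, timeout=None):
        """Block until signalled; return True if the work succeeded."""
        if not self._done.wait(timeout):
            raise TimeoutError("fence not signalled in time")
        return self.ok


def start_capture(buffer, fill):
    """Return (buffer, fence) at once; fill the buffer in the background."""
    fence = Fence()

    def worker():
        try:
            fill(buffer)
            fence.signal(True)
        except Exception:
            fence.signal(False)  # signal on failure too, so nobody hangs

    threading.Thread(target=worker, daemon=True).start()
    return buffer, fence


def good_frame(buf):
    buf[:] = b"\x01" * len(buf)  # pretend the sensor wrote a frame


def bad_frame(buf):
    raise IOError("capture failed")  # pretend the capture went wrong


buf, fence = start_capture(bytearray(4), good_frame)
if fence.wait(timeout=1.0):
    print("frame ready:", bytes(buf))
else:
    print("broken frame: drain the pipeline and start over")
```

The next stage only ever waits on the fence; whether to display a broken
frame or drain and restart is a policy decision left to the application.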
To answer some points raised by Laurent (although I'm unfamiliar with
the camera drivers):

> you don't know until capture complete in which buffer the frame has
> been captured

Surely you do; you only don't know in advance whether the capture will
be successful.

> but if an error occurs during capture, they can be recycled internally
> and put to the back of the queue.

That would have to change in order to use explicit synchronisation.
Every started capture becomes immediately available as a buffer+fence
pair. The fence is signalled once the capture is finished (successfully
or otherwise). The buffer must not be reused until it's released,
possibly with another fence; in that case the buffer must not be reused
until the release fence is signalled.

Cheers,
Tomek

On Mon, 16 Mar 2020 at 10:20, Laurent Pinchart
<laurent.pinch...@ideasonboard.com> wrote:

> On Wed, Mar 11, 2020 at 04:18:55PM -0400, Nicolas Dufresne wrote:
> > (I know I'm going to be spammed by so many mailing list ...)
> >
> > On Wednesday, 11 March 2020 at 14:21 -0500, Jason Ekstrand wrote:
> > > On Wed, Mar 11, 2020 at 12:31 PM Jason Ekstrand wrote:
> > > > All,
> > > >
> > > > Sorry for casting such a broad net with this one. I'm sure most
> > > > people who reply will get at least one mailing list rejection.
> > > > However, this is an issue that affects a LOT of components and
> > > > that's why it's thorny to begin with. Please pardon the length
> > > > of this e-mail as well; I promise there's a concrete
> > > > point/proposal at the end.
> > > >
> > > > Explicit synchronization is the future of graphics and media. At
> > > > least, that seems to be the consensus among all the graphics
> > > > people I've talked to. I had a chat with one of the lead Android
> > > > graphics engineers recently who told me that doing explicit sync
> > > > from the start was one of the best engineering decisions Android
> > > > ever made. It's also the direction being taken by more modern
> > > > APIs such as Vulkan.
> > > >
> > > > ## What are implicit and explicit synchronization?
> > > >
> > > > For those that aren't familiar with this space, GPUs, media
> > > > encoders, etc. are massively parallel and synchronization of
> > > > some form is required to ensure that everything happens in the
> > > > right order and avoid data races. Implicit synchronization is
> > > > when bits of work (3D, compute, video encode, etc.) are
> > > > implicitly based on the absolute CPU-time order in which API
> > > > calls occur. Explicit synchronization is when the client
> > > > (whatever that means in any given context) provides the
> > > > dependency graph explicitly via some sort of synchronization
> > > > primitives. If you're still confused, consider the following
> > > > examples:
> > > >
> > > > With OpenGL and EGL, almost everything is implicit sync. Say you
> > > > have two OpenGL contexts sharing an image where one writes to it
> > > > and the other textures from it. The way the OpenGL spec works,
> > > > the client has to make the API calls to render to the image
> > > > before (in CPU time) it makes the API calls which texture from
> > > > the image. As long as it does this (an