On Thu, 24 Aug 2023 15:37:05 -0400 Eduardo Hopperdietzel <ehopperdiet...@gmail.com> wrote:
> Hi David,
>
> I've made a little Wayland app that uses both SHM and DMA, and I
> tested it on Weston, Sway, and my own compositor. I also tried it on
> three different machines: two with Intel i7 CPUs and one with a
> smaller ARM CPU. These machines had Intel Iris Pro, Nvidia GT525M,
> and Mali-400 GPUs, respectively.
>
> Here's the code and results for one of the machines:
>
> https://github.com/ehopperdietzel/QPainter-SHM-DMA-Benchmark
>
> The results show that there's no significant difference in the time it
> takes for read and write operations using QPainter in SHM and DMA
> maps. It seems like DMA I/O operations are handled asynchronously by
> the kernel. The most noticeable improvement is on the compositor
> side. When using DMA, the experience feels much smoother, especially
> when moving other windows while the benchmark is running on
> single-threaded compositors like Weston. There's also a slight
> increase in the number of frame callbacks returned by the compositors
> when using DMA, though it doesn't significantly boost the overall FPS.
>
> However, there are challenges with implementing DMA:
>
> 1. There does not seem to be a standard method to create DMA buffers
> in userspace. I tried creating a GBM bo, obtaining a PRIME fd, and
> mapping it, but this isn't supported by all GPUs/drivers. For
> instance, it didn't work with the Mali GPU using the Lima driver. I
> also experimented with DMA-BUF heaps, but driver support does not
> seem to be consistent across all distributions, and accessing
> /dev/dma-heaps/** often requires superuser privileges.
>
> 2. When using DMA, triple buffering is necessary; otherwise,
> compositors only display partial buffer updates. This could
> potentially be avoided by using DMA fencing mechanisms (like EGL does
> under the hood) and protocols like this one:
>
> https://wayland.app/protocols/linux-explicit-synchronization-unstable-v1
>
> But it seems that not many compositors have implemented it.
>
> To sum it up, while DMA does offer a performance boost, it's not
> without its issues:
>
> - DMA's effectiveness varies depending on hardware.
> - Implementing DMA can be complex.
> - The performance gains might not justify the effort.
>
> So, as you mentioned earlier, it's probably best to stick with SHM
> and let the compositor handle uploads using DMA, preferably
> asynchronously.
>
> Cheers,
>
> Eduardo Hopperdietzel

I wonder whether this would help with FramelessWindowHint artifacts on
Debian 10? Currently SHM doesn't work correctly on Debian 10 and one
has to create a child QOpenGLWidget for artifacts to disappear.

-- 
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development
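
For readers unfamiliar with the triple-buffering point in item 2 of the
quoted message, here is a minimal sketch of the bookkeeping it implies.
It is not taken from the linked benchmark: the class name BufferPool and
its methods are hypothetical, and it assumes the client learns that a
buffer is free again from the wl_buffer release event. The idea is only
that the client never draws into a buffer the compositor may still be
reading.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the benchmark): tracks which of three
// client buffers the compositor may still be reading from.
class BufferPool {
public:
    static constexpr std::size_t kCount = 3;  // triple buffering

    // Returns the index of a buffer that is safe to draw into, or -1 if
    // the compositor still holds all of them (the client should then
    // skip this frame or wait for a release).
    int acquire() {
        for (std::size_t i = 0; i < kCount; ++i) {
            if (!busy_[i]) {
                busy_[i] = true;  // considered in use until release()
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Assumption: called from the handler of the wl_buffer release
    // event, i.e. once the compositor no longer reads buffer `index`.
    void release(int index) {
        assert(index >= 0 && static_cast<std::size_t>(index) < kCount);
        busy_[static_cast<std::size_t>(index)] = false;
    }

private:
    std::array<bool, kCount> busy_{};  // all buffers start out free
};
```

A client would call acquire() before painting a frame, attach and
commit the returned buffer, and call release() when the compositor
hands it back. With explicit fences the pool could likely be smaller,
which is the trade-off the quoted message describes.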