On Thu, 24 Aug 2023 15:37:05 -0400
Eduardo Hopperdietzel <ehopperdiet...@gmail.com> wrote:

> Hi David,
> 
> I've made a little Wayland app that uses both SHM and DMA, and I
> tested it on Weston, Sway, and my own compositor. I also tried it on
> three different machines: two with Intel i7 CPUs and one with a
> smaller ARM CPU. These machines had Intel Iris Pro, Nvidia GT525M,
> and Mali-400 GPUs, respectively.
> 
> Here's the code and results for one of the machines:
> 
> https://github.com/ehopperdietzel/QPainter-SHM-DMA-Benchmark
> 
> The results show that there's no significant difference in the time it
> takes for read and write operations using QPainter in SHM and DMA
> maps. It seems like DMA I/O operations are handled asynchronously by
> the kernel. The most noticeable improvement is on the compositor
> side. When using DMA, the experience feels much smoother, especially
> when moving other windows while the benchmark is running on
> single-threaded compositors like Weston. There's also a slight
> increase in the number of frame callbacks returned by the compositors
> when using DMA, though it doesn't significantly boost the overall FPS.
> 
> However, there are challenges with implementing DMA:
> 
> 1. There does not seem to be a standard method to create DMA buffers
> in userspace. I tried creating a GBM bo, obtaining a PRIME fd, and
> mapping it, but this isn't supported by all GPUs/drivers. For
> instance, it didn't work with the Mali GPU using the Lima driver. I
> also experimented with DMA-BUF heaps, but driver support does not
> seem to be consistent across all distributions, and accessing
> /dev/dma_heap/* often requires superuser privileges.
> 
> 2. When using DMA, triple buffering is necessary; otherwise,
> compositors only display partial buffer updates. This could
> potentially be avoided by using DMA fencing mechanisms (like EGL does
> under the hood) and protocols like this one:
> 
> https://wayland.app/protocols/linux-explicit-synchronization-unstable-v1
> 
> But it seems that not many compositors have implemented it.
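
To make the triple-buffering point concrete, here is a minimal sketch of the
bookkeeping a client would need. It assumes a pool of three buffers where a
buffer counts as busy from the moment it is drawn into until the compositor
sends wl_buffer.release for it. BufferPool, acquire() and release() are
hypothetical names for illustration, not Qt or Wayland API, and the sketch
leaves out the Wayland calls themselves.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical client-side pool (illustrative names, not Qt or Wayland API).
class BufferPool {
public:
    // Returns the index of a buffer the compositor is not holding, or
    // std::nullopt when all three are busy (the caller would skip the frame).
    std::optional<std::size_t> acquire() {
        for (std::size_t i = 0; i < busy_.size(); ++i) {
            if (!busy_[i]) {
                busy_[i] = true;  // busy until release() is called for it
                return i;
            }
        }
        return std::nullopt;
    }

    // Assumed to be called from the wl_buffer.release handler of buffer `index`.
    void release(std::size_t index) {
        assert(index < busy_.size());
        busy_[index] = false;
    }

private:
    std::array<bool, 3> busy_{};  // three slots: triple buffering
};
```

The client would call acquire() before painting a frame and never write into
a buffer that has not been released, which is what avoids the partially
updated frames described above.
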
> 
> To sum it up, while DMA does offer a performance boost, it's not
> without its issues:
> 
> - DMA's effectiveness varies depending on hardware.
> - Implementing DMA can be complex.
> - The performance gains might not justify the effort.
> 
> So, as you mentioned earlier, it's probably best to stick with SHM
> and let the compositor handle uploads using DMA, preferably
> asynchronously.
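
For comparison, the SHM side is simple to set up. The following is only a
sketch of the allocation half, assuming Linux with memfd_create() and a
format of 4 bytes per pixel. ShmBuffer, createShmBuffer() and
destroyShmBuffer() are hypothetical helpers, not taken from Qt or from the
benchmark, and handing the fd to the compositor through wl_shm is left out.

```cpp
#include <cstddef>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper type: an anonymous shared-memory file mapped for CPU drawing.
struct ShmBuffer {
    int fd = -1;
    void *data = nullptr;
    std::size_t size = 0;
};

// Allocates and maps width * height pixels; returns false on any failure.
bool createShmBuffer(ShmBuffer &out, int width, int height) {
    if (width <= 0 || height <= 0)
        return false;
    // Assumption: 4 bytes per pixel (for example a 32-bit ARGB format).
    const std::size_t stride = static_cast<std::size_t>(width) * 4;
    const std::size_t size = stride * static_cast<std::size_t>(height);

    int fd = memfd_create("shm-buffer", MFD_CLOEXEC);
    if (fd < 0)
        return false;
    if (ftruncate(fd, static_cast<off_t>(size)) < 0) {
        close(fd);
        return false;
    }
    void *data = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (data == MAP_FAILED) {
        close(fd);
        return false;
    }
    out.fd = fd;
    out.data = data;
    out.size = size;
    return true;
}

// Unmaps the memory and closes the file descriptor.
void destroyShmBuffer(ShmBuffer &buf) {
    if (buf.data)
        munmap(buf.data, buf.size);
    if (buf.fd >= 0)
        close(buf.fd);
    buf = ShmBuffer{};
}
```

The mapped memory can be wrapped in a QImage and painted with QPainter as
usual. No driver support or special privileges are involved, which is the
main practical advantage over the DMA paths discussed above.
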
> 
> Cheers,
> 
> Eduardo Hopperdietzel

I wonder whether this would help with FramelessWindowHint artifacts on
Debian 10? Currently SHM doesn't work correctly on Debian 10 and one
has to create a child QOpenGLWidget for artifacts to disappear.
-- 
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development
