Re: Possible Performance Regression with Mesa
Hi,

Just so you know, I also opened an issue on Mesa's GitLab about the regression:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/11105

Thanks for the help.

Regards,
João Paulo Gonçalves
Re: Possible Performance Regression with Mesa
Hi Daniel,

On Fri, Apr 26, 2024 at 12:17:33PM +0100, Daniel Stone wrote:
> One thing you can try is to edit
> weston/libweston/backend-drm/state-propose.c and, inside
> dmabuf_feedback_maybe_update(), prevent action_needed from ever being
> set to ACTION_NEEDED_ADD_SCANOUT_TRANCHE. It would be interesting to
> know if this restores full performance.

I tried it. Performance is the same as before, so it did not solve the issue.
I just removed the part of the code that sets action_needed to
ACTION_NEEDED_ADD_SCANOUT_TRANCHE.

Regards,
João Paulo Goncalves

diff --git a/libweston/backend-drm/state-propose.c b/libweston/backend-drm/state-propose.c
index 18a6d628..0ba23517 100644
--- a/libweston/backend-drm/state-propose.c
+++ b/libweston/backend-drm/state-propose.c
@@ -311,7 +311,9 @@ dmabuf_feedback_maybe_update(struct drm_device *device, struct weston_view *ev,
 		action_needed = ACTION_NEEDED_REMOVE_SCANOUT_TRANCHE;
 	/* Direct scanout may be possible if client re-allocates using the
 	 * params from the scanout tranche. */
-	} else if (try_view_on_plane_failure_reasons & (FAILURE_REASONS_ADD_FB_FAILED |
+	}
+	#if 0
+	else if (try_view_on_plane_failure_reasons & (FAILURE_REASONS_ADD_FB_FAILED |
 							FAILURE_REASONS_FB_FORMAT_INCOMPATIBLE |
 							FAILURE_REASONS_DMABUF_MODIFIER_INVALID |
 							FAILURE_REASONS_GBM_BO_IMPORT_FAILED |
@@ -321,6 +323,7 @@ dmabuf_feedback_maybe_update(struct drm_device *device, struct weston_view *ev,
 	} else if (try_view_on_plane_failure_reasons == FAILURE_REASONS_NONE) {
 		action_needed = ACTION_NEEDED_ADD_SCANOUT_TRANCHE;
 	}
+	#endif
 
 	/* No actions needed, so disarm timer and return */
 	if (action_needed == ACTION_NEEDED_NONE ||
Re: Possible Performance Regression with Mesa
Hi Joao,

On Fri, 26 Apr 2024 at 08:42, Joao Paulo Silva Goncalves wrote:
> On Thu, Apr 25, 2024 at 9:08 AM Lucas Stach wrote:
> > I can reproduce the issue, but sadly there is no simple fix for this,
> > as it's a bad interaction between some of the new features.
> > At the core of the issue is the dmabuf-feedback support with the chain
> > of events being as follows:
> >
> > 1. weston switches to the scanout tranche, as it would like to put the
> >    surface on a plane
> > 2. the client reallocates as linear but does so on the render node
> > 3. weston still isn't able to put the buffer on the plane, as it's
> >    still scanout incompatible due to being non-contig, so needs to fall
> >    back to rendering
> > 4. now we are stuck at a linear buffer being used for rendering, which
> >    is very non-optimal
> >
> > I'll look into improving this, but can make no commitments as to when
> > I'll be able to get around to this.
>
> Seems to be tricky.
> If you want, we can at least help you test it. Just reach out.
> We also saw similar behaviour on more modern hardware, like the iMX8MM.
> I will do a bit more testing on the iMX8MM and also some on the iMX8MP to
> gather more data, and I am also thinking of opening an issue on Mesa's
> GitLab for better tracking. What do you think?

One thing you can try is to edit
weston/libweston/backend-drm/state-propose.c and, inside
dmabuf_feedback_maybe_update(), prevent action_needed from ever being
set to ACTION_NEEDED_ADD_SCANOUT_TRANCHE. It would be interesting to
know if this restores full performance.

Cheers,
Daniel
Re: Possible Performance Regression with Mesa
On Thu, Apr 25, 2024 at 9:08 AM Lucas Stach wrote:
> I can reproduce the issue, but sadly there is no simple fix for this,
> as it's a bad interaction between some of the new features.
> At the core of the issue is the dmabuf-feedback support with the chain
> of events being as follows:
>
> 1. weston switches to the scanout tranche, as it would like to put the
>    surface on a plane
> 2. the client reallocates as linear but does so on the render node
> 3. weston still isn't able to put the buffer on the plane, as it's
>    still scanout incompatible due to being non-contig, so needs to fall
>    back to rendering
> 4. now we are stuck at a linear buffer being used for rendering, which
>    is very non-optimal
>
> I'll look into improving this, but can make no commitments as to when
> I'll be able to get around to this.

Seems to be tricky.
If you want, we can at least help you test it. Just reach out.
We also saw similar behaviour on more modern hardware, like the iMX8MM.
I will do a bit more testing on the iMX8MM and also some on the iMX8MP to
gather more data, and I am also thinking of opening an issue on Mesa's
GitLab for better tracking. What do you think?

Thanks for all the help, Lucas.

Regards,
João Paulo Goncalves
Re: Possible Performance Regression with Mesa
On Thu, Apr 25, 2024 at 5:58 AM Lucas Stach wrote:
> Etnaviv added some resource tracking to fix issues with a number of
> use-cases, which did add some CPU overhead and might cost some
> performance, but should not be as dramatic as the numbers you are seeing
> here.

Good to know. Thanks!

> Since the glmark2 cumulative score can be skewed quite heavily by
> single tests, it would be interesting to compare the results from
> individual benchmark tests. Do you see any outliers there or is the
> performance drop across the board?

It seems to affect performance across the individual benchmarks too, for
example:

6.9-rc4 kernel, glmark2 2023.01 and Mesa 22.0.3:

>> GPU Test:
Linux apalis-imx6-10692086 6.9.0-rc4 #1 SMP Wed Apr 24 18:57:48 -03 2024 armv7l armv7l armv7l GNU/Linux
===
glmark2 2023.01
===
OpenGL Information
GL_VENDOR: etnaviv
GL_RENDERER: Vivante GC2000 rev 5108
GL_VERSION: OpenGL ES 2.0 Mesa 22.0.3
Surface Config: buf=32 r=8 g=8 b=8 a=8 depth=24 stencil=0 samples=0
Surface Size: 640x480 windowed
===
[shading] duration=5.0: FPS: 475 FrameTime: 2.106 ms
[build] use-vbo=false: FPS: 550 FrameTime: 1.819 ms
[texture] : FPS: 345 FrameTime: 2.902 ms
===
glmark2 Score: 455
===

6.9-rc4 kernel, glmark2 2023.01 and Mesa 24.0.2:

>> GPU Test:
Linux apalis-imx6-10692086 6.9.0-rc4-0.0.0-devel-5-g2186ca42060f #1 SMP Sun Apr 14 20:38:39 UTC 2024 armv7l GNU/Linux
===
glmark2 2023.01
===
OpenGL Information
GL_VENDOR: Mesa
GL_RENDERER: Vivante GC2000 rev 5108
GL_VERSION: OpenGL ES 2.0 Mesa 24.0.2
Surface Config: buf=32 r=8 g=8 b=8 a=8 depth=24 stencil=0 samples=0
Surface Size: 640x480 windowed
===
[shading] duration=5.0: FPS: 325 FrameTime: 3.078 ms
[build] use-vbo=false: FPS: 368 FrameTime: 2.719 ms
[texture] : FPS: 210 FrameTime: 4.771 ms
===
glmark2 Score: 300
===

Regards,
João Paulo Goncalves
Re: Possible Performance Regression with Mesa
On Thu, 25 Apr 2024 at 13:08, Lucas Stach wrote:
> I can reproduce the issue, but sadly there is no simple fix for this,
> as it's a bad interaction between some of the new features.
> At the core of the issue is the dmabuf-feedback support with the chain
> of events being as follows:
>
> 1. weston switches to the scanout tranche, as it would like to put the
>    surface on a plane
> 2. the client reallocates as linear but does so on the render node
> 3. weston still isn't able to put the buffer on the plane, as it's
>    still scanout incompatible due to being non-contig, so needs to fall
>    back to rendering
> 4. now we are stuck at a linear buffer being used for rendering, which
>    is very non-optimal

Oh man, sorry about that, that shouldn't happen. As long as
drmModeAddFB2 is failing, we should be marking the buffer as
non-importable, and then hinting the client back towards tiled.

That being said, yeah, having the client render to linear and skip
composition is definitely going to be better!

Cheers,
Daniel
Re: Possible Performance Regression with Mesa
On Thursday, 25.04.2024 at 07:56 -0300, Joao Paulo Silva Goncalves wrote:
> On Thu, Apr 25, 2024 at 5:58 AM Lucas Stach wrote:
>
> > Etnaviv added some resource tracking to fix issues with a number of
> > use-cases, which did add some CPU overhead and might cost some
> > performance, but should not be as dramatic as the numbers you are seeing
> > here.
>
> Good to know. Thanks!
>
> > Since the glmark2 cumulative score can be skewed quite heavily by
> > single tests, it would be interesting to compare the results from
> > individual benchmark tests. Do you see any outliers there or is the
> > performance drop across the board?
>
> It seems to affect performance across the individual benchmarks
> too, for example:

I can reproduce the issue, but sadly there is no simple fix for this,
as it's a bad interaction between some of the new features.
At the core of the issue is the dmabuf-feedback support with the chain
of events being as follows:

1. weston switches to the scanout tranche, as it would like to put the
   surface on a plane
2. the client reallocates as linear but does so on the render node
3. weston still isn't able to put the buffer on the plane, as it's
   still scanout incompatible due to being non-contig, so needs to fall
   back to rendering
4. now we are stuck at a linear buffer being used for rendering, which
   is very non-optimal

I'll look into improving this, but can make no commitments as to when
I'll be able to get around to this.

Regards,
Lucas
Re: Possible Performance Regression with Mesa
Hi Joao Paulo,

On Wednesday, 24.04.2024 at 19:31 -0300, Joao Paulo Silva Goncalves wrote:
> Hello all,
>
> We might have encountered a performance regression after upgrading from Mesa
> 22.0.3 to 24.0.2. During our automated hardware tests using LAVA, we noticed
> a lower score on glmark2 when we upgraded the OpenEmbedded release from
> Kirkstone to Scarthgap. After conducting some internal tests, it doesn't seem
> to be an issue with the kernel or the glmark2 tool version, so we suspect that
> the issue may be related to something within Mesa. We believe that there might
> be something we're overlooking. Do you have any ideas or insights about
> this problem?
>

Etnaviv added some resource tracking to fix issues with a number of
use-cases, which did add some CPU overhead and might cost some
performance, but should not be as dramatic as the numbers you are seeing
here.

> Here are some details about our hardware platform and some tests we
> have conducted:
>
> Platform: Toradex Apalis iMX6 - NXP i.MX 6Q/6D Arm Cortex A9 with
> Vivante GC2000 rev 5108 using Etnaviv.
>
> Tests:
>
> Kernel Versions - v6.1.87 and v6.9-rc4
> Glmark2 Versions - 2021.12 and 2023.01
>
> We combined different upstream kernel, Mesa, and glmark2 versions and
> ran glmark2 on each combination on a mostly idle system. The benchmark
> was run 20 times on each combination.
>
> Some Results:
>
> Kernel   | Mesa   | glmark2 | Min-Max Score
> v6.1.87  | 22.0.3 | 2021.12 | 449-495
> v6.9-rc4 | 22.0.3 | 2021.12 | 452-502
> v6.1.87  | 22.0.3 | 2023.01 | 453-504
> v6.9-rc4 | 22.0.3 | 2023.01 | 455-496
> v6.1.87  | 24.0.2 | 2021.12 | 301-313
> v6.9-rc4 | 24.0.2 | 2021.12 | 298-320
> v6.1.87  | 24.0.2 | 2023.01 | 301-313
> v6.9-rc4 | 24.0.2 | 2023.01 | 295-310

Since the glmark2 cumulative score can be skewed quite heavily by
single tests, it would be interesting to compare the results from
individual benchmark tests. Do you see any outliers there or is the
performance drop across the board?

Regards,
Lucas