Re: Possible Performance Regression with Mesa

2024-05-02 Thread Joao Paulo Silva Goncalves
Hi,

Just to let you know, I also opened an issue on Mesa's GitLab about the
regression: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11105.

Thanks for the help.

Regards,
João Paulo Gonçalves


Re: Possible Performance Regression with Mesa

2024-04-26 Thread João Paulo Silva Goncalves
Hi Daniel, 

On Fri, Apr 26, 2024 at 12:17:33PM +0100, Daniel Stone wrote:

> One thing you can try is to edit
> weston/libweston/backend-drm/state-propose.c and, inside
> dmabuf_feedback_maybe_update(), prevent action_needed from ever being
> set to ACTION_NEEDED_ADD_SCANOUT_TRANCHE. It would be interesting to
> know if this restores full performance.
 
I tried it. Performance is the same as before, so it didn't solve the
issue. I just removed the part of the code that sets action_needed to
ACTION_NEEDED_ADD_SCANOUT_TRANCHE.
 
Regards,
João Paulo Goncalves
 
diff --git a/libweston/backend-drm/state-propose.c b/libweston/backend-drm/state-propose.c
index 18a6d628..0ba23517 100644
--- a/libweston/backend-drm/state-propose.c
+++ b/libweston/backend-drm/state-propose.c
@@ -311,7 +311,9 @@ dmabuf_feedback_maybe_update(struct drm_device *device, struct weston_view *ev,
 		action_needed = ACTION_NEEDED_REMOVE_SCANOUT_TRANCHE;
 	/* Direct scanout may be possible if client re-allocates using the
 	 * params from the scanout tranche. */
-	} else if (try_view_on_plane_failure_reasons & (FAILURE_REASONS_ADD_FB_FAILED |
+	}
+	#if 0
+	else if (try_view_on_plane_failure_reasons & (FAILURE_REASONS_ADD_FB_FAILED |
 							FAILURE_REASONS_FB_FORMAT_INCOMPATIBLE |
 							FAILURE_REASONS_DMABUF_MODIFIER_INVALID |
 							FAILURE_REASONS_GBM_BO_IMPORT_FAILED |
@@ -321,6 +323,7 @@ dmabuf_feedback_maybe_update(struct drm_device *device, struct weston_view *ev,
 	} else if (try_view_on_plane_failure_reasons == FAILURE_REASONS_NONE) {
 		action_needed = ACTION_NEEDED_ADD_SCANOUT_TRANCHE;
 	}
+	#endif
 
 	/* No actions needed, so disarm timer and return */
 	if (action_needed == ACTION_NEEDED_NONE ||


Re: Possible Performance Regression with Mesa

2024-04-26 Thread Daniel Stone
Hi Joao,

On Fri, 26 Apr 2024 at 08:42, Joao Paulo Silva Goncalves wrote:
> On Thu, Apr 25, 2024 at 9:08 AM Lucas Stach wrote:
> > I can reproduce the issue, but sadly there is no simple fix for this,
> > as it's a bad interaction between some of the new features.
> > At the core of the issue is the dmabuf-feedback support with the chain
> > of events being as follows:
>
> > 1. weston switches to the scanout tranche, as it would like to put the
> > surface on a plane
> > 2. the client reallocates as linear but does so on the render node
> > 3. weston still isn't able to put the buffer on the plane, as it's
> > still scanout incompatible due to being non-contig, so needs to fall
> > back to rendering
> > 4. now we are stuck at a linear buffer being used for rendering, which
> > is very non-optimal
>
> > I'll look into improving this, but can make no commitments as to when
> > I'll be able to get around to this.
>
> That seems tricky.
> If you want, we can at least help you test it. Just reach out.
> We also saw similar behaviour on more modern hardware, like the iMX8MM.
> I will do a bit more testing on the iMX8MM and also some on the iMX8MP to
> gather more data. I am also thinking of opening an issue on Mesa's GitLab
> for better tracking. What do you think?

One thing you can try is to edit
weston/libweston/backend-drm/state-propose.c and, inside
dmabuf_feedback_maybe_update(), prevent action_needed from ever being
set to ACTION_NEEDED_ADD_SCANOUT_TRANCHE. It would be interesting to
know if this restores full performance.

Cheers,
Daniel


Re: Possible Performance Regression with Mesa

2024-04-26 Thread Joao Paulo Silva Goncalves
On Thu, Apr 25, 2024 at 9:08 AM Lucas Stach wrote:
> I can reproduce the issue, but sadly there is no simple fix for this,
> as it's a bad interaction between some of the new features.
> At the core of the issue is the dmabuf-feedback support with the chain
> of events being as follows:

> 1. weston switches to the scanout tranche, as it would like to put the
> surface on a plane
> 2. the client reallocates as linear but does so on the render node
> 3. weston still isn't able to put the buffer on the plane, as it's
> still scanout incompatible due to being non-contig, so needs to fall
> back to rendering
> 4. now we are stuck at a linear buffer being used for rendering, which
> is very non-optimal

> I'll look into improving this, but can make no commitments as to when
> I'll be able to get around to this.

That seems tricky.
If you want, we can at least help you test it. Just reach out.
We also saw similar behaviour on more modern hardware, like the iMX8MM.
I will do a bit more testing on the iMX8MM and also some on the iMX8MP to
gather more data. I am also thinking of opening an issue on Mesa's GitLab
for better tracking. What do you think?

Thanks for all the help Lucas.

Regards,
João Paulo Goncalves


Re: Possible Performance Regression with Mesa

2024-04-26 Thread Joao Paulo Silva Goncalves
On Thu, Apr 25, 2024 at 5:58 AM Lucas Stach wrote:

> Etnaviv added some resource tracking to fix issues with a number of
> use-cases, which did add some CPU overhead and might cost some
> performance, but should not be as dramatic as the numbers you are seeing
> here.

Good to know. Thanks!

> Since the glmark2 cumulative score can be skewed quite heavily by
> single tests, it would be interesting to compare the results from
> individual benchmark tests. Do you see any outliers there or is the
> performance drop across the board?

The performance drop seems to affect all of the individual benchmarks
too, for example:

Kernel 6.9-rc4, glmark2 2023.01 and Mesa 22.0.3:

>> GPU Test: Linux apalis-imx6-10692086 6.9.0-rc4 #1 SMP Wed Apr 24 18:57:48 -03 2024 armv7l armv7l armv7l GNU/Linux
===
glmark2 2023.01
===
OpenGL Information
GL_VENDOR:      etnaviv
GL_RENDERER:    Vivante GC2000 rev 5108
GL_VERSION:     OpenGL ES 2.0 Mesa 22.0.3
Surface Config: buf=32 r=8 g=8 b=8 a=8 depth=24 stencil=0 samples=0
Surface Size:   640x480 windowed
===
[shading] duration=5.0: FPS: 475 FrameTime: 2.106 ms
[build] use-vbo=false: FPS: 550 FrameTime: 1.819 ms
[texture] : FPS: 345 FrameTime: 2.902 ms
===
  glmark2 Score: 455
===

Kernel 6.9-rc4, glmark2 2023.01 and Mesa 24.0.2:

>> GPU Test: Linux apalis-imx6-10692086 6.9.0-rc4-0.0.0-devel-5-g2186ca42060f #1 SMP Sun Apr 14 20:38:39 UTC 2024 armv7l GNU/Linux
===
glmark2 2023.01
===
OpenGL Information
GL_VENDOR:      Mesa
GL_RENDERER:    Vivante GC2000 rev 5108
GL_VERSION:     OpenGL ES 2.0 Mesa 24.0.2
Surface Config: buf=32 r=8 g=8 b=8 a=8 depth=24 stencil=0 samples=0
Surface Size:   640x480 windowed
===
[shading] duration=5.0: FPS: 325 FrameTime: 3.078 ms
[build] use-vbo=false: FPS: 368 FrameTime: 2.719 ms
[texture] : FPS: 210 FrameTime: 4.771 ms
===
  glmark2 Score: 300
===
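
For a per-test view of the two runs above, the relative drop can be
computed directly from the FPS values. This is only a minimal sketch in
Python: the dictionary names and the helper relative_drop are illustrative,
and the numbers are copied from the two glmark2 outputs above (one run each).

```python
# FPS values and scores copied from the two glmark2 runs above (single run each).
mesa_22 = {"shading": 475, "build": 550, "texture": 345, "score": 455}
mesa_24 = {"shading": 325, "build": 368, "texture": 210, "score": 300}


def relative_drop(old: float, new: float) -> float:
    """Return the drop from old to new as a percentage of old."""
    return (old - new) / old * 100.0


# Per-test drop between the Mesa 22.0.3 run and the Mesa 24.0.2 run.
drops = {name: relative_drop(mesa_22[name], mesa_24[name]) for name in mesa_22}

for name, drop in drops.items():
    print(f"{name:8s} {mesa_22[name]:4d} -> {mesa_24[name]:4d}  ({drop:.1f}% lower)")
```

Each of the three tests loses between roughly 31% and 39%, so in this
sample no single test stands out as the one skewing the cumulative score.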


Regards,
João Paulo Goncalves


Re: Possible Performance Regression with Mesa

2024-04-25 Thread Daniel Stone
On Thu, 25 Apr 2024 at 13:08, Lucas Stach wrote:
> I can reproduce the issue, but sadly there is no simple fix for this,
> as it's a bad interaction between some of the new features.
> At the core of the issue is the dmabuf-feedback support with the chain
> of events being as follows:
>
> 1. weston switches to the scanout tranche, as it would like to put the
> surface on a plane
> 2. the client reallocates as linear but does so on the render node
> 3. weston still isn't able to put the buffer on the plane, as it's
> still scanout incompatible due to being non-contig, so needs to fall
> back to rendering
> 4. now we are stuck at a linear buffer being used for rendering, which
> is very non-optimal

Oh man, sorry about that, that shouldn't happen. As long as
drmModeAddFB2 is failing, we should be marking the buffer as
non-importable, and then hinting the client back towards tiled.

That being said, yeah, having the client render to linear and skip
composition is definitely going to be better!

Cheers,
Daniel


Re: Possible Performance Regression with Mesa

2024-04-25 Thread Lucas Stach
On Thursday, 25.04.2024 at 07:56 -0300, Joao Paulo Silva Goncalves wrote:
> 
> 
> On Thu, Apr 25, 2024 at 5:58 AM Lucas Stach wrote:
> 
> > Etnaviv added some resource tracking to fix issues with a number of
> > use-cases, which did add some CPU overhead and might cost some
> > performance, but should not be as dramatic as the numbers you are seeing
> > here.
> 
> Good to know. Thanks!
> 
> > Since the glmark2 cumulative score can be skewed quite heavily by
> > single tests, it would be interesting to compare the results from
> > individual benchmark tests. Do you see any outliers there or is the
> > performance drop across the board?
> 
> The performance drop seems to affect all of the individual benchmarks
> too, for example:

I can reproduce the issue, but sadly there is no simple fix for this,
as it's a bad interaction between some of the new features.
At the core of the issue is the dmabuf-feedback support with the chain
of events being as follows:

1. weston switches to the scanout tranche, as it would like to put the
surface on a plane
2. the client reallocates as linear but does so on the render node
3. weston still isn't able to put the buffer on the plane, as it's
still scanout incompatible due to being non-contig, so needs to fall
back to rendering
4. now we are stuck at a linear buffer being used for rendering, which
is very non-optimal
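
The chain of events above can be illustrated with a toy model. This is only
a sketch in Python: the names (Buffer, can_scan_out, negotiate) are
hypothetical and do not correspond to weston or Mesa code, and both the
initial tiled layout and the rule that a plane needs a linear, contiguous
buffer are assumptions made to mirror steps 2 and 3.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    layout: str        # "tiled" or "linear"
    contiguous: bool   # assumption: render-node allocations are non-contiguous


def can_scan_out(buf: Buffer) -> bool:
    # Assumed rule for this model: a plane needs a linear, contiguous buffer.
    return buf.layout == "linear" and buf.contiguous


def negotiate() -> list[str]:
    """Walk through steps 1-4 and return a log of what happened."""
    log = []
    buf = Buffer(layout="tiled", contiguous=False)  # assumed starting point

    # 1. The compositor switches to the scanout tranche.
    log.append("compositor: switch to scanout tranche")

    # 2. The client reallocates as linear, but still on the render node.
    buf = Buffer(layout="linear", contiguous=False)
    log.append("client: reallocate linear on render node")

    # 3. The buffer still cannot go on a plane, so fall back to rendering.
    if not can_scan_out(buf):
        log.append("compositor: buffer not scanout capable, fall back to rendering")

    # 4. The linear buffer is now used for rendering, which is the slow case.
    if buf.layout == "linear":
        log.append("result: stuck rendering with a linear buffer")
    return log


for line in negotiate():
    print(line)
```

In this model nothing ever moves the client back to a tiled buffer once
step 2 has happened, which is the stuck state described in step 4.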

I'll look into improving this, but can make no commitments as to when
I'll be able to get around to this.

Regards,
Lucas


Re: Possible Performance Regression with Mesa

2024-04-25 Thread Lucas Stach
Hi Joao Paulo,

On Wednesday, 24.04.2024 at 19:31 -0300, Joao Paulo Silva Goncalves wrote:
> Hello all,
> 
> We might have encountered a performance regression after upgrading from Mesa
> 22.0.3 to 24.0.2. During our automated hardware tests using LAVA, we noticed
> a lower glmark2 score when we upgraded the OpenEmbedded release from
> Kirkstone to Scarthgap. After conducting some internal tests, it doesn't seem
> to be an issue with the kernel or the glmark2 version, so we suspect that
> the issue is related to something within Mesa. We may also be overlooking
> something. Do you have any ideas or insights about this problem?
> 
Etnaviv added some resource tracking to fix issues with a number of
use-cases, which did add some CPU overhead and might cost some
performance, but should not be as dramatic as the numbers you are seeing
here.

> Here are some details about our hardware platform and some tests we
> have conducted:
> 
> Platform: Toradex Apalis iMX6 - NXP i.MX 6Q/6D Arm Cortex A9 with
> Vivante GC2000 rev 5108 using Etnaviv.
> 
> Tests:
> 
> Kernel Versions - v6.1.87 and v6.9-rc4
> Glmark2 Versions - 2021.12 and 2023.01
> 
> We combined different upstream kernel, Mesa, and glmark2 versions and
> ran glmark2 on each
> combination on a mostly idle system. The benchmark was run 20 times on
> each combination.
> 
> Some Results:
> 
> Kernel   | Mesa   | glmark2 | Score (min-max)
> v6.1.87  | 22.0.3 | 2021.12 | 449-495
> v6.9-rc4 | 22.0.3 | 2021.12 | 452-502
> v6.1.87  | 22.0.3 | 2023.01 | 453-504
> v6.9-rc4 | 22.0.3 | 2023.01 | 455-496
> v6.1.87  | 24.0.2 | 2021.12 | 301-313
> v6.9-rc4 | 24.0.2 | 2021.12 | 298-320
> v6.1.87  | 24.0.2 | 2023.01 | 301-313
> v6.9-rc4 | 24.0.2 | 2023.01 | 295-310
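
To check whether the kernel or glmark2 version matters in the table above,
the rows can be grouped by Mesa version. This is a rough sketch in Python
with the rows transcribed from the table. The variable names are made up
for illustration, the Mesa versions are written as 22.0.3 and 24.0.2 (the
form used in the GL_VERSION strings elsewhere in this thread), and the
range midpoint is only a crude stand-in for the mean of the 20 runs, which
the table does not give.

```python
from collections import defaultdict

# (kernel, mesa, glmark2, min_score, max_score), transcribed from the table.
rows = [
    ("v6.1.87", "22.0.3", "2021.12", 449, 495),
    ("v6.9-rc4", "22.0.3", "2021.12", 452, 502),
    ("v6.1.87", "22.0.3", "2023.01", 453, 504),
    ("v6.9-rc4", "22.0.3", "2023.01", 455, 496),
    ("v6.1.87", "24.0.2", "2021.12", 301, 313),
    ("v6.9-rc4", "24.0.2", "2021.12", 298, 320),
    ("v6.1.87", "24.0.2", "2023.01", 301, 313),
    ("v6.9-rc4", "24.0.2", "2023.01", 295, 310),
]

# Collect the overall score envelope and the range midpoints per Mesa version.
by_mesa = defaultdict(lambda: {"lo": float("inf"), "hi": float("-inf"), "mids": []})
for _kernel, mesa, _glmark2, lo, hi in rows:
    group = by_mesa[mesa]
    group["lo"] = min(group["lo"], lo)
    group["hi"] = max(group["hi"], hi)
    group["mids"].append((lo + hi) / 2)

for mesa, group in sorted(by_mesa.items()):
    mean_mid = sum(group["mids"]) / len(group["mids"])
    print(f"Mesa {mesa}: scores {group['lo']}-{group['hi']}, mean midpoint {mean_mid:.1f}")

# The envelopes do not overlap if the best new score is below the worst old one.
separated = by_mesa["24.0.2"]["hi"] < by_mesa["22.0.3"]["lo"]
print("ranges separated by Mesa version:", separated)
```

Within each Mesa version the ranges for both kernels and both glmark2
versions overlap heavily, while the two Mesa groups (449-504 versus
295-320) do not overlap at all, so the drop tracks the Mesa column rather
than the kernel or glmark2 columns.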

Since the glmark2 cumulative score can be skewed quite heavily by
single tests, it would be interesting to compare the results from
individual benchmark tests. Do you see any outliers there or is the
performance drop across the board?

Regards,
Lucas