Hi

Am 08.09.23 um 16:41 schrieb Pekka Paalanen:
On Fri, 8 Sep 2023 15:56:51 +0200
Thomas Zimmermann <tzimmerm...@suse.de> wrote:

Hi

Am 08.09.23 um 13:16 schrieb Pekka Paalanen:
On Fri, 8 Sep 2023 11:21:51 +0200
Thomas Zimmermann <tzimmerm...@suse.de> wrote:
Hi

Am 25.08.23 um 16:04 schrieb Jocelyn Falempe:
[...]
+ *
+ *     But there are two exceptions only for dumb buffers:
+ *     * To support XRGB8888 if it's not supported by the hardware.

+ *     * Any driver is free to modify its internal representation of the 
format,
+ *       as long as it doesn't alter the visible content in any way, and 
doesn't
+ *       modify the user-provided buffer. An example would be to drop the
+ *       padding component from a format to save some memory bandwidth.

I have strong objections to this point, _especially_ as you're
apparently trying to sneak this in after our discussion. NAK on this
part from my side.

If you want userspace to be able to use a certain format, then export
the corresponding 4cc code. Then let userspace decide what to do about
it. If userspace pick a certain format, go with it.

What is the reason for your objection, exactly?
Hence, no implicit conversion from XRGB888 to RGB888, just because it's
possible.

For the particular driver in question though, the conversion allows
using a display resolution that is otherwise not possible. I also hear
it improves performance since 25% less data needs to travel across a
slow bus. There is also so little VRAM, than all dumb buffers need to
be allocated from sysram instead anyway, so a copy is always necessary.

Since XRGB8888 is the one format that is recommended to be supported by
all drivers, I don't see a problem here. Did you test on your
incredibly slow g200 test rig if the conversion ends up hurting instead
of helping performance there?

If it hurts, then I see that you have a good reason to NAK this.

It's hard to imagine how it would hurt, since you always need a copy
from sysram dumb buffers to VRAM - or do you?

I have a number of concerns. My point it not that we shouldn't optimize.
I just don't want it in the kernel. Mgag200 can export DRM_FORMAT_RGB888
for userspace to use.

AFAICT the main argument against userspace is that Mesa doesn't like
3-byte pixels. But I don't see how this conversion cannot be a
post-processing step within Mesa: do the rendering in RGB32 and then
convert to a framebuffer in RGB24. Userspace can do that more
efficiently than the kernel. This has all of the upsides of reduced
bandwidth, but none of the downsides of kernel code. Applications and/or
Mesa would be in control of the buffer format and apply the optimization
where it makes sense. And it would be available for all drivers that are
similar to mgag200.

My main point is simplicity of the driver: I prefer the driver to be
simple without unnecessary indirection or overhead. Optimizations like
these my or may not work on a given system with a certain workload. I'd
better leave this heuristic to userspace.

Another point of concern is CPU consumption: Slow I/O buses may stall
the display thread, but the CPU could do something else in the meantime.
Doing format conversion on the CPU prevents that, hence affecting other
parts of the system negatively. Of course, that's more of a gut feeling
than hard data.

Please note that the kernel's conversion code uses memory allocation of
intermediate buffers. We even recently had a discussion about allocation
overhead during display updates. Userspace can surely do a better job at
keeping such buffers around.

And finally a note the hardware itself: on low-end hardware like those
Matrox chips, just switch to RGB16. That will be pretty and fast enough
for these chips' server systems. Anyone who cares about fast and
beautiful should buy a real graphics card.

Fair enough.

Did you consider that every frame will be copied twice: once in
userspace to get RGB888, then again by the driver into VRAM, since dumb
buffers reside in sysram?

In the kernel, we reduce the copying to the changed parts, if we have damage information from userspace. IDK Mesa's software renderer, but it could certainly apply a similar optimization.


RGB565 would probably go the same route I guess.

From what I know userspace still supports RGB565 rendering directly. Last time I tested, it worked for me.


I suspect the intermediate buffers (dumb buffers in this case) need to
be full frame size rather than a scanline, too. I'm not sure why the
driver would need any scratch buffers beyond a couple dozen bytes while
doing a software conversion, just enough to have the lowest common
multiple of 3 bytes and optimal bus write width.

For the conversion in the kernel we allocate enough temporary memory to hold the part of each scanline that changed. Then we go over dirty scanlines, convert each into the output format and store it in that temporary buffer. Then copy the temporary buffer to VRAM. The buffer can be reused for the scanlines of a single conversion. But the nature of the framebuffer (buffers are possibly shared with concurrent access from multiple planes) makes it hard to keep that temporary memory around. Hence it's freed after each conversion. The code is at [1].

I assume that a userspace software renderer could do a much better job at keeping the temporary memory allocated.

Best regards
Thomas

[1] https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/drm_format_helper.c#L88



Thanks,
pq


Best regards
Thomas



Thanks,
pq
+ *     On most hardware, VRAM read access are slow, so when doing the software
+ *     conversion, the dumb buffer should be allocated in system RAM in order 
to
+ *     have decent performance.
+ *     Extra care should be taken when doing software conversion with
+ *     DRM_CAP_DUMB_PREFER_SHADOW, there are more detailed explanations here:
+ *     https://lore.kernel.org/dri-devel/20230818162415.2185f8e3@eldfell/
     */
static unsigned int drm_num_planes(struct drm_device *dev)

base-commit: 82d750e9d2f5d0594c8f7057ce59127e701af781



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)

Attachment: OpenPGP_signature
Description: OpenPGP digital signature

Reply via email to