Re: [PATCH v2 0/5] drm: Reuse temporary memory for format conversion

2023-10-02 Thread Maxime Ripard
On Fri, Sep 29, 2023 at 04:58:30PM +0200, Thomas Zimmermann wrote:
> Am 29.09.23 um 14:11 schrieb Maxime Ripard:
> > On Wed, Sep 20, 2023 at 04:24:26PM +0200, Thomas Zimmermann wrote:
> > > DRM's format-conversion helpers require temporary memory. Pass the
> > > buffer from the caller and keep it allocated over several calls. Allow
> > > the caller to preallocate the buffer memory.
> > > 
> > > The motivation for this patchset is the recent work on a DRM panic
> > > handler. The panic handler requires format conversion to display an
> > > error to the screen. But allocating memory during kernel panics is
> > > fragile. The changes in this patchset enable the DRM panic handler to
> > > preallocate buffer storage before the panic occurs.
> > > 
> > > As an additonal benefit, drivers can now keep the temporary storage
> > > across multiple display updates. Avoiding memory allocation reduces
> > > the CPU overhead of the format helpers.
> > 
> > This argument is getting a bit tiring. The main reason is that it isn't
> > one, and:
> 
> CPU overhead isn't the driver behind this patchset, but if it is affected,
> why not say that in the commit message?

Any patch affects the CPU overhead then, one way or the other.

> There's a alloc/free pair for each updated scanline. For a full-screen
> updates, that's quite a bit.
>
> > 
> >- we allocate something in the 10-20 range objects at a given commit,
> >  so another small one is not insignificant.
> > 
> >- If it was, it would be trivial to produce a benchmark, but no-one
> >  ever actually showed a workload and numbers where there's actually
> >  any difference.
> > 
> >- Also, the CPU overhead is indeed (even if marginally) decreased, but
> >  the memory overhead is increased. I don't know whether that's a good
> >  trade-off or not, see the point above.
> > 
> > It really sounds like an empty statement to me: "But just think of the
> > CPU!".
> > 
> > That being said, I also have more fundamental doubts about this series.
> > 
> > The first one is that storing the buffer pointer in the device instead
> > of the state makes it harder to reason about. When you have a state, the
> > framework provides the guarantee at commit time that there's only going
> > to be one at a given time. And since the buffer is stored in that state
> > object, you can't access it by mistake. The buffer size also depends on
> > the state, so that all makes sense from a logical PoV.
> 
> Yes. I discussed this with Javier already. Putting this into the state is
> the clean solution.
> 
> > 
> > If we store the buffer into the device, then suddenly you have to think
> > about whether there's multiple CRTCs or not (since commits aren't
> > serialised if they affect different CRTCs), whether the buffer size you
> > allocated is large enough now for your current resolution and format,
> > etc. It adds a decent chunk of complexity on something that was quite
> > complex already.
> 
> It's in the device because it's good enough for these simple drivers. The
> next best place would be a dedicated plane structure in each driver. The
> per-plane cache would then be clearly attributed to a single plane.

Right, but you still need to think about all that before figuring it
out. And we now have simple drivers very likely to be used as an example
unsafe-but-actually-aren't. When copied and pasted into a different
context, it might not be safe anymore. And since those copy/pasting that
code are super experienced, they won't know about it.

> > I understand that the motivation is for drm_panic to have a buffer ready
> > when it has to kick in. But why don't we just allocate (and check) a
> > spare commit at probe time so we just have to commit it when we panic.
> 
> DRM panic doesn't commit anything. It picks up whatever the current scanout
> buffer is and draws into that. If the DRM driver cannot provide a scanout
> buffer, DRM panic does nothing.

My point is that it could do a commit if we have prepared everything. Or
we could steal the current buffer. Or we could pre-allocate a dumb
buffer for every device. You'll have to interact the state anyway for
any proper driver, and I don't think just allocating it here like you do
is safe and enough.

Maxime


signature.asc
Description: PGP signature


Re: [PATCH v2 0/5] drm: Reuse temporary memory for format conversion

2023-09-29 Thread Thomas Zimmermann

Hi

Am 29.09.23 um 14:11 schrieb Maxime Ripard:

On Wed, Sep 20, 2023 at 04:24:26PM +0200, Thomas Zimmermann wrote:

DRM's format-conversion helpers require temporary memory. Pass the
buffer from the caller and keep it allocated over several calls. Allow
the caller to preallocate the buffer memory.

The motivation for this patchset is the recent work on a DRM panic
handler. The panic handler requires format conversion to display an
error to the screen. But allocating memory during kernel panics is
fragile. The changes in this patchset enable the DRM panic handler to
preallocate buffer storage before the panic occurs.

As an additonal benefit, drivers can now keep the temporary storage
across multiple display updates. Avoiding memory allocation reduces
the CPU overhead of the format helpers.


This argument is getting a bit tiring. The main reason is that it isn't
one, and:


CPU overhead isn't the driver behind this patchset, but if it is 
affected, why not say that in the commit message? There's a alloc/free 
pair for each updated scanline. For a full-screen updates, that's quite 
a bit.




   - we allocate something in the 10-20 range objects at a given commit,
 so another small one is not insignificant.

   - If it was, it would be trivial to produce a benchmark, but no-one
 ever actually showed a workload and numbers where there's actually
 any difference.

   - Also, the CPU overhead is indeed (even if marginally) decreased, but
 the memory overhead is increased. I don't know whether that's a good
 trade-off or not, see the point above.

It really sounds like an empty statement to me: "But just think of the
CPU!".

That being said, I also have more fundamental doubts about this series.

The first one is that storing the buffer pointer in the device instead
of the state makes it harder to reason about. When you have a state, the
framework provides the guarantee at commit time that there's only going
to be one at a given time. And since the buffer is stored in that state
object, you can't access it by mistake. The buffer size also depends on
the state, so that all makes sense from a logical PoV.


Yes. I discussed this with Javier already. Putting this into the state 
is the clean solution.




If we store the buffer into the device, then suddenly you have to think
about whether there's multiple CRTCs or not (since commits aren't
serialised if they affect different CRTCs), whether the buffer size you
allocated is large enough now for your current resolution and format,
etc. It adds a decent chunk of complexity on something that was quite
complex already.


It's in the device because it's good enough for these simple drivers. 
The next best place would be a dedicated plane structure in each driver. 
The per-plane cache would then be clearly attributed to a single plane.




I understand that the motivation is for drm_panic to have a buffer ready
when it has to kick in. But why don't we just allocate (and check) a
spare commit at probe time so we just have to commit it when we panic.


DRM panic doesn't commit anything. It picks up whatever the current 
scanout buffer is and draws into that. If the DRM driver cannot provide 
a scanout buffer, DRM panic does nothing.


Best regards
Thomas



That would fall nicely into the rest of the atomic modesetting code, and
since we pretty much require not to allocate anything during
atomic_commit, we have that constraints already figured out.

Maxime


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: [PATCH v2 0/5] drm: Reuse temporary memory for format conversion

2023-09-29 Thread Maxime Ripard
On Wed, Sep 20, 2023 at 04:24:26PM +0200, Thomas Zimmermann wrote:
> DRM's format-conversion helpers require temporary memory. Pass the
> buffer from the caller and keep it allocated over several calls. Allow
> the caller to preallocate the buffer memory.
> 
> The motivation for this patchset is the recent work on a DRM panic
> handler. The panic handler requires format conversion to display an
> error to the screen. But allocating memory during kernel panics is
> fragile. The changes in this patchset enable the DRM panic handler to
> preallocate buffer storage before the panic occurs.
> 
> As an additonal benefit, drivers can now keep the temporary storage
> across multiple display updates. Avoiding memory allocation reduces
> the CPU overhead of the format helpers.

This argument is getting a bit tiring. The main reason is that it isn't
one, and:

  - we allocate something in the 10-20 range objects at a given commit,
so another small one is not insignificant.

  - If it was, it would be trivial to produce a benchmark, but no-one
ever actually showed a workload and numbers where there's actually
any difference.

  - Also, the CPU overhead is indeed (even if marginally) decreased, but
the memory overhead is increased. I don't know whether that's a good
trade-off or not, see the point above.

It really sounds like an empty statement to me: "But just think of the
CPU!".

That being said, I also have more fundamental doubts about this series.

The first one is that storing the buffer pointer in the device instead
of the state makes it harder to reason about. When you have a state, the
framework provides the guarantee at commit time that there's only going
to be one at a given time. And since the buffer is stored in that state
object, you can't access it by mistake. The buffer size also depends on
the state, so that all makes sense from a logical PoV.

If we store the buffer into the device, then suddenly you have to think
about whether there's multiple CRTCs or not (since commits aren't
serialised if they affect different CRTCs), whether the buffer size you
allocated is large enough now for your current resolution and format,
etc. It adds a decent chunk of complexity on something that was quite
complex already.

I understand that the motivation is for drm_panic to have a buffer ready
when it has to kick in. But why don't we just allocate (and check) a
spare commit at probe time so we just have to commit it when we panic.

That would fall nicely into the rest of the atomic modesetting code, and
since we pretty much require not to allocate anything during
atomic_commit, we have that constraints already figured out.

Maxime


signature.asc
Description: PGP signature


Re: [PATCH v2 0/5] drm: Reuse temporary memory for format conversion

2023-09-26 Thread Jocelyn Falempe

On 20/09/2023 16:24, Thomas Zimmermann wrote:

DRM's format-conversion helpers require temporary memory. Pass the
buffer from the caller and keep it allocated over several calls. Allow
the caller to preallocate the buffer memory.

The motivation for this patchset is the recent work on a DRM panic
handler. The panic handler requires format conversion to display an
error to the screen. But allocating memory during kernel panics is
fragile. The changes in this patchset enable the DRM panic handler to
preallocate buffer storage before the panic occurs.

As an additonal benefit, drivers can now keep the temporary storage
across multiple display updates. Avoiding memory allocation reduces
the CPU overhead of the format helpers.

Patch 1 adds struct drm_xfrm_buf, a simple interface to pass around
the buffer storage. Patch 2 moves the memory management from the format
helpers into their callers. Drivers release the temporary storage at
the end of their display-update functions.

Patches 3 to 8 update three drivers to keep the allocated memory for
all of a device's lifetime. Managed cleanup releases the buffer as part
of releaseing the device. As additional benefit, buffer allocation now
happens in atomic_check helpers. The driver thus detects OOM errors
before the display update begins.


Thanks for the patches.
I'm still experimenting with drm_panic, and it's not clear if it will 
use that or not yet.
But it's already useful for other drivers, avoiding alloc/free for each 
frame, is a good thing.



Best regards,

--

Jocelyn




Tested with simpledrm.

v2:
* reserve storage during probing in the drivers

Thomas Zimmermann (5):
   drm/format-helper: Add struct drm_xfrm_buf to cache format conversion
   drm/format-helper: Pass xfrm buffer to format-conversion helpers
   drm/simpledrm: Store xfrm buffer in device instance
   drm/ofdrm: Store xfrm buffer in device instance
   drm/ssd130x: Store xfrm buffer in device instance

  drivers/gpu/drm/drm_format_helper.c   | 204 +-
  drivers/gpu/drm/drm_mipi_dbi.c|   7 +-
  drivers/gpu/drm/gud/gud_pipe.c|  21 +-
  drivers/gpu/drm/solomon/ssd130x.c |  16 +-
  drivers/gpu/drm/solomon/ssd130x.h |   3 +
  .../gpu/drm/tests/drm_format_helper_test.c|  33 +--
  drivers/gpu/drm/tiny/cirrus.c |   5 +-
  drivers/gpu/drm/tiny/ofdrm.c  |  11 +-
  drivers/gpu/drm/tiny/repaper.c|   5 +-
  drivers/gpu/drm/tiny/simpledrm.c  |  11 +-
  drivers/gpu/drm/tiny/st7586.c |   5 +-
  include/drm/drm_format_helper.h   |  74 +--
  12 files changed, 300 insertions(+), 95 deletions(-)





[PATCH v2 0/5] drm: Reuse temporary memory for format conversion

2023-09-20 Thread Thomas Zimmermann
DRM's format-conversion helpers require temporary memory. Pass the
buffer from the caller and keep it allocated over several calls. Allow
the caller to preallocate the buffer memory.

The motivation for this patchset is the recent work on a DRM panic
handler. The panic handler requires format conversion to display an
error to the screen. But allocating memory during kernel panics is
fragile. The changes in this patchset enable the DRM panic handler to
preallocate buffer storage before the panic occurs.

As an additonal benefit, drivers can now keep the temporary storage
across multiple display updates. Avoiding memory allocation reduces
the CPU overhead of the format helpers.

Patch 1 adds struct drm_xfrm_buf, a simple interface to pass around
the buffer storage. Patch 2 moves the memory management from the format
helpers into their callers. Drivers release the temporary storage at
the end of their display-update functions.

Patches 3 to 8 update three drivers to keep the allocated memory for
all of a device's lifetime. Managed cleanup releases the buffer as part
of releaseing the device. As additional benefit, buffer allocation now
happens in atomic_check helpers. The driver thus detects OOM errors
before the display update begins.

Tested with simpledrm.

v2:
* reserve storage during probing in the drivers

Thomas Zimmermann (5):
  drm/format-helper: Add struct drm_xfrm_buf to cache format conversion
  drm/format-helper: Pass xfrm buffer to format-conversion helpers
  drm/simpledrm: Store xfrm buffer in device instance
  drm/ofdrm: Store xfrm buffer in device instance
  drm/ssd130x: Store xfrm buffer in device instance

 drivers/gpu/drm/drm_format_helper.c   | 204 +-
 drivers/gpu/drm/drm_mipi_dbi.c|   7 +-
 drivers/gpu/drm/gud/gud_pipe.c|  21 +-
 drivers/gpu/drm/solomon/ssd130x.c |  16 +-
 drivers/gpu/drm/solomon/ssd130x.h |   3 +
 .../gpu/drm/tests/drm_format_helper_test.c|  33 +--
 drivers/gpu/drm/tiny/cirrus.c |   5 +-
 drivers/gpu/drm/tiny/ofdrm.c  |  11 +-
 drivers/gpu/drm/tiny/repaper.c|   5 +-
 drivers/gpu/drm/tiny/simpledrm.c  |  11 +-
 drivers/gpu/drm/tiny/st7586.c |   5 +-
 include/drm/drm_format_helper.h   |  74 +--
 12 files changed, 300 insertions(+), 95 deletions(-)

-- 
2.42.0