Re: [PATCH v2 0/5] drm: Reuse temporary memory for format conversion
On Fri, Sep 29, 2023 at 04:58:30PM +0200, Thomas Zimmermann wrote: > Am 29.09.23 um 14:11 schrieb Maxime Ripard: > > On Wed, Sep 20, 2023 at 04:24:26PM +0200, Thomas Zimmermann wrote: > > > DRM's format-conversion helpers require temporary memory. Pass the > > > buffer from the caller and keep it allocated over several calls. Allow > > > the caller to preallocate the buffer memory. > > > > > > The motivation for this patchset is the recent work on a DRM panic > > > handler. The panic handler requires format conversion to display an > > > error to the screen. But allocating memory during kernel panics is > > > fragile. The changes in this patchset enable the DRM panic handler to > > > preallocate buffer storage before the panic occurs. > > > > > > As an additonal benefit, drivers can now keep the temporary storage > > > across multiple display updates. Avoiding memory allocation reduces > > > the CPU overhead of the format helpers. > > > > This argument is getting a bit tiring. The main reason is that it isn't > > one, and: > > CPU overhead isn't the driver behind this patchset, but if it is affected, > why not say that in the commit message? Any patch affects the CPU overhead then, one way or the other. > There's a alloc/free pair for each updated scanline. For a full-screen > updates, that's quite a bit. > > > > >- we allocate something in the 10-20 range objects at a given commit, > > so another small one is not insignificant. > > > >- If it was, it would be trivial to produce a benchmark, but no-one > > ever actually showed a workload and numbers where there's actually > > any difference. > > > >- Also, the CPU overhead is indeed (even if marginally) decreased, but > > the memory overhead is increased. I don't know whether that's a good > > trade-off or not, see the point above. > > > > It really sounds like an empty statement to me: "But just think of the > > CPU!". > > > > That being said, I also have more fundamental doubts about this series. > > > > The first one is that storing the buffer pointer in the device instead > > of the state makes it harder to reason about. When you have a state, the > > framework provides the guarantee at commit time that there's only going > > to be one at a given time. And since the buffer is stored in that state > > object, you can't access it by mistake. The buffer size also depends on > > the state, so that all makes sense from a logical PoV. > > Yes. I discussed this with Javier already. Putting this into the state is > the clean solution. > > > > > If we store the buffer into the device, then suddenly you have to think > > about whether there's multiple CRTCs or not (since commits aren't > > serialised if they affect different CRTCs), whether the buffer size you > > allocated is large enough now for your current resolution and format, > > etc. It adds a decent chunk of complexity on something that was quite > > complex already. > > It's in the device because it's good enough for these simple drivers. The > next best place would be a dedicated plane structure in each driver. The > per-plane cache would then be clearly attributed to a single plane. Right, but you still need to think about all that before figuring it out. And we now have simple drivers very likely to be used as an example unsafe-but-actually-aren't. When copied and pasted into a different context, it might not be safe anymore. And since those copy/pasting that code are super experienced, they won't know about it. > > I understand that the motivation is for drm_panic to have a buffer ready > > when it has to kick in. But why don't we just allocate (and check) a > > spare commit at probe time so we just have to commit it when we panic. > > DRM panic doesn't commit anything. It picks up whatever the current scanout > buffer is and draws into that. If the DRM driver cannot provide a scanout > buffer, DRM panic does nothing. My point is that it could do a commit if we have prepared everything. Or we could steal the current buffer. Or we could pre-allocate a dumb buffer for every device. You'll have to interact the state anyway for any proper driver, and I don't think just allocating it here like you do is safe and enough. Maxime signature.asc Description: PGP signature
Re: [PATCH v2 0/5] drm: Reuse temporary memory for format conversion
Hi Am 29.09.23 um 14:11 schrieb Maxime Ripard: On Wed, Sep 20, 2023 at 04:24:26PM +0200, Thomas Zimmermann wrote: DRM's format-conversion helpers require temporary memory. Pass the buffer from the caller and keep it allocated over several calls. Allow the caller to preallocate the buffer memory. The motivation for this patchset is the recent work on a DRM panic handler. The panic handler requires format conversion to display an error to the screen. But allocating memory during kernel panics is fragile. The changes in this patchset enable the DRM panic handler to preallocate buffer storage before the panic occurs. As an additonal benefit, drivers can now keep the temporary storage across multiple display updates. Avoiding memory allocation reduces the CPU overhead of the format helpers. This argument is getting a bit tiring. The main reason is that it isn't one, and: CPU overhead isn't the driver behind this patchset, but if it is affected, why not say that in the commit message? There's a alloc/free pair for each updated scanline. For a full-screen updates, that's quite a bit. - we allocate something in the 10-20 range objects at a given commit, so another small one is not insignificant. - If it was, it would be trivial to produce a benchmark, but no-one ever actually showed a workload and numbers where there's actually any difference. - Also, the CPU overhead is indeed (even if marginally) decreased, but the memory overhead is increased. I don't know whether that's a good trade-off or not, see the point above. It really sounds like an empty statement to me: "But just think of the CPU!". That being said, I also have more fundamental doubts about this series. The first one is that storing the buffer pointer in the device instead of the state makes it harder to reason about. When you have a state, the framework provides the guarantee at commit time that there's only going to be one at a given time. And since the buffer is stored in that state object, you can't access it by mistake. The buffer size also depends on the state, so that all makes sense from a logical PoV. Yes. I discussed this with Javier already. Putting this into the state is the clean solution. If we store the buffer into the device, then suddenly you have to think about whether there's multiple CRTCs or not (since commits aren't serialised if they affect different CRTCs), whether the buffer size you allocated is large enough now for your current resolution and format, etc. It adds a decent chunk of complexity on something that was quite complex already. It's in the device because it's good enough for these simple drivers. The next best place would be a dedicated plane structure in each driver. The per-plane cache would then be clearly attributed to a single plane. I understand that the motivation is for drm_panic to have a buffer ready when it has to kick in. But why don't we just allocate (and check) a spare commit at probe time so we just have to commit it when we panic. DRM panic doesn't commit anything. It picks up whatever the current scanout buffer is and draws into that. If the DRM driver cannot provide a scanout buffer, DRM panic does nothing. Best regards Thomas That would fall nicely into the rest of the atomic modesetting code, and since we pretty much require not to allocate anything during atomic_commit, we have that constraints already figured out. Maxime -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Frankenstrasse 146, 90461 Nuernberg, Germany GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman HRB 36809 (AG Nuernberg) OpenPGP_signature.asc Description: OpenPGP digital signature
Re: [PATCH v2 0/5] drm: Reuse temporary memory for format conversion
On Wed, Sep 20, 2023 at 04:24:26PM +0200, Thomas Zimmermann wrote: > DRM's format-conversion helpers require temporary memory. Pass the > buffer from the caller and keep it allocated over several calls. Allow > the caller to preallocate the buffer memory. > > The motivation for this patchset is the recent work on a DRM panic > handler. The panic handler requires format conversion to display an > error to the screen. But allocating memory during kernel panics is > fragile. The changes in this patchset enable the DRM panic handler to > preallocate buffer storage before the panic occurs. > > As an additonal benefit, drivers can now keep the temporary storage > across multiple display updates. Avoiding memory allocation reduces > the CPU overhead of the format helpers. This argument is getting a bit tiring. The main reason is that it isn't one, and: - we allocate something in the 10-20 range objects at a given commit, so another small one is not insignificant. - If it was, it would be trivial to produce a benchmark, but no-one ever actually showed a workload and numbers where there's actually any difference. - Also, the CPU overhead is indeed (even if marginally) decreased, but the memory overhead is increased. I don't know whether that's a good trade-off or not, see the point above. It really sounds like an empty statement to me: "But just think of the CPU!". That being said, I also have more fundamental doubts about this series. The first one is that storing the buffer pointer in the device instead of the state makes it harder to reason about. When you have a state, the framework provides the guarantee at commit time that there's only going to be one at a given time. And since the buffer is stored in that state object, you can't access it by mistake. The buffer size also depends on the state, so that all makes sense from a logical PoV. If we store the buffer into the device, then suddenly you have to think about whether there's multiple CRTCs or not (since commits aren't serialised if they affect different CRTCs), whether the buffer size you allocated is large enough now for your current resolution and format, etc. It adds a decent chunk of complexity on something that was quite complex already. I understand that the motivation is for drm_panic to have a buffer ready when it has to kick in. But why don't we just allocate (and check) a spare commit at probe time so we just have to commit it when we panic. That would fall nicely into the rest of the atomic modesetting code, and since we pretty much require not to allocate anything during atomic_commit, we have that constraints already figured out. Maxime signature.asc Description: PGP signature
Re: [PATCH v2 0/5] drm: Reuse temporary memory for format conversion
On 20/09/2023 16:24, Thomas Zimmermann wrote: DRM's format-conversion helpers require temporary memory. Pass the buffer from the caller and keep it allocated over several calls. Allow the caller to preallocate the buffer memory. The motivation for this patchset is the recent work on a DRM panic handler. The panic handler requires format conversion to display an error to the screen. But allocating memory during kernel panics is fragile. The changes in this patchset enable the DRM panic handler to preallocate buffer storage before the panic occurs. As an additonal benefit, drivers can now keep the temporary storage across multiple display updates. Avoiding memory allocation reduces the CPU overhead of the format helpers. Patch 1 adds struct drm_xfrm_buf, a simple interface to pass around the buffer storage. Patch 2 moves the memory management from the format helpers into their callers. Drivers release the temporary storage at the end of their display-update functions. Patches 3 to 8 update three drivers to keep the allocated memory for all of a device's lifetime. Managed cleanup releases the buffer as part of releaseing the device. As additional benefit, buffer allocation now happens in atomic_check helpers. The driver thus detects OOM errors before the display update begins. Thanks for the patches. I'm still experimenting with drm_panic, and it's not clear if it will use that or not yet. But it's already useful for other drivers, avoiding alloc/free for each frame, is a good thing. Best regards, -- Jocelyn Tested with simpledrm. v2: * reserve storage during probing in the drivers Thomas Zimmermann (5): drm/format-helper: Add struct drm_xfrm_buf to cache format conversion drm/format-helper: Pass xfrm buffer to format-conversion helpers drm/simpledrm: Store xfrm buffer in device instance drm/ofdrm: Store xfrm buffer in device instance drm/ssd130x: Store xfrm buffer in device instance drivers/gpu/drm/drm_format_helper.c | 204 +- drivers/gpu/drm/drm_mipi_dbi.c| 7 +- drivers/gpu/drm/gud/gud_pipe.c| 21 +- drivers/gpu/drm/solomon/ssd130x.c | 16 +- drivers/gpu/drm/solomon/ssd130x.h | 3 + .../gpu/drm/tests/drm_format_helper_test.c| 33 +-- drivers/gpu/drm/tiny/cirrus.c | 5 +- drivers/gpu/drm/tiny/ofdrm.c | 11 +- drivers/gpu/drm/tiny/repaper.c| 5 +- drivers/gpu/drm/tiny/simpledrm.c | 11 +- drivers/gpu/drm/tiny/st7586.c | 5 +- include/drm/drm_format_helper.h | 74 +-- 12 files changed, 300 insertions(+), 95 deletions(-)
[PATCH v2 0/5] drm: Reuse temporary memory for format conversion
DRM's format-conversion helpers require temporary memory. Pass the buffer from the caller and keep it allocated over several calls. Allow the caller to preallocate the buffer memory. The motivation for this patchset is the recent work on a DRM panic handler. The panic handler requires format conversion to display an error to the screen. But allocating memory during kernel panics is fragile. The changes in this patchset enable the DRM panic handler to preallocate buffer storage before the panic occurs. As an additonal benefit, drivers can now keep the temporary storage across multiple display updates. Avoiding memory allocation reduces the CPU overhead of the format helpers. Patch 1 adds struct drm_xfrm_buf, a simple interface to pass around the buffer storage. Patch 2 moves the memory management from the format helpers into their callers. Drivers release the temporary storage at the end of their display-update functions. Patches 3 to 8 update three drivers to keep the allocated memory for all of a device's lifetime. Managed cleanup releases the buffer as part of releaseing the device. As additional benefit, buffer allocation now happens in atomic_check helpers. The driver thus detects OOM errors before the display update begins. Tested with simpledrm. v2: * reserve storage during probing in the drivers Thomas Zimmermann (5): drm/format-helper: Add struct drm_xfrm_buf to cache format conversion drm/format-helper: Pass xfrm buffer to format-conversion helpers drm/simpledrm: Store xfrm buffer in device instance drm/ofdrm: Store xfrm buffer in device instance drm/ssd130x: Store xfrm buffer in device instance drivers/gpu/drm/drm_format_helper.c | 204 +- drivers/gpu/drm/drm_mipi_dbi.c| 7 +- drivers/gpu/drm/gud/gud_pipe.c| 21 +- drivers/gpu/drm/solomon/ssd130x.c | 16 +- drivers/gpu/drm/solomon/ssd130x.h | 3 + .../gpu/drm/tests/drm_format_helper_test.c| 33 +-- drivers/gpu/drm/tiny/cirrus.c | 5 +- drivers/gpu/drm/tiny/ofdrm.c | 11 +- drivers/gpu/drm/tiny/repaper.c| 5 +- drivers/gpu/drm/tiny/simpledrm.c | 11 +- drivers/gpu/drm/tiny/st7586.c | 5 +- include/drm/drm_format_helper.h | 74 +-- 12 files changed, 300 insertions(+), 95 deletions(-) -- 2.42.0