On Tue, 27 Feb 2024 15:47:08 -0300
Arthur Grillo <arthurgri...@riseup.net> wrote:

> On 27/02/24 12:02, Louis Chauvet wrote:
> > On 26/02/24 - 10:07, Arthur Grillo wrote:  
> >>
> >>
> >> On 26/02/24 05:46, Louis Chauvet wrote:  
> >>> Add some documentation on pixel conversion functions.
> >>> Update of outdated comments for pixel_write functions.
> >>>
> >>> Signed-off-by: Louis Chauvet <louis.chau...@bootlin.com>
> >>> ---
> >>>  drivers/gpu/drm/vkms/vkms_composer.c |  4 +++
> >>>  drivers/gpu/drm/vkms/vkms_drv.h      | 13 ++++++++
> >>>  drivers/gpu/drm/vkms/vkms_formats.c  | 58 ++++++++++++++++++++++++++++++------
> >>>  3 files changed, 66 insertions(+), 9 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/vkms/vkms_composer.c b/drivers/gpu/drm/vkms/vkms_composer.c
> >>> index c6d9b4a65809..5b341222d239 100644
> >>> --- a/drivers/gpu/drm/vkms/vkms_composer.c
> >>> +++ b/drivers/gpu/drm/vkms/vkms_composer.c
> >>> @@ -189,6 +189,10 @@ static void blend(struct vkms_writeback_job *wb,
> >>>  
> >>>   size_t crtc_y_limit = crtc_state->base.crtc->mode.vdisplay;
> >>>  
> >>> + /*
> >>> +  * The planes are composed line-by-line. It is a necessary complexity to avoid poor
> >>> +  * blending performance.  
> >>
> >> At this point in the series, you have not yet reintroduced the
> >> line-by-line algorithm. Maybe it's better to add this comment when
> >> you do.  
> > 
> > Is it better with this:
> > 
> >     /*
> >      * The planes are composed line-by-line to avoid heavy memory usage. It is a necessary
> >      * complexity to avoid poor blending performance.
> >      *
> >      * The function vkms_compose_row is used to read a line, pixel-by-pixel, into the staging
> >      * buffer.
> >      */
> >    
> >> Also, I think it's good to give more context, like:
> >> "The planes are composed line-by-line, instead of pixel-by-pixel"  
> > 
> > And after PATCHv3 5/9:
> > 
> >     /*
> >      * The planes are composed line-by-line to avoid heavy memory usage. It is a necessary
> >      * complexity to avoid poor blending performance.
> >      *
> >      * The function pixel_read_line callback is used to read a line, using an efficient
> >      * algorithm for a specific format, into the staging buffer.
> >      */
> >   

Hi,

there are a few reasons for the line-by-line algorithm, and the
optimizations at large:

VKMS uses temporary stage and output buffers so that blending functions
can operate on just one high-precision pixel format, struct
pixel_argb_u16. We can make pixel-format-specific read and write
functions completely orthogonal to the blending operations and FB
format combinations. This avoids a combinatorial explosion of needed
functions for { input pixel formats × blending operations × output pixel
formats }.
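
As a rough illustration of that orthogonality, here is a userspace
sketch. It is not the VKMS code: apart from struct pixel_argb_u16,
every name below is made up for the example, the byte layouts are
assumptions, and the blend is a plain premultiplied-alpha "source over".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LINE 64 /* arbitrary line length limit for this sketch */

/* Common high-precision format; field layout assumed for this sketch. */
struct pixel_argb_u16 {
	uint16_t a, r, g, b;
};

/* Hypothetical callback types: one reader and one writer per pixel format. */
typedef void (*read_line_fn)(const uint8_t *src, struct pixel_argb_u16 *out, size_t n);
typedef void (*write_line_fn)(const struct pixel_argb_u16 *in, uint8_t *dst, size_t n);

/* Bytes A,R,G,B -> u16; x * 257 widens 8 bits to 16 bits (0xff -> 0xffff). */
static void read_argb8888(const uint8_t *src, struct pixel_argb_u16 *out, size_t n)
{
	for (size_t i = 0; i < n; i++, src += 4) {
		out[i].a = (uint16_t)(src[0] * 257);
		out[i].r = (uint16_t)(src[1] * 257);
		out[i].g = (uint16_t)(src[2] * 257);
		out[i].b = (uint16_t)(src[3] * 257);
	}
}

/* Bytes X,R,G,B -> u16; the X byte is ignored and the pixel is opaque. */
static void read_xrgb8888(const uint8_t *src, struct pixel_argb_u16 *out, size_t n)
{
	for (size_t i = 0; i < n; i++, src += 4) {
		out[i].a = 0xffff;
		out[i].r = (uint16_t)(src[1] * 257);
		out[i].g = (uint16_t)(src[2] * 257);
		out[i].b = (uint16_t)(src[3] * 257);
	}
}

/* u16 -> bytes A,R,G,B, rounding to the nearest 8-bit value. */
static void write_argb8888(const struct pixel_argb_u16 *in, uint8_t *dst, size_t n)
{
	for (size_t i = 0; i < n; i++, dst += 4) {
		dst[0] = (uint8_t)((in[i].a + 128) / 257);
		dst[1] = (uint8_t)((in[i].r + 128) / 257);
		dst[2] = (uint8_t)((in[i].g + 128) / 257);
		dst[3] = (uint8_t)((in[i].b + 128) / 257);
	}
}

/* The only blend code: premultiplied-alpha "source over", per channel. */
static uint16_t blend_channel(uint16_t src, uint16_t dst, uint16_t src_alpha)
{
	uint32_t v = src + (uint32_t)dst * (0xffffu - src_alpha) / 0xffffu;

	return v > 0xffff ? 0xffff : (uint16_t)v;
}

static void blend_line(const struct pixel_argb_u16 *src, struct pixel_argb_u16 *dst, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		uint16_t alpha = src[i].a;

		dst[i].a = blend_channel(src[i].a, dst[i].a, alpha);
		dst[i].r = blend_channel(src[i].r, dst[i].r, alpha);
		dst[i].g = blend_channel(src[i].g, dst[i].g, alpha);
		dst[i].b = blend_channel(src[i].b, dst[i].b, alpha);
	}
}

/* Any reader and any writer can be combined with the single blend function. */
static void blend_and_write(read_line_fn rd, const uint8_t *src,
			    struct pixel_argb_u16 *acc, write_line_fn wr,
			    uint8_t *dst, size_t n)
{
	struct pixel_argb_u16 stage[MAX_LINE];

	assert(n <= MAX_LINE);
	rd(src, stage, n);
	blend_line(stage, acc, n);
	wr(acc, dst, n);
}
```

Adding a new input format here means adding one reader, and adding a
new blending operation means adding one function on pixel_argb_u16;
neither touches the other, nor the writers.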

The temporary stage and output buffers only need to hold one line each,
not a whole FB or CRTC-sized frame. This is where the memory savings
come from.

Using a temporary output buffer also avoids repeated
read-decode-blend-encode-write cycles on the final destination
buffer, as intermediate results never need to be decoded from or
encoded to the destination pixel format.
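
To make the buffer handling concrete, a minimal userspace sketch of
such a composition loop could look like the following. The struct
plane, the function names, MAX_WIDTH and the A,R,G,B byte layout are
all assumptions made for the example, not the actual VKMS code, and
every plane is assumed to cover the whole output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_WIDTH 256 /* arbitrary limit for this sketch */

struct pixel_argb_u16 {
	uint16_t a, r, g, b;
};

/* Hypothetical plane: premultiplied A,R,G,B bytes, same size as the output. */
struct plane {
	const uint8_t *pixels;
};

/* Decode line y of a plane into the stage buffer. */
static void read_line(const struct plane *p, size_t y, size_t width,
		      struct pixel_argb_u16 *stage)
{
	const uint8_t *src = p->pixels + y * width * 4;

	for (size_t x = 0; x < width; x++, src += 4) {
		stage[x].a = (uint16_t)(src[0] * 257);
		stage[x].r = (uint16_t)(src[1] * 257);
		stage[x].g = (uint16_t)(src[2] * 257);
		stage[x].b = (uint16_t)(src[3] * 257);
	}
}

/* Premultiplied-alpha "source over" for one channel. */
static uint16_t blend_channel(uint16_t src, uint16_t dst, uint16_t src_alpha)
{
	uint32_t v = src + (uint32_t)dst * (0xffffu - src_alpha) / 0xffffu;

	return v > 0xffff ? 0xffff : (uint16_t)v;
}

/* Blend the stage buffer on top of the output buffer, both in u16. */
static void blend_line(const struct pixel_argb_u16 *stage,
		       struct pixel_argb_u16 *output, size_t width)
{
	for (size_t x = 0; x < width; x++) {
		uint16_t alpha = stage[x].a;

		output[x].a = blend_channel(stage[x].a, output[x].a, alpha);
		output[x].r = blend_channel(stage[x].r, output[x].r, alpha);
		output[x].g = blend_channel(stage[x].g, output[x].g, alpha);
		output[x].b = blend_channel(stage[x].b, output[x].b, alpha);
	}
}

/* Encode the finished output line into the destination format. */
static void write_line(const struct pixel_argb_u16 *output, uint8_t *dst, size_t width)
{
	for (size_t x = 0; x < width; x++, dst += 4) {
		dst[0] = (uint8_t)((output[x].a + 128) / 257);
		dst[1] = (uint8_t)((output[x].r + 128) / 257);
		dst[2] = (uint8_t)((output[x].g + 128) / 257);
		dst[3] = (uint8_t)((output[x].b + 128) / 257);
	}
}

static void compose(const struct plane *planes, size_t n_planes,
		    size_t width, size_t height, uint8_t *dst)
{
	/* Two line-sized buffers, however tall the frame is. */
	struct pixel_argb_u16 stage[MAX_WIDTH];
	struct pixel_argb_u16 output[MAX_WIDTH];

	assert(width <= MAX_WIDTH);

	for (size_t y = 0; y < height; y++) {
		/* Start each line from transparent black. */
		memset(output, 0, width * sizeof(output[0]));

		for (size_t i = 0; i < n_planes; i++) {
			read_line(&planes[i], y, width, stage);
			blend_line(stage, output, width);
		}

		/* The destination is encoded once per line and never decoded. */
		write_line(output, dst + y * width * 4, width);
	}
}
```

However many planes are stacked, the destination buffer is written
exactly once per line, and all intermediate blending happens in the
u16 output buffer.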

Finally, doing elementary operations (read, blend, write) line-by-line
is much more efficient than pixel-by-pixel, because it allows making
the inner-most loop very tight. It avoids repeatedly computing a result
that does not change, like which function to call for a specific pixel
format or blending equation.
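
The difference can be sketched like this in userspace C. The enum,
the function names and the byte layouts are hypothetical and only
serve the illustration; the point is where the per-format decision
is made.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pixel_argb_u16 {
	uint16_t a, r, g, b;
};

enum pixel_format { FMT_ARGB8888, FMT_XRGB8888 }; /* hypothetical */

/* Pixel-by-pixel: the format decision is made again for every pixel. */
static void read_pixel(enum pixel_format fmt, const uint8_t *src,
		       struct pixel_argb_u16 *out)
{
	switch (fmt) {
	case FMT_ARGB8888:
		out->a = (uint16_t)(src[0] * 257);
		break;
	case FMT_XRGB8888:
		out->a = 0xffff;
		break;
	}
	out->r = (uint16_t)(src[1] * 257);
	out->g = (uint16_t)(src[2] * 257);
	out->b = (uint16_t)(src[3] * 257);
}

static void read_line_per_pixel(enum pixel_format fmt, const uint8_t *src,
				struct pixel_argb_u16 *out, size_t n)
{
	for (size_t i = 0; i < n; i++)
		read_pixel(fmt, src + 4 * i, &out[i]);
}

/* Line-by-line: one tight loop per format, no decision inside the loop. */
static void read_line_argb8888(const uint8_t *src, struct pixel_argb_u16 *out, size_t n)
{
	for (size_t i = 0; i < n; i++, src += 4) {
		out[i].a = (uint16_t)(src[0] * 257);
		out[i].r = (uint16_t)(src[1] * 257);
		out[i].g = (uint16_t)(src[2] * 257);
		out[i].b = (uint16_t)(src[3] * 257);
	}
}

static void read_line_xrgb8888(const uint8_t *src, struct pixel_argb_u16 *out, size_t n)
{
	for (size_t i = 0; i < n; i++, src += 4) {
		out[i].a = 0xffff;
		out[i].r = (uint16_t)(src[1] * 257);
		out[i].g = (uint16_t)(src[2] * 257);
		out[i].b = (uint16_t)(src[3] * 257);
	}
}

typedef void (*read_line_fn)(const uint8_t *src, struct pixel_argb_u16 *out, size_t n);

static read_line_fn get_read_line(enum pixel_format fmt)
{
	switch (fmt) {
	case FMT_ARGB8888:
		return read_line_argb8888;
	case FMT_XRGB8888:
		return read_line_xrgb8888;
	}
	return NULL;
}

static void read_line(enum pixel_format fmt, const uint8_t *src,
		      struct pixel_argb_u16 *out, size_t n)
{
	/* Decided once per line here; it could just as well be once per plane. */
	read_line_fn fn = get_read_line(fmt);

	assert(fn != NULL);
	fn(src, out, n);
}
```

Both variants produce the same pixels; the line-based one simply moves
the lookup out of the inner loop, which then only contains the
conversion arithmetic.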


Thanks,
pq
