On Sun, Jul 12, 2020 at 07:16:13PM +0200, Frederic Cambus wrote:
> On Fri, Jun 26, 2020 at 07:42:50AM -0700, [email protected] wrote:
> > Optimized 32 bit character rendering with unrolled rows and pairwise
> > foreground / background pixel rendering.
> > 
> > If it weren't for the 5x8 font, I would have just assumed everything
> > was an even width and made the fallback path also pairwise.
> > 
> > In isolation, the 16x32 character case got 2x faster, but that wasn't
> > a huge real world speedup where the space rendering that was already
> > at memory bandwidth limits accounted for most of the character
> > rendering time.  However, in combination with the previous fast
> > conditional console scrolling that removes most of the space rendering,
> > it becomes significant.
> 
> On my Ryzen desktop with radeondrm, I don't see any improvements; the
> rasops_vcons_copyrows() optimizations seem to have made character
> plotting fast enough that it's no longer a bottleneck, which is
> definitely great.
> 
> cpu0: AMD Ryzen 7 2700 Eight-Core Processor, 3394.18 MHz, 17-08-02
> radeondrm0 at pci8 dev 0 function 0 "ATI Radeon HD 6450" rev 0x00
> radeondrm0: 1920x1080, 32bpp
> 
> On my T450 however, this diff makes cat'ing my usual test file [1] up
> to 20% faster with the default 12x24 font on the built-in 1600x900
> screen, which I think is significant enough for the diff to go in.
> 
> cpu0: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz, 2095.47 MHz, 06-3d-04
> inteldrm0 at pci0 dev 2 function 0 "Intel HD Graphics 5500" rev 0x09
> drm0 at inteldrm0
> inteldrm0: 1600x900, 32bpp
> 
> On my Cubieboard2 (armv7) I didn't notice any meaningful difference,
> which I assume is to be expected on a 32-bit platform. I suppose it's
> also reasonable to assume other 32-bit platforms (i386, hppa, macppc)
> will not see any regression beyond noise level?
>  
> Anyone willing to OK this diff?
> 
> [1] https://norvig.com/big.txt

I tested on the other 32-bit machine I have, an i386 machine with
inteldrm, and didn't notice any regression; it is actually up to 1.5%
faster.

It seems we can remove the 'q' variable and drop this assignment, as
it is not used:

                q = u.q[0];

The diff makes sense to me. I will commit it this week with some minor
style(9) fixes for the switch statement (don't indent the case), unless
I hear objections.
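
For anyone following along without the diff at hand, here is a rough
sketch of what "pairwise foreground / background pixel rendering" can
look like for a 32bpp framebuffer. This is not the code from the diff:
the function name render_row_pairwise() and the pair[] table are
hypothetical, and it assumes an even glyph width with the leftmost
pixel in the most significant bit of the font row. The idea is to
consume two font bits per step and copy a precomputed two-pixel
combination, instead of testing one bit per pixel.

```c
#include <stdint.h>
#include <string.h>

/*
 * Illustrative only (hypothetical helper, not taken from rasops).
 * Expand one font row into 32bpp pixels, two pixels per iteration.
 * 'bits' holds the row with the leftmost pixel in bit 31;
 * 'width' is assumed to be even and at most 32.
 */
static void
render_row_pairwise(uint32_t *dst, uint32_t bits, int width,
    uint32_t fg, uint32_t bg)
{
	/* All four two-pixel combinations, indexed by two font bits. */
	const uint32_t pair[4][2] = {
		{ bg, bg },	/* 00 */
		{ bg, fg },	/* 01 */
		{ fg, bg },	/* 10 */
		{ fg, fg }	/* 11 */
	};
	int x;

	for (x = 0; x < width; x += 2) {
		unsigned int idx = (bits >> 30) & 3;

		/* One 8-byte copy writes both pixels of the pair. */
		memcpy(&dst[x], pair[idx], sizeof(pair[idx]));
		bits <<= 2;
	}
}
```

An odd-width font such as 5x8 cannot use this loop as written, which
is why a per-pixel fallback path is still needed for that case.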
