On Fri, Jun 26, 2020 at 07:42:50AM -0700, jo...@armadilloaerospace.com wrote:
> Optimized 32 bit character rendering with unrolled rows and pairwise
> foreground / background pixel rendering.
> 
> If it weren't for the 5x8 font, I would have just assumed everything
> was an even width and made the fallback path also pairwise.
> 
> In isolation, the 16x32 character case got 2x faster, but that wasn't
> a huge real world speedup where the space rendering that was already
> at memory bandwidth limits accounted for most of the character
> rendering time.  However, in combination with the previous fast
> conditional console scrolling that removes most of the space rendering,
> it becomes significant.
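
For anyone following along who has not read the diff, here is a rough
sketch of what I understand "pairwise" rendering to mean. It is only an
illustration, not the code from the diff: put_row_pairwise() and its
parameters are made up for this example, and it assumes an even glyph
width with the row's font bits packed MSB first in a 32-bit word.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not the rasops32 code): draw one glyph row of
 * `width` pixels into dst, two pixels per loop iteration.
 * Assumes width is even and the font bits start at bit 31 of `bits`.
 */
static void
put_row_pairwise(uint32_t *dst, uint32_t bits, int width,
    uint32_t fg, uint32_t bg)
{
	/* pair[i] holds the two output pixels for the 2-bit pattern i. */
	const uint32_t pair[4][2] = {
		{ bg, bg }, { bg, fg }, { fg, bg }, { fg, fg }
	};
	int x;

	for (x = 0; x < width; x += 2) {
		/* Top two bits; the left pixel is the higher bit. */
		unsigned int idx = (bits >> 30) & 3;

		dst[x] = pair[idx][0];
		dst[x + 1] = pair[idx][1];
		bits <<= 2;
	}
}
```

The point of the table is that each pair of pixels costs one lookup
instead of two per-pixel foreground/background tests, which is also why
an odd-width font such as 5x8 needs a separate fallback path.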

On my Ryzen desktop with radeondrm, I don't see any improvement. The
rasops_vcons_copyrows() optimizations seem to have made character
plotting fast enough that it's no longer a bottleneck, which is
definitely great.

cpu0: AMD Ryzen 7 2700 Eight-Core Processor, 3394.18 MHz, 17-08-02
radeondrm0 at pci8 dev 0 function 0 "ATI Radeon HD 6450" rev 0x00
radeondrm0: 1920x1080, 32bpp

On my T450 however, this diff makes cat'ing my usual test file [1] up
to 20% faster with the default 12x24 font on the built-in 1600x900
screen, which I think is significant enough for the diff to go in.

cpu0: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz, 2095.47 MHz, 06-3d-04
inteldrm0 at pci0 dev 2 function 0 "Intel HD Graphics 5500" rev 0x09
drm0 at inteldrm0
inteldrm0: 1600x900, 32bpp

On my Cubieboard2 (armv7) I didn't notice any meaningful difference,
which I assume is to be expected on a 32-bit platform. I suppose it's
also reasonable to assume other 32-bit platforms (i386, hppa, macppc)
will not see any regression beyond noise level?
 
Anyone willing to OK this diff?

[1] https://norvig.com/big.txt
