On Tue, Feb 02, 2021 at 10:13:14AM -0500, Phillip Susi wrote:
>
> Daniel Vetter writes:
>
> > Just a quick comment on this: since most framebuffers are
> > write-combining, and reads from them tend to be ~3 orders of magnitude
> > slower than writes (at least on the pile of machines I looked at here,
> > there's big differences, and some special streaming cpu instructions to
> > make the reading side not so slow).
> >
> > So scrolling by copying tends to be significantly slower than just
> > redrawing everything.
>
> I know this was the case years ago with AGP as iirc, it doubled (4x, 8x)
> the PCI clock rate but only for writes wasn't it?  I thought this was
> no longer an issue with PCIe, but if it is, then I guess I'll go ahead
> with cleaning up the dead code and having it re-render with the larger
> text buffer.
Still the same with PCIe. It has probably gotten worse, since uncached reads
are still just as slow, while write-combined writes have become even faster.

There's work going on to have a coherent link to gpus, which would allow
fully cached reads and writes: early on with nvlink, and now as a standard
with CXL (https://en.wikipedia.org/wiki/Compute_Express_Link). But that's
aimed at big compute jobs for servers, not really at display. Also, some
on-die gpus have become fully coherent, but again only for render/compute
buffers, not for the display framebuffer. So all together there are zero
signs this is changing going forward: reading from framebuffers is slow.

Ok, there are some exceptions: for manual-update buffers (defio for fbdev
drivers; drm also supports this with an entire set of helpers) the
framebuffer used by the cpu is sometimes (but very often still not) cached.
Imo not worth optimizing for, since the drivers where it is cached either
have no blitter, or drive really tiny panels behind spi links and similar,
so it's not going to be fast anyway.

-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
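[Editorial illustration, not code from fbcon or any driver: a minimal C
sketch of the two scrolling strategies contrasted above. The framebuffer
pointer, glyph dimensions and the trivial draw_glyph() helper are assumed
placeholders; the point is only which approach has to read pixels back out
of a write-combining mapping.]

/*
 * Sketch only: "fb" is assumed to point into a write-combining framebuffer
 * mapping, pitch_px is the line pitch in pixels.
 */
#include <stdint.h>
#include <string.h>

#define GLYPH_W 8
#define GLYPH_H 16

/*
 * Scroll by copying one text line up: memmove() has to read every pixel
 * back out of the write-combining mapping -- the slow path described above.
 */
static void scroll_by_copy(uint32_t *fb, int height, int pitch_px)
{
	memmove(fb, fb + (size_t)GLYPH_H * pitch_px,
		(size_t)(height - GLYPH_H) * pitch_px * sizeof(*fb));
}

/*
 * Stand-in for real font rendering: reads only system memory (the text
 * buffer) and writes to the framebuffer, never reads it back.
 */
static void draw_glyph(uint32_t *fb, int pitch_px, int row, int col, char ch)
{
	uint32_t colour = ch ? 0xffffffffu : 0x00000000u;

	for (int y = 0; y < GLYPH_H; y++)
		for (int x = 0; x < GLYPH_W; x++)
			fb[(row * GLYPH_H + y) * pitch_px +
			   col * GLYPH_W + x] = colour;
}

/*
 * Scroll by redrawing everything from the (already shifted) text buffer:
 * write-only as far as the framebuffer is concerned, so it stays on the
 * fast write-combined path.
 */
static void scroll_by_redraw(uint32_t *fb, int pitch_px,
			     int rows, int cols, const char *text)
{
	for (int r = 0; r < rows; r++)
		for (int c = 0; c < cols; c++)
			draw_glyph(fb, pitch_px, r, c, text[r * cols + c]);
}

[The memmove() over the visible area is exactly the framebuffer read-back
that hits the uncached/WC penalty, whereas the redraw path only ever writes,
which is why re-rendering from the larger text buffer tends to win.]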