Siarhei Siamashka <siarhei.siamas...@gmail.com> writes: > From: Siarhei Siamashka <siarhei.siamas...@nokia.com> > > Without this fix, setting PAD repeat on a source image prevents > the use of any nonscaled standard fast paths, affecting performance > a lot. But as long as no pixels outside the source image boundaries > are touched by the compositing operation, all the repeat types > behave the same and can take the same fast paths. > > This is important because setting PAD repeat instead of NONE is > more hardware acceleration friendly (for the drivers implementing > RENDER extension) and does not inhibit OVER->SRC operator > optimization in pixman. > --- > > We discussed this problem and the related code with Soeren on IRC the > other day, and it may be that the code dealing with FAST_PATH_COVERS_CLIP > and FAST_PATH_SAMPLES_COVER_CLIP would better to be changed a bit. Along > with the selection of fast path functions dealing with SOLID sources or > masks (for example, any transformation set on a SOLID source image does > not actually make any difference for end result, so it can be ignored). > And there is a special optimization for NORMAL repeat in > walk_region_internal() > function which is also somehow related. Well, I guess this all stuff > carried us away from the original problem :) > > But now returning back to this issue. Ideally, PAD repeat would be a better > default instead of NONE. Just because NONE repeat is a bit strange and > hard to accelerate in hardware. This is one example from freedesktop.org > bugzilla (and there are more similar reports): > https://bugs.freedesktop.org/show_bug.cgi?id=27954 > > But right now setting PAD repeat for everying will cause serious performance > problems for pixman, because it will stop using simple non-transformed fast > paths. > > This patch tries to address the problem. It passes current pixman tests and > I think that it is most likely fine. But I'm still not totally sure if it is > really safe in all possible cases.
I can't think of a case where it would be broken, but I don't think it's really *right* either. The COVERS_CLIP flag is supposed to indicate that the image doesn't have any undefined parts, which is certainly the case when you have PAD or REFLECT set. The fundamental problem is that the PIXMAN_STD_FAST_PATH macro is mixing up several things. There are two kinds of images involved in the standard fast paths: solids and bits. The problem is that the required flags for those images are quite different. A solid image has basically no requirements on the flags other than the usual NO_ALPHAMAP, NO_CONVOLUTION etc. Generally, as long as it's a solid color, the fast path will work. The bits images have basically the requirement that we now call SAMPLES_COVER_CLIP. That allows the biltters to simply walk the image without worrying about reading out of bounds. Then there is the 'simple repeat' stuff, which is an optimization that notices that in some cases you can do a repeat operation by repeating the compositing instead of sampling with wrap-around addressing. This code has survived since the dawn of time, when pixman was called fb, didn't have transformations, and NORMAL was the only repeat mode available. Back then this code was the way to do repeating. On IRC you pointed out that for 1xn repeated images, this simple repeat stuff is not an optimization at all because it results in terrible memory access patterns. It has historically had other pathological cases, such as one it would end up calling a fast path a zillion times with a 1x1 rectangle. In fact, it was never written as an optimization, it was only left in because it was assumed to be one. I don't think anyone has ever measured the impact compared to the general path. Considering that it is really at the wrong level in the pipeline too (repeating conceptually happens during sampling, not after compositing), maybe it's time to retire this code. There are cases where it is probably still useful, but I'd rather avoid the headache of thinking about special cases where it can't or shouldn't be used. So, basically, what I think could be done is: - Change the PIXMAN_STD_FAST_PATH to put in different flags when an image is solid vs. when it is bits. The flags in the solid case would be just the usual ones, in the bits case, it would be the usual ones plus SAMPLES_COVER_CLIP. - Delete the COVERS_CLIP flag, since it would not be used in either case and isn't used elsewhere. - Delete the simple_repeat stuff (along with the SIMPLE_REPEAT flags) If there are specific benchmarks where the simple_repeat stuff made a noticable difference, then it's probably better to just do a regular fast path for them. A scanline approach would likely be faster than a tiled approach anyway in those cases. Soren _______________________________________________ Pixman mailing list Pixman@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/pixman