Siarhei Siamashka <siarhei.siamas...@gmail.com> writes:

> From: Siarhei Siamashka <siarhei.siamas...@nokia.com>
> 
> Without this fix, setting PAD repeat on a source image prevents
> the use of any nonscaled standard fast paths, affecting performance
> a lot. But as long as no pixels outside the source image boundaries
> are touched by the compositing operation, all the repeat types
> behave the same and can take the same fast paths.
> 
> This is important because setting PAD repeat instead of NONE is
> more hardware acceleration friendly (for the drivers implementing
> RENDER extension) and does not inhibit OVER->SRC operator
> optimization in pixman.
> ---
> 
> We discussed this problem and the related code with Soeren on IRC the
> other day, and it may be that the code dealing with FAST_PATH_COVERS_CLIP
> and FAST_PATH_SAMPLES_COVER_CLIP would better to be changed a bit. Along
> with the selection of fast path functions dealing with SOLID sources or
> masks (for example, any transformation set on a SOLID source image does
> not actually make any difference for end result, so it can be ignored).
> And there is a special optimization for NORMAL repeat in 
> walk_region_internal()
> function which is also somehow related. Well, I guess this all stuff
> carried us away from the original problem :)
> 
> But now returning back to this issue. Ideally, PAD repeat would be a better
> default instead of NONE. Just because NONE repeat is a bit strange and
> hard to accelerate in hardware. This is one example from freedesktop.org
> bugzilla (and there are more similar reports):
> https://bugs.freedesktop.org/show_bug.cgi?id=27954
> 
> But right now setting PAD repeat for everying will cause serious performance
> problems for pixman, because it will stop using simple non-transformed fast
> paths.
> 
> This patch tries to address the problem. It passes current pixman tests and
> I think that it is most likely fine. But I'm still not totally sure if it is
> really safe in all possible cases.

I can't think of a case where it would be broken, but I don't think
it's really *right* either. The COVERS_CLIP flag is supposed to
indicate that the image doesn't have any undefined parts, which is
certainly the case when you have PAD or REFLECT set.

The fundamental problem is that the PIXMAN_STD_FAST_PATH macro is
mixing up several things. There are two kinds of images involved in
the standard fast paths: solids and bits. The problem is that the
required flags for those images are quite different.

A solid image has basically no requirements on the flags other than
the usual NO_ALPHAMAP, NO_CONVOLUTION etc. Generally, as long as it's
a solid color, the fast path will work.

The bits images have basically the requirement that we now call
SAMPLES_COVER_CLIP. That allows the biltters to simply walk the image
without worrying about reading out of bounds.

Then there is the 'simple repeat' stuff, which is an optimization that
notices that in some cases you can do a repeat operation by repeating
the compositing instead of sampling with wrap-around addressing. This
code has survived since the dawn of time, when pixman was called fb,
didn't have transformations, and NORMAL was the only repeat mode
available. Back then this code was the way to do repeating.

On IRC you pointed out that for 1xn repeated images, this simple
repeat stuff is not an optimization at all because it results in
terrible memory access patterns. It has historically had other
pathological cases, such as one it would end up calling a fast path a
zillion times with a 1x1 rectangle.

In fact, it was never written as an optimization, it was only left in
because it was assumed to be one. I don't think anyone has ever
measured the impact compared to the general path.

Considering that it is really at the wrong level in the pipeline too
(repeating conceptually happens during sampling, not after
compositing), maybe it's time to retire this code. There are cases
where it is probably still useful, but I'd rather avoid the headache
of thinking about special cases where it can't or shouldn't be used.

So, basically, what I think could be done is:

- Change the PIXMAN_STD_FAST_PATH to put in different flags when an
  image is solid vs. when it is bits. The flags in the solid case
  would be just the usual ones, in the bits case, it would be the
  usual ones plus SAMPLES_COVER_CLIP.

- Delete the COVERS_CLIP flag, since it would not be used in either
  case and isn't used elsewhere.

- Delete the simple_repeat stuff (along with the SIMPLE_REPEAT flags)

If there are specific benchmarks where the simple_repeat stuff made a
noticable difference, then it's probably better to just do a regular
fast path for them. A scanline approach would likely be faster than a
tiled approach anyway in those cases.



Soren
_______________________________________________
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman

Reply via email to