Re: [Pixman] [PATCH 1/4] Change conditions for setting FAST_PATH_SAMPLES_COVER_CLIP flags

2015-09-16 Thread Ben Avison

On Wed, 16 Sep 2015 03:09:50 +0100, Søren Sandmann  
wrote:

If all existing fast paths and iterators can be
changed/verified to not read outside their bounds, I have no objection
to changing the meaning of the existing flag (or adding a new one if
that's more convenient).


OK, it sounds like the way to go is use two flags for at least a
transitional period. Then if and when all fast paths and iterators are
changed, then we can delete the old flag.


I realized that there is another use for a new flag like this: It could
likely also be used here:

http://cgit.freedesktop.org/pixman/tree/pixman/pixman.c#n662

so that images would be considered opaque in more cases, leading to more
cases of reducing the operator from OVER to SRC.


I had come across that when searching the source code for references to
the COVER_CLIP flags, but had mentally filed it away because I wasn't
sure at which point it would be appropriate to switch it over to using
the new bilinear flag. But having thought about it a bit more, I agree it
could use the new definition straight away, even before any fast paths or
iterators do.

Ben
___
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman


Re: [Pixman] [PATCH 1/4] Change conditions for setting FAST_PATH_SAMPLES_COVER_CLIP flags

2015-09-15 Thread Søren Sandmann
Ben Avison  writes:

> For what it's worth, my experience has been that treating the operation
> as a two-pass operation (separable into horizontal/vertical passes)
> generally outperforms the 2x2 convolution filter approach by a wide
> margin. But let's say we still try to describe it in those terms.

Yes, especially for upscaling, where the same scanline can be reused
many times; this approach is used in a couple of places in pixman
already: The SSSE3 and 'fast' bilinear iterators work this way.

But this is an optimization/implementation detail. Those iterators still
act as if they were doing convolutions with 2x2 matrices.

> You're making an assumption that the coefficients of that 2x2 matrix are
>
> / (1 - frac(x)) . (1 - frac(y)) frac(x) . (1 - frac(y)) \
> |   |
> \ (1 - frac(x)) . frac(y)   frac(x) . frac(y)   /
>
> However, other definitions will work, such as
>
> / frac(-x) . frac(-y)   (1 - frac(-x)) . frac(-y)   \
> |   |
> \ frac(-x) . (1 - frac(-y)) (1 - frac(-x)) . (1 - frac(-y)) /
>
> These matrix coefficients are not part of the public Pixman API, but they
> do affect how you align the matrix. In the first case, the position of
> the NW coefficient is always found by rounding *down* the coordinates, as
> you say; in the second, the position of the SE coefficient is always
> found by rounding *up* the coordinates.
>
> You might think this makes little difference, but as I've described
> previously, there are benefits to the latter approach.

That is a good point, and it may actually be the explanation for why I
left out BILINEAR from the rounding.txt document: All reasonable choices
of rounding will lead to precisely the same results, and the aim of the
document was to describe rounding, not filtering.

> If you're only going to have one flag, it makes sense for it to be the
> most inclusive one. That way, any given fast path can always add
> additional checks to detect cases they can't handle and pass them on to a
> secondary routine. This is effectively the approach I took here:

Well, the fast path system is based on the idea that each fast path
describes what it wants from the operation, and if it matches, it should
do what it claimed to be capable of and not pass things off to secondary
routines. There are known issues with this system of course, but IMO ad
hoc workarounds inside the fast paths themselves will quickly turn into
a mess.

> http://lists.freedesktop.org/archives/pixman/2015-September/003946.html
>
> It can't be done the other way round - once we have filtered out an
> operation for consideration of bilinear fast paths, there's no way a fast
> path can say "actually, yes I could have handled that if you had asked
> me".
>
>> A separate possibility is a flag that says "all pixels whose weights are
>> non-zero are inside the borders of the source image". Is this useful
>> information? It might be, and if so, it could be conveyed through some
>> new flag, though I'd echo Siarhei's comment about whether this is
>> something that happens in practice.
>
> Here's an example I've just thought of which nicely illustrates how the
> existing scheme leads to a suboptimal solution. Say we are aiming to plot
> an image such that it is reduced in size vertically by a factor of 2, but
> left unchanged horizontally. The overall effect desired is to find the
> mean of each two vertically neighbouring pixels.
>
> The set of Y coordinates, once transformed into source space, would be
>   1.0, 3.0, 5.0, ... (height * 2 - 1.0)
> The set of X coordinates, once transformed into source space, would be
>   0.5, 1.5, 2.5, ... (width - 0.5)
>
> Under the existing scheme, this would not be passed to any COVER fast
> paths - because it involves the X coordinate at exactly (width - 0.5).
> And yet it's obvious to a human that no pixels outside the source image
> are involved. To slightly reduce the X increment to avoid this would
> be highly undesirable, as it could lead to a marked softening and/or
> sharpening as you move from left to right across the image. Trimming off
> the rightmost pixel might also reasonably be frowned upon, so we'd be
> left with having to use a slower fast path, for no good reason.

Okay, that's a good example; I'll buy that this could happen in
practice. If all existing fast paths and iterators can be
changed/verified to not read outside their bounds, I have no objection
to changing the meaning of the existing flag (or adding a new one if
that's more convenient).

I realized that there is another use for a new flag like this: It could
likely also be used here:

http://cgit.freedesktop.org/pixman/tree/pixman/pixman.c#n662

so that images would be considered opaque in more cases, leading to more
cases of reducing the operator from OVER to SRC.


Søren

Re: [Pixman] [PATCH 1/4] Change conditions for setting FAST_PATH_SAMPLES_COVER_CLIP flags

2015-09-15 Thread Pekka Paalanen
On Mon, 14 Sep 2015 20:52:18 +0200
soren.sandm...@gmail.com (Søren Sandmann) wrote:

> Ben Avison  writes:
> 
> > It's an odd omission that it doesn't talk about BILINEAR filtering,
> > though. However, having briefly read through the text, even though some
> > of it goes over my head a bit, I'd say it's describing from a strictly
> > mathematical point of view. Discussion of exactly which pixels get loaded
> > from memory in order to reach this mathematical outcome feels outside the
> > scope of that document to me.
> 
> My take on the question is that conceptually the BILINEAR filter works
> by loading four pixels surrounding the sample locations, and then
> convolving them with a 2 x 2 matrix where the entries are determined by
> the fractional part of the sample position. The question then is: How do
> you determine the North-West pixel? And the answer is that you round
> down. The other three pixels are then found by adding 1 to one or both
> of the North-West pixel's coordinates. The CONVOLUTION filter works this way
> too, except the matrix may be bigger than 2x2 and is given explicitly
> rather than implicitly.

Hi,

it would have been really nice to have this explanation in rounding.txt.

I did think about a 2x2 matrix in connection to CONVOLUTION, but I
couldn't quickly formulate it in a way that it would have produced what
the code does. Particularly, the choice of the matrix contents was my
problem, because it didn't occur to me that they are per bilinear
sample (destination pixel) and not constants.

How's my version of it, or should I re-write it based on CONVOLUTION?

To me, using CONVOLUTION seems like a detour. I don't think computing a
discrete convolution usually involves computing a different matrix for
each destination pixel. I think that would involve a lot more
explaining on how convolution in this case doesn't use a constant
matrix.

> It is important to note that a pixman image is not undefined outside its
> borders because the repeat mode always says what happens if you ask for
> such a sample.
> 
> Now, of course, if the sample position is exactly on top of a pixel,
> some of the weights in the matrix will be zero and so a possible
> optimization is to eliminate those memory references. But I think this
> should be considered an optimization and not the specification.

Where should we document what COVER_CLIP_BILINEAR means?

It would be pretty convenient to just state in rounding.txt on what
choices Pixman's implementation does, just like is said for NEAREST.

I understand the definition of the flag is a Pixman internal detail,
while the things said about NEAREST so far are actually public API.


Thanks,
pq

> I think this is my basic concern: In my mind BILINEAR_COVER_CLIP says
> "if[1] the filter is BILINEAR, then all pixels required for the
> operation (regardless of weight) are inside the borders of the source
> image". This is useful information for the implementations because it
> means they can elide the border-checking logic.
> 
> A separate possibility is a flag that says "all pixels whose weights are
> non-zero are inside the borders of the source image". Is this useful
> information? It might be, and if so, it could be conveyed through some
> new flag, though I'd echo Siarhei's comment about whether this is
> something that happens in practice.
> 
> 
> Søren
> 
> 
> [1] A subtle, but mostly irrelevant for this discussion, point is that
> the flag can be set even if the filter in question is not actually
> BILINEAR.



pgpH0oN1jlkIJ.pgp
Description: OpenPGP digital signature
___
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman


Re: [Pixman] [PATCH 1/4] Change conditions for setting FAST_PATH_SAMPLES_COVER_CLIP flags

2015-09-14 Thread Bill Spitzak
On Mon, Sep 14, 2015 at 11:52 AM, Søren Sandmann 
wrote:

>
> A separate possibility is a flag that says "all pixels whose weights are
> non-zero are inside the borders of the source image". Is this useful
> information? It might be, and if so, it could be conveyed through some
> new flag, though I'd echo Siarhei's comment about whether this is
> something that happens in practice.
>

I believe this *is* what happens in practice, much more often.

The clip regions are not random, they are chosen by programmers for
specific purposes. One thing that is wanted is to scale images up and
preserve sharp edges. In Cairo this requires trimming the partial pixels
off the edge. This will produce a clip that will turn on this flag. The
alternative version of the flag will require the program to clip off at
least one opaque pixel from two edges for scale factors less than 2. There
is far less reason for a program to do that.

Therefore I think this version of the flag will actually be used far more
often, easily making up for the expense of adding the test for the
zero-weight pixel to the bilinear fast paths.
___
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman


Re: [Pixman] [PATCH 1/4] Change conditions for setting FAST_PATH_SAMPLES_COVER_CLIP flags

2015-09-14 Thread Søren Sandmann
Ben Avison  writes:

> It's an odd omission that it doesn't talk about BILINEAR filtering,
> though. However, having briefly read through the text, even though some
> of it goes over my head a bit, I'd say it's describing from a strictly
> mathematical point of view. Discussion of exactly which pixels get loaded
> from memory in order to reach this mathematical outcome feels outside the
> scope of that document to me.

My take on the question is that conceptually the BILINEAR filter works
by loading four pixels surrounding the sample locations, and then
convolving them with a 2 x 2 matrix where the entries are determined by
the fractional part of the sample position. The question then is: How do
you determine the North-West pixel? And the answer is that you round
down. The other three pixels are then found by adding 1 to one or both
of the North-West pixel's coordinates. The CONVOLUTION filter works this way
too, except the matrix may be bigger than 2x2 and is given explicitly
rather than implicitly.

It is important to note that a pixman image is not undefined outside its
borders because the repeat mode always says what happens if you ask for
such a sample.

Now, of course, if the sample position is exactly on top of a pixel,
some of the weights in the matrix will be zero and so a possible
optimization is to eliminate those memory references. But I think this
should be considered an optimization and not the specification.

> Here's a draft section for BILINEAR filtering, comments welcome:
>
>  8< -
>
> -- BILINEAR filtering:
>
> The BILINEAR filter calculates the linear interpolation between (i.e. a
> weighted mean of) the two closest pixels to the given position - one
> found by rounding down, one by rounding up.
>
>  round_up(x) = ceil(x - o) + o
>  round_down(x) = floor(x - o) + o
>
> The weight factor applied to each of these is given by
>
>  1 - abs(round(x) - x)
>
> except in the case where two to rounding functions amount to the same
  ^^
I really don't like to have a exceptions in the specifications for
filters. Instead, I'd just say that the first pixel is found by rounding
down, the second by adding 1 to the first pixel. The weights are
given by 1 - (x - x0) and (x1 - x).

> pixel - which only occurs if the given position aligns precisely with one
> pixel. In that case, that one pixel value is used directly.
>
> A common simplification, to avoid having to treat this case differently,
> is to define one (and only one) of the two round functions such that when
> the given positions aligns with a pixel, abs(round(x) - x) = 1, and hence
> the corresponding weight factor is 0. Either of the following pairs of
> definitions satisfy this requirement:

When described as a above, this simplification is simply a description
of how BILINEAR works in all cases.

I think this is my basic concern: In my mind BILINEAR_COVER_CLIP says
"if[1] the filter is BILINEAR, then all pixels required for the
operation (regardless of weight) are inside the borders of the source
image". This is useful information for the implementations because it
means they can elide the border-checking logic.

A separate possibility is a flag that says "all pixels whose weights are
non-zero are inside the borders of the source image". Is this useful
information? It might be, and if so, it could be conveyed through some
new flag, though I'd echo Siarhei's comment about whether this is
something that happens in practice.


Søren


[1] A subtle, but mostly irrelevant for this discussion, point is that
the flag can be set even if the filter in question is not actually
BILINEAR.
___
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman


Re: [Pixman] [PATCH 1/4] Change conditions for setting FAST_PATH_SAMPLES_COVER_CLIP flags

2015-09-11 Thread Pekka Paalanen
On Thu, 10 Sep 2015 17:35:03 +0100
"Ben Avison"  wrote:

> On Thu, 10 Sep 2015 12:46:50 +0100, Pekka Paalanen  
> wrote:
> > you're right, documenting this is important. However, I think this
> > particular patch is not the best place, and here is why.
> >
> > When we recently discussed this, both I and Siarhei had the opinion
> > that this needs to be done in two separate steps:
> >
> > 1. Remove the useless 8e fuzz margins.
> >
> > 2. Change the meaning of the COVER_CLIP_BILINEAR flag so that it is no
> >longer safe for fetchers to always fetch a 2x2 pixel block.
> 
> I sense we're taking a slightly different perspective on the problem
> here. I don't really see these two steps as different in spirit. In the
> first one, the flag calculation was allowing extra space to permit the
> corresponding fast path to be a bit sloppy with its coordinate
> transformations. In the second one, the flag calculation was allowing
> extra space to permit the corresponding fast path to be a bit sloppy with
> loading data that it doesn't need. Apart from meaning that less efficient
> fast paths sometimes get used, this also means a lot of unnecessary cache
> lines get loaded in many cases, which has got to hurt performance.

The difference between these is that the 8e margin is completely
useless already. We can remove it without any other changes. So, we
should do just that. Simple and not controversial at all.

Step 2 is about changing the meaning of a flag and fixing up the code
to correspond. This is more tricky.

Sure, both remove extra space requirements, but they do it on very
different grounds.

> Back to the point at hand, though. I still think documenting the existing
> behaviour is important for the first patch, because it explains why after
> removing the 8e margin, then although we subtract e for the NEAREST case,
> we *don't* do so for the BILINEAR case. Suppose the existing
> implementations had been written to always fetch a 2x2 pixel block, but
> that in the event of the coordinate having fractional part 0, that it was
> right-aligned rather than left-aligned to the relevant source pixel. In
> that case, you *would* have needed to subtract e in the BILINEAR case as
> well as the NEAREST case.

You can leave the comments in the commit message, but please formulate
them to not question the location of the BILINEAR zero-weight pixel.
Instead, say how it has been like it would always remain like that,
since no bit of this patch is preparing for it to change. If this patch
did something extra based on the fact that future patches are expected
to change things wrt. to the zero-weight pixel, then justification for
that extra change would be in place.

If you actually want to document things, then I think
pixman/rounding.txt would be the right place (and another patch). After
all, commit messages are only used to justify the patch, they are not
documentation people usually read.

Curiously enough, that file competely ignores BILINEAR but explains
everything else. I'm not sure if I'm supposed to infer what BILINEAR
does by assuming some particular convolution filter, but I could not
figure out how.

> Perhaps the only way to make this clear is to further split the first
> step into two. To avoid failing cover-test when only one of the two sub-
> step patches was applied, it would have to be in the order:
> 
> Step 1, patch 1: centre the 8e margin over the correct position in the
> NEAREST case
> 
> Step 1, patch 2: remove the 8e margin in both NEAREST and BILINEAR cases

I do not think that would be an improvement. Let's keep it as it is.

> >> Given what Siarhei said, I think I understand where these came from
> >> now. I'd agree that the correct thing to do would be to remove the 8*e
> >> parts, though I think that could safely be in a separate patch.
> >
> > Very good! That would be patch 2 of step 1. :-)
> >
> > You're right, making it a separate patch is a good idea as we might
> > have overlooked something.
> >
> > Would you like me to do these for step 1?
> 
> Assuming you agree with the above, this would then be patch 3 of step 1,
> presumably? None of these are hard code changes, all the argument is
> about the wording of the commit messages, I think. Feel free to compose
> something that you think is appropriate.

Cool, I'll see what I can come up with.


Thanks,
pq


pgpp4iYeRlbRb.pgp
Description: OpenPGP digital signature
___
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman


Re: [Pixman] [PATCH 1/4] Change conditions for setting FAST_PATH_SAMPLES_COVER_CLIP flags

2015-09-11 Thread Ben Avison

On Fri, 11 Sep 2015 10:13:08 +0100, Pekka Paalanen  wrote:

If you actually want to document things, then I think
pixman/rounding.txt would be the right place (and another patch). After
all, commit messages are only used to justify the patch, they are not
documentation people usually read.


Somehow, I don't remember ever having noticed that file! Perhaps it was
because it was called rounding.txt that it never occurred to me that
filtering might be documented there?

It's an odd omission that it doesn't talk about BILINEAR filtering,
though. However, having briefly read through the text, even though some
of it goes over my head a bit, I'd say it's describing from a strictly
mathematical point of view. Discussion of exactly which pixels get loaded
from memory in order to reach this mathematical outcome feels outside the
scope of that document to me.

Here's a draft section for BILINEAR filtering, comments welcome:

 8< -

-- BILINEAR filtering:

The BILINEAR filter calculates the linear interpolation between (i.e. a
weighted mean of) the two closest pixels to the given position - one
found by rounding down, one by rounding up.

 round_up(x) = ceil(x - o) + o
 round_down(x) = floor(x - o) + o

The weight factor applied to each of these is given by

 1 - abs(round(x) - x)

except in the case where two to rounding functions amount to the same
pixel - which only occurs if the given position aligns precisely with one
pixel. In that case, that one pixel value is used directly.

A common simplification, to avoid having to treat this case differently,
is to define one (and only one) of the two round functions such that when
the given positions aligns with a pixel, abs(round(x) - x) = 1, and hence
the corresponding weight factor is 0. Either of the following pairs of
definitions satisfy this requirement:

 round_down(x) = floor(x - o) + o
 round_up(x) = round_down(x) + 1

 round_up(x) = ceil(x - o) + o
 round_down(x) = round_up(x) - 1

 8< -

Ben
___
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman


Re: [Pixman] [PATCH 1/4] Change conditions for setting FAST_PATH_SAMPLES_COVER_CLIP flags

2015-09-10 Thread Pekka Paalanen
On Wed, 09 Sep 2015 18:09:04 +0100
"Ben Avison"  wrote:

> On Wed, 09 Sep 2015 11:42:26 +0100, Pekka Paalanen 
> wrote:
> 
> > On Wed, 9 Sep 2015 09:39:07 +0300
> > Oded Gabbay  wrote:
> >
> >> On Fri, Sep 4, 2015 at 5:09 AM, Ben Avison  wrote:
> >> > Many bilinear fast paths currently assume that COVER_CLIP_BILINEAR is
> >> > not set when the transformed upper coordinate corresponds exactly
> >> > to the last row/pixel in source space. This is due to a detail of
> >> > many current implementations - they assume they can always load the
> >> > pixel after the one you get by dividing the coordinate by 2^16 with
> >> > rounding to -infinity.
> >> >
> >> > To avoid causing these implementations to exceed array bounds in
> >> > these cases, the COVER_CLIP_BILINEAR flag retains this feature in
> >> > this patch. Subsequent patches deal with removing this assumption,
> >> > to enable cover fast paths to be used in the maximum number of cases.
> >
> > I'm not sure if these two paragraphs about bilinear are really
> > necessary here. The essence is that we remove the extra margins from
> > both NEAREST (not 8*eps, but 7*eps and 9*eps, see below) and BILINEAR
> > (8*eps), without changing them in any other way.
> >
> > The subsequent patches are still under discussion, and we have to see
> > how they work out.
> 
> I felt that since the point of the patch is about getting the thresholds
> correct to the exact multiple of pixman_fixed_e, and I'd been asked for a
> "strict proof of correctness" in the commit message, then I felt this had
> to be said - after all, I don't think this behaviour is documented
> anywhere else.
> 
> Just witness the confusion about the issue - even Søren posted to say that
> when the coordinate was exactly coincident with a source pixel that the
> pixel with the next-lowest coordinate was loaded and multiplied by 0,
> before correcting himself shortly afterwards to say it was the
> next-highest one.
> 
> I hope I've demonstrated, via three completely different approaches, that
> there's no reason why the upper bound has to be set the way it is - you
> don't even have to sacrifice the ability to do SIMD loads. Even with the
> existing implementations, most of those that do load pixels from the
> next-highest coordinate when we end aligned on a source pixel in the
> horizontal direction, don't do the same in the vertical direction.
> 
> I'm also playing a longer game - but because people get overfaced with
> very long patch series, I'm holding back on some others that build on this
> change. Eventually I re-use the same assembly functions in a context where
> they need to adhere to my tighter limits, but I expect the pool of people
> able to review the assembly code will be pretty small, so the ability to
> demonstrate that they obey the limits when used for a COVER operation is
> very useful.
> 
> Granted, perhaps the last sentence you quoted probably belonged after the
> --- separator, especially if the whole series doesn't get pushed at once,
> but I really want to see all of them to go through. I did originally post
> the whole thing as a single patch, after all (mind you, I've spent a lot
> of time working on bilinear scaling for ARMv6 and ARMv7, and this feels
> like a really tiny and obvious part of it to me now...)

Hi Ben,

you're right, documenting this is important. However, I think this
particular patch is not the best place, and here is why.

When we recently discussed this, both I and Siarhei had the opinion
that this needs to be done in two separate steps:

1. Remove the useless 8e fuzz margins.

2. Change the meaning of the COVER_CLIP_BILINEAR flag so that it is no
   longer safe for fetchers to always fetch a 2x2 pixel block.

In that sense, patch 1/4 of this series is step 1. Patches 2, 3 and 4
are step 2, which I assumed to be a follow-up patch series.

We're still pending on cover-test, too, so we're getting quite a lot
ahead of ourselves. Thankfully cover-test is essentially sorted by now.

It seems to me the old maintainers processed each patch series as a
single unit. It makes sense because a series is usually interdependent.
I am more liberal in that, I can accept patches from the top or even
the middle if there are no dependencies. That's why I am looking only
at the step 1 patch right now and making sure it is what we want and
get it merged.

That is why I consider talking about the zero-weight pixel off-topic
for *this one* patch. It is very much on topic on the three other
patches that focus on the definition of COVER in the BILINEAR case,
specifically on the point of zero-weight input pixels.

All in due time, IMHO. At least we are moving now. :-)

> > All the above otherwise looks good to me, but there is one more place
> > that has 8*eps, analyze_extent() in pixman.c has:
> >
> > if (!compute_transformed_extents (transform, _extents, 
> > ))
> > return FALSE;
> 

Re: [Pixman] [PATCH 1/4] Change conditions for setting FAST_PATH_SAMPLES_COVER_CLIP flags

2015-09-10 Thread Bill Spitzak
On Thu, Sep 10, 2015 at 9:35 AM, Ben Avison  wrote:

> On Thu, 10 Sep 2015 12:46:50 +0100, Pekka Paalanen 
> wrote:
>
>> you're right, documenting this is important. However, I think this
>> particular patch is not the best place, and here is why.
>>
>> When we recently discussed this, both I and Siarhei had the opinion
>> that this needs to be done in two separate steps:
>>
>> 1. Remove the useless 8e fuzz margins.
>>
>> 2. Change the meaning of the COVER_CLIP_BILINEAR flag so that it is no
>>longer safe for fetchers to always fetch a 2x2 pixel block.
>>
>
This sounds exactly right to me.

Another way to say it is that it is safe to fetch all pixels that have a
non-zero weight. Certain bilinear positions will produce 2x2 blocks that
contain up to 3 pixels with a zero weight, those pixels may be outside the
source clip even when this flag is on.


> I sense we're taking a slightly different perspective on the problem
> here. I don't really see these two steps as different in spirit. In the
> first one, the flag calculation was allowing extra space to permit the
> corresponding fast path to be a bit sloppy with its coordinate
> transformations. In the second one, the flag calculation was allowing
> extra space to permit the corresponding fast path to be a bit sloppy with
> loading data that it doesn't need. Apart from meaning that less efficient
> fast paths sometimes get used, this also means a lot of unnecessary cache
> lines get loaded in many cases, which has got to hurt performance.
>

I believe you are confusing the implementation of the bilinear with this
flag.

For a coordinate of .5, pixel 0 will have a weight of 1.0, and all other
pixels will have a weight of 0.0. This includes both the pixel at -1 and
the pixel at 1. They both have a weight of zero.

It is useful for a bilinear algorithm to think about pairs of pixels, and
depending on the implementation the pair produced for the .5 coordinate may
be pixels 0 and 1, or pixels -1 and 0.

However this does not change the setting of COVER_CLIP_BILINEAR! No pixel
has changed it's weight, both pixel -1 and 1 remain with a weight of zero,
no matter which is chosen for the pair.

I think you are right that there are reasons for the bilinear
*implementation* to round down, so the weighted-zero pixel is at the lower
coordinate. This is because the special code to avoid fetching it can be at
the start of the loop, rather than at the end. But this has nothing to do
with COVER_CLIP_BILINEAR and should not be part of what sets it.

And it is true that there are a lot of broken bilinear fetches that do read
the weight-zero pixel. These need to be fixed. But this still does not
change the meaning of the flag.
___
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman


Re: [Pixman] [PATCH 1/4] Change conditions for setting FAST_PATH_SAMPLES_COVER_CLIP flags

2015-09-09 Thread Pekka Paalanen
On Wed, 9 Sep 2015 09:39:07 +0300
Oded Gabbay  wrote:

> On Fri, Sep 4, 2015 at 5:09 AM, Ben Avison  wrote:
> > As discussed in
> > http://lists.freedesktop.org/archives/pixman/2015-August/003905.html
> >
> > the 8 * pixman_fixed_e adjustment which was applied to the transformed
> > coordinates is a legacy of rounding errors which used to occur in old
> > versions of Pixman, but which no longer apply. For any affine transform,
> > you are now guaranteed to get the same result by transforming the upper
> > coordinate as though you transform the lower coordinate and add (size-1)
> > steps of the increment in source coordinate space. No projective
> > transform routines use the COVER_CLIP flags, so they cannot be affected.
> >
> > However, removing the 8 * pixman_fixed_e border exposes a bug in the
> > calculation of the COVER_CLIP_NEAREST flag. Because the half-way cases
> > round down, an adjustment by pixman_fixed_e is needed. This patch
> > applies this adjustment.
> >
> > Many bilinear fast paths currently assume that COVER_CLIP_BILINEAR is
> > not set when the transformed upper coordinate corresponds exactly to the
> > last row/pixel in source space. This is due to a detail of many current
> > implementations - they assume they can always load the pixel after the one
> > you get by dividing the coordinate by 2^16 with rounding to -infinity.
> >
> > To avoid causing these implementations to exceed array bounds in these
> > cases, the COVER_CLIP_BILINEAR flag retains this feature in this patch.
> > Subsequent patches deal with removing this assumption, to enable cover
> > fast paths to be used in the maximum number of cases.

Hi,

I'm not sure if these two paragraphs about bilinear are really
necessary here. The essence is that we remove the extra margins from
both NEAREST (not 8*eps, but 7*eps and 9*eps, see below) and BILINEAR
(8*eps), without changing them in any other way.

The subsequent patches are still under discussion, and we have to see
how they work out.

> >
> > Proof:
> >
> > All implementations must give the same numerical results as
> > bits_image_fetch_pixel_nearest() / bits_image_fetch_pixel_bilinear().
> >
> > The former does
> > int x0 = pixman_fixed_to_int (x - pixman_fixed_e);
> > which maps directly to the new test for the nearest flag, when you consider
> > that x0 must fall in the interval [0,width).

Right.

> >
> > The latter does
> > x1 = x - pixman_fixed_1 / 2;
> > x1 = pixman_fixed_to_int (x1);
> > x2 = x1 + 1;
> > but then x2 is brought back in range by the repeat() function, so it can't
> > stray beyond the end of the source line. When you write a cover path, you
> > are effectively omitting the repeat() function. The weight applied to the
> > pixel at offset x2 will be zero, so if you skip the load and leave the
> > pixel value undefined, the numerical result is unaffected, but you have
> > avoided a memory access that could potentially cause a page fault. If
> > however, we assume that the implementation loads from offset x2
> > unconditionally, then we need
> > x1 >= 0
> > x2 < width
> > so
> > x - pixman_fixed_1 / 2 >= 0
> > x - pixman_fixed_1 / 2 + pixman_fixed_1 < width * pixman_fixed_1
> > so
> > pixman_fixed_to_int (x - pixman_fixed_1 / 2) >= 0
> > pixman_fixed_to_int (x + pixman_fixed_1 / 2) < width
> > which matches the source code lines for the bilinear case, once you delete
> > the lines that apply the 8 * pixman_fixed_e border.

Indeed, but I think the talk about omitting the loading of the
zero-weight pixel does not belong in this patch. Before and after
this patch we still do assume that bilinear fetchers unconditionally
read the pixel at x2.

However, this is a good explanation on why exactly does -0.5 / +0.5
adjustment allow bilinear fetchers to load x2 unconditionally. This is
the rationale for not changing anything else for COVER_CLIP_BILINEAR.

> > ---
> >  pixman/pixman.c |   17 -
> >  test/affine-bench.c |   16 ++--
> >  2 files changed, 14 insertions(+), 19 deletions(-)
> >
> > diff --git a/pixman/pixman.c b/pixman/pixman.c
> > index a07c577..f932eac 100644
> > --- a/pixman/pixman.c
> > +++ b/pixman/pixman.c
> > @@ -497,21 +497,12 @@ analyze_extent (pixman_image_t   *image,
> >  if (!compute_transformed_extents (transform, extents, ))
> > return FALSE;
> >
> > -/* Expand the source area by a tiny bit so account of different 
> > rounding that
> > - * may happen during sampling. Note that (8 * pixman_fixed_e) is very 
> > far from
> > - * 0.5 so this won't cause the area computed to be overly pessimistic.
> > - */
> > -transformed.x1 -= 8 * pixman_fixed_e;
> > -transformed.y1 -= 8 * pixman_fixed_e;
> > -transformed.x2 += 8 * pixman_fixed_e;
> > -transformed.y2 += 8 * pixman_fixed_e;
> > -
> >  if (image->common.type == BITS)
> >  {
> > -   if (pixman_fixed_to_int 

[Pixman] [PATCH 1/4] Change conditions for setting FAST_PATH_SAMPLES_COVER_CLIP flags

2015-09-03 Thread Ben Avison
As discussed in
http://lists.freedesktop.org/archives/pixman/2015-August/003905.html

the 8 * pixman_fixed_e adjustment which was applied to the transformed
coordinates is a legacy of rounding errors which used to occur in old
versions of Pixman, but which no longer apply. For any affine transform,
you are now guaranteed to get the same result by transforming the upper
coordinate as though you transform the lower coordinate and add (size-1)
steps of the increment in source coordinate space. No projective
transform routines use the COVER_CLIP flags, so they cannot be affected.

However, removing the 8 * pixman_fixed_e border exposes a bug in the
calculation of the COVER_CLIP_NEAREST flag. Because the half-way cases
round down, an adjustment by pixman_fixed_e is needed. This patch
applies this adjustment.

Many bilinear fast paths currently assume that COVER_CLIP_BILINEAR is
not set when the transformed upper coordinate corresponds exactly to the
last row/pixel in source space. This is due to a detail of many current
implementations - they assume they can always load the pixel after the one
you get by dividing the coordinate by 2^16 with rounding to -infinity.

To avoid causing these implementations to exceed array bounds in these
cases, the COVER_CLIP_BILINEAR flag retains this feature in this patch.
Subsequent patches deal with removing this assumption, to enable cover
fast paths to be used in the maximum number of cases.

Proof:

All implementations must give the same numerical results as
bits_image_fetch_pixel_nearest() / bits_image_fetch_pixel_bilinear().

The former does
int x0 = pixman_fixed_to_int (x - pixman_fixed_e);
which maps directly to the new test for the nearest flag, when you consider
that x0 must fall in the interval [0,width).

The latter does
x1 = x - pixman_fixed_1 / 2;
x1 = pixman_fixed_to_int (x1);
x2 = x1 + 1;
but then x2 is brought back in range by the repeat() function, so it can't
stray beyond the end of the source line. When you write a cover path, you
are effectively omitting the repeat() function. The weight applied to the
pixel at offset x2 will be zero, so if you skip the load and leave the
pixel value undefined, the numerical result is unaffected, but you have
avoided a memory access that could potentially cause a page fault. If
however, we assume that the implementation loads from offset x2
unconditionally, then we need
x1 >= 0
x2 < width
so
x - pixman_fixed_1 / 2 >= 0
x - pixman_fixed_1 / 2 + pixman_fixed_1 < width * pixman_fixed_1
so
pixman_fixed_to_int (x - pixman_fixed_1 / 2) >= 0
pixman_fixed_to_int (x + pixman_fixed_1 / 2) < width
which matches the source code lines for the bilinear case, once you delete
the lines that apply the 8 * pixman_fixed_e border.
---
 pixman/pixman.c |   17 -
 test/affine-bench.c |   16 ++--
 2 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/pixman/pixman.c b/pixman/pixman.c
index a07c577..f932eac 100644
--- a/pixman/pixman.c
+++ b/pixman/pixman.c
@@ -497,21 +497,12 @@ analyze_extent (pixman_image_t   *image,
 if (!compute_transformed_extents (transform, extents, ))
return FALSE;
 
-/* Expand the source area by a tiny bit so account of different rounding 
that
- * may happen during sampling. Note that (8 * pixman_fixed_e) is very far 
from
- * 0.5 so this won't cause the area computed to be overly pessimistic.
- */
-transformed.x1 -= 8 * pixman_fixed_e;
-transformed.y1 -= 8 * pixman_fixed_e;
-transformed.x2 += 8 * pixman_fixed_e;
-transformed.y2 += 8 * pixman_fixed_e;
-
 if (image->common.type == BITS)
 {
-   if (pixman_fixed_to_int (transformed.x1) >= 0   &&
-   pixman_fixed_to_int (transformed.y1) >= 0   &&
-   pixman_fixed_to_int (transformed.x2) < image->bits.width&&
-   pixman_fixed_to_int (transformed.y2) < image->bits.height)
+   if (pixman_fixed_to_int (transformed.x1 - pixman_fixed_e) >= 0  
  &&
+   pixman_fixed_to_int (transformed.y1 - pixman_fixed_e) >= 0  
  &&
+   pixman_fixed_to_int (transformed.x2 - pixman_fixed_e) < 
image->bits.width &&
+   pixman_fixed_to_int (transformed.y2 - pixman_fixed_e) < 
image->bits.height)
{
*flags |= FAST_PATH_SAMPLES_COVER_CLIP_NEAREST;
}
diff --git a/test/affine-bench.c b/test/affine-bench.c
index 9e0121e..697684b 100644
--- a/test/affine-bench.c
+++ b/test/affine-bench.c
@@ -396,13 +396,17 @@ main (int argc, char *argv[])
 }
 
 compute_transformed_extents (, _box, );
-/* The source area is expanded by a tiny bit (8/65536th pixel)
- * to match the calculation of the COVER_CLIP flags in analyze_extent()
+xmin = pixman_fixed_to_int (transformed.x1 - pixman_fixed_1 / 2);
+ymin = pixman_fixed_to_int (transformed.y1 - pixman_fixed_1 / 2);
+xmax = pixman_fixed_to_int (transformed.x2 +