Re: [Pixman] [cairo] Cairo/pixman behavior with large translations

2012-12-06 Thread Søren Sandmann
Siarhei Siamashka siarhei.siamas...@gmail.com writes:

 The main culprits are functions pixman_transform_point_3d() and
 pixman_transform_point() here:
 
 http://cgit.freedesktop.org/pixman/tree/pixman/pixman-matrix.c?id=pixman-0.28.0#n49
 They perform multiplication of a matrix (16.16 fixed point) with a
 vector (16.16 fixed point), to get a vector with 16.16 fixed point
 values. This code can be upgraded to perform multiplication of the same
 16.16 fixed point matrix, but use something like 31.16 fixed point
 for input vectors and get results in the 48.16 fixed point output
 vectors. The caller then should be able to deal with the 48.16
 results depending on the type of repeat set for the image. One
 example is here:
 
 http://cgit.freedesktop.org/pixman/tree/pixman/pixman-inlines.h?id=pixman-0.28.0#n417
 In this code, the src_x and src_y arguments are originally coming
 from pixman_image_composite32() function and are already supposed to be
 larger than 16 bits after the following commit:
 
 http://cgit.freedesktop.org/pixman/commit/?id=e841c556d59ca0aa6d86eaf6dbf061ae0f4287de
 If they are getting converted to fixed point, we already get something
 like 32.16 fixed point values which are asking for a larger vector
 argument for pixman_transform_point_3d() function (though we might
 want not to allow the use of full 32-bit range for src_x and src_y
 in order to keep some headroom and safeguard against overflows). In any
 case, immediately after pixman_transform_point_3d() we are
 tweaking v.vector[0] and v.vector[1] according to the repeat type:
 
 http://cgit.freedesktop.org/pixman/tree/pixman/pixman-inlines.h?id=pixman-0.28.0#n432
 Updating this code (repeat and pad_repeat_get_scanline_bounds
 functions) to deal with 48.16 fixed point values from the vector can't
 be too hard.

There is really a lot to be said for this 31.16 format. Intermediate
results from matrix and vector multiplications fit in 64 bits, and image
access coordinates up to +/- 1 billion pixels can be supported. Yet it
is still a fixed-point format so there won't be any weird effects where
positions are rounded differently depending on where they are on the
screen.  If 64 bit coordinates become a performance problem, a new flag,
FAST_PATH_16_16_IS_ENOUGH or something, could be added so that fast
paths could use 32 bit coordinates. (We used to have something like this
before pixman started to just return without compositing when it detects
overflow).

It's very tempting to switch pixman over to using this format
internally.

The one thing I'm not sure of is whether this bug:

https://bugs.freedesktop.org/show_bug.cgi?id=15355

Under a non-affine transform, or, in fact, any transform where the
 'w' component of the (u,v,w) homogeneous source coordinate is not 1,
 the representation of u,v,w as 16.16 fixed point numbers will
 over/under flow on a regular basis even given fairly mundane
 transformations (like a simple keystone correction). I've fixed the
 intel driver to do these computations in floating point; I'm not
 sure we want to try to use fixed point here.

is adequately taken care off by using this format.

It's worth pointing out that even if there are useful non-affine
transformations that still overflow, a homogeneous transformation matrix
doesn't change if multiplied or divided by an arbitrary constant, so if
worse comes to worst, we could drop precision by rounding the
overflowing result down to 47 bits and then dividing all the other
entries by a similar amount, in effect faking a floating point type.

Keith, comments on this would definitely be appreciated. For the RandR
use cases, would 31.16 be good enough?


Søren
___
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman


Re: [Pixman] [cairo] Cairo/pixman behavior with large translations

2012-12-06 Thread Siarhei Siamashka
On Thu, 06 Dec 2012 14:02:11 +0100
sandm...@cs.au.dk (Søren Sandmann) wrote:

 Søren Sandmann sandm...@cs.au.dk writes:
 
  Intermediate results from matrix and vector multiplications fit in 64
  bits, and
 
 That is, as long as the final result doesn't overflow the 31.16 type,
 the intermediate values fit in 64 bits.

We are fine if we don't mind dealing with two intermediate 64-bit
variables for doing multiplication and accumulation ('lo' and 'hi'
variables in the example below):

/***/

#include stdint.h
#include stdlib.h
#include assert.h
#include math.h

/* a reference implementation (needs 128-bit int types) */
static uint64_t
ref_mul_16_16_by_31_16_to_48_16 (int32_t fix_16_16, int64_t fix_31_16)
{
return ((__int128_t)fix_16_16 * fix_31_16 + 0x8000)  16;
}

/* this is a reference floating point implementation for comparison */
static __float128
float_mul_16_16_by_31_16_to_48_16 (int32_t fix_16_16, int64_t fix_31_16)
{
__float128 a = (__float128)fix_16_16 / 65536.;
__float128 b = (__float128)fix_31_16 / 65536.;
return a * b;
}

/* this implementation does not need 128-bit data types */
static uint64_t
mul_16_16_by_31_16_to_48_16 (int32_t fix_16_16, int64_t fix_31_16)
{
int64_t hi = (int64_t)fix_16_16 * (fix_31_16  16);
int64_t lo = (int64_t)fix_16_16 * (fix_31_16  0x);
return hi + ((lo + 0x8000)  16);
}

void fillrand (void *p, int size)
{
uint8_t *bytep = (uint8_t *)p;
while (size--)
*bytep++ = rand()  0xFF;
}

int main (void)
{
int i;
for (i = 0; i  1; i++)
{
int32_t fix_16_16;
int64_t fix_31_16;
int64_t ref_res, res;
__float128 float_res; /* 64-bit double data type is not enough */

fillrand (fix_16_16, sizeof(fix_16_16));
fillrand (fix_31_16, sizeof(fix_31_16));

/* in fact this will be 33.16 random number after this shift,
 * but in the matrix by vector multiplication code we want to
 * be able to add together 3 multiplication results without
 * overflow, hence the fixed 31.16 data type is interesting
 * for the extra 2 bits headroom */
fix_31_16 = 15; 

/* test the correctness of mul_16_16_by_31_16_to_48_16 by comparing
 * it with the other implementations */
ref_res   = ref_mul_16_16_by_31_16_to_48_16 (fix_16_16, fix_31_16);
float_res = float_mul_16_16_by_31_16_to_48_16 (fix_16_16, fix_31_16);
res   = mul_16_16_by_31_16_to_48_16 (fix_16_16, fix_31_16);

assert(ref_res == res);
assert(fabs(float_res * 65536. - (__float128)res)  0.51);
}
return 0;
}

/***/

-- 
Best regards,
Siarhei Siamashka
___
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman