Re: [Pixman] [PATCH 5/5] test: Add a new benchmarker targeting affine operations

2015-04-07 Thread Ben Avison

On Tue, 17 Mar 2015 11:12:53 -, Pekka Paalanen  wrote:

affine-bench should be added to .gitignore too.


OK.


+#define L2CACHE_SIZE (128 * 1024)


I see lowlevel-blt-bench.c also uses L2CACHE_SIZE, but where does it
come from? If it's really an arch/platform-independent constant, maybe
some test header could have it?


As you will no doubt have noticed, affine-bench is heavily inspired by
lowlevel-blt-bench. Admittedly it's a bit of a cheat to assume one L2
cache size, but in lowlevel-blt-bench, it's only used for sizing the
images used for the L2 test. For those purposes, it's only important that
it's a number big enough that the images are flushed out of the L1 cache
but small enough that they aren't flushed out to memory, so as long as
it's bigger than the largest L1 cache you're likely to encounter and no
smaller than the smallest L2 cache, then it'll do, and saves lots of
messy OS-specific code to determine the cache sizes.

Admittedly, now I look back on the use of the same constant in
affine-bench, I've taken the point of view that it's too difficult in the
presence of significant transformation coefficients to calculate image
sizes that sit within a particular cache, so I'm only bothering to test
the memory-constrained case. To do that, I'm using L2CACHE_SIZE to size
an array used to flush the L2 cache - but that needs the *largest*
typical L2 cache size, not the smallest, so 128K won't cut it here.

While I have easy access to the cache sizes on CPUs that I'm actively
targeting, a general-purpose benchmarker should probably be suitable for
all architectures. Should we read it from the OS somehow? (Anyone know
how best to do that for, say, Linux, Windows, OS X and iOS, in that case?)


Looks like Pixman style is using separate statements for struct
definition and typedefs.


Not exclusively - I think the bits I've looked at usually do, but now
that I count them, I grant you the most common form is separate typedefs.
This particular example was cut-and-pasted from pixman.c but can easily
be changed.


+static pixman_bool_t
+compute_transformed_extents (pixman_transform_t *transform,
+ const pixman_box32_t *extents,
+ box_48_16_t *transformed)


CODING_STYLE is asking for alignment here for the arg names.


This was also a cut-and-paste, but good point.


If there is no transform, why not return the original extents?
These have been reduced by a half unit in all four directions.


It's more obviously relevant in the bilinear scaling case, but a pixel's
colour is considered to apply to the midpoint of each pixel. Thus an
image of width N is specifying colours at ordinates 0.5, 1.5, 2.5 ...
(N-1.5), (N-0.5). Therefore, when you're working out which source pixels
are required to generate a range of output pixels, it's 0.5 and (N-0.5)
that you need to feed through any transformation matrix. If the no
transform case were special-cased, you'd then need to special-case it
again in the calling function because the 0.5 offset applies (albeit at
different scaling factors) to both source and destination images, so the
caller will be adding the 0.5 pixel border back on again.

In affine-bench, I calculate image sizes based on bilinear scaling for
simplicity, since a source or mask image big enough to satisfy COVER_CLIP
for bilinear scaling will also do so for nearest scaling.

[compute_transformed_extents]

+tx1 = ty1 = INT64_MAX;
+tx2 = ty2 = INT64_MIN;
+
+for (i = 0; i < 4; ++i)
+{
+pixman_fixed_48_16_t tx, ty;
+pixman_vector_t v;
+
+v.vector[0] = (i & 0x01)? x1 : x2;
+v.vector[1] = (i & 0x02)? y1 : y2;
+v.vector[2] = pixman_fixed_1;
+
+if (!pixman_transform_point (transform, &v))
+return FALSE;
+
+tx = (pixman_fixed_48_16_t)v.vector[0];
+ty = (pixman_fixed_48_16_t)v.vector[1];


This works only for affine. This is called affine-bench, so that is
natural, yet might be nice to add an assert or something to guarantee
no-one will copy this code to something else that uses projective
transforms.


I might be missing something, but it looks to me like
pixman_transform_point (through its subroutine
pixman_transform_point_31_16) does handle projective transforms, and what
this code segment is doing is iterating over the four corners of the
destination rectangle and finding the maximum and minimum source x/y by
considering the transformed position of each corner independently. Surely
the whole set of destination pixels, once passed through even a
projective matrix, must still lie within this bounding box?

[create_image]

Too long line, needs breaking up.


OK.

[various places]

Add empty line.


OK.


+*bits = aligned_malloc (4096, stride * height);


Is there a #define to get the 4096 from? I assume it's page size? What
if the platform has big pages?


I assume the same thing. It's quite a common page size as I understand
it. I know it's not much excuse, but 

Re: [Pixman] Is Pixman being maintained at all?

2015-04-07 Thread Matt Turner
On Thu, Apr 2, 2015 at 2:26 AM, Pekka Paalanen  wrote:
> On Wed, 1 Apr 2015 18:46:10 -0700
> Matt Turner  wrote:
>
>> On Mon, Mar 30, 2015 at 10:58 AM, Bill Spitzak  wrote:
>> > On 03/30/2015 10:25 AM, Matt Turner wrote:
>> >
>> >> Do you just need someone to push them?
>> >>
>> >> I'm not capable of reviewing these.
>> >>
>> >> Since Søren isn't really maintaining pixman anymore I'm not really
>> >> sure how to proceed.
>> >
>> >
>> > Is this true?
>>
>> I don't see anyone but Pekka reviewing patches and there hasn't been a
>> release in 15 months, so yeah.
>>
>> > I think something needs to be done about this as all new work on X and 
>> > Cairo
>> > is depending on pixman.
>>
>> I mean, sure.
>>
>> > I have had an outstanding patch set for 8 months now. Søren responded to an
>> > earlier version and I tried to address it but have not heard anything 
>> > since.
>> > This is very frustrating as I would like to work on this but I'm not going
>> > to do it if it is useless.
>>
>> As far as I know, Søren isn't working at Redhat any more, so I don't
>> think you can expect him to continue maintaining pixman.
>
> Ok.
>
> Søren, Matt, Siarhei,
>
> how can we get the Pixman maintenance communitized? Maybe a la
> libdrm, because no-one has the resources to become a dedicated
> maintainer?

Seems fine to me, though I don't really feel like a pixman maintainer. :)

> What does it take to get push and release authorization, in the
> political sense that Pixman quality would not degrade and the
> current/old maintainers would approve?
> What kind of review policies should be enforced?

Søren told me back in December on IRC "Feel free to do a release".

I'm happy to have people commit to pixman who have a track record of
contributions to other X.Org projects.

> What development guidelines should there be? Should it be strictly no
> new API/ABI nor features, only performance work and new platform
> support like the latest new ARM?

I'm not aware of any backwards-incompatible changes to pixman, at
least in a really long time. Keeping that policy in place seems like a
good idea.

New APIs do happen. I think that's probably fine.

> If there is one person contributing arch or cpu-specific optimizations
> in assembly that no-one is willing to review apart from the scope of
> code changes and style, should we trust that one person and just land
> his work if he shows the performance numbers are good?

I might be a bit biased in my answer, since I have some patches to the
MMX code in my tree that I don't expect anyone to review, but yeah I
think we should mostly trust the author (obviously depends on the
author's credibility).

> I mean, I'm a newbie here. I don't want to hijack this project and push
> it only to my own directions, also because I cannot become a dedicated
> maintainer, nor promise to review anyone else's stuff. But, there are
> patches I'd like to see landed. I could work on them with Ben, but if
> there is no-one "upstream" to tell us what goes and what doesn't, we
> are left to our own judgement. Would you trust my and Ben's judgement
> so that I could land Ben's patches and make Pixman releases?

I don't think you're hijacking at all. I think this conversation
needed to happen sooner or later, though I do wish Søren or Siarhei
could spend a little time on it.

> You probably don't have a good understanding about how I work and what
> kind of a developer I am, nor have that kind of trust in me. That is
> fine. We need time to build that trust through discussion and patches.
> But it's hard to have a discussion if no-one can reply. I also
> understand that because I will not promise to be a maintainer, there is
> less incentive in educating me. It is quite likely that I hang around
> here for a while and then wander off when my needs are filled.

I haven't worked with you, but I'm familiar with your contributions.
I'd trust you to commit to pixman.

But I don't think I could really educate anyone except in the MMX and SSE2 code.

> The same goes for everyone, I believe.
>
> What could we do to let Pixman go forward?
>
> I suppose a project in a similar state would just get forked by some
> new people, who will then drive it with their own goals. Except here
> that doesn't work, because the fork would soon fall into the same state
> as the original project, except the world would just be more
> fragmented. Couldn't we as well just loosen up on the master branch and
> let stuff land whenever someone is active and someone else doesn't see
> anything bad in it? There are always the stable branches, too, for
> those who want to stick to old and well-tested code.
>
> Yes, the software quality will likely degrade somewhat, at least from
> the old maintainers' perspective. However, the alternative seems to be a
> completely stalled project. Which one is better?
>
> FWIW, distros (well, Raspbian at least) already maintain their own
> forks, most likely as a single-person project. At upstream we could at
> least ai

[Pixman] [PATCH] lowlevel-blt-bench: Parse test name strings in general case

2015-04-07 Thread Ben Avison
There are many types of composite operation that are useful to benchmark but
which are omitted from the table. Continually having to add extra entries to
the table is a nuisance and is prone to human error, so this patch adds the
ability to break down unknow strings of the format
  _[_[_ca]
where bitmap formats are specified by number of bits of each component
(assumed in ARGB order) or 'n' to indicate a solid source or mask.

The only downside to this approach is that specifying 'all' instead of a
specific operation will remain limited to the set of operations from the
table.
---
 test/lowlevel-blt-bench.c |  141 ++---
 1 files changed, 133 insertions(+), 8 deletions(-)

diff --git a/test/lowlevel-blt-bench.c b/test/lowlevel-blt-bench.c
index 3da094a..c8501d4 100644
--- a/test/lowlevel-blt-bench.c
+++ b/test/lowlevel-blt-bench.c
@@ -51,6 +51,12 @@
 
 #define EXCLUDE_OVERHEAD 1
 
+typedef struct
+{
+const char *name;
+uint32_tnumber;
+} name_to_number_t;
+
 uint32_t *dst;
 uint32_t *src;
 uint32_t *mask;
@@ -367,14 +373,14 @@ bench_RT (pixman_op_t  op,
 }
 
 void
-bench_composite (char * testname,
- intsrc_fmt,
- intsrc_flags,
- intop,
- intmask_fmt,
- intmask_flags,
- intdst_fmt,
- double npix)
+bench_composite (const char * testname,
+ int  src_fmt,
+ int  src_flags,
+ int  op,
+ int  mask_fmt,
+ int  mask_flags,
+ int  dst_fmt,
+ double   npix)
 {
 pixman_image_t *src_img;
 pixman_image_t *dst_img;
@@ -719,12 +725,124 @@ tests_tbl[] =
 { "rpixbuf",   PIXMAN_x8b8g8r8,0, PIXMAN_OP_SRC, 
PIXMAN_a8b8g8r8, 0, PIXMAN_a8b8g8r8 },
 };
 
+static uint32_t
+lookup_name_to_number (const char **string, const name_to_number_t *array, 
size_t entries)
+{
+size_t entry, i;
+for (entry = 0; entry < entries; entry++)
+{
+for (i = 0; (*string)[i] == array[entry].name[i]; i++)
+{
+if ((*string)[i] == 0)
+{
+*string += i;
+return array[entry].number;
+}
+}
+if ((*string)[i] == '_' && array[entry].name[i] == 0)
+{
+*string += i;
+return array[entry].number;
+}
+}
+return 0;
+}
+
+static int
+parse_and_test (const char *pattern)
+{
+static const name_to_number_t combine_type[] = {
+{ "src",  PIXMAN_OP_SRC },
+{ "over_reverse", PIXMAN_OP_OVER_REVERSE },
+{ "overrev",  PIXMAN_OP_OVER_REVERSE },
+{ "over", PIXMAN_OP_OVER },
+{ "in_reverse",   PIXMAN_OP_IN_REVERSE },
+{ "inrev",PIXMAN_OP_IN_REVERSE },
+{ "in",   PIXMAN_OP_IN },
+{ "out_reverse",  PIXMAN_OP_OUT_REVERSE },
+{ "outrev",   PIXMAN_OP_OUT_REVERSE },
+{ "out",  PIXMAN_OP_OUT },
+{ "atop_reverse", PIXMAN_OP_ATOP_REVERSE },
+{ "atoprev",  PIXMAN_OP_ATOP_REVERSE },
+{ "atop", PIXMAN_OP_ATOP },
+{ "xor",  PIXMAN_OP_XOR },
+{ "add",  PIXMAN_OP_ADD }
+};
+static const name_to_number_t format[] = {
+{ "", PIXMAN_a8r8g8b8 },
+{ "x888", PIXMAN_x8r8g8b8 },
+{ "0565", PIXMAN_r5g6b5 },
+{ "1555", PIXMAN_a1r5g5b5 },
+{ "8",PIXMAN_a8 }
+};
+
+const char *p = pattern;
+int src_fmt;
+int src_flags = 0;
+int mask_fmt;
+int mask_flags = 0;
+int dst_fmt;
+int op = lookup_name_to_number(&p, combine_type, sizeof 
combine_type / sizeof *combine_type);
+if (op == 0 || *p++ != '_')
+return 0;
+
+if (p[0] == 'n' && p[1] == '_')
+{
+src_fmt = PIXMAN_a8r8g8b8;
+src_flags = SOLID_FLAG;
+++p;
+}
+else
+src_fmt = lookup_name_to_number(&p, format, sizeof format / sizeof 
*format);
+if (src_fmt == 0 || *p++ != '_')
+return 0;
+
+if (p[0] == 'n' && p[1] == '_')
+{
+mask_fmt = PIXMAN_a8r8g8b8;
+mask_flags = SOLID_FLAG;
+++p;
+}
+else
+mask_fmt = lookup_name_to_number(&p, format, sizeof format / sizeof 
*format);
+if (mask_fmt == 0)
+return 0;
+
+if (*p == 0)
+{
+dst_fmt = mask_fmt;
+mask_fmt = PIXMAN_null;
+}
+else
+{
+++p;
+dst_fmt = lookup_name_to_number(&p, format, sizeof format / sizeof 
*format);
+if (strcmp (p, "_ca") == 0)
+mask_flags |= CA_FLAG;
+else if (*p != 0)

Re: [Pixman] [PATCH 4/5] lowlevel-blt-bench: Parse test name strings in general case

2015-04-07 Thread Ben Avison

Hi,

Sorry for the delay in following up...

On Mon, 16 Mar 2015 14:33:47 -, Pekka Paalanen  wrote:

I understood these arrays should have been replaced by
format_from_string/operator_from_string functions from patch 2 and
similar new functions doing lookups in the same arrays.

Hmm, but I see there are several alternative spellings for operators,
and the formats follow a whole new convention.


Some background to this might help. Pixman supports a lot of pixel
formats; of particular significance is the fact that in many cases you
can have the same bits-per-pixel, divided up into the same bitpattern of
colour components, but with the red and blue components exchanged.

Each of Pixman's operations act upon between 1 and 3 bitmaps, each of
which may have any supported pixel format. But in practice, any given
application tends to stick to the same colour component ordering for most
or all of its images, so the actual number of operations you're likely to
encounter in practice is 1/2 (for 2-image operations) or 1/4 (for 3-image
operations) of what you might otherwise expect. Furthermore,
mathematically, an operation on (for example) two a8r8g8b8 images is
identical to one on two a8b8g8r8 images, so typically a fast path written
for one will be listed in the fast path table under both image formats.
Because of this, a naming convention has evolved in the source code where
fast path names include the string  as an indication that it can be
applied to either a8r8g8b8 or a8b8g8r8, with the implicit assumption that
the other image has the same colour component ordering.

lowlevel-blt-bench is most useful when you're testing the effectiveness
of a particular fast path, so it makes sense that its test pattern names
reflect the names of the fast path function that you're interested in.
However, there are a few tests, like "src_0888__rev" or "rpixbuf"
where the limitations of this approach start to show. I suspected I would
get objections if I changed the command line of lowlevel-blt-bench, but
in introducing a new tool, affine-bench, I saw an opportunity to allow
the pixel formats to be specified more directly, and deliberately made
its syntax different.

It is a matter for debate as to whether the two tools should use the same
syntax or not, and if so, which one they should standardise on. I'd be a
bit uncomfortable with saying that "" is an alias for "a8r8g8b8" or
"0565" for "r5g6b5", because that's not true in both directions, as I
have described. I'd probably choose the more verbose form if the
consensus was that the two tools should match. Is there anyone who cares
either way, though?


Personally I'm not really much of a fan of this much nesting,
especially where you have a number of parsing steps and each step nests
a bit more. I'd refactor this code into a function and use early (error)
returns instead of nesting.


Quick-and-dirty code there - it was only a test harness after all. But
easily fixed.


Looks like there is also no complaint, if the given string is not a
valid test name.


It never did generate any sort of error if you specified a bad test
string - you just wouldn't have seen any results. Another easy fix -
updated patch following shortly.

Ben
___
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman