On Tue, 3 Mar 2015 15:24:20 +0000 Ben Avison <bavi...@riscosopen.org> wrote:
> --- > test/Makefile.sources | 1 + > test/affine-bench.c | 330 > +++++++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 331 insertions(+), 0 deletions(-) > create mode 100644 test/affine-bench.c > > diff --git a/test/Makefile.sources b/test/Makefile.sources > index c20c34b..8b0e855 100644 > --- a/test/Makefile.sources > +++ b/test/Makefile.sources > @@ -37,6 +37,7 @@ OTHERPROGRAMS = \ > radial-perf-test \ > check-formats \ > scaling-bench \ > + affine-bench \ > $(NULL) affine-bench should be added to .gitignore too. > > # Utility functions > diff --git a/test/affine-bench.c b/test/affine-bench.c > new file mode 100644 > index 0000000..55478d6 > --- /dev/null > +++ b/test/affine-bench.c > @@ -0,0 +1,330 @@ > +/* > + * Copyright © 2014 RISC OS Open Ltd > + * > + * Permission to use, copy, modify, distribute, and sell this software and > its > + * documentation for any purpose is hereby granted without fee, provided that > + * the above copyright notice appear in all copies and that both that > + * copyright notice and this permission notice appear in supporting > + * documentation, and that the name of the copyright holders not be used in > + * advertising or publicity pertaining to distribution of the software > without > + * specific, written prior permission. The copyright holders make no > + * representations about the suitability of this software for any purpose. > It > + * is provided "as is" without express or implied warranty. > + * > + * THE COPYRIGHT HOLDERS DISCLAIM ALL WARRANTIES WITH REGARD TO THIS > + * SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND > + * FITNESS, IN NO EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY > + * SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES > + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN > + * AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING > + * OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS > + * SOFTWARE. > + * > + * Author: Ben Avison (bavi...@riscosopen.org) > + */ > + > +#include <stdio.h> > +#include <stdlib.h> > +#include <string.h> > +#include <ctype.h> > +#include <stdint.h> > +#include "utils.h" > + > +#ifdef HAVE_GETTIMEOFDAY > +#include <sys/time.h> > +#else > +#include <time.h> > +#endif > + > +#define WIDTH 1920 > +#define HEIGHT 1080 > +#define L2CACHE_SIZE (128 * 1024) Hi, I see lowlevel-blt-bench.c also uses L2CACHE_SIZE, but where does it come from? If it's really an arch/platform-independent constant, maybe some test header could have it? > + > +typedef struct > +{ > + pixman_fixed_48_16_t x1; > + pixman_fixed_48_16_t y1; > + pixman_fixed_48_16_t x2; > + pixman_fixed_48_16_t y2; > +} box_48_16_t; Looks like Pixman style is using separate statements for struct definition and typedefs. > + > +static pixman_bool_t > +compute_transformed_extents (pixman_transform_t *transform, > + const pixman_box32_t *extents, > + box_48_16_t *transformed) CODING_STYLE is asking for alignment here for the arg names. > +{ > + pixman_fixed_48_16_t tx1, ty1, tx2, ty2; > + pixman_fixed_t x1, y1, x2, y2; > + int i; > + > + x1 = pixman_int_to_fixed (extents->x1) + pixman_fixed_1 / 2; > + y1 = pixman_int_to_fixed (extents->y1) + pixman_fixed_1 / 2; > + x2 = pixman_int_to_fixed (extents->x2) - pixman_fixed_1 / 2; > + y2 = pixman_int_to_fixed (extents->y2) - pixman_fixed_1 / 2; > + > + if (!transform) > + { > + transformed->x1 = x1; > + transformed->y1 = y1; > + transformed->x2 = x2; > + transformed->y2 = y2; If there is no transform, why not return the original extents? These have been reduced by a half unit in all four directions. > + > + return TRUE; > + } > + > + tx1 = ty1 = INT64_MAX; > + tx2 = ty2 = INT64_MIN; > + > + for (i = 0; i < 4; ++i) > + { > + pixman_fixed_48_16_t tx, ty; > + pixman_vector_t v; > + > + v.vector[0] = (i & 0x01)? x1 : x2; > + v.vector[1] = (i & 0x02)? y1 : y2; > + v.vector[2] = pixman_fixed_1; > + > + if (!pixman_transform_point (transform, &v)) > + return FALSE; > + > + tx = (pixman_fixed_48_16_t)v.vector[0]; > + ty = (pixman_fixed_48_16_t)v.vector[1]; This works only for affine. This is called affine-bench, so that is natural, yet might be nice to add an assert or something to guarantee no-one will copy this code to something else that uses projective transforms. Or, maybe having "affine" as part of the function name? > + > + if (tx < tx1) > + tx1 = tx; > + if (ty < ty1) > + ty1 = ty; > + if (tx > tx2) > + tx2 = tx; > + if (ty > ty2) > + ty2 = ty; > + } > + > + transformed->x1 = tx1; > + transformed->y1 = ty1; > + transformed->x2 = tx2; > + transformed->y2 = ty2; > + > + return TRUE; > +} > + > +static void > +create_image (uint32_t width, uint32_t height, pixman_format_code_t format, > pixman_filter_t filter, const pixman_transform_t *t, uint32_t **bits, > pixman_image_t **image) Too long line, needs breaking up. > +{ > + uint32_t stride = (width * PIXMAN_FORMAT_BPP(format) + 31) / 32 * 4; Add empty line. > + *bits = aligned_malloc (4096, stride * height); Is there a #define to get the 4096 from? I assume it's page size? What if the platform has big pages? > + memset (*bits, 0xCC, stride * height); > + *image = pixman_image_create_bits (format, width, height, *bits, stride); > + pixman_image_set_repeat (*image, PIXMAN_REPEAT_NORMAL); > + pixman_image_set_filter (*image, filter, NULL, 0); > +} > + > +static void > +flush_cache (void) > +{ > + static const char clean_space[L2CACHE_SIZE]; > + volatile const char *x = clean_space; > + const char *clean_end = clean_space + sizeof clean_space; > + while (x < clean_end) > + { > + (void) *x; Are you sure this actually does the read? That the compiler is not allowed to discard the read? That's something I'm never sure of. > + x += 32; Why 32? Is this CACHELINE_LENGTH from lowlevel-blt-bench.c? Maybe there should be a header with the shared bench definitions? > + } > +} > + > +/* Obtain current time in microseconds modulo 2^32 */ > +uint32_t > +gettimei (void) > +{ > +#ifdef HAVE_GETTIMEOFDAY > + struct timeval tv; > + > + gettimeofday (&tv, NULL); > + return tv.tv_sec * 1000000 + tv.tv_usec; > +#else > + return (double) clock () / (double) CLOCKS_PER_SEC; This returns seconds; copy-pasta from utils.c? It might also be worth considering adding support for clock_gettime, because that allows you to use CLOCK_MONOTONIC or even CLOCK_MONOTOMIC_RAW. Though, maybe that's not necessary since below you use a 5 second run time. > +#endif > +} > + > +static void > +pixman_image_composite_wrapper (pixman_composite_info_t *info) > +{ > + pixman_image_composite (info->op, > + info->src_image, info->mask_image, > info->dest_image, > + info->src_x, info->src_y, > + info->mask_x, info->mask_y, > + info->dest_x, info->dest_y, > + info->width, info->height); > +} > + > +static void > +pixman_image_composite_empty (pixman_composite_info_t *info) > +{ > + pixman_image_composite (info->op, > + info->src_image, info->mask_image, > info->dest_image, > + info->src_x, info->src_y, > + info->mask_x, info->mask_y, > + info->dest_x, info->dest_y, > + 1, 1); > +} > + > +static void > +bench (pixman_op_t op, > + pixman_transform_t *t, > + pixman_image_t *src_image, > + pixman_image_t *mask_image, > + pixman_image_t *dest_image, > + int32_t src_x, > + int32_t src_y, > + uint32_t max_n, > + uint32_t max_time, > + uint32_t *ret_n, > + uint32_t *ret_time, > + void (*func) (pixman_composite_info_t *info)) > +{ > + uint32_t n = 0; > + uint32_t t0 = gettimei (); > + uint32_t t1; > + uint32_t x = 0; Empty line. > + do > + { > + pixman_composite_info_t info; Empty line. > + if (++x >= 64) > + x = 0; Empty line. > + info.op = op; > + info.src_image = src_image; > + info.mask_image = mask_image; > + info.dest_image = dest_image; > + info.src_x = 0; > + info.src_y = 0; > + info.mask_x = 0; > + info.mask_y = 0; > + info.dest_x = 63 - x; > + info.dest_y = 0; > + info.width = WIDTH; > + info.height = HEIGHT; Empty line. > + t->matrix[0][2] = pixman_int_to_fixed (src_x + x); > + t->matrix[1][2] = pixman_int_to_fixed (src_y); > + pixman_image_set_transform (src_image, t); Empty line. Hmm, what's the idea here with the transform? > + if (mask_image) > + pixman_image_set_transform (mask_image, t); Empty line. > + func (&info); > + t1 = gettimei (); > + } > + while (++n < max_n && (t1 - t0) < max_time); Empty line. > + if (ret_n) > + *ret_n = n; Empty line. > + *ret_time = t1 - t0; > +} > + > +int > +main (int argc, char *argv[]) > +{ > + pixman_filter_t filter = PIXMAN_FILTER_NEAREST; > + pixman_transform_t t; > + pixman_op_t op = PIXMAN_OP_SRC; > + pixman_format_code_t src_format = PIXMAN_a8r8g8b8; > + pixman_format_code_t mask_format = 0; > + pixman_format_code_t dest_format = PIXMAN_a8r8g8b8; > + > + uint32_t *src, *mask, *dest; > + pixman_image_t *src_image, *mask_image = NULL, *dest_image; > + pixman_box32_t dest_box = { 0, 0, WIDTH, HEIGHT }; > + box_48_16_t transformed = { 0 }; > + int32_t xmin, ymin, xmax, ymax; > + int32_t src_x, src_y; > + > + pixman_transform_init_identity (&t); > + > + ++argv; > + if (*argv && (*argv)[0] == '-' && (*argv)[1] == 'n') > + { > + filter = PIXMAN_FILTER_NEAREST; > + ++argv; > + --argc; > + } > + if (*argv && (*argv)[0] == '-' && (*argv)[1] == 'b') > + { > + filter = PIXMAN_FILTER_BILINEAR; > + ++argv; > + --argc; > + } > + if (argc == 1) > + { > + printf ("Usage: affine-bench [-n] [-b] axx [axy] [ayx] [ayy] > [combine type]\n"); > + printf (" [src format] [mask format] [dest > format]\n"); > + printf (" -n : nearest scaling (default)\n"); > + printf (" -b : bilinear scaling\n"); > + printf (" axx : x_out:x_in factor\n"); > + printf (" axy : x_out:y_in factor (default 0)\n"); > + printf (" ayx : y_out:x_in factor (default 0)\n"); > + printf (" ayy : y_out:y_in factor (default 1)\n"); > + printf (" combine type : src, over, in etc (default src)\n"); > + printf (" src format : a8r8g8b8, r5g6b6 etc (default a8r8g8b8)\n"); > + printf (" mask format : as for src format, but no mask used if > omitted\n"); > + printf (" dest format : as for src format (default a8r8g8b8)\n"); > + return EXIT_FAILURE; > + } > + else > + { > + t.matrix[0][0] = pixman_double_to_fixed (strtod (*argv, NULL)); > + if (*++argv) > + { > + t.matrix[0][1] = pixman_double_to_fixed (strtod (*argv, NULL)); > + if (*++argv) > + { > + t.matrix[1][0] = pixman_double_to_fixed (strtod (*argv, > NULL)); > + if (*++argv) > + { > + t.matrix[1][1] = pixman_double_to_fixed (strtod (*argv, > NULL)); > + if (*++argv) > + { > + op = operator_from_string (*argv); > + if (*++argv) > + { > + src_format = format_from_string (*argv); > + ++argv; > + if (argv[0] && argv[1]) > + { > + mask_format = format_from_string (*argv); > + ++argv; > + } > + if (*argv) > + dest_format = format_from_string (*argv); > + } > + } > + } > + } > + } All this nesting should be refactored away by using a helper function with an early return for each else-branch. I don't see invalid arguments detected at all? Should we be able to give, say, src type without giving all the matrix elements? That doesn't work atm. This is getting complicated enough that maybe using getopt would be good here. When you have lots of optional positional arguments, it gets really hard to detect what is what. > + } > + > + compute_transformed_extents (&t, &dest_box, &transformed); > + xmin = pixman_fixed_to_int (transformed.x1 - 8 * pixman_fixed_e - > pixman_fixed_1 / 2); > + ymin = pixman_fixed_to_int (transformed.y1 - 8 * pixman_fixed_e - > pixman_fixed_1 / 2); > + xmax = pixman_fixed_to_int (transformed.x2 + 8 * pixman_fixed_e + > pixman_fixed_1 / 2); > + ymax = pixman_fixed_to_int (transformed.y2 + 8 * pixman_fixed_e + > pixman_fixed_1 / 2); > + src_x = -xmin; > + src_y = -ymin; Oh! I was wondering what is going on here with the 8*e, grepped for pixman_fixed_e, and noticed that this code comes from pixman.c. I think you should copy also the comments, not only code. And now I see also compute_transformed_extents is copied from pixman.c. It still doesn't invalidate my questions - why does the original function do what it does? Why is no transform producing different results from identity transform (AFAIU)? > + > + create_image (xmax - xmin + 64, ymax - ymin + 1, src_format, filter, &t, > &src, &src_image); Empty line. Ok, so you compute the source image size from the transformed extents size, use src_x,y to position it to avoid translation increasing memory allocation, and have a 64 pixel margin... is the margin supposed to be the same on all sides or only on the right? What is that 64 about, anyway? A minimal size in case the transform would make the source too small? A minimal size of 64x1 - any particular reason for 64? > + if (mask_format) > + { > + create_image (xmax - xmin + 64, ymax - ymin + 1, mask_format, > filter, &t, &mask, &mask_image); > + if ((PIXMAN_FORMAT_R(mask_format) || PIXMAN_FORMAT_G(mask_format) || > PIXMAN_FORMAT_B(mask_format))) > + pixman_image_set_component_alpha (mask_image, 1); > + } Empty line. > + create_image (WIDTH + 64, HEIGHT, dest_format, filter, &t, &dest, > &dest_image); > + > + { > + uint32_t n; /* number of iterations in at least 5 seconds */ > + uint32_t t1; /* time taken to do n iterations, microseconds */ > + uint32_t t2; /* calling overhead for n iterations, microseconds */ Empty line. > + flush_cache (); I wonder if flush_cache here has any meaning... bench is doing multiple rounds, so flush_cache affects only the first round. Is that intended? > + bench (op, &t, src_image, mask_image, dest_image, src_x, src_y, > UINT32_MAX, 5000000, &n, &t1, pixman_image_composite_wrapper); > + bench (op, &t, src_image, mask_image, dest_image, src_x, src_y, n, > UINT32_MAX, NULL, &t2, pixman_image_composite_empty); > + /* The result indicates the output rate in megapixels/second */ > + printf ("%6.2f\n", (double) n * WIDTH * HEIGHT / (t1 - t2)); > + } I find it a little odd to have a block like this. Maybe other code should be moved into functions instead? Or this code? > + > + return EXIT_SUCCESS; > +} Did you check how many rounds a Raspberry Pi 1 can do? Does it do more than two in 5 seconds with an affine transform that is nearly identity but not quite? There are lots of small details to improve, but the idea seems fine to me. What I would add is better checking for whether you are actually running what you think you are: abort on invalid arguments, and maybe a verbose mode for checking the arguments were interpreted correctly, also printing the transformed extents. I don't trust myself that much when writing commands so sanity checking would be cool. Thanks, pq _______________________________________________ Pixman mailing list Pixman@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/pixman