https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90070

            Bug ID: 90070
           Summary: Add optimization for optimizing small integer values
                    by fp integral constant
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: meissner at gcc dot gnu.org
  Target Milestone: ---

I was looking at the SPEC 2017 imagick benchmark, and in particular at the hot
function in enhance.c.

The code has many statements of the following form:

typedef struct _PixelPacket
{
  unsigned short blue;
  unsigned short green;
  unsigned short red;
  unsigned short opacity;
} PixelPacket;

typedef struct _MagickPixelPacket
{
  float red;
  float green;
  float blue;
  float opacity;
  float index;
} MagickPixelPacket;

/* ... */

foo () {
  MagickPixelPacket aggregate;

  /* ... */

  aggregate.red+=(5.0)*((r)->red);

  /* ... */
}

The accumulation statement is effectively evaluated as:

  double temp1 = (double)r->red;
  double temp2 = (double)aggregate.red;
  double temp3 = temp2 + (temp1 * 5.0);
  aggregate.red = (float) temp3;

This is due to 5.0 being considered a double precision constant.
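
The effect of the constant's type can be seen without running the benchmark.
The following is only a sketch; the two function names are made up for
illustration and are not part of imagick or GCC. It reports the size of the
product's type when the constant is written as 5.0 and as 5.0f:

```c
#include <stddef.h>

/* Hypothetical helper: an unsuffixed constant such as 5.0 has type double,
   so the product of 5.0 and an unsigned short has type double. */
size_t product_size_double_constant(unsigned short red)
{
    return sizeof(5.0 * red);   /* sizeof does not evaluate the product */
}

/* Hypothetical helper: with the f suffix the constant has type float,
   so the product has type float. */
size_t product_size_float_constant(unsigned short red)
{
    return sizeof(5.0f * red);
}
```

The first function returns sizeof(double) and the second sizeof(float), which
is why the statement above is widened to double before the multiply.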

It occurs to me that on many machines, multiplying an int by 5 is cheaper than
multiplying a double by 5.0.  In particular, since you are multiplying an unsigned
short by 5.0, you know the product will fit in a 32-bit or 64-bit integer.  This
would mean the example might be executed as:

  long temp1 = (long)r->red;
  long temp2 = 5 * temp1;
  float temp3 = (float) temp2;
  aggregate.red += temp3;
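
Whether the integer multiply gives the same product as the double multiply can
be checked exhaustively, because an unsigned short has so few values.  A
minimal sketch, assuming a 16-bit unsigned short; the function name
count_multiply_mismatches is made up for illustration:

```c
#include <limits.h>

/* Hypothetical helper: counts the unsigned short inputs for which
   multiplying by 5 as an integer and by 5.0 as a double disagree,
   either after conversion to double or after conversion to float. */
long count_multiply_mismatches(void)
{
    long mismatches = 0;

    for (long v = 0; v <= USHRT_MAX; v++) {
        unsigned short red = (unsigned short)v;
        double fp_product = 5.0 * (double)red;  /* multiply in double */
        long int_product = 5 * (long)red;       /* multiply as integer */

        if ((double)int_product != fp_product)
            mismatches++;
        /* The float is promoted to double for this comparison. */
        else if ((float)int_product != fp_product)
            mismatches++;
    }
    return mismatches;
}
```

With a 16-bit unsigned short the function returns 0: the largest product,
5 * 65535 = 327675, is well below 2^24, so it is exactly representable as a
float as well as a double.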

Without -ffast-math, in case there are rounding issues, it would perhaps need
to keep the addition in double precision instead:

  long temp1 = (long)r->red;
  long temp2 = 5 * temp1;
  double temp3 = aggregate.red;
  double temp4 = (double) temp2;
  double temp5 = temp3 + temp4;
  aggregate.red = (float) temp5;
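
The double-precision variant can be compared against the original statement
over every unsigned short input.  This is only a sketch: it reads the final
step as an addition, to match the += in the original statement, and all three
function names are hypothetical.

```c
#include <limits.h>

/* The statement as written in the source: 5.0 is a double constant, so
   the multiply and the add happen in double, then round once to float. */
float accumulate_original(float aggregate_red, unsigned short red)
{
    aggregate_red += (5.0) * red;
    return aggregate_red;
}

/* Sketch of the proposed form: integer multiply, double-precision add. */
float accumulate_int_multiply(float aggregate_red, unsigned short red)
{
    long temp1 = (long)red;
    long temp2 = 5 * temp1;
    double temp3 = aggregate_red;
    double temp4 = (double)temp2;
    double temp5 = temp3 + temp4;
    return (float)temp5;
}

/* Counts the unsigned short inputs for which the two forms disagree
   when accumulating onto the given starting value. */
long count_accumulate_mismatches(float start)
{
    long mismatches = 0;

    for (long v = 0; v <= USHRT_MAX; v++) {
        unsigned short red = (unsigned short)v;
        if (accumulate_original(start, red) !=
            accumulate_int_multiply(start, red))
            mismatches++;
    }
    return mismatches;
}
```

On a target that evaluates double expressions in double precision,
count_accumulate_mismatches is expected to return 0 for any finite starting
value, since both functions perform the same double-precision addition
followed by a single rounding to float.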
