swrast

bugzilla-daemon Mon, 16 Mar 2015 01:19:10 -0700

https://bugs.freedesktop.org/show_bug.cgi?id=89586


--- Comment #4 from Dan Sebald <daniel.seb...@ieee.org> ---
I'm now going to attach the full patch which fixes what I believe is another
subtle bug and attempts to make things slightly more efficient in the innermost
loop.

I notice in the following hunk of code from s_zoom.c:

static inline GLint
unzoom_x(GLfloat zoomX, GLint imageX, GLint zx)
{
   /*
   zx = imageX + (x - imageX) * zoomX;
   zx - imageX = (x - imageX) * zoomX;
   (zx - imageX) / zoomX = x - imageX;
   */
   GLint x;
   if (zoomX < 0.0)
      zx++;
   x = imageX + (GLint) ((zx - imageX) / zoomX);
   return x;
}

there is a test for zoomX < 0.0.  That zoomX test is run for every unzoomed
computation, which can be a lot of tests if the image displayed is large.  (I
will say more about efficiency later.)  That is, in the calling routine:

         for (i = 0; i < zoomedWidth; i++) {
            GLint j = unzoom_x(ctx->Pixel.ZoomX, imgX, x0 + i) - span->x;
            assert(j >= 0);
            assert(j < (GLint) span->end);
            COPY_4V(zoomed.array->attribs[VARYING_SLOT_COL0][i], rgba[j]);
         }

that value of ctx->Pixel.ZoomX is a constant.  So, really, the zoom sign test
could be brought outside the for-loop, and perhaps increase efficiency by, oh,
maybe 25%.  (For the loop, but there is inefficiency elsewhere of course so may
not be noticeable.  However, I'm one who tends to aim for efficiency.)  That
is, this "if (zoomX < 0.0)" is probably one or two machine instructions, while
"x = imageX + (GLint) ((zx - imageX) / zoomX);" is probably about five machine
instructions.

In the full patch I have replaced the unzoom_x() with a positive_unzoom_x() and
negative_unzoom_x().  As you might imagine, the changeset then first tests the
polarity of zoomX the picks a loop to run, depending on the sign of zoomX. 
This is done in a clean and graceful way using a macro called SPAN_LOOP_X()
where the argument is one or two lines of code that is run inside the loop,
i.e., the memory copy.

Now, having done the positive/negative_unzoom_x() construct, I also wrote some
more extensive formula derivations in comments in the code to show what I think
is a discrepancy with the formula of unzoom_x() above, in particular at least
this

   if (zoomX < 0.0)
      zx++;

By my derivation, the addition (effectively subtraction) should be done after
the division operation.  Doing the addition before dividing is something
different that really doesn't have the same effect.  Also, notice there is the
subtle case of ((xz - xr) / xfactor) being an integer that isn't properly
addressed.  It takes some time to think through, and if you don't agree with my
derivation/code, point out what is wrong and we can refine things.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 89586] Drivers/DRI/swrast

Reply via email to