Daniel Phillips wrote:
Hi Timothy,

There was a thread on rasterization back there somewhere in which a few points were left unresolved or resolved suboptimally. I'll try to address some of these now. Caveat: it's been ten years since I last implemented this and both my source (in Forth) and reference material are packed away somewhere in boxes at the moment, so this is all from memory.

The definition of "accurate" rasterization of a trapezoid is:

  - Every pixel whose center lies within the trapezoid is filled
    exactly once

  - No pixel whose center lies outside the trapezoid is filled

We do this.


As a consequence, every pixel in a mesh of trapezoids is filled exactly once and no pixels inside the mesh are left empty. Insideness is defined in terms of half spaces:


  - A pixel is 'inside' a left bound if its X coordinate is greater
    than the bound.

  - A pixel is 'inside' a right bound if its X coordinate is less
    than or equal to the bound.

We do either this or its opposite, depending on our definition of "round".

What I WANT to do is define "round" so that it rounds up if the parameter is exactly 0.5. The hardware for that is MUCH simpler than the alternative.


We decide whether or not to draw a pixel based on its 'winding count', which is the sum of left boundary insideness less the sum of right boundary non-insideness:


  - If the current polygon mode is "winding", draw the pixel if its
    winding count is nonzero

  - If the current polygon mode is "alternating", draw the pixel if
    its winding count is odd

For what it's worth, "left boundary insideness" and "right boundary non-insideness" are the same thing.

We don't concern ourselves with winding, since that's something the vertex processor needs to do. We only handle triangles.


Y coordinate insideness is exactly analogous to X coordinate insideness, except that the insideness result applies to entire trapezoid scans rather than individual pixels. (Here let me plead for no departure from the usual 2D treatment of Y coordinates, that is: increasing device Y corresponds to increasing memory address. Any consequent Y negation should be handled in the device coordinate transformation.)

Above, I define insideness in terms of pixel centers, but it is better to work with upper left pixel coordinates instead, and compensate by biasing all geometry one half pixel to the lower right. (If we were to implement multisampling, say by a factor of four, we would bias all geometry by a quarter pixel.) With this simplification we can now forget about rounding and work in terms of truncation (i.e., floor).

I designed the simulation model to deal with pixel centers and get the right results. The reason you can't pre-bias is that it breaks down when you do MIPmapping. Therefore, we just have to do it "right".



I seem to recall it was decided that geometry should be processed in fixed point, as opposed to floating point for color, texture and depth parameters. Considering that the multipliers are 18 bit, I suggest 14.18 fixed point format, or perhaps 12.20 on the theory that a couple of bits of extra interpolation accuracy are worth the extra shifts.

We're using floats. 8 exponent, 1 sign, 16 mantissa.

Now we get to the meat. To fill a pixel span from a left to a right trapezoid boundary:

  - Truncate the interpolated X coordinate of the left edge and add one,
    giving the X address of the leftmost pixel to fill

  - Truncate the interpolated X coordinate of the right edge, giving
    the X address of the rightmost pixel to fill

The count of pixels to fill is thus trunc(Xright) - trunc(Xleft), which may be zero. Note that we do not use ceiling(Xleft) for the leftmost pixel because that would violate the half space conditions, causing pixel overdraw between adjacent trapezoids whenever Xleft is an exact integer.

We use round(). See the model.

We can test that this is all working correctly by reducing the calculated pixel span count by one. A horizontal single pixel gap should appear between all triangles. We can test the handling of vertical coordinates similarly.

I did tests with abutting triangles, and it works exactly as it should in terms of never double-drawing pixels.


All interpolants, i.e., texture U and V, R/G/B and W or 1/W need to be adjusted before span filling begins. Given T, an interpolant, first calculate the per-pixel X delta as:

  Tdelta = (Tright - Tleft) / (Xright - Xleft)

then:

  Tadjusted = Tleft + Tdelta*(1 - frac(Xleft))

This calculation is more complex for perspective interpolation because the adjustment has to be performed on the inverse of the interpolant. Could somebody please take the time to work out an efficient algorithm for this? Even per-span divides have to be kept to a minimum otherwise small triangles become too expensive.

We do perspective correction by taking the reciprocal of W and multiplying.

A number of factors determine texel rendering accuracy, and the correctness and accuracy of the above interpolant adjustment is one of the main ones. When the job is done properly, boundaries between rows of rendered texels will appear exactly linear no matter how near the viewpoint is to the polygon plane. Both the jaggies in the texel boundaries and the jaggies in the polygon edges will swim across the screen monotonically and smoothly as we animate the viewpoint, no matter how finely; jaggies will never suddenly "jump backwards" from frame to frame even if the viewpoint is rotating.

This is something we need to test. But I think we do it right.

Besides the tricky interpolant accuracy issues that need ironing out, numerical accuracy problems will also show up. Multiplier precision problems typically manifest as short runs of single pixel "noise" in what ought to be simple jags along texel boundaries. While it is nigh on impossible to eliminate such noise completely except by filtering, such artifacts can be made very rare.

There are two shortcuts we take in the math: Limiting float mantissas to 16 bits, and using only 10 mantissa bits when computing reciprocal. Everything else is done accurately.



With careful attention to numerical issues, much of the clamping of generated texture coordinates can be eliminated, although in my experience eliminating all clamping is very difficult.


Some logic can be factored out of the per-pixel pipeline by directly interpolating addresses as opposed to coordinates.

Well, we'll figure out what we need to factor out when we need to do that. What's that they say about pre-optimizing? :)


Again, I did all of this quite a few years ago, so it needs to be checked over carefully.

Have a look at the software model and tell us if you notice any problems.

Thanks!
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
