On Wednesday 09 February 2005 21:19, Daniel Phillips wrote:
> On Wednesday 09 February 2005 07:55, Lourens Veen wrote:
> > Ah. I think I'm starting to see your point about 13 bits not being
> > enough.
>
> I failed to mention an especially easy way to amplify the error:
> repeating textures.  Say you're viewing a long hall with repeating
> textures on the wall.  Small errors at the far end of the hall will
> amplify into multi-texel errors at the near end.  Masking doesn't work
> because the math is nonlinear.  The easiest fix is to have some
> precision in reserve and otherwise ignore the problem.
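
One way to read the point above, as a rough sketch: a relative error in the perspective divide becomes an absolute error proportional to the unwrapped texture coordinate, and a repeating texture makes that coordinate large. The function name and the numbers below (32 repeats, a 256-texel texture) are hypothetical and chosen only for illustration.

```python
def texel_error(repeats, texture_size, precision_bits):
    """Approximate absolute error, in texels, of an unwrapped coordinate.

    Assumes the divide result carries a relative error of 2**-precision_bits.
    """
    u_texels = repeats * texture_size          # coordinate before wrapping
    return u_texels * 2.0 ** -precision_bits


def wrap(u, texture_size):
    """Repeat-mode wrapping (the 'masking' step) applied after the divide."""
    return u % texture_size


# Wrapping happens after the error is already in the coordinate, so it
# does not shrink the error; it only moves the result into one tile.
err_13 = texel_error(32, 256, 13)
err_17 = texel_error(32, 256, 17)
```

Under these assumptions a 13-bit result is off by a whole texel after 32 repeats, while 17 bits keeps the error to a sixteenth of a texel.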

Urgh! I've been trying to get my head around this, but I just don't grok it. 
We draw trapezoids with the top and bottom horizontal. So we have a linear 
interpolation between the top-left and the bottom-left values, and a linear 
interpolation along the right edge. That gives us a start and end value for 
each horizontal line, across which we do another linear interpolation to get 
the value for each pixel. Then we divide these values by M to perspective 
correct them, and then we use them either to look up texture data or directly 
as a colour?

That doesn't make sense to me at all. What am I missing?
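
A minimal sketch of the scheme as described in the paragraph above, for reference. It assumes that M is the interpolated 1/w and that the interpolated attribute is u/w; neither is stated in this thread, and all names and values are hypothetical.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def span(left_top, left_bottom, right_top, right_bottom, y_frac, n_pixels):
    """Interpolate down both edges, then across one horizontal line."""
    start = lerp(left_top, left_bottom, y_frac)    # value on the left edge
    end = lerp(right_top, right_bottom, y_frac)    # value on the right edge
    if n_pixels == 1:
        return [start]
    return [lerp(start, end, i / (n_pixels - 1)) for i in range(n_pixels)]


def perspective_correct(values, m_values):
    """Divide each interpolated value by the interpolated M at that pixel."""
    return [v / m for v, m in zip(values, m_values)]


# Hypothetical span: u = 0 at w = 1 on the left, u = 1 at w = 4 on the right.
u_over_w = span(0.0, 0.0, 0.25, 0.25, 0.0, 5)   # u/w at each pixel
m = span(1.0, 1.0, 0.25, 0.25, 0.0, 5)          # M = 1/w at each pixel
u = perspective_correct(u_over_w, m)
```

With these inputs the middle pixel gets u = 0.2 rather than the 0.5 that interpolating u directly in screen space would give, which is the effect the divide is there to produce.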

> The pragmatic approach is to just ignore this for the first release.
>
> > I am fairly sure I can make that full interpolation (which presently
> > does 15 bits) into full 16 bit precision by storing 17- or 18-bit
> > values instead of 16-bit ones.
>
> I vote for 18 bits, which would guarantee 17 bit precision.  The 17th
> bit would only live for a short while, being fed immediately into the
> perspective correction multipliers.  To me, every bit of extra
> precision is gold, so if it's easy to get, get it.

Okay, I'm using 18-bit values now, and I've almost got 17 bit precision. 
Almost.

Each span is 64 units (input is 16 bit, LUT is 10 bit) and I have two spans 
where the very first and last entries have a difference of 0, everything else 
has a difference of 1, except for the middle entry which has a difference of 
2. I guess we just get really really unlucky there with the rounding.

Anyway, these can be special cased, and then we have 17-bit precision (using 
two RAM blocks). Maybe I can even get rid of the problem by storing values in 
one table and differences in the other. Differences can then be in 8.10 fixed 
format, which may give enough precision to prevent the problem. Or I could do 
the same thing I'm doing for the 14/4 one-read version and have 19 bits value 
and 17 bits difference over the two tables. That will definitely give 17 bits 
of precision, and it saves a subtract operation.
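
A rough model of the table-plus-difference lookup discussed above, for anyone who wants to experiment. It assumes the table holds reciprocals 1/x for x in [1, 2), uses a 10-bit index with 64 steps per span, and scales entries by 2^18; the exact function, rounding, and bit widths in the real design may differ, and every name here is hypothetical.

```python
INDEX_BITS = 10   # LUT index width
FRAC_BITS = 6     # 16-bit input minus 10-bit index: 64 units per span
SCALE_BITS = 18   # fixed-point scale of the stored values (assumed)


def build_table():
    """Entries round(2**18 / (1 + i/1024)) for i = 0..1024 inclusive."""
    n = 1 << INDEX_BITS
    return [round((1 << SCALE_BITS) * n / (n + i)) for i in range(n + 1)]


def lookup(table, x):
    """Interpolate 1/(1 + x/65536) from a 16-bit input x."""
    i = x >> FRAC_BITS                     # which span
    f = x & ((1 << FRAC_BITS) - 1)         # position inside the span
    diff = table[i] - table[i + 1]         # could be stored in a second table
    half = 1 << (FRAC_BITS - 1)            # round the product to nearest
    return table[i] - ((diff * f + half) >> FRAC_BITS)


def max_abs_error(table):
    """Worst absolute error over all 65536 inputs, as a fraction of 1."""
    worst = 0.0
    for x in range(1 << (INDEX_BITS + FRAC_BITS)):
        exact = 1.0 / (1.0 + x / 65536.0)
        approx = lookup(table, x) / (1 << SCALE_BITS)
        worst = max(worst, abs(exact - approx))
    return worst
```

Note that entry 0 is exactly 2^18 in this model, so it needs one bit more than every other entry. Storing the differences instead of computing them trades table width for the subtract.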

> OK, here goes my limited understanding of how this RAM works: it's
> dual-ported, so each pixel can pick up the start sample on one clock
> and the following sample on the next clock, which is perfectly ok
> because latency isn't a problem for the texture divide.  Alternatively,
> twice as much RAM can be used, encoding an 18 bit sample and an 18 bit
> difference in each 36 bit word, and look them up together.

Yes, it can, but the point of pipelining is that you do stuff at the same 
time. So, the pick up of the next sample on the next clock will interfere 
with the pick up of the first sample on the first clock of the next pixel. 
Then each pixel pipeline will still require two reads per clock...
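
To make the conflict concrete, here is a toy count of table reads per clock, assuming one new pixel enters the pipeline every clock and each pixel reads its first sample on its own clock and the second on the following one. The function name is hypothetical.

```python
from collections import Counter


def reads_per_clock(n_pixels):
    """Count RAM reads issued on each clock for a fully pipelined stream."""
    reads = Counter()
    for pixel in range(n_pixels):
        reads[pixel] += 1        # first sample, on the pixel's own clock
        reads[pixel + 1] += 1    # following sample, one clock later
    return reads


# In steady state every clock carries two reads: the first read of the
# current pixel and the second read of the previous one.
counts = reads_per_clock(8)
```

Only the very first and last clocks see a single read; everything in between needs both ports.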

> Can a 36 bit lookup be dual ported without losing a multiplier?  The
> earlier discussion left me confused on this point, and wandering
> through the chip spec hasn't helped.

Not sure... Timothy, what configurations are possible for these RAM blocks?

> I think the simplest arrangement is just to compute the difference on
> the fly, taking two clocks for the interpolation.  Interpolating the
> last sample in the table needs a bit of extra logic to handle the table
> wrap.  Interpolating the zeroth sample needs something special to
> handle the missing most significant bit for the sample of exactly 1.

Yeah, they're special cased now. If we use two tables and require 17 bits of 
precision then we can probably even store that extra MSB in the table without 
problems. We'd get a 1.19/8.8 value/difference split. I'll see if I can code 
that.

> The nice thing about all of this is, the divides seem to be under
> control.  That's what I worried about most when I first heard of this
> project.

I'm glad about that too, but I'm still worried about the gate budget...it 
looks like we're tight on multipliers too. It's not going to be easy, but 
then, if it were easy it would be boring :-).

Lourens
