Merging the ttm-vram-0-1-branch.
Hi!

I'm about to merge the DRM ttm-vram-0-1-branch. If that gets in anyone's way, please speak up.

It makes the memory manager support fixed memory, be it on-card VRAM or pre-bound AGP memory, mappable or unmappable. The device driver basically needs to provide a fast copy operation from the fixed memory region either to AGP, where it can be flipped out using TTMs, or to system memory. Typically this is a blit or PCI-SG operation.

Tested only with a pre-bound AGP region simulating VRAM, and the i915 driver.

/Thomas

- Take Surveys. Earn Cash. Influence the Future of IT Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT business topics through brief surveys-and earn cash http://www.techsay.com/default.php?page=join.phpp=sourceforgeCID=DEVDEV -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
UT2004 and _mesa_fetch_state
FYI, I decided to give ut2004 a spin this morning, for the first time with the free driver in quite a while. I had heard good things since the VBO merge... Unfortunately, I very quickly ran into a problem when I loaded up the Icefields Bombing Run level:

Mesa 6.5.3 implementation error: Invalid state in _mesa_fetch_state
Please report at bugzilla.freedesktop.org

At Michel's suggestion, I ran the game through gdb, with _mesa_problem set as a breakpoint. At the first instance of the breakpoint, I grabbed a backtrace:

http://www.visualtech.com/backtrace.txt

I then realized that this didn't refer to the _mesa_fetch_state problem, so I tried again, continuing through the breakpoints until I hit _mesa_fetch_state at the third one. I grabbed a backtrace at that specific breakpoint:

http://www.visualtech.com/backtrace-_mesa_fetch_state.txt

All three breakpoints had to do with state:

Breakpoint 2, _mesa_problem (ctx=0x0, fmtString=0x699f11fc "Invalid state in make_state_string") at main/imports.c:963
Breakpoint 2, _mesa_problem (ctx=0x0, fmtString=0x699f1248 "unexpected state[0] in make_state_flags()") at main/imports.c:963
Breakpoint 2, _mesa_problem (ctx=0xbbb5c20, fmtString=0x699f118c "Invalid state in _mesa_fetch_state") at main/imports.c:963

Between the two links above, there are backtraces for the 1st and 3rd instances of _mesa_problem, though I can also grab one at the 2nd if necessary :-)

Now, if no one is familiar with this problem, I'm more than willing to open a report on the bugzilla. Please let me know :-)

Adam
Re: UT2004 and _mesa_fetch_state
Adam K Kirchhoff wrote:
> FYI, I decided to give ut2004 a spin this morning, for the first time with the free driver in quite a while. I had heard good things since the VBO merge... Unfortunately, I very quickly ran into a problem when I loaded up the Icefields Bombing Run level:
>
> Mesa 6.5.3 implementation error: Invalid state in _mesa_fetch_state
> Please report at bugzilla.freedesktop.org
>
> At Michel's suggestion, I ran the game through gdb, with _mesa_problem set as a breakpoint. At the first instance of the breakpoint, I grabbed a backtrace:
>
> http://www.visualtech.com/backtrace.txt

Oops. Looks like it's caused by the optimizations I did in t_vp_build.c (in particular, the fog using optimized internal fog state params). I'll look into it.

Roland
Re: UT2004 and _mesa_fetch_state
Roland Scheidegger wrote:
> Adam K Kirchhoff wrote:
>> FYI, I decided to give ut2004 a spin this morning, for the first time with the free driver in quite a while. I had heard good things since the VBO merge... Unfortunately, I very quickly ran into a problem when I loaded up the Icefields Bombing Run level:
>>
>> Mesa 6.5.3 implementation error: Invalid state in _mesa_fetch_state
>> Please report at bugzilla.freedesktop.org
>>
>> At Michel's suggestion, I ran the game through gdb, with _mesa_problem set as a breakpoint. At the first instance of the breakpoint, I grabbed a backtrace:
>>
>> http://www.visualtech.com/backtrace.txt
>
> Oops. Looks like it's caused by the optimizations I did in t_vp_build.c (in particular, the fog using optimized internal fog state params). I'll look into it.

OK, it should hopefully be fixed.

Roland
Re: UT2004 and _mesa_fetch_state
On Wed, 2007-02-14 at 16:39 +0100, Roland Scheidegger wrote:
> Roland Scheidegger wrote:
>> Adam K Kirchhoff wrote:
>>> FYI, I decided to give ut2004 a spin this morning, for the first time with the free driver in quite a while. I had heard good things since the VBO merge... Unfortunately, I very quickly ran into a problem when I loaded up the Icefields Bombing Run level:
>>>
>>> Mesa 6.5.3 implementation error: Invalid state in _mesa_fetch_state
>>> Please report at bugzilla.freedesktop.org
>>>
>>> At Michel's suggestion, I ran the game through gdb, with _mesa_problem set as a breakpoint. At the first instance of the breakpoint, I grabbed a backtrace:
>>>
>>> http://www.visualtech.com/backtrace.txt
>>
>> Oops. Looks like it's caused by the optimizations I did in t_vp_build.c (in particular, the fog using optimized internal fog state params). I'll look into it.
>
> OK, it should hopefully be fixed.

And so it is. Thanks!

Adam
Re: [R300][PATCH] Add/fix COS SIN + FP fixes
Roland Scheidegger wrote:
> Roland Scheidegger wrote:
>> Rune Petersen
>>
>> OK, committed. I didn't look too closely at this but I've a couple of comments.
>> - COS looks too complicated and broken. If you'd want to get 2 with a LOG2, you'd need 0.25 as source. But even using RCP instead, that's 5 instructions before performing the sine, for something you can easily do in two, using another constant (just 1 add + 1 cmp needed, if you use the right constants for the add). Maybe it's not that bad though, I don't know how many rgb and alpha slots it will actually consume, but still, are constant slots that rare?
>> - Second, you'd really need to do range reduction of the input, otherwise results will be very wrong for inputs outside [-pi, pi]. This would be true for a Taylor approximation too, of course, unless you do an infinite series :-). You wouldn't need to do that for SCS.
>
> Oh, and forgot to mention, you probably really want to use the higher precision variant by default. 12% max relative error (and even absolute it's still 6%) will likely be visible in some cases depending on what the shader is doing. Even the enhanced version seems to miss OpenGL conformance (accurate to about 1 part in 10^5) by roughly a factor of 10, which probably already stretches the meaning of "about" a bit. You could also rely on the precision hint for fragment programs to switch to the faster version instead of a driconf option (note though that the spec explicitly states implementations are discouraged, even in this case, from performing optimizations which could have significant impact on the output).

This patch:
- Fixes COS.
- Does range reduction for SIN and COS.
- Adds SCS.
- Removes the optimized version of SIN and COS.
- Tweaks the weight (should help on precision).
- Fixes a copy-paste typo in emit_arith().

Roland, would you mind testing if the tweaked weight helped?
And Jerome, would you mind committing this?
Rune Petersen

diff --git a/src/mesa/drivers/dri/r300/r300_context.h b/src/mesa/drivers/dri/r300/r300_context.h
index b140235..48b50bc 100644
--- a/src/mesa/drivers/dri/r300/r300_context.h
+++ b/src/mesa/drivers/dri/r300/r300_context.h
@@ -731,7 +731,7 @@ struct r300_fragment_program {
     int max_temp_idx;
     /* the index of the sin constant is stored here */
-    GLint const_sin;
+    GLint const_sin[2];
     GLuint optimization;
 };
diff --git a/src/mesa/drivers/dri/r300/r300_fragprog.c b/src/mesa/drivers/dri/r300/r300_fragprog.c
index b00cf9e..8e45bd5 100644
--- a/src/mesa/drivers/dri/r300/r300_fragprog.c
+++ b/src/mesa/drivers/dri/r300/r300_fragprog.c
@@ -33,7 +33,6 @@
 /*TODO'S
  *
- * - SCS instructions
  * - Depth write, WPOS/FOGC inputs
  * - FogOption
  * - Verify results of opcodes for accuracy, I've only checked them
@@ -1081,7 +1080,7 @@ static void emit_arith(struct r300_fragment_program *rp,
                 break;
             }
             if (emit_sop &&
-                (s_swiz[REG_GET_VSWZ(src[i])].flags & SLOT_VECTOR)) {
+                (s_swiz[REG_GET_SSWZ(src[i])].flags & SLOT_VECTOR)) {
                 vpos = spos = MAX2(vpos, spos);
                 break;
             }
@@ -1204,6 +1203,25 @@ static GLuint get_attrib(struct r300_fragment_program *rp, GLuint attr)
 }
 #endif
+static void make_sin_const(struct r300_fragment_program *rp)
+{
+    if(rp->const_sin[0] == -1){
+        GLfloat cnstv[4];
+
+        cnstv[0] = 1.273239545; // 4/PI
+        cnstv[1] =-0.405284735; // -4/(PI*PI)
+        cnstv[2] = 3.141592654; // PI
+        cnstv[3] = 0.2225; // weight
+        rp->const_sin[0] = emit_const4fv(rp, cnstv);
+
+        cnstv[0] = 0.5;
+        cnstv[1] = -1.5;
+        cnstv[2] = 0.159154943; // 1/(2*PI)
+        cnstv[3] = 6.283185307; // 2*PI
+        rp->const_sin[1] = emit_const4fv(rp, cnstv);
+    }
+}
+
 static GLboolean parse_program(struct r300_fragment_program *rp)
 {
     struct gl_fragment_program *mp = rp->mesa_program;
@@ -1260,84 +1278,68 @@ static GLboolean parse_program(struct r300_fragment_program *rp)
              * cos using a parabola (see SIN):
              * cos(x):
              * x += PI/2
-             * x = (x < PI)?x : x-2*PI
+             * x = (x/(2*PI))+0.5
+             * x = frac(x)
+             * x = (x*2*PI)-PI
              * result = sin(x)
              */
             temp = get_temp_reg(rp);
-            if(rp->const_sin == -1){
-                cnstv[0] = 1.273239545;
-                cnstv[1] =-0.405284735;
-                cnstv[2] = 3.141592654;
-                cnstv[3] = 0.225;
-                rp->const_sin = emit_const4fv(rp, cnstv);
-            }
-            cnst = rp->const_sin;
+            make_sin_const(rp);
             src[0] = t_scalar_src(rp, fpi->SrcReg[0]);
-            emit_arith(rp, PFS_OP_LG2, temp, WRITEMASK_W,
-                   pfs_half,
-                   undef,
-                   undef,
-                   0);
+            /* add 0.5*PI and do range reduction */
             emit_arith(rp, PFS_OP_MAD, temp, WRITEMASK_X,
-                   swizzle(cnst, Z, Z, Z, Z), //PI
+                   swizzle(rp->const_sin[0], Z, Z, Z, Z), //PI
                    pfs_half,
                    swizzle(keep(src[0]), X, X, X, X),
                    0);
-            emit_arith(rp, PFS_OP_MAD, temp, WRITEMASK_W,
-                   negate(swizzle(temp, W, W, W, W)), //-2
-                   swizzle(cnst, Z, Z, Z, Z), //PI
+            emit_arith(rp, PFS_OP_MAD, temp, WRITEMASK_X,
Re: [R300][PATCH] Add/fix COS SIN + FP fixes
On 2/14/07, Rune Petersen [EMAIL PROTECTED] wrote:
> Roland Scheidegger wrote:
>> Roland Scheidegger wrote:
>>> Rune Petersen
>>>
>>> OK, committed. I didn't look too closely at this but I've a couple of comments.
>>> - COS looks too complicated and broken. If you'd want to get 2 with a LOG2, you'd need 0.25 as source. But even using RCP instead, that's 5 instructions before performing the sine, for something you can easily do in two, using another constant (just 1 add + 1 cmp needed, if you use the right constants for the add). Maybe it's not that bad though, I don't know how many rgb and alpha slots it will actually consume, but still, are constant slots that rare?
>>> - Second, you'd really need to do range reduction of the input, otherwise results will be very wrong for inputs outside [-pi, pi]. This would be true for a Taylor approximation too, of course, unless you do an infinite series :-). You wouldn't need to do that for SCS.
>>
>> Oh, and forgot to mention, you probably really want to use the higher precision variant by default. 12% max relative error (and even absolute it's still 6%) will likely be visible in some cases depending on what the shader is doing. Even the enhanced version seems to miss OpenGL conformance (accurate to about 1 part in 10^5) by roughly a factor of 10, which probably already stretches the meaning of "about" a bit. You could also rely on the precision hint for fragment programs to switch to the faster version instead of a driconf option (note though that the spec explicitly states implementations are discouraged, even in this case, from performing optimizations which could have significant impact on the output).
>
> This patch:
> - Fixes COS.
> - Does range reduction for SIN and COS.
> - Adds SCS.
> - Removes the optimized version of SIN and COS.
> - Tweaks the weight (should help on precision).
> - Fixes a copy-paste typo in emit_arith().
>
> Roland, would you mind testing if the tweaked weight helped?
> And Jerome, would you mind committing this?
>
> Rune Petersen

Pushed. Git isn't so frightening, trust me :)

best,
Jerome Glisse
Re: [R300][PATCH] Add/fix COS SIN + FP fixes
Rune Petersen wrote:
> This patch:
> - Fixes COS.
> - Does range reduction for SIN and COS.
> - Adds SCS.
> - Removes the optimized version of SIN and COS.
> - Tweaks the weight (should help on precision).
> - Fixes a copy-paste typo in emit_arith().
>
> Roland, would you mind testing if the tweaked weight helped?

Well, I didn't test it the first time (I was just quoting the numbers from the link you provided), but I guess that's fine too. I was actually wondering myself whether it's better to optimize for absolute or relative error, so choosing a weight in between should work too (the difference is not that big after all).

A couple of comments though:

Since ((x + PI/2)/(2*PI)) + 0.5 is x/(2*PI) + (1/4 + 0.5), you could optimize away the first MAD for the COS case.

Also, the comments for SCS seem a bit off. That's a pity, because without comments I can't really see what the code does at first sight :-). It looks like quite a few extra instructions though. Are you sure no more of them could be shared when calculating both sin and cos?

Otherwise, looks good to me. Keep up the good work!

Roland