I just realized I didn't send it to the list:

There was yet another problem with reordering of instructions. The
attached patch (which is against my earlier patch) should fix this.
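
For readers skimming the thread, here is a minimal standalone sketch of the
rule the patch enforces: a write to a temp may not be placed earlier than the
last slot that read it, tracked separately for the vector (XYZ) and scalar (W)
halves. The struct, macro, and function names below are hypothetical
simplifications for illustration, not the driver's actual definitions.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for the driver's write masks. */
#define MASK_XYZ 0x7  /* vector (RGB) channels */
#define MASK_W   0x8  /* scalar (alpha) channel */

/* Hypothetical per-register bookkeeping, modelled on struct reg_lifetime. */
struct temp_lifetime {
	int reserved;         /* first slot in which the register may be used */
	int vector_lastread;  /* last slot that read the XYZ channels */
	int scalar_lastread;  /* last slot that read the W channel */
};

/* Record that the instruction in slot `pos` reads the channels in `mask`. */
static void note_read(struct temp_lifetime *t, int mask, int pos)
{
	if ((mask & MASK_XYZ) && t->vector_lastread < pos)
		t->vector_lastread = pos;
	if ((mask & MASK_W) && t->scalar_lastread < pos)
		t->scalar_lastread = pos;
}

/* Earliest slot in which a write to the channels in `mask` may be placed. */
static int earliest_write(const struct temp_lifetime *t, int mask)
{
	int pos = t->reserved;

	if ((mask & MASK_XYZ) && pos < t->vector_lastread)
		pos = t->vector_lastread;
	if ((mask & MASK_W) && pos < t->scalar_lastread)
		pos = t->scalar_lastread;
	return pos;
}
```

In this sketch, a vector read in slot 5 keeps later XYZ writes from moving
before slot 5, while a W-only write to the same register is still free to go
earlier.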

~Nicolai


On 3/18/07, Oliver McFadden <[EMAIL PROTECTED]> wrote:
Another thought: the same changes are probably needed for the vertprog code. I
think there are also a lot of bugs there.


On 3/18/07, Oliver McFadden <[EMAIL PROTECTED]> wrote:
> This patch seems to break one of my longer fragment programs. I believe this
> is because it's running out of registers, but I haven't looked into it in
> detail yet.
>
> I think this patch should be committed, but directly followed by a patch to
> reduce the number of registers used.
>
>
> On 3/18/07, Nicolai Haehnle <[EMAIL PROTECTED]> wrote:
> > There were a number of bugs related to the pairing of vector and
> > scalar operations where swizzles ended up using the wrong source
> > register, or an instruction was moved forward and ended up overwriting
> > an aliased register.
> >
> > The new algorithm for register allocation is slightly conservative and
> > may run out of registers before it's strictly necessary. On the plus
> > side, it Just Works.
> >
> > Pairing of instructions is done whenever possible, and in more cases
> > than before, so in practice this change should be a net win.
> >
> > The patch mostly fixes glean/texCombine. One remaining problem is that
> > the code duplicates constants and parameters all over the place and
> > therefore quickly runs out of resources and falls back to software.
> > I'm going to look into that as well.
> >
> > Please test and commit this patch. If you notice any regressions,
> > please tell me (but the tests are looking good).
> >
> > ~Nicolai
> >
>

commit 1ec4703585171f504180425b65dfab92be2a7782
Author: Nicolai Haehnle <[EMAIL PROTECTED]>
Date:   Sun Mar 18 13:29:18 2007 +0100

    r300: Fix fragment program reordering
    
    Do not move an instruction that writes to a temp forward past an instruction
    that reads the same temporary.

diff --git a/src/mesa/drivers/dri/r300/r300_context.h b/src/mesa/drivers/dri/r300/r300_context.h
index bc43953..29436ab 100644
--- a/src/mesa/drivers/dri/r300/r300_context.h
+++ b/src/mesa/drivers/dri/r300/r300_context.h
@@ -674,6 +674,11 @@ struct reg_lifetime {
 	   emitted instruction that writes to the register */
 	int vector_valid;
 	int scalar_valid;
+	
+	/* Index to the slot where the register was last read.
+	   This is also the first slot in which the register may be written again */
+	int vector_lastread;
+	int scalar_lastread;
 };
 
 
diff --git a/src/mesa/drivers/dri/r300/r300_fragprog.c b/src/mesa/drivers/dri/r300/r300_fragprog.c
index 3c54830..89e9f65 100644
--- a/src/mesa/drivers/dri/r300/r300_fragprog.c
+++ b/src/mesa/drivers/dri/r300/r300_fragprog.c
@@ -1026,10 +1026,11 @@ static void emit_tex(struct r300_fragment_program *rp,
  */
 static int get_earliest_allowed_write(
 		struct r300_fragment_program* rp,
-		GLuint dest)
+		GLuint dest, int mask)
 {
 	COMPILE_STATE;
 	int idx;
+	int pos;
 	GLuint index = REG_GET_INDEX(dest);
 	assert(REG_GET_VALID(dest));
 
@@ -1047,7 +1048,17 @@ static int get_earliest_allowed_write(
 			return 0;
 	}
 	
-	return cs->hwtemps[idx].reserved;
+	pos = cs->hwtemps[idx].reserved;
+	if (mask & WRITEMASK_XYZ) {
+		if (pos < cs->hwtemps[idx].vector_lastread)
+			pos = cs->hwtemps[idx].vector_lastread;
+	}
+	if (mask & WRITEMASK_W) {
+		if (pos < cs->hwtemps[idx].scalar_lastread)
+			pos = cs->hwtemps[idx].scalar_lastread;
+	}
+	
+	return pos;
 }
 
 
@@ -1070,7 +1081,8 @@ static int find_and_prepare_slot(struct r300_fragment_program* rp,
 		GLboolean emit_sop,
 		int argc,
 		GLuint* src,
-		GLuint dest)
+		GLuint dest,
+		int mask)
 {
 	COMPILE_STATE;
 	int hwsrc[3];
@@ -1092,7 +1104,7 @@ static int find_and_prepare_slot(struct r300_fragment_program* rp,
 	if (emit_sop)
 		used |= SLOT_OP_SCALAR;
 	
-	pos = get_earliest_allowed_write(rp, dest);
+	pos = get_earliest_allowed_write(rp, dest, mask);
 	
 	if (rp->node[rp->cur_node].alu_offset > pos)
 		pos = rp->node[rp->cur_node].alu_offset;
@@ -1191,6 +1203,21 @@ static int find_and_prepare_slot(struct r300_fragment_program* rp,
 		cs->slot[pos].ssrc[i] = tempssrc[i];
 	}
 	
+	for(i = 0; i < argc; ++i) {
+		if (REG_GET_TYPE(src[i]) == REG_TYPE_TEMP) {
+			int regnr = hwsrc[i] & 31;
+			
+			if (used & (SLOT_SRC_VECTOR << i)) {
+				if (cs->hwtemps[regnr].vector_lastread < pos)
+					cs->hwtemps[regnr].vector_lastread = pos;
+			}
+			if (used & (SLOT_SRC_SCALAR << i)) {
+				if (cs->hwtemps[regnr].scalar_lastread < pos)
+					cs->hwtemps[regnr].scalar_lastread = pos;
+			}
+		}
+	}
+	
 	// Emit the source fetch code
 	rp->alu.inst[pos].inst1 &= ~R300_FPI1_SRC_MASK;
 	rp->alu.inst[pos].inst1 |=
@@ -1287,7 +1314,7 @@ static void emit_arith(struct r300_fragment_program *rp,
 	if ((mask & WRITEMASK_W) || vop == R300_FPI0_OUTC_REPL_ALPHA)
 		emit_sop = GL_TRUE;
 
-	pos = find_and_prepare_slot(rp, emit_vop, emit_sop, argc, src, dest);
+	pos = find_and_prepare_slot(rp, emit_vop, emit_sop, argc, src, dest, mask);
 	if (pos < 0)
 		return;
 	
--
_______________________________________________
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel