Hi, Short story:
I have two implementations of a patch to fix this bug: https://bugs.freedesktop.org/show_bug.cgi?id=78679 I would need help testing both on Gen4 (I don't have Gen4 hardware available) and I would also like to understand why only on of them works on Gen5. There is a patch making both implementations available for testing attached to the bug report. Following are the details of the problem and some issues I see with the two implementations I have. Hopefully someone can help me address the questions/problems I mention below. Long story: The bug report is related to evaluating whether c->runtime_check_aads_emit is still needed and in that case put back the code that dealt with it and that was removed with commit 098acf6c843. I found that this flag is relevant for correct rendering in Gen < 6 when the following requirements are met: - Rendering polygons (not lines) - Rendering exactly one of the polygon's faces in GL_LINE mode and the other in GL_FILL mode (and none of them are culled). - Enable smooth lines (antaliased lines). I attached a sample program to the bug report that generates this scenario. Running the program in IronLake shows incorrect rendering of the GL_FILL face of the polygon and correct rendering of the GL_LINE face. The idea here is that with this setup in Gen < 6 the hardware generates a runtime bit that indicates whether AA data needs to be sent as part of the frame buffer write SEND message. Particularly, AA data has to be sent when rendering the GL_LINE face of the polygon, but not when rendering the GL_FILL face for things to work properly. The current code always sends AA info which produces the incorrect rendering results in the sample program I mentioned. Instead, runtime_check_aads_emit should be checked and if TRUE we should generate code that tests this bit at runtime and changes the SEND message accordingly. The code removed in the mentioned commit apparently addressed this for Gen4, however rendering in Gen5 was not addressed even before that commit because the code path that included the code to deal with this did not run in Gen5, so for IronLake this has always been broken as far as I can tell and, presumably, it has been broken for Gen < 5 since that commit. The original code used a brw_JMPI instruction to handle the runtime check when runtime_check_aads_emit==TRUE. Adapted to the current code base it would look something like this: brw_set_conditionalmod(p, BRW_CONDITIONAL_NZ); brw_AND(p, v1_null_ud, get_element_ud(brw_vec8_grf(1,0), 6), brw_imm_ud(1<<26)); jmp = brw_JMPI(p, ip, ip, brw_imm_ud(0)) - p->store; { fire_fb_write(inst, inst->base_mrf+1, implied_header, inst->mlen-1); } brw_land_fwd_jump(p, jmp); fire_fb_write(inst, inst->base_mrf, implied_header, inst->mlen); Basically the idea here is to shift the message header by one position to fill in the space that would otherwise be used for AA data (stored in base_mrf+2) when the runtime test indicates that we should not send AA info in the SEND message. However, at least in my IronLake hardware this does not produce correct rendering results either. The face of the polygon in GL_FILL mode, although different from before, is still not rendered properly. The original code included comments suggesting that the resulting program would be equivalent to an IF/ELSE structure where only one of the fire_fb_writes will be executed. However, I found that this other code that in theory should be equivalent, produces correct results: struct brw_reg v1_null_ud = vec1(retype(brw_null_reg(), BRW_REGISTER_TYPE_UD)); struct brw_reg tmp_reg_ud = brw_vec1_grf(BRW_MAX_GRF-1, 0); brw_AND(p, tmp_reg_ud, get_element_ud(brw_vec8_grf(1,0), 6), brw_imm_ud(1<<26)); brw_CMP(p, v1_null_ud, BRW_CONDITIONAL_Z, tmp_reg_ud, brw_imm_ud(0)); brw_IF(p, BRW_EXECUTE_1); { fire_fb_write(inst, inst->base_mrf+1, implied_header, inst->mlen-1); } brw_ELSE(p); { fire_fb_write(inst, inst->base_mrf, implied_header, inst->mlen); } brw_ENDIF(p); I have no idea why this happens. Also, since the implementation that follows the strategy from the original code does not work on IronLake, I wonder if that means that it was not working correctly in Gen < 5 either. Can anyone explain if the is any difference between both implementations? If it helps, here is the assembly code produced for both implementations (notice that fire_fb_write produces a mov and a send): IF/ELSE/ENDIF version: and(1) g127<1>F g1.6<0,1,0>UD 0x04000000UD { align1 nomask }; cmp.e.f0(1) null g127<0,1,0>F 0x00000000UD { align1 nomask }; (+f0) if(1) ip 6D { align1 switch }; mov(8) m3<1>F g1<8,8,1>F { align1 nomask }; send(16) 2 null g0<8,8,1>UW write (0, 8, 4, 0) mlen 10 rlen 0 { align1 nomask EOT }; else(1) ip 65544D { align1 switch }; mov(8) m2<1>F g1<8,8,1>F { align1 nomask }; send(16) 1 null g0<8,8,1>UW write (0, 8, 4, 0) mlen 11 rlen 0 { align1 nomask EOT }; endif(1) JMPI version: and.ne.f0(1) null g1.6<0,1,0>UD 0x04000000UD { align1 nomask }; (+f0) jmpi(2) 4 { align1 nomask }; mov(8) m3<1>F g1<8,8,1>F { align1 nomask }; send(16) 2 null g0<8,8,1>UW write (0, 8, 4, 0) mlen 10 rlen 0 { align1 nomask EOT }; mov(8) m2<1>F g1<8,8,1>F { align1 nomask }; send(16) 1 null g0<8,8,1>UW write (0, 8, 4, 0) mlen 11 rlen 0 { align1 nomask EOT }; I think it would be better if we could go with the JMPI implementation (if someone tells me how to make that work as intended) since the one with IF/ELSE/ENDIF requires an additional GRF register to deal with the AND/CMP instructions and I have not found any API to query unused GRF registers. As you can see I am using BRW_MAX_GRF-1, hoping that this is unlikely to be assigned in normal situations, but this is obviously not a good solution. Iago _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev