On Tue, Jun 24, 2014 at 10:23 AM, Kenneth Graunke <kenn...@whitecape.org> wrote:
> On Tuesday, June 24, 2014 08:17:23 AM Ian Romanick wrote: > > Send patches with git-send-email and no other method. Save this message > > and apply it to origin/master using git-am to see why. > > > > Hopefully Matt or Ken can review this... > > > > On 06/12/2014 12:35 PM, Cody Northrop wrote: > > > Commit 17c7ead7 exposed a bug in how uniform loading happens in the > > > presence of discard. It manifested itself in an application as > randomly > > > incorrect pixels on the borders of conditional areas. > > > > > > We believe it is due to how discards jump to the end of the shader > > > incorrectly for some channels. The current implementation checks each > > > 2x2 subspan to preserve derivatives. When uniform loading via samplers > > > was turned on, it uses a full execution mask, as stated in > > > lower_uniform_pull_constant_loads(), and only populates four channels > of > > > the destination (see generate_uniform_pull_constant_load_gen7()). We > > > think that is happening incorrectly when the first subspan has been > > > jumped over. > > > > > > A possible fix is to only jump to the end of the shader if all relevant > > > channels are disabled, i.e. all 8 or 16, depending on dispatch. This > > > preserves the original speedup noted in commit beafced2. There are > > > other more heavyweight fixes (i.e. don't use sampler for uniforms if > > > discard in use), but this seems like a good fix. I've appended it > below > > > as a patch. It changes the shader accordingly: > > > > > > before : 0x000000d0: (-f0.1.any4h) halt(8) 17 2 > > > null { align1 WE_all 1Q }; > > > after(8) : 0x000000d0: (-f0.1.any8h) halt(8) 17 2 > > > null { align1 WE_all 1Q }; > > > after(16) : 0x00000250: (-f0.1.any16h) halt(16) 17 2 > > > null { align1 WE_all 1H }; > > > > All of this information should go in the commit message. > > > > > I attached a test case to the bugzilla entry below. > > > > Was the test also sent to the piglit mailing list for inclusion in the > > the test suite? > > > > > Thanks for any review or feedback. Curious if anyone sees a better > fix. > > > > > > -C > > > > > > > > > From 871671c4ab8ff00e85b434865e8855fc356efa8f Mon Sep 17 00:00:00 2001 > > > From: Cody Northrop <c...@lunarg.com <mailto:c...@lunarg.com>> > > > Date: Thu, 12 Jun 2014 09:07:18 -0600 > > > Subject: [PATCH] i965/fs: Update discard jump to preserve uniform loads > via > > > sampler. > > > > > > The series that implemented this optimization was done before > > > the changes to use samplers for uniform loads. Uniform sampler > > > loads use special execution masks and only populate four > > > channels, so we can't jump over those or corruption ensues. > > > Use a more conservative jump mask which only jumps to the end > > > if all relevant channels are disabled. > > > > > > No change was observed in GLbenchmark 2.7, so the optimization > > > is preserved. > > > > > > Signed-off-by: Cody Northrop <c...@lunarg.com <mailto:c...@lunarg.com > >> > > > Reviewed-by: Mike Stroyan <m...@lunarg.com <mailto:m...@lunarg.com>> > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79948 > > > --- > > > src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 10 +++++++++- > > > 1 file changed, 9 insertions(+), 1 deletion(-) > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > > > b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > > > index 8858852..fe05715 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > > > +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > > > @@ -1907,7 +1907,15 @@ fs_visitor::visit(ir_discard *ir) > > > */ > > > fs_inst *discard_jump = emit(FS_OPCODE_DISCARD_JUMP); > > > discard_jump->flag_subreg = 1; > > > - discard_jump->predicate = BRW_PREDICATE_ALIGN1_ANY4H; > > > + > > > + /* Uniforms are now loaded using samplers with a routine that > has > > > + * its own execution mask, so we can only jump if all relevant > > > + * channels are dead. This is more conservative than the > previous > > > + * four channel checking, but still preserves speedups. > > This comment won't make much sense when reading the code, since "the > previous > four channel checking" no longer exists and can't be observed anywhere > other > than in git history. Personally, I'd just drop the comment. > > Sure, I'll clean up the comments. > As far as I can tell, this is just broken and should have always been > ANY8H/ANY16H. Good catch, and thanks for fixing it! > > > > + */ > > > + discard_jump->predicate = (8 == dispatch_width) > > Please drop the Yoda-conditional and use (dispatch_width == 8). > Since dispatch_width is constant, it can go on the left. '8' was more read-only to my eyes. I try to avoid common typo mistakes (= instead of ==). > > With those two changes, this would get: > Reviewed-by: Kenneth Graunke <kenn...@whitecape.org> > > If you resend a v2 using git-send-email, I'll be happy to push it. > Thanks, I'll fix all of this and send again correctly. > > > > + ? BRW_PREDICATE_ALIGN1_ANY8H > > > + : BRW_PREDICATE_ALIGN1_ANY16H; > > > discard_jump->predicate_inverse = true; > > > } > > > } > -- Cody Northrop Graphics Software Engineer LunarG, Inc.- 3D Driver Innovations Email: c...@lunarg.com Website: http://www.lunarg.com
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev