On 01/10/2014 03:27 PM, Matt Turner wrote:
> On Thu, Jan 9, 2014 at 11:28 AM, Ian Romanick <i...@freedesktop.org> wrote:
>> On 01/08/2014 12:43 PM, Matt Turner wrote:
>>> +/**
>>> + * \file opt_vectorize.cpp
>>> + *
>>> + * Combines scalar assignments of the same expression (modulo swizzle) to
>>> + * multiple channels of the same variable into a single vectorized expression
>>> + * and assignment.
>>> + *
>>> + * Many generated shaders contain scalarized code. That is, they contain
>>> + *
>>> + *    r1.x = log2(v0.x);
>>> + *    r1.y = log2(v0.y);
>>> + *    r1.z = log2(v0.z);
>>> + *
>>> + * rather than
>>> + *
>>> + *    r1.xyz = log2(v0.xyz);
>>> + *
>>> + * We look for consecutive assignments of the same expression (modulo swizzle)
>>> + * to each channel of the same variable.
>>> + *
>>> + * For instance, we want to convert these three scalar operations
>>> + *
>>> + *    (assign (x) (var_ref r1) (expression float log2 (swiz x (var_ref v0))))
>>> + *    (assign (y) (var_ref r1) (expression float log2 (swiz y (var_ref v0))))
>>> + *    (assign (z) (var_ref r1) (expression float log2 (swiz z (var_ref v0))))
>>> + *
>>> + * into a single vector operation
>>> + *
>>> + *    (assign (xyz) (var_ref r1) (expression vec3 log2 (swiz xyz (var_ref v0))))
>>
>> I think it's worth adding a note that this pass only attempts to combine
>> assignments that are sequential.
>
> That comment block already says that:
>
> + * We look for consecutive assignments of the same expression (modulo swizzle)
> + * to each channel of the same variable.

I guess I overlooked that word. :(

> I'll change the first comment to use the word consecutive.
>
>> The above example gets fully
>> vectorized, but this sequence would not:
>>
>>    (assign (x) (var_ref r1) (expression float log2 (swiz x (var_ref v0))))
>>    (assign (x) (var_ref r2) (expression float log2 (swiz y (var_ref v0))))
>>    (assign (y) (var_ref r1) (expression float log2 (swiz z (var_ref v0))))
>>    (assign (y) (var_ref r2) (expression float log2 (swiz w (var_ref v0))))
>>
>> I think this will also break on code like
>>
>>    (assign (x) (var_ref r1) (expression float log2 (swiz w (var_ref r1))))
>>    (assign (y) (var_ref r1) (expression float log2 (swiz z (var_ref r1))))
>>    # r1.xy have different values now.
>>    (assign (z) (var_ref r1) (expression float log2 (swiz y (var_ref r1))))
>>    (assign (w) (var_ref r1) (expression float log2 (swiz x (var_ref r1))))
>>
>> Maybe just skip assignments where the LHS also appears in the RHS for
>> now? Or does the check write_mask_matches_swizzle take care of this?
>
> It won't break because the code rejects expressions that contain
> swizzles that don't match the LHS's write mask. See the call to
> write_mask_matches_swizzle().
>
> The good thing about this is that we can combine expressions that use
> the LHS, like
>
>    (assign (x) (var_ref r1) (expression float log2 (swiz x (var_ref r1))))
>    (assign (y) (var_ref r1) (expression float log2 (swiz y (var_ref r1))))

Okay... that's what I thought, but I wanted to be sure.

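To make the rule in the quoted reply concrete, here is a minimal standalone sketch of a "swizzle must match the write mask" check. It is not the code from the patch: the body of write_mask_matches_swizzle() is not shown in this thread, and the names Channel, write_mask_of, mask_matches_swizzle, and all_swizzles_match are hypothetical, chosen only for illustration. The sketch assumes each scalar assignment writes exactly one channel and each RHS swizzle selects exactly one channel.

```cpp
#include <initializer_list>

// Hypothetical stand-ins for the IR concepts discussed in this thread.
// These are NOT Mesa's actual types or constants.
enum Channel : unsigned { X = 0, Y = 1, Z = 2, W = 3 };

// A write mask has one bit per destination channel (bit 0 = x ... bit 3 = w).
constexpr unsigned write_mask_of(Channel c) { return 1u << c; }

// Assumed rule: a single-channel swizzle matches a write mask only when the
// mask writes exactly the channel that the swizzle reads.
constexpr bool mask_matches_swizzle(unsigned write_mask, Channel swizzle)
{
    return write_mask == write_mask_of(swizzle);
}

// An assignment is a vectorization candidate only if every swizzle on its
// right-hand side matches the left-hand side's write mask.
inline bool all_swizzles_match(unsigned write_mask,
                               std::initializer_list<Channel> swizzles)
{
    for (Channel s : swizzles) {
        if (!mask_matches_swizzle(write_mask, s))
            return false;
    }
    return true;
}
```

Under these assumptions, r1.x = log2(r1.x) passes the check (mask x, swizzle x), while r1.x = log2(r1.w) is rejected (mask x, swizzle w). That is why the reversed-swizzle sequence is never combined, whereas the same-channel sequence that reads the LHS still can be.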
With the slight tweak to the header comment (that you mention above),
patches 2, 4, and 5 are

Reviewed-by: Ian Romanick <ian.d.roman...@intel.com>

This means we also won't vectorize things like

   (assign (x) (var_ref r1) (expression float * (swiz x (var_ref r1)) (swiz x (var_ref r2))))
   (assign (y) (var_ref r1) (expression float * (swiz y (var_ref r1)) (swiz x (var_ref r2))))
   (assign (z) (var_ref r1) (expression float * (swiz z (var_ref r1)) (swiz x (var_ref r2))))
   (assign (w) (var_ref r1) (expression float * (swiz w (var_ref r1)) (swiz x (var_ref r2))))

Right? If there are occurrences of that pattern in shaderdb, that may be
an opportunity for some follow-on work...

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev