Iago Toral <ito...@igalia.com> writes: > On Thu, 2016-05-12 at 20:01 -0700, Francisco Jerez wrote: >> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes: >> >> > From: Iago Toral Quiroga <ito...@igalia.com> >> > >> > There will be a few places where we need to shuffle the result of a 32-bit >> > load into valid 64-bit data, so extract this logic into a separate helper >> > that we can reuse. >> > >> > The shuffling needs to operate with WE_all set because we are changing the >> > layout of the data across the various channels. Otherwise we will run into >> > problems in non-uniform control-flow scenarios. >> > >> I guess you could remove this paragraph because it no longer applies to >> the last version of the patch. > > Ooops, yes! > >> > v2 (Curro): >> > - Use subscript() instead of stride() >> > - Assert on the input types rather than retyping. >> > - Use offset() instead of horiz_offset(), drop the multiplier definition. >> > - Do not use a temporary for the writes and drop force_writemask_all. >> >> Don't pretend you took my "don't use a temporary" suggestion into >> account. :P > > Oh right, I did write the patch with that change included but then run > into the issue we discussed when src == dst and forgot to remove this. > >> > - Mark component_i as const. >> >> Did you forget to git-add something? > > No, somehow I lost the change in the process... I'll put it back. > >> > - Make the function name lower case. >> > --- >> > src/mesa/drivers/dri/i965/brw_fs.cpp | 53 >> > ++++++++++++++++++++++++++++++++++++ >> > src/mesa/drivers/dri/i965/brw_fs.h | 5 ++++ >> > 2 files changed, 58 insertions(+) >> > >> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp >> > b/src/mesa/drivers/dri/i965/brw_fs.cpp >> > index 15d5759..4827dea 100644 >> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp >> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp >> > @@ -212,6 +212,59 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const >> > fs_builder &bld, >> > } >> > >> > /** >> > + * This helper takes the result of a load operation that reads 32-bit >> > elements >> > + * in this format: >> > + * >> > + * x x x x x x x x >> > + * y y y y y y y y >> > + * z z z z z z z z >> > + * w w w w w w w w >> > + * >> > + * and shuffles the data to get this: >> > + * >> > + * x y x y x y x y >> > + * x y x y x y x y >> > + * z w z w z w z w >> > + * z w z w z w z w >> > + * >> > + * Which is exactly what we want if the load is reading 64-bit components >> > + * like doubles, where x represents the low 32-bit of the x double >> > component >> > + * and y represents the high 32-bit of the x double component (likewise >> > with >> > + * z and w for double component y). The parameter @components represents >> > + * the number of 64-bit components present in @src. This would typically >> > be >> > + * 2 at most, since we can only fit 2 double elements in the result of a >> > + * vec4 load. >> > + * >> > + * Notice that @dst and @src can be the same register. >> > + */ >> > +void >> > +fs_visitor::shuffle_32bit_load_result_to_64bit_data(const fs_builder &bld, >> >> There was a second reason I had in mind when I suggested it would >> improve encapsulation to take this out of fs_visitor: The function has >> absolutely nothing to do with visiting, it uses no internal or external >> fs_visitor data structures or interfaces, it doesn't even use the "this" >> pointer. Defining a function that could perfectly be stand-alone inside >> an object it has no need to be in (in this case it doesn't even have any >> logical relation with) actually *decreases* encapsulation because it >> exposes the object's internals to the function unnecessarily. >> >> Either way the back-end code is already plagued by this anti-pattern so >> I'm not going to complain if you keep the code as-is -- You could argue >> you're just being consistent with the existing practice. ;) > > I agree with all your reasoning, my only disagreement is with the fact > that brw_fs_nir.cpp is a better place for it considering that this has > nothing to do with NIR, but since we have to put this somewhere let's > put it there for now. > Oh, I don't really mind where you put the function, the only reason I suggested that was that I assumed you were only using it from the NIR front-end, but brw_fs.cpp does seem like a better fit if you're using it elsewhere.
>> > + const fs_reg dst, >> > + const fs_reg src, >> >> Pass by reference. > > Ok. > >> > + uint32_t components) >> > +{ >> > + assert(type_sz(src.type) == 4); >> > + assert(type_sz(dst.type) == 8); >> > + >> > + /* A temporary that we will use to shuffle the 32-bit data of each >> > + * component in the vector into valid 64-bit data. We can't write >> > directly >> > + * to dst because dst can be (and would usually be) the same as src >> > + * and in that case the first MOV in the loop below would overwrite the >> > + * data read in the second MOV. >> > + */ >> > + fs_reg tmp = bld.vgrf(dst.type); >> > + >> > + for (unsigned i = 0; i < components; i++) { >> > + fs_reg component_i = offset(src, bld, 2 * i); >> > + >> > + bld.MOV(subscript(tmp, src.type, 0), component_i); >> > + bld.MOV(subscript(tmp, src.type, 1), offset(component_i, bld, 1)); >> > + >> > + bld.MOV(offset(dst, bld, i), tmp); >> >> Ah, yes, this looks much better, > > Yes, I agree :) > >> Reviewed-by: Francisco Jerez <curroje...@riseup.net> >> >> > + } >> > +} >> > + >> > +/** >> > * A helper for MOV generation for fixing up broken hardware SEND >> > dependency >> > * handling. >> > */ >> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.h >> > b/src/mesa/drivers/dri/i965/brw_fs.h >> > index b906f3d..1970ad0 100644 >> > --- a/src/mesa/drivers/dri/i965/brw_fs.h >> > +++ b/src/mesa/drivers/dri/i965/brw_fs.h >> > @@ -105,6 +105,11 @@ public: >> > uint32_t const_offset); >> > void DEP_RESOLVE_MOV(const brw::fs_builder &bld, int grf); >> > >> > + void shuffle_32bit_load_result_to_64bit_data(const brw::fs_builder >> > &bld, >> > + const fs_reg dst, >> > + const fs_reg src, >> > + uint32_t components); >> > + >> > bool run_fs(bool do_rep_send); >> > bool run_vs(gl_clip_plane *clip_planes); >> > bool run_tcs_single_patch(); >> > -- >> > 2.5.0 >> > >> > _______________________________________________ >> > mesa-dev mailing list >> > mesa-dev@lists.freedesktop.org >> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev >> _______________________________________________ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev