On Jan 18, 2018, at 1:10 PM, Roland Scheidegger <srol...@vmware.com> wrote:

On 17.01.2018 at 23:33, George Kyriazis wrote:

The texture swizzle was not doing the right thing for avx512-style
16-wide loads.

Special-case the post-load swizzle operations for avx512 so that we move
the xyzw components correctly to the outputs.

cc: Jose Fonseca <jfons...@vmware.com>
---
 src/gallium/auxiliary/gallivm/lp_bld_pack.c | 40 +++++++++++++++++++++++++++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_pack.c b/src/gallium/auxiliary/gallivm/lp_bld_pack.c
index e8d4fcd..7879826 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_pack.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_pack.c
@@ -129,6 +129,31 @@ lp_build_const_unpack_shuffle_half(struct gallivm_state *gallivm,
 }

 /**
+ * Similar to lp_build_const_unpack_shuffle_half, but for AVX512
+ * See comment above lp_build_interleave2_half for more details.
+ */
+static LLVMValueRef
+lp_build_const_unpack_shuffle_16wide(struct gallivm_state *gallivm,
+                                     unsigned lo_hi)
+{
+   LLVMValueRef elems[LP_MAX_VECTOR_LENGTH];
+   unsigned i, j;
+
+   assert(lo_hi < 2);
+
+   // for the following lo_hi setting, convert 0 -> f to:
+   // 0: 0 16 4 20 8 24 12 28 1 17 5 21 9 25 13 29
+   // 1: 2 18 6 22 10 26 14 30 3 19 7 23 11 27 15 31
+   for (i = 0; i < 16; i++) {
+      j = ((i&0x06)<<1) + ((i&1)<<4) + (i>>3) + (lo_hi<<1);
+
+      elems[i] = lp_build_const_int32(gallivm, j);
+   }
+
+   return LLVMConstVector(elems, 16);
+}
+
+/**
  * Build shuffle vectors that match PACKxx (SSE) instructions or
  * VPERM (Altivec).
  */
@@ -325,8 +350,8 @@ lp_build_interleave2(struct gallivm_state *gallivm,
 }

 /**
- * Interleave vector elements but with 256 bit,
- * treats it as interleave with 2 concatenated 128 bit vectors.
+ * Interleave vector elements but with 256 (or 512) bit,
+ * treats it as interleave with 2 concatenated 128 (or 256) bit vectors.
  *
  * This differs to lp_build_interleave2 as that function would do the following (for lo):
  * a0 b0 a1 b1 a2 b2 a3 b3, and this does not compile into an AVX unpack instruction.
@@ -343,6 +368,14 @@ lp_build_interleave2(struct gallivm_state *gallivm,
  *
  * And interleave-hi would result in:
  * a2 b2 a3 b3 a6 b6 a7 b7
+ *
+ * For 512 bits, the following are true:
+ *
+ * Interleave-lo would result in (capital letters denote hex indices):
+ * a0 b0 a1 b1 a4 b4 a5 b5 a8 b8 a9 b9 aC bC aD bD
+ *
+ * Interleave-hi would result in:
+ * a2 b2 a3 b3 a6 b6 a7 b7 aA bA aB bB aE bE aF bF
  */
 LLVMValueRef
 lp_build_interleave2_half(struct gallivm_state *gallivm,
@@ -354,6 +387,9 @@ lp_build_interleave2_half(struct gallivm_state *gallivm,
    if (type.length * type.width == 256) {
       LLVMValueRef shuffle = lp_build_const_unpack_shuffle_half(gallivm, type.length, lo_hi);
       return LLVMBuildShuffleVector(gallivm->builder, a, b, shuffle, "");
+   } else if ((type.length == 16) && (type.width == 32)) {
+      LLVMValueRef shuffle = lp_build_const_unpack_shuffle_16wide(gallivm, lo_hi);
+      return LLVMBuildShuffleVector(gallivm->builder, a, b, shuffle, "");

This is not really "interleave_half", more like "interleave_quarter"...
That said, avx512 certainly follows the same rules as avx256, so 128bit
pieces are treated independently. So maybe this should be renamed like
"interleave_native" or something like that.

Also, I believe it is definitely a mistake to restrict this to dword
interleaves here. You should handle all type widths, just like the
256bit case can handle all widths.

And I'm not sure through which paths you reach this, but I'm not sure
why you don't need the corresponding unpack2_native and pack2_native
adjustments - it should not really be a special case, avx512 should
generally handle things like this (if you'd want to extend the gallivm
code to use avx512...).

For that matter, the commit log and shortlog is confusing, because this
isn't directly related to texture fetching.
Roland

Roland,

The stack trace that I am seeing is the following:

(gdb) bt
#0  lp_build_const_unpack_shuffle_16wide (gallivm=0x168b690, lo_hi=0)
    at ../../../../src/gallium/auxiliary/gallivm/lp_bld_pack.c:138
#1  0x00007ffff62786de in lp_build_interleave2_half (gallivm=0x168b690, type=..., a=0x16a7378, b=0x16a7d38, lo_hi=0)
    at ../../../../src/gallium/auxiliary/gallivm/lp_bld_pack.c:391
#2  0x00007ffff629585f in lp_build_transpose_aos (gallivm=0x168b690, single_type_lp=..., src=0x7fffffff32e0, dst=0x7fffffff3300)
    at ../../../../src/gallium/auxiliary/gallivm/lp_bld_swizzle.c:664
#3  0x00007ffff626a887 in lp_build_fetch_rgba_soa (gallivm=0x168b690, format_desc=0x7ffff67fe9a0 <util_format_r32g32b32a32_sint_description>, type=..., aligned=1 '\001', base_ptr=0x16a3218, offset=0x16a6890, i=0xf87a90, j=0xf87a90, cache=0x0, rgba_out=0x7fffffff4280)
    at ../../../../src/gallium/auxiliary/gallivm/lp_bld_format_soa.c:635
#4  0x00007ffff628f899 in lp_build_fetch_texel (bld=0x7fffffff3680, texture_unit=0, coords=0x7fffffff4060, explicit_lod=0x16a2bf0, offsets=0x7fffffff4260, colors_out=0x7fffffff4280)
    at ../../../../src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c:2682
#5  0x00007ffff6290a6b in lp_build_sample_soa_code (gallivm=0x168b690, static_texture_state=0x7fffffffc61c, static_sampler_state=0x7fffffffc618, dynamic_state=0x1696d18, type=..., sample_key=100, texture_index=0, sampler_index=0, context_ptr=0x16a2b70, thread_data_ptr=0x0, coords=0x7fffffff42a0, offsets=0x7fffffff4260, derivs=0x0, lod=0x16a2bf0, texel_out=0x7fffffff4280)
    at ../../../../src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c:3092
#6  0x00007ffff629202d in lp_build_sample_gen_func (gallivm=0x168b690, static_texture_state=0x7fffffffc61c, static_sampler_state=0x7fffffffc618, dynamic_state=0x1696d18, type=..., texture_index=0, sampler_index=0, function=0x16a2aa8, num_args=3, sample_key=100)
    at ../../../../src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c:3483
#7  0x00007ffff629286d in lp_build_sample_soa_func (gallivm=0x168b690, static_texture_state=0x7fffffffc61c, static_sampler_state=0x7fffffffc618, dynamic_state=0x1696d18, params=0x7fffffff46b0)
    at ../../../../src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c:3629
#8  0x00007ffff6292cdb in lp_build_sample_soa (static_texture_state=0x7fffffffc61c, static_sampler_state=0x7fffffffc618, dynamic_state=0x1696d18, gallivm=0x168b690, params=0x7fffffff46b0)
    at ../../../../src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c:3734
#9  0x00007ffff630fd71 in swr_sampler_soa_emit_fetch_texel (base=0x1696d00, gallivm=0x168b690, params=0x7fffffff46b0)
    at ../../../../../src/gallium/drivers/swr/swr_tex_sample.cpp:302
#10 0x00007ffff62a3fc0 in emit_fetch_texels (bld=0x7fffffff4a40, inst=0x1698b20, texel=0x7fffffff4868, is_samplei=0 '\000')
    at ../../../../src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c:2523
#11 0x00007ffff62a584d in txf_emit (action=0x7fffffff54c0, bld_base=0x7fffffff4a40, emit_data=0x7fffffff47f0)
    at ../../../../src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c:3178
#12 0x00007ffff629bcaa in lp_build_tgsi_inst_llvm (bld_base=0x7fffffff4a40, inst=0x1698b20)
    at ../../../../src/gallium/auxiliary/gallivm/lp_bld_tgsi.c:309
#13 0x00007ffff629c650 in lp_build_tgsi_llvm (bld_base=0x7fffffff4a40, tokens=0x168a5e0)
    at ../../../../src/gallium/auxiliary/gallivm/lp_bld_tgsi.c:546
#14 0x00007ffff62a7255 in lp_build_tgsi_soa (gallivm=0x168b690, tokens=0x168a5e0, type=..., mask=0x0, consts_ptr=0x16913e8, const_sizes_ptr=0x16914a8, system_values=0x7fffffffac30, inputs=0x7fffffffacd0, outputs=0x7fffffffb6d0, context_ptr=0x1691300, thread_data_ptr=0x0, sampler=0x1696d00, info=0x1688ab8, gs_iface=0x0)
    at ../../../../src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c:3945
#15 0x00007ffff6315cfe in BuilderSWR::CompileVS (this=0x7fffffffc130, ctx=0x645300, key=...)
    at ../../../../../src/gallium/drivers/swr/swr_shader.cpp:836

    } else {
       return lp_build_interleave2(gallivm, type, a, b, lo_hi);
    }

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev