On 2/16/19 1:21 AM, Rhys Perry wrote:
This series adds support for:
- VK_KHR_shader_float16_int8
- VK_AMD_gpu_shader_half_float
- VK_AMD_gpu_shader_int16
- VK_KHR_8bit_storage
on VI+. Half floats are disabled on LLVM 7 because of a bug causing large
memory usage and long (or unbounded) compilation times with some CTS
tests.

It is written against the following patch series:
- https://patchwork.freedesktop.org/series/53454/ (v4)
- https://patchwork.freedesktop.org/series/53660/ (v1)

With LLVM 9, there are no reproducible Vulkan CTS regressions with Vega
and VI except for
dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.input_output_float_64_to_16.*
which fails or crashes because of unrelated radv bugs with 64-bit varyings
and because the tests use VK_FORMAT_R64_SFLOAT as a vertex format even
though radv does not support it.

test bug?

The two NIR related patches (22 and 25) should be sent separately, otherwise people working on NIR might miss them.


With LLVM 9, there are no reproducible piglit regressions except for
glsl-array-bounds-12.shader_test because of an LLVM bug when
SLP vectorization is enabled.

With LLVM 8, there are no reproducible Vulkan CTS regressions with Vega
and VI except for those also seen with LLVM 9 and a couple of tests that
fail because of an LLVM bug exposed by the SLP vectorizer and because
16-bit interpolation currently has no fallback on LLVM versions before
LLVM 9.

With LLVM 7, there are no reproducible Vulkan CTS regressions with Vega
and VI except for those also seen with LLVM 9 and a couple of tests that
fail because of an LLVM bug exposed by the SLP vectorizer.

The SLP vectorization patch is marked as WIP because it exposes LLVM bugs
with piglit's glsl-array-bounds-12.shader_test, some Vulkan CTS tests and
a shader-db test for a game I can't remember. It also over-vectorizes
32-bit code, which can significantly worsen generated code quality.

The 16-bit interpolation patch is marked as WIP because it currently
requires intrinsics only available in LLVM 9 and does not have a fallback.
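
The LLVM version constraints described above could be summarized roughly
as follows. This is only an illustrative sketch, not code from the series:
the Fp16Support class and the fp16_support() helper are hypothetical names,
and treating "disabled on LLVM 7" as "needs LLVM 8 or later" is an
assumption read from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fp16Support:
    """Hypothetical summary of what the cover letter says is usable."""
    half_floats: bool    # float16 arithmetic exposed
    interp_16bit: bool   # 16-bit interpolation usable


def fp16_support(llvm_major: int) -> Fp16Support:
    """Hypothetical helper; thresholds are read from the cover letter."""
    return Fp16Support(
        # Assumption: half floats are disabled on LLVM 7 (compile-time and
        # memory usage bug), so LLVM 8 or later is required.
        half_floats=llvm_major >= 8,
        # The WIP interpolation patch needs LLVM 9 intrinsics and has no
        # fallback for older versions.
        interp_16bit=llvm_major >= 9,
    )


for major in (7, 8, 9):
    print(major, fp16_support(major))
```

The actual checks in the series live in the radv C sources and also depend
on the GPU generation (VI+), which this sketch leaves out.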

A branch on GitHub containing this series can be found at:
https://github.com/pendingchaos/mesa/commits/radv_fp16_int16_int8_v2

v2: rebase
v2: implement 16-bit interpolation
v2: move LLVMAddSLPVectorizePass to after LLVMAddEarlyCSEMemSSAPass
v2: run vectorization unconditionally on GFX9 and later
v2: remove ac_get_one(), ac_get_zero(), ac_get_onef() and ac_get_zerof()
v2: remove ac_int_of_size()
v2: fix 64-bit visit_load_var()
v2: mark VK_KHR_8bit_storage as DONE in features.txt
v2: mark SLP vectorization patch as WIP
v2: fix C++ style comment

Rhys Perry (41):
   radv: bitcast 16-bit outputs to integers
   radv: ensure export arguments are always float
   ac: add various helpers for float16/int16/int8
   ac/nir: implement 8-bit push constant, ssbo and ubo loads
   ac/nir: implement 8-bit ssbo stores
   ac/nir: fix 16-bit ssbo stores
   ac/nir: implement 8-bit nir_load_const_instr
   ac/nir: implement 8-bit conversions
   ac/nir: fix 64-bit nir_op_f2f16_rtz
   ac/nir: make ac_build_clamp work on all bit sizes
   ac/nir: make ac_build_fract work on all bit sizes
   ac/nir: make ac_build_isign work on all bit sizes
   ac/nir: make ac_build_fsign work on all bit sizes
   ac/nir: make ac_build_fdiv support 16-bit floats
   ac/nir: implement half-float nir_op_frcp
   ac/nir: implement half-float nir_op_frsq
   ac/nir: implement half-float nir_op_ldexp
   radv: lower 16-bit flrp
   ac/nir: support half floats in emit_b2f
   ac/nir: make emit_b2i work on all bit sizes
   ac/nir: implement 16-bit shifts
   compiler/nir: add lowering option for 16-bit ffma
   ac/nir: implement 16-bit ac_build_ddxy
   ac/nir: implement 8 and 16 bit ac_build_readlane
   nir: make bitfield_reverse and ifind_msb work with all integers
   ac/nir: make ac_find_lsb work on all bit sizes
   ac/nir: make ac_build_umsb work on all bit sizes
   ac/nir: implement 8 and 16 bit ac_build_imsb
   ac/nir: make ac_build_bit_count work on all bit sizes
   ac/nir: make ac_build_bitfield_reverse work on all bit sizes
   ac/nir: implement 16-bit pack/unpack opcodes
   ac/nir: add 8-bit types to glsl_base_to_llvm_type
   ac/nir,radv: create an array of varying output types
   ac/nir: store all outputs as f32
   radv: store all fragment shader inputs as f32
   radv: handle all fragment output types
   WIP: radv,ac: implement 16-bit interpolation
   WIP: ac,radv: run LLVM's SLP vectorizer
   ac/nir: generate better code for nir_op_f2f16_rtz
   ac/nir: have nir_op_f2f16 round to zero
   radv,docs: expose float16, int16 and int8 features and extensions

  docs/features.txt                        |   2 +-
  src/amd/common/ac_llvm_build.c           | 325 +++++++++++------------
  src/amd/common/ac_llvm_build.h           |  18 +-
  src/amd/common/ac_llvm_util.c            |   8 +-
  src/amd/common/ac_nir_to_llvm.c          | 268 +++++++++++++++----
  src/amd/common/ac_shader_abi.h           |   1 +
  src/amd/vulkan/radv_device.c             |  17 ++
  src/amd/vulkan/radv_extensions.py        |   4 +
  src/amd/vulkan/radv_nir_to_llvm.c        | 123 +++++----
  src/amd/vulkan/radv_pipeline.c           |  19 +-
  src/amd/vulkan/radv_shader.c             |   4 +
  src/amd/vulkan/radv_shader.h             |   1 +
  src/broadcom/compiler/nir_to_vir.c       |   1 +
  src/compiler/nir/nir.h                   |   1 +
  src/compiler/nir/nir_opcodes.py          |   4 +-
  src/compiler/nir/nir_opt_algebraic.py    |   4 +-
  src/gallium/drivers/radeonsi/si_get.c    |   1 +
  src/gallium/drivers/radeonsi/si_shader.c |   2 +-
  src/gallium/drivers/vc4/vc4_program.c    |   1 +
  19 files changed, 507 insertions(+), 297 deletions(-)

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
