How about splitting this series into four parts, one for each extension? Is this doable without too much trouble?

On 2/12/19 6:02 PM, Rhys Perry wrote:
It currently requires review (and possibly rebasing). Marek Olšák sent
some feedback for a few of the patches but other than that, it hasn't
gotten much attention.

Also, patch 35 seems to vectorize 32-bit code, which can help or hurt
individual shaders quite a bit and seems to hurt shaders overall. I'm not
yet sure how to solve this without removing it or changing the result of
LLVM's SLP vectorizer significantly.
IIRC, enabling the SLP vectorizer also uncovered an RA bug with a shader.

I think I'll look into the issues with patch 35 again.

On Tue, 12 Feb 2019 at 16:30, Samuel Pitoiset <samuel.pitoi...@gmail.com> wrote:
What's the status of this?

On 12/7/18 6:21 PM, Rhys Perry wrote:
This series adds support for:
- VK_KHR_shader_float16_int8
- VK_AMD_gpu_shader_half_float
- VK_AMD_gpu_shader_int16
- VK_KHR_8bit_storage
on VI+. Half floats are currently disabled on LLVM 7 because of a bug
causing large memory usage and long (or unbounded) compilation times with
some tests.

It depends on the following patch series:
- https://patchwork.freedesktop.org/series/53454/
- https://patchwork.freedesktop.org/series/53602/
- https://patchwork.freedesktop.org/series/53660/

An older version was tested on my Polaris card, but due to hardware issues
I currently can't test the latest version of the series.

deqp-vk has no regressions and none of the newly enabled tests fail.

Rhys Perry (38):
    ac: add various helpers for float16/int16/int8
    ac/nir: implement 8-bit push constant, ssbo and ubo loads
    ac/nir: implement 8-bit ssbo stores
    ac/nir: fix 16-bit ssbo stores
    ac/nir: implement 8-bit nir_load_const_instr
    ac/nir: implement 8-bit conversions
    ac/nir: fix 64-bit nir_op_f2f16_rtz
    ac/nir: make ac_build_clamp work on all bit sizes
    ac/nir: make ac_build_fract work on all bit sizes
    ac/nir: make ac_build_isign work on all bit sizes
    ac/nir: make ac_build_fsign work on all bit sizes
    ac/nir: make ac_build_fdiv support 16-bit floats
    ac/nir: implement half-float nir_op_frcp
    ac/nir: implement half-float nir_op_frsq
    ac/nir: implement half-float nir_op_ldexp
    radv: lower 16-bit flrp
    ac/nir: support half floats in emit_b2f
    ac/nir: make emit_b2i work on all bit sizes
    ac/nir: implement 16-bit shifts
    compiler/nir: add lowering option for 16-bit ffma
    ac/nir: implement 16-bit ac_build_ddxy
    ac/nir: implement 8 and 16 bit ac_build_readlane
    nir: make bitfield_reverse and ifind_msb work with all integers
    ac/nir: make ac_find_lsb work on all bit sizes
    ac/nir: make ac_build_umsb work on all bit sizes
    ac/nir: implement 8 and 16 bit ac_build_imsb
    ac/nir: make ac_build_bit_count work on all bit sizes
    ac/nir: make ac_build_bitfield_reverse work on all bit sizes
    ac/nir: implement 16-bit pack/unpack opcodes
    ac/nir: add 8-bit and 16-bit types to glsl_base_to_llvm_type
    ac/nir,radv: create an array of varying output types
    ac/nir: store all outputs as f32
    radv: store all fragment shader inputs as f32
    radv: handle all fragment output types
    ac,radv: run LLVM's SLP vectorizer
    ac/nir: generate better code for nir_op_f2f16_rtz
    ac/nir: have nir_op_f2f16 round to zero
    radv: expose float16, int16 and int8 features and extensions

   src/amd/common/ac_llvm_build.c        | 355 ++++++++++++++------------
   src/amd/common/ac_llvm_build.h        |  22 +-
   src/amd/common/ac_llvm_util.c         |   9 +-
   src/amd/common/ac_llvm_util.h         |   1 +
   src/amd/common/ac_nir_to_llvm.c       | 258 +++++++++++++++----
   src/amd/common/ac_shader_abi.h        |   1 +
   src/amd/vulkan/radv_device.c          |  17 ++
   src/amd/vulkan/radv_extensions.py     |   4 +
   src/amd/vulkan/radv_nir_to_llvm.c     |  92 ++++---
   src/amd/vulkan/radv_shader.c          |   7 +
   src/broadcom/compiler/nir_to_vir.c    |   1 +
   src/compiler/nir/nir.h                |   1 +
   src/compiler/nir/nir_opcodes.py       |   4 +-
   src/compiler/nir/nir_opt_algebraic.py |   4 +-
   src/gallium/drivers/radeonsi/si_get.c |   1 +
   src/gallium/drivers/vc4/vc4_program.c |   1 +
   16 files changed, 516 insertions(+), 262 deletions(-)
