Re: [Mesa-dev] [PATCH 00/38] radv, ac: 16-bit and 8-bit arithmetic and 8-bit storage

Samuel Pitoiset Wed, 13 Feb 2019 13:20:29 -0800


On 2/13/19 9:20 PM, Rhys Perry wrote:

Quite a few of the patches aren't specific to a single extension, as
many make code size-generic and some of the extensions intersect in
functionality.
It might still be possible to roughly order the patches by
functionality, but I'm not sure it would be very useful (possible
order in attachment). I didn't look at the actual content of the
patches when creating the attachment; it is from memory and from
looking at the descriptions.
Would you like me to send out a v2 of this series ordered like that?


Ok. No that's fine.

Can you at least rebase and address Marek's feedback? I will review the v2.

Thanks Rhys.


On Tue, 12 Feb 2019 at 17:08, Samuel Pitoiset <samuel.pitoi...@gmail.com> wrote:

How about splitting this series into four parts, one for every
extension? Is this doable without too much trouble?

On 2/12/19 6:02 PM, Rhys Perry wrote:

It currently requires review (and possibly rebasing). Marek Olšák sent
some feedback for a few of the patches, but other than that it hasn't
gotten much attention.

Also, patch 35 seems to vectorize 32-bit code, which can help or hurt
individual shaders quite a bit and seems to hurt shaders overall. I'm
not yet sure how to solve this without removing it or changing the
result of LLVM's SLP vectorizer significantly.
IIRC enabling the SLP vectorizer also uncovered a register allocation
(RA) bug with a shader.

I think I'll look into the issues with patch 35 again.
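
For readers unfamiliar with the term: an SLP (superword-level parallelism)
vectorizer looks for straight-line runs of independent, similar scalar
operations on adjacent data and tries to merge them into vector operations.
The C sketch below only shows the kind of source pattern such a pass
targets. It is not taken from the series, the function name add4_scalar is
made up for this illustration, and whether a given compiler actually
vectorizes it depends on the target and its cost model.

```c
/* Four independent adds on adjacent elements: a typical SLP candidate.
 * "restrict" tells the compiler that out, a and b do not alias, so the
 * four statements can be reordered and merged freely. */
static void add4_scalar(float *restrict out,
                        const float *restrict a,
                        const float *restrict b)
{
    out[0] = a[0] + b[0];
    out[1] = a[1] + b[1];
    out[2] = a[2] + b[2];
    out[3] = a[3] + b[3];
}
```

Merging such operations is only a win when the cost of packing and
unpacking the vector is lower than the cost of the scalar operations it
replaces, which is one way a vectorizer can end up hurting generated code.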

On Tue, 12 Feb 2019 at 16:30, Samuel Pitoiset <samuel.pitoi...@gmail.com> wrote:

What's the status of this?

On 12/7/18 6:21 PM, Rhys Perry wrote:

This series adds support for:
- VK_KHR_shader_float16_int8
- VK_AMD_gpu_shader_half_float
- VK_AMD_gpu_shader_int16
- VK_KHR_8bit_storage
on VI+. Half floats are currently disabled on LLVM 7 because of a bug
causing large memory usage and long (or unbounded) compilation times with
some tests.

It depends on the following patch series:
- https://patchwork.freedesktop.org/series/53454/
- https://patchwork.freedesktop.org/series/53602/
- https://patchwork.freedesktop.org/series/53660/

An older version was tested on my Polaris card, but due to hardware issues
I currently can't test the latest version of the series.

deqp-vk has no regressions and none of the newly enabled tests fail.

Rhys Perry (38):
     ac: add various helpers for float16/int16/int8
     ac/nir: implement 8-bit push constant, ssbo and ubo loads
     ac/nir: implement 8-bit ssbo stores
     ac/nir: fix 16-bit ssbo stores
     ac/nir: implement 8-bit nir_load_const_instr
     ac/nir: implement 8-bit conversions
     ac/nir: fix 64-bit nir_op_f2f16_rtz
     ac/nir: make ac_build_clamp work on all bit sizes
     ac/nir: make ac_build_fract work on all bit sizes
     ac/nir: make ac_build_isign work on all bit sizes
     ac/nir: make ac_build_fsign work on all bit sizes
     ac/nir: make ac_build_fdiv support 16-bit floats
     ac/nir: implement half-float nir_op_frcp
     ac/nir: implement half-float nir_op_frsq
     ac/nir: implement half-float nir_op_ldexp
     radv: lower 16-bit flrp
     ac/nir: support half floats in emit_b2f
     ac/nir: make emit_b2i work on all bit sizes
     ac/nir: implement 16-bit shifts
     compiler/nir: add lowering option for 16-bit ffma
     ac/nir: implement 16-bit ac_build_ddxy
     ac/nir: implement 8 and 16 bit ac_build_readlane
     nir: make bitfield_reverse and ifind_msb work with all integers
     ac/nir: make ac_find_lsb work on all bit sizes
     ac/nir: make ac_build_umsb work on all bit sizes
     ac/nir: implement 8 and 16 bit ac_build_imsb
     ac/nir: make ac_build_bit_count work on all bit sizes
     ac/nir: make ac_build_bitfield_reverse work on all bit sizes
     ac/nir: implement 16-bit pack/unpack opcodes
     ac/nir: add 8-bit and 16-bit types to glsl_base_to_llvm_type
     ac/nir,radv: create an array of varying output types
     ac/nir: store all outputs as f32
     radv: store all fragment shader inputs as f32
     radv: handle all fragment output types
     ac,radv: run LLVM's SLP vectorizer
     ac/nir: generate better code for nir_op_f2f16_rtz
     ac/nir: have nir_op_f2f16 round to zero
     radv: expose float16, int16 and int8 features and extensions
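
Several of the patches above touch nir_op_f2f16_rtz. As a rough
illustration of what "round to zero" means for a 32-bit to 16-bit float
conversion, here is a standalone C sketch. It is not code from this series
or from Mesa: the function name f32_to_f16_rtz is made up for the example,
and it assumes the IEEE 754 binary32 and binary16 layouts.

```c
#include <stdint.h>
#include <string.h>

/* Convert a float32 to binary16 bits, rounding toward zero (truncation).
 * Hypothetical helper for illustration only. */
static uint16_t f32_to_f16_rtz(float f)
{
    uint32_t x;
    memcpy(&x, &f, sizeof x);              /* reinterpret the bits safely */

    uint32_t sign = (x >> 16) & 0x8000u;
    uint32_t exp  = (x >> 23) & 0xffu;
    uint32_t man  = x & 0x7fffffu;

    if (exp == 0xffu) {
        /* Inf stays Inf; NaN stays NaN (force a mantissa bit). */
        return (uint16_t)(sign | 0x7c00u |
                          (man ? (0x200u | (man >> 13)) : 0u));
    }

    int e = (int)exp - 127 + 15;           /* rebias the exponent */

    if (e >= 31) {
        /* Too large: round-to-zero clamps to the largest finite half. */
        return (uint16_t)(sign | 0x7bffu);
    }
    if (e <= 0) {
        /* Result is a half subnormal or zero. */
        if (e < -10)
            return (uint16_t)sign;         /* truncates to signed zero */
        man |= 0x800000u;                  /* add the implicit leading one */
        return (uint16_t)(sign | (man >> (14 - e)));
    }

    /* Normal case: drop the low 13 mantissa bits without rounding. */
    return (uint16_t)(sign | ((uint32_t)e << 10) | (man >> 13));
}
```

With this truncating behaviour, 65520.0f converts to 0x7bff (65504, the
largest finite half), where round-to-nearest-even would produce infinity.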

    src/amd/common/ac_llvm_build.c        | 355 ++++++++++++++------------
    src/amd/common/ac_llvm_build.h        |  22 +-
    src/amd/common/ac_llvm_util.c         |   9 +-
    src/amd/common/ac_llvm_util.h         |   1 +
    src/amd/common/ac_nir_to_llvm.c       | 258 +++++++++++++++----
    src/amd/common/ac_shader_abi.h        |   1 +
    src/amd/vulkan/radv_device.c          |  17 ++
    src/amd/vulkan/radv_extensions.py     |   4 +
    src/amd/vulkan/radv_nir_to_llvm.c     |  92 ++++---
    src/amd/vulkan/radv_shader.c          |   7 +
    src/broadcom/compiler/nir_to_vir.c    |   1 +
    src/compiler/nir/nir.h                |   1 +
    src/compiler/nir/nir_opcodes.py       |   4 +-
    src/compiler/nir/nir_opt_algebraic.py |   4 +-
    src/gallium/drivers/radeonsi/si_get.c |   1 +
    src/gallium/drivers/vc4/vc4_program.c |   1 +
    16 files changed, 516 insertions(+), 262 deletions(-)

