On 05/12/17 18:31, Chema Casanova wrote: > El 05/12/17 a las 06:16, Jason Ekstrand escribió: >> A couple of notes: >> >> 1) I *think* I gave you enough reviews to land the UBO/SSBO part and >> the optimizations in 26-28. If reviews are still missing anywhere, >> please let me know. If not, let's try and get that part landed. > > The series is almost ready to land, I have only pending to address your > feedback about use untyped_read for reading vec3 ssbos. > > The only missing explicit R-b is that " [PATCH v4 28/44] i965/fs: Use > untyped_surface_read for 16-bit load_ssbo" and "[PATCH v4 23/44] > i965/fs: Enables 16-bit load_ubo with sampler" i've just answered your > review to confirm the R-b. > > I expect to finish today vec3 ssbo and send the series to Jenkins before > landing, confirm your "pending" R-b, do a last rebase to master and ask > for a push.
I've just prepared a rebased branch with the reviewed commits ready to land to enable VK_KHR_16bit_storage support for SSBO/UBO. https://github.com/Igalia/mesa/tree/wip/VK_KHR_16bit_storage-v4-ubo-ssbo-to-land As I don't have still commit access to mesa, maybe Eduardo or Alejandro can land it for me tomorrow. But, Jason feel free to push it if you want. Chema >> 2) I send out a patch to rewrite assign_constant_locations which I >> think should make it automatically handle 8 and 16-bit values as >> well. I'd rather do that than more special casing if everything works >> out ok. > > I'm testing this patch with 16-bits and make sure whatever is needed to > have 16-bit working. > >> >> 3) I sent out a series of patches to enable pushing of UBOs in >> Vulkan. If we're not careful, these will clash with 16bit storage as >> UBO support suddenly has to imply push constant support. That said, >> I"m willing to wait at least a little while before landing them to let >> us get 16bit push constant support sorted out. The UBO pushing >> patches give us a nice little performance boost but we're nowhere near >> a release and I don't want it blocking you. > > That would be my next priority, so we would only have pending to land > the 16-bit input/output support to finish this extension. > > Chema > >> On Wed, Nov 29, 2017 at 6:07 PM, Jose Maria Casanova Crespo >> <jmcasan...@igalia.com <mailto:jmcasan...@igalia.com>> wrote: >> >> Hello, >> >> this is the V4 series for the implementation of the >> SPV_KHR_16bit_storage >> and VK_KHR_16bit_storage extensions on the anv vulkan driver, in >> addition >> to the GLSL and NIR support needed. >> >> The original series can be found here [1], the following v2 [2] >> and v3 [3]. >> >> In short V4 includes the following: >> >> * Reorder the series to enable features as they are implemented, >> the series >> now enables first UBO and SSBO support, and then inputs/outputs and >> finally push constants. >> * Support the byte scattered read/write messages with different >> bit sizes >> byte/word/dword. >> * Refactor of the store_ssbo code and also fix stores when >> writemask was .yz >> * Uses the sampler for load_ubo avoiding the initial >> implementation of >> the series using byte_scattered_read. >> * Addressed all the feedback provided by Jason and Topi on v3 review. >> >> This series is also available at: >> >> https://github.com/Igalia/mesa/tree/wip/VK_KHR_16bit_storage-rc4 >> <https://github.com/Igalia/mesa/tree/wip/VK_KHR_16bit_storage-rc4> >> >> The objective is to start landing part of this series, all >> feedback has been >> addressed for SSBO and UBO. But for input/outputs features it will >> probably >> need another iteration as was not completely reviewed. It is also >> needed >> to define the approach for push constants issues before of after >> landing >> the support with this implementation. >> >> Patches 1-5 and 8-17 have already been reviewed. Patch 7 was already >> reviewed but as it has changed too much i would appreciate another >> review. When patches until 25 or 28 are reviewed we could land >> UBOs and >> SSBOs support. >> >> Finally an updated overview of the patches: >> >> Patches 1-2 add 16-bit float, int and uint types to GLSL. This is >> needed because NIR uses GLSL types internally. We use the enums >> already defined at AMD_gpu_shader_half_float and NV_gpu_shader >> extensions. Patch 2 updates mesa/st, in order to avoid warnings for >> types not handled on a switch. >> >> Patches 3-6 add NIR support for those new GLSL 16-bit types, >> conversion opcodes, and rounding modes for float to half-float >> conversions. >> >> Patches 7-9 add the SPIR-V (SPV_KHR_16bit_storage) to NIR support. >> >> Patches 10-12 add general 16-bit support for i965. This includes >> handling of new types on several general purpose methods, >> update/remove some asserts. >> >> Patches 14-17 add support for 32 to 16-bit conversions for i965, >> including rounding mode opcodes (needed for float to half-float >> conversions), and an optimization that removes superfluous rounding >> mode sets. >> >> Patches 18-21 add and use two new messages: byte scattered read and >> write. Those were needed because untyped surface message has a fixed >> 32-bit write size. Those messages are used on the 16-bit support of >> store SSBO, load SSBO and load shared. >> >> Patch 22 adds helpers to allow un/shuffle 16-bit components in 32-bit >> ones. This would be needed for following optimizations on load/store >> ssbo. This huck was originally in the input/outputs support, but >> needed >> a relocation because of the new order of the series. >> >> Patch 23 Enables load_ubo support for 16-bit using the sampler >> un-shuffling >> pairs 16-bit components from 32-bit. >> >> Patches 24-25 enable SPV_KHR_16bit_storage and >> VK_KHR_16bit_storage but only >> the support for SSBO and UBO on anv vulkan driver. >> >> Patches 26-28 were new patches included in V2 to improve performance >> reducing the use of multiple scattered messages for untyped read/write >> opreations. 16bit CTS tests passes without them. The other one would >> fix a real problem using predication (patch 27), but >> unfourtunately no CTS >> test yet catching it. >> >> Patches 29-33 implement 16-bit vertex attribute inputs support on >> i965. These include changes on anv. This was needed because 16-bit >> surface formats do implicit conversion to 32-bit. To workaround this, >> we override the 16-bit surface format, and use 32-bit ones. Issues >> related >> to robust buffer access have been addressed. >> >> Patch 34 implements load input and load store for all intra stage. >> This >> patch could have problems pointed by Jason related to how TCS >> outputs are >> implmemented that need more work. >> >> Patch 35-42 implements 16-bit store output support for fragment >> shaders on i965. Last patch enables VK_KHR_16bit for input/outputs. >> >> Patch 43-44 adds 16-bit support for push constant and enables the >> feature. >> There is still pending to work on a general solution for push >> constants and >> the mixture of different bit_sizes. >> >> [1] >> https://lists.freedesktop.org/archives/mesa-dev/2017-July/162791.html >> <https://lists.freedesktop.org/archives/mesa-dev/2017-July/162791.html> >> [2] >> https://lists.freedesktop.org/archives/mesa-dev/2017-August/167455.html >> <https://lists.freedesktop.org/archives/mesa-dev/2017-August/167455.html> >> [3] >> https://lists.freedesktop.org/archives/mesa-dev/2017-October/172557.html >> >> <https://lists.freedesktop.org/archives/mesa-dev/2017-October/172557.html> >> >> CC: Jason Ekstrand <ja...@jlekstrand.net >> <mailto:ja...@jlekstrand.net>> >> CC: Topi Pohjolainen <topi.pohjolai...@gmail.com >> <mailto:topi.pohjolai...@gmail.com>> >> CC: Matt Turner <matts...@gmail.com <mailto:matts...@gmail.com>> >> >> Alejandro Piñeiro (12): >> i965/vec4: Handle 16-bit types at type_size_xvec4 >> i965/fs: Remove BRW_REGISTER_TYPE_HF assert at get_exec_type >> i965/fs: Handle 32-bit to 16-bit conversions >> i965/fs: Define new shader opcode to set rounding modes >> i965/fs: Enable rounding mode on f2f16 ops >> i965/fs: Add remove_extra_rounding_modes optimization >> i965/fs: Use byte_scattered_write on 16-bit store_ssbo >> anv: Enable VK_KHR_16bit_storage for SSBO and UBO >> i965/fs: Predicate byte scattered writes if needed >> anv/pipeline: Use 32-bit surface formats for 16-bit formats >> anv/cmd_buffer: Add a padding to the vertex buffer >> i965/fs: Use half_precision data_format on 16-bit fb writes >> >> Eduardo Lima Mitev (8): >> glsl: Add 16-bit types >> mesa/st: Handle 16-bit types at st_glsl_storage_type_size() >> nir: Add support for 16-bit types (half float, int16 and uint16) >> nir: Populate conversion opcodes to 16-bit types >> spirv/nir: Handle 16-bit types >> spirv/nir: Add support for SPV_KHR_16bit_storage >> anv: Enable SPV_KHR_16bit_storage on gen 8+ >> i965/fs: Optimize 16-bit SSBO stores by packing two into a >> 32-bit reg >> >> Jose Maria Casanova Crespo (24): >> nir: Add rounding modes enum >> nir: Handle fp16 rounding modes at nir_type_conversion_op >> spirv: Enable FPRoundingMode decorator to nir operations >> i965: Support for 16-bit base types in helper functions >> i965: Add support for control register >> i965/fs: Add byte scattered write message and fs support >> i965/fs: Add byte scattered read message and fs support >> i965/fs: Use byte scattered read for 16-bit load_ssbo >> i965/fs: Helpers for un/shuffle 16-bit pairs in 32-bit components >> i965/fs: Enables 16-bit load_ubo with sampler >> i965/fs: Use untyped_surface_read for 16-bit load_ssbo >> compiler: Mark when input/ouput attribute at VS uses 16-bit >> i965/compiler: includes 16-bit vertex input >> i965/fs: Unpack 16-bit from 32-bit components in VS load_input >> i965/fs: Support 16-bit types at load_input and store_output >> i965/fs: Enable Render Target Write for 16-bit outputs >> anv: Enable VK_KHR_16bit_storage for input/output >> i965/fs: Include support for SEND data_format bit for Render Targets >> i965/disasm: Show half-precision data_format on rt_writes >> i965/fs: Mark 16-bit outputs on FS store_output >> i965/fs: 16-bit source payloads always use 1 register >> i965/fs: Enable 16-bit render target write on SKL and CHV >> i965/fs: Support push constants of 16-bit types >> anv: Enable VK_KHR_16bit_storage for push_constant >> >> src/compiler/builtin_type_macros.h | 26 ++ >> src/compiler/glsl/ast_to_hir.cpp | 3 + >> src/compiler/glsl/glsl_to_nir.cpp | 9 +- >> src/compiler/glsl/ir_clone.cpp | 3 + >> src/compiler/glsl/link_uniform_initializers.cpp | 3 + >> src/compiler/glsl/lower_buffer_access.cpp | 3 +- >> src/compiler/glsl_types.cpp | 93 +++++- >> src/compiler/glsl_types.h | 25 +- >> src/compiler/nir/nir.c | 6 + >> src/compiler/nir/nir.h | 22 +- >> src/compiler/nir/nir_gather_info.c | 17 +- >> src/compiler/nir/nir_opcodes.py | 11 +- >> src/compiler/nir/nir_opcodes_c.py | 17 +- >> src/compiler/nir/nir_split_var_copies.c | 6 + >> src/compiler/nir_types.cpp | 18 ++ >> src/compiler/nir_types.h | 8 + >> src/compiler/shader_info.h | 2 + >> src/compiler/spirv/nir_spirv.h | 1 + >> src/compiler/spirv/spirv_to_nir.c | 118 ++++++- >> src/compiler/spirv/vtn_alu.c | 35 +- >> src/compiler/spirv/vtn_variables.c | 21 ++ >> src/intel/compiler/brw_compiler.h | 1 + >> src/intel/compiler/brw_disasm.c | 4 + >> src/intel/compiler/brw_eu.h | 25 +- >> src/intel/compiler/brw_eu_defines.h | 36 +++ >> src/intel/compiler/brw_eu_emit.c | 130 +++++++- >> src/intel/compiler/brw_fs.cpp | 142 +++++++- >> src/intel/compiler/brw_fs.h | 12 + >> src/intel/compiler/brw_fs_copy_propagation.cpp | 8 +- >> src/intel/compiler/brw_fs_generator.cpp | 30 +- >> src/intel/compiler/brw_fs_nir.cpp | 409 >> ++++++++++++++++++++++-- >> src/intel/compiler/brw_fs_surface_builder.cpp | 23 +- >> src/intel/compiler/brw_fs_surface_builder.h | 14 + >> src/intel/compiler/brw_fs_visitor.cpp | 6 + >> src/intel/compiler/brw_inst.h | 1 + >> src/intel/compiler/brw_ir_fs.h | 3 - >> src/intel/compiler/brw_nir.c | 16 + >> src/intel/compiler/brw_reg.h | 6 + >> src/intel/compiler/brw_shader.cpp | 23 ++ >> src/intel/compiler/brw_shader.h | 7 + >> src/intel/compiler/brw_vec4.cpp | 1 + >> src/intel/compiler/brw_vec4_generator.cpp | 3 +- >> src/intel/compiler/brw_vec4_visitor.cpp | 3 + >> src/intel/vulkan/anv_device.c | 23 ++ >> src/intel/vulkan/anv_extensions.py | 1 + >> src/intel/vulkan/anv_pipeline.c | 1 + >> src/intel/vulkan/genX_cmd_buffer.c | 20 +- >> src/intel/vulkan/genX_pipeline.c | 34 ++ >> src/mesa/program/ir_to_mesa.cpp | 6 + >> src/mesa/state_tracker/st_glsl_types.cpp | 3 + >> 50 files changed, 1337 insertions(+), 101 deletions(-) >> >> -- >> 2.14.3 >> >> >> >> >> _______________________________________________ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev