Re: [Mesa-dev] [PATCH 1/3] nir: handle no variable in derefs in some places
On Tue, Jul 3, 2018 at 11:25 PM Dave Airlie wrote:

> From: Dave Airlie
>
> ---
>  src/compiler/nir/nir_gather_info.c           | 2 ++
>  src/compiler/nir/nir_lower_indirect_derefs.c | 4
>  src/compiler/nir/nir_lower_vars_to_ssa.c     | 4
>  3 files changed, 10 insertions(+)
>
> diff --git a/src/compiler/nir/nir_gather_info.c
> b/src/compiler/nir/nir_gather_info.c
> index 2b431e343e9..4bbdd967c2b 100644
> --- a/src/compiler/nir/nir_gather_info.c
> +++ b/src/compiler/nir/nir_gather_info.c
> @@ -233,6 +233,8 @@ gather_intrinsic_info(nir_intrinsic_instr *instr,
> nir_shader *shader,
>        nir_deref_instr *deref = nir_src_as_deref(instr->src[0]);
>        nir_variable *var = nir_deref_instr_get_variable(deref);
>
> +      if (!var)
> +         break;
>

You should be able to change the if below to

if (deref->mode == nir_var_shader_in || deref->mode == nir_var_shader_out)

and then pull the nir_deref_instr_get_variable inside the if.

>        if (var->data.mode == nir_var_shader_in ||
>            var->data.mode == nir_var_shader_out) {
>           bool is_output_read = false;
> diff --git a/src/compiler/nir/nir_lower_indirect_derefs.c
> b/src/compiler/nir/nir_lower_indirect_derefs.c
> index d85c1704222..be39e1098ed 100644
> --- a/src/compiler/nir/nir_lower_indirect_derefs.c
> +++ b/src/compiler/nir/nir_lower_indirect_derefs.c
> @@ -131,6 +131,8 @@ lower_indirect_derefs_block(nir_block *block,
> nir_builder *b,
>           continue;
>
>        nir_deref_instr *deref = nir_src_as_deref(intrin->src[0]);
> +      if (!deref)
> +         continue;
>

Is this a real problem? intrin->src[0] should always point at a deref even
if that deref is just a cast.

>        /* Walk the deref chain back to the base and look for indirects */
>        bool has_indirect = false;
> @@ -141,6 +143,8 @@ lower_indirect_derefs_block(nir_block *block,
> nir_builder *b,
>              has_indirect = true;
>
>           base = nir_deref_instr_parent(base);
> +         if (!base)
> +            break;
>

This isn't sufficient. You need to make some case below continue if
base == NULL.

>        }
>
>        if (!has_indirect)
> diff --git a/src/compiler/nir/nir_lower_vars_to_ssa.c
> b/src/compiler/nir/nir_lower_vars_to_ssa.c
> index 3f37acaed33..dcef9b8e221 100644
> --- a/src/compiler/nir/nir_lower_vars_to_ssa.c
> +++ b/src/compiler/nir/nir_lower_vars_to_ssa.c
> @@ -142,6 +142,8 @@ static struct deref_node *
>  get_deref_node_recur(nir_deref_instr *deref,
>                       struct lower_variables_state *state)
>  {
> +   if (!deref)
> +      return NULL;
>     if (deref->deref_type == nir_deref_type_var)
>        return get_deref_node_for_var(deref->var, state);
>
> @@ -198,6 +200,8 @@ get_deref_node_recur(nir_deref_instr *deref,
>
>        return parent->wildcard;
>
> +   case nir_deref_type_cast:
> +      return NULL;
>

I think the better thing to do here would be to just look at the deref mode
and bail if it's not a nir_var_local. I just sent an untested patch that
does just that.

>     default:
>        unreachable("Invalid deref type");
>     }
> --
> 2.17.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nir/vars_to_ssa: Don't build deref nodes for non-local variables
---
 src/compiler/nir/nir_lower_vars_to_ssa.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/compiler/nir/nir_lower_vars_to_ssa.c b/src/compiler/nir/nir_lower_vars_to_ssa.c
index 3f37acaed33..ef2019551c6 100644
--- a/src/compiler/nir/nir_lower_vars_to_ssa.c
+++ b/src/compiler/nir/nir_lower_vars_to_ssa.c
@@ -206,6 +206,12 @@ get_deref_node_recur(nir_deref_instr *deref,
 static struct deref_node *
 get_deref_node(nir_deref_instr *deref, struct lower_variables_state *state)
 {
+   /* This pass only works on local variables.  Just ignore any derefs with
+    * a non-local mode.
+    */
+   if (deref->mode != nir_var_local)
+      return NULL;
+
    struct deref_node *node = get_deref_node_recur(deref, state);
    if (!node)
       return NULL;
--
2.17.1
Re: [Mesa-dev] [PATCH 2/9] i965/fs: Register allocator shoudn't use grf127 for sends dest
This is a very clever solution to the problem. I like it. :-)

Reviewed-by: Jason Ekstrand

I think that's all of them. I've pushed the XML bump so you should be able
to land at will.

--Jason

On Sun, Jul 8, 2018 at 5:29 PM Jose Maria Casanova Crespo <jmcasan...@igalia.com> wrote:

> Since Gen8+ Intel PRM states that "r127 must not be used for return
> address when there is a src and dest overlap in send instruction."
>
> [...]
Re: [Mesa-dev] [PATCH 7/9] spirv: Include headers and grammar for SPV_KHR_8bit_storage
Acked-by: Jason Ekstrand

On Sun, Jul 8, 2018 at 5:29 PM Jose Maria Casanova Crespo <jmcasan...@igalia.com> wrote:

> Update to headers and grammar to ff684ffc6a35d2a58f0f63108877d0064ea33feb
>
> [...]
[Mesa-dev] [PATCH 9/9] anv: Enable SPV_KHR_8bit_storage and VK_KHR_8bit_storage
Enables SPV_KHR_8bit_storage and VK_KHR_8bit_storage on gen 8+ using
the VK_KHR_get_physical_device_properties2 functionality to expose
if the extension is supported or not.

Reviewed-by: Jason Ekstrand
---
 src/intel/vulkan/anv_device.c      | 11 +++
 src/intel/vulkan/anv_extensions.py |  1 +
 src/intel/vulkan/anv_pipeline.c    |  1 +
 3 files changed, 13 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 7b3ddbb9501..a8b0bd2fc3e 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -896,6 +896,17 @@ void anv_GetPhysicalDeviceFeatures2(
          break;
       }

+      case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_8BIT_STORAGE_FEATURES_KHR: {
+         VkPhysicalDevice8BitStorageFeaturesKHR *features =
+            (VkPhysicalDevice8BitStorageFeaturesKHR *)ext;
+         ANV_FROM_HANDLE(anv_physical_device, pdevice, physicalDevice);
+
+         features->storageBuffer8BitAccess = pdevice->info.gen >= 8;
+         features->uniformAndStorageBuffer8BitAccess = pdevice->info.gen >= 8;
+         features->storagePushConstant8 = pdevice->info.gen >= 8;
+         break;
+      }
+
       default:
          anv_debug_ignored_stype(ext->sType);
          break;
diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py
index 4179315a388..99529162781 100644
--- a/src/intel/vulkan/anv_extensions.py
+++ b/src/intel/vulkan/anv_extensions.py
@@ -72,6 +72,7 @@ MAX_API_VERSION = None # Computed later
 EXTENSIONS = [
     Extension('VK_ANDROID_native_buffer',                 5, 'ANDROID'),
     Extension('VK_KHR_16bit_storage',                     1, 'device->info.gen >= 8'),
+    Extension('VK_KHR_8bit_storage',                      1, 'device->info.gen >= 8'),
     Extension('VK_KHR_bind_memory2',                      1, True),
     Extension('VK_KHR_create_renderpass2',                1, True),
     Extension('VK_KHR_dedicated_allocation',              1, True),
diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index b0c9c3422a5..1565fe7a7a3 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -153,6 +153,7 @@ anv_shader_compile_to_nir(struct anv_pipeline *pipeline,
          .subgroup_shuffle = true,
          .subgroup_vote = true,
          .stencil_export = device->instance->physicalDevice.info.gen >= 9,
+         .storage_8bit = device->instance->physicalDevice.info.gen >= 8,
       },
    };

--
2.17.1
[Mesa-dev] [PATCH 8/9] spirv/nir: Add support for SPV_KHR_8bit_storage
Reviewed-by: Jason Ekstrand
---
 src/compiler/shader_info.h        | 1 +
 src/compiler/spirv/spirv_to_nir.c | 6 ++
 2 files changed, 7 insertions(+)

diff --git a/src/compiler/shader_info.h b/src/compiler/shader_info.h
index 8c58ee285ec..3b95d5962c0 100644
--- a/src/compiler/shader_info.h
+++ b/src/compiler/shader_info.h
@@ -58,6 +58,7 @@ struct spirv_supported_capabilities {
    bool runtime_descriptor_array;
    bool stencil_export;
    bool atomic_storage;
+   bool storage_8bit;
 };

 typedef struct shader_info {
diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c
index fb4211193fb..80a35b1b750 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -3498,6 +3498,12 @@ vtn_handle_preamble_instruction(struct vtn_builder *b, SpvOp opcode,
          spv_check_supported(shader_viewport_index_layer, cap);
          break;

+      case SpvCapabilityStorageBuffer8BitAccess:
+      case SpvCapabilityUniformAndStorageBuffer8BitAccess:
+      case SpvCapabilityStoragePushConstant8:
+         spv_check_supported(storage_8bit, cap);
+         break;
+
       case SpvCapabilityInputAttachmentArrayDynamicIndexingEXT:
       case SpvCapabilityUniformTexelBufferArrayDynamicIndexingEXT:
       case SpvCapabilityStorageTexelBufferArrayDynamicIndexingEXT:
--
2.17.1
[Mesa-dev] [PATCH 6/9] i965/fs: Enable store_ssbo for 8-bit types.
v2: Update comment according to this patch. (Jason Ekstrand)

Reviewed-by: Jason Ekstrand
---
 src/intel/compiler/brw_fs_nir.cpp | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index 4155b2ed996..62ec0df8994 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -4284,7 +4284,6 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
             write_src = shuffle_for_32bit_write(bld, write_src, 0,
                                                 num_components);
          } else if (type_size < 4) {
-            assert(type_size == 2);
             /* For 16-bit types we pack two consecutive values into a 32-bit
              * word and use an untyped write message. For single values or not
              * 32-bit-aligned we need to use byte-scattered writes because
@@ -4308,12 +4307,15 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
                 * being aligned to 32-bit.
                 */
                num_components = 1;
-            } else if (num_components > 2 && (num_components % 2)) {
-               /* If there is an odd number of consecutive components we left
-                * the not paired component for a following emit of length == 1
-                * with byte_scattered_write.
+            } else if (num_components * type_size > 4 &&
+                       (num_components * type_size % 4)) {
+               /* If the pending components size is not a multiple of 4 bytes
+                * we left the not aligned components for following emits of
+                * length == 1 with byte_scattered_write.
                 */
-               num_components --;
+               num_components -= (num_components * type_size % 4) / type_size;
+            } else if (num_components * type_size < 4) {
+               num_components = 1;
             }
             /* For num_components == 1 we are also shuffling the component
              * because byte scattered writes of 16-bit need values to be dword
@@ -4337,7 +4339,6 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
          }

          if (type_size < 4 && num_components == 1) {
-            assert(type_size == 2);
             /* Untyped Surface messages have a fixed 32-bit size, so we need
              * to rely on byte scattered in order to write 16-bit elements.
              * The byte_scattered_write message needs that every written 16-bit
--
2.17.1
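The component-count arithmetic in the second hunk can be tried in isolation. The following is only a sketch under stated assumptions: the function name components_for_next_write and the dword_aligned flag are hypothetical, and the flag stands in for the preceding branch of the real code, which the hunk does not show in full. The remaining branches follow the added lines of the patch.

```c
#include <stdbool.h>

/* Sketch only: how many components the next SSBO write would cover for a
 * type smaller than 4 bytes.
 *
 * Hypothetical names: this function and `dword_aligned` are not in Mesa.
 * `dword_aligned` models the branch that precedes the patched code.
 */
static unsigned
components_for_next_write(unsigned num_components, unsigned type_size,
                          bool dword_aligned)
{
   const unsigned bytes = num_components * type_size;

   if (!dword_aligned) {
      /* Not known to be 32-bit aligned: write a single component. */
      return 1;
   } else if (bytes > 4 && (bytes % 4)) {
      /* Drop the trailing components that do not fill a whole dword;
       * they are left for later single-component writes.
       */
      return num_components - (bytes % 4) / type_size;
   } else if (bytes < 4) {
      /* Less than one dword pending: write a single component. */
      return 1;
   }

   /* A whole number of dwords: write everything at once. */
   return num_components;
}
```

With type_size == 2 and three components this returns 2, which matches the old "num_components --" behaviour for odd counts; with type_size == 1 only a full group of four bytes is written in one go.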
[Mesa-dev] [PATCH 7/9] spirv: Include headers and grammar for SPV_KHR_8bit_storage
Update to headers and grammar to ff684ffc6a35d2a58f0f63108877d0064ea33feb
---
 src/compiler/spirv/spirv.core.grammar.json | 44 ++
 src/compiler/spirv/spirv.h                 |  3 ++
 2 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/src/compiler/spirv/spirv.core.grammar.json b/src/compiler/spirv/spirv.core.grammar.json
index a03c024335c..cb641420d07 100644
--- a/src/compiler/spirv/spirv.core.grammar.json
+++ b/src/compiler/spirv/spirv.core.grammar.json
@@ -3914,7 +3914,7 @@
         { "kind" : "IdRef", "name" : "'Target'" },
         { "kind" : "Decoration" }
       ],
-      "extensions" : [ "SPV_GOOGLE_decorate_string" ],
+      "extensions" : [ "SPV_GOOGLE_decorate_string", "SPV_GOOGLE_hlsl_functionality1" ],
       "version" : "None"
     },
     {
@@ -3925,7 +3925,7 @@
         { "kind" : "LiteralInteger", "name" : "'Member'" },
         { "kind" : "Decoration" }
       ],
-      "extensions" : [ "SPV_GOOGLE_decorate_string" ],
+      "extensions" : [ "SPV_GOOGLE_decorate_string", "SPV_GOOGLE_hlsl_functionality1" ],
       "version" : "None"
     },
     {
@@ -3991,6 +3991,7 @@
         {
           "enumerant" : "ConstOffsets",
           "value" : "0x0020",
+          "capabilities" : [ "ImageGatherExtended" ],
           "parameters" : [
             { "kind" : "IdRef" }
           ]
@@ -5550,12 +5551,14 @@
           "enumerant" : "OverrideCoverageNV",
           "value" : 5248,
           "capabilities" : [ "SampleMaskOverrideCoverageNV" ],
+          "extensions" : [ "SPV_NV_sample_mask_override_coverage" ],
           "version" : "None"
         },
         {
           "enumerant" : "PassthroughNV",
           "value" : 5250,
           "capabilities" : [ "GeometryShaderPassthroughNV" ],
+          "extensions" : [ "SPV_NV_geometry_shader_passthrough" ],
           "version" : "None"
         },
         {
@@ -5568,6 +5571,7 @@
           "enumerant" : "SecondaryViewportRelativeNV",
           "value" : 5256,
           "capabilities" : [ "ShaderStereoViewNV" ],
+          "extensions" : [ "SPV_NV_stereo_view_rendering" ],
           "version" : "None",
           "parameters" : [
             { "kind" : "LiteralInteger", "name" : "'Offset'" }
@@ -5960,12 +5964,14 @@
           "enumerant" : "SecondaryPositionNV",
           "value" : 5257,
           "capabilities" : [ "ShaderStereoViewNV" ],
+          "extensions" : [ "SPV_NV_stereo_view_rendering" ],
           "version" : "None"
         },
         {
           "enumerant" : "SecondaryViewportMaskNV",
           "value" : 5258,
           "capabilities" : [ "ShaderStereoViewNV" ],
+          "extensions" : [ "SPV_NV_stereo_view_rendering" ],
           "version" : "None"
         },
         {
@@ -6043,17 +6049,23 @@
         {
           "enumerant" : "PartitionedReduceNV",
           "value" : 6,
-          "capabilities" : [ "GroupNonUniformPartitionedNV" ]
+          "capabilities" : [ "GroupNonUniformPartitionedNV" ],
+          "extensions" : [ "SPV_NV_shader_subgroup_partitioned" ],
+          "version" : "None"
         },
         {
           "enumerant" : "PartitionedInclusiveScanNV",
           "value" : 7,
-          "capabilities" : [ "GroupNonUniformPartitionedNV" ]
+          "capabilities" : [ "GroupNonUniformPartitionedNV" ],
+          "extensions" : [ "SPV_NV_shader_subgroup_partitioned" ],
+          "version" : "None"
         },
         {
           "enumerant" : "PartitionedExclusiveScanNV",
           "value" : 8,
-          "capabilities" : [ "GroupNonUniformPartitionedNV" ]
+          "capabilities" : [ "GroupNonUniformPartitionedNV" ],
+          "extensions" : [ "SPV_NV_shader_subgroup_partitioned" ],
+          "version" : "None"
         }
       ]
     },
@@ -6260,8 +6272,7 @@
         },
         {
           "enumerant" : "Int8",
-          "value" : 39,
-          "capabilities" : [ "Kernel" ]
+          "value" : 39
         },
         {
           "enumerant" : "InputAttachment",
@@ -6518,6 +6529,25 @@
           "extensions" : [ "SPV_KHR_post_depth_coverage" ],
           "version" : "None"
         },
+        {
+          "enumerant" : "StorageBuffer8BitAccess",
+          "value" : 4448,
+          "extensions" : [ "SPV_KHR_8bit_storage" ],
+          "version" : "None"
+        },
+        {
+          "enumerant" : "UniformAndStorageBuffer8BitAccess",
+          "value" : 4449,
+          "capabilities" : [ "StorageBuffer8BitAccess" ],
+          "extensions" : [ "SPV_KHR_8bit_storage" ],
+          "version" : "None"
+        },
+        {
+          "enumerant" : "StoragePushConstant8",
+          "value" : 4450,
+          "extensions" : [ "SPV_KHR_8bit_storage" ],
+          "version" : "None"
+        },
         {
           "enumerant" : "Float16ImageAMD",
           "value" : 5008,
diff --git a/src/compiler/spirv/spirv.h b/src/compiler/spirv/spirv.h
index e0a0330ba63..4c90c936ce0 100644
--- a/src/compiler/spirv/spirv.h
+++
[Mesa-dev] [PATCH 3/9] i965: Support for 8-bit base types in helper functions
Reviewed-by: Jason Ekstrand
---
 src/intel/compiler/brw_fs_nir.cpp | 11 ++-
 src/intel/compiler/brw_nir.c      |  4
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index 02ac92e62f1..83ed9575f80 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -303,10 +303,13 @@ brw_reg_type_from_bit_size(const unsigned bit_size,
       default:
          unreachable("Invalid bit size");
       }
+   case BRW_REGISTER_TYPE_B:
    case BRW_REGISTER_TYPE_W:
    case BRW_REGISTER_TYPE_D:
    case BRW_REGISTER_TYPE_Q:
       switch(bit_size) {
+      case 8:
+         return BRW_REGISTER_TYPE_B;
       case 16:
          return BRW_REGISTER_TYPE_W;
       case 32:
@@ -316,10 +319,13 @@ brw_reg_type_from_bit_size(const unsigned bit_size,
       default:
          unreachable("Invalid bit size");
       }
+   case BRW_REGISTER_TYPE_UB:
    case BRW_REGISTER_TYPE_UW:
    case BRW_REGISTER_TYPE_UD:
    case BRW_REGISTER_TYPE_UQ:
       switch(bit_size) {
+      case 8:
+         return BRW_REGISTER_TYPE_UB;
       case 16:
          return BRW_REGISTER_TYPE_UW;
       case 32:
@@ -1666,7 +1672,10 @@ fs_visitor::get_nir_dest(const nir_dest &dest)
 {
    if (dest.is_ssa) {
       const brw_reg_type reg_type =
-         brw_reg_type_from_bit_size(dest.ssa.bit_size, BRW_REGISTER_TYPE_F);
+         brw_reg_type_from_bit_size(dest.ssa.bit_size,
+                                    dest.ssa.bit_size == 8 ?
+                                    BRW_REGISTER_TYPE_D :
+                                    BRW_REGISTER_TYPE_F);
       nir_ssa_values[dest.ssa.index] =
          bld.vgrf(reg_type, dest.ssa.num_components);
       return nir_ssa_values[dest.ssa.index];
diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index 74b39ad80a2..5990427b731 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -887,6 +887,10 @@ brw_type_for_nir_type(const struct gen_device_info *devinfo, nir_alu_type type)
       return BRW_REGISTER_TYPE_W;
    case nir_type_uint16:
       return BRW_REGISTER_TYPE_UW;
+   case nir_type_int8:
+      return BRW_REGISTER_TYPE_B;
+   case nir_type_uint8:
+      return BRW_REGISTER_TYPE_UB;
    default:
       unreachable("unknown type");
    }
--
2.17.1
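As a rough illustration of what the helper change does: the mapping keeps the class of the reference type (float, signed, unsigned) and picks the member of that class with the requested width. The sketch below uses a hypothetical toy_reg_type enum and toy_* functions, not the real brw_reg_type values, and returns TOY_INVALID where the real code calls unreachable().

```c
/* Sketch only: hypothetical stand-ins for the register type enum. */
enum toy_reg_type {
   TOY_HF, TOY_F, TOY_DF,            /* float class */
   TOY_B, TOY_W, TOY_D, TOY_Q,       /* signed integer class */
   TOY_UB, TOY_UW, TOY_UD, TOY_UQ,   /* unsigned integer class */
   TOY_INVALID
};

/* Pick the type with `bit_size` bits from the same class as `ref`. */
static enum toy_reg_type
toy_type_from_bit_size(unsigned bit_size, enum toy_reg_type ref)
{
   switch (ref) {
   case TOY_HF: case TOY_F: case TOY_DF:
      switch (bit_size) {
      case 16: return TOY_HF;
      case 32: return TOY_F;
      case 64: return TOY_DF;
      default: return TOY_INVALID;   /* no 8-bit float in this sketch */
      }
   case TOY_B: case TOY_W: case TOY_D: case TOY_Q:
      switch (bit_size) {
      case 8:  return TOY_B;
      case 16: return TOY_W;
      case 32: return TOY_D;
      case 64: return TOY_Q;
      default: return TOY_INVALID;
      }
   case TOY_UB: case TOY_UW: case TOY_UD: case TOY_UQ:
      switch (bit_size) {
      case 8:  return TOY_UB;
      case 16: return TOY_UW;
      case 32: return TOY_UD;
      case 64: return TOY_UQ;
      default: return TOY_INVALID;
      }
   default:
      return TOY_INVALID;
   }
}

/* Same shape as the get_nir_dest() change: 8-bit destinations use an
 * integer reference type, everything else keeps the float reference.
 */
static enum toy_reg_type
toy_dest_type(unsigned bit_size)
{
   return toy_type_from_bit_size(bit_size, bit_size == 8 ? TOY_D : TOY_F);
}
```

In this sketch an 8-bit request with a float reference has no valid result, which is presumably why the patch switches the reference type to BRW_REGISTER_TYPE_D for 8-bit SSA destinations.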
[Mesa-dev] [PATCH 1/9] intel/compiler: grf127 can not be dest when src and dest overlap in send
Implement in brw_eu_validate the restriction from Intel Broadwell PRM, vol 07,
section "Instruction Set Reference", subsection "EUISA Instructions", Send
Message (page 990):

"r127 must not be used for return address when there is a src and dest
overlap in send instruction."

v2: Style fixes (Matt Turner)

Reviewed-by: Matt Turner
---
 src/intel/compiler/brw_eu_validate.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/intel/compiler/brw_eu_validate.c b/src/intel/compiler/brw_eu_validate.c
index d3189d1ef5e..29d1fe46f71 100644
--- a/src/intel/compiler/brw_eu_validate.c
+++ b/src/intel/compiler/brw_eu_validate.c
@@ -261,6 +261,17 @@ send_restrictions(const struct gen_device_info *devinfo,
                   brw_inst_src0_da_reg_nr(devinfo, inst) < 112,
                   "send with EOT must use g112-g127");
       }
+
+      if (devinfo->gen >= 8) {
+         ERROR_IF(!dst_is_null(devinfo, inst) &&
+                  (brw_inst_dst_da_reg_nr(devinfo, inst) +
+                   brw_inst_rlen(devinfo, inst) > 127) &&
+                  (brw_inst_src0_da_reg_nr(devinfo, inst) +
+                   brw_inst_mlen(devinfo, inst) >
+                   brw_inst_dst_da_reg_nr(devinfo, inst)),
+                  "r127 must not be used for return address when there is "
+                  "a src and dest overlap");
+      }
    }

    return error_msg;
--
2.17.1
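The condition added to send_restrictions() can be read as three separate tests. The sketch below restates it on a plain struct so it can be exercised without the brw_inst accessors. The struct send_regs type and the function name are hypothetical; the arithmetic mirrors the patch and is not an independent reading of the PRM.

```c
#include <stdbool.h>

/* Sketch only: hypothetical, simplified view of a send's register ranges. */
struct send_regs {
   bool dst_is_null;   /* true when the send has no return payload */
   unsigned dst_nr;    /* first GRF of the destination (return address) */
   unsigned rlen;      /* response length, in registers */
   unsigned src_nr;    /* first GRF of the message payload */
   unsigned mlen;      /* message length, in registers */
};

/* Returns true when the patch's ERROR_IF condition would fire. */
static bool
send_violates_r127_rule(const struct send_regs *s)
{
   /* The destination range extends up to r127. */
   const bool dst_reaches_r127 = s->dst_nr + s->rlen > 127;

   /* The end of the payload lies past the start of the destination. */
   const bool src_runs_into_dst = s->src_nr + s->mlen > s->dst_nr;

   return !s->dst_is_null && dst_reaches_r127 && src_runs_into_dst;
}
```

For example, a two-register payload at r125 with a two-register response at r126 trips the check, while the same response with the payload at r100 does not.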
[Mesa-dev] [PATCH 4/9] i965/fs: Enable conversions to 8-bit integers
Reviewed-by: Jason Ekstrand
---
 src/intel/compiler/brw_fs_nir.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index 83ed9575f80..4155b2ed996 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -834,6 +834,8 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
    case nir_op_u2u16:
    case nir_op_i2f16:
    case nir_op_u2f16:
+   case nir_op_i2i8:
+   case nir_op_u2u8:
       inst = bld.MOV(result, op[0]);
       inst->saturate = instr->dest.saturate;
       break;
--
2.17.1
[Mesa-dev] [PATCH 2/9] i965/fs: Register allocator shoudn't use grf127 for sends dest
Since Gen8, the Intel PRM states that "r127 must not be used for return
address when there is a src and dest overlap in send instruction."

This patch implements this restriction by creating a new
grf127_send_hack_node in the register allocator. This node has a fixed
assignment to grf127.

For vgrfs that are used as the destination of send messages we create node
interferences with the grf127_send_hack_node, so the register allocator
will never assign to these vgrfs a register that involves grf127.

If dispatch_width > 8 we don't create these interferences because all
instructions have node interferences between sources and destination.
That is enough to avoid the r127 restriction.

This fixes CTS tests that raised this issue as they were executed as SIMD8:

dEQP-VK.spirv_assembly.instruction.graphics.8bit_storage.8struct_to_32struct.storage_buffer_*int_geom

Shader-db results on Skylake:
   total instructions in shared programs: 7686798 -> 7686797 (<.01%)
   instructions in affected programs: 301 -> 300 (-0.33%)
   helped: 1
   HURT: 0

   total cycles in shared programs: 337092322 -> 337091919 (<.01%)
   cycles in affected programs: 22420415 -> 22420012 (<.01%)
   helped: 712
   HURT: 588

Shader-db results on Broadwell:

   total instructions in shared programs: 7658574 -> 7658625 (<.01%)
   instructions in affected programs: 19610 -> 19661 (0.26%)
   helped: 3
   HURT: 4

   total cycles in shared programs: 340694553 -> 340676378 (<.01%)
   cycles in affected programs: 24724915 -> 24706740 (-0.07%)
   helped: 998
   HURT: 916

   total spills in shared programs: 4300 -> 4311 (0.26%)
   spills in affected programs: 333 -> 344 (3.30%)
   helped: 1
   HURT: 3

   total fills in shared programs: 5370 -> 5378 (0.15%)
   fills in affected programs: 274 -> 282 (2.92%)
   helped: 1
   HURT: 3

v2: Avoid duplicating register classes without grf127. Let's use a node
    with a fixed assignment to grf127 and create interferences to send
    message vgrf destinations. (Eric Anholt)
v3: Update reference to CTS VK_KHR_8bit_storage failing tests.
    (Jose Maria Casanova)
---
 src/intel/compiler/brw_fs_reg_allocate.cpp | 25 ++
 1 file changed, 25 insertions(+)

diff --git a/src/intel/compiler/brw_fs_reg_allocate.cpp b/src/intel/compiler/brw_fs_reg_allocate.cpp
index ec8e116cb38..59e047483c0 100644
--- a/src/intel/compiler/brw_fs_reg_allocate.cpp
+++ b/src/intel/compiler/brw_fs_reg_allocate.cpp
@@ -548,6 +548,9 @@ fs_visitor::assign_regs(bool allow_spilling, bool spill_all)
    int first_mrf_hack_node = node_count;
    if (devinfo->gen >= 7)
       node_count += BRW_MAX_GRF - GEN7_MRF_HACK_START;
+   int grf127_send_hack_node = node_count;
+   if (devinfo->gen >= 8 && dispatch_width == 8)
+      node_count ++;
    struct ra_graph *g =
       ra_alloc_interference_graph(compiler->fs_reg_sets[rsi].regs, node_count);

@@ -653,6 +656,28 @@ fs_visitor::assign_regs(bool allow_spilling, bool spill_all)
       }
    }

+   if (devinfo->gen >= 8 && dispatch_width == 8) {
+      /* At Intel Broadwell PRM, vol 07, section "Instruction Set Reference",
+       * subsection "EUISA Instructions", Send Message (page 990):
+       *
+       * "r127 must not be used for return address when there is a src and
+       * dest overlap in send instruction."
+       *
+       * We are avoiding using grf127 as part of the destination of send
+       * messages adding a node interference to the grf127_send_hack_node.
+       * This node has a fixed asignment to grf127.
+       *
+       * We don't apply it to SIMD16 because previous code avoids any register
+       * overlap between sources and destination.
+       */
+      ra_set_node_reg(g, grf127_send_hack_node, 127);
+      foreach_block_and_inst(block, fs_inst, inst, cfg) {
+         if (inst->is_send_from_grf() && inst->dst.file == VGRF) {
+            ra_add_node_interference(g, inst->dst.nr, grf127_send_hack_node);
+         }
+      }
+   }
+
    /* Debug of register spilling: Go spill everything. */
    if (unlikely(spill_all)) {
       int reg = choose_spill_reg(g);
--
2.17.1
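To illustrate the fixed-node idea outside of Mesa, here is a minimal sketch of an interference graph with one pre-assigned node. Everything in it is an assumption made for illustration: the toy_* names, the single-register nodes and the top-down greedy assignment are not Mesa's register allocator or its ra_* API. It only shows why a node that interferes with a node pinned to register 127 can never be given register 127 itself.

```c
#include <stdbool.h>

#define TOY_NUM_REGS  128
#define TOY_MAX_NODES 8
#define TOY_NO_REG    (-1)

/* Sketch only: a tiny interference graph with one register per node. */
struct toy_graph {
   int count;                                /* number of nodes in use */
   int reg[TOY_MAX_NODES];                   /* assigned register or TOY_NO_REG */
   bool adj[TOY_MAX_NODES][TOY_MAX_NODES];   /* symmetric interference matrix */
};

static void
toy_init(struct toy_graph *g, int count)
{
   g->count = count;
   for (int a = 0; a < TOY_MAX_NODES; a++) {
      g->reg[a] = TOY_NO_REG;
      for (int b = 0; b < TOY_MAX_NODES; b++)
         g->adj[a][b] = false;
   }
}

/* Pin a node to a register before allocation, like the hack node. */
static void
toy_set_node_reg(struct toy_graph *g, int node, int reg)
{
   g->reg[node] = reg;
}

static void
toy_add_interference(struct toy_graph *g, int a, int b)
{
   g->adj[a][b] = true;
   g->adj[b][a] = true;
}

/* A register is usable if no interfering node already holds it. */
static bool
toy_reg_is_free(const struct toy_graph *g, int node, int reg)
{
   for (int other = 0; other < g->count; other++) {
      if (g->adj[node][other] && g->reg[other] == reg)
         return false;
   }
   return true;
}

/* Greedy assignment, highest register first, skipping pinned nodes. */
static bool
toy_allocate(struct toy_graph *g)
{
   for (int node = 0; node < g->count; node++) {
      if (g->reg[node] != TOY_NO_REG)
         continue;
      for (int reg = TOY_NUM_REGS - 1; reg >= 0; reg--) {
         if (toy_reg_is_free(g, node, reg)) {
            g->reg[node] = reg;
            break;
         }
      }
      if (g->reg[node] == TOY_NO_REG)
         return false;
   }
   return true;
}
```

With node 2 pinned to register 127 and an interference added between node 0 (standing in for a send destination) and node 2, node 0 ends up in register 126, while an unrelated node 1 is still free to use 127.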
[Mesa-dev] [PATCH 0/9] anv: Enable VK_KHR_8bit_storage
This series enables support for the VK_KHR_8bit_storage Vulkan extension
in anv. It enables all capabilities available for this extension, including
StorageBuffer8BitAccess, UniformAndStorageBuffer8BitAccess and
StoragePushConstant8.

8-bit read operations from UBO, SSBO and push constants are already
supported by the backend, so this series only implements the pending write
support for SSBO and the conversions to/from 8-bit integers.

This series is applied on top of the VK_KHR_create_renderpass2 series
already sent by Jason, which updates the Vulkan XML and headers to 1.1.80.

Only patch 2 (fix for the register allocator to avoid grf127 overlaps) and
patch 7 (SPIR-V headers update) have pending review.

This series is organized as follows:

* Patches 1-2 were already submitted, but patch 2 has pending review. They
  implement the restriction "r127 must not be used for return address when
  there is a src and dest overlap in send instruction." We need to fix this
  to avoid failing 2 CTS tests of this new extension.

* Patch 3 enables 8-bit support in some helpers.

* Patch 4 enables conversions to 8-bit integers.

* Patches 5-6 implement 8-bit write operations for SSBO. They also relax
  the requirements of brw_eu_validate to allow raw movs of bytes even
  though the exec size and the dest size differ.

* Patches 7-9 enable the Vulkan and SPIR-V 8bit_storage extensions. The
  SPIR-V headers are updated.

With this series we pass all CTS VK_KHR_8bit_storage tests:

dEQP-VK.spirv_assembly.instruction.*.8bit_storage.*

Jose Maria Casanova Crespo (9):
  intel/compiler: grf127 can not be dest when src and dest overlap in send
  i965/fs: Register allocator shoudn't use grf127 for sends dest
  i965: Support for 8-bit base types in helper functions
  i965/fs: Enable conversions to 8-bit integers
  intel/compiler: relax brw_eu_validate for byte raw movs
  i965/fs: Enable store_ssbo for 8-bit types.
  spirv: Include headers and grammar for SPV_KHR_8bit_storage
  spirv/nir: Add support for SPV_KHR_8bit_storage
  anv: Enable SPV_KHR_8bit_storage and VK_KHR_8bit_storage

 src/compiler/shader_info.h                 |  1 +
 src/compiler/spirv/spirv.core.grammar.json | 44 ++
 src/compiler/spirv/spirv.h                 |  3 ++
 src/compiler/spirv/spirv_to_nir.c          |  5 +++
 src/intel/compiler/brw_eu_validate.c       | 19 --
 src/intel/compiler/brw_fs_nir.cpp          | 28 ++
 src/intel/compiler/brw_fs_reg_allocate.cpp | 25
 src/intel/compiler/brw_nir.c               |  4 ++
 src/intel/vulkan/anv_device.c              | 11 ++
 src/intel/vulkan/anv_extensions.py         |  1 +
 src/intel/vulkan/anv_pipeline.c            |  1 +
 11 files changed, 124 insertions(+), 18 deletions(-)

Cc: Jason Ekstrand
Cc: Iago Toral
--
2.17.1
[Mesa-dev] [PATCH 5/9] intel/compiler: relax brw_eu_validate for byte raw movs
When the destination type is byte, allow raw movs even if the destination stride is not the exact ratio of the execution type size to the destination type size. For these movs the execution type is Word, with size 2, so the existing restriction only allowed stride == 2 destinations for 8-bit types.

Reviewed-by: Jason Ekstrand
---
 src/intel/compiler/brw_eu_validate.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/intel/compiler/brw_eu_validate.c b/src/intel/compiler/brw_eu_validate.c
index 29d1fe46f71..a25010b225c 100644
--- a/src/intel/compiler/brw_eu_validate.c
+++ b/src/intel/compiler/brw_eu_validate.c
@@ -472,9 +472,11 @@ general_restrictions_based_on_operand_types(const struct gen_device_info *devinf
          dst_type_size = 8;
 
       if (exec_type_size > dst_type_size) {
-         ERROR_IF(dst_stride * dst_type_size != exec_type_size,
-                  "Destination stride must be equal to the ratio of the sizes of "
-                  "the execution data type to the destination type");
+         if (!(dst_type_is_byte && inst_is_raw_move(devinfo, inst))) {
+            ERROR_IF(dst_stride * dst_type_size != exec_type_size,
+                     "Destination stride must be equal to the ratio of the sizes "
+                     "of the execution data type to the destination type");
+         }
 
          unsigned subreg = brw_inst_dst_da1_subreg_nr(devinfo, inst);
 
-- 
2.17.1
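To make the relaxed rule concrete, the following is a small standalone sketch of just the stride check, written as a boolean helper. It is not the validator code: `toy_dst_stride_ok` is a made-up name, the "byte" test is approximated here as a destination type size of 1, and the real function goes on to check further restrictions that are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper that mirrors only the stride rule touched by the
 * patch. Sizes are in bytes, dst_stride is in elements. */
static bool toy_dst_stride_ok(unsigned dst_stride,
                              unsigned dst_type_size,
                              unsigned exec_type_size,
                              bool is_raw_mov)
{
   /* The rule only applies when the execution type is wider than the
    * destination type. */
   if (exec_type_size <= dst_type_size)
      return true;

   /* Assumption for this sketch: "byte destination" means size 1. */
   bool dst_type_is_byte = (dst_type_size == 1);

   /* New exemption: raw movs to a byte destination skip the check. */
   if (dst_type_is_byte && is_raw_mov)
      return true;

   /* Old rule: stride must equal exec size / dst size. */
   return dst_stride * dst_type_size == exec_type_size;
}
```

For a byte destination with a Word execution type, `toy_dst_stride_ok(1, 1, 2, false)` is rejected as before, while the same operands with `is_raw_mov == true` are accepted; wider destinations are unaffected by the exemption.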
[Mesa-dev] [Bug 107156] earth tessellation bug
https://bugs.freedesktop.org/show_bug.cgi?id=107156

Timothy Arceri changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #3 from Timothy Arceri ---
Are you able to do a git bisect to find the commit that caused the issue?
[Mesa-dev] [Bug 107158] When compiling against musl libc, pthread.h is not included in radv_amdgpu_winsys
https://bugs.freedesktop.org/show_bug.cgi?id=107158

Bas Nieuwenhuizen changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #1 from Bas Nieuwenhuizen ---
Should be fixed by https://gitlab.freedesktop.org/mesa/mesa/commit/1a1f2b134c4bdb502659724e232a9e009287fe58 which is already upstream?
[Mesa-dev] [Bug 107158] When compiling against musl libc, pthread.h is not included in radv_amdgpu_winsys
https://bugs.freedesktop.org/show_bug.cgi?id=107158

            Bug ID: 107158
           Summary: When compiling against musl libc, pthread.h is not included in radv_amdgpu_winsys
           Product: Mesa
           Version: 18.0
          Hardware: All
                OS: Linux (All)
            Status: NEW
          Severity: minor
          Priority: medium
         Component: Drivers/Vulkan/radeon
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: roncha...@gmail.com
        QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 140514
  --> https://bugs.freedesktop.org/attachment.cgi?id=140514&action=edit
Patch to include pthread.h

amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h does not #include <pthread.h>, which causes compilation to fail when compiling against musl libc. Adding the #include line fixes the issue. Patch attached.
Re: [Mesa-dev] [PATCH] radv: using tls to store llvm related info and speed up compiles (v8)
On Thu, Jul 5, 2018 at 2:03 AM, Dave Airlie wrote: > From: Dave Airlie > > This uses the common compiler passes abstraction to help radv > avoid fixed cost compiler overheads. This uses a linked list per > thread stored in thread local storage, with an entry in the list > for each target machine. > > This should remove all the fixed overheads setup costs of creating > the pass manager each time. > > This takes a demo app time to compile the radv meta shaders on nocache > and exit from 1.7s to 1s. It also has been reported to take the startup > time of uncached shaders on RoTR from 12m24s to 11m35s (Alex) > > v2: fix llvm6 build, inline emit function, handle multiple targets > in one thread > v3: rebase and port onto new structure > v4: rename some vars (Bas) > v5: drag all code into radv for now, we can refactor it out later > for radeonsi if we make it shareable > v6: use a bit more C++ in the wrapper > v7: logic bugs fixed so it actually runs again. > v8: rebase on top of radeonsi changes. 
> --- > src/amd/vulkan/Makefile.sources | 2 + > src/amd/vulkan/meson.build | 2 + > src/amd/vulkan/radv_debug.h | 1 + > src/amd/vulkan/radv_device.c| 1 + > src/amd/vulkan/radv_llvm_helper.cpp | 148 > src/amd/vulkan/radv_nir_to_llvm.c | 27 + > src/amd/vulkan/radv_shader.c| 10 +- > src/amd/vulkan/radv_shader_helper.h | 44 + > 8 files changed, 207 insertions(+), 28 deletions(-) > create mode 100644 src/amd/vulkan/radv_llvm_helper.cpp > create mode 100644 src/amd/vulkan/radv_shader_helper.h > > diff --git a/src/amd/vulkan/Makefile.sources b/src/amd/vulkan/Makefile.sources > index 70d56e88cb3..152fdd7cb71 100644 > --- a/src/amd/vulkan/Makefile.sources > +++ b/src/amd/vulkan/Makefile.sources > @@ -54,6 +54,7 @@ VULKAN_FILES := \ > radv_meta_resolve_cs.c \ > radv_meta_resolve_fs.c \ > radv_nir_to_llvm.c \ > + radv_llvm_helper.cpp \ > radv_pass.c \ > radv_pipeline.c \ > radv_pipeline_cache.c \ > @@ -62,6 +63,7 @@ VULKAN_FILES := \ > radv_shader.c \ > radv_shader_info.c \ > radv_shader.h \ > + radv_shader_helper.h \ > radv_query.c \ > radv_util.c \ > radv_util.h \ > diff --git a/src/amd/vulkan/meson.build b/src/amd/vulkan/meson.build > index 22857926fa1..9f2842182e7 100644 > --- a/src/amd/vulkan/meson.build > +++ b/src/amd/vulkan/meson.build > @@ -67,6 +67,7 @@ libradv_files = files( >'radv_descriptor_set.h', >'radv_formats.c', >'radv_image.c', > + 'radv_llvm_helper.cpp', >'radv_meta.c', >'radv_meta.h', >'radv_meta_blit.c', > @@ -88,6 +89,7 @@ libradv_files = files( >'radv_radeon_winsys.h', >'radv_shader.c', >'radv_shader.h', > + 'radv_shader_helper.h', >'radv_shader_info.c', >'radv_query.c', >'radv_util.c', > diff --git a/src/amd/vulkan/radv_debug.h b/src/amd/vulkan/radv_debug.h > index f1b0dc26a63..9fe4c3b7404 100644 > --- a/src/amd/vulkan/radv_debug.h > +++ b/src/amd/vulkan/radv_debug.h > @@ -49,6 +49,7 @@ enum { > RADV_DEBUG_ERRORS= 0x8, > RADV_DEBUG_STARTUP = 0x10, > RADV_DEBUG_CHECKIR = 0x20, > + RADV_DEBUG_NOTHREADLLVM = 0x40, > }; > > enum { > diff --git 
a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c > index ad3465f594e..73c48cef1f0 100644 > --- a/src/amd/vulkan/radv_device.c > +++ b/src/amd/vulkan/radv_device.c > @@ -436,6 +436,7 @@ static const struct debug_control radv_debug_options[] = { > {"errors", RADV_DEBUG_ERRORS}, > {"startup", RADV_DEBUG_STARTUP}, > {"checkir", RADV_DEBUG_CHECKIR}, > + {"nothreadllvm", RADV_DEBUG_NOTHREADLLVM}, > {NULL, 0} > }; > > diff --git a/src/amd/vulkan/radv_llvm_helper.cpp > b/src/amd/vulkan/radv_llvm_helper.cpp > new file mode 100644 > index 000..dad881f6b1a > --- /dev/null > +++ b/src/amd/vulkan/radv_llvm_helper.cpp > @@ -0,0 +1,148 @@ > +/* > + * Copyright © 2018 Red Hat. > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the next > + * paragraph) shall be included in all copies or substantial portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR
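The quoted patch is cut off above, so as a rough illustration of the caching scheme the commit message describes (a per-thread linked list kept in thread-local storage, with one entry per target), here is a toy sketch in plain C11. It is not the radv code: the `toy_*` names, the integer `target` key and the `init_count` field are assumptions invented for the example, and the expensive LLVM setup is reduced to a counter.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the expensive per-target compiler state
 * (in the real patch: target machine plus pass manager). */
struct toy_compiler {
   struct toy_compiler *next;
   int target;          /* stand-in key for the target machine */
   unsigned init_count; /* how often the "expensive" setup ran */
};

/* One list head per thread, using C11 thread-local storage. */
static _Thread_local struct toy_compiler *toy_tls_list = NULL;

/* Return the calling thread's cached entry for this target, creating it
 * on first use. Returns NULL only if allocation fails. */
static struct toy_compiler *toy_get_compiler(int target)
{
   for (struct toy_compiler *c = toy_tls_list; c != NULL; c = c->next) {
      if (c->target == target)
         return c; /* cache hit: no setup cost */
   }

   struct toy_compiler *c = calloc(1, sizeof(*c));
   if (c == NULL)
      return NULL;

   c->target = target;
   c->init_count = 1; /* the expensive setup would happen here, once */
   c->next = toy_tls_list;
   toy_tls_list = c;
   return c;
}

/* Free every entry owned by the calling thread. */
static void toy_destroy_thread_compilers(void)
{
   while (toy_tls_list != NULL) {
      struct toy_compiler *next = toy_tls_list->next;
      free(toy_tls_list);
      toy_tls_list = next;
   }
}
```

Because the list head is thread-local, lookups need no locking; the price is that each compile thread pays the setup cost once per target instead of sharing it.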
Re: [Mesa-dev] [PATCH 2/2] radv: add support for VK_KHR_create_renderpass2
Reviewed-by: Bas Nieuwenhuizen for the series. On Sun, Jul 8, 2018 at 5:47 PM, Samuel Pitoiset wrote: > VkCreateRenderPass2KHR() is quite similar to VkCreateRenderPass() > but refactoring the code is a bit painful. > > Signed-off-by: Samuel Pitoiset > --- > src/amd/vulkan/radv_cmd_buffer.c | 24 + > src/amd/vulkan/radv_extensions.py | 1 + > src/amd/vulkan/radv_pass.c| 169 ++ > 3 files changed, 194 insertions(+) > > diff --git a/src/amd/vulkan/radv_cmd_buffer.c > b/src/amd/vulkan/radv_cmd_buffer.c > index 0f50f6..5807718b6e 100644 > --- a/src/amd/vulkan/radv_cmd_buffer.c > +++ b/src/amd/vulkan/radv_cmd_buffer.c > @@ -3060,6 +3060,15 @@ void radv_CmdBeginRenderPass( > radv_cmd_buffer_clear_subpass(cmd_buffer); > } > > +void radv_CmdBeginRenderPass2KHR( > +VkCommandBuffer commandBuffer, > +const VkRenderPassBeginInfo*pRenderPassBeginInfo, > +const VkSubpassBeginInfoKHR*pSubpassBeginInfo) > +{ > + radv_CmdBeginRenderPass(commandBuffer, pRenderPassBeginInfo, > + pSubpassBeginInfo->contents); > +} > + > void radv_CmdNextSubpass( > VkCommandBuffer commandBuffer, > VkSubpassContents contents) > @@ -3075,6 +3084,14 @@ void radv_CmdNextSubpass( > radv_cmd_buffer_clear_subpass(cmd_buffer); > } > > +void radv_CmdNextSubpass2KHR( > +VkCommandBuffer commandBuffer, > +const VkSubpassBeginInfoKHR*pSubpassBeginInfo, > +const VkSubpassEndInfoKHR* pSubpassEndInfo) > +{ > + radv_CmdNextSubpass(commandBuffer, pSubpassBeginInfo->contents); > +} > + > static void radv_emit_view_index(struct radv_cmd_buffer *cmd_buffer, > unsigned index) > { > struct radv_pipeline *pipeline = cmd_buffer->state.pipeline; > @@ -3955,6 +3972,13 @@ void radv_CmdEndRenderPass( > cmd_buffer->state.framebuffer = NULL; > } > > +void radv_CmdEndRenderPass2KHR( > +VkCommandBuffer commandBuffer, > +const VkSubpassEndInfoKHR* pSubpassEndInfo) > +{ > + radv_CmdEndRenderPass(commandBuffer); > +} > + > /* > * For HTILE we have the following interesting clear words: > * 0xf30f: Uncompressed, full depth range, for 
depth+stencil HTILE > diff --git a/src/amd/vulkan/radv_extensions.py > b/src/amd/vulkan/radv_extensions.py > index a0f1038110..13b26c9f0b 100644 > --- a/src/amd/vulkan/radv_extensions.py > +++ b/src/amd/vulkan/radv_extensions.py > @@ -52,6 +52,7 @@ class Extension: > EXTENSIONS = [ > Extension('VK_ANDROID_native_buffer', 5, 'ANDROID && > device->rad_info.has_syncobj_wait_for_submit'), > Extension('VK_KHR_bind_memory2', 1, True), > +Extension('VK_KHR_create_renderpass2',1, True), > Extension('VK_KHR_dedicated_allocation', 1, True), > Extension('VK_KHR_descriptor_update_template',1, True), > Extension('VK_KHR_device_group', 1, True), > diff --git a/src/amd/vulkan/radv_pass.c b/src/amd/vulkan/radv_pass.c > index 0e0f767751..2191093391 100644 > --- a/src/amd/vulkan/radv_pass.c > +++ b/src/amd/vulkan/radv_pass.c > @@ -197,6 +197,175 @@ VkResult radv_CreateRenderPass( > return VK_SUCCESS; > } > > +VkResult radv_CreateRenderPass2KHR( > +VkDevice _device, > +const VkRenderPassCreateInfo2KHR* pCreateInfo, > +const VkAllocationCallbacks*pAllocator, > +VkRenderPass* pRenderPass) > +{ > + RADV_FROM_HANDLE(radv_device, device, _device); > + struct radv_render_pass *pass; > + size_t size; > + size_t attachments_offset; > + VkRenderPassMultiviewCreateInfoKHR *multiview_info = NULL; > + > + assert(pCreateInfo->sType == > VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO_2_KHR); > + > + size = sizeof(*pass); > + size += pCreateInfo->subpassCount * sizeof(pass->subpasses[0]); > + attachments_offset = size; > + size += pCreateInfo->attachmentCount * sizeof(pass->attachments[0]); > + > + pass = vk_alloc2(&device->alloc, pAllocator, size, 8, > + VK_SYSTEM_ALLOCATION_SCOPE_OBJECT); > + if (pass == NULL) > + return vk_error(device->instance, > VK_ERROR_OUT_OF_HOST_MEMORY); > + > + memset(pass, 0, size); > + pass->attachment_count = pCreateInfo->attachmentCount; > + pass->subpass_count = pCreateInfo->subpassCount; > + pass->attachments = (void *) pass + attachments_offset; > + > + vk_foreach_struct(ext, 
pCreateInfo->pNext) { > + switch(ext->sType) { > + case VK_STRUCTURE_TYPE_RENDER_PASS_MULTIVIEW_CREATE_INFO_KHR: > + multiview_info = ( >
Re: [Mesa-dev] [PATCH 1/2] radv: introduce radv_subpass_attachment data structure
On Sun, Jul 8, 2018 at 5:47 PM, Samuel Pitoiset wrote: > Needed for VK_KHR_create_renderpass2. > > Signed-off-by: Samuel Pitoiset > --- > src/amd/vulkan/radv_cmd_buffer.c | 4 ++-- > src/amd/vulkan/radv_meta_clear.c | 4 ++-- > src/amd/vulkan/radv_meta_resolve.c| 14 +++--- > src/amd/vulkan/radv_meta_resolve_cs.c | 4 ++-- > src/amd/vulkan/radv_meta_resolve_fs.c | 6 +++--- > src/amd/vulkan/radv_pass.c| 28 +-- > src/amd/vulkan/radv_private.h | 15 +- > 7 files changed, 44 insertions(+), 31 deletions(-) > > diff --git a/src/amd/vulkan/radv_cmd_buffer.c > b/src/amd/vulkan/radv_cmd_buffer.c > index 1ea023a811..0f50f6 100644 > --- a/src/amd/vulkan/radv_cmd_buffer.c > +++ b/src/amd/vulkan/radv_cmd_buffer.c > @@ -2054,7 +2054,7 @@ static void radv_subpass_barrier(struct radv_cmd_buffer > *cmd_buffer, const struc > } > > static void radv_handle_subpass_image_transition(struct radv_cmd_buffer > *cmd_buffer, > -VkAttachmentReference att) > +struct > radv_subpass_attachment att) > { > unsigned idx = att.attachment; > struct radv_image_view *view = > cmd_buffer->state.framebuffer->attachments[idx].attachment; > @@ -3944,7 +3944,7 @@ void radv_CmdEndRenderPass( > for (unsigned i = 0; i < > cmd_buffer->state.framebuffer->attachment_count; ++i) { > VkImageLayout layout = > cmd_buffer->state.pass->attachments[i].final_layout; > radv_handle_subpass_image_transition(cmd_buffer, > - (VkAttachmentReference){i, layout}); > + (struct radv_subpass_attachment){i, > layout}); > } > > vk_free(&cmd_buffer->pool->alloc, cmd_buffer->state.attachments); > diff --git a/src/amd/vulkan/radv_meta_clear.c > b/src/amd/vulkan/radv_meta_clear.c > index 2c0bb37387..d7c9849734 100644 > --- a/src/amd/vulkan/radv_meta_clear.c > +++ b/src/amd/vulkan/radv_meta_clear.c > @@ -366,10 +366,10 @@ emit_color_clear(struct radv_cmd_buffer *cmd_buffer, > > struct radv_subpass clear_subpass = { > .color_count = 1, > - .color_attachments = (VkAttachmentReference[]) { > + .color_attachments = (struct radv_subpass_attachment[]) { 
> subpass->color_attachments[clear_att->colorAttachment] > }, > - .depth_stencil_attachment = (VkAttachmentReference) { > VK_ATTACHMENT_UNUSED, VK_IMAGE_LAYOUT_UNDEFINED } > + .depth_stencil_attachment = (struct radv_subpass_attachment) > { VK_ATTACHMENT_UNUSED, VK_IMAGE_LAYOUT_UNDEFINED } > }; > > radv_cmd_buffer_set_subpass(cmd_buffer, &clear_subpass, false); > diff --git a/src/amd/vulkan/radv_meta_resolve.c > b/src/amd/vulkan/radv_meta_resolve.c > index d4d3552f31..b049237ba6 100644 > --- a/src/amd/vulkan/radv_meta_resolve.c > +++ b/src/amd/vulkan/radv_meta_resolve.c > @@ -613,8 +613,8 @@ radv_cmd_buffer_resolve_subpass(struct radv_cmd_buffer > *cmd_buffer) > return; > > for (uint32_t i = 0; i < subpass->color_count; ++i) { > - VkAttachmentReference src_att = subpass->color_attachments[i]; > - VkAttachmentReference dest_att = > subpass->resolve_attachments[i]; > + struct radv_subpass_attachment src_att = > subpass->color_attachments[i]; > + struct radv_subpass_attachment dest_att = > subpass->resolve_attachments[i]; > > if (src_att.attachment == VK_ATTACHMENT_UNUSED || > dest_att.attachment == VK_ATTACHMENT_UNUSED) > @@ -641,8 +641,8 @@ radv_cmd_buffer_resolve_subpass(struct radv_cmd_buffer > *cmd_buffer) >RADV_META_SAVE_GRAPHICS_PIPELINE); > > for (uint32_t i = 0; i < subpass->color_count; ++i) { > - VkAttachmentReference src_att = subpass->color_attachments[i]; > - VkAttachmentReference dest_att = > subpass->resolve_attachments[i]; > + struct radv_subpass_attachment src_att = > subpass->color_attachments[i]; > + struct radv_subpass_attachment dest_att = > subpass->resolve_attachments[i]; > > if (src_att.attachment == VK_ATTACHMENT_UNUSED || > dest_att.attachment == VK_ATTACHMENT_UNUSED) > @@ -657,7 +657,7 @@ radv_cmd_buffer_resolve_subpass(struct radv_cmd_buffer > *cmd_buffer) > > struct radv_subpass resolve_subpass = { > .color_count = 2, > - .color_attachments = (VkAttachmentReference[]) { > src_att, dest_att }, > + .color_attachments = (struct > 
radv_subpass_attachment[]) { src_att, dest_att }, > .depth_stencil_attachment = { .attachment = > VK_ATTACHMENT_UNUSED }, > }; > > @@ -684,8 +684,8 @@
[Mesa-dev] [Bug 106644] [llvmpipe] Mesa 18.1.2 fails lp_test_format, lp_test_arit, lp_test_blend, lp_test_printf, lp_test_conv tests
https://bugs.freedesktop.org/show_bug.cgi?id=106644

erhar...@mailbox.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #140477|0                           |1
        is obsolete|                            |

--- Comment #29 from erhar...@mailbox.org ---
Created attachment 140508
  --> https://bugs.freedesktop.org/attachment.cgi?id=140508&action=edit
output from lp_test_* (ppc)
[Mesa-dev] [Bug 106644] [llvmpipe] Mesa 18.1.2 fails lp_test_format, lp_test_arit, lp_test_blend, lp_test_printf, lp_test_conv tests
https://bugs.freedesktop.org/show_bug.cgi?id=106644

--- Comment #28 from erhar...@mailbox.org ---
(In reply to Ben Crocker from comment #26)

I added the following exports

% export GALLIVM_MATTRS="-vsx"
% export GALLIVM_VSX=0

as you suggested, but the tests still segfault.

Config was:
./configure --enable-llvm --with-llvm-prefix=/usr/lib/llvm/5/ --disable-gles1 --with-gallium-drivers=r300,r600,swrast --enable-debug
[Mesa-dev] [PATCH] nv50/ir: fix use of getUniqueInsn() in loadProjTexCoords
Fixes "value not uniquely defined" messages during shader-db runs.

Fixes: 57594065c30feec9376b "nv50/ir: import new shader backend code"
Signed-off-by: Rhys Perry
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index 2f9bcc1f34..2fe49909d5 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -2337,10 +2337,10 @@ void
 Converter::loadProjTexCoords(Value *dst[4], Value *src[4], unsigned int mask)
 {
    Value *proj = fetchSrc(0, 3);
-   Instruction *insn = proj->getUniqueInsn();
+   Instruction *insn = proj->defs.size() > 1 ? NULL : proj->getUniqueInsn();
    int c;
 
-   if (insn->op == OP_PINTERP) {
+   if (insn && insn->op == OP_PINTERP) {
       bb->insertTail(insn = cloneForward(func, insn));
       insn->op = OP_LINTERP;
       insn->setInterpolate(NV50_IR_INTERP_LINEAR | insn->getSampleMode());
@@ -2352,6 +2352,8 @@ Converter::loadProjTexCoords(Value *dst[4], Value *src[4], unsigned int mask)
    for (c = 0; c < 4; ++c) {
       if (!(mask & (1 << c)))
          continue;
+      if (src[c]->defs.size() > 1)
+         continue;
       if ((insn = src[c]->getUniqueInsn())->op != OP_PINTERP)
          continue;
       mask &= ~(1 << c);
-- 
2.14.4
[Mesa-dev] [PATCH 2/2] radv: add support for VK_KHR_create_renderpass2
VkCreateRenderPass2KHR() is quite similar to VkCreateRenderPass() but refactoring the code is a bit painful. Signed-off-by: Samuel Pitoiset --- src/amd/vulkan/radv_cmd_buffer.c | 24 + src/amd/vulkan/radv_extensions.py | 1 + src/amd/vulkan/radv_pass.c| 169 ++ 3 files changed, 194 insertions(+) diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c index 0f50f6..5807718b6e 100644 --- a/src/amd/vulkan/radv_cmd_buffer.c +++ b/src/amd/vulkan/radv_cmd_buffer.c @@ -3060,6 +3060,15 @@ void radv_CmdBeginRenderPass( radv_cmd_buffer_clear_subpass(cmd_buffer); } +void radv_CmdBeginRenderPass2KHR( +VkCommandBuffer commandBuffer, +const VkRenderPassBeginInfo*pRenderPassBeginInfo, +const VkSubpassBeginInfoKHR*pSubpassBeginInfo) +{ + radv_CmdBeginRenderPass(commandBuffer, pRenderPassBeginInfo, + pSubpassBeginInfo->contents); +} + void radv_CmdNextSubpass( VkCommandBuffer commandBuffer, VkSubpassContents contents) @@ -3075,6 +3084,14 @@ void radv_CmdNextSubpass( radv_cmd_buffer_clear_subpass(cmd_buffer); } +void radv_CmdNextSubpass2KHR( +VkCommandBuffer commandBuffer, +const VkSubpassBeginInfoKHR*pSubpassBeginInfo, +const VkSubpassEndInfoKHR* pSubpassEndInfo) +{ + radv_CmdNextSubpass(commandBuffer, pSubpassBeginInfo->contents); +} + static void radv_emit_view_index(struct radv_cmd_buffer *cmd_buffer, unsigned index) { struct radv_pipeline *pipeline = cmd_buffer->state.pipeline; @@ -3955,6 +3972,13 @@ void radv_CmdEndRenderPass( cmd_buffer->state.framebuffer = NULL; } +void radv_CmdEndRenderPass2KHR( +VkCommandBuffer commandBuffer, +const VkSubpassEndInfoKHR* pSubpassEndInfo) +{ + radv_CmdEndRenderPass(commandBuffer); +} + /* * For HTILE we have the following interesting clear words: * 0xf30f: Uncompressed, full depth range, for depth+stencil HTILE diff --git a/src/amd/vulkan/radv_extensions.py b/src/amd/vulkan/radv_extensions.py index a0f1038110..13b26c9f0b 100644 --- a/src/amd/vulkan/radv_extensions.py +++ b/src/amd/vulkan/radv_extensions.py @@ -52,6 
+52,7 @@ class Extension: EXTENSIONS = [ Extension('VK_ANDROID_native_buffer', 5, 'ANDROID && device->rad_info.has_syncobj_wait_for_submit'), Extension('VK_KHR_bind_memory2', 1, True), +Extension('VK_KHR_create_renderpass2',1, True), Extension('VK_KHR_dedicated_allocation', 1, True), Extension('VK_KHR_descriptor_update_template',1, True), Extension('VK_KHR_device_group', 1, True), diff --git a/src/amd/vulkan/radv_pass.c b/src/amd/vulkan/radv_pass.c index 0e0f767751..2191093391 100644 --- a/src/amd/vulkan/radv_pass.c +++ b/src/amd/vulkan/radv_pass.c @@ -197,6 +197,175 @@ VkResult radv_CreateRenderPass( return VK_SUCCESS; } +VkResult radv_CreateRenderPass2KHR( +VkDevice _device, +const VkRenderPassCreateInfo2KHR* pCreateInfo, +const VkAllocationCallbacks*pAllocator, +VkRenderPass* pRenderPass) +{ + RADV_FROM_HANDLE(radv_device, device, _device); + struct radv_render_pass *pass; + size_t size; + size_t attachments_offset; + VkRenderPassMultiviewCreateInfoKHR *multiview_info = NULL; + + assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO_2_KHR); + + size = sizeof(*pass); + size += pCreateInfo->subpassCount * sizeof(pass->subpasses[0]); + attachments_offset = size; + size += pCreateInfo->attachmentCount * sizeof(pass->attachments[0]); + + pass = vk_alloc2(&device->alloc, pAllocator, size, 8, + VK_SYSTEM_ALLOCATION_SCOPE_OBJECT); + if (pass == NULL) + return vk_error(device->instance, VK_ERROR_OUT_OF_HOST_MEMORY); + + memset(pass, 0, size); + pass->attachment_count = pCreateInfo->attachmentCount; + pass->subpass_count = pCreateInfo->subpassCount; + pass->attachments = (void *) pass + attachments_offset; + + vk_foreach_struct(ext, pCreateInfo->pNext) { + switch(ext->sType) { + case VK_STRUCTURE_TYPE_RENDER_PASS_MULTIVIEW_CREATE_INFO_KHR: + multiview_info = ( VkRenderPassMultiviewCreateInfoKHR*)ext; + break; + default: + break; + } + } + + for (uint32_t i = 0; i < pCreateInfo->attachmentCount; i++) { + struct radv_render_pass_attachment *att = &pass->attachments[i]; + 
+ att->format =
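The allocation pattern at the start of radv_CreateRenderPass2KHR (one block holding the pass header, then the subpass array, then the attachment array at a remembered offset) can be shown in isolation. The sketch below is a toy version under stated assumptions: the `toy_*` types are invented, `calloc` stands in for `vk_alloc2` plus `memset`, and the pointer arithmetic goes through `char *` rather than the GNU `void *` arithmetic used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, much reduced stand-ins for the radv structures. */
struct toy_attachment { uint32_t format; uint32_t samples; };
struct toy_subpass    { uint32_t color_count; };

struct toy_pass {
   uint32_t attachment_count;
   uint32_t subpass_count;
   struct toy_attachment *attachments; /* points into the same block */
   struct toy_subpass subpasses[];     /* flexible array member */
};

/* Allocate header + subpasses + attachments as ONE zeroed block, so a
 * single free() releases everything. Returns NULL on failure. */
static struct toy_pass *toy_pass_create(uint32_t subpass_count,
                                        uint32_t attachment_count)
{
   size_t size = sizeof(struct toy_pass);
   size += subpass_count * sizeof(struct toy_subpass);
   size_t attachments_offset = size; /* attachments start right here */
   size += attachment_count * sizeof(struct toy_attachment);

   struct toy_pass *pass = calloc(1, size);
   if (pass == NULL)
      return NULL;

   pass->attachment_count = attachment_count;
   pass->subpass_count = subpass_count;
   pass->attachments =
      (struct toy_attachment *)((char *)pass + attachments_offset);
   return pass;
}
```

Keeping everything in one block means the object has a single lifetime and a single allocator call, at the cost of computing the offsets by hand as the patch does.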
[Mesa-dev] [PATCH 1/2] radv: introduce radv_subpass_attachment data structure
Needed for VK_KHR_create_renderpass2. Signed-off-by: Samuel Pitoiset --- src/amd/vulkan/radv_cmd_buffer.c | 4 ++-- src/amd/vulkan/radv_meta_clear.c | 4 ++-- src/amd/vulkan/radv_meta_resolve.c| 14 +++--- src/amd/vulkan/radv_meta_resolve_cs.c | 4 ++-- src/amd/vulkan/radv_meta_resolve_fs.c | 6 +++--- src/amd/vulkan/radv_pass.c| 28 +-- src/amd/vulkan/radv_private.h | 15 +- 7 files changed, 44 insertions(+), 31 deletions(-) diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c index 1ea023a811..0f50f6 100644 --- a/src/amd/vulkan/radv_cmd_buffer.c +++ b/src/amd/vulkan/radv_cmd_buffer.c @@ -2054,7 +2054,7 @@ static void radv_subpass_barrier(struct radv_cmd_buffer *cmd_buffer, const struc } static void radv_handle_subpass_image_transition(struct radv_cmd_buffer *cmd_buffer, -VkAttachmentReference att) +struct radv_subpass_attachment att) { unsigned idx = att.attachment; struct radv_image_view *view = cmd_buffer->state.framebuffer->attachments[idx].attachment; @@ -3944,7 +3944,7 @@ void radv_CmdEndRenderPass( for (unsigned i = 0; i < cmd_buffer->state.framebuffer->attachment_count; ++i) { VkImageLayout layout = cmd_buffer->state.pass->attachments[i].final_layout; radv_handle_subpass_image_transition(cmd_buffer, - (VkAttachmentReference){i, layout}); + (struct radv_subpass_attachment){i, layout}); } vk_free(&cmd_buffer->pool->alloc, cmd_buffer->state.attachments); diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c index 2c0bb37387..d7c9849734 100644 --- a/src/amd/vulkan/radv_meta_clear.c +++ b/src/amd/vulkan/radv_meta_clear.c @@ -366,10 +366,10 @@ emit_color_clear(struct radv_cmd_buffer *cmd_buffer, struct radv_subpass clear_subpass = { .color_count = 1, - .color_attachments = (VkAttachmentReference[]) { + .color_attachments = (struct radv_subpass_attachment[]) { subpass->color_attachments[clear_att->colorAttachment] }, - .depth_stencil_attachment = (VkAttachmentReference) { VK_ATTACHMENT_UNUSED, VK_IMAGE_LAYOUT_UNDEFINED 
} + .depth_stencil_attachment = (struct radv_subpass_attachment) { VK_ATTACHMENT_UNUSED, VK_IMAGE_LAYOUT_UNDEFINED } }; radv_cmd_buffer_set_subpass(cmd_buffer, &clear_subpass, false); diff --git a/src/amd/vulkan/radv_meta_resolve.c b/src/amd/vulkan/radv_meta_resolve.c index d4d3552f31..b049237ba6 100644 --- a/src/amd/vulkan/radv_meta_resolve.c +++ b/src/amd/vulkan/radv_meta_resolve.c @@ -613,8 +613,8 @@ radv_cmd_buffer_resolve_subpass(struct radv_cmd_buffer *cmd_buffer) return; for (uint32_t i = 0; i < subpass->color_count; ++i) { - VkAttachmentReference src_att = subpass->color_attachments[i]; - VkAttachmentReference dest_att = subpass->resolve_attachments[i]; + struct radv_subpass_attachment src_att = subpass->color_attachments[i]; + struct radv_subpass_attachment dest_att = subpass->resolve_attachments[i]; if (src_att.attachment == VK_ATTACHMENT_UNUSED || dest_att.attachment == VK_ATTACHMENT_UNUSED) @@ -641,8 +641,8 @@ radv_cmd_buffer_resolve_subpass(struct radv_cmd_buffer *cmd_buffer) RADV_META_SAVE_GRAPHICS_PIPELINE); for (uint32_t i = 0; i < subpass->color_count; ++i) { - VkAttachmentReference src_att = subpass->color_attachments[i]; - VkAttachmentReference dest_att = subpass->resolve_attachments[i]; + struct radv_subpass_attachment src_att = subpass->color_attachments[i]; + struct radv_subpass_attachment dest_att = subpass->resolve_attachments[i]; if (src_att.attachment == VK_ATTACHMENT_UNUSED || dest_att.attachment == VK_ATTACHMENT_UNUSED) @@ -657,7 +657,7 @@ radv_cmd_buffer_resolve_subpass(struct radv_cmd_buffer *cmd_buffer) struct radv_subpass resolve_subpass = { .color_count = 2, - .color_attachments = (VkAttachmentReference[]) { src_att, dest_att }, + .color_attachments = (struct radv_subpass_attachment[]) { src_att, dest_att }, .depth_stencil_attachment = { .attachment = VK_ATTACHMENT_UNUSED }, }; @@ -684,8 +684,8 @@ radv_decompress_resolve_subpass_src(struct radv_cmd_buffer *cmd_buffer) struct radv_framebuffer *fb = cmd_buffer->state.framebuffer; for 
(uint32_t i = 0; i < subpass->color_count; ++i) { - VkAttachmentReference src_att =
Re: [Mesa-dev] [PATCH] loader_dri3: Handle mismatched depth 30 formats for Prime renderoffload.
Ping. One more loose patch, following the same logic/principle as the x11+egl patch for nouveau depth 30, that would benefit from an r-b and ideally make it into Mesa 18.2, to complete nouveau depth 30 support.

I have tested it obsessively by now on intel, amd, nouveau and dri3/present with prime renderoffload for intel+nouveau, amd+nouveau and nouveau+amd, on native x11 with the patched nouveau-ddx and server-1.20 modesetting-ddx. It makes many things better, but nothing worse, as far as my testing goes.

It would be great to get this merged for Mesa 18.2. Can I bribe somebody into this, in exchange for a beer at XDC 2018, assuming I manage to make it there?

thanks,
-mario

On Thu, Jun 14, 2018 at 6:04 AM, Mario Kleiner wrote:
> Detect if the display (X-Server) gpu and Prime renderoffload gpu prefer > different channel ordering for color depth 30 formats ([X/A]BGR2101010 > vs. [X/A]RGB2101010) and perform format conversion during the blitImage() > detiling op from tiled backbuffer -> linear buffer. > > For this we need to find the visual (= red channel mask) for the > X-Drawable used to display on the server gpu. We use the same proven > logic for finding that visual as in commit "egl/x11: Handle both depth > 30 formats for eglCreateImage()". > > This is mostly to allow "NVidia Optimus" at depth 30, as Intel/AMD > gpu's prefer xRGB2101010 ordering, whereas NVidia gpu's prefer > xBGR2101010 ordering, so we can offload to nouveau without getting > funky colors. > > Tested on Intel single gpu, NVidia single gpu, Intel + NVidia prime > offload with DRI3/Present. > > Note: An unintended but pleasant surprise of this patch is that it also > seems to make the modesetting-ddx of server 1.20.0 work at depth 30 > on nouveau, at least with unredirected "classic" X rendering, and > with redirected desktop compositing under XRender accel, and with OpenGL > compositing under GLX. Only X11 compositing via OpenGL + EGL still gives > funky colors. 
modesetting-ddx + glamor are not yet ready to deal with > nouveau's ABGR2101010 format, and treat it as ARGB2101010, also exposing > X-visuals with ARGB2101010 style channel masks. Seems somehow this triggers > the logic in this patch on modesetting-ddx + depth 30 + DRI3 buffer sharing > and does the "wrong" channel swizzling that then cancels out the "wrong" > swizzling of glamor and we end up with the proper pixel formatting in > the scanout buffer :). This so far tested on a NVA5 Tesla card under KDE5 > Plasma as shipping with Ubuntu 16.04.4 LTS. > > Signed-off-by: Mario Kleiner > Cc: Ilia Mirkin > Cc: Eric Engestrom > --- > src/loader/loader_dri3_helper.c | 83 - > src/loader/loader_dri3_helper.h | 1 + > 2 files changed, 83 insertions(+), 1 deletion(-) > > diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c > index f0ff2f07bd..a9a18921ce 100644 > --- a/src/loader/loader_dri3_helper.c > +++ b/src/loader/loader_dri3_helper.c > @@ -64,6 +64,55 @@ dri3_flush_present_events(struct loader_dri3_drawable > *draw); > static struct loader_dri3_buffer * > dri3_find_back_alloc(struct loader_dri3_drawable *draw); > > +static xcb_screen_t * > +get_screen_for_root(xcb_connection_t *conn, xcb_window_t root) > +{ > + xcb_screen_iterator_t screen_iter = > + xcb_setup_roots_iterator(xcb_get_setup(conn)); > + > + for (; screen_iter.rem; xcb_screen_next(&screen_iter)) { > + if (screen_iter.data->root == root) > + return screen_iter.data; > + } > + > + return NULL; > +} > + > +static xcb_visualtype_t * > +get_xcb_visualtype_for_depth(struct loader_dri3_drawable *draw, int depth) > +{ > + xcb_visualtype_iterator_t visual_iter; > + xcb_screen_t *screen = draw->screen; > + xcb_depth_iterator_t depth_iter; > + > + if (!screen) > + return NULL; > + > + depth_iter = xcb_screen_allowed_depths_iterator(screen); > + for (; depth_iter.rem; xcb_depth_next(&depth_iter)) { > + if (depth_iter.data->depth != depth) > + continue; > + > + visual_iter = 
xcb_depth_visuals_iterator(depth_iter.data); > + if (visual_iter.rem) > + return visual_iter.data; > + } > + > + return NULL; > +} > + > +/* Get red channel mask for given drawable at given depth. */ > +static unsigned int > +dri3_get_red_mask_for_depth(struct loader_dri3_drawable *draw, int depth) > +{ > + xcb_visualtype_t *visual = get_xcb_visualtype_for_depth(draw, depth); > + > + if (visual) > + return visual->red_mask; > + > + return 0; > +} > + > /** > * Do we have blit functionality in the image blit extension? > * > @@ -323,6 +372,7 @@ loader_dri3_drawable_init(xcb_connection_t *conn, >return 1; > } > > + draw->screen = get_screen_for_root(draw->conn, reply->root); > draw->width = reply->width; > draw->height = reply->height; > draw->depth = reply->depth; > @@ -1030,6 +1080,36 @@ dri3_cpp_for_format(uint32_t format) { > } > } > > +/* Map format of render buffer to corresponding
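A minimal sketch of the channel-order check described in the commit message above, not the code from the patch. It assumes the usual little-endian 2:10:10:10 layouts, where red sits in bits 20..29 for [X/A]RGB2101010 and in bits 0..9 for [X/A]BGR2101010. The helper names needs_rb_swap and swap_rb_2101010 are hypothetical, and the per-pixel CPU swap only illustrates the effect; the patch itself performs the conversion during the blitImage() detiling op.

```c
#include <stdbool.h>
#include <stdint.h>

/* Red channel masks for the two depth 30 layouts (assumed bit positions). */
#define RED_MASK_XRGB2101010 0x3ff00000u /* red in bits 20..29 */
#define RED_MASK_XBGR2101010 0x000003ffu /* red in bits 0..9 */

/* Hypothetical helper: true if the display gpu and the render gpu disagree
 * on channel order. A mask of 0 means "visual not found", so no swap. */
static bool
needs_rb_swap(uint32_t display_red_mask, uint32_t render_red_mask)
{
   return display_red_mask != 0 && render_red_mask != 0 &&
          display_red_mask != render_red_mask;
}

/* Hypothetical helper: swap the 10-bit red and blue fields of one pixel,
 * leaving the top 2 bits and the 10-bit green field untouched. */
static uint32_t
swap_rb_2101010(uint32_t pixel)
{
   uint32_t low  = pixel & 0x000003ffu;    /* bits 0..9 */
   uint32_t high = (pixel >> 20) & 0x3ffu; /* bits 20..29 */

   return (pixel & 0xc00ffc00u) | (low << 20) | high;
}
```

With these masks, an Intel or AMD display gpu (xRGB2101010) paired with a nouveau render gpu (xBGR2101010) would report a mismatch, which is the "NVidia Optimus" case the commit message targets.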
Re: [Mesa-dev] Nouveau depth 30 stuff again..
Ping? I think this series should be ready for merging: all review comments have been addressed, and it was obsessively tested by myself on intel, amd, nouveau and dri3/present with prime renderoffload for intel+nouveau, amd+nouveau and nouveau+amd, on both native x11 and wayland+weston. It makes many things better, but nothing worse, as far as my testing goes. It would be great to get this merged for Mesa 18.2. Can I bribe somebody into this, in exchange for a beer at xdc2018, assuming I manage to make it there? Cheers, -mario On Wed, Jun 13, 2018 at 6:04 AM, Mario Kleiner wrote: > A resend of the series, with all of Eric Engestrom's review comments > addressed and retested on all combos of intel, nvidia, intel+nvidia > prime. > > Rebased and retested against current Mesa master, otherwise only > style fixes and an additional assert for documentation, no real > functional changes. > > Please merge, thanks > -mario > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nvc0,gm107/ir: add cycle count estimation
With branching, for simplicity and usefulness, this assumes both paths are taken. With loops, it assumes their basic blocks execute 10 times. The average latencies this patch uses for variable latency instructions are rather poor; only IMUL/IMAD is reasonably accurate. It should be better than nothing though. Since information is lacking and this may miss some details, the estimates should probably be taken with caution, at least until we get better numbers for variable latency instructions. Estimation can be enabled or disabled through NV50_PROG_CYCLE_ESTIMATE, which defaults to enabled on debug builds. Signed-off-by: Rhys Perry --- src/gallium/drivers/nouveau/codegen/nv50_ir.cpp| 3 + src/gallium/drivers/nouveau/codegen/nv50_ir.h | 6 + .../drivers/nouveau/codegen/nv50_ir_driver.h | 1 + .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 132 - .../drivers/nouveau/codegen/nv50_ir_graph.cpp | 7 +- .../drivers/nouveau/codegen/nv50_ir_graph.h| 2 +- .../nouveau/codegen/nv50_ir_target_gm107.cpp | 24 .../drivers/nouveau/codegen/nv50_ir_target_gm107.h | 1 + src/gallium/drivers/nouveau/nvc0/nvc0_program.c| 4 +- 9 files changed, 169 insertions(+), 11 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp index 49425b98b9..a0c6057dd1 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp @@ -1119,6 +1119,7 @@ Program::Program(Type type, Target *arch) binSize = 0; maxGPR = -1; + cycleEstimate = 0; main = new Function(this, "MAIN", ~0); calls.insert(&main->call); @@ -1279,6 +1280,8 @@ nv50_ir_generate_code(struct nv50_ir_prog_info *info) goto out; } + info->bin.cycleEstimate = prog->cycleEstimate; + out: INFO_DBG(prog->dbgFlags, VERBOSE, "nv50_ir_generate_code: ret = %i\n", ret); diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h b/src/gallium/drivers/nouveau/codegen/nv50_ir.h index f4f3c70888..79e4c7cccf 100644 --- 
a/src/gallium/drivers/nouveau/codegen/nv50_ir.h +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h @@ -1136,6 +1136,11 @@ public: bool explicitCont; // loop headers: true if loop contains continue stmts + // used for cycle count estimation on GM107+ + Instruction *unresolvedBarriers[6]; + bool unresolvedBarriersAreRead[6]; + uint32_t cycleEstimate; + private: int id; DLList df; @@ -1282,6 +1287,7 @@ public: uint32_t tlsSize; // size required for FILE_MEMORY_LOCAL int maxGPR; + uint32_t cycleEstimate; MemoryPool mem_Instruction; MemoryPool mem_CmpInstruction; diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h index 7c835ceab8..1c7e7f4b5a 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h @@ -99,6 +99,7 @@ struct nv50_ir_prog_info void *fixupData; struct nv50_ir_prog_symbol *syms; uint16_t numSyms; + uint32_t cycleEstimate; } bin; struct nv50_ir_varying sv[PIPE_MAX_SHADER_INPUTS]; diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp index 26826d6360..c84b9e59d5 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp @@ -3479,7 +3479,12 @@ CodeEmitterGM107::getMinEncodingSize(const Instruction *i) const class SchedDataCalculatorGM107 : public Pass { public: - SchedDataCalculatorGM107(const TargetGM107 *targ) : targ(targ) {} + SchedDataCalculatorGM107(const TargetGM107 *targ); + + inline uint32_t getCycleEstimate() const + { + return cycleEstimate; + } private: struct RegScores @@ -3573,9 +3578,20 @@ private: } }; + struct InsnCostInfo { + std::vector waitSrcs; + std::vector waitsAreRead; + uint32_t bbCycleEstimate; + }; + RegScores *score; // for current BB std::vector scoreBoards; + bool estimateCycleCount; + int cycleEstimate; + // used for cycle count estimation + 
std::vector insnCostInfo; + const TargetGM107 *targ; bool visit(Function *); bool visit(BasicBlock *); @@ -3591,7 +3607,7 @@ private: inline void emitReuse(Instruction *, uint8_t); inline void emitWrDepBar(Instruction *, uint8_t); inline void emitRdDepBar(Instruction *, uint8_t); - inline void emitWtDepBar(Instruction *, uint8_t); + inline void emitWtDepBar(Instruction *, uint8_t, Instruction *, bool); inline int getStall(const Instruction *) const; inline int getWrDepBar(const Instruction *) const; @@ -3624,8 +3640,20 @@ private: bool needRdDepBar(const Instruction *) const; bool needWrDepBar(const Instruction *) const; + + void doEstimateCycleCount(BasicBlock
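The two simplifying assumptions stated in the commit message of this patch (both branch paths are taken, loop blocks execute 10 times) can be sketched roughly as below. This is only an illustration under stated assumptions, not the patch's implementation: struct block_cost and estimate_cycles are hypothetical names, per-block cycle counts are taken as given, and how nested loops are weighted is not stated in the message, so the sketch uses a single in-loop flag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Iteration count the commit message assumes for loop bodies. */
#define ASSUMED_LOOP_TRIPS 10u

/* Hypothetical, simplified per-block record; not the nv50_ir BasicBlock. */
struct block_cost {
   uint32_t cycles; /* estimated cycles for one execution of the block */
   bool in_loop;    /* true if the block belongs to a loop body */
};

/* Sum every block, so both sides of each branch are counted ("both paths
 * are taken"), and weight loop blocks by the assumed trip count. */
static uint32_t
estimate_cycles(const struct block_cost *blocks, size_t count)
{
   uint32_t total = 0;

   for (size_t i = 0; i < count; i++) {
      uint32_t weight = blocks[i].in_loop ? ASSUMED_LOOP_TRIPS : 1u;
      total += blocks[i].cycles * weight;
   }

   return total;
}
```

Counting both branch paths overestimates any single execution, which fits the message's advice to take the numbers with caution.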
Re: [Mesa-dev] [PATCH 2/2] radv: add the trace BO to the list when starting a new cmdbuf
Reviewed-by: Bas Nieuwenhuizen for the series. On Tue, Jul 3, 2018 at 12:43 PM, Samuel Pitoiset wrote: > That might reduce CPU overhead a little bit when using > RADV_TRACE_FILE. > > Signed-off-by: Samuel Pitoiset > --- > src/amd/vulkan/radv_cmd_buffer.c | 11 +++ > 1 file changed, 7 insertions(+), 4 deletions(-) > > diff --git a/src/amd/vulkan/radv_cmd_buffer.c > b/src/amd/vulkan/radv_cmd_buffer.c > index 26d9fef314..0a7a3f3fa9 100644 > --- a/src/amd/vulkan/radv_cmd_buffer.c > +++ b/src/amd/vulkan/radv_cmd_buffer.c > @@ -446,7 +446,6 @@ void radv_cmd_buffer_trace_emit(struct radv_cmd_buffer > *cmd_buffer) > MAYBE_UNUSED unsigned cdw_max = > radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 7); > > ++cmd_buffer->state.trace_id; > - radv_cs_add_buffer(device->ws, cs, device->trace_bo, 8); > radv_emit_write_data_packet(cs, va, 1, &cmd_buffer->state.trace_id); > radeon_emit(cs, PKT3(PKT3_NOP, 0, 0)); > radeon_emit(cs, AC_ENCODE_TRACE_POINT(cmd_buffer->state.trace_id)); > @@ -509,7 +508,6 @@ radv_save_pipeline(struct radv_cmd_buffer *cmd_buffer, > data[0] = (uintptr_t)pipeline; > data[1] = (uintptr_t)pipeline >> 32; > > - radv_cs_add_buffer(device->ws, cs, device->trace_bo, 8); > radv_emit_write_data_packet(cs, va, 2, data); > } > > @@ -551,7 +549,6 @@ radv_save_descriptors(struct radv_cmd_buffer *cmd_buffer, > data[i * 2 + 1] = (uintptr_t)set >> 32; > } > > - radv_cs_add_buffer(device->ws, cs, device->trace_bo, 8); > radv_emit_write_data_packet(cs, va, MAX_SETS * 2, data); > } > > @@ -2300,8 +2297,14 @@ VkResult radv_BeginCommandBuffer( > radv_cmd_buffer_set_subpass(cmd_buffer, subpass, false); > } > > - if (unlikely(cmd_buffer->device->trace_bo)) > + if (unlikely(cmd_buffer->device->trace_bo)) { > + struct radv_device *device = cmd_buffer->device; > + > + radv_cs_add_buffer(device->ws, cmd_buffer->cs, > + device->trace_bo, 8); > + > radv_cmd_buffer_trace_emit(cmd_buffer); > + } > > cmd_buffer->status = RADV_CMD_BUFFER_STATUS_RECORDING; > > -- > 2.18.0
[Mesa-dev] [Bug 107022] [RADV] The Witcher 3: Trembling of trees
https://bugs.freedesktop.org/show_bug.cgi?id=107022 --- Comment #1 from ximik --- The problem is visible in the video at the link below. Same problem here. Card: r290x; driver: Mesa 18-18.2 git. Link: https://mega.nz/#!Pr5nmYTQ!uGrPyzSW32-Ln60x2jUxrvtW3VH9rG2b2uTgC1iwe18 -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.
[Mesa-dev] [Bug 107156] earth tessellation bug
https://bugs.freedesktop.org/show_bug.cgi?id=107156 --- Comment #2 from ximik --- Created attachment 140507 --> https://bugs.freedesktop.org/attachment.cgi?id=140507&action=edit earth tessellation bug
Re: [Mesa-dev] [PATCH 2/7] anv/pass: Use a designated initializer for attachments
With the version bumped in patch 7, patches 2-7 are: Reviewed-by: Lionel Landwerlin
[Mesa-dev] [Bug 107156] earth tessellation bug
https://bugs.freedesktop.org/show_bug.cgi?id=107156 --- Comment #1 from ximik --- Created attachment 140506 --> https://bugs.freedesktop.org/attachment.cgi?id=140506&action=edit earth tessellation bug
Re: [Mesa-dev] [PATCH 1/7] vulkan: Update the XML and headers to 1.1.80
Acked-by: Lionel Landwerlin
Re: [Mesa-dev] [PATCH 7/7] anv: Add support for VK_KHR_create_renderpass2
On 07/07/18 17:29, Jason Ekstrand wrote: The implementation of CreateRenderPass2 uses the helpers we broke out in previous commits. The implementations of the new vkCmd functions just call the old versions. --- src/intel/vulkan/anv_extensions.py | 1 + src/intel/vulkan/anv_pass.c| 140 + src/intel/vulkan/genX_cmd_buffer.c | 24 + 3 files changed, 165 insertions(+) diff --git a/src/intel/vulkan/anv_extensions.py b/src/intel/vulkan/anv_extensions.py index 0f99f58ecb1..4179315a388 100644 --- a/src/intel/vulkan/anv_extensions.py +++ b/src/intel/vulkan/anv_extensions.py @@ -73,6 +73,7 @@ EXTENSIONS = [ You might want to bump API_PATCH_VERSION above.
[Mesa-dev] [Bug 107156] earth tessellation bug
https://bugs.freedesktop.org/show_bug.cgi?id=107156 Bug ID: 107156 Summary: earth tessellation bug Product: Mesa Version: git Hardware: x86-64 (AMD64) OS: Linux (All) Status: NEW Severity: normal Priority: medium Component: Drivers/Vulkan/radeon Assignee: mesa-dev@lists.freedesktop.org Reporter: ximi...@mail.ru QA Contact: mesa-dev@lists.freedesktop.org Created attachment 140505 --> https://bugs.freedesktop.org/attachment.cgi?id=140505&action=edit earth tessellation bug Some tessellated surfaces are separated from the ground by some distance. The bug occurs on an r290x card with the latest open mesa-git driver. It appeared in early June; git driver versions from the end of May did not have it. An apitrace cannot be captured: Far Cry 5 throws an error because it takes too long to load.
[Mesa-dev] [PATCH v2] egl: Fix missing clamping in eglSetDamageRegionKHR
Clamp the x and y co-ordinates of the rectangles. v2: Clamp width/height after converting to co-ordinates (Ilia Mirkin) Signed-off-by: Harish Krupo --- src/egl/main/eglapi.c | 25 +++-- 1 file changed, 11 insertions(+), 14 deletions(-) diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c index c110349119..deb479b6d5 100644 --- a/src/egl/main/eglapi.c +++ b/src/egl/main/eglapi.c @@ -1320,9 +1320,7 @@ eglSwapBuffersWithDamageKHR(EGLDisplay dpy, EGLSurface surface, } /** - * If the width of the passed rect is greater than the surface's - * width then it is clamped to the width of the surface. Same with - * height. + * Clamp the rectangles so that they lie within the surface. */ static void @@ -1334,17 +1332,16 @@ _eglSetDamageRegionKHRClampRects(_EGLDisplay* disp, _EGLSurface* surf, EGLint surf_width = surf->Width; for (i = 0; i < (4 * n_rects); i += 4) { - EGLint x, y, rect_width, rect_height; - x = rects[i]; - y = rects[i + 1]; - rect_width = rects[i + 2]; - rect_height = rects[i + 3]; - - if (rect_width > surf_width - x) - rects[i + 2] = surf_width - x; - - if (rect_height > surf_height - y) - rects[i + 3] = surf_height - y; + EGLint x1, y1, x2, y2; + x1 = rects[i]; + y1 = rects[i + 1]; + x2 = rects[i + 2] + x1; + y2 = rects[i + 3] + y1; + + rects[i] = CLAMP(x1, 0, surf_width); + rects[i + 1] = CLAMP(y1, 0, surf_height); + rects[i + 2] = CLAMP(x2, 0, surf_width) - rects[i]; + rects[i + 3] = CLAMP(y2, 0, surf_height) - rects[i + 1]; } } -- 2.18.0
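The v2 clamping from the diff above can be tried outside of Mesa with a small standalone sketch. It mirrors the added lines for a single {x, y, width, height} rectangle; the function name clamp_rect and the local CLAMP definition are assumptions made so the snippet compiles on its own, and plain int stands in for EGLint.

```c
/* Local stand-in for the CLAMP macro used in the diff (assumed definition). */
#define CLAMP(x, lo, hi) ((x) < (lo) ? (lo) : ((x) > (hi) ? (hi) : (x)))

/* Clamp one {x, y, width, height} rectangle so it lies within a surface of
 * surf_width x surf_height: convert to corner co-ordinates, clamp both
 * corners, then convert back to width/height. */
static void
clamp_rect(int rect[4], int surf_width, int surf_height)
{
   int x1 = rect[0];
   int y1 = rect[1];
   int x2 = rect[2] + x1;
   int y2 = rect[3] + y1;

   rect[0] = CLAMP(x1, 0, surf_width);
   rect[1] = CLAMP(y1, 0, surf_height);
   rect[2] = CLAMP(x2, 0, surf_width) - rect[0];
   rect[3] = CLAMP(y2, 0, surf_height) - rect[1];
}
```

Clamping the far corner rather than the width is what handles a negative origin: for {-10, 5, 30, 200} on a 100x100 surface the result is {0, 5, 20, 95}, whereas the removed code would have left x at -10 and the width at 30.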