e, or complexity to
avoid that. The main body of the fixed function TCS is not
that interesting to precompile anyway, since we do it on
demand and it is very small.
v2: Use u_bit_scan64.
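The v2 note above refers to a bit-scan helper. As a rough, self-contained sketch of that iteration pattern (the names bit_scan64_sketch and sum_set_bit_indices are hypothetical stand-ins, not the Mesa helper or the patch's code), the loop visits only the set bits of a 64-bit mask:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a bit-scan helper: returns the index of the
 * lowest set bit in *mask and clears that bit. */
static int bit_scan64_sketch(uint64_t *mask)
{
    assert(*mask != 0); /* scanning an empty mask is a caller bug */

    int i = 0;
    while (!((*mask >> i) & 1))
        i++;

    *mask &= *mask - 1; /* clear the lowest set bit */
    return i;
}

/* Visit only the set bits of a mask (for example, a mask of used slots)
 * instead of testing all 64 positions. */
static unsigned sum_set_bit_indices(uint64_t mask)
{
    unsigned sum = 0;
    while (mask)
        sum += (unsigned)bit_scan64_sketch(&mask);
    return sum;
}
```

The loop body runs once per set bit, so a sparse mask costs far fewer iterations than a fixed 0..63 scan.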
Signed-off-by: Bas Nieuwenhuizen
Reviewed-by: Marek Olšák
---
src/gallium/drivers/radeonsi/si_shader.c
Signed-off-by: Bas Nieuwenhuizen
Reviewed-by: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/gallium/drivers/radeonsi/si_shader.c| 28 -
src/gallium/drivers/radeonsi/si_shader.h| 3 ++-
src/gallium/drivers/radeonsi/si_state_shaders.c | 9
3
This happens to be in the right position, but that changes
when TCS/TES get new parameters.
Signed-off-by: Bas Nieuwenhuizen
Reviewed-by: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/gallium/drivers/radeonsi/si_shader.h | 7 ---
1 file changed, 4 insertions(+), 3 deletions(-)
diff
as late as possible.
- Use tgsi_full_src_register_from_dst.
- Remove some bad comments.
Signed-off-by: Bas Nieuwenhuizen
Reviewed-by: Marek Olšák
---
src/gallium/drivers/radeonsi/si_shader.c | 124 +++
1 file changed, 124 insertions(+)
diff --git a/src/gall
Signed-off-by: Bas Nieuwenhuizen
Reviewed-by: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/gallium/drivers/radeonsi/si_shader.c | 3 +++
src/gallium/drivers/radeonsi/si_shader.h | 12 ++--
src/gallium/drivers/radeonsi/si_state_draw.c | 9 +++--
3 files changed, 20
v2: - Use llvm.amdgcn.buffer.load intrinsics for new LLVM.
- Code style fixes.
v3: - Code style fix.
Signed-off-by: Bas Nieuwenhuizen
Reviewed-by: Marek Olšák
---
src/gallium/drivers/radeonsi/si_shader.c | 114 +++
1 file changed, 114 insertions(+)
diff --git
to LDS and the LDS space, so we can load the outputs
later, either due to the shader, or for writing the tess factors.
Signed-off-by: Bas Nieuwenhuizen
Reviewed-by: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/gallium/drivers/radeonsi/si_shader.c | 66
1 file c
The factors may be stored to LDS by another invocation than
the invocation for vertex 0.
Signed-off-by: Bas Nieuwenhuizen
---
src/gallium/drivers/radeonsi/si_shader.c | 6 ++
1 file changed, 6 insertions(+)
diff --git a/src/gallium/drivers/radeonsi/si_shader.c
b/src/gallium/drivers
that the optimal value for it differs between applications,
and I don't have that many applications to check against.
Any thoughts on the issue?
- Bas
Bas Nieuwenhuizen (14):
radeonsi: Add buffer for offchip storage between TCS and TES.
radeonsi: Add offchip tessellation parameters.
rad
They are unused.
Signed-off-by: Bas Nieuwenhuizen
Reviewed-by: Nicolai Hähnle
Reviewed-by: Marek Olšák
---
src/gallium/drivers/radeonsi/si_shader.c | 4 +---
src/gallium/drivers/radeonsi/si_shader.h | 15 ---
src/gallium/drivers/radeonsi/si_state_draw.c | 4 +---
3 files
imits for not splitting patches between waves.
- Set max num_patches to 40 as in the proprietary driver.
Signed-off-by: Bas Nieuwenhuizen
---
src/gallium/drivers/radeonsi/si_state_draw.c | 50 +++-
1 file changed, 35 insertions(+), 15 deletions(-)
diff --git a/src/ga
The R_028B50_VGT_TESS_DISTRIBUTION value is copied from
amdgpu-pro. Smaller values in the ACCUM fields seem to
decrease the performance advantage from this patch, higher
values don't seem to matter.
v2: Add distribution mode field enums.
Signed-off-by: Bas Nieuwenhuizen
Reviewed-by: Ni
This allows running the TES on different CUs than the
TCS, which results in performance improvements.
v2: Only write the control word from one invocation.
Signed-off-by: Bas Nieuwenhuizen
Reviewed-by: Marek Olšák
---
src/gallium/drivers/radeonsi/si_shader.c
The buffer is quite large, but should only be allocated if the
application uses tessellation. Most non-games don't.
v2: - Use the correct register for SI.
- Add define for block size.
Signed-off-by: Bas Nieuwenhuizen
Reviewed-by: Marek Olšák
---
src/gallium/drivers/radeonsi/si_p
I don't think this is the right approach as we shouldn't be getting 0
in the first place. At least for LDS the output size should be at
least 2 as we load the inner & outer tess factors while writing them
to the tessellation factor ring.
We could just do
num_tcs_patch_outputs = MAX2(num_tcs_patch_
Those are always read for writing to the TF ring.
Should fix CTS test
GL45-CTS.shader_image_load_store.multiple-uniforms
after a regression due to the new tessellation code.
Signed-off-by: Bas Nieuwenhuizen
---
I have no CTS, so I have not actually tested whether this fixes
the test.
src
Signed-off-by: Bas Nieuwenhuizen
---
src/gallium/drivers/ddebug/dd_screen.c | 9 +
1 file changed, 9 insertions(+)
diff --git a/src/gallium/drivers/ddebug/dd_screen.c
b/src/gallium/drivers/ddebug/dd_screen.c
index ebe090b..5a883bd 100644
--- a/src/gallium/drivers/ddebug/dd_screen.c
Reviewed-by: Bas Nieuwenhuizen
On Fri, May 27, 2016 at 12:40 PM, Marek Olšák wrote:
> And how about the attached patch?
>
> Marek
>
> On Fri, May 27, 2016 at 10:08 AM, Bas Nieuwenhuizen
> wrote:
>> Those are always read for writing to the TF ring.
>>
Signed-off-by: Bas Nieuwenhuizen
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96239
---
src/gallium/drivers/radeonsi/si_state_shaders.c | 8 ++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c
b/src/gallium/drivers
Signed-off-by: Bas Nieuwenhuizen
---
src/gallium/drivers/radeon/r600_pipe_common.h | 5 +
src/gallium/drivers/radeonsi/si_pipe.c| 2 ++
src/gallium/drivers/radeonsi/si_state.c | 18 ++
src/gallium/drivers/radeonsi/si_state.h | 1 +
4 files changed, 26
By using a counter to quickly reject textures that are not
bound to a framebuffer, the performance impact when binding
sampler_views/images is not too large.
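The counter idea described above might look roughly like the following. This is only a sketch under stated assumptions: tex_sketch, fb_attach, fb_detach and needs_fb_check are hypothetical names, not the radeonsi structures or functions touched by the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical texture object; only the counter matters for this sketch. */
struct tex_sketch {
    unsigned fb_bind_count; /* number of framebuffer attachments using it */
};

/* Called when the texture is attached to a framebuffer. */
static void fb_attach(struct tex_sketch *t)
{
    t->fb_bind_count++;
}

/* Called when the texture is detached from a framebuffer. */
static void fb_detach(struct tex_sketch *t)
{
    assert(t->fb_bind_count > 0); /* unbalanced detach is a caller bug */
    t->fb_bind_count--;
}

/* Called on every sampler view / image bind: a single integer test
 * rejects the common case of a texture that is in no framebuffer, so
 * the more expensive handling runs only when it can matter. */
static bool needs_fb_check(const struct tex_sketch *t)
{
    return t->fb_bind_count != 0;
}
```

The cost moves to framebuffer attach and detach, which are assumed here to be much rarer than sampler view or image binds.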
Signed-off-by: Bas Nieuwenhuizen
---
src/gallium/drivers/radeonsi/si_blit.c| 99 +++
src/gallium/drivers
e[1] &= C_008F14_BASE_ADDRESS_HI;
> + state[3] &= C_008F1C_TILING_INDEX;
> + state[4] &= C_008F20_PITCH;
> + state[6] &= C_008F28_COMPRESSION_EN;
Aren't these fields cleared already? Either way, the series is
Reviewe
Reviewed-by: Bas Nieuwenhuizen
On Fri, Jun 3, 2016 at 7:20 PM, Marek Olšák wrote:
> From: Marek Olšák
>
> This should fix spec@arb_shader_image_load_store@level.
>
> Broken by:
> Commit: 95c5bbae66af3ca1f805d94f6fe8d8e4ba2c9c43
> radeonsi: set some image descripto
seen more than a 30%
> speedup. The difference in real-world applications should of course be
> *much* smaller (1-2% if you're really lucky, if that). Please review!
This series is
Reviewed-by: Bas Nieuwenhuizen
Did you also try/benchmark using a dirty mask for emitting the pointers?
-
of IB 1, but not between IB 1 and IB 2.
The old code put the CE RAM loads in the preamble of IB 2. As the preamble of
IB 1 does not have the loads and the preamble of IB 2 does not get executed, the
old values are not loaded into CE RAM.
Fix this by always restoring the entire CE RAM.
Signed-off-by
On Mon, Jun 6, 2016 at 4:21 PM, Nicolai Hähnle wrote:
> On 06.06.2016 16:16, Nicolai Hähnle wrote:
>>
>> Patches 1 & 2:
>>
>> Reviewed-by: Nicolai Hähnle
>
>
> Hold off on patch #2 - how does this work together with shader image writes?
> Then we're in an ugly situation where the other process co
On Mon, Jun 6, 2016 at 5:14 PM, Nicolai Hähnle wrote:
> On 06.06.2016 00:28, Bas Nieuwenhuizen wrote:
>>
>> This fixes a problem with the CE preamble and restoring only stuff in the
>> preamble when needed.
>>
>> To illustrate suppose we have two graphics IB'
load all descriptor set buffers instead of load and store the entire
CE RAM.
- Leave the ce_ram_dirty tracking in place for the non-preamble case.
Signed-off-by: Bas Nieuwenhuizen
Cc: "12.0"
---
Replaces "radeonsi: Save and restore entire CE RAM."
src
0;
> @@ -344,7 +347,8 @@ static int amdgpu_surface_init(struct radeon_winsys *rws,
> AddrSurfInfoIn.flags.dccCompatible = !(surf->flags &
> RADEON_SURF_Z_OR_SBUFFER) &&
> !(surf->flags & RADEON_SURF_SCANOUT)
load all descriptor set buffers instead of load and store the entire
CE RAM.
- Leave the ce_ram_dirty tracking in place for the non-preamble case.
v3: - Fixed parameter alignment.
- Rebased to master (Nicolai's descriptor series).
Signed-off-by: Bas Nieuwenhuizen
Cc: "12.0"
On Fri, Jun 10, 2016 at 11:35 AM, Nicolai Hähnle wrote:
> On 09.06.2016 19:50, Bas Nieuwenhuizen wrote:
>>
>> This fixes a problem with the CE preamble and restoring only stuff in the
>> preamble when needed.
>>
>> To illustrate suppose we have two graphics IB
Reviewed-by: Bas Nieuwenhuizen
On Mon, Jun 13, 2016 at 6:17 PM, Marek Olšák wrote:
> From: Marek Olšák
>
> Use LLVMBuildRetVoid in epilogs and the GS copy shader and
> si_llvm_build_ret otherwise.
> ---
> src/gallium/drivers/radeonsi/si_shader.c | 20 ++--
>
We always compute HTILE size using addrlib, even when not TC compatible.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/common/ac_surface.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/src/amd/common/ac_surface.c b/src/amd/common/ac_surface.c
index 51e15d07d3c
We never use EXPCLEAR clears.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/radv_cmd_buffer.c | 19 +++
src/amd/vulkan/radv_device.c | 21 +++--
src/amd/vulkan/radv_image.c | 7 ---
src/amd/vulkan/radv_meta_clear.c | 2 +-
src/amd/vulkan
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/radv_cmd_buffer.c | 25 +
src/amd/vulkan/radv_image.c | 12
src/amd/vulkan/radv_meta_clear.c | 18 --
src/amd/vulkan/radv_private.h| 6 --
4 files changed, 41 insertions(+), 20
Not really what the fast depth clear does, no matter whether you use
EXPCLEAR or not. Seems the fast clear using the DB HW always touches
the main buffer.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/radv_meta_clear.c | 94 +++-
1 file changed, 93
And correct implementation to specify only what we support.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/radv_cmd_buffer.c | 4
src/amd/vulkan/radv_image.c | 9 ++---
src/amd/vulkan/radv_private.h| 10 ++
3 files changed, 20 insertions(+), 3 deletions(-)
diff
Did some RE'ing what several HTILE words give when read from a descriptor
with HTILE compression enabled.
Seems to align with -pro usage for D16 too.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/radv_cmd_buffer.c | 17 +
1 file changed, 13 insertions(+), 4 dele
It is a successful return.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/radv_wsi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/amd/vulkan/radv_wsi.c b/src/amd/vulkan/radv_wsi.c
index 3a8617fd8fa..5e866126b91 100644
--- a/src/amd/vulkan/radv_wsi.c
+++ b/src/amd
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/radv_pipeline.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 3282652ddd4..01303d90da5 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
flush_compute_state doesn't reserve a large chunk, so these need their own
reservation.
Signed-off-by: Bas Nieuwenhuizen
Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
---
src/amd/vulkan/radv_cmd_buffer.c | 8
1 file changed, 8 insertions(+)
d
0x30f regressed mad max.
Signed-off-by: Bas Nieuwenhuizen
Fixes: df91abfe5af "radv: Use correct clear words for HTILE."
---
src/amd/vulkan/radv_cmd_buffer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cm
s the regressing part of the change.
- Bas
On Tue, May 30, 2017 at 9:15 PM, Fredrik Höglund wrote:
> On Monday 22 May 2017, Bas Nieuwenhuizen wrote:
>> Did some RE'ing what several HTILE words give when read from a descriptor
>> with HTILE compression enabled.
>>
>> Seems
here, it also results in the
regression). Still looking at it, but in the meantime I'd like to
revert it as it breaks a game while AFAIK not fixing any yet.
On Wed, May 31, 2017 at 12:33 AM, Marek Olšák wrote:
> How did it regress it?
>
> Marek
>
> On Tue, May 30, 2017 at 11
eam size
> checks.
>
> This does change behaviour by emitting 2 EOPs where we used to emit
> one on CIK/VI,
> but it's also what radeonsi seems to do now.
>
> Dave.
Yeah, the change in radv_CmdEndQuery needs the reservation in the
VK_QUERY_TYPE_PIPELINE_STATISTICS cas
Patches 1,2,4 are also
Reviewed-by: Bas Nieuwenhuizen
On Thu, Jun 1, 2017 at 6:43 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> This reworks this code to be like radeonsi, which will make it
> easier to add GFX9 support to it in the future.
> ---
> src/amd/vulkan/si_
We clear the descriptors_dirty array afterwards, so the SGPRs for
the other pipeline don't get updated on the flush for that other
draw/dispatch, so we have to make sure we do it immediately.
Signed-off-by: Bas Nieuwenhuizen
Fixes: ae61ddabe8c "radv: move userdata sgpr ownership to com
Sets could have been ignored during previous descriptor set flush
due to the shader not using them and therefore no SGPR being assigned.
Signed-off-by: Bas Nieuwenhuizen
Fixes: ae61ddabe8c "radv: move userdata sgpr ownership to compiler side."
---
src/amd/vulkan/radv_cmd_buf
Hi Jason,
How about just using vblank_mode=0 for mailbox and vblank_mode=3 for
fifo? While the semantics probably don't line up perfectly I think it
is close enough, and means people don't need to use different flags
depending on whether the app uses vulkan or GL.
- Bas
On Sun, Jun 4, 2017 at 9:
Hi Marek,
Do you have any other reasons besides it not improving correctness?
I'd like to pick at least the radv one, as the code doesn't get less
clear, and using 5 zeros for struct with 6 members is just plain
silly.
- Bas
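For reference on the initializer point above: in C, members of an aggregate that have no explicit initializer are zero-initialized, so a short initializer list already covers the whole struct. A minimal sketch, with a hypothetical six-member struct that is not taken from the patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical six-member struct, for illustration only. */
struct six_sketch {
    int a, b, c, d, e, f;
};

static bool six_is_all_zero(const struct six_sketch *s)
{
    return s->a == 0 && s->b == 0 && s->c == 0 &&
           s->d == 0 && s->e == 0 && s->f == 0;
}

/* Both spellings leave every member zero: members without an explicit
 * initializer are zero-initialized by the language. */
static bool short_initializers_zero_everything(void)
{
    struct six_sketch five_zeros = {0, 0, 0, 0, 0}; /* f is implicit */
    struct six_sketch one_zero = {0};               /* b..f are implicit */

    assert(five_zeros.f == 0);
    return six_is_all_zero(&five_zeros) && six_is_all_zero(&one_zero);
}
```

Some compilers can warn about the partially initialized form when extra warnings are enabled; that is a diagnostic choice, not a difference in behaviour.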
On Sun, Jun 4, 2017 at 9:57 PM, Marek Olšák wrote:
> NAK.
>
> In C/C+
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/radv_image.c | 7 ++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 9a76d285242..22bc6b41da8 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan
var or something.
v2 (Bas): - Don't expose the semaphore ext without implementing it.
- Only export the capabilities ext as instance ext.
- Implement radv_GetPhysicalDeviceExternalBufferPropertiesKHX.
Signed-off-by: Dave Airlie
Signed-off-by: Bas Nieuwenhuizen
---
sr
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/radv_device.c | 24 ++--
src/amd/vulkan/radv_private.h | 1 +
2 files changed, 23 insertions(+), 2 deletions(-)
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 2d89e8635e7..9a44f657a3c
From: Dave Airlie
This just aligns with how anv does it.
Signed-off-by: Dave Airlie
---
src/amd/vulkan/radv_formats.c | 76 ++-
1 file changed, 46 insertions(+), 30 deletions(-)
diff --git a/src/amd/vulkan/radv_formats.c b/src/amd/vulkan/radv_formats.c
Reviewed-by: Bas Nieuwenhuizen
On Mon, Jun 5, 2017 at 4:08 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> This just collapses a few per-stage things into a loop,
> shouldn't affect anything.
>
> Signed-off-by: Dave Airlie
> ---
> src/amd/
On Mon, Jun 5, 2017 at 5:40 PM, Jason Ekstrand wrote:
> On Mon, Jun 5, 2017 at 7:48 AM, Alex Smith
> wrote:
>>
>> As already done by RADV, this code is lifted straight from there.
>
>
> I think this is a good idea but I'm very confused by the code.
>
>>
>> Signed-off-by: Alex Smith
>> ---
>> sr
Reviewed-by: Bas Nieuwenhuizen
On Mon, Jun 5, 2017 at 2:47 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> This is just ported from radeonsi.
>
> Signed-off-by: Dave Airlie
> ---
> src/amd/vulkan/radv_image.c | 28 ++
> src/amd/vul
Series is
Reviewed-by: Bas Nieuwenhuizen
On Mon, Jun 5, 2017 at 3:12 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> In advance of GFX9 to reduce chances for regression, refactor
> this code out so adding the GFX9 changes will be more obvious.
>
> Signed-off-by: Dave Airlie
Reviewed-by: Bas Nieuwenhuizen
On Tue, Jun 6, 2017 at 12:37 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> radeonsi never uses 512 here anymore.
>
> Signed-off-by: Dave Airlie
> ---
> src/amd/vulkan/radv_device.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions
Redundant.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/radv_cmd_buffer.c | 4 +---
src/amd/vulkan/radv_private.h| 1 -
2 files changed, 1 insertion(+), 4 deletions(-)
diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index ed0aa8020ce..ca9d606a7ca
Divides are pretty slow, and this is in the hot path of a draw.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/radv_cmd_buffer.c | 11 ---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index
radeonsi doesn't have it anymore either.
Signed-off-by: Bas Nieuwenhuizen
Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
---
src/amd/vulkan/radv_query.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/src/amd/vulkan/radv_query.c b/src/amd/vulk
Simple refactor.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/radv_cmd_buffer.c | 29 ++---
1 file changed, 18 insertions(+), 11 deletions(-)
diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 6dfd52ea9d0..f3187e84d7f 100644
No sense checking each bit separately in the common case of none
being set.
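A sketch of the idea in the message above, assuming a hypothetical set of flush flags (the FLUSH_SKETCH_* names and count_flush_packets are invented for illustration and are not the radv flags or code): test the whole mask once before looking at individual bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flush flags, for illustration only. */
enum {
    FLUSH_SKETCH_A   = 1u << 0,
    FLUSH_SKETCH_B   = 1u << 1,
    FLUSH_SKETCH_C   = 1u << 2,
    FLUSH_SKETCH_ALL = FLUSH_SKETCH_A | FLUSH_SKETCH_B | FLUSH_SKETCH_C
};

/* Returns how many packets would be emitted for the given flags. */
static unsigned count_flush_packets(uint32_t flags)
{
    assert((flags & ~(uint32_t)FLUSH_SKETCH_ALL) == 0); /* unknown bits */

    /* Common case: nothing to flush, decided with a single comparison. */
    if (!flags)
        return 0;

    unsigned packets = 0;
    if (flags & FLUSH_SKETCH_A)
        packets++;
    if (flags & FLUSH_SKETCH_B)
        packets++;
    if (flags & FLUSH_SKETCH_C)
        packets++;
    return packets;
}
```

When no flag is set, the function returns after one branch instead of evaluating every per-bit test.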
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/si_cmd_buffer.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index
No functional changes.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/radv_cmd_buffer.c | 21 ++---
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index ca9d606a7ca..6dfd52ea9d0 100644
radv_prims_for_vertices(&cmd_buffer->state.pipeline->graphics.prim_vertex_count,
> draw_vertex_count);
> + instance_less_than_primgroup_size = num_prims <
> primgroup_size;
> + }
> +
> + multi_instances_smaller_than_primgroup = indirect_dr
This series is
Reviewed-by: Bas Nieuwenhuizen
On Wed, Jun 7, 2017 at 1:18 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> I want to use these in the pipeline setup stage.
>
> Signed-off-by: Dave Airlie
> ---
> src/amd/vulkan/radv_cmd_buffer.c | 24 -
This series is
Reviewed-by: Bas Nieuwenhuizen
On Wed, Jun 7, 2017 at 1:31 AM, Grazvydas Ignotas wrote:
> Most functions are only inspecting nir, so nir related arguments can be
> marked const. Some more can be done if/when some nir changes are
> accepted.
>
> Signed-off-by: Gr
Reviewed-by: Bas Nieuwenhuizen
On Thu, Jun 8, 2017 at 9:04 PM, Dave Airlie wrote:
> From: Dave Airlie
>
> We have some features that seem to slow things down or cause other
> possible undesirable side effects, but it would be nice to test
> games etc with them easily.
>
>
Reviewed-by: Bas Nieuwenhuizen
On Fri, Jun 9, 2017 at 3:40 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> The shader reads the descriptor to decide if it should take the
> fmask value, however we weren't initing it always, which meant
> random crap, esp with MSAA depth
On Sat, Jun 10, 2017 at 1:50 AM, Connor Abbott
wrote:
> From: Connor Abbott
>
> Signed-off-by: Connor Abbott
> ---
> src/amd/common/ac_nir_to_llvm.c | 75
> +
> src/amd/vulkan/radv_device.c| 8 +
> src/amd/vulkan/radv_pipeline.c | 2 ++
> 3 fi
Merge this with patch 14?
On Sat, Jun 10, 2017 at 1:47 AM, Connor Abbott
wrote:
> From: Connor Abbott
>
> To match si_shader_context.
>
> Signed-off-by: Connor Abbott
> ---
> src/amd/common/ac_llvm_build.c | 2 ++
> src/amd/common/ac_llvm_build.h | 2 ++
> 2 files changed, 4 insertions(+)
>
>
The series is
Reviewed-by: Bas Nieuwenhuizen
On Sat, Jun 10, 2017 at 5:53 PM, Grazvydas Ignotas wrote:
> The register header (and radeonsi comment) states V_411_SRC_ADDR_TC_L2
> is for CIK+ only, so let's assert on earlier ASICs.
>
> Signed-off-by: Grazvydas Ignotas
> -
Slightly faster than bpermute, and seems supported since at least
LLVM 3.9.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/common/ac_llvm_build.c | 78 +-
1 file changed, 54 insertions(+), 24 deletions(-)
diff --git a/src/amd/common/ac_llvm_build.c b/src
Slightly faster than bpermute, and seems supported since at least
LLVM 3.9.
v2: Since this supersedes bpermute, remove the bpermute code.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/common/ac_llvm_build.c | 47
src/amd/common/ac_llvm_build.h
The amdgpu winsys has radv_amdgpu_winsys.h, and getting another
radv_radeon_winsys.h
in there for a radeon winsys would be awkward.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/Makefile.sources| 2 +-
src/amd/vulkan/radv_cmd_buffer.c | 2
Don't rename the enums and constants used for metadata.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/radv_cmd_buffer.c | 4 +--
src/amd/vulkan/radv_descriptor_set.c | 2 +-
src/amd/vulkan/radv_device.c | 48 +--
sr
For preventing confusion with a radeon winsys.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/radv_cmd_buffer.c | 26 ++---
src/amd/vulkan/radv_cs.h | 24 ++---
src/amd/vulkan/radv_descriptor_set.c | 18 ++--
src/amd/vulkan
Reviewed-by: Bas Nieuwenhuizen
We shouldn't chain when use_ib_bos is false and embed secondary
command buffers directly in the primary buffer as well, so no handling
of chaining is needed.
On Sun, Jun 11, 2017 at 4:03 PM, Grazvydas Ignotas wrote:
> Fixes trace dumping crash for SI
Reviewed-by: Bas Nieuwenhuizen
On Mon, Jun 12, 2017 at 9:54 PM, Dave Airlie wrote:
> From: Dave Airlie
>
> Coverity warned about dead code below, as meta_va was being shadowed.
>
> Signed-off-by: Dave Airlie
> ---
> src/amd/vulkan/radv_image.c | 2 +-
> 1 file ch
Reviewed-by: Bas Nieuwenhuizen
On Mon, Jun 12, 2017 at 9:49 PM, Dave Airlie wrote:
> From: Dave Airlie
>
> Coverity pointed out this was returning uninitialised.
>
> Signed-off-by: Dave Airlie
> ---
> src/amd/vulkan/radv_device.c | 5 +++--
> 1 file changed, 3 in
With the unrelated line removed,
Reviewed-by: Bas Nieuwenhuizen
On Tue, Jun 13, 2017 at 1:54 AM, Dave Airlie wrote:
> On 13 June 2017 at 09:53, Dave Airlie wrote:
>> From: Dave Airlie
>>
>> has_hw_decode is assigned twice.
>>
>> Pointed out by coverity.
Reviewed-by: Bas Nieuwenhuizen
On Tue, Jun 13, 2017 at 1:53 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> coverity complains about the deref before NULL check.
>
> Signed-off-by: Dave Airlie
> ---
> src/amd/vulkan/radv_cmd_buffer.c | 4 ++--
> 1 file changed, 2 in
Otherwise the flag is borderline useless.
---
src/amd/vulkan/radv_pipeline_cache.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/amd/vulkan/radv_pipeline_cache.c
b/src/amd/vulkan/radv_pipeline_cache.c
index fc99b43fff0..458fe998b18 100644
--- a/src/amd/vulkan/radv_pipe
r-b for the series.
On Mon, Oct 16, 2017 at 2:15 PM, Samuel Pitoiset
wrote:
> Signed-off-by: Samuel Pitoiset
> ---
> src/amd/vulkan/radv_cmd_buffer.c | 5 +
> 1 file changed, 5 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c
> b/src/amd/vulkan/radv_cmd_buffer.c
> index 2252
, modify, merge, publish,
> + * distribute, sub license, and/or sell copies of the Software, and to
> + * permit persons to whom the Software is furnished to do so, subject to
> + * the following conditions:
> + *
> + * The above copyright notice and this permission notic
oing those also (the latter is when *NOT* to lower
indirect outputs though?). Everything besides TCS uses our custom
vector buffering code, so is susceptible to LLVM indirect addressing
bugs. We might be reducing this for LS/ES by directly writing to
LDS/memory instead of to internal vars first,
r-b
On Tue, Oct 17, 2017 at 12:02 PM, Samuel Pitoiset
wrote:
> Missed that when I allowed waves to be launched out-of-order.
>
> Signed-off-by: Samuel Pitoiset
> ---
> src/amd/vulkan/radv_cmd_buffer.c | 24 +---
> 1 file changed, 13 insertions(+), 11 deletions(-)
>
> diff --
r-b
On Tue, Oct 17, 2017 at 11:04 AM, Samuel Pitoiset
wrote:
> Signed-off-by: Samuel Pitoiset
> ---
> src/amd/vulkan/radv_meta_bufimage.c | 62
> -
> 1 file changed, 26 insertions(+), 36 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_meta_bufimage.c
> b/
r-b
On Wed, Oct 18, 2017 at 5:02 AM, Timothy Arceri wrote:
> Fixes: d1c9f30d7ff7 "radv: add radv_create_shaders() helper"
> ---
> src/amd/vulkan/radv_pipeline.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.
Thanks, pushed both patches.
On Wed, Oct 18, 2017 at 3:47 PM, Alex Smith wrote:
> Fixes: 7d45d22fdd2e ("radv: switch to using radv_create_shaders()")
> Signed-off-by: Alex Smith
> ---
> src/amd/vulkan/radv_pipeline.c | 7 ++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a
r-b
On Wed, Oct 18, 2017 at 2:09 PM, Samuel Pitoiset
wrote:
> Move it to radv_cmd_buffer_flush_state() because if
> rasterizerDiscardEnable is true, the flags are not cleared.
>
> Signed-off-by: Samuel Pitoiset
> ---
> src/amd/vulkan/radv_cmd_buffer.c | 4 ++--
> 1 file changed, 2 insertions(+)
I'd prefer not to. The current size is already huge when you consider
that a lot of applications use pretty small command buffers, adding
another 12k per command buffer is a bit much. I'd prefer not having
that overhead, since the GL_vs_VK benchmarks were IIRC not really
representative.
On Wed, Oc
Interesting that we already had RADV_CMD_DIRTY_INDEX_BUFFER. r-b for the series.
On Wed, Oct 18, 2017 at 2:17 PM, Samuel Pitoiset
wrote:
> It can only be changed when CmdBindIndexBuffer() is called
> or when a secondary buffer is used. Though not always, but
> let's re-emit the packets in this si
---
src/amd/common/ac_binary.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/amd/common/ac_binary.c b/src/amd/common/ac_binary.c
index 1bf52c78328..cf0125c415f 100644
--- a/src/amd/common/ac_binary.c
+++ b/src/amd/common/ac_binary.c
@@ -252,6 +252,7 @@ void ac_shader_binary_read_config(s
---
src/amd/common/ac_nir_to_llvm.c | 254
1 file changed, 178 insertions(+), 76 deletions(-)
diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index f01ca8799b9..c6c56f30b81 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/s
---
src/amd/common/ac_nir_to_llvm.c | 27 ---
1 file changed, 16 insertions(+), 11 deletions(-)
diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index c6c56f30b81..67945a353e8 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/a
---
src/amd/common/ac_nir_to_llvm.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/amd/common/ac_nir_to_llvm.h b/src/amd/common/ac_nir_to_llvm.h
index 8a1e64ce7e1..66d539dec47 100644
--- a/src/amd/common/ac_nir_to_llvm.h
+++ b/src/amd/common/ac_nir_to_llvm.h
@@ -154,7 +154
---
src/amd/common/ac_nir_to_llvm.h | 1 +
src/amd/vulkan/radv_pipeline.c | 29 +
src/amd/vulkan/radv_shader.c| 17 ++---
src/amd/vulkan/radv_shader.h| 5 +++--
4 files changed, 39 insertions(+), 13 deletions(-)
diff --git a/src/amd/common/ac_nir
---
src/amd/common/ac_nir_to_llvm.c | 31 +--
1 file changed, 17 insertions(+), 14 deletions(-)
diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 38f47b34e10..f01ca8799b9 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/comm