Reviewed-by: Karol Herbst
On Sun, Oct 7, 2018 at 11:50 PM Ilia Mirkin wrote:
>
> The current state tracker can generate these sometimes. Fixing this is
> more involved, and due to some integer math we can generate
> divisions-by-zero.
>
> Signed-off-by: Ilia Mirk
but this moves all single color blits to the 3d blitter, right?
On Sun, Oct 7, 2018 at 11:50 PM Ilia Mirkin wrote:
>
> For some reason the 2d engine can't handle this. Red formats get special
> treatment there, so perhaps related.
>
> Fixes dEQP-GLES3 tests of the form:
>
> dEQP-GLES3.functional.
just asking, because your commit message more or less hinted towards
that being about r -> srgb conversions, but it makes sense that any
single component blits wouldn't work.
Reviewed-by: Karol Herbst
On Tue, Oct 9, 2018 at 1:03 PM Ilia Mirkin wrote:
>
> Ones that blit to srgb,
yeah, no. Everything should be handled. Got a little bit confused with
the luminance check.
On Tue, Oct 9, 2018 at 4:07 PM Ilia Mirkin wrote:
>
> What other single component formats are there that aren't covered by the
> earlier ifs?
>
> On Tue, Oct 9, 2018, 07:49
Fixes: 2f52925f5c60c72c9389bfdc122c3d5f8e15b25f
"nv50/ir: move a * b -> a << log2(b) code into createMul()"
Reviewed-by: Rhys Perry
Signed-off-by: Karol Herbst
---
src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 2 +-
1 file changed, 1 insertion(+), 1 dele
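For readers unfamiliar with the "a * b -> a << log2(b)" rewrite named in the Fixes line above, here is a minimal sketch in plain C. It is not the createMul() code from nv50_ir_peephole.cpp; the helper names (is_pow2, log2_pow2, mul_by_const) are made up for illustration, and it only covers unsigned 32-bit operands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if v has exactly one bit set. */
static bool is_pow2(uint32_t v)
{
   return v != 0 && (v & (v - 1)) == 0;
}

/* Hypothetical helper: log2 of a value known to be a power of two. */
static unsigned log2_pow2(uint32_t v)
{
   unsigned shift = 0;

   assert(is_pow2(v));
   while (v > 1) {
      v >>= 1;
      shift++;
   }
   return shift;
}

/* Multiply a by b, using a shift when b is a power of two.
 * For unsigned 32-bit values both forms wrap identically. */
uint32_t mul_by_const(uint32_t a, uint32_t b)
{
   if (is_pow2(b))
      return a << log2_pow2(b);
   return a * b;
}
```

The power-of-two check is what makes the rewrite valid; any other multiplier falls back to the plain multiply.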
In function 'uint8_t nv50_ir::getTEXSMask(uint8_t)':
warning: control reaches end of non-void function [-Wreturn-type]
Reported-by: Moiman@freenode
Fixes: f821e80213e38e93f96255b3deacb737a600ed40
"gm107/ir: use scalar tex instructions where possible"
Signed-off-by: Ka
On Wed, Nov 7, 2018 at 3:21 PM Ilia Mirkin wrote:
>
> Reviewed-by: Ilia Mirkin
>
> Although I'd rather the return go into the "default:" case, after the
> assert, which is the reason for the warning.
sure, I'll move it.
> On Wed, Nov 7, 2018 at 7:50 AM Ka
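The review comment above (put the return into the "default:" case, after the assert) can be sketched roughly as follows. This is not the real getTEXSMask(); the function name and the mask values are placeholders chosen for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a mask lookup; the mapping is illustrative. */
uint8_t example_mask_to_index(uint8_t mask)
{
   switch (mask) {
   case 0x1: return 0x0;
   case 0x2: return 0x1;
   case 0x3: return 0x2;
   default:
      assert(0 && "invalid mask");
      /* In builds with NDEBUG the assert compiles away, so without
       * this return control would reach the end of a non-void
       * function, which is what -Wreturn-type warns about. */
      return 0;
   }
}
```

Keeping the return inside the default case means every path through the switch returns a value, with or without assertions enabled.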
you need from
> nir_lower_system_values.c to master.
>
> Thank you,
> Pam
>
> On Thu, Jun 28, 2018 at 5:50 AM, Karol Herbst wrote:
>>
>> Hi,
>>
>> if the changes inside "src/compiler/nir/nir_lower_system_values.c" are
>> extracted into a separate patch, this
On Sun, Nov 11, 2018 at 10:48 PM Jason Ekstrand wrote:
>
> On Sun, Nov 11, 2018 at 3:35 PM Plamena Manolova
> wrote:
>>
>> Lowering shader variables which depend on the local work group
>> size being available in nir_lower_system_values is only possible
>> if the local work group size isn't vari
an)
On Mon, Nov 12, 2018 at 12:37 AM Jason Ekstrand wrote:
>
> On November 11, 2018 16:36:16 Karol Herbst wrote:
>
> > On Sun, Nov 11, 2018 at 10:48 PM Jason Ekstrand
> > wrote:
> >>
> >> On Sun, Nov 11, 2018 at 3:35 PM Plamena Manolova
> >> wr
it shouldn't make a difference. This pass lowers load_derefs into
whatever we want here. If we handle the system value explicitly
"sysval" gets set. If not, we fetch the op through
nir_intrinsic_from_system_value and do the load based on that. We just
take a different path, but fundamentally we do
Reviewed-by: Karol Herbst
On Tue, Nov 13, 2018 at 3:51 AM Jason Ekstrand wrote:
>
> On Mon, Nov 12, 2018 at 6:10 PM Karol Herbst wrote:
>>
>> it shouldn't make a difference. This pass lowers load_derefs into
>> whatever we want here. If we handle the system value ex
this helps reduce the overall code changes when a bit_size parameter is
added to nir_load_system_value
Reviewed-by: Jason Ekstrand
Reviewed-by: Eric Anholt
Signed-off-by: Karol Herbst
---
src/amd/vulkan/radv_meta_buffer.c| 8
src/amd/vulkan/radv_meta_bufimage.c
this allows replacing some nir_load_system_value calls with the specific
system value constructor
Reviewed-by: Jason Ekstrand
Reviewed-by: Eric Anholt
Signed-off-by: Karol Herbst
---
src/compiler/nir/nir_builder_opcodes_h.py | 21 +++--
1 file changed, 19 insertions(+), 2
or how the derefs for pointers are created.
But it feels like we require fewer changes overall with my new approach
to support physical pointers inside nir and vtn.
Karol Herbst (18):
nir: add const_index parameters to system value builder function
nir: replace nir_load_system_value calls
instructions
that take more than 4 srcs (ie vec8 and vec16), nir_build_alu() has
nir_build_alu_tail() split out and re-used by nir_build_alu2() (which is
used for the > 4 src args case).
Signed-off-by: Rob Clark
Signed-off-by: Karol Herbst
---
src/compiler/nir/nir.h |
Signed-off-by: Karol Herbst
---
src/compiler/glsl_types.h | 34 ++
src/compiler/nir_types.h | 30 +-
2 files changed, 35 insertions(+), 29 deletions(-)
diff --git a/src/compiler/glsl_types.h b/src/compiler/glsl_types.h
index
Signed-off-by: Karol Herbst
---
src/compiler/spirv/spirv_to_nir.c | 3 +++
src/compiler/spirv/vtn_private.h | 2 ++
2 files changed, 5 insertions(+)
diff --git a/src/compiler/spirv/spirv_to_nir.c
b/src/compiler/spirv/spirv_to_nir.c
index 2c214324774..650eb6a977c 100644
--- a/src/compiler
From: Rob Clark
For pointers we'll need to add another caller, plus in addition a
type_align() fxn ptr. So just simplify things and pass the
lower_io_state to get_io_offset().
Signed-off-by: Karol Herbst
---
src/compiler/nir/nir_lower_io.c | 12 ++--
1 file changed, 6 inser
Signed-off-by: Karol Herbst
---
src/compiler/nir/nir.c | 4
src/compiler/nir/nir.h | 1 +
src/compiler/nir/nir_print.c | 2 ++
src/compiler/spirv/vtn_private.h | 1 +
src/compiler/spirv/vtn_variables.c | 4
5 files changed, 12 insertions(+)
diff --git a
v2: fix for specialization constants as well
Signed-off-by: Karol Herbst
---
src/compiler/spirv/spirv_to_nir.c | 20
src/compiler/spirv/vtn_alu.c | 11 +++
2 files changed, 31 insertions(+)
diff --git a/src/compiler/spirv/spirv_to_nir.c
b/src/compiler/spirv
Signed-off-by: Karol Herbst
---
src/compiler/spirv/vtn_opencl.c | 59 +
1 file changed, 59 insertions(+)
diff --git a/src/compiler/spirv/vtn_opencl.c b/src/compiler/spirv/vtn_opencl.c
index 089e6168fd8..ecaca4c17bc 100644
--- a/src/compiler/spirv/vtn_opencl.c
Signed-off-by: Karol Herbst
---
src/compiler/glsl_types.cpp | 48 +
src/compiler/glsl_types.h | 10
src/compiler/nir_types.cpp | 12 ++
src/compiler/nir_types.h| 4
4 files changed, 74 insertions(+)
diff --git a/src/compiler
We need this for OpenCL kernels because we have to apply C rules for alignment
and padding inside structs and for this we also have to know if a struct is
packed or not.
Signed-off-by: Karol Herbst
---
src/compiler/glsl_types.cpp | 17 +++--
src/compiler/glsl_types.h
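To make the point about C rules for alignment and padding concrete, here is a small sketch of how field offsets could be computed for a packed versus a non-packed struct. It is not the glsl_types code; struct field, align_up and layout_struct are hypothetical names, and alignments are assumed to be powers of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one struct member. */
struct field {
   size_t size;   /* size in bytes */
   size_t align;  /* natural alignment in bytes, a power of two */
};

/* Round v up to the next multiple of a (a must be a power of two). */
static size_t align_up(size_t v, size_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Fill offsets[] for each field and return the total struct size.
 * A packed struct treats every field as 1-byte aligned, so no padding
 * is inserted between fields or at the end. */
size_t layout_struct(const struct field *fields, size_t count,
                     bool packed, size_t *offsets)
{
   size_t offset = 0;
   size_t max_align = 1;

   for (size_t i = 0; i < count; i++) {
      size_t a = packed ? 1 : fields[i].align;

      offset = align_up(offset, a);
      offsets[i] = offset;
      offset += fields[i].size;
      if (a > max_align)
         max_align = a;
   }
   /* Tail padding so that arrays of the struct keep fields aligned. */
   return align_up(offset, max_align);
}
```

For a struct holding a char, an int and a short, this yields offsets 0, 4, 8 and size 12 without packing, and offsets 0, 1, 5 and size 7 when packed, which is why the packed flag has to be known.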
Signed-off-by: Karol Herbst
---
src/compiler/nir/nir.h| 8
src/compiler/nir/nir_clone.c | 1 +
src/compiler/nir/nir_serialize.c | 2 ++
src/compiler/spirv/spirv_to_nir.c | 15 +--
4 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/src
Not complete, mostly just adding things as I encounter them in CTS. But
not getting far enough yet to hit most of the OpenCL.std instructions.
Anyway, this is better than nothing and covers the most common builtins.
Signed-off-by: Karol Herbst
---
src/compiler/nir/meson.build | 1
From: Rob Clark
For cl we can have structs with 8/16/32/64 bit scalar types (as well as,
ofc, arrays/structs/etc), which are padded according to 'C' rules. So
for lowering struct deref's we need to not just consider a field's size,
but also its alignment.
Signed-off-b
ed across all threads of a work group
the solution everybody could be happy with is to rename "global" to
"private" and use "global" later for memory usually stored within system
accessible memory (be it VRAM or system RAM if keeping SVM in mind).
Signed-
From: Rob Clark
changes by Karol:
v2: make compatible with 64 bit floats
fix isfinite
v3: use snake_case.
Signed-off-by: Karol Herbst
---
src/compiler/spirv/vtn_alu.c | 32
1 file changed, 32 insertions(+)
diff --git a/src/compiler/spirv/vtn_alu.c b/src
From: Rob Clark
vtn supports these, so don't squawk if the user is happy with enabling
these.
Signed-off-by: Karol Herbst
---
src/compiler/shader_info.h | 3 +++
src/compiler/spirv/spirv_to_nir.c | 16 +---
src/compiler/spirv/vtn_variables.c | 6 --
3 files change
Signed-off-by: Karol Herbst
---
src/amd/vulkan/radv_meta_buffer.c | 8 ++--
src/amd/vulkan/radv_meta_bufimage.c | 16
src/amd/vulkan/radv_meta_fast_clear.c | 4 +-
src/amd/vulkan/radv_meta_resolve_cs.c | 4 +-
src/amd/vulkan/radv_query.c
values.
Also this allows for further validation
Signed-off-by: Karol Herbst
---
src/compiler/nir/nir.h | 3 +++
src/compiler/nir/nir_intrinsics.py | 15 ++-
src/compiler/nir/nir_intrinsics_c.py | 6 +-
src/nouveau/meson.build | 25
e patch feels rather hacky, but it is
much smaller than what we had with the fat ptr approach and I don't have to add
a new phys_pointer value type, which made things a lot easier.
Signed-off-by: Karol Herbst
---
src/compiler/nir/nir.c| 4 +-
src/compiler/nir/nir.h
2. CL inputs match uniforms in most ways and we can just take advantage of most
of nir_lower_io
Signed-off-by: Karol Herbst
---
src/compiler/spirv/spirv_to_nir.c | 32 +++
1 file changed, 32 insertions(+)
diff --git a/src/compiler/spirv/spirv_to_nir.c
b/src/com
Signed-off-by: Karol Herbst
---
src/compiler/spirv/spirv_to_nir.c | 5 +-
src/compiler/spirv/vtn_alu.c | 187 +-
src/compiler/spirv/vtn_private.h | 3 +
3 files changed, 115 insertions(+), 80 deletions(-)
diff --git a/src/compiler/spirv/spirv_to_nir.c
b
2018 at 3:11 PM Jason Ekstrand wrote:
>>
>> Reviewed-by: Jason Ekstrand
>> Cc: mesa-sta...@lists.freedesktop.org
>>
>> On Tue, Nov 13, 2018 at 9:48 AM Karol Herbst wrote:
>>>
>>> v2: fix for specialization constants as well
>>>
Reviewed-by: Karol Herbst
On Wed, Nov 14, 2018 at 12:23 AM Jason Ekstrand wrote:
>
> The pattern of adding or multiplying an integer by an immediate is
> fairly common especially in deref chain handling. This adds a helper
> for it and uses it a few places. The advantage to the he
I like the general idea, we just shouldn't rely too much on the type
size later on, especially in regards to CL where we can have unaligned
load/stores especially for packed structs.
Acked-by: Karol Herbst
On Wed, Nov 14, 2018 at 12:24 AM Jason Ekstrand wrote:
>
> This also changes s
somehow it doesn't really look like it is worth the effort, as the
generated shaders are worse on average?
On Fri, Jul 6, 2018 at 10:32 PM, Rhys Perry wrote:
> This patch doesn't touch NTID since it seems very difficult (or
> impossible) to generate. Seemingly because the state tracker or glsl
>
okay right, if loading those special regs is indeed more expensive
than doing the read + a few extbf then I see the point of this
optimization
On Sat, Jul 7, 2018 at 2:46 AM, Ilia Mirkin wrote:
> Are they? Fewer special reg loads = better...
>
> On Fri, Jul 6, 2018 at 8:31 PM, Kar
anyway, I think it might make sense to take a look at the shaders hurt
most as I suspect there might be a way to improve the situation a
little
On Sat, Jul 7, 2018 at 3:38 AM, Karol Herbst wrote:
> okay right, if loading those special regs is indeed more expensive
> than doing the read +
s should be fine in the end.
Reviewed-by: Karol Herbst
> On Sat, Jul 7, 2018 at 2:46 AM, Karol Herbst wrote:
>> anyway, I think it might make sense to take a look at the shaders hurt
>> most as I suspect there might be a way to improve the situation a
>> little
>>
On Sun, Jul 8, 2018 at 1:01 AM, Rhys Perry wrote:
> Previously, a phi node's sources were implicitly ordered by the inbound edge
> order. This patch changes that so that a phi node instead has a basic block
> stored for each source in a deque.
>
> There are no regressions in Unigine Heaven, Valley
most of the patches can be reviewed independently, these are just some
smaller patches I think we can upstream already.
Karol Herbst (4):
compiler: add missing entries to gl_system_value_name
nir: move lowering of SYSTEM_VALUE_LOCAL_GROUP_SIZE into a function
nir/vtn: implement
From: Rob Clark
Otherwise nir_validate may complain about 8 bit floats, which do not exist.
Reviewed-by: Karol Herbst
Signed-off-by: Karol Herbst
---
src/compiler/spirv/spirv_to_nir.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/compiler/spirv/spirv_to_nir.c
b
Signed-off-by: Karol Herbst
---
src/compiler/nir/nir_lower_system_values.c | 7 +++
src/compiler/shader_enums.c| 1 +
src/compiler/shader_enums.h| 1 +
src/compiler/spirv/vtn_variables.c | 4
4 files changed, 13 insertions(+)
diff --git a/src
From: Rob Clark
Reviewed-by: Karol Herbst
Signed-off-by: Karol Herbst
---
src/compiler/spirv/spirv_to_nir.c | 9 +
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/src/compiler/spirv/spirv_to_nir.c
b/src/compiler/spirv/spirv_to_nir.c
index aad4c713f9e..413fbf481c1 100644
we already have this code duplicated and we will need it for the global
group size as well
Signed-off-by: Karol Herbst
---
src/compiler/nir/nir_lower_system_values.c | 29 +++---
1 file changed, 14 insertions(+), 15 deletions(-)
diff --git a/src/compiler/nir
Signed-off-by: Karol Herbst
---
src/compiler/spirv/spirv_to_nir.c | 15 +--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/src/compiler/spirv/spirv_to_nir.c
b/src/compiler/spirv/spirv_to_nir.c
index 413fbf481c1..235003e872a 100644
--- a/src/compiler/spirv
also reorder to match the gl_system_value enum.
It is weird that the STATIC_ASSERT doesn't trigger though.
Signed-off-by: Karol Herbst
---
src/compiler/shader_enums.c | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/src/compiler/shader_enums.c b/src/com
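On the remark above that the STATIC_ASSERT doesn't trigger: one possible explanation, assuming the name table is built with designated initializers (an assumption, the snippet does not show the table), is that the array length is set by the highest index used, so a missing entry in the middle leaves a NULL hole without changing the size. The enum and table below are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical enum standing in for a system value list. */
enum example_value {
   EXAMPLE_A,
   EXAMPLE_B,
   EXAMPLE_C,
   EXAMPLE_MAX
};

/* Name table using designated initializers. EXAMPLE_B is deliberately
 * missing: the array still has EXAMPLE_MAX elements because its length
 * is set by the highest index, and the gap is filled with NULL. */
static const char *const example_names[] = {
   [EXAMPLE_A] = "EXAMPLE_A",
   [EXAMPLE_C] = "EXAMPLE_C",
};

/* This size check passes even though one entry is missing. */
_Static_assert(sizeof(example_names) / sizeof(example_names[0]) ==
               EXAMPLE_MAX, "name table size mismatch");

/* Look up a name, falling back for holes in the table. */
const char *example_name(enum example_value v)
{
   assert(v < EXAMPLE_MAX);
   return example_names[v] ? example_names[v] : "UNKNOWN";
}
```

A size check of this kind only catches a missing entry when it is the last one in the enum.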
From: Rob Clark
Signed-off-by: Karol Herbst
---
src/compiler/nir/nir.c | 2 ++
src/compiler/nir/nir_intrinsics.py | 1 +
src/compiler/shader_enums.c| 1 +
src/compiler/shader_enums.h| 1 +
src/compiler/spirv/vtn_variables.c | 4
5 files changed, 9 insertions
also move some of the GLSL builtins over we will need for implementing
some OpenCL builtins
Signed-off-by: Karol Herbst
---
src/compiler/Makefile.sources | 2 +
src/compiler/nir/meson.build | 2 +
src/compiler/nir/nir_builtin_builder.c | 67 ++
src
OpenCL knows vectors of size 8 and 16.
Signed-off-by: Karol Herbst
---
src/compiler/nir/nir.c| 14
src/compiler/nir/nir.h| 34 ++-
src/compiler/nir/nir_builder.h| 18 ++
src/compiler/nir
Signed-off-by: Karol Herbst
---
src/compiler/spirv/vtn_cfg.c | 25 ++---
1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/src/compiler/spirv/vtn_cfg.c b/src/compiler/spirv/vtn_cfg.c
index ed1ab5d1c2c..2b01ede6f81 100644
--- a/src/compiler/spirv/vtn_cfg.c
+++ b
This time all of the patches can be reviewed independently.
Karol Herbst (5):
nir/spirv: print id for unsupported builtins
nir: add builtin builder
nir: fix printing of vec16 type
nir: prepare for bumping up max components to 16
nir/spirv: handle functions with scalar and vector params
Signed-off-by: Karol Herbst
---
src/compiler/spirv/vtn_variables.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/compiler/spirv/vtn_variables.c
b/src/compiler/spirv/vtn_variables.c
index c86416495b6..67b4d59b9fe 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src
Fixes: 2f181c8c183cc8b4d0450789bb20c2be48d32db3
"glsl_types: vec8/vec16 support"
Signed-off-by: Karol Herbst
---
src/compiler/nir/nir_print.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/compiler/nir/nir_print.c b/src/compiler/nir/nir_print.c
index 18860db0058..4
There are no fixed sized array arguments in C, those are simply pointers
to unsized arrays and as the size is passed in anyway, just rely on that.
where possible calls are replaced by nir_channel and nir_channels.
Signed-off-by: Karol Herbst
---
src/amd/vulkan/radv_meta_blit2d.c | 9
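The statement above about fixed sized array arguments in C can be illustrated with a short sketch: a parameter declared as an array of 4 is adjusted to a plain pointer, so both functions below have the same type and the explicit count is the only size information. The function names are hypothetical and unrelated to the nir helpers touched by the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Declared with a "fixed size" array parameter; the 4 is ignored by the
 * type system and comps is really a const unsigned pointer. */
unsigned sum_fixed(const unsigned comps[4], unsigned num)
{
   unsigned total = 0;

   assert(comps != NULL);
   for (unsigned i = 0; i < num; i++)
      total += comps[i];
   return total;
}

/* The same function written with an explicit pointer parameter. */
unsigned sum_ptr(const unsigned *comps, unsigned num)
{
   unsigned total = 0;

   assert(comps != NULL);
   for (unsigned i = 0; i < num; i++)
      total += comps[i];
   return total;
}

/* Both functions are compatible with this one pointer type. */
typedef unsigned (*sum_fn)(const unsigned *, unsigned);

/* Call either variant with a two-element array and an explicit count. */
unsigned sum_two(sum_fn fn)
{
   const unsigned xy[2] = { 3, 4 };

   return fn(xy, 2);
}
```

Since the declared length carries no meaning, relying on the count that is passed in anyway loses nothing.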
On Fri, Jul 13, 2018 at 4:04 AM, Jason Ekstrand wrote:
> On Thu, Jul 12, 2018 at 6:48 PM Karol Herbst wrote:
>>
>> There are no fixed sized array arguments in C, those are simply pointers
>> to unsized arrays and as the size is passed in anyway, just rely on that.
>>
On Sat, Jul 14, 2018 at 8:44 PM, Ilia Mirkin wrote:
> On Fri, Jun 29, 2018 at 10:27 PM, Karol Herbst wrote:
>> We already guarded all OP_SULDP against out of bound accesses, but those
>> ended up just reusing whatever value was stored in the dest registers.
>>
On Sun, Jul 15, 2018 at 4:29 AM, Jason Ekstrand wrote:
> On Thu, Jul 12, 2018 at 4:30 AM Karol Herbst wrote:
>>
>> also move some of the GLSL builtins over we will need for implementing
>> some OpenCL builtins
>>
>> Signed-off-by: Karol Herbst
>>
u, Jul 12, 2018 at 4:30 AM Karol Herbst wrote:
>>
>> Signed-off-by: Karol Herbst
>> ---
>> src/compiler/spirv/vtn_cfg.c | 25 ++---
>> 1 file changed, 18 insertions(+), 7 deletions(-)
>>
>> diff --git a/src/compiler/spirv/vtn_cfg.c b/src/
't help, because we have plenty of those
"blitter->fp[targ][mode]" accesses to cache the generated shader. Sure
we can mask the high bit away, but somehow I don't think this will be
less work than what I've already done.
> On Sun, Jun 24, 2018 at 5:00 PM, Karol Herbst w
well, I could do something like that instead though:
bool int_clamp = mode == NV50_BLIT_MODE_INT_CLAMP;
mode = NV50_BLIT_MODE_PASS;
...
if (int_clamp)
ureg_UMIN(ureg, data, ureg_src(data), ureg_imm1u(ureg, 0x7fff));
...
On Sun, Jul 15, 2018 at 12:01 PM, Karol Herbst wrote:
> On Sat,
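The UMIN in the snippet above is an unsigned minimum against an immediate. As a plain C sketch of that clamp, assuming the intent is to keep an unsigned value within the positive range of a signed 32-bit integer (the bound INT32_MAX and the function name are assumptions, not taken from the mail):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: clamp an unsigned value so that it still fits
 * into a signed 32-bit integer. The bound is an assumed value. */
uint32_t clamp_uint_to_int_max(uint32_t value)
{
   const uint32_t max = (uint32_t)INT32_MAX;
   uint32_t clamped = value < max ? value : max;  /* unsigned min */

   assert(clamped <= max);
   return clamped;
}
```

Values already inside the range pass through unchanged; only values above the bound are altered.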
On Sun, Jul 15, 2018 at 4:46 PM, Ilia Mirkin wrote:
> On Sun, Jul 15, 2018 at 6:03 AM, Karol Herbst wrote:
>> well, I could do something like that instead though:
>>
>> bool int_clamp = mode == NV50_BLIT_MODE_INT_CLAMP;
>> mode = NV50_BLIT_MODE_PASS;
>
> Is
From: Tobias Klausmann
Whether we wait on an inverted rendering condition or not, we should not render
on a passed query.
This fixes the CTS test case 'KHR-GL45.conditional_render_inverted.functional'.
Signed-off-by: Karol Herbst
---
src/gallium/drivers/nouveau/nvc0/nvc0_query.c
of those reworked and merged.
Boyan Ding (3):
gk110/ir: Add rcp f64 implementation
gk110/ir: Add rsq f64 implementation
gk110/ir: Use the new rcp/rsq in library
Karol Herbst (8):
nouveau: disable BGRA4 formats
nvc0/ir: return 0 in imageLoad on incomplete textures
gallium: add
From: Karol Herbst
BGRA4 formats need special handling as we can texture from them, but we
can't render to.
Signed-off-by: Karol Herbst
---
src/gallium/drivers/nouveau/nv50/nv50_formats.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_formats
From: Boyan Ding
Signed-off-by: Karol Herbst
---
.../drivers/nouveau/codegen/lib/gk110.asm | 69 ++-
.../drivers/nouveau/codegen/lib/gk110.asm.h | 42 ++-
2 files changed, 109 insertions(+), 2 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/lib
From: Boyan Ding
Signed-off-by: Karol Herbst
---
.../drivers/nouveau/codegen/lib/gk110.asm | 152 +-
.../drivers/nouveau/codegen/lib/gk110.asm.h | 87 +-
2 files changed, 235 insertions(+), 4 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/lib
Signed-off-by: Karol Herbst
---
.../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 33 +--
.../nouveau/codegen/nv50_ir_lowering_nvc0.h | 1 +
2 files changed, 31 insertions(+), 3 deletions(-)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
b/src
From: Karol Herbst
this way Nouveau can report 128 inputs, but only 124 varyings.
Fixes: 'KHR-GL45.limits.max_fragment_input_components'
Signed-off-by: Karol Herbst
---
src/gallium/auxiliary/gallivm/lp_bld_limits.h| 1 +
src/gallium/auxiliary/tgsi/tgsi_exec.h |
Signed-off-by: Karol Herbst
---
.../drivers/nouveau/codegen/lib/gm107.asm | 169 +-
.../drivers/nouveau/codegen/lib/gm107.asm.h | 103 ++-
.../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 2 +-
3 files changed, 270 insertions(+), 4 deletions(-)
diff --git a/src
this fixes cases where layers of a 3d image are used as 2d images inside
load/store operations.
fixes
spec@arb_shader_image_load_store@layer image3D/non-layered binding
shader_image_load_store.non-layered_binding
Signed-off-by: Karol Herbst
---
src/gallium/drivers/nouveau/nv50/nv50_miptree.c
From: Boyan Ding
v2: (Karol Herbst )
* fix Value setup for the builtins
Signed-off-by: Karol Herbst
---
.../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 36 +++
.../nouveau/codegen/nv50_ir_lowering_nvc0.h | 1 +
2 files changed, 37 insertions(+)
diff --git a/src/gallium
apparently fixes packed_depth_stencil.blit.depth32f_stencil8
Signed-off-by: Karol Herbst
---
src/gallium/drivers/nouveau/nvc0/nvc0_surface.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
b/src/gallium/drivers/nouveau
From: Rhys Perry
Signed-off-by: Karol Herbst
---
.../drivers/nouveau/nvc0/nvc0_compute.c | 32 +++
.../drivers/nouveau/nvc0/nvc0_context.h | 4 +++
.../drivers/nouveau/nvc0/nvc0_query_hw.c | 7 ++--
.../drivers/nouveau/nvc0/nve4_compute.c | 2 ++
4
From: Karol Herbst
Signed-off-by: Karol Herbst
---
.../drivers/nouveau/codegen/lib/gk104.asm | 207 +-
.../drivers/nouveau/codegen/lib/gk104.asm.h | 144 +++-
.../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 2 +-
3 files changed, 336 insertions(+), 17 deletions
Signed-off-by: Karol Herbst
---
.../drivers/nouveau/codegen/lib/gm107.asm | 78 ++-
.../drivers/nouveau/codegen/lib/gm107.asm.h | 51 +++-
.../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 2 +-
3 files changed, 128 insertions(+), 3 deletions(-)
diff --git a/src
sure that this patch is the cause of it, just noticed it within a
CTS run.
On Sun, Jul 15, 2018 at 8:15 PM, Karol Herbst wrote:
> From: Rhys Perry
>
> Signed-off-by: Karol Herbst
> ---
> .../drivers/nouveau/nvc0/nvc0_compute.c | 32 +++
> .../drivers
>> > to
>> > take an OpenCL version as argument, which is used to configure the SPIR-V
>> > validator, instead of hardcoding OpenCL 1.2: new patch 16;
>> > * Edit patch 17 to use SPIR-V functions from the backend.
>> >
>> > Missing reviews/a
also move some of the GLSL builtins over we will need for implementing
some OpenCL builtins
v2: replace NIR_IMM_FP by nir_imm_floatN_t in ported code
fix up changes caused by swizzle rework
Signed-off-by: Karol Herbst
---
src/compiler/Makefile.sources | 2 +
src/compiler/nir
sions.
Last thing is preparing for vec8/vec16 types and handling 64 bit system
values, which is required by OpenCL.
Karol Herbst (6):
nir: add builtin builder
nir: prepare for bumping up max components to 16
nir/spirv: initial handling of OpenCL.std extension opcodes
nir/spirv: print i
OpenCL knows vectors of size 8 and 16.
v2: rebased on master (nir_swizzle rework)
rework more declarations with nir_component_mask_t
adjust print_var_decl
Reviewed-by: Jason Ekstrand
Signed-off-by: Karol Herbst
---
src/compiler/nir/nir.c| 14
src
From: Rob Clark
Lightly edited to be valid 'C' code.
Is there a bug open to fix this upstream?
Signed-off-by: Karol Herbst
---
src/compiler/spirv/OpenCL.std.h | 211
1 file changed, 211 insertions(+)
create mode 100644 src/compiler/spirv/OpenCL.s
From: Rob Clark
v2 (Karol Herbst ):
make compatible with 64 bit floats
fix isfinite
Signed-off-by: Karol Herbst
---
src/compiler/spirv/vtn_alu.c | 32
1 file changed, 32 insertions(+)
diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv
Signed-off-by: Karol Herbst
---
src/compiler/spirv/vtn_alu.c | 10 ++
1 file changed, 10 insertions(+)
diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
index 5db6c7f0a87..d6f149d12e9 100644
--- a/src/compiler/spirv/vtn_alu.c
+++ b/src/compiler/spirv/vtn_alu.c
Not complete, mostly just adding things as I encounter them in CTS. But
not getting far enough yet to hit most of the OpenCL.std instructions.
Anyway, this is better than nothing and covers the most common builtins.
Signed-off-by: Karol Herbst
---
src/compiler/nir/meson.build | 1
Signed-off-by: Karol Herbst
---
src/compiler/spirv/vtn_alu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
index 0ec0234f531..5a0347989e9 100644
--- a/src/compiler/spirv/vtn_alu.c
+++ b/src/compiler/spirv/vtn_alu.c
ffort and we can simply depend on the dest type chosen by the API.
Signed-off-by: Karol Herbst
---
src/compiler/nir/nir_builder_opcodes_h.py | 9 ++--
src/compiler/nir/nir_lower_alpha_test.c | 2 +-
src/compiler/nir/nir_lower_clip.c | 3 +-
src/compile
without this we might end up looping inside the dominator analysis
infinitely. Hit by some 64 bit int div OpenCL CTS test.
Signed-off-by: Karol Herbst
---
src/compiler/nir/nir_lower_int64.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/compiler/nir/nir_lower_int64.c
b/src/compiler
it shifts in nir. It is legal SPIR-V, so we might
just want to fix that.
> On Mon, Jul 16, 2018 at 7:28 AM Karol Herbst wrote:
>>
>> Signed-off-by: Karol Herbst
>> ---
>> src/compiler/spirv/vtn_alu.c | 10 ++
>> 1 file changed, 10 insertions(+)
>
t as fixed, so I guess I just go ahead and change that. Thanks
for pointing that out.
> On Fri, Jun 29, 2018 at 11:32 PM, Karol Herbst wrote:
>> Signed-off-by: Karol Herbst
>> ---
>> src/gallium/drivers/nouveau/codegen/nv50_ir_from_nir.cpp | 8
>> 1 file chan
interesting, do you have some numbers on that? Wondering if we could
switch more sysvals over to it and what about older gens?
On Tue, Jul 17, 2018 at 12:46 PM, Rhys Perry wrote:
> This instruction seems to be faster than S2R and requires no barrier,
> though the range of special registers it can
the big problem with all that is, that the code was correct and turned
out to do the right thing. Last time I was looking into that, the
projection value was overwritten with the newest projection and this
turned out to be returned into insn and just ended up doing the right
thing.
I would expect
can it be moved after Split64BitOpPreRA? The problem we want to fix
here is that LoadPropagation can potentially do some optimizations
after running LateAlgebraicOpt.
On Tue, Jun 12, 2018 at 3:30 PM, Rhys Perry wrote:
> Reverts 3072bbe ("nv50/ir: move LateAlgebraicOpt to the very end") since
> SH
'm getting ~28 cycles for the S2R and ~6 cycles (unsurprisingly) for the
> CS2R.
>
> nvcc with SM30 seems to use the same instruction as the nvc0 emission code.
>
> The SV_LANE* system values don't work with CS2R and I haven't looked
> too deeply into the others.
>
uint16_t
On Wed, Jul 18, 2018 at 7:05 PM, Rhys Perry wrote:
> Signed-off-by: Rhys Perry
> ---
> src/gallium/drivers/nouveau/codegen/nv50_ir.h | 23
> ++
> .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 17 ++--
> .../drivers/nouveau/codegen/nv50_ir_pr
On Wed, Jul 18, 2018 at 7:05 PM, Rhys Perry wrote:
> Signed-off-by: Rhys Perry
> ---
> .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 63
> ++
> .../nouveau/codegen/nv50_ir_target_gm107.cpp | 6 ++-
> .../nouveau/codegen/nv50_ir_target_nvc0.cpp| 1 +
>
some nitpicks, but with those fixed:
Reviewed-by: Karol Herbst
On Wed, Jul 18, 2018 at 7:05 PM, Rhys Perry wrote:
> This hits the shader-db numbers a good bit, though a few xmads is way
> faster than an imul or imad and the cost is mitigated by the next commit,
> which optim