From: Francisco Jerez
This has likely been broken since we started propagating copies not
matching the offset of the instruction exactly
(1728e74957a62b1b4b9fbb62a7de2c12b77c8a75). The copy source stride
needs to be taken into account to find out the offset at the origin
From: Iago Toral Quiroga
We were not invalidating entries with a src that reads more than one register
when we find writes that overwrite any register read by entry->src after
the first. This leads to incorrect copy propagation because we re-use
entries from the ACP that have
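The invalidation rule described above comes down to a range-overlap test between the registers a write touches and all the registers an entry's source reads. A minimal sketch, assuming whole-register granularity; `RegRange`, `ranges_overlap()` and `write_invalidates_entry()` are hypothetical names for illustration, not the driver's own:

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a register region: a run of
// whole registers starting at `first_reg`.
struct RegRange {
    unsigned first_reg;  // first register touched
    unsigned num_regs;   // number of consecutive registers touched
};

// True when the half-open ranges [first, first + num) share a register.
bool ranges_overlap(const RegRange &a, const RegRange &b)
{
    assert(a.num_regs > 0 && b.num_regs > 0);
    return a.first_reg < b.first_reg + b.num_regs &&
           b.first_reg < a.first_reg + a.num_regs;
}

// An entry has to be dropped when a write lands on any register its
// source reads, not only on the first one.
bool write_invalidates_entry(const RegRange &entry_src, const RegRange &write)
{
    return ranges_overlap(entry_src, write);
}
```

With an entry whose source reads registers 4 and 5, a write to register 5 invalidates the entry, while a write to register 6 does not.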
Hi,
this version includes all the feedback received to v1 plus a few new
patches (22-27) that deal with 64bit URB read/writes, which was
missing in v1. Below is a list of patches that still need to get the Rb:
[PATCH v2 02/30] i965/fs: Fix propagation of copies with strided source.
[PATCH v2
From: Iago Toral Quiroga
Because the semantics of source modifiers are type-dependent, the type of the
original source of the copy must be kept unmodified while propagating it into
some instruction, which implies that we need to have the guarantee that the
meaning of the
From: Francisco Jerez
try_copy_propagate() was special-casing UNIFORM registers (the
BAD_FILE, ARF and FIXED_GRF cases are dead, see the assertion at the
top of the function) and then failing to take into account the
possibility of the instruction reading from a non-zero
From: Iago Toral Quiroga
We were not considering the case where the load payload is writing to
a destination with a reg_offset > 0.
Reviewed-by: Kenneth Graunke
Reviewed-by: Francisco Jerez
---
From: Francisco Jerez
---
.../drivers/dri/i965/brw_fs_copy_propagation.cpp | 46 +++---
1 file changed, 23 insertions(+), 23 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
From: Iago Toral Quiroga
Specifically, consider the size of the data type of the operand to compute
the number of registers written.
v2 (Sam):
- Fix line width (Jordan).
- Add an assert (Jordan).
- Use REG_SIZE in the calculation of regs_written (Curro)
Reviewed-by: Kenneth
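As a rough illustration of the calculation discussed above, the number of registers written can be derived from the byte footprint of the destination rather than from the channel count alone. This is only a sketch: it assumes a 32-byte register, and the `regs_written()` signature and `div_round_up()` helper are hypothetical, not the patch's code:

```cpp
#include <cassert>

// Assumption: one register holds 32 bytes (8 channels x 4 bytes).
constexpr unsigned REG_SIZE = 32;

constexpr unsigned div_round_up(unsigned n, unsigned d)
{
    return (n + d - 1) / d;
}

// Hypothetical helper: registers covered by a destination written with
// `exec_size` channels, an element `stride`, and elements of
// `type_size` bytes each.
unsigned regs_written(unsigned exec_size, unsigned stride, unsigned type_size)
{
    assert(exec_size > 0 && stride > 0 && type_size > 0);
    return div_round_up(exec_size * stride * type_size, REG_SIZE);
}
```

Under these assumptions an 8-wide write of 32-bit values covers one register, while the same write of 64-bit values covers two.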
From: Iago Toral Quiroga <ito...@igalia.com>
This can happen if the register already has a non-zero subreg_offset
when byte_offset() is called.
v2 (Sam):
- Refactor byte_offset() (Jordan).
Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
Reviewed-by: Kenneth G
Extra bits required to make room for the df field of the union don't get
initialized in the constructor. Initialize them to zero before setting
the rest of union's fields.
Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
Reported-by: Francisco Jerez <curroje...@riseup.net>
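The idea of clearing the union before setting its fields can be sketched as follows. `Imm` is a hypothetical type used only for illustration; the real register class has more fields and a different constructor:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical immediate-value holder. The double member makes the
// union 8 bytes wide, so a float only covers half of it.
struct Imm {
    union {
        float f;
        int d;
        unsigned ud;
        double df;
    };

    explicit Imm(float value)
    {
        // Clear every byte first so the bytes `f` does not cover hold a
        // known value, then set the field actually in use.
        std::memset(static_cast<void *>(this), 0, sizeof(*this));
        f = value;
    }
};

// Inspection helper: true when the bytes beyond the float are all zero.
bool upper_bytes_are_zero(const Imm &imm)
{
    unsigned char raw[sizeof(Imm)];
    std::memcpy(raw, &imm, sizeof(Imm));
    for (std::size_t i = sizeof(float); i < sizeof(Imm); ++i) {
        if (raw[i] != 0)
            return false;
    }
    return true;
}
```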
Extra bits required to make room for the df field of the union don't get
initialized in all codepaths, so backend_reg comparisons done using
memcmp() can basically return random results. Check field by field to
avoid this.
Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
Re
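To illustrate why a field-by-field check is more robust than memcmp() here, the sketch below compares only the fields that carry meaning, so bytes that were never initialized cannot affect the result. `Reg`, `regs_equal()` and `make_float_reg()` are hypothetical and much simpler than backend_reg:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical, reduced register description.
struct Reg {
    unsigned file;
    unsigned nr;
    bool is_double;  // selects which union member is in use
    union {
        float f;
        double df;
    };
};

// Compare only the meaningful fields; unused union bytes and padding
// are never looked at.
bool regs_equal(const Reg &a, const Reg &b)
{
    if (a.file != b.file || a.nr != b.nr || a.is_double != b.is_double)
        return false;
    return a.is_double ? a.df == b.df : a.f == b.f;
}

// Builds a float-valued register whose unused bytes are filled with
// `fill`, imitating storage that was not cleared beforehand.
Reg make_float_reg(unsigned char fill, float value)
{
    Reg r;
    std::memset(&r, fill, sizeof(r));
    r.file = 1;
    r.nr = 7;
    r.is_double = false;
    r.f = value;
    return r;
}
```

Two registers built with different fill bytes but the same float value compare equal with `regs_equal()`, whereas a raw memcmp() over the whole struct would also look at the leftover bytes.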
On 11/05/16 22:46, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>
>> On Tue, 2016-05-10 at 21:06 -0700, Francisco Jerez wrote:
>>> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>>>
>>>
On 11/05/16 22:30, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>
>> On 11/05/16 05:56, Francisco Jerez wrote:
>>> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>>>
>>>> From: Connor
On Wed, 2016-05-11 at 17:12 +0200, Samuel Iglesias Gonsálvez wrote:
> On Tue, 2016-05-10 at 21:06 -0700, Francisco Jerez wrote:
> >
> > Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> >
> > >
> > >
>
On Tue, 2016-05-10 at 21:06 -0700, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>
> >
> > From: Iago Toral Quiroga <ito...@igalia.com>
> >
> > UNIFORM_PULL_CONSTANT_LOAD is used to load a contiguous vec4
> > s
On 10/05/16 22:57, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>
>> From: Iago Toral Quiroga <ito...@igalia.com>
>>
>> We should not offset into them based on t
On 10/05/16 22:41, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>
>> From: Iago Toral Quiroga <ito...@igalia.com>
>>
>> Specifically, consider the size of the data type of
On 11/05/16 05:56, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>
>> From: Connor Abbott <connor.w.abb...@intel.com>
>>
>> v2 (Iago)
>> - Fixup accessibility in backend_reg
>>
>> Signed-off-by: Iago To
different sizes
Sam
> -Jordan
>
> On 2016-05-03 05:21:53, Samuel Iglesias Gonsálvez wrote:
>> From: Iago Toral Quiroga <ito...@igalia.com>
>>
>> When source modifiers are present and the types of the source and
>> the entry's source are different, there are cer
On 04/05/16 00:51, Jordan Justen wrote:
> On 2016-05-03 05:21:52, Samuel Iglesias Gonsálvez wrote:
>> From: Iago Toral Quiroga <ito...@igalia.com>
>>
>> Specifically, consider the size of the data type of the operand to compute
>> the number of registers written
On 03/05/16 20:30, Jordan Justen wrote:
> On 2016-05-03 05:21:51, Samuel Iglesias Gonsálvez wrote:
>> From: Iago Toral Quiroga <ito...@igalia.com>
>>
>> We should not offset into them based on the relative offset of
>> our source and the destination of the instr
On 03/05/16 19:57, Jordan Justen wrote:
> On 2016-05-03 05:21:50, Samuel Iglesias Gonsálvez wrote:
>> From: Iago Toral Quiroga <ito...@igalia.com>
>>
>> This can happen if the register already has a non-zero subreg_offset
>> when byte_offset() is called.
>
On 03/05/16 20:59, Kenneth Graunke wrote:
> Other than patches 37, 56, and ones you agreed to drop, the series
> is: Reviewed-by: Kenneth Graunke
>
> I think you can go ahead and land all except those, and we can
> land
variables in the push constant buffer.
>>
>> To fix this, this patch pushes first all the 64-bit variables and
>> then the rest. Then, all the variables would be aligned to
>> its data type size.
>>
>> Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igali
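The ordering described in the quoted text can be sketched like this: place every 64-bit uniform first, then the 32-bit ones, and assign offsets sequentially. `Uniform`, `layout_push_constants()` and `all_aligned()` are hypothetical names, and the sketch assumes only 4-byte and 8-byte elements starting at offset 0:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical description of one uniform slot.
struct Uniform {
    int id;           // declaration order
    unsigned size;    // element size in bytes: 4 or 8
    unsigned offset;  // byte offset in the push buffer, filled in below
};

// Put the 64-bit uniforms first, then the rest, and hand out offsets.
std::vector<Uniform> layout_push_constants(std::vector<Uniform> uniforms)
{
    // stable_partition keeps declaration order inside each group.
    std::stable_partition(uniforms.begin(), uniforms.end(),
                          [](const Uniform &u) { return u.size == 8; });
    unsigned offset = 0;
    for (Uniform &u : uniforms) {
        assert(u.size == 4 || u.size == 8);
        u.offset = offset;
        offset += u.size;
    }
    return uniforms;
}

// True when every uniform sits at a multiple of its own size.
bool all_aligned(const std::vector<Uniform> &uniforms)
{
    return std::all_of(uniforms.begin(), uniforms.end(),
                       [](const Uniform &u) { return u.offset % u.size == 0; });
}
```

Because the 8-byte block always has a total size that is a multiple of 8, every 4-byte uniform that follows it is at least 4-byte aligned.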
On 07/05/16 09:22, Jordan Justen wrote:
> On 2016-05-05 23:56:08, Samuel Iglesias Gonsálvez wrote:
>> When there is a mix of definitions of uniforms with 32-bit or 64-bit
>> data type sizes, the driver ends up doing misaligned access to double
>> based variables in the
Kenneth gave his R-b to this patch on IRC:
"i965/fs: push first double-based uniforms in push constant
buffer" gets my R-b
On 06/05/16 08:56, Samuel Iglesias Gonsálvez wrote:
> When there is a mix of definitions of uniforms with 3
be aligned to
its data type size.
Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 113 +--
1 file changed, 83 insertions(+), 30 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/sr
On 03/05/16 01:10, Kenneth Graunke wrote:
> On Friday, April 29, 2016 1:29:53 PM PDT Samuel Iglesias Gonsálvez wrote:
>> When there is a mix of definitions of uniforms with 32-bit or 64-bit
>> data type sizes, the driver ends up doing misaligned access to double
>> based
On 04/05/16 21:28, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>
>
>
> On 03/05/16 20:59, Kenneth Graunke wrote:
>>>> Other than patches 37, 56, and ones you agreed to d
This patch is still unreviewed. Rob, Can you take a look at it?
Sam
On 29/04/16 13:29, Samuel Iglesias Gonsálvez wrote:
> Lower lrp when operating with double operands because float version
> of lrp is also lowered.
>
> Signed-of
From: Iago Toral Quiroga
ARB_gpu_shader_fp64 was the only feature missing.
---
src/mesa/drivers/dri/i965/intel_extensions.c | 4 +++-
src/mesa/drivers/dri/i965/intel_screen.c | 2 +-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git
From: Iago Toral Quiroga
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 87 +---
1 file changed, 80 insertions(+), 7 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index
From: Iago Toral Quiroga
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 9 ++---
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 2c11783..fa1c30e 100644
---
From: Iago Toral Quiroga
This is pretty much the same thing we do with SSBOs.
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 32 +++-
1 file changed, 27 insertions(+), 5 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
From: Iago Toral Quiroga
There are a few places where we need to shuffle the result of a 32-bit load
into valid 64-bit data, so extract this logic into a separate helper that we
can reuse.
Also, the shuffling needs to operate with WE_all set, which we were missing
before,
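A CPU-side model of such a shuffle might look like the sketch below. It assumes the 32-bit load returns all the low dwords of the channels first and all the high dwords after them; that layout, and the name `shuffle_32bit_to_64bit()`, are assumptions for illustration and not taken from the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed input layout for N channels:
//   loaded[0 .. N-1]   low dword of each channel
//   loaded[N .. 2N-1]  high dword of each channel
// Output: one 64-bit value per channel.
std::vector<uint64_t> shuffle_32bit_to_64bit(const std::vector<uint32_t> &loaded)
{
    assert(loaded.size() % 2 == 0);
    const std::size_t channels = loaded.size() / 2;
    std::vector<uint64_t> result(channels);
    // Every channel is processed unconditionally, which loosely mirrors
    // running the shuffle with WE_all set.
    for (std::size_t ch = 0; ch < channels; ++ch) {
        const uint64_t lo = loaded[ch];
        const uint64_t hi = loaded[channels + ch];
        result[ch] = (hi << 32) | lo;
    }
    return result;
}
```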
From: Iago Toral Quiroga
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 37
1 file changed, 33 insertions(+), 4 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index
From: Iago Toral Quiroga
UNIFORM_PULL_CONSTANT_LOAD is used to load a contiguous vec4 starting at a
constant offset that is 16-byte aligned. If we need to access an unaligned
offset we emit a load with an aligned offset and use the remaining constant
offset to select the
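The split between an aligned load offset and the remaining constant offset can be sketched as below. `PullLoadAddress` and `split_constant_offset()` are hypothetical names, and the sketch assumes 32-bit components at 4-byte aligned offsets:

```cpp
#include <cassert>

// Hypothetical result of splitting a constant byte offset.
struct PullLoadAddress {
    unsigned aligned_offset;  // 16-byte aligned start of the vec4 to load
    unsigned component;       // 32-bit component inside that vec4 (0..3)
};

// Round the offset down to a 16-byte boundary and turn the remainder
// into a component index.
PullLoadAddress split_constant_offset(unsigned byte_offset)
{
    assert(byte_offset % 4 == 0);
    return { byte_offset & ~15u, (byte_offset & 15u) / 4 };
}
```

For example, byte offset 20 becomes a load at offset 16 with component 1 selected.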
From: Iago Toral Quiroga
UBO loads with constant offset use the UNIFORM_PULL_CONSTANT_LOAD
instruction, which reads 16 bytes (a vec4) of data from memory. For dvec
types this only provides components x and y. Thus, if we are reading
more than 2 components we need to issue a
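The arithmetic behind this can be written down directly: each load returns 16 bytes, so the number of loads follows from the byte size of the vector. A small sketch, assuming the read starts at a 16-byte aligned offset; `pull_loads_needed()` is a hypothetical helper:

```cpp
#include <cassert>

// Bytes returned by one load, as stated above (a vec4 of 32-bit values).
constexpr unsigned PULL_LOAD_BYTES = 16;

// Number of 16-byte loads needed to cover `components` elements of
// `component_size` bytes, starting at a 16-byte aligned offset.
unsigned pull_loads_needed(unsigned components, unsigned component_size)
{
    assert(components >= 1 && components <= 4);
    assert(component_size == 4 || component_size == 8);
    return (components * component_size + PULL_LOAD_BYTES - 1) / PULL_LOAD_BYTES;
}
```

A vec4 or a dvec2 fits in one load, while a dvec3 or dvec4 needs two.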
From: Iago Toral Quiroga
---
docs/GL3.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/docs/GL3.txt b/docs/GL3.txt
index bb2bb6e..a8219a4 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -124,7 +124,7 @@ GL 4.0, GLSL 4.00 --- all DONE: nvc0, r600,
From: Iago Toral Quiroga
We are going to need the same logic that we use to handle ssbo loads
of doubles in other places, like shared variable loads, which also
use emit_untyped_read. Pull the logic to a separate helper function
that we can share.
---
From: Iago Toral Quiroga
This does the inverse operation of SHUFFLE_32BIT_LOAD_RESULT_TO_64BIT_DATA
and we will use it when we need to write 64-bit data in the layout expected
by untyped write messages.
Again, this needs to operate with WE_all set for the same reasons as the
From: Iago Toral Quiroga
---
src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c
b/src/mesa/drivers/dri/i965/intel_extensions.c
index 820d573..7ef3a7c 100644
---
From: Iago Toral Quiroga
We were not considering the case where the load payload is writing to
a destination with a reg_offset > 0.
---
src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git
From: Iago Toral Quiroga
The transposition needs to set exec_all() but it writes directly to the
original instruction's destination, which can lead to execmasking
problems if the original instruction did not have force_writemask_all
set. In that case, write the result of the
From: Iago Toral Quiroga
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 49 ++--
1 file changed, 47 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index bc81a80..0e69be8
From: Iago Toral Quiroga
Because the stride is in units for the type, if we copy-propagate from
another instruction using a larger type, then we need to make sure
that the source in that instruction, the one we will be copy-propagating
from, sources consecutive elements,
From: Iago Toral Quiroga
The current code ignores the suboffset in the instruction's source
and just uses the one from the constant. This is not correct
when the instruction's source is accessing the constant with a
different type and using the suboffset to select a specific
From: Iago Toral Quiroga
We were not invalidating entries with a src that reads more than one register
when we find writes that overwrite any register read by entry->src after
the first. This leads to incorrect copy propagation because we re-use
entries from the ACP that have
From: Iago Toral Quiroga
We were not accounting for reg_suboffset in the check for the start
of the region. This meant that we would allow copy-propagation even if
the dst wrote to sub_regoffset 4 and our source read from
sub_regoffset 0, which is not correct. This was observed
From: Iago Toral Quiroga
When source modifiers are present and the types of the source and
the entry's source are different, there are certain cases in which
we allow copy-propagation to change the type of source by the type
of the entry's source we are copy propagating from.
From: Iago Toral Quiroga
Specifically, consider the size of the data type of the operand to compute
the number of registers written.
---
src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git
Hello,
This series adds the final bits to support arb_gpu_shader_fp64 in the
i965 scalar backend for BDW+ hardware. It sits on top of the previous
series we sent last week [0] and which is going through review at the
moment. Specifically, this series adds:
1. Fixes to copy propagation required
From: Iago Toral Quiroga
We should not offset into them based on the relative offset of
our source and the destination of the instruction we are copy
propagating from, so we don't turn this:
mov(16) vgrf6:F, vgrf7+0.0<0>:F
(...)
load_payload(8) vgrf28:F, vgrf6+1.0:F 2ndhalf
From: Iago Toral Quiroga
This can happen if the register already has a non-zero subreg_offset
when byte_offset() is called.
---
src/mesa/drivers/dri/i965/brw_ir_fs.h | 4
1 file changed, 4 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h
On 02/05/16 23:50, Mark Janes wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>
>> Hello,
>>
>> This patch series continues adding arb_gpu_shader_fp64 support to
>> the Intel driver. Specifi
On 02/05/16 14:33, Pohjolainen, Topi wrote:
> On Fri, Apr 29, 2016 at 01:29:55PM +0200, Samuel Iglesias Gonsálvez
> wrote:
>> Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
>> --- src/mesa/drivers/dri
On 02/05/16 11:01, Pohjolainen, Topi wrote:
> On Fri, Apr 29, 2016 at 01:29:40PM +0200, Samuel Iglesias Gonsálvez
> wrote:
>> From: Connor Abbott
>>
>> --- src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 2 +- 1 file
>>
On 02/05/16 10:38, Pohjolainen, Topi wrote:
> On Fri, Apr 29, 2016 at 01:29:29PM +0200, Samuel Iglesias Gonsálvez
> wrote:
>> From: Iago Toral Quiroga
>>
>> When we are actually unpacking from a double that we have
>>
On 02/05/16 10:17, Pohjolainen, Topi wrote:
> On Fri, Apr 29, 2016 at 01:29:27PM +0200, Samuel Iglesias Gonsálvez
> wrote:
>> From: Connor Abbott
>>
>> --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 12 1
On 02/05/16 09:22, Pohjolainen, Topi wrote:
> On Fri, Apr 29, 2016 at 01:29:23PM +0200, Samuel Iglesias Gonsálvez wrote:
>> From: Connor Abbott
>>
>> v2 (Sam):
>> - Add bitsize to brw_type_for_nir_type() in optimize_extract_to_float()
>> ---
>>
On 02/05/16 09:13, Pohjolainen, Topi wrote:
> On Fri, Apr 29, 2016 at 01:29:20PM +0200, Samuel Iglesias Gonsálvez
> wrote:
>> From: Connor Abbott
>>
>> v2 (Iago): - Squashed bits from 'support double precission
>>
On 02/05/16 09:56, Iago Toral wrote:
> On Mon, 2016-05-02 at 10:54 +0300, Pohjolainen, Topi wrote:
>> On Mon, May 02, 2016 at 09:42:14AM +0200, Iago Toral wrote:
>>> On Mon, 2016-05-02 at 10:34 +0300, Pohjolainen, Topi wrote:
>>>> On Mon, May 02, 2016 at 09:22:49AM +0200, Iago Toral wrote:
>
On 02/05/16 09:02, Pohjolainen, Topi wrote:
> On Fri, Apr 29, 2016 at 01:29:15PM +0200, Samuel Iglesias Gonsálvez wrote:
>> From: Iago Toral Quiroga
>>
>> ---
>> src/mesa/drivers/dri/i965/brw_shader.cpp | 28 ++--
>> 1 file changed, 22 insertions(+),
On 02/05/16 08:24, Pohjolainen, Topi wrote:
> On Fri, Apr 29, 2016 at 01:29:12PM +0200, Samuel Iglesias Gonsálvez wrote:
>> From: Connor Abbott
>>
>> ---
>> src/mesa/drivers/dri/i965/brw_eu_emit.c | 28 +---
>> 1 file changed, 21
tching up email and doing the required changes to each
patch. If they don't have replies, I will reply to them.
Sam
> On 2016-04-29 04:28:57, Samuel Iglesias Gonsálvez wrote:
>> Hello,
>>
>> This patch series continues adding arb_gpu_shader_fp64 support to
>> the Intel
te it now
> that we lower away all the possible math operations on doubles.
>
OK, I will remove it.
Thanks Connor!
Sam
> On Fri, Apr 29, 2016 at 7:29 AM, Samuel Iglesias Gonsálvez
> <sigles...@igalia.com> wrote:
>> From: Connor Abbott <connor.w.abb...@intel.com>
On 30/04/16 09:26, Kenneth Graunke wrote:
> On Friday, April 29, 2016 1:29:20 PM PDT Samuel Iglesias Gonsálvez
> wrote:
>> From: Connor Abbott <connor.w.abb...@intel.com>
>>
>> v2 (Iago): - Squashed bits f
On 30/04/16 09:20, Kenneth Graunke wrote:
> On Friday, April 29, 2016 1:29:19 PM PDT Samuel Iglesias Gonsálvez
> wrote:
>> From: Connor Abbott <connor.w.abb...@intel.com>
>>
>> --- src/mesa/drivers/dri/i965/brw_fs.cp
On 29/04/16 20:26, Jordan Justen wrote:
> On 2016-04-28 04:19:18, Samuel Iglesias Gonsálvez wrote:
>> Make this distinction as the drivers might need to lower it inside NIR.
>>
>> Signed-off-by: Samuel Iglesias Gonsálvez
On 29/04/16 20:43, Jason Ekstrand wrote:
> Why not just squash 2 and 3 and call it "Separate 32 and 64-bit
> fmod lowering" or something like that.
>
OK, I like it.
Sam
>
> On Thu, Apr 28, 2016 at 4:19 AM,
On 2016-04-30 10:34, Kenneth Graunke wrote:
With my horiz_offset concerns and other minor comments addressed,
Patches 1-33 (except 2, 3, 11, and 22) and 36, 40-55, 57-58 are:
Reviewed-by: Kenneth Graunke
I'll look at the rest soon.
It sounds like Curro is taking a look
In that case, the writes need two times the size of a 32-bit value.
We need to adjust the exec_size, so it is not breaking any hardware
rule.
Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 8 +++-
1 file changed, 7 insertions
From: Iago Toral Quiroga
Usually, writes to a subreg_offset > 0 would also have a stride > 1
and we would recognize them as partial, however, there is one case
where this does not happen, that is when we generate code for 64-bit
immediates in gen7, where we produce something
From: Iago Toral Quiroga
When the original instruction had a stride > 1, the combined registers
written by the split instructions won't amount to the same register space
written by the original instruction because the split instructions will
use a stride of 1. The current code
From: Connor Abbott
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 729c7a0..45afd1a 100644
---
From: Iago Toral Quiroga
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 4d8cb28..87e098a 100644
---
From: Connor Abbott
We need to do this late, in order to avoid partial writes during the
optimization loop.
---
src/mesa/drivers/dri/i965/Makefile.sources | 1 +
src/mesa/drivers/dri/i965/brw_fs.cpp | 5 ++
src/mesa/drivers/dri/i965/brw_fs.h | 1
From: Connor Abbott
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index ac170d5..729c7a0 100644
---
From: Connor Abbott
v2: Account for the stride of the dst (Iago)
Signed-off-by: Iago Toral Quiroga
---
src/mesa/drivers/dri/i965/brw_fs_builder.h | 8 ++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git
From: Iago Toral Quiroga
These need the same treatment as d2f, so generalize our d2f lowering to cover
these too.
---
src/mesa/drivers/dri/i965/brw_fs_lower_d2f.cpp | 6 --
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 ++
2 files changed, 6 insertions(+), 2
The constants could be double, and it was allocating size for float types
for the destination register of varying pull constant loads.
Then the fs_visitor::validate() will complain.
Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
src/mesa/drivers/dri/i965/brw_fs.c
From: Iago Toral Quiroga
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 98 ++--
1 file changed, 81 insertions(+), 17 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index
From: Connor Abbott
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 70ffc6d..f8a82d7 100644
---
From: Iago Toral Quiroga
Since it no longer handles conversions from double to float but from
double to various other 32-bit types.
---
src/mesa/drivers/dri/i965/Makefile.sources | 2 +-
src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
to remove a couple of loops that had to compute that.
- Reworked things a bit so we can get rid of the nr_paddings field in
brw_stage_prog_data.
- Use rzalloc_array instead of ralloc_array and memset.
- Fixed wrong indentation.
Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.
From: Iago Toral Quiroga
Add asserts so we remember to address this when we enable 64-bit
integer support, as suggested by Connor and Jason.
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 32 +---
1 file changed, 29 insertions(+), 3 deletions(-)
From: Iago Toral Quiroga
In the case of the pack opcode we are already doing the
lowering in NIR, so no need to do it here. The unpack opcode
operates on scalars, so it should not be lowered.
In the case of frexp_sig and frexp_exp, they are lowered in
lower_instructions, so
From: Connor Abbott
The destination has to have the same type as the source, or else the
simulator will complain. As a result, we need to emit a CMP that
outputs a 64-bit wide result and then do a strided MOV to pick out the
low 32 bits of each channel.
---
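The two steps named in the message above, a compare producing a 64-bit wide result followed by a strided move that keeps the low 32 bits of each channel, can be modelled on the CPU as below. This is only a sketch: the all-ones encoding of "true" and the function names `cmp_less_than()` and `low_dwords()` are assumptions for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Model of the compare: one 64-bit result per channel.
// Assumption: "true" is encoded as all ones, "false" as zero.
std::vector<uint64_t> cmp_less_than(const std::vector<double> &a,
                                    const std::vector<double> &b)
{
    assert(a.size() == b.size());
    std::vector<uint64_t> wide(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        wide[i] = a[i] < b[i] ? ~uint64_t{0} : uint64_t{0};
    return wide;
}

// Model of the strided move: viewing the 64-bit results as pairs of
// dwords and taking every second dword amounts to keeping the low
// 32 bits of each channel.
std::vector<uint32_t> low_dwords(const std::vector<uint64_t> &wide)
{
    std::vector<uint32_t> narrow(wide.size());
    for (std::size_t i = 0; i < wide.size(); ++i)
        narrow[i] = static_cast<uint32_t>(wide[i] & 0xffffffffu);
    return narrow;
}
```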
From: Connor Abbott
---
src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index 15af2c1..047e4ef 100644
---
Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 87e098a..4cd219a
From: Connor Abbott
Work based on registers read/written instead of dispatch_width, so that
the interferences are added for 64-bit sources/destinations as well.
---
src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 67 ---
1 file changed, 48
From: Iago Toral Quiroga
Probably not needed since we fix the dst type of comparisons
automatically, but for consistency with the rest of null_reg_*
functions.
---
src/mesa/drivers/dri/i965/brw_fs_builder.h | 7 +++
1 file changed, 7 insertions(+)
diff --git
From: Iago Toral Quiroga
---
src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 13 +
1 file changed, 13 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index fb48a56..6b1b5b9 100644
---
From: Connor Abbott
The HW has a restriction that only vertical stride may cross register
boundaries. Previously, this only mattered for SIMD16 instructions where
we needed to use the same regioning parameters as the equivalent SIMD8
instruction but double the exec
From: Connor Abbott
Uniform doubles will read two registers, in which case we need to mark
both as being live.
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
From: Connor Abbott
v2: Do it only for uniforms (Iago)
Signed-off-by: Iago Toral Quiroga
---
src/mesa/drivers/dri/i965/brw_ir_fs.h | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h
From: Connor Abbott
This makes things more consistent, and also fixes the offset calculation
for double uniforms.
---
src/mesa/drivers/dri/i965/brw_fs.h | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h
From: Connor Abbott
---
src/mesa/drivers/dri/i965/brw_fs.cpp | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index e9fd251..90c1e93 100644
---