This patch converts the SSE-optimized build_mask_32() and
build_mask_linear_32() to VMX/VSX.
I measured the results on a POWER8 machine with 32 cores at 3.4GHz and
16GB of RAM.
FPS/Score
Name  Before  After  Delta
Hi,
I did a couple of modest optimizations in llvmpipe to increase the
performance when running on a POWER8 machine. These optimizations are both
for ppc64le and ppc64.
Basically, I looked at all the places where there are special code paths for
SSE (using the PIPE_ARCH_SSE define), and
To determine if we could use special POWER8 assembly directives, we first
need to detect whether we are running on the POWER8 architecture. This patch
adds this detection to configure.ac and adds the necessary compilation
flags accordingly.
Signed-off-by: Oded Gabbay
---
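The detection described above happens in configure.ac; on the C side, the result is typically consumed as a preprocessor guard. The following is a minimal sketch of such a guard, not the patch's actual code. It assumes that the compiler predefines `_ARCH_PWR8` when targeting POWER8 (as GCC does with `-mcpu=power8`), and `HYPOTHETICAL_HAVE_POWER8` is an illustrative name that does not come from the patch.

```c
#include <stdbool.h>

/* Assumption: the compiler predefines _ARCH_PWR8 when the target CPU is
 * POWER8 or newer (for example, GCC with -mcpu=power8).
 * HYPOTHETICAL_HAVE_POWER8 is an illustrative name, not the define
 * introduced by the patch. */
#if defined(_ARCH_PWR8)
#define HYPOTHETICAL_HAVE_POWER8 1
#else
#define HYPOTHETICAL_HAVE_POWER8 0
#endif

/* Reports whether the POWER8-specific code paths were compiled in.
 * This is a build-time property; it says nothing about the CPU the
 * binary eventually runs on. */
bool built_with_power8(void)
{
   return HYPOTHETICAL_HAVE_POWER8 != 0;
}
```

Because the check is resolved at build time, a binary built this way carries whatever instruction set the build flags selected.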
This file provides a portability layer that will make it easier to convert
SSE-based functions to VMX/VSX-based functions.
All the functions implemented in this file are prefixed using "vec_".
Therefore, when converting from an SSE-based function, one needs to simply
replace the "_mm_" prefix of the
On Tue, Dec 29, 2015 at 10:15 AM, Marta Lofstedt
wrote:
> From: Marta Lofstedt
>
> The imulExtended test of the shader bitfield tests of the
> OpenGL ES 3.1 CTS fails on gen8+ when BRW_REGISTER_TYPE_W
> is used for SHADER_OPCODE_MULH.
>
A hugely common case when using nir_builder is to have a shader with a
single function called main. This adds a helper that gives you just that.
This commit also makes us use it in the NIR control-flow unit tests as well
as tgsi_to_nir and prog_to_nir.
---
src/gallium/auxiliary/nir/tgsi_to_nir.c
On Tue, Dec 29, 2015 at 12:36 PM, Jason Ekstrand wrote:
>
>
> On Tue, Dec 29, 2015 at 7:32 AM, Rob Clark wrote:
>>
>> On Mon, Dec 28, 2015 at 4:23 PM, Connor Abbott
>> wrote:
>> > On Mon, Dec 28, 2015 at 3:25 PM, Rob Clark
Fixes make check.
CC: Jason Ekstrand
Signed-off-by: Aaron Watry
---
src/glsl/nir/tests/control_flow_tests.cpp | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/src/glsl/nir/tests/control_flow_tests.cpp
On Tue, Dec 29, 2015 at 7:32 AM, Rob Clark wrote:
> On Mon, Dec 28, 2015 at 4:23 PM, Connor Abbott
> wrote:
> > On Mon, Dec 28, 2015 at 3:25 PM, Rob Clark wrote:
> >> On Mon, Dec 28, 2015 at 2:05 PM, Jason Ekstrand
All the "features" of the hardware are similar starting with GEN8, so remove as
much of the GEN9 uniqueness as possible. This makes implementing future gen
platforms a bit easier.
Signed-off-by: Ben Widawsky
---
src/mesa/drivers/dri/i965/brw_device_info.c | 13
This saves a bit of typing for fields which are obvious and always required
to be entered by the structure which is defining a platform. This is unlike
fields like URB sizes where the defaults might be fine.
Doing this also makes it easy and obvious to keep around preliminary hardware
On Tue, Dec 29, 2015 at 12:59 PM, Jason Ekstrand wrote:
> A hugely common case when using nir_builder is to have a shader with a
> single function called main. This adds a helper that gives you just that.
> This commit also makes us use it in the NIR control-flow unit tests
Jason Ekstrand writes:
> A hugely common case when using nir_builder is to have a shader with a
> single function called main. This adds a helper that gives you just that.
> This commit also makes us use it in the NIR control-flow unit tests as well
> as tgsi_to_nir and
From: Kristian Høgsberg Kristensen
GL_ARB_shader_draw_parameters added two new system values. This gets us
back to mapping mesa system values to the right TGSI semantics.
Cc: Ilia Mirkin
---
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 2 ++
1 file
On Mon, Dec 28, 2015 at 3:51 PM, Connor Abbott wrote:
> On Mon, Dec 28, 2015 at 3:31 PM, Rob Clark wrote:
>> On Mon, Dec 28, 2015 at 1:58 PM, Connor Abbott wrote:
>>> On Mon, Dec 28, 2015 at 1:35 PM, Rob Clark
On Mon, Dec 28, 2015 at 11:28 PM, Kenneth Graunke wrote:
> Unigine Heaven 4.0 and Valley 1.0 use dual color blending but don't
> specify which fragment shader output is which, so there's at best a
> 50/50 chance of us guessing it correctly. This is invalid.
>
> Unigine
On Mon, Dec 28, 2015 at 4:23 PM, Connor Abbott wrote:
> On Mon, Dec 28, 2015 at 3:25 PM, Rob Clark wrote:
>> On Mon, Dec 28, 2015 at 2:05 PM, Jason Ekstrand wrote:
>>>
>>>
>>> On Mon, Dec 28, 2015 at 10:33 AM, Rob Clark
This patch converts the SSE-optimized lp_rast_triangle_32_3_16()
to VMX/VSX.
I measured the results on a POWER8 machine with 32 cores at 3.4GHz and
16GB of RAM.
FPS/Score
Name  Before  After  Delta
openarena
This patch converts the SSE optimization done in do_triangle_ccw to
VMX/VSX.
I measured the results on a POWER8 machine with 32 cores at 3.4GHz and
16GB of RAM.
FPS/Score
Name  Before  After  Delta
glmark2
On Tue, Dec 29, 2015 at 8:51 AM, Aaron Watry wrote:
> Fixes make check.
>
Thanks!
Reviewed-by: Jason Ekstrand
I went ahead and pushed it.
>
> CC: Jason Ekstrand
> Signed-off-by: Aaron Watry
> ---
>
On Tue, Dec 29, 2015 at 10:15 AM, Marta Lofstedt
wrote:
> From: Marta Lofstedt
>
> The imulExtended test of the shader bitfield tests of the
> OpenGL ES 3.1 CTS fails on gen8+ when BRW_REGISTER_TYPE_W
> is used for SHADER_OPCODE_MULH.
>
The idea looks right to me.
Though frankly I don't like our current setup code too much, in
particular the mix of C, assembly, and JIT code, with some
duplication (plus lots of transposes everywhere). There's likely
optimization potential to be found there.
Roland
On 29.12.2015 at 17:12
On 29.12.2015 14:27, Krzysztof A. Sobiecki wrote:
From: Krzysztof Sobiecki
ALIGN_DIVUP is a driver-specific (r600g) macro that duplicates DIV_ROUND_UP
functionality.
Replacing it with DIV_ROUND_UP eliminates this problem.
Those macros are actually slightly different, and
On 29.12.2015 at 17:12, Oded Gabbay wrote:
> This file provides a portability layer that will make it easier to convert
> SSE-based functions to VMX/VSX-based functions.
>
> All the functions implemented in this file are prefixed using "vec_".
> Therefore, when converting from SSE-based
So, if I see that right, you will automatically generate binaries using
power8 instructions if compiled on a power8-capable box, which then won't
run on boxes not supporting power8? Is that really what you want?
Maybe some runtime detection would be a good idea (though I don't know
if anyone cares
From: Krzysztof Sobiecki
ALIGN_DIVUP is a driver-specific (r600g) macro that duplicates DIV_ROUND_UP
functionality.
Replacing it with DIV_ROUND_UP eliminates this problem.
Signed-off-by: Krzysztof A. Sobiecki
---
src/gallium/drivers/r600/evergreen_state.c
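For readers unfamiliar with round-up division, here is a small sketch of two common ways to write it. It does not claim to reproduce the actual definition of either ALIGN_DIVUP or DIV_ROUND_UP, which are not shown in this thread; the function names below are hypothetical. It only illustrates that two formulations that look interchangeable can disagree at the edge of the integer range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; neither function is taken from the Mesa sources. */

/* Add-then-divide form: compact, but the sum (x + y - 1) can wrap around
 * for very large x. */
static inline uint32_t div_round_up_add(uint32_t x, uint32_t y)
{
   assert(y != 0);
   return (x + y - 1) / y;
}

/* Divide-then-adjust form: there is no intermediate sum, so it cannot
 * wrap around. */
static inline uint32_t div_round_up_mod(uint32_t x, uint32_t y)
{
   assert(y != 0);
   return x / y + (x % y != 0);
}
```

For everyday sizes both forms agree (10 divided by 4 rounds up to 3 either way). With x = UINT32_MAX and y = 4, the add-then-divide form wraps around and yields 0, while the divide-then-adjust form yields 1073741824.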
On Tue, Dec 29, 2015 at 4:52 PM, Jason Ekstrand wrote:
>
>
> On Tue, Dec 29, 2015 at 1:50 PM, Ilia Mirkin wrote:
>>
>> On Tue, Dec 29, 2015 at 4:45 PM, Jason Ekstrand
>> wrote:
>> >
>> >
>> > On Mon, Dec 28, 2015 at 11:06 AM,
On Tue, Dec 29, 2015 at 4:45 PM, Jason Ekstrand wrote:
>
>
> On Mon, Dec 28, 2015 at 11:06 AM, Ilia Mirkin wrote:
>>
>> Currently any access params (coherent/volatile/restrict) are being lost
>> when lowering to the ssbo load/store intrinsics. Keep
On Tue, Dec 29, 2015 at 1:56 PM, Ilia Mirkin wrote:
> On Tue, Dec 29, 2015 at 4:52 PM, Jason Ekstrand
> wrote:
> >
> >
> > On Tue, Dec 29, 2015 at 1:50 PM, Ilia Mirkin
> wrote:
> >>
> >> On Tue, Dec 29, 2015 at 4:45 PM, Jason
Hooks up the new system values, passes the drawid in.
Signed-off-by: Ilia Mirkin
---
src/mesa/state_tracker/st_draw.c | 1 +
src/mesa/state_tracker/st_extensions.c | 1 +
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 4 ++--
3 files changed, 4 insertions(+), 2
This allows the state tracker to know that the various draw parameters
are available in vertex shaders.
Signed-off-by: Ilia Mirkin
---
src/gallium/docs/source/screen.rst | 3 +++
src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
Signed-off-by: Ilia Mirkin
---
src/gallium/docs/source/tgsi.rst | 13 +
src/gallium/include/pipe/p_shader_tokens.h | 4 +++-
2 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/src/gallium/docs/source/tgsi.rst
On 29.12.2015 at 23:04, Ilia Mirkin wrote:
> Signed-off-by: Ilia Mirkin
> ---
> src/gallium/docs/source/tgsi.rst | 13 +
> src/gallium/include/pipe/p_shader_tokens.h | 4 +++-
> 2 files changed, 16 insertions(+), 1 deletion(-)
>
> diff --git
On Tue, Dec 29, 2015 at 10:14 AM, Ben Widawsky
wrote:
> This saves a bit of typing for fields which are obvious and always required
> to be entered by the structure which is defining a platform. This is unlike
> fields like URB sizes where the defaults might be
On Tuesday, December 29, 2015 9:59:08 AM PST Jason Ekstrand wrote:
> A hugely common case when using nir_builder is to have a shader with a
> single function called main. This adds a helper that gives you just that.
> This commit also makes us use it in the NIR control-flow unit tests as well
>
On Mon, Dec 28, 2015 at 11:06 AM, Ilia Mirkin wrote:
> Currently any access params (coherent/volatile/restrict) are being lost
> when lowering to the ssbo load/store intrinsics. Keep track of the
> variable being used, and bake its access params in as the last arg of
> the
On Tue, Dec 29, 2015 at 1:50 PM, Ilia Mirkin wrote:
> On Tue, Dec 29, 2015 at 4:45 PM, Jason Ekstrand
> wrote:
> >
> >
> > On Mon, Dec 28, 2015 at 11:06 AM, Ilia Mirkin
> wrote:
> >>
> >> Currently any access params
This will allow the state tracker to inform the driver where in a
broken-up multidraw we currently are. This can then be passed into the
vertex shader.
Signed-off-by: Ilia Mirkin
---
src/gallium/include/pipe/p_state.h | 2 ++
1 file changed, 2 insertions(+)
diff --git
Reviewed-by: Ilia Mirkin
Thanks for the speedy turnaround!
On Tue, Dec 29, 2015 at 2:17 PM, Kristian Høgsberg wrote:
> From: Kristian Høgsberg Kristensen
>
> GL_ARB_shader_draw_parameters added two new system values. This gets us
On Tuesday, December 29, 2015 6:42:58 PM PST Marek Olšák wrote:
> On Mon, Dec 28, 2015 at 11:28 PM, Kenneth Graunke
wrote:
> > Unigine Heaven 4.0 and Valley 1.0 use dual color blending but don't
> > specify which fragment shader output is which, so there's at best a
> >
On Tue, Dec 29, 2015 at 10:14 AM, Ben Widawsky
wrote:
> All the "features" of the hardware are similar starting with GEN8, so remove
> as
> much of the GEN9 uniqueness as possible. This makes implementing future gen
> platforms a bit easier.
>
> Signed-off-by: Ben
On 29.12.2015 at 23:04, Ilia Mirkin wrote:
> This allows the state tracker to know that the various draw parameters
> are available in vertex shaders.
>
> Signed-off-by: Ilia Mirkin
> ---
> src/gallium/docs/source/screen.rst | 3 +++
>
On 29.12.2015 at 23:04, Ilia Mirkin wrote:
> Hooks up the new system values, passes the drawid in.
>
> Signed-off-by: Ilia Mirkin
> ---
> src/mesa/state_tracker/st_draw.c | 1 +
> src/mesa/state_tracker/st_extensions.c | 1 +
>
On Tue, 2015-12-29 at 16:00 +1100, Timothy Arceri wrote:
> For tessellation shaders we cannot just copy everything to the packed
> varyings like we do in other stages as tessellation uses shared
> memory for
> varyings, therefore it is only safe to copy array elements that the
> shader
> actually
On 21.12.2015 17:35, Marek Olšák wrote:
Hi,
This patch series adds more flexibility to u_upload_mgr. First, it adds the
ability to specify the alignment per suballocation. The idea is that several
users can use the same upload buffer, but each may need a different alignment.
Finally, it
On 21.12.2015 17:35, Marek Olšák wrote:
From: Marek Olšák
The fixed alignment of u_upload_mgr will go away.
This is the first step.
The motivation is that one u_upload_mgr can have multiple users,
each allocating from the same buffer, but requiring a different alignment.
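The motivation above (several users sharing one buffer, each with its own alignment) can be sketched with a simple bump allocator. This is an illustration only, not the u_upload_mgr API: `struct sketch_uploader`, `align_up`, and `sketch_alloc` are hypothetical names, and the sketch assumes power-of-two alignments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bump allocator over one shared buffer; this is not the
 * u_upload_mgr interface. */
struct sketch_uploader {
   size_t size;    /* total bytes in the shared buffer */
   size_t offset;  /* first free byte */
};

/* Rounds value up to a multiple of alignment (assumed a power of two). */
static size_t align_up(size_t value, size_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Reserves `bytes` at the alignment requested by this caller.
 * Returns false when the request does not fit in the buffer. */
static bool sketch_alloc(struct sketch_uploader *up, size_t bytes,
                         size_t alignment, size_t *out_offset)
{
   size_t start = align_up(up->offset, alignment);
   if (start > up->size || bytes > up->size - start)
      return false;
   *out_offset = start;
   up->offset = start + bytes;
   return true;
}
```

Passing the alignment with each request, instead of fixing it when the allocator is created, is what lets a user needing 4-byte alignment and a user needing 256-byte alignment pack into the same buffer without the first paying for the second's padding.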
From: Marta Lofstedt
The imulExtended test of the shader bitfield tests of the
OpenGL ES 3.1 CTS fails on gen8+ when BRW_REGISTER_TYPE_W
is used for SHADER_OPCODE_MULH.
See:
https://bugs.freedesktop.org/show_bug.cgi?id=92595
Signed-off-by: Marta Lofstedt
Reviewed-by: Marta Lofstedt
> -----Original Message-----
> From: Samuel Iglesias Gonsálvez [mailto:sigles...@igalia.com]
> Sent: Tuesday, December 22, 2015 8:40 AM
> To: Iago Toral Quiroga; mesa-dev@lists.freedesktop.org
> Cc: Lofstedt, Marta; Palli, Tapani
> Subject:
There used to be more members but they now share other fields
in order to keep memory use low.
Also making the naming more generic will allow us to reuse the
field for explicit byte offsets within blocks for
ARB_enhanced_layouts.
---
src/glsl/ast_to_hir.cpp | 2 +-
src/glsl/ir.cpp
From: Emil Velikov
Reviewed-by: Timothy Arceri
---
src/glsl/link_uniform_block_active_visitor.cpp | 4 ++--
src/glsl/link_uniform_blocks.cpp | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git
From: Emil Velikov
Reviewed-by: Timothy Arceri
---
src/glsl/ast_to_hir.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index bb35d72..d51f095 100644
---
This series is,
Reviewed-by: Edward O'Callaghan
On 2015-12-29 21:02, Timothy Arceri wrote:
From: Emil Velikov
Reviewed-by: Timothy Arceri
---
src/glsl/ast_to_hir.cpp | 2 +-
1 file changed, 1