Re: [Mesa-dev] [PATCH] i965/nir: Provide a default LOD for buffer textures

2015-12-13 Thread Eduardo Lima Mitev
Patch is:

Reviewed-by: Eduardo Lima Mitev 

Thanks!

On 12/13/2015 01:12 AM, Jason Ekstrand wrote:
> Our hardware requires an LOD for all texelFetch commands even if they are
> on buffer textures.  GLSL IR gives us an LOD of 0 in that case, but the LOD
> is really rather meaningless.  This commit allows other NIR producers to be
> more lazy and not provide one at all.
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 4 ++++
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 4 ++++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index c52bc04..6f51ce1 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -2780,6 +2780,10 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, nir_tex_instr *instr)
>  
>     fs_reg coordinate, shadow_comparitor, lod, lod2, sample_index, mcs, tex_offset;
>  
> +   /* Our hardware requires a LOD for buffer textures */
> +   if (instr->sampler_dim == GLSL_SAMPLER_DIM_BUF)
> +      lod = brw_imm_d(0);
> +
>     for (unsigned i = 0; i < instr->num_srcs; i++) {
>        fs_reg src = get_nir_src(instr->src[i].src);
>        switch (instr->src[i].src_type) {
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index cf1f82f..cfb66a5 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -1558,6 +1558,10 @@ vec4_visitor::nir_emit_texture(nir_tex_instr *instr)
>   nir_tex_instr_dest_size(instr));
>     dst_reg dest = get_nir_dest(instr->dest, instr->dest_type);
>  
> +   /* Our hardware requires a LOD for buffer textures */
> +   if (instr->sampler_dim == GLSL_SAMPLER_DIM_BUF)
> +      lod = brw_imm_d(0);
> +
>     /* Load the texture operation sources */
>     for (unsigned i = 0; i < instr->num_srcs; i++) {
>        switch (instr->src[i].src_type) {
> 
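
A minimal sketch of the pattern in the patch above, written in plain C rather than the driver's C++: seed the LOD with a default of 0 for buffer textures, then let an explicit LOD source override it if one is present. The enums, `struct tex_src`, and `resolve_lod` are hypothetical stand-ins and not Mesa's actual types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the sampler dimension and source kinds. */
enum sampler_dim { DIM_2D, DIM_BUF };
enum src_type { SRC_COORD, SRC_LOD };

struct tex_src {
   enum src_type type;
   int value;
};

/* Returns the LOD to use, or -1 if none is available.
 * Buffer textures start from a default LOD of 0, so a producer that
 * supplies no LOD source still yields a usable value. */
static int
resolve_lod(enum sampler_dim dim, const struct tex_src *srcs, size_t num_srcs)
{
   int lod = (dim == DIM_BUF) ? 0 : -1;

   assert(srcs != NULL || num_srcs == 0);

   /* An explicit LOD source overrides the default. */
   for (size_t i = 0; i < num_srcs; i++) {
      if (srcs[i].type == SRC_LOD)
         lod = srcs[i].value;
   }

   return lod;
}
```

With this shape, a producer that emits only a coordinate for a buffer texture still ends up with LOD 0, which matches what the commit message says GLSL IR already provides.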

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: assign varying locations to tess shaders when doing SSO

2015-12-13 Thread Timothy Arceri
On Sun, 2015-12-13 at 03:25 -0500, Ilia Mirkin wrote:
> GRID Autosport uses SSO shaders. When a tessellation evaluation
> shader
> is passed through this, it triggers assertion failures down the line
> with unassigned varying locations. Make sure to do this when the
> first
> shader in the pipeline is not a vertex shader.
> 
> Signed-off-by: Ilia Mirkin 
> Cc: "11.0 11.1" 

Reviewed-by: Timothy Arceri 


[Mesa-dev] [PATCH] glsl: assign varying locations to tess shaders when doing SSO

2015-12-13 Thread Ilia Mirkin
GRID Autosport uses SSO shaders. When a tessellation evaluation shader
is passed through this, it triggers assertion failures down the line
with unassigned varying locations. Make sure to do this when the first
shader in the pipeline is not a vertex shader.

Signed-off-by: Ilia Mirkin 
Cc: "11.0 11.1" 
---
 src/glsl/linker.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index a87bbb2..158361a 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -4423,13 +4423,13 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
    if (first < MESA_SHADER_FRAGMENT) {
       gl_shader *const sh = prog->_LinkedShaders[last];
 
-      if (first == MESA_SHADER_GEOMETRY) {
+      if (first != MESA_SHADER_VERTEX) {
          /* There was no vertex shader, but we still have to assign varying
           * locations for use by geometry shader inputs in SSO.
           *
           * If the shader is not separable (i.e., prog->SeparateShader is
-          * false), linking will have already failed when first is
-          * MESA_SHADER_GEOMETRY.
+          * false), linking will have already failed when first is not
+          * MESA_SHADER_VERTEX.
           */
          if (!assign_varying_locations(ctx, mem_ctx, prog,
                                        NULL, prog->_LinkedShaders[first],
-- 
2.4.10
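
To illustrate what the changed condition covers, here is a small sketch in C. The `enum stage` below is a hypothetical local enum that lists stages in pipeline order; the two helpers only model the `if` conditions from the diff and are not the linker code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical local enum, stages listed in pipeline order. */
enum stage {
   STAGE_VERTEX,
   STAGE_TESS_CTRL,
   STAGE_TESS_EVAL,
   STAGE_GEOMETRY,
   STAGE_FRAGMENT
};

/* Old condition: only a leading geometry shader got input locations. */
static bool
old_assigns_input_locations(enum stage first)
{
   assert(first <= STAGE_FRAGMENT);
   return first < STAGE_FRAGMENT && first == STAGE_GEOMETRY;
}

/* New condition: any leading stage other than the vertex shader does. */
static bool
new_assigns_input_locations(enum stage first)
{
   assert(first <= STAGE_FRAGMENT);
   return first < STAGE_FRAGMENT && first != STAGE_VERTEX;
}
```

Under this model a separable program whose first stage is a tessellation evaluation shader is skipped by the old check and handled by the new one, which is the case the commit message describes.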



Re: [Mesa-dev] [Mesa-stable] [PATCH] configure.ac: fix test for SSE4.1 assembler support

2015-12-13 Thread Jonathan Gray
On Sat, Dec 12, 2015 at 06:41:56PM +, Emil Velikov wrote:
> On 10 December 2015 at 08:42, Oded Gabbay  wrote:
> > On Wed, Dec 9, 2015 at 8:30 PM, Matt Turner  wrote:
> >> On Tue, Dec 8, 2015 at 9:37 PM, Jonathan Gray  wrote:
> >>> Change the __m128i variables to be volatile so gcc 4.9 won't optimise
> >>> all of them out with -O1 or greater.  The _mm_set1_epi32/pinsrd calls
> >>> still get optimised out but now there is at least one SSE4.1 instruction
> >>> generated via _mm_max_epu32/pmaxud.  When all of the sse4.1 instructions
> >>> got optimised out the configure test would incorrectly pass when the
> >>> compiler supported the intrinsics and the assembler didn't support the
> >>> instructions.
> >>>
> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806
> >>> Signed-off-by: Jonathan Gray 
> >>> Cc: "11.0 11.1" 
> >>> ---
> >>>  configure.ac | 2 +-
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/configure.ac b/configure.ac
> >>> index 260934d..1d82e47 100644
> >>> --- a/configure.ac
> >>> +++ b/configure.ac
> >>> @@ -384,7 +384,7 @@ CFLAGS="$SSE41_CFLAGS $CFLAGS"
> >>>  AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
> >>>  #include 
> >>>  int main () {
> >>> -__m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
> >>> +volatile __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
> >>>  c = _mm_max_epu32(a, b);
> >>>  return 0;
> >>
> >> I would have extracted an int from the result of _mm_max_epu32 and
> >> returned that instead of 0.
> >
> > Instead of the volatile I assume ?
> >
> Precisely. If anyone wants to follow on Matt's suggestion we can pick
> that one as well. I'd like to get a patch for the next stable releases
> (next Friday for 11.0.x and just after new year for 11.1.1) so I'll
> take whatever's around :-)
> 
> -Emil

I avoided that as I wasn't sure if there was a case where autoconf
cared about the return code.  If someone wants to create a new diff
feel free, I have limited connectivity till the middle of next week.


Re: [Mesa-dev] [PATCH v2 10/13] i965: Make TES inputs match TCS outputs.

2015-12-13 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2015-12-11 13:23:59, Kenneth Graunke wrote:
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_nir.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
> index d405991..3cb6123 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> @@ -641,6 +641,17 @@ brw_create_nir(struct brw_context *brw,
>     /* First, lower the GLSL IR or Mesa IR to NIR */
>     if (shader_prog) {
>        nir = glsl_to_nir(shader_prog, stage, options);
> +
> +      if (nir->stage == MESA_SHADER_TESS_EVAL &&
> +          shader_prog->_LinkedShaders[MESA_SHADER_TESS_CTRL]) {
> +         const struct gl_program *tcs =
> +            shader_prog->_LinkedShaders[MESA_SHADER_TESS_CTRL]->Program;
> +         /* Work around the TCS having bonus outputs used as shared memory
> +          * segments, which makes OutputsWritten not match InputsRead
> +          */
> +         nir->info.inputs_read = tcs->OutputsWritten;
> +         nir->info.patch_inputs_read = tcs->PatchOutputsWritten;
> +      }
>     } else {
>        nir = prog_to_nir(prog, options);
>        OPT_V(nir_convert_to_ssa); /* turn registers into SSA */
> -- 
> 2.6.3
> 


Re: [Mesa-dev] [Mesa-stable] [PATCH] configure.ac: fix test for SSE4.1 assembler support

2015-12-13 Thread Oded Gabbay
On Sun, Dec 13, 2015 at 11:56 AM, Jonathan Gray  wrote:
> On Sat, Dec 12, 2015 at 06:41:56PM +, Emil Velikov wrote:
>> On 10 December 2015 at 08:42, Oded Gabbay  wrote:
>> > On Wed, Dec 9, 2015 at 8:30 PM, Matt Turner  wrote:
>> >> On Tue, Dec 8, 2015 at 9:37 PM, Jonathan Gray  wrote:
>> >>> Change the __m128i variables to be volatile so gcc 4.9 won't optimise
>> >>> all of them out with -O1 or greater.  The _mm_set1_epi32/pinsrd calls
>> >>> still get optimised out but now there is at least one SSE4.1 instruction
>> >>> generated via _mm_max_epu32/pmaxud.  When all of the sse4.1 instructions
>> >>> got optimised out the configure test would incorrectly pass when the
>> >>> compiler supported the intrinsics and the assembler didn't support the
>> >>> instructions.
>> >>>
>> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806
>> >>> Signed-off-by: Jonathan Gray 
>> >>> Cc: "11.0 11.1" 
>> >>> ---
>> >>>  configure.ac | 2 +-
>> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >>>
>> >>> diff --git a/configure.ac b/configure.ac
>> >>> index 260934d..1d82e47 100644
>> >>> --- a/configure.ac
>> >>> +++ b/configure.ac
>> >>> @@ -384,7 +384,7 @@ CFLAGS="$SSE41_CFLAGS $CFLAGS"
>> >>>  AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
>> >>>  #include 
>> >>>  int main () {
>> >>> -__m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
>> >>> +volatile __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
>> >>>  c = _mm_max_epu32(a, b);
>> >>>  return 0;
>> >>
>> >> I would have extracted an int from the result of _mm_max_epu32 and
>> >> returned that instead of 0.
>> >
>> > Instead of the volatile I assume ?
>> >
>> Precisely. If anyone wants to follow on Matt's suggestion we can pick
>> that one as well. I'd like to get a patch for the next stable releases
>> (next Friday for 11.0.x and just after new year for 11.1.1) so I'll
>> take whatever's around :-)
>>
>> -Emil
>
> I avoided that as I wasn't sure if there was a case where autoconf
> cared about the return code.  If someone wants to create a new diff
> feel free, I have limited connectivity till the middle of next week.

So I'm not a huge SSE expert, but I tried doing this (remove volatile
and return _mm_cvtsi128_si32 of c):


#include 
#include 
#include 

int main () {
__m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
c = _mm_xor_si128 (a, b);
return _mm_cvtsi128_si32(c);
}
-

When compiling with "gcc -O1 -msse2", gcc is 4.8.5 (from RHEL 7.2), I got:

-
main:
.LFB521:
.cfi_startproc
movl $0, %eax
ret
.cfi_endproc
---

So unless I misunderstood Matt's suggestion, I think we *have* to use
volatile, as it forces the compiler to emit the pxor and movdqa
instructions.

   Oded


Re: [Mesa-dev] [PATCH 3/3] i965: Use MESA_FORMAT_B8G8R8X8_SRGB for RGB visuals

2015-12-13 Thread Mike Lothian
Hi

These three commits have stopped Plasma 5's KWin from working on my Skylake
system.

Reverting to 7752bbc44e78e982de3cd4c34862adc38a338234 fixed it for me.

I can send more details or file a bug if you like.

Cheers

Mike

On Sat, 12 Dec 2015 at 20:56 Kenneth Graunke  wrote:

> On Friday, December 11, 2015 12:32:18 PM Neil Roberts wrote:
> > Previously if the visual didn't have an alpha channel then it would
> > pick a format that is not sRGB-capable. I don't think there's any
> > reason not to always have an sRGB-capable visual. Since 28090b30 there
> > are now visuals advertised without an alpha channel which means that
> > games that don't request alpha bits in the config would end up without
> > an sRGB-capable visual. This was breaking supertuxkart which assumes
> > the winsys buffer is always sRGB-capable.
> >
> > The previous code always used an RGBA format if the visual config
> > itself was marked as sRGB-capable regardless of whether the visual has
> > alpha bits. I think we don't actually advertise any sRGB-capable
> > visuals (but we just use sRGB formats anyway) so it shouldn't make any
> > difference. However this patch also changes it to use RGBX if an
> > sRGB-capable visual is requested without alpha bits for consistency.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92759
> > Cc: "11.0 11.1" 
> > Cc: Ilia Mirkin 
> > Suggested-by: Ilia Mirkin 
> > ---
> >  src/mesa/drivers/dri/i965/intel_screen.c | 13 ++---
> >  1 file changed, 6 insertions(+), 7 deletions(-)
>
> The whole series is:
> Reviewed-by: Kenneth Graunke 
>
> We definitely should have the same behavior regardless of whether the
> config has an alpha channel.  So, this is good.
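
A rough sketch of the behaviour change described in the quoted commit message, restricted to 32-bit visuals. The `enum winsys_format` values and both helper functions are hypothetical names for illustration; the real selection code in intel_screen.c handles more cases than this.

```c
#include <assert.h>

/* Hypothetical format names, for illustration only. */
enum winsys_format {
   FMT_B8G8R8X8_UNORM,
   FMT_B8G8R8X8_SRGB,
   FMT_B8G8R8A8_SRGB
};

/* Before: a visual without alpha got a format that is not sRGB-capable. */
static enum winsys_format
pick_format_before(int alpha_bits)
{
   assert(alpha_bits >= 0);
   return alpha_bits > 0 ? FMT_B8G8R8A8_SRGB : FMT_B8G8R8X8_UNORM;
}

/* After: both cases get an sRGB-capable format; only the alpha
 * channel differs. */
static enum winsys_format
pick_format_after(int alpha_bits)
{
   assert(alpha_bits >= 0);
   return alpha_bits > 0 ? FMT_B8G8R8A8_SRGB : FMT_B8G8R8X8_SRGB;
}
```

The only case that changes is the one without alpha bits, which is the case the commit message ties to the supertuxkart breakage.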


Re: [Mesa-dev] [PATCH 3/4] gallium: add GREMEDY_string_marker

2015-12-13 Thread Rob Clark
On Tue, Dec 8, 2015 at 6:57 AM, Emil Velikov  wrote:
> On 7 December 2015 at 18:55, Rob Clark  wrote:
>> On Mon, Dec 7, 2015 at 1:42 PM, Emil Velikov  wrote:
>>> On 7 December 2015 at 16:45, Rob Clark  wrote:
>>>> From: Rob Clark
>>>>
>>>> Only exposed w/ ST_DEBUG=gremedy.
>>>>
>>> Perhaps a bit of a silly question - why expose the extension only for
>>> debug mesa builds ? Afaict there isn't any noticeable performance
>>> implication (from infrastructural POV) is there ?
>>>
>>> If driver X performance goes down the drain, just have them
>>> conditionally return 0/1 for PIPE_CAP_STRING_MARKER ?
>>
>> It was suggested, iirc by Ian, on the basis that apps might do
>> something non-performant if they see the extension (since the original
>> use-case was that the extension is injected by the gremedy debugger).
>>
> Hmm fair enough. Can you add some/most of that in the commit message, please ?

Perhaps something like:

Since the GREMEDY extensions are normally only exposed by the gremedy
debugger (and could possibly trigger debug paths in the app), we don't
expose the extension by default, but instead only with
ST_DEBUG=gremedy.

?

I wouldn't mind at least getting this one patch pushed in the near
future (since it conflicts all over the place every time someone adds
a new pipe-cap ;-))

BR,
-R

> Thanks
> Emil
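
As an illustration of gating an extension on a debug flag such as ST_DEBUG=gremedy, here is a small C sketch. It assumes the variable holds a comma-separated list of flag names; `debug_flag_set` and `expose_string_marker` are hypothetical helpers, and Mesa's own option parser is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if `flag` appears as a whole token in the
 * comma-separated list `value`. A NULL list means no flags are set. */
static bool
debug_flag_set(const char *value, const char *flag)
{
   assert(flag != NULL && flag[0] != '\0');

   if (value == NULL)
      return false;

   size_t flag_len = strlen(flag);
   const char *p = value;

   while (*p != '\0') {
      size_t tok_len = strcspn(p, ",");   /* length up to next comma */
      if (tok_len == flag_len && strncmp(p, flag, flag_len) == 0)
         return true;
      p += tok_len;
      if (*p == ',')
         p++;                             /* skip the separator */
   }

   return false;
}

/* The extension stays hidden unless the flag was requested. */
static bool
expose_string_marker(const char *st_debug_value)
{
   return debug_flag_set(st_debug_value, "gremedy");
}
```

A caller could pass `getenv("ST_DEBUG")` straight in, since an unset variable yields NULL and therefore leaves the extension hidden by default.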


Re: [Mesa-dev] [Mesa-stable] [PATCH] configure.ac: fix test for SSE4.1 assembler support

2015-12-13 Thread Matt Turner
On Sun, Dec 13, 2015 at 5:23 AM, Oded Gabbay  wrote:
> On Sun, Dec 13, 2015 at 11:56 AM, Jonathan Gray  wrote:
>> On Sat, Dec 12, 2015 at 06:41:56PM +, Emil Velikov wrote:
>>> On 10 December 2015 at 08:42, Oded Gabbay  wrote:
>>> > On Wed, Dec 9, 2015 at 8:30 PM, Matt Turner  wrote:
>>> >> On Tue, Dec 8, 2015 at 9:37 PM, Jonathan Gray  wrote:
>>> >>> Change the __m128i variables to be volatile so gcc 4.9 won't optimise
>>> >>> all of them out with -O1 or greater.  The _mm_set1_epi32/pinsrd calls
>>> >>> still get optimised out but now there is at least one SSE4.1 instruction
>>> >>> generated via _mm_max_epu32/pmaxud.  When all of the sse4.1 instructions
>>> >>> got optimised out the configure test would incorrectly pass when the
>>> >>> compiler supported the intrinsics and the assembler didn't support the
>>> >>> instructions.
>>> >>>
>>> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806
>>> >>> Signed-off-by: Jonathan Gray 
>>> >>> Cc: "11.0 11.1" 
>>> >>> ---
>>> >>>  configure.ac | 2 +-
>>> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>> >>>
>>> >>> diff --git a/configure.ac b/configure.ac
>>> >>> index 260934d..1d82e47 100644
>>> >>> --- a/configure.ac
>>> >>> +++ b/configure.ac
>>> >>> @@ -384,7 +384,7 @@ CFLAGS="$SSE41_CFLAGS $CFLAGS"
>>> >>>  AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
>>> >>>  #include 
>>> >>>  int main () {
>>> >>> -__m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
>>> >>> +volatile __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
>>> >>>  c = _mm_max_epu32(a, b);
>>> >>>  return 0;
>>> >>
>>> >> I would have extracted an int from the result of _mm_max_epu32 and
>>> >> returned that instead of 0.
>>> >
>>> > Instead of the volatile I assume ?
>>> >
>>> Precisely. If anyone wants to follow on Matt's suggestion we can pick
>>> that one as well. I'd like to get a patch for the next stable releases
>>> (next Friday for 11.0.x and just after new year for 11.1.1) so I'll
>>> take whatever's around :-)
>>>
>>> -Emil
>>
>> I avoided that as I wasn't sure if there was a case where autoconf
>> cared about the return code.  If someone wants to create a new diff
>> feel free, I have limited connectivity till the middle of next week.
>
> So I'm not a huge SSE expert, but I tried doing this (remove volatile
> and return _mm_cvtsi128_si32 of c):
>
> 
> #include 
> #include 
> #include 
>
> int main () {
> __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
> c = _mm_xor_si128 (a, b);
> return _mm_cvtsi128_si32(c);
> }
> -
>
> When compiling with "gcc -O1 -msse2", gcc is 4.8.5 (from RHEL 7.2), I got:
>
> -
> main:
> .LFB521:
> .cfi_startproc
> movl $0, %eax
> ret
> .cfi_endproc
> ---
>
> So unless I misunderstood matt's suggestion, I think we *have* to use
> the volatile as it forces the compiler to produce pxor and movdqa
> assembly commands.

Since all the arguments to the intrinsics are constants, GCC is
constant-evaluating them.

I expect all you'd need to do is pass some global variables to the
intrinsics or similar.


Re: [Mesa-dev] [Mesa-stable] [PATCH] configure.ac: fix test for SSE4.1 assembler support

2015-12-13 Thread Oded Gabbay
On Sun, Dec 13, 2015 at 10:34 PM, Matt Turner  wrote:
> On Sun, Dec 13, 2015 at 5:23 AM, Oded Gabbay  wrote:
>> On Sun, Dec 13, 2015 at 11:56 AM, Jonathan Gray  wrote:
>>> On Sat, Dec 12, 2015 at 06:41:56PM +, Emil Velikov wrote:
>>>> On 10 December 2015 at 08:42, Oded Gabbay  wrote:
>>>> > On Wed, Dec 9, 2015 at 8:30 PM, Matt Turner  wrote:
>>>> >> On Tue, Dec 8, 2015 at 9:37 PM, Jonathan Gray  wrote:
>>>> >>> Change the __m128i variables to be volatile so gcc 4.9 won't optimise
>>>> >>> all of them out with -O1 or greater.  The _mm_set1_epi32/pinsrd calls
>>>> >>> still get optimised out but now there is at least one SSE4.1 instruction
>>>> >>> generated via _mm_max_epu32/pmaxud.  When all of the sse4.1 instructions
>>>> >>> got optimised out the configure test would incorrectly pass when the
>>>> >>> compiler supported the intrinsics and the assembler didn't support the
>>>> >>> instructions.
>>>> >>>
>>>> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806
>>>> >>> Signed-off-by: Jonathan Gray 
>>>> >>> Cc: "11.0 11.1" 
>>>> >>> ---
>>>> >>>  configure.ac | 2 +-
>>>> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>> >>>
>>>> >>> diff --git a/configure.ac b/configure.ac
>>>> >>> index 260934d..1d82e47 100644
>>>> >>> --- a/configure.ac
>>>> >>> +++ b/configure.ac
>>>> >>> @@ -384,7 +384,7 @@ CFLAGS="$SSE41_CFLAGS $CFLAGS"
>>>> >>>  AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
>>>> >>>  #include 
>>>> >>>  int main () {
>>>> >>> -__m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
>>>> >>> +volatile __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
>>>> >>>  c = _mm_max_epu32(a, b);
>>>> >>>  return 0;
>>>> >>
>>>> >> I would have extracted an int from the result of _mm_max_epu32 and
>>>> >> returned that instead of 0.
>>>> >
>>>> > Instead of the volatile I assume ?
>>>> >
>>>> Precisely. If anyone wants to follow on Matt's suggestion we can pick
>>>> that one as well. I'd like to get a patch for the next stable releases
>>>> (next Friday for 11.0.x and just after new year for 11.1.1) so I'll
>>>> take whatever's around :-)
>>>>
>>>> -Emil
>>>
>>> I avoided that as I wasn't sure if there was a case where autoconf
>>> cared about the return code.  If someone wants to create a new diff
>>> feel free, I have limited connectivity till the middle of next week.
>>
>> So I'm not a huge SSE expert, but I tried doing this (remove volatile
>> and return _mm_cvtsi128_si32 of c):
>>
>> 
>> #include 
>> #include 
>> #include 
>>
>> int main () {
>> __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
>> c = _mm_xor_si128 (a, b);
>> return _mm_cvtsi128_si32(c);
>> }
>> -
>>
>> When compiling with "gcc -O1 -msse2", gcc is 4.8.5 (from RHEL 7.2), I got:
>>
>> -
>> main:
>> .LFB521:
>> .cfi_startproc
>> movl $0, %eax
>> ret
>> .cfi_endproc
>> ---
>>
>> So unless I misunderstood matt's suggestion, I think we *have* to use
>> the volatile as it forces the compiler to produce pxor and movdqa
>> assembly commands.
>
> Since all the arguments to the intrinsics are constants, GCC is
> constant-evaluating them.
>
> I expect all you'd need to do is pass some global variables to the
> intrinsics or similar.

OK, so what helped was this:

int param;

int main () {
    __m128i a = _mm_set1_epi32 (param), b = _mm_set1_epi32 (param+1), c;
    c = _mm_xor_si128 (a, b);
    return _mm_cvtsi128_si32(c);
}

Notice the (param+1): with just (param) for both operands, the compiler
still optimizes the xor away. That is understandable, as xoring a value
with itself gives 0.

Oded
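
The same reasoning can be sketched with plain integers instead of SSE intrinsics. This is only a scalar analogue under that assumption, not the configure test itself: `xor_same`, `xor_next`, and `check_xor_examples` are hypothetical names, and `param` mirrors the global used above.

```c
#include <assert.h>

/* Global input, in the spirit of the `param` variable above: its value
 * is not a compile-time constant inside the functions below. */
unsigned param;

/* x ^ x is 0 for every x, so a compiler may fold this to a constant
 * even though the operand is only known at run time. */
static unsigned
xor_same(unsigned p)
{
   return p ^ p;
}

/* x ^ (x + 1) depends on x, so the xor has to be computed. */
static unsigned
xor_next(unsigned p)
{
   return p ^ (p + 1u);
}

/* Documents the expected values for a few inputs. */
static void
check_xor_examples(void)
{
   assert(xor_same(param) == 0u);
   assert(xor_next(0u) == 1u);
   assert(xor_next(1u) == 3u);
}
```

The point carried over from the thread is that a non-constant input alone is not enough; the operation also has to produce a result that is not algebraically fixed.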


Re: [Mesa-dev] [PATCH 3/5] i965/gen9: Return false in place of assert in intelEmitCopyBlit()

2015-12-13 Thread Jordan Justen
1-3 Reviewed-by: Jordan Justen 

On 2015-12-11 19:14:22, Anuj Phogat wrote:
> This allows the fallback paths to handle it correctly.
> 
> Signed-off-by: Anuj Phogat 
> ---
>  src/mesa/drivers/dri/i965/intel_blit.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_blit.c b/src/mesa/drivers/dri/i965/intel_blit.c
> index d4e25d8..6d29fbd 100644
> --- a/src/mesa/drivers/dri/i965/intel_blit.c
> +++ b/src/mesa/drivers/dri/i965/intel_blit.c
> @@ -564,9 +564,10 @@ intelEmitCopyBlit(struct brw_context *brw,
> dst_offset, dst_pitch,
> dst_tiling, dst_tr_mode,
> w, h, cpp);
> -   assert(use_fast_copy_blit ||
> -          (src_tr_mode == INTEL_MIPTREE_TRMODE_NONE &&
> -           dst_tr_mode == INTEL_MIPTREE_TRMODE_NONE));
> +   if (!use_fast_copy_blit &&
> +       (src_tr_mode != INTEL_MIPTREE_TRMODE_NONE ||
> +        dst_tr_mode != INTEL_MIPTREE_TRMODE_NONE))
> +      return false;
>  
>     if (use_fast_copy_blit) {
>        /* When two sequential fast copy blits have different source surfaces,
> -- 
> 2.5.0
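
A small sketch of why returning false helps: the caller can then choose a fallback path instead of the process aborting on an assert. The enums and both functions are hypothetical stand-ins written in C; only the condition itself is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the tiled-resource modes and copy paths. */
enum tr_mode { TR_MODE_NONE, TR_MODE_YF, TR_MODE_YS };
enum copy_path { COPY_PATH_BLIT, COPY_PATH_FALLBACK };

/* Reports failure instead of asserting when the regular blit cannot
 * handle the request. */
static bool
try_emit_copy_blit(bool use_fast_copy_blit, enum tr_mode src, enum tr_mode dst)
{
   bool unsupported = !use_fast_copy_blit &&
                      (src != TR_MODE_NONE || dst != TR_MODE_NONE);

   /* By De Morgan, this is exactly the negation of the condition the
    * old assert() required. */
   assert(unsupported ==
          !(use_fast_copy_blit ||
            (src == TR_MODE_NONE && dst == TR_MODE_NONE)));

   if (unsupported)
      return false;

   /* The real function would emit the blit commands here. */
   return true;
}

/* A caller can now pick another path when the blit is refused. */
static enum copy_path
copy_region(bool use_fast_copy_blit, enum tr_mode src, enum tr_mode dst)
{
   if (try_emit_copy_blit(use_fast_copy_blit, src, dst))
      return COPY_PATH_BLIT;
   return COPY_PATH_FALLBACK;
}
```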
> 


Re: [Mesa-dev] [PATCH v5 12/70] glsl: implement unsized array length

2015-12-13 Thread Ilia Mirkin
On Thu, Sep 10, 2015 at 9:35 AM, Iago Toral Quiroga  wrote:
> +ir_expression *
> +lower_ubo_reference_visitor::process_ssbo_unsized_array_length(ir_rvalue **rvalue,
> +                                                               ir_dereference *deref,
> +                                                               ir_variable *var)
> +{
> +   mem_ctx = ralloc_parent(*rvalue);
> +
> +   ir_rvalue *base_offset = NULL;
> +   unsigned const_offset;
> +   bool row_major;
> +   int matrix_columns;
> +   int unsized_array_stride = calculate_unsized_array_stride(deref);
> +
> +   /* Compute the offset to the start if the dereference as well as other
> +* information we need to calculate the length.
> +*/
> +   setup_for_load_or_store(var, deref,
> +                           &base_offset, &const_offset,
> +                           &row_major, &matrix_columns);
> +   /* array.length() =
> +*  max((buffer_object_size - offset_of_array) / stride_of_array, 0)
> +*/
> +   ir_expression *buffer_size = emit_ssbo_get_buffer_size();
> +
> +   ir_expression *offset_of_array = new(mem_ctx)
> +  ir_expression(ir_binop_add, base_offset,
> +new(mem_ctx) ir_constant(const_offset));
> +   ir_expression *offset_of_array_int = new(mem_ctx)
> +  ir_expression(ir_unop_u2i, offset_of_array);
> +
> +   ir_expression *sub = new(mem_ctx)
> +  ir_expression(ir_binop_sub, buffer_size, offset_of_array_int);
> +   ir_expression *div =  new(mem_ctx)
> +  ir_expression(ir_binop_div, sub,
> +new(mem_ctx) ir_constant(unsized_array_stride));
> +   ir_expression *max = new(mem_ctx)
> +  ir_expression(ir_binop_max, div, new(mem_ctx) ir_constant(0));
> +
> +   return max;
> +}

Hi Iago,

I noticed that this comes out as a signed division. Is there any way
to make it into an unsigned division? That way we can e.g. optimize a
power-of-two division into a shift, and it's a few instructions fewer
to emulate when there's no built-in integer division instruction
(which I think is most GPUs). It seems that you went to some trouble
to do all this with signed integers, but I can't quite figure out why.

Cheers,

  -ilia
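
For reference, the two variants under discussion can be sketched in plain C. Both functions are hypothetical illustrations of the formula in the quoted comment, `max((buffer_object_size - offset_of_array) / stride_of_array, 0)`, and are not the GLSL IR the patch emits.

```c
#include <assert.h>
#include <stdint.h>

/* Signed form, following the formula as written: subtract, divide,
 * then clamp a negative result to 0. */
static int32_t
array_length_signed(int32_t buffer_size, int32_t array_offset, int32_t stride)
{
   assert(stride > 0);
   int32_t len = (buffer_size - array_offset) / stride;
   return len > 0 ? len : 0;
}

/* Unsigned form: clamp first by comparing, so the subtraction cannot
 * wrap, then use an unsigned division. With a power-of-two stride a
 * compiler can turn this division into a plain shift. */
static uint32_t
array_length_unsigned(uint32_t buffer_size, uint32_t array_offset,
                      uint32_t stride)
{
   assert(stride > 0);
   if (array_offset >= buffer_size)
      return 0;
   return (buffer_size - array_offset) / stride;
}
```

The two agree whenever the sizes fit in a signed 32-bit integer; whether larger buffer sizes matter here is a question for the patch authors.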


Re: [Mesa-dev] [Intel-gfx] [RFC libdrm] intel: Add support for softpin

2015-12-13 Thread Kristian Høgsberg
On Sun, Dec 13, 2015 at 7:17 PM, Song, Ruiling  wrote:
>> -Original Message-
>> From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On Behalf
>> Of Michał Winiarski
>> Sent: Wednesday, September 9, 2015 10:07 PM
>> To: intel-...@lists.freedesktop.org
>> Cc: Ben Widawsky ; dri-de...@lists.freedesktop.org;
>> mesa-dev@lists.freedesktop.org
>> Subject: [Intel-gfx] [RFC libdrm] intel: Add support for softpin
>>
>> Softpin allows userspace to take greater control of GPU virtual address
>> space and eliminates the need of relocations. It can also be used to
>> mirror addresses between GPU and CPU (shared virtual memory).
>> Calls to drm_intel_bo_emit_reloc are still required to build the list of
>> drm_i915_gem_exec_objects at exec time, but no entries in relocs are
>> created. Self-relocs don't make any sense for softpinned objects and can
>> indicate a programming error, and are thus forbidden. Softpinned objects
>> are marked by an asterisk in debug dumps.
>>
>> Cc: Thomas Daniel 
>> Cc: Kristian Høgsberg 
>> Cc: Zou Nanhai 
>> Cc: Michel Thierry 
>> Cc: Ben Widawsky 
>> Cc: Chris Wilson 
>> Signed-off-by: Michał Winiarski 
>> ---
>>  include/drm/i915_drm.h|   4 +-
>>  intel/intel_bufmgr.c  |   9 +++
>>  intel/intel_bufmgr.h  |   1 +
>>  intel/intel_bufmgr_gem.c  | 176
>> --
>>  intel/intel_bufmgr_priv.h |   7 ++
>>  5 files changed, 173 insertions(+), 24 deletions(-)
>
> Could anybody help push this patch to libdrm? Beignet depends heavily on it
> to implement OpenCL 2.0 SVM.

Is the kernel patch upstream?

> Thanks!
> Ruiling
>
> ___
> Intel-gfx mailing list
> intel-...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Mesa-dev] [PATCH v5 12/70] glsl: implement unsized array length

2015-12-13 Thread Iago Toral
On Sun, 2015-12-13 at 23:10 -0500, Ilia Mirkin wrote:
> On Thu, Sep 10, 2015 at 9:35 AM, Iago Toral Quiroga  wrote:
> > +ir_expression *
> > +lower_ubo_reference_visitor::process_ssbo_unsized_array_length(ir_rvalue **rvalue,
> > +                                                               ir_dereference *deref,
> > +                                                               ir_variable *var)
> > +{
> > +   mem_ctx = ralloc_parent(*rvalue);
> > +
> > +   ir_rvalue *base_offset = NULL;
> > +   unsigned const_offset;
> > +   bool row_major;
> > +   int matrix_columns;
> > +   int unsized_array_stride = calculate_unsized_array_stride(deref);
> > +
> > +   /* Compute the offset to the start if the dereference as well as other
> > +* information we need to calculate the length.
> > +*/
> > +   setup_for_load_or_store(var, deref,
> > +                           &base_offset, &const_offset,
> > +                           &row_major, &matrix_columns);
> > +   /* array.length() =
> > +*  max((buffer_object_size - offset_of_array) / stride_of_array, 0)
> > +*/
> > +   ir_expression *buffer_size = emit_ssbo_get_buffer_size();
> > +
> > +   ir_expression *offset_of_array = new(mem_ctx)
> > +  ir_expression(ir_binop_add, base_offset,
> > +new(mem_ctx) ir_constant(const_offset));
> > +   ir_expression *offset_of_array_int = new(mem_ctx)
> > +  ir_expression(ir_unop_u2i, offset_of_array);
> > +
> > +   ir_expression *sub = new(mem_ctx)
> > +  ir_expression(ir_binop_sub, buffer_size, offset_of_array_int);
> > +   ir_expression *div =  new(mem_ctx)
> > +  ir_expression(ir_binop_div, sub,
> > +new(mem_ctx) ir_constant(unsized_array_stride));
> > +   ir_expression *max = new(mem_ctx)
> > +  ir_expression(ir_binop_max, div, new(mem_ctx) ir_constant(0));
> > +
> > +   return max;
> > +}
> 
> Hi Iago,
> 
> I noticed that this comes out as a signed division. Is there any way
> to make it into an unsigned division? That way we can e.g. optimize a
> power-of-two division into a shift, and it's a few instructions fewer
> to emulate when there's no built-in integer division instruction
> (which I think is most GPUs). It seems that you went to some trouble
> to do all this with signed integers, but I can't quite figure out why.

Hi Ilia,

I agree, I don't see why we would do the extra work to make this
signed... Samuel wrote this code though, so I'll let him confirm.

Iago
