Re: [Mesa-dev] [PATCH V2] mesa: Permanently enable features supported by target CPU at compile time.

2014-11-09 Thread Bruno Jimenez
On Sun, 2014-11-09 at 14:24 +1100, Timothy Arceri wrote:
> On Sat, 2014-11-08 at 18:25 +, Emil Velikov wrote:
> > On 08/11/14 08:35, Siavash Eliasi wrote:
> > > This will remove the need for unnecessary runtime checks for CPU features 
> > > if
> > > already supported by target CPU, resulting in smaller and less branchy 
> > > code.
> > > 
> > A comment I could not withhold based on your earlier post - "We require
> > micro-benchmark for this code. It will take me hours to find why mesa is
> > so slow now :P"
> 
> Hehe, you are technically correct, the best kind of correct.
> 
> >  
> > Ideally mesa should have an infrastructure/farm that handles regressions
> > - be that performance or otherwise. Pretty sure some companies have such
> > features but those seem to be hidden behind locked doors :'(
> 
> I considered trying to work on a solution for this where anyone could
> volunteer their machine to run such tests and send the results back to a
> central server. But in the end it's a pretty big project to get something
> like this working correctly, and to make it easy to set up and thus
> participate in.
> 
> I guess you could leverage something like the oibaf ppa to start with
> rather than pulling down each commit and building. Anyway a much bigger
> project than what I have time for at the moment.

Maybe it isn't needed for every commit. Depending on the architecture
you are testing you would focus on some commits. For example, I have a
radeon with the r600 driver, so I don't need to check all the intel and
nvidia related commits.

Even if it's not feasible for every commit, we may try to do it for
individual series. Or even, we may send some commits with some kind of
'request for test for performance'.

But we would need to have some kind of standardized benchmark/tests for
this, à la piglit.

Just my opinion :)
- Bruno
> 
> > 
> > But on a more mature note, currently only cpu_has_xmm
> > (_tnl_generate_sse_emit) and cpu_has_sse4_1(vbo_get_minmax_index) are
> > actually useful, with the former of questionable amount :P
> > 
> > Can you confirm that it does not cause issues with "interesting" setups
> > such as https://bugs.freedesktop.org/show_bug.cgi?id=71547
> > 
> 
> I think this patch should be fine in this case as the solution there was
> to wrap the code with #ifdef __SSE4_1__ which is what makes this patch
> work. 
> 
> > 
> > Thanks
> > Emil
> > 
> > > V2:
> > > - Removed the SSSE3 related part for the not yet merged patch.
> > > - Avoiding redefinition of macros.
> > > ---
> > >  src/mesa/x86/common_x86_features.h | 26 ++
> > >  1 file changed, 26 insertions(+)
> > > 
> > > diff --git a/src/mesa/x86/common_x86_features.h 
> > > b/src/mesa/x86/common_x86_features.h
> > > index 66f2cf6..65634aa 100644
> > > --- a/src/mesa/x86/common_x86_features.h
> > > +++ b/src/mesa/x86/common_x86_features.h
> > > @@ -59,13 +59,39 @@
> > >  #define X86_CPUEXT_3DNOW_EXT (1<<30)
> > >  #define X86_CPUEXT_3DNOW (1<<31)
> > >  
> > > +#ifdef __MMX__
> > > +#define cpu_has_mmx  1
> > > +#else
> > >  #define cpu_has_mmx  (_mesa_x86_cpu_features & 
> > > X86_FEATURE_MMX)
> > > +#endif
> > > +
> > >  #define cpu_has_mmxext   (_mesa_x86_cpu_features & 
> > > X86_FEATURE_MMXEXT)
> > > +
> > > +#ifdef __SSE__
> > > +#define cpu_has_xmm  1
> > > +#else
> > >  #define cpu_has_xmm  (_mesa_x86_cpu_features & 
> > > X86_FEATURE_XMM)
> > > +#endif
> > > +
> > > +#ifdef __SSE2__
> > > +#define cpu_has_xmm2 1
> > > +#else
> > >  #define cpu_has_xmm2 (_mesa_x86_cpu_features & 
> > > X86_FEATURE_XMM2)
> > > +#endif
> > > +
> > > +#ifdef __3dNOW__
> > > +#define cpu_has_3dnow 1
> > > +#else
> > >  #define cpu_has_3dnow (_mesa_x86_cpu_features & 
> > > X86_FEATURE_3DNOW)
> > > +#endif
> > > +
> > >  #define cpu_has_3dnowext (_mesa_x86_cpu_features & X86_FEATURE_3DNOWEXT)
> > > +
> > > +#ifdef __SSE4_1__
> > > +#define cpu_has_sse4_1   1
> > > +#else
> > >  #define cpu_has_sse4_1   (_mesa_x86_cpu_features & 
> > > X86_FEATURE_SSE4_1)
> > > +#endif
> > >  
> > >  #endif
> > >  
> > > 
> > 
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> 


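A minimal sketch of the idea behind the patch above, for readers who want to see the effect in isolation. The names demo_cpu_features, DEMO_FEATURE_SSE4_1 and demo_has_sse4_1 are hypothetical stand-ins for _mesa_x86_cpu_features, X86_FEATURE_SSE4_1 and cpu_has_sse4_1; this is not Mesa code. When the compiler already targets SSE4.1 (__SSE4_1__ is defined), the macro becomes the constant 1 and the fallback branch can be dropped at compile time. Otherwise the macro tests a mask filled in at run time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bit and runtime mask (assumptions for this sketch). */
#define DEMO_FEATURE_SSE4_1 (1u << 0)
uint32_t demo_cpu_features; /* would be filled in by runtime detection */

#ifdef __SSE4_1__
/* The target CPU is guaranteed to have SSE4.1: no runtime check needed. */
#define demo_has_sse4_1 1
#else
/* Unknown at compile time: test the runtime mask. */
#define demo_has_sse4_1 ((demo_cpu_features & DEMO_FEATURE_SSE4_1) != 0)
#endif

/* Returns 1 when the SSE4.1 path would be taken, 0 for the fallback. */
int demo_pick_path(void)
{
   if (demo_has_sse4_1)
      return 1;
   return 0;
}
```

Built with -msse4.1 (or a -march that implies it), demo_pick_path() should reduce to "return 1"; built with default flags, it depends on the mask.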


[Mesa-dev] [PATCH V4 3/3] i965: add runtime check for SSSE3 rgba8_copy

2014-11-09 Thread Timothy Arceri
Callgrind cpu usage results from pts benchmarks:

For ytile_copy_faster()
Nexuiz 1.6.1: 2.48% -> 0.97%

The following are the only discernible results from teximage:

Without patch and mesa default build flags -
TexSubImage(BGRA/ubyte 256 x 256): 6122.6 images/sec, 1530.6 MB/sec

With patch runtime ssse3 -
TexSubImage(BGRA/ubyte 256 x 256): 9288.0 images/sec, 2322.0 MB/sec

V4:
- fix slight regression when building with ssse3 compile flag by
 wrapping fallback if statements with #ifndef __SSSE3__
- add Mesa demo teximage results to commit message 

V3:
- rather than putting the ssse3 code in a different file
 in order to compile make use of gcc pragma for per
 function optimisations. Results in improved performance and less
 impact on those not needing runtime ssse3 checks.

V2:
- put back the if statements and add one for the SSSE3 rgba8_copy
- move some header files out of the header
- don't indent the preprocessor tests
- changed copyright to Google and add author Frank Henigman

Signed-off-by: Timothy Arceri 
---
 src/mesa/drivers/dri/i965/intel_tex_subimage.c | 96 ++
 1 file changed, 81 insertions(+), 15 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tex_subimage.c 
b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
index cb5738a..4c9ca18 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_subimage.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_subimage.c
@@ -42,8 +42,13 @@
 #include "intel_mipmap_tree.h"
 #include "intel_blit.h"
 
-#ifdef __SSSE3__
+#include "x86/common_x86_asm.h"
+#include "x86/x86_function_opt.h"
+
+#if defined(SSSE3_FUNC_OPT_START)
+SSSE3_FUNC_OPT_START
 #include <tmmintrin.h>
+SSSE3_FUNC_OPT_END
 #endif
 
 #define FILE_DEBUG_FLAG DEBUG_TEXTURE
@@ -175,7 +180,8 @@ err:
return false;
 }
 
-#ifdef __SSSE3__
+#if defined(SSSE3_FUNC_OPT_START)
+SSSE3_FUNC_OPT_START
 static const uint8_t rgba8_permutation[16] =
{ 2,1,0,3, 6,5,4,7, 10,9,8,11, 14,13,12,15 };
 
@@ -185,24 +191,18 @@ static const uint8_t rgba8_permutation[16] =
   (__m128i) _mm_loadu_ps((float *)(src)),   \
   *(__m128i *) rgba8_permutation\
)
-#endif
 
-/**
- * Copy RGBA to BGRA - swap R and B.
+/* Fast copying for tile spans.
+ *
+ * As long as the destination texture is 16 aligned,
+ * any 16 or 64 spans we get here should also be 16 aligned.
  */
 static inline void *
-rgba8_copy(void *dst, const void *src, size_t bytes)
+ssse3_fast_rgba8_copy(void *dst, const void *src, size_t bytes)
 {
uint8_t *d = dst;
uint8_t const *s = src;
 
-#ifdef __SSSE3__
-   /* Fast copying for tile spans.
-*
-* As long as the destination texture is 16 aligned,
-* any 16 or 64 spans we get here should also be 16 aligned.
-*/
-
if (bytes == 16) {
   assert(!(((uintptr_t)dst) & 0xf));
   rgba8_copy_16(d+ 0, s+ 0);
@@ -217,8 +217,30 @@ rgba8_copy(void *dst, const void *src, size_t bytes)
   rgba8_copy_16(d+48, s+48);
   return dst;
}
+
+   while (bytes >= 4) {
+  d[0] = s[2];
+  d[1] = s[1];
+  d[2] = s[0];
+  d[3] = s[3];
+  d += 4;
+  s += 4;
+  bytes -= 4;
+   }
+   return dst;
+}
+SSSE3_FUNC_OPT_END
 #endif
 
+/**
+ * Copy RGBA to BGRA - swap R and B.
+ */
+static inline void *
+rgba8_copy(void *dst, const void *src, size_t bytes)
+{
+   uint8_t *d = dst;
+   uint8_t const *s = src;
+
while (bytes >= 4) {
   d[0] = s[2];
   d[1] = s[1];
@@ -355,16 +377,32 @@ xtile_copy_faster(uint32_t x0, uint32_t x1, uint32_t x2, 
uint32_t x3,
   if (mem_copy == memcpy)
  return xtile_copy(0, 0, xtile_width, xtile_width, 0, xtile_height,
dst, src, src_pitch, swizzle_bit, memcpy);
+  #if defined(SSSE3_FUNC_OPT_START)
+  else if (mem_copy == ssse3_fast_rgba8_copy)
+ return xtile_copy(0, 0, xtile_width, xtile_width, 0, xtile_height,
+   dst, src, src_pitch, swizzle_bit,
+   ssse3_fast_rgba8_copy);
+  #endif
+  #ifndef __SSSE3__
   else if (mem_copy == rgba8_copy)
  return xtile_copy(0, 0, xtile_width, xtile_width, 0, xtile_height,
dst, src, src_pitch, swizzle_bit, rgba8_copy);
+  #endif
} else {
   if (mem_copy == memcpy)
  return xtile_copy(x0, x1, x2, x3, y0, y1,
dst, src, src_pitch, swizzle_bit, memcpy);
+  #if defined(SSSE3_FUNC_OPT_START)
+  else if (mem_copy == ssse3_fast_rgba8_copy)
+ return xtile_copy(x0, x1, x2, x3, y0, y1,
+   dst, src, src_pitch, swizzle_bit,
+   ssse3_fast_rgba8_copy);
+  #endif
+  #ifndef __SSSE3__
   else if (mem_copy == rgba8_copy)
  return xtile_copy(x0, x1, x2, x3, y0, y1,
dst, src, src_pitch, swizzle_bit, rgba8_copy);
+  #endif
}
xtile_copy(x0, x1, x2, x3, y0, y1,
   dst, src, src_pitch, swizzle_bit, mem_copy);
@@ -391,16 +429,32 @@ ytile_copy_faster(uint32_
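
The patch above is cut off, and the definition of SSSE3_FUNC_OPT_START is not shown in this thread. As a rough illustration of the same technique (per-function SSSE3 code generation plus a runtime check), here is a sketch that assumes GCC or Clang on x86 and uses the compiler's target attribute and __builtin_cpu_supports() instead of Mesa's own macros. All demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void *(*demo_copy_fn)(void *dst, const void *src, size_t bytes);

/* Portable fallback: swap R and B in each 4-byte pixel. */
void *demo_rgba8_copy(void *dst, const void *src, size_t bytes)
{
   uint8_t *d = dst;
   const uint8_t *s = src;

   while (bytes >= 4) {
      d[0] = s[2];
      d[1] = s[1];
      d[2] = s[0];
      d[3] = s[3];
      d += 4;
      s += 4;
      bytes -= 4;
   }
   return dst;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <tmmintrin.h>
#define DEMO_HAVE_SSSE3_PATH 1

/* Only this function is compiled with SSSE3 enabled. */
__attribute__((target("ssse3")))
void *demo_rgba8_copy_ssse3(void *dst, const void *src, size_t bytes)
{
   static const uint8_t perm[16] =
      { 2,1,0,3, 6,5,4,7, 10,9,8,11, 14,13,12,15 };
   const __m128i shuffle = _mm_loadu_si128((const __m128i *)perm);
   uint8_t *d = dst;
   const uint8_t *s = src;

   /* Unaligned loads/stores, so no alignment requirement here. */
   while (bytes >= 16) {
      __m128i pixels = _mm_loadu_si128((const __m128i *)s);
      _mm_storeu_si128((__m128i *)d, _mm_shuffle_epi8(pixels, shuffle));
      d += 16;
      s += 16;
      bytes -= 16;
   }
   demo_rgba8_copy(d, s, bytes); /* leftover pixels */
   return dst;
}
#endif

/* Pick the copy routine once, based on what the running CPU supports. */
demo_copy_fn demo_pick_rgba8_copy(void)
{
#ifdef DEMO_HAVE_SSSE3_PATH
   if (__builtin_cpu_supports("ssse3"))
      return demo_rgba8_copy_ssse3;
#endif
   return demo_rgba8_copy;
}
```

Unlike the patch, this sketch uses unaligned loads and stores, so it does not rely on the 16-byte alignment assumption stated in the patch comment.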

Re: [Mesa-dev] [PATCH] r600g: Implement GL_ARB_draw_indirect

2014-11-09 Thread Marek Olšák
This might not always work due to these reasons:

These variables shouldn't be used anywhere if info.indirect != NULL:
- info.start
- info.count
- info.index_bias
For example, the translation of 8-bit indices is broken.

The code which uses these variables has no effect if info.indirect !=
NULL. For clarity, we shouldn't execute that code at all:
- info.start_instance
- info.instance_count

In get_param, you can just use "return the_bool_expression".
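
To illustrate that last point: the relevant hunk of the patch is cut off in the quote below, so the following is only a hypothetical sketch with made-up names (demo_chip, demo_cap_*), not the patch's code. A comparison in C already evaluates to 0 or 1, so the ternary adds nothing.

```c
#include <assert.h>

/* Hypothetical chip families, ordered oldest to newest. */
enum demo_chip { DEMO_CHIP_R600, DEMO_CHIP_CEDAR, DEMO_CHIP_CAYMAN };

/* Verbose form. */
int demo_cap_verbose(enum demo_chip family)
{
   return family >= DEMO_CHIP_CEDAR ? 1 : 0;
}

/* Equivalent: return the boolean expression directly. */
int demo_cap_direct(enum demo_chip family)
{
   return family >= DEMO_CHIP_CEDAR;
}
```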

Marek


On Sat, Nov 8, 2014 at 11:52 PM, Glenn Kennard  wrote:
> Requires evergreen/cayman, and updated radeon kernel module.
>
> Signed-off-by: Glenn Kennard 
> ---
> See also kernel side patch sent to dri-de...@lists.freedesktop.org
>
>  docs/GL3.txt |  4 +-
>  docs/relnotes/10.4.html  |  1 +
>  src/gallium/drivers/r600/evergreend.h|  7 ++-
>  src/gallium/drivers/r600/r600_pipe.c |  6 ++-
>  src/gallium/drivers/r600/r600_state_common.c | 80 
> ++--
>  5 files changed, 77 insertions(+), 21 deletions(-)
>
> diff --git a/docs/GL3.txt b/docs/GL3.txt
> index 2854431..06c52f9 100644
> --- a/docs/GL3.txt
> +++ b/docs/GL3.txt
> @@ -95,7 +95,7 @@ GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, 
> radeonsi, llvmpipe, soft
>  GL 4.0, GLSL 4.00:
>
>GL_ARB_draw_buffers_blendDONE (i965, nv50, 
> nvc0, r600, radeonsi, llvmpipe, softpipe)
> -  GL_ARB_draw_indirect DONE (i965, nvc0, 
> radeonsi, llvmpipe, softpipe)
> +  GL_ARB_draw_indirect DONE (i965, nvc0, 
> r600, radeonsi, llvmpipe, softpipe)
>GL_ARB_gpu_shader5   DONE (i965, nvc0)
>- 'precise' qualifierDONE
>- Dynamically uniform sampler array indices  DONE (r600)
> @@ -159,7 +159,7 @@ GL 4.3, GLSL 4.30:
>GL_ARB_framebuffer_no_attachmentsnot started
>GL_ARB_internalformat_query2 not started
>GL_ARB_invalidate_subdataDONE (all drivers)
> -  GL_ARB_multi_draw_indirect   DONE (i965, nvc0, 
> radeonsi, llvmpipe, softpipe)
> +  GL_ARB_multi_draw_indirect   DONE (i965, nvc0, 
> r600, radeonsi, llvmpipe, softpipe)
>GL_ARB_program_interface_query   not started
>GL_ARB_robust_buffer_access_behavior not started
>GL_ARB_shader_image_size not started
> diff --git a/docs/relnotes/10.4.html b/docs/relnotes/10.4.html
> index d0fbd3b..9c2a491 100644
> --- a/docs/relnotes/10.4.html
> +++ b/docs/relnotes/10.4.html
> @@ -49,6 +49,7 @@ Note: some of the new features are only available with 
> certain drivers.
>  GL_ARB_texture_view on nv50, nvc0
>  GL_ARB_clip_control on llvmpipe, softpipe, r300, r600, radeonsi
>  GL_KHR_context_flush_control on all drivers
> +GL_ARB_draw_indirect, GL_ARB_multi_draw_indirect on r600
>  
>
>
> diff --git a/src/gallium/drivers/r600/evergreend.h 
> b/src/gallium/drivers/r600/evergreend.h
> index 4989996..b8880c8 100644
> --- a/src/gallium/drivers/r600/evergreend.h
> +++ b/src/gallium/drivers/r600/evergreend.h
> @@ -64,6 +64,8 @@
>  #define R600_TEXEL_PITCH_ALIGNMENT_MASK0x7
>
>  #define PKT3_NOP   0x10
> +#define PKT3_SET_BASE  0x11
> +#define PKT3_INDEX_BUFFER_SIZE 0x13
>  #define PKT3_DEALLOC_STATE 0x14
>  #define PKT3_DISPATCH_DIRECT   0x15
>  #define PKT3_DISPATCH_INDIRECT 0x16
> @@ -72,12 +74,15 @@
>  #define PKT3_REG_RMW   0x21
>  #define PKT3_COND_EXEC 0x22
>  #define PKT3_PRED_EXEC 0x23
> -#define PKT3_START_3D_CMDBUF   0x24
> +#define PKT3_DRAW_INDIRECT 0x24
> +#define PKT3_DRAW_INDEX_INDIRECT   0x25
> +#define PKT3_INDEX_BASE0x26
>  #define PKT3_DRAW_INDEX_2  0x27
>  #define PKT3_CONTEXT_CONTROL   0x28
>  #define PKT3_DRAW_INDEX_IMMD_BE0x29
>  #define PKT3_INDEX_TYPE0x2A
>  #define PKT3_DRAW_INDEX0x2B
> +#define PKT3_DRAW_INDIRECT_MULTI   0x2C
>  #define PKT3_DRAW_INDEX_AUTO   0x2D
>  #define PKT3_DRAW_INDEX_IMMD   0x2E
>  #define PKT3_NUM_INSTANCES 0x2F
> diff --git a/src/gallium/drivers/r600/r600_pipe.c 
> b/src/gallium/drivers/r600/r600_pipe.c
> index 0b571e4..829deaf 100644
> --- a/src/gallium/drivers/r600/r600_pipe.c
> +++ b/src/gallium/drivers/r600/r600_pipe.c
> @@ -313,6 +313,11 @@ static int r600_get_param(struct pipe_screen* pscreen, 
> enum pipe_cap param)
> return family >= CHIP_CEDAR ? 1 : 0;
> case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
> 

Re: [Mesa-dev] [PATCH] gbm: dlopen libglapi so gbm_create_device works

2014-11-09 Thread Frank Henigman
On Sat, Nov 8, 2014 at 7:13 PM, Emil Velikov  wrote:
> On 06/11/14 21:29, Frank Henigman wrote:
>> From: Frank Henigman 
>>
>> Dri driver libs are not linked to pull in libglapi so gbm_create_device()
>> fails when it tries to dlopen them (unless the application is linked
>> with something that does pull in libglapi, like libGL).
>> Until dri drivers can be fixed properly, dlopen libglapi before trying
>> to dlopen them.
>> https://bugs.freedesktop.org/show_bug.cgi?id=57702
>>
> Hi Frank,
>
> I think I can understand the frustration that this has caused you, so
> unless there are any objections I will gladly pick it up for the 10.4
> (and if there are no side effects for the stable 10.3 branch).
>
> Just a couple of nits, which I'm planning to make prior to pushing this
> (a week from now, just before the branchpoint)
>  * the bugzilla report mentions libglapi, but in a different light so
> I'll rephrase the commit msg a bit.
>  * we might as well print out an error message and bail out when the
> dlopen fails.

I think the check should be after the dlopen() of blah_dri.so, a few lines
down, and show the dlerror() message if that fails.  That's the code I've
added in the past to diagnose this issue, and I really should have included
it in my patch. Then there's probably no need to error check the new dlopen,
and the later check can stay in when the new dlopen is removed.
Thanks!

> Thanks for bringing this up :)
>
> -Emil
>
>> Signed-off-by: Frank Henigman 
>> ---
>>  src/gbm/backends/dri/gbm_dri.c | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
>> index f637e32..6ea2294 100644
>> --- a/src/gbm/backends/dri/gbm_dri.c
>> +++ b/src/gbm/backends/dri/gbm_dri.c
>> @@ -311,6 +311,11 @@ dri_open_driver(struct gbm_dri_device *dri)
>> if (search_paths == NULL)
>>search_paths = DEFAULT_DRIVER_DIR;
>>
>> +   /* Temporarily work around dri driver libs that need symbols in libglapi
>> +* but don't automatically link it in.
>> +*/
>> +   dlopen("libglapi.so.0", RTLD_LAZY | RTLD_GLOBAL);
>> +
>> dri->driver = NULL;
>> end = search_paths + strlen(search_paths);
>> for (p = search_paths; p < end && dri->driver == NULL; p = next + 1) {
>>
>
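
A minimal sketch of the diagnostic discussed in this thread: report dlerror() when the dlopen() of a driver library fails. The function name demo_open_driver and the flags chosen here are assumptions for illustration; this is not the gbm_dri.c code.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Try to load a shared object; on failure, print why the loader refused it
 * (for example an unresolved libglapi symbol) and return NULL. */
void *demo_open_driver(const char *path)
{
   void *handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);

   if (handle == NULL) {
      const char *msg = dlerror();
      fprintf(stderr, "failed to open %s: %s\n",
              path, msg ? msg : "unknown error");
   }
   return handle;
}
```

On older glibc versions this needs to be linked with -ldl.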


Re: [Mesa-dev] [PATCH] r600g: Implement GL_ARB_draw_indirect

2014-11-09 Thread Benjamin Bellec
Hello,

You have added some #defines but you don't use them everywhere, for
instance:
 cs->buf[cs->cdw++] = PKT3(0x24 /* PKT3_DRAW_INDIRECT */, 1,
rctx->b.predicate_drawing);
instead of simply :
 cs->buf[cs->cdw++] = PKT3(PKT3_DRAW_INDIRECT, 1,
rctx->b.predicate_drawing);

There are 5 instances like that.

Regards.

- Benjamin

2014-11-08 23:52 GMT+01:00 Glenn Kennard :

> Requires evergreen/cayman, and updated radeon kernel module.
>
> Signed-off-by: Glenn Kennard 
> ---
> See also kernel side patch sent to dri-de...@lists.freedesktop.org
>
>  docs/GL3.txt |  4 +-
>  docs/relnotes/10.4.html  |  1 +
>  src/gallium/drivers/r600/evergreend.h|  7 ++-
>  src/gallium/drivers/r600/r600_pipe.c |  6 ++-
>  src/gallium/drivers/r600/r600_state_common.c | 80
> ++--
>  5 files changed, 77 insertions(+), 21 deletions(-)
>
> diff --git a/docs/GL3.txt b/docs/GL3.txt
> index 2854431..06c52f9 100644
> --- a/docs/GL3.txt
> +++ b/docs/GL3.txt
> @@ -95,7 +95,7 @@ GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600,
> radeonsi, llvmpipe, soft
>  GL 4.0, GLSL 4.00:
>
>GL_ARB_draw_buffers_blendDONE (i965, nv50,
> nvc0, r600, radeonsi, llvmpipe, softpipe)
> -  GL_ARB_draw_indirect DONE (i965, nvc0,
> radeonsi, llvmpipe, softpipe)
> +  GL_ARB_draw_indirect DONE (i965, nvc0,
> r600, radeonsi, llvmpipe, softpipe)
>GL_ARB_gpu_shader5   DONE (i965, nvc0)
>- 'precise' qualifierDONE
>- Dynamically uniform sampler array indices  DONE (r600)
> @@ -159,7 +159,7 @@ GL 4.3, GLSL 4.30:
>GL_ARB_framebuffer_no_attachmentsnot started
>GL_ARB_internalformat_query2 not started
>GL_ARB_invalidate_subdataDONE (all drivers)
> -  GL_ARB_multi_draw_indirect   DONE (i965, nvc0,
> radeonsi, llvmpipe, softpipe)
> +  GL_ARB_multi_draw_indirect   DONE (i965, nvc0,
> r600, radeonsi, llvmpipe, softpipe)
>GL_ARB_program_interface_query   not started
>GL_ARB_robust_buffer_access_behavior not started
>GL_ARB_shader_image_size not started
> diff --git a/docs/relnotes/10.4.html b/docs/relnotes/10.4.html
> index d0fbd3b..9c2a491 100644
> --- a/docs/relnotes/10.4.html
> +++ b/docs/relnotes/10.4.html
> @@ -49,6 +49,7 @@ Note: some of the new features are only available with
> certain drivers.
>  GL_ARB_texture_view on nv50, nvc0
>  GL_ARB_clip_control on llvmpipe, softpipe, r300, r600, radeonsi
>  GL_KHR_context_flush_control on all drivers
> +GL_ARB_draw_indirect, GL_ARB_multi_draw_indirect on r600
>  
>
>
> diff --git a/src/gallium/drivers/r600/evergreend.h
> b/src/gallium/drivers/r600/evergreend.h
> index 4989996..b8880c8 100644
> --- a/src/gallium/drivers/r600/evergreend.h
> +++ b/src/gallium/drivers/r600/evergreend.h
> @@ -64,6 +64,8 @@
>  #define R600_TEXEL_PITCH_ALIGNMENT_MASK0x7
>
>  #define PKT3_NOP   0x10
> +#define PKT3_SET_BASE  0x11
> +#define PKT3_INDEX_BUFFER_SIZE 0x13
>  #define PKT3_DEALLOC_STATE 0x14
>  #define PKT3_DISPATCH_DIRECT   0x15
>  #define PKT3_DISPATCH_INDIRECT 0x16
> @@ -72,12 +74,15 @@
>  #define PKT3_REG_RMW   0x21
>  #define PKT3_COND_EXEC 0x22
>  #define PKT3_PRED_EXEC 0x23
> -#define PKT3_START_3D_CMDBUF   0x24
> +#define PKT3_DRAW_INDIRECT 0x24
> +#define PKT3_DRAW_INDEX_INDIRECT   0x25
> +#define PKT3_INDEX_BASE0x26
>  #define PKT3_DRAW_INDEX_2  0x27
>  #define PKT3_CONTEXT_CONTROL   0x28
>  #define PKT3_DRAW_INDEX_IMMD_BE0x29
>  #define PKT3_INDEX_TYPE0x2A
>  #define PKT3_DRAW_INDEX0x2B
> +#define PKT3_DRAW_INDIRECT_MULTI   0x2C
>  #define PKT3_DRAW_INDEX_AUTO   0x2D
>  #define PKT3_DRAW_INDEX_IMMD   0x2E
>  #define PKT3_NUM_INSTANCES 0x2F
> diff --git a/src/gallium/drivers/r600/r600_pipe.c
> b/src/gallium/drivers/r600/r600_pipe.c
> index 0b571e4..829deaf 100644
> --- a/src/gallium/drivers/r600/r600_pipe.c
> +++ b/src/gallium/drivers/r600/r600_pipe.c
> @@ -313,6 +313,11 @@ static int r600_get_param(struct pipe_screen*
> pscreen, enum pipe_cap param)
> return family >= CHIP_CEDAR ? 1 : 0;
> case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
> return family >= CHIP_CEDAR ? 4 : 0;
> +   case PIPE_CAP_DRAW_INDIRECT:
> +   /* needs kernel command checking support to work */
> +   

Re: [Mesa-dev] [PATCH] i965: Use the predicate enable bit for conditional rendering without stalling

2014-11-09 Thread Daniel Vetter
On Fri, Nov 07, 2014 at 03:28:01PM +, Neil Roberts wrote:
> Previously whenever a primitive is drawn the driver would call
> _mesa_check_conditional_render which blocks waiting for the result of the
> query to determine whether to render. On Gen7+ there is a bit in the
> 3DPRIMITIVE command which can be used to disable the primitive based on the
> value of a state bit. This state bit can be set based on whether two registers
> have different values using the MI_PREDICATE command. We can load these two
> registers with the pixel count values stored in the query begin and end to
> implement conditional rendering without stalling.
> 
> Unfortunately these two source registers are not in the whitelist of available
> registers in the kernel driver so this needs a kernel patch to work. This
> patch tries to check whether it is possible to write to this register by
> creating a DRM buffer with a single command to write to the register and then
> trying to exec it. The return value of the exec function (and hence the ioctl)
> is checked.
> 
> There is a function with a similar goal just above to check whether the
> OACONTROL register can be used. This works by putting the test command in the
> regular batch buffer and trying to compare whether the value was successfully
> written. I couldn't get this approach to work because intel_batchbuffer_flush
> actually calls exit() if the ioctl fails. I am suspicious that this approach
> doesn't work for OACONTROL either.

Atm the cmd parser for gen7 validates the batch and rejects it if there's
something in there it doesn't like. But it doesn't grant any additional
privs. Hence
- OACONTROL passes since it's in the whitelist, but the cmd parser doesn't
  grant the privs the writes would need.
- your new regs aren't in the whitelist, so the execbuf fails with
  -EINVAL.

We have a cmd parser version which we intend to bump every time we add new
registers, so probably better to check that. And I guess we'd need one to
indicate that the cmd parser actually does something useful. Probably best
done with just another bump. And if we do that we could replace the
current trick mesa uses with just a getparam query - the getparam is fixed
so either returns -EINVAL on old kernels or the cmd parser version.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
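
A rough sketch of the version check suggested above. It assumes a query that returns either the command parser version or a negative errno (-EINVAL on kernels that do not know the parameter). The real query is a getparam ioctl, which is abstracted here behind a hypothetical function pointer; the version numbers are invented for the example, since the thread has not settled on which version means "the parser actually does something useful".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical query: returns the cmd parser version, or a negative errno. */
typedef int (*demo_query_fn)(void);

/* Use the new registers only if the kernel reports a high enough version. */
bool demo_can_use_predicate(demo_query_fn query, int required_version)
{
   int version = query();

   if (version < 0) /* old kernel: parameter unknown */
      return false;
   return version >= required_version;
}

/* Stand-ins for an old and a new kernel, for demonstration only. */
int demo_old_kernel(void) { return -EINVAL; }
int demo_new_kernel(void) { return 3; }
```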


Re: [Mesa-dev] [PATCH v2 2/2] i965: Use the predicate enable bit for conditional rendering without stalling

2014-11-09 Thread Daniel Vetter
On Fri, Nov 07, 2014 at 06:53:00PM +, Neil Roberts wrote:
> Previously whenever a primitive is drawn the driver would call
> _mesa_check_conditional_render which blocks waiting for the result of the
> query to determine whether to render. On Gen7+ there is a bit in the
> 3DPRIMITIVE command which can be used to disable the primitive based on the
> value of a state bit. This state bit can be set based on whether two registers
> have different values using the MI_PREDICATE command. We can load these two
> registers with the pixel count values stored in the query begin and end to
> implement conditional rendering without stalling.
> 
> Unfortunately these two source registers are not in the whitelist of available
> registers in the kernel driver so this needs a kernel patch to work. This
> patch uses the command parser version from intel_screen to detect whether to
> attempt to set the predicate data registers.
> 
> The predicate enable bit is currently only used for drawing 3D primitives. For
> blits, clears, bitmaps, copypixels and drawpixels it still causes a stall. For
> most of these it would probably just work to call the new
> brw_check_conditional_render function instead of
> _mesa_check_conditional_render because they already work in terms of rendering
> primitives. However it's a bit trickier for blits because it can use the BLT
> ring or the blorp codepath. I think these operations are less useful for
> conditional rendering than rendering primitives so it might be best to leave
> it for a later patch.
> 
> v2: Use the command parser version to detect whether we can write to the
> predicate data registers instead of trying to execute a register load
> command.
> ---
> 
> Glenn Kennard suggested that instead of trying to send a load register
> command to detect whether the predicate source registers can be set we
> could just increase the command parser version in the kernel driver
> and query that. That seems nicer to me so here is a second version of
> the patch to do that. I will post a v2 of the kernel patch too.

Oh, I guess my earlier mail was too late. One issue still is picking the
numbers, since you seem to assume here that ver >= 2 means the stuff
actually works. But like Ken said the cmd parser in upstream isn't
really enabled yet.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


[Mesa-dev] [Bug 86070] Host application crash on vmware fusion 7 in vmw_swc_flush

2014-11-09 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=86070

Bug ID: 86070
   Summary: Host application crash on vmware fusion 7 in
vmw_swc_flush
   Product: Mesa
   Version: 8.0
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: yue.nicho...@gmail.com

nicholas@ubuntu:~$ mplay-bin: vmw_context.c:138: vmw_swc_flush: Assertion `ret
== PIPE_OK' failed.
4646:  (sent by pid 4646)


glxinfo (top portion)

name of display: :0
display: :0  screen: 0
direct rendering: Yes
server glx vendor string: SGI
server glx version string: 1.4
server glx extensions:
GLX_ARB_multisample, GLX_EXT_import_context, GLX_EXT_texture_from_pixmap, 
GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_MESA_copy_sub_buffer, 
GLX_OML_swap_method, GLX_SGIS_multisample, GLX_SGIX_fbconfig, 
GLX_SGIX_pbuffer, GLX_SGIX_visual_select_group, GLX_INTEL_swap_event
client glx vendor string: Mesa Project and SGI
client glx version string: 1.4
client glx extensions:
GLX_ARB_create_context, GLX_ARB_create_context_profile, 
GLX_ARB_get_proc_address, GLX_ARB_multisample, GLX_EXT_import_context, 
GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_EXT_framebuffer_sRGB, 
GLX_EXT_create_context_es2_profile, GLX_MESA_copy_sub_buffer, 
GLX_MESA_multithread_makecurrent, GLX_MESA_swap_control, 
GLX_OML_swap_method, GLX_OML_sync_control, GLX_SGI_make_current_read, 
GLX_SGI_swap_control, GLX_SGI_video_sync, GLX_SGIS_multisample, 
GLX_SGIX_fbconfig, GLX_SGIX_pbuffer, GLX_SGIX_visual_select_group, 
GLX_EXT_texture_from_pixmap, GLX_INTEL_swap_event
GLX version: 1.4
GLX extensions:
GLX_ARB_get_proc_address, GLX_ARB_multisample, GLX_EXT_import_context, 
GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_MESA_copy_sub_buffer, 
GLX_MESA_multithread_makecurrent, GLX_MESA_swap_control, 
GLX_OML_swap_method, GLX_SGI_make_current_read, GLX_SGI_video_sync, 
GLX_SGIS_multisample, GLX_SGIX_fbconfig, GLX_SGIX_pbuffer, 
GLX_SGIX_visual_select_group, GLX_EXT_texture_from_pixmap
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: Gallium 0.4 on SVGA3D; build: RELEASE;  
OpenGL version string: 2.1 Mesa 8.0.4
OpenGL shading language version string: 1.20

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [PATCH 1/4] tgsi: fixup the string of VS_WINDOW_SPACE_POSITION

2014-11-09 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/tgsi/tgsi_strings.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c 
b/src/gallium/auxiliary/tgsi/tgsi_strings.c
index f84cd79..01fa5a9 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_strings.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c
@@ -122,7 +122,7 @@ const char *tgsi_property_names[TGSI_PROPERTY_COUNT] =
"FS_DEPTH_LAYOUT",
"VS_PROHIBIT_UCPS",
"GS_INVOCATIONS",
-   "VS_POSITION_WINDOW_SPACE"
+   "VS_WINDOW_SPACE_POSITION"
 };
 
 const char *tgsi_return_type_names[TGSI_RETURN_TYPE_COUNT] =
-- 
2.1.0
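
The fix above is needed because a name table like tgsi_property_names has to be kept in step with its enum by hand. One way to make that pairing harder to get wrong, sketched here with hypothetical demo_* names rather than the real TGSI definitions, is to use C99 designated initializers so each string sits next to the enum value it names.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, shortened property enum. */
enum demo_property {
   DEMO_PROPERTY_GS_INVOCATIONS,
   DEMO_PROPERTY_VS_WINDOW_SPACE_POSITION,
   DEMO_PROPERTY_COUNT
};

/* Each entry is keyed by its enum value, so reordering cannot mismatch them. */
const char *demo_property_names[DEMO_PROPERTY_COUNT] = {
   [DEMO_PROPERTY_GS_INVOCATIONS] = "GS_INVOCATIONS",
   [DEMO_PROPERTY_VS_WINDOW_SPACE_POSITION] = "VS_WINDOW_SPACE_POSITION",
};

/* Bounds-checked lookup; returns NULL for an unknown property. */
const char *demo_property_name(unsigned prop)
{
   return prop < DEMO_PROPERTY_COUNT ? demo_property_names[prop] : NULL;
}
```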



[Mesa-dev] [PATCH 3/4] gallium/util: add a test for TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION

2014-11-09 Thread Marek Olšák
From: Marek Olšák 

Not testable by OpenGL. Required by Nine.

This is an example of how to implement a piglit-like test using gallium only.
---
 src/gallium/auxiliary/Makefile.sources |   1 +
 src/gallium/auxiliary/util/u_tests.c   | 270 +
 src/gallium/auxiliary/util/u_tests.h   |  37 +
 3 files changed, 308 insertions(+)
 create mode 100644 src/gallium/auxiliary/util/u_tests.c
 create mode 100644 src/gallium/auxiliary/util/u_tests.h

diff --git a/src/gallium/auxiliary/Makefile.sources 
b/src/gallium/auxiliary/Makefile.sources
index f6621ef..9fc1a8a 100644
--- a/src/gallium/auxiliary/Makefile.sources
+++ b/src/gallium/auxiliary/Makefile.sources
@@ -139,6 +139,7 @@ C_SOURCES := \
util/u_suballoc.c \
util/u_surface.c \
util/u_surfaces.c \
+   util/u_tests.c \
util/u_texture.c \
util/u_tile.c \
util/u_transfer.c \
diff --git a/src/gallium/auxiliary/util/u_tests.c 
b/src/gallium/auxiliary/util/u_tests.c
new file mode 100644
index 000..9483f06
--- /dev/null
+++ b/src/gallium/auxiliary/util/u_tests.c
@@ -0,0 +1,270 @@
+/**
+ *
+ * Copyright 2014 Advanced Micro Devices, Inc.
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
+ * IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR
+ * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **/
+
+#include "util/u_tests.h"
+
+#include "util/u_draw_quad.h"
+#include "util/u_format.h"
+#include "util/u_inlines.h"
+#include "util/u_simple_shaders.h"
+#include "util/u_surface.h"
+#include "util/u_tile.h"
+#include "cso_cache/cso_context.h"
+#include 
+
+#define TOLERANCE 0.01
+
+static struct pipe_resource *
+util_create_texture2d(struct pipe_screen *screen, unsigned width,
+  unsigned height, enum pipe_format format)
+{
+   struct pipe_resource templ = {{0}};
+
+   templ.target = PIPE_TEXTURE_2D;
+   templ.width0 = width;
+   templ.height0 = height;
+   templ.depth0 = 1;
+   templ.array_size = 1;
+   templ.format = format;
+   templ.usage = PIPE_USAGE_DEFAULT;
+   templ.bind = PIPE_BIND_SAMPLER_VIEW |
+(util_format_is_depth_or_stencil(format) ?
+PIPE_BIND_DEPTH_STENCIL : PIPE_BIND_RENDER_TARGET);
+
+   return screen->resource_create(screen, &templ);
+}
+
+static void
+util_set_framebuffer_cb0(struct cso_context *cso, struct pipe_context *ctx,
+struct pipe_resource *tex)
+{
+   struct pipe_surface templ = {{0}}, *surf;
+   struct pipe_framebuffer_state fb = {0};
+
+   templ.format = tex->format;
+   surf = ctx->create_surface(ctx, tex, &templ);
+
+   fb.width = tex->width0;
+   fb.height = tex->height0;
+   fb.cbufs[0] = surf;
+   fb.nr_cbufs = 1;
+
+   cso_set_framebuffer(cso, &fb);
+   pipe_surface_reference(&surf, NULL);
+}
+
+static void
+util_set_blend_normal(struct cso_context *cso)
+{
+   struct pipe_blend_state blend = {0};
+
+   blend.rt[0].colormask = PIPE_MASK_RGBA;
+   cso_set_blend(cso, &blend);
+}
+
+static void
+util_set_dsa_disable(struct cso_context *cso)
+{
+   struct pipe_depth_stencil_alpha_state dsa = {{0}};
+
+   cso_set_depth_stencil_alpha(cso, &dsa);
+}
+
+static void
+util_set_rasterizer_normal(struct cso_context *cso)
+{
+   struct pipe_rasterizer_state rs = {0};
+
+   rs.half_pixel_center = 1;
+   rs.bottom_edge_rule = 1;
+   rs.depth_clip = 1;
+
+   cso_set_rasterizer(cso, &rs);
+}
+
+static void
+util_set_max_viewport(struct cso_context *cso, struct pipe_resource *tex)
+{
+   struct pipe_viewport_state viewport;
+
+   viewport.scale[0] = 0.5f * tex->width0;
+   viewport.scale[1] = 0.5f * tex->height0;
+   viewport.scale[2] = 1.0f;
+   viewport.scale[3] = 1.0f;
+   viewport.translate[0] = 0.5f * tex->width0;
+   viewport.translate[1] = 0.5f * tex->height0;
+   viewport.translate[2] = 0.0f;
+   viewport.translate[3] = 0.0f;
+
+   cso_set_viewport(cs

[Mesa-dev] [PATCH 4/4] tgsi/ureg: simplify code for declaring properties

2014-11-09 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/tgsi/tgsi_ureg.c| 153 ++
 src/gallium/auxiliary/tgsi/tgsi_ureg.h|  35 +-
 src/gallium/auxiliary/util/u_simple_shaders.c |   2 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp|  12 +-
 src/mesa/state_tracker/st_mesa_to_tgsi.c  |  12 +-
 src/mesa/state_tracker/st_program.c   |  23 ++--
 6 files changed, 43 insertions(+), 194 deletions(-)
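
The pattern in short: the per-property fields and setters are replaced by one array indexed by property name, with a single setter. A minimal standalone sketch of that pattern follows. The enum, struct, and function names are hypothetical stand-ins rather than the real TGSI/ureg ones, and the ~0 "unset" sentinel is an assumption based on the old `!= ~0` checks:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical property names; the real list is the TGSI_PROPERTY_* enum. */
enum prop_name {
   PROP_GS_INPUT_PRIM,
   PROP_FS_COORD_ORIGIN,
   PROP_VS_WINDOW_SPACE_POSITION,
   PROP_COUNT
};

/* Assumed sentinel meaning "property not declared". */
#define PROP_UNSET (~0u)

struct program {
   unsigned properties[PROP_COUNT];
};

/* Mark every property as not declared. */
static void program_init(struct program *p)
{
   for (size_t i = 0; i < PROP_COUNT; i++)
      p->properties[i] = PROP_UNSET;
}

/* One generic setter instead of one function per property. */
static void program_property(struct program *p, unsigned name, unsigned value)
{
   assert(name < PROP_COUNT);
   p->properties[name] = value;
}

/* Count the properties that a declaration pass would emit. */
static unsigned program_count_declared(const struct program *p)
{
   unsigned n = 0;

   for (size_t i = 0; i < PROP_COUNT; i++)
      if (p->properties[i] != PROP_UNSET)
         n++;
   return n;
}
```

A caller would run program_init() once, then call program_property() for each property it wants declared.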

diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index 6d3ac91..f524dfb 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -165,15 +165,7 @@ struct ureg_program
struct const_decl const_decls;
struct const_decl const_decls2D[PIPE_MAX_CONSTANT_BUFFERS];
 
-   unsigned property_gs_input_prim;
-   unsigned property_gs_output_prim;
-   unsigned property_gs_max_vertices;
-   unsigned property_gs_invocations;
-   unsigned char property_fs_coord_origin; /* = TGSI_FS_COORD_ORIGIN_* */
-   unsigned char property_fs_coord_pixel_center; /* = TGSI_FS_COORD_PIXEL_CENTER_* */
-   unsigned char property_fs_color0_writes_all_cbufs; /* = TGSI_FS_COLOR0_WRITES_ALL_CBUFS * */
-   unsigned char property_fs_depth_layout; /* TGSI_FS_DEPTH_LAYOUT */
-   boolean property_vs_window_space_position; /* TGSI_VS_WINDOW_SPACE_POSITION */
+   unsigned properties[TGSI_PROPERTY_COUNT];
 
unsigned nr_addrs;
unsigned nr_preds;
@@ -278,65 +270,10 @@ ureg_dst_register( unsigned file,
 
 
 void
-ureg_property_gs_input_prim(struct ureg_program *ureg,
-unsigned input_prim)
+ureg_property(struct ureg_program *ureg, unsigned name, unsigned value)
 {
-   ureg->property_gs_input_prim = input_prim;
-}
-
-void
-ureg_property_gs_output_prim(struct ureg_program *ureg,
- unsigned output_prim)
-{
-   ureg->property_gs_output_prim = output_prim;
-}
-
-void
-ureg_property_gs_max_vertices(struct ureg_program *ureg,
-  unsigned max_vertices)
-{
-   ureg->property_gs_max_vertices = max_vertices;
-}
-void
-ureg_property_gs_invocations(struct ureg_program *ureg,
- unsigned invocations)
-{
-   ureg->property_gs_invocations = invocations;
-}
-
-void
-ureg_property_fs_coord_origin(struct ureg_program *ureg,
-unsigned fs_coord_origin)
-{
-   ureg->property_fs_coord_origin = fs_coord_origin;
-}
-
-void
-ureg_property_fs_coord_pixel_center(struct ureg_program *ureg,
-unsigned fs_coord_pixel_center)
-{
-   ureg->property_fs_coord_pixel_center = fs_coord_pixel_center;
-}
-
-void
-ureg_property_fs_color0_writes_all_cbufs(struct ureg_program *ureg,
-unsigned fs_color0_writes_all_cbufs)
-{
-   ureg->property_fs_color0_writes_all_cbufs = fs_color0_writes_all_cbufs;
-}
-
-void
-ureg_property_fs_depth_layout(struct ureg_program *ureg,
-  unsigned fs_depth_layout)
-{
-   ureg->property_fs_depth_layout = fs_depth_layout;
-}
-
-void
-ureg_property_vs_window_space_position(struct ureg_program *ureg,
-   boolean vs_window_space_position)
-{
-   ureg->property_vs_window_space_position = vs_window_space_position;
+   assert(name < Elements(ureg->properties));
+   ureg->properties[name] = value;
 }
 
 struct ureg_src
@@ -1452,77 +1389,9 @@ static void emit_decls( struct ureg_program *ureg )
 {
unsigned i;
 
-   if (ureg->property_gs_input_prim != ~0) {
-  assert(ureg->processor == TGSI_PROCESSOR_GEOMETRY);
-
-  emit_property(ureg,
-TGSI_PROPERTY_GS_INPUT_PRIM,
-ureg->property_gs_input_prim);
-   }
-
-   if (ureg->property_gs_output_prim != ~0) {
-  assert(ureg->processor == TGSI_PROCESSOR_GEOMETRY);
-
-  emit_property(ureg,
-TGSI_PROPERTY_GS_OUTPUT_PRIM,
-ureg->property_gs_output_prim);
-   }
-
-   if (ureg->property_gs_max_vertices != ~0) {
-  assert(ureg->processor == TGSI_PROCESSOR_GEOMETRY);
-
-  emit_property(ureg,
-TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES,
-ureg->property_gs_max_vertices);
-   }
-
-   if (ureg->property_gs_invocations != ~0) {
-  assert(ureg->processor == TGSI_PROCESSOR_GEOMETRY);
-
-  emit_property(ureg,
-TGSI_PROPERTY_GS_INVOCATIONS,
-ureg->property_gs_invocations);
-   }
-
-   if (ureg->property_fs_coord_origin) {
-  assert(ureg->processor == TGSI_PROCESSOR_FRAGMENT);
-
-  emit_property(ureg,
-TGSI_PROPERTY_FS_COORD_ORIGIN,
-ureg->property_fs_coord_origin);
-   }
-
-   if (ureg->property_fs_coord_pixel_center) {
-  assert(ureg->processor == TGSI_PROCESSOR_FRAGMENT);
-
-  emit_property(ureg,
-TGSI_PROPERTY_FS_COORD_PIXEL_CENTER,
-ureg->property_fs_coord_pixel_center);
-   }
-

[Mesa-dev] [PATCH 2/4] gallium/util: add a window_space option to the passthrough vertex shader

2014-11-09 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/postprocess/pp_program.c |  2 +-
 src/gallium/auxiliary/util/u_blit.c|  2 +-
 src/gallium/auxiliary/util/u_blitter.c |  5 +++--
 src/gallium/auxiliary/util/u_simple_shaders.c  | 10 --
 src/gallium/auxiliary/util/u_simple_shaders.h  |  4 +++-
 src/mesa/state_tracker/st_cb_bitmap.c  |  3 ++-
 src/mesa/state_tracker/st_cb_clear.c   |  3 ++-
 src/mesa/state_tracker/st_cb_drawtex.c |  2 +-
 8 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/src/gallium/auxiliary/postprocess/pp_program.c b/src/gallium/auxiliary/postprocess/pp_program.c
index 91cc781..811f1fb 100644
--- a/src/gallium/auxiliary/postprocess/pp_program.c
+++ b/src/gallium/auxiliary/postprocess/pp_program.c
@@ -131,7 +131,7 @@ pp_init_prog(struct pp_queue_t *ppq, struct pipe_context *pipe,
   const uint semantic_indexes[] = { 0, 0 };
   p->passvs = util_make_vertex_passthrough_shader(p->pipe, 2,
   semantic_names,
-  semantic_indexes);
+  semantic_indexes, FALSE);
}
 
p->framebuffer.nr_cbufs = 1;
diff --git a/src/gallium/auxiliary/util/u_blit.c b/src/gallium/auxiliary/util/u_blit.c
index 2573bed..020e5f7 100644
--- a/src/gallium/auxiliary/util/u_blit.c
+++ b/src/gallium/auxiliary/util/u_blit.c
@@ -188,7 +188,7 @@ set_vertex_shader(struct blit_state *ctx)
   const uint semantic_indexes[] = { 0, 0 };
   ctx->vs = util_make_vertex_passthrough_shader(ctx->pipe, 2,
 semantic_names,
-semantic_indexes);
+semantic_indexes, FALSE);
}
 
cso_set_vertex_shader_handle(ctx->cso, ctx->vs);
diff --git a/src/gallium/auxiliary/util/u_blitter.c b/src/gallium/auxiliary/util/u_blitter.c
index 4626c1e..b4598b8 100644
--- a/src/gallium/auxiliary/util/u_blitter.c
+++ b/src/gallium/auxiliary/util/u_blitter.c
@@ -331,7 +331,8 @@ static void bind_vs_pos_only(struct blitter_context_priv *ctx)
 
   ctx->vs_pos_only =
  util_make_vertex_passthrough_shader_with_so(pipe, 1, semantic_names,
- semantic_indices, &so);
+ semantic_indices, FALSE,
+ &so);
}
 
pipe->bind_vs_state(pipe, ctx->vs_pos_only);
@@ -347,7 +348,7 @@ static void bind_vs_passthrough(struct blitter_context_priv *ctx)
   const uint semantic_indices[] = { 0, 0 };
   ctx->vs =
  util_make_vertex_passthrough_shader(pipe, 2, semantic_names,
- semantic_indices);
+ semantic_indices, FALSE);
}
 
pipe->bind_vs_state(pipe, ctx->vs);
diff --git a/src/gallium/auxiliary/util/u_simple_shaders.c b/src/gallium/auxiliary/util/u_simple_shaders.c
index adf4887..280ed8f 100644
--- a/src/gallium/auxiliary/util/u_simple_shaders.c
+++ b/src/gallium/auxiliary/util/u_simple_shaders.c
@@ -59,11 +59,13 @@ void *
 util_make_vertex_passthrough_shader(struct pipe_context *pipe,
 uint num_attribs,
 const uint *semantic_names,
-const uint *semantic_indexes)
+const uint *semantic_indexes,
+bool window_space)
 {
return util_make_vertex_passthrough_shader_with_so(pipe, num_attribs,
   semantic_names,
-  semantic_indexes, NULL);
+  semantic_indexes,
+  window_space, NULL);
 }
 
 void *
@@ -71,6 +73,7 @@ util_make_vertex_passthrough_shader_with_so(struct pipe_context *pipe,
 uint num_attribs,
 const uint *semantic_names,
 const uint *semantic_indexes,
+bool window_space,
const struct pipe_stream_output_info *so)
 {
struct ureg_program *ureg;
@@ -80,6 +83,9 @@ util_make_vertex_passthrough_shader_with_so(struct pipe_context *pipe,
if (ureg == NULL)
   return NULL;
 
+   if (window_space)
+  ureg_property_vs_window_space_position(ureg, TRUE);
+
for (i = 0; i < num_attribs; i++) {
   struct ureg_src src;
   struct ureg_dst dst;
diff --git a/src/gallium/auxiliary/util/u_simple_shaders.h b/src/gallium/auxiliary/util/u_simple_shaders.h
index c1d14aa..2ba3f2a 100644
--- a/src/gallium/auxiliary/util/u_simple_shaders.h
+++ b/src/galli

[Mesa-dev] [PATCH] radeonsi: implement TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION

2014-11-09 Thread Marek Olšák
From: Marek Olšák 

Required by Nine.
---
 src/gallium/drivers/radeonsi/si_pipe.c   |  2 +-
 src/gallium/drivers/radeonsi/si_state.c  |  1 -
 src/gallium/drivers/radeonsi/si_state_draw.c | 16 +++-
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
index 7c479d6..279d7ce 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -211,6 +211,7 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
case PIPE_CAP_SAMPLE_SHADING:
case PIPE_CAP_DRAW_INDIRECT:
case PIPE_CAP_CLIP_HALFZ:
+   case PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION:
return 1;
 
case PIPE_CAP_TEXTURE_MULTISAMPLE:
@@ -249,7 +250,6 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
case PIPE_CAP_TGSI_TEXCOORD:
case PIPE_CAP_FAKE_SW_MSAA:
case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
-   case PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION:
case PIPE_CAP_TGSI_FS_FINE_DERIVATIVE:
case PIPE_CAP_CONDITIONAL_RENDER_INVERTED:
case PIPE_CAP_SAMPLER_VIEW_TARGET:
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 030d6e9..ea8e61a 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -3184,7 +3184,6 @@ void si_init_config(struct si_context *sctx)
si_pm4_set_reg(pm4, R_028230_PA_SC_EDGERULE, 0x);
si_pm4_set_reg(pm4, R_0282D0_PA_SC_VPORT_ZMIN_0, 0x);
si_pm4_set_reg(pm4, R_0282D4_PA_SC_VPORT_ZMAX_0, 0x3F80);
-   si_pm4_set_reg(pm4, R_028818_PA_CL_VTE_CNTL, 0x043F);
si_pm4_set_reg(pm4, R_028820_PA_CL_NANINF_CNTL, 0x);
si_pm4_set_reg(pm4, R_028BE8_PA_CL_GB_VERT_CLIP_ADJ, 0x3F80);
si_pm4_set_reg(pm4, R_028BEC_PA_CL_GB_VERT_DISC_ADJ, 0x3F80);
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index 708e42a..d5b27e7 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -150,6 +150,8 @@ static void si_shader_vs(struct si_shader *shader)
unsigned num_sgprs, num_user_sgprs;
unsigned nparams, i, vgpr_comp_cnt;
uint64_t va;
+   unsigned window_space =
+  shader->selector->info.properties[TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION];
 
pm4 = shader->pm4 = CALLOC_STRUCT(si_pm4_state);
 
@@ -218,6 +220,15 @@ static void si_shader_vs(struct si_shader *shader)
   S_00B12C_SO_BASE2_EN(!!shader->selector->so.stride[2]) |
   S_00B12C_SO_BASE3_EN(!!shader->selector->so.stride[3]) |
   S_00B12C_SO_EN(!!shader->selector->so.num_outputs));
+   if (window_space)
+   si_pm4_set_reg(pm4, R_028818_PA_CL_VTE_CNTL,
+  S_028818_VTX_XY_FMT(1) | S_028818_VTX_Z_FMT(1));
+   else
+   si_pm4_set_reg(pm4, R_028818_PA_CL_VTE_CNTL,
+  S_028818_VTX_W0_FMT(1) |
+  S_028818_VPORT_X_SCALE_ENA(1) | S_028818_VPORT_X_OFFSET_ENA(1) |
+  S_028818_VPORT_Y_SCALE_ENA(1) | S_028818_VPORT_Y_OFFSET_ENA(1) |
+  S_028818_VPORT_Z_SCALE_ENA(1) | S_028818_VPORT_Z_OFFSET_ENA(1));
 }
 
 static void si_shader_ps(struct si_shader *shader)
@@ -436,6 +447,8 @@ static bool si_update_draw_info_state(struct si_context *sctx,
 {
struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
struct si_shader *vs = si_get_vs_state(sctx);
+   unsigned window_space =
+  vs->selector->info.properties[TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION];
unsigned prim = si_conv_pipe_prim(info->mode);
unsigned gs_out_prim =
si_conv_prim_to_gs_out(sctx->gs_shader ?
@@ -496,7 +509,8 @@ static bool si_update_draw_info_state(struct si_context *sctx,
si_pm4_set_reg(pm4, R_028810_PA_CL_CLIP_CNTL,
   sctx->queued.named.rasterizer->pa_cl_clip_cntl |
   (vs->clip_dist_write ? 0 :
-   sctx->queued.named.rasterizer->clip_plane_enable & 0x3F));
+   sctx->queued.named.rasterizer->clip_plane_enable & 0x3F) |
+  S_028810_CLIP_DISABLE(window_space));
 
si_pm4_set_state(sctx, draw_info, pm4);
return true;
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] gallium: remove unused pipe_viewport_state::translate[3] and scale[3]

2014-11-09 Thread Marek Olšák
From: Marek Olšák 

Almost all drivers ignore them.
---
 src/gallium/auxiliary/draw/draw_context.c  | 4 +---
 src/gallium/auxiliary/hud/hud_context.c| 2 --
 src/gallium/auxiliary/postprocess/pp_init.c| 2 --
 src/gallium/auxiliary/util/u_blit.c| 2 --
 src/gallium/auxiliary/util/u_blitter.c | 2 --
 src/gallium/auxiliary/util/u_tests.c   | 2 --
 src/gallium/auxiliary/vl/vl_compositor.c   | 2 --
 src/gallium/auxiliary/vl/vl_deint_filter.c | 1 -
 src/gallium/auxiliary/vl/vl_idct.c | 2 --
 src/gallium/auxiliary/vl/vl_matrix_filter.c| 1 -
 src/gallium/auxiliary/vl/vl_mc.c   | 2 --
 src/gallium/auxiliary/vl/vl_median_filter.c| 1 -
 src/gallium/auxiliary/vl/vl_zscan.c| 2 --
 src/gallium/drivers/ilo/ilo_blitter_rectlist.c | 1 -
 src/gallium/drivers/nouveau/nv30/nv30_state_validate.c | 4 ++--
 src/gallium/drivers/radeon/r600_pipe_common.c  | 2 --
 src/gallium/include/pipe/p_state.h | 4 ++--
 src/gallium/state_trackers/nine/nine_state.c   | 2 --
 src/gallium/state_trackers/vega/renderer.c | 2 --
 src/gallium/state_trackers/xa/xa_renderer.c| 2 --
 src/gallium/tests/graw/fs-test.c   | 2 --
 src/gallium/tests/graw/graw_util.h | 2 --
 src/gallium/tests/graw/gs-test.c   | 2 --
 src/gallium/tests/graw/quad-sample.c   | 2 --
 src/gallium/tests/graw/shader-leak.c   | 2 --
 src/gallium/tests/graw/tri-gs.c| 2 --
 src/gallium/tests/graw/tri-instanced.c | 2 --
 src/gallium/tests/graw/vs-test.c   | 2 --
 src/gallium/tests/trivial/quad-tex.c   | 2 --
 src/gallium/tests/trivial/tri.c| 2 --
 src/mesa/state_tracker/st_atom_viewport.c  | 2 --
 src/mesa/state_tracker/st_cb_bitmap.c  | 2 --
 src/mesa/state_tracker/st_cb_clear.c   | 2 --
 src/mesa/state_tracker/st_cb_drawpixels.c  | 2 --
 src/mesa/state_tracker/st_cb_drawtex.c | 2 --
 35 files changed, 5 insertions(+), 67 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c
index 20dea66..92d7a4f 100644
--- a/src/gallium/auxiliary/draw/draw_context.c
+++ b/src/gallium/auxiliary/draw/draw_context.c
@@ -373,11 +373,9 @@ void draw_set_viewport_states( struct draw_context *draw,
   (viewport->scale[0] == 1.0f &&
viewport->scale[1] == 1.0f &&
viewport->scale[2] == 1.0f &&
-   viewport->scale[3] == 1.0f &&
viewport->translate[0] == 0.0f &&
viewport->translate[1] == 0.0f &&
-   viewport->translate[2] == 0.0f &&
-   viewport->translate[3] == 0.0f);
+   viewport->translate[2] == 0.0f);
 
draw->driver.bypass_clip_xy = vps[0].scale[3] == 0.0f;
draw->clip_xy = !draw->driver.bypass_clip_xy;
diff --git a/src/gallium/auxiliary/hud/hud_context.c b/src/gallium/auxiliary/hud/hud_context.c
index 18a8781..98678fc 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -444,11 +444,9 @@ hud_draw(struct hud_context *hud, struct pipe_resource *tex)
viewport.scale[0] = 0.5f * hud->fb_width;
viewport.scale[1] = 0.5f * hud->fb_height;
viewport.scale[2] = 1.0f;
-   viewport.scale[3] = 1.0f;
viewport.translate[0] = 0.5f * hud->fb_width;
viewport.translate[1] = 0.5f * hud->fb_height;
viewport.translate[2] = 0.0f;
-   viewport.translate[3] = 0.0f;
 
cso_set_framebuffer(cso, &fb);
cso_set_sample_mask(cso, ~0);
diff --git a/src/gallium/auxiliary/postprocess/pp_init.c b/src/gallium/auxiliary/postprocess/pp_init.c
index 05a0830..bdf66e6 100644
--- a/src/gallium/auxiliary/postprocess/pp_init.c
+++ b/src/gallium/auxiliary/postprocess/pp_init.c
@@ -324,8 +324,6 @@ pp_init_fbos(struct pp_queue_t *ppq, unsigned int w,
 
p->viewport.scale[0] = p->viewport.translate[0] = (float) w / 2.0f;
p->viewport.scale[1] = p->viewport.translate[1] = (float) h / 2.0f;
-   p->viewport.scale[3] = 1.0f;
-   p->viewport.translate[3] = 0.0f;
 
ppq->fbos_init = true;
 
diff --git a/src/gallium/auxiliary/util/u_blit.c b/src/gallium/auxiliary/util/u_blit.c
index 020e5f7..90408ff 100644
--- a/src/gallium/auxiliary/util/u_blit.c
+++ b/src/gallium/auxiliary/util/u_blit.c
@@ -559,11 +559,9 @@ util_blit_pixels_tex(struct blit_state *ctx,
ctx->viewport.scale[0] = 0.5f * dst->width;
ctx->viewport.scale[1] = 0.5f * dst->height;
ctx->viewport.scale[2] = 0.5f;
-   ctx->viewport.scale[3] = 1.0f;
ctx->viewport.translate[0] = 0.5f * dst->width;
ctx->viewport.translate[1] = 0.5f * dst->height;
ctx->viewport.translate[2] = 0.5f;
-   ctx->viewport.translate[3] = 0.0f;
cso_set_viewport(ctx->cso, &ctx->view

Re: [Mesa-dev] [PATCH] gallium: remove unused pipe_viewport_state::translate[3] and scale[3]

2014-11-09 Thread Ilia Mirkin
On Sun, Nov 9, 2014 at 6:39 PM, Marek Olšák  wrote:

> diff --git a/src/gallium/drivers/nouveau/nv30/nv30_state_validate.c b/src/gallium/drivers/nouveau/nv30/nv30_state_validate.c
> index f227559..6df0c47 100644
> --- a/src/gallium/drivers/nouveau/nv30/nv30_state_validate.c
> +++ b/src/gallium/drivers/nouveau/nv30/nv30_state_validate.c
> @@ -254,11 +254,11 @@ nv30_validate_viewport(struct nv30_context *nv30)
> PUSH_DATAf(push, vp->translate[0]);
> PUSH_DATAf(push, vp->translate[1]);
> PUSH_DATAf(push, vp->translate[2]);
> -   PUSH_DATAf(push, vp->translate[3]);
> +   PUSH_DATAf(push, 1.0f);
>

This should probably be 0.0f... or am I misunderstanding what this does?


> PUSH_DATAf(push, vp->scale[0]);
> PUSH_DATAf(push, vp->scale[1]);
> PUSH_DATAf(push, vp->scale[2]);
> -   PUSH_DATAf(push, vp->scale[3]);
> +   PUSH_DATAf(push, 1.0f);
> BEGIN_NV04(push, NV30_3D(DEPTH_RANGE_NEAR), 2);
> PUSH_DATAf(push, vp->translate[2] - fabsf(vp->scale[2]));
> PUSH_DATAf(push, vp->translate[2] + fabsf(vp->scale[2]));
>
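
On the translate[3] question above, a minimal sketch of my reasoning, assuming the hardware applies the fourth component the same way as the other three (out = in * scale + translate). The helper name is made up for illustration:

```c
/* Per-component viewport transform as assumed here:
 * out = in * scale + translate. */
static float viewport_component(float in, float scale, float translate)
{
   return in * scale + translate;
}
```

Under that assumption, scale 1.0f with translate 0.0f leaves the value untouched, while translate 1.0f shifts it by one.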


Re: [Mesa-dev] The final batch of gallium 'make dist'

2014-11-09 Thread Emil Velikov
Ping, anyone?
I would love to get gallium ticked off and continue with the rest of mesa.

-Emil
On 14/10/14 16:44, Emil Velikov wrote:
> Hello list,
> 
> As the title says, this ought to be the final batch of patches
> needed to get 'make dist' working for gallium. With that out of the
> way I'll start looking into egl, gbm and mapi.
> 
> Some of the patches may contain rather controversial changes, but I
> would rather get things working and then nitpick on making it 'perfect'.
> 
> As usual, any comments and suggestions are greatly appreciated.
> 
> -Emil
> 



Re: [Mesa-dev] [PATCH V2] mesa: Permanently enable features supported by target CPU at compile time.

2014-11-09 Thread Emil Velikov
On 09/11/14 04:37, Siavash Eliasi wrote:
> 
> On 11/08/2014 09:55 PM, Emil Velikov wrote:
[...]
>> Can you confirm that it does not cause issues with "interesting" setups
>> such as https://bugs.freedesktop.org/show_bug.cgi?id=71547
> 
> Challenge accepted! What my patch does is check for the provided
> compile flags (-msse, ...) at compile time (__SSE__, ...) and set the
> "cpu_has_sse" macro to "1", which allows any sane compiler to turn
> these pieces of code:

I'm not sure whether you just said that you've checked it, or that this
is what it ought to do. There is a reason why I'm so picky: this bizarre
(as one might call it) setup is just the tip of the iceberg when it comes
to people building mesa themselves.
It would be nice to get your patch in as long as it does not break stuff :)
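
To make sure we are talking about the same thing, here is a minimal sketch of how I read the idea. The names runtime_has_sse and pick_copy_path are made up for illustration and are not taken from the patch:

```c
/* Hypothetical stand-in for a runtime CPUID probe; a real one would query
 * the CPU. It returns 0 here so the fallback path is easy to follow. */
int runtime_has_sse(void)
{
   return 0;
}

/* If the compiler was told the target supports SSE (e.g. via -msse), it
 * predefines __SSE__ and the check collapses to a constant. Otherwise the
 * check stays a runtime call. */
#if defined(__SSE__)
#define cpu_has_sse 1
#else
#define cpu_has_sse runtime_has_sse()
#endif

enum copy_path { COPY_GENERIC, COPY_SSE };

static enum copy_path pick_copy_path(void)
{
   if (cpu_has_sse)   /* constant-folded when __SSE__ is defined */
      return COPY_SSE;
   return COPY_GENERIC;
}
```

What worries me is the case where the build flags and the machine that ends up running the binary disagree, since the runtime check is then gone.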

-Emil
> Best regards,
> Siavash Eliasi.



Re: [Mesa-dev] [PATCH] radeonsi: implement TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION

2014-11-09 Thread Michel Dänzer

On 10.11.2014 08:09, Marek Olšák wrote:

> From: Marek Olšák
>
> Required by Nine.


Reviewed-by: Michel Dänzer 


--
Earthling Michel Dänzer|  http://www.amd.com
Libre software enthusiast  |Mesa and X developer