Re: [Mesa-dev] [PATCH V4] mesa: add SSE optimisation for glDrawElements

2014-11-02 Thread Timothy Arceri
On Sat, 2014-11-01 at 23:15 +, Bruno Jimenez wrote:
> On Wed, 2014-10-29 at 23:09 +1100, Timothy Arceri wrote:
> > On Wed, 2014-10-29 at 16:58 +1100, Timothy Arceri wrote:
> > > On Tue, 2014-10-28 at 22:14 +, Bruno Jimenez wrote:
> > > > Hi,
> > > >
> > > > I haven't had time to play with OpenMP yet, but I have looked at the
> > > > assembly it produces on my computer. If I enable SSE2 it can use it,
> > > > and if I enable SSE4.1 it uses the parallel max. But it uses unaligned
> > > > loads, and since we are trying to avoid them I don't know if we want
> > > > to use just OpenMP.
> > > >
> > > > Processing the first unaligned elements by hand and using
> > > > __builtin_assume_aligned, as in the article you linked, allows OpenMP
> > > > to use aligned loads.
> > > >
> > > > Also, it must be noted that I am using GCC; I don't know what Clang
> > > > may produce (for the plain algorithms, as it doesn't support OpenMP if
> > > > I recall correctly).
> > > >
> > > > When I have time I'll try to collect all the variants and benchmark
> > > > them against each other to see if it is worth using any of them.
> > > >
> > > > Also, do we know if 'count' has an upper bound? Or if we can force the
> > > > array to be aligned so we don't have to worry about the first items?
> > >
> > > I'm not sure about count, but the indices array comes from the
> > > glDrawRangeElements() call, so I don't think we can do anything about
> > > the alignment.
> >
> > Sorry, I meant glDrawElements()
>
> Ok, thanks for the information.
>
> At last I have had time to put everything in place and run some
> benchmarks for the naive algorithm, what OpenMP generates, and the SSE2
> and SSE4.1 code, for both aligned and unaligned scenarios.
>
> As a summary for my CPU:
>
> Limited to SSE2:
>   - If the array is aligned:
>       + For low values (~20) the naive algorithm is faster
>       + For higher values the SSE2 code is faster
>   - If the array is unaligned:
>       + For low values (~24) the naive algorithm is faster
>       + For higher values the SSE2 code is faster
>
> Using -march=native (SSE4.1 support):
>   - If the array is aligned:
>       + For very low values (~8) the naive algorithm is faster
>       + For low values (12-32) the SSE4.1 algorithm is faster
>       + For higher values the OpenMP generated code is faster
>   - If the array is unaligned:
>       + For very low values (~12) the naive algorithm is faster
>       + For low values (16-32) the SSE4.1 algorithm is faster
>       + For higher values the OpenMP generated code is faster
>
> For the OpenMP generated code there isn't much difference between the
> naive and the aligned versions, although the second is a little faster.
>
> That said, I don't know what the thresholds between the algorithms
> will be on other CPUs.
>
> Also, I ran the benchmark with GCC 4.9.1; we may get different results
> with Clang or other compilers.
>
> Hope it is useful. If I can help add SSE to anything else, just ask.

It's very easy to find places for improvement by running Phoronix Test
Suite benchmarks in a tool like callgrind, and looking at the output in
KCachegrind.

I've just set up a wiki on my website so I can keep track of places that
could benefit from this kind of optimisation. Feel free to make use of
it, or have a go at improving anything I put on there.

http://www.itsqueeze.com/wiki/index.php?title=Main_Page

> - Bruno
>
> > > > BTW, Thanks for the article!
> > > > - Bruno
> > > >
> > > > > +  unsigned max_arr[4] __attribute__ ((aligned (16)));
> > > > > +  unsigned min_arr[4] __attribute__ ((aligned (16)));
> > > > > +  unsigned vec_count;
> > > > > +  __m128i max_ui4 = _mm_setzero_si128();
> > > > > +  __m128i min_ui4 = _mm_set1_epi32(~0U);
> > > > > +  __m128i ui_indices4;
> > > > > +  __m128i *ui_indices_ptr;
> > > > > +
> > [snip]
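
For readers following the thread, the "process the first unaligned
elements by hand, then tell the compiler the rest is aligned" idea
discussed above can be sketched roughly as below. This is only an
illustration under assumptions, not the code from the patch: the
function name minmax_indices() is hypothetical, the 16-byte boundary is
assumed because that is the SSE vector width, and
__builtin_assume_aligned is the GCC/Clang builtin mentioned in the
thread. Whether the compiler actually emits aligned vector loads depends
on the compiler and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): find the smallest and largest
 * value in an index array. For an empty array it reports UINT32_MAX / 0. */
static void
minmax_indices(const uint32_t *indices, size_t count,
               uint32_t *min_out, uint32_t *max_out)
{
   uint32_t min_ui = UINT32_MAX;
   uint32_t max_ui = 0;
   size_t i = 0;

   assert(min_out != NULL && max_out != NULL);

   /* Peel the leading elements by hand until we reach a 16-byte boundary. */
   while (i < count && ((uintptr_t)(indices + i) & 15) != 0) {
      if (indices[i] < min_ui) min_ui = indices[i];
      if (indices[i] > max_ui) max_ui = indices[i];
      i++;
   }

   if (i < count) {
#if defined(__GNUC__)
      /* Promise the compiler that the remainder starts on a 16-byte boundary. */
      const uint32_t *aligned = __builtin_assume_aligned(indices + i, 16);
#else
      const uint32_t *aligned = indices + i;
#endif
      size_t n = count - i;
      size_t j;

      /* A vectorising compiler (or an OpenMP simd reduction placed here)
       * is now free to use aligned loads for this loop. */
      for (j = 0; j < n; j++) {
         if (aligned[j] < min_ui) min_ui = aligned[j];
         if (aligned[j] > max_ui) max_ui = aligned[j];
      }
   }

   *min_out = min_ui;
   *max_out = max_ui;
}
```

The peel loop runs at most three times for a naturally aligned array of
32-bit indices, which fits the benchmark observation above that the
aligned and unaligned thresholds differ only slightly.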
 
 


   
   
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] clover: fix clBuildProgram Piglit regression

2014-11-02 Thread EdB
should trigger CL_INVALID_VALUE
if device_list is NULL and num_devices is greater than zero.

introduced by e5468dfa523be2a7a0d04bb9efcf8ae780957563
---
 src/gallium/state_trackers/clover/api/program.cpp | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/src/gallium/state_trackers/clover/api/program.cpp b/src/gallium/state_trackers/clover/api/program.cpp
index 64c4a43..dc89730 100644
--- a/src/gallium/state_trackers/clover/api/program.cpp
+++ b/src/gallium/state_trackers/clover/api/program.cpp
@@ -27,7 +27,7 @@ using namespace clover;
 
 namespace {
    void validate_build_program_common(const program &prog, cl_uint num_devs,
-                                      const ref_vector<device> &devs,
+                                      ref_vector<device> &devs,
                                       void (*pfn_notify)(cl_program, void *),
                                       void *user_data) {
 
@@ -37,10 +37,14 @@ namespace {
       if (prog.kernel_ref_count())
          throw error(CL_INVALID_OPERATION);
 
-      if (any_of([&](const device &dev) {
-               return !count(dev, prog.context().devices());
-            }, devs))
-         throw error(CL_INVALID_DEVICE);
+      if (!num_devs) {
+         devs = prog.context().devices();
+      } else {
+         if (any_of([&](const device &dev) {
+                  return !count(dev, prog.context().devices());
+               }, devs))
+            throw error(CL_INVALID_DEVICE);
+      }
    }
 }
 
@@ -173,8 +177,7 @@ clBuildProgram(cl_program d_prog, cl_uint num_devs,
                void (*pfn_notify)(cl_program, void *),
                void *user_data) try {
    auto &prog = obj(d_prog);
-   auto devs = (d_devs ? objs(d_devs, num_devs) :
-                ref_vector<device>(prog.context().devices()));
+   auto devs = objs<allow_empty_tag>(d_devs, num_devs);
    auto opts = (p_opts ? p_opts : "");
 
    validate_build_program_common(prog, num_devs, devs, pfn_notify, user_data);
@@ -195,8 +198,7 @@ clCompileProgram(cl_program d_prog, cl_uint num_devs,
                  void (*pfn_notify)(cl_program, void *),
                  void *user_data) try {
    auto &prog = obj(d_prog);
-   auto devs = (d_devs ? objs(d_devs, num_devs) :
-                ref_vector<device>(prog.context().devices()));
+   auto devs = objs<allow_empty_tag>(d_devs, num_devs);
    auto opts = (p_opts ? p_opts : "");
    header_map headers;
 
-- 
1.9.3
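
The argument rule this patch restores can be illustrated with a small
stand-alone sketch. It is written in plain C rather than clover's C++,
and everything in it is hypothetical: sketch_device,
resolve_build_devices() and the SKETCH_CL_* constants are stand-ins
(the numeric values mirror CL_SUCCESS, CL_INVALID_VALUE and
CL_INVALID_DEVICE from CL/cl.h). It is not how clover implements the
check, only the decision table it has to satisfy: a NULL list with a
non-zero count, or a non-NULL list with a zero count, is an invalid
value; a NULL list with a zero count means "all devices of the
context"; otherwise every listed device must belong to the context.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the OpenCL status codes (same numeric values as CL/cl.h). */
#define SKETCH_CL_SUCCESS          0
#define SKETCH_CL_INVALID_VALUE  (-30)
#define SKETCH_CL_INVALID_DEVICE (-33)

typedef int sketch_device;   /* hypothetical device handle */

static int
sketch_contains(const sketch_device *list, size_t n, sketch_device d)
{
   size_t i;
   for (i = 0; i < n; i++)
      if (list[i] == d)
         return 1;
   return 0;
}

/* Validate (devs, num_devs) against the context's devices and report the
 * list that should actually be built for. */
static int
resolve_build_devices(const sketch_device *ctx_devs, size_t num_ctx_devs,
                      const sketch_device *devs, size_t num_devs,
                      const sketch_device **out_devs, size_t *out_num)
{
   size_t i;

   assert(out_devs != NULL && out_num != NULL);

   /* NULL list with a count, or a list with a zero count: invalid value. */
   if ((devs == NULL) != (num_devs == 0))
      return SKETCH_CL_INVALID_VALUE;

   /* No list given: fall back to every device in the context. */
   if (devs == NULL) {
      *out_devs = ctx_devs;
      *out_num = num_ctx_devs;
      return SKETCH_CL_SUCCESS;
   }

   /* Explicit list: each entry must be one of the context's devices. */
   for (i = 0; i < num_devs; i++)
      if (!sketch_contains(ctx_devs, num_ctx_devs, devs[i]))
         return SKETCH_CL_INVALID_DEVICE;

   *out_devs = devs;
   *out_num = num_devs;
   return SKETCH_CL_SUCCESS;
}
```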



Re: [Mesa-dev] [PATCH] clover: fix clBuildProgram Piglit regression

2014-11-02 Thread Francisco Jerez
EdB <edb+m...@sigluy.net> writes:

> should trigger CL_INVALID_VALUE
> if device_list is NULL and num_devices is greater than zero.
>
> introduced by e5468dfa523be2a7a0d04bb9efcf8ae780957563

Tom, can you just drop the vector of devices parameter and validate
the d_devs/num_devs arguments from validate_build_program_common() by
calling objs<allow_empty_tag>, as I suggested when I gave my R-b for
your commit?

Thanks.



[Mesa-dev] [PATCH v3 0/9] Gallium Nine

2014-11-02 Thread David Heidelberg

Hello everyone!

First, I'd like to thank you for the great feedback.
This is the third Gallium Nine merge request. We reduced the number of
commits to the necessary minimum. I hope all proposed changes are
incorporated in v3.


Thank you

Axel Davy (2):
  nine: Add drirc options (v2)
  nine: Implement threadpool

Christoph Bumiller (6):
  tgsi/ureg: add ureg_UARL shortcut (v2)
  winsys/sw/wrapper: implement is_displaytarget_format_supported for
swrast
  gallium/auxiliary: implement sw_probe_wrapped
  gallium/auxiliary: add inc and dec alternative with return (v2)
  gallium/auxiliary: add contained and rect checks (v3)
  gallium/auxiliary: add dump functions for bind and transfer flags

Joakim Sindholt (1):
  nine: Add state tracker nine for Direct3D9 (v2)

 configure.ac   |   26 +
 include/D3D9/d3d9.h| 1858 +++
 include/D3D9/d3d9caps.h|  387 +++
 include/D3D9/d3d9types.h   | 1797 ++
 include/d3dadapter/d3dadapter9.h   |  101 +
 include/d3dadapter/drm.h   |   44 +
 include/d3dadapter/present.h   |  137 +
 src/gallium/Makefile.am|4 +
 src/gallium/auxiliary/pipe-loader/pipe_loader.h|   11 +
 src/gallium/auxiliary/pipe-loader/pipe_loader_sw.c |   22 +
 .../auxiliary/target-helpers/inline_sw_helper.h|   28 +
 src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h   |1 +
 src/gallium/auxiliary/util/u_atomic.h  |   78 +
 src/gallium/auxiliary/util/u_box.h |  230 ++
 src/gallium/auxiliary/util/u_dump.h|6 +
 src/gallium/auxiliary/util/u_dump_defines.c|   86 +
 src/gallium/auxiliary/util/u_rect.h|   28 +
 src/gallium/state_trackers/nine/Makefile.am|   13 +
 src/gallium/state_trackers/nine/Makefile.sources   |   73 +
 src/gallium/state_trackers/nine/README |   78 +
 src/gallium/state_trackers/nine/adapter9.c | 1081 ++
 src/gallium/state_trackers/nine/adapter9.h |  137 +
 .../state_trackers/nine/authenticatedchannel9.c|   78 +
 .../state_trackers/nine/authenticatedchannel9.h|   65 +
 src/gallium/state_trackers/nine/basetexture9.c |  504 +++
 src/gallium/state_trackers/nine/basetexture9.h |  138 +
 src/gallium/state_trackers/nine/cryptosession9.c   |  115 +
 src/gallium/state_trackers/nine/cryptosession9.h   |   86 +
 src/gallium/state_trackers/nine/cubetexture9.c |  274 ++
 src/gallium/state_trackers/nine/cubetexture9.h |   79 +
 src/gallium/state_trackers/nine/device9.c  | 3450
 src/gallium/state_trackers/nine/device9.h  |  798 +
 src/gallium/state_trackers/nine/device9ex.c|  396 +++
 src/gallium/state_trackers/nine/device9ex.h|  148 +
 src/gallium/state_trackers/nine/device9video.c |   62 +
 src/gallium/state_trackers/nine/device9video.h |   57 +
 src/gallium/state_trackers/nine/guid.c |   66 +
 src/gallium/state_trackers/nine/guid.h |   36 +
 src/gallium/state_trackers/nine/indexbuffer9.c |  218 ++
 src/gallium/state_trackers/nine/indexbuffer9.h |   88 +
 src/gallium/state_trackers/nine/iunknown.c |  126 +
 src/gallium/state_trackers/nine/iunknown.h |  153 +
 src/gallium/state_trackers/nine/nine_debug.c   |  104 +
 src/gallium/state_trackers/nine/nine_debug.h   |  135 +
 src/gallium/state_trackers/nine/nine_defines.h |   55 +
 src/gallium/state_trackers/nine/nine_dump.c|  813 +
 src/gallium/state_trackers/nine/nine_dump.h|   52 +
 src/gallium/state_trackers/nine/nine_ff.c  | 2213 +
 src/gallium/state_trackers/nine/nine_ff.h  |   32 +
 src/gallium/state_trackers/nine/nine_helpers.c |  100 +
 src/gallium/state_trackers/nine/nine_helpers.h |  176 +
 src/gallium/state_trackers/nine/nine_lock.c| 3319 +++
 src/gallium/state_trackers/nine/nine_lock.h|   51 +
 src/gallium/state_trackers/nine/nine_pdata.h   |   45 +
 src/gallium/state_trackers/nine/nine_pipe.c|  410 +++
 src/gallium/state_trackers/nine/nine_pipe.h|  568 
 src/gallium/state_trackers/nine/nine_quirk.c   |   49 +
 src/gallium/state_trackers/nine/nine_quirk.h   |   36 +
 src/gallium/state_trackers/nine/nine_shader.c  | 2959 +
 src/gallium/state_trackers/nine/nine_shader.h  |  142 +
 src/gallium/state_trackers/nine/nine_state.c   | 1489 +
 src/gallium/state_trackers/nine/nine_state.h   |  234 ++
 .../state_trackers/nine/nineexoverlayextension.c   |   46 +
 .../state_trackers/nine/nineexoverlayextension.h   |   49 +
 src/gallium/state_trackers/nine/pixelshader9.c |  172 +
 src/gallium/state_trackers/nine/pixelshader9.h |   82 +
 src/gallium/state_trackers/nine/query9.c   |  358 ++
 

[Mesa-dev] [PATCH v3 1/9] tgsi/ureg: add ureg_UARL shortcut (v2)

2014-11-02 Thread David Heidelberg

v2: moved into the same order as in p_shader_tokens (thanks Brian)
Signed-off-by: David Heidelberg da...@ixit.cz
---
 src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h b/src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h
index 7888be8..4ca4f24 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_opcode_tmp.h
@@ -201,6 +201,7 @@ OP13_SAMPLE(GATHER4)
 OP12(SVIEWINFO)
 OP13(SAMPLE_POS)
 OP12(SAMPLE_INFO)
+OP11(UARL)
 OP13(UCMP)
-- 
2.1.3




[Mesa-dev] [PATCH v3 2/9] winsys/sw/wrapper: implement is_displaytarget_format_supported for swrast

2014-11-02 Thread David Heidelberg

Signed-off-by: David Heidelberg da...@ixit.cz
---
 src/gallium/winsys/sw/wrapper/wrapper_sw_winsys.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/gallium/winsys/sw/wrapper/wrapper_sw_winsys.c b/src/gallium/winsys/sw/wrapper/wrapper_sw_winsys.c
index e552ac2..a6bf498 100644
--- a/src/gallium/winsys/sw/wrapper/wrapper_sw_winsys.c
+++ b/src/gallium/winsys/sw/wrapper/wrapper_sw_winsys.c
@@ -85,6 +85,19 @@ wrapper_sw_displaytarget(struct sw_displaytarget *dt)
 
 static boolean
+wsw_is_dt_format_supported(struct sw_winsys *ws,
+                           unsigned tex_usage,
+                           enum pipe_format format)
+{
+   struct wrapper_sw_winsys *wsw = wrapper_sw_winsys(ws);
+
+   return wsw->screen->is_format_supported(wsw->screen, format,
+                                           PIPE_TEXTURE_2D, 0,
+                                           PIPE_BIND_RENDER_TARGET |
+                                           PIPE_BIND_DISPLAY_TARGET);
+}
+
+static boolean
 wsw_dt_get_stride(struct wrapper_sw_displaytarget *wdt, unsigned *stride)
 {
    struct pipe_context *pipe = wdt->winsys->pipe;
@@ -276,6 +289,7 @@ wrapper_sw_winsys_wrap_pipe_screen(struct pipe_screen *screen)
    if (!wsw)
       goto err;
 
+   wsw->base.is_displaytarget_format_supported = wsw_is_dt_format_supported;
    wsw->base.displaytarget_create = wsw_dt_create;
    wsw->base.displaytarget_from_handle = wsw_dt_from_handle;
    wsw->base.displaytarget_get_handle = wsw_dt_get_handle;
-- 
2.1.3




[Mesa-dev] [PATCH v3 4/9] gallium/auxiliary: add inc and dec alternative with return (v2)

2014-11-02 Thread David Heidelberg

At this moment we use only zero or positive values.

v2: Implement it also for Solaris and MSVC assembly,
and enable it for other combinations.

Signed-off-by: David Heidelberg da...@ixit.cz
---
 src/gallium/auxiliary/util/u_atomic.h | 78 +++
 1 file changed, 78 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_atomic.h b/src/gallium/auxiliary/util/u_atomic.h
index 2f2b42b..f177b60 100644
--- a/src/gallium/auxiliary/util/u_atomic.h
+++ b/src/gallium/auxiliary/util/u_atomic.h
@@ -69,6 +69,18 @@ p_atomic_dec(int32_t *v)
 }
  static INLINE int32_t
+p_atomic_inc_return(int32_t *v)
+{
+   return __sync_add_and_fetch(v, 1);
+}
+
+static INLINE int32_t
+p_atomic_dec_return(int32_t *v)
+{
+   return __sync_sub_and_fetch(v, 1);
+}
+
+static INLINE int32_t
 p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
 {
return __sync_val_compare_and_swap(v, old, _new);
@@ -116,6 +128,18 @@ p_atomic_dec(int32_t *v)
 }
  static INLINE int32_t
+p_atomic_inc_return(int32_t *v)
+{
+   return __sync_add_and_fetch(v, 1);
+}
+
+static INLINE int32_t
+p_atomic_dec_return(int32_t *v)
+{
+   return __sync_sub_and_fetch(v, 1);
+}
+
+static INLINE int32_t
 p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
 {
return __sync_val_compare_and_swap(v, old, _new);
@@ -161,6 +185,18 @@ p_atomic_dec(int32_t *v)
 }
  static INLINE int32_t
+p_atomic_inc_return(int32_t *v)
+{
+   return __sync_add_and_fetch(v, 1);
+}
+
+static INLINE int32_t
+p_atomic_dec_return(int32_t *v)
+{
+   return __sync_sub_and_fetch(v, 1);
+}
+
+static INLINE int32_t
 p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
 {
return __sync_val_compare_and_swap(v, old, _new);
@@ -186,6 +222,8 @@ p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
 #define p_atomic_dec_zero(_v) ((boolean) --(*(_v)))
 #define p_atomic_inc(_v) ((void) (*(_v))++)
 #define p_atomic_dec(_v) ((void) (*(_v))--)
+#define p_atomic_inc_return(_v) ((*(_v))++)
+#define p_atomic_dec_return(_v) ((*(_v))--)
 #define p_atomic_cmpxchg(_v, old, _new) (*(_v) == old ? *(_v) = (_new) : *(_v))

  #endif
@@ -237,6 +275,32 @@ p_atomic_dec(int32_t *v)
 }
  static INLINE int32_t
+p_atomic_inc_return(int32_t *v)
+{
+   int32_t i;
+
+   __asm {
+  mov   eax, [v]
+  lock inc  dword ptr [eax]
+  mov   [i], eax
+   }
+   return i;
+}
+
+static INLINE int32_t
+p_atomic_dec_return(int32_t *v)
+{
+   int32_t i;
+
+   __asm {
+  mov   eax, [v]
+  lock dec  dword ptr [eax]
+  mov   [i], eax
+   }
+   return i;
+}
+
+static INLINE int32_t
 p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
 {
int32_t orig;
@@ -288,6 +352,12 @@ p_atomic_inc(int32_t *v)
_InterlockedIncrement((long *)v);
 }
 +static INLINE int32_t
+p_atomic_inc_return(int32_t *v)
+{
+   return _InterlockedIncrement((long *)v);
+}
+
 static INLINE void
 p_atomic_dec(int32_t *v)
 {
@@ -295,6 +365,12 @@ p_atomic_dec(int32_t *v)
 }
  static INLINE int32_t
+p_atomic_dec_return(int32_t *v)
+{
+   return _InterlockedDecrement((long *)v);
+}
+
+static INLINE int32_t
 p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
 {
return _InterlockedCompareExchange((long *)v, _new, old);
@@ -329,6 +405,8 @@ p_atomic_dec_zero(int32_t *v)
  #define p_atomic_inc(_v) atomic_inc_32((uint32_t *) _v)
 #define p_atomic_dec(_v) atomic_dec_32((uint32_t *) _v)
+#define p_atomic_inc_return(_v) atomic_inc_32_nv((uint32_t *) _v)
+#define p_atomic_dec_return(_v) atomic_dec_32_nv((uint32_t *) _v)
  #define p_atomic_cmpxchg(_v, _old, _new) \
atomic_cas_32( (uint32_t *) _v, (uint32_t) _old, (uint32_t) _new)
--
2.1.3
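
As a reading aid for the "_return" variants: in the GCC branches above
they are built on __sync_add_and_fetch / __sync_sub_and_fetch, which
yield the value after the update. A minimal sketch of the same contract
using C11 <stdatomic.h> follows. It is an illustration only, not part of
the patch; the sketch_* names are hypothetical, and it assumes a
compiler with C11 atomics. Note that a plain post-increment such as
(*v)++ would instead yield the value before the update.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical equivalents of p_atomic_inc_return / p_atomic_dec_return.
 * atomic_fetch_add/sub return the PREVIOUS value, so adjust by one to
 * report the value after the update, like __sync_add_and_fetch does. */
static inline int32_t
sketch_atomic_inc_return(_Atomic int32_t *v)
{
   return atomic_fetch_add(v, 1) + 1;
}

static inline int32_t
sketch_atomic_dec_return(_Atomic int32_t *v)
{
   return atomic_fetch_sub(v, 1) - 1;
}
```

Typical use is reference counting: the caller that sees 0 come back from
the decrement knows it dropped the last reference.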




[Mesa-dev] [PATCH v3 3/9] gallium/auxiliary: implement sw_probe_wrapped

2014-11-02 Thread David Heidelberg

Implement pipe_loader_sw_probe_wrapped, which allows the wrapped
software renderer backend to be used with the pipe loader.

Signed-off-by: David Heidelberg da...@ixit.cz
---
 src/gallium/auxiliary/pipe-loader/pipe_loader.h    | 11 +++
 src/gallium/auxiliary/pipe-loader/pipe_loader_sw.c | 22 ++
 src/gallium/targets/gbm/Makefile.am                |  1 +
 src/gallium/targets/opencl/Makefile.am             |  1 +
 src/gallium/targets/xa/Makefile.am                 |  1 +
 src/gallium/tests/trivial/Makefile.am              |  1 +
 6 files changed, 37 insertions(+)

diff --git a/src/gallium/auxiliary/pipe-loader/pipe_loader.h b/src/gallium/auxiliary/pipe-loader/pipe_loader.h
index 6127a6a..9f43f17 100644
--- a/src/gallium/auxiliary/pipe-loader/pipe_loader.h
+++ b/src/gallium/auxiliary/pipe-loader/pipe_loader.h
@@ -166,6 +166,17 @@ pipe_loader_sw_probe_null(struct pipe_loader_device **devs);
 int
 pipe_loader_sw_probe(struct pipe_loader_device **devs, int ndev);
 
+/**
+ * Get a software device wrapped atop another device.
+ *
+ * This function is platform-specific.
+ *
+ * \sa pipe_loader_probe
+ */
+boolean
+pipe_loader_sw_probe_wrapped(struct pipe_loader_device **dev,
+                             struct pipe_screen *screen);
+
 #ifdef HAVE_PIPE_LOADER_DRM
 
 /**
diff --git a/src/gallium/auxiliary/pipe-loader/pipe_loader_sw.c b/src/gallium/auxiliary/pipe-loader/pipe_loader_sw.c
index b1b1ca6..b152f60 100644
--- a/src/gallium/auxiliary/pipe-loader/pipe_loader_sw.c
+++ b/src/gallium/auxiliary/pipe-loader/pipe_loader_sw.c
@@ -29,8 +29,11 @@
 
 #include "util/u_memory.h"
 #include "util/u_dl.h"
+#ifdef HAVE_PIPE_LOADER_DRI
 #include "sw/dri/dri_sw_winsys.h"
+#endif
 #include "sw/null/null_sw_winsys.h"
+#include "sw/wrapper/wrapper_sw_winsys.h"
 #ifdef HAVE_PIPE_LOADER_XLIB
 /* Explicitly wrap the header to ease build without X11 headers */
 #include "sw/xlib/xlib_sw_winsys.h"
@@ -140,6 +143,25 @@ pipe_loader_sw_probe(struct pipe_loader_device **devs, int ndev)
    return i;
 }
 
+boolean
+pipe_loader_sw_probe_wrapped(struct pipe_loader_device **dev,
+                             struct pipe_screen *screen)
+{
+   struct pipe_loader_sw_device *sdev = CALLOC_STRUCT(pipe_loader_sw_device);
+
+   sdev->base.type = PIPE_LOADER_DEVICE_SOFTWARE;
+   sdev->base.driver_name = "swrast";
+   sdev->base.ops = &pipe_loader_sw_ops;
+   sdev->ws = wrapper_sw_winsys_wrap_pipe_screen(screen);
+
+   if (!sdev->ws) {
+      FREE(sdev);
+      return FALSE;
+   }
+   *dev = &sdev->base;
+   return TRUE;
+}
+
 static void
 pipe_loader_sw_release(struct pipe_loader_device **dev)
 {
diff --git a/src/gallium/targets/gbm/Makefile.am b/src/gallium/targets/gbm/Makefile.am
index 2c9b425..679c994 100644
--- a/src/gallium/targets/gbm/Makefile.am
+++ b/src/gallium/targets/gbm/Makefile.am
@@ -34,6 +34,7 @@ gbm_gallium_drm_la_SOURCES =
 gbm_gallium_drm_la_LIBADD = \
$(top_builddir)/src/gallium/state_trackers/gbm/libgbm.la \
$(top_builddir)/src/gallium/auxiliary/libgallium.la \
+   $(top_builddir)/src/gallium/winsys/sw/wrapper/libwsw.la \
$(top_builddir)/src/util/libmesautil.la \
$(LIBDRM_LIBS) \
$(GALLIUM_COMMON_LIB_DEPS)
diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am
index 1c5a908..fe458bc 100644
--- a/src/gallium/targets/opencl/Makefile.am
+++ b/src/gallium/targets/opencl/Makefile.am
@@ -20,6 +20,7 @@ lib@OPENCL_LIBNAME@_la_LIBADD = \
$(top_builddir)/src/gallium/auxiliary/pipe-loader/libpipe_loader_client.la \
$(top_builddir)/src/gallium/state_trackers/clover/libclover.la \
$(top_builddir)/src/gallium/auxiliary/libgallium.la \
+   $(top_builddir)/src/gallium/winsys/sw/wrapper/libwsw.la \
$(top_builddir)/src/util/libmesautil.la \
$(GALLIUM_PIPE_LOADER_WINSYS_LIBS) \
$(GALLIUM_PIPE_LOADER_CLIENT_LIBS) \
diff --git a/src/gallium/targets/xa/Makefile.am b/src/gallium/targets/xa/Makefile.am
index 77d9fa6..c1f52de 100644
--- a/src/gallium/targets/xa/Makefile.am
+++ b/src/gallium/targets/xa/Makefile.am
@@ -36,6 +36,7 @@ libxatracker_la_SOURCES =
 libxatracker_la_LIBADD = \
$(top_builddir)/src/gallium/state_trackers/xa/libxatracker.la \
$(top_builddir)/src/gallium/auxiliary/libgallium.la \
+   $(top_builddir)/src/gallium/winsys/sw/wrapper/libwsw.la \
$(top_builddir)/src/util/libmesautil.la \
$(LIBDRM_LIBS) \
$(GALLIUM_COMMON_LIB_DEPS)
diff --git a/src/gallium/tests/trivial/Makefile.am b/src/gallium/tests/trivial/Makefile.am
index fcd240e..a24b5ec 100644
--- a/src/gallium/tests/trivial/Makefile.am
+++ b/src/gallium/tests/trivial/Makefile.am
@@ -14,6 +14,7 @@ AM_CPPFLAGS = \
 LDADD = \
$(top_builddir)/src/gallium/auxiliary/pipe-loader/libpipe_loader_client.la \
$(top_builddir)/src/gallium/auxiliary/libgallium.la \
+   $(top_builddir)/src/gallium/winsys/sw/wrapper/libwsw.la \
$(top_builddir)/src/util/libmesautil.la \

[Mesa-dev] [PATCH v3 6/9] gallium/auxiliary: add dump functions for bind and transfer flags

2014-11-02 Thread David Heidelberg

v2: rename and extend support with code for C11 and MSVC (thanks to Brian)

Signed-off-by: David Heidelberg da...@ixit.cz
---
 src/gallium/auxiliary/util/u_dump.h         |  6 ++
 src/gallium/auxiliary/util/u_dump_defines.c | 86 +
 2 files changed, 92 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_dump.h b/src/gallium/auxiliary/util/u_dump.h
index 58e7dfd..84ba1ed 100644
--- a/src/gallium/auxiliary/util/u_dump.h
+++ b/src/gallium/auxiliary/util/u_dump.h
@@ -88,6 +88,12 @@ util_dump_tex_filter(unsigned value, boolean shortened);
 const char *
 util_dump_query_type(unsigned value, boolean shortened);
 
+const char *
+util_dump_bind_flags(unsigned flags);
+
+const char *
+util_dump_transfer_flags(unsigned flags);
+
 
 /*
  * p_state.h, through a FILE
diff --git a/src/gallium/auxiliary/util/u_dump_defines.c b/src/gallium/auxiliary/util/u_dump_defines.c
index 03fd15d..20ae6c0 100644
--- a/src/gallium/auxiliary/util/u_dump_defines.c
+++ b/src/gallium/auxiliary/util/u_dump_defines.c
@@ -61,6 +61,36 @@ util_dump_enum_continuous(unsigned value,
    return names[value];
 }
 
+static const char *
+util_dump_flags(unsigned flags, const char *prefix,
+                unsigned num_names,
+                const char **names)
+{
+#if __STDC_VERSION__ >= 201112 && !defined __STDC_NO_THREADS__
+   static _Thread_local char str[256];
+#elif defined(PIPE_CC_GCC)
+   static __thread char str[256];
+#elif defined(PIPE_CC_MSVC)
+   static __declspec(thread) char str[256];
+#else
+#error Unsupported compiler: please find how to implement thread local storage on it
+#endif
+   int i, pos;
+
+   if (!flags)
+      return "";
+   pos = snprintf(str, Elements(str), "%s_", prefix);
+
+   for (i = 0; (i < num_names) && flags; flags >>= 1, ++i) {
+      if (flags & 1) {
+         pos += snprintf(&str[pos], Elements(str) - pos, "%s", names[i]);
+         if (flags & ~1)
+            pos += snprintf(&str[pos], Elements(str) - pos, "|");
+      }
+   }
+   return str;
+}
+
 
 #define DEFINE_UTIL_DUMP_CONTINUOUS(_name) \
    const char * \
@@ -90,6 +120,14 @@ util_dump_enum_continuous(unsigned value,
    }
 
 
+#define DEFINE_UTIL_DUMP_FLAGS(_prefix, _name)   \
+   const char * \
+   util_dump_##_name##_flags(unsigned flags) \
+   { \
+      return util_dump_flags(flags, _prefix, Elements(util_dump_##_name##_flag_names), util_dump_##_name##_flag_names); \
+   }
+
+
 static const char *
 util_dump_blend_factor_names[] = {
    UTIL_DUMP_INVALID_NAME, /* 0x0 */
@@ -392,3 +430,51 @@ util_dump_query_type_short_names[] = {
 };
 
 DEFINE_UTIL_DUMP_CONTINUOUS(query_type)
+
+
+static const char *
+util_dump_bind_flag_names[] = {
+   "DEPTH_STENCIL",
+   "RENDER_TARGET",
+   "BLENDABLE",
+   "SAMPLER_VIEW",
+   "VERTEX_BUFFER",
+   "INDEX_BUFFER",
+   "CONSTANT_BUFFER",
+   "(7)",
+   "DISPLAY_TARGET",
+   "TRANSFER_WRITE",
+   "TRANSFER_READ",
+   "STREAM_OUTPUT",
+   "(12)",
+   "(13)",
+   "(14)",
+   "(15)",
+   "CURSOR",
+   "CUSTOM",
+   "GLOBAL",
+   "SHADER_RESOURCE",
+   "COMPUTE_RESOURCE"
+};
+
+DEFINE_UTIL_DUMP_FLAGS("PIPE_BIND", bind)
+
+
+static const char *
+util_dump_transfer_flag_names[] = {
+   "READ",
+   "WRITE",
+   "MAP_DIRECTLY",
+   "(3)",
+   "(4)",
+   "(5)",
+   "(6)",
+   "(7)",
+   "DISCARD_RANGE",
+   "DONTBLOCK",
+   "UNSYNCHRONIZED",
+   "FLUSH_EXPLICIT",
+   "DISCARD_WHOLE_RESOURCE"
+};
+
+DEFINE_UTIL_DUMP_FLAGS(, transfer)
--
2.1.3
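
The bit-walking technique used by util_dump_flags() can be tried outside
Mesa with a sketch like the one below. It is a variation under stated
assumptions, not the patch's code: sketch_dump_flags() and
sketch_flag_names are hypothetical, the caller passes its own buffer
instead of relying on thread-local storage, every snprintf() result is
checked against the remaining space, and the "|" separator is emitted
before each name after the first so no trailing separator can appear
when bits beyond the name table are set. The three names are the first
entries of the transfer flag table above.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* First three transfer flag names from the table in the patch. */
static const char *const sketch_flag_names[] = {
   "READ", "WRITE", "MAP_DIRECTLY"
};

/* Hypothetical variant of util_dump_flags(): formats "<prefix>_A|B|C" into
 * the caller's buffer and returns that buffer ("" when no flag is set). */
static const char *
sketch_dump_flags(char *buf, size_t size, unsigned flags, const char *prefix,
                  const char *const *names, size_t num_names)
{
   size_t pos, i;
   int first = 1;
   int n;

   if (size == 0)
      return "";
   buf[0] = '\0';
   if (!flags)
      return buf;

   n = snprintf(buf, size, "%s_", prefix);
   if (n < 0 || (size_t)n >= size)
      return buf;                      /* prefix alone did not fit */
   pos = (size_t)n;

   /* Walk the bits from least significant upwards; bit i maps to names[i]. */
   for (i = 0; i < num_names && flags; flags >>= 1, ++i) {
      if (!(flags & 1))
         continue;
      n = snprintf(buf + pos, size - pos, "%s%s", first ? "" : "|", names[i]);
      if (n < 0 || (size_t)n >= size - pos)
         break;                        /* output truncated, stop here */
      pos += (size_t)n;
      first = 0;
   }
   return buf;
}
```

With a 64-byte buffer, the hypothetical prefix "DEMO" and flags 0x5, the
call yields "DEMO_READ|MAP_DIRECTLY".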




[Mesa-dev] [PATCH v3 5/9] gallium/auxiliary: add contained and rect checks (v3)

2014-11-02 Thread David Heidelberg

v3: thanks to Brian, improved coding style; glennk also helped spot a few
things (unsigned -> int, two constify)

Signed-off-by: David Heidelberg da...@ixit.cz
---
 src/gallium/auxiliary/util/u_box.h  | 230
 src/gallium/auxiliary/util/u_rect.h |  28 +
 2 files changed, 258 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_box.h b/src/gallium/auxiliary/util/u_box.h
index 0b28d0f..e22af9c 100644
--- a/src/gallium/auxiliary/util/u_box.h
+++ b/src/gallium/auxiliary/util/u_box.h
@@ -2,6 +2,7 @@
 #define UTIL_BOX_INLINES_H
 
 #include "pipe/p_state.h"
+#include "util/u_math.h"
 
 static INLINE
 void u_box_1d( unsigned x,
@@ -77,4 +78,233 @@ void u_box_3d( unsigned x,
    box->depth = d;
 }
 
+/* Returns whether @a is contained in or equal to @b. */
+static INLINE boolean
+u_box_contained(struct pipe_box *a, struct pipe_box *b)
+{
+   return
+      a->x >= b->x && (a->x + a->width  <= b->x + b->width) &&
+      a->y >= b->y && (a->y + a->height <= b->y + b->height) &&
+      a->z >= b->z && (a->z + a->depth  <= b->z + b->depth);
+}
+
+/* Clips @box to width @w and height @h.
+ * Returns -1 if the resulting box would be empty (then @box is left unchanged).
+ *          0 if nothing have been reduced.
+ *       >= 1 if width/height/both have been reduced.
+ * Aliasing permitted.
+ */
+static INLINE int
+u_box_clip_2d(struct pipe_box *dst,
+              const struct pipe_box *box, int w, int h)
+{
+   unsigned i;
+   int a[2], b[2], dim[2];
+   int res = 0;
+
+   if (!box->width || !box->height)
+      return -1;
+   dim[0] = w;
+   dim[1] = h;
+   a[0] = box->x;
+   a[1] = box->y;
+   b[0] = box->x + box->width;
+   b[1] = box->y + box->height;
+
+   for (i = 0; i < 2; ++i) {
+      if (b[i] < a[i]) {
+         if (a[i] < 0 || b[i] >= dim[i])
+            return -1;
+         if (a[i] > dim[i]) {
+            a[i] = dim[i];
+            res |= (1 << i);
+         }
+         if (b[i] < 0) {
+            b[i] = 0;
+            res |= (1 << i);
+         }
+      } else {
+         if (b[i] < 0 || a[i] >= dim[i])
+            return -1;
+         if (a[i] < 0) {
+            a[i] = 0;
+            res |= (1 << i);
+         }
+         if (b[i] > dim[i]) {
+            b[i] = dim[i];
+            res |= (1 << i);
+         }
+      }
+   }
+
+   if (res) {
+      dst->x = a[0];
+      dst->y = a[1];
+      dst->width = b[0] - a[0];
+      dst->height = b[1] - a[1];
+   }
+   return res;
+}
+
+static INLINE int
+u_box_clip_3d(struct pipe_box *dst,
+              const struct pipe_box *box, int w, int h, int d)
+{
+   unsigned i;
+   int a[3], b[3], dim[3];
+   int res = 0;
+
+   if (!box->width || !box->height)
+      return -1;
+   dim[0] = w;
+   dim[1] = h;
+   dim[2] = d;
+   a[0] = box->x;
+   a[1] = box->y;
+   a[2] = box->z;
+   b[0] = box->x + box->width;
+   b[1] = box->y + box->height;
+   b[2] = box->z + box->depth;
+
+   for (i = 0; i < 2; ++i) { /* ignore z */
+      if (b[i] < a[i]) {
+         if (a[i] < 0 || b[i] >= dim[i])
+            return -1;
+         if (a[i] > dim[i]) {
+            a[i] = dim[i];
+            res |= (1 << i);
+         }
+         if (b[i] < 0) {
+            b[i] = 0;
+            res |= (1 << i);
+         }
+      } else {
+         if (b[i] < 0 || a[i] >= dim[i])
+            return -1;
+         if (a[i] < 0) {
+            a[i] = 0;
+            res |= (1 << i);
+         }
+         if (b[i] > dim[i]) {
+            b[i] = dim[i];
+            res |= (1 << i);
+         }
+      }
+   }
+
+   if (res) {
+      dst->x = a[0];
+      dst->y = a[1];
+      dst->z = a[2];
+      dst->width = b[0] - a[0];
+      dst->height = b[1] - a[1];
+      dst->depth = b[2] - a[2];
+   }
+   return res;
+}
+
+/* Return true if @a is contained in or equal to @b.
+ */
+static INLINE boolean
+u_box_contained_2d(const struct pipe_box *a, const struct pipe_box *b)
+{
+   int a_x1 = a->x + a->width;
+   int b_x1 = b->x + b->width;
+   int a_y1 = a->y + a->height;
+   int b_y1 = b->y + b->height;
+   return
+      a->x >= b->x && a_x1 <= b_x1 &&
+      a->y >= b->y && a_y1 <= b_y1;
+}
+
+static INLINE int64_t
+u_box_volume(const struct pipe_box *box)
+{
+   return (int64_t)box->width * box->height * box->depth;
+}
+
+/* Aliasing of @dst and @a permitted. */
+static INLINE void
+u_box_cover_2d(struct pipe_box *dst,
+               struct pipe_box *a, const struct pipe_box *b)
+{
+   int x1_a = a->x + a->width;
+   int y1_a = a->y + a->height;
+   int x1_b = b->x + b->width;
+   int y1_b = b->y + b->height;
+
+   dst->x = MIN2(a->x, b->x);
+   dst->y = MIN2(a->y, b->y);
+
+   dst->width = MAX2(x1_a, x1_b) - dst->x;
+   dst->height = MAX2(y1_a, y1_b) - dst->y;
+}
+
+/* Aliasing of @dst and @a permitted. */
+static INLINE void
+u_box_cover(struct pipe_box *dst,
+            struct pipe_box *a, const struct pipe_box *b)
+{
+   int x1_a = a->x + a->width;
+   int y1_a = a->y + a->height;
+   int z1_a = a->z + a->depth;
+   int x1_b = b->x + b->width;
+   int y1_b = b->y + b->height;
+   int z1_b = b->z + b->depth;
+
+   dst->x = MIN2(a->x, b->x);
+   dst->y = MIN2(a->y, b->y);
+   dst->z = MIN2(a->z, b->z);
+
+   
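
For anyone who wants to try the cover and containment helpers outside of Mesa, here is a minimal standalone sketch. It assumes a cut-down `struct box2d` and local `MIN2`/`MAX2` macros; those names are stand-ins for illustration and are not the gallium `pipe_box` definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cut-down box: origin plus non-negative extent. */
struct box2d {
   int x, y;
   int width, height;
};

#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* True if a lies inside b or is equal to it. */
static bool
box2d_contained(const struct box2d *a, const struct box2d *b)
{
   assert(a && b);
   return a->x >= b->x && a->x + a->width <= b->x + b->width &&
          a->y >= b->y && a->y + a->height <= b->y + b->height;
}

/* Smallest box covering both a and b; dst may alias a. */
static void
box2d_cover(struct box2d *dst, const struct box2d *a, const struct box2d *b)
{
   /* Read the far edges first so that dst == a stays safe. */
   int x1 = MAX2(a->x + a->width, b->x + b->width);
   int y1 = MAX2(a->y + a->height, b->y + b->height);

   dst->x = MIN2(a->x, b->x);
   dst->y = MIN2(a->y, b->y);
   dst->width = x1 - dst->x;
   dst->height = y1 - dst->y;
}
```

With a = {0, 0, 2, 2} and b = {1, 1, 3, 3}, the cover comes out as {0, 0, 4, 4}, and both inputs test as contained in it.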

[Mesa-dev] [PATCH v3 8/9] nine: Add drirc options (v2)

2014-11-02 Thread David Heidelberg

Implements vblank_mode and throttling, which allows us to change the default
ratio between framerate and input lag.

Signed-off-by: David Heidelberg da...@ixit.cz
Signed-off-by: Axel Davy axel.d...@ens.fr
---
 src/gallium/state_trackers/nine/adapter9.h      |  1 +
 src/gallium/state_trackers/nine/swapchain9.c    |  5 +++++
 src/gallium/targets/d3dadapter9/drm.c           | 35 +++++++++++++++++++++++++++++++++++
 src/mesa/drivers/dri/common/xmlpool/t_options.h | 13 +++++++++++++
 4 files changed, 54 insertions(+)

diff --git a/src/gallium/state_trackers/nine/adapter9.h b/src/gallium/state_trackers/nine/adapter9.h
index ad04e4c..3c429d0 100644
--- a/src/gallium/state_trackers/nine/adapter9.h
+++ b/src/gallium/state_trackers/nine/adapter9.h
@@ -37,6 +37,7 @@ struct d3dadapter9_context
     BOOL linear_framebuffer;
     BOOL throttling;
     int throttling_value;
+    int vblank_mode;
 
     void (*destroy)( struct d3dadapter9_context *ctx );
 };
diff --git a/src/gallium/state_trackers/nine/swapchain9.c b/src/gallium/state_trackers/nine/swapchain9.c
index 69b19e1..74ab01d 100644
--- a/src/gallium/state_trackers/nine/swapchain9.c
+++ b/src/gallium/state_trackers/nine/swapchain9.c
@@ -166,6 +166,11 @@ NineSwapChain9_Resize( struct NineSwapChain9 *This,
         if (This->desired_fences > DRI_SWAP_FENCES_MAX)
             This->desired_fences = DRI_SWAP_FENCES_MAX;
 
+    if (This->actx->vblank_mode == 0)
+        pParams->PresentationInterval = D3DPRESENT_INTERVAL_IMMEDIATE;
+    else if (This->actx->vblank_mode == 3)
+        pParams->PresentationInterval = D3DPRESENT_INTERVAL_ONE;
+
     if (mode && This->mode) {
         *(This->mode) = *mode;
     } else if (mode) {
diff --git a/src/gallium/targets/d3dadapter9/drm.c b/src/gallium/targets/d3dadapter9/drm.c
index 09b8e82..e9b1ecc 100644
--- a/src/gallium/targets/d3dadapter9/drm.c
+++ b/src/gallium/targets/d3dadapter9/drm.c
@@ -36,6 +36,9 @@
 #include "d3dadapter/d3dadapter9.h"
 #include "d3dadapter/drm.h"
 
+#include "xmlconfig.h"
+#include "xmlpool.h"
+
 #include <libdrm/drm.h>
 #include <sys/ioctl.h>
 #include <fcntl.h>
@@ -49,6 +52,16 @@
  (DWORD)((lo)  0x) \
 ))
 
+const char __driConfigOptionsNine[] =
+DRI_CONF_BEGIN
+    DRI_CONF_SECTION_PERFORMANCE
+         DRI_CONF_VBLANK_MODE(DRI_CONF_VBLANK_DEF_INTERVAL_1)
+    DRI_CONF_SECTION_END
+    DRI_CONF_SECTION_NINE
+        DRI_CONF_NINE_THROTTLE(-2)
+    DRI_CONF_SECTION_END
+DRI_CONF_END;
+
 /* Regarding os versions, we should not define our own as that would simply be
  * weird. Defaulting to Win2k/XP seems sane considering the origin of D3D9. The
  * driver also defaults to being a generic D3D9 driver, which of course only
@@ -229,6 +242,9 @@ drm_create_adapter( int fd,
     int i, different_device;
     const struct drm_conf_ret *throttle_ret = NULL;
     const struct drm_conf_ret *dmabuf_ret = NULL;
+    driOptionCache defaultInitOptions;
+    driOptionCache userInitOptions;
+    int throttling_value_user;
 
 #if !GALLIUM_STATIC_TARGETS
     const char *paths[] = {
@@ -289,6 +305,25 @@ drm_create_adapter( int fd,
     } else
         ctx->base.throttling = FALSE;
 
+    driParseOptionInfo(&defaultInitOptions, __driConfigOptionsNine);
+    driParseConfigFiles(&userInitOptions, &defaultInitOptions, 0, "nine");
+    if (driCheckOption(&userInitOptions, "throttle_value", DRI_INT)) {
+        throttling_value_user = driQueryOptioni(&userInitOptions, "throttle_value");
+        if (throttling_value_user == -1)
+            ctx->base.throttling = FALSE;
+        else if (throttling_value_user >= 0) {
+            ctx->base.throttling = TRUE;
+            ctx->base.throttling_value = throttling_value_user;
+        }
+    }
+
+    if (driCheckOption(&userInitOptions, "vblank_mode", DRI_ENUM))
+        ctx->base.vblank_mode = driQueryOptioni(&userInitOptions, "vblank_mode");
+    else
+        ctx->base.vblank_mode = 1;
+
+    driDestroyOptionCache(&userInitOptions);
+    driDestroyOptionInfo(&defaultInitOptions);
 
 #if GALLIUM_STATIC_TARGETS
     ctx->base.ref = ninesw_create_screen(ctx->base.hal);
diff --git a/src/mesa/drivers/dri/common/xmlpool/t_options.h b/src/mesa/drivers/dri/common/xmlpool/t_options.h
index b73a662..e4f6937 100644
--- a/src/mesa/drivers/dri/common/xmlpool/t_options.h
+++ b/src/mesa/drivers/dri/common/xmlpool/t_options.h
@@ -340,3 +340,16 @@ DRI_CONF_SECTION_BEGIN \
 DRI_CONF_OPT_BEGIN(device_id, string, def) \
         DRI_CONF_DESC(en,gettext("Define the graphic device to use if possible")) \
 DRI_CONF_OPT_END
+
+/**
+ * \brief Gallium-Nine specific configuration options
+ */
+
+#define DRI_CONF_SECTION_NINE \
+        DRI_CONF_SECTION_BEGIN \
+        DRI_CONF_DESC(en,gettext("Gallium Nine"))
+
+#define DRI_CONF_NINE_THROTTLE(def) \
+DRI_CONF_OPT_BEGIN(throttle_value, int, def) \
+        DRI_CONF_DESC(en,gettext("Define the throttling value. -1 for no throttling, -2 for default (usually 2), 0 for glfinish behaviour")) \
+DRI_CONF_OPT_END
--
2.1.3
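
To make the meaning of throttle_value easier to follow, here is a minimal standalone sketch of the decision the drm.c hunk makes once the option has been read. `struct throttle_state` and `apply_throttle_option` are hypothetical names used only for this illustration; the patch itself writes into `ctx->base` directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical holder for the two fields the patch updates. */
struct throttle_state {
    bool throttling;
    int  throttling_value;
};

/* Apply a user-supplied throttle_value on top of the driver's choice. */
static void
apply_throttle_option(struct throttle_state *st, int user_value)
{
    assert(st != NULL);

    if (user_value == -1) {
        /* -1: throttling explicitly disabled */
        st->throttling = false;
    } else if (user_value >= 0) {
        /* 0 or more: throttling forced on with this value */
        st->throttling = true;
        st->throttling_value = user_value;
    }
    /* anything else (such as the -2 default): keep the driver's setting */
}
```

Starting from a driver default of throttling enabled with value 2, passing -2 leaves the state untouched, -1 turns throttling off, and 0 turns it back on with a value of 0.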


___
mesa-dev mailing list

[Mesa-dev] [PATCH v3 7/9] nine: Add state tracker nine for Direct3D9 (v2)

2014-11-02 Thread David Heidelberg
This patch is too big for ML, please see it in 
https://github.com/iXit/Mesa-3D/commits/for-upstream-3 .

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v3 9/9] nine: Implement threadpool

2014-11-02 Thread David Heidelberg

DRI_PRIME setups have several issues due to the lack of dma-buf fence
support in the drivers. For DRI3 DRI_PRIME, a race can appear, making
tearing visible or, worse, showing older content than expected. Until
dma-buf fences are well supported (and by all drivers), an alternative
is to send the buffers to the server only when rendering has finished.
Since waiting for rendering to finish in the main thread has a
performance impact, this patch uses an additional thread to offload the
wait and the sending of the buffers to the server.

Signed-off-by: Axel Davy axel.d...@ens.fr
Signed-off-by: David Heidelberg da...@ixit.cz
---
 src/gallium/state_trackers/nine/Makefile.sources |   2 +
 src/gallium/state_trackers/nine/adapter9.h       |   1 +
 src/gallium/state_trackers/nine/swapchain9.c     |  86 +-
 src/gallium/state_trackers/nine/swapchain9.h     |   7 +
 src/gallium/state_trackers/nine/threadpool.c     | 202 +++
 src/gallium/state_trackers/nine/threadpool.h     |  55 ++
 src/gallium/targets/d3dadapter9/drm.c            |  16 +-
 src/mesa/drivers/dri/common/xmlpool/t_options.h  |   5 +
 8 files changed, 364 insertions(+), 10 deletions(-)
 create mode 100644 src/gallium/state_trackers/nine/threadpool.c
 create mode 100644 src/gallium/state_trackers/nine/threadpool.h

diff --git a/src/gallium/state_trackers/nine/Makefile.sources b/src/gallium/state_trackers/nine/Makefile.sources
index b821961..99b623a 100644
--- a/src/gallium/state_trackers/nine/Makefile.sources
+++ b/src/gallium/state_trackers/nine/Makefile.sources
@@ -59,6 +59,8 @@ C_SOURCES := \
 	swapchain9.h \
 	texture9.c \
 	texture9.h \
+	threadpool.c \
+	threadpool.h \
 	vertexbuffer9.c \
 	vertexbuffer9.h \
 	vertexdeclaration9.c \
diff --git a/src/gallium/state_trackers/nine/adapter9.h b/src/gallium/state_trackers/nine/adapter9.h
index 3c429d0..b2fea6c 100644
--- a/src/gallium/state_trackers/nine/adapter9.h
+++ b/src/gallium/state_trackers/nine/adapter9.h
@@ -38,6 +38,7 @@ struct d3dadapter9_context
     BOOL throttling;
     int throttling_value;
     int vblank_mode;
+    BOOL thread_submit;
 
     void (*destroy)( struct d3dadapter9_context *ctx );
 };
diff --git a/src/gallium/state_trackers/nine/swapchain9.c b/src/gallium/state_trackers/nine/swapchain9.c
index 74ab01d..ffa89b2 100644
--- a/src/gallium/state_trackers/nine/swapchain9.c
+++ b/src/gallium/state_trackers/nine/swapchain9.c
@@ -33,6 +33,8 @@
 #include "hud/hud_context.h"
 #include "state_tracker/drm_driver.h"
 
+#include "threadpool.h"
+
 #define DBG_CHANNEL DBG_SWAPCHAIN
 
 #define UNTESTED(n) DBG("UNTESTED point %d. Please tell if it worked\n", n)
@@ -72,6 +74,7 @@ NineSwapChain9_ctor( struct NineSwapChain9 *This,
     params.hDeviceWindow = hFocusWindow;
 
     This->rendering_done = FALSE;
+    This->pool = NULL;
     return NineSwapChain9_Resize(This, &params, mode);
 }
 
@@ -229,6 +232,21 @@ NineSwapChain9_Resize( struct NineSwapChain9 *This,
     desc.Width = pParams->BackBufferWidth;
     desc.Height = pParams->BackBufferHeight;
 
+    if (This->pool) {
+        _mesa_threadpool_destroy(This->pool);
+        This->pool = NULL;
+    }
+    This->enable_threadpool = This->actx->thread_submit && (pParams->SwapEffect != D3DSWAPEFFECT_COPY);
+    if (This->enable_threadpool)
+        This->pool = _mesa_threadpool_create();
+    if (!This->pool)
+        This->enable_threadpool = FALSE;
+
+    This->tasks = REALLOC(This->tasks,
+                          oldBufferCount * sizeof(struct threadpool_task *),
+                          newBufferCount * sizeof(struct threadpool_task *));
+    memset(This->tasks, 0, newBufferCount * sizeof(struct threadpool_task *));
+
     for (i = 0; i < oldBufferCount; i++) {
         ID3DPresent_DestroyD3DWindowBuffer(This->present, This->present_handles[i]);
         This->present_handles[i] = NULL;
@@ -446,6 +464,9 @@ NineSwapChain9_dtor( struct NineSwapChain9 *This )
 
     DBG("This=%p\n", This);
 
+    if (This->pool)
+        _mesa_threadpool_destroy(This->pool);
+
     if (This->buffers) {
         for (i = 0; i < This->params.BackBufferCount; i++) {
             NineUnknown_Destroy(NineUnknown(This->buffers[i]));
@@ -543,6 +564,40 @@ handle_draw_cursor_and_hud( struct NineSwapChain9 *This, struct pipe_resource *r
     }
 }
 
+struct end_present_struct {
+    struct pipe_screen *screen;
+    struct pipe_fence_handle *fence_to_wait;
+    ID3DPresent *present;
+    D3DWindowBuffer *present_handle;
+    HWND hDestWindowOverride;
+};
+
+static void work_present(void *data)
+{
+    struct end_present_struct *work = data;
+    if (work->fence_to_wait) {
+        (void) work->screen->fence_finish(work->screen, work->fence_to_wait, PIPE_TIMEOUT_INFINITE);
+        work->screen->fence_reference(work->screen, &(work->fence_to_wait), NULL);
+    }
+    ID3DPresent_PresentBuffer(work->present, work->present_handle, work->hDestWindowOverride, NULL, NULL, NULL, 0);
+    free(work);
+}
+
+static void 
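
As a rough illustration of the idea in the commit message (wait for rendering on a helper thread, then present), here is a minimal sketch using plain pthreads. The `fake_fence` type and the `present_async` helper are hypothetical stand-ins invented for this example; they are not the `_mesa_threadpool_*` API used by the patch, and the "present" is reduced to setting a flag.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a GPU fence: signalled once rendering is done. */
struct fake_fence {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             signalled;
};

/* Hypothetical job: which fence to wait for and where to record the present. */
struct present_job {
    struct fake_fence *fence;
    int               *presented; /* set to 1 once the buffer is "sent" */
};

static void fake_fence_init(struct fake_fence *f)
{
    pthread_mutex_init(&f->lock, NULL);
    pthread_cond_init(&f->cond, NULL);
    f->signalled = 0;
}

static void fake_fence_signal(struct fake_fence *f)
{
    pthread_mutex_lock(&f->lock);
    f->signalled = 1;
    pthread_cond_broadcast(&f->cond);
    pthread_mutex_unlock(&f->lock);
}

static void fake_fence_wait(struct fake_fence *f)
{
    pthread_mutex_lock(&f->lock);
    while (!f->signalled)
        pthread_cond_wait(&f->cond, &f->lock);
    pthread_mutex_unlock(&f->lock);
}

/* Worker: block on the fence, then present, then free the job. */
static void *present_worker(void *data)
{
    struct present_job *job = data;

    fake_fence_wait(job->fence);
    *job->presented = 1;
    free(job);
    return NULL;
}

/* Queue a present without blocking the caller; returns 0 on success. */
static int present_async(pthread_t *thread, struct fake_fence *fence,
                         int *presented)
{
    struct present_job *job = malloc(sizeof(*job));

    if (!job)
        return -1;
    job->fence = fence;
    job->presented = presented;
    if (pthread_create(thread, NULL, present_worker, job) != 0) {
        free(job);
        return -1;
    }
    return 0;
}
```

The caller returns from `present_async` immediately; the flag only becomes 1 after `fake_fence_signal` has been called and the worker has been joined, which mirrors the ordering the patch is after.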

[Mesa-dev] [Bug 85608] Account request for David Heidelberg

2014-11-02 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=85608

David Heidelberg (okias) da...@ixit.cz changed:

           What    |Removed                     |Added

          Component|New Accounts                |Other
           Assignee|sitewranglers@lists.freedes |mesa-dev@lists.freedesktop.
                   |ktop.org                    |org
            Product|freedesktop.org             |Mesa

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [PATCH] i965/chv: Increase VS and GS thread counts

2014-11-02 Thread Ben Widawsky
AFAICT the number of threads is 80, not 70. I am not sure if Ken knows
something I do not.

Signed-off-by: Ben Widawsky b...@bwidawsk.net
Cc: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_device_info.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c b/src/mesa/drivers/dri/i965/brw_device_info.c
index 18e4c80..35ca125 100644
--- a/src/mesa/drivers/dri/i965/brw_device_info.c
+++ b/src/mesa/drivers/dri/i965/brw_device_info.c
@@ -240,8 +240,8 @@ static const struct brw_device_info brw_device_info_bdw_gt3 = {
 static const struct brw_device_info brw_device_info_chv = {
    GEN8_FEATURES, .is_cherryview = 1, .gt = 1,
    .has_llc = false,
-   .max_vs_threads = 70,
-   .max_gs_threads = 70,
+   .max_vs_threads = 80,
+   .max_gs_threads = 80,
    .max_wm_threads = 102,
    .urb = {
       .size = 128,
-- 
2.1.2



Re: [Mesa-dev] [PATCH] i965/chv: Increase VS and GS thread counts

2014-11-02 Thread Kenneth Graunke
On Sunday, November 02, 2014 11:43:24 AM Ben Widawsky wrote:
> AFAICT the number of threads is 80, not 70. I am not sure if Ken knows
> something I do not.
>
> Signed-off-by: Ben Widawsky b...@bwidawsk.net
> Cc: Kenneth Graunke kenn...@whitecape.org
> ---
>  src/mesa/drivers/dri/i965/brw_device_info.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c b/src/mesa/drivers/dri/i965/brw_device_info.c
> index 18e4c80..35ca125 100644
> --- a/src/mesa/drivers/dri/i965/brw_device_info.c
> +++ b/src/mesa/drivers/dri/i965/brw_device_info.c
> @@ -240,8 +240,8 @@ static const struct brw_device_info brw_device_info_bdw_gt3 = {
>  static const struct brw_device_info brw_device_info_chv = {
>     GEN8_FEATURES, .is_cherryview = 1, .gt = 1,
>     .has_llc = false,
> -   .max_vs_threads = 70,
> -   .max_gs_threads = 70,
> +   .max_vs_threads = 80,
> +   .max_gs_threads = 80,
>     .max_wm_threads = 102,
>     .urb = {
>        .size = 128,
>

Reviewed-by: Kenneth Graunke kenn...@whitecape.org



Re: [Mesa-dev] [PATCH] clover: fix clBuildProgram Piglit regression

2014-11-02 Thread Tom Stellard
On Sun, Nov 02, 2014 at 08:03:31PM +0200, Francisco Jerez wrote:
> EdB edb+m...@sigluy.net writes:
>
> > should trigger CL_INVALID_VALUE
> > if device_list is NULL and num_devices is greater than zero.
> >
> > introduced by e5468dfa523be2a7a0d04bb9efcf8ae780957563
> >
> Tom, can you just drop the vector of devices parameter and validate
> the d_devs/num_devs arguments from validate_build_program_common() by
> calling objs<allow_empty_tag>, as I suggested when I gave my R-b for
> your commit.
>

The reason I kept the vector of devices is that, if the device
list is NULL, the device list from the context needs to be used.
I didn't want to duplicate this logic in
validate_build_program_common(), so I added the allow_empty_tag
to the API functions instead.

I think EdB's fix is a better solution, what do you think?

-Tom
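
For reference, the argument rule under discussion can be written down on its own. This is only a sketch in plain C, not clover code: `check_device_args` and the `SKETCH_*` constants are hypothetical names for illustration, with the values chosen to match CL_SUCCESS and CL_INVALID_VALUE. It covers both directions of the mismatch that the OpenCL specification lists for clBuildProgram.

```c
#include <stddef.h>

/* Local stand-ins; the values match CL_SUCCESS and CL_INVALID_VALUE. */
enum {
   SKETCH_SUCCESS = 0,
   SKETCH_INVALID_VALUE = -30
};

/* A device list and its count must be both empty or both non-empty. */
static int
check_device_args(const void *device_list, unsigned num_devices)
{
   if ((device_list == NULL) != (num_devices == 0))
      return SKETCH_INVALID_VALUE;
   return SKETCH_SUCCESS;
}
```

A NULL list with a zero count is accepted here, which is the case where the devices of the program's context would then be used.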

> Thanks.
>
> > ---
> >  src/gallium/state_trackers/clover/api/program.cpp | 20 +++-
> >  1 file changed, 11 insertions(+), 9 deletions(-)
> >
> > diff --git a/src/gallium/state_trackers/clover/api/program.cpp b/src/gallium/state_trackers/clover/api/program.cpp
> > index 64c4a43..dc89730 100644
> > --- a/src/gallium/state_trackers/clover/api/program.cpp
> > +++ b/src/gallium/state_trackers/clover/api/program.cpp
> > @@ -27,7 +27,7 @@ using namespace clover;
> >
> >  namespace {
> >     void validate_build_program_common(const program &prog, cl_uint num_devs,
> > -                                      const ref_vector<device> &devs,
> > +                                      ref_vector<device> &devs,
> >                                        void (*pfn_notify)(cl_program, void *),
> >                                        void *user_data) {
> >
> > @@ -37,10 +37,14 @@ namespace {
> >        if (prog.kernel_ref_count())
> >           throw error(CL_INVALID_OPERATION);
> >
> > -      if (any_of([&](const device &dev) {
> > -               return !count(dev, prog.context().devices());
> > -            }, devs))
> > -         throw error(CL_INVALID_DEVICE);
> > +      if (!num_devs) {
> > +         devs = prog.context().devices();
> > +      } else {
> > +         if (any_of([&](const device &dev) {
> > +                  return !count(dev, prog.context().devices());
> > +               }, devs))
> > +            throw error(CL_INVALID_DEVICE);
> > +      }
> >     }
> >  }
> >
> > @@ -173,8 +177,7 @@ clBuildProgram(cl_program d_prog, cl_uint num_devs,
> >                 void (*pfn_notify)(cl_program, void *),
> >                 void *user_data) try {
> >     auto &prog = obj(d_prog);
> > -   auto devs = (d_devs ? objs(d_devs, num_devs) :
> > -                ref_vector<device>(prog.context().devices()));
> > +   auto devs = objs<allow_empty_tag>(d_devs, num_devs);
> >     auto opts = (p_opts ? p_opts : "");
> >
> >     validate_build_program_common(prog, num_devs, devs, pfn_notify, user_data);
> > @@ -195,8 +198,7 @@ clCompileProgram(cl_program d_prog, cl_uint num_devs,
> >                   void (*pfn_notify)(cl_program, void *),
> >                   void *user_data) try {
> >     auto &prog = obj(d_prog);
> > -   auto devs = (d_devs ? objs(d_devs, num_devs) :
> > -                ref_vector<device>(prog.context().devices()));
> > +   auto devs = objs<allow_empty_tag>(d_devs, num_devs);
> >     auto opts = (p_opts ? p_opts : "");
> >     header_map headers;
> >
> > --
> > 1.9.3
 