[Mesa-dev] [Bug 98652] AMD driver doesn't compile anymore after recent LLVM changes

2016-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98652

Bug ID: 98652
   Summary: AMD driver doesn't compile anymore after recent LLVM changes
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: b...@lindev.ch
QA Contact: mesa-dev@lists.freedesktop.org

The LLVMAttribute API has been removed from LLVM recently.
Mesa's src/amd/common/ac_nir_to_llvm.c still uses it, causing compile failures
when using current LLVM snapshots.

ac_nir_to_llvm.c:144:43: error: unknown type name 'LLVMAttribute'; did you mean
  'LLVMAttributeRef'?
unsigned param_count, LLVMAttribute attribs);
  ^
  LLVMAttributeRef  
/usr/include/llvm-c/Types.h:116:40: note: 'LLVMAttributeRef' declared here
typedef struct LLVMOpaqueAttributeRef *LLVMAttributeRef;
   ^
ac_nir_to_llvm.c:230:4: error: implicit declaration of function
'LLVMAddAttribute' is
  invalid in C99 [-Werror,-Wimplicit-function-declaration]
LLVMAddAttribute(P, LLVMByValAttribute);
^
ac_nir_to_llvm.c:230:24: error: use of undeclared identifier
'LLVMByValAttribute'; did you
  mean 'LLVMAddAttribute'?
LLVMAddAttribute(P, LLVMByValAttribute);
^~
LLVMAddAttribute
ac_nir_to_llvm.c:230:4: note: 'LLVMAddAttribute' declared here
LLVMAddAttribute(P, LLVMByValAttribute);
^
ac_nir_to_llvm.c:234:24: error: use of undeclared identifier
'LLVMInRegAttribute'
LLVMAddAttribute(P, LLVMInRegAttribute);
^
ac_nir_to_llvm.c:710:63: error: use of undeclared identifier
'LLVMReadNoneAttribute'
return emit_llvm_intrinsic(ctx, intrin, ctx->f32, params, 1,
LLVMReadNoneAt...
 ^
ac_nir_to_llvm.c:721:63: error: use of undeclared identifier
'LLVMReadNoneAttribute'
return emit_llvm_intrinsic(ctx, intrin, ctx->f32, params, 2,
LLVMReadNoneAt...
 ^
ac_nir_to_llvm.c:733:63: error: use of undeclared identifier
'LLVMReadNoneAttribute'
return emit_llvm_intrinsic(ctx, intrin, ctx->f32, params, 3,
LLVMReadNoneAt...
 ^
ac_nir_to_llvm.c:759:72: error: use of undeclared identifier
'LLVMReadNoneAttribute'
return emit_llvm_intrinsic(ctx, "llvm.cttz.i32", ctx->i32, params, 2,
LLVMReadNoneA...
  ^
ac_nir_to_llvm.c:767:13: error: use of undeclared identifier
'LLVMReadNoneAttribute'
   LLVMReadNoneAttribute);
   ^
ac_nir_to_llvm.c:793:13: error: use of undeclared identifier
'LLVMReadNoneAttribute'
   LLVMReadNoneAttribute);
   ^
ac_nir_to_llvm.c:857:8: error: use of undeclared identifier
'LLVMReadNoneAttribute'
 LLVMReadNoneAttribute);
 ^
ac_nir_to_llvm.c:873:18: error: use of undeclared identifier
'LLVMReadNoneAttribute'
  params, 2, LLVMReadNoneAttribute);
 ^
ac_nir_to_llvm.c:918:63: error: use of undeclared identifier
'LLVMReadNoneAttribute'
result = emit_llvm_intrinsic(ctx, intrin, ctx->i32, srcs, 3,
LLVMReadNoneAt...
 ^
ac_nir_to_llvm.c:1025:21: error: use of undeclared identifier
'LLVMReadNoneAttribute'
  tid_args, 2, LLVMReadNoneAttribute);
   ^
ac_nir_to_llvm.c:1029:20: error: use of undeclared identifier
'LLVMReadNoneAttribute'
  tid_args, 2, LLVMReadNoneAttribute);
   ^
ac_nir_to_llvm.c:1117:7: error: use of undeclared identifier
'LLVMReadNoneAttribute'
 LLVMReadNoneAttribute);
 ^
ac_nir_to_llvm.c:1123:9: error: use of undeclared identifier
'LLVMReadNoneAttribute'
   LLVMReadNoneAttribute);
   ^
ac_nir_to_llvm.c:1453:78: error: use

Re: [Mesa-dev] [PATCH] radv: fix GetFenceStatus for signaled fences

2016-11-08 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

On Wed, Nov 9, 2016 at 2:22 AM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> if a fence is created pre-signaled we should return that
> in GetFenceStatus even if it hasn't been submitted.
>
> Signed-off-by: Dave Airlie 
> ---
>  src/amd/vulkan/radv_device.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index fdb6db9..214af5f 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -1202,6 +1202,8 @@ VkResult radv_GetFenceStatus(VkDevice _device, VkFence 
> _fence)
> RADV_FROM_HANDLE(radv_device, device, _device);
> RADV_FROM_HANDLE(radv_fence, fence, _fence);
>
> +   if (fence->signalled)
> +   return VK_SUCCESS;
> if (!fence->submitted)
> return VK_NOT_READY;
>
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] swr: [rasterizer core] allow an OpenGL driver to specify halfz clipping

2016-11-08 Thread Ilia Mirkin
With ARB_clip_control, GL may also do 0..1 depth clipping, not just
-1..1. For backwards compatibility, preserve the existing driver type
check for DX as well.

Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/swr/rasterizer/core/clip.h  | 6 +++---
 src/gallium/drivers/swr/rasterizer/core/state.h | 1 +
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/core/clip.h 
b/src/gallium/drivers/swr/rasterizer/core/clip.h
index 43bc522..78dbcf0 100644
--- a/src/gallium/drivers/swr/rasterizer/core/clip.h
+++ b/src/gallium/drivers/swr/rasterizer/core/clip.h
@@ -90,7 +90,7 @@ void ComputeClipCodes(DRIVER_TYPE type, const API_STATE& 
state, const simdvector
 {
 // FRUSTUM_NEAR
 // DX clips depth [0..w], GL clips [-w..w]
-if (type == DX)
+if (type == DX || state.rastState.clipHalfZ)
 {
 vRes = _simd_cmplt_ps(vertex.z, _simd_setzero_ps());
 }
@@ -640,7 +640,7 @@ private:
 case FRUSTUM_BOTTOM:t = ComputeInterpFactor(_simd_sub_ps(v1[3], 
v1[1]), _simd_sub_ps(v2[3], v2[1])); break;
 case FRUSTUM_NEAR:  
 // DX Znear plane is 0, GL is -w
-if (this->driverType == DX)
+if (this->driverType == DX || this->state.rastState.clipHalfZ)
 {
 t = ComputeInterpFactor(v1[2], v2[2]);
 }
@@ -708,7 +708,7 @@ private:
 case FRUSTUM_RIGHT: return _simd_cmple_ps(v[0], v[3]);
 case FRUSTUM_TOP:   return _simd_cmpge_ps(v[1], _simd_mul_ps(v[3], 
_simd_set1_ps(-1.0f)));
 case FRUSTUM_BOTTOM:return _simd_cmple_ps(v[1], v[3]);
-case FRUSTUM_NEAR:  return _simd_cmpge_ps(v[2], this->driverType 
== DX ? _simd_setzero_ps() : _simd_mul_ps(v[3], _simd_set1_ps(-1.0f)));
+case FRUSTUM_NEAR:  return _simd_cmpge_ps(v[2], this->driverType 
== DX || this->state.rastState.clipHalfZ ? _simd_setzero_ps() : 
_simd_mul_ps(v[3], _simd_set1_ps(-1.0f)));
 case FRUSTUM_FAR:   return _simd_cmple_ps(v[2], v[3]);
 default:
 SWR_ASSERT(false, "invalid clipping plane: %d", ClippingPlane);
diff --git a/src/gallium/drivers/swr/rasterizer/core/state.h 
b/src/gallium/drivers/swr/rasterizer/core/state.h
index 93e4565..5ee12e8 100644
--- a/src/gallium/drivers/swr/rasterizer/core/state.h
+++ b/src/gallium/drivers/swr/rasterizer/core/state.h
@@ -932,6 +932,7 @@ struct SWR_RASTSTATE
 uint32_t frontWinding   : 1;
 uint32_t scissorEnable  : 1;
 uint32_t depthClipEnable: 1;
+uint32_t clipHalfZ  : 1;
 uint32_t pointParam : 1;
 uint32_t pointSpriteEnable  : 1;
 uint32_t pointSpriteTopOrigin   : 1;
-- 
2.7.3
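
The near-plane test that this patch changes can be sketched in scalar C. This is only an illustration under stated assumptions: the function name `sketch_outside_near` is hypothetical, and the real code operates on SIMD vectors through `_simd_cmplt_ps` rather than on single floats.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical scalar version of the near-plane clip test.
 * With half-Z clipping (DX, or GL with ARB_clip_control 0..1 depth)
 * the near plane is z = 0; otherwise it is z = -w. */
static bool sketch_outside_near(float z, float w, bool clip_halfz)
{
    assert(w > 0.0f); /* sketch only considers vertices in front of the eye */
    return clip_halfz ? (z < 0.0f) : (z < -w);
}
```

A vertex with z = -0.5 and w = 1 is therefore inside the classic GL volume but outside the near plane once half-Z clipping is requested.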



[Mesa-dev] [PATCH 2/2] swr: set halfz rasterizer setting

2016-11-08 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/swr/swr_state.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/swr/swr_state.cpp 
b/src/gallium/drivers/swr/swr_state.cpp
index 01cadce..d19acfb 100644
--- a/src/gallium/drivers/swr/swr_state.cpp
+++ b/src/gallium/drivers/swr/swr_state.cpp
@@ -921,6 +921,7 @@ swr_update_derived(struct pipe_context *pipe,
  rastState->depthFormat = swr_resource(zb->texture)->swr.format;
 
   rastState->depthClipEnable = rasterizer->depth_clip;
+  rastState->clipHalfZ = rasterizer->clip_halfz;
 
   rastState->clipDistanceMask =
  ctx->vs->info.base.num_written_clipdistance ?
-- 
2.7.3



Re: [Mesa-dev] [PATCH 3/3] swr: [rasterizer jitter] fix logic op to work with unorm/snorm

2016-11-08 Thread Rowley, Timothy O
Reviewed-by: Tim Rowley <timothy.o.row...@intel.com>

On Nov 7, 2016, at 6:18 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:

Most logic op usage is probably going to end up with normalized
textures. Scale the floating point values and convert to integer before
performing the logic operations.

Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
---

The gl-1.1-xor-copypixels test still fails. The image stays the same. I'm
suspecting it's for reasons outside of this patch.

I'm not too familiar with the whole swr infrastructure, perhaps there was
an easier way to do all this. I looked for conversion helper functions but
couldn't find anything that would fit nicely here. Feel free to point me
in the right direction.

.../drivers/swr/rasterizer/jitter/blend_jit.cpp| 81 +-
1 file changed, 64 insertions(+), 17 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
index 1452d27..d69d503 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
@@ -649,29 +649,54 @@ struct BlendJit : public Builder
if(state.blendState.logicOpEnable)
{
const SWR_FORMAT_INFO& info = GetFormatInfo(state.format);
-SWR_ASSERT(info.type[0] == SWR_TYPE_UINT);
Value* vMask[4];
+float scale[4];
+
+if (!state.blendState.blendEnable) {
+Clamp(state.format, src);
+Clamp(state.format, dst);
+}
+
for(uint32_t i = 0; i < 4; i++)
{
-switch(info.bpc[i])
+if (info.type[i] == SWR_TYPE_UNUSED)
{
-case 0: vMask[i] = VIMMED1(0x); break;
-case 2: vMask[i] = VIMMED1(0x0003); break;
-case 5: vMask[i] = VIMMED1(0x001F); break;
-case 6: vMask[i] = VIMMED1(0x003F); break;
-case 8: vMask[i] = VIMMED1(0x00FF); break;
-case 10: vMask[i] = VIMMED1(0x03FF); break;
-case 11: vMask[i] = VIMMED1(0x07FF); break;
-case 16: vMask[i] = VIMMED1(0x); break;
-case 24: vMask[i] = VIMMED1(0x00FF); break;
-case 32: vMask[i] = VIMMED1(0x); break;
+continue;
+}
+
+if (info.bpc[i] >= 32) {
+vMask[i] = VIMMED1(0x);
+scale[i] = 0x;
+} else {
+vMask[i] = VIMMED1((1 << info.bpc[i]) - 1);
+if (info.type[i] == SWR_TYPE_SNORM)
+scale[i] = (1 << (info.bpc[i] - 1)) - 1;
+else
+scale[i] = (1 << info.bpc[i]) - 1;
+}
+
+switch (info.type[i]) {
default:
-vMask[i] = VIMMED1(0x0);
-SWR_ASSERT(0, "Unsupported bpc for logic op\n");
+SWR_ASSERT(0, "Unsupported type for logic op\n");
+/* fallthrough */
+case SWR_TYPE_UINT:
+case SWR_TYPE_SINT:
+src[i] = BITCAST(src[i], mSimdInt32Ty);
+dst[i] = BITCAST(dst[i], mSimdInt32Ty);
+break;
+case SWR_TYPE_SNORM:
+src[i] = FADD(src[i], VIMMED1(0.5f));
+dst[i] = FADD(dst[i], VIMMED1(0.5f));
+/* fallthrough */
+case SWR_TYPE_UNORM:
+src[i] = FP_TO_UI(
+FMUL(src[i], VIMMED1(scale[i])),
+mSimdInt32Ty);
+dst[i] = FP_TO_UI(
+FMUL(dst[i], VIMMED1(scale[i])),
+mSimdInt32Ty);
break;
}
-src[i] = BITCAST(src[i], mSimdInt32Ty);//, vMask[i]);
-dst[i] = BITCAST(dst[i], mSimdInt32Ty);
}

LogicOpFunc(state.blendState.logicOpFunc, src, dst, result);
@@ -679,10 +704,32 @@ struct BlendJit : public Builder
// store results out
for(uint32_t i = 0; i < 4; ++i)
{
+if (info.type[i] == SWR_TYPE_UNUSED)
+{
+continue;
+}
+
// clear upper bits from PS output not in RT format after doing 
logic op
result[i] = AND(result[i], vMask[i]);

-STORE(BITCAST(result[i], mSimdFP32Ty), pResult, {i});
+switch (info.type[i]) {
+default:
+SWR_ASSERT(0, "Unsupported type for logic op\n");
+/* fallthrough */
+case SWR_TYPE_UINT:
+case SWR_TYPE_SINT:
+result[i] = BITCAST(resu
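
(The quoted patch is cut off here in the archive.) The scaling step described in the commit message can be sketched in scalar C for the UNORM case. This is a simplified illustration, not the JIT code from the patch: all `sketch_*` names are hypothetical, only XOR is shown, SNORM is omitted, and the input is clamped to [0, 1] unconditionally.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a normalized float in [0, 1] to a bpc-bit unsigned integer. */
static uint32_t sketch_unorm_to_uint(float v, unsigned bpc)
{
    assert(bpc >= 1 && bpc < 32);
    float scale = (float)((1u << bpc) - 1u);
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;
    return (uint32_t)(v * scale); /* truncating float-to-uint conversion */
}

/* Convert a bpc-bit unsigned integer back to a normalized float. */
static float sketch_uint_to_unorm(uint32_t v, unsigned bpc)
{
    assert(bpc >= 1 && bpc < 32);
    return (float)v / (float)((1u << bpc) - 1u);
}

/* XOR logic op on one UNORM channel: scale to integer, operate,
 * mask to the channel width, then scale back to float. */
static float sketch_logic_xor_unorm(float src, float dst, unsigned bpc)
{
    uint32_t mask = (1u << bpc) - 1u;
    uint32_t r = sketch_unorm_to_uint(src, bpc) ^ sketch_unorm_to_uint(dst, bpc);
    return sketch_uint_to_unorm(r & mask, bpc);
}
```

Operating on the scaled integers rather than on the raw float bit patterns is what lets a logic op give meaningful results for normalized formats.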

[Mesa-dev] [PATCH] swr: fix support for inverted depth scales

2016-11-08 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---

This improves bin/arb_clip_control-clip-control results, but still not
quite there yet.

 src/gallium/drivers/swr/swr_state.cpp | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/swr/swr_state.cpp 
b/src/gallium/drivers/swr/swr_state.cpp
index ede475a..01cadce 100644
--- a/src/gallium/drivers/swr/swr_state.cpp
+++ b/src/gallium/drivers/swr/swr_state.cpp
@@ -38,6 +38,7 @@
 #include "util/u_inlines.h"
 #include "util/u_helpers.h"
 #include "util/u_framebuffer.h"
+#include "util/u_viewport.h"
 
 #include "swr_state.h"
 #include "swr_context.h"
@@ -951,13 +952,8 @@ swr_update_derived(struct pipe_context *pipe,
   vp->width = state->translate[0] + state->scale[0];
   vp->y = state->translate[1] - fabs(state->scale[1]);
   vp->height = state->translate[1] + fabs(state->scale[1]);
-  if (rasterizer->clip_halfz == 0) {
- vp->minZ = state->translate[2] - state->scale[2];
- vp->maxZ = state->translate[2] + state->scale[2];
-  } else {
- vp->minZ = state->translate[2];
- vp->maxZ = state->translate[2] + state->scale[2];
-  }
+  util_viewport_zmin_zmax(state, rasterizer->clip_halfz,
+  &vp->minZ, &vp->maxZ);
 
   vpm->m00[0] = state->scale[0];
   vpm->m11[0] = state->scale[1];
-- 
2.7.3
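
For readers without the tree at hand, the depth-range computation can be sketched as below. This is an assumption about what a helper of this kind does, written from the removed lines plus the patch title. It is not a copy of `util_viewport_zmin_zmax`, and the name `sketch_viewport_zrange` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: derive the viewport depth range from the Z
 * translate/scale pair. Ordering the two endpoints keeps zmin <= zmax
 * even when the scale is negative (an inverted depth range). */
static void sketch_viewport_zrange(float translate_z, float scale_z,
                                   bool clip_halfz,
                                   float *zmin, float *zmax)
{
    assert(zmin != NULL && zmax != NULL);
    float a = clip_halfz ? translate_z : translate_z - scale_z;
    float b = translate_z + scale_z;
    *zmin = a < b ? a : b;
    *zmax = a < b ? b : a;
}
```

With translate 0.5 and scale -0.5 the removed code would have produced minZ = 1 and maxZ = 0; ordering the endpoints yields 0 and 1 instead.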



[Mesa-dev] [PATCH] anv: Make anv_finishme only warn once per call-site

2016-11-08 Thread Jason Ekstrand
When you fire up Dota2 on Haswell you get spammed with thousands of
"Implement Gen7 HZ ops" finishme's.  The point of the finishme is as a
reminder that there is something left to implement.  Printing it once
should be sufficient.

Signed-off-by: Jason Ekstrand 
---
 src/intel/vulkan/anv_private.h | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 8f5a95b..c71a884 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -194,8 +194,13 @@ void anv_loge_v(const char *format, va_list va);
 /**
  * Print a FINISHME message, including its source location.
  */
-#define anv_finishme(format, ...) \
-   __anv_finishme(__FILE__, __LINE__, format, ##__VA_ARGS__);
+#define anv_finishme(format, ...) ({ \
+   static bool reported = false; \
+   if (!reported) { \
+  __anv_finishme(__FILE__, __LINE__, format, ##__VA_ARGS__); \
+  reported = true; \
+   } \
+})
 
 /* A non-fatal assert.  Useful for debugging. */
 #ifdef DEBUG
-- 
2.5.0.400.gff86faf
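
The once-per-call-site behaviour can be tried outside the driver with a small standalone sketch. All names here (`SKETCH_FINISHME`, `sketch_report`, `sketch_report_count`) are hypothetical, and the macro uses a portable do/while(0) wrapper in place of the GNU statement expression from the patch. Like the patch, it makes no attempt at thread safety.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static int sketch_report_count; /* number of messages actually printed */

static void sketch_report(const char *file, int line, const char *msg)
{
    assert(file != NULL && msg != NULL);
    fprintf(stderr, "%s:%d: FINISHME: %s\n", file, line, msg);
    sketch_report_count++;
}

/* Each expansion gets its own static flag, so every call-site
 * reports at most once no matter how often it is reached. */
#define SKETCH_FINISHME(msg) do {                     \
    static bool reported = false;                     \
    if (!reported) {                                  \
        sketch_report(__FILE__, __LINE__, (msg));     \
        reported = true;                              \
    }                                                 \
} while (0)

static void sketch_hot_path(void)
{
    SKETCH_FINISHME("Implement Gen7 HZ ops");
}

static void sketch_other_path(void)
{
    SKETCH_FINISHME("Something else");
}
```

Calling `sketch_hot_path()` a thousand times prints one line; a second call-site such as `sketch_other_path()` still gets its own single report.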



[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences

2016-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98606

Michel Dänzer  changed:

   What|Removed |Added

 Attachment #127855|text/x-log  |text/plain
  mime type||

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences

2016-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98606

--- Comment #9 from charlie  ---
Created attachment 127860
  --> https://bugs.freedesktop.org/attachment.cgi?id=127860&action=edit
Variables used to configure mesa.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences

2016-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98606

--- Comment #8 from charlie  ---
Created attachment 127859
  --> https://bugs.freedesktop.org/attachment.cgi?id=127859&action=edit
Default variables that get added to all components being built.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences

2016-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98606

--- Comment #7 from charlie  ---
Created attachment 127858
  --> https://bugs.freedesktop.org/attachment.cgi?id=127858&action=edit
environment variable used when compiling libva

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences

2016-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98606

--- Comment #6 from charlie  ---
Created attachment 127857
  --> https://bugs.freedesktop.org/attachment.cgi?id=127857&action=edit
libva.cfg

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences

2016-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98606

--- Comment #5 from charlie  ---
Created attachment 127856
  --> https://bugs.freedesktop.org/attachment.cgi?id=127856&action=edit
mesa build.log

Generated with "make V=1 -j${threads} 2>&1 | tee build.log"

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences

2016-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98606

--- Comment #4 from charlie  ---
Created attachment 127855
  --> https://bugs.freedesktop.org/attachment.cgi?id=127855&action=edit
mesa config.log

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences

2016-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98606

--- Comment #3 from charlie  ---
The "gcc/llvm/etc. combination" that mesa-va was last working at occurred
perhaps near the beginning of the year 2016.

"Atm if you build with --enable and then reconfigure/rebuild with --disable
things will break similar to your log."

I can try deleting all of x and llvm with no changes in my configure options
although I think I have done that before with no change in the mesa compile
error.  I'll try it again.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 06/25] anv: Rework the way render target surfaces are allocated

2016-11-08 Thread Jason Ekstrand
On Tue, Nov 8, 2016 at 5:16 PM, Nanley Chery  wrote:

> On Tue, Nov 08, 2016 at 05:02:29PM -0800, Jason Ekstrand wrote:
> > On Tue, Nov 8, 2016 at 5:00 PM, Nanley Chery 
> wrote:
> >
> > > On Tue, Nov 08, 2016 at 04:24:48PM -0800, Jason Ekstrand wrote:
> > > > On Tue, Nov 8, 2016 at 3:13 PM, Nanley Chery 
> > > wrote:
> > > >
> > > > > On Sat, Oct 22, 2016 at 10:50:37AM -0700, Jason Ekstrand wrote:
> > > > > > This commit moves the allocation and filling out of surface state from
> > > > > > CreateImageView time to BeginRenderPass time.  Instead of allocating the
> > > > > > render target surface state as part of the image view, we allocate it in
> > > > > > the command buffer state at the same time that we set up clears.  For
> > > > > > secondary command buffers, we allocate memory for the surface states in
> > > > > > BeginCommandBuffer but don't fill them out; instead, we use our new
> > > > > > SOL-based memcpy function to copy the surface states from the primary
> > > > > > command buffer.  This allows us to handle secondary command buffers without
> > > > > > the user specifying the framebuffer ahead-of-time.
> > > > > > ---
> > > > > >  src/intel/vulkan/anv_cmd_buffer.c  |  56 --
> > > > > >  src/intel/vulkan/anv_image.c   |  22 
> > > > > >  src/intel/vulkan/anv_private.h |  24 -
> > > > > >  src/intel/vulkan/genX_cmd_buffer.c | 204
> > > +-
> > > > > ---
> > > > > >  4 files changed, 180 insertions(+), 126 deletions(-)
> > > > > >
> > > > > > diff --git a/src/intel/vulkan/anv_cmd_buffer.c
> > > > > b/src/intel/vulkan/anv_cmd_buffer.c
> > > > > > index a652f9a..372030c 100644
> > > > > > --- a/src/intel/vulkan/anv_cmd_buffer.c
> > > > > > +++ b/src/intel/vulkan/anv_cmd_buffer.c
> > > > > > @@ -144,62 +144,6 @@ anv_cmd_state_reset(struct anv_cmd_buffer
> > > > > *cmd_buffer)
> > > > > > state->gen7.index_buffer = NULL;
> > > > > >  }
> > > > > >
> > > > > > -/**
> > > > > > - * Setup anv_cmd_state::attachments for vkCmdBeginRenderPass.
> > > > > > - */
> > > > > > -void
> > > > > > -anv_cmd_state_setup_attachments(struct anv_cmd_buffer
> *cmd_buffer,
> > > > > > -const VkRenderPassBeginInfo
> *info)
> > > > > > -{
> > > > > > -   struct anv_cmd_state *state = &cmd_buffer->state;
> > > > > > -   ANV_FROM_HANDLE(anv_render_pass, pass, info->renderPass);
> > > > > > -
> > > > > > -   vk_free(&cmd_buffer->pool->alloc, state->attachments);
> > > > > > -
> > > > > > -   if (pass->attachment_count == 0) {
> > > > > > -  state->attachments = NULL;
> > > > > > -  return;
> > > > > > -   }
> > > > > > -
> > > > > > -   state->attachments = vk_alloc(&cmd_buffer->pool->alloc,
> > > > > > -  pass->attachment_count *
> > > > > > -
> > >  sizeof(state->attachments[0]),
> > > > > > -  8, VK_SYSTEM_ALLOCATION_SCOPE_
> > > > > OBJECT);
> > > > > > -   if (state->attachments == NULL) {
> > > > > > -  /* FIXME: Propagate VK_ERROR_OUT_OF_HOST_MEMORY to
> > > > > vkEndCommandBuffer */
> > > > > > -  abort();
> > > > > > -   }
> > > > > > -
> > > > > > -   for (uint32_t i = 0; i < pass->attachment_count; ++i) {
> > > > > > -  struct anv_render_pass_attachment *att =
> > > &pass->attachments[i];
> > > > > > -  VkImageAspectFlags att_aspects =
> > > vk_format_aspects(att->format);
> > > > > > -  VkImageAspectFlags clear_aspects = 0;
> > > > > > -
> > > > > > -  if (att_aspects == VK_IMAGE_ASPECT_COLOR_BIT) {
> > > > > > - /* color attachment */
> > > > > > - if (att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
> > > > > > -clear_aspects |= VK_IMAGE_ASPECT_COLOR_BIT;
> > > > > > - }
> > > > > > -  } else {
> > > > > > - /* depthstencil attachment */
> > > > > > - if ((att_aspects & VK_IMAGE_ASPECT_DEPTH_BIT) &&
> > > > > > - att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
> > > > > > -clear_aspects |= VK_IMAGE_ASPECT_DEPTH_BIT;
> > > > > > - }
> > > > > > - if ((att_aspects & VK_IMAGE_ASPECT_STENCIL_BIT) &&
> > > > > > - att->stencil_load_op ==
> VK_ATTACHMENT_LOAD_OP_CLEAR) {
> > > > > > -clear_aspects |= VK_IMAGE_ASPECT_STENCIL_BIT;
> > > > > > - }
> > > > > > -  }
> > > > > > -
> > > > > > -  state->attachments[i].pending_clear_aspects =
> clear_aspects;
> > > > > > -  if (clear_aspects) {
> > > > > > - assert(info->clearValueCount > i);
> > > > > > - state->attachments[i].clear_value =
> info->pClearValues[i];
> > > > > > -  }
> > > > > > -   }
> > > > > > -}
> > > > > > -
> > > > > >  VkResult
> > > > > >  anv_cmd_buffer_ensure_push_constants_size(struct anv_cmd_buffer
> > > > > *cmd_buffer,
> > > > > >gl_shader_stage stage,
> > > > > uint32_t size)
> > > > > > diff --git a/src/intel/vulkan

[Mesa-dev] [PATCH] radv: fix GetFenceStatus for signaled fences

2016-11-08 Thread Dave Airlie
From: Dave Airlie 

if a fence is created pre-signaled we should return that
in GetFenceStatus even if it hasn't been submitted.

Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_device.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index fdb6db9..214af5f 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -1202,6 +1202,8 @@ VkResult radv_GetFenceStatus(VkDevice _device, VkFence 
_fence)
RADV_FROM_HANDLE(radv_device, device, _device);
RADV_FROM_HANDLE(radv_fence, fence, _fence);
 
+   if (fence->signalled)
+   return VK_SUCCESS;
if (!fence->submitted)
return VK_NOT_READY;
 
-- 
2.7.4
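
The ordering of the two checks is the whole fix, so a reduced model may help. The sketch below uses hypothetical stand-in types (`sketch_fence`, `sketch_result`) instead of the real `radv_fence` and `VkResult`, and it leaves out the kernel query that the driver performs for submitted fences.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for VK_SUCCESS / VK_NOT_READY. */
enum sketch_result { SKETCH_SUCCESS, SKETCH_NOT_READY };

/* Reduced model of a fence; the real radv_fence holds more state. */
struct sketch_fence {
    bool signalled; /* true if created pre-signaled or already observed done */
    bool submitted; /* true once attached to a queue submission */
};

static enum sketch_result
sketch_get_fence_status(const struct sketch_fence *fence)
{
    assert(fence != NULL);
    /* Check the signaled flag first: a pre-signaled fence has never
     * been submitted, yet must still report success. */
    if (fence->signalled)
        return SKETCH_SUCCESS;
    if (!fence->submitted)
        return SKETCH_NOT_READY;
    /* The real driver would ask the kernel here; the sketch just
     * reports that the work has not completed. */
    return SKETCH_NOT_READY;
}
```

Without the first check, a fence created in the signaled state would be reported as not ready until it had been submitted at least once.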



Re: [Mesa-dev] [PATCH 06/25] anv: Rework the way render target surfaces are allocated

2016-11-08 Thread Nanley Chery
On Tue, Nov 08, 2016 at 05:02:29PM -0800, Jason Ekstrand wrote:
> On Tue, Nov 8, 2016 at 5:00 PM, Nanley Chery  wrote:
> 
> > On Tue, Nov 08, 2016 at 04:24:48PM -0800, Jason Ekstrand wrote:
> > > On Tue, Nov 8, 2016 at 3:13 PM, Nanley Chery 
> > wrote:
> > >
> > > > On Sat, Oct 22, 2016 at 10:50:37AM -0700, Jason Ekstrand wrote:
> > > > > This commit moves the allocation and filling out of surface state from
> > > > > CreateImageView time to BeginRenderPass time.  Instead of allocating the
> > > > > render target surface state as part of the image view, we allocate it in
> > > > > the command buffer state at the same time that we set up clears.  For
> > > > > secondary command buffers, we allocate memory for the surface states in
> > > > > BeginCommandBuffer but don't fill them out; instead, we use our new
> > > > > SOL-based memcpy function to copy the surface states from the primary
> > > > > command buffer.  This allows us to handle secondary command buffers without
> > > > > the user specifying the framebuffer ahead-of-time.
> > > > > ---
> > > > >  src/intel/vulkan/anv_cmd_buffer.c  |  56 --
> > > > >  src/intel/vulkan/anv_image.c   |  22 
> > > > >  src/intel/vulkan/anv_private.h |  24 -
> > > > >  src/intel/vulkan/genX_cmd_buffer.c | 204
> > +-
> > > > ---
> > > > >  4 files changed, 180 insertions(+), 126 deletions(-)
> > > > >
> > > > > diff --git a/src/intel/vulkan/anv_cmd_buffer.c
> > > > b/src/intel/vulkan/anv_cmd_buffer.c
> > > > > index a652f9a..372030c 100644
> > > > > --- a/src/intel/vulkan/anv_cmd_buffer.c
> > > > > +++ b/src/intel/vulkan/anv_cmd_buffer.c
> > > > > @@ -144,62 +144,6 @@ anv_cmd_state_reset(struct anv_cmd_buffer
> > > > *cmd_buffer)
> > > > > state->gen7.index_buffer = NULL;
> > > > >  }
> > > > >
> > > > > -/**
> > > > > - * Setup anv_cmd_state::attachments for vkCmdBeginRenderPass.
> > > > > - */
> > > > > -void
> > > > > -anv_cmd_state_setup_attachments(struct anv_cmd_buffer *cmd_buffer,
> > > > > -const VkRenderPassBeginInfo *info)
> > > > > -{
> > > > > -   struct anv_cmd_state *state = &cmd_buffer->state;
> > > > > -   ANV_FROM_HANDLE(anv_render_pass, pass, info->renderPass);
> > > > > -
> > > > > -   vk_free(&cmd_buffer->pool->alloc, state->attachments);
> > > > > -
> > > > > -   if (pass->attachment_count == 0) {
> > > > > -  state->attachments = NULL;
> > > > > -  return;
> > > > > -   }
> > > > > -
> > > > > -   state->attachments = vk_alloc(&cmd_buffer->pool->alloc,
> > > > > -  pass->attachment_count *
> > > > > -
> >  sizeof(state->attachments[0]),
> > > > > -  8, VK_SYSTEM_ALLOCATION_SCOPE_
> > > > OBJECT);
> > > > > -   if (state->attachments == NULL) {
> > > > > -  /* FIXME: Propagate VK_ERROR_OUT_OF_HOST_MEMORY to
> > > > vkEndCommandBuffer */
> > > > > -  abort();
> > > > > -   }
> > > > > -
> > > > > -   for (uint32_t i = 0; i < pass->attachment_count; ++i) {
> > > > > -  struct anv_render_pass_attachment *att =
> > &pass->attachments[i];
> > > > > -  VkImageAspectFlags att_aspects =
> > vk_format_aspects(att->format);
> > > > > -  VkImageAspectFlags clear_aspects = 0;
> > > > > -
> > > > > -  if (att_aspects == VK_IMAGE_ASPECT_COLOR_BIT) {
> > > > > - /* color attachment */
> > > > > - if (att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
> > > > > -clear_aspects |= VK_IMAGE_ASPECT_COLOR_BIT;
> > > > > - }
> > > > > -  } else {
> > > > > - /* depthstencil attachment */
> > > > > - if ((att_aspects & VK_IMAGE_ASPECT_DEPTH_BIT) &&
> > > > > - att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
> > > > > -clear_aspects |= VK_IMAGE_ASPECT_DEPTH_BIT;
> > > > > - }
> > > > > - if ((att_aspects & VK_IMAGE_ASPECT_STENCIL_BIT) &&
> > > > > - att->stencil_load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
> > > > > -clear_aspects |= VK_IMAGE_ASPECT_STENCIL_BIT;
> > > > > - }
> > > > > -  }
> > > > > -
> > > > > -  state->attachments[i].pending_clear_aspects = clear_aspects;
> > > > > -  if (clear_aspects) {
> > > > > - assert(info->clearValueCount > i);
> > > > > - state->attachments[i].clear_value = info->pClearValues[i];
> > > > > -  }
> > > > > -   }
> > > > > -}
> > > > > -
> > > > >  VkResult
> > > > >  anv_cmd_buffer_ensure_push_constants_size(struct anv_cmd_buffer
> > > > *cmd_buffer,
> > > > >gl_shader_stage stage,
> > > > uint32_t size)
> > > > > diff --git a/src/intel/vulkan/anv_image.c
> > b/src/intel/vulkan/anv_image.c
> > > > > index b7c2e99..b014985 100644
> > > > > --- a/src/intel/vulkan/anv_image.c
> > > > > +++ b/src/intel/vulkan/anv_image.c
> > > > > @@ -504,23 +504,6 @@ anv_CreateImageView(VkDevice _device,
> > > > >iview->sampler_surface_state.al

Re: [Mesa-dev] [PATCH 06/25] anv: Rework the way render target surfaces are allocated

2016-11-08 Thread Jason Ekstrand
On Tue, Nov 8, 2016 at 5:00 PM, Nanley Chery  wrote:

> On Tue, Nov 08, 2016 at 04:24:48PM -0800, Jason Ekstrand wrote:
> > On Tue, Nov 8, 2016 at 3:13 PM, Nanley Chery 
> wrote:
> >
> > > On Sat, Oct 22, 2016 at 10:50:37AM -0700, Jason Ekstrand wrote:
> > > > This commit moves the allocation and filling out of surface state from
> > > > CreateImageView time to BeginRenderPass time.  Instead of allocating the
> > > > render target surface state as part of the image view, we allocate it in
> > > > the command buffer state at the same time that we set up clears.  For
> > > > secondary command buffers, we allocate memory for the surface states in
> > > > BeginCommandBuffer but don't fill them out; instead, we use our new
> > > > SOL-based memcpy function to copy the surface states from the primary
> > > > command buffer.  This allows us to handle secondary command buffers without
> > > > the user specifying the framebuffer ahead-of-time.
> > > > ---
> > > >  src/intel/vulkan/anv_cmd_buffer.c  |  56 --
> > > >  src/intel/vulkan/anv_image.c   |  22 
> > > >  src/intel/vulkan/anv_private.h |  24 -
> > > >  src/intel/vulkan/genX_cmd_buffer.c | 204
> +-
> > > ---
> > > >  4 files changed, 180 insertions(+), 126 deletions(-)
> > > >
> > > > diff --git a/src/intel/vulkan/anv_cmd_buffer.c
> > > b/src/intel/vulkan/anv_cmd_buffer.c
> > > > index a652f9a..372030c 100644
> > > > --- a/src/intel/vulkan/anv_cmd_buffer.c
> > > > +++ b/src/intel/vulkan/anv_cmd_buffer.c
> > > > @@ -144,62 +144,6 @@ anv_cmd_state_reset(struct anv_cmd_buffer
> > > *cmd_buffer)
> > > > state->gen7.index_buffer = NULL;
> > > >  }
> > > >
> > > > -/**
> > > > - * Setup anv_cmd_state::attachments for vkCmdBeginRenderPass.
> > > > - */
> > > > -void
> > > > -anv_cmd_state_setup_attachments(struct anv_cmd_buffer *cmd_buffer,
> > > > -const VkRenderPassBeginInfo *info)
> > > > -{
> > > > -   struct anv_cmd_state *state = &cmd_buffer->state;
> > > > -   ANV_FROM_HANDLE(anv_render_pass, pass, info->renderPass);
> > > > -
> > > > -   vk_free(&cmd_buffer->pool->alloc, state->attachments);
> > > > -
> > > > -   if (pass->attachment_count == 0) {
> > > > -  state->attachments = NULL;
> > > > -  return;
> > > > -   }
> > > > -
> > > > -   state->attachments = vk_alloc(&cmd_buffer->pool->alloc,
> > > > -  pass->attachment_count *
> > > > -
>  sizeof(state->attachments[0]),
> > > > -  8, VK_SYSTEM_ALLOCATION_SCOPE_
> > > OBJECT);
> > > > -   if (state->attachments == NULL) {
> > > > -  /* FIXME: Propagate VK_ERROR_OUT_OF_HOST_MEMORY to
> > > vkEndCommandBuffer */
> > > > -  abort();
> > > > -   }
> > > > -
> > > > -   for (uint32_t i = 0; i < pass->attachment_count; ++i) {
> > > > -  struct anv_render_pass_attachment *att =
> &pass->attachments[i];
> > > > -  VkImageAspectFlags att_aspects =
> vk_format_aspects(att->format);
> > > > -  VkImageAspectFlags clear_aspects = 0;
> > > > -
> > > > -  if (att_aspects == VK_IMAGE_ASPECT_COLOR_BIT) {
> > > > - /* color attachment */
> > > > - if (att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
> > > > -clear_aspects |= VK_IMAGE_ASPECT_COLOR_BIT;
> > > > - }
> > > > -  } else {
> > > > - /* depthstencil attachment */
> > > > - if ((att_aspects & VK_IMAGE_ASPECT_DEPTH_BIT) &&
> > > > - att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
> > > > -clear_aspects |= VK_IMAGE_ASPECT_DEPTH_BIT;
> > > > - }
> > > > - if ((att_aspects & VK_IMAGE_ASPECT_STENCIL_BIT) &&
> > > > - att->stencil_load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
> > > > -clear_aspects |= VK_IMAGE_ASPECT_STENCIL_BIT;
> > > > - }
> > > > -  }
> > > > -
> > > > -  state->attachments[i].pending_clear_aspects = clear_aspects;
> > > > -  if (clear_aspects) {
> > > > - assert(info->clearValueCount > i);
> > > > - state->attachments[i].clear_value = info->pClearValues[i];
> > > > -  }
> > > > -   }
> > > > -}
> > > > -
> > > >  VkResult
> > > >  anv_cmd_buffer_ensure_push_constants_size(struct anv_cmd_buffer
> > > *cmd_buffer,
> > > >gl_shader_stage stage,
> > > uint32_t size)
> > > > diff --git a/src/intel/vulkan/anv_image.c
> b/src/intel/vulkan/anv_image.c
> > > > index b7c2e99..b014985 100644
> > > > --- a/src/intel/vulkan/anv_image.c
> > > > +++ b/src/intel/vulkan/anv_image.c
> > > > @@ -504,23 +504,6 @@ anv_CreateImageView(VkDevice _device,
> > > >iview->sampler_surface_state.alloc_size = 0;
> > > > }
> > > >
> > > > -   if (image->usage & VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT) {
> > > > -  iview->color_rt_surface_state = alloc_surface_state(device);
> > > > -
> > > > -  struct isl_view view = iview->isl;
> > > > -  view.usage |= ISL_SURF_USAGE_REND

[Mesa-dev] [PATCH v2 1/3] vulkan/wsi: Add a thread-safe queue implementation

2016-11-08 Thread Jason Ekstrand
From: Kevin Strasser 

In order to support FIFO mode without blocking the application on calls
to vkQueuePresentKHR it is necessary to enqueue the request and defer
calling the server until the next vblank period. The xcb present api
doesn't offer a way to register a callback, so we will have to spawn a
worker thread that will wait for a request to be added to the queue, call
to the server, and then make the image available for reuse.  This commit
introduces the queue data structure needed to implement this.
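
For readers who want the idea without the u_vector dependency, here is a
minimal standalone sketch of a queue of this kind.  It is not the code from
this patch: the sketch_* names, the fixed capacity and the errno-style return
values are assumptions made for illustration.  It shows the two pieces the
description relies on, a mutex-protected FIFO and a condition variable bound
to CLOCK_MONOTONIC so a consumer can wait with a timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

#define SKETCH_QUEUE_CAP 8                 /* hypothetical fixed capacity */
#define SKETCH_NSEC_PER_SEC 1000000000ull

struct sketch_queue {
   uint32_t items[SKETCH_QUEUE_CAP];       /* ring buffer of image indices */
   unsigned head, count;
   pthread_mutex_t mutex;
   pthread_cond_t cond;
};

static int
sketch_queue_init(struct sketch_queue *q)
{
   pthread_condattr_t attr;
   int ret;

   q->head = q->count = 0;

   ret = pthread_condattr_init(&attr);
   if (ret)
      return ret;

   /* Timed waits are measured against the monotonic clock. */
   ret = pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
   if (ret == 0)
      ret = pthread_cond_init(&q->cond, &attr);
   pthread_condattr_destroy(&attr);
   if (ret)
      return ret;

   ret = pthread_mutex_init(&q->mutex, NULL);
   if (ret)
      pthread_cond_destroy(&q->cond);
   return ret;
}

/* Returns 0 on success, ENOSPC if the ring is full. */
static int
sketch_queue_push(struct sketch_queue *q, uint32_t index)
{
   int ret = 0;

   pthread_mutex_lock(&q->mutex);
   if (q->count == SKETCH_QUEUE_CAP) {
      ret = ENOSPC;
   } else {
      q->items[(q->head + q->count) % SKETCH_QUEUE_CAP] = index;
      q->count++;
      pthread_cond_signal(&q->cond);       /* wake one waiting consumer */
   }
   pthread_mutex_unlock(&q->mutex);
   return ret;
}

/* Returns 0 on success, ETIMEDOUT if nothing arrived within timeout_ns.
 * Note: no guard against tv_sec overflow for huge timeouts (sketch only). */
static int
sketch_queue_pull(struct sketch_queue *q, uint32_t *index, uint64_t timeout_ns)
{
   struct timespec deadline;
   int ret = 0;

   assert(index != NULL);

   /* Turn the relative timeout into an absolute monotonic deadline. */
   clock_gettime(CLOCK_MONOTONIC, &deadline);
   uint64_t nsec = (uint64_t)deadline.tv_nsec +
                   timeout_ns % SKETCH_NSEC_PER_SEC;
   deadline.tv_sec += (time_t)(timeout_ns / SKETCH_NSEC_PER_SEC +
                               nsec / SKETCH_NSEC_PER_SEC);
   deadline.tv_nsec = (long)(nsec % SKETCH_NSEC_PER_SEC);

   pthread_mutex_lock(&q->mutex);
   while (q->count == 0 && ret == 0)
      ret = pthread_cond_timedwait(&q->cond, &q->mutex, &deadline);
   if (q->count > 0) {
      *index = q->items[q->head];
      q->head = (q->head + 1) % SKETCH_QUEUE_CAP;
      q->count--;
      ret = 0;
   }
   pthread_mutex_unlock(&q->mutex);
   return ret;
}
```

A worker thread would loop on sketch_queue_pull() and the presenting thread
would call sketch_queue_push().  The wait sits in a loop because condition
variables may wake spuriously.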

Signed-off-by: Jason Ekstrand 
Reviewed-by: Eric Engestrom 
---
 src/vulkan/wsi/Makefile.sources   |   3 +-
 src/vulkan/wsi/wsi_common_queue.h | 154 ++
 2 files changed, 156 insertions(+), 1 deletion(-)
 create mode 100644 src/vulkan/wsi/wsi_common_queue.h

diff --git a/src/vulkan/wsi/Makefile.sources b/src/vulkan/wsi/Makefile.sources
index 3139e6d..50660f9 100644
--- a/src/vulkan/wsi/Makefile.sources
+++ b/src/vulkan/wsi/Makefile.sources
@@ -1,6 +1,7 @@
 
 VULKAN_WSI_FILES := \
-   wsi_common.h
+   wsi_common.h \
+   wsi_common_queue.h
 
 VULKAN_WSI_WAYLAND_FILES := \
wsi_common_wayland.c \
diff --git a/src/vulkan/wsi/wsi_common_queue.h b/src/vulkan/wsi/wsi_common_queue.h
new file mode 100644
index 0000000..0e72c8d
--- /dev/null
+++ b/src/vulkan/wsi/wsi_common_queue.h
@@ -0,0 +1,154 @@
+/*
+ * Copyright © 2016 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#ifndef VULKAN_WSI_COMMON_QUEUE_H
+#define VULKAN_WSI_COMMON_QUEUE_H
+
+#include <time.h>
+#include <pthread.h>
+#include "util/u_vector.h"
+
+struct wsi_queue {
+   struct u_vector vector;
+   pthread_mutex_t mutex;
+   pthread_cond_t cond;
+};
+
+static inline int
+wsi_queue_init(struct wsi_queue *queue, int length)
+{
+   int ret;
+
+   uint32_t length_pow2 = 4;
+   while (length_pow2 < length)
+  length_pow2 *= 2;
+
+   ret = u_vector_init(&queue->vector, sizeof(uint32_t),
+   sizeof(uint32_t) * length_pow2);
+   if (!ret)
+  return ENOMEM;
+
+   pthread_condattr_t condattr;
+   ret = pthread_condattr_init(&condattr);
+   if (ret)
+  goto fail_vector;
+
+   ret = pthread_condattr_setclock(&condattr, CLOCK_MONOTONIC);
+   if (ret)
+  goto fail_condattr;
+
+   ret = pthread_cond_init(&queue->cond, &condattr);
+   if (ret)
+  goto fail_condattr;
+
+   ret = pthread_mutex_init(&queue->mutex, NULL);
+   if (ret)
+  goto fail_cond;
+
+   return 0;
+
+fail_cond:
+   pthread_cond_destroy(&queue->cond);
+fail_condattr:
+   pthread_condattr_destroy(&condattr);
+fail_vector:
+   u_vector_finish(&queue->vector);
+
+   return ret;
+}
+
+static inline void
+wsi_queue_destroy(struct wsi_queue *queue)
+{
+   u_vector_finish(&queue->vector);
+   pthread_mutex_destroy(&queue->mutex);
+   pthread_cond_destroy(&queue->cond);
+}
+
+static inline void
+wsi_queue_push(struct wsi_queue *queue, uint32_t index)
+{
+   uint32_t *elem;
+
+   pthread_mutex_lock(&queue->mutex);
+
+   if (u_vector_length(&queue->vector) == 0)
+  pthread_cond_signal(&queue->cond);
+
+   elem = u_vector_add(&queue->vector);
+   *elem = index;
+
+   pthread_mutex_unlock(&queue->mutex);
+}
+
+#define NSEC_PER_SEC 1000000000
+#define INT_TYPE_MAX(type) ((1ull << (sizeof(type) * 8 - 1)) - 1)
+
+static inline VkResult
+wsi_queue_pull(struct wsi_queue *queue, uint32_t *index, uint64_t timeout)
+{
+   VkResult result;
+   int32_t ret;
+
+   pthread_mutex_lock(&queue->mutex);
+
+   struct timespec now;
+   clock_gettime(CLOCK_MONOTONIC, &now);
+
+   uint32_t abs_nsec = now.tv_nsec + timeout % NSEC_PER_SEC;
+   uint64_t abs_sec = now.tv_sec + (abs_nsec / NSEC_PER_SEC) +
+  (timeout / NSEC_PER_SEC);
+   abs_nsec %= NSEC_PER_SEC;
+
+   /* Avoid roll-over in tv_sec on 32-bit systems if the user provided timeout
+* is UINT64_MAX
+*/
+   struct timespec abstime;
+   abstime.tv_nsec = abs_nsec;
+   abstime.tv_

[Mesa-dev] [PATCH v2 3/3] vulkan/wsi/x11: Implement FIFO mode.

2016-11-08 Thread Jason Ekstrand
This implements VK_PRESENT_MODE_FIFO_KHR for X11.  Unfortunately, due to
the way the present extension works, we have to manage the queue of
presented images in a separate thread.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Eric Engestrom 
---
 src/vulkan/wsi/wsi_common_x11.c | 174 +---
 1 file changed, 164 insertions(+), 10 deletions(-)

diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c
index 4bc5ef3..208d8d4 100644
--- a/src/vulkan/wsi/wsi_common_x11.c
+++ b/src/vulkan/wsi/wsi_common_x11.c
@@ -39,6 +39,7 @@
 
 #include "wsi_common.h"
 #include "wsi_common_x11.h"
+#include "wsi_common_queue.h"
 
 #define typed_memcpy(dest, src, count) ({ \
static_assert(sizeof(*src) == sizeof(*dest), ""); \
@@ -145,6 +146,7 @@ static const VkSurfaceFormatKHR formats[] = {
 static const VkPresentModeKHR present_modes[] = {
VK_PRESENT_MODE_IMMEDIATE_KHR,
VK_PRESENT_MODE_MAILBOX_KHR,
+   VK_PRESENT_MODE_FIFO_KHR,
 };
 
 static xcb_screen_t *
@@ -490,8 +492,15 @@ struct x11_swapchain {
xcb_present_event_t  event_id;
xcb_special_event_t *special_event;
uint64_t send_sbc;
+   uint64_t last_present_msc;
uint32_t stamp;
 
+   bool threaded;
+   VkResult status;
+   struct wsi_queue present_queue;
+   struct wsi_queue acquire_queue;
+   pthread_t queue_manager;
+
struct x11_image images[0];
 };
 
@@ -536,6 +545,8 @@ x11_handle_dri3_present_event(struct x11_swapchain *chain,
   for (unsigned i = 0; i < chain->image_count; i++) {
  if (chain->images[i].pixmap == idle->pixmap) {
 chain->images[i].busy = false;
+if (chain->threaded)
+   wsi_queue_push(&chain->acquire_queue, i);
 break;
  }
   }
@@ -543,7 +554,13 @@ x11_handle_dri3_present_event(struct x11_swapchain *chain,
   break;
}
 
-   case XCB_PRESENT_COMPLETE_NOTIFY:
+   case XCB_PRESENT_EVENT_COMPLETE_NOTIFY: {
+  xcb_present_complete_notify_event_t *complete = (void *) event;
+  if (complete->kind == XCB_PRESENT_COMPLETE_KIND_PIXMAP)
+ chain->last_present_msc = complete->msc;
+  break;
+   }
+
default:
   break;
}
@@ -572,12 +589,9 @@ static uint64_t wsi_get_absolute_timeout(uint64_t timeout)
 }
 
 static VkResult
-x11_acquire_next_image(struct wsi_swapchain *anv_chain,
-   uint64_t timeout,
-   VkSemaphore semaphore,
-   uint32_t *image_index)
+x11_acquire_next_image_poll_x11(struct x11_swapchain *chain,
+uint32_t *image_index, uint64_t timeout)
 {
-   struct x11_swapchain *chain = (struct x11_swapchain *)anv_chain;
xcb_generic_event_t *event;
struct pollfd pfds;
uint64_t atimeout;
@@ -635,17 +649,38 @@ x11_acquire_next_image(struct wsi_swapchain *anv_chain,
 }
 
 static VkResult
-x11_queue_present(struct wsi_swapchain *anv_chain,
-  uint32_t image_index)
+x11_acquire_next_image_from_queue(struct x11_swapchain *chain,
+  uint32_t *image_index_out, uint64_t timeout)
+{
+   assert(chain->threaded);
+
+   uint32_t image_index;
+   VkResult result = wsi_queue_pull(&chain->acquire_queue,
+&image_index, timeout);
+   if (result != VK_SUCCESS) {
+  return result;
+   } else if (chain->status != VK_SUCCESS) {
+  return chain->status;
+   }
+
+   assert(image_index < chain->image_count);
+   xshmfence_await(chain->images[image_index].shm_fence);
+
+   *image_index_out = image_index;
+
+   return VK_SUCCESS;
+}
+
+static VkResult
+x11_present_to_x11(struct x11_swapchain *chain, uint32_t image_index,
+   uint32_t target_msc)
 {
-   struct x11_swapchain *chain = (struct x11_swapchain *)anv_chain;
struct x11_image *image = &chain->images[image_index];
 
assert(image_index < chain->image_count);
 
uint32_t options = XCB_PRESENT_OPTION_NONE;
 
-   int64_t target_msc = 0;
int64_t divisor = 0;
int64_t remainder = 0;
 
@@ -680,6 +715,82 @@ x11_queue_present(struct wsi_swapchain *anv_chain,
 }
 
 static VkResult
+x11_acquire_next_image(struct wsi_swapchain *anv_chain,
+   uint64_t timeout,
+   VkSemaphore semaphore,
+   uint32_t *image_index)
+{
+   struct x11_swapchain *chain = (struct x11_swapchain *)anv_chain;
+
+   if (chain->threaded) {
+  return x11_acquire_next_image_from_queue(chain, image_index, timeout);
+   } else {
+  return x11_acquire_next_image_poll_x11(chain, image_index, timeout);
+   }
+}
+
+static VkResult
+x11_queue_present(struc

[Mesa-dev] [PATCH v2 2/3] vulkan/wsi: Report the correct min/maxImageCount

2016-11-08 Thread Jason Ekstrand
From the Vulkan spec 1.0.32 section 29.6 docs for vkAcquireNextImageKHR:

   "Let n be the total number of images in the swapchain, m be the value of
   VkSurfaceCapabilitiesKHR::minImageCount, and a be the number of
   presentable images that the application has currently acquired (i.e.
   images acquired with vkAcquireNextImageKHR, but not yet presented with
   vkQueuePresentKHR).  vkAcquireNextImageKHR can always succeed if a ≤ n -
   m at the time vkAcquireNextImageKHR is called. vkAcquireNextImageKHR
   should not be called if a > n - m with a timeout of UINT64_MAX; in such
   a case, vkAcquireNextImageKHR may block indefinitely."

With minImageCount == 2 (as it was previously), the client is allowed to
acquire all but one image without blocking.  If we really need 4 images for
mailbox mode + pageflipping, then we need to request a minimum of 4 images
up-front.  This is a bit unfortunate because it means we will always
consume 4 images.  In the future, we may be able to optimize this a bit by
waiting until the server starts to flip and returning OUT_OF_DATE to get
the client to re-allocate with more images or something like that.
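
To make the arithmetic in the spec quote concrete, here is a small sketch of
the a <= n - m rule.  The function names are hypothetical and are not part of
this patch; the parameters n, m and a have the meanings given in the quoted
spec text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* n: total images in the swapchain
 * m: VkSurfaceCapabilitiesKHR::minImageCount
 * a: images currently acquired by the application
 * True when the spec says vkAcquireNextImageKHR can always succeed. */
static bool
acquire_guaranteed(uint32_t n, uint32_t m, uint32_t a)
{
   assert(n >= m);
   return a <= n - m;
}

/* Largest number of images the application can hold at once while every
 * acquire was still covered by the guarantee above. */
static uint32_t
max_acquirable_without_blocking(uint32_t n, uint32_t m)
{
   assert(n >= m);
   return n - m + 1;
}
```

With n == 4 and m == 2 the second helper returns 3, i.e. all but one image.
With n == 4 and m == 4 it returns 1, which leaves the other three images
available to the presentation engine.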

Signed-off-by: Jason Ekstrand 
---
 src/vulkan/wsi/wsi_common_wayland.c | 25 ++---
 src/vulkan/wsi/wsi_common_x11.c | 21 ++---
 2 files changed, 20 insertions(+), 26 deletions(-)

diff --git a/src/vulkan/wsi/wsi_common_wayland.c b/src/vulkan/wsi/wsi_common_wayland.c
index c6e138e..41c099f 100644
--- a/src/vulkan/wsi/wsi_common_wayland.c
+++ b/src/vulkan/wsi/wsi_common_wayland.c
@@ -41,8 +41,6 @@
memcpy((dest), (src), (count) * sizeof(*(src))); \
 })
 
-#define MIN_NUM_IMAGES 2
-
 struct wsi_wayland;
 
 struct wsi_wl_display {
@@ -366,8 +364,16 @@ static VkResult
 wsi_wl_surface_get_capabilities(VkIcdSurfaceBase *surface,
 VkSurfaceCapabilitiesKHR* caps)
 {
-   caps->minImageCount = MIN_NUM_IMAGES;
-   caps->maxImageCount = 4;
+   /* For true mailbox mode, we need at least 4 images:
+*  1) One to scan out from
+*  2) One to have queued for scan-out
+*  3) One to be currently held by the Wayland compositor
+*  4) One to render to
+*/
+   caps->minImageCount = 4;
+   /* There is no real maximum */
+   caps->maxImageCount = 0;
+
caps->currentExtent = (VkExtent2D) { -1, -1 };
caps->minImageExtent = (VkExtent2D) { 1, 1 };
caps->maxImageExtent = (VkExtent2D) { INT16_MAX, INT16_MAX };
@@ -685,17 +691,6 @@ wsi_wl_surface_create_swapchain(VkIcdSurfaceBase *icd_surface,
 
int num_images = pCreateInfo->minImageCount;
 
-   assert(num_images >= MIN_NUM_IMAGES);
-
-   /* For true mailbox mode, we need at least 4 images:
-*  1) One to scan out from
-*  2) One to have queued for scan-out
-*  3) One to be currently held by the Wayland compositor
-*  4) One to render to
-*/
-   if (pCreateInfo->presentMode == VK_PRESENT_MODE_MAILBOX_KHR)
-  num_images = MAX2(num_images, 4);
-
size_t size = sizeof(*chain) + num_images * sizeof(chain->images[0]);
chain = vk_alloc(pAllocator, size, 8,
   VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c
index 98f0923..4bc5ef3 100644
--- a/src/vulkan/wsi/wsi_common_x11.c
+++ b/src/vulkan/wsi/wsi_common_x11.c
@@ -373,8 +373,16 @@ x11_surface_get_capabilities(VkIcdSurfaceBase *icd_surface,
   VK_COMPOSITE_ALPHA_OPAQUE_BIT_KHR;
}
 
+   /* For true mailbox mode, we need at least 4 images:
+*  1) One to scan out from
+*  2) One to have queued for scan-out
+*  3) One to be currently held by the Wayland compositor
+*  4) One to render to
+*/
caps->minImageCount = 2;
-   caps->maxImageCount = 4;
+   /* There is no real maximum */
+   caps->maxImageCount = 0;
+
caps->supportedTransforms = VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR;
caps->currentTransform = VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR;
caps->maxImageArrayLayers = 1;
@@ -791,16 +799,7 @@ x11_surface_create_swapchain(VkIcdSurfaceBase *icd_surface,
 
assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_SWAPCHAIN_CREATE_INFO_KHR);
 
-   int num_images = pCreateInfo->minImageCount;
-
-   /* For true mailbox mode, we need at least 4 images:
-*  1) One to scan out from
-*  2) One to have queued for scan-out
-*  3) One to be currently held by the Wayland compositor
-*  4) One to render to
-*/
-   if (pCreateInfo->presentMode == VK_PRESENT_MODE_MAILBOX_KHR)
-  num_images = MAX2(num_images, 4);
+   const unsigned num_images = pCreateInfo->minImageCount;
 
size_t size = sizeof(*chain) + num_images * sizeof(chain->images[0]);
chain = vk_alloc(pAllocator, size, 8,
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 06/25] anv: Rework the way render target surfaces are allocated

2016-11-08 Thread Nanley Chery
On Tue, Nov 08, 2016 at 04:24:48PM -0800, Jason Ekstrand wrote:
> On Tue, Nov 8, 2016 at 3:13 PM, Nanley Chery  wrote:
> 
> > On Sat, Oct 22, 2016 at 10:50:37AM -0700, Jason Ekstrand wrote:
> > > This commit moves the allocation and filling out of surface state from
> > > CreateImageView time to BeginRenderPass time.  Instead of allocating the
> > > render target surface state as part of the image view, we allocate it in
> > > the command buffer state at the same time that we set up clears.  For
> > > secondary command buffers, we allocate memory for the surface states in
> > > BeginCommandBuffer but don't fill them out; instead, we use our new
> > > SOL-based memcpy function to copy the surface states from the primary
> > > command buffer.  This allows us to handle secondary command buffers
> > without
> > > the user specifying the framebuffer ahead-of-time.
> > > ---
> > >  src/intel/vulkan/anv_cmd_buffer.c  |  56 --
> > >  src/intel/vulkan/anv_image.c   |  22 
> > >  src/intel/vulkan/anv_private.h |  24 -
> > >  src/intel/vulkan/genX_cmd_buffer.c | 204 +-
> > ---
> > >  4 files changed, 180 insertions(+), 126 deletions(-)
> > >
> > > diff --git a/src/intel/vulkan/anv_cmd_buffer.c
> > b/src/intel/vulkan/anv_cmd_buffer.c
> > > index a652f9a..372030c 100644
> > > --- a/src/intel/vulkan/anv_cmd_buffer.c
> > > +++ b/src/intel/vulkan/anv_cmd_buffer.c
> > > @@ -144,62 +144,6 @@ anv_cmd_state_reset(struct anv_cmd_buffer
> > *cmd_buffer)
> > > state->gen7.index_buffer = NULL;
> > >  }
> > >
> > > -/**
> > > - * Setup anv_cmd_state::attachments for vkCmdBeginRenderPass.
> > > - */
> > > -void
> > > -anv_cmd_state_setup_attachments(struct anv_cmd_buffer *cmd_buffer,
> > > -const VkRenderPassBeginInfo *info)
> > > -{
> > > -   struct anv_cmd_state *state = &cmd_buffer->state;
> > > -   ANV_FROM_HANDLE(anv_render_pass, pass, info->renderPass);
> > > -
> > > -   vk_free(&cmd_buffer->pool->alloc, state->attachments);
> > > -
> > > -   if (pass->attachment_count == 0) {
> > > -  state->attachments = NULL;
> > > -  return;
> > > -   }
> > > -
> > > -   state->attachments = vk_alloc(&cmd_buffer->pool->alloc,
> > > -  pass->attachment_count *
> > > -   sizeof(state->attachments[0]),
> > > -  8, VK_SYSTEM_ALLOCATION_SCOPE_
> > OBJECT);
> > > -   if (state->attachments == NULL) {
> > > -  /* FIXME: Propagate VK_ERROR_OUT_OF_HOST_MEMORY to
> > vkEndCommandBuffer */
> > > -  abort();
> > > -   }
> > > -
> > > -   for (uint32_t i = 0; i < pass->attachment_count; ++i) {
> > > -  struct anv_render_pass_attachment *att = &pass->attachments[i];
> > > -  VkImageAspectFlags att_aspects = vk_format_aspects(att->format);
> > > -  VkImageAspectFlags clear_aspects = 0;
> > > -
> > > -  if (att_aspects == VK_IMAGE_ASPECT_COLOR_BIT) {
> > > - /* color attachment */
> > > - if (att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
> > > -clear_aspects |= VK_IMAGE_ASPECT_COLOR_BIT;
> > > - }
> > > -  } else {
> > > - /* depthstencil attachment */
> > > - if ((att_aspects & VK_IMAGE_ASPECT_DEPTH_BIT) &&
> > > - att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
> > > -clear_aspects |= VK_IMAGE_ASPECT_DEPTH_BIT;
> > > - }
> > > - if ((att_aspects & VK_IMAGE_ASPECT_STENCIL_BIT) &&
> > > - att->stencil_load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
> > > -clear_aspects |= VK_IMAGE_ASPECT_STENCIL_BIT;
> > > - }
> > > -  }
> > > -
> > > -  state->attachments[i].pending_clear_aspects = clear_aspects;
> > > -  if (clear_aspects) {
> > > - assert(info->clearValueCount > i);
> > > - state->attachments[i].clear_value = info->pClearValues[i];
> > > -  }
> > > -   }
> > > -}
> > > -
> > >  VkResult
> > >  anv_cmd_buffer_ensure_push_constants_size(struct anv_cmd_buffer
> > *cmd_buffer,
> > >gl_shader_stage stage,
> > uint32_t size)
> > > diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
> > > index b7c2e99..b014985 100644
> > > --- a/src/intel/vulkan/anv_image.c
> > > +++ b/src/intel/vulkan/anv_image.c
> > > @@ -504,23 +504,6 @@ anv_CreateImageView(VkDevice _device,
> > >iview->sampler_surface_state.alloc_size = 0;
> > > }
> > >
> > > -   if (image->usage & VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT) {
> > > -  iview->color_rt_surface_state = alloc_surface_state(device);
> > > -
> > > -  struct isl_view view = iview->isl;
> > > -  view.usage |= ISL_SURF_USAGE_RENDER_TARGET_BIT;
> > > -  isl_surf_fill_state(&device->isl_dev,
> > > -  iview->color_rt_surface_state.map,
> > > -  .surf = &surface->isl,
> > > -  .view = &view,
> > > -   

Re: [Mesa-dev] [PATCH 06/25] anv: Rework the way render target surfaces are allocated

2016-11-08 Thread Jason Ekstrand
On Tue, Nov 8, 2016 at 3:13 PM, Nanley Chery  wrote:

> On Sat, Oct 22, 2016 at 10:50:37AM -0700, Jason Ekstrand wrote:
> > This commit moves the allocation and filling out of surface state from
> > CreateImageView time to BeginRenderPass time.  Instead of allocating the
> > render target surface state as part of the image view, we allocate it in
> > the command buffer state at the same time that we set up clears.  For
> > secondary command buffers, we allocate memory for the surface states in
> > BeginCommandBuffer but don't fill them out; instead, we use our new
> > SOL-based memcpy function to copy the surface states from the primary
> > command buffer.  This allows us to handle secondary command buffers
> without
> > the user specifying the framebuffer ahead-of-time.
> > ---
> >  src/intel/vulkan/anv_cmd_buffer.c  |  56 --
> >  src/intel/vulkan/anv_image.c   |  22 
> >  src/intel/vulkan/anv_private.h |  24 -
> >  src/intel/vulkan/genX_cmd_buffer.c | 204 +-
> ---
> >  4 files changed, 180 insertions(+), 126 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_cmd_buffer.c
> b/src/intel/vulkan/anv_cmd_buffer.c
> > index a652f9a..372030c 100644
> > --- a/src/intel/vulkan/anv_cmd_buffer.c
> > +++ b/src/intel/vulkan/anv_cmd_buffer.c
> > @@ -144,62 +144,6 @@ anv_cmd_state_reset(struct anv_cmd_buffer
> *cmd_buffer)
> > state->gen7.index_buffer = NULL;
> >  }
> >
> > -/**
> > - * Setup anv_cmd_state::attachments for vkCmdBeginRenderPass.
> > - */
> > -void
> > -anv_cmd_state_setup_attachments(struct anv_cmd_buffer *cmd_buffer,
> > -const VkRenderPassBeginInfo *info)
> > -{
> > -   struct anv_cmd_state *state = &cmd_buffer->state;
> > -   ANV_FROM_HANDLE(anv_render_pass, pass, info->renderPass);
> > -
> > -   vk_free(&cmd_buffer->pool->alloc, state->attachments);
> > -
> > -   if (pass->attachment_count == 0) {
> > -  state->attachments = NULL;
> > -  return;
> > -   }
> > -
> > -   state->attachments = vk_alloc(&cmd_buffer->pool->alloc,
> > -  pass->attachment_count *
> > -   sizeof(state->attachments[0]),
> > -  8, VK_SYSTEM_ALLOCATION_SCOPE_
> OBJECT);
> > -   if (state->attachments == NULL) {
> > -  /* FIXME: Propagate VK_ERROR_OUT_OF_HOST_MEMORY to
> vkEndCommandBuffer */
> > -  abort();
> > -   }
> > -
> > -   for (uint32_t i = 0; i < pass->attachment_count; ++i) {
> > -  struct anv_render_pass_attachment *att = &pass->attachments[i];
> > -  VkImageAspectFlags att_aspects = vk_format_aspects(att->format);
> > -  VkImageAspectFlags clear_aspects = 0;
> > -
> > -  if (att_aspects == VK_IMAGE_ASPECT_COLOR_BIT) {
> > - /* color attachment */
> > - if (att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
> > -clear_aspects |= VK_IMAGE_ASPECT_COLOR_BIT;
> > - }
> > -  } else {
> > - /* depthstencil attachment */
> > - if ((att_aspects & VK_IMAGE_ASPECT_DEPTH_BIT) &&
> > - att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
> > -clear_aspects |= VK_IMAGE_ASPECT_DEPTH_BIT;
> > - }
> > - if ((att_aspects & VK_IMAGE_ASPECT_STENCIL_BIT) &&
> > - att->stencil_load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
> > -clear_aspects |= VK_IMAGE_ASPECT_STENCIL_BIT;
> > - }
> > -  }
> > -
> > -  state->attachments[i].pending_clear_aspects = clear_aspects;
> > -  if (clear_aspects) {
> > - assert(info->clearValueCount > i);
> > - state->attachments[i].clear_value = info->pClearValues[i];
> > -  }
> > -   }
> > -}
> > -
> >  VkResult
> >  anv_cmd_buffer_ensure_push_constants_size(struct anv_cmd_buffer
> *cmd_buffer,
> >gl_shader_stage stage,
> uint32_t size)
> > diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
> > index b7c2e99..b014985 100644
> > --- a/src/intel/vulkan/anv_image.c
> > +++ b/src/intel/vulkan/anv_image.c
> > @@ -504,23 +504,6 @@ anv_CreateImageView(VkDevice _device,
> >iview->sampler_surface_state.alloc_size = 0;
> > }
> >
> > -   if (image->usage & VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT) {
> > -  iview->color_rt_surface_state = alloc_surface_state(device);
> > -
> > -  struct isl_view view = iview->isl;
> > -  view.usage |= ISL_SURF_USAGE_RENDER_TARGET_BIT;
> > -  isl_surf_fill_state(&device->isl_dev,
> > -  iview->color_rt_surface_state.map,
> > -  .surf = &surface->isl,
> > -  .view = &view,
> > -  .mocs = device->default_mocs);
> > -
> > -  if (!device->info.has_llc)
> > - anv_state_clflush(iview->color_rt_surface_state);
> > -   } else {
> > -  iview->color_rt_surface_state.alloc_size = 0;
> > -   }
> > -
> > /* NOTE: This one needs to go la

[Mesa-dev] [Bug 98629] OpenGL applications warns "MESA-LOADER: failed to retrieve device information"

2016-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98629

--- Comment #3 from Mingcong Bai  ---
(In reply to Emil Velikov from comment #1)
> [Moving to 'core' since it's not really nouveau specific]
> 
> Does this happen with glxinfo/glxgears as well ? If so can you attach the
> output of $strace glxinfo
> 
> If glxinfo works fine, while $program does not, attach the output of
> $DL_DEBUG=libs $program
> 
> Thanks

glxinfo and glxgears... and basically everything provided by mesa-demos have
the same issue.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 98629] OpenGL applications warns "MESA-LOADER: failed to retrieve device information"

2016-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98629

--- Comment #2 from Mingcong Bai  ---
Created attachment 127852
  --> https://bugs.freedesktop.org/attachment.cgi?id=127852&action=edit
strace of glxinfo

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 2/2] i965/compiler: Disable trig workarounds on KBL+

2016-11-08 Thread Anuj Phogat
On Tue, Nov 8, 2016 at 1:21 PM, Jason Ekstrand  wrote:

> The precision of our trig instructions instructions appears to have been
>
s/instructions instructions/instructions

> fixed on Kaby Lake.  Neither Ben nor I can find any documentation for this.
> However, the dEQP precision tests now pass with INTEL_PRECISE_TRIG=0 where
> they fail on Sky Lake.
>
> Signed-off-by: Jason Ekstrand 
> ---
>  src/mesa/drivers/dri/i965/brw_nir.c   | 5 -
>  src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py | 7 ---
>  2 files changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c
> b/src/mesa/drivers/dri/i965/brw_nir.c
> index a93d825..1069438 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> @@ -449,6 +449,7 @@ nir_optimize(nir_shader *nir, bool is_scalar)
>  nir_shader *
>  brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir)
>  {
> +   const struct gen_device_info *devinfo = compiler->devinfo;
> bool progress; /* Written by OPT and OPT_V */
> (void)progress;
>
> @@ -457,7 +458,9 @@ brw_preprocess_nir(const struct brw_compiler
> *compiler, nir_shader *nir)
> if (nir->stage == MESA_SHADER_GEOMETRY)
>OPT(nir_lower_gs_intrinsics);
>
> -   if (compiler->precise_trig)
> +   /* See also brw_nir_trig_workarounds.py */
> +   if (compiler->precise_trig &&
> +   !(devinfo->gen >= 10 || devinfo->is_kabylake))
>OPT(brw_nir_apply_trig_workarounds);
>
> static const nir_lower_tex_options tex_options = {
> diff --git a/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py
> b/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py
> index 67dab9a..3b8d0ce 100755
> --- a/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py
> +++ b/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py
> @@ -23,9 +23,10 @@
>
>  import nir_algebraic
>
> -# The SIN and COS instructions on Intel hardware can produce values
> -# slightly outside of the [-1.0, 1.0] range for a small set of values.
> -# Obviously, this can break everyone's expectations about trig functions.
> +# Prior to Kaby Lake, The SIN and COS instructions on Intel hardware can
> +# produce values slightly outside of the [-1.0, 1.0] range for a small
> set of
> +# values.  Obviously, this can break everyone's expectations about trig
> +# functions.  This appears to be fixed in Kaby Lake.
>  #
>  # According to an internal presentation, the COS instruction can produce
>  # a value up to 1.27 for inputs in the range (0.08296, 0.09888).  One
> --
> 2.5.0.400.gff86faf
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
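
For reference, the gating condition in the brw_nir.c hunk can be read as the
standalone predicate below.  This is only a sketch: struct sketch_devinfo and
the function name are hypothetical stand-ins, with the field names gen and
is_kabylake taken from the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two gen_device_info fields the patch reads. */
struct sketch_devinfo {
   int gen;
   bool is_kabylake;
};

/* Mirrors the condition in the quoted hunk: apply the trig workarounds only
 * when precise trig is requested and the hardware predates the apparent fix
 * (i.e. it is neither Kaby Lake nor gen10 or later). */
static bool
needs_trig_workaround(bool precise_trig, const struct sketch_devinfo *devinfo)
{
   assert(devinfo != NULL);
   return precise_trig && !(devinfo->gen >= 10 || devinfo->is_kabylake);
}
```

Sky Lake and Kaby Lake are both gen9, which is why the gen check alone is not
enough and the is_kabylake flag is needed.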


Re: [Mesa-dev] [PATCH 06/25] anv: Rework the way render target surfaces are allocated

2016-11-08 Thread Nanley Chery
On Sat, Oct 22, 2016 at 10:50:37AM -0700, Jason Ekstrand wrote:
> This commit moves the allocation and filling out of surface state from
> CreateImageView time to BeginRenderPass time.  Instead of allocating the
> render target surface state as part of the image view, we allocate it in
> the command buffer state at the same time that we set up clears.  For
> secondary command buffers, we allocate memory for the surface states in
> BeginCommandBuffer but don't fill them out; instead, we use our new
> SOL-based memcpy function to copy the surface states from the primary
> command buffer.  This allows us to handle secondary command buffers without
> the user specifying the framebuffer ahead-of-time.
> ---
>  src/intel/vulkan/anv_cmd_buffer.c  |  56 --
>  src/intel/vulkan/anv_image.c   |  22 
>  src/intel/vulkan/anv_private.h |  24 -
>  src/intel/vulkan/genX_cmd_buffer.c | 204 
> +
>  4 files changed, 180 insertions(+), 126 deletions(-)
> 
> diff --git a/src/intel/vulkan/anv_cmd_buffer.c 
> b/src/intel/vulkan/anv_cmd_buffer.c
> index a652f9a..372030c 100644
> --- a/src/intel/vulkan/anv_cmd_buffer.c
> +++ b/src/intel/vulkan/anv_cmd_buffer.c
> @@ -144,62 +144,6 @@ anv_cmd_state_reset(struct anv_cmd_buffer *cmd_buffer)
> state->gen7.index_buffer = NULL;
>  }
>  
> -/**
> - * Setup anv_cmd_state::attachments for vkCmdBeginRenderPass.
> - */
> -void
> -anv_cmd_state_setup_attachments(struct anv_cmd_buffer *cmd_buffer,
> -const VkRenderPassBeginInfo *info)
> -{
> -   struct anv_cmd_state *state = &cmd_buffer->state;
> -   ANV_FROM_HANDLE(anv_render_pass, pass, info->renderPass);
> -
> -   vk_free(&cmd_buffer->pool->alloc, state->attachments);
> -
> -   if (pass->attachment_count == 0) {
> -  state->attachments = NULL;
> -  return;
> -   }
> -
> -   state->attachments = vk_alloc(&cmd_buffer->pool->alloc,
> -  pass->attachment_count *
> -   sizeof(state->attachments[0]),
> -  8, VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
> -   if (state->attachments == NULL) {
> -  /* FIXME: Propagate VK_ERROR_OUT_OF_HOST_MEMORY to vkEndCommandBuffer 
> */
> -  abort();
> -   }
> -
> -   for (uint32_t i = 0; i < pass->attachment_count; ++i) {
> -  struct anv_render_pass_attachment *att = &pass->attachments[i];
> -  VkImageAspectFlags att_aspects = vk_format_aspects(att->format);
> -  VkImageAspectFlags clear_aspects = 0;
> -
> -  if (att_aspects == VK_IMAGE_ASPECT_COLOR_BIT) {
> - /* color attachment */
> - if (att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
> -clear_aspects |= VK_IMAGE_ASPECT_COLOR_BIT;
> - }
> -  } else {
> - /* depthstencil attachment */
> - if ((att_aspects & VK_IMAGE_ASPECT_DEPTH_BIT) &&
> - att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
> -clear_aspects |= VK_IMAGE_ASPECT_DEPTH_BIT;
> - }
> - if ((att_aspects & VK_IMAGE_ASPECT_STENCIL_BIT) &&
> - att->stencil_load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) {
> -clear_aspects |= VK_IMAGE_ASPECT_STENCIL_BIT;
> - }
> -  }
> -
> -  state->attachments[i].pending_clear_aspects = clear_aspects;
> -  if (clear_aspects) {
> - assert(info->clearValueCount > i);
> - state->attachments[i].clear_value = info->pClearValues[i];
> -  }
> -   }
> -}
> -
>  VkResult
>  anv_cmd_buffer_ensure_push_constants_size(struct anv_cmd_buffer *cmd_buffer,
>gl_shader_stage stage, uint32_t 
> size)
> diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c
> index b7c2e99..b014985 100644
> --- a/src/intel/vulkan/anv_image.c
> +++ b/src/intel/vulkan/anv_image.c
> @@ -504,23 +504,6 @@ anv_CreateImageView(VkDevice _device,
>iview->sampler_surface_state.alloc_size = 0;
> }
>  
> -   if (image->usage & VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT) {
> -  iview->color_rt_surface_state = alloc_surface_state(device);
> -
> -  struct isl_view view = iview->isl;
> -  view.usage |= ISL_SURF_USAGE_RENDER_TARGET_BIT;
> -  isl_surf_fill_state(&device->isl_dev,
> -  iview->color_rt_surface_state.map,
> -  .surf = &surface->isl,
> -  .view = &view,
> -  .mocs = device->default_mocs);
> -
> -  if (!device->info.has_llc)
> - anv_state_clflush(iview->color_rt_surface_state);
> -   } else {
> -  iview->color_rt_surface_state.alloc_size = 0;
> -   }
> -
> /* NOTE: This one needs to go last since it may stomp isl_view.format */
> if (image->usage & VK_IMAGE_USAGE_STORAGE_BIT) {
>iview->storage_surface_state = alloc_surface_state(device);
> @@ -565,11 +548,6 @@ anv_DestroyImageView(VkDevice _device, VkImageView 
> _iview,
> ANV_FROM_HANDLE(anv

Re: [Mesa-dev] [PATCH 00/25] anv: A major rework of color attachment surface states

2016-11-08 Thread Jason Ekstrand
On Tue, Nov 8, 2016 at 2:26 PM, Nanley Chery  wrote:

> On Tue, Nov 08, 2016 at 01:52:15PM -0800, Jason Ekstrand wrote:
> > On Tue, Nov 8, 2016 at 1:36 PM, Nanley Chery 
> wrote:
> >
> > > On Sat, Oct 22, 2016 at 10:50:31AM -0700, Jason Ekstrand wrote:
> > > > This series does some fairly major surgery on color attachment
> surface
> > > > state allocation and fill-out in the Intel Vulkan driver.  This is in
> > > > preparation for doing color compression, fast-clears, and HiZ-capable
> > > input
> > > > attachments.  Naturally, as with everything else I've done in the
> last 2
> > > > months, it also involves some non-trivial blorp work.
> > > >
> > > > Let's start off at the beginning...  For a variety of reasons, we
> can't
> > > > really know 100% of the details of an attachment's surface state at
> any
> > > > other places than vkCmdBeginRenderPass and vkCmdNextSubpass.  The same
> > > > applies for depth buffers if you consider 3DSTATE_DEPTH_BUFFER and
> > > friends
> > > > to be the depth and stencil buffer's "surface state".  That's a
> fairly
> > > > strong statement, but there are a couple of reasons for this:
> > > >
> > > >  1) In order for fast-clears to work, the surface state has to
> contain
> > > the
> > > > clear color.  (This is its own packet for HiZ but not for
> color.)
> > > We
> > > > don't know the clear value until BeginRenderPass.  This means we
> > > can't
> > > > fully fill out the surface state in vkCreateImageView.
> > > >
> > >
> > > We could alternatively merge the view's surface state packet into
> > > another that only contains the clear color(s) right?
> > >
> >
> > Potentially, yes.  However that adds a good bit of complication because
> we
> > now have to emit render target surfaces on-the-fly because you may be
> > building two different batches simultaneously that use the same
> VkImageView
> > as a render target with two different clear colors.  It also doesn't
> solve
> > the null framebuffer problem.
> >
>
> I'm not suggesting that this optimization solves the null framebuffer
> problem, nor that we could add the clear color to the VkImageView's
> surface state. I'm trying to confirm that we could allocate the block
> of states (as is done in this series), then assign a block entry the
> VkImageView's surface state + a surface state struct that only
> contains the clear colors.
>

Yes, that might work and would let us keep the isl_surf_fill_state call in
anv_image.c.  We would also have to handle at least the AuxUsage field in
the OR-in.  I think we could set up the other aux buffer information in
isl_surf_fill_state, and the hardware *should* ignore it if we set AuxUsage
to AUX_USAGE_NONE.
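
To make the OR-in idea concrete, here is a minimal stand-alone sketch in C.
It is not anv code: combine_surface_state, SURFACE_STATE_DWORDS, and the
16-dword size are assumptions made up for illustration. The point is only
that a block entry can be built from the view's pre-filled state plus an
overlay that carries nothing but the per-pass fields (clear color, AuxUsage).

```c
#include <stddef.h>
#include <stdint.h>

#define SURFACE_STATE_DWORDS 16   /* assumed size, for illustration only */

/*
 * Build one entry of the per-render-pass block by OR-ing the image view's
 * pre-filled state with an overlay that carries only the per-pass fields.
 * This is only correct if every field set in the overlay is left zero in
 * view_state.
 */
static void
combine_surface_state(uint32_t *entry,
                      const uint32_t *view_state,
                      const uint32_t *overlay)
{
   for (size_t i = 0; i < SURFACE_STATE_DWORDS; i++)
      entry[i] = view_state[i] | overlay[i];
}
```

The constraint in the comment is the catch: the view's state would have to
leave the clear color and AuxUsage bits zero for the OR to be meaningful.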


> >
> > > - Nanley
> > >
> > > >  2) The Vulkan spec requires that you be able to call
> > > vkBeginCommandBuffer
> > > > on a secondary command buffer with
> USAGE_RENDER_PASS_CONTINUE_BIT set
> > > > but with a null framebuffer.  In this case, the secondary is
> supposed
> > > > to inherit the framebuffer from the primary.  (This is not
> something
> > > we
> > > > have properly implemented until now.)  This means that anything
> that
> > > is
> > > > callable from a render-pass-continuing secondary command buffer
> has
> > > to
> > > > be able to operate without knowing any surface details that
> aren't
> > > part
> > > > of the VkRenderPass object.  Basically, all you know is the
> Vulkan
> > > > format (not the isl format) and the sample count.
> > > >
> > > > Between the two of those, about the only two entrypoints left at
> which we
> > > > actually know surface details are vkCmdBeginRenderPass and
> > > vkCmdNextSubpass
> > > > so we have to figure out how to do everything there.  As it turns
> out,
> > > this
> > > > works out surprisingly well.  The format and the sample count turn
> out to
> > > > be exactly the data we actually need in order to do all of our
> pipeline
> > > > programming.  The only hard part is refactoring things so that it
> pulls
> > > the
> > > > data from the render pass instead of the framebuffer.  There are a
> number
> > > > of places where we were grabbing the image view for an attachment
> because
> > > > we either wanted to shove something into blorp or because we wanted
> the
> > > > format and we were lazy.
> > > >
> > > > The approach taken in this patch series is the following:
> > > >
> > > >  1) Instead of allocating render target surface states in
> > > vkCreateImageView,
> > > > we allocate them as part of render pass setup in
> > > vkCmdBeginRenderPass.
> > > > All of the surface states we will ever need (including a null
> surface
> > > > state) are allocated up-front out of a single contiguous block.
> > > >
> > > >  2) For secondary command buffers with USAGE_RENDER_PASS_CONTINUE_BIT
> > > set,
> > > > we allocate storage for all of the surface states but don't
> actually
> > > > fill them out.  In the secondary command buffer, all binding
> tables
> >

Re: [Mesa-dev] [PATCH 1/3] gallium/scons: OpenSWR Windows support

2016-11-08 Thread Kyriazis, George


> -Original Message-
> From: Jose Fonseca [mailto:jfons...@vmware.com]
> Sent: Tuesday, November 8, 2016 4:17 PM
> To: Kyriazis, George ; mesa-
> d...@lists.freedesktop.org
> Subject: Re: [Mesa-dev] [PATCH 1/3] gallium/scons: OpenSWR Windows
> support
> 
> On 07/11/16 22:32, George Kyriazis wrote:
> > - Added code to create screen and handle swaps in libgl_gdi.c
> > - Added call to swr SConscript
> > - included llvm 3.9 support for scons (windows swr only supports 3.9 and
> >   later)
> > - include -DHAVE_SWR to subdirs that need it
> >
> > To build SWR on Windows, use "scons swr libgl-gdi"
> > ---
> >  scons/llvm.py | 21 +++--
> >  src/gallium/SConscript|  1 +
> >  src/gallium/targets/libgl-gdi/SConscript  |  4 
> >  src/gallium/targets/libgl-gdi/libgl_gdi.c | 28 +++-
> >  src/gallium/targets/libgl-xlib/SConscript |  4 
> >  src/gallium/targets/osmesa/SConscript |  4 
> >  6 files changed, 55 insertions(+), 7 deletions(-)
> >
> > diff --git a/scons/llvm.py b/scons/llvm.py
> > index 1fc8a3f..977e47a 100644
> > --- a/scons/llvm.py
> > +++ b/scons/llvm.py
> > @@ -106,7 +106,24 @@ def generate(env):
> >  ])
> >  env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')])
> >  # LIBS should match the output of `llvm-config --libs engine mcjit
> bitwriter x86asmprinter`
> > -if llvm_version >= distutils.version.LooseVersion('3.7'):
> > +if llvm_version >= distutils.version.LooseVersion('3.9'):
> > +env.Prepend(LIBS = [
> > +'LLVMX86Disassembler', 'LLVMX86AsmParser',
> > +'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
> > +'LLVMDebugInfoCodeView', 'LLVMCodeGen',
> > +'LLVMScalarOpts', 'LLVMInstCombine',
> > +'LLVMInstrumentation', 'LLVMTransformUtils',
> > +'LLVMBitWriter', 'LLVMX86Desc',
> > +'LLVMMCDisassembler', 'LLVMX86Info',
> > +'LLVMX86AsmPrinter', 'LLVMX86Utils',
> > +'LLVMMCJIT', 'LLVMExecutionEngine', 'LLVMTarget',
> > +'LLVMAnalysis', 'LLVMProfileData',
> > +'LLVMRuntimeDyld', 'LLVMObject', 'LLVMMCParser',
> > +'LLVMBitReader', 'LLVMMC', 'LLVMCore',
> > +'LLVMSupport',
> > +'LLVMIRReader', 'LLVMASMParser'
> > +])
> > +elif llvm_version >= distutils.version.LooseVersion('3.7'):
> >  env.Prepend(LIBS = [
> >  'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',
> >  'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
> > @@ -203,7 +220,7 @@ def generate(env):
> >  if '-fno-rtti' in cxxflags:
> >  env.Append(CXXFLAGS = ['-fno-rtti'])
> >
> > -components = ['engine', 'mcjit', 'bitwriter', 'x86asmprinter',
> 'mcdisassembler']
> > +components = ['engine', 'mcjit', 'bitwriter',
> > + 'x86asmprinter', 'mcdisassembler', 'irreader']
> >
> >  env.ParseConfig('llvm-config --libs ' + ' '.join(components))
> >  env.ParseConfig('llvm-config --ldflags')
> > diff --git a/src/gallium/SConscript b/src/gallium/SConscript
> > index f98268f..9273db7 100644
> > --- a/src/gallium/SConscript
> > +++ b/src/gallium/SConscript
> > @@ -18,6 +18,7 @@ SConscript([
> >  'drivers/softpipe/SConscript',
> >  'drivers/svga/SConscript',
> >  'drivers/trace/SConscript',
> > +'drivers/swr/SConscript',
> >  ])
> >
> >  #
> > diff --git a/src/gallium/targets/libgl-gdi/SConscript
> > b/src/gallium/targets/libgl-gdi/SConscript
> > index 2a52363..ef8050b 100644
> > --- a/src/gallium/targets/libgl-gdi/SConscript
> > +++ b/src/gallium/targets/libgl-gdi/SConscript
> > @@ -30,6 +30,10 @@ if env['llvm']:
> >  env.Append(CPPDEFINES = 'HAVE_LLVMPIPE')
> >  drivers += [llvmpipe]
> >
> > +if 'swr' in COMMAND_LINE_TARGETS :
> > +env.Append(CPPDEFINES = 'HAVE_SWR')
> > +drivers += [swr]
> > +
> >  if env['gcc'] and env['machine'] != 'x86_64':
> >  # DEF parser in certain versions of MinGW is busted, as does not behave
> as
> >  # MSVC.  mingw-w64 works fine.
> > diff --git a/src/gallium/targets/libgl-gdi/libgl_gdi.c
> > b/src/gallium/targets/libgl-gdi/libgl_gdi.c
> > index 922c186..12576db 100644
> > --- a/src/gallium/targets/libgl-gdi/libgl_gdi.c
> > +++ b/src/gallium/targets/libgl-gdi/libgl_gdi.c
> > @@ -51,9 +51,12 @@
> >  #include "llvmpipe/lp_public.h"
> >  #endif
> >
> > +#ifdef HAVE_SWR
> > +#include "swr/swr_public.h"
> > +#endif
> >
> >  static boolean use_llvmpipe = FALSE;
> > -
> > +static boolean use_swr = FALSE;
> >
> >  static struct pipe_screen *
> >  gdi_screen_create(void)
> > @@ -69,6 +72,8 @@ gdi_screen_create(void)
> >
> >  #ifdef HAVE_LLVMPIPE
> > default_driver = "llvmpipe";
> > +#elif HAVE_SWR
> > +   default_driver = "swr";
> >  #else
> >

Re: [Mesa-dev] [PATCH 2/3] mesa: added msvc HAS_TRIVIAL_DESTRUCTOR implementation

2016-11-08 Thread Kyriazis, George


> -Original Message-
> From: Jose Fonseca [mailto:jfons...@vmware.com]
> Sent: Tuesday, November 8, 2016 4:12 PM
> To: Kyriazis, George ; mesa-
> d...@lists.freedesktop.org
> Subject: Re: [Mesa-dev] [PATCH 2/3] mesa: added msvc
> HAS_TRIVIAL_DESTRUCTOR implementation
> 
> On 07/11/16 22:32, George Kyriazis wrote:
> > not having it on windows causes a CANARY assertion in
> > src/util/ralloc.c:get_header()
> >
> > Tested only on MSVC 19.00 (DevStudio 14.0), so #ifdef guards reflect that.
> > ---
> >  src/util/macros.h | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/src/util/macros.h b/src/util/macros.h
> > index 27d1b62..12b26d3 100644
> > --- a/src/util/macros.h
> > +++ b/src/util/macros.h
> > @@ -175,6 +175,11 @@ do {   \
> >  #  if __has_feature(has_trivial_destructor)
> >  # define HAS_TRIVIAL_DESTRUCTOR(T) __has_trivial_destructor(T)
> >  #  endif
> > +#   elif defined(_MSC_VER) && !defined(__INTEL_COMPILER)
> > +#  if _MSC_VER >= 1900
> > +# define HAS_TRIVIAL_DESTRUCTOR(T) __has_trivial_destructor(T)
> > +#  else
> 
> #else is redundant here.  Otherwise looks good.
> 
No problem.  I'll remove.

George

> Reviewed-by: Jose Fonseca 
> 
> > +#  endif
> >  #   endif
> >  #   ifndef HAS_TRIVIAL_DESTRUCTOR
> > /* It's always safe (if inefficient) to assume that a
> >

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 98629] OpenGL applications warns "MESA-LOADER: failed to retrieve device information"

2016-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98629

Emil Velikov  changed:

   What|Removed |Added

 CC||emil.l.veli...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH v2] swr: disable logic op when the rt format is float or srgb

2016-11-08 Thread Rowley, Timothy O
I’d prefer parentheses to clarify the logic: "(foo && ((bar == bla) || footer))".

With those added, Reviewed-by: Tim Rowley <timothy.o.row...@intel.com>
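
For illustration, the fully parenthesized form of that condition could look
like the following. This is a stand-alone C sketch, not the swr code:
fmt_info, FMT_TYPE_FLOAT, and logic_op_allowed are hypothetical stand-ins for
SWR_FORMAT_INFO, SWR_TYPE_FLOAT, and the check added to swr_update_derived.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the SWR format description. */
enum fmt_type { FMT_TYPE_UNORM, FMT_TYPE_FLOAT };

struct fmt_info {
   enum fmt_type type0;   /* type of the first channel */
   bool is_srgb;
};

/* Returns whether the logic op should stay enabled for this format. */
static bool
logic_op_allowed(bool logic_op_enable, const struct fmt_info *info)
{
   /* Explicit grouping: (foo && ((bar == bla) || footer)) */
   if (logic_op_enable &&
       ((info->type0 == FMT_TYPE_FLOAT) || info->is_srgb)) {
      return false;
   }
   return logic_op_enable;
}
```

The extra parentheses do not change the result, since == already binds
tighter than || and || tighter than &&; they only make the grouping obvious.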

On Nov 8, 2016, at 4:30 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:

Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
---
src/gallium/drivers/swr/swr_state.cpp | 6 ++
1 file changed, 6 insertions(+)

diff --git a/src/gallium/drivers/swr/swr_state.cpp 
b/src/gallium/drivers/swr/swr_state.cpp
index d8a8ee1..d16c307 100644
--- a/src/gallium/drivers/swr/swr_state.cpp
+++ b/src/gallium/drivers/swr/swr_state.cpp
@@ -1305,6 +1305,12 @@ swr_update_derived(struct pipe_context *pipe,
   &ctx->blend->compileState[target],
   sizeof(compileState.blendState));

+const SWR_FORMAT_INFO& info = GetFormatInfo(compileState.format);
+if (compileState.blendState.logicOpEnable &&
+(info.type[0] == SWR_TYPE_FLOAT || info.isSRGB)) {
+   compileState.blendState.logicOpEnable = false;
+}
+
if (compileState.blendState.blendEnable == false &&
compileState.blendState.logicOpEnable == false) {
   SwrSetBlendFunc(ctx->swrContext, target, NULL);
--
2.7.3




[Mesa-dev] [PATCH v2] swr: disable logic op when the rt format is float or srgb

2016-11-08 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/swr/swr_state.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/drivers/swr/swr_state.cpp 
b/src/gallium/drivers/swr/swr_state.cpp
index d8a8ee1..d16c307 100644
--- a/src/gallium/drivers/swr/swr_state.cpp
+++ b/src/gallium/drivers/swr/swr_state.cpp
@@ -1305,6 +1305,12 @@ swr_update_derived(struct pipe_context *pipe,
&ctx->blend->compileState[target],
sizeof(compileState.blendState));
 
+const SWR_FORMAT_INFO& info = GetFormatInfo(compileState.format);
+if (compileState.blendState.logicOpEnable &&
+(info.type[0] == SWR_TYPE_FLOAT || info.isSRGB)) {
+   compileState.blendState.logicOpEnable = false;
+}
+
 if (compileState.blendState.blendEnable == false &&
 compileState.blendState.logicOpEnable == false) {
SwrSetBlendFunc(ctx->swrContext, target, NULL);
-- 
2.7.3



Re: [Mesa-dev] [PATCH 00/25] anv: A major rework of color attachment surface states

2016-11-08 Thread Nanley Chery
On Tue, Nov 08, 2016 at 01:52:15PM -0800, Jason Ekstrand wrote:
> On Tue, Nov 8, 2016 at 1:36 PM, Nanley Chery  wrote:
> 
> > On Sat, Oct 22, 2016 at 10:50:31AM -0700, Jason Ekstrand wrote:
> > > This series does some fairly major surgery on color attachment surface
> > > state allocation and fill-out in the Intel Vulkan driver.  This is in
> > > preparation for doing color compression, fast-clears, and HiZ-capable
> > input
> > > attachments.  Naturally, as with everything else I've done in the last 2
> > > months, it also involves some non-trivial blorp work.
> > >
> > > Let's start off at the beginning...  For a variety of reasons, we can't
> > > really know 100% of the details of an attachment's surface state at any
> > > other places than vkCmdBeginRenderPass and vkCmdNextSubpass.  The same
> > > applies for depth buffers if you consider 3DSTATE_DEPTH_BUFFER and
> > friends
> > > to be the depth and stencil buffer's "surface state".  That's a fairly
> > > strong statement, but there are a couple of reasons for this:
> > >
> > >  1) In order for fast-clears to work, the surface state has to contain
> > the
> > > clear color.  (This is its own packet for HiZ but not for color.)
> > We
> > > don't know the clear value until BeginRenderPass.  This means we
> > can't
> > > fully fill out the surface state in vkCreateImageView.
> > >
> >
> > We could alternatively merge the view's surface state packet into
> > another that only contains the clear color(s) right?
> >
> 
> Potentially, yes.  However that adds a good bit of complication because we
> now have to emit render target surfaces on-the-fly because you may be
> building two different batches simultaneously that use the same VkImageView
> as a render target with two different clear colors.  It also doesn't solve
> the null framebuffer problem.
> 

I'm not suggesting that this optimization solves the null framebuffer
problem, nor that we could add the clear color to the VkImageView's
surface state. I'm trying to confirm that we could allocate the block
of states (as is done in this series), then assign a block entry the
VkImageView's surface state + a surface state struct that only 
contains the clear colors.

> 
> > - Nanley
> >
> > >  2) The Vulkan spec requires that you be able to call
> > vkBeginCommandBuffer
> > > on a secondary command buffer with USAGE_RENDER_PASS_CONTINUE_BIT set
> > > but with a null framebuffer.  In this case, the secondary is supposed
> > > to inherit the framebuffer from the primary.  (This is not something
> > we
> > > have properly implemented until now.)  This means that anything that
> > is
> > > callable from a render-pass-continuing secondary command buffer has
> > to
> > > be able to operate without knowing any surface details that aren't
> > part
> > > of the VkRenderPass object.  Basically, all you know is the Vulkan
> > > format (not the isl format) and the sample count.
> > >
> > > Between the two of those, about the only two entrypoints left at which we
> > > actually know surface details are vkCmdBeginRenderPass and
> > vkCmdNextSubpass
> > > so we have to figure out how to do everything there.  As it turns out,
> > this
> > > works out surprisingly well.  The format and the sample count turn out to
> > > be exactly the data we actually need in order to do all of our pipeline
> > > programming.  The only hard part is refactoring things so that it pulls
> > the
> > > data from the render pass instead of the framebuffer.  There are a number
> > > of places where we were grabbing the image view for an attachment because
> > > we either wanted to shove something into blorp or because we wanted the
> > > format and we were lazy.
> > >
> > > The approach taken in this patch series is the following:
> > >
> > >  1) Instead of allocating render target surface states in
> > vkCreateImageView,
> > > we allocate them as part of render pass setup in
> > vkCmdBeginRenderPass.
> > > All of the surface states we will ever need (including a null surface
> > > state) are allocated up-front out of a single contiguous block.
> > >
> > >  2) For secondary command buffers with USAGE_RENDER_PASS_CONTINUE_BIT
> > set,
> > > we allocate storage for all of the surface states but don't actually
> > > fill them out.  In the secondary command buffer, all binding tables
> > > refer to these surface states rather than the ones in the primary.
> > >
> > >  3) A blorp entrypoint is added that performs a clear operation without
> > > touching the depth/stencil buffer state and with a color attachment
> > > binding table explicitly provided by the caller.  This means that
> > even
> > > our blorp clears are using the surface states allocated in
> > > vkCmdBeginRenderPass.  Unfortunately, this turned out to be more work
> > > than expected because I had to add vertex shader support to blorp
> > along
> > > the way.
> > >
> > >  4) Here's the tricky bit.

Re: [Mesa-dev] [Mesa-announce] Mesa 12.0.4 release candidate

2016-11-08 Thread Mark Janes
Matt Turner  writes:

> On Tue, Nov 8, 2016 at 1:59 PM, Emil Velikov  wrote:
>> Jordan Justen (1)
>>   49c24d8 i965: fix noop_scissor range issue on width/height
>> Note: temporary on hold since it causes GPU lockups on 32bit builds.
>
> Let's just drop this one. I found it in an old branch and committed it
> (even wrote a piglit test for it), but it didn't fix any actual
> applications.

It looks like Emil already dropped this from his release candidate, but
forgot to remove it from the announcement.



Re: [Mesa-dev] [PATCH 1/3] anv/device: Don't even try to map memory with a size of 0

2016-11-08 Thread Jason Ekstrand
On Tue, Nov 8, 2016 at 2:11 PM, Nanley Chery  wrote:

> On Tue, Nov 08, 2016 at 02:01:17PM -0800, Nanley Chery wrote:
> > On Tue, Nov 08, 2016 at 01:50:01PM -0800, Jason Ekstrand wrote:
> > > On Tue, Nov 8, 2016 at 1:46 PM, Nanley Chery 
> wrote:
> > >
> > > > On Mon, Nov 07, 2016 at 05:28:12PM -0800, Jason Ekstrand wrote:
> > > > > Signed-off-by: Jason Ekstrand 
> > > > > Cc: "12.0 13.0" 
> > > > > ---
> > > > >  src/intel/vulkan/anv_device.c | 5 +
> > > > >  1 file changed, 5 insertions(+)
> > > > >
> > > > > diff --git a/src/intel/vulkan/anv_device.c
> > > > b/src/intel/vulkan/anv_device.c
> > > > > index 5393144..8055893 100644
> > > > > --- a/src/intel/vulkan/anv_device.c
> > > > > +++ b/src/intel/vulkan/anv_device.c
> > > > > @@ -1258,6 +1258,11 @@ VkResult anv_MapMemory(
> > > > > if (size == VK_WHOLE_SIZE)
> > > > >size = mem->bo.size - offset;
> > > > >
> > > > > +   if (size == 0) {
> > > >
> > > > The user isn't allowed to make such a call. Does this fix a CTS test?
> > > >
> > >
> > > Heh, so they aren't.  It doesn't fix anything, it just ensures that you
> > > never hit the ioctl with a size of zero.  How about I replace it with
> an
> > > assert?
> > >
> >
> > An assert or no assert is fine. The validation layers technically should
> > catch this for us.
>

They should, but this is more for my confidence in subsequent code than to
try and fix apps.
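
A minimal sketch of what the assert-based variant could look like, in
stand-alone C. This is not the anv_MapMemory code: resolve_map_size and
EXAMPLE_WHOLE_SIZE are made-up names (the latter standing in for
VK_WHOLE_SIZE), and the range checks are assumptions about what the
subsequent code would want to rely on.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_WHOLE_SIZE (~0ULL)   /* stand-in for VK_WHOLE_SIZE */

/* Resolve the effective map size and assert the invariants later code needs. */
static uint64_t
resolve_map_size(uint64_t bo_size, uint64_t offset, uint64_t size)
{
   assert(offset < bo_size);

   if (size == EXAMPLE_WHOLE_SIZE)
      size = bo_size - offset;

   /* The application is not allowed to map zero bytes, so treat it as a
    * programming error instead of returning early with a NULL pointer.
    */
   assert(size > 0);
   assert(size <= bo_size - offset);

   return size;
}
```

With the asserts compiled out in release builds this costs nothing, and in
debug builds it fails loudly before the ioctl is ever reached.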


> >
>
> With patch 1 fixed or omitted, this series is:
> Reviewed-by: Nanley Chery 
>

Thanks!


> > >
> > > > > +  *ppData = NULL;
> > > > > +  return VK_SUCCESS;
> > > > > +   }
> > > > > +
> > > > > /* FIXME: Is this supposed to be thread safe? Since
> vkUnmapMemory()
> > > > only
> > > > >  * takes a VkDeviceMemory pointer, it seems like only one map
> of the
> > > > memory
> > > > >  * at a time is valid. We could just mmap up front and return
> an
> > > > offset
> > > > > --
> > > > > 2.5.0.400.gff86faf
> > > > >


Re: [Mesa-dev] [PATCH 1/3] gallium/scons: OpenSWR Windows support

2016-11-08 Thread Jose Fonseca

On 07/11/16 22:32, George Kyriazis wrote:

- Added code to create screen and handle swaps in libgl_gdi.c
- Added call to swr SConscript
- included llvm 3.9 support for scons (windows swr only supports 3.9 and
  later)
- include -DHAVE_SWR to subdirs that need it

To build SWR on Windows, use "scons swr libgl-gdi"
---
 scons/llvm.py | 21 +++--
 src/gallium/SConscript|  1 +
 src/gallium/targets/libgl-gdi/SConscript  |  4 
 src/gallium/targets/libgl-gdi/libgl_gdi.c | 28 +++-
 src/gallium/targets/libgl-xlib/SConscript |  4 
 src/gallium/targets/osmesa/SConscript |  4 
 6 files changed, 55 insertions(+), 7 deletions(-)

diff --git a/scons/llvm.py b/scons/llvm.py
index 1fc8a3f..977e47a 100644
--- a/scons/llvm.py
+++ b/scons/llvm.py
@@ -106,7 +106,24 @@ def generate(env):
 ])
 env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')])
 # LIBS should match the output of `llvm-config --libs engine mcjit 
bitwriter x86asmprinter`
-if llvm_version >= distutils.version.LooseVersion('3.7'):
+if llvm_version >= distutils.version.LooseVersion('3.9'):
+env.Prepend(LIBS = [
+'LLVMX86Disassembler', 'LLVMX86AsmParser',
+'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
+'LLVMDebugInfoCodeView', 'LLVMCodeGen',
+'LLVMScalarOpts', 'LLVMInstCombine',
+'LLVMInstrumentation', 'LLVMTransformUtils',
+'LLVMBitWriter', 'LLVMX86Desc',
+'LLVMMCDisassembler', 'LLVMX86Info',
+'LLVMX86AsmPrinter', 'LLVMX86Utils',
+'LLVMMCJIT', 'LLVMExecutionEngine', 'LLVMTarget',
+'LLVMAnalysis', 'LLVMProfileData',
+'LLVMRuntimeDyld', 'LLVMObject', 'LLVMMCParser',
+'LLVMBitReader', 'LLVMMC', 'LLVMCore',
+'LLVMSupport',
+'LLVMIRReader', 'LLVMASMParser'
+])
+elif llvm_version >= distutils.version.LooseVersion('3.7'):
 env.Prepend(LIBS = [
 'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',
 'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
@@ -203,7 +220,7 @@ def generate(env):
 if '-fno-rtti' in cxxflags:
 env.Append(CXXFLAGS = ['-fno-rtti'])

-components = ['engine', 'mcjit', 'bitwriter', 'x86asmprinter', 
'mcdisassembler']
+components = ['engine', 'mcjit', 'bitwriter', 'x86asmprinter', 
'mcdisassembler', 'irreader']

 env.ParseConfig('llvm-config --libs ' + ' '.join(components))
 env.ParseConfig('llvm-config --ldflags')
diff --git a/src/gallium/SConscript b/src/gallium/SConscript
index f98268f..9273db7 100644
--- a/src/gallium/SConscript
+++ b/src/gallium/SConscript
@@ -18,6 +18,7 @@ SConscript([
 'drivers/softpipe/SConscript',
 'drivers/svga/SConscript',
 'drivers/trace/SConscript',
+'drivers/swr/SConscript',
 ])

 #
diff --git a/src/gallium/targets/libgl-gdi/SConscript 
b/src/gallium/targets/libgl-gdi/SConscript
index 2a52363..ef8050b 100644
--- a/src/gallium/targets/libgl-gdi/SConscript
+++ b/src/gallium/targets/libgl-gdi/SConscript
@@ -30,6 +30,10 @@ if env['llvm']:
 env.Append(CPPDEFINES = 'HAVE_LLVMPIPE')
 drivers += [llvmpipe]

+if 'swr' in COMMAND_LINE_TARGETS :
+env.Append(CPPDEFINES = 'HAVE_SWR')
+drivers += [swr]
+
 if env['gcc'] and env['machine'] != 'x86_64':
 # DEF parser in certain versions of MinGW is busted, as does not behave as
 # MSVC.  mingw-w64 works fine.
diff --git a/src/gallium/targets/libgl-gdi/libgl_gdi.c 
b/src/gallium/targets/libgl-gdi/libgl_gdi.c
index 922c186..12576db 100644
--- a/src/gallium/targets/libgl-gdi/libgl_gdi.c
+++ b/src/gallium/targets/libgl-gdi/libgl_gdi.c
@@ -51,9 +51,12 @@
 #include "llvmpipe/lp_public.h"
 #endif

+#ifdef HAVE_SWR
+#include "swr/swr_public.h"
+#endif

 static boolean use_llvmpipe = FALSE;
-
+static boolean use_swr = FALSE;

 static struct pipe_screen *
 gdi_screen_create(void)
@@ -69,6 +72,8 @@ gdi_screen_create(void)

 #ifdef HAVE_LLVMPIPE
default_driver = "llvmpipe";
+#elif HAVE_SWR
+   default_driver = "swr";
 #else
default_driver = "softpipe";
 #endif
@@ -78,15 +83,21 @@ gdi_screen_create(void)
 #ifdef HAVE_LLVMPIPE
if (strcmp(driver, "llvmpipe") == 0) {
   screen = llvmpipe_create_screen( winsys );
+  if (screen)
+ use_llvmpipe = TRUE;
+   }
+#endif
+#ifdef HAVE_SWR



+   if (strcmp(driver, "swr") == 0) {
+  screen = swr_create_screen( winsys );
+  if (screen)
+ use_swr = TRUE;
}
-#else
-   (void) driver;
 #endif
+   (void) driver;

if (screen == NULL) {
   screen = softpipe_create_screen( winsys );
-   } else {
-  use_llvmpipe = TRUE;
}

if(!screen)
@@ -128,6 +139,13 @@ gdi_present(struct pipe_screen *screen,
}
 #endif

+#ifdef HAVE_SWR
+

Re: [Mesa-dev] [PATCH] gallivm: fix [IU]MUL_HI regression

2016-11-08 Thread Roland Scheidegger
Am 08.11.2016 um 20:10 schrieb Marek Olšák:
> FYI, this doesn't fix the regression fully. (GLCTS failures with
> piglit: -t mulextended)

Maybe using shuffle isn't such a good idea then.
Not sure how well you handle them, and there's probably a problem with
scalar build contexts (initially this was restricted to 4 and 8-sized
vectors), looking at it we'd actually return a (1-sized) vector instead
of a scalar in the end... shifts/truncs have the advantage that they are
completely agnostic if it's scalars or vectors (and if it's vectors,
what kind of vectors).
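
As a scalar illustration of the shift/truncate approach (a minimal sketch in
plain C, not the gallivm code; umul_hi32 and imul_hi32 are names made up
here): widen both operands to twice the width, multiply, shift right by the
original width, and truncate. The same sequence would apply per lane to a
vector of any length.

```c
#include <stdint.h>

/* High 32 bits of an unsigned 32x32 multiply: widen, multiply, shift, truncate. */
static uint32_t
umul_hi32(uint32_t a, uint32_t b)
{
   uint64_t wide = (uint64_t)a * (uint64_t)b;
   return (uint32_t)(wide >> 32);
}

/* Signed variant.  Right-shifting a negative value is implementation-defined
 * in C; this assumes the usual arithmetic shift.
 */
static int32_t
imul_hi32(int32_t a, int32_t b)
{
   int64_t wide = (int64_t)a * (int64_t)b;
   return (int32_t)(wide >> 32);
}
```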

Roland




Re: [Mesa-dev] [PATCH 2/3] mesa: added msvc HAS_TRIVIAL_DESTRUCTOR implementation

2016-11-08 Thread Jose Fonseca

On 07/11/16 22:32, George Kyriazis wrote:

not having it on windows causes a CANARY assertion in
src/util/ralloc.c:get_header()

Tested only on MSVC 19.00 (DevStudio 14.0), so #ifdef guards reflect that.
---
 src/util/macros.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/util/macros.h b/src/util/macros.h
index 27d1b62..12b26d3 100644
--- a/src/util/macros.h
+++ b/src/util/macros.h
@@ -175,6 +175,11 @@ do {   \
 #  if __has_feature(has_trivial_destructor)
 # define HAS_TRIVIAL_DESTRUCTOR(T) __has_trivial_destructor(T)
 #  endif
+#   elif defined(_MSC_VER) && !defined(__INTEL_COMPILER)
+#  if _MSC_VER >= 1900
+# define HAS_TRIVIAL_DESTRUCTOR(T) __has_trivial_destructor(T)
+#  else


#else is redundant here.  Otherwise looks good.

Reviewed-by: Jose Fonseca 


+#  endif
 #   endif
 #   ifndef HAS_TRIVIAL_DESTRUCTOR
/* It's always safe (if inefficient) to assume that a
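
On the redundant #else noted above: because a later #ifndef supplies the
fallback, the version check needs no #else branch of its own. A small
stand-alone sketch of the pattern in C follows; EXAMPLE_HAS_FAST_PATH and
example_uses_fast_path are hypothetical names, not the macros from
src/util/macros.h.

```c
/* Compiler-specific detection: define the macro only when support is known. */
#if defined(_MSC_VER) && !defined(__INTEL_COMPILER)
#  if _MSC_VER >= 1900
#    define EXAMPLE_HAS_FAST_PATH 1
#  endif   /* no #else: older MSVC simply leaves the macro undefined */
#endif

/* Single conservative fallback shared by every compiler that was not matched. */
#ifndef EXAMPLE_HAS_FAST_PATH
#  define EXAMPLE_HAS_FAST_PATH 0
#endif

static int
example_uses_fast_path(void)
{
   return EXAMPLE_HAS_FAST_PATH;
}
```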





Re: [Mesa-dev] [PATCH 1/3] anv/device: Don't even try to map memory with a size of 0

2016-11-08 Thread Nanley Chery
On Tue, Nov 08, 2016 at 02:01:17PM -0800, Nanley Chery wrote:
> On Tue, Nov 08, 2016 at 01:50:01PM -0800, Jason Ekstrand wrote:
> > On Tue, Nov 8, 2016 at 1:46 PM, Nanley Chery  wrote:
> > 
> > > On Mon, Nov 07, 2016 at 05:28:12PM -0800, Jason Ekstrand wrote:
> > > > Signed-off-by: Jason Ekstrand 
> > > > Cc: "12.0 13.0" 
> > > > ---
> > > >  src/intel/vulkan/anv_device.c | 5 +
> > > >  1 file changed, 5 insertions(+)
> > > >
> > > > diff --git a/src/intel/vulkan/anv_device.c
> > > b/src/intel/vulkan/anv_device.c
> > > > index 5393144..8055893 100644
> > > > --- a/src/intel/vulkan/anv_device.c
> > > > +++ b/src/intel/vulkan/anv_device.c
> > > > @@ -1258,6 +1258,11 @@ VkResult anv_MapMemory(
> > > > if (size == VK_WHOLE_SIZE)
> > > >size = mem->bo.size - offset;
> > > >
> > > > +   if (size == 0) {
> > >
> > > The user isn't allowed to make such a call. Does this fix a CTS test?
> > >
> > 
> > Heh, so they aren't.  It doesn't fix anything, it just ensures that you
> > never hit the ioctl with a size of zero.  How about I replace it with an
> > assert?
> > 
> 
> An assert or no assert is fine. The validation layers technically should
> catch this for us.
> 

With patch 1 fixed or omitted, this series is:
Reviewed-by: Nanley Chery 

> > 
> > > > +  *ppData = NULL;
> > > > +  return VK_SUCCESS;
> > > > +   }
> > > > +
> > > > /* FIXME: Is this supposed to be thread safe? Since vkUnmapMemory()
> > > only
> > > >  * takes a VkDeviceMemory pointer, it seems like only one map of the
> > > memory
> > > >  * at a time is valid. We could just mmap up front and return an
> > > offset
> > > > --
> > > > 2.5.0.400.gff86faf
> > > >
> > > > ___
> > > > mesa-dev mailing list
> > > > mesa-dev@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > >


Re: [Mesa-dev] [PATCH 1/2] gallivm: add wrappers for missing functions in LLVM <= 3.8

2016-11-08 Thread Jose Fonseca

On 19/10/16 23:14, Marek Olšák wrote:

From: Marek Olšák 

radeonsi needs these.
---
 src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 21 +
 src/gallium/auxiliary/gallivm/lp_bld_misc.h   |  6 ++
 2 files changed, 27 insertions(+)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp 
b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
index 791a470..f4045ad 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
@@ -70,20 +70,21 @@
 #include 
 #else
 #include 
 #endif
 #include 
 #include 
 #include 

 #include 

+#include 
 #include 
 #include 
 #include 

 #include 
 #if LLVM_USE_INTEL_JITEVENTS
 #include 
 #endif

 // Workaround http://llvm.org/PR23628
@@ -701,10 +702,30 @@ lp_free_memory_manager(LLVMMCJITMemoryManagerRef 
memorymgr)
 extern "C" void
 lp_add_attr_dereferenceable(LLVMValueRef val, uint64_t bytes)
 {
 #if HAVE_LLVM >= 0x0306
llvm::Argument *A = llvm::unwrap(val);
llvm::AttrBuilder B;
B.addDereferenceableAttr(bytes);
A->addAttr(llvm::AttributeSet::get(A->getContext(), A->getArgNo() + 1,  B));
 #endif
 }
+
+extern "C" LLVMValueRef
+lp_get_called_value(LLVMValueRef call)
+{
+#if HAVE_LLVM >= 0x0309
+   return LLVMGetCalledValue(call);
+#else
+   return 
llvm::wrap(llvm::CallSite(llvm::unwrap(call)).getCalledValue());
+#endif
+}


In these circumstances, rather than introducing a wrapper, I find it more 
appealing to "backport" the future definition, as:


#if HAVE_LLVM < 0x0309
extern "C" LLVMValueRef
LLVMGetCalledValue(LLVMValueRef call)
{
   return llvm::wrap(llvm::CallSite(llvm::unwrap(call)).getCalledValue());
}
#endif


This way it's one less wrapper to learn.  And when the required LLVM 
version reaches 3.9, we can just remove the function.
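
As a rough illustration of the backport pattern described above, here is a
minimal sketch in plain C that does not depend on LLVM.  Everything in it is
hypothetical: SKETCH_HAVE_LIB stands in for HAVE_LLVM, and
sketch_lib_get_value() stands in for a function that only newer versions of
a library provide.  The caller uses the upstream name unconditionally, so
the guarded block can simply be deleted once the minimum version is raised.

```c
#include <assert.h>

/* Hypothetical version macro, encoded like HAVE_LLVM (0x0308 == 3.8). */
#ifndef SKETCH_HAVE_LIB
#define SKETCH_HAVE_LIB 0x0308
#endif

#if SKETCH_HAVE_LIB < 0x0309
/* Backport: define the function under the name the newer library uses.
 * With a new enough library this block is skipped and the library's own
 * declaration is used instead (not modelled in this sketch).
 */
static int
sketch_lib_get_value(int handle)
{
   return handle * 2;
}
#endif

/* Call sites never need to know whether the backport is in effect. */
static int
sketch_caller(int handle)
{
   return sketch_lib_get_value(handle);
}

/* Small usage check for the sketch. */
static void
sketch_self_check(void)
{
   assert(sketch_caller(21) == 42);
}
```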


Jose



+
+extern "C" bool
+lp_is_function(LLVMValueRef v)
+{
+#if HAVE_LLVM >= 0x0309
+   return LLVMGetValueKind(v) == LLVMFunctionValueKind;
+#else
+   return llvm::isa(llvm::unwrap(v));
+#endif
+}
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.h 
b/src/gallium/auxiliary/gallivm/lp_bld_misc.h
index c127c48..a55c6bd 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.h
@@ -69,16 +69,22 @@ lp_free_generated_code(struct lp_generated_code *code);

 extern LLVMMCJITMemoryManagerRef
 lp_get_default_memory_manager();

 extern void
 lp_free_memory_manager(LLVMMCJITMemoryManagerRef memorymgr);

 extern void
 lp_add_attr_dereferenceable(LLVMValueRef val, uint64_t bytes);

+extern LLVMValueRef
+lp_get_called_value(LLVMValueRef call);
+
+extern bool
+lp_is_function(LLVMValueRef v);
+
 #ifdef __cplusplus
 }
 #endif


 #endif /* !LP_BLD_MISC_H */





Re: [Mesa-dev] [Mesa-announce] Mesa 12.0.4 release candidate

2016-11-08 Thread Matt Turner
On Tue, Nov 8, 2016 at 1:59 PM, Emil Velikov  wrote:
> Jordan Justen (1)
>   49c24d8 i965: fix noop_scissor range issue on width/height
> Note: temporary on hold since it causes GPU lockups on 32bit builds.

Let's just drop this one. I found it in an old branch and committed it
(even wrote a piglit test for it), but it didn't fix any actual
applications.


Re: [Mesa-dev] [PATCH 1/3] anv/device: Don't even try to map memory with a size of 0

2016-11-08 Thread Nanley Chery
On Tue, Nov 08, 2016 at 01:50:01PM -0800, Jason Ekstrand wrote:
> On Tue, Nov 8, 2016 at 1:46 PM, Nanley Chery  wrote:
> 
> > On Mon, Nov 07, 2016 at 05:28:12PM -0800, Jason Ekstrand wrote:
> > > Signed-off-by: Jason Ekstrand 
> > > Cc: "12.0 13.0" 
> > > ---
> > >  src/intel/vulkan/anv_device.c | 5 +
> > >  1 file changed, 5 insertions(+)
> > >
> > > diff --git a/src/intel/vulkan/anv_device.c
> > b/src/intel/vulkan/anv_device.c
> > > index 5393144..8055893 100644
> > > --- a/src/intel/vulkan/anv_device.c
> > > +++ b/src/intel/vulkan/anv_device.c
> > > @@ -1258,6 +1258,11 @@ VkResult anv_MapMemory(
> > > if (size == VK_WHOLE_SIZE)
> > >size = mem->bo.size - offset;
> > >
> > > +   if (size == 0) {
> >
> > The user isn't allowed to make such a call. Does this fix a CTS test?
> >
> 
> Heh, so they aren't.  It doesn't fix anything, it just ensures that you
> never hit the ioctl with a size of zero.  How about I replace it with an
> assert?
> 

An assert or no assert is fine. The validation layers technically should
catch this for us.

> 
> > > +  *ppData = NULL;
> > > +  return VK_SUCCESS;
> > > +   }
> > > +
> > > /* FIXME: Is this supposed to be thread safe? Since vkUnmapMemory()
> > only
> > >  * takes a VkDeviceMemory pointer, it seems like only one map of the
> > memory
> > >  * at a time is valid. We could just mmap up front and return an
> > offset
> > > --
> > > 2.5.0.400.gff86faf
> > >
> > > ___
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >


[Mesa-dev] Mesa 12.0.4 release candidate

2016-11-08 Thread Emil Velikov
Hello list,

The candidate for the Mesa 12.0.4 is now available. Currently we have:
 - 115 queued
 - 10 nominated (outstanding)
 - and 11 (self-)rejected patches

Notes:
 - The Intel CI infrastructure is utilised for testing which brings us testing
on all platforms supported by the i965 DRI and ANV Vulkan drivers.
Classic swrast, softpipe and llvmpipe are also tested against latest piglit.
Unfortunately, reports can no longer include the Changes section.
 - Sent-but-not-yet-merged-in-master patches are no longer tracked in the
Nominated section.
 - Nominated and Rejected sections now include the sha of the commit in master.
 - The Rejected section includes reasoning behind the decision. Objections are
considered with backports greatly appreciated.

Take a look at section "Mesa stable queue" for more information.

Testing reports/general approval

Any testing reports (or general approval of the state of the branch) will be
greatly appreciated.

The plan is to have 12.0.4 this Thursday (10th of November), around or shortly
after 20:00 GMT.

If you have any questions or suggestions - be that about the current patch
queue or otherwise, please go ahead.

Trivial merge conflicts
-----------------------
commit 05ec6a7c03ce0b3c38b081f7947aeaa47b1d7e81
Author: Ilia Mirkin 

a3xx: use window scissor to simulate viewport xy clip

(cherry picked from commit ca313e00b6eda27e4308c29fd7244f43c77d4f97)


commit b1c5719d7b1be2f6fb438252614a2008f54159d7
Author: Marek Olšák 

radeonsi: fix FP64 UBO loads with indirect uniform block indexing

(cherry picked from commit 15a127bc2c3267f35e0d78ebc205e1686a5a5e3f)


commit 6a72af2aeb48b90adfff922c99a26274cbc2f357
Author: Kenneth Graunke 

mesa: Expose RESET_NOTIFICATION_STRATEGY with KHR_robustness.

(cherry picked from commit 3bcdc2e3db8fb9f8e04d3504b6f331b484ebcc96)


commit f228c90f80effc413b51aba0b9b4f3487ed35871
Author: Nicholas Bishop 

st/dri: check pipe_screen->resource_get_handle() return value

(cherry picked from commit aa560e8e6328acd5b8feec1fea54dec06ae21368)


commit 17429a22a6026dc6601fc8e9ae4f0daecb30079a
Author: Jason Ekstrand 

nir: Add a nop intrinsic

(cherry picked from commit 7697b4b98b155c818811709becdb408772371538)


commit a5c0b8784aacfc0e7d2ff90a92dd56bb53b97bdd
Author: Nicolai Hähnle 

gallium/radeon: cleanup and fix branch emits

(cherry picked from commit 6f87d7a14699277be6dd17e9e712841c4057c4df)


commit bc04c92aef700a60a65ff567aed7f3e99a6d95b4
Author: Kenneth Graunke 

i965: Add missing BRW_NEW_VS_PROG_DATA to 3DSTATE_CLIP.

(cherry picked from commit 28e1538be7923205231402ab928b61b670bd2962)


commit eb9236e27591583c8fa20daaeb29bcab1ccb8ad8
Author: Vinson Lee 

Revert "mesa_glinterop: remove inclusion of GLX header"

(cherry picked from commit c10dcb2ce837922c6ee4e191e6d6202098a5ee10)
(cherry picked from commit c85b34ffd04f9a7a16fe30173474e857d0f42d5f)


commit 341889d6ca85b9c7346e656b2eb65ac1007756a4
Author: Chad Versace 

egl: Don't advertise unsupported platform extensions

(cherry picked from commit c177ef9d47943f648a13beed14269f468583c16e)


commit 979e4b9c3f5b1272df807c0195a85d980c45ea29
Author: Tapani Pälli 

egl: add check that eglCreateContext gets a valid config

(cherry picked from commit 5876f3c85a61d73bb4863331bd641152a40a7b0c)


commit ac3abe534bb9986ce7ee1286854a3bb2a83568bc
Author: Marek Olšák 

winsys/amdgpu: fix radeon_surf::macro_tile_index for imported textures

(cherry picked from commit 6ec3b2a4b1d41b83a4721d06b42c49f55e695cbf)


commit 3d4a219dd86eaab549e49c48ed1c8a0c922b5221
Author: Jason Ekstrand 

intel/blorp: Rework our usage of ralloc when compiling shaders

(cherry picked from commit 43dadb6edd5e3e3e10b1198184a9f75556edad49)


Cheers,
Emil


Mesa stable queue
-----------------

Nominated (10)
==============

Adam Jackson (2)
  deb0eb1 glx/glvnd: Don't modify the dummy slot in the dispatch table
  8bca8d8 glx/glvnd: Fix dispatch function names and indices

Haixia Shi (1)
  8c56ff6 mesa: change state query return value for RGB565

Jason Ekstrand (1)
  2a4a868 i965/fs/generator: Don't use the address immediate for
MOV_INDIRECT

Jordan Justen (1)
  49c24d8 i965: fix noop_scissor range issue on width/height
Note: temporarily on hold since it causes GPU lockups on 32bit builds.

Marek Olšák (5)
  8b05076 gallium/radeon: unify viewport emission code
  687c4be gallium/radeon: set VPORT_ZMIN/MAX registers correctly
  03708de radeonsi: fix gl_PatchVerticesIn for tessellation
evaluation shader
  3e756f0 radeonsi: fix a crash in imageSize for cubemap arrays
  b425b57 radeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the
kernel allows it

Matt Turner (1)
  0775523 anv: Replace "abi_versions" with correct "api_version".


Queued (115)


Axel Davy (4):
  gallium/util: Really allow aliasing of dst for u_box_union_*
  st/nine: Fix the calculation o

Re: [Mesa-dev] [PATCH 1/2] intel/common: Add an is_kabylake field to gen_device_info

2016-11-08 Thread Jason Ekstrand
On Tue, Nov 8, 2016 at 1:53 PM, Matt Turner  wrote:

> On Tue, Nov 8, 2016 at 1:21 PM, Jason Ekstrand 
> wrote:
> > Most of the 3-D engine Kaby Lake is identical to Sky Lake.  However,
> there
>
> While hyphenating 3D looks a little odd to me, Skylake is definitely
> just a single word. (Strangely, Kaby Lake is indeed two words)
>
> > are a few small differences that we need to be able to detect.
> >
> > Signed-off-by: Jason Ekstrand 
> > ---
> >  src/intel/common/gen_device_info.c | 14 +-
> >  src/intel/common/gen_device_info.h |  1 +
> >  2 files changed, 10 insertions(+), 5 deletions(-)
> >
> > diff --git a/src/intel/common/gen_device_info.c
> b/src/intel/common/gen_device_info.c
> > index 30df0b2..3ff98f0 100644
> > --- a/src/intel/common/gen_device_info.c
> > +++ b/src/intel/common/gen_device_info.c
> > @@ -427,6 +427,10 @@ static const struct gen_device_info
> gen_device_info_bxt_2x6 = {
> >   * There's no KBL entry. Using the default SKL (GEN9) GS entries value.
> >   */
> >
> > +#define KBL_FEATURES \
>
> We don't have subgen FEATURES #defines for anything else. bxt, for
> instance just sets .is_broxton in its couple of fields. Not wrong, but
> doesn't seem particularly necessary for a single field.
>

We do for Haswell which is why I did it this way.


> I'd prefer to just put .is_kabylake in the KBL structs, unless you've
> got further plans.
>

Sure.  I don't care much either way.


> With that fixed, both patches are
>
> Reviewed-by: Matt Turner 
>

Thanks


Re: [Mesa-dev] [PATCH 1/2] intel/common: Add an is_kabylake field to gen_device_info

2016-11-08 Thread Matt Turner
On Tue, Nov 8, 2016 at 1:21 PM, Jason Ekstrand  wrote:
> Most of the 3-D engine Kaby Lake is identical to Sky Lake.  However, there

While hyphenating 3D looks a little odd to me, Skylake is definitely
just a single word. (Strangely, Kaby Lake is indeed two words)

> are a few small differences that we need to be able to detect.
>
> Signed-off-by: Jason Ekstrand 
> ---
>  src/intel/common/gen_device_info.c | 14 +-
>  src/intel/common/gen_device_info.h |  1 +
>  2 files changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/src/intel/common/gen_device_info.c 
> b/src/intel/common/gen_device_info.c
> index 30df0b2..3ff98f0 100644
> --- a/src/intel/common/gen_device_info.c
> +++ b/src/intel/common/gen_device_info.c
> @@ -427,6 +427,10 @@ static const struct gen_device_info 
> gen_device_info_bxt_2x6 = {
>   * There's no KBL entry. Using the default SKL (GEN9) GS entries value.
>   */
>
> +#define KBL_FEATURES \

We don't have subgen FEATURES #defines for anything else. bxt, for
instance just sets .is_broxton in its couple of fields. Not wrong, but
doesn't seem particularly necessary for a single field.

I'd prefer to just put .is_kabylake in the KBL structs, unless you've
got further plans.

With that fixed, both patches are

Reviewed-by: Matt Turner 

Neat find. :)
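
For reference, the preference stated above (set the flag directly in the KBL
structs rather than through a KBL_FEATURES macro) might look roughly like the
sketch below.  The struct, macro, and variable names are hypothetical
stand-ins, not the real contents of gen_device_info.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for struct gen_device_info. */
struct sketch_device_info {
   int gen;
   int gt;
   bool is_kabylake;
};

/* Fields shared by every gen9 part live in one macro. */
#define SKETCH_GEN9_FEATURES \
   .gen = 9

/* A Skylake-like entry: only the shared features plus its own fields. */
static const struct sketch_device_info sketch_info_skl_gt2 = {
   SKETCH_GEN9_FEATURES,
   .gt = 2,
};

/* A Kaby Lake-like entry: the single extra flag is set in place, so no
 * separate per-platform FEATURES macro is needed.
 */
static const struct sketch_device_info sketch_info_kbl_gt2 = {
   SKETCH_GEN9_FEATURES,
   .is_kabylake = true,
   .gt = 2,
};

/* Small usage check for the sketch. */
static void
sketch_self_check(void)
{
   assert(sketch_info_kbl_gt2.is_kabylake);
   assert(!sketch_info_skl_gt2.is_kabylake);
}
```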


Re: [Mesa-dev] [PATCH 00/25] anv: A major rework of color attachment surface states

2016-11-08 Thread Jason Ekstrand
On Tue, Nov 8, 2016 at 1:36 PM, Nanley Chery  wrote:

> On Sat, Oct 22, 2016 at 10:50:31AM -0700, Jason Ekstrand wrote:
> > This series does some fairly major surgery on color attachment surface
> > state allocation and fill-out in the Intel Vulkan driver.  This is in
> > preparation for doing color compression, fast-clears, and HiZ-capable
> input
> > attachments.  Naturally, as with everything else I've done in the last 2
> > months, it also involves some non-trivial blorp work.
> >
> > Let's start off at the beginning...  For a variety of reasons, we can't
> > really know 100% of the details of an attachment's surface state at any
> > other places than vkCmdBeginRenderPass and vkCmdNextSubpss.  The same
> > applies for depth buffers if you consider 3DSTATE_DEPTH_BUFFER and
> friends
> > to be the depth and stencil buffer's "surface state".  That's a fairly
> > strong statement, but there are a couple of reasons for this:
> >
> >  1) In order for fast-clears to work, the surface state has to contain
> the
> > clear color.  (This is it's own packet for HiZ but not for color.)
> We
> > don't know the clear value until BeginRenderPass.  This means we
> can't
> > fully fill out the surface state in vkCmdCreateImageView.
> >
>
> We could alternatively merge the view's surface state packet into
> another that only contains the clear color(s) right?
>

Potentially, yes.  However that adds a good bit of complication because we
now have to emit render target surfaces on-the-fly because you may be
building two different batches simultaneously that use the same VkImageView
as a render target with two different clear colors.  It also doesn't solve
the null framebuffer problem.


> - Nanley
>
> >  2) The Vulkan spec requires that you be able to call
> vkBeginCommandBuffer
> > on a secondary command buffer with USAGE_RENDER_PASS_CONTINUE_BIT set
> > but with a null framebuffer.  In this case, the secondary is supposed
> > to inherit the framebuffer from the primary.  (This is not something
> we
> > have properly implemented until now.)  This means that anything that
> is
> > callable from a render-pass-continuing secondary command buffer has
> to
> > be able to operate without knowing any surface details that aren't
> part
> > of the VkRenderPass object.  Basically, all you know is the Vulkan
> > format (not the isl format) and the sample count.
> >
> > Between the two of those, about the only two entrypoints left at which we
> > actually know surface details are vkCmdBeginRenderPass and
> vkCmdNextSubpass
> > so we have to figure out how to do everything there.  As it turns out,
> this
> > works out surprisingly well.  The format and the sample count turn out to
> > be exactly the data we actually need in order to do all of our pipeline
> > programming.  The only hard part is refactoring things so that it pulls
> the
> > data from the render pass instead of the framebuffer.  There are a number
> > of places where we were grabbing the image view for an attachment because
> > we either wanted to shove something into blorp or because we wanted the
> > format and we were lazy.
> >
> > The approach taken in this patch series is the following:
> >
> >  1) Instead of allocating render target surface states in
> vkCreateImageView,
> > we allocate them as part of render pass setup in
> vkCmdBeginRenderPass.
> > All of the surface states we will ever need (including a null surface
> > state) are allocated up-front out of a single contiguous block.
> >
> >  2) For secondary command buffers with USAGE_RENDER_PASS_CONTINUE_BIT
> set,
> > we allocate storage for all of the surface states but don't actually
> > fill them out.  In the secondary command buffer, all binding tables
> > refer to these surface states rather than the ones in the primary.
> >
> >  3) A blorp entrypoint is added that performs a clear operation without
> > touching the depth/stencil buffer state and with a color attachment
> > binding table explicitly provided by the caller.  This means that
> even
> > our blorp clears are using the surface states allocated in
> > vkCmdBeginRenderPass.  Unfortunately, this turned out to be more work
> > than expected because I had to add vertex shader support to blorp
> along
> > the way.
> >
> >  4) Here's the tricky bit.  When vkCmdExecuteCommands is called during a
> > render pass, we use transform feedback (yeah, crazy) to copy the
> block
> > of surface states from the primary into the secondary right before
> > executing the secondary.
> >
> > It's kind of a crazy scheme but I like the end result quite a bit.
> >
> > Cc: Kristian Høgsberg Kristensen 
> > Cc: Chad Versace 
> > Cc: Nanley Chery 
> > Cc: Topi Pohjolainen 
> >
> > Jason Ekstrand (25):
> >   intel/isl: Add some basic info about RENDER_SURFACE_STATE to
> > isl_device
> >   intel/genxml: Add SO_WRITE_OFFSET registers for gen7-9
> >   anv: Add a helper f

Re: [Mesa-dev] [PATCH 1/3] anv/device: Don't even try to map memory with a size of 0

2016-11-08 Thread Jason Ekstrand
On Tue, Nov 8, 2016 at 1:46 PM, Nanley Chery  wrote:

> On Mon, Nov 07, 2016 at 05:28:12PM -0800, Jason Ekstrand wrote:
> > Signed-off-by: Jason Ekstrand 
> > Cc: "12.0 13.0" 
> > ---
> >  src/intel/vulkan/anv_device.c | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/src/intel/vulkan/anv_device.c
> b/src/intel/vulkan/anv_device.c
> > index 5393144..8055893 100644
> > --- a/src/intel/vulkan/anv_device.c
> > +++ b/src/intel/vulkan/anv_device.c
> > @@ -1258,6 +1258,11 @@ VkResult anv_MapMemory(
> > if (size == VK_WHOLE_SIZE)
> >size = mem->bo.size - offset;
> >
> > +   if (size == 0) {
>
> The user isn't allowed to make such a call. Does this fix a CTS test?
>

Heh, so they aren't.  It doesn't fix anything, it just ensures that you
never hit the ioctl with a size of zero.  How about I replace it with an
assert?


> > +  *ppData = NULL;
> > +  return VK_SUCCESS;
> > +   }
> > +
> > /* FIXME: Is this supposed to be thread safe? Since vkUnmapMemory()
> only
> >  * takes a VkDeviceMemory pointer, it seems like only one map of the
> memory
> >  * at a time is valid. We could just mmap up front and return an
> offset
> > --
> > 2.5.0.400.gff86faf
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
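
The assert alternative floated above could be sketched as follows.  This is
a standalone model of the size handling only, not the actual anv_MapMemory
code: sketch_resolve_map_size() and SKETCH_WHOLE_SIZE are hypothetical names,
with SKETCH_WHOLE_SIZE mirroring the value of VK_WHOLE_SIZE.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for VK_WHOLE_SIZE, which is defined as (~0ULL). */
#define SKETCH_WHOLE_SIZE (~0ULL)

/* Hypothetical helper modelling the size computation at the top of a
 * map-memory entrypoint.  bo_size is the size of the backing buffer object.
 */
static uint64_t
sketch_resolve_map_size(uint64_t bo_size, uint64_t offset, uint64_t size)
{
   if (size == SKETCH_WHOLE_SIZE)
      size = bo_size - offset;

   /* A zero-sized map is invalid usage, so instead of returning early
    * with a NULL pointer, debug builds just assert that it never happens.
    */
   assert(size > 0);

   return size;
}
```

With NDEBUG defined the assert compiles away, which matches the observation
in the thread that the validation layers should be the ones catching this.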


Re: [Mesa-dev] [PATCH 1/3] anv/device: Don't even try to map memory with a size of 0

2016-11-08 Thread Nanley Chery
On Mon, Nov 07, 2016 at 05:28:12PM -0800, Jason Ekstrand wrote:
> Signed-off-by: Jason Ekstrand 
> Cc: "12.0 13.0" 
> ---
>  src/intel/vulkan/anv_device.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 5393144..8055893 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -1258,6 +1258,11 @@ VkResult anv_MapMemory(
> if (size == VK_WHOLE_SIZE)
>size = mem->bo.size - offset;
>  
> +   if (size == 0) {

The user isn't allowed to make such a call. Does this fix a CTS test?

> +  *ppData = NULL;
> +  return VK_SUCCESS;
> +   }
> +
> /* FIXME: Is this supposed to be thread safe? Since vkUnmapMemory() only
>  * takes a VkDeviceMemory pointer, it seems like only one map of the 
> memory
>  * at a time is valid. We could just mmap up front and return an offset
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] nir: add conditional discard optimisation (v3)

2016-11-08 Thread Eric Anholt
Dave Airlie  writes:

> From: Dave Airlie 
>
> This is ported from GLSL and converts
>
> if (cond)
>   discard;
>
> into
> discard_if(cond);
>
> This removes a block, but also is needed by radv
> to workaround a bug in the LLVM backend.
>
> v2: handle if (a) discard_if(b) (nha)
> cleanup and drop pointless loop (Matt)
> make sure there are no dependent phis (Eric)
> v3: make sure only one instruction in the then block.
>
> Signed-off-by: Dave Airlie 
> ---

> diff --git a/src/compiler/nir/nir_opt_conditional_discard.c 
> b/src/compiler/nir/nir_opt_conditional_discard.c
> new file mode 100644
> index 000..6e90983
> --- /dev/null
> +++ b/src/compiler/nir/nir_opt_conditional_discard.c
> @@ -0,0 +1,125 @@
> +/*
> + * Copyright © 2016 Red Hat
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
> DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include "nir.h"
> +#include "nir_builder.h"
> +
> +/** @file nir_opt_conditional_discard.c
> + *
> + * Handles optimization of lowering if (cond) discard to discard_if(cond).
> + */

Maybe put some quotes around "if (cond) discard" to clarify what
statement is being lowered.

> +static bool
> +nir_opt_conditional_discard_block(nir_block *block, void *mem_ctx)
> +{
> +   nir_builder bld;
> +
> +   if (nir_cf_node_is_first(&block->cf_node))
> +  return false;
> +
> +   nir_cf_node *prev_node = nir_cf_node_prev(&block->cf_node);
> +   if (prev_node->type != nir_cf_node_if)
> +  return false;
> +
> +   nir_if *if_stmt = nir_cf_node_as_if(prev_node);
> +   nir_block *then_block = nir_if_first_then_block(if_stmt);
> +   nir_block *else_block = nir_if_first_else_block(if_stmt);
> +
> +   /* check there is only one else block and it is empty */
> +   if (nir_if_last_else_block(if_stmt) != else_block)
> +  return false;
> +   if (!exec_list_is_empty(&else_block->instr_list))
> +  return false;
> +
> +   /* check there is only one then block and it has only one instruction in 
> it */
> +   if (nir_if_last_then_block(if_stmt) != then_block)
> +  return false;
> +   if (exec_list_is_empty(&then_block->instr_list))
> +  return false;
> +   if (exec_list_length(&then_block->instr_list) > 1)
> +  return false;
> +   /*
> +* make sure no subsequent phi nodes point at this if.
> +*/
> +   nir_block *after = 
> nir_cf_node_as_block(nir_cf_node_next(&if_stmt->cf_node));
> +   nir_foreach_instr_safe(instr, after) {
> +  if (instr->type != nir_instr_type_phi)
> + break;
> +  nir_phi_instr *phi = nir_instr_as_phi(instr);
> +
> +  nir_foreach_phi_src(phi_src, phi) {
> + if (phi_src->pred == then_block ||
> + phi_src->pred == else_block)
> +return false;
> +  }
> +   }
> +
> +   /* Get the first instruction in the then block and confirm it is
> +* a discard or a discard_if
> +*/
> +   nir_instr *instr = nir_block_first_instr(then_block);
> +   if (instr->type != nir_instr_type_intrinsic)
> +  return false;
> +
> +   nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
> +   if (intrin->intrinsic != nir_intrinsic_discard &&
> +   intrin->intrinsic != nir_intrinsic_discard_if)
> +  return false;
> +
> +   nir_src cond;
> +
> +   nir_builder_init(&bld, mem_ctx);

Missing bld.cursor initialization, so the adding-to-a-discard_if case
should crash.  With that fixed, this will be:

Reviewed-by: Eric Anholt 
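
To make the transformation from the commit message concrete, here is a toy
model in plain C.  It does not use NIR at all.  It only checks that the
branchy form "if (a) discard_if(b);" and a single "discard_if(a && b);"
discard in exactly the same cases.  Combining the two conditions with a
logical AND is an assumption about how the v2 case is folded, and all
function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Branchy form: "if (a) discard_if(b);".  Returns whether the fragment
 * ends up discarded.
 */
static bool
sketch_discarded_nested(bool a, bool b)
{
   bool discarded = false;
   if (a) {
      if (b)
         discarded = true;
   }
   return discarded;
}

/* Folded form: a single "discard_if(a && b);" with no control flow. */
static bool
sketch_discarded_folded(bool a, bool b)
{
   return a && b;
}

/* Exhaustively compare both forms.  A plain "discard" in the then-block
 * is the b == true special case, which reduces to "discard_if(a)".
 */
static void
sketch_check_forms_agree(void)
{
   for (int a = 0; a <= 1; a++) {
      for (int b = 0; b <= 1; b++)
         assert(sketch_discarded_nested(a, b) ==
                sketch_discarded_folded(a, b));
   }
}
```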




Re: [Mesa-dev] [PATCH 00/25] anv: A major rework of color attachment surface states

2016-11-08 Thread Nanley Chery
On Sat, Oct 22, 2016 at 10:50:31AM -0700, Jason Ekstrand wrote:
> This series does some fairly major surgery on color attachment surface
> state allocation and fill-out in the Intel Vulkan driver.  This is in
> preparation for doing color compression, fast-clears, and HiZ-capable input
> attachments.  Naturally, as with everything else I've done in the last 2
> months, it also involves some non-trivial blorp work.
> 
> Let's start off at the beginning...  For a variety of reasons, we can't
> really know 100% of the details of an attachment's surface state at any
> other places than vkCmdBeginRenderPass and vkCmdNextSubpss.  The same
> applies for depth buffers if you consider 3DSTATE_DEPTH_BUFFER and friends
> to be the depth and stencil buffer's "surface state".  That's a fairly
> strong statement, but there are a couple of reasons for this:
> 
>  1) In order for fast-clears to work, the surface state has to contain the
> clear color.  (This is it's own packet for HiZ but not for color.)  We
> don't know the clear value until BeginRenderPass.  This means we can't
> fully fill out the surface state in vkCmdCreateImageView.
> 

We could alternatively merge the view's surface state packet into
another that only contains the clear color(s) right?

- Nanley

>  2) The Vulkan spec requires that you be able to call vkBeginCommandBuffer
> on a secondary command buffer with USAGE_RENDER_PASS_CONTINUE_BIT set
> but with a null framebuffer.  In this case, the secondary is supposed
> to inherit the framebuffer from the primary.  (This is not something we
> have properly implemented until now.)  This means that anything that is
> callable from a render-pass-continuing secondary command buffer has to
> be able to operate without knowing any surface details that aren't part
> of the VkRenderPass object.  Basically, all you know is the Vulkan
> format (not the isl format) and the sample count.
> 
> Between the two of those, about the only two entrypoints left at which we
> actually know surface details are vkCmdBeginRenderPass and vkCmdNextSubpass
> so we have to figure out how to do everything there.  As it turns out, this
> works out surprisingly well.  The format and the sample count turn out to
> be exactly the data we actually need in order to do all of our pipeline
> programming.  The only hard part is refactoring things so that it pulls the
> data from the render pass instead of the framebuffer.  There are a number
> of places where we were grabbing the image view for an attachment because
> we either wanted to shove something into blorp or because we wanted the
> format and we were lazy.
> 
> The approach taken in this patch series is the following:
> 
>  1) Instead of allocating render target surface states in vkCreateImageView,
> we allocate them as part of render pass setup in vkCmdBeginRenderPass.
> All of the surface states we will ever need (including a null surface
> state) are allocated up-front out of a single contiguous block.
> 
>  2) For secondary command buffers with USAGE_RENDER_PASS_CONTINUE_BIT set,
> we allocate storage for all of the surface states but don't actually
> fill them out.  In the secondary command buffer, all binding tables
> refer to these surface states rather than the ones in the primary.
> 
>  3) A blorp entrypoint is added that performs a clear operation without
> touching the depth/stencil buffer state and with a color attachment
> binding table explicitly provided by the caller.  This means that even
> our blorp clears are using the surface states allocated in
> vkCmdBeginRenderPass.  Unfortunately, this turned out to be more work
> than expected because I had to add vertex shader support to blorp along
> the way.
> 
>  4) Here's the tricky bit.  When vkCmdExecuteCommands is called during a
> render pass, we use transform feedback (yeah, crazy) to copy the block
> of surface states from the primary into the secondary right before
> executing the secondary.
> 
> It's kind of a crazy scheme but I like the end result quite a bit.
> 
> Cc: Kristian Høgsberg Kristensen 
> Cc: Chad Versace 
> Cc: Nanley Chery 
> Cc: Topi Pohjolainen 
> 
> Jason Ekstrand (25):
>   intel/isl: Add some basic info about RENDER_SURFACE_STATE to
> isl_device
>   intel/genxml: Add SO_WRITE_OFFSET registers for gen7-9
>   anv: Add a helper for doing buffer copies with nothing but VF and SOL.
>   anv/cmd_buffer: Use the surface state alloc helper in
> null_surface_state
>   anv/cmd_buffer: Expose add_surface_state_reloc as an inline helper
>   anv: Rework the way render target surfaces are allocated
>   anv/cmd_buffer: Stop relying on the framebuffer for 3DSTATE_SF on gen7
>   intel/genxml: Make some VS/GS fields consistent across gens
>   intel/blorp: Make the number of samples an explicit parameter
>   intel/blorp: Add a shader type to make keys more unique
>   intel/blorp: Remove NIR support for

[Mesa-dev] [PATCH 2/2] i965/compiler: Disable trig workarounds on KBL+

2016-11-08 Thread Jason Ekstrand
The precision of our trig instructions appears to have been
fixed on Kaby Lake.  Neither Ben nor I can find any documentation for this.
However, the dEQP precision tests now pass with INTEL_PRECISE_TRIG=0 where
they fail on Sky Lake.

Signed-off-by: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/brw_nir.c   | 5 -
 src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py | 7 ---
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index a93d825..1069438 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -449,6 +449,7 @@ nir_optimize(nir_shader *nir, bool is_scalar)
 nir_shader *
 brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir)
 {
+   const struct gen_device_info *devinfo = compiler->devinfo;
bool progress; /* Written by OPT and OPT_V */
(void)progress;
 
@@ -457,7 +458,9 @@ brw_preprocess_nir(const struct brw_compiler *compiler, 
nir_shader *nir)
if (nir->stage == MESA_SHADER_GEOMETRY)
   OPT(nir_lower_gs_intrinsics);
 
-   if (compiler->precise_trig)
+   /* See also brw_nir_trig_workarounds.py */
+   if (compiler->precise_trig &&
+   !(devinfo->gen >= 10 || devinfo->is_kabylake))
   OPT(brw_nir_apply_trig_workarounds);
 
static const nir_lower_tex_options tex_options = {
diff --git a/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py 
b/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py
index 67dab9a..3b8d0ce 100755
--- a/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py
+++ b/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py
@@ -23,9 +23,10 @@
 
 import nir_algebraic
 
-# The SIN and COS instructions on Intel hardware can produce values
-# slightly outside of the [-1.0, 1.0] range for a small set of values.
-# Obviously, this can break everyone's expectations about trig functions.
+# Prior to Kaby Lake, The SIN and COS instructions on Intel hardware can
+# produce values slightly outside of the [-1.0, 1.0] range for a small set of
+# values.  Obviously, this can break everyone's expectations about trig
+# functions.  This appears to be fixed in Kaby Lake.
 #
 # According to an internal presentation, the COS instruction can produce
 # a value up to 1.27 for inputs in the range (0.08296, 0.09888).  One
-- 
2.5.0.400.gff86faf
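
For readers skimming the diff, the gating condition can be read in isolation. The following is a minimal sketch, not the driver code: `device_info_sketch` is a hypothetical, cut-down stand-in for `struct gen_device_info`, and `needs_trig_workaround` is an illustrative helper name that does not exist in Mesa. Only the boolean expression itself is taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for struct gen_device_info. */
struct device_info_sketch {
   int gen;            /* hardware generation, e.g. 9 for Sky Lake */
   bool is_kabylake;   /* the flag added by patch 1/2 */
};

/* Same condition as the patch: the workaround pass runs only when
 * precise trig was requested AND the hardware is neither Kaby Lake
 * nor gen10 or later.
 */
static bool
needs_trig_workaround(bool precise_trig,
                      const struct device_info_sketch *devinfo)
{
   return precise_trig &&
          !(devinfo->gen >= 10 || devinfo->is_kabylake);
}
```

With this shape, Sky Lake (gen 9, not Kaby Lake) still gets the workaround when precise trig is on, while Kaby Lake and anything gen10+ skip it regardless of the setting.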

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] intel/common: Add an is_kabylake field to gen_device_info

2016-11-08 Thread Jason Ekstrand
Most of the 3-D engine on Kaby Lake is identical to Sky Lake.  However, there
are a few small differences that we need to be able to detect.

Signed-off-by: Jason Ekstrand 
---
 src/intel/common/gen_device_info.c | 14 +-
 src/intel/common/gen_device_info.h |  1 +
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/intel/common/gen_device_info.c 
b/src/intel/common/gen_device_info.c
index 30df0b2..3ff98f0 100644
--- a/src/intel/common/gen_device_info.c
+++ b/src/intel/common/gen_device_info.c
@@ -427,6 +427,10 @@ static const struct gen_device_info 
gen_device_info_bxt_2x6 = {
  * There's no KBL entry. Using the default SKL (GEN9) GS entries value.
  */
 
+#define KBL_FEATURES \
+   GEN9_FEATURES, \
+   .is_kabylake = true
+
 /*
  * Both SKL and KBL support a maximum of 64 threads per
  * Pixel Shader Dispatch (PSD) unit.
@@ -434,7 +438,7 @@ static const struct gen_device_info gen_device_info_bxt_2x6 
= {
 #define  KBL_MAX_THREADS_PER_PSD 64
 
 static const struct gen_device_info gen_device_info_kbl_gt1 = {
-   GEN9_FEATURES,
+   KBL_FEATURES,
.gt = 1,
 
.max_cs_threads = 7 * 6,
@@ -444,7 +448,7 @@ static const struct gen_device_info gen_device_info_kbl_gt1 
= {
 };
 
 static const struct gen_device_info gen_device_info_kbl_gt1_5 = {
-   GEN9_FEATURES,
+   KBL_FEATURES,
.gt = 1,
 
.max_cs_threads = 7 * 6,
@@ -453,7 +457,7 @@ static const struct gen_device_info 
gen_device_info_kbl_gt1_5 = {
 };
 
 static const struct gen_device_info gen_device_info_kbl_gt2 = {
-   GEN9_FEATURES,
+   KBL_FEATURES,
.gt = 2,
 
.max_wm_threads = KBL_MAX_THREADS_PER_PSD * 3,
@@ -461,7 +465,7 @@ static const struct gen_device_info gen_device_info_kbl_gt2 
= {
 };
 
 static const struct gen_device_info gen_device_info_kbl_gt3 = {
-   GEN9_FEATURES,
+   KBL_FEATURES,
.gt = 3,
 
.max_wm_threads = KBL_MAX_THREADS_PER_PSD * 6,
@@ -469,7 +473,7 @@ static const struct gen_device_info gen_device_info_kbl_gt3 
= {
 };
 
 static const struct gen_device_info gen_device_info_kbl_gt4 = {
-   GEN9_FEATURES,
+   KBL_FEATURES,
.gt = 4,
 
.max_wm_threads = KBL_MAX_THREADS_PER_PSD * 9,
diff --git a/src/intel/common/gen_device_info.h 
b/src/intel/common/gen_device_info.h
index 10324e6..53ac5f6 100644
--- a/src/intel/common/gen_device_info.h
+++ b/src/intel/common/gen_device_info.h
@@ -41,6 +41,7 @@ struct gen_device_info
bool is_haswell;
bool is_cherryview;
bool is_broxton;
+   bool is_kabylake;
 
bool has_hiz_and_separate_stencil;
bool must_use_separate_stencil;
-- 
2.5.0.400.gff86faf
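
The KBL_FEATURES macro works because a feature macro is just a comma-separated run of designated initializers, so one macro can extend another. A minimal sketch of that pattern follows; the struct and the `_SKETCH` names are hypothetical and far smaller than the real `gen_device_info` tables.

```c
#include <stdbool.h>

/* Hypothetical miniature of struct gen_device_info. */
struct device_info_sketch {
   int gen;
   int gt;
   bool is_kabylake;
};

/* A feature macro expands to designated initializers... */
#define GEN9_FEATURES_SKETCH \
   .gen = 9

/* ...so a derived macro can reuse it and append one more field. */
#define KBL_FEATURES_SKETCH \
   GEN9_FEATURES_SKETCH,    \
   .is_kabylake = true

/* Fields not named by the initializer (is_kabylake here) default to 0. */
static const struct device_info_sketch skl_gt2_sketch = {
   GEN9_FEATURES_SKETCH,
   .gt = 2,
};

static const struct device_info_sketch kbl_gt2_sketch = {
   KBL_FEATURES_SKETCH,
   .gt = 2,
};
```

The Sky Lake entry never mentions is_kabylake and therefore reads as false, which is why only the Kaby Lake tables need to change in the patch.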



Re: [Mesa-dev] [PATCH 02/59 v2] glsl/standalone: Optimize add-of-neg to subtract

2016-11-08 Thread Ian Romanick
ping

On 10/26/2016 07:17 PM, Ian Romanick wrote:
> From: Ian Romanick 
> 
> This just makes the output of the standalone compiler a little more
> compact.
> 
> v2: Fix indexing typo noticed by Iago.  Move the add_neg_to_sub_visitor
> to its own header file.  Add a unit test that exercises the visitor.
> Both the neg_a_plus_b and neg_a_plus_neg_b tests reproduced the bug that
> Iago discovered.
> 
> Signed-off-by: Ian Romanick 
> ---
>  src/compiler/Makefile.glsl.am  |   1 +
>  src/compiler/glsl/opt_add_neg_to_sub.h |  61 ++
>  src/compiler/glsl/standalone.cpp   |   4 +
>  .../glsl/tests/opt_add_neg_to_sub_test.cpp | 210 
> +
>  4 files changed, 276 insertions(+)
>  create mode 100644 src/compiler/glsl/opt_add_neg_to_sub.h
>  create mode 100644 src/compiler/glsl/tests/opt_add_neg_to_sub_test.cpp
> 
> diff --git a/src/compiler/Makefile.glsl.am b/src/compiler/Makefile.glsl.am
> index 80dfb73..4de51e4 100644
> --- a/src/compiler/Makefile.glsl.am
> +++ b/src/compiler/Makefile.glsl.am
> @@ -69,6 +69,7 @@ glsl_tests_general_ir_test_SOURCES =
> \
>   glsl/tests/builtin_variable_test.cpp\
>   glsl/tests/invalidate_locations_test.cpp\
>   glsl/tests/general_ir_test.cpp  \
> + glsl/tests/opt_add_neg_to_sub_test.cpp  \
>   glsl/tests/varyings_test.cpp
>  glsl_tests_general_ir_test_CFLAGS =  \
>   $(PTHREAD_CFLAGS)
> diff --git a/src/compiler/glsl/opt_add_neg_to_sub.h 
> b/src/compiler/glsl/opt_add_neg_to_sub.h
> new file mode 100644
> index 000..9f97071
> --- /dev/null
> +++ b/src/compiler/glsl/opt_add_neg_to_sub.h
> @@ -0,0 +1,61 @@
> +/*
> + * Copyright © 2016 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + */
> +
> +#ifndef OPT_ADD_NEG_TO_SUB_H
> +#define OPT_ADD_NEG_TO_SUB_H
> +
> +#include "ir.h"
> +#include "ir_hierarchical_visitor.h"
> +
> +class add_neg_to_sub_visitor : public ir_hierarchical_visitor {
> +public:
> +   add_neg_to_sub_visitor()
> +   {
> +  /* empty */
> +   }
> +
> +   ir_visitor_status visit_leave(ir_expression *ir)
> +   {
> +  if (ir->operation != ir_binop_add)
> + return visit_continue;
> +
> +  for (unsigned i = 0; i < 2; i++) {
> + ir_expression *const op = ir->operands[i]->as_expression();
> +
> + if (op != NULL && op->operation == ir_unop_neg) {
> +ir->operation = ir_binop_sub;
> +
> +/* This ensures that -a + b becomes b - a. */
> +if (i == 0)
> +   ir->operands[0] = ir->operands[1];
> +
> +ir->operands[1] = op->operands[0];
> +break;
> + }
> +  }
> +
> +  return visit_continue;
> +   }
> +};
> +
> +#endif /* OPT_ADD_NEG_TO_SUB_H */
> diff --git a/src/compiler/glsl/standalone.cpp 
> b/src/compiler/glsl/standalone.cpp
> index 055c433..07793a9 100644
> --- a/src/compiler/glsl/standalone.cpp
> +++ b/src/compiler/glsl/standalone.cpp
> @@ -37,6 +37,7 @@
>  #include "standalone_scaffolding.h"
>  #include "standalone.h"
>  #include "util/string_to_uint_map.h"
> +#include "opt_add_neg_to_sub.h"
>  
>  static const struct standalone_options *options;
>  
> @@ -441,6 +442,9 @@ standalone_compile_shader(const struct standalone_options 
> *_options,
>   if (!shader)
>  continue;
>  
> + add_neg_to_sub_visitor v;
> + visit_list_elements(&v, shader->ir);
> +
>   shader->Program = rzalloc(shader, gl_program);
>   init_gl_program(shader->Program, shader->Stage);
>}
> diff --git a/src/compiler/glsl/tests/opt_add_neg_to_sub_test.cpp 
> b/src/compiler/glsl/tests/opt_add_neg_to_sub_test.cpp
> new file mode 100644
> index 000..b82e47f
> --- /dev/null
> +++ b/src/compiler/glsl/tests/opt_add_neg_to_sub_test.cpp
> 
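
The operand shuffle in visit_leave is easy to misread, so here is the same rewrite on a toy expression tree in plain C. This is a sketch under assumptions: `expr_sketch` and `op_sketch` are hypothetical types standing in for ir_expression, and the function rewrites a single node only (in the patch, the hierarchical visitor handles the traversal).

```c
#include <stddef.h>

/* Hypothetical toy IR; not the GLSL IR classes. */
enum op_sketch { OP_LEAF, OP_NEG, OP_ADD, OP_SUB };

struct expr_sketch {
   enum op_sketch op;
   struct expr_sketch *operands[2];
};

/* Rewrite one add node whose operand is a negation into a subtract:
 *    -a + b   becomes   b - a
 *     a + -b  becomes   a - b
 * Only the first negated operand found is folded, as in the patch.
 */
static void
add_neg_to_sub(struct expr_sketch *e)
{
   if (e->op != OP_ADD)
      return;

   for (unsigned i = 0; i < 2; i++) {
      struct expr_sketch *const child = e->operands[i];

      if (child != NULL && child->op == OP_NEG) {
         e->op = OP_SUB;

         /* When the negation is on the left, the other operand must
          * move to the left first so that -a + b becomes b - a.
          */
         if (i == 0)
            e->operands[0] = e->operands[1];

         e->operands[1] = child->operands[0];
         break;
      }
   }
}
```

Note the -a + -b case: the loop stops at the first negation, so the result is (-b) - a, which is still equal to the original sum.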

Re: [Mesa-dev] clover: Add CL_PROGRAM_BINARY_TYPE support (CL1.2).

2016-11-08 Thread Serge Martin
On Sunday 06 November 2016 17:02:26 Dieter Nützel wrote:
> After the latest clover commit, 'luxmark-v3.0' segfaults immediately:

Hello

Did you bisect it? Luxmark seems to crash just the same here without this 
commit.

Serge

> 
> SOURCE/luxmark-v3.0> ./luxmark
> ./luxmark.bin: /usr/local/lib/libOpenCL.so.1: no version information
> available (required by ./luxmark.bin)
> *** Error in `./luxmark.bin': corrupted double-linked list:
> 0x7f51a57829e0 ***
> === Backtrace: =
> /lib64/libc.so.6(+0x727df)[0x7f51e49847df]
> /lib64/libc.so.6(+0x7804e)[0x7f51e498a04e]
> /lib64/libc.so.6(+0x782d4)[0x7f51e498a2d4]
> /lib64/libc.so.6(+0x78d01)[0x7f51e498ad01]
> /usr/local/lib/libOpenCL.so.1(+0x20b418)[0x7f51e53d7418]
> /usr/local/lib/libOpenCL.so.1(+0x20bcc7)[0x7f51e53d7cc7]
> /usr/local/lib/libOpenCL.so.1(clReleaseMemObject+0x40)[0x7f51e53ba700]
> ./luxmark.bin(_ZN3slg23PathOCLBaseRenderThread13FreeOCLBufferEPPN2cl6BufferE
> +0x52)[0x7bfb92]
> ./luxmark.bin(_ZN3slg23PathOCLBaseRenderThread4StopEv+0x6c)[0x7c3f0c]
> ./luxmark.bin(_ZN3slg19PathOCLRenderThread4StopEv+0x9)[0x701a09]
> ./luxmark.bin(_ZN3slg23PathOCLBaseRenderEngine12StopLockLessEv+0x97)[0x7bba0
> 7] ./luxmark.bin(_ZN3slg12RenderEngine4StopEv+0x26)[0x666a26]
> ./luxmark.bin(_ZN3slg13RenderSessionD1Ev+0x7e)[0x658e8e]
> ./luxmark.bin(_ZN7luxcore13RenderSessionD1Ev+0x22)[0x5e5502]
> ./luxmark.bin[0x5e1db2]
> ./luxmark.bin[0x5c69a4]
> ./luxmark.bin[0x5c6b6f]
> ./luxmark.bin[0x5cf6de]
> ./luxmark.bin[0x86e83c]
> ./luxmark.bin[0x8742d7]
> ./luxmark.bin[0xf981ad]
> ./luxmark.bin[0xf9a770]
> ./luxmark.bin[0xfbb67f]
> ./luxmark.bin[0x8ed424]
> ./luxmark.bin[0xf9705f]
> ./luxmark.bin[0xf9735e]
> ./luxmark.bin[0xf9b58b]
> ./luxmark.bin[0x5ad82a]
> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f51e4933b05]
> ./luxmark.bin[0x5bea37]
> === Memory map: 
> 0040-01e7f000 r-xp  09:00 1054819
> /tmp/INSTALL/SOURCE/luxmark-v3.0/luxmark.bin
> 0207f000-020f5000 r--p 01a7f000 09:00 1054819
> /tmp/INSTALL/SOURCE/luxmark-v3.0/luxmark.bin
> 020f5000-0000 rw-p 01af5000 09:00 1054819
> /tmp/INSTALL/SOURCE/luxmark-v3.0/luxmark.bin
> 0000-02284000 rw-p  00:00 0
> 02882000-03246000 rw-p  00:00 0
> [heap]
> 7f519bfde000-7f51a400 rw-p  00:00 0
> 7f51a400-7f51a7ff8000 rw-p  00:00 0
> 7f51a7ff8000-7f51a800 ---p  00:00 0
> 7f51a8855000-7f51a9219000 rw-s 12cb3b000 00:06 13357
> /dev/dri/renderD128
> 7f51a9219000-7f51ac00 rw-p  00:00 0
> 7f51ac00-7f51ac0e3000 rw-p  00:00 0
> 7f51ac0e3000-7f51b000 ---p  00:00 0
> 7f51b000-7f51b00e4000 rw-p  00:00 0
> 7f51b00e4000-7f51b400 ---p  00:00 0
> 7f51b400-7f51b40e3000 rw-p  00:00 0
> 7f51b40e3000-7f51b800 ---p  00:00 0
> 7f51b8217000-7f51baffe000 rw-p  00:00 0
> 7f51baffe000-7f51bafff000 ---p  00:00 0
> 7f51bafff000-7f51bb7ff000 rw-p  00:00 0
> 7f51bb7ff000-7f51bb80 ---p  00:00 0
> 7f51bb80-7f51bc00 rw-p  00:00 0
> 7f51bc00-7f51bc0e3000 rw-p  00:00 0
> 7f51bc0e3000-7f51c000 ---p  00:00 0
> 7f51c000-7f51c00e3000 rw-p  00:00 0
> 7f51c00e3000-7f51c400 ---p  00:00 0
> 7f51c400-7f51c40e3000 rw-p  00:00 0
> 7f51c40e3000-7f51c800 ---p  00:00 0
> 7f51c800-7f51c80e3000 rw-p  00:00 0
> 7f51c80e3000-7f51cc00 ---p  00:00 0
> 7f51cc00-7f51cc0e3000 rw-p  00:00 0
> 7f51cc0e3000-7f51d000 ---p  00:00 0
> 7f51d0216000-7f51d0217000 ---p  00:00 0
> 7f51d0217000-7f51d0a17000 rw-p  00:00 0
> 7f51d1218000-7f51d400 rw-p  00:00 0
> 7f51d400-7f51d7ef8000 rw-p  00:00 0
> 7f51d7ef8000-7f51d800 ---p  00:00 0
> 7f51d8962000-7f51d8963000 ---p  00:00 0
> 7f51d8963000-7f51d9163000 rw-p  00:00 0
> 7f51d9163000-7f51d9164000 ---p  00:00 0
> 7f51d9164000-7f51d9964000 rw-p  00:00 0
> 7f51d9964000-7f51d9965000 ---p  00:00 0
> 7f51d9965000-7f51da165000 rw-p  00:00 0
> 7f51da165000-7f51da166000 ---p  00:00 0
> 7f51da166000-7f51da966000 rw-p  00:00 0
> 7f51da966000-7f51da967000 ---p  00:00 0
> 7f51da967000-7f51db167000 rw-p  00:00 0
> 7f51db167000-7f51db168000 ---p  00:00 0
> 7f51db168000-7f51db968000 rw-p  00:00 0
> 7f51dbcb1000-7f51dbd5b000 r--p  09:00 4983414
> /usr/share/fonts/truetype/DejaVuSans-Bold.ttf
> 7f51dbd5b000-7f51dbd5c000 ---p  00:00 0
> 7f51dbd5c000-7f51dc55c000 rw-p  00:00 0
> 7f51dc55c000-7f51dc55d000 ---p  00:00 0
> 7f51dc55d000-7f51dcd5d000 rw-p  00:00 0
> 7f51dcd5d000-7f51dcd5e000 ---p  00:00 0
> 7f51dcd5e000-7f51dd55e000 rw-p  00:00 0
> 7f51dd55e000-7f51dd56 r-xp  09:00 3554964
> /usr/lib64/libXinerama.so.1.0.0
> 7f51dd56-7f51dd75f000 ---p 2000 09:00 3554964
> /usr/lib64/libXinerama.so.1.0.0
> 7f51dd75f000-7f51d

[Mesa-dev] [PATCH 1/2] glcpp: Handle '#version 0' and other invalid values

2016-11-08 Thread Ian Romanick
From: Ian Romanick 

The #version directive can only handle decimal constants.  Enforce that
the value is a decimal constant.

Section 3.3 (Preprocessor) of the GLSL 4.50 spec says:

The language version a shader is written to is specified by

#version number profile opt

where number must be a version of the language, following the same
convention as __VERSION__ above.

The same section also says:

__VERSION__ will substitute a decimal integer reflecting the version
number of the OpenGL shading language.

Use a separate flag to track whether or not the #version line has been
encountered.  Any possible sentinel (0 is currently used) could be
specified in a #version directive.  This would lead to trying to
(internally) redefine __VERSION__.  Since there is no parser location
for this addition, NULL is passed.  This eventually results in a NULL
dereference and a segfault.

Attempts to use -1 as the sentinel would also fail if '#version
4294967295' or '#version 18446744073709551615' were used.  We should
have piglit tests for both of these.
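
The sentinel problem can be shown with a few lines of C. This is a sketch, not the glcpp code: `parser_sketch` is a hypothetical two-field struct that keeps only the `unsigned version` member and the new `version_set` flag, and both helper names are made up for illustration.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduction of struct glcpp_parser. */
struct parser_sketch {
   unsigned version;
   bool version_set;
};

/* What a "-1 means unset" check would look like on an unsigned field. */
static bool
sentinel_says_unset(const struct parser_sketch *p)
{
   return p->version == (unsigned)-1;
}

/* The flag-based approach: record the value and that it was set. */
static void
set_version(struct parser_sketch *p, intmax_t version)
{
   p->version = (unsigned)version;
   p->version_set = true;
}
```

Feeding UINT_MAX (4294967295 where unsigned is 32 bits) through set_version leaves the field equal to (unsigned)-1, so a sentinel check would still report "unset" even though a #version line was seen. The separate flag has no such collision, for 0 or for any other value.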

Signed-off-by: Ian Romanick 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97420
Cc: mesa-sta...@lists.freedesktop.org
Cc: Juan A. Suarez Romero 
Cc: Karol Herbst 
---
 src/compiler/glsl/glcpp/glcpp-parse.y | 25 +++--
 src/compiler/glsl/glcpp/glcpp.h   |  9 +
 2 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y 
b/src/compiler/glsl/glcpp/glcpp-parse.y
index b80ff04..63012bc 100644
--- a/src/compiler/glsl/glcpp/glcpp-parse.y
+++ b/src/compiler/glsl/glcpp/glcpp-parse.y
@@ -177,7 +177,7 @@ add_builtin_define(glcpp_parser_t *parser, const char 
*name, int value);
  * (such as the  and  start conditions in the lexer). */
 %token DEFINED ELIF_EXPANDED HASH_TOKEN DEFINE_TOKEN FUNC_IDENTIFIER 
OBJ_IDENTIFIER ELIF ELSE ENDIF ERROR_TOKEN IF IFDEF IFNDEF LINE PRAGMA UNDEF 
VERSION_TOKEN GARBAGE IDENTIFIER IF_EXPANDED INTEGER INTEGER_STRING 
LINE_EXPANDED NEWLINE OTHER PLACEHOLDER SPACE PLUS_PLUS MINUS_MINUS
 %token PASTE
-%type  INTEGER operator SPACE integer_constant
+%type  INTEGER operator SPACE integer_constant version_constant
 %type  expression
 %type  IDENTIFIER FUNC_IDENTIFIER OBJ_IDENTIFIER INTEGER_STRING OTHER 
ERROR_TOKEN PRAGMA
 %type  identifier_list
@@ -419,14 +419,14 @@ control_line_success:
 |  HASH_TOKEN ENDIF {
_glcpp_parser_skip_stack_pop (parser, & @1);
} NEWLINE
-|  HASH_TOKEN VERSION_TOKEN integer_constant NEWLINE {
-   if (parser->version != 0) {
+|  HASH_TOKEN VERSION_TOKEN version_constant NEWLINE {
+   if (parser->version_set) {
glcpp_error(& @1, parser, "#version must appear on the 
first line");
}
_glcpp_parser_handle_version_declaration(parser, $3, NULL, 
true);
}
-|  HASH_TOKEN VERSION_TOKEN integer_constant IDENTIFIER NEWLINE {
-   if (parser->version != 0) {
+|  HASH_TOKEN VERSION_TOKEN version_constant IDENTIFIER NEWLINE {
+   if (parser->version_set) {
glcpp_error(& @1, parser, "#version must appear on the 
first line");
}
_glcpp_parser_handle_version_declaration(parser, $3, $4, true);
@@ -465,6 +465,17 @@ integer_constant:
$$ = $1;
}
 
+version_constant:
+   INTEGER_STRING {
+  /* Both octal and hexadecimal constants begin with 0. */
+  if ($1[0] == '0' && $1[1] != '\0') {
+   glcpp_error(&@1, parser, "invalid #version \"%s\" (not a 
decimal constant)", $1);
+   $$ = 0;
+  } else {
+   $$ = strtoll($1, NULL, 10);
+  }
+   }
+
 expression:
integer_constant {
$$.value = $1;
@@ -1361,6 +1372,7 @@ glcpp_parser_create(glcpp_extension_iterator extensions, 
void *state, gl_api api
parser->state = state;
parser->api = api;
parser->version = 0;
+   parser->version_set = false;
 
parser->has_new_line_number = 0;
parser->new_line_number = 1;
@@ -2293,10 +2305,11 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t 
*parser, intmax_t versio
  const char *es_identifier,
  bool explicitly_set)
 {
-   if (parser->version != 0)
+   if (parser->version_set)
   return;
 
parser->version = version;
+   parser->version_set = true;
 
add_builtin_define (parser, "__VERSION__", version);
 
diff --git a/src/compiler/glsl/glcpp/glcpp.h b/src/compiler/glsl/glcpp/glcpp.h
index bb4ad67..232e053 100644
--- a/src/compiler/glsl/glcpp/glcpp.h
+++ b/src/compiler/glsl/glcpp/glcpp.h
@@ -208,6 +208,15 @@ struct glcpp_parser {
void *state;
gl_api api;
unsigned version;
+
+   /**
+* Has the #version been set?
+*
+* A separate flag is used because any possible senti

[Mesa-dev] [PATCH 2/2] glsl: Parse 0 as a preprocessor INTCONSTANT

2016-11-08 Thread Ian Romanick
From: Ian Romanick 

This allows a more reasonable error message for '#version 0' of

0:1(10): error: GLSL 0.00 is not supported. Supported versions are: 1.10, 
1.20, 1.30, 1.00 ES, 3.00 ES, 3.10 ES, and 3.20 ES

instead of

0:1(10): error: syntax error, unexpected $undefined, expecting INTCONSTANT

Signed-off-by: Ian Romanick 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97420
Cc: mesa-sta...@lists.freedesktop.org
Cc: Juan A. Suarez Romero 
Cc: Karol Herbst 
---
 src/compiler/glsl/glsl_lexer.ll | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/compiler/glsl/glsl_lexer.ll b/src/compiler/glsl/glsl_lexer.ll
index b473af7..0e722cb 100644
--- a/src/compiler/glsl/glsl_lexer.ll
+++ b/src/compiler/glsl/glsl_lexer.ll
@@ -253,6 +253,10 @@ HASH   ^{SPC}#{SPC}
yylval->n = strtol(yytext, NULL, 10);
return INTCONSTANT;
}
+0  {
+   yylval->n = 0;
+   return INTCONSTANT;
+   }
 \n { BEGIN 0; yylineno++; yycolumn = 0; return 
EOL; }
 .  { return yytext[0]; }
 
-- 
2.5.5



Re: [Mesa-dev] [PATCH] glsl: Do not allow scalar types in vector relational functions

2016-11-08 Thread Matt Turner
On Tue, Nov 8, 2016 at 12:01 PM, Ian Romanick  wrote:
> On 11/08/2016 11:58 AM, Ian Romanick wrote:
>> On 11/04/2016 12:23 PM, Matt Turner wrote:
>>> On Sun, Oct 30, 2016 at 11:45 PM, Boyan Ding  wrote:
>>>> According to OpenGL Shading Language 4.50 spec, Section 8.7 "Vector
>>>> Relational Functions", functions of this type do not operate on scalar
>>>> types, so remove scalar types from signature definitions to make the
>>>> behavior consistent with glslangValidator and other drivers.
>>>
>>> Yep. Looks like it's always been this way.
>>>
>>> The patch is
>>>
>>> Reviewed-by: Matt Turner 
>>>
>>> Since this seems to be untested by any suite, could you provide some
>>> piglit parser tests that confirm that lessThanEqual(scalar, scalar),
>>> et al doesn't work?
>>>
>>> Rant: what a stupid mess to require <= for scalars but lessThanEqual
>>> for vectors.
>>
>> I think it makes sense.  What would 'vec4(...) < vec4(...)' return?  A

Yes, vec4(...) < vec4(...) would return a bvec4().

>> bvec4?  How much time would you like to spend with the compiler error that
>> would result from
>
> Or did you mean that being unable to use lessThanEqual and friends with
> scalars is stupid?  I think we just didn't consider that when we added
> the ability to swizzle scalars, but we probably should have added that
> too.  I could get behind an extension that makes vector functions also
> work with vec1.

Definitely that, but I'm not convinced that supporting the operators
on vectors would be bad either.

bvec4 x = ...
if (x)
[...]

gives a very appropriate error on our compiler:

> error: if-statement condition must be scalar boolean
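
To make the scalar-versus-vector point concrete, here is a C model of what a component-wise comparison produces and why its result cannot drive an if-statement directly. The types and function names are hypothetical stand-ins; the behaviour modelled is that of the GLSL built-ins lessThanEqual(), all() and any().

```c
#include <stdbool.h>

/* Hypothetical C stand-ins for GLSL vec4 and bvec4. */
struct vec4_sketch  { float v[4]; };
struct bvec4_sketch { bool  v[4]; };

/* Component-wise a <= b: one boolean per component, not one overall. */
static struct bvec4_sketch
less_than_equal(struct vec4_sketch a, struct vec4_sketch b)
{
   struct bvec4_sketch r;
   for (int i = 0; i < 4; i++)
      r.v[i] = a.v[i] <= b.v[i];
   return r;
}

/* Reductions that turn a boolean vector into the scalar an if needs. */
static bool
all_sketch(struct bvec4_sketch x)
{
   return x.v[0] && x.v[1] && x.v[2] && x.v[3];
}

static bool
any_sketch(struct bvec4_sketch x)
{
   return x.v[0] || x.v[1] || x.v[2] || x.v[3];
}
```

A mixed result such as (true, true, false, true) has no single truth value, so the author has to choose between the all and any reductions explicitly. That is the ambiguity the "condition must be scalar boolean" error forces into the open.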


Re: [Mesa-dev] [PATCH] mesa: Fix pixel shader scratch space allocation on Gen9+ platforms.

2016-11-08 Thread Kenneth Graunke
On Tuesday, November 8, 2016 11:49:58 AM PST Matt Turner wrote:
> On Tue, Nov 8, 2016 at 10:25 AM, Kenneth Graunke  
> wrote:
> > We had missed a bit of errata - PS scratch needs to be computed as if
> > there were 4 subslices per slice, rather than 3.
> >
> >                        Skylake          Broxton  Kabylake
> >                        GT1 GT2 GT3 GT4  2x6 3x6  GT1 GT1.5 GT2 GT3 GT4
> > Actual Slices            1   1   2   3    1   1    1     1   1   2   3
> > Total Subslices          3   3   6   9    2   3    2     3   3   6   9
> > Subsl. for PS Scratch    4   4   8  12    4   4    4     4   4   8  12
> >
> > Note that Skylake GT1-3 already worked because we allocated 64 * 9
> > (trying to use a value that would work on GT4, with 9 subslices),
> > and the actual required values were 64 * 4 or 64 * 8.  However, all
> > others (Skylake GT4, Broxton, and Kabylake GT1-4) underallocated,
> > which can lead to scratch writes trashing random process memory,
> > and rendering corruption or GPU hangs.
> >
> > Fixes GPU hangs and rendering corruption on Skylake GT4 in shaders that
> > spill.  Particularly, dEQP-GLES31.functional.ubo.all_per_block_buffers.*
> > now runs successfully with no hangs and renders correctly.  This may
> > fix problems on Broxton and Kabylake as well.
> >
> > Cc: "13.0" 
> > Signed-off-by: Kenneth Graunke 
> > ---
> >  src/intel/common/gen_device_info.c | 33 +++--
> >  1 file changed, 19 insertions(+), 14 deletions(-)
> >
> > diff --git a/src/intel/common/gen_device_info.c 
> > b/src/intel/common/gen_device_info.c
> > index 30df0b2..1dc1769 100644
> > --- a/src/intel/common/gen_device_info.c
> > +++ b/src/intel/common/gen_device_info.c
> > @@ -335,7 +335,6 @@ static const struct gen_device_info gen_device_info_chv 
> > = {
> > .max_gs_threads = 336,   \
> > .max_tcs_threads = 336,  \
> > .max_tes_threads = 336,  \
> > -   .max_wm_threads = 64 * 9,\
> 
> Is this intentional? I don't see CHV called out in the commit message,
> and the new code at the bottom is for gen >= 9, while CHV is 8.

Sorry, I should have used a bigger -U setting when sending these out.
This change is to the GEN9_FEATURES macro which is directly below the
chv struct...and diff doesn't handle the header nicely.

--Ken
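
The arithmetic behind the errata is small enough to write out. The sketch below assumes 64 threads per subslice (the "64 * 9" in the commit message) and the rule "4 subslices per slice" for scratch sizing; the helper names are hypothetical, and the part of the patch that actually computes this is not shown in the quoted diff.

```c
/* Threads per subslice, taken from the "64 * 9" in the commit message. */
#define THREADS_PER_SUBSLICE_SKETCH 64u

/* Errata: size PS scratch as if every slice had 4 subslices. */
static unsigned
ps_scratch_subslices(unsigned slices)
{
   return 4u * slices;
}

/* Thread count that PS scratch space must be sized for. */
static unsigned
ps_scratch_threads(unsigned slices)
{
   return THREADS_PER_SUBSLICE_SKETCH * ps_scratch_subslices(slices);
}
```

The old constant was 64 * 9 = 576. One or two slices need 256 or 512, which fits, but a three-slice part such as Skylake GT4 needs 64 * 12 = 768, so 576 underallocates there.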




Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds

2016-11-08 Thread Emil Velikov
On 8 November 2016 at 17:57, Kyriazis, George  wrote:

>> This is how I folded/cleaned up the autoconf build with commit
>> bb949e262cb5c4fffe991debc605447e15322666. A similar solution here would
>> be great/possible.
>>
>
> Can I take care of it on a follow-on check-in?  Ie. check-in as-is for now?
>
Based on the bb949e262 history (was supposed to be a follow-on) I'm
leaning towards - please take care of this in v2.

Thanks
Emil


Re: [Mesa-dev] [PATCH] glsl: Do not allow scalar types in vector relational functions

2016-11-08 Thread Ian Romanick
On 11/08/2016 11:58 AM, Ian Romanick wrote:
> On 11/04/2016 12:23 PM, Matt Turner wrote:
>> On Sun, Oct 30, 2016 at 11:45 PM, Boyan Ding  wrote:
>>> According to OpenGL Shading Language 4.50 spec, Section 8.7 "Vector
>>> Relational Functions", functions of this type do not operate on scalar
>>> types, so remove scalar types from signature definitions to make the
>>> behavior consistent with glslangValidator and other drivers.
>>
>> Yep. Looks like it's always been this way.
>>
>> The patch is
>>
>> Reviewed-by: Matt Turner 
>>
>> Since this seems to be untested by any suite, could you provide some
>> piglit parser tests that confirm that lessThanEqual(scalar, scalar),
>> et al doesn't work?
>>
>> Rant: what a stupid mess to require <= for scalars but lessThanEqual
>> for vectors.
> 
> I think it makes sense.  What would 'vec4(...) < vec4(...)' return?  A
> bvec4?  How much time would you like to spend with the compiler error that
> would result from

Or did you mean that being unable to use lessThanEqual and friends with
scalars is stupid?  I think we just didn't consider that when we added
the ability to swizzle scalars, but we probably should have added that
too.  I could get behind an extension that makes vector functions also
work with vec1.

> if (a < b) {
> ...
> }
> 
> because a and b happen to be vectors.  I bet about an equal number of
> people think that would be stupid as think the current requirement of
> comparison functions for vectors is stupid. :)
> 
>> Somewhere on my todo list is a GLSL extension that fixes things like this...



Re: [Mesa-dev] [PATCH] glsl: Do not allow scalar types in vector relational functions

2016-11-08 Thread Ian Romanick
On 11/04/2016 12:23 PM, Matt Turner wrote:
> On Sun, Oct 30, 2016 at 11:45 PM, Boyan Ding  wrote:
>> According to OpenGL Shading Language 4.50 spec, Section 8.7 "Vector
>> Relational Functions", functions of this type do not operate on scalar
>> types, so remove scalar types from signature definitions to make the
>> behavior consistent with glslangValidator and other drivers.
> 
> Yep. Looks like it's always been this way.
> 
> The patch is
> 
> Reviewed-by: Matt Turner 
> 
> Since this seems to be untested by any suite, could you provide some
> piglit parser tests that confirm that lessThanEqual(scalar, scalar),
> et al doesn't work?
> 
> Rant: what a stupid mess to require <= for scalars but lessThanEqual
> for vectors.

I think it makes sense.  What would 'vec4(...) < vec4(...)' return?  A
bvec4?  How much time would you like to spend with the compiler error that
would result from

if (a < b) {
...
}

because a and b happen to be vectors.  I bet about an equal number of
people think that would be stupid as think the current requirement of
comparison functions for vectors is stupid. :)

> Somewhere on my todo list is a GLSL extension that fixes things like this...



Re: [Mesa-dev] [PATCH] mesa: Fix pixel shader scratch space allocation on Gen9+ platforms.

2016-11-08 Thread Matt Turner
On Tue, Nov 8, 2016 at 10:25 AM, Kenneth Graunke  wrote:
> We had missed a bit of errata - PS scratch needs to be computed as if
> there were 4 subslices per slice, rather than 3.
>
>                        Skylake          Broxton  Kabylake
>                        GT1 GT2 GT3 GT4  2x6 3x6  GT1 GT1.5 GT2 GT3 GT4
> Actual Slices            1   1   2   3    1   1    1     1   1   2   3
> Total Subslices          3   3   6   9    2   3    2     3   3   6   9
> Subsl. for PS Scratch    4   4   8  12    4   4    4     4   4   8  12
>
> Note that Skylake GT1-3 already worked because we allocated 64 * 9
> (trying to use a value that would work on GT4, with 9 subslices),
> and the actual required values were 64 * 4 or 64 * 8.  However, all
> others (Skylake GT4, Broxton, and Kabylake GT1-4) underallocated,
> which can lead to scratch writes trashing random process memory,
> and rendering corruption or GPU hangs.
>
> Fixes GPU hangs and rendering corruption on Skylake GT4 in shaders that
> spill.  Particularly, dEQP-GLES31.functional.ubo.all_per_block_buffers.*
> now runs successfully with no hangs and renders correctly.  This may
> fix problems on Broxton and Kabylake as well.
>
> Cc: "13.0" 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/intel/common/gen_device_info.c | 33 +++--
>  1 file changed, 19 insertions(+), 14 deletions(-)
>
> diff --git a/src/intel/common/gen_device_info.c 
> b/src/intel/common/gen_device_info.c
> index 30df0b2..1dc1769 100644
> --- a/src/intel/common/gen_device_info.c
> +++ b/src/intel/common/gen_device_info.c
> @@ -335,7 +335,6 @@ static const struct gen_device_info gen_device_info_chv = 
> {
> .max_gs_threads = 336,   \
> .max_tcs_threads = 336,  \
> .max_tes_threads = 336,  \
> -   .max_wm_threads = 64 * 9,\

Is this intentional? I don't see CHV called out in the commit message,
and the new code at the bottom is for gen >= 9, while CHV is 8.


Re: [Mesa-dev] [PATCH] glcpp: initializes version to -1

2016-11-08 Thread Ian Romanick
I should have a fix for all of these problems out in about an hour.  I'm
just running it through the CI now.

On 11/05/2016 02:48 AM, Karol Herbst wrote:
> 2016-11-05 2:50 GMT+01:00 Ian Romanick :
>> (Sorry about the top post. Sent from my phone.)
>>
>> That expression will allow versions like 0130 as valid.  If you just want to
>> allow 0, you need a more complex regular expression.  I feel like that's
>> just a bandage... what about other bad values like "#version -130"?  Won't
>> that have the same problem that 0 currently has?
>>
> 
> no, it doesn't.
> 
> I tested the patch with glsl_compiler
> 
> "#version 0130": 0:1(10): error: GLSL 0.88 is not supported. Supported
> versions are: 1.10, 1.20, 1.30, 1.00 ES, and 3.00 ES
> 
> "#version 0": 0:1(10): error: GLSL 0.00 is not supported. Supported
> versions are: 1.10, 1.20, 1.30, 1.00 ES, and 3.00 ES
> 
> "#version -130":0:1(10): preprocessor error: syntax error, unexpected
> '-', expecting INTEGER or INTEGER_STRING
> 
> but
> 
> "#version 0512": 0:1(10): error: GLSL 3.30 is not supported. Supported
> versions are: 1.10, 1.20, 1.30, 1.00 ES, and 3.00 ES
> 
> so the issue with this would be, that "0512" is parsed as 3.30, which
> isn't right either, but the current master version does the same. \o/
> new bug found
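
The 0.88 and 3.30 figures above are what one gets when a leading zero is treated as an octal prefix. The sketch below reproduces the numbers with the C library; it is an illustration only and does not claim to be how glcpp converts the token. Both helper names are hypothetical. `is_decimal_constant` mirrors the leading-zero test used by the version_constant rule in Ian's patch.

```c
#include <stdbool.h>
#include <stdlib.h>

/* strtol with base 0 honours C-style prefixes: "0..." is octal and
 * "0x..." is hexadecimal.
 */
static long
parse_auto_base(const char *s)
{
   return strtol(s, NULL, 0);
}

/* Both octal and hexadecimal constants begin with '0', so a leading
 * zero followed by anything else means "not a decimal constant".
 * A lone "0" is still accepted as decimal zero.
 */
static bool
is_decimal_constant(const char *s)
{
   return !(s[0] == '0' && s[1] != '\0');
}
```

Octal 0130 is 88 and octal 0512 is 330, matching the "GLSL 0.88" and "GLSL 3.30" messages. Rejecting any multi-character string that starts with '0' rules out both before they are converted.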
> 
>>
>> On November 4, 2016 6:09:58 AM "Juan A. Suarez Romero" 
>> wrote:
>>
>>> Shader can define #version as an integer, including 0.
>>>
>>> Initializes version to -1 to know later if shader has defined a #version
>>> or not.
>>>
>>> It fixes 4 piglit tests:
>>>   spec/glsl-1.10/compiler/version-0.frag: crash pass
>>>   spec/glsl-1.10/compiler/version-0.vert: crash pass
>>>   spec/glsl-es-3.00/compiler/version-0.frag: crash pass
>>>   spec/glsl-es-3.00/compiler/version-0.vert: crash pass
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97420
>>> ---
>>>  src/compiler/glsl/glcpp/glcpp-parse.y | 8 
>>>  src/compiler/glsl/glcpp/glcpp.h   | 2 +-
>>>  src/compiler/glsl/glsl_lexer.ll   | 2 +-
>>>  3 files changed, 6 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y
>>> b/src/compiler/glsl/glcpp/glcpp-parse.y
>>> index b80ff04..6207a62 100644
>>> --- a/src/compiler/glsl/glcpp/glcpp-parse.y
>>> +++ b/src/compiler/glsl/glcpp/glcpp-parse.y
>>> @@ -420,13 +420,13 @@ control_line_success:
>>> _glcpp_parser_skip_stack_pop (parser, & @1);
>>> } NEWLINE
>>>  |  HASH_TOKEN VERSION_TOKEN integer_constant NEWLINE {
>>> -   if (parser->version != 0) {
>>> +   if (parser->version != -1) {
>>> glcpp_error(& @1, parser, "#version must appear on
>>> the first line");
>>> }
>>> _glcpp_parser_handle_version_declaration(parser, $3, NULL,
>>> true);
>>> }
>>>  |  HASH_TOKEN VERSION_TOKEN integer_constant IDENTIFIER NEWLINE {
>>> -   if (parser->version != 0) {
>>> +   if (parser->version != -1) {
>>> glcpp_error(& @1, parser, "#version must appear on
>>> the first line");
>>> }
>>> _glcpp_parser_handle_version_declaration(parser, $3, $4,
>>> true);
>>> @@ -1360,7 +1360,7 @@ glcpp_parser_create(glcpp_extension_iterator
>>> extensions, void *state, gl_api api
>>> parser->extensions = extensions;
>>> parser->state = state;
>>> parser->api = api;
>>> -   parser->version = 0;
>>> +   parser->version = -1;
>>>
>>> parser->has_new_line_number = 0;
>>> parser->new_line_number = 1;
>>> @@ -2293,7 +2293,7 @@
>>> _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t
>>> versio
>>>   const char *es_identifier,
>>>   bool explicitly_set)
>>>  {
>>> -   if (parser->version != 0)
>>> +   if (parser->version != -1)
>>>return;
>>>
>>> parser->version = version;
>>> diff --git a/src/compiler/glsl/glcpp/glcpp.h
>>> b/src/compiler/glsl/glcpp/glcpp.h
>>> index bb4ad67..2acac0c 100644
>>> --- a/src/compiler/glsl/glcpp/glcpp.h
>>> +++ b/src/compiler/glsl/glcpp/glcpp.h
>>> @@ -207,7 +207,7 @@ struct glcpp_parser {
>>> glcpp_extension_iterator extensions;
>>> void *state;
>>> gl_api api;
>>> -   unsigned version;
>>> +   int version;
>>> bool has_new_line_number;
>>> int new_line_number;
>>> bool has_new_source_number;
>>> diff --git a/src/compiler/glsl/glsl_lexer.ll
>>> b/src/compiler/glsl/glsl_lexer.ll
>>> index b473af7..7d1d616 100644
>>> --- a/src/compiler/glsl/glsl_lexer.ll
>>> +++ b/src/compiler/glsl/glsl_lexer.ll
>>> @@ -249,7 +249,7 @@ HASH^{SPC}#{SPC}
>>>yylval->identifier =
>>> linear_strdup(mem_ctx, yytext);
>>>return IDENTIFIER;
>>> }
>>> -[1-9][0-9]*{
>>> +[0-9][0-9]*{

Re: [Mesa-dev] [PATCH] gallivm: fix [IU]MUL_HI regression

2016-11-08 Thread Marek Olšák
FYI, this doesn't fix the regression fully. (GLCTS failures with
piglit: -t mulextended)

Marek


[Mesa-dev] [PATCH 12.0 backport] intel: Fix pixel shader scratch space allocation on Gen9+ platforms.

2016-11-08 Thread Kenneth Graunke
We had missed a bit of errata - PS scratch needs to be computed as if
there were 4 subslices per slice, rather than 3.

This is a conservative backport of commit .  It only
increases the scratch amount, unlike the original commit which decreases
it on Skylake GT1-3 to avoid overallocating.

Cc: "12.0 11.2" 
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_device_info.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c 
b/src/mesa/drivers/dri/i965/brw_device_info.c
index 77bbe78..e191c6c 100644
--- a/src/mesa/drivers/dri/i965/brw_device_info.c
+++ b/src/mesa/drivers/dri/i965/brw_device_info.c
@@ -336,7 +336,7 @@ static const struct brw_device_info brw_device_info_chv = {
.max_gs_threads = 336,   \
.max_hs_threads = 336,   \
.max_ds_threads = 336,   \
-   .max_wm_threads = 64 * 9,\
+   .max_wm_threads = 64 * 12,   \
.max_cs_threads = 56,\
.urb = { \
   .size = 384,  \
@@ -389,7 +389,7 @@ static const struct brw_device_info brw_device_info_bxt = {
.max_hs_threads = 112,
.max_ds_threads = 112,
.max_gs_threads = 112,
-   .max_wm_threads = 64 * 3,
+   .max_wm_threads = 64 * 4,
.max_cs_threads = 6 * 6,
.urb = {
   .size = 192,
@@ -412,7 +412,7 @@ static const struct brw_device_info brw_device_info_bxt_2x6 
= {
.max_hs_threads = 56, /* XXX: guess */
.max_ds_threads = 56,
.max_gs_threads = 56,
-   .max_wm_threads = 64 * 2,
+   .max_wm_threads = 64 * 4,
.max_cs_threads = 6 * 6,
.urb = {
   .size = 128,
@@ -439,7 +439,7 @@ static const struct brw_device_info brw_device_info_kbl_gt1 
= {
.gt = 1,
 
.max_cs_threads = 7 * 6,
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 2,
+   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 4,
.urb.size = 192,
.num_slices = 1,
 };
@@ -449,7 +449,7 @@ static const struct brw_device_info 
brw_device_info_kbl_gt1_5 = {
.gt = 1,
 
.max_cs_threads = 7 * 6,
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 3,
+   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 4,
.num_slices = 1,
 };
 
@@ -457,7 +457,7 @@ static const struct brw_device_info brw_device_info_kbl_gt2 
= {
GEN9_FEATURES,
.gt = 2,
 
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 3,
+   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 4,
.num_slices = 1,
 };
 
@@ -465,7 +465,7 @@ static const struct brw_device_info brw_device_info_kbl_gt3 
= {
GEN9_FEATURES,
.gt = 3,
 
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 6,
+   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 8,
.num_slices = 2,
 };
 
@@ -473,7 +473,7 @@ static const struct brw_device_info brw_device_info_kbl_gt4 
= {
GEN9_FEATURES,
.gt = 4,
 
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 9,
+   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 12,
/*
 * From the "L3 Allocation and Programming" documentation:
 *
-- 
2.10.2



[Mesa-dev] [PATCH] mesa: Fix pixel shader scratch space allocation on Gen9+ platforms.

2016-11-08 Thread Kenneth Graunke
We had missed a bit of errata - PS scratch needs to be computed as if
there were 4 subslices per slice, rather than 3.

                        Skylake           Broxton   Kabylake
                        GT1 GT2 GT3 GT4   2x6 3x6   GT1 GT1.5 GT2 GT3 GT4
Actual Slices             1   1   2   3     1   1     1     1   1   2   3
Total Subslices           3   3   6   9     2   3     2     3   3   6   9
Subsl. for PS Scratch     4   4   8  12     4   4     4     4   4   8  12

Note that Skylake GT1-3 already worked because we allocated 64 * 9
(trying to use a value that would work on GT4, with 9 subslices),
and the actual required values were 64 * 4 or 64 * 8.  However, all
others (Skylake GT4, Broxton, and Kabylake GT1-4) underallocated,
which can lead to scratch writes trashing random process memory,
and rendering corruption or GPU hangs.

Fixes GPU hangs and rendering corruption on Skylake GT4 in shaders that
spill.  Particularly, dEQP-GLES31.functional.ubo.all_per_block_buffers.*
now runs successfully with no hangs and renders correctly.  This may
fix problems on Broxton and Kabylake as well.
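
As a rough illustration of the last table row, here is a minimal sketch of
the Gen9+ computation.  The struct and function names are hypothetical
stand-ins, not the gen_device_info API; only the 64 threads per PSD and
the 4 effective subslices per slice come from this patch.

```c
/* Hypothetical stand-in for the two device info fields used here. */
struct example_device_info {
   int gen;
   int num_slices;
};

/* Number of threads to size pixel shader scratch for.  On Gen9+ this
 * assumes 4 subslices per slice regardless of how many are actually
 * present; on older parts the per-device table value is kept. */
unsigned
example_ps_scratch_threads(const struct example_device_info *devinfo,
                           unsigned table_value)
{
   if (devinfo->gen >= 9)
      return 64u /* threads-per-PSD */
             * (unsigned) devinfo->num_slices
             * 4u; /* effective subslices per slice */

   return table_value;
}
```

For a three-slice part this gives 64 * 12 = 768, and for a one-slice part
64 * 4 = 256, matching the "Subsl. for PS Scratch" row.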

Cc: "13.0" 
Signed-off-by: Kenneth Graunke 
---
 src/intel/common/gen_device_info.c | 33 +++--
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/src/intel/common/gen_device_info.c 
b/src/intel/common/gen_device_info.c
index 30df0b2..1dc1769 100644
--- a/src/intel/common/gen_device_info.c
+++ b/src/intel/common/gen_device_info.c
@@ -335,7 +335,6 @@ static const struct gen_device_info gen_device_info_chv = {
.max_gs_threads = 336,   \
.max_tcs_threads = 336,  \
.max_tes_threads = 336,  \
-   .max_wm_threads = 64 * 9,\
.max_cs_threads = 56,\
.urb = { \
   .size = 384,  \
@@ -388,7 +387,6 @@ static const struct gen_device_info gen_device_info_bxt = {
.max_tcs_threads = 112,
.max_tes_threads = 112,
.max_gs_threads = 112,
-   .max_wm_threads = 64 * 3,
.max_cs_threads = 6 * 6,
.urb = {
   .size = 192,
@@ -411,7 +409,6 @@ static const struct gen_device_info gen_device_info_bxt_2x6 
= {
.max_tcs_threads = 56, /* XXX: guess */
.max_tes_threads = 56,
.max_gs_threads = 56,
-   .max_wm_threads = 64 * 2,
.max_cs_threads = 6 * 6,
.urb = {
   .size = 128,
@@ -427,18 +424,11 @@ static const struct gen_device_info 
gen_device_info_bxt_2x6 = {
  * There's no KBL entry. Using the default SKL (GEN9) GS entries value.
  */
 
-/*
- * Both SKL and KBL support a maximum of 64 threads per
- * Pixel Shader Dispatch (PSD) unit.
- */
-#define  KBL_MAX_THREADS_PER_PSD 64
-
 static const struct gen_device_info gen_device_info_kbl_gt1 = {
GEN9_FEATURES,
.gt = 1,
 
.max_cs_threads = 7 * 6,
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 2,
.urb.size = 192,
.num_slices = 1,
 };
@@ -448,7 +438,6 @@ static const struct gen_device_info 
gen_device_info_kbl_gt1_5 = {
.gt = 1,
 
.max_cs_threads = 7 * 6,
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 3,
.num_slices = 1,
 };
 
@@ -456,7 +445,6 @@ static const struct gen_device_info gen_device_info_kbl_gt2 
= {
GEN9_FEATURES,
.gt = 2,
 
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 3,
.num_slices = 1,
 };
 
@@ -464,7 +452,6 @@ static const struct gen_device_info gen_device_info_kbl_gt3 
= {
GEN9_FEATURES,
.gt = 3,
 
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 6,
.num_slices = 2,
 };
 
@@ -472,7 +459,6 @@ static const struct gen_device_info gen_device_info_kbl_gt4 
= {
GEN9_FEATURES,
.gt = 4,
 
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 9,
/*
 * From the "L3 Allocation and Programming" documentation:
 *
@@ -500,6 +486,25 @@ gen_get_device_info(int devid, struct gen_device_info 
*devinfo)
   return false;
}
 
+   /* From the Skylake PRM, 3DSTATE_PS::Scratch Space Base Pointer:
+*
+* "Scratch Space per slice is computed based on 4 sub-slices.  SW must
+*  allocate scratch space enough so that each slice has 4 slices allowed."
+*
+* The equivalent internal documentation says that this programming note
+* applies to all Gen9+ platforms.
+*
+* The hardware typically calculates the scratch space pointer by taking
+* the base address, and adding per-thread-scratch-space * thread ID.
+* Extra padding can be necessary depending how the thread IDs are
+* calculated for a particular shader stage.
+*/
+   if (devinfo->gen >= 9) {
+  devinfo->max_wm_threads = 64 /* threads-per-PSD */
+  * devinfo->num_slices
+  * 4; /* effective subslices per slice */
+   }
+
return true;
 }
 
-- 
2.10.2


Re: [Mesa-dev] [PATCH 2/2] i965: Enable several GLES 3.1 extensions on HSW+

2016-11-08 Thread Kenneth Graunke
On Tuesday, November 8, 2016 10:10:35 AM PST Ian Romanick wrote:
> From: Ian Romanick 
> 
> The only reason we didn't previously enable this was the dependency on
> OpenGL ES 3.1.  These should have been enabled as soon as HSW got
> stencil texturing.  We also needed to fixup setting MaxViewports.
> 
> Signed-off-by: Ian Romanick 
> ---
>  docs/features.txt| 6 +++---
>  docs/relnotes/12.1.0.html| 6 +++---
>  src/mesa/drivers/dri/i965/intel_extensions.c | 6 +++---
>  3 files changed, 9 insertions(+), 9 deletions(-)

Patch doesn't apply against master - there is no relnotes/12.1.0.html.

Assuming you make it apply, and have regression tested these, they are
Reviewed-by: Kenneth Graunke 




Re: [Mesa-dev] [PATCH 2/2] i965: Enable several GLES 3.1 extensions on HSW+

2016-11-08 Thread Ilia Mirkin
On Tue, Nov 8, 2016 at 1:10 PM, Ian Romanick  wrote:
> diff --git a/docs/relnotes/12.1.0.html b/docs/relnotes/12.1.0.html
> index c7e4d01..b8862d3 100644
> --- a/docs/relnotes/12.1.0.html
> +++ b/docs/relnotes/12.1.0.html

This is not the relnotes file you're looking for.

  -ilia


[Mesa-dev] [PATCH 2/2] i965: Enable several GLES 3.1 extensions on HSW+

2016-11-08 Thread Ian Romanick
From: Ian Romanick 

The only reason we didn't previously enable this was the dependency on
OpenGL ES 3.1.  These should have been enabled as soon as HSW got
stencil texturing.  We also needed to fixup setting MaxViewports.

Signed-off-by: Ian Romanick 
---
 docs/features.txt| 6 +++---
 docs/relnotes/12.1.0.html| 6 +++---
 src/mesa/drivers/dri/i965/intel_extensions.c | 6 +++---
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/docs/features.txt b/docs/features.txt
index a677bfb..b1f9384 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -260,18 +260,18 @@ GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+
   GL_OES_copy_image DONE (all drivers)
   GL_OES_draw_buffers_indexed   DONE (all drivers that 
support GL_ARB_draw_buffers_blend)
   GL_OES_draw_elements_base_vertex  DONE (all drivers)
-  GL_OES_geometry_shaderDONE (i965/gen8+, 
nvc0, radeonsi)
+  GL_OES_geometry_shaderDONE (i965/hsw+, nvc0, 
radeonsi)
   GL_OES_gpu_shader5DONE (all drivers that 
support GL_ARB_gpu_shader5)
   GL_OES_primitive_bounding_box DONE (i965/gen7+, 
nvc0, radeonsi)
   GL_OES_sample_shading DONE (i965, nvc0, 
r600, radeonsi)
   GL_OES_sample_variables   DONE (i965, nvc0, 
r600, radeonsi)
   GL_OES_shader_image_atomicDONE (all drivers that 
support GL_ARB_shader_image_load_store)
-  GL_OES_shader_io_blocks   DONE (i965/gen8+, 
nvc0, radeonsi)
+  GL_OES_shader_io_blocks   DONE (All drivers that 
support GLES 3.1)
   GL_OES_shader_multisample_interpolation   DONE (i965, nvc0, 
r600, radeonsi)
   GL_OES_tessellation_shaderDONE (all drivers that 
support GL_ARB_tessellation_shader)
   GL_OES_texture_border_clamp   DONE (all drivers)
   GL_OES_texture_buffer DONE (i965, nvc0, 
radeonsi)
-  GL_OES_texture_cube_map_array DONE (i965/gen8+, 
nvc0, radeonsi)
+  GL_OES_texture_cube_map_array DONE (i965/hsw+, nvc0, 
radeonsi)
   GL_OES_texture_stencil8   DONE (all drivers that 
support GL_ARB_texture_stencil8)
   GL_OES_texture_storage_multisample_2d_array   DONE (all drivers that 
support GL_ARB_texture_multisample)
 
diff --git a/docs/relnotes/12.1.0.html b/docs/relnotes/12.1.0.html
index c7e4d01..b8862d3 100644
--- a/docs/relnotes/12.1.0.html
+++ b/docs/relnotes/12.1.0.html
@@ -64,11 +64,11 @@ Note: some of the new features are only available with 
certain drivers.
 GL_KHR_robustness on nvc0, radeonsi
 GL_KHR_texture_compression_astc_sliced_3d on i965
 GL_OES_copy_image on nv50, nvc0, r600, radeonsi, softpipe, llvmpipe
-GL_OES_geometry_shader on i965/gen8+, nvc0, radeonsi
+GL_OES_geometry_shader on i965/hsw+, nvc0, radeonsi
 GL_OES_primitive_bounding_box on i965/gen7+, nvc0, radeonsi
-GL_OES_texture_cube_map_array on i965/gen8+, nvc0, radeonsi
+GL_OES_texture_cube_map_array on i965/hsw+, nvc0, radeonsi
 GL_OES_tessellation_shader on i965/gen7+, nvc0, radeonsi
-GL_OES_viewport_array on nvc0, radeonsi
+GL_OES_viewport_array on i965/hsw+, nvc0, radeonsi
 GL_ANDROID_extension_pack_es31a on i965/gen9+
 
 
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index 66079b5..cbde3fe 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -380,6 +380,9 @@ intelInitExtensions(struct gl_context *ctx)
if (brw->gen >= 8 || brw->is_haswell) {
   ctx->Extensions.ARB_stencil_texturing = true;
   ctx->Extensions.ARB_texture_stencil8 = true;
+  ctx->Extensions.OES_geometry_shader = true;
+  ctx->Extensions.OES_texture_cube_map_array = true;
+  ctx->Extensions.OES_viewport_array = true;
}
 
if (brw->gen >= 8 || brw->is_haswell || brw->is_baytrail) {
@@ -403,9 +406,6 @@ intelInitExtensions(struct gl_context *ctx)
   ctx->Extensions.ARB_shader_precision = true;
   ctx->Extensions.ARB_vertex_attrib_64bit = true;
   ctx->Extensions.ARB_ES3_2_compatibility = true;
-  ctx->Extensions.OES_geometry_shader = true;
-  ctx->Extensions.OES_texture_cube_map_array = true;
-  ctx->Extensions.OES_viewport_array = true;
}
 
if (brw->gen >= 9) {
-- 
2.5.5



[Mesa-dev] [PATCH 1/2] i965: Always set MaxViewports and related limits

2016-11-08 Thread Ian Romanick
From: Ian Romanick 

Since 9d6ca7c3, there should be no performance hit for having
MaxViewports > 1.  Always set this context state.  This eliminates the
need to update this conditional as we add support for OES_viewport_array
on older GPUs.

Signed-off-by: Ian Romanick 
---
 src/mesa/drivers/dri/i965/brw_context.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index d6204fd..3295eb3 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -778,8 +778,7 @@ brw_initialize_context_constants(struct brw_context *brw)
}
 
/* ARB_viewport_array, OES_viewport_array */
-   if ((brw->gen >= 6 && ctx->API == API_OPENGL_CORE) ||
-   (brw->gen >= 8  && ctx->API == API_OPENGLES2)) {
+   if (brw->gen >= 6) {
   ctx->Const.MaxViewports = GEN6_NUM_VIEWPORTS;
   ctx->Const.ViewportSubpixelBits = 0;
 
-- 
2.5.5



Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds

2016-11-08 Thread Kyriazis, George


> -Original Message-
> From: Emil Velikov [mailto:emil.l.veli...@gmail.com]
> Sent: Tuesday, November 8, 2016 10:54 AM
> To: Kyriazis, George 
> Cc: ML mesa-dev 
> Subject: Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds
> 
> On 8 November 2016 at 15:48, Kyriazis, George 
> wrote:
> > Comments inline..
> >
> >> -Original Message-
> >> From: Emil Velikov [mailto:emil.l.veli...@gmail.com]
> >> Sent: Tuesday, November 8, 2016 8:25 AM
> >> To: Kyriazis, George 
> >> Cc: ML mesa-dev 
> >> Subject: Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds
> >>
> >> On 7 November 2016 at 22:32, George Kyriazis
> >> 
> >> wrote:
> >> > - Added SConscript files
> >> > - better handling of NOMINMAX for <windows.h> inclusion
> >> > - Reorder header files in swr_context.cpp to handle NOMINMAX better,
> >> >   since mesa header files include windows.h before we get a chance to
> >> >   #define NOMINMAX
> >> > - cleaner support for .dll and .so prefix/suffix across OSes
> >> > - added PUBLIC for some protos
> >> > - added swr_gdi_swap() which is called from libgl_gdi.c
> >> > ---
> >> >  src/gallium/drivers/swr/Makefile.am|   8 ++
> >> >  src/gallium/drivers/swr/SConscript |  46 +++
> >> >  src/gallium/drivers/swr/SConscript-arch| 175
> >> +
> >> >  src/gallium/drivers/swr/rasterizer/common/os.h |   5 +-
> >> >  src/gallium/drivers/swr/swr_context.cpp|  16 +--
> >> >  src/gallium/drivers/swr/swr_context.h  |   2 +
> >> >  src/gallium/drivers/swr/swr_loader.cpp |  37 +-
> >> >  src/gallium/drivers/swr/swr_public.h   |  11 +-
> >> >  src/gallium/drivers/swr/swr_screen.cpp |  25 +---
> >> >  9 files changed, 291 insertions(+), 34 deletions(-)  create mode
> >> > 100644 src/gallium/drivers/swr/SConscript
> >> >  create mode 100644 src/gallium/drivers/swr/SConscript-arch
> >> >
> >> Similar to 1/3 this patch does too many things. Please _don't_  do that.
> >>
> >> Some ideas based on the above:
> >>  - source code fixes - one or multiple patches, depending on details.
> >>  - automake fixes - ^^
> >>  - introduce scons build (+ the EXTRA_DIST hunk)
> >>
> > As stated in review of patch 1/3, I will send v2 of patches with
> > different breakdown.
> >
> >
> >> Some misc comments below.
> >>
> >>
> >> > +++ b/src/gallium/drivers/swr/SConscript
> >> > @@ -0,0 +1,46 @@
> >> > +Import('*')
> >> > +
> >> > +from sys import executable as python_cmd import distutils.version
> >> Seems unused. Maybe it was aimed for the llvm 3.9 requirement/check
> >> mentioned in 1/3 ?
> >>
> > Scons build fails without the Import('*'), because env is undefined:
> >
> > NameError: name 'env' is not defined:
> >
> The "unused" comment was meant for the "import distutils.version"
> line. Which seemingly got mangled somewhere along the way.
> 

That explains it.  Ok, I'll take care of it.

> >> > +import os.path
> >> > +
> >> > +if not 'swr' in COMMAND_LINE_TARGETS:
> >> > +Return()
> >> > +
> >> > +if not env['llvm']:
> >> > +print 'warning: LLVM disabled: not building swr'
> >> > +Return()
> >> > +
> >> > +env.MSVC2013Compat()
> >> > +
> >>
> >> > +swr_arch = 'avx'
> >> > +VariantDir('avx', '.', duplicate=0)
> >> > +SConscript('avx/SConscript-arch', exports='swr_arch')
> >> > +
> >> > +swr_arch = 'avx2'
> >> > +VariantDir('avx2', '.', duplicate=0)
> >> > +SConscript('avx2/SConscript-arch', exports='swr_arch')
> >> > +
> >> Afaict one can just fold the SConscript-arch here. Thus one won't
> >> need to bother with the above nor the Depends hunk below.
> >> Additionally with current approach one is generating [the] identical
> >> source files twice. Far from ideal...
> >>
> > The AVX and AVX2 builds build differently (with different compiler flags).
> > At runtime, we load the appropriate dll, based on the underlying
> > architecture.  We do the same thing on the linux build.  Also, since
> > duplicate=0, source is not duplicated.  Yes, generated files are generated
> > twice, however currently SConscript is just a shell around SConscript-arch;
> > all the logic that generates the files and source lists is in
> > SConscript-arch.  By moving the auto generation to SConscript will generate
> > only one copy of the gen files, however it splits the build logic into two
> > files, which is more messy.  I can certainly move the generation code in
> > SConscript, however, I think that it's cleaner to strive for source code
> > cleanliness, as opposed to generate code cleanliness.
> 
> "By moving the auto generation to SConscript ..., however it splits the build
> logic into two files..." did you mean "one file" here ?
> I'm proposing to fold the two SConscripts, which effectively moves the build
> logic into _one_ file :-)
> 
> Scons was never my thing, so I'm failing to see the "source code cleanliness"
> that you're thinking about :-( The following isn't that messy is it ?
> 
> build loader
> generate sources
> build avx - (uses above so

Re: [Mesa-dev] [PATCH] dir-locals.el: Adds whitespace support

2016-11-08 Thread Andres Gomez
If nobody says otherwise, I will land this by the beginning of next
week.

On Sun, 2016-10-23 at 00:10 +0300, Andres Gomez wrote:
> Provides support for highlighting incorrect indentation.
> 
> v2: Removed too long lines trail highlighting, as suggested by Ilia
> Mirkin.
> 
> Signed-off-by: Andres Gomez 
> ---
>  .dir-locals.el | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/.dir-locals.el b/.dir-locals.el
> index 4b53931..5340c3a 100644
> --- a/.dir-locals.el
> +++ b/.dir-locals.el
> @@ -1,4 +1,5 @@
> -((prog-mode
> +((nil . ((show-trailing-whitespace . t)))
> + (prog-mode
>(indent-tabs-mode . nil)
>(tab-width . 8)
>(c-basic-offset . 3)
> @@ -8,6 +9,10 @@
>   (c-set-offset 'case-label '0)
>   (c-set-offset 'innamespace '0)
>   (c-set-offset 'inline-open '0)))
> -  )
> +  (whitespace-style face indentation)
> +  (whitespace-line-column . 79)
> +  (eval ignore-errors
> +(require 'whitespace)
> +(whitespace-mode 1)))
>   (makefile-mode (indent-tabs-mode . t))
>   )
> 
-- 
Br,

Andres


Re: [Mesa-dev] [PATCH 2/3] swr: disable logic op when the rt format is float

2016-11-08 Thread Ilia Mirkin
Ah indeed, I missed that. That's what I get for looking at the man page.

On Tue, Nov 8, 2016 at 12:25 PM, Rowley, Timothy O
 wrote:
> Looking at the spec, that seems like that should also check for sRGB and also 
> disable in that case (“GetFormatInfo(compileState.format).isSRGB”).
>
>> On Nov 7, 2016, at 6:18 PM, Ilia Mirkin  wrote:
>>
>> Signed-off-by: Ilia Mirkin 
>> ---
>> src/gallium/drivers/swr/swr_state.cpp | 5 +
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/src/gallium/drivers/swr/swr_state.cpp 
>> b/src/gallium/drivers/swr/swr_state.cpp
>> index d8a8ee1..acb0452 100644
>> --- a/src/gallium/drivers/swr/swr_state.cpp
>> +++ b/src/gallium/drivers/swr/swr_state.cpp
>> @@ -1305,6 +1305,11 @@ swr_update_derived(struct pipe_context *pipe,
>>&ctx->blend->compileState[target],
>>sizeof(compileState.blendState));
>>
>> +if (compileState.blendState.logicOpEnable &&
>> +GetFormatInfo(compileState.format).type[0] == 
>> SWR_TYPE_FLOAT) {
>> +   compileState.blendState.logicOpEnable = false;
>> +}
>> +
>> if (compileState.blendState.blendEnable == false &&
>> compileState.blendState.logicOpEnable == false) {
>>SwrSetBlendFunc(ctx->swrContext, target, NULL);
>> --
>> 2.7.3
>>
>


[Mesa-dev] [Bug 98563] Xorg segfaults with displaylink attached and mesa version >= 13.0

2016-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98563

Jan Rüegg  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rgg...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 2/3] swr: disable logic op when the rt format is float

2016-11-08 Thread Rowley, Timothy O
Looking at the spec, it seems this should also check for sRGB and disable
the logic op in that case (“GetFormatInfo(compileState.format).isSRGB”).
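
A minimal sketch of what the combined check could look like.  The types
and names below are hypothetical stand-ins for illustration only, not the
swr format-info API; the only things taken from the patch and the comment
above are the float test on the first channel and the isSRGB flag.

```c
#include <stdbool.h>

/* Hypothetical channel types; only "float vs. not float" matters here. */
enum example_channel_type {
   EXAMPLE_TYPE_UNORM,
   EXAMPLE_TYPE_FLOAT,
};

/* Hypothetical stand-in for the per-format info consulted by the patch. */
struct example_format_info {
   enum example_channel_type type0; /* type of the first channel */
   bool is_srgb;
};

/* Returns whether the logic op should stay enabled for this render
 * target: it is dropped for float and for sRGB formats. */
bool
example_effective_logic_op(bool logic_op_enable,
                           const struct example_format_info *fmt)
{
   if (fmt->type0 == EXAMPLE_TYPE_FLOAT || fmt->is_srgb)
      return false;

   return logic_op_enable;
}
```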

> On Nov 7, 2016, at 6:18 PM, Ilia Mirkin  wrote:
> 
> Signed-off-by: Ilia Mirkin 
> ---
> src/gallium/drivers/swr/swr_state.cpp | 5 +
> 1 file changed, 5 insertions(+)
> 
> diff --git a/src/gallium/drivers/swr/swr_state.cpp 
> b/src/gallium/drivers/swr/swr_state.cpp
> index d8a8ee1..acb0452 100644
> --- a/src/gallium/drivers/swr/swr_state.cpp
> +++ b/src/gallium/drivers/swr/swr_state.cpp
> @@ -1305,6 +1305,11 @@ swr_update_derived(struct pipe_context *pipe,
>&ctx->blend->compileState[target],
>sizeof(compileState.blendState));
> 
> +if (compileState.blendState.logicOpEnable &&
> +GetFormatInfo(compileState.format).type[0] == 
> SWR_TYPE_FLOAT) {
> +   compileState.blendState.logicOpEnable = false;
> +}
> +
> if (compileState.blendState.blendEnable == false &&
> compileState.blendState.logicOpEnable == false) {
>SwrSetBlendFunc(ctx->swrContext, target, NULL);
> -- 
> 2.7.3
> 



Re: [Mesa-dev] [PATCH 1/3] swr: fix AND_INVERTED logic op conversion

2016-11-08 Thread Rowley, Timothy O
Reviewed-by: Tim Rowley <timothy.o.row...@intel.com>

On Nov 7, 2016, at 6:18 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:

Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
---
src/gallium/drivers/swr/swr_state.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/swr_state.h 
b/src/gallium/drivers/swr/swr_state.h
index 0e3b49d..8409114 100644
--- a/src/gallium/drivers/swr/swr_state.h
+++ b/src/gallium/drivers/swr/swr_state.h
@@ -106,7 +106,7 @@ swr_convert_logic_op(const UINT op)
   case PIPE_LOGICOP_NOR:
  return LOGICOP_NOR;
   case PIPE_LOGICOP_AND_INVERTED:
-  return LOGICOP_CLEAR;
+  return LOGICOP_AND_INVERTED;
   case PIPE_LOGICOP_COPY_INVERTED:
  return LOGICOP_COPY_INVERTED;
   case PIPE_LOGICOP_AND_REVERSE:
--
2.7.3
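
For reference, a small sketch of why the old mapping was visibly wrong.
In the GL logic op table, CLEAR always yields 0 while AND_INVERTED yields
~src & dst.  The function names are made up for illustration and operate
on plain 32-bit words rather than on swr's blend state.

```c
#include <stdint.h>

/* CLEAR: the result is 0 regardless of source and destination. */
uint32_t
example_logicop_clear(uint32_t src, uint32_t dst)
{
   (void) src;
   (void) dst;
   return 0;
}

/* AND_INVERTED: keep the destination bits where the source bit is 0. */
uint32_t
example_logicop_and_inverted(uint32_t src, uint32_t dst)
{
   return ~src & dst;
}
```

With src = 0x0F and dst = 0xFF, AND_INVERTED gives 0xF0 where CLEAR
gives 0.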




Re: [Mesa-dev] [PATCH 21/25] anv/blorp: Break the guts of alloc_binding_table into a shared helper

2016-11-08 Thread Jason Ekstrand
On Fri, Oct 28, 2016 at 12:27 PM, Pohjolainen, Topi <
topi.pohjolai...@gmail.com> wrote:

> On Sat, Oct 22, 2016 at 10:50:52AM -0700, Jason Ekstrand wrote:
> > ---
> >  src/intel/vulkan/anv_blorp.c   | 24 
> >  src/intel/vulkan/anv_private.h |  5 +
> >  src/intel/vulkan/genX_blorp_exec.c | 18 ++
> >  3 files changed, 31 insertions(+), 16 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> > index 5361c4b..f495815 100644
> > --- a/src/intel/vulkan/anv_blorp.c
> > +++ b/src/intel/vulkan/anv_blorp.c
> > @@ -868,6 +868,30 @@ void anv_CmdClearDepthStencilImage(
> > blorp_batch_finish(&batch);
> >  }
> >
> > +struct anv_state
> > +anv_cmd_buffer_alloc_blorp_binding_table(struct anv_cmd_buffer *cmd_buffer,
> > + uint32_t num_entries,
> > + uint32_t *state_offset)
> > +{
> > +   struct anv_state bt_state =
> > +  anv_cmd_buffer_alloc_binding_table(cmd_buffer, num_entries,
> > + state_offset);
> > +   if (bt_state.map == NULL) {
> > +  /* We ran out of space.  Grab a new binding table block. */
> > +  VkResult result = anv_cmd_buffer_new_binding_table_block(cmd_buffer);
> > +  assert(result == VK_SUCCESS);
> > +
> > +  /* Re-emit state base addresses so we get the new surface state base
> > +   * address before we start emitting binding tables etc.
> > +   */
> > +  anv_cmd_buffer_emit_state_base_address(cmd_buffer);
> > +
> > +  bt_state = anv_cmd_buffer_alloc_binding_table(cmd_buffer, num_entries,
> > +state_offset);
> > +  assert(bt_state.map != NULL);
> > +   }
>
> This is not returning the state.
>

Thanks for catching this.  I've got it fixed now.
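
For anyone following along, the shape of the helper with the missing
return added looks roughly like this.  This is only a sketch with a toy
allocator so that it compiles on its own; every example_* name is
hypothetical and none of it is the anv code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for anv_state: map is NULL when allocation failed. */
struct example_state {
   void *map;
};

/* Toy pool: one block of block_size entries (at most 64). */
struct example_pool {
   unsigned remaining;  /* entries left in the current block */
   unsigned block_size; /* entries in a fresh block */
   unsigned char storage[64];
};

/* Fails (map == NULL) when the current block is too full. */
struct example_state
example_alloc(struct example_pool *pool, unsigned num_entries)
{
   struct example_state state = { NULL };

   if (num_entries <= pool->remaining) {
      state.map = &pool->storage[pool->block_size - pool->remaining];
      pool->remaining -= num_entries;
   }
   return state;
}

/* Start over with a fresh block. */
void
example_new_block(struct example_pool *pool)
{
   pool->remaining = pool->block_size;
}

/* Try once, grab a new block on failure, retry, and hand the state back. */
struct example_state
example_alloc_with_retry(struct example_pool *pool, unsigned num_entries)
{
   struct example_state state = example_alloc(pool, num_entries);

   if (state.map == NULL) {
      /* We ran out of space.  Grab a new block and try again. */
      example_new_block(pool);
      state = example_alloc(pool, num_entries);
      assert(state.map != NULL);
   }

   return state; /* the return the review comment points out */
}
```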


> > +}
> > +
> >  static void
> >  clear_color_attachment(struct anv_cmd_buffer *cmd_buffer,
> > struct blorp_batch *batch,
> > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> > index 5664a6e..44fe606 100644
> > --- a/src/intel/vulkan/anv_private.h
> > +++ b/src/intel/vulkan/anv_private.h
> > @@ -1271,6 +1271,11 @@ void anv_cmd_buffer_resolve_subpass(struct anv_cmd_buffer *cmd_buffer);
> >  const struct anv_image_view *
> >  anv_cmd_buffer_get_depth_stencil_view(const struct anv_cmd_buffer *cmd_buffer);
> >
> > +struct anv_state
> > +anv_cmd_buffer_alloc_blorp_binding_table(struct anv_cmd_buffer *cmd_buffer,
> > + uint32_t num_entries,
> > + uint32_t *state_offset);
> > +
> >  void anv_cmd_buffer_dump(struct anv_cmd_buffer *cmd_buffer);
> >
> >  struct anv_fence {
> > diff --git a/src/intel/vulkan/genX_blorp_exec.c b/src/intel/vulkan/genX_blorp_exec.c
> > index 185aff6..a705de0 100644
> > --- a/src/intel/vulkan/genX_blorp_exec.c
> > +++ b/src/intel/vulkan/genX_blorp_exec.c
> > @@ -87,22 +87,8 @@ blorp_alloc_binding_table(struct blorp_batch *batch, unsigned num_entries,
> >
> > uint32_t state_offset;
> > struct anv_state bt_state =
> > -  anv_cmd_buffer_alloc_binding_table(cmd_buffer, num_entries,
> > - &state_offset);
> > -   if (bt_state.map == NULL) {
> > -  /* We ran out of space.  Grab a new binding table block. */
> > -  VkResult result = anv_cmd_buffer_new_binding_table_block(cmd_buffer);
> > -  assert(result == VK_SUCCESS);
> > -
> > -  /* Re-emit state base addresses so we get the new surface state base
> > -   * address before we start emitting binding tables etc.
> > -   */
> > -  genX(cmd_buffer_emit_state_base_address)(cmd_buffer);
> > -
> > -  bt_state = anv_cmd_buffer_alloc_binding_table(cmd_buffer, num_entries,
> > -&state_offset);
> > -  assert(bt_state.map != NULL);
> > -   }
> > +  anv_cmd_buffer_alloc_blorp_binding_table(cmd_buffer, num_entries,
> > +   &state_offset);
> >
> > uint32_t *bt_map = bt_state.map;
> > *bt_offset = bt_state.offset;
> > --
> > 2.5.0.400.gff86faf
> >
>


Re: [Mesa-dev] [PATCH 19/25] intel/blorp: Add a clear_attachments entrypoint

2016-11-08 Thread Jason Ekstrand
On Fri, Oct 28, 2016 at 12:11 PM, Pohjolainen, Topi <
topi.pohjolai...@gmail.com> wrote:

> On Sat, Oct 22, 2016 at 10:50:50AM -0700, Jason Ekstrand wrote:
> > ---
> >  src/intel/blorp/blorp.h   |  11 +++
> >  src/intel/blorp/blorp_clear.c | 162 ++
> +++-
> >  src/intel/blorp/blorp_priv.h  |   1 +
> >  3 files changed, 172 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/intel/blorp/blorp.h b/src/intel/blorp/blorp.h
> > index 0c64d13..8a761ce 100644
> > --- a/src/intel/blorp/blorp.h
> > +++ b/src/intel/blorp/blorp.h
> > @@ -156,6 +156,17 @@ blorp_clear_depth_stencil(struct blorp_batch
> *batch,
> >uint8_t stencil_mask, uint8_t stencil_value);
> >
> >  void
> > +blorp_clear_attachments(struct blorp_batch *batch,
> > +uint32_t color_surface_state,
> > +enum isl_format depth_format,
> > +uint32_t num_samples,
> > +uint32_t start_layer, uint32_t num_layers,
> > +uint32_t x0, uint32_t y0, uint32_t x1, uint32_t
> y1,
> > +bool clear_color, union isl_color_value
> color_value,
> > +bool clear_depth, float depth_value,
> > +uint8_t stencil_mask, uint8_t stencil_value);
> > +
> > +void
> >  blorp_ccs_resolve(struct blorp_batch *batch,
> >struct blorp_surf *surf, enum isl_format format);
> >
> > diff --git a/src/intel/blorp/blorp_clear.c
> b/src/intel/blorp/blorp_clear.c
> > index 3d752ac..2287f59 100644
> > --- a/src/intel/blorp/blorp_clear.c
> > +++ b/src/intel/blorp/blorp_clear.c
> > @@ -87,6 +87,94 @@ blorp_params_get_clear_kernel(struct blorp_context
> *blorp,
> > ralloc_free(mem_ctx);
> >  }
> >
> > +struct layer_offset_vs_key {
> > +   enum blorp_shader_type shader_type;
> > +   unsigned num_inputs;
> > +};
> > +
>
> I'm assuming we need this because we are re-using the surface state from
> other pass and therefore cannot set the base layer in the surface state? If
> so it would be nice to have a comment here.
>

Done.


> > +static void
> > +blorp_params_get_layer_offset_vs(struct blorp_context *blorp,
> > + struct blorp_params *params)
> > +{
> > +   struct layer_offset_vs_key blorp_key = {
> > +  .shader_type = BLORP_SHADER_TYPE_LAYER_OFFSET_VS,
> > +   };
> > +
> > +   if (params->wm_prog_data)
> > +  blorp_key.num_inputs = params->wm_prog_data->num_varying_inputs;
> > +
> > +   if (blorp->lookup_shader(blorp, &blorp_key, sizeof(blorp_key),
> > +&params->vs_prog_kernel,
> &params->vs_prog_data))
> > +  return;
> > +
> > +   void *mem_ctx = ralloc_context(NULL);
> > +
> > +   nir_builder b;
> > +   nir_builder_init_simple_shader(&b, mem_ctx, MESA_SHADER_VERTEX,
> NULL);
> > +   b.shader->info.name = ralloc_strdup(b.shader,
> "BLORP-layer-offset-vs");
> > +
> > +   const struct glsl_type *uvec4_type = glsl_vector_type(GLSL_TYPE_UINT,
> 4);
> > +
> > +   /*
> > +* First we deal with the header which has instance and base instance
> > +*/
>
> Fits as oneline comment.
>
> > +   nir_variable *a_header = nir_variable_create(b.shader,
> nir_var_shader_in,
> > +uvec4_type, "header");
> > +   a_header->data.location = VERT_ATTRIB_GENERIC0;
> > +
> > +   nir_variable *v_layer = nir_variable_create(b.shader,
> nir_var_shader_out,
> > +   glsl_int_type(),
> "layer_id");
> > +   v_layer->data.location = VARYING_SLOT_LAYER;
> > +
> > +   /* Compute the layer id */
> > +   nir_ssa_def *header = nir_load_var(&b, a_header);
> > +   nir_ssa_def *base_layer = nir_channel(&b, header, 0);
> > +   nir_ssa_def *instance = nir_channel(&b, header, 1);
> > +   nir_store_var(&b, v_layer, nir_iadd(&b, instance, base_layer), 0x1);
> > +
> > +   /*
> > +* Then we copy the vertex from the next slot to VARYING_SLOT_POS
> > +*/
>
> Same here.
>
> > +   nir_variable *a_vertex = nir_variable_create(b.shader,
> nir_var_shader_in,
> > +glsl_vec4_type(),
> "a_vertex");
> > +   a_vertex->data.location = VERT_ATTRIB_GENERIC1;
> > +
> > +   nir_variable *v_pos = nir_variable_create(b.shader,
> nir_var_shader_out,
> > + glsl_vec4_type(), "v_pos");
> > +   v_pos->data.location = VARYING_SLOT_POS;
> > +
> > +   nir_copy_var(&b, v_pos, a_vertex);
> > +
> > +   /*
> > +* Then we copy everything else
> > +*/
>
> And here.
>
> > +   for (unsigned i = 0; i < blorp_key.num_inputs; i++) {
> > +  nir_variable *a_in = nir_variable_create(b.shader,
> nir_var_shader_in,
> > +   uvec4_type, "input");
> > +  a_in->data.location = VERT_ATTRIB_GENERIC2 + i;
> > +
> > +  nir_variable *v_out = nir_variable_create(b.shader,
> nir_var_shader_out,
> > +  

[Mesa-dev] [Bug 98599] xterm menus corrupt since tgsi/scan: handle indirect image indexing correctly

2016-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98599

Marek Olšák  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Marek Olšák  ---
Fixed by f864547fa92262f4b2c65a047210ee41e5b45e9a. Closing.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds

2016-11-08 Thread Emil Velikov
On 8 November 2016 at 15:48, Kyriazis, George  wrote:
> Comments inline..
>
>> -Original Message-
>> From: Emil Velikov [mailto:emil.l.veli...@gmail.com]
>> Sent: Tuesday, November 8, 2016 8:25 AM
>> To: Kyriazis, George 
>> Cc: ML mesa-dev 
>> Subject: Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds
>>
>> On 7 November 2016 at 22:32, George Kyriazis 
>> wrote:
>> > - Added SConscript files
>> > - better handling of NOMINMAX for <windows.h> inclusion
>> > - Reorder header files in swr_context.cpp to handle NOMINMAX better,
>> since
>> >   mesa header files include windows.h before we get a chance to #define
>> >   NOMINMAX
>> > - cleaner support for .dll and .so prefix/suffix across OSes
>> > - added PUBLIC for some protos
>> > - added swr_gdi_swap() which is called from libgl_gdi.c
>> > ---
>> >  src/gallium/drivers/swr/Makefile.am|   8 ++
>> >  src/gallium/drivers/swr/SConscript |  46 +++
>> >  src/gallium/drivers/swr/SConscript-arch| 175
>> +
>> >  src/gallium/drivers/swr/rasterizer/common/os.h |   5 +-
>> >  src/gallium/drivers/swr/swr_context.cpp|  16 +--
>> >  src/gallium/drivers/swr/swr_context.h  |   2 +
>> >  src/gallium/drivers/swr/swr_loader.cpp |  37 +-
>> >  src/gallium/drivers/swr/swr_public.h   |  11 +-
>> >  src/gallium/drivers/swr/swr_screen.cpp |  25 +---
>> >  9 files changed, 291 insertions(+), 34 deletions(-)  create mode
>> > 100644 src/gallium/drivers/swr/SConscript
>> >  create mode 100644 src/gallium/drivers/swr/SConscript-arch
>> >
>> Similar to 1/3 this patch does too many things. Please _don't_  do that.
>>
>> Some ideas based on the above:
>>  - source code fixes - one or multiple patches, depending on details.
>>  - automake fixes - ^^
>>  - introduce scons build (+ the EXTRA_DIST hunk)
>>
> As stated in review of patch 1/3, I will send v2 of patches with different 
> breakdown.
>
>
>> Some misc comments below.
>>
>>
>> > +++ b/src/gallium/drivers/swr/SConscript
>> > @@ -0,0 +1,46 @@
>> > +Import('*')
>> > +
>> > +from sys import executable as python_cmd import distutils.version
>> Seems unused. Maybe it was aimed for the llvm 3.9 requirement/check
>> mentioned in 1/3 ?
>>
> Scons build fails without the Import('*'), because env is undefined:
>
> NameError: name 'env' is not defined:
>
The "unused" comment was meant for the "import distutils.version"
line, which seemingly got mangled somewhere along the way.

>> > +import os.path
>> > +
>> > +if not 'swr' in COMMAND_LINE_TARGETS:
>> > +Return()
>> > +
>> > +if not env['llvm']:
>> > +print 'warning: LLVM disabled: not building swr'
>> > +Return()
>> > +
>> > +env.MSVC2013Compat()
>> > +
>>
>> > +swr_arch = 'avx'
>> > +VariantDir('avx', '.', duplicate=0)
>> > +SConscript('avx/SConscript-arch', exports='swr_arch')
>> > +
>> > +swr_arch = 'avx2'
>> > +VariantDir('avx2', '.', duplicate=0)
>> > +SConscript('avx2/SConscript-arch', exports='swr_arch')
>> > +
>> Afaict one can just fold the SConscript-arch here. Thus one won't need to
>> bother with the above nor the Depends hunk below.
>> Additionally with current approach one is generating [the] identical source
>> files twice. Far from ideal...
>>
> The AVX and AVX2 builds build differently (with different compiler flags).  
> At runtime, we load the appropriate dll, based on the underlying 
> architecture.  We do the same thing on the linux build.  Also, since 
> duplicate=0, source is not duplicated.  Yes, generated files are generated 
> twice, however currently SConscript is just a shell around SConscript-arch; 
> all the logic that generates the files and source lists is in 
> SConscript-arch.  By moving the auto generation to SConscript will generate 
> only one copy of the gen files, however it splits the build logic into two 
> files, which is more messy.  I can certainly move the generation code in 
> SConscript, however, I think that it's cleaner to strive for source code 
> cleanliness, as opposed to generate code cleanliness.

"By moving the auto generation to SConscript ..., however it splits
the build logic into two files..." did you mean "one file" here ?
I'm proposing to fold the two SConscripts, which effectively moves the
build logic into _one_ file :-)

Scons was never my thing, so I'm failing to see the "source code
cleanliness" that you're thinking about :-(
The following isn't that messy is it ?

build loader
generate sources
build avx - (uses above sources + avx compile flags)
build avx2 - (uses above sources + avx2 compile flags)


This is how I folded/cleaned up the autoconf build with commit
bb949e262cb5c4fffe991debc605447e15322666. A similar solution here
would be great/possible.

>> > +# remove headers, as scons thinks they are static objects for the .so
>> > +source = [x for x in source if not x.endswith(tuple(['.h','.hpp']))]
>> > +
>> Should be handled already. Otherwise please do so in scons/* Quick grep
>> suggests scons/custom.py
>>
>

Re: [Mesa-dev] [PATCH] gallivm: Fix build after removal of deprecated attribute API v2

2016-11-08 Thread Marek Olšák
On Mon, Nov 7, 2016 at 11:04 PM, Jan Vesely  wrote:
> On Mon, 2016-11-07 at 21:06 +, Tom Stellard wrote:
>> v2:
>>   Fix adding parameter attributes with LLVM < 4.0.
>> ---
>>  src/gallium/auxiliary/draw/draw_llvm.c|  6 +-
>>  src/gallium/auxiliary/gallivm/lp_bld_intr.c   | 52 -
>>  src/gallium/auxiliary/gallivm/lp_bld_intr.h   | 13 -
>>  src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |  4 +-
>>  src/gallium/drivers/radeonsi/si_shader.c  | 69 
>> ---
>>  src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c | 24 
>>  6 files changed, 116 insertions(+), 52 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/draw/draw_llvm.c 
>> b/src/gallium/auxiliary/draw/draw_llvm.c
>> index 5b4e2a1..5d87318 100644
>> --- a/src/gallium/auxiliary/draw/draw_llvm.c
>> +++ b/src/gallium/auxiliary/draw/draw_llvm.c
>> @@ -1568,8 +1568,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct 
>> draw_llvm_variant *variant,
>> LLVMSetFunctionCallConv(variant_func, LLVMCCallConv);
>> for (i = 0; i < num_arg_types; ++i)
>>if (LLVMGetTypeKind(arg_types[i]) == LLVMPointerTypeKind)
>> - LLVMAddAttribute(LLVMGetParam(variant_func, i),
>> -  LLVMNoAliasAttribute);
>> + lp_add_function_attr(variant_func, i + 1, "noalias", 7);
>>
>> context_ptr   = LLVMGetParam(variant_func, 0);
>> io_ptr= LLVMGetParam(variant_func, 1);
>> @@ -2193,8 +2192,7 @@ draw_gs_llvm_generate(struct draw_llvm *llvm,
>>
>> for (i = 0; i < ARRAY_SIZE(arg_types); ++i)
>>if (LLVMGetTypeKind(arg_types[i]) == LLVMPointerTypeKind)
>> - LLVMAddAttribute(LLVMGetParam(variant_func, i),
>> -  LLVMNoAliasAttribute);
>> + lp_add_function_attr(variant_func, i + 1, "noalias", 7);
>>
>> context_ptr   = LLVMGetParam(variant_func, 0);
>> input_array   = LLVMGetParam(variant_func, 1);
>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_intr.c 
>> b/src/gallium/auxiliary/gallivm/lp_bld_intr.c
>> index f12e735..401e9a2 100644
>> --- a/src/gallium/auxiliary/gallivm/lp_bld_intr.c
>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_intr.c
>> @@ -120,13 +120,57 @@ lp_declare_intrinsic(LLVMModuleRef module,
>>  }
>>
>>
>> +#if HAVE_LLVM < 0x0400
>> +static LLVMAttribute str_to_attr(const char *attr_name, unsigned attr_len)
>> +{
>> +   if (!strncmp("alwaysinline", attr_name, attr_len)) {
>> +  return LLVMAlwaysInlineAttribute;
>> +   } else if (!strncmp("byval", attr_name, attr_len)) {
>> +  return LLVMByValAttribute;
>> +   } else if (!strncmp("inreg", attr_name, attr_len)) {
>> +  return LLVMInRegAttribute;
>> +   } else if (!strncmp("noalias", attr_name, attr_len)) {
>> +  return LLVMNoAliasAttribute;
>> +   } else if (!strncmp("readnone", attr_name, attr_len)) {
>> +  return LLVMReadNoneAttribute;
>> +   } else if (!strncmp("readonly", attr_name, attr_len)) {
>> +  return LLVMReadOnlyAttribute;
>> +   } else {
>> +  _debug_printf("Unhandled function attribute: %s\n", attr_name);
>> +  return 0;
>> +   }
>> +}
>> +#endif
>> +
>> +void
>> +lp_add_function_attr(LLVMValueRef function,
>> + int attr_idx,
>> + const char *attr_name,
>> + unsigned attr_len)
>
> Any reason to pass string length by hand rather than local strlen?

An enum would be better. Then lp_add_function_attr can translate the
enum to a string. The enums can be defined by gallivm.
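
A minimal sketch of that suggestion, assuming hypothetical names (`enum lp_func_attr` and `lp_func_attr_name` are illustrative and do not come from the patch): callers pass an enum, and the helper translates it to the attribute string, so no string length has to be passed by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical enum for this sketch; covers the attributes named in the patch. */
enum lp_func_attr {
   LP_FUNC_ATTR_ALWAYSINLINE,
   LP_FUNC_ATTR_BYVAL,
   LP_FUNC_ATTR_INREG,
   LP_FUNC_ATTR_NOALIAS,
   LP_FUNC_ATTR_READNONE,
   LP_FUNC_ATTR_READONLY,
   LP_FUNC_ATTR_COUNT
};

/* Translate the enum to the attribute name; returns NULL for an
 * out-of-range value. */
static const char *
lp_func_attr_name(enum lp_func_attr attr)
{
   static const char *const names[LP_FUNC_ATTR_COUNT] = {
      [LP_FUNC_ATTR_ALWAYSINLINE] = "alwaysinline",
      [LP_FUNC_ATTR_BYVAL]        = "byval",
      [LP_FUNC_ATTR_INREG]        = "inreg",
      [LP_FUNC_ATTR_NOALIAS]      = "noalias",
      [LP_FUNC_ATTR_READNONE]     = "readnone",
      [LP_FUNC_ATTR_READONLY]     = "readonly",
   };

   if ((unsigned)attr >= LP_FUNC_ATTR_COUNT)
      return NULL;
   return names[attr];
}
```

With a table like this, the length can be derived with strlen() inside the helper that talks to LLVM, and a typo in an attribute name becomes a compile error at the call site instead of a silently ignored string.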

Marek


Re: [Mesa-dev] [RFC 03/12] egl: add EGL_ANDROID_native_fence_sync

2016-11-08 Thread Rafael Antognolli
On Mon, Nov 07, 2016 at 07:48:25PM -0500, Rob Clark wrote:
> On Mon, Nov 7, 2016 at 6:29 PM, Rafael Antognolli
>  wrote:
> > On Mon, Oct 31, 2016 at 08:58:26AM -0700, Rafael Antognolli wrote:
> >> On Sat, Oct 29, 2016 at 01:15:44PM -0400, Rob Clark wrote:
> >> > On Fri, Oct 28, 2016 at 7:44 PM, Rafael Antognolli
> >> >  wrote:
> >
> > ...
> >
> >> > Hey, thanks for this.  I don't suppose you have a branch somewhere w/
> >> > the piglit tests?
> >>
> >> Ouch, I mentioned it on another email but should have mentioned it here
> >> too. It's here:
> >>
> >> https://github.com/rantogno/piglit/tree/fences
> >>
> >> > I've rebased and pulled in Chad's squash patches (and also a squash
> >> > patch based on the issues you pointed out), but not yet the i965
> >> > patches:
> >> >
> >> > https://github.com/freedreno/mesa/commits/wip-fence
> >>
> >> Awesome, I will check that one.
> >
> > Just an update: I did test that branch, and there was just one change
> > needed for the piglit tests to work:
> >
> > https://github.com/rantogno/mesa/commit/c637f1ce404acaccaa920d37c52724c9d8093597
> 
> oh, good catch.. I'll squash that in and push an updated branch soon
> 
> > You can also check my last version of these tests (also submitted to the
> > piglit list) here:
> >
> > https://github.com/rantogno/piglit/tree/review/fences-v02
> >
> > The only test that I don't know how to do yet is to make sure that Mesa
> > and the kernel are respecting an eglSyncWait for a native sync fence.
> > eglClientWaitSyncKHR is already covered.
> 
> yeah, I can't think of a particularly easy way to test that..  but I
> think the API level tests have already caught quite a few issues..

Alright, I'll ignore that for now...

> > Also I did test your series with kmscube and some other stuff too, and
> > so far it's all behaving really well. I'm looking forward to see your
> > patches get merged.
> 
> I guess we should pull together a unified branch.. since we have this
> working for intel + virgl + freedreno.  AFAIU the current status is
> intel and freedreno kernel bits are upstream.  The libdrm bits for
> freedreno are upstream, not sure about intel (and virgl doesn't have
> any libdrm component).  Not sure about the kernel bit for virgl, but I
> assume that will be 4.10?

Actually, the intel kernel bits have not been merged yet, AFAIK they are
"waiting for userspace". Chris Wilson has been sending updated versions
of them every now and then, so it's just a matter of merging them. And
they have been reviewed already.

libdrm bits for intel are not upstream yet either. What I have been
using is a mix of patches from Chad and Chris Wilson as well, but imho
it's ready to go:

https://github.com/rantogno/libdrm/tree/wip-fence

> I have one small update for the gallium patch, to add the pipe-cap to
> all the other drivers.  I usually try to wait until the patch is ready
> to push since otherwise it ends up being a huge rebase headache.
> 
> I would defn like to get this merged, esp. since I'm starting to get
> busy on the next thing ;-)

Awesome, that sounds really good :)

Thanks,
Rafael


Re: [Mesa-dev] [PATCH] gallivm: fix [IU]MUL_HI regression

2016-11-08 Thread Roland Scheidegger
Am 08.11.2016 um 16:23 schrieb Nicolai Hähnle:
> On 08.11.2016 14:44, Roland Scheidegger wrote:
> >> Sorry for breaking radeonsi, I somehow thought this was only used for
> >> cpu already, without actually checking...
> >> And thanks for fixing that typo, apparently you can pass piglit's
>> umul_hi/imul_hi tests (at least those from the shader_integer_mix group)
>> even with the square of argument a...
> 
> Yeah, it sucks that test runs take so long with llvmpipe. Is there
> anybody doing systematic full regression runs on it?
> 
> I do full runs on radeonsi fairly frequently, and I noticed this bug
> with
> tests/spec/arb_gpu_shader5/execution/built-in-functions/fs-imulExtended.shader_test
> and friends.

I do but not that frequently. I usually do full runs though with changes
such as this, so I thought I screwed up the testing.
However, looking at the assembly, this wasn't the case - glsl lowered
the imul_hi and umul_hi to a sequence of muls/adds/shifts.
The reason for that is probably the IMUL_HIGH_TO_MUL lowering - this is
done when ARB_gpu_shader5 isn't supported (and llvmpipe does not support it).
With llvmpipe, we definitely don't want the mul_hi lowering but there's
even a comment there that there are no individual caps (and some of the
other stuff might not be implemented).
So there was no way to catch that with llvmpipe; effectively only the
umul_hi called directly from the draw fetch code was tested, not the one
called from shaders (fwiw I did see a regression with some automated
internal testing using another api...).
(We also have automated piglit tests, however due to results changing
frequently it's difficult to catch "real" regressions.)
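
To illustrate what such a lowering looks like, here is a rough sketch, not the actual GLSL lowering pass: the high 32 bits of an unsigned 32x32 multiply computed with only 32-bit multiplies, adds and shifts. The function name `umul_hi_lowered` is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* High 32 bits of a * b using only 32-bit operations. */
static uint32_t
umul_hi_lowered(uint32_t a, uint32_t b)
{
   /* Split both operands into 16-bit halves. */
   uint32_t a_lo = a & 0xffff, a_hi = a >> 16;
   uint32_t b_lo = b & 0xffff, b_hi = b >> 16;

   /* Four partial products, each of which fits in 32 bits. */
   uint32_t lo_lo = a_lo * b_lo;
   uint32_t hi_lo = a_hi * b_lo;
   uint32_t lo_hi = a_lo * b_hi;
   uint32_t hi_hi = a_hi * b_hi;

   /* Sum of everything landing in bits 16..31; its upper half is the
    * carry into bit 32. The sum is at most 3 * 0xffff, so no overflow. */
   uint32_t cross = (lo_lo >> 16) + (hi_lo & 0xffff) + (lo_hi & 0xffff);

   return hi_hi + (hi_lo >> 16) + (lo_hi >> 16) + (cross >> 16);
}
```

Because this path never reaches the driver's own mul_hi code, a shader test that passes through it says nothing about whether the native implementation is correct.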

Roland



> 
> 
>> btw as I didn't consider this, I don't know if you want to change the
>> shift/trunc to shuffle in the end - feel free to change it back if it
>> doesn't generate good code on radeonsi...
> 
> It seems instcombine has no difficulties seeing through the IR, so I
> think we're good :)
Ok. With x86 sse2 it definitely generates some different assembly, but
what exactly is better depends on sse2/sse41/avx/avx2 being available
and the llvm version...

Roland


> 
> 
>> Reviewed-by: Roland Scheidegger 
> 
> Thanks!
> 
> Nicolai
> 
> 
>> Am 08.11.2016 um 10:15 schrieb Nicolai Hähnle:
>>> From: Nicolai Hähnle 
>>>
>>> This patch does two things:
>>>
>>> 1. It separates the host-CPU code generation from the generic code
>>>generation. This guards against accidentally breaking things for
>>>radeonsi in the future.
>>>
>>> 2. It makes sure we actually use both arguments and don't just compute
>>>a square :-p
>>>
>>> Fixes a regression introduced by commit
>>> 29279f44b3172ef3b84d470e70fc7684695ced4b
>>>
>>> Cc: Roland Scheidegger 
>>> ---
>>>  src/gallium/auxiliary/gallivm/lp_bld_arit.c| 72
>>> ++
>>>  src/gallium/auxiliary/gallivm/lp_bld_arit.h|  6 ++
>>>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 40 +++-
>>>  3 files changed, 90 insertions(+), 28 deletions(-)
>>>
>>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>>> b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>>> index 3de4628..43ad238 100644
>>> --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
>>> @@ -1087,26 +1087,28 @@ lp_build_mul(struct lp_build_context *bld,
>>>  res = LLVMBuildLShr(builder, res, shift, "");
>>>}
>>> }
>>>
>>> return res;
>>>  }
>>>
>>>  /*
>>>   * Widening mul, valid for 32x32 bit -> 64bit only.
>>>   * Result is low 32bits, high bits returned in res_hi.
>>> + *
>>> + * Emits code that is meant to be compiled for the host CPU.
>>>   */
>>>  LLVMValueRef
>>> -lp_build_mul_32_lohi(struct lp_build_context *bld,
>>> - LLVMValueRef a,
>>> - LLVMValueRef b,
>>> - LLVMValueRef *res_hi)
>>> +lp_build_mul_32_lohi_cpu(struct lp_build_context *bld,
>>> + LLVMValueRef a,
>>> + LLVMValueRef b,
>>> + LLVMValueRef *res_hi)
>>>  {
>>> struct gallivm_state *gallivm = bld->gallivm;
>>> LLVMBuilderRef builder = gallivm->builder;
>>>
>>> assert(bld->type.width == 32);
>>> assert(bld->type.floating == 0);
>>> assert(bld->type.fixed == 0);
>>> assert(bld->type.norm == 0);
>>>
>>> /*
>>> @@ -1209,43 +1211,61 @@ lp_build_mul_32_lohi(struct lp_build_context
>>> *bld,
>>>*res_hi = LLVMBuildShuffleVector(builder, muleven, mulodd,
>>> shuf_vec, "");
>>>
>>>for (i = 0; i < bld->type.length; i += 2) {
>>>   shuf[i] = lp_build_const_int32(gallivm, i);
>>>   shuf[i+1] = lp_build_const_int32(gallivm, i +
>>> bld->type.length);
>>>}
>>>shuf_vec = LLVMConstVector(shuf, bld->type.length);
>>>return LLVMBuildShuffleVector(builder, muleven, mulodd,
>>> shuf_vec, "");
>>> }
>>> else {
>>> -  LLVMValueRef tmp;
>>> -  struct lp_type type_tmp;
>>> -  

Re: [Mesa-dev] [PATCH 13/18] anv: Add initial for Sky Lake color compression

2016-11-08 Thread Jason Ekstrand
On Tue, Nov 8, 2016 at 12:21 AM, Pohjolainen, Topi <
topi.pohjolai...@gmail.com> wrote:

>
> Title says: "anv: Add initial for Sky Lake color compression". Did you mean
> to have something after "initial"?
>

Yeah, "support" should probably go in there


> On Fri, Oct 28, 2016 at 02:17:09AM -0700, Jason Ekstrand wrote:
> > This commit adds basic support for color compression.  For the moment,
> > color compression is only enabled within a render pass and a full resolve
> > is done before the render pass finishes.  All texturing operations still
> > happen with CCS disabled.
>
> I'm not that familiar with all the vulkan concepts so far and need to ask a
> few things. In this patch CCS_E is enabled whenever there is suitable
> render
> target. For surfaces of the type VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT and
> VK_DESCRIPTOR_TYPE_STORAGE_IMAGE aux is explicitly disabled. Does this
> mean
> that it is impossible to have one of these surfaces as render target in
> the same pass (and having compression turned on for writing)?
>

No, not quite.  In this patch, we always do a full resolve at the end of
the render pass.  Since input attachments aren't really supported yet (my
next task), those aren't a problem.  Also, you can't bind one of your
render pass attachments as a storage image or texture so those will never
be used while it's in an unresolved state.
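
The resolve policy in the patch under review (scan the later subpasses and defer the resolve while one of them still draws to or resolve-writes the attachment) might be sketched in isolation like this. The flag names and `should_resolve_now` are hypothetical stand-ins for this example, not the anv definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-subpass usage flags, loosely modelled on the patch. */
enum subpass_usage {
   SUBPASS_USAGE_DRAW        = 1 << 0,
   SUBPASS_USAGE_INPUT       = 1 << 1,
   SUBPASS_USAGE_RESOLVE_SRC = 1 << 2,
   SUBPASS_USAGE_RESOLVE_DST = 1 << 3,
};

/* usage[s] holds the flags for one attachment in subpass s.
 * Returns true when no later subpass writes the attachment, i.e. the
 * current subpass is the last writer and the resolve should happen now. */
static bool
should_resolve_now(const uint32_t *usage, uint32_t subpass_count,
                   uint32_t current)
{
   assert(current < subpass_count);

   for (uint32_t s = current + 1; s < subpass_count; s++) {
      if (usage[s] & (SUBPASS_USAGE_DRAW | SUBPASS_USAGE_RESOLVE_DST))
         return false; /* a later subpass writes it; wait until then */
   }
   return true;
}
```

Resolving in the subpass of the last write means the attachment is resolved only once, while its data is still likely to be hot in the cache.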


> Otherwise this patch looks good to me.
>
> >
> > Signed-off-by: Jason Ekstrand 
> > ---
> >  src/intel/vulkan/anv_blorp.c   | 139 +-
> ---
> >  src/intel/vulkan/anv_image.c   |  17 +++--
> >  src/intel/vulkan/anv_private.h |   1 +
> >  src/intel/vulkan/genX_cmd_buffer.c |  50 -
> >  4 files changed, 171 insertions(+), 36 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c
> > index 0e70e9b..bf317c7 100644
> > --- a/src/intel/vulkan/anv_blorp.c
> > +++ b/src/intel/vulkan/anv_blorp.c
> > @@ -1179,52 +1179,131 @@ void anv_CmdResolveImage(
> > blorp_batch_finish(&batch);
> >  }
> >
> > +static void
> > +ccs_resolve_attachment(struct anv_cmd_buffer *cmd_buffer,
> > +   struct blorp_batch *batch,
> > +   uint32_t att)
> > +{
> > +   struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
> > +   struct anv_attachment_state *att_state =
> > +  &cmd_buffer->state.attachments[att];
> > +
> > +   assert(att_state->aux_usage != ISL_AUX_USAGE_CCS_D);
> > +   if (att_state->aux_usage != ISL_AUX_USAGE_CCS_E)
> > +  return; /* Nothing to resolve */
> > +
> > +   struct anv_render_pass *pass = cmd_buffer->state.pass;
> > +   struct anv_subpass *subpass = cmd_buffer->state.subpass;
> > +   unsigned subpass_idx = subpass - pass->subpasses;
> > +   assert(subpass_idx < pass->subpass_count);
> > +
> > +   /* Scan forward to see what all ways this attachment will be used.
> > +* Ideally, we would like to resolve in the same subpass as the last
> write
> > +* of a particular attachment.  That way we only resolve once but
> it's
> > +* still hot in the cache.
> > +*/
> > +   for (uint32_t s = subpass_idx + 1; s < pass->subpass_count; s++) {
> > +  enum anv_subpass_usage usage = pass->attachments[att].
> subpass_usage[s];
>
> I'm wondering if this holds?
>
>  assert(!(usage & ANV_SUBPASS_USAGE_INPUT));
>
> > +
> > +  if (usage & (ANV_SUBPASS_USAGE_DRAW |
> ANV_SUBPASS_USAGE_RESOLVE_DST)) {
> > + /* We found another subpass that draws to this attachment.
> We'll
> > +  * wait to resolve until then.
> > +  */
> > + return;
> > +  }
> > +   }
> > +
> > +   struct anv_image_view *iview = fb->attachments[att];
> > +   const struct anv_image *image = iview->image;
> > +   assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT);
> > +
> > +   struct blorp_surf surf;
> > +   get_blorp_surf_for_anv_image(image, VK_IMAGE_ASPECT_COLOR_BIT,
> &surf);
> > +   surf.aux_surf = &image->aux_surface.isl;
> > +   surf.aux_addr = (struct blorp_address) {
> > +  .buffer = image->bo,
> > +  .offset = image->offset + image->aux_surface.offset,
> > +   };
> > +   surf.aux_usage = att_state->aux_usage;
> > +
> > +   for (uint32_t layer = 0; layer < fb->layers; layer++) {
> > +  blorp_ccs_resolve(batch, &surf,
> > +iview->isl.base_level,
> > +iview->isl.base_array_layer + layer,
> > +iview->isl.format,
> > +BLORP_FAST_CLEAR_OP_RESOLVE_FULL);
> > +   }
> > +}
> > +
> >  void
> >  anv_cmd_buffer_resolve_subpass(struct anv_cmd_buffer *cmd_buffer)
> >  {
> > struct anv_framebuffer *fb = cmd_buffer->state.framebuffer;
> > struct anv_subpass *subpass = cmd_buffer->state.subpass;
> >
> > -   if (!subpass->has_resolve)
> > -  return;
> >
> > struct blorp_batch batch;
> > blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer, 0);
> >
> > +   /* From the Sky Lake PRM

Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds

2016-11-08 Thread Kyriazis, George
Comments inline..

> -Original Message-
> From: Emil Velikov [mailto:emil.l.veli...@gmail.com]
> Sent: Tuesday, November 8, 2016 8:25 AM
> To: Kyriazis, George 
> Cc: ML mesa-dev 
> Subject: Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds
> 
> On 7 November 2016 at 22:32, George Kyriazis 
> wrote:
> > - Added SConscript files
> > - better handling of NOMINMAX for <windows.h> inclusion
> > - Reorder header files in swr_context.cpp to handle NOMINMAX better,
> since
> >   mesa header files include windows.h before we get a chance to #define
> >   NOMINMAX
> > - cleaner support for .dll and .so prefix/suffix across OSes
> > - added PUBLIC for some protos
> > - added swr_gdi_swap() which is called from libgl_gdi.c
> > ---
> >  src/gallium/drivers/swr/Makefile.am|   8 ++
> >  src/gallium/drivers/swr/SConscript |  46 +++
> >  src/gallium/drivers/swr/SConscript-arch| 175
> +
> >  src/gallium/drivers/swr/rasterizer/common/os.h |   5 +-
> >  src/gallium/drivers/swr/swr_context.cpp|  16 +--
> >  src/gallium/drivers/swr/swr_context.h  |   2 +
> >  src/gallium/drivers/swr/swr_loader.cpp |  37 +-
> >  src/gallium/drivers/swr/swr_public.h   |  11 +-
> >  src/gallium/drivers/swr/swr_screen.cpp |  25 +---
> >  9 files changed, 291 insertions(+), 34 deletions(-)  create mode
> > 100644 src/gallium/drivers/swr/SConscript
> >  create mode 100644 src/gallium/drivers/swr/SConscript-arch
> >
> Similar to 1/3 this patch does too many things. Please _don't_  do that.
> 
> Some ideas based on the above:
>  - source code fixes - one or multiple patches, depending on details.
>  - automake fixes - ^^
>  - introduce scons build (+ the EXTRA_DIST hunk)
> 
As stated in review of patch 1/3, I will send v2 of patches with different 
breakdown.


> Some misc comments below.
> 
> 
> > +++ b/src/gallium/drivers/swr/SConscript
> > @@ -0,0 +1,46 @@
> > +Import('*')
> > +
> > +from sys import executable as python_cmd import distutils.version
> Seems unused. Maybe it was aimed for the llvm 3.9 requirement/check
> mentioned in 1/3 ?
> 
Scons build fails without the Import('*'), because env is undefined:

NameError: name 'env' is not defined:

> > +import os.path
> > +
> > +if not 'swr' in COMMAND_LINE_TARGETS:
> > +Return()
> > +
> > +if not env['llvm']:
> > +print 'warning: LLVM disabled: not building swr'
> > +Return()
> > +
> > +env.MSVC2013Compat()
> > +
> 
> > +swr_arch = 'avx'
> > +VariantDir('avx', '.', duplicate=0)
> > +SConscript('avx/SConscript-arch', exports='swr_arch')
> > +
> > +swr_arch = 'avx2'
> > +VariantDir('avx2', '.', duplicate=0)
> > +SConscript('avx2/SConscript-arch', exports='swr_arch')
> > +
> Afaict one can just fold the SConscript-arch here. Thus one won't need to
> bother with the above nor the Depends hunk below.
> Additionally with current approach one is generating [the] identical source
> files twice. Far from ideal...
> 
The AVX and AVX2 builds build differently (with different compiler flags).  At 
runtime, we load the appropriate dll, based on the underlying architecture.  We 
do the same thing on the linux build.  Also, since duplicate=0, source is not 
duplicated.  Yes, generated files are generated twice, however currently 
SConscript is just a shell around SConscript-arch; all the logic that generates 
the files and source lists is in SConscript-arch.  By moving the auto 
generation to SConscript will generate only one copy of the gen files, however 
it splits the build logic into two files, which is more messy.  I can certainly 
move the generation code into SConscript; however, I think that it's cleaner to 
strive for source code cleanliness, as opposed to generated code cleanliness.

> > +env = env.Clone()
> > +
> > +source = env.ParseSourceList('Makefile.sources', [
> > +'LOADER_SOURCES'
> > +])
> > +
> > +env.Prepend(CPPPATH = [
> > +'rasterizer/scripts'
> > +])
> > +
> > +swr = env.ConvenienceLibrary(
> > +   target = 'swr',
> > +   source = source,
> > +   )
> Keep the indentation to 4 spaces here and throughout the SConscripts.
> That's a python requirement.
Ok, will correct that.

> In general I'd encourage using .editorconfig and updating the section for swr,
> if needed.
> 
> 
> > +# remove headers, as scons thinks they are static objects for the .so
> > +source = [x for x in source if not x.endswith(tuple(['.h','.hpp']))]
> > +
> Should be handled already. Otherwise please do so in scons/* Quick grep
> suggests scons/custom.py
> 
ParseSourceList() will filter out .h files, however it won't filter out .hpp 
files.  Are you saying add the .hpp filter in custom.py?

> 
> > +#ifdef _WIN32
> > +   prefix = "";
> > +   postfix = ".dll";
> > +#else
> > +   prefix = "lib";
> > +   postfix = ".so";
> > +#endif
> > +
> Quick grep suggests:
> 
> UTIL_DL_EXT
> UTIL_DL_PREFIX
> 
Ah.  Thank you!  I'll fix this and include the change in the next rev.
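
A small sketch of how the prefix/extension macros could replace the #ifdef block when building the library name. The `SKETCH_DL_*` macros below are local stand-ins that only mirror the hunk quoted above (in Mesa the values would come from UTIL_DL_PREFIX and UTIL_DL_EXT), and both `build_lib_name` and the base name "swrAVX2" are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-ins for this sketch only; not the real Mesa definitions. */
#ifdef _WIN32
#define SKETCH_DL_PREFIX ""
#define SKETCH_DL_EXT    ".dll"
#else
#define SKETCH_DL_PREFIX "lib"
#define SKETCH_DL_EXT    ".so"
#endif

/* Compose "<prefix><base><ext>" into out.
 * Returns 0 on success, -1 if the result did not fit. */
static int
build_lib_name(char *out, size_t out_size, const char *base)
{
   int n = snprintf(out, out_size, "%s%s%s",
                    SKETCH_DL_PREFIX, base, SKETCH_DL_EXT);
   return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Keeping the platform difference in one pair of macros means the loader code itself carries no per-OS branches.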

[Mesa-dev] [Bug 98629] OpenGL applications warns "MESA-LOADER: failed to retrieve device information"

2016-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98629

Emil Velikov  changed:

           What|Removed                        |Added

       Assignee|nouveau@lists.freedesktop.org  |mesa-dev@lists.freedesktop.org
      Component|Drivers/DRI/nouveau            |Mesa core
     QA Contact|nouveau@lists.freedesktop.org  |mesa-dev@lists.freedesktop.org

--- Comment #1 from Emil Velikov  ---
[Moving to 'core' since it's not really nouveau specific]

Does this happen with glxinfo/glxgears as well ? If so can you attach the
output of $strace glxinfo

If glxinfo works fine, while $program does not, attach the output of
$LD_DEBUG=libs $program

Thanks

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] gallivm: fix [IU]MUL_HI regression

2016-11-08 Thread Nicolai Hähnle

On 08.11.2016 14:44, Roland Scheidegger wrote:
> Sorry for breaking radeonsi, I somehow thought this was already used for
> CPU only, without actually checking...
> And thanks for fixing that typo, apparently you can pass piglit's
> umul_hi/imul_hi tests (at least those from the shader_integer_mix group)
> even with the square of argument a...

Yeah, it sucks that test runs take so long with llvmpipe. Is there
anybody doing systematic full regression runs on it?

I do full runs on radeonsi fairly frequently, and I noticed this bug
with
tests/spec/arb_gpu_shader5/execution/built-in-functions/fs-imulExtended.shader_test
and friends.

> btw as I didn't consider this, I don't know if you want to change the
> shift/trunc to shuffle in the end - feel free to change it back if it
> doesn't generate good code on radeonsi...

It seems instcombine has no difficulties seeing through the IR, so I
think we're good :)

> Reviewed-by: Roland Scheidegger

Thanks!

Nicolai



Am 08.11.2016 um 10:15 schrieb Nicolai Hähnle:

From: Nicolai Hähnle 

This patch does two things:

1. It separates the host-CPU code generation from the generic code
   generation. This guards against accidentally breaking things for
   radeonsi in the future.

2. It makes sure we actually use both arguments and don't just compute
   a square :-p

Fixes a regression introduced by commit 29279f44b3172ef3b84d470e70fc7684695ced4b

Cc: Roland Scheidegger 
---
 src/gallium/auxiliary/gallivm/lp_bld_arit.c| 72 ++
 src/gallium/auxiliary/gallivm/lp_bld_arit.h|  6 ++
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 40 +++-
 3 files changed, 90 insertions(+), 28 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
index 3de4628..43ad238 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
@@ -1087,26 +1087,28 @@ lp_build_mul(struct lp_build_context *bld,
 res = LLVMBuildLShr(builder, res, shift, "");
   }
}

return res;
 }

 /*
  * Widening mul, valid for 32x32 bit -> 64bit only.
  * Result is low 32bits, high bits returned in res_hi.
+ *
+ * Emits code that is meant to be compiled for the host CPU.
  */
 LLVMValueRef
-lp_build_mul_32_lohi(struct lp_build_context *bld,
- LLVMValueRef a,
- LLVMValueRef b,
- LLVMValueRef *res_hi)
+lp_build_mul_32_lohi_cpu(struct lp_build_context *bld,
+ LLVMValueRef a,
+ LLVMValueRef b,
+ LLVMValueRef *res_hi)
 {
struct gallivm_state *gallivm = bld->gallivm;
LLVMBuilderRef builder = gallivm->builder;

assert(bld->type.width == 32);
assert(bld->type.floating == 0);
assert(bld->type.fixed == 0);
assert(bld->type.norm == 0);

/*
@@ -1209,43 +1211,61 @@ lp_build_mul_32_lohi(struct lp_build_context *bld,
   *res_hi = LLVMBuildShuffleVector(builder, muleven, mulodd, shuf_vec, "");

   for (i = 0; i < bld->type.length; i += 2) {
  shuf[i] = lp_build_const_int32(gallivm, i);
  shuf[i+1] = lp_build_const_int32(gallivm, i + bld->type.length);
   }
   shuf_vec = LLVMConstVector(shuf, bld->type.length);
   return LLVMBuildShuffleVector(builder, muleven, mulodd, shuf_vec, "");
}
else {
-  LLVMValueRef tmp;
-  struct lp_type type_tmp;
-  LLVMTypeRef wide_type, cast_type;
-
-  type_tmp = bld->type;
-  type_tmp.width *= 2;
-  wide_type = lp_build_vec_type(gallivm, type_tmp);
-  type_tmp = bld->type;
-  type_tmp.length *= 2;
-  cast_type = lp_build_vec_type(gallivm, type_tmp);
-
-  if (bld->type.sign) {
- a = LLVMBuildSExt(builder, a, wide_type, "");
- b = LLVMBuildSExt(builder, b, wide_type, "");
-  } else {
- a = LLVMBuildZExt(builder, a, wide_type, "");
- b = LLVMBuildZExt(builder, b, wide_type, "");
-  }
-  tmp = LLVMBuildMul(builder, a, b, "");
-  tmp = LLVMBuildBitCast(builder, tmp, cast_type, "");
-  *res_hi = lp_build_uninterleave1(gallivm, bld->type.length * 2, tmp, 1);
-  return lp_build_uninterleave1(gallivm, bld->type.length * 2, tmp, 0);
+  return lp_build_mul_32_lohi(bld, a, b, res_hi);
+   }
+}
+
+
+/*
+ * Widening mul, valid for 32x32 bit -> 64bit only.
+ * Result is low 32bits, high bits returned in res_hi.
+ *
+ * Emits generic code.
+ */
+LLVMValueRef
+lp_build_mul_32_lohi(struct lp_build_context *bld,
+ LLVMValueRef a,
+ LLVMValueRef b,
+ LLVMValueRef *res_hi)
+{
+   struct gallivm_state *gallivm = bld->gallivm;
+   LLVMBuilderRef builder = gallivm->builder;
+   LLVMValueRef tmp;
+   struct lp_type type_tmp;
+   LLVMTypeRef wide_type, cast_type;
+
+   type_tmp = bld->type;
+   type_tmp.width *= 2;
+   wide_type = lp_build_vec_type(gallivm, type_tmp);
+   type_tmp = bld->type;
+  

Re: [Mesa-dev] [PATCH 1/3] gallium/scons: OpenSWR Windows support

2016-11-08 Thread Kyriazis, George
Thank you for the review.  Comments inline.

> -Original Message-
> From: Emil Velikov [mailto:emil.l.veli...@gmail.com]
> Sent: Tuesday, November 8, 2016 7:52 AM
> To: Kyriazis, George ; Jose Fonseca
> 
> Cc: ML mesa-dev 
> Subject: Re: [Mesa-dev] [PATCH 1/3] gallium/scons: OpenSWR Windows
> support
> 
> Hi George,
> 
> For Scons changes please keep Jose Fonseca in the loop.
> 
> On 7 November 2016 at 22:32, George Kyriazis 
> wrote:
> > - Added code to create screen and handle swaps in libgl_gdi.c
> > - Added call to swr SConscript
> > - included llvm 3.9 support for scons (windows swr only support 3.9 and
> >   later)
> If that's the case building SWR with earlier one should error out ?
> Then again, here you reference gallium/drivers/swr/
> 
Yes, SWR on Windows only works with LLVM 3.9 and later.
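
A version gate that errors out early could look roughly like the sketch below. It is an assumption-laden illustration, not the code in scons/llvm.py (which compares `distutils.version.LooseVersion` objects); `parse_version`, `check_llvm_for_swr` and `MIN_SWR_WINDOWS_LLVM` are hypothetical names.

```python
# Minimum LLVM version for SWR on Windows, as stated in this thread.
MIN_SWR_WINDOWS_LLVM = (3, 9)


def parse_version(text):
    """Turn a dotted version string such as '3.9.1' into a tuple of ints.

    Suffixes such as 'svn' are not handled in this sketch.
    """
    return tuple(int(part) for part in text.split("."))


def check_llvm_for_swr(llvm_version_text):
    """Return the parsed version, or raise if it is too old for SWR."""
    version = parse_version(llvm_version_text)
    # Tuples compare element by element, so (3, 10) > (3, 9) as intended.
    if version < MIN_SWR_WINDOWS_LLVM:
        raise RuntimeError(
            "swr needs LLVM >= 3.9, found %s" % llvm_version_text)
    return version
```

Comparing integer tuples avoids the string-comparison trap where "3.10" would sort before "3.9".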

> > - include -DHAVE_SWR to subdirs that need it
> >
> As the above indicates here you have multiple independent changes.
> Please do _not_ mix those into a single patch.
> 
I'll resend v2 of the patches with a different breakdown.

Additional comments on your review of patch 3/3.

Thanks,

George

> 
> > To build SWR on windows, use "scons swr libgl-gdi"
> > ---
> >  scons/llvm.py                             | 21 +++--
> >  src/gallium/SConscript                    |  1 +
> >  src/gallium/targets/libgl-gdi/SConscript  |  4
> >  src/gallium/targets/libgl-gdi/libgl_gdi.c | 28 +++-
> >  src/gallium/targets/libgl-xlib/SConscript |  4
> >  src/gallium/targets/osmesa/SConscript     |  4
> >  6 files changed, 55 insertions(+), 7 deletions(-)
> >
> > diff --git a/scons/llvm.py b/scons/llvm.py
> > index 1fc8a3f..977e47a 100644
> > --- a/scons/llvm.py
> > +++ b/scons/llvm.py
> > @@ -106,7 +106,24 @@ def generate(env):
> >  ])
> >  env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')])
> >  # LIBS should match the output of `llvm-config --libs engine mcjit bitwriter x86asmprinter`
> > -if llvm_version >= distutils.version.LooseVersion('3.7'):
> > +if llvm_version >= distutils.version.LooseVersion('3.9'):
> > +env.Prepend(LIBS = [
> > +'LLVMX86Disassembler', 'LLVMX86AsmParser',
> > +'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
> > +'LLVMDebugInfoCodeView', 'LLVMCodeGen',
> > +'LLVMScalarOpts', 'LLVMInstCombine',
> > +'LLVMInstrumentation', 'LLVMTransformUtils',
> > +'LLVMBitWriter', 'LLVMX86Desc',
> > +'LLVMMCDisassembler', 'LLVMX86Info',
> > +'LLVMX86AsmPrinter', 'LLVMX86Utils',
> > +'LLVMMCJIT', 'LLVMExecutionEngine', 'LLVMTarget',
> > +'LLVMAnalysis', 'LLVMProfileData',
> > +'LLVMRuntimeDyld', 'LLVMObject', 'LLVMMCParser',
> > +'LLVMBitReader', 'LLVMMC', 'LLVMCore',
> > +'LLVMSupport',
> > +'LLVMIRReader', 'LLVMASMParser'
> > +])
> LLVM 3.9 support. cc: mesa-stable (if Jose/Brian are up for it).
> 
> > +elif llvm_version >= distutils.version.LooseVersion('3.7'):
> >  env.Prepend(LIBS = [
> >  'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',
> >  'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
> > @@ -203,7 +220,7 @@ def generate(env):
> >  if '-fno-rtti' in cxxflags:
> >  env.Append(CXXFLAGS = ['-fno-rtti'])
> >
> > -components = ['engine', 'mcjit', 'bitwriter', 'x86asmprinter', 'mcdisassembler']
> > +components = ['engine', 'mcjit', 'bitwriter', 'x86asmprinter', 'mcdisassembler', 'irreader']
> Standalone bugfix. Cc: mesa-stable ?
> 
> 
> > +++ b/src/gallium/SConscript
> 
> > +'drivers/swr/SConscript',
> This file is only introduced with 3/3. Which means that you've added scons
> support which is broken - please don't do that.
> 
> 
> > +++ b/src/gallium/targets/libgl-gdi/SConscript
> > +++ b/src/gallium/targets/libgl-gdi/libgl_gdi.c
> > +++ b/src/gallium/targets/libgl-xlib/SConscript
> > +++ b/src/gallium/targets/osmesa/SConscript
> 
> Couple of ideas how to split these. Or anything else that comes to mind on
> your end.
> 
> A)
> Patch 1
> src/gallium/SConscript
> src/gallium/targets/libgl-gdi/SConscript
> src/gallium/targets/libgl-gdi/libgl_gdi.c
> Patch 2
> src/gallium/targets/libgl-xlib/SConscript
> src/gallium/targets/osmesa/SConscript
> 
> B)
> Patch 1
> src/gallium/targets/libgl-gdi/libgl_gdi.c
> Patch 2
> src/gallium/SConscript
> src/gallium/targets/libgl-gdi/SConscript
> src/gallium/targets/libgl-xlib/SConscript
> src/gallium/targets/osmesa/SConscript
> 
> 
> Thanks
> Emil


Re: [Mesa-dev] [PATCH] glcpp: initializes version to -1

2016-11-08 Thread Juan A. Suarez Romero
On Tue, 2016-11-08 at 14:19 +0100, Karol Herbst wrote:
> well I don't care either way, maybe the spec does say something about
> it.

I was re-reading the GLSL 1.10 spec about the #version directive.

#version follows the same convention as __VERSION__.

For __VERSION__, the spec says it "will substitute a decimal integer
reflecting the version number of the OpenGL shading language".


So it is not clear whether we should always read the value as decimal or
keep the current behaviour.
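
To make the two readings concrete, here is a small sketch of both. It is not the glcpp code; `parse_version_auto_base` and `parse_version_decimal` are hypothetical helpers that only model the behaviour described in this thread, where a leading zero selects octal.

```python
def parse_version_auto_base(text):
    """Parse with C-style literal rules: a leading '0' means octal."""
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    return int(text, 10)


def parse_version_decimal(text):
    """Always parse as decimal, ignoring any leading zeros."""
    return int(text, 10)
```

With the first reading, "0512" becomes 330 and "0130" becomes 88, matching the numbers quoted earlier in the thread; with the second, "0512" is simply 512.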

J.A.



Re: [Mesa-dev] [PATCH 1/1] Fix endianess detection with musl-based toolchains

2016-11-08 Thread Emil Velikov
On 5 November 2016 at 01:55, Jonathan Gray  wrote:
> On Fri, Nov 04, 2016 at 07:53:25PM +0100, Bernd Kuhls wrote:
>> Musl does not define __GLIBC__ and will not provide a __MUSL__ macro:
>> http://wiki.musl-libc.org/wiki/FAQ#Q:_why_is_there_no_MUSL_macro_.3F
>>
>> This patch checks for the presence of endian.h and promotes the result
>> to src/amd/Makefile.addrlib.am which executes the broken build command.
>> Fixes compile errors detected by the autobuilder infrastructure of the
>> buildroot project:
>
> This will break OpenBSD and perhaps other platforms which
> have endian.h that does not define glibc definitions.
>
From a quick skim on my system, glibc provides the non-__-prefixed
symbols (BYTE_ORDER and friends) if _DEFAULT_SOURCE is set. The latter
is set implicitly in a wide variety of cases (once you get
through the ifdef spaghetti).

Is it worth checking whether the non-__ defines are available across the
board and using them?

-Emil


Re: [Mesa-dev] [PATCH] gallivm: Fix build after removal of deprecated attribute API

2016-11-08 Thread Andy Furniss

Aaron Watry wrote:
> On Tue, Nov 8, 2016 at 4:38 AM, Andy Furniss wrote:
>
>> Tom Stellard wrote:
>>
>>> ---
>>>
>>> Build tested only so far.
>>>
>>>   src/gallium/auxiliary/draw/draw_llvm.c            |  6 +-
>>>   src/gallium/auxiliary/gallivm/lp_bld_intr.c       | 48 +++-
>>>   src/gallium/auxiliary/gallivm/lp_bld_intr.h       | 13 -
>>>   src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |  4 +-
>>>   src/gallium/drivers/radeonsi/si_shader.c          | 69 ---
>>>   src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c | 24
>>>   6 files changed, 112 insertions(+), 52 deletions(-)
>>
>> I notice that llvmpipe needs fixing as well - or maybe that's for someone
>> else?
>
> I sent a patch for that last night.  Feel free to give it a spin.

Oops, sorry I missed that - with it I can build OK.



Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds

2016-11-08 Thread Emil Velikov
On 7 November 2016 at 22:32, George Kyriazis  wrote:
> - Added SConscript files
> - better handling of NOMINMAX for  inclusion
> - Reorder header files in swr_context.cpp to handle NOMINMAX better, since
>   mesa header files include windows.h before we get a chance to #define
>   NOMINMAX
> - cleaner support for .dll and .so prefix/suffix across OSes
> - added PUBLIC for some protos
> - added swr_gdi_swap() which is called from libgl_gdi.c
> ---
>  src/gallium/drivers/swr/Makefile.am|   8 ++
>  src/gallium/drivers/swr/SConscript |  46 +++
>  src/gallium/drivers/swr/SConscript-arch| 175 +
>  src/gallium/drivers/swr/rasterizer/common/os.h |   5 +-
>  src/gallium/drivers/swr/swr_context.cpp|  16 +--
>  src/gallium/drivers/swr/swr_context.h  |   2 +
>  src/gallium/drivers/swr/swr_loader.cpp |  37 +-
>  src/gallium/drivers/swr/swr_public.h   |  11 +-
>  src/gallium/drivers/swr/swr_screen.cpp |  25 +---
>  9 files changed, 291 insertions(+), 34 deletions(-)
>  create mode 100644 src/gallium/drivers/swr/SConscript
>  create mode 100644 src/gallium/drivers/swr/SConscript-arch
>
Similar to 1/3 this patch does too many things. Please _don't_  do that.

Some ideas based on the above:
 - source code fixes - one or multiple patches, depending on details.
 - automake fixes - ^^
 - introduce scons build (+ the EXTRA_DIST hunk)

Some misc comments below.


> +++ b/src/gallium/drivers/swr/SConscript
> @@ -0,0 +1,46 @@
> +Import('*')
> +
> +from sys import executable as python_cmd
> +import distutils.version
Seems unused. Maybe it was aimed for the llvm 3.9 requirement/check
mentioned in 1/3 ?

> +import os.path
> +
> +if not 'swr' in COMMAND_LINE_TARGETS:
> +Return()
> +
> +if not env['llvm']:
> +print 'warning: LLVM disabled: not building swr'
> +Return()
> +
> +env.MSVC2013Compat()
> +

> +swr_arch = 'avx'
> +VariantDir('avx', '.', duplicate=0)
> +SConscript('avx/SConscript-arch', exports='swr_arch')
> +
> +swr_arch = 'avx2'
> +VariantDir('avx2', '.', duplicate=0)
> +SConscript('avx2/SConscript-arch', exports='swr_arch')
> +
Afaict one can just fold the SConscript-arch here. Thus one won't need
to bother with the above nor the Depends hunk below.
Additionally with current approach one is generating [the] identical
source files twice. Far from ideal...

> +env = env.Clone()
> +
> +source = env.ParseSourceList('Makefile.sources', [
> +'LOADER_SOURCES'
> +])
> +
> +env.Prepend(CPPPATH = [
> +'rasterizer/scripts'
> +])
> +
> +swr = env.ConvenienceLibrary(
> +   target = 'swr',
> +   source = source,
> +   )
Keep the indentation to 4 spaces here and throughout the SConscripts.
That's a python requirement.
In general I'd encourage using .editorconfig and updating the section
for swr, if needed.


> +# remove headers, as scons thinks they are static objects for the .so
> +source = [x for x in source if not x.endswith(tuple(['.h','.hpp']))]
> +
Should be handled already. Otherwise please do so in scons/*
Quick grep suggests scons/custom.py


> +#ifdef _WIN32
> +   prefix = "";
> +   postfix = ".dll";
> +#else
> +   prefix = "lib";
> +   postfix = ".so";
> +#endif
> +
Quick grep suggests:

UTIL_DL_EXT
UTIL_DL_PREFIX

Regards,
Emil


[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences

2016-11-08 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98606

--- Comment #2 from Emil Velikov  ---
When you say "... I no longer have VA-API decode ability like in the past" do
you have an estimate when (what gcc/llvm/etc. combination) it was working ?

Also, please make sure that you start a clean build if managing gallium llvm
(--enable-gallium-llvm and permutations). Atm if you build with --enable and
then reconfigure/rebuild with --disable things will break similar to your log.



Re: [Mesa-dev] [PATCH] gallivm: Fix build after removal of deprecated attribute API

2016-11-08 Thread Aaron Watry
On Tue, Nov 8, 2016 at 4:38 AM, Andy Furniss  wrote:

> Tom Stellard wrote:
>
>> ---
>>
>> Build tested only so far.
>>
>>   src/gallium/auxiliary/draw/draw_llvm.c|  6 +-
>>   src/gallium/auxiliary/gallivm/lp_bld_intr.c   | 48 +++-
>>   src/gallium/auxiliary/gallivm/lp_bld_intr.h   | 13 -
>>   src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |  4 +-
>>   src/gallium/drivers/radeonsi/si_shader.c  | 69
>> ---
>>   src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c | 24 
>>   6 files changed, 112 insertions(+), 52 deletions(-)
>>
>
> I notice that llvmpipe needs fixing as well - or maybe that's for someone
> else?
>
>
I sent a patch for that last night.  Feel free to give it a spin.

--Aaron




Re: [Mesa-dev] [PATCH 1/3] gallium/scons: OpenSWR Windows support

2016-11-08 Thread Emil Velikov
Hi George,

For Scons changes please keep Jose Fonseca in the loop.

On 7 November 2016 at 22:32, George Kyriazis  wrote:
> - Added code to create screen and handle swaps in libgl_gdi.c
> - Added call to swr SConscript
> - included llvm 3.9 support for scons (windows swr only supports 3.9 and
>   later)
If that's the case, shouldn't building SWR with an earlier version error out?
Then again, here you reference gallium/drivers/swr/

> - include -DHAVE_SWR to subdirs that need it
>
As the above indicates here you have multiple independent changes.
Please do _not_ mix those into a single patch.


> To build SWR on windows, use "scons swr libgl-gdi"
> ---
>  scons/llvm.py | 21 +++--
>  src/gallium/SConscript|  1 +
>  src/gallium/targets/libgl-gdi/SConscript  |  4 
>  src/gallium/targets/libgl-gdi/libgl_gdi.c | 28 +++-
>  src/gallium/targets/libgl-xlib/SConscript |  4 
>  src/gallium/targets/osmesa/SConscript |  4 
>  6 files changed, 55 insertions(+), 7 deletions(-)
>
> diff --git a/scons/llvm.py b/scons/llvm.py
> index 1fc8a3f..977e47a 100644
> --- a/scons/llvm.py
> +++ b/scons/llvm.py
> @@ -106,7 +106,24 @@ def generate(env):
>  ])
>  env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')])
>  # LIBS should match the output of `llvm-config --libs engine mcjit bitwriter x86asmprinter`
> -if llvm_version >= distutils.version.LooseVersion('3.7'):
> +if llvm_version >= distutils.version.LooseVersion('3.9'):
> +env.Prepend(LIBS = [
> +'LLVMX86Disassembler', 'LLVMX86AsmParser',
> +'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
> +'LLVMDebugInfoCodeView', 'LLVMCodeGen',
> +'LLVMScalarOpts', 'LLVMInstCombine',
> +'LLVMInstrumentation', 'LLVMTransformUtils',
> +'LLVMBitWriter', 'LLVMX86Desc',
> +'LLVMMCDisassembler', 'LLVMX86Info',
> +'LLVMX86AsmPrinter', 'LLVMX86Utils',
> +'LLVMMCJIT', 'LLVMExecutionEngine', 'LLVMTarget',
> +'LLVMAnalysis', 'LLVMProfileData',
> +'LLVMRuntimeDyld', 'LLVMObject', 'LLVMMCParser',
> +'LLVMBitReader', 'LLVMMC', 'LLVMCore',
> +'LLVMSupport',
> +'LLVMIRReader', 'LLVMASMParser'
> +])
LLVM 3.9 support. cc: mesa-stable (if Jose/Brian are up for it).

> +elif llvm_version >= distutils.version.LooseVersion('3.7'):
>  env.Prepend(LIBS = [
>  'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',
>  'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
> @@ -203,7 +220,7 @@ def generate(env):
>  if '-fno-rtti' in cxxflags:
>  env.Append(CXXFLAGS = ['-fno-rtti'])
>
> -components = ['engine', 'mcjit', 'bitwriter', 'x86asmprinter', 'mcdisassembler']
> +components = ['engine', 'mcjit', 'bitwriter', 'x86asmprinter', 'mcdisassembler', 'irreader']
Standalone bugfix. Cc: mesa-stable ?


> +++ b/src/gallium/SConscript

> +'drivers/swr/SConscript',
This file is only introduced with 3/3. Which means that you've added
scons support which is broken - please don't do that.


> +++ b/src/gallium/targets/libgl-gdi/SConscript
> +++ b/src/gallium/targets/libgl-gdi/libgl_gdi.c
> +++ b/src/gallium/targets/libgl-xlib/SConscript
> +++ b/src/gallium/targets/osmesa/SConscript

Couple of ideas how to split these. Or anything else that comes to
mind on your end.

A)
Patch 1
src/gallium/SConscript
src/gallium/targets/libgl-gdi/SConscript
src/gallium/targets/libgl-gdi/libgl_gdi.c
Patch 2
src/gallium/targets/libgl-xlib/SConscript
src/gallium/targets/osmesa/SConscript

B)
Patch 1
src/gallium/targets/libgl-gdi/libgl_gdi.c
Patch 2
src/gallium/SConscript
src/gallium/targets/libgl-gdi/SConscript
src/gallium/targets/libgl-xlib/SConscript
src/gallium/targets/osmesa/SConscript


Thanks
Emil


Re: [Mesa-dev] [PATCH] gallivm: fix [IU]MUL_HI regression

2016-11-08 Thread Roland Scheidegger
Sorry for breaking radeonsi, I somehow thought this was already used for
CPU only, without actually checking...
And thanks for fixing that typo, apparently you can pass piglit's
umul_hi/imul_hi tests (at least those from the shader_integer_mix group)
even with the square of argument a...
btw as I didn't consider this, I don't know if you want to change the
shift/trunc to shuffle in the end - feel free to change it back if it
doesn't generate good code on radeonsi...
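
For readers following along, the generic path in the quoted patch (extend both operands to 64 bits, multiply, split into halves) can be modelled on scalars as below. This is only a Python sketch of the arithmetic, not the LLVM IR builder code; the function names are hypothetical, and `mul_32_lohi_squared` exists purely to model the "square of argument a" typo.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mul_32_lohi(a, b, signed=False):
    """Return (lo, hi): the 32-bit halves of the 64-bit product a * b."""
    if signed:
        wide = to_signed32(a) * to_signed32(b)   # models sext + mul
    else:
        wide = (a & MASK32) * (b & MASK32)       # models zext + mul
    wide &= MASK64                               # wrap to 64 bits
    return wide & MASK32, (wide >> 32) & MASK32


def mul_32_lohi_squared(a, b, signed=False):
    """Model of the typo: b is ignored and a is multiplied by itself."""
    return mul_32_lohi(a, a, signed)
```

Inputs with a equal to b cannot tell the two functions apart, whereas a pair such as 3 and 5 does.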

Reviewed-by: Roland Scheidegger 



Am 08.11.2016 um 10:15 schrieb Nicolai Hähnle:
> From: Nicolai Hähnle 
> 
> This patch does two things:
> 
> 1. It separates the host-CPU code generation from the generic code
>generation. This guards against accidentally breaking things for
>radeonsi in the future.
> 
> 2. It makes sure we actually use both arguments and don't just compute
>a square :-p
> 
> Fixes a regression introduced by commit 29279f44b3172ef3b84d470e70fc7684695ced4b
> 
> Cc: Roland Scheidegger 
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_arit.c| 72 ++
>  src/gallium/auxiliary/gallivm/lp_bld_arit.h|  6 ++
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 40 +++-
>  3 files changed, 90 insertions(+), 28 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> index 3de4628..43ad238 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> @@ -1087,26 +1087,28 @@ lp_build_mul(struct lp_build_context *bld,
>  res = LLVMBuildLShr(builder, res, shift, "");
>}
> }
>  
> return res;
>  }
>  
>  /*
>   * Widening mul, valid for 32x32 bit -> 64bit only.
>   * Result is low 32bits, high bits returned in res_hi.
> + *
> + * Emits code that is meant to be compiled for the host CPU.
>   */
>  LLVMValueRef
> -lp_build_mul_32_lohi(struct lp_build_context *bld,
> - LLVMValueRef a,
> - LLVMValueRef b,
> - LLVMValueRef *res_hi)
> +lp_build_mul_32_lohi_cpu(struct lp_build_context *bld,
> + LLVMValueRef a,
> + LLVMValueRef b,
> + LLVMValueRef *res_hi)
>  {
> struct gallivm_state *gallivm = bld->gallivm;
> LLVMBuilderRef builder = gallivm->builder;
>  
> assert(bld->type.width == 32);
> assert(bld->type.floating == 0);
> assert(bld->type.fixed == 0);
> assert(bld->type.norm == 0);
>  
> /*
> @@ -1209,43 +1211,61 @@ lp_build_mul_32_lohi(struct lp_build_context *bld,
>*res_hi = LLVMBuildShuffleVector(builder, muleven, mulodd, shuf_vec, "");
>  
>for (i = 0; i < bld->type.length; i += 2) {
>   shuf[i] = lp_build_const_int32(gallivm, i);
>   shuf[i+1] = lp_build_const_int32(gallivm, i + bld->type.length);
>}
>shuf_vec = LLVMConstVector(shuf, bld->type.length);
>return LLVMBuildShuffleVector(builder, muleven, mulodd, shuf_vec, "");
> }
> else {
> -  LLVMValueRef tmp;
> -  struct lp_type type_tmp;
> -  LLVMTypeRef wide_type, cast_type;
> -
> -  type_tmp = bld->type;
> -  type_tmp.width *= 2;
> -  wide_type = lp_build_vec_type(gallivm, type_tmp);
> -  type_tmp = bld->type;
> -  type_tmp.length *= 2;
> -  cast_type = lp_build_vec_type(gallivm, type_tmp);
> -
> -  if (bld->type.sign) {
> - a = LLVMBuildSExt(builder, a, wide_type, "");
> - b = LLVMBuildSExt(builder, b, wide_type, "");
> -  } else {
> - a = LLVMBuildZExt(builder, a, wide_type, "");
> - b = LLVMBuildZExt(builder, b, wide_type, "");
> -  }
> -  tmp = LLVMBuildMul(builder, a, b, "");
> -  tmp = LLVMBuildBitCast(builder, tmp, cast_type, "");
> -  *res_hi = lp_build_uninterleave1(gallivm, bld->type.length * 2, tmp, 1);
> -  return lp_build_uninterleave1(gallivm, bld->type.length * 2, tmp, 0);
> +  return lp_build_mul_32_lohi(bld, a, b, res_hi);
> +   }
> +}
> +
> +
> +/*
> + * Widening mul, valid for 32x32 bit -> 64bit only.
> + * Result is low 32bits, high bits returned in res_hi.
> + *
> + * Emits generic code.
> + */
> +LLVMValueRef
> +lp_build_mul_32_lohi(struct lp_build_context *bld,
> + LLVMValueRef a,
> + LLVMValueRef b,
> + LLVMValueRef *res_hi)
> +{
> +   struct gallivm_state *gallivm = bld->gallivm;
> +   LLVMBuilderRef builder = gallivm->builder;
> +   LLVMValueRef tmp;
> +   struct lp_type type_tmp;
> +   LLVMTypeRef wide_type, cast_type;
> +
> +   type_tmp = bld->type;
> +   type_tmp.width *= 2;
> +   wide_type = lp_build_vec_type(gallivm, type_tmp);
> +   type_tmp = bld->type;
> +   type_tmp.length *= 2;
> +   cast_type = lp_build_vec_type(gallivm, type_tmp);
> +
> +   if (bld->type.sign) {
> +  a = LLVMBuildSExt(builder, a, wide_type, "");
> +  b = LLVMBuildSExt(builder, b, wide_ty

Re: [Mesa-dev] [PATCH 1/4] linker: Trivial coding standards fixes

2016-11-08 Thread Ilia Mirkin
On Tue, Nov 8, 2016 at 12:50 AM, Ian Romanick  wrote:
> -   virtual void visit_field(const glsl_type *type, const char *name,
> -bool row_major)
> +   virtual void visit_field(const glsl_type *, const char *,
> +bool /* row_major */)
> {
> -  (void) type;
> -  (void) name;
> -  (void) row_major;
> -  assert(!"Should not get here.");
> +  unreachable("Should not get here.");
> }

I'd be in favor of leaving this as an assert. The unreachable gets you
nothing here, except potential infinite loops on production builds
should this path ever get hit somehow.

I think people have started going overboard with unreachable... it
really should be for "shut up compiler, this can't happen, you're just
too dumb to see it" cases. Not for "it would be a bug to hit this
path" cases.

  -ilia


Re: [Mesa-dev] [PATCH] glcpp: initializes version to -1

2016-11-08 Thread Karol Herbst
2016-11-08 13:35 GMT+01:00 Juan A. Suarez Romero :
> On Sat, 2016-11-05 at 10:48 +0100, Karol Herbst wrote:
>> "#version 0512": 0:1(10): error: GLSL 3.30 is not supported.
>> Supported
>> versions are: 1.10, 1.20, 1.30, 1.00 ES, and 3.00 ES
>>
>> so the issue with this would be, that "0512" is parsed as 3.30, which
>> isn't right either, but the current master version does the same. \o/
>> new bug found
>
>
> Doing a quick check, not sure if this is a bug... 0512 is interpreted
> in octal format, which in decimal is 330. Same for 0130, which is 88 in
> decimal.
>
>
> So unless we want to force all the values to be read as decimal, I
> wouldn't say it is a bug.
>
>
> J.A.
>

well I don't care either way, maybe the spec does say something about it.

