[Mesa-dev] [Bug 98652] AMD driver doesn't compile anymore after recent LLVM changes
https://bugs.freedesktop.org/show_bug.cgi?id=98652 Bug ID: 98652 Summary: AMD driver doesn't compile anymore after recent LLVM changes Product: Mesa Version: git Hardware: Other OS: All Status: NEW Severity: normal Priority: medium Component: Other Assignee: mesa-dev@lists.freedesktop.org Reporter: b...@lindev.ch QA Contact: mesa-dev@lists.freedesktop.org The LLVMAttribute API has been removed from LLVM recently. Mesa's src/amd/common/ac_nir_to_llvm.c still uses it, causing compile failures when using current LLVM snapshots. ac_nir_to_llvm.c:144:43: error: unknown type name 'LLVMAttribute'; did you mean 'LLVMAttributeRef'? unsigned param_count, LLVMAttribute attribs); ^ LLVMAttributeRef /usr/include/llvm-c/Types.h:116:40: note: 'LLVMAttributeRef' declared here typedef struct LLVMOpaqueAttributeRef *LLVMAttributeRef; ^ ac_nir_to_llvm.c:230:4: error: implicit declaration of function 'LLVMAddAttribute' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMAddAttribute(P, LLVMByValAttribute); ^ ac_nir_to_llvm.c:230:24: error: use of undeclared identifier 'LLVMByValAttribute'; did you mean 'LLVMAddAttribute'? LLVMAddAttribute(P, LLVMByValAttribute); ^~ LLVMAddAttribute ac_nir_to_llvm.c:230:4: note: 'LLVMAddAttribute' declared here LLVMAddAttribute(P, LLVMByValAttribute); ^ ac_nir_to_llvm.c:234:24: error: use of undeclared identifier 'LLVMInRegAttribute' LLVMAddAttribute(P, LLVMInRegAttribute); ^ ac_nir_to_llvm.c:710:63: error: use of undeclared identifier 'LLVMReadNoneAttribute' return emit_llvm_intrinsic(ctx, intrin, ctx->f32, params, 1, LLVMReadNoneAt... ^ ac_nir_to_llvm.c:721:63: error: use of undeclared identifier 'LLVMReadNoneAttribute' return emit_llvm_intrinsic(ctx, intrin, ctx->f32, params, 2, LLVMReadNoneAt... ^ ac_nir_to_llvm.c:733:63: error: use of undeclared identifier 'LLVMReadNoneAttribute' return emit_llvm_intrinsic(ctx, intrin, ctx->f32, params, 3, LLVMReadNoneAt... 
^ ac_nir_to_llvm.c:759:72: error: use of undeclared identifier 'LLVMReadNoneAttribute' return emit_llvm_intrinsic(ctx, "llvm.cttz.i32", ctx->i32, params, 2, LLVMReadNoneA... ^ ac_nir_to_llvm.c:767:13: error: use of undeclared identifier 'LLVMReadNoneAttribute' LLVMReadNoneAttribute); ^ ac_nir_to_llvm.c:793:13: error: use of undeclared identifier 'LLVMReadNoneAttribute' LLVMReadNoneAttribute); ^ ac_nir_to_llvm.c:857:8: error: use of undeclared identifier 'LLVMReadNoneAttribute' LLVMReadNoneAttribute); ^ ac_nir_to_llvm.c:873:18: error: use of undeclared identifier 'LLVMReadNoneAttribute' params, 2, LLVMReadNoneAttribute); ^ ac_nir_to_llvm.c:918:63: error: use of undeclared identifier 'LLVMReadNoneAttribute' result = emit_llvm_intrinsic(ctx, intrin, ctx->i32, srcs, 3, LLVMReadNoneAt... ^ ac_nir_to_llvm.c:1025:21: error: use of undeclared identifier 'LLVMReadNoneAttribute' tid_args, 2, LLVMReadNoneAttribute); ^ ac_nir_to_llvm.c:1029:20: error: use of undeclared identifier 'LLVMReadNoneAttribute' tid_args, 2, LLVMReadNoneAttribute); ^ ac_nir_to_llvm.c:1117:7: error: use of undeclared identifier 'LLVMReadNoneAttribute' LLVMReadNoneAttribute); ^ ac_nir_to_llvm.c:1123:9: error: use of undeclared identifier 'LLVMReadNoneAttribute' LLVMReadNoneAttribute); ^ ac_nir_to_llvm.c:1453:78: error: use
Re: [Mesa-dev] [PATCH] radv: fix GetFenceStatus for signaled fences
Reviewed-by: Bas Nieuwenhuizen On Wed, Nov 9, 2016 at 2:22 AM, Dave Airlie wrote: > From: Dave Airlie > > if a fence is created pre-signaled we should return that > in GetFenceStatus even if it hasn't been submitted. > > Signed-off-by: Dave Airlie > --- > src/amd/vulkan/radv_device.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c > index fdb6db9..214af5f 100644 > --- a/src/amd/vulkan/radv_device.c > +++ b/src/amd/vulkan/radv_device.c > @@ -1202,6 +1202,8 @@ VkResult radv_GetFenceStatus(VkDevice _device, VkFence > _fence) > RADV_FROM_HANDLE(radv_device, device, _device); > RADV_FROM_HANDLE(radv_fence, fence, _fence); > > + if (fence->signalled) > + return VK_SUCCESS; > if (!fence->submitted) > return VK_NOT_READY; > > -- > 2.7.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] swr: [rasterizer core] allow an OpenGL driver to specify halfz clipping
With ARB_clip_control, GL may also do 0..1 depth clipping, not just -1..1.
For backwards compatibility, preserve the existing driver type check for DX
as well.

Signed-off-by: Ilia Mirkin
---
 src/gallium/drivers/swr/rasterizer/core/clip.h  | 6 +++---
 src/gallium/drivers/swr/rasterizer/core/state.h | 1 +
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/core/clip.h b/src/gallium/drivers/swr/rasterizer/core/clip.h
index 43bc522..78dbcf0 100644
--- a/src/gallium/drivers/swr/rasterizer/core/clip.h
+++ b/src/gallium/drivers/swr/rasterizer/core/clip.h
@@ -90,7 +90,7 @@ void ComputeClipCodes(DRIVER_TYPE type, const API_STATE& state, const simdvector
     {
         // FRUSTUM_NEAR
         // DX clips depth [0..w], GL clips [-w..w]
-        if (type == DX)
+        if (type == DX || state.rastState.clipHalfZ)
         {
             vRes = _simd_cmplt_ps(vertex.z, _simd_setzero_ps());
         }
@@ -640,7 +640,7 @@ private:
         case FRUSTUM_BOTTOM:t = ComputeInterpFactor(_simd_sub_ps(v1[3], v1[1]), _simd_sub_ps(v2[3], v2[1])); break;
         case FRUSTUM_NEAR:
             // DX Znear plane is 0, GL is -w
-            if (this->driverType == DX)
+            if (this->driverType == DX || this->state.rastState.clipHalfZ)
             {
                 t = ComputeInterpFactor(v1[2], v2[2]);
             }
@@ -708,7 +708,7 @@ private:
         case FRUSTUM_RIGHT: return _simd_cmple_ps(v[0], v[3]);
         case FRUSTUM_TOP: return _simd_cmpge_ps(v[1], _simd_mul_ps(v[3], _simd_set1_ps(-1.0f)));
         case FRUSTUM_BOTTOM:return _simd_cmple_ps(v[1], v[3]);
-        case FRUSTUM_NEAR: return _simd_cmpge_ps(v[2], this->driverType == DX ? _simd_setzero_ps() : _simd_mul_ps(v[3], _simd_set1_ps(-1.0f)));
+        case FRUSTUM_NEAR: return _simd_cmpge_ps(v[2], this->driverType == DX || this->state.rastState.clipHalfZ ? _simd_setzero_ps() : _simd_mul_ps(v[3], _simd_set1_ps(-1.0f)));
         case FRUSTUM_FAR: return _simd_cmple_ps(v[2], v[3]);
         default:
             SWR_ASSERT(false, "invalid clipping plane: %d", ClippingPlane);
diff --git a/src/gallium/drivers/swr/rasterizer/core/state.h b/src/gallium/drivers/swr/rasterizer/core/state.h
index 93e4565..5ee12e8 100644
--- a/src/gallium/drivers/swr/rasterizer/core/state.h
+++ b/src/gallium/drivers/swr/rasterizer/core/state.h
@@ -932,6 +932,7 @@ struct SWR_RASTSTATE
     uint32_t frontWinding : 1;
     uint32_t scissorEnable : 1;
     uint32_t depthClipEnable: 1;
+    uint32_t clipHalfZ : 1;
     uint32_t pointParam : 1;
     uint32_t pointSpriteEnable : 1;
     uint32_t pointSpriteTopOrigin : 1;
--
2.7.3
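The near-plane rule that this patch switches on clipHalfZ can be shown with a scalar sketch. This is not the SWR code, which works on SIMD vectors; the function name near_plane_inside and its parameters are hypothetical and exist only for illustration.

```c
#include <stdbool.h>

/* Hypothetical scalar version of the near-plane test. The real SWR code
 * evaluates the same comparison on whole SIMD vectors. */
static bool near_plane_inside(float z, float w, bool clip_half_z)
{
    /* Half-Z (DX, or GL with ARB_clip_control zero-to-one depth):
     * the visible range is 0 <= z <= w, so the near plane is z >= 0. */
    if (clip_half_z)
        return z >= 0.0f;

    /* Classic GL: the visible range is -w <= z <= w. */
    return z >= -w;
}
```

A vertex with z = -0.5 and w = 1 is inside the classic GL near plane but outside the half-Z one, which is the case the added clipHalfZ check covers.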
[Mesa-dev] [PATCH 2/2] swr: set halfz rasterizer setting
Signed-off-by: Ilia Mirkin
---
 src/gallium/drivers/swr/swr_state.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/swr/swr_state.cpp b/src/gallium/drivers/swr/swr_state.cpp
index 01cadce..d19acfb 100644
--- a/src/gallium/drivers/swr/swr_state.cpp
+++ b/src/gallium/drivers/swr/swr_state.cpp
@@ -921,6 +921,7 @@ swr_update_derived(struct pipe_context *pipe,
       rastState->depthFormat = swr_resource(zb->texture)->swr.format;
 
    rastState->depthClipEnable = rasterizer->depth_clip;
+   rastState->clipHalfZ = rasterizer->clip_halfz;
 
    rastState->clipDistanceMask =
       ctx->vs->info.base.num_written_clipdistance ?
--
2.7.3
Re: [Mesa-dev] [PATCH 3/3] swr: [rasterizer jitter] fix logic op to work with unorm/snorm
Reviewed-by: Tim Rowley <timothy.o.row...@intel.com>

On Nov 7, 2016, at 6:18 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:

Most logic op usage is probably going to end up with normalized textures.
Scale the floating point values and convert to integer before performing
the logic operations.

Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
---

The gl-1.1-xor-copypixels test still fails. The image stays the same. I'm
suspecting it's for reasons outside of this patch.

I'm not too familiar with the whole swr infrastructure, perhaps there was an
easier way to do all this. I looked for conversion helper functions but
couldn't find anything that would fit nicely here. Feel free to point me in
the right direction.

 .../drivers/swr/rasterizer/jitter/blend_jit.cpp | 81 +-
 1 file changed, 64 insertions(+), 17 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
index 1452d27..d69d503 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/blend_jit.cpp
@@ -649,29 +649,54 @@ struct BlendJit : public Builder
         if(state.blendState.logicOpEnable)
         {
             const SWR_FORMAT_INFO& info = GetFormatInfo(state.format);
-            SWR_ASSERT(info.type[0] == SWR_TYPE_UINT);
             Value* vMask[4];
+            float scale[4];
+
+            if (!state.blendState.blendEnable) {
+                Clamp(state.format, src);
+                Clamp(state.format, dst);
+            }
+
             for(uint32_t i = 0; i < 4; i++)
             {
-                switch(info.bpc[i])
+                if (info.type[i] == SWR_TYPE_UNUSED)
                 {
-                case 0: vMask[i] = VIMMED1(0x); break;
-                case 2: vMask[i] = VIMMED1(0x0003); break;
-                case 5: vMask[i] = VIMMED1(0x001F); break;
-                case 6: vMask[i] = VIMMED1(0x003F); break;
-                case 8: vMask[i] = VIMMED1(0x00FF); break;
-                case 10: vMask[i] = VIMMED1(0x03FF); break;
-                case 11: vMask[i] = VIMMED1(0x07FF); break;
-                case 16: vMask[i] = VIMMED1(0x); break;
-                case 24: vMask[i] = VIMMED1(0x00FF); break;
-                case 32: vMask[i] = VIMMED1(0x); break;
+                    continue;
+                }
+
+                if (info.bpc[i] >= 32) {
+                    vMask[i] = VIMMED1(0x);
+                    scale[i] = 0x;
+                } else {
+                    vMask[i] = VIMMED1((1 << info.bpc[i]) - 1);
+                    if (info.type[i] == SWR_TYPE_SNORM)
+                        scale[i] = (1 << (info.bpc[i] - 1)) - 1;
+                    else
+                        scale[i] = (1 << info.bpc[i]) - 1;
+                }
+
+                switch (info.type[i]) {
                 default:
-                    vMask[i] = VIMMED1(0x0);
-                    SWR_ASSERT(0, "Unsupported bpc for logic op\n");
+                    SWR_ASSERT(0, "Unsupported type for logic op\n");
+                    /* fallthrough */
+                case SWR_TYPE_UINT:
+                case SWR_TYPE_SINT:
+                    src[i] = BITCAST(src[i], mSimdInt32Ty);
+                    dst[i] = BITCAST(dst[i], mSimdInt32Ty);
+                    break;
+                case SWR_TYPE_SNORM:
+                    src[i] = FADD(src[i], VIMMED1(0.5f));
+                    dst[i] = FADD(dst[i], VIMMED1(0.5f));
+                    /* fallthrough */
+                case SWR_TYPE_UNORM:
+                    src[i] = FP_TO_UI(
+                        FMUL(src[i], VIMMED1(scale[i])),
+                        mSimdInt32Ty);
+                    dst[i] = FP_TO_UI(
+                        FMUL(dst[i], VIMMED1(scale[i])),
+                        mSimdInt32Ty);
                     break;
                 }
-                src[i] = BITCAST(src[i], mSimdInt32Ty);//, vMask[i]);
-                dst[i] = BITCAST(dst[i], mSimdInt32Ty);
             }
 
             LogicOpFunc(state.blendState.logicOpFunc, src, dst, result);
@@ -679,10 +704,32 @@ struct BlendJit : public Builder
             // store results out
             for(uint32_t i = 0; i < 4; ++i)
             {
+                if (info.type[i] == SWR_TYPE_UNUSED)
+                {
+                    continue;
+                }
+
                 // clear upper bits from PS output not in RT format after doing logic op
                 result[i] = AND(result[i], vMask[i]);
 
-                STORE(BITCAST(result[i], mSimdFP32Ty), pResult, {i});
+                switch (info.type[i]) {
+                default:
+                    SWR_ASSERT(0, "Unsupported type for logic op\n");
+                    /* fallthrough */
+                case SWR_TYPE_UINT:
+                case SWR_TYPE_SINT:
+                    result[i] = BITCAST(resu
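The UNORM path of the patch (scale the float, convert to integer, apply the logic op, mask) can be sketched in scalar C. The helper name unorm_xor is hypothetical, XOR stands in for whichever logic op is selected, and the division back to a normalized float is an assumption, since the store half of the patch is cut off above.

```c
#include <stdint.h>

/* Hypothetical scalar helper: XOR two UNORM channel values held as floats
 * in [0, 1], for a channel that is `bpc` bits wide (1..31). Inputs are
 * assumed to be clamped already, as the patch does with Clamp(). */
static float unorm_xor(float src, float dst, unsigned bpc)
{
    uint32_t mask = (1u << bpc) - 1u;   /* channel mask, also the scale */
    float scale = (float)mask;

    /* Scale to the integer range and truncate, as FP_TO_UI would. */
    uint32_t s = (uint32_t)(src * scale);
    uint32_t d = (uint32_t)(dst * scale);

    /* Apply the logic op, then clear bits outside the channel. */
    uint32_t r = (s ^ d) & mask;

    /* Assumed inverse step: map the integer back to a normalized float. */
    return (float)r / scale;
}
```

For an 8-bit channel, XOR of 1.0 with 0.0 gives 255 as an integer and 1.0 after the conversion back, whereas a plain bitcast of the float bit patterns would operate on IEEE-754 encodings instead of channel values.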
[Mesa-dev] [PATCH] swr: fix support for inverted depth scales
Signed-off-by: Ilia Mirkin
---

This improves bin/arb_clip_control-clip-control results, but still not quite
there yet.

 src/gallium/drivers/swr/swr_state.cpp | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/swr/swr_state.cpp b/src/gallium/drivers/swr/swr_state.cpp
index ede475a..01cadce 100644
--- a/src/gallium/drivers/swr/swr_state.cpp
+++ b/src/gallium/drivers/swr/swr_state.cpp
@@ -38,6 +38,7 @@
 #include "util/u_inlines.h"
 #include "util/u_helpers.h"
 #include "util/u_framebuffer.h"
+#include "util/u_viewport.h"
 
 #include "swr_state.h"
 #include "swr_context.h"
@@ -951,13 +952,8 @@ swr_update_derived(struct pipe_context *pipe,
       vp->width = state->translate[0] + state->scale[0];
       vp->y = state->translate[1] - fabs(state->scale[1]);
       vp->height = state->translate[1] + fabs(state->scale[1]);
-      if (rasterizer->clip_halfz == 0) {
-         vp->minZ = state->translate[2] - state->scale[2];
-         vp->maxZ = state->translate[2] + state->scale[2];
-      } else {
-         vp->minZ = state->translate[2];
-         vp->maxZ = state->translate[2] + state->scale[2];
-      }
+      util_viewport_zmin_zmax(state, rasterizer->clip_halfz,
+                              &vp->minZ, &vp->maxZ);
 
       vpm->m00[0] = state->scale[0];
       vpm->m11[0] = state->scale[1];
--
2.7.3
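The removed lines assign minZ and maxZ as if the Z scale were always positive, so an inverted depth range would leave minZ above maxZ. A minimal sketch of the ordering fix follows, assuming that the shared helper sorts the two depth endpoints. The struct viewport_z and the function viewport_zmin_zmax are hypothetical stand-ins, not Mesa's actual types or helper.

```c
/* Hypothetical stand-in for the viewport state; only the Z terms matter. */
struct viewport_z {
    float scale_z;
    float translate_z;
};

static void viewport_zmin_zmax(const struct viewport_z *vp, int clip_halfz,
                               float *zmin, float *zmax)
{
    /* Window-space depth at the near and far clip planes. */
    float a = clip_halfz ? vp->translate_z : vp->translate_z - vp->scale_z;
    float b = vp->translate_z + vp->scale_z;

    /* With a negative scale, a is larger than b, so order them explicitly. */
    *zmin = a < b ? a : b;
    *zmax = a < b ? b : a;
}
```

With scale_z = -0.5 and translate_z = 0.5 (a reversed 1..0 depth range), the sketch still reports zmin = 0 and zmax = 1.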
[Mesa-dev] [PATCH] anv: Make anv_finishme only warn once per call-site
When you fire up Dota2 on Haswell you get spammed with thousands of
"Implement Gen7 HZ ops" finishme's. The point of a finishme is to serve as
a reminder that there is something left to implement. Printing it once
should be sufficient.

Signed-off-by: Jason Ekstrand
---
 src/intel/vulkan/anv_private.h | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 8f5a95b..c71a884 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -194,8 +194,13 @@ void anv_loge_v(const char *format, va_list va);
 /**
  * Print a FINISHME message, including its source location.
  */
-#define anv_finishme(format, ...) \
-   __anv_finishme(__FILE__, __LINE__, format, ##__VA_ARGS__);
+#define anv_finishme(format, ...) ({ \
+   static bool reported = false; \
+   if (!reported) { \
+      __anv_finishme(__FILE__, __LINE__, format, ##__VA_ARGS__); \
+      reported = true; \
+   } \
+})
 
 /* A non-fatal assert. Useful for debugging. */
 #ifdef DEBUG
--
2.5.0.400.gff86faf
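The once-per-call-site behaviour comes from the static flag inside the macro body: every expansion gets its own copy. A self-contained sketch of the same pattern is below. The names report_finishme, finishme_once, finishme_calls and hot_path are hypothetical, and the sketch uses do/while(0) instead of the GNU statement expression from the patch so that it builds as plain C99.

```c
#include <stdbool.h>
#include <stdio.h>

static int finishme_calls;  /* counts messages that were actually printed */

/* Hypothetical stand-in for __anv_finishme(). */
static void report_finishme(const char *file, int line, const char *msg)
{
    finishme_calls++;
    fprintf(stderr, "%s:%d: FINISHME: %s\n", file, line, msg);
}

/* Each expansion declares its own static flag, so every call site reports
 * at most once, no matter how often it is executed. */
#define finishme_once(msg)                              \
    do {                                                \
        static bool reported = false;                   \
        if (!reported) {                                \
            report_finishme(__FILE__, __LINE__, msg);   \
            reported = true;                            \
        }                                               \
    } while (0)

/* A frequently executed function with one finishme call site. */
static void hot_path(void)
{
    finishme_once("Implement Gen7 HZ ops");
}
```

Calling hot_path() a thousand times prints one message, while a second call site elsewhere still gets its own first report. The flag is a plain bool, so two threads racing through the same call site could both print; for a debugging reminder that is usually acceptable.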
[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences
https://bugs.freedesktop.org/show_bug.cgi?id=98606

Michel Dänzer changed:

              What|Removed    |Added
Attachment #127855|text/x-log |text/plain
         mime type|           |

--
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences
https://bugs.freedesktop.org/show_bug.cgi?id=98606

--- Comment #9 from charlie ---
Created attachment 127860
  --> https://bugs.freedesktop.org/attachment.cgi?id=127860&action=edit
Variables used to configure mesa.
[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences
https://bugs.freedesktop.org/show_bug.cgi?id=98606

--- Comment #8 from charlie ---
Created attachment 127859
  --> https://bugs.freedesktop.org/attachment.cgi?id=127859&action=edit
Default variables that get added to all components being built.
[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences
https://bugs.freedesktop.org/show_bug.cgi?id=98606

--- Comment #7 from charlie ---
Created attachment 127858
  --> https://bugs.freedesktop.org/attachment.cgi?id=127858&action=edit
environment variable used when compiling libva
[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences
https://bugs.freedesktop.org/show_bug.cgi?id=98606

--- Comment #6 from charlie ---
Created attachment 127857
  --> https://bugs.freedesktop.org/attachment.cgi?id=127857&action=edit
libva.cfg
[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences
https://bugs.freedesktop.org/show_bug.cgi?id=98606

--- Comment #5 from charlie ---
Created attachment 127856
  --> https://bugs.freedesktop.org/attachment.cgi?id=127856&action=edit
mesa build.log

Generated with "make V=1 -j${threads} 2>&1 | tee build.log"
[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences
https://bugs.freedesktop.org/show_bug.cgi?id=98606

--- Comment #4 from charlie ---
Created attachment 127855
  --> https://bugs.freedesktop.org/attachment.cgi?id=127855&action=edit
mesa config.log
[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences
https://bugs.freedesktop.org/show_bug.cgi?id=98606

--- Comment #3 from charlie ---
The last "gcc/llvm/etc. combination" with which mesa-va worked dates from around the beginning of 2016.

"Atm if you build with --enable and then reconfigure/rebuild with --disable things will break similar to your log."

I can try deleting all of x and llvm while leaving my configure options unchanged, although I think I have done that before without any change in the mesa compile error. I'll try it again.
Re: [Mesa-dev] [PATCH 06/25] anv: Rework the way render target surfaces are allocated
On Tue, Nov 8, 2016 at 5:16 PM, Nanley Chery wrote: > On Tue, Nov 08, 2016 at 05:02:29PM -0800, Jason Ekstrand wrote: > > On Tue, Nov 8, 2016 at 5:00 PM, Nanley Chery > wrote: > > > > > On Tue, Nov 08, 2016 at 04:24:48PM -0800, Jason Ekstrand wrote: > > > > On Tue, Nov 8, 2016 at 3:13 PM, Nanley Chery > > > wrote: > > > > > > > > > On Sat, Oct 22, 2016 at 10:50:37AM -0700, Jason Ekstrand wrote: > > > > > > This commit moves the allocation and filling out of surface state > > > from > > > > > > CreateImageView time to BeginRenderPass time. Instead of > allocating > > > the > > > > > > render target surface state as part of the image view, we > allocate > > > it in > > > > > > the command buffer state at the same time that we set up > clears. For > > > > > > secondary command buffers, we allocate memory for the surface > states > > > in > > > > > > BeginCommandBuffer but don't fill them out; instead, we use our > new > > > > > > SOL-based memcpy function to copy the surface states from the > primary > > > > > > command buffer. This allows us to handle secondary command > buffers > > > > > without > > > > > > the user specifying the framebuffer ahead-of-time. 
> > > > > > --- > > > > > > src/intel/vulkan/anv_cmd_buffer.c | 56 -- > > > > > > src/intel/vulkan/anv_image.c | 22 > > > > > > src/intel/vulkan/anv_private.h | 24 - > > > > > > src/intel/vulkan/genX_cmd_buffer.c | 204 > > > +- > > > > > --- > > > > > > 4 files changed, 180 insertions(+), 126 deletions(-) > > > > > > > > > > > > diff --git a/src/intel/vulkan/anv_cmd_buffer.c > > > > > b/src/intel/vulkan/anv_cmd_buffer.c > > > > > > index a652f9a..372030c 100644 > > > > > > --- a/src/intel/vulkan/anv_cmd_buffer.c > > > > > > +++ b/src/intel/vulkan/anv_cmd_buffer.c > > > > > > @@ -144,62 +144,6 @@ anv_cmd_state_reset(struct anv_cmd_buffer > > > > > *cmd_buffer) > > > > > > state->gen7.index_buffer = NULL; > > > > > > } > > > > > > > > > > > > -/** > > > > > > - * Setup anv_cmd_state::attachments for vkCmdBeginRenderPass. > > > > > > - */ > > > > > > -void > > > > > > -anv_cmd_state_setup_attachments(struct anv_cmd_buffer > *cmd_buffer, > > > > > > -const VkRenderPassBeginInfo > *info) > > > > > > -{ > > > > > > - struct anv_cmd_state *state = &cmd_buffer->state; > > > > > > - ANV_FROM_HANDLE(anv_render_pass, pass, info->renderPass); > > > > > > - > > > > > > - vk_free(&cmd_buffer->pool->alloc, state->attachments); > > > > > > - > > > > > > - if (pass->attachment_count == 0) { > > > > > > - state->attachments = NULL; > > > > > > - return; > > > > > > - } > > > > > > - > > > > > > - state->attachments = vk_alloc(&cmd_buffer->pool->alloc, > > > > > > - pass->attachment_count * > > > > > > - > > > sizeof(state->attachments[0]), > > > > > > - 8, VK_SYSTEM_ALLOCATION_SCOPE_ > > > > > OBJECT); > > > > > > - if (state->attachments == NULL) { > > > > > > - /* FIXME: Propagate VK_ERROR_OUT_OF_HOST_MEMORY to > > > > > vkEndCommandBuffer */ > > > > > > - abort(); > > > > > > - } > > > > > > - > > > > > > - for (uint32_t i = 0; i < pass->attachment_count; ++i) { > > > > > > - struct anv_render_pass_attachment *att = > > > &pass->attachments[i]; > > > > > > - VkImageAspectFlags 
att_aspects = > > > vk_format_aspects(att->format); > > > > > > - VkImageAspectFlags clear_aspects = 0; > > > > > > - > > > > > > - if (att_aspects == VK_IMAGE_ASPECT_COLOR_BIT) { > > > > > > - /* color attachment */ > > > > > > - if (att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) { > > > > > > -clear_aspects |= VK_IMAGE_ASPECT_COLOR_BIT; > > > > > > - } > > > > > > - } else { > > > > > > - /* depthstencil attachment */ > > > > > > - if ((att_aspects & VK_IMAGE_ASPECT_DEPTH_BIT) && > > > > > > - att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) { > > > > > > -clear_aspects |= VK_IMAGE_ASPECT_DEPTH_BIT; > > > > > > - } > > > > > > - if ((att_aspects & VK_IMAGE_ASPECT_STENCIL_BIT) && > > > > > > - att->stencil_load_op == > VK_ATTACHMENT_LOAD_OP_CLEAR) { > > > > > > -clear_aspects |= VK_IMAGE_ASPECT_STENCIL_BIT; > > > > > > - } > > > > > > - } > > > > > > - > > > > > > - state->attachments[i].pending_clear_aspects = > clear_aspects; > > > > > > - if (clear_aspects) { > > > > > > - assert(info->clearValueCount > i); > > > > > > - state->attachments[i].clear_value = > info->pClearValues[i]; > > > > > > - } > > > > > > - } > > > > > > -} > > > > > > - > > > > > > VkResult > > > > > > anv_cmd_buffer_ensure_push_constants_size(struct anv_cmd_buffer > > > > > *cmd_buffer, > > > > > >gl_shader_stage stage, > > > > > uint32_t size) > > > > > > diff --git a/src/intel/vulkan
[Mesa-dev] [PATCH] radv: fix GetFenceStatus for signaled fences
From: Dave Airlie

if a fence is created pre-signaled we should return that
in GetFenceStatus even if it hasn't been submitted.

Signed-off-by: Dave Airlie
---
 src/amd/vulkan/radv_device.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index fdb6db9..214af5f 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -1202,6 +1202,8 @@ VkResult radv_GetFenceStatus(VkDevice _device, VkFence _fence)
 	RADV_FROM_HANDLE(radv_device, device, _device);
 	RADV_FROM_HANDLE(radv_fence, fence, _fence);
 
+	if (fence->signalled)
+		return VK_SUCCESS;
 	if (!fence->submitted)
 		return VK_NOT_READY;
 
--
2.7.4
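The order of the two checks is the whole fix: a fence created in the signaled state has never been submitted, so testing submitted first would wrongly report it as not ready. A minimal sketch with mock types follows. The enum fence_status, struct fence and get_fence_status are hypothetical stand-ins for VkResult, radv_fence and radv_GetFenceStatus, and the final branch only marks where a real driver would ask the kernel.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the VkResult values and the driver's fence. */
enum fence_status { FENCE_SUCCESS, FENCE_NOT_READY };

struct fence {
    bool signalled;  /* created pre-signaled, or already known complete */
    bool submitted;  /* attached to a queue submission */
};

static enum fence_status get_fence_status(const struct fence *f)
{
    /* Check the signaled flag first: a pre-signaled fence was never
     * submitted, but must still report success. */
    if (f->signalled)
        return FENCE_SUCCESS;
    if (!f->submitted)
        return FENCE_NOT_READY;

    /* A real driver would query the kernel here. The sketch has no GPU,
     * so a submitted, unsignaled fence is reported as not ready. */
    return FENCE_NOT_READY;
}
```

This matches the Vulkan usage pattern where an application creates a fence with VK_FENCE_CREATE_SIGNALED_BIT and polls it before its first submission.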
Re: [Mesa-dev] [PATCH 06/25] anv: Rework the way render target surfaces are allocated
On Tue, Nov 08, 2016 at 05:02:29PM -0800, Jason Ekstrand wrote: > On Tue, Nov 8, 2016 at 5:00 PM, Nanley Chery wrote: > > > On Tue, Nov 08, 2016 at 04:24:48PM -0800, Jason Ekstrand wrote: > > > On Tue, Nov 8, 2016 at 3:13 PM, Nanley Chery > > wrote: > > > > > > > On Sat, Oct 22, 2016 at 10:50:37AM -0700, Jason Ekstrand wrote: > > > > > This commit moves the allocation and filling out of surface state > > from > > > > > CreateImageView time to BeginRenderPass time. Instead of allocating > > the > > > > > render target surface state as part of the image view, we allocate > > it in > > > > > the command buffer state at the same time that we set up clears. For > > > > > secondary command buffers, we allocate memory for the surface states > > in > > > > > BeginCommandBuffer but don't fill them out; instead, we use our new > > > > > SOL-based memcpy function to copy the surface states from the primary > > > > > command buffer. This allows us to handle secondary command buffers > > > > without > > > > > the user specifying the framebuffer ahead-of-time. > > > > > --- > > > > > src/intel/vulkan/anv_cmd_buffer.c | 56 -- > > > > > src/intel/vulkan/anv_image.c | 22 > > > > > src/intel/vulkan/anv_private.h | 24 - > > > > > src/intel/vulkan/genX_cmd_buffer.c | 204 > > +- > > > > --- > > > > > 4 files changed, 180 insertions(+), 126 deletions(-) > > > > > > > > > > diff --git a/src/intel/vulkan/anv_cmd_buffer.c > > > > b/src/intel/vulkan/anv_cmd_buffer.c > > > > > index a652f9a..372030c 100644 > > > > > --- a/src/intel/vulkan/anv_cmd_buffer.c > > > > > +++ b/src/intel/vulkan/anv_cmd_buffer.c > > > > > @@ -144,62 +144,6 @@ anv_cmd_state_reset(struct anv_cmd_buffer > > > > *cmd_buffer) > > > > > state->gen7.index_buffer = NULL; > > > > > } > > > > > > > > > > -/** > > > > > - * Setup anv_cmd_state::attachments for vkCmdBeginRenderPass. 
> > > > > - */ > > > > > -void > > > > > -anv_cmd_state_setup_attachments(struct anv_cmd_buffer *cmd_buffer, > > > > > -const VkRenderPassBeginInfo *info) > > > > > -{ > > > > > - struct anv_cmd_state *state = &cmd_buffer->state; > > > > > - ANV_FROM_HANDLE(anv_render_pass, pass, info->renderPass); > > > > > - > > > > > - vk_free(&cmd_buffer->pool->alloc, state->attachments); > > > > > - > > > > > - if (pass->attachment_count == 0) { > > > > > - state->attachments = NULL; > > > > > - return; > > > > > - } > > > > > - > > > > > - state->attachments = vk_alloc(&cmd_buffer->pool->alloc, > > > > > - pass->attachment_count * > > > > > - > > sizeof(state->attachments[0]), > > > > > - 8, VK_SYSTEM_ALLOCATION_SCOPE_ > > > > OBJECT); > > > > > - if (state->attachments == NULL) { > > > > > - /* FIXME: Propagate VK_ERROR_OUT_OF_HOST_MEMORY to > > > > vkEndCommandBuffer */ > > > > > - abort(); > > > > > - } > > > > > - > > > > > - for (uint32_t i = 0; i < pass->attachment_count; ++i) { > > > > > - struct anv_render_pass_attachment *att = > > &pass->attachments[i]; > > > > > - VkImageAspectFlags att_aspects = > > vk_format_aspects(att->format); > > > > > - VkImageAspectFlags clear_aspects = 0; > > > > > - > > > > > - if (att_aspects == VK_IMAGE_ASPECT_COLOR_BIT) { > > > > > - /* color attachment */ > > > > > - if (att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) { > > > > > -clear_aspects |= VK_IMAGE_ASPECT_COLOR_BIT; > > > > > - } > > > > > - } else { > > > > > - /* depthstencil attachment */ > > > > > - if ((att_aspects & VK_IMAGE_ASPECT_DEPTH_BIT) && > > > > > - att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) { > > > > > -clear_aspects |= VK_IMAGE_ASPECT_DEPTH_BIT; > > > > > - } > > > > > - if ((att_aspects & VK_IMAGE_ASPECT_STENCIL_BIT) && > > > > > - att->stencil_load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) { > > > > > -clear_aspects |= VK_IMAGE_ASPECT_STENCIL_BIT; > > > > > - } > > > > > - } > > > > > - > > > > > - state->attachments[i].pending_clear_aspects = clear_aspects; > > > > 
> - if (clear_aspects) { > > > > > - assert(info->clearValueCount > i); > > > > > - state->attachments[i].clear_value = info->pClearValues[i]; > > > > > - } > > > > > - } > > > > > -} > > > > > - > > > > > VkResult > > > > > anv_cmd_buffer_ensure_push_constants_size(struct anv_cmd_buffer > > > > *cmd_buffer, > > > > >gl_shader_stage stage, > > > > uint32_t size) > > > > > diff --git a/src/intel/vulkan/anv_image.c > > b/src/intel/vulkan/anv_image.c > > > > > index b7c2e99..b014985 100644 > > > > > --- a/src/intel/vulkan/anv_image.c > > > > > +++ b/src/intel/vulkan/anv_image.c > > > > > @@ -504,23 +504,6 @@ anv_CreateImageView(VkDevice _device, > > > > >iview->sampler_surface_state.al
Re: [Mesa-dev] [PATCH 06/25] anv: Rework the way render target surfaces are allocated
On Tue, Nov 8, 2016 at 5:00 PM, Nanley Chery wrote: > On Tue, Nov 08, 2016 at 04:24:48PM -0800, Jason Ekstrand wrote: > > On Tue, Nov 8, 2016 at 3:13 PM, Nanley Chery > wrote: > > > > > On Sat, Oct 22, 2016 at 10:50:37AM -0700, Jason Ekstrand wrote: > > > > This commit moves the allocation and filling out of surface state > from > > > > CreateImageView time to BeginRenderPass time. Instead of allocating > the > > > > render target surface state as part of the image view, we allocate > it in > > > > the command buffer state at the same time that we set up clears. For > > > > secondary command buffers, we allocate memory for the surface states > in > > > > BeginCommandBuffer but don't fill them out; instead, we use our new > > > > SOL-based memcpy function to copy the surface states from the primary > > > > command buffer. This allows us to handle secondary command buffers > > > without > > > > the user specifying the framebuffer ahead-of-time. > > > > --- > > > > src/intel/vulkan/anv_cmd_buffer.c | 56 -- > > > > src/intel/vulkan/anv_image.c | 22 > > > > src/intel/vulkan/anv_private.h | 24 - > > > > src/intel/vulkan/genX_cmd_buffer.c | 204 > +- > > > --- > > > > 4 files changed, 180 insertions(+), 126 deletions(-) > > > > > > > > diff --git a/src/intel/vulkan/anv_cmd_buffer.c > > > b/src/intel/vulkan/anv_cmd_buffer.c > > > > index a652f9a..372030c 100644 > > > > --- a/src/intel/vulkan/anv_cmd_buffer.c > > > > +++ b/src/intel/vulkan/anv_cmd_buffer.c > > > > @@ -144,62 +144,6 @@ anv_cmd_state_reset(struct anv_cmd_buffer > > > *cmd_buffer) > > > > state->gen7.index_buffer = NULL; > > > > } > > > > > > > > -/** > > > > - * Setup anv_cmd_state::attachments for vkCmdBeginRenderPass. 
> > > > - */ > > > > -void > > > > -anv_cmd_state_setup_attachments(struct anv_cmd_buffer *cmd_buffer, > > > > -const VkRenderPassBeginInfo *info) > > > > -{ > > > > - struct anv_cmd_state *state = &cmd_buffer->state; > > > > - ANV_FROM_HANDLE(anv_render_pass, pass, info->renderPass); > > > > - > > > > - vk_free(&cmd_buffer->pool->alloc, state->attachments); > > > > - > > > > - if (pass->attachment_count == 0) { > > > > - state->attachments = NULL; > > > > - return; > > > > - } > > > > - > > > > - state->attachments = vk_alloc(&cmd_buffer->pool->alloc, > > > > - pass->attachment_count * > > > > - > sizeof(state->attachments[0]), > > > > - 8, VK_SYSTEM_ALLOCATION_SCOPE_ > > > OBJECT); > > > > - if (state->attachments == NULL) { > > > > - /* FIXME: Propagate VK_ERROR_OUT_OF_HOST_MEMORY to > > > vkEndCommandBuffer */ > > > > - abort(); > > > > - } > > > > - > > > > - for (uint32_t i = 0; i < pass->attachment_count; ++i) { > > > > - struct anv_render_pass_attachment *att = > &pass->attachments[i]; > > > > - VkImageAspectFlags att_aspects = > vk_format_aspects(att->format); > > > > - VkImageAspectFlags clear_aspects = 0; > > > > - > > > > - if (att_aspects == VK_IMAGE_ASPECT_COLOR_BIT) { > > > > - /* color attachment */ > > > > - if (att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) { > > > > -clear_aspects |= VK_IMAGE_ASPECT_COLOR_BIT; > > > > - } > > > > - } else { > > > > - /* depthstencil attachment */ > > > > - if ((att_aspects & VK_IMAGE_ASPECT_DEPTH_BIT) && > > > > - att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) { > > > > -clear_aspects |= VK_IMAGE_ASPECT_DEPTH_BIT; > > > > - } > > > > - if ((att_aspects & VK_IMAGE_ASPECT_STENCIL_BIT) && > > > > - att->stencil_load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) { > > > > -clear_aspects |= VK_IMAGE_ASPECT_STENCIL_BIT; > > > > - } > > > > - } > > > > - > > > > - state->attachments[i].pending_clear_aspects = clear_aspects; > > > > - if (clear_aspects) { > > > > - assert(info->clearValueCount > i); > > > > - 
state->attachments[i].clear_value = info->pClearValues[i]; > > > > - } > > > > - } > > > > -} > > > > - > > > > VkResult > > > > anv_cmd_buffer_ensure_push_constants_size(struct anv_cmd_buffer > > > *cmd_buffer, > > > >gl_shader_stage stage, > > > uint32_t size) > > > > diff --git a/src/intel/vulkan/anv_image.c > b/src/intel/vulkan/anv_image.c > > > > index b7c2e99..b014985 100644 > > > > --- a/src/intel/vulkan/anv_image.c > > > > +++ b/src/intel/vulkan/anv_image.c > > > > @@ -504,23 +504,6 @@ anv_CreateImageView(VkDevice _device, > > > >iview->sampler_surface_state.alloc_size = 0; > > > > } > > > > > > > > - if (image->usage & VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT) { > > > > - iview->color_rt_surface_state = alloc_surface_state(device); > > > > - > > > > - struct isl_view view = iview->isl; > > > > - view.usage |= ISL_SURF_USAGE_REND
[Mesa-dev] [PATCH v2 1/3] vulkan/wsi: Add a thread-safe queue implementation
From: Kevin Strasser In order to support FIFO mode without blocking the application on calls to vkQueuePresentKHR it is necessary to enqueue the request and defer calling the server until the next vblank period. The xcb present api doesn't offer a way to register a callback, so we will have to spawn a worker thread that will wait for a request to be added to the queue, call to the server, and then make the image available for reuse. This commit introduces the queue data structure needed to implement this. Signed-off-by: Jason Ekstrand Reviewed-by: Eric Engestrom --- src/vulkan/wsi/Makefile.sources | 3 +- src/vulkan/wsi/wsi_common_queue.h | 154 ++ 2 files changed, 156 insertions(+), 1 deletion(-) create mode 100644 src/vulkan/wsi/wsi_common_queue.h diff --git a/src/vulkan/wsi/Makefile.sources b/src/vulkan/wsi/Makefile.sources index 3139e6d..50660f9 100644 --- a/src/vulkan/wsi/Makefile.sources +++ b/src/vulkan/wsi/Makefile.sources @@ -1,6 +1,7 @@ VULKAN_WSI_FILES := \ - wsi_common.h + wsi_common.h \ + wsi_common_queue.h VULKAN_WSI_WAYLAND_FILES := \ wsi_common_wayland.c \ diff --git a/src/vulkan/wsi/wsi_common_queue.h b/src/vulkan/wsi/wsi_common_queue.h new file mode 100644 index 000..0e72c8d --- /dev/null +++ b/src/vulkan/wsi/wsi_common_queue.h @@ -0,0 +1,154 @@ +/* + * Copyright © 2016 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#ifndef VULKAN_WSI_COMMON_QUEUE_H +#define VULKAN_WSI_COMMON_QUEUE_H + +#include +#include +#include "util/u_vector.h" + +struct wsi_queue { + struct u_vector vector; + pthread_mutex_t mutex; + pthread_cond_t cond; +}; + +static inline int +wsi_queue_init(struct wsi_queue *queue, int length) +{ + int ret; + + uint32_t length_pow2 = 4; + while (length_pow2 < length) + length_pow2 *= 2; + + ret = u_vector_init(&queue->vector, sizeof(uint32_t), + sizeof(uint32_t) * length_pow2); + if (!ret) + return ENOMEM; + + pthread_condattr_t condattr; + ret = pthread_condattr_init(&condattr); + if (ret) + goto fail_vector; + + ret = pthread_condattr_setclock(&condattr, CLOCK_MONOTONIC); + if (ret) + goto fail_condattr; + + ret = pthread_cond_init(&queue->cond, &condattr); + if (ret) + goto fail_condattr; + + ret = pthread_mutex_init(&queue->mutex, NULL); + if (ret) + goto fail_cond; + + return 0; + +fail_cond: + pthread_cond_destroy(&queue->cond); +fail_condattr: + pthread_condattr_destroy(&condattr); +fail_vector: + u_vector_finish(&queue->vector); + + return ret; +} + +static inline void +wsi_queue_destroy(struct wsi_queue *queue) +{ + u_vector_finish(&queue->vector); + pthread_mutex_destroy(&queue->mutex); + pthread_cond_destroy(&queue->cond); +} + +static inline void +wsi_queue_push(struct wsi_queue *queue, uint32_t index) +{ + uint32_t *elem; + + pthread_mutex_lock(&queue->mutex); + + if (u_vector_length(&queue->vector) == 0) + pthread_cond_signal(&queue->cond); + + elem = 
u_vector_add(&queue->vector); + *elem = index; + + pthread_mutex_unlock(&queue->mutex); +} + +#define NSEC_PER_SEC 1000000000ull +#define INT_TYPE_MAX(type) ((1ull << (sizeof(type) * 8 - 1)) - 1) + +static inline VkResult +wsi_queue_pull(struct wsi_queue *queue, uint32_t *index, uint64_t timeout) +{ + VkResult result; + int32_t ret; + + pthread_mutex_lock(&queue->mutex); + + struct timespec now; + clock_gettime(CLOCK_MONOTONIC, &now); + + uint32_t abs_nsec = now.tv_nsec + timeout % NSEC_PER_SEC; + uint64_t abs_sec = now.tv_sec + (abs_nsec / NSEC_PER_SEC) + + (timeout / NSEC_PER_SEC); + abs_nsec %= NSEC_PER_SEC; + + /* Avoid roll-over in tv_sec on 32-bit systems if the user provided timeout +* is UINT64_MAX +*/ + struct timespec abstime; + abstime.tv_nsec = abs_nsec; + abstime.tv_
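For readers following the timeout handling in wsi_queue_pull above: the patch text is cut off in this archive, so here is a minimal, self-contained sketch of the idea only. It is not the code from the patch. The helper name timespec_add_timeout_saturating and its max_sec parameter are made up for illustration; the assumption is that the caller passes the largest value its time_t can hold (for example INT32_MAX on a 32-bit time_t), so that a timeout of UINT64_MAX clamps instead of rolling over.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ull

/* Hypothetical helper (not from the patch): turn "now + relative timeout in
 * nanoseconds" into an absolute timespec.  tv_sec saturates at max_sec
 * instead of wrapping when the timeout is huge (e.g. UINT64_MAX). */
static inline struct timespec
timespec_add_timeout_saturating(struct timespec now, uint64_t timeout_ns,
                                uint64_t max_sec)
{
   /* CLOCK_MONOTONIC readings are non-negative with tv_nsec < 1 second. */
   assert(now.tv_sec >= 0 && now.tv_nsec >= 0 &&
          (uint64_t)now.tv_nsec < NSEC_PER_SEC);

   /* Sum the sub-second parts first; the sum is below 2 seconds, so at most
    * one extra second is carried into add_sec. */
   uint64_t nsec = (uint64_t)now.tv_nsec + timeout_ns % NSEC_PER_SEC;
   uint64_t add_sec = timeout_ns / NSEC_PER_SEC + nsec / NSEC_PER_SEC;
   nsec %= NSEC_PER_SEC;

   /* Clamp before adding so the comparison itself cannot overflow. */
   uint64_t now_sec = (uint64_t)now.tv_sec;
   uint64_t sec = (now_sec > max_sec || add_sec > max_sec - now_sec)
                     ? max_sec
                     : now_sec + add_sec;

   struct timespec abstime;
   abstime.tv_sec = (time_t)sec;
   abstime.tv_nsec = (long)nsec;
   return abstime;
}
```

The result is the kind of absolute deadline that pthread_cond_timedwait expects when the condition variable was set up with CLOCK_MONOTONIC, as wsi_queue_init does. Taking max_sec as a parameter is a choice made for this sketch so it can be exercised on any platform; the patch itself defines an INT_TYPE_MAX macro for this purpose.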
[Mesa-dev] [PATCH v2 3/3] vulkan/wsi/x11: Implement FIFO mode.
This implements VK_PRESENT_MODE_FIFO_KHR for X11. Unfortunately, due to the way the present extension works, we have to manage the queue of presented images in a separate thread. Signed-off-by: Jason Ekstrand Reviewed-by: Eric Engestrom --- src/vulkan/wsi/wsi_common_x11.c | 174 +--- 1 file changed, 164 insertions(+), 10 deletions(-) diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c index 4bc5ef3..208d8d4 100644 --- a/src/vulkan/wsi/wsi_common_x11.c +++ b/src/vulkan/wsi/wsi_common_x11.c @@ -39,6 +39,7 @@ #include "wsi_common.h" #include "wsi_common_x11.h" +#include "wsi_common_queue.h" #define typed_memcpy(dest, src, count) ({ \ static_assert(sizeof(*src) == sizeof(*dest), ""); \ @@ -145,6 +146,7 @@ static const VkSurfaceFormatKHR formats[] = { static const VkPresentModeKHR present_modes[] = { VK_PRESENT_MODE_IMMEDIATE_KHR, VK_PRESENT_MODE_MAILBOX_KHR, + VK_PRESENT_MODE_FIFO_KHR, }; static xcb_screen_t * @@ -490,8 +492,15 @@ struct x11_swapchain { xcb_present_event_t event_id; xcb_special_event_t *special_event; uint64_t send_sbc; + uint64_t last_present_msc; uint32_t stamp; + bool threaded; + VkResult status; + struct wsi_queue present_queue; + struct wsi_queue acquire_queue; + pthread_t queue_manager; + struct x11_image images[0]; }; @@ -536,6 +545,8 @@ x11_handle_dri3_present_event(struct x11_swapchain *chain, for (unsigned i = 0; i < chain->image_count; i++) { if (chain->images[i].pixmap == idle->pixmap) { chain->images[i].busy = false; +if (chain->threaded) + wsi_queue_push(&chain->acquire_queue, i); break; } } @@ -543,7 +554,13 @@ x11_handle_dri3_present_event(struct x11_swapchain *chain, break; } - case XCB_PRESENT_COMPLETE_NOTIFY: + case XCB_PRESENT_EVENT_COMPLETE_NOTIFY: { + xcb_present_complete_notify_event_t *complete = (void *) event; + if (complete->kind == XCB_PRESENT_COMPLETE_KIND_PIXMAP) + chain->last_present_msc = complete->msc; + break; + } + default: break; } @@ -572,12 +589,9 @@ static uint64_t
wsi_get_absolute_timeout(uint64_t timeout) } static VkResult -x11_acquire_next_image(struct wsi_swapchain *anv_chain, - uint64_t timeout, - VkSemaphore semaphore, - uint32_t *image_index) +x11_acquire_next_image_poll_x11(struct x11_swapchain *chain, +uint32_t *image_index, uint64_t timeout) { - struct x11_swapchain *chain = (struct x11_swapchain *)anv_chain; xcb_generic_event_t *event; struct pollfd pfds; uint64_t atimeout; @@ -635,17 +649,38 @@ x11_acquire_next_image(struct wsi_swapchain *anv_chain, } static VkResult -x11_queue_present(struct wsi_swapchain *anv_chain, - uint32_t image_index) +x11_acquire_next_image_from_queue(struct x11_swapchain *chain, + uint32_t *image_index_out, uint64_t timeout) +{ + assert(chain->threaded); + + uint32_t image_index; + VkResult result = wsi_queue_pull(&chain->acquire_queue, +&image_index, timeout); + if (result != VK_SUCCESS) { + return result; + } else if (chain->status != VK_SUCCESS) { + return chain->status; + } + + assert(image_index < chain->image_count); + xshmfence_await(chain->images[image_index].shm_fence); + + *image_index_out = image_index; + + return VK_SUCCESS; +} + +static VkResult +x11_present_to_x11(struct x11_swapchain *chain, uint32_t image_index, + uint32_t target_msc) { - struct x11_swapchain *chain = (struct x11_swapchain *)anv_chain; struct x11_image *image = &chain->images[image_index]; assert(image_index < chain->image_count); uint32_t options = XCB_PRESENT_OPTION_NONE; - int64_t target_msc = 0; int64_t divisor = 0; int64_t remainder = 0; @@ -680,6 +715,82 @@ x11_queue_present(struct wsi_swapchain *anv_chain, } static VkResult +x11_acquire_next_image(struct wsi_swapchain *anv_chain, + uint64_t timeout, + VkSemaphore semaphore, + uint32_t *image_index) +{ + struct x11_swapchain *chain = (struct x11_swapchain *)anv_chain; + + if (chain->threaded) { + return x11_acquire_next_image_from_queue(chain, image_index, timeout); + } else { + return x11_acquire_next_image_poll_x11(chain, image_index, timeout); + } 
+} + +static VkResult +x11_queue_present(struc
[Mesa-dev] [PATCH v2 2/3] vulkan/wsi: Report the correct min/maxImageCount
From the Vulkan spec 1.0.32 section 29.6 docs for vkAcquireNextImageKHR: "Let n be the total number of images in the swapchain, m be the value of VkSurfaceCapabilitiesKHR::minImageCount, and a be the number of presentable images that the application has currently acquired (i.e. images acquired with vkAcquireNextImageKHR, but not yet presented with vkQueuePresentKHR). vkAcquireNextImageKHR can always succeed if a ≤ n - m at the time vkAcquireNextImageKHR is called. vkAcquireNextImageKHR should not be called if a > n - m with a timeout of UINT64_MAX; in such a case, vkAcquireNextImageKHR may block indefinitely." With minImageCount == 2 (as it was previously), the client is allowed to acquire all but one image without blocking. If we really need 4 images for mailbox mode + pageflipping, then we need to request a minimum of 4 images up-front. This is a bit unfortunate because it means we will always consume 4 images. In the future, we may be able to optimize this a bit by waiting until the server starts to flip and returning OUT_OF_DATE to get the client to re-allocate with more images or something like that.
Signed-off-by: Jason Ekstrand --- src/vulkan/wsi/wsi_common_wayland.c | 25 ++--- src/vulkan/wsi/wsi_common_x11.c | 21 ++--- 2 files changed, 20 insertions(+), 26 deletions(-) diff --git a/src/vulkan/wsi/wsi_common_wayland.c b/src/vulkan/wsi/wsi_common_wayland.c index c6e138e..41c099f 100644 --- a/src/vulkan/wsi/wsi_common_wayland.c +++ b/src/vulkan/wsi/wsi_common_wayland.c @@ -41,8 +41,6 @@ memcpy((dest), (src), (count) * sizeof(*(src))); \ }) -#define MIN_NUM_IMAGES 2 - struct wsi_wayland; struct wsi_wl_display { @@ -366,8 +364,16 @@ static VkResult wsi_wl_surface_get_capabilities(VkIcdSurfaceBase *surface, VkSurfaceCapabilitiesKHR* caps) { - caps->minImageCount = MIN_NUM_IMAGES; - caps->maxImageCount = 4; + /* For true mailbox mode, we need at least 4 images: +* 1) One to scan out from +* 2) One to have queued for scan-out +* 3) One to be currently held by the Wayland compositor +* 4) One to render to +*/ + caps->minImageCount = 4; + /* There is no real maximum */ + caps->maxImageCount = 0; + caps->currentExtent = (VkExtent2D) { -1, -1 }; caps->minImageExtent = (VkExtent2D) { 1, 1 }; caps->maxImageExtent = (VkExtent2D) { INT16_MAX, INT16_MAX }; @@ -685,17 +691,6 @@ wsi_wl_surface_create_swapchain(VkIcdSurfaceBase *icd_surface, int num_images = pCreateInfo->minImageCount; - assert(num_images >= MIN_NUM_IMAGES); - - /* For true mailbox mode, we need at least 4 images: -* 1) One to scan out from -* 2) One to have queued for scan-out -* 3) One to be currently held by the Wayland compositor -* 4) One to render to -*/ - if (pCreateInfo->presentMode == VK_PRESENT_MODE_MAILBOX_KHR) - num_images = MAX2(num_images, 4); - size_t size = sizeof(*chain) + num_images * sizeof(chain->images[0]); chain = vk_alloc(pAllocator, size, 8, VK_SYSTEM_ALLOCATION_SCOPE_OBJECT); diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c index 98f0923..4bc5ef3 100644 --- a/src/vulkan/wsi/wsi_common_x11.c +++ b/src/vulkan/wsi/wsi_common_x11.c @@ -373,8 +373,16 @@ 
x11_surface_get_capabilities(VkIcdSurfaceBase *icd_surface, VK_COMPOSITE_ALPHA_OPAQUE_BIT_KHR; } + /* For true mailbox mode, we need at least 4 images: +* 1) One to scan out from +* 2) One to have queued for scan-out +* 3) One to be currently held by the Wayland compositor +* 4) One to render to +*/ caps->minImageCount = 2; - caps->maxImageCount = 4; + /* There is no real maximum */ + caps->maxImageCount = 0; + caps->supportedTransforms = VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR; caps->currentTransform = VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR; caps->maxImageArrayLayers = 1; @@ -791,16 +799,7 @@ x11_surface_create_swapchain(VkIcdSurfaceBase *icd_surface, assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_SWAPCHAIN_CREATE_INFO_KHR); - int num_images = pCreateInfo->minImageCount; - - /* For true mailbox mode, we need at least 4 images: -* 1) One to scan out from -* 2) One to have queued for scan-out -* 3) One to be currently held by the Wayland compositor -* 4) One to render to -*/ - if (pCreateInfo->presentMode == VK_PRESENT_MODE_MAILBOX_KHR) - num_images = MAX2(num_images, 4); + const unsigned num_images = pCreateInfo->minImageCount; size_t size = sizeof(*chain) + num_images * sizeof(chain->images[0]); chain = vk_alloc(pAllocator, size, 8, -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
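To make the spec rule quoted in the commit message of this patch concrete, here is a small sketch of the "a ≤ n - m" check. It is only an illustration of the arithmetic, not code from Mesa; the function names acquire_is_guaranteed and max_acquired_without_blocking are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* n: total number of images in the swapchain
 * m: VkSurfaceCapabilitiesKHR::minImageCount
 * a: images the application currently holds (acquired, not yet presented)
 *
 * Returns true when the spec says vkAcquireNextImageKHR can always succeed,
 * i.e. when a <= n - m.  The m <= n guard avoids unsigned underflow. */
static inline bool
acquire_is_guaranteed(uint32_t n, uint32_t m, uint32_t a)
{
   return m <= n && a <= n - m;
}

/* Largest number of images the application can hold at once while every
 * acquire was still covered by the guarantee above: the last guaranteed
 * call happens with a == n - m and leaves the application with one more. */
static inline uint32_t
max_acquired_without_blocking(uint32_t n, uint32_t m)
{
   return m <= n ? n - m + 1 : 0;
}
```

With n = 4 and m = 2 this gives 3, which is the "all but one image" case described in the commit message. With n = 4 and m = 4 it gives 1, so the implementation always keeps the other three images available for scan-out, queueing, and the compositor.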
Re: [Mesa-dev] [PATCH 06/25] anv: Rework the way render target surfaces are allocated
On Tue, Nov 08, 2016 at 04:24:48PM -0800, Jason Ekstrand wrote: > On Tue, Nov 8, 2016 at 3:13 PM, Nanley Chery wrote: > > > On Sat, Oct 22, 2016 at 10:50:37AM -0700, Jason Ekstrand wrote: > > > This commit moves the allocation and filling out of surface state from > > > CreateImageView time to BeginRenderPass time. Instead of allocating the > > > render target surface state as part of the image view, we allocate it in > > > the command buffer state at the same time that we set up clears. For > > > secondary command buffers, we allocate memory for the surface states in > > > BeginCommandBuffer but don't fill them out; instead, we use our new > > > SOL-based memcpy function to copy the surface states from the primary > > > command buffer. This allows us to handle secondary command buffers > > without > > > the user specifying the framebuffer ahead-of-time. > > > --- > > > src/intel/vulkan/anv_cmd_buffer.c | 56 -- > > > src/intel/vulkan/anv_image.c | 22 > > > src/intel/vulkan/anv_private.h | 24 - > > > src/intel/vulkan/genX_cmd_buffer.c | 204 +- > > --- > > > 4 files changed, 180 insertions(+), 126 deletions(-) > > > > > > diff --git a/src/intel/vulkan/anv_cmd_buffer.c > > b/src/intel/vulkan/anv_cmd_buffer.c > > > index a652f9a..372030c 100644 > > > --- a/src/intel/vulkan/anv_cmd_buffer.c > > > +++ b/src/intel/vulkan/anv_cmd_buffer.c > > > @@ -144,62 +144,6 @@ anv_cmd_state_reset(struct anv_cmd_buffer > > *cmd_buffer) > > > state->gen7.index_buffer = NULL; > > > } > > > > > > -/** > > > - * Setup anv_cmd_state::attachments for vkCmdBeginRenderPass. 
> > > - */ > > > -void > > > -anv_cmd_state_setup_attachments(struct anv_cmd_buffer *cmd_buffer, > > > -const VkRenderPassBeginInfo *info) > > > -{ > > > - struct anv_cmd_state *state = &cmd_buffer->state; > > > - ANV_FROM_HANDLE(anv_render_pass, pass, info->renderPass); > > > - > > > - vk_free(&cmd_buffer->pool->alloc, state->attachments); > > > - > > > - if (pass->attachment_count == 0) { > > > - state->attachments = NULL; > > > - return; > > > - } > > > - > > > - state->attachments = vk_alloc(&cmd_buffer->pool->alloc, > > > - pass->attachment_count * > > > - sizeof(state->attachments[0]), > > > - 8, VK_SYSTEM_ALLOCATION_SCOPE_ > > OBJECT); > > > - if (state->attachments == NULL) { > > > - /* FIXME: Propagate VK_ERROR_OUT_OF_HOST_MEMORY to > > vkEndCommandBuffer */ > > > - abort(); > > > - } > > > - > > > - for (uint32_t i = 0; i < pass->attachment_count; ++i) { > > > - struct anv_render_pass_attachment *att = &pass->attachments[i]; > > > - VkImageAspectFlags att_aspects = vk_format_aspects(att->format); > > > - VkImageAspectFlags clear_aspects = 0; > > > - > > > - if (att_aspects == VK_IMAGE_ASPECT_COLOR_BIT) { > > > - /* color attachment */ > > > - if (att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) { > > > -clear_aspects |= VK_IMAGE_ASPECT_COLOR_BIT; > > > - } > > > - } else { > > > - /* depthstencil attachment */ > > > - if ((att_aspects & VK_IMAGE_ASPECT_DEPTH_BIT) && > > > - att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) { > > > -clear_aspects |= VK_IMAGE_ASPECT_DEPTH_BIT; > > > - } > > > - if ((att_aspects & VK_IMAGE_ASPECT_STENCIL_BIT) && > > > - att->stencil_load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) { > > > -clear_aspects |= VK_IMAGE_ASPECT_STENCIL_BIT; > > > - } > > > - } > > > - > > > - state->attachments[i].pending_clear_aspects = clear_aspects; > > > - if (clear_aspects) { > > > - assert(info->clearValueCount > i); > > > - state->attachments[i].clear_value = info->pClearValues[i]; > > > - } > > > - } > > > -} > > > - > > > VkResult > > > 
anv_cmd_buffer_ensure_push_constants_size(struct anv_cmd_buffer > > *cmd_buffer, > > >gl_shader_stage stage, > > uint32_t size) > > > diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c > > > index b7c2e99..b014985 100644 > > > --- a/src/intel/vulkan/anv_image.c > > > +++ b/src/intel/vulkan/anv_image.c > > > @@ -504,23 +504,6 @@ anv_CreateImageView(VkDevice _device, > > >iview->sampler_surface_state.alloc_size = 0; > > > } > > > > > > - if (image->usage & VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT) { > > > - iview->color_rt_surface_state = alloc_surface_state(device); > > > - > > > - struct isl_view view = iview->isl; > > > - view.usage |= ISL_SURF_USAGE_RENDER_TARGET_BIT; > > > - isl_surf_fill_state(&device->isl_dev, > > > - iview->color_rt_surface_state.map, > > > - .surf = &surface->isl, > > > - .view = &view, > > > -
Re: [Mesa-dev] [PATCH 06/25] anv: Rework the way render target surfaces are allocated
On Tue, Nov 8, 2016 at 3:13 PM, Nanley Chery wrote: > On Sat, Oct 22, 2016 at 10:50:37AM -0700, Jason Ekstrand wrote: > > This commit moves the allocation and filling out of surface state from > > CreateImageView time to BeginRenderPass time. Instead of allocating the > > render target surface state as part of the image view, we allocate it in > > the command buffer state at the same time that we set up clears. For > > secondary command buffers, we allocate memory for the surface states in > > BeginCommandBuffer but don't fill them out; instead, we use our new > > SOL-based memcpy function to copy the surface states from the primary > > command buffer. This allows us to handle secondary command buffers > without > > the user specifying the framebuffer ahead-of-time. > > --- > > src/intel/vulkan/anv_cmd_buffer.c | 56 -- > > src/intel/vulkan/anv_image.c | 22 > > src/intel/vulkan/anv_private.h | 24 - > > src/intel/vulkan/genX_cmd_buffer.c | 204 +- > --- > > 4 files changed, 180 insertions(+), 126 deletions(-) > > > > diff --git a/src/intel/vulkan/anv_cmd_buffer.c > b/src/intel/vulkan/anv_cmd_buffer.c > > index a652f9a..372030c 100644 > > --- a/src/intel/vulkan/anv_cmd_buffer.c > > +++ b/src/intel/vulkan/anv_cmd_buffer.c > > @@ -144,62 +144,6 @@ anv_cmd_state_reset(struct anv_cmd_buffer > *cmd_buffer) > > state->gen7.index_buffer = NULL; > > } > > > > -/** > > - * Setup anv_cmd_state::attachments for vkCmdBeginRenderPass. 
> > - */ > > -void > > -anv_cmd_state_setup_attachments(struct anv_cmd_buffer *cmd_buffer, > > -const VkRenderPassBeginInfo *info) > > -{ > > - struct anv_cmd_state *state = &cmd_buffer->state; > > - ANV_FROM_HANDLE(anv_render_pass, pass, info->renderPass); > > - > > - vk_free(&cmd_buffer->pool->alloc, state->attachments); > > - > > - if (pass->attachment_count == 0) { > > - state->attachments = NULL; > > - return; > > - } > > - > > - state->attachments = vk_alloc(&cmd_buffer->pool->alloc, > > - pass->attachment_count * > > - sizeof(state->attachments[0]), > > - 8, VK_SYSTEM_ALLOCATION_SCOPE_ > OBJECT); > > - if (state->attachments == NULL) { > > - /* FIXME: Propagate VK_ERROR_OUT_OF_HOST_MEMORY to > vkEndCommandBuffer */ > > - abort(); > > - } > > - > > - for (uint32_t i = 0; i < pass->attachment_count; ++i) { > > - struct anv_render_pass_attachment *att = &pass->attachments[i]; > > - VkImageAspectFlags att_aspects = vk_format_aspects(att->format); > > - VkImageAspectFlags clear_aspects = 0; > > - > > - if (att_aspects == VK_IMAGE_ASPECT_COLOR_BIT) { > > - /* color attachment */ > > - if (att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) { > > -clear_aspects |= VK_IMAGE_ASPECT_COLOR_BIT; > > - } > > - } else { > > - /* depthstencil attachment */ > > - if ((att_aspects & VK_IMAGE_ASPECT_DEPTH_BIT) && > > - att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) { > > -clear_aspects |= VK_IMAGE_ASPECT_DEPTH_BIT; > > - } > > - if ((att_aspects & VK_IMAGE_ASPECT_STENCIL_BIT) && > > - att->stencil_load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) { > > -clear_aspects |= VK_IMAGE_ASPECT_STENCIL_BIT; > > - } > > - } > > - > > - state->attachments[i].pending_clear_aspects = clear_aspects; > > - if (clear_aspects) { > > - assert(info->clearValueCount > i); > > - state->attachments[i].clear_value = info->pClearValues[i]; > > - } > > - } > > -} > > - > > VkResult > > anv_cmd_buffer_ensure_push_constants_size(struct anv_cmd_buffer > *cmd_buffer, > >gl_shader_stage stage, > uint32_t size) > > diff 
--git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c > > index b7c2e99..b014985 100644 > > --- a/src/intel/vulkan/anv_image.c > > +++ b/src/intel/vulkan/anv_image.c > > @@ -504,23 +504,6 @@ anv_CreateImageView(VkDevice _device, > >iview->sampler_surface_state.alloc_size = 0; > > } > > > > - if (image->usage & VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT) { > > - iview->color_rt_surface_state = alloc_surface_state(device); > > - > > - struct isl_view view = iview->isl; > > - view.usage |= ISL_SURF_USAGE_RENDER_TARGET_BIT; > > - isl_surf_fill_state(&device->isl_dev, > > - iview->color_rt_surface_state.map, > > - .surf = &surface->isl, > > - .view = &view, > > - .mocs = device->default_mocs); > > - > > - if (!device->info.has_llc) > > - anv_state_clflush(iview->color_rt_surface_state); > > - } else { > > - iview->color_rt_surface_state.alloc_size = 0; > > - } > > - > > /* NOTE: This one needs to go la
[Mesa-dev] [Bug 98629] OpenGL applications warns "MESA-LOADER: failed to retrieve device information"
https://bugs.freedesktop.org/show_bug.cgi?id=98629 --- Comment #3 from Mingcong Bai --- (In reply to Emil Velikov from comment #1) > [Moving to 'core' since it's not really nouveau specific] > > Does this happen with glxinfo/glxgears as well ? If so can you attach the > output of $strace glxinfo > > If glxinfo works fine, while $program does not, attach the output of > $DL_DEBUG=libs $program > > Thanks glxinfo and glxgears... and basically everything provided by mesa-demos have the same issue.
[Mesa-dev] [Bug 98629] OpenGL applications warns "MESA-LOADER: failed to retrieve device information"
https://bugs.freedesktop.org/show_bug.cgi?id=98629 --- Comment #2 from Mingcong Bai --- Created attachment 127852 --> https://bugs.freedesktop.org/attachment.cgi?id=127852&action=edit strace of glxinfo
Re: [Mesa-dev] [PATCH 2/2] i965/compiler: Disable trig workarounds on KBL+
On Tue, Nov 8, 2016 at 1:21 PM, Jason Ekstrand wrote: > The precision of our trig instructions instructions appears to have been > s/instructions instructions/instructions > fixed on Kaby Lake. Neither Ben nor I can find any documentation for this. > However, the dEQP precision tests now pass with INTEL_PRECISE_TRIG=0 where > they fail on Sky Lake. > > Signed-off-by: Jason Ekstrand > --- > src/mesa/drivers/dri/i965/brw_nir.c | 5 - > src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py | 7 --- > 2 files changed, 8 insertions(+), 4 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c > b/src/mesa/drivers/dri/i965/brw_nir.c > index a93d825..1069438 100644 > --- a/src/mesa/drivers/dri/i965/brw_nir.c > +++ b/src/mesa/drivers/dri/i965/brw_nir.c > @@ -449,6 +449,7 @@ nir_optimize(nir_shader *nir, bool is_scalar) > nir_shader * > brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir) > { > + const struct gen_device_info *devinfo = compiler->devinfo; > bool progress; /* Written by OPT and OPT_V */ > (void)progress; > > @@ -457,7 +458,9 @@ brw_preprocess_nir(const struct brw_compiler > *compiler, nir_shader *nir) > if (nir->stage == MESA_SHADER_GEOMETRY) >OPT(nir_lower_gs_intrinsics); > > - if (compiler->precise_trig) > + /* See also brw_nir_trig_workarounds.py */ > + if (compiler->precise_trig && > + !(devinfo->gen >= 10 || devinfo->is_kabylake)) >OPT(brw_nir_apply_trig_workarounds); > > static const nir_lower_tex_options tex_options = { > diff --git a/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py > b/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py > index 67dab9a..3b8d0ce 100755 > --- a/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py > +++ b/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py > @@ -23,9 +23,10 @@ > > import nir_algebraic > > -# The SIN and COS instructions on Intel hardware can produce values > -# slightly outside of the [-1.0, 1.0] range for a small set of values. 
> -# Obviously, this can break everyone's expectations about trig functions. > +# Prior to Kaby Lake, The SIN and COS instructions on Intel hardware can > +# produce values slightly outside of the [-1.0, 1.0] range for a small > set of > +# values. Obviously, this can break everyone's expectations about trig > +# functions. This appears to be fixed in Kaby Lake. > # > # According to an internal presentation, the COS instruction can produce > # a value up to 1.000027 for inputs in the range (0.08296, 0.09888). One > -- > 2.5.0.400.gff86faf
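The condition added to brw_preprocess_nir in the quoted patch can be read as a small predicate. The sketch below restates it in isolation. The struct example_devinfo is a stand-in invented for this example and carries only the two fields the patch tests (gen and is_kabylake); it is not the real gen_device_info, and needs_trig_workaround is a hypothetical name.

```c
#include <stdbool.h>

/* Stand-in for the two gen_device_info fields the patch looks at.
 * Hypothetical type, for illustration only. */
struct example_devinfo {
   int gen;
   bool is_kabylake;
};

/* Mirrors the condition from the patch: apply the SIN/COS workaround only
 * when precise trig is requested and the hardware is older than Kaby Lake
 * (Kaby Lake reports gen 9, hence the separate flag). */
static inline bool
needs_trig_workaround(bool precise_trig, const struct example_devinfo *devinfo)
{
   return precise_trig && !(devinfo->gen >= 10 || devinfo->is_kabylake);
}
```

Under these assumptions a Sky Lake part (gen 9, is_kabylake false) still gets the workaround when precise trig is enabled, while Kaby Lake and anything reporting gen 10 or later skips it.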
Re: [Mesa-dev] [PATCH 06/25] anv: Rework the way render target surfaces are allocated
On Sat, Oct 22, 2016 at 10:50:37AM -0700, Jason Ekstrand wrote: > This commit moves the allocation and filling out of surface state from > CreateImageView time to BeginRenderPass time. Instead of allocating the > render target surface state as part of the image view, we allocate it in > the command buffer state at the same time that we set up clears. For > secondary command buffers, we allocate memory for the surface states in > BeginCommandBuffer but don't fill them out; instead, we use our new > SOL-based memcpy function to copy the surface states from the primary > command buffer. This allows us to handle secondary command buffers without > the user specifying the framebuffer ahead-of-time. > --- > src/intel/vulkan/anv_cmd_buffer.c | 56 -- > src/intel/vulkan/anv_image.c | 22 > src/intel/vulkan/anv_private.h | 24 - > src/intel/vulkan/genX_cmd_buffer.c | 204 > + > 4 files changed, 180 insertions(+), 126 deletions(-) > > diff --git a/src/intel/vulkan/anv_cmd_buffer.c > b/src/intel/vulkan/anv_cmd_buffer.c > index a652f9a..372030c 100644 > --- a/src/intel/vulkan/anv_cmd_buffer.c > +++ b/src/intel/vulkan/anv_cmd_buffer.c > @@ -144,62 +144,6 @@ anv_cmd_state_reset(struct anv_cmd_buffer *cmd_buffer) > state->gen7.index_buffer = NULL; > } > > -/** > - * Setup anv_cmd_state::attachments for vkCmdBeginRenderPass. 
> - */ > -void > -anv_cmd_state_setup_attachments(struct anv_cmd_buffer *cmd_buffer, > -const VkRenderPassBeginInfo *info) > -{ > - struct anv_cmd_state *state = &cmd_buffer->state; > - ANV_FROM_HANDLE(anv_render_pass, pass, info->renderPass); > - > - vk_free(&cmd_buffer->pool->alloc, state->attachments); > - > - if (pass->attachment_count == 0) { > - state->attachments = NULL; > - return; > - } > - > - state->attachments = vk_alloc(&cmd_buffer->pool->alloc, > - pass->attachment_count * > - sizeof(state->attachments[0]), > - 8, VK_SYSTEM_ALLOCATION_SCOPE_OBJECT); > - if (state->attachments == NULL) { > - /* FIXME: Propagate VK_ERROR_OUT_OF_HOST_MEMORY to vkEndCommandBuffer > */ > - abort(); > - } > - > - for (uint32_t i = 0; i < pass->attachment_count; ++i) { > - struct anv_render_pass_attachment *att = &pass->attachments[i]; > - VkImageAspectFlags att_aspects = vk_format_aspects(att->format); > - VkImageAspectFlags clear_aspects = 0; > - > - if (att_aspects == VK_IMAGE_ASPECT_COLOR_BIT) { > - /* color attachment */ > - if (att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) { > -clear_aspects |= VK_IMAGE_ASPECT_COLOR_BIT; > - } > - } else { > - /* depthstencil attachment */ > - if ((att_aspects & VK_IMAGE_ASPECT_DEPTH_BIT) && > - att->load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) { > -clear_aspects |= VK_IMAGE_ASPECT_DEPTH_BIT; > - } > - if ((att_aspects & VK_IMAGE_ASPECT_STENCIL_BIT) && > - att->stencil_load_op == VK_ATTACHMENT_LOAD_OP_CLEAR) { > -clear_aspects |= VK_IMAGE_ASPECT_STENCIL_BIT; > - } > - } > - > - state->attachments[i].pending_clear_aspects = clear_aspects; > - if (clear_aspects) { > - assert(info->clearValueCount > i); > - state->attachments[i].clear_value = info->pClearValues[i]; > - } > - } > -} > - > VkResult > anv_cmd_buffer_ensure_push_constants_size(struct anv_cmd_buffer *cmd_buffer, >gl_shader_stage stage, uint32_t > size) > diff --git a/src/intel/vulkan/anv_image.c b/src/intel/vulkan/anv_image.c > index b7c2e99..b014985 100644 > --- 
a/src/intel/vulkan/anv_image.c > +++ b/src/intel/vulkan/anv_image.c > @@ -504,23 +504,6 @@ anv_CreateImageView(VkDevice _device, >iview->sampler_surface_state.alloc_size = 0; > } > > - if (image->usage & VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT) { > - iview->color_rt_surface_state = alloc_surface_state(device); > - > - struct isl_view view = iview->isl; > - view.usage |= ISL_SURF_USAGE_RENDER_TARGET_BIT; > - isl_surf_fill_state(&device->isl_dev, > - iview->color_rt_surface_state.map, > - .surf = &surface->isl, > - .view = &view, > - .mocs = device->default_mocs); > - > - if (!device->info.has_llc) > - anv_state_clflush(iview->color_rt_surface_state); > - } else { > - iview->color_rt_surface_state.alloc_size = 0; > - } > - > /* NOTE: This one needs to go last since it may stomp isl_view.format */ > if (image->usage & VK_IMAGE_USAGE_STORAGE_BIT) { >iview->storage_surface_state = alloc_surface_state(device); > @@ -565,11 +548,6 @@ anv_DestroyImageView(VkDevice _device, VkImageView > _iview, > ANV_FROM_HANDLE(anv
Re: [Mesa-dev] [PATCH 00/25] anv: A major rework of color attachment surface states
On Tue, Nov 8, 2016 at 2:26 PM, Nanley Chery wrote: > On Tue, Nov 08, 2016 at 01:52:15PM -0800, Jason Ekstrand wrote: > > On Tue, Nov 8, 2016 at 1:36 PM, Nanley Chery > wrote: > > > > > On Sat, Oct 22, 2016 at 10:50:31AM -0700, Jason Ekstrand wrote: > > > > This series does some fairly major surgery on color attachment > surface > > > > state allocation and fill-out in the Intel Vulkan driver. This is in > > > > preparation for doing color compression, fast-clears, and HiZ-capable > > > input > > > > attachments. Naturally, as with everything else I've done in the > last 2 > > > > months, it also involves some non-trivial blorp work. > > > > > > > > Let's start off at the beginning... For a variety of reasons, we > can't > > > > really know 100% of the details of an attachment's surface state at > any > > > > other places than vkCmdBeginRenderPass and vkCmdNextSubpss. The same > > > > applies for depth buffers if you consider 3DSTATE_DEPTH_BUFFER and > > > friends > > > > to be the depth and stencil buffer's "surface state". That's a > fairly > > > > strong statement, but there are a couple of reasons for this: > > > > > > > > 1) In order for fast-clears to work, the surface state has to > contain > > > the > > > > clear color. (This is it's own packet for HiZ but not for > color.) > > > We > > > > don't know the clear value until BeginRenderPass. This means we > > > can't > > > > fully fill out the surface state in vkCmdCreateImageView. > > > > > > > > > > We could alternatively merge the view's surface state packet into > > > another that only contains the clear color(s) right? > > > > > > > Potentially, yes. However that adds a good bit of complication because > we > > now have to emit render target surfaces on-the-fly because you may be > > building two different batches simultaneously that use the same > VkImageView > > as a render target with two different clear colors. It also doesn't > solve > > the null framebuffer problem. 
> > > > I'm not suggesting that this optimization solves the null framebuffer > problem, nor that we could add the clear color to the VkImageView's > surface state. I'm trying to confirm that we could allocate the block > of states (as is done in this series), then assign a block entry the > VkImageView's surface state + a surface state struct that only > contains the clear colors. > Yes, that might work and would let us keep the isl_surf_fill_state call in anv_image.c. We would also have to deal with at least the AuxUsage field in the OR-in as well. I think we could set up the other aux buffer information in isl_surf_fill_state and the hardware *should* ignore if we set AuxUsage to AUX_USAGE_NONE. > > > > > - Nanley > > > > > > > 2) The Vulkan spec requires that you be able to call > > > vkBeginCommandBuffer > > > > on a secondary command buffer with > USAGE_RENDER_PASS_CONTINUE_BIT set > > > > but with a null framebuffer. In this case, the secondary is > supposed > > > > to inherit the framebuffer from the primary. (This is not > something > > > we > > > > have properly implemented until now.) This means that anything > that > > > is > > > > callable from a render-pass-continuing secondary command buffer > has > > > to > > > > be able to operate without knowing any surface details that > aren't > > > part > > > > of the VkRenderPass object. Basically, all you know is the > Vulkan > > > > format (not the isl format) and the sample count. > > > > > > > > Between the two of those, about the only two entrypoints left at > which we > > > > actually know surface details are vkCmdBeginRenderPass and > > > vkCmdNextSubpass > > > > so we have to figure out how to do everything there. As it turns > out, > > > this > > > > works out surprisingly well. The format and the sample count turn > out to > > > > be exactly the data we actually need in order to do all of our > pipeline > > > > programming. 
The only hard part is refactoring things so that it > pulls > > > the > > > > data from the render pass instead of the framebuffer. There are a > number > > > > of places where we were grabbing the image view for an attachment > because > > > > we either wanted to shove something into blorp or because we wanted > the > > > > format and we were lazy. > > > > > > > > The approach taken in this patch series is the following: > > > > > > > > 1) Instead of allocating render target surface states in > > > vkCreateImageView, > > > > we allocate them as part of render pass setup in > > > vkCmdBeginRenderPass. > > > > All of the surface states we will ever need (including a null > surface > > > > state) are allocated up-front out of a single contiguous block. > > > > > > > > 2) For secondary command buffers with USAGE_RENDER_PASS_CONTINUE_BIT > > > set, > > > > we allocate storage for all of the surface states but don't > actually > > > > fill them out. In the secondary command buffer, all binding > tables > >
Re: [Mesa-dev] [PATCH 1/3] gallium/scons: OpenSWR Windows support
> -Original Message- > From: Jose Fonseca [mailto:jfons...@vmware.com] > Sent: Tuesday, November 8, 2016 4:17 PM > To: Kyriazis, George ; mesa- > d...@lists.freedesktop.org > Subject: Re: [Mesa-dev] [PATCH 1/3] gallium/scons: OpenSWR Windows > support > > On 07/11/16 22:32, George Kyriazis wrote: > > - Added code to create screen and handle swaps in libgl_gdi.c > > - Added call to swr SConscript > > - included llvm 3.9 support for scons (windows swr only support 3.9 and > > later) > > - include -DHAVE_SWR to subdirs that need it > > > > To buils SWR on windows, use "scons swr libgl-gdi" > > --- > > scons/llvm.py | 21 +++-- > > src/gallium/SConscript| 1 + > > src/gallium/targets/libgl-gdi/SConscript | 4 > > src/gallium/targets/libgl-gdi/libgl_gdi.c | 28 > > +++- src/gallium/targets/libgl-xlib/SConscript > | 4 > > src/gallium/targets/osmesa/SConscript | 4 > > 6 files changed, 55 insertions(+), 7 deletions(-) > > > > diff --git a/scons/llvm.py b/scons/llvm.py index 1fc8a3f..977e47a > > 100644 > > --- a/scons/llvm.py > > +++ b/scons/llvm.py > > @@ -106,7 +106,24 @@ def generate(env): > > ]) > > env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')]) > > # LIBS should match the output of `llvm-config --libs engine mcjit > bitwriter x86asmprinter` > > -if llvm_version >= distutils.version.LooseVersion('3.7'): > > +if llvm_version >= distutils.version.LooseVersion('3.9'): > > +env.Prepend(LIBS = [ > > +'LLVMX86Disassembler', 'LLVMX86AsmParser', > > +'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter', > > +'LLVMDebugInfoCodeView', 'LLVMCodeGen', > > +'LLVMScalarOpts', 'LLVMInstCombine', > > +'LLVMInstrumentation', 'LLVMTransformUtils', > > +'LLVMBitWriter', 'LLVMX86Desc', > > +'LLVMMCDisassembler', 'LLVMX86Info', > > +'LLVMX86AsmPrinter', 'LLVMX86Utils', > > +'LLVMMCJIT', 'LLVMExecutionEngine', 'LLVMTarget', > > +'LLVMAnalysis', 'LLVMProfileData', > > +'LLVMRuntimeDyld', 'LLVMObject', 'LLVMMCParser', > > +'LLVMBitReader', 'LLVMMC', 'LLVMCore', > > +'LLVMSupport', > > 
+'LLVMIRReader', 'LLVMASMParser' > > +]) > > +elif llvm_version >= distutils.version.LooseVersion('3.7'): > > env.Prepend(LIBS = [ > > 'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser', > > 'LLVMX86CodeGen', 'LLVMSelectionDAG', > > 'LLVMAsmPrinter', @@ -203,7 +220,7 @@ def generate(env): > > if '-fno-rtti' in cxxflags: > > env.Append(CXXFLAGS = ['-fno-rtti']) > > > > -components = ['engine', 'mcjit', 'bitwriter', 'x86asmprinter', > 'mcdisassembler'] > > +components = ['engine', 'mcjit', 'bitwriter', > > + 'x86asmprinter', 'mcdisassembler', 'irreader'] > > > > env.ParseConfig('llvm-config --libs ' + ' '.join(components)) > > env.ParseConfig('llvm-config --ldflags') diff --git > > a/src/gallium/SConscript b/src/gallium/SConscript index > > f98268f..9273db7 100644 > > --- a/src/gallium/SConscript > > +++ b/src/gallium/SConscript > > @@ -18,6 +18,7 @@ SConscript([ > > 'drivers/softpipe/SConscript', > > 'drivers/svga/SConscript', > > 'drivers/trace/SConscript', > > +'drivers/swr/SConscript', > > ]) > > > > # > > diff --git a/src/gallium/targets/libgl-gdi/SConscript > > b/src/gallium/targets/libgl-gdi/SConscript > > index 2a52363..ef8050b 100644 > > --- a/src/gallium/targets/libgl-gdi/SConscript > > +++ b/src/gallium/targets/libgl-gdi/SConscript > > @@ -30,6 +30,10 @@ if env['llvm']: > > env.Append(CPPDEFINES = 'HAVE_LLVMPIPE') > > drivers += [llvmpipe] > > > > +if 'swr' in COMMAND_LINE_TARGETS : > > +env.Append(CPPDEFINES = 'HAVE_SWR') > > +drivers += [swr] > > + > > if env['gcc'] and env['machine'] != 'x86_64': > > # DEF parser in certain versions of MinGW is busted, as does not behave > as > > # MSVC. mingw-w64 works fine. 
> > diff --git a/src/gallium/targets/libgl-gdi/libgl_gdi.c > > b/src/gallium/targets/libgl-gdi/libgl_gdi.c > > index 922c186..12576db 100644 > > --- a/src/gallium/targets/libgl-gdi/libgl_gdi.c > > +++ b/src/gallium/targets/libgl-gdi/libgl_gdi.c > > @@ -51,9 +51,12 @@ > > #include "llvmpipe/lp_public.h" > > #endif > > > > +#ifdef HAVE_SWR > > +#include "swr/swr_public.h" > > +#endif > > > > static boolean use_llvmpipe = FALSE; > > - > > +static boolean use_swr = FALSE; > > > > static struct pipe_screen * > > gdi_screen_create(void) > > @@ -69,6 +72,8 @@ gdi_screen_create(void) > > > > #ifdef HAVE_LLVMPIPE > > default_driver = "llvmpipe"; > > +#elif HAVE_SWR > > + default_driver = "swr"; > > #else > >
Re: [Mesa-dev] [PATCH 2/3] mesa: added msvc HAS_TRIVIAL_DESTRUCTOR implementation
> -Original Message- > From: Jose Fonseca [mailto:jfons...@vmware.com] > Sent: Tuesday, November 8, 2016 4:12 PM > To: Kyriazis, George ; mesa- > d...@lists.freedesktop.org > Subject: Re: [Mesa-dev] [PATCH 2/3] mesa: added msvc > HAS_TRIVIAL_DESTRUCTOR implementation > > On 07/11/16 22:32, George Kyriazis wrote: > > not having it on windows causes a CANARY assertion in > > src/util/ralloc.c:get_header() > > > > Tested only on MSVC 19.00 (DevStudio 14.0), so #ifdef guards reflect that. > > --- > > src/util/macros.h | 5 + > > 1 file changed, 5 insertions(+) > > > > diff --git a/src/util/macros.h b/src/util/macros.h index > > 27d1b62..12b26d3 100644 > > --- a/src/util/macros.h > > +++ b/src/util/macros.h > > @@ -175,6 +175,11 @@ do { \ > > # if __has_feature(has_trivial_destructor) > > # define HAS_TRIVIAL_DESTRUCTOR(T) __has_trivial_destructor(T) > > # endif > > +# elif defined(_MSC_VER) && !defined(__INTEL_COMPILER) > > +# if _MSC_VER >= 1900 > > +# define HAS_TRIVIAL_DESTRUCTOR(T) __has_trivial_destructor(T) > > +# else > > #else is redundant her. Otherwise looks good. > No problem. I'll remove. George > Reviewed-by: Jose Fonseca > > > +# endif > > # endif > > # ifndef HAS_TRIVIAL_DESTRUCTOR > > /* It's always safe (if inefficient) to assume that a > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 98629] OpenGL applications warns "MESA-LOADER: failed to retrieve device information"
https://bugs.freedesktop.org/show_bug.cgi?id=98629

Emil Velikov changed:

What |Removed |Added
CC   |        |emil.l.veli...@gmail.com
Re: [Mesa-dev] [PATCH v2] swr: disable logic op when the rt format is float or srgb
I'd prefer parentheses to clarify the logic: "(foo && ((bar == bla) || footer))". With those added,

Reviewed-by: Tim Rowley <timothy.o.row...@intel.com>

On Nov 8, 2016, at 4:30 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:

Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
---
 src/gallium/drivers/swr/swr_state.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/drivers/swr/swr_state.cpp b/src/gallium/drivers/swr/swr_state.cpp
index d8a8ee1..d16c307 100644
--- a/src/gallium/drivers/swr/swr_state.cpp
+++ b/src/gallium/drivers/swr/swr_state.cpp
@@ -1305,6 +1305,12 @@ swr_update_derived(struct pipe_context *pipe,
                    &ctx->blend->compileState[target],
                    sizeof(compileState.blendState));

+            const SWR_FORMAT_INFO& info = GetFormatInfo(compileState.format);
+            if (compileState.blendState.logicOpEnable &&
+                (info.type[0] == SWR_TYPE_FLOAT || info.isSRGB)) {
+               compileState.blendState.logicOpEnable = false;
+            }
+
             if (compileState.blendState.blendEnable == false &&
                 compileState.blendState.logicOpEnable == false) {
                SwrSetBlendFunc(ctx->swrContext, target, NULL);
--
2.7.3
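For illustration only, the condition in this patch with the fully parenthesized form requested in the review could be sketched in plain C as below. The names sketch_format_info, sketch_type and sketch_logic_op_enabled are hypothetical stand-ins, not the swr driver's own types; the sketch only mirrors the shape of the test (logic op requested, and the render target is float or sRGB).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's format description. */
enum sketch_type { SKETCH_TYPE_UNORM, SKETCH_TYPE_FLOAT };

struct sketch_format_info {
    enum sketch_type type0; /* type of the first channel */
    bool is_srgb;           /* true for sRGB render targets */
};

/* Returns the effective logic-op enable: the request is dropped when the
 * render target is float or sRGB, with every sub-expression parenthesized
 * as "(foo && ((bar == bla) || footer))". */
bool sketch_logic_op_enabled(bool requested,
                             const struct sketch_format_info *info)
{
    if (requested && ((info->type0 == SKETCH_TYPE_FLOAT) || info->is_srgb))
        return false;
    return requested;
}
```

The extra parentheses do not change the result, since == binds tighter than || in C; they only make the grouping explicit for the reader.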
[Mesa-dev] [PATCH v2] swr: disable logic op when the rt format is float or srgb
Signed-off-by: Ilia Mirkin
---
 src/gallium/drivers/swr/swr_state.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/drivers/swr/swr_state.cpp b/src/gallium/drivers/swr/swr_state.cpp
index d8a8ee1..d16c307 100644
--- a/src/gallium/drivers/swr/swr_state.cpp
+++ b/src/gallium/drivers/swr/swr_state.cpp
@@ -1305,6 +1305,12 @@ swr_update_derived(struct pipe_context *pipe,
                    &ctx->blend->compileState[target],
                    sizeof(compileState.blendState));

+            const SWR_FORMAT_INFO& info = GetFormatInfo(compileState.format);
+            if (compileState.blendState.logicOpEnable &&
+                (info.type[0] == SWR_TYPE_FLOAT || info.isSRGB)) {
+               compileState.blendState.logicOpEnable = false;
+            }
+
             if (compileState.blendState.blendEnable == false &&
                 compileState.blendState.logicOpEnable == false) {
                SwrSetBlendFunc(ctx->swrContext, target, NULL);
--
2.7.3
Re: [Mesa-dev] [PATCH 00/25] anv: A major rework of color attachment surface states
On Tue, Nov 08, 2016 at 01:52:15PM -0800, Jason Ekstrand wrote: > On Tue, Nov 8, 2016 at 1:36 PM, Nanley Chery wrote: > > > On Sat, Oct 22, 2016 at 10:50:31AM -0700, Jason Ekstrand wrote: > > > This series does some fairly major surgery on color attachment surface > > > state allocation and fill-out in the Intel Vulkan driver. This is in > > > preparation for doing color compression, fast-clears, and HiZ-capable > > input > > > attachments. Naturally, as with everything else I've done in the last 2 > > > months, it also involves some non-trivial blorp work. > > > > > > Let's start off at the beginning... For a variety of reasons, we can't > > > really know 100% of the details of an attachment's surface state at any > > > other places than vkCmdBeginRenderPass and vkCmdNextSubpss. The same > > > applies for depth buffers if you consider 3DSTATE_DEPTH_BUFFER and > > friends > > > to be the depth and stencil buffer's "surface state". That's a fairly > > > strong statement, but there are a couple of reasons for this: > > > > > > 1) In order for fast-clears to work, the surface state has to contain > > the > > > clear color. (This is it's own packet for HiZ but not for color.) > > We > > > don't know the clear value until BeginRenderPass. This means we > > can't > > > fully fill out the surface state in vkCmdCreateImageView. > > > > > > > We could alternatively merge the view's surface state packet into > > another that only contains the clear color(s) right? > > > > Potentially, yes. However that adds a good bit of complication because we > now have to emit render target surfaces on-the-fly because you may be > building two different batches simultaneously that use the same VkImageView > as a render target with two different clear colors. It also doesn't solve > the null framebuffer problem. > I'm not suggesting that this optimization solves the null framebuffer problem, nor that we could add the clear color to the VkImageView's surface state. 
I'm trying to confirm that we could allocate the block of states (as is done in this series), then assign a block entry the VkImageView's surface state + a surface state struct that only contains the clear colors. > > > - Nanley > > > > > 2) The Vulkan spec requires that you be able to call > > vkBeginCommandBuffer > > > on a secondary command buffer with USAGE_RENDER_PASS_CONTINUE_BIT set > > > but with a null framebuffer. In this case, the secondary is supposed > > > to inherit the framebuffer from the primary. (This is not something > > we > > > have properly implemented until now.) This means that anything that > > is > > > callable from a render-pass-continuing secondary command buffer has > > to > > > be able to operate without knowing any surface details that aren't > > part > > > of the VkRenderPass object. Basically, all you know is the Vulkan > > > format (not the isl format) and the sample count. > > > > > > Between the two of those, about the only two entrypoints left at which we > > > actually know surface details are vkCmdBeginRenderPass and > > vkCmdNextSubpass > > > so we have to figure out how to do everything there. As it turns out, > > this > > > works out surprisingly well. The format and the sample count turn out to > > > be exactly the data we actually need in order to do all of our pipeline > > > programming. The only hard part is refactoring things so that it pulls > > the > > > data from the render pass instead of the framebuffer. There are a number > > > of places where we were grabbing the image view for an attachment because > > > we either wanted to shove something into blorp or because we wanted the > > > format and we were lazy. > > > > > > The approach taken in this patch series is the following: > > > > > > 1) Instead of allocating render target surface states in > > vkCreateImageView, > > > we allocate them as part of render pass setup in > > vkCmdBeginRenderPass. 
> > > All of the surface states we will ever need (including a null surface > > > state) are allocated up-front out of a single contiguous block. > > > > > > 2) For secondary command buffers with USAGE_RENDER_PASS_CONTINUE_BIT > > set, > > > we allocate storage for all of the surface states but don't actually > > > fill them out. In the secondary command buffer, all binding tables > > > refer to these surface states rather than the ones in the primary. > > > > > > 3) A blorp entrypoint is added that performs a clear operation without > > > touching the depth/stencil buffer state and with a color attachment > > > binding table explicitly provided by the caller. This means that > > even > > > our blorp clears are using the surface states allocated in > > > vkCmdBeginRenderPass. Unfortunately, this turned out to be more work > > > than expected because I had to add vertex shader support to blorp > > along > > > the way. > > > > > > 4) Here's the tricky bit.
Re: [Mesa-dev] [Mesa-announce] Mesa 12.0.4 release candidate
Matt Turner writes:

> On Tue, Nov 8, 2016 at 1:59 PM, Emil Velikov wrote:
>> Jordan Justen (1)
>> 49c24d8 i965: fix noop_scissor range issue on width/height
>> Note: temporary on hold since it causes GPU lockups on 32bit builds.
>
> Let's just drop this one. I found it in an old branch and committed it
> (even wrote a piglit test for it), but it didn't fix any actual
> applications.

It looks like Emil already dropped this from his release candidate, but forgot to remove it from the announcement.
Re: [Mesa-dev] [PATCH 1/3] anv/device: Don't even try to map memory with a size of 0
On Tue, Nov 8, 2016 at 2:11 PM, Nanley Chery wrote:

> On Tue, Nov 08, 2016 at 02:01:17PM -0800, Nanley Chery wrote:
> > On Tue, Nov 08, 2016 at 01:50:01PM -0800, Jason Ekstrand wrote:
> > > On Tue, Nov 8, 2016 at 1:46 PM, Nanley Chery wrote:
> > >
> > > > On Mon, Nov 07, 2016 at 05:28:12PM -0800, Jason Ekstrand wrote:
> > > > > Signed-off-by: Jason Ekstrand
> > > > > Cc: "12.0 13.0"
> > > > > ---
> > > > >  src/intel/vulkan/anv_device.c | 5 +
> > > > >  1 file changed, 5 insertions(+)
> > > > >
> > > > > diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> > > > > index 5393144..8055893 100644
> > > > > --- a/src/intel/vulkan/anv_device.c
> > > > > +++ b/src/intel/vulkan/anv_device.c
> > > > > @@ -1258,6 +1258,11 @@ VkResult anv_MapMemory(
> > > > >     if (size == VK_WHOLE_SIZE)
> > > > >        size = mem->bo.size - offset;
> > > > >
> > > > > +   if (size == 0) {
> > > >
> > > > The user isn't allowed to make such a call. Does this fix a CTS test?
> > > >
> > >
> > > Heh, so they aren't. It doesn't fix anything, it just ensures that you
> > > never hit the ioctl with a size of zero. How about I replace it with an
> > > assert?
> > >
> >
> > An assert or no assert is fine. The validation layers technically should
> > catch this for us.
>

They should, but this is more for my confidence in subsequent code than to try and fix apps.

> With patch 1 fixed or omitted, this series is:
> Reviewed-by: Nanley Chery
>

Thanks!

> > > > > +      *ppData = NULL;
> > > > > +      return VK_SUCCESS;
> > > > > +   }
> > > > > +
> > > > >     /* FIXME: Is this supposed to be thread safe? Since vkUnmapMemory() only
> > > > >      * takes a VkDeviceMemory pointer, it seems like only one map of the memory
> > > > >      * at a time is valid. We could just mmap up front and return an offset
> > > > > --
> > > > > 2.5.0.400.gff86faf
Re: [Mesa-dev] [PATCH 1/3] gallium/scons: OpenSWR Windows support
On 07/11/16 22:32, George Kyriazis wrote: - Added code to create screen and handle swaps in libgl_gdi.c - Added call to swr SConscript - included llvm 3.9 support for scons (windows swr only support 3.9 and later) - include -DHAVE_SWR to subdirs that need it To buils SWR on windows, use "scons swr libgl-gdi" --- scons/llvm.py | 21 +++-- src/gallium/SConscript| 1 + src/gallium/targets/libgl-gdi/SConscript | 4 src/gallium/targets/libgl-gdi/libgl_gdi.c | 28 +++- src/gallium/targets/libgl-xlib/SConscript | 4 src/gallium/targets/osmesa/SConscript | 4 6 files changed, 55 insertions(+), 7 deletions(-) diff --git a/scons/llvm.py b/scons/llvm.py index 1fc8a3f..977e47a 100644 --- a/scons/llvm.py +++ b/scons/llvm.py @@ -106,7 +106,24 @@ def generate(env): ]) env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')]) # LIBS should match the output of `llvm-config --libs engine mcjit bitwriter x86asmprinter` -if llvm_version >= distutils.version.LooseVersion('3.7'): +if llvm_version >= distutils.version.LooseVersion('3.9'): +env.Prepend(LIBS = [ +'LLVMX86Disassembler', 'LLVMX86AsmParser', +'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter', +'LLVMDebugInfoCodeView', 'LLVMCodeGen', +'LLVMScalarOpts', 'LLVMInstCombine', +'LLVMInstrumentation', 'LLVMTransformUtils', +'LLVMBitWriter', 'LLVMX86Desc', +'LLVMMCDisassembler', 'LLVMX86Info', +'LLVMX86AsmPrinter', 'LLVMX86Utils', +'LLVMMCJIT', 'LLVMExecutionEngine', 'LLVMTarget', +'LLVMAnalysis', 'LLVMProfileData', +'LLVMRuntimeDyld', 'LLVMObject', 'LLVMMCParser', +'LLVMBitReader', 'LLVMMC', 'LLVMCore', +'LLVMSupport', +'LLVMIRReader', 'LLVMASMParser' +]) +elif llvm_version >= distutils.version.LooseVersion('3.7'): env.Prepend(LIBS = [ 'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser', 'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter', @@ -203,7 +220,7 @@ def generate(env): if '-fno-rtti' in cxxflags: env.Append(CXXFLAGS = ['-fno-rtti']) -components = ['engine', 'mcjit', 'bitwriter', 'x86asmprinter', 'mcdisassembler'] 
+components = ['engine', 'mcjit', 'bitwriter', 'x86asmprinter', 'mcdisassembler', 'irreader'] env.ParseConfig('llvm-config --libs ' + ' '.join(components)) env.ParseConfig('llvm-config --ldflags') diff --git a/src/gallium/SConscript b/src/gallium/SConscript index f98268f..9273db7 100644 --- a/src/gallium/SConscript +++ b/src/gallium/SConscript @@ -18,6 +18,7 @@ SConscript([ 'drivers/softpipe/SConscript', 'drivers/svga/SConscript', 'drivers/trace/SConscript', +'drivers/swr/SConscript', ]) # diff --git a/src/gallium/targets/libgl-gdi/SConscript b/src/gallium/targets/libgl-gdi/SConscript index 2a52363..ef8050b 100644 --- a/src/gallium/targets/libgl-gdi/SConscript +++ b/src/gallium/targets/libgl-gdi/SConscript @@ -30,6 +30,10 @@ if env['llvm']: env.Append(CPPDEFINES = 'HAVE_LLVMPIPE') drivers += [llvmpipe] +if 'swr' in COMMAND_LINE_TARGETS : +env.Append(CPPDEFINES = 'HAVE_SWR') +drivers += [swr] + if env['gcc'] and env['machine'] != 'x86_64': # DEF parser in certain versions of MinGW is busted, as does not behave as # MSVC. mingw-w64 works fine. 
diff --git a/src/gallium/targets/libgl-gdi/libgl_gdi.c b/src/gallium/targets/libgl-gdi/libgl_gdi.c index 922c186..12576db 100644 --- a/src/gallium/targets/libgl-gdi/libgl_gdi.c +++ b/src/gallium/targets/libgl-gdi/libgl_gdi.c @@ -51,9 +51,12 @@ #include "llvmpipe/lp_public.h" #endif +#ifdef HAVE_SWR +#include "swr/swr_public.h" +#endif static boolean use_llvmpipe = FALSE; - +static boolean use_swr = FALSE; static struct pipe_screen * gdi_screen_create(void) @@ -69,6 +72,8 @@ gdi_screen_create(void) #ifdef HAVE_LLVMPIPE default_driver = "llvmpipe"; +#elif HAVE_SWR + default_driver = "swr"; #else default_driver = "softpipe"; #endif @@ -78,15 +83,21 @@ gdi_screen_create(void) #ifdef HAVE_LLVMPIPE if (strcmp(driver, "llvmpipe") == 0) { screen = llvmpipe_create_screen( winsys ); + if (screen) + use_llvmpipe = TRUE; + } +#endif +#ifdef HAVE_SWR + if (strcmp(driver, "swr") == 0) { + screen = swr_create_screen( winsys ); + if (screen) + use_swr = TRUE; } -#else - (void) driver; #endif + (void) driver; if (screen == NULL) { screen = softpipe_create_screen( winsys ); - } else { - use_llvmpipe = TRUE; } if(!screen) @@ -128,6 +139,13 @@ gdi_present(struct pipe_screen *screen, } #endif +#ifdef HAVE_SWR +
Re: [Mesa-dev] [PATCH] gallivm: fix [IU]MUL_HI regression
On 08.11.2016 at 20:10, Marek Olšák wrote:
> FYI, this doesn't fix the regression fully. (GLCTS failures with
> piglit: -t mulextended)

Maybe using shuffle isn't such a good idea then. I'm not sure how well you handle shuffles, and there's probably a problem with scalar build contexts (initially this was restricted to 4- and 8-sized vectors): looking at it, we'd actually return a (1-sized) vector instead of a scalar in the end...

Shifts/truncs have the advantage that they are completely agnostic as to whether they operate on scalars or vectors (and, if vectors, what kind of vectors).

Roland
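As a scalar illustration of the shift/truncate approach mentioned above, the high half of a 32x32-bit multiply can be sketched in plain C as follows. This is only a sketch of the arithmetic, not the gallivm code; the function names are made up for the example, and the same widen, multiply, shift, truncate sequence is what would apply element-wise to vectors.

```c
#include <stdint.h>

/* Unsigned high half: widen both operands to 64 bits, multiply,
 * shift the product right by 32 and truncate back to 32 bits. */
uint32_t sketch_umul_hi(uint32_t a, uint32_t b)
{
    uint64_t wide = (uint64_t)a * (uint64_t)b;
    return (uint32_t)(wide >> 32);
}

/* Signed high half: same sequence with sign extension.
 * Assumption: >> on a negative int64_t is an arithmetic shift, which C
 * leaves implementation-defined but mainstream compilers provide. */
int32_t sketch_imul_hi(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    return (int32_t)(wide >> 32);
}
```

Nothing in this sequence depends on lane count, which is the property the message points out: no shuffle is needed to pick out the upper halves.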
Re: [Mesa-dev] [PATCH 2/3] mesa: added msvc HAS_TRIVIAL_DESTRUCTOR implementation
On 07/11/16 22:32, George Kyriazis wrote:
> not having it on windows causes a CANARY assertion in
> src/util/ralloc.c:get_header()
>
> Tested only on MSVC 19.00 (DevStudio 14.0), so #ifdef guards reflect that.
> ---
>  src/util/macros.h | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/src/util/macros.h b/src/util/macros.h
> index 27d1b62..12b26d3 100644
> --- a/src/util/macros.h
> +++ b/src/util/macros.h
> @@ -175,6 +175,11 @@ do { \
>  #    if __has_feature(has_trivial_destructor)
>  #       define HAS_TRIVIAL_DESTRUCTOR(T) __has_trivial_destructor(T)
>  #    endif
> +#  elif defined(_MSC_VER) && !defined(__INTEL_COMPILER)
> +#    if _MSC_VER >= 1900
> +#       define HAS_TRIVIAL_DESTRUCTOR(T) __has_trivial_destructor(T)
> +#    else

#else is redundant here. Otherwise looks good.

Reviewed-by: Jose Fonseca

> +#    endif
>  #  endif
>  #  ifndef HAS_TRIVIAL_DESTRUCTOR
>     /* It's always safe (if inefficient) to assume that a
Re: [Mesa-dev] [PATCH 1/3] anv/device: Don't even try to map memory with a size of 0
On Tue, Nov 08, 2016 at 02:01:17PM -0800, Nanley Chery wrote:
> On Tue, Nov 08, 2016 at 01:50:01PM -0800, Jason Ekstrand wrote:
> > On Tue, Nov 8, 2016 at 1:46 PM, Nanley Chery wrote:
> >
> > > On Mon, Nov 07, 2016 at 05:28:12PM -0800, Jason Ekstrand wrote:
> > > > Signed-off-by: Jason Ekstrand
> > > > Cc: "12.0 13.0"
> > > > ---
> > > >  src/intel/vulkan/anv_device.c | 5 +
> > > >  1 file changed, 5 insertions(+)
> > > >
> > > > diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> > > > index 5393144..8055893 100644
> > > > --- a/src/intel/vulkan/anv_device.c
> > > > +++ b/src/intel/vulkan/anv_device.c
> > > > @@ -1258,6 +1258,11 @@ VkResult anv_MapMemory(
> > > >     if (size == VK_WHOLE_SIZE)
> > > >        size = mem->bo.size - offset;
> > > >
> > > > +   if (size == 0) {
> > >
> > > The user isn't allowed to make such a call. Does this fix a CTS test?
> > >
> >
> > Heh, so they aren't. It doesn't fix anything, it just ensures that you
> > never hit the ioctl with a size of zero. How about I replace it with an
> > assert?
> >
>
> An assert or no assert is fine. The validation layers technically should
> catch this for us.
>

With patch 1 fixed or omitted, this series is:
Reviewed-by: Nanley Chery

> > > > +      *ppData = NULL;
> > > > +      return VK_SUCCESS;
> > > > +   }
> > > > +
> > > >     /* FIXME: Is this supposed to be thread safe? Since vkUnmapMemory() only
> > > >      * takes a VkDeviceMemory pointer, it seems like only one map of the memory
> > > >      * at a time is valid. We could just mmap up front and return an offset
> > > > --
> > > > 2.5.0.400.gff86faf
Re: [Mesa-dev] [PATCH 1/2] gallivm: add wrappers for missing functions in LLVM <= 3.8
On 19/10/16 23:14, Marek Olšák wrote:
> From: Marek Olšák
>
> radeonsi needs these.
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 21 +
>  src/gallium/auxiliary/gallivm/lp_bld_misc.h | 6 ++
>  2 files changed, 27 insertions(+)
>
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
> index 791a470..f4045ad 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
> @@ -70,20 +70,21 @@
>  #include
>  #else
>  #include
>  #endif
>  #include
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
>  #include
>  #if LLVM_USE_INTEL_JITEVENTS
>  #include
>  #endif
>  // Workaround http://llvm.org/PR23628
> @@ -701,10 +702,30 @@ lp_free_memory_manager(LLVMMCJITMemoryManagerRef memorymgr)
>  extern "C" void
>  lp_add_attr_dereferenceable(LLVMValueRef val, uint64_t bytes)
>  {
>  #if HAVE_LLVM >= 0x0306
>     llvm::Argument *A = llvm::unwrap<llvm::Argument>(val);
>     llvm::AttrBuilder B;
>     B.addDereferenceableAttr(bytes);
>     A->addAttr(llvm::AttributeSet::get(A->getContext(), A->getArgNo() + 1, B));
>  #endif
>  }
> +
> +extern "C" LLVMValueRef
> +lp_get_called_value(LLVMValueRef call)
> +{
> +#if HAVE_LLVM >= 0x0309
> +   return LLVMGetCalledValue(call);
> +#else
> +   return llvm::wrap(llvm::CallSite(llvm::unwrap(call)).getCalledValue());
> +#endif
> +}

In these circumstances, rather than introducing a wrapper, I find it more appealing to "backport" the future definition, as:

#if HAVE_LLVM < 0x0309
extern "C" LLVMValueRef
LLVMGetCalledValue(LLVMValueRef call)
{
   return llvm::wrap(llvm::CallSite(llvm::unwrap(call)).getCalledValue());
}
#endif

This way it's one less wrapper to learn. And when the required LLVM version reaches 3.9, we can just remove the function.

Jose

> +
> +extern "C" bool
> +lp_is_function(LLVMValueRef v)
> +{
> +#if HAVE_LLVM >= 0x0309
> +   return LLVMGetValueKind(v) == LLVMFunctionValueKind;
> +#else
> +   return llvm::isa<llvm::Function>(llvm::unwrap(v));
> +#endif
> +}
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.h b/src/gallium/auxiliary/gallivm/lp_bld_misc.h
> index c127c48..a55c6bd 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_misc.h
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.h
> @@ -69,16 +69,22 @@ lp_free_generated_code(struct lp_generated_code *code);
>  extern LLVMMCJITMemoryManagerRef
>  lp_get_default_memory_manager();
>  extern void
>  lp_free_memory_manager(LLVMMCJITMemoryManagerRef memorymgr);
>  extern void
>  lp_add_attr_dereferenceable(LLVMValueRef val, uint64_t bytes);
> +extern LLVMValueRef
> +lp_get_called_value(LLVMValueRef call);
> +
> +extern bool
> +lp_is_function(LLVMValueRef v);
> +
>  #ifdef __cplusplus
>  }
>  #endif
>  #endif /* !LP_BLD_MISC_H */
Re: [Mesa-dev] [Mesa-announce] Mesa 12.0.4 release candidate
On Tue, Nov 8, 2016 at 1:59 PM, Emil Velikov wrote:
> Jordan Justen (1)
> 49c24d8 i965: fix noop_scissor range issue on width/height
> Note: temporary on hold since it causes GPU lockups on 32bit builds.

Let's just drop this one. I found it in an old branch and committed it (even wrote a piglit test for it), but it didn't fix any actual applications.
Re: [Mesa-dev] [PATCH 1/3] anv/device: Don't even try to map memory with a size of 0
On Tue, Nov 08, 2016 at 01:50:01PM -0800, Jason Ekstrand wrote:
> On Tue, Nov 8, 2016 at 1:46 PM, Nanley Chery wrote:
>
> > On Mon, Nov 07, 2016 at 05:28:12PM -0800, Jason Ekstrand wrote:
> > > Signed-off-by: Jason Ekstrand
> > > Cc: "12.0 13.0"
> > > ---
> > >  src/intel/vulkan/anv_device.c | 5 +
> > >  1 file changed, 5 insertions(+)
> > >
> > > diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> > > index 5393144..8055893 100644
> > > --- a/src/intel/vulkan/anv_device.c
> > > +++ b/src/intel/vulkan/anv_device.c
> > > @@ -1258,6 +1258,11 @@ VkResult anv_MapMemory(
> > >     if (size == VK_WHOLE_SIZE)
> > >        size = mem->bo.size - offset;
> > >
> > > +   if (size == 0) {
> >
> > The user isn't allowed to make such a call. Does this fix a CTS test?
> >
>
> Heh, so they aren't. It doesn't fix anything, it just ensures that you
> never hit the ioctl with a size of zero. How about I replace it with an
> assert?
>

An assert or no assert is fine. The validation layers technically should catch this for us.

> > > +      *ppData = NULL;
> > > +      return VK_SUCCESS;
> > > +   }
> > > +
> > >     /* FIXME: Is this supposed to be thread safe? Since vkUnmapMemory() only
> > >      * takes a VkDeviceMemory pointer, it seems like only one map of the memory
> > >      * at a time is valid. We could just mmap up front and return an offset
> > > --
> > > 2.5.0.400.gff86faf
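To make the two options discussed in this thread concrete, here is a minimal, self-contained C sketch of the zero-size early return from the posted patch, with the assert alternative noted in a comment. It does not use the Vulkan headers or the anv driver's types: sketch_memory, SKETCH_WHOLE_SIZE and sketch_map_memory are hypothetical stand-ins for the buffer object, VK_WHOLE_SIZE and anv_MapMemory, and the "mapping" is just pointer arithmetic on an already available buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for VK_WHOLE_SIZE. */
#define SKETCH_WHOLE_SIZE (~0ULL)

/* Hypothetical stand-in for the device memory object. */
struct sketch_memory {
    uint8_t *base;  /* start of the backing storage */
    uint64_t size;  /* total size in bytes */
};

/* Returns 0 on success (stand-in for VK_SUCCESS). */
int sketch_map_memory(struct sketch_memory *mem, uint64_t offset,
                      uint64_t size, void **out)
{
    if (size == SKETCH_WHOLE_SIZE)
        size = mem->size - offset;

    /* Variant from the posted patch: never reach the real mapping
     * path (the ioctl, in the driver) with a size of zero.
     * The alternative raised in review would instead be
     * assert(size > 0), relying on the API rule that callers may
     * not request a zero-sized mapping. */
    if (size == 0) {
        *out = NULL;
        return 0;
    }

    *out = mem->base + offset;
    return 0;
}
```

Note that a zero size can still arise after the whole-size substitution when the offset equals the object's size, which is why the check sits after that substitution in the patch.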
[Mesa-dev] Mesa 12.0.4 release candidate
Hello list, The candidate for the Mesa 12.0.4 is now available. Currently we have: - 115 queued - 10 nominated (outstanding) - and 11 (self-)rejected patches Notes: - The Intel CI infrastructure is utilised for testing which brings us testing on all platforms supported by the i965 DRI and ANV Vulkan drivers. Classic swrast, softpipe and llvmpipe are also tested against latest piglit. Unfortunately, reports can no longer include the Changes section. - Sent-but-not-yet-merged-in-master patches are no longer tracked in the Nominated section. - Nominated and Rejected sections now include the sha of the commit in master. - The Rejected section includes reasoning behind the decision. Objections are considered with backports greatly appreciated. Take a look at section "Mesa stable queue" for more information. Testing reports/general approval Any testing reports (or general approval of the state of the branch) will be greatly appreciated. The plan is to have 12.0.4 this Thursday (10th of November), around or shortly after 20:00 GMT. If you have any questions or suggestions - be that about the current patch queue or otherwise, please go ahead. Trivial merge conflicts --- commit 05ec6a7c03ce0b3c38b081f7947aeaa47b1d7e81 Author: Ilia Mirkin a3xx: use window scissor to simulate viewport xy clip (cherry picked from commit ca313e00b6eda27e4308c29fd7244f43c77d4f97) commit b1c5719d7b1be2f6fb438252614a2008f54159d7 Author: Marek Olšák radeonsi: fix FP64 UBO loads with indirect uniform block indexing (cherry picked from commit 15a127bc2c3267f35e0d78ebc205e1686a5a5e3f) commit 6a72af2aeb48b90adfff922c99a26274cbc2f357 Author: Kenneth Graunke mesa: Expose RESET_NOTIFICATION_STRATEGY with KHR_robustness. 
(cherry picked from commit 3bcdc2e3db8fb9f8e04d3504b6f331b484ebcc96) commit f228c90f80effc413b51aba0b9b4f3487ed35871 Author: Nicholas Bishop st/dri: check pipe_screen->resource_get_handle() return value (cherry picked from commit aa560e8e6328acd5b8feec1fea54dec06ae21368) commit 17429a22a6026dc6601fc8e9ae4f0daecb30079a Author: Jason Ekstrand nir: Add a nop intrinsic (cherry picked from commit 7697b4b98b155c818811709becdb408772371538) commit a5c0b8784aacfc0e7d2ff90a92dd56bb53b97bdd Author: Nicolai Hähnle gallium/radeon: cleanup and fix branch emits (cherry picked from commit 6f87d7a14699277be6dd17e9e712841c4057c4df) commit bc04c92aef700a60a65ff567aed7f3e99a6d95b4 Author: Kenneth Graunke i965: Add missing BRW_NEW_VS_PROG_DATA to 3DSTATE_CLIP. (cherry picked from commit 28e1538be7923205231402ab928b61b670bd2962) commit eb9236e27591583c8fa20daaeb29bcab1ccb8ad8 Author: Vinson Lee Revert "mesa_glinterop: remove inclusion of GLX header" (cherry picked from commit c10dcb2ce837922c6ee4e191e6d6202098a5ee10) (cherry picked from commit c85b34ffd04f9a7a16fe30173474e857d0f42d5f) commit 341889d6ca85b9c7346e656b2eb65ac1007756a4 Author: Chad Versace egl: Don't advertise unsupported platform extensions (cherry picked from commit c177ef9d47943f648a13beed14269f468583c16e) commit 979e4b9c3f5b1272df807c0195a85d980c45ea29 Author: Tapani Pälli egl: add check that eglCreateContext gets a valid config (cherry picked from commit 5876f3c85a61d73bb4863331bd641152a40a7b0c) commit ac3abe534bb9986ce7ee1286854a3bb2a83568bc Author: Marek Olšák winsys/amdgpu: fix radeon_surf::macro_tile_index for imported textures (cherry picked from commit 6ec3b2a4b1d41b83a4721d06b42c49f55e695cbf) commit 3d4a219dd86eaab549e49c48ed1c8a0c922b5221 Author: Jason Ekstrand intel/blorp: Rework our usage of ralloc when compiling shaders (cherry picked from commit 43dadb6edd5e3e3e10b1198184a9f75556edad49) Cheers, Emil Mesa stable queue - Nominated (10) == Adam Jackson (2) deb0eb1 glx/glvnd: Don't modify the dummy slot in the 
dispatch table 8bca8d8 glx/glvnd: Fix dispatch function names and indices Haixia Shi (1) 8c56ff6 mesa: change state query return value for RGB565 Jason Ekstrand (1) 2a4a868 i965/fs/generator: Don't use the address immediate for MOV_INDIRECT Jordan Justen (1) 49c24d8 i965: fix noop_scissor range issue on width/height Note: temporary on hold since it causes GPU lockups on 32bit builds. Marek Olšák (5) 8b05076 gallium/radeon: unify viewport emission code 687c4be gallium/radeon: set VPORT_ZMIN/MAX registers correctly 03708de radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader 3e756f0 radeonsi: fix a crash in imageSize for cubemap arrays b425b57 radeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the kernel allows it Matt Turner (1) 0775523 anv: Replace "abi_versions" with correct "api_version". Queued (115) Axel Davy (4): gallium/util: Really allow aliasing of dst for u_box_union_* st/nine: Fix the calculation o
Re: [Mesa-dev] [PATCH 1/2] intel/common: Add an is_kabylake field to gen_device_info
On Tue, Nov 8, 2016 at 1:53 PM, Matt Turner wrote:
> On Tue, Nov 8, 2016 at 1:21 PM, Jason Ekstrand wrote:
> > Most of the 3-D engine Kaby Lake is identical to Sky Lake. However, there
>
> While hyphenating 3D looks a little odd to me, Skylake is definitely
> just a single word. (Strangely, Kaby Lake is indeed two words)
>
> > are a few small differences that we need to be able to detect.
> >
> > Signed-off-by: Jason Ekstrand
> > ---
> >  src/intel/common/gen_device_info.c | 14 +++++++++-----
> >  src/intel/common/gen_device_info.h |  1 +
> >  2 files changed, 10 insertions(+), 5 deletions(-)
> >
> > diff --git a/src/intel/common/gen_device_info.c b/src/intel/common/gen_device_info.c
> > index 30df0b2..3ff98f0 100644
> > --- a/src/intel/common/gen_device_info.c
> > +++ b/src/intel/common/gen_device_info.c
> > @@ -427,6 +427,10 @@ static const struct gen_device_info gen_device_info_bxt_2x6 = {
> >   * There's no KBL entry. Using the default SKL (GEN9) GS entries value.
> >   */
> >
> > +#define KBL_FEATURES \
>
> We don't have subgen FEATURES #defines for anything else. bxt, for
> instance just sets .is_broxton in its couple of fields. Not wrong, but
> doesn't seem particularly necessary for a single field.
>

We do for Haswell which is why I did it this way.

> I'd prefer to just put .is_kabylake in the KBL structs, unless you've
> got further plans.
>

Sure. I don't care much either way.

> With that fixed, both patches are
>
> Reviewed-by: Matt Turner
>

Thanks
Re: [Mesa-dev] [PATCH 1/2] intel/common: Add an is_kabylake field to gen_device_info
On Tue, Nov 8, 2016 at 1:21 PM, Jason Ekstrand wrote:
> Most of the 3-D engine Kaby Lake is identical to Sky Lake. However, there

While hyphenating 3D looks a little odd to me, Skylake is definitely
just a single word. (Strangely, Kaby Lake is indeed two words)

> are a few small differences that we need to be able to detect.
>
> Signed-off-by: Jason Ekstrand
> ---
>  src/intel/common/gen_device_info.c | 14 +++++++++-----
>  src/intel/common/gen_device_info.h |  1 +
>  2 files changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/src/intel/common/gen_device_info.c b/src/intel/common/gen_device_info.c
> index 30df0b2..3ff98f0 100644
> --- a/src/intel/common/gen_device_info.c
> +++ b/src/intel/common/gen_device_info.c
> @@ -427,6 +427,10 @@ static const struct gen_device_info gen_device_info_bxt_2x6 = {
>   * There's no KBL entry. Using the default SKL (GEN9) GS entries value.
>   */
>
> +#define KBL_FEATURES \

We don't have subgen FEATURES #defines for anything else. bxt, for
instance just sets .is_broxton in its couple of fields. Not wrong, but
doesn't seem particularly necessary for a single field.

I'd prefer to just put .is_kabylake in the KBL structs, unless you've
got further plans.

With that fixed, both patches are

Reviewed-by: Matt Turner

Neat find. :)
Re: [Mesa-dev] [PATCH 00/25] anv: A major rework of color attachment surface states
On Tue, Nov 8, 2016 at 1:36 PM, Nanley Chery wrote: > On Sat, Oct 22, 2016 at 10:50:31AM -0700, Jason Ekstrand wrote: > > This series does some fairly major surgery on color attachment surface > > state allocation and fill-out in the Intel Vulkan driver. This is in > > preparation for doing color compression, fast-clears, and HiZ-capable > input > > attachments. Naturally, as with everything else I've done in the last 2 > > months, it also involves some non-trivial blorp work. > > > > Let's start off at the beginning... For a variety of reasons, we can't > > really know 100% of the details of an attachment's surface state at any > > other places than vkCmdBeginRenderPass and vkCmdNextSubpss. The same > > applies for depth buffers if you consider 3DSTATE_DEPTH_BUFFER and > friends > > to be the depth and stencil buffer's "surface state". That's a fairly > > strong statement, but there are a couple of reasons for this: > > > > 1) In order for fast-clears to work, the surface state has to contain > the > > clear color. (This is it's own packet for HiZ but not for color.) > We > > don't know the clear value until BeginRenderPass. This means we > can't > > fully fill out the surface state in vkCmdCreateImageView. > > > > We could alternatively merge the view's surface state packet into > another that only contains the clear color(s) right? > Potentially, yes. However that adds a good bit of complication because we now have to emit render target surfaces on-the-fly because you may be building two different batches simultaneously that use the same VkImageView as a render target with two different clear colors. It also doesn't solve the null framebuffer problem. > - Nanley > > > 2) The Vulkan spec requires that you be able to call > vkBeginCommandBuffer > > on a secondary command buffer with USAGE_RENDER_PASS_CONTINUE_BIT set > > but with a null framebuffer. In this case, the secondary is supposed > > to inherit the framebuffer from the primary. 
(This is not something > we > > have properly implemented until now.) This means that anything that > is > > callable from a render-pass-continuing secondary command buffer has > to > > be able to operate without knowing any surface details that aren't > part > > of the VkRenderPass object. Basically, all you know is the Vulkan > > format (not the isl format) and the sample count. > > > > Between the two of those, about the only two entrypoints left at which we > > actually know surface details are vkCmdBeginRenderPass and > vkCmdNextSubpass > > so we have to figure out how to do everything there. As it turns out, > this > > works out surprisingly well. The format and the sample count turn out to > > be exactly the data we actually need in order to do all of our pipeline > > programming. The only hard part is refactoring things so that it pulls > the > > data from the render pass instead of the framebuffer. There are a number > > of places where we were grabbing the image view for an attachment because > > we either wanted to shove something into blorp or because we wanted the > > format and we were lazy. > > > > The approach taken in this patch series is the following: > > > > 1) Instead of allocating render target surface states in > vkCreateImageView, > > we allocate them as part of render pass setup in > vkCmdBeginRenderPass. > > All of the surface states we will ever need (including a null surface > > state) are allocated up-front out of a single contiguous block. > > > > 2) For secondary command buffers with USAGE_RENDER_PASS_CONTINUE_BIT > set, > > we allocate storage for all of the surface states but don't actually > > fill them out. In the secondary command buffer, all binding tables > > refer to these surface states rather than the ones in the primary. > > > > 3) A blorp entrypoint is added that performs a clear operation without > > touching the depth/stencil buffer state and with a color attachment > > binding table explicitly provided by the caller. 
This means that > even > > our blorp clears are using the surface states allocated in > > vkCmdBeginRenderPass. Unfortunately, this turned out to be more work > > than expected because I had to add vertex shader support to blorp > along > > the way. > > > > 4) Here's the tricky bit. When vkCmdExecuteCommands is called during a > > render pass, we use transform feedback (yeah, crazy) to copy the > block > > of surface states from the primary into the secondary right before > > executing the secondary. > > > > It's kind of a crazy scheme but I like the end result quite a bit. > > > > Cc: Kristian Høgsberg Kristensen > > Cc: Chad Versace > > Cc: Nanley Chery > > Cc: Topi Pohjolainen > > > > Jason Ekstrand (25): > > intel/isl: Add some basic info about RENDER_SURFACE_STATE to > > isl_device > > intel/genxml: Add SO_WRITE_OFFSET registers for gen7-9 > > anv: Add a helper f
Re: [Mesa-dev] [PATCH 1/3] anv/device: Don't even try to map memory with a size of 0
On Tue, Nov 8, 2016 at 1:46 PM, Nanley Chery wrote:
> On Mon, Nov 07, 2016 at 05:28:12PM -0800, Jason Ekstrand wrote:
> > Signed-off-by: Jason Ekstrand
> > Cc: "12.0 13.0"
> > ---
> >  src/intel/vulkan/anv_device.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> > index 5393144..8055893 100644
> > --- a/src/intel/vulkan/anv_device.c
> > +++ b/src/intel/vulkan/anv_device.c
> > @@ -1258,6 +1258,11 @@ VkResult anv_MapMemory(
> >     if (size == VK_WHOLE_SIZE)
> >        size = mem->bo.size - offset;
> >
> > +   if (size == 0) {
>
> The user isn't allowed to make such a call. Does this fix a CTS test?
>

Heh, so they aren't. It doesn't fix anything, it just ensures that you
never hit the ioctl with a size of zero. How about I replace it with an
assert?

> > +      *ppData = NULL;
> > +      return VK_SUCCESS;
> > +   }
> > +
> >     /* FIXME: Is this supposed to be thread safe? Since vkUnmapMemory() only
> >      * takes a VkDeviceMemory pointer, it seems like only one map of the memory
> >      * at a time is valid. We could just mmap up front and return an offset
> > --
> > 2.5.0.400.gff86faf
Re: [Mesa-dev] [PATCH 1/3] anv/device: Don't even try to map memory with a size of 0
On Mon, Nov 07, 2016 at 05:28:12PM -0800, Jason Ekstrand wrote:
> Signed-off-by: Jason Ekstrand
> Cc: "12.0 13.0"
> ---
>  src/intel/vulkan/anv_device.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> index 5393144..8055893 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -1258,6 +1258,11 @@ VkResult anv_MapMemory(
>     if (size == VK_WHOLE_SIZE)
>        size = mem->bo.size - offset;
>
> +   if (size == 0) {

The user isn't allowed to make such a call. Does this fix a CTS test?

> +      *ppData = NULL;
> +      return VK_SUCCESS;
> +   }
> +
>     /* FIXME: Is this supposed to be thread safe? Since vkUnmapMemory() only
>      * takes a VkDeviceMemory pointer, it seems like only one map of the memory
>      * at a time is valid. We could just mmap up front and return an offset
> --
> 2.5.0.400.gff86faf
Re: [Mesa-dev] [PATCH 2/3] nir: add conditional discard optimisation (v3)
Dave Airlie writes: > From: Dave Airlie > > This is ported from GLSL and converts > > if (cond) > discard; > > into > discard_if(cond); > > This removes a block, but also is needed by radv > to workaround a bug in the LLVM backend. > > v2: handle if (a) discard_if(b) (nha) > cleanup and drop pointless loop (Matt) > make sure there are no dependent phis (Eric) > v3: make sure only one instruction in the then block. > > Signed-off-by: Dave Airlie > --- > diff --git a/src/compiler/nir/nir_opt_conditional_discard.c > b/src/compiler/nir/nir_opt_conditional_discard.c > new file mode 100644 > index 000..6e90983 > --- /dev/null > +++ b/src/compiler/nir/nir_opt_conditional_discard.c > @@ -0,0 +1,125 @@ > +/* > + * Copyright © 2016 Red Hat > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the next > + * paragraph) shall be included in all copies or substantial portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > DEALINGS > + * IN THE SOFTWARE. 
> + */ > + > +#include "nir.h" > +#include "nir_builder.h" > + > +/** @file nir_opt_conditional_discard.c > + * > + * Handles optimization of lowering if (cond) discard to discard_if(cond). > + */ Maybe put some quotes around "if (cond) discard" to clarify what statement is being lowered. > +static bool > +nir_opt_conditional_discard_block(nir_block *block, void *mem_ctx) > +{ > + nir_builder bld; > + > + if (nir_cf_node_is_first(&block->cf_node)) > + return false; > + > + nir_cf_node *prev_node = nir_cf_node_prev(&block->cf_node); > + if (prev_node->type != nir_cf_node_if) > + return false; > + > + nir_if *if_stmt = nir_cf_node_as_if(prev_node); > + nir_block *then_block = nir_if_first_then_block(if_stmt); > + nir_block *else_block = nir_if_first_else_block(if_stmt); > + > + /* check there is only one else block and it is empty */ > + if (nir_if_last_else_block(if_stmt) != else_block) > + return false; > + if (!exec_list_is_empty(&else_block->instr_list)) > + return false; > + > + /* check there is only one then block and it has only one instruction in > it */ > + if (nir_if_last_then_block(if_stmt) != then_block) > + return false; > + if (exec_list_is_empty(&then_block->instr_list)) > + return false; > + if (exec_list_length(&then_block->instr_list) > 1) > + return false; > + /* > +* make sure no subsequent phi nodes point at this if. 
> +*/ > + nir_block *after = > nir_cf_node_as_block(nir_cf_node_next(&if_stmt->cf_node)); > + nir_foreach_instr_safe(instr, after) { > + if (instr->type != nir_instr_type_phi) > + break; > + nir_phi_instr *phi = nir_instr_as_phi(instr); > + > + nir_foreach_phi_src(phi_src, phi) { > + if (phi_src->pred == then_block || > + phi_src->pred == else_block) > +return false; > + } > + } > + > + /* Get the first instruction in the then block and confirm it is > +* a discard or a discard_if > +*/ > + nir_instr *instr = nir_block_first_instr(then_block); > + if (instr->type != nir_instr_type_intrinsic) > + return false; > + > + nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr); > + if (intrin->intrinsic != nir_intrinsic_discard && > + intrin->intrinsic != nir_intrinsic_discard_if) > + return false; > + > + nir_src cond; > + > + nir_builder_init(&bld, mem_ctx); Missing bld.cursor initialization, so the adding-to-a-discard_if case should crash. With that fixed, this will be: Reviewed-by: Eric Anholt signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
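The eligibility checks in the quoted pass can be restated on a toy IR, which may help when reading the flattened diff above. This is a sketch under assumptions, not the NIR API: every `toy_*` name is hypothetical, conditions are plain integer ids, and the phi-node check and the builder cursor that Eric's review comment refers to are not modelled. Combining the two conditions with a logical AND for the `if (a) discard_if(b)` case is what the semantics require, but how the real pass emits that is not shown in the quoted text.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy IR for illustration only; this is not the NIR API. */
enum toy_op { TOY_DISCARD, TOY_DISCARD_IF, TOY_OTHER };

struct toy_instr {
   enum toy_op op;
   int cond;        /* condition id; only meaningful for TOY_DISCARD_IF */
};

struct toy_if {
   int cond;                            /* condition id of the if itself */
   const struct toy_instr *then_instrs;
   size_t then_count;
   size_t else_count;
};

/* Describes the replacement: discard_if(cond_a) when cond_b is -1,
 * otherwise discard_if(cond_a && cond_b).
 */
struct toy_discard_if {
   int cond_a;
   int cond_b;
};

static bool
toy_opt_conditional_discard(const struct toy_if *stmt,
                            struct toy_discard_if *out)
{
   /* The else side must be empty. */
   if (stmt->else_count != 0)
      return false;

   /* The then side must hold exactly one instruction (the v3 change). */
   if (stmt->then_count != 1)
      return false;

   const struct toy_instr *instr = &stmt->then_instrs[0];

   switch (instr->op) {
   case TOY_DISCARD:
      /* if (a) discard;  becomes  discard_if(a) */
      out->cond_a = stmt->cond;
      out->cond_b = -1;
      return true;
   case TOY_DISCARD_IF:
      /* if (a) discard_if(b);  becomes  discard_if(a && b) */
      out->cond_a = stmt->cond;
      out->cond_b = instr->cond;
      return true;
   default:
      return false;
   }
}
```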
Re: [Mesa-dev] [PATCH 00/25] anv: A major rework of color attachment surface states
On Sat, Oct 22, 2016 at 10:50:31AM -0700, Jason Ekstrand wrote: > This series does some fairly major surgery on color attachment surface > state allocation and fill-out in the Intel Vulkan driver. This is in > preparation for doing color compression, fast-clears, and HiZ-capable input > attachments. Naturally, as with everything else I've done in the last 2 > months, it also involves some non-trivial blorp work. > > Let's start off at the beginning... For a variety of reasons, we can't > really know 100% of the details of an attachment's surface state at any > other places than vkCmdBeginRenderPass and vkCmdNextSubpss. The same > applies for depth buffers if you consider 3DSTATE_DEPTH_BUFFER and friends > to be the depth and stencil buffer's "surface state". That's a fairly > strong statement, but there are a couple of reasons for this: > > 1) In order for fast-clears to work, the surface state has to contain the > clear color. (This is it's own packet for HiZ but not for color.) We > don't know the clear value until BeginRenderPass. This means we can't > fully fill out the surface state in vkCmdCreateImageView. > We could alternatively merge the view's surface state packet into another that only contains the clear color(s) right? - Nanley > 2) The Vulkan spec requires that you be able to call vkBeginCommandBuffer > on a secondary command buffer with USAGE_RENDER_PASS_CONTINUE_BIT set > but with a null framebuffer. In this case, the secondary is supposed > to inherit the framebuffer from the primary. (This is not something we > have properly implemented until now.) This means that anything that is > callable from a render-pass-continuing secondary command buffer has to > be able to operate without knowing any surface details that aren't part > of the VkRenderPass object. Basically, all you know is the Vulkan > format (not the isl format) and the sample count. 
> > Between the two of those, about the only two entrypoints left at which we > actually know surface details are vkCmdBeginRenderPass and vkCmdNextSubpass > so we have to figure out how to do everything there. As it turns out, this > works out surprisingly well. The format and the sample count turn out to > be exactly the data we actually need in order to do all of our pipeline > programming. The only hard part is refactoring things so that it pulls the > data from the render pass instead of the framebuffer. There are a number > of places where we were grabbing the image view for an attachment because > we either wanted to shove something into blorp or because we wanted the > format and we were lazy. > > The approach taken in this patch series is the following: > > 1) Instead of allocating render target surface states in vkCreateImageView, > we allocate them as part of render pass setup in vkCmdBeginRenderPass. > All of the surface states we will ever need (including a null surface > state) are allocated up-front out of a single contiguous block. > > 2) For secondary command buffers with USAGE_RENDER_PASS_CONTINUE_BIT set, > we allocate storage for all of the surface states but don't actually > fill them out. In the secondary command buffer, all binding tables > refer to these surface states rather than the ones in the primary. > > 3) A blorp entrypoint is added that performs a clear operation without > touching the depth/stencil buffer state and with a color attachment > binding table explicitly provided by the caller. This means that even > our blorp clears are using the surface states allocated in > vkCmdBeginRenderPass. Unfortunately, this turned out to be more work > than expected because I had to add vertex shader support to blorp along > the way. > > 4) Here's the tricky bit. 
When vkCmdExecuteCommands is called during a > render pass, we use transform feedback (yeah, crazy) to copy the block > of surface states from the primary into the secondary right before > executing the secondary. > > It's kind of a crazy scheme but I like the end result quite a bit. > > Cc: Kristian Høgsberg Kristensen > Cc: Chad Versace > Cc: Nanley Chery > Cc: Topi Pohjolainen > > Jason Ekstrand (25): > intel/isl: Add some basic info about RENDER_SURFACE_STATE to > isl_device > intel/genxml: Add SO_WRITE_OFFSET registers for gen7-9 > anv: Add a helper for doing buffer copies with nothing but VF and SOL. > anv/cmd_buffer: Use the surface state alloc helper in > null_surface_state > anv/cmd_buffer: Expose add_surface_state_reloc as an inline helper > anv: Rework the way render target surfaces are allocated > anv/cmd_buffer: Stop relying on the framebuffer for 3DSTATE_SF on gen7 > intel/genxml: Make some VS/GS fields consistent across gens > intel/blorp: Make the number of samples an explicit parameter > intel/blorp: Add a shader type to make keys more unique > intel/blorp: Remove NIR support for
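The allocation scheme in points 1, 2 and 4 of the quoted cover letter can be pictured with a small host-side model. This is a loose sketch under several assumptions: all `toy_*` names, the 64-byte state size, and the choice to put the null surface state in the last slot are invented for illustration, and a CPU `memcpy` stands in for the GPU-side copy that the cover letter says is done with transform feedback.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_SURFACE_STATE_SIZE 64   /* assumed size, for illustration only */

struct toy_state_block {
   uint32_t count;   /* color attachments plus one null surface state */
   uint8_t *map;     /* count * TOY_SURFACE_STATE_SIZE contiguous bytes */
};

/* Allocate every surface state the render pass may need as one zeroed,
 * contiguous block. A secondary command buffer would stop here and leave
 * the contents unfilled.
 */
static int
toy_state_block_alloc(struct toy_state_block *blk, uint32_t attachment_count)
{
   blk->count = attachment_count + 1;
   blk->map = calloc(blk->count, TOY_SURFACE_STATE_SIZE);
   return blk->map ? 0 : -1;
}

static uint8_t *
toy_state_block_get(struct toy_state_block *blk, uint32_t index)
{
   return blk->map + (size_t)index * TOY_SURFACE_STATE_SIZE;
}

/* Primary command buffer: fill one attachment's state. The marker byte
 * stands in for real contents such as the format and clear color.
 */
static void
toy_fill_state(struct toy_state_block *blk, uint32_t index, uint8_t marker)
{
   memset(toy_state_block_get(blk, index), marker, TOY_SURFACE_STATE_SIZE);
}

/* Stand-in for what happens when the secondary is executed inside a
 * render pass: the primary's whole block is copied into the secondary's.
 */
static int
toy_execute_secondary(const struct toy_state_block *primary,
                      struct toy_state_block *secondary)
{
   if (primary->count != secondary->count)
      return -1;

   memcpy(secondary->map, primary->map,
          (size_t)primary->count * TOY_SURFACE_STATE_SIZE);
   return 0;
}
```

Because the block is contiguous, the copy is a single operation regardless of how many attachments the render pass has, which is presumably part of why the series allocates the states this way.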
[Mesa-dev] [PATCH 2/2] i965/compiler: Disable trig workarounds on KBL+
The precision of our trig instructions appears to have been fixed on
Kaby Lake. Neither Ben nor I can find any documentation for this. However,
the dEQP precision tests now pass with INTEL_PRECISE_TRIG=0 where they fail
on Sky Lake.

Signed-off-by: Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_nir.c                   | 5 ++++-
 src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py | 7 ++++---
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
index a93d825..1069438 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -449,6 +449,7 @@ nir_optimize(nir_shader *nir, bool is_scalar)
 nir_shader *
 brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir)
 {
+   const struct gen_device_info *devinfo = compiler->devinfo;
    bool progress; /* Written by OPT and OPT_V */
    (void)progress;

@@ -457,7 +458,9 @@ brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir)
    if (nir->stage == MESA_SHADER_GEOMETRY)
       OPT(nir_lower_gs_intrinsics);

-   if (compiler->precise_trig)
+   /* See also brw_nir_trig_workarounds.py */
+   if (compiler->precise_trig &&
+       !(devinfo->gen >= 10 || devinfo->is_kabylake))
       OPT(brw_nir_apply_trig_workarounds);

    static const nir_lower_tex_options tex_options = {
diff --git a/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py b/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py
index 67dab9a..3b8d0ce 100755
--- a/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py
+++ b/src/mesa/drivers/dri/i965/brw_nir_trig_workarounds.py
@@ -23,9 +23,10 @@

 import nir_algebraic

-# The SIN and COS instructions on Intel hardware can produce values
-# slightly outside of the [-1.0, 1.0] range for a small set of values.
-# Obviously, this can break everyone's expectations about trig functions.
+# Prior to Kaby Lake, the SIN and COS instructions on Intel hardware can
+# produce values slightly outside of the [-1.0, 1.0] range for a small set of
+# values. Obviously, this can break everyone's expectations about trig
+# functions. This appears to be fixed in Kaby Lake.
 #
 # According to an internal presentation, the COS instruction can produce
 # a value up to 1.27 for inputs in the range (0.08296, 0.09888). One
--
2.5.0.400.gff86faf
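The gating condition this patch adds to brw_preprocess_nir() can be read in isolation as a small predicate. The sketch below is only an illustration: `toy_device_info` and `toy_needs_trig_workaround` are hypothetical names, and the struct carries just the two fields the condition reads rather than the real gen_device_info layout.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for the fields the patch reads. */
struct toy_device_info {
   int gen;
   bool is_kabylake;
};

/* Apply the workaround only when precise trig was requested and the
 * hardware is older than Kaby Lake, i.e. neither gen 10+ nor KBL.
 */
static bool
toy_needs_trig_workaround(bool precise_trig,
                          const struct toy_device_info *devinfo)
{
   return precise_trig &&
          !(devinfo->gen >= 10 || devinfo->is_kabylake);
}
```

By De Morgan's law the hardware half of the condition is the same as `devinfo->gen < 10 && !devinfo->is_kabylake`; the patch's spelling reads as "not on any part where the problem appears to be fixed".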
[Mesa-dev] [PATCH 1/2] intel/common: Add an is_kabylake field to gen_device_info
Most of the 3-D engine Kaby Lake is identical to Sky Lake. However, there
are a few small differences that we need to be able to detect.

Signed-off-by: Jason Ekstrand
---
 src/intel/common/gen_device_info.c | 14 +++++++++-----
 src/intel/common/gen_device_info.h |  1 +
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/intel/common/gen_device_info.c b/src/intel/common/gen_device_info.c
index 30df0b2..3ff98f0 100644
--- a/src/intel/common/gen_device_info.c
+++ b/src/intel/common/gen_device_info.c
@@ -427,6 +427,10 @@ static const struct gen_device_info gen_device_info_bxt_2x6 = {
  * There's no KBL entry. Using the default SKL (GEN9) GS entries value.
  */

+#define KBL_FEATURES \
+   GEN9_FEATURES, \
+   .is_kabylake = true
+
 /*
  * Both SKL and KBL support a maximum of 64 threads per
  * Pixel Shader Dispatch (PSD) unit.
@@ -434,7 +438,7 @@ static const struct gen_device_info gen_device_info_bxt_2x6 = {
 #define KBL_MAX_THREADS_PER_PSD 64

 static const struct gen_device_info gen_device_info_kbl_gt1 = {
-   GEN9_FEATURES,
+   KBL_FEATURES,
    .gt = 1,

    .max_cs_threads = 7 * 6,
@@ -444,7 +448,7 @@ static const struct gen_device_info gen_device_info_kbl_gt1 = {
 };

 static const struct gen_device_info gen_device_info_kbl_gt1_5 = {
-   GEN9_FEATURES,
+   KBL_FEATURES,
    .gt = 1,

    .max_cs_threads = 7 * 6,
@@ -453,7 +457,7 @@ static const struct gen_device_info gen_device_info_kbl_gt1_5 = {
 };

 static const struct gen_device_info gen_device_info_kbl_gt2 = {
-   GEN9_FEATURES,
+   KBL_FEATURES,
    .gt = 2,

    .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 3,
@@ -461,7 +465,7 @@ static const struct gen_device_info gen_device_info_kbl_gt2 = {
 };

 static const struct gen_device_info gen_device_info_kbl_gt3 = {
-   GEN9_FEATURES,
+   KBL_FEATURES,
    .gt = 3,

    .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 6,
@@ -469,7 +473,7 @@ static const struct gen_device_info gen_device_info_kbl_gt3 = {
 };

 static const struct gen_device_info gen_device_info_kbl_gt4 = {
-   GEN9_FEATURES,
+   KBL_FEATURES,
    .gt = 4,

    .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 9,
diff --git a/src/intel/common/gen_device_info.h b/src/intel/common/gen_device_info.h
index 10324e6..53ac5f6 100644
--- a/src/intel/common/gen_device_info.h
+++ b/src/intel/common/gen_device_info.h
@@ -41,6 +41,7 @@ struct gen_device_info
    bool is_haswell;
    bool is_cherryview;
    bool is_broxton;
+   bool is_kabylake;

    bool has_hiz_and_separate_stencil;
    bool must_use_separate_stencil;
--
2.5.0.400.gff86faf
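The two styles discussed in review of this patch, a per-platform FEATURES macro versus setting the field directly in each struct, can be compared on a cut-down example. This is a sketch with hypothetical names (`toy_device_info`, `TOY_GEN9_FEATURES`, `TOY_KBL_FEATURES`); the real GEN9_FEATURES macro sets many more fields than the single one assumed here.

```c
#include <stdbool.h>

/* Hypothetical, cut-down device description. */
struct toy_device_info {
   int gen;
   bool is_kabylake;
   int gt;
};

/* Neither macro ends in a comma; the initializer that uses it supplies
 * one, just as the patch writes "KBL_FEATURES," inside each struct.
 */
#define TOY_GEN9_FEATURES \
   .gen = 9

#define TOY_KBL_FEATURES \
   TOY_GEN9_FEATURES, \
   .is_kabylake = true

/* Macro style, as in the posted patch. */
static const struct toy_device_info toy_kbl_gt2_macro = {
   TOY_KBL_FEATURES,
   .gt = 2,
};

/* Explicit style, as preferred in review. */
static const struct toy_device_info toy_kbl_gt2_explicit = {
   TOY_GEN9_FEATURES,
   .is_kabylake = true,
   .gt = 2,
};

/* A non-KBL gen9 part: fields that are not named are zero-initialized,
 * so is_kabylake is false without being mentioned.
 */
static const struct toy_device_info toy_skl_gt2 = {
   TOY_GEN9_FEATURES,
   .gt = 2,
};
```

Both styles produce identical structs; the macro saves repetition when several fields differ per platform, while the explicit form keeps a single-field difference visible at the point of use.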
Re: [Mesa-dev] [PATCH 02/59 v2] glsl/standalone: Optimize add-of-neg to subtract
ping On 10/26/2016 07:17 PM, Ian Romanick wrote: > From: Ian Romanick > > This just makes the output of the standalone compiler a little more > compact. > > v2: Fix indexing typo noticed by Iago. Move the add_neg_to_sub_visitor > to it's own header file. Add a unit test that exercises the visitor. > Both the neg_a_plus_b and neg_a_plus_neg_b tests reproduced the bug that > Iago discovered. > > Signed-off-by: Ian Romanick > --- > src/compiler/Makefile.glsl.am | 1 + > src/compiler/glsl/opt_add_neg_to_sub.h | 61 ++ > src/compiler/glsl/standalone.cpp | 4 + > .../glsl/tests/opt_add_neg_to_sub_test.cpp | 210 > + > 4 files changed, 276 insertions(+) > create mode 100644 src/compiler/glsl/opt_add_neg_to_sub.h > create mode 100644 src/compiler/glsl/tests/opt_add_neg_to_sub_test.cpp > > diff --git a/src/compiler/Makefile.glsl.am b/src/compiler/Makefile.glsl.am > index 80dfb73..4de51e4 100644 > --- a/src/compiler/Makefile.glsl.am > +++ b/src/compiler/Makefile.glsl.am > @@ -69,6 +69,7 @@ glsl_tests_general_ir_test_SOURCES = > \ > glsl/tests/builtin_variable_test.cpp\ > glsl/tests/invalidate_locations_test.cpp\ > glsl/tests/general_ir_test.cpp \ > + glsl/tests/opt_add_neg_to_sub_test.cpp \ > glsl/tests/varyings_test.cpp > glsl_tests_general_ir_test_CFLAGS = \ > $(PTHREAD_CFLAGS) > diff --git a/src/compiler/glsl/opt_add_neg_to_sub.h > b/src/compiler/glsl/opt_add_neg_to_sub.h > new file mode 100644 > index 000..9f97071 > --- /dev/null > +++ b/src/compiler/glsl/opt_add_neg_to_sub.h > @@ -0,0 +1,61 @@ > +/* > + * Copyright © 2016 Intel Corporation > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, 
subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the next > + * paragraph) shall be included in all copies or substantial portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > + * DEALINGS IN THE SOFTWARE. > + */ > + > +#ifndef OPT_ADD_NEG_TO_SUB_H > +#define OPT_ADD_NEG_TO_SUB_H > + > +#include "ir.h" > +#include "ir_hierarchical_visitor.h" > + > +class add_neg_to_sub_visitor : public ir_hierarchical_visitor { > +public: > + add_neg_to_sub_visitor() > + { > + /* empty */ > + } > + > + ir_visitor_status visit_leave(ir_expression *ir) > + { > + if (ir->operation != ir_binop_add) > + return visit_continue; > + > + for (unsigned i = 0; i < 2; i++) { > + ir_expression *const op = ir->operands[i]->as_expression(); > + > + if (op != NULL && op->operation == ir_unop_neg) { > +ir->operation = ir_binop_sub; > + > +/* This ensures that -a + b becomes b - a. 
*/ > +if (i == 0) > + ir->operands[0] = ir->operands[1]; > + > +ir->operands[1] = op->operands[0]; > +break; > + } > + } > + > + return visit_continue; > + } > +}; > + > +#endif /* OPT_ADD_NEG_TO_SUB_H */ > diff --git a/src/compiler/glsl/standalone.cpp > b/src/compiler/glsl/standalone.cpp > index 055c433..07793a9 100644 > --- a/src/compiler/glsl/standalone.cpp > +++ b/src/compiler/glsl/standalone.cpp > @@ -37,6 +37,7 @@ > #include "standalone_scaffolding.h" > #include "standalone.h" > #include "util/string_to_uint_map.h" > +#include "opt_add_neg_to_sub.h" > > static const struct standalone_options *options; > > @@ -441,6 +442,9 @@ standalone_compile_shader(const struct standalone_options > *_options, > if (!shader) > continue; > > + add_neg_to_sub_visitor v; > + visit_list_elements(&v, shader->ir); > + > shader->Program = rzalloc(shader, gl_program); > init_gl_program(shader->Program, shader->Stage); >} > diff --git a/src/compiler/glsl/tests/opt_add_neg_to_sub_test.cpp > b/src/compiler/glsl/tests/opt_add_neg_to_sub_test.cpp > new file mode 100644 > index 000..b82e47f > --- /dev/null > +++ b/src/compiler/glsl/tests/opt_add_neg_to_sub_test.cpp >
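The rewrite performed by the quoted add_neg_to_sub_visitor can be sketched outside of Mesa. This is only an illustration in plain C: the struct expr type and the OP_* names below are hypothetical stand-ins, not Mesa's ir_expression API, and the function mirrors the quoted visit_leave() logic rather than reproducing it.

```c
#include <stddef.h>

/* Hypothetical miniature IR; Mesa's real ir_expression is a C++ class. */
enum op { OP_VAR, OP_NEG, OP_ADD, OP_SUB };

struct expr {
   enum op operation;
   struct expr *operands[2];
   const char *name; /* only meaningful for OP_VAR */
};

/* Rewrite a + (-b) to a - b, and (-a) + b to b - a, in place.
 * Returns 1 if the node was rewritten, 0 otherwise. */
static int add_neg_to_sub(struct expr *e)
{
   if (e->operation != OP_ADD)
      return 0;

   for (unsigned i = 0; i < 2; i++) {
      struct expr *op = e->operands[i];

      if (op != NULL && op->operation == OP_NEG) {
         e->operation = OP_SUB;

         /* For -a + b, move b into slot 0 first so the result is b - a. */
         if (i == 0)
            e->operands[0] = e->operands[1];

         e->operands[1] = op->operands[0];
         return 1;
      }
   }

   return 0;
}
```

When both operands are negations, only the first one is folded, so (-a) + (-b) becomes (-b) - a, which is still the same value.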
Re: [Mesa-dev] clover: Add CL_PROGRAM_BINARY_TYPE support (CL1.2).
On Sunday 06 November 2016 17:02:26 Dieter Nützel wrote: > After latest clover commit 'luxmark-v3.0' sigfault immediately: Hello Did you bisect it? Luxmark seems to crash just the same here without this commit. Serge > > SOURCE/luxmark-v3.0> ./luxmark > ./luxmark.bin: /usr/local/lib/libOpenCL.so.1: no version information > available (required by ./luxmark.bin) > *** Error in `./luxmark.bin': corrupted double-linked list: > 0x7f51a57829e0 *** > === Backtrace: = > /lib64/libc.so.6(+0x727df)[0x7f51e49847df] > /lib64/libc.so.6(+0x7804e)[0x7f51e498a04e] > /lib64/libc.so.6(+0x782d4)[0x7f51e498a2d4] > /lib64/libc.so.6(+0x78d01)[0x7f51e498ad01] > /usr/local/lib/libOpenCL.so.1(+0x20b418)[0x7f51e53d7418] > /usr/local/lib/libOpenCL.so.1(+0x20bcc7)[0x7f51e53d7cc7] > /usr/local/lib/libOpenCL.so.1(clReleaseMemObject+0x40)[0x7f51e53ba700] > ./luxmark.bin(_ZN3slg23PathOCLBaseRenderThread13FreeOCLBufferEPPN2cl6BufferE > +0x52)[0x7bfb92] > ./luxmark.bin(_ZN3slg23PathOCLBaseRenderThread4StopEv+0x6c)[0x7c3f0c] > ./luxmark.bin(_ZN3slg19PathOCLRenderThread4StopEv+0x9)[0x701a09] > ./luxmark.bin(_ZN3slg23PathOCLBaseRenderEngine12StopLockLessEv+0x97)[0x7bba0 > 7] ./luxmark.bin(_ZN3slg12RenderEngine4StopEv+0x26)[0x666a26] > ./luxmark.bin(_ZN3slg13RenderSessionD1Ev+0x7e)[0x658e8e] > ./luxmark.bin(_ZN7luxcore13RenderSessionD1Ev+0x22)[0x5e5502] > ./luxmark.bin[0x5e1db2] > ./luxmark.bin[0x5c69a4] > ./luxmark.bin[0x5c6b6f] > ./luxmark.bin[0x5cf6de] > ./luxmark.bin[0x86e83c] > ./luxmark.bin[0x8742d7] > ./luxmark.bin[0xf981ad] > ./luxmark.bin[0xf9a770] > ./luxmark.bin[0xfbb67f] > ./luxmark.bin[0x8ed424] > ./luxmark.bin[0xf9705f] > ./luxmark.bin[0xf9735e] > ./luxmark.bin[0xf9b58b] > ./luxmark.bin[0x5ad82a] > /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f51e4933b05] > ./luxmark.bin[0x5bea37] > === Memory map: > 0040-01e7f000 r-xp 09:00 1054819 > /tmp/INSTALL/SOURCE/luxmark-v3.0/luxmark.bin > 0207f000-020f5000 r--p 01a7f000 09:00 1054819 > /tmp/INSTALL/SOURCE/luxmark-v3.0/luxmark.bin > 020f5000-0000 
rw-p 01af5000 09:00 1054819 > /tmp/INSTALL/SOURCE/luxmark-v3.0/luxmark.bin > 0000-02284000 rw-p 00:00 0 > 02882000-03246000 rw-p 00:00 0 > [heap] > 7f519bfde000-7f51a400 rw-p 00:00 0 > 7f51a400-7f51a7ff8000 rw-p 00:00 0 > 7f51a7ff8000-7f51a800 ---p 00:00 0 > 7f51a8855000-7f51a9219000 rw-s 12cb3b000 00:06 13357 > /dev/dri/renderD128 > 7f51a9219000-7f51ac00 rw-p 00:00 0 > 7f51ac00-7f51ac0e3000 rw-p 00:00 0 > 7f51ac0e3000-7f51b000 ---p 00:00 0 > 7f51b000-7f51b00e4000 rw-p 00:00 0 > 7f51b00e4000-7f51b400 ---p 00:00 0 > 7f51b400-7f51b40e3000 rw-p 00:00 0 > 7f51b40e3000-7f51b800 ---p 00:00 0 > 7f51b8217000-7f51baffe000 rw-p 00:00 0 > 7f51baffe000-7f51bafff000 ---p 00:00 0 > 7f51bafff000-7f51bb7ff000 rw-p 00:00 0 > 7f51bb7ff000-7f51bb80 ---p 00:00 0 > 7f51bb80-7f51bc00 rw-p 00:00 0 > 7f51bc00-7f51bc0e3000 rw-p 00:00 0 > 7f51bc0e3000-7f51c000 ---p 00:00 0 > 7f51c000-7f51c00e3000 rw-p 00:00 0 > 7f51c00e3000-7f51c400 ---p 00:00 0 > 7f51c400-7f51c40e3000 rw-p 00:00 0 > 7f51c40e3000-7f51c800 ---p 00:00 0 > 7f51c800-7f51c80e3000 rw-p 00:00 0 > 7f51c80e3000-7f51cc00 ---p 00:00 0 > 7f51cc00-7f51cc0e3000 rw-p 00:00 0 > 7f51cc0e3000-7f51d000 ---p 00:00 0 > 7f51d0216000-7f51d0217000 ---p 00:00 0 > 7f51d0217000-7f51d0a17000 rw-p 00:00 0 > 7f51d1218000-7f51d400 rw-p 00:00 0 > 7f51d400-7f51d7ef8000 rw-p 00:00 0 > 7f51d7ef8000-7f51d800 ---p 00:00 0 > 7f51d8962000-7f51d8963000 ---p 00:00 0 > 7f51d8963000-7f51d9163000 rw-p 00:00 0 > 7f51d9163000-7f51d9164000 ---p 00:00 0 > 7f51d9164000-7f51d9964000 rw-p 00:00 0 > 7f51d9964000-7f51d9965000 ---p 00:00 0 > 7f51d9965000-7f51da165000 rw-p 00:00 0 > 7f51da165000-7f51da166000 ---p 00:00 0 > 7f51da166000-7f51da966000 rw-p 00:00 0 > 7f51da966000-7f51da967000 ---p 00:00 0 > 7f51da967000-7f51db167000 rw-p 00:00 0 > 7f51db167000-7f51db168000 ---p 00:00 0 > 7f51db168000-7f51db968000 rw-p 00:00 0 > 7f51dbcb1000-7f51dbd5b000 r--p 09:00 4983414 > /usr/share/fonts/truetype/DejaVuSans-Bold.ttf > 7f51dbd5b000-7f51dbd5c000 ---p 00:00 0 > 
7f51dbd5c000-7f51dc55c000 rw-p 00:00 0 > 7f51dc55c000-7f51dc55d000 ---p 00:00 0 > 7f51dc55d000-7f51dcd5d000 rw-p 00:00 0 > 7f51dcd5d000-7f51dcd5e000 ---p 00:00 0 > 7f51dcd5e000-7f51dd55e000 rw-p 00:00 0 > 7f51dd55e000-7f51dd56 r-xp 09:00 3554964 > /usr/lib64/libXinerama.so.1.0.0 > 7f51dd56-7f51dd75f000 ---p 2000 09:00 3554964 > /usr/lib64/libXinerama.so.1.0.0 > 7f51dd75f000-7f51d
[Mesa-dev] [PATCH 1/2] glcpp: Handle '#version 0' and other invalid values
From: Ian Romanick

The #version directive can only handle decimal constants. Enforce that
the value is a decimal constant.

Section 3.3 (Preprocessor) of the GLSL 4.50 spec says:

    The language version a shader is written to is specified by

        #version number profile opt

    where number must be a version of the language, following the same
    convention as __VERSION__ above.

The same section also says:

    __VERSION__ will substitute a decimal integer reflecting the version
    number of the OpenGL shading language.

Use a separate flag to track whether or not the #version line has been
encountered. Any possible sentinel (0 is currently used) could be
specified in a #version directive. This would lead to trying to
(internally) redefine __VERSION__. Since there is no parser location
for this addition, NULL is passed. This eventually results in a NULL
dereference and a segfault.

Attempts to use -1 as the sentinel would also fail if
'#version 4294967295' or '#version 18446744073709551615' were used. We
should have piglit tests for both of these.

Signed-off-by: Ian Romanick
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97420
Cc: mesa-sta...@lists.freedesktop.org
Cc: Juan A. Suarez Romero
Cc: Karol Herbst
---
 src/compiler/glsl/glcpp/glcpp-parse.y | 25 +++--
 src/compiler/glsl/glcpp/glcpp.h       |  9 +
 2 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y b/src/compiler/glsl/glcpp/glcpp-parse.y
index b80ff04..63012bc 100644
--- a/src/compiler/glsl/glcpp/glcpp-parse.y
+++ b/src/compiler/glsl/glcpp/glcpp-parse.y
@@ -177,7 +177,7 @@ add_builtin_define(glcpp_parser_t *parser, const char *name, int value);
  * (such as the and start conditions in the lexer). */
 %token DEFINED ELIF_EXPANDED HASH_TOKEN DEFINE_TOKEN FUNC_IDENTIFIER OBJ_IDENTIFIER ELIF ELSE ENDIF ERROR_TOKEN IF IFDEF IFNDEF LINE PRAGMA UNDEF VERSION_TOKEN GARBAGE IDENTIFIER IF_EXPANDED INTEGER INTEGER_STRING LINE_EXPANDED NEWLINE OTHER PLACEHOLDER SPACE PLUS_PLUS MINUS_MINUS
 %token PASTE
-%type INTEGER operator SPACE integer_constant
+%type INTEGER operator SPACE integer_constant version_constant
 %type expression
 %type IDENTIFIER FUNC_IDENTIFIER OBJ_IDENTIFIER INTEGER_STRING OTHER ERROR_TOKEN PRAGMA
 %type identifier_list
@@ -419,14 +419,14 @@ control_line_success:
 |   HASH_TOKEN ENDIF {
         _glcpp_parser_skip_stack_pop (parser, & @1);
     } NEWLINE
-|   HASH_TOKEN VERSION_TOKEN integer_constant NEWLINE {
-        if (parser->version != 0) {
+|   HASH_TOKEN VERSION_TOKEN version_constant NEWLINE {
+        if (parser->version_set) {
             glcpp_error(& @1, parser, "#version must appear on the first line");
         }
         _glcpp_parser_handle_version_declaration(parser, $3, NULL, true);
     }
-|   HASH_TOKEN VERSION_TOKEN integer_constant IDENTIFIER NEWLINE {
-        if (parser->version != 0) {
+|   HASH_TOKEN VERSION_TOKEN version_constant IDENTIFIER NEWLINE {
+        if (parser->version_set) {
             glcpp_error(& @1, parser, "#version must appear on the first line");
         }
         _glcpp_parser_handle_version_declaration(parser, $3, $4, true);
@@ -465,6 +465,17 @@ integer_constant:
         $$ = $1;
     }
 
+version_constant:
+    INTEGER_STRING {
+        /* Both octal and hexadecimal constants begin with 0. */
+        if ($1[0] == '0' && $1[1] != '\0') {
+            glcpp_error(&@1, parser, "invalid #version \"%s\" (not a decimal constant)", $1);
+            $$ = 0;
+        } else {
+            $$ = strtoll($1, NULL, 10);
+        }
+    }
+
 expression:
     integer_constant {
         $$.value = $1;
@@ -1361,6 +1372,7 @@ glcpp_parser_create(glcpp_extension_iterator extensions, void *state, gl_api api
     parser->state = state;
     parser->api = api;
     parser->version = 0;
+    parser->version_set = false;
 
     parser->has_new_line_number = 0;
     parser->new_line_number = 1;
@@ -2293,10 +2305,11 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio
                                          const char *es_identifier,
                                          bool explicitly_set)
 {
-    if (parser->version != 0)
+    if (parser->version_set)
         return;
 
     parser->version = version;
+    parser->version_set = true;
 
     add_builtin_define (parser, "__VERSION__", version);
 
diff --git a/src/compiler/glsl/glcpp/glcpp.h b/src/compiler/glsl/glcpp/glcpp.h
index bb4ad67..232e053 100644
--- a/src/compiler/glsl/glcpp/glcpp.h
+++ b/src/compiler/glsl/glcpp/glcpp.h
@@ -208,6 +208,15 @@ struct glcpp_parser {
     void *state;
     gl_api api;
     unsigned version;
+
+    /**
+     * Has the #version been set?
+     *
+     * A separate flag is used because any possible senti
[Mesa-dev] [PATCH 2/2] glsl: Parse 0 as a preprocessor INTCONSTANT
From: Ian Romanick

This allows a more reasonable error message for '#version 0' of

    0:1(10): error: GLSL 0.00 is not supported. Supported versions are: 1.10, 1.20, 1.30, 1.00 ES, 3.00 ES, 3.10 ES, and 3.20 ES

instead of

    0:1(10): error: syntax error, unexpected $undefined, expecting INTCONSTANT

Signed-off-by: Ian Romanick
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97420
Cc: mesa-sta...@lists.freedesktop.org
Cc: Juan A. Suarez Romero
Cc: Karol Herbst
---
 src/compiler/glsl/glsl_lexer.ll | 4
 1 file changed, 4 insertions(+)

diff --git a/src/compiler/glsl/glsl_lexer.ll b/src/compiler/glsl/glsl_lexer.ll
index b473af7..0e722cb 100644
--- a/src/compiler/glsl/glsl_lexer.ll
+++ b/src/compiler/glsl/glsl_lexer.ll
@@ -253,6 +253,10 @@ HASH ^{SPC}#{SPC}
         yylval->n = strtol(yytext, NULL, 10);
         return INTCONSTANT;
     }
+0 {
+        yylval->n = 0;
+        return INTCONSTANT;
+    }
 \n { BEGIN 0; yylineno++; yycolumn = 0; return EOL; }
 . { return yytext[0]; }
--
2.5.5
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
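The two ideas behind this series (accept only decimal #version constants, and track "has #version been seen" with a flag instead of a sentinel value) can be sketched in standalone C. This is a rough illustration, not the glcpp code: struct version_state, parse_version_constant() and set_version() are hypothetical names, and the parser is assumed to be handed a string the lexer has already recognised as an integer token.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the two parser fields the series touches. */
struct version_state {
   unsigned version;
   bool version_set; /* separate flag: no numeric value is reserved as "unset" */
};

/* Accept only decimal constants. Octal and hexadecimal constants both
 * begin with '0', so a leading '0' is allowed only for the string "0".
 * Assumes the caller passes text already lexed as an integer token. */
static bool parse_version_constant(const char *text, long long *out)
{
   if (text[0] == '\0')
      return false;

   if (text[0] == '0' && text[1] != '\0')
      return false;

   *out = strtoll(text, NULL, 10);
   return true;
}

/* Record the first #version seen. Returns false if one was already
 * recorded, even when the recorded value happens to be 0. */
static bool set_version(struct version_state *s, long long version)
{
   if (s->version_set)
      return false;

   s->version = (unsigned) version;
   s->version_set = true;
   return true;
}
```

With the flag, '#version 0' is recorded like any other value and a second #version is still detected. The leading-zero check also matters because an octal reading of 0130 is 88 and of 0512 is 330, which is consistent with the "GLSL 0.88" and "GLSL 3.30" messages reported elsewhere in this discussion.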
Re: [Mesa-dev] [PATCH] glsl: Do not allow scalar types in vector relational functions
On Tue, Nov 8, 2016 at 12:01 PM, Ian Romanick wrote:
> On 11/08/2016 11:58 AM, Ian Romanick wrote:
>> On 11/04/2016 12:23 PM, Matt Turner wrote:
>>> On Sun, Oct 30, 2016 at 11:45 PM, Boyan Ding wrote:
>>>> According to OpenGL Shading Language 4.50 spec, Section 8.7 "Vector
>>>> Relational Functions", functions of this type do not operate on scalar
>>>> types, so remove scalar types from signature definitions to make the
>>>> behavior consistent with glslangValidator and other drivers.
>>>
>>> Yep. Looks like it's always been this way.
>>>
>>> The patch is
>>>
>>> Reviewed-by: Matt Turner
>>>
>>> Since this seems to be untested by any suite, could you provide some
>>> piglit parser tests that confirm that lessThanEqual(scalar, scalar),
>>> et al doesn't work?
>>>
>>> Rant: what a stupid mess to require <= for scalars but lessThanEqual
>>> for vectors.
>>
>> I think it makes sense. What would 'vec4(...) < vec4(...)' return? A

Yes, vec4(...) < vec4(...) would return a bvec4().

>> bvec4? How much time would you like to spend with the compiler error that
>> would result from
>
> Or did you mean that not being able to use lessThanEqual and friends with
> scalars is stupid? I think we just didn't consider that when we added
> the ability to swizzle scalars, but we probably should have added that
> too. I could get behind an extension that makes vector functions also
> work with vec1.

Definitely that, but I'm not convinced that supporting the operators on
vectors would be bad either.

    bvec4 x = ...
    if (x) [...]

gives a very appropriate error on our compiler:

> error: if-statement condition must be scalar boolean
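The point about bvec4 can be made concrete with a small sketch. The types and helpers below are hypothetical C stand-ins for the GLSL types and for the lessThanEqual(), all() and any() built-ins; they only illustrate why a component-wise comparison cannot be used directly as an if-condition.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the GLSL vec4 and bvec4 types. */
struct vec4  { float x, y, z, w; };
struct bvec4 { bool  x, y, z, w; };

/* Component-wise a <= b, in the spirit of GLSL lessThanEqual(). */
static struct bvec4 less_than_equal(struct vec4 a, struct vec4 b)
{
   struct bvec4 r = { a.x <= b.x, a.y <= b.y, a.z <= b.z, a.w <= b.w };
   return r;
}

/* Reductions in the spirit of GLSL all() and any(). */
static bool all4(struct bvec4 v) { return v.x && v.y && v.z && v.w; }
static bool any4(struct bvec4 v) { return v.x || v.y || v.z || v.w; }
```

A comparison of two vectors yields four booleans, so a caller has to choose between all4() and any4() before branching; that choice is what the "condition must be scalar boolean" error forces the shader author to make explicitly.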
Re: [Mesa-dev] [PATCH] mesa: Fix pixel shader scratch space allocation on Gen9+ platforms.
On Tuesday, November 8, 2016 11:49:58 AM PST Matt Turner wrote:
> On Tue, Nov 8, 2016 at 10:25 AM, Kenneth Graunke wrote:
> > We had missed a bit of errata - PS scratch needs to be computed as if
> > there were 4 subslices per slice, rather than 3.
> >
> >                         Skylake            Broxton    Kabylake
> >                         GT1 GT2 GT3 GT4    2x6 3x6    GT1 GT1.5 GT2 GT3 GT4
> > Actual Slices             1   1   2   3      1   1      1     1   1   2   3
> > Total Subslices           3   3   6   9      2   3      2     3   3   6   9
> > Subsl. for PS Scratch     4   4   8  12      4   4      4     4   4   8  12
> >
> > Note that Skylake GT1-3 already worked because we allocated 64 * 9
> > (trying to use a value that would work on GT4, with 9 subslices),
> > and the actual required values were 64 * 4 or 64 * 8. However, all
> > others (Skylake GT4, Broxton, and Kabylake GT1-4) underallocated,
> > which can lead to scratch writes trashing random process memory,
> > and rendering corruption or GPU hangs.
> >
> > Fixes GPU hangs and rendering corruption on Skylake GT4 in shaders that
> > spill. Particularly, dEQP-GLES31.functional.ubo.all_per_block_buffers.*
> > now runs successfully with no hangs and renders correctly. This may
> > fix problems on Broxton and Kabylake as well.
> >
> > Cc: "13.0"
> > Signed-off-by: Kenneth Graunke
> > ---
> >  src/intel/common/gen_device_info.c | 33 +++--
> >  1 file changed, 19 insertions(+), 14 deletions(-)
> >
> > diff --git a/src/intel/common/gen_device_info.c b/src/intel/common/gen_device_info.c
> > index 30df0b2..1dc1769 100644
> > --- a/src/intel/common/gen_device_info.c
> > +++ b/src/intel/common/gen_device_info.c
> > @@ -335,7 +335,6 @@ static const struct gen_device_info gen_device_info_chv = {
> >     .max_gs_threads = 336, \
> >     .max_tcs_threads = 336, \
> >     .max_tes_threads = 336, \
> > -   .max_wm_threads = 64 * 9, \
>
> Is this intentional? I don't see CHV called out in the commit message,
> and the new code at the bottom is for gen >= 9, while CHV is 8.

Sorry, I should have used a bigger -U setting when sending these out.
This change is to the GEN9_FEATURES macro which is directly below the
chv struct...and diff doesn't handle the header nicely.

--Ken
Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds
On 8 November 2016 at 17:57, Kyriazis, George wrote:
>> This is how I folded/cleaned up the autoconf build with commit
>> bb949e262cb5c4fffe991debc605447e15322666. A similar solution here would
>> be great/possible.
>>
>
> Can I take care of it on a follow-on check-in? I.e. check-in as-is for now?
>

Based on the bb949e262 history (was supposed to be a follow-on) I'm
leaning towards - please take care of this in v2.

Thanks
Emil
Re: [Mesa-dev] [PATCH] glsl: Do not allow scalar types in vector relational functions
On 11/08/2016 11:58 AM, Ian Romanick wrote:
> On 11/04/2016 12:23 PM, Matt Turner wrote:
>> On Sun, Oct 30, 2016 at 11:45 PM, Boyan Ding wrote:
>>> According to OpenGL Shading Language 4.50 spec, Section 8.7 "Vector
>>> Relational Functions", functions of this type do not operate on scalar
>>> types, so remove scalar types from signature definitions to make the
>>> behavior consistent with glslangValidator and other drivers.
>>
>> Yep. Looks like it's always been this way.
>>
>> The patch is
>>
>> Reviewed-by: Matt Turner
>>
>> Since this seems to be untested by any suite, could you provide some
>> piglit parser tests that confirm that lessThanEqual(scalar, scalar),
>> et al doesn't work?
>>
>> Rant: what a stupid mess to require <= for scalars but lessThanEqual
>> for vectors.
>
> I think it makes sense. What would 'vec4(...) < vec4(...)' return? A
> bvec4? How much time would you like to spend with the compiler error that
> would result from

Or did you mean that not being able to use lessThanEqual and friends
with scalars is stupid? I think we just didn't consider that when we
added the ability to swizzle scalars, but we probably should have added
that too. I could get behind an extension that makes vector functions
also work with vec1.

> if (a < b) {
>     ...
> }
>
> because a and b happen to be vectors. I bet about an equal number of
> people think that would be stupid as think the current requirement of
> comparison functions for vectors is stupid. :)
>
>> Somewhere on my todo list is a GLSL extension that fixes things like this...
Re: [Mesa-dev] [PATCH] glsl: Do not allow scalar types in vector relational functions
On 11/04/2016 12:23 PM, Matt Turner wrote:
> On Sun, Oct 30, 2016 at 11:45 PM, Boyan Ding wrote:
>> According to OpenGL Shading Language 4.50 spec, Section 8.7 "Vector
>> Relational Functions", functions of this type do not operate on scalar
>> types, so remove scalar types from signature definitions to make the
>> behavior consistent with glslangValidator and other drivers.
>
> Yep. Looks like it's always been this way.
>
> The patch is
>
> Reviewed-by: Matt Turner
>
> Since this seems to be untested by any suite, could you provide some
> piglit parser tests that confirm that lessThanEqual(scalar, scalar),
> et al doesn't work?
>
> Rant: what a stupid mess to require <= for scalars but lessThanEqual
> for vectors.

I think it makes sense. What would 'vec4(...) < vec4(...)' return? A
bvec4? How much time would you like to spend with the compiler error
that would result from

    if (a < b) {
        ...
    }

because a and b happen to be vectors. I bet about an equal number of
people think that would be stupid as think the current requirement of
comparison functions for vectors is stupid. :)

> Somewhere on my todo list is a GLSL extension that fixes things like this...
Re: [Mesa-dev] [PATCH] mesa: Fix pixel shader scratch space allocation on Gen9+ platforms.
On Tue, Nov 8, 2016 at 10:25 AM, Kenneth Graunke wrote:
> We had missed a bit of errata - PS scratch needs to be computed as if
> there were 4 subslices per slice, rather than 3.
>
>                         Skylake            Broxton    Kabylake
>                         GT1 GT2 GT3 GT4    2x6 3x6    GT1 GT1.5 GT2 GT3 GT4
> Actual Slices             1   1   2   3      1   1      1     1   1   2   3
> Total Subslices           3   3   6   9      2   3      2     3   3   6   9
> Subsl. for PS Scratch     4   4   8  12      4   4      4     4   4   8  12
>
> Note that Skylake GT1-3 already worked because we allocated 64 * 9
> (trying to use a value that would work on GT4, with 9 subslices),
> and the actual required values were 64 * 4 or 64 * 8. However, all
> others (Skylake GT4, Broxton, and Kabylake GT1-4) underallocated,
> which can lead to scratch writes trashing random process memory,
> and rendering corruption or GPU hangs.
>
> Fixes GPU hangs and rendering corruption on Skylake GT4 in shaders that
> spill. Particularly, dEQP-GLES31.functional.ubo.all_per_block_buffers.*
> now runs successfully with no hangs and renders correctly. This may
> fix problems on Broxton and Kabylake as well.
>
> Cc: "13.0"
> Signed-off-by: Kenneth Graunke
> ---
>  src/intel/common/gen_device_info.c | 33 +++--
>  1 file changed, 19 insertions(+), 14 deletions(-)
>
> diff --git a/src/intel/common/gen_device_info.c b/src/intel/common/gen_device_info.c
> index 30df0b2..1dc1769 100644
> --- a/src/intel/common/gen_device_info.c
> +++ b/src/intel/common/gen_device_info.c
> @@ -335,7 +335,6 @@ static const struct gen_device_info gen_device_info_chv = {
>     .max_gs_threads = 336, \
>     .max_tcs_threads = 336, \
>     .max_tes_threads = 336, \
> -   .max_wm_threads = 64 * 9, \

Is this intentional? I don't see CHV called out in the commit message,
and the new code at the bottom is for gen >= 9, while CHV is 8.
Re: [Mesa-dev] [PATCH] glcpp: initializes version to -1
I should have a fix for all of these problems out in about an hour. I'm just running it through the CI now. On 11/05/2016 02:48 AM, Karol Herbst wrote: > 2016-11-05 2:50 GMT+01:00 Ian Romanick : >> (Sorry about the top post. Sent from my phone.) >> >> That expression will allow versions like 0130 as valid. If you just want to >> allow 0, you need a more complex regular expression. I feel like that's >> just a bandage... what about other bad values like "#version -130"? Won't >> that have the same problem that 0 currently has? >> > > no, it doesn't. > > I tested the patch with glsl_compiler > > "#version 0130": 0:1(10): error: GLSL 0.88 is not supported. Supported > versions are: 1.10, 1.20, 1.30, 1.00 ES, and 3.00 ES > > "#version 0": 0:1(10): error: GLSL 0.00 is not supported. Supported > versions are: 1.10, 1.20, 1.30, 1.00 ES, and 3.00 ES > > "#version -130":0:1(10): preprocessor error: syntax error, unexpected > '-', expecting INTEGER or INTEGER_STRING > > but > > "#version 0512": 0:1(10): error: GLSL 3.30 is not supported. Supported > versions are: 1.10, 1.20, 1.30, 1.00 ES, and 3.00 ES > > so the issue with this would be, that "0512" is parsed as 3.30, which > isn't right either, but the current master version does the same. \o/ > new bug found > >> >> On November 4, 2016 6:09:58 AM "Juan A. Suarez Romero" >> wrote: >> >>> Shader can define #version as an integer, including 0. >>> >>> Initializes version to -1 to know later if shader has defined a #version >>> or not. 
>>> >>> It fixes 4 piglit tests: >>> spec/glsl-1.10/compiler/version-0.frag: crash pass >>> spec/glsl-1.10/compiler/version-0.vert: crash pass >>> spec/glsl-es-3.00/compiler/version-0.frag: crash pass >>> spec/glsl-es-3.00/compiler/version-0.vert: crash pass >>> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97420 >>> --- >>> src/compiler/glsl/glcpp/glcpp-parse.y | 8 >>> src/compiler/glsl/glcpp/glcpp.h | 2 +- >>> src/compiler/glsl/glsl_lexer.ll | 2 +- >>> 3 files changed, 6 insertions(+), 6 deletions(-) >>> >>> diff --git a/src/compiler/glsl/glcpp/glcpp-parse.y >>> b/src/compiler/glsl/glcpp/glcpp-parse.y >>> index b80ff04..6207a62 100644 >>> --- a/src/compiler/glsl/glcpp/glcpp-parse.y >>> +++ b/src/compiler/glsl/glcpp/glcpp-parse.y >>> @@ -420,13 +420,13 @@ control_line_success: >>> _glcpp_parser_skip_stack_pop (parser, & @1); >>> } NEWLINE >>> | HASH_TOKEN VERSION_TOKEN integer_constant NEWLINE { >>> - if (parser->version != 0) { >>> + if (parser->version != -1) { >>> glcpp_error(& @1, parser, "#version must appear on >>> the first line"); >>> } >>> _glcpp_parser_handle_version_declaration(parser, $3, NULL, >>> true); >>> } >>> | HASH_TOKEN VERSION_TOKEN integer_constant IDENTIFIER NEWLINE { >>> - if (parser->version != 0) { >>> + if (parser->version != -1) { >>> glcpp_error(& @1, parser, "#version must appear on >>> the first line"); >>> } >>> _glcpp_parser_handle_version_declaration(parser, $3, $4, >>> true); >>> @@ -1360,7 +1360,7 @@ glcpp_parser_create(glcpp_extension_iterator >>> extensions, void *state, gl_api api >>> parser->extensions = extensions; >>> parser->state = state; >>> parser->api = api; >>> - parser->version = 0; >>> + parser->version = -1; >>> >>> parser->has_new_line_number = 0; >>> parser->new_line_number = 1; >>> @@ -2293,7 +2293,7 @@ >>> _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t >>> versio >>> const char *es_identifier, >>> bool explicitly_set) >>> { >>> - if (parser->version != 0) >>> + if 
(parser->version != -1) >>>return; >>> >>> parser->version = version; >>> diff --git a/src/compiler/glsl/glcpp/glcpp.h >>> b/src/compiler/glsl/glcpp/glcpp.h >>> index bb4ad67..2acac0c 100644 >>> --- a/src/compiler/glsl/glcpp/glcpp.h >>> +++ b/src/compiler/glsl/glcpp/glcpp.h >>> @@ -207,7 +207,7 @@ struct glcpp_parser { >>> glcpp_extension_iterator extensions; >>> void *state; >>> gl_api api; >>> - unsigned version; >>> + int version; >>> bool has_new_line_number; >>> int new_line_number; >>> bool has_new_source_number; >>> diff --git a/src/compiler/glsl/glsl_lexer.ll >>> b/src/compiler/glsl/glsl_lexer.ll >>> index b473af7..7d1d616 100644 >>> --- a/src/compiler/glsl/glsl_lexer.ll >>> +++ b/src/compiler/glsl/glsl_lexer.ll >>> @@ -249,7 +249,7 @@ HASH^{SPC}#{SPC} >>>yylval->identifier = >>> linear_strdup(mem_ctx, yytext); >>>return IDENTIFIER; >>> } >>> -[1-9][0-9]*{ >>> +[0-9][0-9]*{
Re: [Mesa-dev] [PATCH] gallivm: fix [IU]MUL_HI regression
FYI, this doesn't fix the regression fully. (GLCTS failures with piglit: -t mulextended)

Marek
[Mesa-dev] [PATCH 12.0 backport] intel: Fix pixel shader scratch space allocation on Gen9+ platforms.
We had missed a bit of errata - PS scratch needs to be computed as if
there were 4 subslices per slice, rather than 3.

This is a conservative backport of commit . It only increases the
scratch amount, unlike the original commit, which also decreases it on
Skylake GT1-3 to avoid overallocating.

Cc: "12.0 11.2"
Signed-off-by: Kenneth Graunke
---
 src/mesa/drivers/dri/i965/brw_device_info.c | 16
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c b/src/mesa/drivers/dri/i965/brw_device_info.c
index 77bbe78..e191c6c 100644
--- a/src/mesa/drivers/dri/i965/brw_device_info.c
+++ b/src/mesa/drivers/dri/i965/brw_device_info.c
@@ -336,7 +336,7 @@ static const struct brw_device_info brw_device_info_chv = {
    .max_gs_threads = 336, \
    .max_hs_threads = 336, \
    .max_ds_threads = 336, \
-   .max_wm_threads = 64 * 9, \
+   .max_wm_threads = 64 * 12, \
    .max_cs_threads = 56, \
    .urb = { \
       .size = 384, \
@@ -389,7 +389,7 @@ static const struct brw_device_info brw_device_info_bxt = {
    .max_hs_threads = 112,
    .max_ds_threads = 112,
    .max_gs_threads = 112,
-   .max_wm_threads = 64 * 3,
+   .max_wm_threads = 64 * 4,
    .max_cs_threads = 6 * 6,
    .urb = {
       .size = 192,
@@ -412,7 +412,7 @@ static const struct brw_device_info brw_device_info_bxt_2x6 = {
    .max_hs_threads = 56, /* XXX: guess */
    .max_ds_threads = 56,
    .max_gs_threads = 56,
-   .max_wm_threads = 64 * 2,
+   .max_wm_threads = 64 * 4,
    .max_cs_threads = 6 * 6,
    .urb = {
       .size = 128,
@@ -439,7 +439,7 @@ static const struct brw_device_info brw_device_info_kbl_gt1 = {
    .gt = 1,
 
    .max_cs_threads = 7 * 6,
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 2,
+   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 4,
    .urb.size = 192,
    .num_slices = 1,
 };
@@ -449,7 +449,7 @@ static const struct brw_device_info brw_device_info_kbl_gt1_5 = {
    .gt = 1,
 
    .max_cs_threads = 7 * 6,
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 3,
+   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 4,
    .num_slices = 1,
 };
 
@@ -457,7 +457,7 @@ static const struct brw_device_info brw_device_info_kbl_gt2 = {
    GEN9_FEATURES,
    .gt = 2,
 
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 3,
+   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 4,
    .num_slices = 1,
 };
 
@@ -465,7 +465,7 @@ static const struct brw_device_info brw_device_info_kbl_gt3 = {
    GEN9_FEATURES,
    .gt = 3,
 
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 6,
+   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 8,
    .num_slices = 2,
 };
 
@@ -473,7 +473,7 @@ static const struct brw_device_info brw_device_info_kbl_gt4 = {
    GEN9_FEATURES,
    .gt = 4,
 
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 9,
+   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 12,
    /*
     * From the "L3 Allocation and Programming" documentation:
     *
--
2.10.2
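The constants in this backport follow one rule: 64 threads per pixel shader dispatch unit, times the number of slices, times 4 effective subslices per slice. A minimal sketch of that arithmetic, assuming the figures stated in the thread (the function and enum names below are hypothetical and not part of the driver):

```c
/* Figures taken from the discussion above; the names are illustrative. */
enum {
   THREADS_PER_PSD = 64,              /* threads per Pixel Shader Dispatch unit */
   EFFECTIVE_SUBSLICES_PER_SLICE = 4  /* PS scratch is sized as if 4 subslices */
};

/* Pixel shader thread count to size scratch space for on Gen9+,
 * given the number of slices on the part. */
static unsigned max_wm_threads_gen9(unsigned num_slices)
{
   return THREADS_PER_PSD * num_slices * EFFECTIVE_SUBSLICES_PER_SLICE;
}
```

For 1, 2 and 3 slices this gives 64 * 4, 64 * 8 and 64 * 12, which are the per-device values written out by hand in the hunks above.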
[Mesa-dev] [PATCH] mesa: Fix pixel shader scratch space allocation on Gen9+ platforms.
We had missed a bit of errata - PS scratch needs to be computed as if
there were 4 subslices per slice, rather than 3.

                              Skylake         Broxton            Kabylake
                         GT1  GT2  GT3  GT4   2x6  3x6   GT1  GT1.5  GT2  GT3  GT4
Actual Slices              1    1    2    3     1    1     1      1    1    2    3
Total Subslices            3    3    6    9     2    3     2      3    3    6    9
Subsl. for PS Scratch      4    4    8   12     4    4     4      4    4    8   12

Note that Skylake GT1-3 already worked because we allocated 64 * 9
(trying to use a value that would work on GT4, with 9 subslices),
and the actual required values were 64 * 4 or 64 * 8. However, all
others (Skylake GT4, Broxton, and Kabylake GT1-4) underallocated,
which can lead to scratch writes trashing random process memory,
and rendering corruption or GPU hangs.

Fixes GPU hangs and rendering corruption on Skylake GT4 in shaders
that spill. Particularly, dEQP-GLES31.functional.ubo.all_per_block_buffers.*
now runs successfully with no hangs and renders correctly.

This may fix problems on Broxton and Kabylake as well.

Cc: "13.0"
Signed-off-by: Kenneth Graunke
---
 src/intel/common/gen_device_info.c | 33 +++--
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/src/intel/common/gen_device_info.c b/src/intel/common/gen_device_info.c
index 30df0b2..1dc1769 100644
--- a/src/intel/common/gen_device_info.c
+++ b/src/intel/common/gen_device_info.c
@@ -335,7 +335,6 @@ static const struct gen_device_info gen_device_info_chv = {
    .max_gs_threads = 336, \
    .max_tcs_threads = 336, \
    .max_tes_threads = 336, \
-   .max_wm_threads = 64 * 9,\
    .max_cs_threads = 56,\
    .urb = { \
       .size = 384, \
@@ -388,7 +387,6 @@ static const struct gen_device_info gen_device_info_bxt = {
    .max_tcs_threads = 112,
    .max_tes_threads = 112,
    .max_gs_threads = 112,
-   .max_wm_threads = 64 * 3,
    .max_cs_threads = 6 * 6,
    .urb = {
       .size = 192,
@@ -411,7 +409,6 @@ static const struct gen_device_info gen_device_info_bxt_2x6 = {
    .max_tcs_threads = 56, /* XXX: guess */
    .max_tes_threads = 56,
    .max_gs_threads = 56,
-   .max_wm_threads = 64 * 2,
    .max_cs_threads = 6 * 6,
    .urb = {
       .size = 128,
@@ -427,18 +424,11 @@ static const struct gen_device_info gen_device_info_bxt_2x6 = {
  * There's no KBL entry. Using the default SKL (GEN9) GS entries value.
  */
 
-/*
- * Both SKL and KBL support a maximum of 64 threads per
- * Pixel Shader Dispatch (PSD) unit.
- */
-#define KBL_MAX_THREADS_PER_PSD 64
-
 static const struct gen_device_info gen_device_info_kbl_gt1 = {
    GEN9_FEATURES,
    .gt = 1,
 
    .max_cs_threads = 7 * 6,
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 2,
    .urb.size = 192,
    .num_slices = 1,
 };
@@ -448,7 +438,6 @@ static const struct gen_device_info gen_device_info_kbl_gt1_5 = {
    .gt = 1,
 
    .max_cs_threads = 7 * 6,
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 3,
    .num_slices = 1,
 };
 
@@ -456,7 +445,6 @@ static const struct gen_device_info gen_device_info_kbl_gt2 = {
    GEN9_FEATURES,
    .gt = 2,
 
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 3,
    .num_slices = 1,
 };
 
@@ -464,7 +452,6 @@ static const struct gen_device_info gen_device_info_kbl_gt3 = {
    GEN9_FEATURES,
    .gt = 3,
 
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 6,
    .num_slices = 2,
 };
 
@@ -472,7 +459,6 @@ static const struct gen_device_info gen_device_info_kbl_gt4 = {
    GEN9_FEATURES,
    .gt = 4,
 
-   .max_wm_threads = KBL_MAX_THREADS_PER_PSD * 9,
    /*
     * From the "L3 Allocation and Programming" documentation:
     *
@@ -500,6 +486,25 @@ gen_get_device_info(int devid, struct gen_device_info *devinfo)
       return false;
    }
 
+   /* From the Skylake PRM, 3DSTATE_PS::Scratch Space Base Pointer:
+    *
+    * "Scratch Space per slice is computed based on 4 sub-slices. SW must
+    * allocate scratch space enough so that each slice has 4 slices allowed."
+    *
+    * The equivalent internal documentation says that this programming note
+    * applies to all Gen9+ platforms.
+    *
+    * The hardware typically calculates the scratch space pointer by taking
+    * the base address, and adding per-thread-scratch-space * thread ID.
+    * Extra padding can be necessary depending how the thread IDs are
+    * calculated for a particular shader stage.
+    */
+   if (devinfo->gen >= 9) {
+      devinfo->max_wm_threads = 64 /* threads-per-PSD */
+                              * devinfo->num_slices
+                              * 4; /* effective subslices per slice */
+   }
+
    return true;
 }
 
--
2.10.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/lis
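
The arithmetic in the final hunk can be checked in isolation. The following is only a minimal sketch: `struct device_info_sketch` and `fixup_max_wm_threads` are hypothetical stand-ins for the patch's `struct gen_device_info` and the code added to `gen_get_device_info()`, and they carry just the three fields the calculation touches.

```c
/* Hypothetical, trimmed-down stand-in for Mesa's struct gen_device_info. */
struct device_info_sketch {
   int gen;                  /* hardware generation, e.g. 9 for Skylake */
   unsigned num_slices;      /* actual slice count of the part */
   unsigned max_wm_threads;  /* thread count used to size PS scratch */
};

/* Values taken from the patch: 64 threads per PSD, 4 subslices per slice. */
#define THREADS_PER_PSD 64u
#define SCRATCH_SUBSLICES_PER_SLICE 4u

/* Mirrors the Gen9+ override from the patch: size pixel shader scratch as
 * if every slice had 4 subslices, regardless of the real subslice count.
 * Older generations keep whatever value was already set.
 */
static void
fixup_max_wm_threads(struct device_info_sketch *devinfo)
{
   if (devinfo->gen >= 9) {
      devinfo->max_wm_threads = THREADS_PER_PSD
                              * devinfo->num_slices
                              * SCRATCH_SUBSLICES_PER_SLICE;
   }
}
```

For a three-slice part such as Skylake GT4 this gives 64 * 3 * 4 = 768 threads, compared with the 64 * 9 = 576 that the removed table entry provided.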
Re: [Mesa-dev] [PATCH 2/2] i965: Enable several GLES 3.1 extensions on HSW+
On Tuesday, November 8, 2016 10:10:35 AM PST Ian Romanick wrote:
> From: Ian Romanick
>
> The only reason we didn't previously enable this was the dependency on
> OpenGL ES 3.1. These should have been enabled as soon as HSW got
> stencil texturing. We also needed to fixup setting MaxViewports.
>
> Signed-off-by: Ian Romanick
> ---
>  docs/features.txt                            | 6 +++---
>  docs/relnotes/12.1.0.html                    | 6 +++---
>  src/mesa/drivers/dri/i965/intel_extensions.c | 6 +++---
>  3 files changed, 9 insertions(+), 9 deletions(-)

Patch doesn't apply against master - there is no relnotes/12.1.0.html.

Assuming you make it apply, and have regression tested these, they are
Reviewed-by: Kenneth Graunke
Re: [Mesa-dev] [PATCH 2/2] i965: Enable several GLES 3.1 extensions on HSW+
On Tue, Nov 8, 2016 at 1:10 PM, Ian Romanick wrote:
> diff --git a/docs/relnotes/12.1.0.html b/docs/relnotes/12.1.0.html
> index c7e4d01..b8862d3 100644
> --- a/docs/relnotes/12.1.0.html
> +++ b/docs/relnotes/12.1.0.html

This is not the relnotes file you're looking for.

-ilia
[Mesa-dev] [PATCH 2/2] i965: Enable several GLES 3.1 extensions on HSW+
From: Ian Romanick

The only reason we didn't previously enable this was the dependency on
OpenGL ES 3.1. These should have been enabled as soon as HSW got
stencil texturing. We also needed to fixup setting MaxViewports.

Signed-off-by: Ian Romanick
---
 docs/features.txt                            | 6 +++---
 docs/relnotes/12.1.0.html                    | 6 +++---
 src/mesa/drivers/dri/i965/intel_extensions.c | 6 +++---
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/docs/features.txt b/docs/features.txt
index a677bfb..b1f9384 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -260,18 +260,18 @@ GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+
   GL_OES_copy_image                                     DONE (all drivers)
   GL_OES_draw_buffers_indexed                           DONE (all drivers that support GL_ARB_draw_buffers_blend)
   GL_OES_draw_elements_base_vertex                      DONE (all drivers)
-  GL_OES_geometry_shader                                DONE (i965/gen8+, nvc0, radeonsi)
+  GL_OES_geometry_shader                                DONE (i965/hsw+, nvc0, radeonsi)
   GL_OES_gpu_shader5                                    DONE (all drivers that support GL_ARB_gpu_shader5)
   GL_OES_primitive_bounding_box                         DONE (i965/gen7+, nvc0, radeonsi)
   GL_OES_sample_shading                                 DONE (i965, nvc0, r600, radeonsi)
   GL_OES_sample_variables                               DONE (i965, nvc0, r600, radeonsi)
   GL_OES_shader_image_atomic                            DONE (all drivers that support GL_ARB_shader_image_load_store)
-  GL_OES_shader_io_blocks                               DONE (i965/gen8+, nvc0, radeonsi)
+  GL_OES_shader_io_blocks                               DONE (All drivers that support GLES 3.1)
   GL_OES_shader_multisample_interpolation               DONE (i965, nvc0, r600, radeonsi)
   GL_OES_tessellation_shader                            DONE (all drivers that support GL_ARB_tessellation_shader)
   GL_OES_texture_border_clamp                           DONE (all drivers)
   GL_OES_texture_buffer                                 DONE (i965, nvc0, radeonsi)
-  GL_OES_texture_cube_map_array                         DONE (i965/gen8+, nvc0, radeonsi)
+  GL_OES_texture_cube_map_array                         DONE (i965/hsw+, nvc0, radeonsi)
   GL_OES_texture_stencil8                               DONE (all drivers that support GL_ARB_texture_stencil8)
   GL_OES_texture_storage_multisample_2d_array           DONE (all drivers that support GL_ARB_texture_multisample)
 
diff --git a/docs/relnotes/12.1.0.html b/docs/relnotes/12.1.0.html
index c7e4d01..b8862d3 100644
--- a/docs/relnotes/12.1.0.html
+++ b/docs/relnotes/12.1.0.html
@@ -64,11 +64,11 @@ Note: some of the new features are only available with certain drivers.
 GL_KHR_robustness on nvc0, radeonsi
 GL_KHR_texture_compression_astc_sliced_3d on i965
 GL_OES_copy_image on nv50, nvc0, r600, radeonsi, softpipe, llvmpipe
-GL_OES_geometry_shader on i965/gen8+, nvc0, radeonsi
+GL_OES_geometry_shader on i965/hsw+, nvc0, radeonsi
 GL_OES_primitive_bounding_box on i965/gen7+, nvc0, radeonsi
-GL_OES_texture_cube_map_array on i965/gen8+, nvc0, radeonsi
+GL_OES_texture_cube_map_array on i965/hsw+, nvc0, radeonsi
 GL_OES_tessellation_shader on i965/gen7+, nvc0, radeonsi
-GL_OES_viewport_array on nvc0, radeonsi
+GL_OES_viewport_array on i965/hsw+, nvc0, radeonsi
 GL_ANDROID_extension_pack_es31a on i965/gen9+
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index 66079b5..cbde3fe 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -380,6 +380,9 @@ intelInitExtensions(struct gl_context *ctx)
    if (brw->gen >= 8 || brw->is_haswell) {
       ctx->Extensions.ARB_stencil_texturing = true;
       ctx->Extensions.ARB_texture_stencil8 = true;
+      ctx->Extensions.OES_geometry_shader = true;
+      ctx->Extensions.OES_texture_cube_map_array = true;
+      ctx->Extensions.OES_viewport_array = true;
    }
 
    if (brw->gen >= 8 || brw->is_haswell || brw->is_baytrail) {
@@ -403,9 +406,6 @@ intelInitExtensions(struct gl_context *ctx)
       ctx->Extensions.ARB_shader_precision = true;
       ctx->Extensions.ARB_vertex_attrib_64bit = true;
       ctx->Extensions.ARB_ES3_2_compatibility = true;
-      ctx->Extensions.OES_geometry_shader = true;
-      ctx->Extensions.OES_texture_cube_map_array = true;
-      ctx->Extensions.OES_viewport_array = true;
    }
 
    if (brw->gen >= 9) {
--
2.5.5
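
The effect of moving the three assignments can be sketched with a reduced model of the gating logic. `struct hw_sketch`, `struct ext_sketch` and `init_extensions_sketch` are hypothetical names used only for illustration; the condition itself, `gen >= 8 || is_haswell`, is the one from the `intel_extensions.c` hunk that now receives the OES flags.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the brw_context fields used by the check. */
struct hw_sketch {
   int gen;
   bool is_haswell;
};

/* Only the three extension flags the patch moves. */
struct ext_sketch {
   bool OES_geometry_shader;
   bool OES_texture_cube_map_array;
   bool OES_viewport_array;
};

/* After the patch the three OES flags sit in the same block as
 * ARB_stencil_texturing, so Haswell gets them as well as Gen8+.
 */
static void
init_extensions_sketch(const struct hw_sketch *hw, struct ext_sketch *ext)
{
   if (hw->gen >= 8 || hw->is_haswell) {
      ext->OES_geometry_shader = true;
      ext->OES_texture_cube_map_array = true;
      ext->OES_viewport_array = true;
   }
}
```

With this condition a Gen7 part that is not Haswell still leaves all three flags unset, which matches the "i965/hsw+" wording in the documentation hunks.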
[Mesa-dev] [PATCH 1/2] i965: Always set MaxViewports and related limits
From: Ian Romanick

Since 9d6ca7c3, there should be no performance hit for having
MaxViewports > 1. Always set this context state. This eliminates the
need to update this conditional as we add support for OES_viewport_array
on older GPUs.

Signed-off-by: Ian Romanick
---
 src/mesa/drivers/dri/i965/brw_context.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index d6204fd..3295eb3 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -778,8 +778,7 @@ brw_initialize_context_constants(struct brw_context *brw)
    }
 
    /* ARB_viewport_array, OES_viewport_array */
-   if ((brw->gen >= 6 && ctx->API == API_OPENGL_CORE) ||
-       (brw->gen >= 8 && ctx->API == API_OPENGLES2)) {
+   if (brw->gen >= 6) {
       ctx->Const.MaxViewports = GEN6_NUM_VIEWPORTS;
       ctx->Const.ViewportSubpixelBits = 0;
 
--
2.5.5
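
To see which contexts the simplified condition newly covers, the old and new predicates can be written side by side. This is a sketch only: `enum api_sketch` is a hypothetical two-value stand-in for the API values named in the hunk, and the function names are invented for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two API values named in the patch. */
enum api_sketch {
   API_CORE_SKETCH,   /* plays the role of API_OPENGL_CORE */
   API_GLES2_SKETCH   /* plays the role of API_OPENGLES2 */
};

/* Condition removed by the patch. */
static bool
old_sets_viewport_limits(int gen, enum api_sketch api)
{
   return (gen >= 6 && api == API_CORE_SKETCH) ||
          (gen >= 8 && api == API_GLES2_SKETCH);
}

/* Condition added by the patch: the API no longer matters. */
static bool
new_sets_viewport_limits(int gen)
{
   return gen >= 6;
}
```

An ES context on Gen7 hardware is the case that changes: the old predicate rejects it and the new one accepts it, which is what lets OES_viewport_array be enabled on older GPUs without touching this conditional again.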
Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds
> -Original Message- > From: Emil Velikov [mailto:emil.l.veli...@gmail.com] > Sent: Tuesday, November 8, 2016 10:54 AM > To: Kyriazis, George > Cc: ML mesa-dev > Subject: Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds > > On 8 November 2016 at 15:48, Kyriazis, George > wrote: > > Comments inline.. > > > >> -Original Message- > >> From: Emil Velikov [mailto:emil.l.veli...@gmail.com] > >> Sent: Tuesday, November 8, 2016 8:25 AM > >> To: Kyriazis, George > >> Cc: ML mesa-dev > >> Subject: Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds > >> > >> On 7 November 2016 at 22:32, George Kyriazis > >> > >> wrote: > >> > - Added SConscript files > >> > - better handling of NOMINMAX for inclusion > >> > - Reorder header files in swr_context.cpp to handle NOMINMAX > >> > better, > >> since > >> > mesa header files include windows.h before we get a chance to > #define > >> > NOMINMAX > >> > - cleaner support for .dll and .so prefix/suffix across OSes > >> > - added PUBLIC for some protos > >> > - added swr_gdi_swap() which is call from libgl_gdi.c > >> > --- > >> > src/gallium/drivers/swr/Makefile.am| 8 ++ > >> > src/gallium/drivers/swr/SConscript | 46 +++ > >> > src/gallium/drivers/swr/SConscript-arch| 175 > >> + > >> > src/gallium/drivers/swr/rasterizer/common/os.h | 5 +- > >> > src/gallium/drivers/swr/swr_context.cpp| 16 +-- > >> > src/gallium/drivers/swr/swr_context.h | 2 + > >> > src/gallium/drivers/swr/swr_loader.cpp | 37 +- > >> > src/gallium/drivers/swr/swr_public.h | 11 +- > >> > src/gallium/drivers/swr/swr_screen.cpp | 25 +--- > >> > 9 files changed, 291 insertions(+), 34 deletions(-) create mode > >> > 100644 src/gallium/drivers/swr/SConscript > >> > create mode 100644 src/gallium/drivers/swr/SConscript-arch > >> > > >> Similar to 1/3 this patch does too many things. Please _don't_ do that. > >> > >> Some ideas based on the above: > >> - source code fixes - one or multiple patches, depending on details. 
> >> - automake fixes - ^^ > >> - introduce scons build (+ the EXTRA_DIST hunk) > >> > > As stated in review of patch 1/3, I will send v2 of patches with different > breakdown. > > > > > >> Some misc comments below. > >> > >> > >> > +++ b/src/gallium/drivers/swr/SConscript > >> > @@ -0,0 +1,46 @@ > >> > +Import('*') > >> > + > >> > +from sys import executable as python_cmd import distutils.version > >> Seems unused. Maybe it was aimed for the llvm 3.9 requirement/check > >> mentioned in 1/3 ? > >> > > Scons build fails without the Import('*'), because env is undefined: > > > > NameError: name 'env' is not defined: > > > The "unused" comment was meant for the "import distutils.version" > line. Which seemingly got manged somewhere along the way. > That explains it. Ok, I'll take care of it. > >> > +import os.path > >> > + > >> > +if not 'swr' in COMMAND_LINE_TARGETS: > >> > +Return() > >> > + > >> > +if not env['llvm']: > >> > +print 'warning: LLVM disabled: not building swr' > >> > +Return() > >> > + > >> > +env.MSVC2013Compat() > >> > + > >> > >> > +swr_arch = 'avx' > >> > +VariantDir('avx', '.', duplicate=0) > >> > +SConscript('avx/SConscript-arch', exports='swr_arch') > >> > + > >> > +swr_arch = 'avx2' > >> > +VariantDir('avx2', '.', duplicate=0) > >> > +SConscript('avx2/SConscript-arch', exports='swr_arch') > >> > + > >> Afaict one can just fold the SConscript-arch here. Thus one won't > >> need to bother with the above nor the Depends hunk below. > >> Additionally with current approach one is generating [the] identical > >> source files twice. Far from ideal... > >> > > The AVX and AVX2 builds build differently (with different compiler flags). > At runtime, we load the appropriate dll, based on the underlying > architecture. We do the same thing on the linux build. Also, since > duplicate=0, source is not duplicated. 
Yes, generated files are generated > twice, however currently SConscript is just a shell around SConscript-arch; > all > the logic that generates the files and source lists is in SConscript-arch. By > moving the auto generation to SConscript will generate only one copy of the > gen files, however it splits the build logic into two files, which is more > messy. > I can certainly move the generation code in SConscript, however, I think that > it's cleaner to strive for source code cleanliness, as opposed to generate > code > cleanliness. > > "By moving the auto generation to SConscript ..., however it splits the build > logic into two files..." did you mean "one file" here ? > I'm proposing to fold the two SConscripts, which effectively moves the build > logic into _one_ file :-) > > Scons was never my thing, so I'm failing to see the "source code cleanliness" > that you're thinking about :-( The following isn't that messy is it ? > > build loader > generate sources > build avx - (uses above so
Re: [Mesa-dev] [PATCH] dir-locals.el: Adds whitespace support
If nobody says otherwise, I will land this by the beginning of next
week.

On Sun, 2016-10-23 at 00:10 +0300, Andres Gomez wrote:
> Provides support for highlighting incorrect indentation.
>
> v2: Removed too long lines trail highlighting, as suggested by Ilia
>     Mirkin.
>
> Signed-off-by: Andres Gomez
> ---
>  .dir-locals.el | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/.dir-locals.el b/.dir-locals.el
> index 4b53931..5340c3a 100644
> --- a/.dir-locals.el
> +++ b/.dir-locals.el
> @@ -1,4 +1,5 @@
> -((prog-mode
> +((nil . ((show-trailing-whitespace . t)))
> + (prog-mode
>    (indent-tabs-mode . nil)
>    (tab-width . 8)
>    (c-basic-offset . 3)
> @@ -8,6 +9,10 @@
>             (c-set-offset 'case-label '0)
>             (c-set-offset 'innamespace '0)
>             (c-set-offset 'inline-open '0)))
> -  )
> +  (whitespace-style face indentation)
> +  (whitespace-line-column . 79)
> +  (eval ignore-errors
> +        (require 'whitespace)
> +        (whitespace-mode 1)))
>   (makefile-mode (indent-tabs-mode . t))
>   )
>

--
Br,

Andres
Re: [Mesa-dev] [PATCH 2/3] swr: disable logic op when the rt format is float
Ah indeed, I missed that. That's what I get for looking at the man page.

On Tue, Nov 8, 2016 at 12:25 PM, Rowley, Timothy O wrote:
> Looking at the spec, that seems like that should also check for sRGB and also
> disable in that case (“GetFormatInfo(compileState.format).isSRGB”).
>
>> On Nov 7, 2016, at 6:18 PM, Ilia Mirkin wrote:
>>
>> Signed-off-by: Ilia Mirkin
>> ---
>>  src/gallium/drivers/swr/swr_state.cpp | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/src/gallium/drivers/swr/swr_state.cpp b/src/gallium/drivers/swr/swr_state.cpp
>> index d8a8ee1..acb0452 100644
>> --- a/src/gallium/drivers/swr/swr_state.cpp
>> +++ b/src/gallium/drivers/swr/swr_state.cpp
>> @@ -1305,6 +1305,11 @@ swr_update_derived(struct pipe_context *pipe,
>>                     &ctx->blend->compileState[target],
>>                     sizeof(compileState.blendState));
>>
>> +            if (compileState.blendState.logicOpEnable &&
>> +                GetFormatInfo(compileState.format).type[0] == SWR_TYPE_FLOAT) {
>> +               compileState.blendState.logicOpEnable = false;
>> +            }
>> +
>>              if (compileState.blendState.blendEnable == false &&
>>                  compileState.blendState.logicOpEnable == false) {
>>                 SwrSetBlendFunc(ctx->swrContext, target, NULL);
>> --
>> 2.7.3
>>
>
[Mesa-dev] [Bug 98563] Xorg segfaults with displaylink attached and mesa version >= 13.0
https://bugs.freedesktop.org/show_bug.cgi?id=98563

Jan Rüegg changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rgg...@gmail.com

--
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH 2/3] swr: disable logic op when the rt format is float
Looking at the spec, that seems like that should also check for sRGB and also
disable in that case (“GetFormatInfo(compileState.format).isSRGB”).

> On Nov 7, 2016, at 6:18 PM, Ilia Mirkin wrote:
>
> Signed-off-by: Ilia Mirkin
> ---
>  src/gallium/drivers/swr/swr_state.cpp | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/src/gallium/drivers/swr/swr_state.cpp b/src/gallium/drivers/swr/swr_state.cpp
> index d8a8ee1..acb0452 100644
> --- a/src/gallium/drivers/swr/swr_state.cpp
> +++ b/src/gallium/drivers/swr/swr_state.cpp
> @@ -1305,6 +1305,11 @@ swr_update_derived(struct pipe_context *pipe,
>                     &ctx->blend->compileState[target],
>                     sizeof(compileState.blendState));
>
> +            if (compileState.blendState.logicOpEnable &&
> +                GetFormatInfo(compileState.format).type[0] == SWR_TYPE_FLOAT) {
> +               compileState.blendState.logicOpEnable = false;
> +            }
> +
>              if (compileState.blendState.blendEnable == false &&
>                  compileState.blendState.logicOpEnable == false) {
>                 SwrSetBlendFunc(ctx->swrContext, target, NULL);
> --
> 2.7.3
>
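
The combined check suggested in this review can be sketched outside the driver. Everything here is an illustrative assumption: `struct format_info_sketch` and `effective_logic_op_enable` are hypothetical names, and the two booleans stand in for the `type[0] == SWR_TYPE_FLOAT` test from the patch and the `isSRGB` field named in the review.

```c
#include <stdbool.h>

/* Hypothetical reduction of the render target format description. */
struct format_info_sketch {
   bool is_float;  /* stands in for type[0] == SWR_TYPE_FLOAT */
   bool is_srgb;   /* stands in for the isSRGB field named in the review */
};

/* Returns the logic op enable that should reach the blend state: the
 * request is dropped for float targets (the patch) and, following the
 * review comment, for sRGB targets as well.
 */
static bool
effective_logic_op_enable(bool requested,
                          const struct format_info_sketch *fmt)
{
   if (fmt->is_float || fmt->is_srgb)
      return false;
   return requested;
}
```

Keeping the decision in one predicate means the later `blendEnable == false && logicOpEnable == false` test in the quoted hunk sees the already-corrected value.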
Re: [Mesa-dev] [PATCH 1/3] swr: fix AND_INVERTED logic op conversion
Reviewed-by: Tim Rowley <timothy.o.row...@intel.com>

On Nov 7, 2016, at 6:18 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:

Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
---
 src/gallium/drivers/swr/swr_state.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/swr_state.h b/src/gallium/drivers/swr/swr_state.h
index 0e3b49d..8409114 100644
--- a/src/gallium/drivers/swr/swr_state.h
+++ b/src/gallium/drivers/swr/swr_state.h
@@ -106,7 +106,7 @@ swr_convert_logic_op(const UINT op)
    case PIPE_LOGICOP_NOR:
       return LOGICOP_NOR;
    case PIPE_LOGICOP_AND_INVERTED:
-      return LOGICOP_CLEAR;
+      return LOGICOP_AND_INVERTED;
    case PIPE_LOGICOP_COPY_INVERTED:
       return LOGICOP_COPY_INVERTED;
    case PIPE_LOGICOP_AND_REVERSE:
--
2.7.3
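
Why the wrong mapping was visible can be shown on a single 8-bit channel. This is a minimal sketch: the enum and `apply_logic_op` are hypothetical, and the AND_INVERTED semantics used here (`~src & dst`) are the OpenGL definition of the operation rather than anything stated in the patch.

```c
#include <stdint.h>

/* Hypothetical two-entry subset of the logic ops involved in the fix. */
enum logic_op_sketch {
   LOGIC_CLEAR_SKETCH,
   LOGIC_AND_INVERTED_SKETCH
};

/* Applies a logic op to one 8-bit channel. */
static uint8_t
apply_logic_op(enum logic_op_sketch op, uint8_t src, uint8_t dst)
{
   switch (op) {
   case LOGIC_CLEAR_SKETCH:
      return 0;                        /* result is always zero */
   case LOGIC_AND_INVERTED_SKETCH:
      return (uint8_t)(~src & dst);    /* keep dst bits where src is 0 */
   }
   return dst;
}
```

With `src = 0xF0` and `dst = 0xFF`, AND_INVERTED yields `0x0F` while CLEAR yields `0x00`, so before the fix every AND_INVERTED draw would have zeroed the destination.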
Re: [Mesa-dev] [PATCH 21/25] anv/blorp: Break the guts of alloc_binding_table into a shared helper
On Fri, Oct 28, 2016 at 12:27 PM, Pohjolainen, Topi < topi.pohjolai...@gmail.com> wrote: > On Sat, Oct 22, 2016 at 10:50:52AM -0700, Jason Ekstrand wrote: > > --- > > src/intel/vulkan/anv_blorp.c | 24 > > src/intel/vulkan/anv_private.h | 5 + > > src/intel/vulkan/genX_blorp_exec.c | 18 ++ > > 3 files changed, 31 insertions(+), 16 deletions(-) > > > > diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c > > index 5361c4b..f495815 100644 > > --- a/src/intel/vulkan/anv_blorp.c > > +++ b/src/intel/vulkan/anv_blorp.c > > @@ -868,6 +868,30 @@ void anv_CmdClearDepthStencilImage( > > blorp_batch_finish(&batch); > > } > > > > +struct anv_state > > +anv_cmd_buffer_alloc_blorp_binding_table(struct anv_cmd_buffer > *cmd_buffer, > > + uint32_t num_entries, > > + uint32_t *state_offset) > > +{ > > + struct anv_state bt_state = > > + anv_cmd_buffer_alloc_binding_table(cmd_buffer, num_entries, > > + state_offset); > > + if (bt_state.map == NULL) { > > + /* We ran out of space. Grab a new binding table block. */ > > + VkResult result = anv_cmd_buffer_new_binding_ > table_block(cmd_buffer); > > + assert(result == VK_SUCCESS); > > + > > + /* Re-emit state base addresses so we get the new surface state > base > > + * address before we start emitting binding tables etc. > > + */ > > + anv_cmd_buffer_emit_state_base_address(cmd_buffer); > > + > > + bt_state = anv_cmd_buffer_alloc_binding_table(cmd_buffer, > num_entries, > > +state_offset); > > + assert(bt_state.map != NULL); > > + } > > This is not returning the state. > Thanks for catching this. I've got it fixed now. 
> > +} > > + > > static void > > clear_color_attachment(struct anv_cmd_buffer *cmd_buffer, > > struct blorp_batch *batch, > > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_ > private.h > > index 5664a6e..44fe606 100644 > > --- a/src/intel/vulkan/anv_private.h > > +++ b/src/intel/vulkan/anv_private.h > > @@ -1271,6 +1271,11 @@ void anv_cmd_buffer_resolve_subpass(struct > anv_cmd_buffer *cmd_buffer); > > const struct anv_image_view * > > anv_cmd_buffer_get_depth_stencil_view(const struct anv_cmd_buffer > *cmd_buffer); > > > > +struct anv_state > > +anv_cmd_buffer_alloc_blorp_binding_table(struct anv_cmd_buffer > *cmd_buffer, > > + uint32_t num_entries, > > + uint32_t *state_offset); > > + > > void anv_cmd_buffer_dump(struct anv_cmd_buffer *cmd_buffer); > > > > struct anv_fence { > > diff --git a/src/intel/vulkan/genX_blorp_exec.c > b/src/intel/vulkan/genX_blorp_exec.c > > index 185aff6..a705de0 100644 > > --- a/src/intel/vulkan/genX_blorp_exec.c > > +++ b/src/intel/vulkan/genX_blorp_exec.c > > @@ -87,22 +87,8 @@ blorp_alloc_binding_table(struct blorp_batch *batch, > unsigned num_entries, > > > > uint32_t state_offset; > > struct anv_state bt_state = > > - anv_cmd_buffer_alloc_binding_table(cmd_buffer, num_entries, > > - &state_offset); > > - if (bt_state.map == NULL) { > > - /* We ran out of space. Grab a new binding table block. */ > > - VkResult result = anv_cmd_buffer_new_binding_ > table_block(cmd_buffer); > > - assert(result == VK_SUCCESS); > > - > > - /* Re-emit state base addresses so we get the new surface state > base > > - * address before we start emitting binding tables etc. 
> > - */ > > - genX(cmd_buffer_emit_state_base_address)(cmd_buffer); > > - > > - bt_state = anv_cmd_buffer_alloc_binding_table(cmd_buffer, > num_entries, > > -&state_offset); > > - assert(bt_state.map != NULL); > > - } > > + anv_cmd_buffer_alloc_blorp_binding_table(cmd_buffer, num_entries, > > + &state_offset); > > > > uint32_t *bt_map = bt_state.map; > > *bt_offset = bt_state.offset; > > -- > > 2.5.0.400.gff86faf > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
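
The review comment in this thread ("This is not returning the state") concerns an allocate, grow, retry helper that fell off the end without a return. The shape of the corrected helper can be sketched with a toy allocator. All names here (`struct state_sketch`, `struct pool_sketch`, `try_alloc`, `new_block`, `alloc_with_retry`) are hypothetical and only imitate the roles of `anv_cmd_buffer_alloc_binding_table` and `anv_cmd_buffer_new_binding_table_block`; unlike the patch, the sketch reports failure through a NULL map instead of asserting.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct anv_state: NULL map means failure. */
struct state_sketch {
   void *map;
   uint32_t offset;
};

/* Toy pool with two fixed-size blocks. */
struct pool_sketch {
   uint8_t storage[2][64];
   unsigned block;  /* index of the block currently allocated from */
   uint32_t next;   /* next free byte in that block */
};

/* Bump allocation inside the current block. */
static struct state_sketch
try_alloc(struct pool_sketch *pool, uint32_t size)
{
   struct state_sketch s = { NULL, 0 };
   if (pool->next + size <= sizeof(pool->storage[0])) {
      s.map = &pool->storage[pool->block][pool->next];
      s.offset = pool->next;
      pool->next += size;
   }
   return s;
}

/* Switches to a fresh block; returns 0 on success, -1 if none is left. */
static int
new_block(struct pool_sketch *pool)
{
   if (pool->block + 1 >= 2)
      return -1;
   pool->block++;
   pool->next = 0;
   return 0;
}

/* Try once, grab a new block on failure, try again, and hand the result
 * back to the caller.
 */
static struct state_sketch
alloc_with_retry(struct pool_sketch *pool, uint32_t size)
{
   struct state_sketch s = try_alloc(pool, size);
   if (s.map == NULL) {
      if (new_block(pool) == 0)
         s = try_alloc(pool, size);
   }
   return s;  /* the return the reviewed helper was missing */
}
```

In C, a non-void function that reaches its closing brace without a return compiles (usually with a warning) but leaves the caller reading an indeterminate value, which is why the omission was worth flagging in review.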
Re: [Mesa-dev] [PATCH 19/25] intel/blorp: Add a clear_attachments entrypoint
On Fri, Oct 28, 2016 at 12:11 PM, Pohjolainen, Topi < topi.pohjolai...@gmail.com> wrote: > On Sat, Oct 22, 2016 at 10:50:50AM -0700, Jason Ekstrand wrote: > > --- > > src/intel/blorp/blorp.h | 11 +++ > > src/intel/blorp/blorp_clear.c | 162 ++ > +++- > > src/intel/blorp/blorp_priv.h | 1 + > > 3 files changed, 172 insertions(+), 2 deletions(-) > > > > diff --git a/src/intel/blorp/blorp.h b/src/intel/blorp/blorp.h > > index 0c64d13..8a761ce 100644 > > --- a/src/intel/blorp/blorp.h > > +++ b/src/intel/blorp/blorp.h > > @@ -156,6 +156,17 @@ blorp_clear_depth_stencil(struct blorp_batch > *batch, > >uint8_t stencil_mask, uint8_t stencil_value); > > > > void > > +blorp_clear_attachments(struct blorp_batch *batch, > > +uint32_t color_surface_state, > > +enum isl_format depth_format, > > +uint32_t num_samples, > > +uint32_t start_layer, uint32_t num_layers, > > +uint32_t x0, uint32_t y0, uint32_t x1, uint32_t > y1, > > +bool clear_color, union isl_color_value > color_value, > > +bool clear_depth, float depth_value, > > +uint8_t stencil_mask, uint8_t stencil_value); > > + > > +void > > blorp_ccs_resolve(struct blorp_batch *batch, > >struct blorp_surf *surf, enum isl_format format); > > > > diff --git a/src/intel/blorp/blorp_clear.c > b/src/intel/blorp/blorp_clear.c > > index 3d752ac..2287f59 100644 > > --- a/src/intel/blorp/blorp_clear.c > > +++ b/src/intel/blorp/blorp_clear.c > > @@ -87,6 +87,94 @@ blorp_params_get_clear_kernel(struct blorp_context > *blorp, > > ralloc_free(mem_ctx); > > } > > > > +struct layer_offset_vs_key { > > + enum blorp_shader_type shader_type; > > + unsigned num_inputs; > > +}; > > + > > I'm assuming we need this because we are re-using the surface state from > other pass and therefore cannot set the base layer in the surface state? If > so it would be nice to have a comment here. > Done. 
> > +static void > > +blorp_params_get_layer_offset_vs(struct blorp_context *blorp, > > + struct blorp_params *params) > > +{ > > + struct layer_offset_vs_key blorp_key = { > > + .shader_type = BLORP_SHADER_TYPE_LAYER_OFFSET_VS, > > + }; > > + > > + if (params->wm_prog_data) > > + blorp_key.num_inputs = params->wm_prog_data->num_varying_inputs; > > + > > + if (blorp->lookup_shader(blorp, &blorp_key, sizeof(blorp_key), > > +¶ms->vs_prog_kernel, > ¶ms->vs_prog_data)) > > + return; > > + > > + void *mem_ctx = ralloc_context(NULL); > > + > > + nir_builder b; > > + nir_builder_init_simple_shader(&b, mem_ctx, MESA_SHADER_VERTEX, > NULL); > > + b.shader->info.name = ralloc_strdup(b.shader, > "BLORP-layer-offset-vs"); > > + > > + const struct glsl_type *uvec4_type = glsl_vector_type(GLSL_TYPE_UINT, > 4); > > + > > + /* > > +* First we deal with the header which has instance and base instance > > +*/ > > Fits as oneline comment. > > > + nir_variable *a_header = nir_variable_create(b.shader, > nir_var_shader_in, > > +uvec4_type, "header"); > > + a_header->data.location = VERT_ATTRIB_GENERIC0; > > + > > + nir_variable *v_layer = nir_variable_create(b.shader, > nir_var_shader_out, > > + glsl_int_type(), > "layer_id"); > > + v_layer->data.location = VARYING_SLOT_LAYER; > > + > > + /* Compute the layer id */ > > + nir_ssa_def *header = nir_load_var(&b, a_header); > > + nir_ssa_def *base_layer = nir_channel(&b, header, 0); > > + nir_ssa_def *instance = nir_channel(&b, header, 1); > > + nir_store_var(&b, v_layer, nir_iadd(&b, instance, base_layer), 0x1); > > + > > + /* > > +* Then we copy the vertex from the next slot to VARYING_SLOT_POS > > +*/ > > Same here. 
> > > + nir_variable *a_vertex = nir_variable_create(b.shader, > nir_var_shader_in, > > +glsl_vec4_type(), > "a_vertex"); > > + a_vertex->data.location = VERT_ATTRIB_GENERIC1; > > + > > + nir_variable *v_pos = nir_variable_create(b.shader, > nir_var_shader_out, > > + glsl_vec4_type(), "v_pos"); > > + v_pos->data.location = VARYING_SLOT_POS; > > + > > + nir_copy_var(&b, v_pos, a_vertex); > > + > > + /* > > +* Then we copy everything else > > +*/ > > And here. > > > + for (unsigned i = 0; i < blorp_key.num_inputs; i++) { > > + nir_variable *a_in = nir_variable_create(b.shader, > nir_var_shader_in, > > + uvec4_type, "input"); > > + a_in->data.location = VERT_ATTRIB_GENERIC2 + i; > > + > > + nir_variable *v_out = nir_variable_create(b.shader, > nir_var_shader_out, > > +
[Mesa-dev] [Bug 98599] xterm menus corrupt since tgsi/scan: handle indirect image indexing correctly
https://bugs.freedesktop.org/show_bug.cgi?id=98599

Marek Olšák changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Marek Olšák ---
Fixed by f864547fa92262f4b2c65a047210ee41e5b45e9a. Closing.

--
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds
On 8 November 2016 at 15:48, Kyriazis, George wrote: > Comments inline.. > >> -Original Message- >> From: Emil Velikov [mailto:emil.l.veli...@gmail.com] >> Sent: Tuesday, November 8, 2016 8:25 AM >> To: Kyriazis, George >> Cc: ML mesa-dev >> Subject: Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds >> >> On 7 November 2016 at 22:32, George Kyriazis >> wrote: >> > - Added SConscript files >> > - better handling of NOMINMAX for inclusion >> > - Reorder header files in swr_context.cpp to handle NOMINMAX better, >> since >> > mesa header files include windows.h before we get a chance to #define >> > NOMINMAX >> > - cleaner support for .dll and .so prefix/suffix across OSes >> > - added PUBLIC for some protos >> > - added swr_gdi_swap() which is call from libgl_gdi.c >> > --- >> > src/gallium/drivers/swr/Makefile.am| 8 ++ >> > src/gallium/drivers/swr/SConscript | 46 +++ >> > src/gallium/drivers/swr/SConscript-arch| 175 >> + >> > src/gallium/drivers/swr/rasterizer/common/os.h | 5 +- >> > src/gallium/drivers/swr/swr_context.cpp| 16 +-- >> > src/gallium/drivers/swr/swr_context.h | 2 + >> > src/gallium/drivers/swr/swr_loader.cpp | 37 +- >> > src/gallium/drivers/swr/swr_public.h | 11 +- >> > src/gallium/drivers/swr/swr_screen.cpp | 25 +--- >> > 9 files changed, 291 insertions(+), 34 deletions(-) create mode >> > 100644 src/gallium/drivers/swr/SConscript >> > create mode 100644 src/gallium/drivers/swr/SConscript-arch >> > >> Similar to 1/3 this patch does too many things. Please _don't_ do that. >> >> Some ideas based on the above: >> - source code fixes - one or multiple patches, depending on details. >> - automake fixes - ^^ >> - introduce scons build (+ the EXTRA_DIST hunk) >> > As stated in review of patch 1/3, I will send v2 of patches with different > breakdown. > > >> Some misc comments below. 
>> >> >> > +++ b/src/gallium/drivers/swr/SConscript >> > @@ -0,0 +1,46 @@ >> > +Import('*') >> > + >> > +from sys import executable as python_cmd import distutils.version >> Seems unused. Maybe it was aimed for the llvm 3.9 requirement/check >> mentioned in 1/3 ? >> > Scons build fails without the Import('*'), because env is undefined: > > NameError: name 'env' is not defined: > The "unused" comment was meant for the "import distutils.version" line. Which seemingly got manged somewhere along the way. >> > +import os.path >> > + >> > +if not 'swr' in COMMAND_LINE_TARGETS: >> > +Return() >> > + >> > +if not env['llvm']: >> > +print 'warning: LLVM disabled: not building swr' >> > +Return() >> > + >> > +env.MSVC2013Compat() >> > + >> >> > +swr_arch = 'avx' >> > +VariantDir('avx', '.', duplicate=0) >> > +SConscript('avx/SConscript-arch', exports='swr_arch') >> > + >> > +swr_arch = 'avx2' >> > +VariantDir('avx2', '.', duplicate=0) >> > +SConscript('avx2/SConscript-arch', exports='swr_arch') >> > + >> Afaict one can just fold the SConscript-arch here. Thus one won't need to >> bother with the above nor the Depends hunk below. >> Additionally with current approach one is generating [the] identical source >> files twice. Far from ideal... >> > The AVX and AVX2 builds build differently (with different compiler flags). > At runtime, we load the appropriate dll, based on the underlying > architecture. We do the same thing on the linux build. Also, since > duplicate=0, source is not duplicated. Yes, generated files are generated > twice, however currently SConscript is just a shell around SConscript-arch; > all the logic that generates the files and source lists is in > SConscript-arch. By moving the auto generation to SConscript will generate > only one copy of the gen files, however it splits the build logic into two > files, which is more messy. 
I can certainly move the generation code in > SConscript, however, I think that it's cleaner to strive for source code > cleanliness, as opposed to generate code cleanliness. "By moving the auto generation to SConscript ..., however it splits the build logic into two files..." did you mean "one file" here ? I'm proposing to fold the two SConscripts, which effectively moves the build logic into _one_ file :-) Scons was never my thing, so I'm failing to see the "source code cleanliness" that you're thinking about :-( The following isn't that messy is it ? build loader generate sources build avx - (uses above sources + avx compile flags) build avx2 - (uses above sources + avx2 compile flags) This is now I folded/cleaned up the autoconf build with commit bb949e262cb5c4fffe991debc605447e15322666. A similar solution here would be great/possible. >> > +# remove headers, as scons thinks they are static objects for the .so >> > +source = [x for x in source if not x.endswith(tuple(['.h','.hpp']))] >> > + >> Should be handled already. Otherwise please do so in scons/* Quick grep >> suggests scons/custom.py >> >
Re: [Mesa-dev] [PATCH] gallivm: Fix build after removal of deprecated attribute API v2
On Mon, Nov 7, 2016 at 11:04 PM, Jan Vesely wrote: > On Mon, 2016-11-07 at 21:06 +, Tom Stellard wrote: >> v2: >> Fix adding parameter attributes with LLVM < 4.0. >> --- >> src/gallium/auxiliary/draw/draw_llvm.c| 6 +- >> src/gallium/auxiliary/gallivm/lp_bld_intr.c | 52 - >> src/gallium/auxiliary/gallivm/lp_bld_intr.h | 13 - >> src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 4 +- >> src/gallium/drivers/radeonsi/si_shader.c | 69 >> --- >> src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c | 24 >> 6 files changed, 116 insertions(+), 52 deletions(-) >> >> diff --git a/src/gallium/auxiliary/draw/draw_llvm.c >> b/src/gallium/auxiliary/draw/draw_llvm.c >> index 5b4e2a1..5d87318 100644 >> --- a/src/gallium/auxiliary/draw/draw_llvm.c >> +++ b/src/gallium/auxiliary/draw/draw_llvm.c >> @@ -1568,8 +1568,7 @@ draw_llvm_generate(struct draw_llvm *llvm, struct >> draw_llvm_variant *variant, >> LLVMSetFunctionCallConv(variant_func, LLVMCCallConv); >> for (i = 0; i < num_arg_types; ++i) >>if (LLVMGetTypeKind(arg_types[i]) == LLVMPointerTypeKind) >> - LLVMAddAttribute(LLVMGetParam(variant_func, i), >> - LLVMNoAliasAttribute); >> + lp_add_function_attr(variant_func, i + 1, "noalias", 7); >> >> context_ptr = LLVMGetParam(variant_func, 0); >> io_ptr= LLVMGetParam(variant_func, 1); >> @@ -2193,8 +2192,7 @@ draw_gs_llvm_generate(struct draw_llvm *llvm, >> >> for (i = 0; i < ARRAY_SIZE(arg_types); ++i) >>if (LLVMGetTypeKind(arg_types[i]) == LLVMPointerTypeKind) >> - LLVMAddAttribute(LLVMGetParam(variant_func, i), >> - LLVMNoAliasAttribute); >> + lp_add_function_attr(variant_func, i + 1, "noalias", 7); >> >> context_ptr = LLVMGetParam(variant_func, 0); >> input_array = LLVMGetParam(variant_func, 1); >> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_intr.c >> b/src/gallium/auxiliary/gallivm/lp_bld_intr.c >> index f12e735..401e9a2 100644 >> --- a/src/gallium/auxiliary/gallivm/lp_bld_intr.c >> +++ b/src/gallium/auxiliary/gallivm/lp_bld_intr.c >> @@ -120,13 +120,57 @@ 
lp_declare_intrinsic(LLVMModuleRef module,
>> }
>>
>>
>> +#if HAVE_LLVM < 0x0400
>> +static LLVMAttribute str_to_attr(const char *attr_name, unsigned attr_len)
>> +{
>> +   if (!strncmp("alwaysinline", attr_name, attr_len)) {
>> +      return LLVMAlwaysInlineAttribute;
>> +   } else if (!strncmp("byval", attr_name, attr_len)) {
>> +      return LLVMByValAttribute;
>> +   } else if (!strncmp("inreg", attr_name, attr_len)) {
>> +      return LLVMInRegAttribute;
>> +   } else if (!strncmp("noalias", attr_name, attr_len)) {
>> +      return LLVMNoAliasAttribute;
>> +   } else if (!strncmp("readnone", attr_name, attr_len)) {
>> +      return LLVMReadNoneAttribute;
>> +   } else if (!strncmp("readonly", attr_name, attr_len)) {
>> +      return LLVMReadOnlyAttribute;
>> +   } else {
>> +      _debug_printf("Unhandled function attribute: %s\n", attr_name);
>> +      return 0;
>> +   }
>> +}
>> +#endif
>> +
>> +void
>> +lp_add_function_attr(LLVMValueRef function,
>> +                     int attr_idx,
>> +                     const char *attr_name,
>> +                     unsigned attr_len)
>
> Any reason to pass string length by hand rather than local strlen?

An enum would be better. Then lp_add_function_attr can translate the enum to a string. The enums can be defined by gallivm.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
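A rough sketch of the enum idea, written in Python only to show the shape of the design; the real change would be C inside gallivm. FuncAttr, attr_name_and_len and strncmp_equal are hypothetical names introduced for illustration. The helper strncmp_equal models the "compare at most n characters" behaviour of strncmp for strings without embedded NULs, to show why a hand-written length is fragile.

```python
from enum import Enum


class FuncAttr(Enum):
    """Hypothetical gallivm-side enum; values are the LLVM attribute names."""
    ALWAYSINLINE = "alwaysinline"
    BYVAL = "byval"
    INREG = "inreg"
    NOALIAS = "noalias"
    READNONE = "readnone"
    READONLY = "readonly"


def attr_name_and_len(attr):
    """Translate the enum to (name, length) so callers never count by hand."""
    name = attr.value
    return name, len(name)


def strncmp_equal(a, b, n):
    """Model of strncmp(a, b, n) == 0 for strings without embedded NULs."""
    return a[:n] == b[:n]
```

For example, attr_name_and_len(FuncAttr.NOALIAS) gives the same ("noalias", 7) pair that the patch spells out by hand, while a wrong length such as 4 would make "readnone" and "readonly" compare equal.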
Re: [Mesa-dev] [RFC 03/12] egl: add EGL_ANDROID_native_fence_sync
On Mon, Nov 07, 2016 at 07:48:25PM -0500, Rob Clark wrote: > On Mon, Nov 7, 2016 at 6:29 PM, Rafael Antognolli > wrote: > > On Mon, Oct 31, 2016 at 08:58:26AM -0700, Rafael Antognolli wrote: > >> On Sat, Oct 29, 2016 at 01:15:44PM -0400, Rob Clark wrote: > >> > On Fri, Oct 28, 2016 at 7:44 PM, Rafael Antognolli > >> > wrote: > > > > ... > > > >> > Hey, thanks for this. I don't suppose you have a branch somewhere w/ > >> > the piglit tests? > >> > >> Ouch, I mentioned it on another email but should have mentioned it here > >> too. It's here: > >> > >> https://github.com/rantogno/piglit/tree/fences > >> > >> > I've rebased and pulled in Chad's squash patches (and also a squash > >> > patch based on the issues you pointed out), but not yet the i965 > >> > patches: > >> > > >> > https://github.com/freedreno/mesa/commits/wip-fence > >> > >> Awesome, I will check that one. > > > > Just an update: I did test that branch, and there was just one change > > needed for the piglit tests to work: > > > > https://github.com/rantogno/mesa/commit/c637f1ce404acaccaa920d37c52724c9d8093597 > > oh, good catch.. I'll squash that in and push an updated branch soon > > > You can also check my last version of these tests (also submitted to the > > piglit list) here: > > > > https://github.com/rantogno/piglit/tree/review/fences-v02 > > > > The only test that I don't know how to do yet is to make sure that Mesa > > and the kernel are respecting an eglSyncWait for a native sync fence. > > eglClientWaitSyncKHR is already covered. > > yeah, I can't think of a particularly easy way to test that.. but I > think the API level tests have already caught quite a few issues.. Alright, I'll ignore that for now... > > Also I did test your series with kmscube and some other stuff too, and > > so far it's all behaving really well. I'm looking forward to see your > > patches get merged. > > I guess we should pull together a unified branch.. since we have this > working for intel + virgl + freedreno. 
AFAIU the current status is > intel and freedreno kernel bits are upstream. The libdrm bits for > freedreno are upstream, not sure about intel (and virgl doesn't have > any libdrm component). Not sure about the kernel bit for virgl, but I > assume that will be 4.10? Actually, the intel kernel bits have not been merged yet, AFAIK they are "waiting for userspace". Chris Wilson has been sending updated versions of them every now and then, so it's just a matter of merging them. And they have been reviewed already. libdrm bits for intel are not upstream yet either. What I have been using is a mix of patches from Chad and Chris Wilson as well, but imho it's ready to go: https://github.com/rantogno/libdrm/tree/wip-fence > I have one small update for the gallium patch, to add the pipe-cap to > all the other drivers. I usually try to wait until the patch is ready > to push since otherwise it ends up being a huge rebase headache. > > I would defn like to get this merged, esp. since I'm starting to get > busy on the next thing ;-) Awesome, that sounds really good :) Thanks, Rafael ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallivm: fix [IU]MUL_HI regression
Am 08.11.2016 um 16:23 schrieb Nicolai Hähnle: > On 08.11.2016 14:44, Roland Scheidegger wrote: >> Sorry for breaking radeonsi, I somehow thought this way only used for >> cpu only already, without actually checking... >> And thanks for fixing that typo, apparently you can pass piglits >> umul_hi/imul_hi tests (at least those from the shader_integer_mix group) >> even with the square of argument a... > > Yeah, it sucks that test runs take so long with llvmpipe. Is there > anybody doing systematic full regression runs on it? > > I do full runs on radeonsi fairly frequently, and I noticed this bug > with > tests/spec/arb_gpu_shader5/execution/built-in-functions/fs-imulExtended.shader_test > and friends. I do but not that frequently. I usually do full runs though with changes such as this, so I thought I screwed up the testing. However, looking at the assembly, this wasn't the case - glsl lowered the imul_hi and umul_hi to a sequence of muls/adds/shifts The reason for that is probably the IMUL_HIGH_TO_MUL lowering - this is done when ARB_gpu_shader5 isn't supported (which llvmpipe does not). With llvmpipe, we definitely don't want the mul_hi lowering but there's even a comment there that there's no individual caps (and some of the other stuff might not be implemented). So there was no way to catch that with llvmpipe, effectively only the umul_hi was tested called directly from the draw fetch code, not from shader (fwiw I did see a regression with some automated internal testing using another api...). (We also have automated piglit tests, however due to results changing frequently it's difficult to catch "real" regressions.) Roland > > >> btw as I didn't consider this, I don't know if you want to change the >> shift/trunc to shuffle in the end - feel free to change it back if it >> doesn't generate good code on radeonsi... > > It seems instcombine has no difficulties seeing through the IR, so I > think we're good :) Ok. 
With x86 sse2 it definitely generates some different assembly, but what exactly is better depends on sse2/sse41/avx/avx2 being available and the llvm version... Roland > > >> Reviewed-by: Roland Scheidegger > > Thanks! > > Nicolai > > >> Am 08.11.2016 um 10:15 schrieb Nicolai Hähnle: >>> From: Nicolai Hähnle >>> >>> This patch does two things: >>> >>> 1. It separates the host-CPU code generation from the generic code >>>generation. This guards against accidently breaking things for >>>radeonsi in the future. >>> >>> 2. It makes sure we actually use both arguments and don't just compute >>>a square :-p >>> >>> Fixes a regression introduced by commit >>> 29279f44b3172ef3b84d470e70fc7684695ced4b >>> >>> Cc: Roland Scheidegger >>> --- >>> src/gallium/auxiliary/gallivm/lp_bld_arit.c| 72 >>> ++ >>> src/gallium/auxiliary/gallivm/lp_bld_arit.h| 6 ++ >>> src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 40 +++- >>> 3 files changed, 90 insertions(+), 28 deletions(-) >>> >>> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c >>> b/src/gallium/auxiliary/gallivm/lp_bld_arit.c >>> index 3de4628..43ad238 100644 >>> --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c >>> +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c >>> @@ -1087,26 +1087,28 @@ lp_build_mul(struct lp_build_context *bld, >>> res = LLVMBuildLShr(builder, res, shift, ""); >>>} >>> } >>> >>> return res; >>> } >>> >>> /* >>> * Widening mul, valid for 32x32 bit -> 64bit only. >>> * Result is low 32bits, high bits returned in res_hi. >>> + * >>> + * Emits code that is meant to be compiled for the host CPU. 
>>> */ >>> LLVMValueRef >>> -lp_build_mul_32_lohi(struct lp_build_context *bld, >>> - LLVMValueRef a, >>> - LLVMValueRef b, >>> - LLVMValueRef *res_hi) >>> +lp_build_mul_32_lohi_cpu(struct lp_build_context *bld, >>> + LLVMValueRef a, >>> + LLVMValueRef b, >>> + LLVMValueRef *res_hi) >>> { >>> struct gallivm_state *gallivm = bld->gallivm; >>> LLVMBuilderRef builder = gallivm->builder; >>> >>> assert(bld->type.width == 32); >>> assert(bld->type.floating == 0); >>> assert(bld->type.fixed == 0); >>> assert(bld->type.norm == 0); >>> >>> /* >>> @@ -1209,43 +1211,61 @@ lp_build_mul_32_lohi(struct lp_build_context >>> *bld, >>>*res_hi = LLVMBuildShuffleVector(builder, muleven, mulodd, >>> shuf_vec, ""); >>> >>>for (i = 0; i < bld->type.length; i += 2) { >>> shuf[i] = lp_build_const_int32(gallivm, i); >>> shuf[i+1] = lp_build_const_int32(gallivm, i + >>> bld->type.length); >>>} >>>shuf_vec = LLVMConstVector(shuf, bld->type.length); >>>return LLVMBuildShuffleVector(builder, muleven, mulodd, >>> shuf_vec, ""); >>> } >>> else { >>> - LLVMValueRef tmp; >>> - struct lp_type type_tmp; >>> -
Re: [Mesa-dev] [PATCH 13/18] anv: Add initial for Sky Lake color compression
On Tue, Nov 8, 2016 at 12:21 AM, Pohjolainen, Topi < topi.pohjolai...@gmail.com> wrote: > > Title says: "anv: Add initial for Sky Lake color compression". Did you mean > to have something after "initial"? > Yeah, "support" should probably go in there > On Fri, Oct 28, 2016 at 02:17:09AM -0700, Jason Ekstrand wrote: > > This commit adds basic support for color compression. For the moment, > > color compression is only enabled within a render pass and a full resolve > > is done before the render pass finishes. All texturing operations still > > happen with CCS disabled. > > I'm not that familiar with all the vulkan concepts so far and need to ask a > few things. In this patch CCS_E is enabled whenever there is suitable > render > target. For surfaces of the type VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT and > VK_DESCRIPTOR_TYPE_STORAGE_IMAGE aux is explicitly disabled. Does this > mean > that it is impossible to have one of these surfaces as render target in > the same pass (and having compression turned on for writing)? > No, not quite. In this patch, we always do a full resolve at the end of the render pass. Since input attachments aren't really supported yet (my next task), those aren't a problem. Also, you can't bind one of your render pass attachments as a storage image or texture so those will never be used while it's in an unresolved state. > Otherwise this patch looks good to me. 
> > > > > Signed-off-by: Jason Ekstrand > > --- > > src/intel/vulkan/anv_blorp.c | 139 +- > --- > > src/intel/vulkan/anv_image.c | 17 +++-- > > src/intel/vulkan/anv_private.h | 1 + > > src/intel/vulkan/genX_cmd_buffer.c | 50 - > > 4 files changed, 171 insertions(+), 36 deletions(-) > > > > diff --git a/src/intel/vulkan/anv_blorp.c b/src/intel/vulkan/anv_blorp.c > > index 0e70e9b..bf317c7 100644 > > --- a/src/intel/vulkan/anv_blorp.c > > +++ b/src/intel/vulkan/anv_blorp.c > > @@ -1179,52 +1179,131 @@ void anv_CmdResolveImage( > > blorp_batch_finish(&batch); > > } > > > > +static void > > +ccs_resolve_attachment(struct anv_cmd_buffer *cmd_buffer, > > + struct blorp_batch *batch, > > + uint32_t att) > > +{ > > + struct anv_framebuffer *fb = cmd_buffer->state.framebuffer; > > + struct anv_attachment_state *att_state = > > + &cmd_buffer->state.attachments[att]; > > + > > + assert(att_state->aux_usage != ISL_AUX_USAGE_CCS_D); > > + if (att_state->aux_usage != ISL_AUX_USAGE_CCS_E) > > + return; /* Nothing to resolve */ > > + > > + struct anv_render_pass *pass = cmd_buffer->state.pass; > > + struct anv_subpass *subpass = cmd_buffer->state.subpass; > > + unsigned subpass_idx = subpass - pass->subpasses; > > + assert(subpass_idx < pass->subpass_count); > > + > > + /* Scan forward to see what all ways this attachment will be used. > > +* Ideally, we would like to resolve in the same subpass as the last > write > > +* of a particular attachment. That way we only resolve once but > it's > > +* still hot in the cache. > > +*/ > > + for (uint32_t s = subpass_idx + 1; s < pass->subpass_count; s++) { > > + enum anv_subpass_usage usage = pass->attachments[att]. > subpass_usage[s]; > > I'm wondering if this holds? > > assert(!(usage & ANV_SUBPASS_USAGE_INPUT)); > > > + > > + if (usage & (ANV_SUBPASS_USAGE_DRAW | > ANV_SUBPASS_USAGE_RESOLVE_DST)) { > > + /* We found another subpass that draws to this attachment. > We'll > > + * wait to resolve until then. 
> > + */ > > + return; > > + } > > + } > > + > > + struct anv_image_view *iview = fb->attachments[att]; > > + const struct anv_image *image = iview->image; > > + assert(image->aspects == VK_IMAGE_ASPECT_COLOR_BIT); > > + > > + struct blorp_surf surf; > > + get_blorp_surf_for_anv_image(image, VK_IMAGE_ASPECT_COLOR_BIT, > &surf); > > + surf.aux_surf = &image->aux_surface.isl; > > + surf.aux_addr = (struct blorp_address) { > > + .buffer = image->bo, > > + .offset = image->offset + image->aux_surface.offset, > > + }; > > + surf.aux_usage = att_state->aux_usage; > > + > > + for (uint32_t layer = 0; layer < fb->layers; layer++) { > > + blorp_ccs_resolve(batch, &surf, > > +iview->isl.base_level, > > +iview->isl.base_array_layer + layer, > > +iview->isl.format, > > +BLORP_FAST_CLEAR_OP_RESOLVE_FULL); > > + } > > +} > > + > > void > > anv_cmd_buffer_resolve_subpass(struct anv_cmd_buffer *cmd_buffer) > > { > > struct anv_framebuffer *fb = cmd_buffer->state.framebuffer; > > struct anv_subpass *subpass = cmd_buffer->state.subpass; > > > > - if (!subpass->has_resolve) > > - return; > > > > struct blorp_batch batch; > > blorp_batch_init(&cmd_buffer->device->blorp, &batch, cmd_buffer, 0); > > > > + /* From the Sky Lake PRM
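The scan-forward decision in ccs_resolve_attachment() above can be restated as a small model. This is a Python sketch of the logic only; the flag names mirror the ANV_SUBPASS_USAGE_* names in the quoted diff, but the numeric values and the function name should_resolve_now are assumptions.

```python
from enum import IntFlag


class SubpassUsage(IntFlag):
    """Names follow the quoted diff; the numeric values are assumptions."""
    DRAW = 1
    INPUT = 2
    RESOLVE_DST = 4


def should_resolve_now(subpass_usage, current_subpass):
    """Return True when no later subpass writes the attachment.

    subpass_usage[i] holds the usage flags of the attachment in subpass i.
    If a later subpass draws to it or resolves into it, the resolve is
    deferred to that subpass so it happens only once.
    """
    writers = SubpassUsage.DRAW | SubpassUsage.RESOLVE_DST
    for usage in subpass_usage[current_subpass + 1:]:
        if usage & writers:
            return False
    return True
```

An attachment drawn in subpasses 0 and 2 would therefore be resolved only at the end of subpass 2.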
Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds
Comments inline.. > -Original Message- > From: Emil Velikov [mailto:emil.l.veli...@gmail.com] > Sent: Tuesday, November 8, 2016 8:25 AM > To: Kyriazis, George > Cc: ML mesa-dev > Subject: Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds > > On 7 November 2016 at 22:32, George Kyriazis > wrote: > > - Added SConscript files > > - better handling of NOMINMAX for inclusion > > - Reorder header files in swr_context.cpp to handle NOMINMAX better, > since > > mesa header files include windows.h before we get a chance to #define > > NOMINMAX > > - cleaner support for .dll and .so prefix/suffix across OSes > > - added PUBLIC for some protos > > - added swr_gdi_swap() which is call from libgl_gdi.c > > --- > > src/gallium/drivers/swr/Makefile.am| 8 ++ > > src/gallium/drivers/swr/SConscript | 46 +++ > > src/gallium/drivers/swr/SConscript-arch| 175 > + > > src/gallium/drivers/swr/rasterizer/common/os.h | 5 +- > > src/gallium/drivers/swr/swr_context.cpp| 16 +-- > > src/gallium/drivers/swr/swr_context.h | 2 + > > src/gallium/drivers/swr/swr_loader.cpp | 37 +- > > src/gallium/drivers/swr/swr_public.h | 11 +- > > src/gallium/drivers/swr/swr_screen.cpp | 25 +--- > > 9 files changed, 291 insertions(+), 34 deletions(-) create mode > > 100644 src/gallium/drivers/swr/SConscript > > create mode 100644 src/gallium/drivers/swr/SConscript-arch > > > Similar to 1/3 this patch does too many things. Please _don't_ do that. > > Some ideas based on the above: > - source code fixes - one or multiple patches, depending on details. > - automake fixes - ^^ > - introduce scons build (+ the EXTRA_DIST hunk) > As stated in review of patch 1/3, I will send v2 of patches with different breakdown. > Some misc comments below. > > > > +++ b/src/gallium/drivers/swr/SConscript > > @@ -0,0 +1,46 @@ > > +Import('*') > > + > > +from sys import executable as python_cmd import distutils.version > Seems unused. Maybe it was aimed for the llvm 3.9 requirement/check > mentioned in 1/3 ? 
> Scons build fails without the Import('*'), because env is undefined: NameError: name 'env' is not defined: > > +import os.path > > + > > +if not 'swr' in COMMAND_LINE_TARGETS: > > +Return() > > + > > +if not env['llvm']: > > +print 'warning: LLVM disabled: not building swr' > > +Return() > > + > > +env.MSVC2013Compat() > > + > > > +swr_arch = 'avx' > > +VariantDir('avx', '.', duplicate=0) > > +SConscript('avx/SConscript-arch', exports='swr_arch') > > + > > +swr_arch = 'avx2' > > +VariantDir('avx2', '.', duplicate=0) > > +SConscript('avx2/SConscript-arch', exports='swr_arch') > > + > Afaict one can just fold the SConscript-arch here. Thus one won't need to > bother with the above nor the Depends hunk below. > Additionally with current approach one is generating [the] identical source > files twice. Far from ideal... > The AVX and AVX2 builds build differently (with different compiler flags). At runtime, we load the appropriate dll, based on the underlying architecture. We do the same thing on the linux build. Also, since duplicate=0, source is not duplicated. Yes, generated files are generated twice, however currently SConscript is just a shell around SConscript-arch; all the logic that generates the files and source lists is in SConscript-arch. By moving the auto generation to SConscript will generate only one copy of the gen files, however it splits the build logic into two files, which is more messy. I can certainly move the generation code in SConscript, however, I think that it's cleaner to strive for source code cleanliness, as opposed to generate code cleanliness. > > +env = env.Clone() > > + > > +source = env.ParseSourceList('Makefile.sources', [ > > +'LOADER_SOURCES' > > +]) > > + > > +env.Prepend(CPPPATH = [ > > +'rasterizer/scripts' > > +]) > > + > > +swr = env.ConvenienceLibrary( > > + target = 'swr', > > + source = source, > > + ) > Keep the indentation to 4 spaces here and throughout the SConscripts. > That's a python requirement. 
Ok, will correct that. > In general I'd encourage using .editorconfig and updating the section for swr, > if needed. > > > > +# remove headers, as scons thinks they are static objects for the .so > > +source = [x for x in source if not x.endswith(tuple(['.h','.hpp']))] > > + > Should be handled already. Otherwise please do so in scons/* Quick grep > suggests scons/custom.py > ParseSourceList() will filter out .h files, however it won't filter out .hpp files. Are you saying add the .hpp filter in custom.py? > > > +#ifdef _WIN32 > > + prefix = ""; > > + postfix = ".dll"; > > +#else > > + prefix = "lib"; > > + postfix = ".so"; > > +#endif > > + > Quick grep suggests: > > UTIL_DL_EXT > UTIL_DL_PREFIX > Ah. Thank you! I'll fix this and include the change in the next rev.
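On the header filter discussed above: str.endswith() already accepts a tuple of suffixes, so the tuple([...]) wrapper in the patch is not needed. A minimal sketch, assuming the filter ends up as a shared helper; drop_headers and the file names are hypothetical, and whether such a helper belongs in scons/custom.py is for the scons maintainers to decide.

```python
def drop_headers(sources, header_exts=(".h", ".hpp")):
    """Return sources without header files, preserving order."""
    return [s for s in sources if not s.endswith(header_exts)]


# Hypothetical file names, for illustration only.
filtered = drop_headers(["a.cpp", "a.h", "b.hpp", "c.c"])
```

Here filtered keeps only "a.cpp" and "c.c".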
[Mesa-dev] [Bug 98629] OpenGL applications warns "MESA-LOADER: failed to retrieve device information"
https://bugs.freedesktop.org/show_bug.cgi?id=98629

Emil Velikov changed:

      What    |Removed                        |Added
      Assignee|nouveau@lists.freedesktop.org  |mesa-dev@lists.freedesktop.org
     Component|Drivers/DRI/nouveau            |Mesa core
    QA Contact|nouveau@lists.freedesktop.org  |mesa-dev@lists.freedesktop.org

--- Comment #1 from Emil Velikov ---
[Moving to 'core' since it's not really nouveau specific]

Does this happen with glxinfo/glxgears as well? If so, can you attach the output of
$ strace glxinfo

If glxinfo works fine, while $program does not, attach the output of
$ LD_DEBUG=libs $program

Thanks
Re: [Mesa-dev] [PATCH] gallivm: fix [IU]MUL_HI regression
On 08.11.2016 14:44, Roland Scheidegger wrote: Sorry for breaking radeonsi, I somehow thought this way only used for cpu only already, without actually checking... And thanks for fixing that typo, apparently you can pass piglits umul_hi/imul_hi tests (at least those from the shader_integer_mix group) even with the square of argument a... Yeah, it sucks that test runs take so long with llvmpipe. Is there anybody doing systematic full regression runs on it? I do full runs on radeonsi fairly frequently, and I noticed this bug with tests/spec/arb_gpu_shader5/execution/built-in-functions/fs-imulExtended.shader_test and friends. btw as I didn't consider this, I don't know if you want to change the shift/trunc to shuffle in the end - feel free to change it back if it doesn't generate good code on radeonsi... It seems instcombine has no difficulties seeing through the IR, so I think we're good :) Reviewed-by: Roland Scheidegger Thanks! Nicolai Am 08.11.2016 um 10:15 schrieb Nicolai Hähnle: From: Nicolai Hähnle This patch does two things: 1. It separates the host-CPU code generation from the generic code generation. This guards against accidently breaking things for radeonsi in the future. 2. 
It makes sure we actually use both arguments and don't just compute a square :-p Fixes a regression introduced by commit 29279f44b3172ef3b84d470e70fc7684695ced4b Cc: Roland Scheidegger --- src/gallium/auxiliary/gallivm/lp_bld_arit.c| 72 ++ src/gallium/auxiliary/gallivm/lp_bld_arit.h| 6 ++ src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 40 +++- 3 files changed, 90 insertions(+), 28 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c b/src/gallium/auxiliary/gallivm/lp_bld_arit.c index 3de4628..43ad238 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c @@ -1087,26 +1087,28 @@ lp_build_mul(struct lp_build_context *bld, res = LLVMBuildLShr(builder, res, shift, ""); } } return res; } /* * Widening mul, valid for 32x32 bit -> 64bit only. * Result is low 32bits, high bits returned in res_hi. + * + * Emits code that is meant to be compiled for the host CPU. */ LLVMValueRef -lp_build_mul_32_lohi(struct lp_build_context *bld, - LLVMValueRef a, - LLVMValueRef b, - LLVMValueRef *res_hi) +lp_build_mul_32_lohi_cpu(struct lp_build_context *bld, + LLVMValueRef a, + LLVMValueRef b, + LLVMValueRef *res_hi) { struct gallivm_state *gallivm = bld->gallivm; LLVMBuilderRef builder = gallivm->builder; assert(bld->type.width == 32); assert(bld->type.floating == 0); assert(bld->type.fixed == 0); assert(bld->type.norm == 0); /* @@ -1209,43 +1211,61 @@ lp_build_mul_32_lohi(struct lp_build_context *bld, *res_hi = LLVMBuildShuffleVector(builder, muleven, mulodd, shuf_vec, ""); for (i = 0; i < bld->type.length; i += 2) { shuf[i] = lp_build_const_int32(gallivm, i); shuf[i+1] = lp_build_const_int32(gallivm, i + bld->type.length); } shuf_vec = LLVMConstVector(shuf, bld->type.length); return LLVMBuildShuffleVector(builder, muleven, mulodd, shuf_vec, ""); } else { - LLVMValueRef tmp; - struct lp_type type_tmp; - LLVMTypeRef wide_type, cast_type; - - type_tmp = bld->type; - type_tmp.width *= 2; - wide_type = 
lp_build_vec_type(gallivm, type_tmp); - type_tmp = bld->type; - type_tmp.length *= 2; - cast_type = lp_build_vec_type(gallivm, type_tmp); - - if (bld->type.sign) { - a = LLVMBuildSExt(builder, a, wide_type, ""); - b = LLVMBuildSExt(builder, b, wide_type, ""); - } else { - a = LLVMBuildZExt(builder, a, wide_type, ""); - b = LLVMBuildZExt(builder, b, wide_type, ""); - } - tmp = LLVMBuildMul(builder, a, b, ""); - tmp = LLVMBuildBitCast(builder, tmp, cast_type, ""); - *res_hi = lp_build_uninterleave1(gallivm, bld->type.length * 2, tmp, 1); - return lp_build_uninterleave1(gallivm, bld->type.length * 2, tmp, 0); + return lp_build_mul_32_lohi(bld, a, b, res_hi); + } +} + + +/* + * Widening mul, valid for 32x32 bit -> 64bit only. + * Result is low 32bits, high bits returned in res_hi. + * + * Emits generic code. + */ +LLVMValueRef +lp_build_mul_32_lohi(struct lp_build_context *bld, + LLVMValueRef a, + LLVMValueRef b, + LLVMValueRef *res_hi) +{ + struct gallivm_state *gallivm = bld->gallivm; + LLVMBuilderRef builder = gallivm->builder; + LLVMValueRef tmp; + struct lp_type type_tmp; + LLVMTypeRef wide_type, cast_type; + + type_tmp = bld->type; + type_tmp.width *= 2; + wide_type = lp_build_vec_type(gallivm, type_tmp); + type_tmp = bld->type; +
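For reference, the arithmetic that the generic lp_build_mul_32_lohi() path expresses (extend both operands to 64 bits, multiply, split into low and high halves) can be checked with plain integers. This is a Python model of the math, not of the LLVM IR; the helper names are made up for the sketch.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit pattern as a two's-complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mul_32_lohi(a, b, signed):
    """Widening 32x32 -> 64 multiply on 32-bit patterns; returns (lo, hi)."""
    if signed:
        wide = to_signed32(a) * to_signed32(b)  # sign-extend, then multiply
    else:
        wide = (a & MASK32) * (b & MASK32)      # zero-extend, then multiply
    wide &= MASK64
    return wide & MASK32, (wide >> 32) & MASK32
```

With a = 0x10000 and b = 0x20000 the high word is 2, whereas squaring a gives a high word of 1, which is the kind of difference the regression fix is about.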
Re: [Mesa-dev] [PATCH 1/3] gallium/scons: OpenSWR Windows support
Thank you for the review. Comments inline. > -Original Message- > From: Emil Velikov [mailto:emil.l.veli...@gmail.com] > Sent: Tuesday, November 8, 2016 7:52 AM > To: Kyriazis, George ; Jose Fonseca > > Cc: ML mesa-dev > Subject: Re: [Mesa-dev] [PATCH 1/3] gallium/scons: OpenSWR Windows > support > > Hi George, > > For Scons changes please keep Jose Fonseca in the loop. > > On 7 November 2016 at 22:32, George Kyriazis > wrote: > > - Added code to create screen and handle swaps in libgl_gdi.c > > - Added call to swr SConscript > > - included llvm 3.9 support for scons (windows swr only support 3.9 and > > later) > If that's the case building SWR with earlier one should error out ? > Then again, here you reference gallium/drivers/swr/ > Yes, SWR will only work on windows for 3.9 and above. > > - include -DHAVE_SWR to subdirs that need it > > > As the above indicates here you have multiple independent changes. > Please do _not_ mix those into a single patch. > I'll resend v2 of the patches with a different breakdown. Additional comments on your review of patch 3/3. 
Thanks, George > > > To buils SWR on windows, use "scons swr libgl-gdi" > > --- > > scons/llvm.py | 21 +++-- > > src/gallium/SConscript| 1 + > > src/gallium/targets/libgl-gdi/SConscript | 4 > > src/gallium/targets/libgl-gdi/libgl_gdi.c | 28 > > +++- src/gallium/targets/libgl-xlib/SConscript > | 4 > > src/gallium/targets/osmesa/SConscript | 4 > > 6 files changed, 55 insertions(+), 7 deletions(-) > > > > diff --git a/scons/llvm.py b/scons/llvm.py index 1fc8a3f..977e47a > > 100644 > > --- a/scons/llvm.py > > +++ b/scons/llvm.py > > @@ -106,7 +106,24 @@ def generate(env): > > ]) > > env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')]) > > # LIBS should match the output of `llvm-config --libs engine mcjit > bitwriter x86asmprinter` > > -if llvm_version >= distutils.version.LooseVersion('3.7'): > > +if llvm_version >= distutils.version.LooseVersion('3.9'): > > +env.Prepend(LIBS = [ > > +'LLVMX86Disassembler', 'LLVMX86AsmParser', > > +'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter', > > +'LLVMDebugInfoCodeView', 'LLVMCodeGen', > > +'LLVMScalarOpts', 'LLVMInstCombine', > > +'LLVMInstrumentation', 'LLVMTransformUtils', > > +'LLVMBitWriter', 'LLVMX86Desc', > > +'LLVMMCDisassembler', 'LLVMX86Info', > > +'LLVMX86AsmPrinter', 'LLVMX86Utils', > > +'LLVMMCJIT', 'LLVMExecutionEngine', 'LLVMTarget', > > +'LLVMAnalysis', 'LLVMProfileData', > > +'LLVMRuntimeDyld', 'LLVMObject', 'LLVMMCParser', > > +'LLVMBitReader', 'LLVMMC', 'LLVMCore', > > +'LLVMSupport', > > +'LLVMIRReader', 'LLVMASMParser' > > +]) > LLVM 3.9 support. cc: mesa-stable (if Jose/Brian are up for it). 
> > > +elif llvm_version >= distutils.version.LooseVersion('3.7'): > > env.Prepend(LIBS = [ > > 'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser', > > 'LLVMX86CodeGen', 'LLVMSelectionDAG', > > 'LLVMAsmPrinter', @@ -203,7 +220,7 @@ def generate(env): > > if '-fno-rtti' in cxxflags: > > env.Append(CXXFLAGS = ['-fno-rtti']) > > > > -components = ['engine', 'mcjit', 'bitwriter', 'x86asmprinter', > 'mcdisassembler'] > > +components = ['engine', 'mcjit', 'bitwriter', > > + 'x86asmprinter', 'mcdisassembler', 'irreader'] > Standalone bugfix. Cc: mesa-stable ? > > > > +++ b/src/gallium/SConscript > > > +'drivers/swr/SConscript', > This file is only introduced with 3/3. Which means that you've added scons > support which is broken - please don't do that. > > > > +++ b/src/gallium/targets/libgl-gdi/SConscript > > +++ b/src/gallium/targets/libgl-gdi/libgl_gdi.c > > +++ b/src/gallium/targets/libgl-xlib/SConscript > > +++ b/src/gallium/targets/osmesa/SConscript > > Couple of ideas how to split these. Or anything else that comes to mind on > your end. > > A) > Patch 1 > src/gallium/SConscript > src/gallium/targets/libgl-gdi/SConscript > src/gallium/targets/libgl-gdi/libgl_gdi.c > Patch 2 > src/gallium/targets/libgl-xlib/SConscript > src/gallium/targets/osmesa/SConscript > > B) > Patch 1 > src/gallium/targets/libgl-gdi/libgl_gdi.c > Patch 2 > src/gallium/SConscript > src/gallium/targets/libgl-gdi/SConscript > src/gallium/targets/libgl-xlib/SConscript > src/gallium/targets/osmesa/SConscript > > > Thanks > Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
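The version ladder in the scons/llvm.py hunk above can be sketched without distutils. Note the swap: the quoted code compares distutils.version.LooseVersion objects, while this sketch compares integer tuples instead. The library lists are abbreviated from the quoted diff, only the two branches visible in the hunk are modelled, and purely numeric version strings are assumed.

```python
def parse_version(text):
    """Turn '3.9.1' into (3, 9, 1); assumes purely numeric components."""
    return tuple(int(part) for part in text.split("."))


# Newest first; lists abbreviated from the quoted diff.
LIB_SETS = [
    ((3, 9), ["LLVMX86Disassembler", "LLVMX86AsmParser", "LLVMDebugInfoCodeView"]),
    ((3, 7), ["LLVMBitWriter", "LLVMX86Disassembler", "LLVMX86AsmParser"]),
]


def pick_lib_set(version_text):
    """Return the first library list whose minimum version is satisfied."""
    version = parse_version(version_text)
    for minimum, libs in LIB_SETS:
        if version >= minimum:
            return libs
    return None  # older versions are outside this sketch
```

Tuple comparison orders "3.10" after "3.9", which a plain string comparison would get wrong.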
Re: [Mesa-dev] [PATCH] glcpp: initializes version to -1
On Tue, 2016-11-08 at 14:19 +0100, Karol Herbst wrote:
> well I don't care either way, maybe the spec does say anything about
> it.

I was re-reading the GLSL 1.10 spec about the #version directive. #version follows the same convention as __VERSION__.

For __VERSION__, the spec says it "will substitute a decimal integer reflecting the version number of the OpenGL shading language".

So it is not clear whether we should always read it as decimal, or keep the current behaviour.

J.A.
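To make the decimal question concrete, here is a small sketch of how the same digits read under the two policies. It assumes the alternative to "always decimal" is C-style literal rules (a leading 0 meaning octal, 0x meaning hex); whether that matches glcpp's actual current behaviour is not established in this thread. read_version is a hypothetical helper.

```python
def read_version(text, always_decimal):
    """Parse a #version operand under one of two policies."""
    if always_decimal:
        return int(text, 10)
    # Assumed alternative: C-style integer literal rules.
    lowered = text.lower()
    if lowered.startswith("0x"):
        return int(lowered, 16)
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    return int(text, 10)
```

Under these assumptions "110" reads as 110 either way, while "0110" reads as 110 in the decimal policy and as 72 in the C-style one.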
Re: [Mesa-dev] [PATCH 1/1] Fix endianess detection with musl-based toolchains
On 5 November 2016 at 01:55, Jonathan Gray wrote:
> On Fri, Nov 04, 2016 at 07:53:25PM +0100, Bernd Kuhls wrote:
>> Musl does not define __GLIBC__ and will not provide a __MUSL__ macro:
>> http://wiki.musl-libc.org/wiki/FAQ#Q:_why_is_there_no_MUSL_macro_.3F
>>
>> This patch checks for the presence of endian.h and promotes the result
>> to src/amd/Makefile.addrlib.am which executes the broken build command.
>> Fixes compile errors detected by the autobuilder infrastructure of the
>> buildroot project:
>
> This will break OpenBSD and perhaps other platforms which
> have endian.h that does not define glibc definitions.
>
From a quick skim on my system, glibc provides the non-__-prefixed symbols (BYTE_ORDER and friends) if _DEFAULT_SOURCE is set. The latter is implicitly set in a wide variety of cases (once you get through the ifdef spaghetti).

Is it worth checking whether the non-__ defines are available across the board, and using them?

-Emil
Re: [Mesa-dev] [PATCH] gallivm: Fix build after removal of deprecated attribute API
Aaron Watry wrote:
> On Tue, Nov 8, 2016 at 4:38 AM, Andy Furniss wrote:
>> Tom Stellard wrote:
>>> ---
>>>
>>> Build tested only so far.
>>>
>>>  src/gallium/auxiliary/draw/draw_llvm.c            |  6 +-
>>>  src/gallium/auxiliary/gallivm/lp_bld_intr.c       | 48 +++-
>>>  src/gallium/auxiliary/gallivm/lp_bld_intr.h       | 13 -
>>>  src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |  4 +-
>>>  src/gallium/drivers/radeonsi/si_shader.c          | 69 ---
>>>  src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c | 24
>>>  6 files changed, 112 insertions(+), 52 deletions(-)
>>
>> I notice that llvmpipe needs fixing as well - or maybe that's for
>> someone else?
>
> I sent a patch for that last night. Feel free to give it a spin.

Oops, sorry I missed that - with it I can build OK.
Re: [Mesa-dev] [PATCH 3/3] swr: Support windows builds
On 7 November 2016 at 22:32, George Kyriazis wrote:
> - Added SConscript files
> - better handling of NOMINMAX for inclusion
> - Reorder header files in swr_context.cpp to handle NOMINMAX better, since
>   mesa header files include windows.h before we get a chance to #define
>   NOMINMAX
> - cleaner support for .dll and .so prefix/suffix across OSes
> - added PUBLIC for some protos
> - added swr_gdi_swap() which is called from libgl_gdi.c
> ---
>  src/gallium/drivers/swr/Makefile.am            |   8 ++
>  src/gallium/drivers/swr/SConscript             |  46 +++
>  src/gallium/drivers/swr/SConscript-arch        | 175 +
>  src/gallium/drivers/swr/rasterizer/common/os.h |   5 +-
>  src/gallium/drivers/swr/swr_context.cpp        |  16 +--
>  src/gallium/drivers/swr/swr_context.h          |   2 +
>  src/gallium/drivers/swr/swr_loader.cpp         |  37 +-
>  src/gallium/drivers/swr/swr_public.h           |  11 +-
>  src/gallium/drivers/swr/swr_screen.cpp         |  25 +---
>  9 files changed, 291 insertions(+), 34 deletions(-)
>  create mode 100644 src/gallium/drivers/swr/SConscript
>  create mode 100644 src/gallium/drivers/swr/SConscript-arch
>
Similar to 1/3, this patch does too many things. Please _don't_ do that.

Some ideas based on the above:
 - source code fixes - one or multiple patches, depending on details.
 - automake fixes - ^^
 - introduce scons build (+ the EXTRA_DIST hunk)

Some misc comments below.

> +++ b/src/gallium/drivers/swr/SConscript
> @@ -0,0 +1,46 @@
> +Import('*')
> +
> +from sys import executable as python_cmd
> +import distutils.version
Seems unused. Maybe it was meant for the llvm 3.9 requirement/check
mentioned in 1/3 ?

> +import os.path
> +
> +if not 'swr' in COMMAND_LINE_TARGETS:
> +Return()
> +
> +if not env['llvm']:
> +print 'warning: LLVM disabled: not building swr'
> +Return()
> +
> +env.MSVC2013Compat()
> +
> +swr_arch = 'avx'
> +VariantDir('avx', '.', duplicate=0)
> +SConscript('avx/SConscript-arch', exports='swr_arch')
> +
> +swr_arch = 'avx2'
> +VariantDir('avx2', '.', duplicate=0)
> +SConscript('avx2/SConscript-arch', exports='swr_arch')
> +
Afaict one can just fold the SConscript-arch here. Thus one won't need
to bother with the above nor the Depends hunk below.
Additionally, with the current approach one is generating identical
source files twice. Far from ideal...

> +env = env.Clone()
> +
> +source = env.ParseSourceList('Makefile.sources', [
> +'LOADER_SOURCES'
> +])
> +
> +env.Prepend(CPPPATH = [
> +'rasterizer/scripts'
> +])
> +
> +swr = env.ConvenienceLibrary(
> + target = 'swr',
> + source = source,
> + )
Keep the indentation to 4 spaces here and throughout the SConscripts.
That's the Python convention (PEP 8).

In general I'd encourage using .editorconfig and updating the section
for swr, if needed.

> +# remove headers, as scons thinks they are static objects for the .so
> +source = [x for x in source if not x.endswith(tuple(['.h','.hpp']))]
> +
Should be handled already. Otherwise please do so in scons/*
Quick grep suggests scons/custom.py

> +#ifdef _WIN32
> + prefix = "";
> + postfix = ".dll";
> +#else
> + prefix = "lib";
> + postfix = ".so";
> +#endif
> +
Quick grep suggests:
UTIL_DL_EXT
UTIL_DL_PREFIX

Regards,
Emil
[Mesa-dev] [Bug 98606] Compile error in gallium target VA--LLVM undefined referencences
https://bugs.freedesktop.org/show_bug.cgi?id=98606

--- Comment #2 from Emil Velikov ---
When you say "... I no longer have VA-API decode ability like in the
past", do you have an estimate of when (with what gcc/llvm/etc.
combination) it was working?

Also, please make sure that you start from a clean build when changing
the gallium llvm setting (--enable-gallium-llvm and permutations).
Atm if you build with --enable and then reconfigure/rebuild with
--disable, things will break similar to your log.
Re: [Mesa-dev] [PATCH] gallivm: Fix build after removal of deprecated attribute API
On Tue, Nov 8, 2016 at 4:38 AM, Andy Furniss wrote:
> Tom Stellard wrote:
>
>> ---
>>
>> Build tested only so far.
>>
>>  src/gallium/auxiliary/draw/draw_llvm.c            |  6 +-
>>  src/gallium/auxiliary/gallivm/lp_bld_intr.c       | 48 +++-
>>  src/gallium/auxiliary/gallivm/lp_bld_intr.h       | 13 -
>>  src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c |  4 +-
>>  src/gallium/drivers/radeonsi/si_shader.c          | 69 ---
>>  src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c | 24
>>  6 files changed, 112 insertions(+), 52 deletions(-)
>>
>
> I notice that llvmpipe needs fixing as well - or maybe that's for someone
> else?
>
I sent a patch for that last night. Feel free to give it a spin.

--Aaron
Re: [Mesa-dev] [PATCH 1/3] gallium/scons: OpenSWR Windows support
Hi George,

For Scons changes please keep Jose Fonseca in the loop.

On 7 November 2016 at 22:32, George Kyriazis wrote:
> - Added code to create screen and handle swaps in libgl_gdi.c
> - Added call to swr SConscript
> - included llvm 3.9 support for scons (windows swr only support 3.9 and
>   later)
If that's the case, shouldn't building SWR with an earlier version
error out?
Then again, here you reference gallium/drivers/swr/

> - include -DHAVE_SWR to subdirs that need it
>
As the above indicates, you have multiple independent changes here.
Please do _not_ mix those into a single patch.

> To build SWR on windows, use "scons swr libgl-gdi"
> ---
>  scons/llvm.py                             | 21 +++--
>  src/gallium/SConscript                    |  1 +
>  src/gallium/targets/libgl-gdi/SConscript  |  4
>  src/gallium/targets/libgl-gdi/libgl_gdi.c | 28 +++-
>  src/gallium/targets/libgl-xlib/SConscript |  4
>  src/gallium/targets/osmesa/SConscript     |  4
>  6 files changed, 55 insertions(+), 7 deletions(-)
>
> diff --git a/scons/llvm.py b/scons/llvm.py
> index 1fc8a3f..977e47a 100644
> --- a/scons/llvm.py
> +++ b/scons/llvm.py
> @@ -106,7 +106,24 @@ def generate(env):
> ])
> env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')])
> # LIBS should match the output of `llvm-config --libs engine mcjit bitwriter x86asmprinter`
> -if llvm_version >= distutils.version.LooseVersion('3.7'):
> +if llvm_version >= distutils.version.LooseVersion('3.9'):
> +env.Prepend(LIBS = [
> +'LLVMX86Disassembler', 'LLVMX86AsmParser',
> +'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
> +'LLVMDebugInfoCodeView', 'LLVMCodeGen',
> +'LLVMScalarOpts', 'LLVMInstCombine',
> +'LLVMInstrumentation', 'LLVMTransformUtils',
> +'LLVMBitWriter', 'LLVMX86Desc',
> +'LLVMMCDisassembler', 'LLVMX86Info',
> +'LLVMX86AsmPrinter', 'LLVMX86Utils',
> +'LLVMMCJIT', 'LLVMExecutionEngine', 'LLVMTarget',
> +'LLVMAnalysis', 'LLVMProfileData',
> +'LLVMRuntimeDyld', 'LLVMObject', 'LLVMMCParser',
> +'LLVMBitReader', 'LLVMMC', 'LLVMCore',
> +'LLVMSupport',
> +'LLVMIRReader', 'LLVMASMParser'
> +])
LLVM 3.9 support. cc: mesa-stable (if Jose/Brian are up for it).

> +elif llvm_version >= distutils.version.LooseVersion('3.7'):
> env.Prepend(LIBS = [
> 'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',
> 'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',
> @@ -203,7 +220,7 @@ def generate(env):
> if '-fno-rtti' in cxxflags:
> env.Append(CXXFLAGS = ['-fno-rtti'])
>
> -components = ['engine', 'mcjit', 'bitwriter', 'x86asmprinter', 'mcdisassembler']
> +components = ['engine', 'mcjit', 'bitwriter', 'x86asmprinter', 'mcdisassembler', 'irreader']
Standalone bugfix. Cc: mesa-stable ?

> +++ b/src/gallium/SConscript
> +'drivers/swr/SConscript',
This file is only introduced with 3/3, which means that you've added
scons support which is broken - please don't do that.

> +++ b/src/gallium/targets/libgl-gdi/SConscript
> +++ b/src/gallium/targets/libgl-gdi/libgl_gdi.c
> +++ b/src/gallium/targets/libgl-xlib/SConscript
> +++ b/src/gallium/targets/osmesa/SConscript
A couple of ideas on how to split these. Or anything else that comes to
mind on your end.

A)
Patch 1
 src/gallium/SConscript
 src/gallium/targets/libgl-gdi/SConscript
 src/gallium/targets/libgl-gdi/libgl_gdi.c
Patch 2
 src/gallium/targets/libgl-xlib/SConscript
 src/gallium/targets/osmesa/SConscript

B)
Patch 1
 src/gallium/targets/libgl-gdi/libgl_gdi.c
Patch 2
 src/gallium/SConscript
 src/gallium/targets/libgl-gdi/SConscript
 src/gallium/targets/libgl-xlib/SConscript
 src/gallium/targets/osmesa/SConscript

Thanks
Emil
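The version ladder in the scons/llvm.py hunk quoted above takes the first branch whose minimum version is satisfied, so the newest threshold has to be tested first. A minimal sketch of that pattern in plain Python, without SCons: `parse_version`, `LIBS_BY_MIN_VERSION` and `select_libs` are hypothetical names used only for illustration, the library lists are shortened placeholders rather than the real lists, and branches for older LLVM versions are left out.

```python
def parse_version(text):
    """Turn '3.9' or '3.9.1' into a tuple of ints so comparison is numeric."""
    return tuple(int(part) for part in text.split("."))


# Newest threshold first; each entry is (minimum version, libraries to link).
# The library lists are abbreviated placeholders, not the full lists.
LIBS_BY_MIN_VERSION = [
    ((3, 9), ["LLVMX86CodeGen", "LLVMDebugInfoCodeView", "LLVMIRReader"]),
    ((3, 7), ["LLVMBitWriter", "LLVMX86CodeGen"]),
]


def select_libs(version_text):
    """Return the library list of the first threshold the version satisfies."""
    version = parse_version(version_text)
    for minimum, libs in LIBS_BY_MIN_VERSION:
        if version >= minimum:
            return libs
    # In this sketch anything older than the last threshold is rejected.
    raise ValueError("unsupported LLVM version: " + version_text)


for candidate in ("3.7", "3.8", "3.9", "3.10"):
    print(candidate, select_libs(candidate))
```

Comparing integer tuples keeps a version such as 3.10 above 3.9, which a plain string comparison would get wrong.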
Re: [Mesa-dev] [PATCH] gallivm: fix [IU]MUL_HI regression
Sorry for breaking radeonsi, I somehow thought this was only used for
the CPU already, without actually checking...
And thanks for fixing that typo, apparently you can pass piglit's
umul_hi/imul_hi tests (at least those from the shader_integer_mix group)
even with the square of argument a...
btw as I didn't consider this, I don't know if you want to change the
shift/trunc to shuffle in the end - feel free to change it back if it
doesn't generate good code on radeonsi...

Reviewed-by: Roland Scheidegger

On 08.11.2016 at 10:15, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> This patch does two things:
>
> 1. It separates the host-CPU code generation from the generic code
>    generation. This guards against accidentally breaking things for
>    radeonsi in the future.
>
> 2. It makes sure we actually use both arguments and don't just compute
>    a square :-p
>
> Fixes a regression introduced by commit
> 29279f44b3172ef3b84d470e70fc7684695ced4b
>
> Cc: Roland Scheidegger
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_arit.c        | 72 ++
>  src/gallium/auxiliary/gallivm/lp_bld_arit.h        |  6 ++
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 40 +++-
>  3 files changed, 90 insertions(+), 28 deletions(-)
>
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> index 3de4628..43ad238 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> @@ -1087,26 +1087,28 @@ lp_build_mul(struct lp_build_context *bld,
> res = LLVMBuildLShr(builder, res, shift, "");
> }
> }
>
> return res;
> }
>
> /*
> * Widening mul, valid for 32x32 bit -> 64bit only.
> * Result is low 32bits, high bits returned in res_hi.
> + *
> + * Emits code that is meant to be compiled for the host CPU.
> */ > LLVMValueRef > -lp_build_mul_32_lohi(struct lp_build_context *bld, > - LLVMValueRef a, > - LLVMValueRef b, > - LLVMValueRef *res_hi) > +lp_build_mul_32_lohi_cpu(struct lp_build_context *bld, > + LLVMValueRef a, > + LLVMValueRef b, > + LLVMValueRef *res_hi) > { > struct gallivm_state *gallivm = bld->gallivm; > LLVMBuilderRef builder = gallivm->builder; > > assert(bld->type.width == 32); > assert(bld->type.floating == 0); > assert(bld->type.fixed == 0); > assert(bld->type.norm == 0); > > /* > @@ -1209,43 +1211,61 @@ lp_build_mul_32_lohi(struct lp_build_context *bld, >*res_hi = LLVMBuildShuffleVector(builder, muleven, mulodd, shuf_vec, > ""); > >for (i = 0; i < bld->type.length; i += 2) { > shuf[i] = lp_build_const_int32(gallivm, i); > shuf[i+1] = lp_build_const_int32(gallivm, i + bld->type.length); >} >shuf_vec = LLVMConstVector(shuf, bld->type.length); >return LLVMBuildShuffleVector(builder, muleven, mulodd, shuf_vec, ""); > } > else { > - LLVMValueRef tmp; > - struct lp_type type_tmp; > - LLVMTypeRef wide_type, cast_type; > - > - type_tmp = bld->type; > - type_tmp.width *= 2; > - wide_type = lp_build_vec_type(gallivm, type_tmp); > - type_tmp = bld->type; > - type_tmp.length *= 2; > - cast_type = lp_build_vec_type(gallivm, type_tmp); > - > - if (bld->type.sign) { > - a = LLVMBuildSExt(builder, a, wide_type, ""); > - b = LLVMBuildSExt(builder, b, wide_type, ""); > - } else { > - a = LLVMBuildZExt(builder, a, wide_type, ""); > - b = LLVMBuildZExt(builder, b, wide_type, ""); > - } > - tmp = LLVMBuildMul(builder, a, b, ""); > - tmp = LLVMBuildBitCast(builder, tmp, cast_type, ""); > - *res_hi = lp_build_uninterleave1(gallivm, bld->type.length * 2, tmp, > 1); > - return lp_build_uninterleave1(gallivm, bld->type.length * 2, tmp, 0); > + return lp_build_mul_32_lohi(bld, a, b, res_hi); > + } > +} > + > + > +/* > + * Widening mul, valid for 32x32 bit -> 64bit only. > + * Result is low 32bits, high bits returned in res_hi. > + * > + * Emits generic code. 
> + */ > +LLVMValueRef > +lp_build_mul_32_lohi(struct lp_build_context *bld, > + LLVMValueRef a, > + LLVMValueRef b, > + LLVMValueRef *res_hi) > +{ > + struct gallivm_state *gallivm = bld->gallivm; > + LLVMBuilderRef builder = gallivm->builder; > + LLVMValueRef tmp; > + struct lp_type type_tmp; > + LLVMTypeRef wide_type, cast_type; > + > + type_tmp = bld->type; > + type_tmp.width *= 2; > + wide_type = lp_build_vec_type(gallivm, type_tmp); > + type_tmp = bld->type; > + type_tmp.length *= 2; > + cast_type = lp_build_vec_type(gallivm, type_tmp); > + > + if (bld->type.sign) { > + a = LLVMBuildSExt(builder, a, wide_type, ""); > + b = LLVMBuildSExt(builder, b, wide_ty
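The generic path in the patch extends both operands to 64 bits (sign- or zero-extending depending on the type), multiplies, and splits the product into low and high halves. A minimal sketch of that arithmetic on plain Python integers, assuming 32-bit inputs given as unsigned bit patterns: `to_signed32` and `mul_32_lohi` here are illustrative helpers, not the gallivm functions, and they model scalar values only, not vectors.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def mul_32_lohi(a, b, signed):
    """Return (low 32 bits, high 32 bits) of the 64-bit product of a and b."""
    if signed:
        wide = to_signed32(a) * to_signed32(b)   # sign-extend, then multiply
    else:
        wide = (a & MASK32) * (b & MASK32)       # zero-extend, then multiply
    wide &= MASK64                               # 64-bit two's-complement pattern
    return wide & MASK32, wide >> 32


print(mul_32_lohi(0xFFFFFFFF, 2, signed=False))
print(mul_32_lohi(0xFFFFFFFF, 2, signed=True))
```

A variant that multiplies `a` by itself returns the same result whenever both operands happen to be equal, so a check meant to catch the "square" mistake needs distinct values for `a` and `b`.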
Re: [Mesa-dev] [PATCH 1/4] linker: Trivial coding standards fixes
On Tue, Nov 8, 2016 at 12:50 AM, Ian Romanick wrote:
> - virtual void visit_field(const glsl_type *type, const char *name,
> -bool row_major)
> + virtual void visit_field(const glsl_type *, const char *,
> +bool /* row_major */)
> {
> - (void) type;
> - (void) name;
> - (void) row_major;
> - assert(!"Should not get here.");
> + unreachable("Should not get here.");
> }

I'd be in favor of leaving this as an assert. The unreachable gets you
nothing here, except potential infinite loops on production builds
should this path ever get hit somehow.

I think people have started going overboard with unreachable... it
really should be for "shut up compiler, this can't happen, you're just
too dumb to see it" cases. Not for "it would be a bug to hit this path"
cases.

-ilia
Re: [Mesa-dev] [PATCH] glcpp: initializes version to -1
2016-11-08 13:35 GMT+01:00 Juan A. Suarez Romero :
> On Sat, 2016-11-05 at 10:48 +0100, Karol Herbst wrote:
>> "#version 0512": 0:1(10): error: GLSL 3.30 is not supported.
>> Supported
>> versions are: 1.10, 1.20, 1.30, 1.00 ES, and 3.00 ES
>>
>> so the issue with this would be, that "0512" is parsed as 3.30, which
>> isn't right either, but the current master version does the same. \o/
>> new bug found
>
>
> Doing a quick check, not sure if this is a bug... 0512 is interpreted
> in octal format, which in decimal is 330. Same for 0130, which is 88 in
> decimal.
>
>
> So unless we want to force all the values to be read as decimal, I
> wouldn't say it is a bug.
>
>
> J.A.
>

well I don't care either way, maybe the spec does say anything about it.
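The octal reading described in the quoted reply matches what C-style base auto-detection (as in `strtol` with base 0) does with a leading zero. A minimal sketch of the two possible readings: `parse_base0` and `parse_decimal` are hypothetical helpers written for illustration, not glcpp code, and whether glcpp actually uses base auto-detection is an assumption inferred from the behaviour reported above.

```python
def parse_base0(text):
    """Mimic C base auto-detection: 0x prefix is hex, leading 0 is octal."""
    lowered = text.lower()
    if lowered.startswith("0x"):
        return int(lowered[2:], 16)
    if len(text) > 1 and text.startswith("0"):
        return int(text[1:], 8)
    return int(text, 10)


def parse_decimal(text):
    """Always read the digits as decimal, ignoring any leading zeros."""
    return int(text, 10)


# Show both readings side by side for the values discussed in the thread.
for directive in ("0512", "0130", "330"):
    print(directive, parse_base0(directive), parse_decimal(directive))
```

Under the first reading "#version 0512" becomes 330, which is why the error message mentions GLSL 3.30; under the second it would be 512.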