[Mesa-dev] [Bug 99553] Tracker bug for running OpenCL applications on Clover
https://bugs.freedesktop.org/show_bug.cgi?id=99553

Timothy Arceri changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |109329

Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=109329
[Bug 109329] Luxmark freezes the system
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v4 33/40] intel/compiler: also set F execution type for mixed float mode in BDW
Still waiting on this, specifically we are missing reviews for patches 33, 34, 36 and 37.

On Sat, 2019-02-16 at 09:58 -0600, Jason Ekstrand wrote:
> Matt, Curro,
>
> Could one of you please take a look at this and the other validator
> patches in this series? Region restrictions aren't my strongest
> area.
>
> On Tue, Feb 12, 2019 at 5:56 AM Iago Toral Quiroga wrote:
> > The section 'Execution Data Types' of 3D Media GPGPU volume, which
> > describes execution types, is exactly the same in BDW and SKL+.
> >
> > Also, this section states that there is a single execution type, so it
> > makes sense that this is the wider of the two floating point types
> > involved in mixed float mode, which is what we do for SKL+ and CHV.
> > ---
> >  src/intel/compiler/brw_eu_validate.c | 18 +++---
> >  1 file changed, 7 insertions(+), 11 deletions(-)
> >
> > diff --git a/src/intel/compiler/brw_eu_validate.c b/src/intel/compiler/brw_eu_validate.c
> > index 358a0347a93..000a05cb6ac 100644
> > --- a/src/intel/compiler/brw_eu_validate.c
> > +++ b/src/intel/compiler/brw_eu_validate.c
> > @@ -431,18 +431,14 @@ execution_type(const struct gen_device_info *devinfo, const brw_inst *inst)
> >         src1_exec_type == BRW_REGISTER_TYPE_DF)
> >        return BRW_REGISTER_TYPE_DF;
> >
> > -   if (devinfo->gen >= 9 || devinfo->is_cherryview) {
> > -      if (dst_exec_type == BRW_REGISTER_TYPE_F ||
> > -          src0_exec_type == BRW_REGISTER_TYPE_F ||
> > -          src1_exec_type == BRW_REGISTER_TYPE_F) {
> > -         return BRW_REGISTER_TYPE_F;
> > -      } else {
> > -         return BRW_REGISTER_TYPE_HF;
> > -      }
> > +   if (dst_exec_type == BRW_REGISTER_TYPE_F ||
> > +       src0_exec_type == BRW_REGISTER_TYPE_F ||
> > +       src1_exec_type == BRW_REGISTER_TYPE_F) {
> > +      return BRW_REGISTER_TYPE_F;
> > +   } else {
> > +      assert(devinfo->gen >= 8 && src0_exec_type == BRW_REGISTER_TYPE_HF);
> > +      return BRW_REGISTER_TYPE_HF;
> >     }
> > -
> > -   assert(src0_exec_type == BRW_REGISTER_TYPE_F);
> > -   return BRW_REGISTER_TYPE_F;
> >  }
> >
> >  /**
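For readers skimming the thread, here is a minimal sketch of the rule the patch applies to mixed float mode: if any operand is F, the execution type is F, otherwise it is HF. The enum and function names below are hypothetical stand-ins chosen for illustration, not the identifiers from brw_eu_validate.c, and the DF case and the devinfo assertion from the real function are left out.

```c
#include <assert.h>

/* Hypothetical stand-in for the two float widths involved in mixed mode. */
enum float_type {
   FLOAT_HF, /* 16-bit half float */
   FLOAT_F   /* 32-bit float */
};

/* Return the single execution type for a mixed-float instruction:
 * the wider of the floating-point types seen on dst, src0 and src1. */
static enum float_type
mixed_float_exec_type(enum float_type dst,
                      enum float_type src0,
                      enum float_type src1)
{
   if (dst == FLOAT_F || src0 == FLOAT_F || src1 == FLOAT_F)
      return FLOAT_F;

   /* Every operand is HF, so HF is the execution type. */
   return FLOAT_HF;
}
```

With this shape there is no per-generation branch left for the F-versus-HF decision, which seems to be the point of the patch: BDW takes the same path as SKL+ and CHV.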
[Mesa-dev] [Bug 109446] Shadow of the Tomb Raider Trial freezes the system at startup
https://bugs.freedesktop.org/show_bug.cgi?id=109446

fin4...@hotmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from fin4...@hotmail.com ---
(In reply to fin4478 from comment #5)
> With Padoka ppa this game fails to start so the LLVM version is not the
> cause of this bug.

I did not have mesa-vulkan packages installed. The game runs with Padoka ppa, so LLVM 7.01 is the cause of this bug.
[Mesa-dev] [Bug 109446] Shadow of the Tomb Raider Trial freezes the system at startup
https://bugs.freedesktop.org/show_bug.cgi?id=109446

--- Comment #5 from fin4...@hotmail.com ---
With Padoka ppa this game fails to start so the LLVM version is not the cause of this bug.
Re: [Mesa-dev] [PATCH 0/4] RadeonSI: Upload constants to VRAM via SDMA
We need to extend the CS ioctl to allow submitting 2 command buffers at the same time.

Marek

On Mon, Feb 25, 2019, 10:06 PM Dieter Nützel wrote:
> Hello Marek,
>
> you wrote with your series sent:
>
> [-]
> Trivial benchmarks such as glxgears can expect 20% decrease
> in performance due to the added cost of the SDMA CS ioctl that wasn't
> there before.
> [-]
>
> Any ideas to speed this up, again?
> glmark2 went from 9766 (best ever) down to 7455 (all with NIR).
> Or are micro benchmarks not worth more effort?
>
> Dieter
>
> SDMA
> ===
> glmark2 2017.07
> ===
> OpenGL Information
> GL_VENDOR: X.Org
> GL_RENDERER: Radeon RX 580 Series (POLARIS10, DRM 3.30.0, 5.0.0-rc1-1.g7262353-default+, LLVM 9.0.0)
> GL_VERSION: 4.5 (Compatibility Profile) Mesa 19.1.0-devel (git-a9b32aaa16)
> ===
> [build] use-vbo=false: FPS: 3694 FrameTime: 0.271 ms
> [build] use-vbo=true: FPS: 9341 FrameTime: 0.107 ms
> [texture] texture-filter=nearest: FPS: 9140 FrameTime: 0.109 ms
> [texture] texture-filter=linear: FPS: 9163 FrameTime: 0.109 ms
> [texture] texture-filter=mipmap: FPS: 9161 FrameTime: 0.109 ms
> [shading] shading=gouraud: FPS: 9234 FrameTime: 0.108 ms
> [shading] shading=blinn-phong-inf: FPS: 9255 FrameTime: 0.108 ms
> [shading] shading=phong: FPS: 9226 FrameTime: 0.108 ms
> [shading] shading=cel: FPS: 9310 FrameTime: 0.107 ms
> [bump] bump-render=high-poly: FPS: 9298 FrameTime: 0.108 ms
> [bump] bump-render=normals: FPS: 9121 FrameTime: 0.110 ms
> [bump] bump-render=height: FPS: 9120 FrameTime: 0.110 ms
> libpng warning: iCCP: known incorrect sRGB profile
> [effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 9858 FrameTime: 0.101 ms
> libpng warning: iCCP: known incorrect sRGB profile
> [effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 9854 FrameTime: 0.101 ms
> [pulsar] light=false:quads=5:texture=false: FPS: 8468 FrameTime: 0.118 ms
> libpng warning: iCCP: known incorrect sRGB profile
> [desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 5181 FrameTime: 0.193 ms
> libpng warning: iCCP: known incorrect sRGB profile
> [desktop] effect=shadow:windows=4: FPS: 5374 FrameTime: 0.186 ms
> [buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 824 FrameTime: 1.214 ms
> [buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 1114 FrameTime: 0.898 ms
> [buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 899 FrameTime: 1.112 ms
> [ideas] speed=duration: FPS: 3485 FrameTime: 0.287 ms
> [jellyfish] : FPS: 7992 FrameTime: 0.125 ms
> [terrain] : FPS: 1796 FrameTime: 0.557 ms
> [shadow] : FPS: 7350 FrameTime: 0.136 ms
> [refract] : FPS: 3595 FrameTime: 0.278 ms
> [conditionals] fragment-steps=0:vertex-steps=0: FPS: 9401 FrameTime: 0.106 ms
> [conditionals] fragment-steps=5:vertex-steps=0: FPS: 9413 FrameTime: 0.106 ms
> [conditionals] fragment-steps=0:vertex-steps=5: FPS: 9417 FrameTime: 0.106 ms
> [function] fragment-complexity=low:fragment-steps=5: FPS: 9365 FrameTime: 0.107 ms
> [function] fragment-complexity=medium:fragment-steps=5: FPS: 9451 FrameTime: 0.106 ms
> [loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 9300 FrameTime: 0.108 ms
> [loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 9440 FrameTime: 0.106 ms
> [loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 9392 FrameTime: 0.106 ms
> ===
> glmark2 Score: 7455
> ===
>
>
> Before
> ===
> glmark2 2017.07
> ===
> OpenGL Information
> GL_VENDOR: X.Org
> GL_RENDERER: Radeon RX 580 Series (POLARIS10, DRM 3.27.0, 4.20.0-rc3-1.g7262353-default+, LLVM 8.0.0)
> GL_VERSION: 4.5 (Compatibility Profile) Mesa 19.0.0-devel (git-c49b3df3cb)
> ===
> [build] use-vbo=false: FPS: 3373 FrameTime: 0.296 ms
> [build] use-vbo=true: FPS: 13121 FrameTime: 0.076 ms
> [texture] texture-filter=nearest: FPS: 12172 FrameTime: 0.082 ms
> [texture] texture-filter=linear: FPS: 12557 FrameTime: 0.080 ms
> [texture] texture-filter=mipmap: FPS: 12228 FrameTime: 0.082 ms
> [shading] shading=gouraud: FPS: 12536 FrameTime: 0.080 ms
> [shading] shading=blinn-phong-inf: FPS: 12782 FrameTime: 0.078 ms
> [shading] shading=phong: FPS: 12619 FrameTime: 0.079 ms
> [shading] shading=cel: FPS: 12735 FrameTime: 0.079 ms
> [bump] bump-render=high-poly: FPS: 11412 FrameTime: 0.088 ms
> [bump] bump-render=normals: FPS: 12467 FrameTime: 0.080 ms
>
[Mesa-dev] [Bug 109786] Assassin's Creed Black Flag hangs when starting the Abstergo Interlude 1 mission
https://bugs.freedesktop.org/show_bug.cgi?id=109786

            Bug ID: 109786
           Summary: Assassin's Creed Black Flag hangs when starting the Abstergo Interlude 1 mission
           Product: Mesa
           Version: git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Drivers/Vulkan/radeon
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: fin4...@hotmail.com
        QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 143470
  --> https://bugs.freedesktop.org/attachment.cgi?id=143470=edit
dmesg output

The game works fine from the beginning, but it can not continue because of this bug:
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered

I use DXVK 1.0 with latest drivers. Other users have reported that the game works fine with nvidia drivers and DXVK and not with amdgpu drivers.

My system:
Host: ryzenpc Kernel: 5.0.0-rc7 x86_64 bits: 64 Desktop: Xfce 4.13.2 Distro: Debian GNU/Linux buster/sid
Machine: Type: Desktop Mobo: ASUSTeK model: PRIME B350M-K v: Rev X.0x serial: UEFI [Legacy]: American Megatrends v: 4207 date: 12/07/2018
CPU: 6-Core: AMD Ryzen 5 1600 type: MT MCP speed: 2759 MHz
Graphics: Device-1: AMD Ellesmere [Radeon RX 470/480] driver: amdgpu v: kernel
Display: x11 server: X.Org 1.20.3 driver: amdgpu resolution: 3840x2160~60Hz
OpenGL: renderer: Radeon RX 570 Series (POLARIS10 DRM 3.27.0 5.0.0-rc7 LLVM 9.0.0) v: 4.5 Mesa 19.1.0-devel - padoka PPA
[Mesa-dev] [Bug 109443] Build failure with MSVC when using Scons >= 3.0.2
https://bugs.freedesktop.org/show_bug.cgi?id=109443

--- Comment #5 from William Deegan ---
Found some issues with the patch I attached. Should have an updated one tomorrow.
Re: [Mesa-dev] [PATCH] st/mesa: Reduce array updates due to current changes.
Hi,

Thanks, Brian and Marek for the review!

best
Mathias

On Monday, 25 February 2019 22:44:37 CET Marek Olšák wrote:
> Reviewed-by: Marek Olšák
>
> Marek
>
> On Sun, Feb 24, 2019 at 1:46 AM wrote:
>
> > From: Mathias Fröhlich
> >
> > Hi Brian,
> >
> > Following a small optimization in the gallium state tracker to
> > avoid flagging ST_NEW_VERTEX_ARRAYS a bit more often:
> >
> > please review!
> >
> > best
> >
> > Mathias
> >
> >
> > Since using bitmasks we can easily check if we have any
> > current value that is potentially uploaded on array setup.
> > So check for any potential vertex program input that is not
> > already a vao enabled array. Only flag array update if there is
> > a potential overlap.
> >
> > Signed-off-by: Mathias Fröhlich
> > ---
> >  src/mesa/state_tracker/st_context.c | 2 +-
> >  src/mesa/state_tracker/st_context.h | 9 +
> >  2 files changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/state_tracker/st_context.c b/src/mesa/state_tracker/st_context.c
> > index 0a0bd8ba1ca..45451531df9 100644
> > --- a/src/mesa/state_tracker/st_context.c
> > +++ b/src/mesa/state_tracker/st_context.c
> > @@ -224,7 +224,7 @@ st_invalidate_state(struct gl_context *ctx)
> >     if (new_state & _NEW_PIXEL)
> >        st->dirty |= ST_NEW_PIXEL_TRANSFER;
> >
> > -   if (new_state & _NEW_CURRENT_ATTRIB)
> > +   if (new_state & _NEW_CURRENT_ATTRIB && st_vp_uses_current_values(ctx))
> >        st->dirty |= ST_NEW_VERTEX_ARRAYS;
> >
> >     /* Update the vertex shader if ctx->Light._ClampVertexColor was changed. */
> > diff --git a/src/mesa/state_tracker/st_context.h b/src/mesa/state_tracker/st_context.h
> > index ed69e3d4873..324a7f24178 100644
> > --- a/src/mesa/state_tracker/st_context.h
> > +++ b/src/mesa/state_tracker/st_context.h
> > @@ -28,6 +28,7 @@
> >  #ifndef ST_CONTEXT_H
> >  #define ST_CONTEXT_H
> >
> > +#include "main/arrayobj.h"
> >  #include "main/mtypes.h"
> >  #include "state_tracker/st_api.h"
> >  #include "main/fbobject.h"
> > @@ -398,6 +399,14 @@ st_user_clip_planes_enabled(struct gl_context *ctx)
> >        ctx->Transform.ClipPlanesEnabled;
> >  }
> >
> > +
> > +static inline bool
> > +st_vp_uses_current_values(const struct gl_context *ctx)
> > +{
> > +   const uint64_t inputs = ctx->VertexProgram._Current->info.inputs_read;
> > +   return _mesa_draw_current_bits(ctx) & inputs;
> > +}
> > +
> >  /** clear-alloc a struct-sized object, with casting */
> >  #define ST_CALLOC_STRUCT(T) (struct T *) calloc(1, sizeof(struct T))
> >
> > --
> > 2.20.1
> >
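The check in the patch boils down to a bitmask intersection: only flag the vertex arrays dirty when a changed current attribute can actually reach the vertex program. A minimal standalone sketch of that idea follows. The flag value, the function names and the two plain uint64_t parameters are assumptions made for illustration; the real code reads these masks from the GL context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical dirty flag, standing in for ST_NEW_VERTEX_ARRAYS. */
#define DIRTY_VERTEX_ARRAYS (UINT64_C(1) << 0)

/* True if any attribute sourced from "current" values (i.e. not backed by
 * an enabled array) is also read by the vertex program. */
static bool
uses_current_values(uint64_t current_bits, uint64_t inputs_read)
{
   return (current_bits & inputs_read) != 0;
}

/* Called when a current attribute changed: only flag an array update if
 * the two masks overlap, otherwise leave the dirty bits untouched. */
static uint64_t
flag_on_current_change(uint64_t dirty, uint64_t current_bits,
                       uint64_t inputs_read)
{
   if (uses_current_values(current_bits, inputs_read))
      dirty |= DIRTY_VERTEX_ARRAYS;
   return dirty;
}
```

The sketch compares against zero explicitly; with a C99 bool return type a bare mask expression converts the same way, so this is only a readability choice.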
[Mesa-dev] [Bug 109443] Build failure with MSVC when using Scons >= 3.0.2
https://bugs.freedesktop.org/show_bug.cgi?id=109443

William Deegan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |b...@baddogconsulting.com

--- Comment #4 from William Deegan ---
Created attachment 143466
  --> https://bugs.freedesktop.org/attachment.cgi?id=143466=edit
Trial patch to fix issue

Trial patch to SCons to address issue.
Please apply to SCons/Node/FS.py and let me know if it resolves the issue.

The bug seems to be windows only and has to do with looking up info about previous build to decide if a target needs to be built.

I'm still running tests locally on it against mesa win build. Next I'll run against SCons' own test suite.
[Mesa-dev] [PATCH] svga: fix dma.pending > 0 test
The dma.pending field is boolean, so testing for > 0 isn't right.
---
 src/gallium/drivers/svga/svga_resource_buffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/svga/svga_resource_buffer.c b/src/gallium/drivers/svga/svga_resource_buffer.c
index e4d12f0..a3e11ad 100644
--- a/src/gallium/drivers/svga/svga_resource_buffer.c
+++ b/src/gallium/drivers/svga/svga_resource_buffer.c
@@ -117,7 +117,7 @@ svga_buffer_transfer_map(struct pipe_context *pipe,
          (void) svga_buffer_handle(svga, resource, sbuf->bind_flags);
       }

-      if (sbuf->dma.pending > 0) {
+      if (sbuf->dma.pending) {
          svga_buffer_upload_flush(svga, sbuf);
          svga_context_finish(svga);
       }
--
1.8.5.6
Re: [Mesa-dev] [PATCH 0/4] RadeonSI: Upload constants to VRAM via SDMA
Hello Marek,

you wrote with your series sent:

[-]
Trivial benchmarks such as glxgears can expect 20% decrease
in performance due to the added cost of the SDMA CS ioctl that wasn't
there before.
[-]

Any ideas to speed this up, again?
glmark2 went from 9766 (best ever) down to 7455 (all with NIR).
Or are micro benchmarks not worth more effort?

Dieter

SDMA
===
glmark2 2017.07
===
OpenGL Information
GL_VENDOR: X.Org
GL_RENDERER: Radeon RX 580 Series (POLARIS10, DRM 3.30.0, 5.0.0-rc1-1.g7262353-default+, LLVM 9.0.0)
GL_VERSION: 4.5 (Compatibility Profile) Mesa 19.1.0-devel (git-a9b32aaa16)
===
[build] use-vbo=false: FPS: 3694 FrameTime: 0.271 ms
[build] use-vbo=true: FPS: 9341 FrameTime: 0.107 ms
[texture] texture-filter=nearest: FPS: 9140 FrameTime: 0.109 ms
[texture] texture-filter=linear: FPS: 9163 FrameTime: 0.109 ms
[texture] texture-filter=mipmap: FPS: 9161 FrameTime: 0.109 ms
[shading] shading=gouraud: FPS: 9234 FrameTime: 0.108 ms
[shading] shading=blinn-phong-inf: FPS: 9255 FrameTime: 0.108 ms
[shading] shading=phong: FPS: 9226 FrameTime: 0.108 ms
[shading] shading=cel: FPS: 9310 FrameTime: 0.107 ms
[bump] bump-render=high-poly: FPS: 9298 FrameTime: 0.108 ms
[bump] bump-render=normals: FPS: 9121 FrameTime: 0.110 ms
[bump] bump-render=height: FPS: 9120 FrameTime: 0.110 ms
libpng warning: iCCP: known incorrect sRGB profile
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 9858 FrameTime: 0.101 ms
libpng warning: iCCP: known incorrect sRGB profile
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 9854 FrameTime: 0.101 ms
[pulsar] light=false:quads=5:texture=false: FPS: 8468 FrameTime: 0.118 ms
libpng warning: iCCP: known incorrect sRGB profile
[desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 5181 FrameTime: 0.193 ms
libpng warning: iCCP: known incorrect sRGB profile
[desktop] effect=shadow:windows=4: FPS: 5374 FrameTime: 0.186 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 824 FrameTime: 1.214 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 1114 FrameTime: 0.898 ms
[buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 899 FrameTime: 1.112 ms
[ideas] speed=duration: FPS: 3485 FrameTime: 0.287 ms
[jellyfish] : FPS: 7992 FrameTime: 0.125 ms
[terrain] : FPS: 1796 FrameTime: 0.557 ms
[shadow] : FPS: 7350 FrameTime: 0.136 ms
[refract] : FPS: 3595 FrameTime: 0.278 ms
[conditionals] fragment-steps=0:vertex-steps=0: FPS: 9401 FrameTime: 0.106 ms
[conditionals] fragment-steps=5:vertex-steps=0: FPS: 9413 FrameTime: 0.106 ms
[conditionals] fragment-steps=0:vertex-steps=5: FPS: 9417 FrameTime: 0.106 ms
[function] fragment-complexity=low:fragment-steps=5: FPS: 9365 FrameTime: 0.107 ms
[function] fragment-complexity=medium:fragment-steps=5: FPS: 9451 FrameTime: 0.106 ms
[loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 9300 FrameTime: 0.108 ms
[loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 9440 FrameTime: 0.106 ms
[loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 9392 FrameTime: 0.106 ms
===
glmark2 Score: 7455
===

Before
===
glmark2 2017.07
===
OpenGL Information
GL_VENDOR: X.Org
GL_RENDERER: Radeon RX 580 Series (POLARIS10, DRM 3.27.0, 4.20.0-rc3-1.g7262353-default+, LLVM 8.0.0)
GL_VERSION: 4.5 (Compatibility Profile) Mesa 19.0.0-devel (git-c49b3df3cb)
===
[build] use-vbo=false: FPS: 3373 FrameTime: 0.296 ms
[build] use-vbo=true: FPS: 13121 FrameTime: 0.076 ms
[texture] texture-filter=nearest: FPS: 12172 FrameTime: 0.082 ms
[texture] texture-filter=linear: FPS: 12557 FrameTime: 0.080 ms
[texture] texture-filter=mipmap: FPS: 12228 FrameTime: 0.082 ms
[shading] shading=gouraud: FPS: 12536 FrameTime: 0.080 ms
[shading] shading=blinn-phong-inf: FPS: 12782 FrameTime: 0.078 ms
[shading] shading=phong: FPS: 12619 FrameTime: 0.079 ms
[shading] shading=cel: FPS: 12735 FrameTime: 0.079 ms
[bump] bump-render=high-poly: FPS: 11412 FrameTime: 0.088 ms
[bump] bump-render=normals: FPS: 12467 FrameTime: 0.080 ms
[bump] bump-render=height: FPS: 12422 FrameTime: 0.081 ms
libpng warning: iCCP: known incorrect sRGB profile
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 13252 FrameTime: 0.075 ms
libpng warning: iCCP: known incorrect sRGB profile
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 11468 FrameTime: 0.087 ms
[pulsar]
[Mesa-dev] [Bug 109784] rasterizer/archrast/eventmanager.h:58:21: error: ‘EventHandler’ has not been declared
https://bugs.freedesktop.org/show_bug.cgi?id=109784

            Bug ID: 109784
           Summary: rasterizer/archrast/eventmanager.h:58:21: error: ‘EventHandler’ has not been declared
           Product: Mesa
           Version: git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Keywords: regression
          Severity: normal
          Priority: medium
         Component: Drivers/Gallium/swr
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: v...@freedesktop.org
        QA Contact: mesa-dev@lists.freedesktop.org
                CC: alok.h...@intel.com

  CXX      rasterizer/archrast/libswrAVX_la-archrast.lo
In file included from ./rasterizer/archrast/archrast.h:32,
                 from rasterizer/archrast/archrast.cpp:31:
./rasterizer/archrast/eventmanager.h:58:21: error: ‘EventHandler’ has not been declared
         void Attach(EventHandler* pHandler)
                     ^~~~
Re: [Mesa-dev] [PATCH 1/2] radeonsi: always use compute rings for clover on CI and newer (v2)
Hello Marek,

this series needs a rebase (if you have some time).

Dieter

On 12.02.2019 19:12, Marek Olšák wrote:
From: Marek Olšák

initialize all non-compute context functions to NULL.

v2: fix SI
---
 src/gallium/drivers/radeonsi/si_blit.c        | 14 ++-
 src/gallium/drivers/radeonsi/si_clear.c       | 7 +-
 src/gallium/drivers/radeonsi/si_compute.c     | 15 +--
 src/gallium/drivers/radeonsi/si_descriptors.c | 10 +-
 src/gallium/drivers/radeonsi/si_gfx_cs.c      | 29 +++--
 src/gallium/drivers/radeonsi/si_pipe.c        | 95 +++
 src/gallium/drivers/radeonsi/si_pipe.h        | 3 +-
 src/gallium/drivers/radeonsi/si_state.c       | 3 +-
 src/gallium/drivers/radeonsi/si_state.h       | 1 +
 src/gallium/drivers/radeonsi/si_state_draw.c  | 25 +++--
 src/gallium/drivers/radeonsi/si_texture.c     | 3 +
 11 files changed, 130 insertions(+), 75 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c b/src/gallium/drivers/radeonsi/si_blit.c
index bb8d1cbd12d..f39cb5d143f 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -1345,25 +1345,31 @@ static void si_flush_resource(struct pipe_context *ctx,
 		if (separate_dcc_dirty) {
 			tex->separate_dcc_dirty = false;
 			vi_separate_dcc_process_and_reset_stats(ctx, tex);
 		}
 	}
 }

 void si_decompress_dcc(struct si_context *sctx, struct si_texture *tex)
 {
-	if (!tex->dcc_offset)
+	/* If graphics is disabled, we can't decompress DCC, but it shouldn't
+	 * be compressed either. The caller should simply discard it.
+	 */
+	if (!tex->dcc_offset || !sctx->has_graphics)
 		return;

 	si_blit_decompress_color(sctx, tex, 0, tex->buffer.b.b.last_level,
 				 0, util_max_layer(&tex->buffer.b.b, 0),
 				 true);
 }

 void si_init_blit_functions(struct si_context *sctx)
 {
 	sctx->b.resource_copy_region = si_resource_copy_region;
-	sctx->b.blit = si_blit;
-	sctx->b.flush_resource = si_flush_resource;
-	sctx->b.generate_mipmap = si_generate_mipmap;
+
+	if (sctx->has_graphics) {
+		sctx->b.blit = si_blit;
+		sctx->b.flush_resource = si_flush_resource;
+		sctx->b.generate_mipmap = si_generate_mipmap;
+	}
 }
diff --git a/src/gallium/drivers/radeonsi/si_clear.c b/src/gallium/drivers/radeonsi/si_clear.c
index 9a00bb73b94..e1805f2a1c9 100644
--- a/src/gallium/drivers/radeonsi/si_clear.c
+++ b/src/gallium/drivers/radeonsi/si_clear.c
@@ -764,15 +764,18 @@ static void si_clear_texture(struct pipe_context *pipe,
 			util_clear_render_target(pipe, sf, &color,
 						 box->x, box->y,
 						 box->width, box->height);
 		}
 	}
 	pipe_surface_reference(&sf, NULL);
 }

 void si_init_clear_functions(struct si_context *sctx)
 {
-	sctx->b.clear = si_clear;
 	sctx->b.clear_render_target = si_clear_render_target;
-	sctx->b.clear_depth_stencil = si_clear_depth_stencil;
 	sctx->b.clear_texture = si_clear_texture;
+
+	if (sctx->has_graphics) {
+		sctx->b.clear = si_clear;
+		sctx->b.clear_depth_stencil = si_clear_depth_stencil;
+	}
 }
diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c
index 1a62b3e0844..87addd53976 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -880,26 +880,28 @@ static void si_launch_grid(
 		info->block[0] * info->block[1] * info->block[2] > 256;

 	if (cs_regalloc_hang)
 		sctx->flags |= SI_CONTEXT_PS_PARTIAL_FLUSH |
 			       SI_CONTEXT_CS_PARTIAL_FLUSH;

 	if (program->ir_type != PIPE_SHADER_IR_NATIVE &&
 	    program->shader.compilation_failed)
 		return;

-	if (sctx->last_num_draw_calls != sctx->num_draw_calls) {
-		si_update_fb_dirtiness_after_rendering(sctx);
-		sctx->last_num_draw_calls = sctx->num_draw_calls;
-	}
+	if (sctx->has_graphics) {
+		if (sctx->last_num_draw_calls != sctx->num_draw_calls) {
+			si_update_fb_dirtiness_after_rendering(sctx);
+			sctx->last_num_draw_calls = sctx->num_draw_calls;
+		}

-	si_decompress_textures(sctx, 1 << PIPE_SHADER_COMPUTE);
+		si_decompress_textures(sctx, 1 << PIPE_SHADER_COMPUTE);
+	}

 	/* Add buffer sizes for memory checking in need_cs_space. */
 	si_context_add_resource_size(sctx, &program->shader.bo->b.b);
 	/* TODO: add the scratch buffer */

 	if (info->indirect) {
 		si_context_add_resource_size(sctx, info->indirect);

 		/* Indirect buffers use TC L2 on GFX9,
Re: [Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute
Hello Marek,

do you plan to commit or rebase both sets?

Dieter

On 14.02.2019 07:29, Marek Olšák wrote:
I have some fixes for Sea Islands that improve Radeon 290X performance to 43 fps, moving it just below Radeon VII in the picture.

Marek

On Wed, Feb 13, 2019 at 12:16 AM Marek Olšák wrote:
Hi,

This patch series uses async compute to do primitive culling before the vertex shader. It significantly improves performance for applications that use a lot of geometry that is invisible because primitives don't intersect sample points or there are a lot of back faces, etc.

It passes 99.% of all tests (GL CTS, dEQP, piglit) and is 100% stable. It supports all chips all the way from Sea Islands to Radeon VII.

As you can see in the results marked (ENABLED) in the picture below, it destroys our competition (The GeForce results are from a Phoronix article from 2017, the latest ones I could find):

Benchmark: ParaView - Many Spheres - 2560x1440
https://people.freedesktop.org/~mareko/prim-discard-cs-results.png

The last patch describes the implementation and functional limitations if you can find the huge code comment, so I'm not gonna do that here.

I decided to enable this optimization on all Pro graphics cards. The reason is that I haven't had time to benchmark games. This decision may be changed based on community feedback, etc.

People using the Pro graphics cards can disable this by setting AMD_DEBUG=nopd, and people using consumer graphics cards can enable this by setting AMD_DEBUG=pd. So you always have a choice.

Eventually we might also enable this on consumer graphics cards for those games that benefit. It might decrease performance if there is not enough invisible geometry.

Branch: https://cgit.freedesktop.org/~mareko/mesa/log/?h=prim-discard-cs

Please review.

Thanks,
Marek
Re: [Mesa-dev] [PATCH 7/7] radeonsi: implement ARB/KHR_parallel_shader_compile callbacks
For the series

Tested-by: Dieter Nützel

Do we have a (special) test in the wild?

Dieter

On 25.02.2019 19:27, Marek Olšák wrote:
From: Marek Olšák

---
 src/gallium/drivers/radeonsi/si_pipe.c | 31 ++
 1 file changed, 31 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
index b965d9d64d4..7dbd4cb2c40 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -19,20 +19,21 @@
  * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
  * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
  * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
  * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
  * USE OR OTHER DEALINGS IN THE SOFTWARE.
  */

 #include "si_pipe.h"
 #include "si_public.h"
 #include "si_shader_internal.h"
+#include "si_compute.h"
 #include "sid.h"

 #include "ac_llvm_util.h"
 #include "radeon/radeon_uvd.h"
 #include "gallivm/lp_bld_misc.h"
 #include "util/disk_cache.h"
 #include "util/u_log.h"
 #include "util/u_memory.h"
 #include "util/u_suballoc.h"
 #include "util/u_tests.h"
@@ -826,20 +827,46 @@ static void si_disk_cache_create(struct si_screen *sscreen)
 	 */
 	STATIC_ASSERT(ALL_FLAGS <= UINT_MAX);
 	shader_debug_flags |= (uint64_t)sscreen->info.address32_hi << 32;

 	sscreen->disk_shader_cache =
 		disk_cache_create(sscreen->info.name,
 				  cache_id,
 				  shader_debug_flags);
 }

+static void si_set_max_shader_compiler_threads(struct pipe_screen *screen,
+					       unsigned max_threads)
+{
+	struct si_screen *sscreen = (struct si_screen *)screen;
+
+	/* This function doesn't allow a greater number of threads than
+	 * the queue had at its creation. */
+	util_queue_adjust_num_threads(&sscreen->shader_compiler_queue,
+				      max_threads);
+	/* Don't change the number of threads on the low priority queue. */
+}
+
+static bool si_is_parallel_shader_compilation_finished(struct pipe_screen *screen,
+						       void *shader,
+						       unsigned shader_type)
+{
+	if (shader_type == PIPE_SHADER_COMPUTE) {
+		struct si_compute *cs = (struct si_compute*)shader;
+
+		return util_queue_fence_is_signalled(&cs->ready);
+	}
+	struct si_shader_selector *sel = (struct si_shader_selector *)shader;
+
+	return util_queue_fence_is_signalled(&sel->ready);
+}
+
 struct pipe_screen *radeonsi_screen_create(struct radeon_winsys *ws,
 					   const struct pipe_screen_config *config)
 {
 	struct si_screen *sscreen = CALLOC_STRUCT(si_screen);
 	unsigned hw_threads, num_comp_hi_threads, num_comp_lo_threads, i;

 	if (!sscreen) {
 		return NULL;
 	}

@@ -856,20 +883,24 @@ struct pipe_screen *radeonsi_screen_create(struct radeon_winsys *ws,
 	}

 	sscreen->debug_flags = debug_get_flags_option("R600_DEBUG",
 						      debug_options, 0);
 	sscreen->debug_flags |= debug_get_flags_option("AMD_DEBUG",
 						       debug_options, 0);

 	/* Set functions first. */
 	sscreen->b.context_create = si_pipe_create_context;
 	sscreen->b.destroy = si_destroy_screen;
+	sscreen->b.set_max_shader_compiler_threads =
+		si_set_max_shader_compiler_threads;
+	sscreen->b.is_parallel_shader_compilation_finished =
+		si_is_parallel_shader_compilation_finished;

 	si_init_screen_get_functions(sscreen);
 	si_init_screen_buffer_functions(sscreen);
 	si_init_screen_fence_functions(sscreen);
 	si_init_screen_state_functions(sscreen);
 	si_init_screen_texture_functions(sscreen);
 	si_init_screen_query_functions(sscreen);

 	/* Set these flags in debug_flags early, so that the shader cache takes
 	 * them into account.
Re: [Mesa-dev] [PATCH] mesa: fix display list corner case assertion
Reviewed-by: Marek Olšák

Marek

On Mon, Feb 25, 2019 at 5:03 PM Brian Paul wrote:
> This fixes a failed assertion in glDeleteLists() for the following
> case:
>
>    list = glGenLists(1);
>    glDeleteLists(list, 1);
>
> when those are the first display list commands issued by the
> application.
>
> When we generate display lists, we plug in empty lists created with
> the make_list() helper. This function uses the OPCODE_END_OF_LIST
> opcode but does not call dlist_alloc() which would set the
> InstSize[OPCODE_END_OF_LIST] element to non-zero.
>
> When the empty list was deleted, we failed the InstSize[opcode] > 0
> assertion.
>
> Typically, display lists are created with glNewList/glEndList so we
> set InstSize[OPCODE_END_OF_LIST] = 1 in dlist_alloc(). That's why
> this bug wasn't found before.
>
> To fix this failure, simply initialize the InstSize[OPCODE_END_OF_LIST]
> element in make_list().
>
> The game oolite was hitting this.
>
> Fixes: https://github.com/OoliteProject/oolite/issues/325
> ---
>  src/mesa/main/dlist.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
> index 97461ce..8dcf8bd 100644
> --- a/src/mesa/main/dlist.c
> +++ b/src/mesa/main/dlist.c
> @@ -962,6 +962,8 @@ make_list(GLuint name, GLuint count)
>     dlist->Name = name;
>     dlist->Head = malloc(sizeof(Node) * count);
>     dlist->Head[0].opcode = OPCODE_END_OF_LIST;
> +   /* All InstSize[] entries must be non-zero */
> +   InstSize[OPCODE_END_OF_LIST] = 1;
>     return dlist;
>  }
>
> --
> 1.8.5.6
>
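To make the failure mode in the commit message easier to follow, here is a small self-contained model of it. Everything in it (inst_size, make_empty_list, delete_list, the two-opcode enum) is a hypothetical simplification and not the dlist.c code: a per-opcode size table starts out zeroed, the normal allocation path is what usually fills it in, and a list built by a shortcut helper has to register its terminator size itself or the walker trips over a zero size.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical two-opcode instruction set. */
enum opcode { OPCODE_NOP, OPCODE_END_OF_LIST, OPCODE_COUNT };

/* Size of each instruction in nodes. Zero means "never registered". */
static unsigned inst_size[OPCODE_COUNT];

struct node { enum opcode opcode; };
struct dlist { struct node *head; };

/* Build an empty list directly, bypassing the usual allocator that would
 * normally register instruction sizes as a side effect. */
static struct dlist *
make_empty_list(void)
{
   struct dlist *list = malloc(sizeof *list);
   if (!list)
      return NULL;

   list->head = malloc(sizeof *list->head);
   if (!list->head) {
      free(list);
      return NULL;
   }
   list->head[0].opcode = OPCODE_END_OF_LIST;

   /* The fix being modelled: register the terminator size here as well,
    * so the walker below never sees a zero entry. */
   inst_size[OPCODE_END_OF_LIST] = 1;
   return list;
}

/* Walk the list to its terminator and free it. Returns the number of
 * nodes visited, or -1 where the real code would fail its assertion
 * because an opcode has no registered size. */
static int
delete_list(struct dlist *list)
{
   int visited = 0;
   struct node *n = list->head;

   for (;;) {
      unsigned size = inst_size[n->opcode];
      if (size == 0) {
         visited = -1;
         break;
      }
      visited++;
      if (n->opcode == OPCODE_END_OF_LIST)
         break;
      n += size;
   }

   free(list->head);
   free(list);
   return visited;
}
```

Removing the inst_size assignment from make_empty_list() makes delete_list() return -1 for a freshly generated list, which mirrors the glGenLists/glDeleteLists sequence described above.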
[Mesa-dev] [Bug 109532] ir_variable has maximum access out of bounds -- but it's not out of bounds
https://bugs.freedesktop.org/show_bug.cgi?id=109532 --- Comment #43 from Mark Janes --- On my local system, I can easily reproduce the 32-bit pass, and the 64-bit crash. This is unusual, however the dEQP bug that started all of this also has a different 32 and 64 bit signature. --- /tmp/build_root/m32/lib/piglit/bin/glslparsertest /tmp/build_root/m32/lib/piglit/tests/spec/arb_shader_storage_buffer_object/compiler/unused-array-element.comp pass 4.30 piglit: debug: Requested an OpenGL 4.3 Core Context, and received a matching 4.5 context Successfully compiled compute shader /tmp/build_root/m32/lib/piglit/tests/spec/arb_shader_storage_buffer_object/compiler/unused-array-element.comp: (no compiler output) PIGLIT: {"result": "pass" } --- /tmp/build_root/m64/lib/piglit/bin/glslparsertest /tmp/build_root/m64/lib/piglit/tests/spec/arb_shader_storage_buffer_object/compiler/unused-array-element.comp pass 4.30 piglit: debug: Requested an OpenGL 4.3 Core Context, and received a matching 4.5 context ir_variable has maximum access out of bounds (1 vs 0) Aborted --- -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Mesa-stable] [PATCH v4] i965: fixed clamping in set_scissor_bits when the y is flipped
On Mon, Feb 25, 2019 at 03:14:10PM -0800, Dylan Baker wrote: > Quoting Eleni Maria Stea (2019-02-22 13:02:30) > > Calculating the scissor rectangle fields with the y flipped (0 on top) > > can generate negative values that will cause assertion failure later on > > as the scissor fields are all unsigned. We must clamp the bbox values > > again to make sure they don't exceed the fb_height. Also fixed a > > calculation error. > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999 > > https://bugs.freedesktop.org/show_bug.cgi?id=109594 > > > > v2: > >- I initially clamped the values inside the if (Y is flipped) case > >and I made a mistake in the calculation: the clamp of the bbox[2] should > >be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I > >shouldn't have changed the ScissorRectangleYMax calculation. As the > >fixed code is equivalent with using CLAMP instead of MAX2 at the top of > >the function when bbox[2] and bbox[3] are calculated, and the 2nd is more > >clear, I replaced it. (Nanley Chery) > > > > v3: > >- Reversed the CLAMP change in bbox[3] as the API guarantees that the > >viewport height is positive. 
(Nanley Chery) > > > > v4: > > - Added nomination for the mesa-stable branch and the link to the second > > bugzilla bug (Nanley Chery) > > > > CC: > > Tested-by: Paul Chelombitko > > Reviewed-by: Nanley Chery > > --- > > src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c > > b/src/mesa/drivers/dri/i965/genX_state_upload.c > > index 027dad1e089..73c983ce742 100644 > > --- a/src/mesa/drivers/dri/i965/genX_state_upload.c > > +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c > > @@ -2446,7 +2446,7 @@ set_scissor_bits(const struct gl_context *ctx, int i, > > > > bbox[0] = MAX2(ctx->ViewportArray[i].X, 0); > > bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width); > > - bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0); > > + bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height); > > bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height); > > _mesa_intersect_scissor_bounding_box(ctx, i, bbox); > > > > -- > > 2.20.1 > > > > Do you have push access? I'd like to get this merged so we can close said > bugs, > and Nanley or I can push this for you if you don't have access. > I haven't landed this patch because its piglit test isn't catching the error in CI. I'm hoping we could resolve that soon though. -Nanley ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
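To illustrate why the lower bound alone was not enough for bbox[2], here is a minimal C sketch of just that computation. The macro names match the ones used in the patch, but the definitions, the helper functions and the plain int parameters are assumptions made for illustration; this is not the driver code.

```c
/* Local stand-ins for the macros named in the patch (assumed semantics). */
#define MAX2(a, b) ((a) > (b) ? (a) : (b))
#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#define CLAMP(x, lo, hi) MIN2(MAX2((x), (lo)), (hi))

/* Hypothetical helper: lower Y bound as computed before the patch. */
static int bbox_ymin_old(int viewport_y)
{
   return MAX2(viewport_y, 0);
}

/* Hypothetical helper: lower Y bound as computed after the patch. */
static int bbox_ymin_new(int viewport_y, int fb_height)
{
   return CLAMP(viewport_y, 0, fb_height);
}

/* With Y flipped, bbox values are subtracted from fb_height, so any
 * value above fb_height produces a negative result. */
static int flipped_distance(int bbox_y, int fb_height)
{
   return fb_height - bbox_y;
}
```

With a viewport Y of 600 and a 480-pixel-high framebuffer, the old bound stays at 600 and the flipped value goes negative, while the clamped bound stops at 480.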
Re: [Mesa-dev] [Mesa-stable] [PATCH v4] i965: fixed clamping in set_scissor_bits when the y is flipped
Quoting Eleni Maria Stea (2019-02-22 13:02:30) > Calculating the scissor rectangle fields with the y flipped (0 on top) > can generate negative values that will cause assertion failure later on > as the scissor fields are all unsigned. We must clamp the bbox values > again to make sure they don't exceed the fb_height. Also fixed a > calculation error. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999 > https://bugs.freedesktop.org/show_bug.cgi?id=109594 > > v2: >- I initially clamped the values inside the if (Y is flipped) case >and I made a mistake in the calculation: the clamp of the bbox[2] should >be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I >shouldn't have changed the ScissorRectangleYMax calculation. As the >fixed code is equivalent with using CLAMP instead of MAX2 at the top of >the function when bbox[2] and bbox[3] are calculated, and the 2nd is more >clear, I replaced it. (Nanley Chery) > > v3: >- Reversed the CLAMP change in bbox[3] as the API guarantees that the >viewport height is positive. 
(Nanley Chery) > > v4: > - Added nomination for the mesa-stable branch and the link to the second > bugzilla bug (Nanley Chery) > > CC: > Tested-by: Paul Chelombitko > Reviewed-by: Nanley Chery > --- > src/mesa/drivers/dri/i965/genX_state_upload.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c > b/src/mesa/drivers/dri/i965/genX_state_upload.c > index 027dad1e089..73c983ce742 100644 > --- a/src/mesa/drivers/dri/i965/genX_state_upload.c > +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c > @@ -2446,7 +2446,7 @@ set_scissor_bits(const struct gl_context *ctx, int i, > > bbox[0] = MAX2(ctx->ViewportArray[i].X, 0); > bbox[1] = MIN2(bbox[0] + ctx->ViewportArray[i].Width, fb_width); > - bbox[2] = MAX2(ctx->ViewportArray[i].Y, 0); > + bbox[2] = CLAMP(ctx->ViewportArray[i].Y, 0, fb_height); > bbox[3] = MIN2(bbox[2] + ctx->ViewportArray[i].Height, fb_height); > _mesa_intersect_scissor_bounding_box(ctx, i, bbox); > > -- > 2.20.1 > Do you have push access? I'd like to get this merged so we can close said bugs, and Nanley or I can push this for you if you don't have access. Dylan signature.asc Description: signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109535] [Tracker] Mesa 19.0 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=109535 Bug 109535 depends on bug 109561, which changed state. Bug 109561 Summary: [regression, bisected] code re-factor causing games to stutter or lock-up system https://bugs.freedesktop.org/show_bug.cgi?id=109561 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/4] i965: Reimplement all the PIPE_CONTROL rules.
On Monday, February 25, 2019 11:01:33 AM PST Pohjolainen, Topi wrote: > On Mon, Feb 25, 2019 at 10:32:27AM -0800, Kenneth Graunke wrote: > > On Monday, February 25, 2019 6:33:11 AM PST Pohjolainen, Topi wrote: > > > On Thu, Nov 01, 2018 at 08:04:21PM -0700, Kenneth Graunke wrote: > > > > - if (GEN_GEN >= 9) { > > > > -/* THE PIPE_CONTROL "VF Cache Invalidation Enable" docs > > > > continue: > > > > - * > > > > - *"Project: BDW+ > > > > - * > > > > - * When VF Cache Invalidate is set “Post Sync > > > > Operation” must > > > > - * be enabled to “Write Immediate Data” or “Write PS > > > > Depth > > > > - * Count” or “Write Timestamp”." > > > > - * > > > > - * If there's a BO, we're already doing some kind of write. > > > > - * If not, add a write to the workaround BO. > > > > - * > > > > - * XXX: This causes GPU hangs on Broadwell, so restrict it > > > > to > > > > - * Gen9+ for now...see this bug for more information: > > > > - * https://bugs.freedesktop.org/show_bug.cgi?id=103787 > > > > > > In "Flush Types" workarounds later on you apply this for gen8 as well. > > > > Yes, that's intentional - we're supposed to according to the docs. > > I re-tested the Piglit test from bug 103787 on my Broadwell, and it > > works fine - no GPU hangs. I think we were just doing it wrong before. > > > > Trying to figure out an ordering for the workarounds is awful... :( > > What would you think about another patch just before this to enable that for > gen8? Just in case it causes problems it would bisect to much smaller patch. It isn't simply enabling it though...in the old code, we had: if (devinfo->gen == 8) gen8_add_cs_stall_workaround_bits(); if (flags & PIPE_CONTROL_VF_CACHE_INVALIDATE) { if (devinfo->gen >= 9) { ... 
if (!bo) { flags |= PIPE_CONTROL_WRITE_IMMEDIATE; bo = brw->workaround_bo; } } } Which adds a WRITE_IMMEDIATE to the current PIPE_CONTROL, but does so after the call to gen8_add_cs_stall_workaround_bits - and that function would have added a CS_STALL had it seen the WRITE_IMMEDIATE. I suspect this bad ordering is why we saw hangs on Broadwell - we missed a stall. The new code performs these in the opposite order, correctly adding the necessary CS_STALL. I could probably write a patch to swap those and enable it on Gen8+. --Ken signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
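The ordering problem Ken describes can be sketched in a few lines of C. This is a deliberately simplified model: the flag names and both helper functions are hypothetical stand-ins, and the real workaround checks more conditions than a single bit.

```c
#include <stdint.h>

/* Hypothetical flag bits; the real PIPE_CONTROL_* values differ. */
#define FLAG_VF_CACHE_INVALIDATE (1u << 0)
#define FLAG_WRITE_IMMEDIATE     (1u << 1)
#define FLAG_CS_STALL            (1u << 2)

/* Simplified stand-in for the Gen8 stall workaround: add a CS stall
 * whenever a post-sync write is already present in the flags. */
static uint32_t add_cs_stall_if_needed(uint32_t flags)
{
   if (flags & FLAG_WRITE_IMMEDIATE)
      flags |= FLAG_CS_STALL;
   return flags;
}

/* Simplified stand-in for the VF cache invalidate workaround: make
 * sure some post-sync write accompanies the invalidate. */
static uint32_t add_post_sync_write(uint32_t flags)
{
   if (flags & FLAG_VF_CACHE_INVALIDATE)
      flags |= FLAG_WRITE_IMMEDIATE;
   return flags;
}

/* Old order: the stall check runs before the write is added. */
static uint32_t old_order(uint32_t flags)
{
   return add_post_sync_write(add_cs_stall_if_needed(flags));
}

/* New order: the write is added first, so the stall check sees it. */
static uint32_t new_order(uint32_t flags)
{
   return add_cs_stall_if_needed(add_post_sync_write(flags));
}
```

In this model, starting from only the VF cache invalidate bit, the old order ends up with a write but no stall, and the new order ends up with both.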
Re: [Mesa-dev] [PATCH 3/9] nir: Add a new ALU nir_op_imad24_ir3
On Wed, Feb 13, 2019 at 3:30 PM Eduardo Lima Mitev wrote: > ir3 compiler has an integer multiply-add instruction (MAD_S24) > that is used for different offset calculations in the backend. > Since we intend to move some of these calculations to NIR, we need > a new ALU op that can directly represent it. > --- > src/compiler/nir/nir_opcodes.py | 16 > 1 file changed, 16 insertions(+) > > diff --git a/src/compiler/nir/nir_opcodes.py > b/src/compiler/nir/nir_opcodes.py > index d32005846a6..abbb3627a33 100644 > --- a/src/compiler/nir/nir_opcodes.py > +++ b/src/compiler/nir/nir_opcodes.py > @@ -892,3 +892,19 @@ dst.w = src3.x; > """) > > > +# Freedreno-specific opcode that maps directly to ir3_MAD_S24. > +# It is emitted by ir3_nir_lower_io_offsets pass when computing > +# byte-offsets for image store and atomics. > +# > +# The nir_algebraic expression below is: get 23 bits of the > +# two factors as unsigned and multiply them. If either of the > +# two was negative, invert sign of the product. Then add it src2. > +# @FIXME: I suspect there is a simpler expression for this. > +triop("imad24_ir3", tint, """ > I doubt you really want this to be a variable bit-size opcode. Maybe tint32? > +unsigned f0 = ((unsigned) src0) & 0x7f; > +unsigned f1 = ((unsigned) src1) & 0x7f; > +dst = f0 * f1; > +if (src0 * src1 < 0) > + dst = -dst; > +dst += src2; > +""") > -- > 2.20.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] st/mesa: Reduce array updates due to current changes.
LGTM too. Reviewed-by: Brian Paul On Sun, Feb 24, 2019 at 1:46 AM wrote: > From: Mathias Fröhlich > > Hi Brian, > > Following a small optimization in the gallium state tracker to > avoid flagging ST_NEW_VERTEX_ARRAYS a bit more often: > > please review! > > best > > Mathias > > > > > Since using bitmasks we can easily check if we have any > current value that is potentially uploaded on array setup. > So check for any potential vertex program input that is not > already a vao enabled array. Only flag array update if there is > a potential overlap. > > Signed-off-by: Mathias Fröhlich > --- > src/mesa/state_tracker/st_context.c | 2 +- > src/mesa/state_tracker/st_context.h | 9 + > 2 files changed, 10 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/state_tracker/st_context.c > b/src/mesa/state_tracker/st_context.c > index 0a0bd8ba1ca..45451531df9 100644 > --- a/src/mesa/state_tracker/st_context.c > +++ b/src/mesa/state_tracker/st_context.c > @@ -224,7 +224,7 @@ st_invalidate_state(struct gl_context *ctx) > if (new_state & _NEW_PIXEL) >st->dirty |= ST_NEW_PIXEL_TRANSFER; > > - if (new_state & _NEW_CURRENT_ATTRIB) > + if (new_state & _NEW_CURRENT_ATTRIB && st_vp_uses_current_values(ctx)) >st->dirty |= ST_NEW_VERTEX_ARRAYS; > > /* Update the vertex shader if ctx->Light._ClampVertexColor was > changed. 
*/ > diff --git a/src/mesa/state_tracker/st_context.h > b/src/mesa/state_tracker/st_context.h > index ed69e3d4873..324a7f24178 100644 > --- a/src/mesa/state_tracker/st_context.h > +++ b/src/mesa/state_tracker/st_context.h > @@ -28,6 +28,7 @@ > #ifndef ST_CONTEXT_H > #define ST_CONTEXT_H > > +#include "main/arrayobj.h" > #include "main/mtypes.h" > #include "state_tracker/st_api.h" > #include "main/fbobject.h" > @@ -398,6 +399,14 @@ st_user_clip_planes_enabled(struct gl_context *ctx) >ctx->Transform.ClipPlanesEnabled; > } > > + > +static inline bool > +st_vp_uses_current_values(const struct gl_context *ctx) > +{ > + const uint64_t inputs = ctx->VertexProgram._Current->info.inputs_read; > + return _mesa_draw_current_bits(ctx) & inputs; > +} > + > /** clear-alloc a struct-sized object, with casting */ > #define ST_CALLOC_STRUCT(T) (struct T *) calloc(1, sizeof(struct T)) > > -- > 2.20.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa: fix display list corner case assertion
This fixes a failed assertion in glDeleteLists() for the following
case:

   list = glGenLists(1);
   glDeleteLists(list, 1);

when those are the first display list commands issued by the
application.

When we generate display lists, we plug in empty lists created with
the make_list() helper. This function uses the OPCODE_END_OF_LIST
opcode but does not call dlist_alloc() which would set the
InstSize[OPCODE_END_OF_LIST] element to non-zero.

When the empty list was deleted, we failed the InstSize[opcode] > 0
assertion.

Typically, display lists are created with glNewList/glEndList so we
set InstSize[OPCODE_END_OF_LIST] = 1 in dlist_alloc(). That's why
this bug wasn't found before.

To fix this failure, simply initialize the InstSize[OPCODE_END_OF_LIST]
element in make_list().

The game oolite was hitting this.

Fixes: https://github.com/OoliteProject/oolite/issues/325
---
 src/mesa/main/dlist.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index 97461ce..8dcf8bd 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -962,6 +962,8 @@ make_list(GLuint name, GLuint count)
    dlist->Name = name;
    dlist->Head = malloc(sizeof(Node) * count);
    dlist->Head[0].opcode = OPCODE_END_OF_LIST;
+   /* All InstSize[] entries must be non-zero */
+   InstSize[OPCODE_END_OF_LIST] = 1;
    return dlist;
 }

-- 
1.8.5.6
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
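The failure mode described in this patch can be modelled in isolation. The sketch below is a toy model, not Mesa code: the table, the helper names and the apply_fix switch are all hypothetical, and only the OPCODE_END_OF_LIST name is borrowed from the patch.

```c
#include <stdbool.h>

enum { OPCODE_END_OF_LIST, NUM_OPCODES };

/* Model of a lazily filled size table; zero means "not set up yet". */
static unsigned inst_size[NUM_OPCODES];

/* Model of make_list(): an empty list consists of a single
 * OPCODE_END_OF_LIST node. With apply_fix, the table entry is
 * initialized here, as the patch does. */
static int make_empty_list(bool apply_fix)
{
   if (apply_fix)
      inst_size[OPCODE_END_OF_LIST] = 1;
   return OPCODE_END_OF_LIST;
}

/* Model of the delete path: reports whether the size check passes. */
static bool delete_check_passes(int opcode)
{
   return inst_size[opcode] > 0;
}

/* Generate and delete a list as the very first display list calls. */
static bool gen_then_delete(bool apply_fix)
{
   inst_size[OPCODE_END_OF_LIST] = 0; /* fresh state, nothing allocated yet */
   return delete_check_passes(make_empty_list(apply_fix));
}
```

Calling gen_then_delete(false) models the pre-patch behaviour where the check fails; gen_then_delete(true) models the patched make_list().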
Re: [Mesa-dev] [PATCH] util: Don't block SIGSYS for new threads
Reviewed-by: Marek Olšák Marek On Sat, Feb 23, 2019 at 2:05 AM Drew Davenport wrote: > SIGSYS is needed for programs using seccomp for sandboxing. > --- > src/util/u_thread.h | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/src/util/u_thread.h b/src/util/u_thread.h > index 7538d7d634b2..a46c18d3db20 100644 > --- a/src/util/u_thread.h > +++ b/src/util/u_thread.h > @@ -44,7 +44,8 @@ static inline thrd_t u_thread_create(int (*routine)(void > *), void *param) > int ret; > > sigfillset(_set); > - pthread_sigmask(SIG_SETMASK, _set, _set); > + sigdelset(_set, SIGSYS); > + pthread_sigmask(SIG_BLOCK, _set, _set); > ret = thrd_create( , routine, param ); > pthread_sigmask(SIG_SETMASK, _set, NULL); > #else > -- > 2.20.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
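As a small illustration of the signal set this patch builds, here is a C sketch that fills a set with every signal and then removes SIGSYS. The helper name is hypothetical; in the patch the resulting set is what gets handed to pthread_sigmask() before the thread is created, a step this sketch leaves out.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>

/* Hypothetical helper: fill `set` with every signal except SIGSYS,
 * so that a thread inheriting this mask can still receive SIGSYS. */
static void build_block_set(sigset_t *set)
{
   sigfillset(set);
   sigdelset(set, SIGSYS);
}
```

A seccomp-based sandbox that reports violations through SIGSYS relies on that signal remaining deliverable in the new thread, which is the motivation given in the commit message.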
Re: [Mesa-dev] [PATCH v2] radeonsi: fix query buffer allocation
Reviewed-by: Marek Olšák Marek On Sun, Feb 24, 2019 at 6:56 PM Timothy Arceri wrote: > Fix the logic for buffer full check on alloc. > > This patch just takes the fix Nicolai attached to the bug report > and updates it to work on master. > > Fixes: e0f0d3675d4 ("radeonsi: factor si_query_buffer logic out of > si_query_hw") > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109561 > --- > src/gallium/drivers/radeonsi/si_query.c | 52 ++--- > src/gallium/drivers/radeonsi/si_query.h | 5 ++- > 2 files changed, 32 insertions(+), 25 deletions(-) > > diff --git a/src/gallium/drivers/radeonsi/si_query.c > b/src/gallium/drivers/radeonsi/si_query.c > index 266b9d3ce84..280eee3a280 100644 > --- a/src/gallium/drivers/radeonsi/si_query.c > +++ b/src/gallium/drivers/radeonsi/si_query.c > @@ -549,11 +549,15 @@ void si_query_buffer_reset(struct si_context *sctx, > struct si_query_buffer *buff > } > buffer->results_end = 0; > > + if (!buffer->buf) > + return; > + > /* Discard even the oldest buffer if it can't be mapped without a > stall. 
*/ > - if (buffer->buf && > - (si_rings_is_buffer_referenced(sctx, buffer->buf->buf, > RADEON_USAGE_READWRITE) || > -!sctx->ws->buffer_wait(buffer->buf->buf, 0, > RADEON_USAGE_READWRITE))) { > + if (si_rings_is_buffer_referenced(sctx, buffer->buf->buf, > RADEON_USAGE_READWRITE) || > + !sctx->ws->buffer_wait(buffer->buf->buf, 0, > RADEON_USAGE_READWRITE)) { > si_resource_reference(>buf, NULL); > + } else { > + buffer->unprepared = true; > } > } > > @@ -561,29 +565,31 @@ bool si_query_buffer_alloc(struct si_context *sctx, > struct si_query_buffer *buff >bool (*prepare_buffer)(struct si_context *, > struct si_query_buffer*), >unsigned size) > { > - if (buffer->buf && buffer->results_end + size >= > buffer->buf->b.b.width0) > - return true; > + bool unprepared = buffer->unprepared; > + buffer->unprepared = false; > + > + if (!buffer->buf || buffer->results_end + size > > buffer->buf->b.b.width0) { > + if (buffer->buf) { > + struct si_query_buffer *qbuf = > MALLOC_STRUCT(si_query_buffer); > + memcpy(qbuf, buffer, sizeof(*qbuf)); > + buffer->previous = qbuf; > + } > + buffer->results_end = 0; > > - if (buffer->buf) { > - struct si_query_buffer *qbuf = > MALLOC_STRUCT(si_query_buffer); > - memcpy(qbuf, buffer, sizeof(*qbuf)); > - buffer->previous = qbuf; > + /* Queries are normally read by the CPU after > +* being written by the gpu, hence staging is probably a > good > +* usage pattern. > +*/ > + struct si_screen *screen = sctx->screen; > + unsigned buf_size = MAX2(size, > screen->info.min_alloc_size); > + buffer->buf = si_resource( > + pipe_buffer_create(>b, 0, > PIPE_USAGE_STAGING, buf_size)); > + if (unlikely(!buffer->buf)) > + return false; > + unprepared = true; > } > > - buffer->results_end = 0; > - > - /* Queries are normally read by the CPU after > -* being written by the gpu, hence staging is probably a good > -* usage pattern. 
> -*/ > - struct si_screen *screen = sctx->screen; > - unsigned buf_size = MAX2(size, screen->info.min_alloc_size); > - buffer->buf = si_resource( > - pipe_buffer_create(>b, 0, PIPE_USAGE_STAGING, > buf_size)); > - if (unlikely(!buffer->buf)) > - return false; > - > - if (prepare_buffer) { > + if (unprepared && prepare_buffer) { > if (unlikely(!prepare_buffer(sctx, buffer))) { > si_resource_reference(>buf, NULL); > return false; > diff --git a/src/gallium/drivers/radeonsi/si_query.h > b/src/gallium/drivers/radeonsi/si_query.h > index aaf0bd03aca..c61af51d57c 100644 > --- a/src/gallium/drivers/radeonsi/si_query.h > +++ b/src/gallium/drivers/radeonsi/si_query.h > @@ -177,12 +177,13 @@ struct si_query_hw_ops { > struct si_query_buffer { > /* The buffer where query results are stored. */ > struct si_resource *buf; > - /* Offset of the next free result after current query data */ > - unsignedresults_end; > /* If a query buffer is full, a new buffer is created and the old > one > * is put in here. When we calculate the result, we sum up the > samples > * from all buffers. */ > struct si_query_buffer *previous; > + /* Offset of the next free result after current query data */ > + unsigned
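The change to the "buffer full" condition can be reduced to two small predicates. This is a sketch under assumptions: the function names and the flattened parameters are hypothetical, and only the comparisons are taken from the quoted diff.

```c
#include <stdbool.h>

/* Condition from the removed lines: when it held, the function
 * returned early and kept using the existing buffer. */
static bool old_skips_allocation(bool has_buf, unsigned results_end,
                                 unsigned size, unsigned width)
{
   return has_buf && results_end + size >= width;
}

/* Condition from the added lines: a new buffer is needed when there
 * is none, or when the next result would run past the end. */
static bool needs_new_buffer(bool has_buf, unsigned results_end,
                             unsigned size, unsigned width)
{
   return !has_buf || results_end + size > width;
}
```

Note that the old predicate was true precisely when the buffer had no room left, so the early return kept a full buffer; the new predicate also treats an exact fit (results_end + size == width) as still having room.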
Re: [Mesa-dev] [PATCH] st/mesa: Reduce array updates due to current changes.
Reviewed-by: Marek Olšák Marek On Sun, Feb 24, 2019 at 1:46 AM wrote: > From: Mathias Fröhlich > > Hi Brian, > > Following a small optimization in the gallium state tracker to > avoid flagging ST_NEW_VERTEX_ARRAYS a bit more often: > > please review! > > best > > Mathias > > > > > Since using bitmasks we can easily check if we have any > current value that is potentially uploaded on array setup. > So check for any potential vertex program input that is not > already a vao enabled array. Only flag array update if there is > a potential overlap. > > Signed-off-by: Mathias Fröhlich > --- > src/mesa/state_tracker/st_context.c | 2 +- > src/mesa/state_tracker/st_context.h | 9 + > 2 files changed, 10 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/state_tracker/st_context.c > b/src/mesa/state_tracker/st_context.c > index 0a0bd8ba1ca..45451531df9 100644 > --- a/src/mesa/state_tracker/st_context.c > +++ b/src/mesa/state_tracker/st_context.c > @@ -224,7 +224,7 @@ st_invalidate_state(struct gl_context *ctx) > if (new_state & _NEW_PIXEL) >st->dirty |= ST_NEW_PIXEL_TRANSFER; > > - if (new_state & _NEW_CURRENT_ATTRIB) > + if (new_state & _NEW_CURRENT_ATTRIB && st_vp_uses_current_values(ctx)) >st->dirty |= ST_NEW_VERTEX_ARRAYS; > > /* Update the vertex shader if ctx->Light._ClampVertexColor was > changed. 
*/ > diff --git a/src/mesa/state_tracker/st_context.h > b/src/mesa/state_tracker/st_context.h > index ed69e3d4873..324a7f24178 100644 > --- a/src/mesa/state_tracker/st_context.h > +++ b/src/mesa/state_tracker/st_context.h > @@ -28,6 +28,7 @@ > #ifndef ST_CONTEXT_H > #define ST_CONTEXT_H > > +#include "main/arrayobj.h" > #include "main/mtypes.h" > #include "state_tracker/st_api.h" > #include "main/fbobject.h" > @@ -398,6 +399,14 @@ st_user_clip_planes_enabled(struct gl_context *ctx) >ctx->Transform.ClipPlanesEnabled; > } > > + > +static inline bool > +st_vp_uses_current_values(const struct gl_context *ctx) > +{ > + const uint64_t inputs = ctx->VertexProgram._Current->info.inputs_read; > + return _mesa_draw_current_bits(ctx) & inputs; > +} > + > /** clear-alloc a struct-sized object, with casting */ > #define ST_CALLOC_STRUCT(T) (struct T *) calloc(1, sizeof(struct T)) > > -- > 2.20.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] Revert "meson: drop GLESv1 .so version back to 1.0.0"
Quoting Ross Burton (2019-02-25 12:06:48) > This patch claimed that the autotools build generates libGLESv1_CM.so.1.0.0, > but > it doesn't: > > es1api_libGLESv1_CM_la_LDFLAGS = \ > -no-undefined \ > -version-number 1:1 \ > $(GC_SECTIONS) \ > $(LD_NO_UNDEFINED) > > Revert commit cc15460e182148292be877bec5a8a61cec57377d to ensure that the > autotools and meson builds produce the same libraries. > --- > src/mapi/es1api/meson.build | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/mapi/es1api/meson.build b/src/mapi/es1api/meson.build > index 016090dac91..a3628e076f7 100644 > --- a/src/mapi/es1api/meson.build > +++ b/src/mapi/es1api/meson.build > @@ -39,7 +39,7 @@ libglesv1_cm = shared_library( >include_directories : [inc_src, inc_include, inc_mapi], >link_with : libglapi, >dependencies : [dep_thread, dep_libdrm, dep_m, dep_dl], > - version : '1.0.0', > + version : '1.1.0', >install : true, > ) > > -- > 2.11.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev For this patch: Reviewed-by: Dylan Baker signature.asc Description: signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109739] Mesa build fails when vulkan-overlay-layer option is enabled
https://bugs.freedesktop.org/show_bug.cgi?id=109739 --- Comment #6 from Shmerl --- From https://github.com/KhronosGroup/Vulkan-ValidationLayers/issues/729#issuecomment-467162536 it looks like the expected correct usage for these headers is -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] Revert "meson: drop GLESv1 .so version back to 1.0.0"
This patch claimed that the autotools build generates libGLESv1_CM.so.1.0.0, but
it doesn't:

es1api_libGLESv1_CM_la_LDFLAGS = \
        -no-undefined \
        -version-number 1:1 \
        $(GC_SECTIONS) \
        $(LD_NO_UNDEFINED)

Revert commit cc15460e182148292be877bec5a8a61cec57377d to ensure that the
autotools and meson builds produce the same libraries.
---
 src/mapi/es1api/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mapi/es1api/meson.build b/src/mapi/es1api/meson.build
index 016090dac91..a3628e076f7 100644
--- a/src/mapi/es1api/meson.build
+++ b/src/mapi/es1api/meson.build
@@ -39,7 +39,7 @@ libglesv1_cm = shared_library(
   include_directories : [inc_src, inc_include, inc_mapi],
   link_with : libglapi,
   dependencies : [dep_thread, dep_libdrm, dep_m, dep_dl],
-  version : '1.0.0',
+  version : '1.1.0',
   install : true,
 )

-- 
2.11.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] winsys/svga/drm: Include sys/types.h
From: Khem Raj

vmw_screen.h uses dev_t, which is defined in sys/types.h, so this header
must be included to get the dev_t definition. The issue shows up with the
musl C library; it is hidden on glibc because sys/types.h is pulled in
through other system headers.
---
 src/gallium/winsys/svga/drm/vmw_screen.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/winsys/svga/drm/vmw_screen.h b/src/gallium/winsys/svga/drm/vmw_screen.h
index a87c087d9c5..cb34fec48e7 100644
--- a/src/gallium/winsys/svga/drm/vmw_screen.h
+++ b/src/gallium/winsys/svga/drm/vmw_screen.h
@@ -41,6 +41,7 @@
 #include "svga_winsys.h"
 #include "pipebuffer/pb_buffer_fenced.h"
 #include
+#include <sys/types.h>

 #define VMW_GMR_POOL_SIZE (16*1024*1024)
 #define VMW_QUERY_POOL_SIZE (8192)
-- 
2.11.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109443] Build failure with MSVC when using Scons >= 3.0.2
https://bugs.freedesktop.org/show_bug.cgi?id=109443 --- Comment #3 from William Deegan --- If you comment out line 311 in mesascons/gallium.py # env.Decider('MD5-timestamp') You can use SCons 3.0.4. 3.0.4 had a bugfix for a longstanding issue where it was possible that md5-timestamp in conjunction with changing the number of implicit dependencies (think header files) could yield (infrequently) corrupted content signatures in the sconsign file. By defaulting the decider to MD5, you'll have a temporary workaround and be able to use SCons 3.0.4 I'm investigating this further currently. It looks like with the bugfix in 3.0.4 api_exec.obj somehow doesn't have nir_opcodes.h (which is a generated file) as an implicit dependency, thus nir_opcodes.h isn't being generated before api_exec.c is compiled. -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/9] nir: Add a new ALU nir_op_imad24_ir3
On 2/13/19 1:29 PM, Eduardo Lima Mitev wrote: > ir3 compiler has an integer multiply-add instruction (MAD_S24) > that is used for different offset calculations in the backend. > Since we intend to move some of these calculations to NIR, we need > a new ALU op that can directly represent it. > --- > src/compiler/nir/nir_opcodes.py | 16 > 1 file changed, 16 insertions(+) > > diff --git a/src/compiler/nir/nir_opcodes.py b/src/compiler/nir/nir_opcodes.py > index d32005846a6..abbb3627a33 100644 > --- a/src/compiler/nir/nir_opcodes.py > +++ b/src/compiler/nir/nir_opcodes.py > @@ -892,3 +892,19 @@ dst.w = src3.x; > """) > > > +# Freedreno-specific opcode that maps directly to ir3_MAD_S24. > +# It is emitted by ir3_nir_lower_io_offsets pass when computing > +# byte-offsets for image store and atomics. > +# > +# The nir_algebraic expression below is: get 23 bits of the > +# two factors as unsigned and multiply them. If either of the > +# two was negative, invert sign of the product. Then add it src2. > +# @FIXME: I suspect there is a simpler expression for this. > +triop("imad24_ir3", tint, """ > +unsigned f0 = ((unsigned) src0) & 0x7f; > +unsigned f1 = ((unsigned) src1) & 0x7f; > +dst = f0 * f1; How about (((int)src0 << 8) >> 8) * (((int)src1 << 8) >> 8) + src2? The trick is making sure the implementation matches what the hardware does in all cases. My expression will produce different results than yours for cases like 0xf01f * 2. 0x3e vs -0x3e. "Correct" depends entirely on what real hardware would produce. If I had to guess, I would guess that the hardware would produce 0x3e since it likely just ignores the upper 8 bits of the sources. > +if (src0 * src1 < 0) > + dst = -dst; > +dst += src2; > +""") > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
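The sign-extension alternative suggested in this reply can be written out as a standalone C sketch. It is only a model of that one expression: the function names are hypothetical, the arithmetic is done on unsigned values to avoid shifting negative integers, and, as the reply itself says, whether this matches the hardware is not known here.

```c
#include <stdint.h>

/* Sign-extend the low 24 bits of x without shifting negative values. */
static int32_t sext24(uint32_t x)
{
   x &= 0xffffffu;
   return (x & 0x800000u) ? (int32_t)x - 0x1000000 : (int32_t)x;
}

/* Hypothetical model of a 24-bit multiply-add: both factors are
 * sign-extended from 24 bits, the upper 8 bits are ignored, and the
 * result wraps modulo 2^32. Returned as a raw 32-bit pattern. */
static uint32_t imad24_sext(uint32_t src0, uint32_t src1, uint32_t src2)
{
   return (uint32_t)sext24(src0) * (uint32_t)sext24(src1) + src2;
}
```

For an input whose upper 8 bits are set but whose low 24 bits are a small positive number, this model yields a positive product, because the upper bits never take part in the multiplication.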
Re: [Mesa-dev] [PATCH] nir/algebraic: Replace a-fract(a) with floor(a)
On 2/23/19 4:11 PM, Timothy Arceri wrote: > > > On 23/2/19 4:09 pm, Ian Romanick wrote: >> From: Ian Romanick >> >> I noticed this while looking at a shader that was affected by Tim's >> "more loop unrolling" series. >> >> All Gen6+ platforms had similar results. (Skylake shown) >> total instructions in shared programs: 15437001 -> 15435259 (-0.01%) >> instructions in affected programs: 213651 -> 211909 (-0.82%) >> helped: 988 >> HURT: 0 >> helped stats (abs) min: 1 max: 27 x̄: 1.76 x̃: 1 >> helped stats (rel) min: 0.15% max: 11.54% x̄: 1.14% x̃: 0.59% >> 95% mean confidence interval for instructions value: -1.89 -1.63 >> 95% mean confidence interval for instructions %-change: -1.23% -1.05% >> Instructions are helped. >> >> total cycles in shared programs: 383007378 -> 382997063 (<.01%) >> cycles in affected programs: 1650825 -> 1640510 (-0.62%) >> helped: 679 >> HURT: 302 > > Why the hurt on Gen6+ is this something that should be in the late > optimisations pass? As far as I can tell, it's just because our scheduler is terrible. In all the fragment shaders that I looked at (some hurt shaders were from other stages), only one of the SIMD8 or SIMD16 version would be hurt. In many of those cases, the other SIMD width is improved (e.g., shaders/closed/steam/brutal-legend/3990.shader_test). Often it looks like the scheduler decides to differently schedule a SEND that occurs somewhere early in the shader. Once that happens, everything is different. :( I looked at one vertex shader that was hurt (from Goat Simulator). In that case, both the floor and fract are used. The optimization eliminates the add, and it should allow better scheduling. In the area of the FRC and RNDD instructions, the scheduler does the right thing. However, later in the shader a MAD and an ADD get scheduled differently, and that makes it slightly worse. In light of this, I tried adding some "is_used_once" mark-up, and that did not fix all the cycles regressions. 
It also did a lot more harm than good on SKL:

total cycles in shared programs: 382997063 -> 382998953 (<.01%)
cycles in affected programs: 549527 -> 551417 (0.34%)
helped: 82
HURT: 241
helped stats (abs) min: 1 max: 26 x̄: 6.88 x̃: 6
helped stats (rel) min: 0.06% max: 2.04% x̄: 0.56% x̃: 0.44%
HURT stats (abs) min: 1 max: 120 x̄: 10.18 x̃: 14
HURT stats (rel) min: 0.04% max: 3.86% x̄: 0.63% x̃: 0.52%
95% mean confidence interval for cycles value: 4.44 7.26
95% mean confidence interval for cycles %-change: 0.24% 0.42%
Cycles are HURT.

>> helped stats (abs) min: 1 max: 348 x̄: 23.39 x̃: 14
>> helped stats (rel) min: 0.04% max: 28.77% x̄: 1.61% x̃: 0.98%
>> HURT stats (abs) min: 1 max: 250 x̄: 18.43 x̃: 7
>> HURT stats (rel) min: 0.04% max: 25.86% x̄: 1.41% x̃: 0.53%
>> 95% mean confidence interval for cycles value: -13.05 -7.98
>> 95% mean confidence interval for cycles %-change: -0.86% -0.50%
>> Cycles are helped.
>>
>> Iron Lake and GM45 had similar results. (GM45 shown)
>> total instructions in shared programs: 5043616 -> 5043010 (-0.01%)
>> instructions in affected programs: 119691 -> 119085 (-0.51%)
>> helped: 432
>> HURT: 0
>> helped stats (abs) min: 1 max: 27 x̄: 1.40 x̃: 1
>> helped stats (rel) min: 0.10% max: 8.11% x̄: 0.66% x̃: 0.39%
>> 95% mean confidence interval for instructions value: -1.58 -1.23
>> 95% mean confidence interval for instructions %-change: -0.72% -0.59%
>> Instructions are helped.
>>
>> total cycles in shared programs: 128139812 -> 128135762 (<.01%)
>> cycles in affected programs: 3829724 -> 3825674 (-0.11%)
>> helped: 602
>> HURT: 0
>> helped stats (abs) min: 2 max: 486 x̄: 6.73 x̃: 6
>> helped stats (rel) min: 0.02% max: 4.85% x̄: 0.19% x̃: 0.10%
>> 95% mean confidence interval for cycles value: -8.40 -5.05
>> 95% mean confidence interval for cycles %-change: -0.22% -0.16%
>> Cycles are helped.
>> ---
>>  src/compiler/nir/nir_opt_algebraic.py | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/src/compiler/nir/nir_opt_algebraic.py
>> b/src/compiler/nir/nir_opt_algebraic.py
>> index ba27d702b5d..c8fc938cc8f 100644
>> --- a/src/compiler/nir/nir_opt_algebraic.py
>> +++ b/src/compiler/nir/nir_opt_algebraic.py
>> @@ -127,6 +127,7 @@ optimizations = [
>>     (('flrp@32', a, b, c), ('fadd', ('fmul', c, ('fsub', b, a)), a),
>> 'options->lower_flrp32'),
>>     (('flrp@64', a, b, c), ('fadd', ('fmul', c, ('fsub', b, a)), a),
>> 'options->lower_flrp64'),
>>     (('ffloor', a), ('fsub', a, ('ffract', a)),
>> 'options->lower_ffloor'),
>> +   (('fadd', a, ('fneg', ('ffract', a))), ('ffloor', a),
>> '!options->lower_ffloor'),
>>     (('ffract', a), ('fsub', a, ('ffloor', a)),
>> 'options->lower_ffract'),
>>     (('fceil', a), ('fneg', ('ffloor', ('fneg', a))),
>> 'options->lower_fceil'),
>>     (('~fadd', ('fmul', a, ('fadd', 1.0, ('fneg', ('b2f', 'c@1',
>> ('fmul', b, ('b2f', c))), ('bcsel', c, b, a), 'options->lower_flrp32'),
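
A minimal sketch of the shape the new rule matches, for readers who do not have the generated matcher in their head. The tree types and the function name below are hypothetical stand-ins, not NIR's real data structures, and unlike the real algebraic pass this sketch does not try the commuted form of the fadd. The lower_ffloor parameter mirrors the '!options->lower_ffloor' condition; presumably that condition keeps the new rule from undoing the ffloor lowering rule listed directly above it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature expression tree; not NIR's real structures. */
enum op { OP_VAR, OP_FADD, OP_FNEG, OP_FFRACT, OP_FFLOOR };

struct expr {
   enum op op;
   struct expr *src[2];   /* unused sources are NULL */
};

/* If e has the shape fadd(a, fneg(ffract(a))), with both 'a' operands being
 * the same node, rewrite e in place into ffloor(a) and return true.
 */
static bool
rewrite_add_neg_fract(struct expr *e, bool lower_ffloor)
{
   /* Mirrors the '!options->lower_ffloor' guard on the rule. */
   if (lower_ffloor)
      return false;

   if (e->op != OP_FADD)
      return false;

   struct expr *a = e->src[0];
   struct expr *neg = e->src[1];
   if (!neg || neg->op != OP_FNEG)
      return false;

   struct expr *fract = neg->src[0];
   if (!fract || fract->op != OP_FFRACT || fract->src[0] != a)
      return false;

   e->op = OP_FFLOOR;
   e->src[0] = a;
   e->src[1] = NULL;
   return true;
}
```

The rewrite removes the add and the negate, which is where the instruction-count savings in the shader-db numbers come from.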
Re: [Mesa-dev] [PATCH 4/4] i965: Reimplement all the PIPE_CONTROL rules.
On Mon, Feb 25, 2019 at 10:32:27AM -0800, Kenneth Graunke wrote: > On Monday, February 25, 2019 6:33:11 AM PST Pohjolainen, Topi wrote: > > On Thu, Nov 01, 2018 at 08:04:21PM -0700, Kenneth Graunke wrote: > > > This implements virtually all documented PIPE_CONTROL restrictions > > > in a centralized helper. You now simply ask for the operations you > > > want, and the pipe control "brain" will figure out exactly what pipe > > > controls to emit to make that happen without tanking your system. > > > > > > The hope is that this will fix some intermittent flushing issues as > > > well as GPU hangs. However, it also has a high risk of causing GPU > > > hangs and other regressions, as this is a particularly sensitive > > > area and poking the bear isn't always advisable. > > > > First I checked I could find all the things in bspec. There was one that I > > couldn't, noted further down. > > > > Second I checked that all the rules earlier were implemented. Found one > > exception, noted further down as well. > > > > I didn't check if the rules still miss something in bspec. That would be > > another exercise... > > [snip] > > > + /* Recursive PIPE_CONTROL workarounds > > > +* (http://knowyourmeme.com/memes/xzibit-yo-dawg) > > > +* > > > +* We do these first because we want to look at the original > > > operation, > > > +* rather than any workarounds we set. > > > +*/ > > > + if (GEN_GEN == 6 && (flags & PIPE_CONTROL_RENDER_TARGET_FLUSH)) { > > > + /* Hardware workaround: SNB B-Spec says: > > > + * > > > + *"[Dev-SNB{W/A}]: Before a PIPE_CONTROL with Write Cache Flush > > > + * Enable = 1, a PIPE_CONTROL with any non-zero post-sync-op is > > > + * required." 
> > > + */ > > > + brw_emit_post_sync_nonzero_flush(brw); > > > + } > > > + > > > + if (GEN_GEN == 9 && (flags & PIPE_CONTROL_VF_CACHE_INVALIDATE)) { > > > + /* The PIPE_CONTROL "VF Cache Invalidation Enable" bit description > > > + * lists several workarounds: > > > + * > > > + *"Project: SKL, KBL, BXT > > > + * > > > + * If the VF Cache Invalidation Enable is set to a 1 in a > > > + * PIPE_CONTROL, a separate Null PIPE_CONTROL, all bitfields > > > + * sets to 0, with the VF Cache Invalidation Enable set to 0 > > > + * needs to be sent prior to the PIPE_CONTROL with VF Cache > > > + * Invalidation Enable set to a 1." > > > + */ > > > + genX(emit_raw_pipe_control)(brw, 0, NULL, 0, 0); > > > + } > > > + > > > + if (GEN_GEN == 9 && IS_COMPUTE_PIPELINE(brw) && post_sync_flags) { > > > + /* Project: SKL / Argument: LRI Post Sync Operation [23] > > > + * > > > + * "PIPECONTROL command with “Command Streamer Stall Enable” must > > > be > > > + * programmed prior to programming a PIPECONTROL command with "LRI > > > + * Post Sync Operation" in GPGPU mode of operation (i.e when > > > + * PIPELINE_SELECT command is set to GPGPU mode of operation)." > > > + * > > > + * The same text exists a few rows below for Post Sync Op. > > > + */ > > > + genX(emit_raw_pipe_control)(brw, PIPE_CONTROL_CS_STALL, bo, > > > offset, imm); > > > > Are bo, offset, imm needed here as well? > > No, I don't think so. We should pass NULL, 0, 0 here - we're just doing > a plain CS stall separately. We'll use them for the actual write later. > > Good catch! > > [snip] > > > - if (GEN_GEN >= 9) { > > > -/* THE PIPE_CONTROL "VF Cache Invalidation Enable" docs > > > continue: > > > - * > > > - *"Project: BDW+ > > > - * > > > - * When VF Cache Invalidate is set “Post Sync Operation” > > > must > > > - * be enabled to “Write Immediate Data” or “Write PS > > > Depth > > > - * Count” or “Write Timestamp”." > > > - * > > > - * If there's a BO, we're already doing some kind of write. 
> > > - * If not, add a write to the workaround BO. > > > - * > > > - * XXX: This causes GPU hangs on Broadwell, so restrict it to > > > - * Gen9+ for now...see this bug for more information: > > > - * https://bugs.freedesktop.org/show_bug.cgi?id=103787 > > > > In "Flush Types" workarounds later on you apply this for gen8 as well. > > Yes, that's intentional - we're supposed to according to the docs. > I re-tested the Piglit test from bug 103787 on my Broadwell, and it > works fine - no GPU hangs. I think we were just doing it wrong before. > > Trying to figure out an ordering for the workarounds is awful... :( What would you think about another patch just before this to enable that for gen8? Just in case it causes problems it would bisect to much smaller patch. > > > > - */ > > > -if (!bo) { > > > -
[Mesa-dev] [Bug 109443] Build failure with MSVC when using Scons >= 3.0.2
https://bugs.freedesktop.org/show_bug.cgi?id=109443

Alex Granni changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Build failure with MSVC     |Build failure with MSVC
                   |2017 when using Scons >=    |when using Scons >= 3.0.2
                   |3.0.2                       |

-- 
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH 4/4] i965: Reimplement all the PIPE_CONTROL rules.
On Monday, February 25, 2019 6:33:11 AM PST Pohjolainen, Topi wrote: > On Thu, Nov 01, 2018 at 08:04:21PM -0700, Kenneth Graunke wrote: > > This implements virtually all documented PIPE_CONTROL restrictions > > in a centralized helper. You now simply ask for the operations you > > want, and the pipe control "brain" will figure out exactly what pipe > > controls to emit to make that happen without tanking your system. > > > > The hope is that this will fix some intermittent flushing issues as > > well as GPU hangs. However, it also has a high risk of causing GPU > > hangs and other regressions, as this is a particularly sensitive > > area and poking the bear isn't always advisable. > > First I checked I could find all the things in bspec. There was one that I > couldn't, noted further down. > > Second I checked that all the rules earlier were implemented. Found one > exception, noted further down as well. > > I didn't check if the rules still miss something in bspec. That would be > another exercise... [snip] > > + /* Recursive PIPE_CONTROL workarounds > > +* (http://knowyourmeme.com/memes/xzibit-yo-dawg) > > +* > > +* We do these first because we want to look at the original operation, > > +* rather than any workarounds we set. > > +*/ > > + if (GEN_GEN == 6 && (flags & PIPE_CONTROL_RENDER_TARGET_FLUSH)) { > > + /* Hardware workaround: SNB B-Spec says: > > + * > > + *"[Dev-SNB{W/A}]: Before a PIPE_CONTROL with Write Cache Flush > > + * Enable = 1, a PIPE_CONTROL with any non-zero post-sync-op is > > + * required." 
> > + */ > > + brw_emit_post_sync_nonzero_flush(brw); > > + } > > + > > + if (GEN_GEN == 9 && (flags & PIPE_CONTROL_VF_CACHE_INVALIDATE)) { > > + /* The PIPE_CONTROL "VF Cache Invalidation Enable" bit description > > + * lists several workarounds: > > + * > > + *"Project: SKL, KBL, BXT > > + * > > + * If the VF Cache Invalidation Enable is set to a 1 in a > > + * PIPE_CONTROL, a separate Null PIPE_CONTROL, all bitfields > > + * sets to 0, with the VF Cache Invalidation Enable set to 0 > > + * needs to be sent prior to the PIPE_CONTROL with VF Cache > > + * Invalidation Enable set to a 1." > > + */ > > + genX(emit_raw_pipe_control)(brw, 0, NULL, 0, 0); > > + } > > + > > + if (GEN_GEN == 9 && IS_COMPUTE_PIPELINE(brw) && post_sync_flags) { > > + /* Project: SKL / Argument: LRI Post Sync Operation [23] > > + * > > + * "PIPECONTROL command with “Command Streamer Stall Enable” must be > > + * programmed prior to programming a PIPECONTROL command with "LRI > > + * Post Sync Operation" in GPGPU mode of operation (i.e when > > + * PIPELINE_SELECT command is set to GPGPU mode of operation)." > > + * > > + * The same text exists a few rows below for Post Sync Op. > > + */ > > + genX(emit_raw_pipe_control)(brw, PIPE_CONTROL_CS_STALL, bo, offset, > > imm); > > Are bo, offset, imm needed here as well? No, I don't think so. We should pass NULL, 0, 0 here - we're just doing a plain CS stall separately. We'll use them for the actual write later. Good catch! [snip] > > - if (GEN_GEN >= 9) { > > -/* THE PIPE_CONTROL "VF Cache Invalidation Enable" docs > > continue: > > - * > > - *"Project: BDW+ > > - * > > - * When VF Cache Invalidate is set “Post Sync Operation” > > must > > - * be enabled to “Write Immediate Data” or “Write PS Depth > > - * Count” or “Write Timestamp”." > > - * > > - * If there's a BO, we're already doing some kind of write. > > - * If not, add a write to the workaround BO. 
> > - * > > - * XXX: This causes GPU hangs on Broadwell, so restrict it to > > - * Gen9+ for now...see this bug for more information: > > - * https://bugs.freedesktop.org/show_bug.cgi?id=103787 > > In "Flush Types" workarounds later on you apply this for gen8 as well. Yes, that's intentional - we're supposed to according to the docs. I re-tested the Piglit test from bug 103787 on my Broadwell, and it works fine - no GPU hangs. I think we were just doing it wrong before. Trying to figure out an ordering for the workarounds is awful... :( > > - */ > > -if (!bo) { > > - flags |= PIPE_CONTROL_WRITE_IMMEDIATE; > > - bo = brw->workaround_bo; > > -} > > - } > > + if (GEN_IS_HASWELL) { > > + /* From the PIPE_CONTROL page itself: > > + * > > + *"HSW - Programming Note: PIPECONTROL with RO Cache > > Invalidation: > > + * Prior to programming a PIPECONTROL command with any of the RO > > + * cache invalidation bit set, program a
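
A minimal sketch of the ordering discussed in this thread: prerequisite PIPE_CONTROLs are chosen by looking at the caller's original flags and are emitted before the requested operation, and the separate CS stall carries no write target (the NULL, 0, 0 point from the review). Everything here is a hypothetical model: the flag bits, the log structure, and the function names are invented for illustration and are not the i965 code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits and batch log, for illustration only. */
#define PC_CS_STALL            (1u << 0)
#define PC_VF_CACHE_INVALIDATE (1u << 1)
#define PC_WRITE_IMMEDIATE     (1u << 2)

struct pc_cmd {
   uint32_t flags;
   bool has_write;   /* true if the command carries a bo/offset/imm write */
};

struct pc_log {
   struct pc_cmd cmds[4];
   unsigned count;
};

static void
emit_raw(struct pc_log *log, uint32_t flags, bool has_write)
{
   if (log->count < 4) {
      log->cmds[log->count].flags = flags;
      log->cmds[log->count].has_write = has_write;
      log->count++;
   }
}

static void
emit_pipe_control(struct pc_log *log, int gen, bool compute,
                  uint32_t flags, bool has_write)
{
   /* Workarounds first, decided from the original request. */
   if (gen == 9 && (flags & PC_VF_CACHE_INVALIDATE))
      emit_raw(log, 0, false);            /* null PIPE_CONTROL */

   if (gen == 9 && compute && (flags & PC_WRITE_IMMEDIATE))
      emit_raw(log, PC_CS_STALL, false);  /* plain stall, no write target */

   /* The requested operation, with its write (if any), comes last. */
   emit_raw(log, flags, has_write);
}
```

Keeping the write target on the final command only means the post-sync write happens exactly once, which is the behaviour the reply above asks for.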
[Mesa-dev] [PATCH 7/7] radeonsi: implement ARB/KHR_parallel_shader_compile callbacks
From: Marek Olšák

---
 src/gallium/drivers/radeonsi/si_pipe.c | 31 ++
 1 file changed, 31 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
index b965d9d64d4..7dbd4cb2c40 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -19,20 +19,21 @@
  * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
  * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
  * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
  * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
  * USE OR OTHER DEALINGS IN THE SOFTWARE.
  */
 
 #include "si_pipe.h"
 #include "si_public.h"
 #include "si_shader_internal.h"
+#include "si_compute.h"
 #include "sid.h"
 
 #include "ac_llvm_util.h"
 #include "radeon/radeon_uvd.h"
 #include "gallivm/lp_bld_misc.h"
 #include "util/disk_cache.h"
 #include "util/u_log.h"
 #include "util/u_memory.h"
 #include "util/u_suballoc.h"
 #include "util/u_tests.h"
@@ -826,20 +827,46 @@ static void si_disk_cache_create(struct si_screen *sscreen)
         */
        STATIC_ASSERT(ALL_FLAGS <= UINT_MAX);
        shader_debug_flags |= (uint64_t)sscreen->info.address32_hi << 32;
 
        sscreen->disk_shader_cache =
                disk_cache_create(sscreen->info.name,
                                  cache_id,
                                  shader_debug_flags);
 }
 
+static void si_set_max_shader_compiler_threads(struct pipe_screen *screen,
+                                               unsigned max_threads)
+{
+       struct si_screen *sscreen = (struct si_screen *)screen;
+
+       /* This function doesn't allow a greater number of threads than
+        * the queue had at its creation. */
+       util_queue_adjust_num_threads(&sscreen->shader_compiler_queue,
+                                     max_threads);
+       /* Don't change the number of threads on the low priority queue. */
+}
+
+static bool si_is_parallel_shader_compilation_finished(struct pipe_screen *screen,
+                                                       void *shader,
+                                                       unsigned shader_type)
+{
+       if (shader_type == PIPE_SHADER_COMPUTE) {
+               struct si_compute *cs = (struct si_compute*)shader;
+
+               return util_queue_fence_is_signalled(&cs->ready);
+       }
+       struct si_shader_selector *sel = (struct si_shader_selector *)shader;
+
+       return util_queue_fence_is_signalled(&sel->ready);
+}
+
 struct pipe_screen *radeonsi_screen_create(struct radeon_winsys *ws,
                                            const struct pipe_screen_config *config)
 {
        struct si_screen *sscreen = CALLOC_STRUCT(si_screen);
        unsigned hw_threads, num_comp_hi_threads, num_comp_lo_threads, i;
 
        if (!sscreen) {
                return NULL;
        }
 
@@ -856,20 +883,24 @@ struct pipe_screen *radeonsi_screen_create(struct radeon_winsys *ws,
        }
 
        sscreen->debug_flags = debug_get_flags_option("R600_DEBUG",
                                                      debug_options, 0);
        sscreen->debug_flags |= debug_get_flags_option("AMD_DEBUG",
                                                       debug_options, 0);
 
        /* Set functions first. */
        sscreen->b.context_create = si_pipe_create_context;
        sscreen->b.destroy = si_destroy_screen;
+       sscreen->b.set_max_shader_compiler_threads =
+               si_set_max_shader_compiler_threads;
+       sscreen->b.is_parallel_shader_compilation_finished =
+               si_is_parallel_shader_compilation_finished;
 
        si_init_screen_get_functions(sscreen);
        si_init_screen_buffer_functions(sscreen);
        si_init_screen_fence_functions(sscreen);
        si_init_screen_state_functions(sscreen);
        si_init_screen_texture_functions(sscreen);
        si_init_screen_query_functions(sscreen);
 
        /* Set these flags in debug_flags early, so that the shader cache takes
         * them into account.
-- 
2.17.1
[Mesa-dev] [PATCH 5/7] util/queue: hold a lock when reading num_threads in util_queue_finish
From: Marek Olšák

Reviewed-by: Ian Romanick
---
 src/util/u_queue.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/util/u_queue.c b/src/util/u_queue.c
index cfd2a08e3c8..5e0c1095569 100644
--- a/src/util/u_queue.c
+++ b/src/util/u_queue.c
@@ -585,29 +585,29 @@ util_queue_finish_execute(void *data, int num_thread)
    util_barrier_wait(barrier);
 }
 
 /**
  * Wait until all previously added jobs have completed.
  */
 void
 util_queue_finish(struct util_queue *queue)
 {
    util_barrier barrier;
-   struct util_queue_fence *fences = malloc(queue->num_threads * sizeof(*fences));
-
-   util_barrier_init(&barrier, queue->num_threads);
+   struct util_queue_fence *fences;
 
    /* If 2 threads were adding jobs for 2 different barries at the same time,
     * a deadlock would happen, because 1 barrier requires that all threads
     * wait for it exclusively.
     */
    mtx_lock(&queue->finish_lock);
+   fences = malloc(queue->num_threads * sizeof(*fences));
+   util_barrier_init(&barrier, queue->num_threads);
 
    for (unsigned i = 0; i < queue->num_threads; ++i) {
       util_queue_fence_init(&fences[i]);
       util_queue_add_job(queue, &barrier, &fences[i], util_queue_finish_execute, NULL);
    }
 
    for (unsigned i = 0; i < queue->num_threads; ++i) {
       util_queue_fence_wait(&fences[i]);
       util_queue_fence_destroy(&fences[i]);
    }
-- 
2.17.1
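
A minimal sketch of the idea behind this patch, under stated assumptions: the struct and function names are hypothetical, and POSIX mutexes stand in for the C11 mtx_t the real queue uses. The point is only the ordering: the thread count is read after the lock is taken, so the allocation size and the loop bound cannot disagree if another thread resizes the queue while holding the same lock.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the queue; only the fields the sketch needs. */
struct tiny_queue {
   pthread_mutex_t finish_lock;
   unsigned num_threads;   /* a resizer may change this under finish_lock */
};

/* Allocates one slot per worker, touches each slot, and returns how many
 * workers were visited (0 on allocation failure).  num_threads is only read
 * while finish_lock is held.
 */
static unsigned
visit_all_workers(struct tiny_queue *q)
{
   unsigned visited = 0;

   pthread_mutex_lock(&q->finish_lock);

   int *slots = calloc(q->num_threads, sizeof(*slots));
   if (slots) {
      for (unsigned i = 0; i < q->num_threads; i++) {
         slots[i] = 1;
         visited++;
      }
      free(slots);
   }

   pthread_mutex_unlock(&q->finish_lock);
   return visited;
}
```

Reading num_threads before taking the lock, as the old code did, would allow the array to be sized for one count and the loops to run for another.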
[Mesa-dev] [PATCH 1/7] mesa: implement ARB/KHR_parallel_shader_compile
From: Marek Olšák Tested by piglit. --- docs/features.txt | 2 +- docs/relnotes/19.0.0.html | 2 ++ src/mapi/glapi/gen/gl_API.xml | 15 ++- src/mesa/main/dd.h | 7 +++ src/mesa/main/extensions_table.h| 2 ++ src/mesa/main/get_hash_params.py| 3 +++ src/mesa/main/hint.c| 12 src/mesa/main/hint.h| 4 src/mesa/main/mtypes.h | 1 + src/mesa/main/shaderapi.c | 10 ++ src/mesa/main/tests/dispatch_sanity.cpp | 5 + 11 files changed, 61 insertions(+), 2 deletions(-) diff --git a/docs/features.txt b/docs/features.txt index 6c2b6d59377..440192be8f0 100644 --- a/docs/features.txt +++ b/docs/features.txt @@ -295,21 +295,21 @@ GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+, radeonsi, virgl GL_OES_texture_storage_multisample_2d_array DONE (all drivers that support GL_ARB_texture_multisample) Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES version: GL_ARB_bindless_texture DONE (nvc0, radeonsi) GL_ARB_cl_event not started GL_ARB_compute_variable_group_sizeDONE (nvc0, radeonsi) GL_ARB_ES3_2_compatibilityDONE (i965/gen8+, radeonsi, virgl) GL_ARB_fragment_shader_interlock DONE (i965) GL_ARB_gpu_shader_int64 DONE (i965/gen8+, nvc0, radeonsi, softpipe, llvmpipe) - GL_ARB_parallel_shader_compilenot started, but Chia-I Wu did some related work in 2014 + GL_ARB_parallel_shader_compileDONE (all drivers) GL_ARB_post_depth_coverageDONE (i965, nvc0) GL_ARB_robustness_isolation not started GL_ARB_sample_locations DONE (nvc0) GL_ARB_seamless_cubemap_per_texture DONE (freedreno, i965, nvc0, radeonsi, r600, softpipe, swr, virgl) GL_ARB_shader_ballot DONE (i965/gen8+, nvc0, radeonsi) GL_ARB_shader_clock DONE (i965/gen7+, nv50, nvc0, r600, radeonsi, virgl) GL_ARB_shader_stencil_export DONE (i965/gen9+, r600, radeonsi, softpipe, llvmpipe, swr, virgl) GL_ARB_shader_viewport_layer_arrayDONE (i965/gen6+, nvc0, radeonsi) GL_ARB_sparse_buffer DONE (radeonsi/CIK+) GL_ARB_sparse_texture not started diff --git a/docs/relnotes/19.0.0.html b/docs/relnotes/19.0.0.html index 
1b4edd7ce76..69d9649721f 100644 --- a/docs/relnotes/19.0.0.html +++ b/docs/relnotes/19.0.0.html @@ -33,25 +33,27 @@ Compatibility contexts may report a lower version depending on each driver. SHA256 checksums TBD. New features GL_AMD_texture_texture4 on all GL 4.0 drivers. +GL_ARB_parallel_shader_compile on all drivers. GL_EXT_shader_implicit_conversions on all drivers (ES extension). GL_EXT_texture_compression_bptc on all GL 4.0 drivers (ES extension). GL_EXT_texture_compression_rgtc on all GL 3.0 drivers (ES extension). GL_EXT_render_snorm on gallium drivers (ES extension). GL_EXT_texture_view on drivers supporting texture views (ES extension). +GL_KHR_parallel_shader_compile on all drivers. GL_OES_texture_view on drivers supporting texture views (ES extension). GL_NV_shader_atomic_float on nvc0 (Fermi/Kepler only). Shader-based software implementations of GL_ARB_gpu_shader_fp64, GL_ARB_gpu_shader_int64, GL_ARB_vertex_attrib_64bit, and GL_ARB_shader_ballot on i965. VK_ANDROID_external_memory_android_hardware_buffer on Intel Fixed and re-exposed VK_EXT_pci_bus_info on Intel and RADV VK_EXT_scalar_block_layout on Intel and RADV VK_KHR_depth_stencil_resolve on Intel VK_KHR_draw_indirect_count on Intel VK_EXT_conditional_rendering on Intel VK_EXT_memory_budget on RADV diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml index 929e5f6b024..9b8998532d5 100644 --- a/src/mapi/glapi/gen/gl_API.xml +++ b/src/mapi/glapi/gen/gl_API.xml @@ -8393,21 +8393,34 @@ http://www.w3.org/2001/XInclude"/> - + + + + + + + + + + + + + + http://www.w3.org/2001/XInclude"/> diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h index 1214eeaa474..8251af3f667 100644 --- a/src/mesa/main/dd.h +++ b/src/mesa/main/dd.h @@ -1291,20 +1291,27 @@ struct dd_function_table { /** * Called to initialize gl_program::driver_cache_blob (and size) with a * ralloc allocated buffer. * * This buffer will be saved and restored as part of the gl_program * serialization and deserialization. 
*/ void
[Mesa-dev] [PATCH 4/7] util/queue: add ability to kill a subset of threads
From: Marek Olšák for ARB_parallel_shader_compile --- src/util/u_queue.c | 52 ++ src/util/u_queue.h | 5 ++--- 2 files changed, 36 insertions(+), 21 deletions(-) diff --git a/src/util/u_queue.c b/src/util/u_queue.c index 48c5c79552d..cfd2a08e3c8 100644 --- a/src/util/u_queue.c +++ b/src/util/u_queue.c @@ -26,42 +26,43 @@ #include "u_queue.h" #include #include "util/os_time.h" #include "util/u_string.h" #include "util/u_thread.h" #include "u_process.h" -static void util_queue_killall_and_wait(struct util_queue *queue); +static void +util_queue_kill_threads(struct util_queue *queue, unsigned keep_num_threads); / * Wait for all queues to assert idle when exit() is called. * * Otherwise, C++ static variable destructors can be called while threads * are using the static variables. */ static once_flag atexit_once_flag = ONCE_FLAG_INIT; static struct list_head queue_list; static mtx_t exit_mutex = _MTX_INITIALIZER_NP; static void atexit_handler(void) { struct util_queue *iter; mtx_lock(_mutex); /* Wait for all queues to assert idle. 
*/ LIST_FOR_EACH_ENTRY(iter, _list, head) { - util_queue_killall_and_wait(iter); + util_queue_kill_threads(iter, 0); } mtx_unlock(_mutex); } static void global_init(void) { LIST_INITHEAD(_list); atexit(atexit_handler); } @@ -259,55 +260,58 @@ util_queue_thread_func(void *input) u_thread_setname(name); } while (1) { struct util_queue_job job; mtx_lock(>lock); assert(queue->num_queued >= 0 && queue->num_queued <= queue->max_jobs); /* wait if the queue is empty */ - while (!queue->kill_threads && queue->num_queued == 0) + while (thread_index < queue->num_threads && queue->num_queued == 0) cnd_wait(>has_queued_cond, >lock); - if (queue->kill_threads) { + /* only kill threads that are above "num_threads" */ + if (thread_index >= queue->num_threads) { mtx_unlock(>lock); break; } job = queue->jobs[queue->read_idx]; memset(>jobs[queue->read_idx], 0, sizeof(struct util_queue_job)); queue->read_idx = (queue->read_idx + 1) % queue->max_jobs; queue->num_queued--; cnd_signal(>has_space_cond); mtx_unlock(>lock); if (job.job) { job.execute(job.job, thread_index); util_queue_fence_signal(job.fence); if (job.cleanup) job.cleanup(job.job, thread_index); } } - /* signal remaining jobs before terminating */ + /* signal remaining jobs if all threads are being terminated */ mtx_lock(>lock); - for (unsigned i = queue->read_idx; i != queue->write_idx; -i = (i + 1) % queue->max_jobs) { - if (queue->jobs[i].job) { - util_queue_fence_signal(queue->jobs[i].fence); - queue->jobs[i].job = NULL; + if (queue->num_threads == 0) { + for (unsigned i = queue->read_idx; i != queue->write_idx; + i = (i + 1) % queue->max_jobs) { + if (queue->jobs[i].job) { +util_queue_fence_signal(queue->jobs[i].fence); +queue->jobs[i].job = NULL; + } } + queue->read_idx = queue->write_idx; + queue->num_queued = 0; } - queue->read_idx = queue->write_idx; - queue->num_queued = 0; mtx_unlock(>lock); return 0; } static bool util_queue_create_thread(struct util_queue *queue, unsigned index) { struct thread_input *input = 
(struct thread_input *) malloc(sizeof(struct thread_input)); input->queue = queue; @@ -418,60 +422,72 @@ fail: cnd_destroy(>has_queued_cond); mtx_destroy(>lock); free(queue->jobs); } /* also util_queue_is_initialized can be used to check for success */ memset(queue, 0, sizeof(*queue)); return false; } static void -util_queue_killall_and_wait(struct util_queue *queue) +util_queue_kill_threads(struct util_queue *queue, unsigned keep_num_threads) { unsigned i; /* Signal all threads to terminate. */ + mtx_lock(>finish_lock); + + if (keep_num_threads >= queue->num_threads) { + mtx_unlock(>finish_lock); + return; + } + mtx_lock(>lock); - queue->kill_threads = 1; + unsigned old_num_threads = queue->num_threads; + /* Setting num_threads is what causes the threads to terminate. +* Then cnd_broadcast wakes them up and they will exit their function. +*/ + queue->num_threads = keep_num_threads; cnd_broadcast(>has_queued_cond); mtx_unlock(>lock); - for (i = 0; i < queue->num_threads; i++) + for (i = keep_num_threads; i < old_num_threads; i++) thrd_join(queue->threads[i], NULL); - queue->num_threads = 0; + + mtx_unlock(>finish_lock); } void util_queue_destroy(struct util_queue *queue) { - util_queue_killall_and_wait(queue); +
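
A minimal sketch of the per-worker decision this patch introduces, with the locking and condition variable left out. The enum and function name are hypothetical; the conditions are taken from the patched loop in util_queue_thread_func, where a worker exits once its index is no longer below num_threads and otherwise waits while the queue is empty.

```c
/* Hypothetical model of what one worker does on each pass of its loop. */
enum worker_action { WORKER_WAIT, WORKER_RUN_JOB, WORKER_EXIT };

static enum worker_action
next_action(unsigned thread_index, unsigned num_threads, unsigned num_queued)
{
   /* Only threads at or above the new num_threads are asked to exit. */
   if (thread_index >= num_threads)
      return WORKER_EXIT;

   /* Surviving threads sleep until a job is queued. */
   if (num_queued == 0)
      return WORKER_WAIT;

   return WORKER_RUN_JOB;
}
```

Setting num_threads to 0 makes every worker exit, which is how the atexit handler reuses the same path that used to be the kill_threads flag.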
[Mesa-dev] [PATCH 3/7] util/queue: move thread creation into a separate function
From: Marek Olšák

Reviewed-by: Ian Romanick
---
 src/util/u_queue.c | 56 ++
 1 file changed, 32 insertions(+), 24 deletions(-)

diff --git a/src/util/u_queue.c b/src/util/u_queue.c
index 3812c824b6d..48c5c79552d 100644
--- a/src/util/u_queue.c
+++ b/src/util/u_queue.c
@@ -298,20 +298,51 @@ util_queue_thread_func(void *input)
          util_queue_fence_signal(queue->jobs[i].fence);
          queue->jobs[i].job = NULL;
       }
    }
    queue->read_idx = queue->write_idx;
    queue->num_queued = 0;
    mtx_unlock(&queue->lock);
    return 0;
 }
 
+static bool
+util_queue_create_thread(struct util_queue *queue, unsigned index)
+{
+   struct thread_input *input =
+      (struct thread_input *) malloc(sizeof(struct thread_input));
+   input->queue = queue;
+   input->thread_index = index;
+
+   queue->threads[index] = u_thread_create(util_queue_thread_func, input);
+
+   if (!queue->threads[index]) {
+      free(input);
+      return false;
+   }
+
+   if (queue->flags & UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY) {
+#if defined(__linux__) && defined(SCHED_IDLE)
+      struct sched_param sched_param = {0};
+
+      /* The nice() function can only set a maximum of 19.
+       * SCHED_IDLE is the same as nice = 20.
+       *
+       * Note that Linux only allows decreasing the priority. The original
+       * priority can't be restored.
+       */
+      pthread_setschedparam(queue->threads[index], SCHED_IDLE, &sched_param);
+#endif
+   }
+   return true;
+}
+
 bool
 util_queue_init(struct util_queue *queue,
                 const char *name,
                 unsigned max_jobs,
                 unsigned num_threads,
                 unsigned flags)
 {
    unsigned i;
 
    /* Form the thread name from process_name and name, limited to 13
@@ -357,53 +388,30 @@ util_queue_init(struct util_queue *queue,
    queue->num_queued = 0;
    cnd_init(&queue->has_queued_cond);
    cnd_init(&queue->has_space_cond);
 
    queue->threads = (thrd_t*) calloc(num_threads, sizeof(thrd_t));
    if (!queue->threads)
       goto fail;
 
    /* start threads */
    for (i = 0; i < num_threads; i++) {
-      struct thread_input *input =
-         (struct thread_input *) malloc(sizeof(struct thread_input));
-      input->queue = queue;
-      input->thread_index = i;
-
-      queue->threads[i] = u_thread_create(util_queue_thread_func, input);
-
-      if (!queue->threads[i]) {
-         free(input);
-
+      if (!util_queue_create_thread(queue, i)) {
          if (i == 0) {
             /* no threads created, fail */
             goto fail;
          } else {
             /* at least one thread created, so use it */
             queue->num_threads = i;
             break;
          }
       }
-
-      if (flags & UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY) {
-   #if defined(__linux__) && defined(SCHED_IDLE)
-         struct sched_param sched_param = {0};
-
-         /* The nice() function can only set a maximum of 19.
-          * SCHED_IDLE is the same as nice = 20.
-          *
-          * Note that Linux only allows decreasing the priority. The original
-          * priority can't be restored.
-          */
-         pthread_setschedparam(queue->threads[i], SCHED_IDLE, &sched_param);
-   #endif
-      }
    }
 
    add_to_atexit_list(queue);
    return true;
 
 fail:
    free(queue->threads);
 
    if (queue->jobs) {
       cnd_destroy(&queue->has_space_cond);
-- 
2.17.1
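
A minimal sketch of the start-up policy that util_queue_init keeps after this refactoring: try to create each worker in turn, stop at the first failure, and treat "zero created" as the only fatal case. The callback type and both function names are hypothetical; the callback stands in for util_queue_create_thread so the policy can be exercised without real threads.

```c
#include <stdbool.h>

/* Hypothetical creation hook: returns false if worker 'index' could not
 * be started. */
typedef bool (*create_worker_fn)(unsigned index, void *ctx);

/* Returns how many workers were started.  A return value of 0 corresponds
 * to the "goto fail" path; anything else means the queue is usable with
 * that many threads.
 */
static unsigned
start_workers(unsigned wanted, create_worker_fn create, void *ctx)
{
   for (unsigned i = 0; i < wanted; i++) {
      if (!create(i, ctx))
         return i;   /* keep the workers that did start */
   }
   return wanted;
}

/* Example hook for illustration: pretends the system can only start
 * *limit workers. */
static bool
create_until_limit(unsigned index, void *ctx)
{
   const unsigned *limit = ctx;
   return index < *limit;
}
```

Moving the creation step behind one function is what lets a later patch in the series start additional workers outside of util_queue_init.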
[Mesa-dev] [PATCH 6/7] util/queue: add util_queue_adjust_num_threads
From: Marek Olšák for ARB_parallel_shader_compile Reviewed-by: Ian Romanick --- src/util/u_queue.c | 50 -- src/util/u_queue.h | 8 2 files changed, 52 insertions(+), 6 deletions(-) diff --git a/src/util/u_queue.c b/src/util/u_queue.c index 5e0c1095569..cd0b95b6ead 100644 --- a/src/util/u_queue.c +++ b/src/util/u_queue.c @@ -27,42 +27,43 @@ #include "u_queue.h" #include #include "util/os_time.h" #include "util/u_string.h" #include "util/u_thread.h" #include "u_process.h" static void -util_queue_kill_threads(struct util_queue *queue, unsigned keep_num_threads); +util_queue_kill_threads(struct util_queue *queue, unsigned keep_num_threads, +bool finish_locked); / * Wait for all queues to assert idle when exit() is called. * * Otherwise, C++ static variable destructors can be called while threads * are using the static variables. */ static once_flag atexit_once_flag = ONCE_FLAG_INIT; static struct list_head queue_list; static mtx_t exit_mutex = _MTX_INITIALIZER_NP; static void atexit_handler(void) { struct util_queue *iter; mtx_lock(_mutex); /* Wait for all queues to assert idle. */ LIST_FOR_EACH_ENTRY(iter, _list, head) { - util_queue_kill_threads(iter, 0); + util_queue_kill_threads(iter, 0, false); } mtx_unlock(_mutex); } static void global_init(void) { LIST_INITHEAD(_list); atexit(atexit_handler); } @@ -333,20 +334,53 @@ util_queue_create_thread(struct util_queue *queue, unsigned index) * * Note that Linux only allows decreasing the priority. The original * priority can't be restored. 
*/ pthread_setschedparam(queue->threads[index], SCHED_IDLE, _param); #endif } return true; } +void +util_queue_adjust_num_threads(struct util_queue *queue, unsigned num_threads) +{ + num_threads = MIN2(num_threads, queue->max_threads); + num_threads = MAX2(num_threads, 1); + + mtx_lock(>finish_lock); + unsigned old_num_threads = queue->num_threads; + + if (num_threads == old_num_threads) { + mtx_unlock(>finish_lock); + return; + } + + if (num_threads < old_num_threads) { + util_queue_kill_threads(queue, num_threads, true); + mtx_unlock(>finish_lock); + return; + } + + /* Create threads. +* +* We need to update num_threads first, because threads terminate +* when thread_index < num_threads. +*/ + queue->num_threads = num_threads; + for (unsigned i = old_num_threads; i < num_threads; i++) { + if (!util_queue_create_thread(queue, i)) + break; + } + mtx_unlock(>finish_lock); +} + bool util_queue_init(struct util_queue *queue, const char *name, unsigned max_jobs, unsigned num_threads, unsigned flags) { unsigned i; /* Form the thread name from process_name and name, limited to 13 @@ -371,20 +405,21 @@ util_queue_init(struct util_queue *queue, memset(queue, 0, sizeof(*queue)); if (process_len) { util_snprintf(queue->name, sizeof(queue->name), "%.*s:%s", process_len, process_name, name); } else { util_snprintf(queue->name, sizeof(queue->name), "%s", name); } queue->flags = flags; + queue->max_threads = num_threads; queue->num_threads = num_threads; queue->max_jobs = max_jobs; queue->jobs = (struct util_queue_job*) calloc(max_jobs, sizeof(struct util_queue_job)); if (!queue->jobs) goto fail; (void) mtx_init(>lock, mtx_plain); (void) mtx_init(>finish_lock, mtx_plain); @@ -422,51 +457,54 @@ fail: cnd_destroy(>has_queued_cond); mtx_destroy(>lock); free(queue->jobs); } /* also util_queue_is_initialized can be used to check for success */ memset(queue, 0, sizeof(*queue)); return false; } static void -util_queue_kill_threads(struct util_queue *queue, unsigned keep_num_threads) 
+util_queue_kill_threads(struct util_queue *queue, unsigned keep_num_threads, +bool finish_locked) { unsigned i; /* Signal all threads to terminate. */ - mtx_lock(>finish_lock); + if (!finish_locked) + mtx_lock(>finish_lock); if (keep_num_threads >= queue->num_threads) { mtx_unlock(>finish_lock); return; } mtx_lock(>lock); unsigned old_num_threads = queue->num_threads; /* Setting num_threads is what causes the threads to terminate. * Then cnd_broadcast wakes them up and they will exit their function. */ queue->num_threads = keep_num_threads; cnd_broadcast(>has_queued_cond); mtx_unlock(>lock); for (i = keep_num_threads; i < old_num_threads; i++) thrd_join(queue->threads[i], NULL); - mtx_unlock(>finish_lock); + if (!finish_locked) +
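
A minimal sketch of the decision util_queue_adjust_num_threads makes before it touches any thread. The clamp follows the MIN2/MAX2 pair in the patch; the enum and both function names are hypothetical and exist only to make the three outcomes explicit.

```c
/* Clamp a requested count: never more than the queue was created with,
 * never fewer than one thread. */
static unsigned
clamp_thread_count(unsigned requested, unsigned max_threads)
{
   if (requested > max_threads)
      requested = max_threads;
   if (requested < 1)
      requested = 1;
   return requested;
}

enum resize_action { RESIZE_NONE, RESIZE_KILL, RESIZE_CREATE };

/* Decide what a resize request amounts to for a queue that currently has
 * 'current' threads. */
static enum resize_action
plan_resize(unsigned current, unsigned requested, unsigned max_threads)
{
   unsigned target = clamp_thread_count(requested, max_threads);

   if (target == current)
      return RESIZE_NONE;
   return target < current ? RESIZE_KILL : RESIZE_CREATE;
}
```

In the create case the patch raises num_threads before starting the new workers, because a worker whose index is not below num_threads exits immediately (the condition added in patch 4/7).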
[Mesa-dev] [PATCH 2/7] gallium: implement ARB/KHR_parallel_shader_compile
From: Marek Olšák --- src/gallium/include/pipe/p_screen.h| 13 ++ src/mesa/state_tracker/st_cb_program.c | 59 +- 2 files changed, 71 insertions(+), 1 deletion(-) diff --git a/src/gallium/include/pipe/p_screen.h b/src/gallium/include/pipe/p_screen.h index c4d6e1cc94f..d4e2d9f63ac 100644 --- a/src/gallium/include/pipe/p_screen.h +++ b/src/gallium/include/pipe/p_screen.h @@ -435,20 +435,33 @@ struct pipe_screen { * \param uuidpointer to a memory region of PIPE_UUID_SIZE bytes */ void (*get_driver_uuid)(struct pipe_screen *screen, char *uuid); /** * Fill @uuid with a unique device identifier * * \param uuidpointer to a memory region of PIPE_UUID_SIZE bytes */ void (*get_device_uuid)(struct pipe_screen *screen, char *uuid); + + /** +* Set the maximum number of parallel shader compiler threads. +*/ + void (*set_max_shader_compiler_threads)(struct pipe_screen *screen, + unsigned max_threads); + + /** +* Return whether parallel shader compilation has finished. +*/ + bool (*is_parallel_shader_compilation_finished)(struct pipe_screen *screen, + void *shader, + unsigned shader_type); }; /** * Global configuration options for screen creation. 
*/ struct pipe_screen_config { const struct driOptionCache *options; }; diff --git a/src/mesa/state_tracker/st_cb_program.c b/src/mesa/state_tracker/st_cb_program.c index 555fc5d5ad9..cc96ec552bb 100644 --- a/src/mesa/state_tracker/st_cb_program.c +++ b/src/mesa/state_tracker/st_cb_program.c @@ -259,23 +259,80 @@ st_program_string_notify( struct gl_context *ctx, static struct gl_program * st_new_ati_fs(struct gl_context *ctx, struct ati_fragment_shader *curProg) { struct gl_program *prog = ctx->Driver.NewProgram(ctx, GL_FRAGMENT_PROGRAM_ARB, curProg->Id, true); struct st_fragment_program *stfp = (struct st_fragment_program *)prog; stfp->ati_fs = curProg; return prog; } +static void +st_max_shader_compiler_threads(struct gl_context *ctx, unsigned count) +{ + struct pipe_screen *screen = st_context(ctx)->pipe->screen; + + if (screen->set_max_shader_compiler_threads) + screen->set_max_shader_compiler_threads(screen, count); +} + +static bool +st_get_shader_program_completion_status(struct gl_context *ctx, +struct gl_shader_program *shprog) +{ + struct pipe_screen *screen = st_context(ctx)->pipe->screen; + + if (!screen->is_parallel_shader_compilation_finished) + return true; + + for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { + struct gl_linked_shader *linked = shprog->_LinkedShaders[i]; + void *sh = NULL; + + if (!linked || !linked->Program) + continue; + + switch (i) { + case MESA_SHADER_VERTEX: + if (st_vertex_program(linked->Program)->variants) +sh = st_vertex_program(linked->Program)->variants->driver_shader; + break; + case MESA_SHADER_FRAGMENT: + if (st_fragment_program(linked->Program)->variants) +sh = st_fragment_program(linked->Program)->variants->driver_shader; + break; + case MESA_SHADER_TESS_CTRL: + case MESA_SHADER_TESS_EVAL: + case MESA_SHADER_GEOMETRY: + if (st_common_program(linked->Program)->variants) +sh = st_common_program(linked->Program)->variants->driver_shader; + break; + case MESA_SHADER_COMPUTE: + if 
(st_compute_program(linked->Program)->variants) +sh = st_compute_program(linked->Program)->variants->driver_shader; + break; + } + + unsigned type = pipe_shader_type_from_mesa(i); + + if (sh && + !screen->is_parallel_shader_compilation_finished(screen, sh, type)) + return false; + } + return true; +} + /** * Plug in the program and shader-related device driver functions. */ void st_init_program_functions(struct dd_function_table *functions) { functions->NewProgram = st_new_program; functions->DeleteProgram = st_delete_program; functions->ProgramStringNotify = st_program_string_notify; functions->NewATIfs = st_new_ati_fs; - functions->LinkShader = st_link_shader; + functions->SetMaxShaderCompilerThreads = st_max_shader_compiler_threads; + functions->GetShaderProgramCompletionStatus = + st_get_shader_program_completion_status; } -- 2.17.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
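The completion query in the patch above follows a simple pattern: an optional driver hook that counts as "finished" when the driver does not provide it, polled once per linked stage. A minimal standalone sketch of that pattern follows. All names in it (struct demo_screen, demo_is_finished, demo_program_finished, DEMO_NUM_STAGES) are hypothetical stand-ins for illustration, not Mesa or Gallium API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NUM_STAGES 6 /* hypothetical; stands in for MESA_SHADER_STAGES */

/* Hypothetical screen with an optional "is compilation finished" hook. */
struct demo_screen {
   bool (*is_finished)(struct demo_screen *screen, void *shader,
                       unsigned shader_type);
};

/* Example hook: each "shader" is simply a pointer to a bool done-flag. */
static bool
demo_is_finished(struct demo_screen *screen, void *shader,
                 unsigned shader_type)
{
   (void)screen;
   (void)shader_type;
   return *(bool *)shader;
}

/* Returns true when every stage that has a shader reports completion. */
static bool
demo_program_finished(struct demo_screen *screen,
                      void *shaders[DEMO_NUM_STAGES])
{
   assert(screen != NULL);

   /* No hook: the driver compiles synchronously, so nothing is pending. */
   if (!screen->is_finished)
      return true;

   for (unsigned i = 0; i < DEMO_NUM_STAGES; i++) {
      /* Stages without a shader are skipped, as in the patch. */
      if (shaders[i] && !screen->is_finished(screen, shaders[i], i))
         return false;
   }
   return true;
}
```

Treating a missing hook as "finished" keeps drivers that never compile asynchronously correct without any extra code on their side.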
Re: [Mesa-dev] [PATCH 1/2] swr/rast: don't create wrapper for every Create LLVM call
On Mon, 25 Feb 2019 at 17:59, Hota, Alok wrote:
>
> > -Original Message-
> > From: Emil Velikov [mailto:emil.l.veli...@gmail.com]
> > Sent: Monday, February 11, 2019 7:04 AM
> > To: Hota, Alok
> > Cc: mesa-dev@lists.freedesktop.org; mesa-sta...@lists.freedesktop.org
> > Subject: Re: [Mesa-dev] [PATCH 1/2] swr/rast: don't create wrapper for
> > every Create LLVM call
> >
> > On Mon, 4 Feb 2019 at 17:04, Hota, Alok wrote:
> > >
> > > > -Original Message-
> > > > From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On
> > > > Behalf Of Emil Velikov
> > > > Sent: Monday, February 4, 2019 3:21 AM
> > > > To: mesa-dev@lists.freedesktop.org
> > > > Cc: mesa-sta...@lists.freedesktop.org; emil.l.veli...@gmail.com;
> > > > Hota, Alok
> > > > Subject: [Mesa-dev] [PATCH 1/2] swr/rast: don't create wrapper for
> > > > every Create LLVM call
> > > >
> > > > We use only a fraction (approximately 1/4) of the API - generate
> > > > only those.
> > > >
> > > > This way, we spend less time processing and generate a smaller file.
> > > > This also removes the need for hacks needed for compiling files
> > > > bootstrapped with another LLVM version.
> > >
> > > Thanks for the patch!
> > > I had to add one function, CreateNeg, to used_functions for it to
> > > compile for me.
> > >
> > Thanks, can I consider this a Tested-by?
> > A review or even an ack from SWR developers would also be appreciated.
>
> Sorry to let this hang for so long.
> After some review, it looks like we have some upcoming changes internally
> that would be complicated with this change, and with this approach. At the
> time, I think we will hold off on this change, though I will be keeping it in
> mind.

Can you elaborate on what seems to be the problem?
I'm more than happy to rework the patch so it works for you guys - do you
have a WIP branch somewhere?

Thanks
Emil
Re: [Mesa-dev] [PATCH 1/2] swr/rast: don't create wrapper for every Create LLVM call
> -Original Message-
> From: Emil Velikov [mailto:emil.l.veli...@gmail.com]
> Sent: Monday, February 11, 2019 7:04 AM
> To: Hota, Alok
> Cc: mesa-dev@lists.freedesktop.org; mesa-sta...@lists.freedesktop.org
> Subject: Re: [Mesa-dev] [PATCH 1/2] swr/rast: don't create wrapper for
> every Create LLVM call
>
> On Mon, 4 Feb 2019 at 17:04, Hota, Alok wrote:
> >
> > > -Original Message-
> > > From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On
> > > Behalf Of Emil Velikov
> > > Sent: Monday, February 4, 2019 3:21 AM
> > > To: mesa-dev@lists.freedesktop.org
> > > Cc: mesa-sta...@lists.freedesktop.org; emil.l.veli...@gmail.com;
> > > Hota, Alok
> > > Subject: [Mesa-dev] [PATCH 1/2] swr/rast: don't create wrapper for
> > > every Create LLVM call
> > >
> > > We use only a fraction (approximately 1/4) of the API - generate
> > > only those.
> > >
> > > This way, we spend less time processing and generate a smaller file.
> > > This also removes the need for hacks needed for compiling files
> > > bootstrapped with another LLVM version.
> >
> > Thanks for the patch!
> > I had to add one function, CreateNeg, to used_functions for it to
> > compile for me.
> >
> Thanks, can I consider this a Tested-by?
> A review or even an ack from SWR developers would also be appreciated.

Sorry to let this hang for so long.
After some review, it looks like we have some upcoming changes internally
that would be complicated with this change, and with this approach. At the
time, I think we will hold off on this change, though I will be keeping it in
mind.

I also ran a basic test and don't see much difference in compile time.
Re: [Mesa-dev] [PATCH 5/9] ir3/nir: Add a new pass 'ir3_nir_lower_io_offsets'
On Wed, Feb 13, 2019 at 4:30 PM Eduardo Lima Mitev wrote: > > This pass moves to NIR some offset computations that are currently > implemented on the IR3 backend compiler, to allow NIR to possibly > optimize them. > > For now, it only supports lowering byte-offset computation for image > store and atomics. > --- > src/freedreno/Makefile.sources | 1 + > src/freedreno/ir3/ir3_nir.h | 1 + > src/freedreno/ir3/ir3_nir_lower_io_offsets.c | 334 +++ > 3 files changed, 336 insertions(+) > create mode 100644 src/freedreno/ir3/ir3_nir_lower_io_offsets.c > > diff --git a/src/freedreno/Makefile.sources b/src/freedreno/Makefile.sources > index 7fea9de39ef..235fec1c4f2 100644 > --- a/src/freedreno/Makefile.sources > +++ b/src/freedreno/Makefile.sources > @@ -31,6 +31,7 @@ ir3_SOURCES := \ > ir3/ir3_legalize.c \ > ir3/ir3_nir.c \ > ir3/ir3_nir.h \ > + ir3/ir3_nir_lower_io_offsets.c \ > ir3/ir3_nir_lower_tg4_to_tex.c \ > ir3/ir3_print.c \ > ir3/ir3_ra.c \ > diff --git a/src/freedreno/ir3/ir3_nir.h b/src/freedreno/ir3/ir3_nir.h > index 74201d34160..7983b74af2c 100644 > --- a/src/freedreno/ir3/ir3_nir.h > +++ b/src/freedreno/ir3/ir3_nir.h > @@ -36,6 +36,7 @@ void ir3_nir_scan_driver_consts(nir_shader *shader, struct > ir3_driver_const_layo > > bool ir3_nir_apply_trig_workarounds(nir_shader *shader); > bool ir3_nir_lower_tg4_to_tex(nir_shader *shader); > +bool ir3_nir_lower_io_offsets(nir_shader *shader); > > const nir_shader_compiler_options * ir3_get_compiler_options(struct > ir3_compiler *compiler); > bool ir3_key_lowers_nir(const struct ir3_shader_key *key); > diff --git a/src/freedreno/ir3/ir3_nir_lower_io_offsets.c > b/src/freedreno/ir3/ir3_nir_lower_io_offsets.c > new file mode 100644 > index 000..a43b3895fd8 > --- /dev/null > +++ b/src/freedreno/ir3/ir3_nir_lower_io_offsets.c > @@ -0,0 +1,334 @@ > +/* > + * Copyright © 2018-2019 Igalia S.L. 
> + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the next > + * paragraph) shall be included in all copies or substantial portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > DEALINGS > + * IN THE SOFTWARE. > + */ > + > +#include "ir3_nir.h" > +#include "compiler/nir/nir_builder.h" > + > +/** > + * This pass moves to NIR certain offset computations for different I/O > + * ops that are currently implemented on the IR3 backend compiler, to > + * give NIR a chance to optimize them: > + * > + * - Byte-offset for image store and atomics: Emit instructions to > + * compute (x*bpp) + y*y_stride + z*z_stride), and place the resulting > + * SSA value in the 4th-component of the vec4 instruction that defines > + * the offset. 
> + */ > + > + > +static bool > +intrinsic_is_image_atomic(unsigned intrinsic) > +{ > + switch (intrinsic) { > + case nir_intrinsic_image_deref_atomic_add: > + case nir_intrinsic_image_deref_atomic_min: > + case nir_intrinsic_image_deref_atomic_max: > + case nir_intrinsic_image_deref_atomic_and: > + case nir_intrinsic_image_deref_atomic_or: > + case nir_intrinsic_image_deref_atomic_xor: > + case nir_intrinsic_image_deref_atomic_exchange: > + case nir_intrinsic_image_deref_atomic_comp_swap: > + return true; > + default: > + break; > + } > + > + return false; > +} > + > +static bool > +intrinsic_is_image_store_or_atomic(unsigned intrinsic) > +{ > + if (intrinsic == nir_intrinsic_image_deref_store) > + return true; > + else > + return intrinsic_is_image_atomic(intrinsic); > +} > + > +/* > + * FIXME: shamelessly copied from ir3_compiler_nir until it gets factorized > + * out at some point. Sorry, I'd overlooked this patchset.. blame gitlab MR's Anyways, the good news is these helpers were refactored out into ir3_image.c.. the bad news is rebasing this series will be a bit conflicty since I broke out the image/ssbo intrinsics into ir3_a6xx/ir3_a4xx. (Since
[Mesa-dev] [Bug 109782] [CTS] dEQP-VK.graphicsfuzz.while-inside-switch hangs
https://bugs.freedesktop.org/show_bug.cgi?id=109782 Bug ID: 109782 Summary: [CTS] dEQP-VK.graphicsfuzz.while-inside-switch hangs Product: Mesa Version: git Hardware: Other OS: All Status: NEW Severity: critical Priority: medium Component: Drivers/Vulkan/radeon Assignee: mesa-dev@lists.freedesktop.org Reporter: samuel.pitoi...@gmail.com QA Contact: mesa-dev@lists.freedesktop.org Writing test log into TestResults.qpa dEQP Core git-65ebf85d4b32fc62201aedbef5b69d7c00298bee (0x65ebf85d) starting.. target implementation = 'Default' Test case 'dEQP-VK.graphicsfuzz.while-inside-switch'.. shader: MESA_SHADER_VERTEX inputs: 0 outputs: 0 uniforms: 0 shared: 0 decl_var shader_in INTERP_MODE_NONE vec4 @0 (VERT_ATTRIB_GENERIC0, 0, 0) decl_var shader_out INTERP_MODE_NONE vec4 @1 (VARYING_SLOT_POS, 0, 0) decl_function main (0 params) impl main { decl_var INTERP_MODE_NONE vec4 out@(null)-temp decl_var INTERP_MODE_NONE vec4 in@(null)-temp block block_0: /* preds: */ vec1 32 ssa_0 = deref_var &@0 (shader_in vec4) vec4 32 ssa_1 = intrinsic load_deref (ssa_0) (0) /* access=0 */ vec1 32 ssa_2 = deref_var &@1 (shader_out vec4) intrinsic store_deref (ssa_2, ssa_1) (15, 0) /* wrmask=xyzw */ /* access=0 */ /* succs: block_1 */ block block_1: } shader: MESA_SHADER_FRAGMENT inputs: 0 outputs: 0 uniforms: 0 shared: 0 decl_var shader_in INTERP_MODE_NONE vec4 gl_FragCoord (VARYING_SLOT_POS, 0, 0) decl_var shader_out INTERP_MODE_NONE vec4 _GLF_color (FRAG_RESULT_DATA0, 0, 0) decl_function main (0 params) impl main { decl_var INTERP_MODE_NONE vec4 out@_GLF_color-temp block block_0: /* preds: */ vec1 32 ssa_0 = load_const (0x /* 0.00 */) vec1 32 ssa_1 = load_const (0x4120 /* 10.00 */) /* succs: block_1 */ loop { block block_1: /* preds: block_0 */ vec1 32 ssa_2 = deref_var _FragCoord (shader_in vec4) vec4 32 ssa_3 = intrinsic load_deref (ssa_2) (0) /* access=0 */ vec1 32 ssa_4 = flt32 ssa_3.x, ssa_1 /* succs: block_2 block_3 */ if ssa_4 { block block_2: /* preds: block_1 */ break /* succs: block_10 
*/ } else { block block_3: /* preds: block_1 */ /* succs: block_4 */ } block block_4: /* preds: block_3 */ vec1 32 ssa_5 = intrinsic vulkan_resource_index (ssa_0) (0, 0, 6) /* desc-set=0 */ /* binding=0 */ /* desc_type=UBO */ vec1 32 ssa_6 = intrinsic load_ubo (ssa_5, ssa_0) (4, 0) /* align_mul=4 */ /* align_offset=0 */ vec1 32 ssa_7 = f2i32 ssa_6 vec1 32 ssa_8 = ieq32 ssa_7, ssa_0 /* succs: block_5 block_8 */ if ssa_8 { block block_5: /* preds: block_4 */ /* succs: block_6 */ loop { block block_6: /* preds: block_5 block_6 */ intrinsic discard () () /* succs: block_6 */ } block block_7: /* preds: */ /* succs: block_9 */ } else { block block_8: /* preds: block_4 */ /* succs: block_9 */ } block block_9: /* preds: block_7 block_8 */ break /* succs: block_10 */ } block block_10: /* preds: block_2 block_9 */ vec1 32 ssa_9 = deref_var &_GLF_color (shader_out vec4) vec4 32 ssa_10 = load_const (0x3f80 /* 1.00 */, 0x /* 0.00 */, 0x /* 0.00 */, 0x3f80 /* 1.00 */) intrinsic store_deref (ssa_9, ssa_10) (15, 0) /* wrmask=xyzw */ /* access=0 */ /* succs: block_11 */ block block_11: } 1) The NIR compiler should be able to kill the outer loop 2) The inner (infinite) loop should probably be killed as well (or the discard should be moved outside) -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 109764] [radv] Enable framerate capping
https://bugs.freedesktop.org/show_bug.cgi?id=109764

John changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |john.etted...@gmail.com
[Mesa-dev] [Bug 109443] Build failure with MSVC 2017 when using Scons >= 3.0.2
https://bugs.freedesktop.org/show_bug.cgi?id=109443

--- Comment #2 from Alex Granni ---
Created attachment 143462
  --> https://bugs.freedesktop.org/attachment.cgi?id=143462&action=edit
A hack patch that removes the functionality from the build system that Scons >= 3.0.2 doesn't like
[Mesa-dev] [Bug 109698] dri.pc contents invalid when built with meson
https://bugs.freedesktop.org/show_bug.cgi?id=109698

Emil Velikov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #3 from Emil Velikov ---
Should be fixed with the following. Expect the commit to land in 18.3 and
19.0 in due time.

commit f6556ec7d126b31da37c08d7cb657250505e01a0
Author: Sergii Romantsov
Date:   Thu Feb 21 10:28:11 2019 +0200

    dri: meson: do not prefix user provided dri-drivers-path

    The user can select the location where there dri drivers
[Mesa-dev] [Bug 109443] Build failure with MSVC 2017 when using Scons >= 3.0.2
https://bugs.freedesktop.org/show_bug.cgi?id=109443

Alex Granni changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jfons...@vmware.com
[Mesa-dev] [Bug 109761] misuse of enums in vdpau_private.h
https://bugs.freedesktop.org/show_bug.cgi?id=109761

Christian König changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |NOTABUG

--- Comment #1 from Christian König ---
Please submit a patch for proposed changes. This report system is only for
bugs.
[Mesa-dev] [Bug 109761] misuse of enums in vdpau_private.h
https://bugs.freedesktop.org/show_bug.cgi?id=109761

Christian König changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED
[Mesa-dev] [Bug 109532] ir_variable has maximum access out of bounds -- but it's not out of bounds
https://bugs.freedesktop.org/show_bug.cgi?id=109532

--- Comment #42 from Mark Janes ---
i965 CI runs debug builds by default, but for mesa it uses these meson
configurations:

  -Dbuildtype=release -Db_ndebug=true

We must catch assertions in the CI, however debug builds are much slower to
execute tests.
Re: [Mesa-dev] [PATCH 4/4] i965: Reimplement all the PIPE_CONTROL rules.
On Thu, Nov 01, 2018 at 08:04:21PM -0700, Kenneth Graunke wrote: > This implements virtually all documented PIPE_CONTROL restrictions > in a centralized helper. You now simply ask for the operations you > want, and the pipe control "brain" will figure out exactly what pipe > controls to emit to make that happen without tanking your system. > > The hope is that this will fix some intermittent flushing issues as > well as GPU hangs. However, it also has a high risk of causing GPU > hangs and other regressions, as this is a particularly sensitive > area and poking the bear isn't always advisable. First I checked I could find all the things in bspec. There was one that I couldn't, noted further down. Second I checked that all the rules earlier were implemented. Found one exception, noted further down as well. I didn't check if the rules still miss something in bspec. That would be another exercise... > --- > src/mesa/drivers/dri/i965/genX_pipe_control.c | 563 +- > 1 file changed, 428 insertions(+), 135 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/genX_pipe_control.c > b/src/mesa/drivers/dri/i965/genX_pipe_control.c > index 8eb37444253..503e674661b 100644 > --- a/src/mesa/drivers/dri/i965/genX_pipe_control.c > +++ b/src/mesa/drivers/dri/i965/genX_pipe_control.c > @@ -25,172 +25,465 @@ > #include "brw_defines.h" > #include "brw_state.h" > > +static unsigned > +flags_to_post_sync_op(uint32_t flags) > +{ > + if (flags & PIPE_CONTROL_WRITE_IMMEDIATE) > + return WriteImmediateData; > + > + if (flags & PIPE_CONTROL_WRITE_DEPTH_COUNT) > + return WritePSDepthCount; > + > + if (flags & PIPE_CONTROL_WRITE_TIMESTAMP) > + return WriteTimestamp; > + > + return 0; > +} > + > /** > - * According to the latest documentation, any PIPE_CONTROL with the > - * "Command Streamer Stall" bit set must also have another bit set, > - * with five different options: > - * > - * - Render Target Cache Flush > - * - Depth Cache Flush > - * - Stall at Pixel Scoreboard > - * - Post-Sync 
Operation > - * - Depth Stall > - * - DC Flush Enable > - * > - * I chose "Stall at Pixel Scoreboard" since we've used it effectively > - * in the past, but the choice is fairly arbitrary. > + * Do the given flags have a Post Sync or LRI Post Sync operation? > */ > -static void > -gen8_add_cs_stall_workaround_bits(uint32_t *flags) > +static enum pipe_control_flags > +get_post_sync_flags(enum pipe_control_flags flags) > { > - uint32_t wa_bits = PIPE_CONTROL_RENDER_TARGET_FLUSH | > - PIPE_CONTROL_DEPTH_CACHE_FLUSH | > - PIPE_CONTROL_WRITE_IMMEDIATE | > - PIPE_CONTROL_WRITE_DEPTH_COUNT | > - PIPE_CONTROL_WRITE_TIMESTAMP | > - PIPE_CONTROL_STALL_AT_SCOREBOARD | > - PIPE_CONTROL_DEPTH_STALL | > - PIPE_CONTROL_DATA_CACHE_FLUSH; > - > - /* If we're doing a CS stall, and don't already have one of the > -* workaround bits set, add "Stall at Pixel Scoreboard." > + flags &= PIPE_CONTROL_WRITE_IMMEDIATE | > +PIPE_CONTROL_WRITE_DEPTH_COUNT | > +PIPE_CONTROL_WRITE_TIMESTAMP | > +PIPE_CONTROL_LRI_POST_SYNC_OP; > + > + /* Only one "Post Sync Op" is allowed, and it's mutually exclusive with > +* "LRI Post Sync Operation". So more than one bit set would be illegal. > */ > - if ((*flags & PIPE_CONTROL_CS_STALL) != 0 && (*flags & wa_bits) == 0) > - *flags |= PIPE_CONTROL_STALL_AT_SCOREBOARD; > + assert(util_bitcount(flags) <= 1); > + > + return flags; > } > > -/* Implement the WaCsStallAtEveryFourthPipecontrol workaround on IVB, BYT: > +#define IS_COMPUTE_PIPELINE(brw) \ > + (GEN_GEN >= 7 && brw->last_pipeline == BRW_COMPUTE_PIPELINE) > + > +/* Closed interval - GEN_GEN \in [x, y] */ > +#define IS_GEN_BETWEEN(x, y) (GEN_GEN >= x && GEN_GEN <= y) > +#define IS_GENx10_BETWEEN(x, y) \ > + (GEN_VERSIONx10 >= x && GEN_VERSIONx10 <= y) > + > +/** > + * Emit a series of PIPE_CONTROL commands, taking into account any > + * workarounds necessary to actually accomplish the caller's request. 
> + * > + * Unless otherwise noted, spec quotations in this function come from: > * > - * "Every 4th PIPE_CONTROL command, not counting the PIPE_CONTROL with > - * only read-cache-invalidate bit(s) set, must have a CS_STALL bit set." > + * Synchronization of the 3D Pipeline > PIPE_CONTROL Command > Programming > + * Restrictions for PIPE_CONTROL. > * > - * Note that the kernel does CS stalls between batches, so we only need > - * to count them within a batch. > + * You should not use this function directly. Use the helpers in > + * brw_pipe_control.c instead, which may split the pipe control further. > */ > -static uint32_t > -gen7_cs_stall_every_four_pipe_controls(struct brw_context *brw, uint32_t > flags) > +void > +genX(emit_raw_pipe_control)(struct brw_context *brw, uint32_t flags, > +
[Mesa-dev] [PATCH 2/2] radv: don't copy buffer descriptors list for samplers
Sampler descriptors don't have a buffer list.

This fixes some crashes with new CTS
dEQP-VK.binding_model.descriptor_copy.*.sampler_*.

Cc: 18.3 19.0
Signed-off-by: Samuel Pitoiset
---
 src/amd/vulkan/radv_descriptor_set.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_descriptor_set.c b/src/amd/vulkan/radv_descriptor_set.c
index e6649305961..68171b5d244 100644
--- a/src/amd/vulkan/radv_descriptor_set.c
+++ b/src/amd/vulkan/radv_descriptor_set.c
@@ -969,7 +969,11 @@ void radv_update_descriptor_sets(
 			}
 			src_ptr += src_binding_layout->size / 4;
 			dst_ptr += dst_binding_layout->size / 4;
-			dst_buffer_list[j] = src_buffer_list[j];
+
+			if (src_binding_layout->type != VK_DESCRIPTOR_TYPE_SAMPLER) {
+				/* Sampler descriptors don't have a buffer list. */
+				dst_buffer_list[j] = src_buffer_list[j];
+			}
 		}
 	}
 }
-- 
2.20.1
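The guard added by this patch can be illustrated in isolation: the per-descriptor buffer entries are copied only for binding types that actually carry a buffer list. The sketch below is not radv code. The enum, the function name and the flat pointer arrays are hypothetical simplifications chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor types; only the sampler case matters here. */
enum demo_desc_type { DEMO_DESC_SAMPLER, DEMO_DESC_UNIFORM_BUFFER };

/*
 * Copy `count` buffer-list entries for one binding and return how many
 * entries were copied. Sampler bindings own no buffers, so their lists
 * are never read or written.
 */
static size_t
demo_copy_buffer_list(enum demo_desc_type type, void **dst,
                      void *const *src, size_t count)
{
   if (type == DEMO_DESC_SAMPLER)
      return 0; /* nothing to copy, and the lists may not even exist */

   assert(dst != NULL && src != NULL);
   for (size_t j = 0; j < count; j++)
      dst[j] = src[j];
   return count;
}
```

Checking the type before touching either list is what keeps the sampler case from reading past the end of a list that was never sized for it.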
[Mesa-dev] [PATCH 1/2] radv: fix out-of-bounds access when copying descriptors BO list
We shouldn't increment the buffer list pointers twice.

This fixes some crashes with new CTS
dEQP-VK.binding_model.descriptor_copy.*.

Cc: 18.3 19.0
Signed-off-by: Samuel Pitoiset
---
 src/amd/vulkan/radv_descriptor_set.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/amd/vulkan/radv_descriptor_set.c b/src/amd/vulkan/radv_descriptor_set.c
index e47ae6ad67a..e6649305961 100644
--- a/src/amd/vulkan/radv_descriptor_set.c
+++ b/src/amd/vulkan/radv_descriptor_set.c
@@ -970,8 +970,6 @@ void radv_update_descriptor_sets(
 			src_ptr += src_binding_layout->size / 4;
 			dst_ptr += dst_binding_layout->size / 4;
 			dst_buffer_list[j] = src_buffer_list[j];
-			++src_buffer_list;
-			++dst_buffer_list;
 		}
 	}
 }
-- 
2.20.1
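To see why the removed increments cause an out-of-bounds access, note that indexing a pointer with j while also advancing the pointer every iteration makes the effective offset grow by two per step. The sketch below models only that arithmetic. It is an illustration with hypothetical names, not radv code.

```c
#include <assert.h>
#include <stddef.h>

/* The pattern the patch keeps: index only, the base pointer stays put. */
static void
demo_copy_indexed(int *dst, const int *src, size_t count)
{
   assert(dst != NULL && src != NULL);
   for (size_t j = 0; j < count; j++)
      dst[j] = src[j];
}

/*
 * Highest element offset reached when the base is advanced each iteration
 * AND the access is indexed by j (the pattern the patch removes).
 */
static size_t
demo_max_offset_double_step(size_t count)
{
   size_t base = 0, max = 0;

   for (size_t j = 0; j < count; j++) {
      size_t offset = base + j; /* where list[j] lands after `base` bumps */
      if (offset > max)
         max = offset;
      base++; /* models ++src_buffer_list / ++dst_buffer_list */
   }
   return max;
}
```

For a list of four entries the double-stepped form reaches offset 6, which is well past the last valid offset of 3.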
Re: [Mesa-dev] [PATCH] nir: use nir_variable_create instead of open-coding the logic
Rb

On February 25, 2019 05:35:17 Tapani Pälli wrote:
> Fixes: 3d7611e9 "st/nir: use NIR for asm programs"
> Reported-by: Matthias Lorenz
> Signed-off-by: Tapani Pälli
> ---
>  src/mesa/program/prog_to_nir.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/program/prog_to_nir.c b/src/mesa/program/prog_to_nir.c
> index 1c9d0018d55..aa4f2aaf72a 100644
> --- a/src/mesa/program/prog_to_nir.c
> +++ b/src/mesa/program/prog_to_nir.c
> @@ -1012,13 +1012,11 @@ prog_to_nir(const struct gl_program *prog,
>     s = c->build.shader;
>
>     if (prog->Parameters->NumParameters > 0) {
> -      c->parameters = rzalloc(s, nir_variable);
> -      c->parameters->type =
> +      const struct glsl_type *type =
>           glsl_array_type(glsl_vec4_type(), prog->Parameters->NumParameters, 0);
> -      c->parameters->name = strdup(prog->Parameters->Parameters[0].Name);
> -      c->parameters->data.read_only = true;
> -      c->parameters->data.mode = nir_var_uniform;
> -      exec_list_push_tail(&s->uniforms, &c->parameters->node);
> +      c->parameters =
> +         nir_variable_create(s, nir_var_uniform, type,
> +                             prog->Parameters->Parameters[0].Name);
>     }
>
>     setup_registers_and_variables(c);
> --
> 2.20.1
[Mesa-dev] [Bug 109532] ir_variable has maximum access out of bounds -- but it's not out of bounds
https://bugs.freedesktop.org/show_bug.cgi?id=109532

--- Comment #41 from asimiklit ---
(In reply to andrii simiklit from comment #40)
> (In reply to Mark Janes from comment #39)
> > I just noticed that this new test passes for 32 bit builds.
> >
> > Does that surprise anyone else?
> >
> > http://mesa-ci.01.org/mesa_master_daily/builds/4806/group/
> > e60513df13ade427f01bb7334bd5174e
>
> Thanks for pointing that out.
> I didn't see any platform specific code there.
> I will post an update here as soon as I figure out the reason.

Unfortunately I can't reproduce this behavior locally.
I built the 32-bit debug mesa + 32-bit debug piglit and have the following
results:

asimiklit@asimiklit-pc:~/projects/piglit32$ bin/glslparsertest tests/spec/arb_shader_storage_buffer_object/compiler/unused-array-element.comp pass 4.50
ir_variable has maximum access out of bounds (1 vs 0)
Aborted (core dumped)

glslparsertest and mesa are definitely built for a 32-bit architecture:

asimiklit@asimiklit-pc:~/projects/piglit32$ file bin/glslparsertest
bin/glslparsertest: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=0243d96a1caf7c54189447516b066e5582ef86aa, with debug_info, not stripped

asimiklit@asimiklit-pc:~/projects/piglit32$ ldd bin/glslparsertest
libgbm.so.1 => /home/.../mesa_versions/1802_32_dbg/lib/libgbm.so.1 (0xf7a9d000)
libGL.so.1 => /home/.../mesa_versions/1802_32_dbg/lib/libGL.so.1 (0xf7a09000)
libEGL.so.1 => /home/.../mesa_versions/1802_32_dbg/lib/libEGL.so.1 (0xf7554000)
libglapi.so.0 => /home/.../mesa_versions/1802_32_dbg/lib/libglapi.so.0 (0xf7466000)
...

asimiklit@asimiklit-pc:~/projects/piglit32$ file /home/.../mesa_versions/1802_32_dbg/lib/libGL.so.1.2.0
/home/.../mesa_versions/1802_32_dbg/lib/libGL.so.1.2.0: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, BuildID[sha1]=8708ca7f174b267838c5b1ffe71ca68d5307f62d, with debug_info, not stripped

PS: I know that 1802 in the folder name isn't the correct name for the latest
mesa, I just forgot to update it in the prefix :-)

Mark, could you please clarify which mesa configuration is used with 32-bit
piglit, debug or release? I ask because I see the same behavior only with a
release mesa configuration.
Re: [Mesa-dev] [PATCH] docs: mention "Allow commits from members who can merge..."
On Monday, 2019-02-25 11:57:20 +0000, Emil Velikov wrote:
> From: Emil Velikov
>
> Mention the tick-box otherwise only the MR author can rebase the series.

I thought it was already mentioned... oops.
Reviewed-by: Eric Engestrom

>
> Cc: Jordan Justen
> Cc: Dylan Baker
> Cc: Erik Faye-Lund
> Cc: Eric Engestrom
> Signed-off-by: Emil Velikov
> ---
>  docs/submittingpatches.html | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/docs/submittingpatches.html b/docs/submittingpatches.html
> index 137e39c925d..65af32d4bba 100644
> --- a/docs/submittingpatches.html
> +++ b/docs/submittingpatches.html
> @@ -236,6 +236,11 @@ your email administrator for this.)
>  Other tag examples: gallium, util
>
>
> +
> +  Tick the following when creating the MR. It allows developers to
> +  rebase your work on top of master.
> +  Allow commits from members who can merge to the target branch
> +
>
>   If you revise your patches based on code review and push an update
>   to your branch, you should maintain a clean history
> --
> 2.20.1
>
Re: [Mesa-dev] [PATCH] docs: mention "Allow commits from members who can merge..."
Yeah, good move!

Reviewed-by: Erik Faye-Lund

On Mon, 2019-02-25 at 11:57 +0000, Emil Velikov wrote:
> From: Emil Velikov
>
> Mention the tick-box otherwise only the MR author can rebase the
> series.
>
> Cc: Jordan Justen
> Cc: Dylan Baker
> Cc: Erik Faye-Lund
> Cc: Eric Engestrom
> Signed-off-by: Emil Velikov
> ---
>  docs/submittingpatches.html | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/docs/submittingpatches.html
> b/docs/submittingpatches.html
> index 137e39c925d..65af32d4bba 100644
> --- a/docs/submittingpatches.html
> +++ b/docs/submittingpatches.html
> @@ -236,6 +236,11 @@ your email administrator for this.)
>  Other tag examples: gallium, util
>
>
> +
> +  Tick the following when creating the MR. It allows developers to
> +  rebase your work on top of master.
> +  Allow commits from members who can merge to the target
> branch
> +
>
>   If you revise your patches based on code review and push an update
>   to your branch, you should maintain a clean
> history
[Mesa-dev] [PATCH] docs: mention "Allow commits from members who can merge..."
From: Emil Velikov

Mention the tick-box otherwise only the MR author can rebase the series.

Cc: Jordan Justen
Cc: Dylan Baker
Cc: Erik Faye-Lund
Cc: Eric Engestrom
Signed-off-by: Emil Velikov
---
 docs/submittingpatches.html | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/docs/submittingpatches.html b/docs/submittingpatches.html
index 137e39c925d..65af32d4bba 100644
--- a/docs/submittingpatches.html
+++ b/docs/submittingpatches.html
@@ -236,6 +236,11 @@ your email administrator for this.)
 Other tag examples: gallium, util
 
 
+
+  Tick the following when creating the MR. It allows developers to
+  rebase your work on top of master.
+  Allow commits from members who can merge to the target branch
+
 
   If you revise your patches based on code review and push an update
   to your branch, you should maintain a clean history
-- 
2.20.1
[Mesa-dev] [PATCH] nir: use nir_variable_create instead of open-coding the logic
Fixes: 3d7611e9 "st/nir: use NIR for asm programs" Reported-by: Matthias Lorenz Signed-off-by: Tapani Pälli --- src/mesa/program/prog_to_nir.c | 10 -- 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/src/mesa/program/prog_to_nir.c b/src/mesa/program/prog_to_nir.c index 1c9d0018d55..aa4f2aaf72a 100644 --- a/src/mesa/program/prog_to_nir.c +++ b/src/mesa/program/prog_to_nir.c @@ -1012,13 +1012,11 @@ prog_to_nir(const struct gl_program *prog, s = c->build.shader; if (prog->Parameters->NumParameters > 0) { - c->parameters = rzalloc(s, nir_variable); - c->parameters->type = + const struct glsl_type *type = glsl_array_type(glsl_vec4_type(), prog->Parameters->NumParameters, 0); - c->parameters->name = strdup(prog->Parameters->Parameters[0].Name); - c->parameters->data.read_only = true; - c->parameters->data.mode = nir_var_uniform; - exec_list_push_tail(&s->uniforms, &c->parameters->node); + c->parameters = + nir_variable_create(s, nir_var_uniform, type, + prog->Parameters->Parameters[0].Name); } setup_registers_and_variables(c); -- 2.20.1
[Mesa-dev] [PATCH] radv: fix clearing attachments in secondary command buffers
If no framebuffer is bound, get the number of samples and the image format from the render pass. This fixes new CTS dEQP-VK.geometry.layered.*.secondary_cmd_buffer. Cc: 18.3 19.0 Signed-off-by: Samuel Pitoiset --- src/amd/vulkan/radv_meta_clear.c | 53 ++-- 1 file changed, 43 insertions(+), 10 deletions(-) diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c index 4f557092838..dc761ca17b2 100644 --- a/src/amd/vulkan/radv_meta_clear.c +++ b/src/amd/vulkan/radv_meta_clear.c @@ -370,14 +370,29 @@ emit_color_clear(struct radv_cmd_buffer *cmd_buffer, const struct radv_framebuffer *fb = cmd_buffer->state.framebuffer; const uint32_t subpass_att = clear_att->colorAttachment; const uint32_t pass_att = subpass->color_attachments[subpass_att].attachment; - const struct radv_image_view *iview = fb->attachments[pass_att].attachment; - const uint32_t samples = iview->image->info.samples; - const uint32_t samples_log2 = ffs(samples) - 1; - unsigned fs_key = radv_format_meta_fs_key(iview->vk_format); + const struct radv_image_view *iview = fb ? fb->attachments[pass_att].attachment : NULL; + uint32_t samples, samples_log2; + VkFormat format; + unsigned fs_key; VkClearColorValue clear_value = clear_att->clearValue.color; VkCommandBuffer cmd_buffer_h = radv_cmd_buffer_to_handle(cmd_buffer); VkPipeline pipeline; + /* When a framebuffer is bound to the current command buffer, get the +* number of samples from it. Otherwise, get the number of samples from +* the render pass because it's likely a secondary command buffer. 
+*/ + if (iview) { + samples = iview->image->info.samples; + format = iview->vk_format; + } else { + samples = cmd_buffer->state.pass->attachments[pass_att].samples; + format = cmd_buffer->state.pass->attachments[pass_att].format; + } + + samples_log2 = ffs(samples) - 1; + fs_key = radv_format_meta_fs_key(format); + if (fs_key == -1) { radv_finishme("color clears incomplete"); return; @@ -617,6 +632,9 @@ static bool depth_view_can_fast_clear(struct radv_cmd_buffer *cmd_buffer, const VkClearRect *clear_rect, VkClearDepthStencilValue clear_value) { + if (!iview) + return false; + uint32_t queue_mask = radv_image_queue_family_mask(iview->image, cmd_buffer->queue_family_index, cmd_buffer->queue_family_index); @@ -705,11 +723,22 @@ emit_depthstencil_clear(struct radv_cmd_buffer *cmd_buffer, const uint32_t pass_att = subpass->depth_stencil_attachment->attachment; VkClearDepthStencilValue clear_value = clear_att->clearValue.depthStencil; VkImageAspectFlags aspects = clear_att->aspectMask; - const struct radv_image_view *iview = fb->attachments[pass_att].attachment; - const uint32_t samples = iview->image->info.samples; - const uint32_t samples_log2 = ffs(samples) - 1; + const struct radv_image_view *iview = fb ? fb->attachments[pass_att].attachment : NULL; + uint32_t samples, samples_log2; VkCommandBuffer cmd_buffer_h = radv_cmd_buffer_to_handle(cmd_buffer); + /* When a framebuffer is bound to the current command buffer, get the +* number of samples from it. Otherwise, get the number of samples from +* the render pass because it's likely a secondary command buffer. 
+*/ + if (iview) { + samples = iview->image->info.samples; + } else { + samples = cmd_buffer->state.pass->attachments[pass_att].samples; + } + + samples_log2 = ffs(samples) - 1; + assert(pass_att != VK_ATTACHMENT_UNUSED); if (!(aspects & VK_IMAGE_ASPECT_DEPTH_BIT)) @@ -915,7 +944,11 @@ static bool radv_image_view_can_fast_clear(struct radv_device *device, const struct radv_image_view *iview) { - struct radv_image *image = iview->image; + struct radv_image *image; + + if (!iview) + return false; + image = iview->image; /* Only fast clear if the image itself can be fast cleared. */ if (!radv_image_can_fast_clear(device, image)) @@ -1528,7 +1561,7 @@ emit_clear(struct radv_cmd_buffer *cmd_buffer, return; VkImageLayout image_layout = subpass->color_attachments[subpass_att].layout; - const struct radv_image_view *iview = fb->attachments[pass_att].attachment; + const struct radv_image_view *iview = fb ? fb->attachments[pass_att].attachment : NULL; VkClearColorValue clear_value =
Re: [Mesa-dev] [PATCH v6] etnaviv: fix resource usage tracking across different pipe_context's
On Sat, 23 Feb 2019 16:15:19 +0100 Christian Gmeiner wrote: > A pipe_resource can be shared by all the pipe_context's hanging off the > same pipe_screen. > > Changes from v2 -> v3: > - add locking with mtx_*() to resource and screen (Marek) > Changes from v3 -> v4: > - drop rsc->lock, just use screen->lock for the entire serialization (Marek) > - simplify etna_resource_used() flush condition, which also prevents >potentially flushing resources twice (Marek) > - don't remove resources from screen->used_resources in >etna_cmd_stream_reset_notify(), they may still be used in other >contexts and may need flushing there later on (Marek) > Changes from v4 -> v5: > - Fix coding style issues reported by Guido > Changes from v5 -> v6: > - Add missing locking in etna_transfer_map(..) (Boris) > > Signed-off-by: Christian Gmeiner > Signed-off-by: Marek Vasut > Signed-off-by: Boris Brezillon Reviewed-by: Boris Brezillon Tested-by: Boris Brezillon This being said, I'm still unsure all races are fixed with this patch (see the part about RS-based tiling in my reply to v5). 
> Tested-by: Marek Vasut > --- > src/gallium/drivers/etnaviv/etnaviv_context.c | 26 +- > src/gallium/drivers/etnaviv/etnaviv_context.h | 3 -- > .../drivers/etnaviv/etnaviv_resource.c| 52 +++ > .../drivers/etnaviv/etnaviv_resource.h| 8 +-- > src/gallium/drivers/etnaviv/etnaviv_screen.c | 12 + > src/gallium/drivers/etnaviv/etnaviv_screen.h | 6 +++ > .../drivers/etnaviv/etnaviv_transfer.c| 5 ++ > 7 files changed, 83 insertions(+), 29 deletions(-) > > diff --git a/src/gallium/drivers/etnaviv/etnaviv_context.c > b/src/gallium/drivers/etnaviv/etnaviv_context.c > index 44b50925a4f..83a703f7cc2 100644 > --- a/src/gallium/drivers/etnaviv/etnaviv_context.c > +++ b/src/gallium/drivers/etnaviv/etnaviv_context.c > @@ -36,6 +36,7 @@ > #include "etnaviv_query.h" > #include "etnaviv_query_hw.h" > #include "etnaviv_rasterizer.h" > +#include "etnaviv_resource.h" > #include "etnaviv_screen.h" > #include "etnaviv_shader.h" > #include "etnaviv_state.h" > @@ -329,7 +330,8 @@ static void > etna_cmd_stream_reset_notify(struct etna_cmd_stream *stream, void *priv) > { > struct etna_context *ctx = priv; > - struct etna_resource *rsc, *rsc_tmp; > + struct etna_screen *screen = ctx->screen; > + struct set_entry *entry; > > etna_set_state(stream, VIVS_GL_API_MODE, VIVS_GL_API_MODE_OPENGL); > etna_set_state(stream, VIVS_GL_VERTEX_ELEMENT_CONFIG, 0x0001); > @@ -384,16 +386,18 @@ etna_cmd_stream_reset_notify(struct etna_cmd_stream > *stream, void *priv) > ctx->dirty = ~0L; > ctx->dirty_sampler_views = ~0L; > > - /* go through all the used resources and clear their status flag */ > - LIST_FOR_EACH_ENTRY_SAFE(rsc, rsc_tmp, >used_resources, list) > - { > - debug_assert(rsc->status != 0); > - rsc->status = 0; > - rsc->pending_ctx = NULL; > - list_delinit(>list); > - } > + /* > +* Go through all _resources_ associated with this _screen_, pending > +* in this _context_ and mark them as not pending in this _context_ > +* anymore, since they were just flushed. 
> +*/ > + mtx_lock(>lock); > + set_foreach(screen->used_resources, entry) { > + struct etna_resource *rsc = (struct etna_resource *)entry->key; > > - assert(LIST_IS_EMPTY(>used_resources)); > + _mesa_set_remove_key(rsc->pending_ctx, ctx); > + } > + mtx_unlock(>lock); > } > > static void > @@ -437,8 +441,6 @@ etna_context_create(struct pipe_screen *pscreen, void > *priv, unsigned flags) > /* need some sane default in case state tracker doesn't set some state: */ > ctx->sample_mask = 0x; > > - list_inithead(>used_resources); > - > /* Set sensible defaults for state */ > etna_cmd_stream_reset_notify(ctx->stream, ctx); > > diff --git a/src/gallium/drivers/etnaviv/etnaviv_context.h > b/src/gallium/drivers/etnaviv/etnaviv_context.h > index 6ad9f3431e1..50a2cdf3d07 100644 > --- a/src/gallium/drivers/etnaviv/etnaviv_context.h > +++ b/src/gallium/drivers/etnaviv/etnaviv_context.h > @@ -136,9 +136,6 @@ struct etna_context { > uint32_t prim_hwsupport; > struct primconvert_context *primconvert; > > - /* list of resources used by currently-unsubmitted renders */ > - struct list_head used_resources; > - > struct slab_child_pool transfer_pool; > struct blitter_context *blitter; > > diff --git a/src/gallium/drivers/etnaviv/etnaviv_resource.c > b/src/gallium/drivers/etnaviv/etnaviv_resource.c > index db5ead4d0ba..1e8fa714060 100644 > --- a/src/gallium/drivers/etnaviv/etnaviv_resource.c > +++ b/src/gallium/drivers/etnaviv/etnaviv_resource.c > @@ -33,6 +33,7 @@ > #include "etnaviv_screen.h" > #include "etnaviv_translate.h" > > +#include "util/hash_table.h" > #include "util/u_inlines.h" > #include "util/u_memory.h" > > @@ -280,7 +281,6 @@ etna_resource_alloc(struct pipe_screen *pscreen,
Re: [Mesa-dev] [PATCH 1/1] Avoid leaking parameter name in prog_to_nir.
This is correct. However, it's probably better to just replace most of that code with a call to nir_variable_create, which does everything in that block except set the data.read_only bit, including the strdup (done properly) and adding the variable to the list. --Jason On Mon, Feb 25, 2019 at 3:06 AM Tapani Pälli wrote: > Yep, confirmed that this plugs the leak. > > FWIW there seems to be also "Conditional jump or move depends on > uninitialised value(s)" from valgrind but that is for something different. > > Reviewed-by: Tapani Pälli > > On 2/21/19 11:09 AM, Matthias Lorenz wrote: > > Fixes: 3d7611e9 "st/nir: use NIR for asm programs" > > --- > > src/mesa/program/prog_to_nir.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/src/mesa/program/prog_to_nir.c > b/src/mesa/program/prog_to_nir.c > > index 312b299361e..6e3fa9432a3 100644 > > --- a/src/mesa/program/prog_to_nir.c > > +++ b/src/mesa/program/prog_to_nir.c > > @@ -1024,7 +1024,8 @@ prog_to_nir(const struct gl_program *prog, > > c->parameters = rzalloc(s, nir_variable); > > c->parameters->type = > >glsl_array_type(glsl_vec4_type(), > prog->Parameters->NumParameters, 0); > > - c->parameters->name = > strdup(prog->Parameters->Parameters[0].Name); > > + c->parameters->name = > > + ralloc_strdup(c->parameters, > prog->Parameters->Parameters[0].Name); > > c->parameters->data.read_only = true; > > c->parameters->data.mode = nir_var_uniform; > > exec_list_push_tail(&s->uniforms, &c->parameters->node); > > -- > > 2.20.1
Re: [Mesa-dev] [PATCH 1/1] Avoid leaking parameter name in prog_to_nir.
Yep, confirmed that this plugs the leak. FWIW there seems to be also "Conditional jump or move depends on uninitialised value(s)" from valgrind but that is for something different. Reviewed-by: Tapani Pälli On 2/21/19 11:09 AM, Matthias Lorenz wrote: Fixes: 3d7611e9 "st/nir: use NIR for asm programs" --- src/mesa/program/prog_to_nir.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/program/prog_to_nir.c b/src/mesa/program/prog_to_nir.c index 312b299361e..6e3fa9432a3 100644 --- a/src/mesa/program/prog_to_nir.c +++ b/src/mesa/program/prog_to_nir.c @@ -1024,7 +1024,8 @@ prog_to_nir(const struct gl_program *prog, c->parameters = rzalloc(s, nir_variable); c->parameters->type = glsl_array_type(glsl_vec4_type(), prog->Parameters->NumParameters, 0); - c->parameters->name = strdup(prog->Parameters->Parameters[0].Name); + c->parameters->name = + ralloc_strdup(c->parameters, prog->Parameters->Parameters[0].Name); c->parameters->data.read_only = true; c->parameters->data.mode = nir_var_uniform; exec_list_push_tail(&s->uniforms, &c->parameters->node); -- 2.20.1
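The leak discussed in this thread comes from mixing two ownership models: a plain strdup() result is not tracked by any parent context, so freeing the parent does not release it. The following is a minimal sketch of a parent-owned allocator that illustrates the idea. It is a toy, not Mesa's ralloc: the names struct ctx, ctx_alloc, ctx_strdup and ctx_free_all are hypothetical, and the payload is only assumed to need pointer alignment, which is enough for strings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy parent-owned allocator (hypothetical, NOT Mesa's ralloc).
 * Every allocation is linked into its owning context, so releasing the
 * context releases everything that was allocated against it. */
struct owned {
   struct owned *next;   /* next allocation owned by the same context */
};

struct ctx {
   struct owned *head;   /* singly linked list of live allocations */
   unsigned live;        /* number of live allocations, for illustration */
};

/* Allocate size bytes owned by c; the header sits just before the payload. */
static void *
ctx_alloc(struct ctx *c, size_t size)
{
   struct owned *o = malloc(sizeof(*o) + size);
   if (!o)
      return NULL;
   o->next = c->head;
   c->head = o;
   c->live++;
   return o + 1;
}

/* Duplicate a string into memory owned by c. Unlike strdup(), the copy
 * needs no separate free(); it goes away with the context. */
static char *
ctx_strdup(struct ctx *c, const char *s)
{
   size_t n = strlen(s) + 1;
   char *copy = ctx_alloc(c, n);
   if (copy)
      memcpy(copy, s, n);
   return copy;
}

/* Release every allocation owned by c. */
static void
ctx_free_all(struct ctx *c)
{
   while (c->head) {
      struct owned *next = c->head->next;
      free(c->head);
      c->head = next;
      c->live--;
   }
}
```

With this model, a name created through ctx_strdup() is released by ctx_free_all() together with everything else in the context, whereas a name created with strdup() would survive it and leak unless freed by hand.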
[Mesa-dev] [PATCH] gallium/util: Fix off-by-one in box intersection
From: Daniel Stone pipe_boxes are x/y + width/height, rather than x0/y0 -> x1/y1. This means that (x+width) is not included in the box. The box intersection check was seemingly written for inclusive regions, and would falsely assert that adjacent boxes would overlap. Fix the off-by-one by being one pixel less greedy. Signed-off-by: Daniel Stone Signed-off-by: Boris Brezillon --- src/gallium/auxiliary/util/u_box.h | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/src/gallium/auxiliary/util/u_box.h b/src/gallium/auxiliary/util/u_box.h index b3f478e7bfc4..ead7189ecaf8 100644 --- a/src/gallium/auxiliary/util/u_box.h +++ b/src/gallium/auxiliary/util/u_box.h @@ -161,15 +161,15 @@ u_box_test_intersection_2d(const struct pipe_box *a, unsigned i; int a_l[2], a_r[2], b_l[2], b_r[2]; - a_l[0] = MIN2(a->x, a->x + a->width); - a_r[0] = MAX2(a->x, a->x + a->width); - a_l[1] = MIN2(a->y, a->y + a->height); - a_r[1] = MAX2(a->y, a->y + a->height); + a_l[0] = MIN2(a->x, a->x + a->width - 1); + a_r[0] = MAX2(a->x, a->x + a->width - 1); + a_l[1] = MIN2(a->y, a->y + a->height - 1); + a_r[1] = MAX2(a->y, a->y + a->height - 1); - b_l[0] = MIN2(b->x, b->x + b->width); - b_r[0] = MAX2(b->x, b->x + b->width); - b_l[1] = MIN2(b->y, b->y + b->height); - b_r[1] = MAX2(b->y, b->y + b->height); + b_l[0] = MIN2(b->x, b->x + b->width - 1); + b_r[0] = MAX2(b->x, b->x + b->width - 1); + b_l[1] = MIN2(b->y, b->y + b->height - 1); + b_r[1] = MAX2(b->y, b->y + b->height - 1); for (i = 0; i < 2; ++i) { if (a_l[i] > b_r[i] || a_r[i] < b_l[i]) -- 2.20.1
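The off-by-one described above can be illustrated with a small standalone sketch. It treats each box as a half-open range [x, x + width) per axis, which is an equivalent way to express "x + width is not included" without subtracting one. struct box2d, span_overlaps and box2d_intersects are hypothetical names, not the u_box.h API, and the sketch assumes strictly positive width and height, whereas the patched code also orders the bounds with MIN2/MAX2.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the 2D part of a pipe_box: origin plus size. */
struct box2d {
   int x, y;
   int width, height;
};

/* Half-open overlap test on one axis: [a0, a0 + alen) vs [b0, b0 + blen).
 * Two ranges overlap only if each starts strictly before the other ends,
 * so ranges that merely touch do not count as overlapping. */
static bool
span_overlaps(int a0, int alen, int b0, int blen)
{
   assert(alen > 0 && blen > 0);   /* assumption of this sketch */
   return a0 < b0 + blen && b0 < a0 + alen;
}

/* Two boxes intersect only if they overlap on both axes. */
static bool
box2d_intersects(const struct box2d *a, const struct box2d *b)
{
   return span_overlaps(a->x, a->width, b->x, b->width) &&
          span_overlaps(a->y, a->height, b->y, b->height);
}
```

For example, a box at x = 0 with width 4 covers pixels 0 to 3, so a second box starting at x = 4 is adjacent and box2d_intersects() reports no intersection, while a box starting at x = 3 shares pixel 3 and does intersect.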
[Mesa-dev] [Bug 109532] ir_variable has maximum access out of bounds -- but it's not out of bounds
https://bugs.freedesktop.org/show_bug.cgi?id=109532 --- Comment #40 from andrii simiklit --- (In reply to Mark Janes from comment #39) > I just noticed that this new test passes for 32 bit builds. > > Does that surprise anyone else? > > http://mesa-ci.01.org/mesa_master_daily/builds/4806/group/ > e60513df13ade427f01bb7334bd5174e Thanks for pointing that out. I didn't see any platform-specific code there. I will post an update here as soon as I figure out the reason.