[Mesa-dev] [Bug 106905] Account request
https://bugs.freedesktop.org/show_bug.cgi?id=106905

Bug ID: 106905
Summary: Account request
Product: Mesa
Version: git
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: Other
Assignee: mesa-dev@lists.freedesktop.org
Reporter: gw.foss...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 140145 --> https://bugs.freedesktop.org/attachment.cgi?id=140145&action=edit
SSH key

Name: Gert Wollny
Email: gw.foss...@gmail.com
Username: gerddie

For now I plan to continue contributing to the GLSL->TGSI layer, r600, and virgl.

Many thanks

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 106905] Account request
https://bugs.freedesktop.org/show_bug.cgi?id=106905

--- Comment #1 from Gert Wollny ---
Created attachment 140146 --> https://bugs.freedesktop.org/attachment.cgi?id=140146&action=edit
GPG key
[Mesa-dev] [Bug 106905] Account request
https://bugs.freedesktop.org/show_bug.cgi?id=106905

--- Comment #2 from Gert Wollny ---
BTW: I've added these keys already to my gitlab.fdo account.
[Mesa-dev] [Bug 106903] radv: Fragment shader output goes to wrong attachments when render targets are sparse
https://bugs.freedesktop.org/show_bug.cgi?id=106903

--- Comment #1 from Samuel Pitoiset ---
Well, AMDVLK hangs on Polaris/Vega here. (I recompiled that fragment shader manually).
[Mesa-dev] [Bug 106877] The game Rise of the Tomb Raider lead to GPU hang when I try in same place jump into the hole.
https://bugs.freedesktop.org/show_bug.cgi?id=106877

--- Comment #6 from Samuel Pitoiset ---
I can reproduce the hang as well. This seems to only affect Vega and LLVM 6 (latest LLVM trunk fixes the GPU hang on my side). I have no idea what changed.
[Mesa-dev] [Bug 106897] Ubuntu 16.04. Mesa can't be built with specified configurations
https://bugs.freedesktop.org/show_bug.cgi?id=106897

Sergii Romantsov changed:

What       |Removed  |Added
Resolution |---      |NOTABUG
Status     |REOPENED |RESOLVED

--- Comment #6 from Sergii Romantsov ---
Sorry, I didn't realise at first what you meant by "core Wayland repository". If that means the user has to build Wayland manually or upgrade the system to 18.10 because of just one missing header, then it seems this is not an issue...
[Mesa-dev] [Bug 106906] Failed to recongnize keyword “sampler2DRect” and "sampler2DRectShadow"
https://bugs.freedesktop.org/show_bug.cgi?id=106906

Bug ID: 106906
Summary: Failed to recongnize keyword “sampler2DRect” and "sampler2DRectShadow"
Product: Mesa
Version: 17.1
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: glsl-compiler
Assignee: mesa-dev@lists.freedesktop.org
Reporter: zhaowei.y...@samsung.com
QA Contact: intel-3d-b...@lists.freedesktop.org
[Mesa-dev] [Bug 106906] Failed to recongnize keyword “sampler2DRect” and "sampler2DRectShadow"
https://bugs.freedesktop.org/show_bug.cgi?id=106906

--- Comment #1 from Zhaowei Yuan ---
The CTS cases "dEQP-GLES2.functional.shaders.keywords.reserved_keywords.sampler2DRectShadow_vertex" and "dEQP-GLES2.functional.shaders.keywords.reserved_keywords.sampler2DRectShadow_fragment" check whether the shader compiler recognizes the reserved keywords "sampler2DRect" and "sampler2DRectShadow".

The GLSL ES 1.0.17 spec says they are keywords reserved for future use, and that using them results in an error.

I've fixed the problem with the following modification:

-sampler2DRect DEPRECATED_ES_TYPE_WITH_ALT(yyextra->ARB_texture_rectangle_enable, glsl_type::sampler2DRect_type);
+sampler2DRect TYPE_WITH_ALT(110, 100, 0, 0, yyextra->ARB_texture_rectangle_enable, glsl_type::sampler2DRect_type);
 sampler3DRect KEYWORD(110, 100, 0, 0, SAMPLER3DRECT);
-sampler2DRectShadow DEPRECATED_ES_TYPE_WITH_ALT(yyextra->ARB_texture_rectangle_enable, glsl_type::sampler2DRectShadow_type);
+sampler2DRectShadow TYPE_WITH_ALT(110, 100, 0, 0, yyextra->ARB_texture_rectangle_enable, glsl_type::sampler2DRectShadow_type);
[Mesa-dev] [PATCH] glsl: Take sampler2DRect and sampler2DRectShadow as reserved
"sampler2DRect" and "sampler2DRectShadow" are specified as reserved from GLSL 1.1 and GLSL ES 1.0

Signed-off-by: zhaowei yuan
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106906
---
 src/compiler/glsl/glsl_lexer.ll | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/glsl_lexer.ll b/src/compiler/glsl/glsl_lexer.ll
index de6dc64..87b64e0 100644
--- a/src/compiler/glsl/glsl_lexer.ll
+++ b/src/compiler/glsl/glsl_lexer.ll
@@ -627,9 +627,9 @@ dmat4x4 TYPE_WITH_ALT(110, 100, 400, 0, yyextra->ARB_gpu_shader_fp64_enable, gl
 fvec2 KEYWORD(110, 100, 0, 0, FVEC2);
 fvec3 KEYWORD(110, 100, 0, 0, FVEC3);
 fvec4 KEYWORD(110, 100, 0, 0, FVEC4);
-sampler2DRect DEPRECATED_ES_TYPE_WITH_ALT(yyextra->ARB_texture_rectangle_enable, glsl_type::sampler2DRect_type);
+sampler2DRect TYPE_WITH_ALT(110, 100, 0, 0, yyextra->ARB_texture_rectangle_enable, glsl_type::sampler2DRect_type);
 sampler3DRect KEYWORD(110, 100, 0, 0, SAMPLER3DRECT);
-sampler2DRectShadow DEPRECATED_ES_TYPE_WITH_ALT(yyextra->ARB_texture_rectangle_enable, glsl_type::sampler2DRectShadow_type);
+sampler2DRectShadow TYPE_WITH_ALT(110, 100, 0, 0, yyextra->ARB_texture_rectangle_enable, glsl_type::sampler2DRectShadow_type);
 sizeof KEYWORD(110, 100, 0, 0, SIZEOF);
 cast KEYWORD(110, 100, 0, 0, CAST);
 namespace KEYWORD(110, 100, 0, 0, NAMESPACE);
--
1.9.1
[Mesa-dev] [Bug 106906] Failed to recongnize keyword “sampler2DRect” and "sampler2DRectShadow"
https://bugs.freedesktop.org/show_bug.cgi?id=106906

--- Comment #2 from Zhaowei Yuan ---
The patch is posted here: https://patchwork.freedesktop.org/patch/229229/
Re: [Mesa-dev] [PATCH v4 075/129] nir: convert lower_samplers_as_deref to deref instructions
On Tuesday, June 12, 2018 5:54:31 PM PDT Rob Clark wrote:
> On Tue, Jun 12, 2018 at 6:34 PM, Kenneth Graunke wrote:
> > On Thursday, May 31, 2018 10:04:05 PM PDT Jason Ekstrand wrote:
> >> From: Rob Clark
> >>
> >> This also removes the legacy version of lower_samplers.
> >
> > It does not, that's what patch 76 (the next one) does.
> >
>
> (for lack of good way of viewing full patchset atm, I'll take your
> word for that, but that said) maybe just add the words "need for" into
> that sentence, rather than squashing this and the following patch
> together, to reduce the noise in the patch history..
>
> BR,
> -R

Yeah, definitely, I'd just change the message, not the patch.
Re: [Mesa-dev] [PATCH] intel/compiler: Properly consider UBO loads that cross 32B boundaries.
On Tuesday, June 12, 2018 1:38:03 PM PDT Rafael Antognolli wrote: > On Mon, Jun 11, 2018 at 02:01:49PM -0700, Kenneth Graunke wrote: > > The UBO push analysis pass incorrectly assumed that all values would fit > > within a 32B chunk, and only recorded a bit for the 32B chunk containing > > the starting offset. > > > > For example, if a UBO contained the following, tightly packed: > > > >vec4 a; // [0, 16) > >float b; // [16, 20) > >vec4 c; // [20, 36) > > > > then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1, > > which means that we ought to record two 32B chunks in the bitfield. > > > > Similarly, dvec4s would suffer from the same problem. > > --- > > src/intel/compiler/brw_nir_analyze_ubo_ranges.c | 8 +++- > > 1 file changed, 7 insertions(+), 1 deletion(-) > > > > diff --git a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c > > b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c > > index d58fe3dd2e3..6d6ccf73ade 100644 > > --- a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c > > +++ b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c > > @@ -141,10 +141,16 @@ analyze_ubos_block(struct ubo_analysis_state *state, > > nir_block *block) > > if (offset >= 64) > > continue; > > > > + /* The value might span multiple 32-byte chunks. */ > > + const int bytes = nir_intrinsic_dest_components(intrin) * > > + (nir_dest_bit_size(intrin->dest) / 8); > > + const int end = DIV_ROUND_UP(offset_const->u32[0] + bytes, 32); > > + const int regs = end - offset + 1; > > + > > But if I understood it correctly, offset is the first 32B chunk within > the UBO block (it's actually an ubo "chunk offset"). And you calculate > bytes by taking the number of components times the size of each > component of the nir_intrinsic_load_ubo instruction (which apparently > supports multiple components). So yeah, this makes sense to me. Yeah, that's exactly right. load_ubo can load up to 4 components. 
> Take this review with a grain of salt (assuming what I wrote above is > correct), but this looks simple enough. So it is > > Reviewed-by: Rafael Antognolli Thanks!
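The chunk arithmetic described in the commit message can be checked with a small standalone sketch. The helper names (first_chunk, chunks_spanned, chunk_mask) are hypothetical and do not appear in brw_nir_analyze_ubo_ranges.c; the sketch only assumes a 32-byte chunk size and a half-open byte range [offset, offset + size), and it is not necessarily the exact computation the patch performs.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_BYTES 32u

/* Index of the 32B chunk that contains the first byte of the load. */
static inline unsigned first_chunk(unsigned offset_bytes)
{
    return offset_bytes / CHUNK_BYTES;
}

/* Number of 32B chunks touched by the half-open byte range
 * [offset_bytes, offset_bytes + size_bytes). */
static inline unsigned chunks_spanned(unsigned offset_bytes, unsigned size_bytes)
{
    assert(size_bytes > 0);
    /* Exclusive end chunk: round the end byte offset up to a chunk boundary. */
    unsigned end_chunk = (offset_bytes + size_bytes + CHUNK_BYTES - 1) / CHUNK_BYTES;
    return end_chunk - first_chunk(offset_bytes);
}

/* Bitfield with one bit per touched chunk. Only the first 64 chunks are
 * tracked, mirroring the "offset >= 64" early-out quoted in the patch. */
static inline uint64_t chunk_mask(unsigned offset_bytes, unsigned size_bytes)
{
    unsigned start = first_chunk(offset_bytes);
    unsigned count = chunks_spanned(offset_bytes, size_bytes);
    uint64_t mask = 0;
    for (unsigned i = start; i < start + count && i < 64; i++)
        mask |= UINT64_C(1) << i;
    return mask;
}
```

With the layout from the commit message, "vec4 c" occupies [20, 36), so chunks_spanned(20, 16) is 2 and chunk_mask(20, 16) sets bits 0 and 1, whereas "float b" at [16, 20) stays within chunk 0.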
[Mesa-dev] [Bug 106907] Correct Transform Feedback Varyings information is expected after using ProgramBinary
https://bugs.freedesktop.org/show_bug.cgi?id=106907

Bug ID: 106907
Summary: Correct Transform Feedback Varyings information is expected after using ProgramBinary
Product: Mesa
Version: git
Hardware: Other
OS: Linux (All)
Status: NEW
Severity: normal
Priority: medium
Component: Mesa core
Assignee: mesa-dev@lists.freedesktop.org
Reporter: xinghua@intel.com
QA Contact: mesa-dev@lists.freedesktop.org

Steps:
1. Download Chrome and install it on Ubuntu: https://www.google.com/chrome/?platform=linux&extra=devchannel
2. Open https://www.khronos.org/registry/webgl/sdk/tests/conformance2/transform_feedback/transform_feedback.html?webglVersion=2&quiet=0
3. The first time the link is opened, all cases pass. Click Chrome's refresh button, and some cases fail.

Notes:
1. I could only reproduce it on Mesa git master; I could not reproduce it with the system driver (Ubuntu 17.10). Did our latest code introduce a regression?
2. The case may be related to https://bugs.freedesktop.org/show_bug.cgi?id=106810
3. Chrome caches the program binary. The second time the page runs, Chrome calls glProgramBinary to avoid re-compiling the shaders and re-linking the program.
4. The failing cases use glGetProgramiv to get the number of transform feedback varyings, and getTransformFeedbackVarying to get each varying's size, type and name. But the current program seems to have no transform feedback varyings information.
5. I checked the Mesa code. For example, glProgramBinary triggers the read_xfb function to deserialize the binary, which creates a gl_transform_feedback_info object. That object has a member named "NumVarying"; if the program binary has two transform feedback varyings, the "NumVarying" value is 2. glGetProgramiv in shaderapi.c then gets the number of transform feedback varyings from "NumVarying", which is a member of TransformFeedback of gl_shader_program. I found that the glProgramBinary implementation in Mesa did not copy the number of transform feedback varyings from the gl_transform_feedback_info object to the TransformFeedback object.
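The mismatch described in note 5 can be modelled with a toy sketch. Everything below is hypothetical: the structs are heavily simplified stand-ins that borrow only the member name "NumVarying" from the report, and load_binary, query_num_varyings and sync_xfb_count are illustrative names, not Mesa functions. The sync step is one possible shape of a fix, not necessarily what Mesa does.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for gl_transform_feedback_info (assumption). */
struct xfb_info {
    unsigned NumVarying;
};

/* Simplified stand-in for gl_shader_program (assumption). */
struct program {
    struct {
        unsigned NumVarying;   /* the count the query path reads */
    } TransformFeedback;
    struct xfb_info *xfb;      /* filled in when a cached binary is loaded */
};

/* Models loading a cached binary: only the xfb_info side is populated. */
static inline void load_binary(struct program *prog, struct xfb_info *info,
                               unsigned num_varyings)
{
    info->NumVarying = num_varyings;
    prog->xfb = info;
}

/* Models the query: it reads the program-level count only. */
static inline unsigned query_num_varyings(const struct program *prog)
{
    return prog->TransformFeedback.NumVarying;
}

/* One possible fix: copy the deserialized count to where the query looks. */
static inline void sync_xfb_count(struct program *prog)
{
    if (prog->xfb != NULL)
        prog->TransformFeedback.NumVarying = prog->xfb->NumVarying;
}
```

In this model, a query issued right after load_binary still sees the stale program-level count, and only reports the two varyings once sync_xfb_count has run.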
[Mesa-dev] [Bug 106907] Correct Transform Feedback Varyings information is expected after using ProgramBinary
https://bugs.freedesktop.org/show_bug.cgi?id=106907

xinghua changed:

What |Removed |Added
CC   |        |jljus...@gmail.com, lem...@gmail.com, yang...@intel.com, yunchao...@intel.com
Re: [Mesa-dev] [PATCH] configure.ac/meson.build: Add options for library suffixes
On Tuesday, 2018-06-12 11:19:40 -0600, bmgor...@chromium.org wrote: > From: Benjamin Gordon > > When building the Chrome OS Android container, we need to build copies > of mesa that don't conflict with the Android system-supplied libraries. > This adds options to create suffixed versions of EGL and GLES libraries: > > libEGL.so -> libEGL${egl-lib-suffix}.so > libGLESv1_CM.so -> libGLESv1_CM${gles-lib-suffix}.so > libGLESv2.so -> libGLES${gles-lib-suffix}.so > > This is similar to what happens when --enable-libglvnd is specified, but > without the side effects of linking against libglvnd. This seems reasonable, and the meson side of this patch is correct, but we need to document or prevent the interaction between --enable-libglvnd and --with-egl-lib-suffix. I can't think of a use-case for having both, so I suggest "if both are enabled, error out"; scroll down for what this could look like in meson. With that (and the corresponding autotools hunk): Reviewed-by: Eric Engestrom > > Change-Id: I0a534d3921a24c031e2532ee7d5ba9813740b33b (Note to whoever merges this patch: drop this line ^) > Signed-off-by: Benjamin Gordon > --- > configure.ac| 14 ++ > meson_options.txt | 12 > src/egl/Makefile.am | 8 > src/egl/meson.build | 2 +- > src/mapi/Makefile.am| 28 ++-- > src/mapi/es1api/meson.build | 2 +- > src/mapi/es2api/meson.build | 2 +- > 7 files changed, 47 insertions(+), 21 deletions(-) > > diff --git a/configure.ac b/configure.ac > index 35ade986d1..6070a2146b 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -1511,12 +1511,24 @@ AC_ARG_WITH([gl-lib-name], > [specify GL library name @<:@default=GL@:>@])], >[GL_LIB=$withval], >[GL_LIB="$DEFAULT_GL_LIB_NAME"]) > +AC_ARG_WITH([egl-lib-suffix], > + [AS_HELP_STRING([--with-egl-lib-suffix@<:@=NAME@:>@], > +[specify EGL library suffix @<:@default=none@:>@])], > + [EGL_LIB_SUFFIX=$withval], > + [EGL_LIB_SUFFIX=""]) > +AC_ARG_WITH([gles-lib-suffix], > + [AS_HELP_STRING([--with-gles-lib-suffix@<:@=NAME@:>@], > +[specify GLES 
library suffix @<:@default=none@:>@])], > + [GLES_LIB_SUFFIX=$withval], > + [GLES_LIB_SUFFIX=""]) > AC_ARG_WITH([osmesa-lib-name], >[AS_HELP_STRING([--with-osmesa-lib-name@<:@=NAME@:>@], > [specify OSMesa library name @<:@default=OSMesa@:>@])], >[OSMESA_LIB=$withval], >[OSMESA_LIB=OSMesa]) > AS_IF([test "x$GL_LIB" = xyes], [GL_LIB="$DEFAULT_GL_LIB_NAME"]) > +AS_IF([test "x$EGL_LIB_SUFFIX" = xyes], [EGL_LIB_SUFFIX=""]) > +AS_IF([test "x$GLES_LIB_SUFFIX" = xyes], [GLES_LIB_SUFFIX=""]) > AS_IF([test "x$OSMESA_LIB" = xyes], [OSMESA_LIB=OSMesa]) > > dnl > @@ -1534,6 +1546,8 @@ if test "x${enable_mangling}" = "xyes" ; then >OSMESA_LIB="Mangled${OSMESA_LIB}" > fi > AC_SUBST([GL_LIB]) > +AC_SUBST([EGL_LIB_SUFFIX]) > +AC_SUBST([GLES_LIB_SUFFIX]) > AC_SUBST([OSMESA_LIB]) > > # Check for libdrm > diff --git a/meson_options.txt b/meson_options.txt > index ce7d87f1eb..9d84c3b5bb 100644 > --- a/meson_options.txt > +++ b/meson_options.txt > @@ -298,3 +298,15 @@ option( >choices : ['freedreno', 'glsl', 'intel', 'nir', 'nouveau', 'all'], >description : 'List of tools to build.', > ) > +option( > + 'egl-lib-suffix', > + type : 'string', > + value : '', > + description : 'Suffix to append to EGL library name. Default: none.' > +) > +option( > + 'gles-lib-suffix', > + type : 'string', > + value : '', > + description : 'Suffix to append to GLES library names. Default: none.' 
> +) > diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am > index 086a4a1e63..c3aeeea007 100644 > --- a/src/egl/Makefile.am > +++ b/src/egl/Makefile.am > @@ -184,12 +184,12 @@ libEGL_mesa_la_LDFLAGS = \ > > else # USE_LIBGLVND > > -lib_LTLIBRARIES = libEGL.la > -libEGL_la_SOURCES = > -libEGL_la_LIBADD = \ > +lib_LTLIBRARIES = libEGL@EGL_LIB_SUFFIX@.la > +libEGL@EGL_LIB_SUFFIX@_la_SOURCES = > +libEGL@EGL_LIB_SUFFIX@_la_LIBADD = \ > libEGL_common.la \ > $(top_builddir)/src/mapi/shared-glapi/libglapi.la > -libEGL_la_LDFLAGS = \ > +libEGL@EGL_LIB_SUFFIX@_la_LDFLAGS = \ > -no-undefined \ > -version-number 1:0 \ > $(BSYMBOLIC) \ > diff --git a/src/egl/meson.build b/src/egl/meson.build > index 6537e4bdee..b833fd1729 100644 > --- a/src/egl/meson.build > +++ b/src/egl/meson.build > @@ -148,7 +148,7 @@ if cc.has_function('mincore') > endif > if with_glvnd and get_option('egl-lib-suffix') != '' error('''EGL lib suffix can't be used with libglvnd''') endif > if not with_glvnd > - egl_lib_name = 'EGL' > + egl_lib_name = 'EGL' + get_option('egl-lib-suffix') >egl_lib_version = '1.0.0' > else >egl_lib_name = 'EGL_mesa' > diff --git a/src/mapi/Makefile.am b/src/mapi/Makefile.am > index 3da1a193d2..a2b108adc9 100644 > --- a/src/mapi/Makefile.am > +++ b/src/mapi/Makefile.am > @@ -178,24 +178,24 @@ GLES_include_HEADERS = \ > $(top_srcdir)/inc
[Mesa-dev] [PATCH v2] radv: update the ZRANGE_PRECISION value for the TC-compat bug
On GFX8+, there is a bug that affects TC-compatible depth surfaces when the ZRange is not reset after LateZ kills pixels. The workaround is to always set DB_Z_INFO.ZRANGE_PRECISION to match the last fast clear value. Because the value is set to 1 by default, we only need to update it when clearing Z to 0.0. Original patch from James Legg. v2: - only update ZRANGE_PRECISION for depth aspects - adjust base address in presence of stencil Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105396 CC: Signed-off-by: Samuel Pitoiset --- src/amd/vulkan/radv_cmd_buffer.c | 94 1 file changed, 94 insertions(+) diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c index 043b4a2f44a..b8724d6b937 100644 --- a/src/amd/vulkan/radv_cmd_buffer.c +++ b/src/amd/vulkan/radv_cmd_buffer.c @@ -1044,6 +1044,68 @@ radv_emit_fb_color_state(struct radv_cmd_buffer *cmd_buffer, } } +static void +radv_update_zrange_precision(struct radv_cmd_buffer *cmd_buffer, +struct radv_ds_buffer_info *ds, +struct radv_image *image, VkImageLayout layout, +bool requires_cond_write) +{ + uint32_t db_z_info = ds->db_z_info; + uint32_t db_z_info_reg; + + if (!radv_image_is_tc_compat_htile(image)) + return; + + if (!radv_layout_has_htile(image, layout, + radv_image_queue_family_mask(image, + cmd_buffer->queue_family_index, + cmd_buffer->queue_family_index))) { + db_z_info &= C_028040_TILE_SURFACE_ENABLE; + } + + db_z_info &= C_028040_ZRANGE_PRECISION; + + if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) { + db_z_info_reg = R_028038_DB_Z_INFO; + } else { + db_z_info_reg = R_028040_DB_Z_INFO; + } + + /* When we don't know the last fast clear value we need to emit a +* conditional packet, otherwise we can update DB_Z_INFO directly. 
+*/ + if (requires_cond_write) { + radeon_emit(cmd_buffer->cs, PKT3(PKT3_COND_WRITE, 7, 0)); + + const uint32_t write_space = 0 << 8;/* register */ + const uint32_t poll_space = 1 << 4; /* memory */ + const uint32_t function = 3 << 0; /* equal to the reference */ + const uint32_t options = write_space | poll_space | function; + radeon_emit(cmd_buffer->cs, options); + + /* poll address - location of the depth clear value */ + uint64_t va = radv_buffer_get_va(image->bo); + va += image->offset + image->clear_value_offset; + + /* In presence of stencil format, we have to adjust the base +* address because the first value is the stencil clear value. +*/ + if (vk_format_is_stencil(image->vk_format)) + va += 4; + + radeon_emit(cmd_buffer->cs, va); + radeon_emit(cmd_buffer->cs, va >> 32); + + radeon_emit(cmd_buffer->cs, fui(0.0f)); /* reference value */ + radeon_emit(cmd_buffer->cs, (uint32_t)-1); /* comparison mask */ + radeon_emit(cmd_buffer->cs, db_z_info_reg >> 2); /* write address low */ + radeon_emit(cmd_buffer->cs, 0u); /* write address high */ + radeon_emit(cmd_buffer->cs, db_z_info); + } else { + radeon_set_context_reg(cmd_buffer->cs, db_z_info_reg, db_z_info); + } +} + static void radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer, struct radv_ds_buffer_info *ds, @@ -1102,6 +1164,9 @@ radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer, } + /* Update the ZRANGE_PRECISION value for the TC-compat bug. */ + radv_update_zrange_precision(cmd_buffer, ds, image, layout, true); + radeon_set_context_reg(cmd_buffer->cs, R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL, ds->pa_su_poly_offset_db_fmt_cntl); } @@ -1143,6 +1208,35 @@ radv_set_depth_clear_regs(struct radv_cmd_buffer *cmd_buffer, radeon_emit(cmd_buffer->cs, ds_clear_value.stencil); /* R_028028_DB_STENCIL_CLEAR */ if (aspects & VK_IMAGE_ASPECT_DEPTH_BIT) radeon_emit(cmd_buffer->cs, fui(ds_clear_value.depth)); /* R_02802C_DB_DEPTH_CLEAR */ + + /* Update the ZRANGE_PRECISION value for the TC-compat bug. 
This is +* only needed when clearing Z to 0.0. +*/ + if ((aspects & VK_IMAGE_ASPECT_DEPTH_BIT) && + ds_clear_value.depth == 0.0) { + struct radv_framebuffer *framebuffer = cmd_buffer->state.framebuffer; + const struct radv_subpas
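The control word and poll address built in the conditional-write path of the patch above can be worked through in isolation. This is a minimal sketch: the field values (0 << 8, 1 << 4, 3 << 0) and the 4-byte stencil adjustment are copied from the patch, while the function names and parameters are hypothetical and not part of radv.

```c
#include <assert.h>
#include <stdint.h>

/* Control word for the conditional write, using the field values from the
 * patch: write to a register, poll memory, compare "equal to reference". */
static inline uint32_t cond_write_options(void)
{
    const uint32_t write_space = 0u << 8; /* register */
    const uint32_t poll_space  = 1u << 4; /* memory */
    const uint32_t function    = 3u << 0; /* equal to the reference */
    return write_space | poll_space | function;
}

/* Address of the depth clear value. When the format has a stencil aspect,
 * the first 32-bit value is the stencil clear value, so depth follows it. */
static inline uint64_t depth_clear_va(uint64_t bo_va, uint64_t image_offset,
                                      uint64_t clear_value_offset,
                                      int has_stencil)
{
    uint64_t va = bo_va + image_offset + clear_value_offset;
    if (has_stencil)
        va += 4;
    return va;
}

/* The 64-bit address is emitted as two 32-bit dwords, low half first. */
static inline uint32_t va_lo(uint64_t va) { return (uint32_t)va; }
static inline uint32_t va_hi(uint64_t va) { return (uint32_t)(va >> 32); }
```

With these values the control word is 0x13, and the poll address is split into a low and a high dword in the same order as the two radeon_emit calls in the patch.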
[Mesa-dev] [Bug 105396] tc compatible htile sets depth of htiles of discarded fragments to 1.0
https://bugs.freedesktop.org/show_bug.cgi?id=105396

--- Comment #9 from Samuel Pitoiset ---
Can you confirm this patch fixes the issue? https://patchwork.freedesktop.org/patch/229236/
Re: [Mesa-dev] [PATCH v2] radv: update the ZRANGE_PRECISION value for the TC-compat bug
Thanks for figuring out the remaining issues, Reviewed-by: Bas Nieuwenhuizen On Wed, Jun 13, 2018 at 12:04 PM, Samuel Pitoiset wrote: > On GFX8+, there is a bug that affects TC-compatible depth surfaces > when the ZRange is not reset after LateZ kills pixels. > > The workaround is to always set DB_Z_INFO.ZRANGE_PRECISION to match > the last fast clear value. Because the value is set to 1 by default, > we only need to update it when clearing Z to 0.0. > > Original patch from James Legg. > > v2: - only update ZRANGE_PRECISION for depth aspects > - adjust base address in presence of stencil > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105396 > CC: > Signed-off-by: Samuel Pitoiset > --- > src/amd/vulkan/radv_cmd_buffer.c | 94 > 1 file changed, 94 insertions(+) > > diff --git a/src/amd/vulkan/radv_cmd_buffer.c > b/src/amd/vulkan/radv_cmd_buffer.c > index 043b4a2f44a..b8724d6b937 100644 > --- a/src/amd/vulkan/radv_cmd_buffer.c > +++ b/src/amd/vulkan/radv_cmd_buffer.c > @@ -1044,6 +1044,68 @@ radv_emit_fb_color_state(struct radv_cmd_buffer > *cmd_buffer, > } > } > > +static void > +radv_update_zrange_precision(struct radv_cmd_buffer *cmd_buffer, > +struct radv_ds_buffer_info *ds, > +struct radv_image *image, VkImageLayout layout, > +bool requires_cond_write) > +{ > + uint32_t db_z_info = ds->db_z_info; > + uint32_t db_z_info_reg; > + > + if (!radv_image_is_tc_compat_htile(image)) > + return; > + > + if (!radv_layout_has_htile(image, layout, > + radv_image_queue_family_mask(image, > + > cmd_buffer->queue_family_index, > + > cmd_buffer->queue_family_index))) { > + db_z_info &= C_028040_TILE_SURFACE_ENABLE; > + } > + > + db_z_info &= C_028040_ZRANGE_PRECISION; > + > + if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) > { > + db_z_info_reg = R_028038_DB_Z_INFO; > + } else { > + db_z_info_reg = R_028040_DB_Z_INFO; > + } > + > + /* When we don't know the last fast clear value we need to emit a > +* conditional packet, otherwise we can update 
DB_Z_INFO directly. > +*/ > + if (requires_cond_write) { > + radeon_emit(cmd_buffer->cs, PKT3(PKT3_COND_WRITE, 7, 0)); > + > + const uint32_t write_space = 0 << 8;/* register */ > + const uint32_t poll_space = 1 << 4; /* memory */ > + const uint32_t function = 3 << 0; /* equal to the > reference */ > + const uint32_t options = write_space | poll_space | function; > + radeon_emit(cmd_buffer->cs, options); > + > + /* poll address - location of the depth clear value */ > + uint64_t va = radv_buffer_get_va(image->bo); > + va += image->offset + image->clear_value_offset; > + > + /* In presence of stencil format, we have to adjust the base > +* address because the first value is the stencil clear value. > +*/ > + if (vk_format_is_stencil(image->vk_format)) > + va += 4; > + > + radeon_emit(cmd_buffer->cs, va); > + radeon_emit(cmd_buffer->cs, va >> 32); > + > + radeon_emit(cmd_buffer->cs, fui(0.0f)); /* reference > value */ > + radeon_emit(cmd_buffer->cs, (uint32_t)-1); /* > comparison mask */ > + radeon_emit(cmd_buffer->cs, db_z_info_reg >> 2); /* write > address low */ > + radeon_emit(cmd_buffer->cs, 0u); /* write > address high */ > + radeon_emit(cmd_buffer->cs, db_z_info); > + } else { > + radeon_set_context_reg(cmd_buffer->cs, db_z_info_reg, > db_z_info); > + } > +} > + > static void > radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer, > struct radv_ds_buffer_info *ds, > @@ -1102,6 +1164,9 @@ radv_emit_fb_ds_state(struct radv_cmd_buffer > *cmd_buffer, > > } > > + /* Update the ZRANGE_PRECISION value for the TC-compat bug. 
*/ > + radv_update_zrange_precision(cmd_buffer, ds, image, layout, true); > + > radeon_set_context_reg(cmd_buffer->cs, > R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL, >ds->pa_su_poly_offset_db_fmt_cntl); > } > @@ -1143,6 +1208,35 @@ radv_set_depth_clear_regs(struct radv_cmd_buffer > *cmd_buffer, > radeon_emit(cmd_buffer->cs, ds_clear_value.stencil); /* > R_028028_DB_STENCIL_CLEAR */ > if (aspects & VK_IMAGE_ASPECT_DEPTH_BIT) > radeon_emit(cmd_buffer->cs, fui(ds_clear_value.depth)); /* > R_02802C_DB_DE
[Mesa-dev] [Bug 106910] Primus Segfaults after updating Mesa to 18.1.1
https://bugs.freedesktop.org/show_bug.cgi?id=106910

Bug ID: 106910
Summary: Primus Segfaults after updating Mesa to 18.1.1
Product: Mesa
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: normal
Priority: medium
Component: Other
Assignee: mesa-dev@lists.freedesktop.org
Reporter: sali...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 140148 --> https://bugs.freedesktop.org/attachment.cgi?id=140148&action=edit
Journalctl traces

After upgrading Mesa to version 18.1.1, Primus (a bridge for Bumblebee, the NVIDIA Optimus implementation) segfaults when running applications that require GLX. Running the same applications with Intel or with VirtualGL (another bridge for Bumblebee, which runs slower than Primus) works fine. After downgrading to Mesa 18.0.4 everything works again. The applications' logs don't say much.

The issue can be reproduced by:
1) Installing Bumblebee 3.2.1, Mesa 18.1.1 and primus 20151110
2) Running any GLX application with primusrun
[Mesa-dev] [Bug 106910] Primus Segfaults after updating Mesa to 18.1.1
https://bugs.freedesktop.org/show_bug.cgi?id=106910

--- Comment #1 from Alexander ---
The same issue remains in the git version as well.
[Mesa-dev] [Bug 105396] tc compatible htile sets depth of htiles of discarded fragments to 1.0
https://bugs.freedesktop.org/show_bug.cgi?id=105396

--- Comment #10 from James Legg ---
Yes, that patch fixes it. Thanks.
Re: [Mesa-dev] [PATCH 1/3] meson: Fix -latomic check
On Tuesday, 2018-06-12 17:50:20 -0700, Matt Turner wrote:
> Commit 54ba73ef102f (configure.ac/meson.build: Fix -latomic test) fixed
> some checks for -latomic, and then commit 54bbe600ec26 (configure.ac:
> rework -latomic check) further extended the fixes in configure.ac but
> not in Meson. This commit extends those fixes to the Meson tests.
>
> Fixes: 54bbe600ec26 (configure.ac: rework -latomic check)

Reviewed-by: Eric Engestrom

> ---
>  meson.build | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/meson.build b/meson.build
> index 7dba52369b0..62200476216 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -836,7 +836,13 @@ endif
>  # Check for GCC style atomics
>  dep_atomic = null_dep
>
> -if cc.compiles('int main() { int n; return __atomic_load_n(&n, __ATOMIC_ACQUIRE); }',
> +if cc.compiles('''#include <stdint.h>
> +                  int main() {
> +                    struct {
> +                      uint64_t *v;
> +                    } x;
> +                    return (int)__atomic_load_n(x.v, __ATOMIC_ACQUIRE);
> +                  }''',
>                 name : 'GCC atomic builtins')
>    pre_args += '-DUSE_GCC_ATOMIC_BUILTINS'
>
> --
> 2.16.1
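The probe in the patch above switches from an int load to a 64-bit load through a struct member. As a rough standalone illustration of that access pattern, here is a minimal sketch; the names "struct wrapper" and load_acquire_u64 are hypothetical, and it assumes a GCC- or Clang-compatible compiler that provides the __atomic builtins. Unlike the probe, which is only compiled and never run, this version takes an initialized pointer so it can actually be executed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as the anonymous struct in the probe: a pointer to a
 * 64-bit value reached through a struct member. */
struct wrapper {
    uint64_t *v;
};

/* 64-bit atomic load with acquire ordering, as in the new probe. */
static inline uint64_t load_acquire_u64(const struct wrapper *w)
{
    assert(w != NULL && w->v != NULL);
    return __atomic_load_n(w->v, __ATOMIC_ACQUIRE);
}
```

Presumably the point of probing a uint64_t rather than an int is that the 64-bit case is the one that may behave differently on some 32-bit targets; that motivation is an inference from the commit message, not something stated in the thread.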
Re: [Mesa-dev] Mesa GitLab access approval process
On Wed, Jun 13, 2018 at 12:43 AM, Jason Ekstrand wrote:
> Since we've been on GitLab (it's been less than a week), we've already
> gotten a couple of developer access requests through GitLab. As it stands,
> these just show up as an e-mail to the group owners with zero explanation or
> opportunity for the requester to provide justification for the request.
> This is clearly worse than the bugzilla system we had before.
>
> I don't think we want to change the general guidelines for getting commit
> access of ~2 dozen patches, good standing, and an understanding of the Mesa
> code review process. However, we do need to do something else for
> requesting access so we have some real dialogue and provide opportunities
> for people from the same area of Mesa that the new developer wants to work
> in to vouch for them.
>
> My recommendation (if no one minds) would be to create a mesa "accounts"
> project that doesn't have a git repo or anything else and just provides an
> issue tracker. People could then use that much in the same way as they've
> used Bugzilla in the past to request accounts. If people would rather stick
> to bugzilla, that's fine with me. I just thought this would be a relatively
> painless way to try out the issue tracker.
>

I guess in the long run, if we switch over to gitlab issue tracker for "real" bugs, I was kinda expecting account requests would just be a special component in the mesa project's issue tracker, instead of a special project.

Either way, I agree w/ keeping the process of filing an issue/bz, and keeping the same general guidelines (wherever the issues/bzs live, ie same project, different project, or bugzilla).

BR,
-R
Re: [Mesa-dev] [PATCH] virgl: add ARB_tessellation_shader support. (v2)
On Wed, Jun 13, 2018 at 11:03:55AM +1000, Dave Airlie wrote: > From: Dave Airlie > > This should add all the pieces to enable tess shaders on virgl. > > v2: fixup transform to handle tess and strip out precise. > set default for max patch varyings to work around issue when > tess gets enabled from v1 caps but v2 caps aren't in place. (Elie) Reviewed-by: Elie Tournier > --- > src/gallium/auxiliary/tgsi/tgsi_transform.c | 4 -- > src/gallium/drivers/virgl/virgl_context.c | 69 > + > src/gallium/drivers/virgl/virgl_encode.c| 21 - > src/gallium/drivers/virgl/virgl_encode.h| 4 ++ > src/gallium/drivers/virgl/virgl_protocol.h | 5 +++ > src/gallium/drivers/virgl/virgl_screen.c| 10 - > src/gallium/drivers/virgl/virgl_winsys.h| 2 +- > 7 files changed, 107 insertions(+), 8 deletions(-) > > diff --git a/src/gallium/auxiliary/tgsi/tgsi_transform.c > b/src/gallium/auxiliary/tgsi/tgsi_transform.c > index cd076c9e79e..4b2b10f50ad 100644 > --- a/src/gallium/auxiliary/tgsi/tgsi_transform.c > +++ b/src/gallium/auxiliary/tgsi/tgsi_transform.c > @@ -140,10 +140,6 @@ tgsi_transform_shader(const struct tgsi_token *tokens_in, >return -1; > } > procType = parse.FullHeader.Processor.Processor; > - assert(procType == PIPE_SHADER_FRAGMENT || > - procType == PIPE_SHADER_VERTEX || > - procType == PIPE_SHADER_GEOMETRY); > - > > /** > ** Setup output shader > diff --git a/src/gallium/drivers/virgl/virgl_context.c > b/src/gallium/drivers/virgl/virgl_context.c > index 8d701bb8f40..e6f8dc85256 100644 > --- a/src/gallium/drivers/virgl/virgl_context.c > +++ b/src/gallium/drivers/virgl/virgl_context.c > @@ -492,6 +492,18 @@ static void *virgl_create_vs_state(struct pipe_context > *ctx, > return virgl_shader_encoder(ctx, shader, PIPE_SHADER_VERTEX); > } > > +static void *virgl_create_tcs_state(struct pipe_context *ctx, > + const struct pipe_shader_state *shader) > +{ > + return virgl_shader_encoder(ctx, shader, PIPE_SHADER_TESS_CTRL); > +} > + > +static void *virgl_create_tes_state(struct pipe_context 
*ctx, > + const struct pipe_shader_state *shader) > +{ > + return virgl_shader_encoder(ctx, shader, PIPE_SHADER_TESS_EVAL); > +} > + > static void *virgl_create_gs_state(struct pipe_context *ctx, > const struct pipe_shader_state *shader) > { > @@ -534,6 +546,26 @@ virgl_delete_vs_state(struct pipe_context *ctx, > virgl_encode_delete_object(vctx, handle, VIRGL_OBJECT_SHADER); > } > > +static void > +virgl_delete_tcs_state(struct pipe_context *ctx, > + void *tcs) > +{ > + uint32_t handle = (unsigned long)tcs; > + struct virgl_context *vctx = virgl_context(ctx); > + > + virgl_encode_delete_object(vctx, handle, VIRGL_OBJECT_SHADER); > +} > + > +static void > +virgl_delete_tes_state(struct pipe_context *ctx, > + void *tes) > +{ > + uint32_t handle = (unsigned long)tes; > + struct virgl_context *vctx = virgl_context(ctx); > + > + virgl_encode_delete_object(vctx, handle, VIRGL_OBJECT_SHADER); > +} > + > static void virgl_bind_vs_state(struct pipe_context *ctx, > void *vss) > { > @@ -543,6 +575,24 @@ static void virgl_bind_vs_state(struct pipe_context *ctx, > virgl_encode_bind_shader(vctx, handle, PIPE_SHADER_VERTEX); > } > > +static void virgl_bind_tcs_state(struct pipe_context *ctx, > + void *vss) > +{ > + uint32_t handle = (unsigned long)vss; > + struct virgl_context *vctx = virgl_context(ctx); > + > + virgl_encode_bind_shader(vctx, handle, PIPE_SHADER_TESS_CTRL); > +} > + > +static void virgl_bind_tes_state(struct pipe_context *ctx, > + void *vss) > +{ > + uint32_t handle = (unsigned long)vss; > + struct virgl_context *vctx = virgl_context(ctx); > + > + virgl_encode_bind_shader(vctx, handle, PIPE_SHADER_TESS_EVAL); > +} > + > static void virgl_bind_gs_state(struct pipe_context *ctx, > void *vss) > { > @@ -801,6 +851,18 @@ static void virgl_set_clip_state(struct pipe_context > *ctx, > virgl_encoder_set_clip_state(vctx, clip); > } > > +static void virgl_set_tess_state(struct pipe_context *ctx, > + const float default_outer_level[4], > + const float 
default_inner_level[2]) > +{ > + struct virgl_context *vctx = virgl_context(ctx); > + struct virgl_screen *rs = virgl_screen(ctx->screen); > + > + if (!rs->caps.caps.v1.bset.has_tessellation_shaders) > + return; > + virgl_encode_set_tess_state(vctx, default_outer_level, > default_inner_level); > +} > + > static void virgl_resource_copy_region(struct pipe_context *ctx, >struct pipe_resource *dst, >
[Mesa-dev] [PATCH v3] radv: update the ZRANGE_PRECISION value for the TC-compat bug
On GFX8+, there is a bug that affects TC-compatible depth surfaces when the ZRange is not reset after LateZ kills pixels. The workaround is to always set DB_Z_INFO.ZRANGE_PRECISION to match the last fast clear value. Because the value is set to 1 by default, we only need to update it when clearing Z to 0.0. We also need to set the depth clear regs and to update ZRANGE_PRECISION when initializing a TC-compat depth image to 0. Original patch from James Legg. v3: - check that subpass isn't NULL (needed for the next patch) - set depth clear regs when initializing HTILE v2: - only update ZRANGE_PRECISION for depth aspects - adjust base address in presence of stencil This fixes random CTS fails with dEQP-VK.renderpass.suballocation.formats.d32_sfloat_s8_uint.input.* Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105396 CC: Signed-off-by: Samuel Pitoiset --- src/amd/vulkan/radv_cmd_buffer.c | 108 +++ 1 file changed, 108 insertions(+) diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c index 043b4a2f44a..53fb4988a8c 100644 --- a/src/amd/vulkan/radv_cmd_buffer.c +++ b/src/amd/vulkan/radv_cmd_buffer.c @@ -1044,6 +1044,68 @@ radv_emit_fb_color_state(struct radv_cmd_buffer *cmd_buffer, } } +static void +radv_update_zrange_precision(struct radv_cmd_buffer *cmd_buffer, +struct radv_ds_buffer_info *ds, +struct radv_image *image, VkImageLayout layout, +bool requires_cond_write) +{ + uint32_t db_z_info = ds->db_z_info; + uint32_t db_z_info_reg; + + if (!radv_image_is_tc_compat_htile(image)) + return; + + if (!radv_layout_has_htile(image, layout, + radv_image_queue_family_mask(image, + cmd_buffer->queue_family_index, + cmd_buffer->queue_family_index))) { + db_z_info &= C_028040_TILE_SURFACE_ENABLE; + } + + db_z_info &= C_028040_ZRANGE_PRECISION; + + if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) { + db_z_info_reg = R_028038_DB_Z_INFO; + } else { + db_z_info_reg = R_028040_DB_Z_INFO; + } + + /* When we don't know the last 
fast clear value we need to emit a +* conditional packet, otherwise we can update DB_Z_INFO directly. +*/ + if (requires_cond_write) { + radeon_emit(cmd_buffer->cs, PKT3(PKT3_COND_WRITE, 7, 0)); + + const uint32_t write_space = 0 << 8;/* register */ + const uint32_t poll_space = 1 << 4; /* memory */ + const uint32_t function = 3 << 0; /* equal to the reference */ + const uint32_t options = write_space | poll_space | function; + radeon_emit(cmd_buffer->cs, options); + + /* poll address - location of the depth clear value */ + uint64_t va = radv_buffer_get_va(image->bo); + va += image->offset + image->clear_value_offset; + + /* In presence of stencil format, we have to adjust the base +* address because the first value is the stencil clear value. +*/ + if (vk_format_is_stencil(image->vk_format)) + va += 4; + + radeon_emit(cmd_buffer->cs, va); + radeon_emit(cmd_buffer->cs, va >> 32); + + radeon_emit(cmd_buffer->cs, fui(0.0f)); /* reference value */ + radeon_emit(cmd_buffer->cs, (uint32_t)-1); /* comparison mask */ + radeon_emit(cmd_buffer->cs, db_z_info_reg >> 2); /* write address low */ + radeon_emit(cmd_buffer->cs, 0u); /* write address high */ + radeon_emit(cmd_buffer->cs, db_z_info); + } else { + radeon_set_context_reg(cmd_buffer->cs, db_z_info_reg, db_z_info); + } +} + static void radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer, struct radv_ds_buffer_info *ds, @@ -1102,6 +1164,9 @@ radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer, } + /* Update the ZRANGE_PRECISION value for the TC-compat bug. 
*/ + radv_update_zrange_precision(cmd_buffer, ds, image, layout, true); + radeon_set_context_reg(cmd_buffer->cs, R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL, ds->pa_su_poly_offset_db_fmt_cntl); } @@ -1143,6 +1208,35 @@ radv_set_depth_clear_regs(struct radv_cmd_buffer *cmd_buffer, radeon_emit(cmd_buffer->cs, ds_clear_value.stencil); /* R_028028_DB_STENCIL_CLEAR */ if (aspects & VK_IMAGE_ASPECT_DEPTH_BIT) radeon_emit(cmd_buffer->cs, fui(ds_clear_value.depth)); /* R_02802C_DB_DEPTH_CLEAR */ + + /* Update t
Re: [Mesa-dev] [PATCH v3] radv: update the ZRANGE_PRECISION value for the TC-compat bug
Reviewed-by: Bas Nieuwenhuizen On Wed, Jun 13, 2018 at 2:27 PM, Samuel Pitoiset wrote: > On GFX8+, there is a bug that affects TC-compatible depth surfaces > when the ZRange is not reset after LateZ kills pixels. > > The workaround is to always set DB_Z_INFO.ZRANGE_PRECISION to match > the last fast clear value. Because the value is set to 1 by default, > we only need to update it when clearing Z to 0.0. > > We also need to set the depth clear regs and to update > ZRANGE_PRECISION when initializing a TC-compat depth image to 0. > > Original patch from James Legg. > > v3: - check that subpass isn't NULL (needed for the next patch) > - set depth clear regs when initializing HTILE > v2: - only update ZRANGE_PRECISION for depth aspects > - adjust base address in presence of stencil > > This fixes random CTS fails with > dEQP-VK.renderpass.suballocation.formats.d32_sfloat_s8_uint.input.* > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105396 > CC: > Signed-off-by: Samuel Pitoiset > --- > src/amd/vulkan/radv_cmd_buffer.c | 108 +++ > 1 file changed, 108 insertions(+) > > diff --git a/src/amd/vulkan/radv_cmd_buffer.c > b/src/amd/vulkan/radv_cmd_buffer.c > index 043b4a2f44a..53fb4988a8c 100644 > --- a/src/amd/vulkan/radv_cmd_buffer.c > +++ b/src/amd/vulkan/radv_cmd_buffer.c > @@ -1044,6 +1044,68 @@ radv_emit_fb_color_state(struct radv_cmd_buffer > *cmd_buffer, > } > } > > +static void > +radv_update_zrange_precision(struct radv_cmd_buffer *cmd_buffer, > +struct radv_ds_buffer_info *ds, > +struct radv_image *image, VkImageLayout layout, > +bool requires_cond_write) > +{ > + uint32_t db_z_info = ds->db_z_info; > + uint32_t db_z_info_reg; > + > + if (!radv_image_is_tc_compat_htile(image)) > + return; > + > + if (!radv_layout_has_htile(image, layout, > + radv_image_queue_family_mask(image, > + > cmd_buffer->queue_family_index, > + > cmd_buffer->queue_family_index))) { > + db_z_info &= C_028040_TILE_SURFACE_ENABLE; > + } > + > + db_z_info &= C_028040_ZRANGE_PRECISION; 
> + > + if (cmd_buffer->device->physical_device->rad_info.chip_class >= GFX9) > { > + db_z_info_reg = R_028038_DB_Z_INFO; > + } else { > + db_z_info_reg = R_028040_DB_Z_INFO; > + } > + > + /* When we don't know the last fast clear value we need to emit a > +* conditional packet, otherwise we can update DB_Z_INFO directly. > +*/ > + if (requires_cond_write) { > + radeon_emit(cmd_buffer->cs, PKT3(PKT3_COND_WRITE, 7, 0)); > + > + const uint32_t write_space = 0 << 8;/* register */ > + const uint32_t poll_space = 1 << 4; /* memory */ > + const uint32_t function = 3 << 0; /* equal to the > reference */ > + const uint32_t options = write_space | poll_space | function; > + radeon_emit(cmd_buffer->cs, options); > + > + /* poll address - location of the depth clear value */ > + uint64_t va = radv_buffer_get_va(image->bo); > + va += image->offset + image->clear_value_offset; > + > + /* In presence of stencil format, we have to adjust the base > +* address because the first value is the stencil clear value. > +*/ > + if (vk_format_is_stencil(image->vk_format)) > + va += 4; > + > + radeon_emit(cmd_buffer->cs, va); > + radeon_emit(cmd_buffer->cs, va >> 32); > + > + radeon_emit(cmd_buffer->cs, fui(0.0f)); /* reference > value */ > + radeon_emit(cmd_buffer->cs, (uint32_t)-1); /* > comparison mask */ > + radeon_emit(cmd_buffer->cs, db_z_info_reg >> 2); /* write > address low */ > + radeon_emit(cmd_buffer->cs, 0u); /* write > address high */ > + radeon_emit(cmd_buffer->cs, db_z_info); > + } else { > + radeon_set_context_reg(cmd_buffer->cs, db_z_info_reg, > db_z_info); > + } > +} > + > static void > radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer, > struct radv_ds_buffer_info *ds, > @@ -1102,6 +1164,9 @@ radv_emit_fb_ds_state(struct radv_cmd_buffer > *cmd_buffer, > > } > > + /* Update the ZRANGE_PRECISION value for the TC-compat bug. 
*/ > + radv_update_zrange_precision(cmd_buffer, ds, image, layout, true); > + > radeon_set_context_reg(cmd_buffer->cs, > R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL, >ds->pa_su_poly_offset_db_fmt_cntl); > } > @@ -1143,6 +1208,35 @@ radv_set_d
Re: [Mesa-dev] [PATCH 1/2] ac/gpu_info: report real total memory sizes
Reviewed-by: Bas Nieuwenhuizen for both. Thanks! On Wed, Jun 13, 2018 at 3:15 AM, Marek Olšák wrote: > From: Marek Olšák > > The change from MIN2 to MAX2 is intentional. > --- > src/amd/common/ac_gpu_info.c | 82 > 1 file changed, 54 insertions(+), 28 deletions(-) > > diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c > index 6bee96b9eee..3b6600dcbc6 100644 > --- a/src/amd/common/ac_gpu_info.c > +++ b/src/amd/common/ac_gpu_info.c > @@ -91,21 +91,20 @@ static bool has_syncobj(int fd) > return false; > return value ? true : false; > } > > bool ac_query_gpu_info(int fd, amdgpu_device_handle dev, >struct radeon_info *info, >struct amdgpu_gpu_info *amdinfo) > { > struct drm_amdgpu_info_device device_info = {}; > struct amdgpu_buffer_size_alignments alignment_info = {}; > - struct amdgpu_heap_info vram, vram_vis, gtt; > struct drm_amdgpu_info_hw_ip dma = {}, compute = {}, uvd = {}; > struct drm_amdgpu_info_hw_ip uvd_enc = {}, vce = {}, vcn_dec = {}; > struct drm_amdgpu_info_hw_ip vcn_enc = {}, gfx = {}; > struct amdgpu_gds_resource_info gds = {}; > uint32_t vce_version = 0, vce_feature = 0, uvd_version = 0, > uvd_feature = 0; > int r, i, j; > drmDevicePtr devinfo; > > /* Get PCI info. 
*/ > r = drmGetDevice2(fd, 0, &devinfo); > @@ -132,40 +131,20 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev, > fprintf(stderr, "amdgpu: amdgpu_query_info(dev_info) > failed.\n"); > return false; > } > > r = amdgpu_query_buffer_size_alignment(dev, &alignment_info); > if (r) { > fprintf(stderr, "amdgpu: amdgpu_query_buffer_size_alignment > failed.\n"); > return false; > } > > - r = amdgpu_query_heap_info(dev, AMDGPU_GEM_DOMAIN_VRAM, 0, &vram); > - if (r) { > - fprintf(stderr, "amdgpu: amdgpu_query_heap_info(vram) > failed.\n"); > - return false; > - } > - > - r = amdgpu_query_heap_info(dev, AMDGPU_GEM_DOMAIN_VRAM, > - AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED, > - &vram_vis); > - if (r) { > - fprintf(stderr, "amdgpu: amdgpu_query_heap_info(vram_vis) > failed.\n"); > - return false; > - } > - > - r = amdgpu_query_heap_info(dev, AMDGPU_GEM_DOMAIN_GTT, 0, &gtt); > - if (r) { > - fprintf(stderr, "amdgpu: amdgpu_query_heap_info(gtt) > failed.\n"); > - return false; > - } > - > r = amdgpu_query_hw_ip_info(dev, AMDGPU_HW_IP_DMA, 0, &dma); > if (r) { > fprintf(stderr, "amdgpu: amdgpu_query_hw_ip_info(dma) > failed.\n"); > return false; > } > > r = amdgpu_query_hw_ip_info(dev, AMDGPU_HW_IP_GFX, 0, &gfx); > if (r) { > fprintf(stderr, "amdgpu: amdgpu_query_hw_ip_info(gfx) > failed.\n"); > return false; > @@ -256,20 +235,74 @@ bool ac_query_gpu_info(int fd, amdgpu_device_handle dev, > fprintf(stderr, "amdgpu: amdgpu_query_sw_info(address32_hi) > failed.\n"); > return false; > } > > r = amdgpu_query_gds_info(dev, &gds); > if (r) { > fprintf(stderr, "amdgpu: amdgpu_query_gds_info failed.\n"); > return false; > } > > + if (info->drm_minor >= 9) { > + struct drm_amdgpu_memory_info meminfo; > + > + r = amdgpu_query_info(dev, AMDGPU_INFO_MEMORY, > sizeof(meminfo), &meminfo); > + if (r) { > + fprintf(stderr, "amdgpu: amdgpu_query_info(memory) > failed.\n"); > + return false; > + } > + > + /* Note: usable_heap_size values can be random and can't be > relied on. 
*/ > + info->gart_size = meminfo.gtt.total_heap_size; > + info->vram_size = meminfo.vram.total_heap_size; > + info->vram_vis_size = > meminfo.cpu_accessible_vram.total_heap_size; > + > + info->max_alloc_size = MAX2(meminfo.vram.max_allocation, > + meminfo.gtt.max_allocation); > + } else { > + /* This is a deprecated interface, which reports usable sizes > +* (total minus pinned), but the pinned size computation is > +* buggy, so the values returned from these functions can be > +* random. > +*/ > + struct amdgpu_heap_info vram, vram_vis, gtt; > + > + r = amdgpu_query_heap_info(dev, AMDGPU_GEM_DOMAIN_VRAM, 0, > &vram); > + if (r) { > + fprintf(stderr, "amdgpu: amdgpu_query_heap_info(vram) > failed.\n");
[Mesa-dev] [Bug 106897] Ubuntu 16.04. Mesa can't be built with specified configurations
https://bugs.freedesktop.org/show_bug.cgi?id=106897

--- Comment #7 from Timo Aaltonen ---
Such is life: 16.04 won't get a newer Wayland, but 18.04 will, eventually.

For now, you can use a PPA for a backport with the necessary packaging changes:

https://launchpad.net/~ubuntu-x-swat/+archive/ubuntu/updates
[Mesa-dev] [Bug 106912] radv: 16-bit depth buffer causes artifacts in Shadow Warrior 2
https://bugs.freedesktop.org/show_bug.cgi?id=106912

Bug ID: 106912
Summary: radv: 16-bit depth buffer causes artifacts in Shadow Warrior 2
Product: Mesa
Version: git
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: Drivers/Vulkan/radeon
Assignee: mesa-dev@lists.freedesktop.org
Reporter: philip.rebo...@tu-dortmund.de
QA Contact: mesa-dev@lists.freedesktop.org

Hello,

Shadow Warrior 2 uses D16_UNORM as a shadow map format, and clearing the
depth buffer in one render pass instance and rendering to it in another
results in parts of the depth buffer getting set to 1.0. The game renders
correctly with RADV_DEBUG=nohiz.

I wasn't able to reproduce this issue outside of DXVK so far, so here's a
Renderdoc capture of the issue (captured on Polaris 10):
https://mega.nz/#!gfoWFDSC!rb9qsW9H6dGq_gsNvpdhPW82mSkZEy94PX-4Ey6BSTs

The render pass in question starts at EID 19473, which should be bookmarked.

For the capture I used Mesa 18.1.1, but the issue is still present in latest
-git.

Regards
- Philip
Re: [Mesa-dev] [PATCH v3 3/3] egl/android: Add DRM node probing and filtering
+Amit and John On Sat, Jun 9, 2018 at 11:27 AM, Robert Foss wrote: > This patch both adds support for probing & filtering DRM nodes > and switches away from using the GRALLOC_MODULE_PERFORM_GET_DRM_FD > gralloc call. > > Currently the filtering is based just on the driver name, > and the desired name is supplied using the "drm.gpu.vendor_name" > Android property. There's a potential issue with this whole approach and that is SELinux. With the way SELinux locks down accesses, getting probing thru device files to work can be a pain. It may be better now than the prior version because sysfs is not probed. I'll leave it to Amit or John to comment. Rob > > Signed-off-by: Robert Foss > --- > > Changes since v2: > - Switch from drmGetDevices2 to manual renderD node iteration > - Add probe_res enum to communicate probing results better > - Avoid using _eglError() in internal static functions > - Avoid actually loading the driver while probing, just verify >that it exists. > - Replace strlen call with the assumed length PROPERTY_VALUE_MAX > > Changes since v1: > - Do not rely on libdrm for probing > - Distinguish between errors and when no drm devices are found > > Changes since RFC: > - Rebased on newer libdrm drmHandleMatch patch > - Added support for driver probing > > > src/egl/drivers/dri2/platform_android.c | 222 ++-- > 1 file changed, 169 insertions(+), 53 deletions(-) > > diff --git a/src/egl/drivers/dri2/platform_android.c > b/src/egl/drivers/dri2/platform_android.c > index 4ba96aad90..a2cbe92d93 100644 > --- a/src/egl/drivers/dri2/platform_android.c > +++ b/src/egl/drivers/dri2/platform_android.c > @@ -27,12 +27,16 @@ > * DEALINGS IN THE SOFTWARE. 
> */ > > +#include > #include > +#include > #include > #include > #include > #include > +#include > #include > +#include > > #include "loader.h" > #include "egl_dri2.h" > @@ -1130,31 +1134,6 @@ droid_add_configs_for_visuals(_EGLDriver *drv, > _EGLDisplay *dpy) > return (config_count != 0); > } > > -enum { > -/* perform(const struct gralloc_module_t *mod, > - * int op, > - * int *fd); > - */ > -GRALLOC_MODULE_PERFORM_GET_DRM_FD = 0x4002, > -}; > - > -static int > -droid_open_device(struct dri2_egl_display *dri2_dpy) > -{ > - int fd = -1, err = -EINVAL; > - > - if (dri2_dpy->gralloc->perform) > - err = dri2_dpy->gralloc->perform(dri2_dpy->gralloc, > - GRALLOC_MODULE_PERFORM_GET_DRM_FD, > - &fd); > - if (err || fd < 0) { > - _eglLog(_EGL_WARNING, "fail to get drm fd"); > - fd = -1; > - } > - > - return (fd >= 0) ? fcntl(fd, F_DUPFD_CLOEXEC, 3) : -1; > -} > - > static const struct dri2_egl_display_vtbl droid_display_vtbl = { > .authenticate = NULL, > .create_window_surface = droid_create_window_surface, > @@ -1215,6 +1194,168 @@ static const __DRIextension > *droid_image_loader_extensions[] = { > NULL, > }; > > +EGLBoolean > +droid_load_driver(_EGLDisplay *disp) > +{ > + struct dri2_egl_display *dri2_dpy = disp->DriverData; > + const char *err; > + > + dri2_dpy->driver_name = loader_get_driver_for_fd(dri2_dpy->fd); > + if (dri2_dpy->driver_name == NULL) > + return false; > + > + dri2_dpy->is_render_node = drmGetNodeTypeFromFd(dri2_dpy->fd) == > DRM_NODE_RENDER; > + > + if (!dri2_dpy->is_render_node) { > + #ifdef HAVE_DRM_GRALLOC > + /* Handle control nodes using __DRI_DRI2_LOADER extension and GEM > names > +* for backwards compatibility with drm_gralloc. (Do not use on new > +* systems.) 
*/ > + dri2_dpy->loader_extensions = droid_dri2_loader_extensions; > + if (!dri2_load_driver(disp)) { > + err = "DRI2: failed to load driver"; > + goto error; > + } > + #else > + err = "DRI2: handle is not for a render node"; > + goto error; > + #endif > + } else { > + dri2_dpy->loader_extensions = droid_image_loader_extensions; > + if (!dri2_load_driver_dri3(disp)) { > + err = "DRI3: failed to load driver"; > + goto error; > + } > +} > + > + return true; > + > +error: > + free(dri2_dpy->driver_name); > + dri2_dpy->driver_name = NULL; > + return false; > +} > + > +static bool > +droid_probe_driver(int fd) > +{ > + char *driver_name; > + > + driver_name = loader_get_driver_for_fd(fd); > + if (driver_name == NULL) > + return false; > + > + free(driver_name); > + return true; > +} > + > +typedef enum { > + probe_error = -1, > + probe_success = 0, > + probe_filtered_out = 1, > + probe_no_driver = 2 > +} probe_ret_t; > + > +static probe_ret_t > +droid_probe_device(_EGLDisplay *disp, int fd, char *vendor) > +{ > + int ret; > + > + drmVersionPtr ver = drmGetVersion(fd); > + if (!ver) > + return probe_error; > + > + if (vendor != NULL && ver->name != NULL && > +
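The probing flow described in the changelog (iterate over render nodes, filter by driver name, check that a driver exists without loading it) can be sketched roughly as below. The probe_ret_t values are taken from the patch; everything else is an assumption made for illustration: the sketch_node struct, the function names, the ordering of the checks, and in particular the use of strcmp for the name comparison, since the actual comparison is cut off in the quoted text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Result codes as defined in the patch. */
typedef enum {
    probe_error = -1,
    probe_success = 0,
    probe_filtered_out = 1,
    probe_no_driver = 2
} probe_ret_t;

/* Hypothetical description of one candidate node; the real code works on
 * file descriptors and queries the kernel instead. */
struct sketch_node {
    const char *driver_name; /* NULL models a failed version query */
    int driver_exists;       /* non-zero if a matching DRI driver was found */
};

/* Assumed ordering: query failure, then vendor filter, then driver check. */
static probe_ret_t sketch_probe(const struct sketch_node *node,
                                const char *vendor)
{
    if (node->driver_name == NULL)
        return probe_error;
    if (vendor != NULL && strcmp(vendor, node->driver_name) != 0)
        return probe_filtered_out;
    if (!node->driver_exists)
        return probe_no_driver;
    return probe_success;
}

/* Returns the index of the first node that probes successfully, or -1. */
static int sketch_pick_node(const struct sketch_node *nodes, int count,
                            const char *vendor)
{
    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        if (sketch_probe(&nodes[i], vendor) == probe_success)
            return i;
    }
    return -1;
}
```

Passing NULL as the vendor models the case where no filter is set, in which case the first node with a usable driver wins.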
Re: [Mesa-dev] [PATCH 1/3] meson: Fix -latomic check
Quoting Matt Turner (2018-06-12 17:50:20)
> Commit 54ba73ef102f (configure.ac/meson.build: Fix -latomic test) fixed
> some checks for -latomic, and then commit 54bbe600ec26 (configure.ac:
> rework -latomic check) further extended the fixes in configure.ac but
> not in Meson. This commit extends those fixes to the Meson tests.
>
> Fixes: 54bbe600ec26 (configure.ac: rework -latomic check)
> ---
>  meson.build | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/meson.build b/meson.build
> index 7dba52369b0..62200476216 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -836,7 +836,13 @@ endif
>  # Check for GCC style atomics
>  dep_atomic = null_dep
>
> -if cc.compiles('int main() { int n; return __atomic_load_n(&n, __ATOMIC_ACQUIRE); }',
> +if cc.compiles('''#include <stdint.h>
> +                  int main() {
> +                    struct {
> +                      uint64_t *v;
> +                    } x;
> +                    return (int)__atomic_load_n(x.v, __ATOMIC_ACQUIRE);
> +                  }''',
>                 name : 'GCC atomic builtins')
>    pre_args += '-DUSE_GCC_ATOMIC_BUILTINS'
>
> --
> 2.16.1
>

Should patches 2 and 3 be cc 18.1?

Reviewed-by: Dylan Baker
Re: [Mesa-dev] [PATCH] configure.ac/meson.build: Add options for library suffixes
Quoting Eric Engestrom (2018-06-13 03:03:25) > On Tuesday, 2018-06-12 11:19:40 -0600, bmgor...@chromium.org wrote: > > From: Benjamin Gordon > > > > When building the Chrome OS Android container, we need to build copies > > of mesa that don't conflict with the Android system-supplied libraries. > > This adds options to create suffixed versions of EGL and GLES libraries: > > > > libEGL.so -> libEGL${egl-lib-suffix}.so > > libGLESv1_CM.so -> libGLESv1_CM${gles-lib-suffix}.so > > libGLESv2.so -> libGLES${gles-lib-suffix}.so > > > > This is similar to what happens when --enable-libglvnd is specified, but > > without the side effects of linking against libglvnd. > > This seems reasonable, and the meson side of this patch is correct, > but we need to document or prevent the interaction between > --enable-libglvnd and --with-egl-lib-suffix. > > I can't think of a use-case for having both, so I suggest "if both are > enabled, error out"; scroll down for what this could look like in meson. Agreed, making it hard error to use both makes sense to me. 
> With that (and the corresponding autotools hunk): > Reviewed-by: Eric Engestrom > > > > > Change-Id: I0a534d3921a24c031e2532ee7d5ba9813740b33b > > (Note to whoever merges this patch: drop this line ^) > > > Signed-off-by: Benjamin Gordon > > --- > > configure.ac| 14 ++ > > meson_options.txt | 12 > > src/egl/Makefile.am | 8 > > src/egl/meson.build | 2 +- > > src/mapi/Makefile.am| 28 ++-- > > src/mapi/es1api/meson.build | 2 +- > > src/mapi/es2api/meson.build | 2 +- > > 7 files changed, 47 insertions(+), 21 deletions(-) > > > > diff --git a/configure.ac b/configure.ac > > index 35ade986d1..6070a2146b 100644 > > --- a/configure.ac > > +++ b/configure.ac > > @@ -1511,12 +1511,24 @@ AC_ARG_WITH([gl-lib-name], > > [specify GL library name @<:@default=GL@:>@])], > >[GL_LIB=$withval], > >[GL_LIB="$DEFAULT_GL_LIB_NAME"]) > > +AC_ARG_WITH([egl-lib-suffix], > > + [AS_HELP_STRING([--with-egl-lib-suffix@<:@=NAME@:>@], > > +[specify EGL library suffix @<:@default=none@:>@])], > > + [EGL_LIB_SUFFIX=$withval], > > + [EGL_LIB_SUFFIX=""]) > > +AC_ARG_WITH([gles-lib-suffix], > > + [AS_HELP_STRING([--with-gles-lib-suffix@<:@=NAME@:>@], > > +[specify GLES library suffix @<:@default=none@:>@])], > > + [GLES_LIB_SUFFIX=$withval], > > + [GLES_LIB_SUFFIX=""]) > > AC_ARG_WITH([osmesa-lib-name], > >[AS_HELP_STRING([--with-osmesa-lib-name@<:@=NAME@:>@], > > [specify OSMesa library name @<:@default=OSMesa@:>@])], > >[OSMESA_LIB=$withval], > >[OSMESA_LIB=OSMesa]) > > AS_IF([test "x$GL_LIB" = xyes], [GL_LIB="$DEFAULT_GL_LIB_NAME"]) > > +AS_IF([test "x$EGL_LIB_SUFFIX" = xyes], [EGL_LIB_SUFFIX=""]) > > +AS_IF([test "x$GLES_LIB_SUFFIX" = xyes], [GLES_LIB_SUFFIX=""]) > > AS_IF([test "x$OSMESA_LIB" = xyes], [OSMESA_LIB=OSMesa]) > > > > dnl > > @@ -1534,6 +1546,8 @@ if test "x${enable_mangling}" = "xyes" ; then > >OSMESA_LIB="Mangled${OSMESA_LIB}" > > fi > > AC_SUBST([GL_LIB]) > > +AC_SUBST([EGL_LIB_SUFFIX]) > > +AC_SUBST([GLES_LIB_SUFFIX]) > > AC_SUBST([OSMESA_LIB]) > > > > # Check for libdrm > > 
diff --git a/meson_options.txt b/meson_options.txt > > index ce7d87f1eb..9d84c3b5bb 100644 > > --- a/meson_options.txt > > +++ b/meson_options.txt > > @@ -298,3 +298,15 @@ option( > >choices : ['freedreno', 'glsl', 'intel', 'nir', 'nouveau', 'all'], > >description : 'List of tools to build.', > > ) > > +option( > > + 'egl-lib-suffix', > > + type : 'string', > > + value : '', > > + description : 'Suffix to append to EGL library name. Default: none.' > > +) > > +option( > > + 'gles-lib-suffix', > > + type : 'string', > > + value : '', > > + description : 'Suffix to append to GLES library names. Default: none.' > > +) > > diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am > > index 086a4a1e63..c3aeeea007 100644 > > --- a/src/egl/Makefile.am > > +++ b/src/egl/Makefile.am > > @@ -184,12 +184,12 @@ libEGL_mesa_la_LDFLAGS = \ > > > > else # USE_LIBGLVND > > > > -lib_LTLIBRARIES = libEGL.la > > -libEGL_la_SOURCES = > > -libEGL_la_LIBADD = \ > > +lib_LTLIBRARIES = libEGL@EGL_LIB_SUFFIX@.la > > +libEGL@EGL_LIB_SUFFIX@_la_SOURCES = > > +libEGL@EGL_LIB_SUFFIX@_la_LIBADD = \ > > libEGL_common.la \ > > $(top_builddir)/src/mapi/shared-glapi/libglapi.la > > -libEGL_la_LDFLAGS = \ > > +libEGL@EGL_LIB_SUFFIX@_la_LDFLAGS = \ > > -no-undefined \ > > -version-number 1:0 \ > > $(BSYMBOLIC) \ > > diff --git a/src/egl/meson.build b/src/egl/meson.build > > index 6537e4bdee..b833fd1729 100644 > > --- a/src/egl/meson.build > > +++ b/src/egl/meson.build > > @@ -148,7 +148,7 @@ if cc.has_function('mincore') > > endif > > > > if with_glvnd and get_option('egl-lib-suffix') != '' > error('''EGL lib suffix can't be used with libglvnd''') > endif > > > if not with_glvnd > > - egl_lib
Re: [Mesa-dev] [PATCH 04/13] i965/draw: Fix adding the stencil bo to the depth cache
On Wed, Jun 13, 2018 at 09:25:02AM +0300, Pohjolainen, Topi wrote:
> On Tue, Jun 12, 2018 at 12:21:56PM -0700, Nanley Chery wrote:
> > Fix the case where only stencil writes are enabled on a depth stencil
>
> Isn't this an issue even when depth writes are enabled? Both would add the
> same bo to cache?
>

You're right. The message should omit the word "only". I think we'd be adding
the same BO pre-SNB, but I'm not sure.

-Nanley

> > texture. Found by inspection.
> >
> > ---
> >
> > I'm looking into writing a test for this.
> >
> >  src/mesa/drivers/dri/i965/brw_draw.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c
> > index 271456e0f7d..71461d7b0a7 100644
> > --- a/src/mesa/drivers/dri/i965/brw_draw.c
> > +++ b/src/mesa/drivers/dri/i965/brw_draw.c
> > @@ -623,10 +623,10 @@ brw_postdraw_set_buffers_need_resolve(struct brw_context *brw)
> >     }
> >
> >     if (stencil_irb && brw->stencil_write_enabled) {
> > -      brw_depth_cache_add_bo(brw, stencil_irb->mt->bo);
> >        struct intel_mipmap_tree *stencil_mt =
> >           stencil_irb->mt->stencil_mt != NULL ?
> >           stencil_irb->mt->stencil_mt : stencil_irb->mt;
> > +      brw_depth_cache_add_bo(brw, stencil_mt->bo);
> >        intel_miptree_finish_write(brw, stencil_mt, stencil_irb->mt_level,
> >                                   stencil_irb->mt_layer,
> >                                   stencil_irb->layer_count,
> >                                   ISL_AUX_USAGE_NONE);
> > --
> > 2.17.0
Re: [Mesa-dev] [PATCH 08/13] i965/miptree: Share the miptree format in miptree_create
On Wed, Jun 13, 2018 at 09:33:41AM +0300, Pohjolainen, Topi wrote: > On Tue, Jun 12, 2018 at 12:22:00PM -0700, Nanley Chery wrote: > > --- > > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 30 +-- > > 1 file changed, 15 insertions(+), 15 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > index 03628e3fd9f..97de30076e0 100644 > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > @@ -696,8 +696,19 @@ miptree_create(struct brw_context *brw, > > if (devinfo->gen < 6 && _mesa_is_format_color_format(format)) > >tiling_flags &= ~ISL_TILING_Y0_BIT; > > > > + mesa_format mt_fmt; > > + if (_mesa_is_format_color_format(format)) { > > + mt_fmt = intel_lower_compressed_format(brw, format); > > + } else { > > + /* Fix up the Z miptree format for how we're splitting out separate > > + * stencil. Gen7 expects there to be no stencil bits in its depth > > buffer. > > + */ > > + mt_fmt = (devinfo->gen < 6) ? format : > > + intel_depth_format_for_depthstencil_format(format); > > + } > > I wonder if we need to add something of this sort for coverity not complaining > later on (I don't know if it is clever to know what > _mesa_is_format_color_format() does): > > } else { > unreachable("Format with invalid base"); > } > > Where would we be adding this unreachable? There is already an else case here. 
-Nanley > > + > > if (format == MESA_FORMAT_S_UINT8) > > - return make_surface(brw, target, format, first_level, last_level, > > + return make_surface(brw, target, mt_fmt, first_level, last_level, > >width0, height0, depth0, num_samples, > >tiling_flags, > >ISL_SURF_USAGE_STENCIL_BIT | > > @@ -709,13 +720,8 @@ miptree_create(struct brw_context *brw, > > const GLenum base_format = _mesa_get_format_base_format(format); > > if ((base_format == GL_DEPTH_COMPONENT || > > base_format == GL_DEPTH_STENCIL)) { > > - /* Fix up the Z miptree format for how we're splitting out separate > > - * stencil. Gen7 expects there to be no stencil bits in its depth > > buffer. > > - */ > > - const mesa_format depth_only_format = > > - intel_depth_format_for_depthstencil_format(format); > >struct intel_mipmap_tree *mt = make_surface( > > - brw, target, devinfo->gen >= 6 ? depth_only_format : format, > > + brw, target, mt_fmt, > > first_level, last_level, > > width0, height0, depth0, num_samples, tiling_flags, > > ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_TEXTURE_BIT, > > @@ -733,19 +739,13 @@ miptree_create(struct brw_context *brw, > >return mt; > > } > > > > - mesa_format tex_format = format; > > - mesa_format etc_format = MESA_FORMAT_NONE; > > uint32_t alloc_flags = 0; > > > > - format = intel_lower_compressed_format(brw, format); > > - > > - etc_format = (format != tex_format) ? tex_format : MESA_FORMAT_NONE; > > - > > if (flags & MIPTREE_CREATE_BUSY) > >alloc_flags |= BO_ALLOC_BUSY; > > > > struct intel_mipmap_tree *mt = make_surface( > > - brw, target, format, > > + brw, target, mt_fmt, > > first_level, last_level, > > width0, height0, depth0, > > num_samples, tiling_flags, > > @@ -755,7 +755,7 @@ miptree_create(struct brw_context *brw, > > if (!mt) > >return NULL; > > > > - mt->etc_format = etc_format; > > + mt->etc_format = (mt_fmt != format) ? 
format : MESA_FORMAT_NONE; > > > > if (!(flags & MIPTREE_CREATE_NO_AUX)) > >intel_miptree_choose_aux_usage(brw, mt); > > -- > > 2.17.0 > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] intel/compiler: Properly consider UBO loads that cross 32B boundaries.
I just reverted this in master because it regressed about 30K Vulkan CTS tests. More investigation needed? On Wed, Jun 13, 2018 at 2:07 AM, Kenneth Graunke wrote: > On Tuesday, June 12, 2018 1:38:03 PM PDT Rafael Antognolli wrote: > > On Mon, Jun 11, 2018 at 02:01:49PM -0700, Kenneth Graunke wrote: > > > The UBO push analysis pass incorrectly assumed that all values would > fit > > > within a 32B chunk, and only recorded a bit for the 32B chunk > containing > > > the starting offset. > > > > > > For example, if a UBO contained the following, tightly packed: > > > > > >vec4 a; // [0, 16) > > >float b; // [16, 20) > > >vec4 c; // [20, 36) > > > > > > then, c would start at offset 20 / 32 = 0 and end at 36 / 32 = 1, > > > which means that we ought to record two 32B chunks in the bitfield. > > > > > > Similarly, dvec4s would suffer from the same problem. > > > --- > > > src/intel/compiler/brw_nir_analyze_ubo_ranges.c | 8 +++- > > > 1 file changed, 7 insertions(+), 1 deletion(-) > > > > > > diff --git a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c > b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c > > > index d58fe3dd2e3..6d6ccf73ade 100644 > > > --- a/src/intel/compiler/brw_nir_analyze_ubo_ranges.c > > > +++ b/src/intel/compiler/brw_nir_analyze_ubo_ranges.c > > > @@ -141,10 +141,16 @@ analyze_ubos_block(struct ubo_analysis_state > *state, nir_block *block) > > > if (offset >= 64) > > > continue; > > > > > > + /* The value might span multiple 32-byte chunks. */ > > > + const int bytes = nir_intrinsic_dest_components(intrin) * > > > + (nir_dest_bit_size(intrin->dest) / 8); > > > + const int end = DIV_ROUND_UP(offset_const->u32[0] + bytes, > 32); > > > + const int regs = end - offset + 1; > > > + > > > > But if I understood it correctly, offset is the first 32B chunk within > > the UBO block (it's actually an ubo "chunk offset"). 
And you calculate > > bytes by taking the number of components times the size of each > > component of the nir_intrinsic_load_ubo instruction (which apparently > > supports multiple components). So yeah, this makes sense to me. > > Yeah, that's exactly right. load_ubo can load up to 4 components. > > > Take this review with a grain of salt (assuming what I wrote above is > > correct), but this looks simple enough. So it is > > > > Reviewed-by: Rafael Antognolli > > Thanks! > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
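Editor's note: the arithmetic discussed in this thread (a load that
starts in one 32B chunk and ends in the next) can be sketched in
isolation as below. This is only an illustration under stated
assumptions, not the code from the patch and not whatever fix eventually
landed; `chunk_range`, `chunks_for_load` and `chunk_mask` are
hypothetical names that do not exist in Mesa.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_BYTES 32u  /* granularity used by the analysis in the thread */

/* Hypothetical helper type: the run of 32-byte chunks a load touches. */
struct chunk_range {
   unsigned first;  /* index of the first chunk touched */
   unsigned count;  /* number of consecutive chunks touched */
};

/* Compute the chunks covered by a load of `components` values of
 * `bit_size` bits each, starting at `byte_offset` within the UBO.
 */
static struct chunk_range
chunks_for_load(uint32_t byte_offset, unsigned components, unsigned bit_size)
{
   assert(components > 0 && bit_size >= 8);
   const uint32_t bytes = components * (bit_size / 8);
   const uint32_t first = byte_offset / CHUNK_BYTES;
   /* The last byte read is at byte_offset + bytes - 1. */
   const uint32_t last = (byte_offset + bytes - 1) / CHUNK_BYTES;
   return (struct chunk_range){ first, last - first + 1 };
}

/* Set one bit per touched chunk, as a 64-entry bitfield would. */
static uint64_t
chunk_mask(struct chunk_range r)
{
   assert(r.count >= 1 && r.first + r.count <= 64);
   const uint64_t ones = r.count == 64 ? ~0ull : ((1ull << r.count) - 1);
   return ones << r.first;
}
```

With the layout from the commit message, `vec4 c` at [20, 36) gives
`chunks_for_load(20, 4, 32)`, which covers chunks 0 and 1, so two bits
are set in the mask.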
Re: [Mesa-dev] [PATCH 10/13] i965/miptree: Add and use mt_surf_usage
On Wed, Jun 13, 2018 at 09:39:08AM +0300, Pohjolainen, Topi wrote: > On Tue, Jun 12, 2018 at 12:22:02PM -0700, Nanley Chery wrote: > > --- > > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 40 --- > > 1 file changed, 26 insertions(+), 14 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > index cfb83d15ecc..5e00da86d32 100644 > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > @@ -677,6 +677,23 @@ make_separate_stencil_surface(struct brw_context *brw, > > return true; > > } > > > > +/* Return the usual surface usage flags for the given format. */ > > +static isl_surf_usage_flags_t > > +mt_surf_usage(mesa_format format) > > +{ > > + switch(_mesa_get_format_base_format(format)) { > > + case GL_DEPTH_COMPONENT: > > + return ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_TEXTURE_BIT; > > + case GL_DEPTH_STENCIL: > > + return ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_STENCIL_BIT | > > + ISL_SURF_USAGE_TEXTURE_BIT; > > + case GL_STENCIL_INDEX: > > + return ISL_SURF_USAGE_STENCIL_BIT | ISL_SURF_USAGE_TEXTURE_BIT; > > + default: > > + return ISL_SURF_USAGE_RENDER_TARGET_BIT | ISL_SURF_USAGE_TEXTURE_BIT; > > + } > > +} > > + > > static struct intel_mipmap_tree * > > miptree_create(struct brw_context *brw, > > GLenum target, > > @@ -713,8 +730,7 @@ miptree_create(struct brw_context *brw, > >return make_surface(brw, target, mt_fmt, first_level, last_level, > >width0, height0, depth0, num_samples, > >tiling_flags, > > - ISL_SURF_USAGE_STENCIL_BIT | > > - ISL_SURF_USAGE_TEXTURE_BIT, > > New logic also sets ISL_SURF_USAGE_DEPTH_BIT here. > How so? The base format of MESA_FORMAT_S_UINT8 is GL_STENCIL_INDEX. 
> > + mt_surf_usage(mt_fmt), > >alloc_flags, > >0, > >NULL); > > @@ -726,7 +742,7 @@ miptree_create(struct brw_context *brw, > > brw, target, mt_fmt, > > first_level, last_level, > > width0, height0, depth0, num_samples, tiling_flags, > > - ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_TEXTURE_BIT, > > + mt_surf_usage(mt_fmt), > > alloc_flags, 0, NULL); > > > >if (needs_separate_stencil(brw, mt, format) && > > @@ -746,8 +762,7 @@ miptree_create(struct brw_context *brw, > > first_level, last_level, > > width0, height0, depth0, > > num_samples, tiling_flags, > > - ISL_SURF_USAGE_RENDER_TARGET_BIT | > > - ISL_SURF_USAGE_TEXTURE_BIT, > > + mt_surf_usage(mt_fmt), > > alloc_flags, 0, NULL); > > if (!mt) > >return NULL; > > @@ -816,12 +831,11 @@ intel_miptree_create_for_bo(struct brw_context *brw, > > > > if ((base_format == GL_DEPTH_COMPONENT || > > base_format == GL_DEPTH_STENCIL)) { > > - const mesa_format depth_only_format = > > - intel_depth_format_for_depthstencil_format(format); > > - mt = make_surface(brw, target, > > -devinfo->gen >= 6 ? depth_only_format : format, > > + mesa_format mt_fmt = (devinfo->gen < 6) ? format : > > + > > intel_depth_format_for_depthstencil_format(format); > > + mt = make_surface(brw, target, mt_fmt, > > 0, 0, width, height, depth, 1, ISL_TILING_Y0_BIT, > > -ISL_SURF_USAGE_DEPTH_BIT | > > ISL_SURF_USAGE_TEXTURE_BIT, > > +mt_surf_usage(mt_fmt), > > 0, pitch, bo); > >if (!mt) > > return NULL; > > @@ -836,8 +850,7 @@ intel_miptree_create_for_bo(struct brw_context *brw, > >mt = make_surface(brw, target, MESA_FORMAT_S_UINT8, > > 0, 0, width, height, depth, 1, > > ISL_TILING_W_BIT, > > -ISL_SURF_USAGE_STENCIL_BIT | > > -ISL_SURF_USAGE_TEXTURE_BIT, > > +mt_surf_usage(MESA_FORMAT_S_UINT8), > > Same here, new logic also sets ISL_SURF_USAGE_DEPTH_BIT here. > How so? 
-Nanley > > 0, pitch, bo); > >if (!mt) > > return NULL; > > @@ -862,8 +875,7 @@ intel_miptree_create_for_bo(struct brw_context *brw, > > mt = make_surface(brw, target, format, > > 0, 0, width, height, depth, 1, > > 1lu << tiling, > > - ISL_SURF_USAGE_RENDER_TARGET_BIT | > > - ISL_SURF_USAGE_T
[Mesa-dev] [Bug 106912] radv: 16-bit depth buffer causes artifacts in Shadow Warrior 2
https://bugs.freedesktop.org/show_bug.cgi?id=106912 --- Comment #1 from Samuel Pitoiset --- Can you explain how to reproduce the issue in-game? I would like to know if Vega is affected as well.
Re: [Mesa-dev] [PATCH 1/3] meson: Fix -latomic check
On Wed, Jun 13, 2018 at 8:37 AM, Dylan Baker wrote:
> Quoting Matt Turner (2018-06-12 17:50:20)
>> Commit 54ba73ef102f (configure.ac/meson.build: Fix -latomic test) fixed
>> some checks for -latomic, and then commit 54bbe600ec26 (configure.ac:
>> rework -latomic check) further extended the fixes in configure.ac but
>> not in Meson. This commit extends those fixes to the Meson tests.
>>
>> Fixes: 54bbe600ec26 (configure.ac: rework -latomic check)
>> ---
>>  meson.build | 8 +++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/meson.build b/meson.build
>> index 7dba52369b0..62200476216 100644
>> --- a/meson.build
>> +++ b/meson.build
>> @@ -836,7 +836,13 @@ endif
>>  # Check for GCC style atomics
>>  dep_atomic = null_dep
>>
>> -if cc.compiles('int main() { int n; return __atomic_load_n(&n, __ATOMIC_ACQUIRE); }',
>> +if cc.compiles('''#include <stdint.h>
>> +                  int main() {
>> +                    struct {
>> +                      uint64_t *v;
>> +                    } x;
>> +                    return (int)__atomic_load_n(x.v, __ATOMIC_ACQUIRE);
>> +                  }''',
>>                 name : 'GCC atomic builtins')
>>    pre_args += '-DUSE_GCC_ATOMIC_BUILTINS'
>>
>> --
>> 2.16.1
>>
>
> Should patches 2 and 3 be cc 18.1?

Yes, I will send you a list of 5 patches that I'd like to include in 18.1
including these.

(I *really* hate that we're using 64-bit atomics)
Re: [Mesa-dev] [PATCH 11/13] i965/miptree: Refactor miptree_create
On Wed, Jun 13, 2018 at 09:44:14AM +0300, Pohjolainen, Topi wrote: > On Tue, Jun 12, 2018 at 12:22:03PM -0700, Nanley Chery wrote: > > Enable a future patch to create the r8stencil_mt in this function. > > --- > > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 48 +-- > > 1 file changed, 12 insertions(+), 36 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > index 5e00da86d32..b078c759243 100644 > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > @@ -726,48 +726,24 @@ miptree_create(struct brw_context *brw, > > intel_depth_format_for_depthstencil_format(format); > > } > > > > - if (format == MESA_FORMAT_S_UINT8) > > - return make_surface(brw, target, mt_fmt, first_level, last_level, > > - width0, height0, depth0, num_samples, > > - tiling_flags, > > - mt_surf_usage(mt_fmt), > > - alloc_flags, > > - 0, > > - NULL); > > + struct intel_mipmap_tree *mt = > > + make_surface(brw, target, mt_fmt, first_level, last_level, > > + width0, height0, depth0, num_samples, > > + tiling_flags, mt_surf_usage(mt_fmt), > > + alloc_flags, 0, NULL); > > > > - const GLenum base_format = _mesa_get_format_base_format(format); > > - if ((base_format == GL_DEPTH_COMPONENT || > > -base_format == GL_DEPTH_STENCIL)) { > > - struct intel_mipmap_tree *mt = make_surface( > > - brw, target, mt_fmt, > > - first_level, last_level, > > - width0, height0, depth0, num_samples, tiling_flags, > > - mt_surf_usage(mt_fmt), > > - alloc_flags, 0, NULL); > > - > > - if (needs_separate_stencil(brw, mt, format) && > > - !make_separate_stencil_surface(brw, mt)) { > > + if (mt == NULL) > > + return NULL; > > + > > + if (needs_separate_stencil(brw, mt, format)) { > > + if (!make_separate_stencil_surface(brw, mt)) { > > intel_miptree_release(&mt); > > return NULL; > >} > > - > > - if (!(flags & MIPTREE_CREATE_NO_AUX)) > > - intel_miptree_choose_aux_usage(brw, mt); > > - 
> > - return mt; > > } > > > > - struct intel_mipmap_tree *mt = make_surface( > > - brw, target, mt_fmt, > > - first_level, last_level, > > - width0, height0, depth0, > > - num_samples, tiling_flags, > > - mt_surf_usage(mt_fmt), > > - alloc_flags, 0, NULL); > > - if (!mt) > > - return NULL; > > - > > - mt->etc_format = (mt_fmt != format) ? format : MESA_FORMAT_NONE; > > + if (_mesa_is_format_color_format(format) && mt_fmt != format) > > + mt->etc_format = format; > > This relies on MESA_FORMAT_NONE == 0 and make_surface() to use calloc(). > Should we play safe and: > > else > mt->etc_format = MESA_FORMAT_NONE; > Sure, I plan to change it to this: mt->etc_format = (_mesa_is_format_color_format(format) && mt_fmt != format) ? format : MESA_FORMAT_NONE; v2: Explicitly set etc_format to MESA_FORMAT_NONE (Topi) > > > > if (!(flags & MIPTREE_CREATE_NO_AUX)) > >intel_miptree_choose_aux_usage(brw, mt); > > -- > > 2.17.0 > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 106677] vmwgfx: atom (electron-based app) causes corruption, hangs
https://bugs.freedesktop.org/show_bug.cgi?id=106677

--- Comment #2 from Deepak ---
(In reply to David Cuthbert from comment #0)
> I'm filing this currently so I have a place to keep notes on this bug.
>
> Running the atom text editor under various OSes (tried Linux Mint 18.3,
> Ubuntu 18.04, and currently using Fedora 28) results in minor screen
> glitches, eventually followed by drawing going completely haywire. I
> recompiled vmwgfx.ko from the current HEAD which resulted in fewer glitches,
> but it never completely goes away.
>
> The hangs are always immediately preceded by:
> [drm:vmw_cmdbuf_work_func [vmwgfx]] *ERROR* Command "(null)" causing device
> error.
> [drm:vmw_cmdbuf_work_func [vmwgfx]] *ERROR* Command buffer offset is 28
> [drm:vmw_cmdbuf_work_func [vmwgfx]] *ERROR* Command size is 24

Hi David, thanks for the bug report. Do you see the command buffer error
only with the new top-of-tree vmwgfx?

I tried to reproduce this bug with a clean Ubuntu 18.04 and Atom installed
from the software center. Atom becomes unresponsive, but I couldn't see the
kernel command buffer errors. I will try with Fedora 28 later.
Re: [Mesa-dev] [PATCH 1/2] glsl: Don't copy propagate from SSBO or shared variables either
Reviewed-by: Caio Marcelo de Oliveira Filho

On Tue, Jun 12, 2018 at 03:48:13PM -0700, Ian Romanick wrote:
> From: Ian Romanick
>
> Since SSBOs can be written, copy propagating a read can cause the

Optional: maybe write "... can be written by other threads"?

> value to magically change. SSBO reads are also very expensive, so
> doing it twice will be slower.
>
> Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
> total instructions in shared programs: 14399120 -> 14399119 (<.01%)
> instructions in affected programs: 684 -> 683 (-0.15%)
> helped: 1
> HURT: 0
>
> total cycles in shared programs: 532978931 -> 532973113 (<.01%)
> cycles in affected programs: 530484 -> 524666 (-1.10%)
> helped: 1
> HURT: 0
>
> Signed-off-by: Ian Romanick
> Cc: mesa-sta...@lists.freedesktop.org
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774
> ---
>  src/compiler/glsl/opt_copy_propagation.cpp | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/compiler/glsl/opt_copy_propagation.cpp b/src/compiler/glsl/opt_copy_propagation.cpp
> index 6220aa86da9..206dffe4f1c 100644
> --- a/src/compiler/glsl/opt_copy_propagation.cpp
> +++ b/src/compiler/glsl/opt_copy_propagation.cpp
> @@ -347,6 +347,8 @@ ir_copy_propagation_visitor::add_copy(ir_assignment *ir)
>     if (lhs_var != NULL && rhs_var != NULL && lhs_var != rhs_var) {
>        if (lhs_var->data.mode != ir_var_shader_storage &&
>            lhs_var->data.mode != ir_var_shader_shared &&
> +          rhs_var->data.mode != ir_var_shader_storage &&
> +          rhs_var->data.mode != ir_var_shader_shared &&
>            lhs_var->data.precise == rhs_var->data.precise) {
>           _mesa_hash_table_insert(acp, lhs_var, rhs_var);
>        }
> --
> 2.14.4
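Editor's note: the hazard described in the commit message can be mimicked
in plain C. The sketch below is a single-threaded simulation, not GLSL IR
and not Mesa code: a global stands in for an SSBO or shared variable, and
a function call stands in for a write by another invocation. All names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for storage that some other invocation may write. */
static volatile uint32_t shared_value;

/* Simulates a write by another invocation. */
static void
other_invocation_writes(uint32_t v)
{
   shared_value = v;
}

/* What the shader author wrote: read once into a temporary and reuse
 * the temporary, even though the storage changes afterwards.
 */
static uint32_t
sum_with_single_read(uint32_t initial)
{
   shared_value = initial;
   const uint32_t tmp = shared_value;
   other_invocation_writes(tmp + 10);
   return tmp + tmp;
}

/* What results if each use of the temporary is replaced by a fresh read
 * of the storage: the uses now observe the later write, and the read is
 * performed twice.
 */
static uint32_t
sum_with_propagated_read(uint32_t initial)
{
   shared_value = initial;
   other_invocation_writes(shared_value + 10);
   const uint32_t a = shared_value;
   const uint32_t b = shared_value;
   return a + b;
}
```

Starting from 5, the first function returns 10 and the second returns 30,
which is the "value magically changes" effect, alongside the cost of the
extra read.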
Re: [Mesa-dev] [PATCH 2/2] glsl: Don't copy propagate elements from SSBO or shared variables either
Reviewed-by: Caio Marcelo de Oliveira Filho On Tue, Jun 12, 2018 at 03:48:14PM -0700, Ian Romanick wrote: > From: Ian Romanick > > Since SSBOs can be written, copy propagating a read can cause the > value to magically change. SSBO reads are also very expensive, so > doing it twice will be slower. > > The same shader was helped by this patch and the previous. > > Haswell, Broadwell, and Skylake had similar results. (Skylake shown) > total instructions in shared programs: 14399119 -> 14399113 (<.01%) > instructions in affected programs: 683 -> 677 (-0.88%) > helped: 1 > HURT: 0 > > total cycles in shared programs: 532973113 -> 532971865 (<.01%) > cycles in affected programs: 524666 -> 523418 (-0.24%) > helped: 1 > HURT: 0 > > Signed-off-by: Ian Romanick > Cc: mesa-sta...@lists.freedesktop.org > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106774 > --- > src/compiler/glsl/opt_copy_propagation_elements.cpp | 8 > 1 file changed, 8 insertions(+) > > diff --git a/src/compiler/glsl/opt_copy_propagation_elements.cpp > b/src/compiler/glsl/opt_copy_propagation_elements.cpp > index 8bae424a1d0..8975e727522 100644 > --- a/src/compiler/glsl/opt_copy_propagation_elements.cpp > +++ b/src/compiler/glsl/opt_copy_propagation_elements.cpp > @@ -544,6 +544,10 @@ > ir_copy_propagation_elements_visitor::add_copy(ir_assignment *ir) > if (!lhs || !(lhs->type->is_scalar() || lhs->type->is_vector())) >return; > > + if (lhs->var->data.mode == ir_var_shader_storage || > + lhs->var->data.mode == ir_var_shader_shared) > + return; > + > ir_dereference_variable *rhs = ir->rhs->as_dereference_variable(); > if (!rhs) { >ir_swizzle *swiz = ir->rhs->as_swizzle(); > @@ -560,6 +564,10 @@ > ir_copy_propagation_elements_visitor::add_copy(ir_assignment *ir) >orig_swizzle[3] = swiz->mask.w; > } > > + if (rhs->var->data.mode == ir_var_shader_storage || > + rhs->var->data.mode == ir_var_shader_shared) > + return; > + > /* Move the swizzle channels out to the positions they match in the > * 
destination. We don't want to have to rewrite the swizzle[] > * array every time we clear a bit of the write_mask. > -- > 2.14.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 10/13] i965/miptree: Add and use mt_surf_usage
On Wed, Jun 13, 2018 at 09:25:37AM -0700, Nanley Chery wrote: > On Wed, Jun 13, 2018 at 09:39:08AM +0300, Pohjolainen, Topi wrote: > > On Tue, Jun 12, 2018 at 12:22:02PM -0700, Nanley Chery wrote: > > > --- > > > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 40 --- > > > 1 file changed, 26 insertions(+), 14 deletions(-) > > > > > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > > index cfb83d15ecc..5e00da86d32 100644 > > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > > @@ -677,6 +677,23 @@ make_separate_stencil_surface(struct brw_context > > > *brw, > > > return true; > > > } > > > > > > +/* Return the usual surface usage flags for the given format. */ > > > +static isl_surf_usage_flags_t > > > +mt_surf_usage(mesa_format format) > > > +{ > > > + switch(_mesa_get_format_base_format(format)) { > > > + case GL_DEPTH_COMPONENT: > > > + return ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_TEXTURE_BIT; > > > + case GL_DEPTH_STENCIL: > > > + return ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_STENCIL_BIT | > > > + ISL_SURF_USAGE_TEXTURE_BIT; > > > + case GL_STENCIL_INDEX: > > > + return ISL_SURF_USAGE_STENCIL_BIT | ISL_SURF_USAGE_TEXTURE_BIT; > > > + default: > > > + return ISL_SURF_USAGE_RENDER_TARGET_BIT | > > > ISL_SURF_USAGE_TEXTURE_BIT; > > > + } > > > +} > > > + > > > static struct intel_mipmap_tree * > > > miptree_create(struct brw_context *brw, > > > GLenum target, > > > @@ -713,8 +730,7 @@ miptree_create(struct brw_context *brw, > > >return make_surface(brw, target, mt_fmt, first_level, last_level, > > >width0, height0, depth0, num_samples, > > >tiling_flags, > > > - ISL_SURF_USAGE_STENCIL_BIT | > > > - ISL_SURF_USAGE_TEXTURE_BIT, > > > > New logic also sets ISL_SURF_USAGE_DEPTH_BIT here. > > > > How so? The base format of MESA_FORMAT_S_UINT8 is GL_STENCIL_INDEX. 
Yeah, my bad, I misread completely, same further down, sorry for the noise :( > > > > + mt_surf_usage(mt_fmt), > > >alloc_flags, > > >0, > > >NULL); > > > @@ -726,7 +742,7 @@ miptree_create(struct brw_context *brw, > > > brw, target, mt_fmt, > > > first_level, last_level, > > > width0, height0, depth0, num_samples, tiling_flags, > > > - ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_TEXTURE_BIT, > > > + mt_surf_usage(mt_fmt), > > > alloc_flags, 0, NULL); > > > > > >if (needs_separate_stencil(brw, mt, format) && > > > @@ -746,8 +762,7 @@ miptree_create(struct brw_context *brw, > > > first_level, last_level, > > > width0, height0, depth0, > > > num_samples, tiling_flags, > > > - ISL_SURF_USAGE_RENDER_TARGET_BIT | > > > - ISL_SURF_USAGE_TEXTURE_BIT, > > > + mt_surf_usage(mt_fmt), > > > alloc_flags, 0, NULL); > > > if (!mt) > > >return NULL; > > > @@ -816,12 +831,11 @@ intel_miptree_create_for_bo(struct brw_context *brw, > > > > > > if ((base_format == GL_DEPTH_COMPONENT || > > > base_format == GL_DEPTH_STENCIL)) { > > > - const mesa_format depth_only_format = > > > - intel_depth_format_for_depthstencil_format(format); > > > - mt = make_surface(brw, target, > > > -devinfo->gen >= 6 ? depth_only_format : format, > > > + mesa_format mt_fmt = (devinfo->gen < 6) ? format : > > > + > > > intel_depth_format_for_depthstencil_format(format); > > > + mt = make_surface(brw, target, mt_fmt, > > > 0, 0, width, height, depth, 1, ISL_TILING_Y0_BIT, > > > -ISL_SURF_USAGE_DEPTH_BIT | > > > ISL_SURF_USAGE_TEXTURE_BIT, > > > +mt_surf_usage(mt_fmt), > > > 0, pitch, bo); > > >if (!mt) > > > return NULL; > > > @@ -836,8 +850,7 @@ intel_miptree_create_for_bo(struct brw_context *brw, > > >mt = make_surface(brw, target, MESA_FORMAT_S_UINT8, > > > 0, 0, width, height, depth, 1, > > > ISL_TILING_W_BIT, > > > -ISL_SURF_USAGE_STENCIL_BIT | > > > -ISL_SURF_USAGE_TEXTURE_BIT, > > > +mt_surf_usage(MESA_FORMAT_S_UINT8), > > > > Same here, new logic also sets ISL_SURF_USAGE_DEPTH_BIT here. > > > > How so? 
> > -Nanley > > > > 0, pitch, bo); > > >
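Editor's note: the point settled in this exchange is that a stencil-only
base format does not pick up the depth usage bit. A reduced model of the
quoted switch makes that easy to check. The enums below are stand-ins
with arbitrary values, not the real GL or ISL constants, and
`usage_for_base_format` is a hypothetical name.

```c
#include <assert.h>

/* Stand-ins only: NOT the real GLenum values. */
enum base_format {
   BASE_DEPTH_COMPONENT,
   BASE_DEPTH_STENCIL,
   BASE_STENCIL_INDEX,
   BASE_COLOR
};

/* Stand-ins only: NOT the real ISL_SURF_USAGE_* bit values. */
enum usage_bits {
   USAGE_RENDER_TARGET = 1u << 0,
   USAGE_DEPTH         = 1u << 1,
   USAGE_STENCIL       = 1u << 2,
   USAGE_TEXTURE       = 1u << 3
};

/* Same case structure as the mt_surf_usage() quoted in the patch. */
static unsigned
usage_for_base_format(enum base_format base)
{
   switch (base) {
   case BASE_DEPTH_COMPONENT:
      return USAGE_DEPTH | USAGE_TEXTURE;
   case BASE_DEPTH_STENCIL:
      return USAGE_DEPTH | USAGE_STENCIL | USAGE_TEXTURE;
   case BASE_STENCIL_INDEX:
      return USAGE_STENCIL | USAGE_TEXTURE;
   default:
      return USAGE_RENDER_TARGET | USAGE_TEXTURE;
   }
}
```

In this model the stencil-index case returns the stencil and texture bits
only, matching the flags that the patch replaces at the
MESA_FORMAT_S_UINT8 call sites.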
Re: [Mesa-dev] [PATCH 08/13] i965/miptree: Share the miptree format in miptree_create
On Wed, Jun 13, 2018 at 09:20:55AM -0700, Nanley Chery wrote: > On Wed, Jun 13, 2018 at 09:33:41AM +0300, Pohjolainen, Topi wrote: > > On Tue, Jun 12, 2018 at 12:22:00PM -0700, Nanley Chery wrote: > > > --- > > > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 30 +-- > > > 1 file changed, 15 insertions(+), 15 deletions(-) > > > > > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > > index 03628e3fd9f..97de30076e0 100644 > > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > > @@ -696,8 +696,19 @@ miptree_create(struct brw_context *brw, > > > if (devinfo->gen < 6 && _mesa_is_format_color_format(format)) > > >tiling_flags &= ~ISL_TILING_Y0_BIT; > > > > > > + mesa_format mt_fmt; > > > + if (_mesa_is_format_color_format(format)) { > > > + mt_fmt = intel_lower_compressed_format(brw, format); > > > + } else { > > > + /* Fix up the Z miptree format for how we're splitting out separate > > > + * stencil. Gen7 expects there to be no stencil bits in its depth > > > buffer. > > > + */ > > > + mt_fmt = (devinfo->gen < 6) ? format : > > > + intel_depth_format_for_depthstencil_format(format); > > > + } > > > > I wonder if we need to add something of this sort for coverity not > > complaining > > later on (I don't know if it is clever to know what > > _mesa_is_format_color_format() does): > > > > } else { > > unreachable("Format with invalid base"); > > } > > > > > > Where would we be adding this unreachable? There is already an else case here. Yeah, same thing as with the other patch, I was thinking the STENCIL_INDEX case in my head and somehow stopped reading what you actually had here. This is all fine, sorry. 
> > -Nanley > > > > + > > > if (format == MESA_FORMAT_S_UINT8) > > > - return make_surface(brw, target, format, first_level, last_level, > > > + return make_surface(brw, target, mt_fmt, first_level, last_level, > > >width0, height0, depth0, num_samples, > > >tiling_flags, > > >ISL_SURF_USAGE_STENCIL_BIT | > > > @@ -709,13 +720,8 @@ miptree_create(struct brw_context *brw, > > > const GLenum base_format = _mesa_get_format_base_format(format); > > > if ((base_format == GL_DEPTH_COMPONENT || > > > base_format == GL_DEPTH_STENCIL)) { > > > - /* Fix up the Z miptree format for how we're splitting out separate > > > - * stencil. Gen7 expects there to be no stencil bits in its depth > > > buffer. > > > - */ > > > - const mesa_format depth_only_format = > > > - intel_depth_format_for_depthstencil_format(format); > > >struct intel_mipmap_tree *mt = make_surface( > > > - brw, target, devinfo->gen >= 6 ? depth_only_format : format, > > > + brw, target, mt_fmt, > > > first_level, last_level, > > > width0, height0, depth0, num_samples, tiling_flags, > > > ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_TEXTURE_BIT, > > > @@ -733,19 +739,13 @@ miptree_create(struct brw_context *brw, > > >return mt; > > > } > > > > > > - mesa_format tex_format = format; > > > - mesa_format etc_format = MESA_FORMAT_NONE; > > > uint32_t alloc_flags = 0; > > > > > > - format = intel_lower_compressed_format(brw, format); > > > - > > > - etc_format = (format != tex_format) ? tex_format : MESA_FORMAT_NONE; > > > - > > > if (flags & MIPTREE_CREATE_BUSY) > > >alloc_flags |= BO_ALLOC_BUSY; > > > > > > struct intel_mipmap_tree *mt = make_surface( > > > - brw, target, format, > > > + brw, target, mt_fmt, > > > first_level, last_level, > > > width0, height0, depth0, > > > num_samples, tiling_flags, > > > @@ -755,7 +755,7 @@ miptree_create(struct brw_context *brw, > > > if (!mt) > > >return NULL; > > > > > > - mt->etc_format = etc_format; > > > + mt->etc_format = (mt_fmt != format) ? 
format : MESA_FORMAT_NONE; > > > > > > if (!(flags & MIPTREE_CREATE_NO_AUX)) > > >intel_miptree_choose_aux_usage(brw, mt); > > > -- > > > 2.17.0 > > > > > > ___ > > > mesa-dev mailing list > > > mesa-dev@lists.freedesktop.org > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 106677] vmwgfx: atom (electron-based app) causes corruption, hangs
https://bugs.freedesktop.org/show_bug.cgi?id=106677 --- Comment #3 from Thomas Hellström --- FWIW, no apparent problems on Fedora Rawhide with 4.18.0-rc0. /Thomas
[Mesa-dev] [PATCH] radv: don't fast clear HTILE for 16-bit depth surfaces on GFX8
This causes rendering issues in Shadow Warrior 2 with DXVK.

Cc: mesa-sta...@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106912
Signed-off-by: Samuel Pitoiset
---
 src/amd/vulkan/radv_meta_clear.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
index fae441ceb6..373072dd36 100644
--- a/src/amd/vulkan/radv_meta_clear.c
+++ b/src/amd/vulkan/radv_meta_clear.c
@@ -717,6 +717,14 @@ emit_fast_htile_clear(struct radv_cmd_buffer *cmd_buffer,
         if ((clear_value.depth != 0.0 && clear_value.depth != 1.0) || !(aspects & VK_IMAGE_ASPECT_DEPTH_BIT))
                 goto fail;

+        /* GFX8 only supports 32-bit depth surfaces but we can enable TC-compat
+         * HTILE for 16-bit surfaces if no Z planes are compressed. Though,
+         * fast HTILE clears don't seem to work.
+         */
+        if (cmd_buffer->device->physical_device->rad_info.chip_class == VI &&
+            iview->image->vk_format == VK_FORMAT_D16_UNORM)
+                goto fail;
+
         if (vk_format_aspects(iview->image->vk_format) & VK_IMAGE_ASPECT_STENCIL_BIT) {
                 if (clear_value.stencil != 0 || !(aspects & VK_IMAGE_ASPECT_STENCIL_BIT))
                         goto fail;
--
2.17.1
Re: [Mesa-dev] [PATCH] radv: don't fast clear HTILE for 16-bit depth surfaces on GFX8
Reviewed-by: Bas Nieuwenhuizen On Wed, Jun 13, 2018 at 8:19 PM, Samuel Pitoiset wrote: > This causes rendering issues in Shadow Warrior 2 with DXVK. > > Cc: mesa-sta...@lists.freedesktop.org > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106912 > Signed-off-by: Samuel Pitoiset > --- > src/amd/vulkan/radv_meta_clear.c | 8 > 1 file changed, 8 insertions(+) > > diff --git a/src/amd/vulkan/radv_meta_clear.c > b/src/amd/vulkan/radv_meta_clear.c > index fae441ceb6..373072dd36 100644 > --- a/src/amd/vulkan/radv_meta_clear.c > +++ b/src/amd/vulkan/radv_meta_clear.c > @@ -717,6 +717,14 @@ emit_fast_htile_clear(struct radv_cmd_buffer *cmd_buffer, > if ((clear_value.depth != 0.0 && clear_value.depth != 1.0) || > !(aspects & VK_IMAGE_ASPECT_DEPTH_BIT)) > goto fail; > > + /* GFX8 only supports 32-bit depth surfaces but we can enable > TC-compat > +* HTILE for 16-bit surfaces if no Z planes are compressed. Though, > +* fast HTILE clears don't seem to work. > +*/ > + if (cmd_buffer->device->physical_device->rad_info.chip_class == VI && > + iview->image->vk_format == VK_FORMAT_D16_UNORM) > + goto fail; > + > if (vk_format_aspects(iview->image->vk_format) & > VK_IMAGE_ASPECT_STENCIL_BIT) { > if (clear_value.stencil != 0 || !(aspects & > VK_IMAGE_ASPECT_STENCIL_BIT)) > goto fail; > -- > 2.17.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
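Editor's note: the two early-outs visible in the hunk above can be read
as a single predicate. The sketch below models only those two conditions;
the real emit_fast_htile_clear() has further checks (for example the
stencil ones in the surrounding context) that are left out here. The
enums and `can_fast_clear_depth` are hypothetical stand-ins, not radv
types.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the chip class and depth format checked in the patch. */
enum chip { CHIP_VI, CHIP_OTHER };
enum depth_format { FMT_D16_UNORM, FMT_D32_SFLOAT };

/* Returns false wherever the quoted code would `goto fail`. */
static bool
can_fast_clear_depth(enum chip chip, enum depth_format fmt,
                     float clear_depth, bool clears_depth_aspect)
{
   /* Only clears to exactly 0.0 or 1.0 that include the depth aspect. */
   if ((clear_depth != 0.0f && clear_depth != 1.0f) || !clears_depth_aspect)
      return false;

   /* New in the patch: no fast clear for 16-bit depth on VI (GFX8). */
   if (chip == CHIP_VI && fmt == FMT_D16_UNORM)
      return false;

   return true;
}
```

When the predicate is false, the quoted code jumps to its `fail` label
instead of emitting the fast clear.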
[Mesa-dev] [Bug 106912] radv: 16-bit depth buffer causes artifacts in Shadow Warrior 2
https://bugs.freedesktop.org/show_bug.cgi?id=106912

Samuel Pitoiset changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #2 from Samuel Pitoiset ---
Fixed.

https://cgit.freedesktop.org/mesa/mesa/commit/?id=51e23d34190076159129dd7b449b95a1ac3d4949
[Mesa-dev] [PATCH 2/2] radv: don't check for linear images in emit_fast_color_clear()
We don't enable CMASK for linear surfaces and addrlib only enables DCC for tiled surfaces.

Signed-off-by: Samuel Pitoiset
---
 src/amd/vulkan/radv_meta_clear.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
index 28050079f92..b52beb3861c 100644
--- a/src/amd/vulkan/radv_meta_clear.c
+++ b/src/amd/vulkan/radv_meta_clear.c
@@ -1008,8 +1008,6 @@ emit_fast_color_clear(struct radv_cmd_buffer *cmd_buffer,
 	if (iview->image->info.array_size != iview->layer_count)
 		goto fail;
 
-	if (iview->image->surface.is_linear)
-		goto fail;
 	if (!radv_image_extent_compare(iview->image, &iview->extent))
 		goto fail;
 
-- 
2.17.1
[Mesa-dev] [PATCH 1/2] radv: don't check the number of levels in emit_fast_color_clear()
This check is useless because we don't support DCC/CMASK for mipmaps.

Signed-off-by: Samuel Pitoiset
---
 src/amd/vulkan/radv_meta_clear.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan/radv_meta_clear.c
index fae441ceb66..28050079f92 100644
--- a/src/amd/vulkan/radv_meta_clear.c
+++ b/src/amd/vulkan/radv_meta_clear.c
@@ -1008,9 +1008,6 @@ emit_fast_color_clear(struct radv_cmd_buffer *cmd_buffer,
 	if (iview->image->info.array_size != iview->layer_count)
 		goto fail;
 
-	if (iview->image->info.levels > 1)
-		goto fail;
-
 	if (iview->image->surface.is_linear)
 		goto fail;
 	if (!radv_image_extent_compare(iview->image, &iview->extent))
-- 
2.17.1
[Mesa-dev] [Bug 106677] vmwgfx: atom (electron-based app) causes corruption, hangs
https://bugs.freedesktop.org/show_bug.cgi?id=106677 --- Comment #4 from David Cuthbert --- Note that it takes some fiddling to reproduce this currently (the exact trigger isn't known). I can go hours without seeing this issue. I've been banging my head against the wall trying to get my extra logging to work -- finally realized yesterday that vmwgfx.ko is being loaded in initramfs and not from my filesystem. I'm attempting to reproduce it now with a rebuilt initramfs.
[Mesa-dev] [Bug 106479] NDEBUG not defined for libamdgpu_addrlib
https://bugs.freedesktop.org/show_bug.cgi?id=106479

Samuel Pitoiset changed:

           What    |Removed |Added
             Status|NEEDINFO|RESOLVED
         Resolution|---     |FIXED

--- Comment #3 from Samuel Pitoiset ---
This has been fixed by Bas.

https://cgit.freedesktop.org/mesa/mesa/commit/?id=62e0e089d710835d9f79138377bcc37147f75ebd
[Mesa-dev] [Bug 106696] repeatable drm:amdgpu_job_timedout with vulkan toy
https://bugs.freedesktop.org/show_bug.cgi?id=106696

Samuel Pitoiset changed:

           What    |Removed |Added
         Resolution|---     |NOTOURBUG
             Status|REOPENED|RESOLVED

--- Comment #8 from Samuel Pitoiset ---
As Nicolai said, this is a known issue. Definitely unrelated to RADV. Please don't re-open, thanks!
Re: [Mesa-dev] [PATCH v3 3/3] egl/android: Add DRM node probing and filtering
On Wed, Jun 13, 2018 at 12:19 PM, Amit Pundir wrote:
> On 13 June 2018 at 20:45, Rob Herring wrote:
>>
>> +Amit and John
>>
>> On Sat, Jun 9, 2018 at 11:27 AM, Robert Foss wrote:
>> > This patch both adds support for probing & filtering DRM nodes
>> > and switches away from using the GRALLOC_MODULE_PERFORM_GET_DRM_FD
>> > gralloc call.
>> >
>> > Currently the filtering is based just on the driver name,
>> > and the desired name is supplied using the "drm.gpu.vendor_name"
>> > Android property.
>>
>> There's a potential issue with this whole approach and that is
>> SELinux. With the way SELinux locks down accesses, getting probing
>> thru device files to work can be a pain. It may be better now than the
>> prior version because sysfs is not probed. I'll leave it to Amit or
>> John to comment.
>
> Right.. so ICYMI, this patch is already pulled into external/mesa3d
> project of AOSP and I stumbled upon one such /dev/dri/ access denial
> on db820c recently.

A prior version of the patch series which accesses sysfs too (via libdrm).

>
> In AOSP, zygote spawned apps already have access to GPU device nodes
> in the form of /dev/gpu_device file, but the missing part is the

It's "gpu_device" in terms of a SELinux context, right? Not an actual
/dev path?

> open-read access to "/dev/dri/" which need to be allowed explicitly.

Or we need a way to just open a specific device.

Rob
[Mesa-dev] [Bug 106756] Wine 3.9 crashes with DXVK on Just Cause 3 and Quantum Break on VEGA but works ON POLARIS
https://bugs.freedesktop.org/show_bug.cgi?id=106756

Samuel Pitoiset changed:

           What    |Removed |Added
             Status|NEW     |NEEDINFO
[Mesa-dev] [PATCH 0/8] i965: Don't recycle BOs until they are idle
The current BO cache puts BOs back into the recycle bucket the moment the refcount hits zero. If the BO is busy, we just don't re-use it until it is idle, or we re-use it for a render target, which we assume will be used first for drawing. This patch series reworks the BO cache a bit so that we never recycle a busy BO.

On the downside, it means that we don't get the "keep busy BOs busy" heuristic (which we have no proof actually helps). On the upside, we can now easily use an MRU heuristic instead of round-robin for all buffers and not just the busy ones. Will this be an improvement, a regression or a wash? I don't know, but I doubt it will have a major effect one way or another.

Jason Ekstrand (8):
  i965/bufmgr: Bail early in bo_busy if the BO is flagged idle
  i965/miptree: Stop setting BO_ALLOC_BUSY
  i965/bufmgr: Drop the BO_ALLOC_BUSY flag
  i965/bufmgr: Add a garbage collection mechanism
  i965/batch: Use brw_bo_unreference_bos_when_idle
  i965: Call intel_finish before destroying the context
  i965/bufmgr: Don't allow busy BOs to be returned to the pool
  i965/bufmgr: Allocate from the tail of the bucket free list

 src/mesa/drivers/dri/i965/brw_bufmgr.c        | 186
 src/mesa/drivers/dri/i965/brw_bufmgr.h        |  18
 src/mesa/drivers/dri/i965/brw_context.c       |   8
 src/mesa/drivers/dri/i965/intel_batchbuffer.c |   8
 src/mesa/drivers/dri/i965/intel_fbo.c         |   2
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  29
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |   9
 src/mesa/drivers/dri/i965/intel_screen.c      |   2
 .../drivers/dri/i965/intel_tex_validate.c     |   2
 9 files changed, 182 insertions(+), 82 deletions(-)

-- 
2.17.1
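To make the policy described in this cover letter concrete, here is a minimal standalone sketch of a cache bucket that refuses busy buffers and hands out the most recently returned one. It is not the i965 code: `sketch_bo`, `sketch_bucket` and the helper functions are hypothetical names, and the `busy` flag stands in for what a real driver would ask the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object; not the real struct brw_bo. */
struct sketch_bo {
   int id;
   bool busy;                       /* assumed to mirror the kernel's answer */
   struct sketch_bo *prev, *next;
};

/* A bucket is a circular doubly linked list with a sentinel head node. */
struct sketch_bucket {
   struct sketch_bo head;
};

static void
sketch_bucket_init(struct sketch_bucket *b)
{
   b->head.prev = b->head.next = &b->head;
}

static bool
sketch_bucket_empty(const struct sketch_bucket *b)
{
   return b->head.next == &b->head;
}

/* Return a BO to the cache by appending it at the tail. Busy BOs are
 * rejected, so every entry in the bucket is immediately reusable.
 */
static bool
sketch_bucket_put(struct sketch_bucket *b, struct sketch_bo *bo)
{
   if (bo->busy)
      return false;

   bo->prev = b->head.prev;
   bo->next = &b->head;
   b->head.prev->next = bo;
   b->head.prev = bo;
   return true;
}

/* Take the most recently returned BO (the tail), or NULL if empty. */
static struct sketch_bo *
sketch_bucket_get_mru(struct sketch_bucket *b)
{
   if (sketch_bucket_empty(b))
      return NULL;

   struct sketch_bo *bo = b->head.prev;
   bo->prev->next = bo->next;
   bo->next->prev = bo->prev;
   bo->prev = bo->next = NULL;
   return bo;
}
```

Because nothing busy ever enters the list, the allocator does not need to probe entries before reusing them, which is what makes picking from the tail (MRU) as cheap as picking from the head.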
[Mesa-dev] [PATCH 3/8] i965/bufmgr: Drop the BO_ALLOC_BUSY flag
--- src/mesa/drivers/dri/i965/brw_bufmgr.c | 46 ++ src/mesa/drivers/dri/i965/brw_bufmgr.h | 1 - 2 files changed, 10 insertions(+), 37 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c index 58bb559fdee..e9d3daa5985 100644 --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c @@ -448,11 +448,6 @@ int brw_bo_busy(struct brw_bo *bo) { struct brw_bufmgr *bufmgr = bo->bufmgr; - - /* If we know it's idle, don't bother with the kernel round trip */ - if (bo->idle && !bo->external) - return false; - struct drm_i915_gem_busy busy = { .handle = bo->gem_handle }; int ret = drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_BUSY, &busy); @@ -506,20 +501,11 @@ bo_alloc_internal(struct brw_bufmgr *bufmgr, struct bo_cache_bucket *bucket; bool alloc_from_cache; uint64_t bo_size; - bool busy = false; bool zeroed = false; - if (flags & BO_ALLOC_BUSY) - busy = true; - if (flags & BO_ALLOC_ZEROED) zeroed = true; - /* BUSY does doesn't really jive with ZEROED as we have to wait for it to -* be idle before we can memset. Just disallow that combination. -*/ - assert(!(busy && zeroed)); - /* Round the allocated size up to a power of two number of pages. */ bucket = bucket_for_size(bufmgr, size); @@ -539,29 +525,17 @@ bo_alloc_internal(struct brw_bufmgr *bufmgr, retry: alloc_from_cache = false; if (bucket != NULL && !list_empty(&bucket->head)) { - if (busy && !zeroed) { - /* Allocate new render-target BOs from the tail (MRU) - * of the list, as it will likely be hot in the GPU - * cache and in the aperture for us. If the caller - * asked us to zero the buffer, we don't want this - * because we are going to mmap it. - */ - bo = LIST_ENTRY(struct brw_bo, bucket->head.prev, head); - list_del(&bo->head); + /* For non-render-target BOs (where we're probably + * going to map it first thing in order to fill it + * with data), check if the last BO in the cache is + * unbusy, and only reuse in that case. 
Otherwise, + * allocating a new buffer is probably faster than + * waiting for the GPU to finish. + */ + bo = LIST_ENTRY(struct brw_bo, bucket->head.next, head); + if (!brw_bo_busy(bo)) { alloc_from_cache = true; - } else { - /* For non-render-target BOs (where we're probably - * going to map it first thing in order to fill it - * with data), check if the last BO in the cache is - * unbusy, and only reuse in that case. Otherwise, - * allocating a new buffer is probably faster than - * waiting for the GPU to finish. - */ - bo = LIST_ENTRY(struct brw_bo, bucket->head.next, head); - if (!brw_bo_busy(bo)) { -alloc_from_cache = true; -list_del(&bo->head); - } + list_del(&bo->head); } if (alloc_from_cache) { diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.h b/src/mesa/drivers/dri/i965/brw_bufmgr.h index 32fc7a553c9..d3b3aadc0db 100644 --- a/src/mesa/drivers/dri/i965/brw_bufmgr.h +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.h @@ -195,7 +195,6 @@ struct brw_bo { bool cache_coherent; }; -#define BO_ALLOC_BUSY (1<<0) #define BO_ALLOC_ZEROED (1<<1) /** -- 2.17.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 6/8] i965: Call intel_finish before destroying the context
---
 src/mesa/drivers/dri/i965/brw_context.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 9ced230ec14..98ec54f2ae3 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -1099,6 +1099,14 @@ intelDestroyContext(__DRIcontext * driContextPriv)
       (struct brw_context *) driContextPriv->driverPrivate;
    struct gl_context *ctx = &brw->ctx;
 
+   /* Wait for any outstanding rendering to be completed before we start
+    * freeing anything. It's probably safe to destroy the context while stuff
+    * is still in flight since the kernel will reference count our BOs. This
+    * just ensures that everything is safe before we start destroying things
+    * in case doing so has any side-effects.
+    */
+   intel_finish(ctx);
+
    _mesa_meta_free(&brw->ctx);
 
    if (INTEL_DEBUG & DEBUG_SHADER_TIME) {
-- 
2.17.1
[Mesa-dev] [PATCH 7/8] i965/bufmgr: Don't allow busy BOs to be returned to the pool
--- src/mesa/drivers/dri/i965/brw_bufmgr.c | 51 -- 1 file changed, 32 insertions(+), 19 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c index cfa32ff3726..ef918315c65 100644 --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c @@ -524,6 +524,14 @@ bo_alloc_internal(struct brw_bufmgr *bufmgr, bo_size = page_size; } else { bo_size = bucket->size; + + /* If there's nothing in the bucket, call bufmgr_collect in the hopes + * that maybe we can free and re-use an old BO. It should be safe to + * call list_empty() without taking a lock since it's just a pointer + * comparison and nothing bad will happen if we get it wrong. + */ + if (list_empty(&bucket->head)) + brw_bufmgr_collect(bufmgr); } mtx_lock(&bufmgr->lock); @@ -539,31 +547,29 @@ retry: * waiting for the GPU to finish. */ bo = LIST_ENTRY(struct brw_bo, bucket->head.next, head); - if (!brw_bo_busy(bo)) { - alloc_from_cache = true; - list_del(&bo->head); + assert(!brw_bo_busy(bo)); + + alloc_from_cache = true; + list_del(&bo->head); + + if (!brw_bo_madvise(bo, I915_MADV_WILLNEED)) { + bo_free(bo); + brw_bo_cache_purge_bucket(bufmgr, bucket); + goto retry; } - if (alloc_from_cache) { - if (!brw_bo_madvise(bo, I915_MADV_WILLNEED)) { -bo_free(bo); -brw_bo_cache_purge_bucket(bufmgr, bucket); -goto retry; - } + if (bo_set_tiling_internal(bo, tiling_mode, stride)) { + bo_free(bo); + goto retry; + } - if (bo_set_tiling_internal(bo, tiling_mode, stride)) { + if (zeroed) { + void *map = brw_bo_map(NULL, bo, MAP_WRITE | MAP_RAW); + if (!map) { bo_free(bo); goto retry; } - - if (zeroed) { -void *map = brw_bo_map(NULL, bo, MAP_WRITE | MAP_RAW); -if (!map) { - bo_free(bo); - goto retry; -} -memset(map, 0, bo_size); - } + memset(map, 0, bo_size); } } @@ -871,6 +877,13 @@ bo_unreference_final(struct brw_bo *bo, time_t time) DBG("bo_unreference final: %d (%s)\n", bo->gem_handle, bo->name); + /* The only way an internal BO can be busy is 
if it's in use by one of our +* (this screen's) batch buffers. Since we always wait for the batch to be +* idle before we unref the BOs it references, we can never get here with a +* busy internal BO. +*/ + assert(bo->external || !brw_bo_busy(bo)); + bucket = bucket_for_size(bufmgr, bo->size); /* Put the buffer into our internal cache for reuse if we can. */ if (bufmgr->bo_reuse && bo->reusable && bucket != NULL && -- 2.17.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/8] i965/bufmgr: Add a garbage collection mechanism
While we can always trust the kernel to reference count things and not actually free any memory until the GPU is done with it, that may not actually do what we want. We have to be careful, for instance, with recycling buffers that we might immediately map. This commit provides a tagging mechanism that we can use to avoid unreferencing a BO until some other BO (presumably a batch) goes idle. The next commit will actually start using the new mechanism. --- src/mesa/drivers/dri/i965/brw_bufmgr.c | 102 + src/mesa/drivers/dri/i965/brw_bufmgr.h | 17 + 2 files changed, 119 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c index e9d3daa5985..cfa32ff3726 100644 --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c @@ -158,6 +158,12 @@ struct brw_bufmgr { bool bo_reuse:1; uint64_t initial_kflags; + + /* List of struct bo_idle_unref_request +* +* See also brw_bufmgr_collect() +*/ + struct list_head unref_requests; }; static int bo_set_tiling_internal(struct brw_bo *bo, uint32_t tiling_mode, @@ -904,6 +910,92 @@ brw_bo_unreference(struct brw_bo *bo) } } +struct bo_idle_unref_request { + struct brw_bo *wait_bo; + + struct list_head link; + + unsigned num_unref_bos; + struct brw_bo *unref_bos[0]; +}; + +void +brw_bo_unreference_bos_when_idle(struct brw_bo *wait_bo, + struct brw_bo **unref_bos, + unsigned num_unref_bos) +{ + struct brw_bufmgr *bufmgr = wait_bo->bufmgr; + + struct bo_idle_unref_request *req = + malloc(sizeof(*req) + num_unref_bos * sizeof(req->unref_bos[0])); + + if (req == NULL) { + /* This should never happen. If it does, we can always just stall and + * then unreference everything. 
+ */ + brw_bo_wait_rendering(wait_bo); + for (unsigned i = 0; i < num_unref_bos; i++) + brw_bo_unreference(unref_bos[i]); + return; + } + + req->wait_bo = wait_bo; + brw_bo_reference(wait_bo); + + req->num_unref_bos = num_unref_bos; + memcpy(req->unref_bos, unref_bos, num_unref_bos * sizeof(*unref_bos)); + + mtx_lock(&bufmgr->lock); + list_addtail(&req->link, &bufmgr->unref_requests); + mtx_unlock(&bufmgr->lock); +} + +static void +bufmgr_collect(struct brw_bufmgr *bufmgr, bool wait) +{ + mtx_lock(&bufmgr->lock); + + struct list_head idle_list; + list_inithead(&idle_list); + + /* Move all entries with idle BOs into the idle list */ + list_for_each_entry_safe(struct bo_idle_unref_request, req, +&bufmgr->unref_requests, link) { + if (wait) { + /* This case is only for when we're destroying the bufmgr so nothing + * should ever be busy. We'll wait on it in release builds just to + * make sure. + */ + assert(!brw_bo_busy(req->wait_bo)); + brw_bo_wait(req->wait_bo, -1); + } else if (brw_bo_busy(req->wait_bo)) { + continue; + } + + list_del(&req->link); + list_addtail(&req->link, &idle_list); + } + + /* Drop the lock before we start unreferencing things */ + mtx_unlock(&bufmgr->lock); + + list_for_each_entry_safe(struct bo_idle_unref_request, req, +&idle_list, link) { + brw_bo_unreference(req->wait_bo); + for (unsigned i = 0; i < req->num_unref_bos; i++) + brw_bo_unreference(req->unref_bos[i]); + list_del(&req->link); + free(req); + } + assert(list_empty(&idle_list)); +} + +void +brw_bufmgr_collect(struct brw_bufmgr *bufmgr) +{ + bufmgr_collect(bufmgr, false); +} + static void bo_wait_with_stall_warning(struct brw_context *brw, struct brw_bo *bo, @@ -1270,12 +1362,20 @@ brw_bo_wait(struct brw_bo *bo, int64_t timeout_ns) bo->idle = true; + /* We just had to call into the kernel to wait on a BO, something is now +* idle so we may as well garbage collect. 
+*/ + brw_bufmgr_collect(bufmgr); + return ret; } void brw_bufmgr_destroy(struct brw_bufmgr *bufmgr) { + bufmgr_collect(bufmgr, true); + assert(list_empty(&bufmgr->unref_requests)); + mtx_destroy(&bufmgr->lock); /* Free any cached buffer objects we were going to reuse */ @@ -1731,5 +1831,7 @@ brw_bufmgr_init(struct gen_device_info *devinfo, int fd) bufmgr->handle_table = _mesa_hash_table_create(NULL, key_hash_uint, key_uint_equal); + list_inithead(&bufmgr->unref_requests); + return bufmgr; } diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.h b/src/mesa/drivers/dri/i965/brw_bufmgr.h index d3b3aadc0db..644ba3a47aa 100644 --- a/src/mesa/drivers/dri/i965/brw_bufmgr.h +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.h @@ -262,6 +262,23 @@ brw_bo_reference(struct brw_bo *bo) */ void brw_bo_unreference(struct brw_bo *bo); +/** + * Release references on a list of BOs when the given BO becomes idle. + * + *
[Mesa-dev] [PATCH 1/8] i965/bufmgr: Bail early in bo_busy if the BO is flagged idle
This has the potential to make brw_bo_busy a bit cheaper for internal BOs if someone has checked it for busy or waited on it before. We already do the same thing in brw_bo_wait.
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index 7ac3bcad3da..58bb559fdee 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -448,6 +448,11 @@ int
 brw_bo_busy(struct brw_bo *bo)
 {
    struct brw_bufmgr *bufmgr = bo->bufmgr;
+
+   /* If we know it's idle, don't bother with the kernel round trip */
+   if (bo->idle && !bo->external)
+      return false;
+
    struct drm_i915_gem_busy busy = { .handle = bo->gem_handle };
 
    int ret = drmIoctl(bufmgr->fd, DRM_IOCTL_I915_GEM_BUSY, &busy);
-- 
2.17.1
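The early-out in this patch can be pictured with a small standalone sketch. Everything here is hypothetical: `cached_bo` is not `struct brw_bo`, and `cached_bo_query_kernel()` only counts calls in place of the real `DRM_IOCTL_I915_GEM_BUSY` round trip. The sketch also assumes that the busy check records an idle result for later calls, and that shared (external) buffers are excluded because something outside this process could make them busy again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical buffer object used only for this illustration. */
struct cached_bo {
   bool idle;            /* last known state: true means known idle */
   bool external;        /* shared with another process */
   bool kernel_busy;     /* what the simulated kernel would report */
   int kernel_queries;   /* how many simulated round trips were made */
};

/* Stand-in for the kernel query; the real driver issues an ioctl here. */
static bool
cached_bo_query_kernel(struct cached_bo *bo)
{
   bo->kernel_queries++;
   return bo->kernel_busy;
}

static bool
cached_bo_busy(struct cached_bo *bo)
{
   /* If we already know an internal BO is idle, skip the round trip. */
   if (bo->idle && !bo->external)
      return false;

   bool busy = cached_bo_query_kernel(bo);
   bo->idle = !busy;     /* assumption: remember the answer for next time */
   return busy;
}
```

In this model, a second call on an idle internal buffer costs nothing, while an external buffer is always re-checked regardless of the cached flag.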
[Mesa-dev] [PATCH 8/8] i965/bufmgr: Allocate from the tail of the bucket free list
The previous approach gave a sort of round-robin behavior which made sense because we didn't want to walk the entire list looking for the first idle BO. Now that everything is idle, we can pick any BO in the list and it should be fine. Using the most recently used BO should give us less overall thrash than the round-robin because we will be trying to re-use BOs as much as possible.
---
 src/mesa/drivers/dri/i965/brw_bufmgr.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index ef918315c65..02aea435e84 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -539,14 +539,10 @@ bo_alloc_internal(struct brw_bufmgr *bufmgr,
 retry:
    alloc_from_cache = false;
    if (bucket != NULL && !list_empty(&bucket->head)) {
-      /* For non-render-target BOs (where we're probably
-       * going to map it first thing in order to fill it
-       * with data), check if the last BO in the cache is
-       * unbusy, and only reuse in that case. Otherwise,
-       * allocating a new buffer is probably faster than
-       * waiting for the GPU to finish.
+      /* Allocate BOs from the tail (MRU) of the list as it will likely be
+       * hotter in the GPU cache and in the aperture for us.
        */
-      bo = LIST_ENTRY(struct brw_bo, bucket->head.next, head);
+      bo = LIST_ENTRY(struct brw_bo, bucket->head.prev, head);
       assert(!brw_bo_busy(bo));
 
       alloc_from_cache = true;
-- 
2.17.1
[Mesa-dev] [PATCH 5/8] i965/batch: Use brw_bo_unreference_bos_when_idle
Instead of unreferencing all the BOs used by the freshly submitted batch directly, ask the bufmgr to unref them for us once the batch goes idle. This should more-or-less have the same effect except that we now wait to unref the BOs until the batch is idle.
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index df999ffeb1d..127d0c34bea 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -535,10 +535,12 @@ static void
 brw_new_batch(struct brw_context *brw)
 {
    /* Unreference any BOs held by the previous batch, and reset counts. */
-   for (int i = 0; i < brw->batch.exec_count; i++) {
-      brw_bo_unreference(brw->batch.exec_bos[i]);
+   brw_bo_unreference_bos_when_idle(brw->batch.batch.bo,
+                                    brw->batch.exec_bos,
+                                    brw->batch.exec_count);
+
+   for (int i = 0; i < brw->batch.exec_count; i++)
       brw->batch.exec_bos[i] = NULL;
-   }
    brw->batch.batch_relocs.reloc_count = 0;
    brw->batch.state_relocs.reloc_count = 0;
    brw->batch.exec_count = 0;
-- 
2.17.1
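The deferred unreference described here can be modelled in a few lines of standalone C. This is only a sketch of the idea, not the bufmgr implementation: `gc_bo`, `gc_request` and `gc_collect()` are hypothetical names, the request queue is a plain array rather than a locked list, and the `busy` flag replaces the kernel query.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reference-counted buffer for this illustration. */
struct gc_bo {
   int refcount;
   bool busy;   /* assumed to reflect whether the GPU still uses it */
};

/* "Drop these references once wait_bo goes idle." */
struct gc_request {
   struct gc_bo *wait_bo;
   struct gc_bo **unref_bos;
   size_t num_unref_bos;
   bool done;
};

static void
gc_unref(struct gc_bo *bo)
{
   assert(bo->refcount > 0);
   bo->refcount--;
}

/* Retire every pending request whose wait_bo is idle and return how many
 * requests were retired. Requests on a busy wait_bo are left untouched.
 */
static size_t
gc_collect(struct gc_request *reqs, size_t num_reqs)
{
   size_t retired = 0;

   for (size_t i = 0; i < num_reqs; i++) {
      struct gc_request *req = &reqs[i];

      if (req->done || req->wait_bo->busy)
         continue;

      for (size_t j = 0; j < req->num_unref_bos; j++)
         gc_unref(req->unref_bos[j]);

      req->done = true;
      retired++;
   }
   return retired;
}
```

In this model the references held on behalf of a batch survive any number of collection passes while the batch is busy, and are dropped exactly once after it goes idle.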
[Mesa-dev] [PATCH 2/8] i965/miptree: Stop setting BO_ALLOC_BUSY
It was never all that useful and no one had really demonstrated the value of it in any concrete way. It is, however, a very easy way to run into trouble if you're not careful. Let's just drop it and hope to solve whatever problems it was solving in some other way. --- src/mesa/drivers/dri/i965/intel_fbo.c | 2 +- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 29 +++ src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 9 -- src/mesa/drivers/dri/i965/intel_screen.c | 2 +- .../drivers/dri/i965/intel_tex_validate.c | 2 +- 5 files changed, 14 insertions(+), 30 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c b/src/mesa/drivers/dri/i965/intel_fbo.c index fb84b738c08..5d446023d12 100644 --- a/src/mesa/drivers/dri/i965/intel_fbo.c +++ b/src/mesa/drivers/dri/i965/intel_fbo.c @@ -948,7 +948,7 @@ intel_renderbuffer_move_to_temp(struct brw_context *brw, 0, 0, width, height, 1, irb->mt->surf.samples, - MIPTREE_CREATE_BUSY); + MIPTREE_CREATE_DEFAULT); if (!invalidate) intel_miptree_copy_slice(brw, intel_image->mt, diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 6b89bf6848a..6a1d4fc670c 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -554,7 +554,7 @@ make_surface(struct brw_context *brw, GLenum target, mesa_format format, unsigned first_level, unsigned last_level, unsigned width0, unsigned height0, unsigned depth0, unsigned num_samples, isl_tiling_flags_t tiling_flags, - isl_surf_usage_flags_t isl_usage_flags, uint32_t alloc_flags, + isl_surf_usage_flags_t isl_usage_flags, unsigned row_pitch, struct brw_bo *bo) { struct intel_mipmap_tree *mt = calloc(sizeof(*mt), 1); @@ -630,7 +630,7 @@ make_surface(struct brw_context *brw, GLenum target, mesa_format format, BRW_MEMZONE_OTHER, isl_tiling_to_i915_tiling( mt->surf.tiling), - mt->surf.row_pitch, alloc_flags); + mt->surf.row_pitch, 0); if (!mt->bo) goto fail; } else { @@ -667,7 +667,7 @@ 
make_separate_stencil_surface(struct brw_context *brw, mt->surf.samples, ISL_TILING_W_BIT, ISL_SURF_USAGE_STENCIL_BIT | ISL_SURF_USAGE_TEXTURE_BIT, - BO_ALLOC_BUSY, 0, NULL); + 0, NULL); if (!mt->stencil_mt) return false; @@ -697,7 +697,6 @@ miptree_create(struct brw_context *brw, ISL_TILING_W_BIT, ISL_SURF_USAGE_STENCIL_BIT | ISL_SURF_USAGE_TEXTURE_BIT, - BO_ALLOC_BUSY, 0, NULL); @@ -715,7 +714,7 @@ miptree_create(struct brw_context *brw, first_level, last_level, width0, height0, depth0, num_samples, ISL_TILING_Y0_BIT, ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_TEXTURE_BIT, - BO_ALLOC_BUSY, 0, NULL); + 0, NULL); if (needs_separate_stencil(brw, mt, format) && !make_separate_stencil_surface(brw, mt)) { @@ -731,15 +730,11 @@ miptree_create(struct brw_context *brw, mesa_format tex_format = format; mesa_format etc_format = MESA_FORMAT_NONE; - uint32_t alloc_flags = 0; format = intel_lower_compressed_format(brw, format); etc_format = (format != tex_format) ? tex_format : MESA_FORMAT_NONE; - if (flags & MIPTREE_CREATE_BUSY) - alloc_flags |= BO_ALLOC_BUSY; - isl_tiling_flags_t tiling_flags = (flags & MIPTREE_CREATE_LINEAR) ? ISL_TILING_LINEAR_BIT : ISL_TILING_ANY_MASK; @@ -754,7 +749,7 @@ miptree_create(struct brw_context *brw, num_samples, tiling_flags, ISL_SURF_USAGE_RENDER_TARGET_BIT | ISL_SURF_USAGE_TEXTURE_BIT, - alloc_flags, 0, NULL); + 0, NULL); if (!mt) return NULL; @@ -828,7 +823,7 @@ intel_miptree_create_for_bo(struct brw_context *brw, devinfo->gen >= 6 ? depth_only_format : format, 0, 0, width, height, depth, 1, ISL_TILING_Y0_BIT, ISL_SURF_USAGE_DEPTH_BIT | ISL_SURF_USAGE_TEXTURE_BIT, -0, pitch, bo); +pitch, bo); if (!mt) return NULL; @@ -844,7 +839,7 @@ intel_miptree_
Re: [Mesa-dev] [PATCH 01/14] intel/compiler: general 8/16/32/64-bit shuffle_src_to_dst function
On Sat, Jun 9, 2018 at 4:13 AM, Jose Maria Casanova Crespo < jmcasan...@igalia.com> wrote: > This new function takes care of shuffle/unshuffle components of a > particular bit-size in components with a different bit-size. > > If source type size is smaller than destination type size the operation > needed is a component shuffle. The opposite case would be an unshuffle. > > The operation allows to skip first_component number of components from > the source. > > Shuffle MOVs are retyped using integer types avoiding problems with denorms > and float types. This allows to simplify uses of shuffle functions that are > dealing with these retypes individually. > > Now there is a new restriction so source and destination can not overlap > anymore when calling this suffle function. Following patches that migrate > to use this new function will take care individually of avoiding source > and destination overlaps. > --- > src/intel/compiler/brw_fs_nir.cpp | 92 +++ > 1 file changed, 92 insertions(+) > > diff --git a/src/intel/compiler/brw_fs_nir.cpp > b/src/intel/compiler/brw_fs_nir.cpp > index 166da0aa6d7..1a9d3c41d1d 100644 > --- a/src/intel/compiler/brw_fs_nir.cpp > +++ b/src/intel/compiler/brw_fs_nir.cpp > @@ -5362,6 +5362,98 @@ shuffle_16bit_data_for_32bit_write(const > fs_builder &bld, > } > } > > +/* > + * This helper takes a source register and un/shuffles it into the > destination > + * register. > + * > + * If source type size is smaller than destination type size the operation > + * needed is a component shuffle. The opposite case would be an > unshuffle. If > + * source/destination type size is equal a shuffle is done that would be > + * equivalent to a simple MOV. > There's a sticky bit here if we want this to work with 64-bit types on gen7 and earlier because we only have DF there and not Q so the brw_reg_type_from_bit_size below doesn't work. 
If we care about that case (and I'm not convinced we do), it should be easy enough to add a type_sz(src.type) == type_sz(dst.type) case which just does MOVs from source to dest. > + * > + * For example, if source is a 16-bit type and destination is 32-bit. A 3 > + * components .xyz 16-bit vector on SIMD8 would be. > + * > + *|x1|x2|x3|x4|x5|x6|x7|x8|y1|y2|y3|y4|y5|y6|y7|y8| > + *|z1|z2|z3|z4|z5|z6|z7|z8| | | | | | | | | > + * > + * This helper will return the following 2 32-bit components with the > 16-bit > + * values shuffled: > + * > + *|x1 y1|x2 y2|x3 y3|x4 y4|x5 y5|x6 y6|x7 y7|x8 y8| > + *|z1 |z2 |z3 |z4 |z5 |z6 |z7 |z8 | > + * > + * For unshuffle, the example would be the opposite, a 64-bit type source > + * and a 32-bit destination. A 2 component .xy 64-bit vector on SIMD8 > + * would be: > + * > + *| x1l x1h | x2l x2h | x3l x3h | x4l x4h | > + *| x5l x5h | x6l x6h | x7l x7h | x8l x8h | > + *| y1l y1h | y2l y2h | y3l y3h | y4l y4h | > + *| y5l y5h | y6l y6h | y7l y7h | y8l y8h | > + * > + * The returned result would be the following 4 32-bit components > unshuffled: > + * > + *| x1l | x2l | x3l | x4l | x5l | x6l | x7l | x8l | > + *| x1h | x2h | x3h | x4h | x5h | x6h | x7h | x8h | > + *| y1l | y2l | y3l | y4l | y5l | y6l | y7l | y8l | > + *| y1h | y2h | y3h | y4h | y5h | y6h | y7h | y8h | > + * > + * - Source and destination register must not be overlapped. > + * - first_component parameter allows skipping source components. 
> + */ > +void > +shuffle_src_to_dst(const fs_builder &bld, > + const fs_reg &dst, > + const fs_reg &src, > + uint32_t first_component, > + uint32_t components) > +{ > + if (type_sz(src.type) <= type_sz(dst.type)) { > + /* Source is shuffled into destination */ > + unsigned size_ratio = type_sz(dst.type) / type_sz(src.type); > +#ifndef NDEBUG > + boolean src_dst_overlap = regions_overlap(dst, > + type_sz(dst.type) * bld.dispatch_width() * components, > + offset(src, bld, first_component * size_ratio), > Why do you need to multiply first_component by size_ratio? It's already in units of source components. > + type_sz(src.type) * bld.dispatch_width() * components * > size_ratio); > +#endif > + assert(!src_dst_overlap); > If the only thing you're doing with src_dst_overlap is to assert on it, you may as well put the regions_overlap call inside the assert and drop the #ifndef. > + > + brw_reg_type shuffle_type = > + brw_reg_type_from_bit_size(8 * type_sz(src.type), > +BRW_REGISTER_TYPE_D); > + for (unsigned i = 0; i < components; i++) { > + fs_reg shuffle_component_i = > +subscript(offset(dst, bld, i / size_ratio), > + shuffle_type, i % size_ratio); > + bld.MOV(shuffle_component_i, > + retype(offset(src, bld, i + first_component), > shuffle_type)); > + } >
[Mesa-dev] [Bug 106907] Correct Transform Feedback Varyings information is expected after using ProgramBinary
https://bugs.freedesktop.org/show_bug.cgi?id=106907 --- Comment #1 from Jordan Justen --- Any chance you might be able to write a small piglit test that shows the bug? For example: https://cgit.freedesktop.org/piglit/commit/?id=f1dc46ddf8c1
[Mesa-dev] [PATCH] radv: Fix output for sparse MRTs.
We need to init the cb_shader_mask correctly with the changed
col_format, so this moves the col_format adjustment to before the
cb_shader_mask gets generated.

Fixes: 06d3c650980 "radv: fix a GPU hang when MRTs are sparse"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106903
CC: 18.1
---
 src/amd/vulkan/radv_pipeline.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index b8b425aca9f..6eeedc65a39 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -524,20 +524,21 @@ radv_pipeline_compute_spi_color_formats(struct radv_pipeline *pipeline,
 		col_format |= cf << (4 * i);
 	}
 
-	blend->cb_shader_mask = ac_get_cb_shader_mask(col_format);
-
-	if (blend->mrt0_is_dual_src)
-		col_format |= (col_format & 0xf) << 4;
-	blend->spi_shader_col_format = col_format;
-
 	/* If the i-th target format is set, all previous target formats must
 	 * be non-zero to avoid hangs.
 	 */
-	num_targets = (util_last_bit(blend->spi_shader_col_format) + 3) / 4;
+	num_targets = (util_last_bit(col_format) + 3) / 4;
 	for (unsigned i = 0; i < num_targets; i++) {
-		if (!(blend->spi_shader_col_format & (0xf << (i * 4))))
-			blend->spi_shader_col_format |= V_028714_SPI_SHADER_32_R << (i * 4);
+		if (!(col_format & (0xf << (i * 4)))) {
+			col_format |= V_028714_SPI_SHADER_32_R << (i * 4);
+		}
 	}
+
+	blend->cb_shader_mask = ac_get_cb_shader_mask(col_format);
+
+	if (blend->mrt0_is_dual_src)
+		col_format |= (col_format & 0xf) << 4;
+	blend->spi_shader_col_format = col_format;
 }
 
 static bool
-- 
2.17.0
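The ordering issue this patch addresses can be shown with a small standalone sketch. This is not radv code: kFillFormat is an assumed stand-in for V_028714_SPI_SHADER_32_R, last_bit is a local model of util_last_bit, and toy_shader_mask is a deliberately simplified placeholder (0xf per non-empty target), not the real ac_get_cb_shader_mask.

```cpp
#include <cassert>
#include <cstdint>

// Assumed stand-in for V_028714_SPI_SHADER_32_R; the real value comes from
// Mesa's register headers.
constexpr uint32_t kFillFormat = 1;

// Local model of util_last_bit: index of the highest set bit plus one,
// or 0 when no bit is set.
static unsigned last_bit(uint32_t v)
{
   unsigned n = 0;
   while (v) {
      n++;
      v >>= 1;
   }
   return n;
}

// Give every empty 4-bit target slot below the highest used slot a
// non-zero format, as the loop in the patch does.
uint32_t fill_sparse_col_format(uint32_t col_format)
{
   const unsigned num_targets = (last_bit(col_format) + 3) / 4;
   for (unsigned i = 0; i < num_targets; i++) {
      if (!(col_format & (0xfu << (i * 4))))
         col_format |= kFillFormat << (i * 4);
   }
   return col_format;
}

// Hypothetical mask function: 0xf for every target with a non-zero format.
uint32_t toy_shader_mask(uint32_t col_format)
{
   uint32_t mask = 0;
   for (unsigned i = 0; i < 8; i++) {
      if (col_format & (0xfu << (i * 4)))
         mask |= 0xfu << (i * 4);
   }
   return mask;
}
```

With only target 2 in use (col_format 0x400), a mask computed before the fill covers one target, while a mask computed after the fill also covers the two padded targets. That is the reordering the patch makes.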
Re: [Mesa-dev] [PATCH mesa 1/9] vulkan: Add KHR_display extension using DRM [v8]
I'm trusting that not much changed other than what was explicitly called out. I didn't want to re-read in *that* much detail again. :-) Reviewed-by: Jason Ekstrand On Mon, Jun 11, 2018 at 10:39 PM, Keith Packard wrote: > This adds support for the KHR_display extension support to the vulkan > WSI layer. Driver support will be added separately. > > v2: > * fix double ;; in wsi_common_display.c > > * Move mode list from wsi_display to wsi_display_connector > > * Fix scope for wsi_display_mode andwsi_display_connector > allocs > > * Switch all allocations to vk_zalloc instead of vk_alloc. > > * Fix DRM failure in > wsi_display_get_physical_device_display_properties > > When DRM fails, or when we don't have a master fd > (presumably due to application errors), just return 0 > properties from this function, which is at least a valid > response. > > * Use vk_outarray for all property queries > > This is a bit less error-prone than open-coding the same > stuff. > > * Remove VK_COMPOSITE_ALPHA_INHERIT_BIT_KHR from surface caps > > Until we have multi-plane support, we shouldn't pretend to > have any multi-plane semantics, even if undefined. > > Suggested-by: Jason Ekstrand > > * Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to > vulkan_wsi_args > > Suggested-by: Eric Engestrom > > v3: > Add separate 'display_fd' and 'render_fd' arguments to > wsi_device_init API. This allows drivers to use different FDs > for the different aspects of the device. > > Use largest mode as display size when no preferred mode. > > If the display doesn't provide a preferred mode, we'll assume > that the largest supported mode is the "physical size" of the > device and report that. > > v4: > Make wsi_image_state enumeration values uppercase. > Follow more common mesa conventions. > > Remove 'render_fd' from wsi_device_init API. The > wsi_common_display code doesn't use this fd at all, so stop > passing it in. 
This avoids any potential confusion over which > fd to use when creating display-relative object handles. > > Remove call to wsi_create_prime_image which would never have > been reached as the necessary condition (use_prime_blit) is > never set. > > whitespace cleanups in wsi_common_display.c > > Suggested-by: Jason Ekstrand > > Add depth/bpp info to available surface formats. Instead of > hard-coding depth 24 bpp 32 in the drmModeAddFB call, use the > requested format to find suitable values. > > Destroy kernel buffers and FBs when swapchain is destroyed. We > were leaking both of these kernel objects across swapchain > destruction. > > Note that wsi_display_wait_for_event waits for anything to > happen. wsi_display_wait_for_event is simply a yield so that > the caller can then check to see if the desired state change > has occurred. > > Record swapchain failures in chain for later return. If some > asynchronous swapchain activity fails, we need to tell the > application eventually. Record the failure in the swapchain > and report it at the next acquire_next_image or queue_present > call. > > Fix error returns from wsi_display_setup_connector. If a > malloc failed, then the result should be > VK_ERROR_OUT_OF_HOST_MEMORY. Otherwise, the associated ioctl > failed and we're either VT switched away, or our lease has > been revoked, in which case we should return > VK_ERROR_OUT_OF_DATE_KHR. > > Make sure both sides of if/else brace use matches > > Note that we assume drmModeSetCrtc is synchronous. Add a > comment explaining why we can idle any previous displayed > image as soon as the mode set returns. > > Note that EACCES from drmModePageFlip means VT inactive. When > vt switched away drmModePageFlip returns EACCES. Poll once a > second waiting until we get some other return value back. > > Clean up after alloc failure in > wsi_display_surface_create_swapchain. Destroy any created > images, free the swapchain. > > Remove physical_device from wsi_display_init_wsi. 
We never > need this value, so remove it from the API and from the > internal wsi_display structure. > > Use drmModeAddFB2 in wsi_display_image_init. This takes a drm > format instead of depth/bpp, which provides more control over > the format of the data. > > v5: > Set the 'currentStackIndex' member of the > VkDisplayPlanePropertiesKHR record to zero, instead of > indexing across all displays. This value is the stac
Re: [Mesa-dev] [PATCH mesa 2/9] anv: Add KHR_display extension to anv [v5]
On Mon, Jun 11, 2018 at 10:39 PM, Keith Packard wrote: > This adds support for the KHR_display extension to the anv Vulkan > driver. The driver now attempts to open the master DRM node when the > KHR_display extension is requested so that the common winsys code can > perform the necessary operations. > > v2: Make sure primary fd is usable > > When KHR_display is selected, we try to open the primary node > instead of the render node in case the user wants to use > KHR_display for presentation. However, if we're actually going > to end up using RandR leases, then we don't care if the > resulting fd can't be used for display, but the kernel also > prevents us from using it for drawing when someone else has > master. > > v3: > Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to vulkan_wsi_args > > Suggested-by: Eric Engestrom > > v4: > Adapt primary node usage to new wsi_device_init API > > v5: > Adopt Jason Ekstrand's coding conventions > > Declare variables at first use, eliminate extra whitespace between > types and names. Wrap lines to 80 columns. 
> > Remove spurious MM_PER_PIXEL define > > Suggested-by: Jason Ekstrand > > Signed-off-by: Keith Packard > > fixup > --- > src/intel/Makefile.sources | 3 + > src/intel/Makefile.vulkan.am | 7 ++ > src/intel/vulkan/anv_device.c | 21 > src/intel/vulkan/anv_extensions.py | 1 + > src/intel/vulkan/anv_extensions_gen.py | 5 +- > src/intel/vulkan/anv_wsi_display.c | 129 + > src/intel/vulkan/meson.build | 5 + > 7 files changed, 169 insertions(+), 2 deletions(-) > create mode 100644 src/intel/vulkan/anv_wsi_display.c > > diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources > index f22e727553f..5f6cd96825b 100644 > --- a/src/intel/Makefile.sources > +++ b/src/intel/Makefile.sources > @@ -254,6 +254,9 @@ VULKAN_WSI_WAYLAND_FILES := \ > VULKAN_WSI_X11_FILES := \ > vulkan/anv_wsi_x11.c > > +VULKAN_WSI_DISPLAY_FILES := \ > + vulkan/anv_wsi_display.c > + > VULKAN_GEM_FILES := \ > vulkan/anv_gem.c > > diff --git a/src/intel/Makefile.vulkan.am b/src/intel/Makefile.vulkan.am > index 4125cb205ad..9b7fbb74007 100644 > --- a/src/intel/Makefile.vulkan.am > +++ b/src/intel/Makefile.vulkan.am > @@ -192,6 +192,13 @@ VULKAN_SOURCES += $(VULKAN_WSI_WAYLAND_FILES) > VULKAN_LIB_DEPS += $(WAYLAND_CLIENT_LIBS) > endif > > +if HAVE_PLATFORM_DRM > +VULKAN_CPPFLAGS += \ > + -DVK_USE_PLATFORM_DISPLAY_KHR > + > +VULKAN_SOURCES += $(VULKAN_WSI_DISPLAY_FILES) > +endif > + > noinst_LTLIBRARIES += vulkan/libvulkan_common.la > vulkan_libvulkan_common_la_SOURCES = $(VULKAN_SOURCES) > vulkan_libvulkan_common_la_CFLAGS = $(VULKAN_CFLAGS) > diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c > index 56e91fe5de1..b3c6d1a8722 100644 > --- a/src/intel/vulkan/anv_device.c > +++ b/src/intel/vulkan/anv_device.c > @@ -274,6 +274,7 @@ anv_physical_device_init_uuids(struct > anv_physical_device *device) > static VkResult > anv_physical_device_init(struct anv_physical_device *device, > struct anv_instance *instance, > + const char *primary_path, > const char *path) > { > VkResult 
result; > @@ -445,6 +446,25 @@ anv_physical_device_init(struct anv_physical_device > *device, > anv_physical_device_get_supported_extensions(device, > > &device->supported_extensions); > > + if (instance->enabled_extensions.KHR_display) { > + master_fd = open(path, O_RDWR | O_CLOEXEC); > Is this supposed to be opening primary_path instead? > + if (master_fd >= 0) { > + /* prod the device with a GETPARAM call which will fail if > + * we don't have permission to even render on this device > + */ > + drm_i915_getparam_t gp; > + memset(&gp, '\0', sizeof(gp)); > + int devid = 0; > + gp.param = I915_PARAM_CHIPSET_ID; > + gp.value = &devid; > + int ret = drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp); > + if (ret < 0) { > +close(master_fd); > +master_fd = -1; > + } > This could just be if (anv_gem_get_param(master_fd, I915_PARAM_CHIPSET_ID) == 0) { close(master_fd); master_fd = -1; } No need to type out all that IOCTL stuff. > + } > + } > + > device->local_fd = fd; > device->master_fd = master_fd; > return VK_SUCCESS; > @@ -635,6 +655,7 @@ anv_enumerate_devices(struct anv_instance *instance) > > result = anv_physical_device_init(&instance->physicalDevice, > instance, > +devices[i]->nodes[DRM_NODE_PRIMARY], > devices[i]->nodes[DRM_NODE_RENDER]); > if (result != VK_ERROR_INCOMPATIBLE_DRIVER) > break; > diff --git a/
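The two review remarks above (which path is opened, and which fd is probed) can be illustrated with a small POSIX-only sketch. It assumes the intent is to open the primary node and probe that same fd; open_master_fd and the probe callback are hypothetical names, with the callback standing in for a driver query such as the anv_gem_get_param call suggested in the review.

```cpp
#include <cassert>
#include <fcntl.h>
#include <functional>
#include <unistd.h>

// Open the primary node and keep the fd only if the probe accepts it.
// Returns -1 when the node cannot be opened or the probe fails.
// `probe` is a hypothetical stand-in for a driver-specific query.
int open_master_fd(const char *primary_path,
                   const std::function<bool(int)> &probe)
{
   int master_fd = open(primary_path, O_RDWR | O_CLOEXEC);
   if (master_fd < 0)
      return -1;

   // Probe the fd that was just opened, not some other descriptor.
   if (!probe(master_fd)) {
      close(master_fd);
      return -1;
   }
   return master_fd;
}
```

Passing the probe in as a callback keeps the open/verify/close-on-failure flow in one place, so the fd under test is always the one that would be stored as master_fd.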
Re: [Mesa-dev] [PATCH mesa 04/21] vulkan: Add EXT_direct_mode_display
On Mon, Jun 11, 2018 at 9:32 PM, Keith Packard wrote: > Jason Ekstrand writes: > > > This seems a bit odd. Why is the FD not stored in the display? What if > > you acquire multiple displays for two-player VR? If the master FD passed > > in is not -1, we could just create a VkDisplayKHR object containing > > it. > > You want to share the master_fd passed in at init_wsi time among all > VkDisplayKHR objects, so you need to leave that FD in the global > structure. However, you're right that when you use > EXT_acquire_xlib_display, then you get a separate master_fd for each DRM > output and need to have one per display. > > However, extending this code to support multiple master FDs looks tricky > -- in the case where you have a single master_fd, then enumerating the > DRM resources for that gives you all of the available > connectors. However, if you have one DRM master per connector, then you > need to enumerate each independently to get the complete set of > available resources. For APIs which don't explicitly include a > connector, I would have to go find a suitable master FD for each > resource. > > How about I just disallow multiple leases for now? If you want multiple > outputs, I think you'd want them on the same DRM master anyways, and we > could get that by creating a new extension which had the application > pass in a DRM master that had all of the resources you want to access. > > /* XXX no support for multiple leases yet */ > if (wsi->fd >= 0) > return VK_ERROR_OUT_OF_DATE_KHR; > That's fine with me. As long as we do something sensible such as disallowing it instead of just falling over. --Jason
Re: [Mesa-dev] [PATCH mesa 4/9] vulkan: Add EXT_direct_mode_display [v2]
Patches 4-6 are Reviewed-by: Jason Ekstrand On Mon, Jun 11, 2018 at 10:39 PM, Keith Packard wrote: > Add support for the EXT_direct_mode_display extension. This just > provides the vkReleaseDisplayEXT function. > > v2: > Adopt Jason Ekstrand's coding conventions > > Declare variables at first use, eliminate extra whitespace > between types and names. Wrap lines to 80 columns. > > Suggested-by: Jason Ekstrand > > Signed-off-by: Keith Packard > --- > src/vulkan/wsi/wsi_common_display.c | 18 ++++++++++++++++++ > src/vulkan/wsi/wsi_common_display.h | 5 +++++ > 2 files changed, 23 insertions(+) > > diff --git a/src/vulkan/wsi/wsi_common_display.c > b/src/vulkan/wsi/wsi_common_display.c > index e529d2fc580..7a484c0df95 100644 > --- a/src/vulkan/wsi/wsi_common_display.c > +++ b/src/vulkan/wsi/wsi_common_display.c > @@ -1430,3 +1430,21 @@ wsi_display_finish_wsi(struct wsi_device > *wsi_device, > vk_free(alloc, wsi); > } > } > + > +/* > + * Implement vkReleaseDisplay > + */ > +VkResult > +wsi_release_display(VkPhysicalDevice physical_device, > +struct wsi_device *wsi_device, > +VkDisplayKHR display) > +{ > + struct wsi_display *wsi = > + (struct wsi_display *) wsi_device->wsi[VK_ICD_WSI_PLATFORM_DISPLAY]; > + > + if (wsi->fd >= 0) { > + close(wsi->fd); > + wsi->fd = -1; > + } > + return VK_SUCCESS; > +} > diff --git a/src/vulkan/wsi/wsi_common_display.h > b/src/vulkan/wsi/wsi_common_display.h > index 4bb86cf2102..dd3a098f80a 100644 > --- a/src/vulkan/wsi/wsi_common_display.h > +++ b/src/vulkan/wsi/wsi_common_display.h > @@ -74,4 +74,9 @@ wsi_create_display_surface(VkInstance instance, > const VkDisplaySurfaceCreateInfoKHR > *pCreateInfo, > VkSurfaceKHR *pSurface); > > +VkResult > +wsi_release_display(VkPhysicalDevice physical_device, > +struct wsi_device *wsi_device, > +VkDisplayKHR display); > + > #endif > -- > 2.17.1
Re: [Mesa-dev] [PATCH mesa 3/9] radv: Add KHR_display extension to radv [v4]
Reviewed-by: Jason Ekstrand On Mon, Jun 11, 2018 at 10:39 PM, Keith Packard wrote: > This adds support for the KHR_display extension to the radv Vulkan > driver. The driver now attempts to open the master DRM node when the > KHR_display extension is requested so that the common winsys code can > perform the necessary operations. > > v2: > * Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to > vulkan_wsi_args > > Suggested-by: Eric Engestrom > > v3: > Adapt to new wsi_device_init API (added display_fd) > > v4: > Adopt Jason Ekstrand's coding conventions > > Declare variables at first use, eliminate extra whitespace > between types and names. Wrap lines to 80 columns. > > Suggested-by: Jason Ekstrand > > Signed-off-by: Keith Packard > --- > src/amd/vulkan/Makefile.am| 8 ++ > src/amd/vulkan/Makefile.sources | 3 + > src/amd/vulkan/meson.build| 5 + > src/amd/vulkan/radv_device.c | 17 > src/amd/vulkan/radv_extensions.py | 7 +- > src/amd/vulkan/radv_private.h | 1 + > src/amd/vulkan/radv_wsi_display.c | 149 ++ > 7 files changed, 188 insertions(+), 2 deletions(-) > create mode 100644 src/amd/vulkan/radv_wsi_display.c > > diff --git a/src/amd/vulkan/Makefile.am b/src/amd/vulkan/Makefile.am > index 18f263ab447..f4f99400275 100644 > --- a/src/amd/vulkan/Makefile.am > +++ b/src/amd/vulkan/Makefile.am > @@ -80,6 +80,14 @@ VULKAN_LIB_DEPS = \ > $(DLOPEN_LIBS) \ > -lm > > +if HAVE_PLATFORM_DRM > +AM_CPPFLAGS += \ > + -DVK_USE_PLATFORM_DISPLAY_KHR > + > +VULKAN_SOURCES += $(VULKAN_WSI_DISPLAY_FILES) > + > +endif > + > if HAVE_PLATFORM_X11 > AM_CPPFLAGS += \ > $(XCB_DRI3_CFLAGS) \ > diff --git a/src/amd/vulkan/Makefile.sources b/src/amd/vulkan/Makefile. 
> sources > index ccb956a2396..70d56e88cb3 100644 > --- a/src/amd/vulkan/Makefile.sources > +++ b/src/amd/vulkan/Makefile.sources > @@ -80,6 +80,9 @@ VULKAN_WSI_WAYLAND_FILES := \ > VULKAN_WSI_X11_FILES := \ > radv_wsi_x11.c > > +VULKAN_WSI_DISPLAY_FILES := \ > + radv_wsi_display.c > + > VULKAN_GENERATED_FILES := \ > radv_entrypoints.c \ > radv_entrypoints.h \ > diff --git a/src/amd/vulkan/meson.build b/src/amd/vulkan/meson.build > index b5a99fe91e1..15e69d582dd 100644 > --- a/src/amd/vulkan/meson.build > +++ b/src/amd/vulkan/meson.build > @@ -115,6 +115,11 @@ if with_platform_wayland >libradv_files += files('radv_wsi_wayland.c') > endif > > +if with_platform_drm > + radv_flags += '-DVK_USE_PLATFORM_DISPLAY_KHR' > + libradv_files += files('radv_wsi_display.c') > +endif > + > libvulkan_radeon = shared_library( >'vulkan_radeon', >[libradv_files, radv_entrypoints, radv_extensions_c, vk_format_table_c], > diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c > index ca091ee12ba..59ee503c8c2 100644 > --- a/src/amd/vulkan/radv_device.c > +++ b/src/amd/vulkan/radv_device.c > @@ -274,6 +274,23 @@ radv_physical_device_init(struct > radv_physical_device *device, > goto fail; > } > > + if (instance->enabled_extensions.KHR_display) { > + master_fd = open(drm_device->nodes[DRM_NODE_PRIMARY], > O_RDWR | O_CLOEXEC); > + if (master_fd >= 0) { > + uint32_t accel_working = 0; > + struct drm_amdgpu_info request = { > + .return_pointer = > (uintptr_t)&accel_working, > + .return_size = sizeof(accel_working), > + .query = AMDGPU_INFO_ACCEL_WORKING > + }; > + > + if (drmCommandWrite(master_fd, DRM_AMDGPU_INFO, > &request, sizeof (struct drm_amdgpu_info)) < 0 || !accel_working) { > + close(master_fd); > + master_fd = -1; > + } > + } > + } > + > device->master_fd = master_fd; > device->local_fd = fd; > device->ws->query_info(device->ws, &device->rad_info); > diff --git a/src/amd/vulkan/radv_extensions.py b/src/amd/vulkan/radv_ > extensions.py > index 
a5b5a8dc34e..6f4fc71bfd8 100644 > --- a/src/amd/vulkan/radv_extensions.py > +++ b/src/amd/vulkan/radv_extensions.py > @@ -86,6 +86,7 @@ EXTENSIONS = [ > Extension('VK_KHR_xcb_surface', 6, > 'VK_USE_PLATFORM_XCB_KHR'), > Extension('VK_KHR_xlib_surface', 6, > 'VK_USE_PLATFORM_XLIB_KHR'), > Extension('VK_KHR_multiview', 1, True), > +Extension('VK_KHR_display', 23, > 'VK_USE_PLATFORM_DISPLAY_KHR'), > Extension('VK_EXT_debug_report', 9, True), > Extension('VK_EXT_depth_range_unrestricted', 1, True), > Extension('VK_EXT_descriptor_indexing', 2, True), > @@ -214,7 +215,7 @@ _TEMPLATE_C = Te
Re: [Mesa-dev] [PATCH mesa 3/9] radv: Add KHR_display extension to radv [v4]
On Wed, Jun 13, 2018 at 2:46 PM, Jason Ekstrand wrote: > Reviewed-by: Jason Ekstrand > With the caveat that I have no idea how the amdgpu kernel interface works. :-) > On Mon, Jun 11, 2018 at 10:39 PM, Keith Packard wrote: > >> This adds support for the KHR_display extension to the radv Vulkan >> driver. The driver now attempts to open the master DRM node when the >> KHR_display extension is requested so that the common winsys code can >> perform the necessary operations. >> >> [...]
Re: [Mesa-dev] [PATCH mesa 2/9] anv: Add KHR_display extension to anv [v5]
On Mon, Jun 11, 2018 at 10:39 PM, Keith Packard wrote: > This adds support for the KHR_display extension to the anv Vulkan > driver. The driver now attempts to open the master DRM node when the > KHR_display extension is requested so that the common winsys code can > perform the necessary operations. > > v2: Make sure primary fd is usable > > When KHR_display is selected, we try to open the primary node > instead of the render node in case the user wants to use > KHR_display for presentation. However, if we're actually going > to end up using RandR leases, then we don't care if the > resulting fd can't be used for display, but the kernel also > prevents us from using it for drawing when someone else has > master. > > v3: > Simplify addition of VK_USE_PLATFORM_DISPLAY_KHR to vulkan_wsi_args > > Suggested-by: Eric Engestrom > > v4: > Adapt primary node usage to new wsi_device_init API > > v5: > Adopt Jason Ekstrand's coding conventions > > Declare variables at first use, eliminate extra whitespace between > types and names. Wrap lines to 80 columns. > > Remove spurious MM_PER_PIXEL define > > Suggested-by: Jason Ekstrand > > Signed-off-by: Keith Packard > > fixup > Did you mean to leave this in here? 
Re: [Mesa-dev] [PATCH] mesa: enable EXT_render_snorm extension
Tapani Pälli writes: > Patch sets additional formats renderable and enables the extension > when OpenGL ES 3.1 is supported. > > Signed-off-by: Tapani Pälli > --- > src/mesa/main/extensions_table.h | 1 + > src/mesa/main/fbobject.c | 20 +++- > src/mesa/main/glformats.c| 9 + > 3 files changed, 25 insertions(+), 5 deletions(-) > > diff --git a/src/mesa/main/extensions_table.h > b/src/mesa/main/extensions_table.h > index 79ef228b69..bc60475bea 100644 > --- a/src/mesa/main/extensions_table.h > +++ b/src/mesa/main/extensions_table.h > @@ -245,6 +245,7 @@ EXT(EXT_polygon_offset_clamp, > ARB_polygon_offset_clamp > EXT(EXT_primitive_bounding_box , OES_primitive_bounding_box > , x , x , x , 31, 2014) > EXT(EXT_provoking_vertex, EXT_provoking_vertex > , GLL, GLC, x , x , 2009) > EXT(EXT_read_format_bgra, dummy_true > , x , x , ES1, ES2, 2009) > +EXT(EXT_render_snorm, dummy_true > , x , x , x, 31, 2014) Since this is an extension beyond GLES 3.1, I think it shouldn't be dummy_true -- at least V3D 3.3 should be able to do 3.1, and can't render to snorm.
[Mesa-dev] [PATCH 1/4] nv50/ir: add preliminary support for OP_XMAD
Signed-off-by: Rhys Perry --- src/gallium/drivers/nouveau/codegen/nv50_ir.cpp| 3 ++- src/gallium/drivers/nouveau/codegen/nv50_ir.h | 14 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 12 +-- .../drivers/nouveau/codegen/nv50_ir_print.cpp | 20 + .../drivers/nouveau/codegen/nv50_ir_target.cpp | 7 +++--- .../nouveau/codegen/nv50_ir_target_gm107.cpp | 1 + .../nouveau/codegen/nv50_ir_target_nv50.cpp| 5 +++-- .../nouveau/codegen/nv50_ir_target_nvc0.cpp| 25 -- 8 files changed, 77 insertions(+), 10 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp index 49425b98b9..99bf8de370 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp @@ -53,7 +53,8 @@ Modifier Modifier::operator*(const Modifier m) const b &= ~NV50_IR_MOD_NEG; a = (this->bits ^ b) & (NV50_IR_MOD_NOT | NV50_IR_MOD_NEG); - c = (this->bits | m.bits) & (NV50_IR_MOD_ABS | NV50_IR_MOD_SAT); + c = (this->bits | m.bits) & (NV50_IR_MOD_ABS | NV50_IR_MOD_SAT | +NV50_IR_MOD_H1 | NV50_IR_MOD_SEXT); return Modifier(a | c); } diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h b/src/gallium/drivers/nouveau/codegen/nv50_ir.h index f4f3c70888..4deaf09989 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h @@ -58,6 +58,7 @@ enum operation OP_FMA, OP_SAD, // abs(src0 - src1) + src2 OP_SHLADD, + OP_XMAD, // extended multiply-add (GM107+), does a lot of things OP_ABS, OP_NEG, OP_NOT, @@ -251,6 +252,13 @@ enum operation #define NV50_IR_SUBOP_VOTE_ALL 0 #define NV50_IR_SUBOP_VOTE_ANY 1 #define NV50_IR_SUBOP_VOTE_UNI 2 +#define NV50_IR_SUBOP_XMAD_PSL (1 << 0) +#define NV50_IR_SUBOP_XMAD_MRG (1 << 1) +#define NV50_IR_SUBOP_XMAD_CLO (1 << 2) +#define NV50_IR_SUBOP_XMAD_CHI (2 << 2) +#define NV50_IR_SUBOP_XMAD_CSFU (3 << 2) +#define NV50_IR_SUBOP_XMAD_CBCC (4 << 2) +#define NV50_IR_SUBOP_XMAD_CMODE_MASK (0x7 << 2) #define 
NV50_IR_SUBOP_MINMAX_LOW 1 #define NV50_IR_SUBOP_MINMAX_MED 2 @@ -527,6 +535,9 @@ struct Storage #define NV50_IR_MOD_SAT (1 << 2) #define NV50_IR_MOD_NOT (1 << 3) #define NV50_IR_MOD_NEG_ABS (NV50_IR_MOD_NEG | NV50_IR_MOD_ABS) +// modifiers only for XMAD +#define NV50_IR_MOD_H1 (1 << 4) +#define NV50_IR_MOD_SEXT (1 << 5) #define NV50_IR_INTERP_MODE_MASK 0x3 #define NV50_IR_INTERP_LINEAR (0 << 0) @@ -556,11 +567,14 @@ public: inline Modifier operator&(const Modifier m) const { return bits & m.bits; } inline Modifier operator|(const Modifier m) const { return bits | m.bits; } inline Modifier operator^(const Modifier m) const { return bits ^ m.bits; } + inline Modifier operator~() const { return ~bits; } operation getOp() const; inline int neg() const { return (bits & NV50_IR_MOD_NEG) ? 1 : 0; } inline int abs() const { return (bits & NV50_IR_MOD_ABS) ? 1 : 0; } + inline int h1() const { return (bits & NV50_IR_MOD_H1) ? 1 : 0; } + inline int sext() const { return (bits & NV50_IR_MOD_SEXT) ? 1 : 0; } inline operator bool() const { return bits ? true : false; } diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index 4d0589214d..a43b481a01 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp @@ -191,9 +191,16 @@ void LoadPropagation::checkSwapSrc01(Instruction *insn) { const Target *targ = prog->getTarget(); - if (!targ->getOpInfo(insn).commutative) - if (insn->op != OP_SET && insn->op != OP_SLCT && insn->op != OP_SUB) + if (!targ->getOpInfo(insn).commutative) { + if (insn->op != OP_SET && insn->op != OP_SLCT && + insn->op != OP_SUB && insn->op != OP_XMAD) return; + // XMAD is only commutative if both the CBCC and MRG flags are not set. 
+ if (insn->op == OP_XMAD && (insn->subOp & 0x1c) == NV50_IR_SUBOP_XMAD_CBCC) + return; + if (insn->op == OP_XMAD && (insn->subOp & NV50_IR_SUBOP_XMAD_MRG)) + return; + } if (insn->src(1).getFile() != FILE_GPR) return; // This is the special OP_SET used for alphatesting, we can't reverse its @@ -488,6 +495,7 @@ Modifier::applyTo(ImmediateValue& imm) const imm.reg.data.s32 = -imm.reg.data.s32; if (bits & NV50_IR_MOD_NOT) imm.reg.data.s32 = ~imm.reg.data.s32; + // NOTE: applying the h1 and sext modifiers is confusing and not very useful break; case TYPE_F64: diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp index cbb21f5f72..c4906c31a8 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp +++ b/src/
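The swap rule this patch adds to LoadPropagation::checkSwapSrc01 can be read as a small predicate on the sub-operation bits. The sketch below restates it outside of Mesa. The constant values are copied from the NV50_IR_SUBOP_XMAD_* defines in the patch; the shortened XMAD_* names and the function xmadCanSwapSrc01 are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Values copied from the NV50_IR_SUBOP_XMAD_* defines in the patch.
enum : unsigned {
   XMAD_PSL        = 1u << 0,
   XMAD_MRG        = 1u << 1,
   XMAD_CBCC       = 4u << 2,
   XMAD_CMODE_MASK = 0x7u << 2,   // same value as the literal 0x1c in the patch
};

// Hypothetical predicate: true when sources 0 and 1 may be exchanged,
// i.e. when neither the CBCC mode nor the MRG flag is set.
static bool xmadCanSwapSrc01(unsigned subOp)
{
   if ((subOp & XMAD_CMODE_MASK) == XMAD_CBCC)
      return false;
   if (subOp & XMAD_MRG)
      return false;
   return true;
}
```

Written this way, the mask has a name instead of the bare 0x1c, which is the only difference from the check in the patch.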
[Mesa-dev] [PATCH 4/4] nv50/ir: further optimize multiplication by immediates
Strongly mitigates the harm from the previous commit, which made many
integer multiplications much heavier in register and instruction count.

total instructions in shared programs : 5294693 -> 5268293 (-0.50%)
total gprs used in shared programs    : 624962 -> 624196 (-0.12%)
total shared used in shared programs  : 360704 -> 360704 (0.00%)
total local used in shared programs   : 21048 -> 20952 (-0.46%)

                local     shared        gpr       inst      bytes
    helped           1          0        368       1772       1772
      hurt           0          0         74         23         23

Signed-off-by: Rhys Perry
---
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 123 ++---
 src/util/bitscan.h                                 |  26 +
 2 files changed, 135 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 84cb5eb04b..aaad4db479 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -371,6 +371,10 @@ private:
    void tryCollapseChainedMULs(Instruction *, const int s, ImmediateValue&);
 
    CmpInstruction *findOriginForTestWithZero(Value *);
+
+   Value *createMulMethod1(Value *a, unsigned b, Value *c);
+   Value *createMulMethod2(Value *a, unsigned b, Value *c);
+   Value *createMul(Value *a, unsigned b, Value *c);
 
    unsigned int foldCount;
 
@@ -946,6 +950,97 @@ ConstantFolding::opnd3(Instruction *i, ImmediateValue &imm2)
       return;
    }
 }
+
+Value *
+ConstantFolding::createMulMethod1(Value *a, unsigned b, Value *c)
+{
+   if (b == 1)
+      return a;
+
+   // Basically constant folded shift and add multiplication.
+   Value *res = c ? c : bld.loadImm(NULL, 0u);
+   bool resZero = !c;
+   unsigned ashift = 0;
+   while (b) {
+      if ((b & 1) && ashift) {
+         if (resZero)
+            res = bld.mkOp2v(OP_SHL, TYPE_U32, bld.getSSA(), a, bld.mkImm(ashift));
+         else
+            res = bld.mkOp3v(OP_SHLADD, TYPE_U32, bld.getSSA(), a, bld.mkImm(ashift), res);
+         resZero = false;
+      } else if (b & 1) {
+         if (resZero)
+            res = a;
+         else
+            res = bld.mkOp2v(OP_ADD, TYPE_U32, bld.getSSA(), res, a);
+         resZero = false;
+      }
+      b >>= 1;
+      ashift++;
+   }
+   return res;
+}
+
+Value *
+ConstantFolding::createMulMethod2(Value *a, unsigned b, Value *c)
+{
+   uint64_t b2 = u_next_power_of_two(b);
+   unsigned b2shift = ffsll(b2) - 1;
+   if (b2 != b) { // a * b2 - a * (b2 - b)
+      // mul1 = a * (b2 - b)
+      Value *mul1 = createMulMethod1(a, b2 - b, NULL);
+
+      if (b2shift < 32 && c) { // a * b2 - mul1 + c (implemented as a * b2 + c - mul1)
+         return bld.mkOp2v(OP_SUB, TYPE_U32, bld.getSSA(),
+                           bld.mkOp3v(OP_SHLADD, TYPE_U32, bld.getSSA(),
+                                      a, bld.mkImm(b2shift), c),
+                           mul1);
+      } else
+      if (b2shift < 32) { // a * b2 - mul1
+         Value *res = bld.getSSA();
+         Instruction *i = bld.mkOp3(OP_SHLADD, TYPE_U32, res, a, bld.mkImm(b2shift), mul1);
+         if (bld.getProgram()->getTarget()->isModSupported(i, 2, NV50_IR_MOD_NEG))
+            i->src(2).mod *= Modifier(NV50_IR_MOD_NEG);
+         else
+            i->setSrc(2, bld.mkOp1v(OP_NEG, TYPE_U32, bld.getSSA(), mul1));
+         return res;
+      } else
+      if (c) { // - mul1 + c (implemented as c - mul1)
+         return bld.mkOp2v(OP_SUB, TYPE_U32, bld.getSSA(), c, mul1);
+      } else { // - mul1
+         return bld.mkOp1v(OP_NEG, TYPE_U32, bld.getSSA(), mul1);
+      }
+   } else {
+      if (c) // a * b2 + c
+         return bld.mkOp3v(OP_SHLADD, TYPE_U32, bld.getSSA(), a, bld.mkImm(b2shift), c);
+      else // a * b2
+         return bld.mkOp2v(OP_SHL, TYPE_U32, bld.getSSA(), a, bld.loadImm(NULL, b2shift));
+   }
+}
+
+Value *
+ConstantFolding::createMul(Value *a, unsigned b, Value *c)
+{
+   unsigned cost[2];
+
+   // Estimate cost for first method (a << i) + (a << j) + ...
+   cost[0] = u_bit_count64(b >> 1);
+
+   // Estimate cost for second method (a << i) - ((a << j) + (a << k) + ...)
+   uint64_t rounded_b = u_next_power_of_two(b);
+   cost[1] = rounded_b == b ? 1 : (u_bit_count64((rounded_b - b) >> 1) + 2);
+   if (c) cost[1]++;
+
+   // The general method, multiplication by XMADs, costs three instructions.
+   // So nothing larger than that or it could be making things worse.
+   if (cost[0] > 3 && cost[1] > 3)
+      return NULL;
+
+   if (cost[0] < cost[1])
+      return createMulMethod1(a, b, c);
+   else
+      return createMulMethod2(a, b, c);
+}
 
 void
 ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
@@ -1034,13 +1129,13 @@ ConstantFolding::opnd(Instruction *i, ImmediateValue &imm0, int s)
          i->setSrc(s, i->getSrc(t));
          i->src(s).mod =
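The shift-and-add decomposition behind createMulMethod1 can be tried out on plain integers. This is a minimal standalone sketch, not the Mesa code: mulShiftAdd and shiftAddCost are hypothetical names, and the cost helper only mirrors the patch's u_bit_count64(b >> 1) estimate with a hand-written bit count.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes a * b + c using only shifts and adds,
// one term per set bit of b. Arithmetic wraps modulo 2^32, as it would
// for a TYPE_U32 value.
static uint32_t mulShiftAdd(uint32_t a, uint32_t b, uint32_t c)
{
   uint32_t res = c;
   for (unsigned shift = 0; b != 0; b >>= 1, ++shift) {
      if (b & 1)
         res += a << shift;   // one SHL, SHLADD or ADD in the IR version
   }
   return res;
}

// Hypothetical helper: counts the set bits of b above bit 0, mirroring the
// patch's estimate cost[0] = u_bit_count64(b >> 1).
static unsigned shiftAddCost(uint32_t b)
{
   unsigned n = 0;
   for (b >>= 1; b != 0; b >>= 1)
      n += b & 1;
   return n;
}
```

For b = 10 (binary 1010) the loop adds (a << 1) and (a << 3), and shiftAddCost(10) evaluates to 2, below the three-instruction budget the patch compares against.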
[Mesa-dev] [PATCH 3/4] nv50/ir: optimize imul/imad to xmads
This hurts the shader-db numbers a fair bit, but a few XMADs are much
faster than an IMUL or IMAD, and the cost is mitigated by the next commit,
which turns many multiplications by immediates into sequences that are
shorter and less register heavy than the XMADs.

total instructions in shared programs : 5256901 -> 5294693 (0.72%)
total gprs used in shared programs    : 624328 -> 624962 (0.10%)
total shared used in shared programs  : 360704 -> 360704 (0.00%)
total local used in shared programs   : 20952 -> 21048 (0.46%)

                local     shared        gpr       inst      bytes
    helped           0          0         39          0          0
      hurt           1          0        334       2277       2277

Signed-off-by: Rhys Perry
---
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 53 ++
 1 file changed, 53 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index a43b481a01..84cb5eb04b 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -2246,13 +2246,18 @@ AlgebraicOpt::visit(BasicBlock *bb)
 // =
 
 // ADD(SHL(a, b), c) -> SHLADD(a, b, c)
+// MUL(a, b) -> a few XMADs
+// MAD/FMA(a, b, c) -> a few XMADs
 class LateAlgebraicOpt : public Pass
 {
 private:
    virtual bool visit(Instruction *);
 
    void handleADD(Instruction *);
+   void handleMULMAD(Instruction *);
    bool tryADDToSHLADD(Instruction *);
+
+   BuildUtil bld;
 };
 
 void
@@ -2312,6 +2317,49 @@ LateAlgebraicOpt::tryADDToSHLADD(Instruction *add)
 
    return true;
 }
+
+// MUL(a, b) -> a few XMADs
+// MAD/FMA(a, b, c) -> a few XMADs
+void
+LateAlgebraicOpt::handleMULMAD(Instruction *i)
+{
+   // TODO: handle NV50_IR_SUBOP_MUL_HIGH
+   if (!prog->getTarget()->isOpSupported(OP_XMAD, TYPE_U32))
+      return;
+   if (isFloatType(i->dType) || typeSizeof(i->dType) != 4)
+      return;
+   if (i->subOp || i->usesFlags() || i->flagsDef >= 0)
+      return;
+
+   assert(!i->src(0).mod);
+   assert(!i->src(1).mod);
+   assert(i->op == OP_MUL ? 1 : !i->src(2).mod);
+
+   bld.setPosition(i, true);
+
+   Value *a = i->getSrc(0);
+   Value *b = i->getSrc(1);
+   Value *c = i->op == OP_MUL ? bld.mkImm(0) : i->getSrc(2);
+
+   Value *tmp0 = bld.getSSA();
+   Value *tmp1 = bld.getSSA();
+
+   Instruction *insn = bld.mkOp3(OP_XMAD, TYPE_U32, tmp0, b, a, c);
+   insn->setPredicate(i->cc, i->getPredicate());
+
+   insn = bld.mkOp3(OP_XMAD, TYPE_U32, tmp1, b, a, bld.mkImm(0));
+   insn->setPredicate(i->cc, i->getPredicate());
+   insn->src(1).mod = NV50_IR_MOD_H1;
+   insn->subOp = NV50_IR_SUBOP_XMAD_MRG;
+
+   insn = bld.mkOp3(OP_XMAD, TYPE_U32, i->getDef(0), b, tmp1, tmp0);
+   insn->setPredicate(i->cc, i->getPredicate());
+   insn->src(0).mod = NV50_IR_MOD_H1;
+   insn->src(1).mod = NV50_IR_MOD_H1;
+   insn->subOp = NV50_IR_SUBOP_XMAD_PSL | NV50_IR_SUBOP_XMAD_CBCC;
+
+   delete_Instruction(prog, i);
+}
 
 bool
 LateAlgebraicOpt::visit(Instruction *i)
@@ -2320,6 +2368,11 @@ LateAlgebraicOpt::visit(Instruction *i)
    case OP_ADD:
       handleADD(i);
       break;
+   case OP_MUL:
+   case OP_MAD:
+   case OP_FMA:
+      handleMULMAD(i);
+      break;
    default:
       break;
    }
-- 
2.14.4
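The three-XMAD sequence in handleMULMAD relies on the fact that the low 32 bits of a 32-bit product can be assembled from 16x16-bit partial products. The sketch below shows only that arithmetic identity on plain integers. It is not a model of the hardware's PSL, MRG or CBCC behaviour, and the function name mad32From16 is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns (a * b + c) mod 2^32 using only 16x16-bit
// multiplies, based on the identity
//   a * b = lo(a)*lo(b) + ((lo(a)*hi(b) + hi(a)*lo(b)) << 16)   (mod 2^32)
// The hi(a)*hi(b) term is shifted out of the low 32 bits and can be dropped.
static uint32_t mad32From16(uint32_t a, uint32_t b, uint32_t c)
{
   const uint32_t alo = a & 0xffffu, ahi = a >> 16;
   const uint32_t blo = b & 0xffffu, bhi = b >> 16;

   uint32_t low   = alo * blo + c;          // first partial product plus addend
   uint32_t cross = alo * bhi + ahi * blo;  // only its low 16 bits survive the shift
   return low + (cross << 16);
}
```

Each of the three terms corresponds loosely to one 16-bit multiply-add, which is why a full 32-bit multiply-add needs about three such instructions.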
[Mesa-dev] [PATCH 0/4] nv50/ir: Improve Performance of Integer Multiplication
This series improves the performance of integer multiplication by removing
most uses of the very slow IMAD and IMUL instructions. It depends on the
SHLADD/IndirectPropagation patches.

The first and second patches add support for the XMAD instruction in
codegen.

The third patch replaces most IMADs and IMULs with a sequence of XMADs.
This is far faster but increases the total instruction count in shader-db
by 0.72%.

The fourth patch lowers that number significantly. It replaces many
multiplications with instructions that should be as fast as or faster than
the XMAD approach. They are also typically smaller and less register
heavy, so this patch decreases the total instruction count by 0.50%.

This series gives roughly a 50% speedup in fragment-heavy scenarios with
Dolphin 5.0. All timings were made with interesting-looking fifos from
Dolphin's bugtracker:

Wind Waker: 18 FPS -> 26 FPS at 3x internal resolution
Wind Waker: 8 FPS -> 11 FPS at 5x internal resolution
Paper Mario?: 26 FPS -> 42 FPS at 5x internal resolution
SpongeBob Movie: 19 FPS -> 30 FPS at 5x internal resolution

Unigine Heaven and Unigine Valley seem to run the same at low quality with
no anti-aliasing and no tessellation. SuperTuxKart and 0 A.D. also show no
change.

It's possible these patches may break something, especially the fourth
one. Piglit shows no functionality regressions, though the patches should
probably be tested for improvements or breakage with actual applications.

These patches can also be found on my github:
https://github.com/pendingchaos/mesa/tree/nv-xmad-v1

The final changes in shader-db are as follows:

total instructions in shared programs : 5256901 -> 5268293 (0.22%)
total gprs used in shared programs    : 624328 -> 624196 (-0.02%)
total shared used in shared programs  : 360704 -> 360704 (0.00%)
total local used in shared programs   : 20952 -> 20952 (0.00%)

                local     shared        gpr       inst      bytes
    helped           0          0        255        680        680
      hurt           0          0        128       1484       1484

Rhys Perry (4):
  nv50/ir: add preliminary support for OP_XMAD
  gm107/ir: add support for OP_XMAD on GM107+
  nv50/ir: optimize imul/imad to xmads
  nv50/ir: further optimize multiplication by immediates

 src/gallium/drivers/nouveau/codegen/nv50_ir.cpp    |   3 +-
 src/gallium/drivers/nouveau/codegen/nv50_ir.h      |  14 ++
 .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp |  61 +++
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 188 +++--
 .../drivers/nouveau/codegen/nv50_ir_print.cpp      |  20 +++
 .../drivers/nouveau/codegen/nv50_ir_target.cpp     |   7 +-
 .../nouveau/codegen/nv50_ir_target_gm107.cpp       |   5 +
 .../nouveau/codegen/nv50_ir_target_nv50.cpp        |   5 +-
 .../nouveau/codegen/nv50_ir_target_nvc0.cpp        |  26 ++-
 src/util/bitscan.h                                 |  26 +++
 10 files changed, 331 insertions(+), 24 deletions(-)

-- 
2.14.4
[Mesa-dev] [PATCH 2/4] gm107/ir: add support for OP_XMAD on GM107+
Signed-off-by: Rhys Perry
---
 .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 61 ++
 .../nouveau/codegen/nv50_ir_target_gm107.cpp       |  6 ++-
 .../nouveau/codegen/nv50_ir_target_nvc0.cpp        |  1 +
 3 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index 26826d6360..8ace77aa59 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -155,6 +155,7 @@ private:
    void emitIMUL();
    void emitIMAD();
    void emitISCADD();
+   void emitXMAD();
    void emitIMNMX();
    void emitICMP();
    void emitISET();
@@ -1881,6 +1882,63 @@ CodeEmitterGM107::emitISCADD()
    emitGPR (0x08, insn->src(0));
    emitGPR (0x00, insn->def(0));
 }
+
+void
+CodeEmitterGM107::emitXMAD()
+{
+   assert(insn->src(0).getFile() == FILE_GPR);
+
+   bool constbuf = false;
+   bool psl_mrg = true;
+   bool immediate = false;
+   if (insn->src(2).getFile() == FILE_MEMORY_CONST) {
+      assert(insn->src(1).getFile() == FILE_GPR);
+      constbuf = true;
+      psl_mrg = false;
+      emitInsn(0x5100);
+      emitGPR(0x27, insn->src(1));
+      emitCBUF(0x22, -1, 0x14, 16, 2, insn->src(2));
+   } else if (insn->src(1).getFile() == FILE_MEMORY_CONST) {
+      assert(insn->src(2).getFile() == FILE_GPR);
+      constbuf = true;
+      emitInsn(0x4e00);
+      emitCBUF(0x22, -1, 0x14, 16, 2, insn->src(1));
+      emitGPR(0x27, insn->src(2));
+   } else if (insn->src(1).getFile() == FILE_IMMEDIATE) {
+      assert(insn->src(2).getFile() == FILE_GPR);
+      assert(!insn->src(1).mod.h1());
+      immediate = true;
+      emitInsn(0x3600);
+      emitIMMD(0x14, 19, insn->src(1));
+      emitGPR(0x27, insn->src(2));
+   } else {
+      assert(insn->src(1).getFile() == FILE_GPR);
+      assert(insn->src(2).getFile() == FILE_GPR);
+      emitInsn(0x5b00);
+      emitGPR(0x14, insn->src(1));
+      emitGPR(0x27, insn->src(2));
+   }
+
+   if (insn->src(0).mod.sext())
+      emitField(0x30, 2, insn->src(1).mod.sext() ? 3 : 1);
+   else
+      emitField(0x30, 2, insn->src(1).mod.sext() ? 2 : 0);
+   emitField(0x35, 1, insn->src(0).mod.h1());
+   if (!immediate)
+      emitField(constbuf ? 0x34 : 0x23, 1, insn->src(1).mod.h1());
+
+   if (psl_mrg) {
+      emitField(constbuf ? 0x37 : 0x24, 1, insn->subOp & NV50_IR_SUBOP_XMAD_PSL ? 1 : 0);
+      emitField(constbuf ? 0x38 : 0x25, 1, insn->subOp & NV50_IR_SUBOP_XMAD_MRG ? 1 : 0);
+   }
+   emitField(0x32, constbuf ? 2 : 3, (insn->subOp >> 2) & 0x7);
+
+   emitX(constbuf ? 0x36 : 0x26);
+   emitCC(0x2f);
+
+   emitGPR(0x0, insn->def(0));
+   emitGPR(0x8, insn->src(0));
+}
 
 void
 CodeEmitterGM107::emitIMNMX()
@@ -3253,6 +3311,9 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
    case OP_SHLADD:
       emitISCADD();
       break;
+   case OP_XMAD:
+      emitXMAD();
+      break;
    case OP_MIN:
    case OP_MAX:
       if (isFloatType(insn->dType)) {
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_gm107.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_gm107.cpp
index 24a1cbb8da..f918fbfdd3 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_gm107.cpp
@@ -60,8 +60,11 @@ TargetGM107::isOpSupported(operation op, DataType ty) const
    case OP_SQRT:
    case OP_DIV:
    case OP_MOD:
-   case OP_XMAD:
       return false;
+   case OP_XMAD:
+      if (isFloatType(ty))
+         return false;
+      break;
    default:
       break;
    }
@@ -230,6 +233,7 @@ TargetGM107::getLatency(const Instruction *insn) const
    case OP_SUB:
    case OP_VOTE:
    case OP_XOR:
+   case OP_XMAD:
       if (insn->dType != TYPE_F64)
          return 6;
       break;
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
index 66efa0135f..3b96c71f44 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
@@ -161,6 +161,7 @@ static const struct opProperties _initPropsGM107[] = {
    { OP_SUSTP, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4 },
    { OP_SUREDB, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4 },
    { OP_SUREDP, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4 },
+   { OP_XMAD, 0x0, 0x0, 0x0, 0x0, 0x6, 0x2 },
 };
 
 void TargetNVC0::initProps(const struct opProperties *props, int size)
-- 
2.14.4
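The two-bit signedness field that emitXMAD writes at bit 0x30 packs the sext modifiers of src0 and src1. A minimal sketch of that mapping, together with a generic bit-field insert, is given below. packField and xmadSignMode are hypothetical names; packField is only an illustration of the idea and does not reproduce Mesa's emitField, which operates on the emitter's own code buffer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: ORs the low `size` bits of `value` into `word`
// starting at bit `pos` (assumes 0 < size < 64 and pos + size <= 64).
static uint64_t packField(uint64_t word, unsigned pos, unsigned size, uint64_t value)
{
   const uint64_t mask = (uint64_t(1) << size) - 1;
   return word | ((value & mask) << pos);
}

// Mirrors the patch: src0's sext modifier selects bit 0 of the field and
// src1's sext modifier selects bit 1.
static unsigned xmadSignMode(bool src0Sext, bool src1Sext)
{
   if (src0Sext)
      return src1Sext ? 3 : 1;
   return src1Sext ? 2 : 0;
}
```

With both sources sign-extended, packField(0, 0x30, 2, xmadSignMode(true, true)) sets bits 0x30 and 0x31 of the word.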
[Mesa-dev] [Bug 106903] radv: Fragment shader output goes to wrong attachments when render targets are sparse
https://bugs.freedesktop.org/show_bug.cgi?id=106903

--- Comment #2 from Bas Nieuwenhuizen ---
https://patchwork.freedesktop.org/patch/229361/ should fix this.
[Mesa-dev] [PATCH 0/4] nv50/ir: Improve Performance of Integer Multiplication
Forgot to CC you.
[Mesa-dev] [PATCH v2] configure.ac/meson.build: Add options for library suffixes
From: Benjamin Gordon

When building the Chrome OS Android container, we need to build copies
of mesa that don't conflict with the Android system-supplied libraries.
This adds options to create suffixed versions of EGL and GLES libraries:

    libEGL.so -> libEGL${egl-lib-suffix}.so
    libGLESv1_CM.so -> libGLESv1_CM${gles-lib-suffix}.so
    libGLESv2.so -> libGLESv2${gles-lib-suffix}.so

This is similar to what happens when --enable-libglvnd is specified, but
without the side effects of linking against libglvnd. To avoid
unexpected clashes with the suffixes appended by libglvnd, make it an
error to specify both --enable-libglvnd and --with-egl-lib-suffix.

Signed-off-by: Benjamin Gordon
Reviewed-by: Eric Engestrom
---
 configure.ac                | 18 ++
 meson.build                 |  3 +++
 meson_options.txt           | 12
 src/egl/Makefile.am         |  8
 src/egl/meson.build         |  2 +-
 src/mapi/Makefile.am        | 28 ++--
 src/mapi/es1api/meson.build |  2 +-
 src/mapi/es2api/meson.build |  2 +-
 8 files changed, 54 insertions(+), 21 deletions(-)

diff --git a/configure.ac b/configure.ac
index 35ade986d1..95ec47266f 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1511,14 +1511,30 @@ AC_ARG_WITH([gl-lib-name],
     [specify GL library name @<:@default=GL@:>@])],
   [GL_LIB=$withval],
   [GL_LIB="$DEFAULT_GL_LIB_NAME"])
+AC_ARG_WITH([egl-lib-suffix],
+  [AS_HELP_STRING([--with-egl-lib-suffix@<:@=NAME@:>@],
+    [specify EGL library suffix @<:@default=none@:>@])],
+  [EGL_LIB_SUFFIX=$withval],
+  [EGL_LIB_SUFFIX=""])
+AC_ARG_WITH([gles-lib-suffix],
+  [AS_HELP_STRING([--with-gles-lib-suffix@<:@=NAME@:>@],
+    [specify GLES library suffix @<:@default=none@:>@])],
+  [GLES_LIB_SUFFIX=$withval],
+  [GLES_LIB_SUFFIX=""])
 AC_ARG_WITH([osmesa-lib-name],
   [AS_HELP_STRING([--with-osmesa-lib-name@<:@=NAME@:>@],
     [specify OSMesa library name @<:@default=OSMesa@:>@])],
   [OSMESA_LIB=$withval],
   [OSMESA_LIB=OSMesa])
 AS_IF([test "x$GL_LIB" = xyes], [GL_LIB="$DEFAULT_GL_LIB_NAME"])
+AS_IF([test "x$EGL_LIB_SUFFIX" = xyes], [EGL_LIB_SUFFIX=""])
+AS_IF([test "x$GLES_LIB_SUFFIX" = xyes], [GLES_LIB_SUFFIX=""])
 AS_IF([test "x$OSMESA_LIB" = xyes], [OSMESA_LIB=OSMesa])
 
+if test "x$enable_libglvnd" = xyes -a "x$EGL_LIB_SUFFIX" != x; then
+    AC_MSG_ERROR([EGL lib suffix can't be used with libglvnd])
+fi
+
 dnl
 dnl Mangled Mesa support
 dnl
@@ -1534,6 +1550,8 @@ if test "x${enable_mangling}" = "xyes" ; then
   OSMESA_LIB="Mangled${OSMESA_LIB}"
 fi
 AC_SUBST([GL_LIB])
+AC_SUBST([EGL_LIB_SUFFIX])
+AC_SUBST([GLES_LIB_SUFFIX])
 AC_SUBST([OSMESA_LIB])
 
 # Check for libdrm
diff --git a/meson.build b/meson.build
index e52b4a5109..ca081b1e0b 100644
--- a/meson.build
+++ b/meson.build
@@ -373,6 +373,9 @@ if with_glvnd
   elif with_glx == 'disabled' and not with_egl
     error('glvnd requires DRI based GLX and/or EGL')
   endif
+  if get_option('egl-lib-suffix') != ''
+    error('''EGL lib suffix can't be used with libglvnd''')
+  endif
 endif
 
 # TODO: toggle for this
diff --git a/meson_options.txt b/meson_options.txt
index ce7d87f1eb..9d84c3b5bb 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -298,3 +298,15 @@ option(
   choices : ['freedreno', 'glsl', 'intel', 'nir', 'nouveau', 'all'],
   description : 'List of tools to build.',
 )
+option(
+  'egl-lib-suffix',
+  type : 'string',
+  value : '',
+  description : 'Suffix to append to EGL library name. Default: none.'
+)
+option(
+  'gles-lib-suffix',
+  type : 'string',
+  value : '',
+  description : 'Suffix to append to GLES library names. Default: none.'
+)
diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
index 086a4a1e63..c3aeeea007 100644
--- a/src/egl/Makefile.am
+++ b/src/egl/Makefile.am
@@ -184,12 +184,12 @@ libEGL_mesa_la_LDFLAGS = \
 
 else # USE_LIBGLVND
 
-lib_LTLIBRARIES = libEGL.la
-libEGL_la_SOURCES =
-libEGL_la_LIBADD = \
+lib_LTLIBRARIES = libEGL@EGL_LIB_SUFFIX@.la
+libEGL@EGL_LIB_SUFFIX@_la_SOURCES =
+libEGL@EGL_LIB_SUFFIX@_la_LIBADD = \
 	libEGL_common.la \
 	$(top_builddir)/src/mapi/shared-glapi/libglapi.la
-libEGL_la_LDFLAGS = \
+libEGL@EGL_LIB_SUFFIX@_la_LDFLAGS = \
 	-no-undefined \
 	-version-number 1:0 \
 	$(BSYMBOLIC) \
diff --git a/src/egl/meson.build b/src/egl/meson.build
index 6537e4bdee..b833fd1729 100644
--- a/src/egl/meson.build
+++ b/src/egl/meson.build
@@ -148,7 +148,7 @@ if cc.has_function('mincore')
 endif
 
 if not with_glvnd
-  egl_lib_name = 'EGL'
+  egl_lib_name = 'EGL' + get_option('egl-lib-suffix')
   egl_lib_version = '1.0.0'
 else
   egl_lib_name = 'EGL_mesa'
diff --git a/src/mapi/Makefile.am b/src/mapi/Makefile.am
index 3da1a193d2..a2b108adc9 100644
--- a/src/mapi/Makefile.am
+++ b/src/mapi/Makefile.am
@@ -178,24 +178,24 @@ GLES_include_HEADERS = \
 	$(top_srcdir)/include/GLES/glext.h \
 	$(top_srcdir)/include/GLES/glplatform.h
 
-lib_LTLIBRARIES += es1api/libGLESv1_CM.la
+lib_LTLIBRARIES += es1api/libGLE
Re: [Mesa-dev] [PATCH] configure.ac/meson.build: Add options for library suffixes
On Wed, Jun 13, 2018 at 9:46 AM Dylan Baker wrote: > Quoting Eric Engestrom (2018-06-13 03:03:25) > > On Tuesday, 2018-06-12 11:19:40 -0600, bmgor...@chromium.org wrote: > > > From: Benjamin Gordon > > > > > > When building the Chrome OS Android container, we need to build copies > > > of mesa that don't conflict with the Android system-supplied libraries. > > > This adds options to create suffixed versions of EGL and GLES > libraries: > > > > > > libEGL.so -> libEGL${egl-lib-suffix}.so > > > libGLESv1_CM.so -> libGLESv1_CM${gles-lib-suffix}.so > > > libGLESv2.so -> libGLES${gles-lib-suffix}.so > > > > > > This is similar to what happens when --enable-libglvnd is specified, > but > > > without the side effects of linking against libglvnd. > > > > This seems reasonable, and the meson side of this patch is correct, > > but we need to document or prevent the interaction between > > --enable-libglvnd and --with-egl-lib-suffix. > > > > I can't think of a use-case for having both, so I suggest "if both are > > enabled, error out"; scroll down for what this could look like in meson. > > Agreed, making it hard error to use both makes sense to me. > Thanks for the reviews. I just sent a v2 that makes it an error to pass both flags. 
> > > With that (and the corresponding autotools hunk): > > Reviewed-by: Eric Engestrom > > > > > > > > Change-Id: I0a534d3921a24c031e2532ee7d5ba9813740b33b > > > > (Note to whoever merges this patch: drop this line ^) > > > > > Signed-off-by: Benjamin Gordon > > > --- > > > configure.ac| 14 ++ > > > meson_options.txt | 12 > > > src/egl/Makefile.am | 8 > > > src/egl/meson.build | 2 +- > > > src/mapi/Makefile.am| 28 ++-- > > > src/mapi/es1api/meson.build | 2 +- > > > src/mapi/es2api/meson.build | 2 +- > > > 7 files changed, 47 insertions(+), 21 deletions(-) > > > > > > diff --git a/configure.ac b/configure.ac > > > index 35ade986d1..6070a2146b 100644 > > > --- a/configure.ac > > > +++ b/configure.ac > > > @@ -1511,12 +1511,24 @@ AC_ARG_WITH([gl-lib-name], > > > [specify GL library name @<:@default=GL@:>@])], > > >[GL_LIB=$withval], > > >[GL_LIB="$DEFAULT_GL_LIB_NAME"]) > > > +AC_ARG_WITH([egl-lib-suffix], > > > + [AS_HELP_STRING([--with-egl-lib-suffix@<:@=NAME@:>@], > > > +[specify EGL library suffix @<:@default=none@:>@])], > > > + [EGL_LIB_SUFFIX=$withval], > > > + [EGL_LIB_SUFFIX=""]) > > > +AC_ARG_WITH([gles-lib-suffix], > > > + [AS_HELP_STRING([--with-gles-lib-suffix@<:@=NAME@:>@], > > > +[specify GLES library suffix @<:@default=none@:>@])], > > > + [GLES_LIB_SUFFIX=$withval], > > > + [GLES_LIB_SUFFIX=""]) > > > AC_ARG_WITH([osmesa-lib-name], > > >[AS_HELP_STRING([--with-osmesa-lib-name@<:@=NAME@:>@], > > > [specify OSMesa library name @<:@default=OSMesa@:>@])], > > >[OSMESA_LIB=$withval], > > >[OSMESA_LIB=OSMesa]) > > > AS_IF([test "x$GL_LIB" = xyes], [GL_LIB="$DEFAULT_GL_LIB_NAME"]) > > > +AS_IF([test "x$EGL_LIB_SUFFIX" = xyes], [EGL_LIB_SUFFIX=""]) > > > +AS_IF([test "x$GLES_LIB_SUFFIX" = xyes], [GLES_LIB_SUFFIX=""]) > > > AS_IF([test "x$OSMESA_LIB" = xyes], [OSMESA_LIB=OSMesa]) > > > > > > dnl > > > @@ -1534,6 +1546,8 @@ if test "x${enable_mangling}" = "xyes" ; then > > >OSMESA_LIB="Mangled${OSMESA_LIB}" > > > fi > > > AC_SUBST([GL_LIB]) > > > 
+AC_SUBST([EGL_LIB_SUFFIX]) > > > +AC_SUBST([GLES_LIB_SUFFIX]) > > > AC_SUBST([OSMESA_LIB]) > > > > > > # Check for libdrm > > > diff --git a/meson_options.txt b/meson_options.txt > > > index ce7d87f1eb..9d84c3b5bb 100644 > > > --- a/meson_options.txt > > > +++ b/meson_options.txt > > > @@ -298,3 +298,15 @@ option( > > >choices : ['freedreno', 'glsl', 'intel', 'nir', 'nouveau', 'all'], > > >description : 'List of tools to build.', > > > ) > > > +option( > > > + 'egl-lib-suffix', > > > + type : 'string', > > > + value : '', > > > + description : 'Suffix to append to EGL library name. Default: > none.' > > > +) > > > +option( > > > + 'gles-lib-suffix', > > > + type : 'string', > > > + value : '', > > > + description : 'Suffix to append to GLES library names. Default: > none.' > > > +) > > > diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am > > > index 086a4a1e63..c3aeeea007 100644 > > > --- a/src/egl/Makefile.am > > > +++ b/src/egl/Makefile.am > > > @@ -184,12 +184,12 @@ libEGL_mesa_la_LDFLAGS = \ > > > > > > else # USE_LIBGLVND > > > > > > -lib_LTLIBRARIES = libEGL.la > > > -libEGL_la_SOURCES = > > > -libEGL_la_LIBADD = \ > > > +lib_LTLIBRARIES = libEGL@EGL_LIB_SUFFIX@.la > > > +libEGL@EGL_LIB_SUFFIX@_la_SOURCES = > > > +libEGL@EGL_LIB_SUFFIX@_la_LIBADD = \ > > > libEGL_common.la \ > > > $(top_builddir)/src/mapi/shared-glapi/libglapi.la > > > -libEGL_la_LDFLAGS = \ > > > +libEGL@EGL_LIB_SUFFIX@_la_LDFLAGS = \ > > > -no-undefined \ > > > -version-number 1:0 \ > > > $(BSYMBOLIC) \ > > > diff --git a/src/egl/
Re: [Mesa-dev] [PATCH mesa 7/9] vulkan: Add EXT_acquire_xlib_display [v3]
On Mon, Jun 11, 2018 at 10:39 PM, Keith Packard wrote: > This extension adds the ability to borrow an X RandR output for > temporary use directly by a Vulkan application. For DRM, we use the > Linux resource leasing mechanism. > > v2: > Clean up xlib_lease detection > > * Use separate temporary '_xlib_lease' variable to hold the > option value to avoid changin the type of a variable. > > * Use boolean expressions instead of additional if statements > to compute resulting with_xlib_lease value. > > * Simplify addition of VK_USE_PLATFORM_XLIB_XRANDR_KHR to > vulkan_wsi_args > > Suggested-by: Eric Engestrom > > Move mode list from wsi_display to wsi_display_connector > > Fix scope for wsi_display_mode and wsi_display_connector allocs > > Suggested-by: Jason Ekstrand > > v3: > Adopt Jason Ekstrand's coding conventions > > Declare variables at first use, eliminate extra whitespace > between types and names. Wrap lines to 80 columns. > > Explicitly forbid multiple DRM leases. Making the code support > this looks tricky and will require additional thought. > > Use xcb_randr_output_t throughout the internals of the > implementation. Convert at the public API > (wsi_get_randr_output_display). > > Clean up check for usable active_crtc (possible when only the > desired output is connected to the crtc). > > Suggested-by: Jason Ekstrand > > Signed-off-by: Keith Packard > > fixup for acquire > > fixup for RROutput type > > Signed-off-by: Keith Packard > > fixup > Lots of "fixup". Did you mean to actually comment on what that was? > --- > configure.ac| 32 ++ > meson.build | 11 + > meson_options.txt | 7 + > src/vulkan/Makefile.am | 5 + > src/vulkan/wsi/meson.build | 5 + > src/vulkan/wsi/wsi_common_display.c | 493 > src/vulkan/wsi/wsi_common_display.h | 17 + > 7 files changed, 570 insertions(+) > [...] 
> +static bool > +wsi_display_mode_matches_x(struct wsi_display_mode *wsi, > + xcb_randr_mode_info_t *xcb) > +{ > + return wsi->clock == (xcb->dot_clock + 500) / 1000 && > + wsi->hdisplay == xcb->width && > + wsi->hsync_start == xcb->hsync_start && > + wsi->hsync_end == xcb->hsync_end && > + wsi->htotal == xcb->htotal && > + wsi->hskew == xcb->hskew && > + wsi->vdisplay == xcb->height && > + wsi->vsync_start == xcb->vsync_start && > + wsi->vsync_end == xcb->vsync_end && > + wsi->vtotal == xcb->vtotal && > You're not checking vscan here. > + wsi->flags == xcb->mode_flags; > +} > [...] > +static struct wsi_display_connector * > +wsi_display_get_output(struct wsi_device *wsi_device, > + xcb_connection_t *connection, > + xcb_randr_output_t output) > +{ > + struct wsi_display *wsi = > + (struct wsi_display *) wsi_device->wsi[VK_ICD_WSI_PLA > TFORM_DISPLAY]; > + struct wsi_display_connector *connector; > + uint32_t connector_id; > + > + xcb_window_t root = wsi_display_output_to_root(connection, output); > + if (!root) > + return NULL; > + > + xcb_randr_get_screen_resources_cookie_t src = > + xcb_randr_get_screen_resources(connection, root); > + xcb_randr_get_output_info_cookie_t oic = > + xcb_randr_get_output_info(connection, output, XCB_CURRENT_TIME); > + xcb_randr_get_screen_resources_reply_t *srr = > + xcb_randr_get_screen_resources_reply(connection, src, NULL); > + xcb_randr_get_output_info_reply_t *oir = > + xcb_randr_get_output_info_reply(connection, oic, NULL); > Why are you fetching these here and not lower down? The only uses of them inside the "if (!connector)" is to free them. Seems to be a bit of a waste. 
> + > + /* See if we already have a connector for this output */ > + connector = wsi_display_find_output(wsi_device, output); > + > + if (!connector) { > + xcb_atom_t connector_id_atom = 0; > + > + /* > + * Go get the kernel connector ID for this X output > + */ > + connector_id = wsi_display_output_to_connector_id(connection, > + > &connector_id_atom, > +output); > + > + /* Any X server with lease support will have this atom */ > + if (!connector_id) { > + free(oir); > + free(srr); > + return NULL; > + } > + > + /* See if we already have a connector for this id */ > + connector = wsi_display_find_connector(wsi_device, connector_id); > + > + if (connector == NULL) { > + connector = wsi_display_alloc_connector(wsi, connector_id); > + if (!connector) { > +free(oir); > +free(srr); > +return NULL; > + } > + li
Re: [Mesa-dev] [PATCH 1/4] nv50/ir: add preliminary support for OP_XMAD
On Thu, Jun 14, 2018 at 12:02 AM, Rhys Perry wrote: > Signed-off-by: Rhys Perry > --- > src/gallium/drivers/nouveau/codegen/nv50_ir.cpp| 3 ++- > src/gallium/drivers/nouveau/codegen/nv50_ir.h | 14 > .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 12 +-- > .../drivers/nouveau/codegen/nv50_ir_print.cpp | 20 + > .../drivers/nouveau/codegen/nv50_ir_target.cpp | 7 +++--- > .../nouveau/codegen/nv50_ir_target_gm107.cpp | 1 + > .../nouveau/codegen/nv50_ir_target_nv50.cpp| 5 +++-- > .../nouveau/codegen/nv50_ir_target_nvc0.cpp| 25 > -- > 8 files changed, 77 insertions(+), 10 deletions(-) > > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp > b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp > index 49425b98b9..99bf8de370 100644 > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp > @@ -53,7 +53,8 @@ Modifier Modifier::operator*(const Modifier m) const >b &= ~NV50_IR_MOD_NEG; > > a = (this->bits ^ b) & (NV50_IR_MOD_NOT | NV50_IR_MOD_NEG); > - c = (this->bits | m.bits) & (NV50_IR_MOD_ABS | NV50_IR_MOD_SAT); > + c = (this->bits | m.bits) & (NV50_IR_MOD_ABS | NV50_IR_MOD_SAT | > +NV50_IR_MOD_H1 | NV50_IR_MOD_SEXT); > > return Modifier(a | c); > } > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h > b/src/gallium/drivers/nouveau/codegen/nv50_ir.h > index f4f3c70888..4deaf09989 100644 > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h > @@ -58,6 +58,7 @@ enum operation > OP_FMA, > OP_SAD, // abs(src0 - src1) + src2 > OP_SHLADD, > + OP_XMAD, // extended multiply-add (GM107+), does a lot of things > OP_ABS, > OP_NEG, > OP_NOT, > @@ -251,6 +252,13 @@ enum operation > #define NV50_IR_SUBOP_VOTE_ALL 0 > #define NV50_IR_SUBOP_VOTE_ANY 1 > #define NV50_IR_SUBOP_VOTE_UNI 2 > +#define NV50_IR_SUBOP_XMAD_PSL (1 << 0) > +#define NV50_IR_SUBOP_XMAD_MRG (1 << 1) > +#define NV50_IR_SUBOP_XMAD_CLO (1 << 2) > +#define NV50_IR_SUBOP_XMAD_CHI (2 << 2) > 
+#define NV50_IR_SUBOP_XMAD_CSFU (3 << 2) > +#define NV50_IR_SUBOP_XMAD_CBCC (4 << 2) > +#define NV50_IR_SUBOP_XMAD_CMODE_MASK (0x7 << 2) please document what all of those subops do here or at least for those you know. > > #define NV50_IR_SUBOP_MINMAX_LOW 1 > #define NV50_IR_SUBOP_MINMAX_MED 2 > @@ -527,6 +535,9 @@ struct Storage > #define NV50_IR_MOD_SAT (1 << 2) > #define NV50_IR_MOD_NOT (1 << 3) > #define NV50_IR_MOD_NEG_ABS (NV50_IR_MOD_NEG | NV50_IR_MOD_ABS) > +// modifiers only for XMAD > +#define NV50_IR_MOD_H1 (1 << 4) > +#define NV50_IR_MOD_SEXT (1 << 5) same here > > #define NV50_IR_INTERP_MODE_MASK 0x3 > #define NV50_IR_INTERP_LINEAR (0 << 0) > @@ -556,11 +567,14 @@ public: > inline Modifier operator&(const Modifier m) const { return bits & m.bits; > } > inline Modifier operator|(const Modifier m) const { return bits | m.bits; > } > inline Modifier operator^(const Modifier m) const { return bits ^ m.bits; > } > + inline Modifier operator~() const { return ~bits; } > > operation getOp() const; > > inline int neg() const { return (bits & NV50_IR_MOD_NEG) ? 1 : 0; } > inline int abs() const { return (bits & NV50_IR_MOD_ABS) ? 1 : 0; } > + inline int h1() const { return (bits & NV50_IR_MOD_H1) ? 1 : 0; } > + inline int sext() const { return (bits & NV50_IR_MOD_SEXT) ? 1 : 0; } > > inline operator bool() const { return bits ? 
true : false; } > > diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp > b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp > index 4d0589214d..a43b481a01 100644 > --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp > +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp > @@ -191,9 +191,16 @@ void > LoadPropagation::checkSwapSrc01(Instruction *insn) > { > const Target *targ = prog->getTarget(); > - if (!targ->getOpInfo(insn).commutative) > - if (insn->op != OP_SET && insn->op != OP_SLCT && insn->op != OP_SUB) > + if (!targ->getOpInfo(insn).commutative) { > + if (insn->op != OP_SET && insn->op != OP_SLCT && > + insn->op != OP_SUB && insn->op != OP_XMAD) > return; > + // XMAD is only commutative if both the CBCC and MRG flags are not set. > + if (insn->op == OP_XMAD && (insn->subOp & 0x1c) == > NV50_IR_SUBOP_XMAD_CBCC) > + return; > + if (insn->op == OP_XMAD && (insn->subOp & NV50_IR_SUBOP_XMAD_MRG)) > + return; > + } > if (insn->src(1).getFile() != FILE_GPR) >return; > // This is the special OP_SET used for alphatesting, we can't reverse its > @@ -488,6 +495,7 @@ Modifier::applyTo(ImmediateValue& imm) const > imm.reg.data.s32 = -imm.reg.data.s32; >if (bits & NV50_IR_MOD_NOT) > imm.reg.data.s32 = ~imm.reg.d
[Mesa-dev] [PATCH] glsl: allow standalone semicolons outside main()
From: Dave Airlie

GLSL 4.60 officially added this but games and older CTS suites actually
had shaders that did this, we may as well enable it everywhere.
---
 src/compiler/glsl/glsl_parser.yy | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/compiler/glsl/glsl_parser.yy b/src/compiler/glsl/glsl_parser.yy
index 91c10ce1a60..432fc874268 100644
--- a/src/compiler/glsl/glsl_parser.yy
+++ b/src/compiler/glsl/glsl_parser.yy
@@ -2706,6 +2706,7 @@ external_declaration:
    | declaration{ $$ = $1; }
    | pragma_statement { $$ = NULL; }
    | layout_defaults{ $$ = $1; }
+   | ';' { $$ = NULL; }
    ;

 function_definition:
-- 
2.17.0
Re: [Mesa-dev] [PATCH] glsl: allow standalone semicolons outside main()
On 14/06/18 09:53, Dave Airlie wrote:
> From: Dave Airlie
>
> GLSL 4.60 officially added this but games and older CTS suites actually
> had shaders that did this, we may as well enable it everywhere.
> ---
>  src/compiler/glsl/glsl_parser.yy | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/compiler/glsl/glsl_parser.yy b/src/compiler/glsl/glsl_parser.yy
> index 91c10ce1a60..432fc874268 100644
> --- a/src/compiler/glsl/glsl_parser.yy
> +++ b/src/compiler/glsl/glsl_parser.yy
> @@ -2706,6 +2706,7 @@ external_declaration:
>     | declaration{ $$ = $1; }
>     | pragma_statement { $$ = NULL; }
>     | layout_defaults{ $$ = $1; }
> +   | ';' { $$ = NULL; }

Should the $$ stuff be aligned with above?

Otherwise:

Acked-by: Timothy Arceri

>     ;
>
>  function_definition:
Re: [Mesa-dev] [PATCH] radv: Fix output for sparse MRTs.
On 14 June 2018 at 07:35, Bas Nieuwenhuizen wrote:
> We need to init the cb_shader_format correctly with the changed
> col_format, so this moves the col_format adjustment to before the
> cb_shader_mask gets generated.
>
> Fixes: 06d3c650980 "radv: fix a GPU hang when MRTs are sparse"
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106903
> CC: 18.1

Reviewed-by: Dave Airlie

> ---
>  src/amd/vulkan/radv_pipeline.c | 19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
> index b8b425aca9f..6eeedc65a39 100644
> --- a/src/amd/vulkan/radv_pipeline.c
> +++ b/src/amd/vulkan/radv_pipeline.c
> @@ -524,20 +524,21 @@ radv_pipeline_compute_spi_color_formats(struct radv_pipeline *pipeline,
>                 col_format |= cf << (4 * i);
>         }
>
> -       blend->cb_shader_mask = ac_get_cb_shader_mask(col_format);
> -
> -       if (blend->mrt0_is_dual_src)
> -               col_format |= (col_format & 0xf) << 4;
> -       blend->spi_shader_col_format = col_format;
> -
>         /* If the i-th target format is set, all previous target formats must
>          * be non-zero to avoid hangs.
>          */
> -       num_targets = (util_last_bit(blend->spi_shader_col_format) + 3) / 4;
> +       num_targets = (util_last_bit(col_format) + 3) / 4;
>         for (unsigned i = 0; i < num_targets; i++) {
> -               if (!(blend->spi_shader_col_format & (0xf << (i * 4))))
> -                       blend->spi_shader_col_format |= V_028714_SPI_SHADER_32_R << (i * 4);
> +               if (!(col_format & (0xf << (i * 4)))) {
> +                       col_format |= V_028714_SPI_SHADER_32_R << (i * 4);
> +               }
>         }
> +
> +       blend->cb_shader_mask = ac_get_cb_shader_mask(col_format);
> +
> +       if (blend->mrt0_is_dual_src)
> +               col_format |= (col_format & 0xf) << 4;
> +       blend->spi_shader_col_format = col_format;
>  }
>
>  static bool
> --
> 2.17.0
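For readers following the ordering argument in the patch above, here is a minimal standalone sketch in C of why the hole-filling has to happen before the shader mask is computed. Everything prefixed with sketch_ or SKETCH_ is hypothetical and not taken from radv: the format encoding (32_R == 1) is an assumed value, sketch_last_bit() stands in for util_last_bit(), and sketch_shader_mask() is a deliberately simplified stand-in for ac_get_cb_shader_mask() that enables all four channels for any non-zero slot.

```c
#include <stdint.h>

/* Assumed encoding for this sketch only; the real value comes from the
 * hardware register headers. */
#define SKETCH_FMT_32_R 0x1u

/* Stand-in for util_last_bit(): index of the highest set bit plus one,
 * or 0 when no bit is set. */
static unsigned sketch_last_bit(uint32_t v)
{
    unsigned n = 0;
    while (v) {
        n++;
        v >>= 1;
    }
    return n;
}

/* Fill every empty 4-bit slot below the highest used slot with 32_R,
 * mirroring the loop in the patch. */
static uint32_t sketch_fill_sparse_formats(uint32_t col_format)
{
    unsigned num_targets = (sketch_last_bit(col_format) + 3) / 4;

    for (unsigned i = 0; i < num_targets; i++) {
        if (!(col_format & (0xfu << (i * 4))))
            col_format |= SKETCH_FMT_32_R << (i * 4);
    }
    return col_format;
}

/* Simplified stand-in for ac_get_cb_shader_mask(): any non-zero slot
 * gets all four channel bits. The real helper is format dependent. */
static uint32_t sketch_shader_mask(uint32_t col_format)
{
    uint32_t mask = 0;

    for (unsigned i = 0; i < 8; i++) {
        if (col_format & (0xfu << (i * 4)))
            mask |= 0xfu << (i * 4);
    }
    return mask;
}
```

With only the third render target in use (col_format 0x400 under the assumed encoding), computing the mask first covers only that slot, while computing it after sketch_fill_sparse_formats() also covers the two slots that were filled in. That difference is what the reordering in the patch addresses.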
Re: [Mesa-dev] [PATCH 01/14] intel/compiler: general 8/16/32/64-bit shuffle_src_to_dst function
On 13/06/18 22:46, Jason Ekstrand wrote: > On Sat, Jun 9, 2018 at 4:13 AM, Jose Maria Casanova Crespo > mailto:jmcasan...@igalia.com>> wrote: > > This new function takes care of shuffle/unshuffle components of a > particular bit-size in components with a different bit-size. > > If source type size is smaller than destination type size the operation > needed is a component shuffle. The opposite case would be an unshuffle. > > The operation allows to skip first_component number of components from > the source. > > Shuffle MOVs are retyped using integer types avoiding problems with > denorms > and float types. This allows to simplify uses of shuffle functions > that are > dealing with these retypes individually. > > Now there is a new restriction so source and destination can not overlap > anymore when calling this suffle function. Following patches that > migrate > to use this new function will take care individually of avoiding source > and destination overlaps. > --- > src/intel/compiler/brw_fs_nir.cpp | 92 +++ > 1 file changed, 92 insertions(+) > > diff --git a/src/intel/compiler/brw_fs_nir.cpp > b/src/intel/compiler/brw_fs_nir.cpp > index 166da0aa6d7..1a9d3c41d1d 100644 > --- a/src/intel/compiler/brw_fs_nir.cpp > +++ b/src/intel/compiler/brw_fs_nir.cpp > @@ -5362,6 +5362,98 @@ shuffle_16bit_data_for_32bit_write(const > fs_builder &bld, > } > } > > +/* > + * This helper takes a source register and un/shuffles it into the > destination > + * register. > + * > + * If source type size is smaller than destination type size the > operation > + * needed is a component shuffle. The opposite case would be an > unshuffle. If > + * source/destination type size is equal a shuffle is done that > would be > + * equivalent to a simple MOV. > > > There's a sticky bit here if we want this to work with 64-bit types on > gen7 and earlier because we only have DF there and not Q so the > brw_reg_type_from_bit_size below doesn't work. 
If we care about that > case (and I'm not convinced we do), it should be easy enough to add a > type_sz(src.type) == type_sz(dst.type) case which just does MOVs from > source to dest. At this moment, current uses of this function are to read from 32-bits or to write to 32-bit. But I think that for completeness if would be nice to have all cases covered. The option of doing the MOVs in the case of equality (that would be quite normal) saves us to do the shuffle calculus for the simple case. So I'm going for it. > + * > + * For example, if source is a 16-bit type and destination is > 32-bit. A 3 > + * components .xyz 16-bit vector on SIMD8 would be. > + * > + * |x1|x2|x3|x4|x5|x6|x7|x8|y1|y2|y3|y4|y5|y6|y7|y8| > + * |z1|z2|z3|z4|z5|z6|z7|z8| | | | | | | | | > + * > + * This helper will return the following 2 32-bit components with > the 16-bit > + * values shuffled: > + * > + * |x1 y1|x2 y2|x3 y3|x4 y4|x5 y5|x6 y6|x7 y7|x8 y8| > + * |z1 |z2 |z3 |z4 |z5 |z6 |z7 |z8 | > + * > + * For unshuffle, the example would be the opposite, a 64-bit type > source > + * and a 32-bit destination. A 2 component .xy 64-bit vector on SIMD8 > + * would be: > + * > + * | x1l x1h | x2l x2h | x3l x3h | x4l x4h | > + * | x5l x5h | x6l x6h | x7l x7h | x8l x8h | > + * | y1l y1h | y2l y2h | y3l y3h | y4l y4h | > + * | y5l y5h | y6l y6h | y7l y7h | y8l y8h | > + * > + * The returned result would be the following 4 32-bit components > unshuffled: > + * > + * | x1l | x2l | x3l | x4l | x5l | x6l | x7l | x8l | > + * | x1h | x2h | x3h | x4h | x5h | x6h | x7h | x8h | > + * | y1l | y2l | y3l | y4l | y5l | y6l | y7l | y8l | > + * | y1h | y2h | y3h | y4h | y5h | y6h | y7h | y8h | > + * > + * - Source and destination register must not be overlapped. > + * - first_component parameter allows skipping source components. 
> + */ > +void > +shuffle_src_to_dst(const fs_builder &bld, > + const fs_reg &dst, > + const fs_reg &src, > + uint32_t first_component, > + uint32_t components) > +{ > + if (type_sz(src.type) <= type_sz(dst.type)) { > + /* Source is shuffled into destination */ > + unsigned size_ratio = type_sz(dst.type) / type_sz(src.type); > +#ifndef NDEBUG > + boolean src_dst_overlap = regions_overlap(dst, > + type_sz(dst.type) * bld.dispatch_width() * components, > + offset(src, bld, first_component * size_ratio), > > > Why do you need to multiply first_component by size_ratio? It's already > in units of source components. Yes, that's wro
Re: [Mesa-dev] [PATCH] glsl: allow standalone semicolons outside main()
On Wed, Jun 13, 2018 at 4:53 PM, Dave Airlie wrote:
> From: Dave Airlie
>
> GLSL 4.60 officially added this but games and older CTS suites actually
> had shaders that did this, we may as well enable it everywhere.
> ---
>  src/compiler/glsl/glsl_parser.yy | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/compiler/glsl/glsl_parser.yy b/src/compiler/glsl/glsl_parser.yy
> index 91c10ce1a60..432fc874268 100644
> --- a/src/compiler/glsl/glsl_parser.yy
> +++ b/src/compiler/glsl/glsl_parser.yy
> @@ -2706,6 +2706,7 @@ external_declaration:
>     | declaration{ $$ = $1; }
>     | pragma_statement { $$ = NULL; }
>     | layout_defaults{ $$ = $1; }
> +   | ';' { $$ = NULL; }

Indentation.

Also, piglit test?

With those,

Reviewed-by: Matt Turner
Re: [Mesa-dev] [PATCH] glsl: allow standalone semicolons outside main()
On 14 June 2018 at 10:12, Matt Turner wrote:
> On Wed, Jun 13, 2018 at 4:53 PM, Dave Airlie wrote:
>> From: Dave Airlie
>>
>> GLSL 4.60 officially added this but games and older CTS suites actually
>> had shaders that did this, we may as well enable it everywhere.
>> ---
>>  src/compiler/glsl/glsl_parser.yy | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/src/compiler/glsl/glsl_parser.yy b/src/compiler/glsl/glsl_parser.yy
>> index 91c10ce1a60..432fc874268 100644
>> --- a/src/compiler/glsl/glsl_parser.yy
>> +++ b/src/compiler/glsl/glsl_parser.yy
>> @@ -2706,6 +2706,7 @@ external_declaration:
>>     | declaration{ $$ = $1; }
>>     | pragma_statement { $$ = NULL; }
>>     | layout_defaults{ $$ = $1; }
>> +   | ';' { $$ = NULL; }
>
> Indentation.
>
> Also, piglit test?

There is already a piglit test; unfortunately it only runs under
GLSL 4.60. I suppose I can send a patch to enable it to run from
GLSL 1.30.

>
> Reviewed-by: Matt Turner

Thanks,
Dave.
Re: [Mesa-dev] [PATCH 01/14] intel/compiler: general 8/16/32/64-bit shuffle_src_to_dst function
On Wed, Jun 13, 2018 at 5:07 PM, Chema Casanova wrote: > On 13/06/18 22:46, Jason Ekstrand wrote: > > On Sat, Jun 9, 2018 at 4:13 AM, Jose Maria Casanova Crespo > > mailto:jmcasan...@igalia.com>> wrote: > > > > This new function takes care of shuffle/unshuffle components of a > > particular bit-size in components with a different bit-size. > > > > If source type size is smaller than destination type size the > operation > > needed is a component shuffle. The opposite case would be an > unshuffle. > > > > The operation allows to skip first_component number of components > from > > the source. > > > > Shuffle MOVs are retyped using integer types avoiding problems with > > denorms > > and float types. This allows to simplify uses of shuffle functions > > that are > > dealing with these retypes individually. > > > > Now there is a new restriction so source and destination can not > overlap > > anymore when calling this suffle function. Following patches that > > migrate > > to use this new function will take care individually of avoiding > source > > and destination overlaps. > > --- > > src/intel/compiler/brw_fs_nir.cpp | 92 > +++ > > 1 file changed, 92 insertions(+) > > > > diff --git a/src/intel/compiler/brw_fs_nir.cpp > > b/src/intel/compiler/brw_fs_nir.cpp > > index 166da0aa6d7..1a9d3c41d1d 100644 > > --- a/src/intel/compiler/brw_fs_nir.cpp > > +++ b/src/intel/compiler/brw_fs_nir.cpp > > @@ -5362,6 +5362,98 @@ shuffle_16bit_data_for_32bit_write(const > > fs_builder &bld, > > } > > } > > > > +/* > > + * This helper takes a source register and un/shuffles it into the > > destination > > + * register. > > + * > > + * If source type size is smaller than destination type size the > > operation > > + * needed is a component shuffle. The opposite case would be an > > unshuffle. If > > + * source/destination type size is equal a shuffle is done that > > would be > > + * equivalent to a simple MOV. 
> > > > > > There's a sticky bit here if we want this to work with 64-bit types on > > gen7 and earlier because we only have DF there and not Q so the > > brw_reg_type_from_bit_size below doesn't work. If we care about that > > case (and I'm not convinced we do), it should be easy enough to add a > > type_sz(src.type) == type_sz(dst.type) case which just does MOVs from > > source to dest. > > At this moment, current uses of this function are to read from 32-bits > or to write to 32-bit. But I think that for completeness if would be > nice to have all cases covered. The option of doing the MOVs in the case > of equality (that would be quite normal) saves us to do the shuffle > calculus for the simple case. So I'm going for it. > > > + * > > + * For example, if source is a 16-bit type and destination is > > 32-bit. A 3 > > + * components .xyz 16-bit vector on SIMD8 would be. > > + * > > + *|x1|x2|x3|x4|x5|x6|x7|x8|y1|y2|y3|y4|y5|y6|y7|y8| > > + *|z1|z2|z3|z4|z5|z6|z7|z8| | | | | | | | | > > + * > > + * This helper will return the following 2 32-bit components with > > the 16-bit > > + * values shuffled: > > + * > > + *|x1 y1|x2 y2|x3 y3|x4 y4|x5 y5|x6 y6|x7 y7|x8 y8| > > + *|z1 |z2 |z3 |z4 |z5 |z6 |z7 |z8 | > > + * > > + * For unshuffle, the example would be the opposite, a 64-bit type > > source > > + * and a 32-bit destination. 
A 2 component .xy 64-bit vector on > SIMD8 > > + * would be: > > + * > > + *| x1l x1h | x2l x2h | x3l x3h | x4l x4h | > > + *| x5l x5h | x6l x6h | x7l x7h | x8l x8h | > > + *| y1l y1h | y2l y2h | y3l y3h | y4l y4h | > > + *| y5l y5h | y6l y6h | y7l y7h | y8l y8h | > > + * > > + * The returned result would be the following 4 32-bit components > > unshuffled: > > + * > > + *| x1l | x2l | x3l | x4l | x5l | x6l | x7l | x8l | > > + *| x1h | x2h | x3h | x4h | x5h | x6h | x7h | x8h | > > + *| y1l | y2l | y3l | y4l | y5l | y6l | y7l | y8l | > > + *| y1h | y2h | y3h | y4h | y5h | y6h | y7h | y8h | > > + * > > + * - Source and destination register must not be overlapped. > > + * - first_component parameter allows skipping source components. > > + */ > > +void > > +shuffle_src_to_dst(const fs_builder &bld, > > + const fs_reg &dst, > > + const fs_reg &src, > > + uint32_t first_component, > > + uint32_t components) > > +{ > > + if (type_sz(src.type) <= type_sz(dst.type)) { > > + /* Source is shuffled into destination */ > > + unsigned size_ratio = type_sz(dst.type) / type_sz(src.type); > > +#ifndef NDEBUG > > + boolean src_dst_overlap =
[Mesa-dev] [PATCH v2 2/5] mesa/util: add allow_glsl_builtin_const_expression driconf override
Google Earth VR shaders uses builtins in constant expressions with GLSL 1.10. That feature wasn't allowed until GLSL 1.20. --- src/compiler/glsl/ast_function.cpp | 3 ++- src/gallium/auxiliary/pipe-loader/driinfo_gallium.h | 1 + src/gallium/include/state_tracker/st_api.h | 1 + src/gallium/state_trackers/dri/dri_screen.c | 2 ++ src/mesa/main/mtypes.h | 6 ++ src/mesa/state_tracker/st_extensions.c | 3 +++ src/util/xmlpool/t_options.h| 5 + 7 files changed, 20 insertions(+), 1 deletion(-) diff --git a/src/compiler/glsl/ast_function.cpp b/src/compiler/glsl/ast_function.cpp index 22d58e48c64..127aa1f91c4 100644 --- a/src/compiler/glsl/ast_function.cpp +++ b/src/compiler/glsl/ast_function.cpp @@ -529,7 +529,8 @@ generate_call(exec_list *instructions, ir_function_signature *sig, * If the function call is a constant expression, don't generate any * instructions; just generate an ir_constant. */ - if (state->is_version(120, 100)) { + if (state->is_version(120, 100) || + state->ctx->Const.AllowGLSLBuiltinConstantExpression) { ir_constant *value = sig->constant_expression_value(ctx, actual_parameters, NULL); diff --git a/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h b/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h index 21dc599dc26..f25f2080080 100644 --- a/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h +++ b/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h @@ -23,6 +23,7 @@ DRI_CONF_SECTION_DEBUG DRI_CONF_DISABLE_SHADER_BIT_ENCODING("false") DRI_CONF_FORCE_GLSL_VERSION(0) DRI_CONF_ALLOW_GLSL_EXTENSION_DIRECTIVE_MIDSHADER("false") + DRI_CONF_ALLOW_GLSL_BUILTIN_CONST_EXPRESSION("false") DRI_CONF_ALLOW_GLSL_BUILTIN_VARIABLE_REDECLARATION("false") DRI_CONF_ALLOW_GLSL_CROSS_STAGE_INTERPOLATION_MISMATCH("false") DRI_CONF_ALLOW_HIGHER_COMPAT_VERSION("false") diff --git a/src/gallium/include/state_tracker/st_api.h b/src/gallium/include/state_tracker/st_api.h index ec6e7844b87..1efc7f081d1 100644 --- a/src/gallium/include/state_tracker/st_api.h +++ 
b/src/gallium/include/state_tracker/st_api.h @@ -222,6 +222,7 @@ struct st_config_options boolean force_glsl_extensions_warn; unsigned force_glsl_version; boolean allow_glsl_extension_directive_midshader; + boolean allow_glsl_builtin_const_expression; boolean allow_glsl_builtin_variable_redeclaration; boolean allow_higher_compat_version; boolean glsl_zero_init; diff --git a/src/gallium/state_trackers/dri/dri_screen.c b/src/gallium/state_trackers/dri/dri_screen.c index aaee9870776..a86b7519364 100644 --- a/src/gallium/state_trackers/dri/dri_screen.c +++ b/src/gallium/state_trackers/dri/dri_screen.c @@ -74,6 +74,8 @@ dri_fill_st_options(struct dri_screen *screen) driQueryOptioni(optionCache, "force_glsl_version"); options->allow_glsl_extension_directive_midshader = driQueryOptionb(optionCache, "allow_glsl_extension_directive_midshader"); + options->allow_glsl_builtin_const_expression = + driQueryOptionb(optionCache, "allow_glsl_builtin_const_expression"); options->allow_glsl_builtin_variable_redeclaration = driQueryOptionb(optionCache, "allow_glsl_builtin_variable_redeclaration"); options->allow_higher_compat_version = diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 482c42a4b2d..41ad783d4b1 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -3716,6 +3716,12 @@ struct gl_constants */ GLboolean AllowGLSLExtensionDirectiveMidShader; + /** +* Allow builtins as part of constant expressions. This was not allowed +* until GLSL 1.20 this allows it everywhere. 
+*/ + GLboolean AllowGLSLBuiltinConstantExpression; + /** * Allow GLSL built-in variables to be redeclared verbatim */ diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c index 467d9b07596..7f44b4a80c0 100644 --- a/src/mesa/state_tracker/st_extensions.c +++ b/src/mesa/state_tracker/st_extensions.c @@ -1133,6 +1133,9 @@ void st_init_extensions(struct pipe_screen *screen, if (options->allow_glsl_extension_directive_midshader) consts->AllowGLSLExtensionDirectiveMidShader = GL_TRUE; + if (options->allow_glsl_builtin_const_expression) + consts->AllowGLSLBuiltinConstantExpression = GL_TRUE; + consts->MinMapBufferAlignment = screen->get_param(screen, PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT); diff --git a/src/util/xmlpool/t_options.h b/src/util/xmlpool/t_options.h index 3ada813d639..1a4945d6888 100644 --- a/src/util/xmlpool/t_options.h +++ b/src/util/xmlpool/t_options.h @@ -115,6 +115,11 @@ DRI_CONF_OPT_BEGIN_B(allow_glsl_extension_directive_midshader, def) \ DRI_CONF_DESC(en,gettext("Allow GLSL #extension directives
[Mesa-dev] [PATCH v2 1/5] util: manually extract the program name from program_invocation_name
Glibc has the same code to get program_invocation_short_name. However,
for some reason the short name gets mangled for some wine apps.

For example with Google Earth VR I get:

program_invocation_name:
"/home/tarceri/.local/share/Steam/steamapps/common/EarthVR/Earth.exe"

program_invocation_short_name:
"e"
---
 src/util/xmlconfig.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/util/xmlconfig.c b/src/util/xmlconfig.c
index 60a6331c86c..ad943e2ce48 100644
--- a/src/util/xmlconfig.c
+++ b/src/util/xmlconfig.c
@@ -45,7 +45,16 @@
 /* These aren't declared in any libc5 header */
 extern char *program_invocation_name, *program_invocation_short_name;
 #endif
-#define GET_PROGRAM_NAME() program_invocation_short_name
+static const char *
+__getProgramName()
+{
+    char * arg = strrchr(program_invocation_name, '/');
+    if (arg)
+        return arg+1;
+    else
+        return program_invocation_name;
+}
+#define GET_PROGRAM_NAME() __getProgramName()
 #elif defined(__CYGWIN__)
 #define GET_PROGRAM_NAME() program_invocation_short_name
 #elif defined(__FreeBSD__) && (__FreeBSD__ >= 2)
-- 
2.17.1
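The extraction in this patch can be tried outside of Mesa with a small standalone sketch in C. The helper name sketch_program_name() and the explicit path argument are assumptions made for illustration; the patch itself reads the glibc global program_invocation_name instead of taking a parameter.

```c
#include <string.h>

/* Hypothetical helper, not part of the patch: return the part of `path`
 * that follows the last '/', or `path` itself when it has no '/'. */
static const char *sketch_program_name(const char *path)
{
    const char *slash = strrchr(path, '/');

    return slash ? slash + 1 : path;
}
```

Given the path from the commit message, this would yield "Earth.exe" rather than the mangled "e". A path that ends in '/' yields an empty string, which is a corner case the sketch does not try to handle.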
[Mesa-dev] [PATCH v2 5/5] util: add allow_glsl_relaxed_es to drirc for Google Earth VR
--- src/util/drirc | 1 + 1 file changed, 1 insertion(+) diff --git a/src/util/drirc b/src/util/drirc index ff706d16001..7f91035ae8b 100644 --- a/src/util/drirc +++ b/src/util/drirc @@ -178,6 +178,7 @@ TODO: document the other workarounds. +
[Mesa-dev] [PATCH v2 4/5] mesa/util: add allow_glsl_relaxed_es driconfig override
This relaxes a number of ES shader restrictions allowing shaders to follow more desktop GLSL like rules. This initial implementation relaxes the following: - allows linking ES shaders with desktop shaders - allows mismatching precision qualifiers - always enables standard derivative builtins These relaxations allow Google Earth VR shaders to compile. --- src/compiler/glsl/builtin_functions.cpp | 3 ++- src/compiler/glsl/linker.cpp | 22 +++ .../auxiliary/pipe-loader/driinfo_gallium.h | 1 + src/gallium/include/state_tracker/st_api.h| 1 + src/gallium/state_trackers/dri/dri_screen.c | 2 ++ src/mesa/main/mtypes.h| 6 + src/mesa/state_tracker/st_extensions.c| 3 +++ src/util/xmlpool/t_options.h | 5 + 8 files changed, 33 insertions(+), 10 deletions(-) diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp index efe90346d0e..7119903795f 100644 --- a/src/compiler/glsl/builtin_functions.cpp +++ b/src/compiler/glsl/builtin_functions.cpp @@ -446,7 +446,8 @@ fs_oes_derivatives(const _mesa_glsl_parse_state *state) { return state->stage == MESA_SHADER_FRAGMENT && (state->is_version(110, 300) || - state->OES_standard_derivatives_enable); + state->OES_standard_derivatives_enable || + state->ctx->Const.AllowGLSLRelaxedES); } static bool diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp index e4bf634abe8..487a1ffcb05 100644 --- a/src/compiler/glsl/linker.cpp +++ b/src/compiler/glsl/linker.cpp @@ -894,7 +894,7 @@ validate_intrastage_arrays(struct gl_shader_program *prog, * Perform validation of global variables used across multiple shaders */ static void -cross_validate_globals(struct gl_shader_program *prog, +cross_validate_globals(struct gl_context *ctx, struct gl_shader_program *prog, struct exec_list *ir, glsl_symbol_table *variables, bool uniforms_only) { @@ -1115,7 +1115,8 @@ cross_validate_globals(struct gl_shader_program *prog, /* Check the precision qualifier matches for uniform variables on * GLSL ES. 
*/ - if (prog->IsES && !var->get_interface_type() && + if (!ctx->Const.AllowGLSLRelaxedES && + prog->IsES && !var->get_interface_type() && existing->data.precision != var->data.precision) { if ((existing->data.used && var->data.used) || prog->data->Version >= 300) { linker_error(prog, "declarations for %s `%s` have " @@ -1168,15 +1169,16 @@ cross_validate_globals(struct gl_shader_program *prog, * Perform validation of uniforms used across multiple shader stages */ static void -cross_validate_uniforms(struct gl_shader_program *prog) +cross_validate_uniforms(struct gl_context *ctx, +struct gl_shader_program *prog) { glsl_symbol_table variables; for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { if (prog->_LinkedShaders[i] == NULL) continue; - cross_validate_globals(prog, prog->_LinkedShaders[i]->ir, &variables, - true); + cross_validate_globals(ctx, prog, prog->_LinkedShaders[i]->ir, + &variables, true); } } @@ -2210,7 +2212,8 @@ link_intrastage_shaders(void *mem_ctx, for (unsigned i = 0; i < num_shaders; i++) { if (shader_list[i] == NULL) continue; - cross_validate_globals(prog, shader_list[i]->ir, &variables, false); + cross_validate_globals(ctx, prog, shader_list[i]->ir, &variables, + false); } if (!prog->data->LinkStatus) @@ -4807,7 +4810,8 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog) min_version = MIN2(min_version, prog->Shaders[i]->Version); max_version = MAX2(max_version, prog->Shaders[i]->Version); - if (prog->Shaders[i]->IsES != prog->Shaders[0]->IsES) { + if (!ctx->Const.AllowGLSLRelaxedES && + prog->Shaders[i]->IsES != prog->Shaders[0]->IsES) { linker_error(prog, "all shaders must use same shading " "language version\n"); goto done; @@ -4825,7 +4829,7 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog) /* In desktop GLSL, different shader versions may be linked together. In * GLSL ES, all shader versions must be the same. 
*/ - if (prog->Shaders[0]->IsES && min_version != max_version) { + if (!ctx->Const.AllowGLSLRelaxedES && min_version != max_version) { linker_error(prog, "all shaders must use same shading " "language version\n"); goto done; @@ -4951,7 +4955,7 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog) * performed, then locations are assigned for uniforms, attributes, and * varyings. */ - cross_validat
[Mesa-dev] [PATCH v2 3/5] util: add allow_glsl_builtin_const_expression to drirc for Google Earth VR
--- src/util/drirc | 4 1 file changed, 4 insertions(+) diff --git a/src/util/drirc b/src/util/drirc index c76f1ca4380..ff706d16001 100644 --- a/src/util/drirc +++ b/src/util/drirc @@ -176,6 +176,10 @@ TODO: document the other workarounds. + + + + -- 2.17.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/14] intel/compiler: new shuffle_for_32bit_write and shuffle_from_32bit_read
On Sat, Jun 9, 2018 at 4:13 AM, Jose Maria Casanova Crespo <jmcasan...@igalia.com> wrote:

> These new shuffle functions deal with the shuffle/unshuffle operations
> needed for read/write operations using 32-bit components when the
> read/written components have a different bit-size (8, 16, 64-bits).
> Shuffle from 32-bit to 32-bit becomes a simple MOV.
>
> As the new function shuffle_src_to_dst takes of doing a shuffle or an
> unshuffle based on the different type_sz of source an destination this
> generic functions work with any source/destination assuming that writes
> use a 32-bit destination or reads use a 32-bit source.
>

I'm having a lot of trouble understanding this paragraph. Would you mind
rephrasing it?

> To enable this new functions it is needed than there is no
> source/destination overlap in the case of shuffle_from_32bit_read.
> That never happens on shuffle_for_32bit_write as it allocates a new
> destination register as it was at shuffle_64bit_data_for_32bit_write.
> ---
>  src/intel/compiler/brw_fs.h       | 11 +++++++++++
>  src/intel/compiler/brw_fs_nir.cpp | 38 ++++++++++++++++++++++++++++++++++++++
>  2 files changed, 49 insertions(+)
>
> diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h
> index faf51568637..779170ecc95 100644
> --- a/src/intel/compiler/brw_fs.h
> +++ b/src/intel/compiler/brw_fs.h
> @@ -519,6 +519,17 @@ void shuffle_16bit_data_for_32bit_write(const brw::fs_builder &bld,
>                                          const fs_reg &src,
>                                          uint32_t components);
>
> +void shuffle_from_32bit_read(const brw::fs_builder &bld,
> +                             const fs_reg &dst,
> +                             const fs_reg &src,
> +                             uint32_t first_component,
> +                             uint32_t components);
> +
> +fs_reg shuffle_for_32bit_write(const brw::fs_builder &bld,
> +                               const fs_reg &src,
> +                               uint32_t first_component,
> +                               uint32_t components);
> +
>  fs_reg setup_imm_df(const brw::fs_builder &bld,
>                      double v);
>
> diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
> index 1a9d3c41d1d..1f684149fd5 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -5454,6 +5454,44 @@ shuffle_src_to_dst(const fs_builder &bld,
>     }
>  }
>
> +void
> +shuffle_from_32bit_read(const fs_builder &bld,
> +                        const fs_reg &dst,
> +                        const fs_reg &src,
> +                        uint32_t first_component,
> +                        uint32_t components)
> +{
> +   assert(type_sz(src.type) == 4);
> +

/* This function takes components in units of the destination type while
 * shuffle_src_to_dst takes components in units of the smallest type
 */

> +   if (type_sz(dst.type) > 4) {
> +      assert(type_sz(dst.type) == 8);
> +      first_component *= 2;
> +      components *= 2;
> +   }
> +
> +   shuffle_src_to_dst(bld, dst, src, first_component, components);
> +}
> +
> +fs_reg
> +shuffle_for_32bit_write(const fs_builder &bld,
> +                        const fs_reg &src,
> +                        uint32_t first_component,
> +                        uint32_t components)
> +{
> +   fs_reg dst = bld.vgrf(BRW_REGISTER_TYPE_D,
> +                         DIV_ROUND_UP (components * type_sz(src.type), 4));
> +

/* This function takes components in units of the source type while
 * shuffle_src_to_dst takes components in units of the smallest type
 */

With those added and the commit message re-worded a bit,

Reviewed-by: Jason Ekstrand

> +   if (type_sz(src.type) > 4) {
> +      assert(type_sz(src.type) == 8);
> +      first_component *= 2;
> +      components *= 2;
> +   }
> +
> +   shuffle_src_to_dst(bld, dst, src, first_component, components);
> +
> +   return dst;
> +}
> +
>  fs_reg
>  setup_imm_df(const fs_builder &bld, double v)
>  {
> --
> 2.17.1
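As a rough illustration of what the write-side shuffle in this series produces, here is a plain-array sketch in C of packing 16-bit components into 32-bit components on a SIMD8-sized layout. It is not the fs_builder code: the names sketch_dwords_needed(), sketch_shuffle_16_to_32() and SKETCH_SIMD_WIDTH are hypothetical, registers are modelled as flat arrays indexed by component and lane, and placing even-numbered components in the low half of each dword is an assumption based on the layout diagrams quoted earlier in the thread.

```c
#include <stdint.h>

#define SKETCH_SIMD_WIDTH 8

/* Number of 32-bit components needed to hold `components` values of
 * `type_size` bytes each; same rounding as DIV_ROUND_UP(x, 4). */
static unsigned sketch_dwords_needed(unsigned components, unsigned type_size)
{
    return (components * type_size + 3) / 4;
}

/* Pack 16-bit components into 32-bit components, lane by lane.
 * src holds components * SKETCH_SIMD_WIDTH values, component-major.
 * dst must have room for sketch_dwords_needed(components, 2) *
 * SKETCH_SIMD_WIDTH values and must not overlap src. */
static void sketch_shuffle_16_to_32(const uint16_t *src, unsigned components,
                                    uint32_t *dst)
{
    unsigned dwords = sketch_dwords_needed(components, 2);

    for (unsigned i = 0; i < dwords * SKETCH_SIMD_WIDTH; i++)
        dst[i] = 0;

    for (unsigned c = 0; c < components; c++) {
        for (unsigned lane = 0; lane < SKETCH_SIMD_WIDTH; lane++) {
            uint32_t value = src[c * SKETCH_SIMD_WIDTH + lane];

            /* Component c lands in dword c / 2, half-word c % 2. */
            dst[(c / 2) * SKETCH_SIMD_WIDTH + lane] |= value << (16 * (c % 2));
        }
    }
}
```

For a three-component .xyz source this gives two 32-bit components, the first holding x and y per lane and the second holding z with an unused upper half. sketch_dwords_needed(2, 8) returns 4, which is the same factor of two that the patch applies to first_component and components when a 64-bit type is involved.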
Re: [Mesa-dev] [PATCH 05/14] intel/compiler: Use shuffle_from_32bit_write for 16-bits store_ssbo
s/from/for/ in the commit message.

On Sat, Jun 9, 2018 at 4:13 AM, Jose Maria Casanova Crespo <jmcasan...@igalia.com> wrote:

> ---
>  src/intel/compiler/brw_fs_nir.cpp | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
> index ef7895262b8..a54935f7049 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -4297,11 +4297,8 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
>              * aligned. Shuffling only one component would be the same as
>              * striding it.
>              */
> -            fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_D,
> -                                  DIV_ROUND_UP(num_components, 2));
> -            shuffle_16bit_data_for_32bit_write(bld, tmp, write_src,
> -                                               num_components);
> -            write_src = tmp;
> +            write_src = shuffle_for_32bit_write(bld, write_src, 0,
> +                                                num_components);
>           }
>
>           fs_reg offset_reg;
> --
> 2.17.1