Re: [Mesa-dev] [PATCH] radv: add scratch support for spilling.
I'm not sure if using a scratch buffer per command buffer is correct. AFAIU each ring has a separate counter for the scratch offsets, and if a command buffer is used in multiple compute rings at the same time, these separate counters could conflict.

I'd think we need a preamble IB per queue that sets SGPR0/1 for all relevant stages, and modify the winsys so that it is called in the same submit ioctl as the application command buffers.

- Bas

On Tue, Jan 24, 2017, at 18:32, Dave Airlie wrote:
> From: Dave Airlie
>
> Currently LLVM 5.0 has support for spilling to a place
> pointed to by the user sgprs instead of using relocations.
>
> This is enabled by using the amdgcn-mesa-mesa3d triple.
>
> For compute gfx shaders we spill to a buffer pointed to
> by 64-bit address stored in sgprs 0/1.
> For other gfx shaders we spill to a buffer pointed to by
> the first two dwords of the buffer pointed to in sgprs 0/1.
>
> This patch enables radv to use the llvm support when present.
>
> This fixes Sascha Willems computeshader demo first screen,
> and a bunch of CTS tests now pass.
>
> This patch is likely to be in LLVM 4.0 release as well
> (fingers crossed) in which case we need to adjust the detection
> logic.
> > SIgned-off-by: Dave Airlie > --- > src/amd/common/ac_binary.c | 30 + > src/amd/common/ac_binary.h | 4 +- > src/amd/common/ac_llvm_util.c| 4 +- > src/amd/common/ac_llvm_util.h| 2 +- > src/amd/common/ac_nir_to_llvm.c | 14 ++-- > src/amd/common/ac_nir_to_llvm.h | 6 +- > src/amd/vulkan/radv_cmd_buffer.c | 137 > ++- > src/amd/vulkan/radv_device.c | 22 +++ > src/amd/vulkan/radv_pipeline.c | 10 +-- > src/amd/vulkan/radv_private.h| 13 > 10 files changed, 215 insertions(+), 27 deletions(-) > > diff --git a/src/amd/common/ac_binary.c b/src/amd/common/ac_binary.c > index 01cf000..9c66a82 100644 > --- a/src/amd/common/ac_binary.c > +++ b/src/amd/common/ac_binary.c > @@ -212,23 +212,28 @@ static const char *scratch_rsrc_dword1_symbol = > > void ac_shader_binary_read_config(struct ac_shader_binary *binary, > struct ac_shader_config *conf, > - unsigned symbol_offset) > + unsigned symbol_offset, > + bool supports_spill) > { > unsigned i; > const unsigned char *config = > ac_shader_binary_config_start(binary, symbol_offset); > bool really_needs_scratch = false; > - > + uint32_t wavesize = 0; > /* LLVM adds SGPR spills to the scratch size. >* Find out if we really need the scratch buffer. 
>*/ > - for (i = 0; i < binary->reloc_count; i++) { > - const struct ac_shader_reloc *reloc = >relocs[i]; > + if (supports_spill) { > + really_needs_scratch = true; > + } else { > + for (i = 0; i < binary->reloc_count; i++) { > + const struct ac_shader_reloc *reloc = > >relocs[i]; > > - if (!strcmp(scratch_rsrc_dword0_symbol, reloc->name) || > - !strcmp(scratch_rsrc_dword1_symbol, reloc->name)) { > - really_needs_scratch = true; > - break; > + if (!strcmp(scratch_rsrc_dword0_symbol, > reloc->name) || > + !strcmp(scratch_rsrc_dword1_symbol, > reloc->name)) { > + really_needs_scratch = true; > + break; > + } > } > } > > @@ -259,9 +264,7 @@ void ac_shader_binary_read_config(struct > ac_shader_binary *binary, > case R_0286E8_SPI_TMPRING_SIZE: > case R_00B860_COMPUTE_TMPRING_SIZE: > /* WAVESIZE is in units of 256 dwords. */ > - if (really_needs_scratch) > - conf->scratch_bytes_per_wave = > - G_00B860_WAVESIZE(value) * 256 * > 4; > + wavesize = value; > break; > case SPILLED_SGPRS: > conf->spilled_sgprs = value; > @@ -285,4 +288,9 @@ void ac_shader_binary_read_config(struct > ac_shader_binary *binary, > if (!conf->spi_ps_input_addr) > conf->spi_ps_input_addr = conf->spi_ps_input_ena; > } > + > + if (really_needs_scratch) { > + /* sgprs spills aren't spilling */ > + conf->scratch_bytes_per_wave = > G_00B860_WAVESIZE(wavesize) * 256 * 4; > + } > } > diff --git a/src/amd/common/ac_binary.h b/src/amd/common/ac_binary.h > index 282f33d..06fd855 100644 > --- a/src/amd/common/ac_binary.h > +++ b/src/amd/common/ac_binary.h > @@ -27,6 +27,7 @@ > #pragma once > > #include > +#include > > struct
Re: [Mesa-dev] [PATCH 9/9] i965: Drop _mesa_meta_pbo_TexSubImage() even for gen < 6
I tested dropping meta here separately in the context of this bug: https://bugs.freedesktop.org/show_bug.cgi?id=99209 No regressions seen there. Tested-by: Tapani PälliOn 12/20/2016 04:45 PM, Topi Pohjolainen wrote: Signed-off-by: Topi Pohjolainen --- src/mesa/drivers/dri/i965/intel_tex_image.c| 24 +++- src/mesa/drivers/dri/i965/intel_tex_subimage.c | 19 +-- 2 files changed, 12 insertions(+), 31 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c b/src/mesa/drivers/dri/i965/intel_tex_image.c index 67f83db..e503043 100644 --- a/src/mesa/drivers/dri/i965/intel_tex_image.c +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c @@ -127,7 +127,6 @@ intelTexImage(struct gl_context * ctx, { struct brw_context *brw = brw_context(ctx); struct intel_texture_image *intelImage = intel_texture_image(texImage); - bool ok; bool tex_busy = intelImage->mt && drm_intel_bo_busy(intelImage->mt->bo); @@ -156,22 +155,13 @@ intelTexImage(struct gl_context * ctx, format, type, pixels, unpack)) return; - if (brw->gen < 6 && - _mesa_meta_pbo_TexSubImage(ctx, dims, texImage, 0, 0, 0, - texImage->Width, texImage->Height, - texImage->Depth, - format, type, pixels, - tex_busy, unpack)) - return; - - ok = intel_texsubimage_tiled_memcpy(ctx, dims, texImage, - 0, 0, 0, /*x,y,z offsets*/ - texImage->Width, - texImage->Height, - texImage->Depth, - format, type, pixels, unpack, - false /*allocate_storage*/); - if (ok) + if (intel_texsubimage_tiled_memcpy(ctx, dims, texImage, + 0, 0, 0, /*x,y,z offsets*/ + texImage->Width, + texImage->Height, + texImage->Depth, + format, type, pixels, unpack, + false /*allocate_storage*/)) return; DBG("%s: upload image %dx%dx%d pixels %p\n", diff --git a/src/mesa/drivers/dri/i965/intel_tex_subimage.c b/src/mesa/drivers/dri/i965/intel_tex_subimage.c index 741637a..60dc862 100644 --- a/src/mesa/drivers/dri/i965/intel_tex_subimage.c +++ b/src/mesa/drivers/dri/i965/intel_tex_subimage.c @@ -395,7 +395,6 @@ intelTexSubImage(struct gl_context * ctx, { struct 
brw_context *brw = brw_context(ctx); struct intel_mipmap_tree *mt = intel_texture_image(texImage)->mt; - bool ok; bool tex_busy = mt && drm_intel_bo_busy(mt->bo); @@ -416,19 +415,11 @@ intelTexSubImage(struct gl_context * ctx, format, type, pixels, packing)) return; - ok = _mesa_meta_pbo_TexSubImage(ctx, dims, texImage, - xoffset, yoffset, zoffset, - width, height, depth, format, type, - pixels, tex_busy, packing); - if (ok) - return; - - ok = intel_texsubimage_tiled_memcpy(ctx, dims, texImage, - xoffset, yoffset, zoffset, - width, height, depth, - format, type, pixels, packing, - false /*for_glTexImage*/); - if (ok) + if (intel_texsubimage_tiled_memcpy(ctx, dims, texImage, + xoffset, yoffset, zoffset, + width, height, depth, + format, type, pixels, packing, + false /*for_glTexImage*/)) return; _mesa_store_texsubimage(ctx, dims, texImage, ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] configure.ac: Remove redundant libglvnd stanza
There were two "libglvnd configuration" section in the squashed commit that added libglvnd support, while only one in the original libglvnd branch. A following commit moves one of them downwards. Now remove the upper "older" one and move GL_LIB name decision downwards after the new libglvnd configuration section. Signed-off-by: Boyan Ding--- configure.ac | 81 1 file changed, 32 insertions(+), 49 deletions(-) diff --git a/configure.ac b/configure.ac index 64ace9dbcb..687ad9f99b 100644 --- a/configure.ac +++ b/configure.ac @@ -528,8 +528,6 @@ else DEFINES="$DEFINES -DNDEBUG" fi -DEFAULT_GL_LIB_NAME=GL - dnl dnl Check if linker supports -Bsymbolic dnl @@ -627,23 +625,6 @@ esac AM_CONDITIONAL(HAVE_COMPAT_SYMLINKS, test "x$HAVE_COMPAT_SYMLINKS" = xyes) -DEFAULT_GL_LIB_NAME=GL - -dnl -dnl Libglvnd configuration -dnl -AC_ARG_ENABLE([libglvnd], -[AS_HELP_STRING([--enable-libglvnd], -[Build for libglvnd @<:@default=disabled@:>@])], -[enable_libglvnd="$enableval"], -[enable_libglvnd=no]) -AM_CONDITIONAL(USE_LIBGLVND_GLX, test "x$enable_libglvnd" = xyes) -#AM_COND_IF([USE_LIBGLVND_GLX], [DEFINES="${DEFINES} -DUSE_LIBGLVND_GLX=1"]) -if test "x$enable_libglvnd" = xyes ; then -DEFINES="${DEFINES} -DUSE_LIBGLVND_GLX=1" -DEFAULT_GL_LIB_NAME=GLX_mesa -fi - dnl dnl library names dnl @@ -677,36 +658,6 @@ esac AC_SUBST([LIB_EXT]) -AC_ARG_WITH([gl-lib-name], - [AS_HELP_STRING([--with-gl-lib-name@<:@=NAME@:>@], -[specify GL library name @<:@default=GL@:>@])], - [GL_LIB=$withval], - [GL_LIB="$DEFAULT_GL_LIB_NAME"]) -AC_ARG_WITH([osmesa-lib-name], - [AS_HELP_STRING([--with-osmesa-lib-name@<:@=NAME@:>@], -[specify OSMesa library name @<:@default=OSMesa@:>@])], - [OSMESA_LIB=$withval], - [OSMESA_LIB=OSMesa]) -AS_IF([test "x$GL_LIB" = xyes], [GL_LIB="$DEFAULT_GL_LIB_NAME"]) -AS_IF([test "x$OSMESA_LIB" = xyes], [OSMESA_LIB=OSMesa]) - -dnl -dnl Mangled Mesa support -dnl -AC_ARG_ENABLE([mangling], - [AS_HELP_STRING([--enable-mangling], -[enable mangled symbols and library name 
@<:@default=disabled@:>@])], - [enable_mangling="${enableval}"], - [enable_mangling=no] -) -if test "x${enable_mangling}" = "xyes" ; then - DEFINES="${DEFINES} -DUSE_MGL_NAMESPACE" - GL_LIB="Mangled${GL_LIB}" - OSMESA_LIB="Mangled${OSMESA_LIB}" -fi -AC_SUBST([GL_LIB]) -AC_SUBST([OSMESA_LIB]) - dnl dnl potentially-infringing-but-nobody-knows-for-sure stuff dnl @@ -1332,6 +1283,8 @@ AM_CONDITIONAL(HAVE_DRI_GLX, test "x$enable_glx" = xdri) AM_CONDITIONAL(HAVE_XLIB_GLX, test "x$enable_glx" = xxlib) AM_CONDITIONAL(HAVE_GALLIUM_XLIB_GLX, test "x$enable_glx" = xgallium-xlib) +DEFAULT_GL_LIB_NAME=GL + dnl dnl Libglvnd configuration dnl @@ -1361,6 +1314,36 @@ if test "x$enable_libglvnd" = xyes ; then DEFAULT_GL_LIB_NAME=GLX_mesa fi +AC_ARG_WITH([gl-lib-name], + [AS_HELP_STRING([--with-gl-lib-name@<:@=NAME@:>@], +[specify GL library name @<:@default=GL@:>@])], + [GL_LIB=$withval], + [GL_LIB="$DEFAULT_GL_LIB_NAME"]) +AC_ARG_WITH([osmesa-lib-name], + [AS_HELP_STRING([--with-osmesa-lib-name@<:@=NAME@:>@], +[specify OSMesa library name @<:@default=OSMesa@:>@])], + [OSMESA_LIB=$withval], + [OSMESA_LIB=OSMesa]) +AS_IF([test "x$GL_LIB" = xyes], [GL_LIB="$DEFAULT_GL_LIB_NAME"]) +AS_IF([test "x$OSMESA_LIB" = xyes], [OSMESA_LIB=OSMesa]) + +dnl +dnl Mangled Mesa support +dnl +AC_ARG_ENABLE([mangling], + [AS_HELP_STRING([--enable-mangling], +[enable mangled symbols and library name @<:@default=disabled@:>@])], + [enable_mangling="${enableval}"], + [enable_mangling=no] +) +if test "x${enable_mangling}" = "xyes" ; then + DEFINES="${DEFINES} -DUSE_MGL_NAMESPACE" + GL_LIB="Mangled${GL_LIB}" + OSMESA_LIB="Mangled${OSMESA_LIB}" +fi +AC_SUBST([GL_LIB]) +AC_SUBST([OSMESA_LIB]) + # Check for libdrm PKG_CHECK_MODULES([LIBDRM], [libdrm >= $LIBDRM_REQUIRED], [have_libdrm=yes], [have_libdrm=no]) -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 98002] Mud rendering bug in Portal 2
https://bugs.freedesktop.org/show_bug.cgi?id=98002

--- Comment #15 from Clément Guérin ---
Today's Portal 2 update fixed the bug.

--
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
Re: [Mesa-dev] [PATCH] gallium/radeon: add a new HUD query for the number of mapped buffers
On 24/01/17 07:38 PM, Nicolai Hähnle wrote: > On 24.01.2017 11:34, Samuel Pitoiset wrote: >> On 01/24/2017 11:31 AM, Nicolai Hähnle wrote: >>> On 24.01.2017 11:25, Samuel Pitoiset wrote: On 01/24/2017 07:39 AM, Michel Dänzer wrote: > On 24/01/17 05:44 AM, Samuel Pitoiset wrote: >> Useful when debugging applications which map too much VRAM. > > Is the number of mapped buffers really useful, as opposed to the total > size of buffer mappings? Even if it was the latter though, it doesn't > show which mappings are for BOs in VRAM vs GTT, does it? Also, even > the > total size of mappings of BOs currently in VRAM doesn't directly > reflect > the pressure on the CPU visible part of VRAM — only the BOs which are > actively being accessed by the CPU contribute to that. It's actually useful to know the number of mapped buffers, but maybe it would be better to have two separate counters for GTT and VRAM. Although the number of mapped buffers in VRAM is most of the time very high compared to GTT AFAIK. I will submit in a follow-up patch, something which reduces the number of mapped buffers in VRAM (when a BO has been mapped only once). And this new counter helped me. >>> >>> Michel's point probably means that reducing the number/size of mapped >>> VRAM buffers isn't actually that important though. >> >> It seems useful for apps which map more than 256MB of VRAM. > > True, if all of that range is actually used by the CPU (which may well > happen, of course). If I understand Michel correctly (and this was news > to me as well), if 1GB of VRAM is mapped, but only 64MB of that are > regularly accessed by the CPU, then the kernel will migrate all of the > rest into non-visible VRAM. Some caveats: While what you're describing should certainly be possible, I'm not sure it's what currently happens with the amdgpu kernel driver. It's possible that BOs are evicted from CPU visible VRAM to GTT instead of to CPU invisible VRAM. 
Also, if a BO is currently in CPU invisible VRAM when the CPU tries accessing it, and it can't be moved into CPU visible VRAM (e.g. due to fragmentation caused by BOs which are pinned, either permanently for scanout or temporarily for command stream execution), it's migrated to GTT instead.

Anyway, the point is that the existence or absence of mappings per se shouldn't affect the BO migration; only actual CPU access does.

Also note that BOs can currently only be migrated into CPU visible VRAM as a whole for CPU access, i.e. the whole BO has to fit into a single physically contiguous range of VRAM.

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
Re: [Mesa-dev] [PATCH] gallium/radeon: add a new HUD query for the number of mapped buffers
On 25/01/17 12:05 AM, Marek Olšák wrote: > On Tue, Jan 24, 2017 at 2:17 PM, Christian König >wrote: >> Am 24.01.2017 um 11:44 schrieb Samuel Pitoiset: >>> On 01/24/2017 11:38 AM, Nicolai Hähnle wrote: On 24.01.2017 11:34, Samuel Pitoiset wrote: > On 01/24/2017 11:31 AM, Nicolai Hähnle wrote: >> On 24.01.2017 11:25, Samuel Pitoiset wrote: >>> On 01/24/2017 07:39 AM, Michel Dänzer wrote: On 24/01/17 05:44 AM, Samuel Pitoiset wrote: > > Useful when debugging applications which map too much VRAM. Is the number of mapped buffers really useful, as opposed to the total size of buffer mappings? Even if it was the latter though, it doesn't show which mappings are for BOs in VRAM vs GTT, does it? Also, even the total size of mappings of BOs currently in VRAM doesn't directly reflect the pressure on the CPU visible part of VRAM — only the BOs which are actively being accessed by the CPU contribute to that. >>> >>> >>> It's actually useful to know the number of mapped buffers, but maybe >>> it >>> would be better to have two separate counters for GTT and VRAM. >>> Although >>> the number of mapped buffers in VRAM is most of the time very high >>> compared to GTT AFAIK. >>> >>> I will submit in a follow-up patch, something which reduces the number >>> of mapped buffers in VRAM (when a BO has been mapped only once). And >>> this new counter helped me. >> >> >> Michel's point probably means that reducing the number/size of mapped >> VRAM buffers isn't actually that important though. > > > It seems useful for apps which map more than 256MB of VRAM. True, if all of that range is actually used by the CPU (which may well happen, of course). If I understand Michel correctly (and this was news to me as well), if 1GB of VRAM is mapped, but only 64MB of that are regularly accessed by the CPU, then the kernel will migrate all of the rest into non-visible VRAM. >>> >>> >>> And this can hurt us, for example DXMD maps over 500MB of VRAM. And a >>> bunch of BOs are only mapped once. 
>>
>> But when they are mapped once that won't be a problem.
>>
>> Again as Michel noted when a VRAM buffer is mapped it is migrated into the
>> visible parts of VRAM on access, not on mapping.
>>
>> In other words you can map all your VRAM buffers and keep them mapped and
>> that won't hurt anybody.
>
> Are you saying that I can map 2 GB of VRAM and it will all stay in
> VRAM and I'll get maximum performance if it's not accessed by the CPU
> too much?

Yes, that's how it's supposed to work.

> Are you sure it won't have any adverse effects on anything?

That's a pretty big statement. :) Bugs happen.

> Having useless memory mappings certainly must have some negative
> effect on something. It doesn't seem like a good idea to have a lot of
> mapped memory that doesn't have to be mapped.

I guess e.g. the bookkeeping overhead might become significant with large numbers of mappings. Maybe the issue Sam has been looking into is actually related to something like that, not to VRAM?

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
Re: [Mesa-dev] [PATCH] gallium/radeon: add a new HUD query for the number of mapped buffers
On 24/01/17 07:18 PM, Nicolai Hähnle wrote:
> On 24.01.2017 07:39, Michel Dänzer wrote:
>> On 24/01/17 05:44 AM, Samuel Pitoiset wrote:
>>> Useful when debugging applications which map too much VRAM.
>>
>> Is the number of mapped buffers really useful, as opposed to the total
>> size of buffer mappings? Even if it was the latter though, it doesn't
>> show which mappings are for BOs in VRAM vs GTT, does it? Also, even the
>> total size of mappings of BOs currently in VRAM doesn't directly reflect
>> the pressure on the CPU visible part of VRAM - only the BOs which are
>> actively being accessed by the CPU contribute to that.
>
> Thanks, I didn't know that.
>
> However, the number of mapped buffers is still useful information
> because we used to run into Linux's limit on the number of simultaneous
> mmap()ings before :)

Makes sense, but then the commit log should be changed to better reflect what it's useful for.

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
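For reference, the mapping limit mentioned above can be compared against a process's current mapping count on Linux. What follows is only a rough sketch, not part of the patch: it assumes the usual procfs files `/proc/self/maps` (one line per mapping) and `/proc/sys/vm/max_map_count`, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Count newline-terminated records in a stream. On Linux,
 * /proc/self/maps has one line per mapping, so the line count
 * is the number of live mappings of the process. */
static long count_lines(FILE *f)
{
    long lines = 0;
    int c;

    assert(f != NULL);
    while ((c = fgetc(f)) != EOF) {
        if (c == '\n')
            lines++;
    }
    return lines;
}

/* Read one integer from a file such as /proc/sys/vm/max_map_count.
 * Returns -1 if the file is missing or cannot be parsed. */
static long read_long_from_file(const char *path)
{
    long value = -1;
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Hypothetical helper: current number of mappings of this process,
 * or -1 when procfs is not available (e.g. on non-Linux systems). */
static long current_mapping_count(void)
{
    long n;
    FILE *f = fopen("/proc/self/maps", "r");

    if (!f)
        return -1;
    n = count_lines(f);
    fclose(f);
    return n;
}
```

Comparing `current_mapping_count()` with `read_long_from_file("/proc/sys/vm/max_map_count")` gives a rough idea of how close a process is to the limit; a HUD counter for mapped buffers tracks the driver's share of that number.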
Re: [Mesa-dev] [PATCH 06/37] util: add a disk_cache_remove() function
On Tue, 2017-01-24 at 17:38 -0800, Eric Anholt wrote:
> Timothy Arceri writes:
>
> > On Tue, 2017-01-24 at 15:54 -0800, Eric Anholt wrote:
> > > Timothy Arceri writes:
> > >
> > > > From: Timothy Arceri
> > > >
> > > > This will be used to remove cache items created with old versions
> > > > of Mesa or other invalid cache items from the cache.
> > >
> > > I'm not convinced that removing the item from cache when we get a hit on
> > > everything in the key except for Mesa version is the right way to go. I
> > > think we should just be hashing the Mesa version in the key so that we
> > > don't hit on mismatched versions. Then we wouldn't thrash our cache
> > > when we're, say, checking out around different versions of Mesa and
> > > re-pigliting things.
> >
> > I agree. I mention this problem in the cover letter, it's going to take
> > some reworking so I was hoping to fix it in a follow-up.
> >
> > The plan is to create directory structures like so:
> >
> > Mesa-17.0.0/i965-BDW/
> > Mesa-17.1.0/i965-BDW/
> >
> > This will allow us to just delete an entire directory if we are
> > hitting the cache limit and also easily allows third parties to install
> > precompiled shaders in those dirs.
>
> I don't get how Mesa-17.0.0 identifies a specific compile of Mesa, so
> that doesn't seem to solve versioning. Are you going to have the Mesa
> build date or something under that?

It will be the Mesa version string which for stable would be something like Mesa-17.0.0 and for git based packages it would be something like Mesa 17.1.0 (git-38a67f0).

> I'm pretty skeptical of anybody ever actually installing precompiled
> shaders and their users successfully getting cache hits off of them, so
> architecting for that seems strange to me.

Don't make Plagman sad. It's in the pipeline :)
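To make the proposed layout concrete, here is a minimal sketch of how a per-version, per-driver subdirectory name like `Mesa-17.0.0/i965-BDW/` could be formed. The function name and signature are hypothetical and are not taken from the disk cache series; only the directory shape comes from the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats "<mesa_version>/<driver>-<gpu>/" into out.
 * Returns 0 on success, -1 if the result does not fit in out_size bytes. */
static int cache_subdir(char *out, size_t out_size,
                        const char *mesa_version,
                        const char *driver, const char *gpu)
{
    int n;

    assert(out && mesa_version && driver && gpu);
    n = snprintf(out, out_size, "%s/%s-%s/", mesa_version, driver, gpu);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return 0;
}
```

With a layout like this, dropping everything cached by an old Mesa would be a matter of removing one top-level directory, which seems to be the point made in the reply above.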
[Mesa-dev] [PATCH] radv: add scratch support for spilling.
From: Dave AirlieCurrently LLVM 5.0 has support for spilling to a place pointed to by the user sgprs instead of using relocations. This is enabled by using the amdgcn-mesa-mesa3d triple. For compute gfx shaders we spill to a buffer pointed to by 64-bit address stored in sgprs 0/1. For other gfx shaders we spill to a buffer pointed to by the first two dwords of the buffer pointed to in sgprs 0/1. This patch enables radv to use the llvm support when present. This fixes Sascha Willems computeshader demo first screen, and a bunch of CTS tests now pass. This patch is likely to be in LLVM 4.0 release as well (fingers crossed) in which case we need to adjust the detection logic. SIgned-off-by: Dave Airlie --- src/amd/common/ac_binary.c | 30 + src/amd/common/ac_binary.h | 4 +- src/amd/common/ac_llvm_util.c| 4 +- src/amd/common/ac_llvm_util.h| 2 +- src/amd/common/ac_nir_to_llvm.c | 14 ++-- src/amd/common/ac_nir_to_llvm.h | 6 +- src/amd/vulkan/radv_cmd_buffer.c | 137 ++- src/amd/vulkan/radv_device.c | 22 +++ src/amd/vulkan/radv_pipeline.c | 10 +-- src/amd/vulkan/radv_private.h| 13 10 files changed, 215 insertions(+), 27 deletions(-) diff --git a/src/amd/common/ac_binary.c b/src/amd/common/ac_binary.c index 01cf000..9c66a82 100644 --- a/src/amd/common/ac_binary.c +++ b/src/amd/common/ac_binary.c @@ -212,23 +212,28 @@ static const char *scratch_rsrc_dword1_symbol = void ac_shader_binary_read_config(struct ac_shader_binary *binary, struct ac_shader_config *conf, - unsigned symbol_offset) + unsigned symbol_offset, + bool supports_spill) { unsigned i; const unsigned char *config = ac_shader_binary_config_start(binary, symbol_offset); bool really_needs_scratch = false; - + uint32_t wavesize = 0; /* LLVM adds SGPR spills to the scratch size. * Find out if we really need the scratch buffer. 
*/ - for (i = 0; i < binary->reloc_count; i++) { - const struct ac_shader_reloc *reloc = >relocs[i]; + if (supports_spill) { + really_needs_scratch = true; + } else { + for (i = 0; i < binary->reloc_count; i++) { + const struct ac_shader_reloc *reloc = >relocs[i]; - if (!strcmp(scratch_rsrc_dword0_symbol, reloc->name) || - !strcmp(scratch_rsrc_dword1_symbol, reloc->name)) { - really_needs_scratch = true; - break; + if (!strcmp(scratch_rsrc_dword0_symbol, reloc->name) || + !strcmp(scratch_rsrc_dword1_symbol, reloc->name)) { + really_needs_scratch = true; + break; + } } } @@ -259,9 +264,7 @@ void ac_shader_binary_read_config(struct ac_shader_binary *binary, case R_0286E8_SPI_TMPRING_SIZE: case R_00B860_COMPUTE_TMPRING_SIZE: /* WAVESIZE is in units of 256 dwords. */ - if (really_needs_scratch) - conf->scratch_bytes_per_wave = - G_00B860_WAVESIZE(value) * 256 * 4; + wavesize = value; break; case SPILLED_SGPRS: conf->spilled_sgprs = value; @@ -285,4 +288,9 @@ void ac_shader_binary_read_config(struct ac_shader_binary *binary, if (!conf->spi_ps_input_addr) conf->spi_ps_input_addr = conf->spi_ps_input_ena; } + + if (really_needs_scratch) { + /* sgprs spills aren't spilling */ + conf->scratch_bytes_per_wave = G_00B860_WAVESIZE(wavesize) * 256 * 4; + } } diff --git a/src/amd/common/ac_binary.h b/src/amd/common/ac_binary.h index 282f33d..06fd855 100644 --- a/src/amd/common/ac_binary.h +++ b/src/amd/common/ac_binary.h @@ -27,6 +27,7 @@ #pragma once #include +#include struct ac_shader_reloc { char name[32]; @@ -85,4 +86,5 @@ void ac_elf_read(const char *elf_data, unsigned elf_size, void ac_shader_binary_read_config(struct ac_shader_binary *binary, struct ac_shader_config *conf, - unsigned symbol_offset); + unsigned symbol_offset, + bool supports_spill); diff --git a/src/amd/common/ac_llvm_util.c b/src/amd/common/ac_llvm_util.c index 770e3bd..3ba5281 100644 --- a/src/amd/common/ac_llvm_util.c +++ b/src/amd/common/ac_llvm_util.c @@ -126,11 +126,11 @@ static const char 
*ac_get_llvm_processor_name(enum radeon_family
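As a side note on the ac_binary.c hunk above: the scratch size follows from the comment "WAVESIZE is in units of 256 dwords". A small sketch of that arithmetic, assuming the WAVESIZE field has already been extracted from the register value (the patch does that with `G_00B860_WAVESIZE`); the function name is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative only: convert an already-extracted WAVESIZE field
 * (in units of 256 dwords) to bytes. One dword is 4 bytes, so the
 * factor is 256 * 4 = 1024. Returns 0 when no scratch is needed. */
static uint32_t scratch_bytes_per_wave(uint32_t wavesize_field,
                                       bool really_needs_scratch)
{
    /* Guard against overflow of the 32-bit multiplication. */
    assert(wavesize_field <= UINT32_MAX / 1024u);

    if (!really_needs_scratch)
        return 0;
    return wavesize_field * 256u * 4u;
}
```

The patch stores the raw register value while walking the config and applies this conversion once at the end, gated on `really_needs_scratch`.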
[Mesa-dev] [Bug 99527] Provide option for llvmpipe JIT code to run cleanly under valgrind
https://bugs.freedesktop.org/show_bug.cgi?id=99527

--- Comment #1 from Roland Scheidegger ---
I agree it would be really nice if we wouldn't get valgrind errors. If you figure out how to fix it, patches welcome...

I tried to look into it at some point but couldn't really figure it out (didn't invest all that much time though). I'm not even sure this isn't a valgrind bug (last I checked there could still be some problems with simd instructions).

Tracking this stuff down in jit code isn't exactly easy, and having these harmless errors makes it more difficult to debug real issues (I've seen invalid reads and writes which needed to be fixed, and they got kinda buried in the valgrind output).
Re: [Mesa-dev] [PATCH 8/8] nir/spirv/glsl450: Implement IEEE-compliant handling of atan2(±∞, ±∞).
This appears to do the same thing as the GLSL change. This patch is

Reviewed-by: Ian Romanick

On 01/24/2017 03:26 PM, Francisco Jerez wrote:
> ---
>  src/compiler/spirv/vtn_glsl450.c | 22 +-
>  1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
> index 508f218..7af2dad 100644
> --- a/src/compiler/spirv/vtn_glsl450.c
> +++ b/src/compiler/spirv/vtn_glsl450.c
> @@ -325,12 +325,32 @@ build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def *x)
>     nir_ssa_def *rcp_scaled_t = nir_frcp(b, nir_fmul(b, t, scale));
>     nir_ssa_def *s_over_t = nir_fmul(b, nir_fmul(b, s, scale), rcp_scaled_t);
>
> +   /* For |x| = |y| assume tan = 1 even if infinite (i.e. pretend momentarily
> +    * that ∞/∞ = 1) in order to comply with the rather artificial rules
> +    * inherited from IEEE 754-2008, namely:
> +    *
> +    *  "atan2(±∞, −∞) is ±3π/4
> +    *   atan2(±∞, +∞) is ±π/4"
> +    *
> +    * Note that this is inconsistent with the rules for the neighborhood of
> +    * zero that are based on iterated limits:
> +    *
> +    *  "atan2(±0, −0) is ±π
> +    *   atan2(±0, +0) is ±0"
> +    *
> +    * but GLSL specifically allows implementations to deviate from IEEE rules
> +    * at (0,0), so we take that license (i.e. pretend that 0/0 = 1 here as
> +    * well).
> +    */
> +   nir_ssa_def *tan = nir_bcsel(b, nir_feq(b, nir_fabs(b, x), nir_fabs(b, y)),
> +                                one, nir_fabs(b, s_over_t));
> +
>     /* Calculate the arctangent and fix up the result if we had flipped the
>      * coordinate system.
>      */
>     nir_ssa_def *arc = nir_fadd(b, nir_fmul(b, nir_b2f(b, flip),
>                                             nir_imm_float(b, M_PI_2f)),
> -                               build_atan(b, nir_fabs(b, s_over_t)));
> +                               build_atan(b, tan));
>
>     /* Rather convoluted calculation of the sign of the result. When x < 0 we
>      * cannot use fsign because we need to be able to distinguish between
>
Re: [Mesa-dev] [PATCH 7/8] glsl: Implement IEEE-compliant handling of atan2(±∞, ±∞).
This patch is

Reviewed-by: Ian Romanick

On 01/24/2017 03:26 PM, Francisco Jerez wrote:
> ---
>  src/compiler/glsl/builtin_functions.cpp | 22 +-
>  1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp
> index fd59381..9d6ab80 100644
> --- a/src/compiler/glsl/builtin_functions.cpp
> +++ b/src/compiler/glsl/builtin_functions.cpp
> @@ -3590,11 +3590,31 @@ builtin_builder::_atan2(const glsl_type *type)
>     body.emit(assign(rcp_scaled_t, rcp(mul(t, scale))));
>     ir_expression *s_over_t = mul(mul(s, scale), rcp_scaled_t);
>
> +   /* For |x| = |y| assume tan = 1 even if infinite (i.e. pretend momentarily
> +    * that ∞/∞ = 1) in order to comply with the rather artificial rules
> +    * inherited from IEEE 754-2008, namely:
> +    *
> +    *  "atan2(±∞, −∞) is ±3π/4
> +    *   atan2(±∞, +∞) is ±π/4"
> +    *
> +    * Note that this is inconsistent with the rules for the neighborhood of
> +    * zero that are based on iterated limits:
> +    *
> +    *  "atan2(±0, −0) is ±π
> +    *   atan2(±0, +0) is ±0"
> +    *
> +    * but GLSL specifically allows implementations to deviate from IEEE rules
> +    * at (0,0), so we take that license (i.e. pretend that 0/0 = 1 here as
> +    * well).
> +    */
> +   ir_expression *tan = csel(equal(abs(x), abs(y)),
> +                             imm(1.0f, n), abs(s_over_t));
> +
>     /* Calculate the arctangent and fix up the result if we had flipped the
>      * coordinate system.
>      */
>     ir_variable *arc = body.make_temp(type, "arc");
> -   do_atan(body, type, arc, abs(s_over_t));
> +   do_atan(body, type, arc, tan);
>     body.emit(assign(arc, add(arc, mul(b2f(flip), imm(M_PI_2f)))));
>
>     /* Rather convoluted calculation of the sign of the result. When x < 0 we
>
Re: [Mesa-dev] [PATCH 5/8] glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity.
It's a real bummer that we have two implementations of this function that are basically written in assembly... I'm not sure what else you'd call generating IR by hand. The code review and maintenance costs are of the same magnitude for sure. We could move this to GLSL and let the standalone compiler generate the builder code. I don't think that is currently helpful. However, for future "soft" int64 and fp64 work the standalone compiler will need to be extended to also generate NIR builder. Once that is done, I think the cost-benefit analysis changes. On 01/24/2017 03:26 PM, Francisco Jerez wrote: > This addresses several issues of the current atan2 implementation: > > - Negative zero (and negative denorms which end up getting flushed to >zero) isn't handled correctly by the current implementation. The >reason is that it does 'y >= 0' and 'x < 0' comparisons to decide >on which side of the branch cut the argument is, which causes us to >return incorrect results (off by up to 2π) for very small negative >values. > > - There is a serious precision problem for x values of large enough >magnitude introduced by the floating point division operation being >implemented as a mul+rcp sequence. This can lead to the quotient >getting flushed to zero in some cases introducing an error of over >8e6 ULP in the result -- Or in the most catastrophic case will >cause us to return NaN instead of the correct value ±π/2 for y=±∞ >and x very large. We can fix this easily by scaling down both >arguments when the absolute value of the denominator goes above >certain threshold. The error of this atan2 implementation remains >below 25 ULP in most of its domain except for a neighborhood of y=0 >where it reaches a maximum error of about 180 ULP. > > - It emits a bunch of instructions including no less than three >if-else branches per scalar component that don't seem to get >optimized out later on. 
This implementation uses about 13% less >instructions on Intel SKL hardware and doesn't emit any control >flow instructions. > --- > src/compiler/glsl/builtin_functions.cpp | 82 > ++--- > 1 file changed, 46 insertions(+), 36 deletions(-) > > diff --git a/src/compiler/glsl/builtin_functions.cpp > b/src/compiler/glsl/builtin_functions.cpp > index 4a6c5af..fd59381 100644 > --- a/src/compiler/glsl/builtin_functions.cpp > +++ b/src/compiler/glsl/builtin_functions.cpp > @@ -3560,44 +3560,54 @@ builtin_builder::_acos(const glsl_type *type) > ir_function_signature * > builtin_builder::_atan2(const glsl_type *type) > { > - ir_variable *vec_y = in_var(type, "vec_y"); > - ir_variable *vec_x = in_var(type, "vec_x"); > - MAKE_SIG(type, always_available, 2, vec_y, vec_x); > - > - ir_variable *vec_result = body.make_temp(type, "vec_result"); > - ir_variable *r = body.make_temp(glsl_type::float_type, "r"); > - for (int i = 0; i < type->vector_elements; i++) { > - ir_variable *y = body.make_temp(glsl_type::float_type, "y"); > - ir_variable *x = body.make_temp(glsl_type::float_type, "x"); > - body.emit(assign(y, swizzle(vec_y, i, 1))); > - body.emit(assign(x, swizzle(vec_x, i, 1))); > - > - /* If |x| >= 1.0e-8 * |y|: */ > - ir_if *outer_if = > - new(mem_ctx) ir_if(greater(abs(x), mul(imm(1.0e-8f), abs(y; > - > - ir_factory outer_then(_if->then_instructions, mem_ctx); > - > - /* Then...call atan(y/x) */ > - do_atan(outer_then, glsl_type::float_type, r, div(y, x)); > - > - /* ...and fix it up: */ > - ir_if *inner_if = new(mem_ctx) ir_if(less(x, imm(0.0f))); > - inner_if->then_instructions.push_tail( > - if_tree(gequal(y, imm(0.0f)), > - assign(r, add(r, imm(M_PIf))), > - assign(r, sub(r, imm(M_PIf); > - outer_then.emit(inner_if); > - > - /* Else... 
*/ > - outer_if->else_instructions.push_tail( > - assign(r, mul(sign(y), imm(M_PI_2f; > + const unsigned n = type->vector_elements; > + ir_variable *y = in_var(type, "y"); > + ir_variable *x = in_var(type, "x"); > + MAKE_SIG(type, always_available, 2, y, x); > > - body.emit(outer_if); > + /* If we're on the left half-plane rotate the coordinates π/2 clock-wise > +* for the y=0 discontinuity to end up aligned with the vertical > +* discontinuity of atan(s/t) along t=0. > +*/ > + ir_variable *flip = body.make_temp(glsl_type::bvec(n), "flip"); > + body.emit(assign(flip, less(x, imm(0.0f, n; > + ir_variable *s = body.make_temp(type, "s"); > + body.emit(assign(s, csel(flip, abs(x), y))); > + ir_variable *t = body.make_temp(type, "t"); > + body.emit(assign(t, csel(flip, y, abs(x; > > - body.emit(assign(vec_result, r, 1 << i)); > - } > - body.emit(ret(vec_result)); > + /* If the magnitude of the denominator exceeds some huge value, scale down > +* the arguments in order to
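The negative-zero problem described in the commit message can be shown in isolation. The sketch below is not the code from the patch (which, per the quoted comments, rotates the coordinates and selects with csel); it only contrasts a quadrant fix-up driven by a 'y >= 0' comparison with one driven by the sign bit. The function names and the PI_F constant are made up for illustration, and IEEE-754 single-precision floats are assumed.

```c
#include <stdint.h>
#include <string.h>

#define PI_F 3.14159265358979323846f

/* Read the IEEE-754 sign bit directly, so that -0.0f counts as negative.
 * Assumes float is a 32-bit IEEE-754 value. */
static int sign_bit_set(float v)
{
   uint32_t bits;
   memcpy(&bits, &v, sizeof(bits));
   return (bits >> 31) != 0;
}

/* Fix-up in the style of the old code: pick +pi or -pi with 'y >= 0'.
 * r is the value of atan(y/x) computed beforehand. */
static float fixup_by_compare(float r, float y, float x)
{
   if (x < 0.0f)
      r += (y >= 0.0f) ? PI_F : -PI_F;
   return r;
}

/* Same fix-up, but keyed on the sign bit of y instead of a comparison. */
static float fixup_by_signbit(float r, float y, float x)
{
   if (x < 0.0f)
      r += sign_bit_set(y) ? -PI_F : PI_F;
   return r;
}
```

For y = -0.0f and x = -1.0f the quotient y/x is +0, so r is 0 in both cases. The comparison treats -0.0f as non-negative and lands on +π, while the sign-bit variant lands on -π, which is the value C's atan2 specifies for that input. The two results differ by 2π, matching the "off by up to 2π" remark above.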
Re: [Mesa-dev] [PATCH 06/37] util: add a disk_cache_remove() function
Timothy Arceri writes:

> On Tue, 2017-01-24 at 15:54 -0800, Eric Anholt wrote:
>> Timothy Arceri writes:
>>
>> > From: Timothy Arceri
>> >
>> > This will be used to remove cache items created with old versions
>> > of Mesa or other invalid cache items from the cache.
>>
>> I'm not convinced that removing the item from cache when we get a hit
>> on everything in the key except for Mesa version is the right way to
>> go.  I think we should just be hashing the Mesa version in the key so
>> that we don't hit on mismatched versions.  Then we wouldn't thrash our
>> cache when we're, say, checking out around different versions of Mesa
>> and re-pigliting things.
>
> I agree. I mention this problem in the cover letter, it's going to take
> some reworking so I was hoping to fix it in a follow-up.
>
> The plan is to create directory structures like so:
>
> Mesa-17.0.0/i965-BDW/
> Mesa-17.1.0/i965-BDW/
>
> This will allow us to just delete an entire directory if we are
> hitting the cache limit and also easily allows third parties to install
> precompiled shaders in those dirs.

I don't get how Mesa-17.0.0 identifies a specific compile of Mesa, so
that doesn't seem to solve versioning.  Are you going to have the Mesa
build date or something under that?

I'm pretty skeptical of anybody ever actually installing precompiled
shaders and their users successfully getting cache hits off of them, so
architecting for that seems strange to me.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
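The suggestion in this thread, hashing the Mesa version (or some build identifier) into the key so that mismatched builds never hit, can be sketched roughly as follows. This is an assumption-laden illustration, not disk_cache.c: FNV-1a stands in for whatever hash the cache really uses, and make_cache_key() and its build_id parameter are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* FNV-1a, used here only as a stand-in for the cache's real hash. */
static uint64_t fnv1a(uint64_t h, const void *data, size_t len)
{
   const unsigned char *p = data;
   for (size_t i = 0; i < len; i++) {
      h ^= p[i];
      h *= 0x100000001b3ULL;
   }
   return h;
}

/* Hypothetical helper: fold a build identifier into the key ahead of the
 * shader source, so two builds never share a key for the same shader. */
static uint64_t make_cache_key(const char *build_id, const char *shader_src)
{
   uint64_t h = 0xcbf29ce484222325ULL;
   /* Hash the terminating NUL too, as a separator between the fields. */
   h = fnv1a(h, build_id, strlen(build_id) + 1);
   h = fnv1a(h, shader_src, strlen(shader_src));
   return h;
}
```

With a key built this way, an entry written by one build is simply never looked up by another, so nothing has to be removed on a version mismatch. What string would actually identify "a specific compile of Mesa" is the open question raised above and is left to the caller here.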
[Mesa-dev] [Bug 99527] Provide option for llvmpipe JIT code to run cleanly under valgrind
https://bugs.freedesktop.org/show_bug.cgi?id=99527 Bug ID: 99527 Summary: Provide option for llvmpipe JIT code to run cleanly under valgrind Product: Mesa Version: 13.0 Hardware: x86-64 (AMD64) OS: Linux (All) Status: NEW Severity: normal Priority: medium Component: Other Assignee: mesa-dev@lists.freedesktop.org Reporter: john.fireba...@gmail.com QA Contact: mesa-dev@lists.freedesktop.org Currently llvmpipe JIT code is known to trigger errors when run under valgrind. For example, bug #29922 reports the following, which I also observe: ==17795== Conditional jump or move depends on uninitialised value(s) ==17795==at 0x573F792: ??? ==17795==by 0x4171342: lp_rast_shade_quads_mask (lp_rast.c:473) ==17795==by 0x4173EE9: do_block_4_3 (lp_rast_tri_tmp.h:61) ==17795==by 0x4178087: lp_rast_triangle_3_16 (lp_rast_tri.c:229) ==17795==by 0x4171913: rasterize_bin (lp_rast.c:667) ==17795==by 0x4171ACE: rasterize_scene (lp_rast.c:766) ==17795==by 0x4171BA4: lp_rast_queue_scene (lp_rast.c:791) ==17795==by 0x4178EB4: lp_scene_rasterize (lp_scene.c:405) ==17795==by 0x4179DF4: lp_setup_rasterize_scene (lp_setup.c:158) ==17795==by 0x417A296: set_scene_state (lp_setup.c:260) ==17795==by 0x417A39C: lp_setup_flush (lp_setup.c:295) ==17795==by 0x416E756: llvmpipe_flush (lp_flush.c:56) That bug is closed as RESOLVED WONTFIX but I would like to ask that this be reconsidered. Conscientious downstream developers want to make sure their code runs cleanly under valgrind. If libraries they use trigger lots of errors, it makes this task more difficult. For instance, I first had to determine whether or not this error represented a misuse of OpenGL by my own code. In this case, it's possible to search for "valgrind lp_rast_shade_quads_mask" and find the above bug report, so I was able to reasonably conclude that this was not a bug I was responsible for. 
In many of the other errors in JIT code that valgrind reports, that's not the case, and I'm still not 100% sure of the status -- whether it's a bug in my code, a bug in llvm, a supposedly harmless use of an uninitialized value, or a true false positive. I'm not the only one dissatisfied with the status quo. For a more strongly worded opinion, see http://www.americanteeth.org/2013/08/14/valgrind-is-not-optional/. If you believe that fixing these errors would harm performance of production builds, please consider using the `--enable-valgrind` configure flag as an explicit opt-in mechanism. For reference, here are some of the other errors I have received: ==9337== Conditional jump or move depends on uninitialised value(s) ==9337==at 0x402E63D: ??? ==9337==by 0xD32C84D: lp_rast_shade_quads_all (lp_rast_priv.h:271) ==9337==by 0xD32C368: block_full_4 (lp_rast_tri.c:46) ==9337==by 0xD329222: do_block_16_32_3 (lp_rast_tri_tmp.h:167) ==9337==by 0xD328E52: lp_rast_triangle_32_3 (lp_rast_tri_tmp.h:305) ==9337==by 0xD32073C: do_rasterize_bin (lp_rast.c:609) ==9337==by 0xD3203EB: rasterize_bin (lp_rast.c:628) ==9337==by 0xD31FBD1: rasterize_scene (lp_rast.c:688) ==9337==by 0xD321823: thread_function (lp_rast.c:828) ==9337==by 0xD321A61: impl_thrd_routine (threads_posix.h:87) ==9337==by 0x4E42183: start_thread (pthread_create.c:312) ==9337==by 0x6A6E37C: clone (clone.S:111) ==9337== Uninitialised value was created by a heap allocation ==9337==at 0x4C2B221: operator new(unsigned long) (in /home/travis/build/mapbox/mapbox-gl-native/mason_packages/linux-x86_64/valgrind/3.12.0/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==9337==by 0xDB14217: llvm::User::operator new(unsigned long, unsigned int) (in /home/travis/build/mapbox/mapbox-gl-native/mason_packages/linux-x86_64/mesa/13.0.3/lib/dri/swrast_dri.so) ==9337==by 0xDA60CDA: llvm::ConstantFP::get(llvm::LLVMContext&, llvm::APFloat const&) (in 
/home/travis/build/mapbox/mapbox-gl-native/mason_packages/linux-x86_64/mesa/13.0.3/lib/dri/swrast_dri.so) ==9337==by 0xDA629BD: llvm::ConstantFP::get(llvm::Type*, double) (in /home/travis/build/mapbox/mapbox-gl-native/mason_packages/linux-x86_64/mesa/13.0.3/lib/dri/swrast_dri.so) ==9337==by 0xD29993E: lp_build_const_elem (lp_bld_const.c:309) ==9337==by 0xD2999F0: lp_build_const_vec (lp_bld_const.c:333) ==9337==by 0xD29B902: lp_build_conv (lp_bld_conv.c:654) ==9337==by 0xD29B08E: lp_build_conv_auto (lp_bld_conv.c:491) ==9337==by 0xD344C3C: generate_unswizzled_blend (lp_state_fs.c:1884) ==9337==by 0xD342505: generate_fragment (lp_state_fs.c:2452) ==9337==by 0xD340947: generate_variant (lp_state_fs.c:2637) ==9337==by 0xD33FC79: llvmpipe_update_fs (lp_state_fs.c:3204) ==9337== ==9337== Thread 3 llvmpipe-1: ==9337== Use of uninitialised value of size 8 ==9337==at 0x4035AEE: ??? ==9337==by 0x40354D4: ???
[Mesa-dev] [PATCH 1/4] mesa: Trivial clean-ups in uniform_query.cpp
From: Ian RomanickThis is C++, so we can mix code and declarations. Doing so allows constification. Signed-off-by: Ian Romanick --- src/mesa/main/uniform_query.cpp | 12 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp index d5a2d0f..c2429c1 100644 --- a/src/mesa/main/uniform_query.cpp +++ b/src/mesa/main/uniform_query.cpp @@ -992,10 +992,6 @@ _mesa_uniform_matrix(struct gl_context *ctx, struct gl_shader_program *shProg, const GLvoid *values, enum glsl_base_type basicType) { unsigned offset; - unsigned vectors; - unsigned components; - unsigned elements; - int size_mul; struct gl_uniform_storage *const uni = validate_uniform_parameters(ctx, shProg, location, count, , "glUniformMatrix"); @@ -1009,11 +1005,11 @@ _mesa_uniform_matrix(struct gl_context *ctx, struct gl_shader_program *shProg, } assert(basicType == GLSL_TYPE_FLOAT || basicType == GLSL_TYPE_DOUBLE); - size_mul = basicType == GLSL_TYPE_DOUBLE ? 2 : 1; + const unsigned size_mul = basicType == GLSL_TYPE_DOUBLE ? 2 : 1; assert(!uni->type->is_sampler()); - vectors = uni->type->matrix_columns; - components = uni->type->vector_elements; + const unsigned vectors = uni->type->matrix_columns; + const unsigned components = uni->type->vector_elements; /* Verify that the types are compatible. This is greatly simplified for * matrices because they can only have a float base type. @@ -1084,7 +1080,7 @@ _mesa_uniform_matrix(struct gl_context *ctx, struct gl_shader_program *shProg, /* Store the data in the "actual type" backing storage for the uniform. */ - elements = components * vectors; + const unsigned elements = components * vectors; if (!transpose) { memcpy(>storage[size_mul * elements * offset], values, -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/4] mesa: Arrange _mesa_uniform parameters to match the call sites
From: Ian RomanickBy putting the parameters first that match the parameters to the call site, 4 (of 14) instructions are saved at _mesa_Uniform4fv on x64. On IA32, the details of the instructions change, but it is the same count and mix of instructions. Before: 0830 <_mesa_Uniform4fv>: 830: 48 83 ec 10 sub$0x10,%rsp 834: 49 89 d0mov%rdx,%r8 837: 48 8b 15 00 00 00 00mov0x0(%rip),%rdx# 83e <_mesa_Uniform4fv+0xe> 83e: 89 f8 mov%edi,%eax 840: 89 f1 mov%esi,%ecx 842: 41 b9 02 00 00 00 mov$0x2,%r9d 848: 64 48 8b 3a mov%fs:(%rdx),%rdi 84c: 48 8b 97 c8 01 02 00mov0x201c8(%rdi),%rdx 853: 48 8b 72 70 mov0x70(%rdx),%rsi 857: 6a 04 pushq $0x4 859: 89 c2 mov%eax,%edx 85b: e8 00 00 00 00 callq 860 <_mesa_Uniform4fv+0x30> 860: 48 83 c4 18 add$0x18,%rsp 864: c3 retq After: 07f0 <_mesa_Uniform4fv>: 7f0: 48 83 ec 10 sub$0x10,%rsp 7f4: 48 8b 05 00 00 00 00mov0x0(%rip),%rax# 7fb <_mesa_Uniform4fv+0xb> 7fb: 41 b9 02 00 00 00 mov$0x2,%r9d 801: 64 48 8b 08 mov%fs:(%rax),%rcx 805: 48 8b 81 c8 01 02 00mov0x201c8(%rcx),%rax 80c: 6a 04 pushq $0x4 80e: 4c 8b 40 70 mov0x70(%rax),%r8 812: e8 00 00 00 00 callq 817 <_mesa_Uniform4fv+0x27> 817: 48 83 c4 18 add$0x18,%rsp 81b: c3 retq Saves a measly 416 bytes of text on x64. Depending on exactly when this is applied, a lot of variation is possible due to function alignment. textdata bss dec hex filename 6670131 228340 22552 6921023 699b3f lib/i965_dri.so before 6670131 228340 22552 6921023 699b3f lib/i965_dri.so after 6343348 293872 29880 6667100 65bb5c lib64/i965_dri.so before 6342932 293872 29880 684 65b9bc lib64/i965_dri.so after There is likely to be no performance change with just this patch. _mesa_uniform immediately calls validate_uniform_parameters with parameters in the "wrong" (different from the call site) order. v2: Rebase on GL_ARB_gpu_shader_fp64. v3: Rebase on GL_ARB_gpu_shader_int64. 
Signed-off-by: Ian Romanick --- src/mesa/main/uniform_query.cpp | 8 +- src/mesa/main/uniforms.c| 192 src/mesa/main/uniforms.h| 8 +- 3 files changed, 102 insertions(+), 106 deletions(-) diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp index 0275e4f..ef51571 100644 --- a/src/mesa/main/uniform_query.cpp +++ b/src/mesa/main/uniform_query.cpp @@ -771,11 +771,9 @@ glsl_type_name(enum glsl_base_type type) * Called via glUniform*() functions. */ extern "C" void -_mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shProg, - GLint location, GLsizei count, - const GLvoid *values, - enum glsl_base_type basicType, - unsigned src_components) +_mesa_uniform(GLint location, GLsizei count, const GLvoid *values, + struct gl_context *ctx, struct gl_shader_program *shProg, + enum glsl_base_type basicType, unsigned src_components) { unsigned offset; int size_mul = glsl_base_type_is_64bit(basicType) ? 2 : 1; diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c index c1d951a..a954055 100644 --- a/src/mesa/main/uniforms.c +++ b/src/mesa/main/uniforms.c @@ -150,7 +150,7 @@ void GLAPIENTRY _mesa_Uniform1f(GLint location, GLfloat v0) { GET_CURRENT_CONTEXT(ctx); - _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, , GLSL_TYPE_FLOAT, 1); + _mesa_uniform(location, 1, , ctx, ctx->_Shader->ActiveProgram, GLSL_TYPE_FLOAT, 1); } void GLAPIENTRY @@ -160,7 +160,7 @@ _mesa_Uniform2f(GLint location, GLfloat v0, GLfloat v1) GLfloat v[2]; v[0] = v0; v[1] = v1; - _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, v, GLSL_TYPE_FLOAT, 2); + _mesa_uniform(location, 1, v, ctx, ctx->_Shader->ActiveProgram, GLSL_TYPE_FLOAT, 2); } void GLAPIENTRY @@ -171,7 +171,7 @@ _mesa_Uniform3f(GLint location, GLfloat v0, GLfloat v1, GLfloat v2) v[0] = v0; v[1] = v1; v[2] = v2; - _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, v, GLSL_TYPE_FLOAT, 3); + _mesa_uniform(location, 1, v, ctx, ctx->_Shader->ActiveProgram, GLSL_TYPE_FLOAT, 3); } 
void GLAPIENTRY @@ -184,14 +184,14 @@ _mesa_Uniform4f(GLint location, GLfloat v0, GLfloat v1, GLfloat v2, v[1] = v1; v[2] = v2; v[3] = v3; -
[Mesa-dev] [PATCH 0/4] Micro optimizations for glUniform and glUniformMatrix
These are some patches that I wrote ages ago... the initial versions
pre-date Mesa's ARB_gpu_shader_fp64 support. This was part of a larger
effort that got bogged down and eventually abandoned.

The problem with the larger series was trying to measure the performance
impact. Random changes in function alignment had more impact on
CPU-bound tests than anything else I did. I believe that these changes
are good without collecting performance data.
[Mesa-dev] [PATCH 4/4] mesa: Arrange validate_uniform_parameters parameters to match call sites
From: Ian RomanickSaves a measly 20 bytes on IA32 and nothing on x64. Depending on exactly when this is applied, a lot of variation is possible due to function alignment. textdata bss dec hex filename 6670131 228340 22552 6921023 699b3f lib/i965_dri.so before 6670111 228340 22552 6921003 699b2b lib/i965_dri.so after 6342932 293872 29880 684 65b9bc lib64/i965_dri.so before 6342932 293872 29880 684 65b9bc lib64/i965_dri.so after Signed-off-by: Ian Romanick --- src/mesa/main/uniform_query.cpp | 22 +++--- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp index ef51571..418cfc9 100644 --- a/src/mesa/main/uniform_query.cpp +++ b/src/mesa/main/uniform_query.cpp @@ -156,11 +156,11 @@ _mesa_GetActiveUniformsiv(GLuint program, } static struct gl_uniform_storage * -validate_uniform_parameters(struct gl_context *ctx, - struct gl_shader_program *shProg, - GLint location, GLsizei count, - unsigned *array_index, - const char *caller) +validate_uniform_parameters(GLint location, GLsizei count, +unsigned *array_index, +struct gl_context *ctx, +struct gl_shader_program *shProg, +const char *caller) { if (shProg == NULL) { _mesa_error(ctx, GL_INVALID_OPERATION, "%s(program not linked)", caller); @@ -284,8 +284,8 @@ _mesa_get_uniform(struct gl_context *ctx, GLuint program, GLint location, unsigned offset; struct gl_uniform_storage *const uni = - validate_uniform_parameters(ctx, shProg, location, 1, - , "glGetUniform"); + validate_uniform_parameters(location, 1, , + ctx, shProg, "glGetUniform"); if (uni == NULL) { /* For glGetUniform, page 264 (page 278 of the PDF) of the OpenGL 2.1 * spec says: @@ -779,8 +779,8 @@ _mesa_uniform(GLint location, GLsizei count, const GLvoid *values, int size_mul = glsl_base_type_is_64bit(basicType) ? 
2 : 1; struct gl_uniform_storage *const uni = - validate_uniform_parameters(ctx, shProg, location, count, - , "glUniform"); + validate_uniform_parameters(location, count, , + ctx, shProg, "glUniform"); if (uni == NULL) return; @@ -990,8 +990,8 @@ _mesa_uniform_matrix(GLint location, GLsizei count, { unsigned offset; struct gl_uniform_storage *const uni = - validate_uniform_parameters(ctx, shProg, location, count, - , "glUniformMatrix"); + validate_uniform_parameters(location, count, , + ctx, shProg, "glUniformMatrix"); if (uni == NULL) return; -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
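The idea behind the parameter reordering in this series can be sketched outside of Mesa. On the System V x86-64 ABI the first six integer or pointer arguments travel in rdi, rsi, rdx, rcx, r8 and r9, so a callee whose leading parameters match the entry point's parameters lets the wrapper forward them without register shuffling. All names below (demo_ctx, demo_uniform, demo_Uniform4fv) are hypothetical stand-ins, and whether a given compiler actually saves instructions depends on inlining and the target ABI.

```c
#include <stddef.h>

/* Stand-in for the per-thread GL context; purely illustrative. */
struct demo_ctx { int program_bound; };
static struct demo_ctx current_ctx = { 1 };

/* Callee: the API-visible arguments come first, in entry-point order,
 * and the context-derived arguments follow. */
static int demo_uniform(int location, int count, const float *values,
                        struct demo_ctx *ctx, unsigned components)
{
   if (!ctx->program_bound || location < 0 || values == NULL)
      return -1;
   return count * (int)components;   /* number of floats consumed */
}

/* Entry point: location, count and values are forwarded untouched, so
 * only the trailing arguments need to be materialized. */
static int demo_Uniform4fv(int location, int count, const float *values)
{
   return demo_uniform(location, count, values, &current_ctx, 4);
}
```

The behaviour is identical whichever order the parameters are declared in; only the amount of argument marshalling at the call site changes, which is why the series reports text-size differences and not functional ones.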
[Mesa-dev] [PATCH 2/4] mesa: Arrange _mesa_uniform_matrix parameters to match the call sites
From: Ian RomanickBy putting the parameters first that match the parameters to the call site, 4 (of 16) instructions are saved at _mesa_UniformMatrix4fv on x64. On IA32, the details of the instructions change, but it is the same count and mix of instructions. Before: 1380 <_mesa_UniformMatrix4fv>: 1380: 48 83 ec 10 sub$0x10,%rsp 1384: 48 8b 05 00 00 00 00mov0x0(%rip),%rax# 138b <_mesa_UniformMatrix4fv+0xb> 138b: 41 89 f8mov%edi,%r8d 138e: 41 89 f1mov%esi,%r9d 1391: 0f b6 d2movzbl %dl,%edx 1394: 64 48 8b 38 mov%fs:(%rax),%rdi 1398: 48 8b b7 c8 01 02 00mov0x201c8(%rdi),%rsi 139f: 48 8b 76 70 mov0x70(%rsi),%rsi 13a3: 68 06 14 00 00 pushq $0x1406 13a8: 51 push %rcx 13a9: 52 push %rdx 13aa: b9 04 00 00 00 mov$0x4,%ecx 13af: ba 04 00 00 00 mov$0x4,%edx 13b4: e8 00 00 00 00 callq 13b9 <_mesa_UniformMatrix4fv+0x39> 13b9: 48 83 c4 28 add$0x28,%rsp 13bd: c3 retq After: 1360 <_mesa_UniformMatrix4fv>: 1360: 48 83 ec 10 sub$0x10,%rsp 1364: 48 8b 05 00 00 00 00mov0x0(%rip),%rax# 136b <_mesa_UniformMatrix4fv+0xb> 136b: 0f b6 d2movzbl %dl,%edx 136e: 64 4c 8b 00 mov%fs:(%rax),%r8 1372: 49 8b 80 c8 01 02 00mov0x201c8(%r8),%rax 1379: 68 06 14 00 00 pushq $0x1406 137e: 6a 04 pushq $0x4 1380: 6a 04 pushq $0x4 1382: 4c 8b 48 70 mov0x70(%rax),%r9 1386: e8 00 00 00 00 callq 138b <_mesa_UniformMatrix4fv+0x2b> 138b: 48 83 c4 28 add$0x28,%rsp 138f: c3 retq Saves a measly 576 bytes of text on x64. textdata bss dec hex filename 6670131 228340 22552 6921023 699b3f lib/i965_dri.so before 6670131 228340 22552 6921023 699b3f lib/i965_dri.so after 6343924 293872 29880 6667676 65bd9c lib64/i965_dri.so before 6343348 293872 29880 6667100 65bb5c lib64/i965_dri.so after v2: Rebase on GL_ARB_gpu_shader_fp64. 
Signed-off-by: Ian Romanick --- src/mesa/main/uniform_query.cpp | 9 ++-- src/mesa/main/uniforms.c| 117 +--- src/mesa/main/uniforms.h| 9 ++-- 3 files changed, 71 insertions(+), 64 deletions(-) diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp index c2429c1..0275e4f 100644 --- a/src/mesa/main/uniform_query.cpp +++ b/src/mesa/main/uniform_query.cpp @@ -985,11 +985,10 @@ _mesa_uniform(struct gl_context *ctx, struct gl_shader_program *shProg, * Note: cols=2, rows=4 ==> array[2] of vec4 */ extern "C" void -_mesa_uniform_matrix(struct gl_context *ctx, struct gl_shader_program *shProg, -GLuint cols, GLuint rows, - GLint location, GLsizei count, - GLboolean transpose, - const GLvoid *values, enum glsl_base_type basicType) +_mesa_uniform_matrix(GLint location, GLsizei count, + GLboolean transpose, const void *values, + struct gl_context *ctx, struct gl_shader_program *shProg, + GLuint cols, GLuint rows, enum glsl_base_type basicType) { unsigned offset; struct gl_uniform_storage *const uni = diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c index 3b645cb..c1d951a 100644 --- a/src/mesa/main/uniforms.c +++ b/src/mesa/main/uniforms.c @@ -551,8 +551,8 @@ _mesa_UniformMatrix2fv(GLint location, GLsizei count, GLboolean transpose, const GLfloat * value) { GET_CURRENT_CONTEXT(ctx); - _mesa_uniform_matrix(ctx, ctx->_Shader->ActiveProgram, - 2, 2, location, count, transpose, value, GLSL_TYPE_FLOAT); + _mesa_uniform_matrix(location, count, transpose, value, +ctx, ctx->_Shader->ActiveProgram, 2, 2, GLSL_TYPE_FLOAT); } void GLAPIENTRY @@ -560,8 +560,8 @@ _mesa_UniformMatrix3fv(GLint location, GLsizei count, GLboolean transpose, const GLfloat * value) { GET_CURRENT_CONTEXT(ctx); - _mesa_uniform_matrix(ctx, ctx->_Shader->ActiveProgram, - 3, 3, location, count, transpose, value, GLSL_TYPE_FLOAT); + _mesa_uniform_matrix(location, count, transpose, value, +ctx, ctx->_Shader->ActiveProgram, 3, 3, GLSL_TYPE_FLOAT); } void GLAPIENTRY @@ -569,8 +569,8 @@ 
_mesa_UniformMatrix4fv(GLint location, GLsizei count,
Re: [Mesa-dev] [PATCH 08/37] glsl: add initial implementation of shader cache
On Tue, 2017-01-24 at 16:33 -0800, Eric Anholt wrote:
> Timothy Arceri writes:
>
> > From: Timothy Arceri
> >
> > This uses disk_cache.c to write out a serialization of various
> > state that's required in order to successfully load and use a
> > binary written out by a drivers backend, this state is referred to
> > as "metadata" throughout the implementation.
> >
> > This initial version is intended to work with vertex and fragment
> > shader stages only.
>
> This is really interesting.  I was definitely expecting that the cache
> at this level would be a map from ([sha1s of shader source], mesa
> version, compiler options, other linker inputs) -> ([compiled GLSL IR
> shaders], linker metadata output).  The advantage you seem to be going
> for is to not have GLSL IR ever present in memory, which would be
> pretty cool.

That's the plan. It does mean we need some special handling for when we
must fall back to a recompile (i965 shader variants, corrupt cache items,
etc) but it's not so bad. It's certainly simpler than caching the IR.

In the i965 patchset I add an environment var to enable this fallback
path to be forced for debugging.

> I'm really curious to see how this would work out for a gallium
> driver.

Yeah I really haven't looked at this very hard yet. I'll start looking
at it next week, but my assumption was we might need 3 levels of cache
for a gallium driver: glsl, gallium and backend caches.

> Could you extend the file's doxygen comment to cover some of these
> design decisions?

Sure.

> Also, I think in this series you've missed having the
> gl_shader_compiler_options options in the shader key, which I believe
> might affect the compiled metadata output.  Other than that, will
> gallium vs i965 have different GLSL IR passes being run at the
> CompileShader or LinkShader stages before we write to disk?  Will we
> need the driver's name to be in the key, maybe?

See my reply to patch 6, I think that should cover all of these issues.
I'd really like it if that didn't hold this up from landing however, as
I'd really like to start working on improvements rather than constantly
wasting time rebasing things :)
[Mesa-dev] [PATCH] vulkan/wsi: Lower the maximum image sizes
---
 src/vulkan/wsi/wsi_common_wayland.c | 3 ++-
 src/vulkan/wsi/wsi_common_x11.c     | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/vulkan/wsi/wsi_common_wayland.c b/src/vulkan/wsi/wsi_common_wayland.c
index c9c476e..bdb80a7 100644
--- a/src/vulkan/wsi/wsi_common_wayland.c
+++ b/src/vulkan/wsi/wsi_common_wayland.c
@@ -379,7 +379,8 @@ wsi_wl_surface_get_capabilities(VkIcdSurfaceBase *surface,

    caps->currentExtent = (VkExtent2D) { -1, -1 };
    caps->minImageExtent = (VkExtent2D) { 1, 1 };
-   caps->maxImageExtent = (VkExtent2D) { INT16_MAX, INT16_MAX };
+   /* This is the maximum supported size on Intel */
+   caps->maxImageExtent = (VkExtent2D) { 1 << 14, 1 << 14 };
    caps->supportedTransforms = VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR;
    caps->currentTransform = VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR;
    caps->maxImageArrayLayers = 1;
diff --git a/src/vulkan/wsi/wsi_common_x11.c b/src/vulkan/wsi/wsi_common_x11.c
index 5e3c910..851932d 100644
--- a/src/vulkan/wsi/wsi_common_x11.c
+++ b/src/vulkan/wsi/wsi_common_x11.c
@@ -370,7 +370,8 @@ x11_surface_get_capabilities(VkIcdSurfaceBase *icd_surface,
        */
       caps->currentExtent = (VkExtent2D) { -1, -1 };
       caps->minImageExtent = (VkExtent2D) { 1, 1 };
-      caps->maxImageExtent = (VkExtent2D) { INT16_MAX, INT16_MAX };
+      /* This is the maximum supported size on Intel */
+      caps->maxImageExtent = (VkExtent2D) { 1 << 14, 1 << 14 };
    }
    free(err);
    free(geom);
--
2.5.0.400.gff86faf
Re: [Mesa-dev] [PATCH 4/8] i965/fs: Fix nir_op_fsign of absolute value.
On 01/24/2017 03:26 PM, Francisco Jerez wrote:
> This does point at the front-end emitting silly code that could have
> been optimized out, but the current fsign implementation would emit
> bogus IR if abs was set for the argument (because it would apply the
> abs modifier on an unsigned integer type), and we shouldn't rely on
> the upper layer's optimization passes for correctness.

Other than the atan2 code you emit later in the series, is there a test
for this?

> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index e1ab598..e0c2fa0 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -701,7 +701,14 @@ fs_visitor::nir_emit_alu(const fs_builder ,
> nir_alu_instr *instr)
>        break;
>
>     case nir_op_fsign: {
> -      if (type_sz(op[0].type) < 8) {
> +      if (op[0].abs) {
> +         /* Straightforward since the source can be assumed to be
> +          * non-negative.
> +          */
> +         set_condmod(BRW_CONDITIONAL_NZ, bld.MOV(result, op[0]));
> +         set_predicate(BRW_PREDICATE_NORMAL, bld.MOV(result,
> brw_imm_f(1.0f)));

Does this work for DF source?

If we had an optimization pass for this, it would probably map
fsign(abs(a)) to float(a != 0) or double(a != 0).  This is different
from what we would generate for that, but I don't know which is better.

> +
> +      } else if (type_sz(op[0].type) < 8) {
>           /* AND(val, 0x8000) gives the sign bit.
>            *
>            * Predicated OR ORs 1.0 (0x3f80) with the sign bit if val is
> not
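The identity mentioned in the review, fsign(abs(a)) being equivalent to float(a != 0), can be checked on the CPU with a small sketch. The helper names are invented for illustration, and sign_ref() mirrors the "(x > 0) - (x < 0)" expression used for constant evaluation elsewhere in Mesa; nothing here models what the i965 backend emits.

```c
/* Reference sign(): -1, 0 or +1, written as "(x > 0) - (x < 0)". */
static float sign_ref(float x)
{
   return (float)((x > 0.0f) - (x < 0.0f));
}

/* Reference abs() without relying on libm. */
static float abs_ref(float x)
{
   return x < 0.0f ? -x : x;
}

/* Shortcut discussed in the review: sign(abs(a)) as float(a != 0). */
static float sign_of_abs(float a)
{
   return (float)(a != 0.0f);
}
```

For finite inputs, including both zeros, sign_ref(abs_ref(a)) and sign_of_abs(a) agree. They disagree for NaN: every ordered comparison with NaN is false, so the reference yields 0, while "a != 0" is true and yields 1. Which of the two a GPU should produce is not settled by this sketch.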
[Mesa-dev] [Bug 97879] [amdgpu] Rocket League: long hangs (several seconds) when loading assets (models/textures/shaders?)
https://bugs.freedesktop.org/show_bug.cgi?id=97879

--- Comment #54 from Michel Dänzer ---
(In reply to Marek Olšák from comment #52)
> 2) Make a screenshot of the sysprof window and send it to the game developer.

Please save the profile in sysprof and send the saved data instead of a
screenshot. Then the recipient can peruse the profile any way they like.

--
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH 2/8] glsl: Fix constant evaluation of the rcp op.
On 01/24/2017 03:26 PM, Francisco Jerez wrote: > Will avoid a regression in a future commit that introduces some > additional rcp operations. When I converted GLSL IR to ir_expression_operation.py, I was careful to keep all the expressions the same. rcp and div had these weird guards. GLSL doesn't require that NaN be generated, and quite a few old GPUs don't. If the atan2 implementation depends on NaN being generated by rcp, it may have problems on i915, r300, and similar GPUs. I don't know what they generate, but it's not NaN and it's probably not 0.0. That said, this matches NIR, and it's probably fine. > --- > src/compiler/glsl/ir_expression_operation.py | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/compiler/glsl/ir_expression_operation.py > b/src/compiler/glsl/ir_expression_operation.py > index f91ac9b..4ac1ffb 100644 > --- a/src/compiler/glsl/ir_expression_operation.py > +++ b/src/compiler/glsl/ir_expression_operation.py > @@ -422,7 +422,7 @@ ir_expression_operation = [ > operation("neg", 1, source_types=numeric_types, c_expression={'u': > "-((int) {src0})", 'default': "-{src0}"}), > operation("abs", 1, source_types=signed_numeric_types, c_expression={'i': > "{src0} < 0 ? -{src0} : {src0}", 'f': "fabsf({src0})", 'd': "fabs({src0})", > 'i64': "{src0} < 0 ? -{src0} : {src0}"}), > operation("sign", 1, source_types=signed_numeric_types, > c_expression={'i': "({src0} > 0) - ({src0} < 0)", 'f': "float(({src0} > 0.0F) > - ({src0} < 0.0F))", 'd': "double(({src0} > 0.0) - ({src0} < 0.0))", 'i64': > "({src0} > 0) - ({src0} < 0)"}), > - operation("rcp", 1, source_types=real_types, c_expression={'f': "{src0} > != 0.0F ? 1.0F / {src0} : 0.0F", 'd': "{src0} != 0.0 ? 
1.0 / {src0} : 0.0"}), > + operation("rcp", 1, source_types=real_types, c_expression={'f': "1.0F / > {src0}", 'd': "1.0 / {src0}"}), > operation("rsq", 1, source_types=real_types, c_expression={'f': "1.0F / > sqrtf({src0})", 'd': "1.0 / sqrt({src0})"}), > operation("sqrt", 1, source_types=real_types, c_expression={'f': > "sqrtf({src0})", 'd': "sqrt({src0})"}), > operation("exp", 1, source_types=(float_type,), > c_expression="expf({src0})"), # Log base e on gentype > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
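The two c_expression variants for rcp in the quoted diff behave differently only at zero. A minimal sketch, assuming the host follows IEEE-754 division semantics (the function names are made up for illustration):

```c
#include <float.h>

/* Old constant-evaluation rule: guard against division by zero. */
static float rcp_guarded(float x)
{
   return x != 0.0f ? 1.0f / x : 0.0f;
}

/* New rule: plain division, leaving the zero case to IEEE-754. */
static float rcp_plain(float x)
{
   return 1.0f / x;
}
```

On such a host rcp_guarded(0.0f) is 0 while rcp_plain(0.0f) is +infinity; for every non-zero input the two agree. As the reply notes, hardware that does not follow IEEE rules may return something else again, so code that depends on the unguarded result may behave differently there.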
Re: [Mesa-dev] [PATCH] i965: Fix fast depth clears for surfaces with a dimension of 16384.
On Tue, Jan 24, 2017 at 03:32:28PM -0800, Kenneth Graunke wrote:
> I hadn't bothered to set this bit because I figured it would just
> paper over us getting the rectangle wrong.  But it turns out that
> there is a legitimate reason to use it, so let's do so.
>
> The alternative would be to chop up 16k clears to multiple 8k clears,
> which is pointlessly painful.
>
> Signed-off-by: Kenneth Graunke
> ---
>  src/mesa/drivers/dri/i965/gen8_depth_state.c | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c
> b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> index ec296698267..de5a16e91bf 100644
> --- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> @@ -477,6 +477,17 @@ gen8_hiz_exec(struct brw_context *brw, struct
> intel_mipmap_tree *mt,
>        break;
>     case BLORP_HIZ_OP_DEPTH_CLEAR:
>        dw1 |= GEN8_WM_HZ_DEPTH_CLEAR;
> +
> +      /* The "Clear Rectangle X Max" (and Y Max) fields are exclusive,
> +       * rather than inclusive, and limited to 16383.  This means that
> +       * for a 16384x16384 render target, we would miss the last pixel.

Perhaps you meant to say that we'd miss the last pixels (plural) on the
far edges? The comment gets the point across nonetheless.

This patch is,
Reviewed-by: Nanley Chery

> +       *
> +       * To work around this, we have to set the "Full Surface Depth
> +       * and Stencil Clear" bit.  We can do this in all cases because
> +       * we always clear the full rectangle anyway.  We'll need to
> +       * change this if we ever add scissored clear support.
> +       */
> +      dw1 |= GEN8_WM_HZ_FULL_SURFACE_DEPTH_CLEAR;
>        break;
>     case BLORP_HIZ_OP_NONE:
>        unreachable("Should not get here.");
> --
> 2.11.0
Re: [Mesa-dev] [PATCH] i965: Fix fast depth clears for surfaces with a dimension of 16384.
On Tue, Jan 24, 2017 at 3:32 PM, Kenneth Graunke wrote:
> I hadn't bothered to set this bit because I figured it would just
> paper over us getting the rectangle wrong.  But it turns out that
> there is a legitimate reason to use it, so let's do so.
>
> The alternative would be to chop up 16k clears to multiple 8k clears,
> which is pointlessly painful.
>
> Signed-off-by: Kenneth Graunke
> ---
>  src/mesa/drivers/dri/i965/gen8_depth_state.c | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c
> b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> index ec296698267..de5a16e91bf 100644
> --- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
> @@ -477,6 +477,17 @@ gen8_hiz_exec(struct brw_context *brw, struct
> intel_mipmap_tree *mt,
>        break;
>     case BLORP_HIZ_OP_DEPTH_CLEAR:
>        dw1 |= GEN8_WM_HZ_DEPTH_CLEAR;
> +
> +      /* The "Clear Rectangle X Max" (and Y Max) fields are exclusive,
> +       * rather than inclusive, and limited to 16383.  This means that
> +       * for a 16384x16384 render target, we would miss the last pixel.
> +       *
> +       * To work around this, we have to set the "Full Surface Depth
> +       * and Stencil Clear" bit.  We can do this in all cases because
> +       * we always clear the full rectangle anyway.  We'll need to
> +       * change this if we ever add scissored clear support.
> +       */
> +      dw1 |= GEN8_WM_HZ_FULL_SURFACE_DEPTH_CLEAR;
>        break;
>     case BLORP_HIZ_OP_NONE:
>        unreachable("Should not get here.");
> --
> 2.11.0

Verified the restriction from PRM. Patch looks good to me.

Reviewed-by: Anuj Phogat
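The arithmetic behind the comment in the quoted patch can be written out as a small sketch. The 16383 limit and the exclusive-max semantics are taken from that comment; the macro and function names are hypothetical, and unlike this sketch the patch sets the full-surface bit unconditionally.

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest value the "Clear Rectangle Max" fields can hold, per the
 * comment in the patch. */
#define CLEAR_RECT_MAX_FIELD 16383u

/* With an exclusive max, covering pixels 0..dim-1 needs max == dim. */
static bool rect_can_cover(uint32_t dim)
{
   uint32_t needed_exclusive_max = dim;
   return needed_exclusive_max <= CLEAR_RECT_MAX_FIELD;
}

/* True when the rectangle alone cannot describe a full-surface clear. */
static bool needs_full_surface_bit(uint32_t width, uint32_t height)
{
   return !rect_can_cover(width) || !rect_can_cover(height);
}
```

A 16384-wide surface needs an exclusive max of 16384, one more than the field can encode, so the last column (and likewise the last row) would be left out unless the full-surface bit is used.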
Re: [Mesa-dev] [PATCH 3/8] glsl/ir_builder: Add rcp builder.
This patch is

Reviewed-by: Ian Romanick

Next time someone asks for a newbie task, we should have ir_builder be generated from ir_expression_operation.py.

On 01/24/2017 03:26 PM, Francisco Jerez wrote:
> ---
>  src/compiler/glsl/ir_builder.cpp | 6 ++
>  src/compiler/glsl/ir_builder.h   | 1 +
>  2 files changed, 7 insertions(+)
>
> diff --git a/src/compiler/glsl/ir_builder.cpp b/src/compiler/glsl/ir_builder.cpp
> index 0cee856..8d61533 100644
> --- a/src/compiler/glsl/ir_builder.cpp
> +++ b/src/compiler/glsl/ir_builder.cpp
> @@ -315,6 +315,12 @@ exp(operand a)
>  }
>
>  ir_expression *
> +rcp(operand a)
> +{
> +   return expr(ir_unop_rcp, a);
> +}
> +
> +ir_expression *
>  rsq(operand a)
>  {
>     return expr(ir_unop_rsq, a);
> diff --git a/src/compiler/glsl/ir_builder.h b/src/compiler/glsl/ir_builder.h
> index 5ee9412..ff1ff70 100644
> --- a/src/compiler/glsl/ir_builder.h
> +++ b/src/compiler/glsl/ir_builder.h
> @@ -148,6 +148,7 @@ ir_expression *neg(operand a);
>  ir_expression *sin(operand a);
>  ir_expression *cos(operand a);
>  ir_expression *exp(operand a);
> +ir_expression *rcp(operand a);
>  ir_expression *rsq(operand a);
>  ir_expression *sqrt(operand a);
>  ir_expression *log(operand a);
Re: [Mesa-dev] [PATCH] [swr] Update fs texture & sampler state logic
Reviewed-by: Bruce Cherniak

> On Jan 24, 2017, at 5:27 PM, George Kyriazis wrote:
>
> In swr_update_derived() update texture and sampler state on a new fragment
> shader. GALLIUM_HUD can update fs using a previously bound texture and
> sampler.
> ---
>  src/gallium/drivers/swr/swr_state.cpp | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/drivers/swr/swr_state.cpp b/src/gallium/drivers/swr/swr_state.cpp
> index 41e0356..f1f4963 100644
> --- a/src/gallium/drivers/swr/swr_state.cpp
> +++ b/src/gallium/drivers/swr/swr_state.cpp
> @@ -1283,7 +1283,8 @@ swr_update_derived(struct pipe_context *pipe,
>        SwrSetPixelShaderState(ctx->swrContext, );
>
>     /* JIT sampler state */
> -   if (ctx->dirty & SWR_NEW_SAMPLER) {
> +   if (ctx->dirty & (SWR_NEW_SAMPLER |
> +                     SWR_NEW_FS)) {
>        swr_update_sampler_state(ctx,
>                                 PIPE_SHADER_FRAGMENT,
>                                 key.nr_samplers,
> @@ -1291,7 +1292,9 @@ swr_update_derived(struct pipe_context *pipe,
>     }
>
>     /* JIT sampler view state */
> -   if (ctx->dirty & (SWR_NEW_SAMPLER_VIEW | SWR_NEW_FRAMEBUFFER)) {
> +   if (ctx->dirty & (SWR_NEW_SAMPLER_VIEW |
> +                     SWR_NEW_FRAMEBUFFER |
> +                     SWR_NEW_FS)) {
>        swr_update_texture_state(ctx,
>                                 PIPE_SHADER_FRAGMENT,
>                                 key.nr_sampler_views,
> --
> 2.10.0.windows.1
[Mesa-dev] [PATCH 2/2] vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetPresentModes
--- src/vulkan/wsi/wsi_common_wayland.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/src/vulkan/wsi/wsi_common_wayland.c b/src/vulkan/wsi/wsi_common_wayland.c index d745413..04cea97 100644 --- a/src/vulkan/wsi/wsi_common_wayland.c +++ b/src/vulkan/wsi/wsi_common_wayland.c @@ -443,11 +443,13 @@ wsi_wl_surface_get_present_modes(VkIcdSurfaceBase *surface, return VK_SUCCESS; } - assert(*pPresentModeCount >= ARRAY_SIZE(present_modes)); + *pPresentModeCount = MIN2(*pPresentModeCount, ARRAY_SIZE(present_modes)); typed_memcpy(pPresentModes, present_modes, *pPresentModeCount); - *pPresentModeCount = ARRAY_SIZE(present_modes); - return VK_SUCCESS; + if (*pPresentModeCount < ARRAY_SIZE(present_modes)) + return VK_INCOMPLETE; + else + return VK_SUCCESS; } VkResult wsi_create_wl_surface(const VkAllocationCallbacks *pAllocator, -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetFormats
--- src/vulkan/wsi/wsi_common_wayland.c | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/src/vulkan/wsi/wsi_common_wayland.c b/src/vulkan/wsi/wsi_common_wayland.c index 687ac9c..d745413 100644 --- a/src/vulkan/wsi/wsi_common_wayland.c +++ b/src/vulkan/wsi/wsi_common_wayland.c @@ -409,25 +409,27 @@ wsi_wl_surface_get_formats(VkIcdSurfaceBase *icd_surface, if (!display) return VK_ERROR_OUT_OF_HOST_MEMORY; - uint32_t count = u_vector_length(>formats); - if (pSurfaceFormats == NULL) { - *pSurfaceFormatCount = count; + *pSurfaceFormatCount = u_vector_length(>formats); return VK_SUCCESS; } - assert(*pSurfaceFormatCount >= count); - *pSurfaceFormatCount = count; - + uint32_t count = 0; VkFormat *f; u_vector_foreach(f, >formats) { - *(pSurfaceFormats++) = (VkSurfaceFormatKHR) { + if (count == *pSurfaceFormatCount) + return VK_INCOMPLETE; + + pSurfaceFormats[count++] = (VkSurfaceFormatKHR) { .format = *f, /* TODO: We should get this from the compositor somehow */ .colorSpace = VK_COLORSPACE_SRGB_NONLINEAR_KHR, }; } + assert(*pSurfaceFormatCount <= count); + *pSurfaceFormatCount = count; + return VK_SUCCESS; } -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
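The behaviour that both WSI patches aim for is the usual Vulkan enumeration contract: a NULL output pointer only queries the count, and a too-small array is filled as far as it goes and reported as incomplete. A minimal sketch of that contract follows. It uses stand-in result codes and a hypothetical sketch_enumerate() helper so that it builds without vulkan.h; it is not the code from either patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for VK_SUCCESS and VK_INCOMPLETE (assumption: only the
 * distinction between the two matters for this sketch). */
typedef enum {
   SKETCH_SUCCESS = 0,
   SKETCH_INCOMPLETE = 1,
} sketch_result;

/* Hypothetical enumeration helper following the two-call idiom:
 *  - out == NULL: report how many items are available.
 *  - out != NULL: copy at most *count items, write back the number
 *    copied, and return SKETCH_INCOMPLETE if some items did not fit. */
static sketch_result
sketch_enumerate(const int *available, uint32_t available_count,
                 uint32_t *count, int *out)
{
   if (out == NULL) {
      *count = available_count;
      return SKETCH_SUCCESS;
   }

   uint32_t n = *count < available_count ? *count : available_count;
   memcpy(out, available, n * sizeof(*out));
   *count = n;

   return n < available_count ? SKETCH_INCOMPLETE : SKETCH_SUCCESS;
}
```

A caller would normally query the count first, allocate that many entries, and call again; the incomplete result only appears when the caller deliberately or accidentally passes a smaller array.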
Re: [Mesa-dev] [PATCH 1/8] mesa/program: Translate csel operation from GLSL IR.
I'd swear that I wrote a nearly identical patch almost 2 years ago. The work that depended on it fizzled, so I never sent it out. The one difference is I had the following comment:

   /* We assume that Boolean true and false are 1.0 and 0.0. OPCODE_CMP
    * selects src1 if src0 is < 0, src2 otherwise.
    */

Either way, this patch is

Reviewed-by: Ian Romanick

On 01/24/2017 03:26 PM, Francisco Jerez wrote:
> This will be used internally by the GLSL front-end in order to
> implement some built-in functions. Plumb it through MESA IR for
> back-ends that rely on this translation pass.
> ---
>  src/mesa/program/ir_to_mesa.cpp | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
> index 0ae797f..5ff7304 100644
> --- a/src/mesa/program/ir_to_mesa.cpp
> +++ b/src/mesa/program/ir_to_mesa.cpp
> @@ -1360,13 +1360,17 @@ ir_to_mesa_visitor::visit(ir_expression *ir)
>        emit(ir, OPCODE_LRP, result_dst, op[2], op[1], op[0]);
>        break;
>
> +   case ir_triop_csel:
> +      op[0].negate = ~op[0].negate;
> +      emit(ir, OPCODE_CMP, result_dst, op[0], op[1], op[2]);
> +      break;
> +
>     case ir_binop_vector_extract:
>     case ir_triop_fma:
>     case ir_triop_bitfield_extract:
>     case ir_triop_vector_insert:
>     case ir_quadop_bitfield_insert:
>     case ir_binop_ldexp:
> -   case ir_triop_csel:
>     case ir_binop_carry:
>     case ir_binop_borrow:
>     case ir_binop_imul_high:
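The negate trick in the patch is easiest to see per channel. The sketch below models it on scalar floats, under the assumption stated in Ian's comment (Boolean true is 1.0, false is 0.0, and OPCODE_CMP picks src1 when src0 is less than zero). The function names are hypothetical stand-ins, not Mesa IR entry points.

```c
#include <assert.h>

/* Scalar model of OPCODE_CMP as described in the comment above:
 * selects src1 if src0 < 0, src2 otherwise. */
static float
sketch_opcode_cmp(float src0, float src1, float src2)
{
   return src0 < 0.0f ? src1 : src2;
}

/* csel(cond, a, b) with cond encoded as 1.0 (true) or 0.0 (false).
 * Negating the condition turns true into -1.0, which is < 0 and selects
 * a; false becomes -0.0, which is not < 0 and selects b. */
static float
sketch_csel_via_cmp(float cond, float a, float b)
{
   return sketch_opcode_cmp(-cond, a, b);
}
```

Without the negation, a true condition of 1.0 would fail the "less than zero" test and the operands would come out swapped.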
Re: [Mesa-dev] [PATCH 08/37] glsl: add initial implementation of shader cache
Timothy Arceri writes:

> From: Timothy Arceri
>
> This uses disk_cache.c to write out a serialization of various
> state that's required in order to successfully load and use a
> binary written out by a drivers backend, this state is referred to as
> "metadata" throughout the implementation.
>
> This initial version is intended to work with vertex and fragment
> shader stages only.

This is really interesting. I was definitely expecting that the cache at this level would be a map from ([sha1s of shader source], mesa version, compiler options, other linker inputs) -> ([compiled GLSL IR shaders], linker metadata output). The advantage you seem to be going for is to not have GLSL IR ever present in memory, which would be pretty cool. I'm really curious to see how this would work out for a gallium driver.

Could you extend the file's doxygen comment to cover some of these design decisions?

Also, I think in this series you've missed having the gl_shader_compiler_options options in the shader key, which I believe might affect the compiled metadata output. Other than that, will gallium vs i965 have different GLSL IR passes being run at the CompileShader or LinkShader stages before we write to disk? Will we need the driver's name to be in the key, maybe?
Re: [Mesa-dev] [PATCH 06/37] util: add a disk_cache_remove() function
On Tue, 2017-01-24 at 15:54 -0800, Eric Anholt wrote:
> Timothy Arceri writes:
>
> > From: Timothy Arceri
> >
> > This will be used to remove cache items created with old versions
> > of Mesa or other invalid cache items from the cache.
>
> I'm not convinced that removing the item from cache when we get a hit on
> everything in the key except for Mesa version is the right way to go. I
> think we should just be hashing the Mesa version in the key so that we
> don't hit on mismatched versions. Then we wouldn't thrash our cache
> when we're, say, checking out around different versions of Mesa and
> re-pigliting things.

I agree. I mention this problem in the cover letter, it's going to take some reworking so I was hoping to fix it in a follow-up. The plan is to create directory structures like so:

Mesa-17.0.0/i965-BDW/
Mesa-17.1.0/i965-BDW/

This will allow us to just delete an entire directory if we are hitting the cache limit and also easily allows third parties to install precompiled shaders in those dirs.
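As an illustration of the directory layout proposed above, here is a minimal sketch that builds the per-version, per-device subdirectory name. The helper name, its parameters, and the idea of passing the driver and GPU names as separate strings are assumptions made for the example; nothing like this exists in disk_cache.c as posted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format "Mesa-<version>/<driver>-<gpu>/" into buf.
 * Returns 0 on success and -1 if the result did not fit in len bytes. */
static int
sketch_cache_subdir(char *buf, size_t len, const char *version,
                    const char *driver, const char *gpu)
{
   int n = snprintf(buf, len, "Mesa-%s/%s-%s/", version, driver, gpu);

   return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

With the version in the path, two Mesa builds never share entries, and evicting an old version comes down to removing one directory tree.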
[Mesa-dev] [PATCH] mesa: Fix copy-and-paste bug in _mesa_(Program|)Uniform[1234](i|ui)64vARB functions
From: Ian RomanickAll of the functions were passing 1 to _mesa_uniform instead of passing count. Fixes 16 unsed parameter warnings like: main/uniforms.c: In function ‘_mesa_Uniform1i64vARB’: main/uniforms.c:1692:47: warning: unused parameter ‘count’ [-Wunused-parameter] _mesa_Uniform1i64vARB(GLint location, GLsizei count, const GLint64 *value) ^ This is why I build with extra warnings enabled. Unfortunately, there are so many unused parameter warnings in Mesa that I didn't notice these added warnings for over 6 months. :( Signed-off-by: Ian Romanick --- src/mesa/main/uniforms.c | 32 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c index 29a1155..3b645cb 100644 --- a/src/mesa/main/uniforms.c +++ b/src/mesa/main/uniforms.c @@ -1683,28 +1683,28 @@ void GLAPIENTRY _mesa_Uniform1i64vARB(GLint location, GLsizei count, const GLint64 *value) { GET_CURRENT_CONTEXT(ctx); - _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, value, GLSL_TYPE_INT64, 1); + _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, count, value, GLSL_TYPE_INT64, 1); } void GLAPIENTRY _mesa_Uniform2i64vARB(GLint location, GLsizei count, const GLint64 *value) { GET_CURRENT_CONTEXT(ctx); - _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, value, GLSL_TYPE_INT64, 2); + _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, count, value, GLSL_TYPE_INT64, 2); } void GLAPIENTRY _mesa_Uniform3i64vARB(GLint location, GLsizei count, const GLint64 *value) { GET_CURRENT_CONTEXT(ctx); - _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, value, GLSL_TYPE_INT64, 3); + _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, count, value, GLSL_TYPE_INT64, 3); } void GLAPIENTRY _mesa_Uniform4i64vARB(GLint location, GLsizei count, const GLint64 *value) { GET_CURRENT_CONTEXT(ctx); - _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, value, GLSL_TYPE_INT64, 4); + _mesa_uniform(ctx, 
ctx->_Shader->ActiveProgram, location, count, value, GLSL_TYPE_INT64, 4); } void GLAPIENTRY @@ -1751,28 +1751,28 @@ void GLAPIENTRY _mesa_Uniform1ui64vARB(GLint location, GLsizei count, const GLuint64 *value) { GET_CURRENT_CONTEXT(ctx); - _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, value, GLSL_TYPE_UINT64, 1); + _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, count, value, GLSL_TYPE_UINT64, 1); } void GLAPIENTRY _mesa_Uniform2ui64vARB(GLint location, GLsizei count, const GLuint64 *value) { GET_CURRENT_CONTEXT(ctx); - _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, value, GLSL_TYPE_UINT64, 2); + _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, count, value, GLSL_TYPE_UINT64, 2); } void GLAPIENTRY _mesa_Uniform3ui64vARB(GLint location, GLsizei count, const GLuint64 *value) { GET_CURRENT_CONTEXT(ctx); - _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, value, GLSL_TYPE_UINT64, 3); + _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, count, value, GLSL_TYPE_UINT64, 3); } void GLAPIENTRY _mesa_Uniform4ui64vARB(GLint location, GLsizei count, const GLuint64 *value) { GET_CURRENT_CONTEXT(ctx); - _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, 1, value, GLSL_TYPE_UINT64, 4); + _mesa_uniform(ctx, ctx->_Shader->ActiveProgram, location, count, value, GLSL_TYPE_UINT64, 4); } /* DSA entrypoints */ @@ -1835,7 +1835,7 @@ _mesa_ProgramUniform1i64vARB(GLuint program, GLint location, GLsizei count, cons struct gl_shader_program *shProg = _mesa_lookup_shader_program_err(ctx, program, "glProgramUniform1i64vARB"); - _mesa_uniform(ctx, shProg, location, 1, value, GLSL_TYPE_INT64, 1); + _mesa_uniform(ctx, shProg, location, count, value, GLSL_TYPE_INT64, 1); } void GLAPIENTRY @@ -1845,7 +1845,7 @@ _mesa_ProgramUniform2i64vARB(GLuint program, GLint location, GLsizei count, con struct gl_shader_program *shProg = _mesa_lookup_shader_program_err(ctx, program, "glProgramUniform2i64vARB"); - _mesa_uniform(ctx, shProg, 
location, 1, value, GLSL_TYPE_INT64, 2); + _mesa_uniform(ctx, shProg, location, count, value, GLSL_TYPE_INT64, 2); } void GLAPIENTRY @@ -1855,7 +1855,7 @@ _mesa_ProgramUniform3i64vARB(GLuint program, GLint location, GLsizei count, con struct gl_shader_program *shProg = _mesa_lookup_shader_program_err(ctx, program, "glProgramUniform3i64vARB"); - _mesa_uniform(ctx, shProg, location, 1, value, GLSL_TYPE_INT64, 3); + _mesa_uniform(ctx, shProg, location, count, value, GLSL_TYPE_INT64, 3); } void GLAPIENTRY @@ -1865,7 +1865,7 @@ _mesa_ProgramUniform4i64vARB(GLuint
Re: [Mesa-dev] [PATCH] i965: Use a UW source type for CS_OPCODE_CS_TERMINATE.
Matt Turner writes:

> On Tue, Jan 24, 2017 at 2:18 PM, Kenneth Graunke wrote:
>> SIMD16 compute shaders use a send(16) with mlen 1 for the EOT message,
>> using a source of g127 for the single register. With a UD type, this
>> supposedly could read g128, which doesn't exist, causing the simulator
>> to get cranky. Use a UW type to avoid this.
>
> Bizarre. Is the hardware this stupid, or just the simulator?

I doubt the hardware would care, but I guess it wouldn't hurt to do this in order to make the simulator happy. How about we fix this in the generator instead for consistency with the other send-message UW register retyping workarounds? Assuming you apply the same fix in fs_generator::generate_cs_terminate instead patch is:

Reviewed-by: Francisco Jerez
Re: [Mesa-dev] [PATCH 02/37] glsl: Switch to disable-by-default for the GLSL shader cache
Timothy Arceri writes:

> From: Carl Worth
>
> The shader cache is expected to be developed incrementally over a
> fairly long series of commits. For that period of instability, we
> require users to opt into the shader cache by setting:
>
>     MESA_GLSL_CACHE_ENABLE=1
>
> In the future, when the shader cache is complete, we can revert this
> commit so that the cache will be on by default.
>
> The user can always disable the cache with
> MESA_GLSL_CACHE_DISABLE=1. That functionality is not affected by this
> commit, (nor will it be affected by the future revert).
> ---
>  src/compiler/glsl/tests/cache_test.c | 5 +
>  src/util/disk_cache.c                | 7 +++
>  2 files changed, 12 insertions(+)
>
> diff --git a/src/compiler/glsl/tests/cache_test.c b/src/compiler/glsl/tests/cache_test.c
> index 0ef05aa..8547141 100644
> --- a/src/compiler/glsl/tests/cache_test.c
> +++ b/src/compiler/glsl/tests/cache_test.c
> @@ -388,6 +388,11 @@ main(void)
>  #ifdef ENABLE_SHADER_CACHE
>     int err;
>
> +   /* While the shader cache is still experimental, this variable must
> +    * be set or the cache does nothing.
> +    */
> +   setenv("MESA_GLSL_CACHE_ENABLE", "1", 1);
> +
>     test_disk_cache_create();
>
>     test_put_and_get();
> diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
> index 6de608c..dec09e0 100644
> --- a/src/util/disk_cache.c
> +++ b/src/util/disk_cache.c
> @@ -151,6 +151,13 @@ disk_cache_create(void)
>     if (getenv("MESA_GLSL_CACHE_DISABLE"))
>        goto fail;
>
> +   /* As a temporary measure, (while the shader cache is under
> +    * development, and known to not be fully function), also require

"functional"

> +    * the MESA_GLSL_CACHE_ENABLE variable to be set.
> +    */
> +   if (! getenv ("MESA_GLSL_CACHE_ENABLE"))
> +      goto fail;

cworth-style whitespace to be fixed here. Other than that, 1-5 are:

Reviewed-by: Eric Anholt
Re: [Mesa-dev] [PATCH 06/37] util: add a disk_cache_remove() function
Timothy Arceri writes:

> From: Timothy Arceri
>
> This will be used to remove cache items created with old versions
> of Mesa or other invalid cache items from the cache.

I'm not convinced that removing the item from cache when we get a hit on everything in the key except for Mesa version is the right way to go. I think we should just be hashing the Mesa version in the key so that we don't hit on mismatched versions. Then we wouldn't thrash our cache when we're, say, checking out around different versions of Mesa and re-pigliting things.
Re: [Mesa-dev] [PATCH] i965: Use a UW source type for CS_OPCODE_CS_TERMINATE.
On Tue, Jan 24, 2017 at 2:18 PM, Kenneth Graunke wrote:
> SIMD16 compute shaders use a send(16) with mlen 1 for the EOT message,
> using a source of g127 for the single register. With a UD type, this
> supposedly could read g128, which doesn't exist, causing the simulator
> to get cranky. Use a UW type to avoid this.

Bizarre. Is the hardware this stupid, or just the simulator?
[Mesa-dev] [PATCH 7/7] i965/blorp: Remove a pile of blorp_blit restrictions
Previously, blorp could only blit into something that was renderable. Thanks to recent additions to blorp, it can now blit into basically anything so long as it isn't compressed. --- src/mesa/drivers/dri/i965/brw_blorp.c | 64 +-- 1 file changed, 32 insertions(+), 32 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c index 3a7cf84..624b5e8 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp.c +++ b/src/mesa/drivers/dri/i965/brw_blorp.c @@ -274,6 +274,26 @@ blorp_surf_for_miptree(struct brw_context *brw, (surf->aux_addr.buffer == NULL)); } +static bool +brw_blorp_supports_dst_format(struct brw_context *brw, mesa_format format) +{ + /* If it's renderable, it's definitely supported. */ + if (brw->format_supported_as_render_target[format]) + return true; + + /* BLORP can't compress anything */ + if (_mesa_is_format_compressed(format)) + return false; + + /* No exotic formats such as GL_LUMINANCE_ALPHA */ + if (_mesa_get_format_bits(format, GL_RED_BITS) == 0 && + _mesa_get_format_bits(format, GL_DEPTH_BITS) == 0 && + _mesa_get_format_bits(format, GL_STENCIL_BITS) == 0) + return false; + + return true; +} + static enum isl_format brw_blorp_to_isl_format(struct brw_context *brw, mesa_format format, bool is_render_target) @@ -291,15 +311,20 @@ brw_blorp_to_isl_format(struct brw_context *brw, mesa_format format, return ISL_FORMAT_R32_FLOAT; case MESA_FORMAT_Z_UNORM16: return ISL_FORMAT_R16_UNORM; - default: { + default: if (is_render_target) { - assert(brw->format_supported_as_render_target[format]); - return brw->render_target_format[format]; + assert(brw_blorp_supports_dst_format(brw, format)); + if (brw->format_supported_as_render_target[format]) { +return brw->render_target_format[format]; + } else { +return brw_format_for_mesa_format(format); + } } else { + /* Some destinations (is_render_target == true) are supported by + * blorp even though we technically can't render to them. 
+ */ return brw_format_for_mesa_format(format); } - break; - } } } @@ -540,8 +565,6 @@ try_blorp_blit(struct brw_context *brw, /* Find buffers */ struct intel_renderbuffer *src_irb; struct intel_renderbuffer *dst_irb; - struct intel_mipmap_tree *src_mt; - struct intel_mipmap_tree *dst_mt; switch (buffer_bit) { case GL_COLOR_BUFFER_BIT: src_irb = intel_renderbuffer(read_fb->_ColorReadBuffer); @@ -561,16 +584,6 @@ try_blorp_blit(struct brw_context *brw, intel_renderbuffer(read_fb->Attachment[BUFFER_DEPTH].Renderbuffer); dst_irb = intel_renderbuffer(draw_fb->Attachment[BUFFER_DEPTH].Renderbuffer); - src_mt = find_miptree(buffer_bit, src_irb); - dst_mt = find_miptree(buffer_bit, dst_irb); - - /* We can't handle format conversions between Z24 and other formats - * since we have to lie about the surface format. See the comments in - * brw_blorp_surface_info::set(). - */ - if ((src_mt->format == MESA_FORMAT_Z24_UNORM_X8_UINT) != - (dst_mt->format == MESA_FORMAT_Z24_UNORM_X8_UINT)) - return false; do_blorp_blit(brw, buffer_bit, src_irb, MESA_FORMAT_NONE, dst_irb, MESA_FORMAT_NONE, srcX0, srcY0, @@ -627,21 +640,8 @@ brw_blorp_copytexsubimage(struct brw_context *brw, if (brw->gen < 6) return false; - if (_mesa_get_format_base_format(src_rb->Format) != - _mesa_get_format_base_format(dst_image->TexFormat)) { - return false; - } - - /* We can't handle format conversions between Z24 and other formats since -* we have to lie about the surface format. See the comments in -* brw_blorp_surface_info::set(). 
-*/ - if ((src_mt->format == MESA_FORMAT_Z24_UNORM_X8_UINT) != - (dst_mt->format == MESA_FORMAT_Z24_UNORM_X8_UINT)) { - return false; - } - - if (!brw->format_supported_as_render_target[dst_image->TexFormat]) + /* BLORP can't compress anything */ + if (!brw_blorp_supports_dst_format(brw, dst_image->TexFormat)) return false; /* Source clipping shouldn't be necessary, since copytexsubimage (in -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/7] intel/blorp: Silently convert RGBX destination formats to RGBA
--- src/intel/blorp/blorp_blit.c | 4 1 file changed, 4 insertions(+) diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c index b964224..4d8942e 100644 --- a/src/intel/blorp/blorp_blit.c +++ b/src/intel/blorp/blorp_blit.c @@ -1883,6 +1883,10 @@ try_blorp_blit(struct blorp_batch *batch, wm_prog_key->dst_rgb = true; wm_prog_key->need_dst_offset = true; + } else if (isl_format_is_rgbx(params->dst.view.format)) { + /* We can handle RGBX formats easily enough by treating them as RGBA */ + params->dst.view.format = + isl_format_rgbx_to_rgba(params->dst.view.format); } else if (params->dst.view.format == ISL_FORMAT_R24_UNORM_X8_TYPELESS) { wm_prog_key->dst_format = params->dst.view.format; params->dst.view.format = ISL_FORMAT_R32_UNORM; -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 6/7] anv: Allow blitting to/from any supported format
Now that blorp handles all the cases, why not? --- src/intel/vulkan/anv_formats.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_formats.c index f4183f0..2a924d5 100644 --- a/src/intel/vulkan/anv_formats.c +++ b/src/intel/vulkan/anv_formats.c @@ -319,8 +319,7 @@ get_image_format_properties(const struct gen_device_info *devinfo, VkFormatFeatureFlags flags = 0; if (isl_format_supports_sampling(devinfo, format.isl_format)) { - flags |= VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT | - VK_FORMAT_FEATURE_BLIT_SRC_BIT; + flags |= VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT; if (isl_format_supports_filtering(devinfo, format.isl_format)) flags |= VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT; @@ -332,8 +331,7 @@ get_image_format_properties(const struct gen_device_info *devinfo, */ if (isl_format_supports_rendering(devinfo, format.isl_format) && format.swizzle.a == ISL_CHANNEL_SELECT_ALPHA) { - flags |= VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BIT | - VK_FORMAT_FEATURE_BLIT_DST_BIT; + flags |= VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BIT; if (isl_format_supports_alpha_blending(devinfo, format.isl_format)) flags |= VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BLEND_BIT; @@ -349,7 +347,9 @@ get_image_format_properties(const struct gen_device_info *devinfo, flags |= VK_FORMAT_FEATURE_STORAGE_IMAGE_ATOMIC_BIT; if (flags) { - flags |= VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR | + flags |= VK_FORMAT_FEATURE_BLIT_SRC_BIT | + VK_FORMAT_FEATURE_BLIT_DST_BIT | + VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR | VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR; } -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/7] intel/blorp: Handle more exotic destination formats
This commit adds support for using both R24_UNORM_X8_TYPELESS and R9G9B9E5_SHAREDEXP as destination formats even though the hardware does not support rendering to them. This is done by using a different format and emitting shader code to fake it the rest of the way. --- src/intel/blorp/blorp_blit.c | 92 src/intel/blorp/blorp_priv.h | 6 +++ 2 files changed, 98 insertions(+) diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c index fc76fd4..b964224 100644 --- a/src/intel/blorp/blorp_blit.c +++ b/src/intel/blorp/blorp_blit.c @@ -26,6 +26,7 @@ #include "blorp_priv.h" #include "brw_meta_util.h" +#include "util/format_rgb9e5.h" /* header-only include needed for _mesa_unorm_to_float and friends. */ #include "mesa/main/format_utils.h" @@ -916,6 +917,88 @@ bit_cast_color(struct nir_builder *b, nir_ssa_def *color, } } +static nir_ssa_def * +convert_color(struct nir_builder *b, nir_ssa_def *color, + const struct brw_blorp_blit_prog_key *key) +{ + /* All of our color conversions end up generating a single-channel color +* value that we need to write out. +*/ + nir_ssa_def *value; + + if (key->dst_format == ISL_FORMAT_R24_UNORM_X8_TYPELESS) { + /* The destination image is bound as R32_UNORM but the data needs to be + * in R24_UNORM_X8_TYPELESS. The bottom 24 are the actual data and the + * top 8 need to be zero. We can accomplish this by simply multiplying + * by a factor to scale things down. + */ + float factor = (float)((1 << 24) - 1) / (float)UINT32_MAX; + value = nir_fmul(b, nir_fsat(b, nir_channel(b, color, 0)), + nir_imm_float(b, factor)); + } else if (key->dst_format == ISL_FORMAT_R9G9B9E5_SHAREDEXP) { + /* See also float3_to_rgb9e5 */ + + /* First, we need to clamp it to range. 
*/ + nir_ssa_def *clamped = nir_fmin(b, color, nir_imm_float(b, MAX_RGB9E5)); + + /* Get rid of negatives and NaN */ + clamped = nir_bcsel(b, nir_ult(b, nir_imm_int(b, 0x7f80), color), + nir_imm_float(b, 0), clamped); + + /* maxrgb.u = MAX3(rc.u, gc.u, bc.u); */ + nir_ssa_def *maxu = nir_umax(b, nir_channel(b, clamped, 0), + nir_umax(b, nir_channel(b, clamped, 1), + nir_channel(b, clamped, 2))); + + /* maxrgb.u += maxrgb.u & (1 << (23-9)); */ + maxu = nir_iadd(b, maxu, nir_iand(b, maxu, nir_imm_int(b, 1 << 14))); + + /* exp_shared = MAX2((maxrgb.u >> 23), -RGB9E5_EXP_BIAS - 1 + 127) + + * 1 + RGB9E5_EXP_BIAS - 127; + */ + nir_ssa_def *exp_shared = + nir_iadd(b, nir_umax(b, nir_ushr(b, maxu, nir_imm_int(b, 23)), + nir_imm_int(b, -RGB9E5_EXP_BIAS - 1 + 127)), + nir_imm_int(b, 1 + RGB9E5_EXP_BIAS - 127)); + + /* revdenom_biasedexp = 127 - (exp_shared - RGB9E5_EXP_BIAS - + * RGB9E5_MANTISSA_BITS) + 1; + */ + nir_ssa_def *revdenom_biasedexp = + nir_isub(b, nir_imm_int(b, 127 + RGB9E5_EXP_BIAS + +RGB9E5_MANTISSA_BITS + 1), + exp_shared); + + /* revdenom.u = revdenom_biasedexp << 23; */ + nir_ssa_def *revdenom = + nir_ishl(b, revdenom_biasedexp, nir_imm_int(b, 23)); + + /* rm = (int) (rc.f * revdenom.f); + * gm = (int) (gc.f * revdenom.f); + * bm = (int) (bc.f * revdenom.f); + */ + nir_ssa_def *mantissa = + nir_f2i(b, nir_fmul(b, clamped, revdenom)); + + /* rm = (rm & 1) + (rm >> 1); + * gm = (gm & 1) + (gm >> 1); + * bm = (bm & 1) + (bm >> 1); + */ + mantissa = nir_iadd(b, nir_iand(b, mantissa, nir_imm_int(b, 1)), + nir_ushr(b, mantissa, nir_imm_int(b, 1))); + + value = nir_channel(b, mantissa, 0); + value = nir_mask_shift_or(b, value, nir_channel(b, mantissa, 1), ~0, 9); + value = nir_mask_shift_or(b, value, nir_channel(b, mantissa, 2), ~0, 18); + value = nir_mask_shift_or(b, value, exp_shared, ~0, 27); + } else { + unreachable("Unsupported format conversion"); + } + + nir_ssa_def *u = nir_ssa_undef(b, 1, 32); + return nir_vec4(b, value, u, u, u); +} + /** * Generator 
for WM programs used in BLORP blits. * @@ -1274,6 +1357,9 @@ brw_blorp_build_nir_shader(struct blorp_context *blorp, void *mem_ctx, if (key->dst_bpc != key->src_bpc) color = bit_cast_color(, color, key); + if (key->dst_format) + color = convert_color(, color, key); + if (key->dst_rgb) { /* The destination image is bound as a red texture three times as wide * as the actual image. Our shader is effectively running one color @@ -1797,6 +1883,12 @@ try_blorp_blit(struct blorp_batch *batch, wm_prog_key->dst_rgb = true; wm_prog_key->need_dst_offset = true; + }
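The R24_UNORM_X8_TYPELESS case in convert_color() relies on a single scale factor to keep the top 8 bits of the written dword zero. The sketch below checks that arithmetic on the CPU. The float-to-UNORM32 conversion (scale by UINT32_MAX, round to nearest) is an assumption about what the render target write does, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Same scale factor as in the patch: (2^24 - 1) / UINT32_MAX. */
static float
sketch_r24_scale_factor(void)
{
   return (float)((1 << 24) - 1) / (float)UINT32_MAX;
}

/* Assumption: writing a float in [0, 1] to an R32_UNORM surface scales it
 * by UINT32_MAX and rounds to the nearest integer. */
static uint32_t
sketch_float_to_unorm32(float v)
{
   return (uint32_t)((double)v * 4294967295.0 + 0.5);
}

/* Dword that ends up in memory for a given depth value. */
static uint32_t
sketch_r24_bits(float depth)
{
   /* Saturate first, as the patch does with nir_fsat; NaN becomes 0. */
   if (!(depth > 0.0f))
      depth = 0.0f;
   else if (depth > 1.0f)
      depth = 1.0f;

   return sketch_float_to_unorm32(depth * sketch_r24_scale_factor());
}
```

Under these assumptions a depth of 1.0 lands on 0x00FFFFFF, the largest 24-bit value, so the upper byte stays clear for every saturated input.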
[Mesa-dev] [PATCH 5/7] intel/blorp: Support the RGB workaround on more formats
Previously we only supported UINT formats because that's what blorp_copy required. If we want to use it in blorp_blit, however, we need to support everything. --- src/intel/blorp/blorp_blit.c | 73 1 file changed, 53 insertions(+), 20 deletions(-) diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c index 4d8942e..ff8352d 100644 --- a/src/intel/blorp/blorp_blit.c +++ b/src/intel/blorp/blorp_blit.c @@ -1639,6 +1639,56 @@ struct blt_coords { struct blt_axis x, y; }; +static enum isl_format +get_red_format_for_rgb_format(enum isl_format format) +{ + const struct isl_format_layout *fmtl = isl_format_get_layout(format); + + switch (fmtl->channels.r.bits) { + case 8: + switch (fmtl->channels.r.type) { + case ISL_UNORM: + return ISL_FORMAT_R8_UNORM; + case ISL_SNORM: + return ISL_FORMAT_R8_SNORM; + case ISL_UINT: + return ISL_FORMAT_R8_UINT; + case ISL_SINT: + return ISL_FORMAT_R8_SINT; + default: + unreachable("Invalid 8-bit RGB channel type"); + } + case 16: + switch (fmtl->channels.r.type) { + case ISL_UNORM: + return ISL_FORMAT_R16_UNORM; + case ISL_SNORM: + return ISL_FORMAT_R16_SNORM; + case ISL_SFLOAT: + return ISL_FORMAT_R16_FLOAT; + case ISL_UINT: + return ISL_FORMAT_R16_UINT; + case ISL_SINT: + return ISL_FORMAT_R16_SINT; + default: + unreachable("Invalid 8-bit RGB channel type"); + } + case 32: + switch (fmtl->channels.r.type) { + case ISL_SFLOAT: + return ISL_FORMAT_R32_FLOAT; + case ISL_UINT: + return ISL_FORMAT_R32_UINT; + case ISL_SINT: + return ISL_FORMAT_R32_SINT; + default: + unreachable("Invalid 8-bit RGB channel type"); + } + default: + unreachable("Invalid number of red channel bits"); + } +} + static void surf_fake_rgb_with_red(const struct isl_device *isl_dev, struct brw_blorp_surface_info *info) @@ -1648,26 +1698,9 @@ surf_fake_rgb_with_red(const struct isl_device *isl_dev, info->surf.logical_level0_px.width *= 3; info->surf.phys_level0_sa.width *= 3; - enum isl_format red_format; - switch (info->view.format) { - case 
ISL_FORMAT_R8G8B8_UNORM: - red_format = ISL_FORMAT_R8_UNORM; - break; - case ISL_FORMAT_R8G8B8_UINT: - red_format = ISL_FORMAT_R8_UINT; - break; - case ISL_FORMAT_R16G16B16_UNORM: - red_format = ISL_FORMAT_R16_UNORM; - break; - case ISL_FORMAT_R16G16B16_UINT: - red_format = ISL_FORMAT_R16_UINT; - break; - case ISL_FORMAT_R32G32B32_UINT: - red_format = ISL_FORMAT_R32_UINT; - break; - default: - unreachable("Invalid RGB copy destination format"); - } + enum isl_format red_format = + get_red_format_for_rgb_format(info->view.format); + assert(isl_format_get_layout(red_format)->channels.r.type == isl_format_get_layout(info->view.format)->channels.r.type); assert(isl_format_get_layout(red_format)->channels.r.bits == -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/7] blorp: Handle the RGB workaround more like other workarounds
The previous version was sort-of strapped on in that it just adjusted the blit rectangle and trusted in the fact that we would use texelFetch and round to the nearest integer to ensure that the component positions matched. This new version, while slightly more complicated, is more accurate because all three components end up with exactly the same dst_pos and so they will get interpolated and sampled at the same texture coordinate. This makes the workaround suitable for using with scaled blits. --- src/intel/blorp/blorp_blit.c | 60 ++-- 1 file changed, 30 insertions(+), 30 deletions(-) diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c index 111f1c1..fc76fd4 100644 --- a/src/intel/blorp/blorp_blit.c +++ b/src/intel/blorp/blorp_blit.c @@ -1138,6 +1138,20 @@ brw_blorp_build_nir_shader(struct blorp_context *blorp, void *mem_ctx, key->dst_layout); } + nir_ssa_def *comp = NULL; + if (key->dst_rgb) { + /* The destination image is bound as a red texture three times as wide + * as the actual image. Our shader is effectively running one color + * component at a time. We need to save off the component and adjust + * the destination position. + */ + assert(dst_pos->num_components == 2); + nir_ssa_def *dst_x = nir_channel(, dst_pos, 0); + comp = nir_umod(, dst_x, nir_imm_int(, 3)); + dst_pos = nir_vec2(, nir_idiv(, dst_x, nir_imm_int(, 3)), + nir_channel(, dst_pos, 1)); + } + /* Now (X, Y, S) = decode_msaa(dst_samples, detile(dst_tiling, offset)). * * That is: X, Y and S now contain the true coordinates and sample index of @@ -1267,8 +1281,6 @@ brw_blorp_build_nir_shader(struct blorp_context *blorp, void *mem_ctx, * from the source color and write that to destination red. 
*/ assert(dst_pos->num_components == 2); - nir_ssa_def *comp = - nir_umod(, nir_channel(, dst_pos, 0), nir_imm_int(, 3)); nir_ssa_def *color_component = nir_bcsel(, nir_ieq(, comp, nir_imm_int(, 0)), @@ -1543,15 +1555,12 @@ struct blt_coords { static void surf_fake_rgb_with_red(const struct isl_device *isl_dev, - struct brw_blorp_surface_info *info, - uint32_t *x, uint32_t *width) + struct brw_blorp_surface_info *info) { surf_convert_to_single_slice(isl_dev, info); info->surf.logical_level0_px.width *= 3; info->surf.phys_level0_sa.width *= 3; - *x *= 3; - *width *= 3; enum isl_format red_format; switch (info->view.format) { @@ -1581,28 +1590,6 @@ surf_fake_rgb_with_red(const struct isl_device *isl_dev, info->surf.format = info->view.format = red_format; } -static void -fake_dest_rgb_with_red(const struct isl_device *dev, - struct blorp_params *params, - struct brw_blorp_blit_prog_key *wm_prog_key, - struct blt_coords *coords) -{ - /* Handle RGB destinations for blorp_copy */ - const struct isl_format_layout *dst_fmtl = - isl_format_get_layout(params->dst.surf.format); - - if (dst_fmtl->bpb % 3 == 0) { - uint32_t dst_x = coords->x.dst0; - uint32_t dst_width = coords->x.dst1 - dst_x; - surf_fake_rgb_with_red(dev, >dst, - _x, _width); - coords->x.dst0 = dst_x; - coords->x.dst1 = dst_x + dst_width; - wm_prog_key->dst_rgb = true; - wm_prog_key->need_dst_offset = true; - } -} - enum blit_shrink_status { BLIT_NO_SHRINK = 0, BLIT_WIDTH_SHRINK = 1, @@ -1621,8 +1608,6 @@ try_blorp_blit(struct blorp_batch *batch, { const struct gen_device_info *devinfo = batch->blorp->isl_dev->info; - fake_dest_rgb_with_red(batch->blorp->isl_dev, params, wm_prog_key, coords); - if (isl_format_has_sint_channel(params->src.view.format)) { wm_prog_key->texture_data_type = nir_type_int; } else if (isl_format_has_uint_channel(params->src.view.format)) { @@ -1799,6 +1784,21 @@ try_blorp_blit(struct blorp_batch *batch, params->num_samples = params->dst.surf.samples; + if 
(isl_format_get_layout(params->dst.view.format)->bpb % 3 == 0) { + /* We can't render to RGB formats natively because they aren't a + * power-of-two size. Instead, we fake them by using a red format + * with the same channel type and size and emitting shader code to + * only write one channel at a time. + */ + params->x0 *= 3; + params->x1 *= 3; + + surf_fake_rgb_with_red(batch->blorp->isl_dev, >dst); + + wm_prog_key->dst_rgb = true; + wm_prog_key->need_dst_offset = true; + } + if (params->src.tile_x_sa || params->src.tile_y_sa) { assert(wm_prog_key->need_src_offset); surf_get_intratile_offset_px(>src, -- 2.5.0.400.gff86faf ___ mesa-dev mailing list
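The coordinate bookkeeping in this patch can be sketched in plain C. This is only an illustration under stated assumptions, not blorp code: `red_texel`, `split_red_x` and `rgb_x_to_red_x` are hypothetical names, and the sketch assumes the destination is bound as a red surface three times as wide as the RGB image, as the commit comment describes.

```c
#include <stdint.h>

/* Hypothetical helper type: one texel of the fake red surface, expressed as
 * the RGB pixel it belongs to plus the colour component it holds. */
struct red_texel {
   uint32_t pixel_x;   /* x of the original RGB pixel */
   uint32_t comp;      /* 0 = red, 1 = green, 2 = blue */
};

/* Mirrors the nir_umod / nir_idiv pair emitted in the shader: every red
 * texel maps back to (x / 3, x % 3), so the three components of one RGB
 * pixel share the same destination position. */
static struct red_texel
split_red_x(uint32_t red_x)
{
   struct red_texel t = { red_x / 3, red_x % 3 };
   return t;
}

/* Mirrors "params->x0 *= 3": a rectangle edge given in RGB pixels becomes
 * an edge in red texels. */
static uint32_t
rgb_x_to_red_x(uint32_t rgb_x)
{
   return rgb_x * 3;
}
```

For example, red texel 7 belongs to RGB pixel 2 and holds its green component, while an RGB rectangle starting at x = 5 starts at red texel 15.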
[Mesa-dev] [PATCH 3/7] intel/isl: Add some helpers for working with RGBX formats
--- src/intel/isl/isl.h| 11 +++ src/intel/isl/isl_format.c | 32 2 files changed, 43 insertions(+) diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h index 07368f9..9d5b372 100644 --- a/src/intel/isl/isl.h +++ b/src/intel/isl/isl.h @@ -1138,8 +1138,19 @@ isl_format_is_rgb(enum isl_format fmt) isl_format_layouts[fmt].channels.a.bits == 0; } +static inline bool +isl_format_is_rgbx(enum isl_format fmt) +{ + return isl_format_layouts[fmt].channels.r.bits > 0 && + isl_format_layouts[fmt].channels.g.bits > 0 && + isl_format_layouts[fmt].channels.b.bits > 0 && + isl_format_layouts[fmt].channels.a.bits > 0 && + isl_format_layouts[fmt].channels.a.type == ISL_VOID; +} + enum isl_format isl_format_rgb_to_rgba(enum isl_format rgb) ATTRIBUTE_CONST; enum isl_format isl_format_rgb_to_rgbx(enum isl_format rgb) ATTRIBUTE_CONST; +enum isl_format isl_format_rgbx_to_rgba(enum isl_format rgb) ATTRIBUTE_CONST; bool isl_is_storage_image_format(enum isl_format fmt); diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c index c8daece..8473285 100644 --- a/src/intel/isl/isl_format.c +++ b/src/intel/isl/isl_format.c @@ -623,3 +623,35 @@ isl_format_rgb_to_rgbx(enum isl_format rgb) return ISL_FORMAT_UNSUPPORTED; } } + +enum isl_format +isl_format_rgbx_to_rgba(enum isl_format rgbx) +{ + assert(isl_format_is_rgbx(rgbx)); + + switch (rgbx) { + case ISL_FORMAT_R32G32B32X32_FLOAT: + return ISL_FORMAT_R32G32B32A32_FLOAT; + case ISL_FORMAT_R16G16B16X16_UNORM: + return ISL_FORMAT_R16G16B16A16_UNORM; + case ISL_FORMAT_R16G16B16X16_FLOAT: + return ISL_FORMAT_R16G16B16A16_FLOAT; + case ISL_FORMAT_B8G8R8X8_UNORM: + return ISL_FORMAT_B8G8R8A8_UNORM; + case ISL_FORMAT_B8G8R8X8_UNORM_SRGB: + return ISL_FORMAT_B8G8R8A8_UNORM_SRGB; + case ISL_FORMAT_R8G8B8X8_UNORM: + return ISL_FORMAT_R8G8B8A8_UNORM; + case ISL_FORMAT_R8G8B8X8_UNORM_SRGB: + return ISL_FORMAT_R8G8B8A8_UNORM_SRGB; + case ISL_FORMAT_B10G10R10X2_UNORM: + return ISL_FORMAT_B10G10R10A2_UNORM; + case ISL_FORMAT_B5G5R5X1_UNORM: + 
return ISL_FORMAT_B5G5R5A1_UNORM; + case ISL_FORMAT_B5G5R5X1_UNORM_SRGB: + return ISL_FORMAT_B5G5R5A1_UNORM_SRGB; + default: + assert(!"Invalid RGBX format"); + return rgbx; + } +} -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/8] glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity.
This addresses several issues of the current atan2 implementation:

- Negative zero (and negative denorms which end up getting flushed to zero) isn't handled correctly by the current implementation. The reason is that it does 'y >= 0' and 'x < 0' comparisons to decide on which side of the branch cut the argument is, which causes us to return incorrect results (off by up to 2π) for very small negative values.

- There is a serious precision problem for x values of large enough magnitude introduced by the floating point division operation being implemented as a mul+rcp sequence. This can lead to the quotient getting flushed to zero in some cases introducing an error of over 8e6 ULP in the result -- or in the most catastrophic case will cause us to return NaN instead of the correct value ±π/2 for y=±∞ and x very large. We can fix this easily by scaling down both arguments when the absolute value of the denominator goes above a certain threshold. The error of this atan2 implementation remains below 25 ULP in most of its domain except for a neighborhood of y=0 where it reaches a maximum error of about 180 ULP.

- It emits a bunch of instructions including no less than three if-else branches per scalar component that don't seem to get optimized out later on. This implementation uses about 13% fewer instructions on Intel SKL hardware and doesn't emit any control flow instructions.
--- src/compiler/glsl/builtin_functions.cpp | 82 ++--- 1 file changed, 46 insertions(+), 36 deletions(-) diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp index 4a6c5af..fd59381 100644 --- a/src/compiler/glsl/builtin_functions.cpp +++ b/src/compiler/glsl/builtin_functions.cpp @@ -3560,44 +3560,54 @@ builtin_builder::_acos(const glsl_type *type) ir_function_signature * builtin_builder::_atan2(const glsl_type *type) { - ir_variable *vec_y = in_var(type, "vec_y"); - ir_variable *vec_x = in_var(type, "vec_x"); - MAKE_SIG(type, always_available, 2, vec_y, vec_x); - - ir_variable *vec_result = body.make_temp(type, "vec_result"); - ir_variable *r = body.make_temp(glsl_type::float_type, "r"); - for (int i = 0; i < type->vector_elements; i++) { - ir_variable *y = body.make_temp(glsl_type::float_type, "y"); - ir_variable *x = body.make_temp(glsl_type::float_type, "x"); - body.emit(assign(y, swizzle(vec_y, i, 1))); - body.emit(assign(x, swizzle(vec_x, i, 1))); - - /* If |x| >= 1.0e-8 * |y|: */ - ir_if *outer_if = - new(mem_ctx) ir_if(greater(abs(x), mul(imm(1.0e-8f), abs(y; - - ir_factory outer_then(_if->then_instructions, mem_ctx); - - /* Then...call atan(y/x) */ - do_atan(outer_then, glsl_type::float_type, r, div(y, x)); - - /* ...and fix it up: */ - ir_if *inner_if = new(mem_ctx) ir_if(less(x, imm(0.0f))); - inner_if->then_instructions.push_tail( - if_tree(gequal(y, imm(0.0f)), - assign(r, add(r, imm(M_PIf))), - assign(r, sub(r, imm(M_PIf); - outer_then.emit(inner_if); - - /* Else... */ - outer_if->else_instructions.push_tail( - assign(r, mul(sign(y), imm(M_PI_2f; + const unsigned n = type->vector_elements; + ir_variable *y = in_var(type, "y"); + ir_variable *x = in_var(type, "x"); + MAKE_SIG(type, always_available, 2, y, x); - body.emit(outer_if); + /* If we're on the left half-plane rotate the coordinates π/2 clock-wise +* for the y=0 discontinuity to end up aligned with the vertical +* discontinuity of atan(s/t) along t=0. 
+*/ + ir_variable *flip = body.make_temp(glsl_type::bvec(n), "flip"); + body.emit(assign(flip, less(x, imm(0.0f, n; + ir_variable *s = body.make_temp(type, "s"); + body.emit(assign(s, csel(flip, abs(x), y))); + ir_variable *t = body.make_temp(type, "t"); + body.emit(assign(t, csel(flip, y, abs(x; - body.emit(assign(vec_result, r, 1 << i)); - } - body.emit(ret(vec_result)); + /* If the magnitude of the denominator exceeds some huge value, scale down +* the arguments in order to prevent the reciprocal operation from flushing +* its result to zero, which would cause precision problems, and for s +* infinite would cause us to return a NaN instead of the correct finite +* value. +*/ + ir_constant *huge = imm(1e37f, n); + ir_variable *scale = body.make_temp(type, "scale"); + body.emit(assign(scale, csel(gequal(abs(t), huge), +imm(0.0625f, n), imm(1.0f, n; + ir_variable *rcp_scaled_t = body.make_temp(type, "rcp_scaled_t"); + body.emit(assign(rcp_scaled_t, rcp(mul(t, scale; + ir_expression *s_over_t = mul(mul(s, scale), rcp_scaled_t); + + /* Calculate the arctangent and fix up the result if we had flipped the +* coordinate system. +*/ + ir_variable *arc = body.make_temp(type, "arc"); +
[Mesa-dev] [PATCH 0/7] intel/blorp: Be able to blit to ANYTHING!!!
This somewhat tongue-in-cheek series adds support to BLORP for blitting to many more destination formats. We now even support the crazy R9G9B9E5_SHAREDEXP format by emitting shader code to do the conversion. As a result, we can now use blorp for almost all blit operations in GL, and every Vulkan format we support in any way, shape, or form is now supported for VkBlitImage.

Why? Because we can!

Jason Ekstrand (7):
  blorp: Handle the RGB workaround more like other workarounds
  intel/blorp: Handle more exotic destination formats
  intel/isl: Add some helpers for working with RGBX formats
  intel/blorp: Silently convert RGBX destination formats to RGBA
  intel/blorp: Support the RGB workaround on more formats
  anv: Allow blitting to/from any supported format
  i965/blorp: Remove a pile of blorp_blit restrictions

 src/intel/blorp/blorp_blit.c          | 229 ++
 src/intel/blorp/blorp_priv.h          |   6 +
 src/intel/isl/isl.h                   |  11 ++
 src/intel/isl/isl_format.c            |  32 +
 src/intel/vulkan/anv_formats.c        |  10 +-
 src/mesa/drivers/dri/i965/brw_blorp.c |  64 +-
 6 files changed, 265 insertions(+), 87 deletions(-)

--
2.5.0.400.gff86faf
[Mesa-dev] [PATCH 4/8] i965/fs: Fix nir_op_fsign of absolute value.
This does point at the front-end emitting silly code that could have been optimized out, but the current fsign implementation would emit bogus IR if abs was set for the argument (because it would apply the abs modifier on an unsigned integer type), and we shouldn't rely on the upper layer's optimization passes for correctness. --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index e1ab598..e0c2fa0 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -701,7 +701,14 @@ fs_visitor::nir_emit_alu(const fs_builder , nir_alu_instr *instr) break; case nir_op_fsign: { - if (type_sz(op[0].type) < 8) { + if (op[0].abs) { + /* Straightforward since the source can be assumed to be + * non-negative. + */ + set_condmod(BRW_CONDITIONAL_NZ, bld.MOV(result, op[0])); + set_predicate(BRW_PREDICATE_NORMAL, bld.MOV(result, brw_imm_f(1.0f))); + + } else if (type_sz(op[0].type) < 8) { /* AND(val, 0x8000) gives the sign bit. * * Predicated OR ORs 1.0 (0x3f80) with the sign bit if val is not -- 2.10.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
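The new `op[0].abs` path can be read as the following scalar C sketch. It is only a model of the two-instruction sequence in the diff (a MOV that sets the NZ flag, then a predicated MOV of 1.0f); `fsign_of_abs_sketch` is a hypothetical name, and NaN inputs are deliberately left out of the picture.

```c
#include <math.h>

/* Model of fsign(|v|): since |v| is never negative, the result is either
 * 0 (when v is +/-0) or 1. */
static float
fsign_of_abs_sketch(float v)
{
   float result = fabsf(v);   /* MOV result, |v|   (sets the NZ flag) */
   if (result != 0.0f)        /* predicate: flag is set when |v| != 0 */
      result = 1.0f;          /* predicated MOV result, 1.0f */
   return result;
}
```

No sign-bit manipulation is needed on this path, which is why the abs modifier never ends up applied to an unsigned integer type.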
[Mesa-dev] [PATCH 2/8] glsl: Fix constant evaluation of the rcp op.
Will avoid a regression in a future commit that introduces some additional rcp operations. --- src/compiler/glsl/ir_expression_operation.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/compiler/glsl/ir_expression_operation.py b/src/compiler/glsl/ir_expression_operation.py index f91ac9b..4ac1ffb 100644 --- a/src/compiler/glsl/ir_expression_operation.py +++ b/src/compiler/glsl/ir_expression_operation.py @@ -422,7 +422,7 @@ ir_expression_operation = [ operation("neg", 1, source_types=numeric_types, c_expression={'u': "-((int) {src0})", 'default': "-{src0}"}), operation("abs", 1, source_types=signed_numeric_types, c_expression={'i': "{src0} < 0 ? -{src0} : {src0}", 'f': "fabsf({src0})", 'd': "fabs({src0})", 'i64': "{src0} < 0 ? -{src0} : {src0}"}), operation("sign", 1, source_types=signed_numeric_types, c_expression={'i': "({src0} > 0) - ({src0} < 0)", 'f': "float(({src0} > 0.0F) - ({src0} < 0.0F))", 'd': "double(({src0} > 0.0) - ({src0} < 0.0))", 'i64': "({src0} > 0) - ({src0} < 0)"}), - operation("rcp", 1, source_types=real_types, c_expression={'f': "{src0} != 0.0F ? 1.0F / {src0} : 0.0F", 'd': "{src0} != 0.0 ? 1.0 / {src0} : 0.0"}), + operation("rcp", 1, source_types=real_types, c_expression={'f': "1.0F / {src0}", 'd': "1.0 / {src0}"}), operation("rsq", 1, source_types=real_types, c_expression={'f': "1.0F / sqrtf({src0})", 'd': "1.0 / sqrt({src0})"}), operation("sqrt", 1, source_types=real_types, c_expression={'f': "sqrtf({src0})", 'd': "sqrt({src0})"}), operation("exp", 1, source_types=(float_type,), c_expression="expf({src0})"), # Log base e on gentype -- 2.10.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
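The behavioural difference between the two `c_expression` strings can be shown with a small C sketch. The function names are hypothetical, and the sketch assumes IEEE 754 float division, which is what the host compiler typically provides when the constant folder runs.

```c
/* Old constant evaluation: special-cases zero and folds rcp(0) to 0. */
static float
rcp_fold_old(float x)
{
   return x != 0.0f ? 1.0f / x : 0.0f;
}

/* New constant evaluation: plain division, so rcp(+0) folds to +inf and
 * rcp(-0) folds to -inf. */
static float
rcp_fold_new(float x)
{
   return 1.0f / x;
}
```

Both agree for non-zero inputs; they differ only at ±0, where the new form keeps the sign of zero visible in the result. The atan2 rewrite later in this series appears to rely on that, since it inspects the sign of a reciprocal.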
[Mesa-dev] [PATCH 3/8] glsl/ir_builder: Add rcp builder.
---
 src/compiler/glsl/ir_builder.cpp | 6 ++
 src/compiler/glsl/ir_builder.h   | 1 +
 2 files changed, 7 insertions(+)

diff --git a/src/compiler/glsl/ir_builder.cpp b/src/compiler/glsl/ir_builder.cpp
index 0cee856..8d61533 100644
--- a/src/compiler/glsl/ir_builder.cpp
+++ b/src/compiler/glsl/ir_builder.cpp
@@ -315,6 +315,12 @@ exp(operand a)
 }
 
 ir_expression *
+rcp(operand a)
+{
+   return expr(ir_unop_rcp, a);
+}
+
+ir_expression *
 rsq(operand a)
 {
    return expr(ir_unop_rsq, a);
diff --git a/src/compiler/glsl/ir_builder.h b/src/compiler/glsl/ir_builder.h
index 5ee9412..ff1ff70 100644
--- a/src/compiler/glsl/ir_builder.h
+++ b/src/compiler/glsl/ir_builder.h
@@ -148,6 +148,7 @@ ir_expression *neg(operand a);
 ir_expression *sin(operand a);
 ir_expression *cos(operand a);
 ir_expression *exp(operand a);
+ir_expression *rcp(operand a);
 ir_expression *rsq(operand a);
 ir_expression *sqrt(operand a);
 ir_expression *log(operand a);
--
2.10.2
[Mesa-dev] [PATCH 6/8] nir/spirv/glsl450: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity.
See "glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity." for the rationale, but note that the instruction count benefit discussed there is somewhat less important for the SPIRV implementation, because the current code already emitted no control flow instructions -- Still this saves us one hardware instruction per scalar component on Intel SKL hardware. Fixes the following Vulkan CTS tests on Intel hardware: dEQP-VK.glsl.builtin.precision.atan2.highp_compute.scalar dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec2 dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec3 dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec4 dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec2 dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec4 Note that most of the test-cases above expect IEEE-compliant handling of atan2(±∞, ±∞), which this patch doesn't explicitly handle, so except for the last two the test-cases above weren't expected to pass yet. The reason they do is that the i965 back-end implementation of the NIR fmin and fmax instructions is not quite GLSL-compliant (it complies with IEEE 754 recommendations though), because fmin/fmax of a NaN and a non-NaN argument currently always return the non-NaN argument, which causes atan() to flush NaN to one and return the expected value. The front-end should probably not be relying on this behavior for correctness though because other back-ends are likely to behave differently -- A follow-up patch will handle the atan2(±∞, ±∞) corner cases explicitly. 
--- src/compiler/spirv/vtn_glsl450.c | 61 ++-- 1 file changed, 40 insertions(+), 21 deletions(-) diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c index 0d32fdd..508f218 100644 --- a/src/compiler/spirv/vtn_glsl450.c +++ b/src/compiler/spirv/vtn_glsl450.c @@ -302,28 +302,47 @@ build_atan(nir_builder *b, nir_ssa_def *y_over_x) static nir_ssa_def * build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def *x) { - nir_ssa_def *zero = nir_imm_float(b, 0.0f); + nir_ssa_def *zero = nir_imm_float(b, 0); + nir_ssa_def *one = nir_imm_float(b, 1); - /* If |x| >= 1.0e-8 * |y|: */ - nir_ssa_def *condition = - nir_fge(b, nir_fabs(b, x), - nir_fmul(b, nir_imm_float(b, 1.0e-8f), nir_fabs(b, y))); - - /* Then...call atan(y/x) and fix it up: */ - nir_ssa_def *atan1 = build_atan(b, nir_fdiv(b, y, x)); - nir_ssa_def *r_then = - nir_bcsel(b, nir_flt(b, x, zero), - nir_fadd(b, atan1, - nir_bcsel(b, nir_fge(b, y, zero), -nir_imm_float(b, M_PIf), -nir_imm_float(b, -M_PIf))), - atan1); - - /* Else... */ - nir_ssa_def *r_else = - nir_fmul(b, nir_fsign(b, y), nir_imm_float(b, M_PI_2f)); - - return nir_bcsel(b, condition, r_then, r_else); + /* If we're on the left half-plane rotate the coordinates π/2 clock-wise +* for the y=0 discontinuity to end up aligned with the vertical +* discontinuity of atan(s/t) along t=0. +*/ + nir_ssa_def *flip = nir_flt(b, x, zero); + nir_ssa_def *s = nir_bcsel(b, flip, nir_fabs(b, x), y); + nir_ssa_def *t = nir_bcsel(b, flip, y, nir_fabs(b, x)); + + /* If the magnitude of the denominator exceeds some huge value, scale down +* the arguments in order to prevent the reciprocal operation from flushing +* its result to zero, which would cause precision problems, and for s +* infinite would cause us to return a NaN instead of the correct finite +* value. 
+*/ + nir_ssa_def *huge = nir_imm_float(b, 1e37f); + nir_ssa_def *scale = nir_bcsel(b, nir_fge(b, nir_fabs(b, t), huge), + nir_imm_float(b, 0.0625), one); + nir_ssa_def *rcp_scaled_t = nir_frcp(b, nir_fmul(b, t, scale)); + nir_ssa_def *s_over_t = nir_fmul(b, nir_fmul(b, s, scale), rcp_scaled_t); + + /* Calculate the arctangent and fix up the result if we had flipped the +* coordinate system. +*/ + nir_ssa_def *arc = nir_fadd(b, nir_fmul(b, nir_b2f(b, flip), + nir_imm_float(b, M_PI_2f)), + build_atan(b, nir_fabs(b, s_over_t))); + + /* Rather convoluted calculation of the sign of the result. When x < 0 we +* cannot use fsign because we need to be able to distinguish between +* negative and positive zero. We don't use bitwise arithmetic tricks for +* consistency with the GLSL front-end. When x >= 0 rcp_scaled_t will +* always be non-negative so this won't be able to distinguish between +* negative and positive zero, but we don't care because atan2 is +* continuous along the whole positive y = 0 half-line, so it won't affect +* the result. +*/ + return nir_bcsel(b, nir_flt(b, nir_fmin(b, y, rcp_scaled_t), zero), +
[Mesa-dev] [PATCH] [swr] Update fs texture & sampler state logic
In swr_update_derived() update texture and sampler state on a new fragment shader. GALLIUM_HUD can update fs using a previously bound texture and sampler.
---
 src/gallium/drivers/swr/swr_state.cpp | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/swr/swr_state.cpp b/src/gallium/drivers/swr/swr_state.cpp
index 41e0356..f1f4963 100644
--- a/src/gallium/drivers/swr/swr_state.cpp
+++ b/src/gallium/drivers/swr/swr_state.cpp
@@ -1283,7 +1283,8 @@ swr_update_derived(struct pipe_context *pipe,
          SwrSetPixelShaderState(ctx->swrContext, );
 
       /* JIT sampler state */
-      if (ctx->dirty & SWR_NEW_SAMPLER) {
+      if (ctx->dirty & (SWR_NEW_SAMPLER |
+                        SWR_NEW_FS)) {
          swr_update_sampler_state(ctx,
                                   PIPE_SHADER_FRAGMENT,
                                   key.nr_samplers,
@@ -1291,7 +1292,9 @@ swr_update_derived(struct pipe_context *pipe,
       }
 
       /* JIT sampler view state */
-      if (ctx->dirty & (SWR_NEW_SAMPLER_VIEW | SWR_NEW_FRAMEBUFFER)) {
+      if (ctx->dirty & (SWR_NEW_SAMPLER_VIEW |
+                        SWR_NEW_FRAMEBUFFER |
+                        SWR_NEW_FS)) {
          swr_update_texture_state(ctx,
                                   PIPE_SHADER_FRAGMENT,
                                   key.nr_sampler_views,
--
2.10.0.windows.1
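The effect of adding SWR_NEW_FS to both masks can be sketched as a pair of predicates. The bit values and the `SKETCH_`-prefixed names below are hypothetical (the real flag values live in the swr driver headers and may differ); only the mask logic is taken from the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
enum {
   SKETCH_NEW_SAMPLER      = 1u << 0,
   SKETCH_NEW_SAMPLER_VIEW = 1u << 1,
   SKETCH_NEW_FRAMEBUFFER  = 1u << 2,
   SKETCH_NEW_FS           = 1u << 3,
};

/* After the patch: a new fragment shader alone re-derives sampler state. */
static bool
sampler_state_needs_update(uint32_t dirty)
{
   return (dirty & (SKETCH_NEW_SAMPLER | SKETCH_NEW_FS)) != 0;
}

/* After the patch: a new fragment shader alone re-derives texture state. */
static bool
texture_state_needs_update(uint32_t dirty)
{
   return (dirty & (SKETCH_NEW_SAMPLER_VIEW |
                    SKETCH_NEW_FRAMEBUFFER |
                    SKETCH_NEW_FS)) != 0;
}
```

With only the FS bit dirty, both predicates are now true, which covers the GALLIUM_HUD case where the shader changes but the bound texture and sampler do not.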
[Mesa-dev] [PATCH] i965: Fix fast depth clears for surfaces with a dimension of 16384.
I hadn't bothered to set this bit because I figured it would just paper over us getting the rectangle wrong. But it turns out that there is a legitimate reason to use it, so let's do so.

The alternative would be to chop up 16k clears to multiple 8k clears, which is pointlessly painful.

Signed-off-by: Kenneth Graunke
---
 src/mesa/drivers/dri/i965/gen8_depth_state.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c b/src/mesa/drivers/dri/i965/gen8_depth_state.c
index ec296698267..de5a16e91bf 100644
--- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
@@ -477,6 +477,17 @@ gen8_hiz_exec(struct brw_context *brw, struct intel_mipmap_tree *mt,
       break;
    case BLORP_HIZ_OP_DEPTH_CLEAR:
       dw1 |= GEN8_WM_HZ_DEPTH_CLEAR;
+
+      /* The "Clear Rectangle X Max" (and Y Max) fields are exclusive,
+       * rather than inclusive, and limited to 16383. This means that
+       * for a 16384x16384 render target, we would miss the last pixel.
+       *
+       * To work around this, we have to set the "Full Surface Depth
+       * and Stencil Clear" bit. We can do this in all cases because
+       * we always clear the full rectangle anyway. We'll need to
+       * change this if we ever add scissored clear support.
+       */
+      dw1 |= GEN8_WM_HZ_FULL_SURFACE_DEPTH_CLEAR;
       break;
    case BLORP_HIZ_OP_NONE:
       unreachable("Should not get here.");
--
2.11.0
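The off-by-one described in the comment can be checked with a little arithmetic. This sketch takes the two facts stated in the patch (the max field is exclusive and cannot exceed 16383) and assumes, purely for illustration, that an out-of-range value would be clamped to the limit; the names are hypothetical.

```c
#include <stdint.h>

/* Largest value the "Clear Rectangle X Max" field can hold, per the
 * comment in the patch. */
enum { CLEAR_RECT_MAX_FIELD_LIMIT = 16383 };

/* Exclusive max actually encodable for a surface of the given width
 * (clamping is an assumption of this sketch). */
static uint32_t
encodable_clear_max(uint32_t width)
{
   return width < CLEAR_RECT_MAX_FIELD_LIMIT ? width
                                             : CLEAR_RECT_MAX_FIELD_LIMIT;
}

/* Number of pixel columns a rectangle-based clear would fail to cover. */
static uint32_t
columns_missed(uint32_t width)
{
   return width - encodable_clear_max(width);
}
```

Only the 16384 case loses a column, which is why setting the full-surface bit is enough and splitting the clear into 8k pieces is unnecessary.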
[Mesa-dev] [PATCH 7/8] glsl: Implement IEEE-compliant handling of atan2(±∞, ±∞).
--- src/compiler/glsl/builtin_functions.cpp | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp index fd59381..9d6ab80 100644 --- a/src/compiler/glsl/builtin_functions.cpp +++ b/src/compiler/glsl/builtin_functions.cpp @@ -3590,11 +3590,31 @@ builtin_builder::_atan2(const glsl_type *type) body.emit(assign(rcp_scaled_t, rcp(mul(t, scale; ir_expression *s_over_t = mul(mul(s, scale), rcp_scaled_t); + /* For |x| = |y| assume tan = 1 even if infinite (i.e. pretend momentarily +* that ∞/∞ = 1) in order to comply with the rather artificial rules +* inherited from IEEE 754-2008, namely: +* +* "atan2(±∞, −∞) is ±3π/4 +* atan2(±∞, +∞) is ±π/4" +* +* Note that this is inconsistent with the rules for the neighborhood of +* zero that are based on iterated limits: +* +* "atan2(±0, −0) is ±π +* atan2(±0, +0) is ±0" +* +* but GLSL specifically allows implementations to deviate from IEEE rules +* at (0,0), so we take that license (i.e. pretend that 0/0 = 1 here as +* well). +*/ + ir_expression *tan = csel(equal(abs(x), abs(y)), + imm(1.0f, n), abs(s_over_t)); + /* Calculate the arctangent and fix up the result if we had flipped the * coordinate system. */ ir_variable *arc = body.make_temp(type, "arc"); - do_atan(body, type, arc, abs(s_over_t)); + do_atan(body, type, arc, tan); body.emit(assign(arc, add(arc, mul(b2f(flip), imm(M_PI_2f); /* Rather convoluted calculation of the sign of the result. When x < 0 we -- 2.10.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 8/8] nir/spirv/glsl450: Implement IEEE-compliant handling of atan2(±∞, ±∞).
--- src/compiler/spirv/vtn_glsl450.c | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c index 508f218..7af2dad 100644 --- a/src/compiler/spirv/vtn_glsl450.c +++ b/src/compiler/spirv/vtn_glsl450.c @@ -325,12 +325,32 @@ build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def *x) nir_ssa_def *rcp_scaled_t = nir_frcp(b, nir_fmul(b, t, scale)); nir_ssa_def *s_over_t = nir_fmul(b, nir_fmul(b, s, scale), rcp_scaled_t); + /* For |x| = |y| assume tan = 1 even if infinite (i.e. pretend momentarily +* that ∞/∞ = 1) in order to comply with the rather artificial rules +* inherited from IEEE 754-2008, namely: +* +* "atan2(±∞, −∞) is ±3π/4 +* atan2(±∞, +∞) is ±π/4" +* +* Note that this is inconsistent with the rules for the neighborhood of +* zero that are based on iterated limits: +* +* "atan2(±0, −0) is ±π +* atan2(±0, +0) is ±0" +* +* but GLSL specifically allows implementations to deviate from IEEE rules +* at (0,0), so we take that license (i.e. pretend that 0/0 = 1 here as +* well). +*/ + nir_ssa_def *tan = nir_bcsel(b, nir_feq(b, nir_fabs(b, x), nir_fabs(b, y)), +one, nir_fabs(b, s_over_t)); + /* Calculate the arctangent and fix up the result if we had flipped the * coordinate system. */ nir_ssa_def *arc = nir_fadd(b, nir_fmul(b, nir_b2f(b, flip), nir_imm_float(b, M_PI_2f)), - build_atan(b, nir_fabs(b, s_over_t))); + build_atan(b, tan)); /* Rather convoluted calculation of the sign of the result. When x < 0 we * cannot use fsign because we need to be able to distinguish between -- 2.10.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/8] mesa/program: Translate csel operation from GLSL IR.
This will be used internally by the GLSL front-end in order to implement some built-in functions. Plumb it through MESA IR for back-ends that rely on this translation pass.
---
 src/mesa/program/ir_to_mesa.cpp | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index 0ae797f..5ff7304 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -1360,13 +1360,17 @@ ir_to_mesa_visitor::visit(ir_expression *ir)
       emit(ir, OPCODE_LRP, result_dst, op[2], op[1], op[0]);
       break;
 
+   case ir_triop_csel:
+      op[0].negate = ~op[0].negate;
+      emit(ir, OPCODE_CMP, result_dst, op[0], op[1], op[2]);
+      break;
+
    case ir_binop_vector_extract:
    case ir_triop_fma:
    case ir_triop_bitfield_extract:
    case ir_triop_vector_insert:
    case ir_quadop_bitfield_insert:
    case ir_binop_ldexp:
-   case ir_triop_csel:
    case ir_binop_carry:
    case ir_binop_borrow:
    case ir_binop_imul_high:
--
2.10.2
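Why negating the condition makes OPCODE_CMP behave like csel can be seen in a scalar C sketch. Two assumptions are baked in: that CMP follows the ARB program semantics (pick the second operand when the first is less than zero, otherwise the third), and that Mesa IR represents booleans as the floats 1.0 and 0.0. The function names are hypothetical.

```c
#include <stdbool.h>

/* Assumed CMP semantics: dst = (src0 < 0) ? src1 : src2. */
static float
opcode_cmp_sketch(float src0, float src1, float src2)
{
   return src0 < 0.0f ? src1 : src2;
}

/* csel(cond, a, b) built on CMP by negating the condition operand:
 * true (1.0) becomes -1.0, which is < 0, so a is picked;
 * false (0.0) becomes -0.0, which is not < 0, so b is picked. */
static float
csel_via_cmp(bool cond, float if_true, float if_false)
{
   const float c = cond ? 1.0f : 0.0f;   /* assumed boolean encoding */
   return opcode_cmp_sketch(-c, if_true, if_false);
}
```

The toggle `op[0].negate = ~op[0].negate` in the patch plays the role of the unary minus here, so no extra instruction is needed.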
Re: [Mesa-dev] [PATCH 01/31] radv: program a default point size.
Reviewed-by: Bas Nieuwenhuizen

On Fri, Jan 20, 2017 at 4:02 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> Along the lines of what
> 3b804819 anv: Default PointSize to 1.0 if not written by the shader
> does for anv, program a default point size in the hw of 1.0.
>
> This preempt fixes a bunch of geom shader tests.
>
> Signed-off-by: Dave Airlie
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
> index c6f238b..c62d275 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -438,7 +438,8 @@ radv_emit_graphics_raster_state(struct radv_cmd_buffer *cmd_buffer,
>                                raster->spi_interp_control);
>
>         radeon_set_context_reg_seq(cmd_buffer->cs, R_028A00_PA_SU_POINT_SIZE, 2);
> -       radeon_emit(cmd_buffer->cs, 0);
> +       unsigned tmp = (unsigned)(1.0 * 8.0);
> +       radeon_emit(cmd_buffer->cs, S_028A00_HEIGHT(tmp) | S_028A00_WIDTH(tmp));
>         radeon_emit(cmd_buffer->cs, S_028A04_MIN_SIZE(radv_pack_float_12p4(0)) |
>                     S_028A04_MAX_SIZE(radv_pack_float_12p4(8192/2))); /* R_028A04_PA_SU_POINT_MINMAX */
>
> --
> 2.9.3
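The constant `(unsigned)(1.0 * 8.0)` makes sense if the register fields hold a half-size in 12.4 fixed point, which the neighbouring `radv_pack_float_12p4(8192/2)` call suggests. The sketch below works through that arithmetic; it is an assumption-laden illustration, `pack_12p4_sketch` is a hypothetical stand-in whose clamping is a guess rather than a copy of `radv_pack_float_12p4`, and `point_half_size_field` is likewise hypothetical.

```c
/* Hypothetical 12.4 fixed-point packer: 4 fractional bits, so the value is
 * scaled by 16. The clamping bounds are an assumption of this sketch. */
static unsigned
pack_12p4_sketch(float x)
{
   return x <= 0.0f    ? 0u :
          x >= 4096.0f ? 0xffffu : (unsigned)(x * 16.0f);
}

/* Half of the point size in 12.4 fixed point: (size / 2) * 16 = size * 8,
 * which is the expression used in the patch. */
static unsigned
point_half_size_field(float point_size)
{
   return (unsigned)(point_size * 8.0f);
}
```

Under that reading, a point size of 1.0 yields the field value 8, the same as packing the half-size 0.5 in 12.4 format.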
Re: [Mesa-dev] [PATCH 23/31] radv/ac: handle case of swizzle with single components in get_alu_src.
How are you hitting this? The enclosing if is (need_swizzle || num_components != src_components) and if src_components = num_components = 1, then need_swizzle should be false?

On Fri, Jan 20, 2017 at 4:03 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> This gets hit with some geom shaders.
>
> Signed-off-by: Dave Airlie
> ---
>  src/amd/common/ac_nir_to_llvm.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index 92e2b44..97e352b 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -844,7 +844,10 @@ static LLVMValueRef get_alu_src(struct nir_to_llvm_context *ctx,
>                     LLVMConstInt(ctx->i32, src.swizzle[2], false),
>                     LLVMConstInt(ctx->i32, src.swizzle[3], false)};
>
> -               if (src_components > 1 && num_components == 1) {
> +               if (src_components == 1 && num_components == 1) {
> +                       value = LLVMBuildExtractElement(ctx->builder, value,
> +                                                       masks[0], "");
> +               } else if (src_components > 1 && num_components == 1) {
>                         value = LLVMBuildExtractElement(ctx->builder, value,
>                                                         masks[0], "");
>                 } else if (src_components == 1 && num_components > 1) {
> --
> 2.9.3
Re: [Mesa-dev] [PATCH] radeonsi Add disk shader cache
On 24/01/2017 22:59, Timothy Arceri wrote:
> On Tue, 2017-01-24 at 18:10 +0100, kdj0c wrote:
> > On 24/01/2017 17:40, Nicolai Hähnle wrote:
> > > On 24.01.2017 17:08, kdj0c wrote:
> > > > use the util/disk_cache.c interface to cache some? radeonsi
> > > > shaders on disk
> > > >
> > > > missing features :
> > > >
> > > > - add #if ENABLE_SHADER_CACHE where needed.
> > > > - when loading from disk cache, also insert it to RAM cache.
> > > >
> > > > must be built with --enable-shader-cache to have the cache
> > > > working.
> > > > ---
> > > > Hi, This is my first mail to the list.
> > > >
> > > > I'm not sure this is the right way to do this, it's my first
> > > > attempt to patch mesa.
> > > > I've tested on a radeon HD7950 with glxgears and quake3. I have
> > > > some binary shaders in ~/.cache/mesa after running them, and they
> > > > are re-used when re-launching them.
> > > > I wanted to test more recent games, but the LD_LIBRARY_PATH trick
> > > > didn't work with steam games, and I don't want to install mesa
> > > > master system-wide.
> > >
> > > Unfortunately, I'd say that this is a pretty wrong approach. A
> > > radeonsi-level cache is nice, but the GLSL-level compilation and
> > > linking has overhead as well, which we want to avoid with the
> > > cache.
> > >
> > > We really want to detect a re-used shader already at the GLSL
> > > level, to be able to go straight to binaries (and TGSI I guess, for
> > > optimized monolithic variants).
> >
> > ok This is what I was wondering, it's not the right place to put it.
> > (but it was easy because there was already a RAM cache).
> >
> > Thanks
>
> Hi,
>
> Welcome to contributing to Mesa :)
>
> I'm not sure how much time you have to work on this feature, but just
> letting you know it was my intention to start work on shader cache
> support for radeonsi next week.

ok, I was following the advice and looking at glsl and tgsi code, but I'm a bit lost. That looks too complex for a first contribution. I can still help by testing patches (on radeon HD7950 only).

Thanks,
--
Jocelyn

> Tim
Re: [Mesa-dev] [PATCH] anv: implement pipeline statistics queries
On Tue, Jan 24, 2017 at 2:37 PM, Ilia Mirkin wrote:
> On Tue, Jan 24, 2017 at 5:27 PM, Robert Bragg wrote:
>>> +/*
>>> + * GPR0 = GPR0 >> 2;
>>> + *
>>> + * Note that the upper 30 bits of GPR are lost!
>>> + */
>>> +static void
>>> +shr_gpr0_by_2_bits(struct anv_batch *batch)
>>> +{
>>> +   shl_gpr0_by_30_bits(batch);
>>> +   emit_load_alu_reg_reg32(batch, CS_GPR(0) + 4, CS_GPR(0));
>>> +   emit_load_alu_reg_imm32(batch, CS_GPR(0) + 4, 0);
>>
>>
>> I recently noticed from inspecting the original hsw_queryobj.c code
>> that this looks suspicious.
>>
>> Conceptually it makes sense to implement a right shift as lshift by
>> 32-n and then keeping the upper 32bits, but the emit_load_ functions
>> take a destination followed by a source and so it looks like after the
>> left shift it's copying the least significant 32bits of R0 over the
>> most significant and then setting the most significant 32bits of R0 to
>> zero. It looks like the first load_alu is redundant if the second one
>> just writes zero to the same location.
>>
>> Maybe I'm misreading something here though, this comment is just based
>> on inspection.
>
> What you're missing, I think, is that
>
> emit_load_alu_reg_reg32(batch, CS_GPR(0) + 4, CS_GPR(0));
>
> does CS_GPR(0) = CS_GPR(0) + 4, and not the inverse as one logically
> might have thought. I copied the semantics from the hsw_queryobj.c
> file, but I think they stink. But it stinks even more to have 2
> functions with inverted argument meanings.
>
> Does that make sense?

oh yeah sorry, not sure how I convinced myself it took dst then src.

>
> [So we have GPR0 which is a 64-bit entity, and do GPR0 <<= 30; GPR0_LO
> = GPR0_HI; GPR0_HI = 0; and then we can store GPR0 somewhere.]
>
> As for re-using your generalized shifter, I don't think that'd make
> sense to introduce in this change. It feels like a component on its
> own, which should be integrated (or not) separately. When/if it is,
> this and hsw_queryobj.c could migrate to using it.

Yup definitely, this code works for the current need so no need to mess around with it here - thanks for clarifying my misreading.

- Robert

>
> Cheers,
>
> -ilia
[Mesa-dev] [Bug 99517] [TRACKER] Mesa 17.0 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=99517

Mark Janes changed:

           What    |Removed |Added
----------------------------------------
         Depends on|        |96907

Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=96907
[Bug 96907] piglit.spec.arb_gpu_shader5.arb_gpu_shader5-emitstreamvertex_nodraw intermittent

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
[Mesa-dev] [Bug 99517] [TRACKER] Mesa 17.0 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=99517

Mark Janes changed:

           What    |Removed |Added
----------------------------------------
         Depends on|        |98892

Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=98892
[Bug 98892] [BDW] dEQP-VK.ubo.single_nested_struct_array tests intermittent

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
[Mesa-dev] [Bug 99517] [TRACKER] Mesa 17.0 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=99517

Mark Janes changed:

           What    |Removed |Added
----------------------------------------
         Depends on|        |99099

Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=99099
[Bug 99099] [SNB] intermittent gpu hang in piglit.spec.ext_framebuffer_multisample.accuracy

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
[Mesa-dev] [Bug 99517] [TRACKER] Mesa 17.0 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=99517

Mark Janes changed:

           What    |Removed |Added
----------------------------------------
         Depends on|        |99266

Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=99266
[Bug 99266] piglit.spec.ext_framebuffer_object.getteximage-formats init-by-clear-and-render

--
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
Re: [Mesa-dev] [PATCH] anv: implement pipeline statistics queries
On Tue, Jan 24, 2017 at 5:27 PM, Robert Bragg wrote:
>> +/*
>> + * GPR0 = GPR0 >> 2;
>> + *
>> + * Note that the upper 30 bits of GPR are lost!
>> + */
>> +static void
>> +shr_gpr0_by_2_bits(struct anv_batch *batch)
>> +{
>> +   shl_gpr0_by_30_bits(batch);
>> +   emit_load_alu_reg_reg32(batch, CS_GPR(0) + 4, CS_GPR(0));
>> +   emit_load_alu_reg_imm32(batch, CS_GPR(0) + 4, 0);
>
>
> I recently noticed from inspecting the original hsw_queryobj.c code
> that this looks suspicious.
>
> Conceptually it makes sense to implement a right shift as lshift by
> 32-n and then keeping the upper 32bits, but the emit_load_ functions
> take a destination followed by a source and so it looks like after the
> left shift it's copying the least significant 32bits of R0 over the
> most significant and then setting the most significant 32bits of R0 to
> zero. It looks like the first load_alu is redundant if the second one
> just writes zero to the same location.
>
> Maybe I'm misreading something here though, this comment is just based
> on inspection.

What you're missing, I think, is that

emit_load_alu_reg_reg32(batch, CS_GPR(0) + 4, CS_GPR(0));

does CS_GPR(0) = CS_GPR(0) + 4, and not the inverse as one logically might have thought. I copied the semantics from the hsw_queryobj.c file, but I think they stink. But it stinks even more to have 2 functions with inverted argument meanings.

Does that make sense?

[So we have GPR0 which is a 64-bit entity, and do GPR0 <<= 30; GPR0_LO = GPR0_HI; GPR0_HI = 0; and then we can store GPR0 somewhere.]

As for re-using your generalized shifter, I don't think that'd make sense to introduce in this change. It feels like a component on its own, which should be integrated (or not) separately. When/if it is, this and hsw_queryobj.c could migrate to using it.

Cheers,

-ilia
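The bracketed sequence above (GPR0 <<= 30; GPR0_LO = GPR0_HI; GPR0_HI = 0) can be checked on the CPU. This is a minimal sketch using plain 64-bit integer arithmetic rather than the command-streamer ALU; `shr2_via_shl30` is a made-up name for illustration and nothing here is anv code. It also shows what the "upper 30 bits are lost" comment means: only bits 2..33 of the input survive.

```c
#include <stdint.h>

/* Illustrative only: emulate the GPR0 sequence from the mail above.
 * The function name is hypothetical. */
static uint64_t shr2_via_shl30(uint64_t gpr0)
{
    uint32_t hi;

    gpr0 <<= 30;                   /* GPR0 <<= 30: the top 30 bits fall off */
    hi = (uint32_t)(gpr0 >> 32);   /* read GPR0_HI */
    return (uint64_t)hi;           /* GPR0_LO = GPR0_HI; GPR0_HI = 0 */
}
```

For inputs that fit in 34 bits this matches `x >> 2`; for larger inputs the result is `(x >> 2)` truncated to 32 bits.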
Re: [Mesa-dev] [PATCH] anv: implement pipeline statistics queries
Sorry for the delay responding here; some comments below...

On Tue, Jan 24, 2017 at 11:48 AM, Ilia Mirkin wrote:
> 2-month ping. [ok, it hasn't been 2 months on the dot, but ... close.]
>
> On Tue, Jan 10, 2017 at 5:49 PM, Ilia Mirkin wrote:
>> ping.
>>
>> On Thu, Dec 22, 2016 at 11:14 AM, Ilia Mirkin wrote:
>>> Ping? Any further comments/feedback/reviews?
>>>
>>>
>>> On Dec 5, 2016 11:22 AM, "Ilia Mirkin" wrote:
>>>
>>> On Mon, Dec 5, 2016 at 11:11 AM, Robert Bragg wrote:

On Sun, Nov 27, 2016 at 7:23 PM, Ilia Mirkin wrote:
>
> The strategy is to just keep n anv_query_pool_slot entries per query
> instead of one. The available bit is only valid in the last one.
>
> Signed-off-by: Ilia Mirkin
> ---
>
> I think this is in a pretty good state now. I've tested both the direct and
> buffer paths with a hacked up cube application, and I'm seeing non-ridiculous
> values for the various counters, although I haven't 100% verified them for
> accuracy.
>
> This also implements the hsw/bdw workaround for dividing frag invocations by 4,
> copied from hsw_queryobj. I tested this on SKL and it seems to divide the values
> as expected.
>
> The cube patch I've been testing with is at
> http://paste.debian.net/899374/
> You can flip between copying to a buffer and explicit retrieval by commenting
> out the relevant function calls.
>
>  src/intel/vulkan/anv_device.c | 2 +-
>  src/intel/vulkan/anv_private.h | 4 +
>  src/intel/vulkan/anv_query.c | 99 ++
>  src/intel/vulkan/genX_cmd_buffer.c | 260
> -
>  4 files changed, 308 insertions(+), 57 deletions(-)
>
>
> diff --git a/src/intel/vulkan/anv_device.c
> b/src/intel/vulkan/anv_device.c
> index 99eb73c..7ad1970 100644
> --- a/src/intel/vulkan/anv_device.c
> +++ b/src/intel/vulkan/anv_device.c
> @@ -427,7 +427,7 @@ void anv_GetPhysicalDeviceFeatures(
>        .textureCompressionASTC_LDR = pdevice->info.gen >=
> 9,
> /* FINISHME CHV */
>        .textureCompressionBC = true,
>        .occlusionQueryPrecise= true,
> -      .pipelineStatisticsQuery = false,
> +      .pipelineStatisticsQuery = true,
>        .fragmentStoresAndAtomics = true,
>        .shaderTessellationAndGeometryPointSize = true,
>        .shaderImageGatherExtended= false,
> diff --git a/src/intel/vulkan/anv_private.h
> b/src/intel/vulkan/anv_private.h
> index 2fc543d..7271609 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -1763,6 +1763,8 @@ struct anv_render_pass {
>     struct anv_subpass subpasses[0];
>  };
>
> +#define ANV_PIPELINE_STATISTICS_COUNT 11
> +
>  struct anv_query_pool_slot {
>     uint64_t begin;
>     uint64_t end;
> @@ -1772,6 +1774,8 @@ struct anv_query_pool_slot {
>  struct anv_query_pool {
>     VkQueryType type;
>     uint32_t slots;
> +   uint32_t pipeline_statistics;
> +   uint32_t slot_stride;
>     struct anv_bo bo;
>  };
>
> diff --git a/src/intel/vulkan/anv_query.c b/src/intel/vulkan/anv_query.c
> index 293257b..dc00859 100644
> --- a/src/intel/vulkan/anv_query.c
> +++ b/src/intel/vulkan/anv_query.c
> @@ -38,8 +38,10 @@ VkResult anv_CreateQueryPool(
>     ANV_FROM_HANDLE(anv_device, device, _device);
>     struct anv_query_pool *pool;
>     VkResult result;
> -   uint32_t slot_size;
> -   uint64_t size;
> +   uint32_t slot_size = sizeof(struct anv_query_pool_slot);
> +   uint32_t slot_stride = 1;
> +   uint64_t size = pCreateInfo->queryCount * slot_size;
> +   uint32_t pipeline_statistics = 0;
>
>     assert(pCreateInfo->sType ==
> VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO);
>
> @@ -48,12 +50,16 @@ VkResult anv_CreateQueryPool(
>     case VK_QUERY_TYPE_TIMESTAMP:
>        break;
>     case VK_QUERY_TYPE_PIPELINE_STATISTICS:
> -      return VK_ERROR_INCOMPATIBLE_DRIVER;
> +      pipeline_statistics = pCreateInfo->pipelineStatistics &
> +         ((1 << ANV_PIPELINE_STATISTICS_COUNT) - 1);
> +      slot_stride = _mesa_bitcount(pipeline_statistics);
> +      size *= slot_stride;
> +      break;
>     default:
>        assert(!"Invalid query type");
> + return
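The sizing logic in the quoted anv_CreateQueryPool hunk (mask the requested statistics to the supported bits, count them, and keep that many slots per query) can be sketched standalone as below. This is a rough illustration, not the driver code: `count_bits` stands in for _mesa_bitcount, `STATISTICS_COUNT` mirrors ANV_PIPELINE_STATISTICS_COUNT from the patch, and `query_pool_size` with its parameters is a hypothetical wrapper.

```c
#include <stdint.h>

#define STATISTICS_COUNT 11   /* mirrors ANV_PIPELINE_STATISTICS_COUNT */

/* Stand-in for _mesa_bitcount: number of set bits in v. */
static uint32_t count_bits(uint32_t v)
{
    uint32_t n = 0;
    while (v) {
        v &= v - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical wrapper: bytes needed for a pipeline-statistics pool that
 * keeps one slot per enabled counter for each query. */
static uint64_t query_pool_size(uint32_t query_count, uint32_t slot_size,
                                uint32_t requested_statistics)
{
    uint32_t enabled = requested_statistics & ((1u << STATISTICS_COUNT) - 1);
    uint32_t slot_stride = count_bits(enabled);

    return (uint64_t)query_count * slot_size * slot_stride;
}
```

Bits above the eleven known counters are ignored, so a request with only unknown bits set would give a stride of zero; whether the real driver needs to guard against that is outside what the quoted hunk shows.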
[Mesa-dev] [PATCH] i965: Use a UW source type for CS_OPCODE_CS_TERMINATE.
SIMD16 compute shaders use a send(16) with mlen 1 for the EOT message, using a source of g127 for the single register. With a UD type, this supposedly could read g128, which doesn't exist, causing the simulator to get cranky. Use a UW type to avoid this.

Signed-off-by: Kenneth Graunke
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index cea38d86237..97420586d71 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -814,7 +814,8 @@ fs_visitor::emit_cs_terminate()

    /* Send a message to the thread spawner to terminate the thread. */
    fs_inst *inst = bld.exec_all()
-                      .emit(CS_OPCODE_CS_TERMINATE, reg_undef, payload);
+                      .emit(CS_OPCODE_CS_TERMINATE, reg_undef,
+                            retype(payload, BRW_REGISTER_TYPE_UW));
    inst->eot = true;
 }

--
2.11.0
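The register arithmetic behind the commit message can be sketched as follows, assuming 32-byte GRF registers (an assumption on my part; the patch does not state the register size). A 16-wide UD source covers 16 * 4 = 64 bytes, i.e. two registers starting at g127, while a 16-wide UW source covers 16 * 2 = 32 bytes and stays within g127. `last_grf_read` is a hypothetical helper for illustration only.

```c
#include <assert.h>

#define GRF_SIZE_BYTES 32u   /* ASSUMPTION: 256-bit general registers */

/* Hypothetical helper: index of the last register touched when reading
 * exec_size channels of type_size bytes starting at first_grf. */
static unsigned last_grf_read(unsigned first_grf, unsigned exec_size,
                              unsigned type_size)
{
    unsigned bytes, regs;

    assert(exec_size > 0 && type_size > 0);
    bytes = exec_size * type_size;
    regs = (bytes + GRF_SIZE_BYTES - 1) / GRF_SIZE_BYTES;   /* round up */
    return first_grf + regs - 1;
}
```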
Re: [Mesa-dev] [PATCH] radeonsi Add disk shader cache
On Tue, 2017-01-24 at 18:10 +0100, kdj0c wrote:
> On 24/01/2017 17:40, Nicolai Hähnle wrote:
> > On 24.01.2017 17:08, kdj0c wrote:
> > > use the util/disk_cache.c interface to cache some? radeonsi
> > > shaders on disk
> > >
> > > missing features :
> > >
> > > - add #if ENABLE_SHADER_CACHE where needed.
> > > - when loading from disk cache, also insert it to RAM cache.
> > >
> > > must be built with --enable-shader-cache to have the cache
> > > working.
> > > ---
> > > Hi, This is my first mail to the list.
> > >
> > > I'm not sure this is the right way to do this, it's my first
> > > attempt to patch mesa.
> > > I've tested on a radeon HD7950 with glxgears and quake3. I have
> > > some binary shaders in ~/.cache/mesa after running them, and they
> > > are re-used when re-launching them.
> > > I wanted to test more recent games, but the LD_LIBRARY_PATH trick
> > > didn't work with steam games, and I don't want to install mesa
> > > master system-wide.
> >
> > Unfortunately, I'd say that this is a pretty wrong approach. A
> > radeonsi-level cache is nice, but the GLSL-level compilation and
> > linking has overhead as well, which we want to avoid with the
> > cache.
> >
> > We really want to detect a re-used shader already at the GLSL
> > level, to be able to go straight to binaries (and TGSI I guess, for
> > optimized monolithic variants).
> >
>
> ok This is what I was wondering, it's not the right place to put it.
> (but it was easy because there was already a RAM cache).
>
> Thanks

Hi,

Welcome to contributing to Mesa :)

I'm not sure how much time you have to work on this feature, but just letting you know it was my intention to start work on shader cache support for radeonsi next week.

Tim
Re: [Mesa-dev] [PATCH 1/2] r600g: use ieee variants of multiplication instructions
On 24/01/2017 20:11, Matteo Bruni wrote:
> 2017-01-24 19:15 GMT+01:00 Ilia Mirkin:
>> On Tue, Jan 24, 2017 at 1:11 PM, Matteo Bruni wrote:
>>> 2017-01-24 3:18 GMT+01:00 Ilia Mirkin:
>>>> This matches the behavior of most other drivers, including nouveau.
>>>
>>> Doesn't this break all the applications depending on d3d9 NaN
>>> behavior (including, but not limited to, d3d9 games in Wine) on r600g?
>>> If I got this right, flipping around the two patches in this series
>>> and enabling the TGSI_PROPERTY_MUL_ZERO_WINS flag for OpenGL
>>> non-compute shaders (if that's not the case already) should avoid
>>> regressions.
>>
>> This patch normalizes r600g wrt multiply handling with the other
>> DX10/11 hardware drivers. nv50, nvc0, si, and i965 all use the IEEE
>> behavior. I don't know for sure, but assume that nv30 and r300 have
>> the DX9 behavior natively without IEEE support.
>>
>> The next patch allows for the MUL_ZERO_WINS property to be used to get
>> the DX9 behavior, which st/nine will make use of.
>
> That doesn't help Wine or any "native" OpenGL application which
> happens to depend on the old behavior. Even if there are none of them
> (which doesn't sound right to me) applying this patch before 2/2 means
> that you are changing behavior for nine in this one patch and changing
> it back again with the next, which looks to me as something generally
> better avoided.

Bad apps that depend on the behaviour could be listed in drirc with a workaround to force them to use the gl extension associated with the feature.

Yours,

Axel Davy
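For readers unfamiliar with the two multiply flavours being argued about, here is a small CPU-side sketch. It assumes the usual description of the legacy (DX9-style) rule, namely that a zero operand forces a zero result even when the other operand is infinity or NaN, whereas IEEE multiplication yields NaN for 0 * inf. The function names are made up for illustration and do not correspond to r600g or TGSI code.

```c
#include <math.h>

/* IEEE multiply: 0 * inf and 0 * NaN both produce NaN. */
static float mul_ieee(float a, float b)
{
    return a * b;
}

/* Legacy "zero wins" multiply (illustrative): if either operand is zero
 * the result is zero, regardless of the other operand. */
static float mul_zero_wins(float a, float b)
{
    if (a == 0.0f || b == 0.0f)
        return 0.0f;
    return a * b;
}
```

A shader that relies on `0 * x == 0` to mask out a term keeps working under the second rule but can start producing NaN under the first, which is the regression risk raised in the thread.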
Re: [Mesa-dev] [PATCH V3] glsl: lower constant arrays to uniform arrays before optimisation loop
On Tue, 2017-01-24 at 09:57 -0800, Eric Anholt wrote:
> Timothy Arceri writes:
>
> > From: Timothy Arceri
> >
> > Previously the constant array would not get copy propagated until
> > the backend
> > did its GLSL IR opt loop. I plan on removing that from i965 shortly
> > which
> > caused huge regressions in Deus-ex and Tomb Raider which have large
> > constant arrays. Moving lowering before the opt loop in the GLSL
> > linker
> > fixes this and unexpectedly improves some compute shaders also.
>
> It seems like we should figure out what's missing in NIR that the
> lack
> of GLSL copy propagation hurt, but this is a pretty easy fix for now:
>
> Reviewed-by: Eric Anholt

Thanks. The problem in NIR is that we end up with IR that looks like this.

   vec4 32 ssa_496 = intrinsic load_var () (constarray_0_4[264]) ()
   intrinsic store_var (ssa_496) (icb[264]) (15) /* wrmask=xyzw */

But NIR's variable-based copy propagation pass needs there to be a copy_var in order to progress.

We certainly need to improve this, but there are so many bits that need to be improved that I'm trying not to get sidetracked; for now my goal is to remove all GLSL IR opts from the i965 linker.

Also, since this actually improved some shaders, it makes sense to make the change now so that we can try to carry over the improvement when fixing the NIR pass.
[Mesa-dev] [Bug 99517] [TRACKER] Mesa 17.0 release tracker
https://bugs.freedesktop.org/show_bug.cgi?id=99517

Mark Janes changed:

           What    |Removed |Added
----------------------------------------
         Depends on|        |99509

Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=99509
[Bug 99509] [SKLGT4e] piglit.spec.arb_shader_image_load_store.qualifiers intermittent

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH] glsl: fix compile errors with mingw due to missing PRIx64 definitions
On 01/23/2017 11:21 AM, srol...@vmware.com wrote:
> From: Roland Scheidegger
>
> define __STDC_FORMAT_MACROS and include <inttypes.h> (same as
> ir_builder_print_visitor.cpp already does).
>
> Otherwise, some mingw build errors out (since
> 8e7e1ae0365ddc7edb0d4d98250ab46728e6c14a and
> bbce1c538dc0cb8bf3769510283d11847dc07540 presumably) with:
> src/compiler/glsl/ir_print_visitor.cpp:479:40: error: expected ‘)’ before
> ‘PRIu64’
>    case GLSL_TYPE_UINT64: fprintf(f, "%" PRIu64, ir->value.u64[i]); break;
>
> (Note even with that fix I get other format specifier warnings:
> src/compiler/glsl/ir_print_visitor.cpp:473:47:
> warning: unknown conversion type character ‘a’ in format [-Wformat=]
> fprintf(f, "%a", ir->value.f[i]);
>    ^
> src/compiler/glsl/ir_print_visitor.cpp:473:47:
> warning: too many arguments for format [-Wformat-extra-args]
> but it still compiles at least)

Ouch. That was added over 3 years ago.

commit 1ecfdba98a346c8bb05ad9403e3a6412574215f4
Author: Matt Turner
Date: Sun Aug 4 14:01:30 2013 -0700

    glsl: Add heuristics to print floating-point numbers better.

    v2: Fix *.expected files to match.

    Reviewed-by: Paul Berry

> ---
>  src/compiler/glsl/glsl_parser_extras.cpp | 2 ++
>  src/compiler/glsl/ir_print_visitor.cpp | 2 ++
>  2 files changed, 4 insertions(+)
>
> diff --git a/src/compiler/glsl/glsl_parser_extras.cpp
> b/src/compiler/glsl/glsl_parser_extras.cpp
> index e888090..3d2fc14 100644
> --- a/src/compiler/glsl/glsl_parser_extras.cpp
> +++ b/src/compiler/glsl/glsl_parser_extras.cpp
> @@ -20,6 +20,8 @@
>   * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>   * DEALINGS IN THE SOFTWARE.
>   */
> +#define __STDC_FORMAT_MACROS 1
> +#include <inttypes.h> /* for PRIx64 macro */
>  #include
>  #include
>  #include
> diff --git a/src/compiler/glsl/ir_print_visitor.cpp
> b/src/compiler/glsl/ir_print_visitor.cpp
> index 0763277..debbdad 100644
> --- a/src/compiler/glsl/ir_print_visitor.cpp
> +++ b/src/compiler/glsl/ir_print_visitor.cpp
> @@ -21,6 +21,8 @@
>   * DEALINGS IN THE SOFTWARE.
>   */
>
> +#define __STDC_FORMAT_MACROS 1
> +#include <inttypes.h> /* for PRIx64 macro */
>  #include "ir_print_visitor.h"
>  #include "compiler/glsl_types.h"
>  #include "glsl_parser_extras.h"
>
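As a small standalone illustration of the pattern the patch relies on: the PRI* macros from the C standard header inttypes.h expand to the correct printf length modifier for fixed-width types, and some older C++ toolchains only expose them when __STDC_FORMAT_MACROS is defined before the include. The sketch below is plain C with a made-up helper name (`format_u64`); it is not taken from Mesa.

```c
#define __STDC_FORMAT_MACROS 1   /* harmless in C; needed by some older C++ setups */
#include <inttypes.h>            /* PRIu64, and pulls in stdint.h */
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format a 64-bit unsigned value portably.
 * Returns what snprintf returns (the number of characters needed). */
static int format_u64(char *buf, size_t len, uint64_t value)
{
    /* "%" PRIu64 concatenates to "%lu", "%llu" or "%I64u" as appropriate. */
    return snprintf(buf, len, "%" PRIu64, value);
}
```

Writing `"%lu"` or `"%llu"` directly would be wrong on at least one common platform, which is the portability problem the macro exists to hide.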
[Mesa-dev] [PATCH 5/5] utils/sha1: make _mesa_sha1_final a simple define around SHA1Final
From: Emil Velikov

Swap the argument order as applicable.

Signed-off-by: Emil Velikov
---
Similar patch for _mesa_sha1_update will require a bunch of casting due to the data type, which imho makes things uglier.
---
 src/amd/vulkan/radv_descriptor_set.c | 2 +-
 src/amd/vulkan/radv_pipeline_cache.c | 2 +-
 src/intel/vulkan/anv_descriptor_set.c | 2 +-
 src/intel/vulkan/anv_pipeline_cache.c | 2 +-
 src/util/mesa-sha1.h | 8 ++--
 5 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/src/amd/vulkan/radv_descriptor_set.c b/src/amd/vulkan/radv_descriptor_set.c
index 435b7394a3..e35ed99d71 100644
--- a/src/amd/vulkan/radv_descriptor_set.c
+++ b/src/amd/vulkan/radv_descriptor_set.c
@@ -219,7 +219,7 @@ VkResult radv_CreatePipelineLayout(
        layout->push_constant_size = align(layout->push_constant_size, 16);
        _mesa_sha1_update(&ctx, &layout->push_constant_size,
                          sizeof(layout->push_constant_size));
-       _mesa_sha1_final(&ctx, layout->sha1);
+       _mesa_sha1_final(layout->sha1, &ctx);
        *pPipelineLayout = radv_pipeline_layout_to_handle(layout);

        return VK_SUCCESS;
diff --git a/src/amd/vulkan/radv_pipeline_cache.c b/src/amd/vulkan/radv_pipeline_cache.c
index 1bfdbe804c..164d38fc96 100644
--- a/src/amd/vulkan/radv_pipeline_cache.c
+++ b/src/amd/vulkan/radv_pipeline_cache.c
@@ -104,7 +104,7 @@ radv_hash_shader(unsigned char *hash, struct radv_shader_module *module,
                                  spec_info->mapEntryCount * sizeof spec_info->pMapEntries[0]);
                _mesa_sha1_update(&ctx, spec_info->pData, spec_info->dataSize);
        }
-       _mesa_sha1_final(&ctx, hash);
+       _mesa_sha1_final(hash, &ctx);
 }

diff --git a/src/intel/vulkan/anv_descriptor_set.c b/src/intel/vulkan/anv_descriptor_set.c
index 29bb67c5c3..05a9828aab 100644
--- a/src/intel/vulkan/anv_descriptor_set.c
+++ b/src/intel/vulkan/anv_descriptor_set.c
@@ -271,7 +271,7 @@ VkResult anv_CreatePipelineLayout(
       _mesa_sha1_update(&ctx, &layout->stage[s].has_dynamic_offsets,
                         sizeof(layout->stage[s].has_dynamic_offsets));
    }
-   _mesa_sha1_final(&ctx, layout->sha1);
+   _mesa_sha1_final(layout->sha1, &ctx);

    *pPipelineLayout = anv_pipeline_layout_to_handle(layout);

diff --git a/src/intel/vulkan/anv_pipeline_cache.c b/src/intel/vulkan/anv_pipeline_cache.c
index 0b677a49f3..b34bffaca4 100644
--- a/src/intel/vulkan/anv_pipeline_cache.c
+++ b/src/intel/vulkan/anv_pipeline_cache.c
@@ -221,7 +221,7 @@ anv_hash_shader(unsigned char *hash, const void *key, size_t key_size,
                         spec_info->mapEntryCount * sizeof spec_info->pMapEntries[0]);
       _mesa_sha1_update(&ctx, spec_info->pData, spec_info->dataSize);
    }
-   _mesa_sha1_final(&ctx, hash);
+   _mesa_sha1_final(hash, &ctx);
 }

 static struct anv_shader_bin *
diff --git a/src/util/mesa-sha1.h b/src/util/mesa-sha1.h
index 02dd5f81bf..bab81299c6 100644
--- a/src/util/mesa-sha1.h
+++ b/src/util/mesa-sha1.h
@@ -40,11 +40,7 @@ _mesa_sha1_update(struct mesa_sha1 *ctx, const void *data, int size)
    SHA1Update(ctx, data, size);
 }

-static inline void
-_mesa_sha1_final(struct mesa_sha1 *ctx, unsigned char result[20])
-{
-   SHA1Final(result, ctx);
-}
+#define _mesa_sha1_final SHA1Final

 static inline void
 _mesa_sha1_format(char *buf, const unsigned char *sha1)
@@ -66,7 +62,7 @@ _mesa_sha1_compute(const void *data, size_t size, unsigned char result[20])

    _mesa_sha1_init(&ctx);
    _mesa_sha1_update(&ctx, data, size);
-   _mesa_sha1_final(&ctx, result);
+   _mesa_sha1_final(result, &ctx);
 }

--
2.11.0
[Mesa-dev] [PATCH 4/5] util/sha1: inline the final _mesa_sha1 wrappers inside the header
From: Emil Velikov

Signed-off-by: Emil Velikov
---
 src/util/Makefile.sources | 1 -
 src/util/mesa-sha1.c | 57 ---
 src/util/mesa-sha1.h | 33 ++-
 3 files changed, 27 insertions(+), 64 deletions(-)
 delete mode 100644 src/util/mesa-sha1.c

diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources
index a68a5fe22f..aeedbffdf5 100644
--- a/src/util/Makefile.sources
+++ b/src/util/Makefile.sources
@@ -17,7 +17,6 @@ MESA_UTIL_FILES :=\
        hash_table.h \
        list.h \
        macros.h \
-       mesa-sha1.c \
        mesa-sha1.h \
        sha1/sha1.c \
        sha1/sha1.h \
diff --git a/src/util/mesa-sha1.c b/src/util/mesa-sha1.c
deleted file mode 100644
index a14fec97e7..00
--- a/src/util/mesa-sha1.c
+++ /dev/null
@@ -1,57 +0,0 @@
-/* Copyright © 2007 Carl Worth
- * Copyright © 2009 Jeremy Huddleston, Julien Cristau, and Matthieu Herrb
- * Copyright © 2009-2010 Mikhail Gusarov
- * Copyright © 2012 Yaakov Selkowitz and Keith Packard
- * Copyright © 2014 Intel Corporation
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the next
- * paragraph) shall be included in all copies or substantial portions of the
- * Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- */
-
-#include "sha1/sha1.h"
-#include "mesa-sha1.h"
-
-void
-_mesa_sha1_update(struct mesa_sha1 *ctx, const void *data, int size)
-{
-   SHA1Update(ctx, data, size);
-}
-
-void
-_mesa_sha1_compute(const void *data, size_t size, unsigned char result[20])
-{
-   struct mesa_sha1 ctx;
-
-   _mesa_sha1_init(&ctx);
-   _mesa_sha1_update(&ctx, data, size);
-   _mesa_sha1_final(&ctx, result);
-}
-
-void
-_mesa_sha1_format(char *buf, const unsigned char *sha1)
-{
-   static const char hex_digits[] = "0123456789abcdef";
-   int i;
-
-   for (i = 0; i < 40; i += 2) {
-      buf[i] = hex_digits[sha1[i >> 1] >> 4];
-      buf[i + 1] = hex_digits[sha1[i >> 1] & 0x0f];
-   }
-   buf[i] = '\0';
-}
diff --git a/src/util/mesa-sha1.h b/src/util/mesa-sha1.h
index ecbc708b5e..02dd5f81bf 100644
--- a/src/util/mesa-sha1.h
+++ b/src/util/mesa-sha1.h
@@ -34,8 +34,11 @@ extern "C" {

 #define _mesa_sha1_init SHA1Init

-void
-_mesa_sha1_update(struct mesa_sha1 *ctx, const void *data, int size);
+static inline void
+_mesa_sha1_update(struct mesa_sha1 *ctx, const void *data, int size)
+{
+   SHA1Update(ctx, data, size);
+}

 static inline void
 _mesa_sha1_final(struct mesa_sha1 *ctx, unsigned char result[20])
@@ -43,11 +46,29 @@ _mesa_sha1_final(struct mesa_sha1 *ctx, unsigned char result[20])
    SHA1Final(result, ctx);
 }

-void
-_mesa_sha1_format(char *buf, const unsigned char *sha1);
+static inline void
+_mesa_sha1_format(char *buf, const unsigned char *sha1)
+{
+   static const char hex_digits[] = "0123456789abcdef";
+   int i;
+
+   for (i = 0; i < 40; i += 2) {
+      buf[i] = hex_digits[sha1[i >> 1] >> 4];
+      buf[i + 1] = hex_digits[sha1[i >> 1] & 0x0f];
+   }
+   buf[i] = '\0';
+}
+
+static inline void
+_mesa_sha1_compute(const void *data, size_t size, unsigned char result[20])
+{
+   struct mesa_sha1 ctx;
+
+   _mesa_sha1_init(&ctx);
+   _mesa_sha1_update(&ctx, data, size);
+   _mesa_sha1_final(&ctx, result);
+}

-void
-_mesa_sha1_compute(const void *data, size_t size, unsigned char result[20]);

 #ifdef __cplusplus
 } /* extern C */
--
2.11.0
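The hex-formatting loop that this patch moves into the header can be tried in isolation. The sketch below restates the same logic from the diff as a standalone function with no dependency on the SHA-1 implementation; the name `sha1_to_hex` is mine, chosen to avoid suggesting it is the Mesa symbol. The caller has to supply a buffer of at least 41 bytes (40 hex digits plus the terminator).

```c
#include <assert.h>
#include <stddef.h>

/* Standalone restatement of the formatting loop from the patch above.
 * buf must have room for 41 bytes; sha1 points at a 20-byte digest. */
static void sha1_to_hex(char *buf, const unsigned char *sha1)
{
    static const char hex_digits[] = "0123456789abcdef";
    int i;

    assert(buf != NULL && sha1 != NULL);

    for (i = 0; i < 40; i += 2) {
        buf[i] = hex_digits[sha1[i >> 1] >> 4];        /* high nibble */
        buf[i + 1] = hex_digits[sha1[i >> 1] & 0x0f];  /* low nibble */
    }
    buf[i] = '\0';
}
```

Each digest byte at index i / 2 produces two output characters, which is why the loop counts output positions in steps of two.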
[Mesa-dev] [PATCH 3/5] util/sha1: drop _mesa_sha1_{update, format} return type
From: Emil Velikov

Unused/unchecked by any of the callers.

Signed-off-by: Emil Velikov
---
 src/util/mesa-sha1.c | 7 ++-
 src/util/mesa-sha1.h | 4 ++--
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/src/util/mesa-sha1.c b/src/util/mesa-sha1.c
index eb882e8bd0..a14fec97e7 100644
--- a/src/util/mesa-sha1.c
+++ b/src/util/mesa-sha1.c
@@ -27,11 +27,10 @@
 #include "sha1/sha1.h"
 #include "mesa-sha1.h"

-int
+void
 _mesa_sha1_update(struct mesa_sha1 *ctx, const void *data, int size)
 {
    SHA1Update(ctx, data, size);
-   return 1;
 }

 void
@@ -44,7 +43,7 @@ _mesa_sha1_compute(const void *data, size_t size, unsigned char result[20])
    _mesa_sha1_final(&ctx, result);
 }

-char *
+void
 _mesa_sha1_format(char *buf, const unsigned char *sha1)
 {
    static const char hex_digits[] = "0123456789abcdef";
@@ -55,6 +54,4 @@ _mesa_sha1_format(char *buf, const unsigned char *sha1)
       buf[i + 1] = hex_digits[sha1[i >> 1] & 0x0f];
    }
    buf[i] = '\0';
-
-   return buf;
 }
diff --git a/src/util/mesa-sha1.h b/src/util/mesa-sha1.h
index f927d5772d..ecbc708b5e 100644
--- a/src/util/mesa-sha1.h
+++ b/src/util/mesa-sha1.h
@@ -34,7 +34,7 @@ extern "C" {

 #define _mesa_sha1_init SHA1Init

-int
+void
 _mesa_sha1_update(struct mesa_sha1 *ctx, const void *data, int size);

 static inline void
@@ -43,7 +43,7 @@ _mesa_sha1_final(struct mesa_sha1 *ctx, unsigned char result[20])
    SHA1Final(result, ctx);
 }

-char *
+void
 _mesa_sha1_format(char *buf, const unsigned char *sha1);

 void
--
2.11.0
[Mesa-dev] [PATCH 2/5] util/sha1: rework _mesa_sha1_{init,final}
From: Emil Velikov

Rather than having an extra memory allocation [whose failure we currently
do not check for or handle] just make the API take a pointer to a stack
allocated instance.

This and follow-up steps will effectively make the _mesa_sha1_foo
simple define/inlines around their SHA1 counterparts.

Signed-off-by: Emil Velikov
---
 src/amd/vulkan/radv_descriptor_set.c  | 10 +-
 src/amd/vulkan/radv_pipeline_cache.c  | 18 +-
 src/intel/vulkan/anv_descriptor_set.c | 13 +++--
 src/intel/vulkan/anv_pipeline_cache.c | 18 +-
 src/util/mesa-sha1.c                  | 34 +-
 src/util/mesa-sha1.h                  | 13 -
 6 files changed, 43 insertions(+), 63 deletions(-)

diff --git a/src/amd/vulkan/radv_descriptor_set.c b/src/amd/vulkan/radv_descriptor_set.c
index eb8b5d6e3a..435b7394a3 100644
--- a/src/amd/vulkan/radv_descriptor_set.c
+++ b/src/amd/vulkan/radv_descriptor_set.c
@@ -180,7 +180,7 @@ VkResult radv_CreatePipelineLayout(
 {
     RADV_FROM_HANDLE(radv_device, device, _device);
     struct radv_pipeline_layout *layout;
-    struct mesa_sha1 *ctx;
+    struct mesa_sha1 ctx;

     assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO);

@@ -194,7 +194,7 @@ VkResult radv_CreatePipelineLayout(

     unsigned dynamic_offset_count = 0;

-    ctx = _mesa_sha1_init();
+    _mesa_sha1_init(&ctx);
     for (uint32_t set = 0; set < pCreateInfo->setLayoutCount; set++) {
         RADV_FROM_HANDLE(radv_descriptor_set_layout, set_layout,
                          pCreateInfo->pSetLayouts[set]);
@@ -204,7 +204,7 @@ VkResult radv_CreatePipelineLayout(
         for (uint32_t b = 0; b < set_layout->binding_count; b++) {
             dynamic_offset_count += set_layout->binding[b].array_size * set_layout->binding[b].dynamic_offset_count;
         }
-        _mesa_sha1_update(ctx, set_layout->binding,
+        _mesa_sha1_update(&ctx, set_layout->binding,
                           sizeof(set_layout->binding[0]) * set_layout->binding_count);
     }

@@ -217,9 +217,9 @@ VkResult radv_CreatePipelineLayout(
     }

     layout->push_constant_size = align(layout->push_constant_size, 16);
-    _mesa_sha1_update(ctx, &layout->push_constant_size,
+    _mesa_sha1_update(&ctx, &layout->push_constant_size,
                       sizeof(layout->push_constant_size));
-    _mesa_sha1_final(ctx, layout->sha1);
+    _mesa_sha1_final(&ctx, layout->sha1);
     *pPipelineLayout = radv_pipeline_layout_to_handle(layout);

     return VK_SUCCESS;
diff --git a/src/amd/vulkan/radv_pipeline_cache.c b/src/amd/vulkan/radv_pipeline_cache.c
index 2cb1dfb6eb..1bfdbe804c 100644
--- a/src/amd/vulkan/radv_pipeline_cache.c
+++ b/src/amd/vulkan/radv_pipeline_cache.c
@@ -90,21 +90,21 @@ radv_hash_shader(unsigned char *hash, struct radv_shader_module *module,
                  const struct radv_pipeline_layout *layout,
                  const union ac_shader_variant_key *key)
 {
-    struct mesa_sha1 *ctx;
+    struct mesa_sha1 ctx;

-    ctx = _mesa_sha1_init();
+    _mesa_sha1_init(&ctx);
     if (key)
-        _mesa_sha1_update(ctx, key, sizeof(*key));
-    _mesa_sha1_update(ctx, module->sha1, sizeof(module->sha1));
-    _mesa_sha1_update(ctx, entrypoint, strlen(entrypoint));
+        _mesa_sha1_update(&ctx, key, sizeof(*key));
+    _mesa_sha1_update(&ctx, module->sha1, sizeof(module->sha1));
+    _mesa_sha1_update(&ctx, entrypoint, strlen(entrypoint));
     if (layout)
-        _mesa_sha1_update(ctx, layout->sha1, sizeof(layout->sha1));
+        _mesa_sha1_update(&ctx, layout->sha1, sizeof(layout->sha1));
     if (spec_info) {
-        _mesa_sha1_update(ctx, spec_info->pMapEntries,
+        _mesa_sha1_update(&ctx, spec_info->pMapEntries,
                           spec_info->mapEntryCount * sizeof spec_info->pMapEntries[0]);
-        _mesa_sha1_update(ctx, spec_info->pData, spec_info->dataSize);
+        _mesa_sha1_update(&ctx, spec_info->pData, spec_info->dataSize);
     }
-    _mesa_sha1_final(ctx, hash);
+    _mesa_sha1_final(&ctx, hash);
 }

diff --git a/src/intel/vulkan/anv_descriptor_set.c b/src/intel/vulkan/anv_descriptor_set.c
index a5e65afc48..29bb67c5c3 100644
--- a/src/intel/vulkan/anv_descriptor_set.c
+++ b/src/intel/vulkan/anv_descriptor_set.c
@@ -259,18 +259,19 @@ VkResult anv_CreatePipelineLayout(
       }
    }

-   struct mesa_sha1 *ctx = _mesa_sha1_init();
+   struct mesa_sha1 ctx;
+   _mesa_sha1_init(&ctx);
    for (unsigned s = 0; s < layout->num_sets; s++) {
-      sha1_update_descriptor_set_layout(ctx, layout->set[s].layout);
-      _mesa_sha1_update(ctx, &layout->set[s].dynamic_offset_start,
+      sha1_update_descriptor_set_layout(&ctx, layout->set[s].layout);
+      _mesa_sha1_update(&ctx,
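
The calling pattern this patch moves to can be sketched with a toy hash. This is not Mesa's code: `toy_hash`, `toy_hash_init`, `toy_hash_update`, `toy_hash_final` and `toy_hash_buffer` are hypothetical names, and the hash is FNV-1a rather than SHA-1. The only point is that the caller owns the context on the stack and passes its address, so no allocation can fail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context; stands in for struct mesa_sha1. */
struct toy_hash {
    uint32_t state;
};

/* Initialise a caller-provided context: no allocation, nothing to check. */
static void toy_hash_init(struct toy_hash *ctx)
{
    ctx->state = 2166136261u; /* FNV-1a 32-bit offset basis */
}

/* Mix `size` bytes from `data` into the running state. */
static void toy_hash_update(struct toy_hash *ctx, const void *data, size_t size)
{
    const unsigned char *bytes = data;
    for (size_t i = 0; i < size; i++) {
        ctx->state ^= bytes[i];
        ctx->state *= 16777619u; /* FNV-1a 32-bit prime */
    }
}

/* Read out the digest; the context needs no freeing. */
static uint32_t toy_hash_final(const struct toy_hash *ctx)
{
    return ctx->state;
}

/* Caller pattern after the rework: the context lives on the stack. */
static uint32_t toy_hash_buffer(const void *data, size_t size)
{
    struct toy_hash ctx;
    toy_hash_init(&ctx);
    toy_hash_update(&ctx, data, size);
    return toy_hash_final(&ctx);
}
```

Compared with an init function that returns a heap pointer, there is no NULL case for callers to forget.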
[Mesa-dev] [PATCH 1/5] util/sha1: add non-typedef name for the SHA1_CTX struct
From: Emil Velikov

Using typedef(s) is not always the answer and makes it harder for people
to do clever (or one might call nasty) things with the code.

Add a struct name which we will use with a follow-up commit.

Signed-off-by: Emil Velikov
---
 src/util/sha1/README | 3 +++
 src/util/sha1/sha1.h | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/util/sha1/README b/src/util/sha1/README
index f13baf9d1a..f30acf984e 100644
--- a/src/util/sha1/README
+++ b/src/util/sha1/README
@@ -57,3 +57,6 @@ Upstream status: TBD (N/A ?)
  - Manually expand __BEGIN_DECLS/__END_DECLS and make sure that they include
 the struct declaration.
 Upstream status: TBD
+
+ - Add non-typedef struct name.
+Upstream status: TBD
diff --git a/src/util/sha1/sha1.h b/src/util/sha1/sha1.h
index 243481a98e..029a0ae87f 100644
--- a/src/util/sha1/sha1.h
+++ b/src/util/sha1/sha1.h
@@ -20,7 +20,7 @@ extern "C" {
 #endif

-typedef struct {
+typedef struct _SHA1_CTX {
     uint32_t state[5];
     uint64_t count;
     uint8_t buffer[SHA1_BLOCK_LENGTH];
--
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
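
One thing a struct tag allows, and an anonymous typedef does not, is a forward declaration. The sketch below is only an illustration under assumptions: `ctx_count` is a hypothetical helper, and the value 64 for `SHA1_BLOCK_LENGTH` is assumed here rather than taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SHA1_BLOCK_LENGTH 64 /* assumed value, for illustration only */

/* Forward declaration: only possible because the struct has a tag. */
struct _SHA1_CTX;

/* Hypothetical helper, declared before the full definition is visible. */
static uint64_t ctx_count(const struct _SHA1_CTX *ctx);

/* Full definition, shaped like the one in the patched sha1.h. */
typedef struct _SHA1_CTX {
    uint32_t state[5];
    uint64_t count;
    uint8_t buffer[SHA1_BLOCK_LENGTH];
} SHA1_CTX;

/* Both names refer to the same type from here on. */
static uint64_t ctx_count(const struct _SHA1_CTX *ctx)
{
    return ctx->count;
}
```

A header that only passes pointers around can then use `struct _SHA1_CTX *` without including sha1.h.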
Re: [Mesa-dev] [PATCH] anv: implement pipeline statistics queries
2-month ping. [ok, it hasn't been 2 months on the dot, but ... close.]

On Tue, Jan 10, 2017 at 5:49 PM, Ilia Mirkin wrote:
> ping.
>
> On Thu, Dec 22, 2016 at 11:14 AM, Ilia Mirkin wrote:
>> Ping? Any further comments/feedback/reviews?
>>
>> On Dec 5, 2016 11:22 AM, "Ilia Mirkin" wrote:
>>
>> On Mon, Dec 5, 2016 at 11:11 AM, Robert Bragg wrote:
>>> On Sun, Nov 27, 2016 at 7:23 PM, Ilia Mirkin wrote:
>>>> The strategy is to just keep n anv_query_pool_slot entries per query
>>>> instead of one. The available bit is only valid in the last one.
>>>>
>>>> Signed-off-by: Ilia Mirkin
>>>> ---
>>>>
>>>> I think this is in a pretty good state now. I've tested both the direct
>>>> and buffer paths with a hacked up cube application, and I'm seeing
>>>> non-ridiculous values for the various counters, although I haven't 100%
>>>> verified them for accuracy.
>>>>
>>>> This also implements the hsw/bdw workaround for dividing frag invocations
>>>> by 4, copied from hsw_queryobj. I tested this on SKL and it seems to
>>>> divide the values as expected.
>>>>
>>>> The cube patch I've been testing with is at
>>>> http://paste.debian.net/899374/
>>>> You can flip between copying to a buffer and explicit retrieval by
>>>> commenting out the relevant function calls.
>>>>
>>>>  src/intel/vulkan/anv_device.c      |   2 +-
>>>>  src/intel/vulkan/anv_private.h     |   4 +
>>>>  src/intel/vulkan/anv_query.c       |  99 ++
>>>>  src/intel/vulkan/genX_cmd_buffer.c | 260 -
>>>>  4 files changed, 308 insertions(+), 57 deletions(-)
>>>>
>>>> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
>>>> index 99eb73c..7ad1970 100644
>>>> --- a/src/intel/vulkan/anv_device.c
>>>> +++ b/src/intel/vulkan/anv_device.c
>>>> @@ -427,7 +427,7 @@ void anv_GetPhysicalDeviceFeatures(
>>>>        .textureCompressionASTC_LDR = pdevice->info.gen >= 9, /* FINISHME CHV */
>>>>        .textureCompressionBC = true,
>>>>        .occlusionQueryPrecise = true,
>>>> -      .pipelineStatisticsQuery = false,
>>>> +      .pipelineStatisticsQuery = true,
>>>>        .fragmentStoresAndAtomics = true,
>>>>        .shaderTessellationAndGeometryPointSize = true,
>>>>        .shaderImageGatherExtended = false,
>>>> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
>>>> index 2fc543d..7271609 100644
>>>> --- a/src/intel/vulkan/anv_private.h
>>>> +++ b/src/intel/vulkan/anv_private.h
>>>> @@ -1763,6 +1763,8 @@ struct anv_render_pass {
>>>>     struct anv_subpass subpasses[0];
>>>>  };
>>>>
>>>> +#define ANV_PIPELINE_STATISTICS_COUNT 11
>>>> +
>>>>  struct anv_query_pool_slot {
>>>>     uint64_t begin;
>>>>     uint64_t end;
>>>> @@ -1772,6 +1774,8 @@ struct anv_query_pool_slot {
>>>>  struct anv_query_pool {
>>>>     VkQueryType type;
>>>>     uint32_t slots;
>>>> +   uint32_t pipeline_statistics;
>>>> +   uint32_t slot_stride;
>>>>     struct anv_bo bo;
>>>>  };
>>>>
>>>> diff --git a/src/intel/vulkan/anv_query.c b/src/intel/vulkan/anv_query.c
>>>> index 293257b..dc00859 100644
>>>> --- a/src/intel/vulkan/anv_query.c
>>>> +++ b/src/intel/vulkan/anv_query.c
>>>> @@ -38,8 +38,10 @@ VkResult anv_CreateQueryPool(
>>>>     ANV_FROM_HANDLE(anv_device, device, _device);
>>>>     struct anv_query_pool *pool;
>>>>     VkResult result;
>>>> -   uint32_t slot_size;
>>>> -   uint64_t size;
>>>> +   uint32_t slot_size = sizeof(struct anv_query_pool_slot);
>>>> +   uint32_t slot_stride = 1;
>>>> +   uint64_t size = pCreateInfo->queryCount * slot_size;
>>>> +   uint32_t pipeline_statistics = 0;
>>>>
>>>>     assert(pCreateInfo->sType == VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO);
>>>>
>>>> @@ -48,12 +50,16 @@ VkResult anv_CreateQueryPool(
>>>>     case VK_QUERY_TYPE_TIMESTAMP:
>>>>        break;
>>>>     case VK_QUERY_TYPE_PIPELINE_STATISTICS:
>>>> -      return VK_ERROR_INCOMPATIBLE_DRIVER;
>>>> +      pipeline_statistics = pCreateInfo->pipelineStatistics &
>>>> +         ((1 << ANV_PIPELINE_STATISTICS_COUNT) - 1);
>>>> +      slot_stride = _mesa_bitcount(pipeline_statistics);
>>>> +      size *= slot_stride;
>>>> +      break;
>>>>     default:
>>>>        assert(!"Invalid query type");
>>>> +      return VK_ERROR_INCOMPATIBLE_DRIVER;
>>>>     }
>>>>
>>>> -   slot_size = sizeof(struct anv_query_pool_slot);
>>>>     pool = vk_alloc2(&device->alloc, pAllocator, sizeof(*pool), 8,
>>>>                      VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
>>>>     if (pool == NULL)
>>>> @@
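
The sizing arithmetic in the quoted hunk can be sketched in isolation. The names `bitcount32` and `stats_pool_size` are hypothetical (the patch itself uses `_mesa_bitcount` inline in `anv_CreateQueryPool`), and the slot size is passed in as a parameter rather than taken from `sizeof(struct anv_query_pool_slot)`.

```c
#include <assert.h>
#include <stdint.h>

#define ANV_PIPELINE_STATISTICS_COUNT 11

/* Portable population count; stands in for _mesa_bitcount(). */
static uint32_t bitcount32(uint32_t v)
{
    uint32_t n = 0;
    while (v) {
        v &= v - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* One slot per enabled statistic per query, as in the patch's strategy. */
static uint64_t stats_pool_size(uint32_t query_count, uint32_t slot_size,
                                uint32_t requested_stats)
{
    /* Ignore bits beyond the statistics the driver knows about. */
    uint32_t stats = requested_stats &
                     ((1u << ANV_PIPELINE_STATISTICS_COUNT) - 1);
    uint32_t slot_stride = bitcount32(stats);
    return (uint64_t)query_count * slot_size * slot_stride;
}
```

For example, 4 queries with 3 statistics enabled and a 24-byte slot need 4 * 24 * 3 = 288 bytes.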
Re: [Mesa-dev] [PATCH 1/2] r600g: use ieee variants of multiplication instructions
On Tue, Jan 24, 2017 at 2:11 PM, Matteo Bruni wrote:
> That doesn't help Wine or any "native" OpenGL application which
> happens to depend on the old behavior.

Oh, and another note on that - I *do* think it helps those
applications. Because now they will no longer inexplicably work on
r600 and not work on radeonsi, i965, and nouveau. It will now be
consistent, which will eliminate the "oh, that driver is broken"
suspicion.

  -ilia
Re: [Mesa-dev] [PATCH 2/2] r600g: add support for optionally using non-IEEE mul ops
I think of the first patch as a fix to the driver, and the second
patch as a new feature.

On Tue, Jan 24, 2017 at 2:27 PM, Nicolai Hähnle wrote:
> No piglit regressions on Redwood with these two patches. Matteo's point
> about switching the order of the patches around seems reasonable.
>
> Cheers,
> Nicolai
>
> On 24.01.2017 10:20, Nicolai Hähnle wrote:
>> The series looks reasonable to me, so
>>
>> Reviewed-by: Nicolai Hähnle
>>
>> Please hold off on pushing this for a day or so, to give me or someone
>> else a chance to test this.
>>
>> On 24.01.2017 03:18, Ilia Mirkin wrote:
>>> Signed-off-by: Ilia Mirkin
>>> ---
>>>
>>> Untested. Can be verified with Xnine. It should pass before 1/2 of
>>> this series, start failing with it, and pass again with 2/2 in place.
>>>
>>>  src/gallium/drivers/r600/r600_pipe.c   |  2 +-
>>>  src/gallium/drivers/r600/r600_shader.c | 20 +---
>>>  2 files changed, 18 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c
>>> index 98ceebf..d126d37 100644
>>> --- a/src/gallium/drivers/r600/r600_pipe.c
>>> +++ b/src/gallium/drivers/r600/r600_pipe.c
>>> @@ -286,6 +286,7 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
>>>      case PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT:
>>>      case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
>>>      case PIPE_CAP_CLEAR_TEXTURE:
>>> +    case PIPE_CAP_TGSI_MUL_ZERO_WINS:
>>>          return 1;
>>>
>>>      case PIPE_CAP_DEVICE_RESET_STATUS_QUERY:
>>> @@ -378,7 +379,6 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
>>>      case PIPE_CAP_NATIVE_FENCE_FD:
>>>      case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
>>>      case PIPE_CAP_TGSI_FS_FBFETCH:
>>> -    case PIPE_CAP_TGSI_MUL_ZERO_WINS:
>>>          return 0;
>>>
>>>      case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS:
>>> diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
>>> index 0114f8f..b692e7f 100644
>>> --- a/src/gallium/drivers/r600/r600_shader.c
>>> +++ b/src/gallium/drivers/r600/r600_shader.c
>>> @@ -3906,6 +3906,11 @@ static int tgsi_op2_s(struct r600_shader_ctx *ctx, int swap, int trans_only)
>>>      int i, j, r, lasti = tgsi_last_instruction(write_mask);
>>>      /* use temp register if trans_only and more than one dst component */
>>>      int use_tmp = trans_only && (write_mask ^ (1 << lasti));
>>> +    unsigned op = ctx->inst_info->op;
>>> +
>>> +    if (op == ALU_OP2_MUL_IEEE &&
>>> +        ctx->info.properties[TGSI_PROPERTY_MUL_ZERO_WINS])
>>> +        op = ALU_OP2_MUL;
>>>
>>>      for (i = 0; i <= lasti; i++) {
>>>          if (!(write_mask & (1 << i)))
>>> @@ -3919,7 +3924,7 @@ static int tgsi_op2_s(struct r600_shader_ctx *ctx, int swap, int trans_only)
>>>          } else
>>>              tgsi_dst(ctx, &inst->Dst[0], i, &alu.dst);
>>>
>>> -        alu.op = ctx->inst_info->op;
>>> +        alu.op = op;
>>>          if (!swap) {
>>>              for (j = 0; j < inst->Instruction.NumSrcRegs; j++) {
>>>                  r600_bytecode_src(&alu.src[j], &ctx->src[j], i);
>>> @@ -6543,6 +6548,11 @@ static int tgsi_op3(struct r600_shader_ctx *ctx)
>>>      int i, j, r;
>>>      int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);
>>>      int temp_regs[4];
>>> +    unsigned op = ctx->inst_info->op;
>>> +
>>> +    if (op == ALU_OP3_MULADD_IEEE &&
>>> +        ctx->info.properties[TGSI_PROPERTY_MUL_ZERO_WINS])
>>> +        op = ALU_OP3_MULADD;
>>>
>>>      for (j = 0; j < inst->Instruction.NumSrcRegs; j++) {
>>>          temp_regs[j] = 0;
>>> @@ -6554,7 +6564,7 @@ static int tgsi_op3(struct r600_shader_ctx *ctx)
>>>              continue;
>>>
>>>          memset(&alu, 0, sizeof(struct r600_bytecode_alu));
>>> -        alu.op = ctx->inst_info->op;
>>> +        alu.op = op;
>>>          for (j = 0; j < inst->Instruction.NumSrcRegs; j++) {
>>>              r = tgsi_make_src_for_op3(ctx, temp_regs[j], i, &alu.src[j], &ctx->src[j]);
>>>              if (r)
>>> @@ -6580,10 +6590,14 @@ static int tgsi_dp(struct r600_shader_ctx *ctx)
>>>      struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
>>>      struct r600_bytecode_alu alu;
>>>      int i, j, r;
>>> +    unsigned op = ctx->inst_info->op;
>>> +    if (op == ALU_OP2_DOT4_IEEE &&
>>> +        ctx->info.properties[TGSI_PROPERTY_MUL_ZERO_WINS])
>>> +        op = ALU_OP2_DOT4;
>>>
>>>      for (i = 0; i < 4; i++) {
>>>          memset(&alu, 0, sizeof(struct r600_bytecode_alu));
>>> -        alu.op = ctx->inst_info->op;
>>> +        alu.op = op;
>>>          for (j = 0; j < inst->Instruction.NumSrcRegs; j++) {
>>>              r600_bytecode_src(&alu.src[j], &ctx->src[j], i);
>>>          }
Re: [Mesa-dev] [PATCH 1/2] r600g: use ieee variants of multiplication instructions
On Tue, Jan 24, 2017 at 2:11 PM, Matteo Bruni wrote:
> 2017-01-24 19:15 GMT+01:00 Ilia Mirkin:
>> On Tue, Jan 24, 2017 at 1:11 PM, Matteo Bruni wrote:
>>> 2017-01-24 3:18 GMT+01:00 Ilia Mirkin:
>>>> This matches the behavior of most other drivers, including nouveau.
>>>
>>> Doesn't this break all the applications depending on d3d9 NaN behavior
>>> (including, but not limited to, d3d9 games in Wine) on r600g?
>>>
>>> If I got this right, flipping around the two patches in this series
>>> and enabling the TGSI_PROPERTY_MUL_ZERO_WINS flag for OpenGL
>>> non-compute shaders (if that's not the case already) should avoid
>>> regressions.
>>
>> This patch normalizes r600g wrt multiply handling with the other
>> DX10/11 hardware drivers. nv50, nvc0, si, and i965 all use the IEEE
>> behavior. I don't know for sure, but assume that nv30 and r300 have
>> the DX9 behavior natively without IEEE support.
>>
>> The next patch allows for the MUL_ZERO_WINS property to be used to get
>> the DX9 behavior, which st/nine will make use of.
>
> That doesn't help Wine or any "native" OpenGL application which
> happens to depend on the old behavior.
> Even if there are none of them (which doesn't sound right to me)
> applying this patch before 2/2 means that you are changing behavior
> for nine in this one patch and changing it back again with the next,
> which looks to me as something generally better avoided.

IMHO this patch should go in irrespective of the second patch. The
IEEE behavior on multiplies is what all the other hw drivers do.
Having one driver do one thing and every other driver do another thing
is not a great situation to be in.

The second patch is a nicety for st/nine and any future GL extensions
that can make use of the functionality.

  -ilia
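
For readers following the thread, the two multiply behaviors being argued about can be sketched on the CPU. This is an assumption-laden illustration, not a model of the hardware: `mul_ieee` and `mul_zero_wins` are hypothetical names, and details such as the sign of the resulting zero are ignored.

```c
#include <assert.h>
#include <math.h>

/* IEEE 754 multiply: 0 * inf and 0 * NaN both produce NaN. */
static float mul_ieee(float a, float b)
{
    return a * b;
}

/* "Zero wins" multiply: a zero operand forces a zero result, even when
 * the other operand is infinity or NaN. */
static float mul_zero_wins(float a, float b)
{
    if (a == 0.0f || b == 0.0f)
        return 0.0f;
    return a * b;
}
```

A shader that relies on `0 * x == 0` for any `x` gets NaN under the IEEE variant once `x` is infinite, which is the regression Matteo is worried about.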
Re: [Mesa-dev] [PATCH] vulkan: Don't install vk_platform.h or vulkan.h.
On Tue, Jan 24, 2017 at 11:25 AM, Emil Velikov wrote:
> On 24 January 2017 at 18:02, Jason Ekstrand wrote:
> > On Tue, Jan 24, 2017 at 9:03 AM, Matt Turner wrote:
> >> On Tue, Jan 24, 2017 at 8:41 AM, Emil Velikov wrote:
> >> > On 24 January 2017 at 00:54, Matt Turner wrote:
> >> >> These files belong to the vulkan loader.
> >> > Fully agreed, patch is
> >> > Reviewed-by: Emil Velikov
> >>
> >> Thanks!
> >>
> >> > Related question:
> >> > I was wondering about getting this a step further:
> >> >  - having the loader provide a .pc file
> >> >  - tracking required version at configure time and dropping our local
> >> > copies of the headers/xml.
> >> >
> >> > Would you be in favour, against, neutral of such an approach ?
> >>
> >> I'd be in favor of that, but let's see what Jason thinks.
> >
> > I'd rather not. That would make sense if we all lived in the open-source
> > world where everything is upstream all the time. Unfortunately, not all of
> > us have that luxury and we need to be able to work on experimental branches
> > of the spec that may have more extensions than are provided by any loader
> > version we can install. I'd be ok with a check for a particular loader
> > version just to force distros to update their loader but I would like to be
> > able to build with arbitrary XML branches without having to install a branch
> > of the loader.
> What if I tell you that you wouldn't need to install the loader ;-)
> More as we get a .pc patches in.

A lot of extensions don't require explicit loader support. I don't want to
have to update my loader (or put it in some folder and point pkg-config at
it) just to hack on them.
Re: [Mesa-dev] [PATCH 2/2] r600g: add support for optionally using non-IEEE mul ops
No piglit regressions on Redwood with these two patches. Matteo's point
about switching the order of the patches around seems reasonable.

Cheers,
Nicolai

On 24.01.2017 10:20, Nicolai Hähnle wrote:
> The series looks reasonable to me, so
>
> Reviewed-by: Nicolai Hähnle
>
> Please hold off on pushing this for a day or so, to give me or someone
> else a chance to test this.
>
> On 24.01.2017 03:18, Ilia Mirkin wrote:
>> Signed-off-by: Ilia Mirkin
>> ---
>>
>> Untested. Can be verified with Xnine. It should pass before 1/2 of
>> this series, start failing with it, and pass again with 2/2 in place.
>>
>>  src/gallium/drivers/r600/r600_pipe.c   |  2 +-
>>  src/gallium/drivers/r600/r600_shader.c | 20 +---
>>  2 files changed, 18 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c
>> index 98ceebf..d126d37 100644
>> --- a/src/gallium/drivers/r600/r600_pipe.c
>> +++ b/src/gallium/drivers/r600/r600_pipe.c
>> @@ -286,6 +286,7 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
>>      case PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT:
>>      case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
>>      case PIPE_CAP_CLEAR_TEXTURE:
>> +    case PIPE_CAP_TGSI_MUL_ZERO_WINS:
>>          return 1;
>>
>>      case PIPE_CAP_DEVICE_RESET_STATUS_QUERY:
>> @@ -378,7 +379,6 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
>>      case PIPE_CAP_NATIVE_FENCE_FD:
>>      case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
>>      case PIPE_CAP_TGSI_FS_FBFETCH:
>> -    case PIPE_CAP_TGSI_MUL_ZERO_WINS:
>>          return 0;
>>
>>      case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS:
>> diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
>> index 0114f8f..b692e7f 100644
>> --- a/src/gallium/drivers/r600/r600_shader.c
>> +++ b/src/gallium/drivers/r600/r600_shader.c
>> @@ -3906,6 +3906,11 @@ static int tgsi_op2_s(struct r600_shader_ctx *ctx, int swap, int trans_only)
>>      int i, j, r, lasti = tgsi_last_instruction(write_mask);
>>      /* use temp register if trans_only and more than one dst component */
>>      int use_tmp = trans_only && (write_mask ^ (1 << lasti));
>> +    unsigned op = ctx->inst_info->op;
>> +
>> +    if (op == ALU_OP2_MUL_IEEE &&
>> +        ctx->info.properties[TGSI_PROPERTY_MUL_ZERO_WINS])
>> +        op = ALU_OP2_MUL;
>>
>>      for (i = 0; i <= lasti; i++) {
>>          if (!(write_mask & (1 << i)))
>> @@ -3919,7 +3924,7 @@ static int tgsi_op2_s(struct r600_shader_ctx *ctx, int swap, int trans_only)
>>          } else
>>              tgsi_dst(ctx, &inst->Dst[0], i, &alu.dst);
>>
>> -        alu.op = ctx->inst_info->op;
>> +        alu.op = op;
>>          if (!swap) {
>>              for (j = 0; j < inst->Instruction.NumSrcRegs; j++) {
>>                  r600_bytecode_src(&alu.src[j], &ctx->src[j], i);
>> @@ -6543,6 +6548,11 @@ static int tgsi_op3(struct r600_shader_ctx *ctx)
>>      int i, j, r;
>>      int lasti = tgsi_last_instruction(inst->Dst[0].Register.WriteMask);
>>      int temp_regs[4];
>> +    unsigned op = ctx->inst_info->op;
>> +
>> +    if (op == ALU_OP3_MULADD_IEEE &&
>> +        ctx->info.properties[TGSI_PROPERTY_MUL_ZERO_WINS])
>> +        op = ALU_OP3_MULADD;
>>
>>      for (j = 0; j < inst->Instruction.NumSrcRegs; j++) {
>>          temp_regs[j] = 0;
>> @@ -6554,7 +6564,7 @@ static int tgsi_op3(struct r600_shader_ctx *ctx)
>>              continue;
>>
>>          memset(&alu, 0, sizeof(struct r600_bytecode_alu));
>> -        alu.op = ctx->inst_info->op;
>> +        alu.op = op;
>>          for (j = 0; j < inst->Instruction.NumSrcRegs; j++) {
>>              r = tgsi_make_src_for_op3(ctx, temp_regs[j], i, &alu.src[j], &ctx->src[j]);
>>              if (r)
>> @@ -6580,10 +6590,14 @@ static int tgsi_dp(struct r600_shader_ctx *ctx)
>>      struct tgsi_full_instruction *inst = &ctx->parse.FullToken.FullInstruction;
>>      struct r600_bytecode_alu alu;
>>      int i, j, r;
>> +    unsigned op = ctx->inst_info->op;
>> +    if (op == ALU_OP2_DOT4_IEEE &&
>> +        ctx->info.properties[TGSI_PROPERTY_MUL_ZERO_WINS])
>> +        op = ALU_OP2_DOT4;
>>
>>      for (i = 0; i < 4; i++) {
>>          memset(&alu, 0, sizeof(struct r600_bytecode_alu));
>> -        alu.op = ctx->inst_info->op;
>> +        alu.op = op;
>>          for (j = 0; j < inst->Instruction.NumSrcRegs; j++) {
>>              r600_bytecode_src(&alu.src[j], &ctx->src[j], i);
>>          }
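
The opcode override that the patch repeats in `tgsi_op2_s`, `tgsi_op3` and `tgsi_dp` can be sketched as one helper. The enum and the function name `select_mul_op` are hypothetical and exist only for this illustration; the patch itself keeps the three checks separate and uses the real `ALU_OP*` constants.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opcode set; stands in for the r600 ALU_OP* constants. */
enum alu_op {
    OP_ADD,
    OP_MUL,
    OP_MUL_IEEE,
    OP_MULADD,
    OP_MULADD_IEEE,
    OP_DOT4,
    OP_DOT4_IEEE
};

/* Swap an IEEE multiply-type opcode for its non-IEEE sibling when the
 * shader requested "zero wins" semantics; leave everything else alone. */
static enum alu_op select_mul_op(enum alu_op op, bool mul_zero_wins)
{
    if (!mul_zero_wins)
        return op;

    switch (op) {
    case OP_MUL_IEEE:    return OP_MUL;
    case OP_MULADD_IEEE: return OP_MULADD;
    case OP_DOT4_IEEE:   return OP_DOT4;
    default:             return op;
    }
}
```

The default stays IEEE, matching patch 1/2; only shaders that set the property get the legacy opcodes.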
Re: [Mesa-dev] [PATCH 3/7] gallium/radeon: adjust the rule for using the LINEAR_ALIGNED layout
This patch breaks piglit ./bin/ext_image_dma_buf_import-refcount -auto
-fbo at least on Redwood. VI seems to be fine.

Nicolai

On 20.01.2017 20:07, Marek Olšák wrote:
> From: Marek Olšák
>
> ---
>  src/gallium/drivers/radeon/r600_texture.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c
> index cba4e7d..0b77c82 100644
> --- a/src/gallium/drivers/radeon/r600_texture.c
> +++ b/src/gallium/drivers/radeon/r600_texture.c
> @@ -1177,21 +1177,23 @@ r600_choose_tiling(struct r600_common_screen *rscreen,
>          if (rscreen->chip_class >= SI &&
>              (templ->bind & PIPE_BIND_CURSOR))
>              return RADEON_SURF_MODE_LINEAR_ALIGNED;
>
>          if (templ->bind & PIPE_BIND_LINEAR)
>              return RADEON_SURF_MODE_LINEAR_ALIGNED;
>
>          /* Textures with a very small height are recommended to be linear. */
>          if (templ->target == PIPE_TEXTURE_1D ||
>              templ->target == PIPE_TEXTURE_1D_ARRAY ||
> -            templ->height0 <= 4)
> +            /* Only very thin and long 2D textures should benefit from
> +             * linear_aligned. */
> +            (templ->width0 > 8 && templ->height0 <= 2))
>              return RADEON_SURF_MODE_LINEAR_ALIGNED;
>
>          /* Textures likely to be mapped often. */
>          if (templ->usage == PIPE_USAGE_STAGING ||
>              templ->usage == PIPE_USAGE_STREAM)
>              return RADEON_SURF_MODE_LINEAR_ALIGNED;
>      }
>
>      /* Make small textures 1D tiled. */
>      if (templ->width0 <= 16 ||
>          templ->height0 <= 16 ||
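
To see which textures change layout under the adjusted rule, the old and new conditions can be compared side by side. The sketch below is an assumption: `tex_templ`, `tex_target` and both predicate names are hypothetical, and only the "very small height" clause of `r600_choose_tiling` is modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-ins for the gallium template fields. */
enum tex_target { TEX_1D, TEX_1D_ARRAY, TEX_2D };

struct tex_templ {
    enum tex_target target;
    unsigned width0;
    unsigned height0;
};

/* Rule before the patch: anything up to 4 rows high goes linear. */
static bool small_height_linear_old(const struct tex_templ *t)
{
    return t->target == TEX_1D ||
           t->target == TEX_1D_ARRAY ||
           t->height0 <= 4;
}

/* Rule after the patch: only thin and long 2D textures go linear. */
static bool small_height_linear_new(const struct tex_templ *t)
{
    return t->target == TEX_1D ||
           t->target == TEX_1D_ARRAY ||
           (t->width0 > 8 && t->height0 <= 2);
}
```

A 64x4 or an 8x2 2D texture was linear before and is no longer; whether that is what trips the dma-buf import test on Redwood is not established in this thread.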
Re: [Mesa-dev] [PATCH] vulkan: Don't install vk_platform.h or vulkan.h.
On 24 January 2017 at 18:02, Jason Ekstrand wrote:
> On Tue, Jan 24, 2017 at 9:03 AM, Matt Turner wrote:
>> On Tue, Jan 24, 2017 at 8:41 AM, Emil Velikov wrote:
>> > On 24 January 2017 at 00:54, Matt Turner wrote:
>> >> These files belong to the vulkan loader.
>> > Fully agreed, patch is
>> > Reviewed-by: Emil Velikov
>>
>> Thanks!
>>
>> > Related question:
>> > I was wondering about getting this a step further:
>> >  - having the loader provide a .pc file
>> >  - tracking required version at configure time and dropping our local
>> > copies of the headers/xml.
>> >
>> > Would you be in favour, against, neutral of such an approach ?
>>
>> I'd be in favor of that, but let's see what Jason thinks.
>
> I'd rather not. That would make sense if we all lived in the open-source
> world where everything is upstream all the time. Unfortunately, not all of
> us have that luxury and we need to be able to work on experimental branches
> of the spec that may have more extensions than are provided by any loader
> version we can install. I'd be ok with a check for a particular loader
> version just to force distros to update their loader but I would like to be
> able to build with arbitrary XML branches without having to install a branch
> of the loader.

What if I tell you that you wouldn't need to install the loader ;-)
More once we get the .pc patches in.

Emil
Re: [Mesa-dev] [PATCH 1/2] r600g: use ieee variants of multiplication instructions
2017-01-24 19:15 GMT+01:00 Ilia Mirkin:
> On Tue, Jan 24, 2017 at 1:11 PM, Matteo Bruni wrote:
>> 2017-01-24 3:18 GMT+01:00 Ilia Mirkin:
>>> This matches the behavior of most other drivers, including nouveau.
>>
>> Doesn't this break all the applications depending on d3d9 NaN behavior
>> (including, but not limited to, d3d9 games in Wine) on r600g?
>>
>> If I got this right, flipping around the two patches in this series
>> and enabling the TGSI_PROPERTY_MUL_ZERO_WINS flag for OpenGL
>> non-compute shaders (if that's not the case already) should avoid
>> regressions.
>
> This patch normalizes r600g wrt multiply handling with the other
> DX10/11 hardware drivers. nv50, nvc0, si, and i965 all use the IEEE
> behavior. I don't know for sure, but assume that nv30 and r300 have
> the DX9 behavior natively without IEEE support.
>
> The next patch allows for the MUL_ZERO_WINS property to be used to get
> the DX9 behavior, which st/nine will make use of.

That doesn't help Wine or any "native" OpenGL application which
happens to depend on the old behavior.
Even if there are none of them (which doesn't sound right to me)
applying this patch before 2/2 means that you are changing behavior
for nine in this one patch and changing it back again with the next,
which looks to me as something generally better avoided.
Re: [Mesa-dev] [PATCH 1/2] anv: bail out if using loader interface prior to v3
On Tue, Jan 24, 2017 at 8:17 AM, Emil Velikov wrote:
> On 24 January 2017 at 15:41, Chad Versace wrote:
> > On Tue 24 Jan 2017, Emil Velikov wrote:
> >> From: Emil Velikov
> >>
> >> Strictly speaking we could add support for v2 and earlier. At the same
> >> time, those tend to be buggy and as such there's limited testing done.
> >
> > I'm confused by the claim of "limited testing". Before my patch landed
> > that upgraded anvil to loader interface v3, the driver only supported
> > loader interface v1. And any differences between v1 and v2 are
> > negligible enough to not be the cause of any crash.
> >
> > So... is the real problem
> >    a. anvil doesn't support loader interface v2, or
> >    b. Fedora 25 ships a buggy loader, and this patch effectively forces
> >       the user to upgrade the loader to a version in which the bug is
> >       fixed.
> >
> > I have difficulty understanding how (a) could possibly be the problem.
> > Did some patches land in src/vulkan/wsi that broke the v2 interface? If
> > so, then this patch is probably justified.
> >
> > If the actual problem is (b), then I believe this patch is the wrong way
> > to fix it. The real fix should go into the loader. And this patch
> > prevents the driver working on systems where it should work.
> >
> I fully agree with your reasoning.
>
> B is the one to blame here. I may have gone overzealous with the
> wording/approach, but the idea is there - how do we deal with issues,
> reported against Mesa (ANV/RADV) where the problems seems to be in the
> loader.
> We don't want to have the behaviour we had with OpenGL where people
> jump to assumptions that ANV/RADV is broken because "it works" with
> binary driver FOO. Even when the crash/issue is outside Mesa.
>
> Looking at git log (as per the bugreport) I wonder if encouraging
> people to use updated loader (as this patch does) isn't that bad of a
> thing. Esp. since distros might not always see a reason otherwise.
>

For what it's worth, Fedora is in the process of updating their loader...

Also, I think we will want to do this eventually but not yet. One of these
days, I'm going to rewrite the WSI implementation *again* to make it do
something useful in CreateFooSurface.

> > More comments below.
> >
> >> Cc: Jason Ekstrand
> >> Cc: Shawn Starr
> >> Cc: Chad Versace
> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99446
> >> Signed-off-by: Emil Velikov
> >> ---
> >> Slightly pedantic, yet explicitly mentioned in the spec as a way to
> >> detect/manage older loader versions. Would have saved us a crash, so I'm
> >> wondering if we want it for stable ?
> >>
> >> Shawn considering you still have the old libvulkan.so around can you
> >> give this and/or 2/2 a test ?
> >> ---
> >>  src/intel/vulkan/anv_device.c | 8
> >>  1 file changed, 8 insertions(+)
> >>
> >> diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
> >> index f80a36a940..e7aa81883a 100644
> >> --- a/src/intel/vulkan/anv_device.c
> >> +++ b/src/intel/vulkan/anv_device.c
> >> @@ -36,6 +36,8 @@
> >>
> >>  #include "genxml/gen7_pack.h"
> >>
> >> +static uint32_t loader_version;
> >> +
> >>  struct anv_dispatch_table dtable;
> >>
> >>  static void
> >> @@ -739,6 +741,11 @@ VKAPI_ATTR PFN_vkVoidFunction VKAPI_CALL vk_icdGetInstanceProcAddr(
> >>      VkInstance                                  instance,
> >>      const char*                                 pName)
> >>  {
> >> +   if (loader_version < 3u) {
> >> +      fprintf(stderr, "WARNING: ANV supports Loader interface v3 or newer, v%u "
> >> +              "detected. Update your libvulkan.so.\n", loader_version);
> >> +      return NULL;
> >> +   }
> >>     return anv_GetInstanceProcAddr(instance, pName);
> >>  }
> >>
> >> @@ -2075,6 +2082,7 @@ vk_icdNegotiateLoaderICDInterfaceVersion(uint32_t* pSupportedVersion)
> >>      *         vkDestroySurfaceKHR(), and other API which uses VKSurfaceKHR,
> >>      *         because the loader no longer does so.
> >>      */
> >> +   loader_version = *pSupportedVersion;
> >>     *pSupportedVersion = MIN2(*pSupportedVersion, 3u);
> >>     return VK_SUCCESS;
> >>  }
> >
> > If this patch does land, then this hunk needs fixing. If the driver
> > doesn't support loader interface version 2, then the loader spec
> > requires that we return VK_ERROR_INCOMPATIBLE_DRIVER here if
> > *pSupportedVersion < 3.
> >
> > The loader spec says:
> >
> >     If the ICD receiving the call no longer supports the interface
> >     version provided by the loader (due to deprecation), then it should
> >     report VK_ERROR_INCOMPATIBLE_DRIVER error. Otherwise it sets the
> >     value pointed by "pSupportedVersion" to the latest interface version
> >     supported by both the ICD and the loader and returns
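
Chad's suggested fix, rejecting old loaders at negotiation time instead of at the first `vk_icdGetInstanceProcAddr` call, might look roughly like the sketch below. It is self-contained and hypothetical: `negotiate_result`, `ICD_MIN_INTERFACE`, `ICD_MAX_INTERFACE` and `negotiate_interface_version` are made-up stand-ins for `VkResult` and `vk_icdNegotiateLoaderICDInterfaceVersion`, and it is not the code that was merged.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result codes; stand-ins for VK_SUCCESS and
 * VK_ERROR_INCOMPATIBLE_DRIVER. */
enum negotiate_result {
    NEGOTIATE_OK,
    NEGOTIATE_INCOMPATIBLE_DRIVER
};

#define ICD_MIN_INTERFACE 3u /* oldest loader interface the driver accepts */
#define ICD_MAX_INTERFACE 3u /* newest loader interface the driver knows */

/* On entry *version holds the loader's newest supported interface version;
 * on success it holds the newest version both sides support. */
static enum negotiate_result negotiate_interface_version(uint32_t *version)
{
    if (*version < ICD_MIN_INTERFACE)
        return NEGOTIATE_INCOMPATIBLE_DRIVER; /* loader is too old */

    if (*version > ICD_MAX_INTERFACE)
        *version = ICD_MAX_INTERFACE;
    return NEGOTIATE_OK;
}
```

With this shape no `loader_version` global is needed, because an incompatible loader never gets as far as asking for entry points.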
Re: [Mesa-dev] [PATCH 1/6] anv: Set viewport extents correctly when height is negative
On 24/01/17 17:40, Jason Ekstrand wrote:
> On Tue, Jan 24, 2017 at 12:49 AM, Iago Toral wrote:
>> On Mon, 2017-01-23 at 14:12 -0800, Jason Ekstrand wrote:
>>> As per VK_KHR_maintenance1, setting a negative height in the viewport
>>> can be used to get flipped coordinates. This is, apparently, very
>>> useful when porting D3D apps to Vulkan. All we need to do to support
>>> this is to make sure we actually set the min and max correctly.
>>> ---
>>>  src/intel/vulkan/gen8_cmd_buffer.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/intel/vulkan/gen8_cmd_buffer.c b/src/intel/vulkan/gen8_cmd_buffer.c
>>> index f22037b..ab68872 100644
>>> --- a/src/intel/vulkan/gen8_cmd_buffer.c
>>> +++ b/src/intel/vulkan/gen8_cmd_buffer.c
>>> @@ -59,8 +59,8 @@ gen8_cmd_buffer_emit_viewport(struct anv_cmd_buffer *cmd_buffer)
>>>           .YMaxClipGuardband = 1.0f,
>>>           .XMinViewPort = vp->x,
>>>           .XMaxViewPort = vp->x + vp->width - 1,
>>> -         .YMinViewPort = vp->y,
>>> -         .YMaxViewPort = vp->y + vp->height - 1,
>>> +         .YMinViewPort = MIN2(vp->y, vp->y + vp->height),
>>> +         .YMaxViewPort = MAX2(vp->y, vp->y + vp->height) - 1,
>>>        };
>>
>> If we have y = 0 and height = -100, shouldn't we use YMinVP = -99
>> and YMaxVP = 0 instead of (-100, -1)?
>
> No, I think we still want -100, -1. In the case mentioned, the Y
> region, in floating-point, is [-100, 0]. However, it appears that, even
> though it's float, we're expected to provide max-1 in the max fields.

Thanks for the explanation!

Reviewed-by: Lionel Landwerlin

>>>        GENX(SF_CLIP_VIEWPORT_pack)(NULL, sf_clip_state.map + i * 64,
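
The arithmetic discussed above is easy to check in isolation. In the sketch below `viewport`, `vp_y_min` and `vp_y_max` are hypothetical names; the `MIN2`/`MAX2` macros are defined locally so the block stands alone.

```c
#include <assert.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical reduced viewport; mirrors the y/height fields of VkViewport. */
struct viewport {
    float y;
    float height; /* may be negative to flip the Y axis */
};

/* Lower Y bound, whichever edge of the region that turns out to be. */
static float vp_y_min(const struct viewport *vp)
{
    return MIN2(vp->y, vp->y + vp->height);
}

/* Upper Y bound, with the "max - 1" convention from the patch. */
static float vp_y_max(const struct viewport *vp)
{
    return MAX2(vp->y, vp->y + vp->height) - 1;
}
```

For y = 0 and height = -100 this gives (-100, -1), the pair Jason argues for, and for height = +100 it gives the familiar (0, 99).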
Re: [Mesa-dev] [PATCH] spirv: handle gl_SampleMask
Reviewed-by: Jason Ekstrand

On Tue, Jan 24, 2017 at 4:48 AM, Iago Toral Quiroga wrote:
> SPIR-V maps both gl_SampleMask and gl_SampleMaskIn to the same
> builtin (SampleMask). The only way to tell which one we are dealing with
> is to check if it is an input or an output.
>
> Fixes:
> dEQP-VK.pipeline.multisample_shader_builtin.sample_mask.write.*
> ---
> I am still waiting on Jenkins to report results from this patch, but for
> some reason it is taking surprisingly long so I figured I'd send it for
> review ahead of the results, I don't expect regressions, but I'll verify
> there aren't any when I get them in any case.
>
>  src/compiler/spirv/vtn_variables.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c
> index d55f81e..4d1ec78 100644
> --- a/src/compiler/spirv/vtn_variables.c
> +++ b/src/compiler/spirv/vtn_variables.c
> @@ -975,8 +975,12 @@ vtn_get_builtin_location(struct vtn_builder *b,
>        set_mode_system_value(mode);
>        break;
>     case SpvBuiltInSampleMask:
> -      *location = SYSTEM_VALUE_SAMPLE_MASK_IN; /* XXX out? */
> -      set_mode_system_value(mode);
> +      if (*mode == nir_var_shader_out) {
> +         *location = FRAG_RESULT_SAMPLE_MASK;
> +      } else {
> +         *location = SYSTEM_VALUE_SAMPLE_MASK_IN;
> +         set_mode_system_value(mode);
> +      }
>        break;
>     case SpvBuiltInFragDepth:
>        *location = FRAG_RESULT_DEPTH;
> --
> 2.7.4
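
The direction-dependent mapping in this patch can be sketched without NIR. All names below are hypothetical stand-ins for the NIR and SPIR-V enums, and the assumption that `set_mode_system_value` simply rewrites the mode to "system value" is mine, not something the quoted hunk shows.

```c
#include <assert.h>

/* Hypothetical variable modes; stand-ins for nir_variable_mode values. */
enum var_mode { MODE_SHADER_IN, MODE_SHADER_OUT, MODE_SYSTEM_VALUE };

/* Hypothetical locations; stand-ins for the two Mesa slots involved. */
enum location { LOC_SAMPLE_MASK_IN, LOC_FRAG_RESULT_SAMPLE_MASK };

/* One SPIR-V builtin, two meanings: an output is the written sample mask,
 * anything else is the incoming coverage mask exposed as a system value. */
static enum location sample_mask_location(enum var_mode *mode)
{
    if (*mode == MODE_SHADER_OUT)
        return LOC_FRAG_RESULT_SAMPLE_MASK;

    *mode = MODE_SYSTEM_VALUE; /* assumed effect of set_mode_system_value() */
    return LOC_SAMPLE_MASK_IN;
}
```

The output case leaves the mode untouched, which is what lets a fragment shader actually write gl_SampleMask.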
[Mesa-dev] [Bug 97879] [amdgpu] Rocket League: long hangs (several seconds) when loading assets (models/textures/shaders?)
https://bugs.freedesktop.org/show_bug.cgi?id=97879

--- Comment #53 from Timothee Besset ---
Hello! I have started working on this. I haven't found the root cause yet but I will update here when I have something.

(For context, I did the initial port work for Psyonix. I just recently got a radeonsi setup together so I can look at this now.)
Re: [Mesa-dev] [PATCH 1/2] r600g: use ieee variants of multiplication instructions
On Tue, Jan 24, 2017 at 1:11 PM, Matteo Bruni wrote:
> 2017-01-24 3:18 GMT+01:00 Ilia Mirkin:
>> This matches the behavior of most other drivers, including nouveau.
>
> Doesn't this break all the applications depending on d3d9 NaN behavior
> (including, but not limited to, d3d9 games in Wine) on r600g?
>
> If I got this right, flipping around the two patches in this series
> and enabling the TGSI_PROPERTY_MUL_ZERO_WINS flag for OpenGL
> non-compute shaders (if that's not the case already) should avoid
> regressions.

This patch normalizes r600g wrt multiply handling with the other DX10/11 hardware drivers. nv50, nvc0, si, and i965 all use the IEEE behavior. I don't know for sure, but assume that nv30 and r300 have the DX9 behavior natively without IEEE support.

The next patch allows for the MUL_ZERO_WINS property to be used to get the DX9 behavior, which st/nine will make use of.

Cheers,

  -ilia
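The difference between the two multiply behaviors in this thread can be illustrated on the CPU. This is only a sketch: `mul_ieee` and `mul_zero_wins` are hypothetical names, and the "zero wins" rule is modeled simply as "any zero operand yields 0", which is an assumption about the DX9-style semantics rather than a description of what the r600 hardware instructions do in every corner case.

```c
#include <math.h>

/* IEEE 754 multiply: 0 * inf and 0 * NaN both produce NaN. */
static float mul_ieee(float a, float b)
{
   return a * b;
}

/* Sketch of a "zero wins" multiply: a zero operand forces a zero result,
 * even when the other operand is infinity or NaN. */
static float mul_zero_wins(float a, float b)
{
   if (a == 0.0f || b == 0.0f)
      return 0.0f;
   return a * b;
}
```

For ordinary finite inputs the two functions agree; they differ only when one operand is zero and the other is infinity or NaN, which is the case D3D9-era shaders may rely on.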
Re: [Mesa-dev] [PATCH] glsl: fix compile errors with mingw due to missing PRIx64 definitions
On 24.01.2017 at 14:23, Jose Fonseca wrote:
> On 23/01/17 19:21, srol...@vmware.com wrote:
>> From: Roland Scheidegger
>>
>> define __STDC_FORMAT_MACROS and include <inttypes.h> (same as
>> ir_builder_print_visitor.cpp already does).
>>
>> Otherwise, some mingw build errors out (since
>> 8e7e1ae0365ddc7edb0d4d98250ab46728e6c14a and
>> bbce1c538dc0cb8bf3769510283d11847dc07540 presumably) with:
>> src/compiler/glsl/ir_print_visitor.cpp:479:40: error: expected ‘)’ before ‘PRIu64’
>>    case GLSL_TYPE_UINT64:fprintf(f, "%" PRIu64, ir->value.u64[i]); break;
>>
>> (Note even with that fix I get other format specifier warnings:
>> src/compiler/glsl/ir_print_visitor.cpp:473:47:
>> warning: unknown conversion type character ‘a’ in format [-Wformat=]
>>    fprintf(f, "%a", ir->value.f[i]);
>>                                   ^
>> src/compiler/glsl/ir_print_visitor.cpp:473:47:
>> warning: too many arguments for format [-Wformat-extra-args]
>> but it still compiles at least)
>> ---
>> src/compiler/glsl/glsl_parser_extras.cpp | 2 ++
>> src/compiler/glsl/ir_print_visitor.cpp | 2 ++
>> 2 files changed, 4 insertions(+)
>>
>> diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp
>> index e888090..3d2fc14 100644
>> --- a/src/compiler/glsl/glsl_parser_extras.cpp
>> +++ b/src/compiler/glsl/glsl_parser_extras.cpp
>> @@ -20,6 +20,8 @@
>>   * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
>>   * DEALINGS IN THE SOFTWARE.
>>   */
>> +#define __STDC_FORMAT_MACROS 1
>> +#include <inttypes.h> /* for PRIx64 macro */
>>  #include
>>  #include
>>  #include
>> diff --git a/src/compiler/glsl/ir_print_visitor.cpp b/src/compiler/glsl/ir_print_visitor.cpp
>> index 0763277..debbdad 100644
>> --- a/src/compiler/glsl/ir_print_visitor.cpp
>> +++ b/src/compiler/glsl/ir_print_visitor.cpp
>> @@ -21,6 +21,8 @@
>>   * DEALINGS IN THE SOFTWARE.
>>   */
>>
>> +#define __STDC_FORMAT_MACROS 1
>> +#include <inttypes.h> /* for PRIx64 macro */
>>  #include "ir_print_visitor.h"
>>  #include "compiler/glsl_types.h"
>>  #include "glsl_parser_extras.h"
>>
>
> Reviewed-by: Jose Fonseca
>
> But I think it might be more efficient to define this on configure.ac
> and scons/gallium.py like we already do for other __STDC__MACROS
>
> Jose

Sounds reasonable, but I'll leave that to someone else...

Roland
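For reference, the pattern the patch relies on looks roughly like the following when reduced to a standalone helper. `format_u64` is a hypothetical name used only for this sketch. The `__STDC_FORMAT_MACROS` define is harmless in C; it only matters for some older C++ toolchains, which is presumably the situation with the mingw build described above.

```c
/* Must be defined before <inttypes.h> is first included. */
#define __STDC_FORMAT_MACROS 1
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: format a 64-bit unsigned value without guessing
 * whether the platform wants %lu, %llu or something else. */
static int format_u64(char *buf, size_t len, uint64_t value)
{
   /* PRIu64 expands to a string literal, so it concatenates with "%". */
   return snprintf(buf, len, "%" PRIu64, value);
}
```

If the macro is not visible at the point of use, the compiler sees `"%" PRIu64` as a string followed by an unknown identifier, which matches the "expected ‘)’ before ‘PRIu64’" error quoted in the commit message.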
Re: [Mesa-dev] [PATCH] spirv: handle gl_SampleMask
On Tue, Jan 24, 2017 at 4:48 AM, Iago Toral Quiroga wrote:
> SPIR-V maps both gl_SampleMask and gl_SampleMaskIn to the same
> builtin (SampleMask). The only way to tell which one we are dealing with
> is to check if it is an input or an output.
>
> Fixes:
> dEQP-VK.pipeline.multisample_shader_builtin.sample_mask.write.*
> ---
> I am still waiting on Jenkins to report results from this patch, but for
> some reason it is taking surprisingly long so I figured I'd send it for
> review ahead of the results, I don't expect regressions, but I'll verify
> there aren't any when I get them in any case.
>
> src/compiler/spirv/vtn_variables.c | 8 ++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c
> index d55f81e..4d1ec78 100644
> --- a/src/compiler/spirv/vtn_variables.c
> +++ b/src/compiler/spirv/vtn_variables.c
> @@ -975,8 +975,12 @@ vtn_get_builtin_location(struct vtn_builder *b,
>        set_mode_system_value(mode);
>        break;
>     case SpvBuiltInSampleMask:
> -      *location = SYSTEM_VALUE_SAMPLE_MASK_IN; /* XXX out? */
> -      set_mode_system_value(mode);
> +      if (*mode == nir_var_shader_out) {
> +         *location = FRAG_RESULT_SAMPLE_MASK;
> +      } else {
> +         *location = SYSTEM_VALUE_SAMPLE_MASK_IN;
> +         set_mode_system_value(mode);
> +      }
>        break;
>     case SpvBuiltInFragDepth:
>        *location = FRAG_RESULT_DEPTH;
> --
> 2.7.4

Reviewed-by: Anuj Phogat
Re: [Mesa-dev] [PATCH 1/2] r600g: use ieee variants of multiplication instructions
2017-01-24 3:18 GMT+01:00 Ilia Mirkin:
> This matches the behavior of most other drivers, including nouveau.

Doesn't this break all the applications depending on d3d9 NaN behavior (including, but not limited to, d3d9 games in Wine) on r600g?

If I got this right, flipping around the two patches in this series and enabling the TGSI_PROPERTY_MUL_ZERO_WINS flag for OpenGL non-compute shaders (if that's not the case already) should avoid regressions.
[Mesa-dev] [PATCH mesa 1/2] egl: update headers from registry
Khronos introduced a new macro (suggested by Google) to avoid using
C-style casts in C++ code, as those generate warnings.

Khronos Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16113
Signed-off-by: Eric Engestrom
---
 include/EGL/egl.h         | 24 +++---
 include/EGL/eglext.h      | 197 +++---
 include/EGL/eglplatform.h | 10 ++-
 3 files changed, 206 insertions(+), 25 deletions(-)

diff --git a/include/EGL/egl.h b/include/EGL/egl.h
index 0d514e4def..29f30d94de 100644
--- a/include/EGL/egl.h
+++ b/include/EGL/egl.h
@@ -6,7 +6,7 @@ extern "C" {
 #endif

 /*
-** Copyright (c) 2013-2014 The Khronos Group Inc.
+** Copyright (c) 2013-2017 The Khronos Group Inc.
 **
 ** Permission is hereby granted, free of charge, to any person obtaining a
 ** copy of this software and/or associated documentation files (the
@@ -31,14 +31,14 @@ extern "C" {
 ** This header is generated from the Khronos OpenGL / OpenGL ES XML
 ** API Registry. The current version of the Registry, generator scripts
 ** used to make the header, and the header can be found at
-** http://www.opengl.org/registry/
+** http://www.opengl.org/registry/egl
 **
-** Khronos $Revision: 31039 $ on $Date: 2015-05-04 17:01:57 -0700 (Mon, 04 May 2015) $
+** Khronos $Revision$ on $Date$
 */

 #include

-/* Generated on date 20150504 */
+/* Generated on date 20161230 */

 /* Generated C header for:
  * API: egl
@@ -78,7 +78,7 @@ typedef void (*__eglMustCastToProperFunctionPointerType)(void);
 #define EGL_CONFIG_ID                     0x3028
 #define EGL_CORE_NATIVE_ENGINE            0x305B
 #define EGL_DEPTH_SIZE                    0x3025
-#define EGL_DONT_CARE                     ((EGLint)-1)
+#define EGL_DONT_CARE                     EGL_CAST(EGLint,-1)
 #define EGL_DRAW                          0x3059
 #define EGL_EXTENSIONS                    0x3055
 #define EGL_FALSE                         0
@@ -95,9 +95,9 @@ typedef void (*__eglMustCastToProperFunctionPointerType)(void);
 #define EGL_NONE                          0x3038
 #define EGL_NON_CONFORMANT_CONFIG         0x3051
 #define EGL_NOT_INITIALIZED               0x3001
-#define EGL_NO_CONTEXT                    ((EGLContext)0)
-#define EGL_NO_DISPLAY                    ((EGLDisplay)0)
-#define EGL_NO_SURFACE                    ((EGLSurface)0)
+#define EGL_NO_CONTEXT                    EGL_CAST(EGLContext,0)
+#define EGL_NO_DISPLAY                    EGL_CAST(EGLDisplay,0)
+#define EGL_NO_SURFACE                    EGL_CAST(EGLSurface,0)
 #define EGL_PBUFFER_BIT                   0x0001
 #define EGL_PIXMAP_BIT                    0x0002
 #define EGL_READ                          0x305A
@@ -197,7 +197,7 @@ typedef void *EGLClientBuffer;
 #define EGL_RGB_BUFFER                    0x308E
 #define EGL_SINGLE_BUFFER                 0x3085
 #define EGL_SWAP_BEHAVIOR                 0x3093
-#define EGL_UNKNOWN                       ((EGLint)-1)
+#define EGL_UNKNOWN                       EGL_CAST(EGLint,-1)
 #define EGL_VERTICAL_RESOLUTION           0x3091
 EGLAPI EGLBoolean EGLAPIENTRY eglBindAPI (EGLenum api);
 EGLAPI EGLenum EGLAPIENTRY eglQueryAPI (void);
@@ -224,7 +224,7 @@ EGLAPI EGLBoolean EGLAPIENTRY eglWaitClient (void);

 #ifndef EGL_VERSION_1_4
 #define EGL_VERSION_1_4 1
-#define EGL_DEFAULT_DISPLAY               ((EGLNativeDisplayType)0)
+#define EGL_DEFAULT_DISPLAY               EGL_CAST(EGLNativeDisplayType,0)
 #define EGL_MULTISAMPLE_RESOLVE_BOX_BIT   0x0200
 #define EGL_MULTISAMPLE_RESOLVE           0x3099
 #define EGL_MULTISAMPLE_RESOLVE_DEFAULT   0x309A
@@ -266,7 +266,7 @@ typedef void *EGLImage;
 #define EGL_FOREVER                       0xull
 #define EGL_TIMEOUT_EXPIRED               0x30F5
 #define EGL_CONDITION_SATISFIED           0x30F6
-#define EGL_NO_SYNC                       ((EGLSync)0)
+#define EGL_NO_SYNC                       EGL_CAST(EGLSync,0)
 #define EGL_SYNC_FENCE                    0x30F9
 #define EGL_GL_COLORSPACE                 0x309D
 #define EGL_GL_COLORSPACE_SRGB            0x3089
@@ -283,7 +283,7 @@ typedef void *EGLImage;
 #define EGL_GL_TEXTURE_CUBE_MAP_POSITIVE_Z 0x30B7
 #define EGL_GL_TEXTURE_CUBE_MAP_NEGATIVE_Z 0x30B8
 #define EGL_IMAGE_PRESERVED               0x30D2
-#define EGL_NO_IMAGE                      ((EGLImage)0)
+#define EGL_NO_IMAGE                      EGL_CAST(EGLImage,0)
 EGLAPI EGLSync EGLAPIENTRY eglCreateSync (EGLDisplay dpy, EGLenum type, const EGLAttrib *attrib_list);
 EGLAPI EGLBoolean EGLAPIENTRY eglDestroySync (EGLDisplay dpy, EGLSync sync);
 EGLAPI EGLint EGLAPIENTRY eglClientWaitSync (EGLDisplay dpy, EGLSync sync, EGLint flags, EGLTime timeout);
diff --git a/include/EGL/eglext.h b/include/EGL/eglext.h
index 4ccbab8927..bc8f0bab23 100644
--- a/include/EGL/eglext.h
+++ b/include/EGL/eglext.h
@@ -6,7 +6,7 @@ extern "C" {
 #endif

 /*
-** Copyright (c) 2013-2016 The Khronos Group Inc.
+** Copyright (c) 2013-2017 The Khronos Group Inc.
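The eglplatform.h hunk that defines the new macro is not included in the patch text above, so the following is only a sketch of how such a macro is typically written, based on the commit message (avoid C-style casts when compiled as C++). The `_sketch` typedefs and `SKETCH_` constants are hypothetical stand-ins so the fragment compiles without the real EGL headers.

```c
/* Assumed shape of the macro: static_cast in C++, a plain cast in C. */
#if defined(__cplusplus)
#define EGL_CAST(type, value) (static_cast<type>(value))
#else
#define EGL_CAST(type, value) ((type) (value))
#endif

/* Hypothetical stand-ins for EGLint and EGLContext. */
typedef int EGLint_sketch;
typedef void *EGLContext_sketch;

/* Same form as the rewritten defines in egl.h, e.g. EGL_DONT_CARE. */
#define SKETCH_DONT_CARE  EGL_CAST(EGLint_sketch, -1)
#define SKETCH_NO_CONTEXT EGL_CAST(EGLContext_sketch, 0)
```

In C the constants keep exactly the values they had with the old `((type)value)` spelling; the change should only affect which cast syntax a C++ compiler sees.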
[Mesa-dev] [PATCH mesa 2/2] egl: EGL_PLATFORM_SURFACELESS_MESA is now upstream
EGL_PLATFORM_SURFACELESS_MESA is in eglext.h as of last commit.

Signed-off-by: Eric Engestrom
---
 include/EGL/eglmesaext.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/include/EGL/eglmesaext.h b/include/EGL/eglmesaext.h
index 405d0e9ee4..3a1b88e3d1 100644
--- a/include/EGL/eglmesaext.h
+++ b/include/EGL/eglmesaext.h
@@ -85,11 +85,6 @@
 #define EGL_NO_CONFIG_MESA ((EGLConfig)0)
 #endif

-#ifndef EGL_MESA_platform_surfaceless
-#define EGL_MESA_platform_surfaceless 1
-#define EGL_PLATFORM_SURFACELESS_MESA 0x31DD
-#endif /* EGL_MESA_platform_surfaceless */
-
 #ifdef __cplusplus
 }
 #endif
--
Cheers,
  Eric
Re: [Mesa-dev] [PATCH] vulkan: Don't install vk_platform.h or vulkan.h.
On Tue, Jan 24, 2017 at 9:03 AM, Matt Turner wrote:
> On Tue, Jan 24, 2017 at 8:41 AM, Emil Velikov wrote:
> > On 24 January 2017 at 00:54, Matt Turner wrote:
> >> These files belong to the vulkan loader.
> > Fully agreed, patch is
> > Reviewed-by: Emil Velikov
>
> Thanks!
>
> > Related question:
> > I was wondering about getting this a step further:
> > - having the loader provide a .pc file
> > - tracking required version at configure time and dropping our local
> > copies of the headers/xml.
> >
> > Would you be in favour, against, neutral of such an approach ?
>
> I'd be in favor of that, but let's see what Jason thinks.

I'd rather not. That would make sense if we all lived in the open-source world where everything is upstream all the time. Unfortunately, not all of us have that luxury and we need to be able to work on experimental branches of the spec that may have more extensions than are provided by any loader version we can install. I'd be ok with a check for a particular loader version just to force distros to update their loader but I would like to be able to build with arbitrary XML branches without having to install a branch of the loader.
[Mesa-dev] [Bug 97879] [amdgpu] Rocket League: long hangs (several seconds) when loading assets (models/textures/shaders?)
https://bugs.freedesktop.org/show_bug.cgi?id=97879

--- Comment #52 from Marek Olšák ---
We don't need a debug build. We just need:
1) One person to run the debug build and use sysprof to capture where the CPU is spending time during the freeze.
2) Make a screenshot of the sysprof window and send it to the game developer.
3) The game developer should look at it and decide what to do next.

sysprof is a very-easy-to-use standalone CPU profiler GUI that you run as root. It observes all processes and also the kernel. For apps built with -g (but also keep -O2 at least), it will show the functions and the % of CPU time spent in them. For apps also built with -fno-omit-frame-pointer, it will show whole call stacks.
Re: [Mesa-dev] [PATCH V3] glsl: lower constant arrays to uniform arrays before optimisation loop
Timothy Arceri writes:
> From: Timothy Arceri
>
> Previously the constant array would not get copy propagated until the backend
> did its GLSL IR opt loop. I plan on removing that from i965 shortly which
> caused huge regressions in Deus-ex and Tomb Raider which have large
> constant arrays. Moving lowering before the opt loop in the GLSL linker
> fixes this and unexpectedly improves some compute shaders also.

It seems like we should figure out what's missing in NIR that the lack of GLSL copy propagation hurt, but this is a pretty easy fix for now:

Reviewed-by: Eric Anholt