[Mesa-dev] [Bug 103895] Spurious “used uninitialized” warning
https://bugs.freedesktop.org/show_bug.cgi?id=103895

Matt Turner changed:

           What    |Removed                              |Added
----------------------------------------------------------------------------
          Component|Drivers/DRI/i965                     |glsl-compiler
           Assignee|intel-3d-bugs@lists.freedesktop.org  |mesa-dev@lists.freedesktop.org

--
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] util: Fix SHA1 implementation on big endian
On Fri, Nov 24, 2017 at 2:25 AM, Eric Engestrom wrote:
> On Thursday, 2017-11-23 11:08:04 -0800, Matt Turner wrote:
>> The code defines a macro blk0(i) based on the preprocessor condition
>> BYTE_ORDER == LITTLE_ENDIAN. If true, blk0(i) is defined as a byte swap
>> operation. Unfortunately, if the preprocessor macros used in the test
>> are not defined, then the comparison becomes 0 == 0 and it evaluates as
>> true.
>> ---
>>  src/util/sha1/sha1.c | 9 +
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/src/util/sha1/sha1.c b/src/util/sha1/sha1.c
>> index ef59ea1dfc..2c629520c3 100644
>> --- a/src/util/sha1/sha1.c
>> +++ b/src/util/sha1/sha1.c
>> @@ -16,8 +16,17 @@
>>
>>  #include
>>  #include
>> +#include "u_endian.h"
>
> If you're including this header, why not use
> `#ifdef PIPE_ARCH_LITTLE_ENDIAN` instead of
> `#if BYTE_ORDER == LITTLE_ENDIAN`?
>
> `#error` becomes unnecessary

In response to Andres' report that the Windows build fails with this
patch, I was going to take your suggestion... but it doesn't seem like
u_endian.h has any code to handle Windows.

I see this crap in p_config.h:

/*
 * Endian detection.
 */

#include "util/u_endian.h"
#if !defined(PIPE_ARCH_LITTLE_ENDIAN) && !defined(PIPE_ARCH_BIG_ENDIAN)

#if defined(PIPE_ARCH_X86) || defined(PIPE_ARCH_X86_64) || defined(PIPE_ARCH_ARM) || defined(PIPE_ARCH_AARCH64)
#define PIPE_ARCH_LITTLE_ENDIAN
#elif defined(PIPE_ARCH_PPC) || defined(PIPE_ARCH_PPC_64) || defined(PIPE_ARCH_S390)
#define PIPE_ARCH_BIG_ENDIAN
#endif

#endif

#if !defined(PIPE_ARCH_LITTLE_ENDIAN) && !defined(PIPE_ARCH_BIG_ENDIAN)
#error Unknown Endianness
#endif

I don't think *any* of that would be necessary if we just handled
Windows (in the appropriate place -- u_endian.h)!

José, Roland: I think I'm going to commit a patch to u_endian.h to
define PIPE_ARCH_LITTLE_ENDIAN in the absence of any platform-specific
handling, so that the rest of my series can land:

diff --git a/src/util/u_endian.h b/src/util/u_endian.h
index 7bbd7dc215..3d5c006f35 100644
--- a/src/util/u_endian.h
+++ b/src/util/u_endian.h
@@ -67,4 +67,7 @@
 #endif

+#warn Unknown Endianness for this platform. Assuming little endian
+#define PIPE_ARCH_LITTLE_ENDIAN
+
 #endif

Presumably Windows only runs on little endian platforms, so that should
be fine for your purposes.
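For readers following the thread, the pitfall from the commit message can be reproduced in isolation. In an `#if` expression, any identifier that is not a defined macro is replaced by 0, so a comparison of two undefined names silently becomes `0 == 0`. The sketch below is not Mesa code: MY_BYTE_ORDER, MY_LITTLE_ENDIAN and all function names are hypothetical, chosen only for illustration, and the two macros are deliberately left undefined to mimic a platform whose headers do not provide them.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical macros, intentionally NOT defined anywhere in this file. */

/* Naive test: both identifiers are replaced by 0, so this is "#if 0 == 0". */
#if MY_BYTE_ORDER == MY_LITTLE_ENDIAN
#define NAIVE_CHECK_PASSED 1
#else
#define NAIVE_CHECK_PASSED 0
#endif

/* Guarded test: only trust the comparison if both macros really exist. */
#if defined(MY_BYTE_ORDER) && defined(MY_LITTLE_ENDIAN) && \
    MY_BYTE_ORDER == MY_LITTLE_ENDIAN
#define GUARDED_CHECK_PASSED 1
#else
#define GUARDED_CHECK_PASSED 0
#endif

/* 1 although nothing was detected: the comparison was 0 == 0. */
int naive_check_passed(void)
{
    return NAIVE_CHECK_PASSED;
}

/* 0, because the macros the test depends on are missing. */
int guarded_check_passed(void)
{
    return GUARDED_CHECK_PASSED;
}

/* Runtime cross-check: inspect the first byte of a known 32-bit value. */
int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first_byte;

    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}
```

With both macros missing, naive_check_passed() reports success on every host, big endian included, which is the failure mode the patch guards against with `#error`. Compiling with -Wundef, as proposed elsewhere in this thread, would flag the naive test at build time.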
Re: [Mesa-dev] GPU (and system) monitoring
On my TODO list was to start learning R, including how to use R from
within Perl. So I got Statistics::R installed.

Not really sure what kind of model to apply to any of this data, I
decided to try Hyndman's ets() from the forecast package on CRAN. This
is a Swiss Army chainsaw of exponential smoothing models.

The first (and so far only) set of data I threw at it was the CPU Load
(%). As mentioned earlier, the median CPU load was 100%. The ets()
function will choose the best from a large assortment of models. But by
and large, what I've seen is that it always went to the same model (no
linear trend, no seasonality), with a smoothing factor so small that the
fitted values only do a small random walk around the mean CPU Load.
Forcing ets() to use a significantly larger smoothing factor does
produce larger swings in this random walk about the mean, but the sum of
squared residuals actually seems to get worse.

The sets of data which are in Watts (instead of CPU Load %) should
behave the same way as the CPU Load, so there is no sense looking at
them. The temperature data should be different. It shouldn't show a
trend, but it could show seasonality. What I would expect is that the
smoothed temperature data should show a correlation with the smoothed
power, except with a lag.

Gord
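The "no trend, no seasonality" model described above corresponds to simple exponential smoothing. As a rough sketch only (this is not what ets() does internally, which fits its parameters by its own estimation procedure), the function below computes the sum of squared one-step-ahead residuals for a fixed smoothing factor. The function name and the choice to seed the level with the first observation are assumptions made for illustration.

```c
#include <stddef.h>

/*
 * Simple exponential smoothing with a fixed smoothing factor alpha.
 * The forecast for step i is the current level; after observing y[i]
 * the level moves towards it by alpha times the residual.
 * Returns the sum of squared one-step-ahead residuals.
 * Assumption: the level is seeded with the first observation.
 */
double ses_sum_squared_residuals(const double *y, size_t n, double alpha)
{
    double level;
    double sse = 0.0;
    size_t i;

    if (n == 0)
        return 0.0;

    level = y[0];
    for (i = 1; i < n; i++) {
        double residual = y[i] - level;

        sse += residual * residual;
        level += alpha * residual; /* same as alpha*y[i] + (1-alpha)*level */
    }
    return sse;
}
```

For a series that mostly sits at one value with occasional dips, a large alpha chases each dip and then mispredicts the return to normal, which is consistent with the observation that a larger smoothing factor made the residuals worse.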
Re: [Mesa-dev] [PATCH 1/3] util: Fix SHA1 implementation on big endian
On Fri, Nov 24, 2017 at 8:51 AM, Andres Gomez wrote:
> On Fri, 2017-11-24 at 13:32 +, Emil Velikov wrote:
>> On 24 November 2017 at 10:25, Eric Engestrom
>> wrote:
>> > On Thursday, 2017-11-23 11:08:04 -0800, Matt Turner wrote:
>> > > The code defines a macro blk0(i) based on the preprocessor condition
>> > > BYTE_ORDER == LITTLE_ENDIAN. If true, blk0(i) is defined as a byte swap
>> > > operation. Unfortunately, if the preprocessor macros used in the test
>> > > are not defined, then the comparison becomes 0 == 0 and it evaluates as
>> > > true.
>> > > ---
>> > >  src/util/sha1/sha1.c | 9 +
>> > >  1 file changed, 9 insertions(+)
>> > >
>> > > diff --git a/src/util/sha1/sha1.c b/src/util/sha1/sha1.c
>> > > index ef59ea1dfc..2c629520c3 100644
>> > > --- a/src/util/sha1/sha1.c
>> > > +++ b/src/util/sha1/sha1.c
>> > > @@ -16,8 +16,17 @@
>> > >
>> > >  #include
>> > >  #include
>> > > +#include "u_endian.h"
>> >
>> > If you're including this header, why not use
>> > `#ifdef PIPE_ARCH_LITTLE_ENDIAN` instead of
>> > `#if BYTE_ORDER == LITTLE_ENDIAN`?
>> >
>> > `#error` becomes unnecessary
>> >
>>
>> I won't bother with that - we do want to address all the cases where
>> undefined macro is used in #if statement.
>> This is handled by -Wundef which seemingly is not part of Wall and we
>> don't use - patch incoming in a second.
>>
>> We want to make it a Werror=undef after we fix the ~60 issues. More
>> than half of those are coming from gtest :-\
>>
>> Thanks for fixing these Matt. As-is the series is
>> Reviewed-by: Emil Velikov
>
> Matt, this is breaking Windows compilation.
>
> You can check the Appveyor output here:
> https://ci.appveyor.com/project/AndresGomez/mesa-dwep2/build/488
>
> I've also seen a similar error with the MinGW crosscompilation to
> Windows:
>
> --
>
> src/util/hash_table.c: In function ‘_mesa_hash_table_u64_insert’:
> src/util/hash_table.c:591:42: warning: cast to pointer from integer of
> different size [-Wint-to-pointer-cast]
>    _mesa_hash_table_insert(ht->table, (void *)key, data);
> ^
> src/util/hash_table.c: In function ‘hash_table_u64_search’:
> src/util/hash_table.c:607:49: warning: cast to pointer from integer of
> different size [-Wint-to-pointer-cast]
>    return _mesa_hash_table_search(ht->table, (void *)key);
> ^
> src/util/sha1/sha1.c:23:2: error: #error BYTE_ORDER not defined
>  #error BYTE_ORDER not defined
> ^
> src/util/sha1/sha1.c:27:2: error: #error LITTLE_ENDIAN no defined
>  #error LITTLE_ENDIAN no defined
> ^
> scons: *** [build/windows-x86-debug/util/sha1/sha1.o] Error 1
> scons: building terminated because of errors.
>
> --
> Br,
>
> Andres

Thank you for testing. That helps a lot.

In that case, I think I'll take Eric's suggestion to use PIPE_ARCH_*
which I know will be defined.
[Mesa-dev] [PATCH] radeonsi: always initialize max_forced_staging_uploads
From: Marek Olšák

r600_resource is malloc'd.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103808
---
 src/gallium/drivers/radeon/r600_buffer_common.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c b/src/gallium/drivers/radeon/r600_buffer_common.c
index 770f4e9..3e476f7 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -183,20 +183,22 @@ void si_init_resource_fields(struct r600_common_screen *rscreen,
 		res->domains = RADEON_DOMAIN_VRAM_GTT;
 		res->flags &= ~RADEON_FLAG_NO_CPU_ACCESS; /* disallowed with VRAM_GTT */
 	}

 	if (rscreen->debug_flags & DBG(NO_WC))
 		res->flags &= ~RADEON_FLAG_GTT_WC;

 	/* Set expected VRAM and GART usage for the buffer. */
 	res->vram_usage = 0;
 	res->gart_usage = 0;
+	res->max_forced_staging_uploads = 0;
+	res->b.max_forced_staging_uploads = 0;

 	if (res->domains & RADEON_DOMAIN_VRAM) {
 		res->vram_usage = size;

 		res->max_forced_staging_uploads =
 		res->b.max_forced_staging_uploads =
 			rscreen->info.has_dedicated_vram &&
 			size >= rscreen->info.vram_vis_size / 4 ? 1 : 0;
 	} else if (res->domains & RADEON_DOMAIN_GTT) {
 		res->gart_usage = size;
--
2.7.4
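The one-line commit message carries the whole argument: memory returned by malloc() has indeterminate contents, so a field that is only assigned on one branch holds garbage on every other branch. A minimal sketch of the same pattern follows. The struct, macros and function names are hypothetical stand-ins, not the radeonsi types, and the wants_staging parameter replaces the real VRAM-size heuristic.

```c
#include <stdlib.h>

#define DEMO_DOMAIN_VRAM 0x1u
#define DEMO_DOMAIN_GTT  0x2u

/* Hypothetical stand-in for the driver's resource struct. */
struct demo_resource {
    unsigned domains;
    unsigned long long vram_usage;
    unsigned long long gart_usage;
    unsigned max_forced_staging_uploads;
};

/* Assign every field on every path before any branch-specific value. */
void demo_init_resource_fields(struct demo_resource *res, unsigned domains,
                               unsigned long long size, int wants_staging)
{
    res->domains = domains;
    res->vram_usage = 0;
    res->gart_usage = 0;
    res->max_forced_staging_uploads = 0; /* without this, GTT path reads garbage */

    if (domains & DEMO_DOMAIN_VRAM) {
        res->vram_usage = size;
        res->max_forced_staging_uploads = wants_staging ? 1 : 0;
    } else if (domains & DEMO_DOMAIN_GTT) {
        res->gart_usage = size;
    }
}

/* malloc() does not zero memory, so initialization must be explicit. */
struct demo_resource *demo_resource_create(unsigned domains,
                                           unsigned long long size,
                                           int wants_staging)
{
    struct demo_resource *res = malloc(sizeof(*res));

    if (!res)
        return NULL;
    demo_init_resource_fields(res, domains, size, wants_staging);
    return res;
}
```

Allocating with calloc() would also give a zeroed field, but an explicit default in the init function keeps the fix local to the code that owns the field, which is the approach the patch takes.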
Re: [Mesa-dev] [PATCH mesa 02/16] anv: tie anv_assert() enablement to regular assert()
On 25/11/17 05:07, Eric Engestrom wrote:
> Signed-off-by: Eric Engestrom
> ---
>  src/intel/vulkan/anv_private.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
> index 6d4e43f2e687cbf26ccd..6474abf0f3694c7fcd3a 100644
> --- a/src/intel/vulkan/anv_private.h
> +++ b/src/intel/vulkan/anv_private.h
> @@ -382,7 +382,7 @@ void anv_debug_report(struct anv_instance *instance,
>     } while (0)
>
>  /* A non-fatal assert. Useful for debugging. */
> -#ifdef DEBUG
> +#ifndef NDEBUG

I'm confused by all these assert patches. Doesn't NDEBUG mean "no debug"
or "non-debug"? Why are you switching things around? Won't this add all
this code to release builds and remove it from debug builds?

>  #define anv_assert(x) ({ \
>     if (unlikely(!(x))) \
>        intel_loge("%s:%d ASSERT: %s", __FILE__, __LINE__, #x); \
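On the question above: in standard C, assert() is compiled out when NDEBUG is defined at the point where <assert.h> is included, so `#ifndef NDEBUG` reads as "asserts are enabled". The guard therefore keeps the code in builds that have asserts on, and does not invert debug and release. The sketch below only demonstrates that relationship; both function names are hypothetical.

```c
#include <assert.h>

/* Returns 1 when assert() is active in this translation unit, else 0. */
int asserts_enabled(void)
{
#ifndef NDEBUG
    return 1; /* NDEBUG not defined: assert() evaluates its argument */
#else
    return 0; /* NDEBUG defined: assert() expands to ((void)0) */
#endif
}

/*
 * Counts how often the assert expression is evaluated. The side effect
 * inside assert() is deliberate here, to show that the expression
 * disappears entirely when NDEBUG is defined.
 */
int count_assert_evaluations(void)
{
    int evaluations = 0;

    assert(++evaluations > 0);
    return evaluations;
}
```

Whether the file is built with or without -DNDEBUG, the two functions agree: code under `#ifndef NDEBUG` is present exactly when assert() does something.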
Re: [Mesa-dev] [PATCH mesa 08/16] amd/addrlib: fix DEBUG guards
On 24 November 2017 at 18:24, Marek Olšák wrote:
> I'd like to keep the code as-is, because addrlib is an external project.
>
> However, since you Cc'd addrlib developers, they can apply the changes
> to addrlib (if they want) and we can get the changes when we update
> addrlib in Mesa.
>
Adding some alternative information - hopefully to inspire the addrlib devs.

Some of the series is related to my proposal to use the -Wundef flag in
Mesa. What that means is the compiler will give you a nice warning when
the macro you're using in an #if is undefined, say because there's a
typo or the relevant header is not included.

HTH
Emil
Re: [Mesa-dev] [PATCH 1/5] configure.ac: add Wundef to the build flags
On Friday, 2017-11-24 18:14:41 +, Emil Velikov wrote:
> On 24 November 2017 at 14:32, Eric Engestrom
> wrote:
> > On Friday, 2017-11-24 14:25:02 +, Emil Velikov wrote:
> >> From: Emil Velikov
> >>
> >> From the manual:
> >> Warn if an undefined identifier is evaluated in an `#if' directive.
> >>
> >> This is something we want to know and address. Otherwise we can end up
> >> with subtle issues, in the less commonly used codepaths.
> >>
> >> Note: this will trigger a lot of extra warnings, with ~60 of those being
> >> unique. Once all those are resolved we'd want to promote the warning to
> >> an error.
> >
> > Yes please; series is
> > Reviewed-by: Eric Engestrom
> >
> Thanks. I think we should hold these off, until some (say 1/3?) of the
> issues are resolved.
> Otherwise devs might get a bit annoyed by the massive amount of warnings.

Agreed. The series I just sent fixes 99% of the warnings already,
because c99_{compat,math}.h is included everywhere. Once that series and
your gtest patches land, I think it should be good enough, and
individual devs can take care of the rest.

The next biggest offender is Nouveau, and I haven't had a proper look
but at a glance I think it looked like it was probably just a few places
generating many warnings.
Re: [Mesa-dev] [PATCH mesa 05/16] amd/addrlib: remove duplicate definition of ADDR_DBG_BREAK()
I'd like to keep the code as-is, because addrlib is an external project. Marek On Fri, Nov 24, 2017 at 7:07 PM, Eric Engestrom wrote: > It's already defined above (around line 60). > > Signed-off-by: Eric Engestrom > --- > src/amd/addrlib/core/addrcommon.h | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/src/amd/addrlib/core/addrcommon.h > b/src/amd/addrlib/core/addrcommon.h > index 99bb62e77f446f912d35..80ea0eb247aa48dcb176 100644 > --- a/src/amd/addrlib/core/addrcommon.h > +++ b/src/amd/addrlib/core/addrcommon.h > @@ -143,8 +143,6 @@ > > #define ADDR_PRNT(a) > > -#define ADDR_DBG_BREAK() > - > #define ADDR_INFO(cond, a) > > #define ADDR_WARN(cond, a) > -- > Cheers, > Eric > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH mesa 09/16] amd/addrlib: add comment about breakpoint not working in some cases
I'd like to keep the code as-is, because addrlib is an external project. Marek On Fri, Nov 24, 2017 at 7:07 PM, Eric Engestrom wrote: > Signed-off-by: Eric Engestrom > --- > src/amd/addrlib/core/addrcommon.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/amd/addrlib/core/addrcommon.h > b/src/amd/addrlib/core/addrcommon.h > index 1a7473a4d33c1271f077..9261c5890903d0dc1c0b 100644 > --- a/src/amd/addrlib/core/addrcommon.h > +++ b/src/amd/addrlib/core/addrcommon.h > @@ -51,7 +51,7 @@ > > > #ifdef DEBUG > #if defined(__GNUC__) > -#define ADDR_DBG_BREAK()assert(false) > +#define ADDR_DBG_BREAK()assert(false) /* FIXME: this can't work > on debug builds with asserts off */ > #elif defined(__APPLE__) > #define ADDR_DBG_BREAK(){ IOPanic("");} > #else > -- > Cheers, > Eric > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH mesa 08/16] amd/addrlib: fix DEBUG guards
I'd like to keep the code as-is, because addrlib is an external project. However, since you Cc'd addrlib developers, they can apply the changes to addrlib (if they want) and we can get the changes when we update addrlib in Mesa. Marek On Fri, Nov 24, 2017 at 7:07 PM, Eric Engestrom wrote: > Use the normal `#ifdef` style used by the rest of Mesa, instead of > sometimes locally defining DEBUG on non-debug builds and then relying on > the precompiler converting #undef'ed macros to 0 when checking. > > Signed-off-by: Eric Engestrom > --- > I would argue this block should go, as all it does is enable debug > prints on release builds with asserts. It doesn't make sense to me that > the fact asserts are enabled is relevant to this. > --- > src/amd/addrlib/core/addrcommon.h| 8 +++- > src/amd/addrlib/core/addrobject.cpp | 2 +- > src/amd/addrlib/gfx9/gfx9addrlib.cpp | 2 +- > 3 files changed, 5 insertions(+), 7 deletions(-) > > diff --git a/src/amd/addrlib/core/addrcommon.h > b/src/amd/addrlib/core/addrcommon.h > index 80ea0eb247aa48dcb176..1a7473a4d33c1271f077 100644 > --- a/src/amd/addrlib/core/addrcommon.h > +++ b/src/amd/addrlib/core/addrcommon.h > @@ -41,9 +41,7 @@ > #include > > #if !defined(DEBUG) > -#ifdef NDEBUG > -#define DEBUG 0 > -#else > +#ifndef NDEBUG > #define DEBUG 1 > #endif > #endif > @@ -51,7 +49,7 @@ > > > // Platform specific debug break defines > > > -#if DEBUG > +#ifdef DEBUG > #if defined(__GNUC__) > #define ADDR_DBG_BREAK()assert(false) > #elif defined(__APPLE__) > @@ -82,7 +80,7 @@ > > > // Debug print macro from legacy address library > > > -#if DEBUG > +#ifdef DEBUG > > #define ADDR_PRNT(a)Object::DebugPrint a > > diff --git a/src/amd/addrlib/core/addrobject.cpp > b/src/amd/addrlib/core/addrobject.cpp > index 452feb5fac0fbd6404bc..04c343d3da1c1d62b334 100644 > --- a/src/amd/addrlib/core/addrobject.cpp > +++ b/src/amd/addrlib/core/addrobject.cpp > @@ -213,7 +213,7 @@ VOID Object::DebugPrint( > ... 
> ) const > { > -#if DEBUG > +#ifdef DEBUG > if (m_client.callbacks.debugPrint != NULL) > { > ADDR_DEBUGPRINT_INPUT debugPrintInput = {0}; > diff --git a/src/amd/addrlib/gfx9/gfx9addrlib.cpp > b/src/amd/addrlib/gfx9/gfx9addrlib.cpp > index 1d42cbfc8a3a50c84343..48de5016807944043fdf 100644 > --- a/src/amd/addrlib/gfx9/gfx9addrlib.cpp > +++ b/src/amd/addrlib/gfx9/gfx9addrlib.cpp > @@ -3718,7 +3718,7 @@ ADDR_E_RETURNCODE > Gfx9Lib::HwlGetPreferredSurfaceSetting( > returnCode = ADDR_NOTSUPPORTED; > } > > -#if DEBUG > +#ifdef DEBUG > // Post sanity check, at least AddrLib should accept the > output generated by its own > if (pOut->swizzleMode != ADDR_SW_LINEAR) > { > -- > Cheers, > Eric > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH mesa 07/16] amd/addrlib: use NDEBUG to guard asserts
I'd like to keep the code as-is, because addrlib is an external project. Marek On Fri, Nov 24, 2017 at 7:07 PM, Eric Engestrom wrote: > Signed-off-by: Eric Engestrom > --- > src/amd/addrlib/core/addrelemlib.cpp | 2 +- > src/amd/addrlib/core/addrlib1.cpp| 8 > src/amd/addrlib/core/addrlib2.h | 2 +- > src/amd/addrlib/gfx9/gfx9addrlib.cpp | 4 ++-- > src/amd/addrlib/r800/egbaddrlib.cpp | 2 +- > 5 files changed, 9 insertions(+), 9 deletions(-) > > diff --git a/src/amd/addrlib/core/addrelemlib.cpp > b/src/amd/addrlib/core/addrelemlib.cpp > index c9e6da4729a5e2ef2aef..910c6ffda27000a4349b 100644 > --- a/src/amd/addrlib/core/addrelemlib.cpp > +++ b/src/amd/addrlib/core/addrelemlib.cpp > @@ -1219,7 +1219,7 @@ VOID ElemLib::AdjustSurfaceInfo( > basePitch = basePitch / expandX; > width = width / expandX; > height= height / expandY; > -#if DEBUG > +#ifndef NDEBUG > width = (width == 0) ? 1 : width; > height= (height == 0) ? 1 : height; > > diff --git a/src/amd/addrlib/core/addrlib1.cpp > b/src/amd/addrlib/core/addrlib1.cpp > index 86cd66aee43c4487ffe5..4f9ecae10002a7b51c84 100644 > --- a/src/amd/addrlib/core/addrlib1.cpp > +++ b/src/amd/addrlib/core/addrlib1.cpp > @@ -367,12 +367,12 @@ ADDR_E_RETURNCODE Lib::ComputeSurfaceInfo( > pOut->pixelPitch= pOut->pitch; > pOut->pixelHeight = pOut->height; > > -#if DEBUG > +#ifndef NDEBUG > if (localIn.flags.display) > { > ADDR_ASSERT((pOut->pitchAlign % 32) == 0); > } > -#endif //DEBUG > +#endif //NDEBUG > > if (localIn.format != ADDR_FMT_INVALID) > { > @@ -2036,12 +2036,12 @@ ADDR_E_RETURNCODE Lib::ComputeCmaskInfo( > UINT_32 slice = (*pPitchOut) * (*pHeightOut); > UINT_32 blockMax = slice / 128 / 128 - 1; > > -#if DEBUG > +#ifndef NDEBUG > if (slice % (64*256) != 0) > { > ADDR_ASSERT_ALWAYS(); > } > -#endif //DEBUG > +#endif //NDEBUG > > UINT_32 maxBlockMax = HwlGetMaxCmaskBlockMax(); > > diff --git a/src/amd/addrlib/core/addrlib2.h b/src/amd/addrlib/core/addrlib2.h > index bea2a485a61aa10990a1..e9cbea8f62ef6ac2b3aa 100644 > --- 
a/src/amd/addrlib/core/addrlib2.h > +++ b/src/amd/addrlib/core/addrlib2.h > @@ -707,7 +707,7 @@ class Lib : public Addr::Lib > > VOID VerifyMipLevelInfo(const ADDR2_COMPUTE_SURFACE_INFO_INPUT* pIn) > const > { > -#if DEBUG > +#ifndef NDEBUG > if (pIn->numMipLevels > 1) > { > UINT_32 actualMipLevels = 1; > diff --git a/src/amd/addrlib/gfx9/gfx9addrlib.cpp > b/src/amd/addrlib/gfx9/gfx9addrlib.cpp > index e06f13c0afe01d026a0b..1d42cbfc8a3a50c84343 100644 > --- a/src/amd/addrlib/gfx9/gfx9addrlib.cpp > +++ b/src/amd/addrlib/gfx9/gfx9addrlib.cpp > @@ -196,7 +196,7 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlComputeHtileInfo( > metaBlkDim.w <<= widthAmp; > metaBlkDim.h <<= heightAmp; > > -#if DEBUG > +#ifndef NDEBUG > Dim3d metaBlkDimDbg = {8, 8, 1}; > for (UINT_32 index = 0; index < numCompressBlkPerMetaBlkLog2; index++) > { > @@ -311,7 +311,7 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlComputeCmaskInfo( > metaBlkDim.w <<= widthAmp; > metaBlkDim.h <<= heightAmp; > > -#if DEBUG > +#ifndef NDEBUG > Dim2d metaBlkDimDbg = {8, 8}; > for (UINT_32 index = 0; index < numCompressBlkPerMetaBlkLog2; index++) > { > diff --git a/src/amd/addrlib/r800/egbaddrlib.cpp > b/src/amd/addrlib/r800/egbaddrlib.cpp > index 854d4cbe8ad87f70db14..ab5e554da7b190562558 100644 > --- a/src/amd/addrlib/r800/egbaddrlib.cpp > +++ b/src/amd/addrlib/r800/egbaddrlib.cpp > @@ -3861,7 +3861,7 @@ ADDR_E_RETURNCODE EgBasedLib::HwlComputeSurfaceInfo( > // Resets pTileInfo to NULL if the internal tile info is used > if (pOut->pTileInfo == &tileInfo) > { > -#if DEBUG > +#ifndef NDEBUG > // Client does not pass in a valid pTileInfo > if (IsMacroTiled(pOut->tileMode)) > { > -- > Cheers, > Eric > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH mesa 06/16] amd/addrlib: fix macro test
I'd like to keep the code as-is, because addrlib is an external project. Marek On Fri, Nov 24, 2017 at 7:07 PM, Eric Engestrom wrote: > Check macro existence instead of relying on the compiler turning undef into 0. > > Signed-off-by: Eric Engestrom > --- > src/amd/addrlib/core/addrlib1.cpp | 2 +- > src/amd/addrlib/r800/egbaddrlib.cpp | 4 ++-- > src/amd/addrlib/r800/siaddrlib.cpp | 2 +- > 3 files changed, 4 insertions(+), 4 deletions(-) > > diff --git a/src/amd/addrlib/core/addrlib1.cpp > b/src/amd/addrlib/core/addrlib1.cpp > index c796a63436c71969ddcd..86cd66aee43c4487ffe5 100644 > --- a/src/amd/addrlib/core/addrlib1.cpp > +++ b/src/amd/addrlib/core/addrlib1.cpp > @@ -236,7 +236,7 @@ ADDR_E_RETURNCODE Lib::ComputeSurfaceInfo( > pOut->last2DLevel = FALSE; > pOut->tcCompatible = FALSE; > > -#if !ALT_TEST > +#ifndef ALT_TEST > if (localIn.numSamples > 1) > { > ADDR_ASSERT(localIn.mipLevel == 0); > diff --git a/src/amd/addrlib/r800/egbaddrlib.cpp > b/src/amd/addrlib/r800/egbaddrlib.cpp > index 99aa6cf4cdb52affe42c..854d4cbe8ad87f70db14 100644 > --- a/src/amd/addrlib/r800/egbaddrlib.cpp > +++ b/src/amd/addrlib/r800/egbaddrlib.cpp > @@ -242,7 +242,7 @@ BOOL_32 EgBasedLib::ComputeSurfaceInfoLinear( > > if ((pIn->tileMode == ADDR_TM_LINEAR_GENERAL) && pIn->flags.color && > (pIn->height > 1)) > { > -#if !ALT_TEST > +#ifndef ALT_TEST > // When linear_general surface is accessed in multiple lines, it > requires 8 pixels in pitch > // alignment since PITCH_TILE_MAX is in unit of 8 pixels. > // It is OK if it is accessed per line. 
> @@ -3905,7 +3905,7 @@ ADDR_E_RETURNCODE > EgBasedLib::HwlComputeSurfaceAddrFromCoord( > ADDR_E_RETURNCODE retCode = ADDR_OK; > > if ( > -#if !ALT_TEST // Overflow test needs this out-of-boundary coord > +#ifndef ALT_TEST // Overflow test needs this out-of-boundary coord > (pIn->x > pIn->pitch) || > (pIn->y > pIn->height) || > #endif > diff --git a/src/amd/addrlib/r800/siaddrlib.cpp > b/src/amd/addrlib/r800/siaddrlib.cpp > index 0fb5c2befdc2a3ad7590..70f0d5cb240ad01de2be 100644 > --- a/src/amd/addrlib/r800/siaddrlib.cpp > +++ b/src/amd/addrlib/r800/siaddrlib.cpp > @@ -1834,7 +1834,7 @@ UINT_64 SiLib::HwlGetSizeAdjustmentMicroTiled( > physicalSliceSize = logicalSliceSize * thickness; > } > > -#if !ALT_TEST > +#ifndef ALT_TEST > // > // Special workaround for depth/stencil buffer, use 8 bpp to align depth > buffer again since > // the stencil plane may have larger pitch if the slice size is smaller > than base alignment. > -- > Cheers, > Eric > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH mesa 04/16] amd: remove always-true BRAHMA_BUILD define
Reviewed-by: Marek Olšák Marek On Fri, Nov 24, 2017 at 7:07 PM, Eric Engestrom wrote: > Signed-off-by: Eric Engestrom > --- > src/amd/Android.addrlib.mk | 2 -- > src/amd/Makefile.addrlib.am | 3 +-- > src/amd/addrlib/core/addrcommon.h| 15 ++- > src/amd/addrlib/meson.build | 2 +- > src/gallium/winsys/amdgpu/drm/Android.mk | 4 +--- > 5 files changed, 5 insertions(+), 21 deletions(-) > > diff --git a/src/amd/Android.addrlib.mk b/src/amd/Android.addrlib.mk > index a29f7c16d1797b5943c4..a37112d7154de568c815 100644 > --- a/src/amd/Android.addrlib.mk > +++ b/src/amd/Android.addrlib.mk > @@ -30,8 +30,6 @@ LOCAL_MODULE := libmesa_amdgpu_addrlib > > LOCAL_SRC_FILES := $(ADDRLIB_FILES) > > -LOCAL_CFLAGS := -DBRAHMA_BUILD=1 > - > LOCAL_C_INCLUDES := \ > $(MESA_TOP)/src \ > $(MESA_TOP)/src/amd/common \ > diff --git a/src/amd/Makefile.addrlib.am b/src/amd/Makefile.addrlib.am > index 90dfe9634454218bee82..322a5c86973b9238c74c 100644 > --- a/src/amd/Makefile.addrlib.am > +++ b/src/amd/Makefile.addrlib.am > @@ -29,8 +29,7 @@ addrlib_libamdgpu_addrlib_la_CPPFLAGS = \ > -I$(srcdir)/addrlib/inc/chip/gfx9 \ > -I$(srcdir)/addrlib/inc/chip/r800 \ > -I$(srcdir)/addrlib/gfx9/chip \ > - -I$(srcdir)/addrlib/r800/chip \ > - -DBRAHMA_BUILD=1 > + -I$(srcdir)/addrlib/r800/chip > > addrlib_libamdgpu_addrlib_la_CXXFLAGS = \ > $(VISIBILITY_CXXFLAGS) $(CXX11_CXXFLAGS) > diff --git a/src/amd/addrlib/core/addrcommon.h > b/src/amd/addrlib/core/addrcommon.h > index 62f8ac61618e65db6768..99bb62e77f446f912d35 100644 > --- a/src/amd/addrlib/core/addrcommon.h > +++ b/src/amd/addrlib/core/addrcommon.h > @@ -40,7 +40,7 @@ > #include > #include > > -#if BRAHMA_BUILD && !defined(DEBUG) > +#if !defined(DEBUG) > #ifdef NDEBUG > #define DEBUG 0 > #else > @@ -73,18 +73,7 @@ > #define ADDR_ANALYSIS_ASSUME(expr) do { (void)(expr); } while (0) > #endif > > -#if BRAHMA_BUILD > -#define ADDR_ASSERT(__e) assert(__e) > -#elif DEBUG > -#define ADDR_ASSERT(__e)\ > -do {\ > -ADDR_ANALYSIS_ASSUME(__e); \ > -if ( !((__e) ? 
TRUE : FALSE)) { ADDR_DBG_BREAK(); } \ > -} while (0) > -#else //DEBUG > -#define ADDR_ASSERT(__e) ADDR_ANALYSIS_ASSUME(__e) > -#endif //DEBUG > - > +#define ADDR_ASSERT(__e) assert(__e) > #define ADDR_ASSERT_ALWAYS() ADDR_DBG_BREAK() > #define ADDR_UNHANDLED_CASE() ADDR_ASSERT(!"Unhandled case") > #define ADDR_NOT_IMPLEMENTED() ADDR_ASSERT(!"Not implemented"); > diff --git a/src/amd/addrlib/meson.build b/src/amd/addrlib/meson.build > index 1a7f2fdef5d8e347b5ad..ed0dde6245b3396305fc 100644 > --- a/src/amd/addrlib/meson.build > +++ b/src/amd/addrlib/meson.build > @@ -56,5 +56,5 @@ libamdgpu_addrlib = static_library( >include_directories : include_directories( > 'core', 'inc/chip/gfx9', 'inc/chip/r800', 'gfx9/chip', 'r800/chip', > '../common', '../../'), > - cpp_args : [cpp_vis_args, '-DBRAHMA_BUILD=1'], > + cpp_args : cpp_vis_args, > ) > diff --git a/src/gallium/winsys/amdgpu/drm/Android.mk > b/src/gallium/winsys/amdgpu/drm/Android.mk > index a05304ae5dea571e0a67..6e84a0c8de1a8e503ef4 100644 > --- a/src/gallium/winsys/amdgpu/drm/Android.mk > +++ b/src/gallium/winsys/amdgpu/drm/Android.mk > @@ -30,9 +30,7 @@ include $(CLEAR_VARS) > > LOCAL_SRC_FILES := $(C_SOURCES) > > -LOCAL_CFLAGS := \ > - $(AMDGPU_CFLAGS) \ > - -DBRAHMA_BUILD=1 > +LOCAL_CFLAGS := $(AMDGPU_CFLAGS) > > LOCAL_STATIC_LIBRARIES := libmesa_amdgpu_addrlib > > -- > Cheers, > Eric > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/5] configure.ac: add Wundef to the build flags
On 24 November 2017 at 14:32, Eric Engestrom wrote:
> On Friday, 2017-11-24 14:25:02 +, Emil Velikov wrote:
>> From: Emil Velikov
>>
>> From the manual:
>> Warn if an undefined identifier is evaluated in an `#if' directive.
>>
>> This is something we want to know and address. Otherwise we can end up
>> with subtle issues, in the less commonly used codepaths.
>>
>> Note: this will trigger a lot of extra warnings, with ~60 of those being
>> unique. Once all those are resolved we'd want to promote the warning to
>> an error.
>
> Yes please; series is
> Reviewed-by: Eric Engestrom
>
Thanks. I think we should hold these off, until some (say 1/3?) of the
issues are resolved.
Otherwise devs might get a bit annoyed by the massive amount of warnings.

-Emil
[Mesa-dev] [PATCH mesa 15/16] anv: use NDEBUG to guard asserts
Signed-off-by: Eric Engestrom
---
 src/intel/vulkan/anv_pipeline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index 907b24a758deb7181115..1dbef4f14a33302300d4 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -1163,7 +1163,7 @@ copy_non_dynamic_state(struct anv_pipeline *pipeline,
 static void
 anv_pipeline_validate_create_info(const VkGraphicsPipelineCreateInfo *info)
 {
-#ifdef DEBUG
+#ifndef NDEBUG
    struct anv_render_pass *renderpass = NULL;
    struct anv_subpass *subpass = NULL;

--
Cheers,
  Eric
[Mesa-dev] [PATCH mesa 13/16] compiler: fix typo
Signed-off-by: Eric Engestrom
---
 src/compiler/nir/nir_lower_io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_lower_io.c b/src/compiler/nir/nir_lower_io.c
index 3879f0297d3959e720ff..df91febd68dd1f5fa7af 100644
--- a/src/compiler/nir/nir_lower_io.c
+++ b/src/compiler/nir/nir_lower_io.c
@@ -522,7 +522,7 @@ nir_lower_io(nir_shader *shader, nir_variable_mode modes,
 }

 /**
- * Return the offset soruce for a load/store intrinsic.
+ * Return the offset source for a load/store intrinsic.
  */
 nir_src *
 nir_get_io_offset_src(nir_intrinsic_instr *instr)
--
Cheers,
  Eric
[Mesa-dev] [PATCH mesa 08/16] amd/addrlib: fix DEBUG guards
Use the normal `#ifdef` style used by the rest of Mesa, instead of sometimes locally defining DEBUG on non-debug builds and then relying on the precompiler converting #undef'ed macros to 0 when checking. Signed-off-by: Eric Engestrom --- I would argue this block should go, as all it does is enable debug prints on release builds with asserts. It doesn't make sense to me that the fact asserts are enabled is relevant to this. --- src/amd/addrlib/core/addrcommon.h| 8 +++- src/amd/addrlib/core/addrobject.cpp | 2 +- src/amd/addrlib/gfx9/gfx9addrlib.cpp | 2 +- 3 files changed, 5 insertions(+), 7 deletions(-) diff --git a/src/amd/addrlib/core/addrcommon.h b/src/amd/addrlib/core/addrcommon.h index 80ea0eb247aa48dcb176..1a7473a4d33c1271f077 100644 --- a/src/amd/addrlib/core/addrcommon.h +++ b/src/amd/addrlib/core/addrcommon.h @@ -41,9 +41,7 @@ #include #if !defined(DEBUG) -#ifdef NDEBUG -#define DEBUG 0 -#else +#ifndef NDEBUG #define DEBUG 1 #endif #endif @@ -51,7 +49,7 @@ // Platform specific debug break defines -#if DEBUG +#ifdef DEBUG #if defined(__GNUC__) #define ADDR_DBG_BREAK()assert(false) #elif defined(__APPLE__) @@ -82,7 +80,7 @@ // Debug print macro from legacy address library -#if DEBUG +#ifdef DEBUG #define ADDR_PRNT(a)Object::DebugPrint a diff --git a/src/amd/addrlib/core/addrobject.cpp b/src/amd/addrlib/core/addrobject.cpp index 452feb5fac0fbd6404bc..04c343d3da1c1d62b334 100644 --- a/src/amd/addrlib/core/addrobject.cpp +++ b/src/amd/addrlib/core/addrobject.cpp @@ -213,7 +213,7 @@ VOID Object::DebugPrint( ... 
) const { -#if DEBUG +#ifdef DEBUG if (m_client.callbacks.debugPrint != NULL) { ADDR_DEBUGPRINT_INPUT debugPrintInput = {0}; diff --git a/src/amd/addrlib/gfx9/gfx9addrlib.cpp b/src/amd/addrlib/gfx9/gfx9addrlib.cpp index 1d42cbfc8a3a50c84343..48de5016807944043fdf 100644 --- a/src/amd/addrlib/gfx9/gfx9addrlib.cpp +++ b/src/amd/addrlib/gfx9/gfx9addrlib.cpp @@ -3718,7 +3718,7 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlGetPreferredSurfaceSetting( returnCode = ADDR_NOTSUPPORTED; } -#if DEBUG +#ifdef DEBUG // Post sanity check, at least AddrLib should accept the output generated by its own if (pOut->swizzleMode != ADDR_SW_LINEAR) { -- Cheers, Eric ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
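One detail of this patch is easy to miss: switching from `#if DEBUG` to `#ifdef DEBUG` is only safe because the `#define DEBUG 0` branch is removed in the same change. A macro defined to 0 is false under `#if` but still counts as defined under `#ifdef`. The sketch below shows the difference with a hypothetical macro name (DEMO_DEBUG) and hypothetical helper functions; it is an illustration, not addrlib code.

```c
/* Hypothetical macro: defined, but with the value 0. */
#define DEMO_DEBUG 0

/* "#if" looks at the value: 0 means the block is skipped. */
#if DEMO_DEBUG
#define DEMO_IF_RESULT 1
#else
#define DEMO_IF_RESULT 0
#endif

/* "#ifdef" looks only at existence: a macro defined to 0 still counts. */
#ifdef DEMO_DEBUG
#define DEMO_IFDEF_RESULT 1
#else
#define DEMO_IFDEF_RESULT 0
#endif

/* Returns 0: the value-based test treats "defined to 0" as disabled. */
int demo_if_result(void)
{
    return DEMO_IF_RESULT;
}

/* Returns 1: the existence-based test treats "defined to 0" as enabled. */
int demo_ifdef_result(void)
{
    return DEMO_IFDEF_RESULT;
}
```

Had the patch kept `#define DEBUG 0` for NDEBUG builds while converting the checks to `#ifdef DEBUG`, the debug-only blocks would have been compiled into every build.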
[Mesa-dev] [PATCH mesa 16/16] util: use NDEBUG to guard asserts
Signed-off-by: Eric Engestrom --- src/util/ralloc.c | 18 +- src/util/slab.c | 4 ++-- 2 files changed, 11 insertions(+), 11 deletions(-) diff --git a/src/util/ralloc.c b/src/util/ralloc.c index 42cfa2e391d52df68db2..b52079ac075a0fe11944 100644 --- a/src/util/ralloc.c +++ b/src/util/ralloc.c @@ -61,7 +61,7 @@ struct #endif ralloc_header { -#ifdef DEBUG +#ifndef NDEBUG /* A canary value used to determine whether a pointer is ralloc'd. */ unsigned canary; #endif @@ -88,7 +88,7 @@ get_header(const void *ptr) { ralloc_header *info = (ralloc_header *) (((char *) ptr) - sizeof(ralloc_header)); -#ifdef DEBUG +#ifndef NDEBUG assert(info->canary == CANARY); #endif return info; @@ -140,7 +140,7 @@ ralloc_size(const void *ctx, size_t size) add_child(parent, info); -#ifdef DEBUG +#ifndef NDEBUG info->canary = CANARY; #endif @@ -560,7 +560,7 @@ ralloc_vasprintf_rewrite_tail(char **str, size_t *start, const char *fmt, #define LMAGIC 0x87b9c7d3 struct linear_header { -#ifdef DEBUG +#ifndef NDEBUG unsigned magic; /* for debugging */ #endif unsigned offset; /* points to the first unused byte in the buffer */ @@ -610,7 +610,7 @@ create_linear_node(void *ralloc_ctx, unsigned min_size) if (unlikely(!node)) return NULL; -#ifdef DEBUG +#ifndef NDEBUG node->magic = LMAGIC; #endif node->offset = 0; @@ -630,7 +630,7 @@ linear_alloc_child(void *parent, unsigned size) linear_size_chunk *ptr; unsigned full_size; -#ifdef DEBUG +#ifndef NDEBUG assert(first->magic == LMAGIC); #endif assert(!latest->next); @@ -704,7 +704,7 @@ linear_free_parent(void *ptr) return; node = LINEAR_PARENT_TO_HEADER(ptr); -#ifdef DEBUG +#ifndef NDEBUG assert(node->magic == LMAGIC); #endif @@ -725,7 +725,7 @@ ralloc_steal_linear_parent(void *new_ralloc_ctx, void *ptr) return; node = LINEAR_PARENT_TO_HEADER(ptr); -#ifdef DEBUG +#ifndef NDEBUG assert(node->magic == LMAGIC); #endif @@ -740,7 +740,7 @@ void * ralloc_parent_of_linear_parent(void *ptr) { linear_header *node = LINEAR_PARENT_TO_HEADER(ptr); -#ifdef DEBUG +#ifndef 
NDEBUG assert(node->magic == LMAGIC); #endif return node->ralloc_parent; diff --git a/src/util/slab.c b/src/util/slab.c index 4ce0e9a34852ca08d473..771c6bc2443b7ed3685f 100644 --- a/src/util/slab.c +++ b/src/util/slab.c @@ -33,7 +33,7 @@ #define SLAB_MAGIC_ALLOCATED 0xcafe4321 #define SLAB_MAGIC_FREE 0x7ee01234 -#ifdef DEBUG +#ifndef NDEBUG #define SET_MAGIC(element, value) (element)->magic = (value) #define CHECK_MAGIC(element, value) assert((element)->magic == (value)) #else @@ -53,7 +53,7 @@ struct slab_element_header { */ intptr_t owner; -#ifdef DEBUG +#ifndef NDEBUG intptr_t magic; #endif }; -- Cheers, Eric ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
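For readers skimming the hunks above, the pattern being re-keyed can be sketched as follows. This is a minimal sketch, not ralloc's implementation: `demo_header`, `demo_alloc()`, `demo_get_header()`, `demo_free()` and `DEMO_CANARY` are hypothetical names, and payload alignment beyond that of the header is ignored for brevity. The only point is that the debug-only field and the checks that read it are tied to the same macro that controls assert(), so the two cannot get out of sync.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical canary value, for illustration only. */
#define DEMO_CANARY 0x5A1106u

typedef struct demo_header {
#ifndef NDEBUG
   /* Only present when assert() is active, so release builds pay nothing. */
   unsigned canary;
#endif
   size_t size;
} demo_header;

/* Allocate `size` payload bytes preceded by a header. */
static void *demo_alloc(size_t size)
{
   demo_header *info = malloc(sizeof(*info) + size);
   if (!info)
      return NULL;
#ifndef NDEBUG
   info->canary = DEMO_CANARY;
#endif
   info->size = size;
   return info + 1;
}

/* Step back from the payload pointer to its header. */
static demo_header *demo_get_header(const void *ptr)
{
   demo_header *info = (demo_header *)((char *)ptr - sizeof(demo_header));
   /* With NDEBUG defined, assert() discards its argument, so this line
    * never refers to the missing `canary` member. */
   assert(info->canary == DEMO_CANARY);
   return info;
}

static void demo_free(void *ptr)
{
   if (ptr)
      free(demo_get_header(ptr));
}
```

Had the field been guarded by a separate DEBUG macro, a build with asserts enabled but DEBUG undefined would silently skip the canary checks, which is presumably the kind of mismatch the series is removing.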
[Mesa-dev] [PATCH mesa 12/16] compiler: use NDEBUG to guard asserts
nir_validate.c's #endif already had the correct NDEBUG comment Fixes: dcb1acdea00a8f2c29777 "nir/validate: Only build in debug mode" Fixes: 9ff71b649b4b3808a9e17 "i965/nir: Validate that NIR passes call nir_metadata_preserve()" Signed-off-by: Eric Engestrom --- src/compiler/nir/nir.h | 8 src/compiler/nir/nir_metadata.c | 2 +- src/compiler/nir/nir_validate.c | 2 +- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h index f46f6147110acbffec5c..fb4fe02f8a8cfe86e4b8 100644 --- a/src/compiler/nir/nir.h +++ b/src/compiler/nir/nir.h @@ -41,9 +41,9 @@ #include "compiler/shader_info.h" #include -#ifdef DEBUG +#ifndef NDEBUG #include "util/debug.h" -#endif /* DEBUG */ +#endif /* NDEBUG */ #include "nir_opcodes.h" @@ -2325,7 +2325,7 @@ nir_deref_var *nir_deref_var_clone(const nir_deref_var *deref, void *mem_ctx); nir_shader *nir_shader_serialize_deserialize(void *mem_ctx, nir_shader *s); -#ifdef DEBUG +#ifndef NDEBUG void nir_validate_shader(nir_shader *shader); void nir_metadata_set_validation_flag(nir_shader *shader); void nir_metadata_check_validation_flag(nir_shader *shader); @@ -2366,7 +2366,7 @@ static inline void nir_metadata_check_validation_flag(nir_shader *shader) { (voi static inline bool should_clone_nir(void) { return false; } static inline bool should_serialize_deserialize_nir(void) { return false; } static inline bool should_print_nir(void) { return false; } -#endif /* DEBUG */ +#endif /* NDEBUG */ #define _PASS(nir, do_pass) do { \ do_pass \ diff --git a/src/compiler/nir/nir_metadata.c b/src/compiler/nir/nir_metadata.c index f71cf432b703451e6997..e681ba34f7557a7c3051 100644 --- a/src/compiler/nir/nir_metadata.c +++ b/src/compiler/nir/nir_metadata.c @@ -59,7 +59,7 @@ nir_metadata_preserve(nir_function_impl *impl, nir_metadata preserved) impl->valid_metadata &= preserved; } -#ifdef DEBUG +#ifndef NDEBUG /** * Make sure passes properly invalidate metadata (part 1). 
* diff --git a/src/compiler/nir/nir_validate.c b/src/compiler/nir/nir_validate.c index 9bf8c7029012ef26af14..a49948fbb489beb9509e 100644 --- a/src/compiler/nir/nir_validate.c +++ b/src/compiler/nir/nir_validate.c @@ -35,7 +35,7 @@ /* Since this file is just a pile of asserts, don't bother compiling it if * we're not building a debug build. */ -#ifdef DEBUG +#ifndef NDEBUG /* * Per-register validation state. -- Cheers, Eric ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH mesa 06/16] amd/addrlib: fix macro test
Check macro existence instead of relying on the compiler turning undef into 0.

Signed-off-by: Eric Engestrom
---
 src/amd/addrlib/core/addrlib1.cpp   | 2 +-
 src/amd/addrlib/r800/egbaddrlib.cpp | 4 ++--
 src/amd/addrlib/r800/siaddrlib.cpp  | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/amd/addrlib/core/addrlib1.cpp b/src/amd/addrlib/core/addrlib1.cpp
index c796a63436c71969ddcd..86cd66aee43c4487ffe5 100644
--- a/src/amd/addrlib/core/addrlib1.cpp
+++ b/src/amd/addrlib/core/addrlib1.cpp
@@ -236,7 +236,7 @@ ADDR_E_RETURNCODE Lib::ComputeSurfaceInfo(
     pOut->last2DLevel = FALSE;
     pOut->tcCompatible = FALSE;
-#if !ALT_TEST
+#ifndef ALT_TEST
     if (localIn.numSamples > 1)
     {
         ADDR_ASSERT(localIn.mipLevel == 0);
diff --git a/src/amd/addrlib/r800/egbaddrlib.cpp b/src/amd/addrlib/r800/egbaddrlib.cpp
index 99aa6cf4cdb52affe42c..854d4cbe8ad87f70db14 100644
--- a/src/amd/addrlib/r800/egbaddrlib.cpp
+++ b/src/amd/addrlib/r800/egbaddrlib.cpp
@@ -242,7 +242,7 @@ BOOL_32 EgBasedLib::ComputeSurfaceInfoLinear(
     if ((pIn->tileMode == ADDR_TM_LINEAR_GENERAL) && pIn->flags.color && (pIn->height > 1))
     {
-#if !ALT_TEST
+#ifndef ALT_TEST
         // When linear_general surface is accessed in multiple lines, it requires 8 pixels in pitch
         // alignment since PITCH_TILE_MAX is in unit of 8 pixels.
         // It is OK if it is accessed per line.
@@ -3905,7 +3905,7 @@ ADDR_E_RETURNCODE EgBasedLib::HwlComputeSurfaceAddrFromCoord(
     ADDR_E_RETURNCODE retCode = ADDR_OK;
     if (
-#if !ALT_TEST // Overflow test needs this out-of-boundary coord
+#ifndef ALT_TEST // Overflow test needs this out-of-boundary coord
         (pIn->x > pIn->pitch) ||
         (pIn->y > pIn->height) ||
 #endif
diff --git a/src/amd/addrlib/r800/siaddrlib.cpp b/src/amd/addrlib/r800/siaddrlib.cpp
index 0fb5c2befdc2a3ad7590..70f0d5cb240ad01de2be 100644
--- a/src/amd/addrlib/r800/siaddrlib.cpp
+++ b/src/amd/addrlib/r800/siaddrlib.cpp
@@ -1834,7 +1834,7 @@ UINT_64 SiLib::HwlGetSizeAdjustmentMicroTiled(
         physicalSliceSize = logicalSliceSize * thickness;
     }
-#if !ALT_TEST
+#ifndef ALT_TEST
     //
     // Special workaround for depth/stencil buffer, use 8 bpp to align depth buffer again since
     // the stencil plane may have larger pitch if the slice size is smaller than base alignment.
--
Cheers,
  Eric
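The difference between the two spellings in this patch can be sketched in plain C. This is an illustration only, assuming two hypothetical macros (`DEMO_UNDEFINED`, which is never defined, and `DEMO_ZERO`, defined as 0); neither exists in addrlib. `#if !X` tests the macro's value and treats an undefined name as 0 (which is what -Wundef reports), while `#ifndef X` tests only whether the name exists.

```c
#include <stdbool.h>

/* Hypothetical macros, for illustration only; not part of addrlib. */
#define DEMO_ZERO 0
/* DEMO_UNDEFINED is deliberately never defined. */

/* Value test on an undefined name: the preprocessor substitutes 0,
 * so this is "#if !0" and the branch is taken. */
static bool value_test_undefined(void)
{
#if !DEMO_UNDEFINED
   return true;
#else
   return false;
#endif
}

/* Existence test on an undefined name: also taken. */
static bool existence_test_undefined(void)
{
#ifndef DEMO_UNDEFINED
   return true;
#else
   return false;
#endif
}

/* Value test on a macro defined as 0: "#if !0", taken. */
static bool value_test_zero(void)
{
#if !DEMO_ZERO
   return true;
#else
   return false;
#endif
}

/* Existence test on a macro defined as 0: the name exists, so NOT taken. */
static bool existence_test_zero(void)
{
#ifndef DEMO_ZERO
   return true;
#else
   return false;
#endif
}
```

The two forms agree when the macro is absent, which appears to be the case the patch cares about, but they diverge if a build ever passes something like -DALT_TEST=0.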
[Mesa-dev] [PATCH mesa 14/16] util/cache: use NDEBUG to guard asserts
Signed-off-by: Eric Engestrom
---
 src/gallium/auxiliary/util/u_cache.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_cache.c b/src/gallium/auxiliary/util/u_cache.c
index c748cb99dd0c468f4342..f14ba97996873a716aff 100644
--- a/src/gallium/auxiliary/util/u_cache.c
+++ b/src/gallium/auxiliary/util/u_cache.c
@@ -56,7 +56,7 @@ struct util_cache_entry
    void *key;
    void *value;
-#ifdef DEBUG
+#ifndef NDEBUG
    unsigned count;
 #endif
 };
@@ -214,7 +214,7 @@ util_cache_set(struct util_cache *cache,
    util_cache_entry_destroy(cache, entry);
-#ifdef DEBUG
+#ifndef NDEBUG
    ++entry->count;
 #endif
@@ -289,7 +289,7 @@ util_cache_destroy(struct util_cache *cache)
    if (!cache)
       return;
-#ifdef DEBUG
+#ifndef NDEBUG
    if (cache->count >= 20*cache->size) {
       /* Normal approximation of the Poisson distribution */
       double mean = (double)cache->count/(double)cache->size;
@@ -341,7 +341,7 @@ util_cache_remove(struct util_cache *cache,
 static void
 ensure_sanity(const struct util_cache *cache)
 {
-#ifdef DEBUG
+#ifndef NDEBUG
    unsigned i, cnt = 0;
    assert(cache);
--
Cheers,
  Eric
[Mesa-dev] [PATCH mesa 09/16] amd/addrlib: add comment about breakpoint not working in some cases
Signed-off-by: Eric Engestrom
---
 src/amd/addrlib/core/addrcommon.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/addrlib/core/addrcommon.h b/src/amd/addrlib/core/addrcommon.h
index 1a7473a4d33c1271f077..9261c5890903d0dc1c0b 100644
--- a/src/amd/addrlib/core/addrcommon.h
+++ b/src/amd/addrlib/core/addrcommon.h
@@ -51,7 +51,7 @@
 #ifdef DEBUG
 #if defined(__GNUC__)
-#define ADDR_DBG_BREAK() assert(false)
+#define ADDR_DBG_BREAK() assert(false) /* FIXME: this can't work on debug builds with asserts off */
 #elif defined(__APPLE__)
 #define ADDR_DBG_BREAK() { IOPanic("");}
 #else
--
Cheers,
  Eric
[Mesa-dev] [PATCH mesa 11/16] broadcom: use NDEBUG to guard asserts
Signed-off-by: Eric Engestrom
---
 src/broadcom/cle/v3d_packet_helpers.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/broadcom/cle/v3d_packet_helpers.h b/src/broadcom/cle/v3d_packet_helpers.h
index c86cad85266f503dbeba..bc1bf3eb76ec94aff16d 100644
--- a/src/broadcom/cle/v3d_packet_helpers.h
+++ b/src/broadcom/cle/v3d_packet_helpers.h
@@ -64,7 +64,7 @@ __gen_uint(uint64_t v, uint32_t start, uint32_t end)
 {
    __gen_validate_value(v);
-#if DEBUG
+#ifndef NDEBUG
    const int width = end - start + 1;
    if (width < 64) {
       const uint64_t max = (1ull << width) - 1;
@@ -82,7 +82,7 @@ __gen_sint(int64_t v, uint32_t start, uint32_t end)
    __gen_validate_value(v);
-#if DEBUG
+#ifndef NDEBUG
    if (width < 64) {
       const int64_t max = (1ll << (width - 1)) - 1;
       const int64_t min = -(1ll << (width - 1));
@@ -99,7 +99,7 @@ static inline uint64_t
 __gen_offset(uint64_t v, uint32_t start, uint32_t end)
 {
    __gen_validate_value(v);
-#if DEBUG
+#ifndef NDEBUG
    uint64_t mask = (~0ull >> (64 - (end - start + 1))) << start;
    assert((v & ~mask) == 0);
@@ -122,7 +122,7 @@ __gen_sfixed(float v, uint32_t start, uint32_t end, uint32_t fract_bits)
    const float factor = (1 << fract_bits);
-#if DEBUG
+#ifndef NDEBUG
    const float max = ((1 << (end - start)) - 1) / factor;
    const float min = -(1 << (end - start)) / factor;
    assert(min <= v && v <= max);
@@ -141,7 +141,7 @@ __gen_ufixed(float v, uint32_t start, uint32_t end, uint32_t fract_bits)
    const float factor = (1 << fract_bits);
-#if DEBUG
+#ifndef NDEBUG
    const float max = ((1 << (end - start + 1)) - 1) / factor;
    const float min = 0.0f;
    assert(min <= v && v <= max);
--
Cheers,
  Eric
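As a rough illustration of what these guarded blocks do, here is a minimal sketch of an unsigned bitfield packer whose range check only exists when assert() is active. `demo_pack_uint()` is a hypothetical stand-in modelled on the `__gen_uint()` hunk above, not a copy of the real helper, and the shift into place is an assumption about what the packer returns.

```c
#include <assert.h>
#include <stdint.h>

/* Place `v` into bits [start, end] of a 64-bit word. In builds with
 * asserts enabled, check first that `v` fits in that many bits. */
static inline uint64_t
demo_pack_uint(uint64_t v, uint32_t start, uint32_t end)
{
#ifndef NDEBUG
   const uint32_t width = end - start + 1;
   /* A 64-bit field can hold any uint64_t, and shifting by 64 would be
    * undefined, so only check narrower fields. */
   if (width < 64) {
      const uint64_t max = (UINT64_C(1) << width) - 1;
      assert(v <= max);
   }
#endif
   return v << start;
}
```

For example, `demo_pack_uint(0x3, 4, 5)` places a 2-bit value at bits 4..5 and yields 0x30. Guarding with `#ifndef NDEBUG` rather than `#if DEBUG` means the helper variables are compiled exactly when the assert() that uses them is, so there are no unused-variable warnings in either configuration.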
[Mesa-dev] [PATCH mesa 07/16] amd/addrlib: use NDEBUG to guard asserts
Signed-off-by: Eric Engestrom --- src/amd/addrlib/core/addrelemlib.cpp | 2 +- src/amd/addrlib/core/addrlib1.cpp| 8 src/amd/addrlib/core/addrlib2.h | 2 +- src/amd/addrlib/gfx9/gfx9addrlib.cpp | 4 ++-- src/amd/addrlib/r800/egbaddrlib.cpp | 2 +- 5 files changed, 9 insertions(+), 9 deletions(-) diff --git a/src/amd/addrlib/core/addrelemlib.cpp b/src/amd/addrlib/core/addrelemlib.cpp index c9e6da4729a5e2ef2aef..910c6ffda27000a4349b 100644 --- a/src/amd/addrlib/core/addrelemlib.cpp +++ b/src/amd/addrlib/core/addrelemlib.cpp @@ -1219,7 +1219,7 @@ VOID ElemLib::AdjustSurfaceInfo( basePitch = basePitch / expandX; width = width / expandX; height= height / expandY; -#if DEBUG +#ifndef NDEBUG width = (width == 0) ? 1 : width; height= (height == 0) ? 1 : height; diff --git a/src/amd/addrlib/core/addrlib1.cpp b/src/amd/addrlib/core/addrlib1.cpp index 86cd66aee43c4487ffe5..4f9ecae10002a7b51c84 100644 --- a/src/amd/addrlib/core/addrlib1.cpp +++ b/src/amd/addrlib/core/addrlib1.cpp @@ -367,12 +367,12 @@ ADDR_E_RETURNCODE Lib::ComputeSurfaceInfo( pOut->pixelPitch= pOut->pitch; pOut->pixelHeight = pOut->height; -#if DEBUG +#ifndef NDEBUG if (localIn.flags.display) { ADDR_ASSERT((pOut->pitchAlign % 32) == 0); } -#endif //DEBUG +#endif //NDEBUG if (localIn.format != ADDR_FMT_INVALID) { @@ -2036,12 +2036,12 @@ ADDR_E_RETURNCODE Lib::ComputeCmaskInfo( UINT_32 slice = (*pPitchOut) * (*pHeightOut); UINT_32 blockMax = slice / 128 / 128 - 1; -#if DEBUG +#ifndef NDEBUG if (slice % (64*256) != 0) { ADDR_ASSERT_ALWAYS(); } -#endif //DEBUG +#endif //NDEBUG UINT_32 maxBlockMax = HwlGetMaxCmaskBlockMax(); diff --git a/src/amd/addrlib/core/addrlib2.h b/src/amd/addrlib/core/addrlib2.h index bea2a485a61aa10990a1..e9cbea8f62ef6ac2b3aa 100644 --- a/src/amd/addrlib/core/addrlib2.h +++ b/src/amd/addrlib/core/addrlib2.h @@ -707,7 +707,7 @@ class Lib : public Addr::Lib VOID VerifyMipLevelInfo(const ADDR2_COMPUTE_SURFACE_INFO_INPUT* pIn) const { -#if DEBUG +#ifndef NDEBUG if (pIn->numMipLevels > 1) { UINT_32 
actualMipLevels = 1; diff --git a/src/amd/addrlib/gfx9/gfx9addrlib.cpp b/src/amd/addrlib/gfx9/gfx9addrlib.cpp index e06f13c0afe01d026a0b..1d42cbfc8a3a50c84343 100644 --- a/src/amd/addrlib/gfx9/gfx9addrlib.cpp +++ b/src/amd/addrlib/gfx9/gfx9addrlib.cpp @@ -196,7 +196,7 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlComputeHtileInfo( metaBlkDim.w <<= widthAmp; metaBlkDim.h <<= heightAmp; -#if DEBUG +#ifndef NDEBUG Dim3d metaBlkDimDbg = {8, 8, 1}; for (UINT_32 index = 0; index < numCompressBlkPerMetaBlkLog2; index++) { @@ -311,7 +311,7 @@ ADDR_E_RETURNCODE Gfx9Lib::HwlComputeCmaskInfo( metaBlkDim.w <<= widthAmp; metaBlkDim.h <<= heightAmp; -#if DEBUG +#ifndef NDEBUG Dim2d metaBlkDimDbg = {8, 8}; for (UINT_32 index = 0; index < numCompressBlkPerMetaBlkLog2; index++) { diff --git a/src/amd/addrlib/r800/egbaddrlib.cpp b/src/amd/addrlib/r800/egbaddrlib.cpp index 854d4cbe8ad87f70db14..ab5e554da7b190562558 100644 --- a/src/amd/addrlib/r800/egbaddrlib.cpp +++ b/src/amd/addrlib/r800/egbaddrlib.cpp @@ -3861,7 +3861,7 @@ ADDR_E_RETURNCODE EgBasedLib::HwlComputeSurfaceInfo( // Resets pTileInfo to NULL if the internal tile info is used if (pOut->pTileInfo == &tileInfo) { -#if DEBUG +#ifndef NDEBUG // Client does not pass in a valid pTileInfo if (IsMacroTiled(pOut->tileMode)) { -- Cheers, Eric ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH mesa 10/16] vc4: check preprocessor token existence using #ifdef instead of #if
(other uses of USE_VC4_SIMULATOR are already correct)

Signed-off-by: Eric Engestrom
---
 src/gallium/drivers/vc4/vc4_screen.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/vc4/vc4_screen.c b/src/gallium/drivers/vc4/vc4_screen.c
index a42ba675c130c25976be..74dd9d5767e38dbc461b 100644
--- a/src/gallium/drivers/vc4/vc4_screen.c
+++ b/src/gallium/drivers/vc4/vc4_screen.c
@@ -64,7 +64,7 @@ static const struct debug_named_value debug_options[] = {
           "Flush after each draw call" },
         { "always_sync", VC4_DEBUG_ALWAYS_SYNC,
           "Wait for finish after each flush" },
-#if USE_VC4_SIMULATOR
+#ifdef USE_VC4_SIMULATOR
         { "dump", VC4_DEBUG_DUMP,
           "Write a GPU command stream trace file" },
 #endif
@@ -105,7 +105,7 @@ vc4_screen_destroy(struct pipe_screen *pscreen)
         slab_destroy_parent(&screen->transfer_pool);
         free(screen->ro);
-#if USE_VC4_SIMULATOR
+#ifdef USE_VC4_SIMULATOR
         vc4_simulator_destroy(screen);
 #endif
@@ -710,7 +710,7 @@ vc4_screen_create(int fd, struct renderonly *ro)
         if (vc4_debug & VC4_DEBUG_SHADERDB)
                 vc4_debug |= VC4_DEBUG_NORAST;
-#if USE_VC4_SIMULATOR
+#ifdef USE_VC4_SIMULATOR
         vc4_simulator_init(screen);
 #endif
--
Cheers,
  Eric
[Mesa-dev] [PATCH mesa 04/16] amd: remove always-true BRAHMA_BUILD define
Signed-off-by: Eric Engestrom --- src/amd/Android.addrlib.mk | 2 -- src/amd/Makefile.addrlib.am | 3 +-- src/amd/addrlib/core/addrcommon.h| 15 ++- src/amd/addrlib/meson.build | 2 +- src/gallium/winsys/amdgpu/drm/Android.mk | 4 +--- 5 files changed, 5 insertions(+), 21 deletions(-) diff --git a/src/amd/Android.addrlib.mk b/src/amd/Android.addrlib.mk index a29f7c16d1797b5943c4..a37112d7154de568c815 100644 --- a/src/amd/Android.addrlib.mk +++ b/src/amd/Android.addrlib.mk @@ -30,8 +30,6 @@ LOCAL_MODULE := libmesa_amdgpu_addrlib LOCAL_SRC_FILES := $(ADDRLIB_FILES) -LOCAL_CFLAGS := -DBRAHMA_BUILD=1 - LOCAL_C_INCLUDES := \ $(MESA_TOP)/src \ $(MESA_TOP)/src/amd/common \ diff --git a/src/amd/Makefile.addrlib.am b/src/amd/Makefile.addrlib.am index 90dfe9634454218bee82..322a5c86973b9238c74c 100644 --- a/src/amd/Makefile.addrlib.am +++ b/src/amd/Makefile.addrlib.am @@ -29,8 +29,7 @@ addrlib_libamdgpu_addrlib_la_CPPFLAGS = \ -I$(srcdir)/addrlib/inc/chip/gfx9 \ -I$(srcdir)/addrlib/inc/chip/r800 \ -I$(srcdir)/addrlib/gfx9/chip \ - -I$(srcdir)/addrlib/r800/chip \ - -DBRAHMA_BUILD=1 + -I$(srcdir)/addrlib/r800/chip addrlib_libamdgpu_addrlib_la_CXXFLAGS = \ $(VISIBILITY_CXXFLAGS) $(CXX11_CXXFLAGS) diff --git a/src/amd/addrlib/core/addrcommon.h b/src/amd/addrlib/core/addrcommon.h index 62f8ac61618e65db6768..99bb62e77f446f912d35 100644 --- a/src/amd/addrlib/core/addrcommon.h +++ b/src/amd/addrlib/core/addrcommon.h @@ -40,7 +40,7 @@ #include #include -#if BRAHMA_BUILD && !defined(DEBUG) +#if !defined(DEBUG) #ifdef NDEBUG #define DEBUG 0 #else @@ -73,18 +73,7 @@ #define ADDR_ANALYSIS_ASSUME(expr) do { (void)(expr); } while (0) #endif -#if BRAHMA_BUILD -#define ADDR_ASSERT(__e) assert(__e) -#elif DEBUG -#define ADDR_ASSERT(__e)\ -do {\ -ADDR_ANALYSIS_ASSUME(__e); \ -if ( !((__e) ? 
TRUE : FALSE)) { ADDR_DBG_BREAK(); } \ -} while (0) -#else //DEBUG -#define ADDR_ASSERT(__e) ADDR_ANALYSIS_ASSUME(__e) -#endif //DEBUG - +#define ADDR_ASSERT(__e) assert(__e) #define ADDR_ASSERT_ALWAYS() ADDR_DBG_BREAK() #define ADDR_UNHANDLED_CASE() ADDR_ASSERT(!"Unhandled case") #define ADDR_NOT_IMPLEMENTED() ADDR_ASSERT(!"Not implemented"); diff --git a/src/amd/addrlib/meson.build b/src/amd/addrlib/meson.build index 1a7f2fdef5d8e347b5ad..ed0dde6245b3396305fc 100644 --- a/src/amd/addrlib/meson.build +++ b/src/amd/addrlib/meson.build @@ -56,5 +56,5 @@ libamdgpu_addrlib = static_library( include_directories : include_directories( 'core', 'inc/chip/gfx9', 'inc/chip/r800', 'gfx9/chip', 'r800/chip', '../common', '../../'), - cpp_args : [cpp_vis_args, '-DBRAHMA_BUILD=1'], + cpp_args : cpp_vis_args, ) diff --git a/src/gallium/winsys/amdgpu/drm/Android.mk b/src/gallium/winsys/amdgpu/drm/Android.mk index a05304ae5dea571e0a67..6e84a0c8de1a8e503ef4 100644 --- a/src/gallium/winsys/amdgpu/drm/Android.mk +++ b/src/gallium/winsys/amdgpu/drm/Android.mk @@ -30,9 +30,7 @@ include $(CLEAR_VARS) LOCAL_SRC_FILES := $(C_SOURCES) -LOCAL_CFLAGS := \ - $(AMDGPU_CFLAGS) \ - -DBRAHMA_BUILD=1 +LOCAL_CFLAGS := $(AMDGPU_CFLAGS) LOCAL_STATIC_LIBRARIES := libmesa_amdgpu_addrlib -- Cheers, Eric ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH mesa 02/16] anv: tie anv_assert() enablement to regular assert()
Signed-off-by: Eric Engestrom
---
 src/intel/vulkan/anv_private.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_private.h
index 6d4e43f2e687cbf26ccd..6474abf0f3694c7fcd3a 100644
--- a/src/intel/vulkan/anv_private.h
+++ b/src/intel/vulkan/anv_private.h
@@ -382,7 +382,7 @@ void anv_debug_report(struct anv_instance *instance,
    } while (0)
 /* A non-fatal assert. Useful for debugging. */
-#ifdef DEBUG
+#ifndef NDEBUG
 #define anv_assert(x) ({ \
    if (unlikely(!(x))) \
       intel_loge("%s:%d ASSERT: %s", __FILE__, __LINE__, #x); \
--
Cheers,
  Eric
[Mesa-dev] [PATCH mesa 03/16] radv: tie radv_assert() enablement to regular assert()
Signed-off-by: Eric Engestrom
---
 src/amd/vulkan/radv_private.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv_private.h
index addd35e5ce10fa37bd3f..327ce5415947abc1712d 100644
--- a/src/amd/vulkan/radv_private.h
+++ b/src/amd/vulkan/radv_private.h
@@ -235,7 +235,7 @@ void radv_loge_v(const char *format, va_list va);
    } while (0)
 /* A non-fatal assert. Useful for debugging. */
-#ifdef DEBUG
+#ifndef NDEBUG
 #define radv_assert(x) ({ \
    if (unlikely(!(x))) \
       fprintf(stderr, "%s:%d ASSERT: %s\n", __FILE__, __LINE__, #x); \
--
Cheers,
  Eric
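The "non-fatal assert" idea in the anv and radv patches can be sketched portably as below. This is a minimal sketch under stated assumptions: `demo_assert()`, `demo_assert_failures` and `demo_clamp_index()` are hypothetical names, the macro uses a plain do/while instead of the GNU statement expression seen in the diff, and the failure counter is an addition for illustration that the drivers do not have.

```c
#include <assert.h>
#include <stdio.h>

/* Counts failed non-fatal checks so callers can observe them
 * (illustration only). */
static unsigned demo_assert_failures;

#ifndef NDEBUG
/* Enabled together with the regular assert(): log and keep going. */
#define demo_assert(x) \
   do { \
      if (!(x)) { \
         demo_assert_failures++; \
         fprintf(stderr, "%s:%d ASSERT: %s\n", __FILE__, __LINE__, #x); \
      } \
   } while (0)
#else
/* Compiled out together with the regular assert(). */
#define demo_assert(x) do { } while (0)
#endif

/* Clamp an index into [0, count - 1], complaining if it was out of range. */
static int demo_clamp_index(int index, int count)
{
   assert(count > 0);                          /* fatal: caller bug */
   demo_assert(index >= 0 && index < count);   /* non-fatal: log only */
   if (index < 0)
      return 0;
   return index < count ? index : count - 1;
}
```

Keying both macros to NDEBUG means a single switch controls every kind of assertion, rather than one macro for assert() and a separate DEBUG macro for the driver-specific variant.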
[Mesa-dev] [PATCH mesa 05/16] amd/addrlib: remove duplicate definition of ADDR_DBG_BREAK()
It's already defined above (around line 60).

Signed-off-by: Eric Engestrom
---
 src/amd/addrlib/core/addrcommon.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/amd/addrlib/core/addrcommon.h b/src/amd/addrlib/core/addrcommon.h
index 99bb62e77f446f912d35..80ea0eb247aa48dcb176 100644
--- a/src/amd/addrlib/core/addrcommon.h
+++ b/src/amd/addrlib/core/addrcommon.h
@@ -143,8 +143,6 @@
 #define ADDR_PRNT(a)
-#define ADDR_DBG_BREAK()
-
 #define ADDR_INFO(cond, a)
 #define ADDR_WARN(cond, a)
--
Cheers,
  Eric
[Mesa-dev] [PATCH mesa 01/16] c99_{compat, math}.h: check for macro existence before using it
Fixes thousands of warnings when using -Wundef, which is about to land,
as these two files are included virtually everywhere.

Signed-off-by: Eric Engestrom
---
 include/c99_compat.h | 6 +++---
 include/c99_math.h   | 5 +++--
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/include/c99_compat.h b/include/c99_compat.h
index cb690c6e2a0180702a5b..008630b7868b77b4b5fd 100644
--- a/include/c99_compat.h
+++ b/include/c99_compat.h
@@ -81,7 +81,7 @@
 /* Intel compiler supports inline keyword */
 # elif defined(__WATCOMC__) && (__WATCOMC__ >= 1100)
 #define inline __inline
-# elif (__STDC_VERSION__ >= 199901L)
+# elif defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L)
 /* C99 supports inline keyword */
 # else
 #define inline
@@ -96,7 +96,7 @@
  * - http://cellperformance.beyond3d.com/articles/2006/05/demystifying-the-restrict-keyword.html
  */
 #ifndef restrict
-# if (__STDC_VERSION__ >= 199901L)
+# if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L)
 /* C99 */
 # elif defined(__GNUC__)
 #define restrict __restrict__
@@ -112,7 +112,7 @@
  * C99 __func__ macro
  */
 #ifndef __func__
-# if (__STDC_VERSION__ >= 199901L)
+# if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L)
 /* C99 */
 # elif defined(__GNUC__)
 #define __func__ __FUNCTION__
diff --git a/include/c99_math.h b/include/c99_math.h
index e906c26aa543bc1adfd1..629ee26ba2a803242950 100644
--- a/include/c99_math.h
+++ b/include/c99_math.h
@@ -45,7 +45,7 @@
 #if !defined(_MSC_VER) && \
-    __STDC_VERSION__ < 199901L && \
+    defined(__STDC_VERSION__) && __STDC_VERSION__ < 199901L && \
     (!defined(_XOPEN_SOURCE) || _XOPEN_SOURCE < 600) && \
     !defined(__cplusplus)
@@ -190,7 +190,8 @@ fpclassify(double x)
  * undefines those functions, which in glibc 2.23, are defined as macros rather
  * than functions as in glibc 2.22.
  */
-#if __cplusplus >= 201103L && (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 23))
+#if defined(__cplusplus) && __cplusplus >= 201103L && \
+    defined(__GLIBC__) && (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 23))
 #include
 using std::fpclassify;
--
Cheers,
  Eric
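The `defined(X) && X <op> N` guard used throughout this patch can be sketched as follows, assuming a hypothetical macro `DEMO_STD_VERSION` that stands in for `__STDC_VERSION__` and is deliberately left undefined. One detail that may be worth double-checking when applying the pattern: for a `>=` test the bare and guarded forms give the same answer when the macro is undefined, but for a `<` test (as in the c99_math.h hunk) they do not, because the bare form compares 0 against the threshold.

```c
#include <stdbool.h>

/* DEMO_STD_VERSION is a hypothetical stand-in for __STDC_VERSION__ and is
 * deliberately never defined, as on a pre-C95 compiler. */

/* Bare ">=": undefined becomes 0, and 0 >= 199901L is false. */
static bool at_least_c99_bare(void)
{
#if (DEMO_STD_VERSION >= 199901L)
   return true;
#else
   return false;
#endif
}

/* Guarded ">=": defined() is false, so the whole test is false. */
static bool at_least_c99_guarded(void)
{
#if defined(DEMO_STD_VERSION) && (DEMO_STD_VERSION >= 199901L)
   return true;
#else
   return false;
#endif
}

/* Bare "<": undefined becomes 0, and 0 < 199901L is TRUE. */
static bool older_than_c99_bare(void)
{
#if DEMO_STD_VERSION < 199901L
   return true;
#else
   return false;
#endif
}

/* Guarded "<": defined() is false, so the whole test is FALSE. */
static bool older_than_c99_guarded(void)
{
#if defined(DEMO_STD_VERSION) && DEMO_STD_VERSION < 199901L
   return true;
#else
   return false;
#endif
}
```

Whether that difference matters for c99_math.h depends on which compilers leave `__STDC_VERSION__` undefined while still needing the fallback definitions; the sketch only shows that the two spellings are not interchangeable for `<`.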
Re: [Mesa-dev] [PATCH 1/3] util: Fix SHA1 implementation on big endian
On Fri, 2017-11-24 at 13:32 +, Emil Velikov wrote:
> On 24 November 2017 at 10:25, Eric Engestrom wrote:
> > On Thursday, 2017-11-23 11:08:04 -0800, Matt Turner wrote:
> > > The code defines a macro blk0(i) based on the preprocessor condition
> > > BYTE_ORDER == LITTLE_ENDIAN. If true, blk0(i) is defined as a byte swap
> > > operation. Unfortunately, if the preprocessor macros used in the test
> > > are not defined, then the comparison becomes 0 == 0 and it evaluates as
> > > true.
> > > ---
> > > src/util/sha1/sha1.c | 9 +
> > > 1 file changed, 9 insertions(+)
> > >
> > > diff --git a/src/util/sha1/sha1.c b/src/util/sha1/sha1.c
> > > index ef59ea1dfc..2c629520c3 100644
> > > --- a/src/util/sha1/sha1.c
> > > +++ b/src/util/sha1/sha1.c
> > > @@ -16,8 +16,17 @@
> > >
> > > #include
> > > #include
> > > +#include "u_endian.h"
> >
> > If you're including this header, why not use
> > `#ifdef PIPE_ARCH_LITTLE_ENDIAN` instead of
> > `#if BYTE_ORDER == LITTLE_ENDIAN`?
> >
> > `#error` becomes unnecessary
> >
>
> I won't bother with that - we do want to address all the cases where
> undefined macro is used in #if statement.
> This is handled by -Wundef which seemingly is not part of Wall and we
> don't use - patch incoming in a second.
>
> We want to make it a Werror=undef after we fix the ~60 issues. More
> than half of those are coming from gtest :-\
>
> Thanks for fixing these Matt. As-is the series is
> Reviewed-by: Emil Velikov

Matt, this is breaking Windows compilation.

You can check the Appveyor output here:
https://ci.appveyor.com/project/AndresGomez/mesa-dwep2/build/488

I've also seen a similar error with the MinGW cross-compilation to Windows:

--
src/util/hash_table.c: In function ‘_mesa_hash_table_u64_insert’:
src/util/hash_table.c:591:42: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
 _mesa_hash_table_insert(ht->table, (void *)key, data);
 ^
src/util/hash_table.c: In function ‘hash_table_u64_search’:
src/util/hash_table.c:607:49: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
 return _mesa_hash_table_search(ht->table, (void *)key);
 ^
src/util/sha1/sha1.c:23:2: error: #error BYTE_ORDER not defined
 #error BYTE_ORDER not defined
 ^
src/util/sha1/sha1.c:27:2: error: #error LITTLE_ENDIAN no defined
 #error LITTLE_ENDIAN no defined
 ^
scons: *** [build/windows-x86-debug/util/sha1/sha1.o] Error 1
scons: building terminated because of errors.
--

Br,
Andres
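The failure mode discussed in this thread can be reproduced in a few lines. This is only a sketch: `DEMO_BYTE_ORDER` and `DEMO_LITTLE_ENDIAN` are hypothetical macros that are never defined, standing in for `BYTE_ORDER` and `LITTLE_ENDIAN` on a platform whose headers do not provide them. The last function, `demo_load_be32()`, is not what the patch does; it is a different, commonly used technique that reads a big-endian word with shifts and therefore needs no byte-order macro at all.

```c
#include <stdbool.h>
#include <stdint.h>

/* Neither DEMO_BYTE_ORDER nor DEMO_LITTLE_ENDIAN is defined anywhere. */

/* Naive test: both names become 0, so this is "#if 0 == 0" and the
 * little-endian branch is chosen on every platform. */
static bool naive_says_little_endian(void)
{
#if DEMO_BYTE_ORDER == DEMO_LITTLE_ENDIAN
   return true;
#else
   return false;
#endif
}

/* Guarded test: only trust the comparison if both macros exist. */
static bool guarded_says_little_endian(void)
{
#if defined(DEMO_BYTE_ORDER) && defined(DEMO_LITTLE_ENDIAN) && \
    DEMO_BYTE_ORDER == DEMO_LITTLE_ENDIAN
   return true;
#else
   return false;
#endif
}

/* Alternative that sidesteps detection: assemble the big-endian value
 * from individual bytes, which gives the same result on any host. */
static uint32_t demo_load_be32(const unsigned char *p)
{
   return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
          ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

The guarded form avoids the silent misdetection, but as the build log above shows, something still has to define the macros (or a fallback has to exist) on platforms such as Windows.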
[Mesa-dev] [Bug 103814] incorrect dust rendering in hl2 without sisched
https://bugs.freedesktop.org/show_bug.cgi?id=103814

Hleb Valoshka <375...@gmail.com> changed:

           What    |Removed                     |Added
            Summary|incorrect dust rendering in |incorrect dust rendering in
                   |hl2ep1 without some         |hl2 without sisched
                   |R600_DEBUG options          |
         QA Contact|dri-devel@lists.freedesktop |mesa-dev@lists.freedesktop.
                   |.org                        |org
           Assignee|dri-devel@lists.freedesktop |mesa-dev@lists.freedesktop.
                   |.org                        |org

--- Comment #2 from Hleb Valoshka <375...@gmail.com> ---
I've tested other HL2 titles (HL2, EP2, Lost Coast); they all have this issue.
But it can be worked around by using sisched.

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
Re: [Mesa-dev] GBM and the Device Memory Allocator Proposals
On November 24, 2017 09:45:07 Jason Ekstrand wrote:
> On November 23, 2017 09:00:05 Emil Velikov wrote:
> > Hi James,
> >
> > On 21 November 2017 at 01:11, James Jones wrote:
> > > -I have also heard some general comments that regardless of the
> > > relationship between GBM and the new allocator mechanisms, it might be
> > > time to move GBM out of Mesa so it can be developed as a stand-alone
> > > project. I'd be interested what others think about that, as it would be
> > > something worth coordinating with any other new development based on or
> > > inside of GBM.
> >
> > Having a GBM frontend is one thing I've been pondering as well.
> > Regardless of exact solution wrt the new allocator, having a clear
> > frontend/backend separation for GBM will be beneficial.
> > I'll be giving it a stab these days.
>
> I'm not sure what you mean by that. It currently has something that looks
> like separation but it's a joke. Unless we have a real reason to have
> anything other than a dri_interface back-end, I'd rather we just stop
> pretending and drop the extra layer of function pointer indirection
> entirely.

Gah! I didn't read Rob's email before writing this. It looks like there is
a use-case for this. I'm still a bit skeptical about whether or not we
really want to extend what we have or if it would be better to start over
and just require that the new thing also support the current GBM ABI.

--Jason

> > Disclaimer: Mostly thinking out loud, so please take the following with
> > a grain of salt.
> >
> > On the details wrt the new allocator project, I think that having a new
> > lean library would be a good idea. One could borrow ideas from GBM, but
> > by default no connection between the two should be required.
> >
> > That might make the initial hurdle of porting a bit harder, but it
> > will allow for more efficient driver implementation.
> >
> > HTH
> > Emil
Re: [Mesa-dev] GBM and the Device Memory Allocator Proposals
On November 23, 2017 09:00:05 Emil Velikov wrote:
> Hi James,
>
> On 21 November 2017 at 01:11, James Jones wrote:
> > -I have also heard some general comments that regardless of the
> > relationship between GBM and the new allocator mechanisms, it might be
> > time to move GBM out of Mesa so it can be developed as a stand-alone
> > project. I'd be interested what others think about that, as it would be
> > something worth coordinating with any other new development based on or
> > inside of GBM.
>
> Having a GBM frontend is one thing I've been pondering as well.
> Regardless of exact solution wrt the new allocator, having a clear
> frontend/backend separation for GBM will be beneficial.
> I'll be giving it a stab these days.

I'm not sure what you mean by that. It currently has something that looks
like separation but it's a joke. Unless we have a real reason to have
anything other than a dri_interface back-end, I'd rather we just stop
pretending and drop the extra layer of function pointer indirection
entirely.

--Jason

> Disclaimer: Mostly thinking out loud, so please take the following with
> a grain of salt.
>
> On the details wrt the new allocator project, I think that having a new
> lean library would be a good idea. One could borrow ideas from GBM, but by
> default no connection between the two should be required.
>
> That might make the initial hurdle of porting a bit harder, but it
> will allow for more efficient driver implementation.
>
> HTH
> Emil
[Mesa-dev] [Bug 99125] Log to a file all GALLIUM_HUD infos
https://bugs.freedesktop.org/show_bug.cgi?id=99125

--- Comment #3 from Shmerl ---
(In reply to Edmondo Tommasina from comment #2)
> FYI: Marek pushed the series of patches to mesa git master.
>
> https://cgit.freedesktop.org/mesa/mesa/commit/?id=3f5fba8a7be61bfc0f46a5ea058108f6e0e1c268

Shouldn't this be closed then?

--
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
Re: [Mesa-dev] [PATCH 3/5] scons: add Wundef to the build flags
On 24/11/17 14:25, Emil Velikov wrote:
> From: Emil Velikov
>
> Analogous to the other build systems.
>
> Cc: Jose Fonseca
> Cc: Brian Paul
> Signed-off-by: Emil Velikov
> ---
>  scons/gallium.py | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/scons/gallium.py b/scons/gallium.py
> index ef3b2ee81ae..74793a2525c 100755
> --- a/scons/gallium.py
> +++ b/scons/gallium.py
> @@ -455,6 +455,7 @@ def generate(env):
>          # - http://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
>          ccflags += [
>              '-Wall',
> +            '-Wundef',
>              '-Wno-long-long',
>              '-fmessage-length=0', # be nice to Eclipse
>          ]

Sounds good to me.

Reviewed-by: Jose Fonseca
Re: [Mesa-dev] GBM and the Device Memory Allocator Proposals
On Mon, Nov 20, 2017 at 8:11 PM, James Jones wrote:
> As many here know at this point, I've been working on solving issues related
> to DMA-capable memory allocation for various devices for some time now. I'd
> like to take this opportunity to apologize for the way I handled the EGL
> stream proposals. I understand now that the development process followed
> there was unacceptable to the community and likely offended many great
> engineers.
>
> Moving forward, I attempted to reboot talks in a more constructive manner
> with the generic allocator library proposals & discussion forum at XDC 2016.
> Some great design ideas came out of that, and I've since been prototyping
> some code to prove them out before bringing them back as official proposals.
> Again, I understand some people are growing concerned that I've been doing
> this off on the side in a github project that has primarily NVIDIA
> contributors. My goal was only to avoid wasting everyone's time with
> unproven ideas. The intent was never to dump the prototype code as-is on
> the community and presume acceptance. It's just a public research project.
>
> Now the prototyping is nearing completion, and I'd like to renew discussion
> on whether and how the new mechanisms can be integrated with the Linux
> graphics stack.
>
> I'd be interested to know if more work is needed to demonstrate the
> usefulness of the new mechanisms, or whether people think they have value at
> this point.
>
> After talking with people on the hallway track at XDC this year, I've heard
> several proposals for incorporating the new mechanisms:
>
> -Include ideas from the generic allocator design into GBM. This could take
> the form of designing a "GBM 2.0" API, or incrementally adding to the
> existing GBM API.
>
> -Develop a library to replace GBM. The allocator prototype code could be
> massaged into something production worthy to jump start this process.
>
> -Develop a library that sits beside or on top of GBM, using GBM for
> low-level graphics buffer allocation, while supporting non-graphics kernel
> APIs directly. The additional cross-device negotiation and sorting of
> capabilities would be handled in this slightly higher-level API before
> handing off to GBM and other APIs for actual allocation somehow.

tbh, I kinda see GBM and $new_thing sitting side by side.. GBM is still the "winsys" for running on "bare metal" (ie. kms). And we don't want to saddle $new_thing with aspects of that, but rather have it focus on being the thing that in multiple-"device"[1] scenarios figures out what sort of buffer can be allocated by who for sharing. Ie $new_thing should really not care about winsys level things like cursors or surfaces.. only buffers.

The mesa implementation of $new_thing could sit on top of GBM, although it could also just sit on top of the same internal APIs that GBM sits on top of. That is an implementation detail. It could be that GBM grows an API to return an instance of $new_thing for use-cases that involve sharing a buffer with the GPU. Or perhaps that is exposed via some sort of EGL extension. (We probably also need a way to get an instance from libdrm (?) for display-only KMS drivers, to cover cases like etnaviv sharing a buffer with a separate display driver.)

[1] where "devices" could be multiple GPUs or multiple APIs for one or more GPUs, but also includes non-GPU devices like camera, video decoder, "image processor" (which may or may not be part of camera), etc, etc

> -I have also heard some general comments that regardless of the relationship
> between GBM and the new allocator mechanisms, it might be time to move GBM
> out of Mesa so it can be developed as a stand-alone project. I'd be
> interested what others think about that, as it would be something worth
> coordinating with any other new development based on or inside of GBM.
+1

We already have at least a couple different non-mesa implementations of GBM (which afaict tend to lag behind mesa's GBM and cause headaches).

The extracted part probably isn't much more than a header and shim. But probably does need to grow some versioning for the backend to know if, for example, gbm->bo_map() is supported.. at least it could provide stubs that return an error, rather than having link-time fail if building something w/ $vendor's old gbm implementation.

> And of course I'm open to any other ideas for integration. Beyond just
> where this code would live, there is much to debate about the mechanisms
> themselves and all the implementation details. I was just hoping to kick
> things off with something high level to start.

My $0.02, is that the place where devel happens and place to go for releases could be different. Either way, I would like to see git tree for tagged release versions live on fd.o and use the common release process[2] for generating/uploading release tarballs that distros can use.

[2] https://cgit.freedesktop.org/xorg/util/modular/tree/release.sh

> For reference, t
[Mesa-dev] [Bug 103868] VK_PRESENT_MODE_MAILBOX_KHR blacks out the whole screen intermittently when using X11 compositing window managers
https://bugs.freedesktop.org/show_bug.cgi?id=103868

Michel Dänzer changed:

           What    |Removed                        |Added
----------------------------------------------------------------------
           Assignee|mesa-dev@lists.freedesktop.org |xorg-t...@lists.x.org
          Component|Drivers/Vulkan/radeon          |Driver/modesetting
            Version|17.2                           |unspecified
            Product|Mesa                           |xorg
         QA Contact|mesa-dev@lists.freedesktop.org |xorg-t...@lists.x.org
[Mesa-dev] [Bug 103868] VK_PRESENT_MODE_MAILBOX_KHR blacks out the whole screen intermittently when using X11 compositing window managers
https://bugs.freedesktop.org/show_bug.cgi?id=103868

--- Comment #5 from Spencer Brown ---
Okay, it looks like this doesn't happen with the AMDGPU driver. It's only a problem with modesetting.
[Mesa-dev] [PATCH] mesa: Add GL4.6 aliases of functions from GL_ARB_indirect_parameters
--- src/mapi/glapi/gen/GL4x.xml | 22 ++ 1 file changed, 22 insertions(+) diff --git a/src/mapi/glapi/gen/GL4x.xml b/src/mapi/glapi/gen/GL4x.xml index 88dba5c..ea28d8e 100644 --- a/src/mapi/glapi/gen/GL4x.xml +++ b/src/mapi/glapi/gen/GL4x.xml @@ -73,6 +73,28 @@ + + + + + + + + + + + + + + + + + + + + -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 103868] VK_PRESENT_MODE_MAILBOX_KHR blacks out the whole screen intermittently when using X11 compositing window managers
https://bugs.freedesktop.org/show_bug.cgi?id=103868

--- Comment #4 from Michel Dänzer ---
(In reply to Spencer Brown from comment #3)
> dmesg
>
> Nothing really seems interesting to me here.

Did you capture it after reproducing the problem as well?

This doesn't look directly related to RADV but rather like some kind of resource shortage preventing page flipping from working for the compositor. Does this also happen with xf86-video-amdgpu instead of the modesetting driver? If yes, please attach the Xorg log file from that as well.
Re: [Mesa-dev] [PATCH 1/5] configure.ac: add Wundef to the build flags
On Friday, 2017-11-24 14:25:02 +, Emil Velikov wrote:
> From: Emil Velikov
>
> From the manual:
>   Warn if an undefined identifier is evaluated in an `#if' directive.
>
> This is something we want to know and address. Otherwise we can end up
> with subtle issues, in the less commonly used codepaths.
>
> Note: this will trigger a lot of extra warnings, with ~60 of those being
> unique. Once all those are resolved we'd want to promote the warning to
> an error.

Yes please; series is
Reviewed-by: Eric Engestrom

>
> Cc: Matt Turner
> Signed-off-by: Emil Velikov
> ---
>  configure.ac | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/configure.ac b/configure.ac
> index 1344c12884f..ba7dda7b575 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -291,6 +291,7 @@ dnl
>  dnl Check compiler flags
>  dnl
>  AX_CHECK_COMPILE_FLAG([-Wall],
> [CFLAGS="$CFLAGS -Wall"])
> +AX_CHECK_COMPILE_FLAG([-Wundef],
> [CFLAGS="$CFLAGS -Wundef"])
>  AX_CHECK_COMPILE_FLAG([-Werror=implicit-function-declaration],
> [CFLAGS="$CFLAGS -Werror=implicit-function-declaration"])
>  AX_CHECK_COMPILE_FLAG([-Werror=missing-prototypes],
> [CFLAGS="$CFLAGS -Werror=missing-prototypes"])
>  AX_CHECK_COMPILE_FLAG([-Wmissing-prototypes],
> [CFLAGS="$CFLAGS -Wmissing-prototypes"])
> @@ -303,6 +304,7 @@ dnl Check C++ compiler flags
>  dnl
>  AC_LANG_PUSH([C++])
>  AX_CHECK_COMPILE_FLAG([-Wall],
> [CXXFLAGS="$CXXFLAGS -Wall"])
> +AX_CHECK_COMPILE_FLAG([-Wundef],
> [CXXFLAGS="$CXXFLAGS -Wundef"])
>  AX_CHECK_COMPILE_FLAG([-fno-math-errno],
> [CXXFLAGS="$CXXFLAGS -fno-math-errno"])
>  AX_CHECK_COMPILE_FLAG([-fno-trapping-math],
> [CXXFLAGS="$CXXFLAGS -fno-trapping-math"])
>  AX_CHECK_COMPILE_FLAG([-fvisibility=hidden],
> [VISIBILITY_CXXFLAGS="-fvisibility=hidden"])
> --
> 2.14.1
>
[Mesa-dev] [PATCH 1/5] configure.ac: add Wundef to the build flags
From: Emil Velikov

From the manual:
  Warn if an undefined identifier is evaluated in an `#if' directive.

This is something we want to know and address. Otherwise we can end up with subtle issues, in the less commonly used codepaths.

Note: this will trigger a lot of extra warnings, with ~60 of those being unique. Once all those are resolved we'd want to promote the warning to an error.

Cc: Matt Turner
Signed-off-by: Emil Velikov
---
 configure.ac | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/configure.ac b/configure.ac
index 1344c12884f..ba7dda7b575 100644
--- a/configure.ac
+++ b/configure.ac
@@ -291,6 +291,7 @@ dnl
 dnl Check compiler flags
 dnl
 AX_CHECK_COMPILE_FLAG([-Wall], [CFLAGS="$CFLAGS -Wall"])
+AX_CHECK_COMPILE_FLAG([-Wundef], [CFLAGS="$CFLAGS -Wundef"])
 AX_CHECK_COMPILE_FLAG([-Werror=implicit-function-declaration], [CFLAGS="$CFLAGS -Werror=implicit-function-declaration"])
 AX_CHECK_COMPILE_FLAG([-Werror=missing-prototypes], [CFLAGS="$CFLAGS -Werror=missing-prototypes"])
 AX_CHECK_COMPILE_FLAG([-Wmissing-prototypes], [CFLAGS="$CFLAGS -Wmissing-prototypes"])
@@ -303,6 +304,7 @@ dnl Check C++ compiler flags
 dnl
 AC_LANG_PUSH([C++])
 AX_CHECK_COMPILE_FLAG([-Wall], [CXXFLAGS="$CXXFLAGS -Wall"])
+AX_CHECK_COMPILE_FLAG([-Wundef], [CXXFLAGS="$CXXFLAGS -Wundef"])
 AX_CHECK_COMPILE_FLAG([-fno-math-errno], [CXXFLAGS="$CXXFLAGS -fno-math-errno"])
 AX_CHECK_COMPILE_FLAG([-fno-trapping-math], [CXXFLAGS="$CXXFLAGS -fno-trapping-math"])
 AX_CHECK_COMPILE_FLAG([-fvisibility=hidden], [VISIBILITY_CXXFLAGS="-fvisibility=hidden"])
-- 
2.14.1
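For readers unfamiliar with the failure mode that -Wundef catches, the following is a minimal sketch in C, not code from the patch. The macro names DEMO_UNSET_BYTE_ORDER and DEMO_UNSET_LITTLE_ENDIAN are hypothetical and are deliberately never defined. Inside an `#if`, the preprocessor replaces any identifier that is not a macro with 0, so the comparison silently becomes 0 == 0 and the "little endian" branch is taken regardless of the actual platform.

```c
#include <assert.h>

/*
 * Hypothetical names: neither macro is defined anywhere in this file.
 * The preprocessor substitutes 0 for both, so the condition is 0 == 0.
 * With -Wundef the compiler warns about each undefined identifier here;
 * without it, nothing is reported.
 */
#if DEMO_UNSET_BYTE_ORDER == DEMO_UNSET_LITTLE_ENDIAN
#define DEMO_GUESSED_LITTLE_ENDIAN 1
#else
#define DEMO_GUESSED_LITTLE_ENDIAN 0
#endif

/* Returns 1 when the unguarded comparison above evaluated as true. */
static int demo_unguarded_check_passed(void)
{
    return DEMO_GUESSED_LITTLE_ENDIAN;
}
```

Compiling this sketch with and without -Wundef should show the difference: the generated code is identical, only the diagnostic changes.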
[Mesa-dev] [PATCH 5/5] Android: copy -fno*math* options from the autotools build
From: Emil Velikov

Add -fno-math-errno and -fno-trapping-math to the build.

Mesa does not depend on the functionality provided, thus this should result in slightly faster code and smaller binaries.

Cc: Tapani Pälli
Cc: Rob Herring
Signed-off-by: Emil Velikov
---
Gents, please do some basic runtime checks. This should work fine OOTB although one can never be too sure considering the different compiler/C runtime used on Android.
---
 Android.common.mk | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Android.common.mk b/Android.common.mk
index 544d813c02d..d97974b1c84 100644
--- a/Android.common.mk
+++ b/Android.common.mk
@@ -68,6 +68,8 @@ LOCAL_CFLAGS += \
 	-DMAJOR_IN_SYSMACROS \
 	-Wundef \
 	-fvisibility=hidden \
+	-fno-math-errno \
+	-fno-trapping-math \
 	-Wno-sign-compare

 LOCAL_CPPFLAGS += \
-- 
2.14.1
[Mesa-dev] [PATCH 3/5] scons: add Wundef to the build flags
From: Emil Velikov

Analogous to the other build systems.

Cc: Jose Fonseca
Cc: Brian Paul
Signed-off-by: Emil Velikov
---
 scons/gallium.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scons/gallium.py b/scons/gallium.py
index ef3b2ee81ae..74793a2525c 100755
--- a/scons/gallium.py
+++ b/scons/gallium.py
@@ -455,6 +455,7 @@ def generate(env):
             # - http://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
             ccflags += [
                 '-Wall',
+                '-Wundef',
                 '-Wno-long-long',
                 '-fmessage-length=0', # be nice to Eclipse
             ]
-- 
2.14.1
[Mesa-dev] [PATCH 4/5] Android: add Wundef to the build flags
From: Emil Velikov

The compiler will warn us when we're misusing undefined macros.

Note: this will trigger a bunch of warnings, which will be resolved ASAP.

Cc: Tapani Pälli
Cc: Rob Herring
Signed-off-by: Emil Velikov
---
 Android.common.mk | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Android.common.mk b/Android.common.mk
index 5671c1c1a59..544d813c02d 100644
--- a/Android.common.mk
+++ b/Android.common.mk
@@ -66,6 +66,7 @@ LOCAL_CFLAGS += \
 	-DHAVE_DLADDR \
 	-DHAVE_DL_ITERATE_PHDR \
 	-DMAJOR_IN_SYSMACROS \
+	-Wundef \
 	-fvisibility=hidden \
 	-Wno-sign-compare

-- 
2.14.1
[Mesa-dev] [PATCH 2/5] meson: add Wundef to the build flags
From: Emil Velikov

Analogous to the other build systems.

Cc: Eric Engestrom
Cc: Dylan Baker
Signed-off-by: Emil Velikov
---
 meson.build | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/meson.build b/meson.build
index 53013e47ec4..2e704e18c93 100644
--- a/meson.build
+++ b/meson.build
@@ -468,7 +468,7 @@ endif

 # Check for generic C arguments
 c_args = []
-foreach a : ['-Wall', '-Werror=implicit-function-declaration',
+foreach a : ['-Wall', '-Wundef', '-Werror=implicit-function-declaration',
              '-Werror=missing-prototypes', '-fno-math-errno',
              '-fno-trapping-math', '-Qunused-arguments']
   if cc.has_argument(a)
@@ -483,7 +483,7 @@ endif
 # Check for generic C++ arguments
 cpp = meson.get_compiler('cpp')
 cpp_args = []
-foreach a : ['-Wall', '-fno-math-errno', '-fno-trapping-math',
+foreach a : ['-Wall', '-Wundef', '-fno-math-errno', '-fno-trapping-math',
              '-Qunused-arguments', '-Wno-non-virtual-dtor']
   if cpp.has_argument(a)
     cpp_args += a
-- 
2.14.1
Re: [Mesa-dev] [PATCH mesa] st/dri: replace hard-coded array size with ARRAY_SIZE()
Reviewed-by: Marek Olšák

Marek

On Fri, Nov 24, 2017 at 11:51 AM, Eric Engestrom wrote:
> Signed-off-by: Eric Engestrom
> ---
> Could've sworn I had already seen someone post this patch; guess either
> I was mistaken, or it got lost on the way.
> ---
>  src/gallium/state_trackers/dri/dri_screen.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/state_trackers/dri/dri_screen.c
> b/src/gallium/state_trackers/dri/dri_screen.c
> index 91f50fe8e32ae549c1df..31b2c37bfd0a9b8906b7 100644
> --- a/src/gallium/state_trackers/dri/dri_screen.c
> +++ b/src/gallium/state_trackers/dri/dri_screen.c
> @@ -219,7 +219,7 @@ dri_fill_in_modes(struct dri_screen *screen)
>     if (dri_loader_get_cap(screen, DRI_LOADER_CAP_RGBA_ORDERING))
>        num_formats = ARRAY_SIZE(mesa_formats);
>     else
> -      num_formats = 5;
> +      num_formats = ARRAY_SIZE(mesa_formats) - 2;
>
>     /* Add configs. */
>     for (format = 0; format < num_formats; format++) {
> --
> Cheers,
>   Eric
>
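As a rough illustration of why the patch derives the count from the array instead of hard-coding 5, here is a small self-contained sketch in C. DEMO_ARRAY_SIZE is a local stand-in for Mesa's ARRAY_SIZE macro, and demo_formats with its seven placeholder values is a hypothetical array, chosen only so that "size minus 2" equals the old constant 5; it is not the real mesa_formats table.

```c
#include <stddef.h>

/* Local stand-in for Mesa's ARRAY_SIZE; defined here so the sketch
 * compiles on its own. */
#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical format list with placeholder values. The last two entries
 * stand for the formats that need the RGBA-ordering loader capability. */
static const int demo_formats[] = { 10, 11, 12, 13, 14, 15, 16 };

/* Number of formats to expose, derived from the array so it follows
 * the array if entries are added ahead of the last two. */
static size_t demo_num_formats(int has_rgba_ordering)
{
    if (has_rgba_ordering)
        return DEMO_ARRAY_SIZE(demo_formats);
    return DEMO_ARRAY_SIZE(demo_formats) - 2;
}
```

The design point is that the "- 2" still encodes an assumption (the capability-dependent formats are the final two entries), but the total no longer has to be updated by hand when the array grows.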
Re: [Mesa-dev] [PATCH 1/3] util: Fix SHA1 implementation on big endian
On 24 November 2017 at 10:25, Eric Engestrom wrote:
> On Thursday, 2017-11-23 11:08:04 -0800, Matt Turner wrote:
>> The code defines a macro blk0(i) based on the preprocessor condition
>> BYTE_ORDER == LITTLE_ENDIAN. If true, blk0(i) is defined as a byte swap
>> operation. Unfortunately, if the preprocessor macros used in the test
>> are not defined, then the comparison becomes 0 == 0 and it evaluates as
>> true.
>> ---
>> src/util/sha1/sha1.c | 9 +
>> 1 file changed, 9 insertions(+)
>>
>> diff --git a/src/util/sha1/sha1.c b/src/util/sha1/sha1.c
>> index ef59ea1dfc..2c629520c3 100644
>> --- a/src/util/sha1/sha1.c
>> +++ b/src/util/sha1/sha1.c
>> @@ -16,8 +16,17 @@
>>
>> #include
>> #include
>> +#include "u_endian.h"
>
> If you're including this header, why not use
> `#ifdef PIPE_ARCH_LITTLE_ENDIAN` instead of
> `#if BYTE_ORDER == LITTLE_ENDIAN`?
>
> `#error` becomes unnecessary
>

I won't bother with that - we do want to address all the cases where an undefined macro is used in an #if statement. This is handled by -Wundef, which seemingly is not part of -Wall and which we don't use - patch incoming in a second.

We want to make it a -Werror=undef after we fix the ~60 issues. More than half of those are coming from gtest :-\

Thanks for fixing these Matt. As-is the series is
Reviewed-by: Emil Velikov

-Emil
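The two approaches discussed in this thread (an explicit `#error` versus an `#ifdef`-style test) both come down to checking that a macro exists before comparing its value. The sketch below shows that pattern in C under stated assumptions: all DEMO_* names are hypothetical, DEMO_BYTE_ORDER is intentionally left undefined to mimic a platform the endian header does not cover, and where a real build would stop with `#error`, the sketch records an "unknown" state so that it still compiles.

```c
#include <assert.h>

/* Hypothetical stand-ins for the values an endian header would provide. */
#define DEMO_LITTLE_ENDIAN 1234
#define DEMO_BIG_ENDIAN    4321
/* DEMO_BYTE_ORDER is deliberately not defined. */

#if !defined(DEMO_BYTE_ORDER)
/* A real build would use "#error Unknown Endianness" here. */
#define DEMO_ENDIAN_STATE 0 /* unknown */
#elif DEMO_BYTE_ORDER == DEMO_LITTLE_ENDIAN
#define DEMO_ENDIAN_STATE 1 /* little endian */
#elif DEMO_BYTE_ORDER == DEMO_BIG_ENDIAN
#define DEMO_ENDIAN_STATE 2 /* big endian */
#else
#define DEMO_ENDIAN_STATE 0 /* defined, but to an unexpected value */
#endif

/* Returns 0 (unknown), 1 (little endian) or 2 (big endian). */
static int demo_endian_state(void)
{
    return DEMO_ENDIAN_STATE;
}
```

Because the `defined()` test comes first, the value comparisons are never evaluated with an undefined identifier, so this form should stay quiet under -Wundef.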
Re: [Mesa-dev] [PATCH 44/51] glsl: WIP: Add lowering pass for treating mediump as float16
On Friday, 2017-11-24 14:27:11 +0200, Topi Pohjolainen wrote: > At least the following need more thought: > > 1) Converting right-hand-side of assignments from 16-bits to 32-bits >- More correct thing to do is to treat rhs as 32-bits latest in the > expression producing the value > > 2) Texture arguments except coordinates are not handled at all >- Moreover, coordinates are always converted into 32-bits due to > logic missing in the Intel compiler backend. > > Signed-off-by: Topi Pohjolainen > --- > src/compiler/Makefile.sources | 1 + > src/compiler/glsl/ir_optimization.h | 1 + > src/compiler/glsl/lower_mediump.cpp | 273 > > 3 files changed, 275 insertions(+) > create mode 100644 src/compiler/glsl/lower_mediump.cpp > > diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources > index 2ab8e163a2..47bde4fb78 100644 > --- a/src/compiler/Makefile.sources > +++ b/src/compiler/Makefile.sources > @@ -94,6 +94,7 @@ LIBGLSL_FILES = \ > glsl/lower_int64.cpp \ > glsl/lower_jumps.cpp \ > glsl/lower_mat_op_to_vec.cpp \ > + glsl/lower_mediump.cpp \ > glsl/lower_noise.cpp \ > glsl/lower_offset_array.cpp \ > glsl/lower_packed_varyings.cpp \ 8< diff --git a/src/compiler/glsl/meson.build b/src/compiler/glsl/meson.build index 5b505c007a03d8d2e19b..1bfc2106738caf3a266f 100644 --- a/src/compiler/glsl/meson.build +++ b/src/compiler/glsl/meson.build @@ -133,6 +133,7 @@ files_libglsl = files( 'lower_int64.cpp', 'lower_jumps.cpp', 'lower_mat_op_to_vec.cpp', + 'lower_mediump.cpp', 'lower_noise.cpp', 'lower_offset_array.cpp', 'lower_packed_varyings.cpp', >8 > diff --git a/src/compiler/glsl/ir_optimization.h > b/src/compiler/glsl/ir_optimization.h > index 2b8c195151..09c4d664e0 100644 > --- a/src/compiler/glsl/ir_optimization.h > +++ b/src/compiler/glsl/ir_optimization.h > @@ -132,6 +132,7 @@ bool do_vec_index_to_swizzle(exec_list *instructions); > bool lower_discard(exec_list *instructions); > void lower_discard_flow(exec_list *instructions); > bool 
lower_instructions(exec_list *instructions, unsigned what_to_lower); > +bool lower_mediump(struct gl_linked_shader *shader); > bool lower_noise(exec_list *instructions); > bool lower_variable_index_to_cond_assign(gl_shader_stage stage, > exec_list *instructions, bool lower_input, bool lower_output, > diff --git a/src/compiler/glsl/lower_mediump.cpp > b/src/compiler/glsl/lower_mediump.cpp > new file mode 100644 > index 00..89eed8b294 > --- /dev/null > +++ b/src/compiler/glsl/lower_mediump.cpp > @@ -0,0 +1,273 @@ > +/* > + * Copyright 2017 Intel Corporation > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the next > + * paragraph) shall be included in all copies or substantial portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > + * DEALINGS IN THE SOFTWARE. 
> + */ > + > +/** > + * \file lower_mediump.cpp > + * > + */ > + > +#include "compiler/glsl_types.h" > +#include "ir.h" > +#include "ir_rvalue_visitor.h" > +#include "ast.h" > + > +static const glsl_type * > +get_mediump(const glsl_type *highp) > +{ > + if (highp->is_float()) > + return glsl_type::get_instance(GLSL_TYPE_FLOAT16, > + highp->vector_elements, > + highp->matrix_columns); > + > + if (highp->is_array() && highp->fields.array->is_float()) > + return glsl_type::get_array_instance( > +glsl_type::get_instance(GLSL_TYPE_FLOAT16, > + > highp->fields.array->vector_elements, > +highp->fields.array->matrix_columns), > +highp->length); > + > + return highp; > +} > + > +static bool > +is_16_bit(const ir_rvalue *ir) > +{ > + return ir->type->get_scalar_type()->base_type == GLSL_TYPE_FLOAT16; > +} > + > +static bool > +refers_16_bit_float(const ir_rvalue *ir) > +{ > + ir_variable *var = ir->variable_refere
Re: [Mesa-dev] [PATCH mesa] compiler: use proper guard for asserts
I ended up looking at all the uses of DEBUG in Mesa, and will send a whole series next week, including more changes in nir & compiler, so you can disregard this patch for now.

On Thursday, 2017-11-23 13:24:25 +, Eric Engestrom wrote:
> nir_validate.c's #endif already had the correct NDEBUG comment
>
> Fixes: dcb1acdea00a8f2c29777 "nir/validate: Only build in debug mode"
> Fixes: 9ff71b649b4b3808a9e17 "i965/nir: Validate that NIR passes call
> nir_metadata_preserve()"
> Signed-off-by: Eric Engestrom
> ---
>  src/compiler/nir/nir.h | 4 ++--
>  src/compiler/nir/nir_metadata.c | 2 +-
>  src/compiler/nir/nir_validate.c | 2 +-
>  3 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index f46f6147110acbffec5c..626f5f5a5533e577c66b 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -2325,7 +2325,7 @@ nir_deref_var *nir_deref_var_clone(const nir_deref_var
> *deref, void *mem_ctx);
>
>  nir_shader *nir_shader_serialize_deserialize(void *mem_ctx, nir_shader *s);
>
> -#ifdef DEBUG
> +#ifndef NDEBUG
>  void nir_validate_shader(nir_shader *shader);
>  void nir_metadata_set_validation_flag(nir_shader *shader);
>  void nir_metadata_check_validation_flag(nir_shader *shader);
> @@ -2366,7 +2366,7 @@ static inline void
> nir_metadata_check_validation_flag(nir_shader *shader) { (voi
>  static inline bool should_clone_nir(void) { return false; }
>  static inline bool should_serialize_deserialize_nir(void) { return false; }
>  static inline bool should_print_nir(void) { return false; }
> -#endif /* DEBUG */
> +#endif /* NDEBUG */
>
>  #define _PASS(nir, do_pass) do { \
>     do_pass \
> diff --git a/src/compiler/nir/nir_metadata.c b/src/compiler/nir/nir_metadata.c
> index f71cf432b703451e6997..e681ba34f7557a7c3051 100644
> --- a/src/compiler/nir/nir_metadata.c
> +++ b/src/compiler/nir/nir_metadata.c
> @@ -59,7 +59,7 @@ nir_metadata_preserve(nir_function_impl *impl, nir_metadata
> preserved)
>     impl->valid_metadata &= preserved;
>  }
>
> -#ifdef DEBUG
> +#ifndef NDEBUG
>  /**
>   * Make sure passes properly invalidate metadata (part 1).
>   *
> diff --git a/src/compiler/nir/nir_validate.c b/src/compiler/nir/nir_validate.c
> index 9bf8c7029012ef26af14..a49948fbb489beb9509e 100644
> --- a/src/compiler/nir/nir_validate.c
> +++ b/src/compiler/nir/nir_validate.c
> @@ -35,7 +35,7 @@
>  /* Since this file is just a pile of asserts, don't bother compiling it if
>   * we're not building a debug build.
>   */
> -#ifdef DEBUG
> +#ifndef NDEBUG
>
>  /*
>   * Per-register validation state.
> --
> Cheers,
>   Eric
>
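The reasoning behind switching the guard from DEBUG to NDEBUG can be sketched as follows; this is an illustration in C with a hypothetical helper name (demo_validate_count), not code from NIR. The standard assert() macro is controlled by NDEBUG, so code that exists only to run asserts is most consistently guarded by the same macro. A project-specific DEBUG macro is independent of it, and a build that defines neither would keep assert() active while compiling the helpers out.

```c
#include <assert.h>

#ifndef NDEBUG
/* assert() is active: provide the real check. */
static int demo_validate_count(int count)
{
    assert(count >= 0);
    return count;
}
#else
/* NDEBUG is defined, so assert() expands to nothing: provide a stub with
 * the same signature so callers need no conditional code. */
static inline int demo_validate_count(int count)
{
    return count;
}
#endif
```

Callers can then use demo_validate_count() unconditionally, which mirrors how the quoted header pairs real declarations with static inline stubs.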
[Mesa-dev] [PATCH 44/51] glsl: WIP: Add lowering pass for treating mediump as float16
At least the following need more thought: 1) Converting right-hand-side of assignments from 16-bits to 32-bits - More correct thing to do is to treat rhs as 32-bits latest in the expression producing the value 2) Texture arguments except coordinates are not handled at all - Moreover, coordinates are always converted into 32-bits due to logic missing in the Intel compiler backend. Signed-off-by: Topi Pohjolainen --- src/compiler/Makefile.sources | 1 + src/compiler/glsl/ir_optimization.h | 1 + src/compiler/glsl/lower_mediump.cpp | 273 3 files changed, 275 insertions(+) create mode 100644 src/compiler/glsl/lower_mediump.cpp diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources index 2ab8e163a2..47bde4fb78 100644 --- a/src/compiler/Makefile.sources +++ b/src/compiler/Makefile.sources @@ -94,6 +94,7 @@ LIBGLSL_FILES = \ glsl/lower_int64.cpp \ glsl/lower_jumps.cpp \ glsl/lower_mat_op_to_vec.cpp \ + glsl/lower_mediump.cpp \ glsl/lower_noise.cpp \ glsl/lower_offset_array.cpp \ glsl/lower_packed_varyings.cpp \ diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h index 2b8c195151..09c4d664e0 100644 --- a/src/compiler/glsl/ir_optimization.h +++ b/src/compiler/glsl/ir_optimization.h @@ -132,6 +132,7 @@ bool do_vec_index_to_swizzle(exec_list *instructions); bool lower_discard(exec_list *instructions); void lower_discard_flow(exec_list *instructions); bool lower_instructions(exec_list *instructions, unsigned what_to_lower); +bool lower_mediump(struct gl_linked_shader *shader); bool lower_noise(exec_list *instructions); bool lower_variable_index_to_cond_assign(gl_shader_stage stage, exec_list *instructions, bool lower_input, bool lower_output, diff --git a/src/compiler/glsl/lower_mediump.cpp b/src/compiler/glsl/lower_mediump.cpp new file mode 100644 index 00..89eed8b294 --- /dev/null +++ b/src/compiler/glsl/lower_mediump.cpp @@ -0,0 +1,273 @@ +/* + * Copyright 2017 Intel Corporation + * + * Permission is hereby granted, free 
of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. 
+ */ + +/** + * \file lower_mediump.cpp + * + */ + +#include "compiler/glsl_types.h" +#include "ir.h" +#include "ir_rvalue_visitor.h" +#include "ast.h" + +static const glsl_type * +get_mediump(const glsl_type *highp) +{ + if (highp->is_float()) + return glsl_type::get_instance(GLSL_TYPE_FLOAT16, + highp->vector_elements, + highp->matrix_columns); + + if (highp->is_array() && highp->fields.array->is_float()) + return glsl_type::get_array_instance( +glsl_type::get_instance(GLSL_TYPE_FLOAT16, +highp->fields.array->vector_elements, +highp->fields.array->matrix_columns), +highp->length); + + return highp; +} + +static bool +is_16_bit(const ir_rvalue *ir) +{ + return ir->type->get_scalar_type()->base_type == GLSL_TYPE_FLOAT16; +} + +static bool +refers_16_bit_float(const ir_rvalue *ir) +{ + ir_variable *var = ir->variable_referenced(); + + /* Only variables have the mediump property, constants need conversion. */ + if (!var) + return false; + + return var->type->get_scalar_type()->base_type == GLSL_TYPE_FLOAT16; +} + +static ir_rvalue * +convert(ir_rvalue *ir, enum ir_expression_operation op) +{ + if (ir->ir_type == ir_type_constant) { + assert(op == ir_unop_f2h); + ir->type = get_mediump(ir->type); + return ir; + } + + void *ctx = ralloc_parent(ir); + return new(ctx) ir_expression(op, ir); +} + +class lower_mediump_visitor : public ir_rvalue_visitor { +public: + lower_mediump_visitor() : progress(false) {} + + virtual ir_visitor_status visit(ir_variable *ir); + virtual ir_visitor_status visit(ir_deref
[Mesa-dev] [PATCH 29/51] intel/compiler/fs: Add register padding support
Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_fs.cpp | 3 ++- src/intel/compiler/brw_fs.h| 3 ++- src/intel/compiler/brw_fs_builder.h| 25 ++--- src/intel/compiler/brw_fs_copy_propagation.cpp | 1 + src/intel/compiler/brw_fs_nir.cpp | 9 +++-- src/intel/compiler/brw_ir_fs.h | 3 +++ 6 files changed, 33 insertions(+), 11 deletions(-) diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp index cedfde5096..9c3410b698 100644 --- a/src/intel/compiler/brw_fs.cpp +++ b/src/intel/compiler/brw_fs.cpp @@ -440,6 +440,7 @@ fs_reg::fs_reg(struct ::brw_reg reg) : { this->offset = 0; this->stride = 1; + this->pad_per_component = 0; if (this->file == IMM && (this->type != BRW_REGISTER_TYPE_V && this->type != BRW_REGISTER_TYPE_UV && @@ -467,7 +468,7 @@ fs_reg::component_size(unsigned width) const const unsigned stride = ((file != ARF && file != FIXED_GRF) ? this->stride : hstride == 0 ? 0 : 1 << (hstride - 1)); - return MAX2(width * stride, 1) * type_sz(type); + return (MAX2(width * stride, 1) * (type_sz(type)) + pad_per_component); } /** diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h index 30557324d5..d9c4f737e6 100644 --- a/src/intel/compiler/brw_fs.h +++ b/src/intel/compiler/brw_fs.h @@ -231,7 +231,8 @@ public: nir_jump_instr *instr); fs_reg get_nir_src(const nir_src &src); fs_reg get_nir_src_imm(const nir_src &src); - fs_reg get_nir_dest(const nir_dest &dest); + fs_reg get_nir_dest(const nir_dest &dest, + bool pad_components_to_full_registers = false); fs_reg get_nir_image_deref(const nir_deref_var *deref); fs_reg get_indirect_offset(nir_intrinsic_instr *instr); void emit_percomp(const brw::fs_builder &bld, const fs_inst &inst, diff --git a/src/intel/compiler/brw_fs_builder.h b/src/intel/compiler/brw_fs_builder.h index 633086c64b..804d52e5df 100644 --- a/src/intel/compiler/brw_fs_builder.h +++ b/src/intel/compiler/brw_fs_builder.h @@ -182,17 +182,28 @@ namespace brw { * component in this IR). 
*/ dst_reg - vgrf(enum brw_reg_type type, unsigned n = 1) const + vgrf(enum brw_reg_type type, + unsigned n = 1, + bool pad_components_to_full_registers = false) const { assert(dispatch_width() <= 32); - if (n > 0) -return dst_reg(VGRF, shader->alloc.allocate( - DIV_ROUND_UP(n * type_sz(type) * dispatch_width(), - REG_SIZE)), - type); - else + if (n == 0) return retype(null_reg_ud(), type); + + const unsigned pad_per_component = +(pad_components_to_full_registers && + type_sz(type) == 2 && + dispatch_width() == 8) ? (REG_SIZE / 2) : 0; + const unsigned size = +n * ((type_sz(type) * dispatch_width()) + pad_per_component); + const unsigned nr = shader->alloc.allocate( +DIV_ROUND_UP(size, REG_SIZE)); + + dst_reg dst = dst_reg(VGRF, nr, type); + dst.pad_per_component = pad_per_component; + + return dst; } /** diff --git a/src/intel/compiler/brw_fs_copy_propagation.cpp b/src/intel/compiler/brw_fs_copy_propagation.cpp index ed2511ecfa..637a1de6ae 100644 --- a/src/intel/compiler/brw_fs_copy_propagation.cpp +++ b/src/intel/compiler/brw_fs_copy_propagation.cpp @@ -447,6 +447,7 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry) inst->src[arg].file = entry->src.file; inst->src[arg].nr = entry->src.nr; inst->src[arg].stride *= entry->src.stride; + inst->src[arg].pad_per_component = entry->src.pad_per_component; inst->saturate = inst->saturate || entry->saturate; /* Compute the offset of inst->src[arg] relative to entry->dst */ diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp index 16e8dfc186..35e78b134a 100644 --- a/src/intel/compiler/brw_fs_nir.cpp +++ b/src/intel/compiler/brw_fs_nir.cpp @@ -357,6 +357,9 @@ fs_visitor::nir_emit_impl(nir_function_impl *impl) unsigned size = array_elems * reg->num_components; const brw_reg_type reg_type = brw_reg_type_from_bit_size(reg->bit_size, BRW_REGISTER_TYPE_F); + + /* TODO: Consider if 16-bit component padding is needed. 
*/ + nir_locals[reg->index] = bld.vgrf(reg_type, size); } @@ -1602,13 +1605,15 @@ fs_visitor::get_nir_src_imm(const nir_src &src) } fs_reg -fs_visitor::get_nir_dest(const nir_dest &dest) +fs_visitor::get_nir_dest(const nir_dest &dest, + bool pad_components_to_full_registers) { if (dest.is_ssa) { const brw_reg_type reg_type =
[Mesa-dev] [PATCH 45/51] glsl: Use 16-bit constants if operation is otherwise 16-bit
Signed-off-by: Topi Pohjolainen --- src/compiler/glsl/lower_mediump.cpp | 43 - 1 file changed, 42 insertions(+), 1 deletion(-) diff --git a/src/compiler/glsl/lower_mediump.cpp b/src/compiler/glsl/lower_mediump.cpp index 89eed8b294..0276e74d6e 100644 --- a/src/compiler/glsl/lower_mediump.cpp +++ b/src/compiler/glsl/lower_mediump.cpp @@ -67,6 +67,25 @@ refers_16_bit_float(const ir_rvalue *ir) return var->type->get_scalar_type()->base_type == GLSL_TYPE_FLOAT16; } +static bool +is_constant(const ir_rvalue *ir) +{ + if (ir->ir_type == ir_type_constant) + return true; + + if (ir->ir_type != ir_type_expression) + return false; + + const ir_expression *expr = (const ir_expression *)ir; + + for (unsigned i = 0; i < expr->num_operands; i++) { + if (!is_constant(expr->operands[i])) + return false; + } + + return true; +} + static ir_rvalue * convert(ir_rvalue *ir, enum ir_expression_operation op) { @@ -99,6 +118,7 @@ private: bool can_be_lowered(const ir_variable *var) const; void retype_to_float16(const glsl_type **t); + void retype_to_float16(ir_rvalue *ir); }; bool @@ -119,6 +139,22 @@ lower_mediump_visitor::retype_to_float16(const glsl_type **t) *t = mediump; } +void +lower_mediump_visitor::retype_to_float16(ir_rvalue *ir) +{ + retype_to_float16(&ir->type); + + if (ir->ir_type != ir_type_expression) + return; + + const ir_expression *expr = (const ir_expression *)ir; + + for (unsigned i = 0; i < expr->num_operands; i++) { + assert(is_constant(expr->operands[i])); + retype_to_float16(&expr->operands[i]->type); + } +} + ir_visitor_status lower_mediump_visitor::visit(ir_variable *ir) { @@ -228,7 +264,7 @@ lower_mediump_visitor::visit_leave(ir_expression *ir) for (unsigned i = 0; i < ir->num_operands; i++) { if (is_16_bit(ir->operands[i])) has_16_bit_src = true; - else + else if (!is_constant(ir->operands[i])) has_32_bit_src = true; } @@ -240,6 +276,11 @@ lower_mediump_visitor::visit_leave(ir_expression *ir) */ if (!has_32_bit_src && ir->operation != ir_triop_lrp) { + for 
(unsigned i = 0; i < ir->num_operands; i++) { + if (is_constant(ir->operands[i])) +retype_to_float16(ir->operands[i]); + } + retype_to_float16(&ir->type); return visit_continue; } -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
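For readers following the series, the recursion in is_constant() above can be tried outside Mesa with a small stand-in. This is only a sketch: NodeKind and Node are hypothetical types invented for illustration, not the GLSL IR classes; only the shape of the recursion is taken from the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the IR node kinds that matter to the check.
enum class NodeKind { Constant, Expression, Other };

struct Node {
   NodeKind kind;
   std::vector<const Node *> operands; // only used for Expression nodes
};

// Same shape as the patch's is_constant(): a constant is constant, an
// expression is constant only if every operand is, anything else is not.
bool is_constant(const Node *n)
{
   if (n->kind == NodeKind::Constant)
      return true;

   if (n->kind != NodeKind::Expression)
      return false;

   for (const Node *op : n->operands) {
      if (!is_constant(op))
         return false;
   }

   return true;
}
```

An expression such as (1.0 + 2.0) therefore counts as constant and no longer forces the surrounding operation to 32 bits, while (1.0 + 2.0) * x does not count as constant because of the variable operand.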
[Mesa-dev] [PATCH 41/51] intel/compiler/eu: Take stride into account in 16-bit ops
This is needed when converting from F -> HF.

Signed-off-by: Topi Pohjolainen
---
 src/intel/compiler/brw_eu_validate.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/intel/compiler/brw_eu_validate.c b/src/intel/compiler/brw_eu_validate.c
index 6ee6b4ffbe..735ea6 100644
--- a/src/intel/compiler/brw_eu_validate.c
+++ b/src/intel/compiler/brw_eu_validate.c
@@ -459,6 +459,9 @@ general_restrictions_based_on_operand_types(const struct gen_device_info *devinf
        exec_type_size == 8 && dst_type_size == 4)
       dst_type_size = 8;

+   if (exec_type_size == 4 && dst_type_size == 2 && dst_stride == 2)
+      dst_type_size = 4;
+
    if (exec_type_size > dst_type_size) {
       ERROR_IF(dst_stride * dst_type_size != exec_type_size,
                "Destination stride must be equal to the ratio of the sizes of "
--
2.11.0
[Mesa-dev] [PATCH 15/51] intel/compiler: Add support for loading 16-bit constants
Signed-off-by: Topi Pohjolainen
---
 src/intel/compiler/brw_fs_nir.cpp | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index a973c18203..65a5bfa49a 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -1515,6 +1515,11 @@ fs_visitor::nir_emit_load_const(const fs_builder &bld,
    fs_reg reg = bld.vgrf(reg_type, instr->def.num_components);

    switch (instr->def.bit_size) {
+   case 16:
+      for (unsigned i = 0; i < instr->def.num_components; i++)
+         bld.MOV(offset(reg, bld, i), brw_imm_w(instr->value.i16[i]));
+      break;
+
    case 32:
       for (unsigned i = 0; i < instr->def.num_components; i++)
          bld.MOV(offset(reg, bld, i), brw_imm_d(instr->value.i32[i]));
--
2.11.0
[Mesa-dev] [PATCH 51/51] i965/fs: Lower gles mediump floats into 16-bits
Signed-off-by: Topi Pohjolainen
---
 src/mesa/drivers/dri/i965/brw_link.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp
index d18521e792..89ccbb06b5 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -134,6 +134,9 @@ process_glsl_ir(struct brw_context *brw,
    lower_noise(shader->ir);
    lower_quadop_vector(shader->ir, false);

+   if (shader_prog->IsES && shader->Stage == MESA_SHADER_FRAGMENT)
+      lower_mediump(shader);
+
    validate_ir_tree(shader->ir);

    /* Now that we've finished altering the linked IR, reparent any live IR back
--
2.11.0
[Mesa-dev] [PATCH 30/51] intel/compiler/fs: Pad 16-bit texture return payloads
This gives the offset and read/write size calculations enough information to work correctly with 16-bit texture payloads.

Signed-off-by: Topi Pohjolainen
---
 src/intel/compiler/brw_fs_nir.cpp | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index 35e78b134a..6d9b272a57 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -4949,7 +4949,22 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, nir_tex_instr *instr)
       }
    }

-   fs_reg dst = bld.vgrf(brw_type_for_nir_type(devinfo, instr->dest_type), 4);
+   const enum brw_reg_type dst_type =
+      brw_type_for_nir_type(devinfo, instr->dest_type);
+
+   /* In case of 16-bit return format one needs to prepare for 4 registers
+    * regardless of the dispatch width:
+    *
+    * From SKL PRM Vol. 7 Page 131, Return Format = 16-bit:
+    *
+    *    A SIMD8* writeback message with Return Format of 16-bit consists of
+    *    up to 4 destination registers).
+    *
+    * Therefore tell builder to give full register per component even in
+    * case of 16-bit size and SIMD8.
+    */
+   const bool pad_components_to_full_registers = true;
+   fs_reg dst = bld.vgrf(dst_type, 4, pad_components_to_full_registers);
    fs_inst *inst = bld.emit(opcode, dst, srcs, ARRAY_SIZE(srcs));
    inst->offset = header_bits;

@@ -4987,7 +5002,9 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, nir_tex_instr *instr)
       bld.emit_minmax(nir_dest[2], depth, brw_imm_d(1), BRW_CONDITIONAL_GE);
    }

-   bld.LOAD_PAYLOAD(get_nir_dest(instr->dest), nir_dest, dest_size, 0);
+   bld.LOAD_PAYLOAD(get_nir_dest(instr->dest,
+                                 pad_components_to_full_registers),
+                    nir_dest, dest_size, 0);
 }

 void
--
2.11.0
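To see why the padding flag changes the allocation for the 4-component texture return, the size arithmetic from the builder's vgrf() can be reproduced standalone. A rough sketch only: vgrf_regs() and div_round_up() are names made up here, and REG_SIZE is assumed to be 32 bytes as in the i965 backend.

```cpp
#include <cassert>

// Assumption: one GRF is 32 bytes, matching REG_SIZE in the i965 backend.
constexpr unsigned REG_SIZE = 32;

constexpr unsigned div_round_up(unsigned a, unsigned b)
{
   return (a + b - 1) / b;
}

// Hypothetical helper following the arithmetic in the patched vgrf():
// each component takes type_size * dispatch_width bytes, plus half a
// register of padding when a 2-byte type is used in SIMD8 with padding on.
unsigned vgrf_regs(unsigned type_size, unsigned dispatch_width,
                   unsigned n, bool pad_components_to_full_registers)
{
   const unsigned pad_per_component =
      (pad_components_to_full_registers &&
       type_size == 2 && dispatch_width == 8) ? (REG_SIZE / 2) : 0;
   const unsigned size =
      n * (type_size * dispatch_width + pad_per_component);

   return div_round_up(size, REG_SIZE);
}
```

With these assumptions, four HF components in SIMD8 take 2 registers unpadded and 4 registers padded, which matches the "up to 4 destination registers" wording quoted from the PRM. SIMD16 and 32-bit types are unaffected by the flag.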
[Mesa-dev] [PATCH 39/51] intel/compiler/fs: Consider logic ops on 16-bit booleans
Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_fs_nir.cpp | 70 ++- 1 file changed, 69 insertions(+), 1 deletion(-) diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp index 2a32b1449a..aff592c354 100644 --- a/src/intel/compiler/brw_fs_nir.cpp +++ b/src/intel/compiler/brw_fs_nir.cpp @@ -1662,7 +1662,75 @@ fs_visitor::get_nir_alu_dest(const nir_alu_instr *instr) * one component per register. */ const bool pad_components_to_full_register = true; - return get_nir_dest(instr->dest.dest, pad_components_to_full_register); + + switch (instr->op) { + case nir_op_flt: + case nir_op_fge: + case nir_op_feq: + case nir_op_fne: { + assert(instr->dest.dest.is_ssa); + + if (nir_src_bit_size(instr->src[0].src) > 16) + return get_nir_dest(instr->dest.dest); + + assert(nir_src_bit_size(instr->src[0].src) == 16 && + nir_src_bit_size(instr->src[1].src) == 16); + + /* Destination type for comparison operations is boolean which NIR + * treats as having 32-bit size. If, however, sources are 16-bit + * hardware will produce 16-bit result (0x/0x). Therefore set + * the destination type accordingly. + */ + nir_ssa_values[instr->dest.dest.ssa.index] = + bld.vgrf(BRW_REGISTER_TYPE_HF, + instr->dest.dest.ssa.num_components, + pad_components_to_full_register); + return nir_ssa_values[instr->dest.dest.ssa.index]; + } + case nir_op_inot: + case nir_op_ixor: + case nir_op_ior: + case nir_op_iand: { + assert(instr->dest.dest.is_ssa); + + const fs_reg src0 = get_nir_src(instr->src[0].src); + const fs_reg src1 = get_nir_src(instr->src[0].src); + + /* TODO: This specifically prepares for mixed precision operations which + * in principle shouldn't happen. There is, however, corner case + * when this is possible. As NIR doesn't consider how booleans + * are produced, we may end up here with one source operand + * produced from an operation with 32-bit sources and another from + * 16-bits. 
+ * This is handled by marking this operation as producing 16-bits + * and relying on nir_emit_alu() to adjust the 32-bit source + * operand to 16-bits with stride == 2. Recall that 32-bit + * booleans are just 0x/0x and it suffices to read + * only the lower 16-bits. + * WARN: This blindly assumes that mixed precision integer source + * operands represent boolean values. There is no way of checking + * if that holds. + */ + if (brw_reg_type_to_size(src0.type) > 2 && + brw_reg_type_to_size(src1.type) > 2) + return get_nir_dest(instr->dest.dest); + + /* Translation from GLSL to NIR produces logical operations with + * integer operands even when operands are booleans. See handling + * of ir_binop_bit_*. + * As hardware will produce 16-bit results when the sources are 16-bit + * set the destination type accordingly. + */ + nir_ssa_values[instr->dest.dest.ssa.index] = + bld.vgrf(BRW_REGISTER_TYPE_W, + instr->dest.dest.ssa.num_components, + pad_components_to_full_register); + return nir_ssa_values[instr->dest.dest.ssa.index]; + } + default: + return get_nir_dest(instr->dest.dest, + pad_components_to_full_register); + } } fs_reg -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 35/51] intel/compiler/fs: Pad 16-bit payload lowering
Otherwise copy propagation fails when write sizes differ. Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_fs.cpp | 5 - src/intel/compiler/brw_ir_fs.h | 13 + 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp index 9c3410b698..8e77248470 100644 --- a/src/intel/compiler/brw_fs.cpp +++ b/src/intel/compiler/brw_fs.cpp @@ -3450,7 +3450,10 @@ fs_visitor::lower_load_payload() for (uint8_t i = inst->header_size; i < inst->sources; i++) { if (inst->src[i].file != BAD_FILE) -ibld.MOV(retype(dst, inst->src[i].type), inst->src[i]); +ibld.MOV(retype_pad_to_full_register( +dst, dispatch_width, inst->src[i].type), + inst->src[i]); + if (type_sz(inst->src[i].type) == 2) dst = byte_offset(dst, REG_SIZE); else diff --git a/src/intel/compiler/brw_ir_fs.h b/src/intel/compiler/brw_ir_fs.h index b4a1d7ef5a..fe7f7c4be7 100644 --- a/src/intel/compiler/brw_ir_fs.h +++ b/src/intel/compiler/brw_ir_fs.h @@ -72,6 +72,19 @@ retype(fs_reg reg, enum brw_reg_type type) } static inline fs_reg +retype_pad_to_full_register(fs_reg reg, unsigned dispatch_width, +enum brw_reg_type type) +{ + reg.type = type; + + assert(reg.pad_per_component == 0); + if (dispatch_width == 8 && type_sz(reg.type) == 2) + reg.pad_per_component = REG_SIZE / 2; + + return reg; +} + +static inline fs_reg byte_offset(fs_reg reg, unsigned delta) { switch (reg.file) { -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 48/51] glsl: HACK: Treat input varyings as 16-bits by conversion
Signed-off-by: Topi Pohjolainen --- src/compiler/glsl/lower_mediump.cpp | 26 +- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/src/compiler/glsl/lower_mediump.cpp b/src/compiler/glsl/lower_mediump.cpp index 094ab4e743..45cf75b53c 100644 --- a/src/compiler/glsl/lower_mediump.cpp +++ b/src/compiler/glsl/lower_mediump.cpp @@ -92,6 +92,20 @@ refers_16_bit_float(const ir_rvalue *ir) } static bool +defers_input_varying(const ir_rvalue *ir) +{ + ir_variable *var = ir->variable_referenced(); + if (!var) + return false; + + if (var->data.mode != ir_var_shader_in) + return false; + + return var->data.precision == ast_precision_low || + var->data.precision == ast_precision_medium; +} + +static bool is_constant(const ir_rvalue *ir) { if (ir->ir_type == ir_type_constant) @@ -152,6 +166,13 @@ lower_mediump_visitor::can_be_lowered(const ir_variable *var) const if (!var->type->get_scalar_type()->is_float()) return false; + /* TODO: Intel compiler backend isn't prepared for interpolated 16-bit +* varyings. Input varyings are instead converted to 16-bits before +* use. 
+*/ + if (var->data.mode == ir_var_shader_in) + return false; + return var->data.precision == ast_precision_low || var->data.precision == ast_precision_medium; } @@ -309,7 +330,8 @@ lower_mediump_visitor::visit_leave(ir_expression *ir) for (unsigned i = 0; i < ir->num_operands; i++) { if (is_16_bit(ir->operands[i])) has_16_bit_src = true; - else if (!is_constant(ir->operands[i])) + else if (!is_constant(ir->operands[i]) && + !defers_input_varying(ir->operands[i])) has_32_bit_src = true; } @@ -324,6 +346,8 @@ lower_mediump_visitor::visit_leave(ir_expression *ir) for (unsigned i = 0; i < ir->num_operands; i++) { if (is_constant(ir->operands[i])) retype_to_float16(ir->operands[i]); + else if (defers_input_varying(ir->operands[i])) +ir->operands[i] = convert(ir->operands[i], ir_unop_f2h); } retype_to_float16(&ir->type); -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 33/51] intel/compiler/fs: Pad 16-bit nir intrinsic dest into full reg
Signed-off-by: Topi Pohjolainen
---
 src/intel/compiler/brw_fs_nir.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index cbb1c118d2..64243312b9 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -3881,7 +3881,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
 {
    fs_reg dest;
    if (nir_intrinsic_infos[instr->intrinsic].has_dest)
-      dest = get_nir_dest(instr->dest);
+      dest = get_nir_dest(instr->dest, true /* pad components to full regs */);

    switch (instr->intrinsic) {
    case nir_intrinsic_image_load:
--
2.11.0
[Mesa-dev] [PATCH 13/51] intel/compiler/disasm: Print fp16 also for sampler messages
This is what render target write does.

Signed-off-by: Topi Pohjolainen
---
 src/intel/compiler/brw_disasm.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c
index da2a5d78dd..fbb18b0f26 100644
--- a/src/intel/compiler/brw_disasm.c
+++ b/src/intel/compiler/brw_disasm.c
@@ -1621,6 +1621,11 @@ brw_disassemble_inst(FILE *file, const struct gen_device_info *devinfo,
                           brw_inst_sampler_msg_type(devinfo, inst), &space);
            err |= control(file, "sampler simd mode", gen5_sampler_simd_mode,
                           brw_inst_sampler_simd_mode(devinfo, inst), &space);
+           if ((devinfo->gen >= 9 || devinfo->is_cherryview) &&
+               brw_inst_data_format(devinfo, inst)) {
+              string(file, " HP");
+           }
+
            format(file, " Surface = %"PRIu64" Sampler = %"PRIu64,
                   brw_inst_binding_table_index(devinfo, inst),
                   brw_inst_sampler(devinfo, inst));
--
2.11.0
[Mesa-dev] [PATCH 37/51] intel/compiler/fs: Consider original sizes when retyping alu ops
Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_fs_nir.cpp | 30 -- 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp index baa84b0f3c..d28ed57eca 100644 --- a/src/intel/compiler/brw_fs_nir.cpp +++ b/src/intel/compiler/brw_fs_nir.cpp @@ -655,6 +655,26 @@ emit_find_msb_using_lzd(const fs_builder &bld, inst->src[0].negate = true; } +static enum brw_reg_type +get_nir_alu_dest_type(const struct gen_device_info *devinfo, + const nir_alu_instr *instr, unsigned size) +{ + brw_reg_type res = brw_type_for_nir_type(devinfo, + (nir_alu_type)(nir_op_infos[instr->op].output_type | + nir_dest_bit_size(instr->dest.dest))); + return brw_reg_type_from_bit_size(size * 8, res); +} + +static enum brw_reg_type +get_nir_alu_src_type(const struct gen_device_info *devinfo, + const nir_alu_instr *instr, unsigned i, unsigned size) +{ + brw_reg_type res = brw_type_for_nir_type(devinfo, + (nir_alu_type)(nir_op_infos[instr->op].input_types[i] | + nir_src_bit_size(instr->src[i].src))); + return brw_reg_type_from_bit_size(size * 8, res); +} + void fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr) { @@ -662,16 +682,14 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr) fs_inst *inst; fs_reg result = get_nir_alu_dest(instr); - result.type = brw_type_for_nir_type(devinfo, - (nir_alu_type)(nir_op_infos[instr->op].output_type | - nir_dest_bit_size(instr->dest.dest))); + result.type = get_nir_alu_dest_type(devinfo, instr, + brw_reg_type_to_size(result.type)); fs_reg op[4]; for (unsigned i = 0; i < nir_op_infos[instr->op].num_inputs; i++) { op[i] = get_nir_src(instr->src[i].src); - op[i].type = brw_type_for_nir_type(devinfo, - (nir_alu_type)(nir_op_infos[instr->op].input_types[i] | -nir_src_bit_size(instr->src[i].src))); + op[i].type = get_nir_alu_src_type(devinfo, instr, i, +brw_reg_type_to_size(op[i].type)); op[i].abs = instr->src[i].abs; op[i].negate = 
instr->src[i].negate; } -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 32/51] intel/compiler/fs: Pad 16-bit nir vec* components into full reg
This allows quite a bit of infrastructure to be kept as is, such as liveness analysis, copy propagation and dead code elimination. This deals only with the virtual register space and doesn't prevent packing more than one component into one hardware register later on. That is entirely a matter of the register allocator working with sub-registers.

Signed-off-by: Topi Pohjolainen
---
 src/intel/compiler/brw_fs.h       |  1 +
 src/intel/compiler/brw_fs_nir.cpp | 19 ++-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h
index d9c4f737e6..b23d2b1733 100644
--- a/src/intel/compiler/brw_fs.h
+++ b/src/intel/compiler/brw_fs.h
@@ -233,6 +233,7 @@ public:
    fs_reg get_nir_src_imm(const nir_src &src);
    fs_reg get_nir_dest(const nir_dest &dest,
                        bool pad_components_to_full_registers = false);
+   fs_reg get_nir_alu_dest(const nir_alu_instr *instr);
    fs_reg get_nir_image_deref(const nir_deref_var *deref);
    fs_reg get_indirect_offset(nir_intrinsic_instr *instr);
    void emit_percomp(const brw::fs_builder &bld, const fs_inst &inst,
diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index d3125d7dcd..cbb1c118d2 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -656,7 +656,7 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
    struct brw_wm_prog_key *fs_key = (struct brw_wm_prog_key *) this->key;
    fs_inst *inst;

-   fs_reg result = get_nir_dest(instr->dest.dest);
+   fs_reg result = get_nir_alu_dest(instr);
    result.type = brw_type_for_nir_type(devinfo,
       (nir_alu_type)(nir_op_infos[instr->op].output_type |
                      nir_dest_bit_size(instr->dest.dest)));
@@ -1624,6 +1624,23 @@ fs_visitor::get_nir_dest(const nir_dest &dest,
 }

 fs_reg
+fs_visitor::get_nir_alu_dest(const nir_alu_instr *instr)
+{
+   /* With data type size <= 16 bits one can fit two or more components
+    * into one register. In virtual register space this doesn't really add
+    * any value but requires things such as liveness analysis,
+    * copy propagation and dead code elimination to be updated to work with
+    * sub-register regions.
+    *
+    * Therefore instead allocate full padded registers per component. This
+    * doesn't prevent final hardware register allocator from packing more than
+    * one component per register.
+    */
+   const bool pad_components_to_full_register = true;
+   return get_nir_dest(instr->dest.dest, pad_components_to_full_register);
+}
+
+fs_reg
 fs_visitor::get_nir_image_deref(const nir_deref_var *deref)
 {
    fs_reg image(UNIFORM, deref->var->data.driver_location / 4,
--
2.11.0
[Mesa-dev] [PATCH 27/51] intel/compiler/fs: Set tex type for generator to flag fp16
Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_fs.cpp | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp index 5751bb0ad7..0d415e2393 100644 --- a/src/intel/compiler/brw_fs.cpp +++ b/src/intel/compiler/brw_fs.cpp @@ -2601,7 +2601,15 @@ fs_visitor::opt_sampler_eot() tex_inst->offset |= fb_write->target << 24; tex_inst->eot = true; - tex_inst->dst = ibld.null_reg_ud(); + + /* Set the null destination type specifically so that generator knows to +* flag half precision flag. +*/ + if (tex_inst->dst.type == BRW_REGISTER_TYPE_HF) + tex_inst->dst = ibld.null_reg_hf(); + else + tex_inst->dst = ibld.null_reg_ud(); + tex_inst->size_written = 0; fb_write->remove(cfg->blocks[cfg->num_blocks - 1]); -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 50/51] glsl: HACK: Lower all temporary float variables to 16-bits
Signed-off-by: Topi Pohjolainen
---
 src/compiler/glsl/lower_mediump.cpp | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/compiler/glsl/lower_mediump.cpp b/src/compiler/glsl/lower_mediump.cpp
index bae18c9bfb..73b8aa577c 100644
--- a/src/compiler/glsl/lower_mediump.cpp
+++ b/src/compiler/glsl/lower_mediump.cpp
@@ -184,6 +184,17 @@ lower_mediump_visitor::can_be_lowered(const ir_variable *var) const
        var->data.how_declared == ir_var_declared_implicitly)
       return true;

+   /* Like builtins, temporary variables don't have precision
+    * qualifiers either. Lower them by default.
+    *
+    * TODO: Surrounding expressions should really be examined to tell if
+    *       full precision is needed. Moreover, these can be referred to from
+    *       multiple locations. If any requires full precision, then all
+    *       expressions involved would need to operate on full precision?
+    */
+   if (var->data.mode == ir_var_temporary)
+      return true;
+
    return var->data.precision == ast_precision_low ||
           var->data.precision == ast_precision_medium;
 }
--
2.11.0
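With the HACK patches of this series applied, the order of the checks in can_be_lowered() starts to matter. The sketch below models that order with a hypothetical Var struct and enums invented for illustration. The float check, the input-varying bail-out, the implicit-output and temporary cases are taken from the hunks of patches 48 to 50; anything else in the real function is not modeled.

```cpp
#include <cassert>

// Hypothetical stand-ins for the ir_variable fields the checks look at.
enum class Mode { ShaderIn, ShaderOut, Temporary, Other };
enum class Precision { None, Low, Medium, High };

struct Var {
   bool is_float;            // scalar base type is float
   Mode mode;
   bool declared_implicitly; // builtin such as gl_FragColor
   Precision precision;
};

bool can_be_lowered(const Var &v)
{
   if (!v.is_float)
      return false;

   // Input varyings stay 32-bit and are converted before use instead.
   if (v.mode == Mode::ShaderIn)
      return false;

   // Builtin outputs carry no precision qualifier; lowered by default.
   if (v.mode == Mode::ShaderOut && v.declared_implicitly)
      return true;

   // Temporaries carry no precision qualifier either.
   if (v.mode == Mode::Temporary)
      return true;

   return v.precision == Precision::Low ||
          v.precision == Precision::Medium;
}
```

Note that a mediump input varying is rejected here even though its precision would otherwise qualify. The f2h conversion added in patch 48 compensates for that.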
[Mesa-dev] [PATCH 46/51] glsl: Lower float conversions to mediump
Signed-off-by: Topi Pohjolainen --- src/compiler/glsl/lower_mediump.cpp | 26 ++ 1 file changed, 26 insertions(+) diff --git a/src/compiler/glsl/lower_mediump.cpp b/src/compiler/glsl/lower_mediump.cpp index 0276e74d6e..07f1f1ba9d 100644 --- a/src/compiler/glsl/lower_mediump.cpp +++ b/src/compiler/glsl/lower_mediump.cpp @@ -55,6 +55,30 @@ is_16_bit(const ir_rvalue *ir) return ir->type->get_scalar_type()->base_type == GLSL_TYPE_FLOAT16; } +static void +retype_x2f_x2f16(ir_rvalue *ir) +{ + if (ir->ir_type != ir_type_expression) + return; + + ir_expression *expr = (ir_expression *)ir; + switch (expr->operation) { + case ir_unop_i2f: + expr->operation = ir_unop_i2h; + break; + case ir_unop_b2f: + expr->operation = ir_unop_b2h; + break; + case ir_unop_u2f: + expr->operation = ir_unop_u2h; + break; + default: + return; + } + + ir->type = get_mediump(ir->type); +} + static bool refers_16_bit_float(const ir_rvalue *ir) { @@ -259,6 +283,8 @@ lower_mediump_visitor::visit_leave(ir_expression *ir) { ir_rvalue_visitor::visit_leave(ir); + retype_x2f_x2f16(ir); + bool has_32_bit_src = false; bool has_16_bit_src = false; for (unsigned i = 0; i < ir->num_operands; i++) { -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
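The conversion retyping above boils down to a small opcode mapping. A self-contained sketch, assuming a hypothetical Op enum in place of ir_expression_operation (fadd is included only as an example of an opcode that is left alone):

```cpp
#include <cassert>

// Hypothetical subset of opcodes; the names mirror the ir_unop_* values
// used in the patch but this is not the Mesa enum.
enum class Op { i2f, b2f, u2f, i2h, b2h, u2h, fadd };

// Rewrites a conversion-to-float opcode into its half-float counterpart.
// Returns true when op was changed, false when it was left untouched.
bool retype_x2f_to_x2h(Op &op)
{
   switch (op) {
   case Op::i2f:
      op = Op::i2h;
      return true;
   case Op::b2f:
      op = Op::b2h;
      return true;
   case Op::u2f:
      op = Op::u2h;
      return true;
   default:
      return false;
   }
}
```

In the patch the result type is switched to the mediump type only when the opcode was rewritten. The boolean return value plays that role in this sketch.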
[Mesa-dev] [PATCH 43/51] intel/compiler/fs: WIP: Use 32-bit slots for 16-bit uniforms
--- src/intel/compiler/brw_fs_nir.cpp | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp index 2060a3139d..631bbf7f92 100644 --- a/src/intel/compiler/brw_fs_nir.cpp +++ b/src/intel/compiler/brw_fs_nir.cpp @@ -4164,7 +4164,11 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr src.offset = const_offset->u32[0]; for (unsigned j = 0; j < instr->num_components; j++) { -bld.MOV(offset(dest, bld, j), offset(src, bld, j)); +/* Currently 16-bit uniforms occupy 32-bit slot. */ +const unsigned src_offset = + src.type == BRW_REGISTER_TYPE_HF ? 2 * j : j; + +bld.MOV(offset(dest, bld, j), offset(src, bld, src_offset)); } } else { fs_reg indirect = retype(get_nir_src(instr->src[0]), -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
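The factor of two in the source offset may look odd at first. A small sketch of the arithmetic, under the assumption that offset() on a uniform advances by one element of the source type (2 bytes for HF, 4 bytes otherwise); both helper names are made up for illustration:

```cpp
#include <cassert>

// Element index passed to offset() for component j, as in the patch.
unsigned uniform_src_component(bool is_half, unsigned j)
{
   return is_half ? 2 * j : j;
}

// Resulting byte offset, assuming offset() steps by the element size of
// the source type (an assumption of this sketch, not checked against Mesa).
unsigned uniform_src_byte_offset(bool is_half, unsigned j)
{
   const unsigned type_size = is_half ? 2 : 4;
   return uniform_src_component(is_half, j) * type_size;
}
```

Under that assumption component j of a 16-bit uniform is read from byte 4 * j, the same 32-bit slot a float uniform would use, which is what "16-bit uniforms occupy 32-bit slot" implies.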
[Mesa-dev] [PATCH 17/51] intel/compiler: Prepare for glsl mediump float uniforms
Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_shader.cpp | 13 + src/mesa/drivers/dri/i965/brw_program.c | 10 +- 2 files changed, 22 insertions(+), 1 deletion(-) diff --git a/src/intel/compiler/brw_shader.cpp b/src/intel/compiler/brw_shader.cpp index 234b5a11c1..cc9297772b 100644 --- a/src/intel/compiler/brw_shader.cpp +++ b/src/intel/compiler/brw_shader.cpp @@ -78,6 +78,19 @@ type_size_scalar(const struct glsl_type *type) return 0; } +/* Variant of type_size_scalar() taking into account that GL core and api + * don't deal with 16-bit uniforms but with 32-bit. Only compiler backend can + * work with reduced precision if desired. + */ +extern "C" int +uniform_storage_type_size_scalar(const struct glsl_type *type) +{ + if (type->base_type == GLSL_TYPE_FLOAT16) + return type->components(); + + return type_size_scalar(type); +} + enum brw_reg_type brw_type_for_base_type(const struct glsl_type *type) { diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c index 755d4973cc..4573d9d303 100644 --- a/src/mesa/drivers/dri/i965/brw_program.c +++ b/src/mesa/drivers/dri/i965/brw_program.c @@ -47,12 +47,20 @@ #include "brw_defines.h" #include "intel_batchbuffer.h" +int uniform_storage_type_size_scalar(const struct glsl_type *type); + +static int +uniform_storage_type_size_scalar_bytes(const struct glsl_type *type) +{ + return uniform_storage_type_size_scalar(type) * 4; +} + static bool brw_nir_lower_uniforms(nir_shader *nir, bool is_scalar) { if (is_scalar) { nir_assign_var_locations(&nir->uniforms, &nir->num_uniforms, - type_size_scalar_bytes); + uniform_storage_type_size_scalar_bytes); return nir_lower_io(nir, nir_var_uniform, type_size_scalar_bytes, 0); } else { nir_assign_var_locations(&nir->uniforms, &nir->num_uniforms, -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
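As a quick check of the storage arithmetic: with the new helper a FLOAT16 uniform takes one 32-bit slot per component. The sketch below uses hypothetical function names. The packed variant is only an assumed point of comparison (two halves per slot) and is not taken from the patch.

```cpp
#include <cassert>

// Slots used by a 16-bit float uniform with the patched helper:
// one full 32-bit slot per component.
unsigned uniform_storage_slots_fp16(unsigned components)
{
   return components;
}

// Assumed comparison point: a tightly packed layout with two 16-bit
// components per 32-bit slot. Hypothetical, for illustration only.
unsigned packed_slots_fp16(unsigned components)
{
   return (components + 1) / 2;
}

// Mirrors uniform_storage_type_size_scalar_bytes(): slots times 4 bytes.
unsigned uniform_storage_bytes_fp16(unsigned components)
{
   return uniform_storage_slots_fp16(components) * 4;
}
```

A mediump vec4 therefore reserves 16 bytes of uniform storage, the same as a highp vec4, so the GL core and API side can keep writing 32-bit values.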
[Mesa-dev] [PATCH 49/51] glsl: HACK: Lower builtin float outputs to 16-bits by default
Signed-off-by: Topi Pohjolainen --- src/compiler/glsl/lower_mediump.cpp | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/compiler/glsl/lower_mediump.cpp b/src/compiler/glsl/lower_mediump.cpp index 45cf75b53c..bae18c9bfb 100644 --- a/src/compiler/glsl/lower_mediump.cpp +++ b/src/compiler/glsl/lower_mediump.cpp @@ -173,6 +173,17 @@ lower_mediump_visitor::can_be_lowered(const ir_variable *var) const if (var->data.mode == ir_var_shader_in) return false; + /* Builtin outputs such as gl_FragColor don't have precision qualifier. +* Lower them by default. +* +* TODO: If this gets assigned with full precision value, output would +* need to be in full precision instead of the value being converted +* to 16-bits? +*/ + if (var->data.mode == ir_var_shader_out && + var->data.how_declared == ir_var_declared_implicitly) + return true; + return var->data.precision == ast_precision_low || var->data.precision == ast_precision_medium; } -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 38/51] intel/compiler/fs: Use original reg size when retyping nir src
Boolean values may be given in 16 bits, whereas NIR unconditionally regards them as 32-bit.

Signed-off-by: Topi Pohjolainen
---
 src/intel/compiler/brw_fs_nir.cpp | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index d28ed57eca..2a32b1449a 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -1604,8 +1604,9 @@ fs_visitor::get_nir_src(const nir_src &src)
        * default to an integer type - instructions that need floating point
        * semantics will set this to F if they need to
        */
-      reg.type = brw_reg_type_from_bit_size(nir_src_bit_size(src),
-                                            BRW_REGISTER_TYPE_D);
+      reg.type = brw_reg_type_from_bit_size(
+         brw_reg_type_to_size(reg.type) * 8,
+         BRW_REGISTER_TYPE_D);
    }

    return reg;
--
2.11.0
[Mesa-dev] [PATCH 26/51] intel/compiler/fs: Set 16-bit sampler return format
Signed-off-by: Topi Pohjolainen
---
 src/intel/compiler/brw_fs_generator.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index 20d018e1fe..610a545cd8 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -1051,6 +1051,9 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src
       brw_inst_set_eot(p->devinfo, brw_last_inst, true);
       brw_inst_set_opcode(p->devinfo, brw_last_inst, BRW_OPCODE_SENDC);
    }
+
+   if (dst.type == BRW_REGISTER_TYPE_HF)
+      brw_inst_set_data_format(p->devinfo, brw_last_inst, 1);
 }

--
2.11.0
[Mesa-dev] [PATCH 47/51] glsl: HACK: Force texture return into 16-bits
and convert coordinates unconditionally to 32-bits. Signed-off-by: Topi Pohjolainen --- src/compiler/glsl/lower_mediump.cpp | 19 +++ 1 file changed, 19 insertions(+) diff --git a/src/compiler/glsl/lower_mediump.cpp b/src/compiler/glsl/lower_mediump.cpp index 07f1f1ba9d..094ab4e743 100644 --- a/src/compiler/glsl/lower_mediump.cpp +++ b/src/compiler/glsl/lower_mediump.cpp @@ -132,6 +132,7 @@ public: virtual ir_visitor_status visit_leave(ir_assignment *ir); virtual ir_visitor_status visit_leave(ir_expression *ir); + virtual ir_visitor_status visit_leave(ir_texture *ir); virtual ir_visitor_status visit_leave(ir_swizzle *ir); virtual void handle_rvalue(ir_rvalue **rvalue); @@ -238,6 +239,24 @@ lower_mediump_visitor::visit_leave(ir_assignment *ir) } ir_visitor_status +lower_mediump_visitor::visit_leave(ir_texture *ir) +{ + ir_rvalue_visitor::visit_leave(ir); + + /* HACK: Intel compiler backend isn't prepared for 16-bit texture +* arguments. +* TODO: Convert the rest of the operands. +*/ + if (is_16_bit(ir->coordinate)) + ir->coordinate = convert(ir->coordinate, ir_unop_h2f); + + if (ir->type->is_float()) + retype_to_float16(&ir->type); + + return visit_continue; +} + +ir_visitor_status lower_mediump_visitor::visit_leave(ir_swizzle *ir) { ir_rvalue_visitor::visit_leave(ir); -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 14/51] intel/compiler/fs: Support for dumping 16-bit IMM values
Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_fs.cpp | 5 + 1 file changed, 5 insertions(+) diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp index 694fcc1919..1b972972c1 100644 --- a/src/intel/compiler/brw_fs.cpp +++ b/src/intel/compiler/brw_fs.cpp @@ -39,6 +39,7 @@ #include "compiler/glsl_types.h" #include "compiler/nir/nir_builder.h" #include "program/prog_parameter.h" +#include "util/half_float.h" using namespace brw; @@ -5532,6 +5533,10 @@ fs_visitor::dump_instruction(backend_instruction *be_inst, FILE *file) break; case IMM: switch (inst->src[i].type) { + case BRW_REGISTER_TYPE_HF: +fprintf(file, "%-gHF", +_mesa_half_to_float((uint16_t)inst->src[i].ud)); +break; case BRW_REGISTER_TYPE_F: fprintf(file, "%-gf", inst->src[i].f); break; -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 31/51] intel/compiler/fs: Pad 16-bit output (store/fb write) payloads
Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_fs_nir.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp index 6d9b272a57..d3125d7dcd 100644 --- a/src/intel/compiler/brw_fs_nir.cpp +++ b/src/intel/compiler/brw_fs_nir.cpp @@ -3254,7 +3254,7 @@ alloc_temporary(const fs_builder &bld, unsigned size, fs_reg *regs, unsigned n, } else { const brw_reg_type type = is_16bit ? BRW_REGISTER_TYPE_HF : BRW_REGISTER_TYPE_F; - const fs_reg tmp = bld.vgrf(type, size); + const fs_reg tmp = bld.vgrf(type, size, is_16bit); for (unsigned i = 0; i < n; i++) regs[i] = tmp; -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 42/51] i965: WIP: Support for uploading 16-bit uniforms from 32-bit store
At this point 16-bit uniforms still take full 32-bit slots in the pull/push constant buffers and in shader deployment payload. Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_compiler.h | 9 + src/intel/compiler/brw_fs.cpp | 12 src/intel/compiler/brw_fs_nir.cpp | 2 ++ src/intel/compiler/brw_fs_visitor.cpp | 1 + src/intel/compiler/brw_vec4.cpp | 8 src/intel/compiler/brw_vec4_gs_visitor.cpp | 8 src/intel/compiler/brw_vec4_visitor.cpp | 4 src/mesa/drivers/dri/i965/brw_cs.c | 2 ++ src/mesa/drivers/dri/i965/brw_curbe.c | 2 ++ src/mesa/drivers/dri/i965/brw_disk_cache.c | 14 ++ src/mesa/drivers/dri/i965/brw_gs.c | 2 ++ src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp | 10 ++ src/mesa/drivers/dri/i965/brw_program.c | 2 ++ src/mesa/drivers/dri/i965/brw_state.h | 1 + src/mesa/drivers/dri/i965/brw_tcs.c | 2 ++ src/mesa/drivers/dri/i965/brw_tes.c | 2 ++ src/mesa/drivers/dri/i965/brw_vs.c | 2 ++ src/mesa/drivers/dri/i965/brw_wm.c | 2 ++ src/mesa/drivers/dri/i965/gen6_constant_state.c | 17 - 19 files changed, 101 insertions(+), 1 deletion(-) diff --git a/src/intel/compiler/brw_compiler.h b/src/intel/compiler/brw_compiler.h index cdd61aae6c..7b43c4a135 100644 --- a/src/intel/compiler/brw_compiler.h +++ b/src/intel/compiler/brw_compiler.h @@ -613,6 +613,12 @@ struct brw_stage_prog_data { */ uint32_t *param; uint32_t *pull_param; + + /* Tells for GLSL backend if conversion from 32-bit store to, for example, +* 16-bits is required. 
+*/ + unsigned char *param_type; /* enum glsl_base_type */ + unsigned char *pull_param_type; /* enum glsl_base_type */ }; static inline uint32_t * @@ -621,6 +627,9 @@ brw_stage_prog_data_add_params(struct brw_stage_prog_data *prog_data, { unsigned old_nr_params = prog_data->nr_params; prog_data->nr_params += nr_new_params; + prog_data->param_type = reralloc(ralloc_parent(prog_data->param_type), +prog_data->param_type, unsigned char, +prog_data->nr_params); prog_data->param = reralloc(ralloc_parent(prog_data->param), prog_data->param, uint32_t, prog_data->nr_params); diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp index 8e77248470..3ca1d4cbc7 100644 --- a/src/intel/compiler/brw_fs.cpp +++ b/src/intel/compiler/brw_fs.cpp @@ -2102,19 +2102,26 @@ fs_visitor::assign_constant_locations() * create two new arrays for push/pull params. */ uint32_t *param = stage_prog_data->param; + unsigned char *param_type = stage_prog_data->param_type; stage_prog_data->nr_params = num_push_constants; if (num_push_constants) { stage_prog_data->param = ralloc_array(mem_ctx, uint32_t, num_push_constants); + stage_prog_data->param_type = ralloc_array(mem_ctx, unsigned char, + num_push_constants); } else { stage_prog_data->param = NULL; + stage_prog_data->param_type = NULL; } assert(stage_prog_data->nr_pull_params == 0); assert(stage_prog_data->pull_param == NULL); + assert(stage_prog_data->pull_param_type == NULL); if (num_pull_constants > 0) { stage_prog_data->nr_pull_params = num_pull_constants; stage_prog_data->pull_param = ralloc_array(mem_ctx, uint32_t, num_pull_constants); + stage_prog_data->pull_param_type = ralloc_array(NULL, unsigned char, + num_pull_constants); } /* Now that we know how many regular uniforms we'll push, reduce the @@ -2143,11 +2150,16 @@ fs_visitor::assign_constant_locations() uint32_t value = param[i]; if (pull_constant_loc[i] != -1) { stage_prog_data->pull_param[pull_constant_loc[i]] = value; + 
stage_prog_data->pull_param_type[pull_constant_loc[i]] = +param_type[i]; } else if (push_constant_loc[i] != -1) { stage_prog_data->param[push_constant_loc[i]] = value; + stage_prog_data->param_type[push_constant_loc[i]] = +param_type[i]; } } ralloc_free(param); + ralloc_free(param_type); } bool diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp index 43127e00e8..2060a3139d 100644 --- a/src/intel/compiler/brw_fs_nir.cpp +++ b/src/intel/compiler/brw_fs_nir.cpp @@ -120,9 +120,11 @@ fs_visitor::nir_setup_uniforms() * on the list. */ assert(uniforms == prog_data->nr_params); + const unsigned old_nr_params = prog_data->nr_params;
[Mesa-dev] [PATCH 28/51] intel/compiler/fs: Use component_size() instead of open coded
This prepares for the following patch, which adds 16-bit tex/fb write payload padding support. Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_fs.cpp | 2 +- src/intel/compiler/brw_fs_copy_propagation.cpp | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp index 0d415e2393..cedfde5096 100644 --- a/src/intel/compiler/brw_fs.cpp +++ b/src/intel/compiler/brw_fs.cpp @@ -639,7 +639,7 @@ bool fs_inst::is_partial_write() const { return ((this->predicate && this->opcode != BRW_OPCODE_SEL) || - (this->exec_size * type_sz(this->dst.type)) < 32 || + dst.component_size(exec_size) < 32 || !this->dst.is_contiguous() || this->dst.offset % REG_SIZE != 0); } diff --git a/src/intel/compiler/brw_fs_copy_propagation.cpp b/src/intel/compiler/brw_fs_copy_propagation.cpp index 470eaeec4f..ed2511ecfa 100644 --- a/src/intel/compiler/brw_fs_copy_propagation.cpp +++ b/src/intel/compiler/brw_fs_copy_propagation.cpp @@ -801,8 +801,8 @@ fs_visitor::opt_copy_propagation_local(void *copy_prop_ctx, bblock_t *block, for (int i = 0; i < inst->sources; i++) { int effective_width = i < inst->header_size ? 8 : inst->exec_size; assert(effective_width * MAX2(4, type_sz(inst->src[i].type)) % REG_SIZE == 0); -const unsigned size_written = effective_width * - type_sz(inst->src[i].type); +const unsigned size_written = + inst->src[i].component_size(effective_width); if (inst->src[i].file == VGRF) { acp_entry *entry = rzalloc(copy_prop_ctx, acp_entry); entry->dst = byte_offset(inst->dst, offset); -- 2.11.0
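For readers following the series, here is a minimal sketch of why switching to component_size() matters once strides come into play. The formula is the one visible in brw_fs.cpp elsewhere in this series (MAX2(width * stride, 1) * type_sz(type)); the struct region type and both function names are hypothetical stand-ins, not Mesa's fs_reg, and the idea that a padded 16-bit value uses a stride of 2 is an assumption made purely for illustration.

```c
#include <assert.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical stand-in for a register region; NOT Mesa's fs_reg. */
struct region {
    unsigned stride;     /* element stride; 0 means a scalar (broadcast) region */
    unsigned type_size;  /* bytes per element: 2 for HF, 4 for F */
};

/* Bytes covered by one component, following
 * MAX2(width * stride, 1) * type_sz(type). */
static unsigned
component_size(const struct region *r, unsigned width)
{
    assert(r->type_size > 0);
    return MAX2(width * r->stride, 1) * r->type_size;
}

/* The open-coded form that the patch replaces; it ignores the stride. */
static unsigned
open_coded_size(const struct region *r, unsigned width)
{
    assert(r->type_size > 0);
    return width * r->type_size;
}
```

For a packed 32-bit float region both forms agree, so the patch changes nothing for existing code. They only diverge for strided regions, which is presumably what the padding patches later in the series rely on.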
[Mesa-dev] [PATCH 24/51] intel/compiler: Add support for negating 16-bit floats
Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_shader.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/intel/compiler/brw_shader.cpp b/src/intel/compiler/brw_shader.cpp index cc9297772b..3a83f55f28 100644 --- a/src/intel/compiler/brw_shader.cpp +++ b/src/intel/compiler/brw_shader.cpp @@ -653,7 +653,8 @@ brw_negate_immediate(enum brw_reg_type type, struct brw_reg *reg) case BRW_REGISTER_TYPE_V: assert(!"unimplemented: negate UV/V immediate"); case BRW_REGISTER_TYPE_HF: - assert(!"unimplemented: negate HF immediate"); + reg->ud ^= 0x8000; + return true; } return false; -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
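As a small illustration of the XOR used above: in the IEEE 754 binary16 layout the sign lives in bit 15, so flipping that bit negates the value without touching exponent or mantissa. The sketch below is standalone; negate_half_bits() and check_negate_roundtrip() are hypothetical names, not Mesa functions.

```c
#include <assert.h>
#include <stdint.h>

/* Negate a half float stored as raw bits by flipping the sign bit (bit 15). */
static uint16_t
negate_half_bits(uint16_t hf)
{
    return (uint16_t)(hf ^ 0x8000u);
}

/* Negating twice must give back the original bit pattern. */
static void
check_negate_roundtrip(uint16_t hf)
{
    assert(negate_half_bits(negate_half_bits(hf)) == hf);
}
```

For example, 1.0 is encoded as 0x3C00 and -1.0 as 0xBC00, and the same operation turns +0 (0x0000) into -0 (0x8000).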
[Mesa-dev] [PATCH 18/51] intel/compiler: Allow 16-bit math
Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_eu_emit.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c index 1507968e6c..87b144e871 100644 --- a/src/intel/compiler/brw_eu_emit.c +++ b/src/intel/compiler/brw_eu_emit.c @@ -1921,8 +1921,10 @@ void gen6_math(struct brw_codegen *p, assert(src1.file == BRW_GENERAL_REGISTER_FILE || (devinfo->gen >= 8 && src1.file == BRW_IMMEDIATE_VALUE)); } else { - assert(src0.type == BRW_REGISTER_TYPE_F); - assert(src1.type == BRW_REGISTER_TYPE_F); + assert(src0.type == BRW_REGISTER_TYPE_F || + src0.type == BRW_REGISTER_TYPE_HF); + assert(src1.type == BRW_REGISTER_TYPE_F || + src1.type == BRW_REGISTER_TYPE_HF); } /* Source modifiers are ignored for extended math instructions on Gen6. */ -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 34/51] intel/compiler/fs: Pad 16-bit const loads into full regs
Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_fs_nir.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp index 64243312b9..c455fa4e27 100644 --- a/src/intel/compiler/brw_fs_nir.cpp +++ b/src/intel/compiler/brw_fs_nir.cpp @@ -1519,7 +1519,8 @@ fs_visitor::nir_emit_load_const(const fs_builder &bld, { const brw_reg_type reg_type = brw_reg_type_from_bit_size(instr->def.bit_size, BRW_REGISTER_TYPE_D); - fs_reg reg = bld.vgrf(reg_type, instr->def.num_components); + fs_reg reg = bld.vgrf(reg_type, instr->def.num_components, + true /* pad components to full regs */); switch (instr->def.bit_size) { case 16: -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 40/51] intel/compiler/fs: Prepare 16-bit and/or/xor for 32-bit src
In the GLSL->NIR translation, logic operations with boolean-typed operands are treated as operating on integer operands. The operand values can therefore be 0xFFFFFFFF/0x00000000 when they are produced with a 32-bit execution type, or 0xFFFF/0x0000 when produced with a 16-bit one. This patch allows 16-bit logic operations to use 32-bit boolean types as sources. Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_fs_nir.cpp | 21 + 1 file changed, 21 insertions(+) diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp index aff592c354..43127e00e8 100644 --- a/src/intel/compiler/brw_fs_nir.cpp +++ b/src/intel/compiler/brw_fs_nir.cpp @@ -1127,6 +1127,13 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr) break; case nir_op_ixor: if (devinfo->gen >= 8) { + if (brw_reg_type_to_size(result.type) == 2) { +op[0] = subscript(op[0], + brw_reg_type_from_bit_size(16, op[0].type), 0); +op[1] = subscript(op[1], + brw_reg_type_from_bit_size(16, op[1].type), 0); + } + op[0] = resolve_source_modifiers(op[0]); op[1] = resolve_source_modifiers(op[1]); } @@ -1134,6 +1141,13 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr) break; case nir_op_ior: if (devinfo->gen >= 8) { + if (brw_reg_type_to_size(result.type) == 2) { +op[0] = subscript(op[0], + brw_reg_type_from_bit_size(16, op[0].type), 0); +op[1] = subscript(op[1], + brw_reg_type_from_bit_size(16, op[1].type), 0); + } + op[0] = resolve_source_modifiers(op[0]); op[1] = resolve_source_modifiers(op[1]); } @@ -1141,6 +1155,13 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr) break; case nir_op_iand: if (devinfo->gen >= 8) { + if (brw_reg_type_to_size(result.type) == 2) { +op[0] = subscript(op[0], + brw_reg_type_from_bit_size(16, op[0].type), 0); +op[1] = subscript(op[1], + brw_reg_type_from_bit_size(16, op[1].type), 0); + } + op[0] = resolve_source_modifiers(op[0]); op[1] = resolve_source_modifiers(op[1]); } -- 2.11.0
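A rough sketch of the idea behind taking subscript 0 of a 32-bit boolean source, under two assumptions that are not spelled out in the patch: that booleans are all-ones or all-zeros bit patterns, and that subscript 0 selects the low 16 bits of each 32-bit channel. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Take the low word of a 32-bit boolean. Because the value is assumed
 * to be all ones or all zeros, the low word is still a valid 16-bit
 * boolean. */
static uint16_t
bool32_low_word(uint32_t b)
{
    assert(b == 0u || b == 0xFFFFFFFFu);
    return (uint16_t)(b & 0xFFFFu);
}

/* 16-bit AND where one source was produced as a 32-bit boolean. */
static uint16_t
and16_mixed(uint32_t a32, uint16_t b16)
{
    return (uint16_t)(bool32_low_word(a32) & b16);
}
```

The point is only that narrowing an all-ones or all-zeros value loses no information, which is why no explicit conversion instruction is needed for these logic operations.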
[Mesa-dev] [PATCH 21/51] intel/compiler/fs: Use 16-bit null dest with 16-bit math
Even though this doesn't seem to alter anything other than the dump output, it is more consistent. Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_fs_generator.cpp | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp index 03fd34c00a..20d018e1fe 100644 --- a/src/intel/compiler/brw_fs_generator.cpp +++ b/src/intel/compiler/brw_fs_generator.cpp @@ -1918,8 +1918,13 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width) if (devinfo->gen >= 6) { assert(inst->mlen == 0); assert(devinfo->gen >= 7 || inst->exec_size == 8); + +struct brw_reg null_reg = brw_null_reg(); +if (brw_reg_type_to_size(dst.type) == 2) + null_reg = retype(null_reg, BRW_REGISTER_TYPE_HF); + gen6_math(p, dst, brw_math_function(inst->opcode), - src[0], brw_null_reg()); + src[0], null_reg); } else { assert(inst->mlen >= 1); assert(devinfo->gen == 5 || devinfo->is_g4x || inst->exec_size == 8); -- 2.11.0
[Mesa-dev] [PATCH 36/51] intel/compiler/fs: Prepare nir_emit_if() for 16-bit sources
Comparison operations using 16-bit sources produce 16-bit results (0xFFFF/0x0000) instead of 32-bit ones (0xFFFFFFFF/0x00000000). Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_fs_nir.cpp | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp index c455fa4e27..baa84b0f3c 100644 --- a/src/intel/compiler/brw_fs_nir.cpp +++ b/src/intel/compiler/brw_fs_nir.cpp @@ -396,10 +396,15 @@ fs_visitor::nir_emit_cf_list(exec_list *list) void fs_visitor::nir_emit_if(nir_if *if_stmt) { + const fs_reg src = get_nir_src(if_stmt->condition); + fs_inst *inst; + /* first, put the condition into f0 */ - fs_inst *inst = bld.MOV(bld.null_reg_d(), -retype(get_nir_src(if_stmt->condition), - BRW_REGISTER_TYPE_D)); + if (brw_reg_type_to_size(src.type) == 2) + inst = bld.MOV(bld.null_reg_w(), retype(src, BRW_REGISTER_TYPE_W)); + else + inst = bld.MOV(bld.null_reg_d(), retype(src, BRW_REGISTER_TYPE_D)); + inst->conditional_mod = BRW_CONDITIONAL_NZ; bld.IF(BRW_PREDICATE_NORMAL); -- 2.11.0
[Mesa-dev] [PATCH 12/51] intel/compiler/disasm: Print 16-bit IMM values
Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_disasm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c index c752e15331..da2a5d78dd 100644 --- a/src/intel/compiler/brw_disasm.c +++ b/src/intel/compiler/brw_disasm.c @@ -1286,7 +1286,8 @@ imm(FILE *file, const struct gen_device_info *devinfo, enum brw_reg_type type, format(file, "%-gDF", brw_inst_imm_df(devinfo, inst)); break; case BRW_REGISTER_TYPE_HF: - string(file, "Half Float IMM"); + format(file, "%-gHF", + _mesa_half_to_float((uint16_t) brw_inst_imm_ud(devinfo, inst))); break; case BRW_REGISTER_TYPE_UB: case BRW_REGISTER_TYPE_B: -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
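To make the disassembly output above concrete, here is a standalone sketch of a half-to-float conversion of the kind the patch relies on. This is not Mesa's _mesa_half_to_float(); half_bits_to_float() is a hypothetical reimplementation written only for illustration, assuming the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 mantissa bits) and a 32-bit IEEE float on the host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Convert raw binary16 bits to a float by rebuilding the binary32
 * bit pattern. Hypothetical helper, not the Mesa implementation. */
static float
half_bits_to_float(uint16_t h)
{
    const uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1fu;
    uint32_t mant = h & 0x3ffu;
    uint32_t bits;
    float f;

    assert(sizeof(float) == sizeof(uint32_t));

    if (exp == 0 && mant == 0) {
        bits = sign;                              /* signed zero */
    } else if (exp == 0) {
        /* Subnormal half: shift the mantissa up until the implicit
         * leading one appears, adjusting the exponent as we go. */
        exp = 127 - 15 + 1;
        while (!(mant & 0x400u)) {
            mant <<= 1;
            exp--;
        }
        bits = sign | (exp << 23) | ((mant & 0x3ffu) << 13);
    } else if (exp == 31) {
        bits = sign | 0x7f800000u | (mant << 13); /* infinity or NaN */
    } else {
        /* Normal number: rebias the exponent from 15 to 127. */
        bits = sign | ((exp + 127 - 15) << 23) | (mant << 13);
    }

    memcpy(&f, &bits, sizeof(f));
    return f;
}
```

With such a helper, printing the immediate 0x3C00 through the "%-gHF" format used in the patch gives 1HF, which is considerably more useful in a disassembly than the old fixed "Half Float IMM" string.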
[Mesa-dev] [PATCH 22/51] intel/compiler/fs: Use 16-bit null dest with 16-bit compare
Otherwise EU-emitter will deduce wrong execution size when examining source types and finding 32-bit wide register. Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_fs_nir.cpp | 16 +--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp index 65a5bfa49a..16e8dfc186 100644 --- a/src/intel/compiler/brw_fs_nir.cpp +++ b/src/intel/compiler/brw_fs_nir.cpp @@ -25,6 +25,7 @@ #include "brw_fs.h" #include "brw_fs_surface_builder.h" #include "brw_nir.h" +#include "util/half_float.h" using namespace brw; using namespace brw::surface_access; @@ -1446,7 +1447,10 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr) if (optimize_frontfacing_ternary(instr, result)) return; - bld.CMP(bld.null_reg_d(), op[0], brw_imm_d(0), BRW_CONDITIONAL_NZ); + if (brw_reg_type_to_size(op[0].type) == 2) + bld.CMP(bld.null_reg_w(), op[0], brw_imm_w(0), BRW_CONDITIONAL_NZ); + else + bld.CMP(bld.null_reg_d(), op[0], brw_imm_d(0), BRW_CONDITIONAL_NZ); inst = bld.SEL(result, op[1], op[2]); inst->predicate = BRW_PREDICATE_NORMAL; break; @@ -3410,8 +3414,14 @@ fs_visitor::nir_emit_fs_intrinsic(const fs_builder &bld, */ fs_inst *cmp; if (instr->intrinsic == nir_intrinsic_discard_if) { - cmp = bld.CMP(bld.null_reg_f(), get_nir_src(instr->src[0]), - brw_imm_d(0), BRW_CONDITIONAL_Z); + const fs_reg src = get_nir_src(instr->src[0]); + + if (brw_reg_type_to_size(src.type) == 2) +cmp = bld.CMP(bld.null_reg_hf(), get_nir_src(instr->src[0]), + brw_imm_w(0), BRW_CONDITIONAL_Z); + else +cmp = bld.CMP(bld.null_reg_f(), get_nir_src(instr->src[0]), + brw_imm_d(0), BRW_CONDITIONAL_Z); } else { fs_reg some_reg = fs_reg(retype(brw_vec8_grf(0, 0), BRW_REGISTER_TYPE_UW)); -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 11/51] glsl: Enable 16-bit texturing in nir-conversion
Signed-off-by: Topi Pohjolainen --- src/compiler/glsl/glsl_to_nir.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/compiler/glsl/glsl_to_nir.cpp b/src/compiler/glsl/glsl_to_nir.cpp index c0adf744e0..b16efa6555 100644 --- a/src/compiler/glsl/glsl_to_nir.cpp +++ b/src/compiler/glsl/glsl_to_nir.cpp @@ -2057,6 +2057,9 @@ nir_visitor::visit(ir_texture *ir) case GLSL_TYPE_FLOAT: instr->dest_type = nir_type_float; break; + case GLSL_TYPE_FLOAT16: + instr->dest_type = nir_type_float16; + break; case GLSL_TYPE_INT: instr->dest_type = nir_type_int; break; -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 25/51] intel/compiler/fs: Support for combining 16-bit immediates
Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_fs_combine_constants.cpp | 84 + 1 file changed, 71 insertions(+), 13 deletions(-) diff --git a/src/intel/compiler/brw_fs_combine_constants.cpp b/src/intel/compiler/brw_fs_combine_constants.cpp index e0c95d379b..5772ffb94a 100644 --- a/src/intel/compiler/brw_fs_combine_constants.cpp +++ b/src/intel/compiler/brw_fs_combine_constants.cpp @@ -36,6 +36,7 @@ #include "brw_fs.h" #include "brw_cfg.h" +#include "util/half_float.h" using namespace brw; @@ -95,6 +96,15 @@ link(void *mem_ctx, fs_reg *reg) return &l->link; } +union imm_val { + double df; + uint64_t u64; + int64_t d64; + float f; + int d; + unsigned ud; +}; + /** * Information about an immediate value. */ @@ -114,8 +124,10 @@ struct imm { */ exec_list *uses; - /** The immediate value. We currently only handle floats. */ - float val; + enum brw_reg_type type; + + /** The immediate value. We currently handle floats and half floats. */ + union imm_val val; /** * The GRF register and subregister number where we've decided to store the @@ -145,10 +157,10 @@ struct table { }; static struct imm * -find_imm(struct table *table, float val) +find_imm(struct table *table, enum brw_reg_type type, union imm_val val) { for (int i = 0; i < table->len; i++) { - if (table->imm[i].val == val) { + if (table->imm[i].val.u64 == val.u64 && table->imm[i].type == type) { return &table->imm[i]; } } @@ -190,6 +202,33 @@ compare(const void *_a, const void *_b) return a->first_use_ip - b->first_use_ip; } +static uint16_t +fabs_f16(uint16_t hf) +{ + return _mesa_float_to_half(fabs(_mesa_half_to_float(hf))); +} + +static union imm_val +get_val(const struct gen_device_info *devinfo, fs_inst *inst, unsigned i) +{ + union imm_val res = { 0 }; + + switch (inst->src[i].type) { + case BRW_REGISTER_TYPE_F: + res.f = !inst->can_do_source_mods(devinfo) ? + inst->src[i].f : fabs(inst->src[i].f); + break; + case BRW_REGISTER_TYPE_HF: + res.ud = !inst->can_do_source_mods(devinfo) ? 
+ inst->src[i].ud : fabs_f16(inst->src[i].ud); + break; + default: + unreachable("unsupported immediate type"); + } + + return res; +} + bool fs_visitor::opt_combine_constants() { @@ -215,12 +254,12 @@ fs_visitor::opt_combine_constants() for (int i = 0; i < inst->sources; i++) { if (inst->src[i].file != IMM || - inst->src[i].type != BRW_REGISTER_TYPE_F) + (inst->src[i].type != BRW_REGISTER_TYPE_F && + inst->src[i].type != BRW_REGISTER_TYPE_HF)) continue; - float val = !inst->can_do_source_mods(devinfo) ? inst->src[i].f : - fabs(inst->src[i].f); - struct imm *imm = find_imm(&table, val); + union imm_val val = get_val(devinfo, inst, i); + struct imm *imm = find_imm(&table, inst->src[i].type, val); if (imm) { bblock_t *intersection = cfg_t::intersect(block, imm->block); @@ -238,6 +277,7 @@ fs_visitor::opt_combine_constants() imm->uses = new(const_ctx) exec_list(); imm->uses->push_tail(link(const_ctx, &inst->src[i])); imm->val = val; +imm->type = inst->src[i].type; imm->uses_by_coissue = could_coissue(devinfo, inst); imm->must_promote = must_promote_imm(devinfo, inst); imm->first_use_ip = ip; @@ -278,7 +318,14 @@ fs_visitor::opt_combine_constants() imm->block->last_non_control_flow_inst()->next); const fs_builder ibld = bld.at(imm->block, n).exec_all().group(1, 0); - ibld.MOV(reg, brw_imm_f(imm->val)); + if (imm->type == BRW_REGISTER_TYPE_F) + ibld.MOV(reg, brw_imm_f(imm->val.f)); + else if (imm->type == BRW_REGISTER_TYPE_HF) { + ibld.MOV(retype(reg, BRW_REGISTER_TYPE_HF), + retype(brw_imm_ud(imm->val.ud), BRW_REGISTER_TYPE_HF)); + } else + unreachable("unsupported immediate type"); + imm->nr = reg.nr; imm->subreg_offset = reg.offset; @@ -298,9 +345,19 @@ fs_visitor::opt_combine_constants() reg->nr = table.imm[i].nr; reg->offset = table.imm[i].subreg_offset; reg->stride = 0; - reg->negate = signbit(reg->f) != signbit(table.imm[i].val); - assert((isnan(reg->f) && isnan(table.imm[i].val)) || -fabsf(reg->f) == fabs(table.imm[i].val)); + reg->negate = signbit(reg->f) != 
signbit(table.imm[i].val.f); + + switch (table.imm[i].type) { + case BRW_REGISTER_TYPE_F: +assert((isnan(reg->f) && isnan(table.imm[i].val.f)) || + fabsf(reg->f) == fabs(table.imm[i].val.f)); +break; + case BRW_REGISTER_TYPE_HF: +assert(fabs_f16(reg->ud) == fabs_f16(table.imm[i].val.ud)); +br
[Mesa-dev] [PATCH 23/51] intel/compiler: Prepare for 16-bit 3-src ops
Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_eu_emit.c | 21 + src/intel/compiler/brw_inst.h | 4 src/intel/compiler/brw_reg_type.c | 2 ++ 3 files changed, 27 insertions(+) diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c index 87b144e871..fb8d5b5513 100644 --- a/src/intel/compiler/brw_eu_emit.c +++ b/src/intel/compiler/brw_eu_emit.c @@ -810,6 +810,7 @@ brw_alu3(struct brw_codegen *p, unsigned opcode, struct brw_reg dest, assert(dest.file == BRW_GENERAL_REGISTER_FILE || dest.file == BRW_MESSAGE_REGISTER_FILE); assert(dest.type == BRW_REGISTER_TYPE_F || + dest.type == BRW_REGISTER_TYPE_HF || dest.type == BRW_REGISTER_TYPE_DF || dest.type == BRW_REGISTER_TYPE_D || dest.type == BRW_REGISTER_TYPE_UD); @@ -857,6 +858,21 @@ brw_alu3(struct brw_codegen *p, unsigned opcode, struct brw_reg dest, */ brw_inst_set_3src_a16_src_type(devinfo, inst, dest.type); brw_inst_set_3src_a16_dst_type(devinfo, inst, dest.type); + + if (dest.type == BRW_REGISTER_TYPE_HF) { +/* From the Bspec: Instruction types + * + * Three source instructions can use operands with mixed-mode + * precision. When SrcType field is set to :f or :hf it defines + * precision for source 0 only, and fields Src1Type and Src2Type + * define precision for other source operands: + * + * 0b = :f. Single precision Float (32-bit). + * 1b = :hf. Half precision Float (16-bit). 
+ */ +brw_inst_set_3src_src1_type(devinfo, inst, 1); +brw_inst_set_3src_src2_type(devinfo, inst, 1); + } } } @@ -902,11 +918,16 @@ brw_inst *brw_##OP(struct brw_codegen *p, \ struct brw_reg src2) \ { \ assert(dest.type == BRW_REGISTER_TYPE_F || \ + dest.type == BRW_REGISTER_TYPE_HF || \ dest.type == BRW_REGISTER_TYPE_DF); \ if (dest.type == BRW_REGISTER_TYPE_F) { \ assert(src0.type == BRW_REGISTER_TYPE_F); \ assert(src1.type == BRW_REGISTER_TYPE_F); \ assert(src2.type == BRW_REGISTER_TYPE_F); \ + } else if (dest.type == BRW_REGISTER_TYPE_HF) { \ + assert(src0.type == BRW_REGISTER_TYPE_HF);\ + assert(src1.type == BRW_REGISTER_TYPE_HF);\ + assert(src2.type == BRW_REGISTER_TYPE_HF);\ } else if (dest.type == BRW_REGISTER_TYPE_DF) { \ assert(src0.type == BRW_REGISTER_TYPE_DF);\ assert(src1.type == BRW_REGISTER_TYPE_DF);\ diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h index 2501d6adff..c295a2b3ff 100644 --- a/src/intel/compiler/brw_inst.h +++ b/src/intel/compiler/brw_inst.h @@ -222,6 +222,10 @@ F8(3src_src1_negate,39, 39, 40, 40) F8(3src_src1_abs, 38, 38, 39, 39) F8(3src_src0_negate,37, 37, 38, 38) F8(3src_src0_abs, 36, 36, 37, 37) + +F(3src_src2_type, 36, 36) +F(3src_src1_type, 35, 35) + F8(3src_a16_flag_reg_nr,34, 34, 33, 33) F8(3src_a16_flag_subreg_nr, 33, 33, 32, 32) FF(3src_a16_dst_reg_file, diff --git a/src/intel/compiler/brw_reg_type.c b/src/intel/compiler/brw_reg_type.c index b7fff0867f..55956ef563 100644 --- a/src/intel/compiler/brw_reg_type.c +++ b/src/intel/compiler/brw_reg_type.c @@ -93,6 +93,7 @@ enum hw_3src_reg_type { GEN7_3SRC_TYPE_D = 1, GEN7_3SRC_TYPE_UD = 2, GEN7_3SRC_TYPE_DF = 3, + GEN7_3SRC_TYPE_HF = 4, /** When ExecutionDatatype is 1: @{ */ GEN10_ALIGN1_3SRC_REG_TYPE_HF = 0b000, @@ -120,6 +121,7 @@ static const struct hw_3src_type { [BRW_REGISTER_TYPE_D] = { GEN7_3SRC_TYPE_D }, [BRW_REGISTER_TYPE_UD] = { GEN7_3SRC_TYPE_UD }, [BRW_REGISTER_TYPE_DF] = { GEN7_3SRC_TYPE_DF }, + [BRW_REGISTER_TYPE_HF] = { GEN7_3SRC_TYPE_HF }, 
}, gen10_hw_3src_align1_type[] = { #define E(x) BRW_ALIGN1_3SRC_EXEC_TYPE_##x [0 ... BRW_REGISTER_TYPE_LAST] = { INVALID }, -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 09/51] glsl: Allow 16-bit neg() and dot()
Signed-off-by: Topi Pohjolainen --- src/compiler/glsl/ir_validate.cpp | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/src/compiler/glsl/ir_validate.cpp b/src/compiler/glsl/ir_validate.cpp index a20f52e527..735e862141 100644 --- a/src/compiler/glsl/ir_validate.cpp +++ b/src/compiler/glsl/ir_validate.cpp @@ -263,7 +263,8 @@ ir_validate::visit_leave(ir_expression *ir) assert(ir->operands[0]->type->base_type == GLSL_TYPE_INT || ir->operands[0]->type->is_float() || ir->operands[0]->type->is_double() || - ir->operands[0]->type->base_type == GLSL_TYPE_INT64); + ir->operands[0]->type->base_type == GLSL_TYPE_INT64 || + ir->operands[0]->type->base_type == GLSL_TYPE_FLOAT16); assert(ir->type == ir->operands[0]->type); break; @@ -742,9 +743,11 @@ ir_validate::visit_leave(ir_expression *ir) case ir_binop_dot: assert(ir->type == glsl_type::float_type || - ir->type == glsl_type::double_type); + ir->type == glsl_type::double_type || + ir->type->base_type == GLSL_TYPE_FLOAT16); assert(ir->operands[0]->type->is_float() || - ir->operands[0]->type->is_double()); + ir->operands[0]->type->is_double() || + ir->operands[0]->type->base_type == GLSL_TYPE_FLOAT16); assert(ir->operands[0]->type->is_vector()); assert(ir->operands[0]->type == ir->operands[1]->type); break; -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 19/51] intel/compiler/fs: Add helpers for 16-bit null regs
Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_fs_builder.h | 12 1 file changed, 12 insertions(+) diff --git a/src/intel/compiler/brw_fs_builder.h b/src/intel/compiler/brw_fs_builder.h index 87394bc17b..633086c64b 100644 --- a/src/intel/compiler/brw_fs_builder.h +++ b/src/intel/compiler/brw_fs_builder.h @@ -205,6 +205,12 @@ namespace brw { } dst_reg + null_reg_hf() const + { + return dst_reg(retype(brw_null_reg(), BRW_REGISTER_TYPE_HF)); + } + + dst_reg null_reg_df() const { return dst_reg(retype(brw_null_reg(), BRW_REGISTER_TYPE_DF)); @@ -219,6 +225,12 @@ namespace brw { return dst_reg(retype(brw_null_reg(), BRW_REGISTER_TYPE_D)); } + dst_reg + null_reg_w() const + { + return dst_reg(retype(brw_null_reg(), BRW_REGISTER_TYPE_W)); + } + /** * Create a null register of unsigned integer type. */ -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 20/51] intel/compiler/fs: Use two SIMD8 instructions for 16-bit math
Signed-off-by: Topi Pohjolainen --- src/intel/compiler/brw_fs.cpp | 18 ++ 1 file changed, 18 insertions(+) diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp index 3c70231be8..5751bb0ad7 100644 --- a/src/intel/compiler/brw_fs.cpp +++ b/src/intel/compiler/brw_fs.cpp @@ -4903,6 +4903,15 @@ get_lowered_simd_width(const struct gen_device_info *devinfo, case SHADER_OPCODE_LOG2: case SHADER_OPCODE_SIN: case SHADER_OPCODE_COS: + /* From the SKL PRM Vol 2, math - Extended Math Function: + * + * The execution size must be no more than 8 when half-floats are used + * in source or destination operand. + */ + if (inst->src[0].type == BRW_REGISTER_TYPE_HF || + inst->dst.type == BRW_REGISTER_TYPE_HF) + return MIN2(8, inst->exec_size); + /* Unary extended math instructions are limited to SIMD8 on Gen4 and * Gen6. */ @@ -4911,6 +4920,15 @@ get_lowered_simd_width(const struct gen_device_info *devinfo, MIN2(8, inst->exec_size)); case SHADER_OPCODE_POW: + /* From the SKL PRM Vol 2, math - Extended Math Function: + * + * The execution size must be no more than 8 when half-floats are used + * in source or destination operand. + */ + if (inst->src[0].type == BRW_REGISTER_TYPE_HF || + inst->dst.type == BRW_REGISTER_TYPE_HF) + return MIN2(8, inst->exec_size); + /* SIMD16 is only allowed on Gen7+. */ return (devinfo->gen >= 7 ? MIN2(16, inst->exec_size) : MIN2(8, inst->exec_size)); -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
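The width selection for the POW case can be summarised in a few lines. The sketch below is a simplification under stated assumptions: lowered_pow_width() and split_count() are hypothetical names, the gen parameter stands in for devinfo->gen, and uses_half_float stands in for the source/destination type checks in the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Pick the execution width for a POW instruction. Half-float operands
 * cap the width at 8; otherwise SIMD16 is only allowed on Gen7+. */
static unsigned
lowered_pow_width(unsigned exec_size, bool uses_half_float, int gen)
{
    assert(exec_size == 8 || exec_size == 16);

    if (uses_half_float)
        return MIN2(8, exec_size);

    return gen >= 7 ? MIN2(16, exec_size) : MIN2(8, exec_size);
}

/* Number of instructions needed once the width has been lowered. */
static unsigned
split_count(unsigned exec_size, unsigned lowered_width)
{
    assert(lowered_width > 0 && exec_size % lowered_width == 0);
    return exec_size / lowered_width;
}
```

So a SIMD16 half-float math instruction ends up as two SIMD8 instructions, which matches the subject of the patch.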
[Mesa-dev] [PATCH 10/51] glsl: Allow 16-bit math
Signed-off-by: Topi Pohjolainen --- src/compiler/glsl/ir_validate.cpp | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/src/compiler/glsl/ir_validate.cpp b/src/compiler/glsl/ir_validate.cpp index 735e862141..d246af866d 100644 --- a/src/compiler/glsl/ir_validate.cpp +++ b/src/compiler/glsl/ir_validate.cpp @@ -272,7 +272,8 @@ ir_validate::visit_leave(ir_expression *ir) case ir_unop_rsq: case ir_unop_sqrt: assert(ir->type->is_float() || - ir->type->is_double()); + ir->type->is_double() || + ir->type->base_type == GLSL_TYPE_FLOAT16); assert(ir->type == ir->operands[0]->type); break; @@ -281,7 +282,9 @@ ir_validate::visit_leave(ir_expression *ir) case ir_unop_exp2: case ir_unop_log2: case ir_unop_saturate: - assert(ir->operands[0]->type->is_float()); + assert(ir->operands[0]->type->is_float() || + (ir->operands[0]->type->get_scalar_type()->base_type == + GLSL_TYPE_FLOAT16)); assert(ir->type == ir->operands[0]->type); break; -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 16/51] intel/compiler: Move type_size_scalar() into brw_shader.cpp
The next patch will add another variant, and in order not to make brw_fs.cpp any bigger than it already is, add both to brw_shader.cpp instead.

Signed-off-by: Topi Pohjolainen
---
 src/intel/compiler/brw_fs.cpp     | 48 ---
 src/intel/compiler/brw_shader.cpp | 48 +++
 2 files changed, 48 insertions(+), 48 deletions(-)

diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 1b972972c1..3c70231be8 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -470,54 +470,6 @@ fs_reg::component_size(unsigned width) const
    return MAX2(width * stride, 1) * type_sz(type);
 }

-extern "C" int
-type_size_scalar(const struct glsl_type *type)
-{
-   unsigned int size, i;
-
-   switch (type->base_type) {
-   case GLSL_TYPE_UINT:
-   case GLSL_TYPE_INT:
-   case GLSL_TYPE_FLOAT:
-   case GLSL_TYPE_BOOL:
-      return type->components();
-   case GLSL_TYPE_UINT16:
-   case GLSL_TYPE_INT16:
-   case GLSL_TYPE_FLOAT16:
-      return DIV_ROUND_UP(type->components(), 2);
-   case GLSL_TYPE_DOUBLE:
-   case GLSL_TYPE_UINT64:
-   case GLSL_TYPE_INT64:
-      return type->components() * 2;
-   case GLSL_TYPE_ARRAY:
-      return type_size_scalar(type->fields.array) * type->length;
-   case GLSL_TYPE_STRUCT:
-      size = 0;
-      for (i = 0; i < type->length; i++) {
-         size += type_size_scalar(type->fields.structure[i].type);
-      }
-      return size;
-   case GLSL_TYPE_SAMPLER:
-      /* Samplers take up no register space, since they're baked in at
-       * link time.
-       */
-      return 0;
-   case GLSL_TYPE_ATOMIC_UINT:
-      return 0;
-   case GLSL_TYPE_SUBROUTINE:
-      return 1;
-   case GLSL_TYPE_IMAGE:
-      return BRW_IMAGE_PARAM_SIZE;
-   case GLSL_TYPE_VOID:
-   case GLSL_TYPE_ERROR:
-   case GLSL_TYPE_INTERFACE:
-   case GLSL_TYPE_FUNCTION:
-      unreachable("not reached");
-   }
-
-   return 0;
-}
-
 /**
  * Create a MOV to read the timestamp register.
  *
diff --git a/src/intel/compiler/brw_shader.cpp b/src/intel/compiler/brw_shader.cpp
index 74b52976d7..234b5a11c1 100644
--- a/src/intel/compiler/brw_shader.cpp
+++ b/src/intel/compiler/brw_shader.cpp
@@ -30,6 +30,54 @@
 #include "main/uniforms.h"
 #include "util/macros.h"

+extern "C" int
+type_size_scalar(const struct glsl_type *type)
+{
+   unsigned int size, i;
+
+   switch (type->base_type) {
+   case GLSL_TYPE_UINT:
+   case GLSL_TYPE_INT:
+   case GLSL_TYPE_FLOAT:
+   case GLSL_TYPE_BOOL:
+      return type->components();
+   case GLSL_TYPE_UINT16:
+   case GLSL_TYPE_INT16:
+   case GLSL_TYPE_FLOAT16:
+      return DIV_ROUND_UP(type->components(), 2);
+   case GLSL_TYPE_DOUBLE:
+   case GLSL_TYPE_UINT64:
+   case GLSL_TYPE_INT64:
+      return type->components() * 2;
+   case GLSL_TYPE_ARRAY:
+      return type_size_scalar(type->fields.array) * type->length;
+   case GLSL_TYPE_STRUCT:
+      size = 0;
+      for (i = 0; i < type->length; i++) {
+         size += type_size_scalar(type->fields.structure[i].type);
+      }
+      return size;
+   case GLSL_TYPE_SAMPLER:
+      /* Samplers take up no register space, since they're baked in at
+       * link time.
+       */
+      return 0;
+   case GLSL_TYPE_ATOMIC_UINT:
+      return 0;
+   case GLSL_TYPE_SUBROUTINE:
+      return 1;
+   case GLSL_TYPE_IMAGE:
+      return BRW_IMAGE_PARAM_SIZE;
+   case GLSL_TYPE_VOID:
+   case GLSL_TYPE_ERROR:
+   case GLSL_TYPE_INTERFACE:
+   case GLSL_TYPE_FUNCTION:
+      unreachable("not reached");
+   }
+
+   return 0;
+}
+
 enum brw_reg_type
 brw_type_for_base_type(const struct glsl_type *type)
 {
--
2.11.0
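The slot counting that type_size_scalar() performs may be easier to follow on a reduced model. The sketch below is an assumption-laden illustration rather than the Mesa function: Base, Type and slot_count() are hypothetical, and only the 16/32/64-bit vector, array and struct cases from the patch are modelled (samplers, images and the rest are left out).

```cpp
#include <cassert>
#include <vector>

// Hypothetical, reduced stand-in for glsl_type.
enum class Base { Bits16, Bits32, Bits64, Array, Struct };

struct Type {
   Base base;
   unsigned components;        // vector width for scalar/vector types
   unsigned length;            // element count for arrays
   std::vector<Type> fields;   // array: the one element type; struct: members
};

// Size in 32-bit slots, following the same cases as the moved function.
unsigned slot_count(const Type &t)
{
   switch (t.base) {
   case Base::Bits16:
      return (t.components + 1) / 2;   // DIV_ROUND_UP(components, 2)
   case Base::Bits32:
      return t.components;
   case Base::Bits64:
      return t.components * 2;
   case Base::Array:
      assert(t.fields.size() == 1);
      return slot_count(t.fields[0]) * t.length;
   case Base::Struct: {
      unsigned size = 0;
      for (const Type &member : t.fields)
         size += slot_count(member);
      return size;
   }
   }
   return 0;
}
```

One consequence worth noting in this model: the round-up happens per element, so an array of four scalar 16-bit values still counts four slots rather than two.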
[Mesa-dev] [PATCH 03/51] nir: Add 16-bit float support into algebraic opts
Signed-off-by: Topi Pohjolainen
---
 src/compiler/nir/nir_search.c | 4
 1 file changed, 4 insertions(+)

diff --git a/src/compiler/nir/nir_search.c b/src/compiler/nir/nir_search.c
index dec56fee74..3b28da4a3f 100644
--- a/src/compiler/nir/nir_search.c
+++ b/src/compiler/nir/nir_search.c
@@ -27,6 +27,7 @@

 #include
 #include "nir_search.h"
+#include "util/half_float.h"

 struct match_state {
    bool inexact_match;
@@ -194,6 +195,9 @@ match_value(const nir_search_value *value, nir_alu_instr *instr, unsigned src,
          for (unsigned i = 0; i < num_components; ++i) {
             double val;
             switch (load->def.bit_size) {
+            case 16:
+               val = _mesa_half_to_float(load->value.u16[new_swizzle[i]]);
+               break;
             case 32:
                val = load->value.f32[new_swizzle[i]];
                break;
--
2.11.0
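What the new case does is widen a 16-bit constant to double so it can be compared against the search pattern like the 32-bit and 64-bit cases. A minimal sketch of that idea follows, assuming IEEE binary16 storage; half_to_float() here is a plain stand-in written for this example, not the _mesa_half_to_float() implementation, and const_as_double() is a hypothetical helper.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Stand-in decoder for IEEE 754 binary16 (1 sign, 5 exponent, 10 mantissa bits).
float half_to_float(uint16_t h)
{
   const int sign = (h >> 15) & 0x1;
   const int exp = (h >> 10) & 0x1f;
   const int mant = h & 0x3ff;
   float mag;

   if (exp == 0)          // zero or subnormal: mant * 2^-24
      mag = std::ldexp(static_cast<float>(mant), -24);
   else if (exp == 31)    // infinity or NaN
      mag = mant ? NAN : INFINITY;
   else                   // normal: (1024 + mant) * 2^(exp - 15 - 10)
      mag = std::ldexp(static_cast<float>(mant | 0x400), exp - 25);

   return sign ? -mag : mag;
}

// Hypothetical helper: interpret the raw bits of a constant of the given bit
// size as a floating-point value and widen it to double.
double const_as_double(unsigned bit_size, uint64_t raw)
{
   switch (bit_size) {
   case 16:
      return half_to_float(static_cast<uint16_t>(raw));
   case 32: {
      const uint32_t bits = static_cast<uint32_t>(raw);
      float f;
      std::memcpy(&f, &bits, sizeof f);
      return f;
   }
   case 64: {
      double d;
      std::memcpy(&d, &raw, sizeof d);
      return d;
   }
   default:
      assert(!"unsupported bit size");
      return 0.0;
   }
}
```

Every binary16 value is exactly representable as a float (and therefore as a double), so widening before the comparison loses nothing.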
[Mesa-dev] [PATCH 07/51] glsl: Add conversion ops to/from 16-bit floats
Signed-off-by: Topi Pohjolainen
---
 src/compiler/glsl/glsl_to_nir.cpp            | 2 ++
 src/compiler/glsl/ir.cpp                     | 8
 src/compiler/glsl/ir_expression_operation.py | 5 +
 src/compiler/glsl/ir_validate.cpp            | 8
 src/mesa/program/ir_to_mesa.cpp              | 2 ++
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp   | 3 +++
 6 files changed, 28 insertions(+)

diff --git a/src/compiler/glsl/glsl_to_nir.cpp b/src/compiler/glsl/glsl_to_nir.cpp
index 289f8be031..14c358465b 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -1561,6 +1561,8 @@ nir_visitor::visit(ir_expression *ir)
    case ir_unop_d2b:
    case ir_unop_i2d:
    case ir_unop_u2d:
+   case ir_unop_h2f:
+   case ir_unop_f2h:
    case ir_unop_i642i:
    case ir_unop_i642u:
    case ir_unop_i642f:
diff --git a/src/compiler/glsl/ir.cpp b/src/compiler/glsl/ir.cpp
index 2c61dd9d64..a901ec5683 100644
--- a/src/compiler/glsl/ir.cpp
+++ b/src/compiler/glsl/ir.cpp
@@ -281,6 +281,7 @@ ir_expression::ir_expression(int op, ir_rvalue *op0)
    case ir_unop_i2f:
    case ir_unop_u2f:
    case ir_unop_d2f:
+   case ir_unop_h2f:
    case ir_unop_bitcast_i2f:
    case ir_unop_bitcast_u2f:
    case ir_unop_i642f:
@@ -334,6 +335,13 @@ ir_expression::ir_expression(int op, ir_rvalue *op0)
       this->type = glsl_type::get_instance(GLSL_TYPE_UINT64,
                                            op0->type->vector_elements, 1);
       break;
+
+   case ir_unop_f2h:
+      this->type = glsl_type::get_instance(GLSL_TYPE_FLOAT16,
+                                           op0->type->vector_elements, 1);
+      break;
+
+
    case ir_unop_noise:
       this->type = glsl_type::float_type;
       break;
diff --git a/src/compiler/glsl/ir_expression_operation.py b/src/compiler/glsl/ir_expression_operation.py
index d8542925a0..3158533c02 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -82,6 +82,7 @@ int_type = type("int", "i", "GLSL_TYPE_INT")
 uint64_type = type("uint64_t", "u64", "GLSL_TYPE_UINT64")
 int64_type = type("int64_t", "i64", "GLSL_TYPE_INT64")
 float_type = type("float", "f", "GLSL_TYPE_FLOAT")
+float16_t_type = type("float16_t_type", "f", "GLSL_TYPE_FLOAT16")
 double_type = type("double", "d", "GLSL_TYPE_DOUBLE")
 bool_type = type("bool", "b", "GLSL_TYPE_BOOL")

@@ -460,6 +461,10 @@ ir_expression_operation = [
    operation("u2d", 1, source_types=(uint_type,), dest_type=double_type, c_expression="{src0}"),
    # Double-to-boolean conversion.
    operation("d2b", 1, source_types=(double_type,), dest_type=bool_type, c_expression="{src0} != 0.0"),
+   # Half-to-float conversion.
+   operation("h2f", 1, source_types=(float16_t_type,), dest_type=float_type, c_expression="{src0}"),
+   # Float-to-half conversion.
+   operation("f2h", 1, source_types=(float_type,), dest_type=float16_t_type, c_expression="{src0}"),
    # 'Bit-identical int-to-float "conversion"
    operation("bitcast_i2f", 1, source_types=(int_type,), dest_type=float_type, c_expression="bitcast_u2f({src0})"),
    # 'Bit-identical float-to-int "conversion"
diff --git a/src/compiler/glsl/ir_validate.cpp b/src/compiler/glsl/ir_validate.cpp
index aa07f8aea6..29e3cda865 100644
--- a/src/compiler/glsl/ir_validate.cpp
+++ b/src/compiler/glsl/ir_validate.cpp
@@ -595,6 +595,14 @@ ir_validate::visit_leave(ir_expression *ir)
       assert(ir->operands[0]->type->is_double());
       assert(ir->type->is_boolean());
       break;
+   case ir_unop_h2f:
+      assert(ir->operands[0]->type->base_type == GLSL_TYPE_FLOAT16);
+      assert(ir->type->is_float());
+      break;
+   case ir_unop_f2h:
+      assert(ir->operands[0]->type->is_float());
+      assert(ir->type->base_type == GLSL_TYPE_FLOAT16);
+      break;

    case ir_unop_frexp_sig:
       assert(ir->operands[0]->type->is_float() ||
diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
index ac12389f70..d57e50366e 100644
--- a/src/mesa/program/ir_to_mesa.cpp
+++ b/src/mesa/program/ir_to_mesa.cpp
@@ -1313,6 +1313,8 @@ ir_to_mesa_visitor::visit(ir_expression *ir)
    case ir_unop_d2u:
    case ir_unop_u2d:
    case ir_unop_d2b:
+   case ir_unop_h2f:
+   case ir_unop_f2h:
    case ir_unop_frexp_sig:
    case ir_unop_frexp_exp:
       assert(!"not supported");
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 0772b73627..f8cb94c7dc 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -1797,6 +1797,9 @@ glsl_to_tgsi_visitor::visit_expression(ir_expression* ir, st_src_reg *op)
       else
          emit_asm(ir, TGSI_OPCODE_SNE, result_dst, op[0], st_src_reg_for_float(0.0));
       break;
+   case ir_unop_h2f:
+   case ir_unop_f2h:
+      unreachable("not implemented yet");
    case ir_unop_bitcast_u642d:
    case ir_unop_bitcast_i642d:
       result_src = op[0];
--
2.11.0
[Mesa-dev] i965: Kicking off fp16 glsl support
After Igalia's work on SPIRV 16-bit storage, the question arose of how much more is needed in order to optimize GLES lowp/mediump with 16-bit floats. I took glb 2.7 trex as a target and started drafting a glsl lowering pass re-typing mediump floats into float16. In parallel, I added, bit by bit, equivalent support into the GLSL -> NIR pass and into the Intel compiler backend.

This series enables lowering for fragment shaders only. This was sufficient for trex, which doesn't use mediump precision for vertex shaders.

First of all, this is not complete work. I'd like to think of it more as trying to give an idea of what is currently missing and, by giving concrete (if not ideal) solutions, making each case a little clearer.

On SKL this runs trex pretty much on par with 32-bit. Intel hardware doesn't have native support for linear interpolation using 16-bit floats, and therefore pln() and lrp() incur additional moves from 16 bits to 32 bits (and vice versa). Both can be replaced relatively efficiently using mad() later on. Comparing shader dumps between 16-bit and 32-bit indicates that all optimization passes kick in nicely (sampler-eot, mad(), etc). The only additions are the aforementioned conversion instructions.

The series starts with miscellaneous bits needed in glsl and nir. This is followed by the equivalent bits in the Intel compiler backend. These are followed by changes that are subject to more debate:

1) Support for SIMD8 fp16 in liveness analysis, copy propagation, dead code elimination, etc.

   In order to tell if one instruction fully overwrites the results of another, one needs to examine how much of a register is written. Until now this has been done at the granularity of a full or partial register, i.e., there is no concept of a "full sub-region write". And until now there was no need, as all data types took 4 bytes per element, resulting in a full 32-byte register even in the case of SIMD8. Partial writes were all special and could be safely ignored in various analysis passes.

   Half precision floats, however, break this assumption. On SIMD8, a full write with 16-bit elements fills only half a register. I tried patching the different passes examining partial writes one by one, but that started to get out of hand. Moreover, just by looking at a register's type size it is not safe to say whether it really is a full write or not.

   The solution here is to explicitly store this information in registers: I added a new member, fs_reg::pad_per_component. Subsequently patching fs_reg::component_size() to take the padding into account propagates the information to all users. Patch 28 updates a few users to use component_size() instead of open-coding it, 29 adds the actual support and 30-35 update NIR -> FS to signal the padding (these are separated just for review).

   It should be noted that here one deals with virtual registers. The final hardware allocator is separate, and using full registers in virtual space shouldn't prevent it from using tighter packing.

   Chema, this overlaps with your work, I hope you don't mind.

2) Booleans produced from 16-bit sources.

   Whereas for GLSL and for NIR booleans are just booleans, on Intel hardware they are integers, and the representation depends on how they are produced. Comparisons (flt, fge, feq and fne) with 32-bit sources produce 32-bit results (0x000/0x) while with 16-bit sources one gets 16-bit results (0x0/0x).

   I thought about introducing a 16-bit boolean into NIR, but that felt like too hardware-specific a thing to do. Instead I patched NIR -> FS to take the type of the producing instruction into account when setting up the SSA values. See patch 39 for the setup and patches 36-38 for consulting the backend SSA store instead of relying on NIR.

   Another approach left to try is emitting additional moves into 32 bits (the same way we do for fp64). One could then add an optimization pass that removes unnecessary moves and uses strided sources instead.

3) Following up on 2), GLSL -> NIR decides to emit integer-typed and/or/xor even for originally boolean-typed logic ops. Patch 40 tries to cope with the case where the booleans are produced with non-matching precision.

4) In the Intel compiler backend, the push/pull constant setup relies on values being packed in 32-bit slots. Moreover, these slots are typeless and the loader doesn't know if it is dealing with floats or integers, let alone their precision.

   Patch 42 takes the first step and simply adds type information into the backend. This is not particularly pretty, but I had to start somewhere. This allows the loader to convert float values from the 32-bit store in the core to 16 bits on the fly. Patch 43 adjusts the compiler to use 32-bit slots. Using 16-bit slots would require substantially more work. I think there is no question about the core using 32-bit values. And even if the values there were 16-bit, bac
[Mesa-dev] [PATCH 05/51] nir: Print 16-bit constants
Signed-off-by: Topi Pohjolainen
---
 src/compiler/nir/nir_print.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/compiler/nir/nir_print.c b/src/compiler/nir/nir_print.c
index fcc8025346..9ed23a74bb 100644
--- a/src/compiler/nir/nir_print.c
+++ b/src/compiler/nir/nir_print.c
@@ -27,6 +27,7 @@

 #include "nir.h"
 #include "compiler/shader_enums.h"
+#include "util/half_float.h"
 #include
 #include
 #include /* for PRIx64 macro */
@@ -842,6 +843,10 @@ print_load_const_instr(nir_load_const_instr *instr, print_state *state)
       if (instr->def.bit_size == 64)
          fprintf(fp, "0x%16" PRIx64 " /* %f */", instr->value.u64[i],
                  instr->value.f64[i]);
+      else if (instr->def.bit_size == 16)
+         fprintf(fp, "0x%04x /* %f */",
+                 instr->value.u16[i],
+                 _mesa_half_to_float(instr->value.u16[i]));
       else
          fprintf(fp, "0x%08x /* %f */", instr->value.u32[i],
                  instr->value.f32[i]);
    }
--
2.11.0
[Mesa-dev] [PATCH 06/51] glsl: Add support for 16-bit float constants in nir-conversion
Signed-off-by: Topi Pohjolainen
---
 src/compiler/glsl/glsl_to_nir.cpp | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/compiler/glsl/glsl_to_nir.cpp b/src/compiler/glsl/glsl_to_nir.cpp
index 1e636225c1..289f8be031 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -32,6 +32,7 @@
 #include "compiler/nir/nir_control_flow.h"
 #include "compiler/nir/nir_builder.h"
 #include "main/imports.h"
+#include "util/half_float.h"

 /*
  * pass to lower GLSL IR to NIR
@@ -245,6 +246,14 @@ constant_copy(ir_constant *ir, void *mem_ctx)

       break;

+   case GLSL_TYPE_FLOAT16:
+      for (unsigned c = 0; c < cols; c++) {
+         for (unsigned r = 0; r < rows; r++)
+            ret->values[c].u16[r] =
+               _mesa_float_to_half(ir->value.f[c * rows + r]);
+      }
+      break;
+
    case GLSL_TYPE_FLOAT:
       for (unsigned c = 0; c < cols; c++) {
          for (unsigned r = 0; r < rows; r++)
--
2.11.0
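The new case reads the constant from the 32-bit float storage (ir->value.f), converts each element to 16 bits, and writes it column by column. As a rough sketch of that flow, assuming IEEE binary16 with round-to-nearest-even: float_to_half() below is a stand-in written for this example and is not claimed to match _mesa_float_to_half() bit for bit, and HalfColumn and copy_half_constant() are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in encoder to IEEE 754 binary16, rounding to nearest even.
uint16_t float_to_half(float f)
{
   uint32_t bits;
   std::memcpy(&bits, &f, sizeof bits);
   const uint32_t sign = (bits >> 16) & 0x8000u;
   const uint32_t mag = bits & 0x7fffffffu;

   if (mag > 0x7f800000u)      // NaN
      return static_cast<uint16_t>(sign | 0x7e00u);
   if (mag >= 0x47800000u)     // 65536.0 and above, including infinity
      return static_cast<uint16_t>(sign | 0x7c00u);
   if (mag < 0x38800000u) {    // below 2^-14: zero or subnormal half
      // lrint() rounds in the current mode, which defaults to nearest-even.
      const long m = std::lrint(std::ldexp(std::fabs(f), 24));
      return static_cast<uint16_t>(sign | static_cast<uint32_t>(m));
   }

   // Normal range: rebias the exponent (127 -> 15) and drop 13 mantissa bits.
   uint32_t half = (mag - 0x38000000u) >> 13;
   const uint32_t rem = mag & 0x1fffu;
   if (rem > 0x1000u || (rem == 0x1000u && (half & 1u)))
      half++;                  // a carry into the exponent is intended
   return static_cast<uint16_t>(sign | half);
}

using HalfColumn = std::array<uint16_t, 4>;

// Hypothetical helper with the same indexing as the patch: the flat source
// array is column-major, element (c, r) lives at flat[c * rows + r].
std::vector<HalfColumn> copy_half_constant(const float *flat,
                                           unsigned cols, unsigned rows)
{
   assert(rows <= 4);
   std::vector<HalfColumn> values(cols, HalfColumn{});
   for (unsigned c = 0; c < cols; c++) {
      for (unsigned r = 0; r < rows; r++)
         values[c][r] = float_to_half(flat[c * rows + r]);
   }
   return values;
}
```

Since the conversion happens only here, the GLSL IR side keeps carrying 16-bit float constants in 32-bit float storage, which is what the indexing into ir->value.f in the patch suggests.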