[Mesa-dev] [PATCH] mesa: fix texstore for MESA_FORMAT_R8G8B8A8_SRGB
The case for this was in the wrong function, and this format's store
func was not set in the table at all.

Signed-off-by: Chris Forbes
---
 src/mesa/main/texstore.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/texstore.c b/src/mesa/main/texstore.c
index b68ba60..fe3b072 100644
--- a/src/mesa/main/texstore.c
+++ b/src/mesa/main/texstore.c
@@ -3260,12 +3260,16 @@ _mesa_texstore_srgba8(TEXSTORE_PARAMS)
    GLboolean k;

    ASSERT(dstFormat == MESA_FORMAT_A8B8G8R8_SRGB ||
-          dstFormat == MESA_FORMAT_R8G8B8X8_SRGB);
+          dstFormat == MESA_FORMAT_R8G8B8X8_SRGB ||
+          dstFormat == MESA_FORMAT_R8G8B8A8_SRGB);

    /* reuse normal rgba texstore code */
    if (dstFormat == MESA_FORMAT_A8B8G8R8_SRGB) {
       newDstFormat = MESA_FORMAT_A8B8G8R8_UNORM;
    }
+   else if (dstFormat == MESA_FORMAT_R8G8B8A8_SRGB) {
+      newDstFormat = MESA_FORMAT_R8G8B8A8_UNORM;
+   }
    else if (dstFormat == MESA_FORMAT_R8G8B8X8_SRGB) {
       newDstFormat = MESA_FORMAT_R8G8B8X8_UNORM;
    }
@@ -3294,9 +3298,6 @@ _mesa_texstore_sargb8(TEXSTORE_PARAMS)
    case MESA_FORMAT_B8G8R8A8_SRGB:
       newDstFormat = MESA_FORMAT_B8G8R8A8_UNORM;
       break;
-   case MESA_FORMAT_R8G8B8A8_SRGB:
-      newDstFormat = MESA_FORMAT_R8G8B8A8_UNORM;
-      break;
    case MESA_FORMAT_B8G8R8X8_SRGB:
       newDstFormat = MESA_FORMAT_B8G8R8X8_UNORM;
       break;
@@ -3852,6 +3853,7 @@ _mesa_get_texstore_func(mesa_format format)
       table[MESA_FORMAT_B5G5R5X1_UNORM] = store_ubyte_texture;
       table[MESA_FORMAT_R8G8B8X8_SNORM] = _mesa_texstore_signed_rgbx;
       table[MESA_FORMAT_R8G8B8X8_SRGB] = _mesa_texstore_srgba8;
+      table[MESA_FORMAT_R8G8B8A8_SRGB] = _mesa_texstore_srgba8;
       table[MESA_FORMAT_RGBX_UINT8] = _mesa_texstore_rgba_uint8;
       table[MESA_FORMAT_RGBX_SINT8] = _mesa_texstore_rgba_int8;
       table[MESA_FORMAT_B10G10R10X2_UNORM] = _mesa_texstore_argb2101010;
--
1.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
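For readers who don't have texstore.c open, here is a rough standalone sketch of the two things the patch above touches: the sRGB-to-UNORM mapping and the per-format store-function table. Every name in it (`enum fmt`, `srgb_to_unorm`, `store_srgba8`, `get_store_func`) is a simplified stand-in invented for illustration, not Mesa's real definitions; it only shows why a format needs both a mapping case and a table entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for a few mesa_format values. */
enum fmt {
   FMT_A8B8G8R8_SRGB,
   FMT_R8G8B8A8_SRGB,
   FMT_R8G8B8X8_SRGB,
   FMT_A8B8G8R8_UNORM,
   FMT_R8G8B8A8_UNORM,
   FMT_R8G8B8X8_UNORM,
   FMT_COUNT
};

typedef int (*store_func)(enum fmt dst);

/* Map an sRGB format to the UNORM format with the same channel layout,
 * mirroring the if/else chain in the patch. FMT_COUNT signals a miss. */
static enum fmt
srgb_to_unorm(enum fmt f)
{
   switch (f) {
   case FMT_A8B8G8R8_SRGB: return FMT_A8B8G8R8_UNORM;
   case FMT_R8G8B8A8_SRGB: return FMT_R8G8B8A8_UNORM;
   case FMT_R8G8B8X8_SRGB: return FMT_R8G8B8X8_UNORM;
   default:                return FMT_COUNT;
   }
}

/* Stand-in store function: reports success only for formats it can map. */
static int
store_srgba8(enum fmt dst)
{
   return srgb_to_unorm(dst) != FMT_COUNT;
}

/* Look up the store function for a format. A NULL result means the
 * format was never registered, which is the second half of the bug. */
static store_func
get_store_func(enum fmt f)
{
   static store_func table[FMT_COUNT];
   static int initialized;

   if (!initialized) {
      table[FMT_A8B8G8R8_SRGB] = store_srgba8;
      table[FMT_R8G8B8X8_SRGB] = store_srgba8;
      table[FMT_R8G8B8A8_SRGB] = store_srgba8; /* the entry the patch adds */
      initialized = 1;
   }
   return table[f];
}
```

Dropping either the `FMT_R8G8B8A8_SRGB` case or its table line in this sketch reproduces one half of the problem described in the commit message.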
Re: [Mesa-dev] [PATCH] i965/vec4: Don't trim writemasks of texture instructions.
On 03/28/2014 04:58 PM, Matt Turner wrote:
> It was my understanding that the writemask works in SIMD4x2 mode for
> texturing instructions and doesn't require a message header. Some bit of
> this logic must be wrong, so disable it until it's understood.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76617
> ---
> v2: Base on master, rather than breaking commit.
>
>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 4ae6020..32a3892 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -351,8 +351,10 @@ try_eliminate_instruction(vec4_instruction *inst, int new_writemask)
>        case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7:
>           break;
>        default:
> -         inst->dst.writemask = new_writemask;
> -         return true;
> +         if (!inst->is_tex()) {
> +            inst->dst.writemask = new_writemask;
> +            return true;
> +         }
>        }
>     }
>
>

Reviewed-by: Kenneth Graunke
Re: [Mesa-dev] RFC: ARB_sample_shading in gallium
On 29.03.2014 00:57, Ilia Mirkin wrote:
> On Fri, Mar 28, 2014 at 7:43 PM, Roland Scheidegger wrote:
>> On 28.03.2014 23:57, Ilia Mirkin wrote:
>>> On Fri, Mar 28, 2014 at 6:41 PM, Roland Scheidegger wrote:
>>>> On 28.03.2014 22:56, Ilia Mirkin wrote:
>>>>> On Fri, Mar 28, 2014 at 5:47 PM, Roland Scheidegger wrote:
>>>>>> On 28.03.2014 22:18, Ilia Mirkin wrote:
>>>>>>> Hey guys,
>>>>>>>
>>>>>>> I was thinking of taking a shot at implementing ARB_sample_shading for
>>>>>>> nv50 (well, nva3-nva8) this weekend. One of the issues is that it's
>>>>>>> not implemented in gallium at all right now, so I need to pipe it
>>>>>>> through somehow. I believe that the only piece of data that needs to
>>>>>>> be piped through is the value returned by
>>>>>>> _mesa_get_min_invocations_per_fragment, which is a function of the fp,
>>>>>>> the drawbuffer, and the MS state. When that value is > 1, sample
>>>>>>> shading is effectively enabled. (I guess even when it's == 1, things
>>>>>>> like gl_SampleID still need to work, perhaps it's worth adding a
>>>>>>> separate enabled bit too.)
>>>>>>>
>>>>>>> Should this single integer get its own set_* callback, similar to
>>>>>>> set_sample_mask, or should it be included somewhere, e.g.
>>>>>>> pipe_framebuffer_state? Or even added to the set_sample_mask call?
>>>>>>>
>>>>>>
>>>>>> Would something like in d3d10.1 work where you simply say that inputs
>>>>>> are interpolated at sample frequency? That way you can also have some
>>>>>> inputs which are not interpolated at sample frequency (I thought there's
>>>>>> opengl functionality for this too somewhere - even if not I'd really
>>>>>> like to have that functionality in gallium). It would just need new
>>>>>> interpolation mode enums.
>>>>>> Though I guess this does not fully cover ARB_sample_shading - this
>>>>>> extension allows you for instance to have msaa 4x, but run fs at 2x (I
>>>>>> could be wrong but I don't think you can do that in d3d, I don't know if
>>>>>> hw can do it presumably some can otherwise it wouldn't be in the
>>>>>> extension, though it is definitely worded in a way that makes it
>>>>>> possible to just run at full sample frequency).
>>>>>
>>>>> I have 0 familiarity with d3d, but it does indeed seem like part of
>>>>> the point of ARB_sample_shading is to run on less than 100% of the
>>>>> samples. This appears to be supported by NVA3+ hardware based on our
>>>>> current docs in rnndb, although the current piglit tests don't really
>>>>> exercise all the functionality. [I haven't checked, but I assume NVC0+
>>>>> as well.] Although only 1/2/4/8 are supported, based on those docs
>>>>> (e.g. you can't tell it to run on 5 samples).
>>>>>
>>>>> An alternative to passing in the result of
>>>>> _mesa_get_min_invocations_per_fragment is to just pass the percentage
>>>>> (which, I guess for D3D10.1 would either be 0 or 100?),
>>>> Yes I guess it would be just 0 or 100.
>>>>
>>>>> and redoing
>>>>> the calculation inside of gallium based on the same criteria.
>>>>
>>>> That would be doable too indeed.
>>>> Though indeed OpenGL also allows "sample" interpolation qualifier, so it
>>>> looks like we're going to need this anyway (ARB_shading_language_420pack
>>>> for instance). Don't ask me though how this is supposed to work if
>>>> simply enabling ARB_sample_shading already causes all inputs to be
>>>> interpolated per sample anyway?
>>>> The gl spec (4.4 core, end of chapter 14.3.1 and 14.3.1.1) has some
>>>> explanation how it could work - so if there's at least one "sample"
>>>> qualifier in the fs inputs, that causes those inputs to be evaluated per
>>>> sample (which implies running the fragment shader at sample frequency).
>>>> The interactions with SAMPLE_SHADING are not resolved, though, and imho
>>>> anything but obvious.
>>>>
>>>> So if the ability to run the fragment shader at something else than
>>>> per-pixel or per-sample frequency is useful, then something is needed to
>>>> set this value one way or another. Otherwise new interpolation modes
>>>> should do just fine and make things easier.
>>>>
>>>> Roland
>>>
>>> I believe the use-case for the partial thing is in issue #3 of the
>>> ARB_sample_shading spec (although I'm not 100% sure what they're
>>> talking about, they do seem to be talking about a gl_Sample*-less
>>> shader). Based on the _mesa_get_min_invocations_per_fragment impl, as
>>> soon as gl_Sample* gets used by the shader, it flips into per-sample
>>> mode (which wasn't at all my reading of the spec, but I assume this
>>> was done by people who understand things). Presumably there's some
>>> benefit to doing the per-some-sample mode, otherwise the spec wouldn't
>>> have introduced the MinSampleShadingARB call.
>> Note that ARB_sample_shading is _older_ than the sample input qualifier
>> (I think that first came with ARB_gpu_shader5), that issue #3 is solved
>> with per-sample frequency just as well of course, though obviously it
>> should be cheaper (and less qualit
[Mesa-dev] [PATCH] i965/vec4: Don't trim writemasks of texture instructions.
It was my understanding that the writemask works in SIMD4x2 mode for
texturing instructions and doesn't require a message header. Some bit of
this logic must be wrong, so disable it until it's understood.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76617
---
v2: Base on master, rather than breaking commit.

 src/mesa/drivers/dri/i965/brw_vec4.cpp | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 4ae6020..32a3892 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -351,8 +351,10 @@ try_eliminate_instruction(vec4_instruction *inst, int new_writemask)
       case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7:
          break;
       default:
-         inst->dst.writemask = new_writemask;
-         return true;
+         if (!inst->is_tex()) {
+            inst->dst.writemask = new_writemask;
+            return true;
+         }
       }
    }

--
1.8.3.2
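As a rough illustration of what the guard in this patch does, the sketch below trims a destination writemask down to the channels that are still live, but refuses to touch texture instructions. It is plain C with a made-up `struct instr` and a hypothetical `try_trim_writemask()` helper; the real `vec4_instruction` is a C++ class, and the real pass also handles eliminating the instruction outright, which is left out here.

```c
#include <stdbool.h>

/* Hypothetical, simplified instruction record (an assumption for this
 * sketch; not the i965 backend's data structure). */
struct instr {
   bool is_tex;          /* true for texturing (sampler message) opcodes */
   unsigned writemask;   /* bit 0 = x, bit 1 = y, bit 2 = z, bit 3 = w */
};

/* Try to shrink the destination writemask to the channels still live.
 * Returns true if the instruction was changed. Texture instructions
 * keep their full writemask, as in the patch. */
static bool
try_trim_writemask(struct instr *inst, unsigned live_channels)
{
   unsigned new_mask = inst->writemask & live_channels;

   if (new_mask == inst->writemask)
      return false;            /* nothing to trim */
   if (inst->is_tex)
      return false;            /* conservative: leave texturing alone */

   inst->writemask = new_mask;
   return true;
}
```

The cost of the conservative branch is only a missed optimization: a texture instruction may write channels nobody reads, which is safe if wasteful.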
Re: [Mesa-dev] RFC: ARB_sample_shading in gallium
On Fri, Mar 28, 2014 at 7:43 PM, Roland Scheidegger wrote:
> On 28.03.2014 23:57, Ilia Mirkin wrote:
>> On Fri, Mar 28, 2014 at 6:41 PM, Roland Scheidegger wrote:
>>> On 28.03.2014 22:56, Ilia Mirkin wrote:
>>>> On Fri, Mar 28, 2014 at 5:47 PM, Roland Scheidegger wrote:
>>>>> On 28.03.2014 22:18, Ilia Mirkin wrote:
>>>>>> Hey guys,
>>>>>>
>>>>>> I was thinking of taking a shot at implementing ARB_sample_shading for
>>>>>> nv50 (well, nva3-nva8) this weekend. One of the issues is that it's
>>>>>> not implemented in gallium at all right now, so I need to pipe it
>>>>>> through somehow. I believe that the only piece of data that needs to
>>>>>> be piped through is the value returned by
>>>>>> _mesa_get_min_invocations_per_fragment, which is a function of the fp,
>>>>>> the drawbuffer, and the MS state. When that value is > 1, sample
>>>>>> shading is effectively enabled. (I guess even when it's == 1, things
>>>>>> like gl_SampleID still need to work, perhaps it's worth adding a
>>>>>> separate enabled bit too.)
>>>>>>
>>>>>> Should this single integer get its own set_* callback, similar to
>>>>>> set_sample_mask, or should it be included somewhere, e.g.
>>>>>> pipe_framebuffer_state? Or even added to the set_sample_mask call?
>>>>>>
>>>>>
>>>>> Would something like in d3d10.1 work where you simply say that inputs
>>>>> are interpolated at sample frequency? That way you can also have some
>>>>> inputs which are not interpolated at sample frequency (I thought there's
>>>>> opengl functionality for this too somewhere - even if not I'd really
>>>>> like to have that functionality in gallium). It would just need new
>>>>> interpolation mode enums.
>>>>> Though I guess this does not fully cover ARB_sample_shading - this
>>>>> extension allows you for instance to have msaa 4x, but run fs at 2x (I
>>>>> could be wrong but I don't think you can do that in d3d, I don't know if
>>>>> hw can do it presumably some can otherwise it wouldn't be in the
>>>>> extension, though it is definitely worded in a way that makes it
>>>>> possible to just run at full sample frequency).
>>>>
>>>> I have 0 familiarity with d3d, but it does indeed seem like part of
>>>> the point of ARB_sample_shading is to run on less than 100% of the
>>>> samples. This appears to be supported by NVA3+ hardware based on our
>>>> current docs in rnndb, although the current piglit tests don't really
>>>> exercise all the functionality. [I haven't checked, but I assume NVC0+
>>>> as well.] Although only 1/2/4/8 are supported, based on those docs
>>>> (e.g. you can't tell it to run on 5 samples).
>>>>
>>>> An alternative to passing in the result of
>>>> _mesa_get_min_invocations_per_fragment is to just pass the percentage
>>>> (which, I guess for D3D10.1 would either be 0 or 100?),
>>> Yes I guess it would be just 0 or 100.
>>>
>>>> and redoing
>>>> the calculation inside of gallium based on the same criteria.
>>>
>>> That would be doable too indeed.
>>> Though indeed OpenGL also allows "sample" interpolation qualifier, so it
>>> looks like we're going to need this anyway (ARB_shading_language_420pack
>>> for instance). Don't ask me though how this is supposed to work if
>>> simply enabling ARB_sample_shading already causes all inputs to be
>>> interpolated per sample anyway?
>>> The gl spec (4.4 core, end of chapter 14.3.1 and 14.3.1.1) has some
>>> explanation how it could work - so if there's at least one "sample"
>>> qualifier in the fs inputs, that causes those inputs to be evaluated per
>>> sample (which implies running the fragment shader at sample frequency).
>>> The interactions with SAMPLE_SHADING are not resolved, though, and imho
>>> anything but obvious.
>>>
>>> So if the ability to run the fragment shader at something else than
>>> per-pixel or per-sample frequency is useful, then something is needed to
>>> set this value one way or another. Otherwise new interpolation modes
>>> should do just fine and make things easier.
>>>
>>> Roland
>>
>> I believe the use-case for the partial thing is in issue #3 of the
>> ARB_sample_shading spec (although I'm not 100% sure what they're
>> talking about, they do seem to be talking about a gl_Sample*-less
>> shader). Based on the _mesa_get_min_invocations_per_fragment impl, as
>> soon as gl_Sample* gets used by the shader, it flips into per-sample
>> mode (which wasn't at all my reading of the spec, but I assume this
>> was done by people who understand things). Presumably there's some
>> benefit to doing the per-some-sample mode, otherwise the spec wouldn't
>> have introduced the MinSampleShadingARB call.
> Note that ARB_sample_shading is _older_ than the sample input qualifier
> (I think that first came with ARB_gpu_shader5), that issue #3 is solved
> with per-sample frequency just as well of course, though obviously it
> should be cheaper (and less quality) to run at some frequency between 1
> and max samples rather than max sample.
>
>> Although it'd be
>> entirely within the spec (if
[Mesa-dev] [PATCH] i965/vec4: Don't trim writemasks of texture instructions.
It was my understanding that the writemask works in SIMD4x2 mode for
texturing instructions and doesn't require a message header. Some bit of
this logic must be wrong, so disable it until it's understood.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76617
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 673086d..d7d649d 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -383,8 +383,10 @@ vec4_visitor::dead_code_eliminate()
             case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7:
                break;
             default:
-               progress = true;
-               inst->dst.writemask = write_mask;
+               if (!inst->is_tex()) {
+                  progress = true;
+                  inst->dst.writemask = write_mask;
+               }
                break;
             }
          }
--
1.8.3.2
Re: [Mesa-dev] RFC: ARB_sample_shading in gallium
On 28.03.2014 23:57, Ilia Mirkin wrote:
> On Fri, Mar 28, 2014 at 6:41 PM, Roland Scheidegger wrote:
>> On 28.03.2014 22:56, Ilia Mirkin wrote:
>>> On Fri, Mar 28, 2014 at 5:47 PM, Roland Scheidegger wrote:
>>>> On 28.03.2014 22:18, Ilia Mirkin wrote:
>>>>> Hey guys,
>>>>>
>>>>> I was thinking of taking a shot at implementing ARB_sample_shading for
>>>>> nv50 (well, nva3-nva8) this weekend. One of the issues is that it's
>>>>> not implemented in gallium at all right now, so I need to pipe it
>>>>> through somehow. I believe that the only piece of data that needs to
>>>>> be piped through is the value returned by
>>>>> _mesa_get_min_invocations_per_fragment, which is a function of the fp,
>>>>> the drawbuffer, and the MS state. When that value is > 1, sample
>>>>> shading is effectively enabled. (I guess even when it's == 1, things
>>>>> like gl_SampleID still need to work, perhaps it's worth adding a
>>>>> separate enabled bit too.)
>>>>>
>>>>> Should this single integer get its own set_* callback, similar to
>>>>> set_sample_mask, or should it be included somewhere, e.g.
>>>>> pipe_framebuffer_state? Or even added to the set_sample_mask call?
>>>>>
>>>>
>>>> Would something like in d3d10.1 work where you simply say that inputs
>>>> are interpolated at sample frequency? That way you can also have some
>>>> inputs which are not interpolated at sample frequency (I thought there's
>>>> opengl functionality for this too somewhere - even if not I'd really
>>>> like to have that functionality in gallium). It would just need new
>>>> interpolation mode enums.
>>>> Though I guess this does not fully cover ARB_sample_shading - this
>>>> extension allows you for instance to have msaa 4x, but run fs at 2x (I
>>>> could be wrong but I don't think you can do that in d3d, I don't know if
>>>> hw can do it presumably some can otherwise it wouldn't be in the
>>>> extension, though it is definitely worded in a way that makes it
>>>> possible to just run at full sample frequency).
>>>
>>> I have 0 familiarity with d3d, but it does indeed seem like part of
>>> the point of ARB_sample_shading is to run on less than 100% of the
>>> samples. This appears to be supported by NVA3+ hardware based on our
>>> current docs in rnndb, although the current piglit tests don't really
>>> exercise all the functionality. [I haven't checked, but I assume NVC0+
>>> as well.] Although only 1/2/4/8 are supported, based on those docs
>>> (e.g. you can't tell it to run on 5 samples).
>>>
>>> An alternative to passing in the result of
>>> _mesa_get_min_invocations_per_fragment is to just pass the percentage
>>> (which, I guess for D3D10.1 would either be 0 or 100?),
>> Yes I guess it would be just 0 or 100.
>>
>>> and redoing
>>> the calculation inside of gallium based on the same criteria.
>>
>> That would be doable too indeed.
>> Though indeed OpenGL also allows "sample" interpolation qualifier, so it
>> looks like we're going to need this anyway (ARB_shading_language_420pack
>> for instance). Don't ask me though how this is supposed to work if
>> simply enabling ARB_sample_shading already causes all inputs to be
>> interpolated per sample anyway?
>> The gl spec (4.4 core, end of chapter 14.3.1 and 14.3.1.1) has some
>> explanation how it could work - so if there's at least one "sample"
>> qualifier in the fs inputs, that causes those inputs to be evaluated per
>> sample (which implies running the fragment shader at sample frequency).
>> The interactions with SAMPLE_SHADING are not resolved, though, and imho
>> anything but obvious.
>>
>> So if the ability to run the fragment shader at something else than
>> per-pixel or per-sample frequency is useful, then something is needed to
>> set this value one way or another. Otherwise new interpolation modes
>> should do just fine and make things easier.
>>
>> Roland
>
> I believe the use-case for the partial thing is in issue #3 of the
> ARB_sample_shading spec (although I'm not 100% sure what they're
> talking about, they do seem to be talking about a gl_Sample*-less
> shader). Based on the _mesa_get_min_invocations_per_fragment impl, as
> soon as gl_Sample* gets used by the shader, it flips into per-sample
> mode (which wasn't at all my reading of the spec, but I assume this
> was done by people who understand things). Presumably there's some
> benefit to doing the per-some-sample mode, otherwise the spec wouldn't
> have introduced the MinSampleShadingARB call.

Note that ARB_sample_shading is _older_ than the sample input qualifier
(I think that first came with ARB_gpu_shader5), that issue #3 is solved
with per-sample frequency just as well of course, though obviously it
should be cheaper (and less quality) to run at some frequency between 1
and max samples rather than max sample.

> Although it'd be
> entirely within the spec (if not efficient) to ignore it entirely and
> just assume that it's always 0 or 1.
>
> I think I'm going to start by adding a set_sample_shading() call that
>
[Mesa-dev] [PATCH 1/2] st: fix st_choose_matching_format to ignore intensity
_mesa_format_matches_format_and_type() returns true for
GL_RED/GL_RED_INTEGER (with an appropriate type) into an intensity
mesa_format. We want the `red`-based format instead, regardless of the
order we find them in our walk of the mesa formats list.

Signed-off-by: Chris Forbes
---
 src/mesa/state_tracker/st_format.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/state_tracker/st_format.c b/src/mesa/state_tracker/st_format.c
index cd6b466..62cee1c 100644
--- a/src/mesa/state_tracker/st_format.c
+++ b/src/mesa/state_tracker/st_format.c
@@ -1750,6 +1750,11 @@ st_choose_matching_format(struct pipe_screen *screen, unsigned bind,
       if (_mesa_get_format_color_encoding(mesa_format) == GL_SRGB) {
          continue;
       }
+      if (_mesa_get_format_bits(mesa_format, GL_TEXTURE_INTENSITY_SIZE) > 0) {
+         /* if `format` is GL_RED/GL_RED_INTEGER, then we might match some
+          * intensity formats, which we don't want. */
+         continue;
+      }

       if (_mesa_format_matches_format_and_type(mesa_format, format, type,
                                                swapBytes)) {
--
1.9.0
[Mesa-dev] [PATCH 2/2] mesa: Fix format matching checks for GL_INTENSITY* internalformats.
GL_INTENSITY has never been valid as a pixel format -- to get the memcpy
pack/unpack paths, the app needs to specify GL_RED as the pixel format
(or GL_RED_INTEGER for the integer formats).

Note: This was briefly merged before, but exposed some breakage in
gallium, so was reverted. Hopefully it will stick this time.

Signed-off-by: Chris Forbes
Reviewed-by: Brian Paul
---
 src/mesa/main/formats.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/src/mesa/main/formats.c b/src/mesa/main/formats.c
index 4fb1f11..fb2501c 100644
--- a/src/mesa/main/formats.c
+++ b/src/mesa/main/formats.c
@@ -3153,9 +3153,9 @@ _mesa_format_matches_format_and_type(mesa_format mesa_format,
    case MESA_FORMAT_L_UNORM16:
       return format == GL_LUMINANCE && type == GL_UNSIGNED_SHORT && !swapBytes;
    case MESA_FORMAT_I_UNORM8:
-      return format == GL_INTENSITY && type == GL_UNSIGNED_BYTE;
+      return format == GL_RED && type == GL_UNSIGNED_BYTE;
    case MESA_FORMAT_I_UNORM16:
-      return format == GL_INTENSITY && type == GL_UNSIGNED_SHORT && !swapBytes;
+      return format == GL_RED && type == GL_UNSIGNED_SHORT && !swapBytes;

    case MESA_FORMAT_YCBCR:
       return format == GL_YCBCR_MESA &&
@@ -3247,9 +3247,9 @@ _mesa_format_matches_format_and_type(mesa_format mesa_format,
       return format == GL_LUMINANCE_ALPHA && type == GL_HALF_FLOAT && !swapBytes;

    case MESA_FORMAT_I_FLOAT32:
-      return format == GL_INTENSITY && type == GL_FLOAT && !swapBytes;
+      return format == GL_RED && type == GL_FLOAT && !swapBytes;
    case MESA_FORMAT_I_FLOAT16:
-      return format == GL_INTENSITY && type == GL_HALF_FLOAT && !swapBytes;
+      return format == GL_RED && type == GL_HALF_FLOAT && !swapBytes;

    case MESA_FORMAT_R_FLOAT32:
       return format == GL_RED && type == GL_FLOAT && !swapBytes;
@@ -3277,13 +3277,17 @@ _mesa_format_matches_format_and_type(mesa_format mesa_format,
       return format == GL_ALPHA_INTEGER && type == GL_INT && !swapBytes;

    case MESA_FORMAT_I_UINT8:
+      return format == GL_RED_INTEGER && type == GL_UNSIGNED_BYTE;
    case MESA_FORMAT_I_UINT16:
+      return format == GL_RED_INTEGER && type == GL_UNSIGNED_SHORT && !swapBytes;
    case MESA_FORMAT_I_UINT32:
+      return format == GL_RED_INTEGER && type == GL_UNSIGNED_INT && !swapBytes;
    case MESA_FORMAT_I_SINT8:
+      return format == GL_RED_INTEGER && type == GL_BYTE;
    case MESA_FORMAT_I_SINT16:
+      return format == GL_RED_INTEGER && type == GL_SHORT && !swapBytes;
    case MESA_FORMAT_I_SINT32:
-      /* GL_INTENSITY_INTEGER_EXT doesn't exist. */
-      return GL_FALSE;
+      return format == GL_RED_INTEGER && type == GL_INT && !swapBytes;

    case MESA_FORMAT_L_UINT8:
       return format == GL_LUMINANCE_INTEGER_EXT && type == GL_UNSIGNED_BYTE;
@@ -3450,7 +3454,7 @@ _mesa_format_matches_format_and_type(mesa_format mesa_format,
       return format == GL_LUMINANCE_ALPHA && type == GL_BYTE &&
              littleEndian && !swapBytes;
    case MESA_FORMAT_I_SNORM8:
-      return format == GL_INTENSITY && type == GL_BYTE;
+      return format == GL_RED && type == GL_BYTE;
    case MESA_FORMAT_A_SNORM16:
       return format == GL_ALPHA && type == GL_SHORT && !swapBytes;
    case MESA_FORMAT_L_SNORM16:
@@ -3459,7 +3463,7 @@ _mesa_format_matches_format_and_type(mesa_format mesa_format,
       return format == GL_LUMINANCE_ALPHA && type == GL_SHORT &&
              littleEndian && !swapBytes;
    case MESA_FORMAT_I_SNORM16:
-      return format == GL_INTENSITY && type == GL_SHORT && littleEndian &&
+      return format == GL_RED && type == GL_SHORT && littleEndian &&
              !swapBytes;

    case MESA_FORMAT_B10G10R10A2_UINT:
--
1.9.0
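To make the interaction between patches 1/2 and 2/2 concrete, here is a small standalone sketch. Once intensity formats report a match for red pixel data, a walk over all formats could return an intensity format for a GL_RED upload unless intensity formats are skipped first. All identifiers below (`enum pixel_format`, `enum tex_format`, `choose_matching_format`, and so on) are invented for the example and only loosely mirror the Mesa functions named in the patches.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the GL enums and mesa_format values involved. */
enum pixel_format { PF_RED, PF_LUMINANCE, PF_INTENSITY };
enum pixel_type   { PT_UNSIGNED_BYTE, PT_FLOAT };
enum tex_format   { TF_R_UNORM8, TF_I_UNORM8, TF_L_UNORM8 };

/* After patch 2/2: intensity storage matches red pixel data, because
 * "intensity" was never a valid client pixel format. */
static bool
matches_format_and_type(enum tex_format tf, enum pixel_format pf,
                        enum pixel_type pt)
{
   switch (tf) {
   case TF_R_UNORM8: return pf == PF_RED && pt == PT_UNSIGNED_BYTE;
   case TF_I_UNORM8: return pf == PF_RED && pt == PT_UNSIGNED_BYTE;
   case TF_L_UNORM8: return pf == PF_LUMINANCE && pt == PT_UNSIGNED_BYTE;
   }
   return false;
}

static bool
is_intensity(enum tex_format tf)
{
   return tf == TF_I_UNORM8;
}

/* Patch 1/2 in miniature: walk the formats, skip intensity ones, and
 * return the first match, or -1 if nothing matches. */
static int
choose_matching_format(enum pixel_format pf, enum pixel_type pt)
{
   /* Intensity is listed first on purpose, to show order does not matter. */
   static const enum tex_format all[] = { TF_I_UNORM8, TF_R_UNORM8, TF_L_UNORM8 };

   for (unsigned i = 0; i < sizeof(all) / sizeof(all[0]); i++) {
      if (is_intensity(all[i]))
         continue;
      if (matches_format_and_type(all[i], pf, pt))
         return (int)all[i];
   }
   return -1;
}
```

Without the `is_intensity()` skip, the walk in this sketch would hand back `TF_I_UNORM8` for red data, which is presumably the kind of gallium breakage the note in the commit message refers to.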
Re: [Mesa-dev] RFC: ARB_sample_shading in gallium
On Fri, Mar 28, 2014 at 6:41 PM, Roland Scheidegger wrote:
> On 28.03.2014 22:56, Ilia Mirkin wrote:
>> On Fri, Mar 28, 2014 at 5:47 PM, Roland Scheidegger wrote:
>>> On 28.03.2014 22:18, Ilia Mirkin wrote:
>>>> Hey guys,
>>>>
>>>> I was thinking of taking a shot at implementing ARB_sample_shading for
>>>> nv50 (well, nva3-nva8) this weekend. One of the issues is that it's
>>>> not implemented in gallium at all right now, so I need to pipe it
>>>> through somehow. I believe that the only piece of data that needs to
>>>> be piped through is the value returned by
>>>> _mesa_get_min_invocations_per_fragment, which is a function of the fp,
>>>> the drawbuffer, and the MS state. When that value is > 1, sample
>>>> shading is effectively enabled. (I guess even when it's == 1, things
>>>> like gl_SampleID still need to work, perhaps it's worth adding a
>>>> separate enabled bit too.)
>>>>
>>>> Should this single integer get its own set_* callback, similar to
>>>> set_sample_mask, or should it be included somewhere, e.g.
>>>> pipe_framebuffer_state? Or even added to the set_sample_mask call?
>>>>
>>>
>>> Would something like in d3d10.1 work where you simply say that inputs
>>> are interpolated at sample frequency? That way you can also have some
>>> inputs which are not interpolated at sample frequency (I thought there's
>>> opengl functionality for this too somewhere - even if not I'd really
>>> like to have that functionality in gallium). It would just need new
>>> interpolation mode enums.
>>> Though I guess this does not fully cover ARB_sample_shading - this
>>> extension allows you for instance to have msaa 4x, but run fs at 2x (I
>>> could be wrong but I don't think you can do that in d3d, I don't know if
>>> hw can do it presumably some can otherwise it wouldn't be in the
>>> extension, though it is definitely worded in a way that makes it
>>> possible to just run at full sample frequency).
>>
>> I have 0 familiarity with d3d, but it does indeed seem like part of
>> the point of ARB_sample_shading is to run on less than 100% of the
>> samples. This appears to be supported by NVA3+ hardware based on our
>> current docs in rnndb, although the current piglit tests don't really
>> exercise all the functionality. [I haven't checked, but I assume NVC0+
>> as well.] Although only 1/2/4/8 are supported, based on those docs
>> (e.g. you can't tell it to run on 5 samples).
>>
>> An alternative to passing in the result of
>> _mesa_get_min_invocations_per_fragment is to just pass the percentage
>> (which, I guess for D3D10.1 would either be 0 or 100?),
> Yes I guess it would be just 0 or 100.
>
>> and redoing
>> the calculation inside of gallium based on the same criteria.
>
> That would be doable too indeed.
> Though indeed OpenGL also allows "sample" interpolation qualifier, so it
> looks like we're going to need this anyway (ARB_shading_language_420pack
> for instance). Don't ask me though how this is supposed to work if
> simply enabling ARB_sample_shading already causes all inputs to be
> interpolated per sample anyway?
> The gl spec (4.4 core, end of chapter 14.3.1 and 14.3.1.1) has some
> explanation how it could work - so if there's at least one "sample"
> qualifier in the fs inputs, that causes those inputs to be evaluated per
> sample (which implies running the fragment shader at sample frequency).
> The interactions with SAMPLE_SHADING are not resolved, though, and imho
> anything but obvious.
>
> So if the ability to run the fragment shader at something else than
> per-pixel or per-sample frequency is useful, then something is needed to
> set this value one way or another. Otherwise new interpolation modes
> should do just fine and make things easier.
>
> Roland

I believe the use-case for the partial thing is in issue #3 of the
ARB_sample_shading spec (although I'm not 100% sure what they're
talking about, they do seem to be talking about a gl_Sample*-less
shader). Based on the _mesa_get_min_invocations_per_fragment impl, as
soon as gl_Sample* gets used by the shader, it flips into per-sample
mode (which wasn't at all my reading of the spec, but I assume this
was done by people who understand things). Presumably there's some
benefit to doing the per-some-sample mode, otherwise the spec wouldn't
have introduced the MinSampleShadingARB call. Although it'd be
entirely within the spec (if not efficient) to ignore it entirely and
just assume that it's always 0 or 1.

I think I'm going to start by adding a set_sample_shading() call that
takes a [0,1] float, and see where that takes me. In any case, it
should be fairly simple to change should it be decided that a
different thing is needed.

  -ilia
Re: [Mesa-dev] [PATCH 2/2] glsl: remove UBO fields from _mesa_glsl_parse_state
Reviewed-by: Matt Turner
Re: [Mesa-dev] RFC: ARB_sample_shading in gallium
On 28.03.2014 22:56, Ilia Mirkin wrote:
> On Fri, Mar 28, 2014 at 5:47 PM, Roland Scheidegger wrote:
>> On 28.03.2014 22:18, Ilia Mirkin wrote:
>>> Hey guys,
>>>
>>> I was thinking of taking a shot at implementing ARB_sample_shading for
>>> nv50 (well, nva3-nva8) this weekend. One of the issues is that it's
>>> not implemented in gallium at all right now, so I need to pipe it
>>> through somehow. I believe that the only piece of data that needs to
>>> be piped through is the value returned by
>>> _mesa_get_min_invocations_per_fragment, which is a function of the fp,
>>> the drawbuffer, and the MS state. When that value is > 1, sample
>>> shading is effectively enabled. (I guess even when it's == 1, things
>>> like gl_SampleID still need to work, perhaps it's worth adding a
>>> separate enabled bit too.)
>>>
>>> Should this single integer get its own set_* callback, similar to
>>> set_sample_mask, or should it be included somewhere, e.g.
>>> pipe_framebuffer_state? Or even added to the set_sample_mask call?
>>>
>>
>> Would something like in d3d10.1 work where you simply say that inputs
>> are interpolated at sample frequency? That way you can also have some
>> inputs which are not interpolated at sample frequency (I thought there's
>> opengl functionality for this too somewhere - even if not I'd really
>> like to have that functionality in gallium). It would just need new
>> interpolation mode enums.
>> Though I guess this does not fully cover ARB_sample_shading - this
>> extension allows you for instance to have msaa 4x, but run fs at 2x (I
>> could be wrong but I don't think you can do that in d3d, I don't know if
>> hw can do it presumably some can otherwise it wouldn't be in the
>> extension, though it is definitely worded in a way that makes it
>> possible to just run at full sample frequency).
>
> I have 0 familiarity with d3d, but it does indeed seem like part of
> the point of ARB_sample_shading is to run on less than 100% of the
> samples. This appears to be supported by NVA3+ hardware based on our
> current docs in rnndb, although the current piglit tests don't really
> exercise all the functionality. [I haven't checked, but I assume NVC0+
> as well.] Although only 1/2/4/8 are supported, based on those docs
> (e.g. you can't tell it to run on 5 samples).
>
> An alternative to passing in the result of
> _mesa_get_min_invocations_per_fragment is to just pass the percentage
> (which, I guess for D3D10.1 would either be 0 or 100?),

Yes I guess it would be just 0 or 100.

> and redoing
> the calculation inside of gallium based on the same criteria.

That would be doable too indeed.
Though indeed OpenGL also allows "sample" interpolation qualifier, so it
looks like we're going to need this anyway (ARB_shading_language_420pack
for instance). Don't ask me though how this is supposed to work if
simply enabling ARB_sample_shading already causes all inputs to be
interpolated per sample anyway?
The gl spec (4.4 core, end of chapter 14.3.1 and 14.3.1.1) has some
explanation how it could work - so if there's at least one "sample"
qualifier in the fs inputs, that causes those inputs to be evaluated per
sample (which implies running the fragment shader at sample frequency).
The interactions with SAMPLE_SHADING are not resolved, though, and imho
anything but obvious.

So if the ability to run the fragment shader at something else than
per-pixel or per-sample frequency is useful, then something is needed to
set this value one way or another. Otherwise new interpolation modes
should do just fine and make things easier.

Roland
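For anyone following the thread without the spec at hand, the "percentage" being discussed can be turned into a per-fragment invocation count roughly as below. The first helper uses the minimum the ARB_sample_shading text asks for, max(ceil(fraction * samples), 1); the second rounds that up to a power of two, on the assumption (taken from the 1/2/4/8 remark above, not from hardware docs) that only those counts can be programmed. Both function names are made up for this sketch, and the gl_Sample* special case mentioned in the thread is not modeled.

```c
/* Minimum shader invocations per fragment for a given fraction in [0,1]
 * and sample count: max(ceil(fraction * samples), 1).
 * ceil() is done by hand so the sketch does not need libm. */
static unsigned
min_invocations_per_fragment(float min_sample_shading, unsigned samples)
{
   float x = min_sample_shading * (float)samples;
   unsigned n = (unsigned)x;        /* truncates toward zero for x >= 0 */

   if ((float)n < x)
      n++;                          /* round up any fractional part */
   return n < 1 ? 1 : n;
}

/* Round a requested count up to the next power of two, for hardware
 * that (by assumption here) only accepts 1, 2, 4 or 8. */
static unsigned
round_up_to_supported(unsigned n)
{
   unsigned p = 1;

   while (p < n)
      p <<= 1;
   return p;
}
```

With 4x MSAA, a fraction of 0.5 gives 2 invocations, and a fraction of 0.0 still gives 1, which matches the plain per-pixel case.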
[Mesa-dev] [PATCH 2/2] i965/fs: Name temporary ralloc contexts something other than mem_ctx.
Or else poor programmers might mistakenly use the temporary mem_ctx, instead of the fs_visitor's mem_ctx and wonder why their code is crashing. Also remove the parenting. These contexts are local to the optimization passes they're in and are freed at the end. --- src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 14 +++--- src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 6 +++--- 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp index 2816d3c..a148c54 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp @@ -483,7 +483,7 @@ fs_visitor::try_constant_propagate(fs_inst *inst, acp_entry *entry) * list. */ bool -fs_visitor::opt_copy_propagate_local(void *mem_ctx, bblock_t *block, +fs_visitor::opt_copy_propagate_local(void *copy_prop_ctx, bblock_t *block, exec_list *acp) { bool progress = false; @@ -543,7 +543,7 @@ fs_visitor::opt_copy_propagate_local(void *mem_ctx, bblock_t *block, inst->src[0].type == inst->dst.type && !inst->saturate && !inst->is_partial_write()) { -acp_entry *entry = ralloc(mem_ctx, acp_entry); +acp_entry *entry = ralloc(copy_prop_ctx, acp_entry); entry->dst = inst->dst; entry->src = inst->src[0]; acp[entry->dst.reg % ACP_HASH_SIZE].push_tail(entry); @@ -557,7 +557,7 @@ bool fs_visitor::opt_copy_propagate() { bool progress = false; - void *mem_ctx = ralloc_context(this->mem_ctx); + void *copy_prop_ctx = ralloc_context(NULL); cfg_t cfg(&instructions); exec_list *out_acp[cfg.num_blocks]; for (int i = 0; i < cfg.num_blocks; i++) @@ -569,12 +569,12 @@ fs_visitor::opt_copy_propagate() for (int b = 0; b < cfg.num_blocks; b++) { bblock_t *block = cfg.blocks[b]; - progress = opt_copy_propagate_local(mem_ctx, block, + progress = opt_copy_propagate_local(copy_prop_ctx, block, out_acp[b]) || progress; } /* Do dataflow analysis for those available copies. 
*/ - fs_copy_prop_dataflow dataflow(mem_ctx, &cfg, out_acp); + fs_copy_prop_dataflow dataflow(copy_prop_ctx, &cfg, out_acp); /* Next, re-run local copy propagation, this time with the set of copies * provided by the dataflow analysis available at the start of a block. @@ -590,12 +590,12 @@ fs_visitor::opt_copy_propagate() } } - progress = opt_copy_propagate_local(mem_ctx, block, in_acp) || progress; + progress = opt_copy_propagate_local(copy_prop_ctx, block, in_acp) || progress; } for (int i = 0; i < cfg.num_blocks; i++) delete [] out_acp[i]; - ralloc_free(mem_ctx); + ralloc_free(copy_prop_ctx); if (progress) invalidate_live_intervals(); diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp index d8a5434..ea610bd 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp @@ -121,7 +121,7 @@ fs_visitor::opt_cse_local(bblock_t *block, exec_list *aeb) { bool progress = false; - void *mem_ctx = ralloc_context(this->mem_ctx); + void *cse_ctx = ralloc_context(NULL); int ip = block->start_ip; for (fs_inst *inst = (fs_inst *)block->start; @@ -148,7 +148,7 @@ fs_visitor::opt_cse_local(bblock_t *block, exec_list *aeb) if (!found) { /* Our first sighting of this expression. Create an entry. */ - aeb_entry *entry = ralloc(mem_ctx, aeb_entry); + aeb_entry *entry = ralloc(cse_ctx, aeb_entry); entry->tmp = reg_undef; entry->generator = inst; aeb->push_tail(entry); @@ -254,7 +254,7 @@ fs_visitor::opt_cse_local(bblock_t *block, exec_list *aeb) ip++; } - ralloc_free(mem_ctx); + ralloc_free(cse_ctx); if (progress) invalidate_live_intervals(); -- 1.8.3.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
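The naming hazard this patch addresses can be sketched with a toy allocator. This is only an illustration under stated assumptions: toy_ctx, toy_visitor and run_pass are hypothetical stand-ins, not Mesa's ralloc API. The point is that a pass-local context with its own name cannot be confused with the visitor's long-lived mem_ctx.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for an allocation context (hypothetical, NOT ralloc):
// it owns every int it hands out and frees them on destruction.
struct toy_ctx {
   std::vector<int *> allocs;
   ~toy_ctx() { for (int *p : allocs) delete p; }
   int *alloc_int(int v) { allocs.push_back(new int(v)); return allocs.back(); }
};

struct toy_visitor {
   toy_ctx mem_ctx;   // long-lived: lives as long as the visitor

   // The pass-local scratch context has a distinct name, so a stray
   // "mem_ctx" inside the pass can only mean the visitor's context.
   int *run_pass(int value)
   {
      toy_ctx pass_ctx;                         // freed when the pass returns
      int *scratch = pass_ctx.alloc_int(value * 2);
      // The result must outlive the pass, so it goes in the visitor's context.
      return mem_ctx.alloc_int(*scratch + 1);
   }
};
```

Had the local context also been called mem_ctx, the final allocation would silently have come from the short-lived context and the returned pointer would dangle once the pass returned.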
[Mesa-dev] [PATCH 1/2] i965/fs: Recalculate live intervals in calculate_register_pressure().
Otherwise calling dump_instructions() after declaring a new fs_reg would segfault when calculate_register_pressure()'s loop over reg walked off the end of the virtual_grf_start[] array that calculate_live_intervals() would have reallocated for you, if it had known there was a new register. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 713e477..c88f7c9 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3294,6 +3294,7 @@ fs_visitor::assign_binding_table_offsets() void fs_visitor::calculate_register_pressure() { + invalidate_live_intervals(); calculate_live_intervals(); int num_instructions = 0; -- 1.8.3.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
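The stale-cache problem described in this commit message can be sketched as follows. All names below are hypothetical stand-ins that only mimic the shape of the fs_visitor code; the sketch assumes the cached array is sized when the analysis runs and is not resized when a new register is declared.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the visitor's cached analysis; not Mesa code.
struct toy_visitor {
   int num_regs = 0;
   std::vector<int> reg_start;     // cached per-register data
   bool intervals_valid = false;

   int new_reg() { return num_regs++; }   // note: does not touch the cache

   void invalidate_live_intervals() { intervals_valid = false; }

   void calculate_live_intervals()
   {
      if (intervals_valid)
         return;                   // trusts the cache, even if it is stale
      reg_start.assign(static_cast<std::size_t>(num_regs), 0);
      intervals_valid = true;
   }

   // Mirrors the patched ordering: invalidate first, then recalculate,
   // so the loop below never indexes past the cached array.
   int calculate_register_pressure()
   {
      invalidate_live_intervals();
      calculate_live_intervals();
      int total = 0;
      for (int reg = 0; reg < num_regs; reg++)
         total += reg_start.at(static_cast<std::size_t>(reg)) + 1;
      return total;
   }
};
```

Without the invalidate call, declaring a register after the first analysis would leave reg_start one element short, and the loop over num_regs would read past its end.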
Re: [Mesa-dev] RFC: ARB_sample_shading in gallium
On Fri, Mar 28, 2014 at 5:47 PM, Roland Scheidegger wrote: > Am 28.03.2014 22:18, schrieb Ilia Mirkin: >> Hey guys, >> >> I was thinking of taking a shot at implementing ARB_sample_shading for >> nv50 (well, nva3-nva8) this weekend. One of the issues is that it's >> not implemented in gallium at all right now, so I need to pipe it >> through somehow. I believe that the only piece of data that needs to >> be piped through is the value returned by >> _mesa_get_min_invocations_per_fragment, which is a function of the fp, >> the drawbuffer, and the MS state. When that value is > 1, sample >> shading is effectively enabled. (I guess even when it's == 1, things >> like gl_SampleID still need to work, perhaps it's worth adding a >> separate enabled bit too.) >> >> Should this single integer get its own set_* callback, similar to >> set_sample_mask, or should it be included somewhere, e.g. >> pipe_framebuffer_state? Or even added to the set_sample_mask call? >> > > Would something like in d3d10.1 work where you simply say that inputs > are interpolated at sample frequency? That way you can also have some > inputs which are not interpolated at sample frequency (I thought there's > opengl functionality for this too somewhere - even if not I'd really > like to have that functionality in gallium). It would just need new > interpolation mode enums. > Though I guess this does not fully cover ARB_sample_shading - this > extension allows you for instance to have msaa 4x, but run fs at 2x (I > could be wrong but I don't think you can do that in d3d, I don't know if > hw can do it presumably some can otherwise it wouldn't be in the > extension, though it is definitely worded in a way that makes it > possible to just run at full sample frequency). I have 0 familiarity with d3d, but it does indeed seem like part of the point of ARB_sample_shading is to run on less than 100% of the samples. 
This appears to be supported by NVA3+ hardware based on our current docs in rnndb, although the current piglit tests don't really exercise all the functionality. [I haven't checked, but I assume NVC0+ as well.] Although only 1/2/4/8 are supported, based on those docs (e.g. you can't tell it to run on 5 samples). An alternative to passing in the result of _mesa_get_min_invocations_per_fragment is to just pass the percentage (which, I guess for D3D10.1 would either be 0 or 100?), and redoing the calculation inside of gallium based on the same criteria. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
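The single integer under discussion can be sketched as below. The first function follows the formula in the ARB_sample_shading spec, max(ceil(MIN_SAMPLE_SHADING_VALUE * samples), 1); it deliberately ignores the other inputs mentioned in the thread (the fragment program and the MS state). The second function is an assumption for illustration only: it rounds up to the next power of two, on the premise that the hardware accepts only 1/2/4/8 as described above. Both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Minimum shader invocations per fragment for a given minimum sample
// shading fraction (0.0 to 1.0) and framebuffer sample count.
int min_invocations(float min_sample_shading, int samples)
{
   int n = static_cast<int>(std::ceil(min_sample_shading * samples));
   return std::max(n, 1);
}

// Assumption: only power-of-two counts are programmable, so round up,
// never exceeding the sample count itself.
int round_to_supported(int invocations, int samples)
{
   int n = 1;
   while (n < invocations && n < samples)
      n *= 2;
   return n;
}
```

With 4x MSAA and a fraction of 0.5 this gives 2 invocations per fragment, which is the "msaa 4x, but run fs at 2x" case raised earlier in the thread.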
Re: [Mesa-dev] [PATCH 0/6] Some glapi clean-up related to GLES
On Wed, Mar 26, 2014 at 5:12 PM, Ian Romanick wrote: > Tomorrow or Friday I'm going to send out the last of the > GL_ARB_separate_shader_objects patches. Shortly after that, I will send > out patches to enable GL_EXT_separate_shader_objects on GLES. This EXT > is the GLES subset of the ARB extension. > > In preparing for this new extension, I noticed the old problem that any > extension function that aliases a core function (whether it is core in > GLES or desktop GL) isn't hidden. This series should fix that. I tried to do that last year and gave up, so I'm glad to see it happen one way or another. Series is Reviewed-by: Matt Turner ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 00/24] LDFLAG additions and misc automake cleanups
On Thu, Mar 27, 2014 at 2:00 PM, Emil Velikov wrote: > Hello list, > > Yet another small series from me, targeting the following > > - Revert "allow only shared builds". Static osmesa/libgl-xlib is > still used, so revert the commit for now. > > - Add -no-undefined, -Wl,--no-undefined and -Wl,--gc-sections. > The former two deal with missing symbols, while the latter gives us > a nice size reduction of the final libraries (each of the gallium > drivers has decreased by ~600KiB) > > - Cleanup LDFLAGS duplication and other across gallium > - A couple of bugs and cleanups in configure > > As usual the series can be fetched in the linkerflags branch at > https://github.com/evelikov/Mesa/ > > Comments, review and flame are greatly appreciated. > > Cheers > -Emil 1, 3-11, 18, 20-22 are Reviewed-by: Matt Turner
Re: [Mesa-dev] [PATCH 22/24] gallium/targets: add missing library dependencies
On Thu, Mar 27, 2014 at 2:00 PM, Emil Velikov wrote: > Signed-off-by: Emil Velikov > --- > src/gallium/targets/gbm/Makefile.am | 5 - > src/gallium/targets/xa/Makefile.am | 7 ++- > 2 files changed, 10 insertions(+), 2 deletions(-) > > diff --git a/src/gallium/targets/gbm/Makefile.am > b/src/gallium/targets/gbm/Makefile.am > index e36d317..bad581d 100644 > --- a/src/gallium/targets/gbm/Makefile.am > +++ b/src/gallium/targets/gbm/Makefile.am > @@ -49,7 +49,10 @@ gbm_gallium_drm_la_LIBADD = \ > $(top_builddir)/src/gallium/state_trackers/gbm/libgbm.la \ > $(top_builddir)/src/gallium/auxiliary/libgallium.la \ > $(LIBDRM_LIBS) \ > - -lm > + -lm \ > + $(CLOCK_LIB) \ > + $(PTHREAD_LIBS) \ > + $(DLOPEN_LIBS) We seem to list -lm last in the list most of the time. If you do that you don't have to modify that line. Might as well make -lm the last in the list in the second hunk as well. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 21/24] pipe-loader: use PTHREAD_LIBS over -lpthread
On Thu, Mar 27, 2014 at 2:00 PM, Emil Velikov wrote: > Signed-off-by: Emil Velikov > --- > src/gallium/targets/pipe-loader/Makefile.am | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/src/gallium/targets/pipe-loader/Makefile.am > b/src/gallium/targets/pipe-loader/Makefile.am > index fae4fa3..6e78a75 100644 > --- a/src/gallium/targets/pipe-loader/Makefile.am > +++ b/src/gallium/targets/pipe-loader/Makefile.am > @@ -40,10 +40,10 @@ PIPE_LIBS = \ > $(top_builddir)/src/gallium/drivers/rbug/librbug.la \ > $(top_builddir)/src/gallium/drivers/trace/libtrace.la \ > $(top_builddir)/src/gallium/drivers/galahad/libgalahad.la \ > - $(DLOPEN_LIBS) \ > + -lm \ > $(CLOCK_LIB) \ > - -lpthread \ > - -lm > + $(PTHREAD_LIBS) \ > + $(DLOPEN_LIBS) Why is there so much shuffling going on here? Just make the single change the commit summary says? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] RFC: ARB_sample_shading in gallium
Am 28.03.2014 22:18, schrieb Ilia Mirkin: > Hey guys, > > I was thinking of taking a shot at implementing ARB_sample_shading for > nv50 (well, nva3-nva8) this weekend. One of the issues is that it's > not implemented in gallium at all right now, so I need to pipe it > through somehow. I believe that the only piece of data that needs to > be piped through is the value returned by > _mesa_get_min_invocations_per_fragment, which is a function of the fp, > the drawbuffer, and the MS state. When that value is > 1, sample > shading is effectively enabled. (I guess even when it's == 1, things > like gl_SampleID still need to work, perhaps it's worth adding a > separate enabled bit too.) > > Should this single integer get its own set_* callback, similar to > set_sample_mask, or should it be included somewhere, e.g. > pipe_framebuffer_state? Or even added to the set_sample_mask call? > Would something like in d3d10.1 work where you simply say that inputs are interpolated at sample frequency? That way you can also have some inputs which are not interpolated at sample frequency (I thought there's opengl functionality for this too somewhere - even if not I'd really like to have that functionality in gallium). It would just need new interpolation mode enums. Though I guess this does not fully cover ARB_sample_shading - this extension allows you for instance to have msaa 4x, but run fs at 2x (I could be wrong but I don't think you can do that in d3d, I don't know if hw can do it presumably some can otherwise it wouldn't be in the extension, though it is definitely worded in a way that makes it possible to just run at full sample frequency). Roland ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 18/24] targets/egl-static: move the common LDFLAGS into AM_LDFLAGS
On Thu, Mar 27, 2014 at 2:00 PM, Emil Velikov wrote: > Signed-off-by: Emil Velikov > --- > src/gallium/targets/egl-static/Makefile.am | 19 +-- > 1 file changed, 13 insertions(+), 6 deletions(-) > > diff --git a/src/gallium/targets/egl-static/Makefile.am > b/src/gallium/targets/egl-static/Makefile.am > index 282fa66..b492496 100644 > --- a/src/gallium/targets/egl-static/Makefile.am > +++ b/src/gallium/targets/egl-static/Makefile.am > @@ -30,8 +30,6 @@ > # > include $(top_srcdir)/src/gallium/Automake.inc > > -LDFLAGS += > -Wl,--version-script=$(top_srcdir)/src/gallium/targets/egl-static/egl.link Ugh. Not supposed to modify LDFLAGS in Makefile.am. Good that we're getting rid of that. There's another instance in pipe-loader (that I'm not sure if this series kills) that should be fixed too. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 16/24] gallium/targets: explicitly include a dummy.cpp and remove all the LINK mayhem
On Thu, Mar 27, 2014 at 2:00 PM, Emil Velikov wrote: > We've been copying and pasting this hunk for a while now, only to prevent > build issues on very old and buggy build toolchains. At this point this > should no longer be needed, so we can cleanup all the mess and simplify > our makefiles. That's not why. It's because if all of your source files for a target are .c, but you link with a .la file containing C++ then you need to link using a C++ linker. Automake doesn't know that there's C++ in the static library, so it doesn't know to link with g++. So I can't review this patch, because I don't know whether what you're trying to do is based on reality or not. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] RFC: per-driver extension lists
On Fri, Mar 28, 2014 at 10:32 AM, Ilia Mirkin wrote: > On Fri, Mar 28, 2014 at 11:14 AM, Aaron Watry wrote: >>> >>> Do people have opinions on whether it'd be useful to also gather data >>> for older hardware? FWIW I threw my TNT2 in there, which is probably >>> among the oldest hw supported by mesa. >>> >> >> I'm not sure if it's worthwhile or not, but if you want/need it, I've >> got a Radeon x1950 at home that I can pop in for a r300g run. > > I'm uncertain of the usefulness, but if you give it to me, I'll throw > it up there. I'll give it a run on a spare system later. Probably mesa 10.1, but I'll have to see what's installed on that machine when I get home. > >> >>> Any other suggestions? As a reminder, the current list is available at >>> http://people.freedesktop.org/~imirkin/glxinfo/glxinfo.html (defaults >>> to core context, so older stuff doesn't show up, click on 'compat' to >>> see it). >>> >> >> I like the UI in general, the one suggestion that I have at the moment >> is to split into two divs. Anchor the driver >> names/generations to the top of the window (position:fixed) and allow >> the table content to continue to be scrolled. That way you always >> have the card names at the top of your screen. You'll probably need >> to add a dynamically-sized spacer to the top of the 2nd div, but I'll >> leave that as an exercise to the reader. > > Good idea. One problem with position:fixed is that it doesn't > (didn't?) work on mobile browsers. I'll give it a shot though. Ahh, yeah... I haven't tried it on mobile... I spend my days doing desktop web software, not mobile. If it works, great, if not, then it's not the end of the world.. > > -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/24] targets/omx: introduce GALLIUM_OMX_LIB_DEPS
On Thu, Mar 27, 2014 at 2:00 PM, Emil Velikov wrote: > Cc: Christian König > Signed-off-by: Emil Velikov > --- > src/gallium/Automake.inc | 6 ++ > src/gallium/targets/r600/omx/Makefile.am | 5 + > src/gallium/targets/radeonsi/omx/Makefile.am | 5 + > 3 files changed, 8 insertions(+), 8 deletions(-) > > diff --git a/src/gallium/Automake.inc b/src/gallium/Automake.inc > index 1acc99e..39475d7 100644 > --- a/src/gallium/Automake.inc > +++ b/src/gallium/Automake.inc > @@ -90,6 +90,12 @@ GALLIUM_XVMC_LIB_DEPS = \ > $(XVMC_LIBS) \ > $(LIBDRM_LIBS) > > +GALLIUM_OMX_LIB_DEPS = \ > + $(top_builddir)/src/gallium/auxiliary/libgallium.la \ > + $(top_builddir)/src/gallium/state_trackers/omx/libomxtracker.la \ > + $(GALLIUM_DRI_LIB_DEPS) \ > + $(OMX_LIBS) > + > GALLIUM_WINSYS_CFLAGS = \ > -I$(top_srcdir)/include \ > -I$(top_srcdir)/src/gallium/include \ > diff --git a/src/gallium/targets/r600/omx/Makefile.am > b/src/gallium/targets/r600/omx/Makefile.am > index 98b064f..3776771 100644 > --- a/src/gallium/targets/r600/omx/Makefile.am > +++ b/src/gallium/targets/r600/omx/Makefile.am > @@ -47,13 +47,10 @@ libomx_r600_la_LDFLAGS = \ > -no-undefined > > libomx_r600_la_LIBADD = \ > - $(top_builddir)/src/gallium/auxiliary/libgallium.la \ > $(top_builddir)/src/gallium/drivers/r600/libr600.la \ > - $(top_builddir)/src/gallium/state_trackers/omx/libomxtracker.la \ > $(top_builddir)/src/gallium/winsys/radeon/drm/libradeonwinsys.la \ > $(top_builddir)/src/gallium/drivers/trace/libtrace.la \ > - $(GALLIUM_DRI_LIB_DEPS) \ > - $(OMX_LIBS) \ > + $(GALLIUM_OMX_LIB_DEPS) \ > $(LIBDRM_LIBS) \ > $(RADEON_LIBS) \ > -lstdc++ Unrelated, but explicitly linking with -lstdc++ is a hack. Use the dummy.cpp trick to make autotools use C++ linking. There's another instance of this in drivers/nouveau. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 11/24] targets/pipe-loader: move LLVM_LIBS handling inside PIPE_LIBS
On Thu, Mar 27, 2014 at 2:00 PM, Emil Velikov wrote: > This lets us have only one if HAVE_MESA_LLVM block, rather than > one for each driver. > > Signed-off-by: Emil Velikov > --- > src/gallium/targets/pipe-loader/Makefile.am | 34 > ++--- > 1 file changed, 11 insertions(+), 23 deletions(-) > > diff --git a/src/gallium/targets/pipe-loader/Makefile.am > b/src/gallium/targets/pipe-loader/Makefile.am > index 8e76d41..fae4fa3 100644 > --- a/src/gallium/targets/pipe-loader/Makefile.am > +++ b/src/gallium/targets/pipe-loader/Makefile.am > @@ -52,11 +52,13 @@ AM_LDFLAGS = \ > > -Wl,--version-script=$(top_srcdir)/src/gallium/targets/pipe-loader/pipe.link > > if HAVE_MESA_LLVM > +PIPE_LIBS += $(LLVM_LIBS) It would be kind of nice to not just tack this onto an unrelated variable. I see that we're already abusing PIPE_LIBS to contain DLOPEN_LIBS, CLOCK_LIBS, -lpthread (grr, should be PTHREAD_LIBS), and -lm. Maybe do the whole thing as a follow on patch -- splitting these out into a COMMON_LIBS variable or something. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/24] targets/xa: drop libudev references from automake build
On Thu, Mar 27, 2014 at 2:00 PM, Emil Velikov wrote: > Mesa does _not_ against libudev. I think you a word. :) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 01/24] Partially revert "automake: allow only shared builds"
On Thu, Mar 27, 2014 at 2:00 PM, Emil Velikov wrote: > Evidently at least static OSMesa is still used as shared one > causes substantial increase in the load time for some programs > that use it (from seconds up-to ~30min). > > Rather than forcing everyone to use shared mesa, revert commit > a6efbac9fb502c4f0166e7a0680b6828e1f6926c and default to shared > build when both shared and static are disabled. > > Reported-by: Burlen Loring > --- > configure.ac | 30 +++--- > install-gallium-links.mk | 2 ++ > install-lib-links.mk | 2 ++ > 3 files changed, 27 insertions(+), 7 deletions(-) > > diff --git a/configure.ac b/configure.ac > index 1e5e496..39330cf 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -284,15 +284,18 @@ dnl Can't have static and shared libraries, default to > static if user > dnl explicitly requested. If both disabled, set to static since shared > dnl was explicitly requested. > case "x$enable_static$enable_shared" in > -xnoyes ) > +xyesyes ) Don't put a space between the token and the )? That's not a style I've ever seen. > +AC_MSG_WARN([Cannot build static and shared libraries, disabling shared]) > +enable_shared=no > ;; > -* ) > -AC_MSG_WARN([Messa build supports only shared libraries, enabling > shared]) > +xnono ) > +AC_MSG_WARN([Cannot disable both static and shared libraries, enabling > shared]) > enable_shared=yes > -enable_static=no > ;; > esac > > +AM_CONDITIONAL(BUILD_SHARED, test "x$enable_shared" = xyes) > + > dnl > dnl other compiler options > dnl > @@ -773,6 +776,11 @@ PKG_CHECK_MODULES([LIBUDEV], [libudev >= > $LIBUDEV_REQUIRED], >have_libudev=yes, have_libudev=no) > > if test "x$enable_dri" = xyes; then > +# DRI must be shared, I think Drop the comment? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] RFC: ARB_sample_shading in gallium
Hey guys, I was thinking of taking a shot at implementing ARB_sample_shading for nv50 (well, nva3-nva8) this weekend. One of the issues is that it's not implemented in gallium at all right now, so I need to pipe it through somehow. I believe that the only piece of data that needs to be piped through is the value returned by _mesa_get_min_invocations_per_fragment, which is a function of the fp, the drawbuffer, and the MS state. When that value is > 1, sample shading is effectively enabled. (I guess even when it's == 1, things like gl_SampleID still need to work, perhaps it's worth adding a separate enabled bit too.) Should this single integer get its own set_* callback, similar to set_sample_mask, or should it be included somewhere, e.g. pipe_framebuffer_state? Or even added to the set_sample_mask call? Thanks, -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/5] Using MAC to optimize LRP on i965 gen < 6
On Fri, Mar 28, 2014 at 6:28 AM, Juha-Pekka Heikkila wrote: > v3: > I took out the accumulator flag from backend_instruction::has_side_effects() > as Matt suggested and rebased my patches on top of master, where > Matt's patches had shown up which were overlapping a bit on dead code elimination. > This set does not do anything for the MACH anomaly on vec4_visitor::visit > and fs_visitor::visit, I will reverify it and do something about it later > if needed. I tried these on an Ironlake machine and did not see any regression > on Piglit quick set. I sent comments on patch 1, but 2-5 are Reviewed-by: Matt Turner No need to resend 2-5. When patch 1 is updated and reviewed I'll push them all. A good follow-on patch would be to add some assertions to the generator code that ensure we're not violating any of the accumulator restrictions. See the "Accumulator Restrictions" section of IVB's PRM Vol 4 Part 3.
Re: [Mesa-dev] [PATCH 1/5] i965: Add writes_accumulator flag
On Fri, Mar 28, 2014 at 6:28 AM, Juha-Pekka Heikkila wrote: > + if (inst->writes_accumulator||inst->writes_flag()) Spaces around || > inst->dst = dst_reg(retype(brw_null_reg(), inst->dst.type)); > - break; > - default: > - if (inst->writes_flag()) { > -inst->dst = dst_reg(retype(brw_null_reg(), inst->dst.type)); > - } else { > -inst->remove(); > - } > - } > + else > + inst->remove(); And let's use braces on nested if statements. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/5] i965: Add writes_accumulator flag
On Fri, Mar 28, 2014 at 6:28 AM, Juha-Pekka Heikkila wrote: > diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp > b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp > index a951459..492ee0d 100644 > --- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp > +++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp > @@ -818,7 +818,7 @@ fs_instruction_scheduler::calculate_deps() > } >} > > - if (inst->reads_flag()) { > + if (inst->reads_flag() || inst->writes_accumulator) { > add_dep(last_conditional_mod[inst->flag_subreg], n); last_conditional_mod tracks the last instructions to write f0.0 and f0.1, so we don't want to use it to also track writes to the accumulator. Add another variable schedule_node *last_accumulator_write = NULL; and use it like we do with last_conditional_mod if (inst->writes_accumulator || inst->dst.is_accumulator()) { ... } You'll need to add an is_accumulator method to the FS backend's fs_reg class, and to the vec4 backend's reg class. You can tell if a register is the accumulator if reg.file == HW_REG && reg.fixed_hw_reg.file == BRW_ARCHITECTURE_REGISTER_FILE && reg.fixed_hw_reg.nr == BRW_ARF_ACCUMULATOR ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
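The suggestion above can be sketched roughly as follows. The three-part register test is taken from the review text; everything else (the toy_ types, the enum values, and the way dependencies are counted) is a hypothetical stand-in, since the real definitions live in the i965 driver and differ in detail.

```cpp
#include <cassert>

// Stand-in enums; values are for illustration only.
enum reg_file { GRF, HW_REG };
enum hw_file { BRW_GENERAL_REGISTER_FILE, BRW_ARCHITECTURE_REGISTER_FILE };
enum { BRW_ARF_NULL = 0x00, BRW_ARF_ACCUMULATOR = 0x20 };

struct toy_hw_reg { hw_file file; int nr; };

struct toy_reg {
   reg_file file;
   toy_hw_reg fixed_hw_reg;

   // The check described in the review, written as a method.
   bool is_accumulator() const
   {
      return file == HW_REG &&
             fixed_hw_reg.file == BRW_ARCHITECTURE_REGISTER_FILE &&
             fixed_hw_reg.nr == BRW_ARF_ACCUMULATOR;
   }
};

struct toy_inst { toy_reg dst; bool writes_accumulator; };

// Tracks accumulator writes separately from flag writes, so the two
// kinds of dependency do not share one tracking variable.
struct toy_scheduler {
   const toy_inst *last_accumulator_write = nullptr;
   int deps_added = 0;

   void visit(const toy_inst &inst)
   {
      if (inst.writes_accumulator || inst.dst.is_accumulator()) {
         if (last_accumulator_write)
            deps_added++;   // order this write after the previous one
         last_accumulator_write = &inst;
      }
   }
};
```

An instruction counts as an accumulator writer either because it is flagged as an implicit writer or because its destination is the accumulator register itself.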
Re: [Mesa-dev] [PATCH] i965: disable blorp's linear filtering on SNB
On Thu, Mar 27, 2014 at 1:29 AM, Samuel Iglesias Gonsalvez wrote: > Commit 079bdba05f870807d3ed77fa3093cdb7727aa2fd enabled the use of BLORP > engine for single sample scaled blit with bilinear filter. > > However piglit fails when running fbo-blit-stretch test on SandyBridge. > This patch makes the code fall back to other blit paths for SandyBridge. My thoughts: - Yes, fbo-blit-stretch test passes on SNB with fallback blit paths. But the similar piglit tests fbo-blit-scaled-linear and fbo-attachments-blit-scaled-linear continue to fail with small color differences with meta fallback. So, the fallback helped one out of three linear scaled blit tests. Note: Use git-89ccd11 to run the other two tests. A later patch causes the tests to assert fail on SNB. - The meta fallback will possibly have some performance penalty and it will just hide the linear filtering bug in the blorp engine. - Current output from blorp on SNB is a little off from expected but still looks visually correct. I would prefer a patch which fixes the issue in blorp. Let's wait for more opinions on this. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68365 > > Signed-off-by: Samuel Iglesias Gonsalvez > --- > src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp > b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp > index 9e80935..a0a9a7b 100644 > --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp > +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp > @@ -251,6 +251,15 @@ try_blorp_blit(struct brw_context *brw, > fixup_mirroring(mirror_y, srcY0, srcY1); > fixup_mirroring(mirror_y, dstY0, dstY1); > > + /* Linear filtering using blorp engine is failing on Sandybridge. So, > fallback > +* to other blit paths. 
> +* See https://bugs.freedesktop.org/show_bug.cgi?id=68365 > +*/ > + if ((brw->gen == 6) && (srcX1 - srcX0 != dstX1 - dstX0 || > + srcY1 - srcY0 != dstY1 - dstY0) && > + filter == GL_LINEAR) > + return false; > + > /* If the destination rectangle needs to be clipped or scissored, do so. > */ > if (!(clip_or_scissor(mirror_x, srcX0, srcX1, dstX0, dstX1, > -- > 1.9.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
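The condition added by the quoted patch can be read as a small predicate. This is a sketch only: the function name and the FILTER_ constants are hypothetical (the constants use the standard values of GL_NEAREST and GL_LINEAR), and it assumes the coordinates have already been ordered by fixup_mirroring as in the patch context.

```cpp
#include <cassert>

// Same numeric values as GL_NEAREST and GL_LINEAR.
constexpr unsigned FILTER_NEAREST = 0x2600;
constexpr unsigned FILTER_LINEAR = 0x2601;

// True when the blit would take the fallback path under the proposed
// patch: gen 6 (Sandybridge), linear filter, and a size mismatch in
// either dimension (i.e. the blit is scaled).
bool must_skip_blorp(int gen,
                     float srcX0, float srcX1, float srcY0, float srcY1,
                     float dstX0, float dstX1, float dstY0, float dstY1,
                     unsigned filter)
{
   bool scaled = (srcX1 - srcX0 != dstX1 - dstX0) ||
                 (srcY1 - srcY0 != dstY1 - dstY0);
   return gen == 6 && scaled && filter == FILTER_LINEAR;
}
```

Unscaled blits, nearest-filtered blits, and all other hardware generations are unaffected, which is why the review above notes that the change hides the filtering bug rather than fixing it.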
Re: [Mesa-dev] [PATCH v2 5/9] mesa: Handle QUERY_RESULT_NO_WAIT in GetQueryObject{ui64}v
On 28.03.2014 08:01, Kenneth Graunke wrote: On 03/27/2014 11:34 PM, Kenneth Graunke wrote: On 03/27/2014 01:59 PM, Rafal Mielniczuk wrote: Just return and do nothing if query result is not yet available Signed-off-by: Rafal Mielniczuk --- src/mesa/main/queryobj.c | 16 1 file changed, 16 insertions(+) diff --git a/src/mesa/main/queryobj.c b/src/mesa/main/queryobj.c index 86e7c3a..d2d9fa7 100644 --- a/src/mesa/main/queryobj.c +++ b/src/mesa/main/queryobj.c @@ -594,6 +594,10 @@ _mesa_GetQueryObjectiv(GLuint id, GLenum pname, GLint *params) } switch (pname) { + case GL_QUERY_RESULT_NO_WAIT: + if (!q->Ready) +return; + //else fall through We don't usually use C++ style comments in Mesa. I would do: case GL_QUERY_RESULT_NO_WAIT: if (!q->Ready) return; /* fallthrough */ case GL_QUERY_RESULT_ARB: Other than that, patches 1-6 are: Reviewed-by: Kenneth Graunke Actually, I take that back...I don't think this is what we want for GPU drivers. (It's probably reasonable for software drivers though.) When a buffer object is bound to GL_QUERY_BUFFER, the idea is that the GL_QUERY_RESULT/GL_QUERY_RESULT_NO_WAIT queries should emit GPU commands to deliver the query result into the buffer object. The query result may not actually be available yet (so, q->Ready == false), but the GPU commands to obtain the result have already been submitted. Since any GPU commands we submit will happen after those, they can work with the result as if it's available...because it will be by the time they run. At least, that's my understanding right now. So, we need a way to know if a query result is "in flight, but done" (i.e. all commands to compute it have been submitted, but may not have run yet), and a way to ask the driver to deliver it to a particular buffer object/offset. That probably means two new driver hooks, but I'm not quite sure what they should look like just yet. Ok, I see there is no point in adding a software version of the extension only. 
I will play with it in the coming days, try to understand the driver side more and come up with something working, hopefully. So, patches 1-4 and 6 are: Reviewed-by: Kenneth Graunke I won't be around next week, but I'd be happy to help look into this when I'm back. (Unless, of course, others beat me to it...) :) Thanks again for your work on this! Thanks for your time and extensive review! :) I would be happy to finish this up. Thanks again, Rafal --Ken
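The software-only behaviour discussed in this thread (return without writing anything when the result is not ready, otherwise fall through to the normal result path) might look like the sketch below. The types and names are hypothetical stand-ins rather than Mesa's query objects, and the sketch does not cover the GPU case raised in the review, where the result would be delivered into a buffer object by commands submitted to the hardware.

```cpp
#include <cassert>
#include <cstdint>

enum toy_pname { QUERY_RESULT, QUERY_RESULT_NO_WAIT, QUERY_RESULT_AVAILABLE };

struct toy_query {
   bool ready;
   std::uint64_t result;
};

// Returns true if *params was written.
bool get_query_object(toy_query &q, toy_pname pname, std::uint64_t *params)
{
   switch (pname) {
   case QUERY_RESULT_NO_WAIT:
      if (!q.ready)
         return false;        // leave *params untouched
      /* fallthrough */
   case QUERY_RESULT:
      // A real driver would block here until the result arrives;
      // the toy version simply marks the query as ready.
      q.ready = true;
      *params = q.result;
      return true;
   case QUERY_RESULT_AVAILABLE:
      *params = q.ready ? 1 : 0;
      return true;
   }
   return false;
}
```

The caller's buffer keeps its previous contents when a NO_WAIT query is issued too early, which is the "just return and do nothing" behaviour from the patch description.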
[Mesa-dev] [Bug 76753] New: [swrast] piglit arb_clear_buffer_object-formats regression
https://bugs.freedesktop.org/show_bug.cgi?id=76753 Priority: medium Bug ID: 76753 Keywords: regression CC: bri...@vmware.com, chr...@ijw.co.nz Assignee: mesa-dev@lists.freedesktop.org Summary: [swrast] piglit arb_clear_buffer_object-formats regression Severity: normal Classification: Unclassified OS: Linux (All) Reporter: v...@freedesktop.org Hardware: x86-64 (AMD64) Status: NEW Version: git Component: Other Product: Mesa mesa: 4047263cb15e89d23cb145c74fb3f303904e8f14 (master 10.2.0-devel) $ ./bin/arb_clear_buffer_object-formats -auto Testing GL_ALPHA8... Passed. Testing GL_ALPHA16... Passed. Testing GL_LUMINANCE8... Passed. Testing GL_LUMINANCE16... Passed. Testing GL_LUMINANCE8_ALPHA8... Passed. Testing GL_LUMINANCE16_ALPHA16... Passed. Testing GL_INTENSITY8... Passed. Testing GL_INTENSITY16... Passed. Testing GL_RGBA8... Passed. Testing GL_RGBA16... Passed. Testing GL_ALPHA32F_ARB... Passed. Testing GL_LUMINANCE32F_ARB... Passed. Testing GL_LUMINANCE_ALPHA32F_ARB... Passed. Testing GL_INTENSITY32F_ARB... Passed. Testing GL_RGBA32F... Passed. Testing GL_ALPHA16F_ARB... Passed. Testing GL_LUMINANCE16F_ARB... Passed. Testing GL_LUMINANCE_ALPHA16F_ARB... Passed. Testing GL_INTENSITY16F_ARB... Failed! Testing GL_RGBA16F... Passed. Testing GL_R8... Passed. Testing GL_R16... Passed. Testing GL_R16F... Passed. Testing GL_R32F... Passed. Testing GL_RG8... Passed. Testing GL_RG16... Passed. Testing GL_RG16F... Passed. Testing GL_RG32F... Passed. PIGLIT: {'result': 'fail' } 4002daf09545f321a070d0ead06324be11331553 is the first bad commit commit 4002daf09545f321a070d0ead06324be11331553 Author: Chris Forbes Date: Wed Mar 26 10:03:38 2014 +1300 Revert "mesa: Fix format matching checks for GL_INTENSITY* internalformats." This reverts commit 40d7b5195351d3e4199e7a840615a595a6dbaefc. :04 04 bce2d8c961eb8680545dfcb6093d5fa0c8f84a4a e2afe16a16934d9b296972dae22d12fc980ecaae Msrc bisect run success -- You are receiving this mail because: You are the assignee for the bug. 
___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/5] i965/vec4: Change vec4_visitor::emit_lrp to use MAC for gen<6
This allows us to emit ADD/MUL/MAC instead of MUL/ADD/MUL/ADD, saving one
instruction and two temporary registers.

Signed-off-by: Juha-Pekka Heikkila
---
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 22 ++
 1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 298ca26..6e98c06 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1165,24 +1165,14 @@ vec4_visitor::emit_lrp(const dst_reg &dst,
    } else {
       /* Earlier generations don't support three source operations, so we
        * need to emit x*(1-a) + y*a.
-       *
-       * A better way to do this would be:
-       *    ADD one_minus_a, negate(a), 1.0f
-       *    MUL null, y, a
-       *    MAC dst, x, one_minus_a
-       * but we would need to support MAC and implicit accumulator.
        */
-      dst_reg y_times_a = dst_reg(this, glsl_type::vec4_type);
-      dst_reg one_minus_a = dst_reg(this, glsl_type::vec4_type);
-      dst_reg x_times_one_minus_a = dst_reg(this, glsl_type::vec4_type);
-      y_times_a.writemask = dst.writemask;
-      one_minus_a.writemask = dst.writemask;
-      x_times_one_minus_a.writemask = dst.writemask;
-
-      emit(MUL(y_times_a, y, a));
+      dst_reg one_minus_a = dst_reg(this, glsl_type::vec4_type);
+      one_minus_a.writemask = dst.writemask;
+
       emit(ADD(one_minus_a, negate(a), src_reg(1.0f)));
-      emit(MUL(x_times_one_minus_a, x, src_reg(one_minus_a)));
-      emit(ADD(dst, src_reg(x_times_one_minus_a), src_reg(y_times_a)));
+      vec4_instruction *mul = emit(MUL(dst_null_f(), y, a));
+      mul->writes_accumulator = true;
+      emit(MAC(dst, x, src_reg(one_minus_a)));
    }
 }

--
1.8.1.2
___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
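To illustrate the ADD/MUL/MAC sequence from the patch above, here is a minimal scalar sketch in C. It is not Mesa code: the `accumulator` variable and the `alu_*` helpers are hypothetical stand-ins for the hardware instructions, and the sketch assumes MAC computes accumulator + src0 * src1, as the comment removed by the patch suggests.

```c
#include <assert.h>

/* Hypothetical scalar model of the implicit accumulator; not Mesa code. */
static float accumulator;

/* ADD dst, src0, src1 */
static float alu_add(float src0, float src1)
{
   return src0 + src1;
}

/* MUL null, src0, src1 with the accumulator write enabled: the product
 * is kept only in the accumulator, no destination register is used. */
static void alu_mul_to_acc(float src0, float src1)
{
   accumulator = src0 * src1;
}

/* MAC dst, src0, src1: assumed to yield accumulator + src0 * src1. */
static float alu_mac(float src0, float src1)
{
   return accumulator + src0 * src1;
}

/* x*(1-a) + y*a using three operations and a single temporary. */
float lrp_via_mac(float x, float y, float a)
{
   float one_minus_a = alu_add(-a, 1.0f);   /* ADD one_minus_a, -a, 1.0 */
   alu_mul_to_acc(y, a);                    /* MUL null, y, a */
   return alu_mac(x, one_minus_a);          /* MAC dst, x, one_minus_a */
}

/* Usage: blending 2.0 and 10.0 with a = 0.25 gives 1.5 + 2.5. */
void lrp_via_mac_selfcheck(void)
{
   assert(lrp_via_mac(2.0f, 10.0f, 0.25f) == 4.0f);
}
```

Compared with MUL/ADD/MUL/ADD, the y*a product never needs a named temporary, which is where the register saving in the commit message comes from.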
Re: [Mesa-dev] [PATCH v2 0/9] Add ARB_query_buffer_object (swrast)
On 28.03.2014 07:51, Kenneth Graunke wrote: On 03/27/2014 01:59 PM, Rafal Mielniczuk wrote: Hello, This is the second version of the series implementing ARB_query_buffer_object in mesa. Main changes to the first version are: - Enable extension only for software driver - Fix possible segfault in patch #7 - Fix typos and comment style issues Drivers, which are not able to implement it without CPU roundtrip can just mark extension flag as true and use software fallback. I did not add any ctx.Driver callback for now, as I was not sure what would be the required api there. Perhaps it would be better to add this while implementing gpu acceleration in driver? I tested the extension by enabling software fallback in i965 driver. This is not included in this series. Thanks, Rafal Mielniczuk (9): glapi: Add xml infrastructure for ARB_query_buffer_object mesa: Add ARB_query_buffer_object extension flag mesa: Add QueryBuffer to context mesa: Handle QUERY_BUFFER_BINDING in GetIntegerv mesa: Handle QUERY_RESULT_NO_WAIT in GetQueryObject{ui64}v mesa: Fix typos in function names in queryobj mesa: Implement software fallback for ARB_query_buffer_object mesa: Enable GL_ARB_query_buffer_object for software drivers doc: mark GL_ARB_query_buffer_object as done for swrast docs/GL3.txt | 2 +- src/mapi/glapi/gen/ARB_query_buffer_object.xml | 18 src/mapi/glapi/gen/Makefile.am | 1 + src/mapi/glapi/gen/gl_API.xml | 4 + src/mesa/main/bufferobj.c | 5 ++ src/mesa/main/extensions.c | 2 + src/mesa/main/get.c| 5 ++ src/mesa/main/get_hash_params.py | 3 + src/mesa/main/mtypes.h | 3 + src/mesa/main/queryobj.c | 109 + 10 files changed, 135 insertions(+), 17 deletions(-) create mode 100644 src/mapi/glapi/gen/ARB_query_buffer_object.xml Hi Rafal! Hi Kenneth! :) Thanks for doing this! Looking through the spec, I had a few questions. Are applications required to bind GL_QUERY_BUFFER buffer before starting a query (and keep it bound for the duration of the query)? 
Or, does it only matter at glGetQueryObject time?

From my understanding it only matters at glGetQueryObject time. The spec says:

"if a non-zero buffer object is bound as the current query result buffer then <params> is treated as an offset into the designated buffer object"

It doesn't specifically say when the query buffer object should be bound. On the other hand, some of the examples in the spec bind the buffer twice: first before the query to set the buffer data, then again before glGetQueryObject. That was kind of confusing and seemed redundant, but I believe it was just done for the clarity of the example.

I'm guessing it's the latter - otherwise, it sure seems like the spec would have had to answer what happens if you change/delete it while the query is active...i.e.

glBindBuffer(GL_QUERY_BUFFER, queryBuffer);
glBeginQuery(GL_SAMPLES_PASSED, queryId);
...
glBindBuffer(GL_QUERY_BUFFER, 0); /* or some other buffer... */
...
glEndQuery(GL_SAMPLES_PASSED);

In i965, we use a buffer object (query->bo - see brw_context.h:846) during the query. We take an initial snapshot of some counter (i.e. PS_DEPTH_COUNT for occlusion queries, TIMESTAMP for timer queries) at BeginQuery time, and a second snapshot at EndQuery time. Both get stored in query->bo at offsets 0 and 8 (1*sizeof(uint64_t)). When they ask for the results, we map that buffer and use the CPU to subtract the two values.

So, avoiding a CPU round trip seems tricky. If we instead zeroed the counters at BeginQuery time, we would have a single counter that we could possibly blit from query->bo to the currently bound GL_QUERY_BUFFER via intel_emit_linear_blit. But I don't think we can do that for the TIMESTAMP register, and doing so would make implementing some upcoming GL extensions harder.

We could use intel_emit_linear_blit for DEPTH_COUNT and think of some other way for TIMESTAMP, or even allow a CPU roundtrip for it, if there is really no other way?

So, maybe we could bind query->bo as a UBO or TexBO, bind a shader that would read from there, subtract the two values, and write that to the GL_QUERY_BUFFER object. This could probably be done by drawing a single point primitive with a vertex shader and rasterizer discard. This could work, but it seems pretty ugly...

Maybe that's the only way for older gen hardware.

We might also be able to use MI_LOAD_REGISTER_MEM/MI_MATH/MI_STORE_REGISTER_MEM on Haswell, but I don't know how that works yet.

Thanks for the detailed overview. I will play with it, look into the Intel GPU specs and try to make it work without a roundtrip for some of the queries on my Ivy Bridge. That should keep me busy for a couple of days. :)

Thanks!
Rafal

--Ken
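The two-snapshot scheme described in the mail above (begin snapshot at offset 0, end snapshot at offset 8, subtracted on the CPU) can be sketched in plain C as follows. This is only an illustration under those stated assumptions, not the i965 implementation: `read_snapshot` and `query_result_from_snapshots` are hypothetical names, and `mapped` stands in for the mapped contents of query->bo.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte offsets of the two counter snapshots inside the query buffer,
 * following the layout described in the mail: 0 and 1*sizeof(uint64_t). */
enum {
   SNAPSHOT_BEGIN_OFFSET = 0,
   SNAPSHOT_END_OFFSET = sizeof(uint64_t)
};

/* Hypothetical helper: read one 64-bit snapshot from a mapped buffer.
 * memcpy avoids any alignment assumptions about the mapping. */
static uint64_t read_snapshot(const unsigned char *mapped, size_t offset)
{
   uint64_t value;
   memcpy(&value, mapped + offset, sizeof(value));
   return value;
}

/* CPU-side result: end snapshot minus begin snapshot. */
uint64_t query_result_from_snapshots(const unsigned char *mapped)
{
   assert(mapped != NULL);
   return read_snapshot(mapped, SNAPSHOT_END_OFFSET) -
          read_snapshot(mapped, SNAPSHOT_BEGIN_OFFSET);
}
```

The subtraction is the step that currently forces the map and CPU round trip; any GPU-side alternative discussed in the thread would have to perform the same end-minus-begin operation before writing into the bound GL_QUERY_BUFFER.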
[Mesa-dev] [Bug 76212] latest dota2 builds do crash with the radeon open source driver
https://bugs.freedesktop.org/show_bug.cgi?id=76212 Sylvain BERTRAND changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |NOTOURBUG --- Comment #4 from Sylvain BERTRAND --- Yes... it's that bug. Then it does not seem related to the radeon open source driver. How did you manage to get the symbols? -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 24/24] automake: ask the linker to do garbage collection
On Fri, 28 Mar 2014 10:08:53 + Emil Velikov wrote:
> On 28/03/14 08:35, Lauri Kasanen wrote:
> > Emil Velikov wrote:
> >
> >> By doing GC the linker removes all the symbols that are not referenced
> >> and/or used by the final library. This results in a saving of ~100K
> >> up-to ~600K per (stripped) binary (classic vs gallium drivers).
> >>
> >> If interested one can ask the compiler to print the sections that are
> >> removed using -Wl,--print-gc-sections.
> >
> > I didn't see where you add the corresponding flags to CFLAGS/CXXFLAGS.
> >
> What do you have in mind with "the corresponding flags" ?

The next paragraph, i.e. -ffunction-sections and -fdata-sections.

> > Without -ffunction-sections -fdata-sections in compile flags the garbage
> > collection is handicapped - add those by default too, and see much
> > greater gains.
> >
> "Handicapped" or not it produces very nice results at this point. AFAICS both
> options that you've mentioned have some interesting side effects, from which I
> would rather opt out at this stage. Speaking of "much greater gains" do you
> have any numbers in mind (wrt mesa) ?

No numbers for mesa, sorry. Without those switches the garbage collection is effectively file-level, not symbol/function-level.

- Lauri
___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] RFC: per-driver extension lists
Am 28.03.2014 05:06, schrieb Kenneth Graunke: > > I've attached the info for everything on the i965 driver. In case you > ever want to regenerate things, I just ran: > > $ INTEL_DEVID_OVERRIDE=0x29a2 glxinfo -l -s &> gen4 > $ INTEL_DEVID_OVERRIDE=0x2a42 glxinfo -l -s &> gen4.5 > $ INTEL_DEVID_OVERRIDE=0x0046 glxinfo -l -s &> gen5 > $ INTEL_DEVID_OVERRIDE=0x0116 glxinfo -l -s &> gen6 > $ INTEL_DEVID_OVERRIDE=0x0166 glxinfo -l -s &> gen7-ivb > $ INTEL_DEVID_OVERRIDE=0x0f31 glxinfo -l -s &> gen7-byt > $ INTEL_DEVID_OVERRIDE=0x0d26 glxinfo -l -s &> gen7.5 > > Naming for these are kind of...complicated. :( And you even forgot some names - HD Graphics (no number) is used for SandyBridge/Ivy Bridge too... Roland > > Gen4: > Intel GMA X3000* / G965* / Broadwater (* tie for most recognizable) > Intel GMA X3100 / GM965 / Crestline > > Gen4.5: > Intel GMA 4500HD / G45* / Eaglelake > Intel GMA 4500MHD / GM45* / Cantiga (* G45/GM45 are most recognizable) > > Gen5: > Intel HD Graphics / Ironlake* / Arrandale / Clarkdale (* most recognizable) > > Gen6: > Intel HD 2000 / HD 3000 / Sandybridge > > Gen7: > Intel HD 2500 / HD 4000 / Ivybridge > > Gen7 LP: > Intel HD Graphics / Baytrail > > Gen7.5: > Intel HD Graphics (no number for GT1) / HD 4200, 4400, 4600, 5000 / Iris > 5100 / Iris Pro 5200 / Haswell / Crystalwell > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/4] glsl: Move Doxygen block closing to the correct place
On 03/27/2014 02:33 PM, Ian Romanick wrote: > From: Ian Romanick > > This is the closing for the "\defgroup IR Intermediate representation > nodes" all the way at the top of the file. > > Signed-off-by: Ian Romanick > --- > src/glsl/ir.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/src/glsl/ir.h b/src/glsl/ir.h > index 8fa3b9e..ee276d2 100644 > --- a/src/glsl/ir.h > +++ b/src/glsl/ir.h > @@ -2186,8 +2186,6 @@ private: > ir_constant(void); > }; > > -/*@}*/ > - > /** > * IR instruction to emit a vertex in a geometry shader. > */ > @@ -2235,6 +2233,8 @@ public: > virtual ir_visitor_status accept(ir_hierarchical_visitor *); > }; > > +/*@}*/ > + > /** > * Apply a visitor to each IR node in a list > */ > Series is: Reviewed-by: Kenneth Graunke signature.asc Description: OpenPGP digital signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 24/24] automake: ask the linker to do garbage collection
On Fri, 28 Mar 2014 09:57:15 +0100 Marc Dietrich wrote:
> lto is broken on many compiler/ld combinations. Even if it is supported I
> won't recommend enabling it. A config option to enable it would be nice
> though.
>
> Lauri, on which compiler/binutils version did you get it going?

GNU ld (GNU Binutils) 2.23.1
gcc-4.7 (GCC) 4.7.1

In my experience LTO wasn't fully complete in GCC 4.6 (with many projects failing to build), but in 4.7 and 4.8 it works great. I haven't built mesa with LTO though, only my own projects.

- Lauri
___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] i965: Vector splitting of outputs
On 03/26/2014 02:23 PM, Eric Anholt wrote: > Here's a little series I wrote yesterday while tracking down why some > silly MOVs were generated in a microbenchmark. As usual with > optimization, the thing I was trying to work on (not present in this > series) ended up requiring a bunch of other work to prevent regressions. > > Well, the first patch is an old one not from this series that still hadn't > seen review, but I'd love to get it landed. > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev Patches 1-8 and 10 are: Reviewed-by: Kenneth Graunke Patch 9 needs some fixing up, I think. signature.asc Description: OpenPGP digital signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] RFC: per-driver extension lists
It would be nicer not to use dots to indicate that an extension is supported. I recommend using a high-contrast background color for supported extensions and white or another neutral color for unsupported extensions, much like piglit uses green for pass and gray for skip, but each vendor can have its own color.

Marek

On Fri, Mar 28, 2014 at 3:50 PM, Ilia Mirkin wrote:
> On Fri, Mar 28, 2014 at 10:15 AM, Roland Scheidegger wrote:
>> Am 28.03.2014 05:06, schrieb Kenneth Graunke:
>>>
>>> I've attached the info for everything on the i965 driver. In case you
>>> ever want to regenerate things, I just ran:
>>>
>>> $ INTEL_DEVID_OVERRIDE=0x29a2 glxinfo -l -s &> gen4
>>> $ INTEL_DEVID_OVERRIDE=0x2a42 glxinfo -l -s &> gen4.5
>>> $ INTEL_DEVID_OVERRIDE=0x0046 glxinfo -l -s &> gen5
>>> $ INTEL_DEVID_OVERRIDE=0x0116 glxinfo -l -s &> gen6
>>> $ INTEL_DEVID_OVERRIDE=0x0166 glxinfo -l -s &> gen7-ivb
>>> $ INTEL_DEVID_OVERRIDE=0x0f31 glxinfo -l -s &> gen7-byt
>>> $ INTEL_DEVID_OVERRIDE=0x0d26 glxinfo -l -s &> gen7.5
>>>
>>> Naming for these are kind of...complicated. :(
>> And you even forgot some names - HD Graphics (no number) is used for
>> SandyBridge/Ivy Bridge too...
>
> So that name is clearly unusable. Unfortunately the generated
> glxinfo's were for a git mesa build, but armed with this information,
> I was able to regenerate them on my own snb laptop. Unfortunately it
> turns out these aren't quite accurate, apparently there are some
> checks that need to happen in order for ARB_(multi_)draw_indirect,
> ARB_tf2/3/instanced, and AMD_performance_monitor to appear activated.
> I've manually added those in for IVB and left them out for HSW since
> I'm told there's presently no kernel support for them. (I had actually
> received a HSW glxinfo earlier from Eric Anholt, but I guess he must
> have been running with a hacked up kernel.)
>
> If you guys notice any errors in these, please let me know.
I also > chose some names that made sense based on the below info from Ken, as > well as the names reported by the driver (965G vs G965, for some > reason GM45 gets a utf8'd registered symbol while all the other > versions are satisified with a plain (R), etc). > > I think at this point, my list is complete for semi-recent hardware > (i965, nv50, nvc0, r600, radeonsi drivers), and vastly incomplete for > pretty much anything else. [Ugh, with the exception of NVA0 (G200), > which has a few things G80-G98 don't, and is missing a few things that > GT21x have. But I'll get it.] > > Do people have opinions on whether it'd be useful to also gather data > for older hardware? FWIW I threw my TNT2 in there, which is probably > among the oldest hw supported by mesa. > > Any other suggestions? As a reminder, the current list is available at > http://people.freedesktop.org/~imirkin/glxinfo/glxinfo.html (defaults > to core context, so older stuff doesn't show up, click on 'compat' to > see it). > > I'm thinking of adding > - history support (would allow one to link directly to a > version/{core,compat}) > - selecting specific configs to make comparison easier > - option to hide rows that are supported by everything > - some sort of design + words around the table to explain what it is > and how to use it > >>> >>> Gen4: >>> Intel GMA X3000* / G965* / Broadwater (* tie for most recognizable) >>> Intel GMA X3100 / GM965 / Crestline >>> >>> Gen4.5: >>> Intel GMA 4500HD / G45* / Eaglelake >>> Intel GMA 4500MHD / GM45* / Cantiga (* G45/GM45 are most recognizable) >>> >>> Gen5: >>> Intel HD Graphics / Ironlake* / Arrandale / Clarkdale (* most recognizable) >>> >>> Gen6: >>> Intel HD 2000 / HD 3000 / Sandybridge >>> >>> Gen7: >>> Intel HD 2500 / HD 4000 / Ivybridge >>> >>> Gen7 LP: >>> Intel HD Graphics / Baytrail >>> >>> Gen7.5: >>> Intel HD Graphics (no number for GT1) / HD 4200, 4400, 4600, 5000 / Iris >>> 5100 / Iris Pro 5200 / Haswell / Crystalwell >>> >>> > ___ > mesa-dev 
mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] RFC: per-driver extension lists
> > Do people have opinions on whether it'd be useful to also gather data > for older hardware? FWIW I threw my TNT2 in there, which is probably > among the oldest hw supported by mesa. > I'm not sure if it's worthwhile or not, but if you want/need it, I've got a Radeon x1950 at home that I can pop in for a r300g run. > Any other suggestions? As a reminder, the current list is available at > http://people.freedesktop.org/~imirkin/glxinfo/glxinfo.html (defaults > to core context, so older stuff doesn't show up, click on 'compat' to > see it). > I like the UI in general, the one suggestion that I have at the moment is to split into two divs. Anchor the driver names/generations to the top of the window (position:fixed) and allow the table content to continue to be scrolled. That way you always have the card names at the top of your screen. You'll probably need to add a dynamically-sized spacer to the top of the 2nd div, but I'll leave that as an exercise to the reader. --Aaron > I'm thinking of adding > - history support (would allow one to link directly to a > version/{core,compat}) > - selecting specific configs to make comparison easier > - option to hide rows that are supported by everything > - some sort of design + words around the table to explain what it is > and how to use it > >>> >>> Gen4: >>> Intel GMA X3000* / G965* / Broadwater (* tie for most recognizable) >>> Intel GMA X3100 / GM965 / Crestline >>> >>> Gen4.5: >>> Intel GMA 4500HD / G45* / Eaglelake >>> Intel GMA 4500MHD / GM45* / Cantiga (* G45/GM45 are most recognizable) >>> >>> Gen5: >>> Intel HD Graphics / Ironlake* / Arrandale / Clarkdale (* most recognizable) >>> >>> Gen6: >>> Intel HD 2000 / HD 3000 / Sandybridge >>> >>> Gen7: >>> Intel HD 2500 / HD 4000 / Ivybridge >>> >>> Gen7 LP: >>> Intel HD Graphics / Baytrail >>> >>> Gen7.5: >>> Intel HD Graphics (no number for GT1) / HD 4200, 4400, 4600, 5000 / Iris >>> 5100 / Iris Pro 5200 / Haswell / Crystalwell >>> >>> > ___ > mesa-dev mailing list > 
mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 09/10] i965/fs: Track output regs on a split virtual GRF basis.
On 03/26/2014 02:23 PM, Eric Anholt wrote: > Basically, replace the output_components[] array with per-channel tracking > of the register storing that channel, or a BAD_FILE undefined reg. > > Right now var->data.location_frac is always 0, but I'm going to use that > in vector_splitting next. > --- > src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +- > src/mesa/drivers/dri/i965/brw_fs.h | 5 +-- > src/mesa/drivers/dri/i965/brw_fs_fp.cpp | 18 + > src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 55 > ++-- > 4 files changed, 40 insertions(+), 40 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp > b/src/mesa/drivers/dri/i965/brw_fs.cpp > index 0d24f59..eee0c8a 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp > @@ -1732,7 +1732,7 @@ fs_visitor::compact_virtual_grfs() >{ &pixel_y, 1 }, >{ &pixel_w, 1 }, >{ &wpos_w, 1 }, > - { &dual_src_output, 1 }, > + { dual_src_output, ARRAY_SIZE(dual_src_output) }, >{ outputs, ARRAY_SIZE(outputs) }, >{ delta_x, ARRAY_SIZE(delta_x) }, >{ delta_y, ARRAY_SIZE(delta_y) }, > diff --git a/src/mesa/drivers/dri/i965/brw_fs.h > b/src/mesa/drivers/dri/i965/brw_fs.h > index f410733..d47bc28 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs.h > +++ b/src/mesa/drivers/dri/i965/brw_fs.h > @@ -526,9 +526,8 @@ public: > struct hash_table *variable_ht; > fs_reg frag_depth; > fs_reg sample_mask; > - fs_reg outputs[BRW_MAX_DRAW_BUFFERS]; > - unsigned output_components[BRW_MAX_DRAW_BUFFERS]; > - fs_reg dual_src_output; > + fs_reg outputs[BRW_MAX_DRAW_BUFFERS * 4]; > + fs_reg dual_src_output[4]; > bool do_dual_src; > int first_non_payload_grf; > /** Either BRW_MAX_GRF or GEN7_MRF_HACK_START */ > diff --git a/src/mesa/drivers/dri/i965/brw_fs_fp.cpp > b/src/mesa/drivers/dri/i965/brw_fs_fp.cpp > index 49eaf05..19483e3 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_fp.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_fp.cpp > @@ -646,25 +646,27 @@ fs_visitor::get_fp_dst_reg(const prog_dst_register *dst) > 
return frag_depth;
>    } else if (dst->Index == FRAG_RESULT_COLOR) {
>       if (outputs[0].file == BAD_FILE) {
> -        outputs[0] = fs_reg(this, glsl_type::vec4_type);
> -        output_components[0] = 4;
> +        fs_reg reg = fs_reg(this, glsl_type::vec4_type);
>
>          /* Tell emit_fb_writes() to smear fragment.color across all the
>           * color attachments.
>           */
>          for (int i = 1; i < c->key.nr_color_regions; i++) {
> -           outputs[i] = outputs[0];
> -           output_components[i] = output_components[0];
> +           for (int j = 0; j < 4; j++) {
> +              outputs[i * 4 + j] = offset(reg, j);
> +           }
>          }
>       }
>       return outputs[0];

This sure doesn't look right. We check if outputs[0].file == BAD_FILE, and then proceed to initialize outputs[4], outputs[5], etc...but never outputs[0] through outputs[3]. Then we return outputs[0], giving them a bogus register...

Did you mean to change the loop condition to 'int i = 0'?

>    } else {

I'm pretty sure the above bug means this else case will never happen...

The GLSL code looks mostly reasonable.

>       int output_index = dst->Index - FRAG_RESULT_DATA0;
> -     if (outputs[output_index].file == BAD_FILE) {
> -        outputs[output_index] = fs_reg(this, glsl_type::vec4_type);
> +     if (outputs[output_index * 4].file == BAD_FILE) {
> +        fs_reg reg = fs_reg(this, glsl_type::vec4_type);
> +        for (int i = 0; i < 4; i++) {
> +           outputs[output_index * 4 + i] = offset(reg, i);
> +        }
>       }
> -     output_components[output_index] = 4;
> -     return outputs[output_index];
> +     return outputs[output_index * 4];
>    }
>
>  case PROGRAM_UNDEFINED:
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 047ec21..d9bb4de 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -70,17 +70,25 @@ fs_visitor::visit(ir_variable *ir)
>     } else if (ir->data.mode == ir_var_shader_out) {
>        reg = new(this->mem_ctx) fs_reg(this, ir->type);
>
> +      int vector_elements =
> +         ir->type->is_array() ?
ir->type->fields.array->vector_elements > + : ir->type->vector_elements; > + >if (ir->data.index > 0) { > - assert(ir->data.location == FRAG_RESULT_DATA0); > - assert(ir->data.index == 1); > - this->dual_src_output = *reg; > + assert(ir->data.location == FRAG_RESULT_DATA0); > + assert(ir->data.index == 1); > + for (unsigned i = 0; i < vector_elements; i++) { > +this->dual_src_output[i + ir->data.location_frac] = offset(
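The indexing problem raised in the review above can be reproduced with a small stand-alone C model. This is a sketch, not the i965 code: `chan_reg`, `smear_color` and `slot_valid_after_smear` are hypothetical names, and a `valid` flag stands in for "file != BAD_FILE". It assumes, as the reviewer suggests, that the intended behaviour is to start the outer loop at region 0.

```c
#include <assert.h>
#include <stdbool.h>

enum { MAX_DRAW_BUFFERS = 8, CHANNELS = 4 };

/* Hypothetical per-channel register slot; valid == false models BAD_FILE. */
typedef struct {
   bool valid;
   int reg;
   int channel;
} chan_reg;

/* Point every channel slot of regions [first_region, nr_color_regions)
 * at the matching channel of one vec4 register. */
static void smear_color(chan_reg *outputs, int first_region,
                        int nr_color_regions, int reg)
{
   assert(nr_color_regions <= MAX_DRAW_BUFFERS);
   for (int i = first_region; i < nr_color_regions; i++) {
      for (int j = 0; j < CHANNELS; j++) {
         outputs[i * CHANNELS + j] = (chan_reg){ true, reg, j };
      }
   }
}

/* Report whether a given slot was initialized by the smear loop. */
bool slot_valid_after_smear(int first_region, int nr_color_regions, int slot)
{
   chan_reg outputs[MAX_DRAW_BUFFERS * CHANNELS] = {{ false, 0, 0 }};

   assert(slot >= 0 && slot < MAX_DRAW_BUFFERS * CHANNELS);
   smear_color(outputs, first_region, nr_color_regions, /* reg */ 7);
   return outputs[slot].valid;
}
```

With `first_region` set to 1, as in the quoted patch, slots 0 through 3 stay uninitialized even though slot 0 is the one returned to the caller; with `first_region` set to 0 they are filled in like every other region.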
Re: [Mesa-dev] RFC: per-driver extension lists
On Fri, Mar 28, 2014 at 10:15 AM, Roland Scheidegger wrote:
> Am 28.03.2014 05:06, schrieb Kenneth Graunke:
>>
>> I've attached the info for everything on the i965 driver. In case you
>> ever want to regenerate things, I just ran:
>>
>> $ INTEL_DEVID_OVERRIDE=0x29a2 glxinfo -l -s &> gen4
>> $ INTEL_DEVID_OVERRIDE=0x2a42 glxinfo -l -s &> gen4.5
>> $ INTEL_DEVID_OVERRIDE=0x0046 glxinfo -l -s &> gen5
>> $ INTEL_DEVID_OVERRIDE=0x0116 glxinfo -l -s &> gen6
>> $ INTEL_DEVID_OVERRIDE=0x0166 glxinfo -l -s &> gen7-ivb
>> $ INTEL_DEVID_OVERRIDE=0x0f31 glxinfo -l -s &> gen7-byt
>> $ INTEL_DEVID_OVERRIDE=0x0d26 glxinfo -l -s &> gen7.5
>>
>> Naming for these are kind of...complicated. :(
> And you even forgot some names - HD Graphics (no number) is used for
> SandyBridge/Ivy Bridge too...

So that name is clearly unusable. Unfortunately the generated glxinfo's were for a git mesa build, but armed with this information, I was able to regenerate them on my own snb laptop. Unfortunately it turns out these aren't quite accurate; apparently there are some checks that need to happen in order for ARB_(multi_)draw_indirect, ARB_tf2/3/instanced, and AMD_performance_monitor to appear activated. I've manually added those in for IVB and left them out for HSW since I'm told there's presently no kernel support for them. (I had actually received a HSW glxinfo earlier from Eric Anholt, but I guess he must have been running with a hacked up kernel.)

If you guys notice any errors in these, please let me know. I also chose some names that made sense based on the below info from Ken, as well as the names reported by the driver (965G vs G965, for some reason GM45 gets a utf8'd registered symbol while all the other versions are satisfied with a plain (R), etc).

I think at this point, my list is complete for semi-recent hardware (i965, nv50, nvc0, r600, radeonsi drivers), and vastly incomplete for pretty much anything else.
[Ugh, with the exception of NVA0 (G200), which has a few things G80-G98 don't, and is missing a few things that GT21x have. But I'll get it.] Do people have opinions on whether it'd be useful to also gather data for older hardware? FWIW I threw my TNT2 in there, which is probably among the oldest hw supported by mesa. Any other suggestions? As a reminder, the current list is available at http://people.freedesktop.org/~imirkin/glxinfo/glxinfo.html (defaults to core context, so older stuff doesn't show up, click on 'compat' to see it). I'm thinking of adding - history support (would allow one to link directly to a version/{core,compat}) - selecting specific configs to make comparison easier - option to hide rows that are supported by everything - some sort of design + words around the table to explain what it is and how to use it >> >> Gen4: >> Intel GMA X3000* / G965* / Broadwater (* tie for most recognizable) >> Intel GMA X3100 / GM965 / Crestline >> >> Gen4.5: >> Intel GMA 4500HD / G45* / Eaglelake >> Intel GMA 4500MHD / GM45* / Cantiga (* G45/GM45 are most recognizable) >> >> Gen5: >> Intel HD Graphics / Ironlake* / Arrandale / Clarkdale (* most recognizable) >> >> Gen6: >> Intel HD 2000 / HD 3000 / Sandybridge >> >> Gen7: >> Intel HD 2500 / HD 4000 / Ivybridge >> >> Gen7 LP: >> Intel HD Graphics / Baytrail >> >> Gen7.5: >> Intel HD Graphics (no number for GT1) / HD 4200, 4400, 4600, 5000 / Iris >> 5100 / Iris Pro 5200 / Haswell / Crystalwell >> >> ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/5] i965/fs: Add support for the MAC instruction.
This allows us to generate the MAC (multiply-accumulate) instruction,
which can be used to implement some expressions in fewer instructions
than doing a series of MUL and ADDs.

Signed-off-by: Juha-Pekka Heikkila
---
 src/mesa/drivers/dri/i965/brw_fs.cpp           | 1 +
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 7dce84a..1bbd911 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -196,6 +196,7 @@ ALU3(MAD)
 ALU2_ACC(ADDC)
 ALU2_ACC(SUBB)
 ALU2(SEL)
+ALU2(MAC)

 /** Gen4 predicated IF. */
 fs_inst *
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 1cf35b4..c6b4aae 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1545,6 +1545,9 @@ fs_generator::generate_code(exec_list *instructions, FILE *dump_file)
          assert(brw->gen >= 7);
          brw_SUBB(p, dst, src[0], src[1]);
          break;
+      case BRW_OPCODE_MAC:
+         brw_MAC(p, dst, src[0], src[1]);
+         break;

       case BRW_OPCODE_BFE:
          assert(brw->gen >= 7);
--
1.8.1.2
___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/5] i965: Add writes_accumulator flag
Our hardware has an "accumulator" register, which can be used to store intermediate results across multiple instructions. Many instructions can implicitly write a value to the accumulator in addition to their normal destination register. This is enabled by the "AccWrEn" flag. This patch introduces a new flag, inst->writes_accumulator, which allows us to express the AccWrEn notion in the IR. It also creates a n ALU2_ACC macro to easily define emitters for instructions that implicitly write the accumulator. Previously, we only supported implicit accumulator writes from the ADDC, SUBB, and MACH instructions. We always enabled them on those instructions, and left them disabled for other instructions. To take advantage of the MAC (multiply-accumulate) instruction, we need to be able to set AccWrEn on other types of instructions. Signed-off-by: Juha-Pekka Heikkila --- src/mesa/drivers/dri/i965/brw_fs.cpp | 26 +- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 7 +- .../drivers/dri/i965/brw_schedule_instructions.cpp | 8 +++ src/mesa/drivers/dri/i965/brw_shader.h | 1 + src/mesa/drivers/dri/i965/brw_vec4.cpp | 17 +- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 7 +- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 17 +++--- 7 files changed, 42 insertions(+), 41 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 713e477..7dce84a 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -64,6 +64,8 @@ fs_inst::init() /* This will be the case for almost all instructions. 
*/ this->regs_written = 1; + + this->writes_accumulator = false; } fs_inst::fs_inst() @@ -151,6 +153,15 @@ fs_inst::fs_inst(enum opcode opcode, fs_reg dst, return new(mem_ctx) fs_inst(BRW_OPCODE_##op, dst, src0, src1);\ } +#define ALU2_ACC(op)\ + fs_inst *\ + fs_visitor::op(fs_reg dst, fs_reg src0, fs_reg src1) \ + {\ + fs_inst *inst = new(mem_ctx) fs_inst(BRW_OPCODE_##op, dst, src0, src1);\ + inst->writes_accumulator = true; \ + return inst; \ + } + #define ALU3(op)\ fs_inst *\ fs_visitor::op(fs_reg dst, fs_reg src0, fs_reg src1, fs_reg src2)\ @@ -166,7 +177,7 @@ ALU1(RNDE) ALU1(RNDZ) ALU2(ADD) ALU2(MUL) -ALU2(MACH) +ALU2_ACC(MACH) ALU2(AND) ALU2(OR) ALU2(XOR) @@ -182,8 +193,8 @@ ALU1(FBH) ALU1(FBL) ALU1(CBIT) ALU3(MAD) -ALU2(ADDC) -ALU2(SUBB) +ALU2_ACC(ADDC) +ALU2_ACC(SUBB) ALU2(SEL) /** Gen4 predicated IF. */ @@ -2107,16 +2118,11 @@ fs_visitor::dead_code_eliminate() * accumulator as a side-effect. Instead just set the destination * to the null register to free it. */ -switch (inst->opcode) { -case BRW_OPCODE_ADDC: -case BRW_OPCODE_SUBB: -case BRW_OPCODE_MACH: +if (inst->writes_accumulator) { inst->dst = fs_reg(retype(brw_null_reg(), inst->dst.type)); - break; -default: +} else { inst->remove(); progress = true; - break; } } } diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index e590bdf..1cf35b4 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -1411,6 +1411,7 @@ fs_generator::generate_code(exec_list *instructions, FILE *dump_file) brw_set_flag_reg(p, 0, inst->flag_subreg); brw_set_saturate(p, inst->saturate); brw_set_mask_control(p, inst->force_writemask_all); + brw_set_acc_write_control(p, inst->writes_accumulator); if (inst->force_uncompressed || dispatch_width == 8) { brw_set_compression_control(p, BRW_COMPRESSION_NONE); @@ -1434,9 +1435,7 @@ fs_generator::generate_code(exec_list *instructions, FILE *dump_file) brw_AVG(p, dst, 
src[0], src[1]); break; case BRW_OPCODE_MACH: -brw_set_acc_write_control(p, 1); brw_MACH(p, dst, src[0], src[1]); -brw_set_acc_write_control(p, 0); break; case BRW_OPCODE_MAD: @@ -1540,15 +1539,11 @@ fs_generator::generate_code(exec_list *instructions, FILE *dump_file) break; case BRW_OPCODE_ADDC: assert(brw->gen >= 7); - brw_set_acc_write_control(p, 1); brw_ADDC(p, dst, src[0], src[1]); - brw_set_acc_write_control
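The dead code elimination change in this patch can be illustrated outside the driver. The following is only a minimal sketch: Inst, NULL_REG and eliminate_dead_code are hypothetical names invented for this example, not the driver's fs_inst or fs_visitor::dead_code_eliminate(). It shows the rule the patch expresses: a dead instruction is removed unless it writes the accumulator, in which case it is kept and its destination is pointed at the null register.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for an IR instruction (not fs_inst).
struct Inst {
    int dst;                  // destination register; NULL_REG discards the result
    bool writes_accumulator;  // models the AccWrEn bit
    bool dst_is_live;         // true if a later instruction reads dst
};

const int NULL_REG = -1;

// Removes dead instructions, except those that implicitly write the
// accumulator: those are kept and their destination is redirected to the
// null register. Returns the number of removed instructions.
inline int eliminate_dead_code(std::vector<Inst> &insts) {
    std::vector<Inst> kept;
    int removed = 0;
    for (Inst inst : insts) {
        if (!inst.dst_is_live) {
            if (!inst.writes_accumulator) {
                ++removed;  // no side effect: safe to drop
                continue;
            }
            inst.dst = NULL_REG;  // keep the accumulator write, free the register
        }
        kept.push_back(inst);
    }
    insts.swap(kept);
    return removed;
}

inline void example_dead_code() {
    std::vector<Inst> insts = {
        {1, false, true},   // live result: kept unchanged
        {2, true, false},   // dead, but writes the accumulator: kept, dst nulled
        {3, false, false},  // dead, no side effect: removed
    };
    int removed = eliminate_dead_code(insts);
    assert(removed == 1);
    assert(insts.size() == 2 && insts[1].dst == NULL_REG);
}
```

Keying the decision on a per-instruction flag rather than on an opcode list is what lets a plain MUL take part once its flag is set by hand.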
[Mesa-dev] [PATCH 5/5] i965/fs: Change fs_visitor::emit_lrp to use MAC for gen<6
This allows us to emit ADD/MUL/MAC instead of MUL/ADD/MUL/ADD, saving one instruction and two temporary registers. Signed-off-by: Juha-Pekka Heikkila --- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 11 --- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index ce6d3da..5a8aae2 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -220,18 +220,15 @@ fs_visitor::emit_lrp(const fs_reg &dst, const fs_reg &x, const fs_reg &y, !y.is_valid_3src() || !a.is_valid_3src()) { /* We can't use the LRP instruction. Emit x*(1-a) + y*a. */ - fs_reg y_times_a = fs_reg(this, glsl_type::float_type); fs_reg one_minus_a = fs_reg(this, glsl_type::float_type); - fs_reg x_times_one_minus_a = fs_reg(this, glsl_type::float_type); - - emit(MUL(y_times_a, y, a)); fs_reg negative_a = a; negative_a.negate = !a.negate; - emit(ADD(one_minus_a, negative_a, fs_reg(1.0f))); - emit(MUL(x_times_one_minus_a, x, one_minus_a)); - emit(ADD(dst, x_times_one_minus_a, y_times_a)); + emit(ADD(one_minus_a, negative_a, fs_reg(1.0f))); + fs_inst *mul = emit(MUL(reg_null_f, y, a)); + mul->writes_accumulator = true; + emit(MAC(dst, x, one_minus_a)); } else { /* The LRP instruction actually does op1 * op0 + op2 * (1 - op0), so * we need to reorder the operands. -- 1.8.1.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
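As a rough illustration of the ADD/MUL/MAC sequence described above, here is a toy model in C++. ToyAlu and lrp_via_mac are hypothetical names for this sketch only, and the MAC semantics used (src0 * src1 + accumulator) are an assumption made for illustration rather than a statement about the hardware documentation.

```cpp
#include <cassert>

// Toy execution unit with an accumulator (illustrative only).
struct ToyAlu {
    float acc = 0.0f;

    float add(float a, float b) { return a + b; }

    // MUL; when writes_accumulator is set the product also lands in acc.
    float mul(float a, float b, bool writes_accumulator) {
        float r = a * b;
        if (writes_accumulator)
            acc = r;
        return r;
    }

    // MAC, assumed here to compute src0 * src1 + acc.
    float mac(float a, float b) { return a * b + acc; }
};

// x*(1-a) + y*a in three operations and one temporary.
inline float lrp_via_mac(float x, float y, float a) {
    ToyAlu alu;
    float one_minus_a = alu.add(-a, 1.0f);  // ADD one_minus_a, -a, 1.0
    alu.mul(y, a, true);                    // MUL null, y, a   (acc = y*a)
    return alu.mac(x, one_minus_a);         // MAC dst, x, one_minus_a
}

inline void example_lrp() {
    assert(lrp_via_mac(2.0f, 6.0f, 0.0f) == 2.0f);   // a = 0 selects x
    assert(lrp_via_mac(2.0f, 6.0f, 1.0f) == 6.0f);   // a = 1 selects y
    assert(lrp_via_mac(2.0f, 6.0f, 0.25f) == 3.0f);  // 1.5 + 1.5
}
```

The only named temporary left is one_minus_a; y*a lives in the accumulator and the final sum goes straight to the destination, which matches the "two temporary registers" saving in the commit message.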
[Mesa-dev] [PATCH 2/5] i965/vec4: Add support for the MAC instruction.
This allows us to generate the MAC (multiply-accumulate) instruction, which can be used to implement some expressions in fewer instructions than doing a series of MUL and ADDs. Signed-off-by: Juha-Pekka Heikkila --- src/mesa/drivers/dri/i965/brw_eu.h | 1 + src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 3 +++ src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 1 + 3 files changed, 5 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h index 5df6bb7..f10ad50 100644 --- a/src/mesa/drivers/dri/i965/brw_eu.h +++ b/src/mesa/drivers/dri/i965/brw_eu.h @@ -183,6 +183,7 @@ ALU1(FBL) ALU1(CBIT) ALU2(ADDC) ALU2(SUBB) +ALU2(MAC) ROUND(RNDZ) ROUND(RNDE) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp index 5f85d31..bcacde9 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp @@ -1081,6 +1081,9 @@ vec4_generator::generate_vec4_instruction(vec4_instruction *instruction, assert(brw->gen >= 7); brw_SUBB(p, dst, src[0], src[1]); break; + case BRW_OPCODE_MAC: + brw_MAC(p, dst, src[0], src[1]); + break; case BRW_OPCODE_BFE: assert(brw->gen >= 7); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index 2600114..298ca26 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp @@ -175,6 +175,7 @@ ALU1(CBIT) ALU3(MAD) ALU2_ACC(ADDC) ALU2_ACC(SUBB) +ALU2(MAC) /** Gen4 predicated IF. */ vec4_instruction * -- 1.8.1.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/5] Using MAC to optimize LRP on i965 gen < 6
v3: I took the accumulator flag out of backend_instruction::has_side_effects(), as Matt suggested, and rebased my patches on top of master, where Matt's patches, which overlap a bit with mine on dead code elimination, had landed. This set does not do anything about the MACH anomaly in vec4_visitor::visit and fs_visitor::visit; I will reverify it and do something about it later if needed. I tried these on an Ironlake machine and did not see any regressions in the Piglit quick set.

v2: Thanks Matt, Eric and Kenneth for the comments; I reworked my set to look a bit different. I noticed Matt has a partially overlapping set of patches about dead code elimination, and I did not have access to an Ironlake machine this week, hence I am calling this an 'RFC'y set.

Kenneth, I think pedantic is good :) I hope you don't mind that I took your versions of the commit messages almost directly, as I probably could not come up with anything better.

The biggest difference is that I moved the accumulator flag to the backend_instruction class, which is accessible from both vec4 and fs. This also allows the scheduler to see the flag easily via backend_instruction::has_side_effects(), and removes the switch/cases for ADDC/SUBB/MACH from dead code elimination. The fs side should also be working now; it was almost a cut'n'paste type of thing anyhow.

As for correctly scheduling instructions, having has_side_effects() return true when writes_accumulator is true seems to make everything OK without needing to touch calculate_deps().

I ran these new LRP pieces on my Ivybridge machine, and with Piglit I don't see any regressions on the glsl tests. Next week I can probably try with Ironlake.

Juha-Pekka Heikkila (5):
  i965: Add writes_accumulator flag
  i965/vec4: Add support for the MAC instruction.
  i965/vec4: Change vec4_visitor::emit_lrp to use MAC for gen<6
  i965/fs: Add support for the MAC instruction.
  i965/fs: Change fs_visitor::emit_lrp to use MAC for gen<6

 src/mesa/drivers/dri/i965/brw_eu.h | 1 +
 src/mesa/drivers/dri/i965/brw_fs.cpp | 27 +--
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 10 +++---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 11 +++---
 .../drivers/dri/i965/brw_schedule_instructions.cpp | 8 ++---
 src/mesa/drivers/dri/i965/brw_shader.h | 1 +
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 17 +++--
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 10 +++---
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 40 --
 9 files changed, 61 insertions(+), 64 deletions(-)

-- 
1.8.1.2
Re: [Mesa-dev] [PATCH] targets/omx: do not link against the trace driver
On 28.03.2014 13:26, Emil Velikov wrote:
Unused due to the missing GALLIUM_TRACE define.

Requested-by: Christian König
Signed-off-by: Emil Velikov

Reviewed-by: Christian König

---
Hi Christian

Seems like none of the other video accel targets use the tracer. I'm assuming that the VDPAU issue(s) you've mentioned have been resolved.

-Emil

 src/gallium/targets/r600/omx/Makefile.am | 1 -
 src/gallium/targets/radeonsi/omx/Makefile.am | 1 -
 2 files changed, 2 deletions(-)

diff --git a/src/gallium/targets/r600/omx/Makefile.am b/src/gallium/targets/r600/omx/Makefile.am
index 22ef08f..aecbb83 100644
--- a/src/gallium/targets/r600/omx/Makefile.am
+++ b/src/gallium/targets/r600/omx/Makefile.am
@@ -46,7 +46,6 @@ libomx_r600_la_LDFLAGS = $(GALLIUM_OMX_LINKER_FLAGS)
 libomx_r600_la_LIBADD = \
 	$(top_builddir)/src/gallium/drivers/r600/libr600.la \
 	$(top_builddir)/src/gallium/winsys/radeon/drm/libradeonwinsys.la \
-	$(top_builddir)/src/gallium/drivers/trace/libtrace.la \
 	$(GALLIUM_OMX_LIB_DEPS) \
 	$(LIBDRM_LIBS) \
 	$(RADEON_LIBS) \
diff --git a/src/gallium/targets/radeonsi/omx/Makefile.am b/src/gallium/targets/radeonsi/omx/Makefile.am
index 439e91c..3c37909 100644
--- a/src/gallium/targets/radeonsi/omx/Makefile.am
+++ b/src/gallium/targets/radeonsi/omx/Makefile.am
@@ -46,7 +46,6 @@ libomx_radeonsi_la_LDFLAGS = $(GALLIUM_OMX_LINKER_FLAGS)
 libomx_radeonsi_la_LIBADD = \
 	$(top_builddir)/src/gallium/drivers/radeonsi/libradeonsi.la \
 	$(top_builddir)/src/gallium/winsys/radeon/drm/libradeonwinsys.la \
-	$(top_builddir)/src/gallium/drivers/trace/libtrace.la \
 	$(GALLIUM_OMX_LIB_DEPS) \
 	$(LIBDRM_LIBS) \
 	$(RADEON_LIBS)
[Mesa-dev] [PATCH] targets/omx: do not link against the trace driver
Unused due to the missing GALLIUM_TRACE define.

Requested-by: Christian König
Signed-off-by: Emil Velikov
---
Hi Christian

Seems like none of the other video accel targets use the tracer. I'm assuming that the VDPAU issue(s) you've mentioned have been resolved.

-Emil

 src/gallium/targets/r600/omx/Makefile.am | 1 -
 src/gallium/targets/radeonsi/omx/Makefile.am | 1 -
 2 files changed, 2 deletions(-)

diff --git a/src/gallium/targets/r600/omx/Makefile.am b/src/gallium/targets/r600/omx/Makefile.am
index 22ef08f..aecbb83 100644
--- a/src/gallium/targets/r600/omx/Makefile.am
+++ b/src/gallium/targets/r600/omx/Makefile.am
@@ -46,7 +46,6 @@ libomx_r600_la_LDFLAGS = $(GALLIUM_OMX_LINKER_FLAGS)
 libomx_r600_la_LIBADD = \
 	$(top_builddir)/src/gallium/drivers/r600/libr600.la \
 	$(top_builddir)/src/gallium/winsys/radeon/drm/libradeonwinsys.la \
-	$(top_builddir)/src/gallium/drivers/trace/libtrace.la \
 	$(GALLIUM_OMX_LIB_DEPS) \
 	$(LIBDRM_LIBS) \
 	$(RADEON_LIBS) \
diff --git a/src/gallium/targets/radeonsi/omx/Makefile.am b/src/gallium/targets/radeonsi/omx/Makefile.am
index 439e91c..3c37909 100644
--- a/src/gallium/targets/radeonsi/omx/Makefile.am
+++ b/src/gallium/targets/radeonsi/omx/Makefile.am
@@ -46,7 +46,6 @@ libomx_radeonsi_la_LDFLAGS = $(GALLIUM_OMX_LINKER_FLAGS)
 libomx_radeonsi_la_LIBADD = \
 	$(top_builddir)/src/gallium/drivers/radeonsi/libradeonsi.la \
 	$(top_builddir)/src/gallium/winsys/radeon/drm/libradeonwinsys.la \
-	$(top_builddir)/src/gallium/drivers/trace/libtrace.la \
 	$(GALLIUM_OMX_LIB_DEPS) \
 	$(LIBDRM_LIBS) \
 	$(RADEON_LIBS)
-- 
1.9.1
Re: [Mesa-dev] Mesa branchpoint tags?
On Fri, Mar 28, 2014 at 8:25 AM, Eric Anholt wrote:
> I was looking at a bug report about old software, and wanted to see what
> development branch a quoted commit was on:
>
> anholt@eliezer:anholt/src/mesa-release% git describe
> 97217a40f97cdeae0304798b607f704deb0c3558
> snb-magic-15797-g97217a4
>
> That's... not terribly useful. It would be nice if git describe could
> be used so I could figure out what development branch a commit was for.

You could also have used "git branch --contains 97217a40f97cdeae0304798b607f704deb0c3558", which would've shown you that it was first introduced in the 9.2-branch.
Re: [Mesa-dev] [PATCH 17/24] targets/omx: define GALLIUM_TRACE when using the trace driver
On 28/03/14 09:40, Christian König wrote:
> On 27.03.2014 22:00, Emil Velikov wrote:
>> Otherwise the omx drivers are explicitly linked but never wrapped in
>> order to use it.
>
> On the other hand I'm not sure if we really need the tracer linked in here,
> referencing it was just to make drm_target.c happy.
>
Fair enough, although I'm not sure what you're implying with "make drm_target.c happy" here. AFAICS the tracer should be mentioned only when the define is present. And indeed, dropping the define + driver from LIBADD does build.

Build tested only, due to lack of hw :(

-Emil

> Christian.
>
>> Cc: Christian König
>> Signed-off-by: Emil Velikov
>> ---
>> src/gallium/targets/r600/omx/Makefile.am | 3 ++-
>> src/gallium/targets/radeonsi/omx/Makefile.am | 3 ++-
>> 2 files changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/gallium/targets/r600/omx/Makefile.am
>> b/src/gallium/targets/r600/omx/Makefile.am
>> index 22ef08f..0bae51b 100644
>> --- a/src/gallium/targets/r600/omx/Makefile.am
>> +++ b/src/gallium/targets/r600/omx/Makefile.am
>> @@ -31,7 +31,8 @@ AM_CFLAGS = \
>> $(LIBDRM_CFLAGS)
>> AM_CPPFLAGS = \
>> -I$(top_srcdir)/src/gallium/drivers \
>> --I$(top_srcdir)/src/gallium/winsys
>> +-I$(top_srcdir)/src/gallium/winsys \
>> +-DGALLIUM_TRACE
>> omxdir = $(OMX_LIB_INSTALL_DIR)
>> omx_LTLIBRARIES = libomx_r600.la
>> diff --git a/src/gallium/targets/radeonsi/omx/Makefile.am
>> b/src/gallium/targets/radeonsi/omx/Makefile.am
>> index 439e91c..c0e0218 100644
>> --- a/src/gallium/targets/radeonsi/omx/Makefile.am
>> +++ b/src/gallium/targets/radeonsi/omx/Makefile.am
>> @@ -31,7 +31,8 @@ AM_CFLAGS = \
>> $(LIBDRM_CFLAGS)
>> AM_CPPFLAGS = \
>> -I$(top_srcdir)/src/gallium/drivers \
>> --I$(top_srcdir)/src/gallium/winsys
>> +-I$(top_srcdir)/src/gallium/winsys \
>> +-DGALLIUM_TRACE
>> omxdir = $(OMX_LIB_INSTALL_DIR)
>> omx_LTLIBRARIES = libomx_radeonsi.la
Re: [Mesa-dev] [PATCH 24/24] automake: ask the linker to do garbage collection
On 28/03/14 08:35, Lauri Kasanen wrote:
> On Thu, 27 Mar 2014 21:00:39 +
> Emil Velikov wrote:
>
>> By doing GC the linker removes all the symbols that are not referenced
>> and/or used by the final library. This results in a saving of ~100K
>> up-to ~600K per (stripped) binary (classic vs gallium drivers).
>>
>> If interested one can ask the compiler to print the sections that are
>> removed using -Wl,--print-gc-sections.
>
> I didn't see where you add the corresponding flags to CFLAGS/CXXFLAGS.
>
What do you have in mind with "the corresponding flags"?

> Without -ffunction-sections -fdata-sections in compile flags the garbage
> collection is handicapped - add those by default too, and see much
> greater gains.
>
"Handicapped" or not, it produces very nice results at this point. AFAICS both options that you've mentioned have some interesting side effects, which I would rather opt out of at this stage. Speaking of "much greater gains", do you have any numbers in mind (wrt mesa)?

> I should also add that this is a GNUism, so it should be detected for

Ouch, good point.

> in configure in case people use other compilers/linkers. It's also
> redundant if LTO is used.

As mentioned by Marc, LTO is somewhat broken and adds a substantial delay to the build process. Thus I'd rather not enforce it on people.

>
> In one of my projects I detect as such:
>
> if -flto supported, use it (needs to be added to all three CFLAGS
> CXXFLAGS LDFLAGS)
> else if sections supported, use it
> else print a message they're missing out
>
Thanks
-Emil

> - Lauri
>
[Mesa-dev] [Bug 76252] Dynamic loading/unloading of opengl32.dll results in a deadlock
https://bugs.freedesktop.org/show_bug.cgi?id=76252

--- Comment #7 from cgerlac...@gmail.com ---
José, thanks for the feedback. Our first try was to name the DLL llvmpipe.dll, and we ran into exactly the issues you mentioned regarding GDI. Therefore we don't see this as an option for us. A colleague had another idea: we export stw_cleanup and call it before unloading the DLL. This seems to work very well and shuts down the threads as expected. If you don't see any problems with this approach, we could live with this workaround.

-- 
You are receiving this mail because: You are the assignee for the bug.
Re: [Mesa-dev] [PATCH 24/24] automake: ask the linker to do garbagecollection
On Friday, 28 March 2014, 10:35:00, Lauri Kasanen wrote:
> On Thu, 27 Mar 2014 21:00:39 +
>
> Emil Velikov wrote:
> > By doing GC the linker removes all the symbols that are not referenced
> > and/or used by the final library. This results in a saving of ~100K
> > up-to ~600K per (stripped) binary (classic vs gallium drivers).
> >
> > If interested one can ask the compiler to print the sections that are
> > removed using -Wl,--print-gc-sections.
>
> I didn't see where you add the corresponding flags to CFLAGS/CXXFLAGS.
>
> Without -ffunction-sections -fdata-sections in compile flags the garbage
> collection is handicapped - add those by default too, and see much
> greater gains.
>
> I should also add that this is a GNUism, so it should be detected for
> in configure in case people use other compilers/linkers. It's also
> redundant if LTO is used.
>
> In one of my projects I detect as such:
>
> if -flto supported, use it (needs to be added to all three CFLAGS
> CXXFLAGS LDFLAGS)

LTO is broken on many compiler/ld combinations. Even if it is supported, I wouldn't recommend enabling it. A config option to enable it would be nice, though.

Lauri, on which compiler/binutils version did you get it going?

Marc
Re: [Mesa-dev] [PATCH 24/24] automake: ask the linker to do garbage collection
On Thu, 27 Mar 2014 21:00:39 +
Emil Velikov wrote:

> By doing GC the linker removes all the symbols that are not referenced
> and/or used by the final library. This results in a saving of ~100K
> up-to ~600K per (stripped) binary (classic vs gallium drivers).
>
> If interested one can ask the compiler to print the sections that are
> removed using -Wl,--print-gc-sections.

I didn't see where you add the corresponding flags to CFLAGS/CXXFLAGS.

Without -ffunction-sections -fdata-sections in compile flags the garbage collection is handicapped - add those by default too, and see much greater gains.

I should also add that this is a GNUism, so it should be detected for in configure in case people use other compilers/linkers. It's also redundant if LTO is used.

In one of my projects I detect as such:

if -flto supported, use it (needs to be added to all three CFLAGS CXXFLAGS LDFLAGS)
else if sections supported, use it
else print a message they're missing out

- Lauri
Re: [Mesa-dev] Questions regarding KHR_debug for OpenGL ES
On Fri, 2014-03-28 at 19:22 +1100, Timothy Arceri wrote: > On Wed, 2014-03-26 at 16:27 -0700, Felipe Tonello wrote: > > Hi Timothy, > > > > On Sun, Mar 23, 2014 at 2:11 PM, Timothy Arceri > > wrote: > > > On Mon, 2014-03-17 at 11:42 -0700, Felipe Tonello wrote: > > >> Hi all, > > >> > > >> I'm working on the KHR_debug for OpenGL ES junior job. I recently > > >> submitted patches to allow the piglit tests to be enabled in GLES > > >> contexts as well. > > >> Now I want to work on the src/mapi/glapi/gen/es_EXT.xml file. Recently > > >> I saw a patch that moved the KHR_debug extension to a include type of > > >> file. > > >> > > >> My question is: How can I reuse this file but change the functions > > >> definitions to add the KHR suffix? > > > > > > Hi Felipe, > > > > > > You need to use the alias attribute. Take a look at ARB_debug_output.xml > > > basically you will need to do the same thing in the es_EXT.xml file but > > > using the KHR suffix rather than ARB. I don't think there is anyway to > > > avoid the obvious duplications here (someone please correct me if I'm > > > wrong). > > > > > > Tim > > > > > > > Where should I define GLDEBUGPROCKHR? I tried gl_and_es_API.xml under > > but when I compile I get that the compiler > > cannot find GLDEBUGPROCKHR type name. > > > > Any idea? > > I'm not 100% sure where the es stuff should go After a second look I think it might need to go in as per Ian's instructions on the wiki page "advertise GL_KHR_debug in ES1 and ES2 contexts." > but I think you need to > add it like this: > > > > i.e try removing GL > > Does that help? > > Tim > > > > > BR > > > > Felipe > > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Questions regarding KHR_debug for OpenGL ES
On Wed, 2014-03-26 at 16:27 -0700, Felipe Tonello wrote: > Hi Timothy, > > On Sun, Mar 23, 2014 at 2:11 PM, Timothy Arceri wrote: > > On Mon, 2014-03-17 at 11:42 -0700, Felipe Tonello wrote: > >> Hi all, > >> > >> I'm working on the KHR_debug for OpenGL ES junior job. I recently > >> submitted patches to allow the piglit tests to be enabled in GLES > >> contexts as well. > >> Now I want to work on the src/mapi/glapi/gen/es_EXT.xml file. Recently > >> I saw a patch that moved the KHR_debug extension to a include type of > >> file. > >> > >> My question is: How can I reuse this file but change the functions > >> definitions to add the KHR suffix? > > > > Hi Felipe, > > > > You need to use the alias attribute. Take a look at ARB_debug_output.xml > > basically you will need to do the same thing in the es_EXT.xml file but > > using the KHR suffix rather than ARB. I don't think there is anyway to > > avoid the obvious duplications here (someone please correct me if I'm > > wrong). > > > > Tim > > > > Where should I define GLDEBUGPROCKHR? I tried gl_and_es_API.xml under > but when I compile I get that the compiler > cannot find GLDEBUGPROCKHR type name. > > Any idea? I'm not 100% sure where the es stuff should go but I think you need to add it like this: i.e try removing GL Does that help? Tim > > BR > > Felipe ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/10] i965/fs: Handle arrays of special regs more cleanly.
Kenneth Graunke writes:
> On 03/26/2014 02:23 PM, Eric Anholt wrote:
>> I need to extend the size of the outputs[] array, and this was going to
>> get ridiculous.
>> ---
>> src/mesa/drivers/dri/i965/brw_fs.cpp | 36
>> ++--
>> 1 file changed, 22 insertions(+), 14 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index b3f1dfd..9610cde 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -1723,22 +1723,27 @@ fs_visitor::compact_virtual_grfs()
>> /* In addition to registers used in instructions, fs_visitor keeps
>> * direct references to certain special values which must be patched:
>> */
>> - fs_reg *special[] = {
>> - &frag_depth, &pixel_x, &pixel_y, &pixel_w, &wpos_w, &dual_src_output,
>> - &outputs[0], &outputs[1], &outputs[2], &outputs[3],
>> - &outputs[4], &outputs[5], &outputs[6], &outputs[7],
>> - &delta_x[0], &delta_x[1], &delta_x[2],
>> - &delta_x[3], &delta_x[4], &delta_x[5],
>> - &delta_y[0], &delta_y[1], &delta_y[2],
>> - &delta_y[3], &delta_y[4], &delta_y[5],
>> + struct {
>> + fs_reg *reg;
>> + unsigned count;
>> + } special[] = {
>> + { &frag_depth, 1 },
>> + { &pixel_x, 1 },
>> + { &pixel_y, 1 },
>> + { &pixel_w, 1 },
>> + { &wpos_w, 1 },
>> + { &dual_src_output, 1 },
>> + { outputs, ARRAY_SIZE(outputs) },
>> + { delta_x, ARRAY_SIZE(delta_x) },
>> + { delta_y, ARRAY_SIZE(delta_y) },
>> };
>> - STATIC_ASSERT(BRW_WM_BARYCENTRIC_INTERP_MODE_COUNT == 6);
>> - STATIC_ASSERT(BRW_MAX_DRAW_BUFFERS == 8);
>>
>> /* Treat all special values as used, to be conservative */
>> for (unsigned i = 0; i < ARRAY_SIZE(special); i++) {
>> - if (special[i]->file == GRF)
>> - remap_table[special[i]->reg] = 0;
>> + for (unsigned j = 0; j < special[i].count; j++) {
>> + if (special[i].reg[j].file == GRF)
>> +remap_table[special[i].reg[j].reg] = 0;
>> + }
>> }
>>
>> /* Compact the GRF arrays. */
>> @@ -1769,8 +1774,11 @@ fs_visitor::compact_virtual_grfs()
>>
>> /* Patch all the references to special values */
>> for (unsigned i = 0; i < ARRAY_SIZE(special); i++) {
>> - if (special[i]->file == GRF && remap_table[special[i]->reg] != -1)
>> - special[i]->reg = remap_table[special[i]->reg];
>> + for (unsigned j = 0; j < special[i].count; j++) {
>> + fs_reg *reg = &special[i].reg[j];
>> + if (reg->file == GRF && remap_table[reg->reg] != -1)
>> +reg->reg = remap_table[reg->reg];
>> + }
>> }
>> }
>>
>>
>
> Much better - thanks for fixing my mess.
>
> I did notice that a few newer registers are missing:
>
> { &sample_mask, 1 },
> { &shader_start_time, 1 },
>
> It would be great to add them as a follow-up patch.

I almost fixed them in the same commit, except that these ones don't actually get used other than during the visitor, so it's "safe" (other than leaving bad information laying around as a trap for the future). I should definitely put in a followup to add them.
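The pointer-plus-count table from the quoted patch can be tried in isolation. What follows is only a sketch under assumptions: Reg, RegSpan and patch_special are hypothetical names for this example, and the remap table is a plain int array where -1 means "no new number", mirroring the check in the diff.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified register reference (not the driver's fs_reg).
struct Reg {
    bool is_grf;  // stands in for "file == GRF"
    int reg;      // virtual register number
};

// One table entry: a pointer to the first register and how many follow it.
// A scalar is described with count 1, an array with its element count.
struct RegSpan {
    Reg *first;
    std::size_t count;
};

// Rewrites every GRF register in the table through remap_table, skipping
// entries whose remap value is -1.
inline void patch_special(RegSpan *spans, std::size_t n, const int *remap_table) {
    assert(spans != nullptr || n == 0);
    for (std::size_t i = 0; i < n; i++) {
        for (std::size_t j = 0; j < spans[i].count; j++) {
            Reg *reg = &spans[i].first[j];
            if (reg->is_grf && remap_table[reg->reg] != -1)
                reg->reg = remap_table[reg->reg];
        }
    }
}

inline void example_patch_special() {
    Reg frag_depth = {true, 2};
    Reg outputs[3] = {{true, 0}, {false, 5}, {true, 1}};
    RegSpan special[] = {
        {&frag_depth, 1},
        {outputs, sizeof(outputs) / sizeof(outputs[0])},
    };
    const int remap_table[3] = {7, -1, 9};
    patch_special(special, 2, remap_table);
    assert(frag_depth.reg == 9);  // 2 -> 9
    assert(outputs[0].reg == 7);  // 0 -> 7
    assert(outputs[1].reg == 5);  // not a GRF: untouched
    assert(outputs[2].reg == 1);  // remap is -1: untouched
}
```

With this layout, growing an array only changes its own declaration; the table entry picks up the new size and the static assertions on the element counts are no longer needed.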
[Mesa-dev] Mesa branchpoint tags?
I was looking at a bug report about old software, and wanted to see what development branch a quoted commit was on:

    anholt@eliezer:anholt/src/mesa-release% git describe 97217a40f97cdeae0304798b607f704deb0c3558
    snb-magic-15797-g97217a4

That's... not terribly useful. It would be nice if git describe could be used so I could figure out what development branch a commit was for. So I wrote a little script:

    branchpoint=`git rev-list origin/master..origin/$@ | tail -n 1`
    git tag -f -s -m "Mesa $@ branchpoint" $@-branchpoint $branchpoint~

and generated a bunch of tags in my repo:

    10.0-branchpoint
    10.1-branchpoint
    7.10-branchpoint
    7.11-branchpoint
    7.8-branchpoint
    7.9-branchpoint
    8.0-branchpoint
    9.0-branchpoint
    9.1-branchpoint
    9.2-branchpoint

which look like:

    tag 10.1-branchpoint
    Tagger: Eric Anholt
    Date: Thu Mar 27 23:59:46 2014 -0700

    Mesa 10.1 branchpoint
    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1

    iQIcBAABCAAGBQJTNR3iAAoJELXWKTbR/J7ovjkP/0IbzY1BMcyIAEwKKTDwPKbk
    wPalZWYZiPsMozbie8crJS4GIciDGktsMWzVF0PwHL3FxaRqmGvZysvrT6UgWWg/
    ySWZ9yApW3kVYrH0dQplQhqnpe/xalx2ooWGJX44ZL71KDg8BOCZ9+JC7JG4/pU2
    Rl0dmBhmepXiKYvhfl53voggaZldPL5U3yN2mtXh5uzBw1IoHco8C9iwbsijzwIw
    KGSbYV+cHsmdu5k2xnLJQu9Tr/1dH4yuYjNy6MshCweOwj0T+x3qw2gURUOAkyz0
    NSRrQM8IO9ACvr0uFsF69mVya8/76lNXhxbRgLFk4CEcrLi+2SbH1/ozzgTWniQw
    6FKOqoQOA8VxNek8oH7r48RAATe34RkhcDqyrG2cEAnSzYFsqwVCT+kWoxFdAAcR
    mIDi6xOD/FtotylWUuQFS6UBb0iDNS9o1S+OzBBq+nBYGgezRsxpsVyALHYLJ1QF
    ZD+qVikZdHbgP4npjFHkgx08z3cNM5Qk/X4VWwT9DlaI+bdDqTx7z+d2o7nWkvTv
    UOO4UPEQESFsova2LnOBd+wg3zomH+StY3c0LhEfkE6tUXP7JjjBVMoMLWtVAJ5D
    dhcI0pXSS6vbafKVb2g3BTY2/ZtmJmNBXc4OwJFPEttHP69c3dXJ8W2yTT+GbvOK
    VR9vzInQ2Rgd9F20XZ1B
    =m5VJ
    -----END PGP SIGNATURE-----

And now git describe command tells me:

    anholt@eliezer:anholt/src/mesa% git describe 97217a40f97cdeae0304798b607f704deb0c3558
    9.1-branchpoint-1412-g97217a4

So now I know that this commit came after we forked off 9.1 (thus, it's going to be in the 9.2 branch for sure). Would others like to see these tags in the main repo? 
I would be happy to push them. And would whoever forks the release branch want to run this command in the future?

I guess this isn't the only way we could generate tags for making git-describe useful. For example, you could run a similar script to generate a tag just *after* the branchpoint identified here, along master, so that git-describe would say "9.2-devel-1412-g97217a4" or whatever for that commit, but "9.1-branchpoint-2-gaf2d8f8" for af2d8f8072c53d4c63ed22b74f78213c1181c1eb, an early commit on the 9.1 branch.
[Mesa-dev] [PATCH] i965: Make sure we always compute valid index bounds before drawing.
When doing software rendering (i.e. rendering to the selection buffer) we need to make sure that we have valid index bounds before calling _tnl_draw_prims(), otherwise we can crash. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59455 --- src/mesa/drivers/dri/i965/brw_draw.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index d684c17..ef0f273 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -554,7 +554,8 @@ void brw_draw_prims( struct gl_context *ctx, * get the minimum and maximum of their index buffer so we know what range * to upload. */ - if (!vbo_all_varyings_in_vbos(arrays) && !index_bounds_valid) { + if (!index_bounds_valid && + (ctx->RenderMode != GL_RENDER || !vbo_all_varyings_in_vbos(arrays))) { perf_debug("Scanning index buffer to compute index buffer bounds. " "Use glDrawRangeElements() to avoid this.\n"); vbo_get_minmax_indices(ctx, prims, ib, &min_index, &max_index, nr_prims); -- 1.8.3.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
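The changed condition is easier to check as a standalone predicate. This is a small sketch only: RenderMode and needs_index_scan are hypothetical names for illustration, with RenderMode::Render standing in for GL_RENDER and the other values for the selection and feedback modes.

```cpp
#include <cassert>

// Stand-in for the GL render mode (Render corresponds to GL_RENDER).
enum class RenderMode { Render, Select, Feedback };

// True when the index buffer has to be scanned for min/max before drawing:
// bounds are unknown, and either we are not in normal rendering (the
// software path needs them) or some vertex data is not in a VBO.
inline bool needs_index_scan(bool index_bounds_valid, RenderMode mode,
                             bool all_varyings_in_vbos) {
    return !index_bounds_valid &&
           (mode != RenderMode::Render || !all_varyings_in_vbos);
}

inline void example_index_scan() {
    // Hardware path, everything in VBOs: no scan, as before the patch.
    assert(!needs_index_scan(false, RenderMode::Render, true));
    // Selection mode with everything in VBOs: the case the patch adds.
    assert(needs_index_scan(false, RenderMode::Select, true));
    // Client-side arrays in normal rendering: scan, as before the patch.
    assert(needs_index_scan(false, RenderMode::Render, false));
    // Bounds already known: never scan.
    assert(!needs_index_scan(true, RenderMode::Select, false));
}
```

Only the second case differs from the old condition, which looked at the VBO check alone.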
Re: [Mesa-dev] [PATCH v2 5/9] mesa: Handle QUERY_RESULT_NO_WAIT in GetQueryObject{ui64}v
On 03/27/2014 11:34 PM, Kenneth Graunke wrote:
> On 03/27/2014 01:59 PM, Rafal Mielniczuk wrote:
>> Just return and do nothing if query result is not yet available
>>
>> Signed-off-by: Rafal Mielniczuk
>> ---
>> src/mesa/main/queryobj.c | 16 
>> 1 file changed, 16 insertions(+)
>>
>> diff --git a/src/mesa/main/queryobj.c b/src/mesa/main/queryobj.c
>> index 86e7c3a..d2d9fa7 100644
>> --- a/src/mesa/main/queryobj.c
>> +++ b/src/mesa/main/queryobj.c
>> @@ -594,6 +594,10 @@ _mesa_GetQueryObjectiv(GLuint id, GLenum pname, GLint *params)
>> }
>>
>> switch (pname) {
>> + case GL_QUERY_RESULT_NO_WAIT:
>> + if (!q->Ready)
>> +return;
>> + //else fall through
>
> We don't usually use C++ style comments in Mesa. I would do:
>
> case GL_QUERY_RESULT_NO_WAIT:
>    if (!q->Ready)
>       return;
>    /* fallthrough */
> case GL_QUERY_RESULT_ARB:
>
> Other than that, patches 1-6 are:
> Reviewed-by: Kenneth Graunke

Actually, I take that back...I don't think this is what we want for GPU drivers. (It's probably reasonable for software drivers though.)

When a buffer object is bound to GL_QUERY_BUFFER, the idea is that the GL_QUERY_RESULT/GL_QUERY_RESULT_NO_WAIT queries should emit GPU commands to deliver the query result into the buffer object.

The query result may not actually be available yet (so, q->Ready == false), but the GPU commands to obtain the result have already been submitted. Since any GPU commands we submit will happen after those, they can work with the result as if it's available...because it will be by the time they run. At least, that's my understanding right now.

So, we need a way to know if a query result is "in flight, but done" (i.e. all commands to compute it have been submitted, but may not have run yet), and a way to ask the driver to deliver it to a particular buffer object/offset. That probably means two new driver hooks, but I'm not quite sure what they should look like just yet.
So, patches 1-4 and 6 are:
Reviewed-by: Kenneth Graunke

I won't be around next week, but I'd be happy to help look into this when I'm back. (Unless, of course, others beat me to it...) :)

Thanks again for your work on this!
--Ken
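For reference, the early-return-then-fall-through shape discussed in this review can be sketched as a self-contained function. Everything here is hypothetical (Pname, Query, wait_for_result, get_query_object are names for this example, not Mesa's), and it models only the CPU-side behaviour that the review calls reasonable for software drivers; it does not attempt the buffer-object delivery that GPU drivers would need.

```cpp
#include <cassert>
#include <cstdint>

enum class Pname { Result, ResultNoWait, ResultAvailable };

struct Query {
    bool ready;            // has the result been computed yet?
    std::uint64_t result;  // value to report once ready
};

// Stand-in for blocking until the result exists.
inline void wait_for_result(Query &q) { q.ready = true; }

// Returns true if *out was written.
inline bool get_query_object(Query &q, Pname pname, std::uint64_t *out) {
    switch (pname) {
    case Pname::ResultNoWait:
        if (!q.ready)
            return false;  /* not available: leave *out untouched */
        /* fallthrough */
    case Pname::Result:
        if (!q.ready)
            wait_for_result(q);
        *out = q.result;
        return true;
    case Pname::ResultAvailable:
        *out = q.ready ? 1 : 0;
        return true;
    }
    return false;
}

inline void example_query() {
    Query q = {false, 42};
    std::uint64_t out = 7;
    assert(!get_query_object(q, Pname::ResultNoWait, &out) && out == 7);
    assert(get_query_object(q, Pname::Result, &out) && out == 42);
    out = 0;
    assert(get_query_object(q, Pname::ResultNoWait, &out) && out == 42);
}
```

The no-wait case shares the result path with the blocking case and differs only in the early return, which is why the fall-through keeps the two in sync.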