Re: [Mesa-dev] [PATCH v2 17/23] mesa: Add non-normalized formats support for ubyte packing functions
On 01/12/14 20:14, Jason Ekstrand wrote: On Mon, Dec 1, 2014 at 3:04 AM, Iago Toral Quiroga ito...@igalia.com wrote: From: Samuel Iglesias Gonsalvez sigles...@igalia.com Signed-off-by: Samuel Iglesias Gonsalvez sigles...@igalia.com --- src/mesa/main/format_pack.c.mako | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/format_pack.c.mako b/src/mesa/main/format_pack.c.mako index 7feb3f8..6815bf3 100644 --- a/src/mesa/main/format_pack.c.mako +++ b/src/mesa/main/format_pack.c.mako @@ -83,7 +83,9 @@ pack_ubyte_${f.short_name()}(const GLubyte src[4], void *dst) %endif ${c.datatype()} ${c.name} = - %if c.type == parser.UNSIGNED: + %if not f.is_normalized() and f.is_int(): +(${c.datatype()}) src[${i}]; Clamping Working on it. Sam + %elif c.type == parser.UNSIGNED: %if f.colorspace == 'srgb' and c.name in 'rgb': util_format_linear_to_srgb_8unorm(src[${i}]); %else: -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 16/23] mesa: Add _mesa_pack_uint_rgba_row() format conversion function
On 01/12/14 20:14, Jason Ekstrand wrote: On Mon, Dec 1, 2014 at 3:04 AM, Iago Toral Quiroga ito...@igalia.com wrote: From: Samuel Iglesias Gonsalvez sigles...@igalia.com We will use this later on to handle uint conversion scenarios in a master convert function. v2: - Modify pack_uint_*() function generation to use c.datatype() and f.datatype(). - Remove UINT_TO_FLOAT() macro usage from pack_uint*() - Remove if not f.is_normalized() conditional as pack_uint*() functions are only autogenerated for non normalized formats. Signed-off-by: Samuel Iglesias Gonsalvez sigles...@igalia.com --- src/mesa/main/format_pack.c.mako | 82 src/mesa/main/format_pack.h | 3 ++ 2 files changed, 85 insertions(+) diff --git a/src/mesa/main/format_pack.c.mako b/src/mesa/main/format_pack.c.mako index aced58d..7feb3f8 100644 --- a/src/mesa/main/format_pack.c.mako +++ b/src/mesa/main/format_pack.c.mako @@ -149,6 +149,56 @@ pack_ubyte_r11g11b10_float(const GLubyte src[4], void *dst) *d = float3_to_r11g11b10f(rgb); } +/* uint packing functions */ + +%for f in rgb_formats: + %if not f.is_int(): + % continue % + %elif f.is_normalized(): + % continue % + %elif f.is_compressed(): + % continue % + %endif + +static inline void +pack_uint_${f.short_name()}(const GLuint src[4], void *dst) +{ + %for (i, c) in enumerate(f.channels): + % i = f.swizzle.inverse()[i] % + %if c.type == 'x': + % continue % + %endif + + ${c.datatype()} ${c.name} = + %if c.type == parser.FLOAT and c.size == 16: + _mesa_float_to_half(src[${i}]); + %else: + (${c.datatype()}) src[${i}]; I think we're missing the clamping here. We should probably use the same clamping functions that we cooked up for swizzle_and_convert. We'll have to be careful though because the source gets interpreted as signed vs. unsigned depending on the destination format. OK. I'll work on it. Thanks! Sam + %endif + %endfor + + %if f.layout == parser.ARRAY: + ${f.datatype()} *d = (${f.datatype()} *)dst; + %for (i, c) in enumerate(f.channels): + %if c.type == 'x': +% continue % + %endif + d[${i}] = ${c.name}; + %endfor + %elif f.layout == parser.PACKED: + ${f.datatype()} d = 0; + %for (i, c) in enumerate(f.channels): + %if c.type == 'x': +% continue % + %endif + d |= PACK(${c.name}, ${c.shift}, ${c.size}); + %endfor + (*(${f.datatype()} *)dst) = d; + %else: + % assert False % + %endif +} +%endfor /* float packing functions */ @@ -297,6 +347,38 @@ _mesa_pack_ubyte_rgba_row(mesa_format format, GLuint n, } /** + * Pack a row of GLuint rgba[4] values to the destination. + */ +void +_mesa_pack_uint_rgba_row(mesa_format format, GLuint n, + const GLuint src[][4], void *dst) +{ + GLuint i; + GLubyte *d = dst; + + switch (format) { +%for f in rgb_formats: + %if not f.is_int(): + % continue % + %elif f.is_normalized(): + % continue % + %elif f.is_compressed(): + % continue % + %endif + + case ${f.name}: + for (i = 0; i n; ++i) { + pack_uint_${f.short_name()}(src[i], d); + d += ${f.block_size() / 8}; + } + break; +%endfor + default: + assert(!Invalid format); + } +} + +/** * Pack a row of GLfloat rgba[4] values to the destination. */ void diff --git a/src/mesa/main/format_pack.h b/src/mesa/main/format_pack.h index 2577def..1582ad1 100644 --- a/src/mesa/main/format_pack.h +++ b/src/mesa/main/format_pack.h @@ -77,6 +77,9 @@ extern void _mesa_pack_ubyte_rgba_row(mesa_format format, GLuint n, const GLubyte src[][4], void *dst); +extern void +_mesa_pack_uint_rgba_row(mesa_format format, GLuint n, + const GLuint src[][4], void *dst); extern void _mesa_pack_ubyte_rgba_rect(mesa_format format, GLuint width, GLuint height, -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 06/23] mesa: Fix incorrect assertion in init_teximage_fields_ms
On Mon, 2014-12-01 at 13:25 -0500, Ilia Mirkin wrote: On Mon, Dec 1, 2014 at 6:04 AM, Iago Toral Quiroga ito...@igalia.com wrote: _BaseFormat is a GLenum (unsigned int) so testing if its value is greater than 0 to detect the cases where _mesa_base_tex_format returns -1 doesn't work. Fixing the assertion breaks the arb_texture_view-lifetime-format piglit test on nouveau, since that test calls _mesa_base_tex_format with GL_R16F with a context that does not have ARB_texture_float, so it returns -1 for the BaseFormat, which was not being catched properly by the ASSERT in init_teximage_fields_ms until now. Sorry to hijack the thread, but... can you elaborate why this would fail on nouveau but not on i965? Does st/mesa do something differently wrt exposing ARB_texture_float vs i965? Does nouveau do something wrong? Hi Illia, I didn't investigate the source of the problem in Nouveau or why this is different than the other drivers, however I can give more details: We are running piglit against i965, llvmpipe, classic swrast, nouveu and radeon, but we only hit the problem with nouveau. The code that triggers the assertion is _mesa_base_tex_format (teximage.c), which is called with internalFormat = GL_R16F, and then, in that function there is this code: if (ctx-Extensions.ARB_texture_rg) { switch (internalFormat) { case GL_R16F: case GL_R32F: if (!ctx-Extensions.ARB_texture_float) break; return GL_RED; ... On i965, the code returns GL_RED, but in Nouveau it hits the break because ctx-Extensions.ARB_texture_float is false in this case (and later will return -1 after being unable to find a proper base format). In the case of Intel, ARB_texture_float is always enabled. In the case of Gallium I see this in st_extensions.c: static const struct st_extension_format_mapping rendertarget_mapping[] = { { { o(ARB_texture_float) }, { PIPE_FORMAT_R32G32B32A32_FLOAT, PIPE_FORMAT_R16G16B16A16_FLOAT } }, ... so if I understand that right, for ARB_texture_float to be activated I need PIPE_FORMAT_R32G32B32A32_FLOAT and PIPE_FORMAT_R16G16B16A16_FLOAT to be supported, so I guess that at least one of these is not supported in the nVidia hardware I am using (NVIDIA GT218 [ION]). If that is not the problem, then I guess it would need further investigation, but I don't see any other places where Gallium or nouveau set that extension. Iago --- src/mesa/main/teximage.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index e238863..c9658c1 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -1313,7 +1313,7 @@ init_teximage_fields_ms(struct gl_context *ctx, target = img-TexObject-Target; img-_BaseFormat = _mesa_base_tex_format( ctx, internalFormat ); - ASSERT(img-_BaseFormat 0); + ASSERT(img-_BaseFormat != -1); img-InternalFormat = internalFormat; img-Border = border; img-Width = width; -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 08/23] mesa: Fix _mesa_swizzle_and_convert integer conversions to clamp properly
On Tuesday, December 02, 2014 07:54:19 AM Samuel Iglesias Gonsálvez wrote: On Monday, December 01, 2014 10:16:35 AM Jason Ekstrand wrote: This looks much better. Two comments though. First, I think we need to tweak the float_to_uint function to clamp above. I don't know that it properly handles values larger than MAX_UINT right now. Second, I think we may want a MIN_INT macro for this. In particular, I don't know that this is well-defined to work for dst_size == 32 because a signed shift by 31 isn't well-defined according to the C spec. With those two fixed, this one is Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com --Jason OK, I'm going to work on it. Thanks, Sam By the way, I found that float_to_uint() function is unused. So, instead of tweaking it, I will remove it. Sam On Mon, Dec 1, 2014 at 3:04 AM, Iago Toral Quiroga ito...@igalia.com wrote: From: Samuel Iglesias Gonsalvez sigles...@igalia.com Fix various conversion paths that involved integer data types of different sizes (uint16_t to uint8_t, int16_t to uint8_t, etc) that were not being clamped properly. Also, one of the paths was incorrectly assigning the value 12, instead of 1, to the constant one. v2: - Create auxiliary clamping functions and use them in all paths that required clamp because of different source and destination sizes and signed-unsigned conversions. Signed-off-by: Samuel Iglesias Gonsalvez sigles...@igalia.com --- src/mesa/main/format_utils.c | 47 ++-- src/mesa/main/format_utils.h | 25 +++ 2 files changed, 48 insertions(+), 24 deletions(-) diff --git a/src/mesa/main/format_utils.c b/src/mesa/main/format_utils.c index 41fd043..dc63e1f 100644 --- a/src/mesa/main/format_utils.c +++ b/src/mesa/main/format_utils.c @@ -449,7 +449,6 @@ convert_half_float(void *void_dst, int num_dst_channels, } } - static void convert_ubyte(void *void_dst, int num_dst_channels, const void *void_src, GLenum src_type, int num_src_channels, @@ -469,7 +468,7 @@ convert_ubyte(void *void_dst, int num_dst_channels, if (normalized) { SWIZZLE_CONVERT(uint8_t, uint16_t, _mesa_half_to_unorm(src, 8)); } else { - SWIZZLE_CONVERT(uint8_t, uint16_t, half_to_uint(src)); + SWIZZLE_CONVERT(uint8_t, uint16_t, _mesa_unsigned_to_unsigned(half_to_uint(src), 8)); } break; case GL_UNSIGNED_BYTE: @@ -479,35 +478,35 @@ convert_ubyte(void *void_dst, int num_dst_channels, if (normalized) { SWIZZLE_CONVERT(uint8_t, int8_t, _mesa_snorm_to_unorm(src, 8, 8)); } else { - SWIZZLE_CONVERT(uint8_t, int8_t, (src 0) ? 0 : src); + SWIZZLE_CONVERT(uint8_t, int8_t, _mesa_signed_to_unsigned(src, 8)); } break; case GL_UNSIGNED_SHORT: if (normalized) { SWIZZLE_CONVERT(uint8_t, uint16_t, _mesa_unorm_to_unorm(src, 16, 8)); } else { - SWIZZLE_CONVERT(uint8_t, uint16_t, src); + SWIZZLE_CONVERT(uint8_t, uint16_t, _mesa_unsigned_to_unsigned(src, 8)); } break; case GL_SHORT: if (normalized) { SWIZZLE_CONVERT(uint8_t, int16_t, _mesa_snorm_to_unorm(src, 16, 8)); } else { - SWIZZLE_CONVERT(uint8_t, int16_t, (src 0) ? 0 : src); + SWIZZLE_CONVERT(uint8_t, int16_t, _mesa_signed_to_unsigned(src, 8)); } break; case GL_UNSIGNED_INT: if (normalized) { SWIZZLE_CONVERT(uint8_t, uint32_t, _mesa_unorm_to_unorm(src, 32, 8)); } else { - SWIZZLE_CONVERT(uint8_t, uint32_t, src); + SWIZZLE_CONVERT(uint8_t, uint32_t, _mesa_unsigned_to_unsigned(src, 8)); } break; case GL_INT: if (normalized) { SWIZZLE_CONVERT(uint8_t, int32_t, _mesa_snorm_to_unorm(src, 32, 8)); } else { - SWIZZLE_CONVERT(uint8_t, int32_t, (src 0) ? 0 : src); + SWIZZLE_CONVERT(uint8_t, int32_t, _mesa_signed_to_unsigned(src, 8)); } break; default: @@ -542,7 +541,7 @@ convert_byte(void *void_dst, int num_dst_channels, if (normalized) { SWIZZLE_CONVERT(int8_t, uint8_t, _mesa_unorm_to_snorm(src, 8, 8)); } else { - SWIZZLE_CONVERT(int8_t, uint8_t, src); +
Re: [Mesa-dev] [PATCH v2 16/23] mesa: Add _mesa_pack_uint_rgba_row() format conversion function
On Mon, 2014-12-01 at 11:14 -0800, Jason Ekstrand wrote: On Mon, Dec 1, 2014 at 3:04 AM, Iago Toral Quiroga ito...@igalia.com wrote: From: Samuel Iglesias Gonsalvez sigles...@igalia.com We will use this later on to handle uint conversion scenarios in a master convert function. v2: - Modify pack_uint_*() function generation to use c.datatype() and f.datatype(). - Remove UINT_TO_FLOAT() macro usage from pack_uint*() - Remove if not f.is_normalized() conditional as pack_uint*() functions are only autogenerated for non normalized formats. Signed-off-by: Samuel Iglesias Gonsalvez sigles...@igalia.com --- src/mesa/main/format_pack.c.mako | 82 src/mesa/main/format_pack.h | 3 ++ 2 files changed, 85 insertions(+) diff --git a/src/mesa/main/format_pack.c.mako b/src/mesa/main/format_pack.c.mako index aced58d..7feb3f8 100644 --- a/src/mesa/main/format_pack.c.mako +++ b/src/mesa/main/format_pack.c.mako @@ -149,6 +149,56 @@ pack_ubyte_r11g11b10_float(const GLubyte src[4], void *dst) *d = float3_to_r11g11b10f(rgb); } +/* uint packing functions */ + +%for f in rgb_formats: + %if not f.is_int(): + % continue % + %elif f.is_normalized(): + % continue % + %elif f.is_compressed(): + % continue % + %endif + +static inline void +pack_uint_${f.short_name()}(const GLuint src[4], void *dst) +{ + %for (i, c) in enumerate(f.channels): + % i = f.swizzle.inverse()[i] % + %if c.type == 'x': + % continue % + %endif + + ${c.datatype()} ${c.name} = + %if c.type == parser.FLOAT and c.size == 16: + _mesa_float_to_half(src[${i}]); Jason, I think this float handling shouldn't be here we only allow packing/unpacking between integer types, right? In that case maybe we should add an assertion to catch these other cases. + %else: + (${c.datatype()}) src[${i}]; I think we're missing the clamping here. We should probably use the same clamping functions that we cooked up for swizzle_and_convert. We'll have to be careful though because the source gets interpreted as signed vs. unsigned depending on the destination format. So you mean that we should use _mesa_unsigned_to_unsigned when c.type == parser.UNSIGNED and _mesa_signed_to_signed when c.type == parser.SIGNED. If there is nothing wrong with what I suggest above, then I think this is how it should look like in the end: ${c.datatype()} ${c.name} = %if c.type == parser.SIGNED: _mesa_signed_to_signed(src[${i}], ${c.size}); %elif c.type == parser.UNSIGNED: _mesa_unsigned_to_unsigned(src[${i}], ${c.size}); %else: assert(!Invalid type: only integer types are allowed); %endif + %endif + %endfor + + %if f.layout == parser.ARRAY: + ${f.datatype()} *d = (${f.datatype()} *)dst; + %for (i, c) in enumerate(f.channels): + %if c.type == 'x': +% continue % + %endif + d[${i}] = ${c.name}; + %endfor + %elif f.layout == parser.PACKED: + ${f.datatype()} d = 0; + %for (i, c) in enumerate(f.channels): + %if c.type == 'x': +% continue % + %endif + d |= PACK(${c.name}, ${c.shift}, ${c.size}); + %endfor + (*(${f.datatype()} *)dst) = d; + %else: + % assert False % + %endif +} +%endfor /* float packing functions */ @@ -297,6 +347,38 @@ _mesa_pack_ubyte_rgba_row(mesa_format format, GLuint n, } /** + * Pack a row of GLuint rgba[4] values to the destination. + */ +void +_mesa_pack_uint_rgba_row(mesa_format format, GLuint n, + const GLuint src[][4], void *dst) +{ + GLuint i; + GLubyte *d = dst; + + switch (format) { +%for f in rgb_formats: + %if not f.is_int(): + % continue % + %elif f.is_normalized(): + % continue % + %elif f.is_compressed(): + % continue % + %endif + + case ${f.name}: + for (i = 0; i n; ++i) { +
Re: [Mesa-dev] [PATCH 04/10] mesa: Recompute LegalTypesMask if the GL API has changed
On Mon, 2014-12-01 at 09:24 -0700, Brian Paul wrote: On 12/01/2014 06:04 AM, Eduardo Lima Mitev wrote: From: Iago Toral Quiroga ito...@igalia.com The current code computes ctx-Array.LegalTypesMask just once, however, computing this needs to consider ctx-API so we need to make sure that the API for that context has not changed if we intend to reuse the result. The context API can change, at least, if we go through _mesa_meta_begin, since that will always force API_OPENGL_COMPAT until we call _mesa_meta_end. If any operation in between these two calls triggers a call to update_array_format, then we might be caching a value for LegalTypesMask that will not be right once we have called _mesa_meta_end and restored the context API. Fixes the following 179 dEQP tests in i965: dEQP-GLES3.functional.vertex_arrays.single_attribute.strides.fixed.* dEQP-GLES3.functional.vertex_arrays.single_attribute.normalize.fixed.* dEQP-GLES3.functional.vertex_arrays.single_attribute.output_types.fixed.* dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.static_draw.*fixed* dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.stream_draw.*fixed* dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.dynamic_draw.*fixed* dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.static_copy.*fixed* dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.stream_copy.*fixed* dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.dynamic_copy.*fixed* dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.static_read.*fixed* dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.stream_read.*fixed* dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.dynamic_read.*fixed* dEQP-GLES3.functional.vertex_arrays.multiple_attributes.input_types.3_*fixed2* dEQP-GLES3.functional.draw.random.{2,18,28,68,83,106,109,156,181,191} --- src/mesa/main/mtypes.h | 3 ++- src/mesa/main/varray.c | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 7389baa..78f034d 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -1701,8 +1701,9 @@ struct gl_array_attrib /** One of the DRAW_xxx flags, not consumed by drivers */ gl_draw_method DrawMethod; - /** Legal array datatypes */ + /** Legal array datatypes and the API for which they have been computed */ GLbitfield LegalTypesMask; + int LegalTypesMaskAPI; gl_api LegalTypesMaskAPI; Sure. This also requires to move the gl_api enum definition above this. Will do that too. }; diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c index 96c2b26..acfc4bd 100644 --- a/src/mesa/main/varray.c +++ b/src/mesa/main/varray.c @@ -258,11 +258,12 @@ update_array_format(struct gl_context *ctx, GLuint elementSize; GLenum format = GL_RGBA; - if (ctx-Array.LegalTypesMask == 0) { + if (ctx-Array.LegalTypesMask == 0 || ctx-Array.LegalTypesMaskAPI != ctx-API) { /* One-time initialization. We can't do this in _mesa_init_varrays() * below because extensions are not yet enabled at that point. */ Should probably update the comment to say something like: Compute the LegalTypesMask if it's uninitialized or the context API changes. Sure, will do. Thanks for reviewing! Iago ctx-Array.LegalTypesMask = get_legal_types_mask(ctx); + ctx-Array.LegalTypesMaskAPI = ctx-API; } legalTypesMask = ctx-Array.LegalTypesMask; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 00/10] Various fixes for dEQP failing tests
On 12/01/2014 05:29 PM, Brian Paul wrote: On 12/01/2014 06:04 AM, Eduardo Lima Mitev wrote: This set of (unrelated) patches fixes over 230 tests from the dEQP test suite [1]. [...] The non-GLSL changes look good to me and are: Reviewed-by: Brian Paul bri...@vmware.com The GLSL changes look OK too but should probably be reviewed by someone more active in that area. Thanks Brian, We fixed the issues you commented, and after we get a go for the GLSL ones, Iago will push them. cheers, Eduardo ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Move PSCDEPTH calculations from draw time to compile time.
The Pixel Shader Computed Depth Mode value is entirely based on the shader program, so we can easily do it at compile time. This avoids the if+switch on every 3DSTATE_WM (Gen7)/3DSTATE_PS_EXTRA (Gen8+) upload, and shares a bit more code. This also simplifies the PMA stall code, making it match the formula more closely, and drops a BRW_NEW_FRAGMENT_PROGRAM dependency. (Note that the previous comment was wrong - the code and the documentation have != PSCDEPTH_OFF, not ==.) Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_context.h | 2 ++ src/mesa/drivers/dri/i965/brw_defines.h | 17 + src/mesa/drivers/dri/i965/brw_wm.c | 21 + src/mesa/drivers/dri/i965/gen7_wm_state.c| 22 +++--- src/mesa/drivers/dri/i965/gen8_depth_state.c | 12 src/mesa/drivers/dri/i965/gen8_ps_state.c| 18 +- 6 files changed, 40 insertions(+), 52 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index ec4b3dd..b4ddc17 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -390,6 +390,8 @@ struct brw_wm_prog_data { /** @} */ } binding_table; + uint8_t computed_depth_mode; + bool no_8; bool dual_src_blend; bool uses_pos_offset; diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index adcf1db..2acd0f8 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -2051,16 +2051,20 @@ enum brw_message_target { # define GEN9_WM_DS_BF_STENCIL_REF_MASK INTEL_MASK(7, 0) # define GEN9_WM_DS_BF_STENCIL_REF_SHIFT0 +enum brw_pixel_shader_computed_depth_mode { + BRW_PSCDEPTH_OFF = 0, /* PS does not compute depth */ + BRW_PSCDEPTH_ON= 1, /* PS computes depth; no guarantee about value */ + BRW_PSCDEPTH_ON_GE = 2, /* PS guarantees output depth = source depth */ + BRW_PSCDEPTH_ON_LE = 3, /* PS guarantees output depth = source depth */ +}; + #define _3DSTATE_PS_EXTRA 0x784F /* GEN8+ */ /* DW1 */ # define GEN8_PSX_PIXEL_SHADER_VALID(1 31) # define GEN8_PSX_PIXEL_SHADER_NO_RT_WRITE (1 30) # define GEN8_PSX_OMASK_TO_RENDER_TARGET(1 29) # define GEN8_PSX_KILL_ENABLE (1 28) -# define GEN8_PSX_PSCDEPTH_OFF (0 26) -# define GEN8_PSX_PSCDEPTH_ON (1 26) -# define GEN8_PSX_PSCDEPTH_ON_GE(2 26) -# define GEN8_PSX_PSCDEPTH_ON_LE(3 26) +# define GEN8_PSX_COMPUTED_DEPTH_MODE_SHIFT 26 # define GEN8_PSX_FORCE_COMPUTED_DEPTH (1 25) # define GEN8_PSX_USES_SOURCE_DEPTH (1 24) # define GEN8_PSX_USES_SOURCE_W (1 23) @@ -2202,10 +2206,7 @@ enum brw_wm_barycentric_interp_mode { # define GEN7_WM_DEPTH_RESOLVE (1 28) # define GEN7_WM_HIERARCHICAL_DEPTH_RESOLVE(1 27) # define GEN7_WM_KILL_ENABLE (1 25) -# define GEN7_WM_PSCDEPTH_OFF (0 23) -# define GEN7_WM_PSCDEPTH_ON (1 23) -# define GEN7_WM_PSCDEPTH_ON_GE(2 23) -# define GEN7_WM_PSCDEPTH_ON_LE(3 23) +# define GEN7_WM_COMPUTED_DEPTH_MODE_SHIFT 23 # define GEN7_WM_USES_SOURCE_DEPTH (1 20) # define GEN7_WM_USES_SOURCE_W (1 19) # define GEN7_WM_POSITION_ZW_PIXEL (0 17) diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c index fe36dd4..7badb23 100644 --- a/src/mesa/drivers/dri/i965/brw_wm.c +++ b/src/mesa/drivers/dri/i965/brw_wm.c @@ -116,6 +116,25 @@ brw_compute_barycentric_interp_modes(struct brw_context *brw, return barycentric_interp_modes; } +static uint8_t +computed_depth_mode(struct gl_fragment_program *fp) +{ + if (fp-Base.OutputsWritten BITFIELD64_BIT(FRAG_RESULT_DEPTH)) { + switch (fp-FragDepthLayout) { + case FRAG_DEPTH_LAYOUT_NONE: + case FRAG_DEPTH_LAYOUT_ANY: + return BRW_PSCDEPTH_ON; + case FRAG_DEPTH_LAYOUT_GREATER: + return BRW_PSCDEPTH_ON_GE; + case FRAG_DEPTH_LAYOUT_LESS: + return BRW_PSCDEPTH_ON_LE; + case FRAG_DEPTH_LAYOUT_UNCHANGED: + return BRW_PSCDEPTH_OFF; + } + } + return BRW_PSCDEPTH_OFF; +} + bool brw_wm_prog_data_compare(const void *in_a, const void *in_b) { @@ -161,6 +180,8 @@ bool do_wm_prog(struct brw_context *brw, */ prog_data.uses_kill = fp-program.UsesKill || key-alpha_test_func; + prog_data.computed_depth_mode = computed_depth_mode(fp-program); + /* Allocate the references to the uniforms that will end
[Mesa-dev] [PATCH 1/4] i965: Make Gen4-5 and Gen8+ ALT checks use ctx-_Shader too.
Commit c0347705 changed the Gen6-7 code to use ctx-_Shader rather than ctx-Shader, but neglected to change the Gen4-5 or Gen8+ code. This might fix SSO related bugs, but ALT mode is only used for ARB programs, so if there's an actual problem, it's likely no one would run into it. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_vs_state.c | 2 +- src/mesa/drivers/dri/i965/brw_wm_state.c | 2 +- src/mesa/drivers/dri/i965/gen8_ps_state.c | 2 +- src/mesa/drivers/dri/i965/gen8_vs_state.c | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vs_state.c b/src/mesa/drivers/dri/i965/brw_vs_state.c index 998a225..abd6771 100644 --- a/src/mesa/drivers/dri/i965/brw_vs_state.c +++ b/src/mesa/drivers/dri/i965/brw_vs_state.c @@ -60,7 +60,7 @@ brw_upload_vs_unit(struct brw_context *brw) /* Use ALT floating point mode for ARB vertex programs, because they * require 0^0 == 1. */ - if (brw-ctx.Shader.CurrentProgram[MESA_SHADER_VERTEX] == NULL) + if (brw-ctx._Shader-CurrentProgram[MESA_SHADER_VERTEX] == NULL) vs-thread1.floating_point_mode = BRW_FLOATING_POINT_NON_IEEE_754; else vs-thread1.floating_point_mode = BRW_FLOATING_POINT_IEEE_754; diff --git a/src/mesa/drivers/dri/i965/brw_wm_state.c b/src/mesa/drivers/dri/i965/brw_wm_state.c index 12cbc72..d2903c7 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_state.c @@ -119,7 +119,7 @@ brw_upload_wm_unit(struct brw_context *brw) * rendering, CurrentProgram[MESA_SHADER_FRAGMENT] is used for this check * to differentiate between the GLSL and non-GLSL cases. */ - if (ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT] == NULL) + if (ctx-_Shader-CurrentProgram[MESA_SHADER_FRAGMENT] == NULL) wm-thread1.floating_point_mode = BRW_FLOATING_POINT_NON_IEEE_754; else wm-thread1.floating_point_mode = BRW_FLOATING_POINT_IEEE_754; diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c b/src/mesa/drivers/dri/i965/gen8_ps_state.c index 3aa0ef3..a3ce1d4 100644 --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c @@ -146,7 +146,7 @@ upload_ps_state(struct brw_context *brw) * rendering, CurrentFragmentProgram is used for this check to * differentiate between the GLSL and non-GLSL cases. */ - if (ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT] == NULL) + if (ctx-_Shader-CurrentProgram[MESA_SHADER_FRAGMENT] == NULL) dw3 |= GEN7_PS_FLOATING_POINT_MODE_ALT; /* 3DSTATE_PS expects the number of threads per PSD, which is always 64; diff --git a/src/mesa/drivers/dri/i965/gen8_vs_state.c b/src/mesa/drivers/dri/i965/gen8_vs_state.c index 00f2731..5a2021f 100644 --- a/src/mesa/drivers/dri/i965/gen8_vs_state.c +++ b/src/mesa/drivers/dri/i965/gen8_vs_state.c @@ -42,7 +42,7 @@ upload_vs_state(struct brw_context *brw) /* Use ALT floating point mode for ARB vertex programs, because they * require 0^0 == 1. */ - if (ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX] == NULL) + if (ctx-_Shader-CurrentProgram[MESA_SHADER_VERTEX] == NULL) floating_point_mode = GEN6_VS_FLOATING_POINT_MODE_ALT; BEGIN_BATCH(9); -- 2.1.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/4] i965: Store floating point mode choice in brw_stage_prog_data.
We use IEEE mode for GLSL programs, but need to use ALT mode for ARB programs so that 0^0 == 1. The choice is based entirely on the shader source language. Previously, our code to determine which mode we wanted was duplicated in 8 different places (VS and FS for Gen4-5, Gen6, Gen7, and Gen8). The ctx-_Shader-CurrentProgram[stage] == NULL check was confusing as well - we use CurrentProgram (non-derived state), but _Shader (derived state). It also relies on knowing that ARB programs don't use gl_shader_program structures today. The compiler already makes this assumption in a few places, but I'd rather keep that assumption out of the state upload code. With this patch, we select the mode at compile time, and store that choice in prog_data. The state upload code simply uses that decision. This eliminates a BRW_NEW_*_PROGRAM dependency in the state upload code. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_context.h | 2 ++ src/mesa/drivers/dri/i965/brw_vs.c| 4 src/mesa/drivers/dri/i965/brw_vs_state.c | 5 + src/mesa/drivers/dri/i965/brw_wm.c| 4 src/mesa/drivers/dri/i965/brw_wm_state.c | 7 +-- src/mesa/drivers/dri/i965/gen6_vs_state.c | 6 +- src/mesa/drivers/dri/i965/gen6_wm_state.c | 7 +-- src/mesa/drivers/dri/i965/gen7_vs_state.c | 6 +- src/mesa/drivers/dri/i965/gen7_wm_state.c | 8 +--- src/mesa/drivers/dri/i965/gen8_ps_state.c | 7 +-- src/mesa/drivers/dri/i965/gen8_vs_state.c | 5 + 11 files changed, 18 insertions(+), 43 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index b4ddc17..4bb3b17 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -355,6 +355,8 @@ struct brw_stage_prog_data { */ unsigned dispatch_grf_start_reg; + bool use_alt_mode; /** Use ALT floating point mode? Otherwise, IEEE. */ + /* Pointers to tracked values (only valid once * _mesa_load_state_parameters has been called at runtime). * diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c index 2f628e5..970d86c 100644 --- a/src/mesa/drivers/dri/i965/brw_vs.c +++ b/src/mesa/drivers/dri/i965/brw_vs.c @@ -209,6 +209,10 @@ do_vs_prog(struct brw_context *brw, memcpy(c.key, key, sizeof(*key)); memset(prog_data, 0, sizeof(prog_data)); + /* Use ALT floating point mode for ARB programs so that 0^0 == 1. */ + if (!prog) + stage_prog_data-use_alt_mode = true; + mem_ctx = ralloc_context(NULL); c.vp = vp; diff --git a/src/mesa/drivers/dri/i965/brw_vs_state.c b/src/mesa/drivers/dri/i965/brw_vs_state.c index abd6771..5371f71 100644 --- a/src/mesa/drivers/dri/i965/brw_vs_state.c +++ b/src/mesa/drivers/dri/i965/brw_vs_state.c @@ -57,10 +57,7 @@ brw_upload_vs_unit(struct brw_context *brw) stage_state-prog_offset + (vs-thread0.grf_reg_count 1)) 6; - /* Use ALT floating point mode for ARB vertex programs, because they -* require 0^0 == 1. -*/ - if (brw-ctx._Shader-CurrentProgram[MESA_SHADER_VERTEX] == NULL) + if (brw-vs.prog_data-base.base.use_alt_mode) vs-thread1.floating_point_mode = BRW_FLOATING_POINT_NON_IEEE_754; else vs-thread1.floating_point_mode = BRW_FLOATING_POINT_IEEE_754; diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c index 7badb23..e7939f0 100644 --- a/src/mesa/drivers/dri/i965/brw_wm.c +++ b/src/mesa/drivers/dri/i965/brw_wm.c @@ -182,6 +182,10 @@ bool do_wm_prog(struct brw_context *brw, prog_data.computed_depth_mode = computed_depth_mode(fp-program); + /* Use ALT floating point mode for ARB programs so that 0^0 == 1. */ + if (!prog) + prog_data.base.use_alt_mode = true; + /* Allocate the references to the uniforms that will end up in the * prog_data associated with the compiled program, and which will be freed * by the state cache. diff --git a/src/mesa/drivers/dri/i965/brw_wm_state.c b/src/mesa/drivers/dri/i965/brw_wm_state.c index d2903c7..0dee1f8 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_state.c @@ -114,12 +114,7 @@ brw_upload_wm_unit(struct brw_context *brw) (wm-wm9.grf_reg_count_2 1)) 6; wm-thread1.depth_coef_urb_read_offset = 1; - /* Use ALT floating point mode for ARB fragment programs, because they -* require 0^0 == 1. Even though _CurrentFragmentProgram is used for -* rendering, CurrentProgram[MESA_SHADER_FRAGMENT] is used for this check -* to differentiate between the GLSL and non-GLSL cases. -*/ - if (ctx-_Shader-CurrentProgram[MESA_SHADER_FRAGMENT] == NULL) + if (prog_data-base.use_alt_mode) wm-thread1.floating_point_mode = BRW_FLOATING_POINT_NON_IEEE_754; else wm-thread1.floating_point_mode = BRW_FLOATING_POINT_IEEE_754; diff --git
[Mesa-dev] [PATCH 4/4] i965: Drop BRW_NEW_VERTEX_PROGRAM and _NEW_TRANSFORM from Gen4 VS state.
This simply looks wrong - I don't see any code that uses _NEW_TRANSFORM or BRW_NEW_VERTEX_PROGRAM. It looks like the intention was to duplicate the brw_curbe_offsets atom's flags, which computes brw-curbe.vs_start. This is unnecessary - we flag BRW_NEW_CURBE_OFFSETS whenever that field changes; listening to that is sufficient. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_vs_state.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vs_state.c b/src/mesa/drivers/dri/i965/brw_vs_state.c index 5371f71..17bdbb9 100644 --- a/src/mesa/drivers/dri/i965/brw_vs_state.c +++ b/src/mesa/drivers/dri/i965/brw_vs_state.c @@ -96,7 +96,7 @@ brw_upload_vs_unit(struct brw_context *brw) brw-vs.prog_data-base.base.dispatch_grf_start_reg; vs-thread3.urb_entry_read_offset = 0; - /* BRW_NEW_CURBE_OFFSETS, _NEW_TRANSFORM, BRW_NEW_VERTEX_PROGRAM */ + /* BRW_NEW_CURBE_OFFSETS */ vs-thread3.const_urb_entry_read_offset = brw-curbe.vs_start * 2; /* BRW_NEW_URB_FENCE */ @@ -183,13 +183,12 @@ brw_upload_vs_unit(struct brw_context *brw) const struct brw_tracked_state brw_vs_unit = { .dirty = { - .mesa = _NEW_TRANSFORM, + .mesa = 0, .brw = BRW_NEW_BATCH | BRW_NEW_CURBE_OFFSETS | BRW_NEW_PROGRAM_CACHE | BRW_NEW_SAMPLER_STATE_TABLE | BRW_NEW_URB_FENCE | - BRW_NEW_VERTEX_PROGRAM | BRW_NEW_VS_PROG_DATA, }, .emit = brw_upload_vs_unit, -- 2.1.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/4] i965: Drop BRW_NEW_VERTEX_PROGRAM from Gen7+ 3DSTATE_VS atoms.
We don't access brw-vertex_program or ctx-_Shader since the previous commit, so we don't need this dirty bit. I think it's still necessary on Gen6 because it still conflates constant uploading with unit state uploading. We can fix that later. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/gen7_vs_state.c | 1 - src/mesa/drivers/dri/i965/gen8_vs_state.c | 1 - 2 files changed, 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/gen7_vs_state.c b/src/mesa/drivers/dri/i965/gen7_vs_state.c index 5a11588..404dd20 100644 --- a/src/mesa/drivers/dri/i965/gen7_vs_state.c +++ b/src/mesa/drivers/dri/i965/gen7_vs_state.c @@ -110,7 +110,6 @@ const struct brw_tracked_state gen7_vs_state = { .mesa = _NEW_TRANSFORM, .brw = BRW_NEW_BATCH | BRW_NEW_CONTEXT | - BRW_NEW_VERTEX_PROGRAM | BRW_NEW_VS_PROG_DATA, }, .emit = upload_vs_state, diff --git a/src/mesa/drivers/dri/i965/gen8_vs_state.c b/src/mesa/drivers/dri/i965/gen8_vs_state.c index e5b7e03..b7af466 100644 --- a/src/mesa/drivers/dri/i965/gen8_vs_state.c +++ b/src/mesa/drivers/dri/i965/gen8_vs_state.c @@ -81,7 +81,6 @@ const struct brw_tracked_state gen8_vs_state = { .mesa = _NEW_TRANSFORM, .brw = BRW_NEW_BATCH | BRW_NEW_CONTEXT | - BRW_NEW_VERTEX_PROGRAM | BRW_NEW_VS_PROG_DATA, }, .emit = upload_vs_state, -- 2.1.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 05/10] glsl: Don't allow gl_FragData[i], with i0 in GLES shaders
On 12/01/2014 05:43 PM, Matteo Bruni wrote: 2014-12-01 14:04 GMT+01:00 Eduardo Lima Mitev el...@igalia.com: The OpenGL ES Shading Language specification describes the values that may be output by a fragment shader. These are gl_FragColor and gl_FragData[0]. Multiple render targets are not supported in GLES. Fixes dEQP test: * dEQP-GLES3.functional.shaders.fragdata.invalid_assign_to_1 --- src/glsl/ast_array_index.cpp | 21 - 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp index ff0c757..b507d34 100644 --- a/src/glsl/ast_array_index.cpp +++ b/src/glsl/ast_array_index.cpp @@ -46,7 +46,9 @@ ast_array_specifier::print(void) const * * This function also checks whether the array is a built-in array whose * maximum size is too small to accommodate the given index, and if so uses - * loc and state to report the error. + * loc and state to report the error. It also checks that the built-in array + * gl_FragData is not accessed with indexes greater than zero in OpenGL ES, + * where multiple render targets are not allowed. */ static void update_max_array_access(ir_rvalue *ir, int idx, YYLTYPE *loc, @@ -54,6 +56,23 @@ update_max_array_access(ir_rvalue *ir, int idx, YYLTYPE *loc, { if (ir_dereference_variable *deref_var = ir-as_dereference_variable()) { ir_variable *var = deref_var-var; + + /* Page 89 in the section 3.8 (Fragment Shaders) of the the + * OpenGL ES 2.0.25 spec says: + * The OpenGL ES Shading Language specification describes the + * values that may be output by a fragment shader. These are + * gl_FragColor and gl_FragData[0]. + * ... + * gl_FragData is supported for compatibility with the desktop + * OpenGL Shading Language, but only a single fragment color + * output is allowed in the OpenGL ES Shading Language. + */ + if (state-es_shader idx 0 + strcmp(var-name, gl_FragData) == 0) { + _mesa_glsl_error(loc, state, + multiple render targets are not supported); + } + if (idx (int)var-data.max_array_access) { var-data.max_array_access = idx; -- 2.1.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev AFAICS this restriction is lifted in ES 3.0+ (e.g. see section 3.9.2 in the OpenGL ES 3.0 spec). Yep, it seems this check should be against earlier versions and only when not having extensions that allow MRT enabled (NV_draw_buffers). You can still use gl_FragData[] when using OpenGL ES 3 and GLSL ES 1.00, however when using GLSL ES 3.00 gl_FragData[] gets deprecated and you should use output layout qualifier to bind a output variable and target draw buffer together. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev // Tapani ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 06/23] mesa: Fix incorrect assertion in init_teximage_fields_ms
On Tue, Dec 2, 2014 at 3:05 AM, Iago Toral ito...@igalia.com wrote: On Mon, 2014-12-01 at 13:25 -0500, Ilia Mirkin wrote: On Mon, Dec 1, 2014 at 6:04 AM, Iago Toral Quiroga ito...@igalia.com wrote: _BaseFormat is a GLenum (unsigned int) so testing if its value is greater than 0 to detect the cases where _mesa_base_tex_format returns -1 doesn't work. Fixing the assertion breaks the arb_texture_view-lifetime-format piglit test on nouveau, since that test calls _mesa_base_tex_format with GL_R16F with a context that does not have ARB_texture_float, so it returns -1 for the BaseFormat, which was not being catched properly by the ASSERT in init_teximage_fields_ms until now. Sorry to hijack the thread, but... can you elaborate why this would fail on nouveau but not on i965? Does st/mesa do something differently wrt exposing ARB_texture_float vs i965? Does nouveau do something wrong? Hi Illia, I didn't investigate the source of the problem in Nouveau or why this is different than the other drivers, however I can give more details: We are running piglit against i965, llvmpipe, classic swrast, nouveu and radeon, but we only hit the problem with nouveau. The code that triggers the assertion is _mesa_base_tex_format (teximage.c), which is called with internalFormat = GL_R16F, and then, in that function there is this code: if (ctx-Extensions.ARB_texture_rg) { switch (internalFormat) { case GL_R16F: case GL_R32F: if (!ctx-Extensions.ARB_texture_float) break; return GL_RED; ... On i965, the code returns GL_RED, but in Nouveau it hits the break because ctx-Extensions.ARB_texture_float is false in this case (and later will return -1 after being unable to find a proper base format). In the case of Intel, ARB_texture_float is always enabled. In the case of Gallium I see this in st_extensions.c: static const struct st_extension_format_mapping rendertarget_mapping[] = { { { o(ARB_texture_float) }, { PIPE_FORMAT_R32G32B32A32_FLOAT, PIPE_FORMAT_R16G16B16A16_FLOAT } }, ... so if I understand that right, for ARB_texture_float to be activated I need PIPE_FORMAT_R32G32B32A32_FLOAT and PIPE_FORMAT_R16G16B16A16_FLOAT to be supported, so I guess that at least one of these is not supported in the nVidia hardware I am using (NVIDIA GT218 [ION]). If that is not the problem, then I guess it would need further investigation, but I don't see any other places where Gallium or nouveau set that extension. The only way I can think of that it wouldn't have ARB_texture_float is if you didn't build with --enable-texture-float. This would lose you GL3 support, among other things... Perhaps you're only seeing the issue on nouveau because it's the only driver other than i965 that implements ARB_texture_view. You can see the set of extensions that should be enabled... http://people.freedesktop.org/~imirkin/glxinfo/glxinfo.html (look at the GT21x column). -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 01/13] mesa/meta: Don't free meta if it was never initialized
Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com --- src/mesa/drivers/common/meta.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c index 87532c1..e106899 100644 --- a/src/mesa/drivers/common/meta.c +++ b/src/mesa/drivers/common/meta.c @@ -404,6 +404,9 @@ void _mesa_meta_free(struct gl_context *ctx) { GET_CURRENT_CONTEXT(old_context); + if (!ctx-Meta) + return; + _mesa_make_current(ctx, NULL, NULL); _mesa_meta_glsl_blit_cleanup(ctx-Meta-Blit); meta_glsl_clear_cleanup(ctx-Meta-Clear); -- 1.8.5.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 03/13] mesa/main: Don't go freeing texture data which was never allocated
Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com --- src/mesa/main/texstate.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/mesa/main/texstate.c b/src/mesa/main/texstate.c index e0f0852..6f7d781 100644 --- a/src/mesa/main/texstate.c +++ b/src/mesa/main/texstate.c @@ -930,8 +930,10 @@ _mesa_free_texture_data(struct gl_context *ctx) } /* Free proxy texture objects */ - for (tgt = 0; tgt NUM_TEXTURE_TARGETS; tgt++) - ctx-Driver.DeleteTexture(ctx, ctx-Texture.ProxyTex[tgt]); + for (tgt = 0; tgt NUM_TEXTURE_TARGETS; tgt++) { + if (ctx-Texture.ProxyTex[tgt]) + ctx-Driver.DeleteTexture(ctx, ctx-Texture.ProxyTex[tgt]); + } /* GL_ARB_texture_buffer_object */ _mesa_reference_buffer_object(ctx, ctx-Texture.BufferObject, NULL); -- 1.8.5.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 13/13] mesa/main: Verify context creation on progress
Stop context creation if something failed. If something errored during context creation we'd segfault. Now will clean up and return error. Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com --- src/mesa/main/shared.c | 63 ++ 1 file changed, 59 insertions(+), 4 deletions(-) diff --git a/src/mesa/main/shared.c b/src/mesa/main/shared.c index f74a823..b1f1eb0 100644 --- a/src/mesa/main/shared.c +++ b/src/mesa/main/shared.c @@ -64,9 +64,21 @@ _mesa_alloc_shared_state(struct gl_context *ctx) mtx_init(shared-Mutex, mtx_plain); + /* Mutex and timestamp for texobj state validation */ + mtx_init(shared-TexMutex, mtx_recursive); + shared-TextureStateStamp = 0; + shared-DisplayList = _mesa_NewHashTable(); + if (!shared-DisplayList) + goto error_out; + shared-TexObjects = _mesa_NewHashTable(); + if (!shared-TexObjects) + goto error_out; + shared-Programs = _mesa_NewHashTable(); + if (!shared-Programs) + goto error_out; shared-DefaultVertexProgram = gl_vertex_program(ctx-Driver.NewProgram(ctx, @@ -76,17 +88,28 @@ _mesa_alloc_shared_state(struct gl_context *ctx) GL_FRAGMENT_PROGRAM_ARB, 0)); shared-ATIShaders = _mesa_NewHashTable(); + if (!shared-ATIShaders) + goto error_out; + shared-DefaultFragmentShader = _mesa_new_ati_fragment_shader(ctx, 0); shared-ShaderObjects = _mesa_NewHashTable(); + if (!shared-ShaderObjects) + goto error_out; shared-BufferObjects = _mesa_NewHashTable(); + if (!shared-BufferObjects) + goto error_out; /* GL_ARB_sampler_objects */ shared-SamplerObjects = _mesa_NewHashTable(); + if (!shared-SamplerObjects) + goto error_out; /* Allocate the default buffer object */ shared-NullBufferObj = ctx-Driver.NewBufferObject(ctx, 0); + if (!shared-NullBufferObj) + goto error_out; /* Create default texture objects */ for (i = 0; i NUM_TEXTURE_TARGETS; i++) { @@ -112,16 +135,48 @@ _mesa_alloc_shared_state(struct gl_context *ctx) /* sanity check */ assert(shared-DefaultTex[TEXTURE_1D_INDEX]-RefCount == 1); - /* Mutex and timestamp for texobj state validation */ - mtx_init(shared-TexMutex, mtx_recursive); - shared-TextureStateStamp = 0; - shared-FrameBuffers = _mesa_NewHashTable(); + if (!shared-FrameBuffers) + goto error_out; + shared-RenderBuffers = _mesa_NewHashTable(); + if (!shared-RenderBuffers) + goto error_out; shared-SyncObjects = _mesa_set_create(NULL, _mesa_key_pointer_equal); + if (!shared-SyncObjects) + goto error_out; return shared; + +error_out: + for (i = 0; i NUM_TEXTURE_TARGETS; i++) { + if (shared-DefaultTex[i]) { + ctx-Driver.DeleteTexture(ctx, shared-DefaultTex[i]); + } + } + + _mesa_reference_buffer_object(ctx, shared-NullBufferObj, NULL); + + _mesa_DeleteHashTable(shared-RenderBuffers); + _mesa_DeleteHashTable(shared-FrameBuffers); + _mesa_DeleteHashTable(shared-SamplerObjects); + _mesa_DeleteHashTable(shared-BufferObjects); + _mesa_DeleteHashTable(shared-ShaderObjects); + _mesa_DeleteHashTable(shared-ATIShaders); + _mesa_DeleteHashTable(shared-Programs); + _mesa_DeleteHashTable(shared-TexObjects); + _mesa_DeleteHashTable(shared-DisplayList); + + _mesa_reference_vertprog(ctx, shared-DefaultVertexProgram, NULL); + _mesa_reference_geomprog(ctx, shared-DefaultGeometryProgram, NULL); + _mesa_reference_fragprog(ctx, shared-DefaultFragmentProgram, NULL); + + mtx_destroy(shared-Mutex); + mtx_destroy(shared-TexMutex); + + free(shared); + return NULL; } -- 1.8.5.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 05/10] glsl: Don't allow gl_FragData[i], with i0 in GLES shaders
On 12/02/2014 01:31 PM, Tapani Pälli wrote: AFAICS this restriction is lifted in ES 3.0+ (e.g. see section 3.9.2 in the OpenGL ES 3.0 spec). Yep, it seems this check should be against earlier versions and only when not having extensions that allow MRT enabled (NV_draw_buffers). Hi, Yes, we will update the patch to consider the extensions too. Good catch. You can still use gl_FragData[] when using OpenGL ES 3 and GLSL ES 1.00, however when using GLSL ES 3.00 gl_FragData[] gets deprecated and you should use output layout qualifier to bind a output variable and target draw buffer together. I suppose this is already working like that. This patch does not modify that behavior. Thanks for the feedback, Eduardo ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Sort array elements to increase chances of reusing buffer relocation
Ok, I've written a somewhat contrived test case here: https://github.com/bpeel/glthing/tree/time-attribs (Make sure to use the time-attribs branch) The example draws a 1000 single-pixel points each with a separate draw call. Each call uses a separate but identical VAO so that the driver will be be forced to emit the vertices state for each point. Each vertex uses the maximum number of vertex attributes as returned by GL_MAX_VERTEX_ATTRIBS. All of the attributes are used to determine a color value in the vertex shader. Normally it will order the attributes in memory so that the first one is in generic attribute 0, the second in 1 and so on. However if you pass the option ‘backwards’ on the command line it will put them in reverse order. With git master, if the attributes are given in order then it will generate a 3DSTATE_VERTEX_BUFFERS command with a single buffer and a single relocation otherwise it will generate one for each attribute. I ran the test with each of these three versions of Mesa and noted the FPS. This is based on top of commit 29c7cf2b2 with -O3 and --disable-debug. libdrm is 00847fa4 with -O3. 1) Mesa master 2) Master with my patch applied 3) The original optimization removed completely so that it will always generate a buffer relocation for every attribute. The test was run with LIBGL_SHOW_FPS=1 vblank_mode=0 on my Haswell laptop. The results are: attributes are │ master with patch optimization removed ┼── in order│ 820 560 325 out of order│ 325 540 325 The FPS fluctuated by around 20 FPS either side so I've just noted down what looked like an approximate representation. So I guess the results are that yes, in this extreme case having more relocations makes a big difference but also doing the qsort is quite expensive. The original optimization does seem worth doing. It might be worth making a simpler hard-coded implementation of quicksort because calling qsort is probably not very sensible for such a small array and the function call overhead for each comparison is probably quite a bit. It would also probably be good to see if this difference is noticeable in a real use case. - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] clover: clCompileProgram CL_INVALID_COMPILER_OPTIONS
On Mon, Nov 10, 2014 at 07:04:54PM +0200, Francisco Jerez wrote: EdB edb+m...@sigluy.net writes: clCompileProgram should return CL_INVALID_COMPILER_OPTIONS instead of CL_INVALID_BUILD_OPTIONS Looks good to me, Reviewed-by: Francisco Jerez curroje...@riseup.net I've pushed this, thanks! -Tom --- src/gallium/state_trackers/clover/api/program.cpp | 2 ++ src/gallium/state_trackers/clover/llvm/invocation.cpp | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/clover/api/program.cpp b/src/gallium/state_trackers/clover/api/program.cpp index 3a6c054..60184ed 100644 --- a/src/gallium/state_trackers/clover/api/program.cpp +++ b/src/gallium/state_trackers/clover/api/program.cpp @@ -182,6 +182,8 @@ clBuildProgram(cl_program d_prog, cl_uint num_devs, prog.build(devs, opts); return CL_SUCCESS; } catch (error e) { + if (e.get() == CL_INVALID_COMPILER_OPTIONS) + return CL_INVALID_BUILD_OPTIONS; if (e.get() == CL_COMPILE_PROGRAM_FAILURE) return CL_BUILD_PROGRAM_FAILURE; return e.get(); diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp index d29f5a6..30547d0 100644 --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp @@ -177,7 +177,7 @@ namespace { opts_carray.data() + opts_carray.size(), Diags); if (!Success) { - throw error(CL_INVALID_BUILD_OPTIONS); + throw error(CL_INVALID_COMPILER_OPTIONS); } c.getFrontendOpts().ProgramAction = clang::frontend::EmitLLVMOnly; c.getHeaderSearchOpts().UseBuiltinIncludes = true; -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/1] r600, llvm: Fix mem leak
On Mon, Dec 01, 2014 at 06:33:10PM -0500, Jan Vesely wrote: ping I've pushed this, thanks. -Tom On Mon, 2014-11-03 at 20:29 -0500, Jan Vesely wrote: Signed-off-by: Jan Vesely jan.ves...@rutgers.edu --- src/gallium/drivers/r600/r600_llvm.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/gallium/drivers/r600/r600_llvm.c b/src/gallium/drivers/r600/r600_llvm.c index c19693a..5f74bf7 100644 --- a/src/gallium/drivers/r600/r600_llvm.c +++ b/src/gallium/drivers/r600/r600_llvm.c @@ -888,6 +888,7 @@ unsigned r600_llvm_compile( FREE(binary.code); FREE(binary.config); + FREE(binary.rodata); return r; } -- Jan Vesely jan.ves...@rutgers.edu ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/1] r600: upload implicit arguments even if there are no explicit args
On Mon, Nov 03, 2014 at 08:29:37PM -0500, Jan Vesely wrote: Signed-off-by: Jan Vesely jan.ves...@rutgers.edu --- moreover, the condition is never true now that clover appends dim info src/gallium/drivers/r600/evergreen_compute.c | 4 1 file changed, 4 deletions(-) diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c index 90fdd79..41dc93e 100644 --- a/src/gallium/drivers/r600/evergreen_compute.c +++ b/src/gallium/drivers/r600/evergreen_compute.c @@ -295,10 +295,6 @@ void evergreen_compute_upload_input( struct pipe_box box; struct pipe_transfer *transfer = NULL; - if (shader-input_size == 0) { - return; - } - We shouldn't rely on clover specific behavior, because in theory there could be other state trackers. -Tom if (!shader-kernel_param) { /* Add space for the grid dimensions */ shader-kernel_param = (struct r600_resource *) -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/4] draw: use the prim type from prim_info not emit in passthrough emit
Series sort of sounds sensible but it's really outside my expertise. Zack would be a much better reviewer. This should be thoroughly tested too. Jose On 02/12/14 01:31, srol...@vmware.com wrote: From: Roland Scheidegger srol...@vmware.com The prim assembler may change the prim type when injecting prim ids now, which isn't reflected by what's stored in emit. This looks brittle and potentially dangerous (it is not obvious if such prim type changes are really supported by pt emit, the prim type is actually also set in prepare which would then be different). I guess ideally shouldn't mess with topologies when injecting prim ids and just inject them at the right vertices (depending on prim type and provoking vertex convention). This fixes piglit primitive-id-no-gs-first-vertex.shader_test. --- src/gallium/auxiliary/draw/draw_pt_emit.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_pt_emit.c b/src/gallium/auxiliary/draw/draw_pt_emit.c index 011efe7..b215c5f 100644 --- a/src/gallium/auxiliary/draw/draw_pt_emit.c +++ b/src/gallium/auxiliary/draw/draw_pt_emit.c @@ -143,7 +143,7 @@ draw_pt_emit(struct pt_emit *emit, /* XXX: and work out some way to coordinate the render primitive * between vbuf.c and here... */ - draw-render-set_primitive(draw-render, emit-prim); + render-set_primitive(draw-render, prim_info-prim); render-allocate_vertices(render, (ushort)translate-key.output_stride, @@ -214,7 +214,7 @@ draw_pt_emit_linear(struct pt_emit *emit, /* XXX: and work out some way to coordinate the render primitive * between vbuf.c and here... */ - draw-render-set_primitive(draw-render, emit-prim); + render-set_primitive(draw-render, prim_info-prim); if (!render-allocate_vertices(render, (ushort)translate-key.output_stride, ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa: fix height error check for 1D array textures
height=0 is legal for 1D array textures (as height=0 is legal for 2D arrays). Fixes new piglit ext_texture_array-errors test. Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org --- src/mesa/main/teximage.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index 4f4bb11..7766904 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -1542,7 +1542,7 @@ _mesa_legal_texture_dimensions(struct gl_context *ctx, GLenum target, maxSize = level; if (width 2 * border || width 2 * border + maxSize) return GL_FALSE; - if (height 1 || height ctx-Const.MaxArrayTextureLayers) + if (height 0 || height ctx-Const.MaxArrayTextureLayers) return GL_FALSE; if (!ctx-Extensions.ARB_texture_non_power_of_two) { if (width 0 !_mesa_is_pow_two(width - 2 * border)) -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: fix height error check for 1D array textures
LGTM. On 02/12/14 16:43, Brian Paul wrote: height=0 is legal for 1D array textures (as height=0 is legal for I think you mean as depth=0 is legal for 2D arrays 2D arrays). Fixes new piglit ext_texture_array-errors test. Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org --- src/mesa/main/teximage.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index 4f4bb11..7766904 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -1542,7 +1542,7 @@ _mesa_legal_texture_dimensions(struct gl_context *ctx, GLenum target, maxSize = level; if (width 2 * border || width 2 * border + maxSize) return GL_FALSE; - if (height 1 || height ctx-Const.MaxArrayTextureLayers) + if (height 0 || height ctx-Const.MaxArrayTextureLayers) return GL_FALSE; if (!ctx-Extensions.ARB_texture_non_power_of_two) { if (width 0 !_mesa_is_pow_two(width - 2 * border)) Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: fix height error check for 1D array textures
On 12/02/2014 09:49 AM, Jose Fonseca wrote: LGTM. On 02/12/14 16:43, Brian Paul wrote: height=0 is legal for 1D array textures (as height=0 is legal for I think you mean as depth=0 is legal for 2D arrays Yes, thanks. -Brian 2D arrays). Fixes new piglit ext_texture_array-errors test. Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org --- src/mesa/main/teximage.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index 4f4bb11..7766904 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -1542,7 +1542,7 @@ _mesa_legal_texture_dimensions(struct gl_context *ctx, GLenum target, maxSize = level; if (width 2 * border || width 2 * border + maxSize) return GL_FALSE; - if (height 1 || height ctx-Const.MaxArrayTextureLayers) + if (height 0 || height ctx-Const.MaxArrayTextureLayers) return GL_FALSE; if (!ctx-Extensions.ARB_texture_non_power_of_two) { if (width 0 !_mesa_is_pow_two(width - 2 * border)) Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/7] mesa: Generate GL_INVALID_OPERATION when drawing w/o a VAO in core profile
On 11/20/2014 05:19 PM, Kenneth Graunke wrote: On Thursday, November 20, 2014 11:14:49 AM Ian Romanick wrote: From: Ian Romanick ian.d.roman...@intel.com GL 3-ish versions of the spec are less clear that an error should be generated here, so Ken (and I during review) just missed it in 1afe335. Signed-off-by: Ian Romanick ian.d.roman...@intel.com Cc: Kenneth Graunke kenn...@whitecape.org --- src/mesa/main/api_validate.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c index bf4fa3e..006fca4 100644 --- a/src/mesa/main/api_validate.c +++ b/src/mesa/main/api_validate.c @@ -79,8 +79,16 @@ check_valid_to_render(struct gl_context *ctx, const char *function) break; case API_OPENGL_CORE: - if (ctx-Array.VAO == ctx-Array.DefaultVAO) + /* Section 10.4 (Drawing Commands Using Vertex Arrays) of the OpenGL 4.5 + * Core Profile spec says: + * + * An INVALID_OPERATION error is generated if no vertex array + * object is bound (see section 10.3.1). + */ + if (ctx-Array.VAO == ctx-Array.DefaultVAO) { + _mesa_error(ctx, GL_INVALID_OPERATION, %s(no VAO bound), function); return GL_FALSE; + } /* fallthrough */ case API_OPENGL_COMPAT: { This seems okay - we were already prohibiting drawing, you're just adding a GL error. I thought we already did that, but I can't find any code to do so. Reviewed-by: Kenneth Graunke kenn...@whitecape.org That said, I don't think we ever resolved whether prohibiting drawing is correct, given that we support ARB_ES3_compatibility, and ES3 allows this. OpenGL 4.5 includes GL_ARB_ES3_compatibility as a required feature, and it explicitly calls out the error. So I think we're okay? Also... the brw_state_flags patch I was taking about yesterday is patch 7 in this series. :) signature.asc Description: OpenPGP digital signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 13/23] mesa: Autogenerate most of format_pack.c
On Mon, Dec 1, 2014 at 11:03 PM, Samuel Iglesias Gonsálvez sigles...@igalia.com wrote: On 01/12/14 20:00, Jason Ekstrand wrote: On Mon, Dec 1, 2014 at 3:04 AM, Iago Toral Quiroga ito...@igalia.com wrote: From: Jason Ekstrand jason.ekstr...@intel.com We were auto-generating it before. The problem was that the autogeneration tool we were using was called copy, paste, and edit. Let's use a more sensible solution. Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com v2 by Samuel Iglesias sigles...@igalia.com - Remove format_pack.c as it is now autogenerated - Add usage of INDENT_FLAGS in Makefile.am - Remove trailing blank line v3 by Samuel Iglesias sigles...@igalia.com - Merge format_convert.py into format_parser.py - Adapt pack_*_* function generations - Fix out-of-tree build Signed-off-by: Samuel Iglesias Gonsalvez sigles...@igalia.com --- [...] + def __get_datatype(self, type, size): + if type == FLOAT: + if size == 32: +return 'float' + elif size == 16: +return 'uint16_t' + else: +assert False + elif type == UNSIGNED: + if size = 8: +return 'uint8_t' + elif size = 16: +return 'uint16_t' + elif size = 32: +return 'uint32_t' + else: +assert False + elif type == SIGNED: + if size = 8: +return 'int8_t' + elif size = 16: +return 'int16_t' + elif size = 32: +return 'int32_t' + else: +assert False + else: + assert False Let's put this in a helper that's called by both Format and Channel. That way if we change any of this, it's all in one place. Also, it still doesn't build. Matt is looking into that. --Jason OK, I will put this in a helper. Which error do you get when building? I created a build/ directory in Mesa and build from there with success. However, any help on this would be great. Thanks, Sam Maybe it would help if I explained my setup a bit more My mesa tree is in ~/projects/mesa and my usual build tree is in ~/build/mesa-debug so they are in completely different locations. I don't know why that would be different from nested, but maybe it is. The exact error that I'm getting is that it has no rule to build ~/projects/mesa/src/mesa/main/format_pack.c which is a path in the source tree even though the actual file is getting successfully generated as ~/build/mesa-debug/src/mesa/main/format_pack.c (in the build tree). --Jason ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 16/23] mesa: Add _mesa_pack_uint_rgba_row() format conversion function
On Tue, Dec 2, 2014 at 2:32 AM, Iago Toral ito...@igalia.com wrote: On Mon, 2014-12-01 at 11:14 -0800, Jason Ekstrand wrote: On Mon, Dec 1, 2014 at 3:04 AM, Iago Toral Quiroga ito...@igalia.com wrote: From: Samuel Iglesias Gonsalvez sigles...@igalia.com We will use this later on to handle uint conversion scenarios in a master convert function. v2: - Modify pack_uint_*() function generation to use c.datatype() and f.datatype(). - Remove UINT_TO_FLOAT() macro usage from pack_uint*() - Remove if not f.is_normalized() conditional as pack_uint*() functions are only autogenerated for non normalized formats. Signed-off-by: Samuel Iglesias Gonsalvez sigles...@igalia.com --- src/mesa/main/format_pack.c.mako | 82 src/mesa/main/format_pack.h | 3 ++ 2 files changed, 85 insertions(+) diff --git a/src/mesa/main/format_pack.c.mako b/src/mesa/main/format_pack.c.mako index aced58d..7feb3f8 100644 --- a/src/mesa/main/format_pack.c.mako +++ b/src/mesa/main/format_pack.c.mako @@ -149,6 +149,56 @@ pack_ubyte_r11g11b10_float(const GLubyte src[4], void *dst) *d = float3_to_r11g11b10f(rgb); } +/* uint packing functions */ + +%for f in rgb_formats: + %if not f.is_int(): + % continue % + %elif f.is_normalized(): + % continue % + %elif f.is_compressed(): + % continue % + %endif + +static inline void +pack_uint_${f.short_name()}(const GLuint src[4], void *dst) +{ + %for (i, c) in enumerate(f.channels): + % i = f.swizzle.inverse()[i] % + %if c.type == 'x': + % continue % + %endif + + ${c.datatype()} ${c.name} = + %if c.type == parser.FLOAT and c.size == 16: + _mesa_float_to_half(src[${i}]); Jason, I think this float handling shouldn't be here we only allow packing/unpacking between integer types, right? In that case maybe we should add an assertion to catch these other cases. + %else: + (${c.datatype()}) src[${i}]; I think we're missing the clamping here. We should probably use the same clamping functions that we cooked up for swizzle_and_convert. We'll have to be careful though because the source gets interpreted as signed vs. unsigned depending on the destination format. So you mean that we should use _mesa_unsigned_to_unsigned when c.type == parser.UNSIGNED and _mesa_signed_to_signed when c.type == parser.SIGNED. If there is nothing wrong with what I suggest above, then I think this is how it should look like in the end: ${c.datatype()} ${c.name} = %if c.type == parser.SIGNED: _mesa_signed_to_signed(src[${i}], ${c.size}); %elif c.type == parser.UNSIGNED: _mesa_unsigned_to_unsigned(src[${i}], ${c.size}); %else: assert(!Invalid type: only integer types are allowed); %endif Looks good to me. + %endif + %endfor + + %if f.layout == parser.ARRAY: + ${f.datatype()} *d = (${f.datatype()} *)dst; + %for (i, c) in enumerate(f.channels): + %if c.type == 'x': +% continue % + %endif + d[${i}] = ${c.name}; + %endfor + %elif f.layout == parser.PACKED: + ${f.datatype()} d = 0; + %for (i, c) in enumerate(f.channels): + %if c.type == 'x': +% continue % + %endif + d |= PACK(${c.name}, ${c.shift}, ${c.size}); + %endfor + (*(${f.datatype()} *)dst) = d; + %else: + % assert False % + %endif +} +%endfor /* float packing functions */ @@ -297,6 +347,38 @@ _mesa_pack_ubyte_rgba_row(mesa_format format, GLuint n, } /** + * Pack a row of GLuint rgba[4] values to the destination. + */ +void +_mesa_pack_uint_rgba_row(mesa_format format, GLuint n, + const GLuint src[][4], void *dst) +{ + GLuint i; + GLubyte *d = dst; + + switch (format) { +%for f in rgb_formats: + %if not f.is_int(): + % continue % + %elif f.is_normalized(): + % continue % + %elif f.is_compressed(): + % continue % +
Re: [Mesa-dev] [PATCH v2 06/23] mesa: Fix incorrect assertion in init_teximage_fields_ms
On Tue, Dec 2, 2014 at 8:24 AM, Ilia Mirkin imir...@alum.mit.edu wrote: On Tue, Dec 2, 2014 at 3:05 AM, Iago Toral ito...@igalia.com wrote: On Mon, 2014-12-01 at 13:25 -0500, Ilia Mirkin wrote: On Mon, Dec 1, 2014 at 6:04 AM, Iago Toral Quiroga ito...@igalia.com wrote: _BaseFormat is a GLenum (unsigned int) so testing if its value is greater than 0 to detect the cases where _mesa_base_tex_format returns -1 doesn't work. Fixing the assertion breaks the arb_texture_view-lifetime-format piglit test on nouveau, since that test calls _mesa_base_tex_format with GL_R16F with a context that does not have ARB_texture_float, so it returns -1 for the BaseFormat, which was not being catched properly by the ASSERT in init_teximage_fields_ms until now. Sorry to hijack the thread, but... can you elaborate why this would fail on nouveau but not on i965? Does st/mesa do something differently wrt exposing ARB_texture_float vs i965? Does nouveau do something wrong? Hi Illia, I didn't investigate the source of the problem in Nouveau or why this is different than the other drivers, however I can give more details: We are running piglit against i965, llvmpipe, classic swrast, nouveu and radeon, but we only hit the problem with nouveau. The code that triggers the assertion is _mesa_base_tex_format (teximage.c), which is called with internalFormat = GL_R16F, and then, in that function there is this code: if (ctx-Extensions.ARB_texture_rg) { switch (internalFormat) { case GL_R16F: case GL_R32F: if (!ctx-Extensions.ARB_texture_float) break; return GL_RED; ... On i965, the code returns GL_RED, but in Nouveau it hits the break because ctx-Extensions.ARB_texture_float is false in this case (and later will return -1 after being unable to find a proper base format). In the case of Intel, ARB_texture_float is always enabled. In the case of Gallium I see this in st_extensions.c: static const struct st_extension_format_mapping rendertarget_mapping[] = { { { o(ARB_texture_float) }, { PIPE_FORMAT_R32G32B32A32_FLOAT, PIPE_FORMAT_R16G16B16A16_FLOAT } }, ... so if I understand that right, for ARB_texture_float to be activated I need PIPE_FORMAT_R32G32B32A32_FLOAT and PIPE_FORMAT_R16G16B16A16_FLOAT to be supported, so I guess that at least one of these is not supported in the nVidia hardware I am using (NVIDIA GT218 [ION]). If that is not the problem, then I guess it would need further investigation, but I don't see any other places where Gallium or nouveau set that extension. The only way I can think of that it wouldn't have ARB_texture_float is if you didn't build with --enable-texture-float. This would lose you GL3 support, among other things... Perhaps you're only seeing the issue on nouveau because it's the only driver other than i965 that implements ARB_texture_view. You can see the set of extensions that should be enabled... http://people.freedesktop.org/~imirkin/glxinfo/glxinfo.html (look at the GT21x column). -ilia I believe the issue is that _mesa_TextureView should be checking the target internal format against _mesa_base_tex_format() and returning a GL_INVALID_VALUE error (? the spec conveniently lists errors as 'TODO') if it returns -1. Chris -- thoughts? Maybe stick it in lookup_view_class? -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] [RFC] MESA_shading_language_130: is this it?
On 11/26/2014 06:09 PM, Dave Airlie wrote: Glamor is 4x faster on my ILK using glsl 130 at core text using x11perf -ftext. Ian started writing a spec for this extension a while back, which seems like most of the work, this patch seems to do enough, to advertise GLSL 1.30. Yeah... I started writing the extension when Chris Forbes was working on adding GLSL 1.30 for Ironlake. I seem to recall that gl_ClipDistance still does not work for ILK, and difficulties with that caused Chris to abandon the project. This was over a year ago, so the details are a bit fuzzy. The common Mesa parts look good, though. If we want to pursue this, I can finish up the extension spec and get it published. One other minor nit below... TODO: fix extension numbering get piglit to execute tests on this Just-hacked-up-by: Dave Airlie airl...@redhat.com --- src/mapi/glapi/gen/MESA_shading_language_130.xml | 255 +++ src/mapi/glapi/gen/gl_API.xml| 2 + src/mesa/drivers/dri/i965/intel_extensions.c | 5 + src/mesa/main/extensions.c | 1 + src/mesa/main/mtypes.h | 1 + 5 files changed, 264 insertions(+) create mode 100644 src/mapi/glapi/gen/MESA_shading_language_130.xml diff --git a/src/mapi/glapi/gen/MESA_shading_language_130.xml b/src/mapi/glapi/gen/MESA_shading_language_130.xml new file mode 100644 index 000..9aa553d --- /dev/null +++ b/src/mapi/glapi/gen/MESA_shading_language_130.xml @@ -0,0 +1,255 @@ +?xml version=1.0? +!DOCTYPE OpenGLAPI SYSTEM gl_API.dtd + +OpenGLAPI + +category name=GL_MESA_shading_language_130 number=999 + +enum name=VERTEX_ATTRIB_ARRAY_INTEGERvalue=0x88FD/ +enum name=SAMPLER_1D_ARRAY value=0x8DC0/ +enum name=SAMPLER_2D_ARRAY value=0x8DC1/ +enum name=SAMPLER_1D_ARRAY_SHADOWvalue=0x8DC3/ +enum name=SAMPLER_2D_ARRAY_SHADOWvalue=0x8DC4/ +enum name=SAMPLER_CUBE_SHADOWvalue=0x8DC5/ +enum name=UNSIGNED_INT_VEC2 value=0x8DC6/ +enum name=UNSIGNED_INT_VEC3 value=0x8DC7/ +enum name=UNSIGNED_INT_VEC4 value=0x8DC8/ +enum name=INT_SAMPLER_1D value=0x8DC9/ +enum name=INT_SAMPLER_2D value=0x8DCA/ +enum name=INT_SAMPLER_3D value=0x8DCB/ +enum name=INT_SAMPLER_CUBE value=0x8DCC/ +enum name=INT_SAMPLER_1D_ARRAY value=0x8DCE/ +enum name=INT_SAMPLER_2D_ARRAY value=0x8DCF/ +enum name=UNSIGNED_INT_SAMPLER_1Dvalue=0x8DD1/ +enum name=UNSIGNED_INT_SAMPLER_2Dvalue=0x8DD2/ +enum name=UNSIGNED_INT_SAMPLER_3Dvalue=0x8DD3/ +enum name=UNSIGNED_INT_SAMPLER_CUBE value=0x8DD4/ +enum name=UNSIGNED_INT_SAMPLER_1D_ARRAY value=0x8DD6/ +enum name=UNSIGNED_INT_SAMPLER_2D_ARRAY value=0x8DD7/ + +!-- There is no MIN_PROGRAM_TEXEL_OFFSET in glext.h. There is + MIN_PROGRAM_TEXEL_OFFSET_NV and MIN_PROGRAM_TEXEL_OFFSET (OpenGL + 3.0). Same goes for MAX_PROGRAM_TEXEL_OFFSET. +-- +enum name=MIN_PROGRAM_TEXEL_OFFSET value=0x8904 +size name=Get mode=get/ +/enum +enum name=MAX_PROGRAM_TEXEL_OFFSET value=0x8905 +size name=Get mode=get/ +/enum + +enum name=CLIP_DISTANCE0 value=0x3000/ +enum name=CLIP_DISTANCE1 value=0x3001/ +enum name=CLIP_DISTANCE2 value=0x3002/ +enum name=CLIP_DISTANCE3 value=0x3003/ +enum name=CLIP_DISTANCE4 value=0x3004/ +enum name=CLIP_DISTANCE5 value=0x3005/ +enum name=CLIP_DISTANCE6 value=0x3006/ +enum name=CLIP_DISTANCE7 value=0x3007/ + +enum name=MAX_CLIP_DISTANCES value=0x0D32/ + +!-- +function name=VertexAttribI1i offset=assign exec=dynamic +param name=index type=GLuint/ +param name=x type=GLint/ +/function + +function name=VertexAttribI2i offset=assign exec=dynamic +param name=index type=GLuint/ +param name=x type=GLint/ +param name=y type=GLint/ +/function + +function name=VertexAttribI3i offset=assign exec=dynamic +param name=index type=GLuint/ +param name=x type=GLint/ +param name=y type=GLint/ +param name=z type=GLint/ +/function + +function name=VertexAttribI4i offset=assign exec=dynamic +param name=index type=GLuint/ +param name=x type=GLint/ +param name=y type=GLint/ +param name=z type=GLint/ +param name=w type=GLint/ +/function + +function name=VertexAttribI1ui offset=assign exec=dynamic +param name=index
Re: [Mesa-dev] [PATCH] i965: Sort array elements to increase chances of reusing buffer relocation
On Tue, Dec 02, 2014 at 04:17:35PM +, Neil Roberts wrote: Ok, I've written a somewhat contrived test case here: https://github.com/bpeel/glthing/tree/time-attribs (Make sure to use the time-attribs branch) The example draws a 1000 single-pixel points each with a separate draw call. Each call uses a separate but identical VAO so that the driver will be be forced to emit the vertices state for each point. Each vertex uses the maximum number of vertex attributes as returned by GL_MAX_VERTEX_ATTRIBS. All of the attributes are used to determine a color value in the vertex shader. Normally it will order the attributes in memory so that the first one is in generic attribute 0, the second in 1 and so on. However if you pass the option ‘backwards’ on the command line it will put them in reverse order. With git master, if the attributes are given in order then it will generate a 3DSTATE_VERTEX_BUFFERS command with a single buffer and a single relocation otherwise it will generate one for each attribute. I ran the test with each of these three versions of Mesa and noted the FPS. This is based on top of commit 29c7cf2b2 with -O3 and --disable-debug. libdrm is 00847fa4 with -O3. 1) Mesa master 2) Master with my patch applied 3) The original optimization removed completely so that it will always generate a buffer relocation for every attribute. The test was run with LIBGL_SHOW_FPS=1 vblank_mode=0 on my Haswell laptop. The results are: attributes are │ master with patch optimization removed ┼── in order│ 820 560 325 out of order│ 325 540 325 The FPS fluctuated by around 20 FPS either side so I've just noted down what looked like an approximate representation. So I guess the results are that yes, in this extreme case having more relocations makes a big difference but also doing the qsort is quite expensive. The original optimization does seem worth doing. It might be worth making a simpler hard-coded implementation of quicksort because calling qsort is probably not very sensible for such a small array and the function call overhead for each comparison is probably quite a bit. It would also probably be good to see if this difference is noticeable in a real use case. - Neil Cool. My statement was really getting at there normally won't be many duplicated relocations. What kind of numbers do you see as you scale down the number of points? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Move PSCDEPTH calculations from draw time to compile time.
On Tue, Dec 2, 2014 at 3:50 AM, Kenneth Graunke kenn...@whitecape.org wrote: The Pixel Shader Computed Depth Mode value is entirely based on the shader program, so we can easily do it at compile time. This avoids the if+switch on every 3DSTATE_WM (Gen7)/3DSTATE_PS_EXTRA (Gen8+) upload, and shares a bit more code. This also simplifies the PMA stall code, making it match the formula more closely, and drops a BRW_NEW_FRAGMENT_PROGRAM dependency. (Note that the previous comment was wrong - the code and the documentation have != PSCDEPTH_OFF, not ==.) Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_context.h | 2 ++ src/mesa/drivers/dri/i965/brw_defines.h | 17 + src/mesa/drivers/dri/i965/brw_wm.c | 21 + src/mesa/drivers/dri/i965/gen7_wm_state.c| 22 +++--- src/mesa/drivers/dri/i965/gen8_depth_state.c | 12 src/mesa/drivers/dri/i965/gen8_ps_state.c| 18 +- 6 files changed, 40 insertions(+), 52 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index ec4b3dd..b4ddc17 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -390,6 +390,8 @@ struct brw_wm_prog_data { /** @} */ } binding_table; + uint8_t computed_depth_mode; Presumably we should use the new enum type here (and below in the function call), and mark the enum definition PACKED. With that, Reviewed-by: Matt Turner matts...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Sort array elements to increase chances of reusing buffer relocation
On 12/02/2014 08:17 AM, Neil Roberts wrote: Ok, I've written a somewhat contrived test case here: https://github.com/bpeel/glthing/tree/time-attribs (Make sure to use the time-attribs branch) The example draws a 1000 single-pixel points each with a separate draw call. Each call uses a separate but identical VAO so that the driver will be be forced to emit the vertices state for each point. Each vertex uses the maximum number of vertex attributes as returned by GL_MAX_VERTEX_ATTRIBS. All of the attributes are used to determine a color value in the vertex shader. Normally it will order the attributes in memory so that the first one is in generic attribute 0, the second in 1 and so on. However if you pass the option ‘backwards’ on the command line it will put them in reverse order. With git master, if the attributes are given in order then it will generate a 3DSTATE_VERTEX_BUFFERS command with a single buffer and a single relocation otherwise it will generate one for each attribute. I ran the test with each of these three versions of Mesa and noted the FPS. This is based on top of commit 29c7cf2b2 with -O3 and --disable-debug. libdrm is 00847fa4 with -O3. 1) Mesa master 2) Master with my patch applied 3) The original optimization removed completely so that it will always generate a buffer relocation for every attribute. The test was run with LIBGL_SHOW_FPS=1 vblank_mode=0 on my Haswell laptop. The results are: attributes are │ master with patch optimization removed ┼── in order│ 820 560 325 out of order│ 325 540 325 The FPS fluctuated by around 20 FPS either side so I've just noted down what looked like an approximate representation. Try ministat. http://anholt.net/compare-perf/ So I guess the results are that yes, in this extreme case having more relocations makes a big difference but also doing the qsort is quite expensive. The original optimization does seem worth doing. It might be worth making a simpler hard-coded implementation of quicksort because calling qsort is probably not very sensible for such a small array and the function call overhead for each comparison is probably quite a bit. The number of elements is generally small, and the maximum, currently, is 16. Quite a bit of work has been done on generating (provably) optimal comparison sorts using sorting networks (Google sorting network generator). You should be able to hack something up for just the number of vertex attributes used by your test. It would also probably be good to see if this difference is noticeable in a real use case. Also that. :) - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 86939] test_vf_float_conversions.cpp:63:12: error: expected primary-expression before ‘union’
https://bugs.freedesktop.org/show_bug.cgi?id=86939 Bug ID: 86939 Summary: test_vf_float_conversions.cpp:63:12: error: expected primary-expression before ‘union’ Product: Mesa Version: git Hardware: x86-64 (AMD64) OS: Linux (All) Status: NEW Keywords: bisected, regression Severity: normal Priority: medium Component: Other Assignee: mesa-dev@lists.freedesktop.org Reporter: v...@freedesktop.org CC: matts...@gmail.com mesa: 4e6244e80f7dd6dad526ff04f5103ed24d61d38a (master 10.5.0-devel) $ make check [...] CXXtest_vf_float_conversions.o test_vf_float_conversions.cpp: In function ‘unsigned int f2u(float)’: test_vf_float_conversions.cpp:63:12: error: expected primary-expression before ‘union’ test_vf_float_conversions.cpp:63:12: error: expected ‘)’ before ‘union’ test_vf_float_conversions.cpp:64:1: warning: control reaches end of non-void function [-Wreturn-type] $ gcc --version gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-11) Copyright (C) 2010 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. commit b2abf033e06f3085e84dd039a7d84132c74a69b5 Author: Matt Turner matts...@gmail.com Date: Fri Oct 24 11:42:21 2014 -0700 i965: Add unit test for float - VF conversions. Using Eric's original VF - float conversion code to initialize the table. -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Sort array elements to increase chances of reusing buffer relocation
On 12/02/2014 08:17 AM, Neil Roberts wrote: Ok, I've written a somewhat contrived test case here: https://github.com/bpeel/glthing/tree/time-attribs (Make sure to use the time-attribs branch) The example draws a 1000 single-pixel points each with a separate draw call. Each call uses a separate but identical VAO so that the driver will be be forced to emit the vertices state for each point. Each vertex uses the maximum number of vertex attributes as returned by GL_MAX_VERTEX_ATTRIBS. All of the attributes are used to determine a color value in the vertex shader. Normally it will order the attributes in memory so that the first one is in generic attribute 0, the second in 1 and so on. However if you pass the option ‘backwards’ on the command line it will put them in reverse order. With git master, if the attributes are given in order then it will generate a 3DSTATE_VERTEX_BUFFERS command with a single buffer and a single relocation otherwise it will generate one for each attribute. I ran the test with each of these three versions of Mesa and noted the FPS. This is based on top of commit 29c7cf2b2 with -O3 and --disable-debug. libdrm is 00847fa4 with -O3. 1) Mesa master 2) Master with my patch applied 3) The original optimization removed completely so that it will always generate a buffer relocation for every attribute. The test was run with LIBGL_SHOW_FPS=1 vblank_mode=0 on my Haswell laptop. The results are: attributes are │ master with patch optimization removed ┼── in order│ 820 560 325 out of order│ 325 540 325 Also... what is the affect of the optimization when the relocations cannot be merged? It should be easy enough to modify the test to get that data as well. The FPS fluctuated by around 20 FPS either side so I've just noted down what looked like an approximate representation. So I guess the results are that yes, in this extreme case having more relocations makes a big difference but also doing the qsort is quite expensive. The original optimization does seem worth doing. It might be worth making a simpler hard-coded implementation of quicksort because calling qsort is probably not very sensible for such a small array and the function call overhead for each comparison is probably quite a bit. It would also probably be good to see if this difference is noticeable in a real use case. - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] i965: Sort array elements to increase chances of reusing buffer relocation
Neil wrote: It might be worth making a simpler hard-coded implementation of quicksort because calling qsort is probably not very sensible for such a small array and the function call overhead for each comparison is probably quite a bit. Ok, here is a v2 of the patch which has a simple custom quick sort implementation based on the Wikipedia description. It seems a lot faster than using qsort so the table is now like this: attributes are │ master with patch optimization removed patchv2 ┼── in order│ 820 560 325 760 out of order│ 325 540 325 760 - Neil --- 8 --- (use git am --scissors to automatically chop here) When submitting the vertex buffers the i965 driver will try to recognise when multiple attributes are using the same buffer so that it can submit a single relocation for it and set a different offset in the attribute. Previously however if the application happens to have the attributes in a struct with an order that doesn't match the order they are listed in the gl_vert_attrib enum then the loop would end up processing the attributes with a greater offset first and the optimisation wouldn't be used. To make the optmisation more likely to be used this patch makes it always process the elements in increasing order of offset. This is done copying the element pointers into a separate array and sorting it with a simple quick sort implementation. This only affects the order that the elements are processed and doesn't change the order that they are submitted to the hardware. --- src/mesa/drivers/dri/i965/brw_draw_upload.c | 102 +++- 1 file changed, 99 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upload.c index 7bf9163..8480043 100644 --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c @@ -396,6 +396,89 @@ copy_array_to_vbo_array(struct brw_context *brw, buffer-stride = dst_stride; } +static bool +compare_array_ptr(const struct brw_vertex_element *input_a, + const struct brw_vertex_element *input_b) +{ + const GLubyte *ptr_a = input_a-glarray-Ptr; + const GLubyte *ptr_b = input_b-glarray-Ptr; + + if (ptr_a ptr_b) + return false; + else if (ptr_a ptr_b) + return true; + else { + /* Resort to comparing the element pointers so that it never returns + * equality because apparently that can be bad for quick sort */ + if (input_a input_b) + return false; + else + return true; + } +} + +static int +partition_elements(struct brw_vertex_element **elements, + int n_elements) +{ + int pivot = n_elements / 2; + struct brw_vertex_element *pivot_value = elements[pivot]; + struct brw_vertex_element *tmp; + int i, store_index; + + elements[pivot] = elements[n_elements - 1]; + + store_index = 0; + + for (i = 0; i n_elements - 1; i++) { + if (!compare_array_ptr(elements[i], pivot_value)) { + tmp = elements[i]; + elements[i] = elements[store_index]; + elements[store_index] = tmp; + store_index++; + } + } + + elements[n_elements - 1] = elements[store_index]; + elements[store_index] = pivot_value; + + return store_index; +} + +struct sort_stack { + int16_t start, end; +}; + +static void +sort_elements(struct brw_vertex_element **elements, + int n_elements) +{ + struct sort_stack stack[VERT_ATTRIB_MAX]; + int stack_size = 1; + int start, end, pivot_point; + + stack[0].start = 0; + stack[0].end = n_elements; + + while (stack_size 0) { + stack_size--; + start = stack[stack_size].start; + end = stack[stack_size].end; + + if (end - start = 2) { + pivot_point = partition_elements(elements + start, + end - start) + start; + assert(stack_size + 2 = ARRAY_SIZE(stack)); + stack[stack_size].start = pivot_point + 1; + stack[stack_size].end = end; + stack_size++; + stack[stack_size].start = start; + stack[stack_size].end = pivot_point; + stack_size++; + } + } +} + void brw_prepare_vertices(struct brw_context *brw) { @@ -409,6 +492,7 @@ brw_prepare_vertices(struct brw_context *brw) int delta, i, j; struct brw_vertex_element *upload[VERT_ATTRIB_MAX]; + struct brw_vertex_element *sorted[VERT_ATTRIB_MAX]; GLuint nr_uploads = 0; /* _NEW_POLYGON @@ -442,8 +526,20 @@ brw_prepare_vertices(struct brw_context *brw) if (brw-vb.nr_buffers) return; + /* In the loop below if it finds an element that is using the same buffer +* as a previous element then it will reuse the same buffer relocation. +* However it will only work if the offset for the previous element is
Re: [Mesa-dev] [PATCH 3/4] i965: Drop BRW_NEW_VERTEX_PROGRAM from Gen7+ 3DSTATE_VS atoms.
On Tue, Dec 2, 2014 at 3:51 AM, Kenneth Graunke kenn...@whitecape.org wrote: We don't access brw-vertex_program or ctx-_Shader since the previous commit, so we don't need this dirty bit. I think it's still necessary on Gen6 because it still conflates constant uploading with unit state uploading. We can fix that later. I might add a FIXME into the Gen6 code so we don't forget. Either way, the series is Reviewed-by: Matt Turner matts...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 86939] test_vf_float_conversions.cpp:63:12: error: expected primary-expression before ‘union’
https://bugs.freedesktop.org/show_bug.cgi?id=86939 Matt Turner matts...@gmail.com changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #1 from Matt Turner matts...@gmail.com --- I think this was fixed yesterday by http://cgit.freedesktop.org/mesa/mesa/commit/?id=31a46fb7a5063b7d292acbefb89138ee25b2673e -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 86939] test_vf_float_conversions.cpp:63:12: error: expected primary-expression before ‘union’
https://bugs.freedesktop.org/show_bug.cgi?id=86939 Vinson Lee v...@freedesktop.org changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |--- --- Comment #2 from Vinson Lee v...@freedesktop.org --- Still not fixed with top-of-tree master. -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Patch v2] Mesa: Add support for GL_OES_texture_*float* extensions.
On 11/27/2014 10:56 AM, Kalyan Kondapally wrote: This patch adds support for following GLES2 Texture Float extensions: 1)GL_OES_texture_float, 2)GL_OES_texture_half_float, 3)GL_OES_texture_float_linear, 4)GL_OES_texture_half_float_linear. Support for these extensions need to be explicitly enabled per driver and this patch enables support for i965 drivers. v2: Indentation fixes. (Brian Paul) Fixed Comments and added some to new private functions.(Brian Paul) Added assert in valid_filter_for_float.(Brian Paul) Renamed Float and HALF_FLOAT_OES as IsFloatOES and IsHalfFloatOES.(Brian Paul) adjust_for_oes_float_texture to return GLenum. (Brain Paul) Use RGB_32F internal floating point format for RGB base format. Signed-off-by: Kevin Rogovin kevin.rogo...@intel.com Signed-off-by: Kalyan Kondapally kalyan.kondapa...@intel.com --- src/mesa/drivers/dri/i965/intel_extensions.c | 6 +++ src/mesa/main/extensions.c | 4 ++ src/mesa/main/glformats.c| 46 +-- src/mesa/main/glformats.h| 3 +- src/mesa/main/mtypes.h | 6 +++ src/mesa/main/pack.c | 16 +++ src/mesa/main/teximage.c | 68 +++- src/mesa/main/texobj.c | 53 ++ 8 files changed, 196 insertions(+), 6 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c index 76f..e95eaef 100644 --- a/src/mesa/drivers/dri/i965/intel_extensions.c +++ b/src/mesa/drivers/dri/i965/intel_extensions.c @@ -245,6 +245,12 @@ intelInitExtensions(struct gl_context *ctx) ctx-Extensions.OES_standard_derivatives = true; ctx-Extensions.OES_EGL_image_external = true; + bool enable_opengles2_extensions = ctx-API == API_OPENGLES2; + ctx-Extensions.OES_texture_float = enable_opengles2_extensions; + ctx-Extensions.OES_texture_half_float = enable_opengles2_extensions; + ctx-Extensions.OES_texture_float_linear = enable_opengles2_extensions; + ctx-Extensions.OES_texture_half_float_linear = enable_opengles2_extensions; + The code that decides which extension strings to expose to the application already does per-API filtering. Unless there is code elsewhere in Mesa that behaves differently when these extensions are enabled, there is no reason to do this. Notice that we enable all the desktop OpenGL extensions even in an OpenGL ES 1.1 context. I see that v3 moves this to handle_first_current. I'll add further comments on the updated patch. if (brw-gen = 6) ctx-Const.GLSLVersion = 330; else diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c index 0df04c2..6833fcf 100644 --- a/src/mesa/main/extensions.c +++ b/src/mesa/main/extensions.c @@ -314,6 +314,10 @@ static const struct extension extension_table[] = { { GL_OES_texture_3D, o(EXT_texture3D), ES2, 2005 }, { GL_OES_texture_cube_map,o(ARB_texture_cube_map), ES1, 2007 }, { GL_OES_texture_env_crossbar, o(ARB_texture_env_crossbar), ES1, 2005 }, + { GL_OES_texture_float, o(OES_texture_float), ES2, 2005 }, + { GL_OES_texture_float_linear, o(OES_texture_float_linear), ES2, 2005 }, + { GL_OES_texture_half_float, o(OES_texture_half_float), ES2, 2005 }, + { GL_OES_texture_half_float_linear, o(OES_texture_half_float_linear), ES2, 2005 }, { GL_OES_texture_mirrored_repeat, o(dummy_true), ES1, 2005 }, { GL_OES_texture_npot, o(ARB_texture_non_power_of_two), ES1 | ES2, 2005 }, { GL_OES_vertex_array_object, o(dummy_true), ES1 | ES2, 2010 }, diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c index 00478f9..c2e6c37 100644 --- a/src/mesa/main/glformats.c +++ b/src/mesa/main/glformats.c @@ -93,6 +93,7 @@ _mesa_sizeof_type(GLenum type) case GL_DOUBLE: return sizeof(GLdouble); case GL_HALF_FLOAT_ARB: + case GL_HALF_FLOAT_OES: return sizeof(GLhalfARB); case GL_FIXED: return sizeof(GLfixed); @@ -125,6 +126,7 @@ _mesa_sizeof_packed_type(GLenum type) case GL_INT: return sizeof(GLint); case GL_HALF_FLOAT_ARB: + case GL_HALF_FLOAT_OES: return sizeof(GLhalfARB); case GL_FLOAT: return sizeof(GLfloat); @@ -241,6 +243,7 @@ _mesa_bytes_per_pixel(GLenum format, GLenum type) case GL_FLOAT: return comps * sizeof(GLfloat); case
[Mesa-dev] [PATCH] i965: Avoid union literal, for old gcc compatibility.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86939 --- src/mesa/drivers/dri/i965/test_vf_float_conversions.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/test_vf_float_conversions.cpp b/src/mesa/drivers/dri/i965/test_vf_float_conversions.cpp index a21000c..2ea36fd 100644 --- a/src/mesa/drivers/dri/i965/test_vf_float_conversions.cpp +++ b/src/mesa/drivers/dri/i965/test_vf_float_conversions.cpp @@ -60,7 +60,8 @@ union fu { static unsigned f2u(float f) { - return (union fu){ .f = f }.u; + union fu fu = { .f = f }; + return fu.u; } TEST_F(vf_float_conversion_test, test_vf_to_float) -- 2.0.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Avoid union literal, for old gcc compatibility.
Reviewed-by: Ian Romanick ian.d.roman...@intel.com On 12/02/2014 11:24 AM, Matt Turner wrote: Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86939 --- src/mesa/drivers/dri/i965/test_vf_float_conversions.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/test_vf_float_conversions.cpp b/src/mesa/drivers/dri/i965/test_vf_float_conversions.cpp index a21000c..2ea36fd 100644 --- a/src/mesa/drivers/dri/i965/test_vf_float_conversions.cpp +++ b/src/mesa/drivers/dri/i965/test_vf_float_conversions.cpp @@ -60,7 +60,8 @@ union fu { static unsigned f2u(float f) { - return (union fu){ .f = f }.u; + union fu fu = { .f = f }; + return fu.u; } TEST_F(vf_float_conversion_test, test_vf_to_float) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/1] r600, llvm: Don't leak global symbol offsets
Signed-off-by: Jan Vesely jan.ves...@rutgers.edu --- You were right, this one was leaking too. src/gallium/drivers/r600/r600_llvm.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/gallium/drivers/r600/r600_llvm.c b/src/gallium/drivers/r600/r600_llvm.c index 3a3ee3a..a928fb8 100644 --- a/src/gallium/drivers/r600/r600_llvm.c +++ b/src/gallium/drivers/r600/r600_llvm.c @@ -888,6 +888,7 @@ unsigned r600_llvm_compile( FREE(binary.code); FREE(binary.config); FREE(binary.rodata); + FREE(binary.global_symbol_offsets); return r; } -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/1] r600: upload implicit arguments even if there are no explicit args
On Tue, 2014-12-02 at 11:33 -0500, Tom Stellard wrote: On Mon, Nov 03, 2014 at 08:29:37PM -0500, Jan Vesely wrote: Signed-off-by: Jan Vesely jan.ves...@rutgers.edu --- moreover, the condition is never true now that clover appends dim info src/gallium/drivers/r600/evergreen_compute.c | 4 1 file changed, 4 deletions(-) diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c index 90fdd79..41dc93e 100644 --- a/src/gallium/drivers/r600/evergreen_compute.c +++ b/src/gallium/drivers/r600/evergreen_compute.c @@ -295,10 +295,6 @@ void evergreen_compute_upload_input( struct pipe_box box; struct pipe_transfer *transfer = NULL; - if (shader-input_size == 0) { - return; - } - We shouldn't rely on clover specific behavior, because in theory there could be other state trackers. right, I should probably drop that comment from commit message. Other than that, is there a reason to skip uploading driver arguments if there are no state tracker ones? jan -Tom if (!shader-kernel_param) { /* Add space for the grid dimensions */ shader-kernel_param = (struct r600_resource *) -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev -- Jan Vesely jan.ves...@rutgers.edu signature.asc Description: This is a digitally signed message part ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] DRI3/Present fixes for Mesa 10.3+ (v2)
A slightly updated and extended series of the dri3/present fixes for Mesa i sent last week. Patch 1 and 2 are same as before. Patch 3 now has signed off by Frank Binns and reviewed by Chris Wilson. Patch 4 and 5 are additional fixes. The last one makes INTEL_swap_events behave properly again when swap completion events come in out of order due to skipped present requests. Before the first out of sequence sbc event caused the INTEL_swap_events extension to become completely unuseable for the rest of a session. thanks, -mario ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/5] glx/dri3: Track separate (ust, msc) for PresentPixmap vs. PresentNotifyMsc
Prevent calls to glXGetSyncValuesOML() and glXWaitForMscOML() from overwriting the (ust,msc) values of the last successfull swapbuffers call (PresentPixmapCompleteNotify event), as glXWaitForSbcOML() relies on those values corresponding to the most recent completed swap, not to whatever was last returned from the server. Problematic call sequence without this patch would have been, e.g., glXSwapBuffers() ... wait ... swap completes - PresentPixmapComplete event - (ust,msc) updated to reflect swap completion time and count. ... wait for at least 1 video refresh cycle/vblank increment. glXGetSyncValuesOML() - PresentNotifyMsc event overwrites (ust,msc) of swap completion with (ust,msc) of most recent vblank glXWaitForSbcOML() - Returns sbc of last completed swap but (ust,msc) of last completed vblank, not of last completed swap. - Client is confused. Do this by tracking a separate set of (ust, msc) for the dri3_wait_for_msc() call than for the dri3_wait_for_sbc() call. This makes the glXWaitForSbcOML() call robust again and restores consistent behaviour with the DRI2 implementation. Fixes applications originally written and tested against DRI2 which also rely on this not regressing under DRI3/Present, e.g., Neuro-Science software like Psychtoolbox-3. This patch fixes the problem. Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- src/glx/dri3_glx.c | 11 +++ src/glx/dri3_priv.h | 5 - 2 files changed, 11 insertions(+), 5 deletions(-) diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c index b4ac278..5796491 100644 --- a/src/glx/dri3_glx.c +++ b/src/glx/dri3_glx.c @@ -420,11 +420,14 @@ dri3_handle_present_event(struct dri3_drawable *priv, xcb_present_generic_event_ if (psc-show_fps_interval) show_fps(priv, ce-ust); + + priv-ust = ce-ust; + priv-msc = ce-msc; } else { priv-recv_msc_serial = ce-serial; + priv-vblank_ust = ce-ust; + priv-vblank_msc = ce-msc; } - priv-ust = ce-ust; - priv-msc = ce-msc; break; } case XCB_PRESENT_EVENT_IDLE_NOTIFY: { @@ -498,8 +501,8 @@ dri3_wait_for_msc(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor, } } - *ust = priv-ust; - *msc = priv-msc; + *ust = priv-vblank_ust; + *msc = priv-vblank_msc; *sbc = priv-recv_sbc; return 1; diff --git a/src/glx/dri3_priv.h b/src/glx/dri3_priv.h index 8e46640..222deb0 100644 --- a/src/glx/dri3_priv.h +++ b/src/glx/dri3_priv.h @@ -182,9 +182,12 @@ struct dri3_drawable { uint64_t send_sbc; uint64_t recv_sbc; - /* Last received UST/MSC values */ + /* Last received UST/MSC values for pixmap present complete */ uint64_t ust, msc; + /* Last received UST/MSC values for vblank */ + uint64_t vblank_ust, vblank_msc; + /* Serial numbers for tracking wait_for_msc events */ uint32_t send_msc_serial; uint32_t recv_msc_serial; -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/5] glx/dri3: Request non-vsynced Present for swapinterval zero. (v2)
Restores proper immediate tearing swap behaviour for OpenGL bufferswap under DRI3/Present. Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org v2: Add Frank Binns signed off by for his original earlier patch from April 2014, which is identical to this one, and Chris Wilsons reviewed tag from May 2014 for that patch, ergo also for this one. Signed-off-by: Frank Binns frank.bi...@imgtec.com Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com Reviewed-by: Chris Wilson ch...@chris-wilson.co.uk --- src/glx/dri3_glx.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c index 5796491..c53be1b 100644 --- a/src/glx/dri3_glx.c +++ b/src/glx/dri3_glx.c @@ -1518,6 +1518,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor, xcb_connection_t *c = XGetXCBConnection(dpy); struct dri3_buffer *back; int64_t ret = 0; + uint32_t options = XCB_PRESENT_OPTION_NONE; unsigned flags = __DRI2_FLUSH_DRAWABLE; if (flush) @@ -1557,6 +1558,9 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor, if (target_msc == 0) target_msc = priv-msc + priv-swap_interval * (priv-send_sbc - priv-recv_sbc); + if (priv-swap_interval == 0) + options |= XCB_PRESENT_OPTION_ASYNC; + back-busy = 1; back-last_swap = priv-send_sbc; xcb_present_pixmap(c, @@ -1570,7 +1574,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor, None, /* target_crtc */ None, back-sync_fence, - XCB_PRESENT_OPTION_NONE, + options, target_msc, divisor, remainder, 0, NULL); -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/5] glx: Handle out-of-sequence swap completion events correctly.
The code for emitting INTEL_swap_events swap completion events needs to translate from 32-Bit sbc on the wire to 64-Bit sbc for the events and handle wraparound accordingly. It assumed that events would be sent by the server in the order their corresponding swap requests were emitted from the client, iow. sbc count should be always increasing. This was correct for DRI2. This is not always the case under the DRI3/Present backend, where the Present extension can execute swaps and send out completion events in a different order than the submission order of the present requests. This confused the wraparound handling. This patch fixes the problem by handling 32-Bit wraparound in both directions. As long as successive swap completion events real 64-Bit sbc's don't differ by more than 2^30, this should be able to do the right thing. Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- src/glx/glxext.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/src/glx/glxext.c b/src/glx/glxext.c index 68c359e..fdc24d4 100644 --- a/src/glx/glxext.c +++ b/src/glx/glxext.c @@ -143,8 +143,13 @@ __glXWireToEvent(Display *dpy, XEvent *event, xEvent *wire) aevent-ust = ((CARD64)awire-ust_hi 32) | awire-ust_lo; aevent-msc = ((CARD64)awire-msc_hi 32) | awire-msc_lo; - if (awire-sbc glxDraw-lastEventSbc) -glxDraw-eventSbcWrap += 0x1; + /* Handle 32-Bit wire sbc wraparound in both directions to cope with out + * of sequence 64-Bit sbc's + */ + if ((int64_t) awire-sbc ((int64_t) glxDraw-lastEventSbc - 0x4000)) + glxDraw-eventSbcWrap += 0x1; + if ((int64_t) awire-sbc ((int64_t) glxDraw-lastEventSbc + 0x4000)) + glxDraw-eventSbcWrap -= 0x1; glxDraw-lastEventSbc = awire-sbc; aevent-sbc = awire-sbc + glxDraw-eventSbcWrap; return True; -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/5] glx/dri3: Fix glXWaitForSbcOML() to handle targetSBC==0 correctly.
targetSBC == 0 is a special case, which asks the function to block until all pending OpenGL bufferswap requests have completed. Currently the function just falls through for targetSBC == 0, returning bogus results. This breaks applications originally written and tested against DRI2 which also rely on this not regressing under DRI3/Present, e.g., Neuro-Science software like Psychtoolbox-3. This patch fixes the problem. Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- src/glx/dri3_glx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c index a9ff73b..b4ac278 100644 --- a/src/glx/dri3_glx.c +++ b/src/glx/dri3_glx.c @@ -529,7 +529,8 @@ dri3_wait_for_sbc(__GLXDRIdrawable *pdraw, int64_t target_sbc, int64_t *ust, { struct dri3_drawable *priv = (struct dri3_drawable *) pdraw; - while (priv-recv_sbc target_sbc) { + while ((target_sbc != 0 priv-recv_sbc target_sbc) || + (target_sbc == 0 priv-recv_sbc priv-send_sbc)) { if (!dri3_wait_for_event(pdraw)) return 0; } -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/5] glx/dri3: Don't fail on glXSwapBuffersMscOML(dpy, window, 0, 0, 0)
glXSwapBuffersMscOML() with target_msc=divisor=remainder=0 gets translated into target_msc=divisor=0 but remainder=1 by the mesa api. This is done for server DRI2 where there needs to be a way to tell the server-side DRI2ScheduleSwap implementation if a call to glXSwapBuffers() or glXSwapBuffersMscOML(dpy,window,0,0,0) was done. remainder = 1 was (ab)used as a flag to tell the server to select proper semantic. The DRI3/Present backend ignored this signalling, treated any target_msc=0 as glXSwapBuffers() request, and called xcb_present_pixmap with invalid divisor=0, remainder=1 combo. The present extension responded kindly to this with a BadValue error and dropped the request, but mesa's DRI3/Present backend doesn't check for error codes. From there on stuff went downhill quickly for the calling OpenGL client... This patch fixes the problem. Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com --- src/glx/dri3_glx.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c index c53be1b..efff907 100644 --- a/src/glx/dri3_glx.c +++ b/src/glx/dri3_glx.c @@ -1552,11 +1552,21 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor, dri3_fence_reset(c, back); /* Compute when we want the frame shown by taking the last known successful - * MSC and adding in a swap interval for each outstanding swap request + * MSC and adding in a swap interval for each outstanding swap request. + * target_msc=divisor=remainder=0 means Use glXSwapBuffers() semantic */ ++priv-send_sbc; - if (target_msc == 0) + if (target_msc == 0 divisor == 0 remainder == 0) target_msc = priv-msc + priv-swap_interval * (priv-send_sbc - priv-recv_sbc); + else if (divisor == 0 remainder 0) { + /* Special case signalling: This means glXSwapBuffersMscOML() called with + * target_msc=divisor=remainder=0. Needed to distinguish from glXSwapBuffers + * case above. Reset remainder to zero, so PresentPixmap won't bail on us. + * GLX_OML_sync_control says for divisor == 0 any remainder = 0 is fine, + * whereas Present extension wants remainder == 0 for divisor == 0. + */ + remainder = 0; + } if (priv-swap_interval == 0) options |= XCB_PRESENT_OPTION_ASYNC; -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] [RFC] MESA_shading_language_130: is this it?
Ian, Dave My ILK is down at the moment, but I don't recall any problem with clip distances. All that work landed in mesa-10.0, and at the time passed all the relevant piglits (with 1.30 + EXT_gpu_shader4 hacked in for entrypoints). The Gen4-5 VS currently lowers all user clipping to clip distances, and the clip shader works purely in terms of clip distances. The old Broadwater and Crestline chips have problems supporting 8 clip distances due to the negative-W workaround mess. It could be done, but doesn't seem worth it. All we were missing was the API side to make 1.30 useful without GL3. [Note that there's a possible perf tradeoff still to be explored in not doing real geometry clipping for clip distances, but instead just discarding the unwanted fragments in the FS. At least Ironlake provides a mechanism to fixup the fragment counts for this case; Cantiga may do as well.] -- Chris On Wed, Dec 3, 2014 at 7:26 AM, Ian Romanick i...@freedesktop.org wrote: On 11/26/2014 06:09 PM, Dave Airlie wrote: Glamor is 4x faster on my ILK using glsl 130 at core text using x11perf -ftext. Ian started writing a spec for this extension a while back, which seems like most of the work, this patch seems to do enough, to advertise GLSL 1.30. Yeah... I started writing the extension when Chris Forbes was working on adding GLSL 1.30 for Ironlake. I seem to recall that gl_ClipDistance still does not work for ILK, and difficulties with that caused Chris to abandon the project. This was over a year ago, so the details are a bit fuzzy. The common Mesa parts look good, though. If we want to pursue this, I can finish up the extension spec and get it published. One other minor nit below... TODO: fix extension numbering get piglit to execute tests on this Just-hacked-up-by: Dave Airlie airl...@redhat.com --- src/mapi/glapi/gen/MESA_shading_language_130.xml | 255 +++ src/mapi/glapi/gen/gl_API.xml| 2 + src/mesa/drivers/dri/i965/intel_extensions.c | 5 + src/mesa/main/extensions.c | 1 + src/mesa/main/mtypes.h | 1 + 5 files changed, 264 insertions(+) create mode 100644 src/mapi/glapi/gen/MESA_shading_language_130.xml diff --git a/src/mapi/glapi/gen/MESA_shading_language_130.xml b/src/mapi/glapi/gen/MESA_shading_language_130.xml new file mode 100644 index 000..9aa553d --- /dev/null +++ b/src/mapi/glapi/gen/MESA_shading_language_130.xml @@ -0,0 +1,255 @@ +?xml version=1.0? +!DOCTYPE OpenGLAPI SYSTEM gl_API.dtd + +OpenGLAPI + +category name=GL_MESA_shading_language_130 number=999 + +enum name=VERTEX_ATTRIB_ARRAY_INTEGERvalue=0x88FD/ +enum name=SAMPLER_1D_ARRAY value=0x8DC0/ +enum name=SAMPLER_2D_ARRAY value=0x8DC1/ +enum name=SAMPLER_1D_ARRAY_SHADOWvalue=0x8DC3/ +enum name=SAMPLER_2D_ARRAY_SHADOWvalue=0x8DC4/ +enum name=SAMPLER_CUBE_SHADOWvalue=0x8DC5/ +enum name=UNSIGNED_INT_VEC2 value=0x8DC6/ +enum name=UNSIGNED_INT_VEC3 value=0x8DC7/ +enum name=UNSIGNED_INT_VEC4 value=0x8DC8/ +enum name=INT_SAMPLER_1D value=0x8DC9/ +enum name=INT_SAMPLER_2D value=0x8DCA/ +enum name=INT_SAMPLER_3D value=0x8DCB/ +enum name=INT_SAMPLER_CUBE value=0x8DCC/ +enum name=INT_SAMPLER_1D_ARRAY value=0x8DCE/ +enum name=INT_SAMPLER_2D_ARRAY value=0x8DCF/ +enum name=UNSIGNED_INT_SAMPLER_1Dvalue=0x8DD1/ +enum name=UNSIGNED_INT_SAMPLER_2Dvalue=0x8DD2/ +enum name=UNSIGNED_INT_SAMPLER_3Dvalue=0x8DD3/ +enum name=UNSIGNED_INT_SAMPLER_CUBE value=0x8DD4/ +enum name=UNSIGNED_INT_SAMPLER_1D_ARRAY value=0x8DD6/ +enum name=UNSIGNED_INT_SAMPLER_2D_ARRAY value=0x8DD7/ + +!-- There is no MIN_PROGRAM_TEXEL_OFFSET in glext.h. There is + MIN_PROGRAM_TEXEL_OFFSET_NV and MIN_PROGRAM_TEXEL_OFFSET (OpenGL + 3.0). Same goes for MAX_PROGRAM_TEXEL_OFFSET. +-- +enum name=MIN_PROGRAM_TEXEL_OFFSET value=0x8904 +size name=Get mode=get/ +/enum +enum name=MAX_PROGRAM_TEXEL_OFFSET value=0x8905 +size name=Get mode=get/ +/enum + +enum name=CLIP_DISTANCE0 value=0x3000/ +enum name=CLIP_DISTANCE1 value=0x3001/ +enum name=CLIP_DISTANCE2 value=0x3002/ +enum name=CLIP_DISTANCE3 value=0x3003/ +enum name=CLIP_DISTANCE4 value=0x3004/ +enum name=CLIP_DISTANCE5 value=0x3005/ +enum name=CLIP_DISTANCE6 value=0x3006/ +enum name=CLIP_DISTANCE7 value=0x3007/ + +enum
[Mesa-dev] [PATCH 1/1] clover: Use switch when creating kernel arguments
This way we get a warning when an enum value is not handled Signed-off-by: Jan Vesely jan.ves...@rutgers.edu --- src/gallium/state_trackers/clover/core/kernel.cpp | 45 ++- 1 file changed, 20 insertions(+), 25 deletions(-) diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp b/src/gallium/state_trackers/clover/core/kernel.cpp index e07d14d..9ddc244 100644 --- a/src/gallium/state_trackers/clover/core/kernel.cpp +++ b/src/gallium/state_trackers/clover/core/kernel.cpp @@ -293,38 +293,33 @@ namespace { std::unique_ptrkernel::argument kernel::argument::create(const module::argument marg) { - if (marg.type == module::argument::scalar) - return std::unique_ptrkernel::argument( -new scalar_argument(marg.size)); + switch(marg.type) + { + case module::argument::scalar: + return std::unique_ptrkernel::argument(new scalar_argument(marg.size)); - else if (marg.type == module::argument::global) - return std::unique_ptrkernel::argument( -new global_argument); + case module::argument::global: + return std::unique_ptrkernel::argument(new global_argument); - else if (marg.type == module::argument::local) - return std::unique_ptrkernel::argument( -new local_argument); + case module::argument::local: + return std::unique_ptrkernel::argument(new local_argument); - else if (marg.type == module::argument::constant) - return std::unique_ptrkernel::argument( -new constant_argument); + case module::argument::constant: + return std::unique_ptrkernel::argument(new constant_argument); - else if (marg.type == module::argument::image2d_rd || - marg.type == module::argument::image3d_rd) - return std::unique_ptrkernel::argument( -new image_rd_argument); + case module::argument::image2d_rd: + case module::argument::image3d_rd: + return std::unique_ptrkernel::argument(new image_rd_argument); - else if (marg.type == module::argument::image2d_wr || - marg.type == module::argument::image3d_wr) - return std::unique_ptrkernel::argument( -new image_wr_argument); + case module::argument::image2d_wr: + case module::argument::image3d_wr: + return std::unique_ptrkernel::argument(new image_wr_argument); - else if (marg.type == module::argument::sampler) - return std::unique_ptrkernel::argument( -new sampler_argument); + case module::argument::sampler: + return std::unique_ptrkernel::argument(new sampler_argument); - else - throw error(CL_INVALID_KERNEL_DEFINITION); + } + throw error(CL_INVALID_KERNEL_DEFINITION); } kernel::argument::argument() : _set(false) { -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/1] st/xvmc: Fix compiler warnings
Mostly signed/unsigned comparison Signed-off-by: Jan Vesely jan.ves...@rutgers.edu --- src/gallium/state_trackers/xvmc/context.c | 6 +++--- src/gallium/state_trackers/xvmc/subpicture.c | 2 +- src/gallium/state_trackers/xvmc/xvmc_private.h | 4 ++-- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/src/gallium/state_trackers/xvmc/context.c b/src/gallium/state_trackers/xvmc/context.c index 2329e2a..9ded2e5 100644 --- a/src/gallium/state_trackers/xvmc/context.c +++ b/src/gallium/state_trackers/xvmc/context.c @@ -67,7 +67,7 @@ static Status Validate(Display *dpy, XvPortID port, int surface_type_id, *found_port = false; - for (unsigned int i = 0; i XScreenCount(dpy); ++i) { + for (int i = 0; i XScreenCount(dpy); ++i) { ret = XvQueryAdaptors(dpy, XRootWindow(dpy, i), num_adaptors, adaptor_info); if (ret != Success) return ret; @@ -87,7 +87,7 @@ static Status Validate(Display *dpy, XvPortID port, int surface_type_id, return BadAlloc; } -for (unsigned int l = 0; l num_types !found_surface; ++l) { +for (int l = 0; l num_types !found_surface; ++l) { if (surface_info[l].surface_type_id != surface_type_id) continue; @@ -191,7 +191,7 @@ Status XvMCCreateContext(Display *dpy, XvPortID port, int surface_type_id, Status ret; struct vl_screen *vscreen; struct pipe_context *pipe; - struct pipe_video_codec templat = {}; + struct pipe_video_codec templat = {0}; XvMCContextPrivate *context_priv; vl_csc_matrix csc; diff --git a/src/gallium/state_trackers/xvmc/subpicture.c b/src/gallium/state_trackers/xvmc/subpicture.c index 7a951fa..6f42216 100644 --- a/src/gallium/state_trackers/xvmc/subpicture.c +++ b/src/gallium/state_trackers/xvmc/subpicture.c @@ -112,7 +112,7 @@ static Status Validate(Display *dpy, XvPortID port, int surface_type_id, int xvi { XvImageFormatValues *subpictures; int num_subpics; - unsigned int i; + int i; subpictures = XvMCListSubpictureTypes(dpy, port, surface_type_id, num_subpics); if (num_subpics 1) { diff --git a/src/gallium/state_trackers/xvmc/xvmc_private.h b/src/gallium/state_trackers/xvmc/xvmc_private.h index eaf388a..84c7b6c 100644 --- a/src/gallium/state_trackers/xvmc/xvmc_private.h +++ b/src/gallium/state_trackers/xvmc/xvmc_private.h @@ -69,7 +69,7 @@ typedef struct struct pipe_video_buffer *video_buffer; /* nonzero if this picture is already being decoded */ - int picture_structure; + unsigned picture_structure; XvMCSurface *ref[2]; @@ -106,7 +106,7 @@ typedef struct #define XVMC_WARN 2 #define XVMC_TRACE 3 -static INLINE void XVMC_MSG(unsigned int level, const char *fmt, ...) +static INLINE void XVMC_MSG(int level, const char *fmt, ...) { static int debug_level = -1; -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH RESEND] nv50/ir: use unordered_set instead of list to keep track of var defs
The set of variable defs does not need to be ordered in any way, and removing/adding elements is a fairly common operation in various optimization passes. This shortens runtime of piglit test fp-long-alu to ~11s from ~22s No piglit regressions observed on nvc0! Signed-off-by: Tobias Klausmann tobias.johannes.klausm...@mni.thm.de --- src/gallium/drivers/nouveau/codegen/nv50_ir.cpp| 4 ++-- src/gallium/drivers/nouveau/codegen/nv50_ir.h | 7 +++--- .../drivers/nouveau/codegen/nv50_ir_inlines.h | 28 +- .../nouveau/codegen/nv50_ir_lowering_nv50.cpp | 4 ++-- .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 6 ++--- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 4 ++-- src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 17 +++-- 7 files changed, 38 insertions(+), 32 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp index ca3c806..3138118 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.cpp @@ -154,9 +154,9 @@ ValueDef::set(Value *defVal) if (value == defVal) return; if (value) - value-defs.remove(this); + value-defs.erase(this); if (defVal) - defVal-defs.push_back(this); + defVal-defs.insert(this); value = defVal; } diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h b/src/gallium/drivers/nouveau/codegen/nv50_ir.h index 0ff5e5d..56033f1 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h @@ -567,6 +567,7 @@ public: inline Value *rep() const { return join; } + inline Instruction *getUniqueInsnMerged() const; inline Instruction *getUniqueInsn() const; inline Instruction *getInsn() const; // use when uniqueness is certain @@ -583,11 +584,11 @@ public: static inline Value *get(Iterator); std::tr1::unordered_setValueRef * uses; - std::listValueDef * defs; + std::tr1::unordered_setValueDef * defs; typedef std::tr1::unordered_setValueRef *::iterator UseIterator; typedef std::tr1::unordered_setValueRef *::const_iterator UseCIterator; - typedef std::listValueDef *::iterator DefIterator; - typedef std::listValueDef *::const_iterator DefCIterator; + typedef std::tr1::unordered_setValueDef *::iterator DefIterator; + typedef std::tr1::unordered_setValueDef *::const_iterator DefCIterator; int id; Storage reg; diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_inlines.h b/src/gallium/drivers/nouveau/codegen/nv50_ir_inlines.h index 255324f..471d47f 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_inlines.h +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_inlines.h @@ -205,21 +205,26 @@ const LValue *ValueDef::preSSA() const Instruction *Value::getInsn() const { - return defs.empty() ? NULL : defs.front()-getInsn(); + return defs.empty() ? NULL : (*defs.begin())-getInsn(); } -Instruction *Value::getUniqueInsn() const +Instruction *Value::getUniqueInsnMerged() const { if (defs.empty()) return NULL; + /* It is not guaranteed that this is the first in the set, lets find it */ + for (DefCIterator it = defs.begin(); it != defs.end(); ++it) + if ((*it)-get() == this) + return (*it)-getInsn(); + /* We should never hit this assert */ + assert(0); + return NULL; +} - // after regalloc, the definitions of coalesced values are linked - if (join != this) { - for (DefCIterator it = defs.begin(); it != defs.end(); ++it) - if ((*it)-get() == this) -return (*it)-getInsn(); - // should be unreachable and trigger assertion at the end - } +Instruction *Value::getUniqueInsn() const +{ + if (defs.empty()) + return NULL; #ifdef DEBUG if (reg.data.id 0) { int n = 0; @@ -230,8 +235,9 @@ Instruction *Value::getUniqueInsn() const WARN(value %%%i not uniquely defined\n, id); // return NULL ? } #endif - assert(defs.front()-get() == this); - return defs.front()-getInsn(); + ValueDef *def = *defs.begin(); + assert(def-get() == this); + return def-getInsn(); } inline bool Instruction::constrainedDefs() const diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp index e283424..f13e0d4 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp @@ -211,7 +211,7 @@ NV50LegalizePostRA::visit(Function *fn) if (outWrites) { for (std::listInstruction *::iterator it = outWrites-begin(); it != outWrites-end(); ++it) - (*it)-getSrc(1)-defs.front()-getInsn()-setDef(0, (*it)-getSrc(0)); + (*(*it)-getSrc(1)-defs.begin())-getInsn()-setDef(0, (*it)-getSrc(0)); // instructions will be deleted on exit outWrites-clear(); } @@ -343,7
[Mesa-dev] [PATCH] mesa: Ensure stack is realigned on x86.
From: José Fonseca jfons...@vmware.com Nowadays GCC assumes stack pointer is 16-byte aligned even on 32-bits, but that is an assumption OpenGL drivers (or any dynamic library for that matter) can affort to make as there are many closed- and open- source application binaries out there that only assume 4-byte stack alignment. This fix uses force_align_arg_pointer GCC attribute, and is only a stop-gap measure. The right fix would be to pass -mstackrealign or -mincoming-stack-boundary=2 to all source fails that use any -msse* option, as there is no way to guarantee if/when GCC will decide to spill SSE registers to the stack. https://bugs.freedesktop.org/show_bug.cgi?id=86788 --- src/mesa/main/sse_minmax.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/main/sse_minmax.c b/src/mesa/main/sse_minmax.c index 222ac14..93cf2a6 100644 --- a/src/mesa/main/sse_minmax.c +++ b/src/mesa/main/sse_minmax.c @@ -31,6 +31,9 @@ #include stdint.h void +#if !defined(__x86_64__) + __attribute__((force_align_arg_pointer)) +#endif _mesa_uint_array_min_max(const unsigned *ui_indices, unsigned *min_index, unsigned *max_index, const unsigned count) { -- 2.1.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Patch v3] Mesa: Add support for GL_OES_texture_*float* extensions.
On 11/28/2014 01:08 PM, Kalyan Kondapally wrote: This patch adds support for following GLES2 Texture Float extensions: 1)GL_OES_texture_float, 2)GL_OES_texture_half_float, 3)GL_OES_texture_float_linear, 4)GL_OES_texture_half_float_linear. If we are using GLES context and the driver has advertised support for ARB_texture_float, then support for all texture_float* extensions is enabled. Here, we are assuming that when driver advertises support for ARB_texture_float, it should be able to support all these extensions. v2: Indentation fixes. (Brian Paul) Fixed Comments and added some to new private functions.(Brian Paul) Added assert in valid_filter_for_float.(Brian Paul) Renamed Float and HALF_FLOAT_OES as IsFloatOES and IsHalfFloatOES.(Brian Paul) adjust_for_oes_float_texture to return GLenum. (Brian Paul) Use RGB_32F internal floating point format for RGB base format. v3: Don't indent case. (Matt Turner) Enable support for float extensions in case driver supports ARB_texture_float. (Fredrik Höglund) At this point, I think this patch should be split in three: 1. Extension infrastructure bits. 2. GL_HALF_FLOAT_OES fixes. 3. The rest. I suggest this because I think the first two should be pretty much landable. The rest may still need some changes. I'd also accept a 0th patch. 0. Global s/GL_HALF_FLOAT_ARB/GL_HALF_FLOAT/g;s/GLhalfARB/GLhalf/g :) Signed-off-by: Kevin Rogovin kevin.rogo...@intel.com Signed-off-by: Kalyan Kondapally kalyan.kondapa...@intel.com --- src/mesa/main/context.c| 15 +++ src/mesa/main/extensions.c | 4 +++ src/mesa/main/glformats.c | 46 +--- src/mesa/main/glformats.h | 3 ++- src/mesa/main/mtypes.h | 6 + src/mesa/main/pack.c | 16 +++ src/mesa/main/teximage.c | 66 +- src/mesa/main/texobj.c | 53 + 8 files changed, 203 insertions(+), 6 deletions(-) diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index 400c158..8b54967 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -1544,6 +1544,21 @@ handle_first_current(struct gl_context *ctx) return; } + /* If we are using GLES context and the driver has advertised support for +* ARB_texture_float, then enable support for all texture_float* extensions. +* Here, we are assuming that when driver advertises support for +* ARB_texture_float, it should be able to support all these extensions. In +* case any particular driver doesn't want to enable these extensions or +* only a subset on GLES, then it shouldn't advertise support for +* ARB_texture_float and toggle OES_texture_float* flags accordingly. +*/ + if (ctx-Extensions.ARB_texture_float _mesa_is_gles(ctx)) { + ctx-Extensions.OES_texture_float = GL_TRUE; + ctx-Extensions.OES_texture_float_linear = GL_TRUE; + ctx-Extensions.OES_texture_half_float = GL_TRUE; + ctx-Extensions.OES_texture_half_float_linear = GL_TRUE; + } I think this is a bad idea. Right now OpenGL ES 3.0 support depends on ARB_texture_float (see compute_version_es2 in src/mesa/main/version.c). That's a superset of OpenGL ES 3.0: GL_OES_texture_float_linear is not required. Now when someone wants to enable ES3 on their GPU that supports everything except GL_OES_texture_float_linear they will have to find-and-move this code because this code occurs after the version is calculated. That person will also have to modify other drivers (that they probably cannot test). Gosh, what could go wrong? :) There are a bunch of other cases where an extension can be automatically enabled if some other conditions are met. If we care about saving a couple lines of code here and there, we should make a function that drivers explicitly call to do that. Drivers that opt-in can call this function after they have enabled all of their other extensions in their create-context path. Moreover, this code is already broken for r300. Marek said in the previous thread that r300 advertises GL_ARB_texture_float (and developers know the documented caveats there) but does not want to advertise GL_OES_texture_*_linear. For the most part in Mesa, extensions are only enabled by default in three cases: 1. All in-tree drivers were already enabling the extension. GL_ARB_texture_env_add is an example. In the long past this extension was not enabled by default. Eventually all drivers either grew support or were removed from the tree. 2. The extension is just API syntactic sugar that the driver won't even see. GL_ARB_multi_bind is an example. 3. The extension has a secondary enable, such as a number of available formats, that drivers can control. GL_ARB_get_program_binary, GL_ARB_multisample, and GL_ARB_texture_compression are examples. In these cases we have a compelling testing story. - If the
Re: [Mesa-dev] [PATCH 4/4] i965: Drop BRW_NEW_VERTEX_PROGRAM and _NEW_TRANSFORM from Gen4 VS state.
Kenneth Graunke kenn...@whitecape.org writes: This simply looks wrong - I don't see any code that uses _NEW_TRANSFORM or BRW_NEW_VERTEX_PROGRAM. It looks like the intention was to duplicate the brw_curbe_offsets atom's flags, which computes brw-curbe.vs_start. This is unnecessary - we flag BRW_NEW_CURBE_OFFSETS whenever that field changes; listening to that is sufficient. The dependency disappeared in ab973403e445cd8211dba4e87e057a9dc1b1af9d signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 01/10] mesa: Considers GL_DEPTH_STENCIL_ATTACHMENT a valid argument for FBO invalidation under GLES3
On 12/01/2014 05:04 AM, Eduardo Lima Mitev wrote: In OpenGL and OpenGL-Es 3+, GL_DEPTH_STENCIL_ATTACHMENT is a valid attachment point for the family of functions that invalidate a framebuffer object (e.g, glInvalidateFramebuffer, glInvalidateSubFramebuffer, etc). Currently, a GL_INVALID_ENUM error is emitted for this attachment point. Fixes 21 dEQP test failures under 'dEQP-GLES3.functional.fbo.invalidate.*'. --- src/mesa/main/fbobject.c | 4 1 file changed, 4 insertions(+) diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c index 8283373..19c4020 100644 --- a/src/mesa/main/fbobject.c +++ b/src/mesa/main/fbobject.c @@ -3073,6 +3073,10 @@ invalidate_framebuffer_storage(GLenum target, GLsizei numAttachments, case GL_DEPTH_ATTACHMENT: case GL_STENCIL_ATTACHMENT: break; + case GL_DEPTH_STENCIL_ATTACHMENT: +if (_mesa_is_desktop_gl(ctx) || _mesa_is_gles3(ctx)) I would add a note here that GL_OES_packed_depth_stencil does not add GL_DEPTH_STENCIL_ATTACHMENT. I was sure that it did... until I went and checked the spec. With that and Jason's suggestion, this patch is Reviewed-by: Ian Romanick ian.d.roman...@intel.com + break; +/* otherwise fall through */ case GL_COLOR_ATTACHMENT0: case GL_COLOR_ATTACHMENT1: case GL_COLOR_ATTACHMENT2: ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/10] mesa: Returns zero samples when querying GL_NUM_SAMPLE_COUNTS when internal format is integer
On 12/01/2014 05:04 AM, Eduardo Lima Mitev wrote: In GLES3, multisampling is not supported for signed and unsigned integer internal formats. See https://www.khronos.org/opengles/sdk/docs/man3/docbook4/xhtml/glGetInternalformativ.xml. Fixes 19 dEQP tests under 'dEQP-GLES3.functional.state_query.internal_format.*'. --- src/mesa/main/formatquery.c | 44 +++- 1 file changed, 27 insertions(+), 17 deletions(-) diff --git a/src/mesa/main/formatquery.c b/src/mesa/main/formatquery.c index 40eca87..629b07b 100644 --- a/src/mesa/main/formatquery.c +++ b/src/mesa/main/formatquery.c @@ -115,23 +115,33 @@ _mesa_GetInternalformativ(GLenum target, GLenum internalformat, GLenum pname, internalformat, buffer); break; case GL_NUM_SAMPLE_COUNTS: { - /* The driver can return 0, and we should pass that along to the - * application. The ARB decided that ARB_internalformat_query should - * behave as ARB_internalformat_query2 in this situation. - * - * The ARB_internalformat_query2 spec says: - * - * - NUM_SAMPLE_COUNTS: The number of sample counts that would be - *returned by querying SAMPLES is returned in params. - ** If internalformat is not color-renderable, - * depth-renderable, or stencil-renderable (as defined in - * section 4.4.4), or if target does not support multiple - * samples (ie other than TEXTURE_2D_MULTISAMPLE, - * TEXTURE_2D_MULTISAMPLE_ARRAY, or RENDERBUFFER), 0 is - * returned. - */ - const size_t num_samples = - ctx-Driver.QuerySamplesForFormat(ctx, target, internalformat, buffer); + size_t num_samples; + + if (_mesa_is_gles3(ctx) _mesa_is_enum_format_integer(internalformat)) { + /* In GLES3.0, multisampling is not supported for signed and unsigned integer internal +formats https://www.khronos.org/opengles/sdk/docs/man3/docbook4/xhtml/glGetInternalformativ.xml, +so return zero + */ Rather than quote the man page, please quote the spec. I might also be tempted to leave this code along but change what gets stored to buffer[0]. Not a big deal either way. + num_samples = 0; + } + else { } else { + /* The driver can return 0, and we should pass that along to the + * application. The ARB decided that ARB_internalformat_query should + * behave as ARB_internalformat_query2 in this situation. + * + * The ARB_internalformat_query2 spec says: + * + * - NUM_SAMPLE_COUNTS: The number of sample counts that would be + *returned by querying SAMPLES is returned in params. + ** If internalformat is not color-renderable, + * depth-renderable, or stencil-renderable (as defined in + * section 4.4.4), or if target does not support multiple + * samples (ie other than TEXTURE_2D_MULTISAMPLE, + * TEXTURE_2D_MULTISAMPLE_ARRAY, or RENDERBUFFER), 0 is + * returned. + */ + num_samples = ctx-Driver.QuerySamplesForFormat(ctx, target, internalformat, buffer); + } /* QuerySamplesForFormat writes some stuff to buffer, so we have to * separately over-write it with the requested value. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] [RFC] MESA_shading_language_130: is this it?
Ian Romanick i...@freedesktop.org writes: On 11/26/2014 06:09 PM, Dave Airlie wrote: Glamor is 4x faster on my ILK using glsl 130 at core text using x11perf -ftext. Ian started writing a spec for this extension a while back, which seems like most of the work, this patch seems to do enough, to advertise GLSL 1.30. Yeah... I started writing the extension when Chris Forbes was working on adding GLSL 1.30 for Ironlake. I seem to recall that gl_ClipDistance still does not work for ILK, and difficulties with that caused Chris to abandon the project. This was over a year ago, so the details are a bit fuzzy. The common Mesa parts look good, though. If we want to pursue this, I can finish up the extension spec and get it published. I'd definitely be interested -- integers are really useful for 2D rendering (as evidenced by Dave's numbers), and I can do them in VC4. What I see in a glance through 1.30 that I don't have in my 2.1 implementation is: - Size queries (but I can fake it with uniforms) - Texture arrays (can't fake that without some real craziness). - Texture offsets (could fake with uniforms and some math?) - gl_VertexID (could fake with a cached VBO of integers I think. I'd be interested in whether the MESA_1.30 spec would require support for extensions that expose those things, or if I could expose it without doing all of that. signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Patch v3] Mesa: Add support for GL_OES_texture_*float* extensions.
On Tue, Dec 2, 2014 at 3:36 PM, Ian Romanick i...@freedesktop.org wrote: On 11/28/2014 01:08 PM, Kalyan Kondapally wrote: This patch adds support for following GLES2 Texture Float extensions: 1)GL_OES_texture_float, 2)GL_OES_texture_half_float, 3)GL_OES_texture_float_linear, 4)GL_OES_texture_half_float_linear. If we are using GLES context and the driver has advertised support for ARB_texture_float, then support for all texture_float* extensions is enabled. Here, we are assuming that when driver advertises support for ARB_texture_float, it should be able to support all these extensions. v2: Indentation fixes. (Brian Paul) Fixed Comments and added some to new private functions.(Brian Paul) Added assert in valid_filter_for_float.(Brian Paul) Renamed Float and HALF_FLOAT_OES as IsFloatOES and IsHalfFloatOES.(Brian Paul) adjust_for_oes_float_texture to return GLenum. (Brian Paul) Use RGB_32F internal floating point format for RGB base format. v3: Don't indent case. (Matt Turner) Enable support for float extensions in case driver supports ARB_texture_float. (Fredrik Höglund) At this point, I think this patch should be split in three: 1. Extension infrastructure bits. 2. GL_HALF_FLOAT_OES fixes. 3. The rest. I suggest this because I think the first two should be pretty much landable. The rest may still need some changes. I'd also accept a 0th patch. 0. Global s/GL_HALF_FLOAT_ARB/GL_HALF_FLOAT/g;s/GLhalfARB/GLhalf/g :) Signed-off-by: Kevin Rogovin kevin.rogo...@intel.com Signed-off-by: Kalyan Kondapally kalyan.kondapa...@intel.com --- src/mesa/main/context.c| 15 +++ src/mesa/main/extensions.c | 4 +++ src/mesa/main/glformats.c | 46 +--- src/mesa/main/glformats.h | 3 ++- src/mesa/main/mtypes.h | 6 + src/mesa/main/pack.c | 16 +++ src/mesa/main/teximage.c | 66 +- src/mesa/main/texobj.c | 53 + 8 files changed, 203 insertions(+), 6 deletions(-) diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index 400c158..8b54967 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -1544,6 +1544,21 @@ handle_first_current(struct gl_context *ctx) return; } + /* If we are using GLES context and the driver has advertised support for +* ARB_texture_float, then enable support for all texture_float* extensions. +* Here, we are assuming that when driver advertises support for +* ARB_texture_float, it should be able to support all these extensions. In +* case any particular driver doesn't want to enable these extensions or +* only a subset on GLES, then it shouldn't advertise support for +* ARB_texture_float and toggle OES_texture_float* flags accordingly. +*/ + if (ctx-Extensions.ARB_texture_float _mesa_is_gles(ctx)) { + ctx-Extensions.OES_texture_float = GL_TRUE; + ctx-Extensions.OES_texture_float_linear = GL_TRUE; + ctx-Extensions.OES_texture_half_float = GL_TRUE; + ctx-Extensions.OES_texture_half_float_linear = GL_TRUE; + } I think this is a bad idea. Right now OpenGL ES 3.0 support depends on ARB_texture_float (see compute_version_es2 in src/mesa/main/version.c). That's a superset of OpenGL ES 3.0: GL_OES_texture_float_linear is not required. Now when someone wants to enable ES3 on their GPU that supports everything except GL_OES_texture_float_linear they will have to find-and-move this code because this code occurs after the version is calculated. That person will also have to modify other drivers (that they probably cannot test). Gosh, what could go wrong? :) There are a bunch of other cases where an extension can be automatically enabled if some other conditions are met. If we care about saving a couple lines of code here and there, we should make a function that drivers explicitly call to do that. Drivers that opt-in can call this function after they have enabled all of their other extensions in their create-context path. Moreover, this code is already broken for r300. Marek said in the previous thread that r300 advertises GL_ARB_texture_float (and developers know the documented caveats there) but does not want to advertise GL_OES_texture_*_linear. For the most part in Mesa, extensions are only enabled by default in three cases: 1. All in-tree drivers were already enabling the extension. GL_ARB_texture_env_add is an example. In the long past this extension was not enabled by default. Eventually all drivers either grew support or were removed from the tree. 2. The extension is just API syntactic sugar that the driver won't even see. GL_ARB_multi_bind is an example. 3. The extension has a secondary enable, such as a number of available formats, that drivers can control. GL_ARB_get_program_binary, GL_ARB_multisample, and
Re: [Mesa-dev] [PATCH 05/10] glsl: Don't allow gl_FragData[i], with i0 in GLES shaders
On 12/01/2014 05:04 AM, Eduardo Lima Mitev wrote: The OpenGL ES Shading Language specification describes the values that may be output by a fragment shader. These are gl_FragColor and gl_FragData[0]. Multiple render targets are not supported in GLES. Fixes dEQP test: * dEQP-GLES3.functional.shaders.fragdata.invalid_assign_to_1 I think the test needs to be updated for interactions with GL_NV_draw_buffers. Right now every Mesa driver enables GL_NV_draw_buffers by default, so I don't think this check is necessary. --- src/glsl/ast_array_index.cpp | 21 - 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp index ff0c757..b507d34 100644 --- a/src/glsl/ast_array_index.cpp +++ b/src/glsl/ast_array_index.cpp @@ -46,7 +46,9 @@ ast_array_specifier::print(void) const * * This function also checks whether the array is a built-in array whose * maximum size is too small to accommodate the given index, and if so uses - * loc and state to report the error. + * loc and state to report the error. It also checks that the built-in array + * gl_FragData is not accessed with indexes greater than zero in OpenGL ES, + * where multiple render targets are not allowed. */ static void update_max_array_access(ir_rvalue *ir, int idx, YYLTYPE *loc, @@ -54,6 +56,23 @@ update_max_array_access(ir_rvalue *ir, int idx, YYLTYPE *loc, { if (ir_dereference_variable *deref_var = ir-as_dereference_variable()) { ir_variable *var = deref_var-var; + + /* Page 89 in the section 3.8 (Fragment Shaders) of the the + * OpenGL ES 2.0.25 spec says: + * The OpenGL ES Shading Language specification describes the + * values that may be output by a fragment shader. These are + * gl_FragColor and gl_FragData[0]. + * ... + * gl_FragData is supported for compatibility with the desktop + * OpenGL Shading Language, but only a single fragment color + * output is allowed in the OpenGL ES Shading Language. + */ + if (state-es_shader idx 0 + strcmp(var-name, gl_FragData) == 0) { + _mesa_glsl_error(loc, state, + multiple render targets are not supported); + } + if (idx (int)var-data.max_array_access) { var-data.max_array_access = idx; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 06/10] glsl: invariant qualifier is not valid for shader inputs in GLSL ES 3.00
On 12/01/2014 05:04 AM, Eduardo Lima Mitev wrote: From: Samuel Iglesias Gonsalvez sigles...@igalia.com GLSL ES 3.00 spec, chapter 4.6.1 The Invariant Qualifier, Only variables output from a shader can be candidates for invariance. This includes user-defined output variables and the built-in output variables. As only outputs can be declared as invariant, an invariant output from one shader stage will still match an input of a subsequent stage without the input being declared as invariant. This patch fixes the following dEQP tests: dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_interp_storage_precision dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_interp_storage dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_storage_precision dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_storage dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_interp_storage_precision_invariant_input dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_interp_storage_invariant_input dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_storage_precision_invariant_input dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_storage_invariant_input No piglit regressions observed. Signed-off-by: Samuel Iglesias Gonsalvez sigles...@igalia.com --- src/glsl/glsl_parser.yy| 2 ++ src/glsl/link_varyings.cpp | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy index 6160e26..55b3a7d 100644 --- a/src/glsl/glsl_parser.yy +++ b/src/glsl/glsl_parser.yy @@ -1591,6 +1591,8 @@ type_qualifier: $$ = $2; $$.flags.q.invariant = 1; + if (state-es_shader state-language_version = 300 $$.flags.q.in) + _mesa_glsl_error(@1, state, invariant qualifiers cannot be used with shader inputs); Since we already reject the invariant keyword for GLSL ES 1.00, I think you can drop the ' state-language_version = 300' part. I would also add the spec quotation from the commit message here. With those changes, this patch is Reviewed-by: Ian Romanick ian.d.roman...@intel.com } | interpolation_qualifier type_qualifier { diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp index 1866ab2..1fe198a 100644 --- a/src/glsl/link_varyings.cpp +++ b/src/glsl/link_varyings.cpp @@ -116,7 +116,7 @@ cross_validate_types_and_qualifiers(struct gl_shader_program *prog, return; } - if (input-data.invariant != output-data.invariant) { + if (!prog-IsES input-data.invariant != output-data.invariant) { linker_error(prog, %s shader output `%s' %s invariant qualifier, but %s shader input %s invariant qualifier\n, ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 06/10] glsl: invariant qualifier is not valid for shader inputs in GLSL ES 3.00
On 12/02/2014 12:54 PM, Ian Romanick wrote: On 12/01/2014 05:04 AM, Eduardo Lima Mitev wrote: From: Samuel Iglesias Gonsalvez sigles...@igalia.com GLSL ES 3.00 spec, chapter 4.6.1 The Invariant Qualifier, Only variables output from a shader can be candidates for invariance. This includes user-defined output variables and the built-in output variables. As only outputs can be declared as invariant, an invariant output from one shader stage will still match an input of a subsequent stage without the input being declared as invariant. This patch fixes the following dEQP tests: dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_interp_storage_precision dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_interp_storage dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_storage_precision dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_storage dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_interp_storage_precision_invariant_input dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_interp_storage_invariant_input dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_storage_precision_invariant_input dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_storage_invariant_input No piglit regressions observed. Signed-off-by: Samuel Iglesias Gonsalvez sigles...@igalia.com --- src/glsl/glsl_parser.yy| 2 ++ src/glsl/link_varyings.cpp | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy index 6160e26..55b3a7d 100644 --- a/src/glsl/glsl_parser.yy +++ b/src/glsl/glsl_parser.yy @@ -1591,6 +1591,8 @@ type_qualifier: $$ = $2; $$.flags.q.invariant = 1; + if (state-es_shader state-language_version = 300 $$.flags.q.in) + _mesa_glsl_error(@1, state, invariant qualifiers cannot be used with shader inputs); Since we already reject the invariant keyword for GLSL ES 1.00, I think you can drop the ' state-language_version = 300' part. Ignore this part. invariant is a keyword in GLSL ES 1.00. I would also add the spec quotation from the commit message here. With those changes, this patch is Reviewed-by: Ian Romanick ian.d.roman...@intel.com } | interpolation_qualifier type_qualifier { diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp index 1866ab2..1fe198a 100644 --- a/src/glsl/link_varyings.cpp +++ b/src/glsl/link_varyings.cpp @@ -116,7 +116,7 @@ cross_validate_types_and_qualifiers(struct gl_shader_program *prog, return; } - if (input-data.invariant != output-data.invariant) { + if (!prog-IsES input-data.invariant != output-data.invariant) { linker_error(prog, %s shader output `%s' %s invariant qualifier, but %s shader input %s invariant qualifier\n, ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/10] glsl: don't allow invariant qualifiers for interface blocks in GLSL ES
On 12/01/2014 05:04 AM, Eduardo Lima Mitev wrote: From: Samuel Iglesias Gonsalvez sigles...@igalia.com From GLSL ES 3.0, chapter 4.3.7 Interface Blocks, page 38: GLSL ES 3.0 does not support interface blocks for shader inputs or outputs. and from GLSL ES 3.0, chapter 4.6.1 The invariant qualifier, page 52. Only variables output from a shader can be candidates for invariance. This includes user-defined output variables and the built-in output variables. As only outputs can be declared as invariant, an invariant output from one shader stage will still match an input of a subsequent stage without the input being declared as invariant. From GLSL ES 1.0, chapter 4.6.1 The invariant qualifier, page 37. Only the following variables may be declared as invariant: * Built-in special variables output from the vertex shader * Varying variables output from the vertex shader * Built-in special variables input to the fragment shader * Varying variables input to the fragment shader * Built-in special variables output from the fragment shader. This patch fixes the following dEQP tests: dEQP-GLES3.functional.shaders.declarations.invalid_declarations.invariant_uniform_block_2_vertex dEQP-GLES3.functional.shaders.declarations.invalid_declarations.invariant_uniform_block_2_fragment I'm pretty sure you can't have an invariant uniform block in desktop OpenGL either. Should this change also apply universally to desktop OpenGL? No piglit regressions. Signed-off-by: Samuel Iglesias Gonsalvez sigles...@igalia.com --- src/glsl/glsl_parser.yy | 25 + 1 file changed, 25 insertions(+) diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy index 55b3a7d..489d3af 100644 --- a/src/glsl/glsl_parser.yy +++ b/src/glsl/glsl_parser.yy @@ -2519,6 +2519,31 @@ basic_interface_block: interface block member does not match the interface block); } + /* From GLSL ES 3.0, chapter 4.3.7 Interface Blocks, page 38: + * + * GLSL ES 3.0 does not support interface blocks for shader inputs or + * outputs. + * + * And from GLSL ES 3.0, chapter 4.6.1 The invariant qualifier, page 52. + * + * Only variables output from a shader can be candidates for + * invariance. This includes user-defined output variables and the + * built-in output variables. As only outputs can be declared as + * invariant, an invariant output from one shader stage will + * still match an input of a subsequent stage without the input being + * declared as invariant. + * + * From GLSL ES 1.0, chapter 4.6.1 The invariant qualifier, page 37. + * + * Only the following variables may be declared as invariant: + * * Built-in special variables output from the vertex shader + * * Varying variables output from the vertex shader + * * Built-in special variables input to the fragment shader + * * Varying variables input to the fragment shader + * * Built-in special variables output from the fragment shader. + */ + if (state-es_shader qualifier.flags.q.invariant) +_mesa_glsl_error(@1, state, invariant qualifiers cannot be used with interface blocks in GLSL ES); } $$ = block; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 09/10] main/glsles: return two minor digits for SHADING_LANGUAGE_VERSION
With Brian's suggested change to the commit message, this patch is Reviewed-by: Ian Romanick ian.d.roman...@intel.com On 12/01/2014 05:04 AM, Eduardo Lima Mitev wrote: From: Samuel Iglesias Gonsalvez sigles...@igalia.com For OpenGL ES 3.0 spec, the minor number for SHADING_LANGUAGE_VERSION is always two digits, matching the OpenGL ES Shading Language Specification release number. For example, this query might return the string 3.00. This patch fixes the following dEQP test: dEQP-GLES3.functional.state_query.string.shading_language_version No piglit regression observed. Signed-off-by: Samuel Iglesias Gonsalvez sigles...@igalia.com --- src/mesa/main/getstring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/main/getstring.c b/src/mesa/main/getstring.c index 431d60b..7054fc7 100644 --- a/src/mesa/main/getstring.c +++ b/src/mesa/main/getstring.c @@ -68,7 +68,7 @@ shading_language_version(struct gl_context *ctx) case API_OPENGLES2: return (ctx-Version 30) ? (const GLubyte *) OpenGL ES GLSL ES 1.0.16 - : (const GLubyte *) OpenGL ES GLSL ES 3.0; + : (const GLubyte *) OpenGL ES GLSL ES 3.00; case API_OPENGLES: /* fall-through */ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 08/10] glsl: A shader cannot redefine or overload built-in functions in GLSL ES 3.00
Oof. I'd really like to have Ken review this code. He's the expert in this area... On 12/01/2014 05:04 AM, Eduardo Lima Mitev wrote: From: Samuel Iglesias Gonsalvez sigles...@igalia.com Create a new search function to look for matching built-in functions by name and use it for built-in function redefinition or overload in GLSL ES 3.00. GLSL ES 3.0 spec, chapter 6.1 Function Definitions, page 71 A shader cannot redefine or overload built-in functions. In case of GLSL ES 1.0 spec, chapter 6.1 Function Definitions, page 54 Function names can be overloaded. This allows the same function name to be used for multiple functions, as long as the argument list types differ. If functions’ names and argument types match, then their return type and parameter qualifiers must also match. Function signature matching is based on parameter type only, no qualifiers are used. Overloading is used heavily in the built-in functions. When overloaded functions (or indeed any functions) are resolved, an exact match for the function's signature is sought. This includes exact match of array size as well. No promotion or demotion of the return type or input argument types is done. All expected combination of inputs and outputs must be defined as separate functions. So this check is specific to GLSL ES 3.00. This patch fixes the following dEQP tests: dEQP-GLES3.functional.shaders.functions.invalid.overload_builtin_function_vertex dEQP-GLES3.functional.shaders.functions.invalid.overload_builtin_function_fragment dEQP-GLES3.functional.shaders.functions.invalid.redefine_builtin_function_vertex dEQP-GLES3.functional.shaders.functions.invalid.redefine_builtin_function_fragment No piglit regressions. Signed-off-by: Samuel Iglesias Gonsalvez sigles...@igalia.com --- src/glsl/ast_to_hir.cpp| 22 ++ src/glsl/builtin_functions.cpp | 11 +++ src/glsl/ir.h | 4 3 files changed, 37 insertions(+) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index fe1e129..b7074bc 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -4167,6 +4167,28 @@ ast_function::hir(exec_list *instructions, return NULL; } } + } else { + /* From GLSL ES 3.0 spec, chapter 6.1 Function Definitions, page 71: + * + * A shader cannot redefine or overload built-in functions. + * + * While in GLSL ES 1.0 spec, chapter 6.1 Function Definitions, page + * 54, this is allowed: + * + * Function names can be overloaded. [...] Overloading is used heavily + * in the built-in functions. + * + */ + if (state-es_shader state-language_version = 300) { +/* Local shader has no exact candidates; check the built-ins. */ +_mesa_glsl_initialize_builtin_functions(); +if (_mesa_glsl_find_builtin_function_by_name(state, name)) { + YYLTYPE loc = this-get_location(); + _mesa_glsl_error( loc, state, +A shader cannot redefine or overload built-in +function `%s' in GLSL ES 3.00, name); +} + } } } diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp index bb7fbcd..f5052d3 100644 --- a/src/glsl/builtin_functions.cpp +++ b/src/glsl/builtin_functions.cpp @@ -4618,6 +4618,17 @@ _mesa_glsl_find_builtin_function(_mesa_glsl_parse_state *state, return s; } +ir_function * +_mesa_glsl_find_builtin_function_by_name(_mesa_glsl_parse_state *state, + const char *name) +{ + ir_function * f; + mtx_lock(builtins_lock); + f = builtins.shader-symbols-get_function(name); + mtx_unlock(builtins_lock); + return f; +} + gl_shader * _mesa_glsl_get_builtin_function_shader() { diff --git a/src/glsl/ir.h b/src/glsl/ir.h index a0f48b2..f2d8269 100644 --- a/src/glsl/ir.h +++ b/src/glsl/ir.h @@ -2417,6 +2417,10 @@ extern ir_function_signature * _mesa_glsl_find_builtin_function(_mesa_glsl_parse_state *state, const char *name, exec_list *actual_parameters); +extern ir_function * +_mesa_glsl_find_builtin_function_by_name(_mesa_glsl_parse_state *state, + const char *name); + extern gl_shader * _mesa_glsl_get_builtin_function_shader(void); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 08/10] glsl: A shader cannot redefine or overload built-in functions in GLSL ES 3.00
It would be nice if this could be added to the existing logic for the interaction between builtins and app-provided overloads -- or do we need to fail earlier than that? - Chris On Tue, Dec 2, 2014 at 2:04 AM, Eduardo Lima Mitev el...@igalia.com wrote: From: Samuel Iglesias Gonsalvez sigles...@igalia.com Create a new search function to look for matching built-in functions by name and use it for built-in function redefinition or overload in GLSL ES 3.00. GLSL ES 3.0 spec, chapter 6.1 Function Definitions, page 71 A shader cannot redefine or overload built-in functions. In case of GLSL ES 1.0 spec, chapter 6.1 Function Definitions, page 54 Function names can be overloaded. This allows the same function name to be used for multiple functions, as long as the argument list types differ. If functions’ names and argument types match, then their return type and parameter qualifiers must also match. Function signature matching is based on parameter type only, no qualifiers are used. Overloading is used heavily in the built-in functions. When overloaded functions (or indeed any functions) are resolved, an exact match for the function's signature is sought. This includes exact match of array size as well. No promotion or demotion of the return type or input argument types is done. All expected combination of inputs and outputs must be defined as separate functions. So this check is specific to GLSL ES 3.00. This patch fixes the following dEQP tests: dEQP-GLES3.functional.shaders.functions.invalid.overload_builtin_function_vertex dEQP-GLES3.functional.shaders.functions.invalid.overload_builtin_function_fragment dEQP-GLES3.functional.shaders.functions.invalid.redefine_builtin_function_vertex dEQP-GLES3.functional.shaders.functions.invalid.redefine_builtin_function_fragment No piglit regressions. Signed-off-by: Samuel Iglesias Gonsalvez sigles...@igalia.com --- src/glsl/ast_to_hir.cpp| 22 ++ src/glsl/builtin_functions.cpp | 11 +++ src/glsl/ir.h | 4 3 files changed, 37 insertions(+) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index fe1e129..b7074bc 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -4167,6 +4167,28 @@ ast_function::hir(exec_list *instructions, return NULL; } } + } else { + /* From GLSL ES 3.0 spec, chapter 6.1 Function Definitions, page 71: + * + * A shader cannot redefine or overload built-in functions. + * + * While in GLSL ES 1.0 spec, chapter 6.1 Function Definitions, page + * 54, this is allowed: + * + * Function names can be overloaded. [...] Overloading is used heavily + * in the built-in functions. + * + */ + if (state-es_shader state-language_version = 300) { +/* Local shader has no exact candidates; check the built-ins. */ +_mesa_glsl_initialize_builtin_functions(); +if (_mesa_glsl_find_builtin_function_by_name(state, name)) { + YYLTYPE loc = this-get_location(); + _mesa_glsl_error( loc, state, +A shader cannot redefine or overload built-in +function `%s' in GLSL ES 3.00, name); +} + } } } diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp index bb7fbcd..f5052d3 100644 --- a/src/glsl/builtin_functions.cpp +++ b/src/glsl/builtin_functions.cpp @@ -4618,6 +4618,17 @@ _mesa_glsl_find_builtin_function(_mesa_glsl_parse_state *state, return s; } +ir_function * +_mesa_glsl_find_builtin_function_by_name(_mesa_glsl_parse_state *state, + const char *name) +{ + ir_function * f; + mtx_lock(builtins_lock); + f = builtins.shader-symbols-get_function(name); + mtx_unlock(builtins_lock); + return f; +} + gl_shader * _mesa_glsl_get_builtin_function_shader() { diff --git a/src/glsl/ir.h b/src/glsl/ir.h index a0f48b2..f2d8269 100644 --- a/src/glsl/ir.h +++ b/src/glsl/ir.h @@ -2417,6 +2417,10 @@ extern ir_function_signature * _mesa_glsl_find_builtin_function(_mesa_glsl_parse_state *state, const char *name, exec_list *actual_parameters); +extern ir_function * +_mesa_glsl_find_builtin_function_by_name(_mesa_glsl_parse_state *state, + const char *name); + extern gl_shader * _mesa_glsl_get_builtin_function_shader(void); -- 2.1.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org
Re: [Mesa-dev] [PATCH] mesa: Ensure stack is realigned on x86.
On 12/02/2014 01:27 PM, Jose Fonseca wrote: From: José Fonseca jfons...@vmware.com Nowadays GCC assumes stack pointer is 16-byte aligned even on 32-bits, but that is an assumption OpenGL drivers (or any dynamic library for that matter) can affort to make as there are many closed- and open- can or cannot? affort - afford source application binaries out there that only assume 4-byte stack alignment. This fix uses force_align_arg_pointer GCC attribute, and is only a stop-gap measure. The right fix would be to pass -mstackrealign or -mincoming-stack-boundary=2 to all source fails that use any -msse* option, as there is no way to guarantee if/when GCC will decide to spill SSE registers to the stack. https://bugs.freedesktop.org/show_bug.cgi?id=86788 --- src/mesa/main/sse_minmax.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/main/sse_minmax.c b/src/mesa/main/sse_minmax.c index 222ac14..93cf2a6 100644 --- a/src/mesa/main/sse_minmax.c +++ b/src/mesa/main/sse_minmax.c @@ -31,6 +31,9 @@ #include stdint.h void +#if !defined(__x86_64__) + __attribute__((force_align_arg_pointer)) +#endif _mesa_uint_array_min_max(const unsigned *ui_indices, unsigned *min_index, unsigned *max_index, const unsigned count) { I don't remember if this code is compiled with MSVC. If so, do you also need a gcc check for the __attribute__ part? -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] st/nine: Fix vertex declarations for non-standard (usage/index)
Nine code to match vertex declaration to vs inputs was limiting the number of possible combinations. Some sm3 games have issues with that, because arbitrary (usage/index) can be used. This patch does the following changes to fix the problem: . Change the numbers given to (usage/index) combinations to uint16 . Do not put limits on the indices when it doesn't make sense . change the conversion rule (usage/index) - number to fit all combinations . Instead of having a table usage_map mapping a (usage/index) number to an input index, usage_map maps input indices to their (usage/index) Cc: 10.4 mesa-sta...@lists.freedesktop.org Tested-by: Yaroslav Andrusyak pontost...@gmail.com Acked-by: Ilia Mirkin imir...@alum.mit.edu Signed-off-by: Axel Davy axel.d...@ens.fr --- src/gallium/state_trackers/nine/nine_defines.h | 40 ++- src/gallium/state_trackers/nine/nine_ff.c | 49 +++-- src/gallium/state_trackers/nine/nine_shader.h | 2 +- src/gallium/state_trackers/nine/nine_state.c | 16 +++-- .../state_trackers/nine/vertexdeclaration9.c | 84 +++--- .../state_trackers/nine/vertexdeclaration9.h | 4 +- src/gallium/state_trackers/nine/vertexshader9.h| 2 +- 7 files changed, 89 insertions(+), 108 deletions(-) diff --git a/src/gallium/state_trackers/nine/nine_defines.h b/src/gallium/state_trackers/nine/nine_defines.h index aa3b257..4f61982 100644 --- a/src/gallium/state_trackers/nine/nine_defines.h +++ b/src/gallium/state_trackers/nine/nine_defines.h @@ -30,25 +30,27 @@ #define NINE_RESOURCE_FLAG_DUMMY(PIPE_RESOURCE_FLAG_ST_PRIV 2) /* vertexdeclaration9.c */ -unsigned nine_d3d9_to_nine_declusage(unsigned usage, unsigned index); - -#define NINE_DECLUSAGE_POSITION(i) ( 0 + (i)) -#define NINE_DECLUSAGE_BLENDWEIGHT(i) ( 5 + (i)) -#define NINE_DECLUSAGE_BLENDINDICES(i) ( 9 + (i)) -#define NINE_DECLUSAGE_NORMAL(i) (13 + (i)) -#define NINE_DECLUSAGE_PSIZE15 -#define NINE_DECLUSAGE_TEXCOORD(i) (16 + (i)) -#define NINE_DECLUSAGE_TANGENT(i) (32 + (i)) -#define NINE_DECLUSAGE_BINORMAL(i) (34 + (i)) -#define NINE_DECLUSAGE_TESSFACTOR 36 -#define NINE_DECLUSAGE_POSITIONT37 -#define NINE_DECLUSAGE_COLOR(i)(38 + (i)) -#define NINE_DECLUSAGE_DEPTH43 -#define NINE_DECLUSAGE_FOG 44 -#define NINE_DECLUSAGE_SAMPLE 45 -#define NINE_DECLUSAGE_NONE 46 -#define NINE_DECLUSAGE_LAST NINE_DECLUSAGE_NONE -#define NINE_DECLUSAGE_COUNT (NINE_DECLUSAGE_LAST + 1) +uint16_t nine_d3d9_to_nine_declusage(unsigned usage, unsigned index); + +#define NINE_DECLUSAGE_POSITION 0 +#define NINE_DECLUSAGE_BLENDWEIGHT 1 +#define NINE_DECLUSAGE_BLENDINDICES 2 +#define NINE_DECLUSAGE_NORMAL 3 +#define NINE_DECLUSAGE_TEXCOORD 4 +#define NINE_DECLUSAGE_TANGENT 5 +#define NINE_DECLUSAGE_BINORMAL 6 +#define NINE_DECLUSAGE_COLOR7 +#define NINE_DECLUSAGE_POSITIONT8 + +#define NINE_DECLUSAGE_PSIZE9 +#define NINE_DECLUSAGE_TESSFACTOR 10 +#define NINE_DECLUSAGE_DEPTH11 +#define NINE_DECLUSAGE_FOG 12 +#define NINE_DECLUSAGE_SAMPLE 13 +#define NINE_DECLUSAGE_NONE 14 +#define NINE_DECLUSAGE_COUNT(NINE_DECLUSAGE_NONE + 1) + +#define NINE_DECLUSAGE_i(declusage, n) NINE_DECLUSAGE_##declusage + n * NINE_DECLUSAGE_COUNT #define NINED3DCLEAR_DEPTHSTENCIL (D3DCLEAR_ZBUFFER | D3DCLEAR_STENCIL) diff --git a/src/gallium/state_trackers/nine/nine_ff.c b/src/gallium/state_trackers/nine/nine_ff.c index 184c411..a6bd360 100644 --- a/src/gallium/state_trackers/nine/nine_ff.c +++ b/src/gallium/state_trackers/nine/nine_ff.c @@ -275,7 +275,7 @@ struct vs_build_ctx struct ureg_program *ureg; const struct nine_ff_vs_key *key; -unsigned input[PIPE_MAX_ATTRIBS]; +uint16_t input[PIPE_MAX_ATTRIBS]; unsigned num_inputs; struct ureg_src aVtx; @@ -304,7 +304,7 @@ get_texcoord_sn(struct pipe_screen *screen) } static INLINE struct ureg_src -build_vs_add_input(struct vs_build_ctx *vs, unsigned ndecl) +build_vs_add_input(struct vs_build_ctx *vs, uint16_t ndecl) { const unsigned i = vs-num_inputs++; assert(i PIPE_MAX_ATTRIBS); @@ -370,10 +370,10 @@ nine_ff_build_vs(struct NineDevice9 *device, struct vs_build_ctx *vs) * (texture coordinates handled later) */ vs-aVtx = build_vs_add_input(vs, -key-position_t ? NINE_DECLUSAGE_POSITIONT : NINE_DECLUSAGE_POSITION(0)); +key-position_t ? NINE_DECLUSAGE_POSITIONT : NINE_DECLUSAGE_POSITION); if (need_rNrm) -vs-aNrm = build_vs_add_input(vs, NINE_DECLUSAGE_NORMAL(0)); +vs-aNrm = build_vs_add_input(vs, NINE_DECLUSAGE_NORMAL); vs-aCol[0] = ureg_imm1f(ureg, 1.0f); vs-aCol[1] = ureg_imm1f(ureg, 1.0f); @@ -382,9 +382,9 @@ nine_ff_build_vs(struct NineDevice9 *device, struct vs_build_ctx *vs)
[Mesa-dev] [PATCH 1/2] st/nine: sm1_declusage_to_tgsi, do not restrict indices with TGSI_SEMANTIC_GENERIC
With sm3, you can declare an input/output with an usage and an usage index. Nine code hardcodes the translation usage/index to a corresponding TGSI code. The translation was limited to a few usage/index combinations that were corresponding to most of the needs of games, but some games did not work. This patch rewrites that Nine code to map all possible usage/index combination to TGSI code. The index associated to TGSI_SEMANTIC_GENERIC doesn't need to be low for good performance, as the old code was supposing, and is not particularly bounded (it's UINT16). Given the index is BYTE, we can map all combinations. Cc: 10.4 mesa-sta...@lists.freedesktop.org Tested-by: Yaroslav Andrusyak pontost...@gmail.com Reviewed-by: Marek Olšák marek.ol...@amd.com Signed-off-by: Axel Davy axel.d...@ens.fr --- src/gallium/state_trackers/nine/nine_shader.c | 112 -- 1 file changed, 52 insertions(+), 60 deletions(-) diff --git a/src/gallium/state_trackers/nine/nine_shader.c b/src/gallium/state_trackers/nine/nine_shader.c index 4f78f6e..3c39b24 100644 --- a/src/gallium/state_trackers/nine/nine_shader.c +++ b/src/gallium/state_trackers/nine/nine_shader.c @@ -1677,101 +1677,93 @@ sm1_declusage_to_tgsi(struct tgsi_declaration_semantic *sem, boolean tc, struct sm1_semantic *dcl) { -const unsigned generic_base = tc ? 0 : 8; /* TEXCOORD[0..7] */ +BYTE index = dcl-usage_idx; -sem-Name = TGSI_SEMANTIC_GENERIC; -sem-Index = 0; - -/* TGSI_SEMANTIC_GENERIC assignments (+8 if !PIPE_CAP_TGSI_TEXCOORD): - * Try to put frequently used semantics at low GENERIC indices. +/* For everything that is not matching to a TGSI_SEMANTIC_, + * we match to a TGSI_SEMANTIC_GENERIC with index. + * + * The index can be anything UINT16 and usage_idx is BYTE, + * so we can fit everything. It doesn't matter if indices + * are close together or low. * - * POSITION[1..4]: 17, 27, 28, 29 - * COLOR[2..4]: 14, 15, 26 - * TEXCOORD[8..15]: 10, 11, 18, 19, 20, 21, 22, 23 - * BLENDWEIGHT[0..3]: 0, 4, 8, 12 - * BLENDINDICES[0..3]: 1, 5, 9, 13 - * NORMAL[0..1]: 2, 6 - * TANGENT[0]: 3, 24 - * BINORMAL[0]: 7, 25 - * TESSFACTOR[0]: 16 + * + * POSITION = 1: 10 * index + 6 + * COLOR = 2: 10 * (index-1) + 7 + * TEXCOORD[0..15]: index + * BLENDWEIGHT: 10 * index + 18 + * BLENDINDICES: 10 * index + 19 + * NORMAL: 10 * index + 20 + * TANGENT: 10 * index + 21 + * BINORMAL: 10 * index + 22 + * TESSFACTOR: 10 * index + 23 */ switch (dcl-usage) { case D3DDECLUSAGE_POSITION: case D3DDECLUSAGE_POSITIONT: case D3DDECLUSAGE_DEPTH: -sem-Name = TGSI_SEMANTIC_POSITION; -assert(dcl-usage_idx = 4); -if (dcl-usage_idx == 1) { -sem-Name = TGSI_SEMANTIC_GENERIC; -sem-Index = generic_base + 17; -} else -if (dcl-usage_idx = 2) { +if (index == 0) { +sem-Name = TGSI_SEMANTIC_POSITION; +sem-Index = 0; +} else { sem-Name = TGSI_SEMANTIC_GENERIC; -sem-Index = generic_base + 27 + (dcl-usage_idx - 2); +sem-Index = 10 * index + 6; } break; case D3DDECLUSAGE_COLOR: -assert(dcl-usage_idx = 4); -if (dcl-usage_idx 2) { +if (index 2) { sem-Name = TGSI_SEMANTIC_COLOR; -sem-Index = dcl-usage_idx; -} else -if (dcl-usage_idx 4) { -sem-Index = generic_base + 14 + (dcl-usage_idx - 2); +sem-Index = index; } else { -sem-Index = generic_base + 26; +sem-Name = TGSI_SEMANTIC_GENERIC; +sem-Index = 10 * (index-1) + 7; } break; case D3DDECLUSAGE_FOG: +assert(index == 0); sem-Name = TGSI_SEMANTIC_FOG; -assert(dcl-usage_idx == 0); +sem-Index = 0; break; case D3DDECLUSAGE_PSIZE: +assert(index == 0); sem-Name = TGSI_SEMANTIC_PSIZE; -assert(dcl-usage_idx == 0); +sem-Index = 0; break; case D3DDECLUSAGE_TEXCOORD: -assert(dcl-usage_idx 16); -if (dcl-usage_idx 8) { -if (tc) -sem-Name = TGSI_SEMANTIC_TEXCOORD; -sem-Index = dcl-usage_idx; -} else -if (dcl-usage_idx 10) { -sem-Index = generic_base + 10 + (dcl-usage_idx - 8); -} else { -sem-Index = generic_base + 18 + (dcl-usage_idx - 10); -} +assert(index 16); +if (index 8 tc) +sem-Name = TGSI_SEMANTIC_TEXCOORD; +else +sem-Name = TGSI_SEMANTIC_GENERIC; +sem-Index = index; break; -case D3DDECLUSAGE_BLENDWEIGHT: /* 0, 4, 8, 12 */ -assert(dcl-usage_idx 4); -sem-Index = generic_base + dcl-usage_idx * 4; +case D3DDECLUSAGE_BLENDWEIGHT: +
[Mesa-dev] [PATCH 7/8] st/nine: Queries: always succeed for D3DQUERYTYPE_TIMESTAMP when flushing
This is the behaviour that Wine tests Tested-by: David Heidelberg da...@ixit.cz Signed-off-by: Axel Davy axel.d...@ens.fr --- src/gallium/state_trackers/nine/query9.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/gallium/state_trackers/nine/query9.c b/src/gallium/state_trackers/nine/query9.c index f22e821..c87b019 100644 --- a/src/gallium/state_trackers/nine/query9.c +++ b/src/gallium/state_trackers/nine/query9.c @@ -223,6 +223,9 @@ NineQuery9_GetData( struct NineQuery9 *This, wait_query_result = TRUE; } +/* Wine tests: D3DQUERYTYPE_TIMESTAMP always succeeds */ +wait_query_result |= This-type == D3DQUERYTYPE_TIMESTAMP; + /* Note: We ignore dwGetDataFlags, because get_query_result will * flush automatically if needed */ -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/8] st/nine: Queries: Fix D3DISSUE_END behaviour.
Issuing D3DISSUE_END should: . reset previous queries if possible . end the query Previous behaviour wasn't calling end_query for queries not needing D3DISSUE_BEGIN, no resetting previous queries. This fixes several applications not launching properly. Cc: 10.4 mesa-sta...@lists.freedesktop.org Tested-by: David Heidelberg da...@ixit.cz Signed-off-by: Axel Davy axel.d...@ens.fr --- src/gallium/state_trackers/nine/query9.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/src/gallium/state_trackers/nine/query9.c b/src/gallium/state_trackers/nine/query9.c index a3184d9..dfb17b9 100644 --- a/src/gallium/state_trackers/nine/query9.c +++ b/src/gallium/state_trackers/nine/query9.c @@ -174,10 +174,12 @@ NineQuery9_Issue( struct NineQuery9 *This, pipe-begin_query(pipe, This-pq); This-state = NINE_QUERY_STATE_RUNNING; } else { -if (This-state == NINE_QUERY_STATE_RUNNING) { -pipe-end_query(pipe, This-pq); -This-state = NINE_QUERY_STATE_ENDED; -} +if (This-state != NINE_QUERY_STATE_RUNNING +This-type != D3DQUERYTYPE_EVENT +This-type != D3DQUERYTYPE_TIMESTAMP) +pipe-begin_query(pipe, This-pq); +pipe-end_query(pipe, This-pq); +This-state = NINE_QUERY_STATE_ENDED; } return D3D_OK; } -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/8] st/nine: Queries: remove dummy queries
Applications are supposed to call CreateQuery with a NULL ppQuery to know if the query is supported. We supported that. However when ppQuery was not NULL, we were accepting to create the query and were creating a dummy query even when the query is not supported. Wine has different behaviour. This patch drops the dummy queries support and matches wine behaviour. Reviewed-by: David Heidelberg da...@ixit.cz Signed-off-by: Axel Davy axel.d...@ens.fr --- src/gallium/state_trackers/nine/device9.c | 5 +- src/gallium/state_trackers/nine/query9.c | 101 +++--- 2 files changed, 12 insertions(+), 94 deletions(-) diff --git a/src/gallium/state_trackers/nine/device9.c b/src/gallium/state_trackers/nine/device9.c index d48f47d..c16f728 100644 --- a/src/gallium/state_trackers/nine/device9.c +++ b/src/gallium/state_trackers/nine/device9.c @@ -3359,8 +3359,9 @@ NineDevice9_CreateQuery( struct NineDevice9 *This, DBG(This=%p Type=%d ppQuery=%p\n, This, Type, ppQuery); -if (!ppQuery) -return nine_is_query_supported(Type); +hr = nine_is_query_supported(Type); +if (!ppQuery || hr != D3D_OK) +return hr; hr = NineQuery9_new(This, query, Type); if (FAILED(hr)) diff --git a/src/gallium/state_trackers/nine/query9.c b/src/gallium/state_trackers/nine/query9.c index 39c4435..0cb3d2e 100644 --- a/src/gallium/state_trackers/nine/query9.c +++ b/src/gallium/state_trackers/nine/query9.c @@ -54,29 +54,18 @@ d3dquerytype_to_pipe_query(D3DQUERYTYPE type) } } -#define GET_DATA_SIZE_CASE9(a)case D3DQUERYTYPE_##a: return sizeof(D3DDEVINFO_D3D9##a) -#define GET_DATA_SIZE_CASE1(a)case D3DQUERYTYPE_##a: return sizeof(D3DDEVINFO_##a) #define GET_DATA_SIZE_CASE2(a, b) case D3DQUERYTYPE_##a: return sizeof(D3DDEVINFO_##b) #define GET_DATA_SIZE_CASET(a, b) case D3DQUERYTYPE_##a: return sizeof(b) static INLINE DWORD nine_query_result_size(D3DQUERYTYPE type) { switch (type) { -GET_DATA_SIZE_CASE1(VCACHE); -GET_DATA_SIZE_CASE1(RESOURCEMANAGER); GET_DATA_SIZE_CASE2(VERTEXSTATS, D3DVERTEXSTATS); GET_DATA_SIZE_CASET(EVENT, BOOL); GET_DATA_SIZE_CASET(OCCLUSION, DWORD); GET_DATA_SIZE_CASET(TIMESTAMP, UINT64); GET_DATA_SIZE_CASET(TIMESTAMPDISJOINT, BOOL); GET_DATA_SIZE_CASET(TIMESTAMPFREQ, UINT64); -GET_DATA_SIZE_CASE9(PIPELINETIMINGS); -GET_DATA_SIZE_CASE9(INTERFACETIMINGS); -GET_DATA_SIZE_CASE2(VERTEXTIMINGS, D3D9STAGETIMINGS); -GET_DATA_SIZE_CASE2(PIXELTIMINGS, D3D9STAGETIMINGS); -GET_DATA_SIZE_CASE9(BANDWIDTHTIMINGS); -GET_DATA_SIZE_CASE9(CACHEUTILIZATION); -/* GET_DATA_SIZE_CASE1(MEMORYPRESSURE); Win7 only */ default: assert(0); return 0; @@ -123,8 +112,7 @@ NineQuery9_ctor( struct NineQuery9 *This, if (!This-pq) return E_OUTOFMEMORY; } else { -DBG(Returning dummy NineQuery9 for %s.\n, -nine_D3DQUERYTYPE_to_str(Type)); +assert(0); /* we have checked this case before */ } This-instant = @@ -178,11 +166,6 @@ NineQuery9_Issue( struct NineQuery9 *This, (dwIssueFlags == 0) || (dwIssueFlags == D3DISSUE_END), D3DERR_INVALIDCALL); -if (!This-pq) { -DBG(Issued dummy query.\n); -return D3D_OK; -} - if (dwIssueFlags == D3DISSUE_BEGIN) { if (This-state == NINE_QUERY_STATE_RUNNING) { pipe-end_query(pipe, This-pq); @@ -201,13 +184,6 @@ NineQuery9_Issue( struct NineQuery9 *This, union nine_query_result { D3DDEVINFO_D3DVERTEXSTATS vertexstats; -D3DDEVINFO_D3D9BANDWIDTHTIMINGS bandwidth; -D3DDEVINFO_VCACHE vcache; -D3DDEVINFO_RESOURCEMANAGER rm; -D3DDEVINFO_D3D9PIPELINETIMINGS pipe; -D3DDEVINFO_D3D9STAGETIMINGS stage; -D3DDEVINFO_D3D9INTERFACETIMINGS iface; -D3DDEVINFO_D3D9CACHEUTILIZATION cacheu; DWORD dw; BOOL b; UINT64 u64; @@ -220,7 +196,7 @@ NineQuery9_GetData( struct NineQuery9 *This, DWORD dwGetDataFlags ) { struct pipe_context *pipe = This-base.device-pipe; -boolean ok = !This-pq; +boolean ok; unsigned i; union pipe_query_result presult; union nine_query_result nresult; @@ -233,25 +209,19 @@ NineQuery9_GetData( struct NineQuery9 *This, user_assert(dwGetDataFlags == 0 || dwGetDataFlags == D3DGETDATA_FLUSH, D3DERR_INVALIDCALL); -if (!This-pq) { -DBG(No pipe query available.\n); -if (!dwSize) - return S_OK; -} if (This-state == NINE_QUERY_STATE_FRESH) return S_OK; +ok = pipe-get_query_result(pipe, This-pq, FALSE, presult); if (!ok) { -ok = pipe-get_query_result(pipe, This-pq, FALSE, presult); -if (!ok) { -if (dwGetDataFlags) { -if (This-state != NINE_QUERY_STATE_FLUSHED) -pipe-flush(pipe, NULL, 0); -This-state = NINE_QUERY_STATE_FLUSHED; -} -
[Mesa-dev] [PATCH 3/8] st/nine: Queries: Use gallium caps to get if queries are supported. (v2)
Some queries need the driver to advertise a cap to be supported. For example r300 doesn't support them. v2 (David): check also for PIPE_CAP_QUERY_PIPELINE_STATISTICS, fix wine tests on r300g Reviewed-by: David Heidelberg da...@ixit.cz Signed-off-by: Axel Davy axel.d...@ens.fr --- src/gallium/state_trackers/nine/device9.c | 2 +- src/gallium/state_trackers/nine/query9.c | 45 --- src/gallium/state_trackers/nine/query9.h | 2 +- 3 files changed, 25 insertions(+), 24 deletions(-) diff --git a/src/gallium/state_trackers/nine/device9.c b/src/gallium/state_trackers/nine/device9.c index c16f728..e9599b8 100644 --- a/src/gallium/state_trackers/nine/device9.c +++ b/src/gallium/state_trackers/nine/device9.c @@ -3359,7 +3359,7 @@ NineDevice9_CreateQuery( struct NineDevice9 *This, DBG(This=%p Type=%d ppQuery=%p\n, This, Type, ppQuery); -hr = nine_is_query_supported(Type); +hr = nine_is_query_supported(This-screen, Type); if (!ppQuery || hr != D3D_OK) return hr; diff --git a/src/gallium/state_trackers/nine/query9.c b/src/gallium/state_trackers/nine/query9.c index 5e30144..2be9bf4 100644 --- a/src/gallium/state_trackers/nine/query9.c +++ b/src/gallium/state_trackers/nine/query9.c @@ -23,34 +23,35 @@ #include device9.h #include query9.h #include nine_helpers.h +#include pipe/p_screen.h #include pipe/p_context.h #include util/u_math.h #include nine_dump.h #define DBG_CHANNEL DBG_QUERY -#define QUERY_TYPE_MAP_CASE(a, b) case D3DQUERYTYPE_##a: return PIPE_QUERY_##b static inline unsigned -d3dquerytype_to_pipe_query(D3DQUERYTYPE type) +d3dquerytype_to_pipe_query(struct pipe_screen *screen, D3DQUERYTYPE type) { switch (type) { -QUERY_TYPE_MAP_CASE(EVENT, GPU_FINISHED); -QUERY_TYPE_MAP_CASE(OCCLUSION, OCCLUSION_COUNTER); -QUERY_TYPE_MAP_CASE(TIMESTAMP, TIMESTAMP); -QUERY_TYPE_MAP_CASE(TIMESTAMPDISJOINT, TIMESTAMP_DISJOINT); -QUERY_TYPE_MAP_CASE(TIMESTAMPFREQ, TIMESTAMP_DISJOINT); -QUERY_TYPE_MAP_CASE(VERTEXSTATS, PIPELINE_STATISTICS); -case D3DQUERYTYPE_VCACHE: -case D3DQUERYTYPE_RESOURCEMANAGER: -case D3DQUERYTYPE_PIPELINETIMINGS: -case D3DQUERYTYPE_INTERFACETIMINGS: -case D3DQUERYTYPE_VERTEXTIMINGS: -case D3DQUERYTYPE_PIXELTIMINGS: -case D3DQUERYTYPE_BANDWIDTHTIMINGS: -case D3DQUERYTYPE_CACHEUTILIZATION: - return PIPE_QUERY_TYPES; -default: -return ~0; +case D3DQUERYTYPE_EVENT: +return PIPE_QUERY_GPU_FINISHED; +case D3DQUERYTYPE_OCCLUSION: +return screen-get_param(screen, PIPE_CAP_OCCLUSION_QUERY) ? + PIPE_QUERY_OCCLUSION_COUNTER : PIPE_QUERY_TYPES; +case D3DQUERYTYPE_TIMESTAMP: +return screen-get_param(screen, PIPE_CAP_QUERY_TIMESTAMP) ? + PIPE_QUERY_TIMESTAMP : PIPE_QUERY_TYPES; +case D3DQUERYTYPE_TIMESTAMPDISJOINT: +case D3DQUERYTYPE_TIMESTAMPFREQ: +return screen-get_param(screen, PIPE_CAP_QUERY_TIMESTAMP) ? + PIPE_QUERY_TIMESTAMP_DISJOINT : PIPE_QUERY_TYPES; +case D3DQUERYTYPE_VERTEXSTATS: +return screen-get_param(screen, +PIPE_CAP_QUERY_PIPELINE_STATISTICS) ? + PIPE_QUERY_PIPELINE_STATISTICS : PIPE_QUERY_TYPES; +default: +return PIPE_QUERY_TYPES; /* Query not supported */ } } @@ -73,9 +74,9 @@ nine_query_result_size(D3DQUERYTYPE type) } HRESULT -nine_is_query_supported(D3DQUERYTYPE type) +nine_is_query_supported(struct pipe_screen *screen, D3DQUERYTYPE type) { -const unsigned ptype = d3dquerytype_to_pipe_query(type); +const unsigned ptype = d3dquerytype_to_pipe_query(screen, type); user_assert(ptype != ~0, D3DERR_INVALIDCALL); @@ -93,7 +94,7 @@ NineQuery9_ctor( struct NineQuery9 *This, D3DQUERYTYPE Type ) { struct pipe_context *pipe = pParams-device-pipe; -const unsigned ptype = d3dquerytype_to_pipe_query(Type); +const unsigned ptype = d3dquerytype_to_pipe_query(pParams-device-screen, Type); HRESULT hr; DBG(This=%p pParams=%p Type=%d\n, This, pParams, Type); diff --git a/src/gallium/state_trackers/nine/query9.h b/src/gallium/state_trackers/nine/query9.h index abd4352..ad1ca50 100644 --- a/src/gallium/state_trackers/nine/query9.h +++ b/src/gallium/state_trackers/nine/query9.h @@ -48,7 +48,7 @@ NineQuery9( void *data ) } HRESULT -nine_is_query_supported(D3DQUERYTYPE); +nine_is_query_supported(struct pipe_screen *screen, D3DQUERYTYPE); HRESULT NineQuery9_new( struct NineDevice9 *Device, -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/8] st/nine: Queries: Remove flush logic
get_query_result flushes automatically, we don't need to flush. Signed-off-by: Axel Davy axel.d...@ens.fr --- src/gallium/state_trackers/nine/query9.c | 13 + src/gallium/state_trackers/nine/query9.h | 1 - 2 files changed, 5 insertions(+), 9 deletions(-) diff --git a/src/gallium/state_trackers/nine/query9.c b/src/gallium/state_trackers/nine/query9.c index 0cb3d2e..5e30144 100644 --- a/src/gallium/state_trackers/nine/query9.c +++ b/src/gallium/state_trackers/nine/query9.c @@ -212,15 +212,12 @@ NineQuery9_GetData( struct NineQuery9 *This, if (This-state == NINE_QUERY_STATE_FRESH) return S_OK; +/* Note: We ignore dwGetDataFlags, because get_query_result will + * flush automatically if needed */ + ok = pipe-get_query_result(pipe, This-pq, FALSE, presult); -if (!ok) { -if (dwGetDataFlags) { -if (This-state != NINE_QUERY_STATE_FLUSHED) -pipe-flush(pipe, NULL, 0); -This-state = NINE_QUERY_STATE_FLUSHED; -} -return S_FALSE; -} + +if (!ok) return S_FALSE; if (!dwSize) return S_OK; diff --git a/src/gallium/state_trackers/nine/query9.h b/src/gallium/state_trackers/nine/query9.h index f08393f..abd4352 100644 --- a/src/gallium/state_trackers/nine/query9.h +++ b/src/gallium/state_trackers/nine/query9.h @@ -30,7 +30,6 @@ enum nine_query_state NINE_QUERY_STATE_FRESH = 0, NINE_QUERY_STATE_RUNNING, NINE_QUERY_STATE_ENDED, -NINE_QUERY_STATE_FLUSHED }; struct NineQuery9 -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/8] st/nine: Queries: return S_FALSE instead of INVALIDCALL when in building query state
It is the same behaviour as wine has. Reviewed-by: David Heidelberg da...@ixit.cz Signed-off-by: Axel Davy axel.d...@ens.fr --- src/gallium/state_trackers/nine/query9.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/nine/query9.c b/src/gallium/state_trackers/nine/query9.c index 2be9bf4..a3184d9 100644 --- a/src/gallium/state_trackers/nine/query9.c +++ b/src/gallium/state_trackers/nine/query9.c @@ -205,7 +205,10 @@ NineQuery9_GetData( struct NineQuery9 *This, DBG(This=%p pData=%p dwSize=%d dwGetDataFlags=%d\n, This, pData, dwSize, dwGetDataFlags); -user_assert(This-state != NINE_QUERY_STATE_RUNNING, D3DERR_INVALIDCALL); +/* according to spec we should return D3DERR_INVALIDCALL here, but + * wine returns S_FALSE because it is apparently the behaviour + * on windows */ +user_assert(This-state != NINE_QUERY_STATE_RUNNING, S_FALSE); user_assert(dwSize == 0 || pData, D3DERR_INVALIDCALL); user_assert(dwGetDataFlags == 0 || dwGetDataFlags == D3DGETDATA_FLUSH, D3DERR_INVALIDCALL); -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 6/8] st/nine: Queries: allow app to call GetData without Issuing first
Nine was allowing that behaviour, but was not filling the result. Tested-by: David Heidelberg da...@ixit.cz Signed-off-by: Axel Davy axel.d...@ens.fr --- src/gallium/state_trackers/nine/query9.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/src/gallium/state_trackers/nine/query9.c b/src/gallium/state_trackers/nine/query9.c index dfb17b9..f22e821 100644 --- a/src/gallium/state_trackers/nine/query9.c +++ b/src/gallium/state_trackers/nine/query9.c @@ -199,7 +199,7 @@ NineQuery9_GetData( struct NineQuery9 *This, DWORD dwGetDataFlags ) { struct pipe_context *pipe = This-base.device-pipe; -boolean ok; +boolean ok, wait_query_result = FALSE; unsigned i; union pipe_query_result presult; union nine_query_result nresult; @@ -215,13 +215,18 @@ NineQuery9_GetData( struct NineQuery9 *This, user_assert(dwGetDataFlags == 0 || dwGetDataFlags == D3DGETDATA_FLUSH, D3DERR_INVALIDCALL); -if (This-state == NINE_QUERY_STATE_FRESH) -return S_OK; +if (This-state == NINE_QUERY_STATE_FRESH) { +/* App forgot calling Issue. call it for it. + * However Wine states that return value should + * be S_OK, so wait for the result to return S_OK. */ +NineQuery9_Issue(This, D3DISSUE_END); +wait_query_result = TRUE; +} /* Note: We ignore dwGetDataFlags, because get_query_result will * flush automatically if needed */ -ok = pipe-get_query_result(pipe, This-pq, FALSE, presult); +ok = pipe-get_query_result(pipe, This-pq, wait_query_result, presult); if (!ok) return S_FALSE; -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 8/8] st/nine: Queries: Always return D3D_OK when issuing with D3DISSUE_BEGIN
This is the behaviour that wine tests. Reviewed-by: David Heidelberg da...@ixit.cz Signed-off-by: Axel Davy axel.d...@ens.fr --- src/gallium/state_trackers/nine/query9.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/nine/query9.c b/src/gallium/state_trackers/nine/query9.c index c87b019..163b818 100644 --- a/src/gallium/state_trackers/nine/query9.c +++ b/src/gallium/state_trackers/nine/query9.c @@ -163,10 +163,15 @@ NineQuery9_Issue( struct NineQuery9 *This, DBG(This=%p dwIssueFlags=%d\n, This, dwIssueFlags); -user_assert((dwIssueFlags == D3DISSUE_BEGIN !This-instant) || +user_assert((dwIssueFlags == D3DISSUE_BEGIN) || (dwIssueFlags == 0) || (dwIssueFlags == D3DISSUE_END), D3DERR_INVALIDCALL); +/* Wine tests: always return D3D_OK on D3DISSUE_BEGIN + * even when the call is supposed to be forbidden */ +if (dwIssueFlags == D3DISSUE_BEGIN This-instant) +return D3D_OK; + if (dwIssueFlags == D3DISSUE_BEGIN) { if (This-state == NINE_QUERY_STATE_RUNNING) { pipe-end_query(pipe, This-pq); -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/4] i965: Make Gen4-5 and Gen8+ ALT checks use ctx-_Shader too.
On 12/02/2014 03:51 AM, Kenneth Graunke wrote: Commit c0347705 changed the Gen6-7 code to use ctx-_Shader rather than ctx-Shader, but neglected to change the Gen4-5 or Gen8+ code. This might fix SSO related bugs, but ALT mode is only used for ARB programs, so if there's an actual problem, it's likely no one would run into it. I think the Gen8 code may not have existed upstream when I made the other changes. I'm not sure how the Gen4-5 parts got missed... This patch is Reviewed-by: Ian Romanick ian.d.roman...@intel.com Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_vs_state.c | 2 +- src/mesa/drivers/dri/i965/brw_wm_state.c | 2 +- src/mesa/drivers/dri/i965/gen8_ps_state.c | 2 +- src/mesa/drivers/dri/i965/gen8_vs_state.c | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vs_state.c b/src/mesa/drivers/dri/i965/brw_vs_state.c index 998a225..abd6771 100644 --- a/src/mesa/drivers/dri/i965/brw_vs_state.c +++ b/src/mesa/drivers/dri/i965/brw_vs_state.c @@ -60,7 +60,7 @@ brw_upload_vs_unit(struct brw_context *brw) /* Use ALT floating point mode for ARB vertex programs, because they * require 0^0 == 1. */ - if (brw-ctx.Shader.CurrentProgram[MESA_SHADER_VERTEX] == NULL) + if (brw-ctx._Shader-CurrentProgram[MESA_SHADER_VERTEX] == NULL) vs-thread1.floating_point_mode = BRW_FLOATING_POINT_NON_IEEE_754; else vs-thread1.floating_point_mode = BRW_FLOATING_POINT_IEEE_754; diff --git a/src/mesa/drivers/dri/i965/brw_wm_state.c b/src/mesa/drivers/dri/i965/brw_wm_state.c index 12cbc72..d2903c7 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_state.c @@ -119,7 +119,7 @@ brw_upload_wm_unit(struct brw_context *brw) * rendering, CurrentProgram[MESA_SHADER_FRAGMENT] is used for this check * to differentiate between the GLSL and non-GLSL cases. */ - if (ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT] == NULL) + if (ctx-_Shader-CurrentProgram[MESA_SHADER_FRAGMENT] == NULL) wm-thread1.floating_point_mode = BRW_FLOATING_POINT_NON_IEEE_754; else wm-thread1.floating_point_mode = BRW_FLOATING_POINT_IEEE_754; diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c b/src/mesa/drivers/dri/i965/gen8_ps_state.c index 3aa0ef3..a3ce1d4 100644 --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c @@ -146,7 +146,7 @@ upload_ps_state(struct brw_context *brw) * rendering, CurrentFragmentProgram is used for this check to * differentiate between the GLSL and non-GLSL cases. */ - if (ctx-Shader.CurrentProgram[MESA_SHADER_FRAGMENT] == NULL) + if (ctx-_Shader-CurrentProgram[MESA_SHADER_FRAGMENT] == NULL) dw3 |= GEN7_PS_FLOATING_POINT_MODE_ALT; /* 3DSTATE_PS expects the number of threads per PSD, which is always 64; diff --git a/src/mesa/drivers/dri/i965/gen8_vs_state.c b/src/mesa/drivers/dri/i965/gen8_vs_state.c index 00f2731..5a2021f 100644 --- a/src/mesa/drivers/dri/i965/gen8_vs_state.c +++ b/src/mesa/drivers/dri/i965/gen8_vs_state.c @@ -42,7 +42,7 @@ upload_vs_state(struct brw_context *brw) /* Use ALT floating point mode for ARB vertex programs, because they * require 0^0 == 1. */ - if (ctx-Shader.CurrentProgram[MESA_SHADER_VERTEX] == NULL) + if (ctx-_Shader-CurrentProgram[MESA_SHADER_VERTEX] == NULL) floating_point_mode = GEN6_VS_FLOATING_POINT_MODE_ALT; BEGIN_BATCH(9); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/4] i965: Store floating point mode choice in brw_stage_prog_data.
This patch is Reviewed-by: Ian Romanick ian.d.roman...@intel.com On 12/02/2014 03:51 AM, Kenneth Graunke wrote: We use IEEE mode for GLSL programs, but need to use ALT mode for ARB programs so that 0^0 == 1. The choice is based entirely on the shader source language. Previously, our code to determine which mode we wanted was duplicated in 8 different places (VS and FS for Gen4-5, Gen6, Gen7, and Gen8). The ctx-_Shader-CurrentProgram[stage] == NULL check was confusing as well - we use CurrentProgram (non-derived state), but _Shader (derived state). It also relies on knowing that ARB programs don't use gl_shader_program structures today. The compiler already makes this assumption in a few places, but I'd rather keep that assumption out of the state upload code. With this patch, we select the mode at compile time, and store that choice in prog_data. The state upload code simply uses that decision. This eliminates a BRW_NEW_*_PROGRAM dependency in the state upload code. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/brw_context.h | 2 ++ src/mesa/drivers/dri/i965/brw_vs.c| 4 src/mesa/drivers/dri/i965/brw_vs_state.c | 5 + src/mesa/drivers/dri/i965/brw_wm.c| 4 src/mesa/drivers/dri/i965/brw_wm_state.c | 7 +-- src/mesa/drivers/dri/i965/gen6_vs_state.c | 6 +- src/mesa/drivers/dri/i965/gen6_wm_state.c | 7 +-- src/mesa/drivers/dri/i965/gen7_vs_state.c | 6 +- src/mesa/drivers/dri/i965/gen7_wm_state.c | 8 +--- src/mesa/drivers/dri/i965/gen8_ps_state.c | 7 +-- src/mesa/drivers/dri/i965/gen8_vs_state.c | 5 + 11 files changed, 18 insertions(+), 43 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index b4ddc17..4bb3b17 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -355,6 +355,8 @@ struct brw_stage_prog_data { */ unsigned dispatch_grf_start_reg; + bool use_alt_mode; /** Use ALT floating point mode? Otherwise, IEEE. */ + /* Pointers to tracked values (only valid once * _mesa_load_state_parameters has been called at runtime). * diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c index 2f628e5..970d86c 100644 --- a/src/mesa/drivers/dri/i965/brw_vs.c +++ b/src/mesa/drivers/dri/i965/brw_vs.c @@ -209,6 +209,10 @@ do_vs_prog(struct brw_context *brw, memcpy(c.key, key, sizeof(*key)); memset(prog_data, 0, sizeof(prog_data)); + /* Use ALT floating point mode for ARB programs so that 0^0 == 1. */ + if (!prog) + stage_prog_data-use_alt_mode = true; + mem_ctx = ralloc_context(NULL); c.vp = vp; diff --git a/src/mesa/drivers/dri/i965/brw_vs_state.c b/src/mesa/drivers/dri/i965/brw_vs_state.c index abd6771..5371f71 100644 --- a/src/mesa/drivers/dri/i965/brw_vs_state.c +++ b/src/mesa/drivers/dri/i965/brw_vs_state.c @@ -57,10 +57,7 @@ brw_upload_vs_unit(struct brw_context *brw) stage_state-prog_offset + (vs-thread0.grf_reg_count 1)) 6; - /* Use ALT floating point mode for ARB vertex programs, because they -* require 0^0 == 1. -*/ - if (brw-ctx._Shader-CurrentProgram[MESA_SHADER_VERTEX] == NULL) + if (brw-vs.prog_data-base.base.use_alt_mode) vs-thread1.floating_point_mode = BRW_FLOATING_POINT_NON_IEEE_754; else vs-thread1.floating_point_mode = BRW_FLOATING_POINT_IEEE_754; diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c index 7badb23..e7939f0 100644 --- a/src/mesa/drivers/dri/i965/brw_wm.c +++ b/src/mesa/drivers/dri/i965/brw_wm.c @@ -182,6 +182,10 @@ bool do_wm_prog(struct brw_context *brw, prog_data.computed_depth_mode = computed_depth_mode(fp-program); + /* Use ALT floating point mode for ARB programs so that 0^0 == 1. */ + if (!prog) + prog_data.base.use_alt_mode = true; + /* Allocate the references to the uniforms that will end up in the * prog_data associated with the compiled program, and which will be freed * by the state cache. diff --git a/src/mesa/drivers/dri/i965/brw_wm_state.c b/src/mesa/drivers/dri/i965/brw_wm_state.c index d2903c7..0dee1f8 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_state.c @@ -114,12 +114,7 @@ brw_upload_wm_unit(struct brw_context *brw) (wm-wm9.grf_reg_count_2 1)) 6; wm-thread1.depth_coef_urb_read_offset = 1; - /* Use ALT floating point mode for ARB fragment programs, because they -* require 0^0 == 1. Even though _CurrentFragmentProgram is used for -* rendering, CurrentProgram[MESA_SHADER_FRAGMENT] is used for this check -* to differentiate between the GLSL and non-GLSL cases. -*/ - if (ctx-_Shader-CurrentProgram[MESA_SHADER_FRAGMENT]
Re: [Mesa-dev] [PATCH 3/8] st/nine: Queries: Use gallium caps to get if queries are supported. (v2)
On Tue, Dec 2, 2014 at 4:12 PM, Axel Davy axel.d...@ens.fr wrote: Some queries need the driver to advertise a cap to be supported. For example r300 doesn't support them. v2 (David): check also for PIPE_CAP_QUERY_PIPELINE_STATISTICS, fix wine tests on r300g Reviewed-by: David Heidelberg da...@ixit.cz Signed-off-by: Axel Davy axel.d...@ens.fr --- src/gallium/state_trackers/nine/device9.c | 2 +- src/gallium/state_trackers/nine/query9.c | 45 --- src/gallium/state_trackers/nine/query9.h | 2 +- 3 files changed, 25 insertions(+), 24 deletions(-) diff --git a/src/gallium/state_trackers/nine/device9.c b/src/gallium/state_trackers/nine/device9.c index c16f728..e9599b8 100644 --- a/src/gallium/state_trackers/nine/device9.c +++ b/src/gallium/state_trackers/nine/device9.c @@ -3359,7 +3359,7 @@ NineDevice9_CreateQuery( struct NineDevice9 *This, DBG(This=%p Type=%d ppQuery=%p\n, This, Type, ppQuery); -hr = nine_is_query_supported(Type); +hr = nine_is_query_supported(This-screen, Type); if (!ppQuery || hr != D3D_OK) return hr; diff --git a/src/gallium/state_trackers/nine/query9.c b/src/gallium/state_trackers/nine/query9.c index 5e30144..2be9bf4 100644 --- a/src/gallium/state_trackers/nine/query9.c +++ b/src/gallium/state_trackers/nine/query9.c @@ -23,34 +23,35 @@ #include device9.h #include query9.h #include nine_helpers.h +#include pipe/p_screen.h #include pipe/p_context.h #include util/u_math.h #include nine_dump.h #define DBG_CHANNEL DBG_QUERY -#define QUERY_TYPE_MAP_CASE(a, b) case D3DQUERYTYPE_##a: return PIPE_QUERY_##b static inline unsigned -d3dquerytype_to_pipe_query(D3DQUERYTYPE type) +d3dquerytype_to_pipe_query(struct pipe_screen *screen, D3DQUERYTYPE type) { switch (type) { -QUERY_TYPE_MAP_CASE(EVENT, GPU_FINISHED); -QUERY_TYPE_MAP_CASE(OCCLUSION, OCCLUSION_COUNTER); -QUERY_TYPE_MAP_CASE(TIMESTAMP, TIMESTAMP); -QUERY_TYPE_MAP_CASE(TIMESTAMPDISJOINT, TIMESTAMP_DISJOINT); -QUERY_TYPE_MAP_CASE(TIMESTAMPFREQ, TIMESTAMP_DISJOINT); -QUERY_TYPE_MAP_CASE(VERTEXSTATS, PIPELINE_STATISTICS); -case D3DQUERYTYPE_VCACHE: -case D3DQUERYTYPE_RESOURCEMANAGER: -case D3DQUERYTYPE_PIPELINETIMINGS: -case D3DQUERYTYPE_INTERFACETIMINGS: -case D3DQUERYTYPE_VERTEXTIMINGS: -case D3DQUERYTYPE_PIXELTIMINGS: -case D3DQUERYTYPE_BANDWIDTHTIMINGS: -case D3DQUERYTYPE_CACHEUTILIZATION: - return PIPE_QUERY_TYPES; -default: -return ~0; +case D3DQUERYTYPE_EVENT: The convention throughout mesa is to align 'case' with 'switch'. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: Ensure stack is realigned on x86.
On 02/12/14 21:10, Brian Paul wrote: On 12/02/2014 01:27 PM, Jose Fonseca wrote: From: José Fonseca jfons...@vmware.com Nowadays GCC assumes stack pointer is 16-byte aligned even on 32-bits, but that is an assumption OpenGL drivers (or any dynamic library for that matter) can affort to make as there are many closed- and open- can or cannot? Oops. cannot. affort - afford Thanks. source application binaries out there that only assume 4-byte stack alignment. This fix uses force_align_arg_pointer GCC attribute, and is only a stop-gap measure. The right fix would be to pass -mstackrealign or -mincoming-stack-boundary=2 to all source fails that use any -msse* option, as there is no way to guarantee if/when GCC will decide to spill SSE registers to the stack. https://bugs.freedesktop.org/show_bug.cgi?id=86788 --- src/mesa/main/sse_minmax.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/main/sse_minmax.c b/src/mesa/main/sse_minmax.c index 222ac14..93cf2a6 100644 --- a/src/mesa/main/sse_minmax.c +++ b/src/mesa/main/sse_minmax.c @@ -31,6 +31,9 @@ #include stdint.h void +#if !defined(__x86_64__) + __attribute__((force_align_arg_pointer)) +#endif _mesa_uint_array_min_max(const unsigned *ui_indices, unsigned *min_index, unsigned *max_index, const unsigned count) { I don't remember if this code is compiled with MSVC. If so, do you also need a gcc check for the __attribute__ part? No, it doesn't compile on MSVC yet -- there are a few non-standard GCC __attribute__ below already, plus MSVC doesn't define __SSE4_1__, so the whole thing would be no-oped. Getting this to compile on MSVC/MinGW is something I rather leave to another time, as it will involve updating scons to set -msse4.1 correctly, and port a few defines from src/gallium/include/pipe/p_config.h and src/gallium/auxiliary/util/u_sse.h for portability. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/8] st/nine: Queries: Fix D3DISSUE_END behaviour.
On Tue, Dec 2, 2014 at 4:12 PM, Axel Davy axel.d...@ens.fr wrote: Issuing D3DISSUE_END should: . reset previous queries if possible . end the query Previous behaviour wasn't calling end_query for queries not needing D3DISSUE_BEGIN, no resetting noR resetting previous queries. This fixes several applications not launching properly. Cc: 10.4 mesa-sta...@lists.freedesktop.org Tested-by: David Heidelberg da...@ixit.cz Signed-off-by: Axel Davy axel.d...@ens.fr --- src/gallium/state_trackers/nine/query9.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/src/gallium/state_trackers/nine/query9.c b/src/gallium/state_trackers/nine/query9.c index a3184d9..dfb17b9 100644 --- a/src/gallium/state_trackers/nine/query9.c +++ b/src/gallium/state_trackers/nine/query9.c @@ -174,10 +174,12 @@ NineQuery9_Issue( struct NineQuery9 *This, pipe-begin_query(pipe, This-pq); This-state = NINE_QUERY_STATE_RUNNING; } else { -if (This-state == NINE_QUERY_STATE_RUNNING) { -pipe-end_query(pipe, This-pq); -This-state = NINE_QUERY_STATE_ENDED; -} +if (This-state != NINE_QUERY_STATE_RUNNING +This-type != D3DQUERYTYPE_EVENT +This-type != D3DQUERYTYPE_TIMESTAMP) +pipe-begin_query(pipe, This-pq); +pipe-end_query(pipe, This-pq); +This-state = NINE_QUERY_STATE_ENDED; } return D3D_OK; } -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] [RFC] MESA_shading_language_130: is this it?
On 12/02/2014 12:51 PM, Eric Anholt wrote: Ian Romanick i...@freedesktop.org writes: On 11/26/2014 06:09 PM, Dave Airlie wrote: Glamor is 4x faster on my ILK using glsl 130 at core text using x11perf -ftext. Ian started writing a spec for this extension a while back, which seems like most of the work, this patch seems to do enough, to advertise GLSL 1.30. Yeah... I started writing the extension when Chris Forbes was working on adding GLSL 1.30 for Ironlake. I seem to recall that gl_ClipDistance still does not work for ILK, and difficulties with that caused Chris to abandon the project. This was over a year ago, so the details are a bit fuzzy. The common Mesa parts look good, though. If we want to pursue this, I can finish up the extension spec and get it published. I'd definitely be interested -- integers are really useful for 2D rendering (as evidenced by Dave's numbers), and I can do them in VC4. What I see in a glance through 1.30 that I don't have in my 2.1 implementation is: - Size queries (but I can fake it with uniforms) - Texture arrays (can't fake that without some real craziness). - Texture offsets (could fake with uniforms and some math?) - gl_VertexID (could fake with a cached VBO of integers I think. I'd be interested in whether the MESA_1.30 spec would require support for extensions that expose those things, or if I could expose it without doing all of that. I think it would be easy to have an interaction with GL_ARB_texture_array at least. I think the easiest thing is: Interactions with EXT_texture_array and OpenGL 3.0 If EXT_texture_array or OpenGL 3.0 are not supported, shaders using sampler1DArray, sampler2DArray, isampler1DArray, isampler2DArray, usampler1DArray, usampler2DArray, sampler1DArrayShadow, or sampler2DArrayShadow types or built-in functions related to those types will compile and link. However, since textures for those targets cannot be created, these shaders will fail draw-time validation checks. Would that work? I think we'd need a nearly identical interaction for EXT_texture_integer. There may be other similar cases. For the other things... it seems better to leave them in as-is. GLSL 1.30 is largely a roll-up of existing extensions. The idea behind MESA_shading_language_130 was to provide enough API infrastructure to make GLSL 1.30 shaders work. If we subtract things from GLSL 1.30 due to missing extensions, it seems like the value of the Mesa extension is diminished. If we're really just trying to enable glamor, maybe it would be better to have an extension that just enables true integer support in GLSL and the API. Then we could enable it on Gen4 too. Is there other hardware that would benefit from MESA_shading_language_130? A hypothetical MESA_shading_language_integer extension? signature.asc Description: OpenPGP digital signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 8/8] st/nine: Queries: Always return D3D_OK when issuing with D3DISSUE_BEGIN
With the very minor corrections I've mentioned earlier, series is Reviwed-by: Ilia Mirkin imir...@alum.mit.edu On Tue, Dec 2, 2014 at 4:12 PM, Axel Davy axel.d...@ens.fr wrote: This is the behaviour that wine tests. Reviewed-by: David Heidelberg da...@ixit.cz Signed-off-by: Axel Davy axel.d...@ens.fr --- src/gallium/state_trackers/nine/query9.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/nine/query9.c b/src/gallium/state_trackers/nine/query9.c index c87b019..163b818 100644 --- a/src/gallium/state_trackers/nine/query9.c +++ b/src/gallium/state_trackers/nine/query9.c @@ -163,10 +163,15 @@ NineQuery9_Issue( struct NineQuery9 *This, DBG(This=%p dwIssueFlags=%d\n, This, dwIssueFlags); -user_assert((dwIssueFlags == D3DISSUE_BEGIN !This-instant) || +user_assert((dwIssueFlags == D3DISSUE_BEGIN) || (dwIssueFlags == 0) || (dwIssueFlags == D3DISSUE_END), D3DERR_INVALIDCALL); +/* Wine tests: always return D3D_OK on D3DISSUE_BEGIN + * even when the call is supposed to be forbidden */ +if (dwIssueFlags == D3DISSUE_BEGIN This-instant) +return D3D_OK; + if (dwIssueFlags == D3DISSUE_BEGIN) { if (This-state == NINE_QUERY_STATE_RUNNING) { pipe-end_query(pipe, This-pq); -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 86944] glsl_parser_extras.cpp, line 1455: Error: Badly formed expression.
https://bugs.freedesktop.org/show_bug.cgi?id=86944 Bug ID: 86944 Summary: glsl_parser_extras.cpp, line 1455: Error: Badly formed expression. Product: Mesa Version: git Hardware: x86 (IA32) OS: Solaris Status: NEW Keywords: bisected, regression Severity: normal Priority: medium Component: Mesa core Assignee: mesa-dev@lists.freedesktop.org Reporter: v...@freedesktop.org CC: ja...@jlekstrand.net, matts...@gmail.com mesa: 4e6244e80f7dd6dad526ff04f5103ed24d61d38a (master 10.5.0-devel) Build error with Oracle Studio. $ gmake [...] ../../src/glsl/glsl_parser_extras.cpp, line 58: Warning: stage hides _mesa_glsl_parse_state::stage. ../../src/glsl/glsl_parser_extras.cpp, line 864: Warning: new_scope hides ast_compound_statement::new_scope. ../../src/glsl/glsl_parser_extras.cpp, line 865: Warning: statements hides ast_compound_statement::statements. ../../src/glsl/glsl_parser_extras.cpp, line 998: Warning: oper hides ast_expression::oper. ../../src/glsl/glsl_parser_extras.cpp, line 1090: Warning: identifier hides ast_declaration::identifier. ../../src/glsl/glsl_parser_extras.cpp, line 1091: Warning: array_specifier hides ast_declaration::array_specifier. ../../src/glsl/glsl_parser_extras.cpp, line 1092: Warning: initializer hides ast_declaration::initializer. ../../src/glsl/glsl_parser_extras.cpp, line 1123: Warning: type hides ast_declarator_list::type. ../../src/glsl/glsl_parser_extras.cpp, line 1154: Warning: mode hides ast_jump_statement::mode. ../../src/glsl/glsl_parser_extras.cpp, line 1181: Warning: condition hides ast_selection_statement::condition. ../../src/glsl/glsl_parser_extras.cpp, line 1182: Warning: then_statement hides ast_selection_statement::then_statement. ../../src/glsl/glsl_parser_extras.cpp, line 1183: Warning: else_statement hides ast_selection_statement::else_statement. ../../src/glsl/glsl_parser_extras.cpp, line 1202: Warning: test_expression hides ast_switch_statement::test_expression. ../../src/glsl/glsl_parser_extras.cpp, line 1203: Warning: body hides ast_switch_statement::body. ../../src/glsl/glsl_parser_extras.cpp, line 1221: Warning: stmts hides ast_switch_body::stmts. ../../src/glsl/glsl_parser_extras.cpp, line 1239: Warning: test_value hides ast_case_label::test_value. ../../src/glsl/glsl_parser_extras.cpp, line 1269: Warning: labels hides ast_case_statement::labels. ../../src/glsl/glsl_parser_extras.cpp, line 1329: Warning: mode hides ast_iteration_statement::mode. ../../src/glsl/glsl_parser_extras.cpp, line 1331: Warning: condition hides ast_iteration_statement::condition. ../../src/glsl/glsl_parser_extras.cpp, line 1332: Warning: rest_expression hides ast_iteration_statement::rest_expression. ../../src/glsl/glsl_parser_extras.cpp, line 1333: Warning: body hides ast_iteration_statement::body. ../../src/glsl/glsl_parser_extras.cpp, line 1455: Error: Badly formed expression. commit 9db278d0e2b3bf35b28f00ab7ec3392443aae6b3 Author: Matt Turner matts...@gmail.com Date: Fri Nov 21 18:04:21 2014 -0800 glsl: Initialize static temporaries_allocate_names once per process. Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] [RFC] MESA_shading_language_130: is this it?
On 3 December 2014 at 07:32, Ian Romanick i...@freedesktop.org wrote: On 12/02/2014 12:51 PM, Eric Anholt wrote: Ian Romanick i...@freedesktop.org writes: On 11/26/2014 06:09 PM, Dave Airlie wrote: Glamor is 4x faster on my ILK using glsl 130 at core text using x11perf -ftext. Ian started writing a spec for this extension a while back, which seems like most of the work, this patch seems to do enough, to advertise GLSL 1.30. Yeah... I started writing the extension when Chris Forbes was working on adding GLSL 1.30 for Ironlake. I seem to recall that gl_ClipDistance still does not work for ILK, and difficulties with that caused Chris to abandon the project. This was over a year ago, so the details are a bit fuzzy. The common Mesa parts look good, though. If we want to pursue this, I can finish up the extension spec and get it published. I'd definitely be interested -- integers are really useful for 2D rendering (as evidenced by Dave's numbers), and I can do them in VC4. What I see in a glance through 1.30 that I don't have in my 2.1 implementation is: - Size queries (but I can fake it with uniforms) - Texture arrays (can't fake that without some real craziness). - Texture offsets (could fake with uniforms and some math?) - gl_VertexID (could fake with a cached VBO of integers I think. I'd be interested in whether the MESA_1.30 spec would require support for extensions that expose those things, or if I could expose it without doing all of that. I think it would be easy to have an interaction with GL_ARB_texture_array at least. I think the easiest thing is: Interactions with EXT_texture_array and OpenGL 3.0 If EXT_texture_array or OpenGL 3.0 are not supported, shaders using sampler1DArray, sampler2DArray, isampler1DArray, isampler2DArray, usampler1DArray, usampler2DArray, sampler1DArrayShadow, or sampler2DArrayShadow types or built-in functions related to those types will compile and link. However, since textures for those targets cannot be created, these shaders will fail draw-time validation checks. Would that work? I think we'd need a nearly identical interaction for EXT_texture_integer. There may be other similar cases. From a piglit run on ILK, AMD_shader_trinary_minmax ARB_draw_elements_base_vertex ARB_explicit_attrib_location ARB_explicit_uniform_location ARB_shader_bit_encoding ARB_texture_query_levels ARB_texture_query_lod ARB_texture_rg ARB_texture_rgb10_a2ui EXT_shader_integer_mix EXT_texture_integer ARB_separate_shader_objects all had new tests enabled with my patch. Eric, gl_VertexID is one thing glamor needs isn't it? Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/7] mesa: Generate GL_INVALID_OPERATION when drawing w/o a VAO in core profile
On Tuesday, December 02, 2014 09:26:38 AM Ian Romanick wrote: On 11/20/2014 05:19 PM, Kenneth Graunke wrote: On Thursday, November 20, 2014 11:14:49 AM Ian Romanick wrote: From: Ian Romanick ian.d.roman...@intel.com GL 3-ish versions of the spec are less clear that an error should be generated here, so Ken (and I during review) just missed it in 1afe335. Signed-off-by: Ian Romanick ian.d.roman...@intel.com Cc: Kenneth Graunke kenn...@whitecape.org --- src/mesa/main/api_validate.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c index bf4fa3e..006fca4 100644 --- a/src/mesa/main/api_validate.c +++ b/src/mesa/main/api_validate.c @@ -79,8 +79,16 @@ check_valid_to_render(struct gl_context *ctx, const char *function) break; case API_OPENGL_CORE: - if (ctx-Array.VAO == ctx-Array.DefaultVAO) + /* Section 10.4 (Drawing Commands Using Vertex Arrays) of the OpenGL 4.5 + * Core Profile spec says: + * + * An INVALID_OPERATION error is generated if no vertex array + * object is bound (see section 10.3.1). + */ + if (ctx-Array.VAO == ctx-Array.DefaultVAO) { + _mesa_error(ctx, GL_INVALID_OPERATION, %s(no VAO bound), function); return GL_FALSE; + } /* fallthrough */ case API_OPENGL_COMPAT: { This seems okay - we were already prohibiting drawing, you're just adding a GL error. I thought we already did that, but I can't find any code to do so. Reviewed-by: Kenneth Graunke kenn...@whitecape.org That said, I don't think we ever resolved whether prohibiting drawing is correct, given that we support ARB_ES3_compatibility, and ES3 allows this. OpenGL 4.5 includes GL_ARB_ES3_compatibility as a required feature, and it explicitly calls out the error. So I think we're okay? Oh, cool - I didn't realize that had already been clarified. Great. --Ken signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 86944] glsl_parser_extras.cpp, line 1455: Error: Badly formed expression.
https://bugs.freedesktop.org/show_bug.cgi?id=86944 --- Comment #1 from Matt Turner matts...@gmail.com --- (In reply to Vinson Lee from comment #0) Build error with Oracle Studio. [snip useless warnings] ../../src/glsl/glsl_parser_extras.cpp, line 1455: Error: Badly formed expression. I'm not sure what's wrong with + (void) p_atomic_cmpxchg(ir_variable::temporaries_allocate_names, + false, true); gcc and MSVC seem to accept it. -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 86944] glsl_parser_extras.cpp, line 1455: Error: Badly formed expression.
https://bugs.freedesktop.org/show_bug.cgi?id=86944 --- Comment #2 from Matt Turner matts...@gmail.com --- Oh, I bet there's something wrong with the macro. I'd try removing the typeof(*v) cast in src/util/u_atomic.h around line 173. If that fixes it, we should remove those casts from p_atomic_inc_return and p_atomic_dec_return as well. -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] [v2] i965: implement ARB_pipeline_statistics_query
This patch implements ARB_pipeline_statistics_query. This addition to GL does not add a new API. Instead, it adds new tokens to the existing query APIs. The work to hook up the new tokens is trivial due to it's similarity to the previous work done for the query APIs. I've implemented all the new tokens to some degree, but have stubbed out the untested ones at the entry point for Begin(). Doing this should allow the remainder of the code to be left in. The new tokens give GL clients a way to obtain stats about the GL pipeline. Generally, you get the number of things going in, invocations, and number of things coming out, primitives, of the various stages. There are two immediate uses for this, performance information, and debugging various types of misrendering. I doubt one can use these for debugging very complex applications, but for piglit tests, it should be quite useful. Tessellation shaders, and compute shaders are not addressed in this patch because there is no upstream implementation. I've implemented how I believe tessellation shader stats will work for Intel hardware (though there is a bit of ambiguity). Compute shaders are a bit more interesting though, and I don't yet know what we'll do there. For the lazy, here is a link to the relevant part of the spec: https://www.opengl.org/registry/specs/ARB/pipeline_statistics_query.txt Running the piglit tests http://lists.freedesktop.org/archives/piglit/2014-November/013321.html (http://cgit.freedesktop.org/~bwidawsk/piglit/log/?h=pipe_stats) yield the following results: python2 ./piglit-run.py -t stats tests/all.py output/pipeline_stats [5/5] pass: 5 Running Test(s): 5 Previously I was seeing the adjacent vertex test failing on certain Intel hardware. I am currently not able to reproduce this, and therefore for now, I'll assume it was some transient issue which has been fixed. v2: - Don't allow pipeline_stats to be per stream (Ilia). This would be needed for AMD_transform_feedback4, which we do not support. If AMD_transform_feedback4 is supported then GEOMETRY_SHADER_PRIMITIVES_- EMITTED_ARB counts primitives emitted to any of the vertex streams for which STREAM_RASTERIZATION_AMD is enabled. - Remove comment from GL3.txt because it is only used for extensions that are part of required versions (Ilia) - Move the new tokens to a new XML doc instead of using the main GL4x.xml (Ilia) - Add a fallthrough comment (Ilia) - Only divide PS invocations by 4 on HSW+ (Ben) Cc: Ilia Mirkin imir...@alum.mit.edu Signed-off-by: Ben Widawsky b...@bwidawsk.net --- .../glapi/gen/ARB_pipeline_statistics_query.xml| 24 src/mesa/drivers/dri/i965/gen6_queryobj.c | 121 + src/mesa/drivers/dri/i965/intel_extensions.c | 1 + src/mesa/main/config.h | 3 + src/mesa/main/extensions.c | 1 + src/mesa/main/mtypes.h | 15 +++ src/mesa/main/queryobj.c | 77 + 7 files changed, 242 insertions(+) create mode 100644 src/mapi/glapi/gen/ARB_pipeline_statistics_query.xml diff --git a/src/mapi/glapi/gen/ARB_pipeline_statistics_query.xml b/src/mapi/glapi/gen/ARB_pipeline_statistics_query.xml new file mode 100644 index 000..db37267 --- /dev/null +++ b/src/mapi/glapi/gen/ARB_pipeline_statistics_query.xml @@ -0,0 +1,24 @@ +?xml version=1.0? +!DOCTYPE OpenGLAPI SYSTEM gl_API.dtd + +!-- Note: no GLX protocol info yet. -- + +OpenGLAPI + +category name=GL_ARB_pipeline_statistics_query number=100 + + enum name=VERTICES_SUBMITTED value=0x82EE/ + enum name=PRIMITIVES_SUBMITTEDvalue=0x82EF/ + enum name=VERTEX_SHADER_INVOCATIONS value=0x82F0/ + enum name=TESS_CONTROL_SHADER_PATCHES value=0x82F1/ + enum name=TESS_EVALUATION_SHADER_INVOCATIONS value=0x82F2/ + !-- enum name=GEOMETRY_SHADER_INVOCATIONS value=0x887F/ -- + enum name=GEOMETRY_SHADER_PRIMITIVES_EMITTED value=0x82F3/ + enum name=FRAGMENT_SHADER_INVOCATIONS value=0x82F4/ + enum name=COMPUTE_SHADER_INVOCATIONS value=0x82F5/ + enum name=CLIPPING_INPUT_PRIMITIVES value=0x82F6/ + enum name=CLIPPING_OUTPUT_PRIMITIVES value=0x82F7/ + +/category + +/OpenGLAPI diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c b/src/mesa/drivers/dri/i965/gen6_queryobj.c index 130236e..f8b9bc3 100644 --- a/src/mesa/drivers/dri/i965/gen6_queryobj.c +++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c @@ -109,6 +109,73 @@ write_xfb_primitives_written(struct brw_context *brw, } } +static inline const int +pipeline_target_to_index(int target) +{ + if (target == GL_GEOMETRY_SHADER_INVOCATIONS) + return MAX_PIPELINE_STATISTICS - 1; + else + return target - GL_VERTICES_SUBMITTED_ARB; +} + +static void +emit_pipeline_stat(struct brw_context *brw, drm_intel_bo *bo, + int stream, int target,
Re: [Mesa-dev] [PATCH] [RFC] MESA_shading_language_130: is this it?
Am 02.12.2014 um 21:51 schrieb Eric Anholt: Ian Romanick i...@freedesktop.org writes: On 11/26/2014 06:09 PM, Dave Airlie wrote: Glamor is 4x faster on my ILK using glsl 130 at core text using x11perf -ftext. Ian started writing a spec for this extension a while back, which seems like most of the work, this patch seems to do enough, to advertise GLSL 1.30. Yeah... I started writing the extension when Chris Forbes was working on adding GLSL 1.30 for Ironlake. I seem to recall that gl_ClipDistance still does not work for ILK, and difficulties with that caused Chris to abandon the project. This was over a year ago, so the details are a bit fuzzy. The common Mesa parts look good, though. If we want to pursue this, I can finish up the extension spec and get it published. I'd definitely be interested -- integers are really useful for 2D rendering (as evidenced by Dave's numbers), and I can do them in VC4. What I see in a glance through 1.30 that I don't have in my 2.1 implementation is: - Size queries (but I can fake it with uniforms) - Texture arrays (can't fake that without some real craziness). - Texture offsets (could fake with uniforms and some math?) Offsets get applied in texel space, after denormalization, hence depend on the mipmap chosen. So if you can't get the mip level used in sampling from your hw (and even then correct filtering of two mips is going to be difficult) things probably get annoying pretty fast. Though I would assume offsets are mostly used on level-zero-only textures or the sampling otherwise limited to one level (textureLodOffset, min/max lod). Roland - gl_VertexID (could fake with a cached VBO of integers I think. I'd be interested in whether the MESA_1.30 spec would require support for extensions that expose those things, or if I could expose it without doing all of that. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.freedesktop.org_mailman_listinfo_mesa-2Ddevd=AAIGaQc=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEsr=Vjtt0vs_iqoI31UfJxBl7yv9I2FeiaeAYgMTLKRBc_Im=r5_-aHqaEDIufEVmTnZTiBFyWWXoNR6TT3nMKR9kug4s=t8ibrSzKSkiS-k9hJPiwy7zGjvKCGBWmxfmeMsUJZx4e= ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] llvmpipe: decrease MAX_SCENES from 2 to 1
From: Roland Scheidegger srol...@vmware.com Multiple scenes per context are meant to be used so a new scene can be built while another one is processed in rasterization. However, quite surprisingly, this does not actually work (and according to git log, possibly never did, though maybe it did at some point further back (5 years+) but was buggy) because we always wait immediately on the rasterizer to finish the scene when contexts (and hence setup/scene) is flushed. This means when we try to get an empty scene later, any old one is already empty again. Thus using multiple scenes is just a waste of memory (not too bad, since the additional scenes are guaranteed to be empty, which means their size ought to be one data block (64kB) plus the size of some structs), without actually really doing anything. (There is also quite some code for the whole concept of multiple scenes which doesn't really do much in practice, but keep it hoping the wait-on-scene-flush can be fixed some day.) --- src/gallium/drivers/llvmpipe/lp_setup.c | 11 +++ src/gallium/drivers/llvmpipe/lp_setup_context.h | 3 ++- 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c b/src/gallium/drivers/llvmpipe/lp_setup.c index 7d0c8cc..3b0056c 100644 --- a/src/gallium/drivers/llvmpipe/lp_setup.c +++ b/src/gallium/drivers/llvmpipe/lp_setup.c @@ -165,6 +165,17 @@ lp_setup_rasterize_scene( struct lp_setup_context *setup ) setup-last_fence-issued = TRUE; pipe_mutex_lock(screen-rast_mutex); + + /* FIXME: We enqueue the scene then wait on the rasterizer to finish. +* This means we never actually run any vertex stuff in parallel to +* rasterization (not in the same context at least) which is what the +* multiple scenes per setup is about - when we get a new empty scene +* any old one is already empty again because we waited here for +* raster tasks to be finished. Ideally, we shouldn't need to wait here +* and rely on fences elsewhere when waiting is necessary. +* Certainly, lp_scene_end_rasterization() would need to be deferred too +* and there's probably other bits why this doesn't actually work. +*/ lp_rast_queue_scene(screen-rast, scene); lp_rast_finish(screen-rast); pipe_mutex_unlock(screen-rast_mutex); diff --git a/src/gallium/drivers/llvmpipe/lp_setup_context.h b/src/gallium/drivers/llvmpipe/lp_setup_context.h index ef54403..2410e23 100644 --- a/src/gallium/drivers/llvmpipe/lp_setup_context.h +++ b/src/gallium/drivers/llvmpipe/lp_setup_context.h @@ -55,7 +55,8 @@ struct lp_setup_variant; /** Max number of scenes */ -#define MAX_SCENES 2 +/* XXX: make multiple scenes per context work, see lp_setup_rasterize_scene */ +#define MAX_SCENES 1 -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/1] r600, llvm: Don't leak global symbol offsets
On Tue, Dec 02, 2014 at 02:47:53PM -0500, Jan Vesely wrote: Signed-off-by: Jan Vesely jan.ves...@rutgers.edu I've pushed this, thanks! -Tom --- You were right, this one was leaking too. src/gallium/drivers/r600/r600_llvm.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/gallium/drivers/r600/r600_llvm.c b/src/gallium/drivers/r600/r600_llvm.c index 3a3ee3a..a928fb8 100644 --- a/src/gallium/drivers/r600/r600_llvm.c +++ b/src/gallium/drivers/r600/r600_llvm.c @@ -888,6 +888,7 @@ unsigned r600_llvm_compile( FREE(binary.code); FREE(binary.config); FREE(binary.rodata); + FREE(binary.global_symbol_offsets); return r; } -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] [RFC] MESA_shading_language_130: is this it?
On 3 December 2014 at 08:46, Dave Airlie airl...@gmail.com wrote: On 3 December 2014 at 07:32, Ian Romanick i...@freedesktop.org wrote: On 12/02/2014 12:51 PM, Eric Anholt wrote: Ian Romanick i...@freedesktop.org writes: On 11/26/2014 06:09 PM, Dave Airlie wrote: Glamor is 4x faster on my ILK using glsl 130 at core text using x11perf -ftext. Ian started writing a spec for this extension a while back, which seems like most of the work, this patch seems to do enough, to advertise GLSL 1.30. Yeah... I started writing the extension when Chris Forbes was working on adding GLSL 1.30 for Ironlake. I seem to recall that gl_ClipDistance still does not work for ILK, and difficulties with that caused Chris to abandon the project. This was over a year ago, so the details are a bit fuzzy. The common Mesa parts look good, though. If we want to pursue this, I can finish up the extension spec and get it published. I'd definitely be interested -- integers are really useful for 2D rendering (as evidenced by Dave's numbers), and I can do them in VC4. What I see in a glance through 1.30 that I don't have in my 2.1 implementation is: - Size queries (but I can fake it with uniforms) - Texture arrays (can't fake that without some real craziness). - Texture offsets (could fake with uniforms and some math?) - gl_VertexID (could fake with a cached VBO of integers I think. I'd be interested in whether the MESA_1.30 spec would require support for extensions that expose those things, or if I could expose it without doing all of that. I think it would be easy to have an interaction with GL_ARB_texture_array at least. I think the easiest thing is: Interactions with EXT_texture_array and OpenGL 3.0 If EXT_texture_array or OpenGL 3.0 are not supported, shaders using sampler1DArray, sampler2DArray, isampler1DArray, isampler2DArray, usampler1DArray, usampler2DArray, sampler1DArrayShadow, or sampler2DArrayShadow types or built-in functions related to those types will compile and link. However, since textures for those targets cannot be created, these shaders will fail draw-time validation checks. Would that work? I think we'd need a nearly identical interaction for EXT_texture_integer. There may be other similar cases. From a piglit run on ILK, AMD_shader_trinary_minmax ARB_draw_elements_base_vertex ARB_explicit_attrib_location ARB_explicit_uniform_location ARB_shader_bit_encoding ARB_texture_query_levels ARB_texture_query_lod ARB_texture_rg ARB_texture_rgb10_a2ui EXT_shader_integer_mix EXT_texture_integer ARB_separate_shader_objects all had new tests enabled with my patch. I just put up the piglit results from Ironlake http://people.freedesktop.org/~airlied/piglit/ilk/enabled.html being the interesting page. Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] [v2] i965: implement ARB_pipeline_statistics_query
On Tue, Dec 2, 2014 at 9:33 PM, Ben Widawsky benjamin.widaw...@intel.com wrote: This patch implements ARB_pipeline_statistics_query. This addition to GL does not add a new API. Instead, it adds new tokens to the existing query APIs. The work to hook up the new tokens is trivial due to it's similarity to the previous work done for the query APIs. I've implemented all the new tokens to some degree, but have stubbed out the untested ones at the entry point for Begin(). Doing this should allow the remainder of the code to be left in. The new tokens give GL clients a way to obtain stats about the GL pipeline. Generally, you get the number of things going in, invocations, and number of things coming out, primitives, of the various stages. There are two immediate uses for this, performance information, and debugging various types of misrendering. I doubt one can use these for debugging very complex applications, but for piglit tests, it should be quite useful. Tessellation shaders, and compute shaders are not addressed in this patch because there is no upstream implementation. I've implemented how I believe tessellation shader stats will work for Intel hardware (though there is a bit of ambiguity). Compute shaders are a bit more interesting though, and I don't yet know what we'll do there. For the lazy, here is a link to the relevant part of the spec: https://www.opengl.org/registry/specs/ARB/pipeline_statistics_query.txt Running the piglit tests http://lists.freedesktop.org/archives/piglit/2014-November/013321.html (http://cgit.freedesktop.org/~bwidawsk/piglit/log/?h=pipe_stats) yield the following results: python2 ./piglit-run.py -t stats tests/all.py output/pipeline_stats [5/5] pass: 5 Running Test(s): 5 Previously I was seeing the adjacent vertex test failing on certain Intel hardware. I am currently not able to reproduce this, and therefore for now, I'll assume it was some transient issue which has been fixed. v2: - Don't allow pipeline_stats to be per stream (Ilia). This would be needed for AMD_transform_feedback4, which we do not support. If AMD_transform_feedback4 is supported then GEOMETRY_SHADER_PRIMITIVES_- EMITTED_ARB counts primitives emitted to any of the vertex streams for which STREAM_RASTERIZATION_AMD is enabled. Actually it wouldn't be needed even if mesa supported AMD_tf4. The way I'm reading it, the counter should count primitives emitted to *ANY* vertex stream with rasterization, not keep separate counters per vertex stream. - Remove comment from GL3.txt because it is only used for extensions that are part of required versions (Ilia) You should still add it to relnotes though... the file's been created already - Move the new tokens to a new XML doc instead of using the main GL4x.xml (Ilia) - Add a fallthrough comment (Ilia) - Only divide PS invocations by 4 on HSW+ (Ben) Cc: Ilia Mirkin imir...@alum.mit.edu Signed-off-by: Ben Widawsky b...@bwidawsk.net --- .../glapi/gen/ARB_pipeline_statistics_query.xml| 24 src/mesa/drivers/dri/i965/gen6_queryobj.c | 121 + src/mesa/drivers/dri/i965/intel_extensions.c | 1 + Didn't we agree that you'd split this into 2-3 patches, with the intel stuff in the last patch? Normally you'd have patch 1: xml, queryobj stub, maybe include extension var + table entry, but can also be in patch 2. patch 2: queryobj impl, driver hook patch 3: driver impl src/mesa/main/config.h | 3 + src/mesa/main/extensions.c | 1 + src/mesa/main/mtypes.h | 15 +++ src/mesa/main/queryobj.c | 77 + 7 files changed, 242 insertions(+) create mode 100644 src/mapi/glapi/gen/ARB_pipeline_statistics_query.xml diff --git a/src/mapi/glapi/gen/ARB_pipeline_statistics_query.xml b/src/mapi/glapi/gen/ARB_pipeline_statistics_query.xml new file mode 100644 index 000..db37267 --- /dev/null +++ b/src/mapi/glapi/gen/ARB_pipeline_statistics_query.xml This needs to be added to src/mapi/glapi/Makefile.am or similar -- look for a giant list of xml files, add to that. You also need to include it in the master xml file. I wonder why you weren't having issues... maybe you're not adding functions so it works out, dunno. [skipping all intel-specific stuff] The rest seems reasonable. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 86690] glmark2-es2-wayland shortly freezes on some frames with egl_dri2 backend (Nouveau/GK20A)
https://bugs.freedesktop.org/show_bug.cgi?id=86690 --- Comment #7 from Alexandre Courbot gnu...@gmail.com --- Some more input about this oddity. We are really waiting for fences here. During the freezes I can see the EGL client looping in nouveau_fence_wait(), and displaying the fence sequence and the highest signaled fence by the screen (screen-fence.sequence_ack) shows that sequence_ack fence-sequence, indicating the GPU has not signaled this fence yet. I am starting to suspect a CPU scheduling issue here. Another very interesting phenomenon happens if I disable all but 1 CPU core: the issue becomes immediately visible (even worse than when perf top is running), for all Wayland EGL clients that set eglSwapInterval to 0. Wayland itself (or any DRM program for that instance) is still not affected, neither are Wayland clients not touching eglSwapInterval *until* they exit, at which point nouveau_fence_wait() will again endlessly loop on a fence. One way to release that fence is to trigger a draw event in another client. Then the screen's sequence_ack finally increases and the client goes through and exits. Doing the same in a client that set eglSwapInterval(0) allows it to draw a few frames, and then it blocks again. I suspect that some event in Wayland remains stuck somewhere, which leads to the EGL client to wait for a fence that Wayland fails to release when expected. Here is a gdb backtrace of weston-simple-egl when waiting for the fence: #0 0xb6d9298c in sched_yield () #1 0xb6a448ce in nouveau_fence_wait () #2 0xb6a45480 in nouveau_screen_fence_finish () #3 0xb69a351e in dri_flush () #4 0xb6fc399e in dri2_wl_swap_buffers_with_damage () #5 0xb6fc0f66 in dri2_swap_buffers_with_damage () #6 0xb6fb700c in eglSwapBuffersWithDamageEXT () #7 0xaa4a in redraw (data=0xbefff9e4, callback=0x0, time=0) #8 0xb254 in main (argc=2, argv=0xbefffc64) I guess the next step for me is to find how to get some output about Wayland events as they happen and see which one remains stuck. -- You are receiving this mail because: You are the QA Contact for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] [v2] i965: implement ARB_pipeline_statistics_query
On Tue, Dec 02, 2014 at 10:47:37PM -0500, Ilia Mirkin wrote: On Tue, Dec 2, 2014 at 9:33 PM, Ben Widawsky benjamin.widaw...@intel.com wrote: This patch implements ARB_pipeline_statistics_query. This addition to GL does not add a new API. Instead, it adds new tokens to the existing query APIs. The work to hook up the new tokens is trivial due to it's similarity to the previous work done for the query APIs. I've implemented all the new tokens to some degree, but have stubbed out the untested ones at the entry point for Begin(). Doing this should allow the remainder of the code to be left in. The new tokens give GL clients a way to obtain stats about the GL pipeline. Generally, you get the number of things going in, invocations, and number of things coming out, primitives, of the various stages. There are two immediate uses for this, performance information, and debugging various types of misrendering. I doubt one can use these for debugging very complex applications, but for piglit tests, it should be quite useful. Tessellation shaders, and compute shaders are not addressed in this patch because there is no upstream implementation. I've implemented how I believe tessellation shader stats will work for Intel hardware (though there is a bit of ambiguity). Compute shaders are a bit more interesting though, and I don't yet know what we'll do there. For the lazy, here is a link to the relevant part of the spec: https://www.opengl.org/registry/specs/ARB/pipeline_statistics_query.txt Running the piglit tests http://lists.freedesktop.org/archives/piglit/2014-November/013321.html (http://cgit.freedesktop.org/~bwidawsk/piglit/log/?h=pipe_stats) yield the following results: python2 ./piglit-run.py -t stats tests/all.py output/pipeline_stats [5/5] pass: 5 Running Test(s): 5 Previously I was seeing the adjacent vertex test failing on certain Intel hardware. I am currently not able to reproduce this, and therefore for now, I'll assume it was some transient issue which has been fixed. v2: - Don't allow pipeline_stats to be per stream (Ilia). This would be needed for AMD_transform_feedback4, which we do not support. If AMD_transform_feedback4 is supported then GEOMETRY_SHADER_PRIMITIVES_- EMITTED_ARB counts primitives emitted to any of the vertex streams for which STREAM_RASTERIZATION_AMD is enabled. Actually it wouldn't be needed even if mesa supported AMD_tf4. The way I'm reading it, the counter should count primitives emitted to *ANY* vertex stream with rasterization, not keep separate counters per vertex stream. I'll defer to you on this since I really don't understand the implications of TF4. I don't quite understand how any makes sense when there are multiple vertex streams coming from the GS. So, I read it as, you need to capture all and let the user of the API figure out what to do with it. Again, I'm more than happy to defer to you on that. - Remove comment from GL3.txt because it is only used for extensions that are part of required versions (Ilia) You should still add it to relnotes though... the file's been created already Sorry, you are correct. I was too hasty in sending this out. - Move the new tokens to a new XML doc instead of using the main GL4x.xml (Ilia) - Add a fallthrough comment (Ilia) - Only divide PS invocations by 4 on HSW+ (Ben) Cc: Ilia Mirkin imir...@alum.mit.edu Signed-off-by: Ben Widawsky b...@bwidawsk.net --- .../glapi/gen/ARB_pipeline_statistics_query.xml| 24 src/mesa/drivers/dri/i965/gen6_queryobj.c | 121 + src/mesa/drivers/dri/i965/intel_extensions.c | 1 + Didn't we agree that you'd split this into 2-3 patches, with the intel stuff in the last patch? Normally you'd have patch 1: xml, queryobj stub, maybe include extension var + table entry, but can also be in patch 2. patch 2: queryobj impl, driver hook patch 3: driver impl Sorry, you are correct. Same excuse as above. src/mesa/main/config.h | 3 + src/mesa/main/extensions.c | 1 + src/mesa/main/mtypes.h | 15 +++ src/mesa/main/queryobj.c | 77 + 7 files changed, 242 insertions(+) create mode 100644 src/mapi/glapi/gen/ARB_pipeline_statistics_query.xml diff --git a/src/mapi/glapi/gen/ARB_pipeline_statistics_query.xml b/src/mapi/glapi/gen/ARB_pipeline_statistics_query.xml new file mode 100644 index 000..db37267 --- /dev/null +++ b/src/mapi/glapi/gen/ARB_pipeline_statistics_query.xml This needs to be added to src/mapi/glapi/Makefile.am or similar -- look for a giant list of xml files, add to that. You also need to include it in the master xml file. I wonder why you weren't having issues... maybe you're not adding functions so it works