Re: [Mesa-dev] [PATCH] r600g: Fix UMAD on Cayman
On Tue, Apr 9, 2013 at 3:18 AM, Marek Olšák mar...@gmail.com wrote: Pushed, thanks. The transform feedback test still doesn't pass, but at least the hardlocks are gone. Thanks, I have looked into the other issue as well: http://lists.freedesktop.org/archives/mesa-dev/2013-March/036941.html The problem arises when there are nested loops. If I rework the code so there are no nested loops, the issue disappears. At least one pixel also needs to enter the outer loop. The pixels that should enter the outer loop behave correctly. It is the pixels that should not enter the outer loop that misbehave. It does not matter if they also fail the test for the inner loop; they will still execute the instruction inside. That leads to the strange results for that test. The strangeness is easier to see if ext_transform_feedback/order.c is run with smaller values of NUM_POINTS, like 3, 6 and 9. Disable the code that fails the test and print starting_x, shift_reg_final and iteration_count. Marek, since you implemented transform feedback for r600, do you think the issue is with the transform feedback code, the shader compiler, or something else? //Martin ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] radeonsi: add 2d tiling support for texture v3
On Mon, 2013-04-08 at 13:46 -0400, j.gli...@gmail.com wrote: From: Jerome Glisse jgli...@redhat.com v2: Remove left over code v3: Restage properly the commit so hunk of first one are not in second one. Signed-off-by: Jerome Glisse jgli...@redhat.com --- src/gallium/drivers/radeonsi/r600_texture.c | 11 ++-- src/gallium/drivers/radeonsi/si_state.c | 81 + 2 files changed, 20 insertions(+), 72 deletions(-) diff --git a/src/gallium/drivers/radeonsi/r600_texture.c b/src/gallium/drivers/radeonsi/r600_texture.c index 1b8382f..8992f9a 100644 --- a/src/gallium/drivers/radeonsi/r600_texture.c +++ b/src/gallium/drivers/radeonsi/r600_texture.c @@ -47,7 +47,6 @@ static void r600_copy_to_staging_texture(struct pipe_context *ctx, struct r600_t &transfer->box); } - /* Copy from a transfer's staging texture to a full GPU one. */ static void r600_copy_from_staging_texture(struct pipe_context *ctx, struct r600_transfer *rtransfer) { @@ -152,12 +151,12 @@ static int r600_init_surface(struct r600_screen *rscreen, if (!is_flushed_depth && is_depth) { surface->flags |= RADEON_SURF_ZBUFFER; - if (is_stencil) { surface->flags |= RADEON_SURF_SBUFFER | RADEON_SURF_HAS_SBUFFER_MIPTREE; These whitespace-only changes make me cringe. :) Either way though, the series is Reviewed-by: Michel Dänzer michel.daen...@amd.com -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Debian, X and DRI developer
[Mesa-dev] [PATCH] gallium/opencl: Fix out-of-tree build
From: Michel Dänzer michel.daen...@amd.com

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/targets/opencl/Makefile.am | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am
index 389eecc..810f9bb 100644
--- a/src/gallium/targets/opencl/Makefile.am
+++ b/src/gallium/targets/opencl/Makefile.am
@@ -32,11 +32,11 @@ libOpenCL_la_SOURCES =
 # Force usage of a C++ linker
 nodist_EXTRA_libOpenCL_la_SOURCES = dummy.cpp
 
-PIPE_SRC_DIR = $(top_srcdir)/src/gallium/targets/pipe-loader
+PIPE_BUILD_DIR = $(top_builddir)/src/gallium/targets/pipe-loader
 
 # Provide compatibility with scripts for the old Mesa build system for
 # a while by putting a link to the driver into /lib of the build tree.
 all-local: libOpenCL.la
-	@$(MAKE) -C $(PIPE_SRC_DIR)
+	@$(MAKE) -C $(PIPE_BUILD_DIR)
 	$(MKDIR_P) $(top_builddir)/$(LIB_DIR)
 	ln -f .libs/libOpenCL.so* $(top_builddir)/$(LIB_DIR)/
-- 
1.8.2
Re: [Mesa-dev] [PATCH] gallium/opencl: Fix out-of-tree build
On Tuesday, 9 April 2013, 11:17:39, Michel Dänzer wrote: From: Michel Dänzer michel.daen...@amd.com Signed-off-by: Michel Dänzer michel.daen...@amd.com --- src/gallium/targets/opencl/Makefile.am | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am index 389eecc..810f9bb 100644 --- a/src/gallium/targets/opencl/Makefile.am +++ b/src/gallium/targets/opencl/Makefile.am @@ -32,11 +32,11 @@ libOpenCL_la_SOURCES = # Force usage of a C++ linker nodist_EXTRA_libOpenCL_la_SOURCES = dummy.cpp -PIPE_SRC_DIR = $(top_srcdir)/src/gallium/targets/pipe-loader +PIPE_BUILD_DIR = $(top_builddir)/src/gallium/targets/pipe-loader # Provide compatibility with scripts for the old Mesa build system for # a while by putting a link to the driver into /lib of the build tree. all-local: libOpenCL.la - @$(MAKE) -C $(PIPE_SRC_DIR) + @$(MAKE) -C $(PIPE_BUILD_DIR) $(MKDIR_P) $(top_builddir)/$(LIB_DIR) ln -f .libs/libOpenCL.so* $(top_builddir)/$(LIB_DIR)/ -- 1.8.2 I sent that patch to the list on 24.02.2013, but Matt Turner said that he has a better solution that does not involve calling make...
Re: [Mesa-dev] [PATCH] r600g: Fix UMAD on Cayman
I think the transform feedback code is correct, because all the other TFB tests pass. If there is an issue, it must be in the shader. Marek On Tue, Apr 9, 2013 at 8:58 AM, Martin Andersson g02ma...@gmail.com wrote: On Tue, Apr 9, 2013 at 3:18 AM, Marek Olšák mar...@gmail.com wrote: Pushed, thanks. The transform feedback test still doesn't pass, but at least the hardlocks are gone. Thanks, I have looked into the other issue as well: http://lists.freedesktop.org/archives/mesa-dev/2013-March/036941.html The problem arises when there are nested loops. If I rework the code so there are no nested loops, the issue disappears. At least one pixel also needs to enter the outer loop. The pixels that should enter the outer loop behave correctly. It is the pixels that should not enter the outer loop that misbehave. It does not matter if they also fail the test for the inner loop; they will still execute the instruction inside. That leads to the strange results for that test. The strangeness is easier to see if ext_transform_feedback/order.c is run with smaller values of NUM_POINTS, like 3, 6 and 9. Disable the code that fails the test and print starting_x, shift_reg_final and iteration_count. Marek, since you implemented transform feedback for r600, do you think the issue is with the transform feedback code, the shader compiler, or something else? //Martin
[Mesa-dev] [Bug 63117] OSMesa Gallium Empty Output
https://bugs.freedesktop.org/show_bug.cgi?id=63117 --- Comment #3 from Kevin Hobbs hob...@ohiou.edu --- (In reply to comment #2) Created attachment 77643 patch for osmesa.c I tried the patched mesa with VTK's LoadOpenGLExtension test and the test passed. I'm running the rest of VTK's nightly tests now... -- You are receiving this mail because: You are the assignee for the bug.
[Mesa-dev] [PATCH 1/2] mesa: fix glGet queries depending on derived framebuffer state (v2)
ctx->DrawBuffer->Visual might be invalid if (NewState & _NEW_BUFFERS) != 0. v2: also fix: - RGBA_INTEGER_MODE_EXT - RGBA_FLOAT_MODE_ARB (also check API support) - FRAMEBUFFER_SRGB_CAPABLE_EXT NOTE: This is a candidate for stable branches. --- Yes Eric, a few more enums need the same treatment. src/mesa/main/get.c | 19 +++ src/mesa/main/get_hash_params.py | 14 +++--- 2 files changed, 26 insertions(+), 7 deletions(-) diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c index 2ba868c..31abe05 100644 --- a/src/mesa/main/get.c +++ b/src/mesa/main/get.c @@ -281,6 +281,12 @@ static const int extra_EXT_texture_integer[] = { EXTRA_END }; +static const int extra_EXT_texture_integer_and_new_buffers[] = { + EXT(EXT_texture_integer), + EXTRA_NEW_BUFFERS, + EXTRA_END +}; + static const int extra_GLSL_130[] = { EXTRA_GLSL_130, EXTRA_END @@ -317,6 +323,12 @@ static const int extra_ARB_ES3_compatibility_api_es3[] = { EXTRA_END }; +static const int extra_EXT_framebuffer_sRGB_and_new_buffers[] = { + EXT(EXT_framebuffer_sRGB), + EXTRA_NEW_BUFFERS, + EXTRA_END +}; + EXTRA_EXT(ARB_texture_cube_map); EXTRA_EXT(MESA_texture_array); EXTRA_EXT2(EXT_secondary_color, ARB_vertex_program); @@ -397,6 +409,13 @@ extra_NV_read_buffer_api_gl[] = { EXTRA_END }; +static const int extra_core_ARB_color_buffer_float_and_new_buffers[] = { + EXTRA_API_GL_CORE, + EXT(ARB_color_buffer_float), + EXTRA_NEW_BUFFERS, + EXTRA_END +}; + /* This is the big table describing all the enums we accept in * glGet*v().
The table is partitioned into six parts: enums * understood by all GL APIs (OpenGL, GLES and GLES2), enums shared diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py index 4ef2324..2b97da6 100644 --- a/src/mesa/main/get_hash_params.py +++ b/src/mesa/main/get_hash_params.py @@ -8,7 +8,7 @@ descriptor=[ [ COLOR_WRITEMASK, LOC_CUSTOM, TYPE_INT_4, 0, NO_EXTRA ], [ CULL_FACE, CONTEXT_BOOL(Polygon.CullFlag), NO_EXTRA ], [ CULL_FACE_MODE, CONTEXT_ENUM(Polygon.CullFaceMode), NO_EXTRA ], - [ DEPTH_BITS, BUFFER_INT(Visual.depthBits), NO_EXTRA ], + [ DEPTH_BITS, BUFFER_INT(Visual.depthBits), extra_new_buffers ], [ DEPTH_CLEAR_VALUE, CONTEXT_FIELD(Depth.Clear, TYPE_DOUBLEN), NO_EXTRA ], [ DEPTH_FUNC, CONTEXT_ENUM(Depth.Func), NO_EXTRA ], [ DEPTH_RANGE, CONTEXT_FIELD(Viewport.Near, TYPE_FLOATN_2), NO_EXTRA ], @@ -31,7 +31,7 @@ descriptor=[ [ RED_BITS, BUFFER_INT(Visual.redBits), extra_new_buffers ], [ SCISSOR_BOX, LOC_CUSTOM, TYPE_INT_4, 0, NO_EXTRA ], [ SCISSOR_TEST, CONTEXT_BOOL(Scissor.Enabled), NO_EXTRA ], - [ STENCIL_BITS, BUFFER_INT(Visual.stencilBits), NO_EXTRA ], + [ STENCIL_BITS, BUFFER_INT(Visual.stencilBits), extra_new_buffers ], [ STENCIL_CLEAR_VALUE, CONTEXT_INT(Stencil.Clear), NO_EXTRA ], [ STENCIL_FAIL, LOC_CUSTOM, TYPE_ENUM, NO_OFFSET, NO_EXTRA ], [ STENCIL_FUNC, LOC_CUSTOM, TYPE_ENUM, NO_OFFSET, NO_EXTRA ], @@ -80,8 +80,8 @@ descriptor=[ [ SAMPLE_COVERAGE_ARB, CONTEXT_BOOL(Multisample.SampleCoverage), NO_EXTRA ], [ SAMPLE_COVERAGE_VALUE_ARB, CONTEXT_FLOAT(Multisample.SampleCoverageValue), NO_EXTRA ], [ SAMPLE_COVERAGE_INVERT_ARB, CONTEXT_BOOL(Multisample.SampleCoverageInvert), NO_EXTRA ], - [ SAMPLE_BUFFERS_ARB, BUFFER_INT(Visual.sampleBuffers), NO_EXTRA ], - [ SAMPLES_ARB, BUFFER_INT(Visual.samples), NO_EXTRA ], + [ SAMPLE_BUFFERS_ARB, BUFFER_INT(Visual.sampleBuffers), extra_new_buffers ], + [ SAMPLES_ARB, BUFFER_INT(Visual.samples), extra_new_buffers ], # GL_SGIS_generate_mipmap [ GENERATE_MIPMAP_HINT_SGIS, 
CONTEXT_ENUM(Hint.GenerateMipmap), NO_EXTRA ], @@ -630,7 +630,7 @@ descriptor=[ [ TEXTURE_CUBE_MAP_SEAMLESS, CONTEXT_BOOL(Texture.CubeMapSeamless), extra_ARB_seamless_cube_map ], # GL_EXT_texture_integer - [ RGBA_INTEGER_MODE_EXT, BUFFER_BOOL(_IntegerColor), extra_EXT_texture_integer ], + [ RGBA_INTEGER_MODE_EXT, BUFFER_BOOL(_IntegerColor), extra_EXT_texture_integer_and_new_buffers ], # GL_ARB_transform_feedback3 [ MAX_TRANSFORM_FEEDBACK_BUFFERS, CONTEXT_INT(Const.MaxTransformFeedbackBuffers), extra_ARB_transform_feedback3 ], @@ -645,7 +645,7 @@ descriptor=[ [ MAX_VERTEX_VARYING_COMPONENTS_ARB, CONTEXT_INT(Const.MaxVertexVaryingComponents), extra_ARB_geometry_shader4 ], # GL_ARB_color_buffer_float - [ RGBA_FLOAT_MODE_ARB, BUFFER_FIELD(Visual.floatMode, TYPE_BOOLEAN), 0 ], + [ RGBA_FLOAT_MODE_ARB, BUFFER_FIELD(Visual.floatMode, TYPE_BOOLEAN), extra_core_ARB_color_buffer_float_and_new_buffers ], # GL_EXT_gpu_shader4 / GLSL 1.30 [ MIN_PROGRAM_TEXEL_OFFSET, CONTEXT_INT(Const.MinProgramTexelOffset), extra_GLSL_130 ], @@ -676,7 +676,7 @@ descriptor=[ # GL3.0 / GL_EXT_framebuffer_sRGB [ FRAMEBUFFER_SRGB_EXT, CONTEXT_BOOL(Color.sRGBEnabled), extra_EXT_framebuffer_sRGB ], - [ FRAMEBUFFER_SRGB_CAPABLE_EXT, BUFFER_INT(Visual.sRGBCapable), extra_EXT_framebuffer_sRGB ], + [
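The idea behind the patch can be sketched in isolation: a getter for a value derived from framebuffer state first checks whether that state is marked dirty and, if so, revalidates it before reading. This is only a minimal illustration under assumptions; toy_context, TOY_NEW_BUFFERS, toy_update_state and toy_get_depth_bits are hypothetical names and not Mesa's actual structures or functions.

```c
#include <stdbool.h>

/* Hypothetical dirty bit, standing in for a flag such as _NEW_BUFFERS. */
#define TOY_NEW_BUFFERS 0x1u

struct toy_context {
    unsigned new_state;      /* pending dirty bits */
    int attached_depth_bits; /* what the bound framebuffer really has */
    int visual_depth_bits;   /* derived copy, valid only when clean */
};

/* Recompute the derived state and clear the dirty bits. */
static void toy_update_state(struct toy_context *ctx)
{
    ctx->visual_depth_bits = ctx->attached_depth_bits;
    ctx->new_state = 0;
}

/* Getter for a derived value. 'needs_new_buffers' plays the role of the
 * per-enum "extra" entry: when set, stale derived state is refreshed
 * before it is read. */
static int toy_get_depth_bits(struct toy_context *ctx, bool needs_new_buffers)
{
    if (needs_new_buffers && (ctx->new_state & TOY_NEW_BUFFERS))
        toy_update_state(ctx);
    return ctx->visual_depth_bits;
}
```

Without the flag, the getter returns whatever the derived copy held before the framebuffer changed, which is the kind of stale result the patch is meant to avoid.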
[Mesa-dev] [Bug 63078] EGL X11 Regression: Maximum swap interval is 0 (worked with 9.0)
https://bugs.freedesktop.org/show_bug.cgi?id=63078 --- Comment #1 from post+...@ralfj.de --- I found the problem: In platform_x11.c, dri2_initialize_x11_dri2, the function dri2_add_configs_for_visuals (which sets the maximum swap interval of the EGLConfigs) is called before dri2_setup_swap_interval, so dri2_dpy->max_swap_interval is still 0 when the config is created. Later, it is changed, but that is not reflected in the EGLConfigs. I can fix the issue locally by moving the call of dri2_setup_swap_interval before the one to dri2_add_configs_for_visuals, but not knowing this codebase at all, I do not know whether everything is actually properly initialised for dri2_setup_swap_interval to be safe to call. All I know is that it does not crash here ;-) and applications can be v-sync'ed again.
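The ordering problem described in this comment can be reproduced in a few lines. The following is a minimal sketch, not the real EGL driver code: the toy_ types and functions are hypothetical, and the value 1 for the maximum swap interval is an assumption chosen only to make the difference visible.

```c
struct toy_display { int max_swap_interval; };
struct toy_config  { int max_swap_interval; };

/* Configs take a snapshot of the display's limit when they are created. */
static struct toy_config toy_add_config(const struct toy_display *dpy)
{
    struct toy_config conf = { dpy->max_swap_interval };
    return conf;
}

/* Sets the display-wide limit (assumed value: 1). */
static void toy_setup_swap_interval(struct toy_display *dpy)
{
    dpy->max_swap_interval = 1;
}

/* Reported order: configs are created first, so they capture 0. */
static int toy_init_buggy(void)
{
    struct toy_display dpy = { 0 };
    struct toy_config conf = toy_add_config(&dpy);
    toy_setup_swap_interval(&dpy);
    return conf.max_swap_interval;
}

/* Proposed order: set the limit first, then create the configs. */
static int toy_init_fixed(void)
{
    struct toy_display dpy = { 0 };
    toy_setup_swap_interval(&dpy);
    struct toy_config conf = toy_add_config(&dpy);
    return conf.max_swap_interval;
}
```

The sketch says nothing about whether the real dri2_setup_swap_interval is safe to call that early, which is the open question raised in the comment.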
[Mesa-dev] [Bug 63078] EGL X11 Regression: Maximum swap interval is 0 (worked with 9.0)
https://bugs.freedesktop.org/show_bug.cgi?id=63078 --- Comment #2 from post+...@ralfj.de --- Created attachment 77661 --> https://bugs.freedesktop.org/attachment.cgi?id=77661&action=edit this patch fixes the issue for me, but I don't know whether it's actually safe to call dri2_setup_swap_interval that early
[Mesa-dev] [PATCH 1/2] radeon/winsys: add uvd ring support to winsys
From: Christian König christian.koe...@amd.com Separated from UVD patch for clarity. Signed-off-by: Christian König christian.koe...@amd.com --- src/gallium/winsys/radeon/drm/radeon_drm_cs.c | 11 +++ src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 17 + src/gallium/winsys/radeon/drm/radeon_winsys.h | 3 +++ 3 files changed, 31 insertions(+) diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_cs.c b/src/gallium/winsys/radeon/drm/radeon_drm_cs.c index aa7e295..720e086 100644 --- a/src/gallium/winsys/radeon/drm/radeon_drm_cs.c +++ b/src/gallium/winsys/radeon/drm/radeon_drm_cs.c @@ -94,6 +94,10 @@ #define RADEON_CS_RING_DMA 2 #endif +#ifndef RADEON_CS_RING_UVD +#define RADEON_CS_RING_UVD 3 +#endif + #ifndef RADEON_CS_END_OF_FRAME #define RADEON_CS_END_OF_FRAME 0x04 #endif @@ -490,6 +494,13 @@ static void radeon_drm_cs_flush(struct radeon_winsys_cs *rcs, unsigned flags) cs->cst->flags[0] |= RADEON_CS_USE_VM; } break; + +case RING_UVD: +cs->cst->flags[0] = 0; +cs->cst->flags[1] = RADEON_CS_RING_UVD; +cs->cst->cs.num_chunks = 3; +break; + default: case RING_GFX: cs->cst->flags[0] = 0; diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c index d1f7643..a378878 100644 --- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c +++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c @@ -90,6 +90,14 @@ #define RADEON_INFO_TIMESTAMP 0x11 #endif +#ifndef RADEON_INFO_RING_WORKING +#define RADEON_INFO_RING_WORKING 0x14 +#endif + +#ifndef RADEON_CS_RING_UVD +#define RADEON_CS_RING_UVD 3 +#endif + static struct util_hash_table *fd_tab = NULL; /* Enable/disable feature access for one command stream.
@@ -323,6 +331,15 @@ static boolean do_winsys_init(struct radeon_drm_winsys *ws) ws->info.r600_has_dma = TRUE; } +/* Check for UVD */ +ws->info.has_uvd = FALSE; +if (ws->info.drm_minor >= 31) { + uint32_t value = RADEON_CS_RING_UVD; +if (radeon_get_drm_value(ws->fd, RADEON_INFO_RING_WORKING, + "UVD Ring working", &value)) +ws->info.has_uvd = value; +} + /* Get GEM info. */ retval = drmCommandWriteRead(ws->fd, DRM_RADEON_GEM_INFO, &gem_info, sizeof(gem_info)); diff --git a/src/gallium/winsys/radeon/drm/radeon_winsys.h b/src/gallium/winsys/radeon/drm/radeon_winsys.h index 36f1f8e..e343188 100644 --- a/src/gallium/winsys/radeon/drm/radeon_winsys.h +++ b/src/gallium/winsys/radeon/drm/radeon_winsys.h @@ -142,6 +142,7 @@ enum chip_class { enum ring_type { RING_GFX = 0, RING_DMA, +RING_UVD, RING_LAST, }; @@ -170,6 +171,8 @@ struct radeon_info { uint32_t drm_minor; uint32_t drm_patchlevel; +boolean has_uvd; + uint32_t r300_num_gb_pipes; uint32_t r300_num_z_pipes; -- 1.7.9.5
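The probing pattern in this patch (a fallback #define for older headers, a version gate, then a query whose in/out value carries the ring id in and the answer out) can be sketched without any kernel interface. This is an illustration under assumptions: toy_query_fn, toy_probe_uvd and toy_query_working are hypothetical stand-ins, and the comparison is assumed to be "drm_minor >= 31", since the operator is garbled in the archived text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fallback for older kernel headers; the value 3 is taken from the patch. */
#ifndef RADEON_CS_RING_UVD
#define RADEON_CS_RING_UVD 3
#endif

/* Hypothetical stand-in for the kernel query: the ring id is passed in
 * through *value and the answer is written back through it. Returns
 * false if the query itself fails. */
typedef bool (*toy_query_fn)(uint32_t *value);

/* Report UVD support only if the interface is new enough AND the query
 * succeeds AND the ring is reported as working. */
static bool toy_probe_uvd(unsigned drm_minor, toy_query_fn query)
{
    bool has_uvd = false;

    if (drm_minor >= 31) {
        uint32_t value = RADEON_CS_RING_UVD;
        if (query(&value))
            has_uvd = value != 0;
    }
    return has_uvd;
}

/* Example query that reports the UVD ring as working. */
static bool toy_query_working(uint32_t *value)
{
    if (*value != RADEON_CS_RING_UVD)
        return false;
    *value = 1;
    return true;
}
```

Defaulting the result to false before the version gate means an old kernel never reports the feature, even if the query function would have answered.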
Re: [Mesa-dev] [PATCH 3/3] i965: Prefer Y-tiling on Gen6+.
On 8 April 2013 19:27, Kenneth Graunke kenn...@whitecape.org wrote: In the past, we preferred X-tiling for color buffers because our BLT code couldn't handle Y-tiling. However, the BLT paths have been largely replaced by BLORP on Gen6+, which can handle any kind of tiling. We hadn't measured any performance improvement in the past, but that's probably because compressed textures were all uncompressed anyway. Improves performance in GLB27_TRex_C24Z16_FixedTime by 7.69231%. Signed-off-by: Kenneth Graunke kenn...@whitecape.org This patch regresses piglit test spec/!OpenGL 1.2/tex3d-maxsize for me (on Gen7). --- src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c index 8dd04be..6a9f08c 100644 --- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c @@ -344,7 +344,7 @@ intel_miptree_choose_tiling(struct intel_context *intel, return I915_TILING_Y; if (width0 >= 64) - return I915_TILING_X; + return intel->gen >= 6 ? I915_TILING_Y : I915_TILING_X; return I915_TILING_NONE; } -- 1.8.1.1
[Mesa-dev] [PATCH] i965/blorp: Remove unnecessary test in gen7_blorp_emit_depth_stencil_config.
gen7_blorp_emit_depth_stencil_config() is only called when
params->depth.mt is non-null.  Therefore, it's not necessary to do an
"if (params->depth.mt)" test inside it.

The presence of this if test was misleading static analysis tools (and
briefly, me) into thinking that gen7_blorp_emit_depth_stencil_config()
might sometimes access uninitialized data and dereference a null
pointer.
---
 src/mesa/drivers/dri/i965/gen7_blorp.cpp | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.cpp b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
index bfd2cbd..3138773 100644
--- a/src/mesa/drivers/dri/i965/gen7_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
@@ -584,10 +584,8 @@ gen7_blorp_emit_depth_stencil_config(struct brw_context *brw,
    uint32_t draw_y = params->depth.y_offset;
    uint32_t tile_mask_x, tile_mask_y;
 
-   if (params->depth.mt) {
-      brw_get_depthstencil_tile_masks(params->depth.mt, NULL,
-                                      &tile_mask_x, &tile_mask_y);
-   }
+   brw_get_depthstencil_tile_masks(params->depth.mt, NULL,
+                                   &tile_mask_x, &tile_mask_y);
 
    /* 3DSTATE_DEPTH_BUFFER */
    {
-- 
1.8.2.1
[Mesa-dev] [Bug 63117] OSMesa Gallium Empty Output
https://bugs.freedesktop.org/show_bug.cgi?id=63117 --- Comment #5 from Brian Paul bri...@vmware.com --- OK, great. If you find any significant differences between the old swrast and llvmpipe results, please file another bug. I'll push the patch today.
Re: [Mesa-dev] [PATCH 01/17] i965/vs: Make type of vec4_visitor::vp more generic.
On 8 April 2013 09:15, Eric Anholt e...@anholt.net wrote: Paul Berry stereotype...@gmail.com writes: The vec4_visitor functions don't use any VS specific data from vec4_visitor::vp. So rename it to just p and change its type from struct gl_vertex_program * to struct gl_program *. This will allow the code to be re-used for geometry shaders. In many other places in the driver, p is a brw_compile. I'd rather not overload that name. In a couple other cases where we've had both the gl_shader_program and the gl_program, the shader_program becomes shader_prog (only about 8 instances in brw_vec4) and gl_program gets to be just prog That seems reasonable to me. I'll break this out into two patches, one to rename struct gl_shader_program *prog to shader_prog, and another to change the type of vp and rename it to prog. Note that gl_shader_program *prog appears in the backend_visitor base class, so renaming it will affect fs code too.
[Mesa-dev] [PATCH] st/vdpau: fix subtitle related bug
From: Christian König christian.koe...@amd.com Drawing subtitles didn't increased the dirty area of the surface. Signed-off-by: Christian König christian.koe...@amd.com --- src/gallium/state_trackers/vdpau/output.c |8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/gallium/state_trackers/vdpau/output.c b/src/gallium/state_trackers/vdpau/output.c index 8237eac..043d149 100644 --- a/src/gallium/state_trackers/vdpau/output.c +++ b/src/gallium/state_trackers/vdpau/output.c @@ -382,7 +382,7 @@ vlVdpOutputSurfacePutBitsIndexed(VdpOutputSurface surface, vl_compositor_clear_layers(cstate); vl_compositor_set_palette_layer(cstate, compositor, 0, sv_idx, sv_tbl, NULL, NULL, false); vl_compositor_set_layer_dst_area(cstate, 0, RectToPipe(destination_rect, dst_rect)); - vl_compositor_render(cstate, compositor, vlsurface-surface, NULL); + vl_compositor_render(cstate, compositor, vlsurface-surface, vlsurface-dirty_area); pipe_sampler_view_reference(sv_idx, NULL); pipe_sampler_view_reference(sv_tbl, NULL); @@ -488,7 +488,7 @@ vlVdpOutputSurfacePutBitsYCbCr(VdpOutputSurface surface, vl_compositor_clear_layers(cstate); vl_compositor_set_buffer_layer(cstate, compositor, 0, vbuffer, NULL, NULL, VL_COMPOSITOR_WEAVE); vl_compositor_set_layer_dst_area(cstate, 0, RectToPipe(destination_rect, dst_rect)); - vl_compositor_render(cstate, compositor, vlsurface-surface, NULL); + vl_compositor_render(cstate, compositor, vlsurface-surface, vlsurface-dirty_area); vbuffer-destroy(vbuffer); pipe_mutex_unlock(vlsurface-device-mutex); @@ -658,7 +658,7 @@ vlVdpOutputSurfaceRenderOutputSurface(VdpOutputSurface destination_surface, RectToPipe(source_rect, src_rect), NULL, ColorsToPipe(colors, flags, vlcolors)); vl_compositor_set_layer_dst_area(cstate, 0, RectToPipe(destination_rect, dst_rect)); - vl_compositor_render(cstate, compositor, dst_vlsurface-surface, NULL); + vl_compositor_render(cstate, compositor, dst_vlsurface-surface, dst_vlsurface-dirty_area); 
context-delete_blend_state(context, blend); pipe_mutex_unlock(dst_vlsurface-device-mutex); @@ -717,7 +717,7 @@ vlVdpOutputSurfaceRenderBitmapSurface(VdpOutputSurface destination_surface, RectToPipe(source_rect, src_rect), NULL, ColorsToPipe(colors, flags, vlcolors)); vl_compositor_set_layer_dst_area(cstate, 0, RectToPipe(destination_rect, dst_rect)); - vl_compositor_render(cstate, compositor, dst_vlsurface-surface, NULL); + vl_compositor_render(cstate, compositor, dst_vlsurface-surface, dst_vlsurface-dirty_area); context-delete_blend_state(context, blend); pipe_mutex_unlock(dst_vlsurface-device-mutex); -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
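The bug class fixed here, rendering without telling the surface which region changed, can be illustrated with a small dirty-rectangle tracker. This is a sketch under assumptions: toy_rect, toy_grow_dirty and toy_render are hypothetical, and the sketch does not claim to reproduce how vl_compositor_render actually uses its dirty-area argument.

```c
/* Half-open rectangle: covers [x0, x1) x [y0, y1). */
struct toy_rect { int x0, y0, x1, y1; };

static int toy_min(int a, int b) { return a < b ? a : b; }
static int toy_max(int a, int b) { return a > b ? a : b; }

/* Grow *dirty so that it also covers 'drawn'. An empty dirty rectangle
 * is simply replaced by the drawn area. */
static void toy_grow_dirty(struct toy_rect *dirty, const struct toy_rect *drawn)
{
    if (dirty->x0 >= dirty->x1 || dirty->y0 >= dirty->y1) {
        *dirty = *drawn;
        return;
    }
    dirty->x0 = toy_min(dirty->x0, drawn->x0);
    dirty->y0 = toy_min(dirty->y0, drawn->y0);
    dirty->x1 = toy_max(dirty->x1, drawn->x1);
    dirty->y1 = toy_max(dirty->y1, drawn->y1);
}

/* Render into 'drawn'. Passing NULL for 'dirty' means nothing is
 * tracked, which mirrors the pre-patch calls that passed NULL. */
static void toy_render(const struct toy_rect *drawn, struct toy_rect *dirty)
{
    /* ... the actual drawing would happen here ... */
    if (dirty)
        toy_grow_dirty(dirty, drawn);
}
```

With NULL passed in, a later step that only refreshes the dirty region would skip the newly drawn pixels, which matches the subtitle symptom described in the commit message.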
[Mesa-dev] Regarding UVD implementation check-out
Hi, Can you please let me know how to check out the mesa code for UVD implementation and how to apply the kernel patches http://people.freedesktop.org/~agd5f/radeon_ucode/?C=N;O=D; to test the UVD with AMD GPU in Linux. Regards, Ramesh
[Mesa-dev] EGL and glClear(): Nothing is drawn on screen
Hi list,

first of all, sorry if this is the wrong list for my issue. I tried various IRC channels, but got no reply, and since I am not actually sure there is a bug here, I decided to write to a list instead of filing a bug report.

My issue is as follows: I wrote a simple GL application which makes it easy to see tearing. All it does is
* call glClear()
* draw a vertical box as a quad
* swap the buffers
* call glFlush()

I wrote it in a way that I can compile it to use either GLX or EGL. Everything works fine with GLX, but using literally the same code (except for the initialisation and buffer swap, of course), it does not work in EGL. Instead of drawing the box, the window stays black. If I draw a black quad all over the screen instead of calling glClear(), it works fine even with EGL.

I am very new to GL, but from my understanding, the above code should work fine - is there a problem with my program, or is this a bug in EGL? You can find the sample program at http://www.ralfj.de/git/gltest.git . The default behaviour is as described above; calling it with -o uses a black quad instead of glClear().

I am using Debian testing with the current 3.8.6 vanilla kernel. Besides the mesa 8.0 that comes with Debian, I compiled the 9.0, 9.1 and master branches of mesa, all with the same result. My GPU is the one built into my Core i5-2450M (I think that's an HD2000).

Kind regards,
Ralf

PS: I am not subscribed to the list, so please keep me in CC.
Re: [Mesa-dev] [PATCH] glsl: Fix ir_print_visitor's handling of interpolation qualifiers.
On 8 April 2013 10:23, Ian Romanick i...@freedesktop.org wrote: On 04/06/2013 07:20 PM, Paul Berry wrote: This patch updates the interp[] array to match the enum glsl_interp_qualifier. Can we use STATIC_ASSERT to make sure these arrays are at least the correct size? Sure, that seems reasonable. I'll do a similar thing for the mode[] array. --- src/glsl/ir_print_visitor.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/glsl/ir_print_visitor.cpp b/src/glsl/ir_print_visitor.cpp index 597d281..8b445dc 100644 --- a/src/glsl/ir_print_visitor.cpp +++ b/src/glsl/ir_print_visitor.cpp @@ -149,7 +149,7 @@ void ir_print_visitor::visit(ir_variable *ir) const char *const mode[] = { "", "uniform ", "shader_in ", "shader_out ", "in ", "out ", "inout ", "const_in ", "sys ", "temporary " }; - const char *const interp[] = { "", "flat", "noperspective" }; + const char *const interp[] = { "", "smooth", "flat", "noperspective" }; printf("(%s%s%s%s) ", cent, inv, mode[ir->mode], interp[ir->interpolation]);
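The size check suggested in this exchange can be sketched with a compile-time assertion. This is an illustration only: it uses C11 _Static_assert rather than Mesa's own STATIC_ASSERT macro, and the toy_ enum, array and helper are hypothetical stand-ins for glsl_interp_qualifier and interp[].

```c
#include <stddef.h>

/* Hypothetical stand-in for an enum like glsl_interp_qualifier. */
enum toy_interp {
    TOY_INTERP_NONE = 0,
    TOY_INTERP_SMOOTH,
    TOY_INTERP_FLAT,
    TOY_INTERP_NOPERSPECTIVE,
    TOY_INTERP_COUNT /* must stay last */
};

#define TOY_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* One printable name per enum value, in enum order. */
static const char *const toy_interp_names[] = {
    "", "smooth", "flat", "noperspective"
};

/* Fails the build if someone adds an enum value without adding a name. */
_Static_assert(TOY_ARRAY_SIZE(toy_interp_names) == (size_t)TOY_INTERP_COUNT,
               "toy_interp_names needs one entry per enum toy_interp value");

static const char *toy_interp_name(enum toy_interp q)
{
    return toy_interp_names[q];
}
```

A check like this only catches a length mismatch. It would not have caught two names being listed in the wrong order, so the array still has to be kept in enum order by hand.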
[Mesa-dev] GSoC Call for Projects and Mentors
As some of you may know, the X.Org Foundation was accepted as a GSoC Project for 2013. If you would be interested in participating as either a mentor or a student, please sign up here: http://www.google-melange.com/gsoc/homepage/google/gsoc2013 For students, you can see some potential projects ideas here to get you started: http://www.x.org/wiki/SummerOfCodeIdeas If anyone has any questions, let me know. Thanks! Alex
Re: [Mesa-dev] [PATCH 06/12] glsl: Add lowering pass for ir_triop_vector_insert
Ian Romanick i...@freedesktop.org writes: From: Ian Romanick ian.d.roman...@intel.com This will eventually replace do_vec_index_to_cond_assign. This lowering pass is called in all the places where do_vec_index_to_cond_assign or do_vec_index_to_swizzle is called. + + ir_constant *const cmp_indices = + new(factory.mem_ctx) ir_constant(glsl_type::get_instance(GLSL_TYPE_INT, + components, + 1), + &cmp_indices_data); + + ir_variable *const cmp_result = + factory.make_temp(glsl_type::get_instance(GLSL_TYPE_BOOL, + components, + 1), + "index_condition"); I wish we had some helpers like glsl_type::bvec(components) instead of these verbose getters for sized vectors. + ir_variable *const src_temp = + factory.make_temp(expr->operands[1]->type, "src_temp"); + + const ir_swizzle_mask m = { 0, 0, 0, 0, components, false }; + ir_rvalue *broadcast_index = + new(factory.mem_ctx) ir_swizzle(expr->operands[2], m); Optionally: ir_rvalue *broadcast_index = swizzle_for_size(swizzle_(expr->operands[2]), components). + factory.emit(assign(temp, expr->operands[0])); + factory.emit(assign(src_temp, expr->operands[1])); + factory.emit(assign(cmp_result, equal(broadcast_index, cmp_indices))); + factory.emit(if_tree(swizzle_x(cmp_result), assign(temp, src_temp, 1))); + factory.emit(if_tree(swizzle_y(cmp_result), assign(temp, src_temp, 2))); + + if (components > 2) + factory.emit(if_tree(swizzle_z(cmp_result), assign(temp, src_temp, 4))); + + if (components > 3) + factory.emit(if_tree(swizzle_w(cmp_result), assign(temp, src_temp, 8))); Please use the WRITEMASK_* defines here. I know our hardware doesn't like the swizzling of that bvec compare result and we'd rather just see individual compares as the condition of each if statement. (we basically have to emit a compare of the swizzled bool against 0, after masking its high bits off, when we could have just compared the index value to a constant). I imagine other hardware would prefer the same.
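The lowering discussed in this review can be sketched in plain C: a dynamically indexed write into a vector becomes a series of compares against constant channel numbers, each guarding a masked write. This is only an illustration of the shape suggested in the review (individual compares, named write masks). The toy_ names are hypothetical; the mask values 1, 2, 4 and 8 match the literals in the quoted patch.

```c
/* Per-channel write masks, with the same values as the literals in the patch. */
#define TOY_WRITEMASK_X 0x1u
#define TOY_WRITEMASK_Y 0x2u
#define TOY_WRITEMASK_Z 0x4u
#define TOY_WRITEMASK_W 0x8u

/* Write 'value' into every channel of dst selected by 'writemask'. */
static void toy_write_masked(float dst[4], float value, unsigned writemask)
{
    for (unsigned i = 0; i < 4; i++) {
        if (writemask & (1u << i))
            dst[i] = value;
    }
}

/* Lowered form of "vec[index] = scalar" for a vector with 'components'
 * channels (2 to 4): one compare per channel, each guarding one masked
 * write. An out-of-range index writes nothing. */
static void toy_lower_vector_insert(float vec[4], unsigned components,
                                    int index, float scalar)
{
    if (index == 0)
        toy_write_masked(vec, scalar, TOY_WRITEMASK_X);
    if (index == 1)
        toy_write_masked(vec, scalar, TOY_WRITEMASK_Y);
    if (components > 2 && index == 2)
        toy_write_masked(vec, scalar, TOY_WRITEMASK_Z);
    if (components > 3 && index == 3)
        toy_write_masked(vec, scalar, TOY_WRITEMASK_W);
}
```

Comparing the index against a constant in each condition corresponds to the reviewer's preference over building a boolean vector first and swizzling it.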
Re: [Mesa-dev] [PATCH 08/12] glsl: Generate ir_binop_vector_extract for indexing of vectors
Ian Romanick i...@freedesktop.org writes: From: Ian Romanick ian.d.roman...@intel.com Now ir_dereference_array of a vector will never occur in the RHS of an expression. Signed-off-by: Ian Romanick ian.d.roman...@intel.com --- src/glsl/ast_array_index.cpp | 23 +-- 1 file changed, 17 insertions(+), 6 deletions(-) diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp index 862f64c..e7bc299 100644 --- a/src/glsl/ast_array_index.cpp +++ b/src/glsl/ast_array_index.cpp @@ -31,17 +31,13 @@ _mesa_ast_array_index_to_hir(void *mem_ctx, ir_rvalue *array, ir_rvalue *idx, YYLTYPE &loc, YYLTYPE &idx_loc) { - ir_rvalue *result = new(mem_ctx) ir_dereference_array(array, idx); - if (!array->type->is_error() && !array->type->is_array() && !array->type->is_matrix() -&& !array->type->is_vector()) { +&& !array->type->is_vector()) _mesa_glsl_error(& idx_loc, state, "cannot dereference non-array / non-matrix / non-vector"); - result->type = glsl_type::error_type; - } Style-wise, I'd prefer those curly braces stay -- there's very little distinguishing the if condition from the body, already.
Re: [Mesa-dev] [PATCH] i965/blorp: Remove unnecessary test in gen7_blorp_emit_depth_stencil_config.
On 04/09/2013 06:32 AM, Paul Berry wrote:

gen7_blorp_emit_depth_stencil_config() is only called when params->depth.mt is non-null. Therefore, it's not necessary to do an "if (params->depth.mt)" test inside it.

The presence of this "if" test was misleading static analysis tools (and briefly, me) into thinking that gen7_blorp_emit_depth_stencil_config() might sometimes access uninitialized data and dereference a null pointer.
---
 src/mesa/drivers/dri/i965/gen7_blorp.cpp | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.cpp b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
index bfd2cbd..3138773 100644
--- a/src/mesa/drivers/dri/i965/gen7_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
@@ -584,10 +584,8 @@ gen7_blorp_emit_depth_stencil_config(struct brw_context *brw,
    uint32_t draw_y = params->depth.y_offset;
    uint32_t tile_mask_x, tile_mask_y;
 
-   if (params->depth.mt) {
-      brw_get_depthstencil_tile_masks(params->depth.mt, NULL,
-                                      &tile_mask_x, &tile_mask_y);
-   }
+   brw_get_depthstencil_tile_masks(params->depth.mt, NULL,
+                                   &tile_mask_x, &tile_mask_y);
 
    /* 3DSTATE_DEPTH_BUFFER */
    {

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
Re: [Mesa-dev] [PATCH 00/12] Death to array dereferences of vectors!
Ian Romanick i...@freedesktop.org writes:

This series gradually replaces array dereferences of vectors with two expressions. It takes so many patches because changes are needed to the existing lowering passes and because several places in the code generate array dereferences of vectors (e.g., lowering accesses to gl_ClipDistance). There is also some challenge in dealing with function inout parameters that are indexed vectors.

The two new expressions are ir_binop_vector_extract and ir_triop_vector_insert. The former has a vector operand and a scalar operand. The result is the scalar value from the vector specified by the scalar. The latter takes a vector and two scalars. The result is a new vector with one indexed field replaced by a scalar value.

Together this series fixes piglit tests glsl-vs-channel-overwrite-01 and glsl-vs-channel-overwrite-03.

Throughout the series, there's a bunch of introduction of new tabs for indentation. Paul pointed out long ago that the devinfo.html had specified a no-tabs indent style in the indent command since 2006, and I found that basically you and I were the only ones putting tabs in, so I stopped. I've found reading diffs has become easier since avoiding tabs, since you don't get diffs with apparently-incorrect indentation (thanks to + being 3 characters, in particular).

I'd love to see this code fixed to not use tabs. If you use emacs, removing your custom configuration for Mesa and relying on .dir-locals.el will get you the preferred style for future work.

Other than that, the patches other than the ones I commented on and the gl_ClipDistance ones are:

Reviewed-by: Eric Anholt e...@anholt.net
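The value semantics of the two expressions described in the cover letter can be sketched in plain C++. This models only what the text says (extract yields a scalar, insert yields a new vector with one field replaced); Vec4, vector_extract() and vector_insert() are hypothetical helpers rather than the IR classes, and tripping an assert on an out-of-range index is an assumption made so the sketch stays well defined.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a four-component vector value.
using Vec4 = std::array<float, 4>;

// Models ir_binop_vector_extract: a vector operand and a scalar index
// operand, producing the selected scalar.
float vector_extract(const Vec4 &vec, std::size_t index)
{
   assert(index < vec.size());
   return vec[index];
}

// Models ir_triop_vector_insert: a vector and two scalars (the new value
// and the index). The input is taken by value, so the caller's vector is
// left alone and a new vector is returned.
Vec4 vector_insert(Vec4 vec, float scalar, std::size_t index)
{
   assert(index < vec.size());
   vec[index] = scalar;
   return vec;
}
```

Under this model a write such as v[i] = x is expressed as v = vector_insert(v, x, i), and a read x = v[i] as x = vector_extract(v, i), so neither side needs an array dereference of a vector.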
Re: [Mesa-dev] [PATCH 01/17] i965/vs: Make type of vec4_visitor::vp more generic.
Paul Berry stereotype...@gmail.com writes:

On 8 April 2013 09:15, Eric Anholt e...@anholt.net wrote:

Paul Berry stereotype...@gmail.com writes:

The vec4_visitor functions don't use any VS specific data from vec4_visitor::vp. So rename it to just "p" and change its type from struct gl_vertex_program * to struct gl_program *. This will allow the code to be re-used for geometry shaders.

In many other places in the driver, "p" is a brw_compile. I'd rather not overload that name. In a couple other cases where we've had both the gl_shader_program and the gl_program, the shader_program becomes "shader_prog" (only about 8 instances in brw_vec4) and gl_program gets to be just "prog".

That seems reasonable to me. I'll break this out into two patches, one to rename struct gl_shader_program *prog to shader_prog, and another to change the type of vp and rename it to prog. Note that gl_shader_program *prog appears in the backend_visitor base class, so renaming it will affect fs code too.

Sounds good. More consistency on this front will only help my sanity.
[Mesa-dev] [PATCH] st/vdpau: fix subtitle related bug v2
From: Christian König christian.koe...@amd.com Drawing subtitles didn't increased the dirty area of the surface. v2: don't clear the surface Signed-off-by: Christian König christian.koe...@amd.com --- src/gallium/state_trackers/vdpau/output.c |4 1 file changed, 4 insertions(+) diff --git a/src/gallium/state_trackers/vdpau/output.c b/src/gallium/state_trackers/vdpau/output.c index 8237eac..df0f458 100644 --- a/src/gallium/state_trackers/vdpau/output.c +++ b/src/gallium/state_trackers/vdpau/output.c @@ -383,6 +383,7 @@ vlVdpOutputSurfacePutBitsIndexed(VdpOutputSurface surface, vl_compositor_set_palette_layer(cstate, compositor, 0, sv_idx, sv_tbl, NULL, NULL, false); vl_compositor_set_layer_dst_area(cstate, 0, RectToPipe(destination_rect, dst_rect)); vl_compositor_render(cstate, compositor, vlsurface-surface, NULL); + vl_compositor_reset_dirty_area(vlsurface-dirty_area); pipe_sampler_view_reference(sv_idx, NULL); pipe_sampler_view_reference(sv_tbl, NULL); @@ -489,6 +490,7 @@ vlVdpOutputSurfacePutBitsYCbCr(VdpOutputSurface surface, vl_compositor_set_buffer_layer(cstate, compositor, 0, vbuffer, NULL, NULL, VL_COMPOSITOR_WEAVE); vl_compositor_set_layer_dst_area(cstate, 0, RectToPipe(destination_rect, dst_rect)); vl_compositor_render(cstate, compositor, vlsurface-surface, NULL); + vl_compositor_reset_dirty_area(vlsurface-dirty_area); vbuffer-destroy(vbuffer); pipe_mutex_unlock(vlsurface-device-mutex); @@ -659,6 +661,7 @@ vlVdpOutputSurfaceRenderOutputSurface(VdpOutputSurface destination_surface, ColorsToPipe(colors, flags, vlcolors)); vl_compositor_set_layer_dst_area(cstate, 0, RectToPipe(destination_rect, dst_rect)); vl_compositor_render(cstate, compositor, dst_vlsurface-surface, NULL); + vl_compositor_reset_dirty_area(dst_vlsurface-dirty_area); context-delete_blend_state(context, blend); pipe_mutex_unlock(dst_vlsurface-device-mutex); @@ -718,6 +721,7 @@ vlVdpOutputSurfaceRenderBitmapSurface(VdpOutputSurface destination_surface, ColorsToPipe(colors, flags, vlcolors)); 
vl_compositor_set_layer_dst_area(cstate, 0, RectToPipe(destination_rect, dst_rect)); vl_compositor_render(cstate, compositor, dst_vlsurface-surface, NULL); + vl_compositor_reset_dirty_area(dst_vlsurface-dirty_area); context-delete_blend_state(context, blend); pipe_mutex_unlock(dst_vlsurface-device-mutex); -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 63078] EGL X11 Regression: Maximum swap interval is 0 (worked with 9.0)
https://bugs.freedesktop.org/show_bug.cgi?id=63078

Chad Versace chad.vers...@linux.intel.com changed:

What    |Removed |Added
--------+--------+---------
Status  |NEW     |ASSIGNED

--- Comment #3 from Chad Versace chad.vers...@linux.intel.com ---

(In reply to comment #2)
Created attachment 77661
this patch fixes the issue for me, but I don't know whether it's actually safe to call dri2_setup_swap_interval that early

Your patch looks good to me. I'll submit it to the mesa-dev list.

Your patch, though, only fixes X11. I'll follow up with a similar fix for Wayland.

Embarrassingly, I recall seeing this bug when I reviewed the guilty commit long ago. But, that was a busy day and I forgot about the bug.
Re: [Mesa-dev] [PATCH] gallium/opencl: Fix out-of-tree build
On Tue, Apr 9, 2013 at 2:28 AM, Niels Ole Salscheider niels_...@salscheider-online.de wrote:

On Tuesday, 9 April 2013, 11:17:39, Michel Dänzer wrote:

From: Michel Dänzer michel.daen...@amd.com

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/targets/opencl/Makefile.am | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am
index 389eecc..810f9bb 100644
--- a/src/gallium/targets/opencl/Makefile.am
+++ b/src/gallium/targets/opencl/Makefile.am
@@ -32,11 +32,11 @@ libOpenCL_la_SOURCES =
 # Force usage of a C++ linker
 nodist_EXTRA_libOpenCL_la_SOURCES = dummy.cpp
 
-PIPE_SRC_DIR = $(top_srcdir)/src/gallium/targets/pipe-loader
+PIPE_BUILD_DIR = $(top_builddir)/src/gallium/targets/pipe-loader
 
 # Provide compatibility with scripts for the old Mesa build system for
 # a while by putting a link to the driver into /lib of the build tree.
 all-local: libOpenCL.la
-	@$(MAKE) -C $(PIPE_SRC_DIR)
+	@$(MAKE) -C $(PIPE_BUILD_DIR)
 	$(MKDIR_P) $(top_builddir)/$(LIB_DIR)
 	ln -f .libs/libOpenCL.so* $(top_builddir)/$(LIB_DIR)/
--
1.8.2

I sent that patch to the list on 24.02.2013, but Matt Turner said that he has a better solution that does not involve calling make...

Sorry for not handling it sooner. I really don't enjoy doing automake stuff -- it's hard to find the motivation to push through a big series that has even a small chance of breaking anything. I expected to get to finishing that series sooner when I emailed you.

Michel, if you want to commit Niels' patch I'd be grateful. Whenever I get back to my series, I can easily enough rebase it.
Re: [Mesa-dev] [PATCH 06/17] i965/vs: split brw_vs_prog_data into generic and VS-specific parts.
On 8 April 2013 09:28, Eric Anholt e...@anholt.net wrote:

Paul Berry stereotype...@gmail.com writes:

-/* Note: brw_vs_prog_data_compare() must be updated when adding fields to this
- * struct!
+
+/* Note: brw_vec4_prog_data_compare() must be updated when adding fields to
+ * this struct!
  */
-struct brw_vs_prog_data {
+struct brw_vec4_prog_data {
    struct brw_vue_map vue_map;

    GLuint curb_read_length;
-   GLuint urb_read_length;
    GLuint total_grf;
    GLuint nr_params;       /**< number of float params/constants */
    GLuint nr_pull_params;  /**< number of dwords referenced by pull_param[] */
    GLuint total_scratch;
+   int num_surfaces;
+
+   /* These pointers must appear last. See brw_vec4_prog_data_compare(). */
+   const float **param;
+   const float **pull_param;
+};
+
+
+/* Note: brw_vs_prog_data_compare() must be updated when adding fields to this
+ * struct!
+ */
+struct brw_vs_prog_data {
+   struct brw_vec4_prog_data base;
+
+   GLuint urb_read_length;

There's a URB read length in the GS state packet, so it seems like you'd want this field in the GS case as well as VS. I'm confused. I also would have expected urb_entry_size in GS.

You're right. I don't know what I was thinking. I'll fix both of those in v2.
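The "these pointers must appear last" comment in the quoted struct suggests a comparison that treats everything before the first pointer as plain bytes and then handles the pointed-to tables separately. A minimal sketch of that idea follows, under stated assumptions: the structs are trimmed-down hypothetical stand-ins (vs_only_field in particular is invented), urb_read_length is placed in the generic part as agreed in the reply above, and the body of the real brw_vec4_prog_data_compare() is not shown in this thread, so this is not a copy of it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical, trimmed-down stand-in for the generic vec4 program data.
struct vec4_prog_data {
   unsigned curb_read_length;
   unsigned urb_read_length;
   unsigned total_grf;
   unsigned nr_params;

   // Pointers come last, so every field before them can be compared as
   // one contiguous run of bytes.
   const float **param;
};

// Hypothetical VS-specific wrapper that embeds the generic part as "base".
struct vs_prog_data {
   vec4_prog_data base;
   unsigned vs_only_field;   // invented field, for illustration only
};

bool vec4_prog_data_equal(const vec4_prog_data *a, const vec4_prog_data *b)
{
   // Bytewise compare of all fields that precede the first pointer. This
   // assumes both structs were zero-filled before being populated, so any
   // padding bytes compare equal.
   if (std::memcmp(a, b, offsetof(vec4_prog_data, param)) != 0)
      return false;

   // nr_params matched above, so both tables have the same length.
   for (unsigned i = 0; i < a->nr_params; i++) {
      if (a->param[i] != b->param[i])
         return false;
   }
   return true;
}

bool vs_prog_data_equal(const vs_prog_data *a, const vs_prog_data *b)
{
   // The stage-specific compare delegates to the generic one, then checks
   // its own fields.
   return vec4_prog_data_equal(&a->base, &b->base) &&
          a->vs_only_field == b->vs_only_field;
}
```

A geometry shader variant would embed the same base struct and reuse vec4_prog_data_equal(), which is the reuse the split is meant to enable.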
[Mesa-dev] [PATCH 1/2] i965: NULL check prog on compilation failure.
I believe that prog can only be NULL for ARB programs. Neither brw_fs_fp.cpp nor brw_vec4_vp.cpp call fail(), but not NULL checking prog is obviously fragile. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 8 +--- src/mesa/drivers/dri/i965/brw_vec4.cpp | 8 +--- src/mesa/drivers/dri/i965/brw_vs.c | 2 +- src/mesa/drivers/dri/i965/brw_wm.c | 2 +- 4 files changed, 12 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index c12ba45..2086af8 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -2886,7 +2886,7 @@ brw_wm_fs_emit(struct brw_context *brw, struct brw_wm_compile *c, if (unlikely(INTEL_DEBUG DEBUG_WM)) { if (shader) { - printf(GLSL IR for native fragment shader %d:\n, prog-Name); + printf(GLSL IR for native fragment shader %d:\n, prog ? prog-Name : -1); _mesa_print_ir(shader-ir, NULL); printf(\n\n); } else { @@ -2900,8 +2900,10 @@ brw_wm_fs_emit(struct brw_context *brw, struct brw_wm_compile *c, */ fs_visitor v(brw, c, prog, fp, 8); if (!v.run()) { - prog-LinkStatus = false; - ralloc_strcat(prog-InfoLog, v.fail_msg); + if (prog) { + prog-LinkStatus = false; + ralloc_strcat(prog-InfoLog, v.fail_msg); + } _mesa_problem(NULL, Failed to compile fragment shader: %s\n, v.fail_msg); diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index c58fb44..446b4cf 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -1506,7 +1506,7 @@ brw_vs_emit(struct brw_context *brw, if (unlikely(INTEL_DEBUG DEBUG_VS)) { if (shader) { - printf(GLSL IR for native vertex shader %d:\n, prog-Name); + printf(GLSL IR for native vertex shader %d:\n, prog ? 
prog-Name : -1); _mesa_print_ir(shader-ir, NULL); printf(\n\n); } else { @@ -1518,8 +1518,10 @@ brw_vs_emit(struct brw_context *brw, vec4_visitor v(brw, c, prog, shader, mem_ctx); if (!v.run()) { - prog-LinkStatus = false; - ralloc_strcat(prog-InfoLog, v.fail_msg); + if (prog) { + prog-LinkStatus = false; + ralloc_strcat(prog-InfoLog, v.fail_msg); + } return NULL; } diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c index 6d2c0fd..f7a5e41 100644 --- a/src/mesa/drivers/dri/i965/brw_vs.c +++ b/src/mesa/drivers/dri/i965/brw_vs.c @@ -351,7 +351,7 @@ brw_vs_debug_recompile(struct brw_context *brw, const struct brw_vs_prog_key *old_key = NULL; bool found = false; - perf_debug(Recompiling vertex shader for program %d\n, prog-Name); + perf_debug(Recompiling vertex shader for program %d\n, prog ? prog-Name : -1); for (unsigned int i = 0; i brw-cache.size; i++) { for (c = brw-cache.items[i]; c; c = c-next) { diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c index 9b30ba1..6bd95b4 100644 --- a/src/mesa/drivers/dri/i965/brw_wm.c +++ b/src/mesa/drivers/dri/i965/brw_wm.c @@ -248,7 +248,7 @@ brw_wm_debug_recompile(struct brw_context *brw, const struct brw_wm_prog_key *old_key = NULL; bool found = false; - perf_debug(Recompiling fragment shader for program %d\n, prog-Name); + perf_debug(Recompiling fragment shader for program %d\n, prog ? prog-Name : -1); for (unsigned int i = 0; i brw-cache.size; i++) { for (c = brw-cache.items[i]; c; c = c-next) { -- 1.8.1.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
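The pattern this patch applies at each call site (print -1 when there is no program, and only touch the link status and info log when there is one) can be sketched as two small helpers. This is an illustration under assumptions, not driver code: shader_program is a hypothetical stand-in for the real struct, std::string replaces the ralloc-managed log, and both helper names are invented.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the fields of the shader program that the
// patch touches.
struct shader_program {
   int Name = 0;
   bool LinkStatus = true;
   std::string InfoLog;
};

// Debug output wants a program number even when there is no program
// object, so fall back to -1 as the patch does.
int program_name_or_invalid(const shader_program *prog)
{
   return prog ? prog->Name : -1;
}

// Record a compile failure on the program if there is one; with a null
// program there is nowhere to record it, so do nothing.
void record_compile_failure(shader_program *prog, const char *fail_msg)
{
   if (!prog)
      return;

   prog->LinkStatus = false;
   prog->InfoLog += fail_msg;
}
```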
[Mesa-dev] [PATCH 2/2] i965/vs: Print error if vertex shader fails to compile.
--- src/mesa/drivers/dri/i965/brw_vec4.cpp | 4 1 file changed, 4 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 446b4cf..e129816 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -1522,6 +1522,10 @@ brw_vs_emit(struct brw_context *brw, prog-LinkStatus = false; ralloc_strcat(prog-InfoLog, v.fail_msg); } + + _mesa_problem(NULL, Failed to compile vertex shader: %s\n, +v.fail_msg); + return NULL; } -- 1.8.1.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 11/17] i965/vs: move VS-specific data members to vs_vec4_visitor.
On 8 April 2013 09:41, Eric Anholt e...@anholt.net wrote:

Paul Berry stereotype...@gmail.com writes:

This patch moves the following data structures from vec4_visitor to vec4_vs_visitor, since they contain VS-specific data:

- struct brw_vs_compile *c
- struct brw_vs_prog_data *prog_data
- src_reg *vp_temp_regs
- src_reg vp_addr_reg

Since brw_vs_compile and brw_vs_prog_data also contain vec4-generic data, the following pointers are added to the base class, to allow it to access the vec4-generic portions of these data structures:

- struct brw_vec4_compile *vec4_compile
- struct brw_vec4_prog_key *vec4_key
- struct brw_vec4_prog_data *vec4_prog_data

I would lean toward the base class (which contains most of the members and usages, I think) having the short name, and the derived class having the more specific name.

Sure, I can go along with that.
Re: [Mesa-dev] [PATCH] r600g: Fix UMAD on Cayman
On 04/09/2013 10:58 AM, Martin Andersson wrote:

On Tue, Apr 9, 2013 at 3:18 AM, Marek Olšák mar...@gmail.com wrote:

Pushed, thanks. The transform feedback test still doesn't pass, but at least the hardlocks are gone.

Thanks, I have looked into the other issue as well http://lists.freedesktop.org/archives/mesa-dev/2013-March/036941.html

The problem arises when there are nested loops. If I rework the code so there are no nested loops the issue disappears. At least one pixel also needs to enter the outer loop. The pixels that should enter the outer loop behave correctly. It is those pixels that should not enter the outer loop that misbehave. It does not matter if they also fail the test for the inner loop, they will still execute the instruction inside. That leads to the strange results for that test.

Please test the attached patch.

Vadim

The strangeness is easier to see if the NUM_POINTS in the ext_transform_feedback/order.c are run with smaller values, like 3, 6 and 9. Disable the code that fails the test and print starting_x, shift_reg_final and iteration_count.

Marek, since you implemented transform feedback for r600, do you think the issue is with the transform feedback code or the shader compiler or some other thing?
//Martin ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev From 46456ca7ecfa3f0b107b1f9106d024f9f239a571 Mon Sep 17 00:00:00 2001 From: Vadim Girlin vadimgir...@gmail.com Date: Wed, 10 Apr 2013 01:20:19 +0400 Subject: [PATCH] r600g: use ALU EXECUTE_MASK_OP on cayman instead of LOOP_BREAK/CONTINUE Signed-off-by: Vadim Girlin vadimgir...@gmail.com --- src/gallium/drivers/r600/r600_asm.c| 14 -- src/gallium/drivers/r600/r600_shader.c | 24 +++- src/gallium/drivers/r600/r600d.h | 5 + 3 files changed, 40 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/r600/r600_asm.c b/src/gallium/drivers/r600/r600_asm.c index 26a848a..2874adf 100644 --- a/src/gallium/drivers/r600/r600_asm.c +++ b/src/gallium/drivers/r600/r600_asm.c @@ -1985,6 +1985,7 @@ void r600_bytecode_disasm(struct r600_bytecode *bc) LIST_FOR_EACH_ENTRY(alu, cf-alu, list) { const char *omod_str[] = {,*2,*4,/2}; const struct alu_op_info *aop = r600_isa_alu(alu-op); + bool cm_execmask_op = alu-execute_mask bc-chip_class == CAYMAN; int o = 0; r600_bytecode_alu_nliterals(bc, alu, literal, nliteral); @@ -1997,8 +1998,10 @@ void r600_bytecode_disasm(struct r600_bytecode *bc) alu-update_pred ? 'P':' ', alu-pred_sel ? alu-pred_sel==2 ? '0':'1':' '); - o += fprintf(stderr, %s%s%s , aop-name, - omod_str[alu-omod], alu-dst.clamp ? _sat:); + o += fprintf(stderr, %s , aop-name); + if (!cm_execmask_op) +o += fprintf(stderr, %s , omod_str[alu-omod]); + o += fprintf(stderr, %s , alu-dst.clamp ? 
_sat:); o += print_indent(o,60); o += print_dst(alu); @@ -2012,6 +2015,13 @@ void r600_bytecode_disasm(struct r600_bytecode *bc) o += fprintf(stderr, BS:%d, alu-bank_swizzle); } + if (cm_execmask_op alu-omod) { +static const char* cm_em_op_names[] = + {BREAK, CONTINUE, KILL}; + +fprintf(stderr, %s, cm_em_op_names[alu-omod - 1]); + } + fprintf(stderr, \n); id += 2; diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index f801707..d1cac36 100644 --- a/src/gallium/drivers/r600/r600_shader.c +++ b/src/gallium/drivers/r600/r600_shader.c @@ -5827,7 +5827,29 @@ static int tgsi_loop_brk_cont(struct r600_shader_ctx *ctx) return -EINVAL; } - r600_bytecode_add_cfinst(ctx-bc, ctx-inst_info-op); + + if (ctx-bc-chip_class == CAYMAN) { + struct r600_bytecode_alu alu = {}; + int r; + + alu.op = ALU_OP2_PRED_SETE; + alu.src[0].sel = V_SQ_ALU_SRC_0; + alu.src[1].sel = V_SQ_ALU_SRC_1; + + if (ctx-inst_info-op == CF_OP_LOOP_BREAK) + alu.omod = SQ_ALU_EXECUTE_MASK_OP_BREAK; + else + alu.omod = SQ_ALU_EXECUTE_MASK_OP_CONTINUE; + + alu.execute_mask = 1; + alu.last = 1; + + r = r600_bytecode_add_alu(ctx-bc, alu); + if (r) + return r; + } else { + r600_bytecode_add_cfinst(ctx-bc, ctx-inst_info-op); + } fc_set_mid(ctx, fscp); diff --git a/src/gallium/drivers/r600/r600d.h b/src/gallium/drivers/r600/r600d.h index 9b31383..679dd81 100644 --- a/src/gallium/drivers/r600/r600d.h +++ b/src/gallium/drivers/r600/r600d.h @@ -3698,4 +3698,9 @@ #define DMA_PACKET_CONSTANT_FILL 0xd /* 7xx only */ #define DMA_PACKET_NOP 0xf +#define SQ_ALU_EXECUTE_MASK_OP_DEACTIVATE0x0 +#define SQ_ALU_EXECUTE_MASK_OP_BREAK 0x1 +#define SQ_ALU_EXECUTE_MASK_OP_CONTINUE 0x2 +#define SQ_ALU_EXECUTE_MASK_OP_KILL 0x3 + #endif -- 1.8.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] R600: Add VTX_READ_* and RAT_WRITE_CACHELESS_* when computing cf addr
--- lib/Target/R600/R600ControlFlowFinalizer.cpp | 11 ++- test/CodeGen/R600/loop-adress.ll | 44 2 files changed, 54 insertions(+), 1 deletion(-) create mode 100644 test/CodeGen/R600/loop-adress.ll diff --git a/lib/Target/R600/R600ControlFlowFinalizer.cpp b/lib/Target/R600/R600ControlFlowFinalizer.cpp index cfaa36e..2350130 100644 --- a/lib/Target/R600/R600ControlFlowFinalizer.cpp +++ b/lib/Target/R600/R600ControlFlowFinalizer.cpp @@ -67,6 +67,13 @@ private: case AMDGPU::TEX_SAMPLE_C_G: case AMDGPU::TXD: case AMDGPU::TXD_SHADOW: +case AMDGPU::VTX_READ_GLOBAL_8_eg: +case AMDGPU::VTX_READ_GLOBAL_32_eg: +case AMDGPU::VTX_READ_GLOBAL_128_eg: +case AMDGPU::VTX_READ_PARAM_8_eg: +case AMDGPU::VTX_READ_PARAM_16_eg: +case AMDGPU::VTX_READ_PARAM_32_eg: +case AMDGPU::VTX_READ_PARAM_128_eg: return true; default: return false; @@ -207,6 +214,8 @@ public: case AMDGPU::EG_ExportSwz: case AMDGPU::R600_ExportBuf: case AMDGPU::R600_ExportSwz: +case AMDGPU::RAT_WRITE_CACHELESS_32_eg: +case AMDGPU::RAT_WRITE_CACHELESS_128_eg: DEBUG(dbgs() CfCount :; MI-dump();); CfCount++; break; @@ -215,7 +224,7 @@ public: MaxStack = std::max(MaxStack, CurrentStack); MachineInstr *MIb = BuildMI(MBB, MI, MBB.findDebugLoc(MI), getHWInstrDesc(CF_WHILE_LOOP)) - .addImm(2); + .addImm(1); std::pairunsigned, std::setMachineInstr * Pair(CfCount, std::setMachineInstr *()); Pair.second.insert(MIb); diff --git a/test/CodeGen/R600/loop-adress.ll b/test/CodeGen/R600/loop-adress.ll new file mode 100644 index 000..dc9295e --- /dev/null +++ b/test/CodeGen/R600/loop-adress.ll @@ -0,0 +1,44 @@ +;RUN: llc %s -march=r600 -mcpu=redwood | FileCheck %s + +;CHECK: TEX +;CHECK: ALU_PUSH +;CHECK: JUMP @4 +;CHECK: ELSE @16 +;CHECK: TEX +;CHECK: LOOP_START_DX10 @15 +;CHECK: LOOP_BREAK @14 +;CHECK: POP @16 + +target datalayout = 
e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-v16:16:16-v24:32:32-v32:32:32-v48:64:64-v64:64:64-v96:128:128-v128:128:128-v192:256:256-v256:256:256-v512:512:512-v1024:1024:1024-v2048:2048:2048-n32:64 +target triple = r600-- + +define void @loop_ge(i32 addrspace(1)* nocapture %out, i32 %iterations) #0 { +entry: + %cmp5 = icmp sgt i32 %iterations, 0 + br i1 %cmp5, label %for.body, label %for.end + +for.body: ; preds = %for.body, %entry + %i.07.in = phi i32 [ %i.07, %for.body ], [ %iterations, %entry ] + %ai.06 = phi i32 [ %add, %for.body ], [ 0, %entry ] + %i.07 = add nsw i32 %i.07.in, -1 + %arrayidx = getelementptr inbounds i32 addrspace(1)* %out, i32 %ai.06 + store i32 %i.07, i32 addrspace(1)* %arrayidx, align 4, !tbaa !4 + %add = add nsw i32 %ai.06, 1 + %exitcond = icmp eq i32 %add, %iterations + br i1 %exitcond, label %for.end, label %for.body + +for.end: ; preds = %for.body, %entry + ret void +} + +attributes #0 = { nounwind fp-contract-model=standard relocation-model=pic ssp-buffers-size=8 } + +!opencl.kernels = !{!0, !1, !2, !3} + +!0 = metadata !{void (i32 addrspace(1)*, i32)* @loop_ge} +!1 = metadata !{null} +!2 = metadata !{null} +!3 = metadata !{null} +!4 = metadata !{metadata !int, metadata !5} +!5 = metadata !{metadata !omnipotent char, metadata !6} +!6 = metadata !{metadata !Simple C/C++ TBAA} -- 1.8.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] EGL and glClear(): Nothing is drawn on screen
Ralf Jung p...@ralfj.de writes:

Hi list,

first of all, sorry if this is the wrong list for my issue. I tried various IRC channels, but got no reply, and since I am not actually sure there is a bug here, I decided to write to a list instead of filing a bug report.

My issue is as follows: I wrote a simple GL application which makes it easy to see tearing. All it does is

* call glClear()
* draw a vertical box as quad
* swap the buffers
* call glFlush()

I wrote it in a way that I can compile it to use either GLX or EGL. Everything works fine with GLX, but using literally the same code (except for the initialisation and buffer swap, of course), it does not work in EGL. Instead of drawing the box, the window stays black. If I draw a black quad all over the screen instead of calling glClear(), it works fine even with EGL.

I am very new to GL, but from my understanding, above code should work fine - is there a problem with my program, or is this a bug in EGL?

You can find the sample program at http://www.ralfj.de/git/gltest.git . The default behaviour is as described above, calling it with -o uses a black quad instead of glClear().

If you were testing with front buffer rendering, GLES doesn't support front buffer rendering (because nobody should ever use it). I recommend using a debug build of mesa when developing GL applications, since it dumps GL error messages to the console by default.

If it was in the double-buffered path, then I don't have any other info.
Re: [Mesa-dev] [PATCH] R600: Add VTX_READ_* and RAT_WRITE_CACHELESS_* when computing cf addr
On Tue, Apr 09, 2013 at 11:38:42PM +0200, Vincent Lejeune wrote: --- Reviewed-by: Tom Stellard thomas.stell...@amd.com lib/Target/R600/R600ControlFlowFinalizer.cpp | 11 ++- test/CodeGen/R600/loop-adress.ll | 44 2 files changed, 54 insertions(+), 1 deletion(-) create mode 100644 test/CodeGen/R600/loop-adress.ll diff --git a/lib/Target/R600/R600ControlFlowFinalizer.cpp b/lib/Target/R600/R600ControlFlowFinalizer.cpp index cfaa36e..2350130 100644 --- a/lib/Target/R600/R600ControlFlowFinalizer.cpp +++ b/lib/Target/R600/R600ControlFlowFinalizer.cpp @@ -67,6 +67,13 @@ private: case AMDGPU::TEX_SAMPLE_C_G: case AMDGPU::TXD: case AMDGPU::TXD_SHADOW: +case AMDGPU::VTX_READ_GLOBAL_8_eg: +case AMDGPU::VTX_READ_GLOBAL_32_eg: +case AMDGPU::VTX_READ_GLOBAL_128_eg: +case AMDGPU::VTX_READ_PARAM_8_eg: +case AMDGPU::VTX_READ_PARAM_16_eg: +case AMDGPU::VTX_READ_PARAM_32_eg: +case AMDGPU::VTX_READ_PARAM_128_eg: return true; default: return false; @@ -207,6 +214,8 @@ public: case AMDGPU::EG_ExportSwz: case AMDGPU::R600_ExportBuf: case AMDGPU::R600_ExportSwz: +case AMDGPU::RAT_WRITE_CACHELESS_32_eg: +case AMDGPU::RAT_WRITE_CACHELESS_128_eg: DEBUG(dbgs() CfCount :; MI-dump();); CfCount++; break; @@ -215,7 +224,7 @@ public: MaxStack = std::max(MaxStack, CurrentStack); MachineInstr *MIb = BuildMI(MBB, MI, MBB.findDebugLoc(MI), getHWInstrDesc(CF_WHILE_LOOP)) - .addImm(2); + .addImm(1); std::pairunsigned, std::setMachineInstr * Pair(CfCount, std::setMachineInstr *()); Pair.second.insert(MIb); diff --git a/test/CodeGen/R600/loop-adress.ll b/test/CodeGen/R600/loop-adress.ll new file mode 100644 index 000..dc9295e --- /dev/null +++ b/test/CodeGen/R600/loop-adress.ll @@ -0,0 +1,44 @@ +;RUN: llc %s -march=r600 -mcpu=redwood | FileCheck %s + +;CHECK: TEX +;CHECK: ALU_PUSH +;CHECK: JUMP @4 +;CHECK: ELSE @16 +;CHECK: TEX +;CHECK: LOOP_START_DX10 @15 +;CHECK: LOOP_BREAK @14 +;CHECK: POP @16 + +target datalayout = 
e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-v16:16:16-v24:32:32-v32:32:32-v48:64:64-v64:64:64-v96:128:128-v128:128:128-v192:256:256-v256:256:256-v512:512:512-v1024:1024:1024-v2048:2048:2048-n32:64 +target triple = r600-- + +define void @loop_ge(i32 addrspace(1)* nocapture %out, i32 %iterations) #0 { +entry: + %cmp5 = icmp sgt i32 %iterations, 0 + br i1 %cmp5, label %for.body, label %for.end + +for.body: ; preds = %for.body, %entry + %i.07.in = phi i32 [ %i.07, %for.body ], [ %iterations, %entry ] + %ai.06 = phi i32 [ %add, %for.body ], [ 0, %entry ] + %i.07 = add nsw i32 %i.07.in, -1 + %arrayidx = getelementptr inbounds i32 addrspace(1)* %out, i32 %ai.06 + store i32 %i.07, i32 addrspace(1)* %arrayidx, align 4, !tbaa !4 + %add = add nsw i32 %ai.06, 1 + %exitcond = icmp eq i32 %add, %iterations + br i1 %exitcond, label %for.end, label %for.body + +for.end: ; preds = %for.body, %entry + ret void +} + +attributes #0 = { nounwind fp-contract-model=standard relocation-model=pic ssp-buffers-size=8 } + +!opencl.kernels = !{!0, !1, !2, !3} + +!0 = metadata !{void (i32 addrspace(1)*, i32)* @loop_ge} +!1 = metadata !{null} +!2 = metadata !{null} +!3 = metadata !{null} +!4 = metadata !{metadata !int, metadata !5} +!5 = metadata !{metadata !omnipotent char, metadata !6} +!6 = metadata !{metadata !Simple C/C++ TBAA} -- 1.8.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 00/19] i965/vs: Generalize VS compiler back-end in preparation for GS.
Changes since v1:

- Consistently use the variable name shader_prog to refer to struct gl_shader_program and prog to refer to struct gl_program.
- Move urb_read_length and urb_entry_size into brw_vec4_prog_data so that they can be re-used by geometry shaders.
- When vec4_vs_visitor has a pointer to a VS-specific data structure, and vec4_visitor has a pointer to the generic version of that data structure, use a short variable name in vec4_visitor and a longer variable name in vec4_vs_visitor.

I've updated the branch gs-backend-prep on git://github.com/stereotype441/mesa.git to point to v2 of the series.
[Mesa-dev] [PATCH v2 01/19] i965: Rename backend_visitor::prog to shader_prog.
The next patch is going to change the type of vec4_visitor::vp from struct gl_vertex_program * to struct gl_program *, and rename it. The sensible name to change it to is vec4_visitor::prog. However, prog is already used in backend_visitor (which vec4_visitor derives from). Since backend_visitor::prog is of type struct gl_shader_program *, it makes sense to rename it to shader_prog. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 8 src/mesa/drivers/dri/i965/brw_fs.h | 2 +- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 12 +++- src/mesa/drivers/dri/i965/brw_shader.h | 2 +- src/mesa/drivers/dri/i965/brw_vec4.cpp | 4 ++-- src/mesa/drivers/dri/i965/brw_vec4.h | 2 +- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 11 ++- 7 files changed, 22 insertions(+), 19 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index c12ba45..4338ae6 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -616,8 +616,8 @@ void fs_visitor::emit_shader_time_write(enum shader_time_shader_type type, fs_reg value) { - int shader_time_index = brw_get_shader_time_index(brw, prog, fp-Base, - type); + int shader_time_index = + brw_get_shader_time_index(brw, shader_prog, fp-Base, type); fs_reg offset = fs_reg(shader_time_index * SHADER_TIME_STRIDE); fs_reg payload; @@ -852,8 +852,8 @@ fs_visitor::setup_uniform_values(ir_variable *ir) * with our name, or the prefix of a component that starts with our name. 
*/ unsigned params_before = c-prog_data.nr_params; - for (unsigned u = 0; u prog-NumUserUniformStorage; u++) { - struct gl_uniform_storage *storage = prog-UniformStorage[u]; + for (unsigned u = 0; u shader_prog-NumUserUniformStorage; u++) { + struct gl_uniform_storage *storage = shader_prog-UniformStorage[u]; if (strncmp(ir-name, storage-name, namelen) != 0 || (storage-name[namelen] != 0 diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 60e3e0a..115a878 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -223,7 +223,7 @@ public: fs_visitor(struct brw_context *brw, struct brw_wm_compile *c, - struct gl_shader_program *prog, + struct gl_shader_program *shader_prog, struct gl_fragment_program *fp, unsigned dispatch_width); ~fs_visitor(); diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index d54d134..422816d 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -1361,7 +1361,8 @@ fs_visitor::visit(ir_texture *ir) { fs_inst *inst = NULL; - int sampler = _mesa_get_sampler_uniform_value(ir-sampler, prog, fp-Base); + int sampler = + _mesa_get_sampler_uniform_value(ir-sampler, shader_prog, fp-Base); /* FINISHME: We're failing to recompile our programs when the sampler is * updated. This only matters for the texture rectangle scale parameters * (pre-gen6, or gen6+ with GL_CLAMP). 
@@ -2371,7 +2372,7 @@ fs_visitor::resolve_bool_comparison(ir_rvalue *rvalue, fs_reg *reg) fs_visitor::fs_visitor(struct brw_context *brw, struct brw_wm_compile *c, - struct gl_shader_program *prog, + struct gl_shader_program *shader_prog, struct gl_fragment_program *fp, unsigned dispatch_width) : dispatch_width(dispatch_width) @@ -2379,12 +2380,13 @@ fs_visitor::fs_visitor(struct brw_context *brw, this-c = c; this-brw = brw; this-fp = fp; - this-prog = prog; + this-shader_prog = shader_prog; this-intel = brw-intel; this-ctx = intel-ctx; this-mem_ctx = ralloc_context(NULL); - if (prog) - shader = (struct brw_shader *) prog-_LinkedShaders[MESA_SHADER_FRAGMENT]; + if (shader_prog) + shader = (struct brw_shader *) + shader_prog-_LinkedShaders[MESA_SHADER_FRAGMENT]; else shader = NULL; this-failed = false; diff --git a/src/mesa/drivers/dri/i965/brw_shader.h b/src/mesa/drivers/dri/i965/brw_shader.h index 9d21d8f..a29618f 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.h +++ b/src/mesa/drivers/dri/i965/brw_shader.h @@ -42,7 +42,7 @@ public: struct intel_context *intel; struct gl_context *ctx; struct brw_shader *shader; - struct gl_shader_program *prog; + struct gl_shader_program *shader_prog; /** ralloc context for temporary data used during compile */ void *mem_ctx; diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index c58fb44..c05970a 100644 ---
[Mesa-dev] [PATCH v2 02/19] i965/vs: Make type of vec4_visitor::vp more generic.
The vec4_visitor functions don't use any VS specific data from
vec4_visitor::vp. So rename it to prog and change its type from
struct gl_vertex_program * to struct gl_program *. This will allow
the code to be re-used for geometry shaders.

Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

v2: Use the name prog rather than p.
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp         | 6 +++---
 src/mesa/drivers/dri/i965/brw_vec4.h           | 2 +-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 8
 src/mesa/drivers/dri/i965/brw_vec4_vp.cpp      | 6 +++---
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index c05970a..cd6466c 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1364,7 +1364,7 @@ vec4_visitor::emit_shader_time_write(enum shader_time_shader_type type,
                                      src_reg value)
 {
    int shader_time_index =
-      brw_get_shader_time_index(brw, shader_prog, &vp->Base, type);
+      brw_get_shader_time_index(brw, shader_prog, prog, type);

    dst_reg dst =
       dst_reg(this, glsl_type::get_array_instance(glsl_type::vec4_type, 2));
@@ -1385,7 +1385,7 @@ vec4_visitor::emit_shader_time_write(enum shader_time_shader_type type,
 bool
 vec4_visitor::run()
 {
-   sanity_param_count = vp->Base.Parameters->NumParameters;
+   sanity_param_count = prog->Parameters->NumParameters;

    if (INTEL_DEBUG & DEBUG_SHADER_TIME)
       emit_shader_time_begin();
@@ -1469,7 +1469,7 @@ vec4_visitor::run()
     * _mesa_associate_uniform_storage() would point to freed memory. Make
     * sure that didn't happen.
     */
-   assert(sanity_param_count == vp->Base.Parameters->NumParameters);
+   assert(sanity_param_count == prog->Parameters->NumParameters);

    return !failed;
 }
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index ff9f5ab..967da00 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -226,7 +226,7 @@ public:
       return dst_reg(retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
    }

-   struct gl_vertex_program *vp;
+   struct gl_program *prog;
    struct brw_vs_compile *c;
    struct brw_vs_prog_data *prog_data;
    unsigned int sanity_param_count;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 9eacd83..11611b2 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -676,9 +676,9 @@ vec4_visitor::setup_builtin_uniform_values(ir_variable *ir)
        * ParameterValues directly, since unlike brw_fs.cpp, we never
        * add new state references during compile.
        */
-      int index = _mesa_add_state_reference(this->vp->Base.Parameters,
+      int index = _mesa_add_state_reference(this->prog->Parameters,
                                             (gl_state_index *)slots[i].tokens);
-      float *values = &this->vp->Base.Parameters->ParameterValues[index][0].f;
+      float *values = &this->prog->Parameters->ParameterValues[index][0].f;

       this->uniform_vector_size[this->uniforms] = 0;
       /* Add each of the unique swizzled channels of the element.
@@ -2078,7 +2078,7 @@ void
 vec4_visitor::visit(ir_texture *ir)
 {
    int sampler =
-      _mesa_get_sampler_uniform_value(ir->sampler, shader_prog, &vp->Base);
+      _mesa_get_sampler_uniform_value(ir->sampler, shader_prog, prog);

    /* Should be lowered by do_lower_texture_projection */
    assert(!ir->projector);
@@ -2994,7 +2994,7 @@ vec4_visitor::vec4_visitor(struct brw_context *brw,
    memset(this->output_reg_annotation, 0, sizeof(this->output_reg_annotation));

    this->c = c;
-   this->vp = &c->vp->program;
+   this->prog = &c->vp->program.Base;
    this->prog_data = &c->prog_data;

    this->variable_ht = hash_table_ctor(0,
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp b/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
index 1acdd52..3669193 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
@@ -84,8 +84,8 @@ vec4_visitor::emit_vertex_program_code()
    src_reg one = src_reg(this, glsl_type::float_type);
    emit(MOV(dst_reg(one), src_reg(1.0f)));

-   for (unsigned int insn = 0; insn < vp->Base.NumInstructions; insn++) {
-      const struct prog_instruction *vpi = &vp->Base.Instructions[insn];
+   for (unsigned int insn = 0; insn < prog->NumInstructions; insn++) {
+      const struct prog_instruction *vpi = &prog->Instructions[insn];
       base_ir = vpi;

       dst_reg dst;
@@ -423,7 +423,7 @@ void
 vec4_visitor::setup_vp_regs()
 {
    /* PROGRAM_TEMPORARY */
-   int num_temp = vp->Base.NumTemporaries;
+   int num_temp = prog->NumTemporaries;
    vp_temp_regs = rzalloc_array(mem_ctx, src_reg, num_temp);
    for (int i = 0; i < num_temp; i++)
       vp_temp_regs[i] = src_reg(this, glsl_type::vec4_type);
-- 
1.8.2.1
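For readers skimming the series, the idea behind this patch can be sketched outside of Mesa as follows. This is only a stand-alone illustration, not driver code: the struct layouts are stripped-down stand-ins that mimic how the stage-specific program structs embed a struct gl_program as their Base member, the non-Base fields are included purely for illustration, and count_instructions() is a hypothetical helper that does not exist in Mesa.

```cpp
#include <cassert>

// Simplified stand-ins for the Mesa structs (assumed layouts, not the
// real definitions).
struct gl_program {
   unsigned NumInstructions;
   unsigned NumTemporaries;
};

struct gl_vertex_program {
   struct gl_program Base;    // stage-independent part
   bool IsPositionInvariant;  // VS-only field, shown for illustration
};

struct gl_geometry_program {
   struct gl_program Base;    // same stage-independent part
   int VerticesOut;           // GS-only field, shown for illustration
};

// Hypothetical helper: because it only needs the generic part, it takes
// a gl_program pointer and therefore works for any shader stage.
static unsigned
count_instructions(const struct gl_program *prog)
{
   assert(prog);
   return prog->NumInstructions;
}
```

A caller holding a gl_vertex_program would pass &vp.Base, which is the same conversion the patch performs once in the vec4_visitor constructor (this->prog = &c->vp->program.Base) instead of writing vp->Base at every use.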
[Mesa-dev] [PATCH v2 03/19] i965: Generalize computation of VUE map in preparation for GS.
This patch modifies the arguments to brw_compute_vue_map() so that
they no longer bake in the assumption that we are generating a VUE map
for vertex shader outputs. It also makes the function non-static so
that we can re-use it for geometry shader outputs.

Reviewed-by: Jordan Justen jordan.l.jus...@intel.com
Reviewed-by: Eric Anholt e...@anholt.net
---
 src/mesa/drivers/dri/i965/brw_context.h |  3 +++
 src/mesa/drivers/dri/i965/brw_vs.c      | 12 ++--
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index afcba46..559f7e8 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -400,6 +400,9 @@ static inline GLuint brw_varying_to_offset(struct brw_vue_map *vue_map,
    return brw_vue_slot_to_offset(vue_map->varying_to_slot[varying]);
 }

+void brw_compute_vue_map(struct brw_context *brw, struct brw_vue_map *vue_map,
+                         GLbitfield64 slots_valid, bool userclip_active);
+
 struct brw_sf_prog_data {
    GLuint urb_read_length;

diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c
index 6d2c0fd..13971ab 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -57,12 +57,11 @@ static inline void assign_vue_slot(struct brw_vue_map *vue_map,
  * prog_data->userclip and prog_data->outputs_written in their key
  * (generated by CACHE_NEW_VS_PROG).
  */
-static void
-brw_compute_vue_map(struct brw_context *brw, struct brw_vs_compile *c,
-                    GLbitfield64 slots_valid)
+void
+brw_compute_vue_map(struct brw_context *brw, struct brw_vue_map *vue_map,
+                    GLbitfield64 slots_valid, bool userclip_active)
 {
    const struct intel_context *intel = &brw->intel;
-   struct brw_vue_map *vue_map = &c->prog_data.vue_map;

    /* Prior to Gen6, don't assign a slot for VARYING_SLOT_CLIP_VERTEX, since
     * it is unsupported.
@@ -133,7 +132,7 @@ brw_compute_vue_map(struct brw_context *brw, struct brw_vs_compile *c,
        */
       assign_vue_slot(vue_map, VARYING_SLOT_PSIZ);
       assign_vue_slot(vue_map, VARYING_SLOT_POS);
-      if (c->key.userclip_active) {
+      if (userclip_active) {
          assign_vue_slot(vue_map, VARYING_SLOT_CLIP_DIST0);
          assign_vue_slot(vue_map, VARYING_SLOT_CLIP_DIST1);
       }
@@ -284,7 +283,8 @@ do_vs_prog(struct brw_context *brw,
       }
    }

-   brw_compute_vue_map(brw, &c, outputs_written);
+   brw_compute_vue_map(brw, &c.prog_data.vue_map, outputs_written,
+                       c.key.userclip_active);

    if (0) {
       _mesa_fprint_program_opt(stdout, &c.vp->program.Base, PROG_PRINT_DEBUG,
-- 
1.8.2.1
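The signature change in this patch can be illustrated with a toy model. The sketch below is not the i965 implementation: toy_vue_map, the TOY_VARYING_* numbering and the slot ordering are all invented for illustration, and the real function handles several hardware generations differently. What it tries to show is only the shape of the new interface, where the map to fill in and the userclip flag are passed explicitly instead of being read out of a VS-specific compile struct.

```cpp
#include <cassert>
#include <cstdint>

// Invented varying numbering, for illustration only.
enum {
   TOY_VARYING_POS = 0,
   TOY_VARYING_PSIZ = 1,
   TOY_VARYING_CLIP_DIST0 = 2,
   TOY_VARYING_CLIP_DIST1 = 3,
   TOY_VARYING_MAX = 8
};

// Toy stand-in for struct brw_vue_map.
struct toy_vue_map {
   uint64_t slots_valid;
   int varying_to_slot[TOY_VARYING_MAX];
   int num_slots;
};

// Give a varying the next free slot, unless it already has one.
static void
toy_assign_vue_slot(toy_vue_map *map, int varying)
{
   if (map->varying_to_slot[varying] == -1)
      map->varying_to_slot[varying] = map->num_slots++;
}

// Stage-independent: the caller supplies the output map and says whether
// user clipping is active, so a VS or a GS caller can both use it.
void
toy_compute_vue_map(toy_vue_map *map, uint64_t slots_valid,
                    bool userclip_active)
{
   assert(map);
   map->slots_valid = slots_valid;
   map->num_slots = 0;
   for (int i = 0; i < TOY_VARYING_MAX; i++)
      map->varying_to_slot[i] = -1;

   // Fixed header slots first.
   toy_assign_vue_slot(map, TOY_VARYING_PSIZ);
   toy_assign_vue_slot(map, TOY_VARYING_POS);
   if (userclip_active) {
      toy_assign_vue_slot(map, TOY_VARYING_CLIP_DIST0);
      toy_assign_vue_slot(map, TOY_VARYING_CLIP_DIST1);
   }

   // Then every remaining varying that the shader writes.
   for (int i = 0; i < TOY_VARYING_MAX; i++) {
      if (slots_valid & (uint64_t(1) << i))
         toy_assign_vue_slot(map, i);
   }
}
```

With this shape, a vertex shader caller passes its own map and key flag, much as do_vs_prog() does in the patch, and a future geometry shader caller could pass different ones without the function knowing which stage it serves.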
[Mesa-dev] [PATCH v2 04/19] i965/vs: Remove brw_vs_prog_data pointer from brw_vs_compile.
In patches that follow, we'll be splitting structs brw_vs_prog_data and brw_vs_compile into a vec4-generic base struct and a VS-specific derived struct (this will allow the vec4-generic code to be re-used for geometry shaders). Having brw_vs_compile point to brw_vs_prog_data makes it difficult to do this cleanly. Fortunately most of the functions that use brw_vs_compile (those in the vec4_visitor class) already have access to brw_vs_prog_data through a separate pointer (vec4_visitor::prog_data). So all we have to do is use that pointer consistently, and plumb prog_data through the few remaining functions that need access to it. Reviewed-by: Jordan Justen jordan.l.jus...@intel.com --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 19 ++--- src/mesa/drivers/dri/i965/brw_vec4.h | 1 + src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 31 +++--- src/mesa/drivers/dri/i965/brw_vec4_vp.cpp | 10 +++ src/mesa/drivers/dri/i965/brw_vs.c | 28 ++- src/mesa/drivers/dri/i965/brw_vs.h | 2 +- .../dri/i965/test_vec4_register_coalesce.cpp | 2 +- 7 files changed, 49 insertions(+), 44 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index cd6466c..c305add 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -410,8 +410,8 @@ vec4_visitor::pack_uniform_registers() /* Move the references to the data */ for (int j = 0; j size; j++) { - c-prog_data.param[dst * 4 + new_chan[src] + j] = - c-prog_data.param[src * 4 + j]; + prog_data-param[dst * 4 + new_chan[src] + j] = + prog_data-param[src * 4 + j]; } this-uniform_vector_size[dst] += size; @@ -1235,12 +1235,12 @@ vec4_visitor::setup_attributes(int payload_reg) prog_data-urb_read_length = (nr_attributes + 1) / 2; - unsigned vue_entries = MAX2(nr_attributes, c-prog_data.vue_map.num_slots); + unsigned vue_entries = MAX2(nr_attributes, prog_data-vue_map.num_slots); if (intel-gen == 6) - c-prog_data.urb_entry_size = ALIGN(vue_entries, 8) / 8; + 
prog_data-urb_entry_size = ALIGN(vue_entries, 8) / 8; else - c-prog_data.urb_entry_size = ALIGN(vue_entries, 4) / 4; + prog_data-urb_entry_size = ALIGN(vue_entries, 4) / 4; return payload_reg + nr_attributes; } @@ -1257,7 +1257,7 @@ vec4_visitor::setup_uniforms(int reg) for (unsigned int i = 0; i 4; i++) { unsigned int slot = this-uniforms * 4 + i; static float zero = 0.0; -c-prog_data.param[slot] = zero; +prog_data-param[slot] = zero; } this-uniforms++; @@ -1266,9 +1266,9 @@ vec4_visitor::setup_uniforms(int reg) reg += ALIGN(uniforms, 2) / 2; } - c-prog_data.nr_params = this-uniforms * 4; + prog_data-nr_params = this-uniforms * 4; - c-prog_data.curb_read_length = reg - 1; + prog_data-curb_read_length = reg - 1; return reg; } @@ -1487,6 +1487,7 @@ const unsigned * brw_vs_emit(struct brw_context *brw, struct gl_shader_program *prog, struct brw_vs_compile *c, +struct brw_vs_prog_data *prog_data, void *mem_ctx, unsigned *final_assembly_size) { @@ -1516,7 +1517,7 @@ brw_vs_emit(struct brw_context *brw, } } - vec4_visitor v(brw, c, prog, shader, mem_ctx); + vec4_visitor v(brw, c, prog_data, prog, shader, mem_ctx); if (!v.run()) { prog-LinkStatus = false; ralloc_strcat(prog-InfoLog, v.fail_msg); diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 967da00..c98003d 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -211,6 +211,7 @@ class vec4_visitor : public backend_visitor public: vec4_visitor(struct brw_context *brw, struct brw_vs_compile *c, +struct brw_vs_prog_data *prog_data, struct gl_shader_program *shader_prog, struct brw_shader *shader, void *mem_ctx); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index 11611b2..6287634 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp @@ -605,12 +605,12 @@ vec4_visitor::setup_uniform_values(ir_variable *ir) int i; for 
(i = 0; i uniform_vector_size[uniforms]; i++) { -c-prog_data.param[uniforms * 4 + i] = components-f; +prog_data-param[uniforms * 4 + i] = components-f; components++; } for (; i 4; i++) { static float zero = 0; -c-prog_data.param[uniforms * 4 + i] = zero; +prog_data-param[uniforms * 4 + i] = zero; } uniforms++; @@
[Mesa-dev] [PATCH v2 05/19] i965/vs: split brw_vs_compile into generic and VS-specific parts.
This will allow the generic parts to be re-used for geometry shaders.

Reviewed-by: Jordan Justen jordan.l.jus...@intel.com
---
 src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp | 2 +-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp      | 8
 src/mesa/drivers/dri/i965/brw_vs.c                  | 4 ++--
 src/mesa/drivers/dri/i965/brw_vs.h                  | 8 ++--
 4 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
index ac3d401..0853c0a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp
@@ -321,7 +321,7 @@ void
 vec4_visitor::spill_reg(int spill_reg_nr)
 {
    assert(virtual_grf_sizes[spill_reg_nr] == 1);
-   unsigned int spill_offset = c->last_scratch++;
+   unsigned int spill_offset = c->base.last_scratch++;

    /* Generate spill/unspill instructions for the objects being spilled. */
    foreach_list(node, &this->instructions) {
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 6287634..51d997b 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -2819,8 +2819,8 @@ vec4_visitor::move_grf_array_access_to_scratch()

       if (inst->dst.file == GRF && inst->dst.reladdr &&
           scratch_loc[inst->dst.reg] == -1) {
-         scratch_loc[inst->dst.reg] = c->last_scratch;
-         c->last_scratch += this->virtual_grf_sizes[inst->dst.reg];
+         scratch_loc[inst->dst.reg] = c->base.last_scratch;
+         c->base.last_scratch += this->virtual_grf_sizes[inst->dst.reg];
       }

       for (int i = 0 ; i < 3; i++) {
@@ -2828,8 +2828,8 @@ vec4_visitor::move_grf_array_access_to_scratch()

          if (src->file == GRF && src->reladdr &&
              scratch_loc[src->reg] == -1) {
-            scratch_loc[src->reg] = c->last_scratch;
-            c->last_scratch += this->virtual_grf_sizes[src->reg];
+            scratch_loc[src->reg] = c->base.last_scratch;
+            c->base.last_scratch += this->virtual_grf_sizes[src->reg];
          }
       }
    }
diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c
index 25cd397..c0a3bae 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -312,12 +312,12 @@ do_vs_prog(struct brw_context *brw,
    }

    /* Scratch space is used for register spilling */
-   if (c.last_scratch) {
+   if (c.base.last_scratch) {
       perf_debug("Vertex shader triggered register spilling. "
                  "Try reducing the number of live vec4 values to "
                  "improve performance.\n");

-      prog_data.total_scratch = brw_get_scratch_size(c.last_scratch*REG_SIZE);
+      prog_data.total_scratch = brw_get_scratch_size(c.base.last_scratch*REG_SIZE);

       brw_get_scratch_bo(intel, &brw->vs.scratch_bo,
                          prog_data.total_scratch * brw->max_vs_threads);
diff --git a/src/mesa/drivers/dri/i965/brw_vs.h b/src/mesa/drivers/dri/i965/brw_vs.h
index d0f9805..c2b4bc6 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.h
+++ b/src/mesa/drivers/dri/i965/brw_vs.h
@@ -103,12 +103,16 @@ struct brw_vs_prog_key {
 };


+struct brw_vec4_compile {
+   GLuint last_scratch; /**< measured in 32-byte (register size) units */
+};
+
+
 struct brw_vs_compile {
+   struct brw_vec4_compile base;
    struct brw_vs_prog_key key;

    struct brw_vertex_program *vp;
-
-   GLuint last_scratch; /**< measured in 32-byte (register size) units */
 };

 const unsigned *brw_vs_emit(struct brw_context *brw,
-- 
1.8.2.1
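The base-struct pattern introduced here, and the way last_scratch feeds the scratch buffer size, can be sketched in isolation. Everything prefixed toy_ below is hypothetical, including the toy_gs_compile struct, which is not part of this series. toy_scratch_size() assumes that brw_get_scratch_size() rounds its argument up to a power of two with a 1 KB minimum; that is a recollection of the helper's behaviour rather than something shown in this patch, so treat it as an assumption.

```cpp
#include <cassert>

// last_scratch is "measured in 32-byte (register size) units" per the patch.
static const unsigned TOY_REG_SIZE = 32;

// Generic part, shared by every vec4-based stage.
struct toy_vec4_compile {
   unsigned last_scratch;
};

// Stage-specific structs embed the generic part as a member named base.
struct toy_vs_compile {
   toy_vec4_compile base;
   int vs_only_state;   // placeholder for VS-only fields
};

struct toy_gs_compile {  // hypothetical future user of the same base
   toy_vec4_compile base;
   int gs_only_state;   // placeholder for GS-only fields
};

// Generic code only ever touches the base, so it serves both stages.
static unsigned
toy_alloc_scratch(toy_vec4_compile *c, unsigned size_in_regs)
{
   assert(c);
   unsigned offset = c->last_scratch;
   c->last_scratch += size_in_regs;
   return offset;
}

// ASSUMPTION: mirrors the assumed rounding of brw_get_scratch_size():
// the smallest power of two that is >= bytes, and at least 1024.
static unsigned
toy_scratch_size(unsigned bytes)
{
   unsigned size = 1024;
   while (size < bytes)
      size <<= 1;
   return size;
}
```

Under these assumptions, a shader that spills five registers would need 5 * 32 = 160 bytes per thread, which rounds up to the 1024-byte minimum.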
[Mesa-dev] [PATCH v2 06/19] i965/vs: split brw_vs_prog_key into generic and VS-specific parts.
This will allow the generic parts to be re-used for geometry shaders. Reviewed-by: Jordan Justen jordan.l.jus...@intel.com --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 2 +- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 16 - src/mesa/drivers/dri/i965/brw_vs.c | 47 +- src/mesa/drivers/dri/i965/brw_vs.h | 25 -- 4 files changed, 48 insertions(+), 42 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index c305add..b924c70 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -1402,7 +1402,7 @@ vec4_visitor::run() } base_ir = NULL; - if (c-key.userclip_active !c-key.uses_clip_distance) + if (c-key.base.userclip_active !c-key.base.uses_clip_distance) setup_uniform_clipplane_values(); emit_urb_writes(); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index 51d997b..8769e9f 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp @@ -632,7 +632,7 @@ vec4_visitor::setup_uniform_clipplane_values() */ int compacted_clipplane_index = 0; for (int i = 0; i MAX_CLIP_PLANES; ++i) { -if (!(c-key.userclip_planes_enabled_gen_4_5 (1 i))) +if (!(c-key.base.userclip_planes_enabled_gen_4_5 (1 i))) continue; this-uniform_vector_size[this-uniforms] = 4; @@ -648,7 +648,7 @@ vec4_visitor::setup_uniform_clipplane_values() /* In Gen6 and later, we don't compact clip planes, because this * simplifies the implementation of gl_ClipDistance. 
*/ - for (int i = 0; i c-key.nr_userclip_plane_consts; ++i) { + for (int i = 0; i c-key.base.nr_userclip_plane_consts; ++i) { this-uniform_vector_size[this-uniforms] = 4; this-userplane[i] = dst_reg(UNIFORM, this-uniforms); this-userplane[i].type = BRW_REGISTER_TYPE_F; @@ -2294,7 +2294,7 @@ vec4_visitor::visit(ir_texture *ir) void vec4_visitor::swizzle_result(ir_texture *ir, src_reg orig_val, int sampler) { - int s = c-key.tex.swizzles[sampler]; + int s = c-key.base.tex.swizzles[sampler]; this-result = src_reg(this, ir-type); dst_reg swizzled_result(this-result); @@ -2409,7 +2409,7 @@ vec4_visitor::emit_psiz_and_flags(struct brw_reg reg) { if (intel-gen 6 ((prog_data-vue_map.slots_valid VARYING_BIT_PSIZ) || -c-key.userclip_active || brw-has_negative_rhw_bug)) { +c-key.base.userclip_active || brw-has_negative_rhw_bug)) { dst_reg header1 = dst_reg(this, glsl_type::uvec4_type); dst_reg header1_w = header1; header1_w.writemask = WRITEMASK_W; @@ -2426,7 +2426,7 @@ vec4_visitor::emit_psiz_and_flags(struct brw_reg reg) } current_annotation = Clipping flags; - for (i = 0; i c-key.nr_userclip_plane_consts; i++) { + for (i = 0; i c-key.base.nr_userclip_plane_consts; i++) { vec4_instruction *inst; inst = emit(DP4(dst_null_f(), src_reg(output_reg[VARYING_SLOT_POS]), @@ -2497,7 +2497,7 @@ vec4_visitor::emit_clip_distances(struct brw_reg reg, int offset) clip_vertex = VARYING_SLOT_POS; } - for (int i = 0; i + offset c-key.nr_userclip_plane_consts i 4; + for (int i = 0; i + offset c-key.base.nr_userclip_plane_consts i 4; ++i) { emit(DP4(dst_reg(brw_writemask(reg, 1 i)), src_reg(output_reg[clip_vertex]), @@ -2518,7 +2518,7 @@ vec4_visitor::emit_generic_urb_slot(dst_reg reg, int varying) varying == VARYING_SLOT_COL1 || varying == VARYING_SLOT_BFC0 || varying == VARYING_SLOT_BFC1) - c-key.clamp_vertex_color) { + c-key.base.clamp_vertex_color) { inst-saturate = true; } } @@ -2547,7 +2547,7 @@ vec4_visitor::emit_urb_slot(int mrf, int varying) break; case VARYING_SLOT_CLIP_DIST0: case 
VARYING_SLOT_CLIP_DIST1: - if (this-c-key.uses_clip_distance) { + if (this-c-key.base.uses_clip_distance) { emit_generic_urb_slot(reg, varying); } else { current_annotation = user clip distances; diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c index c0a3bae..2d0849a 100644 --- a/src/mesa/drivers/dri/i965/brw_vs.c +++ b/src/mesa/drivers/dri/i965/brw_vs.c @@ -286,7 +286,7 @@ do_vs_prog(struct brw_context *brw, } brw_compute_vue_map(brw, prog_data.vue_map, outputs_written, - c.key.userclip_active); + c.key.base.userclip_active); if (0) { _mesa_fprint_program_opt(stdout, c.vp-program.Base, PROG_PRINT_DEBUG, @@ -360,7 +360,7 @@ brw_vs_debug_recompile(struct brw_context *brw, if (c-cache_id == BRW_VS_PROG) { old_key = c-key; -if (old_key-program_string_id ==
[Mesa-dev] [PATCH v2 07/19] i965/vs: split brw_vs_prog_data into generic and VS-specific parts.
This will allow the generic parts to be re-used for geometry shaders. Reviewed-by: Jordan Justen jordan.l.jus...@intel.com v2: Put urb_read_length and urb_entry_size in the generic struct. --- src/mesa/drivers/dri/i965/brw_context.h| 29 +++--- src/mesa/drivers/dri/i965/brw_curbe.c | 6 +- src/mesa/drivers/dri/i965/brw_gs.c | 4 +- src/mesa/drivers/dri/i965/brw_urb.c| 2 +- src/mesa/drivers/dri/i965/brw_vec4.cpp | 34 +-- .../drivers/dri/i965/brw_vec4_reg_allocate.cpp | 10 ++-- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 35 +-- src/mesa/drivers/dri/i965/brw_vec4_vp.cpp | 12 ++-- src/mesa/drivers/dri/i965/brw_vs.c | 67 -- src/mesa/drivers/dri/i965/brw_vs.h | 3 + src/mesa/drivers/dri/i965/brw_vs_state.c | 14 +++-- src/mesa/drivers/dri/i965/brw_vs_surface_state.c | 18 +++--- src/mesa/drivers/dri/i965/gen6_urb.c | 2 +- src/mesa/drivers/dri/i965/gen6_vs_state.c | 16 +++--- src/mesa/drivers/dri/i965/gen7_urb.c | 2 +- src/mesa/drivers/dri/i965/gen7_vs_state.c | 6 +- 16 files changed, 155 insertions(+), 105 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 559f7e8..93bcf55 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -435,10 +435,11 @@ struct brw_gs_prog_data { unsigned svbi_postincrement_value; }; -/* Note: brw_vs_prog_data_compare() must be updated when adding fields to this - * struct! + +/* Note: brw_vec4_prog_data_compare() must be updated when adding fields to + * this struct! */ -struct brw_vs_prog_data { +struct brw_vec4_prog_data { struct brw_vue_map vue_map; GLuint curb_read_length; @@ -448,21 +449,31 @@ struct brw_vs_prog_data { GLuint nr_pull_params; /** number of dwords referenced by pull_param[] */ GLuint total_scratch; - GLbitfield64 inputs_read; - - /* Used for calculating urb partitions: + /* Used for calculating urb partitions. In the VS, this is the size of the +* URB entry used for both input and output to the thread. 
In the GS, this +* is the size of the URB entry used for output. */ GLuint urb_entry_size; - bool uses_vertexid; - int num_surfaces; - /* These pointers must appear last. See brw_vs_prog_data_compare(). */ + /* These pointers must appear last. See brw_vec4_prog_data_compare(). */ const float **param; const float **pull_param; }; + +/* Note: brw_vs_prog_data_compare() must be updated when adding fields to this + * struct! + */ +struct brw_vs_prog_data { + struct brw_vec4_prog_data base; + + GLbitfield64 inputs_read; + + bool uses_vertexid; +}; + /** Number of texture sampler units */ #define BRW_MAX_TEX_UNIT 16 diff --git a/src/mesa/drivers/dri/i965/brw_curbe.c b/src/mesa/drivers/dri/i965/brw_curbe.c index b332f19..3abd22b 100644 --- a/src/mesa/drivers/dri/i965/brw_curbe.c +++ b/src/mesa/drivers/dri/i965/brw_curbe.c @@ -60,7 +60,7 @@ static void calculate_curbe_offsets( struct brw_context *brw ) const GLuint nr_fp_regs = (brw-wm.prog_data-nr_params + 15) / 16; /* BRW_NEW_VERTEX_PROGRAM */ - const GLuint nr_vp_regs = (brw-vs.prog_data-nr_params + 15) / 16; + const GLuint nr_vp_regs = (brw-vs.prog_data-base.nr_params + 15) / 16; GLuint nr_clip_regs = 0; GLuint total_regs; @@ -240,8 +240,8 @@ brw_upload_constant_buffer(struct brw_context *brw) if (brw-curbe.vs_size) { GLuint offset = brw-curbe.vs_start * 16; - for (i = 0; i brw-vs.prog_data-nr_params; i++) { - buf[offset + i] = *brw-vs.prog_data-param[i]; + for (i = 0; i brw-vs.prog_data-base.nr_params; i++) { + buf[offset + i] = *brw-vs.prog_data-base.param[i]; } } diff --git a/src/mesa/drivers/dri/i965/brw_gs.c b/src/mesa/drivers/dri/i965/brw_gs.c index 00a2a5d..caa3b3e 100644 --- a/src/mesa/drivers/dri/i965/brw_gs.c +++ b/src/mesa/drivers/dri/i965/brw_gs.c @@ -57,7 +57,7 @@ static void compile_gs_prog( struct brw_context *brw, memset(c, 0, sizeof(c)); c.key = *key; - c.vue_map = brw-vs.prog_data-vue_map; + c.vue_map = brw-vs.prog_data-base.vue_map; c.nr_regs = (c.vue_map.num_slots + 1)/2; mem_ctx = 
ralloc_context(NULL); @@ -167,7 +167,7 @@ static void populate_key( struct brw_context *brw, memset(key, 0, sizeof(*key)); /* CACHE_NEW_VS_PROG (part of VUE map) */ - key-attrs = brw-vs.prog_data-vue_map.slots_valid; + key-attrs = brw-vs.prog_data-base.vue_map.slots_valid; /* BRW_NEW_PRIMITIVE */ key-primitive = brw-primitive; diff --git a/src/mesa/drivers/dri/i965/brw_urb.c b/src/mesa/drivers/dri/i965/brw_urb.c index b1126b5..3f42ba8 100644 --- a/src/mesa/drivers/dri/i965/brw_urb.c +++ b/src/mesa/drivers/dri/i965/brw_urb.c @@ -116,7 +116,7 @@ static void
[Mesa-dev] [PATCH v2 08/19] i965/vs: Make vec4_vs_visitor class derived from vec4_visitor.
This patch just creates the derived class; later patches will migrate
VS-specific functions and data structures from the base class into the
derived class.

Reviewed-by: Jordan Justen jordan.l.jus...@intel.com
Reviewed-by: Eric Anholt e...@anholt.net
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp                    |  2 +-
 src/mesa/drivers/dri/i965/brw_vec4.h                      | 11 +++
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp            | 11 +++
 src/mesa/drivers/dri/i965/test_vec4_register_coalesce.cpp | 15 ++-
 4 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 8a52910..6420e4d 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1519,7 +1519,7 @@ brw_vs_emit(struct brw_context *brw,
       }
    }

-   vec4_visitor v(brw, c, prog_data, prog, shader, mem_ctx);
+   vec4_vs_visitor v(brw, c, prog_data, prog, shader, mem_ctx);
    if (!v.run()) {
       prog->LinkStatus = false;
       ralloc_strcat(&prog->InfoLog, v.fail_msg);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index c98003d..5dbe128 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -486,6 +486,17 @@ public:
    void dump_instructions();
 };

+class vec4_vs_visitor : public vec4_visitor
+{
+public:
+   vec4_vs_visitor(struct brw_context *brw,
+                   struct brw_vs_compile *c,
+                   struct brw_vs_prog_data *prog_data,
+                   struct gl_shader_program *prog,
+                   struct brw_shader *shader,
+                   void *mem_ctx);
+};
+
 /**
  * The vertex shader code generator.
  *
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 6bc9769..f5f53ad 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -3023,6 +3023,17 @@ vec4_visitor::~vec4_visitor()
 }


+vec4_vs_visitor::vec4_vs_visitor(struct brw_context *brw,
+                                 struct brw_vs_compile *c,
+                                 struct brw_vs_prog_data *prog_data,
+                                 struct gl_shader_program *prog,
+                                 struct brw_shader *shader,
+                                 void *mem_ctx)
+   : vec4_visitor(brw, c, prog_data, prog, shader, mem_ctx)
+{
+}
+
+
 void
 vec4_visitor::fail(const char *format, ...)
 {
diff --git a/src/mesa/drivers/dri/i965/test_vec4_register_coalesce.cpp b/src/mesa/drivers/dri/i965/test_vec4_register_coalesce.cpp
index fb55714..418edd2 100644
--- a/src/mesa/drivers/dri/i965/test_vec4_register_coalesce.cpp
+++ b/src/mesa/drivers/dri/i965/test_vec4_register_coalesce.cpp
@@ -42,6 +42,19 @@ public:
    vec4_visitor *v;
 };

+
+class register_coalesce_vec4_visitor : public vec4_visitor
+{
+public:
+   register_coalesce_vec4_visitor(struct brw_context *brw,
+                                  struct brw_vs_compile *c,
+                                  struct gl_shader_program *shader_prog)
+      : vec4_visitor(brw, c, NULL, shader_prog, NULL, NULL)
+   {
+   }
+};
+
+
 void register_coalesce_test::SetUp()
 {
    brw = (struct brw_context *)calloc(1, sizeof(*brw));
@@ -53,7 +66,7 @@ void register_coalesce_test::SetUp()

    shader_prog = ralloc(NULL, struct gl_shader_program);

-   v = new vec4_visitor(brw, c, NULL, shader_prog, NULL, NULL);
+   v = new register_coalesce_vec4_visitor(brw, c, shader_prog);

    _mesa_init_vertex_program(ctx, &c->vp->program, GL_VERTEX_SHADER, 0);

-- 
1.8.2.1
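The class layout this patch sets up can be sketched with toy types. None of the names below exist in Mesa, and the constructor arguments are cut down to two for brevity. The sketch shows a generic base visitor, a stage-specific subclass that simply forwards its constructor arguments, and a test-only subclass that fills in NULL for everything the test does not need. Giving the unit test its own subclass presumably keeps it independent of the VS class once the base class can no longer be instantiated directly, but that motivation is an inference from the rest of the series.

```cpp
#include <cassert>
#include <cstddef>

struct toy_prog_data {
   int nr_params;
};

// Generic visitor: owns the state shared by all vec4-based stages.
class toy_vec4_visitor {
public:
   toy_vec4_visitor(toy_prog_data *prog_data, void *mem_ctx)
      : prog_data(prog_data), mem_ctx(mem_ctx)
   {
   }
   virtual ~toy_vec4_visitor() {}

   bool has_prog_data() const { return prog_data != NULL; }

   int nr_params() const
   {
      assert(prog_data);
      return prog_data->nr_params;
   }

protected:
   toy_prog_data *prog_data;
   void *mem_ctx;
};

// VS-specific visitor: for now it only forwards to the base constructor,
// leaving room for VS-only members to move here later.
class toy_vec4_vs_visitor : public toy_vec4_visitor {
public:
   toy_vec4_vs_visitor(toy_prog_data *prog_data, void *mem_ctx)
      : toy_vec4_visitor(prog_data, mem_ctx)
   {
   }
};

// Test-only visitor: supplies NULL for what the test does not exercise.
class toy_test_visitor : public toy_vec4_visitor {
public:
   toy_test_visitor()
      : toy_vec4_visitor(NULL, NULL)
   {
   }
};
```

Code that only needs the generic interface can keep holding a toy_vec4_visitor pointer, just as the unit test in the patch keeps its vec4_visitor *v member while constructing the derived type.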
[Mesa-dev] [PATCH v2 09/19] i965/vs: Make some vec4_visitor functions virtual.
This patch makes the following vec4_visitor functions virtual, since
they will need to be implemented differently for vertex and geometry
shaders. Some of the functions are renamed to reflect their generic
purpose, rather than their VS-specific behaviour:

- setup_attributes
- emit_attribute_fixups (renamed to emit_prolog)
- emit_vertex_program_code (renamed to emit_program_code)
- emit_urb_writes (renamed to emit_thread_end)

Reviewed-by: Jordan Justen jordan.l.jus...@intel.com
Reviewed-by: Eric Anholt e...@anholt.net
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 8
 src/mesa/drivers/dri/i965/brw_vec4.h | 16
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 4 ++--
 src/mesa/drivers/dri/i965/brw_vec4_vp.cpp | 2 +-
 .../dri/i965/test_vec4_register_coalesce.cpp | 22 ++
 5 files changed, 41 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 6420e4d..8c4c08a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1172,7 +1172,7 @@ vec4_visitor::dump_instructions()
 }
 
 int
-vec4_visitor::setup_attributes(int payload_reg)
+vec4_vs_visitor::setup_attributes(int payload_reg)
 {
    int nr_attributes;
    int attribute_map[VERT_ATTRIB_MAX + 1];
@@ -1392,7 +1392,7 @@ vec4_visitor::run()
    if (INTEL_DEBUG & DEBUG_SHADER_TIME)
       emit_shader_time_begin();
 
-   emit_attribute_fixups();
+   emit_prolog();
 
    /* Generate VS IR for main().  (the visitor only descends into
     * functions called main).
@@ -1400,14 +1400,14 @@ vec4_visitor::run()
    if (shader) {
       visit_instructions(shader->ir);
    } else {
-      emit_vertex_program_code();
+      emit_program_code();
    }
    base_ir = NULL;
 
    if (c->key.base.userclip_active && !c->key.base.uses_clip_distance)
       setup_uniform_clipplane_values();
 
-   emit_urb_writes();
+   emit_thread_end();
 
    /* Before any optimization, push array accesses out to scratch
     * space where we need them to be.  This pass may allocate new
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index 5dbe128..76a840a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -319,7 +319,6 @@ public:
    void setup_uniform_clipplane_values();
    void setup_uniform_values(ir_variable *ir);
    void setup_builtin_uniform_values(ir_variable *ir);
-   int setup_attributes(int payload_reg);
    int setup_uniforms(int payload_reg);
    void setup_payload();
    bool reg_allocate_trivial();
@@ -403,8 +402,6 @@ public:
    void visit_instructions(const exec_list *list);
 
    void setup_vp_regs();
-   void emit_attribute_fixups();
-   void emit_vertex_program_code();
    void emit_vp_sop(uint32_t condmod, dst_reg dst,
                     src_reg src0, src_reg src1, src_reg one);
    dst_reg get_vp_dst_reg(const prog_dst_register &dst);
@@ -453,7 +450,6 @@ public:
    void emit_clip_distances(struct brw_reg reg, int offset);
    void emit_generic_urb_slot(dst_reg reg, int varying);
    void emit_urb_slot(int mrf, int varying);
-   void emit_urb_writes(void);
 
    void emit_shader_time_begin();
    void emit_shader_time_end();
@@ -484,6 +480,12 @@ public:
 
    void dump_instruction(vec4_instruction *inst);
    void dump_instructions();
+
+protected:
+   virtual int setup_attributes(int payload_reg) = 0;
+   virtual void emit_prolog() = 0;
+   virtual void emit_program_code() = 0;
+   virtual void emit_thread_end() = 0;
 };
 
 class vec4_vs_visitor : public vec4_visitor
@@ -495,6 +497,12 @@ public:
                    struct gl_shader_program *prog,
                    struct brw_shader *shader,
                    void *mem_ctx);
+
+protected:
+   virtual int setup_attributes(int payload_reg);
+   virtual void emit_prolog();
+   virtual void emit_program_code();
+   virtual void emit_thread_end();
 };
 
 /**
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index f5f53ad..13a293a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -902,7 +902,7 @@ with_writemask(dst_reg const & r, int mask)
 }
 
 void
-vec4_visitor::emit_attribute_fixups()
+vec4_vs_visitor::emit_prolog()
 {
    dst_reg sign_recovery_shift;
    dst_reg normalize_factor;
@@ -2602,7 +2602,7 @@ align_interleaved_urb_mlen(struct brw_context *brw, int mlen)
  * The VUE layout is documented in Volume 2a.
  */
 void
-vec4_visitor::emit_urb_writes()
+vec4_vs_visitor::emit_thread_end()
 {
    /* MRF 0 is reserved for the debugger, so start with message header
     * in MRF 1.
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp b/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
index 13156dd..abb64ed 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
@@
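For readers skimming the series, the shape of this refactoring can be sketched in isolation. The following is a minimal standalone illustration, not Mesa code: the class names visitor_sketch, vs_visitor_sketch and gs_visitor_sketch are hypothetical, and the strings only mark which stage-specific hook ran. The base class fixes the order of the steps, as vec4_visitor::run() does, while each stage fills in the pure virtual hooks.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for vec4_visitor: run() fixes the order of the
// steps, and the stage-specific subclasses supply the steps themselves.
class visitor_sketch {
public:
   virtual ~visitor_sketch() {}

   std::vector<std::string> run()
   {
      steps.clear();
      emit_prolog();
      emit_program_code();
      emit_thread_end();
      return steps;
   }

protected:
   // Pure virtual hooks, mirroring the names introduced by the patch.
   virtual void emit_prolog() = 0;
   virtual void emit_program_code() = 0;
   virtual void emit_thread_end() = 0;

   std::vector<std::string> steps;
};

// Hypothetical VS implementation: the strings name the old VS-only helpers.
class vs_visitor_sketch : public visitor_sketch {
protected:
   virtual void emit_prolog() { steps.push_back("vs: attribute fixups"); }
   virtual void emit_program_code() { steps.push_back("vs: program code"); }
   virtual void emit_thread_end() { steps.push_back("vs: URB writes"); }
};

// Hypothetical GS implementation, only here to show a second stage.
class gs_visitor_sketch : public visitor_sketch {
protected:
   virtual void emit_prolog() { steps.push_back("gs: prolog"); }
   virtual void emit_program_code() { steps.push_back("gs: program code"); }
   virtual void emit_thread_end() { steps.push_back("gs: thread end"); }
};
```

Calling run() through a visitor_sketch pointer dispatches to whichever stage the object belongs to, which is the property the later GS patches rely on.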
[Mesa-dev] [PATCH v2 10/19] i965/vs: Add virtual function make_reg_for_system_value().
The system values handled by vec4_visitor::visit(ir_variable *) are
VS-specific (vertex ID and instance ID). This patch moves the handling
of those values into a new virtual function,
make_reg_for_system_value(), so that this VS-specific code won't be
inherited by geometry shaders.

Reviewed-by: Jordan Justen jordan.l.jus...@intel.com
Reviewed-by: Eric Anholt e...@anholt.net
---
 src/mesa/drivers/dri/i965/brw_vec4.h | 2 +
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 46 +-
 .../dri/i965/test_vec4_register_coalesce.cpp | 6 +++
 3 files changed, 36 insertions(+), 18 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index 76a840a..096ade7 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -482,6 +482,7 @@ public:
    void dump_instructions();
 
 protected:
+   virtual dst_reg *make_reg_for_system_value(ir_variable *ir) = 0;
    virtual int setup_attributes(int payload_reg) = 0;
    virtual void emit_prolog() = 0;
    virtual void emit_program_code() = 0;
@@ -499,6 +500,7 @@ public:
                    void *mem_ctx);
 
 protected:
+   virtual dst_reg *make_reg_for_system_value(ir_variable *ir);
    virtual int setup_attributes(int payload_reg);
    virtual void emit_prolog();
    virtual void emit_program_code();
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 13a293a..c3154b8 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1015,6 +1015,33 @@ vec4_vs_visitor::emit_prolog()
    }
 }
 
+
+dst_reg *
+vec4_vs_visitor::make_reg_for_system_value(ir_variable *ir)
+{
+   /* VertexID is stored by the VF as the last vertex element, but
+    * we don't represent it with a flag in inputs_read, so we call
+    * it VERT_ATTRIB_MAX, which setup_attributes() picks up on.
+    */
+   dst_reg *reg = new(mem_ctx) dst_reg(ATTR, VERT_ATTRIB_MAX);
+   prog_data->uses_vertexid = true;
+
+   switch (ir->location) {
+   case SYSTEM_VALUE_VERTEX_ID:
+      reg->writemask = WRITEMASK_X;
+      break;
+   case SYSTEM_VALUE_INSTANCE_ID:
+      reg->writemask = WRITEMASK_Y;
+      break;
+   default:
+      assert(!"not reached");
+      break;
+   }
+
+   return reg;
+}
+
+
 void
 vec4_visitor::visit(ir_variable *ir)
 {
@@ -1068,24 +1095,7 @@ vec4_visitor::visit(ir_variable *ir)
       break;
 
    case ir_var_system_value:
-      /* VertexID is stored by the VF as the last vertex element, but
-       * we don't represent it with a flag in inputs_read, so we call
-       * it VERT_ATTRIB_MAX, which setup_attributes() picks up on.
-       */
-      reg = new(mem_ctx) dst_reg(ATTR, VERT_ATTRIB_MAX);
-      prog_data->uses_vertexid = true;
-
-      switch (ir->location) {
-      case SYSTEM_VALUE_VERTEX_ID:
-         reg->writemask = WRITEMASK_X;
-         break;
-      case SYSTEM_VALUE_INSTANCE_ID:
-         reg->writemask = WRITEMASK_Y;
-         break;
-      default:
-         assert(!"not reached");
-         break;
-      }
+      reg = make_reg_for_system_value(ir);
       break;
 
    default:
diff --git a/src/mesa/drivers/dri/i965/test_vec4_register_coalesce.cpp b/src/mesa/drivers/dri/i965/test_vec4_register_coalesce.cpp
index 60a993e..7c507f8 100644
--- a/src/mesa/drivers/dri/i965/test_vec4_register_coalesce.cpp
+++ b/src/mesa/drivers/dri/i965/test_vec4_register_coalesce.cpp
@@ -54,6 +54,12 @@ public:
    }
 
 protected:
+   virtual dst_reg *make_reg_for_system_value(ir_variable *ir)
+   {
+      assert(!"Not reached");
+      return NULL;
+   }
+
    virtual int setup_attributes(int payload_reg)
    {
       assert(!"Not reached");
--
1.8.2.1
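The effect of this patch can be illustrated with a small standalone sketch. Everything below is hypothetical scaffolding rather than Mesa code: reg_sketch, the SKETCH_* constants and the two classes are stand-ins, and SKETCH_LAST_ATTR_SLOT is an arbitrary number playing the role of VERT_ATTRIB_MAX. The generic visitor only calls the hook; the VS subclass decides that vertex ID lives in the X channel and instance ID in the Y channel of the same attribute slot. Where the patch asserts on an unknown system value, the sketch returns false so it stays testable.

```cpp
#include <cassert>

// Hypothetical stand-ins for the system value locations and writemasks.
enum system_value_sketch { SKETCH_VERTEX_ID, SKETCH_INSTANCE_ID, SKETCH_OTHER };
enum { SKETCH_MASK_X = 1 << 0, SKETCH_MASK_Y = 1 << 1 };

// Arbitrary slot number standing in for VERT_ATTRIB_MAX.
static const int SKETCH_LAST_ATTR_SLOT = 16;

struct reg_sketch {
   int attr_slot;
   int writemask;
};

class sysval_visitor_sketch {
public:
   virtual ~sysval_visitor_sketch() {}

   // Generic code path: knows nothing about which values a stage supports.
   bool visit_system_value(system_value_sketch sv, reg_sketch *out)
   {
      return make_reg_for_system_value(sv, out);
   }

protected:
   virtual bool make_reg_for_system_value(system_value_sketch sv,
                                          reg_sketch *out) = 0;
};

class vs_sysval_visitor_sketch : public sysval_visitor_sketch {
public:
   vs_sysval_visitor_sketch() : uses_vertexid(false) {}

   bool uses_vertexid;

protected:
   virtual bool make_reg_for_system_value(system_value_sketch sv,
                                          reg_sketch *out)
   {
      // Both VS system values share the extra attribute slot.
      out->attr_slot = SKETCH_LAST_ATTR_SLOT;
      uses_vertexid = true;

      switch (sv) {
      case SKETCH_VERTEX_ID:
         out->writemask = SKETCH_MASK_X;
         return true;
      case SKETCH_INSTANCE_ID:
         out->writemask = SKETCH_MASK_Y;
         return true;
      default:
         // The patch asserts here; the sketch reports failure instead.
         return false;
      }
   }
};
```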
[Mesa-dev] [PATCH v2 11/19] i965/vs: move ARB_vertex_program functions to vec4_vs_visitor.
This patch moves functions from vec4_visitor to vec4_vs_visitor that
deal with ARB (assembly) vertex programs. There's no point in having
these functions in the base class since we don't intend to support
assembly programs for the GS stage.

The following functions are moved:

- setup_vp_regs
- get_vp_dst_reg
- get_vp_src_reg

Reviewed-by: Jordan Justen jordan.l.jus...@intel.com
Reviewed-by: Eric Anholt e...@anholt.net
---
 src/mesa/drivers/dri/i965/brw_vec4.h | 8 +---
 src/mesa/drivers/dri/i965/brw_vec4_vp.cpp | 6 +++---
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index 096ade7..11c3eb4 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -401,11 +401,8 @@ public:
    /** Walks an exec_list of ir_instruction and sends it through this visitor. */
    void visit_instructions(const exec_list *list);
 
-   void setup_vp_regs();
    void emit_vp_sop(uint32_t condmod, dst_reg dst,
                     src_reg src0, src_reg src1, src_reg one);
-   dst_reg get_vp_dst_reg(const prog_dst_register &dst);
-   src_reg get_vp_src_reg(const prog_src_register &src);
 
    void emit_bool_to_cond_code(ir_rvalue *ir, uint32_t *predicate);
    void emit_bool_comparison(unsigned int op, dst_reg dst, src_reg src0, src_reg src1);
@@ -505,6 +502,11 @@ protected:
    virtual void emit_prolog();
    virtual void emit_program_code();
    virtual void emit_thread_end();
+
+private:
+   void setup_vp_regs();
+   dst_reg get_vp_dst_reg(const prog_dst_register &dst);
+   src_reg get_vp_src_reg(const prog_src_register &src);
 };
 
 /**
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp b/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
index abb64ed..3f1d3a7 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
@@ -420,7 +420,7 @@ vec4_vs_visitor::emit_program_code()
 }
 
 void
-vec4_visitor::setup_vp_regs()
+vec4_vs_visitor::setup_vp_regs()
 {
    /* PROGRAM_TEMPORARY */
    int num_temp = prog->NumTemporaries;
@@ -464,7 +464,7 @@ vec4_visitor::setup_vp_regs()
 }
 
 dst_reg
-vec4_visitor::get_vp_dst_reg(const prog_dst_register &dst)
+vec4_vs_visitor::get_vp_dst_reg(const prog_dst_register &dst)
 {
    dst_reg result;
 
@@ -498,7 +498,7 @@ vec4_visitor::get_vp_dst_reg(const prog_dst_register &dst)
 }
 
 src_reg
-vec4_visitor::get_vp_src_reg(const prog_src_register &src)
+vec4_vs_visitor::get_vp_src_reg(const prog_src_register &src)
 {
    struct gl_program_parameter_list *plist =
       c->vp->program.Base.Parameters;
--
1.8.2.1
[Mesa-dev] [PATCH v2 13/19] i965/vs: Rename vec4_generator::prog to shader_prog.
The next patch is going to change the type of vec4_generator::vp from
struct gl_vertex_program * to struct gl_program *, and rename it. The
sensible name to change it to is vec4_generator::prog. However, prog is
already used. Since the existing vec4_generator::prog is of type struct
gl_shader_program, it makes sense to rename it to shader_prog.
---
 src/mesa/drivers/dri/i965/brw_vec4.h | 4 ++--
 src/mesa/drivers/dri/i965/brw_vec4_emit.cpp | 8
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index 08a6654..dca6a55 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -524,7 +524,7 @@ class vec4_generator
 public:
    vec4_generator(struct brw_context *brw,
                   struct brw_vs_compile *c,
-                  struct gl_shader_program *prog,
+                  struct gl_shader_program *shader_prog,
                   void *mem_ctx);
    ~vec4_generator();
 
@@ -581,7 +581,7 @@ private:
    struct brw_compile *p;
    struct brw_vs_compile *c;
 
-   struct gl_shader_program *prog;
+   struct gl_shader_program *shader_prog;
    struct gl_shader *shader;
    const struct gl_vertex_program *vp;
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
index e378f7f..1c99d6a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
@@ -134,14 +134,14 @@ vec4_instruction::get_src(int i)
 
 vec4_generator::vec4_generator(struct brw_context *brw,
                                struct brw_vs_compile *c,
-                               struct gl_shader_program *prog,
+                               struct gl_shader_program *shader_prog,
                                void *mem_ctx)
-   : brw(brw), c(c), prog(prog), mem_ctx(mem_ctx)
+   : brw(brw), c(c), shader_prog(shader_prog), mem_ctx(mem_ctx)
 {
    intel = &brw->intel;
    vp = &c->vp->program;
 
-   shader = prog ? prog->_LinkedShaders[MESA_SHADER_VERTEX] : NULL;
+   shader = shader_prog ? shader_prog->_LinkedShaders[MESA_SHADER_VERTEX] : NULL;
 
    p = rzalloc(mem_ctx, struct brw_compile);
    brw_init_compile(brw, p, mem_ctx);
@@ -697,7 +697,7 @@ vec4_generator::generate_code(exec_list *instructions)
 
    if (unlikely(INTEL_DEBUG & DEBUG_VS)) {
       if (shader) {
-         printf("Native code for vertex shader %d:\n", prog->Name);
+         printf("Native code for vertex shader %d:\n", shader_prog->Name);
       } else {
          printf("Native code for vertex program %d:\n", c->vp->program.Base.Id);
       }
--
1.8.2.1
[Mesa-dev] [PATCH v2 14/19] i965/vs: Generalize data structures pointed to by vec4_generator.
This patch removes the following field from vec4_generator, since it is
not used:

- struct brw_vs_compile *c

And changes the following field:

- struct gl_vertex_program *vp => struct gl_program *prog

With these changes, vec4_generator no longer refers to any VS-specific
data structures. This will pave the way for re-using it for geometry
shaders.

Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

v2: Use the name prog rather than p.
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 2 +-
 src/mesa/drivers/dri/i965/brw_vec4.h | 5 ++---
 src/mesa/drivers/dri/i965/brw_vec4_emit.cpp | 9 -
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 9026501..b366036 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1526,7 +1526,7 @@ brw_vs_emit(struct brw_context *brw,
       return NULL;
    }
 
-   vec4_generator g(brw, c, prog, mem_ctx);
+   vec4_generator g(brw, prog, &c->vp->program.Base, mem_ctx);
    const unsigned *generated =g.generate_assembly(&v.instructions,
                                                    final_assembly_size);
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index dca6a55..0c9fb70 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -523,8 +523,8 @@ class vec4_generator
 {
 public:
    vec4_generator(struct brw_context *brw,
-                  struct brw_vs_compile *c,
                   struct gl_shader_program *shader_prog,
+                  struct gl_program *prog,
                   void *mem_ctx);
    ~vec4_generator();
 
@@ -579,11 +579,10 @@ private:
    struct gl_context *ctx;
 
    struct brw_compile *p;
-   struct brw_vs_compile *c;
 
    struct gl_shader_program *shader_prog;
    struct gl_shader *shader;
-   const struct gl_vertex_program *vp;
+   const struct gl_program *prog;
 
    void *mem_ctx;
 };
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
index 1c99d6a..fc00cb3 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
@@ -133,13 +133,12 @@ vec4_instruction::get_src(int i)
 }
 
 vec4_generator::vec4_generator(struct brw_context *brw,
-                               struct brw_vs_compile *c,
                                struct gl_shader_program *shader_prog,
+                               struct gl_program *prog,
                                void *mem_ctx)
-   : brw(brw), c(c), shader_prog(shader_prog), mem_ctx(mem_ctx)
+   : brw(brw), shader_prog(shader_prog), prog(prog), mem_ctx(mem_ctx)
 {
    intel = &brw->intel;
-   vp = &c->vp->program;
 
    shader = shader_prog ? shader_prog->_LinkedShaders[MESA_SHADER_VERTEX] : NULL;
 
@@ -699,7 +698,7 @@ vec4_generator::generate_code(exec_list *instructions)
       if (shader) {
          printf("Native code for vertex shader %d:\n", shader_prog->Name);
       } else {
-         printf("Native code for vertex program %d:\n", c->vp->program.Base.Id);
+         printf("Native code for vertex program %d:\n", prog->Id);
       }
    }
 
@@ -717,7 +716,7 @@ vec4_generator::generate_code(exec_list *instructions)
          } else {
             const prog_instruction *vpi;
             vpi = (const prog_instruction *) inst->ir;
-            printf("%d: ", (int)(vpi - vp->Base.Instructions));
+            printf("%d: ", (int)(vpi - prog->Instructions));
             _mesa_fprint_instruction_opt(stdout, vpi, 0,
                                          PROG_PRINT_DEBUG, NULL);
          }
--
1.8.2.1
[Mesa-dev] [PATCH v2 15/19] i965/vs: rename vec4_generator::generate_vs_instruction.
Since this function is going to get used for geometry shaders too, it
deserves a more generic name: generate_vec4_instruction.

Reviewed-by: Jordan Justen jordan.l.jus...@intel.com
---
 src/mesa/drivers/dri/i965/brw_vec4.h | 6 +++---
 src/mesa/drivers/dri/i965/brw_vec4_emit.cpp | 8
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index 0c9fb70..73d8949 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -532,9 +532,9 @@ public:
 
 private:
    void generate_code(exec_list *instructions);
-   void generate_vs_instruction(vec4_instruction *inst,
-                                struct brw_reg dst,
-                                struct brw_reg *src);
+   void generate_vec4_instruction(vec4_instruction *inst,
+                                  struct brw_reg dst,
+                                  struct brw_reg *src);
 
    void generate_math1_gen4(vec4_instruction *inst,
                             struct brw_reg dst,
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
index fc00cb3..0f0877e 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
@@ -613,9 +613,9 @@ vec4_generator::generate_pull_constant_load(vec4_instruction *inst,
 }
 
 void
-vec4_generator::generate_vs_instruction(vec4_instruction *instruction,
-                                        struct brw_reg dst,
-                                        struct brw_reg *src)
+vec4_generator::generate_vec4_instruction(vec4_instruction *instruction,
+                                          struct brw_reg dst,
+                                          struct brw_reg *src)
 {
    vec4_instruction *inst = (vec4_instruction *)instruction;
 
@@ -865,7 +865,7 @@ vec4_generator::generate_code(exec_list *instructions)
          break;
 
       default:
-         generate_vs_instruction(inst, dst, src);
+         generate_vec4_instruction(inst, dst, src);
          break;
       }
 
--
1.8.2.1
[Mesa-dev] [PATCH v2 16/19] i965/vs: Generalize vertex emission code in preparation for GS.
This patch introduces a new function, vec4_visitor::emit_vertex(),
which contains the code for emitting vertices that will need to be
common between the vertex and geometry shaders.

Geometry shaders will need to use a different message header, and a
different opcode, for their URB writes, so we introduce virtual
functions emit_urb_write_header() and emit_urb_write_opcode() to take
care of the GS-specific behaviours.

Also, since vertex emission happens at the end of the VS, but in the
middle of the GS, we need to be sure to only call
emit_shader_time_end() during VS vertex emission. We accomplish this by
moving the call to emit_shader_time_end() into the VS implementation of
emit_urb_write_opcode().

Reviewed-by: Jordan Justen jordan.l.jus...@intel.com
---
 src/mesa/drivers/dri/i965/brw_vec4.h | 5 ++
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 63 +++---
 .../dri/i965/test_vec4_register_coalesce.cpp | 10
 3 files changed, 59 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index 73d8949..19fa1a5 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -479,11 +479,14 @@ public:
    void dump_instructions();
 
 protected:
+   void emit_vertex();
    virtual dst_reg *make_reg_for_system_value(ir_variable *ir) = 0;
    virtual int setup_attributes(int payload_reg) = 0;
    virtual void emit_prolog() = 0;
    virtual void emit_program_code() = 0;
    virtual void emit_thread_end() = 0;
+   virtual void emit_urb_write_header(int mrf) = 0;
+   virtual vec4_instruction *emit_urb_write_opcode(bool complete) = 0;
 };
 
 class vec4_vs_visitor : public vec4_visitor
@@ -502,6 +505,8 @@ protected:
    virtual void emit_prolog();
    virtual void emit_program_code();
    virtual void emit_thread_end();
+   virtual void emit_urb_write_header(int mrf);
+   virtual vec4_instruction *emit_urb_write_opcode(bool complete);
 
 private:
    void setup_vp_regs();
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index e3dafc4..a98aae8 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -2605,14 +2605,38 @@ align_interleaved_urb_mlen(struct brw_context *brw, int mlen)
    return mlen;
 }
 
+void
+vec4_vs_visitor::emit_urb_write_header(int mrf)
+{
+   /* No need to do anything for VS; an implied write to this MRF will be
+    * performed by VS_OPCODE_URB_WRITE.
+    */
+   (void) mrf;
+}
+
+vec4_instruction *
+vec4_vs_visitor::emit_urb_write_opcode(bool complete)
+{
+   /* For VS, the URB writes end the thread. */
+   if (complete) {
+      if (INTEL_DEBUG & DEBUG_SHADER_TIME)
+         emit_shader_time_end();
+   }
+
+   vec4_instruction *inst = emit(VS_OPCODE_URB_WRITE);
+   inst->eot = complete;
+
+   return inst;
+}
+
 /**
- * Generates the VUE payload plus the 1 or 2 URB write instructions to
- * complete the VS thread.
+ * Generates the VUE payload plus the necessary URB write instructions to
+ * output it.
  *
  * The VUE layout is documented in Volume 2a.
  */
 void
-vec4_vs_visitor::emit_thread_end()
+vec4_visitor::emit_vertex()
 {
    /* MRF 0 is reserved for the debugger, so start with message header
     * in MRF 1.
@@ -2631,10 +2655,10 @@ vec4_vs_visitor::emit_thread_end()
     */
    assert ((max_usable_mrf - base_mrf) % 2 == 0);
 
-   /* First mrf is the g0-based message header containing URB handles and such,
-    * which is implied in VS_OPCODE_URB_WRITE.
+   /* First mrf is the g0-based message header containing URB handles and
+    * such.
     */
-   mrf++;
+   emit_urb_write_header(mrf++);
 
    if (intel->gen < 6) {
       emit_ndc_computation();
@@ -2654,19 +2678,14 @@ vec4_vs_visitor::emit_thread_end()
       }
    }
 
-   bool eot = slot >= prog_data->vue_map.num_slots;
-   if (eot) {
-      if (INTEL_DEBUG & DEBUG_SHADER_TIME)
-         emit_shader_time_end();
-   }
+   bool complete = slot >= prog_data->vue_map.num_slots;
    current_annotation = "URB write";
-   vec4_instruction *inst = emit(VS_OPCODE_URB_WRITE);
+   vec4_instruction *inst = emit_urb_write_opcode(complete);
    inst->base_mrf = base_mrf;
    inst->mlen = align_interleaved_urb_mlen(brw, mrf - base_mrf);
-   inst->eot = eot;
 
    /* Optional second URB write */
-   if (!inst->eot) {
+   if (!complete) {
       mrf = base_mrf + 1;
 
       for (; slot < prog_data->vue_map.num_slots; ++slot) {
@@ -2675,14 +2694,10 @@ vec4_vs_visitor::emit_thread_end()
          emit_urb_slot(mrf++, prog_data->vue_map.slot_to_varying[slot]);
       }
 
-      if (INTEL_DEBUG & DEBUG_SHADER_TIME)
-         emit_shader_time_end();
-
       current_annotation = "URB write";
-      inst = emit(VS_OPCODE_URB_WRITE);
+      inst = emit_urb_write_opcode(true /* complete */);
       inst->base_mrf = base_mrf;
       inst->mlen = align_interleaved_urb_mlen(brw, mrf - base_mrf);
-      inst->eot = true;
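The control flow that ends up in emit_vertex() can be sketched on its own. The code below is an assumption-laden illustration, not the driver code: urb_write_sketch, vertex_emitter_sketch and vs_emitter_sketch are hypothetical names, the slot accounting is simplified to plain integers, and the VS hook counts shader-time-end calls unconditionally where the patch only makes the call when INTEL_DEBUG & DEBUG_SHADER_TIME is set. What it does show is the split of responsibilities: the shared code decides whether one or two URB writes are needed, and the stage hook decides what "complete" means for the instruction it emits.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record of one emitted URB write.
struct urb_write_sketch {
   int first_slot;
   int num_slots;
   bool eot;
};

class vertex_emitter_sketch {
public:
   virtual ~vertex_emitter_sketch() {}

   // Shared logic: emit one write, plus an optional second write when the
   // vertex does not fit into max_slots_per_write slots.
   void emit_vertex(int num_slots, int max_slots_per_write)
   {
      int slot = num_slots < max_slots_per_write ? num_slots
                                                 : max_slots_per_write;
      bool complete = slot >= num_slots;

      urb_write_sketch w = emit_urb_write_opcode(complete);
      w.first_slot = 0;
      w.num_slots = slot;
      writes.push_back(w);

      if (!complete) {
         urb_write_sketch w2 = emit_urb_write_opcode(true /* complete */);
         w2.first_slot = slot;
         w2.num_slots = num_slots - slot;
         writes.push_back(w2);
      }
   }

   std::vector<urb_write_sketch> writes;

protected:
   virtual urb_write_sketch emit_urb_write_opcode(bool complete) = 0;
};

class vs_emitter_sketch : public vertex_emitter_sketch {
public:
   vs_emitter_sketch() : shader_time_end_calls(0) {}

   int shader_time_end_calls;

protected:
   // For the VS, the completing URB write also ends the thread.
   virtual urb_write_sketch emit_urb_write_opcode(bool complete)
   {
      if (complete)
         shader_time_end_calls++;

      urb_write_sketch w = { 0, 0, complete };
      return w;
   }
};
```

A GS implementation of the hook would presumably leave eot unset and skip the shader-time bookkeeping, since vertex emission there happens in the middle of the thread.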
Re: [Mesa-dev] [PATCH 00/12] Death to array dereferences of vectors!
On 04/09/2013 10:49 AM, Eric Anholt wrote:
> Ian Romanick i...@freedesktop.org writes:
>
>> This series gradually replaces array dereferences of vectors with two
>> expressions. It takes so many patches because changes are needed to
>> the existing lowering passes and because several places in the code
>> generate array dereferences of vectors (e.g., lowering accesses to
>> gl_ClipDistance). There is also some challenge in dealing with
>> function inout parameters that are indexed vectors.
>>
>> The two new expressions are ir_binop_vector_extract and
>> ir_triop_vector_insert. The former has a vector operand and a scalar
>> operand. The result is the scalar value from the vector specified by
>> the scalar. The latter takes a vector and two scalars. The result is a
>> new vector with one indexed field replaced by a scalar value.
>>
>> Together this series fixes piglit tests glsl-vs-channel-overwrite-01
>> and glsl-vs-channel-overwrite-03.
>
> Throughout the series, there's a bunch of introduction of new tabs for
> indentation. Paul pointed out long ago that the devinfo.html had
> specified a no-tabs indent style in the indent command since 2006, and
> I found that basically you and I were the only ones putting tabs in, so
> I stopped. I've found reading diffs has become easier since avoiding
> tabs, since you don't get diffs with apparently-incorrect indentation
> (thanks to + being 3 characters, in particular).

Yeah... I agree 100%, and I thought I had fixed that. :( It seems like
emacs gets confused when editing code areas that use tabs. I'm honestly
not sure what the deal is, but I'll go back and fix that before pushing.
Thanks for pointing it out... I didn't notice. :(

> I'd love to see this code fixed to not use tabs. If you use emacs,
> removing your custom configuration for Mesa and relying on
> .dir-locals.el will get you the preferred style for future work.
>
> Other than that, the patches other than the ones I commented on and the
> gl_ClipDistance ones are:
>
> Reviewed-by: Eric Anholt e...@anholt.net
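To make the semantics of the two new expressions concrete, here is a small value-level sketch. It is not the GLSL IR implementation: vec4_sketch and the two free functions are hypothetical, the operand order of the insert (vector, new scalar, index) is an assumption since the cover letter only says "a vector and two scalars", and out-of-range indices are simply not handled.

```cpp
#include <cassert>

// Hypothetical four-component vector value.
struct vec4_sketch {
   float f[4];
};

// Models ir_binop_vector_extract: (vector, index) -> scalar.
float vector_extract(const vec4_sketch &v, int index)
{
   return v.f[index];
}

// Models ir_triop_vector_insert: returns a new vector in which one indexed
// field is replaced. The input vector is taken by value and is not modified.
vec4_sketch vector_insert(vec4_sketch v, float scalar, int index)
{
   v.f[index] = scalar;
   return v;
}
```

The point of the insert form is that it is a pure expression producing a whole new vector, so an assignment like v[i] = s can be expressed as v = vector_insert(v, s, i) without an array dereference on the left-hand side.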
[Mesa-dev] [PATCH v2 17/19] i965/vs: Generalize attribute setup code in preparation for GS.
This patch introduces a new function,
vec4_visitor::lower_attributes_to_hw_regs(), which replaces registers
of type ATTR in the instruction stream with the hardware registers that
store those attributes. This logic will need to be common between the
vertex and geometry shaders.

Reviewed-by: Jordan Justen jordan.l.jus...@intel.com
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 54 --
 src/mesa/drivers/dri/i965/brw_vec4.h | 1 +
 2 files changed, 33 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index b366036..d4d09aa 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1171,29 +1171,13 @@ vec4_visitor::dump_instructions()
    }
 }
 
-int
-vec4_vs_visitor::setup_attributes(int payload_reg)
+/**
+ * Replace each register of type ATTR in this->instructions with a reference
+ * to a fixed HW register.
+ */
+void
+vec4_visitor::lower_attributes_to_hw_regs(const int *attribute_map)
 {
-   int nr_attributes;
-   int attribute_map[VERT_ATTRIB_MAX + 1];
-
-   nr_attributes = 0;
-   for (int i = 0; i < VERT_ATTRIB_MAX; i++) {
-      if (vs_prog_data->inputs_read & BITFIELD64_BIT(i)) {
-         attribute_map[i] = payload_reg + nr_attributes;
-         nr_attributes++;
-      }
-   }
-
-   /* VertexID is stored by the VF as the last vertex element, but we
-    * don't represent it with a flag in inputs_read, so we call it
-    * VERT_ATTRIB_MAX.
-    */
-   if (vs_prog_data->uses_vertexid) {
-      attribute_map[VERT_ATTRIB_MAX] = payload_reg + nr_attributes;
-      nr_attributes++;
-   }
-
    foreach_list(node, &this->instructions) {
       vec4_instruction *inst = (vec4_instruction *)node;
 
@@ -1227,6 +1211,32 @@ vec4_vs_visitor::setup_attributes(int payload_reg)
          inst->src[i].fixed_hw_reg = reg;
       }
    }
+}
+
+int
+vec4_vs_visitor::setup_attributes(int payload_reg)
+{
+   int nr_attributes;
+   int attribute_map[VERT_ATTRIB_MAX + 1];
+
+   nr_attributes = 0;
+   for (int i = 0; i < VERT_ATTRIB_MAX; i++) {
+      if (vs_prog_data->inputs_read & BITFIELD64_BIT(i)) {
+         attribute_map[i] = payload_reg + nr_attributes;
+         nr_attributes++;
+      }
+   }
+
+   /* VertexID is stored by the VF as the last vertex element, but we
+    * don't represent it with a flag in inputs_read, so we call it
+    * VERT_ATTRIB_MAX.
+    */
+   if (vs_prog_data->uses_vertexid) {
+      attribute_map[VERT_ATTRIB_MAX] = payload_reg + nr_attributes;
+      nr_attributes++;
+   }
+
+   lower_attributes_to_hw_regs(attribute_map);
 
    /* The BSpec says we always have to read at least one thing from
     * the VF, and it appears that the hardware wedges otherwise.
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index 19fa1a5..98b496a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -480,6 +480,7 @@ public:
 
 protected:
    void emit_vertex();
+   void lower_attributes_to_hw_regs(const int *attribute_map);
    virtual dst_reg *make_reg_for_system_value(ir_variable *ir) = 0;
    virtual int setup_attributes(int payload_reg) = 0;
    virtual void emit_prolog() = 0;
--
1.8.2.1
[Mesa-dev] [PATCH v2 18/19] i965/vs: Generalize computation of array strides in preparation for GS.
Geometry shader inputs are arrays, but they use an unusual array
layout: instead of all array elements for a given geometry shader input
being stored consecutively, all geometry shader inputs are interleaved
into one giant array. As a result, the array stride we use to access
geometry shader inputs must be equal to the size of the input VUE,
rather than the size of the array element.

This patch introduces a new virtual function,
vec4_visitor::compute_array_stride(), which will allow geometry shader
compilation to specialize the computation of array stride to account
for the unusual layout of geometry shader input arrays.

It also renames the local variable that the ir_dereference_array
visitor uses to store the stride, to avoid confusion.

Reviewed-by: Jordan Justen jordan.l.jus...@intel.com
---
 src/mesa/drivers/dri/i965/brw_vec4.h | 1 +
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 19 +++
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index 98b496a..ddd6087 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -488,6 +488,7 @@ protected:
    virtual void emit_thread_end() = 0;
    virtual void emit_urb_write_header(int mrf) = 0;
    virtual vec4_instruction *emit_urb_write_opcode(bool complete) = 0;
+   virtual int compute_array_stride(ir_dereference_array *ir);
 };
 
 class vec4_vs_visitor : public vec4_visitor
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index a98aae8..e77ca6e 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1692,12 +1692,23 @@ vec4_visitor::visit(ir_dereference_variable *ir)
       this->result.swizzle = swizzle_for_size(type->vector_elements);
 }
 
+
+int
+vec4_visitor::compute_array_stride(ir_dereference_array *ir)
+{
+   /* Under normal circumstances array elements are stored consecutively, so
+    * the stride is equal to the size of the array element.
+    */
+   return type_size(ir->type);
+}
+
+
 void
 vec4_visitor::visit(ir_dereference_array *ir)
 {
    ir_constant *constant_index;
    src_reg src;
-   int element_size = type_size(ir->type);
+   int array_stride = compute_array_stride(ir);
 
    constant_index = ir->array_index->constant_expression_value();
 
@@ -1705,7 +1716,7 @@ vec4_visitor::visit(ir_dereference_array *ir)
    src = this->result;
 
    if (constant_index) {
-      src.reg_offset += constant_index->value.i[0] * element_size;
+      src.reg_offset += constant_index->value.i[0] * array_stride;
    } else {
       /* Variable index array dereference.  It eats the vec4 of the
        * base of the array and an index that offsets the Mesa register
@@ -1715,12 +1726,12 @@ vec4_visitor::visit(ir_dereference_array *ir)
 
       src_reg index_reg;
 
-      if (element_size == 1) {
+      if (array_stride == 1) {
          index_reg = this->result;
       } else {
          index_reg = src_reg(this, glsl_type::int_type);
 
-         emit(MUL(dst_reg(index_reg), this->result, src_reg(array_stride)));
+         emit(MUL(dst_reg(index_reg), this->result, src_reg(array_stride)));
       }
 
       if (src.reladdr) {
--
1.8.2.1
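The difference between the two layouts described in the commit message comes down to which stride multiplies the index. The sketch below is hypothetical throughout: deref_sketch and gs_input_deref_sketch are invented names, sizes are plain register counts, and the GS override is only a guess at what a later patch might do, since this patch adds just the default implementation.

```cpp
#include <cassert>

// Default behaviour: elements are stored consecutively, so the stride is
// the size of one array element.
class deref_sketch {
public:
   explicit deref_sketch(int element_size) : element_size(element_size) {}
   virtual ~deref_sketch() {}

   // Register offset of element `index`, as in the constant-index path.
   int reg_offset(int index) const
   {
      return index * compute_array_stride();
   }

protected:
   virtual int compute_array_stride() const { return element_size; }

   int element_size;
};

// Hypothetical GS input behaviour: all inputs are interleaved per vertex,
// so stepping to the next array element skips a whole input VUE.
class gs_input_deref_sketch : public deref_sketch {
public:
   gs_input_deref_sketch(int element_size, int input_vue_size)
      : deref_sketch(element_size), input_vue_size(input_vue_size) {}

protected:
   virtual int compute_array_stride() const { return input_vue_size; }

   int input_vue_size;
};
```

With a two-register element and a ten-register input VUE, element 3 sits at offset 6 in the ordinary layout and at offset 30 in the interleaved one.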
[Mesa-dev] [PATCH v2 19/19] i965/vs: Don't hardcode DEBUG_VS in generic vec4 code.
Since the vec4_visitor and vec4_generator classes are going to be re-used
for geometry shaders, we can't enable their debug functionality based on
(INTEL_DEBUG & DEBUG_VS) anymore. Instead, add a debug_flag boolean to
these two classes, so that when they're instantiated the caller can specify
whether debug dumps are needed.

Reviewed-by: Jordan Justen <jordan.l.jus...@intel.com>
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp                   |  3 ++-
 src/mesa/drivers/dri/i965/brw_vec4.h                     |  9 +++--
 src/mesa/drivers/dri/i965/brw_vec4_emit.cpp              | 16 +---
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp           |  8 +---
 .../drivers/dri/i965/test_vec4_register_coalesce.cpp     |  3 ++-
 5 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index d4d09aa..6d9bd03 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1536,7 +1536,8 @@ brw_vs_emit(struct brw_context *brw,
       return NULL;
    }
 
-   vec4_generator g(brw, prog, &c->vp->program.Base, mem_ctx);
+   vec4_generator g(brw, prog, &c->vp->program.Base, mem_ctx,
+                    INTEL_DEBUG & DEBUG_VS);
    const unsigned *generated =g.generate_assembly(&v.instructions,
                                                   final_assembly_size);
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h
index ddd6087..db68fcf 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -216,7 +216,8 @@ public:
                 struct brw_vec4_prog_data *prog_data,
                 struct gl_shader_program *shader_prog,
                 struct brw_shader *shader,
-                void *mem_ctx);
+                void *mem_ctx,
+                bool debug_flag);
    ~vec4_visitor();
 
    dst_reg dst_null_f()
@@ -489,6 +490,8 @@ protected:
    virtual void emit_urb_write_header(int mrf) = 0;
    virtual vec4_instruction *emit_urb_write_opcode(bool complete) = 0;
    virtual int compute_array_stride(ir_dereference_array *ir);
+
+   const bool debug_flag;
 };
 
 class vec4_vs_visitor : public vec4_visitor
@@ -532,7 +535,8 @@ public:
    vec4_generator(struct brw_context *brw,
                   struct gl_shader_program *shader_prog,
                   struct gl_program *prog,
-                  void *mem_ctx);
+                  void *mem_ctx,
+                  bool debug_flag);
    ~vec4_generator();
 
    const unsigned *generate_assembly(exec_list *insts, unsigned *asm_size);
@@ -592,6 +596,7 @@ private:
    const struct gl_program *prog;
 
    void *mem_ctx;
+   const bool debug_flag;
 };
 
 } /* namespace brw */
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
index 0f0877e..6e5dbf4 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
@@ -135,8 +135,10 @@ vec4_instruction::get_src(int i)
 vec4_generator::vec4_generator(struct brw_context *brw,
                                struct gl_shader_program *shader_prog,
                                struct gl_program *prog,
-                               void *mem_ctx)
-   : brw(brw), shader_prog(shader_prog), prog(prog), mem_ctx(mem_ctx)
+                               void *mem_ctx,
+                               bool debug_flag)
+   : brw(brw), shader_prog(shader_prog), prog(prog), mem_ctx(mem_ctx),
+     debug_flag(debug_flag)
 {
    intel = &brw->intel;
 
@@ -694,7 +696,7 @@ vec4_generator::generate_code(exec_list *instructions)
    const char *last_annotation_string = NULL;
    const void *last_annotation_ir = NULL;
 
-   if (unlikely(INTEL_DEBUG & DEBUG_VS)) {
+   if (unlikely(debug_flag)) {
       if (shader) {
          printf("Native code for vertex shader %d:\n", shader_prog->Name);
       } else {
@@ -706,7 +708,7 @@ vec4_generator::generate_code(exec_list *instructions)
       vec4_instruction *inst = (vec4_instruction *)node;
       struct brw_reg src[3], dst;
 
-      if (unlikely(INTEL_DEBUG & DEBUG_VS)) {
+      if (unlikely(debug_flag)) {
          if (last_annotation_ir != inst->ir) {
             last_annotation_ir = inst->ir;
             if (last_annotation_ir) {
@@ -882,7 +884,7 @@ vec4_generator::generate_code(exec_list *instructions)
          last->header.dependency_control |= BRW_DEPENDENCY_NOTCHECKED;
       }
 
-      if (unlikely(INTEL_DEBUG & DEBUG_VS)) {
+      if (unlikely(debug_flag)) {
          brw_dump_compile(p, stdout,
                           last_native_insn_offset, p->next_insn_offset);
       }
@@ -890,7 +892,7 @@ vec4_generator::generate_code(exec_list *instructions)
       last_native_insn_offset = p->next_insn_offset;
    }
 
-   if (unlikely(INTEL_DEBUG & DEBUG_VS)) {
+   if (unlikely(debug_flag)) {
       printf("\n");
    }
 
@@ -901,7 +903,7 @@ vec4_generator::generate_code(exec_list *instructions)
     * which is
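The idea behind this patch, letting the caller decide whether a stage-agnostic object dumps debug output, can be sketched in plain C. This is only an illustration under stated assumptions: the EXAMPLE_DEBUG_* bits, struct example_generator and both functions are hypothetical names, not part of Mesa.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical debug bits, standing in for a driver-wide debug mask. */
enum {
   EXAMPLE_DEBUG_VS = 1 << 0,
   EXAMPLE_DEBUG_GS = 1 << 1
};

/* Stage-agnostic generator: it only knows whether to dump, not why. */
struct example_generator {
   const char *stage;
   bool debug_flag;
};

static struct example_generator
example_generator_init(const char *stage, bool debug_flag)
{
   struct example_generator gen = { stage, debug_flag };
   return gen;
}

/* Writes a debug header into log when dumping is enabled.
 * Returns the length the header needs, or 0 if dumping is disabled. */
static int
example_generate(const struct example_generator *gen,
                 char *log, size_t log_size)
{
   assert(gen != NULL && (log != NULL || log_size == 0));

   if (log_size > 0)
      log[0] = '\0';
   if (!gen->debug_flag)
      return 0;
   return snprintf(log, log_size, "Native code for %s shader:\n", gen->stage);
}
```

A caller would compute the flag from its own stage bit, for example `example_generator_init("geometry", (mask & EXAMPLE_DEBUG_GS) != 0)`, so the generator never needs to know which stage it serves.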
[Mesa-dev] [PATCH 2/8] intel: Add functions for checking if objs have hiz enabled
From: Chad Versace <chad.vers...@linux.intel.com>

On Haswell, HiZ will selectively be enabled on individual miptree slices to
workaround a hardware bug. The two new functions below will permit us to
detect if hiz is enabled for a particular slice.

    intel_miptree_slice_has_hiz
    intel_renderbuffer_has_hiz

The functions are not yet used.

Signed-off-by: Chad Versace <chad.vers...@linux.intel.com>
---
 src/mesa/drivers/dri/intel/intel_fbo.c         | 10 ++
 src/mesa/drivers/dri/intel/intel_fbo.h         |  3 +++
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 12
 src/mesa/drivers/dri/intel/intel_mipmap_tree.h | 11 +--
 4 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c b/src/mesa/drivers/dri/intel/intel_fbo.c
index 2977568..0e2ded5 100644
--- a/src/mesa/drivers/dri/intel/intel_fbo.c
+++ b/src/mesa/drivers/dri/intel/intel_fbo.c
@@ -943,6 +943,16 @@ intel_renderbuffer_set_needs_downsample(struct intel_renderbuffer *irb)
       irb->mt->need_downsample = true;
 }
 
+/**
+ * Does the renderbuffer have hiz enabled?
+ */
+bool
+intel_renderbuffer_has_hiz(struct intel_renderbuffer *irb)
+{
+   return irb->mt
+       && intel_miptree_slice_has_hiz(irb->mt, irb->mt_level, irb->mt_layer);
+}
+
 void
 intel_renderbuffer_set_needs_hiz_resolve(struct intel_renderbuffer *irb)
 {
diff --git a/src/mesa/drivers/dri/intel/intel_fbo.h b/src/mesa/drivers/dri/intel/intel_fbo.h
index 9313c35..19edbe7 100644
--- a/src/mesa/drivers/dri/intel/intel_fbo.h
+++ b/src/mesa/drivers/dri/intel/intel_fbo.h
@@ -166,6 +166,9 @@ intel_get_rb_region(struct gl_framebuffer *fb, GLuint attIndex);
 void
 intel_renderbuffer_set_needs_downsample(struct intel_renderbuffer *irb);
 
+bool
+intel_renderbuffer_has_hiz(struct intel_renderbuffer *irb);
+
 void
 intel_renderbuffer_set_needs_hiz_resolve(struct intel_renderbuffer *irb);
 
diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 19c9088..fdb6504 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -1024,6 +1024,18 @@ intel_miptree_alloc_hiz(struct intel_context *intel,
    return true;
 }
 
+/**
+ * Does the miptree slice have hiz enabled?
+ */
+bool
+intel_miptree_slice_has_hiz(struct intel_mipmap_tree *mt,
+                            uint32_t level,
+                            uint32_t layer)
+{
+   intel_miptree_check_level_layer(mt, level, layer);
+   return mt->hiz_mt != NULL;
+}
+
 void
 intel_miptree_slice_set_needs_hiz_resolve(struct intel_mipmap_tree *mt,
                                           uint32_t level,
diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.h b/src/mesa/drivers/dri/intel/intel_mipmap_tree.h
index 3bdda07..f39ff06 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.h
@@ -333,9 +333,11 @@ struct intel_mipmap_tree
    /**
     * \brief HiZ miptree
     *
-    * This is non-null only if HiZ is enabled for this miptree.
+    * The hiz miptree contains the miptree's hiz buffer. To allocate the hiz
+    * miptree, use intel_miptree_alloc_hiz().
     *
-    * \see intel_miptree_alloc_hiz()
+    * To determine if hiz is enabled, do not check this pointer. Instead, use
+    * intel_miptree_slice_has_hiz().
     */
    struct intel_mipmap_tree *hiz_mt;
 
@@ -532,6 +534,11 @@ intel_miptree_alloc_hiz(struct intel_context *intel,
                         struct intel_mipmap_tree *mt,
                         GLuint num_samples);
 
+bool
+intel_miptree_slice_has_hiz(struct intel_mipmap_tree *mt,
+                            uint32_t level,
+                            uint32_t layer);
+
 void
 intel_miptree_slice_set_needs_hiz_resolve(struct intel_mipmap_tree *mt,
                                           uint32_t level,
-- 
1.8.2.1
[Mesa-dev] Enable HiZ on Haswell
(I'm sending this out for Chad, whose email doesn't appear to be working.)

Enable hiz by setting intel_context::has_hiz. However, to work around a
hardware bug, we selectively enable hiz for only nicely aligned miptree slices.

No Piglit regressions on Haswell 0x0d26 rev07 when based atop
mesa-master-97e40a5.

Improves the performance of GLB27_TRex_C24Z16_FixedTimeStep by 18.52%
(hsw-0x0d26-rev07; kernel-3.9.0-rc1; GLBenchmark 2.7.0 Release a68901;
samples=3).
[Mesa-dev] [PATCH 1/8] i965/blorp: Align rectangle primitive for hiz ops
From: Chad Versace <chad.vers...@linux.intel.com>

The hardware docs and the simulator require that the rectangle primitive
emitted during fast depth clears and hiz resolves must be aligned to 8x4
pixels.

Signed-off-by: Chad Versace <chad.vers...@linux.intel.com>
---
 src/mesa/drivers/dri/i965/brw_blorp.cpp | 29 +
 1 file changed, 29 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp b/src/mesa/drivers/dri/i965/brw_blorp.cpp
index 5f72b5d..ed966c6 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
@@ -181,6 +181,35 @@ brw_hiz_op_params::brw_hiz_op_params(struct intel_mipmap_tree *mt,
    this->hiz_op = op;
 
    depth.set(mt, level, layer);
+
+   /* Align the rectangle primitive to 8x4 pixels.
+    *
+    * During fast depth clears, the emitted rectangle primitive must be
+    * aligned to 8x4 pixels. From the Ivybridge PRM, Vol 2 Part 1 Section
+    * 11.5.3.1 Depth Buffer Clear (and the matching section in the Sandybridge
+    * PRM):
+    *     If Number of Multisamples is NUMSAMPLES_1, the rectangle must be
+    *     aligned to an 8x4 pixel block relative to the upper left corner
+    *     of the depth buffer [...]
+    *
+    * For hiz resolves, the rectangle must also be 8x4 aligned. Item
+    * WaHizAmbiguate8x4Aligned from the Haswell workarounds page and the
+    * Ivybridge simulator require the alignment.
+    *
+    * To be safe, let's just align the rect for all hiz operations and all
+    * hardware generations.
+    *
+    * However, for some miptree slices of a Z24 texture, emitting an 8x4
+    * aligned rectangle that covers the slice may clobber adjacent slices if
+    * we strictly adhered to the texture alignments specified in the PRM. The
+    * Ivybridge PRM, Section Alignment Unit Size, states that
+    * SURFACE_STATE.Surface_Horizontal_Alignment should be 4 for Z24 surfaces,
+    * not 8. But commit 1f112cc increased the alignment from 4 to 8, which
+    * prevents the clobbering.
+    */
+   depth.width = ALIGN(depth.width, 8);
+   depth.height = ALIGN(depth.height, 4);
+
    x1 = depth.width;
    y1 = depth.height;
 
-- 
1.8.2.1
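The rounding that this patch applies to the rectangle can be sketched as follows. This is a minimal illustration, not the driver code: align_up is a hypothetical stand-in for Mesa's ALIGN macro and assumes a power-of-two alignment, and struct example_hiz_rect is invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ALIGN(): round value up to the next multiple
 * of alignment, which must be a power of two. */
static uint32_t
align_up(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Illustrative rectangle; the field names are assumptions. */
struct example_hiz_rect {
   uint32_t width;
   uint32_t height;
};

/* Grow a rectangle to the 8x4 pixel block size described above. */
static struct example_hiz_rect
example_align_hiz_rect(uint32_t width, uint32_t height)
{
   struct example_hiz_rect rect = { align_up(width, 8), align_up(height, 4) };
   return rect;
}
```

With these definitions a 13x7 slice would be covered by a 16x8 rectangle, while a slice that is already 16x8 is left unchanged.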
[Mesa-dev] [PATCH 3/8] intel: Mark needed initial hiz ambiguates only if hiz is enabled
From: Chad Versace <chad.vers...@linux.intel.com>

When allocating the hiz miptree, we mark each slice as needing a hiz
ambiguate. This patch updates the initial marking to mark only slices for
which hiz is enabled.

No behavioral change is introduced, because currently hiz is always enabled
for all slices. However, that will change in Haswell.

Signed-off-by: Chad Versace <chad.vers...@linux.intel.com>
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index fdb6504..7ffe042 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -1010,6 +1010,9 @@ intel_miptree_alloc_hiz(struct intel_context *intel,
    struct intel_resolve_map *head = &mt->hiz_map;
    for (int level = mt->first_level; level <= mt->last_level; ++level) {
       for (int layer = 0; layer < mt->level[level].depth; ++layer) {
+         if (!intel_miptree_slice_has_hiz(mt, level, layer))
+            continue;
+
          head->next = malloc(sizeof(*head->next));
          head->next->prev = head;
          head->next->next = NULL;
-- 
1.8.2.1
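The filtered marking loop can be sketched outside the driver like this. It is only an illustration under stated assumptions: the node type, the predicate type and the "even layers only" sample predicate are all hypothetical, and a singly linked list replaces the driver's resolve map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative list node; not the driver's intel_resolve_map. */
struct example_resolve_node {
   uint32_t level;
   uint32_t layer;
   struct example_resolve_node *next;
};

/* Hypothetical per-slice predicate, standing in for a has-hiz query. */
typedef bool (*example_slice_filter)(uint32_t level, uint32_t layer);

/* Sample predicate for the sketch: pretend only even layers have hiz. */
static bool
example_even_layers_have_hiz(uint32_t level, uint32_t layer)
{
   (void) level;
   return (layer % 2) == 0;
}

/* Build a list with one node per slice accepted by the filter. */
static struct example_resolve_node *
example_build_resolve_list(uint32_t num_levels, uint32_t layers_per_level,
                           example_slice_filter has_hiz)
{
   struct example_resolve_node head = { 0, 0, NULL };
   struct example_resolve_node *tail = &head;

   assert(has_hiz != NULL);

   for (uint32_t level = 0; level < num_levels; ++level) {
      for (uint32_t layer = 0; layer < layers_per_level; ++layer) {
         if (!has_hiz(level, layer))
            continue; /* slices without hiz get no entry */

         struct example_resolve_node *node = malloc(sizeof(*node));
         if (node == NULL)
            return head.next; /* out of memory: return what was built */

         node->level = level;
         node->layer = layer;
         node->next = NULL;
         tail->next = node;
         tail = node;
      }
   }
   return head.next;
}

static unsigned
example_count_nodes(const struct example_resolve_node *node)
{
   unsigned count = 0;
   for (; node != NULL; node = node->next)
      ++count;
   return count;
}

static void
example_free_resolve_list(struct example_resolve_node *node)
{
   while (node != NULL) {
      struct example_resolve_node *next = node->next;
      free(node);
      node = next;
   }
}
```

Passing a predicate that accepts every slice reproduces the pre-patch behaviour, which matches the commit message's claim that nothing changes until hiz is disabled for some slices.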
[Mesa-dev] [PATCH 4/8] i965/blorp: Add fields brw_blorp_mip_info::level, layer
From: Chad Versace <chad.vers...@linux.intel.com>

The new fields define the 2D miptree slice to be used. A following patch
will pass the new fields through to intel_miptree_slice_has_hiz().

Signed-off-by: Chad Versace <chad.vers...@linux.intel.com>
---
 src/mesa/drivers/dri/i965/brw_blorp.cpp |  4
 src/mesa/drivers/dri/i965/brw_blorp.h   | 11 +++
 2 files changed, 15 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp b/src/mesa/drivers/dri/i965/brw_blorp.cpp
index ed966c6..9c6fe49 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
@@ -30,6 +30,8 @@
 
 brw_blorp_mip_info::brw_blorp_mip_info()
    : mt(NULL),
+     level(0),
+     layer(0),
      width(0),
      height(0),
      x_offset(0),
@@ -50,6 +52,8 @@ brw_blorp_mip_info::set(struct intel_mipmap_tree *mt,
    intel_miptree_check_level_layer(mt, level, layer);
 
    this->mt = mt;
+   this->level = level;
+   this->layer = layer;
    this->width = mt->level[level].width;
    this->height = mt->level[level].height;
 
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h b/src/mesa/drivers/dri/i965/brw_blorp.h
index 79a3f3a..1b95818 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.h
+++ b/src/mesa/drivers/dri/i965/brw_blorp.h
@@ -69,6 +69,17 @@ public:
    struct intel_mipmap_tree *mt;
 
    /**
+    * The miplevel to use.
+    */
+   uint32_t level;
+
+   /**
+    * The 2D layer within the miplevel. Combined, level and layer define the
+    * 2D miptree slice to use.
+    */
+   uint32_t layer;
+
+   /**
     * Width of the miplevel to be used. For surfaces using
     * INTEL_MSAA_LAYOUT_IMS, this is measured in samples, not pixels.
     */
-- 
1.8.2.1
[Mesa-dev] [PATCH 5/8] i965: Change signature of brw_get_depthstencil_tile_masks()
From: Chad Versace <chad.vers...@linux.intel.com>

Add new parameters `depth_level` and `depth_layer`, which specify depth
miptree's slice of interest. A following patch will pass the new
parameters through to intel_miptree_slice_has_hiz().

Signed-off-by: Chad Versace <chad.vers...@linux.intel.com>
---
 src/mesa/drivers/dri/i965/brw_context.h    | 2 ++
 src/mesa/drivers/dri/i965/brw_misc_state.c | 7 ++-
 src/mesa/drivers/dri/i965/gen6_blorp.cpp   | 5 -
 src/mesa/drivers/dri/i965/gen7_blorp.cpp   | 5 -
 4 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index afcba46..541288d 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1127,6 +1127,8 @@ bool brwCreateContext(int api,
  * brw_misc_state.c
  */
 void brw_get_depthstencil_tile_masks(struct intel_mipmap_tree *depth_mt,
+                                     uint32_t depth_level,
+                                     uint32_t depth_layer,
                                      struct intel_mipmap_tree *stencil_mt,
                                      uint32_t *out_tile_mask_x,
                                      uint32_t *out_tile_mask_y);
diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 25672eb..1878038 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -271,6 +271,8 @@ brw_depthbuffer_format(struct brw_context *brw)
  */
 void
 brw_get_depthstencil_tile_masks(struct intel_mipmap_tree *depth_mt,
+                                uint32_t depth_level,
+                                uint32_t depth_layer,
                                 struct intel_mipmap_tree *stencil_mt,
                                 uint32_t *out_tile_mask_x,
                                 uint32_t *out_tile_mask_y)
@@ -367,7 +369,10 @@ brw_workaround_depthstencil_alignment(struct brw_context *brw,
    }
 
    uint32_t tile_mask_x, tile_mask_y;
-   brw_get_depthstencil_tile_masks(depth_mt, stencil_mt,
+   brw_get_depthstencil_tile_masks(depth_mt,
+                                   depth_mt ? depth_irb->mt_level : 0,
+                                   depth_mt ? depth_irb->mt_layer : 0,
+                                   stencil_mt,
                                    &tile_mask_x, &tile_mask_y);
 
    if (depth_irb) {
diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.cpp b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
index e6b8485..b6fbd44 100644
--- a/src/mesa/drivers/dri/i965/gen6_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
@@ -829,7 +829,10 @@ gen6_blorp_emit_depth_stencil_config(struct brw_context *brw,
    uint32_t draw_y = params->depth.y_offset;
    uint32_t tile_mask_x, tile_mask_y;
 
-   brw_get_depthstencil_tile_masks(params->depth.mt, NULL,
+   brw_get_depthstencil_tile_masks(params->depth.mt,
+                                   params->depth.level,
+                                   params->depth.layer,
+                                   NULL,
                                    &tile_mask_x, &tile_mask_y);
 
    /* 3DSTATE_DEPTH_BUFFER */
diff --git a/src/mesa/drivers/dri/i965/gen7_blorp.cpp b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
index bfd2cbd..423dd3c 100644
--- a/src/mesa/drivers/dri/i965/gen7_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen7_blorp.cpp
@@ -585,7 +585,10 @@ gen7_blorp_emit_depth_stencil_config(struct brw_context *brw,
    uint32_t tile_mask_x, tile_mask_y;
 
    if (params->depth.mt) {
-      brw_get_depthstencil_tile_masks(params->depth.mt, NULL,
+      brw_get_depthstencil_tile_masks(params->depth.mt,
+                                      params->depth.level,
+                                      params->depth.layer,
+                                      NULL,
                                       &tile_mask_x, &tile_mask_y);
    }
 
-- 
1.8.2.1
[Mesa-dev] [PATCH 6/8] intel: Replace checks for hiz_mt with intel_has*hiz()
From: Chad Versace <chad.vers...@linux.intel.com>

When appropriate, replace each check `hiz_mt != NULL` with either a call to
intel_miptree_slice_has_hiz() or intel_renderbuffer_has_hiz(). No
behavioral change.

This prepares for selectively enabling hiz on individual miptree slices for
Haswell.

This refactoring had several side effects.

  1. To prevent new warnings about discarding the const qualifier, I
     removed 'const' from some variable declarations in
     intel_validate_framebuffer(). The alternative was to add const
     qualifiers to multiple function signatures in the
     intel_renderbuffer_has_hiz call graph. Since the dominant convention
     in the Intel code is to not qualify function parameters as const, I
     chose to remove rather than add const qualifiers.

  2. I changed the signature of brw_emit_depth_stencil_hiz() by replacing
     `struct intel_mipmap_tree *hiz_mt` with `bool hiz`. The function used
     hiz_mt mostly as a boolean indicator of the presence of hiz, so the
     signature change is consistent with the patch's goal.

Signed-off-by: Chad Versace <chad.vers...@linux.intel.com>
---
 src/mesa/drivers/dri/i965/brw_blorp.cpp        |  2 +-
 src/mesa/drivers/dri/i965/brw_clear.c          |  2 +-
 src/mesa/drivers/dri/i965/brw_context.h        | 12 +-
 src/mesa/drivers/dri/i965/brw_misc_state.c     | 32 +-
 src/mesa/drivers/dri/i965/gen7_misc_state.c    | 11 +
 src/mesa/drivers/dri/intel/intel_context.h     |  3 +--
 src/mesa/drivers/dri/intel/intel_fbo.c         |  6 ++---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c |  8 ++-
 8 files changed, 36 insertions(+), 40 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp b/src/mesa/drivers/dri/i965/brw_blorp.cpp
index 9c6fe49..c60f4f1 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
@@ -217,7 +217,7 @@ brw_hiz_op_params::brw_hiz_op_params(struct intel_mipmap_tree *mt,
    x1 = depth.width;
    y1 = depth.height;
 
-   assert(mt->hiz_mt != NULL);
+   assert(intel_miptree_slice_has_hiz(mt, level, layer));
 
    switch (mt->format) {
    case MESA_FORMAT_Z16:       depth_format = BRW_DEPTHFORMAT_D16_UNORM; break;
diff --git a/src/mesa/drivers/dri/i965/brw_clear.c b/src/mesa/drivers/dri/i965/brw_clear.c
index e740f65..c0ac69d 100644
--- a/src/mesa/drivers/dri/i965/brw_clear.c
+++ b/src/mesa/drivers/dri/i965/brw_clear.c
@@ -114,7 +114,7 @@ brw_fast_clear_depth(struct gl_context *ctx)
    if (intel->gen < 6)
       return false;
 
-   if (!mt->hiz_mt)
+   if (!intel_renderbuffer_has_hiz(depth_irb))
       return false;
 
    /* We only handle full buffer clears -- otherwise you'd have to track whether
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 541288d..9158b90 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1336,9 +1336,9 @@ brw_emit_depth_stencil_hiz(struct brw_context *brw,
                            uint32_t depth_offset, uint32_t depthbuffer_format,
                            uint32_t depth_surface_type,
                            struct intel_mipmap_tree *stencil_mt,
-                           struct intel_mipmap_tree *hiz_mt,
-                           bool separate_stencil, uint32_t width,
-                           uint32_t height, uint32_t tile_x, uint32_t tile_y);
+                           bool hiz, bool separate_stencil,
+                           uint32_t width, uint32_t height,
+                           uint32_t tile_x, uint32_t tile_y);
 
 void
 gen7_emit_depth_stencil_hiz(struct brw_context *brw,
@@ -1346,9 +1346,9 @@ gen7_emit_depth_stencil_hiz(struct brw_context *brw,
                             uint32_t depth_offset, uint32_t depthbuffer_format,
                             uint32_t depth_surface_type,
                             struct intel_mipmap_tree *stencil_mt,
-                            struct intel_mipmap_tree *hiz_mt,
-                            bool separate_stencil, uint32_t width,
-                            uint32_t height, uint32_t tile_x, uint32_t tile_y);
+                            bool hiz, bool separate_stencil,
+                            uint32_t width, uint32_t height,
+                            uint32_t tile_x, uint32_t tile_y);
 
 #ifdef __cplusplus
 }
diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 1878038..9324069 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -283,10 +283,9 @@ brw_get_depthstencil_tile_masks(struct intel_mipmap_tree *depth_mt,
       intel_region_get_tile_masks(depth_mt->region,
                                   &tile_mask_x, &tile_mask_y, false);
 
-      struct intel_mipmap_tree *hiz_mt = depth_mt->hiz_mt;
-      if (hiz_mt) {
+      if (intel_miptree_slice_has_hiz(depth_mt, depth_level, depth_layer)) {
[Mesa-dev] [PATCH 7/8] i965: Remove brw_context::depthstencil::hiz_mt
From: Chad Versace <chad.vers...@linux.intel.com>

After recent refactorings, the field is written but no longer read.

Signed-off-by: Chad Versace <chad.vers...@linux.intel.com>
---
 src/mesa/drivers/dri/i965/brw_context.h    | 1 -
 src/mesa/drivers/dri/i965/brw_misc_state.c | 2 --
 2 files changed, 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 9158b90..ab3ae1c 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1083,7 +1083,6 @@ struct brw_context
    struct {
       struct intel_mipmap_tree *depth_mt;
       struct intel_mipmap_tree *stencil_mt;
-      struct intel_mipmap_tree *hiz_mt;
 
       /* Inter-tile (page-aligned) byte offsets. */
       uint32_t depth_offset, hiz_offset, stencil_offset;
diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 9324069..db6bc2d 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -530,7 +530,6 @@ brw_workaround_depthstencil_alignment(struct brw_context *brw,
    brw->depthstencil.hiz_offset = 0;
    brw->depthstencil.depth_mt = NULL;
    brw->depthstencil.stencil_mt = NULL;
-   brw->depthstencil.hiz_mt = NULL;
    if (depth_irb) {
       depth_mt = depth_irb->mt;
       brw->depthstencil.depth_mt = depth_mt;
@@ -540,7 +539,6 @@ brw_workaround_depthstencil_alignment(struct brw_context *brw,
                                          depth_irb->draw_y & ~tile_mask_y,
                                          false);
       if (intel_renderbuffer_has_hiz(depth_irb)) {
-         brw->depthstencil.hiz_mt = depth_mt->hiz_mt;
          brw->depthstencil.hiz_offset =
             intel_region_get_aligned_offset(depth_mt->region,
                                             depth_irb->draw_x & ~tile_mask_x,
-- 
1.8.2.1
[Mesa-dev] [PATCH 8/8] intel/hsw: Enable hiz
From: Chad Versace <chad.vers...@linux.intel.com>

Enable hiz by setting intel_context::has_hiz. However, to work around a
hardware bug, we selectively enable hiz for only nicely aligned miptree slices.

No Piglit regressions on Haswell 0x0d26 rev07 when based atop
mesa-master-97e40a5.

Improves the performance of GLB27_TRex_C24Z16_FixedTimeStep by 18.52%
(hsw-0x0d26-rev07; kernel-3.9.0-rc1; GLBenchmark 2.7.0 Release a68901;
samples=3).

Signed-off-by: Chad Versace <chad.vers...@linux.intel.com>
---
 src/mesa/drivers/dri/intel/intel_context.c     |  2 +-
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 34 +-
 2 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_context.c b/src/mesa/drivers/dri/intel/intel_context.c
index 990fbea..797a4c8 100644
--- a/src/mesa/drivers/dri/intel/intel_context.c
+++ b/src/mesa/drivers/dri/intel/intel_context.c
@@ -694,7 +694,7 @@ intelInitContext(struct intel_context *intel,
 
    intel->has_separate_stencil = intel->intelScreen->hw_has_separate_stencil;
    intel->must_use_separate_stencil = intel->intelScreen->hw_must_use_separate_stencil;
-   intel->has_hiz = intel->gen >= 6 && !intel->is_haswell;
+   intel->has_hiz = intel->gen >= 6;
    intel->has_llc = intel->intelScreen->hw_has_llc;
    intel->has_swizzling = intel->intelScreen->hw_has_swizzling;
 
diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 6c27bab..654f0be 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -29,6 +29,7 @@
 #include <GL/internal/dri_interface.h>
 
 #include "intel_batchbuffer.h"
+#include "intel_chipset.h"
 #include "intel_context.h"
 #include "intel_mipmap_tree.h"
 #include "intel_regions.h"
@@ -1036,7 +1037,38 @@ intel_miptree_slice_has_hiz(struct intel_mipmap_tree *mt,
                             uint32_t layer)
 {
    intel_miptree_check_level_layer(mt, level, layer);
-   return mt->hiz_mt != NULL;
+
+   if (!mt->hiz_mt)
+      return false;
+
+   int devid = drm_intel_bufmgr_gem_get_devid(mt->region->bo->bufmgr);
+   if (IS_HASWELL(devid)) {
+      /* Disable HiZ for some slices to work around a hardware bug.
+       *
+       * Haswell hardware fails to respect
+       * 3DSTATE_DEPTH_BUFFER.Depth_Coordinate_Offset_X/Y when during HiZ
+       * ambiguate operations. The failure is inconsistent and affected by
+       * other GPU contexts. Running a heavy GPU workload in a separate
+       * process causes the failure rate to drop to nearly 0.
+       *
+       * To workaround the bug, we enable HiZ only when we can guarantee that
+       * the Depth Coordinate Offset fields will be set to 0. The function
+       * brw_get_depthstencil_tile_masks() is used to calculate the fields,
+       * and the function is sometimes called in such a way that the presence
+       * of an attached stencil buffer changes the fuction's return value.
+       *
+       * The largest tile size considered by brw_get_depthstencil_tile_masks()
+       * is that of the stencil buffer. Therefore, if this hiz slice's
+       * corresponding depth slice has an offset that is aligned to the
+       * stencil buffer tile size, 64x64 pixels, then
+       * 3DSTATE_DEPTH_BUFFER.Depth_Coordinate_Offset_X/Y is set to 0.
+       */
+      uint32_t depth_x_offset = mt->level[level].slice[layer].x_offset;
+      uint32_t depth_y_offset = mt->level[level].slice[layer].y_offset;
+      return (depth_x_offset & 63) == 0 && (depth_y_offset & 63) == 0;
+   }
+
+   return true;
 }
 
 void
-- 
1.8.2.1
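The alignment test at the end of the workaround can be isolated into a few lines. This is a sketch, not the driver code: both function names are hypothetical, and the 64-pixel tile size is taken from the comment in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if offset is a multiple of tile_size (a power of two). */
static bool
example_is_aligned(uint32_t offset, uint32_t tile_size)
{
   assert(tile_size != 0 && (tile_size & (tile_size - 1)) == 0);
   return (offset & (tile_size - 1)) == 0;
}

/* A slice qualifies only if both offsets sit on a 64x64 tile boundary,
 * which is the condition under which the coordinate offsets stay 0. */
static bool
example_slice_offset_allows_hiz(uint32_t x_offset, uint32_t y_offset)
{
   const uint32_t tile = 64;
   return example_is_aligned(x_offset, tile) &&
          example_is_aligned(y_offset, tile);
}
```

Masking with `tile_size - 1` is equivalent to the `& 63` in the patch when the tile size is 64; a slice at offset (64, 128) would qualify, while one at (32, 64) would not.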
Re: [Mesa-dev] [PATCH 1/2] i965: NULL check prog on compilation failure.
Matt Turner <matts...@gmail.com> writes:

> I believe that prog can only be NULL for ARB programs. Neither
> brw_fs_fp.cpp nor brw_vec4_vp.cpp call fail(), but not NULL checking
> prog is obviously fragile.

(shader != NULL) => (prog != NULL), so if you want consistency I'd
rather see the if (shader) changed to if (prog). A bunch of these
changes are not about compilation failure, anyway.
Re: [Mesa-dev] [PATCH 1/8] i965/blorp: Align rectangle primitive for hiz ops
Kenneth Graunke <kenn...@whitecape.org> writes:

> From: Chad Versace <chad.vers...@linux.intel.com>
>
> The hardware docs and the simulator require that the rectangle primitive
> emitted during fast depth clears and hiz resolves must be aligned to 8x4
> pixels.
>
> Signed-off-by: Chad Versace <chad.vers...@linux.intel.com>
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.cpp | 29 +
>  1 file changed, 29 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> index 5f72b5d..ed966c6 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> @@ -181,6 +181,35 @@ brw_hiz_op_params::brw_hiz_op_params(struct intel_mipmap_tree *mt,
>     this->hiz_op = op;
>
>     depth.set(mt, level, layer);
> +
> +   /* Align the rectangle primitive to 8x4 pixels.
                                      ^ and surface state.

Same content as my patch, so:

Reviewed-by: Eric Anholt <e...@anholt.net>
Re: [Mesa-dev] [PATCH 2/8] intel: Add functions for checking if objs have hiz enabled
Kenneth Graunke <kenn...@whitecape.org> writes:

> From: Chad Versace <chad.vers...@linux.intel.com>
>
> On Haswell, HiZ will selectively be enabled on individual miptree slices to
> workaround a hardware bug. The two new functions below will permit us to
> detect if hiz is enabled for a particular slice.
>
>     intel_miptree_slice_has_hiz
>     intel_renderbuffer_has_hiz
>
> The functions are not yet used.
>
> Signed-off-by: Chad Versace <chad.vers...@linux.intel.com>
> ---
>  src/mesa/drivers/dri/intel/intel_fbo.c         | 10 ++
>  src/mesa/drivers/dri/intel/intel_fbo.h         |  3 +++
>  src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 12
>  src/mesa/drivers/dri/intel/intel_mipmap_tree.h | 11 +--
>  4 files changed, 34 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c b/src/mesa/drivers/dri/intel/intel_fbo.c
> index 2977568..0e2ded5 100644
> --- a/src/mesa/drivers/dri/intel/intel_fbo.c
> +++ b/src/mesa/drivers/dri/intel/intel_fbo.c
> @@ -943,6 +943,16 @@ intel_renderbuffer_set_needs_downsample(struct intel_renderbuffer *irb)
>        irb->mt->need_downsample = true;
>  }
>
> +/**
> + * Does the renderbuffer have hiz enabled?
> + */
> +bool
> +intel_renderbuffer_has_hiz(struct intel_renderbuffer *irb)
> +{
> +   return irb->mt
> +       && intel_miptree_slice_has_hiz(irb->mt, irb->mt_level, irb->mt_layer);
> +}

irb->mt should always be non-null -- a renderbuffer without that should
never exist.
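One way to act on this review comment is to state the invariant with an assertion instead of folding a NULL test into the return value. The sketch below only illustrates that design choice; the example_* types and functions are hypothetical, and whether the real function was changed this way is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-ins for the driver structures. */
struct example_miptree {
   bool has_hiz_buffer;
};

struct example_renderbuffer {
   struct example_miptree *mt;
   uint32_t mt_level;
   uint32_t mt_layer;
};

static bool
example_miptree_slice_has_hiz(const struct example_miptree *mt,
                              uint32_t level, uint32_t layer)
{
   (void) level;
   (void) layer;
   return mt->has_hiz_buffer;
}

/* The invariant "a renderbuffer always has a miptree" is asserted rather
 * than silently tolerated, so a violation shows up in debug builds. */
static bool
example_renderbuffer_has_hiz(const struct example_renderbuffer *irb)
{
   assert(irb != NULL && irb->mt != NULL);
   return example_miptree_slice_has_hiz(irb->mt, irb->mt_level,
                                        irb->mt_layer);
}
```

The trade-off is that a release build no longer protects against a NULL miptree, which is acceptable only if the invariant really holds everywhere.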
Re: [Mesa-dev] [PATCH 1/2] i965: NULL check prog on compilation failure.
On Tue, Apr 9, 2013 at 3:55 PM, Eric Anholt <e...@anholt.net> wrote:
> Matt Turner <matts...@gmail.com> writes:
>
>> I believe that prog can only be NULL for ARB programs. Neither
>> brw_fs_fp.cpp nor brw_vec4_vp.cpp call fail(), but not NULL checking
>> prog is obviously fragile.
>
> (shader != NULL) => (prog != NULL), so if you want consistency I'd
> rather see the if (shader) changed to if (prog).

Ah, that's better.

> A bunch of these changes are not about compilation failure, anyway.

The changes to brw_vs.c and brw_wm.c? They actually are --
brw_vs_debug_recompile and brw_wm_debug_recompile are called below the
modified hunks in brw_vec4.cpp and brw_fs.cpp. But they're called from
inside if (shader) tests, so never mind.

So, this patch is crap.
Re: [Mesa-dev] [PATCH 06/12] glsl: Add lowering pass for ir_triop_vector_insert
On 04/09/2013 10:29 AM, Eric Anholt wrote:
> Ian Romanick <i...@freedesktop.org> writes:
>
>> From: Ian Romanick <ian.d.roman...@intel.com>
>>
>> This will eventually replace do_vec_index_to_cond_assign. This lowering
>> pass is called in all the places where do_vec_index_to_cond_assign or
>> do_vec_index_to_swizzle is called.
>
>> +
>> +      ir_constant *const cmp_indices =
>> +         new(factory.mem_ctx) ir_constant(glsl_type::get_instance(GLSL_TYPE_INT,
>> +                                                                  components,
>> +                                                                  1),
>> +                                          &cmp_indices_data);
>> +
>> +      ir_variable *const cmp_result =
>> +         factory.make_temp(glsl_type::get_instance(GLSL_TYPE_BOOL,
>> +                                                   components,
>> +                                                   1),
>> +                           "index_condition");
>
> I wish we had some helpers like glsl_type::bvec(components) instead of
> these verbose getters for sized vectors.
>
>> +      ir_variable *const src_temp =
>> +         factory.make_temp(expr->operands[1]->type, "src_temp");
>> +
>> +      const ir_swizzle_mask m = { 0, 0, 0, 0, components, false };
>> +      ir_rvalue *broadcast_index =
>> +         new(factory.mem_ctx) ir_swizzle(expr->operands[2], m);
>
> Optionally:
>
> ir_rvalue *broadcast_index = swizzle_for_size(swizzle_(expr->operands[2]), components).

Hm... I think I tried that (or some variation there of), but it didn't
work. I'll try again.

>> +      factory.emit(assign(temp, expr->operands[0]));
>> +      factory.emit(assign(src_temp, expr->operands[1]));
>> +      factory.emit(assign(cmp_result, equal(broadcast_index, cmp_indices)));
>> +      factory.emit(if_tree(swizzle_x(cmp_result), assign(temp, src_temp, 1)));
>> +      factory.emit(if_tree(swizzle_y(cmp_result), assign(temp, src_temp, 2)));
>> +
>> +      if (components > 2)
>> +         factory.emit(if_tree(swizzle_z(cmp_result), assign(temp, src_temp, 4)));
>> +
>> +      if (components > 3)
>> +         factory.emit(if_tree(swizzle_w(cmp_result), assign(temp, src_temp, 8)));
>
> Please use the WRITEMASK_* defines here.

Okay.

> I know our hardware doesn't like the swizzling of that bvec compare
> result and we'd rather just see individual compares as the condition of
> each if statement. (we basically have to emit a compare of the swizzled
> bool against 0, after masking its high bits off, when we could have just
> compared the index value to a constant). I imagine other hardware would
> prefer the same.

Should we add a lowering pass or an option of some sort?
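The effect of the lowering discussed above, replacing a write through a variable vector index with per-component compares and masked assignments, can be modelled on plain floats. This is a behavioural sketch under stated assumptions, not the GLSL IR pass: all example_* names are hypothetical, and the EXAMPLE_WRITEMASK_* constants simply name the literals 1, 2, 4 and 8 used in the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Named write masks for the literals 1, 2, 4 and 8 in the quoted code. */
enum {
   EXAMPLE_WRITEMASK_X = 1 << 0,
   EXAMPLE_WRITEMASK_Y = 1 << 1,
   EXAMPLE_WRITEMASK_Z = 1 << 2,
   EXAMPLE_WRITEMASK_W = 1 << 3
};

/* Copy src into every component of dst selected by writemask. */
static void
example_masked_assign(float dst[4], float src, unsigned writemask)
{
   for (unsigned c = 0; c < 4; ++c) {
      if (writemask & (1u << c))
         dst[c] = src;
   }
}

/* Models vec[index] = scalar without a variable index on the write. */
static void
example_vector_insert(float vec[4], unsigned components, int index,
                      float scalar)
{
   static const int cmp_indices[4] = { 0, 1, 2, 3 };
   bool cmp_result[4] = { false, false, false, false };

   assert(components >= 2 && components <= 4);

   /* Compare the broadcast index against each constant component index. */
   for (unsigned c = 0; c < components; ++c)
      cmp_result[c] = (index == cmp_indices[c]);

   /* One conditional, single-component write per channel. */
   if (cmp_result[0])
      example_masked_assign(vec, scalar, EXAMPLE_WRITEMASK_X);
   if (cmp_result[1])
      example_masked_assign(vec, scalar, EXAMPLE_WRITEMASK_Y);
   if (components > 2 && cmp_result[2])
      example_masked_assign(vec, scalar, EXAMPLE_WRITEMASK_Z);
   if (components > 3 && cmp_result[3])
      example_masked_assign(vec, scalar, EXAMPLE_WRITEMASK_W);
}
```

In this model an out-of-range index matches no component, so the vector is left untouched. The alternative raised in the review, comparing the index against a constant directly in each condition, would drop the intermediate cmp_result array.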
Re: [Mesa-dev] [PATCH 08/12] glsl: Generate ir_binop_vector_extract for indexing of vectors
On 04/09/2013 10:38 AM, Eric Anholt wrote:
> Ian Romanick <i...@freedesktop.org> writes:
>
>> From: Ian Romanick <ian.d.roman...@intel.com>
>>
>> Now ir_dereference_array of a vector will never occur in the RHS of an
>> expression.
>>
>> Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
>> ---
>>  src/glsl/ast_array_index.cpp | 23 +--
>>  1 file changed, 17 insertions(+), 6 deletions(-)
>>
>> diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
>> index 862f64c..e7bc299 100644
>> --- a/src/glsl/ast_array_index.cpp
>> +++ b/src/glsl/ast_array_index.cpp
>> @@ -31,17 +31,13 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
>>                               ir_rvalue *array, ir_rvalue *idx,
>>                               YYLTYPE &loc, YYLTYPE &idx_loc)
>>  {
>> -   ir_rvalue *result = new(mem_ctx) ir_dereference_array(array, idx);
>> -
>>     if (!array->type->is_error()
>>         && !array->type->is_array()
>>         && !array->type->is_matrix()
>> -       && !array->type->is_vector()) {
>> +       && !array->type->is_vector())
>>        _mesa_glsl_error(& idx_loc, state,
>>                         "cannot dereference non-array / non-matrix / non-vector");
>> -      result->type = glsl_type::error_type;
>> -   }
>
> Style-wise, I'd prefer those curly braces stay -- there's very little
> distinguishing the if condition from the body, already.

That's fair. I prefer that in general, but I thought the coding
convention was to not include unnecessary { }... maybe that's just
Kristian's convention.
Re: [Mesa-dev] [PATCH 8/8] intel/hsw: Enable hiz
Kenneth Graunke kenn...@whitecape.org writes:

From: Chad Versace chad.vers...@linux.intel.com

Enable hiz by setting intel_context::has_hiz. However, to work around a hardware bug, we selectively enable hiz for only nicely aligned miptree slices.

No Piglit regressions on Haswell 0x0d26 rev07 when based atop mesa-master-97e40a5.

Improves the performance of GLB27_TRex_C24Z16_FixedTimeStep by 18.52% (hsw-0x0d26-rev07; kernel-3.9.0-rc1; GLBenchmark 2.7.0 Release a68901; samples=3).

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index 6c27bab..654f0be 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -29,6 +29,7 @@
 #include <GL/internal/dri_interface.h>
 
 #include "intel_batchbuffer.h"
+#include "intel_chipset.h"
 #include "intel_context.h"
 #include "intel_mipmap_tree.h"
 #include "intel_regions.h"
@@ -1036,7 +1037,38 @@ intel_miptree_slice_has_hiz(struct intel_mipmap_tree *mt,
                             uint32_t layer)
 {
    intel_miptree_check_level_layer(mt, level, layer);
-   return mt->hiz_mt != NULL;
+
+   if (!mt->hiz_mt)
+      return false;
+
+   int devid = drm_intel_bufmgr_gem_get_devid(mt->region->bo->bufmgr);
+   if (IS_HASWELL(devid)) {

This conditional is checking for about 37 different PCI IDs, and shouldn't appear in a hot path. Please use intel->is_haswell, by passing the intel_context into this function.

+      /* Disable HiZ for some slices to work around a hardware bug.
+       *
+       * Haswell hardware fails to respect
+       * 3DSTATE_DEPTH_BUFFER.Depth_Coordinate_Offset_X/Y when during HiZ
                                                           ^ s/when //

Other than the comments I've sent, the series is:

Reviewed-by: Eric Anholt e...@anholt.net
Re: [Mesa-dev] [PATCH 1/2] mesa: fix glGet queries depending on derived framebuffer state (v2)
Marek Olšák mar...@gmail.com writes:

ctx->DrawBuffer->Visual might be invalid if (NewState & _NEW_BUFFERS) != 0.

v2: also fix:
- RGBA_INTEGER_MODE_EXT
- RGBA_FLOAT_MODE_ARB (also check API support)
- FRAMEBUFFER_SRGB_CAPABLE_EXT

NOTE: This is a candidate for stable branches.

Reviewed-by: Eric Anholt e...@anholt.net
Re: [Mesa-dev] [PATCH 00/12] Death to array dereferences of vectors!
On 04/09/2013 10:49 AM, Eric Anholt wrote:
Ian Romanick i...@freedesktop.org writes:

This series gradually replaces array dereferences of vectors with two expressions. It takes so many patches because changes are needed to the existing lowering passes and because several places in the code generate array dereferences of vectors (e.g., lowering accesses to gl_ClipDistance). There is also some challenge in dealing with function inout parameters that are indexed vectors.

The two new expressions are ir_binop_vector_extract and ir_triop_vector_insert. The former has a vector operand and a scalar operand. The result is the scalar value from the vector specified by the scalar. The latter takes a vector and two scalars. The result is a new vector with one indexed field replaced by a scalar value.

Together this series fixes piglit tests glsl-vs-channel-overwrite-01 and glsl-vs-channel-overwrite-03.

Throughout the series, there's a bunch of introduction of new tabs for indentation. Paul pointed out long ago that the devinfo.html had specified a no-tabs indent style in the indent command since 2006, and I found that basically you and I were the only ones putting tabs in, so I stopped. I've found reading diffs has become easier since avoiding tabs, since you don't get diffs with apparently-incorrect indentation (thanks to + being 3 characters, in particular). I'd love to see this code fixed to not use tabs. If you use emacs, removing your custom configuration for Mesa and relying on .dir-locals.el will get you the preferred style for future work.

Other than that, the patches other than the ones I commented on and the gl_ClipDistance ones are:

Reviewed-by: Eric Anholt e...@anholt.net

I've pushed a branch with updated patches to http://cgit.freedesktop.org/~idr/mesa/log/?h=vector

Let me know if the changes to patches 6 and 8 are sufficient.
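As a rough illustration of the value semantics described in the cover letter, the following plain C sketch models the two expressions on a four-component float vector. The type and function names are hypothetical and are not the IR classes from the series; the operand order (vector, scalar value, index) follows the description above, under the assumption that the second scalar is the index.

```c
#include <assert.h>

/* Hypothetical value type standing in for a GLSL vec4. */
struct sketch_vec4 {
   float v[4];
};

/* Model of a vector extract: one vector operand and one scalar index,
 * producing the scalar stored at that index.
 */
static float
sketch_extract(struct sketch_vec4 vec, unsigned index)
{
   assert(index < 4);
   return vec.v[index];
}

/* Model of a vector insert: a vector and two scalars (new value, index),
 * producing a NEW vector. The argument is passed by value, so the caller's
 * vector is left unmodified.
 */
static struct sketch_vec4
sketch_insert(struct sketch_vec4 vec, float scalar, unsigned index)
{
   assert(index < 4);
   vec.v[index] = scalar;
   return vec;
}
```

Because the insert yields a new value instead of writing through a dereference, an indexed store such as `v[i] = x` can be expressed as an ordinary whole-vector assignment of the insert result.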
Re: [Mesa-dev] [PATCH] r600g: Fix UMAD on Cayman
Hi Vadim,

your patch does not fix the test.

Marek

On Tue, Apr 9, 2013 at 11:30 PM, Vadim Girlin vadimgir...@gmail.com wrote:
On 04/09/2013 10:58 AM, Martin Andersson wrote:
On Tue, Apr 9, 2013 at 3:18 AM, Marek Olšák mar...@gmail.com wrote:

Pushed, thanks. The transform feedback test still doesn't pass, but at least the hardlocks are gone.

Thanks, I have looked into the other issue as well http://lists.freedesktop.org/archives/mesa-dev/2013-March/036941.html

The problem arises when there are nested loops. If I rework the code so there are no nested loops the issue disappears. At least one pixel also needs to enter the outer loop. The pixels that should enter the outer loop behave correctly. It is the pixels that should not enter the outer loop that misbehave. It does not matter if they also fail the test for the inner loop, they will still execute the instruction inside. That leads to the strange results for that test.

Please test the attached patch.

Vadim

The strangeness is easier to see if the NUM_POINTS in the ext_transform_feedback/order.c are run with smaller values, like 3, 6 and 9. Disable the code that fails the test and print starting_x, shift_reg_final and iteration_count.

Marek, since you implemented transform feedback for r600, do you think the issue is with the transform feedback code or the shader compiler or some other thing?

//Martin
[Mesa-dev] [PATCH 1/4] glsl: Fix hypothetical NULL dereference in ast_process_structure_or_interface_block
From: Ian Romanick ian.d.roman...@intel.com Fixes issue identified by Klocwork analysis: Pointer 'field_type' returned from call to function 'glsl_type' at line 4126 may be NULL and may be dereferenced at line 4139. Also there are 2 similar errors on line(s) 4165, 4174. In practice, it should be impossible to actually get NULL in here because a syntax error would have already caused compilation to halt. Signed-off-by: Ian Romanick ian.d.roman...@intel.com --- src/glsl/ast_to_hir.cpp | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index a1b4ee7..00563f3 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -4009,8 +4009,14 @@ ast_process_structure_or_interface_block(exec_list *instructions, * blocks. All other types, arrays, and structures * allowed for uniforms are allowed within a uniform * block. + * + * It should be impossible for decl_type to be NULL here. Cases that + * might naturally lead to decl_type being NULL, especially for the + * is_interface case, will have resulted in compilation having + * already halted due to a syntax error. */ - const struct glsl_type *field_type = decl_type; + const struct glsl_type *field_type = +decl_type != NULL ? decl_type : glsl_type::error_type; if (is_interface field_type-contains_sampler()) { YYLTYPE loc = decl_list-get_location(); @@ -4033,8 +4039,7 @@ ast_process_structure_or_interface_block(exec_list *instructions, field_type = process_array_type(loc, decl_type, decl-array_size, state); } -fields[i].type = (field_type != NULL) - ? field_type : glsl_type::error_type; + fields[i].type = field_type; fields[i].name = decl-identifier; if (qual-flags.q.row_major || qual-flags.q.column_major) { -- 1.8.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/4] glsl: Fix hypothetical NULL dereference related to process_array_type
From: Ian Romanick ian.d.roman...@intel.com Ensure that process_array_type never returns NULL, and let process_array_type handle the case where the supplied base type is NULL. Fixes issues identified by Klocwork analysis: Pointer 'type' returned from call to function 'get_type' at line 1907 may be NULL and may be dereferenced at line 1912. and Pointer 'field_type' checked for NULL at line 4160 will be dereferenced at line 4165. Also there is one similar error on line 4174. Signed-off-by: Ian Romanick ian.d.roman...@intel.com --- src/glsl/ast_to_hir.cpp | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 00563f3..eeff8c1 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -1702,6 +1702,9 @@ process_array_type(YYLTYPE *loc, const glsl_type *base, ast_node *array_size, { unsigned length = 0; + if (base == NULL) + return glsl_type::error_type; + /* From page 19 (page 25) of the GLSL 1.20 spec: * * Only one-dimensional arrays may be declared. @@ -1754,7 +1757,8 @@ process_array_type(YYLTYPE *loc, const glsl_type *base, ast_node *array_size, allowed in GLSL ES 1.00.); } - return glsl_type::get_array_instance(base, length); + const glsl_type *array_type = glsl_type::get_array_instance(base, length); + return array_type != NULL ? array_type : glsl_type::error_type; } -- 1.8.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/4] mesa: NULL check the pointer before trying to dereference it
From: Ian Romanick ian.d.roman...@intel.com

Duh.

Fixes issues identified by Klocwork analysis:

    Pointer 'table' returned from call to function 'calloc' at line 115 may be NULL and will be dereferenced at line 117.

and

    Suspicious dereference of pointer 'table' before NULL check at line 119.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/mesa/main/hash.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/hash.c b/src/mesa/main/hash.c
index 8c763e2..9b9fff8 100644
--- a/src/mesa/main/hash.c
+++ b/src/mesa/main/hash.c
@@ -114,9 +114,9 @@ _mesa_NewHashTable(void)
 {
    struct _mesa_HashTable *table = CALLOC_STRUCT(_mesa_HashTable);
 
-   table->ht = _mesa_hash_table_create(NULL, uint_key_compare);
-   _mesa_hash_table_set_deleted_key(table->ht, uint_key(DELETED_KEY_VALUE));
    if (table) {
+      table->ht = _mesa_hash_table_create(NULL, uint_key_compare);
+      _mesa_hash_table_set_deleted_key(table->ht, uint_key(DELETED_KEY_VALUE));
       _glthread_INIT_MUTEX(table->Mutex);
       _glthread_INIT_MUTEX(table->WalkMutex);
    }
--
1.8.1.4
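The pattern this patch restores (allocate, check, and only then dereference) can be sketched in standalone C as follows. The `sketch_table` type and its functions are hypothetical and only stand in for the hash table wrapper; they are not the Mesa API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container with one nested allocation. */
struct sketch_table {
   int *slots;
   unsigned size;
};

/* Returns NULL on allocation failure; never dereferences an unchecked
 * pointer. size must be non-zero.
 */
static struct sketch_table *
sketch_table_create(unsigned size)
{
   assert(size > 0);

   struct sketch_table *table = calloc(1, sizeof(*table));
   if (table == NULL)
      return NULL;

   /* Only touch the members once the outer allocation is known good. */
   table->slots = calloc(size, sizeof(*table->slots));
   if (table->slots == NULL) {
      free(table);
      return NULL;
   }

   table->size = size;
   return table;
}

/* Safe to call with NULL, like free(). */
static void
sketch_table_destroy(struct sketch_table *table)
{
   if (table != NULL) {
      free(table->slots);
      free(table);
   }
}
```

Callers still have to handle a NULL return from the constructor; moving the dereference inside the check only keeps the constructor itself from crashing.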
[Mesa-dev] [PATCH 4/4] egl/dri2: NULL check value returned by dri2_create_surface
From: Ian Romanick ian.d.roman...@intel.com

dri2_create_surface can fail for a variety of reasons, including bad input data. Dereferencing the NULL pointer and crashing is not okay.

Fixes issue identified by Klocwork analysis:

    Pointer 'surf' returned from call to function 'dri2_create_surface' at line 285 may be NULL and will be dereferenced at line 291.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/egl/drivers/dri2/platform_x11.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_x11.c b/src/egl/drivers/dri2/platform_x11.c
index da61cfc..86eeafa 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -284,14 +284,15 @@ dri2_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp,
 
    surf = dri2_create_surface(drv, disp, EGL_WINDOW_BIT, conf,
                               window, attrib_list);
-
-   /* When we first create the DRI2 drawable, its swap interval on the server
-    * side is 1.
-    */
-   surf->SwapInterval = 1;
-
-   /* Override that with a driconf-set value. */
-   drv->API.SwapInterval(drv, disp, surf, dri2_dpy->default_swap_interval);
+   if (surf != NULL) {
+      /* When we first create the DRI2 drawable, its swap interval on the
+       * server side is 1.
+       */
+      surf->SwapInterval = 1;
+
+      /* Override that with a driconf-set value. */
+      drv->API.SwapInterval(drv, disp, surf, dri2_dpy->default_swap_interval);
+   }
 
    return surf;
 }
--
1.8.1.4
[Mesa-dev] [PATCH 1/4] gallivm: fix unsigned divide and remainder opcodes
We want to both make sure we never divide by zero to not generate sigfpe and that divide by zero is guaranteed to return 0x. Based on José idea. Signed-off-by: Zack Rusin za...@vmware.com --- src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 29 +--- 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c index f3ae7b6..f72fafd 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c @@ -1543,8 +1543,19 @@ udiv_emit_cpu( struct lp_build_tgsi_context * bld_base, struct lp_build_emit_data * emit_data) { - emit_data-output[emit_data-chan] = lp_build_div(bld_base-uint_bld, - emit_data-args[0], emit_data-args[1]); + + LLVMBuilderRef builder = bld_base-base.gallivm-builder; + LLVMValueRef div_mask = lp_build_cmp(bld_base-uint_bld, +PIPE_FUNC_EQUAL, emit_data-args[1], +bld_base-uint_bld.zero); + LLVMValueRef divisor = LLVMBuildOr(builder, + div_mask, + emit_data-args[1], ); + LLVMValueRef result = lp_build_div(bld_base-uint_bld, + emit_data-args[0], divisor); + emit_data-output[emit_data-chan] = LLVMBuildOr(builder, +div_mask, +result, ); } /* TGSI_OPCODE_UMAX (CPU Only) */ @@ -1576,8 +1587,18 @@ umod_emit_cpu( struct lp_build_tgsi_context * bld_base, struct lp_build_emit_data * emit_data) { - emit_data-output[emit_data-chan] = lp_build_mod(bld_base-uint_bld, - emit_data-args[0], emit_data-args[1]); + LLVMBuilderRef builder = bld_base-base.gallivm-builder; + LLVMValueRef div_mask = lp_build_cmp(bld_base-uint_bld, +PIPE_FUNC_EQUAL, emit_data-args[1], +bld_base-uint_bld.zero); + LLVMValueRef divisor = LLVMBuildOr(builder, + div_mask, + emit_data-args[1], ); + LLVMValueRef result = lp_build_mod(bld_base-uint_bld, + emit_data-args[0], divisor); + emit_data-output[emit_data-chan] = LLVMBuildOr(builder, +div_mask, +result, ); } /* TGSI_OPCODE_USET Helper (CPU Only) */ -- 1.7.10.4 ___ mesa-dev mailing list 
mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
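The mask trick in this patch can be followed on a single scalar lane. The sketch below is plain C, not gallivm code, and the function names are hypothetical. It assumes the compare yields an all-ones mask when the divisor is zero and zero otherwise, which is what ORing the mask into both the divisor and the result relies on.

```c
#include <assert.h>
#include <stdint.h>

/* One-lane model of the guarded unsigned divide.
 * div_mask is all ones when b == 0, else 0.
 * ORing it into the divisor turns 0 into 0xffffffff, so the hardware divide
 * never sees a zero; ORing it into the result forces all bits set for the
 * divide-by-zero case and leaves other results unchanged.
 */
static uint32_t
sketch_udiv(uint32_t a, uint32_t b)
{
   uint32_t div_mask = (b == 0) ? UINT32_MAX : 0;
   uint32_t divisor = div_mask | b;
   uint32_t result = a / divisor;
   return div_mask | result;
}

/* Same guard applied to the unsigned remainder. */
static uint32_t
sketch_umod(uint32_t a, uint32_t b)
{
   uint32_t div_mask = (b == 0) ? UINT32_MAX : 0;
   uint32_t divisor = div_mask | b;
   uint32_t result = a % divisor;
   return div_mask | result;
}
```

Neither function branches on the divisor after the compare, which is what lets the same sequence work per lane on a vector of divisors.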
[Mesa-dev] [PATCH 2/4] llvmpipe: implement PIPE_QUERY_SO_STATISTICS
We were missing the implementation of PIPE_QUERY_SO_STATISTICS query, this change implements it on top of the existing facilities. Signed-off-by: Zack Rusin za...@vmware.com --- src/gallium/drivers/llvmpipe/lp_query.c | 19 +++ src/gallium/drivers/llvmpipe/lp_rast.c |3 +++ 2 files changed, 22 insertions(+) diff --git a/src/gallium/drivers/llvmpipe/lp_query.c b/src/gallium/drivers/llvmpipe/lp_query.c index 01d5201..013d192 100644 --- a/src/gallium/drivers/llvmpipe/lp_query.c +++ b/src/gallium/drivers/llvmpipe/lp_query.c @@ -137,6 +137,13 @@ llvmpipe_get_query_result(struct pipe_context *pipe, case PIPE_QUERY_PRIMITIVES_EMITTED: *result = pq-num_primitives_written; break; + case PIPE_QUERY_SO_STATISTICS: { + struct pipe_query_data_so_statistics *stats = + (struct pipe_query_data_so_statistics *)vresult; + stats-num_primitives_written = pq-num_primitives_written; + stats-primitives_storage_needed = pq-num_primitives_generated; + } + break; default: assert(0); break; @@ -174,6 +181,13 @@ llvmpipe_begin_query(struct pipe_context *pipe, struct pipe_query *q) llvmpipe-num_primitives_generated = 0; } + if (pq-type == PIPE_QUERY_SO_STATISTICS) { + pq-num_primitives_written = 0; + llvmpipe-so_stats.num_primitives_written = 0; + pq-num_primitives_generated = 0; + llvmpipe-num_primitives_generated = 0; + } + if (pq-type == PIPE_QUERY_OCCLUSION_COUNTER) { llvmpipe-active_occlusion_query = TRUE; llvmpipe-dirty |= LP_NEW_OCCLUSION_QUERY; @@ -197,6 +211,11 @@ llvmpipe_end_query(struct pipe_context *pipe, struct pipe_query *q) pq-num_primitives_generated = llvmpipe-num_primitives_generated; } + if (pq-type == PIPE_QUERY_SO_STATISTICS) { + pq-num_primitives_written = llvmpipe-so_stats.num_primitives_written; + pq-num_primitives_generated = llvmpipe-num_primitives_generated; + } + if (pq-type == PIPE_QUERY_OCCLUSION_COUNTER) { assert(llvmpipe-active_occlusion_query); llvmpipe-active_occlusion_query = FALSE; diff --git a/src/gallium/drivers/llvmpipe/lp_rast.c 
b/src/gallium/drivers/llvmpipe/lp_rast.c index 6183f41..903cb448 100644 --- a/src/gallium/drivers/llvmpipe/lp_rast.c +++ b/src/gallium/drivers/llvmpipe/lp_rast.c @@ -476,6 +476,8 @@ lp_rast_begin_query(struct lp_rasterizer_task *task, break; case PIPE_QUERY_PRIMITIVES_GENERATED: case PIPE_QUERY_PRIMITIVES_EMITTED: + case PIPE_QUERY_SO_STATISTICS: + break; break; default: assert(0); @@ -507,6 +509,7 @@ lp_rast_end_query(struct lp_rasterizer_task *task, break; case PIPE_QUERY_PRIMITIVES_GENERATED: case PIPE_QUERY_PRIMITIVES_EMITTED: + case PIPE_QUERY_SO_STATISTICS: break; default: assert(0); -- 1.7.10.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
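The query life cycle this patch adds (reset the counters at begin, snapshot them at end, copy them out on result) can be modelled in a few lines of standalone C. The struct and function names below are hypothetical simplifications; only the field names and the mapping of the generated count to primitives_storage_needed are taken from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the context, query and result. */
struct sketch_context {
   uint64_t num_primitives_written;
   uint64_t num_primitives_generated;
};

struct sketch_query {
   uint64_t num_primitives_written;
   uint64_t num_primitives_generated;
};

struct sketch_so_statistics {
   uint64_t num_primitives_written;
   uint64_t primitives_storage_needed;
};

/* Begin: zero both the query's counters and the context's running counters. */
static void
sketch_begin_query(struct sketch_context *ctx, struct sketch_query *q)
{
   q->num_primitives_written = 0;
   ctx->num_primitives_written = 0;
   q->num_primitives_generated = 0;
   ctx->num_primitives_generated = 0;
}

/* End: snapshot the context's counters into the query. */
static void
sketch_end_query(const struct sketch_context *ctx, struct sketch_query *q)
{
   q->num_primitives_written = ctx->num_primitives_written;
   q->num_primitives_generated = ctx->num_primitives_generated;
}

/* Result: storage needed is reported as the number of generated primitives. */
static void
sketch_get_result(const struct sketch_query *q,
                  struct sketch_so_statistics *stats)
{
   stats->num_primitives_written = q->num_primitives_written;
   stats->primitives_storage_needed = q->num_primitives_generated;
}
```

Note that resetting the context counters at begin, as the diff does, means this model only behaves for one active query of this kind at a time.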
[Mesa-dev] [PATCH 3/4] gallivm: fix loops and conditionals within GS
We were using simple temporaries, without using alloca or phi nodes which meant that on every iteration of the loop our temporaries holding the numbers of vertices and primitives which were emitted were being reset to zero. Now we're using alloca to allocate of those variables to preserve them across conditionals. Signed-off-by: Zack Rusin za...@vmware.com --- src/gallium/auxiliary/gallivm/lp_bld_tgsi.h |6 +- src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 118 --- 2 files changed, 105 insertions(+), 19 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h index 558a8dd..23ccacc 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h @@ -393,9 +393,9 @@ struct lp_build_tgsi_soa_context struct lp_build_context elem_bld; const struct lp_build_tgsi_gs_iface *gs_iface; - LLVMValueRef emitted_prims_vec; - LLVMValueRef total_emitted_vertices_vec; - LLVMValueRef emitted_vertices_vec; + LLVMValueRef emitted_prims_vec_ptr; + LLVMValueRef total_emitted_vertices_vec_ptr; + LLVMValueRef emitted_vertices_vec_ptr; /* if a shader doesn't have ENDPRIM instruction but it has * a number of EMIT instructions it means the END instruction * implicitly invokes ENDPRIM. 
handle this via a flag here diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c index 1e062e9..6cc72ff 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c @@ -1150,7 +1150,7 @@ emit_store_chan( } else { LLVMValueRef out_ptr = lp_get_output_ptr(bld, reg-Register.Index, - chan_index); + chan_index); lp_exec_mask_store(bld-exec_mask, bld_store, pred, value, out_ptr); } break; @@ -2213,6 +2213,41 @@ mask_to_one_vec(struct lp_build_tgsi_context *bld_base) } static void +increment_vec_ptr_by_mask(struct lp_build_tgsi_context * bld_base, + LLVMValueRef ptr, + LLVMValueRef mask) +{ + LLVMBuilderRef builder = bld_base-base.gallivm-builder; + + LLVMValueRef current_vec = LLVMBuildLoad(builder, ptr, ); + + current_vec = LLVMBuildAdd(builder, current_vec, mask, ); + + LLVMBuildStore(builder, current_vec, ptr); +} + +static void +clear_uint_vec_ptr_from_mask(struct lp_build_tgsi_context * bld_base, + LLVMValueRef ptr, + LLVMValueRef mask) +{ + LLVMBuilderRef builder = bld_base-base.gallivm-builder; + + LLVMValueRef current_vec = LLVMBuildLoad(builder, ptr, ); + LLVMValueRef full_mask = lp_build_cmp(bld_base-uint_bld, + PIPE_FUNC_NOTEQUAL, + mask, + bld_base-uint_bld.zero); + + current_vec = lp_build_select(bld_base-uint_bld, + full_mask, + bld_base-uint_bld.zero, + current_vec); + + LLVMBuildStore(builder, current_vec, ptr); +} + +static void emit_vertex( const struct lp_build_tgsi_action * action, struct lp_build_tgsi_context * bld_base, @@ -2223,14 +2258,22 @@ emit_vertex( if (bld-gs_iface-emit_vertex) { LLVMValueRef masked_ones = mask_to_one_vec(bld_base); + LLVMValueRef total_emitted_vertices_vec = + LLVMBuildLoad(builder, bld-total_emitted_vertices_vec_ptr, ); gather_outputs(bld); bld-gs_iface-emit_vertex(bld-gs_iface, bld-bld_base, bld-outputs, - bld-total_emitted_vertices_vec); - bld-emitted_vertices_vec = - LLVMBuildAdd(builder, 
bld-emitted_vertices_vec, masked_ones, ); - bld-total_emitted_vertices_vec = - LLVMBuildAdd(builder, bld-total_emitted_vertices_vec, masked_ones, ); + total_emitted_vertices_vec); + increment_vec_ptr_by_mask(bld_base, bld-emitted_vertices_vec_ptr, +masked_ones); + increment_vec_ptr_by_mask(bld_base, bld-total_emitted_vertices_vec_ptr, +masked_ones); +#if DUMP_GS_EMITS + lp_build_print_value(bld-bld_base.base.gallivm, +++ emit vertex masked ones = , + masked_ones); + lp_build_print_value(bld-bld_base.base.gallivm, +++ emit vertex emitted = , + total_emitted_vertices_vec); +#endif bld-pending_end_primitive = TRUE; } } @@ -2247,12 +2290,32 @@ end_primitive( if (bld-gs_iface-end_primitive) { LLVMValueRef masked_ones = mask_to_one_vec(bld_base); + LLVMValueRef emitted_vertices_vec = + LLVMBuildLoad(builder,
[Mesa-dev] [PATCH 4/4] gallivm/tgsi: handle untyped moves
Both mov and ucmp can be used to move variables of any type. Correctly note that about ucmp in the tgsi_info and make sure gallivm can handle that by correctly casting the untyped moves.

Signed-off-by: Zack Rusin za...@vmware.com
---
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |    8 ++++++++
 src/gallium/auxiliary/tgsi/tgsi_info.c          |    1 +
 2 files changed, 9 insertions(+)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
index 6cc72ff..9501100 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
@@ -1084,6 +1084,14 @@ emit_store_chan(
       break;
    }
 
+   /* If we're destination is untyped then the source can be anything,
+    * but LLVM won't like if the types don't match so lets cast
+    * to the correct destination type as expected by LLVM */
+   if (dtype == TGSI_TYPE_UNTYPED &&
+       !lp_check_vec_type(bld_store->type, LLVMTypeOf(value))) {
+      value = LLVMBuildBitCast(builder, value, bld_store->vec_type, "src_casted");
+   }
+
    switch( inst->Instruction.Saturate ) {
    case TGSI_SAT_NONE:
       break;
diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c b/src/gallium/auxiliary/tgsi/tgsi_info.c
index 8ae5523..1fadfec 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_info.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
@@ -327,6 +327,7 @@ tgsi_opcode_infer_dst_type( uint opcode )
 {
    switch (opcode) {
    case TGSI_OPCODE_MOV:
+   case TGSI_OPCODE_UCMP:
       return TGSI_TYPE_UNTYPED;
    case TGSI_OPCODE_F2U:
    case TGSI_OPCODE_AND:
--
1.7.10.4
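For readers less familiar with what a bitcast means for an untyped move, here is a small C analogue: the bits are carried over unchanged and only the type they are viewed as differs. This is a sketch with hypothetical helper names, not gallivm code, and the expected bit pattern in the usage note assumes IEEE 754 binary32 floats.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit unsigned integer.
 * memcpy is the portable way to do this in C; no value conversion happens.
 */
static uint32_t
sketch_bitcast_f32_to_u32(float value)
{
   uint32_t bits;
   assert(sizeof(bits) == sizeof(value));
   memcpy(&bits, &value, sizeof(bits));
   return bits;
}

/* Reinterpret 32 bits as a float; the inverse of the function above. */
static float
sketch_bitcast_u32_to_f32(uint32_t bits)
{
   float value;
   assert(sizeof(bits) == sizeof(value));
   memcpy(&value, &bits, sizeof(value));
   return value;
}
```

Under the IEEE 754 assumption, `sketch_bitcast_f32_to_u32(1.0f)` gives 0x3f800000, whereas a value conversion `(uint32_t) 1.0f` would give 1. An untyped move wants the former: the register contents are moved as-is regardless of how source and destination are typed.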
[Mesa-dev] [Bug 63117] OSMesa Gallium Empty Output
https://bugs.freedesktop.org/show_bug.cgi?id=63117

Brian Paul bri...@vmware.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from Brian Paul bri...@vmware.com ---
Patch pushed: acd4fb8b5aa68d6545cf3c7f63d9d2fa1cf73e73

--
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH 1/4] gallivm: fix unsigned divide and remainder opcodes
Am 10.04.2013 02:22, schrieb Zack Rusin: We want to both make sure we never divide by zero to not generate sigfpe and that divide by zero is guaranteed to return 0x. Based on José idea. Signed-off-by: Zack Rusin za...@vmware.com --- src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 29 +--- 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c index f3ae7b6..f72fafd 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c @@ -1543,8 +1543,19 @@ udiv_emit_cpu( struct lp_build_tgsi_context * bld_base, struct lp_build_emit_data * emit_data) { - emit_data-output[emit_data-chan] = lp_build_div(bld_base-uint_bld, - emit_data-args[0], emit_data-args[1]); + + LLVMBuilderRef builder = bld_base-base.gallivm-builder; + LLVMValueRef div_mask = lp_build_cmp(bld_base-uint_bld, +PIPE_FUNC_EQUAL, emit_data-args[1], +bld_base-uint_bld.zero); + LLVMValueRef divisor = LLVMBuildOr(builder, + div_mask, + emit_data-args[1], ); + LLVMValueRef result = lp_build_div(bld_base-uint_bld, + emit_data-args[0], divisor); + emit_data-output[emit_data-chan] = LLVMBuildOr(builder, +div_mask, +result, ); } /* TGSI_OPCODE_UMAX (CPU Only) */ @@ -1576,8 +1587,18 @@ umod_emit_cpu( struct lp_build_tgsi_context * bld_base, struct lp_build_emit_data * emit_data) { - emit_data-output[emit_data-chan] = lp_build_mod(bld_base-uint_bld, - emit_data-args[0], emit_data-args[1]); + LLVMBuilderRef builder = bld_base-base.gallivm-builder; + LLVMValueRef div_mask = lp_build_cmp(bld_base-uint_bld, +PIPE_FUNC_EQUAL, emit_data-args[1], +bld_base-uint_bld.zero); + LLVMValueRef divisor = LLVMBuildOr(builder, + div_mask, + emit_data-args[1], ); + LLVMValueRef result = lp_build_mod(bld_base-uint_bld, + emit_data-args[0], divisor); + emit_data-output[emit_data-chan] = LLVMBuildOr(builder, +div_mask, +result, ); } /* TGSI_OPCODE_USET Helper (CPU Only) 
*/ I think it would be nice if there'd be a comment in the code itself saying why this is done (e.g. something similar to the commit message). But either way, Reviewed-by: Roland Scheidegger srol...@vmware.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/4] llvmpipe: implement PIPE_QUERY_SO_STATISTICS
Am 10.04.2013 02:22, schrieb Zack Rusin: We were missing the implementation of PIPE_QUERY_SO_STATISTICS query, this change implements it on top of the existing facilities. Signed-off-by: Zack Rusin za...@vmware.com --- src/gallium/drivers/llvmpipe/lp_query.c | 19 +++ src/gallium/drivers/llvmpipe/lp_rast.c |3 +++ 2 files changed, 22 insertions(+) diff --git a/src/gallium/drivers/llvmpipe/lp_query.c b/src/gallium/drivers/llvmpipe/lp_query.c index 01d5201..013d192 100644 --- a/src/gallium/drivers/llvmpipe/lp_query.c +++ b/src/gallium/drivers/llvmpipe/lp_query.c @@ -137,6 +137,13 @@ llvmpipe_get_query_result(struct pipe_context *pipe, case PIPE_QUERY_PRIMITIVES_EMITTED: *result = pq-num_primitives_written; break; + case PIPE_QUERY_SO_STATISTICS: { + struct pipe_query_data_so_statistics *stats = + (struct pipe_query_data_so_statistics *)vresult; + stats-num_primitives_written = pq-num_primitives_written; + stats-primitives_storage_needed = pq-num_primitives_generated; + } + break; default: assert(0); break; @@ -174,6 +181,13 @@ llvmpipe_begin_query(struct pipe_context *pipe, struct pipe_query *q) llvmpipe-num_primitives_generated = 0; } + if (pq-type == PIPE_QUERY_SO_STATISTICS) { + pq-num_primitives_written = 0; + llvmpipe-so_stats.num_primitives_written = 0; + pq-num_primitives_generated = 0; + llvmpipe-num_primitives_generated = 0; + } + if (pq-type == PIPE_QUERY_OCCLUSION_COUNTER) { llvmpipe-active_occlusion_query = TRUE; llvmpipe-dirty |= LP_NEW_OCCLUSION_QUERY; @@ -197,6 +211,11 @@ llvmpipe_end_query(struct pipe_context *pipe, struct pipe_query *q) pq-num_primitives_generated = llvmpipe-num_primitives_generated; } + if (pq-type == PIPE_QUERY_SO_STATISTICS) { + pq-num_primitives_written = llvmpipe-so_stats.num_primitives_written; + pq-num_primitives_generated = llvmpipe-num_primitives_generated; + } + if (pq-type == PIPE_QUERY_OCCLUSION_COUNTER) { assert(llvmpipe-active_occlusion_query); llvmpipe-active_occlusion_query = FALSE; diff --git 
a/src/gallium/drivers/llvmpipe/lp_rast.c b/src/gallium/drivers/llvmpipe/lp_rast.c index 6183f41..903cb448 100644 --- a/src/gallium/drivers/llvmpipe/lp_rast.c +++ b/src/gallium/drivers/llvmpipe/lp_rast.c @@ -476,6 +476,8 @@ lp_rast_begin_query(struct lp_rasterizer_task *task, break; case PIPE_QUERY_PRIMITIVES_GENERATED: case PIPE_QUERY_PRIMITIVES_EMITTED: + case PIPE_QUERY_SO_STATISTICS: + break; break; double break? default: assert(0); @@ -507,6 +509,7 @@ lp_rast_end_query(struct lp_rasterizer_task *task, break; case PIPE_QUERY_PRIMITIVES_GENERATED: case PIPE_QUERY_PRIMITIVES_EMITTED: + case PIPE_QUERY_SO_STATISTICS: break; default: assert(0); Otherwise Reviewed-by: Roland Scheidegger srol...@vmware.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/4] gallivm: fix loops and conditionals within GS
On 10.04.2013 02:22, Zack Rusin wrote:
> We were using simple temporaries, without using alloca or phi nodes,
> which meant that on every iteration of the loop our temporaries
> holding the numbers of vertices and primitives which were emitted
> were being reset to zero. Now we're using alloca to allocate those
> variables to preserve them across conditionals.
>
> Signed-off-by: Zack Rusin za...@vmware.com
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi.h     |    6 +-
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |  118 ---
>  2 files changed, 105 insertions(+), 19 deletions(-)
>
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
> index 558a8dd..23ccacc 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi.h
> @@ -393,9 +393,9 @@ struct lp_build_tgsi_soa_context
>     struct lp_build_context elem_bld;
>
>     const struct lp_build_tgsi_gs_iface *gs_iface;
> -   LLVMValueRef emitted_prims_vec;
> -   LLVMValueRef total_emitted_vertices_vec;
> -   LLVMValueRef emitted_vertices_vec;
> +   LLVMValueRef emitted_prims_vec_ptr;
> +   LLVMValueRef total_emitted_vertices_vec_ptr;
> +   LLVMValueRef emitted_vertices_vec_ptr;
>     /* if a shader doesn't have ENDPRIM instruction but it has
>      * a number of EMIT instructions it means the END instruction
>      * implicitly invokes ENDPRIM. handle this via a flag here
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> index 1e062e9..6cc72ff 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> @@ -1150,7 +1150,7 @@ emit_store_chan(
>        }
>        else {
>           LLVMValueRef out_ptr = lp_get_output_ptr(bld, reg->Register.Index,
> -                                                  chan_index);
> +                                                  chan_index);
>           lp_exec_mask_store(&bld->exec_mask, bld_store, pred, value, out_ptr);
>        }
>        break;
> @@ -2213,6 +2213,41 @@ mask_to_one_vec(struct lp_build_tgsi_context *bld_base)
>  }
>
>  static void
> +increment_vec_ptr_by_mask(struct lp_build_tgsi_context * bld_base,
> +                          LLVMValueRef ptr,
> +                          LLVMValueRef mask)
> +{
> +   LLVMBuilderRef builder = bld_base->base.gallivm->builder;
> +
> +   LLVMValueRef current_vec = LLVMBuildLoad(builder, ptr, "");
> +
> +   current_vec = LLVMBuildAdd(builder, current_vec, mask, "");
> +
> +   LLVMBuildStore(builder, current_vec, ptr);
> +}
> +
> +static void
> +clear_uint_vec_ptr_from_mask(struct lp_build_tgsi_context * bld_base,
> +                             LLVMValueRef ptr,
> +                             LLVMValueRef mask)
> +{
> +   LLVMBuilderRef builder = bld_base->base.gallivm->builder;
> +
> +   LLVMValueRef current_vec = LLVMBuildLoad(builder, ptr, "");
> +   LLVMValueRef full_mask = lp_build_cmp(&bld_base->uint_bld,
> +                                         PIPE_FUNC_NOTEQUAL,
> +                                         mask,
> +                                         bld_base->uint_bld.zero);
> +
> +   current_vec = lp_build_select(&bld_base->uint_bld,
> +                                 full_mask,
> +                                 bld_base->uint_bld.zero,
> +                                 current_vec);
> +
> +   LLVMBuildStore(builder, current_vec, ptr);
> +}
> +
> +static void
>  emit_vertex(
>     const struct lp_build_tgsi_action * action,
>     struct lp_build_tgsi_context * bld_base,
> @@ -2223,14 +2258,22 @@ emit_vertex(
>
>     if (bld->gs_iface->emit_vertex) {
>        LLVMValueRef masked_ones = mask_to_one_vec(bld_base);
> +      LLVMValueRef total_emitted_vertices_vec =
> +         LLVMBuildLoad(builder, bld->total_emitted_vertices_vec_ptr, "");
>        gather_outputs(bld);
>        bld->gs_iface->emit_vertex(bld->gs_iface, &bld->bld_base,
>                                   bld->outputs,
> -                                 bld->total_emitted_vertices_vec);
> -      bld->emitted_vertices_vec =
> -         LLVMBuildAdd(builder, bld->emitted_vertices_vec, masked_ones, "");
> -      bld->total_emitted_vertices_vec =
> -         LLVMBuildAdd(builder, bld->total_emitted_vertices_vec, masked_ones, "");
> +                                 total_emitted_vertices_vec);
> +      increment_vec_ptr_by_mask(bld_base, bld->emitted_vertices_vec_ptr,
> +                                masked_ones);
> +      increment_vec_ptr_by_mask(bld_base, bld->total_emitted_vertices_vec_ptr,
> +                                masked_ones);
> +#if DUMP_GS_EMITS
> +      lp_build_print_value(bld->bld_base.base.gallivm, " +++ emit vertex masked ones = ",
> +                           masked_ones);
> +      lp_build_print_value(bld->bld_base.base.gallivm, " +++ emit vertex emitted = ",
> +                           total_emitted_vertices_vec);
> +#endif
>        bld->pending_end_primitive = TRUE;
>     }
>  }
> @@ -2247,12 +2290,32 @@ end_primitive(
>     if
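The per-lane behaviour of the two new helpers in the quoted patch can be modelled on the CPU. The following is only a sketch in plain C, not LLVM builder code: it assumes a vector width of 4 lanes and a mask holding 1 in active lanes, and the names `VEC_WIDTH`, `counter_increment_by_mask` and `counter_clear_from_mask` are hypothetical. The point it illustrates is that the counters live in memory and are updated with a load, modify, store sequence, so their values survive from one loop iteration to the next.

```c
#include <assert.h>
#include <stdint.h>

#define VEC_WIDTH 4 /* assumed lane count, for illustration only */

/* Add the per-lane mask (1 = active lane, 0 = inactive lane) to the
 * counter stored in memory: load, add, store. */
void
counter_increment_by_mask(uint32_t *counter, const uint32_t *mask)
{
   for (int i = 0; i < VEC_WIDTH; i++)
      counter[i] += mask[i];
}

/* Reset every lane whose mask entry is non-zero; other lanes keep
 * their current value (a compare followed by a select). */
void
counter_clear_from_mask(uint32_t *counter, const uint32_t *mask)
{
   for (int i = 0; i < VEC_WIDTH; i++)
      counter[i] = (mask[i] != 0) ? 0 : counter[i];
}
```

Calling `counter_increment_by_mask` twice with the same mask leaves 2 in every active lane, which is the accumulation that plain temporaries were losing.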
Re: [Mesa-dev] [PATCH 4/4] gallivm/tgsi: handle untyped moves
On 10.04.2013 02:22, Zack Rusin wrote:
> both mov and ucmp can be used to move variables of any type.
> correctly note that about ucmp in the tgsi_info and make sure
> gallivm can handle that by correctly casting the untyped moves.
>
> Signed-off-by: Zack Rusin za...@vmware.com
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c |    8
>  src/gallium/auxiliary/tgsi/tgsi_info.c          |    1 +
>  2 files changed, 9 insertions(+)
>
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> index 6cc72ff..9501100 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c
> @@ -1084,6 +1084,14 @@ emit_store_chan(
>        break;
>     }
>
> +   /* If we're destination is untyped then the source can be anything,

The sentence doesn't parse.

> +    * but LLVM won't like if the types don't match so lets cast
> +    * to the correct destination type as expected by LLVM */
> +   if (dtype == TGSI_TYPE_UNTYPED &&
> +       !lp_check_vec_type(bld_store->type, LLVMTypeOf(value))) {
> +      value = LLVMBuildBitCast(builder, value, bld_store->vec_type, "src_casted");
> +   }
> +
>     switch( inst->Instruction.Saturate ) {
>     case TGSI_SAT_NONE:
>        break;
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c b/src/gallium/auxiliary/tgsi/tgsi_info.c
> index 8ae5523..1fadfec 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_info.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
> @@ -327,6 +327,7 @@ tgsi_opcode_infer_dst_type( uint opcode )
>  {
>     switch (opcode) {
>     case TGSI_OPCODE_MOV:
> +   case TGSI_OPCODE_UCMP:
>        return TGSI_TYPE_UNTYPED;
>     case TGSI_OPCODE_F2U:
>     case TGSI_OPCODE_AND:

Otherwise looks good.

Roland
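An untyped move copies bits without interpreting them, which is why the quoted patch reaches for a bitcast rather than a numeric conversion. A rough CPU-side analogy in plain C follows; it assumes a 32-bit IEEE-754 `float`, and the helper name `move_untyped_bits` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The sketch only makes sense if both types have the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Reinterpret the bits of a float as an unsigned integer. The value is
 * not converted: the same 32 bits are simply given a different type. */
uint32_t
move_untyped_bits(float src)
{
   uint32_t dst;
   memcpy(&dst, &src, sizeof dst);
   return dst;
}
```

For comparison, a value conversion such as `(uint32_t)1.0f` yields 1, whereas the bit-preserving move of `1.0f` yields the encoding `0x3f800000`.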
Re: [Mesa-dev] [PATCH 2/8] intel: Add functions for checking if objs have hiz enabled
On 04/09/2013 04:03 PM, Eric Anholt wrote:
> Kenneth Graunke kenn...@whitecape.org writes:
>
>> From: Chad Versace chad.vers...@linux.intel.com
>>
>> On Haswell, HiZ will selectively be enabled on individual miptree slices
>> to work around a hardware bug. The two new functions below will permit us
>> to detect if hiz is enabled for a particular slice.
>>
>>     intel_miptree_slice_has_hiz
>>     intel_renderbuffer_has_hiz
>>
>> The functions are not yet used.
>>
>> Signed-off-by: Chad Versace chad.vers...@linux.intel.com
>> ---
>>  src/mesa/drivers/dri/intel/intel_fbo.c         | 10 ++
>>  src/mesa/drivers/dri/intel/intel_fbo.h         |  3 +++
>>  src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 12
>>  src/mesa/drivers/dri/intel/intel_mipmap_tree.h | 11 +--
>>  4 files changed, 34 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c b/src/mesa/drivers/dri/intel/intel_fbo.c
>> index 2977568..0e2ded5 100644
>> --- a/src/mesa/drivers/dri/intel/intel_fbo.c
>> +++ b/src/mesa/drivers/dri/intel/intel_fbo.c
>> @@ -943,6 +943,16 @@ intel_renderbuffer_set_needs_downsample(struct intel_renderbuffer *irb)
>>        irb->mt->need_downsample = true;
>>  }
>>
>> +/**
>> + * Does the renderbuffer have hiz enabled?
>> + */
>> +bool
>> +intel_renderbuffer_has_hiz(struct intel_renderbuffer *irb)
>> +{
>> +   return irb->mt &&
>> +      intel_miptree_slice_has_hiz(irb->mt, irb->mt_level, irb->mt_layer);
>> +}
>
> irb->mt should always be non-null -- a renderbuffer without that should
> never exist.

Is that true for i915? I've never understood i915.
Re: [Mesa-dev] [PATCH 1/4] glsl: Fix hypothetical NULL dereference in ast_process_structure_or_interface_block
On 04/09/2013 04:59 PM, Ian Romanick wrote:
> From: Ian Romanick ian.d.roman...@intel.com
>
> Fixes issue identified by Klocwork analysis:
>
>     Pointer 'field_type' returned from call to function 'glsl_type' at
>     line 4126 may be NULL and may be dereferenced at line 4139. Also
>     there are 2 similar errors on line(s) 4165, 4174.
>
> In practice, it should be impossible to actually get NULL in here
> because a syntax error would have already caused compilation to halt.
>
> Signed-off-by: Ian Romanick ian.d.roman...@intel.com
> ---
>  src/glsl/ast_to_hir.cpp | 11 ---
>  1 file changed, 8 insertions(+), 3 deletions(-)

These seem okay to me. All four are:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
Re: [Mesa-dev] [PATCH 1/8] i965/blorp: Align rectangle primitive for hiz ops
On 04/09/2013 03:51 PM, Kenneth Graunke wrote:
> From: Chad Versace chad.vers...@linux.intel.com
>
> The hardware docs and the simulator require that the rectangle primitive
> emitted during fast depth clears and hiz resolves must be aligned to 8x4
> pixels.
>
> Signed-off-by: Chad Versace chad.vers...@linux.intel.com
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.cpp | 29 +
>  1 file changed, 29 insertions(+)

I agree with Eric's comments. Assuming you make those changes, this
series is:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

Thanks so much for doing this, Chad.
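The patch body is not included in this message, so the following is only a sketch of what aligning a rectangle to 8x4 pixels can look like, not the code that was committed. The type `struct rect_sketch` and the helpers `align_up_pot` and `align_rect_8x4` are hypothetical names; the rectangle is grown outward so that each edge lands on the 8x4 grid.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of a; a must be a power of two. */
uint32_t
align_up_pot(uint32_t v, uint32_t a)
{
   return (v + a - 1) & ~(a - 1);
}

/* Hypothetical rectangle type: [x0, x1) by [y0, y1) in pixels. */
struct rect_sketch {
   uint32_t x0, y0, x1, y1;
};

/* Grow the rectangle so that x edges are multiples of 8 and y edges
 * are multiples of 4. */
struct rect_sketch
align_rect_8x4(struct rect_sketch r)
{
   r.x0 &= ~7u;                  /* round left edge down to 8 */
   r.y0 &= ~3u;                  /* round top edge down to 4 */
   r.x1 = align_up_pot(r.x1, 8); /* round right edge up to 8 */
   r.y1 = align_up_pot(r.y1, 4); /* round bottom edge up to 4 */
   return r;
}
```

For example, the rectangle (3, 5) to (13, 9) becomes (0, 4) to (16, 12).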
Re: [Mesa-dev] [PATCH 1/2] i965: NULL check prog on compilation failure.
On 04/09/2013 03:55 PM, Eric Anholt wrote:
> Matt Turner matts...@gmail.com writes:
>
>> I believe that prog can only be NULL for ARB programs. Neither
>> brw_fs_fp.cpp nor brw_vec4_vp.cpp call fail(), but not NULL checking
>> prog is obviously fragile.
>
> (shader != NULL) => (prog != NULL), so if you want consistency I'd
> rather see the if (shader) changed to if (prog). A bunch of these
> changes are not about compilation failure, anyway.

Is that so? What about fixed-function VS and a GLSL fragment shader? I
think that's allowed in at least some older specs, and there won't be a
gl_shader for the VS, but I'd be surprised if there wasn't a
gl_shader_program...
[Mesa-dev] [PATCH] glsl: Fix (and validate) comment above glsl_type::name.
The comment above glsl_type::name claimed that it could sometimes be
NULL. This was wrong--it is never NULL. Many error handling paths
would segfault if it were. Fix the comment and add assertions to
validate that it really is never NULL.
---
 src/glsl/glsl_types.cpp | 4
 src/glsl/glsl_types.h   | 3 +--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
index 419761a..df9c5d3 100644
--- a/src/glsl/glsl_types.cpp
+++ b/src/glsl/glsl_types.cpp
@@ -57,6 +57,7 @@ glsl_type::glsl_type(GLenum gl_type,
    length(0)
 {
    init_ralloc_type_ctx();
+   assert(name != NULL);
    this->name = ralloc_strdup(this->mem_ctx, name);
    /* Neither dimension is zero or both dimensions are zero.
     */
@@ -75,6 +76,7 @@ glsl_type::glsl_type(GLenum gl_type,
    length(0)
 {
    init_ralloc_type_ctx();
+   assert(name != NULL);
    this->name = ralloc_strdup(this->mem_ctx, name);
    memset(& fields, 0, sizeof(fields));
 }
@@ -91,6 +93,7 @@ glsl_type::glsl_type(const glsl_struct_field *fields, unsigned num_fields,
    unsigned int i;

    init_ralloc_type_ctx();
+   assert(name != NULL);
    this->name = ralloc_strdup(this->mem_ctx, name);
    this->fields.structure = ralloc_array(this->mem_ctx,
                                          glsl_struct_field, length);
@@ -114,6 +117,7 @@ glsl_type::glsl_type(const glsl_struct_field *fields, unsigned num_fields,
    unsigned int i;

    init_ralloc_type_ctx();
+   assert(name != NULL);
    this->name = ralloc_strdup(this->mem_ctx, name);
    this->fields.structure = ralloc_array(this->mem_ctx,
                                          glsl_struct_field, length);
diff --git a/src/glsl/glsl_types.h b/src/glsl/glsl_types.h
index 2f3b19f..31e3dd2 100644
--- a/src/glsl/glsl_types.h
+++ b/src/glsl/glsl_types.h
@@ -132,8 +132,7 @@ struct glsl_type {
    /**
     * Name of the data type
     *
-    * This may be \c NULL for anonymous structures, for arrays, or for
-    * function types.
+    * Will never be \c NULL.
     */
    const char *name;

--
1.8.2.1
[Mesa-dev] Regarding UVD implementation check-out from MESA
Hi Christian,

I am working with Mesa and want to test the UVD implementation. I tried to check out the code for the UVD implementation as described at the link below, but I did not find it in the Mesa tree.

Could you please guide me on how to check out the code for "radeon/uvd: add UVD implementation" from the Mesa tree, and point me to the exact kernel patch for UVD?

http://lists.freedesktop.org/archives/mesa-dev/2013-April/037049.html

Thanks in advance.

Regards,
Ramesh
Re: [Mesa-dev] [PATCH 8/8] intel/hsw: Enable hiz
On 04/09/2013 04:15 PM, Eric Anholt wrote:
> Kenneth Graunke kenn...@whitecape.org writes:
>
>> From: Chad Versace chad.vers...@linux.intel.com
>>
>> Enable hiz by setting intel_context::has_hiz. However, to work around
>> a hardware bug, we selectively enable hiz for only nicely aligned
>> miptree slices.
>>
>> No Piglit regressions on Haswell 0x0d26 rev07 when based atop
>> mesa-master-97e40a5.
>>
>> Improves the performance of GLB27_TRex_C24Z16_FixedTimeStep by 18.52%
>> (hsw-0x0d26-rev07; kernel-3.9.0-rc1; GLBenchmark 2.7.0 Release a68901;
>> samples=3).
>
>> diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
>> index 6c27bab..654f0be 100644
>> --- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
>> +++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
>> @@ -29,6 +29,7 @@
>>  #include <GL/internal/dri_interface.h>
>>
>>  #include "intel_batchbuffer.h"
>> +#include "intel_chipset.h"
>>  #include "intel_context.h"
>>  #include "intel_mipmap_tree.h"
>>  #include "intel_regions.h"
>> @@ -1036,7 +1037,38 @@ intel_miptree_slice_has_hiz(struct intel_mipmap_tree *mt,
>>                              uint32_t layer)
>>  {
>>     intel_miptree_check_level_layer(mt, level, layer);
>> -   return mt->hiz_mt != NULL;
>> +
>> +   if (!mt->hiz_mt)
>> +      return false;
>> +
>> +   int devid = drm_intel_bufmgr_gem_get_devid(mt->region->bo->bufmgr);
>> +   if (IS_HASWELL(devid)) {
>
> This conditional is checking for about 37 different PCI IDs, and
> shouldn't appear in a hot path. Please use intel->is_haswell, by
> passing the intel_context into this function.

Passing the intel_context into this function would require changing the
signature of many functions. Rather than change all those signatures,
I'd prefer to store a boolean flag 'has_hiz' in intel_mipmap_slice. What
do you think?

>> +      /* Disable HiZ for some slices to work around a hardware bug.
>> +       *
>> +       * Haswell hardware fails to respect
>> +       * 3DSTATE_DEPTH_BUFFER.Depth_Coordinate_Offset_X/Y when during HiZ
>                                                                ^ s/when //
>
> Other than the comments I've sent, the series is:
>
> Reviewed-by: Eric Anholt e...@anholt.net
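The alternative proposed in the reply, storing a per-slice flag instead of testing the device ID on every query, could look roughly like the sketch below. It is an illustration under stated assumptions, not code from the series: `struct slice_sketch`, `slice_init_hiz` and `slice_has_hiz` are hypothetical names, and both the "is Haswell" answer and the "slice is nicely aligned" answer are passed in as plain booleans because the exact checks are not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-slice record; only the cached flag matters here. */
struct slice_sketch {
   bool has_hiz;
};

/* Run once, when the HiZ buffer is set up. The expensive or awkward
 * questions (device generation, slice alignment) are answered here and
 * the result is cached in the slice. */
void
slice_init_hiz(struct slice_sketch *slice, bool hiz_allocated,
               bool is_haswell, bool slice_is_aligned)
{
   slice->has_hiz = hiz_allocated && (!is_haswell || slice_is_aligned);
}

/* Hot-path query: a single load, no device-ID comparisons and no
 * context pointer needed. */
bool
slice_has_hiz(const struct slice_sketch *slice)
{
   return slice->has_hiz;
}
```

The trade-off in this sketch is one extra byte per slice in exchange for leaving the signatures of the existing callers untouched.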