Re: [Mesa-dev] [PATCH] mesa: use signed temporary variable to store _ColorDrawBufferIndexes
Reviewed-by: Marek Olšák marek.ol...@amd.com

Marek

On Sun, Jan 12, 2014 at 11:52 PM, Emil Velikov emil.l.veli...@gmail.com wrote:
> _ColorDrawBufferIndexes is defined as GLint* and using a GLuint* will result in the first part of the conditional to be evaluated to true always.
>
> Unintentionally introduced by the following commit, this will result in a driver segfault if one is using an old version of the piglit test
>
>    bin/clearbuffer-mixed-format -auto -fbo
>
>    commit 03d848ea1003abefd8fe51a5b4a780527cd852af
>    Author: Marek Olšák marek.ol...@amd.com
>    Date:   Wed Dec 4 00:27:20 2013 +0100
>
>        mesa: fix interpretation of glClearBuffer(drawbuffer)
>
>        This corresponding piglit tests supported this incorrect behavior
>        instead of pointing at it.
>
> Cc: Marek Olšák marek.ol...@amd.com
> Cc: 10.0 9.2 9.1 mesa-sta...@lists.freedesktop.org
> Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
> ---
>  src/mesa/main/clear.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/main/clear.c b/src/mesa/main/clear.c
> index f0b525f..d568ed8 100644
> --- a/src/mesa/main/clear.c
> +++ b/src/mesa/main/clear.c
> @@ -274,7 +274,7 @@ make_color_buffer_mask(struct gl_context *ctx, GLint drawbuffer)
>        break;
>     default:
>        {
> -         GLuint buf = ctx->DrawBuffer->_ColorDrawBufferIndexes[drawbuffer];
> +         GLint buf = ctx->DrawBuffer->_ColorDrawBufferIndexes[drawbuffer];
>
>           if (buf >= 0 && att[buf].Renderbuffer) {
>              mask |= 1 << buf;
> --
> 1.8.5.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
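For readers following along, here is a minimal standalone sketch of the signedness problem described in the patch above. It is not Mesa code: the `indexes` table and both helper names are hypothetical, and -1 is assumed to mark an unused slot.

```c
#include <assert.h>

/* Hypothetical stand-in for a draw-buffer index table; -1 marks an unused slot. */
static const int indexes[] = { 0, -1, 2 };

/* Buggy variant: the index is copied into an unsigned temporary, so -1
 * wraps to a huge positive value and the ">= 0" test can never fail. */
static int is_valid_unsigned(int slot)
{
    assert(slot >= 0 && slot < 3);
    unsigned int buf = (unsigned int)indexes[slot];
    return buf >= 0; /* always true for an unsigned type */
}

/* Fixed variant: a signed temporary keeps -1 negative, so the test works. */
static int is_valid_signed(int slot)
{
    assert(slot >= 0 && slot < 3);
    int buf = indexes[slot];
    return buf >= 0;
}
```

With the unsigned temporary, the unused slot passes the check and would then be used as an array index, which is consistent with the segfault reported in the commit message.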
Re: [Mesa-dev] [PATCH] Mark debug_print with __attribute__ ((format(__printf__, 1, 0)))
On Sun, Jan 12, 2014 at 10:34:19AM -0800, Keith Packard wrote:
> the drmServerInfo member, debug_print, takes a printf format string and varargs list. Tell the compiler about it.
>
> Signed-off-by: Keith Packard kei...@keithp.com
> ---
>  xf86drm.h | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/xf86drm.h b/xf86drm.h
> index 1e763a3..5e170f8 100644
> --- a/xf86drm.h
> +++ b/xf86drm.h
> @@ -92,8 +92,14 @@ extern "C" {
>  typedef unsigned int drmSize, *drmSizePtr; /**< For mapped regions */
>  typedef void *drmAddress, **drmAddressPtr; /**< For mapped regions */
>
> +#if (__GNUC__ >= 3)
> +#define DRM_PRINTFLIKE(f, a) __attribute__ ((format(__printf__, f, a)))
> +#else
> +#define DRM_PRINTFLIKE(f, a)
> +#endif
> +
>  typedef struct _drmServerInfo {
> -  int (*debug_print)(const char *format, va_list ap);
> +  int (*debug_print)(const char *format, va_list ap) DRM_PRINTFLIKE(1,0);
>    int (*load_module)(const char *name);
>    void (*get_perms)(gid_t *, mode_t *);
>  } drmServerInfo, *drmServerInfoPtr;

While at it, perhaps the drmMsg() and drmDebugPrint() functions should be similarly annotated as well?

Thierry
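As an illustration of the annotation discussed above, here is a small sketch showing the two forms of the GCC `format` attribute: a last argument of 0 for a `va_list` function (only the format string is checked) and a non-zero index for a truly variadic function. The `PRINTFLIKE` macro and the `log_v`/`log_msg` helpers are hypothetical names, not part of libdrm.

```c
#include <stdarg.h>
#include <stdio.h>

/* Same idea as DRM_PRINTFLIKE in the patch; the name here is made up. */
#if defined(__GNUC__) && (__GNUC__ >= 3)
#define PRINTFLIKE(f, a) __attribute__ ((format(__printf__, f, a)))
#else
#define PRINTFLIKE(f, a)
#endif

/* va_list variant: the format string is argument 3; the trailing 0 says
 * there are no variadic arguments for the compiler to check. */
static int log_v(char *out, size_t n, const char *format, va_list ap)
    PRINTFLIKE(3, 0);

/* Variadic variant: arguments to check start at position 4. */
static int log_msg(char *out, size_t n, const char *format, ...)
    PRINTFLIKE(3, 4);

static int log_v(char *out, size_t n, const char *format, va_list ap)
{
    return vsnprintf(out, n, format, ap);
}

static int log_msg(char *out, size_t n, const char *format, ...)
{
    va_list ap;
    int ret;

    va_start(ap, format);
    ret = log_v(out, n, format, ap);
    va_end(ap);
    return ret;
}
```

A call such as `log_msg(buf, sizeof buf, "%d-%s", 7, "x")` is then checked against its format string at compile time on GCC-compatible compilers, while other compilers simply see an empty macro.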
Re: [Mesa-dev] Naming everything in src/gallium/drivers/radeonsi si_*
For the series:

Reviewed-by: Marek Olšák marek.ol...@amd.com

Feel free to push this.

Marek

On Sat, Jan 11, 2014 at 4:20 PM, Andreas Hartmetz ahartm...@gmail.com wrote:
> Continuing here because the threads had diverged...
>
> I've updated the patch series under the same URL and applied all the suggested improvements. The variable renames are still in, but at the very end so they are trivial to omit.
>
> On Tuesday 07 January 2014 17:27:56 Andreas Hartmetz wrote:
>> We have talked on IRC meanwhile: "Everywhere" was supposed to mean file names and data structures.
>>
>> I have made a patch series (git link because file renames produce huge diffs) that renames *everything* away from r600 (and also radeonsi) to si, where it is actually about SI. In the code modified this way it is then clear at first glance that only resources, textures and some other low-level interface code from R600 / generic Radeon are actually used in SI code.
>>
>> The patch series is ordered by increasing controversy potential due to destruction of git blame history, so the last parts can be omitted if they are deemed too destructive to history. In my opinion, it is better to have code that is readable now than code that is less readable but with the possibility to look up how it became like that.
>>
>> Michel said on IRC that he'd prefer to keep the name radeonsi_pipe.h/c. I disagree: if the library name is to be kept, there must be a break between radeonsi and si *somewhere*, and it is normal for library names to not correspond to any file name in the library. The same scheme is used in llvmpipe: llvmpipe lib / directory versus lp_* file names.
>>
>> Here's the repository (branch is master):
>> git: git://anongit.kde.org/scratch/ahartmetz/mesa.git
>> web: http://quickgit.kde.org/?p=scratch%2Fahartmetz%2Fmesa.git
>>
>> On Monday 06 January 2014 15:50:05 Marek Olšák wrote:
>>> It sounds good, but I'd like the prefix to be si_ everywhere.
>>>
>>> Marek
>>>
>>> On Mon, Jan 6, 2014 at 2:47 PM, Andreas Hartmetz ahartm...@gmail.com wrote:
>>>> Hello,
>>>>
>>>> many of the files in radeonsi originally came from other places where they had different names and were never renamed. Most of them now have names that don't tell what the files are for (r600 is not actually the first hardware supported by them, they start at radeonsi), and even those with radeonsi are split between radeonsi_ and si_.
>>>>
>>>> si_ is shorter than radeonsi_, but inconsistent with the directory and library name. I still think it's the best option, but no strong opinion from me.
>>>>
>>>> If and when the files are renamed, the next step would be doing the same with the r600_ struct and function names.
>>>>
>>>> Does that sound good? I'll send the patches shortly if so.
>>>>
>>>> Cheers,
>>>> Andreas
[Mesa-dev] [PATCH 5/6] r600g, radeonsi: if discarding the whole buffer range, discard the whole resource instead
From: Marek Olšák marek.ol...@amd.com

Also set the unsynchronized flag if the whole resource was discarded to avoid doing buffer-busy checks again.
---
 src/gallium/drivers/radeon/r600_buffer_common.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c b/src/gallium/drivers/radeon/r600_buffer_common.c
index ac5fbcc..66e9d57 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -205,6 +205,12 @@ static void *r600_buffer_transfer_map(struct pipe_context *ctx,
         usage |= PIPE_TRANSFER_UNSYNCHRONIZED;
     }

+    /* If discarding the entire range, discard the whole resource instead. */
+    if (usage & PIPE_TRANSFER_DISCARD_RANGE &&
+        box->x == 0 && box->width == resource->width0) {
+        usage |= PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE;
+    }
+
     if (usage & PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE &&
         !(usage & PIPE_TRANSFER_UNSYNCHRONIZED)) {
         assert(usage & PIPE_TRANSFER_WRITE);
@@ -214,6 +220,8 @@ static void *r600_buffer_transfer_map(struct pipe_context *ctx,
             rctx->ws->buffer_is_busy(rbuffer->buf, RADEON_USAGE_READWRITE)) {
             rctx->invalidate_buffer(&rctx->b, &rbuffer->b.b);
         }
+        /* At this point, the buffer is always idle. */
+        usage |= PIPE_TRANSFER_UNSYNCHRONIZED;
     }
     else if ((usage & PIPE_TRANSFER_DISCARD_RANGE) &&
              !(usage & PIPE_TRANSFER_UNSYNCHRONIZED) &&
--
1.8.3.2
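The flag handling in the patch above can be modelled in isolation. The following is only a sketch under stated assumptions: the `XFER_*` flags and `promote_discard()` are hypothetical stand-ins for the PIPE_TRANSFER_* flags and the driver's map function, and the step where a real driver would reallocate busy storage is reduced to a comment.

```c
/* Hypothetical stand-ins for the PIPE_TRANSFER_* usage flags. */
enum {
    XFER_WRITE                  = 1 << 0,
    XFER_UNSYNCHRONIZED         = 1 << 1,
    XFER_DISCARD_RANGE          = 1 << 2,
    XFER_DISCARD_WHOLE_RESOURCE = 1 << 3
};

/* Returns the usage flags after applying the two rules from the patch. */
static unsigned promote_discard(unsigned usage, unsigned x, unsigned width,
                                unsigned resource_width)
{
    /* Rule 1: a discarded range covering the whole buffer is promoted to
     * a whole-resource discard. */
    if ((usage & XFER_DISCARD_RANGE) &&
        x == 0 && width == resource_width)
        usage |= XFER_DISCARD_WHOLE_RESOURCE;

    /* Rule 2: after a whole-resource discard the storage is idle (a real
     * driver would have replaced busy storage here), so later busy checks
     * can be skipped by marking the map unsynchronized. */
    if ((usage & XFER_DISCARD_WHOLE_RESOURCE) &&
        !(usage & XFER_UNSYNCHRONIZED))
        usage |= XFER_UNSYNCHRONIZED;

    return usage;
}
```

A partial range, for example x = 16 and width = 32 on a 64-byte buffer, is left untouched by both rules and would fall through to the DISCARD_RANGE path of the real function.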
[Mesa-dev] [PATCH 6/6] radeonsi: handle R600_CONTEXT_PS_PARTIAL_FLUSH in si_emit_cache_flush
From: Marek Olšák marek.ol...@amd.com

For consistency only. This is unused by radeonsi currently.
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index f64b51a..9092fb1 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -680,7 +680,8 @@ void si_emit_cache_flush(struct r600_common_context *rctx, struct r600_atom *ato
         radeon_emit(cs, EVENT_TYPE(V_028A90_FLUSH_AND_INV_DB_META) | EVENT_INDEX(0));
     }

-    if (rctx->flags & R600_CONTEXT_WAIT_3D_IDLE) {
+    if (rctx->flags & (R600_CONTEXT_WAIT_3D_IDLE |
+                       R600_CONTEXT_PS_PARTIAL_FLUSH)) {
         radeon_emit(cs, PKT3(PKT3_EVENT_WRITE, 0, 0));
         radeon_emit(cs, EVENT_TYPE(V_028A90_PS_PARTIAL_FLUSH) | EVENT_INDEX(4));
     } else if (rctx->flags & R600_CONTEXT_STREAMOUT_FLUSH) {
--
1.8.3.2
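The change above relies on the common idiom of testing several flag bits with one mask. A minimal sketch, with hypothetical flag and event names (the event chosen for the streamout branch is an assumption, since the patch context does not show it):

```c
/* Hypothetical context flags, standing in for the R600_CONTEXT_* bits. */
enum {
    CTX_WAIT_3D_IDLE     = 1 << 0,
    CTX_PS_PARTIAL_FLUSH = 1 << 1,
    CTX_STREAMOUT_FLUSH  = 1 << 2
};

/* Hypothetical event identifiers. */
enum flush_event {
    EVENT_NONE,
    EVENT_PS_PARTIAL_FLUSH,
    EVENT_OTHER_FLUSH /* placeholder for whatever the streamout branch emits */
};

static enum flush_event pick_event(unsigned flags)
{
    /* Either flag is enough to select the PS partial flush event. */
    if (flags & (CTX_WAIT_3D_IDLE | CTX_PS_PARTIAL_FLUSH))
        return EVENT_PS_PARTIAL_FLUSH;
    else if (flags & CTX_STREAMOUT_FLUSH)
        return EVENT_OTHER_FLUSH;

    return EVENT_NONE;
}
```

Because the first branch wins, setting both a PS flag and the streamout flag still yields a single PS partial flush, mirroring the if/else-if structure in the diff.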
[Mesa-dev] [PATCH 1/6] vdpau: flush the context after resolving delayed rendering
From: Marek Olšák marek.ol...@amd.com

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73191

When VL uploads vertex buffers, it uses PIPE_TRANSFER_DONTBLOCK, which always flushes the context in the winsys if the buffer being mapped is busy.

Since I added handling of DISCARD_RANGE, DONTBLOCK has had no effect when combined with DISCARD_RANGE and I think the context isn't flushed anywhere else, so no commands are submitted to the GPU until the IB is full, which takes a lot of frames.

Using DISCARD_RANGE is not the only way to trigger this bug. The other way is to reallocate the vertex buffer before every upload.

BTW, I'm not sure if this is the right place for flushing, but it does fix the bug.
---
 src/gallium/state_trackers/vdpau/device.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/state_trackers/vdpau/device.c b/src/gallium/state_trackers/vdpau/device.c
index fb9c68c..4fd6041 100644
--- a/src/gallium/state_trackers/vdpau/device.c
+++ b/src/gallium/state_trackers/vdpau/device.c
@@ -266,6 +266,7 @@ vlVdpResolveDelayedRendering(vlVdpDevice *dev, struct pipe_surface *surface, str
 {
    struct vl_compositor_state *cstate;
    vlVdpOutputSurface *vlsurface;
+   struct pipe_context *pipe = dev->context;

    assert(dev);

@@ -283,6 +284,7 @@ vlVdpResolveDelayedRendering(vlVdpDevice *dev, struct pipe_surface *surface, str
    }

    vl_compositor_render(cstate, &dev->compositor, surface, dirty_area, true);
+   pipe->flush(pipe, NULL, 0);

    dev->delayed_rendering.surface = VDP_INVALID_HANDLE;
    dev->delayed_rendering.cstate = NULL;
--
1.8.3.2
[Mesa-dev] [PATCH 3/6] gallium/hud: just unmap the upload vertex buffer instead of recreating it
From: Marek Olšák marek.ol...@amd.com

---
 src/gallium/auxiliary/hud/hud_context.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/hud/hud_context.c b/src/gallium/auxiliary/hud/hud_context.c
index c4a4f18..465013c 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -479,7 +479,7 @@ hud_draw(struct hud_context *hud, struct pipe_resource *tex)
    }

    /* unmap the uploader's vertex buffer before drawing */
-   u_upload_flush(hud->uploader);
+   u_upload_unmap(hud->uploader);

    /* draw accumulated vertices for background quads */
    cso_set_fragment_shader_handle(hud->cso, hud->fs_color);
--
1.8.3.2
[Mesa-dev] [PATCH 2/6] gallium/vl: use u_upload_mgr to upload vertices for vl_compositor
From: Marek Olšák marek.ol...@amd.com

This is the recommended way for streaming vertices. Always use this if you need to upload vertices every frame.
---
 src/gallium/auxiliary/vl/vl_compositor.c | 51
 src/gallium/auxiliary/vl/vl_compositor.h |  1 +
 2 files changed, 20 insertions(+), 32 deletions(-)

diff --git a/src/gallium/auxiliary/vl/vl_compositor.c b/src/gallium/auxiliary/vl/vl_compositor.c
index 1c8312e..0c8b424 100644
--- a/src/gallium/auxiliary/vl/vl_compositor.c
+++ b/src/gallium/auxiliary/vl/vl_compositor.c
@@ -33,6 +33,7 @@
 #include "util/u_memory.h"
 #include "util/u_draw.h"
 #include "util/u_surface.h"
+#include "util/u_upload_mgr.h"

 #include "tgsi/tgsi_ureg.h"

@@ -498,23 +499,6 @@ static void cleanup_pipe_state(struct vl_compositor *c)
 }

 static bool
-create_vertex_buffer(struct vl_compositor *c)
-{
-   assert(c);
-
-   pipe_resource_reference(&c->vertex_buf.buffer, NULL);
-   c->vertex_buf.buffer = pipe_buffer_create
-   (
-      c->pipe->screen,
-      PIPE_BIND_VERTEX_BUFFER,
-      PIPE_USAGE_STREAM,
-      c->vertex_buf.stride * VL_COMPOSITOR_MAX_LAYERS * 4
-   );
-
-   return c->vertex_buf.buffer != NULL;
-}
-
-static bool
 init_buffers(struct vl_compositor *c)
 {
    struct pipe_vertex_element vertex_elems[3];
@@ -526,7 +510,7 @@ init_buffers(struct vl_compositor *c)
     */
    c->vertex_buf.stride = sizeof(struct vertex2f) + sizeof(struct vertex4f) * 2;
    c->vertex_buf.buffer_offset = 0;
-   create_vertex_buffer(c);
+   c->vertex_buf.buffer = NULL;

    vertex_elems[0].src_offset = 0;
    vertex_elems[0].instance_divisor = 0;
@@ -659,22 +643,15 @@ static void
 gen_vertex_data(struct vl_compositor *c, struct vl_compositor_state *s, struct u_rect *dirty)
 {
    struct vertex2f *vb;
-   struct pipe_transfer *buf_transfer;
    unsigned i;

    assert(c);

-   vb = pipe_buffer_map(c->pipe, c->vertex_buf.buffer,
-                        PIPE_TRANSFER_WRITE | PIPE_TRANSFER_DISCARD_RANGE | PIPE_TRANSFER_DONTBLOCK,
-                        &buf_transfer);
-
-   if (!vb) {
-      // If buffer is still locked from last draw create a new one
-      create_vertex_buffer(c);
-      vb = pipe_buffer_map(c->pipe, c->vertex_buf.buffer,
-                           PIPE_TRANSFER_WRITE | PIPE_TRANSFER_DISCARD_RANGE,
-                           &buf_transfer);
-   }
+   /* Allocate new memory for vertices. */
+   u_upload_alloc(c->upload, 0,
+                  c->vertex_buf.stride * VL_COMPOSITOR_MAX_LAYERS * 4, /* size */
+                  &c->vertex_buf.buffer_offset, &c->vertex_buf.buffer,
+                  (void**)&vb);

    for (i = 0; i < VL_COMPOSITOR_MAX_LAYERS; i++) {
       if (s->used_layers & (1 << i)) {
@@ -705,7 +682,7 @@ gen_vertex_data(struct vl_compositor *c, struct vl_compositor_state *s, struct u
       }
    }

-   pipe_buffer_unmap(c->pipe, buf_transfer);
+   u_upload_unmap(c->upload);
 }

 static void
@@ -802,6 +779,7 @@ vl_compositor_cleanup(struct vl_compositor *c)
 {
    assert(c);

+   u_upload_destroy(c->upload);
    cleanup_buffers(c);
    cleanup_shaders(c);
    cleanup_pipe_state(c);
@@ -1037,15 +1015,24 @@ vl_compositor_init(struct vl_compositor *c, struct pipe_context *pipe)

    c->pipe = pipe;

-   if (!init_pipe_state(c))
+   c->upload = u_upload_create(pipe, 128 * 1024, 4, PIPE_BIND_VERTEX_BUFFER);
+
+   if (!c->upload)
+      return false;
+
+   if (!init_pipe_state(c)) {
+      u_upload_destroy(c->upload);
       return false;
+   }

    if (!init_shaders(c)) {
+      u_upload_destroy(c->upload);
       cleanup_pipe_state(c);
       return false;
    }

    if (!init_buffers(c)) {
+      u_upload_destroy(c->upload);
       cleanup_shaders(c);
       cleanup_pipe_state(c);
       return false;
diff --git a/src/gallium/auxiliary/vl/vl_compositor.h b/src/gallium/auxiliary/vl/vl_compositor.h
index 8e01901..6a60138 100644
--- a/src/gallium/auxiliary/vl/vl_compositor.h
+++ b/src/gallium/auxiliary/vl/vl_compositor.h
@@ -89,6 +89,7 @@ struct vl_compositor_state
 struct vl_compositor
 {
    struct pipe_context *pipe;
+   struct u_upload_mgr *upload;

    struct pipe_framebuffer_state fb_state;
    struct pipe_vertex_buffer vertex_buf;
--
1.8.3.2
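To make the streaming idea behind the patch above more concrete, here is a rough CPU-only sketch of a sub-allocating upload manager. It is an assumption-laden model, not the u_upload_mgr API: `struct upload_mgr`, `upload_init`, `upload_alloc` and `upload_destroy` are hypothetical, plain `malloc` memory stands in for a GPU buffer, and `alignment` is assumed to be non-zero.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical streaming allocator: hands out aligned slices of one large
 * buffer and switches to a fresh buffer when the current one is full. */
struct upload_mgr {
    unsigned char *buf;
    size_t size;
    size_t offset;       /* first unused byte in buf */
    size_t alignment;
    unsigned generation; /* how many times the buffer was replaced */
};

static int upload_init(struct upload_mgr *u, size_t size, size_t alignment)
{
    u->buf = malloc(size);
    u->size = size;
    u->offset = 0;
    u->alignment = alignment;
    u->generation = 0;
    return u->buf != NULL;
}

/* Returns a pointer to `bytes` of fresh memory and stores its offset within
 * the current buffer in *out_offset; returns NULL on failure. */
static void *upload_alloc(struct upload_mgr *u, size_t bytes, size_t *out_offset)
{
    size_t start = (u->offset + u->alignment - 1) / u->alignment * u->alignment;

    if (bytes > u->size)
        return NULL;

    if (start + bytes > u->size) {
        /* Out of space: take a new buffer instead of waiting for the old
         * one. A real driver would drop a reference rather than free,
         * because the GPU may still be reading the old buffer. */
        unsigned char *fresh = malloc(u->size);
        if (!fresh)
            return NULL;
        free(u->buf);
        u->buf = fresh;
        u->generation++;
        start = 0;
    }

    u->offset = start + bytes;
    *out_offset = start;
    return u->buf + start;
}

static void upload_destroy(struct upload_mgr *u)
{
    free(u->buf);
    u->buf = NULL;
}
```

The design point is that each allocation returns memory nobody else is using yet, so the caller never has to map with a discard flag or wait for earlier draws that read from previous slices.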
[Mesa-dev] [PATCH 4/6] gallium/u_upload_mgr: don't expose u_upload_flush
From: Marek Olšák marek.ol...@amd.com

It's unused and shouldn't be used at all in my opinion. If some driver doesn't support the unsynchronized flag, u_upload_mgr should avoid the synchronization by other means, e.g. by using the DONTBLOCK flag.
---
 src/gallium/auxiliary/util/u_upload_mgr.c | 16 ++++------------
 src/gallium/auxiliary/util/u_upload_mgr.h | 10 ----------
 2 files changed, 4 insertions(+), 22 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_upload_mgr.c b/src/gallium/auxiliary/util/u_upload_mgr.c
index 6859751..7349d00 100644
--- a/src/gallium/auxiliary/util/u_upload_mgr.c
+++ b/src/gallium/auxiliary/util/u_upload_mgr.c
@@ -87,16 +87,8 @@ void u_upload_unmap( struct u_upload_mgr *upload )
    }
 }

-/* Release old buffer.
- *
- * This must usually be called prior to firing the command stream
- * which references the upload buffer, as many memory managers will
- * cause subsequent maps of a fired buffer to wait.
- *
- * Can improve this with a change to pipe_buffer_write to use the
- * DONT_WAIT bit, but for now, it's easiest just to grab a new buffer.
- */
-void u_upload_flush( struct u_upload_mgr *upload )
+
+static void u_upload_release_buffer(struct u_upload_mgr *upload)
 {
    /* Unmap and unreference the upload buffer. */
    u_upload_unmap(upload);
@@ -107,7 +99,7 @@ void u_upload_flush( struct u_upload_mgr *upload )

 void u_upload_destroy( struct u_upload_mgr *upload )
 {
-   u_upload_flush( upload );
+   u_upload_release_buffer( upload );
    FREE( upload );
 }

@@ -120,7 +112,7 @@ u_upload_alloc_buffer( struct u_upload_mgr *upload,

    /* Release the old buffer, if present:
     */
-   u_upload_flush( upload );
+   u_upload_release_buffer( upload );

    /* Allocate a new one:
     */
diff --git a/src/gallium/auxiliary/util/u_upload_mgr.h b/src/gallium/auxiliary/util/u_upload_mgr.h
index 82215a5..63bf30e 100644
--- a/src/gallium/auxiliary/util/u_upload_mgr.h
+++ b/src/gallium/auxiliary/util/u_upload_mgr.h
@@ -57,16 +57,6 @@ struct u_upload_mgr *u_upload_create( struct pipe_context *pipe,
 void u_upload_destroy( struct u_upload_mgr *upload );

 /**
- * Unmap and release old upload buffer.
- *
- * This is like u_upload_unmap() except the upload buffer is released for
- * recycling. This should be called on real hardware flushes on systems
- * that don't support the PIPE_TRANSFER_UNSYNCHRONIZED flag, as otherwise
- * the next u_upload_buffer will cause a sync on the buffer.
- */
-void u_upload_flush( struct u_upload_mgr *upload );
-
-/**
  * Unmap upload buffer
  *
  * \param upload Upload manager
--
1.8.3.2
[Mesa-dev] [PATCH] glx: Add missing null check in glXCreateContextAttribsARB
Signed-off-by: Juha-Pekka Heikkila juhapekka.heikk...@gmail.com
---
 src/glx/create_context.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/glx/create_context.c b/src/glx/create_context.c
index 38e949a..b15921f 100644
--- a/src/glx/create_context.c
+++ b/src/glx/create_context.c
@@ -90,6 +90,9 @@ glXCreateContextAttribsARB(Display *dpy, GLXFBConfig config,
 #endif
    }

+   if (gc == NULL)
+      return NULL;
+
    gc->xid = xcb_generate_id(c);
    gc->share_xid = (share != NULL) ? share->xid : 0;

--
1.8.1.2
Re: [Mesa-dev] [PATCH 2/6] gallium/vl: use u_upload_mgr to upload vertices for vl_compositor
On 13.01.2014 14:13, Marek Olšák wrote:
> From: Marek Olšák marek.ol...@amd.com

This patch is:

Reviewed-by: Christian König christian.koe...@amd.com

> This is the recommended way for streaming vertices. Always use this if you need to upload vertices every frame.
> ---
>  src/gallium/auxiliary/vl/vl_compositor.c | 51
>  src/gallium/auxiliary/vl/vl_compositor.h |  1 +
>  2 files changed, 20 insertions(+), 32 deletions(-)
>
> [snip]
[Mesa-dev] [PATCH] mesa-demos: Fixes a bug in demo2 application
From: Yasir-Khan yasir_k...@mentor.com

The vertices array is being passed to glColorPointer, whereas it's supposed to be passed the color array.

Signed-off-by: Yasir-Khan yasir_k...@mentor.com

diff --git a/src/egl/opengl/demo2.c b/src/egl/opengl/demo2.c
index 71a1a31..505b474 100644
--- a/src/egl/opengl/demo2.c
+++ b/src/egl/opengl/demo2.c
@@ -35,7 +35,7 @@ static void _subset_Rectf(GLfloat x1, GLfloat y1, GLfloat x2, GLfloat y2,
    }

    glVertexPointer(2, GL_FLOAT, 0, v);
-   glColorPointer(4, GL_FLOAT, 0, v);
+   glColorPointer(4, GL_FLOAT, 0, c);

    glEnableClientState(GL_VERTEX_ARRAY);
    glEnableClientState(GL_COLOR_ARRAY);
[Mesa-dev] [PATCH] vdpau: flush the context before exporting the surface v2
From: Marek Olšák marek.ol...@amd.com

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73191

When VL uploads vertex buffers, it uses PIPE_TRANSFER_DONTBLOCK, which always flushes the context in the winsys if the buffer being mapped is busy.

Since I added handling of DISCARD_RANGE, DONTBLOCK has had no effect when combined with DISCARD_RANGE and I think the context isn't flushed anywhere else, so no commands are submitted to the GPU until the IB is full, which takes a lot of frames.

Using DISCARD_RANGE is not the only way to trigger this bug. The other way is to reallocate the vertex buffer before every upload.

BTW, I'm not sure if this is the right place for flushing, but it does fix the bug.

v2 (chk): move the flush to the right place.

Signed-off-by: Christian König christian.koe...@amd.com
---
 src/gallium/state_trackers/vdpau/output.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/state_trackers/vdpau/output.c b/src/gallium/state_trackers/vdpau/output.c
index e4e1433..7ff4196 100644
--- a/src/gallium/state_trackers/vdpau/output.c
+++ b/src/gallium/state_trackers/vdpau/output.c
@@ -736,6 +736,7 @@ struct pipe_resource *vlVdpOutputSurfaceGallium(VdpOutputSurface surface)

    pipe_mutex_lock(vlsurface->device->mutex);
    vlVdpResolveDelayedRendering(vlsurface->device, NULL, NULL);
+   vlsurface->device->context->flush(vlsurface->device->context, NULL, 0);
    pipe_mutex_unlock(vlsurface->device->mutex);

    return vlsurface->surface->texture;
--
1.8.1.2
Re: [Mesa-dev] [PATCH] Mark debug_print with __attribute__ ((format(__printf__, 1, 0)))
Thierry Reding thierry.red...@gmail.com writes:

> While at it, perhaps the drmMsg() and drmDebugPrint() functions should be similarly annotated as well?

I don't know; I'm just fixing X server warnings this week and this was the source of one of them. Additional warning fixes for drm would be a great idea!

--
keith.pack...@intel.com
Re: [Mesa-dev] [PATCH] vdpau: flush the context before exporting the surface v2
This patch doesn't fix the bug. :(

Marek

On Mon, Jan 13, 2014 at 2:55 PM, Christian König deathsim...@vodafone.de wrote:
> [snip]
Re: [Mesa-dev] [PATCH] st/egl: Flush resources before presentation
Pushed.

Marek

On Tue, Jan 7, 2014 at 11:20 PM, Martin Andersson g02ma...@gmail.com wrote:
> Hi Marek,
>
> Since it seems no one else has any comments on this, maybe you could commit it for me?
>
> //Martin
>
> On Thu, Dec 26, 2013 at 1:15 PM, Marek Olšák mar...@gmail.com wrote:
>> Reviewed-by: Marek Olšák marek.ol...@amd.com
>>
>> Marek
>>
>> On Thu, Dec 26, 2013 at 10:33 AM, Martin Andersson g02ma...@gmail.com wrote:
>>> Fixes wayland regression on r600g due to fast clear introduced by commit edbbfac6.
>>> ---
>>>  src/gallium/state_trackers/egl/common/native_helper.c   | 15 +++++++++++++++
>>>  src/gallium/state_trackers/egl/common/native_helper.h   |  5 +++++
>>>  src/gallium/state_trackers/egl/wayland/native_wayland.c |  4 ++++
>>>  3 files changed, 24 insertions(+)
>>>
>>> diff --git a/src/gallium/state_trackers/egl/common/native_helper.c b/src/gallium/state_trackers/egl/common/native_helper.c
>>> index 4a77a50..856cbb6 100644
>>> --- a/src/gallium/state_trackers/egl/common/native_helper.c
>>> +++ b/src/gallium/state_trackers/egl/common/native_helper.c
>>> @@ -341,6 +341,21 @@ resource_surface_throttle(struct resource_surface *rsurf)
>>>  }
>>>
>>>  boolean
>>> +resource_surface_flush_resource(struct resource_surface *rsurf,
>>> +                                struct native_display *ndpy,
>>> +                                enum native_attachment which)
>>> +{
>>> +   struct pipe_context *pipe = ndpy_get_copy_context(ndpy);
>>> +
>>> +   if (!pipe)
>>> +      return FALSE;
>>> +
>>> +   pipe->flush_resource(pipe, rsurf->resources[which]);
>>> +
>>> +   return TRUE;
>>> +}
>>> +
>>> +boolean
>>>  resource_surface_flush(struct resource_surface *rsurf,
>>>                         struct native_display *ndpy)
>>>  {
>>> diff --git a/src/gallium/state_trackers/egl/common/native_helper.h b/src/gallium/state_trackers/egl/common/native_helper.h
>>> index 4c369a7..0b53b28 100644
>>> --- a/src/gallium/state_trackers/egl/common/native_helper.h
>>> +++ b/src/gallium/state_trackers/egl/common/native_helper.h
>>> @@ -91,6 +91,11 @@ resource_surface_copy_swap(struct resource_surface *rsurf,
>>>  boolean
>>>  resource_surface_throttle(struct resource_surface *rsurf);
>>>
>>> +boolean
>>> +resource_surface_flush_resource(struct resource_surface *rsurf,
>>> +                                struct native_display *ndpy,
>>> +                                enum native_attachment which);
>>> +
>>>  /**
>>>   * Flush pending rendering using the copy context. This function saves a
>>>   * marker for upcoming throttles.
>>> diff --git a/src/gallium/state_trackers/egl/wayland/native_wayland.c b/src/gallium/state_trackers/egl/wayland/native_wayland.c
>>> index cfdf4f8..0ab4be6 100644
>>> --- a/src/gallium/state_trackers/egl/wayland/native_wayland.c
>>> +++ b/src/gallium/state_trackers/egl/wayland/native_wayland.c
>>> @@ -259,6 +259,10 @@ wayland_surface_swap_buffers(struct native_surface *nsurf)
>>>     if (ret == -1)
>>>        return EGL_FALSE;
>>>
>>> +   (void) resource_surface_flush_resource(surface->rsurf, &display->base,
>>> +                                          NATIVE_ATTACHMENT_BACK_LEFT);
>>> +   (void) resource_surface_flush(surface->rsurf, &display->base);
>>> +
>>>     surface->frame_callback = wl_surface_frame(surface->win->surface);
>>>     wl_callback_add_listener(surface->frame_callback, &frame_listener, surface);
>>>     wl_proxy_set_queue((struct wl_proxy *) surface->frame_callback,
>>> --
>>> 1.8.5.1
Re: [Mesa-dev] [PATCH] vdpau: flush the context before exporting the surface v2
Yeah, probably because XBMC still (incorrectly) calls the map function only once.

Putting the flush into vlVdpResolveDelayedRendering solves the problem because it's called the next time somebody starts rendering, but it's way too late at this point.

Need to sync up with the XBMC devs on this.

Christian.

On 13.01.2014 15:20, Marek Olšák wrote:
> This patch doesn't fix the bug. :(
>
> Marek
>
> On Mon, Jan 13, 2014 at 2:55 PM, Christian König deathsim...@vodafone.de wrote:
>> [snip]
[Mesa-dev] [Bug 72895] Missing trees in flightgear 2.12.1 with r600 driver and mesa 10.0.1
https://bugs.freedesktop.org/show_bug.cgi?id=72895 --- Comment #7 from Barto mister.free...@laposte.net --- Does anyone need more information about this bug? It is still present with the new Mesa 10.0.2. I did a bisect, and the bug begins with commit 59b01ca252bd6706f08cd80a864819d71dfe741c. I can run further tests, but I need some help because I'm not a specialist in 3D programming. -- You are receiving this mail because: You are the assignee for the bug.
[Mesa-dev] [Bug 72895] Missing trees in flightgear 2.12.1 with r600 driver and mesa 10.0.1
https://bugs.freedesktop.org/show_bug.cgi?id=72895 Alex Deucher ag...@yahoo.com changed: What: CC | Removed: (none) | Added: e...@anholt.net -- You are receiving this mail because: You are the assignee for the bug.
[Mesa-dev] [Bug 72895] Missing trees in flightgear 2.12.1 with r600 driver and mesa 10.0.1
https://bugs.freedesktop.org/show_bug.cgi?id=72895 Igor Gnatenko i.gnatenko.br...@gmail.com changed: What: CC | Removed: (none) | Added: i.gnatenko.br...@gmail.com -- You are receiving this mail because: You are the assignee for the bug.
Re: [Mesa-dev] [PATCH] Use AC_PATH_TOOL instead of AC_PATH_PROG for llvm-config.
On Sat, Dec 28, 2013 at 03:22:09PM +0100, Michał Górny wrote: This should help with cross-compiling and multilib when $CHOST-specific llvm-config is expected rather than build host default one. It will help us a bit in Gentoo where we've started using i686-pc-linux-gnu-llvm-config for 32-bit multilib LLVM. Reviewed-by: Tom Stellard thomas.stell...@amd.com Should we CC stable on this patch? Do you have commit access? -Tom Signed-off-by: Michał Górny mgo...@gentoo.org Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=73100 --- configure.ac | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/configure.ac b/configure.ac index f75325d..1d68547 100644 --- a/configure.ac +++ b/configure.ac @@ -1567,9 +1567,9 @@ if test x$enable_gallium_llvm = xauto; then fi if test x$enable_gallium_llvm = xyes; then if test x$llvm_prefix != x; then -AC_PATH_PROG([LLVM_CONFIG], [llvm-config], [no], [$llvm_prefix/bin]) +AC_PATH_TOOL([LLVM_CONFIG], [llvm-config], [no], [$llvm_prefix/bin]) else -AC_PATH_PROG([LLVM_CONFIG], [llvm-config], [no]) +AC_PATH_TOOL([LLVM_CONFIG], [llvm-config], [no]) fi if test x$LLVM_CONFIG != xno; then -- 1.8.5.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/5] r300g/compiler/tests: Remove an unused variable
On Mon, Jan 06, 2014 at 11:47:39AM +0200, Lauri Kasanen wrote: On Sun, 5 Jan 2014 18:51:18 -0800 Tom Stellard t...@stellard.net wrote: struct rc_test_file test_file; + struct rc_instruction *inst; unsigned optimizations = 1; unsigned do_full_regalloc = 1; - struct rc_instruction *inst; unsigned pass = 1; This doesn't do what the title says. Thanks for spotting this; I will drop the patch. -Tom - Lauri
Re: [Mesa-dev] [PATCH 5/8] glsl: Statically cast parameter exec_node to ir_variable.
On 01/11/2014 02:37 AM, Kenneth Graunke wrote: Formal function parameters are always ir_variable objects, not an arbitrary ir_instruction. So there's no need to dynamically cast here. ...especially since we never bother to check that as_variable doesn't return NULL. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/builtin_functions.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp index 5b8463a..662ff4c 100644 --- a/src/glsl/builtin_functions.cpp +++ b/src/glsl/builtin_functions.cpp @@ -2399,7 +2399,7 @@ builtin_builder::call(ir_function *f, ir_variable *ret, exec_list params) exec_list actual_params; foreach_list(node, params) { - ir_variable *var = ((ir_instruction *) node)->as_variable(); + ir_variable *var = (ir_variable *) node; actual_params.push_tail(var_ref(var)); }
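The argument in the commit message (a checked downcast whose NULL result is never tested gives no more safety than a plain cast) can be illustrated with a small sketch. The types `instruction` and `variable` and the helpers `id_checked` and `id_static` are hypothetical stand-ins, not Mesa's ir_instruction / ir_variable, and the sketch assumes the invariant stated in the patch: every node in the parameter list really is a variable.

```cpp
#include <assert.h>
#include <stddef.h>

struct variable;

// Hypothetical stand-in for a base IR node with a checked downcast.
struct instruction {
   virtual ~instruction() {}
   // Returns NULL unless the object really is a variable.
   virtual variable *as_variable() { return NULL; }
};

// Hypothetical stand-in for the derived "variable" node.
struct variable : instruction {
   explicit variable(int id) : id(id) {}
   virtual variable *as_variable() { return this; }
   int id;
};

// Checked form: only worth its cost if the result is actually tested.
inline int id_checked(instruction *inst)
{
   variable *var = inst->as_variable();
   assert(var != NULL);   // without this test the checked cast buys nothing
   return var->id;
}

// Unchecked form: valid only while "every parameter is a variable" holds.
inline int id_static(instruction *inst)
{
   variable *var = static_cast<variable *>(inst);
   return var->id;
}
```

Both helpers return the same value for a genuine variable; they differ only in what happens when the invariant is broken, which is the case the original code never handled either.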
[Mesa-dev] [Bug 73512] [clover] mesa.icd. should contain full path
https://bugs.freedesktop.org/show_bug.cgi?id=73512 --- Comment #4 from Tom Stellard tstel...@gmail.com --- According to the icd spec: http://www.khronos.org/registry/cl/extensions/khr/cl_khr_icd.txt The vendors directory must go in /etc/OpenCL and also only the library name is included in the *.icd file, not the full path, so I don't think this patch is correct. What problem does this patch fix? -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 7/8] glsl: Replace iterators in ir_reader.cpp with ad-hoc list walking.
On 01/11/2014 02:37 AM, Kenneth Graunke wrote: These can't use foreach_list since they want to skip over the first few list elements. Just doing the ad-hoc list walking isn't too bad. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/ir_reader.cpp | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/src/glsl/ir_reader.cpp b/src/glsl/ir_reader.cpp index f5185d2..28923f3 100644 --- a/src/glsl/ir_reader.cpp +++ b/src/glsl/ir_reader.cpp @@ -205,11 +205,12 @@ ir_reader::read_function(s_expression *expr, bool skip_body) assert(added); } - exec_list_iterator it = ((s_list *) expr)->subexpressions.iterator(); - it.next(); // skip function tag - it.next(); // skip function name - for (/* nothing */; it.has_next(); it.next()) { - s_expression *s_sig = (s_expression *) it.get(); + /* Skip over function tag and function name (which are guaranteed to be +* present by the above PARTIAL_MATCH call). +*/ + exec_node *node = ((s_list *) expr)->subexpressions.head->next->next; + for (/* nothing */; !node->is_tail_sentinel(); node = node->next) { + s_expression *s_sig = (s_expression *) node; This won't behave the same in the (bug) case that the list has too few elements. If the list is empty or has only one element, there will be a NULL deref here somewhere. I believe the iterator version was safe against this. Do we have some pre-existing guarantee that the list has enough elements? read_function_sig(f, s_sig, skip_body); } return added ? f : NULL; @@ -249,9 +250,10 @@ ir_reader::read_function_sig(ir_function *f, s_expression *expr, bool skip_body) exec_list hir_parameters; state->symbols->push_scope(); - exec_list_iterator it = paramlist->subexpressions.iterator(); - for (it.next() /* skip parameters */; it.has_next(); it.next()) { - ir_variable *var = read_declaration((s_expression *) it.get()); + /* Skip over the parameters tag. 
*/ + exec_node *node = paramlist->subexpressions.head->next; + for (/* nothing */; !node->is_tail_sentinel(); node = node->next) { + ir_variable *var = read_declaration((s_expression *) node); if (var == NULL) return;
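The concern raised in the review (chaining `->next->next` on a list that is too short walks off the tail sentinel) can be sketched with a minimal list. Everything here is an assumption made for illustration: `node`, `list`, `skip` and `sum_after` are hypothetical, and the layout only imitates the general idea of a tail sentinel whose `next` is NULL; it is not Mesa's exec_list.

```cpp
#include <stddef.h>

// Hypothetical node: real nodes always have a non-NULL next,
// the tail sentinel is the only node whose next is NULL.
struct node {
   node *next;
   int value;
   node() : next(NULL), value(0) {}
   bool is_tail_sentinel() const { return next == NULL; }
};

// Hypothetical list with an embedded tail sentinel.
struct list {
   node *head;   // first real node, or &tail when the list is empty
   node tail;    // sentinel
   node *last;   // last real node, NULL when the list is empty
   list() : head(&tail), last(NULL) {}
   void push_tail(node *n)
   {
      n->next = &tail;
      if (last)
         last->next = n;
      else
         head = n;
      last = n;
   }
};

// Advance past up to `count` nodes without ever stepping beyond the sentinel.
inline node *skip(node *n, unsigned count)
{
   while (count > 0 && !n->is_tail_sentinel()) {
      n = n->next;
      count--;
   }
   return n;
}

// Sum every element after the first `count` elements.
inline int sum_after(list &l, unsigned count)
{
   int total = 0;
   for (node *n = skip(l.head, count); !n->is_tail_sentinel(); n = n->next)
      total += n->value;
   return total;
}
```

With this shape, an empty or one-element list simply yields an empty walk. Whether the extra checks are worth it depends on whether an earlier step (such as the PARTIAL_MATCH call mentioned in the patch comment) already guarantees the minimum length.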
Re: [Mesa-dev] Removing exec_list iterators
On 01/11/2014 02:37 AM, Kenneth Graunke wrote: Hello, Here's a long overdue cleanup: removing exec_list_iterator and such. Should be fairly easy to review. I ran Piglit on i965, swrast (which uses ir_to_mesa), and softpipe (which uses st_glsl_to_tgsi). Nothing changed. Patches 1 - 5 and 8 are, as-is, Reviewed-by: Ian Romanick ian.d.roman...@intel.com I sent some feedback on 6 and 7. --Ken ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 6/8] glsl: Use a new foreach_list2 macro for walking two lists at once.
On 01/11/2014 02:37 AM, Kenneth Graunke wrote: When handling function calls, we often want to walk through the list of formal parameters and list of actual parameters at the same time. (Both are guaranteed to be the same length.) Previously, we used a pattern of: exec_list_iterator 1st_iter = 1st list.iterator(); foreach_iter(exec_list_iterator, 2nd_iter, 2nd list) { ... 1st_iter.next(); } This was a bit awkward, since you had to manually iterate through one of the two lists. a bit lol. This patch introduces a foreach_list2 macro which safely walks through two lists at the same time, so you can simply do: foreach_list2(1st_node, 1st list, 2nd_node, 2nd list) { ... } My only suggestion might be to change the name to foreach_two_lists. I think it's more obvious to someone reading the header file looking for utility macros. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/ast_function.cpp | 16 -- src/glsl/ir.cpp| 12 +++--- src/glsl/linker.cpp| 9 src/glsl/list.h| 16 ++ src/glsl/opt_constant_folding.cpp | 9 src/glsl/opt_constant_propagation.cpp | 9 src/glsl/opt_constant_variable.cpp | 9 src/glsl/opt_copy_propagation.cpp | 9 src/glsl/opt_copy_propagation_elements.cpp | 9 src/glsl/opt_function_inlining.cpp | 35 -- src/glsl/opt_tree_grafting.cpp | 10 - src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 22 +++ 12 files changed, 73 insertions(+), 92 deletions(-) diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp index e4c0fd1..9a9bb74 100644 --- a/src/glsl/ast_function.cpp +++ b/src/glsl/ast_function.cpp @@ -293,15 +293,10 @@ generate_call(exec_list *instructions, ir_function_signature *sig, * call takes place. Since we haven't emitted the call yet, we'll place * the post-call conversions in a temporary exec_list, and emit them later. 
*/ - exec_list_iterator actual_iter = actual_parameters->iterator(); - exec_list_iterator formal_iter = sig->parameters.iterator(); - - while (actual_iter.has_next()) { - ir_rvalue *actual = (ir_rvalue *) actual_iter.get(); - ir_variable *formal = (ir_variable *) formal_iter.get(); - - assert(actual != NULL); - assert(formal != NULL); + foreach_list2(formal_node, sig->parameters, + actual_node, actual_parameters) { + ir_rvalue *actual = (ir_rvalue *) actual_node; + ir_variable *formal = (ir_variable *) formal_node; The old code asserts when the lists aren't the same length... or at least when sig->parameters is shorter than actual_parameters. As do the loops in st_glsl_to_tgsi.cpp. I think a debug-build version of foreach_list2 could do the same... I'm just waffling whether there's sufficient value to make it worth doing. Opinions? if (formal->type->is_numeric() || formal->type->is_boolean()) { switch (formal->data.mode) { @@ -323,9 +318,6 @@ generate_call(exec_list *instructions, ir_function_signature *sig, break; } } - - actual_iter.next(); - formal_iter.next(); } /* If the function call is a constant expression, don't generate any diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp index 6ffa987..dcde631 100644 --- a/src/glsl/ir.cpp +++ b/src/glsl/ir.cpp @@ -1649,13 +1649,10 @@ modes_match(unsigned a, unsigned b) const char * ir_function_signature::qualifiers_match(exec_list *params) { - exec_list_iterator iter_a = parameters.iterator(); - exec_list_iterator iter_b = params->iterator(); - /* check that the qualifiers match. 
*/ - while (iter_a.has_next()) { - ir_variable *a = (ir_variable *)iter_a.get(); - ir_variable *b = (ir_variable *)iter_b.get(); + foreach_list2(a_node, this->parameters, b_node, params) { + ir_variable *a = (ir_variable *) a_node; + ir_variable *b = (ir_variable *) b_node; if (a->data.read_only != b->data.read_only || !modes_match(a->data.mode, b->data.mode) || @@ -1666,9 +1663,6 @@ ir_function_signature::qualifiers_match(exec_list *params) /* parameter a's qualifiers don't match */ return a->name; } - - iter_a.next(); - iter_b.next(); } return NULL; } diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index 14e2ff6..7c25031 100644 --- a/src/glsl/linker.cpp +++ b/src/glsl/linker.cpp @@ -109,10 +109,10 @@ public: virtual ir_visitor_status visit_enter(ir_call *ir) { - exec_list_iterator sig_iter = ir->callee->parameters.iterator(); - foreach_iter(exec_list_iterator, iter, *ir) { - ir_rvalue *param_rval = (ir_rvalue *)iter.get(); - ir_variable
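The lockstep walk described in this thread, together with the debug-build length check floated in the review, might look roughly like the sketch below. The macro `FOREACH_TWO_LISTS`, the `node` struct and the helpers `make_chain` and `dot` are all hypothetical names chosen for illustration; the actual macro added to src/glsl/list.h is not reproduced here.

```cpp
#include <assert.h>
#include <stddef.h>

// Hypothetical node: a chain ends at a sentinel whose next is NULL.
struct node {
   node *next;
   int value;
};

// Hypothetical macro: advance through two sentinel-terminated chains at once,
// stopping as soon as either one reaches its sentinel.
#define FOREACH_TWO_LISTS(n1, head1, n2, head2)                 \
   for (node *n1 = (head1), *n2 = (head2);                      \
        n1->next != NULL && n2->next != NULL;                   \
        n1 = n1->next, n2 = n2->next)

// Link `count` value nodes plus one sentinel inside `storage`
// (which must hold count + 1 entries) and return the first node.
inline node *make_chain(node *storage, const int *values, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      storage[i].value = values[i];
      storage[i].next = &storage[i + 1];
   }
   storage[count].value = 0;
   storage[count].next = NULL;   // sentinel
   return &storage[0];
}

// Dot product of two chains that are expected to have equal length.
inline int dot(node *a, node *b)
{
   int total = 0;
   node *end_a = a, *end_b = b;
   FOREACH_TWO_LISTS(na, a, nb, b) {
      total += na->value * nb->value;
      end_a = na->next;
      end_b = nb->next;
   }
   // Debug-build check: both walks must have stopped on their sentinel,
   // i.e. the lists had the same length.
   assert(end_a->next == NULL && end_b->next == NULL);
   return total;
}
```

The assertion lives in the caller here to keep the macro a plain `for` header; folding the check into the macro itself would be a separate design choice.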
Re: [Mesa-dev] GPU lockup CP stall when calling clBuildProgram on Cayman
On Thu, Jan 09, 2014 at 02:57:20PM +, christophe choquet wrote: Hi, I am using kernel 3.12.6-gentoo, Mesa 10.0.1 and once every two calls to clBuildProgram, the GPU goes to reset after 10 seconds. This also happens on Debian unstable with Mesa 9.2. First hello_world works, the next one hangs, third works, and so on. Despite this hang on this particular OpenCL call, everything is just fine. I tried to comment out DMA flushing code in r600/r600_hw_context.c, but this issue does not look like the one that was discovered on R600 HW. After the hang, opencl_examples/hello_world returns the correct value (when the machine does not hang completely, which happens sometimes). Same behaviour for get-global-id test program. This is likely the same issue as https://bugs.freedesktop.org/show_bug.cgi?id=73418 Are you running the OpenCL programs with or without X? Can you reply in the comments of the bug? Thanks, Tom Here are my config logs: lspci: 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cayman PRO [Radeon HD 6950] dmesg: [ 826.250105] radeon :01:00.0: GPU lockup CP stall for more than 1msec [ 826.250110] radeon :01:00.0: GPU lockup (waiting for 0x37bc last fence id 0x37ba) [ 826.250118] [drm] Disabling audio 0 support [ 826.257466] radeon :01:00.0: Saved 111 dwords of commands on ring 0. 
[ 826.257496] radeon :01:00.0: GPU softreset: 0x0008 [ 826.257498] radeon :01:00.0: GRBM_STATUS = 0xB0001828 [ 826.257500] radeon :01:00.0: GRBM_STATUS_SE0 = 0x0003 [ 826.257502] radeon :01:00.0: GRBM_STATUS_SE1 = 0x0003 [ 826.257504] radeon :01:00.0: SRBM_STATUS = 0x20C0 [ 826.257526] radeon :01:00.0: SRBM_STATUS2 = 0x [ 826.257528] radeon :01:00.0: R_008674_CP_STALLED_STAT1 = 0x [ 826.257529] radeon :01:00.0: R_008678_CP_STALLED_STAT2 = 0x4000 [ 826.257531] radeon :01:00.0: R_00867C_CP_BUSY_STAT = 0x00010006 [ 826.257533] radeon :01:00.0: R_008680_CP_STAT = 0x80228647 [ 826.257535] radeon :01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 [ 826.257537] radeon :01:00.0: R_00D834_DMA_STATUS_REG = 0x44C83D57 [ 826.257539] radeon :01:00.0: VM_CONTEXT0_PROTECTION_FAULT_ADDR 0x [ 826.257541] radeon :01:00.0: VM_CONTEXT0_PROTECTION_FAULT_STATUS 0x [ 826.257542] radeon :01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x [ 826.257544] radeon :01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x [ 826.264350] radeon :01:00.0: GRBM_SOFT_RESET=0x4001 [ 826.264403] radeon :01:00.0: SRBM_SOFT_RESET=0x0100 [ 826.265558] radeon :01:00.0: GRBM_STATUS = 0x1828 [ 826.265560] radeon :01:00.0: GRBM_STATUS_SE0 = 0x0003 [ 826.265561] radeon :01:00.0: GRBM_STATUS_SE1 = 0x0003 [ 826.265563] radeon :01:00.0: SRBM_STATUS = 0x20C0 [ 826.265585] radeon :01:00.0: SRBM_STATUS2 = 0x [ 826.265587] radeon :01:00.0: R_008674_CP_STALLED_STAT1 = 0x [ 826.265589] radeon :01:00.0: R_008678_CP_STALLED_STAT2 = 0x [ 826.265590] radeon :01:00.0: R_00867C_CP_BUSY_STAT = 0x [ 826.265592] radeon :01:00.0: R_008680_CP_STAT = 0x [ 826.265594] radeon :01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 [ 826.265596] radeon :01:00.0: R_00D834_DMA_STATUS_REG = 0x44C83D57 [ 826.265623] radeon :01:00.0: GPU reset succeeded, trying to resume [ 826.283559] [drm] PCIE gen 2 link speeds already enabled [ 826.285981] [drm] PCIE GART of 1024M enabled (table at 0x00273000). 
[ 826.286049] radeon :01:00.0: WB enabled [ 826.286051] radeon :01:00.0: fence driver on ring 0 use gpu addr 0x8c00 and cpu addr 0x8800cbaa3c00 .. On hello_world.c program hangs every two calls at line: error = clBuildProgram(program, 1, /* Number of devices */ device_id, NULL, /* options */ NULL, /* callback function when compile is complete */ NULL); /* user data for callback */ Thanks for your help, Regards ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] Mark debug_print with __attribute__ ((format(__printf__, 1, 0)))
Reviewed-by: Ian Romanick ian.d.roman...@intel.com On 01/12/2014 10:34 AM, Keith Packard wrote: the drmServerInfo member, debug_print, takes a printf format string and varargs list. Tell the compiler about it. Signed-off-by: Keith Packard kei...@keithp.com --- xf86drm.h | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/xf86drm.h b/xf86drm.h index 1e763a3..5e170f8 100644 --- a/xf86drm.h +++ b/xf86drm.h @@ -92,8 +92,14 @@ extern "C" { typedef unsigned int drmSize, *drmSizePtr; /** For mapped regions */ typedef void *drmAddress, **drmAddressPtr; /** For mapped regions */ +#if (__GNUC__ >= 3) +#define DRM_PRINTFLIKE(f, a) __attribute__ ((format(__printf__, f, a))) +#else +#define DRM_PRINTFLIKE(f, a) +#endif + typedef struct _drmServerInfo { - int (*debug_print)(const char *format, va_list ap); + int (*debug_print)(const char *format, va_list ap) DRM_PRINTFLIKE(1,0); int (*load_module)(const char *name); void (*get_perms)(gid_t *, mode_t *); } drmServerInfo, *drmServerInfoPtr;
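For readers unfamiliar with the second index in the GCC `format` attribute, here is a minimal sketch of the two flavours. The names `PRINTFLIKE`, `log_v` and `log_f` are hypothetical and are not part of libdrm; only the attribute syntax and `vsnprintf` are real. The assumption is a GCC-compatible compiler; elsewhere the macro expands to nothing.

```cpp
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#if defined(__GNUC__) && (__GNUC__ >= 3)
#define PRINTFLIKE(f, a) __attribute__((format(__printf__, f, a)))
#else
#define PRINTFLIKE(f, a)
#endif

// va_list flavour: there are no variadic arguments for the compiler to
// match against the format, so the "first argument to check" index is 0.
int log_v(char *buf, size_t size, const char *format, va_list ap) PRINTFLIKE(3, 0);

// Variadic flavour: the format is parameter 3 and the arguments to check
// start at parameter 4.
int log_f(char *buf, size_t size, const char *format, ...) PRINTFLIKE(3, 4);

int log_v(char *buf, size_t size, const char *format, va_list ap)
{
   assert(format != NULL);
   return vsnprintf(buf, size, format, ap);
}

int log_f(char *buf, size_t size, const char *format, ...)
{
   va_list ap;
   va_start(ap, format);
   int n = log_v(buf, size, format, ap);
   va_end(ap);
   return n;
}
```

With the annotation in place, a call such as `log_f(buf, sizeof(buf), "%s", 42)` should draw a format warning under `-Wformat`, while the va_list flavour only has its format string checked for consistency.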
Re: [Mesa-dev] OpenCL Clang/Clover Offline Compilation issue
On Thu, Jan 09, 2014 at 12:49:51PM +, Dorrington, Albert wrote: I am not sure if this is the appropriate list on which to ask this question; if not, hopefully someone can suggest an alternative. Under Linux, I am attempting to perform an offline compile of an OpenCL kernel example using Clang, and then load that binary using the clCreateProgramWithBinary() function. Unfortunately, while clover is loading the binary, I end up getting a segmentation fault: Program received signal SIGSEGV, Segmentation fault. proc (v=..., is=...) at core/module.cpp:50 50T x; I have pasted the source code I am using below, for both the kernel and the host code. I am compiling with the following commands: clang -target r600-unknown-unknown -x cl -S -emit-llvm -mcpu=r600 kernel.cl -o kernel.clbin I'm surprised that this works, since the r600 GPU does not support OpenCL (Note that R600 is the name of the target and also one of the individual GPUs supported by the compiler). The argument of -mcpu= needs to be the GPU you are compiling the code for. So if you have a redwood GPU you would need to pass -mcpu=redwood. However, the main issue here is that clover does not support clCreateProgramWithBinary() yet. If you are interested in implementing this, I can give you some pointers. Just send an email to the list or ping me on irc (nick: tstellar on #radeon @ irc.freenode.net). -Tom clang -g -L/usr/local/lib -lOpenCL offline_host.c -o offline_host I have LLVM/Clang 3.4RC3 installed and Mesa 10.0.1. If anyone has suggestions, or can point me to the appropriate mailing list or documentation, I'd appreciate it. Thanks! 
-Al Source code for kernel.cl __kernel void vecAdd(__global float* a) { int gid = get_global_id(0); a[gid] += a[gid]; } Source code for offline_host.c == #include stdio.h #include stdlib.h #ifdef __APPLE__ #include OpenCL/opencl.h #else #include CL/cl.h #endif #define MEM_SIZE (128) #define MAX_BINARY_SIZE (0x10) int main() { cl_platform_id platform_id = NULL; cl_device_id device_id = NULL; cl_context context = NULL; cl_command_queue command_queue = NULL; cl_mem memobj = NULL; cl_program program = NULL; cl_kernel kernel = NULL; cl_uint ret_num_devices; cl_uint ret_num_platforms; cl_int ret; float mem[MEM_SIZE]; FILE *fp; char fileName[] = kernel.clbin; size_t binary_size; char *binary_buf; cl_int binary_status; cl_int i; /* Load kernel binary */ fp = fopen(fileName, r); if (!fp) { fprintf(stderr, Failed to load kernel.\n); exit(1); } binary_buf = (char *)malloc(MAX_BINARY_SIZE); binary_size = fread(binary_buf, 1, MAX_BINARY_SIZE, fp); fclose(fp); /* Initialize input data */ for (i = 0; i MEM_SIZE; i++) { mem[i] = i; } /* Get platform/device information */ ret = clGetPlatformIDs(1, platform_id, ret_num_platforms); ret = clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_GPU, 1, device_id, ret_num_devices); /* Create OpenCL context*/ context = clCreateContext(NULL, 1, device_id, NULL, NULL, ret); /* Create command queue */ command_queue = clCreateCommandQueue(context, device_id, 0, ret); /* Create memory buffer */ memobj = clCreateBuffer(context, CL_MEM_READ_WRITE, MEM_SIZE * sizeof(float), NULL, ret); /* Transfer data over to the memory buffer */ ret = clEnqueueWriteBuffer(command_queue, memobj, CL_TRUE, 0, MEM_SIZE * sizeof(float), mem, 0, NULL, NULL); /* Create kernel program from the kernel binary */ program = clCreateProgramWithBinary(context, 1, device_id, (const size_t *)binary_size, (const unsigned char **)binary_buf, binary_status, ret); /* Create OpenCL kernel */ kernel = clCreateKernel(program, vecAdd, ret); printf(err:%d\n, ret); /* Set OpenCL kernel arguments */ 
ret = clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)memobj); size_t global_work_size[3] = {MEM_SIZE, 0, 0}; size_t local_work_size[3] = {MEM_SIZE, 0, 0}; /* Execute OpenCL kernel */ ret = clEnqueueNDRangeKernel(command_queue, kernel, 1, NULL, global_work_size, local_work_size, 0, NULL, NULL); /* Copy result from the memory buffer */ ret = clEnqueueReadBuffer(command_queue, memobj, CL_TRUE, 0, MEM_SIZE * sizeof(float), mem, 0, NULL, NULL); /* Display results */ for (i=0; i MEM_SIZE; i++) { printf(mem[%d] : $f\n, i, mem[i]); } /* Finalization */ ret = clFlush(command_queue); ret = clFinish(command_queue); ret = clReleaseKernel(kernel); ret = clReleaseProgram(program); ret = clReleaseMemObject(memobj); ret = clReleaseCommandQueue(command_queue); ret = clReleaseContext(context); free(binary_buf); return 0; } Al Dorrington Software Engineer Sr Lockheed Martin, Mission Systems and Training
Re: [Mesa-dev] [PATCH] Mark debug_print with __attribute__ ((format(__printf__, 1, 0)))
Ian Romanick i...@freedesktop.org writes: Reviewed-by: Ian Romanick ian.d.roman...@intel.com Thanks. Pushed. 8279c8f..cb4bc8e master -> master -- keith.pack...@intel.com
[Mesa-dev] [Bug 70410] egl-static/Makefile: linking fails with llvm >= 3.4
https://bugs.freedesktop.org/show_bug.cgi?id=70410 --- Comment #18 from Krzysztof A. Sobiecki sob...@gmail.com --- (In reply to comment #17) I've tested attachment 91725 [details] and it works with LLVM 3.5 (r198682) in a clean build environment (LLVM packages for Debian from llvm.org/apt). I couldn't use attachment 91764 [details] [review], since apparently the patch from attachment 91751 [details] [review] hasn't landed in LLVM's tree yet (at least not before r198682). You can have my Tested-by: Kai Wasserbäch k...@dev.carbon-project.org for attachment 91725 [details]. Stack: LLVM: SVN:trunk/r198682 Mesa: Git:master/532b1fecd9 libdrm: 2.4.50-1 (Debian package) Thank you for your help. I will wait for LLVM to fix the newline problem before sending this patch to mesa-dev. -- You are receiving this mail because: You are the assignee for the bug.
Re: [Mesa-dev] [wip 1/9] glsl: memory_writer helper class for data serialization
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote: Class will be used by the shader binary cache implementation. Signed-off-by: Tapani Pälli tapani.pa...@intel.com --- src/glsl/memory_writer.h | 147 +++ 1 file changed, 147 insertions(+) create mode 100644 src/glsl/memory_writer.h diff --git a/src/glsl/memory_writer.h b/src/glsl/memory_writer.h new file mode 100644 index 000..a6c6b55 --- /dev/null +++ b/src/glsl/memory_writer.h @@ -0,0 +1,147 @@ +/* -*- c++ -*- */ +/* + * Copyright © 2013 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#pragma once +#ifndef MEMORY_WRITER_H +#define MEMORY_WRITER_H + +#include stdlib.h +#include unistd.h +#include string.h + +#ifdef __cplusplus +/** + * Helper class for writing data to memory + * + * This class maintains a dynamically-sized memory buffer and allows + * for data to be efficiently appended to it with automatic resizing. 
+ */ +class memory_writer +{ +public: + memory_writer() : + memory(NULL), + curr_size(0), + pos(0) {} + + ~memory_writer() + { + free(memory); + } + + /* user wants to claim the memory */ + char *release_memory(size_t *size) + { + /* final realloc to free allocated but unused memory */ + char *result = (char *) realloc(memory, pos); + *size = pos; + memory = NULL; + curr_size = 0; + pos = 0; + return result; + } + +/** + * write functions per type + */ +#define DECL_WRITER(type) int write_ ##type (const type data) {\ + return write(&data, sizeof(type));\ +} + + DECL_WRITER(int32_t); + DECL_WRITER(int64_t); + DECL_WRITER(uint8_t); + DECL_WRITER(uint32_t); + + int write_bool(bool data) + { + uint8_t val = data; + return write_uint8_t(val); + } + + /* write function that reallocates more memory if required */ + int write(const void *data, int32_t size) + { + if (!memory || pos > (int32_t)(curr_size - size)) + if (grow(size)) +return -1; + + memcpy(memory + pos, data, size); + + pos += size; + return 0; + } + + int overwrite(const void *data, int32_t size, int32_t offset) + { + if (offset < 0 || offset + size > pos) + return -1; + memcpy(memory + offset, data, size); + return 0; + } + + int write_string(const char *str) + { + if (!str) + return -1; + char terminator = '\0'; + write(str, strlen(str)); + write(&terminator, 1); C strings include a terminator, so there's no reason to write out the string contents and the terminator separately. 
You can just do: write(str, strlen(str) + 1); Also, don't forget to propagate the return code to the caller: return write(str, strlen(str) + 1); + return 0; + } + + inline int32_t position() { return pos; } + + +private: + + /* reallocate more memory */ + int grow(int32_t size) + { + int32_t new_size = 2 * (curr_size + size); + char *more_mem = (char *) realloc(memory, new_size); + if (more_mem == NULL) { + free(memory); + memory = NULL; + return -1; + } else { + memory = more_mem; + curr_size = new_size; + return 0; + } + } + + /* allocated memory */ + char *memory; + + /* current size of the whole allocation */ + int32_t curr_size; + + /* write position / size of the data written */ + int32_t pos; +}; + +#endif /* ifdef __cplusplus */ + +#endif /* MEMORY_WRITER_H */ -- 1.8.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
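The two review points above (write the characters and the terminating NUL in one call, and pass the result of that call back to the caller) can be sketched with a cut-down writer. `byte_writer` is a hypothetical class written for this illustration, not the memory_writer from the patch; it uses `size_t` instead of `int32_t`, does not guard against size overflow, and unlike the patch it keeps the old block on a failed `realloc` so the destructor can free it.

```cpp
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical growable byte buffer, for illustration only.
class byte_writer {
public:
   byte_writer() : memory(NULL), capacity(0), pos(0) {}
   ~byte_writer() { free(memory); }

   // Append `size` bytes, growing the buffer if needed. Returns 0 or -1.
   int write(const void *data, size_t size)
   {
      if (size == 0)
         return 0;
      if (size > capacity - pos && grow(size))
         return -1;
      memcpy(memory + pos, data, size);
      pos += size;
      return 0;
   }

   // A single write covers the characters and the terminating NUL,
   // and its return code is propagated instead of being dropped.
   int write_string(const char *str)
   {
      if (!str)
         return -1;
      return write(str, strlen(str) + 1);
   }

   size_t position() const { return pos; }
   const char *data() const { return memory; }

private:
   // Reallocate with the same doubling policy the patch uses.
   int grow(size_t size)
   {
      size_t new_capacity = 2 * (capacity + size);
      char *more = static_cast<char *>(realloc(memory, new_capacity));
      if (more == NULL)
         return -1;   // old block is still owned by `memory`
      memory = more;
      capacity = new_capacity;
      return 0;
   }

   char *memory;
   size_t capacity;
   size_t pos;

   // Non-copyable: the buffer has a single owner.
   byte_writer(const byte_writer &);
   byte_writer &operator=(const byte_writer &);
};
```

Writing "abc" and then "de" leaves seven bytes in the buffer, each string followed by its own NUL, so a reader can recover the strings with plain `strlen`.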
Re: [Mesa-dev] [wip 1/9] glsl: memory_writer helper class for data serialization
On 01/02/2014 03:58 AM, Tapani Pälli wrote: Class will be used by the shader binary cache implementation. Signed-off-by: Tapani Pälli tapani.pa...@intel.com --- src/glsl/memory_writer.h | 147 +++ 1 file changed, 147 insertions(+) create mode 100644 src/glsl/memory_writer.h diff --git a/src/glsl/memory_writer.h b/src/glsl/memory_writer.h new file mode 100644 index 000..a6c6b55 --- /dev/null +++ b/src/glsl/memory_writer.h @@ -0,0 +1,147 @@ +/* -*- c++ -*- */ +/* + * Copyright © 2013 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#pragma once +#ifndef MEMORY_WRITER_H +#define MEMORY_WRITER_H + +#include stdlib.h +#include unistd.h +#include string.h + +#ifdef __cplusplus +/** + * Helper class for writing data to memory + * + * This class maintains a dynamically-sized memory buffer and allows + * for data to be efficiently appended to it with automatic resizing. 
+ */ +class memory_writer +{ +public: + memory_writer() : + memory(NULL), + curr_size(0), + pos(0) {} + + ~memory_writer() + { + free(memory); + } + + /* user wants to claim the memory */ + char *release_memory(size_t *size) + { + /* final realloc to free allocated but unused memory */ + char *result = (char *) realloc(memory, pos); + *size = pos; + memory = NULL; + curr_size = 0; + pos = 0; + return result; + } + +/** + * write functions per type + */ +#define DECL_WRITER(type) int write_ ##type (const type data) {\ + return write(&data, sizeof(type));\ +} + + DECL_WRITER(int32_t); + DECL_WRITER(int64_t); + DECL_WRITER(uint8_t); + DECL_WRITER(uint32_t); + + int write_bool(bool data) I agree with Paul's previous comments about the return values. http://lists.freedesktop.org/archives/mesa-dev/2013-November/047740.html It looks like the only errors tested are either memory allocation or bad parameters. The bad parameter checks should just be assertions. + { + uint8_t val = data; + return write_uint8_t(val); + } + + /* write function that reallocates more memory if required */ + int write(const void *data, int32_t size) + { + if (!memory || pos > (int32_t)(curr_size - size)) + if (grow(size)) +return -1; + + memcpy(memory + pos, data, size); + + pos += size; + return 0; + } + + int overwrite(const void *data, int32_t size, int32_t offset) + { + if (offset < 0 || offset + size > pos) + return -1; + memcpy(memory + offset, data, size); + return 0; + } + + int write_string(const char *str) + { + if (!str) + return -1; + char terminator = '\0'; + write(str, strlen(str)); + write(&terminator, 1); This should just be write(str, strlen(str) + 1); + return 0; + } + + inline int32_t position() { return pos; } + + +private: + + /* reallocate more memory */ + int grow(int32_t size) + { + int32_t new_size = 2 * (curr_size + size); + char *more_mem = (char *) realloc(memory, new_size); + if (more_mem == NULL) { + free(memory); + memory = NULL; + return -1; + } else { + memory = more_mem; 
+ curr_size = new_size; + return 0; + } + } + + /* allocated memory */ + char *memory; + + /* current size of the whole allocation */ + int32_t curr_size; Is there a reason to specifically make this int32_t instead of just int? Or even unsigned? + + /* write position / size of the data written */ + int32_t pos; +}; + +#endif /* ifdef __cplusplus */ + +#endif /* MEMORY_WRITER_H */ ___ mesa-dev mailing list
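For anyone following the review, here is a minimal sketch of how the three comments above (assert on bad parameters, write the string together with its terminator in one call, use an unsigned size type) might look when folded into the class. It is an illustration only, not the code from the patch or from any later revision. The name memory_writer_sketch, the bool return values and the write_uint32 helper are made up for the example, and the final shrinking realloc from release_memory is left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical rework of the reviewed class; names and return types are
// assumptions made for this sketch, not taken from the patch.
class memory_writer_sketch
{
public:
   memory_writer_sketch() : memory(nullptr), curr_size(0), pos(0) {}
   ~memory_writer_sketch() { std::free(memory); }

   memory_writer_sketch(const memory_writer_sketch &) = delete;
   memory_writer_sketch &operator=(const memory_writer_sketch &) = delete;

   // Append size bytes, growing the buffer when needed.
   // The only runtime failure left is an allocation failure.
   bool write(const void *data, size_t size)
   {
      assert(data != nullptr || size == 0);
      if (size == 0)
         return true;
      if (size > curr_size - pos && !grow(size))
         return false;
      std::memcpy(memory + pos, data, size);
      pos += size;
      return true;
   }

   // A bad offset or size is a programming error, so it is asserted
   // instead of being reported through a return value.
   void overwrite(const void *data, size_t size, size_t offset)
   {
      assert(data != nullptr || size == 0);
      assert(offset <= pos && size <= pos - offset);
      if (size == 0)
         return;
      std::memcpy(memory + offset, data, size);
   }

   // One call writes the characters and the terminating NUL.
   bool write_string(const char *str)
   {
      assert(str != nullptr);
      return write(str, std::strlen(str) + 1);
   }

   bool write_uint32(uint32_t value) { return write(&value, sizeof(value)); }

   size_t position() const { return pos; }

   // Hand the buffer over to the caller, who must free() it.
   char *release_memory(size_t *size)
   {
      char *result = memory;
      *size = pos;
      memory = nullptr;
      curr_size = 0;
      pos = 0;
      return result;
   }

private:
   bool grow(size_t size)
   {
      const size_t new_size = 2 * (curr_size + size);
      char *more_mem = static_cast<char *>(std::realloc(memory, new_size));
      if (more_mem == nullptr)
         return false; // old buffer stays valid; the destructor frees it
      memory = more_mem;
      curr_size = new_size;
      return true;
   }

   char *memory;     // allocated memory
   size_t curr_size; // current size of the whole allocation
   size_t pos;       // write position / size of the data written
};
```

With size_t there is no signed comparison left in write(), and callers only have to check the return value on the paths that can actually run out of memory.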
[Mesa-dev] [Bug 73512] [clover] mesa.icd. should contain full path
https://bugs.freedesktop.org/show_bug.cgi?id=73512

--- Comment #5 from Igor Gnatenko i.gnatenko.br...@gmail.com ---
(In reply to comment #4)
> According to the icd spec:
> http://www.khronos.org/registry/cl/extensions/khr/cl_khr_icd.txt
>
> The vendors directory must go in /etc/OpenCL and also only the library name
> is included in the *.icd file, not the full path, so I don't think this patch
> is correct. What problem does this patch fix?

1. We are not installing the unversioned .so in the main package; it goes into the -devel subpackage. So the .icd file should contain something like .so.@LIBVER@.

2. Fabian, can you try using libMesaOpenCL.so.1 in the .icd file? What does clinfo say?

--
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [RFC] Build testing, wine style
On Sat, Jan 11, 2014 at 03:53:58PM +, Emil Velikov wrote:
> Hello list,
>
> While going through mesa's build systems I was wondering what it would
> take to improve the overall experience of build testing.
>
> The only thing I can think of is a more centralised solution similar to
> the one used by wine [1]: having buildbots test every patch that is sent
> to the ML [2] :)
>
> I'm sure that some companies/organisations may have similar
> infrastructure, but I was wondering about the possibility of having a more
> open/shared experience, so that one does not need to test the same
> environment/setup across multiple bots.
>
> Here are a couple of nice words for each build system that mesa has:
> * automake - tons of build variations, most of which are handled by the
> debian/ubuntu, fedora and suse build systems.
> * scons - fewer build variations, mainly used for non-public
> state-trackers and/or drivers
> * android - possibly the most painful one out there (IMHO), 10GiB code
> cloned a ton of libraries build and a lot more that fail rather randomly :\
>
> Kind of wondering what it would take to have such a feature and if
> people will see benefits from it.

Hi Emil,

I've been playing around with buildbot, and I even had a local one doing Mesa builds a few weeks ago. I just need to find a dedicated machine so I can have it running full-time.

For me, I'm mostly interested in using buildbot for piglit testing, but I think it would also be useful to catch build breakages for the various configurations people care about.

I still don't understand the whole master/slave relationship of buildbot, so I'm not sure what kind of centralized resources would be needed, but maybe if someone would volunteer to maintain it we could use some of the fdo resources for hosting buildbot.

You also may want to take a look at tinderbox.x.org, which already does some build testing. I prefer buildbot mainly because I was unable to find very much documentation for tinderbox, but it might be worth looking at.

-Tom

> Cheers,
> Emil
>
> [1] http://wiki.winehq.org/BuildBot
> [2] http://source.winehq.org/patches/
[Mesa-dev] [Bug 73512] [clover] mesa.icd. should contain full path
https://bugs.freedesktop.org/show_bug.cgi?id=73512

--- Comment #6 from Fabian Deutsch fabian.deut...@gmx.de ---
Hey,

this can all be a result of me being uninformed (not knowing that only the library name is contained in the .icd file).

But I think that the .icd file is still not correct, as it contains only the unversioned library name libMesaOpenCL.so, which is, as Igor mentions, not packaged in the main packages (only in the devel subpackages).

So I'm not sure if the original icd file should contain the versioned library, or if we should do this downstream in Fedora.
[Mesa-dev] [Bug 73512] [clover] mesa.icd. should contain full path
https://bugs.freedesktop.org/show_bug.cgi?id=73512

Igor Gnatenko i.gnatenko.br...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #91886|0                           |1
        is obsolete|                            |

--- Comment #7 from Igor Gnatenko i.gnatenko.br...@gmail.com ---
Created attachment 91973
  --> https://bugs.freedesktop.org/attachment.cgi?id=91973&action=edit
[PATCH v3] opencl: improved auto-gen .icd

v2: Use @OPENCL_VERSION@:0 for library
    replace /etc with @sysconfdir@ macros
v3: Drop libdir from icd, because libMesaOpenCL isn't private
Re: [Mesa-dev] [PATCH] opencl: improved auto-gen .icd
On Sun, Jan 12, 2014 at 03:08:56AM +0400, Igor Gnatenko wrote:
> From 5b2bf87f1238e44150492a39f5db0ae90d59459b Mon Sep 17 00:00:00 2001
> From: Igor Gnatenko i.gnatenko.br...@gmail.com
> Date: Sun, 12 Jan 2014 02:09:16 +0400
> Subject: [PATCH] opencl: improved auto-gen .icd
>
> v2: Use @OPENCL_VERSION@:0 for library
>     replace /etc with @sysconfdir@ macros
>
> Reported-by: Fabian Deutsch fabian.deut...@gmx.de
> Reference: https://bugs.freedesktop.org/show_bug.cgi?id=73512
> Signed-off-by: Igor Gnatenko i.gnatenko.br...@gmail.com
> ---
>  configure.ac                           | 3 +++
>  src/gallium/targets/opencl/Makefile.am | 4 ++--
>  src/gallium/targets/opencl/mesa.icd    | 1 -
>  src/gallium/targets/opencl/mesa.icd.in | 1 +
>  4 files changed, 6 insertions(+), 3 deletions(-)
>  delete mode 100644 src/gallium/targets/opencl/mesa.icd
>  create mode 100644 src/gallium/targets/opencl/mesa.icd.in
>
> diff --git a/configure.ac b/configure.ac
> index 4b55140..3452e15 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -25,6 +25,8 @@ m4_ifdef([AM_PROG_AR], [AM_PROG_AR])
>  dnl Set internal versions
>  OSMESA_VERSION=8
>  AC_SUBST([OSMESA_VERSION])
> +OPENCL_VERSION=1
> +AC_SUBST([OPENCL_VERSION])
>
>  dnl Versions for external dependencies
>  LIBDRM_REQUIRED=2.4.24
> @@ -2023,6 +2025,7 @@ AC_CONFIG_FILES([Makefile
>  		src/gallium/targets/egl-static/Makefile
>  		src/gallium/targets/gbm/Makefile
>  		src/gallium/targets/opencl/Makefile
> +		src/gallium/targets/opencl/mesa.icd
>  		src/gallium/targets/osmesa/Makefile
>  		src/gallium/targets/osmesa/osmesa.pc
>  		src/gallium/targets/pipe-loader/Makefile
> diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am
> index 653302c..923316c 100644
> --- a/src/gallium/targets/opencl/Makefile.am
> +++ b/src/gallium/targets/opencl/Makefile.am
> @@ -4,7 +4,7 @@ lib_LTLIBRARIES = lib@OPENCL_LIBNAME@.la
>
>  lib@OPENCL_LIBNAME@_la_LDFLAGS = \
>  	$(LLVM_LDFLAGS) \
> -	-version-number 1:0
> +	-version-number @OPENCL_VERSION@:0
>
>  lib@OPENCL_LIBNAME@_la_LIBADD = \
>  	$(top_builddir)/src/gallium/auxiliary/pipe-loader/libpipe_loader.la \
> @@ -34,7 +34,7 @@ lib@OPENCL_LIBNAME@_la_SOURCES =
>  nodist_EXTRA_lib@OPENCL_LIBNAME@_la_SOURCES = dummy.cpp
>
>  if HAVE_CLOVER_ICD
> -icddir = /etc/OpenCL/vendors/
> +icddir = @sysconfdir@/OpenCL/vendors/

As I mentioned in the bug report, the ICD spec says that OpenCL/vendors/ should be in /etc/. I don't think we can change this and still be spec compliant. Why do you want to install the *.icd files in sysconfdir?

>  icd_DATA = mesa.icd
>  endif
> diff --git a/src/gallium/targets/opencl/mesa.icd b/src/gallium/targets/opencl/mesa.icd
> deleted file mode 100644
> index 6a6a870..000
> --- a/src/gallium/targets/opencl/mesa.icd
> +++ /dev/null
> @@ -1 +0,0 @@
> -libMesaOpenCL.so
> diff --git a/src/gallium/targets/opencl/mesa.icd.in b/src/gallium/targets/opencl/mesa.icd.in
> new file mode 100644
> index 000..a0b6489
> --- /dev/null
> +++ b/src/gallium/targets/opencl/mesa.icd.in
> @@ -0,0 +1 @@
> +@libdir@/lib@OPENCL_LIBNAME@.so.@OPENCL_VERSION@

Again, the spec says only the library name should go here and not the full path.

-Tom

> --
> 1.8.4.2
>
> --
> -Igor Gnatenko
[Mesa-dev] [PATCH 01/19] nv50/ir: fix PFETCH and add RDSV to get VSTRIDE for GPs
From: Christoph Bumiller e0425...@student.tuwien.ac.at --- src/gallium/drivers/nouveau/codegen/nv50_ir.h | 1 + .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 62 -- .../drivers/nouveau/codegen/nv50_ir_print.cpp | 1 + 3 files changed, 59 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h b/src/gallium/drivers/nouveau/codegen/nv50_ir.h index 68c76e5..6a001d3 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h @@ -366,6 +366,7 @@ enum SVSemantic SV_CLOCK, SV_LBASE, SV_SBASE, + SV_VERTEX_STRIDE, SV_UNDEFINED, SV_LAST }; diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp index 3eca27d..cf82e2f 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp @@ -87,6 +87,7 @@ private: void emitLOAD(const Instruction *); void emitSTORE(const Instruction *); void emitMOV(const Instruction *); + void emitRDSV(const Instruction *); void emitNOP(); void emitINTERP(const Instruction *); void emitPFETCH(const Instruction *); @@ -772,6 +773,29 @@ CodeEmitterNV50::emitMOV(const Instruction *i) } } +static inline uint8_t getSRegEncoding(const ValueRef ref) +{ + switch (SDATA(ref).sv.sv) { + case SV_PHYSID:return 0; + case SV_CLOCK: return 1; + case SV_VERTEX_STRIDE: return 3; +// case SV_PM_COUNTER:return 4 + SDATA(ref).sv.index; + case SV_SAMPLE_INDEX: return 8; + default: + assert(!no sreg for system value); + return 0; + } +} + +void +CodeEmitterNV50::emitRDSV(const Instruction *i) +{ + code[0] = 0x0001; + code[1] = 0x6000 | (getSRegEncoding(i-src(0)) 14); + defId(i-def(0), 2); + emitFlagsRd(i); +} + void CodeEmitterNV50::emitNOP() { @@ -794,15 +818,40 @@ CodeEmitterNV50::emitQUADOP(const Instruction *i, uint8_t lane, uint8_t quOp) srcId(i-src(0), 32 + 14); } +/* NOTE: This returns the base address of a vertex inside the primitive. 
+ * src0 is an immediate, the index (not offset) of the vertex + * inside the primitive. XXX: signed or unsigned ? + * src1 (may be NULL) should use whatever units the hardware requires + * (on nv50 this is bytes, so, relative index * 4; signed 16 bit value). + */ void CodeEmitterNV50::emitPFETCH(const Instruction *i) { - code[0] = 0x1181; - code[1] = 0x0420 | (0xf 14); + const uint32_t prim = i-src(0).get()-reg.data.u32; + assert(prim = 127); - defId(i-def(0), 2); - srcAddr8(i-src(0), 9); - setAReg16(i, 0); + if (i-def(0).getFile() == FILE_ADDRESS) { + // shl $aX a[] 0 + code[0] = 0x0001 | ((DDATA(i-def(0)).id + 1) 2); + code[1] = 0xc020; + code[0] |= prim 9; + assert(!i-srcExists(1)); + } else + if (i-srcExists(1)) { + // ld b32 $rX a[$aX+base] + code[0] = 0x0001; + code[1] = 0x0420 | (0xf 14); + defId(i-def(0), 2); + code[0] |= prim 9; + setARegBits(SDATA(i-src(1)).id + 1); + } else { + // mov b32 $rX a[] + code[0] = 0x1001; + code[1] = 0x0420 | (0xf 14); + defId(i-def(0), 2); + code[0] |= prim 9; + } + emitFlagsRd(i); } void @@ -1620,6 +1669,9 @@ CodeEmitterNV50::emitInstruction(Instruction *insn) case OP_PFETCH: emitPFETCH(insn); break; + case OP_RDSV: + emitRDSV(insn); + break; case OP_LINTERP: case OP_PINTERP: emitINTERP(insn); diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp index ee39b3c..ae42d03 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp @@ -265,6 +265,7 @@ static const char *SemanticStr[SV_LAST + 1] = CLOCK, LBASE, SBASE, + VERTEX_STRIDE, ?, (INVALID) }; -- 1.8.3.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 09/19] nv50: properly set the PRIMITIVE_ID enable flag when it is a gp input.
Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---
 src/gallium/drivers/nouveau/nv50/nv50_program.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_program.c b/src/gallium/drivers/nouveau/nv50/nv50_program.c
index 78a12e3..f46f240 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_program.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_program.c
@@ -52,6 +52,9 @@ nv50_vertprog_assign_slots(struct nv50_ir_prog_info *info)
       for (c = 0; c < 4; ++c)
          if (info->in[i].mask & (1 << c))
             info->in[i].slot[c] = n++;
+
+      if (info->in[i].sn == TGSI_SEMANTIC_PRIMID)
+         prog->vp.attrs[2] |= NV50_3D_VP_GP_BUILTIN_ATTR_EN_PRIMITIVE_ID;
    }
    prog->in_nr = info->numInputs;

--
1.8.3.2
[Mesa-dev] [RFC PATCH 18/19] nv50: report glsl 1.50 now that gp tests pass
Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---

There are still some things that fail -- mostly gl_Layer stuff, and also using gl_PrimitiveID without a gp.

 src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 5732b21..123bdab 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -126,7 +126,7 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
    case PIPE_CAP_SM3:
       return 1;
    case PIPE_CAP_GLSL_FEATURE_LEVEL:
-      return 140;
+      return 150;
    case PIPE_CAP_MAX_RENDER_TARGETS:
       return 8;
    case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS:

--
1.8.3.2
[Mesa-dev] [PATCH 14/19] nvc0: don't forget to also clear additional layers
Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---
 src/gallium/drivers/nouveau/nv50/nv50_program.c |  2 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_surface.c | 22 --
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_program.c b/src/gallium/drivers/nouveau/nv50/nv50_program.c
index 813795f..e7609fa 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_program.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_program.c
@@ -166,6 +166,8 @@ nv50_fragprog_assign_slots(struct nv50_ir_prog_info *info)

          if (info->in[i].sn == TGSI_SEMANTIC_COLOR)
             prog->vp.bfc[info->in[i].si] = j;
+         if (info->in[i].sn == TGSI_SEMANTIC_PRIMID)
+            prog->vp.attrs[2] |= NV50_3D_VP_GP_BUILTIN_ATTR_EN_PRIMITIVE_ID;

          prog->in[j].id = i;
          prog->in[j].mask = info->in[i].mask;
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
index 5375bd4..8cc7021 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
@@ -414,7 +414,7 @@ nvc0_clear(struct pipe_context *pipe, unsigned buffers,
    struct nvc0_context *nvc0 = nvc0_context(pipe);
    struct nouveau_pushbuf *push = nvc0->base.pushbuf;
    struct pipe_framebuffer_state *fb = &nvc0->framebuffer;
-   unsigned i;
+   unsigned i, j;
    uint32_t mode = 0;

    /* don't need NEW_BLEND, COLOR_MASK doesn't affect CLEAR_BUFFERS */
@@ -444,12 +444,22 @@ nvc0_clear(struct pipe_context *pipe, unsigned buffers,
       mode |= NVC0_3D_CLEAR_BUFFERS_S;
    }

-   BEGIN_NVC0(push, NVC0_3D(CLEAR_BUFFERS), 1);
-   PUSH_DATA (push, mode);
+   if ((buffers & PIPE_CLEAR_DEPTH) || (buffers & PIPE_CLEAR_STENCIL)) {
+      for (j = fb->zsbuf->u.tex.first_layer; j <= fb->zsbuf->u.tex.last_layer; j++) {
+         BEGIN_NVC0(push, NVC0_3D(CLEAR_BUFFERS), 1);
+         PUSH_DATA(push, mode | (j << NVC0_3D_CLEAR_BUFFERS_LAYER__SHIFT));
+      }
+   }

-   for (i = 1; i < fb->nr_cbufs; i++) {
-      BEGIN_NVC0(push, NVC0_3D(CLEAR_BUFFERS), 1);
-      PUSH_DATA (push, (i << 6) | 0x3c);
+   if (buffers & PIPE_CLEAR_COLOR) {
+      for (i = 0; i < fb->nr_cbufs; i++) {
+         struct pipe_surface *sf = fb->cbufs[i];
+         for (j = sf->u.tex.first_layer; j <= sf->u.tex.last_layer; j++) {
+            BEGIN_NVC0(push, NVC0_3D(CLEAR_BUFFERS), 1);
+            PUSH_DATA (push, (i << 6) | 0x3c |
+                             (j << NVC0_3D_CLEAR_BUFFERS_LAYER__SHIFT));
+         }
+      }
    }
 }

--
1.8.3.2
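To make the new color path easier to follow outside the driver, here is a small standalone sketch of the data words that the nested loop would push: one CLEAR_BUFFERS word per color buffer and per layer. It only mirrors the arithmetic in the hunk above. kLayerShift is a placeholder value picked for the example (the real shift is NVC0_3D_CLEAR_BUFFERS_LAYER__SHIFT from the driver headers), and layer_range and color_clear_words are names made up here.

```cpp
#include <cstdint>
#include <vector>

// Placeholder for NVC0_3D_CLEAR_BUFFERS_LAYER__SHIFT. The value is an
// assumption for this example and is not taken from the hardware headers.
constexpr unsigned kLayerShift = 10;

// Hypothetical stand-in for the first_layer/last_layer pair of a surface.
struct layer_range {
   unsigned first_layer;
   unsigned last_layer;
};

// Collect the data word of every CLEAR_BUFFERS method the color path would
// emit: the render target index goes in bits 6 and up, 0x3c is taken
// verbatim from the patch, and the layer index is shifted into its field.
std::vector<uint32_t> color_clear_words(const std::vector<layer_range> &cbufs)
{
   std::vector<uint32_t> words;
   for (unsigned i = 0; i < cbufs.size(); i++) {
      for (unsigned j = cbufs[i].first_layer; j <= cbufs[i].last_layer; j++)
         words.push_back((i << 6) | 0x3c | (j << kLayerShift));
   }
   return words;
}
```

For a framebuffer whose first attachment covers layers 0-1 and whose second covers only layer 2, this yields three words, so a layered attachment costs one clear method per layer rather than one per attachment.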
[Mesa-dev] [PATCH 08/19] nv50/ir: add support for gl_PrimitiveIDIn
Note that the primitive id is stored in a[0x18], while usually the geometry instructions are of the form a[$a1 + 0x4] which gets mapped to p[] space. We need to avoid the change from a[] to p[] here, so it's keyed on whether the access is indirect or not. Note that there's also a use-case for accessing e.g. a[$r1], however that's not supported for now. (Could be added by checking the register file of the indirect parameter.) Signed-off-by: Ilia Mirkin imir...@alum.mit.edu --- src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 6 +++--- src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 7 +-- src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp | 3 +++ 3 files changed, 11 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp index f4db2ed..a6ed4b0 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp @@ -381,7 +381,7 @@ CodeEmitterNV50::setSrcFileBits(const Instruction *i, int enc) case 0x00: // rrr break; case 0x01: // arr/grr - if (progType == Program::TYPE_GEOMETRY) { + if (progType == Program::TYPE_GEOMETRY i-src(0).isIndirect(0)) { code[0] |= 0x0180; if (enc == NV50_OP_ENC_LONG || enc == NV50_OP_ENC_LONG_ALT) code[1] |= 0x0020; @@ -407,7 +407,7 @@ CodeEmitterNV50::setSrcFileBits(const Instruction *i, int enc) code[1] |= (i-getSrc(1)-reg.fileIndex 22); break; case 0x09: // acr/gcr - if (progType == Program::TYPE_GEOMETRY) { + if (progType == Program::TYPE_GEOMETRY i-src(0).isIndirect(0)) { code[0] |= 0x0180; } else { code[0] |= (enc == NV50_OP_ENC_LONG_ALT) ? 
0x0100 : 0x0080; @@ -612,7 +612,7 @@ CodeEmitterNV50::emitLOAD(const Instruction *i) switch (sf) { case FILE_SHADER_INPUT: - if (progType == Program::TYPE_GEOMETRY) + if (progType == Program::TYPE_GEOMETRY i-src(0).isIndirect(0)) code[0] = 0x1181; else // use 'mov' where we can diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp index 3c790cf..321410e 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp @@ -1434,13 +1434,16 @@ Converter::fetchSrc(tgsi::Instruction::SrcRegister src, int c, Value *ptr) return mkOp1v(OP_RDSV, TYPE_F32, getSSA(), mkSysVal(SV_FACE, 0)); return interpolate(src, c, shiftAddress(ptr)); } else - if (ptr prog-getType() == Program::TYPE_GEOMETRY) { + if (prog-getType() == Program::TYPE_GEOMETRY) { + if (!ptr info-in[idx].sn == TGSI_SEMANTIC_PRIMID) +return mkOp1v(OP_RDSV, TYPE_U32, getSSA(), mkSysVal(SV_PRIMITIVE_ID, 0)); // XXX: This is going to be a problem with scalar arrays, i.e. when // we cannot assume that the address is given in units of vec4. // // nv50 and nvc0 need different things here, so let the lowering // passes decide what to do with the address - return mkLoadv(TYPE_U32, srcToSym(src, c), ptr); + if (ptr) +return mkLoadv(TYPE_U32, srcToSym(src, c), ptr); } return mkLoadv(TYPE_U32, srcToSym(src, c), shiftAddress(ptr)); case TGSI_FILE_OUTPUT: diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp index a84a54a..1925c09 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp @@ -238,6 +238,9 @@ TargetNV50::getSVAddress(DataFile shaderFile, const Symbol *sym) const addr += 4; return addr; } + case SV_PRIMITIVE_ID: + return shaderFile == FILE_SHADER_INPUT ? 
0x18 : + sysvalLocation[sym-reg.data.sv.sv]; case SV_NCTAID: return 0x8 + 2 * sym-reg.data.sv.index; case SV_CTAID: -- 1.8.3.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 03/19] nv50: add support for geometry shaders
From: Bryan Cain bryanca...@gmail.com Layer output probably doesn't work yet, but other than that everything seems to be working. Signed-off-by: Bryan Cain bryanca...@gmail.com [calim: fix up minor bugs, code formatting] Signed-off-by: Christoph Bumiller e0425...@student.tuwien.ac.at Signed-off-by: Ilia Mirkin imir...@alum.mit.edu --- .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 25 -- src/gallium/drivers/nouveau/nv50/nv50_program.c| 16 ++ .../drivers/nouveau/nv50/nv50_shader_state.c | 2 ++ src/gallium/drivers/nouveau/nv50/nv50_tex.c| 2 ++ 4 files changed, 39 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp index cf82e2f..f4db2ed 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp @@ -493,7 +493,12 @@ CodeEmitterNV50::emitForm_MAD(const Instruction *i) setSrc(i, 1, 1); setSrc(i, 2, 2); - setAReg16(i, 1); + if (i-getIndirect(0, 0)) { + assert(!i-getIndirect(1, 0)); + setAReg16(i, 0); + } else { + setAReg16(i, 1); + } } // like default form, but 2nd source in slot 2, and no 3rd source @@ -512,7 +517,12 @@ CodeEmitterNV50::emitForm_ADD(const Instruction *i) setSrc(i, 0, 0); setSrc(i, 1, 2); - setAReg16(i, 1); + if (i-getIndirect(0, 0)) { + assert(!i-getIndirect(1, 0)); + setAReg16(i, 0); + } else { + setAReg16(i, 1); + } } // default short form (rr, ar, rc, gr) @@ -602,8 +612,11 @@ CodeEmitterNV50::emitLOAD(const Instruction *i) switch (sf) { case FILE_SHADER_INPUT: - // use 'mov' where we can - code[0] = i-src(0).isIndirect(0) ? 0x0001 : 0x1001; + if (progType == Program::TYPE_GEOMETRY) + code[0] = 0x1181; + else + // use 'mov' where we can + code[0] = i-src(0).isIndirect(0) ? 
0x0001 : 0x1001; code[1] = 0x0020 | (i-lanes 14); if (typeSizeof(i-dType) == 4) code[1] |= 0x0400; @@ -1399,8 +1412,8 @@ CodeEmitterNV50::emitShift(const Instruction *i) void CodeEmitterNV50::emitOUT(const Instruction *i) { - code[0] = (i-op == OP_EMIT) ? 0xf200 : 0xf400; - code[1] = 0xc001; + code[0] = (i-op == OP_EMIT) ? 0xf201 : 0xf401; + code[1] = 0xc000; emitFlagsRd(i); } diff --git a/src/gallium/drivers/nouveau/nv50/nv50_program.c b/src/gallium/drivers/nouveau/nv50/nv50_program.c index 97857d7..78a12e3 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_program.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_program.c @@ -358,6 +358,22 @@ nv50_program_translate(struct nv50_program *prog, uint16_t chipset) } if (info-prop.fp.usesDiscard) prog-fp.flags[0] |= NV50_3D_FP_CONTROL_USES_KIL; + } else + if (prog-type == PIPE_SHADER_GEOMETRY) { + switch (info-prop.gp.outputPrim) { + case PIPE_PRIM_LINE_STRIP: + prog-gp.prim_type = NV50_3D_GP_OUTPUT_PRIMITIVE_TYPE_LINE_STRIP; + break; + case PIPE_PRIM_TRIANGLE_STRIP: + prog-gp.prim_type = NV50_3D_GP_OUTPUT_PRIMITIVE_TYPE_TRIANGLE_STRIP; + break; + case PIPE_PRIM_POINTS: + default: + assert(info-prop.gp.outputPrim == PIPE_PRIM_POINTS); + prog-gp.prim_type = NV50_3D_GP_OUTPUT_PRIMITIVE_TYPE_POINTS; + break; + } + prog-gp.vert_count = info-prop.gp.maxVertices; } if (prog-pipe.stream_output.num_outputs) diff --git a/src/gallium/drivers/nouveau/nv50/nv50_shader_state.c b/src/gallium/drivers/nouveau/nv50/nv50_shader_state.c index 9144fc4..ba4f592 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_shader_state.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_shader_state.c @@ -193,6 +193,8 @@ nv50_gmtyprog_validate(struct nv50_context *nv50) struct nv50_program *gp = nv50-gmtyprog; if (gp) { + if (!nv50_program_validate(nv50, gp)) + return; BEGIN_NV04(push, NV50_3D(GP_REG_ALLOC_TEMP), 1); PUSH_DATA (push, gp-max_gpr); BEGIN_NV04(push, NV50_3D(GP_REG_ALLOC_RESULT), 1); diff --git a/src/gallium/drivers/nouveau/nv50/nv50_tex.c 
b/src/gallium/drivers/nouveau/nv50/nv50_tex.c index f7284fa..6663a61 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_tex.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_tex.c @@ -293,6 +293,7 @@ void nv50_validate_textures(struct nv50_context *nv50) boolean need_flush; need_flush = nv50_validate_tic(nv50, 0); + need_flush |= nv50_validate_tic(nv50, 1); need_flush |= nv50_validate_tic(nv50, 2); if (need_flush) { @@ -343,6 +344,7 @@ void nv50_validate_samplers(struct nv50_context *nv50) boolean need_flush; need_flush = nv50_validate_tsc(nv50, 0); + need_flush |= nv50_validate_tsc(nv50, 1); need_flush |= nv50_validate_tsc(nv50, 2); if (need_flush) { -- 1.8.3.2
[Mesa-dev] [PATCH 02/19] nv50/ir: delay calculation of indirect addresses
From: Bryan Cain bryanca...@gmail.com Instead of emitting an SHL 4 io an address register on the TGSI ARL and UARL instructions, emit the shift when the loaded address is actually used. This is necessary because input vertex and attribute indices in geometry shaders on nv50 need to be shifted left by 2 instead of 4. Signed-off-by: Bryan Cain bryanca...@gmail.com [calim: various updates to the indirect address logic] Signed-off-by: Christoph Bumiller e0425...@student.tuwien.ac.at [imirkin: remove OP_MAD change that calim made, add OP_RESTART handling same as OP_EMIT for code flow analysis] Signed-off-by: Ilia Mirkin imir...@alum.mit.edu --- .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 38 ++-- .../nouveau/codegen/nv50_ir_lowering_nv50.cpp | 104 - .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 7 ++ 3 files changed, 136 insertions(+), 13 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp index 49a45f8..3c790cf 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp @@ -1126,6 +1126,7 @@ private: ValueMap values; }; + Value *shiftAddress(Value *); Value *getVertexBase(int s); DataArray *getArrayForFile(unsigned file, int idx); Value *fetchSrc(int s, int c); @@ -1344,7 +1345,8 @@ Converter::getVertexBase(int s) if (tgsi.getSrc(s).isIndirect(1)) rel = fetchSrc(tgsi.getSrc(s).getIndirect(1), 0, NULL); vtxBaseValid |= 1 s; - vtxBase[s] = mkOp2v(OP_PFETCH, TYPE_U32, getSSA(), mkImm(index), rel); + vtxBase[s] = mkOp2v(OP_PFETCH, TYPE_U32, getSSA(4, FILE_ADDRESS), + mkImm(index), rel); } return vtxBase[s]; } @@ -1403,6 +1405,14 @@ Converter::getArrayForFile(unsigned file, int idx) } Value * +Converter::shiftAddress(Value *index) +{ + if (!index) + return NULL; + return mkOp2v(OP_SHL, TYPE_U32, getSSA(4, FILE_ADDRESS), index, mkImm(4)); +} + +Value * Converter::fetchSrc(tgsi::Instruction::SrcRegister 
src, int c, Value *ptr) { const int idx2d = src.is2D() ? src.getIndex(1) : 0; @@ -1414,7 +1424,7 @@ Converter::fetchSrc(tgsi::Instruction::SrcRegister src, int c, Value *ptr) assert(!ptr); return loadImm(NULL, info-immd.data[idx * 4 + swz]); case TGSI_FILE_CONSTANT: - return mkLoadv(TYPE_U32, srcToSym(src, c), ptr); + return mkLoadv(TYPE_U32, srcToSym(src, c), shiftAddress(ptr)); case TGSI_FILE_INPUT: if (prog-getType() == Program::TYPE_FRAGMENT) { // don't load masked inputs, won't be assigned a slot @@ -1422,9 +1432,17 @@ Converter::fetchSrc(tgsi::Instruction::SrcRegister src, int c, Value *ptr) return loadImm(NULL, swz == TGSI_SWIZZLE_W ? 1.0f : 0.0f); if (!ptr info-in[idx].sn == TGSI_SEMANTIC_FACE) return mkOp1v(OP_RDSV, TYPE_F32, getSSA(), mkSysVal(SV_FACE, 0)); - return interpolate(src, c, ptr); + return interpolate(src, c, shiftAddress(ptr)); + } else + if (ptr prog-getType() == Program::TYPE_GEOMETRY) { + // XXX: This is going to be a problem with scalar arrays, i.e. when + // we cannot assume that the address is given in units of vec4. + // + // nv50 and nvc0 need different things here, so let the lowering + // passes decide what to do with the address + return mkLoadv(TYPE_U32, srcToSym(src, c), ptr); } - return mkLoadv(TYPE_U32, srcToSym(src, c), ptr); + return mkLoadv(TYPE_U32, srcToSym(src, c), shiftAddress(ptr)); case TGSI_FILE_OUTPUT: assert(!load from output file); return NULL; @@ -1433,7 +1451,7 @@ Converter::fetchSrc(tgsi::Instruction::SrcRegister src, int c, Value *ptr) return mkOp1v(OP_RDSV, TYPE_U32, getSSA(), srcToSym(src, c)); default: return getArrayForFile(src.getFile(), idx2d)-load( - sub.cur-values, idx, swz, ptr); + sub.cur-values, idx, swz, shiftAddress(ptr)); } } @@ -1476,8 +1494,9 @@ Converter::storeDst(int d, int c, Value *val) break; } - Value *ptr = dst.isIndirect(0) ? 
- fetchSrc(dst.getIndirect(0), 0, NULL) : NULL; + Value *ptr = NULL; + if (dst.isIndirect(0)) + ptr = shiftAddress(fetchSrc(dst.getIndirect(0), 0, NULL)); if (info-io.genUserClip 0 dst.getFile() == TGSI_FILE_OUTPUT @@ -2179,12 +2198,11 @@ Converter::handleInstruction(const struct tgsi_full_instruction *insn) FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) { src0 = fetchSrc(0, c); mkCvt(OP_CVT, TYPE_S32, dst0[c], TYPE_F32, src0)-rnd = ROUND_M; - mkOp2(OP_SHL, TYPE_U32, dst0[c], dst0[c], mkImm(4)); } break; case TGSI_OPCODE_UARL: FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) - mkOp2(OP_SHL, TYPE_U32, dst0[c], fetchSrc(0, c), mkImm(4)); +
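As a reading aid for the commit message, the idea of delaying the shift can be shown with a few lines of standalone arithmetic: the address register keeps the raw index, and the scaling is picked where the address is used. This is a sketch under the assumptions stated in the commit message (shift by 4 in the general case, by 2 for geometry shader input indices on nv50). addr_use and scaled_address are hypothetical names and do not exist in the codegen.

```cpp
#include <cstdint>

// Hypothetical tag describing where an indirect address ends up being used.
enum class addr_use {
   vec4_slot,     // general case: index counted in vec4 units, shifted by 4
   nv50_gp_input  // nv50 geometry shader input index, shifted by 2
};

// The ARL/UARL result stays a plain index; the shift is applied at the use.
constexpr uint32_t scaled_address(uint32_t index, addr_use use)
{
   return use == addr_use::nv50_gp_input ? index << 2 : index << 4;
}
```

Had the shift by 4 stayed in ARL/UARL, the geometry shader path would have needed to undo it before applying its own scaling.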
[Mesa-dev] [RFC PATCH 19/19] nv50: enable seamless cube maps on all hw for OpenGL 3.2
Some of the hardware support is missing. The NVIDIA-provided driver, which claims 3.3 support, fails a slew of the relevant tests as well. This allows us to expose geometry shaders without doing the additional work involved in supporting ARB_geometry_shader4.

Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---
 src/gallium/drivers/nouveau/nv50/nv50_screen.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 123bdab..a108ece 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -111,7 +111,7 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
    case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
       return 65536;
    case PIPE_CAP_SEAMLESS_CUBE_MAP:
-      return nv50_screen(pscreen)->tesla->oclass >= NVA0_3D_CLASS;
+      return 1; //nv50_screen(pscreen)->tesla->oclass >= NVA0_3D_CLASS;
    case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
       return 0;
    case PIPE_CAP_CUBE_MAP_ARRAY:

--
1.8.3.2
[Mesa-dev] [PATCH 00/19] nv50: add sampler2DMS/GP support to get OpenGL 3.2
OK, so there's a bunch of stuff in here. The geometry stuff is based on the work started by Bryan Cain and Christoph Bumiller.

Patches 01-12: Add support for geometry shaders and fix related issues
Patches 13-14: Make it possible for fb clears to operate on texture attachments with an explicit layer set (as is allowed in gl 3.2).
Patches 15-17: Make ARB_texture_multisample work
Patch 18: Enable GLSL 1.50
Patch 19: Turn on ARB_seamless_cube_map irrespective of HW support so that all nv50 cards can get OpenGL 3.2 and geometry shaders (which are otherwise unsupported)

There are still a few geometry-related piglits that fail -- specifically:

  primitive-id-no-gs
  gl-3.2-layered-rendering-gl-layer*

I need to trace the blob to figure out exactly how to configure the HW for those situations, but I suspect that the fixes will be fairly small and self-contained.

Note that there are also a bunch of EXT_framebuffer_multisample tests that are failing, but that has nothing to do with these changes. There's something wrong with the blit_3d function, at the very least to do with depth/stencil, but also some color tests fail as well.

These patches are available at

  https://github.com/imirkin/mesa.git nv50-gs

or https://github.com/imirkin/mesa/commits/nv50-gs for those who prefer a web ui.

Bryan Cain (2):
  nv50/ir: delay calculation of indirect addresses
  nv50: add support for geometry shaders

Christoph Bumiller (1):
  nv50/ir: fix PFETCH and add RDSV to get VSTRIDE for GPs

Ilia Mirkin (16):
  nv50: allow vert_count to be 255
  nv50/ir: disallow predicates on emit/restart ops
  nv50/ir: disallow shader input propagation for gp
  nv50/ir: comment out code to allow input/immed loads
  nv50/ir: add support for gl_PrimitiveIDIn
  nv50: properly set the PRIMITIVE_ID enable flag when it is a gp input.
  nv50: VP_RESULT_MAP_SIZE has to be positive
  nv50: GP_REG_ALLOC_RESULT must be positive
  nv50: allocate an extra code bo to avoid dmesg spam
  nv50: don't forget to also clear additional layers
  nvc0: don't forget to also clear additional layers
  nv50: add comments about CB_AUX contents
  nv50: copy nvc0's get_sample_position implementation
  nv50: add support for textureFetch'ing MS textures, ARB_texture_multisample
  nv50: report glsl 1.50 now that gp tests pass
  nv50: enable seamless cube maps on all hw for OpenGL 3.2

 src/gallium/drivers/nouveau/codegen/nv50_ir.h | 9 ++
 .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 92 ++--
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 41 --
 .../nouveau/codegen/nv50_ir_lowering_nv50.cpp | 164 -
 .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 7 +
 .../drivers/nouveau/codegen/nv50_ir_print.cpp | 1 +
 .../nouveau/codegen/nv50_ir_target_nv50.cpp | 18 ++-
 src/gallium/drivers/nouveau/nv50/nv50_context.c | 46 ++
 src/gallium/drivers/nouveau/nv50/nv50_context.h | 17 +++
 src/gallium/drivers/nouveau/nv50/nv50_program.c | 30 +++-
 src/gallium/drivers/nouveau/nv50/nv50_program.h | 2 +-
 src/gallium/drivers/nouveau/nv50/nv50_screen.c | 23 ++-
 .../drivers/nouveau/nv50/nv50_shader_state.c | 6 +
 .../drivers/nouveau/nv50/nv50_state_validate.c | 2 +-
 src/gallium/drivers/nouveau/nv50/nv50_surface.c | 25 ++--
 src/gallium/drivers/nouveau/nv50/nv50_tex.c | 77 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_surface.c | 22 ++-
 17 files changed, 526 insertions(+), 56 deletions(-)

--
1.8.3.2
[Mesa-dev] [PATCH 17/19] nv50: add support for textureFetch'ing MS textures, ARB_texture_multisample
Creates two areas in the AUX constbuf: - Sample offsets for MS textures - Per-texture MS settings When executing a textureFetch with a MS sampler, looks up that texture's settings and adjusts the parameters given to the texfetch instruction. With this change, all the ARB_texture_multisample piglits pass, so turn on PIPE_CAP_TEXTURE_MULTISAMPLE. Signed-off-by: Ilia Mirkin imir...@alum.mit.edu --- src/gallium/drivers/nouveau/codegen/nv50_ir.h | 8 +++ .../drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp | 1 + .../nouveau/codegen/nv50_ir_lowering_nv50.cpp | 60 + src/gallium/drivers/nouveau/nv50/nv50_context.h| 13 +++- src/gallium/drivers/nouveau/nv50/nv50_program.c| 7 +- src/gallium/drivers/nouveau/nv50/nv50_screen.c | 7 +- src/gallium/drivers/nouveau/nv50/nv50_tex.c| 75 +- 7 files changed, 164 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h b/src/gallium/drivers/nouveau/codegen/nv50_ir.h index 6a001d3..857980d 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h @@ -827,6 +827,14 @@ public: int isShadow() const { return descTable[target].shadow ? 
1 : 0; } int isMS() const { return target == TEX_TARGET_2D_MS || target == TEX_TARGET_2D_MS_ARRAY; } + void clearMS() { + if (isMS()) { +if (isArray()) + target = TEX_TARGET_2D_ARRAY; +else + target = TEX_TARGET_2D; + } + } Target operator=(TexTarget targ) { diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp index a6ed4b0..8f9b7de 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nv50.cpp @@ -1221,6 +1221,7 @@ CodeEmitterNV50::emitCVT(const Instruction *i) case TYPE_S32: code[1] = 0x44014000; break; case TYPE_U32: code[1] = 0x44004000; break; case TYPE_F16: code[1] = 0xc400; break; + case TYPE_U16: code[1] = 0x4400; break; default: assert(0); break; diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp index 1d13aea..984a8ca 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp @@ -549,6 +549,8 @@ private: bool handleCONT(Instruction *); void checkPredicate(Instruction *); + void loadTexMsInfo(uint32_t off, Value **ms, Value **ms_x, Value **ms_y); + void loadMsInfo(Value *ms, Value *s, Value **dx, Value **dy); private: const Target *const targ; @@ -582,6 +584,41 @@ NV50LoweringPreSSA::visit(Function *f) return true; } +void NV50LoweringPreSSA::loadTexMsInfo(uint32_t off, Value **ms, + Value **ms_x, Value **ms_y) { + // This loads the texture-indexed ms setting from the constant buffer + Value *tmp = new_LValue(func, FILE_GPR); + uint8_t b = prog-driver-io.resInfoCBSlot; + off += prog-driver-io.suInfoBase; + *ms_x = bld.mkLoadv(TYPE_U32, bld.mkSymbol( + FILE_MEMORY_CONST, b, TYPE_U32, off + 0), NULL); + *ms_y = bld.mkLoadv(TYPE_U32, bld.mkSymbol( + FILE_MEMORY_CONST, b, TYPE_U32, off + 4), NULL); + *ms = bld.mkOp2v(OP_ADD, TYPE_U32, tmp, 
*ms_x, *ms_y); +} + +void NV50LoweringPreSSA::loadMsInfo(Value *ms, Value *s, Value **dx, Value **dy) { + // Given a MS level, and a sample id, compute the delta x/y + uint8_t b = prog-driver-io.msInfoCBSlot; + Value *off = new_LValue(func, FILE_ADDRESS), *t = new_LValue(func, FILE_GPR); + + // The required information is at mslevel * 16 * 4 + sample * 8 + // = (mslevel * 8 + sample) * 8 + bld.mkOp2(OP_SHL, + TYPE_U32, + off, + bld.mkOp2v(OP_ADD, TYPE_U32, t, +bld.mkOp2v(OP_SHL, TYPE_U32, t, ms, bld.mkImm(3)), +s), + bld.mkImm(3)); + *dx = bld.mkLoadv(TYPE_U32, bld.mkSymbol( + FILE_MEMORY_CONST, b, TYPE_U32, + prog-driver-io.msInfoBase), off); + *dy = bld.mkLoadv(TYPE_U32, bld.mkSymbol( + FILE_MEMORY_CONST, b, TYPE_U32, + prog-driver-io.msInfoBase + 4), off); +} + bool NV50LoweringPreSSA::handleTEX(TexInstruction *i) { @@ -589,6 +626,29 @@ NV50LoweringPreSSA::handleTEX(TexInstruction *i) const int dref = arg; const int lod = i-tex.target.isShadow() ? (arg + 1) : arg; + // handle MS, which means looking up the MS params for this texture, and + // adjusting the input coordinates to point at the right sample. + if (i-tex.target.isMS()) { + Value *x = i-getSrc(0); + Value *y =
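For readers following the offset comment in loadMsInfo() above ("mslevel * 16 * 4 + sample * 8 = (mslevel * 8 + sample) * 8"), here is a small standalone sketch of that arithmetic. The helper names are made up for illustration and are not part of the patch; the only thing taken from the patch is the formula and the shift-add-shift shape of the emitted IR.

```c
#include <stdint.h>

/* Hypothetical helper (not in the patch): byte offset of the dx/dy pair
 * for a given MS level and sample index, in the factored form that the
 * patch emits as SHL / ADD / SHL. */
static uint32_t ms_info_offset(uint32_t ms_level, uint32_t sample)
{
   /* (ms_level * 8 + sample) * 8 */
   return ((ms_level << 3) + sample) << 3;
}

/* The unfactored form from the comment, kept only for comparison. */
static uint32_t ms_info_offset_naive(uint32_t ms_level, uint32_t sample)
{
   return ms_level * 16 * 4 + sample * 8;
}
```

Both forms give 64 bytes per MS level and 8 bytes per sample; the factored form just lets the lowering use two shifts and one add.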
[Mesa-dev] [PATCH 16/19] nv50: copy nvc0's get_sample_position implementation
Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
---
 src/gallium/drivers/nouveau/nv50/nv50_context.c | 46 +
 1 file changed, 46 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.c b/src/gallium/drivers/nouveau/nv50/nv50_context.c
index 11afc48..db3bd3a 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_context.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_context.c
@@ -196,6 +196,10 @@ nv50_invalidate_resource_storage(struct nouveau_context *ctx,
    return ref;
 }
 
+static void
+nv50_context_get_sample_position(struct pipe_context *, unsigned, unsigned,
+                                 float *);
+
 struct pipe_context *
 nv50_create(struct pipe_screen *pscreen, void *priv)
 {
@@ -239,6 +243,7 @@ nv50_create(struct pipe_screen *pscreen, void *priv)
 
    pipe->flush = nv50_flush;
    pipe->texture_barrier = nv50_texture_barrier;
+   pipe->get_sample_position = nv50_context_get_sample_position;
 
    if (!screen->cur_ctx) {
       screen->cur_ctx = nv50;
@@ -317,3 +322,44 @@ nv50_bufctx_fence(struct nouveau_bufctx *bufctx, boolean on_flush)
          nv50_resource_validate(res, (unsigned)ref->priv_data);
    }
 }
+
+static void
+nv50_context_get_sample_position(struct pipe_context *pipe,
+                                 unsigned sample_count, unsigned sample_index,
+                                 float *xy)
+{
+   static const uint8_t ms1[1][2] = { { 0x8, 0x8 } };
+   static const uint8_t ms2[2][2] = {
+      { 0x4, 0x4 }, { 0xc, 0xc } }; /* surface coords (0,0), (1,0) */
+   static const uint8_t ms4[4][2] = {
+      { 0x6, 0x2 }, { 0xe, 0x6 },   /* (0,0), (1,0) */
+      { 0x2, 0xa }, { 0xa, 0xe } }; /* (0,1), (1,1) */
+   static const uint8_t ms8[8][2] = {
+      { 0x1, 0x7 }, { 0x5, 0x3 },   /* (0,0), (1,0) */
+      { 0x3, 0xd }, { 0x7, 0xb },   /* (0,1), (1,1) */
+      { 0x9, 0x5 }, { 0xf, 0x1 },   /* (2,0), (3,0) */
+      { 0xb, 0xf }, { 0xd, 0x9 } }; /* (2,1), (3,1) */
+#if 0
+   /* NOTE: there are alternative modes for MS2 and MS8, currently not used */
+   static const uint8_t ms8_alt[8][2] = {
+      { 0x9, 0x5 }, { 0x7, 0xb },   /* (2,0), (1,1) */
+      { 0xd, 0x9 }, { 0x5, 0x3 },   /* (3,1), (1,0) */
+      { 0x3, 0xd }, { 0x1, 0x7 },   /* (0,1), (0,0) */
+      { 0xb, 0xf }, { 0xf, 0x1 } }; /* (2,1), (3,0) */
+#endif
+
+   const uint8_t (*ptr)[2];
+
+   switch (sample_count) {
+   case 0:
+   case 1: ptr = ms1; break;
+   case 2: ptr = ms2; break;
+   case 4: ptr = ms4; break;
+   case 8: ptr = ms8; break;
+   default:
+      assert(0);
+      return; /* bad sample count - undefined locations */
+   }
+   xy[0] = ptr[sample_index][0] * 0.0625f;
+   xy[1] = ptr[sample_index][1] * 0.0625f;
+}
-- 
1.8.3.2
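To make the table encoding in the patch above easier to see: each entry is a 4-bit position in 1/16ths of a pixel, and multiplying by 0.0625f turns it into a float in [0, 1). The following is only a sketch using the 4x table from the patch; the function name and the bounds check on the sample index are my additions and are not in the driver code.

```c
#include <stdint.h>

/* 4x MSAA table copied from the patch: x/y in 1/16ths of a pixel. */
static const uint8_t ms4[4][2] = {
   { 0x6, 0x2 }, { 0xe, 0x6 },
   { 0x2, 0xa }, { 0xa, 0xe } };

/* Hypothetical standalone helper: returns 0 on success, -1 if the
 * sample index is out of range (the patch itself does not check). */
static int ms4_sample_position(unsigned sample_index, float xy[2])
{
   if (sample_index >= 4)
      return -1;
   xy[0] = ms4[sample_index][0] * 0.0625f; /* 1/16 */
   xy[1] = ms4[sample_index][1] * 0.0625f;
   return 0;
}
```

Since 0.0625 is a power of two, the products are exact in floating point, so sample 0 comes out as exactly (0.375, 0.125).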
[Mesa-dev] [PATCH 07/19] nv50/ir: comment out code to allow input/immed loads
This code was missing a break, which made it ineffective. But since shader input loads have been disallowed, compile the code out with #if 0.

Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
index 18fa069..a84a54a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
@@ -310,9 +310,12 @@ TargetNV50::insnCanLoad(const Instruction *i, int s,
       if (ld->bb->getProgram()->getType() == Program::TYPE_GEOMETRY)
          return false;
       break;
+#if 0
    case 0x0d:
       if (ld->bb->getProgram()->getType() != Program::TYPE_GEOMETRY)
          return false;
+      break;
+#endif
    default:
       return false;
    }
-- 
1.8.3.2
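To illustrate why the missing break made the 0x0d case ineffective, here is a reduced sketch of the switch. It is not the real insnCanLoad(); the function names and the boolean parameter are stand-ins for illustration only.

```c
#include <stdbool.h>

/* Hypothetical reduction of the old code: mode 0x0d is meant to be
 * accepted for geometry programs, but without a break the accepted
 * path falls through into default and is rejected anyway. */
static bool can_load_without_break(int mode, bool is_geometry)
{
   switch (mode) {
   case 0x0d:
      if (!is_geometry)
         return false;
      /* no break here: falls through */
   default:
      return false;
   }
   return true;
}

/* Same switch with the break added, as in the (now #if 0'd) code. */
static bool can_load_with_break(int mode, bool is_geometry)
{
   switch (mode) {
   case 0x0d:
      if (!is_geometry)
         return false;
      break;
   default:
      return false;
   }
   return true;
}
```

With the fallthrough, the first variant returns false for every input, so the case never had any effect before this patch.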
[Mesa-dev] [PATCH 06/19] nv50/ir: disallow shader input propagation for gp
For some reason, shader input accesses don't work correctly in non-ld instructions. Disallow those loads from being propagated.

Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
---

I'm not particularly happy with this patch. Some investigation needs to happen as to what's going on here. NVIDIA's shaders include p[] accesses in various instructions just fine. Perhaps this is just masking some other bug. However this works for now for all the piglit tests in the repo.

 src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
index 52257a8..18fa069 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
@@ -297,14 +297,19 @@ TargetNV50::insnCanLoad(const Instruction *i, int s,
 
    switch (mode) {
    case 0x00:
-   case 0x01:
    case 0x03:
    case 0x08:
-   case 0x09:
    case 0x0c:
    case 0x20:
    case 0x21:
       break;
+   case 0x01:
+   case 0x09:
+      // TODO: Figure out why a[] accesses can't be propagated into non-ld
+      // instructions. Something to do with vstride maybe?
+      if (ld->bb->getProgram()->getType() == Program::TYPE_GEOMETRY)
+         return false;
+      break;
    case 0x0d:
       if (ld->bb->getProgram()->getType() != Program::TYPE_GEOMETRY)
          return false;
-- 
1.8.3.2
[Mesa-dev] [PATCH 10/19] nv50: VP_RESULT_MAP_SIZE has to be positive
Make sure that we never try to use a 0-sized map. This can happen when using a gp, so add a dummy mapping when computing vp_gp_mapping in that case.

Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
---
 src/gallium/drivers/nouveau/nv50/nv50_shader_state.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_shader_state.c b/src/gallium/drivers/nouveau/nv50/nv50_shader_state.c
index ba4f592..265ef20 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_shader_state.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_shader_state.c
@@ -457,6 +457,7 @@ nv50_fp_linkage_validate(struct nv50_context *nv50)
    BEGIN_NV04(push, NV50_3D(SEMANTIC_PRIM_ID), 1);
    PUSH_DATA (push, primid);
 
+   assert(m > 0);
    BEGIN_NV04(push, NV50_3D(VP_RESULT_MAP_SIZE), 1);
    PUSH_DATA (push, m);
    BEGIN_NV04(push, NV50_3D(VP_RESULT_MAP(0)), n);
@@ -516,6 +517,8 @@ nv50_vp_gp_mapping(uint8_t *map, int m,
          oid += mv & 1;
       }
    }
+   if (!m)
+      map[m++] = 0;
    return m;
 }
 
@@ -540,6 +543,7 @@ nv50_gp_linkage_validate(struct nv50_context *nv50)
    BEGIN_NV04(push, NV50_3D(VP_GP_BUILTIN_ATTR_EN), 1);
    PUSH_DATA (push, vp->vp.attrs[2] | gp->vp.attrs[2]);
 
+   assert(m > 0);
    BEGIN_NV04(push, NV50_3D(VP_RESULT_MAP_SIZE), 1);
    PUSH_DATA (push, m);
    BEGIN_NV04(push, NV50_3D(VP_RESULT_MAP(0)), n);
-- 
1.8.3.2
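The dummy-mapping guard in the patch above is small enough to show in isolation. This is only a sketch: finish_mapping() is a made-up name standing in for the tail of nv50_vp_gp_mapping(), and the caller is assumed to provide a map with room for at least one entry.

```c
#include <stdint.h>

/* Hypothetical stand-in for the end of nv50_vp_gp_mapping(): if no
 * entries were produced, append a single dummy entry so the caller
 * never programs a zero-sized result map. */
static int finish_mapping(uint8_t *map, int m)
{
   if (!m)
      map[m++] = 0;
   return m;
}
```

A non-empty map is returned untouched; only the empty case is padded, which is what lets the new assert(m > 0) checks hold.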
[Mesa-dev] [PATCH 04/19] nv50: allow vert_count to be >255
---
 src/gallium/drivers/nouveau/nv50/nv50_program.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_program.h b/src/gallium/drivers/nouveau/nv50/nv50_program.h
index 13b9516..f63352f 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_program.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_program.h
@@ -88,7 +88,7 @@ struct nv50_program {
 
    struct {
       ubyte primid; /* primitive id output register */
-      uint8_t vert_count;
+      uint32_t vert_count;
       uint8_t prim_type; /* point, line strip or tri strip */
    } gp;
 
-- 
1.8.3.2
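The patch above widens vert_count from uint8_t to uint32_t. As a quick illustration of what goes wrong with the narrow field, the sketch below stores a count into each width; the helper names are hypothetical and the values are just examples, not limits taken from the hardware.

```c
#include <stdint.h>

/* Hypothetical helpers: what ends up in the field for a given count. */
static uint8_t store_narrow(unsigned count)
{
   return (uint8_t)count;  /* silently keeps only the low 8 bits */
}

static uint32_t store_wide(unsigned count)
{
   return (uint32_t)count; /* keeps the full value */
}
```

Anything above 255 wraps in the 8-bit field (256 becomes 0), while the 32-bit field keeps the value as written.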
[Mesa-dev] [PATCH 15/19] nv50: add comments about CB_AUX contents
Updates a few inconsistencies as well, like the size of the buffer, location of the runout, etc.

Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
---
 src/gallium/drivers/nouveau/nv50/nv50_context.h        | 10 ++++++++++
 src/gallium/drivers/nouveau/nv50/nv50_screen.c         |  8 ++++----
 src/gallium/drivers/nouveau/nv50/nv50_state_validate.c |  2 +-
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.h b/src/gallium/drivers/nouveau/nv50/nv50_context.h
index ee6eb0e..7bf4ce3 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_context.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_context.h
@@ -70,7 +70,17 @@
 #define NV50_CB_PVP 124
 #define NV50_CB_PGP 126
 #define NV50_CB_PFP 125
+/* constant buffer permanently mapped in as c15[] */
 #define NV50_CB_AUX 127
+/* size of the buffer: 64k. not all taken up, can be reduced if needed. */
+#define NV50_CB_AUX_SIZE          (1 << 16)
+/* 8 user clip planes, at 4 32-bit floats each */
+#define NV50_CB_AUX_UCP_OFFSET    0x0
+/* 256 textures, each with 2 16-bit integers specifying the x/y MS shift */
+#define NV50_CB_AUX_MS_OFFSET     0x80
+/* 4 32-bit floats for the vertex runout, put at the end */
+#define NV50_CB_AUX_RUNOUT_OFFSET (NV50_CB_AUX_SIZE - 0x10)
+
 
 struct nv50_blitctx;
 
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
index 82b0207..9ed2d01 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
@@ -472,7 +472,7 @@ nv50_screen_init_hwctx(struct nv50_screen *screen)
    BEGIN_NV04(push, NV50_3D(CB_DEF_ADDRESS_HIGH), 3);
    PUSH_DATAh(push, screen->uniforms->offset + (3 << 16));
    PUSH_DATA (push, screen->uniforms->offset + (3 << 16));
-   PUSH_DATA (push, (NV50_CB_AUX << 16) | 0x0200);
+   PUSH_DATA (push, (NV50_CB_AUX << 16) | (NV50_CB_AUX_SIZE & 0xffff));
 
    BEGIN_NI04(push, NV50_3D(SET_PROGRAM_CB), 3);
    PUSH_DATA (push, (NV50_CB_AUX << 12) | 0xf01);
@@ -481,15 +481,15 @@ nv50_screen_init_hwctx(struct nv50_screen *screen)
 
    /* return { 0.0, 0.0, 0.0, 0.0 } on out-of-bounds vtxbuf access */
    BEGIN_NV04(push, NV50_3D(CB_ADDR), 1);
-   PUSH_DATA (push, ((1 << 9) << 6) | NV50_CB_AUX);
+   PUSH_DATA (push, (NV50_CB_AUX_RUNOUT_OFFSET << 6) | NV50_CB_AUX);
    BEGIN_NI04(push, NV50_3D(CB_DATA(0)), 4);
    PUSH_DATAf(push, 0.0f);
    PUSH_DATAf(push, 0.0f);
    PUSH_DATAf(push, 0.0f);
    PUSH_DATAf(push, 0.0f);
    BEGIN_NV04(push, NV50_3D(VERTEX_RUNOUT_ADDRESS_HIGH), 2);
-   PUSH_DATAh(push, screen->uniforms->offset + (3 << 16) + (1 << 9));
-   PUSH_DATA (push, screen->uniforms->offset + (3 << 16) + (1 << 9));
+   PUSH_DATAh(push, screen->uniforms->offset + (3 << 16) + NV50_CB_AUX_RUNOUT_OFFSET);
+   PUSH_DATA (push, screen->uniforms->offset + (3 << 16) + NV50_CB_AUX_RUNOUT_OFFSET);
 
    /* max TIC (bits 4:8) & TSC bindings, per program type */
    for (i = 0; i < 3; ++i) {
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c b/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
index 86b9a23..3d99b73 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
@@ -238,7 +238,7 @@ nv50_validate_clip(struct nv50_context *nv50)
 
    if (nv50->dirty & NV50_NEW_CLIP) {
       BEGIN_NV04(push, NV50_3D(CB_ADDR), 1);
-      PUSH_DATA (push, (0 << 8) | NV50_CB_AUX);
+      PUSH_DATA (push, (NV50_CB_AUX_UCP_OFFSET << 8) | NV50_CB_AUX);
       BEGIN_NI04(push, NV50_3D(CB_DATA(0)), PIPE_MAX_CLIP_PLANES * 4);
       PUSH_DATAp(push, &nv50->clip.ucp[0][0], PIPE_MAX_CLIP_PLANES * 4);
    }
-- 
1.8.3.2
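As a sanity check on the CB_AUX layout described by the new comments, the region sizes can be worked out from the comment text alone. The sketch below assumes NV50_CB_AUX_SIZE is 1 << 16 (the "64k" in the comment); the byte counts and the overlap helper are my own derivation for illustration, not something the patch defines.

```c
#include <stdint.h>

/* Offsets as described in the patch (names shortened for the sketch). */
#define CB_AUX_SIZE          (1 << 16)            /* assumed: 64k */
#define CB_AUX_UCP_OFFSET    0x0
#define CB_AUX_MS_OFFSET     0x80
#define CB_AUX_RUNOUT_OFFSET (CB_AUX_SIZE - 0x10)

/* Region sizes derived from the comments, in bytes. */
enum {
   UCP_BYTES    = 8 * 4 * 4,   /* 8 clip planes x 4 floats x 4 bytes */
   MS_BYTES     = 256 * 2 * 2, /* 256 textures x 2 values x 16 bits  */
   RUNOUT_BYTES = 4 * 4        /* 4 floats                           */
};

/* Hypothetical helper: do two byte ranges [a, a+a_len) and
 * [b, b+b_len) intersect? */
static int regions_overlap(uint32_t a, uint32_t a_len,
                           uint32_t b, uint32_t b_len)
{
   return a < b + b_len && b < a + a_len;
}
```

Under these assumptions the clip planes end exactly where the MS area starts (0x80), and the runout occupies the last 16 bytes of the buffer.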
[Mesa-dev] [PATCH 11/19] nv50: GP_REG_ALLOC_RESULT must be positive
Set max_out to 1 when there are no outputs.

Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
---
 src/gallium/drivers/nouveau/nv50/nv50_program.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_program.c b/src/gallium/drivers/nouveau/nv50/nv50_program.c
index f46f240..813795f 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_program.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_program.c
@@ -118,6 +118,8 @@ nv50_vertprog_assign_slots(struct nv50_ir_prog_info *info)
    }
    prog->out_nr = info->numOutputs;
    prog->max_out = n;
+   if (!prog->max_out)
+      prog->max_out = 1;
 
    if (prog->vp.psiz < info->numOutputs)
       prog->vp.psiz = prog->out[prog->vp.psiz].hw;
-- 
1.8.3.2
[Mesa-dev] [PATCH 05/19] nv50/ir: disallow predicates on emit/restart ops
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
index ade9be0..52257a8 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
@@ -130,7 +130,8 @@ void TargetNV50::initOpInfo()
    };
    static const operation noPredList[] =
    {
-      OP_CALL, OP_PREBREAK, OP_PRERET, OP_QUADON, OP_QUADPOP, OP_JOINAT
+      OP_CALL, OP_PREBREAK, OP_PRERET, OP_QUADON, OP_QUADPOP, OP_JOINAT,
+      OP_EMIT, OP_RESTART
    };
 
    for (i = 0; i < DATA_FILE_COUNT; ++i)
-- 
1.8.3.2
Re: [Mesa-dev] [PATCH 7/8] glsl: Replace iterators in ir_reader.cpp with ad-hoc list walking.
On 01/13/2014 09:49 AM, Ian Romanick wrote:
> On 01/11/2014 02:37 AM, Kenneth Graunke wrote:
>> These can't use foreach_list since they want to skip over the first few
>> list elements. Just doing the ad-hoc list walking isn't too bad.
>>
>> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
>> ---
>>  src/glsl/ir_reader.cpp | 18 ++++++++++--------
>>  1 file changed, 10 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/glsl/ir_reader.cpp b/src/glsl/ir_reader.cpp
>> index f5185d2..28923f3 100644
>> --- a/src/glsl/ir_reader.cpp
>> +++ b/src/glsl/ir_reader.cpp
>> @@ -205,11 +205,12 @@ ir_reader::read_function(s_expression *expr, bool skip_body)
>>        assert(added);
>>     }
>>
>> -   exec_list_iterator it = ((s_list *) expr)->subexpressions.iterator();
>> -   it.next(); // skip function tag
>> -   it.next(); // skip function name
>> -   for (/* nothing */; it.has_next(); it.next()) {
>> -      s_expression *s_sig = (s_expression *) it.get();
>> +   /* Skip over function tag and function name (which are guaranteed to be
>> +    * present by the above PARTIAL_MATCH call).
>> +    */
>> +   exec_node *node = ((s_list *) expr)->subexpressions.head->next->next;
>> +   for (/* nothing */; !node->is_tail_sentinel(); node = node->next) {
>> +      s_expression *s_sig = (s_expression *) node;
>
> This won't behave the same in the (bug) case that the list has too few
> elements. If the list is empty or has only one element, there will be a
> NULL deref here somewhere. I believe the iterator version was safe
> against this. Do we have some pre-existing guarantee that the list has
> enough elements?

Yes. Above:

   s_pattern pat[] = { "function", name };
   if (!PARTIAL_MATCH(expr, pat)) {
      ir_read_error(expr, "Expected (function <name> (signature ...) ...)");
      return NULL;
   }

If the list doesn't match the (partial) S-Expression (function name ...) we would have bailed by now. So the list is guaranteed to have at least two elements.

>>        read_function_sig(f, s_sig, skip_body);
>>     }
>>     return added ? f : NULL;
>> @@ -249,9 +250,10 @@ ir_reader::read_function_sig(ir_function *f, s_expression *expr, bool skip_body)
>>     exec_list hir_parameters;
>>     state->symbols->push_scope();
>>
>> -   exec_list_iterator it = paramlist->subexpressions.iterator();
>> -   for (it.next() /* skip parameters */; it.has_next(); it.next()) {
>> -      ir_variable *var = read_declaration((s_expression *) it.get());
>> +   /* Skip over the parameters tag. */
>> +   exec_node *node = paramlist->subexpressions.head->next;
>> +   for (/* nothing */; !node->is_tail_sentinel(); node = node->next) {
>> +      ir_variable *var = read_declaration((s_expression *) node);
>>        if (var == NULL)
>>           return;
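To make the concern in this exchange concrete, here is a simplified sentinel-terminated list with a walk that skips a fixed number of leading elements. It is not Mesa's exec_list (which lays out its sentinels differently); every name below is made up for the sketch. The skip loop checks for the tail sentinel explicitly, which is the guarantee the PARTIAL_MATCH call provides up front in the real code.

```c
#include <stddef.h>

struct node {
   struct node *next;
   struct node *prev;
   int value;
};

/* Hypothetical list with separate head and tail sentinel nodes. */
struct list {
   struct node head_sentinel;
   struct node tail_sentinel;
};

static void list_init(struct list *l)
{
   l->head_sentinel.next = &l->tail_sentinel;
   l->head_sentinel.prev = NULL;
   l->tail_sentinel.prev = &l->head_sentinel;
   l->tail_sentinel.next = NULL;
}

static void list_push_tail(struct list *l, struct node *n)
{
   n->next = &l->tail_sentinel;
   n->prev = l->tail_sentinel.prev;
   n->prev->next = n;
   l->tail_sentinel.prev = n;
}

/* Only the tail sentinel has a NULL next pointer. */
static int is_tail_sentinel(const struct node *n)
{
   return n->next == NULL;
}

/* Sum the values after skipping `skip` leading elements. Returns -1 if
 * the list is shorter than `skip`, instead of walking past the tail. */
static int sum_after_skip(struct list *l, unsigned skip, int *sum)
{
   struct node *n = l->head_sentinel.next;

   while (skip--) {
      if (is_tail_sentinel(n))
         return -1; /* too few elements */
      n = n->next;
   }

   *sum = 0;
   for (/* nothing */; !is_tail_sentinel(n); n = n->next)
      *sum += n->value;
   return 0;
}
```

Writing head->next->next directly, as the patch does, is the same walk with the length check hoisted out to the caller.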
Re: [Mesa-dev] [PATCH 6/8] glsl: Use a new foreach_list2 macro for walking two lists at once.
On 01/13/2014 09:58 AM, Ian Romanick wrote:
> On 01/11/2014 02:37 AM, Kenneth Graunke wrote:
>> When handling function calls, we often want to walk through the list of
>> formal parameters and list of actual parameters at the same time.
>> (Both are guaranteed to be the same length.)
>>
>> Previously, we used a pattern of:
>>
>>    exec_list_iterator 1st_iter = <1st list>.iterator();
>>    foreach_iter(exec_list_iterator, 2nd_iter, <2nd list>) {
>>       ...
>>       1st_iter.next();
>>    }
>>
>> This was a bit awkward, since you had to manually iterate through one
>> of the two lists.
>
> a bit lol.
>
>> This patch introduces a foreach_list2 macro which safely walks through
>> two lists at the same time, so you can simply do:
>>
>>    foreach_list2(1st_node, <1st list>, 2nd_node, <2nd list>) {
>>       ...
>>    }
>
> My only suggestion might be to change the name to foreach_two_lists. I
> think it's more obvious to someone reading the header file looking for
> utility macros.

Yeah, that is better. Renamed in v2. Thanks!

>> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
>> ---
>>  src/glsl/ast_function.cpp                  | 16 --
>>  src/glsl/ir.cpp                            | 12 +++---
>>  src/glsl/linker.cpp                        |  9 
>>  src/glsl/list.h                            | 16 ++
>>  src/glsl/opt_constant_folding.cpp          |  9 
>>  src/glsl/opt_constant_propagation.cpp      |  9 
>>  src/glsl/opt_constant_variable.cpp         |  9 
>>  src/glsl/opt_copy_propagation.cpp          |  9 
>>  src/glsl/opt_copy_propagation_elements.cpp |  9 
>>  src/glsl/opt_function_inlining.cpp         | 35 --
>>  src/glsl/opt_tree_grafting.cpp             | 10 -
>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 22 +++
>>  12 files changed, 73 insertions(+), 92 deletions(-)
>>
>> diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
>> index e4c0fd1..9a9bb74 100644
>> --- a/src/glsl/ast_function.cpp
>> +++ b/src/glsl/ast_function.cpp
>> @@ -293,15 +293,10 @@ generate_call(exec_list *instructions, ir_function_signature *sig,
>>      * call takes place. Since we haven't emitted the call yet, we'll place
>>      * the post-call conversions in a temporary exec_list, and emit them later.
>>      */
>> -   exec_list_iterator actual_iter = actual_parameters->iterator();
>> -   exec_list_iterator formal_iter = sig->parameters.iterator();
>> -
>> -   while (actual_iter.has_next()) {
>> -      ir_rvalue *actual = (ir_rvalue *) actual_iter.get();
>> -      ir_variable *formal = (ir_variable *) formal_iter.get();
>> -
>> -      assert(actual != NULL);
>> -      assert(formal != NULL);
>> +   foreach_list2(formal_node, &sig->parameters,
>> +                 actual_node, actual_parameters) {
>> +      ir_rvalue *actual = (ir_rvalue *) actual_node;
>> +      ir_variable *formal = (ir_variable *) formal_node;
>
> The old code asserts when the lists aren't the same length... or at
> least when sig->parameters is shorter than actual_parameters. As do the
> loops in st_glsl_to_tgsi.cpp. I think a debug-build version of
> foreach_list2 could do the same... I'm just waffling whether there's
> sufficient value to make it worth doing. Opinions?

I'd rather not. These lists are always the same length. It might be worth checking that when creating them, but making every code site that walks them assert seems like overkill.

Plus, it seems tricky to shoehorn assertions into a macro that only defines a for loop (without the body). And right now, it has the defined behavior that it stops at the shorter of the two lists, which could be useful someday.

--Ken
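For anyone curious what a two-list walking macro of this kind can look like, here is a minimal sketch in plain C99. It uses NULL-terminated singly linked lists rather than exec_list, and FOREACH_TWO_LISTS is a made-up name, so treat it as an illustration of the idea (one for statement, two cursors, stop at the shorter list) and not as the macro from the patch.

```c
#include <stddef.h>

struct node {
   struct node *next;
   int value;
};

/* Hypothetical macro: declares two cursors and advances them together.
 * The loop ends as soon as either list runs out. */
#define FOREACH_TWO_LISTS(n1, list1, n2, list2)            \
   for (struct node *n1 = (list1), *n2 = (list2);          \
        n1 != NULL && n2 != NULL;                          \
        n1 = n1->next, n2 = n2->next)

/* Example user: multiply matching elements and add them up. */
static int dot_product(struct node *a, struct node *b)
{
   int sum = 0;

   FOREACH_TWO_LISTS(x, a, y, b) {
      sum += x->value * y->value;
   }
   return sum;
}
```

Because the macro expands to a bare for header, there is no natural place to put a length assertion, which is the difficulty mentioned in the reply above.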
[Mesa-dev] [PATCH v3] opencl: improved auto-gen .icd
From 91796da9c00c0756b90b9e09d404a5357ff32ec6 Mon Sep 17 00:00:00 2001
From: Igor Gnatenko <i.gnatenko.br...@gmail.com>
Date: Sun, 12 Jan 2014 02:09:16 +0400
Subject: [PATCH] opencl: improved auto-gen .icd

v2: Use @OPENCL_VERSION@:0 for library
    replace /etc with @sysconfdir@ macros
v3: Drop libdir from icd, because libMesaOpenCL isn't private

Reported-by: Fabian Deutsch <fabian.deut...@gmx.de>
Reference: https://bugs.freedesktop.org/show_bug.cgi?id=73512
Signed-off-by: Igor Gnatenko <i.gnatenko.br...@gmail.com>
---
 configure.ac                           | 3 +++
 src/gallium/targets/opencl/Makefile.am | 4 ++--
 src/gallium/targets/opencl/mesa.icd    | 1 -
 src/gallium/targets/opencl/mesa.icd.in | 1 +
 4 files changed, 6 insertions(+), 3 deletions(-)
 delete mode 100644 src/gallium/targets/opencl/mesa.icd
 create mode 100644 src/gallium/targets/opencl/mesa.icd.in

diff --git a/configure.ac b/configure.ac
index 4b55140..3452e15 100644
--- a/configure.ac
+++ b/configure.ac
@@ -25,6 +25,8 @@ m4_ifdef([AM_PROG_AR], [AM_PROG_AR])
 dnl Set internal versions
 OSMESA_VERSION=8
 AC_SUBST([OSMESA_VERSION])
+OPENCL_VERSION=1
+AC_SUBST([OPENCL_VERSION])
 
 dnl Versions for external dependencies
 LIBDRM_REQUIRED=2.4.24
@@ -2023,6 +2025,7 @@ AC_CONFIG_FILES([Makefile
 		src/gallium/targets/egl-static/Makefile
 		src/gallium/targets/gbm/Makefile
 		src/gallium/targets/opencl/Makefile
+		src/gallium/targets/opencl/mesa.icd
 		src/gallium/targets/osmesa/Makefile
 		src/gallium/targets/osmesa/osmesa.pc
 		src/gallium/targets/pipe-loader/Makefile
diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am
index 653302c..923316c 100644
--- a/src/gallium/targets/opencl/Makefile.am
+++ b/src/gallium/targets/opencl/Makefile.am
@@ -4,7 +4,7 @@ lib_LTLIBRARIES = lib@OPENCL_LIBNAME@.la
 
 lib@OPENCL_LIBNAME@_la_LDFLAGS = \
 	$(LLVM_LDFLAGS) \
-	-version-number 1:0
+	-version-number @OPENCL_VERSION@:0
 
 lib@OPENCL_LIBNAME@_la_LIBADD = \
 	$(top_builddir)/src/gallium/auxiliary/pipe-loader/libpipe_loader.la \
@@ -34,7 +34,7 @@ lib@OPENCL_LIBNAME@_la_SOURCES =
 nodist_EXTRA_lib@OPENCL_LIBNAME@_la_SOURCES = dummy.cpp
 
 if HAVE_CLOVER_ICD
-icddir = /etc/OpenCL/vendors/
+icddir = @sysconfdir@/OpenCL/vendors/
 icd_DATA = mesa.icd
 endif
 
diff --git a/src/gallium/targets/opencl/mesa.icd b/src/gallium/targets/opencl/mesa.icd
deleted file mode 100644
index 6a6a870..0000000
--- a/src/gallium/targets/opencl/mesa.icd
+++ /dev/null
@@ -1 +0,0 @@
-libMesaOpenCL.so
diff --git a/src/gallium/targets/opencl/mesa.icd.in b/src/gallium/targets/opencl/mesa.icd.in
new file mode 100644
index 0000000..1b77b4e
--- /dev/null
+++ b/src/gallium/targets/opencl/mesa.icd.in
@@ -0,0 +1 @@
+lib@OPENCL_LIBNAME@.so.@OPENCL_VERSION@
-- 
1.8.4.2

-- 
-Igor Gnatenko
[Mesa-dev] [Bug 73571] New: [clover] OpenCL segfault in gegl 'clones' test
https://bugs.freedesktop.org/show_bug.cgi?id=73571

Priority: medium
Bug ID: 73571
Assignee: mesa-dev@lists.freedesktop.org
Summary: [clover] OpenCL segfault in gegl 'clones' test
Severity: normal
Classification: Unclassified
OS: Linux (All)
Reporter: jano.ves...@gmail.com
Hardware: x86-64 (AMD64)
Status: NEW
Version: git
Component: Other
Product: Mesa

Created attachment 91998
  --> https://bugs.freedesktop.org/attachment.cgi?id=91998&action=edit
gegl don't ask for cl/gl extensions

The tests/compositions/clones.xml test from the gegl test suite segfaults when using mesa OpenCL on a Radeon HD 7570 (AMD Turks).

I tried running it in gdb; here's the backtrace:

[New Thread 0x7fffca312700 (LWP 8187)]

Program received signal SIGSEGV, Segmentation fault.
0x7fffc8b0 in ?? ()
(gdb) bt
#0  0x7fffc8b0 in ?? ()
#1  0x7fffc960 in ?? ()
#2  0x7fffe6238202 in (anonymous namespace)::InlineSpiller::insertSpill(unsigned int, bool, llvm::MachineBasicBlock::bundle_iterator<llvm::MachineInstr, llvm::ilist_iterator<llvm::MachineInstr> >) () from /home/vesely/.local/lib/libLLVMCodeGen.so
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

llvm, clang, mesa, libclc, gegl, babl are all latest git as of today.

Note that I had to patch gegl in order to use OpenCL on mesa at all (it requires some GL/CL extensions). The patch is attached.

Note that the same test crashes when using intel-ocl too.
Re: [Mesa-dev] Naming everything in src/gallium/drivers/radeonsi si_*
I don't have an fdo account or push rights. Can somebody else push it for me, please? I've added the Reviewed-by: lines, so the patches only need to be pushed now.

On Monday 13 January 2014 11:22:07 Marek Olšák wrote:
> For the series:
>
> Reviewed-by: Marek Olšák <marek.ol...@amd.com>
>
> Feel free to push this.
>
> Marek
>
> On Sat, Jan 11, 2014 at 4:20 PM, Andreas Hartmetz <ahartm...@gmail.com> wrote:
>> Continuing here because the threads had diverged...
>>
>> I've updated the patch series under the same URL and applied all the
>> suggested improvements. The variable renames are still in, but at the
>> very end so they are trivial to omit.
>>
>> On Tuesday 07 January 2014 17:27:56 Andreas Hartmetz wrote:
>>> We have talked on IRC meanwhile: Everywhere was supposed to mean file
>>> names and data structures.
>>>
>>> I have made a patch series (git link because file renames produce
>>> huge diffs) that renames *everything* away from r600 (and also
>>> radeonsi) to si, where it is actually about SI. In the code so
>>> modified, it is then clear at first glance that only resources,
>>> textures and some other low-level interface code from R600 / generic
>>> Radeon are actually used in SI code.
>>>
>>> The patch series is ordered by increasing controversy potential due
>>> to destruction of git blame history, so the last parts can be omitted
>>> if they are deemed too destructive to history. In my opinion, it is
>>> better to have code that is readable now than code that is less
>>> readable but with the possibility to look up how it became like that.
>>>
>>> Michel said on IRC that he'd prefer to keep the name
>>> radeonsi_pipe.h/c. I disagree: If the library name is to be kept,
>>> there must be a break between radeonsi and si *somewhere*, and it is
>>> normal for library names to not correspond to any file name in the
>>> library. The same scheme is used in llvmpipe: llvmpipe lib /
>>> directory versus lp_* file names.
>>>
>>> Here's the repository (branch is master):
>>> git git://anongit.kde.org/scratch/ahartmetz/mesa.git
>>> web http://quickgit.kde.org/?p=scratch%2Fahartmetz%2Fmesa.git
>>>
>>> On Monday 06 January 2014 15:50:05 Marek Olšák wrote:
>>>> It sounds good, but I'd like the prefix to be si_ everywhere.
>>>>
>>>> Marek
>>>>
>>>> On Mon, Jan 6, 2014 at 2:47 PM, Andreas Hartmetz <ahartm...@gmail.com> wrote:
>>>>> Hello,
>>>>>
>>>>> many of the files in radeonsi originally came from other places
>>>>> where they had different names and were never renamed. Most of them
>>>>> now have names that don't tell what the files are for (r600 is not
>>>>> actually the first hardware supported by them, they start at
>>>>> radeonsi), and even those with radeonsi are split between radeonsi_
>>>>> and si_.
>>>>>
>>>>> si_ is shorter than radeonsi_, but inconsistent with the directory
>>>>> and library name. I still think it's the best option, but no strong
>>>>> opinion from me.
>>>>>
>>>>> If and when the files are renamed, the next step would be doing the
>>>>> same with the r600_ struct and function names.
>>>>>
>>>>> Does that sound good? I'll send the patches shortly if so.
>>>>>
>>>>> Cheers,
>>>>> Andreas
[Mesa-dev] [Bug 73571] [clover] OpenCL segfault in gegl 'clones' test
https://bugs.freedesktop.org/show_bug.cgi?id=73571 --- Comment #1 from Jan Vesely jano.ves...@gmail.com --- There are 4 more tests that were failing for a different reason, but after applying http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20131216/199497.html they segfault in the same way: contrast-curve, pixelize, posterize, weighted blend. Note that all of these tests use conversion kernels from (gegl)/opencl/colors.cl. Patching gegl to not use OpenCL makes the tests pass.
Re: [Mesa-dev] [PATCH 05/10] i965: Use Global GTT for Sandybridge post-sync non-zero workaround.
Kenneth Graunke kenn...@whitecape.org writes: On 01/09/2014 10:03 PM, Eric Anholt wrote: Eric Anholt e...@anholt.net writes: Kenneth Graunke kenn...@whitecape.org writes: The kernel doesn't even set up the aliasing PPGTT on Sandybridge, so any writes marked as PPGTT will likely just get dropped on the floor. The hardware bug is that writes not marked as GTT are still looked up in the GTT anyway. The kernel does set up the PPGTT, which is how we found we needed to put in the kernel workaround based on DOMAIN_INSTRUCTION (of binding the target buffer to the gtt as well as the ppgtt, since the writes landed in the wrong place) I don't think this patch will change anything, but it seems reasonable if the commit message is updated. Actually, thinking about it more, I'd rather not explicitly use global GTT, unless the function is also renamed to gen6_emit_post_sync_nonzero_workaround, since now this function on non-gen6 would reference GTT memory in its instruction, but the kernel wouldn't put anything in the GTT. (I'd rather just leave the workaround as is, myself). Okay, sounds like this is unnecessary. But...the next patch (helper function for writes) causes this to use PIPE_CONTROL_GLOBAL_GTT_WRITE on SNB only, and PPGTT on Gen7+. Oh, right. I'm fine with this as-is, then (r-b).
Re: [Mesa-dev] [PATCH 08/10] i965: Introduce an OUT_RELOC64 macro.
Kenneth Graunke kenn...@whitecape.org writes: On 01/09/2014 09:31 PM, Eric Anholt wrote: Kenneth Graunke kenn...@whitecape.org writes: On 12/13/2013 09:28 AM, Daniel Vetter wrote: On Thu, Dec 12, 2013 at 01:26:40AM -0800, Kenneth Graunke wrote: Broadwell uses 48-bit addresses. The first DWord is the low 32 bits, and the second DWord is the high 16 bits. Since individual buffers shouldn't be larger than 4GB in size, any offsets into those buffers (buffer-offset + delta) should fit in the low 32 bits. So I believe we can simply emit 0 for the high 16-bits, and drm_intel_bo_emit_reloc() should patch it up. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/intel_batchbuffer.h | 5 + 1 file changed, 5 insertions(+) diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h b/src/mesa/drivers/dri/i965/intel_batchbuffer.h index 159f928..128eed9 100644 --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h @@ -178,6 +178,11 @@ void intel_batchbuffer_cached_advance(struct brw_context *brw); read_domains, write_domain, delta); \ } while (0) +/* Handle 48-bit address relocations for Gen8+ */ +#define OUT_RELOC64(buf, read_domains, write_domain, delta) \ + OUT_RELOC(buf, read_domains, write_domain, delta); \ + OUT_BATCH(0); Please not. The presumed_offset that libdrm uses is 64bits, and you need to emit the full presumed address (and correctly shifted). Atm the kernel never gives you a presumed reloc offset with the high bits set so it doesn't matter. But I'd prefer if we don't need to make this opt-in behaviour once we enable address spaces with more than 4G. i-g-t gets away with the cheap hack since we're allowed to break igt. Let me check ddx and libva whether I've lost this fight already ... -Daniel I'm more than happy to do the right thing, I just don't know what that is. 
I don't see any uint64_t values in the interface we use at all: OUT_RELOC becomes ret = drm_intel_bo_emit_reloc(brw-batch.bo, 4*brw-batch.used, buffer, delta, read_domains, write_domain); The libdrm ABI is a disaster. bo-offset is a long, so we're keeping 32 bits of the kernel's returned value on 32 bit userspace, and 64 bits on 64 bit userspace. This means that on 32-bit we'll write in an expected-incorrect offset in the presumed offset for a 4g-located BO, which the kernel will map and fix up at exec time. On 64-bit, your patch would write an expected-incorrect 32-bit value into the batch, but libdrm would tell the kernel the full expected 64 bit value in the presumed_offset field, and you'll get brokenness for 4g buffers. So, I think you do need a drm_intel_bo_emit_reloc64 that returns a uint64_t value that the kernel wrote into the presumed offset, which you then plug into your batchbuffer. (In other news, while thinking about this, there are some obscure races with buffer migration due to presumed_offset being read at a separate time from when we look up bo-offset to actually write the offset into the batch, in the presence of context sharing in GL). I'd really like to land this patch as-is, since I need it to land the rest of my Broadwell code. I would update the commit message to note that it's broken for 4G currently. I don't like landing known-broken code that will give you mysterious hangs under memory pressure. I could possibly ack this if there was a WARN_ON_ONCE or just having it be a stub or something, but kind of works except when you start running a big app or run something for a long time is not cool. pgpFRf2zYI20f.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
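To make the objection above concrete, here is a minimal sketch in C of what emitting the full 64-bit presumed address looks like, as opposed to writing a hard-coded 0 into the second DWord. Everything in it is hypothetical: `toy_batch` and `toy_out_reloc64` are illustration names, not Mesa or libdrm interfaces, and the sketch ignores the relocation bookkeeping that the real drm_intel_bo_emit_reloc() call performs.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_BATCH_DWORDS 64

/* Hypothetical stand-in for a batchbuffer; not Mesa's real structure. */
struct toy_batch {
    uint32_t dwords[TOY_BATCH_DWORDS];
    size_t used;
};

/*
 * Write a 64-bit presumed address as two DWords: low 32 bits first,
 * then the high bits. Returns 0 on success, -1 if the batch is full.
 */
static int
toy_out_reloc64(struct toy_batch *batch, uint64_t presumed_offset,
                uint32_t delta)
{
    uint64_t address;

    if (batch->used + 2 > TOY_BATCH_DWORDS)
        return -1;

    address = presumed_offset + delta;
    batch->dwords[batch->used++] = (uint32_t)(address & 0xffffffffu);
    batch->dwords[batch->used++] = (uint32_t)(address >> 32);
    return 0;
}
```

With a buffer presumed to sit above 4G, the second DWord is non-zero, which is exactly the case the hard-coded OUT_BATCH(0) in the patch would get wrong.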
[Mesa-dev] [Bug 73571] [clover] OpenCL segfault in gegl 'clones' test
https://bugs.freedesktop.org/show_bug.cgi?id=73571

--- Comment #2 from Jan Vesely jano.ves...@gmail.com ---
I should have noted that my llvm git includes http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20131216/199497.html. Without these patches the backtrace in clones.xml test looks like this:

Program received signal SIGSEGV, Segmentation fault.
0x7fffee163705 in clover::kernel::global_argument::set (this=0xfee690, size=8, value=0x0) at core/kernel.cpp:330
330 buf = objbuffer(*(cl_mem *)value);
(gdb) bt
#0 0x7fffee163705 in clover::kernel::global_argument::set (this=0xfee690, size=8, value=0x0) at core/kernel.cpp:330
#1 0x7fffee1af9d6 in clSetKernelArg (d_kern=0x115e658, idx=1, size=8, value=0x0) at api/kernel.cpp:98
#2 0x77db32c2 in gegl_operation_point_composer_cl_process (level=0, result=0xc7f1d0, output=0x106b4b0, aux=0x0, input=0xe152f0, operation=0x9dd010) at gegl-operation-point-composer.c:195
#3 gegl_operation_point_composer_process (operation=0x9dd010, input=0xe152f0, aux=0x0, output=0x106b4b0, result=0xc7f1d0, level=0) at gegl-operation-point-composer.c:246
#4 0x77db2bc4 in gegl_operation_composer_process2 (operation=0x9dd010, context=<optimized out>, output_prop=<optimized out>, result=0xc7f1d0, level=0) at gegl-operation-point-composer.c:117
#5 0x77dbbe46 in gegl_graph_process (path=0xcc1020) at gegl-graph-traversal.c:418
#6 0x77dbb268 in gegl_eval_manager_apply (self=self@entry=0x81df40, roi=roi@entry=0xb89140) at gegl-eval-manager.c:133
#7 0x77db67ed in gegl_node_apply_roi (self=self@entry=0xf8d030, roi=roi@entry=0xb89140) at gegl-node.c:887
#8 0x77db6c53 in gegl_node_blit (self=0xf8d030, scale=scale@entry=1, roi=roi@entry=0xb89140, format=0x63ee60, destination_buf=destination_buf@entry=0x1284530, rowstride=rowstride@entry=0, flags=flags@entry=GEGL_BLIT_DEFAULT) at gegl-node.c:948
#9 0x77dbd0be in render_rectangle (processor=0xd7a560) at gegl-processor.c:502
#10 gegl_processor_render (progress=0x0, rectangle=0xd7a580, processor=0xd7a560) at gegl-processor.c:642
#11 gegl_processor_work (processor=processor@entry=0xd7a560, progress=progress@entry=0x0) at gegl-processor.c:777
#12 0x77db68b2 in gegl_node_process (self=<optimized out>) at gegl-node.c:1610
#13 0x00401d27 in main (argc=6, argv=0x7fffe008) at gegl.c:232
Re: [Mesa-dev] [PATCH] i965: Use sample barycentric coordinates with per sample shading
On Fri, Jan 10, 2014 at 5:25 PM, Anuj Phogat anuj.pho...@gmail.com wrote: On Thu, Jan 9, 2014 at 4:34 PM, Chris Forbes chr...@ijw.co.nz wrote: Hi Anuj, There's one fiddly interaction that I don't think this handles quite right, although I think it does conform. Suppose we have this fragment shader: #version 330 #extension ARB_gpu_shader5: require sample in vec4 a; in vec4 b; ... Then `b` is being evaluated at the sample position as well. This is allowed by my reading of the spec, but probably not what the author expected. Good catch. From the ARB_gpu_shader5 spec, emphasis mine: (11) Should we support per-sample interpolation of attributes? If so, how? RESOLVED. Yes. When multisample rasterization is enabled, qualifying one or more fragment shader inputs with sample will force per-sample interpolation of those attributes. If the same shader includes other fragment inputs not qualified with sample, those attributes _may_ be interpolated per-pixel (i.e., all samples get the same values, likely evaluated at the pixel center). What do you think? I agree with your interpretation. Spec seems to be flexible about it. I'll check what NVIDIA does in this case. This should be easy to fix if we need to. I verified that NVIDIA doesn't evaluate variable 'b' at sample position. I'll send out an updated patch to match this behavior. -- Chris ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] EXTERNAL: Re: OpenCL Clang/Clover Offline Compilation issue
Tom, Thanks for your response. I am very interested in implementing this, so any pointers you can provide would be greatly appreciated. I don't have access to IRC at work (at least I doubt I do) due to firewalls - but I can use the mailing list. I wasn't entirely sure about the proper clang command line, so I wrote another program which does the online compile, then saves the output away. I think I can produce an appropriate binary now. I am currently using a Radeon 6670; so I assume it will be: -mcpu=turks It looks like the LLVM output from clang is identical with either -mcpu=turks or -mcpu=r600. I can't seem to make clang output a binary file. (I figure I'm not using clang correctly) Since I can capture the binary with another C program (I think) I'm not too worried about using clang/llvm directly yet. Thanks! -Al -Original Message- From: Tom Stellard [mailto:t...@stellard.net] Sent: Monday, January 13, 2014 1:12 PM To: Dorrington, Albert Cc: mesa-dev@lists.freedesktop.org Subject: EXTERNAL: Re: [Mesa-dev] OpenCL Clang/Clover Offline Compilation issue On Thu, Jan 09, 2014 at 12:49:51PM +, Dorrington, Albert wrote: I am not sure if this is the appropriate list on which to ask this question, if not hopefully someone can suggest an alternative. Under Linux, I am attempting to perform an offline compile of an OpenCL kernel example using Clang, and then load that binary using the clCreateProgramWithBinary() function. Unfortunately, while clover is loading the binary, I end up getting a segmentation fault: Program received signal SIGSEGV, Segmentation fault. proc (v=..., is=...) at core/module.cpp:50 50T x; I have pasted the source code I am using below, for both the kernel and the host code. 
I am compiling with the following commands: clang -target r600-unknown-unknown -x cl -S -emit-llvm -mcpu=r600 kernel.cl -o kernel.clbin I'm surprised that this works, since the r600 GPU does not support OpenCL (Note that R600 is the name of the target and also one of the individual GPUs supported by the compiler). The argument of -mcpu= needs to be the GPU you are compiling the code for. So if you have a redwood GPU you would need to pass -mcpu=redwood. However, the main issue here is that clover does not support clCreateProgramWithBinary() yet. If you are interested in implementing this, I can give you some pointers. Just send an email to the list or ping me on irc (nick: tstellar on #radeon @ irc.freenode.net). -Tom clang -g -L/usr/local/lib -lOpenCL offline_host.c -o offline_host I have LLVM/Clang 3.4RC3 installed and Mesa 10.0.1. If anyone has suggestions, or can point me to the appropriate mailing list or documentation, I'd appreciate it. Thanks! -Al
Re: [Mesa-dev] EXTERNAL: Re: OpenCL Clang/Clover Offline Compilation issue
On Mon, Jan 13, 2014 at 06:44:15PM +, Dorrington, Albert wrote: Tom, Thanks for your response. I am very interested in implementing this, so any pointers you can provide would be greatly appreciated. I'm cc'ing Francisco since he may also have some feedback. The first step is to build a clover::module object from the binary code. When we compile OpenCL C, we use the build_module_llvm() function in llvm/invocation.cpp to do this. This function takes LLVM IR as input (stored in the llvm::Module object) and produces a clover::module as output. With clCreateProgramWithBinary() we build a clover::module by deserializing the binary code using the module::deserialize function declared in module.cpp. This function expects the binary code to use a specific format; the code that is output from Clang/LLVM is not in the expected format, which is probably why this is crashing for you. I don't think this format is documented anywhere, but you should be able to deduce it by looking through the code in core/module.cpp. The challenge is to get Clang/LLVM to produce code in the correct format. I think the correct way to do this would be to add a new triple, something like r600-clover-unknown, and then have the code emitter produce clover formatted code when it is passed this triple. However, I would recommend not worrying about the triple for now and just change the code emitter to emit clover's format. Once this is working, then we can go back and add the new triple. Once LLVM is producing the correct format, you will need to find a way for clover to communicate to the drivers that the code being passed is binary and not whatever its preferred IR is. One way to do this is to add the enum pipe_shader_ir ir_type; field to struct pipe_compute_state and use this to tell the drivers what kind of IR it has. You will also need to add the PIPE_SHADER_IR_BINARY type to enum pipe_shader_ir. Then you will need to implement support for PIPE_SHADER_IR_BINARY in r600g.
The code for doing this is already there; you will just need to add a code path which skips over all of the LLVM compilation stages. Hopefully, this will help get you started. When it comes to generating a binary from clang and llvm, here is the clang invocation I use: clang -o test.o -target r600-unknown-unknown -mcpu=redwood -integrated-as -c test.cl Note that this will work only if you use non-vector types and don't use any builtin functions. To cover all use cases you can use the attached shell script to compile the code. -Tom I don't have access to IRC at work (at least I doubt I do) due to firewalls - but I can use the mailing list. I wasn't entirely sure about the proper clang command line, so I wrote another program which does the online compile, then saves the output away. I think I can produce an appropriate binary now. I am currently using a Radeon 6670; so I assume it will be: -mcpu=turks It looks like the LLVM output from clang is identical with either -mcpu=turks or -mcpu=r600. I can't seem to make clang output a binary file. (I figure I'm not using clang correctly) Since I can capture the binary with another C program (I think) I'm not too worried about using clang/llvm directly yet. Thanks! -Al -Original Message- From: Tom Stellard [mailto:t...@stellard.net] Sent: Monday, January 13, 2014 1:12 PM To: Dorrington, Albert Cc: mesa-dev@lists.freedesktop.org Subject: EXTERNAL: Re: [Mesa-dev] OpenCL Clang/Clover Offline Compilation issue On Thu, Jan 09, 2014 at 12:49:51PM +, Dorrington, Albert wrote: I am not sure if this is the appropriate list on which to ask this question, if not hopefully someone can suggest an alternative. Under Linux, I am attempting to perform an offline compile of an OpenCL kernel example using Clang, and then load that binary using the clCreateProgramWithBinary() function. Unfortunately, while clover is loading the binary, I end up getting a segmentation fault: Program received signal SIGSEGV, Segmentation fault.
proc (v=..., is=...) at core/module.cpp:50 50T x; I have pasted the source code I am using below, for both the kernel and the host code. I am compiling with the following commands: clang -target r600-unknown-unknown -x cl -S -emit-llvm -mcpu=r600 kernel.cl -o kernel.clbin I'm surprised that this works, since the r600 GPU does not support OpenCL (Note that R600 is the name of the target and also one of the individual GPUs supported by the compiler). The argument of -mcpu= needs to be the GPU you are compiling the code for. So if you have a redwood GPU you would need to pass -mcpu=redwood. However, the main issue here is that clover does not support clCreateProgramWithBinary() yet. If you are interested in implementing this, I can give you some pointers. Just send an email to the list or ping me on irc (nick: tstellar on #radeon @ irc.freenode.net). -Tom
[Mesa-dev] [Bug 73512] [clover] mesa.icd. should contain full path
https://bugs.freedesktop.org/show_bug.cgi?id=73512 --- Comment #8 from Tom Stellard tstel...@gmail.com --- (In reply to comment #7) Created attachment 91973 [PATCH v3] opencl: improved auto-gen .icd v2: Use @OPENCL_VERSION@:0 for library; replace /etc with @sysconfdir@ macros v3: Drop libdir from icd, because libMesaOpenCL isn't private If we install the *.icd file to @sysconfdir@ and not /etc, then standards-compliant ICD loaders will not work with clover. The way I interpret the spec, we have no choice but to install it to /etc. Why is it necessary to use @sysconfdir@?
[Mesa-dev] How to contribute a translation?
Hi, I'd like to translate the DRI driver options (src/mesa/drivers/dri/common/xmlpool) to the Catalan language. What is the procedure for adding new translations? What tool should I use to generate ca.po, and how do I submit the file for review? -Alex
Re: [Mesa-dev] [PATCH 08/10] i965: Introduce an OUT_RELOC64 macro.
On 01/13/2014 01:04 PM, Eric Anholt wrote: Kenneth Graunke kenn...@whitecape.org writes: On 01/09/2014 09:31 PM, Eric Anholt wrote: Kenneth Graunke kenn...@whitecape.org writes: On 12/13/2013 09:28 AM, Daniel Vetter wrote: On Thu, Dec 12, 2013 at 01:26:40AM -0800, Kenneth Graunke wrote: Broadwell uses 48-bit addresses. The first DWord is the low 32 bits, and the second DWord is the high 16 bits. Since individual buffers shouldn't be larger than 4GB in size, any offsets into those buffers (buffer-offset + delta) should fit in the low 32 bits. So I believe we can simply emit 0 for the high 16-bits, and drm_intel_bo_emit_reloc() should patch it up. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/intel_batchbuffer.h | 5 + 1 file changed, 5 insertions(+) diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h b/src/mesa/drivers/dri/i965/intel_batchbuffer.h index 159f928..128eed9 100644 --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h @@ -178,6 +178,11 @@ void intel_batchbuffer_cached_advance(struct brw_context *brw); read_domains, write_domain, delta); \ } while (0) +/* Handle 48-bit address relocations for Gen8+ */ +#define OUT_RELOC64(buf, read_domains, write_domain, delta) \ + OUT_RELOC(buf, read_domains, write_domain, delta); \ + OUT_BATCH(0); Please not. The presumed_offset that libdrm uses is 64bits, and you need to emit the full presumed address (and correctly shifted). Atm the kernel never gives you a presumed reloc offset with the high bits set so it doesn't matter. But I'd prefer if we don't need to make this opt-in behaviour once we enable address spaces with more than 4G. i-g-t gets away with the cheap hack since we're allowed to break igt. Let me check ddx and libva whether I've lost this fight already ... -Daniel I'm more than happy to do the right thing, I just don't know what that is. 
I don't see any uint64_t values in the interface we use at all: OUT_RELOC becomes ret = drm_intel_bo_emit_reloc(brw-batch.bo, 4*brw-batch.used, buffer, delta, read_domains, write_domain); The libdrm ABI is a disaster. bo-offset is a long, so we're keeping 32 bits of the kernel's returned value on 32 bit userspace, and 64 bits on 64 bit userspace. This means that on 32-bit we'll write in an expected-incorrect offset in the presumed offset for a 4g-located BO, which the kernel will map and fix up at exec time. On 64-bit, your patch would write an expected-incorrect 32-bit value into the batch, but libdrm would tell the kernel the full expected 64 bit value in the presumed_offset field, and you'll get brokenness for 4g buffers. So, I think you do need a drm_intel_bo_emit_reloc64 that returns a uint64_t value that the kernel wrote into the presumed offset, which you then plug into your batchbuffer. (In other news, while thinking about this, there are some obscure races with buffer migration due to presumed_offset being read at a separate time from when we look up bo-offset to actually write the offset into the batch, in the presence of context sharing in GL). I'd really like to land this patch as-is, since I need it to land the rest of my Broadwell code. I would update the commit message to note that it's broken for 4G currently. I don't like landing known-broken code that will give you mysterious hangs under memory pressure. I could possibly ack this if there was a WARN_ON_ONCE or just having it be a stub or something, but kind of works except when you start running a big app or run something for a long time is not cool. Well, hooray for double standards, given that every other userspace component has landed this code, but didn't bother to even consolidate it into one easily fixable place... It's been over a year since I wrote most of this code, and I would REALLY like to actually land some things. But fine, I'll go write some libdrm patches... 
--Ken
[Mesa-dev] [Bug 72926] Memory corruption (crash) in draw/draw_pt_fetch_shade_pipeline_llvm.c:435
https://bugs.freedesktop.org/show_bug.cgi?id=72926

Peter Wu lekenst...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #91053|0                           |1
        is obsolete|                            |

--- Comment #6 from Peter Wu lekenst...@gmail.com ---
Created attachment 92000
  --> https://bugs.freedesktop.org/attachment.cgi?id=92000&action=edit
gdb bt full for smaller C program robot
[Mesa-dev] [Bug 73512] [clover] mesa.icd. should contain full path
https://bugs.freedesktop.org/show_bug.cgi?id=73512 --- Comment #9 from Igor Gnatenko i.gnatenko.br...@gmail.com --- (In reply to comment #8) (In reply to comment #7) Created attachment 91973 [PATCH v3] opencl: improved auto-gen .icd v2: Use @OPENCL_VERSION@:0 for library; replace /etc with @sysconfdir@ macros v3: Drop libdir from icd, because libMesaOpenCL isn't private If we install the *.icd file to @sysconfdir@ and not /etc, then standards-compliant ICD loaders will not work with clover. The way I interpret the spec, we have no choice but to install it to /etc. Why is it necessary to use @sysconfdir@? Why can't I install Mesa in /usr/local or in /opt? I don't think there are any problems there. Should I update the patch without these macros?
[Mesa-dev] [Bug 72926] Memory corruption (crash) in draw/draw_pt_fetch_shade_pipeline_llvm.c:435
https://bugs.freedesktop.org/show_bug.cgi?id=72926

Peter Wu lekenst...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #91216|0                           |1
        is obsolete|                            |

--- Comment #7 from Peter Wu lekenst...@gmail.com ---
Created attachment 92001
  --> https://bugs.freedesktop.org/attachment.cgi?id=92001&action=edit
smaller apitrace output for robot program

This is a smaller test case, the previous gdb output was generated using Mesa 10.0.2 + LLVM 3.4.

./configure line:

    LDFLAGS='-fsanitize=address -lasan' CFLAGS='-g -O0 -fsanitize=address -fno-omit-frame-pointer' \
    CXXFLAGS=$CFLAGS \
    ./configure --enable-debug --prefix=/tmp/mesa-root \
    --with-gallium-drivers=swrast --with-llvm-shared-libs \
    --enable-gallium-llvm --enable-shared-glapi --enable-dri \
    --enable-glx --with-dri-drivers=
Re: [Mesa-dev] [wip 2/9] glsl: serialize methods for IR instructions
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote: diff --git a/src/glsl/ir_serialize.cpp b/src/glsl/ir_serialize.cpp new file mode 100644 index 000..30ca018 --- /dev/null +++ b/src/glsl/ir_serialize.cpp @@ -0,0 +1,392 @@ +/* -*- c++ -*- */ +/* + * Copyright © 2013 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. 
+ */ + +#include ir_serialize.h + + +/** + * Wraps serialization of an ir instruction, writes ir_type + * and length of each instruction package as a header for it + */ +void +ir_instruction::serialize(memory_writer mem) +{ + uint32_t data_len = 0; + uint8_t ir_type = this-ir_type; + mem.write_uint8_t(ir_type); + + int32_t start_pos = mem.position(); + mem.write_uint32_t(data_len); + + this-serialize_data(mem); + + data_len = mem.position() - start_pos - sizeof(data_len); + mem.overwrite(data_len, sizeof(data_len), start_pos); This function isn't checking the return values from mem.write_*(), so there's no way for it to detect failure. Also, since this function returns void, there's no way for it to notify the caller of failure. A similar comment applies to all of the other serialize*() functions in this patch. (Of course, considering our previous discussion about potentially removing these int return values, this issue may be moot). +} + + + + +static void +serialize_glsl_type(const glsl_type *type, memory_writer mem) The last time I reviewed this series, I mentioned the idea of making a hashtable that maps each glsl_type to a small integer, so that we could serialize each type just once (see http://lists.freedesktop.org/archives/mesa-dev/2013-November/047740.html). At the time, it sounded like you liked that idea. Have you made that change? It looks to me like you've stopped serializing the built-in types, but user-defined types are still serialized each time they occur. With those two issues addressed, the patch is: Reviewed-by: Paul Berry stereotype...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
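The header-then-backpatch pattern in ir_instruction::serialize(), together with the review point about unchecked writes, can be sketched in plain C as follows. This is only an illustration under stated assumptions: `toy_writer`, `toy_write` and `toy_serialize_record` are hypothetical names, not the memory_writer class from the patch series, and the sticky error flag is one possible way to surface a failed write to the caller, not necessarily what the series ended up doing.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size writer; not the memory_writer from the patch. */
struct toy_writer {
    uint8_t buf[256];
    size_t pos;
    int error;      /* sticky: once set, later writes are ignored */
};

static void
toy_write(struct toy_writer *w, const void *data, size_t len)
{
    if (w->error || len > sizeof(w->buf) - w->pos) {
        w->error = 1;
        return;
    }
    memcpy(w->buf + w->pos, data, len);
    w->pos += len;
}

/*
 * Write one record as: type byte, 32-bit payload length, payload.
 * The length is written as a placeholder first and patched afterwards,
 * mirroring the start_pos / overwrite() sequence in the patch.
 * Returns 0 on success, -1 if any write failed.
 */
static int
toy_serialize_record(struct toy_writer *w, uint8_t type,
                     const void *payload, uint32_t payload_len)
{
    uint32_t data_len = 0;
    size_t len_pos;

    toy_write(w, &type, sizeof(type));
    len_pos = w->pos;
    toy_write(w, &data_len, sizeof(data_len));  /* placeholder */
    toy_write(w, payload, payload_len);

    if (w->error)
        return -1;

    data_len = (uint32_t)(w->pos - len_pos - sizeof(data_len));
    memcpy(w->buf + len_pos, &data_len, sizeof(data_len));
    return 0;
}
```

Because the error flag is sticky, the individual write calls need no checks of their own; a single test at the end of each record is enough for the caller to learn that serialization failed.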
[Mesa-dev] [Bug 73512] [clover] mesa.icd. should contain full path
https://bugs.freedesktop.org/show_bug.cgi?id=73512

Igor Gnatenko i.gnatenko.br...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #91973|0                           |1
        is obsolete|                            |

--- Comment #10 from Igor Gnatenko i.gnatenko.br...@gmail.com ---
Created attachment 92004
  --> https://bugs.freedesktop.org/attachment.cgi?id=92004&action=edit
[PATCH v4] opencl: improved auto-gen .icd

v2: Use @OPENCL_VERSION@:0 for library
    replace /etc with @sysconfdir@ macros
v3: Drop libdir from icd, because libMesaOpenCL isn't private
v4: install ocl vendor always to /etc
Re: [Mesa-dev] [PATCH] automake: include the git sha in the opengl version string for oot builds
On Mon, Jan 13, 2014 at 02:02:12AM +, Emil Velikov wrote: Because it's a great feature and we should not penalise people for doing out-of-tree builds. Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- src/mesa/Makefile.am | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) Please. Acked-by: Chad Versace chad.vers...@linux.intel.com
Re: [Mesa-dev] [PATCH 3/7] mesa: implement glBindTextures()
On 01/07/2014 12:05 AM, Fredrik Höglund wrote:
> On Friday 03 January 2014, Marek Olšák wrote:
> > On Fri, Jan 3, 2014 at 2:04 PM, Marek Olšák mar...@gmail.com wrote:
> > > On Fri, Jan 3, 2014 at 1:27 AM, Maxence Le Doré maxence.led...@gmail.com wrote:
> > > > ---
> > > >  src/mesa/main/texobj.c | 52 ++
> > > >  src/mesa/main/texobj.h |  3 +++
> > > >  2 files changed, 55 insertions(+)
> > > >
> > > > diff --git a/src/mesa/main/texobj.c b/src/mesa/main/texobj.c
> > > > index bddbc50..66e2fb0 100644
> > > > --- a/src/mesa/main/texobj.c
> > > > +++ b/src/mesa/main/texobj.c
> > > > @@ -1686,4 +1686,56 @@ _mesa_InvalidateTexImage(GLuint texture, GLint level)
> > > >     return;
> > > >  }
> > > >
> > > > +/** ARB_multi_bind / OpenGL 4.4 */
> > > > +
> > > > +void GLAPIENTRY
> > > > +_mesa_BindTextures(GLuint first, GLsizei count, const GLuint *textures)
> > > > +{
> > > > +   GET_CURRENT_CONTEXT(ctx);
> > > > +   struct GLuint currentTexUnit = 0;
> > > > +   int i = 0;
> > > > +
> > > > +   currentTexUnit = ctx->Texture.CurrentUnit;
> > > > +
> > > > +   if(first + count > ctx->Const.MaxCombinedTextureImageUnits) {
> > > > +      _mesa_error(ctx, GL_INVALID_OPERATION, "glBindTextures(first+count)");
> > > > +      return;
> > > > +   }
> > > > +
> > > > +   for(i = 0 ; i < count ; i++) {
> > > > +      GLuint texture;
> > > > +      struct gl_texture_object *texObj;
> > > > +      GLenum texTarget;
> > > > +      int j = 0;
> > > > +
> > > > +      if(textures == NULL)
> > > > +         texture = 0;
> > > > +      else
> > > > +         texture = textures[i];
> > > > +
> > > > +      _mesa_ActiveTexture(GL_TEXTURE0 + first + i);
> > > > +      if(texture != 0) {
> > > > +         texObj = _mesa_lookup_texture(ctx, texture);
> > > > +         if(texObj) {
> > > > +            texTarget = texObj->Target;
> > > > +            _mesa_BindTexture(texTarget, texture);
> > > > +         }
> > > > +         else
> > > > +            _mesa_error(ctx, GL_INVALID_OPERATION,
> > > > +                        "glBindTextures(textures[%i])", i);
> > >
> > > This error is set too late. It should be done before changing textures.
> > > Note that you make the same mistake in the other patches too.
> >
> > Also please double-check that none of the _mesa_ functions generate errors.
>
> This is actually not the case with the ARB_multi_bind functions:
>
>   (11) Typically, OpenGL specifies that if an error is generated by a
>   command, that command has no effect. This is somewhat unfortunate for
>   multi-bind commands, because it would require a first pass to scan the
>   entire list of bound objects for errors and then a second pass to
>   actually perform the bindings. Should we have different error semantics?
>
>   RESOLVED: Yes. In this specification, when the parameters for one of
>   the <count> binding points are invalid, that binding point is not
>   updated and an error will be generated. However, other binding points
>   in the same command will be updated if their parameters are valid and
>   no other error occurs.

The code should reference this spec text. Otherwise someone will come along later and try to "fix" it.

> The code is still wrong for a different reason though; when a texture
> has never been bound, it doesn't have a target. That case needs to be
> handled correctly.
>
> Fredrik
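To make the per-binding-point error semantics of issue (11) concrete, here is a minimal sketch in plain C. It is not Mesa code: `struct context`, `struct texture_table`, `bind_textures()` and the `MAX_*` limits are hypothetical names invented for illustration, and treating a name that has no target as invalid is an assumption of this sketch rather than something the thread settles.

```c
#include <stddef.h>

#define MAX_UNITS    8u   /* hypothetical number of texture units */
#define MAX_TEXTURES 16u  /* hypothetical size of the name table */

enum bind_error { BIND_NO_ERROR = 0, BIND_INVALID_OPERATION = 1 };

struct texture_table {
    /* target[name] == 0 models a name that was never given a target */
    unsigned target[MAX_TEXTURES];
};

struct context {
    unsigned bound[MAX_UNITS];  /* texture name bound to each unit, 0 = none */
    enum bind_error error;      /* first recorded error, GL style */
};

static void record_error(struct context *ctx, enum bind_error e)
{
    /* Keep only the first error until it is queried, as glGetError does. */
    if (ctx->error == BIND_NO_ERROR)
        ctx->error = e;
}

static void bind_textures(struct context *ctx, const struct texture_table *tab,
                          unsigned first, size_t count, const unsigned *names)
{
    /* A bad range invalidates the whole command: nothing is bound. */
    if (first > MAX_UNITS || count > MAX_UNITS - first) {
        record_error(ctx, BIND_INVALID_OPERATION);
        return;
    }

    for (size_t i = 0; i < count; i++) {
        unsigned name = names ? names[i] : 0;

        /* A bad entry only skips its own binding point; the loop goes on. */
        if (name != 0 && (name >= MAX_TEXTURES || tab->target[name] == 0)) {
            record_error(ctx, BIND_INVALID_OPERATION);
            continue;
        }
        ctx->bound[first + i] = name;
    }
}
```

Calling `bind_textures()` with names {3, 7, 5}, where 7 has no target, binds 3 and 5, leaves the middle unit untouched, and records a single error. No separate validation pass is needed, which is the trade-off the RESOLVED text describes.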
Re: [Mesa-dev] [wip 3/9] glsl: memory_map helper class for data deserialization
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote: Class will be used by the shader binary cache implementation. Signed-off-by: Tapani Pälli tapani.pa...@intel.com --- src/glsl/memory_map.h | 174 ++ 1 file changed, 174 insertions(+) create mode 100644 src/glsl/memory_map.h diff --git a/src/glsl/memory_map.h b/src/glsl/memory_map.h new file mode 100644 index 000..1b68b72 --- /dev/null +++ b/src/glsl/memory_map.h @@ -0,0 +1,174 @@ +/* -*- c++ -*- */ +/* + * Copyright © 2013 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#pragma once +#ifndef MEMORY_MAP_H +#define MEMORY_MAP_H + +#include fcntl.h +#include unistd.h +#include sys/mman.h +#include sys/stat.h + +#ifdef __cplusplus + +/** + * Helper class to read data + * + * Class can read either from user given memory or from a file. 
On Linux
> + * file reading wraps around the Posix functions for mapping a file into
> + * the process's address space. Other OS may need different implementation.
> + */
> +class memory_map
> +{
> +public:
> +   memory_map() :
> +      mode(memory_map::READ_MEM),
> +      fd(0),
> +      cache_size(0),
> +      cache_mmap(NULL),
> +      cache_mmap_p(NULL)
> +   {
> +      /* only used by read_string() */
> +      mem_ctx = ralloc_context(NULL);
> +   }
> +
> +   /* read from disk */
> +   int map(const char *path)
> +   {
> +      struct stat stat_info;
> +      if (stat(path, &stat_info) != 0)
> +         return -1;

As before, I'm not thrilled with the use of -1 to mean failure and 0 to mean success, because it forces the caller to use counterintuitive if statements. I'd prefer for map() to return a bool with true meaning success and false meaning failure.

> +
> +      mode = memory_map::READ_MAP;
> +      cache_size = stat_info.st_size;
> +
> +      fd = open(path, O_RDONLY);
> +      if (fd) {
> +         cache_mmap_p = cache_mmap = (char *)
> +            mmap(NULL, cache_size, PROT_READ, MAP_PRIVATE, fd, 0);
> +         return (cache_mmap == MAP_FAILED) ? -1 : 0;

MAP_FAILED is a nonzero value, so if this error condition ever occurs, the destructor will erroneously try to call munmap(). What I'd recommend doing instead is:

   void *mmap_result = mmap(...);
   if (mmap_result == MAP_FAILED) {
      close(fd);
      return -1;
   }
   cache_mmap_p = cache_mmap = (char *) mmap_result;
   return 0;

> +      }
> +      return -1;
> +   }
> +
> +   /* read from memory */
> +   int map(const void *memory, size_t size)
> +   {
> +      cache_mmap_p = cache_mmap = (char *) memory;
> +      cache_size = size;
> +      return 0;
> +   }

IMHO, functions that cannot fail should return void.
> +
> +   /* wrap a portion from another map */
> +   int map(memory_map &map, size_t size)
> +   {
> +      cache_mmap_p = cache_mmap = map.cache_mmap_p;
> +      cache_size = size;
> +      map.ffwd(size);
> +      return 0;
> +   }
> +
> +   ~memory_map() {
> +      if (cache_mmap && mode == READ_MAP) {
> +         munmap(cache_mmap, cache_size);
> +         close(fd);
> +      }
> +      ralloc_free(mem_ctx);
> +   }
> +
> +   /* move read pointer forward */
> +   inline void ffwd(int len)
> +   {
> +      cache_mmap_p += len;
> +   }
> +
> +   inline void jump(unsigned pos)
> +   {
> +      cache_mmap_p = cache_mmap + pos;
> +   }
> +
> +
> +   /* position of read pointer */
> +   inline uint32_t position()
> +   {
> +      return cache_mmap_p - cache_mmap;
> +   }
> +
> +   inline char *read_string()
> +   {
> +      char *str = ralloc_strdup(mem_ctx, cache_mmap_p);
> +      ffwd(strlen(str)+1);
> +      return str;

This is problematic from a security perspective. If the client provides corrupted data that ends in a truncated string (lacking a null terminator) that could cause ralloc_strdup() to try to read beyond the end of the file. We need to make sure the code doesn't try to read beyond the end of file, even if
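
One possible way to address the truncated-string concern raised in the review of read_string() is to search for the terminator only within the bytes that remain in the buffer. The sketch below is plain C rather than the C++ class under review; `struct reader` and this `read_string()` are hypothetical stand-ins and not the memory_map API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bounded reader over a caller-owned buffer. */
struct reader {
    const char *base;  /* start of the mapped or in-memory data */
    size_t size;       /* total number of valid bytes */
    size_t pos;        /* current read position, always <= size */
};

/*
 * Returns a pointer to a NUL-terminated string inside the buffer and
 * advances past its terminator. Returns NULL, without moving the read
 * position, when no terminator exists before the end of the data.
 */
static const char *read_string(struct reader *r)
{
    if (r->pos >= r->size)
        return NULL;

    const char *start = r->base + r->pos;
    size_t remaining = r->size - r->pos;

    /* memchr never looks past 'remaining' bytes, unlike strlen/strdup. */
    const char *nul = memchr(start, '\0', remaining);
    if (nul == NULL)
        return NULL;

    r->pos += (size_t)(nul - start) + 1;
    return start;
}
```

The caller then has to treat a NULL result as a corrupted blob and reject it, instead of continuing to deserialize.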
[Mesa-dev] Mesa 10.1 release plan strawman
Fast forwarding 3 months from the 10.0 release (November 30th) is February 28th. I'd like to propose the following set of dates:

January 31st: Feature freeze / 10.1 branch created. I promise to not let anyone on my team (myself included) dump any giant commit series the day of the freeze. I'll be traveling to FOSDEM, so this may be delayed by a day (or someone else may make the branch).
February 7th: RC1
February 14th: RC2, with chocolates and flowers
February 21st: RC3
February 28th: 10.1 final release

Does this plan sound reasonable to all?
[Mesa-dev] [Bug 72926] Memory corruption (crash) in draw/draw_pt_fetch_shade_pipeline_llvm.c:435
https://bugs.freedesktop.org/show_bug.cgi?id=72926 Peter Wu lekenst...@gmail.com changed: What|Removed |Added CC||lekenst...@gmail.com --- Comment #8 from Peter Wu lekenst...@gmail.com --- bisecting with the small program (via glretrace) and ASN + -O0 and -g still points to the same faulty commit: a3ae5dc7dd5c2f8893f86a920247e690e550ebd4 is the first bad commit commit a3ae5dc7dd5c2f8893f86a920247e690e550ebd4 Author: Zack Rusin za...@vmware.com Date: Fri Aug 9 10:11:31 2013 -0400 draw: make sure that the stages setup outputs Calling the prepare outputs cleans up the slot assignments for outputs, unfortunately aapoint and aaline didn't have code to reset their slots after the initial setup, this was messing up our slot assignments. The unfilled stage was just missing the initial assignment of the face slot. This fixes all of the reported piglit failures. Signed-off-by: Zack Rusin za...@vmware.com Reviewed-by: Roland Scheidegger srol...@vmware.com :04 04 fb87dfd2039663da7ff0fa6f12a5b0668fecee7f fc98438608d4df5bd64ff651bf9098aaabc5a262 M src git bisect log: git bisect start # bad: [277dbf08b0e78fe6cff0fc751768a6f3d33e61f7] glsl: Remove exec_list iterators now that nothing uses them. git bisect bad 277dbf08b0e78fe6cff0fc751768a6f3d33e61f7 # skip: [3e385d1bc314a50c9572b04210c4d6ac1b0a7381] docs: Add release notes for the 9.2.4 release. git bisect skip 3e385d1bc314a50c9572b04210c4d6ac1b0a7381 # good: [3e385d1bc314a50c9572b04210c4d6ac1b0a7381] docs: Add release notes for the 9.2.4 release. 
git bisect good 3e385d1bc314a50c9572b04210c4d6ac1b0a7381 # skip: [9f07ca11c1797ac12de1e1c6aef13cf58824b5f5] mesa: Dispatch ARB_framebuffer_object and EXT_framebuffer_object differently git bisect skip 9f07ca11c1797ac12de1e1c6aef13cf58824b5f5 # skip: [9f07ca11c1797ac12de1e1c6aef13cf58824b5f5] mesa: Dispatch ARB_framebuffer_object and EXT_framebuffer_object differently git bisect skip 9f07ca11c1797ac12de1e1c6aef13cf58824b5f5 # bad: [8d4ecbccd6a5608005b5c8f473d9a44dbde0b08d] i965: Remove #define name from PCI ID table. git bisect bad 8d4ecbccd6a5608005b5c8f473d9a44dbde0b08d # bad: [7086636358b611a2bb124253e1fe870107e1cecb] nvc0/ir: fix use after free in texture barrier insertion pass git bisect bad 7086636358b611a2bb124253e1fe870107e1cecb # bad: [e858921d527bfcbbda27760f781c25cab469e852] ilo: implement new float comparison instructions git bisect bad e858921d527bfcbbda27760f781c25cab469e852 # bad: [e858921d527bfcbbda27760f781c25cab469e852] ilo: implement new float comparison instructions git bisect bad e858921d527bfcbbda27760f781c25cab469e852 # good: [6065a87bce0c3fb0d9694c381c5a31b63e1f0300] glsl: Cross-validate GS layout qualifiers while intrastage linking. git bisect good 6065a87bce0c3fb0d9694c381c5a31b63e1f0300 # good: [6065a87bce0c3fb0d9694c381c5a31b63e1f0300] glsl: Cross-validate GS layout qualifiers while intrastage linking. git bisect good 6065a87bce0c3fb0d9694c381c5a31b63e1f0300 # good: [331a8fa41d174c74afe58f43a5943627398eac6b] gallium-egl: Simplify native_wayland_drm_bufmgr_helper interface git bisect good 331a8fa41d174c74afe58f43a5943627398eac6b # good: [2c32c3985ca6232a81d21feb9ac6443145b42d0e] i965/fs: Consider predicated SEL instructions as whole variable writes. 
git bisect good 2c32c3985ca6232a81d21feb9ac6443145b42d0e # good: [438cc6bc49d109f9ddeed6a741c4f0b8f1c4ffe2] mesa: Make detach_renderbuffer available outside fbobject.c git bisect good 438cc6bc49d109f9ddeed6a741c4f0b8f1c4ffe2 # good: [336351e971d6232bbed11d9812ebf05341b6aa36] glsl/ast: Check that geometry shader interface block inputs are arrays. git bisect good 336351e971d6232bbed11d9812ebf05341b6aa36 # good: [98d2498404ba69a3efc1c765b1a1885d151181ed] glsl: Fix incorrect pattern matching in ir_set_program_inouts git bisect good 98d2498404ba69a3efc1c765b1a1885d151181ed # bad: [c6c55ad3e967f3d151c24795a99634b297c13fde] gallivm: fix border color with normalized texture formats git bisect bad c6c55ad3e967f3d151c24795a99634b297c13fde # bad: [27cedd8aecccea808a35ef297477cac5fe87e476] llvmpipe: fix pipeline statistics with a null ps git bisect bad 27cedd8aecccea808a35ef297477cac5fe87e476 # bad: [a3ae5dc7dd5c2f8893f86a920247e690e550ebd4] draw: make sure that the stages setup outputs git bisect bad a3ae5dc7dd5c2f8893f86a920247e690e550ebd4 # first bad commit: [a3ae5dc7dd5c2f8893f86a920247e690e550ebd4] draw: make sure that the stages setup outputs -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] Use AC_PATH_TOOL instead of AC_PATH_PROG for llvm-config.
On Mon, Jan 13, 2014 at 07:04:44PM +0100, Michał Górny wrote:
> On 2014-01-13, at 08:59:22, Tom Stellard t...@stellard.net wrote:
> > On Sat, Dec 28, 2013 at 03:22:09PM +0100, Michał Górny wrote:
> > > This should help with cross-compiling and multilib when a
> > > $CHOST-specific llvm-config is expected rather than the build host's
> > > default one. It will help us a bit in Gentoo, where we've started
> > > using i686-pc-linux-gnu-llvm-config for 32-bit multilib LLVM.
> >
> > Reviewed-by: Tom Stellard thomas.stell...@amd.com
> >
> > Should we CC stable on this patch?
>
> I have no strong opinion here. It would be a bit helpful, though it's
> not a killer feature for us (yet :)).
>
> > Do you have commit access?
>
> No, I don't.

I've pushed this patch and added CC: Stable.

Thanks!
-Tom
[Mesa-dev] [Bug 72926] Memory corruption (crash) in draw/draw_pt_fetch_shade_pipeline_llvm.c:435
https://bugs.freedesktop.org/show_bug.cgi?id=72926

Peter Wu lekenst...@gmail.com changed:

What    |Removed    |Added
CC      |           |za...@vmware.com
[Mesa-dev] [Bug 73571] [clover] OpenCL segfault in gegl 'clones' test
https://bugs.freedesktop.org/show_bug.cgi?id=73571

--- Comment #3 from Jan Vesely jano.ves...@gmail.com ---
Created attachment 92006
  --> https://bugs.freedesktop.org/attachment.cgi?id=92006&action=edit
Don't crash on NULL global mem objects

The attached patch fixes the original issue (bt in #c2), and adds preliminary support for NULL global mem objects.
[Mesa-dev] [Bug 73571] [clover] Add support for NULL global memory object arguments
https://bugs.freedesktop.org/show_bug.cgi?id=73571

Jan Vesely jano.ves...@gmail.com changed:

What    |Removed                       |Added
Summary |[clover] OpenCL segfault in   |[clover] Add support for
        |gegl 'clones' test            |NULL global memory object
        |                              |arguments
[Mesa-dev] [Bug 73512] [clover] mesa.icd. should contain full path
https://bugs.freedesktop.org/show_bug.cgi?id=73512

--- Comment #11 from Igor Gnatenko i.gnatenko.br...@gmail.com ---
(In reply to comment #8)
> (In reply to comment #7)
> > Created attachment 91973 [details] [review]
> > [PATCH v3] opencl: improved auto-gen .icd
> >
> > v2: Use @OPENCL_VERSION@:0 for library
> >     replace /etc with @sysconfdir@ macros
> > v3: Drop libdir from icd, because libMesaOpenCL isn't private
>
> If we install the *.icd file to @sysconfdir@ and not /etc then
> standards-compliant ICD loaders will not work with clover. The way I
> interpret the spec, we have no choice but to install it to /etc.
> Why is it necessary to use @sysconfdir@?

Yes. I'm sorry.
https://forge.imag.fr/plugins/scmgit/cgi-bin/gitweb.cgi?p=ocl-icd/ocl-icd.git;a=blob;f=ocl_icd_loader.c;h=ab419b2dccb82db6d632cae6dc86e5151a320c07;hb=HEAD#l52
Only /etc will work. Fixed. Patch here.
Re: [Mesa-dev] [PATCH] opencl: improved auto-gen .icd
On Mon, Jan 13, 2014 at 11:12 AM, Tom Stellard t...@stellard.net wrote: On Sun, Jan 12, 2014 at 03:08:56AM +0400, Igor Gnatenko wrote: From 5b2bf87f1238e44150492a39f5db0ae90d59459b Mon Sep 17 00:00:00 2001 From: Igor Gnatenko i.gnatenko.br...@gmail.com Date: Sun, 12 Jan 2014 02:09:16 +0400 Subject: [PATCH] opencl: improved auto-gen .icd v2: Use @OPENCL_VERSION@:0 for library replace /etc with @sysconfdir@ macros Reported-by: Fabian Deutsch fabian.deut...@gmx.de Reference: https://bugs.freedesktop.org/show_bug.cgi?id=73512 Signed-off-by: Igor Gnatenko i.gnatenko.br...@gmail.com --- configure.ac | 3 +++ src/gallium/targets/opencl/Makefile.am | 4 ++-- src/gallium/targets/opencl/mesa.icd| 1 - src/gallium/targets/opencl/mesa.icd.in | 1 + 4 files changed, 6 insertions(+), 3 deletions(-) delete mode 100644 src/gallium/targets/opencl/mesa.icd create mode 100644 src/gallium/targets/opencl/mesa.icd.in diff --git a/configure.ac b/configure.ac index 4b55140..3452e15 100644 --- a/configure.ac +++ b/configure.ac @@ -25,6 +25,8 @@ m4_ifdef([AM_PROG_AR], [AM_PROG_AR]) dnl Set internal versions OSMESA_VERSION=8 AC_SUBST([OSMESA_VERSION]) +OPENCL_VERSION=1 +AC_SUBST([OPENCL_VERSION]) dnl Versions for external dependencies LIBDRM_REQUIRED=2.4.24 @@ -2023,6 +2025,7 @@ AC_CONFIG_FILES([Makefile src/gallium/targets/egl-static/Makefile src/gallium/targets/gbm/Makefile src/gallium/targets/opencl/Makefile + src/gallium/targets/opencl/mesa.icd src/gallium/targets/osmesa/Makefile src/gallium/targets/osmesa/osmesa.pc src/gallium/targets/pipe-loader/Makefile diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am index 653302c..923316c 100644 --- a/src/gallium/targets/opencl/Makefile.am +++ b/src/gallium/targets/opencl/Makefile.am @@ -4,7 +4,7 @@ lib_LTLIBRARIES = lib@OPENCL_LIBNAME@.la lib@OPENCL_LIBNAME@_la_LDFLAGS = \ $(LLVM_LDFLAGS) \ - -version-number 1:0 + -version-number @OPENCL_VERSION@:0 lib@OPENCL_LIBNAME@_la_LIBADD = \ 
$(top_builddir)/src/gallium/auxiliary/pipe-loader/libpipe_loader.la \ @@ -34,7 +34,7 @@ lib@OPENCL_LIBNAME@_la_SOURCES = nodist_EXTRA_lib@OPENCL_LIBNAME@_la_SOURCES = dummy.cpp if HAVE_CLOVER_ICD -icddir = /etc/OpenCL/vendors/ +icddir = @sysconfdir@/OpenCL/vendors/ As I mentioned in the bug report, the ICD spec says that OpenCL/vendors/ should be in /etc/ I don't think we can change this and still be spec compliant. Why do you want to install the *.icd files in sysconfdir? sysconfdir basically is etc. This hunk would allow you to install into a prefix and not have this file installed into /etc outside of your prefix. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Naming everything in src/gallium/drivers/radeonsi si_*
Pushed, thanks. Marek On Mon, Jan 13, 2014 at 10:00 PM, Andreas Hartmetz ahartm...@gmail.com wrote: I don't have an fdo account or push rights. Can somebody else push it for me please? I've added the Reviewed-by: lines so the patches only need to be pushed now. On Monday 13 January 2014 11:22:07 Marek Olšák wrote: For the series: Reviewed-by: Marek Olšák marek.ol...@amd.com Feel free to push this. Marek On Sat, Jan 11, 2014 at 4:20 PM, Andreas Hartmetz ahartm...@gmail.com wrote: Continuing here because the threads had diverged... I've updated the patch series under the same URL and applied all the suggested improvements. The variable renames are still in, but at the very end so they are trivial to omit. On Tuesday 07 January 2014 17:27:56 Andreas Hartmetz wrote: We have talked on IRC meanwhile: Everywhere was supposed to mean file names and data structures. I have made a patch series (git link because file renames produce huge diffs) that renames *everything* away from r600 (and also radeonsi) to si, where it is actually about SI. In the such modified code it is then clear at first glance that only resources, textures and some other low-level interface code from R600 / generic Radeon are actually used in SI code. The patch series is ordered by increasing controversy potential due to destruction of git blame history, so the last parts can be omitted if they are deemed too destructive to history. In my opinion, it is better to have code that is readable now than code that is less readable but with the possibility to look up how it became like that. Michel said on IRC that he'd prefer to keep the name radeonsi_pipe.h/c, I disagree: If the library name is to be kept, there must be a break between radeonsi and si *somewhere*, and it is normal for library names to not correspond to any file name in the library. The same scheme is used in llvmpipe, llvmpipe lib / directory versus lp_* file names. 
Here's the repository (branch is master): git git://anongit.kde.org/scratch/ahartmetz/mesa.git web http://quickgit.kde.org/?p=scratch%2Fahartmetz%2Fmesa.git On Monday 06 January 2014 15:50:05 Marek Olšák wrote: It sounds good, but I'd like the prefix to be si_ everywhere. Marek On Mon, Jan 6, 2014 at 2:47 PM, Andreas Hartmetz ahartm...@gmail.com wrote: Hello, many of the files in radeonsi originally came from other places where they had different names and were never renamed. Most of them now have names that don't tell what the files are for (r600 is not actually the first hardware supported by them, they start at radeonsi), and even those with radeonsi are split between radeonsi_ and si_. si_ is shorter than radeonsi_, but inconsistent with the directory and library name. I still think it's the best option, but no strong opinion from me. If and when the files are renamed, the next step would be doing the same with the r600_ struct and function names. Does that sound good? I'll send the patches shortly if so. Cheers, Andreas ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] gallium endianness and hw drivers
I think the format conversion functions should look like:

#ifdef BIG_ENDIAN
   case PIPE_FORMAT_A8B8G8R8_UNORM:
      return hw_format_for_R8G8B8A8_UNORM;
   ...
#else
   case PIPE_FORMAT_R8G8B8A8_UNORM:
      return hw_format_for_R8G8B8A8_UNORM;
#endif

which can be simplified to:

   case PIPE_FORMAT_RGBA_UNORM:
      return hw_format_for_R8G8B8A8_UNORM;

So that the GPU can see the same formats, but they are different for the CPU.

What do you think?

Marek

On Mon, Jan 6, 2014 at 10:00 AM, Michel Dänzer mic...@daenzer.net wrote:
> On Fri, 2013-12-27 at 19:41 +0100, Marek Olšák wrote:
> > Okay. Using Axxx for transfers only is a good idea, just please make
> > sure the formats are not advertised to the state tracker.
>
> Advertising the format to the state tracker is the whole point :), as
> it's the format that matches the X11 semantics on big endian hosts.
>
> --
> Earthling Michel Dänzer            | http://www.amd.com
> Libre software enthusiast          | Mesa and X developer
Re: [Mesa-dev] [wip 6/9] glsl: ir_deserializer class for the binary shader cache
On 2 January 2014 03:58, Tapani Pälli tapani.pa...@intel.com wrote:
> +
> +
> +/**
> + * Reads header part of the binary blob. Main purpose of this header is to
> + * validate that cached shader was produced with same Mesa driver version.
> + */
> +int
> +ir_deserializer::read_header(struct gl_shader *shader)
> +{
> +   char *cache_magic_id = map->read_string();
> +   char *driver_vendor = map->read_string();
> +   char *driver_renderer = map->read_string();
> +
> +   /* only used or debug output, silence compiler warning */
> +   (void) driver_vendor;
> +   (void) driver_renderer;

A single version of Mesa potentially supports many different hardware types, and those different hardware types may define different values of GLSL built-in constants. They also may require core Mesa to do different sets of lowering passes during compilation. So we can't just ignore driver_vendor and driver_renderer. We need to reject the binary blob if they don't match.

> +
> +   shader->Version = map->read_uint32_t();
> +   shader->Type = map->read_uint32_t();
> +   shader->IsES = map->read_uint8_t();
> +
> +   CACHE_DEBUG("%s: version %d, type 0x%x, %s (mesa %s)\n[%s %s]\n",
> +               __func__, shader->Version, shader->Type,
> +               (shader->IsES) ? "glsl es" : "desktop glsl",
> +               cache_magic_id, driver_vendor, driver_renderer);
> +
> +   const char *magic = mesa_get_shader_cache_magic();
> +
> +   if (memcmp(cache_magic_id, magic, strlen(magic)))
> +      return DIFFERENT_MESA_VER;

If cache_magic_id is "foobar" and magic is "foo", this will erroneously consider them equal. The correct way to do this is to use strcmp().
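
The prefix problem pointed out for the magic-string check can be reproduced in isolation. The following is a small sketch in plain C; `magic_matches_prefix()` and `magic_matches_exact()` are hypothetical helper names used only to contrast the two comparisons, not functions from the patch.

```c
#include <string.h>

/*
 * Comparison in the style of the quoted patch: only strlen(expected)
 * bytes are examined, so any cached id that merely starts with the
 * expected magic is accepted.
 */
static int magic_matches_prefix(const char *cached, const char *expected)
{
    return memcmp(cached, expected, strlen(expected)) == 0;
}

/* Whole-string comparison, as suggested in the review. */
static int magic_matches_exact(const char *cached, const char *expected)
{
    return strcmp(cached, expected) == 0;
}
```

With cached id "foobar" and expected magic "foo", the prefix variant reports a match while the exact variant does not. A further caveat of the memcmp() form: if the cached string is shorter than the expected one, memcmp() may be asked to read past the cached string's terminator, whereas strcmp() stops at it.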
+ + /* post-link data */ + shader-num_samplers = map-read_uint32_t(); + shader-active_samplers = map-read_uint32_t(); + shader-shadow_samplers = map-read_uint32_t(); + shader-num_uniform_components = map-read_uint32_t(); + shader-num_combined_uniform_components = map-read_uint32_t(); + shader-uses_builtin_functions = map-read_uint8_t(); + + map-read(shader-Geom, sizeof(shader-Geom)); + + for (unsigned i = 0; i MAX_SAMPLERS; i++) + shader-SamplerUnits[i] = map-read_uint8_t(); + + for (unsigned i = 0; i MAX_SAMPLERS; i++) + shader-SamplerTargets[i] = (gl_texture_index) map-read_int32_t(); + + return 0; +} + + +const glsl_type * +ir_deserializer::read_glsl_type() +{ + char *name = map-read_string(); + uint32_t type_size = map-read_uint32_t(); + + const glsl_type *existing_type = + state-symbols-get_type(name); + + /* if type exists, move read pointer forward and return type */ + if (existing_type) { + map-ffwd(type_size); + return existing_type; + } + + uint8_t base_type = map-read_uint8_t(); + uint32_t length = map-read_uint32_t(); + uint8_t vector_elms = map-read_uint8_t(); + uint8_t matrix_cols = map-read_uint8_t(); + uint8_t interface_packing = map-read_uint8_t(); + + /* array type has additional element_type information */ + if (base_type == GLSL_TYPE_ARRAY) { + const glsl_type *element_type = read_glsl_type(); + if (!element_type) { + CACHE_DEBUG(error reading array element type\n); + return NULL; + } + return glsl_type::get_array_instance(element_type, length); + } + + /* structures have fields containing of names and types */ + else if (base_type == GLSL_TYPE_STRUCT || + base_type == GLSL_TYPE_INTERFACE) { + glsl_struct_field *fields = ralloc_array(mem_ctx, + glsl_struct_field, length); + + if (!fields) + return glsl_type::error_type; + + for (unsigned k = 0; k length; k++) { + uint8_t row_major, interpolation, centroid; + int32_t location; + char *field_name = map-read_string(); + fields[k].name = _mesa_strdup(field_name); + fields[k].type = read_glsl_type(); 
+ row_major = map-read_uint8_t(); + location = map-read_int32_t(); + interpolation = map-read_uint8_t(); + centroid = map-read_uint8_t(); + fields[k].row_major = row_major; + fields[k].location = location; + fields[k].interpolation = interpolation; + fields[k].centroid = centroid; Another security issue: if the binary blob is corrupted, length may be outrageously large (e.g. 0x). We need a way for this loop to bail out and exit if it tries to read past the end of the binary blob. + } + + const glsl_type *ret_type = NULL; + + if (base_type == GLSL_TYPE_STRUCT) + ret_type = glsl_type::get_record_instance(fields, length, name); + else if (base_type == GLSL_TYPE_INTERFACE) + ret_type = glsl_type::get_interface_instance(fields, +length, (glsl_interface_packing) interface_packing, name); + + /* free allocated memory */ + for (unsigned k = 0; k length; k++) + free((void *)fields[k].name); + ralloc_free(fields); + + return ret_type;
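
For the concern about an outrageously large length field, one option is to validate the count against the bytes that are actually left before entering the loop, and to make every read fail cleanly at the end of the data. This is a sketch in plain C under stated assumptions: `struct blob`, `read_u32()` and `read_fields()` are hypothetical, and each "field" is reduced to a single 32-bit value so that the bound is easy to see.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of a serialized blob; pos never exceeds size. */
struct blob {
    const uint8_t *data;
    size_t size;
    size_t pos;
};

/* Reads one host-endian 32-bit value; returns 0 if too few bytes remain. */
static int read_u32(struct blob *b, uint32_t *out)
{
    if (b->size - b->pos < sizeof *out)
        return 0;
    memcpy(out, b->data + b->pos, sizeof *out);
    b->pos += sizeof *out;
    return 1;
}

/*
 * Reads a count followed by that many 32-bit fields. The count is rejected
 * up front if it exceeds the caller's array or cannot possibly fit in the
 * remaining bytes, so a corrupted count never drives the loop.
 */
static int read_fields(struct blob *b, uint32_t *fields, size_t max_fields,
                       uint32_t *count_out)
{
    uint32_t count;

    if (!read_u32(b, &count))
        return 0;
    if (count > max_fields || count > (b->size - b->pos) / sizeof(uint32_t))
        return 0;

    for (uint32_t k = 0; k < count; k++) {
        if (!read_u32(b, &fields[k]))
            return 0;
    }
    *count_out = count;
    return 1;
}
```

In a real deserializer the per-field size is variable (names, nested types), so the up-front check can only be a lower bound and the per-read checks remain necessary.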
[Mesa-dev] [Bug 73578] New: egl_pipe.c:46:38: fatal error: radeonsi/radeonsi_public.h: No such file or directory
https://bugs.freedesktop.org/show_bug.cgi?id=73578

          Priority: medium
            Bug ID: 73578
          Keywords: regression
                CC: ahartm...@gmail.com, mar...@gmail.com
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: egl_pipe.c:46:38: fatal error: radeonsi/radeonsi_public.h: No such file or directory
          Severity: blocker
    Classification: Unclassified
                OS: Linux (All)
          Reporter: v...@freedesktop.org
          Hardware: x86-64 (AMD64)
            Status: NEW
           Version: git
         Component: Other
           Product: Mesa

mesa: aa7ae4fd6e24ba7f2b687e3f3c4301919830750b (master)

$ scons
[...]
  Compiling src/gallium/targets/egl-static/egl_pipe.c ...
src/gallium/targets/egl-static/egl_pipe.c:46:38: fatal error: radeonsi/radeonsi_public.h: No such file or directory
 #include "radeonsi/radeonsi_public.h"
                                      ^
compilation terminated.
[Mesa-dev] [Bug 73578] egl_pipe.c:46:38: fatal error: radeonsi/radeonsi_public.h: No such file or directory
https://bugs.freedesktop.org/show_bug.cgi?id=73578

--- Comment #1 from Vinson Lee v...@freedesktop.org ---
786af2f963925df2c2a6fb60b29a83e8340f03c7 is the first bad commit
commit 786af2f963925df2c2a6fb60b29a83e8340f03c7
Author: Andreas Hartmetz ahartm...@gmail.com
Date:   Sat Jan 4 18:44:33 2014 +0100

    radeonsi: Apply si_* file naming scheme.

    Reviewed-by: Marek Olšák marek.ol...@amd.com

:04 04 d05e480d033201d725c16b7cb392b536538837ed 864adcad0405ebe443285fd74c24612fa4ae287d M src
bisect run success
[Mesa-dev] [libdrm PATCH] intel: Create a new drm_intel_bo offset64 field.
The existing 'offset' field is unfortunately typed as 'unsigned long', which is only 4 bytes with a 32-bit userspace.

Traditionally, the hardware has only supported 32-bit virtual addresses, so even though the kernel uses a __u64, the value would always fit.

However, Broadwell supports 48-bit addressing. So with a 64-bit kernel, the card virtual address may be too large to fit in the 'offset' field.

Ideally, we would change the type of 'offset' to be a uint64_t, but this would break the libdrm ABI. Instead, we create a new 'offset64' field to hold the full 64-bit value from the kernel, and store the 32-bit truncation in the existing 'offset' field, for compatibility.

Cc: Eric Anholt e...@anholt.net
Cc: Daniel Vetter daniel.vet...@ffwll.ch
Cc: Ben Widawsky b...@bwidawsk.net
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 intel/intel_bufmgr.h     | 12 +---
 intel/intel_bufmgr_gem.c | 16 ++--
 2 files changed, 19 insertions(+), 9 deletions(-)

I didn't update the bufmgr_fake stuff. Do I need to...?

diff --git a/intel/intel_bufmgr.h b/intel/intel_bufmgr.h
index 2eb9742..9383c72 100644
--- a/intel/intel_bufmgr.h
+++ b/intel/intel_bufmgr.h
@@ -61,9 +61,8 @@ struct _drm_intel_bo {
 	unsigned long align;
 
 	/**
-	 * Last seen card virtual address (offset from the beginning of the
-	 * aperture) for the object. This should be used to fill relocation
-	 * entries when calling drm_intel_bo_emit_reloc()
+	 * Deprecated field containing (possibly the low 32-bits of) the last
+	 * seen virtual card address. Use offset64 instead.
 	 */
 	unsigned long offset;
 
@@ -84,6 +83,13 @@ struct _drm_intel_bo {
 	 * MM-specific handle for accessing object
 	 */
 	int handle;
+
+	/**
+	 * Last seen card virtual address (offset from the beginning of the
+	 * aperture) for the object. This should be used to fill relocation
+	 * entries when calling drm_intel_bo_emit_reloc()
+	 */
+	uint64_t offset64;
 };
 
 enum aub_dump_bmp_format {
diff --git a/intel/intel_bufmgr_gem.c b/intel/intel_bufmgr_gem.c
index ad722dd..f4db1a6 100644
--- a/intel/intel_bufmgr_gem.c
+++ b/intel/intel_bufmgr_gem.c
@@ -382,7 +382,7 @@ drm_intel_gem_dump_validation_list(drm_intel_bufmgr_gem *bufmgr_gem)
 			    (unsigned long long)bo_gem->relocs[j].offset,
 			    target_gem->gem_handle,
 			    target_gem->name,
-			    target_bo->offset,
+			    target_bo->offset64,
 			    bo_gem->relocs[j].delta);
 		}
 	}
@@ -894,6 +894,7 @@ drm_intel_bo_gem_create_from_name(drm_intel_bufmgr *bufmgr,
 	bo_gem->bo.size = open_arg.size;
 	bo_gem->bo.offset = 0;
+	bo_gem->bo.offset64 = 0;
 	bo_gem->bo.virtual = NULL;
 	bo_gem->bo.bufmgr = bufmgr;
 	bo_gem->name = name;
@@ -1689,7 +1690,7 @@ do_bo_emit_reloc(drm_intel_bo *bo, uint32_t offset,
 	    target_bo_gem->gem_handle;
 	bo_gem->relocs[bo_gem->reloc_count].read_domains = read_domains;
 	bo_gem->relocs[bo_gem->reloc_count].write_domain = write_domain;
-	bo_gem->relocs[bo_gem->reloc_count].presumed_offset = target_bo->offset;
+	bo_gem->relocs[bo_gem->reloc_count].presumed_offset = target_bo->offset64;
 
 	bo_gem->reloc_target_info[bo_gem->reloc_count].bo = target_bo;
 	if (target_bo != bo)
@@ -1840,11 +1841,12 @@ drm_intel_update_buffer_offsets(drm_intel_bufmgr_gem *bufmgr_gem)
 		drm_intel_bo_gem *bo_gem = (drm_intel_bo_gem *) bo;
 
 		/* Update the buffer offset */
-		if (bufmgr_gem->exec_objects[i].offset != bo->offset) {
+		if (bufmgr_gem->exec_objects[i].offset != bo->offset64) {
 			DBG("BO %d (%s) migrated: 0x%08lx -> 0x%08llx\n",
-			    bo_gem->gem_handle, bo_gem->name, bo->offset,
+			    bo_gem->gem_handle, bo_gem->name, bo->offset64,
 			    (unsigned long long)bufmgr_gem->exec_objects[i].
 			    offset);
+			bo->offset64 = bufmgr_gem->exec_objects[i].offset;
 			bo->offset = bufmgr_gem->exec_objects[i].offset;
 		}
 	}
@@ -1860,10 +1862,11 @@ drm_intel_update_buffer_offsets2 (drm_intel_bufmgr_gem *bufmgr_gem)
 		drm_intel_bo_gem *bo_gem = (drm_intel_bo_gem *)bo;
 
 		/* Update the buffer offset */
-		if (bufmgr_gem->exec2_objects[i].offset != bo->offset) {
+		if (bufmgr_gem->exec2_objects[i].offset != bo->offset64) {
 			DBG("BO %d (%s) migrated: 0x%08lx -> 0x%08llx\n",
-			    bo_gem->gem_handle, bo_gem->name, bo->offset,
+			    bo_gem->gem_handle, bo_gem->name, bo->offset64,
 			    (unsigned long long)bufmgr_gem->exec2_objects[i].offset);
+
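
The effect of keeping a truncated 'offset' next to a full 'offset64' can be modelled in a few lines of plain C. This is only an illustration of the arithmetic, not libdrm code: `struct bo_offsets` and the three helpers are hypothetical, and the legacy field is declared as `uint32_t` to stand in for an `unsigned long` on a 32-bit userspace.

```c
#include <stdint.h>

/* Hypothetical model of the two fields described in the commit message. */
struct bo_offsets {
    uint32_t offset;    /* legacy field: 4 bytes on a 32-bit userspace */
    uint64_t offset64;  /* full value as reported by the kernel */
};

/* Store the kernel's value in both fields; the legacy one keeps the low 32 bits. */
static void update_offsets(struct bo_offsets *bo, uint64_t kernel_offset)
{
    bo->offset64 = kernel_offset;
    bo->offset = (uint32_t)kernel_offset;
}

/* Comparison against the full field, as the patched code does. */
static int has_moved(const struct bo_offsets *bo, uint64_t kernel_offset)
{
    return bo->offset64 != kernel_offset;
}

/* Comparison against the truncated field, as the unpatched code would do. */
static int legacy_has_moved(const struct bo_offsets *bo, uint64_t kernel_offset)
{
    return (uint64_t)bo->offset != kernel_offset;
}
```

In this model, for an address above 4 GiB such as 0x0000123400001000, `legacy_has_moved()` reports a migration even when the buffer has not moved, because the stored low 32 bits can never equal the full address. `has_moved()` compares like with like.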
[Mesa-dev] [Mesa PATCH 3/3] i965: Introduce an OUT_RELOC64 macro.
Broadwell uses 48-bit addresses. The first DWord is the low 32 bits, and the second DWord is the high 16 bits.

Cc: Eric Anholt e...@anholt.net
Cc: Daniel Vetter daniel.vet...@ffwll.ch
Cc: Ben Widawsky b...@bwidawsk.net
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 24
 src/mesa/drivers/dri/i965/intel_batchbuffer.h | 10 ++
 2 files changed, 34 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 966b95b..88540f0 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -397,6 +397,30 @@ intel_batchbuffer_emit_reloc(struct brw_context *brw,
    return true;
 }

+bool
+intel_batchbuffer_emit_reloc64(struct brw_context *brw,
+                               drm_intel_bo *buffer,
+                               uint32_t read_domains, uint32_t write_domain,
+                               uint32_t delta)
+{
+   int ret = drm_intel_bo_emit_reloc(brw->batch.bo, 4*brw->batch.used,
+                                     buffer, delta,
+                                     read_domains, write_domain);
+   assert(ret == 0);
+   (void) ret;
+
+   /* Using the old buffer offset, write in what the right data would be, in
+    * case the buffer doesn't move and we can short-circuit the relocation
+    * processing in the kernel
+    */
+   uint64_t offset = buffer->offset64 + delta;
+   intel_batchbuffer_emit_dword(brw, offset);
+   intel_batchbuffer_emit_dword(brw, offset >> 32);
+
+   return true;
+}
+
+
 void
 intel_batchbuffer_data(struct brw_context *brw,
                        const void *data, GLuint bytes, enum brw_gpu_ring ring)
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
index 2a3c6ed..86923e4 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
@@ -59,6 +59,11 @@ bool intel_batchbuffer_emit_reloc(struct brw_context *brw,
                                   uint32_t read_domains,
                                   uint32_t write_domain,
                                   uint32_t offset);
+bool intel_batchbuffer_emit_reloc64(struct brw_context *brw,
+                                    drm_intel_bo *buffer,
+                                    uint32_t read_domains,
+                                    uint32_t write_domain,
+                                    uint32_t offset);
 void intel_batchbuffer_emit_mi_flush(struct brw_context *brw);
 void intel_emit_post_sync_nonzero_flush(struct brw_context *brw);
 void intel_emit_depth_stall_flushes(struct brw_context *brw);
@@ -169,6 +174,11 @@ void intel_batchbuffer_cached_advance(struct brw_context *brw);
                                 read_domains, write_domain, delta); \
 } while (0)

+/* Handle 48-bit address relocations for Gen8+ */
+#define OUT_RELOC64(buf, read_domains, write_domain, delta) do { \
+   intel_batchbuffer_emit_reloc64(brw, buf, read_domains, write_domain, delta);\
+} while (0)
+
 #define ADVANCE_BATCH() intel_batchbuffer_advance(brw);
 #define CACHED_BATCH() intel_batchbuffer_cached_advance(brw);

--
1.8.5.2
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
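For readers following the OUT_RELOC64 patch above, here is a minimal sketch of the DWord split its commit message describes: the low 32 bits go in the first DWord and the remaining high bits in the second. The struct reloc64_dwords and the function split_presumed_offset() are hypothetical names made up for this illustration; they are not part of Mesa or libdrm, and the 48-bit check is an assumption added for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not part of Mesa: the two DWords of a
 * 48-bit address in the order they would be written into the batch. */
struct reloc64_dwords {
   uint32_t lo;  /* first DWord: bits 31:0 of the address */
   uint32_t hi;  /* second DWord: bits 47:32 of the address */
};

/* Split (presumed offset + delta) into two DWords, mirroring the
 * "offset" and "offset >> 32" writes in the patch. */
static struct reloc64_dwords
split_presumed_offset(uint64_t offset64, uint32_t delta)
{
   uint64_t address = offset64 + delta;

   /* Assumption for this sketch only: the address fits in 48 bits. */
   assert((address >> 48) == 0);

   struct reloc64_dwords out;
   out.lo = (uint32_t) address;          /* truncates to the low 32 bits */
   out.hi = (uint32_t) (address >> 32);  /* remaining high 16 bits */
   return out;
}
```

Calling split_presumed_offset(0x0000123480000000ULL, 0x10) gives lo = 0x80000010 and hi = 0x1234, which is the pair of values the two emit_dword calls would write for a buffer presumed to sit at that address.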
Re: [Mesa-dev] FOSDEM14: Graphics DevRoom: Deadline approaching fast.
On Tue, Jan 07, 2014 at 02:22:00AM +0100, Luc Verhaegen wrote:
> Hi,
>
> There are still 5 slots open for the FOSDEM graphics DevRoom, and the deadline is this Friday, the 10th. Get a move on.
>
> If you have requested an account reset with me before, but if you then haven't bothered filing a talk, you do NOT have a slot. Please file a talk ASAP to still secure a place.
>
> For more information on how to file for a devroom, read the email sent back in October:
> http://lists.x.org/archives/xorg-devel/2013-October/038185.html
>
> Luc Verhaegen.

There are still 3 slots open. This is your final chance to get a talk in the FOSDEM 2014 graphics DevRoom.

Monday night (13th), the schedule will be locked down and no further talks or events will be accepted.

Luc Verhaegen.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] Use AC_PATH_TOOL instead of AC_PATH_PROG for llvm-config.
On 2014-01-13 at 08:59:22, Tom Stellard t...@stellard.net wrote:

> On Sat, Dec 28, 2013 at 03:22:09PM +0100, Michał Górny wrote:
> > This should help with cross-compiling and multilib when $CHOST-specific llvm-config is expected rather than build host default one.
> >
> > It will help us a bit in Gentoo where we've started using i686-pc-linux-gnu-llvm-config for 32-bit multilib LLVM.
>
> Reviewed-by: Tom Stellard thomas.stell...@amd.com
>
> Should we CC stable on this patch?

I have no strong opinion here. It would be a bit helpful though it's not a killer feature for us (yet :)).

> Do you have commit access?

No, I don't.

--
Best regards,
Michał Górny

signature.asc Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 73578] egl_pipe.c:46:38: fatal error: radeonsi/radeonsi_public.h: No such file or directory
https://bugs.freedesktop.org/show_bug.cgi?id=73578

Vinson Lee v...@freedesktop.org changed:

           What    |Removed                     |Added
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from Vinson Lee v...@freedesktop.org ---
commit 8f9b70fa3c41418bc2b28551642ea786ed0c2e79
Author: Vinson Lee v...@freedesktop.org
Date:   Mon Jan 13 15:51:50 2014 -0800

    egl-static: Fix build error.

    Fix build regression introduced with commit 786af2f963925df2c2a6fb60b29a83e8340f03c7.

    egl_pipe.c:46:38: fatal error: radeonsi/radeonsi_public.h: No such file or directory
     #include "radeonsi/radeonsi_public.h"
     ^

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73578
    Signed-off-by: Vinson Lee v...@freedesktop.org

vinson@vinson-ubuntu:~/workspace/mesa$

--
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Nominations for X.Org Foundation Board of Directors are OPEN
We are seeking nominations for candidates for election to the X.Org Foundation Board of Directors. All X.Org Foundation members are eligible for election to the board. Nominations for the 2014 election are now open and will remain open until 23.59 GMT on 12 February 2014. The Board consists of directors elected from the membership. Each year, an election is held to bring the total number of directors to eight. The four members receiving the highest vote totals will serve as directors for two-year terms. The directors who received two-year terms starting in 2013 were Alan Coopersmith, Martin Peres, Peter Hutterer and Stuart Kreitman. They will continue to serve until their term ends in 2015. Current directors whose term expires in 2014 are Matthias Hopf, Keith Packard, Matt Dew, and Alex Deucher. A director is expected to participate in the bi-weekly IRC meeting to discuss current business and to attend the annual meeting of the X.Org Foundation, which will be held at a location determined in advance by the Board of Directors. A member may nominate themselves or any other member they feel is qualified. Nominations should be sent to the Election Committee at elections at x.org. Nominees shall be required to be current members of the X.Org Foundation, and submit a personal statement of up to 200 words that will be provided to prospective voters. The collected statements, along with the statement of contribution to the X.Org Foundation in the member's account page on http://members.x.org, will be made available to all voters to help them make their voting decisions. Nominations, membership applications or renewals and completed personal statements must be received no later than 23.59 GMT on 12 February 2014. The slate of candidates will be published 13 February 2014 and candidate Q&A will begin then. The deadline for X.Org membership applications and renewals is 18 February 2014. 
The Election Committee X.Org Foundation ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Mesa PATCH 2/3] i965: Use the new drm_intel_bo offset64 field.
libdrm 2.4.52 introduces a new 'uint64_t offset64' field, intended to replace the old 'unsigned long offset' field. To preserve ABI, libdrm continues to store the presumed offset in both locations.

On Broadwell, a 64-bit kernel may place BOs at high (> 4G) addresses. However, with a 32-bit userspace, the 'unsigned long offset' field will only be 32-bit, which is not large enough to hold this value. We need to use a proper uint64_t (like the kernel does).

Technically, a lot of this code doesn't affect Broadwell, so we could leave it using the old field. But it makes sense to just switch to the new, properly typed field.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 configure.ac                                      |  2 +-
 src/mesa/drivers/dri/i965/brw_cc.c                |  2 +-
 src/mesa/drivers/dri/i965/brw_clip_state.c        |  2 +-
 src/mesa/drivers/dri/i965/brw_context.h           |  2 +-
 src/mesa/drivers/dri/i965/brw_sf_state.c          |  2 +-
 src/mesa/drivers/dri/i965/brw_vs_state.c          |  4 ++--
 src/mesa/drivers/dri/i965/brw_wm_sampler_state.c  |  2 +-
 src/mesa/drivers/dri/i965/brw_wm_state.c          |  4 ++--
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 14 +++---
 src/mesa/drivers/dri/i965/gen6_blorp.cpp          |  4 ++--
 src/mesa/drivers/dri/i965/gen7_blorp.cpp          |  4 ++--
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 14 +++---
 src/mesa/drivers/dri/i965/intel_batchbuffer.c     |  6 +++---
 13 files changed, 31 insertions(+), 31 deletions(-)

This was generated by temporarily removing the 'offset' field from libdrm and fixing all the compile errors. Obviously, we can't actually delete the field, but you can at least have some confidence that I caught all the existing uses.

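To illustrate the 32-bit userspace problem described in the commit message above, here is a minimal sketch of what happens when an offset above 4G is stored in both a 32-bit and a 64-bit field. Everything in it is hypothetical: struct fake_bo stands in for the two drm_intel_bo fields, and uint32_t is used to model 'unsigned long' as a 32-bit userspace would see it (an assumption for the sketch, since the real field width depends on the build).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two offset fields of drm_intel_bo.
 * uint32_t models 'unsigned long' on a 32-bit userspace. */
struct fake_bo {
   uint32_t offset;    /* old field: only 32 bits wide in this model */
   uint64_t offset64;  /* new field: always 64 bits wide */
};

/* Store the kernel-reported offset in both fields, the way the commit
 * message says libdrm keeps both locations updated to preserve ABI. */
static void
fake_bo_set_offset(struct fake_bo *bo, uint64_t kernel_offset)
{
   assert(bo != NULL);
   bo->offset64 = kernel_offset;
   bo->offset = (uint32_t) kernel_offset;  /* bits 63:32 are dropped here */
}

/* Returns nonzero when the old field no longer matches the real offset. */
static int
fake_bo_offset_truncated(const struct fake_bo *bo)
{
   assert(bo != NULL);
   return (uint64_t) bo->offset != bo->offset64;
}
```

With an offset of 0x100001000 (just above 4G), the old field in this model ends up holding 0x1000 while offset64 keeps the full value, so code that still reads the old field would compute a wrong presumed address. Below 4G the two fields agree, which is why the commit message notes that much of the touched code does not strictly need the change.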
diff --git a/configure.ac b/configure.ac
index 4b55140..fd189ea 100644
--- a/configure.ac
+++ b/configure.ac
@@ -29,7 +29,7 @@ AC_SUBST([OSMESA_VERSION])
 dnl Versions for external dependencies
 LIBDRM_REQUIRED=2.4.24
 LIBDRM_RADEON_REQUIRED=2.4.50
-LIBDRM_INTEL_REQUIRED=2.4.49
+LIBDRM_INTEL_REQUIRED=2.4.52
 LIBDRM_NVVIEUX_REQUIRED=2.4.33
 LIBDRM_NOUVEAU_REQUIRED="2.4.33 libdrm >= 2.4.41"
 LIBDRM_FREEDRENO_REQUIRED=2.4.51
diff --git a/src/mesa/drivers/dri/i965/brw_cc.c b/src/mesa/drivers/dri/i965/brw_cc.c
index 4bc3b23..497d91a 100644
--- a/src/mesa/drivers/dri/i965/brw_cc.c
+++ b/src/mesa/drivers/dri/i965/brw_cc.c
@@ -215,7 +215,7 @@ static void upload_cc_unit(struct brw_context *brw)
       cc->cc5.statistics_enable = 1;

    /* CACHE_NEW_CC_VP */
-   cc->cc4.cc_viewport_state_offset = (brw->batch.bo->offset +
+   cc->cc4.cc_viewport_state_offset = (brw->batch.bo->offset64 +
                                        brw->cc.vp_offset) >> 5; /* reloc */

    brw->state.dirty.cache |= CACHE_NEW_CC_UNIT;
diff --git a/src/mesa/drivers/dri/i965/brw_clip_state.c b/src/mesa/drivers/dri/i965/brw_clip_state.c
index 66b3229..8647b0d 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_state.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_state.c
@@ -132,7 +132,7 @@ brw_upload_clip_unit(struct brw_context *brw)
    {
       clip->clip5.guard_band_enable = 1;
       clip->clip6.clipper_viewport_state_ptr =
-         (brw->batch.bo->offset + brw->clip.vp_offset) >> 5;
+         (brw->batch.bo->offset64 + brw->clip.vp_offset) >> 5;

       /* emit clip viewport relocation */
       drm_intel_bo_emit_reloc(brw->batch.bo,
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 63dd4a0..77c4c3e 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1809,7 +1809,7 @@ brw_program_reloc(struct brw_context *brw, uint32_t state_offset,
                            prog_offset,
                            I915_GEM_DOMAIN_INSTRUCTION, 0);

-   return brw->cache.bo->offset + prog_offset;
+   return brw->cache.bo->offset64 + prog_offset;
 }

 bool brw_do_cubemap_normalize(struct exec_list *instructions);
diff --git a/src/mesa/drivers/dri/i965/brw_sf_state.c b/src/mesa/drivers/dri/i965/brw_sf_state.c
index 69093f2..9bc0cd3 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_state.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_state.c
@@ -173,7 +173,7 @@ static void upload_sf_unit( struct brw_context *brw )
       sf->thread4.stats_enable = 1;

    /* CACHE_NEW_SF_VP */
-   sf->sf5.sf_viewport_state_offset = (brw->batch.bo->offset +
+   sf->sf5.sf_viewport_state_offset = (brw->batch.bo->offset64 +
                                        brw->sf.vp_offset) >> 5; /* reloc */

    sf->sf5.viewport_transform = 1;
diff --git a/src/mesa/drivers/dri/i965/brw_vs_state.c b/src/mesa/drivers/dri/i965/brw_vs_state.c
index 015abf1..a3ea62d 100644
--- a/src/mesa/drivers/dri/i965/brw_vs_state.c
+++ b/src/mesa/drivers/dri/i965/brw_vs_state.c
@@ -84,7 +84,7 @@ brw_upload_vs_unit(struct brw_context *brw)

    if (brw->vs.prog_data->base.total_scratch != 0) {
       vs->thread2.scratch_space_base_pointer =
-         stage_state->scratch_bo->offset >> 10; /* reloc */
+