Re: [Mesa-dev] [v4 09/10] egl: definitions for EXT_image_dma_buf_import
On Thu, May 23, 2013 at 09:40:09PM -0700, Chad Versace wrote:
On 05/02/2013 12:08 AM, Topi Pohjolainen wrote:

As specified in:

http://www.khronos.org/registry/egl/extensions/EXT/EGL_EXT_image_dma_buf_import.txt

Checking for the valid fourcc values is left to drivers, avoiding a dependency on drm header files here.

v2: - enforce EGL_NO_CONTEXT
v3: - declare the extension as EGL (not GLES)
v4: - do not update eglext.h manually but rely on update from Khronos instead

Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
---
 src/egl/main/eglapi.c     | 7 -
 src/egl/main/egldisplay.h | 1 +
 src/egl/main/eglimage.c   | 76 +++
 src/egl/main/eglimage.h   | 15 ++
 src/egl/main/eglmisc.c    | 1 +
 5 files changed, 99 insertions(+), 1 deletion(-)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index bcc5465..2355d45 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -1310,7 +1310,12 @@ eglCreateImageKHR(EGLDisplay dpy, EGLContext ctx, EGLenum target,
    _EGL_CHECK_DISPLAY(disp, EGL_NO_IMAGE_KHR, drv);
    if (!disp->Extensions.KHR_image_base)
       RETURN_EGL_EVAL(disp, EGL_NO_IMAGE_KHR);
-   if (!context && ctx != EGL_NO_CONTEXT)
+
+   /**
+    * If target is EGL_LINUX_DMA_BUF_EXT, dpy must be a valid display,
+    * ctx must be EGL_NO_CONTEXT...
+    */
+   if (ctx != EGL_NO_CONTEXT && (!context || target == EGL_LINUX_DMA_BUF_EXT))
       RETURN_EGL_ERROR(disp, EGL_BAD_CONTEXT, EGL_NO_IMAGE_KHR);

    img = drv->API.CreateImageKHR(drv,
diff --git a/src/egl/main/egldisplay.h b/src/egl/main/egldisplay.h
index 4b33470..5a21f78 100644
--- a/src/egl/main/egldisplay.h
+++ b/src/egl/main/egldisplay.h
@@ -115,6 +115,7 @@ struct _egl_extensions
    EGLBoolean EXT_create_context_robustness;
    EGLBoolean EXT_buffer_age;
+   EGLBoolean EXT_image_dma_buf_import;
 };

diff --git a/src/egl/main/eglimage.c b/src/egl/main/eglimage.c
index bfae709..1cede31 100644
--- a/src/egl/main/eglimage.c
+++ b/src/egl/main/eglimage.c
@@ -93,6 +93,82 @@ _eglParseImageAttribList(_EGLImageAttribs *attrs, _EGLDisplay *dpy,
          attrs->PlaneWL = val;
          break;

+      case EGL_LINUX_DRM_FOURCC_EXT:
+         attrs->DMABufFourCC.Value = val;
+         attrs->DMABufFourCC.IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE0_FD_EXT:
+         attrs->DMABufPlaneFds[0].Value = val;
+         attrs->DMABufPlaneFds[0].IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE0_OFFSET_EXT:
+         attrs->DMABufPlaneOffsets[0].Value = val;
+         attrs->DMABufPlaneOffsets[0].IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE0_PITCH_EXT:
+         attrs->DMABufPlanePitches[0].Value = val;
+         attrs->DMABufPlanePitches[0].IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE1_FD_EXT:
+         attrs->DMABufPlaneFds[1].Value = val;
+         attrs->DMABufPlaneFds[1].IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE1_OFFSET_EXT:
+         attrs->DMABufPlaneOffsets[1].Value = val;
+         attrs->DMABufPlaneOffsets[1].IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE1_PITCH_EXT:
+         attrs->DMABufPlanePitches[1].Value = val;
+         attrs->DMABufPlanePitches[1].IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE2_FD_EXT:
+         attrs->DMABufPlaneFds[2].Value = val;
+         attrs->DMABufPlaneFds[2].IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE2_OFFSET_EXT:
+         attrs->DMABufPlaneOffsets[2].Value = val;
+         attrs->DMABufPlaneOffsets[2].IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE2_PITCH_EXT:
+         attrs->DMABufPlanePitches[2].Value = val;
+         attrs->DMABufPlanePitches[2].IsPresent = EGL_TRUE;
+         break;
+      case EGL_YUV_COLOR_SPACE_HINT_EXT:
+         if (val != EGL_ITU_REC601_EXT || val != EGL_ITU_REC709_EXT ||
+             val != EGL_ITU_REC2020_EXT) {

This should be `val != X && val != Y && val != Z`.

+            err = EGL_BAD_ATTRIBUTE;
+         } else {
+            attrs->DMABufYuvColorSpaceHint.Value = val;
+            attrs->DMABufYuvColorSpaceHint.IsPresent = EGL_TRUE;
+         }
+         break;
+      case EGL_SAMPLE_RANGE_HINT_EXT:
+         if (val != EGL_YUV_FULL_RANGE_EXT || val != EGL_YUV_NARROW_RANGE_EXT) {
+            err = EGL_BAD_ATTRIBUTE;

Again, s/||/&&/. Also, there is a tab above, but all the surrounding code uses spaces.

+         } else {
+            attrs->DMABufSampleRangeHint.Value = val;
+            attrs->DMABufSampleRangeHint.IsPresent = EGL_TRUE;
+         }
+         break;
+      case EGL_YUV_CHROMA_HORIZONTAL_SITING_HINT_EXT:
+         if (val != EGL_YUV_CHROMA_SITING_0_EXT ||
+             val != EGL_YUV_CHROMA_SITING_0_5_EXT) {
+err =
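The review point about `||` versus `&&` can be checked in isolation. The following is only a sketch: the `HINT_*` constants and both function names are made up for illustration and are not the EGL tokens or any Mesa code. It shows that a chain of `!=` comparisons joined with `||` rejects every value, including the valid ones, while the `&&` form rejects only values that match none of the constants.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the EGL color-space tokens; the real values
 * live in eglext.h and are not reproduced here. */
enum { HINT_REC601 = 1, HINT_REC709 = 2, HINT_REC2020 = 3 };

/* Form used in the patch: every value differs from at least two of the
 * three constants, so the || chain is true for any input. */
static bool rejects_with_or(int val)
{
   return val != HINT_REC601 || val != HINT_REC709 || val != HINT_REC2020;
}

/* Form suggested in review: reject only when val matches none of them. */
static bool rejects_with_and(int val)
{
   return val != HINT_REC601 && val != HINT_REC709 && val != HINT_REC2020;
}
```

With the `||` form the `else` branch that records the attribute is unreachable, so a valid hint would always produce EGL_BAD_ATTRIBUTE.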
[Mesa-dev] [PATCH] glsl linker: Initialize member variable interface_namespace.
Fixes Uninitialized pointer field defect reported by Coverity.

Signed-off-by: Vinson Lee v...@freedesktop.org
---
 src/glsl/lower_named_interface_blocks.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/glsl/lower_named_interface_blocks.cpp b/src/glsl/lower_named_interface_blocks.cpp
index eba667a..922cc02 100644
--- a/src/glsl/lower_named_interface_blocks.cpp
+++ b/src/glsl/lower_named_interface_blocks.cpp
@@ -72,7 +72,8 @@ public:
    hash_table *interface_namespace;

    flatten_named_interface_blocks_declarations(void *mem_ctx)
-      : mem_ctx(mem_ctx)
+      : mem_ctx(mem_ctx),
+        interface_namespace(NULL)
    {
    }

--
1.8.2.1
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Error compiling mesa 9.1.1
Hi Matt,

From the build in your path, it looks like you might be trying to do an out-of-tree build. I don't remember if that completely worked with 9.1. I just untarred 9.1.3 and did

libtoolize --force
./autogen.sh --with-dri-drivers=i965,swrast --with-gallium-drivers=swrast --enable-glx-tls --with-egl-platforms=x11 --enable-gles1 --enable-gles2 --enable-gallium-egl --disable-glu
make -jX

and it built.

Thanks for your update. It builds fine with 9.1.3 but not with 9.1.1. I will simply start using 9.1.3.

Thanks again,
Regards,
Divick
[Mesa-dev] [PATCH 2/3] st/vdpau: remove vlCreateHTAB from surface functions
From: Christian König christian.koe...@amd.com

Signed-off-by: Christian König christian.koe...@amd.com
---
 src/gallium/state_trackers/vdpau/surface.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/src/gallium/state_trackers/vdpau/surface.c b/src/gallium/state_trackers/vdpau/surface.c
index 135eb85..bd11fc3 100644
--- a/src/gallium/state_trackers/vdpau/surface.c
+++ b/src/gallium/state_trackers/vdpau/surface.c
@@ -54,11 +54,6 @@ vlVdpVideoSurfaceCreate(VdpDevice device, VdpChromaType chroma_type,
       goto inv_size;
    }

-   if (!vlCreateHTAB()) {
-      ret = VDP_STATUS_RESOURCES;
-      goto no_htab;
-   }
-
    p_surf = CALLOC(1, sizeof(vlVdpSurface));
    if (!p_surf) {
       ret = VDP_STATUS_RESOURCES;
@@ -110,7 +105,6 @@ inv_device:
    FREE(p_surf);

 no_res:
-no_htab:
 inv_size:
    return ret;
 }
@@ -272,9 +266,6 @@ vlVdpVideoSurfacePutBitsYCbCr(VdpVideoSurface surface,
    struct pipe_sampler_view **sampler_views;
    unsigned i, j;

-   if (!vlCreateHTAB())
-      return VDP_STATUS_RESOURCES;
-
    vlVdpSurface *p_surf = vlGetDataHTAB(surface);
    if (!p_surf)
       return VDP_STATUS_INVALID_HANDLE;
--
1.7.9.5
[Mesa-dev] [PATCH 1/3] st/vdpau: invalidate the handles on destruction
From: Christian König christian.koe...@amd.com

Fixes a problem with xbmc when switching channels.

Signed-off-by: Christian König christian.koe...@amd.com
---
 src/gallium/state_trackers/vdpau/decode.c  | 1 +
 src/gallium/state_trackers/vdpau/device.c  | 1 +
 src/gallium/state_trackers/vdpau/surface.c | 2 ++
 3 files changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/vdpau/decode.c b/src/gallium/state_trackers/vdpau/decode.c
index 61b10e0..2ffd8dd 100644
--- a/src/gallium/state_trackers/vdpau/decode.c
+++ b/src/gallium/state_trackers/vdpau/decode.c
@@ -139,6 +139,7 @@ vlVdpDecoderDestroy(VdpDecoder decoder)
    vldecoder->decoder->destroy(vldecoder->decoder);
    pipe_mutex_unlock(vldecoder->device->mutex);

+   vlRemoveDataHTAB(decoder);
    FREE(vldecoder);

    return VDP_STATUS_OK;
diff --git a/src/gallium/state_trackers/vdpau/device.c b/src/gallium/state_trackers/vdpau/device.c
index c530f43..a829c27 100644
--- a/src/gallium/state_trackers/vdpau/device.c
+++ b/src/gallium/state_trackers/vdpau/device.c
@@ -166,6 +166,7 @@ vlVdpDeviceDestroy(VdpDevice device)
    dev->context->destroy(dev->context);
    vl_screen_destroy(dev->vscreen);

+   vlRemoveDataHTAB(device);
    FREE(dev);
    vlDestroyHTAB();

diff --git a/src/gallium/state_trackers/vdpau/surface.c b/src/gallium/state_trackers/vdpau/surface.c
index ad56125..135eb85 100644
--- a/src/gallium/state_trackers/vdpau/surface.c
+++ b/src/gallium/state_trackers/vdpau/surface.c
@@ -132,7 +132,9 @@ vlVdpVideoSurfaceDestroy(VdpVideoSurface surface)
    p_surf->video_buffer->destroy(p_surf->video_buffer);
    pipe_mutex_unlock(p_surf->device->mutex);

+   vlRemoveDataHTAB(surface);
    FREE(p_surf);
+
    return VDP_STATUS_OK;
 }

--
1.7.9.5
[Mesa-dev] [PATCH 3/3] st/vdpau: destroy handle table only when it's empty
From: Christian König christian.koe...@amd.com

Signed-off-by: Christian König christian.koe...@amd.com
---
 src/gallium/state_trackers/vdpau/htab.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/vdpau/htab.c b/src/gallium/state_trackers/vdpau/htab.c
index 39ff7be..8b809f2 100644
--- a/src/gallium/state_trackers/vdpau/htab.c
+++ b/src/gallium/state_trackers/vdpau/htab.c
@@ -55,7 +55,7 @@ void vlDestroyHTAB(void)
 {
 #ifdef VL_HANDLES
    pipe_mutex_lock(htab_lock);
-   if (htab) {
+   if (htab && !handle_table_get_first_handle(htab)) {
       handle_table_destroy(htab);
       htab = NULL;
    }
--
1.7.9.5
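Read together, the three st/vdpau patches seem to aim at two rules: a handle is removed from the table when its object is destroyed, and the table itself is torn down only once no handles remain. The sketch below illustrates that pattern with an entirely hypothetical fixed-size table; `sketch_htab` and the `sketch_*` functions are invented for this example and are not the Mesa `handle_table` API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_SLOTS 8

/* Hypothetical, much simplified stand-in for the state tracker's handle
 * table; none of these names exist in Mesa. */
struct sketch_htab {
   void *slots[SKETCH_SLOTS];
   bool alive;
};

/* Store data in the first free slot; returns a 1-based handle, 0 if full. */
static unsigned sketch_add(struct sketch_htab *t, void *data)
{
   for (unsigned i = 0; i < SKETCH_SLOTS; ++i) {
      if (!t->slots[i]) {
         t->slots[i] = data;
         return i + 1;
      }
   }
   return 0;
}

/* Look up a handle; returns NULL for invalid or already removed handles. */
static void *sketch_get(const struct sketch_htab *t, unsigned handle)
{
   if (handle < 1 || handle > SKETCH_SLOTS)
      return NULL;
   return t->slots[handle - 1];
}

/* Invalidate a handle so a later lookup fails instead of returning a
 * pointer to freed data. */
static void sketch_remove(struct sketch_htab *t, unsigned handle)
{
   if (handle >= 1 && handle <= SKETCH_SLOTS)
      t->slots[handle - 1] = NULL;
}

/* Tear the table down only once no handle is left in it. */
static void sketch_destroy_if_empty(struct sketch_htab *t)
{
   if (!t->alive)
      return;
   for (unsigned i = 0; i < SKETCH_SLOTS; ++i) {
      if (t->slots[i])
         return;
   }
   t->alive = false;
}
```

In this sketch a stale handle resolves to NULL after removal, which corresponds to the VDP_STATUS_INVALID_HANDLE path in the surface code, and a destroy request is a no-op while any other object still owns a handle.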
Re: [Mesa-dev] [v4 06/10] intel: prepare for dri images having more than one plane
On Thu, May 23, 2013 at 09:39:57PM -0700, Chad Versace wrote:
On 05/02/2013 12:08 AM, Topi Pohjolainen wrote:

v2 (as advised by Eric):
   - use ARRAY_SIZE
   - re-use 'image_destroy' for cleaning up after failure
   - check directly the region pointer instead of the buffer object when determining if a region exists

Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
---
 src/mesa/drivers/dri/intel/intel_screen.c | 103 +-
 1 file changed, 72 insertions(+), 31 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_screen.c b/src/mesa/drivers/dri/intel/intel_screen.c
index 4973441..d822b1c 100644
--- a/src/mesa/drivers/dri/intel/intel_screen.c
+++ b/src/mesa/drivers/dri/intel/intel_screen.c
@@ -490,8 +490,14 @@ intel_create_image_from_texture(__DRIcontext *context, int target,
 static void
 intel_destroy_image(__DRIimage *image)
 {
-    intel_region_release(&image->regions[0]);
-    free(image);
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(image->regions); ++i) {
+      if (image->regions[i])
+         intel_region_release(&image->regions[i]);
+   }
+
+   free(image);
 }

 static __DRIimage *
@@ -568,16 +574,22 @@ intel_query_image(__DRIimage *image, int attrib, int *value)
 static __DRIimage *
 intel_dup_image(__DRIimage *orig_image, void *loaderPrivate)
 {
+   int i;
    __DRIimage *image;

    image = calloc(1, sizeof *image);
    if (image == NULL)
       return NULL;

-   intel_region_reference(&image->regions[0], orig_image->regions[0]);
-   if (image->regions[0] == NULL) {
-      free(image);
-      return NULL;

Pre-patch, this hunk returned NULL if orig_image->regions[0] was somehow NULL.

Good catch!

+   for (i = 0; i < ARRAY_SIZE(image->regions); ++i) {
+      if (!orig_image->regions[i])
+         break;

Post-patch, if orig_image->regions[0] was NULL, then this function no longer returns NULL because of the above break. To ensure that this patch doesn't regress anything, it needs to reproduce that behavior with `if (orig_image->regions[0] == NULL) return NULL`. Or, if you're confident (... I'm not, but maybe you are) that orig_image->regions[0] is never NULL, then assert that.

I maintained the old logic, skipping the copy in case the source does not have any regions. I cannot see how that would be possible, but I'm not confident enough to assert.

+
+      intel_region_reference(&image->regions[i], orig_image->regions[i]);
+      if (image->regions[i] == NULL) {
+         intel_destroy_image(image);
+         return NULL;
+      }
    }

    image->internal_format = orig_image->internal_format;
@@ -646,47 +658,76 @@ intel_create_image_from_names(__DRIscreen *screen,
 }

 static __DRIimage *
+intel_setup_image_from_fds(struct intel_screen *screen, int width, int height,
+                           const struct intel_image_format *f,
+                           const int *fds, int num_fds, const int *strides,
+                           void *loaderPriv)
+{

I don't see the utility in extracting this code out of intel_create_image_from_fds() into its own, similarly named function. In fact, it makes the code harder to read. If no following patch reuses this function, then its body should remain in its original location, intel_create_image_from_fds.

Agreed, it does complicate things.

+   int i;
+   __DRIimage *img;
+
+   if (f->nplanes == 1)
+      img = intel_allocate_image(f->planes[0].dri_format, loaderPriv);
+   else
+      img = intel_allocate_image(__DRI_IMAGE_FORMAT_NONE, loaderPriv);
+
+   if (img == NULL)
+      return NULL;
+
+   for (i = 0; i < num_fds; i++) {
+      img->regions[i] = intel_region_alloc_for_fd(screen, f->planes[i].cpp,
+                           width >> f->planes[i].width_shift,
+                           height >> f->planes[i].height_shift,
+                           strides[i], fds[i], "image");
+
+      if (img->regions[i] == NULL) {
+         intel_destroy_image(img);
+         return NULL;
+      }
+   }
+
+   intel_setup_image_from_dimensions(img);
+
+   return img;
+}
+
+static __DRIimage *
 intel_create_image_from_fds(__DRIscreen *screen,
                             int width, int height, int fourcc,
                             int *fds, int num_fds, int *strides, int *offsets,
                             void *loaderPrivate)
 {
    struct intel_screen *intelScreen = screen->driverPrivate;
-   struct intel_image_format *f;
+   struct intel_image_format *f = intel_image_format_lookup(fourcc);
    __DRIimage *image;
    int i, index;

-   if (fds == NULL || num_fds != 1)
-      return NULL;
-
-   f = intel_image_format_lookup(fourcc);
-   if (f == NULL)
+   /**
+    * In case the image is to consist of multiple regions, there must be exactly
+    * one region per plane.
+    */
+   if (fds == NULL || f == NULL || (num_fds > 1 && f->nplanes != num_fds))
[Mesa-dev] [Bug 64952] New: Build failure in egl-static when using llvm-3.3
https://bugs.freedesktop.org/show_bug.cgi?id=64952

Priority: medium
Bug ID: 64952
Assignee: mesa-dev@lists.freedesktop.org
Summary: Build failure in egl-static when using llvm-3.3
Severity: normal
Classification: Unclassified
OS: Linux (All)
Reporter: gustav.peters...@gmail.com
Hardware: Other
Status: NEW
Version: git
Component: Mesa core
Product: Mesa

egl-static needs the LLVM component IPO, which is only included when building with opencl.

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 64952] Build failure in egl-static when using llvm-3.3
https://bugs.freedesktop.org/show_bug.cgi?id=64952

--- Comment #1 from Gustav Petersson gustav.peters...@gmail.com ---
Created attachment 79759
  --> https://bugs.freedesktop.org/attachment.cgi?id=79759&action=edit
Proposed patch for building egl-static
[Mesa-dev] [PATCH 1/2] st/glx: add null ctx check in glXDestroyContext()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64934

NOTE: This is a candidate for the stable branches.
---
 src/gallium/state_trackers/glx/xlib/glx_api.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/src/gallium/state_trackers/glx/xlib/glx_api.c b/src/gallium/state_trackers/glx/xlib/glx_api.c
index a66ebc8..c6dc134 100644
--- a/src/gallium/state_trackers/glx/xlib/glx_api.c
+++ b/src/gallium/state_trackers/glx/xlib/glx_api.c
@@ -1353,16 +1353,18 @@ glXQueryExtension( Display *dpy, int *errorBase, int *eventBase )
 PUBLIC void
 glXDestroyContext( Display *dpy, GLXContext ctx )
 {
-   GLXContext glxCtx = ctx;
-   (void) dpy;
-   MakeCurrent_PrevContext = 0;
-   MakeCurrent_PrevDrawable = 0;
-   MakeCurrent_PrevReadable = 0;
-   MakeCurrent_PrevDrawBuffer = 0;
-   MakeCurrent_PrevReadBuffer = 0;
-   XMesaDestroyContext( glxCtx->xmesaContext );
-   XMesaGarbageCollect();
-   free(glxCtx);
+   if (ctx) {
+      GLXContext glxCtx = ctx;
+      (void) dpy;
+      MakeCurrent_PrevContext = 0;
+      MakeCurrent_PrevDrawable = 0;
+      MakeCurrent_PrevReadable = 0;
+      MakeCurrent_PrevDrawBuffer = 0;
+      MakeCurrent_PrevReadBuffer = 0;
+      XMesaDestroyContext( glxCtx->xmesaContext );
+      XMesaGarbageCollect();
+      free(glxCtx);
+   }
 }

--
1.7.10.4
[Mesa-dev] [PATCH 2/2] xlib: add null ctx check in glXDestroyContext()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64934

NOTE: This is a candidate for the stable branches.
---
 src/mesa/drivers/x11/fakeglx.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/x11/fakeglx.c b/src/mesa/drivers/x11/fakeglx.c
index c7fb327..031c305 100644
--- a/src/mesa/drivers/x11/fakeglx.c
+++ b/src/mesa/drivers/x11/fakeglx.c
@@ -1533,16 +1533,18 @@ void _kw_ungrab_all( Display *dpy )
 static void
 Fake_glXDestroyContext( Display *dpy, GLXContext ctx )
 {
-   struct fake_glx_context *glxCtx = (struct fake_glx_context *) ctx;
-   (void) dpy;
-   MakeCurrent_PrevContext = 0;
-   MakeCurrent_PrevDrawable = 0;
-   MakeCurrent_PrevReadable = 0;
-   MakeCurrent_PrevDrawBuffer = 0;
-   MakeCurrent_PrevReadBuffer = 0;
-   XMesaDestroyContext( glxCtx->xmesaContext );
-   XMesaGarbageCollect(dpy);
-   free(glxCtx);
+   if (ctx) {
+      struct fake_glx_context *glxCtx = (struct fake_glx_context *) ctx;
+      (void) dpy;
+      MakeCurrent_PrevContext = 0;
+      MakeCurrent_PrevDrawable = 0;
+      MakeCurrent_PrevReadable = 0;
+      MakeCurrent_PrevDrawBuffer = 0;
+      MakeCurrent_PrevReadBuffer = 0;
+      XMesaDestroyContext( glxCtx->xmesaContext );
+      XMesaGarbageCollect(dpy);
+      free(glxCtx);
+   }
 }

--
1.7.10.4
[Mesa-dev] [Bug 64934] [llvmpipe] SIGSEGV src/gallium/state_trackers/glx/xlib/glx_api.c:1363
https://bugs.freedesktop.org/show_bug.cgi?id=64934

--- Comment #1 from Brian Paul bri...@vmware.com ---
I've posted patches to add null pointer checking in glXDestroyContext. But the latest build of glxinfo wouldn't call glXDestroyContext with a null context either. In any case, I'm not sure why context creation is failing for you.
Re: [Mesa-dev] [v4 10/10] egl: dri2: support for creating images out of dma buffers
On 05/23/2013 10:15 PM, Pohjolainen, Topi wrote:
On Thu, May 23, 2013 at 09:39:30PM -0700, Chad Versace wrote:

When touching the src/egl/drivers/dri2 directory, use a commit subject that looks like "egl/dri2: STUFF", not "egl: dri2: STUFF".

[snip]

+/**
+ * The spec says:
+ *
+ * If eglCreateImageKHR is successful for a EGL_LINUX_DMA_BUF_EXT target,
+ * the EGL takes ownership of the file descriptor and is responsible for
+ * closing it, which it may do at any time while the EGLDisplay is
+ * initialized.
+ */
+static void
+dri2_take_dma_buf_ownership(const int *fds, unsigned num_fds)
+{
+   int already_closed[num_fds];
+   unsigned num_closed = 0;
+   unsigned i, j;
+
+   for (i = 0; i < num_fds; ++i) {
+      /**
+       * The same file descriptor can be referenced multiple times in case more
+       * than one plane is found in the same buffer, just with a different
+       * offset.
+       */
+      for (j = 0; j < num_closed; ++j) {
+         if (already_closed[j] == fds[i])

The condition above has undefined behavior, ...

There is the explicit counter 'num_closed' telling how many valid elements there are in 'already_closed'.

My mistake.
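The exchange above turns on the fact that only the first `num_closed` entries of `already_closed` are ever read, so the uninitialized tail of the array is never touched. A minimal sketch of the same de-duplication idea follows; `collect_unique_fds` is a hypothetical helper written for this illustration, not the function from the patch, and it only gathers the distinct descriptors rather than closing them.

```c
#include <assert.h>

/* Hypothetical helper: copy each distinct descriptor from fds[] into
 * unique[] (which must have room for num_fds entries) and return how many
 * were copied. Only unique[0..count-1] is ever read, mirroring the role of
 * the num_closed counter in the patch. */
static unsigned collect_unique_fds(const int *fds, unsigned num_fds, int *unique)
{
   unsigned count = 0;

   for (unsigned i = 0; i < num_fds; ++i) {
      unsigned j;

      /* Search only the entries written so far. */
      for (j = 0; j < count; ++j) {
         if (unique[j] == fds[i])
            break;
      }

      /* Not seen before: record it. */
      if (j == count)
         unique[count++] = fds[i];
   }

   return count;
}
```

A caller that owns the descriptors could then close each entry of `unique[0..count-1]` exactly once, which avoids a double close when two planes share one buffer.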
Re: [Mesa-dev] [PATCH 2/2] xlib: add null ctx check in glXDestroyContext()
----- Original Message -----

[snip]

Reviewed-by: Jose Fonseca jfons...@vmware.com
Re: [Mesa-dev] [PATCH 1/4] gallium: Add support for multiple viewports
Just some minor formatting nits below... On 05/23/2013 02:33 PM, Zack Rusin wrote: Gallium supported only a single viewport/scissor combination. This commit changes the interface to allow us to add support for multiple viewports/scissors. Signed-off-by: Zack Rusin za...@vmware.com --- src/gallium/auxiliary/cso_cache/cso_context.c | 37 +++ src/gallium/auxiliary/cso_cache/cso_context.h |9 +++--- src/gallium/auxiliary/draw/draw_context.c |6 ++-- src/gallium/auxiliary/draw/draw_context.h |5 +-- src/gallium/auxiliary/hud/hud_context.c |6 ++-- src/gallium/auxiliary/postprocess/pp_run.c |6 ++-- src/gallium/auxiliary/tgsi/tgsi_scan.c |6 src/gallium/auxiliary/tgsi/tgsi_scan.h |1 + src/gallium/auxiliary/tgsi/tgsi_strings.c |3 +- src/gallium/auxiliary/util/u_blit.c | 12 src/gallium/auxiliary/util/u_blitter.c |8 ++--- src/gallium/auxiliary/util/u_gen_mipmap.c |6 ++-- src/gallium/auxiliary/vl/vl_compositor.c|4 +-- src/gallium/auxiliary/vl/vl_idct.c |4 +-- src/gallium/auxiliary/vl/vl_matrix_filter.c |2 +- src/gallium/auxiliary/vl/vl_mc.c|2 +- src/gallium/auxiliary/vl/vl_median_filter.c |2 +- src/gallium/auxiliary/vl/vl_zscan.c |2 +- src/gallium/docs/source/context.rst |8 +++-- src/gallium/drivers/freedreno/freedreno_state.c | 10 +++--- src/gallium/drivers/galahad/glhd_context.c | 16 +- src/gallium/drivers/i915/i915_state.c | 12 +--- src/gallium/drivers/identity/id_context.c | 22 -- src/gallium/drivers/ilo/ilo_state.c | 14 + src/gallium/drivers/llvmpipe/lp_screen.c|2 ++ src/gallium/drivers/llvmpipe/lp_state_clip.c| 20 ++-- src/gallium/drivers/noop/noop_state.c | 14 + src/gallium/drivers/nv30/nv30_draw.c|2 +- src/gallium/drivers/nv30/nv30_state.c | 14 + src/gallium/drivers/nv50/nv50_state.c | 16 +- src/gallium/drivers/nvc0/nvc0_state.c | 14 + src/gallium/drivers/r300/r300_context.c |2 +- src/gallium/drivers/r300/r300_state.c | 16 +- src/gallium/drivers/r600/evergreen_state.c |5 +-- src/gallium/drivers/r600/r600_state.c |7 +++-- src/gallium/drivers/r600/r600_state_common.c|9 +++--- 
src/gallium/drivers/radeonsi/si_state.c | 14 + src/gallium/drivers/rbug/rbug_context.c | 22 -- src/gallium/drivers/softpipe/sp_screen.c|2 ++ src/gallium/drivers/softpipe/sp_state_clip.c| 16 +- src/gallium/drivers/svga/svga_pipe_misc.c | 18 ++- src/gallium/drivers/svga/svga_swtnl_state.c |2 +- src/gallium/drivers/trace/tr_context.c | 28 + src/gallium/include/pipe/p_context.h| 10 +++--- src/gallium/include/pipe/p_defines.h|3 +- src/gallium/include/pipe/p_shader_tokens.h |3 +- src/gallium/include/pipe/p_state.h |1 + src/gallium/state_trackers/vega/renderer.c | 10 +++--- src/gallium/state_trackers/xa/xa_renderer.c |2 +- src/gallium/state_trackers/xorg/xorg_renderer.c |2 +- src/gallium/tests/graw/fs-test.c|2 +- src/gallium/tests/graw/graw_util.h |2 +- src/gallium/tests/graw/gs-test.c|2 +- src/gallium/tests/graw/quad-sample.c|2 +- src/gallium/tests/graw/shader-leak.c|2 +- src/gallium/tests/graw/tri-gs.c |2 +- src/gallium/tests/graw/tri-instanced.c |2 +- src/gallium/tests/graw/vs-test.c|2 +- src/gallium/tests/trivial/quad-tex.c|2 +- src/gallium/tests/trivial/tri.c |2 +- src/mesa/state_tracker/st_atom_scissor.c|2 +- src/mesa/state_tracker/st_atom_viewport.c |2 +- src/mesa/state_tracker/st_cb_bitmap.c |6 ++-- src/mesa/state_tracker/st_cb_clear.c|6 ++-- src/mesa/state_tracker/st_cb_drawpixels.c |6 ++-- src/mesa/state_tracker/st_cb_drawtex.c |6 ++-- src/mesa/state_tracker/st_draw_feedback.c |2 +- 67 files changed, 290 insertions(+), 217 deletions(-) diff --git a/src/gallium/drivers/galahad/glhd_context.c b/src/gallium/drivers/galahad/glhd_context.c index a73a3ad..849c12e 100644 --- a/src/gallium/drivers/galahad/glhd_context.c +++ b/src/gallium/drivers/galahad/glhd_context.c @@ -524,25 +524,27 @@ galahad_context_set_polygon_stipple(struct pipe_context *_pipe, } static void -galahad_context_set_scissor_state(struct pipe_context *_pipe,
Re: [Mesa-dev] [PATCH 1/4] gallium: Add support for multiple viewports
On 05/23/2013 03:02 PM, Roland Scheidegger wrote:
On 23.05.2013 22:33, Zack Rusin wrote:

Gallium supported only a single viewport/scissor combination. This commit changes the interface to allow us to add support for multiple viewports/scissors.

Signed-off-by: Zack Rusin za...@vmware.com
---

diff --git a/src/gallium/include/pipe/p_context.h b/src/gallium/include/pipe/p_context.h
index d1130bc..eaaa043 100644
--- a/src/gallium/include/pipe/p_context.h
+++ b/src/gallium/include/pipe/p_context.h
@@ -211,11 +211,13 @@ struct pipe_context {
    void (*set_polygon_stipple)( struct pipe_context *,
                                 const struct pipe_poly_stipple * );

-   void (*set_scissor_state)( struct pipe_context *,
-                              const struct pipe_scissor_state * );
+   void (*set_scissor_states)( struct pipe_context *,
+                               unsigned num_scissors,
+                               const struct pipe_scissor_state * );

-   void (*set_viewport_state)( struct pipe_context *,
-                               const struct pipe_viewport_state * );
+   void (*set_viewport_states)( struct pipe_context *,
+                                unsigned num_viewports,
+                                const struct pipe_viewport_state *);

    void (*set_fragment_sampler_views)(struct pipe_context *,
                                       unsigned num_views,
diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h
index bb86968..00f0a37 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -507,7 +507,8 @@ enum pipe_cap {
    PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER = 80,
    PIPE_CAP_QUERY_PIPELINE_STATISTICS = 81,
    PIPE_CAP_TEXTURE_BORDER_COLOR_QUIRK = 82,
-   PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE = 83
+   PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE = 83,
+   PIPE_CAP_MULTIPLE_VIEWPORTS = 84

Would it be better if this were PIPE_CAP_MAX_VIEWPORTS instead? Though I guess there's no real need right now to support anything but 16 (as that's needed by d3d10/11, and is the minimum supported value for GL, though GL would allow for more), so I don't have a strong opinion on that.

I second this suggestion.

-Brian
Re: [Mesa-dev] [PATCH 03/13] gallium: Introduce 32-bit bytewise format names
----- Original Message -----
Michel Dänzer mic...@daenzer.net writes:

For packed formats such as RGBA, the order used in these patches (which is what I suggested in my proposal) matches the order humans use for digits of numbers, as well as the Mesa formats. That seems more important to me than 'matching' any non-packed formats (which only makes sense if one presumes little endian byte order).

I'm sorry I didn't notice this was what you proposed earlier. However I don't think that consistency with Mesa formats is strong: Mesa formats even have both ways with their *_REV variants. So I prefer that we keep the existing low-high bit/byte/word/etc naming convention for gallium formats.

Fair enough.

I do appreciate all the work and thought that went into this series so far, and I really want to get this in. So here is a summary of what's needed from my POV to get this in mergeable state:

- leave r8g8b8a8 variants alone (ie, as endianness independent)
- fix the util_format_description::is_array == TRUE && util_format_description::is_bitmask == TRUE ambiguity, either:
- add new rgba/argb formats for endianness dependent formats
- add a new field on util_format_description (e.g., native_endian) for the endianness formats (**)
- or add rgba/argb #define
- make sure util_format_description::is_array is set for r8g8b8a8 variants, but util_format_description::is_bitmask is not

Is there any drawback to the latter approach for formats where it's feasible? If not, it might reduce code duplication somewhat.

(**) Actually, I'm surprised that formats like PIPE_FORMAT_B5G6R5_UNORM aren't busted on big-endian without this, as they haven't been converted yet, so they need to be handled precisely as before, right? I suppose everything was busted before, so no net change here.

Exactly. :\

Yeah.
I deliberately left those for future work :-) The 8-bits-per-channel formats were more interesting when trying out your idea, both because they're used more and because they have both the array and int interpretations. But it was actually because of things like B5G6R5 that I used int formats like RGBA and made .8.8.8.8 an alias of them, rather than the other way around.

The layout of B5G6R5 on little-endian targets is AIUI:

    76543210 76543210
    GGGBBBBB RRRRRGGG

Reversing the components gives:

    76543210 76543210
    GGGRRRRR BBBBBGGG

But on a big-endian target the blue first format is:

    76543210 76543210
    BBBBBGGG GGGRRRRR

and reversing the components gives:

    76543210 76543210
    RRRRRGGG GGGBBBBB

So, unlike for the .8.8.8.8 formats, a plain swizzle doesn't give you the other endianness. You need to do something more complicated. Little-endian support for the big-endian arrangement, and vice-versa, would be pretty involved. So in practice I thought we'd want the first two formats on little-endian targets and the last two on big-endian targets. I thought that would mean _replacing_ the current B5G6R5 and R5G6B5 formats with BGR565 and RGB565 formats that match the endian-specific arrangements above. These two int formats wouldn't be aliases of a target-independent representation.

So the patch was in some ways an experiment to see how easy it would be to make gallium treat a common format like .8.8.8.8 as an int, in the hope that if it was easy, things like .5.6.5 would be less of a special case. And in the end it all seemed pretty natural, although of course that's from a newbie's perspective. You both know the code much better than I do.

Sorry for the delay. I agree that with non-array formats, like B5G6R5 and R5G6B5, replacing them with endian-variant BGR565 and RGB565 makes a lot of sense (as the swapped version will probably never be needed). But I'm not sure about RGBA8 variants...

- On one hand, it is often more efficient to read/write them as 32bit integers than as an array of bytes.
- On the other hand it is easier to think of them as an array of bytes than an integer quantity.

One thing is clear -- a given format can't be both -- either it is an endianness-variant packed color or an endianness-invariant array color. The choices are to force rgba8 to be one kind, the other kind, or to have different format enums for each.

Jose
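The 5/6/5 byte layouts discussed in this thread can be reproduced with a few lines of C. This is a sketch under stated assumptions: `pack_b5g6r5`, `store_le16` and `store_be16` are hypothetical helpers written for this illustration (not gallium functions), and the packing puts blue in the lowest bits, matching the little-endian description given in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Pack 5/6/5 components with blue in the lowest bits, so the 16-bit value
 * reads RRRRRGGG GGGBBBBB from most to least significant bit. */
static uint16_t pack_b5g6r5(unsigned r, unsigned g, unsigned b)
{
   return (uint16_t)(((r & 0x1f) << 11) | ((g & 0x3f) << 5) | (b & 0x1f));
}

/* Store a 16-bit value in an explicit byte order, so the sketch behaves
 * the same regardless of the host's endianness. */
static void store_le16(uint16_t v, uint8_t out[2])
{
   out[0] = (uint8_t)(v & 0xff);   /* low byte first:  GGGBBBBB */
   out[1] = (uint8_t)(v >> 8);     /* then high byte:  RRRRRGGG */
}

static void store_be16(uint16_t v, uint8_t out[2])
{
   out[0] = (uint8_t)(v >> 8);     /* high byte first: RRRRRGGG */
   out[1] = (uint8_t)(v & 0xff);   /* then low byte:   GGGBBBBB */
}
```

For a pure red pixel, the little-endian bytes of the packed value are `00 F8` and the big-endian bytes are `F8 00`. Swapping the red and blue components and storing little-endian gives `1F 00`, which matches neither, so a component swizzle alone does not turn one byte order into the other for this format; a byte swap of the packed value is what differs between the two.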
[Mesa-dev] [Bug 64959] New: Cannot build against EGL without X11
https://bugs.freedesktop.org/show_bug.cgi?id=64959

Priority: medium
Bug ID: 64959
Assignee: mesa-dev@lists.freedesktop.org
Summary: Cannot build against EGL without X11
Severity: normal
Classification: Unclassified
OS: All
Reporter: r...@burtonini.com
Hardware: Other
Status: NEW
Version: unspecified
Component: Mesa core
Product: Mesa

I'm building wayland 1.1 and weston 1.1 in an environment without any X headers against Mesa 9.0.2:

| In file included from /data/poky-master/tmp/sysroots/atom-pc/usr/include/EGL/egl.h:36:0,
| from /data/poky-master/tmp/work/core2-poky-linux/weston/1.1.0-r0/weston-1.1.0/src/gl-renderer.h:30,
| from /data/poky-master/tmp/work/core2-poky-linux/weston/1.1.0-r0/weston-1.1.0/src/gl-renderer.c:35:
| /data/poky-master/tmp/sysroots/atom-pc/usr/include/EGL/eglplatform.h:118:22: fatal error: X11/Xlib.h: No such file or directory

Currently I'm working around this by adding -DMESA_EGL_NO_X11_HEADERS to CFLAGS, but that's clearly not the right thing to do.
[Mesa-dev] [PATCH 4/7] radeonsi: Add support for TGSI TXF opcode
From: Michel Dänzer michel.daen...@amd.com Signed-off-by: Michel Dänzer michel.daen...@amd.com --- src/gallium/drivers/radeonsi/radeonsi_shader.c | 63 -- 1 file changed, 50 insertions(+), 13 deletions(-) diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c b/src/gallium/drivers/radeonsi/radeonsi_shader.c index df7aa1e..b82f885 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_shader.c +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c @@ -934,7 +934,7 @@ static void tex_fetch_args( } /* Pack LOD */ - if (opcode == TGSI_OPCODE_TXL) + if (opcode == TGSI_OPCODE_TXL || opcode == TGSI_OPCODE_TXF) address[count++] = coords[3]; if (count 16) { @@ -949,26 +949,56 @@ static void tex_fetch_args( ); } - /* Pad to power of two vector */ - while (count util_next_power_of_two(count)) - address[count++] = LLVMGetUndef(LLVMInt32TypeInContext(gallivm-context)); - - emit_data-args[0] = lp_build_gather_values(gallivm, address, count); - /* Resource */ emit_data-args[1] = si_shader_ctx-resources[emit_data-inst-Src[1].Register.Index]; - /* Sampler */ - emit_data-args[2] = si_shader_ctx-samplers[emit_data-inst-Src[1].Register.Index]; + if (opcode == TGSI_OPCODE_TXF) { + /* add tex offsets */ + if (inst-Texture.NumOffsets) { + struct lp_build_context *uint_bld = bld_base-uint_bld; + struct lp_build_tgsi_soa_context *bld = lp_soa_context(bld_base); + const struct tgsi_texture_offset * off = inst-TexOffsets; + + assert(inst-Texture.NumOffsets == 1); + + address[0] = + lp_build_add(uint_bld, address[0], + bld-immediates[off-Index][off-SwizzleX]); + if (num_coords 1) + address[1] = + lp_build_add(uint_bld, address[1], + bld-immediates[off-Index][off-SwizzleY]); + if (num_coords 2) + address[2] = + lp_build_add(uint_bld, address[2], + bld-immediates[off-Index][off-SwizzleZ]); + } - /* Dimensions */ - emit_data-args[3] = lp_build_const_int32(bld_base-base.gallivm, target); + emit_data-dst_type = LLVMVectorType( + LLVMInt32TypeInContext(bld_base-base.gallivm-context), + 4); - 
emit_data-arg_count = 4; + emit_data-arg_count = 3; + } else { + /* Sampler */ + emit_data-args[2] = si_shader_ctx-samplers[emit_data-inst-Src[1].Register.Index]; - emit_data-dst_type = LLVMVectorType( + emit_data-dst_type = LLVMVectorType( LLVMFloatTypeInContext(bld_base-base.gallivm-context), 4); + + emit_data-arg_count = 4; + } + + /* Dimensions */ + emit_data-args[emit_data-arg_count - 1] = + lp_build_const_int32(bld_base-base.gallivm, target); + + /* Pad to power of two vector */ + while (count util_next_power_of_two(count)) + address[count++] = LLVMGetUndef(LLVMInt32TypeInContext(gallivm-context)); + + emit_data-args[0] = lp_build_gather_values(gallivm, address, count); } static void build_tex_intrinsic(const struct lp_build_tgsi_action * action, @@ -999,6 +1029,12 @@ static const struct lp_build_tgsi_action txb_action = { .intr_name = llvm.SI.sampleb. }; +static const struct lp_build_tgsi_action txf_action = { + .fetch_args = tex_fetch_args, + .emit = build_tex_intrinsic, + .intr_name = llvm.SI.imageload. +}; + static const struct lp_build_tgsi_action txl_action = { .fetch_args = tex_fetch_args, .emit = build_tex_intrinsic, @@ -1243,6 +1279,7 @@ int si_pipe_shader_create( bld_base-op_actions[TGSI_OPCODE_TEX] = tex_action; bld_base-op_actions[TGSI_OPCODE_TXB] = txb_action; + bld_base-op_actions[TGSI_OPCODE_TXF] = txf_action; bld_base-op_actions[TGSI_OPCODE_TXL] = txl_action; bld_base-op_actions[TGSI_OPCODE_TXP] = tex_action; -- 1.8.3.rc3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
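The "Pad to power of two vector" step that this patch moves to the end of tex_fetch_args() can be sketched in isolation. This is a minimal illustration, not the driver code: `next_pow2` is a stand-in for Mesa's util_next_power_of_two(), `pad_to_pow2` is a hypothetical helper, and plain ints replace the LLVM undef values the driver appends.

```c
#include <assert.h>

/* Stand-in for util_next_power_of_two(): smallest power of two >= x.
 * This sketch returns 1 for x == 0 and assumes x <= 2^31. */
static unsigned next_pow2(unsigned x)
{
	unsigned p = 1;

	assert(x <= 0x80000000u);
	while (p < x)
		p <<= 1;
	return p;
}

/* Hypothetical helper: pad vec[0..count) with `filler` until the element
 * count is a power of two. Returns the new count. */
static unsigned pad_to_pow2(int *vec, unsigned count, unsigned capacity, int filler)
{
	unsigned padded = next_pow2(count);

	assert(padded <= capacity);
	while (count < padded)
		vec[count++] = filler;
	return count;
}
```

Padding has to happen after the last write to the address array, which may be why the patch moves it below the TXF offset handling; that reading is an inference from the diff, not something the commit message states.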
[Mesa-dev] [PATCH 1/7] radeonsi: Fix hardware state for dual source blending
From: Michel Dänzer michel.daen...@amd.com Set up CB_SHADER_MASK register according to pixel shader exports, and enable some minimal state for colour buffer 1 in case dual source blending is used. Signed-off-by: Michel Dänzer michel.daen...@amd.com --- src/gallium/drivers/radeonsi/radeonsi_shader.c | 5 + src/gallium/drivers/radeonsi/radeonsi_shader.h | 1 + src/gallium/drivers/radeonsi/si_state.c| 16 ++-- src/gallium/drivers/radeonsi/si_state_draw.c | 1 + 4 files changed, 17 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c b/src/gallium/drivers/radeonsi/radeonsi_shader.c index 484f7ec..3e023f8 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_shader.c +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c @@ -461,6 +461,8 @@ static void si_llvm_init_export_args(struct lp_build_tgsi_context *bld_base, else si_shader_ctx-shader-spi_shader_col_format |= V_028714_SPI_SHADER_32_ABGR (4 * cbuf); + + si_shader_ctx-shader-cb_shader_mask |= 0xf (4 * cbuf); } } @@ -806,6 +808,7 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base) si_shader_ctx-shader-spi_shader_col_format |= V_028714_SPI_SHADER_32_ABGR; + si_shader_ctx-shader-cb_shader_mask |= S_02823C_OUTPUT0_ENABLE(0xf); } /* Specify whether the EXEC mask represents the valid mask */ @@ -830,6 +833,8 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base) si_shader_ctx-shader-spi_shader_col_format |= si_shader_ctx-shader-spi_shader_col_format 4; + si_shader_ctx-shader-cb_shader_mask |= + si_shader_ctx-shader-cb_shader_mask 4; } last_args[3] = lp_build_const_int32(base-gallivm, V_008DFC_SQ_EXP_MRT); diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.h b/src/gallium/drivers/radeonsi/radeonsi_shader.h index 01b8b5d..33e81c7 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_shader.h +++ b/src/gallium/drivers/radeonsi/radeonsi_shader.h @@ -140,6 +140,7 @@ struct si_pipe_shader { unsignednum_vgprs; unsignedspi_ps_input_ena; 
unsignedspi_shader_col_format; + unsignedcb_shader_mask; unsignedsprite_coord_enable; unsignedso_strides[4]; union si_shader_key key; diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index dec535c..e7dc792 100644 --- a/src/gallium/drivers/radeonsi/si_state.c +++ b/src/gallium/drivers/radeonsi/si_state.c @@ -1728,6 +1728,12 @@ static void si_cb(struct r600_context *rctx, struct si_pm4_state *pm4, si_pm4_set_reg(pm4, R_028C70_CB_COLOR0_INFO + cb * 0x3C, color_info); si_pm4_set_reg(pm4, R_028C74_CB_COLOR0_ATTRIB + cb * 0x3C, color_attrib); + /* set CB_COLOR1_INFO for possible dual-src blending */ + if (state-nr_cbufs == 1) { + assert(cb == 0); + si_pm4_set_reg(pm4, R_028C70_CB_COLOR0_INFO + 1 * 0x3C, color_info); + } + /* Determine pixel shader export format */ max_comp_size = si_colorformat_max_comp_size(format); if (ntype == V_028C70_NUMBER_SRGB || @@ -1735,6 +1741,9 @@ static void si_cb(struct r600_context *rctx, struct si_pm4_state *pm4, max_comp_size = 10) || (ntype == V_028C70_NUMBER_FLOAT max_comp_size = 16)) { rctx-export_16bpc |= 1 cb; + /* set SPI_SHADER_COL_FORMAT for possible dual-src blending */ + if (state-nr_cbufs == 1) + rctx-export_16bpc |= 1 1; } } @@ -1811,7 +1820,7 @@ static void si_set_framebuffer_state(struct pipe_context *ctx, { struct r600_context *rctx = (struct r600_context *)ctx; struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state); - uint32_t shader_mask, tl, br; + uint32_t tl, br; int tl_x, tl_y, br_x, br_y; if (pm4 == NULL) @@ -1832,10 +1841,6 @@ static void si_set_framebuffer_state(struct pipe_context *ctx, assert(!(rctx-export_16bpc ~0xff)); si_db(rctx, pm4, state); - shader_mask = 0; - for (int i = 0; i state-nr_cbufs; i++) { - shader_mask |= 0xf (i * 4); - } tl_x = 0; tl_y = 0; br_x = state-width; @@ -1854,7 +1859,6 @@ static void si_set_framebuffer_state(struct pipe_context *ctx, si_pm4_set_reg(pm4, R_028208_PA_SC_WINDOW_SCISSOR_BR, br); si_pm4_set_reg(pm4, R_028200_PA_SC_WINDOW_OFFSET, 
0x); si_pm4_set_reg(pm4,
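The mask arithmetic in this patch can be sketched on its own. The sketch below assumes, as the `0xf << (4 * cbuf)` expression in the diff suggests, that CB_SHADER_MASK holds one 4-bit enable nibble per colour buffer; the function names are hypothetical and nothing here touches real register state.

```c
#include <assert.h>
#include <stdint.h>

/* One 4-bit enable nibble per colour buffer; this sketch allows up to 8. */
static uint32_t mrt_nibble(unsigned cbuf)
{
	assert(cbuf < 8);
	return 0xfu << (4 * cbuf);
}

/* Hypothetical helper: build the mask from the colour buffers a pixel
 * shader actually exports to, instead of from the framebuffer's buffer count. */
static uint32_t shader_mask_from_exports(const unsigned *cbufs, unsigned n)
{
	uint32_t mask = 0;
	unsigned i;

	for (i = 0; i < n; i++)
		mask |= mrt_nibble(cbufs[i]);
	return mask;
}

/* Mirrors the `mask |= mask << 4` step in the diff: the nibble for the
 * first output is replicated into the next slot. */
static uint32_t replicate_to_next_slot(uint32_t mask)
{
	return mask | (mask << 4);
}
```

A shader exporting to buffers 0 and 2 yields 0xf0f, and replicating a single-output mask of 0xf yields 0xff.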
[Mesa-dev] [PATCH 3/7] radeonsi: Use tgsi_util_get_texture_coord_dim()
From: Michel Dänzer michel.daen...@amd.com

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/drivers/radeonsi/radeonsi_shader.c | 32 +++++++-------------------------
 1 file changed, 7 insertions(+), 25 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c b/src/gallium/drivers/radeonsi/radeonsi_shader.c
index 3e023f8..df7aa1e 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
@@ -40,6 +40,7 @@
 #include "tgsi/tgsi_info.h"
 #include "tgsi/tgsi_parse.h"
 #include "tgsi/tgsi_scan.h"
+#include "tgsi/tgsi_util.h"
 #include "tgsi/tgsi_dump.h"
 
 #include "radeonsi_pipe.h"
@@ -863,6 +864,8 @@ static void tex_fetch_args(
 	unsigned target = inst->Texture.Texture;
 	LLVMValueRef coords[4];
 	LLVMValueRef address[16];
+	int ref_pos;
+	unsigned num_coords = tgsi_util_get_texture_coord_dim(target, &ref_pos);
 	unsigned count = 0;
 	unsigned chan;
 
@@ -896,11 +899,10 @@ static void tex_fetch_args(
 	case TGSI_TEXTURE_SHADOW1D_ARRAY:
 	case TGSI_TEXTURE_SHADOW2D:
 	case TGSI_TEXTURE_SHADOWRECT:
-		address[count++] = coords[2];
-		break;
 	case TGSI_TEXTURE_SHADOWCUBE:
 	case TGSI_TEXTURE_SHADOW2D_ARRAY:
-		address[count++] = coords[3];
+		assert(ref_pos >= 0);
+		address[count++] = coords[ref_pos];
 		break;
 	case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
 		address[count++] = lp_build_emit_fetch(bld_base, inst, 1, 0);
@@ -908,30 +910,10 @@ static void tex_fetch_args(
 
 	/* Pack texture coordinates */
 	address[count++] = coords[0];
-	switch (target) {
-	case TGSI_TEXTURE_2D:
-	case TGSI_TEXTURE_2D_ARRAY:
-	case TGSI_TEXTURE_3D:
-	case TGSI_TEXTURE_CUBE:
-	case TGSI_TEXTURE_RECT:
-	case TGSI_TEXTURE_SHADOW2D:
-	case TGSI_TEXTURE_SHADOWRECT:
-	case TGSI_TEXTURE_SHADOW2D_ARRAY:
-	case TGSI_TEXTURE_SHADOWCUBE:
-	case TGSI_TEXTURE_2D_MSAA:
-	case TGSI_TEXTURE_2D_ARRAY_MSAA:
-	case TGSI_TEXTURE_CUBE_ARRAY:
-	case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
+	if (num_coords > 1)
 		address[count++] = coords[1];
-	}
-	switch (target) {
-	case TGSI_TEXTURE_3D:
-	case TGSI_TEXTURE_CUBE:
-	case TGSI_TEXTURE_SHADOWCUBE:
-	case TGSI_TEXTURE_CUBE_ARRAY:
-	case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
+	if (num_coords > 2)
 		address[count++] = coords[2];
-	}
 
 	/* Pack array slice */
 	switch (target) {
-- 
1.8.3.rc3
[Mesa-dev] [PATCH 0/7] radeonsi: GLSL 1.30 support
This series fixes a couple of problems in preparation, then adds the missing functionality for GLSL 1.30 and finally enables it.

This enables around 800 more piglit tests, keeping the overall passrate about the same as before.

[PATCH 1/7] radeonsi: Fix hardware state for dual source blending
[PATCH 2/7] radeonsi: Make border colour state handling safe for
[PATCH 3/7] radeonsi: Use tgsi_util_get_texture_coord_dim()
[PATCH 4/7] radeonsi: Add support for TGSI TXF opcode
[PATCH 5/7] radeonsi: Handle TGSI TXQ opcode
[PATCH 6/7] radeonsi: Handle TGSI_SEMANTIC_CLIPDIST
[PATCH 7/7] radeonsi: Enable GLSL 1.30
[Mesa-dev] [PATCH 2/7] radeonsi: Make border colour state handling safe for integer textures
From: Michel Dänzer michel.daen...@amd.com Signed-off-by: Michel Dänzer michel.daen...@amd.com --- src/gallium/drivers/radeonsi/radeonsi_pipe.h | 2 +- src/gallium/drivers/radeonsi/si_state.c | 45 2 files changed, 27 insertions(+), 20 deletions(-) diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.h b/src/gallium/drivers/radeonsi/radeonsi_pipe.h index 3274049..67cb14b 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_pipe.h +++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.h @@ -87,7 +87,7 @@ struct si_pipe_sampler_view { struct si_pipe_sampler_state { uint32_tval[4]; - float border_color[4]; + uint32_tborder_color[4]; }; struct si_cs_shader_state { diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index e7dc792..4556be6 100644 --- a/src/gallium/drivers/radeonsi/si_state.c +++ b/src/gallium/drivers/radeonsi/si_state.c @@ -2273,11 +2273,31 @@ static void si_sampler_view_destroy(struct pipe_context *ctx, FREE(resource); } +static bool wrap_mode_uses_border_color(unsigned wrap, bool linear_filter) +{ + return wrap == PIPE_TEX_WRAP_CLAMP_TO_BORDER || + wrap == PIPE_TEX_WRAP_MIRROR_CLAMP_TO_BORDER || + (linear_filter + (wrap == PIPE_TEX_WRAP_CLAMP || +wrap == PIPE_TEX_WRAP_MIRROR_CLAMP)); +} + +static bool sampler_state_needs_border_color(const struct pipe_sampler_state *state) +{ + bool linear_filter = state-min_img_filter != PIPE_TEX_FILTER_NEAREST || +state-mag_img_filter != PIPE_TEX_FILTER_NEAREST; + + return (state-border_color.ui[0] || state-border_color.ui[1] || + state-border_color.ui[2] || state-border_color.ui[3]) + (wrap_mode_uses_border_color(state-wrap_s, linear_filter) || + wrap_mode_uses_border_color(state-wrap_t, linear_filter) || + wrap_mode_uses_border_color(state-wrap_r, linear_filter)); +} + static void *si_create_sampler_state(struct pipe_context *ctx, const struct pipe_sampler_state *state) { struct si_pipe_sampler_state *rstate = CALLOC_STRUCT(si_pipe_sampler_state); - union util_color uc; unsigned 
aniso_flag_offset = state-max_anisotropy 1 ? 2 : 0; unsigned border_color_type; @@ -2285,20 +2305,10 @@ static void *si_create_sampler_state(struct pipe_context *ctx, return NULL; } - util_pack_color(state-border_color.f, PIPE_FORMAT_A8R8G8B8_UNORM, uc); - switch (uc.ui) { - case 0x00FF: - border_color_type = V_008F3C_SQ_TEX_BORDER_COLOR_OPAQUE_BLACK; - break; - case 0x: - border_color_type = V_008F3C_SQ_TEX_BORDER_COLOR_TRANS_BLACK; - break; - case 0x: - border_color_type = V_008F3C_SQ_TEX_BORDER_COLOR_OPAQUE_WHITE; - break; - default: /* Use border color pointer */ + if (sampler_state_needs_border_color(state)) border_color_type = V_008F3C_SQ_TEX_BORDER_COLOR_REGISTER; - } + else + border_color_type = V_008F3C_SQ_TEX_BORDER_COLOR_TRANS_BLACK; rstate-val[0] = (S_008F30_CLAMP_X(si_tex_wrap(state-wrap_s)) | S_008F30_CLAMP_Y(si_tex_wrap(state-wrap_t)) | @@ -2317,7 +2327,7 @@ static void *si_create_sampler_state(struct pipe_context *ctx, rstate-val[3] = S_008F3C_BORDER_COLOR_TYPE(border_color_type); if (border_color_type == V_008F3C_SQ_TEX_BORDER_COLOR_REGISTER) { - memcpy(rstate-border_color, state-border_color.f, + memcpy(rstate-border_color, state-border_color.ui, sizeof(rstate-border_color)); } @@ -2440,11 +2450,8 @@ static struct si_pm4_state *si_bind_sampler(struct r600_context *rctx, unsigned } for (j = 0; j 4; j++) { - union fi border_color; - - border_color.f = rstates[i]-border_color[j]; border_color_table[4 * rctx-border_color_offset + j] = - util_le32_to_cpu(border_color.i); + util_le32_to_cpu(rstates[i]-border_color[j]); } rstates[i]-val[3] = C_008F3C_BORDER_COLOR_PTR; -- 1.8.3.rc3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/7] radeonsi: Handle TGSI TXQ opcode
From: Michel Dänzer michel.daen...@amd.com

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/drivers/radeonsi/radeonsi_shader.c | 34 ++++++++++++++++++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c b/src/gallium/drivers/radeonsi/radeonsi_shader.c
index b82f885..572c665 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
@@ -889,8 +889,7 @@ static void tex_fetch_args(
 	if (opcode == TGSI_OPCODE_TXB)
 		address[count++] = coords[3];
 
-	if ((target == TGSI_TEXTURE_CUBE || target == TGSI_TEXTURE_SHADOWCUBE) &&
-	    opcode != TGSI_OPCODE_TXQ)
+	if (target == TGSI_TEXTURE_CUBE || target == TGSI_TEXTURE_SHADOWCUBE)
 		radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, coords);
 
 	/* Pack depth comparison value */
@@ -1017,6 +1016,30 @@ static void build_tex_intrinsic(const struct lp_build_tgsi_action * action,
 		LLVMReadNoneAttribute | LLVMNoUnwindAttribute);
 }
 
+static void txq_fetch_args(
+	struct lp_build_tgsi_context * bld_base,
+	struct lp_build_emit_data * emit_data)
+{
+	struct si_shader_context *si_shader_ctx = si_shader_context(bld_base);
+	const struct tgsi_full_instruction *inst = emit_data->inst;
+
+	/* Mip level */
+	emit_data->args[0] = lp_build_emit_fetch(bld_base, inst, 0, TGSI_CHAN_X);
+
+	/* Resource */
+	emit_data->args[1] = si_shader_ctx->resources[inst->Src[1].Register.Index];
+
+	/* Dimensions */
+	emit_data->args[2] = lp_build_const_int32(bld_base->base.gallivm,
+						  inst->Texture.Texture);
+
+	emit_data->arg_count = 3;
+
+	emit_data->dst_type = LLVMVectorType(
+		LLVMInt32TypeInContext(bld_base->base.gallivm->context),
+		4);
+}
+
 static const struct lp_build_tgsi_action tex_action = {
 	.fetch_args = tex_fetch_args,
 	.emit = build_tex_intrinsic,
@@ -1041,6 +1064,12 @@ static const struct lp_build_tgsi_action txl_action = {
 	.intr_name = "llvm.SI.samplel."
 };
 
+static const struct lp_build_tgsi_action txq_action = {
+	.fetch_args = txq_fetch_args,
+	.emit = build_tgsi_intrinsic_nomem,
+	.intr_name = "llvm.SI.resinfo"
+};
+
 static void create_meta_data(struct si_shader_context *si_shader_ctx)
 {
 	struct gallivm_state *gallivm = si_shader_ctx->radeon_bld.soa.bld_base.base.gallivm;
@@ -1282,6 +1311,7 @@ int si_pipe_shader_create(
 	bld_base->op_actions[TGSI_OPCODE_TXF] = txf_action;
 	bld_base->op_actions[TGSI_OPCODE_TXL] = txl_action;
 	bld_base->op_actions[TGSI_OPCODE_TXP] = tex_action;
+	bld_base->op_actions[TGSI_OPCODE_TXQ] = txq_action;
 
 	si_shader_ctx.radeon_bld.load_input = declare_input;
 	si_shader_ctx.radeon_bld.load_system_value = declare_system_value;
-- 
1.8.3.rc3
[Mesa-dev] [PATCH 6/7] radeonsi: Handle TGSI_SEMANTIC_CLIPDIST
From: Michel Dänzer michel.daen...@amd.com

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/drivers/radeonsi/radeonsi_shader.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c b/src/gallium/drivers/radeonsi/radeonsi_shader.c
index 572c665..f6fdfae 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
@@ -626,6 +626,7 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
 	struct tgsi_parse_context *parse = &si_shader_ctx->parse;
 	LLVMValueRef args[9];
 	LLVMValueRef last_args[9] = { 0 };
+	unsigned semantic_name;
 	unsigned color_count = 0;
 	unsigned param_count = 0;
 	int depth_index = -1, stencil_index = -1;
@@ -669,9 +670,11 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
 			continue;
 		}
 
+		semantic_name = d->Semantic.Name;
+handle_semantic:
 		for (index = d->Range.First; index <= d->Range.Last; index++) {
 			/* Select the correct target */
-			switch(d->Semantic.Name) {
+			switch(semantic_name) {
 			case TGSI_SEMANTIC_PSIZE:
 				shader->vs_out_misc_write = 1;
 				shader->vs_out_point_size = 1;
@@ -703,6 +706,11 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
 					color_count++;
 				}
 				break;
+			case TGSI_SEMANTIC_CLIPDIST:
+				shader->clip_dist_write |=
+					d->Declaration.UsageMask << (d->Semantic.Index << 2);
+				target = V_008DFC_SQ_EXP_POS + 2 + d->Semantic.Index;
+				break;
 			case TGSI_SEMANTIC_CLIPVERTEX:
 				si_llvm_emit_clipvertex(bld_base, index);
 				shader->clip_dist_write = 0xFF;
@@ -717,14 +725,14 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
 				target = 0;
 				fprintf(stderr,
 					"Warning: SI unhandled output type:%d\n",
-					d->Semantic.Name);
+					semantic_name);
 			}
 
 			si_llvm_init_export_args(bld_base, d, index, target, args);
 
 			if (si_shader_ctx->type == TGSI_PROCESSOR_VERTEX ?
-			    (d->Semantic.Name == TGSI_SEMANTIC_POSITION) :
-			    (d->Semantic.Name == TGSI_SEMANTIC_COLOR)) {
+			    (semantic_name == TGSI_SEMANTIC_POSITION) :
+			    (semantic_name == TGSI_SEMANTIC_COLOR)) {
 				if (last_args[0]) {
 					lp_build_intrinsic(base->gallivm->builder,
 							   "llvm.SI.export",
@@ -741,6 +749,11 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
 			}
 
 		}
+
+		if (semantic_name == TGSI_SEMANTIC_CLIPDIST) {
+			semantic_name = TGSI_SEMANTIC_GENERIC;
+			goto handle_semantic;
+		}
 	}
 
 	if (depth_index >= 0 || stencil_index >= 0) {
-- 
1.8.3.rc3
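The clip_dist_write update in this patch packs a per-declaration write mask into an 8-bit field, four bits per CLIPDIST vector. A minimal sketch of that arithmetic follows; `clip_dist_bits` is a hypothetical helper, and the limit of two CLIPDIST vectors is an assumption inferred from the 0xFF value used for CLIPVERTEX.

```c
#include <assert.h>

/* Hypothetical helper: shift a 4-bit component write mask into the slot
 * for the given CLIPDIST semantic index (index 0 -> bits 0-3, 1 -> bits 4-7). */
static unsigned clip_dist_bits(unsigned usage_mask, unsigned semantic_index)
{
	assert(usage_mask <= 0xf);
	assert(semantic_index < 2);
	return usage_mask << (semantic_index << 2);
}
```

A shader writing all four components of CLIPDIST[0] and only x/y of CLIPDIST[1] would accumulate `clip_dist_bits(0xf, 0) | clip_dist_bits(0x3, 1)`, which is 0x3f.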
[Mesa-dev] [PATCH 7/7] radeonsi: Enable GLSL 1.30
From: Michel Dänzer michel.daen...@amd.com

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/drivers/radeonsi/radeonsi_pipe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.c b/src/gallium/drivers/radeonsi/radeonsi_pipe.c
index b988e72..30254a8 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_pipe.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.c
@@ -364,7 +364,7 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
 		return 256;
 
 	case PIPE_CAP_GLSL_FEATURE_LEVEL:
-		return debug_get_bool_option("R600_GLSL130", FALSE) ? 130 : 120;
+		return 130;
 
 	/* Unsupported features. */
 	case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
-- 
1.8.3.rc3
Re: [Mesa-dev] [PATCH] i965: Mask the cut index based on the index buffer type in 3DSTATE_VF.
On 05/23/2013 04:14 PM, Jordan Justen wrote:
On Thu, May 23, 2013 at 3:46 PM, Kenneth Graunke kenn...@whitecape.org wrote:

According to the documentation: The Cut Index is compared to the fetched (and possibly-sign-extended) vertex index, and if these values are equal, the current primitive topology is terminated. Note that, for index buffers < 32bpp, it is possible to set the Cut Index to a (large) value that will never match a sign-extended vertex index.

This suggests that we should not set the value to 0xFFFFFFFF for unsigned byte or short index buffers, but rather 0xFF or 0xFFFF.

I was wondering what the GL spec had to say about this situation. For example, what should happen if the index is 0x100, and bytes are used. Should it effectively disable prim-restart? Should it use 0xff, or 0x00? Unfortunately, I didn't find anything concrete.

Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

You raise a good point. If I set the cut index to 0x31337 and DrawElements with GL_UNSIGNED_BYTE, should it reset when it sees 0x37 or not? I'll have to write Piglit tests and find out what other implementations do.

Thanks for the excellent review!
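The masking the patch title describes can be sketched as follows. This is an illustration of one possible policy (truncate the restart index to the index type's width), not the i965 code; `cut_index_for_type` and its `index_size` parameter (bytes per index) are hypothetical names. Whether truncation matches what GL requires is exactly the open question in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: truncate the application's restart index to the
 * width of the index type. index_size is in bytes and must be 1, 2 or 4. */
static uint32_t cut_index_for_type(uint32_t restart_index, unsigned index_size)
{
	switch (index_size) {
	case 1:
		return restart_index & 0xffu;
	case 2:
		return restart_index & 0xffffu;
	default:
		assert(index_size == 4);
		return restart_index;
	}
}
```

Under this policy a restart index of 0x31337 with byte indices becomes 0x37, so a fetched index of 0x37 would terminate the primitive.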
[Mesa-dev] radeonsi compute improvements
Hi,

These patches, along with the associated LLVM changes, improve compute support on radeonsi to the point where it can run a number of simple apps, including the bitcoin mining program bfgminer.

Patch #4 re-introduces the r600_upload_const_buffer() function that was removed in eb19163a4dd3d7bfeed63229820c926f99ed00d9. However, using this function from si_set_constant_buffer() causes a memory leak in X/Glamor which makes it impossible to complete a full piglit run. I'm not sure what the problem is, since it worked before the above mentioned commit, but I'm hoping someone can spot my mistake.

-Tom
[Mesa-dev] [PATCH 1/5] radeonsi/compute: Add missing PIPE_COMPUTE caps
From: Tom Stellard thomas.stell...@amd.com

---
 src/gallium/drivers/radeonsi/radeonsi_pipe.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.c b/src/gallium/drivers/radeonsi/radeonsi_pipe.c
index b988e72..7a79db3 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_pipe.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.c
@@ -579,6 +579,22 @@ static int r600_get_compute_param(struct pipe_screen *screen,
 		}
 		return sizeof(uint64_t);
 
+	case PIPE_COMPUTE_CAP_MAX_GLOBAL_SIZE:
+		if (ret) {
+			uint64_t *max_global_size = ret;
+			/* XXX: Not sure what to put here. */
+			*max_global_size = 2000000000;
+		}
+		return sizeof(uint64_t);
+
+	case PIPE_COMPUTE_CAP_MAX_MEM_ALLOC_SIZE:
+		if (ret) {
+			uint64_t max_global_size;
+			uint64_t *max_mem_alloc_size = ret;
+			r600_get_compute_param(screen, PIPE_COMPUTE_CAP_MAX_GLOBAL_SIZE, &max_global_size);
+			*max_mem_alloc_size = max_global_size / 4;
+		}
+		return sizeof(uint64_t);
 	default:
 		fprintf(stderr, "unknown PIPE_COMPUTE_CAP %d\n", param);
 		return 0;
-- 
1.8.1.5
[Mesa-dev] [PATCH 3/5] radeonsi/compute: Implement un-binding of global buffers
From: Tom Stellard thomas.stell...@amd.com --- src/gallium/drivers/radeonsi/radeonsi_compute.c | 31 +++-- 1 file changed, 19 insertions(+), 12 deletions(-) diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c b/src/gallium/drivers/radeonsi/radeonsi_compute.c index 1ae7d9b..3fb6eb1 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_compute.c +++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c @@ -5,6 +5,8 @@ #include radeon_llvm_util.h +#define MAX_GLOBAL_BUFFERS 20 + struct si_pipe_compute { struct r600_context *ctx; @@ -15,7 +17,7 @@ struct si_pipe_compute { struct si_pipe_shader *kernels; unsigned num_user_sgprs; -struct si_pm4_state *pm4_buffers; +struct pipe_resource *global_buffers[MAX_GLOBAL_BUFFERS]; }; @@ -65,22 +67,18 @@ static void radeonsi_set_global_binding( unsigned i; struct r600_context *rctx = (struct r600_context*)ctx; struct si_pipe_compute *program = rctx-cs_shader_state.program; - struct si_pm4_state *pm4; - - if (!program-pm4_buffers) { - program-pm4_buffers = CALLOC_STRUCT(si_pm4_state); - } - pm4 = program-pm4_buffers; - pm4-compute_pkt = true; if (!resources) { + for (i = first; i first + n; i++) { + program-global_buffers[i] = NULL; + } return; } for (i = first; i first + n; i++) { - uint64_t va = r600_resource_va(ctx-screen, resources[i]); - si_pm4_add_bo(pm4, (struct si_resource*)resources[i], - RADEON_USAGE_READWRITE); + uint64_t va; + program-global_buffers[i] = resources[i]; + va = r600_resource_va(ctx-screen, resources[i]); memcpy(handles[i], va, sizeof(va)); } } @@ -138,6 +136,16 @@ static void radeonsi_launch_grid( si_pm4_set_reg(pm4, R_00B824_COMPUTE_NUM_THREAD_Z, S_00B824_NUM_THREAD_FULL(block_layout[2])); + /* Global buffers */ + for (i = 0; i MAX_GLOBAL_BUFFERS; i++) { + struct si_resource *buffer = + (struct si_resource*)program-global_buffers[i]; + if (!buffer) { + continue; + } + si_pm4_add_bo(pm4, buffer, RADEON_USAGE_READWRITE); + } + /* XXX: This should be: * (number of compute units) * 4 * (waves per simd) - 1 
*/ si_pm4_set_reg(pm4, R_00B82C_COMPUTE_MAX_WAVE_ID, 0x190 /* Default value */); @@ -199,7 +207,6 @@ static void radeonsi_launch_grid( si_pm4_inval_shader_cache(pm4); si_cmd_surface_sync(pm4, pm4-cp_coher_cntl); - si_pm4_emit(rctx, program-pm4_buffers); si_pm4_emit(rctx, pm4); #if 0 -- 1.8.1.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/5] radeonsi/compute: Pass kernel arguments in a buffer
From: Tom Stellard thomas.stell...@amd.com --- src/gallium/drivers/radeonsi/r600_buffer.c | 31 + src/gallium/drivers/radeonsi/radeonsi_compute.c | 26 ++--- src/gallium/drivers/radeonsi/si_state.c | 29 +++ 3 files changed, 51 insertions(+), 35 deletions(-) diff --git a/src/gallium/drivers/radeonsi/r600_buffer.c b/src/gallium/drivers/radeonsi/r600_buffer.c index cdf9988..87763c3 100644 --- a/src/gallium/drivers/radeonsi/r600_buffer.c +++ b/src/gallium/drivers/radeonsi/r600_buffer.c @@ -25,6 +25,8 @@ * Corbin Simpson mostawesomed...@gmail.com */ +#include byteswap.h + #include pipe/p_screen.h #include util/u_format.h #include util/u_math.h @@ -168,3 +170,32 @@ void r600_upload_index_buffer(struct r600_context *rctx, u_upload_data(rctx-uploader, 0, count * ib-index_size, ib-user_buffer, ib-offset, ib-buffer); } + +void r600_upload_const_buffer(struct r600_context *rctx, struct si_resource **rbuffer, + const uint8_t *ptr, unsigned size, + uint32_t *const_offset) +{ + *rbuffer = NULL; + + if (R600_BIG_ENDIAN) { + uint32_t *tmpPtr; + unsigned i; + + if (!(tmpPtr = malloc(size))) { + R600_ERR(Failed to allocate BE swap buffer.\n); + return; + } + + for (i = 0; i size / 4; ++i) { + tmpPtr[i] = bswap_32(((uint32_t *)ptr)[i]); + } + + u_upload_data(rctx-uploader, 0, size, tmpPtr, const_offset, + (struct pipe_resource**)rbuffer); + + free(tmpPtr); + } else { + u_upload_data(rctx-uploader, 0, size, ptr, const_offset, + (struct pipe_resource**)rbuffer); + } +} diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c b/src/gallium/drivers/radeonsi/radeonsi_compute.c index 3fb6eb1..035076d 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_compute.c +++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c @@ -91,8 +91,11 @@ static void radeonsi_launch_grid( struct r600_context *rctx = (struct r600_context*)ctx; struct si_pipe_compute *program = rctx-cs_shader_state.program; struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state); + struct si_resource *input_buffer; + uint32_t 
input_offset = 0; + uint64_t input_va; uint64_t shader_va; - unsigned arg_user_sgpr_count; + unsigned arg_user_sgpr_count = 2; unsigned i; struct si_pipe_shader *shader = program-kernels[pc]; @@ -109,21 +112,16 @@ static void radeonsi_launch_grid( si_pm4_inval_shader_cache(pm4); si_cmd_surface_sync(pm4, pm4-cp_coher_cntl); - arg_user_sgpr_count = program-input_size / 4; - if (program-input_size % 4 != 0) { - arg_user_sgpr_count++; - } + /* Upload the input data */ + r600_upload_const_buffer(rctx, input_buffer, input, + program-input_size, input_offset); + input_va = r600_resource_va(ctx-screen, (struct pipe_resource*)input_buffer); + input_va += input_offset; - /* XXX: We should store arguments in memory if we run out of user sgprs. -*/ - assert(arg_user_sgpr_count 16); + si_pm4_add_bo(pm4, input_buffer, RADEON_USAGE_READ); - for (i = 0; i arg_user_sgpr_count; i++) { - uint32_t *args = (uint32_t*)input; - si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 + - (i * 4), - args[i]); - } + si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0, input_va); + si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 + 4, S_008F04_BASE_ADDRESS_HI (input_va 32) | S_008F04_STRIDE(0)); si_pm4_set_reg(pm4, R_00B810_COMPUTE_START_X, 0); si_pm4_set_reg(pm4, R_00B814_COMPUTE_START_Y, 0); diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index dec535c..1e94f7e 100644 --- a/src/gallium/drivers/radeonsi/si_state.c +++ b/src/gallium/drivers/radeonsi/si_state.c @@ -24,8 +24,6 @@ * Christian König christian.koe...@amd.com */ -#include byteswap.h - #include util/u_memory.h #include util/u_framebuffer.h #include util/u_blitter.h @@ -2526,25 +2524,14 @@ static void si_set_constant_buffer(struct pipe_context *ctx, uint shader, uint i ptr = input-user_buffer; if (ptr) { - /* Upload the user buffer. 
*/ - if (R600_BIG_ENDIAN) { - uint32_t *tmpPtr; - unsigned i, size = input-buffer_size; - - if (!(tmpPtr = malloc(size))) { - R600_ERR(Failed to allocate BE swap buffer.\n); -
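The big-endian branch of r600_upload_const_buffer() byte-swaps each 32-bit word into a temporary buffer before uploading. A portable sketch of just that swap step is below; `swap32` is a hand-written stand-in for glibc's bswap_32() from byteswap.h, `copy_swapped` is a hypothetical helper, and the upload call itself is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Portable stand-in for bswap_32(): reverse the four bytes of a word. */
static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical helper: return a malloc'ed copy of src with every 32-bit
 * word byte-swapped, or NULL on allocation failure. The caller frees it.
 * size is in bytes and must be a multiple of 4 in this sketch. */
static uint32_t *copy_swapped(const uint8_t *src, size_t size)
{
	uint32_t *dst;
	size_t i;

	assert(size % 4 == 0);
	dst = malloc(size ? size : 1);
	if (!dst)
		return NULL;

	for (i = 0; i < size / 4; i++) {
		uint32_t w;

		/* memcpy avoids alignment and aliasing issues on the byte pointer. */
		memcpy(&w, src + 4 * i, sizeof(w));
		dst[i] = swap32(w);
	}
	return dst;
}
```

As in the patch, the temporary buffer only needs to live until the data has been handed to the uploader, after which it is freed.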
[Mesa-dev] [PATCH 2/5] radeonsi/compute: Support multiple kernels in a compute program
From: Tom Stellard <thomas.stell...@amd.com>

---
 src/gallium/drivers/radeonsi/radeonsi_compute.c | 27
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c b/src/gallium/drivers/radeonsi/radeonsi_compute.c
index e67d127..1ae7d9b 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_compute.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c
@@ -11,7 +11,8 @@ struct si_pipe_compute {
 	unsigned local_size;
 	unsigned private_size;
 	unsigned input_size;
-	struct si_pipe_shader shader;
+	unsigned num_kernels;
+	struct si_pipe_shader *kernels;
 	unsigned num_user_sgprs;
 
 	struct si_pm4_state *pm4_buffers;
@@ -27,7 +28,7 @@ static void *radeonsi_create_compute_state(
 					CALLOC_STRUCT(si_pipe_compute);
 	const struct pipe_llvm_program_header *header;
 	const unsigned char *code;
-	LLVMModuleRef mod;
+	unsigned i;
 
 	header = cso->prog;
 	code = cso->prog + sizeof(struct pipe_llvm_program_header);
@@ -37,8 +38,15 @@ static void *radeonsi_create_compute_state(
 	program->private_size = cso->req_private_mem;
 	program->input_size = cso->req_input_mem;
 
-	mod = radeon_llvm_parse_bitcode(code, header->num_bytes);
-	si_compile_llvm(rctx, &program->shader, mod);
+	program->num_kernels = radeon_llvm_get_num_kernels(code,
+							header->num_bytes);
+	program->kernels = CALLOC(sizeof(struct si_pipe_shader),
+							program->num_kernels);
+	for (i = 0; i < program->num_kernels; i++) {
+		LLVMModuleRef mod = radeon_llvm_get_kernel_module(i, code,
+							header->num_bytes);
+		si_compile_llvm(rctx, &program->kernels[i], mod);
+	}
 
 	return program;
 }
@@ -88,6 +96,7 @@ static void radeonsi_launch_grid(
 	uint64_t shader_va;
 	unsigned arg_user_sgpr_count;
 	unsigned i;
+	struct si_pipe_shader *shader = &program->kernels[pc];
 
 	pm4->compute_pkt = true;
 	si_cmd_context_control(pm4);
@@ -133,8 +142,8 @@ static void radeonsi_launch_grid(
 	 * (number of compute units) * 4 * (waves per simd) - 1 */
 	si_pm4_set_reg(pm4, R_00B82C_COMPUTE_MAX_WAVE_ID, 0x190 /* Default value */);
 
-	shader_va = r600_resource_va(ctx->screen, (void *)program->shader.bo);
-	si_pm4_add_bo(pm4, program->shader.bo, RADEON_USAGE_READ);
+	shader_va = r600_resource_va(ctx->screen, (void *)shader->bo);
+	si_pm4_add_bo(pm4, shader->bo, RADEON_USAGE_READ);
 	si_pm4_set_reg(pm4, R_00B830_COMPUTE_PGM_LO, (shader_va >> 8) & 0xffffffff);
 	si_pm4_set_reg(pm4, R_00B834_COMPUTE_PGM_HI, shader_va >> 40);
 
@@ -143,13 +152,13 @@ static void radeonsi_launch_grid(
 		 * TIDIG_COMP_CNT.
 		 * XXX: The compiler should account for this.
 		 */
-		S_00B848_VGPRS((MAX2(3, program->shader.num_vgprs) - 1) / 4)
+		S_00B848_VGPRS((MAX2(3, shader->num_vgprs) - 1) / 4)
 		/* We always use at least 4 + arg_user_sgpr_count.  The 4 extra
 		 * sgprs are from TGID_X_EN, TGID_Y_EN, TGID_Z_EN, TG_SIZE_EN
 		 * XXX: The compiler should account for this.
 		 */
 		| S_00B848_SGPRS(((MAX2(4 + arg_user_sgpr_count,
-		                        program->shader.num_sgprs)) - 1) / 8))
+		                        shader->num_sgprs)) - 1) / 8))
 		;
 
 	si_pm4_set_reg(pm4, R_00B84C_COMPUTE_PGM_RSRC2,
@@ -201,7 +210,7 @@ static void radeonsi_launch_grid(
 #endif
 
 	rctx->ws->cs_flush(rctx->cs, RADEON_FLUSH_COMPUTE, 0);
-	rctx->ws->buffer_wait(program->shader.bo->buf, 0);
+	rctx->ws->buffer_wait(shader->bo->buf, 0);
 
 	FREE(pm4);
 }
-- 
1.8.1.5

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/5] radeonsi/compute: Upload work group, work item size in input buffer
From: Tom Stellard <thomas.stell...@amd.com>

---
 src/gallium/drivers/radeonsi/radeonsi_compute.c | 38
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c b/src/gallium/drivers/radeonsi/radeonsi_compute.c
index 035076d..3abf50b 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_compute.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c
@@ -91,9 +91,12 @@ static void radeonsi_launch_grid(
 	struct r600_context *rctx = (struct r600_context*)ctx;
 	struct si_pipe_compute *program = rctx->cs_shader_state.program;
 	struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
-	struct si_resource *input_buffer;
-	uint32_t input_offset = 0;
-	uint64_t input_va;
+	struct si_resource *kernel_args_buffer;
+	unsigned kernel_args_size;
+	unsigned num_work_size_bytes = 36;
+	uint32_t kernel_args_offset = 0;
+	uint32_t *kernel_args;
+	uint64_t kernel_args_va;
 	uint64_t shader_va;
 	unsigned arg_user_sgpr_count = 2;
 	unsigned i;
@@ -112,16 +115,29 @@ static void radeonsi_launch_grid(
 	si_pm4_inval_shader_cache(pm4);
 	si_cmd_surface_sync(pm4, pm4->cp_coher_cntl);
 
-	/* Upload the input data */
-	r600_upload_const_buffer(rctx, &input_buffer, input,
-					program->input_size, &input_offset);
-	input_va = r600_resource_va(ctx->screen, (struct pipe_resource*)input_buffer);
-	input_va += input_offset;
+	/* Upload the kernel arguments */
 
-	si_pm4_add_bo(pm4, input_buffer, RADEON_USAGE_READ);
+	/* The extra num_work_size_bytes are for work group / work item size information */
+	kernel_args_size = program->input_size + num_work_size_bytes;
+	kernel_args = MALLOC(kernel_args_size);
+	for (i = 0; i < 3; i++) {
+		kernel_args[i] = grid_layout[i];
+		kernel_args[i + 3] = grid_layout[i] * block_layout[i];
+		kernel_args[i + 6] = block_layout[i];
+	}
+
+	memcpy(kernel_args + (num_work_size_bytes / 4), input, program->input_size);
+
+	r600_upload_const_buffer(rctx, &kernel_args_buffer, kernel_args,
+					kernel_args_size, &kernel_args_offset);
+	kernel_args_va = r600_resource_va(ctx->screen,
+				(struct pipe_resource*)kernel_args_buffer);
+	kernel_args_va += kernel_args_offset;
+
+	si_pm4_add_bo(pm4, kernel_args_buffer, RADEON_USAGE_READ);
 
-	si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0, input_va);
-	si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 + 4, S_008F04_BASE_ADDRESS_HI (input_va >> 32) | S_008F04_STRIDE(0));
+	si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0, kernel_args_va);
+	si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 + 4, S_008F04_BASE_ADDRESS_HI (kernel_args_va >> 32) | S_008F04_STRIDE(0));
 
 	si_pm4_set_reg(pm4, R_00B810_COMPUTE_START_X, 0);
 	si_pm4_set_reg(pm4, R_00B814_COMPUTE_START_Y, 0);
-- 
1.8.1.5
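For readers following the layout that this patch uploads, here is a minimal standalone sketch of the buffer: nine 32-bit words of work size information (grid size, global size, block size), followed by the user's kernel arguments. The function name build_kernel_args and the WORK_SIZE_DWORDS constant are made up for illustration and are not part of the driver; only the ordering of the nine words is taken from the patch.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical constant: 3 triples of uint32_t = 9 dwords = 36 bytes,
 * matching num_work_size_bytes in the patch. */
#define WORK_SIZE_DWORDS 9

/* Hypothetical helper (not a driver function): builds the argument
 * buffer in the order the patch uses:
 *   [0..2] grid size (number of work groups per dimension)
 *   [3..5] global size (grid * block, total work items per dimension)
 *   [6..8] block size (work items per work group)
 *   [9.. ] the user-supplied kernel arguments
 * Returns a malloc'd buffer the caller must free, or NULL on failure. */
uint32_t *build_kernel_args(const uint32_t grid[3], const uint32_t block[3],
                            const void *input, size_t input_size,
                            size_t *out_size)
{
	size_t header_bytes = WORK_SIZE_DWORDS * sizeof(uint32_t);
	uint32_t *args = malloc(header_bytes + input_size);
	unsigned i;

	if (!args)
		return NULL;

	for (i = 0; i < 3; i++) {
		args[i] = grid[i];
		args[i + 3] = grid[i] * block[i];
		args[i + 6] = block[i];
	}

	if (input_size)
		memcpy(args + WORK_SIZE_DWORDS, input, input_size);

	*out_size = header_bytes + input_size;
	return args;
}
```

Unlike the MALLOC call in the patch, the sketch checks the allocation result; whether the driver wants that check is a separate question for review.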
Re: [Mesa-dev] [PATCH v2 1/4] mesa: Implement ext_framebuffer_multisample_blit_scaled extension
On 16 May 2013 11:44, Anuj Phogat <anuj.pho...@gmail.com> wrote:

> Signed-off-by: Anuj Phogat <anuj.pho...@gmail.com>
> Reviewed-by: Paul Berry <stereotype...@gmail.com>
> Reviewed-by: Brian Paul <bri...@vmware.com>
> ---
>  src/mesa/main/extensions.c |  1 +
>  src/mesa/main/fbobject.c   | 30
>  src/mesa/main/mtypes.h     |  1 +
>  3 files changed, 29 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
> index db5a5ed..39aaad4 100644
> --- a/src/mesa/main/extensions.c
> +++ b/src/mesa/main/extensions.c
> @@ -184,6 +184,7 @@ static const struct extension extension_table[] = {
>     { "GL_EXT_fog_coord",                           o(EXT_fog_coord),                           GLL, 1999 },
>     { "GL_EXT_framebuffer_blit",                    o(EXT_framebuffer_blit),                    GL,  2005 },
>     { "GL_EXT_framebuffer_multisample",             o(EXT_framebuffer_multisample),             GL,  2005 },
> +   { "GL_EXT_framebuffer_multisample_blit_scaled", o(EXT_framebuffer_multisample_blit_scaled), GL,  2011 },
>     { "GL_EXT_framebuffer_object",                  o(EXT_framebuffer_object),                  GL,  2000 },
>     { "GL_EXT_framebuffer_sRGB",                    o(EXT_framebuffer_sRGB),                    GL,  1998 },
>     { "GL_EXT_gpu_program_parameters",              o(EXT_gpu_program_parameters),              GLL, 2006 },
> diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
> index 80485f7..e7300f6 100644
> --- a/src/mesa/main/fbobject.c
> +++ b/src/mesa/main/fbobject.c
> @@ -2974,6 +2974,20 @@ compatible_resolve_formats(const struct gl_renderbuffer *readRb,
>     return GL_FALSE;
>  }
>
> +static GLboolean
> +is_valid_blit_filter(const struct gl_context *ctx, GLenum filter)
> +{
> +   switch (filter) {
> +   case GL_NEAREST:
> +   case GL_LINEAR:
> +      return true;
> +   case GL_SCALED_RESOLVE_FASTEST_EXT:
> +   case GL_SCALED_RESOLVE_NICEST_EXT:
> +      return ctx->Extensions.EXT_framebuffer_multisample_blit_scaled;
> +   default:
> +      return false;
> +   }
> +}
>
>  /**
>   * Blit rectangular region, optionally from one framebuffer to another.
> @@ -3023,8 +3037,17 @@ _mesa_BlitFramebuffer(GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1,
>        return;
>     }
>
> -   if (filter != GL_NEAREST && filter != GL_LINEAR) {
> -      _mesa_error(ctx, GL_INVALID_ENUM, "glBlitFramebufferEXT(filter)");
> +   if (!is_valid_blit_filter(ctx, filter)) {
> +      _mesa_error(ctx, GL_INVALID_ENUM, "glBlitFramebufferEXT(%s)",
> +                  _mesa_lookup_enum_by_nr(filter));
> +      return;
> +   }
> +
> +   if ((filter == GL_SCALED_RESOLVE_FASTEST_EXT ||
> +        filter == GL_SCALED_RESOLVE_NICEST_EXT) &&
> +        (readFb->Visual.samples == 0 || drawFb->Visual.samples > 0)) {
> +      _mesa_error(ctx, GL_INVALID_OPERATION, "glBlitFramebufferEXT(%s)",
> +                  _mesa_lookup_enum_by_nr(filter));
>        return;
>     }
>
> @@ -3257,7 +3280,8 @@ _mesa_BlitFramebuffer(GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1,
>     }
>
>     /* extra checks for multisample copies... */
> -   if (readFb->Visual.samples > 0 || drawFb->Visual.samples > 0) {
> +   if ((readFb->Visual.samples > 0 || drawFb->Visual.samples > 0) &&
> +       (filter == GL_NEAREST || filter == GL_LINEAR)) {
>        /* src and dest region sizes must be the same */
>        if (abs(srcX1 - srcX0) != abs(dstX1 - dstX0) ||
>            abs(srcY1 - srcY0) != abs(dstY1 - dstY0)) {

Later in this function, the following error check appears:

      if (filter == GL_LINEAR) {
         /* 3.1 spec, page 199:
          * "Calling BlitFramebuffer will result in an INVALID_OPERATION error
          * if filter is LINEAR and read buffer contains integer data."
          */
         GLenum type = _mesa_get_format_datatype(colorReadRb->Format);
         if (type == GL_INT || type == GL_UNSIGNED_INT) {
            _mesa_error(ctx, GL_INVALID_OPERATION,
                        "glBlitFramebufferEXT(integer color type)");
            return;
         }
      }

This needs to be changed to "if (filter != GL_NEAREST)" in accordance with the following text from the extension:

    Calling BlitFramebuffer will result in an INVALID_OPERATION error if
    filter is not NEAREST and read buffer contains integer data.

With that fixed, this patch is:

Reviewed-by: Paul Berry <stereotype...@gmail.com>

> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index b68853b..8af6dc6 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -3023,6 +3023,7 @@ struct gl_extensions
>     GLboolean EXT_fog_coord;
>     GLboolean EXT_framebuffer_blit;
>     GLboolean EXT_framebuffer_multisample;
> +   GLboolean EXT_framebuffer_multisample_blit_scaled;
>     GLboolean EXT_framebuffer_object;
>     GLboolean EXT_framebuffer_sRGB;
>     GLboolean EXT_gpu_program_parameters;
> --
> 1.8.1.4
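To make the combined checks easier to reason about, here is a rough standalone sketch of the filter validation as it would look with the requested "filter != GL_NEAREST" change folded in. It does not use Mesa's types: check_blit_filter, the blit_check enum and the SK_-prefixed constants are invented for this example, and the enum values are copied in by hand so the snippet builds without GL headers.

```c
#include <stdbool.h>

/* Local copies of the GL enum values, prefixed to avoid clashing with
 * real GL headers. */
#define SK_GL_NEAREST                    0x2600
#define SK_GL_LINEAR                     0x2601
#define SK_GL_SCALED_RESOLVE_FASTEST_EXT 0x90BA
#define SK_GL_SCALED_RESOLVE_NICEST_EXT  0x90BB

/* Hypothetical result type standing in for the GL error codes. */
enum blit_check {
	BLIT_OK,
	BLIT_INVALID_ENUM,
	BLIT_INVALID_OPERATION
};

/* Hypothetical helper: applies the checks in the order discussed above. */
enum blit_check check_blit_filter(unsigned filter, bool has_scaled_ext,
                                  int read_samples, int draw_samples,
                                  bool read_is_integer)
{
	bool scaled = filter == SK_GL_SCALED_RESOLVE_FASTEST_EXT ||
	              filter == SK_GL_SCALED_RESOLVE_NICEST_EXT;

	/* Unknown filter, or a scaled-resolve filter without the extension. */
	if (filter != SK_GL_NEAREST && filter != SK_GL_LINEAR &&
	    !(scaled && has_scaled_ext))
		return BLIT_INVALID_ENUM;

	/* Scaled resolve needs a multisampled source and a
	 * single-sampled destination. */
	if (scaled && (read_samples == 0 || draw_samples > 0))
		return BLIT_INVALID_OPERATION;

	/* Review comment: any filter other than NEAREST is an error
	 * when the read buffer holds integer data. */
	if (filter != SK_GL_NEAREST && read_is_integer)
		return BLIT_INVALID_OPERATION;

	return BLIT_OK;
}
```

With the original "filter == GL_LINEAR" test, the last check would let a scaled resolve of an integer buffer through, which is the case the review comment is about.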
Re: [Mesa-dev] [PATCH 4/5] radeonsi/compute: Pass kernel arguments in a buffer
The only difference I could see is that in the old code you passed cb->buffer (which maybe points to a value?) directly into u_upload_data(), whereas in the new code you pass cb->buffer as the parameter rbuffer to r600_upload_const_buffer(), and inside that function you do *rbuffer = NULL before you start. That effectively erases any previous pointer, so if *rbuffer was examined by u_upload_data(), it may be different. I don't know if that matters, though.

Patrick

On Fri, May 24, 2013 at 1:07 PM, Tom Stellard <t...@stellard.net> wrote:

> From: Tom Stellard <thomas.stell...@amd.com>
>
> ---
>  src/gallium/drivers/radeonsi/r600_buffer.c      | 31
>  src/gallium/drivers/radeonsi/radeonsi_compute.c | 26
>  src/gallium/drivers/radeonsi/si_state.c         | 29
>  3 files changed, 51 insertions(+), 35 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/r600_buffer.c b/src/gallium/drivers/radeonsi/r600_buffer.c
> index cdf9988..87763c3 100644
> --- a/src/gallium/drivers/radeonsi/r600_buffer.c
> +++ b/src/gallium/drivers/radeonsi/r600_buffer.c
> @@ -25,6 +25,8 @@
>   *      Corbin Simpson <mostawesomed...@gmail.com>
>   */
>
> +#include <byteswap.h>
> +
>  #include "pipe/p_screen.h"
>  #include "util/u_format.h"
>  #include "util/u_math.h"
> @@ -168,3 +170,32 @@ void r600_upload_index_buffer(struct r600_context *rctx,
>  	u_upload_data(rctx->uploader, 0, count * ib->index_size,
>  		      ib->user_buffer, &ib->offset, &ib->buffer);
>  }
> +
> +void r600_upload_const_buffer(struct r600_context *rctx, struct si_resource **rbuffer,
> +			const uint8_t *ptr, unsigned size,
> +			uint32_t *const_offset)
> +{
> +	*rbuffer = NULL;
> +
> +	if (R600_BIG_ENDIAN) {
> +		uint32_t *tmpPtr;
> +		unsigned i;
> +
> +		if (!(tmpPtr = malloc(size))) {
> +			R600_ERR("Failed to allocate BE swap buffer.\n");
> +			return;
> +		}
> +
> +		for (i = 0; i < size / 4; ++i) {
> +			tmpPtr[i] = bswap_32(((uint32_t *)ptr)[i]);
> +		}
> +
> +		u_upload_data(rctx->uploader, 0, size, tmpPtr, const_offset,
> +				(struct pipe_resource**)rbuffer);
> +
> +		free(tmpPtr);
> +	} else {
> +		u_upload_data(rctx->uploader, 0, size, ptr, const_offset,
> +					(struct pipe_resource**)rbuffer);
> +	}
> +}
> diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c b/src/gallium/drivers/radeonsi/radeonsi_compute.c
> index 3fb6eb1..035076d 100644
> --- a/src/gallium/drivers/radeonsi/radeonsi_compute.c
> +++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c
> @@ -91,8 +91,11 @@ static void radeonsi_launch_grid(
>  	struct r600_context *rctx = (struct r600_context*)ctx;
>  	struct si_pipe_compute *program = rctx->cs_shader_state.program;
>  	struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
> +	struct si_resource *input_buffer;
> +	uint32_t input_offset = 0;
> +	uint64_t input_va;
>  	uint64_t shader_va;
> -	unsigned arg_user_sgpr_count;
> +	unsigned arg_user_sgpr_count = 2;
>  	unsigned i;
>  	struct si_pipe_shader *shader = &program->kernels[pc];
>
> @@ -109,21 +112,16 @@ static void radeonsi_launch_grid(
>  	si_pm4_inval_shader_cache(pm4);
>  	si_cmd_surface_sync(pm4, pm4->cp_coher_cntl);
>
> -	arg_user_sgpr_count = program->input_size / 4;
> -	if (program->input_size % 4 != 0) {
> -		arg_user_sgpr_count++;
> -	}
> +	/* Upload the input data */
> +	r600_upload_const_buffer(rctx, &input_buffer, input,
> +					program->input_size, &input_offset);
> +	input_va = r600_resource_va(ctx->screen, (struct pipe_resource*)input_buffer);
> +	input_va += input_offset;
>
> -	/* XXX: We should store arguments in memory if we run out of user sgprs.
> -	 */
> -	assert(arg_user_sgpr_count < 16);
> +	si_pm4_add_bo(pm4, input_buffer, RADEON_USAGE_READ);
>
> -	for (i = 0; i < arg_user_sgpr_count; i++) {
> -		uint32_t *args = (uint32_t*)input;
> -		si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 +
> -					(i * 4),
> -					args[i]);
> -	}
> +	si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0, input_va);
> +	si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 + 4, S_008F04_BASE_ADDRESS_HI (input_va >> 32) | S_008F04_STRIDE(0));
>
>  	si_pm4_set_reg(pm4, R_00B810_COMPUTE_START_X, 0);
>  	si_pm4_set_reg(pm4, R_00B814_COMPUTE_START_Y, 0);
> diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
> index dec535c..1e94f7e 100644
> --- a/src/gallium/drivers/radeonsi/si_state.c
> +++ b/src/gallium/drivers/radeonsi/si_state.c
> @@ -24,8 +24,6 @@
>   *
Re: [Mesa-dev] [PATCH V2 2/4] intel: Change the register type from UW to UD in blorp engine
On 16 May 2013 11:44, Anuj Phogat <anuj.pho...@gmail.com> wrote:

> These changes are required to implement scaled blitting in blorp in my
> next patch. No regressions observed in piglit quick-driver.tests with
> this patch.
>
> Signed-off-by: Anuj Phogat <anuj.pho...@gmail.com>
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.h        |  15
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 120
>  src/mesa/drivers/dri/i965/brw_reg.h          |   7
>  3 files changed, 90 insertions(+), 52 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h b/src/mesa/drivers/dri/i965/brw_blorp.h
> index 8915080..70e3933 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.h
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.h
> @@ -161,22 +161,19 @@ struct brw_blorp_coord_transform_params
>     void setup(GLuint src0, GLuint dst0, GLuint dst1,
>                bool mirror);
>
> -   int16_t multiplier;
> -   int16_t offset;
> +   int32_t multiplier;
> +   int32_t offset;
>  };
>
>
>  struct brw_blorp_wm_push_constants
>  {
> -   uint16_t dst_x0;
> -   uint16_t dst_x1;
> -   uint16_t dst_y0;
> -   uint16_t dst_y1;
> +   uint32_t dst_x0;
> +   uint32_t dst_x1;
> +   uint32_t dst_y0;
> +   uint32_t dst_y1;
>     brw_blorp_coord_transform_params x_transform;
>     brw_blorp_coord_transform_params y_transform;
> -
> -   /* Pad out to an integral number of registers */
> -   uint16_t pad[8];
>  };
>
>  /* Every 32 bytes of push constant data constitutes one GEN register. */
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> index c3ef054..b7ee92b 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> @@ -590,13 +590,12 @@ private:
>     void encode_msaa(unsigned num_samples, intel_msaa_layout layout);
>     void decode_msaa(unsigned num_samples, intel_msaa_layout layout);
>     void kill_if_outside_dst_rect();
> -   void translate_dst_to_src();
> +   void translate_dst_to_src(unsigned intel_gen);
>     void single_to_blend();
>     void manual_blend(unsigned num_samples);
>     void sample(struct brw_reg dst);
>     void texel_fetch(struct brw_reg dst);
>     void mcs_fetch();
> -   void expand_to_32_bits(struct brw_reg src, struct brw_reg dst);
>     void texture_lookup(struct brw_reg dst, GLuint msg_type,
>                         const sampler_message_arg *args, int num_args);
>     void render_target_write();
> @@ -773,7 +772,7 @@ brw_blorp_blit_program::compile(struct brw_context *brw,
>     kill_if_outside_dst_rect();
>
>     /* Next, apply a translation to obtain coordinates in the source image. */
> -   translate_dst_to_src();
> +   translate_dst_to_src(brw->intel.gen);
>
>     /* If the source image is not multisampled, then we want to fetch sample
>      * number 0, because that's the only sample there is.
> @@ -845,7 +844,7 @@ brw_blorp_blit_program::alloc_push_const_regs(int base_reg)
>  #define CONST_LOC(name) offsetof(brw_blorp_wm_push_constants, name)
>  #define ALLOC_REG(name) \
>     this->name = \
> -      brw_uw1_reg(BRW_GENERAL_REGISTER_FILE, base_reg, CONST_LOC(name) / 2)
> +      brw_ud1_reg(BRW_GENERAL_REGISTER_FILE, base_reg, CONST_LOC(name) / 4)
>
>     ALLOC_REG(dst_x0);
>     ALLOC_REG(dst_x1);
> @@ -875,17 +874,23 @@ brw_blorp_blit_program::alloc_regs()
>     }
>     this->mcs_data = retype(brw_vec8_grf(reg, 0), BRW_REGISTER_TYPE_UD);
>     reg += 8;
> +
>     for (int i = 0; i < 2; ++i) {
>        this->x_coords[i]
> -         = vec16(retype(brw_vec8_grf(reg++, 0), BRW_REGISTER_TYPE_UW));
> +         = vec8(retype(brw_vec8_grf(reg, 0), BRW_REGISTER_TYPE_UD));

It should be sufficient to say

   this->x_coords[i] = retype(brw_vec8_grf(reg, 0), BRW_REGISTER_TYPE_UD),

since the register returned by brw_vec8_grf() is already a vec8. This applies to y_coords[i], sample_index, t1, and t2 below.

Regardless of whether you decide to change that, this patch is:

Reviewed-by: Paul Berry <stereotype...@gmail.com>

Nice work, BTW. Some day soon I want to port blorp over to share more code with the FS back-end (so that it's easier to port to future chipsets). Your work here paves the way for that nicely.

> +      reg += 2;
>        this->y_coords[i]
> -         = vec16(retype(brw_vec8_grf(reg++, 0), BRW_REGISTER_TYPE_UW));
> +         = vec8(retype(brw_vec8_grf(reg, 0), BRW_REGISTER_TYPE_UD));
> +      reg += 2;
>     }
>     this->xy_coord_index = 0;
>     this->sample_index
> -      = vec16(retype(brw_vec8_grf(reg++, 0), BRW_REGISTER_TYPE_UW));
> -   this->t1 = vec16(retype(brw_vec8_grf(reg++, 0), BRW_REGISTER_TYPE_UW));
> -   this->t2 = vec16(retype(brw_vec8_grf(reg++, 0), BRW_REGISTER_TYPE_UW));
> +      = vec8(retype(brw_vec8_grf(reg, 0), BRW_REGISTER_TYPE_UD));
> +   reg += 2;
> +   this->t1 = vec8(retype(brw_vec8_grf(reg, 0), BRW_REGISTER_TYPE_UD));
> +   reg += 2;
> +   this->t2 = vec8(retype(brw_vec8_grf(reg, 0), BRW_REGISTER_TYPE_UD));
> +   reg += 2;
>
>     /* Make sure we didn't run out of registers */
>     assert(reg <= GEN7_MRF_HACK_START);
> @@ -942,7
Re: [Mesa-dev] [PATCH V2 3/4] intel: Add multisample scaled blitting in blorp engine
On 16 May 2013 11:44, Anuj Phogat <anuj.pho...@gmail.com> wrote:

> In traditional multisampled framebuffer rendering, color samples must be
> explicitly resolved via BlitFramebuffer before doing the scaled blitting
> of the framebuffer. So, scaled blitting of a multisample framebuffer
> takes two separate calls to BlitFramebuffer.
>
> This patch implements the functionality of doing multisampled scaled
> resolve using just one BlitFramebuffer call. Important changes involved
> in this patch are listed below:
>  - Use float registers to scale and offset texture coordinates.
>  - Change offset computation to consider float coordinates.
>  - Round the scaled coordinates down to nearest integer.
>  - Modify src texture coordinates clipping to account for scaling.
>  - Linear filter is not yet implemented in blorp. So, don't use
>    blorp engine to do single sampled scaled blitting.
>
> Note: Observed no piglit regressions on sandybridge & ivybridge with
> these changes.
>
> Signed-off-by: Anuj Phogat <anuj.pho...@gmail.com>
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.h          |  23
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp   | 143
>  src/mesa/drivers/dri/i965/brw_reg.h            |   7
>  src/mesa/drivers/dri/intel/intel_mipmap_tree.c |   2
>  4 files changed, 102 insertions(+), 73 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h b/src/mesa/drivers/dri/i965/brw_blorp.h
> index 70e3933..a40324b 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.h
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.h
> @@ -40,9 +40,10 @@ brw_blorp_blit_miptrees(struct intel_context *intel,
>                          unsigned src_level, unsigned src_layer,
>                          struct intel_mipmap_tree *dst_mt,
>                          unsigned dst_level, unsigned dst_layer,
> -                        int src_x0, int src_y0,
> -                        int dst_x0, int dst_y0,
> -                        int dst_x1, int dst_y1,
> +                        float src_x0, float src_y0,
> +                        float src_x1, float src_y1,
> +                        float dst_x0, float dst_y0,
> +                        float dst_x1, float dst_y1,
>                          bool mirror_x, bool mirror_y);
>
>  bool
> @@ -158,11 +159,11 @@ public:
>
>  struct brw_blorp_coord_transform_params
>  {
> -   void setup(GLuint src0, GLuint dst0, GLuint dst1,
> +   void setup(GLfloat src0, GLfloat src1, GLfloat dst0, GLfloat dst1,
>                bool mirror);
>
> -   int32_t multiplier;
> -   int32_t offset;
> +   float multiplier;
> +   float offset;
>  };
>
>
> @@ -304,6 +305,9 @@ struct brw_blorp_blit_prog_key
>      * than one sample per pixel.
>      */
>     bool persample_msaa_dispatch;
> +
> +   /* True for scaled blitting. */
> +   bool blit_scaled;
>  };
>
>  class brw_blorp_blit_params : public brw_blorp_params
> @@ -314,9 +318,10 @@ public:
>                           unsigned src_level, unsigned src_layer,
>                           struct intel_mipmap_tree *dst_mt,
>                           unsigned dst_level, unsigned dst_layer,
> -                         GLuint src_x0, GLuint src_y0,
> -                         GLuint dst_x0, GLuint dst_y0,
> -                         GLuint width, GLuint height,
> +                         GLfloat src_x0, GLfloat src_y0,
> +                         GLfloat src_x1, GLfloat src_y1,
> +                         GLfloat dst_x0, GLfloat dst_y0,
> +                         GLfloat dst_x1, GLfloat dst_y1,
>                           bool mirror_x, bool mirror_y);
>
>     virtual uint32_t get_wm_prog(struct brw_context *brw,
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> index b7ee92b..19169ef 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
> @@ -41,11 +41,11 @@
>   * If coord0 > coord1, swap them and invert the mirror boolean.
>   */
>  static inline void
> -fixup_mirroring(bool &mirror, GLint &coord0, GLint &coord1)
> +fixup_mirroring(bool &mirror, GLfloat &coord0, GLfloat &coord1)
>  {
>     if (coord0 > coord1) {
>        mirror = !mirror;
> -      GLint tmp = coord0;
> +      GLfloat tmp = coord0;
>        coord0 = coord1;
>        coord1 = tmp;
>     }
> @@ -67,9 +67,10 @@ fixup_mirroring(bool &mirror, GLint &coord0, GLint &coord1)
>   * coordinates, by swapping the roles of src and dst.
>   */
>  static inline bool
> -clip_or_scissor(bool mirror, GLint &src_x0, GLint &src_x1, GLint &dst_x0,
> -                GLint &dst_x1, GLint fb_xmin, GLint fb_xmax)
> +clip_or_scissor(bool mirror, GLfloat &src_x0, GLfloat &src_x1, GLfloat &dst_x0,
> +                GLfloat &dst_x1, GLfloat fb_xmin, GLfloat fb_xmax)
>  {
> +   float scale = (float) (src_x1 - src_x0) / (dst_x1 - dst_x0);
>     /* If we are going to scissor everything away, stop. */
>     if (!(fb_xmin < fb_xmax &&
>           dst_x0 < fb_xmax &&
> @@ -105,8 +106,8 @@ clip_or_scissor(bool &mirror, GLint &src_x0, GLint &src_x1, GLint &dst_x0,
>     /* Adjust the source
Re: [Mesa-dev] [PATCH V2 4/4] i965: Enable ext_framebuffer_multisample_blit_scaled on intel h/w
On 16 May 2013 11:44, Anuj Phogat <anuj.pho...@gmail.com> wrote:

> This patch enables ext_framebuffer_multisample_blit_scaled extension
> on intel h/w >= gen6.
>
> Note: Patches for piglit tests to verify this functionality are out
> for review on piglit mailing list. Tests pass for all of the scaling
> factors from 0.1 to 2.4.
>
> Comment from Paul Berry:
>   I have some concerns about the image quality of the method you've
>   implemented. As I understand it, the primary use case of this
>   extension is to allow the client to do multisampled rendering at
>   slightly less than screen resolution (e.g. 720p instead of 1080p),
>   and then blit the result to the screen in one step while keeping
>   most of the quality benefits of multisampling. Since your
>   implementation is effectively equivalent to downsampling and then
>   blitting using GL_NEAREST filtering, my fear is that it will lead
>   to blocky artifacts that are severe enough to negate the benefit of
>   multisampling in the first place.
>
>   Before we turn this extension on in the Intel driver, I'd like to
>   look at a comparison of:
>   (1) your technique
>   (2) downsampling followed by scaling with GL_LINEAR filtering
>   (3) The nVidia implementation, in GL_SCALED_RESOLVE_FASTEST_EXT mode
>   (4) The nVidia implementation, in GL_SCALED_RESOLVE_NICEST_EXT mode
>   (5) Just rendering the image directly to the single-sampled
>       destination buffer
>
> Observation:
>   Image quality is better in cases 2, 3, 4 and 5 as compared to
>   case 1. Although extension's implementation meets the
>   specification's requirements, using it leads to blocky artifacts
>   due to nearest filtering. I'll work on implementing a better
>   filtering technique in blorp.

Thanks for quoting my comment here. It's good to have context so that we can continue the discussion.

My preference would be to go ahead and land patches 1-3 now, but hold patch 4 back until we've figured out how to get comparable image quality to the nVidia implementation. It seems like it would be nice to go out of the gate with our best looking implementation.

Does that seem reasonable to other folks?

> Signed-off-by: Anuj Phogat <anuj.pho...@gmail.com>
> ---
>  src/mesa/drivers/dri/intel/intel_extensions.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/mesa/drivers/dri/intel/intel_extensions.c b/src/mesa/drivers/dri/intel/intel_extensions.c
> index 8d8e325..de12ec3 100644
> --- a/src/mesa/drivers/dri/intel/intel_extensions.c
> +++ b/src/mesa/drivers/dri/intel/intel_extensions.c
> @@ -97,6 +97,7 @@ intelInitExtensions(struct gl_context *ctx)
>
>     if (intel->gen >= 6) {
>        ctx->Extensions.EXT_framebuffer_multisample = true;
> +      ctx->Extensions.EXT_framebuffer_multisample_blit_scaled = true;
>        ctx->Extensions.ARB_blend_func_extended = !driQueryOptionb(&intel->optionCache, "disable_blend_func_extended");
>        ctx->Extensions.ARB_draw_buffers_blend = true;
>        ctx->Extensions.ARB_ES3_compatibility = true;
> --
> 1.8.1.4
Re: [Mesa-dev] [PATCH] i965: Mask the cut index based on the index buffer type in 3DSTATE_VF.
On 05/23/2013 03:46 PM, Kenneth Graunke wrote:

> According to the documentation: "The Cut Index is compared to the
> fetched (and possibly-sign-extended) vertex index, and if these values

Which documentation is this? The only types that are valid for index buffers are unsigned, so what does possibly-sign-extended even mean?

> are equal, the current primitive topology is terminated. Note that,
> for index buffers < 32bpp, it is possible to set the Cut Index to a
> (large) value that will never match a sign-extended vertex index."
>
> This suggests that we should not set the value to 0xFFFFFFFF for
> unsigned byte or short index buffers, but rather 0xFF or 0xFFFF.

For GL_PRIMITIVE_RESTART_FIXED_INDEX (ES and desktop 4.something), where the setting of the restart value is out of application control, this is absolutely correct. The OpenGL 4.3 spec says:

    Primitive restart can also be enabled or disabled with a target of
    PRIMITIVE_RESTART_FIXED_INDEX. In this case, the primitive restart
    index is equal to 2^N − 1, where N is 8, 16 or 32 if the type is
    UNSIGNED_BYTE, UNSIGNED_SHORT, or UNSIGNED_INT, respectively, and
    the index value specified by PrimitiveRestartIndex is ignored.

For GL_PRIMITIVE_RESTART, I'm not so sure. I couldn't find anything conclusive in any of the specs. The only thing I found was a hint in the NV_primitive_restart extension spec:

    * What should the default primitive restart index be?

      RESOLVED: Zero. It's tough to pick another number that is
      meaningful for all three element data types. In practice, apps
      are likely to set it to 0xFFFF or 0xFFFFFFFF.

You can infer from this that applications are expected to set 0xFFFF for GL_UNSIGNED_SHORT and 0xFFFFFFFF for GL_UNSIGNED_INT. Experimentation is the only way to know for sure. :(

> Fixes sporadic failures in the ES 3 instanced_arrays_primitive_restart
> conformance test when run in combination with other tests.
>
> No Piglit regressions.
>
> Cc: Ian Romanick <i...@freedesktop.org>
> Cc: Paul Berry <stereotype...@gmail.com>
> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_primitive_restart.c | 27
>  1 file changed, 19 insertions(+), 8 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
> index f824915..cf4a1ea 100644
> --- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c
> +++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
> @@ -183,19 +183,30 @@ haswell_upload_cut_index(struct brw_context *brw)
>     if (!intel->is_haswell)
>        return;
>
> -   const unsigned cut_index_setting =
> -      ctx->Array._PrimitiveRestart ? HSW_CUT_INDEX_ENABLE : 0;
> -
> -   BEGIN_BATCH(2);
> -   OUT_BATCH(_3DSTATE_VF << 16 | cut_index_setting | (2 - 2));
> -   OUT_BATCH(ctx->Array._RestartIndex);
> -   ADVANCE_BATCH();
> +   if (ctx->Array._PrimitiveRestart) {
> +      int cut_index = ctx->Array._RestartIndex;
> +
> +      if (brw->ib.type == GL_UNSIGNED_BYTE)
> +         cut_index &= 0xff;
> +      else if (brw->ib.type == GL_UNSIGNED_SHORT)
> +         cut_index &= 0xffff;
> +
> +      BEGIN_BATCH(2);
> +      OUT_BATCH(_3DSTATE_VF << 16 | HSW_CUT_INDEX_ENABLE | (2 - 2));
> +      OUT_BATCH(cut_index);
> +      ADVANCE_BATCH();
> +   } else {
> +      BEGIN_BATCH(2);
> +      OUT_BATCH(_3DSTATE_VF << 16 | (2 - 2));
> +      OUT_BATCH(0);
> +      ADVANCE_BATCH();
> +   }
>  }
>
>  const struct brw_tracked_state haswell_cut_index = {
>     .dirty = {
>        .mesa  = _NEW_TRANSFORM,
> -      .brw   = 0,
> +      .brw   = BRW_NEW_INDEX_BUFFER,
>        .cache = 0,
>     },
>     .emit = haswell_upload_cut_index,
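As a small illustration of the masking being discussed, the following standalone sketch narrows a restart index to the width of the index buffer type. effective_cut_index and the SK_-prefixed constants are names made up for this example (the values are the usual GL type enums, copied in so no GL headers are needed); it is not the driver code, only the arithmetic.

```c
#include <stdint.h>

/* Local copies of the GL index type enums, prefixed to avoid clashing
 * with real GL headers. */
#define SK_GL_UNSIGNED_BYTE  0x1401
#define SK_GL_UNSIGNED_SHORT 0x1403
#define SK_GL_UNSIGNED_INT   0x1405

/* Hypothetical helper: mask the application's restart index down to
 * the number of bits an index of the given type can actually hold, so
 * that a restart index of 0xFFFFFFFF still matches 0xFF in a byte
 * index buffer or 0xFFFF in a short one. */
uint32_t effective_cut_index(uint32_t restart_index, unsigned index_type)
{
	switch (index_type) {
	case SK_GL_UNSIGNED_BYTE:
		return restart_index & 0xff;
	case SK_GL_UNSIGNED_SHORT:
		return restart_index & 0xffff;
	default:
		/* 32-bit indices: use the value as given. */
		return restart_index;
	}
}
```

Note that masking also changes the result for values that are not all ones: a restart index of 0x1234 with byte indices becomes 0x34, which is the GL_PRIMITIVE_RESTART case the reply above is unsure about.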
Re: [Mesa-dev] [PATCH 00/12] i965/gen7+: Implement fast color clears.
On 05/21/2013 04:52 PM, Paul Berry wrote:

> This series implements fast color clears, a Gen7+ feature which
> reduces memory bandwidth by deferring the memory writes involved in a
> glClear() until the same memory is later touched during rendering.
>
> From a broad overview point of view, fast color clears work in a
> similar way to HiZ: an auxiliary MCS buffer keeps track of which
> parts of the buffer have been cleared but haven't yet had the
> necessary memory writes performed. Whenever a color buffer needs to
> be accessed by the CPU, or by a part of the GPU that is not
> fast-color-aware, we have to perform a resolve operation to force any
> pending memory writes to occur.
>
> This patch series adopts a slightly different strategy (compared to
> HiZ) for making sure the resolves happen when needed. Instead of
> modifying each code path that might need to do a resolve so that it
> does one if needed, we create an accessor function that does the
> resolve if needed and then provides the caller with access to the
> miptree's underlying memory region. This lets us have a lot more
> confidence that we didn't miss any code paths, which is important
> since color buffers are accessed by a large number of code paths. To
> discourage future maintainers from trying to bypass the accessor
> function, it is inline (so that overhead is negligible), and the
> field it provides access to has been renamed to region_private.
>
> Patch 01 ifdefs out some code so that it does not appear in the i915
> (pre-Gen4) driver--this makes it easier to be confident that these
> changes won't regress i915. Patch 02 introduces the aforementioned
> accessor function. Patches 03-11 are the guts of the implementation,
> and patch 12 enables the new feature.
>
> No piglit regressions. I have additional piglit tests which validate
> specific important corner cases--I hope to get those out to the list
> later this week.

I sent some comments and review for the tests, and I've sent some other comments about these patches.

My only concern is whether the case of swapping a non-current drawable (that had a fast-clear as the last render) produces the correct result. In the piglit thread, I suggested adding a test specifically for this case. I suspect that if fast-clear fails in that case, then multisampling also fails. Both can probably be fixed as follow-on work. Does that seem plausible?

> [PATCH 01/12] intel: Conditionally compile mcs-related code for i965 only.
> [PATCH 02/12] intel: Create intel_miptree_get_region() to prepare for fast color clear.
> [PATCH 03/12] i965/gen7+: Create an enum for keeping track of fast color clear state.
> [PATCH 04/12] i965/gen7+: Set up MCS in SURFACE_STATE whenever MCS is present.
> [PATCH 05/12] i965/gen7+: Create helper functions for single-sample MCS buffers.
> [PATCH 06/12] i965/gen7+: Implement fast color clear operation in BLORP.
> [PATCH 07/12] i965/blorp: Expand clear class hierarchy to prepare for RT resolves.
> [PATCH 08/12] i965/blorp: Write blorp code to do render target resolves.
> [PATCH 09/12] i965/gen7+: Ensure that front/back buffers are fast-clear resolved.
> [PATCH 10/12] i965/gen7+: Resolve color buffers when necessary.
> [PATCH 11/12] i965/gen7+: Disable fast color clears on shared regions.
> [PATCH 12/12] i965/gen7: Enable support for fast color clears.
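The accessor strategy described in the cover letter can be pictured with a small standalone sketch. Everything here is hypothetical and prefixed with sketch_: the real series adds intel_miptree_get_region() and a region_private field, but the struct layout, the state enum and the resolve bookkeeping below are assumptions made only to show the pattern of funnelling every access through one function that resolves first.

```c
#include <stddef.h>

/* Hypothetical stand-in for the fast clear state tracking. */
enum sketch_fast_clear_state {
	SKETCH_RESOLVED,      /* memory contents are up to date */
	SKETCH_CLEAR_PENDING  /* a fast clear has not been written out yet */
};

/* Hypothetical stand-in for the underlying memory region. */
struct sketch_region {
	int id;
};

struct sketch_miptree {
	/* Named "private" to discourage direct access that would skip
	 * the resolve; callers are expected to use the accessor. */
	struct sketch_region *region_private;
	enum sketch_fast_clear_state fast_clear_state;
	unsigned resolve_count; /* bookkeeping for this sketch only */
};

/* Stand-in for the real resolve operation: here it only records that
 * a resolve happened and marks the tree as resolved. */
static void sketch_resolve(struct sketch_miptree *mt)
{
	mt->resolve_count++;
	mt->fast_clear_state = SKETCH_RESOLVED;
}

/* The accessor: performs any pending resolve, then hands out the region. */
static inline struct sketch_region *
sketch_miptree_get_region(struct sketch_miptree *mt)
{
	if (mt->fast_clear_state != SKETCH_RESOLVED)
		sketch_resolve(mt);
	return mt->region_private;
}
```

The design point is that a code path which forgets about fast clears still gets correct data, because the only convenient way to reach the region is through the accessor. The non-current drawable case raised above is a question about whether every swap path really goes through such an accessor.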
Re: [Mesa-dev] [PATCH 10/12] sso: update glGet: GL_PROGRAM_PIPELINE_BINDING
On Sat, 4 May 2013 11:35:22 +0200 gregory hainaut gregory.hain...@gmail.com wrote:
> On Fri, 3 May 2013 12:04:48 -0700 Matt Turner matts...@gmail.com wrote:
>> On Fri, May 3, 2013 at 10:44 AM, Gregory Hainaut gregory.hain...@gmail.com wrote:
>>> ---
>>>  src/mesa/main/get.c              | 9 +
>>>  src/mesa/main/get_hash_params.py | 3 +++
>>>  2 files changed, 12 insertions(+)
>>>
>>> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
>>> index 54159c0..6cbb7db 100644
>>> --- a/src/mesa/main/get.c
>>> +++ b/src/mesa/main/get.c
>>> @@ -369,6 +369,7 @@ EXTRA_EXT(ARB_map_buffer_alignment);
>>>  EXTRA_EXT(ARB_texture_cube_map_array);
>>>  EXTRA_EXT(ARB_texture_buffer_range);
>>>  EXTRA_EXT(ARB_texture_multisample);
>>> +EXTRA_EXT(ARB_separate_shader_objects);
>>>
>>>  static const int
>>>  extra_ARB_color_buffer_float_or_glcore[] = {
>>> @@ -889,6 +890,14 @@ find_custom_value(struct gl_context *ctx, const struct value_desc *d, union valu
>>>           _mesa_problem(ctx, "driver doesn't implement GetTimestamp");
>>>        }
>>>        break;
>>> +   /* GL_ARB_separate_shader_objects */
>>> +   case GL_PROGRAM_PIPELINE_BINDING:
>>> +      if (ctx->Pipeline.Current) {
>>> +         v->value_int = ctx->Pipeline.Current->Name;
>>> +      } else {
>>> +         v->value_int = 0;
>>> +      }
>>> +      break;
>>>     }
>>>  }
>>
>> This looks believable, but I can't find a description in the extension
>> spec or GL 4.1+ specs that says precisely what this query is supposed to
>> do. Looks like it's just mentioned in the extension spec, and not at all
>> in GL 4.1+ specs.
>
> Yes, you're right, that's strange. There are also a couple of lines in
> the glGet man page:
>
>    GL_PROGRAM_PIPELINE_BINDING
>       params returns a single value, the name of the currently bound
>       program pipeline object, or zero if no program pipeline object is
>       bound. See glBindProgramPipeline.
>
> Both Nvidia and AMD support this query. I did a quick update on my piglit
> test, on the AMD side:
> * UseProgram(2)
> * BindPipeline(5) (the pipeline isn't really bound because UseProgram
>   has a higher priority)
> * Get GL_PROGRAM_PIPELINE_BINDING = 5
>
> I will try to check the behavior on the Nvidia implementation.

The Nvidia implementation behaves like this one:

   if (ctx->_Shader) {
      v->value_int = ctx->_Shader->Name;
   } else {
      v->value_int = 0;
   }

So on my previous example:
* UseProgram(2)
* BindPipeline(5)
* Get GL_PROGRAM_PIPELINE_BINDING = 0

The spec does not say which is right, but the SSO spec was written by Nvidia, so I would say that Nvidia is correct.

>>> diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
>>> index 2b97da6..43a11cf 100644
>>> --- a/src/mesa/main/get_hash_params.py
>>> +++ b/src/mesa/main/get_hash_params.py
>>> @@ -709,6 +709,9 @@ descriptor=[
>>>  # GL_ARB_texture_cube_map_array
>>>    [ "TEXTURE_BINDING_CUBE_MAP_ARRAY_ARB", "LOC_CUSTOM, TYPE_INT, TEXTURE_CUBE_ARRAY_INDEX", "extra_ARB_texture_cube_map_array" ],
>>> +
>>> +# GL_ARB_separate_shader_objects
>>> +  [ "PROGRAM_PIPELINE_BINDING", "LOC_CUSTOM, TYPE_INT, GL_PROGRAM_PIPELINE_BINDING", "extra_ARB_separate_shader_objects" ],
>>>  ]},
>>>
>>>  # Enums restricted to OpenGL Core profile
>>> --
>>> 1.7.10.4
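To make the difference between the two observed behaviours concrete, here is a minimal sketch. The structures and function names are hypothetical stand-ins, not the real Mesa context, and the model assumes only what the thread reports: AMD returns the bound pipeline regardless, while Nvidia returns it only when no UseProgram program overrides it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the state discussed in this
 * thread; these are not the real Mesa structures. */
struct pipeline_obj { unsigned name; };

struct sso_state {
   unsigned current_program;          /* set by UseProgram, 0 if none */
   const struct pipeline_obj *bound;  /* set by BindProgramPipeline */
};

/* Behaviour reported for AMD: return the bound pipeline unconditionally. */
static unsigned
query_binding_bound(const struct sso_state *s)
{
   assert(s != NULL);
   return s->bound ? s->bound->name : 0;
}

/* Behaviour reported for Nvidia: return the pipeline only if it is the
 * one in effect, i.e. no UseProgram program takes priority over it. */
static unsigned
query_binding_effective(const struct sso_state *s)
{
   assert(s != NULL);
   if (s->current_program != 0)
      return 0;
   return s->bound ? s->bound->name : 0;
}
```

With UseProgram(2) followed by BindPipeline(5), the first function yields 5 and the second yields 0, which matches the two results quoted in this thread.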
Re: [Mesa-dev] [PATCH 4/5] radeonsi/compute: Pass kernel arguments in a buffer
On Fri, May 24, 2013 at 01:32:20PM -0500, Patrick Baggett wrote:
> The only difference I could see is that in the old code you passed
> cb->buffer (which maybe points to a value?) directly into u_upload_data(),
> whereas in the new code you pass cb->buffer as the parameter rbuffer to
> r600_upload_const_buffer(), but then inside that function you do
> *rbuffer = NULL before you start, which effectively erases any previous
> pointer, so if *rbuffer was examined by u_upload_data(), it may be
> different. I don't know if that matters, though.

This was the problem, thanks for spotting it! u_upload_data() was deleting the old buffer, so by initializing rbuffer to NULL, the old buffer was never being deleted. An updated patch is on the way.

-Tom

> Patrick
>
> On Fri, May 24, 2013 at 1:07 PM, Tom Stellard t...@stellard.net wrote:
>> From: Tom Stellard thomas.stell...@amd.com
>>
>> ---
>>  src/gallium/drivers/radeonsi/r600_buffer.c      | 31 +
>>  src/gallium/drivers/radeonsi/radeonsi_compute.c | 26 ++---
>>  src/gallium/drivers/radeonsi/si_state.c         | 29 +++
>>  3 files changed, 51 insertions(+), 35 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeonsi/r600_buffer.c b/src/gallium/drivers/radeonsi/r600_buffer.c
>> index cdf9988..87763c3 100644
>> --- a/src/gallium/drivers/radeonsi/r600_buffer.c
>> +++ b/src/gallium/drivers/radeonsi/r600_buffer.c
>> @@ -25,6 +25,8 @@
>>   *      Corbin Simpson mostawesomed...@gmail.com
>>   */
>>
>> +#include <byteswap.h>
>> +
>>  #include "pipe/p_screen.h"
>>  #include "util/u_format.h"
>>  #include "util/u_math.h"
>> @@ -168,3 +170,32 @@ void r600_upload_index_buffer(struct r600_context *rctx,
>>      u_upload_data(rctx->uploader, 0, count * ib->index_size,
>>                    ib->user_buffer, &ib->offset, &ib->buffer);
>>  }
>> +
>> +void r600_upload_const_buffer(struct r600_context *rctx, struct si_resource **rbuffer,
>> +                              const uint8_t *ptr, unsigned size,
>> +                              uint32_t *const_offset)
>> +{
>> +    *rbuffer = NULL;
>> +
>> +    if (R600_BIG_ENDIAN) {
>> +        uint32_t *tmpPtr;
>> +        unsigned i;
>> +
>> +        if (!(tmpPtr = malloc(size))) {
>> +            R600_ERR("Failed to allocate BE swap buffer.\n");
>> +            return;
>> +        }
>> +
>> +        for (i = 0; i < size / 4; ++i) {
>> +            tmpPtr[i] = bswap_32(((uint32_t *)ptr)[i]);
>> +        }
>> +
>> +        u_upload_data(rctx->uploader, 0, size, tmpPtr, const_offset,
>> +                      (struct pipe_resource**)rbuffer);
>> +
>> +        free(tmpPtr);
>> +    } else {
>> +        u_upload_data(rctx->uploader, 0, size, ptr, const_offset,
>> +                      (struct pipe_resource**)rbuffer);
>> +    }
>> +}
>> diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c b/src/gallium/drivers/radeonsi/radeonsi_compute.c
>> index 3fb6eb1..035076d 100644
>> --- a/src/gallium/drivers/radeonsi/radeonsi_compute.c
>> +++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c
>> @@ -91,8 +91,11 @@ static void radeonsi_launch_grid(
>>      struct r600_context *rctx = (struct r600_context*)ctx;
>>      struct si_pipe_compute *program = rctx->cs_shader_state.program;
>>      struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
>> +    struct si_resource *input_buffer;
>> +    uint32_t input_offset = 0;
>> +    uint64_t input_va;
>>      uint64_t shader_va;
>> -    unsigned arg_user_sgpr_count;
>> +    unsigned arg_user_sgpr_count = 2;
>>      unsigned i;
>>      struct si_pipe_shader *shader = &program->kernels[pc];
>>
>> @@ -109,21 +112,16 @@ static void radeonsi_launch_grid(
>>      si_pm4_inval_shader_cache(pm4);
>>      si_cmd_surface_sync(pm4, pm4->cp_coher_cntl);
>>
>> -    arg_user_sgpr_count = program->input_size / 4;
>> -    if (program->input_size % 4 != 0) {
>> -        arg_user_sgpr_count++;
>> -    }
>> +    /* Upload the input data */
>> +    r600_upload_const_buffer(rctx, &input_buffer, input,
>> +                             program->input_size, &input_offset);
>> +    input_va = r600_resource_va(ctx->screen, (struct pipe_resource*)input_buffer);
>> +    input_va += input_offset;
>>
>> -    /* XXX: We should store arguments in memory if we run out of user sgprs.
>> -     */
>> -    assert(arg_user_sgpr_count < 16);
>> +    si_pm4_add_bo(pm4, input_buffer, RADEON_USAGE_READ);
>>
>> -    for (i = 0; i < arg_user_sgpr_count; i++) {
>> -        uint32_t *args = (uint32_t*)input;
>> -        si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 +
>> -                       (i * 4),
>> -                       args[i]);
>> -    }
>> +    si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0, input_va);
>> +    si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 + 4, S_008F04_BASE_ADDRESS_HI (input_va
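The leak diagnosed in this thread can be illustrated with a small model. This is only a sketch under assumptions: `struct buf`, `upload()` and `upload_leaky()` are hypothetical names, and the only behaviour borrowed from the discussion is that the upload helper releases whatever buffer its in/out parameter held before storing the new one.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted buffer; not the real pipe_resource. */
struct buf { int refcount; };

static int live_buffers = 0;   /* buffers created and not yet freed */

static struct buf *
buf_create(void)
{
   struct buf *b = malloc(sizeof(*b));
   assert(b != NULL);
   b->refcount = 1;
   live_buffers++;
   return b;
}

/* Drop the reference held in *slot (if any) and store next there. */
static void
buf_replace(struct buf **slot, struct buf *next)
{
   struct buf *old = *slot;
   if (old && --old->refcount == 0) {
      free(old);
      live_buffers--;
   }
   *slot = next;
}

/* Models an upload helper that returns a buffer through an in/out
 * parameter and releases whatever the parameter held before. */
static void
upload(struct buf **out)
{
   buf_replace(out, buf_create());
}

/* Models the v1 wrapper: clearing *out first hides the old buffer from
 * upload(), so its reference is never dropped. */
static void
upload_leaky(struct buf **out)
{
   *out = NULL;
   upload(out);
}
```

Calling `upload()` twice on the same slot leaves one live buffer, while a following `upload_leaky()` leaves two, which is the situation the v2 patch avoids by not resetting the pointer inside the helper.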
[Mesa-dev] [PATCH] radeonsi/compute: Pass kernel arguments in a buffer v2
From: Tom Stellard thomas.stell...@amd.com v2: - Fix memory leak in si_set_constant_buffer() --- src/gallium/drivers/radeonsi/r600_buffer.c | 29 + src/gallium/drivers/radeonsi/radeonsi_compute.c | 26 ++ src/gallium/drivers/radeonsi/si_state.c | 23 ++-- 3 files changed, 43 insertions(+), 35 deletions(-) diff --git a/src/gallium/drivers/radeonsi/r600_buffer.c b/src/gallium/drivers/radeonsi/r600_buffer.c index cdf9988..3d295e8 100644 --- a/src/gallium/drivers/radeonsi/r600_buffer.c +++ b/src/gallium/drivers/radeonsi/r600_buffer.c @@ -25,6 +25,8 @@ * Corbin Simpson mostawesomed...@gmail.com */ +#include byteswap.h + #include pipe/p_screen.h #include util/u_format.h #include util/u_math.h @@ -168,3 +170,30 @@ void r600_upload_index_buffer(struct r600_context *rctx, u_upload_data(rctx-uploader, 0, count * ib-index_size, ib-user_buffer, ib-offset, ib-buffer); } + +void r600_upload_const_buffer(struct r600_context *rctx, struct si_resource **rbuffer, + const uint8_t *ptr, unsigned size, + uint32_t *const_offset) +{ + if (R600_BIG_ENDIAN) { + uint32_t *tmpPtr; + unsigned i; + + if (!(tmpPtr = malloc(size))) { + R600_ERR(Failed to allocate BE swap buffer.\n); + return; + } + + for (i = 0; i size / 4; ++i) { + tmpPtr[i] = bswap_32(((uint32_t *)ptr)[i]); + } + + u_upload_data(rctx-uploader, 0, size, tmpPtr, const_offset, + (struct pipe_resource**)rbuffer); + + free(tmpPtr); + } else { + u_upload_data(rctx-uploader, 0, size, ptr, const_offset, + (struct pipe_resource**)rbuffer); + } +} diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c b/src/gallium/drivers/radeonsi/radeonsi_compute.c index 3fb6eb1..4341ecc 100644 --- a/src/gallium/drivers/radeonsi/radeonsi_compute.c +++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c @@ -91,8 +91,11 @@ static void radeonsi_launch_grid( struct r600_context *rctx = (struct r600_context*)ctx; struct si_pipe_compute *program = rctx-cs_shader_state.program; struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state); + struct si_resource 
*input_buffer = NULL; + uint32_t input_offset = 0; + uint64_t input_va; uint64_t shader_va; - unsigned arg_user_sgpr_count; + unsigned arg_user_sgpr_count = 2; unsigned i; struct si_pipe_shader *shader = program-kernels[pc]; @@ -109,21 +112,16 @@ static void radeonsi_launch_grid( si_pm4_inval_shader_cache(pm4); si_cmd_surface_sync(pm4, pm4-cp_coher_cntl); - arg_user_sgpr_count = program-input_size / 4; - if (program-input_size % 4 != 0) { - arg_user_sgpr_count++; - } + /* Upload the input data */ + r600_upload_const_buffer(rctx, input_buffer, input, + program-input_size, input_offset); + input_va = r600_resource_va(ctx-screen, (struct pipe_resource*)input_buffer); + input_va += input_offset; - /* XXX: We should store arguments in memory if we run out of user sgprs. -*/ - assert(arg_user_sgpr_count 16); + si_pm4_add_bo(pm4, input_buffer, RADEON_USAGE_READ); - for (i = 0; i arg_user_sgpr_count; i++) { - uint32_t *args = (uint32_t*)input; - si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 + - (i * 4), - args[i]); - } + si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0, input_va); + si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 + 4, S_008F04_BASE_ADDRESS_HI (input_va 32) | S_008F04_STRIDE(0)); si_pm4_set_reg(pm4, R_00B810_COMPUTE_START_X, 0); si_pm4_set_reg(pm4, R_00B814_COMPUTE_START_Y, 0); diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index dec535c..98e54c7 100644 --- a/src/gallium/drivers/radeonsi/si_state.c +++ b/src/gallium/drivers/radeonsi/si_state.c @@ -24,8 +24,6 @@ * Christian König christian.koe...@amd.com */ -#include byteswap.h - #include util/u_memory.h #include util/u_framebuffer.h #include util/u_blitter.h @@ -2526,25 +2524,8 @@ static void si_set_constant_buffer(struct pipe_context *ctx, uint shader, uint i ptr = input-user_buffer; if (ptr) { - /* Upload the user buffer. 
*/ - if (R600_BIG_ENDIAN) { - uint32_t *tmpPtr; - unsigned i, size = input-buffer_size; - - if (!(tmpPtr = malloc(size))) { - R600_ERR(Failed to allocate
Re: [Mesa-dev] [PATCH] i965: Mask the cut index based on the index buffer type in 3DSTATE_VF.
On 24 May 2013 12:23, Ian Romanick i...@freedesktop.org wrote:
> On 05/23/2013 03:46 PM, Kenneth Graunke wrote:
>> According to the documentation:
>>
>>    The Cut Index is compared to the fetched (and possibly-sign-extended)
>>    vertex index, and if these values
>
> Which documentation is this? The only types that are valid for index
> buffers are unsigned, so what does possibly-sign-extended even mean?

This is from the Haswell bspec. I have never understood what they mean by possibly-sign-extended either.

>>    are equal, the current primitive topology is terminated. Note that,
>>    for index buffers < 32bpp, it is possible to set the Cut Index to a
>>    (large) value that will never match a sign-extended vertex index.
>>
>> This suggests that we should not set the value to 0xFFFFFFFF for
>> unsigned byte or short index buffers, but rather 0xFF or 0xFFFF.
>
> For GL_PRIMITIVE_RESTART_FIXED_INDEX (ES and desktop 4.something), where
> the setting of the restart value is out of application control, this is
> absolutely correct. The OpenGL 4.3 spec says:
>
>    Primitive restart can also be enabled or disabled with a target of
>    PRIMITIVE_RESTART_FIXED_INDEX. In this case, the primitive restart
>    index is equal to 2^N - 1, where N is 8, 16 or 32 if the type is
>    UNSIGNED_BYTE, UNSIGNED_SHORT, or UNSIGNED_INT, respectively, and the
>    index value specified by PrimitiveRestartIndex is ignored.
>
> For GL_PRIMITIVE_RESTART, I'm not so sure. I couldn't find anything
> conclusive in any of the specs. The only thing I found was a hint in the
> NV_primitive_restart extension spec:
>
>    * What should the default primitive restart index be?
>
>      RESOLVED: Zero. It's tough to pick another number that is
>      meaningful for all three element data types. In practice, apps
>      are likely to set it to 0xFFFF or 0xFFFFFFFF.
>
> You can infer from this that applications are expected to set 0xFFFF for
> GL_UNSIGNED_SHORT and 0xFFFFFFFF for GL_UNSIGNED_INT. Experimentation is
> the only way to know for sure. :(
>
>> Fixes sporadic failures in the ES 3 instanced_arrays_primitive_restart
>> conformance test when run in combination with other tests.
>>
>> No Piglit regressions.
>>
>> Cc: Ian Romanick i...@freedesktop.org
>> Cc: Paul Berry stereotype...@gmail.com
>> Signed-off-by: Kenneth Graunke kenn...@whitecape.org
>> ---
>>  src/mesa/drivers/dri/i965/brw_primitive_restart.c | 27 ---
>>  1 file changed, 19 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
>> index f824915..cf4a1ea 100644
>> --- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c
>> +++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
>> @@ -183,19 +183,30 @@ haswell_upload_cut_index(struct brw_context *brw)
>>     if (!intel->is_haswell)
>>        return;
>>
>> -   const unsigned cut_index_setting =
>> -      ctx->Array._PrimitiveRestart ? HSW_CUT_INDEX_ENABLE : 0;
>> -
>> -   BEGIN_BATCH(2);
>> -   OUT_BATCH(_3DSTATE_VF << 16 | cut_index_setting | (2 - 2));
>> -   OUT_BATCH(ctx->Array._RestartIndex);
>> -   ADVANCE_BATCH();
>> +   if (ctx->Array._PrimitiveRestart) {
>> +      int cut_index = ctx->Array._RestartIndex;
>> +
>> +      if (brw->ib.type == GL_UNSIGNED_BYTE)
>> +         cut_index &= 0xff;
>> +      else if (brw->ib.type == GL_UNSIGNED_SHORT)
>> +         cut_index &= 0xffff;
>> +
>> +      BEGIN_BATCH(2);
>> +      OUT_BATCH(_3DSTATE_VF << 16 | HSW_CUT_INDEX_ENABLE | (2 - 2));
>> +      OUT_BATCH(cut_index);
>> +      ADVANCE_BATCH();
>> +   } else {
>> +      BEGIN_BATCH(2);
>> +      OUT_BATCH(_3DSTATE_VF << 16 | (2 - 2));
>> +      OUT_BATCH(0);
>> +      ADVANCE_BATCH();
>> +   }
>>  }
>>
>>  const struct brw_tracked_state haswell_cut_index = {
>>     .dirty = {
>>        .mesa  = _NEW_TRANSFORM,
>> -      .brw   = 0,
>> +      .brw   = BRW_NEW_INDEX_BUFFER,
>>        .cache = 0,
>>     },
>>     .emit = haswell_upload_cut_index,
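As a standalone illustration of the masking idea in this patch, the sketch below narrows a restart index to the width of the index type. The enum and function names are hypothetical (the driver itself keys off the GL type enums), and the fixed-index helper simply applies the 2^N - 1 rule from the OpenGL 4.3 text quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical index-type enum for this sketch only. */
enum index_type { INDEX_U8, INDEX_U16, INDEX_U32 };

/* Mask an application-supplied restart index down to the width of the
 * index type, so it can match a fetched index of that width. */
static uint32_t
mask_cut_index(uint32_t restart_index, enum index_type type)
{
   switch (type) {
   case INDEX_U8:
      return restart_index & 0xffu;
   case INDEX_U16:
      return restart_index & 0xffffu;
   case INDEX_U32:
      return restart_index;
   }
   assert(!"unknown index type");
   return restart_index;
}

/* Fixed restart index for a type: 2^N - 1 for N = 8, 16 or 32. */
static uint32_t
fixed_restart_index(enum index_type type)
{
   return mask_cut_index(UINT32_MAX, type);
}
```

An application that leaves the restart index at 0xFFFFFFFF while drawing with 16-bit indices would then get a cut index of 0xFFFF, which is the case the conformance test appears to exercise.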
[Mesa-dev] [PATCH 0/4] V2 Multiple viewports in Gallium
This series adds support for multiple viewports/scissors to gallium and implements it in llvmpipe. All the other drivers still support just a single viewport/scissor combo and their behavior should be exactly the same as it was.

I think this version takes care of all the review comments and addresses everyone's concerns. Please let me know if I missed something.

Zack Rusin (4):
  gallium: Add support for multiple viewports
  draw: implement support for multiple viewports
  llvmpipe: implement support for multiple viewports
  draw: fixup draw_find_shader_output

 src/gallium/auxiliary/cso_cache/cso_context.c | 4 +-
 src/gallium/auxiliary/draw/draw_cliptest_tmp.h | 10 +++-
 src/gallium/auxiliary/draw/draw_context.c | 63 +++-
 src/gallium/auxiliary/draw/draw_context.h | 6 +-
 src/gallium/auxiliary/draw/draw_gs.c | 11 +++-
 src/gallium/auxiliary/draw/draw_gs.h | 1 +
 src/gallium/auxiliary/draw/draw_pipe_clip.c | 11 +++-
 src/gallium/auxiliary/draw/draw_private.h | 8 +--
 .../draw/draw_pt_fetch_shade_pipeline_llvm.c | 4 +-
 src/gallium/auxiliary/draw/draw_vertex.h | 2 +-
 src/gallium/auxiliary/draw/draw_vs.c | 7 ---
 src/gallium/auxiliary/draw/draw_vs_variant.c | 34 +--
 src/gallium/auxiliary/tgsi/tgsi_scan.c | 6 ++
 src/gallium/auxiliary/tgsi/tgsi_scan.h | 1 +
 src/gallium/auxiliary/tgsi/tgsi_strings.c | 3 +-
 src/gallium/auxiliary/util/u_blitter.c | 8 +--
 src/gallium/auxiliary/vl/vl_compositor.c | 4 +-
 src/gallium/auxiliary/vl/vl_idct.c | 4 +-
 src/gallium/auxiliary/vl/vl_matrix_filter.c | 2 +-
 src/gallium/auxiliary/vl/vl_mc.c | 2 +-
 src/gallium/auxiliary/vl/vl_median_filter.c | 2 +-
 src/gallium/auxiliary/vl/vl_zscan.c | 2 +-
 src/gallium/docs/source/context.rst | 8 ++-
 src/gallium/drivers/freedreno/freedreno_state.c | 12 ++--
 src/gallium/drivers/galahad/glhd_context.c | 20 ---
 src/gallium/drivers/i915/i915_state.c | 15 +++--
 src/gallium/drivers/identity/id_context.c | 22 +++
 src/gallium/drivers/ilo/ilo_state.c | 16 +++--
 src/gallium/drivers/llvmpipe/lp_context.h | 7 ++-
 src/gallium/drivers/llvmpipe/lp_screen.c | 2 +
 src/gallium/drivers/llvmpipe/lp_setup.c | 29 +
 src/gallium/drivers/llvmpipe/lp_setup.h | 4 +-
 src/gallium/drivers/llvmpipe/lp_setup_context.h | 8 ++-
 src/gallium/drivers/llvmpipe/lp_setup_line.c | 12 +++-
 src/gallium/drivers/llvmpipe/lp_setup_point.c | 12 ++--
 src/gallium/drivers/llvmpipe/lp_setup_tri.c | 17 --
 src/gallium/drivers/llvmpipe/lp_state_clip.c | 25 +---
 src/gallium/drivers/llvmpipe/lp_state_derived.c | 20 +--
 src/gallium/drivers/llvmpipe/lp_surface.c | 4 +-
 src/gallium/drivers/noop/noop_state.c | 16 +++--
 src/gallium/drivers/nv30/nv30_draw.c | 2 +-
 src/gallium/drivers/nv30/nv30_state.c | 16 +++--
 src/gallium/drivers/nv50/nv50_state.c | 16 +++--
 src/gallium/drivers/nvc0/nvc0_state.c | 16 +++--
 src/gallium/drivers/r300/r300_context.c | 2 +-
 src/gallium/drivers/r300/r300_state.c | 18 +++---
 src/gallium/drivers/r600/evergreen_state.c | 6 +-
 src/gallium/drivers/r600/r600_state.c | 8 ++-
 src/gallium/drivers/r600/r600_state_common.c | 10 ++--
 src/gallium/drivers/radeonsi/si_state.c | 16 +++--
 src/gallium/drivers/rbug/rbug_context.c | 22 +++
 src/gallium/drivers/softpipe/sp_screen.c | 2 +
 src/gallium/drivers/softpipe/sp_state_clip.c | 19 +++---
 src/gallium/drivers/softpipe/sp_state_derived.c | 2 +-
 src/gallium/drivers/svga/svga_pipe_misc.c | 20 ---
 src/gallium/drivers/svga/svga_swtnl_state.c | 2 +-
 src/gallium/drivers/trace/tr_context.c | 32 ++
 src/gallium/include/pipe/p_context.h | 12 ++--
 src/gallium/include/pipe/p_defines.h | 3 +-
 src/gallium/include/pipe/p_shader_tokens.h | 3 +-
 src/gallium/include/pipe/p_state.h | 1 +
 src/gallium/tests/graw/fs-test.c | 2 +-
 src/gallium/tests/graw/graw_util.h | 2 +-
 src/gallium/tests/graw/gs-test.c | 2 +-
 src/gallium/tests/graw/quad-sample.c | 2 +-
 src/gallium/tests/graw/shader-leak.c | 2 +-
 src/gallium/tests/graw/tri-gs.c | 2 +-
 src/gallium/tests/graw/tri-instanced.c | 2 +-
 src/gallium/tests/graw/vs-test.c | 2 +-
[Mesa-dev] [PATCH 2/4] draw: implement support for multiple viewports
This adds support for multiple viewports to the draw module. Multiple viewports depend on the presence of geometry shaders which can write the viewport index. Signed-off-by: Zack Rusin za...@vmware.com --- src/gallium/auxiliary/draw/draw_cliptest_tmp.h | 10 +++- src/gallium/auxiliary/draw/draw_context.c | 52 src/gallium/auxiliary/draw/draw_gs.c | 11 - src/gallium/auxiliary/draw/draw_gs.h |1 + src/gallium/auxiliary/draw/draw_pipe_clip.c| 11 - src/gallium/auxiliary/draw/draw_private.h |8 +-- .../draw/draw_pt_fetch_shade_pipeline_llvm.c |4 +- src/gallium/auxiliary/draw/draw_vs.c |7 --- src/gallium/auxiliary/draw/draw_vs_variant.c | 34 +++-- 9 files changed, 105 insertions(+), 33 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_cliptest_tmp.h b/src/gallium/auxiliary/draw/draw_cliptest_tmp.h index 48f2349..09e1fd7 100644 --- a/src/gallium/auxiliary/draw/draw_cliptest_tmp.h +++ b/src/gallium/auxiliary/draw/draw_cliptest_tmp.h @@ -31,8 +31,6 @@ static boolean TAG(do_cliptest)( struct pt_post_vs *pvs, struct draw_vertex_info *info ) { struct vertex_header *out = info-verts; - const float *scale = pvs-draw-viewport.scale; - const float *trans = pvs-draw-viewport.translate; /* const */ float (*plane)[4] = pvs-draw-plane; const unsigned pos = draw_current_shader_position_output(pvs-draw); const unsigned cv = draw_current_shader_clipvertex_output(pvs-draw); @@ -44,6 +42,9 @@ static boolean TAG(do_cliptest)( struct pt_post_vs *pvs, unsigned j; unsigned i; bool have_cd = false; + unsigned viewport_index_output = + draw_current_shader_viewport_index_output(pvs-draw); + cd[0] = draw_current_shader_clipdistance_output(pvs-draw, 0); cd[1] = draw_current_shader_clipdistance_output(pvs-draw, 1); @@ -52,7 +53,12 @@ static boolean TAG(do_cliptest)( struct pt_post_vs *pvs, for (j = 0; j info-count; j++) { float *position = out-data[pos]; + int viewport_index = + draw_current_shader_uses_viewport_index(pvs-draw) ? 
+ *((unsigned*)out-data[viewport_index_output]): 0; unsigned mask = 0x0; + const float *scale = pvs-draw-viewports[viewport_index].scale; + const float *trans = pvs-draw-viewports[viewport_index].translate; initialize_vertex_header(out); diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c index b555c65..4250f10 100644 --- a/src/gallium/auxiliary/draw/draw_context.c +++ b/src/gallium/auxiliary/draw/draw_context.c @@ -318,17 +318,24 @@ void draw_set_viewport_states( struct draw_context *draw, { const struct pipe_viewport_state *viewport = vps; draw_do_flush(draw, DRAW_FLUSH_PARAMETER_CHANGE); - draw-viewport = *viewport; /* struct copy */ - draw-identity_viewport = (viewport-scale[0] == 1.0f - viewport-scale[1] == 1.0f - viewport-scale[2] == 1.0f - viewport-scale[3] == 1.0f - viewport-translate[0] == 0.0f - viewport-translate[1] == 0.0f - viewport-translate[2] == 0.0f - viewport-translate[3] == 0.0f); - draw_vs_set_viewport( draw, viewport ); + if (start_slot PIPE_MAX_VIEWPORTS) + return; + + if ((start_slot + num_viewports) PIPE_MAX_VIEWPORTS) + num_viewports = PIPE_MAX_VIEWPORTS - start_slot; + + memcpy(draw-viewports + start_slot, vps, + sizeof(struct pipe_viewport_state) * num_viewports); + draw-identity_viewport = (num_viewports == 1) + (viewport-scale[0] == 1.0f + viewport-scale[1] == 1.0f + viewport-scale[2] == 1.0f + viewport-scale[3] == 1.0f + viewport-translate[0] == 0.0f + viewport-translate[1] == 0.0f + viewport-translate[2] == 0.0f + viewport-translate[3] == 0.0f); } @@ -695,6 +702,31 @@ draw_current_shader_position_output(const struct draw_context *draw) /** * Return the index of the shader output which will contain the + * viewport index. 
+ */ +uint +draw_current_shader_viewport_index_output(const struct draw_context *draw) +{ + if (draw-gs.geometry_shader) + return draw-gs.geometry_shader-viewport_index_output; + return 0; +} + +/** + * Returns true if there's a geometry shader bound and the geometry + * shader writes out a viewport index. + */ +boolean +draw_current_shader_uses_viewport_index(const struct draw_context *draw) +{ + if (draw-gs.geometry_shader) + return draw-gs.geometry_shader-info.writes_viewport_index; + return FALSE; +} + + +/** + * Return the index of the shader output which will contain the * vertex position. */ uint diff --git a/src/gallium/auxiliary/draw/draw_gs.c
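The per-vertex viewport selection done in the clip test of this patch can be sketched in isolation. Everything named here (`struct viewport`, `select_viewport`, `viewport_transform`, the limit of 16) is hypothetical, and the fallback to viewport 0 for an out-of-range index is an addition of this sketch, not something the quoted hunk does.

```c
#include <assert.h>

#define MAX_VIEWPORTS 16   /* assumed limit for this sketch */

/* Hypothetical viewport: window = ndc * scale + translate, per component. */
struct viewport {
   float scale[4];
   float translate[4];
};

/* Pick the viewport for a vertex. Out-of-range indices fall back to
 * viewport 0 (a safety choice made for this sketch only). */
static const struct viewport *
select_viewport(const struct viewport *vps, unsigned count, unsigned index)
{
   assert(count >= 1 && count <= MAX_VIEWPORTS);
   return &vps[index < count ? index : 0];
}

/* Perspective-divide a clip-space position and apply the viewport, in
 * place. pos[3] ends up holding 1/w. */
static void
viewport_transform(float pos[4], const struct viewport *vp)
{
   float w = 1.0f / pos[3];

   pos[0] = pos[0] * w * vp->scale[0] + vp->translate[0];
   pos[1] = pos[1] * w * vp->scale[1] + vp->translate[1];
   pos[2] = pos[2] * w * vp->scale[2] + vp->translate[2];
   pos[3] = w;
}
```

The point of the design is that scale and translate are looked up inside the vertex loop, since each vertex emitted by a geometry shader may name a different viewport.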
[Mesa-dev] [PATCH 3/4] llvmpipe: implement support for multiple viewports
Largely related to making sure the rasterizer can correctly pick out the correct scissor box for the current viewport. Signed-off-by: Zack Rusin za...@vmware.com --- src/gallium/drivers/llvmpipe/lp_context.h |7 -- src/gallium/drivers/llvmpipe/lp_screen.c|2 +- src/gallium/drivers/llvmpipe/lp_setup.c | 29 ++- src/gallium/drivers/llvmpipe/lp_setup.h |4 ++-- src/gallium/drivers/llvmpipe/lp_setup_context.h |8 --- src/gallium/drivers/llvmpipe/lp_setup_line.c| 12 +++--- src/gallium/drivers/llvmpipe/lp_setup_point.c | 12 ++ src/gallium/drivers/llvmpipe/lp_setup_tri.c | 17 + src/gallium/drivers/llvmpipe/lp_state_clip.c|6 +++-- src/gallium/drivers/llvmpipe/lp_state_derived.c | 14 ++- src/gallium/drivers/llvmpipe/lp_surface.c |4 ++-- 11 files changed, 79 insertions(+), 36 deletions(-) diff --git a/src/gallium/drivers/llvmpipe/lp_context.h b/src/gallium/drivers/llvmpipe/lp_context.h index d605dba..54f3830 100644 --- a/src/gallium/drivers/llvmpipe/lp_context.h +++ b/src/gallium/drivers/llvmpipe/lp_context.h @@ -75,10 +75,10 @@ struct llvmpipe_context { struct pipe_constant_buffer constants[PIPE_SHADER_TYPES][LP_MAX_TGSI_CONST_BUFFERS]; struct pipe_framebuffer_state framebuffer; struct pipe_poly_stipple poly_stipple; - struct pipe_scissor_state scissor; + struct pipe_scissor_state scissors[PIPE_MAX_VIEWPORTS]; struct pipe_sampler_view *sampler_views[PIPE_SHADER_TYPES][PIPE_MAX_SHADER_SAMPLER_VIEWS]; - struct pipe_viewport_state viewport; + struct pipe_viewport_state viewports[PIPE_MAX_VIEWPORTS]; struct pipe_vertex_buffer vertex_buffer[PIPE_MAX_ATTRIBS]; struct pipe_index_buffer index_buffer; struct pipe_resource *mapped_vs_tex[PIPE_MAX_SHADER_SAMPLER_VIEWS]; @@ -116,6 +116,9 @@ struct llvmpipe_context { /** Which vertex shader output slot contains point size */ int psize_slot; + /** Which vertex shader output slot contains viewport index */ + int viewport_index_slot; + /** minimum resolvable depth value, for polygon offset */ double mrd; diff --git 
a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c index 35630b9..562fb51 100644 --- a/src/gallium/drivers/llvmpipe/lp_screen.c +++ b/src/gallium/drivers/llvmpipe/lp_screen.c @@ -231,7 +231,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param) case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER: return 0; case PIPE_CAP_MAX_VIEWPORTS: - return 1; + return PIPE_MAX_VIEWPORTS; } /* should only get here on unhandled cases */ debug_printf(Unexpected PIPE_CAP %d query\n, param); diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c b/src/gallium/drivers/llvmpipe/lp_setup.c index 9fef34e..c8b2767 100644 --- a/src/gallium/drivers/llvmpipe/lp_setup.c +++ b/src/gallium/drivers/llvmpipe/lp_setup.c @@ -616,17 +616,20 @@ lp_setup_set_blend_color( struct lp_setup_context *setup, void -lp_setup_set_scissor( struct lp_setup_context *setup, - const struct pipe_scissor_state *scissor ) +lp_setup_set_scissors( struct lp_setup_context *setup, + const struct pipe_scissor_state *scissors ) { + unsigned i; LP_DBG(DEBUG_SETUP, %s\n, __FUNCTION__); - assert(scissor); + assert(scissors); - setup-scissor.x0 = scissor-minx; - setup-scissor.x1 = scissor-maxx-1; - setup-scissor.y0 = scissor-miny; - setup-scissor.y1 = scissor-maxy-1; + for (i = 0; i PIPE_MAX_VIEWPORTS; ++i) { + setup-scissors[i].x0 = scissors[i].minx; + setup-scissors[i].x1 = scissors[i].maxx-1; + setup-scissors[i].y0 = scissors[i].miny; + setup-scissors[i].y1 = scissors[i].maxy-1; + } setup-dirty |= LP_SETUP_NEW_SCISSOR; } @@ -1012,10 +1015,13 @@ try_update_scene_state( struct lp_setup_context *setup ) } if (setup-dirty LP_SETUP_NEW_SCISSOR) { - setup-draw_region = setup-framebuffer; - if (setup-scissor_test) { - u_rect_possible_intersection(setup-scissor, - setup-draw_region); + unsigned i; + for (i = 0; i PIPE_MAX_VIEWPORTS; ++i) { + setup-draw_regions[i] = setup-framebuffer; + if (setup-scissor_test) { +u_rect_possible_intersection(setup-scissors[i], + setup-draw_regions[i]); 
+ } } /* If the framebuffer is large we have to think about fixed-point * integer overflow. For 2K by 2K images, coordinates need 15 bits @@ -1061,6 +1067,7 @@ lp_setup_update_state( struct lp_setup_context *setup, * to know about vertex shader point size attribute. */ setup-psize = lp-psize_slot; + setup-viewport_index_slot = lp-viewport_index_slot; assert(lp-dirty == 0); diff --git a/src/gallium/drivers/llvmpipe/lp_setup.h
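The scissor handling in this patch boils down to two steps per viewport slot: convert the exclusive-max scissor box to an inclusive rectangle, then intersect it with the framebuffer when the scissor test is on. A minimal sketch follows, with hypothetical type and function names rather than the llvmpipe ones.

```c
#include <assert.h>

struct scissor { int minx, miny, maxx, maxy; };  /* max is exclusive */
struct rect    { int x0, y0, x1, y1; };          /* x1/y1 are inclusive */

/* Convert an exclusive-max scissor box to an inclusive rectangle. */
static struct rect
scissor_to_rect(const struct scissor *s)
{
   struct rect r = { s->minx, s->miny, s->maxx - 1, s->maxy - 1 };
   return r;
}

/* Intersect b into a. The result may be empty (x1 < x0 or y1 < y0). */
static void
rect_intersect(struct rect *a, const struct rect *b)
{
   if (b->x0 > a->x0) a->x0 = b->x0;
   if (b->y0 > a->y0) a->y0 = b->y0;
   if (b->x1 < a->x1) a->x1 = b->x1;
   if (b->y1 < a->y1) a->y1 = b->y1;
}

/* One draw region per viewport slot: the framebuffer rectangle,
 * narrowed by that slot's scissor when the scissor test is enabled. */
static void
compute_draw_regions(struct rect *regions, const struct scissor *scissors,
                     unsigned count, struct rect fb, int scissor_test)
{
   unsigned i;

   for (i = 0; i < count; i++) {
      regions[i] = fb;
      if (scissor_test) {
         struct rect s = scissor_to_rect(&scissors[i]);
         rect_intersect(&regions[i], &s);
      }
   }
}
```

The rasterizer can then index the resulting array with the viewport index of the primitive being set up.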
[Mesa-dev] [PATCH 4/4] draw: fixup draw_find_shader_output
draw_find_shader_output, like most of the code in draw, used to depend on position always being at output slot 0, which meant that any other attribute being at 0 could signify an error. Unfortunately position can be at any of the output slots, thus other attributes can occupy slot 0 and we need to mark the ones which were not found by something else. This commit changes draw_find_shader_output so that it returns -1 if it can't find the given attribute, and adjusts the code that depended on it returning > 0 whenever it correctly found an attrib.

Signed-off-by: Zack Rusin za...@vmware.com
---
 src/gallium/auxiliary/draw/draw_context.c       | 4 ++--
 src/gallium/auxiliary/draw/draw_vertex.h        | 2 +-
 src/gallium/drivers/llvmpipe/lp_state_derived.c | 8
 src/gallium/drivers/softpipe/sp_state_derived.c | 2 +-
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c
index 4250f10..91cb136 100644
--- a/src/gallium/auxiliary/draw/draw_context.c
+++ b/src/gallium/auxiliary/draw/draw_context.c
@@ -493,7 +493,7 @@ draw_alloc_extra_vertex_attrib(struct draw_context *draw,
    uint n;
 
    slot = draw_find_shader_output(draw, semantic_name, semantic_index);
-   if (slot > 0) {
+   if (slot >= 0) {
       return slot;
    }
 
@@ -574,7 +574,7 @@ draw_find_shader_output(const struct draw_context *draw,
       }
    }
 
-   return 0;
+   return -1;
 }
 
 
diff --git a/src/gallium/auxiliary/draw/draw_vertex.h b/src/gallium/auxiliary/draw/draw_vertex.h
index c87c3d8..9e10ada 100644
--- a/src/gallium/auxiliary/draw/draw_vertex.h
+++ b/src/gallium/auxiliary/draw/draw_vertex.h
@@ -125,7 +125,7 @@ static INLINE uint
 draw_emit_vertex_attr(struct vertex_info *vinfo,
                       enum attrib_emit emit,
                       enum interp_mode interp, /* only used by softpipe??? */
-                      uint src_index)
+                      int src_index)
 {
    const uint n = vinfo->num_attribs;
    assert(n < Elements(vinfo->attrib));
diff --git a/src/gallium/drivers/llvmpipe/lp_state_derived.c b/src/gallium/drivers/llvmpipe/lp_state_derived.c
index 9c5e847..ea24ffc 100644
--- a/src/gallium/drivers/llvmpipe/lp_state_derived.c
+++ b/src/gallium/drivers/llvmpipe/lp_state_derived.c
@@ -50,7 +50,7 @@ compute_vertex_info(struct llvmpipe_context *llvmpipe)
 {
    const struct lp_fragment_shader *lpfs = llvmpipe->fs;
    struct vertex_info *vinfo = &llvmpipe->vertex_info;
-   unsigned vs_index;
+   int vs_index;
    uint i;
 
    llvmpipe->color_slot[0] = -1;
@@ -99,7 +99,7 @@ compute_vertex_info(struct llvmpipe_context *llvmpipe)
       vs_index = draw_find_shader_output(llvmpipe->draw,
                                          TGSI_SEMANTIC_BCOLOR, i);
 
-      if (vs_index > 0) {
+      if (vs_index >= 0) {
          llvmpipe->bcolor_slot[i] = (int)vinfo->num_attribs;
          draw_emit_vertex_attr(vinfo, EMIT_4F, INTERP_PERSPECTIVE, vs_index);
       }
@@ -111,7 +111,7 @@ compute_vertex_info(struct llvmpipe_context *llvmpipe)
    vs_index = draw_find_shader_output(llvmpipe->draw,
                                       TGSI_SEMANTIC_PSIZE, 0);
 
-   if (vs_index > 0) {
+   if (vs_index >= 0) {
       llvmpipe->psize_slot = vinfo->num_attribs;
       draw_emit_vertex_attr(vinfo, EMIT_4F, INTERP_CONSTANT, vs_index);
    }
@@ -120,7 +120,7 @@ compute_vertex_info(struct llvmpipe_context *llvmpipe)
    vs_index = draw_find_shader_output(llvmpipe->draw,
                                       TGSI_SEMANTIC_VIEWPORT_INDEX,
                                       0);
-   if (vs_index > 0) {
+   if (vs_index >= 0) {
       llvmpipe->viewport_index_slot = vinfo->num_attribs;
       draw_emit_vertex_attr(vinfo, EMIT_4F, INTERP_CONSTANT, vs_index);
    } else {
diff --git a/src/gallium/drivers/softpipe/sp_state_derived.c b/src/gallium/drivers/softpipe/sp_state_derived.c
index 85fd47d..93cd38e 100644
--- a/src/gallium/drivers/softpipe/sp_state_derived.c
+++ b/src/gallium/drivers/softpipe/sp_state_derived.c
@@ -137,7 +137,7 @@ softpipe_get_vertex_info(struct softpipe_context *softpipe)
    softpipe->psize_slot = draw_find_shader_output(softpipe->draw,
                                                   TGSI_SEMANTIC_PSIZE, 0);
 
-   if (softpipe->psize_slot > 0) {
+   if (softpipe->psize_slot >= 0) {
       draw_emit_vertex_attr(vinfo, EMIT_4F, INTERP_CONSTANT, softpipe->psize_slot);
    }
 
-- 
1.7.10.4
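The reason for the -1 sentinel is easiest to see in a toy lookup where slot 0 is a legitimate answer. The table layout and names below are hypothetical and much simpler than the real shader info; only the return convention mirrors the patch.

```c
#include <assert.h>

/* Hypothetical semantic names and output table entry for this sketch. */
enum semantic { SEM_POSITION, SEM_PSIZE, SEM_VIEWPORT_INDEX };

struct shader_output {
   enum semantic name;
   unsigned index;
};

/* Return the slot of the matching output, or -1 if the shader does not
 * write it. Slot 0 is a valid result, so 0 cannot double as "not found". */
static int
find_shader_output(const struct shader_output *outputs, unsigned count,
                   enum semantic name, unsigned index)
{
   unsigned i;

   for (i = 0; i < count; i++) {
      if (outputs[i].name == name && outputs[i].index == index)
         return (int)i;
   }
   return -1;
}
```

Callers then test `slot >= 0` rather than `slot > 0`, which is exactly the change made at each call site in this patch.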
[Mesa-dev] intel driver blit rework
Here's a big rework of blitting. It's a follow-on to some of the work I started back in February to make struct intel_region die in a fire. It's not a reduction in code like I hoped, but it removes a lot of bugs and it should make it much easier to extend our driver to support fast color clears (and possibly zero-copy PBOs again).

The code is available on the mtblit branch of git://people.freedesktop.org/~anholt/mesa
[Mesa-dev] [PATCH 03/17] intel: Rename intel_renderbuffer_tile_offsets.
This makes it more consistent with intel_miptree_get_tile_offsets().
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 4 ++--
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 2 +-
 src/mesa/drivers/dri/intel/intel_fbo.h            | 6 +++---
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 2022159..f73ea20 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -1329,7 +1329,7 @@ brw_update_renderbuffer_surface(struct brw_context *brw,
    gl_format rb_format = _mesa_get_render_format(ctx, intel_rb_format(irb));
 
    if (rb->TexImage && !brw->has_surface_tile_offset) {
-      intel_renderbuffer_tile_offsets(irb, &tile_x, &tile_y);
+      intel_renderbuffer_get_tile_offsets(irb, &tile_x, &tile_y);
 
       if (tile_x != 0 || tile_y != 0) {
          /* Original gen4 hardware couldn't draw to a non-tile-aligned
@@ -1358,7 +1358,7 @@ brw_update_renderbuffer_surface(struct brw_context *brw,
               format << BRW_SURFACE_FORMAT_SHIFT);
 
    /* reloc */
-   surf[1] = (intel_renderbuffer_tile_offsets(irb, &tile_x, &tile_y) +
+   surf[1] = (intel_renderbuffer_get_tile_offsets(irb, &tile_x, &tile_y) +
               region->bo->offset);
 
    surf[2] = ((rb->Width - 1) << BRW_SURFACE_WIDTH_SHIFT |
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index c23a8be..0376705 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -560,7 +560,7 @@ gen7_update_renderbuffer_surface(struct brw_context *brw,
    surf[0] |= GEN7_SURFACE_HALIGN_8;
 
    /* reloc */
-   surf[1] = intel_renderbuffer_tile_offsets(irb, &tile_x, &tile_y) +
+   surf[1] = intel_renderbuffer_get_tile_offsets(irb, &tile_x, &tile_y) +
              region->bo->offset; /* reloc */
 
    assert(brw->has_surface_tile_offset);
diff --git a/src/mesa/drivers/dri/intel/intel_fbo.h b/src/mesa/drivers/dri/intel/intel_fbo.h
index 5d6dc7e..e1b4df5 100644
--- a/src/mesa/drivers/dri/intel/intel_fbo.h
+++ b/src/mesa/drivers/dri/intel/intel_fbo.h
@@ -150,9 +150,9 @@ void
 intel_renderbuffer_set_draw_offset(struct intel_renderbuffer *irb);
 
 static inline uint32_t
-intel_renderbuffer_tile_offsets(struct intel_renderbuffer *irb,
-                                uint32_t *tile_x,
-                                uint32_t *tile_y)
+intel_renderbuffer_get_tile_offsets(struct intel_renderbuffer *irb,
+                                    uint32_t *tile_x,
+                                    uint32_t *tile_y)
 {
    return intel_miptree_get_tile_offsets(irb->mt, irb->mt_level, irb->mt_layer,
                                          tile_x, tile_y);
-- 
1.8.3.rc0
[Mesa-dev] [PATCH 04/17] intel: Make a wrapper for intelEmitCopyBlit using miptrees.
I had previously asserted that it was hard to write a useful, simpler blit function, but I think this might be it. This has the side effect of extending the 32k pitch check to a few more places that were missing it. --- src/mesa/drivers/dri/intel/intel_blit.c| 91 ++ src/mesa/drivers/dri/intel/intel_blit.h| 10 +++ src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 15 ++--- src/mesa/drivers/dri/intel/intel_pixel_copy.c | 42 +++- src/mesa/drivers/dri/intel/intel_tex_copy.c| 80 -- 5 files changed, 127 insertions(+), 111 deletions(-) diff --git a/src/mesa/drivers/dri/intel/intel_blit.c b/src/mesa/drivers/dri/intel/intel_blit.c index f9cba85..007f900 100644 --- a/src/mesa/drivers/dri/intel/intel_blit.c +++ b/src/mesa/drivers/dri/intel/intel_blit.c @@ -85,6 +85,97 @@ br13_for_cpp(int cpp) } } +/** + * Implements a rectangular block transfer (blit) of pixels between two + * miptrees. + * + * Our blitter can operate on 1, 2, or 4-byte-per-pixel data, with generous, + * but limited, pitches and sizes allowed. + * + * The src/dst coordinates are relative to the given level/slice of the + * miptree. + * + * If @src_flip or @dst_flip is set, then the rectangle within that miptree + * will be inverted (including scanline order) when copying. This is common + * in GL when copying between window system and user-created + * renderbuffers/textures. + */ +bool +intel_miptree_blit(struct intel_context *intel, + struct intel_mipmap_tree *src_mt, + int src_level, int src_slice, + uint32_t src_x, uint32_t src_y, bool src_flip, + struct intel_mipmap_tree *dst_mt, + int dst_level, int dst_slice, + uint32_t dst_x, uint32_t dst_y, bool dst_flip, + uint32_t width, uint32_t height, + GLenum logicop) +{ + /* We don't assert on format because we may blit from ARGB to XRGB, +* for example. 
+*/ + assert(src_mt-cpp == dst_mt-cpp); + + /* According to the Ivy Bridge PRM, Vol1 Part4, section 1.2.1.2 (Graphics +* Data Size Limitations): +* +*The BLT engine is capable of transferring very large quantities of +*graphics data. Any graphics data read from and written to the +*destination is permitted to represent a number of pixels that +*occupies up to 65,536 scan lines and up to 32,768 bytes per scan line +*at the destination. The maximum number of pixels that may be +*represented per scan line’s worth of graphics data depends on the +*color depth. +* +* Furthermore, intel_miptree_blit (which is called below) uses a signed +* 16-bit integer to represent buffer pitch, so it can only handle buffer +* pitches 32k. +* +* As a result of these two limitations, we can only use the blitter to do +* this copy when the region's pitch is less than 32k. +*/ + if (src_mt-region-pitch 32768 || + dst_mt-region-pitch 32768) { + perf_debug(Falling back due to 32k pitch\n); + return false; + } + + if (src_flip) + src_y = src_mt-level[src_level].height - src_y - height; + + if (dst_flip) + dst_y = dst_mt-level[dst_level].height - dst_y - height; + + int src_pitch = src_mt-region-pitch; + if (src_flip != dst_flip) + src_pitch = -src_pitch; + + uint32_t src_image_x, src_image_y; + intel_miptree_get_image_offset(src_mt, src_level, src_slice, + src_image_x, src_image_y); + src_x += src_image_x; + src_y += src_image_y; + + uint32_t dst_image_x, dst_image_y; + intel_miptree_get_image_offset(dst_mt, dst_level, dst_slice, + dst_image_x, dst_image_y); + dst_x += dst_image_x; + dst_y += dst_image_y; + + return intelEmitCopyBlit(intel, +src_mt-cpp, +src_pitch, +src_mt-region-bo, src_mt-offset, +src_mt-region-tiling, +dst_mt-region-pitch, +dst_mt-region-bo, dst_mt-offset, +dst_mt-region-tiling, +src_x, src_y, +dst_x, dst_y, +width, height, +logicop); +} + /* Copy BitBlt */ bool diff --git a/src/mesa/drivers/dri/intel/intel_blit.h b/src/mesa/drivers/dri/intel/intel_blit.h index 
d195e6b..9bfe91d 100644 --- a/src/mesa/drivers/dri/intel/intel_blit.h +++ b/src/mesa/drivers/dri/intel/intel_blit.h @@ -51,6 +51,16 @@ intelEmitCopyBlit(struct intel_context *intel, GLshort w, GLshort h, GLenum logicop ); +bool intel_miptree_blit(struct intel_context *intel, +struct intel_mipmap_tree *src_mt, +int
[Mesa-dev] [PATCH 01/17] intel: Make intel_miptree_get_tile_offsets return a page offset.
Right now, the callers in i965 don't expect a nonzero page offset to actually occur (since that's being handled elsewhere), but it seems like a trap to leave it this way. --- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 6 +++--- src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 7 --- src/mesa/drivers/dri/intel/intel_mipmap_tree.c| 21 ++--- src/mesa/drivers/dri/intel/intel_mipmap_tree.h| 2 +- 4 files changed, 26 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index bbe8579..2022159 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -986,6 +986,8 @@ brw_update_texture_surface(struct gl_context *ctx, BRW_SURFACE_FORMAT_SHIFT)); surf[1] = intelObj-mt-region-bo-offset + intelObj-mt-offset; /* reloc */ + surf[1] += intel_miptree_get_tile_offsets(intelObj-mt, firstImage-Level, 0, + tile_x, tile_y); surf[2] = ((intelObj-_MaxLevel - tObj-BaseLevel) BRW_SURFACE_LOD_SHIFT | (width - 1) BRW_SURFACE_WIDTH_SHIFT | @@ -998,8 +1000,6 @@ brw_update_texture_surface(struct gl_context *ctx, surf[4] = brw_get_surface_num_multisamples(intelObj-mt-num_samples); - intel_miptree_get_tile_offsets(intelObj-mt, firstImage-Level, 0, - tile_x, tile_y); assert(brw-has_surface_tile_offset || (tile_x == 0 tile_y == 0)); /* Note that the low bits of these fields are missing, so * there's the possibility of getting in trouble. 
@@ -1014,7 +1014,7 @@ brw_update_texture_surface(struct gl_context *ctx, drm_intel_bo_emit_reloc(brw-intel.batch.bo, binding_table[surf_index] + 4, intelObj-mt-region-bo, - intelObj-mt-offset, + surf[1] - intelObj-mt-region-bo-offset, I915_GEM_DOMAIN_SAMPLER, 0); } diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c index 435f9dc..c23a8be 100644 --- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c @@ -331,6 +331,8 @@ gen7_update_texture_surface(struct gl_context *ctx, surf[0] |= GEN7_SURFACE_ARYSPC_LOD0; surf[1] = mt-region-bo-offset + mt-offset; /* reloc */ + surf[1] += intel_miptree_get_tile_offsets(intelObj-mt, firstImage-Level, 0, + tile_x, tile_y); surf[2] = SET_FIELD(width - 1, GEN7_SURFACE_WIDTH) | SET_FIELD(height - 1, GEN7_SURFACE_HEIGHT); @@ -339,8 +341,6 @@ gen7_update_texture_surface(struct gl_context *ctx, surf[4] = gen7_surface_msaa_bits(mt-num_samples, mt-msaa_layout); - intel_miptree_get_tile_offsets(intelObj-mt, firstImage-Level, 0, - tile_x, tile_y); assert(brw-has_surface_tile_offset || (tile_x == 0 tile_y == 0)); /* Note that the low bits of these fields are missing, so * there's the possibility of getting in trouble. 
@@ -372,7 +372,8 @@ gen7_update_texture_surface(struct gl_context *ctx, /* Emit relocation to surface contents */ drm_intel_bo_emit_reloc(brw-intel.batch.bo, binding_table[surf_index] + 4, - intelObj-mt-region-bo, intelObj-mt-offset, + intelObj-mt-region-bo, + surf[1] - intelObj-mt-region-bo-offset, I915_GEM_DOMAIN_SAMPLER, 0); gen7_check_surface_setup(surf, false /* is_render_target */); diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c index d967b19..0278799 100644 --- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c @@ -777,19 +777,34 @@ intel_miptree_get_image_offset(struct intel_mipmap_tree *mt, *y = mt-level[level].slice[slice].y_offset; } -void +/** + * Rendering with tiled buffers requires that the base address of the buffer + * be aligned to a page boundary. For renderbuffers, and sometimes with + * textures, we may want the surface to point at a texture image level that + * isn't at a page boundary. + * + * This function returns an appropriately-aligned base offset + * according to the tiling restrictions, plus any required x/y offset + * from there. + */ +uint32_t intel_miptree_get_tile_offsets(struct intel_mipmap_tree *mt, GLuint level, GLuint slice, uint32_t *tile_x, uint32_t *tile_y) { struct intel_region *region = mt-region; + uint32_t x, y; uint32_t mask_x, mask_y; intel_region_get_tile_masks(region, mask_x, mask_y, false); +
[Mesa-dev] [PATCH 02/17] intel: Reduce intel_renderbuffer_tile_offsets to a thin wrapper.
---
 src/mesa/drivers/dri/intel/intel_fbo.c | 26 --------------------------
 src/mesa/drivers/dri/intel/intel_fbo.h |  9 +++++++--
 2 files changed, 7 insertions(+), 28 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c b/src/mesa/drivers/dri/intel/intel_fbo.c
index 69f8629..34f31fb 100644
--- a/src/mesa/drivers/dri/intel/intel_fbo.c
+++ b/src/mesa/drivers/dri/intel/intel_fbo.c
@@ -535,32 +535,6 @@ intel_renderbuffer_set_draw_offset(struct intel_renderbuffer *irb)
 }
 
 /**
- * Rendering to tiled buffers requires that the base address of the
- * buffer be aligned to a page boundary.  We generally render to
- * textures by pointing the surface at the mipmap image level, which
- * may not be aligned to a tile boundary.
- *
- * This function returns an appropriately-aligned base offset
- * according to the tiling restrictions, plus any required x/y offset
- * from there.
- */
-uint32_t
-intel_renderbuffer_tile_offsets(struct intel_renderbuffer *irb,
-                                uint32_t *tile_x,
-                                uint32_t *tile_y)
-{
-   struct intel_region *region = irb->mt->region;
-   uint32_t mask_x, mask_y;
-
-   intel_region_get_tile_masks(region, &mask_x, &mask_y, false);
-
-   *tile_x = irb->draw_x & mask_x;
-   *tile_y = irb->draw_y & mask_y;
-   return intel_region_get_aligned_offset(region, irb->draw_x & ~mask_x,
-                                          irb->draw_y & ~mask_y, false);
-}
-
-/**
  * Called by glFramebufferTexture[123]DEXT() (and other places) to
  * prepare for rendering into texture memory.  This might be called
  * many times to choose different texture levels, cube faces, etc
diff --git a/src/mesa/drivers/dri/intel/intel_fbo.h b/src/mesa/drivers/dri/intel/intel_fbo.h
index aa52b97..5d6dc7e 100644
--- a/src/mesa/drivers/dri/intel/intel_fbo.h
+++ b/src/mesa/drivers/dri/intel/intel_fbo.h
@@ -33,6 +33,7 @@
 #include "main/formats.h"
 #include "main/macros.h"
 #include "intel_context.h"
+#include "intel_mipmap_tree.h"
 #include "intel_screen.h"
 
 #ifdef __cplusplus
@@ -148,10 +149,14 @@ intel_flip_renderbuffers(struct gl_framebuffer *fb);
 void
 intel_renderbuffer_set_draw_offset(struct intel_renderbuffer *irb);
 
-uint32_t
+static inline uint32_t
 intel_renderbuffer_tile_offsets(struct intel_renderbuffer *irb,
                                 uint32_t *tile_x,
-                                uint32_t *tile_y);
+                                uint32_t *tile_y)
+{
+   return intel_miptree_get_tile_offsets(irb->mt, irb->mt_level, irb->mt_layer,
+                                         tile_x, tile_y);
+}
 
 struct intel_region*
 intel_get_rb_region(struct gl_framebuffer *fb, GLuint attIndex);
-- 
1.8.3.rc0
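For anyone reading along, the mask arithmetic that the removed intel_renderbuffer_tile_offsets() performed can be shown in isolation. This is only a sketch, not the driver code: split_tile_offsets() and struct tile_split are hypothetical names, and the masks used in the note below assume a tile that is 128 pixels wide and 8 rows tall rather than values queried from intel_region_get_tile_masks().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: a tile-aligned origin plus the remainder
 * that still has to be applied as an x/y offset inside that tile.
 */
struct tile_split {
   uint32_t base_x, base_y;   /* tile-aligned part of the coordinate */
   uint32_t tile_x, tile_y;   /* intra-tile remainder */
};

/* mask_x and mask_y are assumed to be (tile size - 1), i.e. of the
 * form 2^n - 1, which is what makes the AND / AND-NOT split valid.
 */
static struct tile_split
split_tile_offsets(uint32_t x, uint32_t y, uint32_t mask_x, uint32_t mask_y)
{
   struct tile_split s;

   assert((mask_x & (mask_x + 1)) == 0);
   assert((mask_y & (mask_y + 1)) == 0);

   s.tile_x = x & mask_x;     /* low bits: offset within the tile */
   s.tile_y = y & mask_y;
   s.base_x = x & ~mask_x;    /* high bits: start of the containing tile */
   s.base_y = y & ~mask_y;
   return s;
}
```

With masks 127 and 7, the point (130, 10) splits into a tile origin of (128, 8) and an intra-tile offset of (2, 2). The driver then turns the aligned origin into a byte offset, which this sketch leaves out.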
[Mesa-dev] [PATCH 05/17] i965: Consistently do depth resolves before blitting.
We were protected for a long time by the fact that depth was Y tiled and you
couldn't blit Y.  Now that we can blit Y, we were failing to resolve depth in
glCopyPixels().

Note in the comment about swrast, that the swrast map path does resolves
appropriately already.
---
 src/mesa/drivers/dri/intel/intel_blit.c        | 6 ++++++
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 6 ------
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_blit.c b/src/mesa/drivers/dri/intel/intel_blit.c
index 007f900..ddb9edb 100644
--- a/src/mesa/drivers/dri/intel/intel_blit.c
+++ b/src/mesa/drivers/dri/intel/intel_blit.c
@@ -140,6 +140,12 @@ intel_miptree_blit(struct intel_context *intel,
       return false;
    }
 
+   /* The blitter has no idea about HiZ, so we need to get the real depth
+    * data into the two miptrees before we do anything.
+    */
+   intel_miptree_slice_resolve_depth(intel, src_mt, src_level, src_slice);
+   intel_miptree_slice_resolve_depth(intel, dst_mt, dst_level, dst_slice);
+
    if (src_flip)
       src_y = src_mt->level[src_level].height - src_y - height;
 
diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index eedf80c..c3e55f4 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -919,12 +919,6 @@ intel_miptree_copy_slice(struct intel_context *intel,
        dst_mt, dst_x, dst_y, dst_mt->region->pitch,
        width, height);
 
-   /* Since we are about to copy depth data using either the blitter or swrast
-    * (neither of which respect HiZ), we need to do a depth resolve first.
-    */
-   intel_miptree_slice_resolve_depth(intel, src_mt, level, slice);
-   intel_miptree_slice_resolve_depth(intel, dst_mt, level, slice);
-
    if (!intel_miptree_blit(intel,
                            src_mt, level, slice, 0, 0, false,
                            dst_mt, level, slice, 0, 0, false,
-- 
1.8.3.rc0
[Mesa-dev] [PATCH 06/17] i965: Prefer blorp glBlitFramebuffer() to the glCopyTexSubImage-based blit.
I think we've measured no performance difference from this in the past,
except that the blorp code can do things like multisample resolves.  Prevents
piglit regression in the next commit when a testcase started trying to do a
multisampled resolve through the old glCopyTexSubImage() path.
---
 src/mesa/drivers/dri/intel/intel_fbo.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c b/src/mesa/drivers/dri/intel/intel_fbo.c
index 34f31fb..05ff784 100644
--- a/src/mesa/drivers/dri/intel/intel_fbo.c
+++ b/src/mesa/drivers/dri/intel/intel_fbo.c
@@ -816,14 +816,6 @@ intel_blit_framebuffer(struct gl_context *ctx,
                        GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1,
                        GLbitfield mask, GLenum filter)
 {
-   /* Try faster, glCopyTexSubImage2D approach first which uses the BLT. */
-   mask = intel_blit_framebuffer_copy_tex_sub_image(ctx,
-                                                    srcX0, srcY0, srcX1, srcY1,
-                                                    dstX0, dstY0, dstX1, dstY1,
-                                                    mask, filter);
-   if (mask == 0x0)
-      return;
-
 #ifndef I915
    mask = brw_blorp_framebuffer(intel_context(ctx),
                                 srcX0, srcY0, srcX1, srcY1,
@@ -833,6 +825,15 @@ intel_blit_framebuffer(struct gl_context *ctx,
       return;
 #endif
 
+   /* Try glCopyTexSubImage2D approach which uses the BLT. */
+   mask = intel_blit_framebuffer_copy_tex_sub_image(ctx,
+                                                    srcX0, srcY0, srcX1, srcY1,
+                                                    dstX0, dstY0, dstX1, dstY1,
+                                                    mask, filter);
+   if (mask == 0x0)
+      return;
+
+
    _mesa_meta_BlitFramebuffer(ctx,
                               srcX0, srcY0, srcX1, srcY1,
                               dstX0, dstY0, dstX1, dstY1,
-- 
1.8.3.rc0
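The reordering above relies on each path handing back the buffer bits it did not handle, with the caller stopping once the mask reaches zero. A minimal sketch of that pattern follows; run_blit_paths(), the two fake_* paths and the BUF_* bits are all hypothetical stand-ins for illustration, not Mesa functions or GL enums.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer bits standing in for the GL buffer mask. */
#define BUF_COLOR   0x1u
#define BUF_DEPTH   0x2u
#define BUF_STENCIL 0x4u

/* Each path takes the mask of buffers still to be blitted and returns
 * the bits it could NOT handle.
 */
typedef uint32_t (*blit_path_fn)(uint32_t mask);

/* Stand-in paths: one only handles color, the other only depth. */
static uint32_t fake_blorp_path(uint32_t mask) { return mask & ~BUF_COLOR; }
static uint32_t fake_blt_path(uint32_t mask)   { return mask & ~BUF_DEPTH; }

/* Try the paths in priority order; stop as soon as nothing is left.
 * Whatever remains is what a final fallback would have to deal with.
 */
static uint32_t
run_blit_paths(const blit_path_fn *paths, size_t num_paths, uint32_t mask)
{
   assert(paths != NULL || num_paths == 0);

   for (size_t i = 0; i < num_paths && mask != 0; i++)
      mask = paths[i](mask);

   return mask;
}
```

Swapping two entries in the paths array is then the whole change in priority, which is roughly what moving the glCopyTexSubImage2D attempt below the blorp call does here.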
[Mesa-dev] [PATCH 07/17] i965: Allow glCopyTexSubImage() on depth textures.
If the hw is pre-gen5 and can't blit depth, it'll cleanly error out.
---
 src/mesa/drivers/dri/intel/intel_tex_copy.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_tex_copy.c b/src/mesa/drivers/dri/intel/intel_tex_copy.c
index 7a38082..94e90da 100644
--- a/src/mesa/drivers/dri/intel/intel_tex_copy.c
+++ b/src/mesa/drivers/dri/intel/intel_tex_copy.c
@@ -96,11 +96,6 @@ intel_copy_texsubimage(struct intel_context *intel,
       return false;
    }
 
-   /* The blitter can't handle Y-tiled buffers. */
-   if (intelImage->mt->region->tiling == I915_TILING_Y) {
-      return false;
-   }
-
    /* blit from src buffer to texture */
    if (!intel_miptree_blit(intel,
                            irb->mt, irb->mt_level, irb->mt_layer,
-- 
1.8.3.rc0
[Mesa-dev] [PATCH 13/17] intel: Rebuild PBO blit glReadPixels() on top of miptrees.
The previous code was missing depth resolves, that had only been prevented due to no blitting of Y tiling. The pair of flip args in the new blit function means that we can just drop the pack-Invert fallback. --- src/mesa/drivers/dri/intel/intel_pixel_read.c | 48 +-- 1 file changed, 23 insertions(+), 25 deletions(-) diff --git a/src/mesa/drivers/dri/intel/intel_pixel_read.c b/src/mesa/drivers/dri/intel/intel_pixel_read.c index ebdc528..26eb496 100644 --- a/src/mesa/drivers/dri/intel/intel_pixel_read.c +++ b/src/mesa/drivers/dri/intel/intel_pixel_read.c @@ -76,7 +76,6 @@ do_blit_readpixels(struct gl_context * ctx, const struct gl_pixelstore_attrib *pack, GLvoid * pixels) { struct intel_context *intel = intel_context(ctx); - struct intel_region *src = intel_readbuf_region(intel); struct intel_buffer_object *dst = intel_buffer_object(pack-BufferObj); GLuint dst_offset; drm_intel_bo *dst_buffer; @@ -86,9 +85,6 @@ do_blit_readpixels(struct gl_context * ctx, DBG(%s\n, __FUNCTION__); - if (!src) - return false; - assert(_mesa_is_bufferobj(pack-BufferObj)); struct gl_renderbuffer *rb = ctx-ReadBuffer-_ColorReadBuffer; @@ -107,13 +103,13 @@ do_blit_readpixels(struct gl_context * ctx, } int dst_stride = _mesa_image_row_stride(pack, width, format, type); + bool dst_flip = false; + /* Mesa flips the dst_stride for pack-Invert, but we want our mt to have a +* normal dst_stride. +*/ if (pack-Invert) { - DBG(%s: MESA_PACK_INVERT not done yet\n, __FUNCTION__); - return false; - } - else { - if (_mesa_is_winsys_fbo(ctx-ReadBuffer)) -dst_stride = -dst_stride; + dst_stride = -dst_stride; + dst_flip = true; } dst_offset = (GLintptr)pixels; @@ -131,30 +127,32 @@ do_blit_readpixels(struct gl_context * ctx, intel_prepare_render(intel); intel-front_buffer_dirty = dirty; - all = (width * height * src-cpp == dst-Base.Size + all = (width * height * irb-mt-cpp == dst-Base.Size x == 0 dst_offset == 0); - dst_x = 0; - dst_y = 0; - dst_buffer = intel_bufferobj_buffer(intel, dst, all ? 
INTEL_WRITE_FULL : INTEL_WRITE_PART); - if (_mesa_is_winsys_fbo(ctx-ReadBuffer)) - y = ctx-ReadBuffer-Height - (y + height); - - if (!intelEmitCopyBlit(intel, - src-cpp, - src-pitch, src-bo, 0, src-tiling, - dst_stride, dst_buffer, dst_offset, false, - x, y, - dst_x, dst_y, - width, height, - GL_COPY)) { + struct intel_mipmap_tree *pbo_mt = + intel_miptree_create_for_bo(intel, + dst_buffer, + irb-mt-format, + dst_offset, + width, height, + dst_stride, I915_TILING_NONE); + + if (!intel_miptree_blit(intel, + irb-mt, irb-mt_level, irb-mt_layer, + x, y, _mesa_is_winsys_fbo(ctx-ReadBuffer), + pbo_mt, 0, 0, + 0, 0, dst_flip, + width, height, GL_COPY)) { return false; } + intel_miptree_release(pbo_mt); + DBG(%s - DONE\n, __FUNCTION__); return true; -- 1.8.3.rc0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
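The pair of flip arguments mentioned in the commit message comes down to two small pieces of arithmetic inside intel_miptree_blit() (patch 04): mirror the rectangle's y origin within the surface, and negate the source pitch only when exactly one side is flipped. Here is a standalone sketch of just that arithmetic; flip_y() and blit_src_pitch() are hypothetical helper names, not functions in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirror a rectangle's top row within a surface of the given height.
 * A rectangle starting at y with rect_height rows ends up starting at
 * surface_height - y - rect_height.
 */
static uint32_t
flip_y(uint32_t surface_height, uint32_t y, uint32_t rect_height)
{
   assert(y + rect_height <= surface_height);
   return surface_height - y - rect_height;
}

/* If both sides are flipped (or neither is), scanline order already
 * agrees and the pitch stays positive.  If only one side is flipped,
 * the source is walked bottom-up by negating its pitch.
 */
static int
blit_src_pitch(int src_pitch, bool src_flip, bool dst_flip)
{
   return (src_flip != dst_flip) ? -src_pitch : src_pitch;
}
```

For a 100-row surface, a 20-row rectangle at y = 10 lands at y = 70 after flipping. Because the wrapper handles both the window-system flip and the pack->Invert flip this way, the caller can pass dst_flip instead of falling back.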
[Mesa-dev] [PATCH 12/17] intel: Rework intel_miptree_create_for_region() to wrap a BO.
I needed to do this for the PBO blit cases to use intel_miptree_blit().

But this also actually partially fixes a bug in EGLImage handling: We can't
share regions across contexts, because regions have a refcount that isn't
protected by a mutex, and different contexts can be simultaneously accessed
from multiple threads.  Now we just need to get regions out of __DRIImage.

There was also a missing use of image->offset in the EGLImage renderbuffer
storage code.
---
 src/mesa/drivers/dri/intel/intel_fbo.c | 12 +++--
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 65 --
 src/mesa/drivers/dri/intel/intel_mipmap_tree.h | 14 --
 3 files changed, 67 insertions(+), 24 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c b/src/mesa/drivers/dri/intel/intel_fbo.c
index 73ed91d..cbbd31c 100644
--- a/src/mesa/drivers/dri/intel/intel_fbo.c
+++ b/src/mesa/drivers/dri/intel/intel_fbo.c
@@ -293,10 +293,14 @@ intel_image_target_renderbuffer_storage(struct gl_context *ctx,
 
    irb = intel_renderbuffer(rb);
    intel_miptree_release(&irb->mt);
-   irb->mt = intel_miptree_create_for_region(intel,
-                                             GL_TEXTURE_2D,
-                                             image->format,
-                                             image->region);
+   irb->mt = intel_miptree_create_for_bo(intel,
+                                         image->region->bo,
+                                         image->format,
+                                         image->offset,
+                                         image->region->width,
+                                         image->region->height,
+                                         image->region->pitch,
+                                         image->region->tiling);
    if (!irb->mt)
       return;
 
diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index dd0b9ce..443791c 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -124,8 +124,8 @@ compute_msaa_layout(struct intel_context *intel, gl_format format, GLenum target
 
 
 /**
- * @param for_region Indicates that the caller is
- *        intel_miptree_create_for_region(). If true, then do not create
+ * @param for_bo Indicates that the caller is
+ *        intel_miptree_create_for_bo(). If true, then do not create
  *        \c stencil_mt.
*/ struct intel_mipmap_tree * @@ -137,7 +137,7 @@ intel_miptree_create_layout(struct intel_context *intel, GLuint width0, GLuint height0, GLuint depth0, -bool for_region, +bool for_bo, GLuint num_samples) { struct intel_mipmap_tree *mt = calloc(sizeof(*mt), 1); @@ -250,7 +250,7 @@ intel_miptree_create_layout(struct intel_context *intel, mt-physical_height0 = height0; mt-physical_depth0 = depth0; - if (!for_region + if (!for_bo _mesa_get_format_base_format(format) == GL_DEPTH_STENCIL (intel-must_use_separate_stencil || (intel-has_separate_stencil @@ -485,21 +485,50 @@ intel_miptree_create(struct intel_context *intel, } struct intel_mipmap_tree * -intel_miptree_create_for_region(struct intel_context *intel, - GLenum target, - gl_format format, - struct intel_region *region) +intel_miptree_create_for_bo(struct intel_context *intel, +drm_intel_bo *bo, +gl_format format, +uint32_t offset, +uint32_t width, +uint32_t height, +int pitch, +uint32_t tiling) { struct intel_mipmap_tree *mt; - mt = intel_miptree_create_layout(intel, target, format, - 0, 0, - region-width, region-height, 1, - true, 0 /* num_samples */); + struct intel_region *region = calloc(1, sizeof(*region)); + if (!region) + return NULL; + + /* Nothing will be able to use this miptree with the BO if the offset isn't +* aligned. +*/ + if (tiling != I915_TILING_NONE) + assert(offset % 4096 == 0); + + /* miptrees can't handle negative pitch. If you need flipping of images, +* that's outside of the scope of the mt. +*/ + assert(pitch = 0); + + mt = intel_miptree_create_layout(intel, GL_TEXTURE_2D, format, +0, 0, +width, height, 1, +true, 0 /* num_samples */); if (!mt) return mt; - intel_region_reference(mt-region, region); + region-cpp = mt-cpp; + region-width =
[Mesa-dev] [PATCH 08/17] intel: Add an assert for glCopyTexSubImage() being called on MSAA buffers.
This is just in case someone else trips over this due to our weird reuse of
this code in glBlitFramebuffer().
---
 src/mesa/drivers/dri/intel/intel_tex_copy.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/intel/intel_tex_copy.c b/src/mesa/drivers/dri/intel/intel_tex_copy.c
index 94e90da..4a13b9a 100644
--- a/src/mesa/drivers/dri/intel/intel_tex_copy.c
+++ b/src/mesa/drivers/dri/intel/intel_tex_copy.c
@@ -62,6 +62,12 @@ intel_copy_texsubimage(struct intel_context *intel,
 
    intel_prepare_render(intel);
 
+   /* glCopyTexSubImage() can't be called on multisampled renderbuffers or
+    * textures.
+    */
+   assert(!irb->Base.Base.NumSamples);
+   assert(!intelImage->base.Base.NumSamples);
+
    if (!intelImage->mt || !irb || !irb->mt) {
       if (unlikely(INTEL_DEBUG & DEBUG_PERF))
          fprintf(stderr, "%s fail %p %p (0x%08x)\n",
-- 
1.8.3.rc0
[Mesa-dev] [PATCH 14/17] intel: Rebuild PBO blit glTexImage() on top of miptrees.
This will ensure that we have resolves if we ever extend this to glTexSubImage(), and fixes missing image start offset handling. The texture buffer alloc ended up getting moved up, because we want to look at the format of the image's actual mt to see if we'll end up blitting the right thing, in the case of packed depth/stencil uploads. This is the last caller of intelEmitCopyBlit() on a miptree-wrapped BO. --- src/mesa/drivers/dri/intel/intel_tex_image.c | 62 ++-- 1 file changed, 32 insertions(+), 30 deletions(-) diff --git a/src/mesa/drivers/dri/intel/intel_tex_image.c b/src/mesa/drivers/dri/intel/intel_tex_image.c index a3928bb..4ad5ccc 100644 --- a/src/mesa/drivers/dri/intel/intel_tex_image.c +++ b/src/mesa/drivers/dri/intel/intel_tex_image.c @@ -6,6 +6,7 @@ #include main/bufferobj.h #include main/context.h #include main/formats.h +#include main/image.h #include main/pbo.h #include main/renderbuffer.h #include main/texcompress.h @@ -117,9 +118,8 @@ try_pbo_upload(struct gl_context *ctx, struct intel_texture_image *intelImage = intel_texture_image(image); struct intel_context *intel = intel_context(ctx); struct intel_buffer_object *pbo = intel_buffer_object(unpack-BufferObj); - GLuint src_offset, src_stride; - GLuint dst_x, dst_y; - drm_intel_bo *dst_buffer, *src_buffer; + GLuint src_offset; + drm_intel_bo *src_buffer; if (!_mesa_is_bufferobj(unpack-BufferObj)) return false; @@ -132,14 +132,6 @@ try_pbo_upload(struct gl_context *ctx, return false; } - if (!_mesa_format_matches_format_and_type(image-TexFormat, - format, type, false)) { - DBG(%s: format mismatch (upload to %s with format 0x%x, type 0x%x)\n, - __FUNCTION__, _mesa_get_format_name(image-TexFormat), - format, type); - return false; - } - ctx-Driver.AllocTextureImageBuffer(ctx, image); if (!intelImage-mt) { @@ -147,39 +139,49 @@ try_pbo_upload(struct gl_context *ctx, return false; } + if (!_mesa_format_matches_format_and_type(intelImage-mt-format, + format, type, false)) { + DBG(%s: format mismatch 
(upload to %s with format 0x%x, type 0x%x)\n, + __FUNCTION__, _mesa_get_format_name(intelImage-mt-format), + format, type); + return false; + } + if (image-TexObject-Target == GL_TEXTURE_1D_ARRAY || image-TexObject-Target == GL_TEXTURE_2D_ARRAY) { DBG(%s: no support for array textures\n, __FUNCTION__); return false; } - dst_buffer = intelImage-mt-region-bo; src_buffer = intel_bufferobj_source(intel, pbo, 64, src_offset); /* note: potential 64-bit ptr to 32-bit int cast */ src_offset += (GLuint) (unsigned long) pixels; - if (unpack-RowLength 0) - src_stride = unpack-RowLength; - else - src_stride = image-Width; - src_stride *= intelImage-mt-region-cpp; - - intel_miptree_get_image_offset(intelImage-mt, intelImage-base.Base.Level, - intelImage-base.Base.Face, - dst_x, dst_y); - - if (!intelEmitCopyBlit(intel, - intelImage-mt-cpp, - src_stride, src_buffer, - src_offset, false, - intelImage-mt-region-pitch, dst_buffer, 0, - intelImage-mt-region-tiling, - 0, 0, dst_x, dst_y, image-Width, image-Height, - GL_COPY)) { + int src_stride = + _mesa_image_row_stride(unpack, image-Width, format, type); + + struct intel_mipmap_tree *pbo_mt = + intel_miptree_create_for_bo(intel, + src_buffer, + intelImage-mt-format, + src_offset, + image-Width, image-Height, + src_stride, I915_TILING_NONE); + if (!pbo_mt) + return false; + + if (!intel_miptree_blit(intel, + pbo_mt, 0, 0, + 0, 0, false, + intelImage-mt, image-Level, image-Face, + 0, 0, false, + image-Width, image-Height, GL_COPY)) { DBG(%s: blit failed\n, __FUNCTION__); return false; } + intel_miptree_release(pbo_mt); + DBG(%s: success\n, __FUNCTION__); return true; } -- 1.8.3.rc0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
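The stride computation this patch removes from try_pbo_upload() is simple enough to show on its own: take the unpack row length when one is set, otherwise the image width, and scale by bytes per pixel. The sketch below covers only that removed arithmetic under hypothetical names (pbo_row_stride() and its parameters). It does not try to reproduce _mesa_image_row_stride(), which the patch switches to and which is handed the unpack state, format and type rather than a bare cpp.

```c
#include <assert.h>
#include <stdint.h>

/* row_length: the unpack row length in pixels, 0 meaning "not set".
 * width:      the image width in pixels.
 * cpp:        bytes per pixel of the destination miptree.
 */
static uint32_t
pbo_row_stride(uint32_t row_length, uint32_t width, uint32_t cpp)
{
   uint32_t pixels_per_row;

   assert(cpp > 0);

   /* A nonzero row length overrides the image width. */
   pixels_per_row = (row_length > 0) ? row_length : width;

   return pixels_per_row * cpp;
}
```

A 64-pixel-wide image at 4 bytes per pixel gives a stride of 256 bytes, or 400 bytes if the row length is set to 100.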
[Mesa-dev] [PATCH 11/17] intel: Make a temporary miptree for the blit path of miptree mapping.
In a bit of debug code, we no longer have the inter-slice x/y to print. But I think the level/slice is more useful in this case for looking at what's getting mapped, especially given that INTEL_DEBUG=blit will tell you the other value. --- src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 99 +++--- src/mesa/drivers/dri/intel/intel_mipmap_tree.h | 4 +- 2 files changed, 29 insertions(+), 74 deletions(-) diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c index d41fbdf..dd0b9ce 100644 --- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c @@ -1456,57 +1456,39 @@ intel_miptree_map_blit(struct intel_context *intel, struct intel_miptree_map *map, unsigned int level, unsigned int slice) { - unsigned int image_x, image_y; - int x = map-x; - int y = map-y; - int ret; - - /* The blitter requires the pitch to be aligned to 4. */ - map-stride = ALIGN(map-w * mt-region-cpp, 4); - - map-bo = drm_intel_bo_alloc(intel-bufmgr, intel_miptree_map_blit() temp, - map-stride * map-h, 4096); - if (!map-bo) { + map-mt = intel_miptree_create(intel, GL_TEXTURE_2D, mt-format, + 0, 0, + map-w, map-h, 1, + false, 0, + (1 I915_TILING_NONE)); + if (!map-mt) { fprintf(stderr, Failed to allocate blit temporary\n); goto fail; } + map-stride = map-mt-region-pitch; - intel_miptree_get_image_offset(mt, level, slice, image_x, image_y); - x += image_x; - y += image_y; - - if (!intelEmitCopyBlit(intel, - mt-region-cpp, - mt-region-pitch, mt-region-bo, - mt-offset, mt-region-tiling, - map-stride, map-bo, - 0, I915_TILING_NONE, - x, y, - 0, 0, - map-w, map-h, - GL_COPY)) { + if (!intel_miptree_blit(intel, + mt, level, slice, + map-x, map-y, false, + map-mt, 0, 0, + 0, 0, false, + map-w, map-h, GL_COPY)) { fprintf(stderr, Failed to blit\n); goto fail; } intel_batchbuffer_flush(intel); - ret = drm_intel_bo_map(map-bo, (map-mode GL_MAP_WRITE_BIT) != 0); - if (ret) { - fprintf(stderr, Failed to map blit 
temporary\n); - goto fail; - } - - map-ptr = map-bo-virtual; + map-ptr = intel_miptree_map_raw(intel, map-mt); DBG(%s: %d,%d %dx%d from mt %p (%s) %d,%d = %p/%d\n, __FUNCTION__, map-x, map-y, map-w, map-h, mt, _mesa_get_format_name(mt-format), - x, y, map-ptr, map-stride); + level, slice, map-ptr, map-stride); return; fail: - drm_intel_bo_unreference(map-bo); + intel_miptree_release(map-mt); map-ptr = NULL; map-stride = 0; } @@ -1519,30 +1501,20 @@ intel_miptree_unmap_blit(struct intel_context *intel, unsigned int slice) { struct gl_context *ctx = intel-ctx; - drm_intel_bo_unmap(map-bo); - if (map-mode GL_MAP_WRITE_BIT) { - unsigned int image_x, image_y; - int x = map-x; - int y = map-y; - intel_miptree_get_image_offset(mt, level, slice, image_x, image_y); - x += image_x; - y += image_y; + intel_miptree_unmap_raw(intel, map-mt); - bool ok = intelEmitCopyBlit(intel, - mt-region-cpp, - map-stride, map-bo, - 0, I915_TILING_NONE, - mt-region-pitch, mt-region-bo, - mt-offset, mt-region-tiling, - 0, 0, - x, y, - map-w, map-h, - GL_COPY); + if (map-mode GL_MAP_WRITE_BIT) { + bool ok = intel_miptree_blit(intel, + map-mt, 0, 0, + 0, 0, false, + mt, level, slice, + map-x, map-y, false, + map-w, map-h, GL_COPY); WARN_ONCE(!ok, Failed to blit from linear temporary mapping); } - drm_intel_bo_unreference(map-bo); + intel_miptree_release(map-mt); } static void @@ -1896,24 +1868,7 @@ intel_miptree_map_singlesample(struct intel_context *intel, } else if (mt-stencil_mt !(mode BRW_MAP_DIRECT_BIT)) { intel_miptree_map_depthstencil(intel, mt, map, level, slice); } - /* According to the Ivy Bridge PRM, Vol1 Part4, section 1.2.1.2 (Graphics -* Data Size
[Mesa-dev] [PATCH 17/17] intel: Remove dead intel_drawbuf_region().
Since the glBitmap() MRT change, it's unused.  There was basically no way to
responsibly use this function since MRT was introduced.
---
 src/mesa/drivers/dri/intel/intel_buffers.c | 14 --------------
 src/mesa/drivers/dri/intel/intel_buffers.h |  2 --
 2 files changed, 16 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_buffers.c b/src/mesa/drivers/dri/intel/intel_buffers.c
index 9a9a259..fdad480 100644
--- a/src/mesa/drivers/dri/intel/intel_buffers.c
+++ b/src/mesa/drivers/dri/intel/intel_buffers.c
@@ -35,20 +35,6 @@
 #include "main/renderbuffer.h"
 
 /**
- * Return pointer to current color drawing region, or NULL.
- */
-struct intel_region *
-intel_drawbuf_region(struct intel_context *intel)
-{
-   struct intel_renderbuffer *irbColor =
-      intel_renderbuffer(intel->ctx.DrawBuffer->_ColorDrawBuffers[0]);
-   if (irbColor && irbColor->mt)
-      return irbColor->mt->region;
-   else
-      return NULL;
-}
-
-/**
  * Return pointer to current color reading region, or NULL.
  */
 struct intel_region *
diff --git a/src/mesa/drivers/dri/intel/intel_buffers.h b/src/mesa/drivers/dri/intel/intel_buffers.h
index e68cc67..4e3d130 100644
--- a/src/mesa/drivers/dri/intel/intel_buffers.h
+++ b/src/mesa/drivers/dri/intel/intel_buffers.h
@@ -38,8 +38,6 @@ struct intel_framebuffer;
 
 extern struct intel_region *intel_readbuf_region(struct intel_context *intel);
 
-extern struct intel_region *intel_drawbuf_region(struct intel_context *intel);
-
 extern void intel_check_front_buffer_rendering(struct intel_context *intel);
 
 static inline void
-- 
1.8.3.rc0
[Mesa-dev] [PATCH 15/17] intel: Fix MRT handling of glBitmap().
We'd only hit color buffer 0 even if multiple draw buffers were bound. NOTE: This is a candidate for the stable branches. --- src/mesa/drivers/dri/intel/intel_pixel_bitmap.c | 23 ++- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/src/mesa/drivers/dri/intel/intel_pixel_bitmap.c b/src/mesa/drivers/dri/intel/intel_pixel_bitmap.c index c538a29..e258945 100644 --- a/src/mesa/drivers/dri/intel/intel_pixel_bitmap.c +++ b/src/mesa/drivers/dri/intel/intel_pixel_bitmap.c @@ -45,6 +45,7 @@ #include intel_context.h #include intel_batchbuffer.h #include intel_blit.h +#include intel_fbo.h #include intel_regions.h #include intel_buffers.h #include intel_pixel.h @@ -176,8 +177,8 @@ do_blit_bitmap( struct gl_context *ctx, const GLubyte *bitmap ) { struct intel_context *intel = intel_context(ctx); - struct intel_region *dst; struct gl_framebuffer *fb = ctx-DrawBuffer; + struct intel_renderbuffer *irb; GLfloat tmpColor[4]; GLubyte ubcolor[4]; GLuint color; @@ -200,10 +201,14 @@ do_blit_bitmap( struct gl_context *ctx, } intel_prepare_render(intel); - dst = intel_drawbuf_region(intel); - if (!dst) - return false; + if (fb-_NumColorDrawBuffers != 1) { + perf_debug(accelerated glBitmap() only supports rendering to a + single color buffer\n); + return false; + } + + irb = intel_renderbuffer(fb-_ColorDrawBuffers[0]); if (_mesa_is_bufferobj(unpack-BufferObj)) { bitmap = map_pbo(ctx, width, height, unpack, bitmap); @@ -222,7 +227,7 @@ do_blit_bitmap( struct gl_context *ctx, UNCLAMPED_FLOAT_TO_UBYTE(ubcolor[2], tmpColor[2]); UNCLAMPED_FLOAT_TO_UBYTE(ubcolor[3], tmpColor[3]); - if (dst-cpp == 2) + if (irb-mt-cpp == 2) color = PACK_COLOR_565(ubcolor[0], ubcolor[1], ubcolor[2]); else color = PACK_COLOR_(ubcolor[3], ubcolor[0], ubcolor[1], ubcolor[2]); @@ -271,14 +276,14 @@ do_blit_bitmap( struct gl_context *ctx, continue; if (!intelEmitImmediateColorExpandBlit(intel, - dst-cpp, + irb-mt-cpp, (GLubyte *)stipple, sz, color, - dst-pitch, - dst-bo, + irb-mt-region-pitch, + 
irb-mt-region-bo, 0, - dst-tiling, + irb-mt-region-tiling, dstx + px, dsty + py, w, h, -- 1.8.3.rc0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 16/17] intel: Fix format handling of blit glBitmap()
Any 32-bit format got ARGB8888 handling (including, say, GL_RG1616), and
anything else got 16-bit (including, say, GL_R8), which could potentially
hang the GPU by writing out of bounds.

NOTE: This is a candidate for the stable branches.
---
 src/mesa/drivers/dri/intel/intel_pixel_bitmap.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_pixel_bitmap.c b/src/mesa/drivers/dri/intel/intel_pixel_bitmap.c
index e258945..c82253a 100644
--- a/src/mesa/drivers/dri/intel/intel_pixel_bitmap.c
+++ b/src/mesa/drivers/dri/intel/intel_pixel_bitmap.c
@@ -227,10 +227,19 @@ do_blit_bitmap( struct gl_context *ctx,
    UNCLAMPED_FLOAT_TO_UBYTE(ubcolor[2], tmpColor[2]);
    UNCLAMPED_FLOAT_TO_UBYTE(ubcolor[3], tmpColor[3]);

-   if (irb->mt->cpp == 2)
-      color = PACK_COLOR_565(ubcolor[0], ubcolor[1], ubcolor[2]);
-   else
+   switch (irb->mt->format) {
+   case MESA_FORMAT_ARGB8888:
+   case MESA_FORMAT_XRGB8888:
       color = PACK_COLOR_8888(ubcolor[3], ubcolor[0], ubcolor[1], ubcolor[2]);
+      break;
+   case MESA_FORMAT_RGB565:
+      color = PACK_COLOR_565(ubcolor[0], ubcolor[1], ubcolor[2]);
+      break;
+   default:
+      perf_debug("Unsupported format %s in accelerated glBitmap()\n",
+                 _mesa_get_format_name(irb->mt->format));
+      return false;
+   }

    if (!intel_check_blit_fragment_ops(ctx, tmpColor[3] == 1.0F))
       return false;
-- 
1.8.3.rc0
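
For readers who want to see the idea of the patch in isolation, the following is a minimal standalone sketch, not the driver's code. It packs a color according to an explicit format and refuses anything it does not recognise, instead of guessing from bytes per pixel. All names prefixed with sketch_ are hypothetical stand-ins, and the bit layouts are the conventional 5:6:5 and 8:8:8:8 ones, assumed here rather than taken from the Mesa macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the Mesa format enums named in the patch. */
enum sketch_format {
   SKETCH_FORMAT_ARGB8888,
   SKETCH_FORMAT_XRGB8888,
   SKETCH_FORMAT_RGB565,
   SKETCH_FORMAT_R8          /* example of a format the blit path rejects */
};

/* Pack 8-bit channels into a 16-bit 5:6:5 value (assumed layout). */
static uint32_t sketch_pack_565(uint8_t r, uint8_t g, uint8_t b)
{
   return ((uint32_t)(r & 0xf8) << 8) |
          ((uint32_t)(g & 0xfc) << 3) |
          ((uint32_t)b >> 3);
}

/* Pack 8-bit channels into a 32-bit A:R:G:B value (assumed layout). */
static uint32_t sketch_pack_8888(uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
   return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
          ((uint32_t)g << 8) | (uint32_t)b;
}

/* Returns false for formats the fast path cannot handle, so the caller can
 * fall back to a slower path rather than write a wrongly sized color.
 */
static bool sketch_pack_color(enum sketch_format format,
                              const uint8_t rgba[4], uint32_t *color)
{
   assert(color != NULL);

   switch (format) {
   case SKETCH_FORMAT_ARGB8888:
   case SKETCH_FORMAT_XRGB8888:
      *color = sketch_pack_8888(rgba[3], rgba[0], rgba[1], rgba[2]);
      return true;
   case SKETCH_FORMAT_RGB565:
      *color = sketch_pack_565(rgba[0], rgba[1], rgba[2]);
      return true;
   default:
      return false;
   }
}
```

The point of the default case is the same as in the patch: an 8-bit or two-channel format never reaches the packing code, so it cannot be treated as 16-bit or ARGB by accident.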
Re: [Mesa-dev] [PATCH 0/4] V2 Multiple viewports in Gallium
Am 24.05.2013 22:56, schrieb Zack Rusin: This series adds support for multiple viewports/scissors to gallium and implements it in llvmpipe. All the other drivers still support just a single viewport/scissor combo and their behavior should be exactly the same as it was. I think this one takes care of all the comments. I think it addresses everyones concerns. Please let me know if I missed something. Zack Rusin (4): gallium: Add support for multiple viewports draw: implement support for multiple viewports llvmpipe: implement support for multiple viewports draw: fixup draw_find_shader_output src/gallium/auxiliary/cso_cache/cso_context.c |4 +- src/gallium/auxiliary/draw/draw_cliptest_tmp.h | 10 +++- src/gallium/auxiliary/draw/draw_context.c | 63 +++- src/gallium/auxiliary/draw/draw_context.h |6 +- src/gallium/auxiliary/draw/draw_gs.c | 11 +++- src/gallium/auxiliary/draw/draw_gs.h |1 + src/gallium/auxiliary/draw/draw_pipe_clip.c| 11 +++- src/gallium/auxiliary/draw/draw_private.h |8 +-- .../draw/draw_pt_fetch_shade_pipeline_llvm.c |4 +- src/gallium/auxiliary/draw/draw_vertex.h |2 +- src/gallium/auxiliary/draw/draw_vs.c |7 --- src/gallium/auxiliary/draw/draw_vs_variant.c | 34 +-- src/gallium/auxiliary/tgsi/tgsi_scan.c |6 ++ src/gallium/auxiliary/tgsi/tgsi_scan.h |1 + src/gallium/auxiliary/tgsi/tgsi_strings.c |3 +- src/gallium/auxiliary/util/u_blitter.c |8 +-- src/gallium/auxiliary/vl/vl_compositor.c |4 +- src/gallium/auxiliary/vl/vl_idct.c |4 +- src/gallium/auxiliary/vl/vl_matrix_filter.c|2 +- src/gallium/auxiliary/vl/vl_mc.c |2 +- src/gallium/auxiliary/vl/vl_median_filter.c|2 +- src/gallium/auxiliary/vl/vl_zscan.c|2 +- src/gallium/docs/source/context.rst|8 ++- src/gallium/drivers/freedreno/freedreno_state.c| 12 ++-- src/gallium/drivers/galahad/glhd_context.c | 20 --- src/gallium/drivers/i915/i915_state.c | 15 +++-- src/gallium/drivers/identity/id_context.c | 22 +++ src/gallium/drivers/ilo/ilo_state.c| 16 +++-- src/gallium/drivers/llvmpipe/lp_context.h |7 ++- 
src/gallium/drivers/llvmpipe/lp_screen.c |2 + src/gallium/drivers/llvmpipe/lp_setup.c| 29 + src/gallium/drivers/llvmpipe/lp_setup.h|4 +- src/gallium/drivers/llvmpipe/lp_setup_context.h|8 ++- src/gallium/drivers/llvmpipe/lp_setup_line.c | 12 +++- src/gallium/drivers/llvmpipe/lp_setup_point.c | 12 ++-- src/gallium/drivers/llvmpipe/lp_setup_tri.c| 17 -- src/gallium/drivers/llvmpipe/lp_state_clip.c | 25 +--- src/gallium/drivers/llvmpipe/lp_state_derived.c| 20 +-- src/gallium/drivers/llvmpipe/lp_surface.c |4 +- src/gallium/drivers/noop/noop_state.c | 16 +++-- src/gallium/drivers/nv30/nv30_draw.c |2 +- src/gallium/drivers/nv30/nv30_state.c | 16 +++-- src/gallium/drivers/nv50/nv50_state.c | 16 +++-- src/gallium/drivers/nvc0/nvc0_state.c | 16 +++-- src/gallium/drivers/r300/r300_context.c|2 +- src/gallium/drivers/r300/r300_state.c | 18 +++--- src/gallium/drivers/r600/evergreen_state.c |6 +- src/gallium/drivers/r600/r600_state.c |8 ++- src/gallium/drivers/r600/r600_state_common.c | 10 ++-- src/gallium/drivers/radeonsi/si_state.c| 16 +++-- src/gallium/drivers/rbug/rbug_context.c| 22 +++ src/gallium/drivers/softpipe/sp_screen.c |2 + src/gallium/drivers/softpipe/sp_state_clip.c | 19 +++--- src/gallium/drivers/softpipe/sp_state_derived.c|2 +- src/gallium/drivers/svga/svga_pipe_misc.c | 20 --- src/gallium/drivers/svga/svga_swtnl_state.c|2 +- src/gallium/drivers/trace/tr_context.c | 32 ++ src/gallium/include/pipe/p_context.h | 12 ++-- src/gallium/include/pipe/p_defines.h |3 +- src/gallium/include/pipe/p_shader_tokens.h |3 +- src/gallium/include/pipe/p_state.h |1 + src/gallium/tests/graw/fs-test.c |2 +- src/gallium/tests/graw/graw_util.h |2 +- src/gallium/tests/graw/gs-test.c |2 +- src/gallium/tests/graw/quad-sample.c |2 +- src/gallium/tests/graw/shader-leak.c |2 +- src/gallium/tests/graw/tri-gs.c|2 +-
Re: [Mesa-dev] [PATCH 5/5] i965 gen7: use SURFACE_STATE fields to select render level/layer
On 22 May 2013 20:00, Jordan Justen jljus...@gmail.com wrote:

> On Wed, May 22, 2013 at 3:56 PM, Eric Anholt e...@anholt.net wrote:
> > Jordan Justen jordan.l.jus...@intel.com writes:
> > > -   surf[0] = BRW_SURFACE_2D << BRW_SURFACE_TYPE_SHIFT |
> > > +   switch (gl_target) {
> > > +   case GL_TEXTURE_CUBE_MAP_ARRAY:
> > > +   case GL_TEXTURE_CUBE_MAP:
> > > +      surftype = BRW_SURFACE_2D;
> > > +      is_array = true;
> > > +      depth *= 6;
> > > +      break;
> > > +   default:
> > > +      surftype = translate_tex_target(gl_target);
> > > +      is_array = _mesa_tex_target_is_array(gl_target);
> > > +      break;
> > > +   }
> >
> > Why the conversion of cubes to arrays? It looks from mentions in the
> > render target write message section's mention of RTAI that cubes are
> > supported.
>
> Hmm. Good catch. I think I started implementing this in
> brw_wm_surface_state.c, so I was looking at what would be needed for the
> older gens. It looks like pre-gen6, that cube-arrays were not supported
> in the surface_state.
>
> I'm not sure right now why I extended that to include converting
> non-array cubes to 2d-arrays as well.
>
> Anyway, I'll investigate cleaning this up for gen7, since that is what
> we are starting with.

When Jordan was first working on this feature, he asked me to help debug it,
and I found by reading simulator source code that SURFACE_STATE's minimum
array element field is ignored for cube surfaces (in direct contradiction to
the hw docs). Fortunately, treating the surface as an array is an effective
workaround, since for render targets there is effectively no difference
between a cube map and an ordinary array with a 6x higher depth.

> Thanks,
> -Jordan
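
The workaround described above amounts to simple index arithmetic, which can be sketched as follows. This is only an illustration under the assumption that layers are ordered cube by cube with six faces each; the sketch_ names are hypothetical and do not appear in the driver.

```c
#include <assert.h>

#define SKETCH_CUBE_FACES 6u

/* Depth of the 2D-array surface that stands in for cube_count cubes,
 * mirroring the "depth *= 6" in the quoted hunk.
 */
static unsigned sketch_cube_as_array_depth(unsigned cube_count)
{
   return cube_count * SKETCH_CUBE_FACES;
}

/* Array layer that a given face of a given cube lands on (assumed layout:
 * all six faces of cube 0, then all six faces of cube 1, and so on).
 */
static unsigned sketch_cube_layer(unsigned cube_index, unsigned face)
{
   assert(face < SKETCH_CUBE_FACES);
   return cube_index * SKETCH_CUBE_FACES + face;
}
```

With this mapping a single cube map is a 6-layer array, and face 3 of the third cube in a cube array is layer 15.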
[Mesa-dev] [PATCH] gallium/docs: adds documentation for multi viewport cap
Signed-off-by: Zack Rusin za...@vmware.com
---
 src/gallium/docs/source/screen.rst |    4
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
index 2630491..0957e56 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -164,6 +164,10 @@ The integer capabilities:
   for permutational swizzles.
 * ``PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE``: The maximum accessible size with
   a buffer sampler view, in bytes.
+* ``PIPE_CAP_MAX_VIEWPORTS``: The maximum number of viewports (and scissors
+  since they are linked) a driver can support. Returning 0 is equivalent
+  to returning 1 because every driver has to support at least a single
+  viewport/scissor combination.

 .. _pipe_capf:

-- 
1.7.10.4
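
A state tracker consuming this cap would presumably normalise the value once. The helper below is a hypothetical sketch of that rule only; it is not a Gallium function, and how the raw value is queried from the screen is left out.

```c
#include <assert.h>

/* Hypothetical helper: a driver reporting 0 for PIPE_CAP_MAX_VIEWPORTS is
 * treated as supporting exactly one viewport/scissor pair, as the
 * documentation text above states.
 */
static unsigned sketch_effective_max_viewports(unsigned reported)
{
   return reported == 0 ? 1u : reported;
}
```

Drivers that never heard of the cap and return 0 therefore behave exactly like single-viewport drivers.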
Re: [Mesa-dev] [PATCH 0/4] V2 Multiple viewports in Gallium
> There's no documentation of PIPE_CAP_MAX_VIEWPORTS but otherwise the
> series looks good.

I've just sent another patch with it. Let me know if that's enough.

z
Re: [Mesa-dev] [PATCH V2 4/4] i965: Enable ext_framebuffer_multisample_blit_scaled on intel h/w
Paul Berry stereotype...@gmail.com writes:

> On 16 May 2013 11:44, Anuj Phogat anuj.pho...@gmail.com wrote:
>
> > This patch enables ext_framebuffer_multisample_blit_scaled extension
> > on intel h/w >= gen6.
> >
> > Note: Patches for piglit tests to verify this functionality are out
> > for review on piglit mailing list. Tests pass for all of the scaling
> > factors from 0.1 to 2.4.
> >
> > Comment from Paul Berry:
> > I have some concerns about the image quality of the method you've
> > implemented. As I understand it, the primary use case of this
> > extension is to allow the client to do multisampled rendering at
> > slightly less than screen resolution (e.g. 720p instead of 1080p),
> > and then blit the result to the screen in one step while keeping most
> > of the quality benefits of multisampling. Since your implementation
> > is effectively equivalent to downsampling and then blitting using
> > GL_NEAREST filtering, my fear is that it will lead to blocky
> > artifacts that are severe enough to negate the benefit of
> > multisampling in the first place.
> >
> > Before we turn this extension on in the Intel driver, I'd like to
> > look at a comparison of:
> > (1) your technique
> > (2) downsampling followed by scaling with GL_LINEAR filtering
> > (3) The nVidia implementation, in GL_SCALED_RESOLVE_FASTEST_EXT mode
> > (4) The nVidia implementation, in GL_SCALED_RESOLVE_NICEST_EXT mode
> > (5) Just rendering the image directly to the single-sampled
> >     destination buffer
> >
> > Observation:
> > Image quality is better in cases 2, 3, 4 and 5 as compared to case 1.
> > Although extension's implementation meets the specification's
> > requirements, using it leads to blocky artifacts due to nearest
> > filtering. I'll work on implementing a better filtering technique in
> > blorp.
>
> Thanks for quoting my comment here. It's good to have context so that
> we can continue the discussion.
>
> My preference would be to go ahead and land patches 1-3 now, but hold
> patch 4 back until we've figured out how to get comparable image
> quality to the nVidia implementation. It seems like it would be nice to
> go out of the gate with our best looking implementation. Does that seem
> reasonable to other folks?

Yeah, I don't think we should ship a nearest-filtered-only implementation.
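
The difference between cases (1) and (2) above can be illustrated with a one-dimensional toy sampler. This is a simplified sketch with hypothetical sketch_ names, assuming texel centers at half-integer coordinates and clamp-to-edge behaviour; it says nothing about how blorp or any hardware sampler is actually implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Clamp a texel index to [0, n - 1], like a clamp-to-edge wrap mode. */
static int sketch_clamp_index(int i, int n)
{
   if (i < 0)
      return 0;
   if (i >= n)
      return n - 1;
   return i;
}

/* floor() for small coordinates, written out to avoid needing libm. */
static int sketch_floor_to_int(float v)
{
   int i = (int)v;                 /* truncates toward zero */
   return ((float)i > v) ? i - 1 : i;
}

/* Nearest filtering: pick the texel whose cell contains u. */
static float sketch_sample_nearest(const float *texels, int n, float u)
{
   assert(texels != NULL && n > 0);
   return texels[sketch_clamp_index(sketch_floor_to_int(u), n)];
}

/* Linear filtering: blend the two texels whose centers surround u. */
static float sketch_sample_linear(const float *texels, int n, float u)
{
   float centered, frac;
   int i0;

   assert(texels != NULL && n > 0);
   centered = u - 0.5f;
   i0 = sketch_floor_to_int(centered);
   frac = centered - (float)i0;
   return texels[sketch_clamp_index(i0, n)] * (1.0f - frac) +
          texels[sketch_clamp_index(i0 + 1, n)] * frac;
}
```

Across an edge between a 0.0 texel and a 1.0 texel, the nearest sampler jumps from 0.0 to 1.0 at the texel boundary, while the linear sampler passes through 0.5 there. When an image is scaled up, that jump is what shows as blockiness.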
Re: [Mesa-dev] [PATCH 00/12] i965/gen7+: Implement fast color clears.
Ian Romanick i...@freedesktop.org writes:

> On 05/21/2013 04:52 PM, Paul Berry wrote:
> > This series implements fast color clears, a Gen7+ feature which
> > reduces memory bandwidth by deferring the memory writes involved in a
> > glClear() until the same memory is later touched during rendering.
> >
> > From a broad overview point of view, fast color clears work in a
> > similar way to HiZ: an auxiliary MCS buffer keeps track of which
> > parts of the buffer have been cleared but haven't yet had the
> > necessary memory writes performed. Whenever a color buffer needs to
> > be accessed by the CPU, or by a part of the GPU that is not
> > fast-color-aware, we have to perform a resolve operation to force any
> > pending memory writes to occur.
> >
> > This patch series adopts a slightly different strategy (compared to
> > HiZ) for making sure the resolves happen when needed. Instead of
> > modifying each code path that might need to do a resolve so that it
> > does one if needed, we create an accessor function that does the
> > resolve if needed and then provides the caller with access to the
> > miptree's underlying memory region. This lets us have a lot more
> > confidence that we didn't miss any code paths, which is important
> > since color buffers are accessed by a large number of code paths. To
> > discourage future maintainers from trying to bypass the accessor
> > function, it is inline (so that overhead is negligible), and the
> > field it provides access to has been renamed to region_private.
> >
> > Patch 01 ifdefs out some code so that it does not appear in the i915
> > (pre-Gen4) driver--this makes it easier to be confident that these
> > changes won't regress i915. Patch 02 introduces the aforementioned
> > accessor function. Patches 03-11 are the guts of the implementation,
> > and patch 12 enables the new feature.
> >
> > No piglit regressions. I have additional piglit tests which validate
> > specific important corner cases--I hope to get those out to the list
> > later this week.
>
> I sent some comments and review for the tests, and I've sent some other
> comments about these patches.
>
> My only concern is whether the case of swapping a non-current drawable
> (that had a fast-clear as the last render) produces the correct result.
> In the piglit thread, I suggested adding a test specifically for this
> case. I suspect that if fast-clear fails in that case, then
> multisampling also fails. Both can probably be fixed as follow-on work.
> Does that seem plausible?

Swapping a non-current drawable doesn't work in direct rendering, at all,
since as far back as I was able to figure out. I saw no way forward toward
making it possible. I don't think we should distract this series with this
issue.
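
The accessor strategy described in the cover letter can be sketched in a few lines. This is a toy model under stated assumptions, not the code from the series: the struct layouts, the needs_resolve flag, and every sketch_ name are hypothetical, and only the field name region_private comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the underlying memory region. */
struct sketch_region {
   unsigned pitch;
};

struct sketch_miptree {
   /* Named "private" so that direct users stand out in review; all access
    * is meant to go through sketch_miptree_get_region().
    */
   struct sketch_region *region_private;
   bool needs_resolve;      /* a fast clear is still pending */
   unsigned resolve_count;  /* bookkeeping for the sketch only */
};

/* Stand-in for the resolve operation that forces pending writes out. */
static void sketch_resolve_color(struct sketch_miptree *mt)
{
   mt->needs_resolve = false;
   mt->resolve_count++;
}

/* The single doorway to the region: callers that do not understand fast
 * clears get a resolved buffer without having to remember to ask for one.
 */
static inline struct sketch_region *
sketch_miptree_get_region(struct sketch_miptree *mt, bool fast_clear_aware)
{
   assert(mt != NULL);
   if (!fast_clear_aware && mt->needs_resolve)
      sketch_resolve_color(mt);
   return mt->region_private;
}
```

A fast-clear-aware caller leaves the pending clear untouched, while any other caller triggers exactly one resolve. Repeated access after that costs only the flag check, which is why making the accessor inline keeps the overhead negligible.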
Re: [Mesa-dev] [PATCH 0/4] V2 Multiple viewports in Gallium
On 24.05.2013 23:41, Zack Rusin wrote:

> > There's no documentation of PIPE_CAP_MAX_VIEWPORTS but otherwise the
> > series looks good.
>
> I've just sent another patch with it. Let me know if that's enough.
>
> z

Thanks! That's certainly enough.

Roland
Re: [Mesa-dev] [PATCH 08/17] intle: Add an assert for glCopyTexSubImage() being called on MSAA buffers.
On 05/24/2013 01:56 PM, Eric Anholt wrote:

s/intle/intel/ in the title.

> This is just in case someone else trips over this due to our weird reuse
> of this code in glBlitFramebuffer().
> ---
>  src/mesa/drivers/dri/intel/intel_tex_copy.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/intel/intel_tex_copy.c b/src/mesa/drivers/dri/intel/intel_tex_copy.c
> index 94e90da..4a13b9a 100644
> --- a/src/mesa/drivers/dri/intel/intel_tex_copy.c
> +++ b/src/mesa/drivers/dri/intel/intel_tex_copy.c
> @@ -62,6 +62,12 @@ intel_copy_texsubimage(struct intel_context *intel,
>
>     intel_prepare_render(intel);
>
> +   /* glCopyTexSubImage() can't be called on multisampled renderbuffers or
> +    * textures.
> +    */
> +   assert(!irb->Base.Base.NumSamples);
> +   assert(!intelImage->base.Base.NumSamples);
> +
>     if (!intelImage->mt || !irb || !irb->mt) {
>        if (unlikely(INTEL_DEBUG & DEBUG_PERF))
>           fprintf(stderr, "%s fail %p %p (0x%08x)\n",
Re: [Mesa-dev] [PATCH 09/17] intel: Extend the force_y_tiling flag to allow forcing no tiling.
On 05/24/2013 01:56 PM, Eric Anholt wrote: For a blit-uploaded temporary, it's faster on current hardware to memcpy the data into a linear CPU mapping than to go through the GTT. --- src/mesa/drivers/dri/intel/intel_fbo.c | 2 +- src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 22 +- src/mesa/drivers/dri/intel/intel_mipmap_tree.h | 2 +- src/mesa/drivers/dri/intel/intel_tex_image.c| 2 +- src/mesa/drivers/dri/intel/intel_tex_validate.c | 2 +- 5 files changed, 17 insertions(+), 13 deletions(-) diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c b/src/mesa/drivers/dri/intel/intel_fbo.c index 05ff784..73ed91d 100644 --- a/src/mesa/drivers/dri/intel/intel_fbo.c +++ b/src/mesa/drivers/dri/intel/intel_fbo.c @@ -924,7 +924,7 @@ intel_renderbuffer_move_to_temp(struct intel_context *intel, width, height, depth, true, irb-mt-num_samples, - false /* force_y_tiling */); + 0 /* force_tiling_mask */); if (intel-vtbl.is_hiz_depth_format(intel, new_mt-format)) { intel_miptree_alloc_hiz(intel, new_mt); diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c index c3e55f4..d41fbdf 100644 --- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c @@ -265,7 +265,7 @@ intel_miptree_create_layout(struct intel_context *intel, mt-logical_depth0, true, num_samples, -false /* force_y_tiling */); +0 /* force_tiling_mask */); if (!mt-stencil_mt) { intel_miptree_release(mt); return NULL; @@ -309,7 +309,7 @@ intel_miptree_choose_tiling(struct intel_context *intel, gl_format format, uint32_t width0, uint32_t num_samples, -bool force_y_tiling, +int force_tiling_mask, struct intel_mipmap_tree *mt) { @@ -320,8 +320,12 @@ intel_miptree_choose_tiling(struct intel_context *intel, return I915_TILING_NONE; } - if (force_y_tiling) - return I915_TILING_Y; + /* Some usages may want only one type of tiling, like depth miptrees (Y +* tiled), or temporary BOs for uploading data once (linear). 
So far the +* mask only ever has one bit set. +*/ + if (force_tiling_mask) + return ffs(force_tiling_mask) - 1; if (num_samples 1) { /* From p82 of the Sandy Bridge PRM, dw3[1] of SURFACE_STATE (Tiled @@ -375,7 +379,7 @@ intel_miptree_create(struct intel_context *intel, GLuint depth0, bool expect_accelerated_upload, GLuint num_samples, - bool force_y_tiling) + int force_tiling_mask) unsigned? { struct intel_mipmap_tree *mt; gl_format tex_format = format; @@ -441,7 +445,7 @@ intel_miptree_create(struct intel_context *intel, } uint32_t tiling = intel_miptree_choose_tiling(intel, format, width0, - num_samples, force_y_tiling, + num_samples, force_tiling_mask, mt); bool y_or_x = tiling == (I915_TILING_Y | I915_TILING_X); @@ -570,7 +574,7 @@ intel_miptree_create_for_renderbuffer(struct intel_context *intel, mt = intel_miptree_create(intel, GL_TEXTURE_2D, format, 0, 0, width, height, depth, true, num_samples, - false /* force_y_tiling */); + 0 /* force_tiling_mask */); if (!mt) goto fail; @@ -1008,7 +1012,7 @@ intel_miptree_alloc_mcs(struct intel_context *intel, mt-logical_depth0, true, 0 /* num_samples */, - true /* force_y_tiling */); + (1 I915_TILING_Y)); /* From the Ivy Bridge PRM, Vol 2 Part 1 p326: * @@ -1089,7 +1093,7 @@ intel_miptree_alloc_hiz(struct intel_context *intel, mt-logical_depth0, true, mt-num_samples, - false /* force_y_tiling */); + 0 /* force_tiling_mask */); if (!mt-hiz_mt) return false; diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.h
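
The mask convention in the quoted hunks can be sketched on its own. This is an illustration, not the driver's code: the SKETCH_ constants mirror the I915_TILING_NONE, I915_TILING_X and I915_TILING_Y values 0, 1 and 2, the helper name is hypothetical, and a small loop stands in for the patch's ffs(mask) - 1 so the sketch needs no extra headers.

```c
#include <assert.h>

/* Stand-ins mirroring the i915 tiling mode numbering. */
enum {
   SKETCH_TILING_NONE = 0,
   SKETCH_TILING_X = 1,
   SKETCH_TILING_Y = 2
};

#define SKETCH_TILING_BIT(mode) (1u << (mode))

/* Returns the tiling mode forced by the mask, or -1 when the mask is 0 and
 * the normal heuristics should decide. Equivalent to ffs(mask) - 1 for a
 * nonzero mask: the index of the lowest set bit.
 */
static int sketch_forced_tiling(unsigned force_tiling_mask)
{
   int mode = 0;

   if (force_tiling_mask == 0)
      return -1;

   while (!(force_tiling_mask & 1u)) {
      force_tiling_mask >>= 1;
      mode++;
   }
   return mode;
}
```

One apparent reason for passing a bit mask rather than the mode itself is that linear tiling is mode 0, so a plain mode value could not tell "force linear" apart from "no forcing". With the mask, forcing linear is the nonzero value SKETCH_TILING_BIT(SKETCH_TILING_NONE).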
Re: [Mesa-dev] [PATCH 10/17] intel: Make a temporary miptree when doing blit uploads for glTexSubImage().
On 05/24/2013 01:56 PM, Eric Anholt wrote: While this is a bit more CPU work, it also is less code to handle this path, and fixes problems with 32k-pitch textures and missing resolves. --- src/mesa/drivers/dri/intel/intel_tex_subimage.c | 62 +++-- 1 file changed, 18 insertions(+), 44 deletions(-) diff --git a/src/mesa/drivers/dri/intel/intel_tex_subimage.c b/src/mesa/drivers/dri/intel/intel_tex_subimage.c index 42cc739..e436dc1 100644 --- a/src/mesa/drivers/dri/intel/intel_tex_subimage.c +++ b/src/mesa/drivers/dri/intel/intel_tex_subimage.c @@ -53,12 +53,6 @@ intel_blit_texsubimage(struct gl_context * ctx, { struct intel_context *intel = intel_context(ctx); struct intel_texture_image *intelImage = intel_texture_image(texImage); - GLuint dstRowStride = 0; - drm_intel_bo *temp_bo = NULL; - unsigned int blit_x = 0, blit_y = 0; - unsigned long pitch; - uint32_t tiling_mode = I915_TILING_NONE; - GLubyte *dstMap; /* Try to do a blit upload of the subimage if the texture is * currently busy. @@ -93,57 +87,37 @@ intel_blit_texsubimage(struct gl_context * ctx, if (!pixels) return false; - temp_bo = drm_intel_bo_alloc_tiled(intel-bufmgr, - subimage blit bo, - width, height, - intelImage-mt-cpp, - tiling_mode, - pitch, - 0); - if (temp_bo == NULL) { - _mesa_error(ctx, GL_OUT_OF_MEMORY, intelTexSubImage); - return false; - } + struct intel_mipmap_tree *temp_mt = + intel_miptree_create(intel, GL_TEXTURE_2D, texImage-TexFormat, + 0, 0, + width, height, 1, + false, 0, + (1 I915_TILING_NONE) /* force_tiling_mask */); The old code did error checking. Should we continue to error check temp_mt and dst (below)? 
- if (drm_intel_gem_bo_map_gtt(temp_bo)) { - _mesa_error(ctx, GL_OUT_OF_MEMORY, intelTexSubImage); - return false; - } - - dstMap = temp_bo-virtual; - dstRowStride = pitch; - - intel_miptree_get_image_offset(intelImage-mt, texImage-Level, - intelImage-base.Base.Face, - blit_x, blit_y); - blit_x += xoffset; - blit_y += yoffset; + GLubyte *dst = intel_miptree_map_raw(intel, temp_mt); if (!_mesa_texstore(ctx, 2, texImage-_BaseFormat, texImage-TexFormat, - dstRowStride, - dstMap, + temp_mt-region-pitch, + dst, width, height, 1, format, type, pixels, packing)) { _mesa_error(ctx, GL_OUT_OF_MEMORY, intelTexSubImage); Since this code doesn't bail (and never has), we blit garbage into the texture, right? } + intel_miptree_unmap_raw(intel, temp_mt); + bool ret; - drm_intel_gem_bo_unmap_gtt(temp_bo); - - ret = intelEmitCopyBlit(intel, - intelImage-mt-cpp, - dstRowStride, - temp_bo, 0, false, - intelImage-mt-region-pitch, - intelImage-mt-region-bo, 0, - intelImage-mt-region-tiling, - 0, 0, blit_x, blit_y, width, height, - GL_COPY); + ret = intel_miptree_blit(intel, +temp_mt, 0, 0, +0, 0, false, +intelImage-mt, texImage-Level, texImage-Face, +xoffset, yoffset, false, +width, height, GL_COPY); assert(ret); - drm_intel_bo_unreference(temp_bo); + intel_miptree_release(temp_mt); _mesa_unmap_teximage_pbo(ctx, packing); return ret; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 14/17] intel: Rebuild PBO blit glTexImage() on top of miptrees.
On 05/24/2013 01:56 PM, Eric Anholt wrote: This will ensure that we have resolves if we ever extend this to glTexSubImage(), and fixes missing image start offset handling. The texture buffer alloc ended up getting moved up, because we want to look at the format of the image's actual mt to see if we'll end up blitting the right thing, in the case of packed depth/stencil uploads. This is the last caller of intelEmitCopyBlit() on a miptree-wrapped BO. It looks like after this the two remaining callers are all in intel_blit.c. Should intelEmitCopyBlit be static? Looking at what's left, it looks like there should be some more refactoring of intelEmitCopyBlit after this commit. A bunch of the checks, etc. in intelEmitCopyBlit are only relevant for one of the callers. That can happen later, if there's value. --- src/mesa/drivers/dri/intel/intel_tex_image.c | 62 ++-- 1 file changed, 32 insertions(+), 30 deletions(-) diff --git a/src/mesa/drivers/dri/intel/intel_tex_image.c b/src/mesa/drivers/dri/intel/intel_tex_image.c index a3928bb..4ad5ccc 100644 --- a/src/mesa/drivers/dri/intel/intel_tex_image.c +++ b/src/mesa/drivers/dri/intel/intel_tex_image.c @@ -6,6 +6,7 @@ #include main/bufferobj.h #include main/context.h #include main/formats.h +#include main/image.h #include main/pbo.h #include main/renderbuffer.h #include main/texcompress.h @@ -117,9 +118,8 @@ try_pbo_upload(struct gl_context *ctx, struct intel_texture_image *intelImage = intel_texture_image(image); struct intel_context *intel = intel_context(ctx); struct intel_buffer_object *pbo = intel_buffer_object(unpack-BufferObj); - GLuint src_offset, src_stride; - GLuint dst_x, dst_y; - drm_intel_bo *dst_buffer, *src_buffer; + GLuint src_offset; + drm_intel_bo *src_buffer; if (!_mesa_is_bufferobj(unpack-BufferObj)) return false; @@ -132,14 +132,6 @@ try_pbo_upload(struct gl_context *ctx, return false; } - if (!_mesa_format_matches_format_and_type(image-TexFormat, - format, type, false)) { - DBG(%s: format mismatch (upload 
to %s with format 0x%x, type 0x%x)\n, - __FUNCTION__, _mesa_get_format_name(image-TexFormat), - format, type); - return false; - } - ctx-Driver.AllocTextureImageBuffer(ctx, image); if (!intelImage-mt) { @@ -147,39 +139,49 @@ try_pbo_upload(struct gl_context *ctx, return false; } + if (!_mesa_format_matches_format_and_type(intelImage-mt-format, + format, type, false)) { + DBG(%s: format mismatch (upload to %s with format 0x%x, type 0x%x)\n, + __FUNCTION__, _mesa_get_format_name(intelImage-mt-format), + format, type); + return false; + } + if (image-TexObject-Target == GL_TEXTURE_1D_ARRAY || image-TexObject-Target == GL_TEXTURE_2D_ARRAY) { DBG(%s: no support for array textures\n, __FUNCTION__); return false; } - dst_buffer = intelImage-mt-region-bo; src_buffer = intel_bufferobj_source(intel, pbo, 64, src_offset); /* note: potential 64-bit ptr to 32-bit int cast */ src_offset += (GLuint) (unsigned long) pixels; - if (unpack-RowLength 0) - src_stride = unpack-RowLength; - else - src_stride = image-Width; - src_stride *= intelImage-mt-region-cpp; - - intel_miptree_get_image_offset(intelImage-mt, intelImage-base.Base.Level, - intelImage-base.Base.Face, - dst_x, dst_y); - - if (!intelEmitCopyBlit(intel, - intelImage-mt-cpp, - src_stride, src_buffer, - src_offset, false, - intelImage-mt-region-pitch, dst_buffer, 0, - intelImage-mt-region-tiling, - 0, 0, dst_x, dst_y, image-Width, image-Height, - GL_COPY)) { + int src_stride = + _mesa_image_row_stride(unpack, image-Width, format, type); + + struct intel_mipmap_tree *pbo_mt = + intel_miptree_create_for_bo(intel, + src_buffer, + intelImage-mt-format, + src_offset, + image-Width, image-Height, + src_stride, I915_TILING_NONE); + if (!pbo_mt) + return false; + + if (!intel_miptree_blit(intel, + pbo_mt, 0, 0, + 0, 0, false, + intelImage-mt, image-Level, image-Face, + 0, 0, false, + image-Width, image-Height, GL_COPY)) { DBG(%s: blit failed\n, __FUNCTION__); return false; } + intel_miptree_release(pbo_mt); + DBG(%s: success\n, 
__FUNCTION__); return true; }
Re: [Mesa-dev] [PATCH 14/17] intel: Rebuild PBO blit glTexImage() on top of miptrees.
Ian Romanick i...@freedesktop.org writes: On 05/24/2013 01:56 PM, Eric Anholt wrote: This will ensure that we have resolves if we ever extend this to glTexSubImage(), and fixes missing image start offset handling. The texture buffer alloc ended up getting moved up, because we want to look at the format of the image's actual mt to see if we'll end up blitting the right thing, in the case of packed depth/stencil uploads. This is the last caller of intelEmitCopyBlit() on a miptree-wrapped BO. It looks like after this the two remaining callers are all in intel_blit.c. Should intelEmitCopyBlit be static? Looking at what's left, it looks like there should be some more refactoring of intelEmitCopyBlit after this commit. A bunch of the checks, etc. in intelEmitCopyBlit are only relevant for one of the callers. That can happen later, if there's value. I thought about doing so, but the aperture check is painful enough I decided not to duplicate it.
Re: [Mesa-dev] libclc: vload/vstore initial implementation
On Thu, May 23, 2013 at 07:49:39PM -0500, Aaron Watry wrote:

> I've implemented the OpenCL vload/vstore builtin functions in two parts.
>
> 1) Pure CL C implementation. No Assembly
> 2) Add assembly optimizations for 32-bit int/uint loads/stores of 4+
>    component vectors
>
> Note: The vstore implementation assumes that the hardware back end
> supports byte-addressable stores. This may not always be optimal.

Hi Aaron,

I've pushed these to my libclc repo, thanks!

-Tom
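
For readers unfamiliar with these builtins, the addressing rule they follow can be sketched in plain C. This is not the libclc implementation: the struct and the sketch_ functions are hypothetical stand-ins for the int4 case, and they model only the rule that vloadn and vstoren with a given offset access the elements starting at p + offset * n.

```c
#include <assert.h>
#include <stddef.h>

/* Plain C stand-in for an OpenCL int4 vector. */
typedef struct {
   int x, y, z, w;
} sketch_int4;

/* Models vload4(offset, p): read four consecutive ints starting at
 * p + offset * 4. The pointer only needs the alignment of a single int.
 */
static sketch_int4 sketch_vload4(size_t offset, const int *p)
{
   const int *base;
   sketch_int4 v;

   assert(p != NULL);
   base = p + offset * 4;
   v.x = base[0];
   v.y = base[1];
   v.z = base[2];
   v.w = base[3];
   return v;
}

/* Models vstore4(data, offset, p): write four consecutive ints starting at
 * p + offset * 4, one component at a time.
 */
static void sketch_vstore4(sketch_int4 data, size_t offset, int *p)
{
   int *base;

   assert(p != NULL);
   base = p + offset * 4;
   base[0] = data.x;
   base[1] = data.y;
   base[2] = data.z;
   base[3] = data.w;
}
```

Loading with offset 1 from an array holding 0 through 7 yields the vector (4, 5, 6, 7). The component-by-component form corresponds to a pure C approach; a hardware-specific version could presumably replace the four accesses with one wide access.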
[Mesa-dev] [PATCH 1/7] mesa: Add infrastructure for ARB_shading_language_420pack.
From: Todd Previte tprev...@gmail.com

v2 [mattst88]
   - Split infrastructure into separate patch.
   - Add preprocessor #define.
---
 src/glsl/glcpp/glcpp-parse.y    | 3 +++
 src/glsl/glsl_parser_extras.cpp | 1 +
 src/glsl/glsl_parser_extras.h   | 2 ++
 src/mesa/main/extensions.c      | 1 +
 src/mesa/main/mtypes.h          | 1 +
 5 files changed, 8 insertions(+)

diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
index 81ba04b..2e3e6a8 100644
--- a/src/glsl/glcpp/glcpp-parse.y
+++ b/src/glsl/glcpp/glcpp-parse.y
@@ -1242,6 +1242,9 @@ glcpp_parser_create (const struct gl_extensions *extensions, int api)

              if (extensions->AMD_vertex_shader_layer)
                 add_builtin_define(parser, "GL_AMD_vertex_shader_layer", 1);
+
+             if (extensions->ARB_shading_language_420pack)
+                add_builtin_define(parser, "GL_ARB_shading_language_420pack", 1);
           }
        }

diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index c0dd713..d02b308 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -466,6 +466,7 @@ static const _mesa_glsl_extension _mesa_glsl_supported_extensions[] = {
    EXT(OES_standard_derivatives,       false, false, true,  false, true,  OES_standard_derivatives),
    EXT(ARB_texture_cube_map_array,     true,  false, true,  true,  false, ARB_texture_cube_map_array),
    EXT(ARB_shading_language_packing,   true,  false, true,  true,  false, ARB_shading_language_packing),
+   EXT(ARB_shading_language_420pack,   true,  true,  true,  true,  false, ARB_shading_language_420pack),
    EXT(ARB_texture_multisample,        true,  false, true,  true,  false, ARB_texture_multisample),
    EXT(ARB_texture_query_lod,          false, false, true,  true,  false, ARB_texture_query_lod),
    EXT(ARB_gpu_shader5,                true,  true,  true,  true,  false, ARB_gpu_shader5),
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index 16e180d..95918de 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -288,6 +288,8 @@ struct _mesa_glsl_parse_state {
    bool ARB_gpu_shader5_warn;
    bool AMD_vertex_shader_layer_enable;
    bool AMD_vertex_shader_layer_warn;
+   bool ARB_shading_language_420pack_enable;
+   bool ARB_shading_language_420pack_warn;
    /*@}*/

    /** Extensions supported by the OpenGL implementation. */
diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index db5a5ed..32a331b 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -127,6 +127,7 @@ static const struct extension extension_table[] = {
    { "GL_ARB_shader_texture_lod",         o(ARB_shader_texture_lod),         GL,  2009 },
    { "GL_ARB_shading_language_100",       o(ARB_shading_language_100),       GLL, 2003 },
    { "GL_ARB_shading_language_packing",   o(ARB_shading_language_packing),   GL,  2011 },
+   { "GL_ARB_shading_language_420pack",   o(ARB_shading_language_420pack),   GL,  2011 },
    { "GL_ARB_shadow",                     o(ARB_shadow),                     GLL, 2001 },
    { "GL_ARB_sync",                       o(ARB_sync),                       GL,  2003 },
    { "GL_ARB_texture_border_clamp",       o(ARB_texture_border_clamp),       GLL, 2000 },
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index b68853b..597f36f 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2985,6 +2985,7 @@ struct gl_extensions
    GLboolean ARB_shader_texture_lod;
    GLboolean ARB_shading_language_100;
    GLboolean ARB_shading_language_packing;
+   GLboolean ARB_shading_language_420pack;
    GLboolean ARB_shadow;
    GLboolean ARB_sync;
    GLboolean ARB_texture_border_clamp;
-- 
1.8.1.5

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/7] glsl: Allow .length() method on vectors and matrices.
Required by ARB_shading_language_420pack.
---
 src/glsl/hir_field_selection.cpp | 58 ++--
 1 file changed, 38 insertions(+), 20 deletions(-)

diff --git a/src/glsl/hir_field_selection.cpp b/src/glsl/hir_field_selection.cpp
index 0035a5f..cc7ba61 100644
--- a/src/glsl/hir_field_selection.cpp
+++ b/src/glsl/hir_field_selection.cpp
@@ -47,20 +47,6 @@ _mesa_ast_field_selection_to_hir(const ast_expression *expr,
    YYLTYPE loc = expr->get_location();
    if (op->type->is_error()) {
       /* silently propagate the error */
-   } else if (op->type->is_vector()) {
-      ir_swizzle *swiz = ir_swizzle::create(op,
-                                            expr->primary_expression.identifier,
-                                            op->type->vector_elements);
-      if (swiz != NULL) {
-         result = swiz;
-      } else {
-         /* FINISHME: Logging of error messages should be moved into
-          * FINISHME: ir_swizzle::create.  This allows the generation of more
-          * FINISHME: specific error messages.
-          */
-         _mesa_glsl_error(& loc, state, "Invalid swizzle / mask `%s'",
-                          expr->primary_expression.identifier);
-      }
    } else if (op->type->base_type == GLSL_TYPE_STRUCT
               || op->type->base_type == GLSL_TYPE_INTERFACE) {
       result = new(ctx) ir_dereference_record(op,
@@ -81,17 +67,49 @@ _mesa_ast_field_selection_to_hir(const ast_expression *expr,
       const char *method;
       method = call->subexpressions[0]->primary_expression.identifier;

-      if (op->type->is_array() && strcmp(method, "length") == 0) {
-         if (!call->expressions.is_empty())
-            _mesa_glsl_error(&loc, state, "length method takes no arguments.");
+      if (strcmp(method, "length") == 0) {
+         if (!call->expressions.is_empty())
+            _mesa_glsl_error(&loc, state, "length method takes no arguments.");

-         if (op->type->array_size() == 0)
-            _mesa_glsl_error(&loc, state, "length called on unsized array.");
+         if (op->type->is_array()) {
+            if (op->type->array_size() == 0)
+               _mesa_glsl_error(&loc, state, "length called on unsized array.");

-         result = new(ctx) ir_constant(op->type->array_size());
+            result = new(ctx) ir_constant(op->type->array_size());
+         } else if (op->type->is_vector()) {
+            if (state->ARB_shading_language_420pack_enable) {
+               /* .length() returns int. */
+               result = new(ctx) ir_constant((int) op->type->vector_elements);
+            } else {
+               _mesa_glsl_error(&loc, state, "length method on vector only available "
+                                             "with ARB_shading_language_420pack.");
+            }
+         } else if (op->type->is_matrix()) {
+            if (state->ARB_shading_language_420pack_enable) {
+               /* .length() returns int. */
+               result = new(ctx) ir_constant((int) op->type->matrix_columns);
+            } else {
+               _mesa_glsl_error(&loc, state, "length method on matrix only available "
+                                             "with ARB_shading_language_420pack.");
+            }
+         }
       } else {
          _mesa_glsl_error(&loc, state, "Unknown method: `%s'.", method);
       }
+   } else if (op->type->is_vector()) {
+      ir_swizzle *swiz = ir_swizzle::create(op,
+                                            expr->primary_expression.identifier,
+                                            op->type->vector_elements);
+      if (swiz != NULL) {
+         result = swiz;
+      } else {
+         /* FINISHME: Logging of error messages should be moved into
+          * FINISHME: ir_swizzle::create.  This allows the generation of more
+          * FINISHME: specific error messages.
+          */
+         _mesa_glsl_error(& loc, state, "Invalid swizzle / mask `%s'",
+                          expr->primary_expression.identifier);
+      }
    } else {
       _mesa_glsl_error(& loc, state, "Cannot access field `%s' of "
                        "non-structure / non-vector.",
-- 
1.8.1.5
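For readers skimming the diff, the values that .length() folds to in the patch above can be sketched in plain C. This is only an illustration: `toy_type`, `toy_kind` and `toy_length` are hypothetical names, not Mesa's `glsl_type` API, and returning -1 for the error cases is an assumption made for the sketch (the patch reports an error through `_mesa_glsl_error` instead).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified type description (not Mesa's glsl_type). */
enum toy_kind { TOY_ARRAY, TOY_VECTOR, TOY_MATRIX, TOY_SCALAR };

struct toy_type {
   enum toy_kind kind;
   int array_size;       /* arrays only; 0 means unsized */
   int vector_elements;  /* vectors only */
   int matrix_columns;   /* matrices only */
};

/* Value that .length() would fold to, or -1 where the patch reports an
 * error (the -1 convention is an assumption of this sketch).
 */
static int
toy_length(const struct toy_type *t, bool ext_420pack_enabled)
{
   switch (t->kind) {
   case TOY_ARRAY:
      /* Arrays support .length() with or without the extension. */
      return t->array_size > 0 ? t->array_size : -1;
   case TOY_VECTOR:
      /* Vectors: number of components, extension required. */
      return ext_420pack_enabled ? t->vector_elements : -1;
   case TOY_MATRIX:
      /* Matrices: number of columns, extension required. */
      return ext_420pack_enabled ? t->matrix_columns : -1;
   default:
      return -1;
   }
}
```

Under these assumptions a vec4 yields 4 and a matrix with two columns yields 2 when the extension is enabled, while arrays behave the same either way.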
[Mesa-dev] [PATCH 3/7] glsl: Allow swizzles on scalars.
Required by ARB_shading_language_420pack.
---
 src/glsl/hir_field_selection.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/glsl/hir_field_selection.cpp b/src/glsl/hir_field_selection.cpp
index cc7ba61..ceb0a4c 100644
--- a/src/glsl/hir_field_selection.cpp
+++ b/src/glsl/hir_field_selection.cpp
@@ -96,7 +96,9 @@ _mesa_ast_field_selection_to_hir(const ast_expression *expr,
       } else {
          _mesa_glsl_error(&loc, state, "Unknown method: `%s'.", method);
       }
-   } else if (op->type->is_vector()) {
+   } else if (op->type->is_vector() ||
+              (state->ARB_shading_language_420pack_enable &&
+               op->type->is_scalar())) {
       ir_swizzle *swiz = ir_swizzle::create(op,
                                             expr->primary_expression.identifier,
                                             op->type->vector_elements);
-- 
1.8.1.5
[Mesa-dev] [PATCH 4/7] glsl: Add gl_{Max, Min}ProgramTexelOffset built-in constants.
Required by ARB_shading_language_420pack.

Note that the 420pack spec incorrectly specifies their values as
(Min, Max) = (-7, 8) when they should be (-8, 7) as listed in the
GLSL 4.30 and ESSL 3.0 specs.
---
 src/glsl/builtin_variables.cpp | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index 4bb361c..f4ac205 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -790,6 +790,13 @@ generate_130_uniforms(exec_list *instructions,
                         state->Const.MaxClipPlanes);
    add_builtin_constant(instructions, symtab, "gl_MaxVaryingComponents",
                         state->Const.MaxVaryingFloats);
+
+   if (state->ARB_shading_language_420pack_enable) {
+      add_builtin_constant(instructions, symtab, "gl_MinProgramTexelOffset",
+                           state->Const.MinProgramTexelOffset);
+      add_builtin_constant(instructions, symtab, "gl_MaxProgramTexelOffset",
+                           state->Const.MaxProgramTexelOffset);
+   }
 }


-- 
1.8.1.5
[Mesa-dev] [PATCH 5/7] glsl: Allow implicit conversion of return values.
Required by ARB_shading_language_420pack.
---
 src/glsl/ast_to_hir.cpp | 31 ++-
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index b206380..6e689b4 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -3358,7 +3358,7 @@ ast_jump_statement::hir(exec_list *instructions,
       assert(state->current_function);

       if (opt_return_value) {
-         ir_rvalue *const ret = opt_return_value->hir(instructions, state);
+         ir_rvalue *ret = opt_return_value->hir(instructions, state);

          /* The value of the return type can be NULL if the shader says
           * 'return foo();' and foo() is a function that returns void.
@@ -3370,16 +3370,29 @@ ast_jump_statement::hir(exec_list *instructions,
          const glsl_type *const ret_type =
             (ret == NULL) ? glsl_type::void_type : ret->type;

-         /* Implicit conversions are not allowed for return values. */
-         if (state->current_function->return_type != ret_type) {
+         /* Implicit conversions are not allowed for return values prior to
+          * ARB_shading_language_420pack.
+          */
+         if (state->current_function->return_type != ret_type) {
             YYLTYPE loc = this->get_location();

-            _mesa_glsl_error(& loc, state,
-                             "`return' with wrong type %s, in function `%s' "
-                             "returning %s",
-                             ret_type->name,
-                             state->current_function->function_name(),
-                             state->current_function->return_type->name);
+            if (state->ARB_shading_language_420pack_enable) {
+               if (!apply_implicit_conversion(state->current_function->return_type,
+                                              ret, state)) {
+                  _mesa_glsl_error(& loc, state,
+                                   "Could not implicitly convert return value "
+                                   "to %s, in function `%s'",
+                                   state->current_function->return_type->name,
+                                   state->current_function->function_name());
+               }
+            } else {
+               _mesa_glsl_error(& loc, state,
+                                "`return' with wrong type %s, in function `%s' "
+                                "returning %s",
+                                ret_type->name,
+                                state->current_function->function_name(),
+                                state->current_function->return_type->name);
+            }
          }

          inst = new(ctx) ir_return(ret);
-- 
1.8.1.5
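The control flow this patch adds to the return-statement check can be sketched as a small C decision function. Everything named `toy_*` is hypothetical and stands in for Mesa's types and for `apply_implicit_conversion`; in particular, the conversion rule used here (only int and uint convert to float) is a simplifying assumption for the sketch, not a statement of what Mesa or any GLSL version accepts.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical base types for the sketch. */
enum toy_base { TOY_VOID, TOY_BOOL, TOY_INT, TOY_UINT, TOY_FLOAT };

enum toy_result { TOY_OK, TOY_ERR_WRONG_TYPE, TOY_ERR_NO_CONVERSION };

/* Assumed, simplified conversion rule: int/uint -> float only. */
static bool
toy_can_convert(enum toy_base from, enum toy_base to)
{
   if (from == to)
      return true;
   return to == TOY_FLOAT && (from == TOY_INT || from == TOY_UINT);
}

/* Mirrors the shape of the patched check: matching types pass, a
 * mismatch is a hard error without the extension, and with the
 * extension it is an error only if no implicit conversion exists.
 */
static enum toy_result
toy_check_return(enum toy_base declared, enum toy_base actual,
                 bool ext_420pack_enabled)
{
   if (declared == actual)
      return TOY_OK;
   if (!ext_420pack_enabled)
      return TOY_ERR_WRONG_TYPE;
   return toy_can_convert(actual, declared) ? TOY_OK
                                            : TOY_ERR_NO_CONVERSION;
}
```

With these assumptions, returning an int from a float function is rejected without the extension and accepted with it, while returning a float from an int function is rejected in both cases, with a different error message in each.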
[Mesa-dev] [PATCH 6/7] glsl: Allow non-constant expression initializers of const-qualified vars.
Required by ARB_shading_language_420pack.
---
 src/glsl/ast_to_hir.cpp | 30 +++---
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 6e689b4..6b56e87 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2337,17 +2337,25 @@ process_initializer(ir_variable *var, ast_declaration *decl,

          ir_constant *constant_value = rhs->constant_expression_value();
          if (!constant_value) {
-            _mesa_glsl_error(& initializer_loc, state,
-                             "initializer of %s variable `%s' must be a "
-                             "constant expression",
-                             (type->qualifier.flags.q.constant)
-                             ? "const" : "uniform",
-                             decl->identifier);
-            if (var->type->is_numeric()) {
-               /* Reduce cascading errors. */
-               var->constant_value = ir_constant::zero(state, var->type);
-            }
-         } else {
+            /* If ARB_shading_language_420pack is enabled, initializers of
+             * const-qualified local variables do not have to be constant
+             * expressions. Const-qualified global variables must still be
+             * initialized with constant expressions.
+             */
+            if (!state->ARB_shading_language_420pack_enable
+                || state->current_function == NULL) {
+               _mesa_glsl_error(& initializer_loc, state,
+                                "initializer of %s variable `%s' must be a "
+                                "constant expression",
+                                (type->qualifier.flags.q.constant)
+                                ? "const" : "uniform",
+                                decl->identifier);
+               if (var->type->is_numeric()) {
+                  /* Reduce cascading errors. */
+                  var->constant_value = ir_constant::zero(state, var->type);
+               }
+            }
+         } else {
             rhs = constant_value;
             var->constant_value = constant_value;
          }
-- 
1.8.1.5
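The condition guarding the error in this patch reduces to a two-input predicate, shown here as a minimal C sketch. The function name and its parameters are hypothetical; `inside_function` stands in for `state->current_function != NULL` in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* True when a non-constant initializer of a const-qualified variable
 * must be diagnosed. Hypothetical helper mirroring the patch condition
 * "!ARB_shading_language_420pack_enable || current_function == NULL".
 */
static bool
toy_initializer_must_be_constant(bool ext_420pack_enabled,
                                 bool inside_function)
{
   return !ext_420pack_enabled || !inside_function;
}
```

Only the combination "extension enabled and declaration inside a function" relaxes the rule; global declarations keep requiring a constant expression either way.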
[Mesa-dev] [PATCH 7/7] glsl: Disallow return with a void argument from void functions.
NOTE: This is a candidate for the stable branches.
---
 src/glsl/ast_to_hir.cpp | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 6b56e87..6ee50f5 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -3401,7 +3401,23 @@ ast_jump_statement::hir(exec_list *instructions,
                                 state->current_function->function_name(),
                                 state->current_function->return_type->name);
             }
-         }
+         } else if (state->current_function->return_type->base_type ==
+                    GLSL_TYPE_VOID) {
+            YYLTYPE loc = this->get_location();
+
+            /* The ARB_shading_language_420pack, GLSL ES 3.0, and GLSL 4.20
+             * specs add a clarification:
+             *
+             *    "A void function can only use return without a return argument, even if
+             *     the return argument has void type. Return statements only accept values:
+             *
+             *         void func1() { }
+             *         void func2() { return func1(); } // illegal return statement"
+             */
+            _mesa_glsl_error(& loc, state,
+                             "void functions can only use `return' without a "
+                             "return argument");
+         }

          inst = new(ctx) ir_return(ret);
       } else {
-- 
1.8.1.5
[Mesa-dev] [PATCH 0/7] Beginnings of ARB_shading_language_420pack
I'm on vacation for the next week, so in case anyone else wants to finish off ARB_shading_language_420pack, here are the tests and patches I've done so far. They cover

 - Swizzles on scalars
 - .length() method of matrices and vectors
 - gl_{Max,Min}ProgramTexelOffset built-in constants (needs a piglit test)
 - Implicit conversion of return values
 - Non-constant expression initializers of const variables
 - and a GLSL spec clarification tacked on at the end

Thanks,
Matt
Re: [Mesa-dev] [PATCH 5/5] i965 gen7: use SURFACE_STATE fields to select render level/layer
On Fri, May 24, 2013 at 2:17 PM, Paul Berry stereotype...@gmail.com wrote:
> On 22 May 2013 20:00, Jordan Justen jljus...@gmail.com wrote:
>> On Wed, May 22, 2013 at 3:56 PM, Eric Anholt e...@anholt.net wrote:
>>> Jordan Justen jordan.l.jus...@intel.com writes:
>>>> -   surf[0] = BRW_SURFACE_2D << BRW_SURFACE_TYPE_SHIFT |
>>>> +   switch (gl_target) {
>>>> +   case GL_TEXTURE_CUBE_MAP_ARRAY:
>>>> +   case GL_TEXTURE_CUBE_MAP:
>>>> +      surftype = BRW_SURFACE_2D;
>>>> +      is_array = true;
>>>> +      depth *= 6;
>>>> +      break;
>>>> +   default:
>>>> +      surftype = translate_tex_target(gl_target);
>>>> +      is_array = _mesa_tex_target_is_array(gl_target);
>>>> +      break;
>>>> +   }
>>>
>>> Why the conversion of cubes to arrays? It looks from mentions in the render target write message section's mention of RTAI that cubes are supported.
>>
>> Hmm. Good catch. I think I started implementing this in brw_wm_surface_state.c, so I was looking at what would be needed for the older gens. It looks like pre-gen6, that cube-arrays were not supported in the surface_state. I'm not sure right now why I extended that to include converting non-array cubes to 2d-arrays as well.
>>
>> Anyway, I'll investigate cleaning this up for gen7, since that is what we are starting with.
>
> When Jordan was first working on this feature, he asked me to help debug it, and I found by reading simulator source code that SURFACE_STATE's minimum array element field is ignored for cube surfaces (in direct contradiction to the hw docs). Fortunately, treating the surface as an array is an effective workaround, since for render targets there is effectively no difference between a cube map and an ordinary array with a 6x higher depth.

I guess I forgot this discussion of ours. I spent some time trying to get BRW_SURFACE_CUBE working (again, I suppose), with no luck.

Anyway... Either of you good for an r-b on this second version of patch 5 in this series then?

-Jordan
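The workaround discussed in this thread (treating a cube map render target as a 2D array with six times the depth) can be illustrated with a short C sketch. The helper names are hypothetical and not part of the i965 driver; the layer formula `cube * 6 + face` follows the usual GL layer-face ordering for cube map arrays and is stated here as an assumption about how the array layers would be addressed.

```c
#include <assert.h>

/* Number of faces per cube map. */
#define TOY_CUBE_FACES 6u

/* Depth of the 2D array surface that stands in for num_cubes cube maps
 * (1 for a plain cube map). Hypothetical helper, mirroring "depth *= 6"
 * in the quoted patch.
 */
static unsigned
toy_array_depth_for_cubes(unsigned num_cubes)
{
   return num_cubes * TOY_CUBE_FACES;
}

/* Array layer that corresponds to a given face of a given cube,
 * assuming layer-face ordering (all six faces of cube 0 first).
 */
static unsigned
toy_array_layer(unsigned cube, unsigned face)
{
   assert(face < TOY_CUBE_FACES);
   return cube * TOY_CUBE_FACES + face;
}
```

Under that assumption a single cube map becomes a 6-layer array, and face 2 of the second cube in a cube map array lands on layer 8.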
Re: [Mesa-dev] [PATCH] i965: Mask the cut index based on the index buffer type in 3DSTATE_VF.
On 05/23/2013 03:46 PM, Kenneth Graunke wrote:
> According to the documentation:
>
>    The Cut Index is compared to the fetched (and possibly-sign-extended) vertex index, and if these values are equal, the current primitive topology is terminated. Note that, for index buffers < 32bpp, it is possible to set the Cut Index to a (large) value that will never match a sign-extended vertex index.
>
> This suggests that we should not set the value to 0xFFFFFFFF for unsigned byte or short index buffers, but rather 0xFF or 0xFFFF.
>
> Fixes sporadic failures in the ES 3 instanced_arrays_primitive_restart conformance test when run in combination with other tests.
>
> No Piglit regressions.
>
> Cc: Ian Romanick i...@freedesktop.org
> Cc: Paul Berry stereotype...@gmail.com
> Signed-off-by: Kenneth Graunke kenn...@whitecape.org

NAK on this patch. It looks like 0x133700ff is not supposed to match 0xff in GL_UNSIGNED_BYTE mode.

I think I've found a bunch more bugs. Going to write some tests and new patches...
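The difference behind the NAK can be made concrete with a small C sketch for the GL_UNSIGNED_BYTE case. Both helpers are hypothetical and model only the comparison, not the 3DSTATE_VF programming; the "full compare" variant assumes the fetched index is zero-extended to 32 bits and compared against the unmodified restart index, which is one reading of the behaviour described in the reply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fetched 8-bit index is widened and compared against the full 32-bit
 * restart index (assumed reading of the expected GL behaviour).
 */
static bool
toy_restart_full_compare(uint32_t restart_index, uint8_t fetched)
{
   return (uint32_t) fetched == restart_index;
}

/* Restart index is first masked to the width of the index type, as the
 * NAK'd patch proposed for unsigned byte index buffers.
 */
static bool
toy_restart_masked_compare(uint32_t restart_index, uint8_t fetched)
{
   return (uint32_t) fetched == (restart_index & 0xFFu);
}
```

With a restart index of 0x133700ff and a fetched byte of 0xff, the masked variant reports a match while the full compare does not, which is the mismatch the reply points out.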