Re: [Mesa-dev] [v4 09/10] egl: definitions for EXT_image_dma_buf_import

2013-05-24 Thread Pohjolainen, Topi
On Thu, May 23, 2013 at 09:40:09PM -0700, Chad Versace wrote:
 On 05/02/2013 12:08 AM, Topi Pohjolainen wrote:
 As specified in:
 
 http://www.khronos.org/registry/egl/extensions/EXT/EGL_EXT_image_dma_buf_import.txt
 
 Checking for the valid fourcc values is left for drivers avoiding
 dependency to drm header files here.
 
 v2:
 - enforce EGL_NO_CONTEXT
 
 v3:
 - declare the extension as EGL (not GLES)
 
 v4:
 - do not update eglext.h manually but rely on update from
   Khronos instead
 
 Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
 ---
   src/egl/main/eglapi.c |  7 -
   src/egl/main/egldisplay.h |  1 +
   src/egl/main/eglimage.c   | 76 
  +++
   src/egl/main/eglimage.h   | 15 ++
   src/egl/main/eglmisc.c|  1 +
   5 files changed, 99 insertions(+), 1 deletion(-)
 
 diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
 index bcc5465..2355d45 100644
 --- a/src/egl/main/eglapi.c
 +++ b/src/egl/main/eglapi.c
 @@ -1310,7 +1310,12 @@ eglCreateImageKHR(EGLDisplay dpy, EGLContext ctx, 
 EGLenum target,
  _EGL_CHECK_DISPLAY(disp, EGL_NO_IMAGE_KHR, drv);
  if (!disp->Extensions.KHR_image_base)
 RETURN_EGL_EVAL(disp, EGL_NO_IMAGE_KHR);
 -   if (!context && ctx != EGL_NO_CONTEXT)
 +
 +   /**
 +* If target is EGL_LINUX_DMA_BUF_EXT, dpy must be a valid display,
 +*  ctx must be EGL_NO_CONTEXT...
 +*/
 +   if (ctx != EGL_NO_CONTEXT && (!context || target == EGL_LINUX_DMA_BUF_EXT))
 RETURN_EGL_ERROR(disp, EGL_BAD_CONTEXT, EGL_NO_IMAGE_KHR);
 
  img = drv->API.CreateImageKHR(drv,
 diff --git a/src/egl/main/egldisplay.h b/src/egl/main/egldisplay.h
 index 4b33470..5a21f78 100644
 --- a/src/egl/main/egldisplay.h
 +++ b/src/egl/main/egldisplay.h
 @@ -115,6 +115,7 @@ struct _egl_extensions
 
  EGLBoolean EXT_create_context_robustness;
  EGLBoolean EXT_buffer_age;
 +   EGLBoolean EXT_image_dma_buf_import;
   };
 
 
 diff --git a/src/egl/main/eglimage.c b/src/egl/main/eglimage.c
 index bfae709..1cede31 100644
 --- a/src/egl/main/eglimage.c
 +++ b/src/egl/main/eglimage.c
 @@ -93,6 +93,82 @@ _eglParseImageAttribList(_EGLImageAttribs *attrs, 
 _EGLDisplay *dpy,
attrs->PlaneWL = val;
break;
 
 +  case EGL_LINUX_DRM_FOURCC_EXT:
 + attrs->DMABufFourCC.Value = val;
 + attrs->DMABufFourCC.IsPresent = EGL_TRUE;
 + break;
 +  case EGL_DMA_BUF_PLANE0_FD_EXT:
 + attrs->DMABufPlaneFds[0].Value = val;
 + attrs->DMABufPlaneFds[0].IsPresent = EGL_TRUE;
 + break;
 +  case EGL_DMA_BUF_PLANE0_OFFSET_EXT:
 + attrs->DMABufPlaneOffsets[0].Value = val;
 + attrs->DMABufPlaneOffsets[0].IsPresent = EGL_TRUE;
 + break;
 +  case EGL_DMA_BUF_PLANE0_PITCH_EXT:
 + attrs->DMABufPlanePitches[0].Value = val;
 + attrs->DMABufPlanePitches[0].IsPresent = EGL_TRUE;
 + break;
 +  case EGL_DMA_BUF_PLANE1_FD_EXT:
 + attrs->DMABufPlaneFds[1].Value = val;
 + attrs->DMABufPlaneFds[1].IsPresent = EGL_TRUE;
 + break;
 +  case EGL_DMA_BUF_PLANE1_OFFSET_EXT:
 + attrs->DMABufPlaneOffsets[1].Value = val;
 + attrs->DMABufPlaneOffsets[1].IsPresent = EGL_TRUE;
 + break;
 +  case EGL_DMA_BUF_PLANE1_PITCH_EXT:
 + attrs->DMABufPlanePitches[1].Value = val;
 + attrs->DMABufPlanePitches[1].IsPresent = EGL_TRUE;
 + break;
 +  case EGL_DMA_BUF_PLANE2_FD_EXT:
 + attrs->DMABufPlaneFds[2].Value = val;
 + attrs->DMABufPlaneFds[2].IsPresent = EGL_TRUE;
 + break;
 +  case EGL_DMA_BUF_PLANE2_OFFSET_EXT:
 + attrs->DMABufPlaneOffsets[2].Value = val;
 + attrs->DMABufPlaneOffsets[2].IsPresent = EGL_TRUE;
 + break;
 +  case EGL_DMA_BUF_PLANE2_PITCH_EXT:
 + attrs->DMABufPlanePitches[2].Value = val;
 + attrs->DMABufPlanePitches[2].IsPresent = EGL_TRUE;
 + break;
 +  case EGL_YUV_COLOR_SPACE_HINT_EXT:
 + if (val != EGL_ITU_REC601_EXT || val != EGL_ITU_REC709_EXT ||
 + val != EGL_ITU_REC2020_EXT) {
 
 This should be `val != X && val != Y && val != Z`.
 
 +err = EGL_BAD_ATTRIBUTE;
 + } else {
 +attrs->DMABufYuvColorSpaceHint.Value = val;
 +attrs->DMABufYuvColorSpaceHint.IsPresent = EGL_TRUE;
 + }
 + break;
 +  case EGL_SAMPLE_RANGE_HINT_EXT:
 + if (val != EGL_YUV_FULL_RANGE_EXT || val != 
 EGL_YUV_NARROW_RANGE_EXT) {
 +err = EGL_BAD_ATTRIBUTE;
 
 Again, s/||/&&/. Also, there is a tab above, but all the surrounding code
 uses spaces.
 
 + } else {
 +attrs->DMABufSampleRangeHint.Value = val;
 +attrs->DMABufSampleRangeHint.IsPresent = EGL_TRUE;
 + }
 + break;
 +  case EGL_YUV_CHROMA_HORIZONTAL_SITING_HINT_EXT:
 + if (val != EGL_YUV_CHROMA_SITING_0_EXT ||
 + val != EGL_YUV_CHROMA_SITING_0_5_EXT) {
 +err = 
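
The review point about `||` versus `&&` in the hint validation can be shown with a small standalone sketch. The enum names and values below are made up for illustration and are not the real EGL tokens; only the shape of the two conditions is taken from the patch and the review.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the EGL_ITU_REC*_EXT tokens. The values are
 * arbitrary and chosen only for this sketch. */
enum color_space_hint {
   HINT_REC601 = 1,
   HINT_REC709 = 2,
   HINT_REC2020 = 3
};

/* Form used in the patch: no value can equal all three constants at once,
 * so at least one "!=" is always true and every input is rejected. */
bool
is_rejected_with_or(int val)
{
   return val != HINT_REC601 || val != HINT_REC709 || val != HINT_REC2020;
}

/* Form suggested in the review: reject only values that match none of the
 * accepted constants. */
bool
is_rejected_with_and(int val)
{
   return val != HINT_REC601 && val != HINT_REC709 && val != HINT_REC2020;
}
```

With the `||` form even a valid hint is rejected, so the attribute could never be set; with the `&&` form only unknown values are rejected.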

[Mesa-dev] [PATCH] glsl linker: Initialize member variable interface_namespace.

2013-05-24 Thread Vinson Lee
Fixes Uninitialized pointer field defect reported by Coverity.

Signed-off-by: Vinson Lee v...@freedesktop.org
---
 src/glsl/lower_named_interface_blocks.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/glsl/lower_named_interface_blocks.cpp 
b/src/glsl/lower_named_interface_blocks.cpp
index eba667a..922cc02 100644
--- a/src/glsl/lower_named_interface_blocks.cpp
+++ b/src/glsl/lower_named_interface_blocks.cpp
@@ -72,7 +72,8 @@ public:
hash_table *interface_namespace;
 
flatten_named_interface_blocks_declarations(void *mem_ctx)
-  : mem_ctx(mem_ctx)
+  : mem_ctx(mem_ctx),
+interface_namespace(NULL)
{
}
 
-- 
1.8.2.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Error compiling mesa 9.1.1

2013-05-24 Thread Divick Kishore
Hi Matt,

 From the build in your path, it looks like you might be trying to do
 an out-of-tree build. I don't remember if that completely worked with
 9.1.

 I just untarred 9.1.3 and did

 libtoolize --force
 ./autogen.sh --with-dri-drivers=i965,swrast
 --with-gallium-drivers=swrast --enable-glx-tls
 --with-egl-platforms=x11 --enable-gles1 --enable-gles2
 --enable-gallium-egl --disable-glu
 make -jX

 and it built.

Thanks for your update. It builds fine with 9.1.3 but not with 9.1.1.
I will simply start using 9.1.3.

Thanks again,
Regards,
Divick


[Mesa-dev] [PATCH 2/3] st/vdpau: remove vlCreateHTAB from surface functions

2013-05-24 Thread Christian König
From: Christian König christian.koe...@amd.com

Signed-off-by: Christian König christian.koe...@amd.com
---
 src/gallium/state_trackers/vdpau/surface.c |9 -
 1 file changed, 9 deletions(-)

diff --git a/src/gallium/state_trackers/vdpau/surface.c 
b/src/gallium/state_trackers/vdpau/surface.c
index 135eb85..bd11fc3 100644
--- a/src/gallium/state_trackers/vdpau/surface.c
+++ b/src/gallium/state_trackers/vdpau/surface.c
@@ -54,11 +54,6 @@ vlVdpVideoSurfaceCreate(VdpDevice device, VdpChromaType 
chroma_type,
   goto inv_size;
}
 
-   if (!vlCreateHTAB()) {
-  ret = VDP_STATUS_RESOURCES;
-  goto no_htab;
-   }
-
p_surf = CALLOC(1, sizeof(vlVdpSurface));
if (!p_surf) {
   ret = VDP_STATUS_RESOURCES;
@@ -110,7 +105,6 @@ inv_device:
FREE(p_surf);
 
 no_res:
-no_htab:
 inv_size:
return ret;
 }
@@ -272,9 +266,6 @@ vlVdpVideoSurfacePutBitsYCbCr(VdpVideoSurface surface,
struct pipe_sampler_view **sampler_views;
unsigned i, j;
 
-   if (!vlCreateHTAB())
-  return VDP_STATUS_RESOURCES;
-
vlVdpSurface *p_surf = vlGetDataHTAB(surface);
if (!p_surf)
   return VDP_STATUS_INVALID_HANDLE;
-- 
1.7.9.5



[Mesa-dev] [PATCH 1/3] st/vdpau: invalidate the handles on destruction

2013-05-24 Thread Christian König
From: Christian König christian.koe...@amd.com

Fixes a problem with xbmc when switching channels.

Signed-off-by: Christian König christian.koe...@amd.com
---
 src/gallium/state_trackers/vdpau/decode.c  |1 +
 src/gallium/state_trackers/vdpau/device.c  |1 +
 src/gallium/state_trackers/vdpau/surface.c |2 ++
 3 files changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/vdpau/decode.c 
b/src/gallium/state_trackers/vdpau/decode.c
index 61b10e0..2ffd8dd 100644
--- a/src/gallium/state_trackers/vdpau/decode.c
+++ b/src/gallium/state_trackers/vdpau/decode.c
@@ -139,6 +139,7 @@ vlVdpDecoderDestroy(VdpDecoder decoder)
vldecoder->decoder->destroy(vldecoder->decoder);
pipe_mutex_unlock(vldecoder->device->mutex);
 
+   vlRemoveDataHTAB(decoder);
FREE(vldecoder);
 
return VDP_STATUS_OK;
diff --git a/src/gallium/state_trackers/vdpau/device.c 
b/src/gallium/state_trackers/vdpau/device.c
index c530f43..a829c27 100644
--- a/src/gallium/state_trackers/vdpau/device.c
+++ b/src/gallium/state_trackers/vdpau/device.c
@@ -166,6 +166,7 @@ vlVdpDeviceDestroy(VdpDevice device)
dev->context->destroy(dev->context);
vl_screen_destroy(dev->vscreen);
 
+   vlRemoveDataHTAB(device);
FREE(dev);
vlDestroyHTAB();
 
diff --git a/src/gallium/state_trackers/vdpau/surface.c 
b/src/gallium/state_trackers/vdpau/surface.c
index ad56125..135eb85 100644
--- a/src/gallium/state_trackers/vdpau/surface.c
+++ b/src/gallium/state_trackers/vdpau/surface.c
@@ -132,7 +132,9 @@ vlVdpVideoSurfaceDestroy(VdpVideoSurface surface)
   p_surf->video_buffer->destroy(p_surf->video_buffer);
pipe_mutex_unlock(p_surf->device->mutex);
 
+   vlRemoveDataHTAB(surface);
FREE(p_surf);
+
return VDP_STATUS_OK;
 }
 
-- 
1.7.9.5



[Mesa-dev] [PATCH 3/3] st/vdpau: destroy handle table only when it's empty

2013-05-24 Thread Christian König
From: Christian König christian.koe...@amd.com

Signed-off-by: Christian König christian.koe...@amd.com
---
 src/gallium/state_trackers/vdpau/htab.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/vdpau/htab.c 
b/src/gallium/state_trackers/vdpau/htab.c
index 39ff7be..8b809f2 100644
--- a/src/gallium/state_trackers/vdpau/htab.c
+++ b/src/gallium/state_trackers/vdpau/htab.c
@@ -55,7 +55,7 @@ void vlDestroyHTAB(void)
 {
 #ifdef VL_HANDLES
pipe_mutex_lock(htab_lock);
-   if (htab) {
+   if (htab && !handle_table_get_first_handle(htab)) {
   handle_table_destroy(htab);
   htab = NULL;
}
-- 
1.7.9.5
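
Taken together, the three st/vdpau patches invalidate a handle when its object is destroyed and tear the handle table down only once it is empty. A minimal sketch of that idea follows. The `handle_table` struct and the `table_*` functions are hypothetical and are not the real util/u_handle_table API; locking is left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 8   /* arbitrary capacity for the sketch */

/* Hypothetical miniature handle table; handle 0 is reserved as invalid. */
struct handle_table {
   void *slots[TABLE_SIZE];
   bool alive;
};

void
table_init(struct handle_table *t)
{
   for (size_t i = 0; i < TABLE_SIZE; i++)
      t->slots[i] = NULL;
   t->alive = true;
}

/* Returns a non-zero handle, or 0 if the table is full or torn down. */
unsigned
table_add(struct handle_table *t, void *obj)
{
   if (!t->alive || !obj)
      return 0;
   for (unsigned i = 0; i < TABLE_SIZE; i++) {
      if (!t->slots[i]) {
         t->slots[i] = obj;
         return i + 1;
      }
   }
   return 0;
}

/* Returns NULL for unknown or already invalidated handles. */
void *
table_get(const struct handle_table *t, unsigned handle)
{
   if (!t->alive || handle == 0 || handle > TABLE_SIZE)
      return NULL;
   return t->slots[handle - 1];
}

/* Invalidate the handle when its object is destroyed, so a stale handle
 * can no longer resolve to freed memory. */
void
table_remove(struct handle_table *t, unsigned handle)
{
   if (handle >= 1 && handle <= TABLE_SIZE)
      t->slots[handle - 1] = NULL;
}

/* Tear the table down only when no handle is left; returns true if it did. */
bool
table_destroy_if_empty(struct handle_table *t)
{
   for (size_t i = 0; i < TABLE_SIZE; i++) {
      if (t->slots[i])
         return false;
   }
   t->alive = false;
   return true;
}
```

In this sketch a lookup after `table_remove()` returns NULL, which corresponds to the `VDP_STATUS_INVALID_HANDLE` path in the surface code, and a second device that still owns handles keeps the table alive.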



Re: [Mesa-dev] [v4 06/10] intel: prepare for dri images having more than one plane

2013-05-24 Thread Pohjolainen, Topi
On Thu, May 23, 2013 at 09:39:57PM -0700, Chad Versace wrote:
 On 05/02/2013 12:08 AM, Topi Pohjolainen wrote:
 v2 (as advised by Eric):
 - use ARRAY_SIZE
 - re-use 'image_destroy' for cleaning up after failure
 - check directly the region pointer instead of the buffer object
   when determining if a region exists
 
 Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
 ---
   src/mesa/drivers/dri/intel/intel_screen.c | 103 
  +-
   1 file changed, 72 insertions(+), 31 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/intel/intel_screen.c 
 b/src/mesa/drivers/dri/intel/intel_screen.c
 index 4973441..d822b1c 100644
 --- a/src/mesa/drivers/dri/intel/intel_screen.c
 +++ b/src/mesa/drivers/dri/intel/intel_screen.c
 @@ -490,8 +490,14 @@ intel_create_image_from_texture(__DRIcontext *context, 
 int target,
   static void
   intel_destroy_image(__DRIimage *image)
   {
 -intel_region_release(image->regions[0]);
 -free(image);
 +   int i;
 +
 +   for (i = 0; i < ARRAY_SIZE(image->regions); ++i) {
 +  if (image->regions[i])
 + intel_region_release(image->regions[i]);
 +   }
 +
 +   free(image);
   }
 
   static __DRIimage *
 @@ -568,16 +574,22 @@ intel_query_image(__DRIimage *image, int attrib, int 
 *value)
   static __DRIimage *
   intel_dup_image(__DRIimage *orig_image, void *loaderPrivate)
   {
 +   int i;
  __DRIimage *image;
 
  image = calloc(1, sizeof *image);
  if (image == NULL)
 return NULL;
 
 -   intel_region_reference(&image->regions[0], orig_image->regions[0]);
 -   if (image->regions[0] == NULL) {
 -  free(image);
 -  return NULL;
 
 Pre-patch, this hunk returned NULL if orig_image->regions[0] was somehow NULL.

Good catch!

 
 +   for (i = 0; i  ARRAY_SIZE(image-regions); ++i) {
 +  if (!orig_image->regions[i])
 + break;
 
 Post-patch, if orig_image->regions[0] is NULL, this function no longer returns
 NULL because of the above break. To ensure that this patch doesn't regress
 anything, it needs to reproduce that behavior with
 `if (orig_image->regions[0] == NULL) return NULL`.
 Or, if you're confident (... I'm not, but maybe you are) that
 orig_image->regions[0] is never NULL, then assert that.

I maintained the old logic skipping the copy in case the source does not have
any regions. I cannot see how that would be possible but I'm not confident
enough to assert.

 
 +
 +  intel_region_reference(&image->regions[i], orig_image->regions[i]);
 +  if (image->regions[i] == NULL) {
 + intel_destroy_image(image);
 + return NULL;
 +  }
  }
 
  image-internal_format = orig_image-internal_format;
 @@ -646,47 +658,76 @@ intel_create_image_from_names(__DRIscreen *screen,
   }
 
   static __DRIimage *
 +intel_setup_image_from_fds(struct intel_screen *screen, int width, int 
 height,
 +   const struct intel_image_format *f,
 +   const int *fds, int num_fds, const int *strides,
 +   void *loaderPriv)
 +{
 
 I don't see the utility in extracting this code out of 
 intel_create_image_from_fds()
 into its own, similarly named function. In fact, it makes the code harder to 
 read.
 If no following patch reuses this function, then its body should remain in 
 its original location,
 intel_create_image_from_fds.

Agreed, it does complicate things.

 
 +   int i;
 +   __DRIimage *img;
 +
 +   if (f->nplanes == 1)
 +  img = intel_allocate_image(f->planes[0].dri_format, loaderPriv);
 +   else
 +  img = intel_allocate_image(__DRI_IMAGE_FORMAT_NONE, loaderPriv);
 +
 +   if (img == NULL)
 +  return NULL;
 +
 +   for (i = 0; i < num_fds; i++) {
 +  img->regions[i] = intel_region_alloc_for_fd(screen, f->planes[i].cpp,
 +   width >> f->planes[i].width_shift,
 +   height >> f->planes[i].height_shift,
 +   strides[i], fds[i], image);
 +
 +  if (img->regions[i] == NULL) {
 + intel_destroy_image(img);
 + return NULL;
 +  }
 +   }
 +
 +   intel_setup_image_from_dimensions(img);
 +
 +   return img;
 +}
 +
 +static __DRIimage *
   intel_create_image_from_fds(__DRIscreen *screen,
   int width, int height, int fourcc,
   int *fds, int num_fds, int *strides, int 
  *offsets,
   void *loaderPrivate)
   {
  struct intel_screen *intelScreen = screen->driverPrivate;
 -   struct intel_image_format *f;
 +   struct intel_image_format *f = intel_image_format_lookup(fourcc);
  __DRIimage *image;
  int i, index;
 
 -   if (fds == NULL || num_fds != 1)
 -  return NULL;
 -
 -   f = intel_image_format_lookup(fourcc);
 -   if (f == NULL)
 +   /**
 +* In case the image is to consist of multiple regions, there must be 
 exactly
 +* one region per plane.
 +*/
 +   if (fds == NULL || f == NULL || (num_fds > 1 && f->nplanes != num_fds))
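
The multi-plane handling discussed in this review (release every region that is set, and keep returning NULL when the source has no first region) can be sketched in isolation as below. The `region` and `image` types and the `image_*` functions are hypothetical stand-ins for `intel_region` and `__DRIimage`; the reference counting is reduced to a bare counter.

```c
#include <stdlib.h>

#define MAX_PLANES 3
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical reference-counted region, standing in for intel_region. */
struct region {
   int refcount;
};

/* Hypothetical image with one optional region per plane. */
struct image {
   struct region *regions[MAX_PLANES];
};

static void
region_release(struct region *r)
{
   r->refcount--;
}

/* Drop the reference held on every plane that is set, then free the image.
 * Safe to call on a partially initialised image. */
void
image_destroy(struct image *img)
{
   for (size_t i = 0; i < ARRAY_SIZE(img->regions); i++) {
      if (img->regions[i])
         region_release(img->regions[i]);
   }
   free(img);
}

/* Duplicate an image by taking a reference on each source region.
 * Returns NULL if the source has no first region, which keeps the
 * pre-patch behaviour raised in the review. */
struct image *
image_dup(const struct image *orig)
{
   struct image *img;

   if (orig->regions[0] == NULL)
      return NULL;

   img = calloc(1, sizeof *img);
   if (img == NULL)
      return NULL;

   for (size_t i = 0; i < ARRAY_SIZE(img->regions); i++) {
      if (!orig->regions[i])
         break;
      orig->regions[i]->refcount++;
      img->regions[i] = orig->regions[i];
   }

   return img;
}
```

Because `calloc()` zeroes the region array, `image_destroy()` can serve as the single cleanup path after a failure at any plane, which is the reuse suggested in the v2 notes.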
 

[Mesa-dev] [Bug 64952] New: Build failure in egl-static when using llvm-3.3

2013-05-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=64952

  Priority: medium
Bug ID: 64952
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: Build failure in egl-static when using llvm-3.3
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: gustav.peters...@gmail.com
  Hardware: Other
Status: NEW
   Version: git
 Component: Mesa core
   Product: Mesa

egl-static needs LLVM component IPO which is only included when building with
opencl

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 64952] Build failure in egl-static when using llvm-3.3

2013-05-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=64952

--- Comment #1 from Gustav Petersson gustav.peters...@gmail.com ---
Created attachment 79759
  --> https://bugs.freedesktop.org/attachment.cgi?id=79759&action=edit
Proposed patch for building egl-static



[Mesa-dev] [PATCH 1/2] st/glx: add null ctx check in glXDestroyContext()

2013-05-24 Thread Brian Paul
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64934
NOTE: This is a candidate for the stable branches.
---
 src/gallium/state_trackers/glx/xlib/glx_api.c |   22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/src/gallium/state_trackers/glx/xlib/glx_api.c 
b/src/gallium/state_trackers/glx/xlib/glx_api.c
index a66ebc8..c6dc134 100644
--- a/src/gallium/state_trackers/glx/xlib/glx_api.c
+++ b/src/gallium/state_trackers/glx/xlib/glx_api.c
@@ -1353,16 +1353,18 @@ glXQueryExtension( Display *dpy, int *errorBase, int 
*eventBase )
 PUBLIC void
 glXDestroyContext( Display *dpy, GLXContext ctx )
 {
-   GLXContext glxCtx = ctx;
-   (void) dpy;
-   MakeCurrent_PrevContext = 0;
-   MakeCurrent_PrevDrawable = 0;
-   MakeCurrent_PrevReadable = 0;
-   MakeCurrent_PrevDrawBuffer = 0;
-   MakeCurrent_PrevReadBuffer = 0;
-   XMesaDestroyContext( glxCtx->xmesaContext );
-   XMesaGarbageCollect();
-   free(glxCtx);
+   if (ctx) {
+  GLXContext glxCtx = ctx;
+  (void) dpy;
+  MakeCurrent_PrevContext = 0;
+  MakeCurrent_PrevDrawable = 0;
+  MakeCurrent_PrevReadable = 0;
+  MakeCurrent_PrevDrawBuffer = 0;
+  MakeCurrent_PrevReadBuffer = 0;
+  XMesaDestroyContext( glxCtx->xmesaContext );
+  XMesaGarbageCollect();
+  free(glxCtx);
+   }
 }
 
 
-- 
1.7.10.4
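
The shape of the fix, reduced to a standalone sketch: a destroy function that treats a null context as a no-op instead of dereferencing it. The `fake_context` type and both functions are hypothetical and only mirror the structure of the patch; the return value exists purely so the behaviour can be checked.

```c
#include <stdlib.h>

/* Hypothetical context type; not the real GLXContext. */
struct fake_context {
   int *resource;   /* stands in for the wrapped xmesaContext */
};

struct fake_context *
context_create(void)
{
   struct fake_context *ctx = calloc(1, sizeof *ctx);

   if (ctx == NULL)
      return NULL;

   ctx->resource = malloc(sizeof *ctx->resource);
   if (ctx->resource == NULL) {
      free(ctx);
      return NULL;
   }
   return ctx;
}

/* Returns 1 if a context was destroyed and 0 if ctx was NULL. Without the
 * check, reading ctx->resource on a NULL ctx would crash, as in the
 * reported SIGSEGV. */
int
context_destroy(struct fake_context *ctx)
{
   if (!ctx)
      return 0;

   free(ctx->resource);
   free(ctx);
   return 1;
}
```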



[Mesa-dev] [PATCH 2/2] xlib: add null ctx check in glXDestroyContext()

2013-05-24 Thread Brian Paul
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64934
NOTE: This is a candidate for the stable branches.
---
 src/mesa/drivers/x11/fakeglx.c |   22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/x11/fakeglx.c b/src/mesa/drivers/x11/fakeglx.c
index c7fb327..031c305 100644
--- a/src/mesa/drivers/x11/fakeglx.c
+++ b/src/mesa/drivers/x11/fakeglx.c
@@ -1533,16 +1533,18 @@ void _kw_ungrab_all( Display *dpy )
 static void
 Fake_glXDestroyContext( Display *dpy, GLXContext ctx )
 {
-   struct fake_glx_context *glxCtx = (struct fake_glx_context *) ctx;
-   (void) dpy;
-   MakeCurrent_PrevContext = 0;
-   MakeCurrent_PrevDrawable = 0;
-   MakeCurrent_PrevReadable = 0;
-   MakeCurrent_PrevDrawBuffer = 0;
-   MakeCurrent_PrevReadBuffer = 0;
-   XMesaDestroyContext( glxCtx->xmesaContext );
-   XMesaGarbageCollect(dpy);
-   free(glxCtx);
+   if (ctx) {
+  struct fake_glx_context *glxCtx = (struct fake_glx_context *) ctx;
+  (void) dpy;
+  MakeCurrent_PrevContext = 0;
+  MakeCurrent_PrevDrawable = 0;
+  MakeCurrent_PrevReadable = 0;
+  MakeCurrent_PrevDrawBuffer = 0;
+  MakeCurrent_PrevReadBuffer = 0;
+  XMesaDestroyContext( glxCtx->xmesaContext );
+  XMesaGarbageCollect(dpy);
+  free(glxCtx);
+   }
 }
 
 
-- 
1.7.10.4



[Mesa-dev] [Bug 64934] [llvmpipe] SIGSEGV src/gallium/state_trackers/glx/xlib/glx_api.c:1363

2013-05-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=64934

--- Comment #1 from Brian Paul bri...@vmware.com ---
I've posted patches to add null pointer checking in glXDestroyContext.

But the latest build of glxinfo wouldn't call glXDestroyContext with a null
context either.

In any case, I'm not sure why context creation is failing for you in this case.



Re: [Mesa-dev] [v4 10/10] egl: dri2: support for creating images out of dma buffers

2013-05-24 Thread Chad Versace

On 05/23/2013 10:15 PM, Pohjolainen, Topi wrote:

On Thu, May 23, 2013 at 09:39:30PM -0700, Chad Versace wrote:

When touching the src/egl/drivers/dri2 directory, use a commit subject
that looks like egl/dri2: STUFF, not egl: dri2: STUFF.


[snip]


+/**
+ * The spec says:
+ *
+ * If eglCreateImageKHR is successful for a EGL_LINUX_DMA_BUF_EXT target,
+ *  the EGL takes ownership of the file descriptor and is responsible for
+ *  closing it, which it may do at any time while the EGLDisplay is
+ *  initialized.
+ */
+static void
+dri2_take_dma_buf_ownership(const int *fds, unsigned num_fds)
+{
+   int already_closed[num_fds];
+   unsigned num_closed = 0;
+   unsigned i, j;
+
+   for (i = 0; i < num_fds; ++i) {
+  /**
+   * The same file descriptor can be referenced multiple times in case more
+   * than one plane is found in the same buffer, just with a different
+   * offset.
+   */
+  for (j = 0; j < num_closed; ++j) {
+ if (already_closed[j] == fds[i])


The condition above has undefined behavior, ...



There is the explicit counter 'num_closed' telling how many valid elements there
are in 'already_closed'.


My mistake.
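
The point settled above, that only the first `num_closed` entries of `already_closed` are ever read, can be seen in a standalone sketch of the same de-duplication. `collect_unique_fds` is a hypothetical helper written for this note; it gathers the distinct descriptors instead of closing them, so that it can be exercised without real file descriptors.

```c
#include <stddef.h>

/* Copies each distinct descriptor from fds[] into unique[] once, in
 * first-seen order, and returns how many entries were written.
 * unique[] must have room for num_fds entries. Only the first num_unique
 * slots are ever read, so the array needs no prior initialisation. */
size_t
collect_unique_fds(const int *fds, size_t num_fds, int *unique)
{
   size_t num_unique = 0;

   for (size_t i = 0; i < num_fds; i++) {
      size_t j;

      /* The same descriptor may back several planes at different offsets. */
      for (j = 0; j < num_unique; j++) {
         if (unique[j] == fds[i])
            break;
      }

      if (j == num_unique)
         unique[num_unique++] = fds[i];
   }

   return num_unique;
}
```

A caller taking ownership would then call `close()` exactly once on each of the returned entries.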


Re: [Mesa-dev] [PATCH 2/2] xlib: add null ctx check in glXDestroyContext()

2013-05-24 Thread Jose Fonseca


- Original Message -
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64934
 NOTE: This is a candidate for the stable branches.
 ---
  src/mesa/drivers/x11/fakeglx.c |   22 --
  1 file changed, 12 insertions(+), 10 deletions(-)
 
 diff --git a/src/mesa/drivers/x11/fakeglx.c b/src/mesa/drivers/x11/fakeglx.c
 index c7fb327..031c305 100644
 --- a/src/mesa/drivers/x11/fakeglx.c
 +++ b/src/mesa/drivers/x11/fakeglx.c
 @@ -1533,16 +1533,18 @@ void _kw_ungrab_all( Display *dpy )
  static void
  Fake_glXDestroyContext( Display *dpy, GLXContext ctx )
  {
 -   struct fake_glx_context *glxCtx = (struct fake_glx_context *) ctx;
 -   (void) dpy;
 -   MakeCurrent_PrevContext = 0;
 -   MakeCurrent_PrevDrawable = 0;
 -   MakeCurrent_PrevReadable = 0;
 -   MakeCurrent_PrevDrawBuffer = 0;
 -   MakeCurrent_PrevReadBuffer = 0;
 -   XMesaDestroyContext( glxCtx->xmesaContext );
 -   XMesaGarbageCollect(dpy);
 -   free(glxCtx);
 +   if (ctx) {
 +  struct fake_glx_context *glxCtx = (struct fake_glx_context *) ctx;
 +  (void) dpy;
 +  MakeCurrent_PrevContext = 0;
 +  MakeCurrent_PrevDrawable = 0;
 +  MakeCurrent_PrevReadable = 0;
 +  MakeCurrent_PrevDrawBuffer = 0;
 +  MakeCurrent_PrevReadBuffer = 0;
 +  XMesaDestroyContext( glxCtx->xmesaContext );
 +  XMesaGarbageCollect(dpy);
 +  free(glxCtx);
 +   }
  }
  
  
 --
 1.7.10.4
 
 


Reviewed-by: Jose Fonseca jfons...@vmware.com


Re: [Mesa-dev] [PATCH 1/4] gallium: Add support for multiple viewports

2013-05-24 Thread Brian Paul

Just some minor formatting nits below...


On 05/23/2013 02:33 PM, Zack Rusin wrote:

Gallium supported only a single viewport/scissor combination. This
commit changes the interface to allow us to add support for multiple
viewports/scissors.

Signed-off-by: Zack Rusin za...@vmware.com
---
  src/gallium/auxiliary/cso_cache/cso_context.c   |   37 +++
  src/gallium/auxiliary/cso_cache/cso_context.h   |9 +++---
  src/gallium/auxiliary/draw/draw_context.c   |6 ++--
  src/gallium/auxiliary/draw/draw_context.h   |5 +--
  src/gallium/auxiliary/hud/hud_context.c |6 ++--
  src/gallium/auxiliary/postprocess/pp_run.c  |6 ++--
  src/gallium/auxiliary/tgsi/tgsi_scan.c  |6 
  src/gallium/auxiliary/tgsi/tgsi_scan.h  |1 +
  src/gallium/auxiliary/tgsi/tgsi_strings.c   |3 +-
  src/gallium/auxiliary/util/u_blit.c |   12 
  src/gallium/auxiliary/util/u_blitter.c  |8 ++---
  src/gallium/auxiliary/util/u_gen_mipmap.c   |6 ++--
  src/gallium/auxiliary/vl/vl_compositor.c|4 +--
  src/gallium/auxiliary/vl/vl_idct.c  |4 +--
  src/gallium/auxiliary/vl/vl_matrix_filter.c |2 +-
  src/gallium/auxiliary/vl/vl_mc.c|2 +-
  src/gallium/auxiliary/vl/vl_median_filter.c |2 +-
  src/gallium/auxiliary/vl/vl_zscan.c |2 +-
  src/gallium/docs/source/context.rst |8 +++--
  src/gallium/drivers/freedreno/freedreno_state.c |   10 +++---
  src/gallium/drivers/galahad/glhd_context.c  |   16 +-
  src/gallium/drivers/i915/i915_state.c   |   12 +---
  src/gallium/drivers/identity/id_context.c   |   22 --
  src/gallium/drivers/ilo/ilo_state.c |   14 +
  src/gallium/drivers/llvmpipe/lp_screen.c|2 ++
  src/gallium/drivers/llvmpipe/lp_state_clip.c|   20 ++--
  src/gallium/drivers/noop/noop_state.c   |   14 +
  src/gallium/drivers/nv30/nv30_draw.c|2 +-
  src/gallium/drivers/nv30/nv30_state.c   |   14 +
  src/gallium/drivers/nv50/nv50_state.c   |   16 +-
  src/gallium/drivers/nvc0/nvc0_state.c   |   14 +
  src/gallium/drivers/r300/r300_context.c |2 +-
  src/gallium/drivers/r300/r300_state.c   |   16 +-
  src/gallium/drivers/r600/evergreen_state.c  |5 +--
  src/gallium/drivers/r600/r600_state.c   |7 +++--
  src/gallium/drivers/r600/r600_state_common.c|9 +++---
  src/gallium/drivers/radeonsi/si_state.c |   14 +
  src/gallium/drivers/rbug/rbug_context.c |   22 --
  src/gallium/drivers/softpipe/sp_screen.c|2 ++
  src/gallium/drivers/softpipe/sp_state_clip.c|   16 +-
  src/gallium/drivers/svga/svga_pipe_misc.c   |   18 ++-
  src/gallium/drivers/svga/svga_swtnl_state.c |2 +-
  src/gallium/drivers/trace/tr_context.c  |   28 +
  src/gallium/include/pipe/p_context.h|   10 +++---
  src/gallium/include/pipe/p_defines.h|3 +-
  src/gallium/include/pipe/p_shader_tokens.h  |3 +-
  src/gallium/include/pipe/p_state.h  |1 +
  src/gallium/state_trackers/vega/renderer.c  |   10 +++---
  src/gallium/state_trackers/xa/xa_renderer.c |2 +-
  src/gallium/state_trackers/xorg/xorg_renderer.c |2 +-
  src/gallium/tests/graw/fs-test.c|2 +-
  src/gallium/tests/graw/graw_util.h  |2 +-
  src/gallium/tests/graw/gs-test.c|2 +-
  src/gallium/tests/graw/quad-sample.c|2 +-
  src/gallium/tests/graw/shader-leak.c|2 +-
  src/gallium/tests/graw/tri-gs.c |2 +-
  src/gallium/tests/graw/tri-instanced.c  |2 +-
  src/gallium/tests/graw/vs-test.c|2 +-
  src/gallium/tests/trivial/quad-tex.c|2 +-
  src/gallium/tests/trivial/tri.c |2 +-
  src/mesa/state_tracker/st_atom_scissor.c|2 +-
  src/mesa/state_tracker/st_atom_viewport.c   |2 +-
  src/mesa/state_tracker/st_cb_bitmap.c   |6 ++--
  src/mesa/state_tracker/st_cb_clear.c|6 ++--
  src/mesa/state_tracker/st_cb_drawpixels.c   |6 ++--
  src/mesa/state_tracker/st_cb_drawtex.c  |6 ++--
  src/mesa/state_tracker/st_draw_feedback.c   |2 +-
  67 files changed, 290 insertions(+), 217 deletions(-)





diff --git a/src/gallium/drivers/galahad/glhd_context.c 
b/src/gallium/drivers/galahad/glhd_context.c
index a73a3ad..849c12e 100644
--- a/src/gallium/drivers/galahad/glhd_context.c
+++ b/src/gallium/drivers/galahad/glhd_context.c
@@ -524,25 +524,27 @@ galahad_context_set_polygon_stipple(struct pipe_context 
*_pipe,
  }

  static void
-galahad_context_set_scissor_state(struct pipe_context *_pipe,

Re: [Mesa-dev] [PATCH 1/4] gallium: Add support for multiple viewports

2013-05-24 Thread Brian Paul

On 05/23/2013 03:02 PM, Roland Scheidegger wrote:

On 23.05.2013 22:33, Zack Rusin wrote:

Gallium supported only a single viewport/scissor combination. This
commit changes the interface to allow us to add support for multiple
viewports/scissors.

Signed-off-by: Zack Rusin za...@vmware.com
---
diff --git a/src/gallium/include/pipe/p_context.h 
b/src/gallium/include/pipe/p_context.h
index d1130bc..eaaa043 100644
--- a/src/gallium/include/pipe/p_context.h
+++ b/src/gallium/include/pipe/p_context.h
@@ -211,11 +211,13 @@ struct pipe_context {
 void (*set_polygon_stipple)( struct pipe_context *,
const struct pipe_poly_stipple * );

-   void (*set_scissor_state)( struct pipe_context *,
-  const struct pipe_scissor_state * );
+   void (*set_scissor_states)( struct pipe_context *,
+   unsigned num_scissors,
+   const struct pipe_scissor_state * );

-   void (*set_viewport_state)( struct pipe_context *,
-   const struct pipe_viewport_state * );
+   void (*set_viewport_states)( struct pipe_context *,
+unsigned num_viewports,
+const struct pipe_viewport_state *);

 void (*set_fragment_sampler_views)(struct pipe_context *,
unsigned num_views,
diff --git a/src/gallium/include/pipe/p_defines.h 
b/src/gallium/include/pipe/p_defines.h
index bb86968..00f0a37 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -507,7 +507,8 @@ enum pipe_cap {
 PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER = 80,
 PIPE_CAP_QUERY_PIPELINE_STATISTICS = 81,
 PIPE_CAP_TEXTURE_BORDER_COLOR_QUIRK = 82,
-   PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE = 83
+   PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE = 83,
+   PIPE_CAP_MULTIPLE_VIEWPORTS = 84

Would it be better if this were PIPE_CAP_MAX_VIEWPORTS instead? Though I
guess there's no real need right now to support anything but 16 (as
that's needed by d3d10/11, and is the minimum supported value for GL,
though GL would allow for more), so I don't have a strong opinion on that.


I second this suggestion.

-Brian
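
One way to picture the difference between a boolean cap and a maximum-count cap is the sketch below, where a driver reports how many viewports it supports and a setter taking a count and an array clamps to that limit. All names here (`driver`, `viewport`, `set_viewports`) are hypothetical and are not the gallium interface; the limit of 16 is the figure mentioned in the discussion.

```c
#define MAX_VIEWPORTS 16   /* the figure mentioned in the thread */

/* Hypothetical, simplified viewport; not struct pipe_viewport_state. */
struct viewport {
   float x, y, width, height;
};

/* Hypothetical driver state. */
struct driver {
   unsigned max_viewports;   /* what an integer cap would report */
   unsigned num_viewports;   /* how many are currently bound */
   struct viewport viewports[MAX_VIEWPORTS];
};

/* Binds up to 'count' viewports and returns how many were accepted.
 * A single-viewport driver reports max_viewports == 1 and keeps only the
 * first entry. */
unsigned
set_viewports(struct driver *drv, unsigned count, const struct viewport *vps)
{
   unsigned limit = drv->max_viewports < MAX_VIEWPORTS
                       ? drv->max_viewports : MAX_VIEWPORTS;
   unsigned n = count < limit ? count : limit;

   for (unsigned i = 0; i < n; i++)
      drv->viewports[i] = vps[i];

   drv->num_viewports = n;
   return n;
}
```

With an integer cap the caller learns the actual limit; a boolean cap would only say "more than one" and leave the number implicit.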




Re: [Mesa-dev] [PATCH 03/13] gallium: Introduce 32-bit bytewise format names

2013-05-24 Thread Jose Fonseca


- Original Message -
 Michel Dänzer mic...@daenzer.net writes:
   For packed formats such as RGBA, the order used in these patches
   (which is what I suggested in my proposal) matches the order humans use
   for digits of numbers, as well as the Mesa formats. That seems more
   important to me than 'matching' any non-packed formats (which only makes
   sense if one presumes little endian byte order).
  
  I'm sorry I didn't notice this was what you proposed earlier..
  
  However I don't think that consistency with Mesa formats is strong:
  Mesa formats even have both ways with their *_REV variants.
  
  So I prefer that we keep existing low->high bit/byte/word/etc naming
  convention for gallium formats.
 
  Fair enough.
 
 
  I do appreciate all the work and thought that went on this series so far,
  and I really want to get this in.  So here is a summary of what's needed
  from my POV to get this in mergeable state:
  
  - leave r8g8b8a8 variants alone (ie, as endianess independent)
  
  - fix the util_format_description::is_array == TRUE &&
  util_format_description::is_bitmask == TRUE ambiguity, either:
  
 - add new rgba/argb formats for endianess dependent formats
  
  - add a new field on util_format_description (e.g., native_endian)
  for the endianess formats (**)
  
 - or add rgba/argb #define
  
  - make sure util_format_description::is_array is set for r8g8b8a8
  variants, but util_format_description::is_bitmask is not
 
  Is there any drawback to the latter approach for formats where it's
  feasible? If not, it might reduce code duplication somewhat.
 
  (**) Actually, I'm surprised that formats like
  PIPE_FORMAT_B5G6R5_UNORM aren't busted on big-endian without this, as
  they haven't been converted yet, so they need to be handled precisely
  as before, right? I suppose everything was busted before, so no net
  change here
 
  Exactly. :\
 
 Yeah.  I deliberately left those for future work :-)  The 8-bits-per-channel
 formats were more interesting when trying out your idea, both because they're
 used more and because they have both the array and int interpretations.
 But it was actually because of things like B5G6R5 that I used int formats
 like RGBA and made .8.8.8.8 an alias of them, rather than the other way
 around.  The layout of B5G6R5 on little-endian targets is AIUI:
 
 76543210 76543210
 GGGB RGGG
 
 Reversing the components gives:
 
 76543210 76543210
 GGGR BGGG
 
 But on a big-endian target the blue first format is:
 
 76543210 76543210
 BGGG GGGR
 
 and reversing the components gives:
 
 76543210 76543210
 RRRRRGGG GGGBBBBB
 
 So, unlike for the .8.8.8.8 formats, a plain swizzle doesn't give you
 the other endianness.  You need to do something more complicated.
 Little-endian support for the big-endian arrangement, and vice-versa,
 would be pretty involved.
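
The layouts above can be checked with a minimal sketch like the following. The helper names (pack_bgr565, pack_rgb565, store_le16, store_be16) are hypothetical and are not gallium functions; the sketch only assumes that the first-named component sits in the least significant bits of the 16-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* Pack 5/6/5 components into one 16-bit integer, blue in the low bits. */
static uint16_t pack_bgr565(unsigned r, unsigned g, unsigned b)
{
    assert(r < 32 && g < 64 && b < 32);
    return (uint16_t)((r << 11) | (g << 5) | b);
}

/* The same packing with red and blue exchanged (a plain component swizzle). */
static uint16_t pack_rgb565(unsigned r, unsigned g, unsigned b)
{
    return pack_bgr565(b, g, r);
}

/* Write a 16-bit value to memory in little-endian byte order. */
static void store_le16(uint8_t out[2], uint16_t v)
{
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)(v >> 8);
}

/* Write a 16-bit value to memory in big-endian byte order. */
static void store_be16(uint8_t out[2], uint16_t v)
{
    out[0] = (uint8_t)(v >> 8);
    out[1] = (uint8_t)(v & 0xff);
}
```

For pure red (r = 31), the blue-first integer is 0xF800, stored as bytes 00 F8 on little-endian and F8 00 on big-endian. The swizzled integer is 0x001F, stored as 1F 00 on little-endian, which is not the big-endian byte sequence.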
 
 So in practice I thought we'd want the first two formats on little-endian
 targets and the last two on big-endian targets.  I thought that would mean
 _replacing_ the current B5G6R5 and R5G6B5 formats with BGR565 and RGB565
 formats that match the endian-specific arrangements above.  These two int
 formats wouldn't be aliases of a target-independent representation.
 
 So the patch was in some ways an experiment to see how easy it would be
 to make gallium treat a common format like .8.8.8.8 as an int, in the hope
 that if it was easy, things like .5.6.5 would be less of a special case.
 And in the end it all seemed pretty natural, although of course that's
 from a newbie's perspective.  You both know the code much better than I do.

Sorry for the delay.

I agree that with non-array formats, like B5G6R5 and R5G6B5, replacing them 
with endian-variant BGR565 and RGB565 makes a lot of sense (as the swapped 
version will probably never be needed). 

But I'm not sure about RGBA8 variants...

 - On one hand, it is often more efficient to read/write them as 32-bit 
integers than as an array of bytes.
 
 - On the other hand, it is easier to think of them as an array of bytes than 
as an integer quantity.

One thing is clear: a given format can't be both. Either it is an 
endianness-variant packed color or an endianness-invariant array color. The 
choices are to force rgba8 to be one kind, to force it to be the other kind, 
or to have different format enums for each.
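
A minimal sketch of the two interpretations, assuming hypothetical helper names (none of these are gallium functions) and assuming the packed view puts R in the most significant byte:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Array view: four bytes in memory order R, G, B, A on every host. */
static void store_rgba8_array(uint8_t out[4], uint8_t r, uint8_t g,
                              uint8_t b, uint8_t a)
{
    out[0] = r; out[1] = g; out[2] = b; out[3] = a;
}

/* Packed view: one 32-bit integer with R in the most significant byte. */
static uint32_t pack_rgba8888(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)r << 24) | ((uint32_t)g << 16) | ((uint32_t)b << 8) | a;
}

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Store a packed value using the host's native byte order. */
static void store_native32(uint8_t out[4], uint32_t v)
{
    memcpy(out, &v, sizeof(v));
}
```

The array store gives the same bytes everywhere, while the native store of the packed integer matches it only on a big-endian host, which is why one enum cannot describe both.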

Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 64959] New: Cannot build against EGL without X11

2013-05-24 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=64959

  Priority: medium
Bug ID: 64959
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: Cannot build against EGL without X11
  Severity: normal
Classification: Unclassified
OS: All
  Reporter: r...@burtonini.com
  Hardware: Other
Status: NEW
   Version: unspecified
 Component: Mesa core
   Product: Mesa

I'm building wayland 1.1 and weston 1.1 in an environment without any X headers
against Mesa 9.0.2:

| In file included from
/data/poky-master/tmp/sysroots/atom-pc/usr/include/EGL/egl.h:36:0,
|  from
/data/poky-master/tmp/work/core2-poky-linux/weston/1.1.0-r0/weston-1.1.0/src/gl-renderer.h:30,
|  from
/data/poky-master/tmp/work/core2-poky-linux/weston/1.1.0-r0/weston-1.1.0/src/gl-renderer.c:35:
| /data/poky-master/tmp/sysroots/atom-pc/usr/include/EGL/eglplatform.h:118:22:
fatal error: X11/Xlib.h: No such file or directory

Currently I'm working around this by adding -DMESA_EGL_NO_X11_HEADERS to CFLAGS
but that's clearly not the right thing to do.



[Mesa-dev] [PATCH 4/7] radeonsi: Add support for TGSI TXF opcode

2013-05-24 Thread Michel Dänzer
From: Michel Dänzer michel.daen...@amd.com

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/drivers/radeonsi/radeonsi_shader.c | 63 --
 1 file changed, 50 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c 
b/src/gallium/drivers/radeonsi/radeonsi_shader.c
index df7aa1e..b82f885 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
@@ -934,7 +934,7 @@ static void tex_fetch_args(
}
 
/* Pack LOD */
-   if (opcode == TGSI_OPCODE_TXL)
+   if (opcode == TGSI_OPCODE_TXL || opcode == TGSI_OPCODE_TXF)
address[count++] = coords[3];
 
    if (count > 16) {
@@ -949,26 +949,56 @@ static void tex_fetch_args(
 );
}
 
-   /* Pad to power of two vector */
-   while (count < util_next_power_of_two(count))
-       address[count++] = LLVMGetUndef(LLVMInt32TypeInContext(gallivm->context));
-
-   emit_data->args[0] = lp_build_gather_values(gallivm, address, count);
-
    /* Resource */
    emit_data->args[1] = si_shader_ctx->resources[emit_data->inst->Src[1].Register.Index];
 
-   /* Sampler */
-   emit_data->args[2] = si_shader_ctx->samplers[emit_data->inst->Src[1].Register.Index];
+   if (opcode == TGSI_OPCODE_TXF) {
+       /* add tex offsets */
+       if (inst->Texture.NumOffsets) {
+           struct lp_build_context *uint_bld = &bld_base->uint_bld;
+           struct lp_build_tgsi_soa_context *bld = lp_soa_context(bld_base);
+           const struct tgsi_texture_offset * off = inst->TexOffsets;
+
+           assert(inst->Texture.NumOffsets == 1);
+
+           address[0] =
+               lp_build_add(uint_bld, address[0],
+                            bld->immediates[off->Index][off->SwizzleX]);
+           if (num_coords > 1)
+               address[1] =
+                   lp_build_add(uint_bld, address[1],
+                                bld->immediates[off->Index][off->SwizzleY]);
+           if (num_coords > 2)
+               address[2] =
+                   lp_build_add(uint_bld, address[2],
+                                bld->immediates[off->Index][off->SwizzleZ]);
+       }
 
-   /* Dimensions */
-   emit_data->args[3] = lp_build_const_int32(bld_base->base.gallivm, target);
+       emit_data->dst_type = LLVMVectorType(
+           LLVMInt32TypeInContext(bld_base->base.gallivm->context),
+           4);
 
-   emit_data->arg_count = 4;
+       emit_data->arg_count = 3;
+   } else {
+       /* Sampler */
+       emit_data->args[2] = si_shader_ctx->samplers[emit_data->inst->Src[1].Register.Index];
 
-   emit_data->dst_type = LLVMVectorType(
+       emit_data->dst_type = LLVMVectorType(
            LLVMFloatTypeInContext(bld_base->base.gallivm->context),
            4);
+
+       emit_data->arg_count = 4;
+   }
+
+   /* Dimensions */
+   emit_data->args[emit_data->arg_count - 1] =
+       lp_build_const_int32(bld_base->base.gallivm, target);
+
+   /* Pad to power of two vector */
+   while (count < util_next_power_of_two(count))
+       address[count++] = LLVMGetUndef(LLVMInt32TypeInContext(gallivm->context));
+
+   emit_data->args[0] = lp_build_gather_values(gallivm, address, count);
 }
 
 static void build_tex_intrinsic(const struct lp_build_tgsi_action * action,
@@ -999,6 +1029,12 @@ static const struct lp_build_tgsi_action txb_action = {
    .intr_name = "llvm.SI.sampleb."
 };
 
+static const struct lp_build_tgsi_action txf_action = {
+   .fetch_args = tex_fetch_args,
+   .emit = build_tex_intrinsic,
+   .intr_name = "llvm.SI.imageload."
+};
+
 static const struct lp_build_tgsi_action txl_action = {
.fetch_args = tex_fetch_args,
.emit = build_tex_intrinsic,
@@ -1243,6 +1279,7 @@ int si_pipe_shader_create(
 
    bld_base->op_actions[TGSI_OPCODE_TEX] = tex_action;
    bld_base->op_actions[TGSI_OPCODE_TXB] = txb_action;
+   bld_base->op_actions[TGSI_OPCODE_TXF] = txf_action;
    bld_base->op_actions[TGSI_OPCODE_TXL] = txl_action;
    bld_base->op_actions[TGSI_OPCODE_TXP] = tex_action;
 
-- 
1.8.3.rc3



[Mesa-dev] [PATCH 1/7] radeonsi: Fix hardware state for dual source blending

2013-05-24 Thread Michel Dänzer
From: Michel Dänzer michel.daen...@amd.com

Set up CB_SHADER_MASK register according to pixel shader exports, and enable
some minimal state for colour buffer 1 in case dual source blending is used.
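
As a rough illustration of deriving the mask from the exports rather than from the framebuffer state, here is a minimal sketch. The function name and its array argument are hypothetical, not part of the driver; the only thing taken from the patch is that each colour buffer owns one 4-bit nibble, set with 0xf << (4 * cbuf).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Build a CB_SHADER_MASK-style value from the list of colour buffers the
 * pixel shader exports to. Each colour buffer owns one nibble, one bit per
 * channel, so a full RGBA export to buffer n sets 0xf << (4 * n).
 */
static uint32_t cb_shader_mask_from_exports(const unsigned *cbufs, size_t count)
{
    uint32_t mask = 0;

    for (size_t i = 0; i < count; i++) {
        assert(cbufs[i] < 8); /* eight nibbles fit in 32 bits */
        mask |= 0xfu << (4 * cbufs[i]);
    }
    return mask;
}
```

A shader exporting to buffers 0 and 1 (as dual source blending does) would get 0xff even when only one colour buffer is bound, which a loop over nr_cbufs would not produce.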

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/drivers/radeonsi/radeonsi_shader.c |  5 +
 src/gallium/drivers/radeonsi/radeonsi_shader.h |  1 +
 src/gallium/drivers/radeonsi/si_state.c| 16 ++--
 src/gallium/drivers/radeonsi/si_state_draw.c   |  1 +
 4 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c 
b/src/gallium/drivers/radeonsi/radeonsi_shader.c
index 484f7ec..3e023f8 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
@@ -461,6 +461,8 @@ static void si_llvm_init_export_args(struct lp_build_tgsi_context *bld_base,
            else
                si_shader_ctx->shader->spi_shader_col_format |=
                    V_028714_SPI_SHADER_32_ABGR << (4 * cbuf);
+
+           si_shader_ctx->shader->cb_shader_mask |= 0xf << (4 * cbuf);
        }
    }
 
@@ -806,6 +808,7 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
 
        si_shader_ctx->shader->spi_shader_col_format |=
            V_028714_SPI_SHADER_32_ABGR;
+       si_shader_ctx->shader->cb_shader_mask |= S_02823C_OUTPUT0_ENABLE(0xf);
    }
 
/* Specify whether the EXEC mask represents the valid mask */
@@ -830,6 +833,8 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
 
            si_shader_ctx->shader->spi_shader_col_format |=
                si_shader_ctx->shader->spi_shader_col_format << 4;
+           si_shader_ctx->shader->cb_shader_mask |=
+               si_shader_ctx->shader->cb_shader_mask << 4;
        }
 
        last_args[3] = lp_build_const_int32(base->gallivm, V_008DFC_SQ_EXP_MRT);
diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.h 
b/src/gallium/drivers/radeonsi/radeonsi_shader.h
index 01b8b5d..33e81c7 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.h
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.h
@@ -140,6 +140,7 @@ struct si_pipe_shader {
    unsigned num_vgprs;
    unsigned spi_ps_input_ena;
    unsigned spi_shader_col_format;
+   unsigned cb_shader_mask;
    unsigned sprite_coord_enable;
    unsigned so_strides[4];
union si_shader_key key;
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index dec535c..e7dc792 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -1728,6 +1728,12 @@ static void si_cb(struct r600_context *rctx, struct si_pm4_state *pm4,
    si_pm4_set_reg(pm4, R_028C70_CB_COLOR0_INFO + cb * 0x3C, color_info);
    si_pm4_set_reg(pm4, R_028C74_CB_COLOR0_ATTRIB + cb * 0x3C, color_attrib);
 
+   /* set CB_COLOR1_INFO for possible dual-src blending */
+   if (state->nr_cbufs == 1) {
+       assert(cb == 0);
+       si_pm4_set_reg(pm4, R_028C70_CB_COLOR0_INFO + 1 * 0x3C, color_info);
+   }
+
/* Determine pixel shader export format */
max_comp_size = si_colorformat_max_comp_size(format);
if (ntype == V_028C70_NUMBER_SRGB ||
@@ -1735,6 +1741,9 @@ static void si_cb(struct r600_context *rctx, struct si_pm4_state *pm4,
         max_comp_size <= 10) ||
        (ntype == V_028C70_NUMBER_FLOAT && max_comp_size <= 16)) {
        rctx->export_16bpc |= 1 << cb;
+       /* set SPI_SHADER_COL_FORMAT for possible dual-src blending */
+       if (state->nr_cbufs == 1)
+           rctx->export_16bpc |= 1 << 1;
}
 }
 
@@ -1811,7 +1820,7 @@ static void si_set_framebuffer_state(struct pipe_context 
*ctx,
 {
struct r600_context *rctx = (struct r600_context *)ctx;
struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
-   uint32_t shader_mask, tl, br;
+   uint32_t tl, br;
int tl_x, tl_y, br_x, br_y;
 
if (pm4 == NULL)
@@ -1832,10 +1841,6 @@ static void si_set_framebuffer_state(struct pipe_context *ctx,
    assert(!(rctx->export_16bpc & ~0xff));
    si_db(rctx, pm4, state);
 
-   shader_mask = 0;
-   for (int i = 0; i < state->nr_cbufs; i++) {
-       shader_mask |= 0xf << (i * 4);
-   }
    tl_x = 0;
    tl_y = 0;
    br_x = state->width;
@@ -1854,7 +1859,6 @@ static void si_set_framebuffer_state(struct pipe_context *ctx,
    si_pm4_set_reg(pm4, R_028208_PA_SC_WINDOW_SCISSOR_BR, br);
    si_pm4_set_reg(pm4, R_028200_PA_SC_WINDOW_OFFSET, 0x00000000);
si_pm4_set_reg(pm4, 

[Mesa-dev] [PATCH 3/7] radeonsi: Use tgsi_util_get_texture_coord_dim()

2013-05-24 Thread Michel Dänzer
From: Michel Dänzer michel.daen...@amd.com

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/drivers/radeonsi/radeonsi_shader.c | 32 ++
 1 file changed, 7 insertions(+), 25 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c 
b/src/gallium/drivers/radeonsi/radeonsi_shader.c
index 3e023f8..df7aa1e 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
@@ -40,6 +40,7 @@
 #include "tgsi/tgsi_info.h"
 #include "tgsi/tgsi_parse.h"
 #include "tgsi/tgsi_scan.h"
+#include "tgsi/tgsi_util.h"
 #include "tgsi/tgsi_dump.h"
 
 #include "radeonsi_pipe.h"
@@ -863,6 +864,8 @@ static void tex_fetch_args(
    unsigned target = inst->Texture.Texture;
    LLVMValueRef coords[4];
    LLVMValueRef address[16];
+   int ref_pos;
+   unsigned num_coords = tgsi_util_get_texture_coord_dim(target, &ref_pos);
unsigned count = 0;
unsigned chan;
 
@@ -896,11 +899,10 @@ static void tex_fetch_args(
case TGSI_TEXTURE_SHADOW1D_ARRAY:
case TGSI_TEXTURE_SHADOW2D:
case TGSI_TEXTURE_SHADOWRECT:
-   address[count++] = coords[2];
-   break;
case TGSI_TEXTURE_SHADOWCUBE:
case TGSI_TEXTURE_SHADOW2D_ARRAY:
-   address[count++] = coords[3];
+   assert(ref_pos >= 0);
+   address[count++] = coords[ref_pos];
break;
case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
address[count++] = lp_build_emit_fetch(bld_base, inst, 1, 0);
@@ -908,30 +910,10 @@ static void tex_fetch_args(
 
/* Pack texture coordinates */
address[count++] = coords[0];
-   switch (target) {
-   case TGSI_TEXTURE_2D:
-   case TGSI_TEXTURE_2D_ARRAY:
-   case TGSI_TEXTURE_3D:
-   case TGSI_TEXTURE_CUBE:
-   case TGSI_TEXTURE_RECT:
-   case TGSI_TEXTURE_SHADOW2D:
-   case TGSI_TEXTURE_SHADOWRECT:
-   case TGSI_TEXTURE_SHADOW2D_ARRAY:
-   case TGSI_TEXTURE_SHADOWCUBE:
-   case TGSI_TEXTURE_2D_MSAA:
-   case TGSI_TEXTURE_2D_ARRAY_MSAA:
-   case TGSI_TEXTURE_CUBE_ARRAY:
-   case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
+   if (num_coords > 1)
address[count++] = coords[1];
-   }
-   switch (target) {
-   case TGSI_TEXTURE_3D:
-   case TGSI_TEXTURE_CUBE:
-   case TGSI_TEXTURE_SHADOWCUBE:
-   case TGSI_TEXTURE_CUBE_ARRAY:
-   case TGSI_TEXTURE_SHADOWCUBE_ARRAY:
+   if (num_coords > 2)
address[count++] = coords[2];
-   }
 
/* Pack array slice */
switch (target) {
-- 
1.8.3.rc3



[Mesa-dev] [PATCH 0/7] radeonsi: GLSL 1.30 support

2013-05-24 Thread Michel Dänzer
This series fixes a couple of problems in preparation, then adds the missing
functionality for GLSL 1.30 and finally enables it. This enables around 800
more piglit tests, keeping the overall passrate about the same as before.

[PATCH 1/7] radeonsi: Fix hardware state for dual source blending
[PATCH 2/7] radeonsi: Make border colour state handling safe for integer textures
[PATCH 3/7] radeonsi: Use tgsi_util_get_texture_coord_dim()
[PATCH 4/7] radeonsi: Add support for TGSI TXF opcode
[PATCH 5/7] radeonsi: Handle TGSI TXQ opcode
[PATCH 6/7] radeonsi: Handle TGSI_SEMANTIC_CLIPDIST
[PATCH 7/7] radeonsi: Enable GLSL 1.30


[Mesa-dev] [PATCH 2/7] radeonsi: Make border colour state handling safe for integer textures

2013-05-24 Thread Michel Dänzer
From: Michel Dänzer michel.daen...@amd.com

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/drivers/radeonsi/radeonsi_pipe.h |  2 +-
 src/gallium/drivers/radeonsi/si_state.c  | 45 
 2 files changed, 27 insertions(+), 20 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.h 
b/src/gallium/drivers/radeonsi/radeonsi_pipe.h
index 3274049..67cb14b 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_pipe.h
+++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.h
@@ -87,7 +87,7 @@ struct si_pipe_sampler_view {
 
 struct si_pipe_sampler_state {
    uint32_t val[4];
-   float    border_color[4];
+   uint32_t border_color[4];
 };
 
 struct si_cs_shader_state {
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index e7dc792..4556be6 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2273,11 +2273,31 @@ static void si_sampler_view_destroy(struct pipe_context 
*ctx,
FREE(resource);
 }
 
+static bool wrap_mode_uses_border_color(unsigned wrap, bool linear_filter)
+{
+   return wrap == PIPE_TEX_WRAP_CLAMP_TO_BORDER ||
+          wrap == PIPE_TEX_WRAP_MIRROR_CLAMP_TO_BORDER ||
+          (linear_filter &&
+           (wrap == PIPE_TEX_WRAP_CLAMP ||
+            wrap == PIPE_TEX_WRAP_MIRROR_CLAMP));
+}
+
+static bool sampler_state_needs_border_color(const struct pipe_sampler_state *state)
+{
+   bool linear_filter = state->min_img_filter != PIPE_TEX_FILTER_NEAREST ||
+                        state->mag_img_filter != PIPE_TEX_FILTER_NEAREST;
+
+   return (state->border_color.ui[0] || state->border_color.ui[1] ||
+           state->border_color.ui[2] || state->border_color.ui[3]) &&
+          (wrap_mode_uses_border_color(state->wrap_s, linear_filter) ||
+           wrap_mode_uses_border_color(state->wrap_t, linear_filter) ||
+           wrap_mode_uses_border_color(state->wrap_r, linear_filter));
+}
+
 static void *si_create_sampler_state(struct pipe_context *ctx,
 const struct pipe_sampler_state *state)
 {
    struct si_pipe_sampler_state *rstate = CALLOC_STRUCT(si_pipe_sampler_state);
-   union util_color uc;
    unsigned aniso_flag_offset = state->max_anisotropy > 1 ? 2 : 0;
unsigned border_color_type;
 
@@ -2285,20 +2305,10 @@ static void *si_create_sampler_state(struct 
pipe_context *ctx,
return NULL;
}
 
-   util_pack_color(state->border_color.f, PIPE_FORMAT_A8R8G8B8_UNORM, &uc);
-   switch (uc.ui) {
-   case 0x000000FF:
-       border_color_type = V_008F3C_SQ_TEX_BORDER_COLOR_OPAQUE_BLACK;
-       break;
-   case 0x00000000:
-       border_color_type = V_008F3C_SQ_TEX_BORDER_COLOR_TRANS_BLACK;
-       break;
-   case 0xFFFFFFFF:
-       border_color_type = V_008F3C_SQ_TEX_BORDER_COLOR_OPAQUE_WHITE;
-       break;
-   default: /* Use border color pointer */
+   if (sampler_state_needs_border_color(state))
        border_color_type = V_008F3C_SQ_TEX_BORDER_COLOR_REGISTER;
-   }
+   else
+       border_color_type = V_008F3C_SQ_TEX_BORDER_COLOR_TRANS_BLACK;
 
    rstate->val[0] = (S_008F30_CLAMP_X(si_tex_wrap(state->wrap_s)) |
                      S_008F30_CLAMP_Y(si_tex_wrap(state->wrap_t)) |
@@ -2317,7 +2327,7 @@ static void *si_create_sampler_state(struct pipe_context *ctx,
    rstate->val[3] = S_008F3C_BORDER_COLOR_TYPE(border_color_type);
 
    if (border_color_type == V_008F3C_SQ_TEX_BORDER_COLOR_REGISTER) {
-       memcpy(rstate->border_color, state->border_color.f,
+       memcpy(rstate->border_color, state->border_color.ui,
               sizeof(rstate->border_color));
}
 
@@ -2440,11 +2450,8 @@ static struct si_pm4_state *si_bind_sampler(struct r600_context *rctx, unsigned
            }
 
            for (j = 0; j < 4; j++) {
-               union fi border_color;
-
-               border_color.f = rstates[i]->border_color[j];
                border_color_table[4 * rctx->border_color_offset + j] =
-                   util_le32_to_cpu(border_color.i);
+                   util_le32_to_cpu(rstates[i]->border_color[j]);
            }
 
            rstates[i]->val[3] &= C_008F3C_BORDER_COLOR_PTR;
-- 
1.8.3.rc3



[Mesa-dev] [PATCH 5/7] radeonsi: Handle TGSI TXQ opcode

2013-05-24 Thread Michel Dänzer
From: Michel Dänzer michel.daen...@amd.com

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/drivers/radeonsi/radeonsi_shader.c | 34 --
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c 
b/src/gallium/drivers/radeonsi/radeonsi_shader.c
index b82f885..572c665 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
@@ -889,8 +889,7 @@ static void tex_fetch_args(
if (opcode == TGSI_OPCODE_TXB)
address[count++] = coords[3];
 
-   if ((target == TGSI_TEXTURE_CUBE || target == TGSI_TEXTURE_SHADOWCUBE) &&
-       opcode != TGSI_OPCODE_TXQ)
+   if (target == TGSI_TEXTURE_CUBE || target == TGSI_TEXTURE_SHADOWCUBE)
        radeon_llvm_emit_prepare_cube_coords(bld_base, emit_data, coords);
 
/* Pack depth comparison value */
@@ -1017,6 +1016,30 @@ static void build_tex_intrinsic(const struct 
lp_build_tgsi_action * action,
LLVMReadNoneAttribute | LLVMNoUnwindAttribute);
 }
 
+static void txq_fetch_args(
+   struct lp_build_tgsi_context * bld_base,
+   struct lp_build_emit_data * emit_data)
+{
+   struct si_shader_context *si_shader_ctx = si_shader_context(bld_base);
+   const struct tgsi_full_instruction *inst = emit_data->inst;
+
+   /* Mip level */
+   emit_data->args[0] = lp_build_emit_fetch(bld_base, inst, 0, TGSI_CHAN_X);
+
+   /* Resource */
+   emit_data->args[1] = si_shader_ctx->resources[inst->Src[1].Register.Index];
+
+   /* Dimensions */
+   emit_data->args[2] = lp_build_const_int32(bld_base->base.gallivm,
+                                             inst->Texture.Texture);
+
+   emit_data->arg_count = 3;
+
+   emit_data->dst_type = LLVMVectorType(
+       LLVMInt32TypeInContext(bld_base->base.gallivm->context),
+       4);
+}
+
 static const struct lp_build_tgsi_action tex_action = {
.fetch_args = tex_fetch_args,
.emit = build_tex_intrinsic,
@@ -1041,6 +1064,12 @@ static const struct lp_build_tgsi_action txl_action = {
    .intr_name = "llvm.SI.samplel."
 };
 
+static const struct lp_build_tgsi_action txq_action = {
+   .fetch_args = txq_fetch_args,
+   .emit = build_tgsi_intrinsic_nomem,
+   .intr_name = "llvm.SI.resinfo"
+};
+
 static void create_meta_data(struct si_shader_context *si_shader_ctx)
 {
    struct gallivm_state *gallivm = si_shader_ctx->radeon_bld.soa.bld_base.base.gallivm;
@@ -1282,6 +1311,7 @@ int si_pipe_shader_create(
    bld_base->op_actions[TGSI_OPCODE_TXF] = txf_action;
    bld_base->op_actions[TGSI_OPCODE_TXL] = txl_action;
    bld_base->op_actions[TGSI_OPCODE_TXP] = tex_action;
+   bld_base->op_actions[TGSI_OPCODE_TXQ] = txq_action;
 
si_shader_ctx.radeon_bld.load_input = declare_input;
si_shader_ctx.radeon_bld.load_system_value = declare_system_value;
-- 
1.8.3.rc3



[Mesa-dev] [PATCH 6/7] radeonsi: Handle TGSI_SEMANTIC_CLIPDIST

2013-05-24 Thread Michel Dänzer
From: Michel Dänzer michel.daen...@amd.com

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/drivers/radeonsi/radeonsi_shader.c | 21 +
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c 
b/src/gallium/drivers/radeonsi/radeonsi_shader.c
index 572c665..f6fdfae 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
@@ -626,6 +626,7 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
    struct tgsi_parse_context *parse = &si_shader_ctx->parse;
LLVMValueRef args[9];
LLVMValueRef last_args[9] = { 0 };
+   unsigned semantic_name;
unsigned color_count = 0;
unsigned param_count = 0;
int depth_index = -1, stencil_index = -1;
@@ -669,9 +670,11 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
            continue;
        }
 
+       semantic_name = d->Semantic.Name;
+handle_semantic:
        for (index = d->Range.First; index <= d->Range.Last; index++) {
            /* Select the correct target */
-           switch(d->Semantic.Name) {
+           switch(semantic_name) {
            case TGSI_SEMANTIC_PSIZE:
                shader->vs_out_misc_write = 1;
                shader->vs_out_point_size = 1;
@@ -703,6 +706,11 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
                    color_count++;
                }
                break;
+           case TGSI_SEMANTIC_CLIPDIST:
+               shader->clip_dist_write |=
+                   d->Declaration.UsageMask << (d->Semantic.Index << 2);
+               target = V_008DFC_SQ_EXP_POS + 2 + d->Semantic.Index;
+               break;
            case TGSI_SEMANTIC_CLIPVERTEX:
                si_llvm_emit_clipvertex(bld_base, index);
                shader->clip_dist_write = 0xFF;
@@ -717,14 +725,14 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
                target = 0;
                fprintf(stderr,
                        "Warning: SI unhandled output type:%d\n",
-                       d->Semantic.Name);
+                       semantic_name);
            }
 
            si_llvm_init_export_args(bld_base, d, index, target, args);
 
            if (si_shader_ctx->type == TGSI_PROCESSOR_VERTEX ?
-               (d->Semantic.Name == TGSI_SEMANTIC_POSITION) :
-               (d->Semantic.Name == TGSI_SEMANTIC_COLOR)) {
+               (semantic_name == TGSI_SEMANTIC_POSITION) :
+               (semantic_name == TGSI_SEMANTIC_COLOR)) {
                if (last_args[0]) {
                    lp_build_intrinsic(base->gallivm->builder,
                                       "llvm.SI.export",
@@ -741,6 +749,11 @@ static void si_llvm_emit_epilogue(struct lp_build_tgsi_context * bld_base)
            }
 
        }
+
+       if (semantic_name == TGSI_SEMANTIC_CLIPDIST) {
+           semantic_name = TGSI_SEMANTIC_GENERIC;
+           goto handle_semantic;
+       }
    }
 
    if (depth_index >= 0 || stencil_index >= 0) {
-- 
1.8.3.rc3



[Mesa-dev] [PATCH 7/7] radeonsi: Enable GLSL 1.30

2013-05-24 Thread Michel Dänzer
From: Michel Dänzer michel.daen...@amd.com

Signed-off-by: Michel Dänzer michel.daen...@amd.com
---
 src/gallium/drivers/radeonsi/radeonsi_pipe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.c 
b/src/gallium/drivers/radeonsi/radeonsi_pipe.c
index b988e72..30254a8 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_pipe.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.c
@@ -364,7 +364,7 @@ static int r600_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
return 256;
 
case PIPE_CAP_GLSL_FEATURE_LEVEL:
-   return debug_get_bool_option("R600_GLSL130", FALSE) ? 130 : 120;
+   return 130;
 
/* Unsupported features. */
case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
-- 
1.8.3.rc3



Re: [Mesa-dev] [PATCH] i965: Mask the cut index based on the index buffer type in 3DSTATE_VF.

2013-05-24 Thread Kenneth Graunke

On 05/23/2013 04:14 PM, Jordan Justen wrote:

On Thu, May 23, 2013 at 3:46 PM, Kenneth Graunke kenn...@whitecape.org wrote:

According to the documentation: "The Cut Index is compared to the
fetched (and possibly-sign-extended) vertex index, and if these values
are equal, the current primitive topology is terminated.  Note that,
for index buffers < 32bpp, it is possible to set the Cut Index to a
(large) value that will never match a sign-extended vertex index."

This suggests that we should not set the value to 0xFFFFFFFF for
unsigned byte or short index buffers, but rather 0xFF or 0xFFFF.


I was wondering what the GL spec had to say about this situation. For
example, what should happen if the index is 0x100, and bytes are used.
Should it effectively disable prim-restart? Should it use 0xff, or
0x00? Unfortunately, I didn't find anything concrete.

Reviewed-by: Jordan Justen jordan.l.jus...@intel.com


You raise a good point.  If I set the cut index to 0x31337 and call 
DrawElements with GL_UNSIGNED_BYTE, should it reset when it sees 0x37 or not?


I'll have to write Piglit tests and find out what other implementations do.
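
For reference, the masking approach from the patch subject can be sketched as below. This is only an illustration of one possible behaviour (truncating the cut index to the index size), not what the GL spec requires, which is exactly the open question. The function name and the index_size parameter (in bytes) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Truncate the restart (cut) index to the width of the index type.
 * index_size is the size of one index in bytes: 1, 2 or 4.
 */
static uint32_t mask_cut_index(uint32_t cut_index, unsigned index_size)
{
    switch (index_size) {
    case 1:
        return cut_index & 0xffu;
    case 2:
        return cut_index & 0xffffu;
    default:
        assert(index_size == 4);
        return cut_index;
    }
}
```

With this masking, the default 0xFFFFFFFF becomes 0xFF for byte indices, and a cut index of 0x31337 would restart on 0x37. Whether that second case is desirable is what the Piglit tests would need to establish.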

Thanks for the excellent review!


[Mesa-dev] radeonsi compute improvements

2013-05-24 Thread Tom Stellard
Hi,

These patches, along with the associated LLVM changes, improve compute
support on radeonsi to the point where it can run a number of simple apps,
including the bitcoin mining program bfgminer.

Patch #4 re-introduces the r600_upload_const_buffer() function that was
removed in eb19163a4dd3d7bfeed63229820c926f99ed00d9.  However, using this
function from si_set_constant_buffer() causes a memory leak in X/Glamor,
which makes it impossible to complete a full piglit run.  I'm not sure what
the problem is, since it worked before the above-mentioned commit, but I'm
hoping someone can spot my mistake.

-Tom



[Mesa-dev] [PATCH 1/5] radeonsi/compute: Add missing PIPE_COMPUTE caps

2013-05-24 Thread Tom Stellard
From: Tom Stellard thomas.stell...@amd.com

---
 src/gallium/drivers/radeonsi/radeonsi_pipe.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_pipe.c 
b/src/gallium/drivers/radeonsi/radeonsi_pipe.c
index b988e72..7a79db3 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_pipe.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_pipe.c
@@ -579,6 +579,22 @@ static int r600_get_compute_param(struct pipe_screen 
*screen,
}
return sizeof(uint64_t);
 
+   case PIPE_COMPUTE_CAP_MAX_GLOBAL_SIZE:
+   if (ret) {
+   uint64_t *max_global_size = ret;
+   /* XXX: Not sure what to put here. */
+   *max_global_size = 2000000000;
+   }
+   return sizeof(uint64_t);
+
+   case PIPE_COMPUTE_CAP_MAX_MEM_ALLOC_SIZE:
+   if (ret) {
+   uint64_t max_global_size;
+   uint64_t *max_mem_alloc_size = ret;
+   r600_get_compute_param(screen, PIPE_COMPUTE_CAP_MAX_GLOBAL_SIZE, &max_global_size);
+   *max_mem_alloc_size = max_global_size / 4;
+   }
+   return sizeof(uint64_t);
default:
    fprintf(stderr, "unknown PIPE_COMPUTE_CAP %d\n", param);
return 0;
-- 
1.8.1.5



[Mesa-dev] [PATCH 3/5] radeonsi/compute: Implement un-binding of global buffers

2013-05-24 Thread Tom Stellard
From: Tom Stellard thomas.stell...@amd.com

---
 src/gallium/drivers/radeonsi/radeonsi_compute.c | 31 +++--
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c 
b/src/gallium/drivers/radeonsi/radeonsi_compute.c
index 1ae7d9b..3fb6eb1 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_compute.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c
@@ -5,6 +5,8 @@
 
 #include "radeon_llvm_util.h"
 
+#define MAX_GLOBAL_BUFFERS 20
+
 struct si_pipe_compute {
struct r600_context *ctx;
 
@@ -15,7 +17,7 @@ struct si_pipe_compute {
struct si_pipe_shader *kernels;
unsigned num_user_sgprs;
 
-struct si_pm4_state *pm4_buffers;
+struct pipe_resource *global_buffers[MAX_GLOBAL_BUFFERS];
 
 };
 
@@ -65,22 +67,18 @@ static void radeonsi_set_global_binding(
    unsigned i;
    struct r600_context *rctx = (struct r600_context*)ctx;
    struct si_pipe_compute *program = rctx->cs_shader_state.program;
-   struct si_pm4_state *pm4;
-
-   if (!program->pm4_buffers) {
-       program->pm4_buffers = CALLOC_STRUCT(si_pm4_state);
-   }
-   pm4 = program->pm4_buffers;
-   pm4->compute_pkt = true;
 
    if (!resources) {
+       for (i = first; i < first + n; i++) {
+           program->global_buffers[i] = NULL;
+       }
        return;
    }
 
    for (i = first; i < first + n; i++) {
-       uint64_t va = r600_resource_va(ctx->screen, resources[i]);
-       si_pm4_add_bo(pm4, (struct si_resource*)resources[i],
-           RADEON_USAGE_READWRITE);
+       uint64_t va;
+       program->global_buffers[i] = resources[i];
+       va = r600_resource_va(ctx->screen, resources[i]);
        memcpy(handles[i], &va, sizeof(va));
    }
 }
@@ -138,6 +136,16 @@ static void radeonsi_launch_grid(
    si_pm4_set_reg(pm4, R_00B824_COMPUTE_NUM_THREAD_Z,
        S_00B824_NUM_THREAD_FULL(block_layout[2]));
 
+   /* Global buffers */
+   for (i = 0; i < MAX_GLOBAL_BUFFERS; i++) {
+       struct si_resource *buffer =
+           (struct si_resource*)program->global_buffers[i];
+       if (!buffer) {
+           continue;
+       }
+       si_pm4_add_bo(pm4, buffer, RADEON_USAGE_READWRITE);
+   }
+
    /* XXX: This should be:
     * (number of compute units) * 4 * (waves per simd) - 1 */
    si_pm4_set_reg(pm4, R_00B82C_COMPUTE_MAX_WAVE_ID, 0x190 /* Default value */);
@@ -199,7 +207,6 @@ static void radeonsi_launch_grid(
si_pm4_inval_shader_cache(pm4);
    si_cmd_surface_sync(pm4, pm4->cp_coher_cntl);
 
-   si_pm4_emit(rctx, program->pm4_buffers);
si_pm4_emit(rctx, pm4);
 
 #if 0
-- 
1.8.1.5



[Mesa-dev] [PATCH 4/5] radeonsi/compute: Pass kernel arguments in a buffer

2013-05-24 Thread Tom Stellard
From: Tom Stellard thomas.stell...@amd.com

---
 src/gallium/drivers/radeonsi/r600_buffer.c  | 31 +
 src/gallium/drivers/radeonsi/radeonsi_compute.c | 26 ++---
 src/gallium/drivers/radeonsi/si_state.c | 29 +++
 3 files changed, 51 insertions(+), 35 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/r600_buffer.c 
b/src/gallium/drivers/radeonsi/r600_buffer.c
index cdf9988..87763c3 100644
--- a/src/gallium/drivers/radeonsi/r600_buffer.c
+++ b/src/gallium/drivers/radeonsi/r600_buffer.c
@@ -25,6 +25,8 @@
  *  Corbin Simpson mostawesomed...@gmail.com
  */
 
+#include <byteswap.h>
+
 #include "pipe/p_screen.h"
 #include "util/u_format.h"
 #include "util/u_math.h"
@@ -168,3 +170,32 @@ void r600_upload_index_buffer(struct r600_context *rctx,
    u_upload_data(rctx->uploader, 0, count * ib->index_size,
                  ib->user_buffer, &ib->offset, &ib->buffer);
 }
+
+void r600_upload_const_buffer(struct r600_context *rctx, struct si_resource **rbuffer,
+   const uint8_t *ptr, unsigned size,
+   uint32_t *const_offset)
+{
+   *rbuffer = NULL;
+
+   if (R600_BIG_ENDIAN) {
+   uint32_t *tmpPtr;
+   unsigned i;
+
+   if (!(tmpPtr = malloc(size))) {
+   R600_ERR("Failed to allocate BE swap buffer.\n");
+   return;
+   }
+
+   for (i = 0; i < size / 4; ++i) {
+   tmpPtr[i] = bswap_32(((uint32_t *)ptr)[i]);
+   }
+
+   u_upload_data(rctx->uploader, 0, size, tmpPtr, const_offset,
+   (struct pipe_resource**)rbuffer);
+
+   free(tmpPtr);
+   } else {
+   u_upload_data(rctx->uploader, 0, size, ptr, const_offset,
+   (struct pipe_resource**)rbuffer);
+   }
+}
diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c 
b/src/gallium/drivers/radeonsi/radeonsi_compute.c
index 3fb6eb1..035076d 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_compute.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c
@@ -91,8 +91,11 @@ static void radeonsi_launch_grid(
struct r600_context *rctx = (struct r600_context*)ctx;
struct si_pipe_compute *program = rctx->cs_shader_state.program;
struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
+   struct si_resource *input_buffer;
+   uint32_t input_offset = 0;
+   uint64_t input_va;
uint64_t shader_va;
-   unsigned arg_user_sgpr_count;
+   unsigned arg_user_sgpr_count = 2;
unsigned i;
struct si_pipe_shader *shader = &program->kernels[pc];
 
@@ -109,21 +112,16 @@ static void radeonsi_launch_grid(
si_pm4_inval_shader_cache(pm4);
si_cmd_surface_sync(pm4, pm4->cp_coher_cntl);
 
-   arg_user_sgpr_count = program->input_size / 4;
-   if (program->input_size % 4 != 0) {
-   arg_user_sgpr_count++;
-   }
+   /* Upload the input data */
+   r600_upload_const_buffer(rctx, &input_buffer, input,
+   program->input_size, &input_offset);
+   input_va = r600_resource_va(ctx->screen, (struct pipe_resource*)input_buffer);
+   input_va += input_offset;
 
-   /* XXX: We should store arguments in memory if we run out of user sgprs.
-*/
-   assert(arg_user_sgpr_count < 16);
+   si_pm4_add_bo(pm4, input_buffer, RADEON_USAGE_READ);
 
-   for (i = 0; i < arg_user_sgpr_count; i++) {
-   uint32_t *args = (uint32_t*)input;
-   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 +
-   (i * 4),
-   args[i]);
-   }
+   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0, input_va);
+   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 + 4, S_008F04_BASE_ADDRESS_HI (input_va >> 32) | S_008F04_STRIDE(0));
 
si_pm4_set_reg(pm4, R_00B810_COMPUTE_START_X, 0);
si_pm4_set_reg(pm4, R_00B814_COMPUTE_START_Y, 0);
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index dec535c..1e94f7e 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -24,8 +24,6 @@
  *  Christian König christian.koe...@amd.com
  */
 
-#include <byteswap.h>
-
 #include "util/u_memory.h"
 #include "util/u_framebuffer.h"
 #include "util/u_blitter.h"
@@ -2526,25 +2524,14 @@ static void si_set_constant_buffer(struct pipe_context *ctx, uint shader, uint i
ptr = input->user_buffer;
 
if (ptr) {
-   /* Upload the user buffer. */
-   if (R600_BIG_ENDIAN) {
-   uint32_t *tmpPtr;
-   unsigned i, size = input->buffer_size;
-
-   if (!(tmpPtr = malloc(size))) {
-   R600_ERR("Failed to allocate BE swap buffer.\n");
-

[Mesa-dev] [PATCH 2/5] radeonsi/compute: Support multiple kernels in a compute program

2013-05-24 Thread Tom Stellard
From: Tom Stellard thomas.stell...@amd.com

---
 src/gallium/drivers/radeonsi/radeonsi_compute.c | 27 -
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c 
b/src/gallium/drivers/radeonsi/radeonsi_compute.c
index e67d127..1ae7d9b 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_compute.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c
@@ -11,7 +11,8 @@ struct si_pipe_compute {
unsigned local_size;
unsigned private_size;
unsigned input_size;
-   struct si_pipe_shader shader;
+   unsigned num_kernels;
+   struct si_pipe_shader *kernels;
unsigned num_user_sgprs;
 
 struct si_pm4_state *pm4_buffers;
@@ -27,7 +28,7 @@ static void *radeonsi_create_compute_state(
CALLOC_STRUCT(si_pipe_compute);
const struct pipe_llvm_program_header *header;
const unsigned char *code;
-   LLVMModuleRef mod;
+   unsigned i;
 
header = cso->prog;
code = cso->prog + sizeof(struct pipe_llvm_program_header);
@@ -37,8 +38,15 @@ static void *radeonsi_create_compute_state(
program->private_size = cso->req_private_mem;
program->input_size = cso->req_input_mem;
 
-   mod = radeon_llvm_parse_bitcode(code, header->num_bytes);
-   si_compile_llvm(rctx, &program->shader, mod);
+   program->num_kernels = radeon_llvm_get_num_kernels(code,
+   header->num_bytes);
+   program->kernels = CALLOC(sizeof(struct si_pipe_shader),
+   program->num_kernels);
+   for (i = 0; i < program->num_kernels; i++) {
+   LLVMModuleRef mod = radeon_llvm_get_kernel_module(i, code,
+   header->num_bytes);
+   si_compile_llvm(rctx, &program->kernels[i], mod);
+   }
 
return program;
 }
@@ -88,6 +96,7 @@ static void radeonsi_launch_grid(
uint64_t shader_va;
unsigned arg_user_sgpr_count;
unsigned i;
+   struct si_pipe_shader *shader = &program->kernels[pc];
 
pm4-compute_pkt = true;
si_cmd_context_control(pm4);
@@ -133,8 +142,8 @@ static void radeonsi_launch_grid(
 * (number of compute units) * 4 * (waves per simd) - 1 */
si_pm4_set_reg(pm4, R_00B82C_COMPUTE_MAX_WAVE_ID, 0x190 /* Default 
value */);
 
-   shader_va = r600_resource_va(ctx->screen, (void *)program->shader.bo);
-   si_pm4_add_bo(pm4, program->shader.bo, RADEON_USAGE_READ);
+   shader_va = r600_resource_va(ctx->screen, (void *)shader->bo);
+   si_pm4_add_bo(pm4, shader->bo, RADEON_USAGE_READ);
si_pm4_set_reg(pm4, R_00B830_COMPUTE_PGM_LO, (shader_va >> 8) & 0xffffffff);
si_pm4_set_reg(pm4, R_00B834_COMPUTE_PGM_HI, shader_va >> 40);
 
@@ -143,13 +152,13 @@ static void radeonsi_launch_grid(
 * TIDIG_COMP_CNT.
 * XXX: The compiler should account for this.
 */
-   S_00B848_VGPRS((MAX2(3, program->shader.num_vgprs) - 1) / 4)
+   S_00B848_VGPRS((MAX2(3, shader->num_vgprs) - 1) / 4)
/* We always use at least 4 + arg_user_sgpr_count.  The 4 extra
 * sgprs are from TGID_X_EN, TGID_Y_EN, TGID_Z_EN, TG_SIZE_EN
 * XXX: The compiler should account for this.
 */
|  S_00B848_SGPRS(((MAX2(4 + arg_user_sgpr_count,
-   program->shader.num_sgprs)) - 1) / 8))
+   shader->num_sgprs)) - 1) / 8))
;
 
si_pm4_set_reg(pm4, R_00B84C_COMPUTE_PGM_RSRC2,
@@ -201,7 +210,7 @@ static void radeonsi_launch_grid(
 #endif
 
rctx->ws->cs_flush(rctx->cs, RADEON_FLUSH_COMPUTE, 0);
-   rctx->ws->buffer_wait(program->shader.bo->buf, 0);
+   rctx->ws->buffer_wait(shader->bo->buf, 0);
 
FREE(pm4);
 }
-- 
1.8.1.5



[Mesa-dev] [PATCH 5/5] radeonsi/compute: Upload work group, work item size in input buffer

2013-05-24 Thread Tom Stellard
From: Tom Stellard thomas.stell...@amd.com

---
 src/gallium/drivers/radeonsi/radeonsi_compute.c | 38 ++---
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c 
b/src/gallium/drivers/radeonsi/radeonsi_compute.c
index 035076d..3abf50b 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_compute.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c
@@ -91,9 +91,12 @@ static void radeonsi_launch_grid(
struct r600_context *rctx = (struct r600_context*)ctx;
struct si_pipe_compute *program = rctx->cs_shader_state.program;
struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
-   struct si_resource *input_buffer;
-   uint32_t input_offset = 0;
-   uint64_t input_va;
+   struct si_resource *kernel_args_buffer;
+   unsigned kernel_args_size;
+   unsigned num_work_size_bytes = 36;
+   uint32_t kernel_args_offset = 0;
+   uint32_t *kernel_args;
+   uint64_t kernel_args_va;
uint64_t shader_va;
unsigned arg_user_sgpr_count = 2;
unsigned i;
@@ -112,16 +115,29 @@ static void radeonsi_launch_grid(
si_pm4_inval_shader_cache(pm4);
si_cmd_surface_sync(pm4, pm4->cp_coher_cntl);
 
-   /* Upload the input data */
-   r600_upload_const_buffer(rctx, &input_buffer, input,
-   program->input_size, &input_offset);
-   input_va = r600_resource_va(ctx->screen, (struct pipe_resource*)input_buffer);
-   input_va += input_offset;
+   /* Upload the kernel arguments */
 
-   si_pm4_add_bo(pm4, input_buffer, RADEON_USAGE_READ);
+   /* The extra num_work_size_bytes are for work group / work item size information */
+   kernel_args_size = program->input_size + num_work_size_bytes;
+   kernel_args = MALLOC(kernel_args_size);
+   for (i = 0; i < 3; i++) {
+   kernel_args[i] = grid_layout[i];
+   kernel_args[i + 3] = grid_layout[i] * block_layout[i];
+   kernel_args[i + 6] = block_layout[i];
+   }
+
+   memcpy(kernel_args + (num_work_size_bytes / 4), input, program->input_size);
+
+   r600_upload_const_buffer(rctx, &kernel_args_buffer, kernel_args,
+   kernel_args_size, &kernel_args_offset);
+   kernel_args_va = r600_resource_va(ctx->screen,
+   (struct pipe_resource*)kernel_args_buffer);
+   kernel_args_va += kernel_args_offset;
+
+   si_pm4_add_bo(pm4, kernel_args_buffer, RADEON_USAGE_READ);
 
-   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0, input_va);
-   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 + 4, S_008F04_BASE_ADDRESS_HI (input_va >> 32) | S_008F04_STRIDE(0));
+   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0, kernel_args_va);
+   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 + 4, S_008F04_BASE_ADDRESS_HI (kernel_args_va >> 32) | S_008F04_STRIDE(0));
 
si_pm4_set_reg(pm4, R_00B810_COMPUTE_START_X, 0);
si_pm4_set_reg(pm4, R_00B814_COMPUTE_START_Y, 0);
-- 
1.8.1.5
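
For readers following the layout of the buffer this patch uploads, here is a minimal standalone sketch of how the nine work-size dwords and the user arguments are packed. It is an illustration only: build_kernel_args is a hypothetical name, plain malloc stands in for Mesa's MALLOC, and the reading of the three triples (grid size, grid size times block size, block size) is taken from the loop in the hunk above rather than from any documentation.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* 9 dwords = 36 bytes, matching num_work_size_bytes in the patch. */
#define WORK_SIZE_DWORDS 9

/* Hypothetical helper: packs the work-size header followed by the user
 * kernel arguments into one freshly allocated buffer.
 * Returns NULL on allocation failure; the caller frees the result. */
static uint32_t *
build_kernel_args(const uint32_t grid[3], const uint32_t block[3],
                  const void *input, size_t input_size, size_t *out_size)
{
   size_t total = WORK_SIZE_DWORDS * sizeof(uint32_t) + input_size;
   uint32_t *args = malloc(total);
   int i;

   if (!args)
      return NULL;

   for (i = 0; i < 3; i++) {
      args[i] = grid[i];                 /* number of work groups */
      args[i + 3] = grid[i] * block[i];  /* total work items per dimension */
      args[i + 6] = block[i];            /* work items per group */
   }

   /* User arguments start right after the 9-dword header. */
   if (input_size)
      memcpy(args + WORK_SIZE_DWORDS, input, input_size);

   *out_size = total;
   return args;
}
```

A kernel compiled against this layout would read its own arguments starting at byte offset 36 of the buffer.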



Re: [Mesa-dev] [PATCH v2 1/4] mesa: Implement ext_framebuffer_multisample_blit_scaled extension

2013-05-24 Thread Paul Berry
On 16 May 2013 11:44, Anuj Phogat anuj.pho...@gmail.com wrote:

 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 Reviewed-by: Paul Berry stereotype...@gmail.com
 Reviewed-by: Brian Paul bri...@vmware.com
 ---
  src/mesa/main/extensions.c |  1 +
  src/mesa/main/fbobject.c   | 30 +++---
  src/mesa/main/mtypes.h |  1 +
  3 files changed, 29 insertions(+), 3 deletions(-)

 diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
 index db5a5ed..39aaad4 100644
 --- a/src/mesa/main/extensions.c
 +++ b/src/mesa/main/extensions.c
 @@ -184,6 +184,7 @@ static const struct extension extension_table[] = {
 { "GL_EXT_fog_coord",   o(EXT_fog_coord),
   GLL,1999 },
 { "GL_EXT_framebuffer_blit",
  o(EXT_framebuffer_blit),GL, 2005 },
 { "GL_EXT_framebuffer_multisample",
 o(EXT_framebuffer_multisample), GL, 2005 },
 +   { "GL_EXT_framebuffer_multisample_blit_scaled",
 o(EXT_framebuffer_multisample_blit_scaled), GL, 2011 },
 { "GL_EXT_framebuffer_object",
  o(EXT_framebuffer_object),  GL, 2000 },
 { "GL_EXT_framebuffer_sRGB",
  o(EXT_framebuffer_sRGB),GL, 1998 },
 { "GL_EXT_gpu_program_parameters",
  o(EXT_gpu_program_parameters),  GLL,2006 },
 diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
 index 80485f7..e7300f6 100644
 --- a/src/mesa/main/fbobject.c
 +++ b/src/mesa/main/fbobject.c
 @@ -2974,6 +2974,20 @@ compatible_resolve_formats(const struct
 gl_renderbuffer *readRb,
 return GL_FALSE;
  }

 +static GLboolean
 +is_valid_blit_filter(const struct gl_context *ctx, GLenum filter)
 +{
 +   switch (filter) {
 +   case GL_NEAREST:
 +   case GL_LINEAR:
 +  return true;
 +   case GL_SCALED_RESOLVE_FASTEST_EXT:
 +   case GL_SCALED_RESOLVE_NICEST_EXT:
 +  return ctx->Extensions.EXT_framebuffer_multisample_blit_scaled;
 +   default:
 +  return false;
 +   }
 +}

  /**
   * Blit rectangular region, optionally from one framebuffer to another.
 @@ -3023,8 +3037,17 @@ _mesa_BlitFramebuffer(GLint srcX0, GLint srcY0,
 GLint srcX1, GLint srcY1,
return;
 }

 -   if (filter != GL_NEAREST && filter != GL_LINEAR) {
 -  _mesa_error(ctx, GL_INVALID_ENUM, "glBlitFramebufferEXT(filter)");
 +   if (!is_valid_blit_filter(ctx, filter)) {
 +  _mesa_error(ctx, GL_INVALID_ENUM, "glBlitFramebufferEXT(%s)",
 +  _mesa_lookup_enum_by_nr(filter));
 +  return;
 +   }
 +
 +   if ((filter == GL_SCALED_RESOLVE_FASTEST_EXT ||
 +filter == GL_SCALED_RESOLVE_NICEST_EXT) &&
 +(readFb->Visual.samples == 0 || drawFb->Visual.samples > 0)) {
 +  _mesa_error(ctx, GL_INVALID_OPERATION, "glBlitFramebufferEXT(%s)",
 +  _mesa_lookup_enum_by_nr(filter));
return;
 }

 @@ -3257,7 +3280,8 @@ _mesa_BlitFramebuffer(GLint srcX0, GLint srcY0,
 GLint srcX1, GLint srcY1,
}

/* extra checks for multisample copies... */
 -  if (readFb->Visual.samples > 0 || drawFb->Visual.samples > 0) {
 +  if ((readFb->Visual.samples > 0 || drawFb->Visual.samples > 0) &&
 +  (filter == GL_NEAREST || filter == GL_LINEAR)) {
   /* src and dest region sizes must be the same */
   if (abs(srcX1 - srcX0) != abs(dstX1 - dstX0) ||
   abs(srcY1 - srcY0) != abs(dstY1 - dstY0)) {


Later in this function, the following error check appears:

 if (filter == GL_LINEAR) {
    /* 3.1 spec, page 199:
     * "Calling BlitFramebuffer will result in an INVALID_OPERATION error
     * if filter is LINEAR and read buffer contains integer data."
     */
    GLenum type = _mesa_get_format_datatype(colorReadRb->Format);
    if (type == GL_INT || type == GL_UNSIGNED_INT) {
       _mesa_error(ctx, GL_INVALID_OPERATION,
                   "glBlitFramebufferEXT(integer color type)");
       return;
    }
 }

This needs to be changed to "if (filter != GL_NEAREST)" in accordance with
the following text from the extension:

"Calling BlitFramebuffer will result in an INVALID_OPERATION error if
filter is not NEAREST and read buffer contains integer data."
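
As a rough illustration of the requested condition, the rule can be isolated into a small predicate. This is only a sketch: the enum values are stand-ins rather than the real GL constants, and blit_filter_rejects_integer_data is a hypothetical helper name, not something in Mesa.

```c
#include <stdbool.h>

/* Stand-in filter values for illustration; real code uses the GL enums. */
enum blit_filter {
   FILTER_NEAREST,
   FILTER_LINEAR,
   FILTER_SCALED_RESOLVE_FASTEST,
   FILTER_SCALED_RESOLVE_NICEST
};

/* Hypothetical helper: returns true when the blit has to fail with
 * INVALID_OPERATION, i.e. any filter other than NEAREST is requested
 * while the read buffer holds integer data. */
static bool
blit_filter_rejects_integer_data(enum blit_filter filter, bool read_is_integer)
{
   return filter != FILTER_NEAREST && read_is_integer;
}
```

Testing against "not NEAREST" instead of "is LINEAR" means the two scaled-resolve filters are rejected for integer data as well.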

With that fixed, this patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com


 diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
 index b68853b..8af6dc6 100644
 --- a/src/mesa/main/mtypes.h
 +++ b/src/mesa/main/mtypes.h
 @@ -3023,6 +3023,7 @@ struct gl_extensions
 GLboolean EXT_fog_coord;
 GLboolean EXT_framebuffer_blit;
 GLboolean EXT_framebuffer_multisample;
 +   GLboolean EXT_framebuffer_multisample_blit_scaled;
 GLboolean EXT_framebuffer_object;
 GLboolean EXT_framebuffer_sRGB;
 GLboolean EXT_gpu_program_parameters;
 --
 1.8.1.4



Re: [Mesa-dev] [PATCH 4/5] radeonsi/compute: Pass kernel arguments in a buffer

2013-05-24 Thread Patrick Baggett
The only difference I could see is that the old code passed &cb->buffer
(which maybe already points to a buffer?) directly into u_upload_data(),
whereas the new code passes &cb->buffer as the rbuffer parameter of
r600_upload_const_buffer(), and that function sets *rbuffer = NULL before
doing anything else. That effectively erases any previous pointer, so if
u_upload_data() examines *rbuffer, it may now see a different value. I
don't know if that matters, though.
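
To make the concern concrete, here is a toy model of the pattern. Everything in it is hypothetical (struct buf, toy_upload, toy_upload_wrapper); it assumes a callee that releases whatever *out held before storing the new buffer, which may or may not match what u_upload_data() really does.

```c
#include <stddef.h>

struct buf { int refcount; };

/* Hypothetical callee: drops the reference previously held in *out,
 * then stores and references the new buffer. */
static void toy_upload(struct buf **out, struct buf *fresh)
{
   if (*out)
      (*out)->refcount--;
   fresh->refcount++;
   *out = fresh;
}

/* Wrapper following the pattern in the patch: it clears *out first,
 * so the callee never sees the previous buffer. */
static void toy_upload_wrapper(struct buf **out, struct buf *fresh)
{
   *out = NULL;
   toy_upload(out, fresh);
}
```

Under that assumption, the direct call releases the old buffer while the wrapper leaves its reference count untouched, which would be a leak if the caller really owned a reference there.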

Patrick


On Fri, May 24, 2013 at 1:07 PM, Tom Stellard t...@stellard.net wrote:

 From: Tom Stellard thomas.stell...@amd.com

 ---
  src/gallium/drivers/radeonsi/r600_buffer.c  | 31
 +
  src/gallium/drivers/radeonsi/radeonsi_compute.c | 26 ++---
  src/gallium/drivers/radeonsi/si_state.c | 29
 +++
  3 files changed, 51 insertions(+), 35 deletions(-)

 diff --git a/src/gallium/drivers/radeonsi/r600_buffer.c
 b/src/gallium/drivers/radeonsi/r600_buffer.c
 index cdf9988..87763c3 100644
 --- a/src/gallium/drivers/radeonsi/r600_buffer.c
 +++ b/src/gallium/drivers/radeonsi/r600_buffer.c
 @@ -25,6 +25,8 @@
   *  Corbin Simpson mostawesomed...@gmail.com
   */

 +#include byteswap.h
 +
  #include pipe/p_screen.h
  #include util/u_format.h
  #include util/u_math.h
 @@ -168,3 +170,32 @@ void r600_upload_index_buffer(struct r600_context
 *rctx,
 u_upload_data(rctx-uploader, 0, count * ib-index_size,
   ib-user_buffer, ib-offset, ib-buffer);
  }
 +
 +void r600_upload_const_buffer(struct r600_context *rctx, struct
 si_resource **rbuffer,
 +   const uint8_t *ptr, unsigned size,
 +   uint32_t *const_offset)
 +{
 +   *rbuffer = NULL;
 +
 +   if (R600_BIG_ENDIAN) {
 +   uint32_t *tmpPtr;
 +   unsigned i;
 +
 +   if (!(tmpPtr = malloc(size))) {
 +   R600_ERR(Failed to allocate BE swap buffer.\n);
 +   return;
 +   }
 +
 +   for (i = 0; i  size / 4; ++i) {
 +   tmpPtr[i] = bswap_32(((uint32_t *)ptr)[i]);
 +   }
 +
 +   u_upload_data(rctx-uploader, 0, size, tmpPtr,
 const_offset,
 +   (struct pipe_resource**)rbuffer);
 +
 +   free(tmpPtr);
 +   } else {
 +   u_upload_data(rctx-uploader, 0, size, ptr, const_offset,
 +   (struct pipe_resource**)rbuffer);
 +   }
 +}
 diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c
 b/src/gallium/drivers/radeonsi/radeonsi_compute.c
 index 3fb6eb1..035076d 100644
 --- a/src/gallium/drivers/radeonsi/radeonsi_compute.c
 +++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c
 @@ -91,8 +91,11 @@ static void radeonsi_launch_grid(
 struct r600_context *rctx = (struct r600_context*)ctx;
 struct si_pipe_compute *program = rctx-cs_shader_state.program;
 struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
 +   struct si_resource *input_buffer;
 +   uint32_t input_offset = 0;
 +   uint64_t input_va;
 uint64_t shader_va;
 -   unsigned arg_user_sgpr_count;
 +   unsigned arg_user_sgpr_count = 2;
 unsigned i;
 struct si_pipe_shader *shader = program-kernels[pc];

 @@ -109,21 +112,16 @@ static void radeonsi_launch_grid(
 si_pm4_inval_shader_cache(pm4);
 si_cmd_surface_sync(pm4, pm4-cp_coher_cntl);

 -   arg_user_sgpr_count = program-input_size / 4;
 -   if (program-input_size % 4 != 0) {
 -   arg_user_sgpr_count++;
 -   }
 +   /* Upload the input data */
 +   r600_upload_const_buffer(rctx, input_buffer, input,
 +   program-input_size,
 input_offset);
 +   input_va = r600_resource_va(ctx-screen, (struct
 pipe_resource*)input_buffer);
 +   input_va += input_offset;

 -   /* XXX: We should store arguments in memory if we run out of user
 sgprs.
 -*/
 -   assert(arg_user_sgpr_count  16);
 +   si_pm4_add_bo(pm4, input_buffer, RADEON_USAGE_READ);

 -   for (i = 0; i  arg_user_sgpr_count; i++) {
 -   uint32_t *args = (uint32_t*)input;
 -   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 +
 -   (i * 4),
 -   args[i]);
 -   }
 +   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0, input_va);
 +   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 + 4,
 S_008F04_BASE_ADDRESS_HI (input_va  32) | S_008F04_STRIDE(0));

 si_pm4_set_reg(pm4, R_00B810_COMPUTE_START_X, 0);
 si_pm4_set_reg(pm4, R_00B814_COMPUTE_START_Y, 0);
 diff --git a/src/gallium/drivers/radeonsi/si_state.c
 b/src/gallium/drivers/radeonsi/si_state.c
 index dec535c..1e94f7e 100644
 --- a/src/gallium/drivers/radeonsi/si_state.c
 +++ b/src/gallium/drivers/radeonsi/si_state.c
 @@ -24,8 +24,6 @@
   *  

Re: [Mesa-dev] [PATCH V2 2/4] intel: Change the register type from UW to UD in blorp engine

2013-05-24 Thread Paul Berry
On 16 May 2013 11:44, Anuj Phogat anuj.pho...@gmail.com wrote:

 These changes are required to implement scaled blitting in blorp
 in my next patch.

 No regressions observed in piglit quick-driver.tests with this patch.

 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 ---
  src/mesa/drivers/dri/i965/brw_blorp.h|  15 ++--
  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 120
 +--
  src/mesa/drivers/dri/i965/brw_reg.h  |   7 ++
  3 files changed, 90 insertions(+), 52 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h
 b/src/mesa/drivers/dri/i965/brw_blorp.h
 index 8915080..70e3933 100644
 --- a/src/mesa/drivers/dri/i965/brw_blorp.h
 +++ b/src/mesa/drivers/dri/i965/brw_blorp.h
 @@ -161,22 +161,19 @@ struct brw_blorp_coord_transform_params
 void setup(GLuint src0, GLuint dst0, GLuint dst1,
bool mirror);

 -   int16_t multiplier;
 -   int16_t offset;
 +   int32_t multiplier;
 +   int32_t offset;
  };


  struct brw_blorp_wm_push_constants
  {
 -   uint16_t dst_x0;
 -   uint16_t dst_x1;
 -   uint16_t dst_y0;
 -   uint16_t dst_y1;
 +   uint32_t dst_x0;
 +   uint32_t dst_x1;
 +   uint32_t dst_y0;
 +   uint32_t dst_y1;
 brw_blorp_coord_transform_params x_transform;
 brw_blorp_coord_transform_params y_transform;
 -
 -   /* Pad out to an integral number of registers */
 -   uint16_t pad[8];
  };

  /* Every 32 bytes of push constant data constitutes one GEN register. */
 diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
 b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
 index c3ef054..b7ee92b 100644
 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
 @@ -590,13 +590,12 @@ private:
 void encode_msaa(unsigned num_samples, intel_msaa_layout layout);
 void decode_msaa(unsigned num_samples, intel_msaa_layout layout);
 void kill_if_outside_dst_rect();
 -   void translate_dst_to_src();
 +   void translate_dst_to_src(unsigned intel_gen);
 void single_to_blend();
 void manual_blend(unsigned num_samples);
 void sample(struct brw_reg dst);
 void texel_fetch(struct brw_reg dst);
 void mcs_fetch();
 -   void expand_to_32_bits(struct brw_reg src, struct brw_reg dst);
 void texture_lookup(struct brw_reg dst, GLuint msg_type,
 const sampler_message_arg *args, int num_args);
 void render_target_write();
 @@ -773,7 +772,7 @@ brw_blorp_blit_program::compile(struct brw_context
 *brw,
kill_if_outside_dst_rect();

 /* Next, apply a translation to obtain coordinates in the source
 image. */
 -   translate_dst_to_src();
 +   translate_dst_to_src(brw->intel.gen);

 /* If the source image is not multisampled, then we want to fetch
 sample
  * number 0, because that's the only sample there is.
 @@ -845,7 +844,7 @@ brw_blorp_blit_program::alloc_push_const_regs(int
 base_reg)
  #define CONST_LOC(name) offsetof(brw_blorp_wm_push_constants, name)
  #define ALLOC_REG(name) \
 this->name = \
 -  brw_uw1_reg(BRW_GENERAL_REGISTER_FILE, base_reg, CONST_LOC(name) / 2)
 +  brw_ud1_reg(BRW_GENERAL_REGISTER_FILE, base_reg, CONST_LOC(name) / 4)

 ALLOC_REG(dst_x0);
 ALLOC_REG(dst_x1);
 @@ -875,17 +874,23 @@ brw_blorp_blit_program::alloc_regs()
 }
 this->mcs_data =
retype(brw_vec8_grf(reg, 0), BRW_REGISTER_TYPE_UD); reg += 8;
 +
 for (int i = 0; i < 2; ++i) {
this->x_coords[i]
 - = vec16(retype(brw_vec8_grf(reg++, 0), BRW_REGISTER_TYPE_UW));
 + = vec8(retype(brw_vec8_grf(reg, 0), BRW_REGISTER_TYPE_UD));


It should be sufficient to say "this->x_coords[i] =
retype(brw_vec8_grf(reg, 0), BRW_REGISTER_TYPE_UD)", since the register
returned by brw_vec8_grf() is already a vec8.  This applies to y_coords[i],
sample_index, t1, and t2 below.

Regardless of whether you decide to change that, this patch is:

Reviewed-by: Paul Berry stereotype...@gmail.com

Nice work, BTW.  Some day soon I want to port blorp over to share more code
with the FS back-end (so that it's easier to port to future chipsets).
Your work here paves the way for that nicely.


 +  reg += 2;
this->y_coords[i]
 - = vec16(retype(brw_vec8_grf(reg++, 0), BRW_REGISTER_TYPE_UW));
 + = vec8(retype(brw_vec8_grf(reg, 0), BRW_REGISTER_TYPE_UD));
 +  reg += 2;
 }
 this->xy_coord_index = 0;
 this->sample_index
 -  = vec16(retype(brw_vec8_grf(reg++, 0), BRW_REGISTER_TYPE_UW));
 -   this->t1 = vec16(retype(brw_vec8_grf(reg++, 0), BRW_REGISTER_TYPE_UW));
 -   this->t2 = vec16(retype(brw_vec8_grf(reg++, 0), BRW_REGISTER_TYPE_UW));
 +  = vec8(retype(brw_vec8_grf(reg, 0), BRW_REGISTER_TYPE_UD));
 +   reg += 2;
 +   this->t1 = vec8(retype(brw_vec8_grf(reg, 0), BRW_REGISTER_TYPE_UD));
 +   reg += 2;
 +   this->t2 = vec8(retype(brw_vec8_grf(reg, 0), BRW_REGISTER_TYPE_UD));
 +   reg += 2;

 /* Make sure we didn't run out of registers */
 assert(reg <= GEN7_MRF_HACK_START);
 @@ -942,7 

Re: [Mesa-dev] [PATCH V2 3/4] intel: Add multisample scaled blitting in blorp engine

2013-05-24 Thread Paul Berry
On 16 May 2013 11:44, Anuj Phogat anuj.pho...@gmail.com wrote:

 In traditional multisampled framebuffer rendering, color samples must be
 explicitly resolved via BlitFramebuffer before doing the scaled blitting
 of the framebuffer. So, scaled blitting of a multisample framebuffer
 takes two separate calls to BlitFramebuffer.

 This patch implements the functionality of doing multisampled scaled
 resolve using just one BlitFramebuffer call. Important changes involved
 in this patch are listed below:
 - Use float registers to scale and offset texture coordinates.
 - Change offset computation to consider float coordinates.
 - Round the scaled coordinates down to nearest integer.
 - Modify src texture coordinates clipping to account for scaling.
 - Linear filter is not yet implemented in blorp. So, don't use
   blorp engine to do single sampled scaled blitting.

 Note: Observed no piglit regressions on sandybridge & ivybridge with
 these changes.

 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 ---
  src/mesa/drivers/dri/i965/brw_blorp.h  |  23 ++--
  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp   | 143
 +++--
  src/mesa/drivers/dri/i965/brw_reg.h|   7 --
  src/mesa/drivers/dri/intel/intel_mipmap_tree.c |   2 +
  4 files changed, 102 insertions(+), 73 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h
 b/src/mesa/drivers/dri/i965/brw_blorp.h
 index 70e3933..a40324b 100644
 --- a/src/mesa/drivers/dri/i965/brw_blorp.h
 +++ b/src/mesa/drivers/dri/i965/brw_blorp.h
 @@ -40,9 +40,10 @@ brw_blorp_blit_miptrees(struct intel_context *intel,
  unsigned src_level, unsigned src_layer,
  struct intel_mipmap_tree *dst_mt,
  unsigned dst_level, unsigned dst_layer,
 -int src_x0, int src_y0,
 -int dst_x0, int dst_y0,
 -int dst_x1, int dst_y1,
 +float src_x0, float src_y0,
 +float src_x1, float src_y1,
 +float dst_x0, float dst_y0,
 +float dst_x1, float dst_y1,
  bool mirror_x, bool mirror_y);

  bool
 @@ -158,11 +159,11 @@ public:

  struct brw_blorp_coord_transform_params
  {
 -   void setup(GLuint src0, GLuint dst0, GLuint dst1,
 +   void setup(GLfloat src0, GLfloat src1, GLfloat dst0, GLfloat dst1,
bool mirror);

 -   int32_t multiplier;
 -   int32_t offset;
 +   float multiplier;
 +   float offset;
  };


 @@ -304,6 +305,9 @@ struct brw_blorp_blit_prog_key
  * than one sample per pixel.
  */
 bool persample_msaa_dispatch;
 +
 +   /* True for scaled blitting. */
 +   bool blit_scaled;
  };

  class brw_blorp_blit_params : public brw_blorp_params
 @@ -314,9 +318,10 @@ public:
   unsigned src_level, unsigned src_layer,
   struct intel_mipmap_tree *dst_mt,
   unsigned dst_level, unsigned dst_layer,
 - GLuint src_x0, GLuint src_y0,
 - GLuint dst_x0, GLuint dst_y0,
 - GLuint width, GLuint height,
 + GLfloat src_x0, GLfloat src_y0,
 + GLfloat src_x1, GLfloat src_y1,
 + GLfloat dst_x0, GLfloat dst_y0,
 + GLfloat dst_x1, GLfloat dst_y1,
   bool mirror_x, bool mirror_y);

 virtual uint32_t get_wm_prog(struct brw_context *brw,
 diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
 b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
 index b7ee92b..19169ef 100644
 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
 @@ -41,11 +41,11 @@
   * If coord0 > coord1, swap them and invert the mirror boolean.
   */
  static inline void
 -fixup_mirroring(bool &mirror, GLint &coord0, GLint &coord1)
 +fixup_mirroring(bool &mirror, GLfloat &coord0, GLfloat &coord1)
  {
 if (coord0 > coord1) {
mirror = !mirror;
 -  GLint tmp = coord0;
 +  GLfloat tmp = coord0;
coord0 = coord1;
coord1 = tmp;
 }
 @@ -67,9 +67,10 @@ fixup_mirroring(bool &mirror, GLint &coord0, GLint &coord1)
   * coordinates, by swapping the roles of src and dst.
   */
  static inline bool
 -clip_or_scissor(bool mirror, GLint &src_x0, GLint &src_x1, GLint &dst_x0,
 -GLint &dst_x1, GLint fb_xmin, GLint fb_xmax)
 +clip_or_scissor(bool mirror, GLfloat &src_x0, GLfloat &src_x1, GLfloat &dst_x0,
 +GLfloat &dst_x1, GLfloat fb_xmin, GLfloat fb_xmax)
  {
 +   float scale = (float) (src_x1 - src_x0) / (dst_x1 - dst_x0);
 /* If we are going to scissor everything away, stop. */
 if (!(fb_xmin < fb_xmax &&
   dst_x0 < fb_xmax &&
 @@ -105,8 +106,8 @@ clip_or_scissor(bool mirror, GLint &src_x0, GLint &src_x1, GLint &dst_x0,
 /* Adjust the source 

Re: [Mesa-dev] [PATCH V2 4/4] i965: Enable ext_framebuffer_multisample_blit_scaled on intel h/w

2013-05-24 Thread Paul Berry
On 16 May 2013 11:44, Anuj Phogat anuj.pho...@gmail.com wrote:

 This patch enables ext_framebuffer_multisample_blit_scaled extension
 on intel h/w >= gen6.

 Note: Patches for piglit tests to verify this functionality are out
 for review on piglit mailing list. Tests pass for all of the scaling
 factors from 0.1 to 2.4.

 Comment from Paul Berry:
 I have some concerns about the image quality of the method you've
 implemented.  As I understand it, the primary use case of this extension
 is to allow the client to do multisampled rendering at slightly less
 than screen resolution (e.g. 720p instead of 1080p), and then blit the
 result to the screen in one step while keeping most of the quality
 benefits of multisampling.  Since your implementation is effectively
 equivalent to downsampling and then blitting using GL_NEAREST filtering,
 my fear is that it will lead to blocky artifacts that are severe enough
 to negate the benefit of multisampling in the first place.

 Before we turn this extension on in the Intel driver, I'd like to look
 at a comparison of:

 (1) your technique
 (2) downsampling followed by scaling with GL_LINEAR filtering
 (3) The nVidia implementation, in GL_SCALED_RESOLVE_FASTEST_EXT mode
 (4) The nVidia implementation, in GL_SCALED_RESOLVE_NICEST_EXT mode
 (5) Just rendering the image directly to the single-sampled destination
 buffer

 Observation: Image quality is better in cases 2, 3, 4 and 5 as
 compared to case 1. Although extension's implementation meets the
 specification's requirements, using it leads to  blocky artifacts
 due to nearest filtering.

 I'll work on implementing a better filtering technique in blorp.


Thanks for quoting my comment here.  It's good to have context so that we
can continue the discussion.

My preference would be to go ahead and land patches 1-3 now, but hold patch
4 back until we've figured out how to get comparable image quality to the
nVidia implementation.  It seems like it would be nice to go out of the
gate with our best looking implementation.

Does that seem reasonable to other folks?



 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 ---
  src/mesa/drivers/dri/intel/intel_extensions.c | 1 +
  1 file changed, 1 insertion(+)

 diff --git a/src/mesa/drivers/dri/intel/intel_extensions.c
 b/src/mesa/drivers/dri/intel/intel_extensions.c
 index 8d8e325..de12ec3 100644
 --- a/src/mesa/drivers/dri/intel/intel_extensions.c
 +++ b/src/mesa/drivers/dri/intel/intel_extensions.c
 @@ -97,6 +97,7 @@ intelInitExtensions(struct gl_context *ctx)

 if (intel->gen >= 6) {
ctx->Extensions.EXT_framebuffer_multisample = true;
 +  ctx->Extensions.EXT_framebuffer_multisample_blit_scaled = true;
ctx->Extensions.ARB_blend_func_extended =
 !driQueryOptionb(&intel->optionCache, "disable_blend_func_extended");
ctx->Extensions.ARB_draw_buffers_blend = true;
ctx->Extensions.ARB_ES3_compatibility = true;
 --
 1.8.1.4




Re: [Mesa-dev] [PATCH] i965: Mask the cut index based on the index buffer type in 3DSTATE_VF.

2013-05-24 Thread Ian Romanick

On 05/23/2013 03:46 PM, Kenneth Graunke wrote:

According to the documentation: "The Cut Index is compared to the
fetched (and possibly-sign-extended) vertex index, and if these values


Which documentation is this?  The only types that are valid for index 
buffers are unsigned, so what does "possibly-sign-extended" even mean?



are equal, the current primitive topology is terminated.  Note that,
for index buffers <32bpp, it is possible to set the Cut Index to a
(large) value that will never match a sign-extended vertex index."

This suggests that we should not set the value to 0xFFFFFFFF for
unsigned byte or short index buffers, but rather 0xFF or 0xFFFF.


For GL_PRIMITIVE_RESTART_FIXED_INDEX (ES and desktop 4.something), where 
the setting of the restart value is out of application control, this is 
absolutely correct.  The OpenGL 4.3 spec says:


Primitive restart can also be enabled or disabled with a target
of PRIMITIVE_RESTART_FIXED_INDEX. In this case, the primitive
restart index is equal to 2^N − 1, where N is 8, 16 or 32 if the
type is UNSIGNED_BYTE, UNSIGNED_SHORT, or UNSIGNED_INT,
respectively, and the index value specified by
PrimitiveRestartIndex is ignored.

For GL_PRIMITIVE_RESTART, I'm not so sure.  I couldn't find anything 
conclusive in any of the specs.  The only thing I found was a hint in the 
NV_primitive_restart extension spec:


*   What should the default primitive restart index be?

RESOLVED: Zero.  It's tough to pick another number that is
meaningful for all three element data types.  In practice, apps
are likely to set it to 0xFFFF or 0xFFFFFFFF.

You can infer from this that applications are expected to set 0xFFFF for 
GL_UNSIGNED_SHORT and 0xFFFFFFFF for GL_UNSIGNED_INT.  Experimentation 
is the only way to know for sure. :(
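
To make the 2^N - 1 rule concrete, here is a minimal sketch in C. The helper names are hypothetical (they are not Mesa functions), and the second helper assumes the hardware zero-extends the fetched index to 32 bits before comparing it with the cut index, which the quoted documentation does not actually confirm.

```c
#include <stdint.h>

/* GL enum values as defined in the standard GL headers. */
#define GL_UNSIGNED_BYTE  0x1401
#define GL_UNSIGNED_SHORT 0x1403
#define GL_UNSIGNED_INT   0x1405

/* Hypothetical helper: the restart index used by
 * PRIMITIVE_RESTART_FIXED_INDEX, i.e. 2^N - 1 where N is the index
 * size in bits.  Returns 0 for an unrecognized type. */
static uint32_t
fixed_restart_index(unsigned index_type)
{
   switch (index_type) {
   case GL_UNSIGNED_BYTE:  return 0xffu;
   case GL_UNSIGNED_SHORT: return 0xffffu;
   case GL_UNSIGNED_INT:   return 0xffffffffu;
   default:                return 0;
   }
}

/* Hypothetical helper: can any index of this type ever equal cut_index,
 * assuming (an assumption, not a documented fact) that the fetched index
 * is zero-extended to 32 bits before the comparison? */
static int
cut_index_can_match(unsigned index_type, uint32_t cut_index)
{
   return cut_index <= fixed_restart_index(index_type);
}
```

Under that assumption, a cut index of 0xFFFFFFFF can never match a 16-bit index, which would be consistent with the sporadic failures described in the patch.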



Fixes sporadic failures in the ES 3 instanced_arrays_primitive_restart
conformance test when run in combination with other tests.  No Piglit
regressions.

Cc: Ian Romanick i...@freedesktop.org
Cc: Paul Berry stereotype...@gmail.com
Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
  src/mesa/drivers/dri/i965/brw_primitive_restart.c | 27 ---
  1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c 
b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
index f824915..cf4a1ea 100644
--- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c
+++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
@@ -183,19 +183,30 @@ haswell_upload_cut_index(struct brw_context *brw)
    if (!intel->is_haswell)
       return;

-   const unsigned cut_index_setting =
-      ctx->Array._PrimitiveRestart ? HSW_CUT_INDEX_ENABLE : 0;
-
-   BEGIN_BATCH(2);
-   OUT_BATCH(_3DSTATE_VF << 16 | cut_index_setting | (2 - 2));
-   OUT_BATCH(ctx->Array._RestartIndex);
-   ADVANCE_BATCH();
+   if (ctx->Array._PrimitiveRestart) {
+      int cut_index = ctx->Array._RestartIndex;
+
+      if (brw->ib.type == GL_UNSIGNED_BYTE)
+         cut_index &= 0xff;
+      else if (brw->ib.type == GL_UNSIGNED_SHORT)
+         cut_index &= 0xffff;
+
+      BEGIN_BATCH(2);
+      OUT_BATCH(_3DSTATE_VF << 16 | HSW_CUT_INDEX_ENABLE | (2 - 2));
+      OUT_BATCH(cut_index);
+      ADVANCE_BATCH();
+   } else {
+      BEGIN_BATCH(2);
+      OUT_BATCH(_3DSTATE_VF << 16 | (2 - 2));
+      OUT_BATCH(0);
+      ADVANCE_BATCH();
+   }
 }

 const struct brw_tracked_state haswell_cut_index = {
    .dirty = {
       .mesa  = _NEW_TRANSFORM,
-      .brw   = 0,
+      .brw   = BRW_NEW_INDEX_BUFFER,
       .cache = 0,
    },
    .emit = haswell_upload_cut_index,





Re: [Mesa-dev] [PATCH 00/12] i965/gen7+: Implement fast color clears.

2013-05-24 Thread Ian Romanick

On 05/21/2013 04:52 PM, Paul Berry wrote:

This series implements fast color clears, a Gen7+ feature which
reduces memory bandwidth by deferring the memory writes involved in a
glClear() until the same memory is later touched during rendering.

 From a broad overview point of view, fast color clears work in a
similar way to HiZ: an auxiliary MCS buffer keeps track of which
parts of the buffer have been cleared but haven't yet had the
necessary memory writes performed.  Whenever a color buffer needs to
be accessed by the CPU, or by a part of the GPU that is not
fast-color-aware, we have to perform a resolve operation to force
any pending memory writes to occur.

This patch series adopts a slightly different strategy (compared to
HiZ) for making sure the resolves happen when needed.  Instead of
modifying each code path that might need to do a resolve so that it
does one if needed, we create an accessor function that does the
resolve if needed and then provides the caller with access to the
miptree's underlying memory region.  This lets us have a lot more
confidence that we didn't miss any code paths, which is important
since color buffers are accessed by a large number of code paths.  To
discourage future maintainers from trying to bypass the accessor
function, it is inline (so that overhead is negligible), and the field
it provides access to has been renamed to region_private.
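
The accessor pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the struct layout, the enum, the `fast_clear_aware` parameter and the function names are all hypothetical and are not taken from the patches themselves; only the idea of hiding the region behind an inline accessor and the `region_private` field name come from the description.

```c
#include <stdbool.h>
#include <stddef.h>

struct region { int placeholder; };   /* stand-in for the memory region */

/* Hypothetical two-state tracking; the real series uses its own enum. */
enum fast_clear_state { FAST_CLEAR_RESOLVED, FAST_CLEAR_PENDING };

struct miptree {
   struct region *region_private;     /* callers should not touch this directly */
   enum fast_clear_state fast_clear;
   unsigned resolve_count;            /* only here to make the sketch observable */
};

/* Hypothetical resolve: force the deferred clear writes to happen. */
static void
miptree_resolve_color(struct miptree *mt)
{
   mt->resolve_count++;
   mt->fast_clear = FAST_CLEAR_RESOLVED;
}

/* Every access to the region goes through this accessor, so a caller that
 * is not fast-clear-aware cannot forget the resolve. */
static inline struct region *
miptree_get_region(struct miptree *mt, bool fast_clear_aware)
{
   if (!fast_clear_aware && mt->fast_clear == FAST_CLEAR_PENDING)
      miptree_resolve_color(mt);
   return mt->region_private;
}
```

The point of the design is that the resolve decision lives in one place, so a new code path gets the correct behavior by default instead of by remembering to add a check.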

Patch 01 ifdefs out some code so that it does not appear in the i915
(pre-Gen4) driver--this makes it easier to be confident that these
changes won't regress i915.  Patch 02 introduces the aforementioned
accessor function.  Patches 03-11 are the guts of the implementation,
and patch 12 enables the new feature.

No piglit regressions.  I have additional piglit tests which validate
specific important corner cases--I hope to get those out to the list
later this week.


I sent some comments and review for the tests, and I've sent some other 
comments about these patches.  My only concern is whether the case of 
swapping a non-current drawable (that had a fast-clear as the last 
render) produces the correct result.  In the piglit thread, I suggested 
adding a test specifically for this case.


I suspect that if fast-clear fails in that case, then multisampling also 
fails.  Both can probably be fixed as follow-on work.  Does that seem 
plausible?



[PATCH 01/12] intel: Conditionally compile mcs-related code for i965 only.
[PATCH 02/12] intel: Create intel_miptree_get_region() to prepare for fast 
color clear.
[PATCH 03/12] i965/gen7+: Create an enum for keeping track of fast color clear 
state.
[PATCH 04/12] i965/gen7+: Set up MCS in SURFACE_STATE whenever MCS is present.
[PATCH 05/12] i965/gen7+: Create helper functions for single-sample MCS buffers.
[PATCH 06/12] i965/gen7+: Implement fast color clear operation in BLORP.
[PATCH 07/12] i965/blorp: Expand clear class hierarchy to prepare for RT 
resolves.
[PATCH 08/12] i965/blorp: Write blorp code to do render target resolves.
[PATCH 09/12] i965/gen7+: Ensure that front/back buffers are fast-clear 
resolved.
[PATCH 10/12] i965/gen7+: Resolve color buffers when necessary.
[PATCH 11/12] i965/gen7+: Disable fast color clears on shared regions.
[PATCH 12/12] i965/gen7: Enable support for fast color clears.


Re: [Mesa-dev] [PATCH 10/12] sso: update glGet: GL_PROGRAM_PIPELINE_BINDING

2013-05-24 Thread gregory hainaut
On Sat, 4 May 2013 11:35:22 +0200
gregory hainaut gregory.hain...@gmail.com wrote:

 On Fri, 3 May 2013 12:04:48 -0700
 Matt Turner matts...@gmail.com wrote:
 
  On Fri, May 3, 2013 at 10:44 AM, Gregory Hainaut
  gregory.hain...@gmail.com wrote:
   ---
src/mesa/main/get.c  |9 +
src/mesa/main/get_hash_params.py |3 +++
2 files changed, 12 insertions(+)
  
   diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
   index 54159c0..6cbb7db 100644
   --- a/src/mesa/main/get.c
   +++ b/src/mesa/main/get.c
   @@ -369,6 +369,7 @@ EXTRA_EXT(ARB_map_buffer_alignment);
EXTRA_EXT(ARB_texture_cube_map_array);
EXTRA_EXT(ARB_texture_buffer_range);
EXTRA_EXT(ARB_texture_multisample);
   +EXTRA_EXT(ARB_separate_shader_objects);
  
static const int
extra_ARB_color_buffer_float_or_glcore[] = {
   @@ -889,6 +890,14 @@ find_custom_value(struct gl_context *ctx, const 
   struct value_desc *d, union valu
             _mesa_problem(ctx, "driver doesn't implement GetTimestamp");
          }
          break;
   +   /* GL_ARB_separate_shader_objects */
   +   case GL_PROGRAM_PIPELINE_BINDING:
   +      if (ctx->Pipeline.Current) {
   +         v->value_int = ctx->Pipeline.Current->Name;
   +      } else {
   +         v->value_int = 0;
   +      }
   +      break;
   }
}
  
  This looks believable, but I can't find a description in the extension
  spec or GL 4.1+ specs that say precisely what this query is supposed
  to do. Looks like it's just mentioned in the extension spec, and not
  at all in GL 4.1+ specs.
 
 
 Yes, you're right, that's strange. There are also a couple of lines in the
 glGet man page.
 
 GL_PROGRAM_PIPELINE_BINDING
   
 params a single value, the name of the currently 
 bound program pipeline
 object, or zero if no program pipeline object is 
 bound.
 See glBindProgramPipeline.
 
 Both Nvidia and AMD support this query. I did a quick update on my piglit 
 test, on the AMD side:
 * UseProgram(2)
 * BindPipeline(5) (the pipeline isn't really bound because UseProgram has a
 higher priority)
 * Get GL_PROGRAM_PIPELINE_BINDING = 5
 
 I will try to check the behavior on Nvidia implementation.

Nvidia implementation is this one:
  if (ctx->_Shader) {
     v->value_int = ctx->_Shader->Name;
  } else {
     v->value_int = 0;
  }

So on my previous example
* UseProgram(2)
* BindPipeline(5)
* Get GL_PROGRAM_PIPELINE_BINDING = 0

There is no spec but the SSO spec was written by Nvidia so I would say that 
Nvidia is correct.
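
The two observed behaviours can be modelled side by side. This is a rough sketch with hypothetical types and function names; it only mirrors the UseProgram(2) / BindPipeline(5) experiment above and is not code from either vendor's driver or from Mesa.

```c
struct gl_object { unsigned name; };

struct context {
   struct gl_object *bound_pipeline;   /* set by BindProgramPipeline */
   struct gl_object *active_program;   /* set by UseProgram; takes priority */
};

/* Behaviour seen on AMD in the test above: report whatever pipeline was
 * bound, even if UseProgram currently overrides it. */
static unsigned
query_binding_as_bound(const struct context *ctx)
{
   return ctx->bound_pipeline ? ctx->bound_pipeline->name : 0;
}

/* Behaviour seen on Nvidia in the test above: report the pipeline only
 * when it is the object actually used for rendering. */
static unsigned
query_binding_effective(const struct context *ctx)
{
   if (ctx->active_program || !ctx->bound_pipeline)
      return 0;
   return ctx->bound_pipeline->name;
}
```

With program 2 active and pipeline 5 bound, the first query yields 5 and the second yields 0, matching the two results reported in this thread.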

   diff --git a/src/mesa/main/get_hash_params.py 
   b/src/mesa/main/get_hash_params.py
   index 2b97da6..43a11cf 100644
   --- a/src/mesa/main/get_hash_params.py
   +++ b/src/mesa/main/get_hash_params.py
   @@ -709,6 +709,9 @@ descriptor=[
  
# GL_ARB_texture_cube_map_array
  [ TEXTURE_BINDING_CUBE_MAP_ARRAY_ARB, LOC_CUSTOM, TYPE_INT, 
   TEXTURE_CUBE_ARRAY_INDEX, extra_ARB_texture_cube_map_array ],
   +
   +# GL_ARB_separate_shader_objects
   +  [ PROGRAM_PIPELINE_BINDING, LOC_CUSTOM, TYPE_INT, 
   GL_PROGRAM_PIPELINE_BINDING, extra_ARB_separate_shader_objects ],
]},
  
# Enums restricted to OpenGL Core profile
   --
   1.7.10.4
  


Re: [Mesa-dev] [PATCH 4/5] radeonsi/compute: Pass kernel arguments in a buffer

2013-05-24 Thread Tom Stellard
On Fri, May 24, 2013 at 01:32:20PM -0500, Patrick Baggett wrote:
 The only difference I could see is that in the old code you passed
 cb->buffer (which maybe points to a value?) directly into u_upload_data(),
 whereas in the new code you pass cb->buffer as the parameter rbuffer
 to r600_upload_const_buffer(), but inside that function you do
 *rbuffer = NULL before you start, which effectively erases any previous
 pointer, so if *rbuffer was examined by u_upload_data(), it may be
 different. I don't know if that matters, though.
 

This was the problem, thanks for spotting it!  u_upload_data() was
deleting the old buffer, so by initializing rbuffer to NULL, the old
buffer was never being deleted.  An updated patch is on the way.

-Tom
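
The leak described in this reply can be reproduced with a small model. The sketch below is an assumption-laden illustration: `upload()` is a hypothetical stand-in that, like the behaviour attributed to u_upload_data() above, releases the buffer its output pointer previously referenced; none of these names are Gallium APIs.

```c
#include <stdlib.h>

struct resource { int refcount; };

static int live_resources;   /* allocations that have not been freed yet */

/* Drop one reference and clear the pointer. */
static void
resource_unref(struct resource **res)
{
   if (*res && --(*res)->refcount == 0) {
      free(*res);
      live_resources--;
   }
   *res = NULL;
}

/* Hypothetical upload: releases whatever *out referenced before,
 * then stores a freshly allocated buffer in *out. */
static void
upload(struct resource **out)
{
   struct resource *fresh = malloc(sizeof(*fresh));
   if (!fresh)
      return;
   fresh->refcount = 1;
   live_resources++;
   resource_unref(out);
   *out = fresh;
}

/* Buggy wrapper, mirroring the "*rbuffer = NULL" line in v1: the old
 * buffer is forgotten before upload() gets a chance to release it. */
static void
upload_clearing_first(struct resource **out)
{
   *out = NULL;
   upload(out);
}
```

Calling `upload()` repeatedly on the same pointer keeps exactly one buffer alive, while each call to `upload_clearing_first()` strands the previous one.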

 Patrick
 
 
 On Fri, May 24, 2013 at 1:07 PM, Tom Stellard t...@stellard.net wrote:
 
  From: Tom Stellard thomas.stell...@amd.com
 
  ---
   src/gallium/drivers/radeonsi/r600_buffer.c  | 31
  +
   src/gallium/drivers/radeonsi/radeonsi_compute.c | 26 ++---
   src/gallium/drivers/radeonsi/si_state.c | 29
  +++
   3 files changed, 51 insertions(+), 35 deletions(-)
 
  diff --git a/src/gallium/drivers/radeonsi/r600_buffer.c
  b/src/gallium/drivers/radeonsi/r600_buffer.c
  index cdf9988..87763c3 100644
  --- a/src/gallium/drivers/radeonsi/r600_buffer.c
  +++ b/src/gallium/drivers/radeonsi/r600_buffer.c
  @@ -25,6 +25,8 @@
*  Corbin Simpson mostawesomed...@gmail.com
*/
 
  +#include <byteswap.h>
  +
   #include "pipe/p_screen.h"
   #include "util/u_format.h"
   #include "util/u_math.h"
  @@ -168,3 +170,32 @@ void r600_upload_index_buffer(struct r600_context
  *rctx,
      u_upload_data(rctx->uploader, 0, count * ib->index_size,
                    ib->user_buffer, &ib->offset, &ib->buffer);
   }
  +
  +void r600_upload_const_buffer(struct r600_context *rctx, struct
  si_resource **rbuffer,
  +   const uint8_t *ptr, unsigned size,
  +   uint32_t *const_offset)
  +{
  +   *rbuffer = NULL;
  +
  +   if (R600_BIG_ENDIAN) {
  +   uint32_t *tmpPtr;
  +   unsigned i;
  +
  +   if (!(tmpPtr = malloc(size))) {
  +   R600_ERR("Failed to allocate BE swap buffer.\n");
  +   return;
  +   }
  +
  +   for (i = 0; i < size / 4; ++i) {
  +   tmpPtr[i] = bswap_32(((uint32_t *)ptr)[i]);
  +   }
  +
  +   u_upload_data(rctx->uploader, 0, size, tmpPtr,
  const_offset,
  +   (struct pipe_resource**)rbuffer);
  +
  +   free(tmpPtr);
  +   } else {
  +   u_upload_data(rctx->uploader, 0, size, ptr, const_offset,
  +   (struct pipe_resource**)rbuffer);
  +   }
  +}
  diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c
  b/src/gallium/drivers/radeonsi/radeonsi_compute.c
  index 3fb6eb1..035076d 100644
  --- a/src/gallium/drivers/radeonsi/radeonsi_compute.c
  +++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c
  @@ -91,8 +91,11 @@ static void radeonsi_launch_grid(
  struct r600_context *rctx = (struct r600_context*)ctx;
  struct si_pipe_compute *program = rctx-cs_shader_state.program;
  struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
  +   struct si_resource *input_buffer;
  +   uint32_t input_offset = 0;
  +   uint64_t input_va;
  uint64_t shader_va;
  -   unsigned arg_user_sgpr_count;
  +   unsigned arg_user_sgpr_count = 2;
  unsigned i;
  struct si_pipe_shader *shader = program-kernels[pc];
 
  @@ -109,21 +112,16 @@ static void radeonsi_launch_grid(
  si_pm4_inval_shader_cache(pm4);
  si_cmd_surface_sync(pm4, pm4-cp_coher_cntl);
 
  -   arg_user_sgpr_count = program-input_size / 4;
  -   if (program-input_size % 4 != 0) {
  -   arg_user_sgpr_count++;
  -   }
  +   /* Upload the input data */
  +   r600_upload_const_buffer(rctx, input_buffer, input,
  +   program-input_size,
  input_offset);
  +   input_va = r600_resource_va(ctx-screen, (struct
  pipe_resource*)input_buffer);
  +   input_va += input_offset;
 
  -   /* XXX: We should store arguments in memory if we run out of user
  sgprs.
  -*/
  -   assert(arg_user_sgpr_count  16);
  +   si_pm4_add_bo(pm4, input_buffer, RADEON_USAGE_READ);
 
  -   for (i = 0; i  arg_user_sgpr_count; i++) {
  -   uint32_t *args = (uint32_t*)input;
  -   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 +
  -   (i * 4),
  -   args[i]);
  -   }
  +   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0, input_va);
  +   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 + 4,
  S_008F04_BASE_ADDRESS_HI (input_va  

[Mesa-dev] [PATCH] radeonsi/compute: Pass kernel arguments in a buffer v2

2013-05-24 Thread Tom Stellard
From: Tom Stellard thomas.stell...@amd.com

v2:
  - Fix memory leak in si_set_constant_buffer()
---
 src/gallium/drivers/radeonsi/r600_buffer.c  | 29 +
 src/gallium/drivers/radeonsi/radeonsi_compute.c | 26 ++
 src/gallium/drivers/radeonsi/si_state.c | 23 ++--
 3 files changed, 43 insertions(+), 35 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/r600_buffer.c 
b/src/gallium/drivers/radeonsi/r600_buffer.c
index cdf9988..3d295e8 100644
--- a/src/gallium/drivers/radeonsi/r600_buffer.c
+++ b/src/gallium/drivers/radeonsi/r600_buffer.c
@@ -25,6 +25,8 @@
  *  Corbin Simpson mostawesomed...@gmail.com
  */
 
+#include <byteswap.h>
+
 #include "pipe/p_screen.h"
 #include "util/u_format.h"
 #include "util/u_math.h"
@@ -168,3 +170,30 @@ void r600_upload_index_buffer(struct r600_context *rctx,
    u_upload_data(rctx->uploader, 0, count * ib->index_size,
                  ib->user_buffer, &ib->offset, &ib->buffer);
 }
+
+void r600_upload_const_buffer(struct r600_context *rctx, struct si_resource 
**rbuffer,
+   const uint8_t *ptr, unsigned size,
+   uint32_t *const_offset)
+{
+   if (R600_BIG_ENDIAN) {
+   uint32_t *tmpPtr;
+   unsigned i;
+
+   if (!(tmpPtr = malloc(size))) {
+   R600_ERR("Failed to allocate BE swap buffer.\n");
+   return;
+   }
+
+   for (i = 0; i < size / 4; ++i) {
+   tmpPtr[i] = bswap_32(((uint32_t *)ptr)[i]);
+   }
+
+   u_upload_data(rctx->uploader, 0, size, tmpPtr, const_offset,
+   (struct pipe_resource**)rbuffer);
+
+   free(tmpPtr);
+   } else {
+   u_upload_data(rctx->uploader, 0, size, ptr, const_offset,
+   (struct pipe_resource**)rbuffer);
+   }
+}
diff --git a/src/gallium/drivers/radeonsi/radeonsi_compute.c 
b/src/gallium/drivers/radeonsi/radeonsi_compute.c
index 3fb6eb1..4341ecc 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_compute.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_compute.c
@@ -91,8 +91,11 @@ static void radeonsi_launch_grid(
    struct r600_context *rctx = (struct r600_context*)ctx;
    struct si_pipe_compute *program = rctx->cs_shader_state.program;
    struct si_pm4_state *pm4 = CALLOC_STRUCT(si_pm4_state);
+   struct si_resource *input_buffer = NULL;
+   uint32_t input_offset = 0;
+   uint64_t input_va;
    uint64_t shader_va;
-   unsigned arg_user_sgpr_count;
+   unsigned arg_user_sgpr_count = 2;
    unsigned i;
    struct si_pipe_shader *shader = &program->kernels[pc];
 
@@ -109,21 +112,16 @@ static void radeonsi_launch_grid(
    si_pm4_inval_shader_cache(pm4);
    si_cmd_surface_sync(pm4, pm4->cp_coher_cntl);
 
-   arg_user_sgpr_count = program->input_size / 4;
-   if (program->input_size % 4 != 0) {
-   arg_user_sgpr_count++;
-   }
+   /* Upload the input data */
+   r600_upload_const_buffer(rctx, &input_buffer, input,
+   program->input_size, &input_offset);
+   input_va = r600_resource_va(ctx->screen, (struct 
pipe_resource*)input_buffer);
+   input_va += input_offset;
 
-   /* XXX: We should store arguments in memory if we run out of user sgprs.
-*/
-   assert(arg_user_sgpr_count < 16);
+   si_pm4_add_bo(pm4, input_buffer, RADEON_USAGE_READ);
 
-   for (i = 0; i < arg_user_sgpr_count; i++) {
-   uint32_t *args = (uint32_t*)input;
-   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 +
-   (i * 4),
-   args[i]);
-   }
+   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0, input_va);
+   si_pm4_set_reg(pm4, R_00B900_COMPUTE_USER_DATA_0 + 4, 
S_008F04_BASE_ADDRESS_HI (input_va >> 32) | S_008F04_STRIDE(0));
 
si_pm4_set_reg(pm4, R_00B810_COMPUTE_START_X, 0);
si_pm4_set_reg(pm4, R_00B814_COMPUTE_START_Y, 0);
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index dec535c..98e54c7 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -24,8 +24,6 @@
  *  Christian König christian.koe...@amd.com
  */
 
-#include <byteswap.h>
-
 #include "util/u_memory.h"
 #include "util/u_framebuffer.h"
 #include "util/u_blitter.h"
@@ -2526,25 +2524,8 @@ static void si_set_constant_buffer(struct pipe_context 
*ctx, uint shader, uint i
ptr = input-user_buffer;
 
if (ptr) {
-   /* Upload the user buffer. */
-   if (R600_BIG_ENDIAN) {
-   uint32_t *tmpPtr;
-   unsigned i, size = input-buffer_size;
-
-   if (!(tmpPtr = malloc(size))) {
-   R600_ERR(Failed to allocate 

Re: [Mesa-dev] [PATCH] i965: Mask the cut index based on the index buffer type in 3DSTATE_VF.

2013-05-24 Thread Paul Berry
On 24 May 2013 12:23, Ian Romanick i...@freedesktop.org wrote:

 On 05/23/2013 03:46 PM, Kenneth Graunke wrote:

 According to the documentation: The Cut Index is compared to the
 fetched (and possibly-sign-extended) vertex index, and if these values


 Which documentation is this?  The only types that are valid for index
 buffers are unsigned, so what does possibly-sign-extended even mean?


This is from the Haswell bspec.  I have never understood what they mean by
possibly-sign-extended either.



  are equal, the current primitive topology is terminated.  Note that,
 for index buffers 32bpp, it is possible to set the Cut Index to a
 (large) value that will never match a sign-extended vertex index.

 This suggests that we should not set the value to 0xFFFFFFFF for
 unsigned byte or short index buffers, but rather 0xFF or 0xFFFF.


 For GL_PRIMITIVE_RESTART_FIXED_INDEX (ES and desktop 4.something),
 where the setting of the restart value is out of application control, this
 is absolutely correct.  The OpenGL 4.3 spec says:

 Primitive restart can also be enabled or disabled with a target
 of PRIMITIVE_RESTART_FIXED_INDEX. In this case, the primitive
 restart index is equal to 2^N - 1, where N is 8, 16 or 32 if the
 type is UNSIGNED_BYTE, UNSIGNED_SHORT, or UNSIGNED_INT,
 respectively, and the index value specified by
 PrimitiveRestartIndex is ignored.

 For GL_PRIMITIVE_RESTART, I'm not so sure.  I couldn't find anything
 conclusive in any of the specs.  The only thing I found was a hint in the
 NV_primitive_restart extension spec:

 *   What should the default primitive restart index be?

 RESOLVED: Zero.  It's tough to pick another number that is
 meaningful for all three element data types.  In practice, apps
 are likely to set it to 0xFFFF or 0xFFFFFFFF.

 You can infer from this that applications are expected to set 0xFFFF for
 GL_UNSIGNED_SHORT and 0xFFFFFFFF for GL_UNSIGNED_INT.  Experimentation is
 the only way to know for sure. :(


  Fixes sporadic failures in the ES 3 instanced_arrays_primitive_restart
 conformance test when run in combination with other tests.  No Piglit
 regressions.

 Cc: Ian Romanick i...@freedesktop.org
 Cc: Paul Berry stereotype...@gmail.com
 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
   src/mesa/drivers/dri/i965/brw_primitive_restart.c | 27
 ---
   1 file changed, 19 insertions(+), 8 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c
 b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
 index f824915..cf4a1ea 100644
 --- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c
 +++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c
 @@ -183,19 +183,30 @@ haswell_upload_cut_index(struct brw_context *brw)
      if (!intel->is_haswell)
         return;

 -   const unsigned cut_index_setting =
 -      ctx->Array._PrimitiveRestart ? HSW_CUT_INDEX_ENABLE : 0;
 -
 -   BEGIN_BATCH(2);
 -   OUT_BATCH(_3DSTATE_VF << 16 | cut_index_setting | (2 - 2));
 -   OUT_BATCH(ctx->Array._RestartIndex);
 -   ADVANCE_BATCH();
 +   if (ctx->Array._PrimitiveRestart) {
 +      int cut_index = ctx->Array._RestartIndex;
 +
 +      if (brw->ib.type == GL_UNSIGNED_BYTE)
 +         cut_index &= 0xff;
 +      else if (brw->ib.type == GL_UNSIGNED_SHORT)
 +         cut_index &= 0xffff;
 +
 +      BEGIN_BATCH(2);
 +      OUT_BATCH(_3DSTATE_VF << 16 | HSW_CUT_INDEX_ENABLE | (2 - 2));
 +      OUT_BATCH(cut_index);
 +      ADVANCE_BATCH();
 +   } else {
 +      BEGIN_BATCH(2);
 +      OUT_BATCH(_3DSTATE_VF << 16 | (2 - 2));
 +      OUT_BATCH(0);
 +      ADVANCE_BATCH();
 +   }
   }

   const struct brw_tracked_state haswell_cut_index = {
      .dirty = {
         .mesa  = _NEW_TRANSFORM,
 -      .brw   = 0,
 +      .brw   = BRW_NEW_INDEX_BUFFER,
         .cache = 0,
      },
      .emit = haswell_upload_cut_index,




[Mesa-dev] [PATCH 0/4] V2 Multiple viewports in Gallium

2013-05-24 Thread Zack Rusin
This series adds support for multiple viewports/scissors
to gallium and implements it in llvmpipe. All the other
drivers still support just a single viewport/scissor
combo and their behavior should be exactly the same as
it was.

I think this version addresses all of the review comments
and everyone's concerns. Please let me know if
I missed something.

Zack Rusin (4):
  gallium: Add support for multiple viewports
  draw: implement support for multiple viewports
  llvmpipe: implement support for multiple viewports
  draw: fixup draw_find_shader_output

 src/gallium/auxiliary/cso_cache/cso_context.c  |4 +-
 src/gallium/auxiliary/draw/draw_cliptest_tmp.h |   10 +++-
 src/gallium/auxiliary/draw/draw_context.c  |   63 +++-
 src/gallium/auxiliary/draw/draw_context.h  |6 +-
 src/gallium/auxiliary/draw/draw_gs.c   |   11 +++-
 src/gallium/auxiliary/draw/draw_gs.h   |1 +
 src/gallium/auxiliary/draw/draw_pipe_clip.c|   11 +++-
 src/gallium/auxiliary/draw/draw_private.h  |8 +--
 .../draw/draw_pt_fetch_shade_pipeline_llvm.c   |4 +-
 src/gallium/auxiliary/draw/draw_vertex.h   |2 +-
 src/gallium/auxiliary/draw/draw_vs.c   |7 ---
 src/gallium/auxiliary/draw/draw_vs_variant.c   |   34 +--
 src/gallium/auxiliary/tgsi/tgsi_scan.c |6 ++
 src/gallium/auxiliary/tgsi/tgsi_scan.h |1 +
 src/gallium/auxiliary/tgsi/tgsi_strings.c  |3 +-
 src/gallium/auxiliary/util/u_blitter.c |8 +--
 src/gallium/auxiliary/vl/vl_compositor.c   |4 +-
 src/gallium/auxiliary/vl/vl_idct.c |4 +-
 src/gallium/auxiliary/vl/vl_matrix_filter.c|2 +-
 src/gallium/auxiliary/vl/vl_mc.c   |2 +-
 src/gallium/auxiliary/vl/vl_median_filter.c|2 +-
 src/gallium/auxiliary/vl/vl_zscan.c|2 +-
 src/gallium/docs/source/context.rst|8 ++-
 src/gallium/drivers/freedreno/freedreno_state.c|   12 ++--
 src/gallium/drivers/galahad/glhd_context.c |   20 ---
 src/gallium/drivers/i915/i915_state.c  |   15 +++--
 src/gallium/drivers/identity/id_context.c  |   22 +++
 src/gallium/drivers/ilo/ilo_state.c|   16 +++--
 src/gallium/drivers/llvmpipe/lp_context.h  |7 ++-
 src/gallium/drivers/llvmpipe/lp_screen.c   |2 +
 src/gallium/drivers/llvmpipe/lp_setup.c|   29 +
 src/gallium/drivers/llvmpipe/lp_setup.h|4 +-
 src/gallium/drivers/llvmpipe/lp_setup_context.h|8 ++-
 src/gallium/drivers/llvmpipe/lp_setup_line.c   |   12 +++-
 src/gallium/drivers/llvmpipe/lp_setup_point.c  |   12 ++--
 src/gallium/drivers/llvmpipe/lp_setup_tri.c|   17 --
 src/gallium/drivers/llvmpipe/lp_state_clip.c   |   25 +---
 src/gallium/drivers/llvmpipe/lp_state_derived.c|   20 +--
 src/gallium/drivers/llvmpipe/lp_surface.c  |4 +-
 src/gallium/drivers/noop/noop_state.c  |   16 +++--
 src/gallium/drivers/nv30/nv30_draw.c   |2 +-
 src/gallium/drivers/nv30/nv30_state.c  |   16 +++--
 src/gallium/drivers/nv50/nv50_state.c  |   16 +++--
 src/gallium/drivers/nvc0/nvc0_state.c  |   16 +++--
 src/gallium/drivers/r300/r300_context.c|2 +-
 src/gallium/drivers/r300/r300_state.c  |   18 +++---
 src/gallium/drivers/r600/evergreen_state.c |6 +-
 src/gallium/drivers/r600/r600_state.c  |8 ++-
 src/gallium/drivers/r600/r600_state_common.c   |   10 ++--
 src/gallium/drivers/radeonsi/si_state.c|   16 +++--
 src/gallium/drivers/rbug/rbug_context.c|   22 +++
 src/gallium/drivers/softpipe/sp_screen.c   |2 +
 src/gallium/drivers/softpipe/sp_state_clip.c   |   19 +++---
 src/gallium/drivers/softpipe/sp_state_derived.c|2 +-
 src/gallium/drivers/svga/svga_pipe_misc.c  |   20 ---
 src/gallium/drivers/svga/svga_swtnl_state.c|2 +-
 src/gallium/drivers/trace/tr_context.c |   32 ++
 src/gallium/include/pipe/p_context.h   |   12 ++--
 src/gallium/include/pipe/p_defines.h   |3 +-
 src/gallium/include/pipe/p_shader_tokens.h |3 +-
 src/gallium/include/pipe/p_state.h |1 +
 src/gallium/tests/graw/fs-test.c   |2 +-
 src/gallium/tests/graw/graw_util.h |2 +-
 src/gallium/tests/graw/gs-test.c   |2 +-
 src/gallium/tests/graw/quad-sample.c   |2 +-
 src/gallium/tests/graw/shader-leak.c   |2 +-
 src/gallium/tests/graw/tri-gs.c|2 +-
 src/gallium/tests/graw/tri-instanced.c |2 +-
 src/gallium/tests/graw/vs-test.c   |2 +-
 

[Mesa-dev] [PATCH 2/4] draw: implement support for multiple viewports

2013-05-24 Thread Zack Rusin
This adds support for multiple viewports to the draw module.
Multiple viewports depend on the presence of geometry shaders
which can write the viewport index.

Signed-off-by: Zack Rusin za...@vmware.com
---
 src/gallium/auxiliary/draw/draw_cliptest_tmp.h |   10 +++-
 src/gallium/auxiliary/draw/draw_context.c  |   52 
 src/gallium/auxiliary/draw/draw_gs.c   |   11 -
 src/gallium/auxiliary/draw/draw_gs.h   |1 +
 src/gallium/auxiliary/draw/draw_pipe_clip.c|   11 -
 src/gallium/auxiliary/draw/draw_private.h  |8 +--
 .../draw/draw_pt_fetch_shade_pipeline_llvm.c   |4 +-
 src/gallium/auxiliary/draw/draw_vs.c   |7 ---
 src/gallium/auxiliary/draw/draw_vs_variant.c   |   34 +++--
 9 files changed, 105 insertions(+), 33 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_cliptest_tmp.h 
b/src/gallium/auxiliary/draw/draw_cliptest_tmp.h
index 48f2349..09e1fd7 100644
--- a/src/gallium/auxiliary/draw/draw_cliptest_tmp.h
+++ b/src/gallium/auxiliary/draw/draw_cliptest_tmp.h
@@ -31,8 +31,6 @@ static boolean TAG(do_cliptest)( struct pt_post_vs *pvs,
  struct draw_vertex_info *info )
 {
    struct vertex_header *out = info->verts;
-   const float *scale = pvs->draw->viewport.scale;
-   const float *trans = pvs->draw->viewport.translate;
    /* const */ float (*plane)[4] = pvs->draw->plane;
    const unsigned pos = draw_current_shader_position_output(pvs->draw);
    const unsigned cv = draw_current_shader_clipvertex_output(pvs->draw);
@@ -44,6 +42,9 @@ static boolean TAG(do_cliptest)( struct pt_post_vs *pvs,
    unsigned j;
    unsigned i;
    bool have_cd = false;
+   unsigned viewport_index_output =
+      draw_current_shader_viewport_index_output(pvs->draw);
+
    cd[0] = draw_current_shader_clipdistance_output(pvs->draw, 0);
    cd[1] = draw_current_shader_clipdistance_output(pvs->draw, 1);
 
@@ -52,7 +53,12 @@ static boolean TAG(do_cliptest)( struct pt_post_vs *pvs,
 
    for (j = 0; j < info->count; j++) {
       float *position = out->data[pos];
+      int viewport_index =
+         draw_current_shader_uses_viewport_index(pvs->draw) ?
+         *((unsigned*)out->data[viewport_index_output]): 0;
       unsigned mask = 0x0;
+      const float *scale = pvs->draw->viewports[viewport_index].scale;
+      const float *trans = pvs->draw->viewports[viewport_index].translate;
 
       initialize_vertex_header(out);
 
diff --git a/src/gallium/auxiliary/draw/draw_context.c 
b/src/gallium/auxiliary/draw/draw_context.c
index b555c65..4250f10 100644
--- a/src/gallium/auxiliary/draw/draw_context.c
+++ b/src/gallium/auxiliary/draw/draw_context.c
@@ -318,17 +318,24 @@ void draw_set_viewport_states( struct draw_context *draw,
 {
    const struct pipe_viewport_state *viewport = vps;
    draw_do_flush(draw, DRAW_FLUSH_PARAMETER_CHANGE);
-   draw->viewport = *viewport; /* struct copy */
-   draw->identity_viewport = (viewport->scale[0] == 1.0f &&
-                              viewport->scale[1] == 1.0f &&
-                              viewport->scale[2] == 1.0f &&
-                              viewport->scale[3] == 1.0f &&
-                              viewport->translate[0] == 0.0f &&
-                              viewport->translate[1] == 0.0f &&
-                              viewport->translate[2] == 0.0f &&
-                              viewport->translate[3] == 0.0f);
 
-   draw_vs_set_viewport( draw, viewport );
+   if (start_slot > PIPE_MAX_VIEWPORTS)
+      return;
+
+   if ((start_slot + num_viewports) > PIPE_MAX_VIEWPORTS)
+      num_viewports = PIPE_MAX_VIEWPORTS - start_slot;
+
+   memcpy(draw->viewports + start_slot, vps,
+          sizeof(struct pipe_viewport_state) * num_viewports);
+   draw->identity_viewport = (num_viewports == 1) &&
+      (viewport->scale[0] == 1.0f &&
+       viewport->scale[1] == 1.0f &&
+       viewport->scale[2] == 1.0f &&
+       viewport->scale[3] == 1.0f &&
+       viewport->translate[0] == 0.0f &&
+       viewport->translate[1] == 0.0f &&
+       viewport->translate[2] == 0.0f &&
+       viewport->translate[3] == 0.0f);
 }
 
 
@@ -695,6 +702,31 @@ draw_current_shader_position_output(const struct 
draw_context *draw)
 
 /**
  * Return the index of the shader output which will contain the
+ * viewport index.
+ */
+uint
+draw_current_shader_viewport_index_output(const struct draw_context *draw)
+{
+   if (draw->gs.geometry_shader)
+      return draw->gs.geometry_shader->viewport_index_output;
+   return 0;
+}
+
+/**
+ * Returns true if there's a geometry shader bound and the geometry
+ * shader writes out a viewport index.
+ */
+boolean
+draw_current_shader_uses_viewport_index(const struct draw_context *draw)
+{
+   if (draw->gs.geometry_shader)
+      return draw->gs.geometry_shader->info.writes_viewport_index;
+   return FALSE;
+}
+
+
+/**
+ * Return the index of the shader output which will contain the
  * vertex position.
  */
 uint
diff --git a/src/gallium/auxiliary/draw/draw_gs.c 

[Mesa-dev] [PATCH 3/4] llvmpipe: implement support for multiple viewports

2013-05-24 Thread Zack Rusin
Largely related to making sure the rasterizer can correctly
pick out the correct scissor box for the current viewport.

Signed-off-by: Zack Rusin za...@vmware.com
---
 src/gallium/drivers/llvmpipe/lp_context.h   |7 --
 src/gallium/drivers/llvmpipe/lp_screen.c|2 +-
 src/gallium/drivers/llvmpipe/lp_setup.c |   29 ++-
 src/gallium/drivers/llvmpipe/lp_setup.h |4 ++--
 src/gallium/drivers/llvmpipe/lp_setup_context.h |8 ---
 src/gallium/drivers/llvmpipe/lp_setup_line.c|   12 +++---
 src/gallium/drivers/llvmpipe/lp_setup_point.c   |   12 ++
 src/gallium/drivers/llvmpipe/lp_setup_tri.c |   17 +
 src/gallium/drivers/llvmpipe/lp_state_clip.c|6 +++--
 src/gallium/drivers/llvmpipe/lp_state_derived.c |   14 ++-
 src/gallium/drivers/llvmpipe/lp_surface.c   |4 ++--
 11 files changed, 79 insertions(+), 36 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_context.h 
b/src/gallium/drivers/llvmpipe/lp_context.h
index d605dba..54f3830 100644
--- a/src/gallium/drivers/llvmpipe/lp_context.h
+++ b/src/gallium/drivers/llvmpipe/lp_context.h
@@ -75,10 +75,10 @@ struct llvmpipe_context {
struct pipe_constant_buffer 
constants[PIPE_SHADER_TYPES][LP_MAX_TGSI_CONST_BUFFERS];
struct pipe_framebuffer_state framebuffer;
struct pipe_poly_stipple poly_stipple;
-   struct pipe_scissor_state scissor;
+   struct pipe_scissor_state scissors[PIPE_MAX_VIEWPORTS];
struct pipe_sampler_view 
*sampler_views[PIPE_SHADER_TYPES][PIPE_MAX_SHADER_SAMPLER_VIEWS];
 
-   struct pipe_viewport_state viewport;
+   struct pipe_viewport_state viewports[PIPE_MAX_VIEWPORTS];
struct pipe_vertex_buffer vertex_buffer[PIPE_MAX_ATTRIBS];
struct pipe_index_buffer index_buffer;
struct pipe_resource *mapped_vs_tex[PIPE_MAX_SHADER_SAMPLER_VIEWS];
@@ -116,6 +116,9 @@ struct llvmpipe_context {
/** Which vertex shader output slot contains point size */
int psize_slot;
 
+   /** Which vertex shader output slot contains viewport index */
+   int viewport_index_slot;
+
/** minimum resolvable depth value, for polygon offset */   
double mrd;

diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
b/src/gallium/drivers/llvmpipe/lp_screen.c
index 35630b9..562fb51 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -231,7 +231,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
pipe_cap param)
case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER:
   return 0;
case PIPE_CAP_MAX_VIEWPORTS:
-  return 1;
+  return PIPE_MAX_VIEWPORTS;
}
/* should only get here on unhandled cases */
debug_printf("Unexpected PIPE_CAP %d query\n", param);
diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c 
b/src/gallium/drivers/llvmpipe/lp_setup.c
index 9fef34e..c8b2767 100644
--- a/src/gallium/drivers/llvmpipe/lp_setup.c
+++ b/src/gallium/drivers/llvmpipe/lp_setup.c
@@ -616,17 +616,20 @@ lp_setup_set_blend_color( struct lp_setup_context *setup,
 
 
 void
-lp_setup_set_scissor( struct lp_setup_context *setup,
-  const struct pipe_scissor_state *scissor )
+lp_setup_set_scissors( struct lp_setup_context *setup,
+   const struct pipe_scissor_state *scissors )
 {
+   unsigned i;
LP_DBG(DEBUG_SETUP, "%s\n", __FUNCTION__);
 
-   assert(scissor);
+   assert(scissors);
 
-   setup->scissor.x0 = scissor->minx;
-   setup->scissor.x1 = scissor->maxx-1;
-   setup->scissor.y0 = scissor->miny;
-   setup->scissor.y1 = scissor->maxy-1;
+   for (i = 0; i < PIPE_MAX_VIEWPORTS; ++i) {
+  setup->scissors[i].x0 = scissors[i].minx;
+  setup->scissors[i].x1 = scissors[i].maxx-1;
+  setup->scissors[i].y0 = scissors[i].miny;
+  setup->scissors[i].y1 = scissors[i].maxy-1;
+   }
setup->dirty |= LP_SETUP_NEW_SCISSOR;
 }
 
@@ -1012,10 +1015,13 @@ try_update_scene_state( struct lp_setup_context *setup )
}
 
if (setup->dirty & LP_SETUP_NEW_SCISSOR) {
-  setup->draw_region = setup->framebuffer;
-  if (setup->scissor_test) {
- u_rect_possible_intersection(&setup->scissor,
-  &setup->draw_region);
+  unsigned i;
+  for (i = 0; i < PIPE_MAX_VIEWPORTS; ++i) {
+ setup->draw_regions[i] = setup->framebuffer;
+ if (setup->scissor_test) {
+u_rect_possible_intersection(&setup->scissors[i],
+ &setup->draw_regions[i]);
+ }
   }
   /* If the framebuffer is large we have to think about fixed-point
* integer overflow.  For 2K by 2K images, coordinates need 15 bits
@@ -1061,6 +1067,7 @@ lp_setup_update_state( struct lp_setup_context *setup,
* to know about vertex shader point size attribute.
*/
   setup->psize = lp->psize_slot;
+  setup->viewport_index_slot = lp->viewport_index_slot;
 
   assert(lp->dirty == 0);
 
diff --git a/src/gallium/drivers/llvmpipe/lp_setup.h 

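A minimal standalone sketch of the per-viewport scissor handling that the
PATCH 3/4 hunks above introduce. It is not llvmpipe code: `struct rect`,
`struct scissor_state`, `MAX_VIEWPORTS`, `rect_clip` and
`update_draw_regions` are hypothetical stand-ins. Only two things are taken
from the diff: the exclusive `maxx`/`maxy` bounds become inclusive `x1`/`y1`
by subtracting 1, and every viewport index gets its own draw region that
starts as the framebuffer and is clipped to that index's scissor when the
scissor test is enabled.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_VIEWPORTS 16  /* hypothetical stand-in for PIPE_MAX_VIEWPORTS */

/* Inclusive rectangle, mirroring the x0/x1/y0/y1 fields in the diff. */
struct rect {
   int x0, x1, y0, y1;
};

/* Scissor box with exclusive upper bounds (minx/maxx/miny/maxy). */
struct scissor_state {
   unsigned minx, miny, maxx, maxy;
};

/* Shrink *region in place so that it lies inside *clip. */
void
rect_clip(const struct rect *clip, struct rect *region)
{
   if (region->x0 < clip->x0) region->x0 = clip->x0;
   if (region->x1 > clip->x1) region->x1 = clip->x1;
   if (region->y0 < clip->y0) region->y0 = clip->y0;
   if (region->y1 > clip->y1) region->y1 = clip->y1;
}

/* Compute one draw region per viewport index. */
void
update_draw_regions(const struct scissor_state *scissors,
                    const struct rect *framebuffer,
                    bool scissor_test,
                    struct rect *draw_regions)
{
   assert(scissors && framebuffer && draw_regions);

   for (unsigned i = 0; i < MAX_VIEWPORTS; ++i) {
      /* Start from the whole framebuffer ... */
      draw_regions[i] = *framebuffer;

      if (scissor_test) {
         /* ... and clip to this viewport's scissor, converted from
          * exclusive to inclusive upper bounds.
          */
         struct rect s;
         s.x0 = (int)scissors[i].minx;
         s.x1 = (int)scissors[i].maxx - 1;
         s.y0 = (int)scissors[i].miny;
         s.y1 = (int)scissors[i].maxy - 1;
         rect_clip(&s, &draw_regions[i]);
      }
   }
}
```

A rasterizer would then index `draw_regions` with the viewport index read
from the vertex, which is what the `viewport_index_slot` plumbing in the
patch appears to be for.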
[Mesa-dev] [PATCH 4/4] draw: fixup draw_find_shader_output

2013-05-24 Thread Zack Rusin
Like most of the code in draw, draw_find_shader_output used to
depend on position always being at output slot 0, which meant
that a return value of 0 for any other attribute could signify an
error. Unfortunately, position can be at any of the output slots,
so other attributes can occupy slot 0 and we need a different
marker for the ones that were not found. This commit changes
draw_find_shader_output so that it returns -1 if it can't
find the given attribute and adjusts the code that depended
on it returning a value greater than 0 whenever it found an attrib.

Signed-off-by: Zack Rusin za...@vmware.com
---
 src/gallium/auxiliary/draw/draw_context.c   |4 ++--
 src/gallium/auxiliary/draw/draw_vertex.h|2 +-
 src/gallium/drivers/llvmpipe/lp_state_derived.c |8 
 src/gallium/drivers/softpipe/sp_state_derived.c |2 +-
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_context.c 
b/src/gallium/auxiliary/draw/draw_context.c
index 4250f10..91cb136 100644
--- a/src/gallium/auxiliary/draw/draw_context.c
+++ b/src/gallium/auxiliary/draw/draw_context.c
@@ -493,7 +493,7 @@ draw_alloc_extra_vertex_attrib(struct draw_context *draw,
uint n;
 
slot = draw_find_shader_output(draw, semantic_name, semantic_index);
-   if (slot > 0) {
+   if (slot >= 0) {
   return slot;
}
 
@@ -574,7 +574,7 @@ draw_find_shader_output(const struct draw_context *draw,
   }
}
 
-   return 0;
+   return -1;
 }
 
 
diff --git a/src/gallium/auxiliary/draw/draw_vertex.h 
b/src/gallium/auxiliary/draw/draw_vertex.h
index c87c3d8..9e10ada 100644
--- a/src/gallium/auxiliary/draw/draw_vertex.h
+++ b/src/gallium/auxiliary/draw/draw_vertex.h
@@ -125,7 +125,7 @@ static INLINE uint
 draw_emit_vertex_attr(struct vertex_info *vinfo,
   enum attrib_emit emit, 
   enum interp_mode interp, /* only used by softpipe??? */
-  uint src_index)
+  int src_index)
 {
const uint n = vinfo->num_attribs;
assert(n < Elements(vinfo->attrib));
diff --git a/src/gallium/drivers/llvmpipe/lp_state_derived.c 
b/src/gallium/drivers/llvmpipe/lp_state_derived.c
index 9c5e847..ea24ffc 100644
--- a/src/gallium/drivers/llvmpipe/lp_state_derived.c
+++ b/src/gallium/drivers/llvmpipe/lp_state_derived.c
@@ -50,7 +50,7 @@ compute_vertex_info(struct llvmpipe_context *llvmpipe)
 {
const struct lp_fragment_shader *lpfs = llvmpipe->fs;
struct vertex_info *vinfo = &llvmpipe->vertex_info;
-   unsigned vs_index;
+   int vs_index;
uint i;
 
llvmpipe->color_slot[0] = -1;
@@ -99,7 +99,7 @@ compute_vertex_info(struct llvmpipe_context *llvmpipe)
   vs_index = draw_find_shader_output(llvmpipe->draw,
  TGSI_SEMANTIC_BCOLOR, i);
 
-  if (vs_index > 0) {
+  if (vs_index >= 0) {
  llvmpipe->bcolor_slot[i] = (int)vinfo->num_attribs;
  draw_emit_vertex_attr(vinfo, EMIT_4F, INTERP_PERSPECTIVE, vs_index);
   }
@@ -111,7 +111,7 @@ compute_vertex_info(struct llvmpipe_context *llvmpipe)
vs_index = draw_find_shader_output(llvmpipe->draw,
   TGSI_SEMANTIC_PSIZE, 0);
 
-   if (vs_index > 0) {
+   if (vs_index >= 0) {
   llvmpipe->psize_slot = vinfo->num_attribs;
   draw_emit_vertex_attr(vinfo, EMIT_4F, INTERP_CONSTANT, vs_index);
}
@@ -120,7 +120,7 @@ compute_vertex_info(struct llvmpipe_context *llvmpipe)
vs_index = draw_find_shader_output(llvmpipe->draw,
   TGSI_SEMANTIC_VIEWPORT_INDEX,
   0);
-   if (vs_index > 0) {
+   if (vs_index >= 0) {
   llvmpipe->viewport_index_slot = vinfo->num_attribs;
   draw_emit_vertex_attr(vinfo, EMIT_4F, INTERP_CONSTANT, vs_index);
} else {
diff --git a/src/gallium/drivers/softpipe/sp_state_derived.c 
b/src/gallium/drivers/softpipe/sp_state_derived.c
index 85fd47d..93cd38e 100644
--- a/src/gallium/drivers/softpipe/sp_state_derived.c
+++ b/src/gallium/drivers/softpipe/sp_state_derived.c
@@ -137,7 +137,7 @@ softpipe_get_vertex_info(struct softpipe_context *softpipe)
 
   softpipe->psize_slot = draw_find_shader_output(softpipe->draw,
  TGSI_SEMANTIC_PSIZE, 0);
-  if (softpipe->psize_slot > 0) {
+  if (softpipe->psize_slot >= 0) {
  draw_emit_vertex_attr(vinfo, EMIT_4F, INTERP_CONSTANT,
softpipe->psize_slot);
   }
-- 
1.7.10.4
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
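
The return-value change described in PATCH 4/4 above can be illustrated
with a small standalone lookup. This is only a sketch under assumptions:
`struct shader_output`, the `SEM_*` constants and `find_output` are
hypothetical and are not the draw module's types or its
draw_find_shader_output. The point it shows is that once slot 0 is a legal
result, "not found" needs an out-of-range value such as -1, and callers
must test with `>= 0`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical semantic names, standing in for TGSI_SEMANTIC_*. */
enum { SEM_POSITION, SEM_PSIZE, SEM_BCOLOR };

struct shader_output {
   int semantic_name;
   int semantic_index;
};

/* Return the slot holding (name, index), or -1 if no output matches.
 * Slot 0 is a valid answer, so 0 cannot double as the error value.
 */
int
find_output(const struct shader_output *outputs, size_t count,
            int name, int index)
{
   assert(outputs || count == 0);

   for (size_t i = 0; i < count; ++i) {
      if (outputs[i].semantic_name == name &&
          outputs[i].semantic_index == index)
         return (int)i;
   }
   return -1;
}
```

With point size in slot 0 and position in slot 1, a caller that tested
`slot > 0` would wrongly treat the point size output as missing; testing
`slot >= 0` handles both cases.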


[Mesa-dev] intel driver blit rework

2013-05-24 Thread Eric Anholt
Here's a big rework of blitting.  It's a follow-on to some of the work I
started back in February to make struct intel_region die in a fire.  It's
not the reduction in code I had hoped for, but it fixes a lot of
bugs and it should make it much easier to extend our driver to support
fast color clears (and possibly zero-copy PBOs again).

The code is available on the mtblit branch of
git://people.freedesktop.org/~anholt/mesa



[Mesa-dev] [PATCH 03/17] intel: Rename intel_renderbuffer_tile_offsets.

2013-05-24 Thread Eric Anholt
This makes it more consistent with intel_miptree_get_tile_offsets().
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 4 ++--
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 2 +-
 src/mesa/drivers/dri/intel/intel_fbo.h| 6 +++---
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 2022159..f73ea20 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -1329,7 +1329,7 @@ brw_update_renderbuffer_surface(struct brw_context *brw,
gl_format rb_format = _mesa_get_render_format(ctx, intel_rb_format(irb));
 
if (rb->TexImage && !brw->has_surface_tile_offset) {
-  intel_renderbuffer_tile_offsets(irb, &tile_x, &tile_y);
+  intel_renderbuffer_get_tile_offsets(irb, &tile_x, &tile_y);
 
   if (tile_x != 0 || tile_y != 0) {
 /* Original gen4 hardware couldn't draw to a non-tile-aligned
@@ -1358,7 +1358,7 @@ brw_update_renderbuffer_surface(struct brw_context *brw,
  format << BRW_SURFACE_FORMAT_SHIFT);
 
/* reloc */
-   surf[1] = (intel_renderbuffer_tile_offsets(irb, &tile_x, &tile_y) +
+   surf[1] = (intel_renderbuffer_get_tile_offsets(irb, &tile_x, &tile_y) +
  region->bo->offset);
 
surf[2] = ((rb->Width - 1) << BRW_SURFACE_WIDTH_SHIFT |
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index c23a8be..0376705 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -560,7 +560,7 @@ gen7_update_renderbuffer_surface(struct brw_context *brw,
   surf[0] |= GEN7_SURFACE_HALIGN_8;
 
/* reloc */
-   surf[1] = intel_renderbuffer_tile_offsets(irb, &tile_x, &tile_y) +
+   surf[1] = intel_renderbuffer_get_tile_offsets(irb, &tile_x, &tile_y) +
  region->bo->offset; /* reloc */
 
assert(brw->has_surface_tile_offset);
diff --git a/src/mesa/drivers/dri/intel/intel_fbo.h 
b/src/mesa/drivers/dri/intel/intel_fbo.h
index 5d6dc7e..e1b4df5 100644
--- a/src/mesa/drivers/dri/intel/intel_fbo.h
+++ b/src/mesa/drivers/dri/intel/intel_fbo.h
@@ -150,9 +150,9 @@ void
 intel_renderbuffer_set_draw_offset(struct intel_renderbuffer *irb);
 
 static inline uint32_t
-intel_renderbuffer_tile_offsets(struct intel_renderbuffer *irb,
-   uint32_t *tile_x,
-   uint32_t *tile_y)
+intel_renderbuffer_get_tile_offsets(struct intel_renderbuffer *irb,
+uint32_t *tile_x,
+uint32_t *tile_y)
 {
return intel_miptree_get_tile_offsets(irb->mt, irb->mt_level, irb->mt_layer,
  tile_x, tile_y);
-- 
1.8.3.rc0



[Mesa-dev] [PATCH 04/17] intel: Make a wrapper for intelEmitCopyBlit using miptrees.

2013-05-24 Thread Eric Anholt
I had previously asserted that it was hard to write a useful, simpler
blit function, but I think this might be it.

This has the side effect of extending the 32k pitch check to a few more
places that were missing it.
---
 src/mesa/drivers/dri/intel/intel_blit.c| 91 ++
 src/mesa/drivers/dri/intel/intel_blit.h| 10 +++
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 15 ++---
 src/mesa/drivers/dri/intel/intel_pixel_copy.c  | 42 +++-
 src/mesa/drivers/dri/intel/intel_tex_copy.c| 80 --
 5 files changed, 127 insertions(+), 111 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_blit.c 
b/src/mesa/drivers/dri/intel/intel_blit.c
index f9cba85..007f900 100644
--- a/src/mesa/drivers/dri/intel/intel_blit.c
+++ b/src/mesa/drivers/dri/intel/intel_blit.c
@@ -85,6 +85,97 @@ br13_for_cpp(int cpp)
}
 }
 
+/**
+ * Implements a rectangular block transfer (blit) of pixels between two
+ * miptrees.
+ *
+ * Our blitter can operate on 1, 2, or 4-byte-per-pixel data, with generous,
+ * but limited, pitches and sizes allowed.
+ *
+ * The src/dst coordinates are relative to the given level/slice of the
+ * miptree.
+ *
+ * If @src_flip or @dst_flip is set, then the rectangle within that miptree
+ * will be inverted (including scanline order) when copying.  This is common
+ * in GL when copying between window system and user-created
+ * renderbuffers/textures.
+ */
+bool
+intel_miptree_blit(struct intel_context *intel,
+   struct intel_mipmap_tree *src_mt,
+   int src_level, int src_slice,
+   uint32_t src_x, uint32_t src_y, bool src_flip,
+   struct intel_mipmap_tree *dst_mt,
+   int dst_level, int dst_slice,
+   uint32_t dst_x, uint32_t dst_y, bool dst_flip,
+   uint32_t width, uint32_t height,
+   GLenum logicop)
+{
+   /* We don't assert on format because we may blit from ARGB to XRGB,
+* for example.
+*/
+   assert(src_mt->cpp == dst_mt->cpp);
+
+   /* According to the Ivy Bridge PRM, Vol1 Part4, section 1.2.1.2 (Graphics
+* Data Size Limitations):
+*
+*The BLT engine is capable of transferring very large quantities of
+*graphics data. Any graphics data read from and written to the
+*destination is permitted to represent a number of pixels that
+*occupies up to 65,536 scan lines and up to 32,768 bytes per scan line
+*at the destination. The maximum number of pixels that may be
+*represented per scan line’s worth of graphics data depends on the
+*color depth.
+*
+* Furthermore, intel_miptree_blit (which is called below) uses a signed
+* 16-bit integer to represent buffer pitch, so it can only handle buffer
+* pitches < 32k.
+*
+* As a result of these two limitations, we can only use the blitter to do
+* this copy when the region's pitch is less than 32k.
+*/
+   if (src_mt->region->pitch > 32768 ||
+   dst_mt->region->pitch > 32768) {
+  perf_debug("Falling back due to >32k pitch\n");
+  return false;
+   }
+
+   if (src_flip)
+  src_y = src_mt->level[src_level].height - src_y - height;
+
+   if (dst_flip)
+  dst_y = dst_mt->level[dst_level].height - dst_y - height;
+
+   int src_pitch = src_mt->region->pitch;
+   if (src_flip != dst_flip)
+  src_pitch = -src_pitch;
+
+   uint32_t src_image_x, src_image_y;
+   intel_miptree_get_image_offset(src_mt, src_level, src_slice,
+  &src_image_x, &src_image_y);
+   src_x += src_image_x;
+   src_y += src_image_y;
+
+   uint32_t dst_image_x, dst_image_y;
+   intel_miptree_get_image_offset(dst_mt, dst_level, dst_slice,
+  &dst_image_x, &dst_image_y);
+   dst_x += dst_image_x;
+   dst_y += dst_image_y;
+
+   return intelEmitCopyBlit(intel,
+src_mt->cpp,
+src_pitch,
+src_mt->region->bo, src_mt->offset,
+src_mt->region->tiling,
+dst_mt->region->pitch,
+dst_mt->region->bo, dst_mt->offset,
+dst_mt->region->tiling,
+src_x, src_y,
+dst_x, dst_y,
+width, height,
+logicop);
+}
+
 /* Copy BitBlt
  */
 bool
diff --git a/src/mesa/drivers/dri/intel/intel_blit.h 
b/src/mesa/drivers/dri/intel/intel_blit.h
index d195e6b..9bfe91d 100644
--- a/src/mesa/drivers/dri/intel/intel_blit.h
+++ b/src/mesa/drivers/dri/intel/intel_blit.h
@@ -51,6 +51,16 @@ intelEmitCopyBlit(struct intel_context *intel,
   GLshort w, GLshort h,
  GLenum logicop );
 
+bool intel_miptree_blit(struct intel_context *intel,
+struct intel_mipmap_tree *src_mt,
+int 

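The flip handling in intel_miptree_blit() from PATCH 04/17 above can be
looked at in isolation. The sketch below is standalone C, not driver code:
`struct blit_coords` and `resolve_flips` are hypothetical names, and the
miptree level heights and the pitch are passed in as plain integers. It
follows the arithmetic visible in the diff: a flipped rectangle is measured
from the opposite edge of the level, and the source pitch is negated only
when exactly one side is flipped, so that scanline order is reversed during
the copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Result of resolving the two flip flags (hypothetical helper type). */
struct blit_coords {
   uint32_t src_y;
   uint32_t dst_y;
   int src_pitch;
};

struct blit_coords
resolve_flips(uint32_t src_level_height, uint32_t src_y, bool src_flip,
              uint32_t dst_level_height, uint32_t dst_y, bool dst_flip,
              uint32_t height, int src_pitch)
{
   struct blit_coords c;

   /* The rectangle must fit inside each level for the subtraction
    * below to be meaningful.
    */
   assert(src_y + height <= src_level_height);
   assert(dst_y + height <= dst_level_height);

   /* A flipped image counts rows from the other edge. */
   c.src_y = src_flip ? src_level_height - src_y - height : src_y;
   c.dst_y = dst_flip ? dst_level_height - dst_y - height : dst_y;

   /* If both or neither side is flipped the row order already agrees;
    * otherwise walk the source rows backwards.
    */
   c.src_pitch = (src_flip != dst_flip) ? -src_pitch : src_pitch;

   return c;
}
```

For example, reading 20 rows starting at y = 10 from a flipped 100-row
source lands on rows 70..89 of the underlying storage.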
[Mesa-dev] [PATCH 01/17] intel: Make intel_miptree_get_tile_offsets return a page offset.

2013-05-24 Thread Eric Anholt
Right now, the callers in i965 don't expect a nonzero page offset to
actually occur (since that's being handled elsewhere), but it seems
like a trap to leave it this way.
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  |  6 +++---
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c |  7 ---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c| 21 ++---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.h|  2 +-
 4 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index bbe8579..2022159 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -986,6 +986,8 @@ brw_update_texture_surface(struct gl_context *ctx,
   BRW_SURFACE_FORMAT_SHIFT));
 
surf[1] = intelObj->mt->region->bo->offset + intelObj->mt->offset; /* reloc */
+   surf[1] += intel_miptree_get_tile_offsets(intelObj->mt, firstImage->Level, 0,
+ &tile_x, &tile_y);
 
surf[2] = ((intelObj->_MaxLevel - tObj->BaseLevel) << BRW_SURFACE_LOD_SHIFT |
  (width - 1) << BRW_SURFACE_WIDTH_SHIFT |
@@ -998,8 +1000,6 @@ brw_update_texture_surface(struct gl_context *ctx,
 
surf[4] = brw_get_surface_num_multisamples(intelObj->mt->num_samples);
 
-   intel_miptree_get_tile_offsets(intelObj->mt, firstImage->Level, 0,
-  &tile_x, &tile_y);
assert(brw->has_surface_tile_offset || (tile_x == 0 && tile_y == 0));
/* Note that the low bits of these fields are missing, so
 * there's the possibility of getting in trouble.
@@ -1014,7 +1014,7 @@ brw_update_texture_surface(struct gl_context *ctx,
drm_intel_bo_emit_reloc(brw->intel.batch.bo,
   binding_table[surf_index] + 4,
   intelObj->mt->region->bo,
-   intelObj->mt->offset,
+   surf[1] - intelObj->mt->region->bo->offset,
   I915_GEM_DOMAIN_SAMPLER, 0);
 }
 
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 435f9dc..c23a8be 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -331,6 +331,8 @@ gen7_update_texture_surface(struct gl_context *ctx,
   surf[0] |= GEN7_SURFACE_ARYSPC_LOD0;
 
surf[1] = mt->region->bo->offset + mt->offset; /* reloc */
+   surf[1] += intel_miptree_get_tile_offsets(intelObj->mt, firstImage->Level, 0,
+ &tile_x, &tile_y);
 
surf[2] = SET_FIELD(width - 1, GEN7_SURFACE_WIDTH) |
  SET_FIELD(height - 1, GEN7_SURFACE_HEIGHT);
@@ -339,8 +341,6 @@ gen7_update_texture_surface(struct gl_context *ctx,
 
surf[4] = gen7_surface_msaa_bits(mt->num_samples, mt->msaa_layout);
 
-   intel_miptree_get_tile_offsets(intelObj->mt, firstImage->Level, 0,
-  &tile_x, &tile_y);
assert(brw->has_surface_tile_offset || (tile_x == 0 && tile_y == 0));
/* Note that the low bits of these fields are missing, so
 * there's the possibility of getting in trouble.
@@ -372,7 +372,8 @@ gen7_update_texture_surface(struct gl_context *ctx,
/* Emit relocation to surface contents */
drm_intel_bo_emit_reloc(brw->intel.batch.bo,
   binding_table[surf_index] + 4,
-  intelObj->mt->region->bo, intelObj->mt->offset,
+  intelObj->mt->region->bo,
+   surf[1] - intelObj->mt->region->bo->offset,
   I915_GEM_DOMAIN_SAMPLER, 0);
 
gen7_check_surface_setup(surf, false /* is_render_target */);
diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index d967b19..0278799 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -777,19 +777,34 @@ intel_miptree_get_image_offset(struct intel_mipmap_tree *mt,
*y = mt->level[level].slice[slice].y_offset;
 }
 
-void
+/**
+ * Rendering with tiled buffers requires that the base address of the buffer
+ * be aligned to a page boundary.  For renderbuffers, and sometimes with
+ * textures, we may want the surface to point at a texture image level that
+ * isn't at a page boundary.
+ *
+ * This function returns an appropriately-aligned base offset
+ * according to the tiling restrictions, plus any required x/y offset
+ * from there.
+ */
+uint32_t
 intel_miptree_get_tile_offsets(struct intel_mipmap_tree *mt,
GLuint level, GLuint slice,
uint32_t *tile_x,
uint32_t *tile_y)
 {
struct intel_region *region = mt->region;
+   uint32_t x, y;
uint32_t mask_x, mask_y;
 
intel_region_get_tile_masks(region, &mask_x, &mask_y, false);
+   

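The doc comment added in PATCH 01/17 above describes splitting an image
position into a page-aligned base offset plus a small x/y offset inside the
tile. The following is a rough standalone sketch of that split, not the
driver's implementation: `struct tile_split` and `split_tile_offsets` are
hypothetical, and the layout is an assumption (X-tiling style tiles of 512
bytes by 8 rows, i.e. one 4096-byte page, with a pitch that is a multiple
of 512 bytes). The real code gets its masks from
intel_region_get_tile_masks() and its base offset from
intel_region_get_aligned_offset().

```c
#include <assert.h>
#include <stdint.h>

/* Page-aligned base plus intra-tile remainder (hypothetical type). */
struct tile_split {
   uint32_t base_offset;  /* bytes, multiple of 4096 */
   uint32_t tile_x;       /* pixels from the left edge of the tile */
   uint32_t tile_y;       /* rows from the top edge of the tile */
};

/* x, y are in pixels, cpp is bytes per pixel (1, 2 or 4), and pitch is
 * the surface pitch in bytes.  Assumes 512-byte by 8-row tiles.
 */
struct tile_split
split_tile_offsets(uint32_t x, uint32_t y, uint32_t cpp, uint32_t pitch)
{
   assert(cpp == 1 || cpp == 2 || cpp == 4);
   assert(pitch % 512 == 0);

   const uint32_t tile_w = 512 / cpp;   /* pixels per tile row */
   const uint32_t tile_h = 8;           /* rows per tile */
   const uint32_t mask_x = tile_w - 1;
   const uint32_t mask_y = tile_h - 1;

   struct tile_split s;

   /* The low bits are the position inside the tile ... */
   s.tile_x = x & mask_x;
   s.tile_y = y & mask_y;

   /* ... and the remaining bits select the tile itself. */
   const uint32_t aligned_x = x & ~mask_x;
   const uint32_t aligned_y = y & ~mask_y;

   /* Each row of tiles occupies 8 * pitch bytes and each tile within
    * that row occupies one 4096-byte page.
    */
   s.base_offset = aligned_y * pitch + (aligned_x / tile_w) * 4096;

   return s;
}
```

The surface state then points at `base_offset` and programs `tile_x` and
`tile_y` separately, which is why the callers in the diff add the return
value to surf[1].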
[Mesa-dev] [PATCH 02/17] intel: Reduce intel_renderbuffer_tile_offsets to a thin wrapper.

2013-05-24 Thread Eric Anholt
---
 src/mesa/drivers/dri/intel/intel_fbo.c | 26 --
 src/mesa/drivers/dri/intel/intel_fbo.h |  9 +++--
 2 files changed, 7 insertions(+), 28 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c 
b/src/mesa/drivers/dri/intel/intel_fbo.c
index 69f8629..34f31fb 100644
--- a/src/mesa/drivers/dri/intel/intel_fbo.c
+++ b/src/mesa/drivers/dri/intel/intel_fbo.c
@@ -535,32 +535,6 @@ intel_renderbuffer_set_draw_offset(struct 
intel_renderbuffer *irb)
 }
 
 /**
- * Rendering to tiled buffers requires that the base address of the
- * buffer be aligned to a page boundary.  We generally render to
- * textures by pointing the surface at the mipmap image level, which
- * may not be aligned to a tile boundary.
- *
- * This function returns an appropriately-aligned base offset
- * according to the tiling restrictions, plus any required x/y offset
- * from there.
- */
-uint32_t
-intel_renderbuffer_tile_offsets(struct intel_renderbuffer *irb,
-   uint32_t *tile_x,
-   uint32_t *tile_y)
-{
-   struct intel_region *region = irb->mt->region;
-   uint32_t mask_x, mask_y;
-
-   intel_region_get_tile_masks(region, &mask_x, &mask_y, false);
-
-   *tile_x = irb->draw_x & mask_x;
-   *tile_y = irb->draw_y & mask_y;
-   return intel_region_get_aligned_offset(region, irb->draw_x & ~mask_x,
-  irb->draw_y & ~mask_y, false);
-}
-
-/**
  * Called by glFramebufferTexture[123]DEXT() (and other places) to
  * prepare for rendering into texture memory.  This might be called
  * many times to choose different texture levels, cube faces, etc
diff --git a/src/mesa/drivers/dri/intel/intel_fbo.h 
b/src/mesa/drivers/dri/intel/intel_fbo.h
index aa52b97..5d6dc7e 100644
--- a/src/mesa/drivers/dri/intel/intel_fbo.h
+++ b/src/mesa/drivers/dri/intel/intel_fbo.h
@@ -33,6 +33,7 @@
 #include "main/formats.h"
 #include "main/macros.h"
 #include "intel_context.h"
+#include "intel_mipmap_tree.h"
 #include "intel_screen.h"
 
 #ifdef __cplusplus
@@ -148,10 +149,14 @@ intel_flip_renderbuffers(struct gl_framebuffer *fb);
 void
 intel_renderbuffer_set_draw_offset(struct intel_renderbuffer *irb);
 
-uint32_t
+static inline uint32_t
 intel_renderbuffer_tile_offsets(struct intel_renderbuffer *irb,
uint32_t *tile_x,
-   uint32_t *tile_y);
+   uint32_t *tile_y)
+{
+   return intel_miptree_get_tile_offsets(irb->mt, irb->mt_level, irb->mt_layer,
+ tile_x, tile_y);
+}
 
 struct intel_region*
 intel_get_rb_region(struct gl_framebuffer *fb, GLuint attIndex);
-- 
1.8.3.rc0



[Mesa-dev] [PATCH 05/17] i965: Consistently do depth resolves before blitting.

2013-05-24 Thread Eric Anholt
We were protected for a long time by the fact that depth was Y tiled and
you couldn't blit Y.  Now that we can blit Y, we were failing to resolve
depth in glCopyPixels().

Regarding the removed comment about swrast: the swrast map path already
does resolves appropriately.
---
 src/mesa/drivers/dri/intel/intel_blit.c| 6 ++
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 6 --
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_blit.c 
b/src/mesa/drivers/dri/intel/intel_blit.c
index 007f900..ddb9edb 100644
--- a/src/mesa/drivers/dri/intel/intel_blit.c
+++ b/src/mesa/drivers/dri/intel/intel_blit.c
@@ -140,6 +140,12 @@ intel_miptree_blit(struct intel_context *intel,
   return false;
}
 
+   /* The blitter has no idea about HiZ, so we need to get the real depth
+* data into the two miptrees before we do anything.
+*/
+   intel_miptree_slice_resolve_depth(intel, src_mt, src_level, src_slice);
+   intel_miptree_slice_resolve_depth(intel, dst_mt, dst_level, dst_slice);
+
if (src_flip)
   src_y = src_mt->level[src_level].height - src_y - height;
 
diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index eedf80c..c3e55f4 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -919,12 +919,6 @@ intel_miptree_copy_slice(struct intel_context *intel,
dst_mt, dst_x, dst_y, dst_mt->region->pitch,
width, height);
 
-   /* Since we are about to copy depth data using either the blitter or swrast
-* (neither of which respect HiZ), we need to do a depth resolve first.
-*/
-   intel_miptree_slice_resolve_depth(intel, src_mt, level, slice);
-   intel_miptree_slice_resolve_depth(intel, dst_mt, level, slice);
-
if (!intel_miptree_blit(intel,
src_mt, level, slice, 0, 0, false,
dst_mt, level, slice, 0, 0, false,
-- 
1.8.3.rc0



[Mesa-dev] [PATCH 06/17] i965: Prefer blorp glBlitFramebuffer() to the glCopyTexSubImage-based blit.

2013-05-24 Thread Eric Anholt
I think we've measured no performance difference from this in the past,
except that the blorp code can do things like multisample resolves.
This prevents a piglit regression in the next commit, where a testcase
starts trying to do a multisampled resolve through the old
glCopyTexSubImage() path.
---
 src/mesa/drivers/dri/intel/intel_fbo.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c 
b/src/mesa/drivers/dri/intel/intel_fbo.c
index 34f31fb..05ff784 100644
--- a/src/mesa/drivers/dri/intel/intel_fbo.c
+++ b/src/mesa/drivers/dri/intel/intel_fbo.c
@@ -816,14 +816,6 @@ intel_blit_framebuffer(struct gl_context *ctx,
GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1,
GLbitfield mask, GLenum filter)
 {
-   /* Try faster, glCopyTexSubImage2D approach first which uses the BLT. */
-   mask = intel_blit_framebuffer_copy_tex_sub_image(ctx,
-srcX0, srcY0, srcX1, srcY1,
-dstX0, dstY0, dstX1, dstY1,
-mask, filter);
-   if (mask == 0x0)
-  return;
-
 #ifndef I915
mask = brw_blorp_framebuffer(intel_context(ctx),
 srcX0, srcY0, srcX1, srcY1,
@@ -833,6 +825,15 @@ intel_blit_framebuffer(struct gl_context *ctx,
   return;
 #endif
 
+   /* Try glCopyTexSubImage2D approach which uses the BLT. */
+   mask = intel_blit_framebuffer_copy_tex_sub_image(ctx,
+srcX0, srcY0, srcX1, srcY1,
+dstX0, dstY0, dstX1, dstY1,
+mask, filter);
+   if (mask == 0x0)
+  return;
+
+
_mesa_meta_BlitFramebuffer(ctx,
   srcX0, srcY0, srcX1, srcY1,
   dstX0, dstY0, dstX1, dstY1,
-- 
1.8.3.rc0


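The reordering in PATCH 06/17 above relies on a simple convention: each
blit path takes the mask of buffers still to be copied and returns the bits
it could not handle, and the caller stops as soon as the mask is empty. A
minimal standalone sketch of that convention follows. The names
(`blit_path_fn`, `run_blit_paths`, the two toy paths and the `BLIT_*` bits)
are hypothetical and only illustrate the control flow; they are not the
driver's functions or GL's buffer bits.

```c
#include <assert.h>
#include <stddef.h>

#define BLIT_COLOR 0x1u  /* hypothetical buffer bits */
#define BLIT_DEPTH 0x2u

/* A path consumes the bits it can handle and returns the rest. */
typedef unsigned (*blit_path_fn)(unsigned mask);

/* Toy path that can only copy the color buffer. */
unsigned
color_only_path(unsigned mask)
{
   return mask & ~BLIT_COLOR;
}

/* Toy path that can only copy the depth buffer. */
unsigned
depth_only_path(unsigned mask)
{
   return mask & ~BLIT_DEPTH;
}

/* Try each path in priority order until nothing is left to copy.
 * Returns the bits no path handled, for a final generic fallback.
 */
unsigned
run_blit_paths(const blit_path_fn *paths, size_t num_paths, unsigned mask)
{
   assert(paths || num_paths == 0);

   for (size_t i = 0; i < num_paths && mask != 0; ++i)
      mask = paths[i](mask);

   return mask;
}
```

Under this convention, preferring one path over another is just a matter of
its position in the list, which is all the patch changes for blorp versus
the glCopyTexSubImage-based blit.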

[Mesa-dev] [PATCH 07/17] i965: Allow glCopyTexSubImage() on depth textures.

2013-05-24 Thread Eric Anholt
If the hw is pre-gen5 and can't blit depth, it'll cleanly error out.
---
 src/mesa/drivers/dri/intel/intel_tex_copy.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_tex_copy.c 
b/src/mesa/drivers/dri/intel/intel_tex_copy.c
index 7a38082..94e90da 100644
--- a/src/mesa/drivers/dri/intel/intel_tex_copy.c
+++ b/src/mesa/drivers/dri/intel/intel_tex_copy.c
@@ -96,11 +96,6 @@ intel_copy_texsubimage(struct intel_context *intel,
   return false;
}
 
-   /* The blitter can't handle Y-tiled buffers. */
-   if (intelImage->mt->region->tiling == I915_TILING_Y) {
-  return false;
-   }
-
/* blit from src buffer to texture */
if (!intel_miptree_blit(intel,
irb->mt, irb->mt_level, irb->mt_layer,
-- 
1.8.3.rc0



[Mesa-dev] [PATCH 13/17] intel: Rebuild PBO blit glReadPixels() on top of miptrees.

2013-05-24 Thread Eric Anholt
The previous code was missing depth resolves; the problem had only been
hidden because Y-tiled buffers could not be blitted.  The pair of flip args
in the new blit function means that we can just drop the pack->Invert fallback.
---
 src/mesa/drivers/dri/intel/intel_pixel_read.c | 48 +--
 1 file changed, 23 insertions(+), 25 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_pixel_read.c 
b/src/mesa/drivers/dri/intel/intel_pixel_read.c
index ebdc528..26eb496 100644
--- a/src/mesa/drivers/dri/intel/intel_pixel_read.c
+++ b/src/mesa/drivers/dri/intel/intel_pixel_read.c
@@ -76,7 +76,6 @@ do_blit_readpixels(struct gl_context * ctx,
const struct gl_pixelstore_attrib *pack, GLvoid * pixels)
 {
struct intel_context *intel = intel_context(ctx);
-   struct intel_region *src = intel_readbuf_region(intel);
struct intel_buffer_object *dst = intel_buffer_object(pack->BufferObj);
GLuint dst_offset;
drm_intel_bo *dst_buffer;
@@ -86,9 +85,6 @@ do_blit_readpixels(struct gl_context * ctx,
 
DBG("%s\n", __FUNCTION__);
 
-   if (!src)
-  return false;
-
assert(_mesa_is_bufferobj(pack->BufferObj));
 
struct gl_renderbuffer *rb = ctx->ReadBuffer->_ColorReadBuffer;
@@ -107,13 +103,13 @@ do_blit_readpixels(struct gl_context * ctx,
}
 
int dst_stride = _mesa_image_row_stride(pack, width, format, type);
+   bool dst_flip = false;
+   /* Mesa flips the dst_stride for pack->Invert, but we want our mt to have a
+* normal dst_stride.
+*/
if (pack->Invert) {
-  DBG("%s: MESA_PACK_INVERT not done yet\n", __FUNCTION__);
-  return false;
-   }
-   else {
-  if (_mesa_is_winsys_fbo(ctx->ReadBuffer))
-dst_stride = -dst_stride;
+  dst_stride = -dst_stride;
+  dst_flip = true;
}
 
dst_offset = (GLintptr)pixels;
@@ -131,30 +127,32 @@ do_blit_readpixels(struct gl_context * ctx,
intel_prepare_render(intel);
intel->front_buffer_dirty = dirty;
 
-   all = (width * height * src->cpp == dst->Base.Size &&
+   all = (width * height * irb->mt->cpp == dst->Base.Size &&
  x == 0 && dst_offset == 0);
 
-   dst_x = 0;
-   dst_y = 0;
-
dst_buffer = intel_bufferobj_buffer(intel, dst,
   all ? INTEL_WRITE_FULL :
   INTEL_WRITE_PART);
 
-   if (_mesa_is_winsys_fbo(ctx->ReadBuffer))
-  y = ctx->ReadBuffer->Height - (y + height);
-
-   if (!intelEmitCopyBlit(intel,
- src->cpp,
- src->pitch, src->bo, 0, src->tiling,
- dst_stride, dst_buffer, dst_offset, false,
- x, y,
- dst_x, dst_y,
- width, height,
- GL_COPY)) {
+   struct intel_mipmap_tree *pbo_mt =
+  intel_miptree_create_for_bo(intel,
+  dst_buffer,
+  irb->mt->format,
+  dst_offset,
+  width, height,
+  dst_stride, I915_TILING_NONE);
+
+   if (!intel_miptree_blit(intel,
+   irb->mt, irb->mt_level, irb->mt_layer,
+   x, y, _mesa_is_winsys_fbo(ctx->ReadBuffer),
+   pbo_mt, 0, 0,
+   0, 0, dst_flip,
+   width, height, GL_COPY)) {
   return false;
}
 
+   intel_miptree_release(&pbo_mt);
+
DBG("%s - DONE\n", __FUNCTION__);
 
return true;
-- 
1.8.3.rc0



[Mesa-dev] [PATCH 12/17] intel: Rework intel_miptree_create_for_region() to wrap a BO.

2013-05-24 Thread Eric Anholt
I needed to do this for the PBO blit cases to use intel_miptree_blit().
But this also actually partially fixes a bug in EGLImage handling: We
can't share regions across contexts, because regions have a refcount that
isn't protected by a mutex, and different contexts can be simultaneously
accessed from multiple threads.  Now we just need to get regions out of
__DRIImage.  There was also a missing use of image->offset in the EGLImage
renderbuffer storage code.
---
 src/mesa/drivers/dri/intel/intel_fbo.c | 12 +++--
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 65 --
 src/mesa/drivers/dri/intel/intel_mipmap_tree.h | 14 --
 3 files changed, 67 insertions(+), 24 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c 
b/src/mesa/drivers/dri/intel/intel_fbo.c
index 73ed91d..cbbd31c 100644
--- a/src/mesa/drivers/dri/intel/intel_fbo.c
+++ b/src/mesa/drivers/dri/intel/intel_fbo.c
@@ -293,10 +293,14 @@ intel_image_target_renderbuffer_storage(struct gl_context *ctx,
 
irb = intel_renderbuffer(rb);
intel_miptree_release(&irb->mt);
-   irb->mt = intel_miptree_create_for_region(intel,
- GL_TEXTURE_2D,
- image->format,
- image->region);
+   irb->mt = intel_miptree_create_for_bo(intel,
+ image->region->bo,
+ image->format,
+ image->offset,
+ image->region->width,
+ image->region->height,
+ image->region->pitch,
+ image->region->tiling);
if (!irb->mt)
   return;
 
diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index dd0b9ce..443791c 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -124,8 +124,8 @@ compute_msaa_layout(struct intel_context *intel, gl_format 
format, GLenum target
 
 
 /**
- * @param for_region Indicates that the caller is
- *intel_miptree_create_for_region(). If true, then do not create
+ * @param for_bo Indicates that the caller is
+ *intel_miptree_create_for_bo(). If true, then do not create
  *\c stencil_mt.
  */
 struct intel_mipmap_tree *
@@ -137,7 +137,7 @@ intel_miptree_create_layout(struct intel_context *intel,
 GLuint width0,
 GLuint height0,
 GLuint depth0,
-bool for_region,
+bool for_bo,
 GLuint num_samples)
 {
struct intel_mipmap_tree *mt = calloc(sizeof(*mt), 1);
@@ -250,7 +250,7 @@ intel_miptree_create_layout(struct intel_context *intel,
mt->physical_height0 = height0;
mt->physical_depth0 = depth0;
 
-   if (!for_region &&
+   if (!for_bo &&
_mesa_get_format_base_format(format) == GL_DEPTH_STENCIL &&
(intel->must_use_separate_stencil ||
(intel->has_separate_stencil &&
@@ -485,21 +485,50 @@ intel_miptree_create(struct intel_context *intel,
 }
 
 struct intel_mipmap_tree *
-intel_miptree_create_for_region(struct intel_context *intel,
-   GLenum target,
-   gl_format format,
-   struct intel_region *region)
+intel_miptree_create_for_bo(struct intel_context *intel,
+drm_intel_bo *bo,
+gl_format format,
+uint32_t offset,
+uint32_t width,
+uint32_t height,
+int pitch,
+uint32_t tiling)
 {
struct intel_mipmap_tree *mt;
 
-   mt = intel_miptree_create_layout(intel, target, format,
- 0, 0,
- region->width, region->height, 1,
- true, 0 /* num_samples */);
+   struct intel_region *region = calloc(1, sizeof(*region));
+   if (!region)
+  return NULL;
+
+   /* Nothing will be able to use this miptree with the BO if the offset isn't
+* aligned.
+*/
+   if (tiling != I915_TILING_NONE)
+  assert(offset % 4096 == 0);
+
+   /* miptrees can't handle negative pitch.  If you need flipping of images,
+* that's outside of the scope of the mt.
+*/
+   assert(pitch >= 0);
+
+   mt = intel_miptree_create_layout(intel, GL_TEXTURE_2D, format,
+0, 0,
+width, height, 1,
+true, 0 /* num_samples */);
if (!mt)
   return mt;
 
-   intel_region_reference(&mt->region, region);
+   region->cpp = mt->cpp;
+   region-width = 

[Mesa-dev] [PATCH 08/17] intle: Add an assert for glCopyTexSubImage() being called on MSAA buffers.

2013-05-24 Thread Eric Anholt
This is just in case someone else trips over this due to our weird reuse
of this code in glBlitFramebuffer().
---
 src/mesa/drivers/dri/intel/intel_tex_copy.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/intel/intel_tex_copy.c 
b/src/mesa/drivers/dri/intel/intel_tex_copy.c
index 94e90da..4a13b9a 100644
--- a/src/mesa/drivers/dri/intel/intel_tex_copy.c
+++ b/src/mesa/drivers/dri/intel/intel_tex_copy.c
@@ -62,6 +62,12 @@ intel_copy_texsubimage(struct intel_context *intel,
 
intel_prepare_render(intel);
 
+   /* glCopyTexSubImage() can't be called on multisampled renderbuffers or
+* textures.
+*/
+   assert(!irb->Base.Base.NumSamples);
+   assert(!intelImage->base.Base.NumSamples);
+
if (!intelImage->mt || !irb || !irb->mt) {
   if (unlikely(INTEL_DEBUG & DEBUG_PERF))
 fprintf(stderr, "%s fail %p %p (0x%08x)\n",
-- 
1.8.3.rc0



[Mesa-dev] [PATCH 14/17] intel: Rebuild PBO blit glTexImage() on top of miptrees.

2013-05-24 Thread Eric Anholt
This will ensure that we have resolves if we ever extend this to
glTexSubImage(), and fixes missing image start offset handling.

The texture buffer alloc ended up getting moved up, because we want to
look at the format of the image's actual mt to see if we'll end up
blitting the right thing, in the case of packed depth/stencil uploads.

This is the last caller of intelEmitCopyBlit() on a miptree-wrapped BO.
---
 src/mesa/drivers/dri/intel/intel_tex_image.c | 62 ++--
 1 file changed, 32 insertions(+), 30 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_tex_image.c 
b/src/mesa/drivers/dri/intel/intel_tex_image.c
index a3928bb..4ad5ccc 100644
--- a/src/mesa/drivers/dri/intel/intel_tex_image.c
+++ b/src/mesa/drivers/dri/intel/intel_tex_image.c
@@ -6,6 +6,7 @@
 #include "main/bufferobj.h"
 #include "main/context.h"
 #include "main/formats.h"
+#include "main/image.h"
 #include "main/pbo.h"
 #include "main/renderbuffer.h"
 #include "main/texcompress.h"
@@ -117,9 +118,8 @@ try_pbo_upload(struct gl_context *ctx,
struct intel_texture_image *intelImage = intel_texture_image(image);
struct intel_context *intel = intel_context(ctx);
struct intel_buffer_object *pbo = intel_buffer_object(unpack->BufferObj);
-   GLuint src_offset, src_stride;
-   GLuint dst_x, dst_y;
-   drm_intel_bo *dst_buffer, *src_buffer;
+   GLuint src_offset;
+   drm_intel_bo *src_buffer;
 
if (!_mesa_is_bufferobj(unpack->BufferObj))
   return false;
@@ -132,14 +132,6 @@ try_pbo_upload(struct gl_context *ctx,
   return false;
}
 
-   if (!_mesa_format_matches_format_and_type(image->TexFormat,
- format, type, false)) {
-  DBG("%s: format mismatch (upload to %s with format 0x%x, type 0x%x)\n",
- __FUNCTION__, _mesa_get_format_name(image->TexFormat),
- format, type);
-  return false;
-   }
-
ctx->Driver.AllocTextureImageBuffer(ctx, image);
 
if (!intelImage->mt) {
@@ -147,39 +139,49 @@ try_pbo_upload(struct gl_context *ctx,
   return false;
}
 
+   if (!_mesa_format_matches_format_and_type(intelImage->mt->format,
+ format, type, false)) {
+  DBG("%s: format mismatch (upload to %s with format 0x%x, type 0x%x)\n",
+ __FUNCTION__, _mesa_get_format_name(intelImage->mt->format),
+ format, type);
+  return false;
+   }
+
if (image->TexObject->Target == GL_TEXTURE_1D_ARRAY ||
image->TexObject->Target == GL_TEXTURE_2D_ARRAY) {
   DBG("%s: no support for array textures\n", __FUNCTION__);
   return false;
}
 
-   dst_buffer = intelImage->mt->region->bo;
src_buffer = intel_bufferobj_source(intel, pbo, 64, &src_offset);
/* note: potential 64-bit ptr to 32-bit int cast */
src_offset += (GLuint) (unsigned long) pixels;
 
-   if (unpack->RowLength > 0)
-  src_stride = unpack->RowLength;
-   else
-  src_stride = image->Width;
-   src_stride *= intelImage->mt->region->cpp;
-
-   intel_miptree_get_image_offset(intelImage->mt, intelImage->base.Base.Level,
- intelImage->base.Base.Face,
- &dst_x, &dst_y);
-
-   if (!intelEmitCopyBlit(intel,
- intelImage->mt->cpp,
- src_stride, src_buffer,
- src_offset, false,
- intelImage->mt->region->pitch, dst_buffer, 0,
- intelImage->mt->region->tiling,
- 0, 0, dst_x, dst_y, image->Width, image->Height,
- GL_COPY)) {
+   int src_stride =
+  _mesa_image_row_stride(unpack, image->Width, format, type);
+
+   struct intel_mipmap_tree *pbo_mt =
+  intel_miptree_create_for_bo(intel,
+  src_buffer,
+  intelImage->mt->format,
+  src_offset,
+  image->Width, image->Height,
+  src_stride, I915_TILING_NONE);
+   if (!pbo_mt)
+  return false;
+
+   if (!intel_miptree_blit(intel,
+   pbo_mt, 0, 0,
+   0, 0, false,
+   intelImage->mt, image->Level, image->Face,
+   0, 0, false,
+   image->Width, image->Height, GL_COPY)) {
   DBG("%s: blit failed\n", __FUNCTION__);
   return false;
}
 
+   intel_miptree_release(&pbo_mt);
+
DBG("%s: success\n", __FUNCTION__);
return true;
 }
-- 
1.8.3.rc0



[Mesa-dev] [PATCH 11/17] intel: Make a temporary miptree for the blit path of miptree mapping.

2013-05-24 Thread Eric Anholt
In a bit of debug code, we no longer have the inter-slice x/y to print.
But I think the level/slice is more useful in this case for looking at
what's getting mapped, especially given that INTEL_DEBUG=blit will tell
you the other value.
---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 99 +++---
 src/mesa/drivers/dri/intel/intel_mipmap_tree.h |  4 +-
 2 files changed, 29 insertions(+), 74 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index d41fbdf..dd0b9ce 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -1456,57 +1456,39 @@ intel_miptree_map_blit(struct intel_context *intel,
   struct intel_miptree_map *map,
   unsigned int level, unsigned int slice)
 {
-   unsigned int image_x, image_y;
-   int x = map->x;
-   int y = map->y;
-   int ret;
-
-   /* The blitter requires the pitch to be aligned to 4. */
-   map->stride = ALIGN(map->w * mt->region->cpp, 4);
-
-   map->bo = drm_intel_bo_alloc(intel->bufmgr, "intel_miptree_map_blit() temp",
-   map->stride * map->h, 4096);
-   if (!map->bo) {
+   map->mt = intel_miptree_create(intel, GL_TEXTURE_2D, mt->format,
+  0, 0,
+  map->w, map->h, 1,
+  false, 0,
+  (1 << I915_TILING_NONE));
+   if (!map->mt) {
   fprintf(stderr, "Failed to allocate blit temporary\n");
   goto fail;
}
+   map->stride = map->mt->region->pitch;
 
-   intel_miptree_get_image_offset(mt, level, slice, &image_x, &image_y);
-   x += image_x;
-   y += image_y;
-
-   if (!intelEmitCopyBlit(intel,
- mt->region->cpp,
- mt->region->pitch, mt->region->bo,
- mt->offset, mt->region->tiling,
- map->stride, map->bo,
- 0, I915_TILING_NONE,
- x, y,
- 0, 0,
- map->w, map->h,
- GL_COPY)) {
+   if (!intel_miptree_blit(intel,
+   mt, level, slice,
+   map->x, map->y, false,
+   map->mt, 0, 0,
+   0, 0, false,
+   map->w, map->h, GL_COPY)) {
   fprintf(stderr, "Failed to blit\n");
   goto fail;
}
 
intel_batchbuffer_flush(intel);
-   ret = drm_intel_bo_map(map->bo, (map->mode & GL_MAP_WRITE_BIT) != 0);
-   if (ret) {
-  fprintf(stderr, "Failed to map blit temporary\n");
-  goto fail;
-   }
-
-   map->ptr = map->bo->virtual;
+   map->ptr = intel_miptree_map_raw(intel, map->mt);
 
DBG("%s: %d,%d %dx%d from mt %p (%s) %d,%d = %p/%d\n", __FUNCTION__,
map->x, map->y, map->w, map->h,
mt, _mesa_get_format_name(mt->format),
-   x, y, map->ptr, map->stride);
+   level, slice, map->ptr, map->stride);
 
return;
 
 fail:
-   drm_intel_bo_unreference(map->bo);
+   intel_miptree_release(&map->mt);
map->ptr = NULL;
map->stride = 0;
 }
@@ -1519,30 +1501,20 @@ intel_miptree_unmap_blit(struct intel_context *intel,
 unsigned int slice)
 {
struct gl_context *ctx = &intel->ctx;
-   drm_intel_bo_unmap(map->bo);
 
-   if (map->mode & GL_MAP_WRITE_BIT) {
-  unsigned int image_x, image_y;
-  int x = map->x;
-  int y = map->y;
-  intel_miptree_get_image_offset(mt, level, slice, &image_x, &image_y);
-  x += image_x;
-  y += image_y;
+   intel_miptree_unmap_raw(intel, map->mt);
 
-  bool ok = intelEmitCopyBlit(intel,
-  mt->region->cpp,
-  map->stride, map->bo,
-  0, I915_TILING_NONE,
-  mt->region->pitch, mt->region->bo,
-  mt->offset, mt->region->tiling,
-  0, 0,
-  x, y,
-  map->w, map->h,
-  GL_COPY);
+   if (map->mode & GL_MAP_WRITE_BIT) {
+  bool ok = intel_miptree_blit(intel,
+   map->mt, 0, 0,
+   0, 0, false,
+   mt, level, slice,
+   map->x, map->y, false,
+   map->w, map->h, GL_COPY);
   WARN_ONCE(!ok, "Failed to blit from linear temporary mapping");
}
 
-   drm_intel_bo_unreference(map->bo);
+   intel_miptree_release(&map->mt);
 }
 
 static void
@@ -1896,24 +1868,7 @@ intel_miptree_map_singlesample(struct intel_context 
*intel,
} else if (mt->stencil_mt && !(mode & BRW_MAP_DIRECT_BIT)) {
   intel_miptree_map_depthstencil(intel, mt, map, level, slice);
}
-   /* According to the Ivy Bridge PRM, Vol1 Part4, section 1.2.1.2 (Graphics
-* Data Size 

[Mesa-dev] [PATCH 17/17] intel: Remove dead intel_drawbuf_region().

2013-05-24 Thread Eric Anholt
Since the glBitmap() MRT change, it's unused.  There was basically no way
to responsibly use this function since MRT was introduced.
---
 src/mesa/drivers/dri/intel/intel_buffers.c | 14 --
 src/mesa/drivers/dri/intel/intel_buffers.h |  2 --
 2 files changed, 16 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_buffers.c 
b/src/mesa/drivers/dri/intel/intel_buffers.c
index 9a9a259..fdad480 100644
--- a/src/mesa/drivers/dri/intel/intel_buffers.c
+++ b/src/mesa/drivers/dri/intel/intel_buffers.c
@@ -35,20 +35,6 @@
 #include "main/renderbuffer.h"
 
 /**
- * Return pointer to current color drawing region, or NULL.
- */
-struct intel_region *
-intel_drawbuf_region(struct intel_context *intel)
-{
-   struct intel_renderbuffer *irbColor =
-  intel_renderbuffer(intel->ctx.DrawBuffer->_ColorDrawBuffers[0]);
-   if (irbColor && irbColor->mt)
-  return irbColor->mt->region;
-   else
-  return NULL;
-}
-
-/**
  * Return pointer to current color reading region, or NULL.
  */
 struct intel_region *
diff --git a/src/mesa/drivers/dri/intel/intel_buffers.h 
b/src/mesa/drivers/dri/intel/intel_buffers.h
index e68cc67..4e3d130 100644
--- a/src/mesa/drivers/dri/intel/intel_buffers.h
+++ b/src/mesa/drivers/dri/intel/intel_buffers.h
@@ -38,8 +38,6 @@ struct intel_framebuffer;
 
 extern struct intel_region *intel_readbuf_region(struct intel_context *intel);
 
-extern struct intel_region *intel_drawbuf_region(struct intel_context *intel);
-
 extern void intel_check_front_buffer_rendering(struct intel_context *intel);
 
 static inline void
-- 
1.8.3.rc0



[Mesa-dev] [PATCH 15/17] intel: Fix MRT handling of glBitmap().

2013-05-24 Thread Eric Anholt
We'd only hit color buffer 0 even if multiple draw buffers were bound.

NOTE: This is a candidate for the stable branches.
---
 src/mesa/drivers/dri/intel/intel_pixel_bitmap.c | 23 ++-
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_pixel_bitmap.c 
b/src/mesa/drivers/dri/intel/intel_pixel_bitmap.c
index c538a29..e258945 100644
--- a/src/mesa/drivers/dri/intel/intel_pixel_bitmap.c
+++ b/src/mesa/drivers/dri/intel/intel_pixel_bitmap.c
@@ -45,6 +45,7 @@
 #include "intel_context.h"
 #include "intel_batchbuffer.h"
 #include "intel_blit.h"
+#include "intel_fbo.h"
 #include "intel_regions.h"
 #include "intel_buffers.h"
 #include "intel_pixel.h"
@@ -176,8 +177,8 @@ do_blit_bitmap( struct gl_context *ctx,
const GLubyte *bitmap )
 {
struct intel_context *intel = intel_context(ctx);
-   struct intel_region *dst;
struct gl_framebuffer *fb = ctx->DrawBuffer;
+   struct intel_renderbuffer *irb;
GLfloat tmpColor[4];
GLubyte ubcolor[4];
GLuint color;
@@ -200,10 +201,14 @@ do_blit_bitmap( struct gl_context *ctx,
}
 
intel_prepare_render(intel);
-   dst = intel_drawbuf_region(intel);
 
-   if (!dst)
-   return false;
+   if (fb->_NumColorDrawBuffers != 1) {
+  perf_debug("accelerated glBitmap() only supports rendering to a "
+ "single color buffer\n");
+  return false;
+   }
+
+   irb = intel_renderbuffer(fb->_ColorDrawBuffers[0]);
 
if (_mesa_is_bufferobj(unpack->BufferObj)) {
   bitmap = map_pbo(ctx, width, height, unpack, bitmap);
@@ -222,7 +227,7 @@ do_blit_bitmap( struct gl_context *ctx,
UNCLAMPED_FLOAT_TO_UBYTE(ubcolor[2], tmpColor[2]);
UNCLAMPED_FLOAT_TO_UBYTE(ubcolor[3], tmpColor[3]);
 
-   if (dst->cpp == 2)
+   if (irb->mt->cpp == 2)
   color = PACK_COLOR_565(ubcolor[0], ubcolor[1], ubcolor[2]);
else
   color = PACK_COLOR_8888(ubcolor[3], ubcolor[0], ubcolor[1], ubcolor[2]);
@@ -271,14 +276,14 @@ do_blit_bitmap( struct gl_context *ctx,
continue;
 
 if (!intelEmitImmediateColorExpandBlit(intel,
-   dst->cpp,
+   irb->mt->cpp,
(GLubyte *)stipple,
sz,
color,
-   dst->pitch,
-   dst->bo,
+   irb->mt->region->pitch,
+   irb->mt->region->bo,
0,
-   dst->tiling,
+   irb->mt->region->tiling,
dstx + px,
dsty + py,
w, h,
-- 
1.8.3.rc0



[Mesa-dev] [PATCH 16/17] intel: Fix format handling of blit glBitmap()

2013-05-24 Thread Eric Anholt
Any 32-bit format got ARGB handling (including, say, GL_RG1616), and
anything else got 16-bit (including, say, GL_R8), which could potentially
hang the GPU by writing out of bounds.

NOTE: This is a candidate for the stable branches.
---
 src/mesa/drivers/dri/intel/intel_pixel_bitmap.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_pixel_bitmap.c 
b/src/mesa/drivers/dri/intel/intel_pixel_bitmap.c
index e258945..c82253a 100644
--- a/src/mesa/drivers/dri/intel/intel_pixel_bitmap.c
+++ b/src/mesa/drivers/dri/intel/intel_pixel_bitmap.c
@@ -227,10 +227,19 @@ do_blit_bitmap( struct gl_context *ctx,
UNCLAMPED_FLOAT_TO_UBYTE(ubcolor[2], tmpColor[2]);
UNCLAMPED_FLOAT_TO_UBYTE(ubcolor[3], tmpColor[3]);
 
-   if (irb->mt->cpp == 2)
-  color = PACK_COLOR_565(ubcolor[0], ubcolor[1], ubcolor[2]);
-   else
+   switch (irb->mt->format) {
+   case MESA_FORMAT_ARGB8888:
+   case MESA_FORMAT_XRGB8888:
   color = PACK_COLOR_8888(ubcolor[3], ubcolor[0], ubcolor[1], ubcolor[2]);
+  break;
+   case MESA_FORMAT_RGB565:
+  color = PACK_COLOR_565(ubcolor[0], ubcolor[1], ubcolor[2]);
+  break;
+   default:
+  perf_debug("Unsupported format %s in accelerated glBitmap()\n",
+ _mesa_get_format_name(irb->mt->format));
+  return false;
+   }
 
if (!intel_check_blit_fragment_ops(ctx, tmpColor[3] == 1.0F))
   return false;
-- 
1.8.3.rc0
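
As an illustration of why the packing has to follow the renderbuffer format rather than only its bytes per pixel, here is a minimal standalone sketch of the two packings this patch selects between. The helpers sketch_pack_8888() and sketch_pack_565() are hypothetical names, not Mesa's macros, and the bit layouts are assumed to be the conventional 32-bit ARGB and 16-bit RGB565 ones.

```c
#include <stdint.h>

/* Hypothetical helper (not a Mesa macro): pack four 8-bit channels into a
 * 32-bit value laid out as A in bits 31..24, R in 23..16, G in 15..8 and
 * B in 7..0. */
static inline uint32_t
sketch_pack_8888(uint8_t a, uint8_t r, uint8_t g, uint8_t b)
{
   return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
          ((uint32_t)g << 8) | (uint32_t)b;
}

/* Hypothetical helper: pack three 8-bit channels into 16 bits by keeping
 * the top 5 bits of red, top 6 bits of green and top 5 bits of blue. */
static inline uint32_t
sketch_pack_565(uint8_t r, uint8_t g, uint8_t b)
{
   return ((uint32_t)(r & 0xf8) << 8) |
          ((uint32_t)(g & 0xfc) << 3) |
          ((uint32_t)b >> 3);
}
```

A two-byte value packed for RGB565 is only meaningful for a 16-bit destination with that channel layout, which is why a switch on the format with a fallback for everything else is safer than branching on cpp alone.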



Re: [Mesa-dev] [PATCH 0/4] V2 Multiple viewports in Gallium

2013-05-24 Thread Roland Scheidegger
Am 24.05.2013 22:56, schrieb Zack Rusin:
 This series adds support for multiple viewports/scissors
 to gallium and implements it in llvmpipe. All the other
 drivers still support just a single viewport/scissor
 combo and their behavior should be exactly the same as
 it was.
 
 I think this one takes care of all the comments. I think
 it addresses everyones concerns. Please let me know if 
 I missed something.
 
 Zack Rusin (4):
   gallium: Add support for multiple viewports
   draw: implement support for multiple viewports
   llvmpipe: implement support for multiple viewports
   draw: fixup draw_find_shader_output
 
  src/gallium/auxiliary/cso_cache/cso_context.c  |4 +-
  src/gallium/auxiliary/draw/draw_cliptest_tmp.h |   10 +++-
  src/gallium/auxiliary/draw/draw_context.c  |   63 
 +++-
  src/gallium/auxiliary/draw/draw_context.h  |6 +-
  src/gallium/auxiliary/draw/draw_gs.c   |   11 +++-
  src/gallium/auxiliary/draw/draw_gs.h   |1 +
  src/gallium/auxiliary/draw/draw_pipe_clip.c|   11 +++-
  src/gallium/auxiliary/draw/draw_private.h  |8 +--
  .../draw/draw_pt_fetch_shade_pipeline_llvm.c   |4 +-
  src/gallium/auxiliary/draw/draw_vertex.h   |2 +-
  src/gallium/auxiliary/draw/draw_vs.c   |7 ---
  src/gallium/auxiliary/draw/draw_vs_variant.c   |   34 +--
  src/gallium/auxiliary/tgsi/tgsi_scan.c |6 ++
  src/gallium/auxiliary/tgsi/tgsi_scan.h |1 +
  src/gallium/auxiliary/tgsi/tgsi_strings.c  |3 +-
  src/gallium/auxiliary/util/u_blitter.c |8 +--
  src/gallium/auxiliary/vl/vl_compositor.c   |4 +-
  src/gallium/auxiliary/vl/vl_idct.c |4 +-
  src/gallium/auxiliary/vl/vl_matrix_filter.c|2 +-
  src/gallium/auxiliary/vl/vl_mc.c   |2 +-
  src/gallium/auxiliary/vl/vl_median_filter.c|2 +-
  src/gallium/auxiliary/vl/vl_zscan.c|2 +-
  src/gallium/docs/source/context.rst|8 ++-
  src/gallium/drivers/freedreno/freedreno_state.c|   12 ++--
  src/gallium/drivers/galahad/glhd_context.c |   20 ---
  src/gallium/drivers/i915/i915_state.c  |   15 +++--
  src/gallium/drivers/identity/id_context.c  |   22 +++
  src/gallium/drivers/ilo/ilo_state.c|   16 +++--
  src/gallium/drivers/llvmpipe/lp_context.h  |7 ++-
  src/gallium/drivers/llvmpipe/lp_screen.c   |2 +
  src/gallium/drivers/llvmpipe/lp_setup.c|   29 +
  src/gallium/drivers/llvmpipe/lp_setup.h|4 +-
  src/gallium/drivers/llvmpipe/lp_setup_context.h|8 ++-
  src/gallium/drivers/llvmpipe/lp_setup_line.c   |   12 +++-
  src/gallium/drivers/llvmpipe/lp_setup_point.c  |   12 ++--
  src/gallium/drivers/llvmpipe/lp_setup_tri.c|   17 --
  src/gallium/drivers/llvmpipe/lp_state_clip.c   |   25 +---
  src/gallium/drivers/llvmpipe/lp_state_derived.c|   20 +--
  src/gallium/drivers/llvmpipe/lp_surface.c  |4 +-
  src/gallium/drivers/noop/noop_state.c  |   16 +++--
  src/gallium/drivers/nv30/nv30_draw.c   |2 +-
  src/gallium/drivers/nv30/nv30_state.c  |   16 +++--
  src/gallium/drivers/nv50/nv50_state.c  |   16 +++--
  src/gallium/drivers/nvc0/nvc0_state.c  |   16 +++--
  src/gallium/drivers/r300/r300_context.c|2 +-
  src/gallium/drivers/r300/r300_state.c  |   18 +++---
  src/gallium/drivers/r600/evergreen_state.c |6 +-
  src/gallium/drivers/r600/r600_state.c  |8 ++-
  src/gallium/drivers/r600/r600_state_common.c   |   10 ++--
  src/gallium/drivers/radeonsi/si_state.c|   16 +++--
  src/gallium/drivers/rbug/rbug_context.c|   22 +++
  src/gallium/drivers/softpipe/sp_screen.c   |2 +
  src/gallium/drivers/softpipe/sp_state_clip.c   |   19 +++---
  src/gallium/drivers/softpipe/sp_state_derived.c|2 +-
  src/gallium/drivers/svga/svga_pipe_misc.c  |   20 ---
  src/gallium/drivers/svga/svga_swtnl_state.c|2 +-
  src/gallium/drivers/trace/tr_context.c |   32 ++
  src/gallium/include/pipe/p_context.h   |   12 ++--
  src/gallium/include/pipe/p_defines.h   |3 +-
  src/gallium/include/pipe/p_shader_tokens.h |3 +-
  src/gallium/include/pipe/p_state.h |1 +
  src/gallium/tests/graw/fs-test.c   |2 +-
  src/gallium/tests/graw/graw_util.h |2 +-
  src/gallium/tests/graw/gs-test.c   |2 +-
  src/gallium/tests/graw/quad-sample.c   |2 +-
  src/gallium/tests/graw/shader-leak.c   |2 +-
  src/gallium/tests/graw/tri-gs.c|2 +-
  

Re: [Mesa-dev] [PATCH 5/5] i965 gen7: use SURFACE_STATE fields to select render level/layer

2013-05-24 Thread Paul Berry
On 22 May 2013 20:00, Jordan Justen jljus...@gmail.com wrote:

 On Wed, May 22, 2013 at 3:56 PM, Eric Anholt e...@anholt.net wrote:
  Jordan Justen jordan.l.jus...@intel.com writes:
  -   surf[0] = BRW_SURFACE_2D << BRW_SURFACE_TYPE_SHIFT |
  +   switch (gl_target) {
  +   case GL_TEXTURE_CUBE_MAP_ARRAY:
  +   case GL_TEXTURE_CUBE_MAP:
  +  surftype = BRW_SURFACE_2D;
  +  is_array = true;
  +  depth *= 6;
  +  break;
  +   default:
  +  surftype = translate_tex_target(gl_target);
  +  is_array = _mesa_tex_target_is_array(gl_target);
  +  break;
  +   }
 
  Why the conversion of cubes to arrays?  It looks from mentions in the
  render target write message section's mention of RTAI that cubes are
  supported.

 Hmm. Good catch.

 I think I started implementing this in brw_wm_surface_state.c, so I
 was looking at what would be needed for the older gens. It looks like
 pre-gen6, that cube-arrays were not supported in the surface_state.
 I'm not sure right now why I extended that to include converting
 non-array cubes to 2d-arrays as well.

 Anyway, I'll investigate cleaning this up for gen7, since that is what
 we are starting with.


When Jordan was first working on this feature, he asked me to help debug
it, and I found by reading simulator source code that SURFACE_STATE's
minimum array element field is ignored for cube surfaces (in direct
contradiction to the hw docs).  Fortunately, treating the surface as an
array is an effective workaround, since for render targets there is
effectively no difference between a cube map and an ordinary array with a
6x higher depth.
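
To make the workaround concrete, here is a minimal sketch of the slice arithmetic involved when a cube map (or cube map array) is programmed as a plain 2D array. The helper names are hypothetical and are not taken from the driver; the only assumption is the usual GL layer-face ordering, in which face f of cube c lives in array slice 6 * c + f.

```c
#include <stdint.h>

/* Hypothetical helper (not driver code): number of 2D-array slices needed
 * when num_cubes cube maps are exposed as one 2D array surface. This is the
 * "depth *= 6" step from the quoted patch. */
static inline uint32_t
sketch_cube_depth_as_2d_array(uint32_t num_cubes)
{
   return num_cubes * 6;
}

/* Hypothetical helper: array slice that holds face `face` (0..5) of cube
 * number `cube`, assuming layer-face ordering. */
static inline uint32_t
sketch_cube_face_to_slice(uint32_t cube, uint32_t face)
{
   return cube * 6 + face;
}
```

With this mapping the minimum array element can select any face of any cube, so the cube-specific surface type is not needed for render targets.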




 Thanks,

 -Jordan


[Mesa-dev] [PATCH] gallium/docs: adds documentation for multi viewport cap

2013-05-24 Thread Zack Rusin

Signed-off-by: Zack Rusin za...@vmware.com
---
 src/gallium/docs/source/screen.rst |4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index 2630491..0957e56 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -164,6 +164,10 @@ The integer capabilities:
   for permutational swizzles.
 * ``PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE``: The maximum accessible size with
   a buffer sampler view, in bytes.
+* ``PIPE_CAP_MAX_VIEWPORTS``: The maximum number of viewports (and scissors
+  since they are linked) a driver can support. Returning 0 is equivalent
+  to returning 1 because every driver has to support at least a single
+  viewport/scissor combination.
 
 
 .. _pipe_capf:
-- 
1.7.10.4
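
For readers of the archive, the "0 is equivalent to 1" rule documented above can be sketched as a small normalization step on the consumer side. The helper sketch_effective_max_viewports() is a hypothetical name and takes the already-queried cap value as a plain integer; it does not call any Gallium API.

```c
/* Hypothetical helper (not part of Gallium): cap_value is whatever the
 * driver returned for PIPE_CAP_MAX_VIEWPORTS. Every driver supports at
 * least one viewport/scissor pair, so 0 (or any value below 1) is treated
 * as 1. */
static inline unsigned
sketch_effective_max_viewports(int cap_value)
{
   return cap_value > 1 ? (unsigned)cap_value : 1u;
}
```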


Re: [Mesa-dev] [PATCH 0/4] V2 Multiple viewports in Gallium

2013-05-24 Thread Zack Rusin
 There's no documentation of PIPE_CAP_MAX_VIEWPORTS but otherwise the
 series looks good.

I've just sent another patch with it. Let me know if that's enough.

z


Re: [Mesa-dev] [PATCH V2 4/4] i965: Enable ext_framebuffer_multisample_blit_scaled on intel h/w

2013-05-24 Thread Eric Anholt
Paul Berry stereotype...@gmail.com writes:

 On 16 May 2013 11:44, Anuj Phogat anuj.pho...@gmail.com wrote:

 This patch enables ext_framebuffer_multisample_blit_scaled extension
 on intel h/w >= gen6.

 Note: Patches for piglit tests to verify this functionality are out
 for review on piglit mailing list. Tests pass for all of the scaling
 factors from 0.1 to 2.4.

 Comment from Paul Berry:
 I have some concerns about the image quality of the method you've
 implemented.  As I understand it, the primary use case of this extension
 is to allow the client to do multisampled rendering at slightly less
 than screen resolution (e.g. 720p instead of 1080p), and then blit the
 result to the screen in one step while keeping most of the quality
 benefits of multisampling.  Since your implementation is effectively
 equivalent to downsampling and then blitting using GL_NEAREST filtering,
 my fear is that it will lead to blocky artifacts that are severe enough
 to negate the benefit of multisampling in the first place.

 Before we turn this extension on in the Intel driver, I'd like to look
 at a comparison of:

 (1) your technique
 (2) downsampling followed by scaling with GL_LINEAR filtering
 (3) The nVidia implementation, in GL_SCALED_RESOLVE_FASTEST_EXT mode
 (4) The nVidia implementation, in GL_SCALED_RESOLVE_NICEST_EXT mode
 (5) Just rendering the image directly to the single-sampled destination
 buffer

 Observation: Image quality is better in cases 2, 3, 4 and 5 as
 compared to case 1. Although extension's implementation meets the
 specification's requirements, using it leads to  blocky artifacts
 due to nearest filtering.

 I'll work on implementing a better filtering technique in blorp.


 Thanks for quoting my comment here.  It's good to have context so that we
 can continue the discussion.

 My preference would be to go ahead and land patches 1-3 now, but hold patch
 4 back until we've figured out how to get comparable image quality to the
 nVidia implementation.  It seems like it would be nice to go out of the
 gate with our best looking implementation.

 Does that seem reasonable to other folks?

Yeah, I don't think we should ship a nearest-filtered-only implementation.




Re: [Mesa-dev] [PATCH 00/12] i965/gen7+: Implement fast color clears.

2013-05-24 Thread Eric Anholt
Ian Romanick i...@freedesktop.org writes:

 On 05/21/2013 04:52 PM, Paul Berry wrote:
 This series implements fast color clears, a Gen7+ feature which
 reduces memory bandwidth by deferring the memory writes involved in a
 glClear() until the same memory is later touched during rendering.

  From a broad overview point of view, fast color clears work in a
 similar way to HiZ: an auxiliary MCS buffer keeps track of which
 parts of the buffer have been cleared but haven't yet had the
 necessary memory writes performed.  Whenever a color buffer needs to
 be accessed by the CPU, or by a part of the GPU that is not
 fast-color-aware, we have to perform a resolve operation to force
 any pending memory writes to occur.

 This patch series adopts a slightly different strategy (compared to
 HiZ) for making sure the resolves happen when needed.  Instead of
 modifying each code path that might need to do a resolve so that it
 does one if needed, we create an accessor function that does the
 resolve if needed and then provides the caller with access to the
 miptree's underlying memory region.  This lets us have a lot more
 confidence that we didn't miss any code paths, which is important
 since color buffers are accessed by a large number of code paths.  To
 discourage future maintainers from trying to bypass the accessor
 function, it is inline (so that overhead is negligible), and the field
 it provides access to has been renamed to region_private.
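
The accessor strategy described above can be sketched roughly as follows. This is a standalone illustration under stated assumptions, not the code from the series: every type and function name here (sketch_region, sketch_miptree, sketch_resolve_color, sketch_get_region) is hypothetical, and the resolve is reduced to clearing a flag and bumping a counter.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the underlying memory region. */
struct sketch_region {
   int id;
};

/* Hypothetical stand-in for a miptree. The field is named region_private
 * to discourage direct use; callers are expected to go through the
 * accessor below. */
struct sketch_miptree {
   struct sketch_region *region_private;
   bool needs_resolve;      /* a fast clear is still pending */
   unsigned resolve_count;  /* illustration only: counts resolves done */
};

/* Placeholder for the real resolve, which would force the pending memory
 * writes to happen. */
static void
sketch_resolve_color(struct sketch_miptree *mt)
{
   mt->needs_resolve = false;
   mt->resolve_count++;
}

/* Accessor: resolve if needed, then hand out the region. Being inline, it
 * costs only a flag test when nothing is pending. */
static inline struct sketch_region *
sketch_get_region(struct sketch_miptree *mt)
{
   if (mt->needs_resolve)
      sketch_resolve_color(mt);
   return mt->region_private;
}
```

Because the only way to reach the region is through the accessor, a code path that is not fast-clear-aware cannot observe unresolved memory by accident.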

 Patch 01 ifdefs out some code so that it does not appear in the i915
 (pre-Gen4) driver--this makes it easier to be confident that these
 changes won't regress i915.  Patch 02 introduces the aforementioned
 accessor function.  Patches 03-11 are the guts of the implementation,
 and patch 12 enables the new feature.

 No piglit regressions.  I have additional piglit tests which validate
 specific important corner cases--I hope to get those out to the list
 later this week.

 I sent some comments and review for the tests, and I've sent some other 
 comments about these patches.  My only concern is whether the case of 
 swapping a non-current drawable (that had a fast-clear as the last 
 render) produces the correct result.  In the piglit thread, I suggested 
 adding a test specifically for this case.

 I suspect that if fast-clear fails in that case, then multisampling also 
 fails.  Both can probably be fixed as follow-on work.  Does that seem 
 plausible?

Swapping a non-current drawable doesn't work in direct rendering, at
all, since as far back as I was able to figure out.  I saw no way
forward toward making it possible.  I don't think we should distract
this series with this issue.




Re: [Mesa-dev] [PATCH 0/4] V2 Multiple viewports in Gallium

2013-05-24 Thread Roland Scheidegger
Am 24.05.2013 23:41, schrieb Zack Rusin:
 There's no documentation of PIPE_CAP_MAX_VIEWPORTS but otherwise the
 series looks good.
 
 I've just sent another patch with it. Let me know if that's enough.
 
 z
 

Thanks! That's certainly enough.

Roland


Re: [Mesa-dev] [PATCH 08/17] intle: Add an assert for glCopyTexSubImage() being called on MSAA buffers.

2013-05-24 Thread Ian Romanick

On 05/24/2013 01:56 PM, Eric Anholt wrote:

s/intle/intel/ in the title.


This is just in case someone else trips over this due to our weird reuse
of this code in glBlitFramebuffer().
---
  src/mesa/drivers/dri/intel/intel_tex_copy.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/intel/intel_tex_copy.c 
b/src/mesa/drivers/dri/intel/intel_tex_copy.c
index 94e90da..4a13b9a 100644
--- a/src/mesa/drivers/dri/intel/intel_tex_copy.c
+++ b/src/mesa/drivers/dri/intel/intel_tex_copy.c
@@ -62,6 +62,12 @@ intel_copy_texsubimage(struct intel_context *intel,

 intel_prepare_render(intel);

+   /* glCopyTexSubImage() can't be called on multisampled renderbuffers or
+* textures.
+*/
+   assert(!irb->Base.Base.NumSamples);
+   assert(!intelImage->base.Base.NumSamples);
+
 if (!intelImage->mt || !irb || !irb->mt) {
if (unlikely(INTEL_DEBUG & DEBUG_PERF))
 fprintf(stderr, "%s fail %p %p (0x%08x)\n",





Re: [Mesa-dev] [PATCH 09/17] intel: Extend the force_y_tiling flag to allow forcing no tiling.

2013-05-24 Thread Ian Romanick

On 05/24/2013 01:56 PM, Eric Anholt wrote:

For a blit-uploaded temporary, it's faster on current hardware to memcpy
the data into a linear CPU mapping than to go through the GTT.
---
  src/mesa/drivers/dri/intel/intel_fbo.c  |  2 +-
  src/mesa/drivers/dri/intel/intel_mipmap_tree.c  | 22 +-
  src/mesa/drivers/dri/intel/intel_mipmap_tree.h  |  2 +-
  src/mesa/drivers/dri/intel/intel_tex_image.c|  2 +-
  src/mesa/drivers/dri/intel/intel_tex_validate.c |  2 +-
  5 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c 
b/src/mesa/drivers/dri/intel/intel_fbo.c
index 05ff784..73ed91d 100644
--- a/src/mesa/drivers/dri/intel/intel_fbo.c
+++ b/src/mesa/drivers/dri/intel/intel_fbo.c
@@ -924,7 +924,7 @@ intel_renderbuffer_move_to_temp(struct intel_context *intel,
                                  width, height, depth,
                                  true,
                                  irb->mt->num_samples,
-                                 false /* force_y_tiling */);
+                                 0 /* force_tiling_mask */);

    if (intel->vtbl.is_hiz_depth_format(intel, new_mt->format)) {
       intel_miptree_alloc_hiz(intel, new_mt);
diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
index c3e55f4..d41fbdf 100644
--- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
@@ -265,7 +265,7 @@ intel_miptree_create_layout(struct intel_context *intel,
                                             mt->logical_depth0,
                                             true,
                                             num_samples,
-                                            false /* force_y_tiling */);
+                                            0 /* force_tiling_mask */);
       if (!mt->stencil_mt) {
          intel_miptree_release(&mt);
          return NULL;
@@ -309,7 +309,7 @@ intel_miptree_choose_tiling(struct intel_context *intel,
  gl_format format,
  uint32_t width0,
  uint32_t num_samples,
-bool force_y_tiling,
+int force_tiling_mask,
  struct intel_mipmap_tree *mt)
  {

@@ -320,8 +320,12 @@ intel_miptree_choose_tiling(struct intel_context *intel,
return I915_TILING_NONE;
 }

-   if (force_y_tiling)
-  return I915_TILING_Y;
+   /* Some usages may want only one type of tiling, like depth miptrees (Y
+* tiled), or temporary BOs for uploading data once (linear).  So far the
+* mask only ever has one bit set.
+*/
+   if (force_tiling_mask)
+  return ffs(force_tiling_mask) - 1;

    if (num_samples > 1) {
/* From p82 of the Sandy Bridge PRM, dw3[1] of SURFACE_STATE (Tiled
@@ -375,7 +379,7 @@ intel_miptree_create(struct intel_context *intel,
 GLuint depth0,
 bool expect_accelerated_upload,
   GLuint num_samples,
- bool force_y_tiling)
+ int force_tiling_mask)


unsigned?


  {
 struct intel_mipmap_tree *mt;
 gl_format tex_format = format;
@@ -441,7 +445,7 @@ intel_miptree_create(struct intel_context *intel,
 }

    uint32_t tiling = intel_miptree_choose_tiling(intel, format, width0,
-                                                 num_samples, force_y_tiling,
+                                                 num_samples, force_tiling_mask,
                                                  mt);
    bool y_or_x = tiling == (I915_TILING_Y | I915_TILING_X);

@@ -570,7 +574,7 @@ intel_miptree_create_for_renderbuffer(struct intel_context *intel,

    mt = intel_miptree_create(intel, GL_TEXTURE_2D, format, 0, 0,
                              width, height, depth, true, num_samples,
-                             false /* force_y_tiling */);
+                             0 /* force_tiling_mask */);
 if (!mt)
goto fail;

@@ -1008,7 +1012,7 @@ intel_miptree_alloc_mcs(struct intel_context *intel,
                                     mt->logical_depth0,
                                     true,
                                     0 /* num_samples */,
-                                    true /* force_y_tiling */);
+                                    (1 << I915_TILING_Y));

 /* From the Ivy Bridge PRM, Vol 2 Part 1 p326:
  *
@@ -1089,7 +1093,7 @@ intel_miptree_alloc_hiz(struct intel_context *intel,
                                     mt->logical_depth0,
                                     true,
                                     mt->num_samples,
-                                    false /* force_y_tiling */);
+                                    0 /* force_tiling_mask */);

    if (!mt->hiz_mt)
       return false;
diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.h 
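
[Editor's note] The `ffs(force_tiling_mask) - 1` trick in the hunk above turns a one-bit mask back into a tiling value. Below is a minimal standalone sketch of that mapping, not the driver's code. The enum values are assumed to mirror the kernel's I915_TILING_NONE/X/Y constants (0, 1, 2), and `forced_tiling_from_mask` is a hypothetical name. It takes an unsigned mask, in the spirit of the "unsigned?" review question.

```c
#include <assert.h>
#include <strings.h>   /* ffs() */

/* Assumed to mirror I915_TILING_NONE / _X / _Y from i915_drm.h. */
enum tiling { TILING_NONE = 0, TILING_X = 1, TILING_Y = 2 };

/* Hypothetical helper: returns the forced tiling mode, or -1 if the mask is
 * empty and the caller should fall through to its usual heuristics.
 */
static int
forced_tiling_from_mask(unsigned force_tiling_mask)
{
   if (force_tiling_mask == 0)
      return -1;

   /* The patch comment says the mask only ever has one bit set. */
   assert((force_tiling_mask & (force_tiling_mask - 1)) == 0);

   /* ffs() is 1-based, so subtract 1 to get the bit index. */
   return ffs((int) force_tiling_mask) - 1;
}
```

Because I915_TILING_NONE is 0, the mask for "force linear" is `1 << 0`, which is nonzero. That is what lets a mask express "no tiling" where the old bool could not.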

Re: [Mesa-dev] [PATCH 10/17] intel: Make a temporary miptree when doing blit uploads for glTexSubImage().

2013-05-24 Thread Ian Romanick

On 05/24/2013 01:56 PM, Eric Anholt wrote:

While this is a bit more CPU work, it also is less code to handle this
path, and fixes problems with 32k-pitch textures and missing resolves.
---
  src/mesa/drivers/dri/intel/intel_tex_subimage.c | 62 +++--
  1 file changed, 18 insertions(+), 44 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_tex_subimage.c 
b/src/mesa/drivers/dri/intel/intel_tex_subimage.c
index 42cc739..e436dc1 100644
--- a/src/mesa/drivers/dri/intel/intel_tex_subimage.c
+++ b/src/mesa/drivers/dri/intel/intel_tex_subimage.c
@@ -53,12 +53,6 @@ intel_blit_texsubimage(struct gl_context * ctx,
  {
 struct intel_context *intel = intel_context(ctx);
 struct intel_texture_image *intelImage = intel_texture_image(texImage);
-   GLuint dstRowStride = 0;
-   drm_intel_bo *temp_bo = NULL;
-   unsigned int blit_x = 0, blit_y = 0;
-   unsigned long pitch;
-   uint32_t tiling_mode = I915_TILING_NONE;
-   GLubyte *dstMap;

 /* Try to do a blit upload of the subimage if the texture is
  * currently busy.
@@ -93,57 +87,37 @@ intel_blit_texsubimage(struct gl_context * ctx,
 if (!pixels)
return false;

-   temp_bo = drm_intel_bo_alloc_tiled(intel->bufmgr,
-                                      "subimage blit bo",
-                                      width, height,
-                                      intelImage->mt->cpp,
-                                      &tiling_mode,
-                                      &pitch,
-                                      0);
-   if (temp_bo == NULL) {
-      _mesa_error(ctx, GL_OUT_OF_MEMORY, "intelTexSubImage");
-      return false;
-   }
+   struct intel_mipmap_tree *temp_mt =
+      intel_miptree_create(intel, GL_TEXTURE_2D, texImage->TexFormat,
+                           0, 0,
+                           width, height, 1,
+                           false, 0,
+                           (1 << I915_TILING_NONE) /* force_tiling_mask */);


The old code did error checking.  Should we continue to error check 
temp_mt and dst (below)?




-   if (drm_intel_gem_bo_map_gtt(temp_bo)) {
-      _mesa_error(ctx, GL_OUT_OF_MEMORY, "intelTexSubImage");
-      return false;
-   }
-
-   dstMap = temp_bo->virtual;
-   dstRowStride = pitch;
-
-   intel_miptree_get_image_offset(intelImage->mt, texImage->Level,
-                                  intelImage->base.Base.Face,
-                                  &blit_x, &blit_y);
-   blit_x += xoffset;
-   blit_y += yoffset;
+   GLubyte *dst = intel_miptree_map_raw(intel, temp_mt);

    if (!_mesa_texstore(ctx, 2, texImage->_BaseFormat,
                        texImage->TexFormat,
-                       dstRowStride,
-                       &dstMap,
+                       temp_mt->region->pitch,
+                       &dst,
                        width, height, 1,
                        format, type, pixels, packing)) {
       _mesa_error(ctx, GL_OUT_OF_MEMORY, "intelTexSubImage");


Since this code doesn't bail (and never has), we blit garbage into the 
texture, right?



 }

+   intel_miptree_unmap_raw(intel, temp_mt);
+
    bool ret;

-   drm_intel_gem_bo_unmap_gtt(temp_bo);
-
-   ret = intelEmitCopyBlit(intel,
-                           intelImage->mt->cpp,
-                           dstRowStride,
-                           temp_bo, 0, false,
-                           intelImage->mt->region->pitch,
-                           intelImage->mt->region->bo, 0,
-                           intelImage->mt->region->tiling,
-                           0, 0, blit_x, blit_y, width, height,
-                           GL_COPY);
+   ret = intel_miptree_blit(intel,
+                            temp_mt, 0, 0,
+                            0, 0, false,
+                            intelImage->mt, texImage->Level, texImage->Face,
+                            xoffset, yoffset, false,
+                            width, height, GL_COPY);
    assert(ret);

-   drm_intel_bo_unreference(temp_bo);
+   intel_miptree_release(&temp_mt);
    _mesa_unmap_teximage_pbo(ctx, packing);

    return ret;





Re: [Mesa-dev] [PATCH 14/17] intel: Rebuild PBO blit glTexImage() on top of miptrees.

2013-05-24 Thread Ian Romanick

On 05/24/2013 01:56 PM, Eric Anholt wrote:

This will ensure that we have resolves if we ever extend this to
glTexSubImage(), and fixes missing image start offset handling.

The texture buffer alloc ended up getting moved up, because we want to
look at the format of the image's actual mt to see if we'll end up
blitting the right thing, in the case of packed depth/stencil uploads.

This is the last caller of intelEmitCopyBlit() on a miptree-wrapped BO.


It looks like after this the two remaining callers are all in 
intel_blit.c.  Should intelEmitCopyBlit be static?


Looking at what's left, it looks like there should be some more 
refactoring of intelEmitCopyBlit after this commit.  A bunch of the 
checks, etc. in intelEmitCopyBlit are only relevant for one of the 
callers.  That can happen later, if there's value.



---
  src/mesa/drivers/dri/intel/intel_tex_image.c | 62 ++--
  1 file changed, 32 insertions(+), 30 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_tex_image.c 
b/src/mesa/drivers/dri/intel/intel_tex_image.c
index a3928bb..4ad5ccc 100644
--- a/src/mesa/drivers/dri/intel/intel_tex_image.c
+++ b/src/mesa/drivers/dri/intel/intel_tex_image.c
@@ -6,6 +6,7 @@
 #include "main/bufferobj.h"
 #include "main/context.h"
 #include "main/formats.h"
+#include "main/image.h"
 #include "main/pbo.h"
 #include "main/renderbuffer.h"
 #include "main/texcompress.h"
@@ -117,9 +118,8 @@ try_pbo_upload(struct gl_context *ctx,
    struct intel_texture_image *intelImage = intel_texture_image(image);
    struct intel_context *intel = intel_context(ctx);
    struct intel_buffer_object *pbo = intel_buffer_object(unpack->BufferObj);
-   GLuint src_offset, src_stride;
-   GLuint dst_x, dst_y;
-   drm_intel_bo *dst_buffer, *src_buffer;
+   GLuint src_offset;
+   drm_intel_bo *src_buffer;

    if (!_mesa_is_bufferobj(unpack->BufferObj))
       return false;
@@ -132,14 +132,6 @@ try_pbo_upload(struct gl_context *ctx,
return false;
 }

-   if (!_mesa_format_matches_format_and_type(image->TexFormat,
-                                             format, type, false)) {
-      DBG("%s: format mismatch (upload to %s with format 0x%x, type 0x%x)\n",
-          __FUNCTION__, _mesa_get_format_name(image->TexFormat),
-          format, type);
-      return false;
-   }
-
    ctx->Driver.AllocTextureImageBuffer(ctx, image);

    if (!intelImage->mt) {
@@ -147,39 +139,49 @@ try_pbo_upload(struct gl_context *ctx,
       return false;
    }

+   if (!_mesa_format_matches_format_and_type(intelImage->mt->format,
+                                             format, type, false)) {
+      DBG("%s: format mismatch (upload to %s with format 0x%x, type 0x%x)\n",
+          __FUNCTION__, _mesa_get_format_name(intelImage->mt->format),
+          format, type);
+      return false;
+   }
+
    if (image->TexObject->Target == GL_TEXTURE_1D_ARRAY ||
        image->TexObject->Target == GL_TEXTURE_2D_ARRAY) {
       DBG("%s: no support for array textures\n", __FUNCTION__);
       return false;
    }

-   dst_buffer = intelImage->mt->region->bo;
    src_buffer = intel_bufferobj_source(intel, pbo, 64, &src_offset);
    /* note: potential 64-bit ptr to 32-bit int cast */
    src_offset += (GLuint) (unsigned long) pixels;

-   if (unpack->RowLength > 0)
-      src_stride = unpack->RowLength;
-   else
-      src_stride = image->Width;
-   src_stride *= intelImage->mt->region->cpp;
-
-   intel_miptree_get_image_offset(intelImage->mt, intelImage->base.Base.Level,
-                                  intelImage->base.Base.Face,
-                                  &dst_x, &dst_y);
-
-   if (!intelEmitCopyBlit(intel,
-                          intelImage->mt->cpp,
-                          src_stride, src_buffer,
-                          src_offset, false,
-                          intelImage->mt->region->pitch, dst_buffer, 0,
-                          intelImage->mt->region->tiling,
-                          0, 0, dst_x, dst_y, image->Width, image->Height,
-                          GL_COPY)) {
+   int src_stride =
+      _mesa_image_row_stride(unpack, image->Width, format, type);
+
+   struct intel_mipmap_tree *pbo_mt =
+      intel_miptree_create_for_bo(intel,
+                                  src_buffer,
+                                  intelImage->mt->format,
+                                  src_offset,
+                                  image->Width, image->Height,
+                                  src_stride, I915_TILING_NONE);
+   if (!pbo_mt)
+      return false;
+
+   if (!intel_miptree_blit(intel,
+                           pbo_mt, 0, 0,
+                           0, 0, false,
+                           intelImage->mt, image->Level, image->Face,
+                           0, 0, false,
+                           image->Width, image->Height, GL_COPY)) {
       DBG("%s: blit failed\n", __FUNCTION__);
       return false;
    }

+   intel_miptree_release(&pbo_mt);
+
    DBG("%s: success\n", __FUNCTION__);
    return true;
 }




Re: [Mesa-dev] [PATCH 14/17] intel: Rebuild PBO blit glTexImage() on top of miptrees.

2013-05-24 Thread Eric Anholt
Ian Romanick i...@freedesktop.org writes:

 On 05/24/2013 01:56 PM, Eric Anholt wrote:
 This will ensure that we have resolves if we ever extend this to
 glTexSubImage(), and fixes missing image start offset handling.

 The texture buffer alloc ended up getting moved up, because we want to
 look at the format of the image's actual mt to see if we'll end up
 blitting the right thing, in the case of packed depth/stencil uploads.

 This is the last caller of intelEmitCopyBlit() on a miptree-wrapped BO.

 It looks like after this the two remaining callers are all in 
 intel_blit.c.  Should intelEmitCopyBlit be static?

 Looking at what's left, it looks like there should be some more 
 refactoring of intelEmitCopyBlit after this commit.  A bunch of the 
 checks, etc. in intelEmitCopyBlit are only relevant for one of the 
 callers.  That can happen later, if there's value.

I thought about doing so, but the aperture check is painful enough I
decided not to duplicate it.




Re: [Mesa-dev] libclc: vload/vstore initial implementation

2013-05-24 Thread Tom Stellard
On Thu, May 23, 2013 at 07:49:39PM -0500, Aaron Watry wrote:
 I've implemented the OpenCL vload/vstore builtin functions in two parts.
 1) Pure CL C implementation. No Assembly
 2) Add assembly optimizations for 32-bit int/uint loads/stores of 4+ component
vectors
 
 Note: The vstore implementation assumes that the hardware back end supports
 byte-addressable stores.  This may not always be optimal.


Hi Aaron,

I've pushed these to my libclc repo, thanks!

-Tom


[Mesa-dev] [PATCH 1/7] mesa: Add infrastructure for ARB_shading_language_420pack.

2013-05-24 Thread Matt Turner
From: Todd Previte tprev...@gmail.com

v2 [mattst88]
  - Split infrastructure into separate patch.
  - Add preprocessor #define.
---
 src/glsl/glcpp/glcpp-parse.y| 3 +++
 src/glsl/glsl_parser_extras.cpp | 1 +
 src/glsl/glsl_parser_extras.h   | 2 ++
 src/mesa/main/extensions.c  | 1 +
 src/mesa/main/mtypes.h  | 1 +
 5 files changed, 8 insertions(+)

diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
index 81ba04b..2e3e6a8 100644
--- a/src/glsl/glcpp/glcpp-parse.y
+++ b/src/glsl/glcpp/glcpp-parse.y
@@ -1242,6 +1242,9 @@ glcpp_parser_create (const struct gl_extensions *extensions, int api)

           if (extensions->AMD_vertex_shader_layer)
              add_builtin_define(parser, "GL_AMD_vertex_shader_layer", 1);
+
+          if (extensions->ARB_shading_language_420pack)
+             add_builtin_define(parser, "GL_ARB_shading_language_420pack", 1);
   }
}
 
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index c0dd713..d02b308 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -466,6 +466,7 @@ static const _mesa_glsl_extension _mesa_glsl_supported_extensions[] = {
    EXT(OES_standard_derivatives,       false, false, true,  false,  true, OES_standard_derivatives),
    EXT(ARB_texture_cube_map_array,     true,  false, true,  true,  false, ARB_texture_cube_map_array),
    EXT(ARB_shading_language_packing,   true,  false, true,  true,  false, ARB_shading_language_packing),
+   EXT(ARB_shading_language_420pack,   true,  true,  true,  true,  false, ARB_shading_language_420pack),
    EXT(ARB_texture_multisample,        true,  false, true,  true,  false, ARB_texture_multisample),
    EXT(ARB_texture_query_lod,          false, false, true,  true,  false, ARB_texture_query_lod),
    EXT(ARB_gpu_shader5,                true,  true,  true,  true,  false, ARB_gpu_shader5),
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index 16e180d..95918de 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -288,6 +288,8 @@ struct _mesa_glsl_parse_state {
bool ARB_gpu_shader5_warn;
bool AMD_vertex_shader_layer_enable;
bool AMD_vertex_shader_layer_warn;
+   bool ARB_shading_language_420pack_enable;
+   bool ARB_shading_language_420pack_warn;
/*@}*/
 
/** Extensions supported by the OpenGL implementation. */
diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index db5a5ed..32a331b 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -127,6 +127,7 @@ static const struct extension extension_table[] = {
    { "GL_ARB_shader_texture_lod",          o(ARB_shader_texture_lod),          GL,   2009 },
    { "GL_ARB_shading_language_100",        o(ARB_shading_language_100),        GLL,  2003 },
    { "GL_ARB_shading_language_packing",    o(ARB_shading_language_packing),    GL,   2011 },
+   { "GL_ARB_shading_language_420pack",    o(ARB_shading_language_420pack),    GL,   2011 },
    { "GL_ARB_shadow",                      o(ARB_shadow),                      GLL,  2001 },
    { "GL_ARB_sync",                        o(ARB_sync),                        GL,   2003 },
    { "GL_ARB_texture_border_clamp",        o(ARB_texture_border_clamp),        GLL,  2000 },
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index b68853b..597f36f 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2985,6 +2985,7 @@ struct gl_extensions
GLboolean ARB_shader_texture_lod;
GLboolean ARB_shading_language_100;
GLboolean ARB_shading_language_packing;
+   GLboolean ARB_shading_language_420pack;
GLboolean ARB_shadow;
GLboolean ARB_sync;
GLboolean ARB_texture_border_clamp;
-- 
1.8.1.5



[Mesa-dev] [PATCH 2/7] glsl: Allow .length() method on vectors and matrices.

2013-05-24 Thread Matt Turner
Required by ARB_shading_language_420pack.
---
 src/glsl/hir_field_selection.cpp | 58 ++--
 1 file changed, 38 insertions(+), 20 deletions(-)

diff --git a/src/glsl/hir_field_selection.cpp b/src/glsl/hir_field_selection.cpp
index 0035a5f..cc7ba61 100644
--- a/src/glsl/hir_field_selection.cpp
+++ b/src/glsl/hir_field_selection.cpp
@@ -47,20 +47,6 @@ _mesa_ast_field_selection_to_hir(const ast_expression *expr,
YYLTYPE loc = expr-get_location();
if (op-type-is_error()) {
   /* silently propagate the error */
-   } else if (op-type-is_vector()) {
-  ir_swizzle *swiz = ir_swizzle::create(op,
-   expr-primary_expression.identifier,
-   op-type-vector_elements);
-  if (swiz != NULL) {
-result = swiz;
-  } else {
-/* FINISHME: Logging of error messages should be moved into
- * FINISHME: ir_swizzle::create.  This allows the generation of more
- * FINISHME: specific error messages.
- */
-_mesa_glsl_error( loc, state, Invalid swizzle / mask `%s',
- expr-primary_expression.identifier);
-  }
} else if (op-type-base_type == GLSL_TYPE_STRUCT
   || op-type-base_type == GLSL_TYPE_INTERFACE) {
   result = new(ctx) ir_dereference_record(op,
@@ -81,17 +67,49 @@ _mesa_ast_field_selection_to_hir(const ast_expression *expr,
   const char *method;
   method = call-subexpressions[0]-primary_expression.identifier;
 
-      if (op->type->is_array() && strcmp(method, "length") == 0) {
-         if (!call->expressions.is_empty())
-            _mesa_glsl_error(&loc, state, "length method takes no arguments.");
+      if (strcmp(method, "length") == 0) {
+         if (!call->expressions.is_empty())
+            _mesa_glsl_error(&loc, state, "length method takes no arguments.");

-         if (op->type->array_size() == 0)
-            _mesa_glsl_error(&loc, state, "length called on unsized array.");
+         if (op->type->is_array()) {
+            if (op->type->array_size() == 0)
+               _mesa_glsl_error(&loc, state, "length called on unsized array.");

-         result = new(ctx) ir_constant(op->type->array_size());
+            result = new(ctx) ir_constant(op->type->array_size());
+         } else if (op->type->is_vector()) {
+            if (state->ARB_shading_language_420pack_enable) {
+               /* .length() returns int. */
+               result = new(ctx) ir_constant((int) op->type->vector_elements);
+            } else {
+   _mesa_glsl_error(loc, state, length method on matrix only 
available
+ with 
ARB_shading_language_420pack.);
+}
+ } else if (op-type-is_matrix()) {
+if (state-ARB_shading_language_420pack_enable) {
+   /* .length() returns int. */
+   result = new(ctx) ir_constant((int) op-type-matrix_columns);
+} else {
+   _mesa_glsl_error(loc, state, length method on matrix only 
available
+ with 
ARB_shading_language_420pack.);
+}
+ }
   } else {
 _mesa_glsl_error(loc, state, Unknown method: `%s'., method);
   }
+   } else if (op-type-is_vector()) {
+  ir_swizzle *swiz = ir_swizzle::create(op,
+   expr-primary_expression.identifier,
+   op-type-vector_elements);
+  if (swiz != NULL) {
+result = swiz;
+  } else {
+/* FINISHME: Logging of error messages should be moved into
+ * FINISHME: ir_swizzle::create.  This allows the generation of more
+ * FINISHME: specific error messages.
+ */
+_mesa_glsl_error( loc, state, Invalid swizzle / mask `%s',
+ expr-primary_expression.identifier);
+  }
} else {
   _mesa_glsl_error( loc, state, Cannot access field `%s' of 
   non-structure / non-vector.,
-- 
1.8.1.5
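
[Editor's note] The patch makes `.length()` pick its result from the operand type: array size for arrays, component count for vectors, column count for matrices. The latter two are only available with the extension enabled. The following is a rough standalone model of that dispatch, not Mesa's code. `struct type_info` and `length_method_result` are hypothetical stand-ins for glsl_type and the logic in the hunk, and -1 stands in for "error reported or no value produced".

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for a GLSL type description. */
enum type_kind { KIND_ARRAY, KIND_VECTOR, KIND_MATRIX, KIND_OTHER };

struct type_info {
   enum type_kind kind;
   int array_size;       /* KIND_ARRAY: 0 means unsized */
   int vector_elements;  /* KIND_VECTOR: component count */
   int matrix_columns;   /* KIND_MATRIX: column count */
};

/* Returns the value .length() would evaluate to, or -1 where the patch
 * reports an error or produces no value.
 */
static int
length_method_result(const struct type_info *t, int have_420pack)
{
   switch (t->kind) {
   case KIND_ARRAY:
      /* Arrays never needed the extension; unsized arrays are an error. */
      return t->array_size > 0 ? t->array_size : -1;
   case KIND_VECTOR:
      return have_420pack ? t->vector_elements : -1;
   case KIND_MATRIX:
      return have_420pack ? t->matrix_columns : -1;
   default:
      return -1;
   }
}
```

So under these assumptions a vec3 yields 3 and a mat4 yields 4 with the extension on, and both are rejected with it off.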



[Mesa-dev] [PATCH 3/7] glsl: Allow swizzles on scalars.

2013-05-24 Thread Matt Turner
Required by ARB_shading_language_420pack.
---
 src/glsl/hir_field_selection.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/glsl/hir_field_selection.cpp b/src/glsl/hir_field_selection.cpp
index cc7ba61..ceb0a4c 100644
--- a/src/glsl/hir_field_selection.cpp
+++ b/src/glsl/hir_field_selection.cpp
@@ -96,7 +96,9 @@ _mesa_ast_field_selection_to_hir(const ast_expression *expr,
   } else {
 _mesa_glsl_error(loc, state, Unknown method: `%s'., method);
   }
-   } else if (op->type->is_vector()) {
+   } else if (op->type->is_vector() ||
+              (state->ARB_shading_language_420pack_enable &&
+               op->type->is_scalar())) {
       ir_swizzle *swiz = ir_swizzle::create(op,
                                             expr->primary_expression.identifier,
                                             op->type->vector_elements);
-- 
1.8.1.5
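
[Editor's note] With this change a scalar is treated like a one-component vector for swizzling, so `f.x` or `f.xxxx` become legal while `f.y` does not. Here is a small sketch of that validity rule, written against the general GLSL swizzle rules (up to four letters, all from one of the sets xyzw / rgba / stpq). It is not ir_swizzle::create, and `swizzle_is_valid` is a hypothetical name.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if `swizzle` is a valid GLSL swizzle for a value with
 * `components` components (1 for a scalar), else 0.
 */
static int
swizzle_is_valid(const char *swizzle, int components)
{
   static const char *const sets[] = { "xyzw", "rgba", "stpq" };
   size_t len = strlen(swizzle);

   if (len < 1 || len > 4 || components < 1 || components > 4)
      return 0;

   for (size_t s = 0; s < 3; s++) {
      size_t i;
      for (i = 0; i < len; i++) {
         const char *p = strchr(sets[s], swizzle[i]);
         /* Reject letters from another set or past the last component. */
         if (p == NULL || (int) (p - sets[s]) >= components)
            break;
      }
      if (i == len)
         return 1;   /* every letter came from this one set */
   }
   return 0;
}
```

For a scalar (`components == 1`) only the first letter of each set is accepted, which matches passing `vector_elements` of 1 to the swizzle constructor.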



[Mesa-dev] [PATCH 4/7] glsl: Add gl_{Max, Min}ProgramTexelOffset built-in constants.

2013-05-24 Thread Matt Turner
Required by ARB_shading_language_420pack. Note that the 420pack spec
incorrectly specifies their values as (Min, Max) = (-7, 8) when they
should be (-8, 7) as listed in the GLSL 4.30 and ESSL 3.0 specs.
---
 src/glsl/builtin_variables.cpp | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index 4bb361c..f4ac205 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -790,6 +790,13 @@ generate_130_uniforms(exec_list *instructions,
                         state->Const.MaxClipPlanes);
    add_builtin_constant(instructions, symtab, "gl_MaxVaryingComponents",
                         state->Const.MaxVaryingFloats);
+
+   if (state->ARB_shading_language_420pack_enable) {
+      add_builtin_constant(instructions, symtab, "gl_MinProgramTexelOffset",
+                           state->Const.MinProgramTexelOffset);
+      add_builtin_constant(instructions, symtab, "gl_MaxProgramTexelOffset",
+                           state->Const.MaxProgramTexelOffset);
+   }
 }
 
 
-- 
1.8.1.5
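
[Editor's note] The commit message argues for (Min, Max) = (-8, 7) rather than (-7, 8). Note that -8..7 is exactly the range of a 4-bit two's-complement integer. A tiny sketch of a range check using those spec minimums follows. The enum names and `texel_offset_in_range` are hypothetical, and a real implementation reports its own limits through state->Const, which may be wider.

```c
#include <assert.h>

/* Values quoted in the commit message from the GLSL 4.30 and ESSL 3.0
 * specs; an implementation may advertise a wider range.
 */
enum { MIN_PROGRAM_TEXEL_OFFSET = -8, MAX_PROGRAM_TEXEL_OFFSET = 7 };

/* Returns 1 if `offset` is a legal constant texel offset under the
 * limits assumed above.
 */
static int
texel_offset_in_range(int offset)
{
   return offset >= MIN_PROGRAM_TEXEL_OFFSET &&
          offset <= MAX_PROGRAM_TEXEL_OFFSET;
}
```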



[Mesa-dev] [PATCH 5/7] glsl: Allow implicit conversion of return values.

2013-05-24 Thread Matt Turner
Required by ARB_shading_language_420pack.
---
 src/glsl/ast_to_hir.cpp | 31 ++-
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index b206380..6e689b4 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -3358,7 +3358,7 @@ ast_jump_statement::hir(exec_list *instructions,
   assert(state-current_function);
 
   if (opt_return_value) {
-         ir_rvalue *const ret = opt_return_value->hir(instructions, state);
+         ir_rvalue *ret = opt_return_value->hir(instructions, state);

          /* The value of the return type can be NULL if the shader says
           * 'return foo();' and foo() is a function that returns void.
@@ -3370,16 +3370,29 @@ ast_jump_statement::hir(exec_list *instructions,
          const glsl_type *const ret_type =
             (ret == NULL) ? glsl_type::void_type : ret->type;

-         /* Implicit conversions are not allowed for return values. */
-         if (state->current_function->return_type != ret_type) {
+         /* Implicit conversions are not allowed for return values prior to
+          * ARB_shading_language_420pack.
+          */
+         if (state->current_function->return_type != ret_type) {
             YYLTYPE loc = this->get_location();

-            _mesa_glsl_error(& loc, state,
-                             "`return' with wrong type %s, in function `%s' "
-                             "returning %s",
-                             ret_type->name,
-                             state->current_function->function_name(),
-                             state->current_function->return_type->name);
+            if (state->ARB_shading_language_420pack_enable) {
+               if (!apply_implicit_conversion(state->current_function->return_type,
+                                              ret, state)) {
+                  _mesa_glsl_error(& loc, state,
+                                   "Could not implicitly convert return value "
+                                   "to %s, in function `%s'",
+                                   state->current_function->return_type->name,
+                                   state->current_function->function_name());
+               }
+            } else {
+               _mesa_glsl_error(& loc, state,
+                                "`return' with wrong type %s, in function `%s' "
+                                "returning %s",
+                                ret_type->name,
+                                state->current_function->function_name(),
+                                state->current_function->return_type->name);
+            }
          }

          inst = new(ctx) ir_return(ret);
-- 
1.8.1.5
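
[Editor's note] The control flow of this patch is: matching types are always fine; a mismatch is accepted only when 420pack is enabled and an implicit conversion exists. A simplified model of that decision is below. It assumes only the int/uint to float conversion from GLSL 1.20/1.30 (later GLSL versions define more), and the enum and both function names are hypothetical, not Mesa's apply_implicit_conversion().

```c
#include <assert.h>

enum base_type { TYPE_INT, TYPE_UINT, TYPE_FLOAT, TYPE_BOOL };

/* Returns 1 if a value of type `from` may be implicitly converted to `to`
 * under the int/uint -> float rule; identical types need no conversion.
 */
static int
can_implicitly_convert(enum base_type from, enum base_type to)
{
   if (from == to)
      return 1;
   return to == TYPE_FLOAT && (from == TYPE_INT || from == TYPE_UINT);
}

/* Mirrors the patch's control flow: a mismatched return type is only
 * acceptable when 420pack is enabled and the conversion exists.
 */
static int
return_type_ok(enum base_type ret, enum base_type declared, int have_420pack)
{
   if (ret == declared)
      return 1;
   return have_420pack && can_implicitly_convert(ret, declared);
}
```

Under these assumptions `float f() { return 1; }` is accepted only with the extension enabled, and `int f() { return 1.0; }` is rejected either way.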



[Mesa-dev] [PATCH 6/7] glsl: Allow non-constant expression initializers of const-qualified vars.

2013-05-24 Thread Matt Turner
Required by ARB_shading_language_420pack.
---
 src/glsl/ast_to_hir.cpp | 30 +++---
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 6e689b4..6b56e87 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2337,17 +2337,25 @@ process_initializer(ir_variable *var, ast_declaration *decl,

       ir_constant *constant_value = rhs->constant_expression_value();
       if (!constant_value) {
-         _mesa_glsl_error(& initializer_loc, state,
-                          "initializer of %s variable `%s' must be a "
-                          "constant expression",
-                          (type->qualifier.flags.q.constant)
-                          ? "const" : "uniform",
-                          decl->identifier);
-         if (var->type->is_numeric()) {
-            /* Reduce cascading errors. */
-            var->constant_value = ir_constant::zero(state, var->type);
-         }
-      } else {
+         /* If ARB_shading_language_420pack is enabled, initializers of
+          * const-qualified local variables do not have to be constant
+          * expressions. Const-qualified global variables must still be
+          * initialized with constant expressions.
+          */
+         if (!state->ARB_shading_language_420pack_enable
+             || state->current_function == NULL) {
+            _mesa_glsl_error(& initializer_loc, state,
+                             "initializer of %s variable `%s' must be a "
+                             "constant expression",
+                             (type->qualifier.flags.q.constant)
+                             ? "const" : "uniform",
+                             decl->identifier);
+            if (var->type->is_numeric()) {
+               /* Reduce cascading errors. */
+               var->constant_value = ir_constant::zero(state, var->type);
+            }
+         }
+      } else {
          rhs = constant_value;
          var->constant_value = constant_value;
       }
-- 
1.8.1.5
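
[Editor's note] The rule this patch encodes can be reduced to one predicate: a non-constant initializer is an error unless the extension is enabled and the declaration is inside a function. The sketch below restates that condition on its own. `initializer_is_error` and its parameters are hypothetical names, with `in_function` standing in for `state->current_function != NULL`.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when the patch would report
 * "initializer ... must be a constant expression".
 */
static bool
initializer_is_error(bool is_constant_expr, bool have_420pack,
                     bool in_function)
{
   if (is_constant_expr)
      return false;

   /* Non-constant initializers are tolerated only for locals with 420pack. */
   return !have_420pack || !in_function;
}
```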



[Mesa-dev] [PATCH 7/7] glsl: Disallow return with a void argument from void functions.

2013-05-24 Thread Matt Turner
NOTE: This is a candidate for the stable branches.
---
 src/glsl/ast_to_hir.cpp | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 6b56e87..6ee50f5 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -3401,7 +3401,23 @@ ast_jump_statement::hir(exec_list *instructions,
                                state->current_function->function_name(),
                                state->current_function->return_type->name);
             }
-         }
+         } else if (state->current_function->return_type->base_type ==
+                    GLSL_TYPE_VOID) {
+            YYLTYPE loc = this->get_location();
+
+            /* The ARB_shading_language_420pack, GLSL ES 3.0, and GLSL 4.20
+             * specs add a clarification:
+             *
+             *    "A void function can only use return without a return argument, even if
+             *     the return argument has void type. Return statements only accept values:
+             *
+             *         void func1() { }
+             *         void func2() { return func1(); } // illegal return statement"
+             */
+            _mesa_glsl_error(& loc, state,
+                             "void functions can only use `return' without a "
+                             "return argument");
+         }

          inst = new(ctx) ir_return(ret);
       } else {
-- 
1.8.1.5



[Mesa-dev] [PATCH 0/7] Beginnings of ARB_shading_language_420pack

2013-05-24 Thread Matt Turner
I'm on vacation for the next week, so in case anyone else wants to
finish off ARB_shading_language_420pack, here are the tests and
patches I've done so far.

They cover
 - Swizzles on scalars
 - .length() method of matrices and vectors
 - gl_{Max,Min}ProgramTexelOffset built-in constants (needs a piglit test)
 - Implicit conversion of return values
 - Non-constant expression initializers of const variables
 - and a GLSL spec clarification tacked on at the end

Thanks,
Matt


Re: [Mesa-dev] [PATCH 5/5] i965 gen7: use SURFACE_STATE fields to select render level/layer

2013-05-24 Thread Jordan Justen
On Fri, May 24, 2013 at 2:17 PM, Paul Berry stereotype...@gmail.com wrote:
 On 22 May 2013 20:00, Jordan Justen jljus...@gmail.com wrote:
 On Wed, May 22, 2013 at 3:56 PM, Eric Anholt e...@anholt.net wrote:
  Jordan Justen jordan.l.jus...@intel.com writes:
  -   surf[0] = BRW_SURFACE_2D << BRW_SURFACE_TYPE_SHIFT |
  +   switch (gl_target) {
  +   case GL_TEXTURE_CUBE_MAP_ARRAY:
  +   case GL_TEXTURE_CUBE_MAP:
  +  surftype = BRW_SURFACE_2D;
  +  is_array = true;
  +  depth *= 6;
  +  break;
  +   default:
  +  surftype = translate_tex_target(gl_target);
  +  is_array = _mesa_tex_target_is_array(gl_target);
  +  break;
  +   }
 
  Why the conversion of cubes to arrays?  It looks from mentions in the
  render target write message section's mention of RTAI that cubes are
  supported.

 Hmm. Good catch.

 I think I started implementing this in brw_wm_surface_state.c, so I
 was looking at what would be needed for the older gens. It looks like
 pre-gen6, that cube-arrays were not supported in the surface_state.
 I'm not sure right now why I extended that to include converting
 non-array cubes to 2d-arrays as well.

 Anyway, I'll investigate cleaning this up for gen7, since that is what
 we are starting with.

 When Jordan was first working on this feature, he asked me to help debug it,
 and I found by reading simulator source code that SURFACE_STATE's minimum
 array element field is ignored for cube surfaces (in direct contradiction
 to the hw docs). Fortunately, treating the surface as an array is an
 effective workaround, since for render targets there is effectively no
 difference between a cube map and an ordinary array with a 6x higher depth.

I guess I forgot this discussion of ours. I spent some time trying to
get BRW_SURFACE_CUBE working (again, I suppose), with no luck.

Anyway...

Either of you good for an r-b on this second version of patch 5 in
this series then?

-Jordan
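
[Editor's note] The workaround discussed above treats a cube (or cube array) render target as a plain 2D array with six times the depth. A minimal sketch of the arithmetic is below. The helper names are hypothetical, and the layer ordering (six consecutive layers per cube) is the usual GL layer-face convention rather than something stated in this thread.

```c
#include <assert.h>

enum { CUBE_FACES = 6 };

/* Depth of the 2D-array view of a cube or cube-array surface,
 * matching the "depth *= 6" in the quoted hunk.
 */
static unsigned
cube_depth_as_array(unsigned depth)
{
   return depth * CUBE_FACES;
}

/* Array layer holding `face` (0..5) of cube number `cube`. */
static unsigned
cube_face_to_layer(unsigned cube, unsigned face)
{
   assert(face < CUBE_FACES);
   return cube * CUBE_FACES + face;
}
```

With this mapping, selecting a render layer through the array's minimum array element field needs no cube-specific handling.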


Re: [Mesa-dev] [PATCH] i965: Mask the cut index based on the index buffer type in 3DSTATE_VF.

2013-05-24 Thread Kenneth Graunke

On 05/23/2013 03:46 PM, Kenneth Graunke wrote:

According to the documentation: "The Cut Index is compared to the
fetched (and possibly-sign-extended) vertex index, and if these values
are equal, the current primitive topology is terminated.  Note that,
for index buffers < 32bpp, it is possible to set the Cut Index to a
(large) value that will never match a sign-extended vertex index."

This suggests that we should not set the value to 0xFFFFFFFF for
unsigned byte or short index buffers, but rather 0xFF or 0xFFFF.

Fixes sporadic failures in the ES 3 instanced_arrays_primitive_restart
conformance test when run in combination with other tests.  No Piglit
regressions.

Cc: Ian Romanick i...@freedesktop.org
Cc: Paul Berry stereotype...@gmail.com
Signed-off-by: Kenneth Graunke kenn...@whitecape.org


NAK on this patch.  It looks like 0x133700ff is not supposed to match
0xff in GL_UNSIGNED_BYTE mode.  I think I've found a bunch more bugs.
Going to write some tests and new patches...
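
[Editor's note] The problem behind the NAK can be shown with plain integer arithmetic. Masking the restart index down to the index width makes 0x133700ff collide with the byte index 0xff, which is the false match described above. The sketch below contrasts the masking approach with one possible alternative, a check for whether the restart index fits in the index type at all. All three function names are hypothetical, and nothing here says how the hardware field should actually be programmed.

```c
#include <assert.h>
#include <stdint.h>

/* All-ones mask for an index of `bytes` bytes (1, 2 or 4). */
static uint32_t
index_mask(unsigned bytes)
{
   return bytes >= 4 ? 0xFFFFFFFFu : (1u << (bytes * 8)) - 1u;
}

/* What the NAKed patch amounts to: truncate the restart index. */
static uint32_t
masked_cut_index(uint32_t restart_index, unsigned bytes)
{
   return restart_index & index_mask(bytes);
}

/* One reading of the NAK: a restart index that does not fit in the
 * index type can never equal a fetched index.
 */
static int
restart_can_match(uint32_t restart_index, unsigned bytes)
{
   return restart_index <= index_mask(bytes);
}
```

Here `masked_cut_index(0x133700ff, 1)` is 0xff, so masking alone would restart on index 0xff, while `restart_can_match(0x133700ff, 1)` is false.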

