[Mesa-dev] [Bug 42645] Graphics issues on BZFlag 2.4.0 with i915g (gallium)

2011-12-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=42645

Stephane Marchesin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #5 from Stephane Marchesin 2011-12-29 22:55:52 PST ---
That's great to hear, I'm going to close this bug then.

Can you please open another bug about the menu with details and pics? I'm not
too sure if it is supposed to be checkered or not :) Please assign it to the
right driver as needed.

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 42645] Graphics issues on BZFlag 2.4.0 with i915g (gallium)

2011-12-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=42645

--- Comment #4 from Christian Inci 2011-12-29 22:08:36 PST ---
I already tested earlier today whether this bug still occurs.
It doesn't. :-)
But another bug still does:
With the classic driver (i915) the main menu is checkered, but not with the
gallium driver (i915g).

Greetings,
Chris



[Mesa-dev] [Bug 44298] SIGSEGV src/mesa/state_tracker/st_cb_blit.c:87

2011-12-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=44298

--- Comment #2 from Stephane Marchesin 2011-12-29 17:29:28 PST ---
I think it's a similar problem to the one discussed in
http://lists.freedesktop.org/archives/mesa-dev/2011-September/011578.html

The bug will affect all gallium drivers, and requires quite a bit of
refactoring. I've been meaning to do it, but don't hold your breath just yet.



[Mesa-dev] [Bug 44298] SIGSEGV src/mesa/state_tracker/st_cb_blit.c:87

2011-12-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=44298

Vinson Lee  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|dri-devel@lists.freedesktop |mesa-dev@lists.freedesktop.
                   |.org                        |org
            Summary|[i915g] SIGSEGV             |SIGSEGV
                   |src/mesa/state_tracker/st_c |src/mesa/state_tracker/st_c
                   |b_blit.c:87                 |b_blit.c:87
          Component|Drivers/Gallium/i915g       |Mesa core

--- Comment #1 from Vinson Lee  2011-12-29 17:26:37 UTC ---
Crash can be reproduced with llvmpipe too.



[Mesa-dev] [Bug 39017] [bisected] Segfault in Gallium drivers in Mupen64Plus

2011-12-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=39017

Peter Ward  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |peteraw...@gmail.com



[Mesa-dev] [Bug 39017] [bisected] Segfault in Gallium drivers in Mupen64Plus

2011-12-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=39017

--- Comment #6 from Peter Ward 2011-12-29 16:33:45 PST ---
Created attachment 54965
  --> https://bugs.freedesktop.org/attachment.cgi?id=54965
gdb session

I'm also getting this crash. I've attached a log from gdb showing the immediate
cause of the crash; let me know if there's any more information I can provide.



Re: [Mesa-dev] [PATCH 20/20] tests/glx: Add unit tests for GLX_ARB_create_context GLX protocol

2011-12-29 Thread Chad Versace
On 12/20/2011 12:31 PM, Ian Romanick wrote:
> From: Ian Romanick 
> 
> This adds a new tests directory at the top-level and some extra build
> infrastructure.  The tests use the Google C++ Testing Framework, and
> they will only be built if configure can detect its availability.  The
> tests are automatically wired-in to run with 'make check'.
> 
> Signed-off-by: Ian Romanick 
> ---
>  Makefile  |2 +-
>  configure.ac  |   20 +-
>  src/glx/Makefile  |5 +-
>  tests/Makefile.am |1 +
>  tests/glx/Makefile.am |   16 +
>  tests/glx/clientinfo_unittest.cpp |  723 
> +
>  tests/glx/create_context_unittest.cpp |  513 +++
>  tests/glx/fake_glx_screen.cpp |   57 +++
>  tests/glx/fake_glx_screen.h   |  149 +++
>  tests/glx/mock_xdisplay.h |   32 ++
>  10 files changed, 1514 insertions(+), 4 deletions(-)
>  create mode 100644 tests/Makefile.am
>  create mode 100644 tests/glx/Makefile.am
>  create mode 100644 tests/glx/clientinfo_unittest.cpp
>  create mode 100644 tests/glx/create_context_unittest.cpp
>  create mode 100644 tests/glx/fake_glx_screen.cpp
>  create mode 100644 tests/glx/fake_glx_screen.h
>  create mode 100644 tests/glx/mock_xdisplay.h

I'm not qualified to review these tests, but I do like the idea of
adding unit tests to Mesa with gtest.

Acked-by: Chad Versace 

- 
Chad Versace
chad.vers...@linux.intel.com


[Mesa-dev] [PATCH 3/3] mesa: Clean up gl_array_object::_MaxElement computation.

2011-12-29 Thread Mathias Fröhlich
Use a bitmask approach to compute gl_array_object::_MaxElement.
To make this work reliably depending on the shader type actually used,
make use of the newly introduced typed bitmask getters.
With this change I gain about 5% draw time on some osgviewer examples.

Signed-off-by: Mathias Fröhlich 
---
 src/mesa/main/arrayobj.c |   41 +++--
 src/mesa/main/state.c|  116 +
 2 files changed, 30 insertions(+), 127 deletions(-)

diff --git a/src/mesa/main/arrayobj.c b/src/mesa/main/arrayobj.c
index 4b3e07b..c2e978f 100644
--- a/src/mesa/main/arrayobj.c
+++ b/src/mesa/main/arrayobj.c
@@ -280,15 +280,26 @@ remove_array_object( struct gl_context *ctx, struct gl_array_object *obj )
 
 
 /**
- * Helper for update_arrays().
- * \return  min(current min, array->_MaxElement).
+ * Helper for _mesa_update_array_object_max_element().
+ * \return  min(arrayObj->VertexAttrib[*]._MaxElement).
  */
 static GLuint
-update_min(GLuint min, struct gl_client_array *array)
+compute_max_element(struct gl_array_object *arrayObj, GLbitfield64 enabled)
 {
-   assert(array->Enabled);
-   _mesa_update_array_max_element(array);
-   return MIN2(min, array->_MaxElement);
+   GLuint min = ~((GLuint)0);
+   
+   while (enabled) {
+  struct gl_client_array *client_array;
+  GLint attrib = _mesa_ffsll(enabled) - 1;
+  enabled ^= BITFIELD64_BIT(attrib);
+  
+  client_array = &arrayObj->VertexAttrib[attrib];
+  assert(client_array->Enabled);
+  _mesa_update_array_max_element(client_array);
+  min = MIN2(min, client_array->_MaxElement);
+   }
+   
+   return min;
 }
 
 
@@ -299,17 +310,19 @@ void
 _mesa_update_array_object_max_element(struct gl_context *ctx,
   struct gl_array_object *arrayObj)
 {
-   GLbitfield64 enabled = arrayObj->_Enabled;
-   GLuint min = ~0u;
-
-   while (enabled) {
-  GLint attrib = _mesa_ffsll(enabled) - 1;
-  enabled &= ~BITFIELD64_BIT(attrib);
-  min = update_min(min, &arrayObj->VertexAttrib[attrib]);
+   GLbitfield64 enabled;
+
+   if (!ctx->VertexProgram._Current ||
+   ctx->VertexProgram._Current == ctx->VertexProgram._TnlProgram) {
+  enabled = _mesa_array_object_get_enabled_ff(arrayObj);
+   } else if (ctx->VertexProgram._Current->IsNVProgram) {
+  enabled = _mesa_array_object_get_enabled_nv(arrayObj);
+   } else {
+  enabled = _mesa_array_object_get_enabled_arb(arrayObj);
}
 
/* _MaxElement is one past the last legal array element */
-   arrayObj->_MaxElement = min;
+   arrayObj->_MaxElement = compute_max_element(arrayObj, enabled);
 }
 
 
diff --git a/src/mesa/main/state.c b/src/mesa/main/state.c
index 7e43563..5e37830 100644
--- a/src/mesa/main/state.c
+++ b/src/mesa/main/state.c
@@ -62,126 +62,16 @@ update_separate_specular(struct gl_context *ctx)
 
 
 /**
- * Helper for update_arrays().
- * \return  min(current min, array->_MaxElement).
- */
-static GLuint
-update_min(GLuint min, struct gl_client_array *array)
-{
-   _mesa_update_array_max_element(array);
-   return MIN2(min, array->_MaxElement);
-}
-
-
-/**
  * Update ctx->Array._MaxElement (the max legal index into all enabled arrays).
- * Need to do this upon new array state or new buffer object state.
+ * Need to do this upon new array state, new buffer object state
+ * or a new shader.
  */
 static void
 update_arrays( struct gl_context *ctx )
 {
struct gl_array_object *arrayObj = ctx->Array.ArrayObj;
-   GLuint i, min = ~0;
-
-   /* find min of _MaxElement values for all enabled arrays.
-* Note that the generic arrays always take precedence over
-* the legacy arrays.
-*/
-
-   /* 0 */
-   if (ctx->VertexProgram._Current
-   && arrayObj->VertexAttrib[VERT_ATTRIB_GENERIC0].Enabled) {
-  min = update_min(min, &arrayObj->VertexAttrib[VERT_ATTRIB_GENERIC0]);
-   }
-   else if (arrayObj->VertexAttrib[VERT_ATTRIB_POS].Enabled) {
-  min = update_min(min, &arrayObj->VertexAttrib[VERT_ATTRIB_POS]);
-   }
-
-   /* 1 */
-   if (ctx->VertexProgram._Enabled
-   && arrayObj->VertexAttrib[VERT_ATTRIB_GENERIC1].Enabled) {
-  min = update_min(min, &arrayObj->VertexAttrib[VERT_ATTRIB_GENERIC1]);
-   }
-   /* no conventional vertex weight array */
-
-   /* 2 */
-   if (ctx->VertexProgram._Enabled
-   && arrayObj->VertexAttrib[VERT_ATTRIB_GENERIC2].Enabled) {
-  min = update_min(min, &arrayObj->VertexAttrib[VERT_ATTRIB_GENERIC2]);
-   }
-   else if (arrayObj->VertexAttrib[VERT_ATTRIB_NORMAL].Enabled) {
-  min = update_min(min, &arrayObj->VertexAttrib[VERT_ATTRIB_NORMAL]);
-   }
-
-   /* 3 */
-   if (ctx->VertexProgram._Enabled
-   && arrayObj->VertexAttrib[VERT_ATTRIB_GENERIC3].Enabled) {
-  min = update_min(min, &arrayObj->VertexAttrib[VERT_ATTRIB_GENERIC3]);
-   }
-   else if (arrayObj->VertexAttrib[VERT_ATTRIB_COLOR0].Enabled) {
-  min = update_min(min, &arrayObj->VertexAttrib[VERT_ATTRIB_COLOR0]);
-   }
-
-   /* 4 */
-   if (ctx->VertexProgram._Enabled
-

[Mesa-dev] [PATCH 2/3] mesa: Introduce enabled bitfield helper functions.

2011-12-29 Thread Mathias Froehlich
Depending on the installed shader type, different arrays are used
from gl_array_object. Provide helper functions that compute
the bitmask of these arrays that are finally enabled for a given
shader type. They will be used in a follow-up change.

Signed-off-by: Mathias Fröhlich 
---
 src/mesa/main/arrayobj.h |   38 ++
 src/mesa/main/mtypes.h   |4 
 2 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/src/mesa/main/arrayobj.h b/src/mesa/main/arrayobj.h
index 0b5a013..938fa1e 100644
--- a/src/mesa/main/arrayobj.h
+++ b/src/mesa/main/arrayobj.h
@@ -29,6 +29,7 @@
 #define ARRAYOBJ_H
 
 #include "glheader.h"
+#include "mtypes.h"
 
 struct gl_context;
 
@@ -64,6 +65,43 @@ extern void
 _mesa_update_array_object_max_element(struct gl_context *ctx,
   struct gl_array_object *arrayObj);
 
+/** Returns the bitmask of all enabled arrays in fixed function mode.
+ *
+ *  In fixed function mode only the traditional fixed function arrays
+ *  are available.
+ */
+static inline GLbitfield64
+_mesa_array_object_get_enabled_ff(const struct gl_array_object *arrayObj)
+{
+   return arrayObj->_Enabled & VERT_BIT_FF_ALL;
+}
+
+/** Returns the bitmask of all enabled arrays in nv shader mode.
+ *
+ *  In nv shader mode the generic nv arrays supersede the traditional
+ *  fixed function arrays. The nv generic arrays take precedence
+ *  over the legacy arrays.
+ */
+static inline GLbitfield64
+_mesa_array_object_get_enabled_nv(const struct gl_array_object *arrayObj)
+{
+   GLbitfield64 enabled = arrayObj->_Enabled;
+   return enabled & ~(VERT_BIT_FF_NVALIAS & (enabled >> VERT_ATTRIB_GENERIC0));
+}
+
+/** Returns the bitmask of all enabled arrays in arb/glsl shader mode.
+ *
+ *  In arb/glsl shader mode all the fixed function and the arb/glsl generic
+ *  arrays are available. Here only the first generic array takes
+ *  precedence over the legacy position array.
+ */
+static inline GLbitfield64
+_mesa_array_object_get_enabled_arb(const struct gl_array_object *arrayObj)
+{
+   GLbitfield64 enabled = arrayObj->_Enabled;
+   return enabled & ~(VERT_BIT_POS & (enabled >> VERT_ATTRIB_GENERIC0));
+}
+
 
 /*
  * API functions
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 451d442..3bef71d 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -206,9 +206,13 @@ typedef enum
 #define VERT_BIT_TEX(i)  VERT_BIT(VERT_ATTRIB_TEX(i))
 #define VERT_BIT_TEX_ALL \
BITFIELD64_RANGE(VERT_ATTRIB_TEX(0), VERT_ATTRIB_TEX_MAX)
+#define VERT_BIT_FF_NVALIAS  \
+   BITFIELD64_RANGE(VERT_ATTRIB_POS, VERT_ATTRIB_GENERIC_NV_MAX)
+
 #define VERT_BIT_GENERIC_NV(i)   VERT_BIT(VERT_ATTRIB_GENERIC_NV(i))
 #define VERT_BIT_GENERIC_NV_ALL  \
BITFIELD64_RANGE(VERT_ATTRIB_GENERIC_NV(0), VERT_ATTRIB_GENERIC_NV_MAX)
+
 #define VERT_BIT_GENERIC(i)  VERT_BIT(VERT_ATTRIB_GENERIC(i))
 #define VERT_BIT_GENERIC_ALL \
BITFIELD64_RANGE(VERT_ATTRIB_GENERIC(0), VERT_ATTRIB_GENERIC_MAX)
-- 
1.7.4.4
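
The masking expression used by the two shader-mode helpers can be tried outside of Mesa. The following is only a sketch under stated assumptions: the bit layout (legacy attributes in bits 0..15, generic attributes starting at bit 16) and the names GENERIC0_SHIFT, LEGACY_BIT, GENERIC_BIT, ALIAS_ALL_LEGACY and drop_aliased_legacy are invented for illustration and are not taken from mtypes.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout for this sketch only: legacy attributes occupy
 * bits 0..15 and generic attribute i sits at bit 16 + i. */
#define GENERIC0_SHIFT    16
#define LEGACY_BIT(i)     ((uint64_t)1 << (i))
#define GENERIC_BIT(i)    ((uint64_t)1 << (GENERIC0_SHIFT + (i)))
#define ALIAS_ALL_LEGACY  (((uint64_t)1 << GENERIC0_SHIFT) - 1)

/* Clear every legacy bit that is shadowed by an enabled generic array.
 * alias_mask selects which legacy bits are allowed to be shadowed. */
static uint64_t
drop_aliased_legacy(uint64_t enabled, uint64_t alias_mask)
{
   /* Shifting right lines generic bit i up with legacy bit i. */
   uint64_t shadowed = alias_mask & (enabled >> GENERIC0_SHIFT);

   assert((alias_mask & ~ALIAS_ALL_LEGACY) == 0);
   return enabled & ~shadowed;
}
```

With alias_mask set to the position bit alone, only generic array 0 can hide a legacy array, which mirrors the arb/glsl rule described in the patch. With alias_mask covering all legacy bits, every enabled generic array hides its legacy counterpart, which mirrors the nv rule.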



[Mesa-dev] [PATCH 1/3] mesa: Use BITFIELD64_RANGE for VERT_BIT_*_ALL.

2011-12-29 Thread Mathias Fröhlich
Signed-off-by: Mathias Fröhlich 
---
 src/mesa/main/mtypes.h |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 107371e..451d442 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -199,19 +199,19 @@ typedef enum
 #define VERT_BIT_GENERIC0BITFIELD64_BIT(VERT_ATTRIB_GENERIC0)
 
 #define VERT_BIT(i)  BITFIELD64_BIT(i)
-#define VERT_BIT_ALL (BITFIELD64_BIT(VERT_ATTRIB_MAX) - 1)
+#define VERT_BIT_ALL BITFIELD64_RANGE(0, VERT_ATTRIB_MAX)
 
 #define VERT_BIT_FF(i)   VERT_BIT(i)
-#define VERT_BIT_FF_ALL  (BITFIELD64_BIT(VERT_ATTRIB_FF_MAX) - 1)
+#define VERT_BIT_FF_ALL  BITFIELD64_RANGE(0, VERT_ATTRIB_FF_MAX)
 #define VERT_BIT_TEX(i)  VERT_BIT(VERT_ATTRIB_TEX(i))
 #define VERT_BIT_TEX_ALL \
-  ((BITFIELD64_BIT(VERT_ATTRIB_TEX_MAX) - 1) << VERT_ATTRIB_TEX(0))
+   BITFIELD64_RANGE(VERT_ATTRIB_TEX(0), VERT_ATTRIB_TEX_MAX)
 #define VERT_BIT_GENERIC_NV(i)   VERT_BIT(VERT_ATTRIB_GENERIC_NV(i))
 #define VERT_BIT_GENERIC_NV_ALL  \
-  ((BITFIELD64_BIT(VERT_ATTRIB_GENERIC_NV_MAX) - 1) << (VERT_ATTRIB_GENERIC_NV(0)))
+   BITFIELD64_RANGE(VERT_ATTRIB_GENERIC_NV(0), VERT_ATTRIB_GENERIC_NV_MAX)
 #define VERT_BIT_GENERIC(i)  VERT_BIT(VERT_ATTRIB_GENERIC(i))
 #define VERT_BIT_GENERIC_ALL \
-  ((BITFIELD64_BIT(VERT_ATTRIB_GENERIC_MAX) - 1) << (VERT_ATTRIB_GENERIC(0)))
+   BITFIELD64_RANGE(VERT_ATTRIB_GENERIC(0), VERT_ATTRIB_GENERIC_MAX)
 /*@}*/
 
 
-- 
1.7.4.4
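
The replaced expressions suggest that BITFIELD64_RANGE(b, count) means "count bits starting at bit b". Under that assumption, the equivalence between the old shift-based form and a range helper can be checked with a small standalone sketch. The functions mask64, range64 and range_by_shift below are hypothetical stand-ins written for this note and are not Mesa's macros.

```c
#include <assert.h>
#include <stdint.h>

/* All bits below bit b; b >= 64 is special-cased because shifting a
 * 64-bit value by 64 is undefined in C. */
static uint64_t
mask64(unsigned b)
{
   return b >= 64 ? ~(uint64_t)0 : ((uint64_t)1 << b) - 1;
}

/* count bits starting at bit start (assumed meaning of a range macro). */
static uint64_t
range64(unsigned start, unsigned count)
{
   assert(start + count <= 64);
   return mask64(start + count) & ~mask64(start);
}

/* The older spelling: build count low bits, then shift them into place. */
static uint64_t
range_by_shift(unsigned start, unsigned count)
{
   assert(start < 64 && count < 64);
   return (((uint64_t)1 << count) - 1) << start;
}
```

Both spellings produce the same mask for the ranges used here; the range form states the intent directly and also handles a full 64-bit range, which the shift form cannot express.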


[Mesa-dev] [PATCH 0/3] Array object cleanup.

2011-12-29 Thread Mathias Fröhlich

Hi,

Here is a short series of cleanups following the gl_array_object change.
The most important change is the use of an ffs-based loop to compute
gl_array_object::_MaxElement. This change provides a noticeable performance
improvement for my average workloads.
To implement this, the preceding patch introduces a small set of helper
functions that compute the bitmasks of enabled arrays, taking into account the
aliasing rules for the different arrays.
The first patch just makes use of the recently introduced BITFIELD64_RANGE
macro where it makes sense.
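
For readers following the series, the idea of an ffs-based loop can be sketched in isolation. This is a minimal illustration rather than the patch itself: lowest_set_bit, min_over_enabled and NUM_ATTRIBS are names made up for this sketch, and a portable bit scan stands in for Mesa's _mesa_ffsll so that the example builds with any C compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical number of attribute slots, chosen for this sketch. */
#define NUM_ATTRIBS 33

/* Index of the lowest set bit; mask must be non-zero.  A real
 * implementation would use a find-first-set primitive instead. */
static int
lowest_set_bit(uint64_t mask)
{
   int i = 0;

   assert(mask != 0);
   while (!(mask & 1)) {
      mask >>= 1;
      i++;
   }
   return i;
}

/* Minimum of max_element[i] over every bit i that is set in enabled.
 * Only enabled entries are visited, so sparse masks stay cheap. */
static uint32_t
min_over_enabled(const uint32_t *max_element, uint64_t enabled)
{
   uint32_t min = ~(uint32_t)0;

   while (enabled) {
      int attrib = lowest_set_bit(enabled);

      assert(attrib < NUM_ATTRIBS);
      enabled &= ~((uint64_t)1 << attrib);   /* clear the bit just handled */
      if (max_element[attrib] < min)
         min = max_element[attrib];
   }
   return min;
}
```

The loop runs once per enabled array instead of once per possible attribute slot, which is presumably where the saving comes from when only a few arrays are enabled.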

The complete changeset survives a piglit quick run for r600g/rv770 and swrast 
without regressions.

Please review.
Thanks

Mathias


Mathias Fröhlich (3):
  mesa: Use BITFIELD64_RANGE for VERT_BIT_*_ALL.
  mesa: Introduce enabled bitfield helper functions.
  mesa: Clean up gl_array_object::_MaxElement computation.

 src/mesa/main/arrayobj.c |   41 +++--
 src/mesa/main/arrayobj.h |   38 +++
 src/mesa/main/mtypes.h   |   14 --
 src/mesa/main/state.c|  116 +
 4 files changed, 77 insertions(+), 132 deletions(-)

-- 
1.7.4.4


[Mesa-dev] [Bug 42645] Graphics issues on BZFlag 2.4.0 with i915g (gallium)

2011-12-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=42645

--- Comment #3 from Stephane Marchesin 2011-12-29 12:21:52 PST ---
Can you retry with git mesa? It might be fixed by my recent changes.



Re: [Mesa-dev] [PATCH 2/3] intel: Fix pitch handling for linear blits.

2011-12-29 Thread Chad Versace
On 12/28/2011 11:14 AM, Eric Anholt wrote:
> The new assert in intelEmitCopyBlit() gets angry if we don't align to
> dwords.  Rather than make the assert have a special case for height ==
> 1 on the assumption that the hardware doesn't use it in that case,
> just supply a correct pitch.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43214
> ---
>  src/mesa/drivers/dri/intel/intel_blit.c |8 
>  1 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/intel/intel_blit.c 
> b/src/mesa/drivers/dri/intel/intel_blit.c
> index b1a839a..1369e63 100644
> --- a/src/mesa/drivers/dri/intel/intel_blit.c
> +++ b/src/mesa/drivers/dri/intel/intel_blit.c
> @@ -491,7 +491,7 @@ intel_emit_linear_blit(struct intel_context *intel,
>  * we want width to match pitch. Max width is (1 << 15 - 1),
>  * rounding that down to the nearest DWORD is 1 << 15 - 4
>  */
> -   pitch = MIN2(size, (1 << 15) - 4);
> +   pitch = ROUND_DOWN_TO(MIN2(size, (1 << 15) - 1), 4);
> height = size / pitch;
> ok = intelEmitCopyBlit(intel, 1,
> pitch, src_bo, src_offset, I915_TILING_NONE,
> @@ -506,11 +506,11 @@ intel_emit_linear_blit(struct intel_context *intel,
> dst_offset += pitch * height;
> size -= pitch * height;
> assert (size < (1 << 15));
> -   assert ((size & 3) == 0); /* Pitch must be DWORD aligned */
> +   pitch = ALIGN(size, 4);
> if (size != 0) {
>ok = intelEmitCopyBlit(intel, 1,
> -  size, src_bo, src_offset, I915_TILING_NONE,
> -  size, dst_bo, dst_offset, I915_TILING_NONE,
> +  pitch, src_bo, src_offset, I915_TILING_NONE,
> +  pitch, dst_bo, dst_offset, I915_TILING_NONE,
>0, 0, /* src x/y */
>0, 0, /* dst x/y */
>size, 1, /* w, h */

Reviewed-by: Chad Versace 
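
The pitch arithmetic in the quoted patch can be followed with a small standalone sketch. It only mirrors the arithmetic, not the driver code: struct blit_plan, plan_linear_blit, round_down_to and align_up are hypothetical names, and the guard for a zero pitch (sizes below 4 bytes) is an addition of this sketch, not something the quoted patch contains.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BLIT_WIDTH ((1u << 15) - 1)

static uint32_t
round_down_to(uint32_t v, uint32_t a)
{
   return v - v % a;
}

static uint32_t
align_up(uint32_t v, uint32_t a)
{
   return (v + a - 1) / a * a;
}

struct blit_plan {
   uint32_t pitch, height;          /* first, rectangular blit (height may be 0) */
   uint32_t tail_width, tail_pitch; /* second, single-row blit (width may be 0) */
};

/* Split a linear copy of size bytes into a rectangle plus a one-row tail,
 * keeping both pitches DWORD aligned. */
static struct blit_plan
plan_linear_blit(uint32_t size)
{
   struct blit_plan p;
   uint32_t capped = size < MAX_BLIT_WIDTH ? size : MAX_BLIT_WIDTH;

   p.pitch = round_down_to(capped, 4);
   /* Guard added by this sketch: sizes below 4 round down to pitch 0. */
   p.height = p.pitch ? size / p.pitch : 0;
   p.tail_width = size - p.pitch * p.height;
   /* The tail is copied with its exact width but an aligned pitch. */
   p.tail_pitch = align_up(p.tail_width, 4);

   assert(p.pitch % 4 == 0 && p.tail_pitch % 4 == 0);
   assert(p.tail_width < (1u << 15));
   return p;
}
```

For a 10-byte copy this gives an 8-byte-wide first blit and a 2-byte tail whose pitch is rounded up to 4, which is the case the removed assert used to reject.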

- 
Chad Versace
chad.vers...@linux.intel.com


Re: [Mesa-dev] [PATCH 1/3] intel: Fix bad read/write flags on self-copies for glCopyBufferSubData().

2011-12-29 Thread Chad Versace
On 12/28/2011 11:14 AM, Eric Anholt wrote:
> We didn't consume these flags in any way that would produce a
> functional difference, but we might have some day.
> ---
>  src/mesa/drivers/dri/intel/intel_buffer_objects.c |4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/intel/intel_buffer_objects.c 
> b/src/mesa/drivers/dri/intel/intel_buffer_objects.c
> index 4a1a816..9b1f642 100644
> --- a/src/mesa/drivers/dri/intel/intel_buffer_objects.c
> +++ b/src/mesa/drivers/dri/intel/intel_buffer_objects.c
> @@ -663,7 +663,9 @@ intel_bufferobj_copy_subdata(struct gl_context *ctx,
> */
>if (src == dst) {
>char *ptr = intel_bufferobj_map_range(ctx, 0, dst->Size,
> -GL_MAP_READ_BIT, dst);
> +GL_MAP_READ_BIT |
> +GL_MAP_WRITE_BIT,
> +dst);
>memmove(ptr + write_offset, ptr + read_offset, size);
>intel_bufferobj_unmap(ctx, dst);
>} else {


Reviewed-by: Chad Versace 
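
As a side note on why the self-copy path wants both read and write access: memmove reads from and writes to the same mapping, and the two ranges may overlap. A minimal sketch of that pattern on a plain array follows; copy_within_buffer is a hypothetical helper written for this note and has nothing to do with the driver's mapping functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy size bytes from read_offset to write_offset inside one buffer.
 * memmove is used because the source and destination may overlap. */
static void
copy_within_buffer(char *buf, size_t buf_size,
                   size_t read_offset, size_t write_offset, size_t size)
{
   assert(read_offset + size <= buf_size);
   assert(write_offset + size <= buf_size);
   memmove(buf + write_offset, buf + read_offset, size);
}
```

Copying 4 bytes from offset 0 to offset 2 of "abcdefgh" yields "ababcdgh"; memcpy would not be guaranteed to produce that result because the ranges overlap.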

- 
Chad Versace
chad.vers...@linux.intel.com


[Mesa-dev] [PATCH 1/2] gallium: dereference are now handled by a separate visitor in glsl_to_tgsi

2011-12-29 Thread Vincent Lejeune
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  453 ++--
 1 files changed, 294 insertions(+), 159 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 59ecb52..3523159 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -293,6 +293,45 @@ public:
void push(class variable_storage *);
 };
 
+/**
+ * This visitor will retrieve offset and expression of indirect addressing
+ * from any ir_dereference*
+ */
+class glsl_to_tgsi_dereference_to_address : public ir_visitor {
+public:
+   unsigned offset;
+   struct {
+  unsigned stride;
+  ir_rvalue *expr;
+   } indirect_address_expression[8];
+   unsigned indirect_address_expression_count;
+   variable_store &store;
+   int &next_temp;
+   glsl_to_tgsi_dereference_to_address(variable_store&,int&);
+   void* mem_ctx;
+   gl_register_file file;
+
+   void visit(class ir_dereference_variable *);
+   void visit(class ir_dereference_array *);
+   void visit(class ir_dereference_record *);
+
+   void visit(ir_variable *);
+   void visit(ir_function_signature *);
+   void visit(ir_function *);
+   void visit(ir_expression *);
+   void visit(ir_texture *);
+   void visit(ir_swizzle *);
+   void visit(ir_assignment *);
+   void visit(ir_constant *);
+   void visit(ir_call *);
+   void visit(ir_discard *);
+   void visit(ir_if *);
+   void visit(ir_loop *);
+   void visit(ir_loop_jump *);
+   void visit(ir_return *);
+};
+
+
 class glsl_to_tgsi_visitor : public ir_visitor {
 public:
glsl_to_tgsi_visitor();
@@ -354,6 +393,8 @@ public:
virtual void visit(ir_if *);
/*@}*/
 
+   void handle_dereference(ir_dereference *);
+
st_src_reg result;
 
/** List of variable_storage */
@@ -491,6 +532,211 @@ num_inst_src_regs(unsigned opcode)
return info->is_tex ? info->num_src - 1 : info->num_src;
 }
 
+static int
+type_size(const struct glsl_type *type)
+{
+   unsigned int i;
+   int size;
+
+   switch (type->base_type) {
+   case GLSL_TYPE_UINT:
+   case GLSL_TYPE_INT:
+   case GLSL_TYPE_FLOAT:
+   case GLSL_TYPE_BOOL:
+  if (type->is_matrix()) {
+ return type->matrix_columns;
+  } else {
+ /* Regardless of size of vector, it gets a vec4. This is bad
+  * packing for things like floats, but otherwise arrays become a
+  * mess.  Hopefully a later pass over the code can pack scalars
+  * down if appropriate.
+  */
+ return 1;
+  }
+   case GLSL_TYPE_ARRAY:
+  assert(type->length > 0);
+  return type_size(type->fields.array) * type->length;
+   case GLSL_TYPE_STRUCT:
+  size = 0;
+  for (i = 0; i < type->length; i++) {
+ size += type_size(type->fields.structure[i].type);
+  }
+  return size;
+   case GLSL_TYPE_SAMPLER:
+  /* Samplers take up one slot in UNIFORMS[], but they're baked in
+   * at link time.
+   */
+  return 1;
+   default:
+  assert(0);
+  return 0;
+   }
+}
+
+glsl_to_tgsi_dereference_to_address::glsl_to_tgsi_dereference_to_address(variable_store &s, int &t):indirect_address_expression_count(0),store(s),next_temp(t)
+{
+
+}
+
+void glsl_to_tgsi_dereference_to_address::visit(ir_dereference_variable *ir)
+{
+   variable_storage *entry = store.find_variable_storage(ir->var);
+   ir_variable *var = ir->var;
+
+   if (!entry) {
+  switch (var->mode) {
+  case ir_var_uniform:
+ entry = new(mem_ctx) variable_storage(var, PROGRAM_UNIFORM,
+ var->location);
+ store.push(entry);
+ break;
+  case ir_var_in:
+  case ir_var_inout:
+ /* The linker assigns locations for varyings and attributes,
+  * including deprecated builtins (like gl_Color), user-assign
+  * generic attributes (glBindVertexLocation), and
+  * user-defined varyings.
+  *
+  * FINISHME: We would hit this path for function arguments.  Fix!
+  */
+ assert(var->location != -1);
+ entry = new(mem_ctx) variable_storage(var,
+   PROGRAM_INPUT,
+   var->location);
+ break;
+  case ir_var_out:
+ assert(var->location != -1);
+ entry = new(mem_ctx) variable_storage(var,
+   PROGRAM_OUTPUT,
+   var->location);
+ break;
+  case ir_var_system_value:
+ entry = new(mem_ctx) variable_storage(var,
+   PROGRAM_SYSTEM_VALUE,
+   var->location);
+ break;
+  case ir_var_auto:
+  case ir_var_temporary:
+ entry = new(mem_ctx) variable_storage(var, PROGRAM_TEMPORARY,
+ this->next_temp);
+ store.push(entry);
+
+ n

[Mesa-dev] [PATCH 2/2] gallium: improves glsl_to_tgsi generation of ARL

2011-12-29 Thread Vincent Lejeune
This commit should generate fewer ARL instructions when dealing with indirect
addressing.

v2: fix glsl-fs-uniform-array-4 piglit test
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   56 +---
 1 files changed, 34 insertions(+), 22 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 3523159..0027396 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -482,6 +482,8 @@ static st_src_reg undef_src = st_src_reg(PROGRAM_UNDEFINED, 0, GLSL_TYPE_ERROR);
 static st_dst_reg undef_dst = st_dst_reg(PROGRAM_UNDEFINED, SWIZZLE_NOOP, GLSL_TYPE_ERROR);
 
 static st_dst_reg address_reg = st_dst_reg(PROGRAM_ADDRESS, WRITEMASK_X, GLSL_TYPE_FLOAT);
+static int available_address_writemask = WRITEMASK_X;
+static GLuint available_address_swizzle = SWIZZLE_X;
 
 static void
 fail_link(struct gl_shader_program *prog, const char *fmt, ...) PRINTFLIKE(2, 3);
@@ -743,29 +745,10 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned op,
 st_src_reg src0, st_src_reg src1, st_src_reg src2)
 {
glsl_to_tgsi_instruction *inst = new(mem_ctx) glsl_to_tgsi_instruction();
-   int num_reladdr = 0, i;
+   int i;

op = get_opcode(ir, op, dst, src0, src1);
 
-   /* If we have to do relative addressing, we want to load the ARL
-* reg directly for one of the regs, and preload the other reladdr
-* sources into temps.
-*/
-   num_reladdr += dst.reladdr != NULL;
-   num_reladdr += src0.reladdr != NULL;
-   num_reladdr += src1.reladdr != NULL;
-   num_reladdr += src2.reladdr != NULL;
-
-   reladdr_to_temp(ir, &src2, &num_reladdr);
-   reladdr_to_temp(ir, &src1, &num_reladdr);
-   reladdr_to_temp(ir, &src0, &num_reladdr);
-
-   if (dst.reladdr) {
-  emit_arl(ir, address_reg, *dst.reladdr);
-  num_reladdr--;
-   }
-   assert(num_reladdr == 0);
-
inst->op = op;
inst->dst = dst;
inst->src[0] = src0;
@@ -2108,6 +2091,31 @@ glsl_to_tgsi_visitor::visit(ir_swizzle *ir)
this->result = src;
 }
 
+static
+void update_address_mask_availability(void)
+{
+   switch (available_address_writemask) {
+  case WRITEMASK_X:
+ available_address_writemask = WRITEMASK_Y;
+ available_address_swizzle = SWIZZLE_Y;
+ break;
+  case WRITEMASK_Y:
+ available_address_writemask = WRITEMASK_Z;
+ available_address_swizzle = SWIZZLE_Z;
+ break;
+  case WRITEMASK_Z:
+ available_address_writemask = WRITEMASK_W;
+ available_address_swizzle = SWIZZLE_W;
+ break;
+  case WRITEMASK_W:
+ available_address_writemask = WRITEMASK_X;
+ available_address_swizzle = SWIZZLE_X;
+ break;
+  default:
+ assert(0);
+   }
+}
+
 void
 glsl_to_tgsi_visitor::handle_dereference(ir_dereference *ir)
 {
@@ -2144,8 +2152,12 @@ glsl_to_tgsi_visitor::handle_dereference(ir_dereference *ir)
  this->result, st_src_reg_for_type(index_reg.type,element_size),index_reg);
 }
  }
+ address_reg.writemask = available_address_writemask;
+ emit_arl(ir,address_reg,index_reg);
  result.reladdr = ralloc(mem_ctx, st_src_reg);
- memcpy(result.reladdr, &index_reg, sizeof(index_reg));
+ *(result.reladdr) = st_src_reg(address_reg);
+ result.reladdr->swizzle = available_address_swizzle;
+ update_address_mask_availability();
   }
}
 
@@ -4314,7 +4326,7 @@ translate_src(struct st_translate *t, const st_src_reg *src_reg)
   src.Indirect = 1;
   src.IndirectFile = addr.File;
   src.IndirectIndex = addr.Index;
-  src.IndirectSwizzle = addr.SwizzleX;
+  src.IndirectSwizzle = GET_SWZ(src_reg->reladdr->swizzle,0);
   
   if (src_reg->file != PROGRAM_INPUT &&
   src_reg->file != PROGRAM_OUTPUT) {
-- 
1.7.7
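
The component rotation added by update_address_mask_availability() amounts to a round-robin over the four components of the address register. A compact sketch of that idea follows. It deliberately departs from the patch by keeping the state in a struct instead of file-scope statics, and enum addr_component, struct addr_allocator and alloc_addr_component are hypothetical names used only here.

```c
#include <assert.h>

/* The four components of a vec4 address register, in rotation order. */
enum addr_component { ADDR_X, ADDR_Y, ADDR_Z, ADDR_W, ADDR_COUNT };

struct addr_allocator {
   enum addr_component next;   /* component handed out by the next call */
};

/* Return the current component and advance X -> Y -> Z -> W -> X. */
static enum addr_component
alloc_addr_component(struct addr_allocator *a)
{
   enum addr_component c = a->next;

   assert(c < ADDR_COUNT);
   a->next = (enum addr_component)((c + 1) % ADDR_COUNT);
   return c;
}
```

Each indirect access takes the next free component, so up to four address values can be live at once before the first one is overwritten.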



[Mesa-dev] llvmpipe vs fbo-alphatest-formats

2011-12-29 Thread Dave Airlie
Hi guys,

I was doing some piglit comparisons between softpipe and llvmpipe and
noticed that the fbo-alphatest-formats test fails on llvmpipe for all
the I8 formats.

I looked at the code generated for
lp_tile_soa.c:lp_tile_i8_unorm_unswizzle_4ub, and it references a[i+0]
and a[i+1]. If I change it to reference r[i + 0] and r[i + 1], the
tests all pass. I suspect this code is reading off the end of the
array for an I8_UNORM, but I'm not really sure what it expects to
happen in this case.
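
For readers unfamiliar with intensity formats, here is a minimal sketch of
the conventional GL meaning of I8: one stored channel is replicated to red,
green, blue and alpha. `expand_i8_unorm` is a hypothetical helper, not the
generated llvmpipe code, and it does not settle which SoA array the
unswizzle routine was meant to read.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: expand one I8_UNORM texel to RGBA8.
// An intensity format stores a single channel that supplies R, G, B and A.
inline std::array<std::uint8_t, 4> expand_i8_unorm(std::uint8_t intensity)
{
   return {{ intensity, intensity, intensity, intensity }};
}
```

Under that reading, every output channel carries the same value, which may
be why reading r[] instead of a[] makes the tests pass.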

Maybe someone can take a look and let me know what the intentions were
originally.

Dave.


[Mesa-dev] [PATCH] swrast: Remove dead code in _swrast_clear_depth_buffer()

2011-12-29 Thread Paul Berry
This code was generating the gcc warning:

  variable ‘clearValue’ set but not used [-Wunused-but-set-variable]
---
 src/mesa/swrast/s_depth.c |9 -
 1 files changed, 0 insertions(+), 9 deletions(-)

diff --git a/src/mesa/swrast/s_depth.c b/src/mesa/swrast/s_depth.c
index f87adaa..53f21cb 100644
--- a/src/mesa/swrast/s_depth.c
+++ b/src/mesa/swrast/s_depth.c
@@ -489,7 +489,6 @@ _swrast_clear_depth_buffer(struct gl_context *ctx)
 {
struct gl_renderbuffer *rb =
   ctx->DrawBuffer->Attachment[BUFFER_DEPTH].Renderbuffer;
-   GLuint clearValue;
GLint x, y, width, height;
GLubyte *map;
GLint rowStride, i, j;
@@ -500,14 +499,6 @@ _swrast_clear_depth_buffer(struct gl_context *ctx)
   return;
}
 
-   /* compute integer clearing value */
-   if (ctx->Depth.Clear == 1.0) {
-  clearValue = ctx->DrawBuffer->_DepthMax;
-   }
-   else {
-  clearValue = (GLuint) (ctx->Depth.Clear * ctx->DrawBuffer->_DepthMaxF);
-   }
-
/* compute region to clear */
x = ctx->DrawBuffer->_Xmin;
y = ctx->DrawBuffer->_Ymin;
-- 
1.7.6.4



[Mesa-dev] [PATCH] gallium: use Haiku provided debug_printf in OS.h

2011-12-29 Thread Alexander von Gluck


---
 src/gallium/auxiliary/util/u_debug.h |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_debug.h 
b/src/gallium/auxiliary/util/u_debug.h

index b5ea405..448d799 100644
--- a/src/gallium/auxiliary/util/u_debug.h
+++ b/src/gallium/auxiliary/util/u_debug.h
@@ -91,8 +91,10 @@ debug_printf(const char *format, ...)
(void) format; /* silence warning */
 #endif
 }
-
-#endif /* !PIPE_OS_HAIKU */
+#else /* is Haiku */
+// Haiku provides debug_printf in libroot with OS.h
+#include <OS.h>
+#endif

 /*
  * ... isn't portable so we need to pass arguments in parentheses.
--
1.7.7.2



[Mesa-dev] [PATCH 3/3] gallium: improves glsl_to_tgsi generation of ARL

2011-12-29 Thread Vincent Lejeune
This commit should generate fewer ARL instructions when dealing with indirect
addressing.
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   56 +---
 1 files changed, 34 insertions(+), 22 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index becb774..eb71a26 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -483,6 +483,8 @@ static st_src_reg undef_src = st_src_reg(PROGRAM_UNDEFINED, 
0, GLSL_TYPE_ERROR);
 static st_dst_reg undef_dst = st_dst_reg(PROGRAM_UNDEFINED, SWIZZLE_NOOP, 
GLSL_TYPE_ERROR);
 
 static st_dst_reg address_reg = st_dst_reg(PROGRAM_ADDRESS, WRITEMASK_X, 
GLSL_TYPE_FLOAT);
+static int available_address_writemask = WRITEMASK_X;
+static GLuint available_address_swizzle = SWIZZLE_X;
 
 static void
 fail_link(struct gl_shader_program *prog, const char *fmt, ...) PRINTFLIKE(2, 
3);
@@ -745,29 +747,10 @@ glsl_to_tgsi_visitor::emit(ir_instruction *ir, unsigned 
op,
 st_src_reg src0, st_src_reg src1, st_src_reg src2)
 {
glsl_to_tgsi_instruction *inst = new(mem_ctx) glsl_to_tgsi_instruction();
-   int num_reladdr = 0, i;
+   int i;

op = get_opcode(ir, op, dst, src0, src1);
 
-   /* If we have to do relative addressing, we want to load the ARL
-* reg directly for one of the regs, and preload the other reladdr
-* sources into temps.
-*/
-   num_reladdr += dst.reladdr != NULL;
-   num_reladdr += src0.reladdr != NULL;
-   num_reladdr += src1.reladdr != NULL;
-   num_reladdr += src2.reladdr != NULL;
-
-   reladdr_to_temp(ir, &src2, &num_reladdr);
-   reladdr_to_temp(ir, &src1, &num_reladdr);
-   reladdr_to_temp(ir, &src0, &num_reladdr);
-
-   if (dst.reladdr) {
-  emit_arl(ir, address_reg, *dst.reladdr);
-  num_reladdr--;
-   }
-   assert(num_reladdr == 0);
-
inst->op = op;
inst->dst = dst;
inst->src[0] = src0;
@@ -2110,6 +2093,31 @@ glsl_to_tgsi_visitor::visit(ir_swizzle *ir)
this->result = src;
 }
 
+static
+void update_address_mask_availability(void)
+{
+   switch (available_address_writemask) {
+  case WRITEMASK_X:
+ available_address_writemask = WRITEMASK_Y;
+ available_address_swizzle = SWIZZLE_Y;
+ break;
+  case WRITEMASK_Y:
+ available_address_writemask = WRITEMASK_Z;
+ available_address_swizzle = SWIZZLE_Z;
+ break;
+  case WRITEMASK_Z:
+ available_address_writemask = WRITEMASK_W;
+ available_address_swizzle = SWIZZLE_W;
+ break;
+  case WRITEMASK_W:
+ available_address_writemask = WRITEMASK_X;
+ available_address_swizzle = SWIZZLE_X;
+ break;
+  default:
+ assert(0);
+   }
+}
+
 void
 glsl_to_tgsi_visitor::handle_dereference(ir_dereference *ir)
 {
@@ -2146,8 +2154,12 @@ glsl_to_tgsi_visitor::handle_dereference(ir_dereference 
*ir)
  this->result, 
st_src_reg_for_type(index_reg.type,element_size),index_reg);
 }
  }
+ address_reg.writemask = available_address_writemask;
+ emit_arl(ir,address_reg,index_reg);
  result.reladdr = ralloc(mem_ctx, st_src_reg);
- memcpy(result.reladdr, &index_reg, sizeof(index_reg));
+ *(result.reladdr) = st_src_reg(address_reg);
+ result.reladdr->swizzle = available_address_swizzle;
+ update_address_mask_availability();
   }
}
 
@@ -4316,7 +4328,7 @@ translate_src(struct st_translate *t, const st_src_reg 
*src_reg)
   src.Indirect = 1;
   src.IndirectFile = addr.File;
   src.IndirectIndex = addr.Index;
-  src.IndirectSwizzle = addr.SwizzleX;
+  src.IndirectSwizzle = GET_SWZ(src_reg->reladdr->swizzle,0);
   
   if (src_reg->file != PROGRAM_INPUT &&
   src_reg->file != PROGRAM_OUTPUT) {
-- 
1.7.7



[Mesa-dev] [PATCH 2/3] gallium: dereference are now handled by a separate visitor in glsl_to_tgsi

2011-12-29 Thread Vincent Lejeune
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  455 ++--
 1 files changed, 296 insertions(+), 159 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 59ecb52..becb774 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -293,6 +293,46 @@ public:
void push(class variable_storage *);
 };
 
+/**
+ * This visitor will retrieve offset and expression of indirect addressing
+ * from any ir_dereference*
+ */
+class glsl_to_tgsi_dereference_to_address : public ir_visitor {
+public:
+   unsigned offset;
+   struct {
+  unsigned stride;
+  ir_rvalue *expr;
+   } indirect_address_expression[8];
+   unsigned indirect_address_expression_count;
+   variable_store &store;
+   int &next_temp;
+   glsl_to_tgsi_dereference_to_address(variable_store&,int&);
+   void* mem_ctx;
+   gl_register_file file;
+   const glsl_type* type;
+
+   void visit(class ir_dereference_variable *);
+   void visit(class ir_dereference_array *);
+   void visit(class ir_dereference_record *);
+
+   void visit(ir_variable *);
+   void visit(ir_function_signature *);
+   void visit(ir_function *);
+   void visit(ir_expression *);
+   void visit(ir_texture *);
+   void visit(ir_swizzle *);
+   void visit(ir_assignment *);
+   void visit(ir_constant *);
+   void visit(ir_call *);
+   void visit(ir_discard *);
+   void visit(ir_if *);
+   void visit(ir_loop *);
+   void visit(ir_loop_jump *);
+   void visit(ir_return *);
+};
+
+
 class glsl_to_tgsi_visitor : public ir_visitor {
 public:
glsl_to_tgsi_visitor();
@@ -354,6 +394,8 @@ public:
virtual void visit(ir_if *);
/*@}*/
 
+   void handle_dereference(ir_dereference *);
+
st_src_reg result;
 
/** List of variable_storage */
@@ -491,6 +533,212 @@ num_inst_src_regs(unsigned opcode)
return info->is_tex ? info->num_src - 1 : info->num_src;
 }
 
+static int
+type_size(const struct glsl_type *type)
+{
+   unsigned int i;
+   int size;
+
+   switch (type->base_type) {
+   case GLSL_TYPE_UINT:
+   case GLSL_TYPE_INT:
+   case GLSL_TYPE_FLOAT:
+   case GLSL_TYPE_BOOL:
+  if (type->is_matrix()) {
+ return type->matrix_columns;
+  } else {
+ /* Regardless of size of vector, it gets a vec4. This is bad
+  * packing for things like floats, but otherwise arrays become a
+  * mess.  Hopefully a later pass over the code can pack scalars
+  * down if appropriate.
+  */
+ return 1;
+  }
+   case GLSL_TYPE_ARRAY:
+  assert(type->length > 0);
+  return type_size(type->fields.array) * type->length;
+   case GLSL_TYPE_STRUCT:
+  size = 0;
+  for (i = 0; i < type->length; i++) {
+ size += type_size(type->fields.structure[i].type);
+  }
+  return size;
+   case GLSL_TYPE_SAMPLER:
+  /* Samplers take up one slot in UNIFORMS[], but they're baked in
+   * at link time.
+   */
+  return 1;
+   default:
+  assert(0);
+  return 0;
+   }
+}
+
+glsl_to_tgsi_dereference_to_address::glsl_to_tgsi_dereference_to_address(variable_store
 &s, int &t):indirect_address_expression_count(0),store(s),next_temp(t)
+{
+
+}
+
+void glsl_to_tgsi_dereference_to_address::visit(ir_dereference_variable *ir)
+{
+   variable_storage *entry = store.find_variable_storage(ir->var);
+   ir_variable *var = ir->var;
+
+   if (!entry) {
+  switch (var->mode) {
+  case ir_var_uniform:
+ entry = new(mem_ctx) variable_storage(var, PROGRAM_UNIFORM,
+ var->location);
+ store.push(entry);
+ break;
+  case ir_var_in:
+  case ir_var_inout:
+ /* The linker assigns locations for varyings and attributes,
+  * including deprecated builtins (like gl_Color), user-assign
+  * generic attributes (glBindVertexLocation), and
+  * user-defined varyings.
+  *
+  * FINISHME: We would hit this path for function arguments.  Fix!
+  */
+ assert(var->location != -1);
+ entry = new(mem_ctx) variable_storage(var,
+   PROGRAM_INPUT,
+   var->location);
+ break;
+  case ir_var_out:
+ assert(var->location != -1);
+ entry = new(mem_ctx) variable_storage(var,
+   PROGRAM_OUTPUT,
+   var->location);
+ break;
+  case ir_var_system_value:
+ entry = new(mem_ctx) variable_storage(var,
+   PROGRAM_SYSTEM_VALUE,
+   var->location);
+ break;
+  case ir_var_auto:
+  case ir_var_temporary:
+ entry = new(mem_ctx) variable_storage(var, PROGRAM_TEMPORARY,
+ this->next_temp);
+ store

[Mesa-dev] [PATCH 1/3] gallium: create a new variable_store class replacing variables field in glsl_to_tgsi_visitor

2011-12-29 Thread Vincent Lejeune
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |   37 ++--
 1 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 77aa0d1..59ecb52 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -286,6 +286,13 @@ public:
st_src_reg return_reg;
 };
 
+class variable_store {
+public:
+   exec_list variables;
+   variable_storage *find_variable_storage(class ir_variable *var);
+   void push(class variable_storage *);
+};
+
 class glsl_to_tgsi_visitor : public ir_visitor {
 public:
glsl_to_tgsi_visitor();
@@ -308,8 +315,6 @@ public:
int glsl_version;
bool native_integers;
 
-   variable_storage *find_variable_storage(ir_variable *var);
-
int add_constant(gl_register_file file, gl_constant_value values[4],
 int size, int datatype, GLuint *swizzle_out);
 
@@ -352,7 +357,7 @@ public:
st_src_reg result;
 
/** List of variable_storage */
-   exec_list variables;
+   variable_store store;
 
/** List of immediate_storage */
exec_list immediates;
@@ -994,7 +999,7 @@ glsl_to_tgsi_visitor::get_temp(const glsl_type *type)
 }
 
 variable_storage *
-glsl_to_tgsi_visitor::find_variable_storage(ir_variable *var)
+variable_store::find_variable_storage(ir_variable *var)
 {

variable_storage *entry;
@@ -1010,6 +1015,12 @@ glsl_to_tgsi_visitor::find_variable_storage(ir_variable 
*var)
 }
 
 void
+variable_store::push(variable_storage *storage)
+{
+   variables.push_tail(storage);
+}
+
+void
 glsl_to_tgsi_visitor::visit(ir_variable *ir)
 {
if (strcmp(ir->name, "gl_FragCoord") == 0) {
@@ -1041,7 +1052,7 @@ glsl_to_tgsi_visitor::visit(ir_variable *ir)
   if (i == ir->num_state_slots) {
  /* We'll set the index later. */
  storage = new(mem_ctx) variable_storage(ir, PROGRAM_STATE_VAR, -1);
- this->variables.push_tail(storage);
+ store.push(storage);
 
  dst = undef_dst;
   } else {
@@ -1053,7 +1064,7 @@ glsl_to_tgsi_visitor::visit(ir_variable *ir)
 
  storage = new(mem_ctx) variable_storage(ir, PROGRAM_TEMPORARY,
 this->next_temp);
- this->variables.push_tail(storage);
+ store.push(storage);
  this->next_temp += type_size(ir->type);
 
  dst = st_dst_reg(st_src_reg(PROGRAM_TEMPORARY, storage->index,
@@ -1893,7 +1904,7 @@ glsl_to_tgsi_visitor::visit(ir_swizzle *ir)
 void
 glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
 {
-   variable_storage *entry = find_variable_storage(ir->var);
+   variable_storage *entry = store.find_variable_storage(ir->var);
ir_variable *var = ir->var;
 
if (!entry) {
@@ -1901,7 +1912,7 @@ glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
   case ir_var_uniform:
  entry = new(mem_ctx) variable_storage(var, PROGRAM_UNIFORM,
   var->location);
- this->variables.push_tail(entry);
+ store.push(entry);
  break;
   case ir_var_in:
   case ir_var_inout:
@@ -1932,7 +1943,7 @@ glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
   case ir_var_temporary:
  entry = new(mem_ctx) variable_storage(var, PROGRAM_TEMPORARY,
   this->next_temp);
- this->variables.push_tail(entry);
+ store.push(entry);
 
  next_temp += type_size(var->type);
  break;
@@ -2411,12 +2422,12 @@ 
glsl_to_tgsi_visitor::get_function_signature(ir_function_signature *sig)
   ir_variable *param = (ir_variable *)iter.get();
   variable_storage *storage;
 
-  storage = find_variable_storage(param);
+  storage = store.find_variable_storage(param);
   assert(!storage);
 
   storage = new(mem_ctx) variable_storage(param, PROGRAM_TEMPORARY,
  this->next_temp);
-  this->variables.push_tail(storage);
+  store.push(storage);
 
   this->next_temp += type_size(param->type);
}
@@ -2447,7 +2458,7 @@ glsl_to_tgsi_visitor::visit(ir_call *ir)
 
   if (param->mode == ir_var_in ||
   param->mode == ir_var_inout) {
- variable_storage *storage = find_variable_storage(param);
+ variable_storage *storage = store.find_variable_storage(param);
  assert(storage);
 
  param_rval->accept(this);
@@ -2483,7 +2494,7 @@ glsl_to_tgsi_visitor::visit(ir_call *ir)
 
   if (param->mode == ir_var_out ||
   param->mode == ir_var_inout) {
- variable_storage *storage = find_variable_storage(param);
+ variable_storage *storage = store.find_variable_storage(param);
  assert(storage);
 
  st_src_reg r;
-- 
1.7.7


[Mesa-dev] [RFC] glsl-to-tgsi ARL generator small improvements

2011-12-29 Thread Vincent Lejeune
Hi,

These patches slightly rework the glsl-to-tgsi ir_dereference* visitor.
The first one is self-explanatory.
The second one factors ir_dereference* handling into a separate visitor. By
itself, this change does not improve code generation, but it should make it
easier to handle cases where the variable location is not standard, for
instance UBOs. In addition, glsl-to-tgsi should now generate (U)MAD instead
of (U)MUL and (U)ADD when there are two levels of indirect addressing (which
is not very common at the moment).
The third patch should help glsl-to-tgsi avoid generating redundant (U)ARL
opcodes for indirect addressing. I tested it against piglit's
tests/shaders/glsl-fs-vec4-indexing-temp-src-in-loop.shader_test
and it removed a UARL instruction. As this patch changes the way relative
addressing is handled, it might cause regressions in some situations, so it
needs more testing than the previous ones.
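
As a rough illustration of the idea behind the third patch (a sketch, not
the patch itself), the code below rotates through the four components of a
single address register, so that consecutive indirect accesses do not have
to overwrite each other's ARL result. `AddressSlot` and `AddressAllocator`
are hypothetical names, and the mask values assume the usual X=1, Y=2, Z=4,
W=8 writemask encoding.

```cpp
#include <cassert>

// Hypothetical stand-ins for the WRITEMASK_* values (assumed encoding).
enum : unsigned { MASK_X = 1u, MASK_Y = 2u, MASK_Z = 4u, MASK_W = 8u };

struct AddressSlot {
   unsigned writemask; // component the ARL instruction writes
   unsigned swizzle;   // component the indirect operand reads (0=X .. 3=W)
};

class AddressAllocator {
public:
   // Hand out the current component, then advance X -> Y -> Z -> W -> X.
   AddressSlot next()
   {
      assert(component < 4);
      AddressSlot slot = { 1u << component, component };
      component = (component + 1) % 4;
      return slot;
   }

private:
   unsigned component = 0;
};
```

After four allocations the sketch wraps back to X, so a fifth live address
would clobber the first one; that is the kind of situation that needs the
extra testing mentioned above.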

Regards,
Vincent



Re: [Mesa-dev] [PATCH 5/7] gallium: Make use of gl_transform_feedback_info::ComponentOffset.

2011-12-29 Thread Marek Olšák
Reviewed-by: Marek Olšák 

On Thu, Dec 29, 2011 at 6:16 PM, Paul Berry  wrote:
> ---
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> index 77aa0d1..d337f9b 100644
> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> @@ -5120,7 +5120,8 @@ st_translate_stream_output_info(struct 
> glsl_to_tgsi_visitor *glsl_to_tgsi,
>       so->output[i].register_index =
>          outputMapping[info->Outputs[i].OutputRegister];
>       so->output[i].register_mask =
> -         comps_to_mask[info->Outputs[i].NumComponents];
> +         comps_to_mask[info->Outputs[i].NumComponents]
> +         << info->Outputs[i].ComponentOffset;
>       so->output[i].output_buffer = info->Outputs[i].OutputBuffer;
>    }
>    so->num_outputs = info->NumOutputs;
> --
> 1.7.6.4


Re: [Mesa-dev] RFC [PATCH 0/7] Fix transform feedback of builtin "varyings".

2011-12-29 Thread Paul Berry
On 29 December 2011 09:16, Paul Berry  wrote:

> Arguments in favor: (1) Because of transform feedback's intended use
> and its position in the pipeline, the distinction between varyings and
> other vertex shader outputs is irrelevant; in all likelihood the spec
> writers intended for it to work on all vertex shader outputs.  (2) The
> very use of the term "varying" (and hence, this distinction) has
> largely disappeared from the standard as of GLSL 1.30.  (3) nVidia's
> proprietary Linux driver supports transform feedback of all vertex
> shader outputs (except gl_ClipVertex, which has many other bugs), so
> it's conceivable that some code in the wild relies on this feature.
> (4) Fixing transform feedback of gl_ClipVertex provides a nice
> opportunity to prepare for the changes we will have to make to
> transform feedback in order to support varying packing.
>

Whoops, that last "gl_ClipVertex" should be "gl_ClipDistance".  Sorry if
that caused any confusion.


[Mesa-dev] [PATCH 7/7] mesa: Fix transform feedback of gl_ClipDistance.

2011-12-29 Thread Paul Berry
On drivers that set gl_shader_compiler_options::LowerClipDistance (for
example i965), references to gl_ClipDistance (a float[8] array) will
be converted to references to gl_ClipDistanceMESA (a vec4[2] array).

This patch modifies the linker so that requests for transform feedback
of gl_ClipDistance are similarly converted.

Fixes Piglit test "EXT_transform_feedback/builtin-varyings
gl_ClipDistance".
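
The index conversion described above can be sketched as follows. This is an
illustration only: `LoweredClipDistance` and `lower_clip_distance` are
hypothetical names, whereas the patch itself stores the two results in
tfeedback_decl::array_index and tfeedback_decl::single_component.

```cpp
#include <cassert>

// Hypothetical result type for the lowered reference.
struct LoweredClipDistance {
   unsigned array_index;      // which vec4 of gl_ClipDistanceMESA
   unsigned single_component; // 0=x, 1=y, 2=z, 3=w
};

// gl_ClipDistance is a float[8] array, so n must be in [0, 8).
inline LoweredClipDistance lower_clip_distance(unsigned n)
{
   assert(n < 8);
   return { n / 4, n % 4 };
}
```

For example, gl_ClipDistance[5] becomes the y component of
gl_ClipDistanceMESA[1].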
---
 src/glsl/linker.cpp |   59 +++---
 1 files changed, 41 insertions(+), 18 deletions(-)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 6def821..ed9a5d7 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -1376,8 +1376,8 @@ demote_shader_inputs_and_outputs(gl_shader *sh, enum 
ir_variable_mode mode)
 class tfeedback_decl
 {
 public:
-   bool init(struct gl_shader_program *prog, const void *mem_ctx,
- const char *input);
+   bool init(struct gl_context *ctx, struct gl_shader_program *prog,
+ const void *mem_ctx, const char *input);
static bool is_same(const tfeedback_decl &x, const tfeedback_decl &y);
bool assign_location(struct gl_context *ctx, struct gl_shader_program *prog,
 ir_variable *output_var);
@@ -1433,6 +1433,13 @@ private:
unsigned array_index;
 
/**
+* Which component to extract from the vertex shader output location that
+* the linker assigned to this variable.  -1 if all components should be
+* extracted.
+*/
+   int single_component;
+
+   /**
 * The vertex shader output location that the linker assigned for this
 * variable.  -1 if a location hasn't been assigned yet.
 */
@@ -1458,8 +1465,8 @@ private:
  * reported using linker_error(), and false is returned.
  */
 bool
-tfeedback_decl::init(struct gl_shader_program *prog, const void *mem_ctx,
- const char *input)
+tfeedback_decl::init(struct gl_context *ctx, struct gl_shader_program *prog,
+ const void *mem_ctx, const char *input)
 {
/* We don't have to be pedantic about what is a valid GLSL variable name,
 * because any variable with an invalid name can't exist in the IR anyway.
@@ -1467,23 +1474,34 @@ tfeedback_decl::init(struct gl_shader_program *prog, 
const void *mem_ctx,
 
this->location = -1;
this->orig_name = input;
+   this->single_component = -1;
 
const char *bracket = strrchr(input, '[');
 
if (bracket) {
   this->var_name = ralloc_strndup(mem_ctx, input, bracket - input);
-  if (sscanf(bracket, "[%u]", &this->array_index) == 1) {
- this->is_array = true;
- return true;
+  if (sscanf(bracket, "[%u]", &this->array_index) != 1) {
+ linker_error(prog, "Cannot parse transform feedback varying %s", 
input);
+ return false;
   }
+  this->is_array = true;
} else {
   this->var_name = ralloc_strdup(mem_ctx, input);
   this->is_array = false;
-  return true;
}
 
-   linker_error(prog, "Cannot parse transform feedback varying %s", input);
-   return false;
+   /* For drivers that lower gl_ClipDistance to gl_ClipDistanceMESA, we need
+* to convert a request for gl_ClipDistance[n] into a request for a
+* component of gl_ClipDistanceMESA[n/4].
+*/
+   if (ctx->ShaderCompilerOptions[MESA_SHADER_VERTEX].LowerClipDistance &&
+   strcmp(this->var_name, "gl_ClipDistance") == 0) {
+  this->var_name = "gl_ClipDistanceMESA";
+  this->single_component = this->array_index % 4;
+  this->array_index /= 4;
+   }
+
+   return true;
 }
 
 
@@ -1500,6 +1518,8 @@ tfeedback_decl::is_same(const tfeedback_decl &x, const 
tfeedback_decl &y)
   return false;
if (x.is_array && x.array_index != y.array_index)
   return false;
+   if (x.single_component != y.single_component)
+  return false;
return true;
 }
 
@@ -1595,13 +1615,16 @@ tfeedback_decl::store(struct gl_shader_program *prog,
   return false;
}
for (unsigned v = 0; v < this->matrix_columns; ++v) {
+  unsigned num_components =
+ this->single_component >= 0 ? 1 : this->vector_elements;
   info->Outputs[info->NumOutputs].OutputRegister = this->location + v;
-  info->Outputs[info->NumOutputs].NumComponents = this->vector_elements;
+  info->Outputs[info->NumOutputs].NumComponents = num_components;
   info->Outputs[info->NumOutputs].OutputBuffer = buffer;
   info->Outputs[info->NumOutputs].DstOffset = info->BufferStride[buffer];
-  info->Outputs[info->NumOutputs].ComponentOffset = 0;
+  info->Outputs[info->NumOutputs].ComponentOffset =
+ this->single_component >= 0 ? this->single_component : 0;
   ++info->NumOutputs;
-  info->BufferStride[buffer] += this->vector_elements;
+  info->BufferStride[buffer] += num_components;
}
return true;
 }
@@ -1615,12 +1638,12 @@ tfeedback_decl::store(struct gl_shader_program *prog,
  * is returned.
  */
 static bool
-parse_tfeedback_decls(struct gl_shader_program 

[Mesa-dev] [PATCH 6/7] mesa: Make tfeedback_decl::var_name a const char *.

2011-12-29 Thread Paul Berry
---
 src/glsl/linker.cpp |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 452d8b5..6def821 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -1420,7 +1420,7 @@ private:
/**
 * The name of the variable, parsed from orig_name.
 */
-   char *var_name;
+   const char *var_name;
 
/**
 * True if the declaration in orig_name represents an array.
-- 
1.7.6.4



[Mesa-dev] [PATCH 5/7] gallium: Make use of gl_transform_feedback_info::ComponentOffset.

2011-12-29 Thread Paul Berry
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 77aa0d1..d337f9b 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -5120,7 +5120,8 @@ st_translate_stream_output_info(struct 
glsl_to_tgsi_visitor *glsl_to_tgsi,
   so->output[i].register_index =
  outputMapping[info->Outputs[i].OutputRegister];
   so->output[i].register_mask =
- comps_to_mask[info->Outputs[i].NumComponents];
+ comps_to_mask[info->Outputs[i].NumComponents]
+ << info->Outputs[i].ComponentOffset;
   so->output[i].output_buffer = info->Outputs[i].OutputBuffer;
}
so->num_outputs = info->NumOutputs;
-- 
1.7.6.4



[Mesa-dev] [PATCH 4/7] i965: Make use of gl_transform_feedback_info::ComponentOffset.

2011-12-29 Thread Paul Berry
---
 src/mesa/drivers/dri/i965/brw_gs.c |9 +
 src/mesa/drivers/dri/i965/brw_gs.h |7 +++
 src/mesa/drivers/dri/i965/brw_gs_emit.c|2 +-
 src/mesa/drivers/dri/i965/gen7_sol_state.c |2 ++
 4 files changed, 19 insertions(+), 1 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
b/src/mesa/drivers/dri/i965/brw_gs.c
index 850d7b4..f9c4f6a 100644
--- a/src/mesa/drivers/dri/i965/brw_gs.c
+++ b/src/mesa/drivers/dri/i965/brw_gs.c
@@ -154,6 +154,13 @@ static void compile_gs_prog( struct brw_context *brw,
 static void populate_key( struct brw_context *brw,
  struct brw_gs_prog_key *key )
 {
+   static const unsigned swizzle_for_offset[4] = {
+  BRW_SWIZZLE4(0, 1, 2, 3),
+  BRW_SWIZZLE4(1, 2, 3, 3),
+  BRW_SWIZZLE4(2, 3, 3, 3),
+  BRW_SWIZZLE4(3, 3, 3, 3)
+   };
+
struct gl_context *ctx = &brw->intel.ctx;
struct intel_context *intel = &brw->intel;
 
@@ -207,6 +214,8 @@ static void populate_key( struct brw_context *brw,
  for (i = 0; i < key->num_transform_feedback_bindings; ++i) {
 key->transform_feedback_bindings[i] =
linked_xfb_info->Outputs[i].OutputRegister;
+key->transform_feedback_swizzles[i] =
+   swizzle_for_offset[linked_xfb_info->Outputs[i].ComponentOffset];
  }
   }
   /* On Gen6, GS is also used for rasterizer discard. */
diff --git a/src/mesa/drivers/dri/i965/brw_gs.h 
b/src/mesa/drivers/dri/i965/brw_gs.h
index 2ab8b72..f2597c8 100644
--- a/src/mesa/drivers/dri/i965/brw_gs.h
+++ b/src/mesa/drivers/dri/i965/brw_gs.h
@@ -63,6 +63,13 @@ struct brw_gs_prog_key {
 * entry.
 */
unsigned char transform_feedback_bindings[BRW_MAX_SOL_BINDINGS];
+
+   /**
+* Map from the index of a transform feedback binding table entry to the
+* swizzles that should be used when streaming out data through that
+* binding table entry.
+*/
+   unsigned char transform_feedback_swizzles[BRW_MAX_SOL_BINDINGS];
 };
 
 struct brw_gs_compile {
diff --git a/src/mesa/drivers/dri/i965/brw_gs_emit.c 
b/src/mesa/drivers/dri/i965/brw_gs_emit.c
index 4074501..501cee4 100644
--- a/src/mesa/drivers/dri/i965/brw_gs_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_gs_emit.c
@@ -448,7 +448,7 @@ gen6_sol_program(struct brw_gs_compile *c, struct 
brw_gs_prog_key *key,
 vertex_slot.subnr = (slot % 2) * 16;
 /* gl_PointSize is stored in VERT_RESULT_PSIZ.w. */
 vertex_slot.dw1.bits.swizzle = vert_result == VERT_RESULT_PSIZ
-   ? BRW_SWIZZLE_WWWW : BRW_SWIZZLE_NOOP;
+   ? BRW_SWIZZLE_WWWW : key->transform_feedback_swizzles[binding];
 brw_set_access_mode(p, BRW_ALIGN_16);
 brw_MOV(p, stride(c->reg.header, 4, 4, 1),
 retype(vertex_slot, BRW_REGISTER_TYPE_UD));
diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c 
b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index df6b9ee..674e14f 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -129,6 +129,8 @@ upload_3dstate_so_decl_list(struct brw_context *brw,
   if (vert_result == VERT_RESULT_PSIZ) {
  assert(linked_xfb_info->Outputs[i].NumComponents == 1);
  component_mask <<= 3;
+  } else {
+ component_mask <<= linked_xfb_info->Outputs[i].ComponentOffset;
   }
 
   buffer_mask |= 1 << buffer;
-- 
1.7.6.4



[Mesa-dev] [PATCH 3/7] mesa: Add gl_transform_feedback_info::ComponentOffset.

2011-12-29 Thread Paul Berry
When using transform feedback, there are three circumstances in which
it is useful for Mesa to instruct a driver to stream out just a
portion of a varying slot (rather than the whole vec4):

(a) When a varying is smaller than a vec4, Mesa needs to instruct the
driver to stream out just the first one, two, or three components of
the varying slot.

(b) In the future, when we implement varying packing, some varyings
will be offset within the vec4, so Mesa will have to instruct the
driver to stream out an arbitrary contiguous subset of the components
of the varying slot (e.g. .yzw or .yz).

(c) On drivers that set gl_shader_compiler_options::LowerClipDistance,
if the client requests that an element of gl_ClipDistance be streamed
out using transform feedback, Mesa will have to instruct the driver to
stream out a single component of one of the gl_ClipDistance varying
slots.

Previous to this patch, only (a) was possible, since
gl_transform_feedback_info specified only the number of components of
the varying slot to stream out.  This patch adds
gl_transform_feedback_info::ComponentOffset, which indicates which
components should be streamed out.
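
As a sketch of how a driver might combine the two fields (an assumption for
illustration, not code from this series), a 4-bit component mask can be
built from the component count and then shifted by the offset.
`stream_out_mask` is a hypothetical helper; bit 0 stands for x and bit 3
for w.

```cpp
#include <cassert>

// Hypothetical helper: mask selecting num_components contiguous components
// of a vec4 slot, starting at component_offset (bit 0 = x ... bit 3 = w).
inline unsigned stream_out_mask(unsigned num_components,
                                unsigned component_offset)
{
   assert(num_components >= 1 && num_components + component_offset <= 4);
   return ((1u << num_components) - 1u) << component_offset;
}
```

With NumComponents = 2 and ComponentOffset = 1 this selects .yz, which
covers case (b); with NumComponents = 1 it selects a single component, as
needed for case (c).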
---
 src/glsl/linker.cpp|1 +
 src/mesa/main/mtypes.h |7 +++
 2 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 6587008..452d8b5 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -1599,6 +1599,7 @@ tfeedback_decl::store(struct gl_shader_program *prog,
   info->Outputs[info->NumOutputs].NumComponents = this->vector_elements;
   info->Outputs[info->NumOutputs].OutputBuffer = buffer;
   info->Outputs[info->NumOutputs].DstOffset = info->BufferStride[buffer];
+  info->Outputs[info->NumOutputs].ComponentOffset = 0;
   ++info->NumOutputs;
   info->BufferStride[buffer] += this->vector_elements;
}
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 107371e..d520f98 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1828,6 +1828,13 @@ struct gl_transform_feedback_info {
 
   /** offset (in DWORDs) of this output within the interleaved structure */
   unsigned DstOffset;
+
+  /**
+   * Offset into the output register of the data to output.  For example,
+   * if NumComponents is 2 and ComponentOffset is 1, then the data to
+   * output is in the y and z components of the output register.
+   */
+  unsigned ComponentOffset;
} Outputs[MAX_PROGRAM_OUTPUTS];
 
/**
-- 
1.7.6.4



[Mesa-dev] [PATCH 2/7] i965: Fix transform feedback of gl_ClipVertex.

2011-12-29 Thread Paul Berry
Previously, on i965 Gen6 and above, we weren't allocating space for
gl_ClipVertex in the VUE, since the VS was automatically converting it
to clip distances.  This prevented transform feedback from being able
to capture gl_ClipVertex.

This patch goes ahead and allocates space for gl_ClipVertex in the
VUE on Gen6 and above.  The old behavior is retained on Gen5 and
below, since (a) transform feedback is not yet supported on those
platforms, and (b) those platforms don't currently support
gl_ClipVertex anyhow.

Note: this constitutes a slight waste of VUE space for shaders that
use gl_ClipVertex and don't use transform feedback to capture it.
However, that seems preferable to making the VUE map (and all of the
state that depends on it) dependent on transform feedback settings.
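
As a rough illustration of the slot assignment rule described above, here is a minimal standalone sketch. The enum, the struct, and the function name are hypothetical stand-ins, not the driver's real types; the actual logic lives in brw_compute_vue_map() and also handles outputs that already have a slot.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, shortened stand-ins for Mesa's VERT_RESULT_* values. */
enum vert_result {
   RESULT_POSITION,
   RESULT_CLIP_VERTEX,
   RESULT_VAR0,
   RESULT_MAX
};

struct vue_map {
   int result_to_slot[RESULT_MAX]; /* -1 means "no slot assigned" */
   int num_slots;
};

/* Assign contiguous slots to every written output.  Before Gen6 the clip
 * vertex is skipped; on Gen6 and above it always gets a slot, so the map
 * does not depend on transform feedback state. */
static void
compute_vue_map(struct vue_map *map, int gen, uint64_t outputs_written)
{
   map->num_slots = 0;
   for (int i = 0; i < RESULT_MAX; ++i)
      map->result_to_slot[i] = -1;

   for (int i = 0; i < RESULT_MAX; ++i) {
      if (gen < 6 && i == RESULT_CLIP_VERTEX)
         continue;
      if (outputs_written & (UINT64_C(1) << i))
         map->result_to_slot[i] = map->num_slots++;
   }
}
```

With the same set of written outputs, a Gen5 map simply ends up one slot shorter than a Gen6 map, which is the "slight waste of VUE space" trade-off mentioned in the note.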

Fixes Piglit test "EXT_transform_feedback/builtin-varyings
gl_ClipVertex".
---
 src/mesa/drivers/dri/i965/brw_vs.c |   13 -
 1 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
b/src/mesa/drivers/dri/i965/brw_vs.c
index 6eec973..2f17900 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -139,14 +139,17 @@ brw_compute_vue_map(struct brw_vue_map *vue_map,
 * assign them contiguously.  Don't reassign outputs that already have a
 * slot.
 *
-* Also, don't assign a slot for VERT_RESULT_CLIP_VERTEX, since it is
-* unsupported in pre-GEN6, and in GEN6+ the vertex shader converts it into
-* clip distances.
+* Also, prior to Gen6, don't assign a slot for VERT_RESULT_CLIP_VERTEX,
+* since it is unsupported.  In Gen6 and above, VERT_RESULT_CLIP_VERTEX may
+* be needed for transform feedback; since we don't want to have to
+* recompute the VUE map (and everything that depends on it) when transform
+* feedback is enabled or disabled, just go ahead and assign a slot for it.
 */
for (int i = 0; i < VERT_RESULT_MAX; ++i) {
+  if (intel->gen < 6 && i == VERT_RESULT_CLIP_VERTEX)
+ continue;
   if ((outputs_written & BITFIELD64_BIT(i)) &&
-  vue_map->vert_result_to_slot[i] == -1 &&
-  i != VERT_RESULT_CLIP_VERTEX) {
+  vue_map->vert_result_to_slot[i] == -1) {
  assign_vue_slot(vue_map, i);
   }
}
-- 
1.7.6.4



[Mesa-dev] [PATCH 1/7] i965: Fix transform feedback of gl_PointSize.

2011-12-29 Thread Paul Berry
On i965 Gen6 and above, gl_PointSize is stored in component W of the
first VUE slot (which corresponds to VERT_RESULT_PSIZ in the VUE map).
Normally we store varying floats in component X of a VUE slot, so we
need special case logic for gl_PointSize.

For Gen6, we do this with a "." swizzle in the GS.  For Gen7, we
shift the component mask by 3 to select the W component.
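
The Gen7 mask arithmetic can be sketched in isolation as follows. The helper name and its `first` parameter are illustrative assumptions; the patch itself only hard-codes a shift of 3 for VERT_RESULT_PSIZ.

```c
#include <assert.h>

/* Build a 4-bit XYZW mask covering `num_components` consecutive
 * components, starting at component `first` (0 = X ... 3 = W). */
static unsigned
component_mask(unsigned num_components, unsigned first)
{
   assert(num_components >= 1 && first + num_components <= 4);
   return ((1u << num_components) - 1) << first;
}
```

For gl_PointSize this is a single component starting at W, i.e. component_mask(1, 3), which selects only the top bit of the mask.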

Fixes Piglit test "EXT_transform_feedback/builtin-varyings
gl_PointSize".
---
 src/mesa/drivers/dri/i965/brw_gs_emit.c|5 +
 src/mesa/drivers/dri/i965/gen7_sol_state.c |   11 +--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_gs_emit.c 
b/src/mesa/drivers/dri/i965/brw_gs_emit.c
index 607ee75..4074501 100644
--- a/src/mesa/drivers/dri/i965/brw_gs_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_gs_emit.c
@@ -446,8 +446,13 @@ gen6_sol_program(struct brw_gs_compile *c, struct 
brw_gs_prog_key *key,
 struct brw_reg vertex_slot = c->reg.vertex[vertex];
 vertex_slot.nr += slot / 2;
 vertex_slot.subnr = (slot % 2) * 16;
+/* gl_PointSize is stored in VERT_RESULT_PSIZ.w. */
+vertex_slot.dw1.bits.swizzle = vert_result == VERT_RESULT_PSIZ
+   ? BRW_SWIZZLE_ : BRW_SWIZZLE_NOOP;
+brw_set_access_mode(p, BRW_ALIGN_16);
 brw_MOV(p, stride(c->reg.header, 4, 4, 1),
 retype(vertex_slot, BRW_REGISTER_TYPE_UD));
+brw_set_access_mode(p, BRW_ALIGN_1);
 brw_svb_write(p,
   final_write ? c->reg.temp : brw_null_reg(), /* dest 
*/
   1, /* msg_reg_nr */
diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c 
b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index 7346866..df6b9ee 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -122,14 +122,21 @@ upload_3dstate_so_decl_list(struct brw_context *brw,
   int buffer = linked_xfb_info->Outputs[i].OutputBuffer;
   uint16_t decl = 0;
   int vert_result = linked_xfb_info->Outputs[i].OutputRegister;
+  unsigned component_mask =
+ (1 << linked_xfb_info->Outputs[i].NumComponents) - 1;
+
+  /* gl_PointSize is stored in VERT_RESULT_PSIZ.w. */
+  if (vert_result == VERT_RESULT_PSIZ) {
+ assert(linked_xfb_info->Outputs[i].NumComponents == 1);
+ component_mask <<= 3;
+  }
 
   buffer_mask |= 1 << buffer;
 
   decl |= buffer << SO_DECL_OUTPUT_BUFFER_SLOT_SHIFT;
   decl |= vue_map->vert_result_to_slot[vert_result] <<
 SO_DECL_REGISTER_INDEX_SHIFT;
-  decl |= ((1 << linked_xfb_info->Outputs[i].NumComponents) - 1) <<
-SO_DECL_COMPONENT_MASK_SHIFT;
+  decl |= component_mask << SO_DECL_COMPONENT_MASK_SHIFT;
 
   /* This assert should be true until GL_ARB_transform_feedback_instanced
* is added and we start using the hole flag.
-- 
1.7.6.4



[Mesa-dev] RFC [PATCH 0/7] Fix transform feedback of builtin "varyings".

2011-12-29 Thread Paul Berry
This patch series allows transform feedback to work properly on the
built-in vertex shader output variables gl_PointSize, gl_ClipVertex,
and gl_ClipDistance.  gl_PointSize and gl_ClipVertex were broken due
to bugs in the i965 driver, and were trivial to fix--those are fixed
in patches 1 and 2.

gl_ClipDistance was harder to fix, since on i965 its 8 floats are
packed into 2 vec4s, so the linker has to tell the back-end to select
a single component of one of the vec4's for streaming out.  This
required changing both core mesa and the back-ends, and adding a new
field to gl_transform_feedback_info.  However, the work seems worth it
because it lays some of the groundwork we will need when we get around
to packing varyings.  Patch 3 adds the new field, patches 4-5 cause
the back-ends to use it, and patches 6-7 update the linker to populate
it correctly for gl_ClipDistance.
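
To make the packing concrete, here is a minimal sketch of how an element of gl_ClipDistance could map to a vec4 register and a component within it. It assumes the natural layout (element i lives in vec4 i / 4, component i % 4); the struct and function names are hypothetical and do not appear in the patches.

```c
#include <assert.h>

struct clip_distance_location {
   unsigned vec4_index;       /* which of the two vec4 registers: 0 or 1 */
   unsigned component_offset; /* 0 = x, 1 = y, 2 = z, 3 = w */
};

/* Locate one float of gl_ClipDistance[8] inside its two packed vec4s,
 * assuming elements are packed in order. */
static struct clip_distance_location
locate_clip_distance(unsigned element)
{
   struct clip_distance_location loc;

   assert(element < 8);
   loc.vec4_index = element / 4;
   loc.component_offset = element % 4;
   return loc;
}
```

Under that assumption, capturing gl_ClipDistance[5] means streaming out one component at offset 1 (y) of the second vec4, which is the kind of information the new ComponentOffset field is meant to carry.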

I'm putting this patch series out as an RFC partly because I want to
find out if the new field in gl_transform_feedback_info makes sense
for other driver back-ends, and partly because it is not 100% clear
from the spec whether transform feedback is intended to work on all
vertex shader outputs, or just the "varyings" (the ones that are
interpolated across the surface of a primitive).  Here are the
arguments I can see for and against going through with this patch
series:

Arguments against: (1) The GL 3.0 spec says that "The varying
variables specified in  can be either built-in varying
variables (beginning with 'gl_') or user-defined ones".  But it also
explicitly states that gl_Position is not a varying variable.  And
GLSL 1.20 lists gl_Position, gl_PointSize, and gl_ClipVertex in
section 7.1 ("Vertex Shader Special Variables") rather than section
7.6 ("Varying Variables").  It seems clear that there was an intention
to distinguish between "varyings" and other vertex shader outputs, and
transform feedback is defined to work on varyings.  (2) In all
likelihood, most code that uses transform feedback uses it on
user-defined varyings anyhow, so fixing these built-in variables is
unlikely to make much difference.  (3) Making transform feedback work
on gl_ClipDistance adds a lot of complication for the benefit of a
tiny corner case.

Arguments in favor: (1) Because of transform feedback's intended use
and its position in the pipeline, the distinction between varyings and
other vertex shader outputs is irrelevant; in all likelihood the spec
writers intended for it to work on all vertex shader outputs.  (2) The
very use of the term "varying" (and hence, this distinction) has
largely disappeared from the standard as of GLSL 1.30.  (3) nVidia's
proprietary Linux driver supports transform feedback of all vertex
shader outputs (except gl_ClipVertex, which has many other bugs), so
it's conceivable that some code in the wild relies on this feature.
(4) Fixing transform feedback of gl_ClipDistance provides a nice
opportunity to prepare for the changes we will have to make to
transform feedback in order to support varying packing.

Personally, I'm swayed by the arguments in favor but I would like to
hear what others think.

[PATCH 1/7] i965: Fix transform feedback of gl_PointSize.
[PATCH 2/7] i965: Fix transform feedback of gl_ClipVertex.
[PATCH 3/7] mesa: Add gl_transform_feedback_info::ComponentOffset.
[PATCH 4/7] i965: Make use of gl_transform_feedback_info::ComponentOffset.
[PATCH 5/7] gallium: Make use of gl_transform_feedback_info::ComponentOffset.
[PATCH 6/7] mesa: Make tfeedback_decl::var_name a const char *.
[PATCH 7/7] mesa: Fix transform feedback of gl_ClipDistance.


[Mesa-dev] [Bug 43332] corrupted output in mesa-demo/fp-tri using r600g

2011-12-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=43332

Michel Dänzer  changed:

   What|Removed |Added

 AssignedTo|dri-devel@lists.freedesktop |mesa-dev@lists.freedesktop.
   |.org|org
  Component|Drivers/Gallium/r600|Demos

--- Comment #6 from Michel Dänzer  2011-12-29 09:13:35 PST 
---
Reassigning to demos per comment #5.



[Mesa-dev] [PATCH] glsl: fix usage of potentially undefined data_end union

2011-12-29 Thread Alexander von Gluck


---
 src/glsl/link_uniforms.cpp |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
index c7de480..b331db7 100644
--- a/src/glsl/link_uniforms.cpp
+++ b/src/glsl/link_uniforms.cpp
@@ -365,9 +365,9 @@ link_assign_uniform_locations(struct gl_shader_program 
*prog)

for (unsigned i = 0; i < num_user_uniforms; i++) {
   assert(uniforms[i].storage != NULL);
}
-#endif

assert(parcel.values == data_end);
+#endif

prog->NumUserUniformStorage = num_user_uniforms;
prog->UniformStorage = uniforms;
--
1.7.7.2



Re: [Mesa-dev] [PATCH] r600g: Manage fences per screen rather than per context.

2011-12-29 Thread Mathias Fröhlich

Hi,

On Thursday, December 29, 2011 15:34:24 Michel Dänzer wrote:
> From: Michel Dänzer 
> 
> A fence is a screen object and can outlive the context it was created from.
> The previous code would access freed memory in that case, resulting in
> various problems.
> 
> Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=44151
> 
> Probably fixes: https://bugs.freedesktop.org/show_bug.cgi?id=44007
> https://bugs.freedesktop.org/show_bug.cgi?id=43993
> ... and likely a fair number of similar issues.
> 
> Signed-off-by: Michel Dänzer 
> ---
> 
> v2: Add pipe_mutex_destroy() call for newly added mutex.
Reviewed-by: Mathias Fröhlich 

Greetings

Mathias


Re: [Mesa-dev] [PATCH 3/3] vl: replace decode_buffers with auxiliary data field

2011-12-29 Thread Maarten Lankhorst
Hey Christian,

On 26-12-11 14:00, Christian König wrote:
> Based on patches from Maarten Lankhorst 
>
> Signed-off-by: Christian König 
>
> diff --git a/src/gallium/include/pipe/p_context.h 
> b/src/gallium/include/pipe/p_context.h
> index de79a9b..f7ee522 100644
> --- a/src/gallium/include/pipe/p_context.h
> +++ b/src/gallium/include/pipe/p_context.h
> @@ -410,7 +410,8 @@ struct pipe_context {
> enum 
> pipe_video_profile profile,
> enum 
> pipe_video_entrypoint entrypoint,
> enum 
> pipe_video_chroma_format chroma_format,
> -   unsigned width, 
> unsigned height, unsigned max_references );
> +   unsigned width, 
> unsigned height, unsigned max_references,
> +   bool 
> expect_chunked_decode);
>
I really don't like this part, isn't it implied from entrypoint >= 
PIPE_VIDEO_ENTRYPOINT_IDCT?

~Maarten


Re: [Mesa-dev] [PATCH 2/3] vl: seperate shader buffers from components

2011-12-29 Thread Maarten Lankhorst
Hey,

On 26-12-11 14:00, Christian König wrote:
> Buffers for shader based decoding can now be
> released without its component still being around.
>
> Signed-off-by: Christian König 
>
Looks good to me.   

Acked-by: Maarten Lankhorst 

~Maarten


Re: [Mesa-dev] [PATCH] r600g: Manage fences per screen rather than per context.

2011-12-29 Thread Michel Dänzer
On Thu, 2011-12-29 at 14:57 +0100, Mathias Fröhlich wrote:
> On Thursday, December 29, 2011 13:35:19 Michel Dänzer wrote:
> > From: Michel Dänzer 
> > 
> > Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=44151
> > 
> > Probably fixes: https://bugs.freedesktop.org/show_bug.cgi?id=44007
> > https://bugs.freedesktop.org/show_bug.cgi?id=43993
> > 
> > Signed-off-by: Michel Dänzer 
> > ---
> > 
> > This introduces a potential race condition with apps using several contexts
> > concurrently in several threads: rscreen->fences.bo is referenced by all
> > contexts, so e.g. rscreen->fences.bo->cs_buf->last_flush may be concurrently
> > read/written by several threads. However, I think this could already happen
> > e.g. when sharing textures between GLX contexts, so it should probably be
> > addressed separately. Also, in the case of rscreen->fences.bo I think it's
> > harmless, as no caches should need to be flushed for it.
> > 
> > No regressions (or fixes) in piglit quick.tests.
> 
> This actually fixes some piglit problems that I see here when using different
> vblank_mode settings in .drirc. These problems arise because fences may
> outlive the context itself, which this patch should fix as well.

Ah, the intermittent hangs of some GLX tests? It didn't occur to me that
those could be related to this problem as well, but it makes sense.


> I am missing a pipe_mutex_destroy in this change.

Good point, fixed in v2.


> Other than that - I am fine with that change.
> Also tested on a rv770 here - no piglit quick regressions.

Great, thanks!


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer


[Mesa-dev] [PATCH] r600g: Manage fences per screen rather than per context.

2011-12-29 Thread Michel Dänzer
From: Michel Dänzer 

A fence is a screen object and can outlive the context it was created from.
The previous code would access freed memory in that case, resulting in
various problems.
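
The locking pattern used by the patch (one mutex owned by the screen, with every exit path funnelled through a single unlock) can be sketched with plain pthreads. This is a simplified illustration only: the struct layout, MAX_FENCES, and the function names are hypothetical, and the real code uses the pipe_mutex wrappers and a buffer object rather than a static array.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_FENCES 4

struct fence {
   unsigned index;
};

/* Hypothetical screen-level state shared by every context. */
struct screen {
   pthread_mutex_t fence_mutex;
   struct fence fences[MAX_FENCES];
   unsigned next_index;
};

static void
screen_init(struct screen *s)
{
   pthread_mutex_init(&s->fence_mutex, NULL);
   s->next_index = 0;
}

static void
screen_destroy(struct screen *s)
{
   /* Counterpart of the pipe_mutex_destroy() call added in v2. */
   pthread_mutex_destroy(&s->fence_mutex);
}

/* Returns NULL when the pool is exhausted.  Every exit goes through
 * `out`, so the mutex is always released. */
static struct fence *
screen_create_fence(struct screen *s)
{
   struct fence *fence = NULL;

   pthread_mutex_lock(&s->fence_mutex);
   if (s->next_index >= MAX_FENCES)
      goto out;
   fence = &s->fences[s->next_index];
   fence->index = s->next_index++;
out:
   pthread_mutex_unlock(&s->fence_mutex);
   return fence;
}
```

Because the pool belongs to the screen, a fence stays valid after the context that requested it is destroyed, which is the property the commit message relies on.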

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=44151

Probably fixes: https://bugs.freedesktop.org/show_bug.cgi?id=44007
https://bugs.freedesktop.org/show_bug.cgi?id=43993
... and likely a fair number of similar issues.

Signed-off-by: Michel Dänzer 
---

v2: Add pipe_mutex_destroy() call for newly added mutex.

 src/gallium/drivers/r600/r600_pipe.c |   95 ++---
 src/gallium/drivers/r600/r600_pipe.h |   26 +-
 2 files changed, 65 insertions(+), 56 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_pipe.c 
b/src/gallium/drivers/r600/r600_pipe.c
index 7f62e0e..9f09080 100644
--- a/src/gallium/drivers/r600/r600_pipe.c
+++ b/src/gallium/drivers/r600/r600_pipe.c
@@ -53,27 +53,31 @@
  */
 static struct r600_fence *r600_create_fence(struct r600_pipe_context *ctx)
 {
+   struct r600_screen *rscreen = ctx->screen;
struct r600_fence *fence = NULL;
 
-   if (!ctx->fences.bo) {
+   pipe_mutex_lock(rscreen->fences.mutex);
+
+   if (!rscreen->fences.bo) {
/* Create the shared buffer object */
-   ctx->fences.bo = (struct r600_resource*)
-   pipe_buffer_create(&ctx->screen->screen, 
PIPE_BIND_CUSTOM,
+   rscreen->fences.bo = (struct r600_resource*)
+   pipe_buffer_create(&rscreen->screen, PIPE_BIND_CUSTOM,
   PIPE_USAGE_STAGING, 4096);
-   if (!ctx->fences.bo) {
+   if (!rscreen->fences.bo) {
R600_ERR("r600: failed to create bo for fence 
objects\n");
-   return NULL;
+   goto out;
}
-   ctx->fences.data = ctx->ws->buffer_map(ctx->fences.bo->buf, 
ctx->ctx.cs,
-  PIPE_TRANSFER_WRITE);
+   rscreen->fences.data = 
ctx->ws->buffer_map(rscreen->fences.bo->buf,
+  ctx->ctx.cs,
+  
PIPE_TRANSFER_READ_WRITE);
}
 
-   if (!LIST_IS_EMPTY(&ctx->fences.pool)) {
+   if (!LIST_IS_EMPTY(&rscreen->fences.pool)) {
struct r600_fence *entry;
 
/* Try to find a freed fence that has been signalled */
-   LIST_FOR_EACH_ENTRY(entry, &ctx->fences.pool, head) {
-   if (ctx->fences.data[entry->index] != 0) {
+   LIST_FOR_EACH_ENTRY(entry, &rscreen->fences.pool, head) {
+   if (rscreen->fences.data[entry->index] != 0) {
LIST_DELINIT(&entry->head);
fence = entry;
break;
@@ -86,33 +90,34 @@ static struct r600_fence *r600_create_fence(struct 
r600_pipe_context *ctx)
struct r600_fence_block *block;
unsigned index;
 
-   if ((ctx->fences.next_index + 1) >= 1024) {
+   if ((rscreen->fences.next_index + 1) >= 1024) {
R600_ERR("r600: too many concurrent fences\n");
-   return NULL;
+   goto out;
}
 
-   index = ctx->fences.next_index++;
+   index = rscreen->fences.next_index++;
 
if (!(index % FENCE_BLOCK_SIZE)) {
/* Allocate a new block */
block = CALLOC_STRUCT(r600_fence_block);
if (block == NULL)
-   return NULL;
+   goto out;
 
-   LIST_ADD(&block->head, &ctx->fences.blocks);
+   LIST_ADD(&block->head, &rscreen->fences.blocks);
} else {
-   block = LIST_ENTRY(struct r600_fence_block, 
ctx->fences.blocks.next, head);
+   block = LIST_ENTRY(struct r600_fence_block, 
rscreen->fences.blocks.next, head);
}
 
fence = &block->fences[index % FENCE_BLOCK_SIZE];
-   fence->ctx = ctx;
fence->index = index;
}
 
pipe_reference_init(&fence->reference, 1);
 
-   ctx->fences.data[fence->index] = 0;
-   r600_context_emit_fence(&ctx->ctx, ctx->fences.bo, fence->index, 1);
+   rscreen->fences.data[fence->index] = 0;
+   r600_context_emit_fence(&ctx->ctx, rscreen->fences.bo, fence->index, 1);
+out:
+   pipe_mutex_unlock(rscreen->fences.mutex);
return fence;
 }
 
@@ -191,18 +196,6 @@ static void r600_destroy_context(struct pipe_context 
*context)
u_vbuf_destroy(rctx->vbuf_mgr);
util_slab_destroy(&rctx->pool_transfers);
 
-   if (rctx->fences.bo) {
-   struct r600_fenc

Re: [Mesa-dev] [PATCH] util: u_gen_mipmap: use software path for small mipmap levels

2011-12-29 Thread Lucas Stach
Ok, disregard this patch.

I see this is a bad idea and will fix the issue in another way in nvfx.

Thanks,
Lucas

On Thursday, 29.12.2011, 01:12 +0100, Marek Olšák wrote:
> Hi Lucas,
> 
> The fallback will be slower on Radeons for these two reasons:
> - Texture transfers are implemented using a blit to or from a
> temporary texture, which is allocated in get_transfer, so it takes
> more CPU time than simply generating all mipmap levels on hardware
> - Mapping a texture causes waiting for the GPU to complete all of its
> work, i.e. there will be a stall in transfer_map for an unspecified
> time. At the time of calling transfer_map, the GPU may not have even
> started generating any mipmap levels, so the waiting would be quite
> noticeable. This applies to all GPUs. Also, some rendering techniques
> use mipmap generation every frame, e.g. to compute an average color of
> the back buffer, which is common in HDR rendering. I would not like to
> have any stalls there.
> 
> Marek
> 
> On Wed, Dec 28, 2011 at 8:28 PM, Lucas Stach  wrote:
> > From 1273dd1e1ede35b9a434c3f9d9eaa4a03eb8d0b3 Mon Sep 17 00:00:00 2001
> > From: Lucas Stach 
> > Date: Wed, 28 Dec 2011 20:00:48 +0100
> > Subject: [PATCH] util: u_gen_mipmap: use software path for small mipmap
> >  levels
> >
> > We are changing a lot of states to generate mipmaps with the
> > hardware 3D engine, which is a good thing for big mipmap levels as
> > it is fast. Generating the small mipmap levels this way is unlikely
> > to outperform the software path, which is using a transfer.
> >
> > Additionally, some hardware, like nv3x and nv4x, has
> > alignment requirements for render targets which prevent it from
> > rendering into render targets smaller than 16x16 pixels. To
> > generate those small mipmap levels the nvfx driver has to render to
> > a temporary surface just to copy the result to the real render
> > target using the 2D engine.
> >
> > Avoid all this overhead by just generating small mipmap levels
> > using the software path.
> >
> > Signed-off-by: Lucas Stach 
> > ---
> >  src/gallium/auxiliary/util/u_gen_mipmap.c |   23 +++
> >  1 files changed, 19 insertions(+), 4 deletions(-)
> >
> > diff --git a/src/gallium/auxiliary/util/u_gen_mipmap.c 
> > b/src/gallium/auxiliary/util/u_gen_mipmap.c
> > index 7cce815..88351b6 100644
> > --- a/src/gallium/auxiliary/util/u_gen_mipmap.c
> > +++ b/src/gallium/auxiliary/util/u_gen_mipmap.c
> > @@ -911,6 +911,8 @@ format_to_type_comps(enum pipe_format pformat,
> >case PIPE_FORMAT_B8G8R8X8_UNORM:
> >case PIPE_FORMAT_A8R8G8B8_UNORM:
> >case PIPE_FORMAT_X8R8G8B8_UNORM:
> > +   case PIPE_FORMAT_R8G8B8A8_UNORM:
> > +   case PIPE_FORMAT_R8G8B8X8_UNORM:
> >case PIPE_FORMAT_A8B8G8R8_SRGB:
> >case PIPE_FORMAT_X8B8G8R8_SRGB:
> >case PIPE_FORMAT_B8G8R8A8_SRGB:
> > @@ -1506,7 +1508,7 @@ util_gen_mipmap(struct gen_mipmap_state *ctx,
> >struct pipe_screen *screen = pipe->screen;
> >struct pipe_framebuffer_state fb;
> >struct pipe_resource *pt = psv->texture;
> > -   uint dstLevel;
> > +   uint dstLevel, hwLastLevel;
> >uint offset;
> >uint type;
> >
> > @@ -1588,10 +1590,17 @@ util_gen_mipmap(struct gen_mipmap_state *ctx,
> >ctx->sampler.min_img_filter = filter;
> >
> >/*
> > -* XXX for small mipmap levels, it may be faster to use the software
> > -* fallback path...
> > +*  for small mipmap levels we use the software path as it is likely 
> > faster
> > +*  as we avoid a bunch of state changes and avoid triggering fallback 
> > paths
> > +*  in drivers incapable of rendering to images smaller than 16x16 
> > pixel.
> > */
> > -   for (dstLevel = baseLevel + 1; dstLevel <= lastLevel; dstLevel++) {
> > +   for(hwLastLevel = baseLevel; hwLastLevel <= lastLevel; hwLastLevel++) {
> > +  if((pt->height0 >> hwLastLevel <= 16) ||
> > + (pt->width0 >> hwLastLevel <= 16))
> > +  break;
> > +   }
> > +
> > +   for (dstLevel = baseLevel + 1; dstLevel <= hwLastLevel; dstLevel++) {
> >   const uint srcLevel = dstLevel - 1;
> >   struct pipe_viewport_state vp;
> >   unsigned nr_layers, layer, i;
> > @@ -1677,6 +1686,12 @@ util_gen_mipmap(struct gen_mipmap_state *ctx,
> >   }
> >}
> >
> > +   /* if hardware path didn't fill all requested mip levels fill the 
> > remaining
> > +* levels with the software path
> > +*/
> > +   if(dstLevel < lastLevel)
> > +  fallback_gen_mipmap(ctx, pt, face, dstLevel-1, lastLevel);
> > +
> >/* restore state we changed */
> >cso_restore_blend(ctx->cso);
> >cso_restore_depth_stencil_alpha(ctx->cso);
> > --
> > 1.7.7.4
> >
> >
> 
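
For reference, the level split computed by the first loop in the quoted patch can be sketched as a standalone function. The function name is hypothetical; the comparison against 16 and the loop bounds follow the quoted diff.

```c
#include <assert.h>

/* Return the first mip level in [base_level, last_level] whose width or
 * height has shrunk to 16 texels or fewer, or last_level + 1 if no level
 * in that range is that small. */
static unsigned
first_small_level(unsigned width0, unsigned height0,
                  unsigned base_level, unsigned last_level)
{
   unsigned level;

   for (level = base_level; level <= last_level; level++) {
      if ((height0 >> level) <= 16 || (width0 >> level) <= 16)
         break;
   }
   return level;
}
```

For a 256x64 texture, the height reaches 16 at level 2, so that is where the quoted patch would have stopped using the hardware path.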



Re: [Mesa-dev] [PATCH] r600g: Manage fences per screen rather than per context.

2011-12-29 Thread Mathias Fröhlich

Hi,

On Thursday, December 29, 2011 13:35:19 Michel Dänzer wrote:
> From: Michel Dänzer 
> 
> Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=44151
> 
> Probably fixes: https://bugs.freedesktop.org/show_bug.cgi?id=44007
> https://bugs.freedesktop.org/show_bug.cgi?id=43993
> 
> Signed-off-by: Michel Dänzer 
> ---
> 
> This introduces a potential race condition with apps using several contexts
> concurrently in several threads: rscreen->fences.bo is referenced by all
> contexts, so e.g. rscreen->fences.bo->cs_buf->last_flush may be concurrently
> read/written by several threads. However, I think this could already happen
> e.g. when sharing textures between GLX contexts, so it should probably be
> addressed separately. Also, in the case of rscreen->fences.bo I think it's
> harmless, as no caches should need to be flushed for it.
> 
> No regressions (or fixes) in piglit quick.tests.

This actually fixes some piglit problems that I see here when using different
vblank_mode settings in .drirc. These problems arise because fences may
outlive the context itself, which this patch should fix as well.

I am missing a pipe_mutex_destroy in this change.

Other than that - I am fine with that change.
Also tested on a rv770 here - no piglit quick regressions.

Mathias


Re: [Mesa-dev] [PATCH] llvmpipe: Remove useless draw_install_pstipple_stage call.

2011-12-29 Thread Jose Fonseca
- Original Message -
> It is #ifdef'd out, and is already called unconditionally a couple
> lines above.
> ---
>  src/gallium/drivers/llvmpipe/lp_context.c |5 -
>  1 files changed, 0 insertions(+), 5 deletions(-)
> 
> diff --git a/src/gallium/drivers/llvmpipe/lp_context.c
> b/src/gallium/drivers/llvmpipe/lp_context.c
> index b6ac068..c19272f 100644
> --- a/src/gallium/drivers/llvmpipe/lp_context.c
> +++ b/src/gallium/drivers/llvmpipe/lp_context.c
> @@ -229,11 +229,6 @@ llvmpipe_create_context( struct pipe_screen
> *screen, void *priv )
> draw_wide_point_threshold(llvmpipe->draw, 1.0);
> draw_wide_line_threshold(llvmpipe->draw, 1.0);
>  
> -#if USE_DRAW_STAGE_PSTIPPLE
> -   /* Do polygon stipple w/ texture map + frag prog? */
> -   draw_install_pstipple_stage(llvmpipe->draw, &llvmpipe->pipe);
> -#endif
> -
> lp_reset_counters();
>  
> gallivm_register_garbage_collector_callback(garbage_collect_callback,
> --
> 1.7.5.3.367.ga9930

 
Reviewed-By: Jose Fonseca 

Jose


[Mesa-dev] [PATCH] r600g: Manage fences per screen rather than per context.

2011-12-29 Thread Michel Dänzer
From: Michel Dänzer 

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=44151

Probably fixes: https://bugs.freedesktop.org/show_bug.cgi?id=44007
https://bugs.freedesktop.org/show_bug.cgi?id=43993

Signed-off-by: Michel Dänzer 
---

This introduces a potential race condition with apps using several contexts
concurrently in several threads: rscreen->fences.bo is referenced by all
contexts, so e.g. rscreen->fences.bo->cs_buf->last_flush may be concurrently
read/written by several threads. However, I think this could already happen
e.g. when sharing textures between GLX contexts, so it should probably be
addressed separately. Also, in the case of rscreen->fences.bo I think it's
harmless, as no caches should need to be flushed for it.

No regressions (or fixes) in piglit quick.tests.

 src/gallium/drivers/r600/r600_pipe.c |   94 ++---
 src/gallium/drivers/r600/r600_pipe.h |   26 +-
 2 files changed, 64 insertions(+), 56 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_pipe.c 
b/src/gallium/drivers/r600/r600_pipe.c
index 7f62e0e..085c4e8 100644
--- a/src/gallium/drivers/r600/r600_pipe.c
+++ b/src/gallium/drivers/r600/r600_pipe.c
@@ -53,27 +53,31 @@
  */
 static struct r600_fence *r600_create_fence(struct r600_pipe_context *ctx)
 {
+   struct r600_screen *rscreen = ctx->screen;
struct r600_fence *fence = NULL;
 
-   if (!ctx->fences.bo) {
+   pipe_mutex_lock(rscreen->fences.mutex);
+
+   if (!rscreen->fences.bo) {
/* Create the shared buffer object */
-   ctx->fences.bo = (struct r600_resource*)
-   pipe_buffer_create(&ctx->screen->screen, 
PIPE_BIND_CUSTOM,
+   rscreen->fences.bo = (struct r600_resource*)
+   pipe_buffer_create(&rscreen->screen, PIPE_BIND_CUSTOM,
   PIPE_USAGE_STAGING, 4096);
-   if (!ctx->fences.bo) {
+   if (!rscreen->fences.bo) {
R600_ERR("r600: failed to create bo for fence 
objects\n");
-   return NULL;
+   goto out;
}
-   ctx->fences.data = ctx->ws->buffer_map(ctx->fences.bo->buf, 
ctx->ctx.cs,
-  PIPE_TRANSFER_WRITE);
+   rscreen->fences.data = 
ctx->ws->buffer_map(rscreen->fences.bo->buf,
+  ctx->ctx.cs,
+  
PIPE_TRANSFER_READ_WRITE);
}
 
-   if (!LIST_IS_EMPTY(&ctx->fences.pool)) {
+   if (!LIST_IS_EMPTY(&rscreen->fences.pool)) {
struct r600_fence *entry;
 
/* Try to find a freed fence that has been signalled */
-   LIST_FOR_EACH_ENTRY(entry, &ctx->fences.pool, head) {
-   if (ctx->fences.data[entry->index] != 0) {
+   LIST_FOR_EACH_ENTRY(entry, &rscreen->fences.pool, head) {
+   if (rscreen->fences.data[entry->index] != 0) {
LIST_DELINIT(&entry->head);
fence = entry;
break;
@@ -86,33 +90,34 @@ static struct r600_fence *r600_create_fence(struct 
r600_pipe_context *ctx)
struct r600_fence_block *block;
unsigned index;
 
-   if ((ctx->fences.next_index + 1) >= 1024) {
+   if ((rscreen->fences.next_index + 1) >= 1024) {
R600_ERR("r600: too many concurrent fences\n");
-   return NULL;
+   goto out;
}
 
-   index = ctx->fences.next_index++;
+   index = rscreen->fences.next_index++;
 
if (!(index % FENCE_BLOCK_SIZE)) {
/* Allocate a new block */
block = CALLOC_STRUCT(r600_fence_block);
if (block == NULL)
-   return NULL;
+   goto out;
 
-   LIST_ADD(&block->head, &ctx->fences.blocks);
+   LIST_ADD(&block->head, &rscreen->fences.blocks);
} else {
-   block = LIST_ENTRY(struct r600_fence_block, 
ctx->fences.blocks.next, head);
+   block = LIST_ENTRY(struct r600_fence_block, 
rscreen->fences.blocks.next, head);
}
 
fence = &block->fences[index % FENCE_BLOCK_SIZE];
-   fence->ctx = ctx;
fence->index = index;
}
 
pipe_reference_init(&fence->reference, 1);
 
-   ctx->fences.data[fence->index] = 0;
-   r600_context_emit_fence(&ctx->ctx, ctx->fences.bo, fence->index, 1);
+   rscreen->fences.data[fence->index] = 0;
+   r600_context_emit_fence(&ctx->ctx, rscreen->fences.bo, fence->index, 1);
+out:
+   pipe_mutex_unlock(rscr

Re: [Mesa-dev] [PATCH 2/2] i965: fix the wrong min/max_index for nr_prims > 1

2011-12-29 Thread Michel Dänzer
On Thu, 2011-12-29 at 10:03 +0800, Yuanhan Liu wrote:
> On Wed, Dec 28, 2011 at 12:07:08PM -0800, Eric Anholt wrote:
> > On Wed, 28 Dec 2011 13:54:43 +0800, Yuanhan Liu 
> >  wrote:
> > > The current code would just calculate min/max_index for the first prim
> > > unconditionally, which is wrong if nr_prims > 1.
> > > 
> > > > This breaks cases where the indices are stored in an element array
> > > > buffer object and drawn with glMultiDrawElements. Thus the patch fixes
> > > > some intel oglc primbuff test cases.
> > > 
> > > Signed-off-by: Yuanhan Liu 
> > 
> > It does look like gallium has the same bug --
> 
> i965g?  I just found that the whole i965g is deleted by commit
> 2c27f204f1ca6f09f9520712be1da9a13ed5c01d.
> 
> > this should probably be a
> > vbo helper function.
> 
> If you were talking about i965g and now it was deleted, should I make
> this be a vbo helper function?

i965g was just one Gallium driver. Presumably, Eric was referring to the
Gallium Mesa state tracker (src/mesa/state_tracker/), which translates
between the Mesa and Gallium driver interfaces.
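
A shared helper of the kind discussed here might look roughly like the sketch below. It is only an illustration of scanning every primitive instead of just the first one; the struct and function names are hypothetical and are not the actual vbo module API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one draw within a multi-draw call. */
struct prim {
   unsigned start; /* first position in the index array */
   unsigned count; /* number of indices used by this primitive */
};

/* Compute the smallest and largest index referenced by all primitives.
 * With no indices at all, *min_index stays at ~0u and *max_index at 0. */
static void
min_max_index(const unsigned *indices, const struct prim *prims,
              size_t nr_prims, unsigned *min_index, unsigned *max_index)
{
   *min_index = ~0u;
   *max_index = 0;

   for (size_t p = 0; p < nr_prims; p++) {
      for (unsigned i = 0; i < prims[p].count; i++) {
         unsigned idx = indices[prims[p].start + i];

         if (idx < *min_index)
            *min_index = idx;
         if (idx > *max_index)
            *max_index = idx;
      }
   }
}
```

Looking only at the first primitive would miss any lower or higher index used by later primitives, which is the bug described in the quoted commit message.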


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer