[Mesa-dev] [Bug 86690] glmark2-es2-wayland shortly freezes on some frames with egl_dri2 backend (Nouveau/GK20A)

2014-11-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=86690

            Bug ID: 86690
           Summary: glmark2-es2-wayland shortly freezes on some frames
                    with egl_dri2 backend (Nouveau/GK20A)
           Product: Mesa
           Version: git
          Hardware: ARM
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: EGL/Wayland
          Assignee: wayland-b...@lists.freedesktop.org
          Reporter: gnu...@gmail.com
        QA Contact: mesa-dev@lists.freedesktop.org

When using the egl_dri2 driver, glmark2 will sometimes keep displaying the same
frame for up to 1/2 second before resuming normally. This also affects the
reported frame rate, which drops dramatically with each occurrence.

This only seems to happen for applications that set eglSwapInterval to 0 in
order to exceed the monitor frame rate. With applications that do not set
eglSwapInterval (like weston-simple-egl) or the (recently removed) egl_gallium
driver that also does not allow more than 60fps, the issue is not visible.

Relevant comments from Pekka Paalanen when discussing this on the mailing-list:

I have a hunch (wl_buffer.release not delivered in time, and client
side EGL running out of available buffers), but confirming that would
require a Wayland protocol dump up to such a hiccup. You could try to get
one by setting the environment variable WAYLAND_DEBUG=client for
glmark2. It will be a flood, especially if glmark2 succeeds in running
at uncapped framerates. The trace will come to stderr, so you want to
redirect that to a file. You need to find out where in the trace the
hiccup happened. The timestamps are in milliseconds. I could then take
a look (will need the whole trace).
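
Since the trace is a flood, locating the hiccup can be done mechanically by looking for a large jump between consecutive timestamps. A minimal sketch in C, assuming each trace line starts with a bracketed millisecond timestamp such as "[1234567.890]"; the helper names parse_timestamp_ms and find_first_gap are hypothetical and not part of Wayland or glmark2:

```c
#include <assert.h>
#include <stdio.h>

/* Parse the leading "[<milliseconds>]" timestamp of one trace line.
 * Returns 1 and stores the value on success, 0 if there is no timestamp.
 * ASSUMPTION: lines look like "[1234567.890]  -> wl_surface@7.commit()". */
static int parse_timestamp_ms(const char *line, double *out_ms)
{
    double ms;

    if (sscanf(line, " [%lf]", &ms) != 1)
        return 0;
    *out_ms = ms;
    return 1;
}

/* Return the index of the first line whose timestamp is more than
 * threshold_ms later than the previous timestamped line, or -1 if the
 * trace contains no such gap. Lines without a timestamp are skipped. */
static int find_first_gap(const char *const *lines, int count,
                          double threshold_ms)
{
    double prev = 0.0, cur;
    int have_prev = 0;

    for (int i = 0; i < count; i++) {
        if (!parse_timestamp_ms(lines[i], &cur))
            continue;
        if (have_prev && cur - prev > threshold_ms)
            return i;
        prev = cur;
        have_prev = 1;
    }
    return -1;
}
```

With a threshold of a few hundred milliseconds this would point at the first request or event logged after a freeze of the kind described here; the protocol messages just before that line are the interesting ones.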

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Performance regression on Tegra/GK20A since commit 363b53f00069

2014-11-25 Thread Alexandre Courbot

Hi Pekka,

On 11/20/2014 08:41 PM, Pekka Paalanen wrote:

On Thu, 20 Nov 2014 18:24:34 +0900
Alexandre Courbot acour...@nvidia.com wrote:


Hi Pekka,

On 11/19/2014 04:34 PM, Pekka Paalanen wrote:

On Wed, 19 Nov 2014 15:32:38 +0900
Alexandre Courbot acour...@nvidia.com wrote:


Some more information: CPU usage of the EGL app (glmark2 here) is much
higher when this patch is applied, which I presume is what triggers the
frame skips.

On 11/19/2014 03:05 PM, Alexandre Courbot wrote:

Hi guys,

I am seeing a severe performance regression (lots of frame drops when
running EGL apps in Weston) on Tegra/GK20A since the following commit:

commit 363b53f00069af718f64cf047f19ad5681a8bf6d
Author: Marek Olšák marek.ol...@amd.com
Date:   Sat Nov 1 14:31:09 2014 +0100

   egl: remove egl_gallium from the loader

Reverting said commit on top of master brings the expected performance
back. I am not knowledgeable enough about Mesa to speculate about the
reason, but could we try to investigate why this happens and how we
could fix this?


Hi,

that sounds like you used to get egl_gallium as the EGL driver, and
after that patch you get egl_dri2. These two have separate Wayland
platform implementations (and probably all other platforms as well?). I
think that might be a lead for investigation. EGL debug environment
variable (EGL_LOG_LEVEL=debug) should confirm which EGL driver gets
loaded. You can force the EGL driver with e.g. EGL_DRIVER=egl_dri2.


You are spot on, EGL_LOG_LEVEL revealed that I was using the egl_gallium
driver while this patch makes Wayland applications use egl_dri2. If I
set EGL_DRIVER=egl_gallium things go back to the expected behavior.



Note, that there are two different EGL platforms in play: DRM/GBM for
Weston, and Wayland for the app. Have you confirmed the problem is in
the app side? That is, does Weston itself run smoothly?


Weston seems to be fine, and since setting EGL_DRIVER=egl_gallium after
starting it brings things back to the previous behavior I believe we can
consider it is not part of this problem.


Agreed.


You say frame drops, how do you see that? Only on screen, or also in
the app performance profile? How's the average framerate for the app?


Looking at it again this is actually quite interesting. The misbehaving
app is glmark2, and what happens is that despite working nicely
otherwise, a given frame sometimes remains displayed for up to half a
second. Now looking at the framerates reported by glmark2, I noticed
that while they are capped at 60fps when using the gallium driver, the
numbers are much higher when using dri2 (which is nice!). Except when
these stuck frames happen, then the reported framerate drops
dramatically, indicating that the app itself is also blocked by this.


I have a hunch (wl_buffer.release not delivered in time, and client
side EGL running out of available buffers), but confirming that would
require a Wayland protocol dump up to such a hiccup. You could try to get
one by setting the environment variable WAYLAND_DEBUG=client for
glmark2. It will be a flood, especially if glmark2 succeeds in running
at uncapped framerates. The trace will come to stderr, so you want to
redirect that to a file. You need to find out where in the trace the
hiccup happened. The timestamps are in milliseconds. I could then take
a look (will need the whole trace).

At this point, I think it would be best to open a bug report against
Mesa, and continue there. Such freezes obviously should not happen on
either EGL driver. Please add me to CC on the bug.


Thanks, I have opened a bug against Mesa and added you to the CC list:

https://bugs.freedesktop.org/show_bug.cgi?id=86690

I will try to attach the WAYLAND_DEBUG trace you suggested to it.

Cheers,
Alex.


[Mesa-dev] [Bug 86690] glmark2-es2-wayland shortly freezes on some frames with egl_dri2 backend (Nouveau/GK20A)

2014-11-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=86690

--- Comment #1 from Alexandre Courbot gnu...@gmail.com ---
Created attachment 109990
  --> https://bugs.freedesktop.org/attachment.cgi?id=109990&action=edit
Trace when running glmark2 with WAYLAND_DEBUG=client

Attached the trace suggested by Pekka. The stuttering is visible almost
immediately after the bench started, and occurred regularly until the end of the
trace.



[Mesa-dev] [Bug 78914] [llvmpipe] Front/Backfaces do not get the same depth values when interpolated

2014-11-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=78914

--- Comment #21 from Florian Link florianl...@gmail.com ---
It is ok for me if you close this bug as WONTFIX, since I worked around the
depth fighting and I agree that it is a hard problem to do this with high
precision and speed at the same time.



Re: [Mesa-dev] Performance regression on Tegra/GK20A since commit 363b53f00069

2014-11-25 Thread Pekka Paalanen
On Tue, 25 Nov 2014 17:04:36 +0900
Alexandre Courbot acour...@nvidia.com wrote:

 Hi Pekka,
 
 On 11/20/2014 08:41 PM, Pekka Paalanen wrote:

  At this point, I think it would be best to open a bug report against
  Mesa, and continue there. Such freezes obviously should not happen on
  either EGL driver. Please add me to CC on the bug.
 
 Thanks, I have opened a bug against Mesa and added you to the CC list:
 
 https://bugs.freedesktop.org/show_bug.cgi?id=86690
 
 I will try to attach the WAYLAND_DEBUG trace you suggested to it.

Cool, thanks,
pq


[Mesa-dev] [Bug 71199] [llvmpipe] piglit gl-1.4-polygon-offset regression

2014-11-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=71199

--- Comment #8 from José Fonseca jfons...@vmware.com ---
Yes, I suspect that the ideal MRD at the near plane may never go below 2, because of
the near_1_far_infinity perspective transformation applied.

We could experiment to see whether an orthogonal transformation would allow us to
tease out an ideal MRD of 1.


Another solution would be to compute the minimum MRD, not from this complex
procedure, but from the glGetIntegerv(GL_DEPTH_BITS).



Re: [Mesa-dev] [PATCH 3/3] glx/dri3: Request non-vsynced Present for swapinterval zero.

2014-11-25 Thread Frank Binns

Hi,

I sent exactly the same patch back in April. Despite getting a Reviewed-by
it was never merged. See:

http://lists.freedesktop.org/archives/mesa-dev/2014-April/058347.html
http://lists.freedesktop.org/archives/mesa-dev/2014-May/060432.html

Thanks
Frank

On 25/11/14 03:00, Mario Kleiner wrote:

Restores proper immediate tearing swap behaviour for
OpenGL bufferswap under DRI3/Present.

Cc: 10.3 10.4 mesa-sta...@lists.freedesktop.org
Signed-off-by: Mario Kleiner mario.kleiner...@gmail.com
---
  src/glx/dri3_glx.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 5796491..c53be1b 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -1518,6 +1518,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor,
    xcb_connection_t *c = XGetXCBConnection(dpy);
    struct dri3_buffer *back;
    int64_t ret = 0;
+   uint32_t options = XCB_PRESENT_OPTION_NONE;
 
    unsigned flags = __DRI2_FLUSH_DRAWABLE;
    if (flush)
@@ -1557,6 +1558,9 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor,
       if (target_msc == 0)
          target_msc = priv->msc + priv->swap_interval * (priv->send_sbc - priv->recv_sbc);
 
+      if (priv->swap_interval == 0)
+         options |= XCB_PRESENT_OPTION_ASYNC;
+
       back->busy = 1;
       back->last_swap = priv->send_sbc;
       xcb_present_pixmap(c,
@@ -1570,7 +1574,7 @@ dri3_swap_buffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor,
                          None,                    /* target_crtc */
                          None,
                          back->sync_fence,
-                         XCB_PRESENT_OPTION_NONE,
+                         options,
                          target_msc,
                          divisor,
                          remainder, 0, NULL);

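The decision the patch adds can be isolated into a few lines. The following is only a sketch, not the Mesa code: the SKETCH_PRESENT_OPTION_* constants are stand-ins for XCB_PRESENT_OPTION_NONE and XCB_PRESENT_OPTION_ASYNC so that it builds without the XCB Present headers, and present_options_for_interval is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the XCB Present option flags (assumed values, used only
 * so the sketch is self-contained). */
enum sketch_present_option {
    SKETCH_PRESENT_OPTION_NONE  = 0,
    SKETCH_PRESENT_OPTION_ASYNC = 1 << 0,
};

/* Pick the Present options from the swap interval: an interval of 0 asks
 * for an immediate, possibly tearing, presentation; any other interval
 * keeps the default vsynced behaviour. */
static uint32_t present_options_for_interval(int swap_interval)
{
    uint32_t options = SKETCH_PRESENT_OPTION_NONE;

    if (swap_interval == 0)
        options |= SKETCH_PRESENT_OPTION_ASYNC;

    return options;
}
```

Building the flags in a variable rather than passing a constant keeps the xcb_present_pixmap() call site unchanged if further options are ever needed.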



Re: [Mesa-dev] [PATCH] util/hash_table: Rework the API to know about hashing

2014-11-25 Thread Kenneth Graunke
On Monday, November 24, 2014 10:19:50 PM Jason Ekstrand wrote:
 Previously, the hash_table API required the user to do all of the hashing
 of keys as it passed them in.  Since the hashing function is intrinsically
 tied to the comparison function, it makes sense for the hash table to know
 about it.  Also, it makes for a somewhat clumsy API as the user is
 constantly calling hashing functions many of which have long names.  This
 is especially bad when the standard call looks something like
 
 _mesa_hash_table_insert(ht, _mesa_pointer_hash(key), key, data);
 
 In the above case, there is no reason why the hash table shouldn't do the
 hashing for you.  We leave the option for you to do your own hashing if
 it's more efficient, but it's no longer needed.  Also, if you do do your
 own hashing, the hash table will assert that your hash matches what it
 expects out of the hashing function.  This should make it harder to mess up
 your hashing.
 
 Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com

(CC'ing Eric to make sure he sees your patch - it's his code after all)

Yes, please - the new API looks much nicer to work with.  This should be
especially nice in the compiler, where we have a lot of hash tables, and
aren't concerned with saving every last CPU cycle.

I like that there are still _with_hash variants, so you can still compute
the hash once and use it multiple times where that makes sense.
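
As a rough illustration of the API shape under discussion (a toy fixed-size table, not the util/hash_table code from the patch), the table below stores its hash and comparison functions at init time, offers a plain insert that hashes for the caller, and keeps a _with_hash variant that asserts the supplied hash is the expected one. All sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CAPACITY 64 /* fixed size, no resizing or deletion: sketch only */

struct sketch_entry {
    uint32_t hash;
    const void *key; /* NULL marks an empty slot, so keys must be non-NULL */
    void *data;
};

struct sketch_table {
    uint32_t (*key_hash)(const void *key);
    int (*key_equals)(const void *a, const void *b);
    struct sketch_entry slots[SKETCH_CAPACITY];
};

static void sketch_table_init(struct sketch_table *ht,
                              uint32_t (*key_hash)(const void *),
                              int (*key_equals)(const void *, const void *))
{
    memset(ht, 0, sizeof(*ht));
    ht->key_hash = key_hash;
    ht->key_equals = key_equals;
}

/* The caller supplies a pre-computed hash; the table checks that it matches
 * its own hash function. Returns 1 on success, 0 if the table is full. */
static int sketch_insert_with_hash(struct sketch_table *ht, uint32_t hash,
                                   const void *key, void *data)
{
    assert(hash == ht->key_hash(key));
    for (uint32_t i = 0; i < SKETCH_CAPACITY; i++) {
        struct sketch_entry *e = &ht->slots[(hash + i) % SKETCH_CAPACITY];
        if (e->key == NULL ||
            (e->hash == hash && ht->key_equals(e->key, key))) {
            e->hash = hash;
            e->key = key;
            e->data = data;
            return 1;
        }
    }
    return 0;
}

/* Common case: the table does the hashing for the caller. */
static int sketch_insert(struct sketch_table *ht, const void *key, void *data)
{
    return sketch_insert_with_hash(ht, ht->key_hash(key), key, data);
}

static void *sketch_search(const struct sketch_table *ht, const void *key)
{
    uint32_t hash = ht->key_hash(key);

    for (uint32_t i = 0; i < SKETCH_CAPACITY; i++) {
        const struct sketch_entry *e = &ht->slots[(hash + i) % SKETCH_CAPACITY];
        if (e->key == NULL)
            return NULL; /* hit an empty slot: key is not present */
        if (e->hash == hash && ht->key_equals(e->key, key))
            return e->data;
    }
    return NULL;
}

/* Example hash/equality pair for NUL-terminated strings (FNV-1a). */
static uint32_t sketch_string_hash(const void *key)
{
    uint32_t h = 2166136261u;

    for (const unsigned char *p = key; *p; p++) {
        h ^= *p;
        h *= 16777619u;
    }
    return h;
}

static int sketch_string_equals(const void *a, const void *b)
{
    return strcmp(a, b) == 0;
}
```

Because the hash and comparison functions are bound to the table together, a caller can no longer pair a key with a hash computed by the wrong function without tripping the assertion.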

--Ken



Re: [Mesa-dev] [PATCH 00/20] Auto-generate pack/unpack functions

2014-11-25 Thread Michel Dänzer

On 21.11.2014 23:57, Samuel Iglesias Gonsálvez wrote:

On Wed, 2014-11-19 at 17:09 +0900, Michel Dänzer wrote:

On 18.11.2014 17:43, Iago Toral Quiroga wrote:


For software drivers we worked with a trimmed set of piglit tests (related to
format conversion), ~5700 tests selected with the following filter:

-t format -t color -t tex -t image -t swizzle -t clamp -t rgb -t lum -t pix
-t fbo -t frame


Any particular reason for not testing at least piglit gpu.py with
llvmpipe? Last time I tried that a few months ago, it didn't take much
more than ten minutes on a quad-core A10-7850K.


gpu.py takes much more time on quad core i7-3610QM laptop with gallium
llvmpipe... After 6 hours waiting for piglit to finish, I started
killing the processes that took too much time.


I just tried it again with today's snapshots of LLVM SVN trunk and Mesa 
Git master, it took about 22.5 minutes. Did you really test gpu.py 
(which is a subset of quick.py), not all.py?



--
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[Mesa-dev] [PATCH] i965/Gen6-7: Do not replace texcoords with point coord if not drawing points

2014-11-25 Thread Chris Forbes
Fixes broken rendering in Windows-based QtQuick2 apps run through Wine.
This library sets all texture units' GL_COORD_REPLACE, leaves point
sprite mode enabled, and then draws a triangle fan.

Will need a slightly different fix for Gen4-5, but I don't have my old
machines in a usable state currently.

V2: - Simplify patch -- the real changes are no longer duplicated across
  the Gen6 and Gen7 atoms.
- Also don't clobber attr overrides -- which matters on Haswell too,
  and fixes the other half of the problem
- Fix newly-introduced warnings

Signed-off-by: Chris Forbes chr...@ijw.co.nz
Cc: 10.4 mesa-sta...@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84651
---
 src/mesa/drivers/dri/i965/gen6_sf_state.c | 54 ---
 src/mesa/drivers/dri/i965/gen7_sf_state.c |  3 +-
 2 files changed, 45 insertions(+), 12 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_sf_state.c b/src/mesa/drivers/dri/i965/gen6_sf_state.c
index 24d2754..778c6d3 100644
--- a/src/mesa/drivers/dri/i965/gen6_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_sf_state.c
@@ -129,6 +129,18 @@ get_attr_override(const struct brw_vue_map *vue_map, int urb_entry_read_offset,
 }
 
 
+static bool
+is_drawing_points(const struct brw_context *brw)
+{
+   /* Determine if the primitives *reaching the SF* are points */
+   const struct gl_context *ctx = &brw->ctx;
+   if (ctx->GeometryProgram._Current)
+      return ctx->GeometryProgram._Current->OutputType == GL_POINTS;
+   else
+      return brw->primitive == _3DPRIM_POINTLIST;
+}
+
+
 /**
  * Create the mapping from the FS inputs we produce to the previous pipeline
  * stage (GS or VS) outputs they source from.
@@ -149,6 +161,23 @@ calculate_attr_overrides(const struct brw_context *brw,
    /* _NEW_LIGHT */
    bool shade_model_flat = brw->ctx.Light.ShadeModel == GL_FLAT;
 
+   /* From the Ivybridge PRM, Vol 2 Part 1, 3DSTATE_SBE,
+* description of dw10 Point Sprite Texture Coordinate Enable:
+*
+* This field must be programmed to zero when non-point primitives
+* are rendered.
+*
+* The SandyBridge PRM doesn't explicitly say that point sprite enables
+* must be programmed to zero when rendering non-point primitives, but
+* the IvyBridge PRM does, and if we don't, we get garbage.
+*
+* This is not required on Haswell, as the hardware ignores this state
+* when drawing non-points -- although we do still need to be careful to
+* correctly set the attr overrides.
+*/
+   /* BRW_NEW_PRIMITIVE | _NEW_PROGRAM */
+   bool drawing_points = is_drawing_points(brw);
+
/* Initialize all the attr_overrides to 0.  In the loop below we'll modify
 * just the ones that correspond to inputs used by the fs.
 */
@@ -167,18 +196,20 @@ calculate_attr_overrides(const struct brw_context *brw,
 
       /* _NEW_POINT */
       bool point_sprite = false;
-      if (brw->ctx.Point.PointSprite &&
-          (attr >= VARYING_SLOT_TEX0 && attr <= VARYING_SLOT_TEX7) &&
-          brw->ctx.Point.CoordReplace[attr - VARYING_SLOT_TEX0]) {
-         point_sprite = true;
+      if (drawing_points) {
+         if (brw->ctx.Point.PointSprite &&
+             (attr >= VARYING_SLOT_TEX0 && attr <= VARYING_SLOT_TEX7) &&
+             brw->ctx.Point.CoordReplace[attr - VARYING_SLOT_TEX0]) {
+            point_sprite = true;
+         }
+
+         if (attr == VARYING_SLOT_PNTC)
+            point_sprite = true;
+
+         if (point_sprite)
+            *point_sprite_enables |= (1 << input_index);
       }
 
-      if (attr == VARYING_SLOT_PNTC)
-         point_sprite = true;
-
-      if (point_sprite)
-         *point_sprite_enables |= (1 << input_index);
-
       /* flat shading */
       if (interp_qualifier == INTERP_QUALIFIER_FLAT ||
           (shade_model_flat && is_gl_Color &&
@@ -411,7 +442,8 @@ const struct brw_tracked_state gen6_sf_state = {
 _NEW_MULTISAMPLE),
   .brw   = (BRW_NEW_CONTEXT |
BRW_NEW_FRAGMENT_PROGRAM |
-BRW_NEW_VUE_MAP_GEOM_OUT),
+BRW_NEW_VUE_MAP_GEOM_OUT |
+BRW_NEW_PRIMITIVE),
   .cache = CACHE_NEW_WM_PROG
},
.emit = upload_sf_state,
diff --git a/src/mesa/drivers/dri/i965/gen7_sf_state.c b/src/mesa/drivers/dri/i965/gen7_sf_state.c
index 109b825..371b9de 100644
--- a/src/mesa/drivers/dri/i965/gen7_sf_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sf_state.c
@@ -93,7 +93,8 @@ const struct brw_tracked_state gen7_sbe_state = {
_NEW_PROGRAM),
   .brw   = (BRW_NEW_CONTEXT |
BRW_NEW_FRAGMENT_PROGRAM |
-BRW_NEW_VUE_MAP_GEOM_OUT),
+BRW_NEW_VUE_MAP_GEOM_OUT |
+BRW_NEW_PRIMITIVE),
   .cache = CACHE_NEW_WM_PROG
},
.emit = upload_sbe_state,
-- 
2.1.3

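The core of the change, reduced to a stand-alone sketch: the point sprite enable bits are only computed when points are actually being drawn. The slot numbers, the sketch_point_state struct and the function name are simplified stand-ins invented for illustration, not the i965 definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical slot numbering; the real VARYING_SLOT_* values come from Mesa. */
enum sketch_slot {
    SKETCH_SLOT_COL0 = 0,
    SKETCH_SLOT_TEX0 = 4,
    SKETCH_SLOT_TEX7 = 11,
    SKETCH_SLOT_PNTC = 12,
};

struct sketch_point_state {
    bool point_sprite;     /* point sprite mode enabled */
    bool coord_replace[8]; /* GL_COORD_REPLACE, one flag per texture unit */
};

/* Return a bitmask with bit i set when fragment shader input i should be
 * replaced by the point coordinate. Nothing is enabled unless the
 * primitives being drawn are points. */
static uint32_t sketch_point_sprite_enables(const struct sketch_point_state *pt,
                                            bool drawing_points,
                                            const int *fs_inputs,
                                            int num_inputs)
{
    uint32_t enables = 0;

    if (!drawing_points)
        return 0;

    for (int i = 0; i < num_inputs; i++) {
        int attr = fs_inputs[i];
        bool replace = false;

        if (pt->point_sprite &&
            attr >= SKETCH_SLOT_TEX0 && attr <= SKETCH_SLOT_TEX7 &&
            pt->coord_replace[attr - SKETCH_SLOT_TEX0])
            replace = true;

        if (attr == SKETCH_SLOT_PNTC)
            replace = true;

        if (replace)
            enables |= 1u << i;
    }
    return enables;
}
```

In the scenario from the commit message (every unit has GL_COORD_REPLACE set, point sprites enabled, a triangle fan drawn) this returns 0, which is the behaviour the patch is after.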


Re: [Mesa-dev] [PATCH] i965/Gen6-7: Do not replace texcoords with point coord if not drawing points

2014-11-25 Thread Kenneth Graunke
On Tuesday, November 25, 2014 10:05:18 PM Chris Forbes wrote:
 Fixes broken rendering in Windows-based QtQuick2 apps run through Wine.
 This library sets all texture units' GL_COORD_REPLACE, leaves point
 sprite mode enabled, and then draws a triangle fan.
 
 Will need a slightly different fix for Gen4-5, but I don't have my old
 machines in a usable state currently.
 
 V2: - Simplify patch -- the real changes are no longer duplicated across
   the Gen6 and Gen7 atoms.
 - Also don't clobber attr overrides -- which matters on Haswell too,
   and fixes the other half of the problem
 - Fix newly-introduced warnings
 
 Signed-off-by: Chris Forbes chr...@ijw.co.nz
 Cc: 10.4 mesa-sta...@lists.freedesktop.org
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84651
 ---
  src/mesa/drivers/dri/i965/gen6_sf_state.c | 54 
 ---
  src/mesa/drivers/dri/i965/gen7_sf_state.c |  3 +-
  2 files changed, 45 insertions(+), 12 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/gen6_sf_state.c 
 b/src/mesa/drivers/dri/i965/gen6_sf_state.c
 index 24d2754..778c6d3 100644
 --- a/src/mesa/drivers/dri/i965/gen6_sf_state.c
 +++ b/src/mesa/drivers/dri/i965/gen6_sf_state.c
 @@ -129,6 +129,18 @@ get_attr_override(const struct brw_vue_map *vue_map, int 
 urb_entry_read_offset,
  }
  
  
 +static bool
 +is_drawing_points(const struct brw_context *brw)
 +{
 +   /* Determine if the primitives *reaching the SF* are points */
 +   const struct gl_context *ctx = &brw->ctx;

It doesn't seem like BRW_NEW_PRIMITIVE and BRW_NEW_VUE_MAP_GEOM_OUT
are enough to cover the use of OutputType.  I think you want:

  if (brw->geometry_program) {
      /* BRW_NEW_GEOMETRY_PROGRAM */
      return brw->geometry_program->OutputType == GL_POINTS;
  } else {
      /* BRW_NEW_PRIMITIVE */
      return brw->primitive == _3DPRIM_POINTLIST;
  }

and then include BRW_NEW_GEOMETRY_PROGRAM in the atom's dirty flags.

(Using brw->geometry_program should be equivalent to
ctx->GeometryProgram._Current, but we may as well use the brw copy for
consistency with the rest of the code.  It's also more obviously connected
to the BRW_NEW_GEOMETRY_PROGRAM flag.)
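
For readers without the i965 headers at hand, the suggested helper can be exercised in isolation. This is a sketch with mock types: sketch_brw_context and sketch_geometry_program are simplified stand-ins for the real structs, and the SKETCH_* constants are placeholders for GL_POINTS, GL_TRIANGLES and the _3DPRIM_* values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder constants; the real ones come from the GL and i965 headers. */
enum { SKETCH_GL_POINTS = 0x0000, SKETCH_GL_TRIANGLES = 0x0004 };
enum { SKETCH_3DPRIM_POINTLIST = 1, SKETCH_3DPRIM_TRIFAN = 6 };

struct sketch_geometry_program {
    int OutputType; /* primitive type emitted by the geometry shader */
};

struct sketch_brw_context {
    const struct sketch_geometry_program *geometry_program; /* NULL if no GS */
    int primitive; /* hardware primitive of the current draw */
};

/* Are the primitives reaching the SF stage points? With a geometry shader
 * bound, its output type decides; otherwise the draw primitive does. */
static bool sketch_is_drawing_points(const struct sketch_brw_context *brw)
{
    if (brw->geometry_program) {
        /* reads geometry program state: BRW_NEW_GEOMETRY_PROGRAM */
        return brw->geometry_program->OutputType == SKETCH_GL_POINTS;
    } else {
        /* reads the draw primitive: BRW_NEW_PRIMITIVE */
        return brw->primitive == SKETCH_3DPRIM_POINTLIST;
    }
}
```

Each branch reads exactly one piece of state, which is why both corresponding dirty flags need to be listed on any atom that calls the helper.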

 +   if (ctx->GeometryProgram._Current)
 +      return ctx->GeometryProgram._Current->OutputType == GL_POINTS;
 +   else
 +      return brw->primitive == _3DPRIM_POINTLIST;
 +}
 +
 +
  /**
   * Create the mapping from the FS inputs we produce to the previous pipeline
   * stage (GS or VS) outputs they source from.
 @@ -149,6 +161,23 @@ calculate_attr_overrides(const struct brw_context *brw,
     /* _NEW_LIGHT */
     bool shade_model_flat = brw->ctx.Light.ShadeModel == GL_FLAT;
  
 +   /* From the Ivybridge PRM, Vol 2 Part 1, 3DSTATE_SBE,
 +* description of dw10 Point Sprite Texture Coordinate Enable:
 +*
 +* This field must be programmed to zero when non-point primitives
 +* are rendered.
 +*
 +* The SandyBridge PRM doesn't explicitly say that point sprite enables
 +* must be programmed to zero when rendering non-point primitives, but
 +* the IvyBridge PRM does, and if we don't, we get garbage.
 +*
 +* This is not required on Haswell, as the hardware ignores this state
 +* when drawing non-points -- although we do still need to be careful to
 +* correctly set the attr overrides.
 +*/
 +   /* BRW_NEW_PRIMITIVE | _NEW_PROGRAM */
 +   bool drawing_points = is_drawing_points(brw);
 +
 /* Initialize all the attr_overrides to 0.  In the loop below we'll modify
  * just the ones that correspond to inputs used by the fs.
  */
 @@ -167,18 +196,20 @@ calculate_attr_overrides(const struct brw_context *brw,
  
        /* _NEW_POINT */
        bool point_sprite = false;
 -      if (brw->ctx.Point.PointSprite &&
 -          (attr >= VARYING_SLOT_TEX0 && attr <= VARYING_SLOT_TEX7) &&
 -          brw->ctx.Point.CoordReplace[attr - VARYING_SLOT_TEX0]) {
 -         point_sprite = true;
 +      if (drawing_points) {
 +         if (brw->ctx.Point.PointSprite &&
 +             (attr >= VARYING_SLOT_TEX0 && attr <= VARYING_SLOT_TEX7) &&
 +             brw->ctx.Point.CoordReplace[attr - VARYING_SLOT_TEX0]) {
 +            point_sprite = true;
 +         }
 +
 +         if (attr == VARYING_SLOT_PNTC)
 +            point_sprite = true;
 +
 +         if (point_sprite)
 +            *point_sprite_enables |= (1 << input_index);
        }
  
 -      if (attr == VARYING_SLOT_PNTC)
 -         point_sprite = true;
 -
 -      if (point_sprite)
 -         *point_sprite_enables |= (1 << input_index);
 -
        /* flat shading */
        if (interp_qualifier == INTERP_QUALIFIER_FLAT ||
            (shade_model_flat && is_gl_Color &&
 @@ -411,7 +442,8 @@ const struct brw_tracked_state gen6_sf_state = {
  _NEW_MULTISAMPLE),
.brw   = (BRW_NEW_CONTEXT |
   BRW_NEW_FRAGMENT_PROGRAM |

Add BRW_NEW_GEOMETRY_PROGRAM here, and please add them in alphabetical order.

 -BRW_NEW_VUE_MAP_GEOM_OUT),
 +

Re: [Mesa-dev] [PATCH 00/20] Auto-generate pack/unpack functions

2014-11-25 Thread Samuel Iglesias Gonsálvez


On 25/11/14 09:59, Michel Dänzer wrote:
 On 21.11.2014 23:57, Samuel Iglesias Gonsálvez wrote:
 On Wed, 2014-11-19 at 17:09 +0900, Michel Dänzer wrote:
 On 18.11.2014 17:43, Iago Toral Quiroga wrote:

 For software drivers we worked with a trimmed set of piglit tests
 (related to
 format conversion), ~5700 tests selected with the following filter:

 -t format -t color -t tex -t image -t swizzle -t clamp -t rgb -t lum
 -t pix
 -t fbo -t frame

 Any particular reason for not testing at least piglit gpu.py with
 llvmpipe? Last time I tried that a few months ago, it didn't take much
 more than ten minutes on a quad-core A10-7850K.

 gpu.py takes much more time on quad core i7-3610QM laptop with gallium
 llvmpipe... After 6 hours waiting for piglit to finish, I started
 killing the processes that took too much time.
 
 I just tried it again with today's snapshots of LLVM SVN trunk and Mesa
 Git master, it took about 22.5 minutes. Did you really test gpu.py
 (which is a subset of quick.py), not all.py?
 
 

Yes, I tried with gpu.py but with an old LLVM version. I will test again
with LLVM SVN trunk and the latest Mesa Git master.

Sam


Re: [Mesa-dev] [PATCH] i965/Gen6-7: Do not replace texcoords with point coord if not drawing points

2014-11-25 Thread Chris Forbes
Changes done and pushed.

On Tue, Nov 25, 2014 at 10:17 PM, Kenneth Graunke kenn...@whitecape.org wrote:
 On Tuesday, November 25, 2014 10:05:18 PM Chris Forbes wrote:
 Fixes broken rendering in Windows-based QtQuick2 apps run through Wine.
 This library sets all texture units' GL_COORD_REPLACE, leaves point
 sprite mode enabled, and then draws a triangle fan.

 Will need a slightly different fix for Gen4-5, but I don't have my old
 machines in a usable state currently.

 V2: - Simplify patch -- the real changes are no longer duplicated across
   the Gen6 and Gen7 atoms.
 - Also don't clobber attr overrides -- which matters on Haswell too,
   and fixes the other half of the problem
 - Fix newly-introduced warnings

 Signed-off-by: Chris Forbes chr...@ijw.co.nz
 Cc: 10.4 mesa-sta...@lists.freedesktop.org
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84651
 ---
  src/mesa/drivers/dri/i965/gen6_sf_state.c | 54 
 ---
  src/mesa/drivers/dri/i965/gen7_sf_state.c |  3 +-
  2 files changed, 45 insertions(+), 12 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/gen6_sf_state.c 
 b/src/mesa/drivers/dri/i965/gen6_sf_state.c
 index 24d2754..778c6d3 100644
 --- a/src/mesa/drivers/dri/i965/gen6_sf_state.c
 +++ b/src/mesa/drivers/dri/i965/gen6_sf_state.c
 @@ -129,6 +129,18 @@ get_attr_override(const struct brw_vue_map *vue_map, 
 int urb_entry_read_offset,
  }


 +static bool
 +is_drawing_points(const struct brw_context *brw)
 +{
 +   /* Determine if the primitives *reaching the SF* are points */
 +   const struct gl_context *ctx = &brw->ctx;

 It doesn't seem like BRW_NEW_PRIMITIVE and BRW_NEW_VUE_MAP_GEOM_OUT
 are enough to cover the use of OutputType.  I think you want:

   if (brw->geometry_program) {
       /* BRW_NEW_GEOMETRY_PROGRAM */
       return brw->geometry_program->OutputType == GL_POINTS;
   } else {
       /* BRW_NEW_PRIMITIVE */
       return brw->primitive == _3DPRIM_POINTLIST;
   }

 and then include BRW_NEW_GEOMETRY_PROGRAM in the atom's dirty flags.

 (Using brw->geometry_program should be equivalent to
 ctx->GeometryProgram._Current, but we may as well use the brw copy for
 consistency with the rest of the code.  It's also more obviously connected
 to the BRW_NEW_GEOMETRY_PROGRAM flag.)

 +   if (ctx->GeometryProgram._Current)
 +      return ctx->GeometryProgram._Current->OutputType == GL_POINTS;
 +   else
 +      return brw->primitive == _3DPRIM_POINTLIST;
 +}
 +
 +
  /**
   * Create the mapping from the FS inputs we produce to the previous pipeline
   * stage (GS or VS) outputs they source from.
 @@ -149,6 +161,23 @@ calculate_attr_overrides(const struct brw_context *brw,
     /* _NEW_LIGHT */
     bool shade_model_flat = brw->ctx.Light.ShadeModel == GL_FLAT;

 +   /* From the Ivybridge PRM, Vol 2 Part 1, 3DSTATE_SBE,
 +* description of dw10 Point Sprite Texture Coordinate Enable:
 +*
 +* This field must be programmed to zero when non-point primitives
 +* are rendered.
 +*
 +* The SandyBridge PRM doesn't explicitly say that point sprite enables
 +* must be programmed to zero when rendering non-point primitives, but
 +* the IvyBridge PRM does, and if we don't, we get garbage.
 +*
 +* This is not required on Haswell, as the hardware ignores this state
 +* when drawing non-points -- although we do still need to be careful to
 +* correctly set the attr overrides.
 +*/
 +   /* BRW_NEW_PRIMITIVE | _NEW_PROGRAM */
 +   bool drawing_points = is_drawing_points(brw);
 +
 /* Initialize all the attr_overrides to 0.  In the loop below we'll 
 modify
  * just the ones that correspond to inputs used by the fs.
  */
 @@ -167,18 +196,20 @@ calculate_attr_overrides(const struct brw_context *brw,

        /* _NEW_POINT */
        bool point_sprite = false;
 -      if (brw->ctx.Point.PointSprite &&
 -          (attr >= VARYING_SLOT_TEX0 && attr <= VARYING_SLOT_TEX7) &&
 -          brw->ctx.Point.CoordReplace[attr - VARYING_SLOT_TEX0]) {
 -         point_sprite = true;
 +      if (drawing_points) {
 +         if (brw->ctx.Point.PointSprite &&
 +             (attr >= VARYING_SLOT_TEX0 && attr <= VARYING_SLOT_TEX7) &&
 +             brw->ctx.Point.CoordReplace[attr - VARYING_SLOT_TEX0]) {
 +            point_sprite = true;
 +         }
 +
 +         if (attr == VARYING_SLOT_PNTC)
 +            point_sprite = true;
 +
 +         if (point_sprite)
 +            *point_sprite_enables |= (1 << input_index);
        }
  
 -      if (attr == VARYING_SLOT_PNTC)
 -         point_sprite = true;
 -
 -      if (point_sprite)
 -         *point_sprite_enables |= (1 << input_index);
 -
        /* flat shading */
        if (interp_qualifier == INTERP_QUALIFIER_FLAT ||
            (shade_model_flat && is_gl_Color &&
 @@ -411,7 +442,8 @@ const struct brw_tracked_state gen6_sf_state = {
  _NEW_MULTISAMPLE),
.brw   = (BRW_NEW_CONTEXT |
   BRW_NEW_FRAGMENT_PROGRAM |

 Add BRW_NEW_GEOMETRY_PROGRAM here, and 

Re: [Mesa-dev] [PATCH 00/20] Auto-generate pack/unpack functions

2014-11-25 Thread Jose Fonseca

On 25/11/14 09:43, Samuel Iglesias Gonsálvez wrote:



On 25/11/14 09:59, Michel Dänzer wrote:

On 21.11.2014 23:57, Samuel Iglesias Gonsálvez wrote:

On Wed, 2014-11-19 at 17:09 +0900, Michel Dänzer wrote:

On 18.11.2014 17:43, Iago Toral Quiroga wrote:


For software drivers we worked with a trimmed set of piglit tests
(related to
format conversion), ~5700 tests selected with the following filter:

-t format -t color -t tex -t image -t swizzle -t clamp -t rgb -t lum
-t pix
-t fbo -t frame


Any particular reason for not testing at least piglit gpu.py with
llvmpipe? Last time I tried that a few months ago, it didn't take much
more than ten minutes on a quad-core A10-7850K.


gpu.py takes much more time on quad core i7-3610QM laptop with gallium
llvmpipe... After 6 hours waiting for piglit to finish, I started
killing the processes that took too much time.


I just tried it again with today's snapshots of LLVM SVN trunk and Mesa
Git master, it took about 22.5 minutes. Did you really test gpu.py
(which is a subset of quick.py), not all.py?




Yes, I tried with gpu.py but with an old LLVM version. I will test again
with LLVM SVN trunk and the latest Mesa Git master.



Try the new llvmpipe.py test-list instead of gpu.py.  It should skip 
the problematic tests.


Jose




[Mesa-dev] [PATCH] mesa/glsl/glapi: enable GL_EXT_draw_buffers extension

2014-11-25 Thread Tapani Pälli
This patch enables an ES2 extension that reuses existing ES3 functionality.

With these changes, all subtests of the WebGL conformance test
'webgl-draw-buffers' run and pass when running Chrome on OpenGL ES.

Signed-off-by: Tapani Pälli tapani.pa...@intel.com
---
 src/glsl/glcpp/glcpp-parse.y| 2 ++
 src/glsl/glsl_parser_extras.cpp | 1 +
 src/glsl/glsl_parser_extras.h   | 2 ++
 src/mapi/glapi/gen/es_EXT.xml   | 9 +
 src/mesa/main/extensions.c  | 1 +
 src/mesa/main/mtypes.h  | 1 +
 6 files changed, 16 insertions(+)

diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
index f1119eb..414f4df 100644
--- a/src/glsl/glcpp/glcpp-parse.y
+++ b/src/glsl/glcpp/glcpp-parse.y
@@ -2380,6 +2380,8 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio
                 add_builtin_define(parser, "GL_OES_EGL_image_external", 1);
              if (extensions->OES_standard_derivatives)
                 add_builtin_define(parser, "GL_OES_standard_derivatives", 1);
+             if (extensions->EXT_draw_buffers)
+                add_builtin_define(parser, "GL_EXT_draw_buffers", 1);
           }
        } else {
           add_builtin_define(parser, "GL_ARB_draw_buffers", 1);
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 27e3301..2d49c3e 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -556,6 +556,7 @@ static const _mesa_glsl_extension _mesa_glsl_supported_extensions[] = {
    EXT(AMD_shader_trinary_minmax,        true,  false, dummy_true),
    EXT(AMD_vertex_shader_layer,          true,  false, AMD_vertex_shader_layer),
    EXT(AMD_vertex_shader_viewport_index, true,  false, AMD_vertex_shader_viewport_index),
+   EXT(EXT_draw_buffers,                 false, true,  dummy_true),
    EXT(EXT_separate_shader_objects,      false, true,  dummy_true),
    EXT(EXT_shader_integer_mix,           true,  true,  EXT_shader_integer_mix),
    EXT(EXT_texture_array,                true,  false, EXT_texture_array),
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index c14d74c..7a13875 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -470,6 +470,8 @@ struct _mesa_glsl_parse_state {
bool AMD_vertex_shader_layer_warn;
bool AMD_vertex_shader_viewport_index_enable;
bool AMD_vertex_shader_viewport_index_warn;
+   bool EXT_draw_buffers_enable;
+   bool EXT_draw_buffers_warn;
bool EXT_separate_shader_objects_enable;
bool EXT_separate_shader_objects_warn;
bool EXT_shader_integer_mix_enable;
diff --git a/src/mapi/glapi/gen/es_EXT.xml b/src/mapi/glapi/gen/es_EXT.xml
index e2dc390..3a2adeb 100644
--- a/src/mapi/glapi/gen/es_EXT.xml
+++ b/src/mapi/glapi/gen/es_EXT.xml
@@ -837,4 +837,13 @@
     </function>
 </category>
 
+<!-- 151. GL_EXT_draw_buffers -->
+<category name="GL_EXT_draw_buffers" number="151">
+    <function name="DrawBuffersEXT" alias="DrawBuffers"
+              static_dispatch="false" es2="2.0">
+        <param name="n" type="GLsizei" counter="true"/>
+        <param name="bufs" type="const GLenum *" count="n"/>
+    </function>
+</category>
+
 </OpenGLAPI>
diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index 0df04c2..3b206bf 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -212,6 +212,7 @@ static const struct extension extension_table[] = {
    { "GL_EXT_compiled_vertex_array",   o(dummy_true),            GLL, 1996 },
    { "GL_EXT_copy_texture",            o(dummy_true),            GLL, 1995 },
    { "GL_EXT_depth_bounds_test",       o(EXT_depth_bounds_test), GL,  2002 },
+   { "GL_EXT_draw_buffers",            o(dummy_true),            ES2, 2012 },
    { "GL_EXT_draw_buffers2",           o(EXT_draw_buffers2),     GL,  2006 },
    { "GL_EXT_draw_instanced",          o(ARB_draw_instanced),    GL,  2006 },
    { "GL_EXT_draw_range_elements",     o(dummy_true),            GLL, 1997 },
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 7389baa..c798737 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3785,6 +3785,7 @@ struct gl_extensions
GLboolean EXT_blend_func_separate;
GLboolean EXT_blend_minmax;
GLboolean EXT_depth_bounds_test;
+   GLboolean EXT_draw_buffers;
GLboolean EXT_draw_buffers2;
GLboolean EXT_framebuffer_multisample;
GLboolean EXT_framebuffer_multisample_blit_scaled;
-- 
1.9.3



Re: [Mesa-dev] [PATCH 09/11] st/nine: Rework queries

2014-11-25 Thread Marek Olšák
Calling pipe-flush is unnecessary before get_query_result, because
get_query_result flushes automatically if it has to (at least on
radeon).

Marek

On Sun, Nov 23, 2014 at 11:40 PM, David Heidelberg da...@ixit.cz wrote:
 From: Axel Davy axel.d...@ens.fr

 From this moment we should handle errors same way as Wine does.

 Original patch from John Ettedgui john.etted...@gmail.com

 Cc: 10.4 mesa-sta...@lists.freedesktop.org
 Tested-by: David Heidelberg da...@ixit.cz
 Signed-off-by: Axel Davy axel.d...@ens.fr
 ---
  src/gallium/state_trackers/nine/query9.c | 66 +---
  1 file changed, 44 insertions(+), 22 deletions(-)

 diff --git a/src/gallium/state_trackers/nine/query9.c b/src/gallium/state_trackers/nine/query9.c
 index 908420c..34dfec7 100644
 --- a/src/gallium/state_trackers/nine/query9.c
 +++ b/src/gallium/state_trackers/nine/query9.c
 @@ -123,6 +123,15 @@ NineQuery9_ctor( struct NineQuery9 *This,
          if (!This->pq)
              return E_OUTOFMEMORY;
      } else {
 +        /* we have a fallback when app create a query that is
 +           not supported. Wine has different behaviour. It won't fill the
 +           pointer with a valid NineQuery9, but let it NULL and return error.
 +           However even if driver doesn't support D3DQUERYTYPE_EVENT, it
 +           will say it is supported and have a fallback for it. Since we
 +           support more queries than wine we may hit different rendering paths
 +           than it, so perhaps these fallbacks are required.
 +           TODO: someone with a lot of different games should try to see
 +           if these dummy queries are needed. */
          DBG("Returning dummy NineQuery9 for %s.\n",
              nine_D3DQUERYTYPE_to_str(Type));
      }
 @@ -174,10 +183,15 @@ NineQuery9_Issue( struct NineQuery9 *This,

      DBG("This=%p dwIssueFlags=%d\n", This, dwIssueFlags);

 -    user_assert((dwIssueFlags == D3DISSUE_BEGIN && !This->instant) ||
 +    user_assert((dwIssueFlags == D3DISSUE_BEGIN) ||
                  (dwIssueFlags == 0) ||
                  (dwIssueFlags == D3DISSUE_END), D3DERR_INVALIDCALL);

 +    /* Wine tests: always return D3D_OK on D3DISSUE_BEGIN
 +     * even when the call is supposed to be forbidden */
 +    if (dwIssueFlags == D3DISSUE_BEGIN && This->instant)
 +        return D3D_OK;
 +
      if (!This->pq) {
          DBG("Issued dummy query.\n");
          return D3D_OK;
 @@ -185,15 +199,17 @@ NineQuery9_Issue( struct NineQuery9 *This,

      if (dwIssueFlags == D3DISSUE_BEGIN) {
          if (This->state == NINE_QUERY_STATE_RUNNING) {
 -           pipe->end_query(pipe, This->pq);
 -       }
 +            pipe->end_query(pipe, This->pq);
 +        }
          pipe->begin_query(pipe, This->pq);
          This->state = NINE_QUERY_STATE_RUNNING;
      } else {
 -        if (This->state == NINE_QUERY_STATE_RUNNING) {
 -            pipe->end_query(pipe, This->pq);
 -            This->state = NINE_QUERY_STATE_ENDED;
 -       }
 +        if (This->state != NINE_QUERY_STATE_RUNNING &&
 +            This->type != D3DQUERYTYPE_EVENT &&
 +            This->type != D3DQUERYTYPE_TIMESTAMP)
 +            pipe->begin_query(pipe, This->pq);
 +        pipe->end_query(pipe, This->pq);
 +        This->state = NINE_QUERY_STATE_ENDED;
      }
      return D3D_OK;
  }
 @@ -220,7 +236,7 @@ NineQuery9_GetData( struct NineQuery9 *This,
                      DWORD dwGetDataFlags )
  {
      struct pipe_context *pipe = This->base.device->pipe;
 -    boolean ok = !This->pq;
 +    boolean ok, should_flush, should_wait;
      unsigned i;
      union pipe_query_result presult;
      union nine_query_result nresult;
 @@ -235,22 +251,28 @@ NineQuery9_GetData( struct NineQuery9 *This,

      if (!This->pq) {
          DBG("No pipe query available.\n");
 -        if (!dwSize)
 -           return S_OK;
 -    }
 -    if (This->state == NINE_QUERY_STATE_FRESH)
 -        return S_OK;
 +    } else {
 +        should_flush = dwGetDataFlags && This->state != NINE_QUERY_STATE_FLUSHED;
 +        /* Wine tests: D3DQUERYTYPE_TIMESTAMP always succeeds
 +         * directly when flushed */
 +        should_wait = dwGetDataFlags && This->type == D3DQUERYTYPE_TIMESTAMP;
 +
 +        if (This->state == NINE_QUERY_STATE_FRESH) {
 +            /* App forgot issue the request. Be nice and issue it. */
 +            (void) NineQuery9_Issue(This, D3DISSUE_END);
 +            /* Wine tests: we have to succeed. */
 +            should_flush = TRUE;
 +            should_wait = TRUE;
 +        }

 -    if (!ok) {
 -        ok = pipe->get_query_result(pipe, This->pq, FALSE, &presult);
 -        if (!ok) {
 -            if (dwGetDataFlags) {
 -                if (This->state != NINE_QUERY_STATE_FLUSHED)
 -                    pipe->flush(pipe, NULL, 0);
 -                This->state = NINE_QUERY_STATE_FLUSHED;
 -            }
 -            return S_FALSE;
 +        if (should_flush) {
 +            pipe->flush(pipe, NULL, 0);
 +            This->state = NINE_QUERY_STATE_FLUSHED;
          }
 +
 +ok 

Re: [Mesa-dev] [PATCH 09/11] st/nine: Rework queries

2014-11-25 Thread Marek Olšák
BTW, radeon flushes in get_query_result even if wait is FALSE,
therefore this loop will always finish:

while (!get_query_result(...));

I think it's a requirement for OpenGL.
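
The polling pattern above can be illustrated with a small self-contained
sketch. Everything here (struct fake_query, fake_get_query_result,
poll_query) is a hypothetical stand-in and not the Gallium interface; it
only assumes the behaviour described in this mail, namely that a
non-waiting result call flushes on its own when it has to, so the loop
eventually terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver query object (not a Gallium type). */
struct fake_query {
   bool flushed;          /* has the work been submitted yet? */
   int polls_until_done;  /* polls still needed before the result is ready */
   unsigned result;       /* value reported once the query has finished */
};

/* Mimics the described behaviour: even with wait == false the call
 * flushes implicitly if needed, so repeated polling makes progress. */
static bool
fake_get_query_result(struct fake_query *q, bool wait, unsigned *out)
{
   assert(q != NULL && out != NULL);
   if (!q->flushed)
      q->flushed = true;          /* implicit flush */
   if (wait)
      q->polls_until_done = 0;    /* a blocking call finishes immediately */
   if (q->polls_until_done > 0) {
      q->polls_until_done--;
      return false;               /* result not available yet */
   }
   *out = q->result;
   return true;
}

/* Busy-polls without waiting; returns how many calls were needed. */
static int
poll_query(struct fake_query *q, unsigned *out)
{
   int polls = 1;
   while (!fake_get_query_result(q, false, out))
      polls++;
   return polls;
}
```

If the implicit flush were missing, the same loop could spin forever on
work that was never submitted, which is the case the explicit pipe->flush
in the patch guards against.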

Marek

On Tue, Nov 25, 2014 at 12:42 PM, Marek Olšák mar...@gmail.com wrote:
 Calling pipe-flush is unnecessary before get_query_result, because
 get_query_result flushes automatically if it has to (at least on
 radeon).

 Marek


[Mesa-dev] [PATCH] glsl: check if implicitly sized arrays dont match explicitly sized arrays across the same stage

2014-11-25 Thread Timothy Arceri
Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
---
 src/glsl/linker.cpp | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index de6b1fb..a3a43a0 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -732,8 +732,25 @@ cross_validate_globals(struct gl_shader_program *prog,
                && ((var->type->length == 0)
                    || (existing->type->length == 0))) {
                if (var->type->length != 0) {
+                  if (var->type->length <= existing->data.max_array_access) {
+                     linker_error(prog, "%s `%s' declared as type "
+                                  "`%s' and type `%s'\n",
+                                  mode_string(var),
+                                  var->name, var->type->name,
+                                  existing->type->name);
+                     return;
+                  }
                   existing->type = var->type;
-               }
+               } else if (existing->type->length != 0
+                          && existing->type->length <=
+                             var->data.max_array_access) {
+                  linker_error(prog, "%s `%s' declared as type "
+                               "`%s' and type `%s'\n",
+                               mode_string(var),
+                               var->name, existing->type->name,
+                               var->type->name);
+                  return;
+               }
             } else if (var->type->is_record()
                && existing->type->is_record()
                && existing->type->record_compare(var->type)) {
-- 
1.9.3



[Mesa-dev] [Bug 86690] glmark2-es2-wayland shortly freezes on some frames with egl_dri2 backend (Nouveau/GK20A)

2014-11-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=86690

--- Comment #2 from Pekka Paalanen ppaala...@gmail.com ---
(In reply to Alexandre Courbot from comment #1)
 Created attachment 109990 [details]
 Trace when running glmark2 with WAYLAND_DEBUG=client
 
 Attached the trace suggested by Pekka. The stuttering is visible almost
 immediately after the bench started, and occurred regularly until the end of
 the trace.

Ok, here's a piece of what I detected as delays > 50ms in the protocol dump
(timestamp, skip in ms):
2007458 55
2007628 56
2008779 62
2008926 142
2009104 128
2009638 529
2009733 84
2012659 105
2012769 62
2012838 64

It looks quite random in both occurrence and length. I checked the first six,
and the delay always happens between damage and commit requests. Due to how
libwayland-client prints these, these timestamps are when Mesa calls the
functions wl_surface_damage and wl_surface_commit.
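
The delay detection described above can be sketched as follows. This is
a minimal illustration, assuming the millisecond timestamps have already
been extracted from the WAYLAND_DEBUG output into an array; the names
struct stall and find_stalls are made up for this example, and reporting
the timestamp of the message that follows the gap is an assumption about
how the table was produced.

```c
#include <assert.h>
#include <stddef.h>

/* One detected stall: where it was seen and how long it lasted. */
struct stall {
   unsigned long timestamp_ms;  /* timestamp of the message after the gap */
   unsigned long gap_ms;        /* time elapsed since the previous message */
};

/* Scans sorted timestamps and records every gap larger than threshold_ms.
 * Returns the number of stalls stored in out (at most out_cap). */
static size_t
find_stalls(const unsigned long *ts, size_t n, unsigned long threshold_ms,
            struct stall *out, size_t out_cap)
{
   size_t found = 0;

   assert(ts != NULL || n == 0);
   for (size_t i = 1; i < n; i++) {
      unsigned long gap = ts[i] - ts[i - 1];
      if (gap > threshold_ms && found < out_cap) {
         out[found].timestamp_ms = ts[i];
         out[found].gap_ms = gap;
         found++;
      }
   }
   return found;
}
```

Run with a threshold of 50 over the full trace, a scan of this kind would
produce a (timestamp, skip) table like the one above.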

In src/egl/drivers/dri2/platform_wayland.c (egl_dri2), in
dri2_wl_swap_buffers_with_damage() you see these function calls. The things
between them are flush and invalidate. One of these probably causes the stall.

If we look at egl_gallium in
src/gallium/state_trackers/egl/wayland/native_wayland.c, well, the code is
different. Swapbuffers and present seem to be two different hooks.

I'm not familiar enough with this code to say what the actual difference is.
Maybe egl_dri2 allows queueing many frames' worth of drawing to the GPU and at
some point the driver simply has to wait for it all to drain? And egl_gallium
does something slightly different?

Oh right, you said egl_gallium didn't respect swapinterval=0 but stayed
vsynced. That explains why the drawing never accumulates. Looks like
wayland_surface_swap_buffers() always waits for the frame callback, which
causes exactly the throttling to vsync rate.

I believe this would be a case of queueing too much GPU work and eventually
having to wait for it to drain. I'm not sure how or where you would start
fixing it. Buffer re-use could also be a factor.

I am fairly sure this is not a Weston bug. It seems to be swapping between the
two wl_buffers just fine.



[Mesa-dev] [PATCH 03/10] i965: Fold the gen7_cc_viewport_state_pointer atom into brw_cc_vp.

2014-11-25 Thread Kenneth Graunke
These always happen together; the extra atom just means another item to
iterate through, flags to check, and a call through a function pointer.
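
The cost being removed can be sketched with a toy version of an atom
list. The names below (toy_context, toy_atom, run_atoms, DIRTY_VIEWPORT)
are hypothetical and far simpler than the real brw_tracked_state
machinery; the sketch only shows that folding two atoms that always fire
together into one emits the same packets with one fewer list entry, flag
check and indirect call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIRTY_VIEWPORT (1u << 0)   /* hypothetical dirty bit */

struct toy_context {
   uint32_t dirty;   /* pending dirty bits */
   int packets;      /* number of "packets" emitted so far */
};

struct toy_atom {
   uint32_t dirty_mask;                    /* bits this atom reacts to */
   void (*emit)(struct toy_context *ctx);  /* one indirect call per atom */
};

/* Split version: upload the viewport, then emit its pointer separately. */
static void emit_vp(struct toy_context *ctx)          { ctx->packets++; }
static void emit_vp_pointer(struct toy_context *ctx)  { ctx->packets++; }

/* Folded version: one atom does both steps. */
static void emit_vp_combined(struct toy_context *ctx) { ctx->packets += 2; }

/* Walks the atom list and returns how many emit callbacks were invoked. */
static int
run_atoms(struct toy_context *ctx, const struct toy_atom *atoms, size_t n)
{
   int calls = 0;

   assert(ctx != NULL && atoms != NULL);
   for (size_t i = 0; i < n; i++) {
      if (atoms[i].dirty_mask & ctx->dirty) {
         atoms[i].emit(ctx);
         calls++;
      }
   }
   return calls;
}
```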

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_cc.c  |  9 -
 src/mesa/drivers/dri/i965/brw_state.h   |  1 -
 src/mesa/drivers/dri/i965/brw_state_upload.c|  2 --
 src/mesa/drivers/dri/i965/gen7_viewport_state.c | 19 ---
 4 files changed, 8 insertions(+), 23 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cc.c b/src/mesa/drivers/dri/i965/brw_cc.c
index da72995..86ab503 100644
--- a/src/mesa/drivers/dri/i965/brw_cc.c
+++ b/src/mesa/drivers/dri/i965/brw_cc.c
@@ -62,7 +62,14 @@ brw_upload_cc_vp(struct brw_context *brw)
       }
    }
 
-   brw->state.dirty.cache |= CACHE_NEW_CC_VP;
+   if (brw->gen >= 7) {
+      BEGIN_BATCH(2);
+      OUT_BATCH(_3DSTATE_VIEWPORT_STATE_POINTERS_CC << 16 | (2 - 2));
+      OUT_BATCH(brw->cc.vp_offset);
+      ADVANCE_BATCH();
+   } else {
+      brw->state.dirty.cache |= CACHE_NEW_CC_VP;
+   }
 }
 
 const struct brw_tracked_state brw_cc_vp = {
diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index 209fab1..399347c 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -117,7 +117,6 @@ extern const struct brw_tracked_state gen6_vs_state;
 extern const struct brw_tracked_state gen6_wm_push_constants;
 extern const struct brw_tracked_state gen6_wm_state;
 extern const struct brw_tracked_state gen7_depthbuffer;
-extern const struct brw_tracked_state gen7_cc_viewport_state_pointer;
 extern const struct brw_tracked_state gen7_clip_state;
 extern const struct brw_tracked_state gen7_disable_stages;
 extern const struct brw_tracked_state gen7_gs_push_constants;
diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c
index 1235d49..6cc2770 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -190,7 +190,6 @@ static const struct brw_tracked_state *gen7_atoms[] =
    &brw_state_base_address,
 
    &brw_cc_vp,
-   &gen7_cc_viewport_state_pointer, /* must do after brw_cc_vp */
    &gen7_sf_clip_viewport,
 
    &gen7_push_constant_space,
@@ -265,7 +264,6 @@ static const struct brw_tracked_state *gen8_atoms[] =
    &gen8_state_base_address,
 
    &brw_cc_vp,
-   &gen7_cc_viewport_state_pointer, /* must do after brw_cc_vp */
    &gen8_sf_clip_viewport,
 
    &gen7_push_constant_space,
diff --git a/src/mesa/drivers/dri/i965/gen7_viewport_state.c 
b/src/mesa/drivers/dri/i965/gen7_viewport_state.c
index 193ead7..01af044 100644
--- a/src/mesa/drivers/dri/i965/gen7_viewport_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_viewport_state.c
@@ -99,22 +99,3 @@ const struct brw_tracked_state gen7_sf_clip_viewport = {
},
.emit = gen7_upload_sf_clip_viewport,
 };
-
-/* - */
-
-static void upload_cc_viewport_state_pointer(struct brw_context *brw)
-{
-   BEGIN_BATCH(2);
-   OUT_BATCH(_3DSTATE_VIEWPORT_STATE_POINTERS_CC << 16 | (2 - 2));
-   OUT_BATCH(brw->cc.vp_offset);
-   ADVANCE_BATCH();
-}
-
-const struct brw_tracked_state gen7_cc_viewport_state_pointer = {
-   .dirty = {
-  .mesa = 0,
-  .brw = BRW_NEW_BATCH,
-  .cache = CACHE_NEW_CC_VP
-   },
-   .emit = upload_cc_viewport_state_pointer,
-};
-- 
2.1.3



[Mesa-dev] [PATCH 02/10] i965: Combine CACHE_NEW_*_UNIT into BRW_NEW_GEN4_UNIT_STATE.

2014-11-25 Thread Kenneth Graunke
On Gen4-5, unit state is specified as indirect state, rather than
commands.  If any unit state changes, we upload it via brw_state_batch
and arrange for 3DSTATE_PIPELINED_POINTERS to be re-emitted, which
updates pointers to all unit state at once.

Since there's only one command and state atom (brw_psp_urb_cs) that
needs to know about this, there's no benefit to having six separate
flags.  We can combine CACHE_NEW_*_UNIT into a single flag.

We also haven't cached these in a long time, so it doesn't make sense
to use the CACHE_NEW_ prefix.  Instead, use the BRW_NEW_ prefix.

This also saves 12 * sizeof(void *) bytes of memory per context, as
we remove useless aux_compare/aux_free functions for each CACHE bit.
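
As a rough sketch of the idea above: six producers raise one shared bit,
a single consumer tests it, and the memory figure follows from six
removed cache IDs times two hooks (aux_compare and aux_free) each. The
toy_* names are hypothetical and only illustrate the pattern; they are
not the driver's actual structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the single combined flag. */
#define TOY_NEW_UNIT_STATE (UINT64_C(1) << 0)

enum {
   TOY_REMOVED_CACHE_IDS  = 6,  /* CC, WM, SF, VS, FF_GS and CLIP units */
   TOY_HOOKS_PER_CACHE_ID = 2   /* one compare hook and one free hook */
};

struct toy_state {
   uint64_t dirty;
};

/* Any of the unit-state uploads raises the same bit. */
static void
toy_upload_unit(struct toy_state *s)
{
   assert(s != NULL);
   s->dirty |= TOY_NEW_UNIT_STATE;
}

/* The one consumer only needs a single test. */
static int
toy_needs_pipelined_pointers(const struct toy_state *s)
{
   return (s->dirty & TOY_NEW_UNIT_STATE) != 0;
}

/* 6 cache IDs x 2 function pointers = 12 pointers per context. */
static size_t
toy_bytes_saved(void)
{
   return (size_t)TOY_REMOVED_CACHE_IDS * TOY_HOOKS_PER_CACHE_ID *
          sizeof(void *);
}
```

On a typical 64-bit build, where a pointer is 8 bytes, that works out to
96 bytes per context.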

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_cc.c   |  2 +-
 src/mesa/drivers/dri/i965/brw_clip_state.c   |  2 +-
 src/mesa/drivers/dri/i965/brw_context.h  | 14 ++
 src/mesa/drivers/dri/i965/brw_gs_state.c |  2 +-
 src/mesa/drivers/dri/i965/brw_misc_state.c   |  9 ++---
 src/mesa/drivers/dri/i965/brw_sf_state.c |  2 +-
 src/mesa/drivers/dri/i965/brw_state_upload.c |  7 +--
 src/mesa/drivers/dri/i965/brw_vs_state.c |  2 +-
 src/mesa/drivers/dri/i965/brw_wm_state.c |  2 +-
 9 files changed, 11 insertions(+), 31 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cc.c b/src/mesa/drivers/dri/i965/brw_cc.c
index 62c9261..da72995 100644
--- a/src/mesa/drivers/dri/i965/brw_cc.c
+++ b/src/mesa/drivers/dri/i965/brw_cc.c
@@ -224,7 +224,7 @@ static void upload_cc_unit(struct brw_context *brw)
    cc->cc4.cc_viewport_state_offset = (brw->batch.bo->offset64 +
                                        brw->cc.vp_offset) >> 5; /* reloc */
 
-   brw->state.dirty.cache |= CACHE_NEW_CC_UNIT;
+   brw->state.dirty.brw |= BRW_NEW_GEN4_UNIT_STATE;
 
    /* Emit CC viewport relocation */
    drm_intel_bo_emit_reloc(brw->batch.bo,
diff --git a/src/mesa/drivers/dri/i965/brw_clip_state.c b/src/mesa/drivers/dri/i965/brw_clip_state.c
index a1f2e33..df334b5 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_state.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_state.c
@@ -158,7 +158,7 @@ brw_upload_clip_unit(struct brw_context *brw)
    clip->viewport_ymin = -1;
    clip->viewport_ymax = 1;
 
-   brw->state.dirty.cache |= CACHE_NEW_CLIP_UNIT;
+   brw->state.dirty.brw |= BRW_NEW_GEN4_UNIT_STATE;
 }
 
 const struct brw_tracked_state brw_clip_unit = {
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 656cbe8..f741848 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -183,6 +183,7 @@ enum brw_state_id {
BRW_STATE_PUSH_CONSTANT_ALLOCATION,
BRW_STATE_NUM_SAMPLES,
BRW_STATE_TEXTURE_BUFFER,
+   BRW_STATE_GEN4_UNIT_STATE,
BRW_NUM_STATE_BITS
 };
 
@@ -224,6 +225,7 @@ enum brw_state_id {
 #define BRW_NEW_PUSH_CONSTANT_ALLOCATION (1ull << BRW_STATE_PUSH_CONSTANT_ALLOCATION)
 #define BRW_NEW_NUM_SAMPLES             (1ull << BRW_STATE_NUM_SAMPLES)
 #define BRW_NEW_TEXTURE_BUFFER          (1ull << BRW_STATE_TEXTURE_BUFFER)
+#define BRW_NEW_GEN4_UNIT_STATE         (1ull << BRW_STATE_GEN4_UNIT_STATE)
 
 struct brw_state_flags {
/** State update flags signalled by mesa internals */
@@ -684,21 +686,15 @@ struct brw_gs_prog_data
 
 enum brw_cache_id {
BRW_CC_VP,
-   BRW_CC_UNIT,
BRW_WM_PROG,
BRW_BLORP_BLIT_PROG,
BRW_SAMPLER,
-   BRW_WM_UNIT,
BRW_SF_PROG,
BRW_SF_VP,
-   BRW_SF_UNIT, /* scissor state on gen6 */
-   BRW_VS_UNIT,
BRW_VS_PROG,
-   BRW_FF_GS_UNIT,
BRW_FF_GS_PROG,
BRW_GS_PROG,
BRW_CLIP_VP,
-   BRW_CLIP_UNIT,
BRW_CLIP_PROG,
 
BRW_MAX_CACHE
@@ -778,21 +774,15 @@ enum shader_time_shader_type {
 /* Flags for brw->state.cache.
  */
 #define CACHE_NEW_CC_VP                  (1<<BRW_CC_VP)
-#define CACHE_NEW_CC_UNIT                (1<<BRW_CC_UNIT)
 #define CACHE_NEW_WM_PROG                (1<<BRW_WM_PROG)
 #define CACHE_NEW_BLORP_BLIT_PROG        (1<<BRW_BLORP_BLIT_PROG)
 #define CACHE_NEW_SAMPLER                (1<<BRW_SAMPLER)
-#define CACHE_NEW_WM_UNIT                (1<<BRW_WM_UNIT)
 #define CACHE_NEW_SF_PROG                (1<<BRW_SF_PROG)
 #define CACHE_NEW_SF_VP                  (1<<BRW_SF_VP)
-#define CACHE_NEW_SF_UNIT                (1<<BRW_SF_UNIT)
-#define CACHE_NEW_VS_UNIT                (1<<BRW_VS_UNIT)
 #define CACHE_NEW_VS_PROG                (1<<BRW_VS_PROG)
-#define CACHE_NEW_FF_GS_UNIT             (1<<BRW_FF_GS_UNIT)
 #define CACHE_NEW_FF_GS_PROG             (1<<BRW_FF_GS_PROG)
 #define CACHE_NEW_GS_PROG                (1<<BRW_GS_PROG)
 #define CACHE_NEW_CLIP_VP                (1<<BRW_CLIP_VP)
-#define CACHE_NEW_CLIP_UNIT              (1<<BRW_CLIP_UNIT)
 #define CACHE_NEW_CLIP_PROG              (1<<BRW_CLIP_PROG)
 
 struct brw_vertex_buffer {
diff --git a/src/mesa/drivers/dri/i965/brw_gs_state.c 
b/src/mesa/drivers/dri/i965/brw_gs_state.c
index 698f7ee..9f4efdf 100644
--- a/src/mesa/drivers/dri/i965/brw_gs_state.c
+++ 

[Mesa-dev] [PATCH 09/10] i965: Move BRW_NEW_*_PROG_DATA flags to .brw (not .cache).

2014-11-25 Thread Kenneth Graunke
I put the BRW_NEW_*_PROG_DATA flags at the beginning so that
brw_state_cache.c can still continue using 1 << brw_cache_id.
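
The ordering trick can be sketched like this. The toy_* enums are
hypothetical and shorter than the real ones; the only point is that when
the first entries of the state-bit enum mirror the cache-ID enum, a
shift by the cache ID yields the matching *_PROG_DATA bit with no lookup
table.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cache IDs, in program-cache order. */
enum toy_cache_id {
   TOY_CACHE_FS_PROG,
   TOY_CACHE_VS_PROG,
   TOY_CACHE_GS_PROG,
   TOY_MAX_CACHE
};

/* Hypothetical state bits. The first entries must stay in the same
 * order as toy_cache_id; unrelated bits follow afterwards. */
enum toy_state_id {
   TOY_STATE_FS_PROG_DATA,
   TOY_STATE_VS_PROG_DATA,
   TOY_STATE_GS_PROG_DATA,
   TOY_STATE_URB_FENCE,
   TOY_NUM_STATE_BITS
};

#define TOY_NEW_FS_PROG_DATA (UINT64_C(1) << TOY_STATE_FS_PROG_DATA)
#define TOY_NEW_VS_PROG_DATA (UINT64_C(1) << TOY_STATE_VS_PROG_DATA)
#define TOY_NEW_GS_PROG_DATA (UINT64_C(1) << TOY_STATE_GS_PROG_DATA)
#define TOY_NEW_URB_FENCE    (UINT64_C(1) << TOY_STATE_URB_FENCE)

/* What the cache code can keep doing: derive the dirty bit from the ID. */
static uint64_t
toy_dirty_bit_for_cache(enum toy_cache_id id)
{
   assert(id < TOY_MAX_CACHE);
   return UINT64_C(1) << id;
}
```

The price of this shortcut is that both enums must be kept in sync by
hand, which is presumably why the ordering is worth a comment in the
header.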

I also added a comment explaining the difference between
BRW_NEW_*_PROG_DATA and BRW_NEW_*_PROGRAM, as it took me a long time
to remember it.

Non-mechanical changes:
- brw_state_cache.c and brw_ff_gs.c now signal .brw, not .cache.
- brw_state_upload.c - INTEL_DEBUG=state changes.
- brw_context.h - bit definition merging.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_binding_tables.c   |  6 +--
 src/mesa/drivers/dri/i965/brw_clip_state.c   |  2 +-
 src/mesa/drivers/dri/i965/brw_context.h  | 63 +++-
 src/mesa/drivers/dri/i965/brw_curbe.c| 12 ++---
 src/mesa/drivers/dri/i965/brw_draw_upload.c  |  4 +-
 src/mesa/drivers/dri/i965/brw_ff_gs.c|  6 +--
 src/mesa/drivers/dri/i965/brw_gs_state.c |  2 +-
 src/mesa/drivers/dri/i965/brw_gs_surface_state.c | 10 ++--
 src/mesa/drivers/dri/i965/brw_misc_state.c   |  2 +-
 src/mesa/drivers/dri/i965/brw_sf_state.c |  2 +-
 src/mesa/drivers/dri/i965/brw_state_cache.c  |  4 +-
 src/mesa/drivers/dri/i965/brw_state_upload.c | 15 +++---
 src/mesa/drivers/dri/i965/brw_urb.c  |  6 +--
 src/mesa/drivers/dri/i965/brw_vs_state.c |  4 +-
 src/mesa/drivers/dri/i965/brw_vs_surface_state.c | 12 ++---
 src/mesa/drivers/dri/i965/brw_wm_state.c |  2 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 10 ++--
 src/mesa/drivers/dri/i965/gen6_clip_state.c  |  4 +-
 src/mesa/drivers/dri/i965/gen6_gs_state.c|  6 +--
 src/mesa/drivers/dri/i965/gen6_sf_state.c|  2 +-
 src/mesa/drivers/dri/i965/gen6_urb.c |  8 +--
 src/mesa/drivers/dri/i965/gen6_vs_state.c|  8 +--
 src/mesa/drivers/dri/i965/gen6_wm_state.c|  4 +-
 src/mesa/drivers/dri/i965/gen7_gs_state.c|  4 +-
 src/mesa/drivers/dri/i965/gen7_sf_state.c|  2 +-
 src/mesa/drivers/dri/i965/gen7_urb.c |  6 +--
 src/mesa/drivers/dri/i965/gen7_vs_state.c|  4 +-
 src/mesa/drivers/dri/i965/gen7_wm_state.c|  8 +--
 src/mesa/drivers/dri/i965/gen8_depth_state.c |  4 +-
 src/mesa/drivers/dri/i965/gen8_draw_upload.c |  4 +-
 src/mesa/drivers/dri/i965/gen8_gs_state.c|  4 +-
 src/mesa/drivers/dri/i965/gen8_ps_state.c| 10 ++--
 src/mesa/drivers/dri/i965/gen8_sf_state.c|  2 +-
 src/mesa/drivers/dri/i965/gen8_vs_state.c|  4 +-
 34 files changed, 131 insertions(+), 115 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
b/src/mesa/drivers/dri/i965/brw_binding_tables.c
index 2e843d5..7ffd7b2 100644
--- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
+++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
@@ -111,8 +111,8 @@ const struct brw_tracked_state brw_vs_binding_table = {
   .mesa = 0,
   .brw = BRW_NEW_BATCH |
  BRW_NEW_VS_CONSTBUF |
+ BRW_NEW_VS_PROG_DATA |
  BRW_NEW_SURFACES,
-  .cache = BRW_NEW_VS_PROG_DATA
},
.emit = brw_vs_upload_binding_table,
 };
@@ -131,8 +131,8 @@ const struct brw_tracked_state brw_wm_binding_table = {
.dirty = {
   .mesa = 0,
   .brw = BRW_NEW_BATCH |
+ BRW_NEW_FS_PROG_DATA |
  BRW_NEW_SURFACES,
-  .cache = BRW_NEW_FS_PROG_DATA
},
.emit = brw_upload_wm_binding_table,
 };
@@ -155,8 +155,8 @@ const struct brw_tracked_state brw_gs_binding_table = {
   .mesa = 0,
   .brw = BRW_NEW_BATCH |
  BRW_NEW_GS_CONSTBUF |
+ BRW_NEW_GS_PROG_DATA |
  BRW_NEW_SURFACES,
-  .cache = BRW_NEW_GS_PROG_DATA
},
.emit = brw_gs_upload_binding_table,
 };
diff --git a/src/mesa/drivers/dri/i965/brw_clip_state.c 
b/src/mesa/drivers/dri/i965/brw_clip_state.c
index 0e1aa58..09a2523 100644
--- a/src/mesa/drivers/dri/i965/brw_clip_state.c
+++ b/src/mesa/drivers/dri/i965/brw_clip_state.c
@@ -167,10 +167,10 @@ const struct brw_tracked_state brw_clip_unit = {
_NEW_TRANSFORM |
_NEW_VIEWPORT,
   .brw   = BRW_NEW_BATCH |
+   BRW_NEW_CLIP_PROG_DATA |
BRW_NEW_CURBE_OFFSETS |
BRW_NEW_PROGRAM_CACHE |
BRW_NEW_URB_FENCE,
-  .cache = BRW_NEW_CLIP_PROG_DATA
},
.emit = brw_upload_clip_unit,
 };
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index a766937..e60c054 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -149,8 +149,21 @@ struct brw_vec4_prog_key;
 struct brw_wm_prog_key;
 struct brw_wm_prog_data;
 
+enum brw_cache_id {
+   BRW_CACHE_FS_PROG,
+   BRW_CACHE_BLORP_BLIT_PROG,
+   BRW_CACHE_SF_PROG,
+   BRW_CACHE_VS_PROG,
+   BRW_CACHE_FF_GS_PROG,
+   BRW_CACHE_GS_PROG,
+   BRW_CACHE_CLIP_PROG,
+
+   BRW_MAX_CACHE
+};
+
 enum brw_state_id {
-   BRW_STATE_URB_FENCE,
+   /* 

[Mesa-dev] [PATCH 05/10] i965: Move some /* CACHE_NEW_SAMPLER */ comments.

2014-11-25 Thread Kenneth Graunke
Marking brw_stage_state::sampler_count as CACHE_NEW_SAMPLER is wrong.

The number of samplers used by each program is actually computed at
draw time (brw_try_draw_prims), based purely on the currently bound
shader programs (gl_program::SamplersUsed).

CACHE_NEW_SAMPLER means that we've emitted a new SAMPLER_STATE table.
Although this could indicate that the number of samplers has changed,
it could also simply mean that the contents of the table have changed
(i.e. we've bound different textures).

The real reason these atoms depend on CACHE_NEW_SAMPLER is that they
include a pointer to the SAMPLER_STATE table.  This was not commented.

So, move the comments to the appropriate place.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_vs_state.c | 2 +-
 src/mesa/drivers/dri/i965/brw_wm_state.c | 3 +--
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vs_state.c 
b/src/mesa/drivers/dri/i965/brw_vs_state.c
index cd740db..351f7ef 100644
--- a/src/mesa/drivers/dri/i965/brw_vs_state.c
+++ b/src/mesa/drivers/dri/i965/brw_vs_state.c
@@ -145,7 +145,6 @@ brw_upload_vs_unit(struct brw_context *brw)
    if (brw->gen == 5)
       vs->vs5.sampler_count = 0; /* hardware requirement */
    else {
-      /* CACHE_NEW_SAMPLER */
       vs->vs5.sampler_count = (stage_state->sampler_count + 3) / 4;
    }
 
@@ -160,6 +159,7 @@ brw_upload_vs_unit(struct brw_context *brw)
    /* Set the sampler state pointer, and its reloc
     */
    if (stage_state->sampler_count) {
+      /* CACHE_NEW_SAMPLER - reloc */
       vs->vs5.sampler_state_pointer =
          (brw->batch.bo->offset64 + stage_state->sampler_offset) >> 5;
       drm_intel_bo_emit_reloc(brw->batch.bo,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_state.c
index 6a53a98..2f315c6 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_state.c
@@ -150,12 +150,11 @@ brw_upload_wm_unit(struct brw_context *brw)
    if (brw->gen == 5)
       wm->wm4.sampler_count = 0; /* hardware requirement */
    else {
-      /* CACHE_NEW_SAMPLER */
       wm->wm4.sampler_count = (brw->wm.base.sampler_count + 1) / 4;
    }
 
    if (brw->wm.base.sampler_count) {
-      /* reloc */
+      /* CACHE_NEW_SAMPLER - reloc */
       wm->wm4.sampler_state_pointer = (brw->batch.bo->offset64 +
                                        brw->wm.base.sampler_offset) >> 5;
    } else {
-- 
2.1.3



[Mesa-dev] [PATCH 06/10] i965: Move CACHE_NEW_SAMPLER to BRW_NEW_SAMPLER_STATE_TABLE.

2014-11-25 Thread Kenneth Graunke
This flag signifies that we've emitted a new SAMPLER_STATE table.
Given that we haven't cached those in years, CACHE_NEW_SAMPLER isn't
a great name.  Putting it in the BRW_NEW_* hierarchy would make more
sense; BRW_NEW_SAMPLER_STATE_TABLE better reflects its actual purpose.

When this flag is raised, the pointer to the SAMPLER_STATE table has
changed, so we need to re-issue any packets which point to it (unit
state on Gen4-5, 3DSTATE_SAMPLER_STATE_POINTERS on Gen6, and the
per-stage variants on Gen7+).

Saves 2 * sizeof(void *) bytes per context, as we remove useless
aux_compare/aux_free function pointers.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_context.h| 4 ++--
 src/mesa/drivers/dri/i965/brw_sampler_state.c  | 2 +-
 src/mesa/drivers/dri/i965/brw_state_upload.c   | 2 +-
 src/mesa/drivers/dri/i965/brw_vs_state.c   | 6 +++---
 src/mesa/drivers/dri/i965/brw_wm_state.c   | 6 +++---
 src/mesa/drivers/dri/i965/gen6_sampler_state.c | 3 ++-
 6 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index cd1daee..fac22dc 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -187,6 +187,7 @@ enum brw_state_id {
BRW_STATE_CC_VP,
BRW_STATE_SF_VP,
BRW_STATE_CLIP_VP,
+   BRW_STATE_SAMPLER_STATE_TABLE,
BRW_NUM_STATE_BITS
 };
 
@@ -232,6 +233,7 @@ enum brw_state_id {
 #define BRW_NEW_CC_VP   (1ull << BRW_STATE_CC_VP)
 #define BRW_NEW_SF_VP   (1ull << BRW_STATE_SF_VP)
 #define BRW_NEW_CLIP_VP (1ull << BRW_STATE_CLIP_VP)
+#define BRW_NEW_SAMPLER_STATE_TABLE (1ull << BRW_STATE_SAMPLER_STATE_TABLE)
 
 struct brw_state_flags {
/** State update flags signalled by mesa internals */
@@ -693,7 +695,6 @@ struct brw_gs_prog_data
 enum brw_cache_id {
BRW_WM_PROG,
BRW_BLORP_BLIT_PROG,
-   BRW_SAMPLER,
BRW_SF_PROG,
BRW_VS_PROG,
BRW_FF_GS_PROG,
@@ -778,7 +779,6 @@ enum shader_time_shader_type {
  */
 #define CACHE_NEW_WM_PROG (1<<BRW_WM_PROG)
 #define CACHE_NEW_BLORP_BLIT_PROG (1<<BRW_BLORP_BLIT_PROG)
-#define CACHE_NEW_SAMPLER (1<<BRW_SAMPLER)
 #define CACHE_NEW_SF_PROG (1<<BRW_SF_PROG)
 #define CACHE_NEW_VS_PROG (1<<BRW_VS_PROG)
 #define CACHE_NEW_FF_GS_PROG (1<<BRW_FF_GS_PROG)
diff --git a/src/mesa/drivers/dri/i965/brw_sampler_state.c 
b/src/mesa/drivers/dri/i965/brw_sampler_state.c
index 8363a48..9c5e45c 100644
--- a/src/mesa/drivers/dri/i965/brw_sampler_state.c
+++ b/src/mesa/drivers/dri/i965/brw_sampler_state.c
@@ -512,7 +512,7 @@ brw_upload_sampler_state_table(struct brw_context *brw,
   /* Flag that the sampler state table pointer has changed; later atoms
* will handle it.
*/
-  brw->state.dirty.cache |= CACHE_NEW_SAMPLER;
+  brw->state.dirty.brw |= BRW_NEW_SAMPLER_STATE_TABLE;
}
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
b/src/mesa/drivers/dri/i965/brw_state_upload.c
index db0119c..57c4519 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -523,13 +523,13 @@ static struct dirty_bit_map brw_bits[] = {
DEFINE_BIT(BRW_NEW_CC_VP),
DEFINE_BIT(BRW_NEW_SF_VP),
DEFINE_BIT(BRW_NEW_CLIP_VP),
+   DEFINE_BIT(BRW_NEW_SAMPLER_STATE_TABLE),
{0, 0, 0}
 };
 
 static struct dirty_bit_map cache_bits[] = {
DEFINE_BIT(CACHE_NEW_WM_PROG),
DEFINE_BIT(CACHE_NEW_BLORP_BLIT_PROG),
-   DEFINE_BIT(CACHE_NEW_SAMPLER),
DEFINE_BIT(CACHE_NEW_SF_PROG),
DEFINE_BIT(CACHE_NEW_VS_PROG),
DEFINE_BIT(CACHE_NEW_FF_GS_PROG),
diff --git a/src/mesa/drivers/dri/i965/brw_vs_state.c 
b/src/mesa/drivers/dri/i965/brw_vs_state.c
index 351f7ef..f9ee2d0 100644
--- a/src/mesa/drivers/dri/i965/brw_vs_state.c
+++ b/src/mesa/drivers/dri/i965/brw_vs_state.c
@@ -159,7 +159,7 @@ brw_upload_vs_unit(struct brw_context *brw)
/* Set the sampler state pointer, and its reloc
 */
if (stage_state->sampler_count) {
-  /* CACHE_NEW_SAMPLER - reloc */
+  /* BRW_NEW_SAMPLER_STATE_TABLE - reloc */
   vs->vs5.sampler_state_pointer =
  (brw->batch.bo->offset64 + stage_state->sampler_offset) >> 5;
   drm_intel_bo_emit_reloc(brw->batch.bo,
@@ -190,10 +190,10 @@ const struct brw_tracked_state brw_vs_unit = {
   .brw   = BRW_NEW_BATCH |
BRW_NEW_CURBE_OFFSETS |
BRW_NEW_PROGRAM_CACHE |
+   BRW_NEW_SAMPLER_STATE_TABLE |
BRW_NEW_URB_FENCE |
BRW_NEW_VERTEX_PROGRAM,
-  .cache = CACHE_NEW_SAMPLER |
-   CACHE_NEW_VS_PROG,
+  .cache = CACHE_NEW_VS_PROG,
},
.emit = brw_upload_vs_unit,
 };
diff --git a/src/mesa/drivers/dri/i965/brw_wm_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_state.c
index 2f315c6..763ea5f 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_state.c
+++ 

[Mesa-dev] [PATCH 08/10] i965: Rename CACHE_NEW_*_PROG to BRW_NEW_*_PROG_DATA.

2014-11-25 Thread Kenneth Graunke
Now that we've moved a bunch of CACHE_NEW_* bits to BRW_NEW_*, the only
ones that are left are legitimately related to the program cache.  Yet,
it seems a bit wasteful to have an entire bitfield for only 7 bits.

State upload is one of the hottest paths in the driver.  For each atom
in the list, we call check_state() to see if it needs to be emitted.
Currently, this involves comparing three separate bitfields (mesa, brw,
and cache).  Consolidating the brw and cache bitfields would save a
small amount of CPU overhead per atom.  Broadwell, for example, has
57 state atoms, so this small savings can add up.
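
To make the per-atom cost concrete, here is a minimal sketch of the two
shapes of the check, assuming check_state() simply tests whether any
dirty bit overlaps the bits an atom listens to.  The sketch_* struct and
function names are hypothetical and are not the driver's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Three separate bitfields, as in the current layout (hypothetical copy). */
struct sketch_flags3 {
   uint32_t mesa;
   uint64_t brw;
   uint32_t cache;
};

/* Two bitfields, once the cache bits have been folded into brw. */
struct sketch_flags2 {
   uint32_t mesa;
   uint64_t brw;
};

/* Three AND operations and two ORs per atom. */
static inline bool
sketch_check_state3(const struct sketch_flags3 *dirty,
                    const struct sketch_flags3 *atom)
{
   assert(dirty && atom);
   return ((dirty->mesa & atom->mesa) |
           (dirty->brw & atom->brw) |
           (dirty->cache & atom->cache)) != 0;
}

/* Two AND operations and one OR per atom. */
static inline bool
sketch_check_state2(const struct sketch_flags2 *dirty,
                    const struct sketch_flags2 *atom)
{
   assert(dirty && atom);
   return ((dirty->mesa & atom->mesa) |
           (dirty->brw & atom->brw)) != 0;
}
```

The saving per call is a single load, AND and OR, which only matters
because the check runs once per atom on every state upload.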

CACHE_NEW_*_PROG covers the brw_*_prog_data structures, as well as the
offset into the program cache BO (prog_offset).  Since most uses refer
to brw_*_prog_data, I decided to use BRW_NEW_*_PROG_DATA as the name.

Removing cache completely is a bit painful, so I decided to do it in
several patches for easier review, and to separate mechanical changes
from manual ones.  This one simply renames things, and was made via:

$ for file in *.[ch]; do
  sed -i -e 's/CACHE_NEW_\([A-Z_\*]*\)_PROG/BRW_NEW_\1_PROG_DATA/g' \
 -e 's/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/g' $file
  done

Note that BRW_NEW_*_PROG_DATA is still in .cache, not .brw!
The next patch will remedy this flaw.  It will also fix the
alphabetization issues.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_binding_tables.c   |  8 
 src/mesa/drivers/dri/i965/brw_clip_state.c   |  4 ++--
 src/mesa/drivers/dri/i965/brw_context.h  | 16 
 src/mesa/drivers/dri/i965/brw_curbe.c| 16 
 src/mesa/drivers/dri/i965/brw_draw_upload.c  |  4 ++--
 src/mesa/drivers/dri/i965/brw_ff_gs.c|  6 +++---
 src/mesa/drivers/dri/i965/brw_gs_state.c |  4 ++--
 src/mesa/drivers/dri/i965/brw_gs_surface_state.c | 12 ++--
 src/mesa/drivers/dri/i965/brw_misc_state.c   |  2 +-
 src/mesa/drivers/dri/i965/brw_sf_state.c |  6 +++---
 src/mesa/drivers/dri/i965/brw_state_upload.c | 14 +++---
 src/mesa/drivers/dri/i965/brw_urb.c  |  4 ++--
 src/mesa/drivers/dri/i965/brw_vs_state.c |  4 ++--
 src/mesa/drivers/dri/i965/brw_vs_surface_state.c | 14 +++---
 src/mesa/drivers/dri/i965/brw_wm_state.c |  6 +++---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 ++--
 src/mesa/drivers/dri/i965/gen6_clip_state.c  |  6 +++---
 src/mesa/drivers/dri/i965/gen6_gs_state.c| 10 +-
 src/mesa/drivers/dri/i965/gen6_sf_state.c|  8 
 src/mesa/drivers/dri/i965/gen6_urb.c |  8 
 src/mesa/drivers/dri/i965/gen6_vs_state.c|  6 +++---
 src/mesa/drivers/dri/i965/gen6_wm_state.c| 10 +-
 src/mesa/drivers/dri/i965/gen7_gs_state.c|  4 ++--
 src/mesa/drivers/dri/i965/gen7_sf_state.c|  6 +++---
 src/mesa/drivers/dri/i965/gen7_urb.c |  8 
 src/mesa/drivers/dri/i965/gen7_vs_state.c|  2 +-
 src/mesa/drivers/dri/i965/gen7_wm_state.c| 12 ++--
 src/mesa/drivers/dri/i965/gen8_depth_state.c |  6 +++---
 src/mesa/drivers/dri/i965/gen8_draw_upload.c |  2 +-
 src/mesa/drivers/dri/i965/gen8_gs_state.c|  4 ++--
 src/mesa/drivers/dri/i965/gen8_ps_state.c| 14 +++---
 src/mesa/drivers/dri/i965/gen8_sf_state.c|  6 +++---
 src/mesa/drivers/dri/i965/gen8_vs_state.c|  4 ++--
 33 files changed, 124 insertions(+), 124 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
b/src/mesa/drivers/dri/i965/brw_binding_tables.c
index bf99689..2e843d5 100644
--- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
+++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
@@ -56,7 +56,7 @@ brw_upload_binding_table(struct brw_context *brw,
  GLbitfield brw_new_binding_table,
  struct brw_stage_state *stage_state)
 {
-   /* CACHE_NEW_*_PROG */
+   /* BRW_NEW_*_PROG_DATA */
struct brw_stage_prog_data *prog_data = stage_state->prog_data;
 
if (prog_data->binding_table.size_bytes == 0) {
@@ -112,7 +112,7 @@ const struct brw_tracked_state brw_vs_binding_table = {
   .brw = BRW_NEW_BATCH |
  BRW_NEW_VS_CONSTBUF |
  BRW_NEW_SURFACES,
-  .cache = CACHE_NEW_VS_PROG
+  .cache = BRW_NEW_VS_PROG_DATA
},
.emit = brw_vs_upload_binding_table,
 };
@@ -132,7 +132,7 @@ const struct brw_tracked_state brw_wm_binding_table = {
   .mesa = 0,
   .brw = BRW_NEW_BATCH |
  BRW_NEW_SURFACES,
-  .cache = CACHE_NEW_WM_PROG
+  .cache = BRW_NEW_FS_PROG_DATA
},
.emit = brw_upload_wm_binding_table,
 };
@@ -156,7 +156,7 @@ const struct brw_tracked_state brw_gs_binding_table = {
   .brw = BRW_NEW_BATCH |
  BRW_NEW_GS_CONSTBUF |
  BRW_NEW_SURFACES,
-  .cache = CACHE_NEW_GS_PROG
+  .cache = 

[Mesa-dev] [PATCH 04/10] i965: Move CACHE_NEW_*_VP flags to BRW_NEW_*_VP.

2014-11-25 Thread Kenneth Graunke
We've been streaming these out for ages, so they basically have nothing
to do with brw_state_cache.c.

Saves 6 * sizeof(void *) bytes per context, as we won't have useless
aux_compare/aux_free functions for them.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_cc.c  |  7 ---
 src/mesa/drivers/dri/i965/brw_context.h | 12 ++--
 src/mesa/drivers/dri/i965/brw_sf_state.c|  8 
 src/mesa/drivers/dri/i965/brw_state_upload.c|  6 +++---
 src/mesa/drivers/dri/i965/gen6_viewport_state.c | 11 ++-
 5 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cc.c 
b/src/mesa/drivers/dri/i965/brw_cc.c
index 86ab503..01974e1 100644
--- a/src/mesa/drivers/dri/i965/brw_cc.c
+++ b/src/mesa/drivers/dri/i965/brw_cc.c
@@ -68,7 +68,7 @@ brw_upload_cc_vp(struct brw_context *brw)
   OUT_BATCH(brw->cc.vp_offset);
   ADVANCE_BATCH();
} else {
-  brw->state.dirty.cache |= CACHE_NEW_CC_VP;
+  brw->state.dirty.brw |= BRW_NEW_CC_VP;
}
 }
 
@@ -227,7 +227,7 @@ static void upload_cc_unit(struct brw_context *brw)
if (brw->stats_wm || unlikely(INTEL_DEBUG & DEBUG_STATS))
   cc->cc5.statistics_enable = 1;
 
-   /* CACHE_NEW_CC_VP */
+   /* BRW_NEW_CC_VP */
cc->cc4.cc_viewport_state_offset = (brw->batch.bo->offset64 +
   brw->cc.vp_offset) >> 5; /* reloc */
 
@@ -248,8 +248,9 @@ const struct brw_tracked_state brw_cc_unit = {
   _NEW_DEPTH |
   _NEW_STENCIL,
   .brw = BRW_NEW_BATCH |
+ BRW_NEW_CC_VP |
  BRW_NEW_STATS_WM,
-  .cache = CACHE_NEW_CC_VP
+  .cache = 0
},
.emit = upload_cc_unit,
 };
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index f741848..cd1daee 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -184,6 +184,9 @@ enum brw_state_id {
BRW_STATE_NUM_SAMPLES,
BRW_STATE_TEXTURE_BUFFER,
BRW_STATE_GEN4_UNIT_STATE,
+   BRW_STATE_CC_VP,
+   BRW_STATE_SF_VP,
+   BRW_STATE_CLIP_VP,
BRW_NUM_STATE_BITS
 };
 
@@ -226,6 +229,9 @@ enum brw_state_id {
 #define BRW_NEW_NUM_SAMPLES (1ull << BRW_STATE_NUM_SAMPLES)
 #define BRW_NEW_TEXTURE_BUFFER  (1ull << BRW_STATE_TEXTURE_BUFFER)
 #define BRW_NEW_GEN4_UNIT_STATE (1ull << BRW_STATE_GEN4_UNIT_STATE)
+#define BRW_NEW_CC_VP   (1ull << BRW_STATE_CC_VP)
+#define BRW_NEW_SF_VP   (1ull << BRW_STATE_SF_VP)
+#define BRW_NEW_CLIP_VP (1ull << BRW_STATE_CLIP_VP)
 
 struct brw_state_flags {
/** State update flags signalled by mesa internals */
@@ -685,16 +691,13 @@ struct brw_gs_prog_data
 #define SHADER_TIME_STRIDE 64
 
 enum brw_cache_id {
-   BRW_CC_VP,
BRW_WM_PROG,
BRW_BLORP_BLIT_PROG,
BRW_SAMPLER,
BRW_SF_PROG,
-   BRW_SF_VP,
BRW_VS_PROG,
BRW_FF_GS_PROG,
BRW_GS_PROG,
-   BRW_CLIP_VP,
BRW_CLIP_PROG,
 
BRW_MAX_CACHE
@@ -773,16 +776,13 @@ enum shader_time_shader_type {
 
 /* Flags for brw->state.cache.
  */
-#define CACHE_NEW_CC_VP (1<<BRW_CC_VP)
 #define CACHE_NEW_WM_PROG (1<<BRW_WM_PROG)
 #define CACHE_NEW_BLORP_BLIT_PROG (1<<BRW_BLORP_BLIT_PROG)
 #define CACHE_NEW_SAMPLER (1<<BRW_SAMPLER)
 #define CACHE_NEW_SF_PROG (1<<BRW_SF_PROG)
-#define CACHE_NEW_SF_VP (1<<BRW_SF_VP)
 #define CACHE_NEW_VS_PROG (1<<BRW_VS_PROG)
 #define CACHE_NEW_FF_GS_PROG (1<<BRW_FF_GS_PROG)
 #define CACHE_NEW_GS_PROG (1<<BRW_GS_PROG)
-#define CACHE_NEW_CLIP_VP (1<<BRW_CLIP_VP)
 #define CACHE_NEW_CLIP_PROG (1<<BRW_CLIP_PROG)
 
 struct brw_vertex_buffer {
diff --git a/src/mesa/drivers/dri/i965/brw_sf_state.c 
b/src/mesa/drivers/dri/i965/brw_sf_state.c
index 0316a6b..1b79cc0 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_state.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_state.c
@@ -109,7 +109,7 @@ static void upload_sf_vp(struct brw_context *brw)
   sfv->scissor.ymax = ctx->DrawBuffer->Height - ctx->DrawBuffer->_Ymin - 1;
}
 
-   brw->state.dirty.cache |= CACHE_NEW_SF_VP;
+   brw->state.dirty.brw |= BRW_NEW_SF_VP;
 }
 
 const struct brw_tracked_state brw_sf_vp = {
@@ -172,7 +172,7 @@ static void upload_sf_unit( struct brw_context *brw )
if (unlikely(INTEL_DEBUG & DEBUG_STATS))
   sf->thread4.stats_enable = 1;
 
-   /* CACHE_NEW_SF_VP */
+   /* BRW_NEW_SF_VP */
sf->sf5.sf_viewport_state_offset = (brw->batch.bo->offset64 +
   brw->sf.vp_offset) >> 5; /* reloc */
 
@@ -306,9 +306,9 @@ const struct brw_tracked_state brw_sf_unit = {
_NEW_SCISSOR,
   .brw   = BRW_NEW_BATCH |
BRW_NEW_PROGRAM_CACHE |
+   BRW_NEW_SF_VP |
BRW_NEW_URB_FENCE,
-  .cache = CACHE_NEW_SF_PROG |
-   CACHE_NEW_SF_VP,
+  .cache = CACHE_NEW_SF_PROG,
 

[Mesa-dev] [PATCH 01/10] i965: Alphabetize brw_tracked_state flags and use a consistent style.

2014-11-25 Thread Kenneth Graunke
Most of the dirty flags were listed in some arbitrary order.  Some used
bonus parentheses.  Some put multiple flags on one line, others put one
per line.  Some used tabs instead of spaces...but only on some lines.

This patch settles on one flag per line, in alphabetical order, using
spaces instead of tabs, and sheds the unnecessary parentheses.

Sorting was mostly done with vim's visual block feature and !sort,
although I alphabetized short lists by hand; it was pretty manual.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_binding_tables.c | 23 
 src/mesa/drivers/dri/i965/brw_cc.c | 11 ++--
 src/mesa/drivers/dri/i965/brw_clip.c   | 14 +-
 src/mesa/drivers/dri/i965/brw_clip_state.c | 12 
 src/mesa/drivers/dri/i965/brw_curbe.c  | 15 +-
 src/mesa/drivers/dri/i965/brw_draw_upload.c|  6 ++--
 src/mesa/drivers/dri/i965/brw_ff_gs.c  |  6 ++--
 src/mesa/drivers/dri/i965/brw_gs.c | 10 ---
 src/mesa/drivers/dri/i965/brw_gs_state.c   |  8 +++---
 src/mesa/drivers/dri/i965/brw_gs_surface_state.c   | 11 +---
 src/mesa/drivers/dri/i965/brw_interpolation_map.c  |  4 +--
 src/mesa/drivers/dri/i965/brw_misc_state.c | 32 +++---
 src/mesa/drivers/dri/i965/brw_sf.c | 15 ++
 src/mesa/drivers/dri/i965/brw_sf_state.c   | 30 ++--
 src/mesa/drivers/dri/i965/brw_urb.c|  4 +--
 src/mesa/drivers/dri/i965/brw_vs.c | 13 +
 src/mesa/drivers/dri/i965/brw_vs_state.c   | 13 +
 src/mesa/drivers/dri/i965/brw_vs_surface_state.c   | 11 +---
 src/mesa/drivers/dri/i965/brw_wm.c | 30 ++--
 src/mesa/drivers/dri/i965/brw_wm_state.c   | 26 --
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c   | 21 --
 src/mesa/drivers/dri/i965/gen6_cc.c| 15 ++
 src/mesa/drivers/dri/i965/gen6_clip_state.c|  9 --
 src/mesa/drivers/dri/i965/gen6_depthstencil.c  |  7 +++--
 src/mesa/drivers/dri/i965/gen6_gs_state.c  | 23 +---
 src/mesa/drivers/dri/i965/gen6_multisample_state.c |  4 +--
 src/mesa/drivers/dri/i965/gen6_sampler_state.c |  4 +--
 src/mesa/drivers/dri/i965/gen6_scissor_state.c |  4 ++-
 src/mesa/drivers/dri/i965/gen6_sf_state.c  | 22 +++
 src/mesa/drivers/dri/i965/gen6_sol.c   | 16 +--
 src/mesa/drivers/dri/i965/gen6_urb.c   |  7 +++--
 src/mesa/drivers/dri/i965/gen6_viewport_state.c| 13 +
 src/mesa/drivers/dri/i965/gen6_vs_state.c  | 20 --
 src/mesa/drivers/dri/i965/gen6_wm_state.c  | 26 +-
 src/mesa/drivers/dri/i965/gen7_gs_state.c  |  6 ++--
 src/mesa/drivers/dri/i965/gen7_misc_state.c|  4 ++-
 src/mesa/drivers/dri/i965/gen7_sf_state.c  | 30 ++--
 src/mesa/drivers/dri/i965/gen7_sol_state.c |  8 +++---
 src/mesa/drivers/dri/i965/gen7_urb.c   |  6 ++--
 src/mesa/drivers/dri/i965/gen7_viewport_state.c|  3 +-
 src/mesa/drivers/dri/i965/gen7_vs_state.c  |  6 ++--
 src/mesa/drivers/dri/i965/gen7_wm_state.c  | 24 
 src/mesa/drivers/dri/i965/gen8_blend_state.c   | 14 +++---
 src/mesa/drivers/dri/i965/gen8_draw_upload.c   |  6 ++--
 src/mesa/drivers/dri/i965/gen8_gs_state.c  |  6 ++--
 src/mesa/drivers/dri/i965/gen8_misc_state.c|  3 +-
 src/mesa/drivers/dri/i965/gen8_multisample_state.c |  4 +--
 src/mesa/drivers/dri/i965/gen8_ps_state.c  | 11 +---
 src/mesa/drivers/dri/i965/gen8_sf_state.c  |  5 +++-
 src/mesa/drivers/dri/i965/gen8_viewport_state.c|  3 +-
 src/mesa/drivers/dri/i965/gen8_vs_state.c  |  6 ++--
 src/mesa/drivers/dri/i965/gen8_wm_depth_stencil.c  |  4 ++-
 52 files changed, 355 insertions(+), 279 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
b/src/mesa/drivers/dri/i965/brw_binding_tables.c
index cb50d3b..bf99689 100644
--- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
+++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
@@ -130,7 +130,8 @@ brw_upload_wm_binding_table(struct brw_context *brw)
 const struct brw_tracked_state brw_wm_binding_table = {
.dirty = {
   .mesa = 0,
-  .brw = BRW_NEW_BATCH | BRW_NEW_SURFACES,
+  .brw = BRW_NEW_BATCH |
+ BRW_NEW_SURFACES,
   .cache = CACHE_NEW_WM_PROG
},
.emit = brw_upload_wm_binding_table,
@@ -189,11 +190,11 @@ gen4_upload_binding_table_pointers(struct brw_context 
*brw)
 const struct brw_tracked_state brw_binding_table_pointers = {
.dirty = {
   .mesa = 0,
-  .brw = (BRW_NEW_BATCH |
-  BRW_NEW_STATE_BASE_ADDRESS |
-  BRW_NEW_VS_BINDING_TABLE |
-  BRW_NEW_GS_BINDING_TABLE |
-

[Mesa-dev] [PATCH 00/10] i965/state: Merge cache and brw flags.

2014-11-25 Thread Kenneth Graunke
Hello,

This series does some longstanding cleaning I've been meaning to do
in the i965 state upload code.  The distinction between BRW_NEW_* and
CACHE_NEW_* flags has been pretty arbitrary for a while - 10/17 of
them were for things we stopped caching years ago.  So, I moved
those to be BRW_NEW_* bits, and combined a bunch of redundant ones
while I was at it.

Patches 1-6 move non-cache-related things out of .cache, along with
other tidying.  This actually could save up to 160 bytes of memory
per context (on 64-bit), because cache types have auxiliary compare
and free function pointers...which weren't used at all for these.
(I haven't actually measured this - just eliminated the fields).
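
The 160-byte figure can be reproduced with a little arithmetic, assuming
each cache id carries exactly two function pointers (aux_compare and
aux_free) and that the "10/17" above means ten ids are removed out of
seventeen.  The helper below is a hypothetical sketch, not driver code,
and the pointer size is whatever the build target uses.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption taken from the text above: 17 cache ids before the series. */
#define SKETCH_TOTAL_CACHE_IDS 17

/* Assumption: two pointer-sized callbacks (compare and free) per id. */
#define SKETCH_POINTERS_PER_ID 2

/* Bytes no longer needed per context once removed_ids ids are dropped. */
static inline size_t
sketch_bytes_saved(size_t removed_ids)
{
   assert(removed_ids <= SKETCH_TOTAL_CACHE_IDS);
   return removed_ids * SKETCH_POINTERS_PER_ID * sizeof(void *);
}
```

On a 64-bit build, sketch_bytes_saved(10) gives 10 * 2 * 8 = 160 bytes.
Dropping three ids gives 6 * sizeof(void *) and dropping one gives
2 * sizeof(void *), matching the savings quoted in patches 04 and 06.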

Patches 7-10 take it a step further, and kill off the cache bitset
altogether.  A while back, I was looking at callgrind graphs for Glamor,
trying to reduce brw_state_upload costs.  One of the places where I saw
cycles being wasted was in check_state(), which sees if each atom needs
to be emitted.  Eliminating cache should eliminate 1/4 of the cycles
spent there, and every little bit helps.

I also like the new names - BRW_NEW_VERTEX_PROGRAM vs CACHE_NEW_VS_PROG
was always confusing - which is which, and why should I use one or the
other?  BRW_NEW_VS_PROG_DATA is clearly tied to brw_vs_prog_data.

No regressions on 965, GM45, Ironlake, Sandybridge GT1/2, Ivybridge GT1/2,
or Haswell GT3e.  I really should check Broadwell before pushing, but
haven't yet.

This is available as the 'state-kill-cache' branch of my tree.
It depends on the ddx/ddy cleanups I sent yesterday.

--Ken




[Mesa-dev] [PATCH 07/10] i965: Add _CACHE_ in brw_cache_id enum names.

2014-11-25 Thread Kenneth Graunke
BRW_CACHE_VS_PROG is more easily associated with program caches than
plain BRW_VS_PROG.

While we're at it, rename BRW_WM_PROG to BRW_CACHE_FS_PROG, to move away
from the outdated Windowizer/Masker name.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp |  4 ++--
 src/mesa/drivers/dri/i965/brw_clip.c |  4 ++--
 src/mesa/drivers/dri/i965/brw_context.h  | 28 ++--
 src/mesa/drivers/dri/i965/brw_ff_gs.c|  4 ++--
 src/mesa/drivers/dri/i965/brw_gs.c   |  4 ++--
 src/mesa/drivers/dri/i965/brw_sf.c   |  4 ++--
 src/mesa/drivers/dri/i965/brw_state_cache.c  | 12 ++--
 src/mesa/drivers/dri/i965/brw_state_dump.c   | 14 +++---
 src/mesa/drivers/dri/i965/brw_vs.c   |  6 +++---
 src/mesa/drivers/dri/i965/brw_wm.c   |  6 +++---
 10 files changed, 43 insertions(+), 43 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index 844f5e4..a103af0 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -2119,14 +2119,14 @@ brw_blorp_blit_params::get_wm_prog(struct brw_context 
*brw,
brw_blorp_prog_data **prog_data) const
 {
uint32_t prog_offset = 0;
-   if (!brw_search_cache(&brw->cache, BRW_BLORP_BLIT_PROG,
+   if (!brw_search_cache(&brw->cache, BRW_CACHE_BLORP_BLIT_PROG,
  &this->wm_prog_key, sizeof(this->wm_prog_key),
  &prog_offset, prog_data)) {
   brw_blorp_blit_program prog(brw, &this->wm_prog_key,
   INTEL_DEBUG & DEBUG_BLORP);
   GLuint program_size;
   const GLuint *program = prog.compile(brw, &program_size);
-  brw_upload_cache(&brw->cache, BRW_BLORP_BLIT_PROG,
+  brw_upload_cache(&brw->cache, BRW_CACHE_BLORP_BLIT_PROG,
 &this->wm_prog_key, sizeof(this->wm_prog_key),
 program, program_size,
 &prog.prog_data, sizeof(prog.prog_data),
diff --git a/src/mesa/drivers/dri/i965/brw_clip.c 
b/src/mesa/drivers/dri/i965/brw_clip.c
index debeee5..3fef38c 100644
--- a/src/mesa/drivers/dri/i965/brw_clip.c
+++ b/src/mesa/drivers/dri/i965/brw_clip.c
@@ -122,7 +122,7 @@ static void compile_clip_prog( struct brw_context *brw,
}
 
brw_upload_cache(&brw->cache,
-   BRW_CLIP_PROG,
+   BRW_CACHE_CLIP_PROG,
&c.key, sizeof(c.key),
program, program_size,
&c.prog_data, sizeof(c.prog_data),
@@ -248,7 +248,7 @@ brw_upload_clip_prog(struct brw_context *brw)
   }
}
 
-   if (!brw_search_cache(&brw->cache, BRW_CLIP_PROG,
+   if (!brw_search_cache(&brw->cache, BRW_CACHE_CLIP_PROG,
 &key, sizeof(key),
 &brw->clip.prog_offset, &brw->clip.prog_data)) {
   compile_clip_prog( brw, &key );
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index fac22dc..c4e96de 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -693,13 +693,13 @@ struct brw_gs_prog_data
 #define SHADER_TIME_STRIDE 64
 
 enum brw_cache_id {
-   BRW_WM_PROG,
-   BRW_BLORP_BLIT_PROG,
-   BRW_SF_PROG,
-   BRW_VS_PROG,
-   BRW_FF_GS_PROG,
-   BRW_GS_PROG,
-   BRW_CLIP_PROG,
+   BRW_CACHE_FS_PROG,
+   BRW_CACHE_BLORP_BLIT_PROG,
+   BRW_CACHE_SF_PROG,
+   BRW_CACHE_VS_PROG,
+   BRW_CACHE_FF_GS_PROG,
+   BRW_CACHE_GS_PROG,
+   BRW_CACHE_CLIP_PROG,
 
BRW_MAX_CACHE
 };
@@ -777,13 +777,13 @@ enum shader_time_shader_type {
 
 /* Flags for brw->state.cache.
  */
-#define CACHE_NEW_WM_PROG (1<<BRW_WM_PROG)
-#define CACHE_NEW_BLORP_BLIT_PROG (1<<BRW_BLORP_BLIT_PROG)
-#define CACHE_NEW_SF_PROG (1<<BRW_SF_PROG)
-#define CACHE_NEW_VS_PROG (1<<BRW_VS_PROG)
-#define CACHE_NEW_FF_GS_PROG (1<<BRW_FF_GS_PROG)
-#define CACHE_NEW_GS_PROG (1<<BRW_GS_PROG)
-#define CACHE_NEW_CLIP_PROG (1<<BRW_CLIP_PROG)
+#define CACHE_NEW_WM_PROG (1 << BRW_CACHE_FS_PROG)
+#define CACHE_NEW_BLORP_BLIT_PROG (1 << BRW_CACHE_BLORP_BLIT_PROG)
+#define CACHE_NEW_SF_PROG (1 << BRW_CACHE_SF_PROG)
+#define CACHE_NEW_VS_PROG (1 << BRW_CACHE_VS_PROG)
+#define CACHE_NEW_FF_GS_PROG (1 << BRW_CACHE_FF_GS_PROG)
+#define CACHE_NEW_GS_PROG (1 << BRW_CACHE_GS_PROG)
+#define CACHE_NEW_CLIP_PROG (1 << BRW_CACHE_CLIP_PROG)
 
 struct brw_vertex_buffer {
/** Buffer object containing the uploaded vertex data */
diff --git a/src/mesa/drivers/dri/i965/brw_ff_gs.c 
b/src/mesa/drivers/dri/i965/brw_ff_gs.c
index 377e1fa..d212438 100644
--- a/src/mesa/drivers/dri/i965/brw_ff_gs.c
+++ b/src/mesa/drivers/dri/i965/brw_ff_gs.c
@@ -139,7 +139,7 @@ static void compile_ff_gs_prog(struct brw_context *brw,
   

[Mesa-dev] [PATCH 10/10] i965: Delete brw_state_flags::cache and related code.

2014-11-25 Thread Kenneth Graunke
It's been merged into brw_state_flags::brw for simplicity and
efficiency.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_binding_tables.c |  2 --
 src/mesa/drivers/dri/i965/brw_blorp.cpp|  1 -
 src/mesa/drivers/dri/i965/brw_cc.c |  3 ---
 src/mesa/drivers/dri/i965/brw_context.h| 12 +---
 src/mesa/drivers/dri/i965/brw_draw_upload.c|  2 --
 src/mesa/drivers/dri/i965/brw_misc_state.c |  8 
 src/mesa/drivers/dri/i965/brw_primitive_restart.c  |  1 -
 src/mesa/drivers/dri/i965/brw_sampler_state.c  |  3 ---
 src/mesa/drivers/dri/i965/brw_sf_state.c   |  1 -
 src/mesa/drivers/dri/i965/brw_state_cache.c|  1 -
 src/mesa/drivers/dri/i965/brw_state_upload.c   | 20 +++-
 src/mesa/drivers/dri/i965/brw_vs.c |  1 -
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c   |  3 ---
 src/mesa/drivers/dri/i965/gen6_cc.c|  2 --
 src/mesa/drivers/dri/i965/gen6_depthstencil.c  |  1 -
 src/mesa/drivers/dri/i965/gen6_multisample_state.c |  1 -
 src/mesa/drivers/dri/i965/gen6_sampler_state.c |  1 -
 src/mesa/drivers/dri/i965/gen6_scissor_state.c |  1 -
 src/mesa/drivers/dri/i965/gen6_sol.c   |  2 --
 src/mesa/drivers/dri/i965/gen6_viewport_state.c|  3 ---
 src/mesa/drivers/dri/i965/gen7_disable.c   |  1 -
 src/mesa/drivers/dri/i965/gen7_misc_state.c|  1 -
 src/mesa/drivers/dri/i965/gen7_sf_state.c  |  1 -
 src/mesa/drivers/dri/i965/gen7_urb.c   |  1 -
 src/mesa/drivers/dri/i965/gen7_viewport_state.c|  1 -
 src/mesa/drivers/dri/i965/gen8_blend_state.c   |  2 --
 src/mesa/drivers/dri/i965/gen8_disable.c   |  1 -
 src/mesa/drivers/dri/i965/gen8_draw_upload.c   |  2 --
 src/mesa/drivers/dri/i965/gen8_misc_state.c|  1 -
 src/mesa/drivers/dri/i965/gen8_multisample_state.c |  1 -
 src/mesa/drivers/dri/i965/gen8_sf_state.c  |  2 --
 src/mesa/drivers/dri/i965/gen8_sol_state.c |  1 -
 src/mesa/drivers/dri/i965/gen8_viewport_state.c|  1 -
 src/mesa/drivers/dri/i965/gen8_wm_depth_stencil.c  |  1 -
 34 files changed, 4 insertions(+), 82 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
b/src/mesa/drivers/dri/i965/brw_binding_tables.c
index 7ffd7b2..ea82e71 100644
--- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
+++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
@@ -195,7 +195,6 @@ const struct brw_tracked_state brw_binding_table_pointers = 
{
  BRW_NEW_PS_BINDING_TABLE |
  BRW_NEW_STATE_BASE_ADDRESS |
  BRW_NEW_VS_BINDING_TABLE,
-  .cache = 0,
},
.emit = gen4_upload_binding_table_pointers,
 };
@@ -232,7 +231,6 @@ const struct brw_tracked_state gen6_binding_table_pointers 
= {
  BRW_NEW_PS_BINDING_TABLE |
  BRW_NEW_STATE_BASE_ADDRESS |
  BRW_NEW_VS_BINDING_TABLE,
-  .cache = 0,
},
.emit = gen6_upload_binding_table_pointers,
 };
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp.cpp
index 20ce7b7..df00b77 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
@@ -277,7 +277,6 @@ retry:
 * rendering tracks for GL.
 */
brw->state.dirty.brw = ~0ull;
-   brw->state.dirty.cache = ~0;
brw->no_depth_or_stencil = false;
brw->ib.type = -1;
 
diff --git a/src/mesa/drivers/dri/i965/brw_cc.c 
b/src/mesa/drivers/dri/i965/brw_cc.c
index 01974e1..02f5a3a 100644
--- a/src/mesa/drivers/dri/i965/brw_cc.c
+++ b/src/mesa/drivers/dri/i965/brw_cc.c
@@ -77,7 +77,6 @@ const struct brw_tracked_state brw_cc_vp = {
   .mesa = _NEW_TRANSFORM |
   _NEW_VIEWPORT,
   .brw = BRW_NEW_BATCH,
-  .cache = 0
},
.emit = brw_upload_cc_vp
 };
@@ -250,7 +249,6 @@ const struct brw_tracked_state brw_cc_unit = {
   .brw = BRW_NEW_BATCH |
  BRW_NEW_CC_VP |
  BRW_NEW_STATS_WM,
-  .cache = 0
},
.emit = upload_cc_unit,
 };
@@ -272,7 +270,6 @@ const struct brw_tracked_state brw_blend_constant_color = {
.dirty = {
   .mesa = _NEW_COLOR,
   .brw = BRW_NEW_CONTEXT,
-  .cache = 0
},
.emit = upload_blend_constant_color
 };
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index e60c054..2bc61d6 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -281,15 +281,6 @@ struct brw_state_flags {
 * State update flags signalled as the result of brw_tracked_state updates
 */
uint64_t brw;
-   /**
-* State update flags that used to be signalled by brw_state_cache.c
-* searches.
-*
-* Now almost all of that state is just streamed out on demand, but the
-* flags for those state blobs updating have stayed in the same bitfield.
-* brw_state_cache.c still flags 

[Mesa-dev] [Bug 86701] [regression] weston-simple-egl not running anymore inside qemu

2014-11-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=86701

Bug ID: 86701
   Summary: [regression] weston-simple-egl not running anymore
inside qemu
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: EGL
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: jjar...@gnome.org

Hi,

after the removal of the --enable-gallium-egl option [1] I'm unable to run this
egl demo.

The removal was discussed here: [2]

[1]
http://git.baserock.org/cgi-bin/cgit.cgi/delta/mesa.git/commit/?h=10.4&id=c46c551c56f78c6bf9e63524c89478695fc4f525

[2] http://lists.freedesktop.org/archives/mesa-dev/2014-November/070204.html

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 86701] [regression] weston-simple-egl not running anymore inside qemu

2014-11-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=86701

Pekka Paalanen ppaala...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ppaala...@gmail.com



[Mesa-dev] [Bug 86701] [regression] weston-simple-egl not running anymore inside qemu

2014-11-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=86701

--- Comment #1 from Pekka Paalanen ppaala...@gmail.com ---
egl_gallium.so was able to use llvmpipe and direct that rendering to wl_shm
based buffers, and therefore does not need any EGL support from the compositor.

egl_dri2 does not implement that.

Could egl_dri2 be reworked to implement it? What other options are there?

If someone suggests VGEM, then that will require special support in the
compositor, and I will ask how compositors are supposed to support that and
what do you do on systems with kernels that don't support VGEM.



Re: [Mesa-dev] [PATCH] mesa/glsl/glapi: enable GL_EXT_draw_buffers extension

2014-11-25 Thread Ilia Mirkin
On Tue, Nov 25, 2014 at 6:23 AM, Tapani Pälli tapani.pa...@intel.com wrote:
> Patch enables ES2 extension that utilizes existing ES3 functionality.
>
> Changes make all the subtests to run and pass in WebGL conformance
> test 'webgl-draw-buffers' when running Chrome on OpenGL ES.
>
> Signed-off-by: Tapani Pälli tapani.pa...@intel.com
> ---
>  src/glsl/glcpp/glcpp-parse.y    | 2 ++
>  src/glsl/glsl_parser_extras.cpp | 1 +
>  src/glsl/glsl_parser_extras.h   | 2 ++
>  src/mapi/glapi/gen/es_EXT.xml   | 9 +
>  src/mesa/main/extensions.c      | 1 +
>  src/mesa/main/mtypes.h          | 1 +
>  6 files changed, 16 insertions(+)
>
> diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
> index f1119eb..414f4df 100644
> --- a/src/glsl/glcpp/glcpp-parse.y
> +++ b/src/glsl/glcpp/glcpp-parse.y
> @@ -2380,6 +2380,8 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio
>  add_builtin_define(parser, "GL_OES_EGL_image_external", 1);
>   if (extensions->OES_standard_derivatives)
>   add_builtin_define(parser, "GL_OES_standard_derivatives", 1);
> + if (extensions->EXT_draw_buffers)
> +add_builtin_define(parser, "GL_EXT_draw_buffers", 1);

It appears that you used tabs here instead of spaces. Also, nothing
was ever setting extensions-EXT_draw_buffers and you were enabling
the extension unconditionally. As such, the define should be added
unconditionally... Or am I misunderstanding?

> }
> } else {
> add_builtin_define(parser, "GL_ARB_draw_buffers", 1);
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 7389baa..c798737 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -3785,6 +3785,7 @@ struct gl_extensions
>     GLboolean EXT_blend_func_separate;
>     GLboolean EXT_blend_minmax;
>     GLboolean EXT_depth_bounds_test;
> +   GLboolean EXT_draw_buffers;

This does not appear to be used anywhere... why did you add it?


[Mesa-dev] [Bug 86701] [regression] weston-simple-egl not running anymore inside qemu

2014-11-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=86701

Javier Jardón jjar...@gnome.org changed:

   What|Removed |Added

 CC||e...@anholt.net,
   ||k...@bitplanet.net,
   ||mar...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 0/7] util: Move u_atomic.h to src/util and modify API

2014-11-25 Thread Jose Fonseca

On 25/11/14 00:39, Matt Turner wrote:

I've got some thread-safety fixes queued up after this and thought I'd
be a good Mesa citizen and pull some code into src/util.


Thanks for going the extra mile!


I did some clean ups like replacing INLINE (MSVC knows about inline
these days, right?) and used stdbool.h instead of the boolean type.


No, at least MSVC 2012 doesn't have `inline` keyword when compiling C 
files, and requires a


  #if !defined(__cplusplus) && !defined(inline)
  #define inline __inline
  #endif

somewhere.  Anyway, there are no `INLINES` nor `inlines` left after your 
series, so we're good.


stdbool.h is fine -- we include our own when MSVC doesn't


I also removed the inline assembly implementations because they were
either dead code, or only allowed *ancient* gcc to build Mesa and
because I didn't want to update them for the next patch, which makes
the API consist of some macros that internally do the right operation
based on the type.

The last patch looks funky, but I think it's actually a reasonable
solution. I don't have MSVC or Sun Studio, so please give this a
test.


I had to do a few tweaks to get things building on MSVC properly.

I pushed my changes to

  http://cgit.freedesktop.org/~jrfonseca/mesa/log/?h=u_atomic

I need to do a few more tests, but all looks feasible so far -- I don't 
get any warnings with MSVC and I believe that the generated code quality 
should be exactly the same.


And it is indeed a nice cleanup.


BTW, we could rename the macros to something not allusive to gallium 
(ie, remove pipe).  (We could even match the C11 stdatomic.h until the 
C runtime provides them, like we're doing with thread.h.)  Anyway, this 
is just cosmetic, so it can wait.



Jose


Re: [Mesa-dev] [PATCH 1/2] configure.ac: remove enable flags for EGL and GBM Gallium state trackers

2014-11-25 Thread Pekka Paalanen
On Tue,  4 Nov 2014 23:42:44 +0100
Marek Olšák mar...@gmail.com wrote:

 From: Marek Olšák marek.ol...@amd.com

Btw. would have been *really* nice if the commit message here explained
why you do this and what all things it intentionally breaks, before
pushing it.


Thanks,
pq


 ---
  configure.ac| 74 
 +++--
  docs/egl.html   |  7 -
  src/gallium/Makefile.am |  8 --
  3 files changed, 4 insertions(+), 85 deletions(-)
 
 diff --git a/configure.ac b/configure.ac
 index fc7d372..91e111b 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -697,20 +697,6 @@ AC_ARG_ENABLE([xlib-glx],
  [make GLX library Xlib-based instead of DRI-based 
 @:@default=disabled@:@])],
  [enable_xlib_glx=$enableval],
  [enable_xlib_glx=no])
 -AC_ARG_ENABLE([gallium-egl],
 -[AS_HELP_STRING([--enable-gallium-egl],
 -[enable optional EGL state tracker (not required
 - for EGL support in Gallium with OpenGL and OpenGL ES)
 - @:@default=disabled@:@])],
 -[enable_gallium_egl=$enableval],
 -[enable_gallium_egl=no])
 -AC_ARG_ENABLE([gallium-gbm],
 -[AS_HELP_STRING([--enable-gallium-gbm],
 -[enable optional gbm state tracker (not required for
 - gbm support in Gallium)
 - @:@default=auto@:@])],
 -[enable_gallium_gbm=$enableval],
 -[enable_gallium_gbm=auto])
  
  AC_ARG_ENABLE([r600-llvm-compiler],
  [AS_HELP_STRING([--enable-r600-llvm-compiler],
 @@ -1314,51 +1300,6 @@ AM_CONDITIONAL(HAVE_EGL, test x$enable_egl = xyes)
  AC_SUBST([EGL_LIB_DEPS])
  
  dnl
 -dnl EGL Gallium configuration
 -dnl
 -if test x$enable_gallium_egl = xyes; then
 -if test -z $with_gallium_drivers; then
 -AC_MSG_ERROR([cannot enable egl_gallium without Gallium])
 -fi
 -if test x$enable_egl = xno; then
 -AC_MSG_ERROR([cannot enable egl_gallium without EGL])
 -fi
 -if test x$have_libdrm != xyes; then
 -AC_MSG_ERROR([egl_gallium requires libdrm >= $LIBDRM_REQUIRED])
 -fi
 -# XXX: Uncomment once converted to use static/shared pipe-drivers
 -#enable_gallium_loader=$enable_shared_pipe_drivers
 -fi
 -AM_CONDITIONAL(HAVE_GALLIUM_EGL, test x$enable_gallium_egl = xyes)
 -
 -dnl
 -dnl gbm Gallium configuration
 -dnl
 -if test x$enable_gallium_gbm = xauto; then
 -case $enable_gbm$enable_gallium_egl$enable_dri$with_egl_platforms in
 -yesyesyes*drm*)
 -enable_gallium_gbm=yes ;;
 - *)
 -enable_gallium_gbm=no ;;
 -esac
 -fi
 -if test x$enable_gallium_gbm = xyes; then
 -if test -z $with_gallium_drivers; then
 -AC_MSG_ERROR([cannot enable gbm_gallium without Gallium])
 -fi
 -if test x$enable_gbm = xno; then
 -AC_MSG_ERROR([cannot enable gbm_gallium without gbm])
 -fi
 -
 -if test x$enable_gallium_egl != xyes; then
 -AC_MSG_ERROR([gbm_gallium is only used by egl_gallium])
 -fi
 -
 -enable_gallium_loader=$enable_shared_pipe_drivers
 -fi
 -AM_CONDITIONAL(HAVE_GALLIUM_GBM, test x$enable_gallium_gbm = xyes)
 -
 -dnl
  dnl XA configuration
  dnl
  if test x$enable_xa = xyes; then
 @@ -1386,9 +1327,9 @@ if test x$enable_openvg = xyes; then
  if test -z $with_gallium_drivers; then
  AC_MSG_ERROR([cannot enable OpenVG without Gallium])
  fi
 -if test x$enable_gallium_egl = xno; then
 -AC_MSG_ERROR([cannot enable OpenVG without egl_gallium])
 -fi
 +
 +AC_MSG_ERROR([Cannot enable OpenVG, because egl_gallium has been removed 
 and
 +  OpenVG hasn't been integrated into standard libEGL yet])
  
  EGL_CLIENT_APIS=$EGL_CLIENT_APIS '$(VG_LIB)'
  VG_LIB_DEPS=$VG_LIB_DEPS $SELINUX_LIBS $PTHREAD_LIBS
 @@ -2170,8 +2111,6 @@ AC_CONFIG_FILES([Makefile
   src/gallium/drivers/vc4/kernel/Makefile
   src/gallium/state_trackers/clover/Makefile
   src/gallium/state_trackers/dri/Makefile
 - src/gallium/state_trackers/egl/Makefile
 - src/gallium/state_trackers/gbm/Makefile
   src/gallium/state_trackers/glx/xlib/Makefile
   src/gallium/state_trackers/omx/Makefile
   src/gallium/state_trackers/osmesa/Makefile
 @@ -2307,12 +2246,7 @@ if test $enable_egl = yes; then
  egl_drivers=$egl_drivers builtin:egl_dri2
  fi
  
 -if test x$enable_gallium_egl = xyes; then
 -echo EGL drivers:${egl_drivers} egl_gallium
 -echo EGL Gallium STs:$EGL_CLIENT_APIS
 -else
 -echo EGL drivers:$egl_drivers
 -fi
 +echo EGL drivers:$egl_drivers
  fi
  
  echo 
 diff --git a/docs/egl.html b/docs/egl.html
 index eebb8c7..e77c235 100644
 --- a/docs/egl.html
 +++ b/docs/egl.html
 @@ -77,13 +77,6 @@ drivers will be installed to <code>${libdir}/egl</code>.</p>
  
  </dd>
  
 -<dt><code>--enable-gallium-egl</code></dt>
 -<dd>
 -
 -<p>Enable the optional <code>egl_gallium</code> driver.</p>
 -
 -</dd>
 -
  

Re: [Mesa-dev] [PATCH 1/2] configure.ac: remove enable flags for EGL and GBM Gallium state trackers

2014-11-25 Thread Marek Olšák
Hi Pekka,

I explained in the introductory thread "[PATCH 0/2] Disable the EGL
state tracker for Linux/DRI builds" that it may break swrast
(llvmpipe) on Wayland. I also explained all the reasons why I was
about to do it. Does it break something not discussed before?

Marek

On Tue, Nov 25, 2014 at 3:49 PM, Pekka Paalanen ppaala...@gmail.com wrote:
 On Tue,  4 Nov 2014 23:42:44 +0100
 Marek Olšák mar...@gmail.com wrote:

 From: Marek Olšák marek.ol...@amd.com

 Btw. would have been *really* nice if the commit message here explained
 why you do this and what all things it intentionally breaks, before
 pushing it.


 Thanks,
 pq


 ---
  configure.ac| 74 
 +++--
  docs/egl.html   |  7 -
  src/gallium/Makefile.am |  8 --
  3 files changed, 4 insertions(+), 85 deletions(-)

 diff --git a/configure.ac b/configure.ac
 index fc7d372..91e111b 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -697,20 +697,6 @@ AC_ARG_ENABLE([xlib-glx],
  [make GLX library Xlib-based instead of DRI-based 
 @:@default=disabled@:@])],
  [enable_xlib_glx=$enableval],
  [enable_xlib_glx=no])
 -AC_ARG_ENABLE([gallium-egl],
 -[AS_HELP_STRING([--enable-gallium-egl],
 -[enable optional EGL state tracker (not required
 - for EGL support in Gallium with OpenGL and OpenGL ES)
 - @:@default=disabled@:@])],
 -[enable_gallium_egl=$enableval],
 -[enable_gallium_egl=no])
 -AC_ARG_ENABLE([gallium-gbm],
 -[AS_HELP_STRING([--enable-gallium-gbm],
 -[enable optional gbm state tracker (not required for
 - gbm support in Gallium)
 - @:@default=auto@:@])],
 -[enable_gallium_gbm=$enableval],
 -[enable_gallium_gbm=auto])

  AC_ARG_ENABLE([r600-llvm-compiler],
  [AS_HELP_STRING([--enable-r600-llvm-compiler],
 @@ -1314,51 +1300,6 @@ AM_CONDITIONAL(HAVE_EGL, test x$enable_egl = xyes)
  AC_SUBST([EGL_LIB_DEPS])

  dnl
 -dnl EGL Gallium configuration
 -dnl
 -if test x$enable_gallium_egl = xyes; then
 -if test -z $with_gallium_drivers; then
 -AC_MSG_ERROR([cannot enable egl_gallium without Gallium])
 -fi
 -if test x$enable_egl = xno; then
 -AC_MSG_ERROR([cannot enable egl_gallium without EGL])
 -fi
 -if test x$have_libdrm != xyes; then
 -AC_MSG_ERROR([egl_gallium requires libdrm >= $LIBDRM_REQUIRED])
 -fi
 -# XXX: Uncomment once converted to use static/shared pipe-drivers
 -#enable_gallium_loader=$enable_shared_pipe_drivers
 -fi
 -AM_CONDITIONAL(HAVE_GALLIUM_EGL, test x$enable_gallium_egl = xyes)
 -
 -dnl
 -dnl gbm Gallium configuration
 -dnl
 -if test x$enable_gallium_gbm = xauto; then
 -case $enable_gbm$enable_gallium_egl$enable_dri$with_egl_platforms in
 -yesyesyes*drm*)
 -enable_gallium_gbm=yes ;;
 - *)
 -enable_gallium_gbm=no ;;
 -esac
 -fi
 -if test x$enable_gallium_gbm = xyes; then
 -if test -z $with_gallium_drivers; then
 -AC_MSG_ERROR([cannot enable gbm_gallium without Gallium])
 -fi
 -if test x$enable_gbm = xno; then
 -AC_MSG_ERROR([cannot enable gbm_gallium without gbm])
 -fi
 -
 -if test x$enable_gallium_egl != xyes; then
 -AC_MSG_ERROR([gbm_gallium is only used by egl_gallium])
 -fi
 -
 -enable_gallium_loader=$enable_shared_pipe_drivers
 -fi
 -AM_CONDITIONAL(HAVE_GALLIUM_GBM, test x$enable_gallium_gbm = xyes)
 -
 -dnl
  dnl XA configuration
  dnl
  if test x$enable_xa = xyes; then
 @@ -1386,9 +1327,9 @@ if test x$enable_openvg = xyes; then
  if test -z $with_gallium_drivers; then
  AC_MSG_ERROR([cannot enable OpenVG without Gallium])
  fi
 -if test x$enable_gallium_egl = xno; then
 -AC_MSG_ERROR([cannot enable OpenVG without egl_gallium])
 -fi
 +
 +AC_MSG_ERROR([Cannot enable OpenVG, because egl_gallium has been 
 removed and
 +  OpenVG hasn't been integrated into standard libEGL yet])

  EGL_CLIENT_APIS=$EGL_CLIENT_APIS '$(VG_LIB)'
  VG_LIB_DEPS=$VG_LIB_DEPS $SELINUX_LIBS $PTHREAD_LIBS
 @@ -2170,8 +2111,6 @@ AC_CONFIG_FILES([Makefile
   src/gallium/drivers/vc4/kernel/Makefile
   src/gallium/state_trackers/clover/Makefile
   src/gallium/state_trackers/dri/Makefile
 - src/gallium/state_trackers/egl/Makefile
 - src/gallium/state_trackers/gbm/Makefile
   src/gallium/state_trackers/glx/xlib/Makefile
   src/gallium/state_trackers/omx/Makefile
   src/gallium/state_trackers/osmesa/Makefile
 @@ -2307,12 +2246,7 @@ if test $enable_egl = yes; then
  egl_drivers=$egl_drivers builtin:egl_dri2
  fi

 -if test x$enable_gallium_egl = xyes; then
 -echo EGL drivers:${egl_drivers} egl_gallium
 -echo EGL Gallium STs:$EGL_CLIENT_APIS
 -else
 -echo EGL drivers:$egl_drivers
 -fi
 +echo EGL drivers:$egl_drivers
  fi

 

[Mesa-dev] [Bug 78914] [llvmpipe] Front/Backfaces do not get the same depth values when interpolated

2014-11-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=78914

José Fonseca jfons...@vmware.com changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #22 from José Fonseca jfons...@vmware.com ---
OK!

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 00/10] i965/state: Merge cache and brw flags.

2014-11-25 Thread Kristian Høgsberg
On Tue, Nov 25, 2014 at 4:43 AM, Kenneth Graunke kenn...@whitecape.org wrote:
 Hello,

 This series does some longstanding cleaning I've been meaning to do
 in the i965 state upload code.  The distinction between BRW_NEW_* and
 CACHE_NEW_* flags has been pretty arbitrary for a while - 10/17 of
 them were for things we stopped caching years ago.  So, I moved
 those to be BRW_NEW_* bits, and combined a bunch of redundant ones
 while I was at it.

 Patches 1-6 move non-cache-related things out of .cache, along with
 other tidying.  This actually could save up to 160 bytes of memory
 per context (on 64-bit), because cache types have auxiliary compare
 and free function pointers...which weren't used at all for these.
 (I haven't actually measured this - just eliminated the fields).

 Patches 7-10 take it a step further, and kill off the cache bitset
 altogether.  A while back, I was looking at callgrind graphs for Glamor,
 trying to reduce brw_state_upload costs.  One of the places where I saw
 cycles being wasted was in check_state(), which sees if each atom needs
 to be emitted.  Eliminating cache should eliminate 1/4 of the cycles
 spent there, and every little bit helps.

 I also like the new names - BRW_NEW_VERTEX_PROGRAM vs CACHE_NEW_VS_PROG
 was always confusing - which is which, and why should I use one or the
 other?  BRW_NEW_VS_PROG_DATA is clearly tied to brw_vs_prog_data.

Wow, nice, I like everything about this series.  One of the most
confusing things about the state mechanism in mesa is the .cache
stuff, which certainly comes down to most of the items not being
cached.  Also, BRW_NEW_PROG_DATA is a better name, but it doesn't as
clearly suggest that the kernel start pointer (prog_offset) also
changes.  BRW_NEW_VS_BINARY, perhaps?  The one comment I have is
regarding this:

+ * BRW_NEW_*_PROG_DATA does not occur quite as often, and is a strict subset.
+ * Multiple shader programs may have identical vertex shaders (for example),
+ * or compile down to the same code in the backend.  We combine those into
+ * a single program cache entry.  BRW_NEW_*_PROG_DATA occurs when switching
+ * program cache entries, which covers the brw_*_prog_data structures, and
+ * brw->*.prog_offset.

Isn't the main cause of BRW_NEW_*_PROG_DATA going to be switching to a
different kernel/prog_data combination because of non-orthogonal state
changes?

For the series, enthusiastically

Reviewed-by: Kristian Høgsberg k...@bitplanet.net


 No regressions on 965, GM45, Ironlake, Sandybridge GT1/2, Ivybridge GT1/2,
 or Haswell GT3e.  I really should check Broadwell before pushing, but
 haven't yet.

 This is available as the 'state-kill-cache' branch of my tree.
 It depends on the ddx/ddy cleanups I sent yesterday.

 --Ken




[Mesa-dev] [PATCH] meta: Fix saving the results of the current occlusion query

2014-11-25 Thread Neil Roberts
When restoring the current state in _mesa_meta_end it was previously trying to
copy the on-going sample count of the current occlusion query into the new
query after restarting it so that the driver will continue adding to the
previous value. This wouldn't work for two reasons. Firstly, the query might
not be ready yet so the Result member will usually be zero. Secondly the saved
query is stored as a pointer to the query object, not a copy of the struct, so
it is actually restarting the exact same object. Copying the result value is
just copying between identical addresses with no effect. The call to
_mesa_BeginQuery will have always reset it back to zero.

This patch fixes it by making it actually wait for the query object to be
ready before grabbing the previous result. The downside of doing this is that
it could introduce a stall but I think this situation is unlikely so it might
not matter too much. A better solution might be to introduce a real
suspend/resume mechanism to the driver interface. This could be implemented in
the i965 driver by saving the depth count multiple times like it does in the
i945 driver.
---
 src/mesa/drivers/common/meta.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

I've posted a piglit test to demonstrate the problem here:

http://lists.freedesktop.org/archives/piglit/2014-November/013479.html
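
For anyone who wants to see the aliasing problem in isolation, here is a
minimal standalone C sketch. It does not use any Mesa types: struct query,
begin_query(), wait_query(), restore_old() and restore_new() are made-up
stand-ins for gl_query_object, _mesa_BeginQuery(), the driver's WaitQuery
hook and the two versions of the restore code. The 'pending' field is purely
an assumption, used to model a sample count that has not been read back yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct gl_query_object. */
struct query {
   uint64_t result;   /* sample count, only valid once ready is true */
   uint64_t pending;  /* samples counted but not yet read back (assumption) */
   bool ready;
};

/* Stand-in for _mesa_BeginQuery(): restarting always resets the count. */
static void
begin_query(struct query *q)
{
   q->result = 0;
   q->pending = 0;
   q->ready = false;
}

/* Stand-in for ctx->Driver.WaitQuery(): make the result valid. */
static void
wait_query(struct query *q)
{
   q->result += q->pending;
   q->pending = 0;
   q->ready = true;
}

/* Old behaviour: the saved query is a pointer to the very object that gets
 * restarted, so the copy happens after the reset and copies zero onto zero. */
static void
restore_old(struct query *saved)
{
   struct query *current = saved;   /* same object, not a copy */
   begin_query(saved);
   current->result = saved->result;
}

/* Patched behaviour: wait, remember the old count, restart, add it back. */
static void
restore_new(struct query *saved)
{
   uint64_t result;
   if (!saved->ready)
      wait_query(saved);
   result = saved->result;
   begin_query(saved);
   saved->result += result;
}

/* Small self-check: five samples were counted before the meta operation. */
void
query_restore_example(void)
{
   struct query a = { 0, 5, false };
   struct query b = { 0, 5, false };

   restore_old(&a);
   restore_new(&b);

   assert(a.result == 0);   /* the earlier samples are lost */
   assert(b.result == 5);   /* the earlier samples are carried over */
}
```

The sketch also shows where the stall mentioned above comes from: in
restore_new() the wait has to happen before the query can be restarted.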

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 87532c1..ace1c71 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -825,15 +825,18 @@ _mesa_meta_end(struct gl_context *ctx)
    const GLbitfield state = save->SavedState;
    int i;
 
-   /* After starting a new occlusion query, initialize the results to the
-    * values saved previously. The driver will then continue to increment
-    * these values.
-    */
+   /* Grab the result of the old occlusion query before starting it again. The
+    * old result is added to the result of the new query so the driver will
+    * continue adding where it left off. */
    if (state & MESA_META_OCCLUSION_QUERY) {
       if (save->CurrentOcclusionObject) {
-         _mesa_BeginQuery(save->CurrentOcclusionObject->Target,
-                          save->CurrentOcclusionObject->Id);
-         ctx->Query.CurrentOcclusionObject->Result = save->CurrentOcclusionObject->Result;
+         struct gl_query_object *q = save->CurrentOcclusionObject;
+         GLuint64EXT result;
+         if (!q->Ready)
+            ctx->Driver.WaitQuery(ctx, q);
+         result = q->Result;
+         _mesa_BeginQuery(q->Target, q->Id);
+         ctx->Query.CurrentOcclusionObject->Result += result;
       }
    }
 
-- 
1.9.3



Re: [Mesa-dev] [PATCH 1/6] i965: Fix missing CACHE_NEW_WM_PROG in 3DSTATE_PS_EXTRA.

2014-11-25 Thread Jordan Justen
Series Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

On 2014-11-24 22:03:10, Kenneth Graunke wrote:
 brw->wm.prog_data is covered by CACHE_NEW_WM_PROG, not
 BRW_NEW_FRAGMENT_PROGRAM.  So, we should listen to it.
 
 However, I believe that BRW_NEW_FRAGMENT_PROGRAM is sufficient to cover
 all the necessary cases - CACHE_NEW_WM_PROG happens in a subset of
 cases.  So, the code being wrong shouldn't have triggered bugs.
 
 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/gen8_ps_state.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c b/src/mesa/drivers/dri/i965/gen8_ps_state.c
 index 3d3df19..7e3d78b 100644
 --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
 @@ -41,7 +41,7 @@ upload_ps_extra(struct brw_context *brw)
     if (fp->program.UsesKill)
        dw1 |= GEN8_PSX_KILL_ENABLE;
  
 -   /* BRW_NEW_FRAGMENT_PROGRAM */
 +   /* CACHE_NEW_WM_PROG */
     if (brw->wm.prog_data->num_varying_inputs != 0)
        dw1 |= GEN8_PSX_ATTRIBUTE_ENABLE;
  
 @@ -87,7 +87,7 @@ const struct brw_tracked_state gen8_ps_extra = {
     .dirty = {
        .mesa  = _NEW_MULTISAMPLE,
        .brw   = BRW_NEW_CONTEXT | BRW_NEW_FRAGMENT_PROGRAM | BRW_NEW_NUM_SAMPLES,
 -      .cache = 0,
 +      .cache = CACHE_NEW_WM_PROG,
     },
     .emit = upload_ps_extra,
  };
 -- 
 2.1.3
 


Re: [Mesa-dev] [PATCH] glsl: check if implicitly sized arrays dont match explicitly sized arrays across the same stage

2014-11-25 Thread Timothy Arceri
Whoops, I edited the wrong patch before sending this out. It was meant to
be an RFC and have the following comments attached:

 I came across this when I was writing arrays of arrays piglit tests.
 This change fixes the new link error single dimension array
 tests I sent to the piglit list [1].

 I've sent this as RFC as the error message needs some work.
 Currently it says "shader output `color' declared as type `vec4[2]' and
 type `vec4[]'". Any suggestions on how this should be reworded would be
 helpful (keeping in mind it will also need to apply to arrays of
 arrays).

 I've also run all the glsl piglit tests without any regressions.

 [1]
http://lists.freedesktop.org/archives/piglit/2014-November/013478.html
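
To make the intended rule easier to discuss, here is a rough standalone
sketch of the check in plain C. It is not the linker code: struct array_decl
and array_decls_compatible() are hypothetical names, and the branch for two
explicitly sized declarations is only an assumption added to make the
function total (the patch itself only touches the case where one side is
implicitly sized).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of one declaration of a global array. */
struct array_decl {
   unsigned length;            /* 0 means implicitly sized, e.g. vec4[] */
   unsigned max_array_access;  /* highest constant index used with it */
};

/* Returns true when the two declarations of the same name can be unified.
 * An explicit size must be strictly larger than the highest index that the
 * implicitly sized declaration is accessed with. */
static bool
array_decls_compatible(const struct array_decl *var,
                       const struct array_decl *existing)
{
   /* Assumption: two explicit sizes must simply match. */
   if (var->length != 0 && existing->length != 0)
      return var->length == existing->length;

   if (var->length != 0)
      return var->length > existing->max_array_access;

   if (existing->length != 0)
      return existing->length > var->max_array_access;

   /* Both implicitly sized: nothing to compare yet. */
   return true;
}

/* Small self-check mirroring the vec4[2] versus vec4[] case. */
void
array_decl_example(void)
{
   struct array_decl sized = { 2, 1 };        /* vec4 color[2]           */
   struct array_decl unsized_ok = { 0, 1 };   /* vec4 color[], uses [1]  */
   struct array_decl unsized_bad = { 0, 2 };  /* vec4 color[], uses [2]  */

   assert(array_decls_compatible(&sized, &unsized_ok));
   assert(!array_decls_compatible(&sized, &unsized_bad));
}
```

The comparison is symmetric, so the order in which the two shaders are seen
does not change the outcome.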

On Tue, 2014-11-25 at 23:32 +1100, Timothy Arceri wrote:
 Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
 ---
  src/glsl/linker.cpp | 19 ++-
  1 file changed, 18 insertions(+), 1 deletion(-)
 
 diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
 index de6b1fb..a3a43a0 100644
 --- a/src/glsl/linker.cpp
 +++ b/src/glsl/linker.cpp
 @@ -732,8 +732,25 @@ cross_validate_globals(struct gl_shader_program *prog,
                    && ((var->type->length == 0)
                        || (existing->type->length == 0))) {
                   if (var->type->length != 0) {
 +                     if (var->type->length <= existing->data.max_array_access) {
 +                        linker_error(prog, "%s `%s' declared as type "
 +                                     "`%s' and type `%s'\n",
 +                                     mode_string(var),
 +                                     var->name, var->type->name,
 +                                     existing->type->name);
 +                        return;
 +                     }
                      existing->type = var->type;
 -                  }
 +                  } else if (existing->type->length != 0
 +                             && existing->type->length <=
 +                                var->data.max_array_access) {
 +                     linker_error(prog, "%s `%s' declared as type "
 +                                  "`%s' and type `%s'\n",
 +                                  mode_string(var),
 +                                  var->name, existing->type->name,
 +                                  var->type->name);
 +                     return;
 +                  }
               } else if (var->type->is_record()
                          && existing->type->is_record()
                          && existing->type->record_compare(var->type)) {




Re: [Mesa-dev] [PATCH v2 13/16] i965: Move fs_visitor ra pass to new fs_visitor::allocate_registers()

2014-11-25 Thread Kristian Høgsberg
On Thu, Nov 13, 2014 at 6:38 PM, Connor Abbott cwabbo...@gmail.com wrote:
 On Thu, Nov 13, 2014 at 7:28 PM, Kristian Høgsberg k...@bitplanet.net wrote:
 This will be reused for the scalar VS pass.

 Signed-off-by: Kristian Høgsberg k...@bitplanet.net
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 132 
 +++
  src/mesa/drivers/dri/i965/brw_fs.h   |   1 +
  2 files changed, 71 insertions(+), 62 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index cb73b9f..4dce0a2 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3538,11 +3538,79 @@ fs_visitor::optimize()
 lower_uniform_pull_constant_loads();
  }

 +void
 +fs_visitor::allocate_registers()
 +{
 +   bool allocated_without_spills;
 +
 +   static enum instruction_scheduler_mode pre_modes[] = {
 +  SCHEDULE_PRE,
 +  SCHEDULE_PRE_NON_LIFO,
 +  SCHEDULE_PRE_LIFO,
 +   };
 +
 +   /* Try each scheduling heuristic to see if it can successfully register
 +* allocate without spilling.  They should be ordered by decreasing
 +* performance but increasing likelihood of allocating.
 +*/
 +   for (unsigned i = 0; i < ARRAY_SIZE(pre_modes); i++) {
 +  schedule_instructions(pre_modes[i]);
 +
 +  if (0) {
 + assign_regs_trivial();
 + allocated_without_spills = true;
 +  } else {
 + allocated_without_spills = assign_regs(false);
 +  }
 +  if (allocated_without_spills)
 + break;
 +   }
 +
 +   if (!allocated_without_spills) {
 +  /* We assume that any spilling is worse than just dropping back to
 +   * SIMD8.  There's probably actually some intermediate point where
 +   * SIMD16 with a couple of spills is still better.
 +   */
 +  if (dispatch_width == 16) {
 + fail("Failure to register allocate.  Reduce number of "
 +  "live scalar values to avoid this.");
 +  } else {
 + perf_debug("Fragment shader triggered register spilling.  "
 +"Try reducing the number of live scalar values to "
 +"improve performance.\n");

 Hmm, this warning will be pretty confusing once we start hitting this
 path for vertex shaders as well...

Right, I'll put the actual stage name there instead.

Kristian

 +  }
 +
 +  /* Since we're out of heuristics, just go spill registers until we
 +   * get an allocation.
 +   */
 +  while (!assign_regs(true)) {
 + if (failed)
 +break;
 +  }
 +   }
 +
 +   assert(force_uncompressed_stack == 0);
 +
 +   /* This must come after all optimization and register allocation, since
 +* it inserts dead code that happens to have side effects, and it does
 +* so based on the actual physical registers in use.
 +*/
 +   insert_gen4_send_dependency_workarounds();
 +
 +   if (failed)
 +  return;
 +
 +   if (!allocated_without_spills)
 +  schedule_instructions(SCHEDULE_POST);
 +
 +   if (last_scratch > 0)
 +  prog_data->total_scratch = brw_get_scratch_size(last_scratch);
 +}
 +
  bool
  fs_visitor::run()
  {
  sanity_param_count = prog->Parameters->NumParameters;
 -   bool allocated_without_spills;

 assign_binding_table_offsets();

 @@ -3555,7 +3623,6 @@ fs_visitor::run()
emit_dummy_fs();
  } else if (brw->use_rep_send && dispatch_width == 16) {
emit_repclear_shader();
 -  allocated_without_spills = true;
 } else {
 if (INTEL_DEBUG & DEBUG_SHADER_TIME)
   emit_shader_time_begin();
 @@ -3610,68 +3677,9 @@ fs_visitor::run()
assign_curb_setup();
assign_urb_setup();

 -  static enum instruction_scheduler_mode pre_modes[] = {
 - SCHEDULE_PRE,
 - SCHEDULE_PRE_NON_LIFO,
 - SCHEDULE_PRE_LIFO,
 -  };
 -
 -  /* Try each scheduling heuristic to see if it can successfully register
 -   * allocate without spilling.  They should be ordered by decreasing
 -   * performance but increasing likelihood of allocating.
 -   */
 -  for (unsigned i = 0; i < ARRAY_SIZE(pre_modes); i++) {
 - schedule_instructions(pre_modes[i]);
 -
 - if (0) {
 -assign_regs_trivial();
 -allocated_without_spills = true;
 - } else {
 -allocated_without_spills = assign_regs(false);
 - }
 - if (allocated_without_spills)
 -break;
 -  }
 -
 -  if (!allocated_without_spills) {
 - /* We assume that any spilling is worse than just dropping back to
 -  * SIMD8.  There's probably actually some intermediate point where
 -  * SIMD16 with a couple of spills is still better.
 -  */
 - if (dispatch_width == 16) {
 -fail("Failure to register allocate.  Reduce number of "
 - "live scalar values to avoid this.");
 - } else {
 -perf_debug("Fragment shader triggered register spilling.  "
 -  

Re: [Mesa-dev] [PATCH v2 09/16] i965: Prepare for using the ATTR register file in the fs backend

2014-11-25 Thread Kristian Høgsberg
On Thu, Nov 13, 2014 at 9:03 PM, Matt Turner matts...@gmail.com wrote:
 On Thu, Nov 13, 2014 at 4:28 PM, Kristian Høgsberg k...@bitplanet.net wrote:
 @@ -3148,6 +3150,9 @@ fs_visitor::dump_instruction(backend_instruction *be_inst, FILE *file)
  case UNIFORM:
fprintf(file, "***u%d***", inst->dst.reg + inst->dst.reg_offset);
break;
 +   case ATTR:
 +  fprintf(file, "attr%d", inst->dst.reg + inst->dst.reg_offset);

 This can't be a destination, so put *** around it like the uniform case above.

Ah, that's what that is.  Done.

Kristian


Re: [Mesa-dev] [PATCH 3/6] i965/fs: Handle derivative quality decisions in the front-end.

2014-11-25 Thread Matt Turner
Patches 3-6 are

Reviewed-by: Matt Turner matts...@gmail.com


Re: [Mesa-dev] [PATCH 0/7] util: Move u_atomic.h to src/util and modify API

2014-11-25 Thread Matt Turner
On Tue, Nov 25, 2014 at 6:48 AM, Jose Fonseca jfons...@vmware.com wrote:
 On 25/11/14 00:39, Matt Turner wrote:

 I've got some thread-safety fixes queued up after this and thought I'd
 be a good Mesa citizen and pull some code into src/util.


 Thanks for going the extra mile!

 I did some clean ups like replacing INLINE (MSVC knows about inline
 these days, right?) and used stdbool.h instead of the boolean type.


 No, at least MSVC 2012 doesn't have `inline` keyword when compiling C files,
 and requires a

   #if !defined(__cplusplus) && !defined(inline)
   #define inline __inline
   #endif

 somewhere.  Anyway, there are no `INLINES` nor `inlines` left after your
 series, so we're good.

 stdbool.h is fine -- we include our own when MSVC doesn't

 I also removed the inline assembly implementations because they were
 either dead code, or only allowed *ancient* gcc to build Mesa and
 because I didn't want to update them for the next patch, which makes
 the API consist of some macros that internally do the right operation
 based on the type.

 The last patch looks funky, but I think it's actually a reasonable
 solution. I don't have MSVC or Sun Studio, so please give this a
 test.


 I had to do a few tweaks to get things building on MSVC properly.

 I pushed my changes to

   http://cgit.freedesktop.org/~jrfonseca/mesa/log/?h=u_atomic

 I need to do a few more tests, but all looks feasible so far -- I don't get
 any warnings with MSVC and I believe that the generated code quality should
 be exactly the same.

 And it is indeed a nice cleanup.

Excellent, thanks a lot José!

I'll merge your patches into my series and wire up the test into
automake. I'll also put parentheses around the nested ternary in the
Sun Studio case, like you noticed was necessary for MSVC.

My thread-safety fixes actually do a compare-and-swap on a bool, so I
do need an 8-bit CAS. I found this [0] page lists
_InterlockedCompareExchange8. I don't see a non-intrinsic version
though. Can we use this? GCC will generate 8-bit atomic ops on 32- and
64-bit x86, so I don't know of a technical reason it can't work.

[0] http://msdn.microsoft.com/en-us/library/ttk2z1ws.aspx
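
As a rough sketch of what I have in mind for the 8-bit case (not the actual
u_atomic.h code): cas_u8() and claim_once() below are hypothetical names. The
GCC/Clang branch uses the __sync_val_compare_and_swap builtin on a one-byte
type; the MSVC branch uses _InterlockedCompareExchange8 from intrin.h and is
untested on my side, so treat it as an assumption.

```c
#include <assert.h>
#include <stdint.h>

#if defined(_MSC_VER)
#include <intrin.h>
#endif

/* Hypothetical helper: atomically replace *ptr with 'desired' if it still
 * holds 'expected'. Returns the value that was in *ptr before the attempt. */
static uint8_t
cas_u8(volatile uint8_t *ptr, uint8_t expected, uint8_t desired)
{
#if defined(_MSC_VER)
   /* Untested assumption: the intrinsic takes (destination, exchange,
    * comparand) and returns the initial value of the destination. */
   return (uint8_t) _InterlockedCompareExchange8((volatile char *) ptr,
                                                 (char) desired,
                                                 (char) expected);
#else
   return __sync_val_compare_and_swap(ptr, expected, desired);
#endif
}

/* Returns 1 only for the single caller that flips the flag from 0 to 1. */
static int
claim_once(volatile uint8_t *flag)
{
   return cas_u8(flag, 0, 1) == 0;
}

/* Small single-threaded self-check of the semantics. */
void
cas_u8_example(void)
{
   volatile uint8_t initialized = 0;

   assert(claim_once(&initialized) == 1);  /* first caller wins */
   assert(claim_once(&initialized) == 0);  /* later callers lose */
   assert(initialized == 1);
}
```

This is the pattern the thread-safety fixes need: a one-byte flag that
exactly one thread gets to set.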

Also, I'm a little surprised that

+#define p_atomic_dec(_v) \
+ ((void) p_atomic_dec_return(_v))

is sufficient. Are you sure?

 BTW, we could rename the macros to something not allusive to gallium (ie,
 remove pipe).  (We could even match the C11 stdatomic.h until the C
 runtime provides them, like we're doing with thread.h.)  Anyway, this is
 just cosmetic, so it can wait.

That's probably a good idea.


Re: [Mesa-dev] [PATCH 0/7] util: Move u_atomic.h to src/util and modify API

2014-11-25 Thread Jose Fonseca

On 25/11/14 21:04, Matt Turner wrote:

On Tue, Nov 25, 2014 at 6:48 AM, Jose Fonseca jfons...@vmware.com wrote:

On 25/11/14 00:39, Matt Turner wrote:


I've got some thread-safety fixes queued up after this and thought I'd
be a good Mesa citizen and pull some code into src/util.



Thanks for going the extra mile!


I did some clean ups like replacing INLINE (MSVC knows about inline
these days, right?) and used stdbool.h instead of the boolean type.



No, at least MSVC 2012 doesn't have `inline` keyword when compiling C files,
and requires a

   #if !defined(__cplusplus) && !defined(inline)
   #define inline __inline
   #endif

somewhere.  Anyway, there are no `INLINES` nor `inlines` left after your
series, so we're good.

stdbool.h is fine -- we include our own when MSVC doesn't


I also removed the inline assembly implementations because they were
either dead code, or only allowed *ancient* gcc to build Mesa and
because I didn't want to update them for the next patch, which makes
the API consist of some macros that internally do the right operation
based on the type.

The last patch looks funky, but I think it's actually a reasonable
solution. I don't have MSVC or Sun Studio, so please give this a
test.



I had to do a few tweaks to get things building on MSVC properly.

I pushed my changes to

   
http://cgit.freedesktop.org/~jrfonseca/mesa/log/?h=u_atomic

I need to do a few more tests, but all looks feasible so far -- I don't get
any warnings with MSVC and I believe that the generated code quality should
be exactly the same.

And it is indeed a nice cleanup.


Excellent, thanks a lot José!

I'll merge your patches into my series and wire up the test into
automake. I'll also put parentheses around the nested ternary in the
Sun Studio case, like you noticed was necessary for MSVC.

My thread-safety fixes actually do a compare-and-swap on a bool, so I
do need an 8-bit CAS. I found this [0] page lists
_InterlockedCompareExchange8. I don't see a non-intrinsic version
though. Can we use this? GCC will generate 8-bit atomic ops on 32- and
64-bit x86, so I don't know of a technical reason it can't work.

[0] 
http://msdn.microsoft.com/en-us/library/ttk2z1ws.aspx


You're right.  I was confused because I didn't find a corresponding 
InterlockedCompareExchange8() in the Win32 API, but the 
underscore-prefixed intrinsic itself exists and works fine.  I've 
brought it back with:


http://cgit.freedesktop.org/~jrfonseca/mesa/commit/?h=u_atomic
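
For readers who want to see what an 8-bit compare-and-swap looks like in practice, here is a minimal sketch using the GCC `__sync` builtins mentioned above. It is not the code from the branch; the helper name `try_claim` and the use of `uint8_t` as a stand-in for a one-byte flag are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: atomically flip a one-byte flag from 0 to 1.
 * Returns 1 if this caller performed the transition, 0 if the flag
 * was already set by someone else.
 */
static int
try_claim(uint8_t *flag)
{
   assert(flag != NULL);
   /* GCC's __sync builtins accept 1-, 2-, 4- and 8-byte integer types,
    * so the same builtin works for an 8-bit operand.
    */
   return __sync_val_compare_and_swap(flag, 0, 1) == 0;
}
```

Only the first caller sees the 0 -> 1 transition, which is the property a compare-and-swap on a boolean-like flag relies on.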


Also, I'm a little surprised that

+#define p_atomic_dec(_v) \
+ ((void) p_atomic_dec_return(_v))

is sufficient. Are you sure?


Could you elaborate your concern?


The `void` return type is the only difference between 
p_atomic_{inc,dec}_return and p_atomic_{inc,dec}, even for the 
PIPE_ATOMIC_GCC_INTRINSIC case.   In fact we should probably drop the 
void case completely, since it seems a pointless special case.



If it is the parentheses around the (void) cast you're worried about, they 
are fine.  See e.g., the assert macro on Linux:


$ cat /usr/include/assert.h
...
#if defined __cplusplus && __GNUC_PREREQ (2,95)
# define __ASSERT_VOID_CAST static_cast<void>
#else
# define __ASSERT_VOID_CAST (void)
#endif
...
# define assert(expr)   (__ASSERT_VOID_CAST (0))
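
As an illustration of the point about the (void) cast, the following sketch defines the two macros on top of the GCC builtin and uses both forms. The macro bodies mirror the ones quoted in this thread; the function `demo_dec_twice` is a hypothetical name added only so the example can be exercised.

```c
#include <assert.h>
#include <stdint.h>

/* Value-returning form, built on the GCC builtin. */
#define p_atomic_dec_return(_v) __sync_sub_and_fetch((_v), 1)

/* Statement form: same operation, result explicitly discarded. */
#define p_atomic_dec(_v) \
   ((void) p_atomic_dec_return(_v))

/* Hypothetical demo: decrement once with each form. */
static int32_t
demo_dec_twice(int32_t start)
{
   int32_t v = start;
   p_atomic_dec(&v);               /* result thrown away */
   return p_atomic_dec_return(&v); /* result used */
}
```

The parenthesized cast expression is an ordinary expression of type void, so it can be used anywhere a statement-like macro is expected.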


Jose


Re: [Mesa-dev] [PATCH 0/7] util: Move u_atomic.h to src/util and modify API

2014-11-25 Thread Matt Turner
On Tue, Nov 25, 2014 at 1:42 PM, Jose Fonseca jfons...@vmware.com wrote:
 On 25/11/14 21:04, Matt Turner wrote:

 On Tue, Nov 25, 2014 at 6:48 AM, Jose Fonseca jfons...@vmware.com wrote:

 On 25/11/14 00:39, Matt Turner wrote:


 I've got some thread-safety fixes queued up after this and thought I'd
 be a good Mesa citizen and pull some code into src/util.



 Thanks for going the extra mile!

 I did some clean ups like replacing INLINE (MSVC knows about inline
 these days, right?) and used stdbool.h instead of the boolean type.



 No, at least MSVC 2012 doesn't have `inline` keyword when compiling C
 files,
 and requires a

#if !defined(__cplusplus) && !defined(inline)
#define inline __inline
#endif

 somewhere.  Anyway, there are no `INLINES` nor `inlines` left after your
 series, so we're good.

 stdbool.h is fine -- we include our own when MSVC doesn't

 I also removed the inline assembly implementations because they were
 either dead code, or only allowed *ancient* gcc to build Mesa and
 because I didn't want to update them for the next patch, which makes
 the API consist of some macros that internally do the right operation
 based on the type.

 The last patch looks funky, but I think it's actually a reasonable
 solution. I don't have MSVC or Sun Studio, so please give this a
 test.



 I had to do a few tweaks to get things building on MSVC properly.

 I pushed my changes to


 http://cgit.freedesktop.org/~jrfonseca/mesa/log/?h=u_atomic

 I need to do a few more tests, but all looks feasible so far -- I don't
 get
 any warnings with MSVC and I believe that the generated code quality
 should
 be exactly the same.

 And it is indeed a nice cleanup.


 Excellent, thanks a lot José!

 I'll merge your patches into my series and wire up the test into
 automake. I'll also put parentheses around the nested ternary in the
 Sun Studio case, like you noticed was necessary for MSVC.

 My thread-safety fixes actually do a compare-and-swap on a bool, so I
 do need an 8-bit CAS. I found this [0] page lists
 _InterlockedCompareExchange8. I don't see a non-intrinsic version
 though. Can we use this? GCC will generate 8-bit atomic ops on 32- and
 64-bit x86, so I don't know of a technical reason it can't work.

 [0]
 http://msdn.microsoft.com/en-us/library/ttk2z1ws.aspx


 You're right.  I was confused because I didn't find a corresponding
 InterlockedCompareExchange8() in the Win32 API, but the underscore-prefixed
 intrinsic itself exists and works fine.  I've brought it back with:

 http://cgit.freedesktop.org/~jrfonseca/mesa/commit/?h=u_atomic

Awesome, thanks!

 Also, I'm a little surprised that

 +#define p_atomic_dec(_v) \
 + ((void) p_atomic_dec_return(_v))

 is sufficient. Are you sure?


 Could you elaborate your concern?

Oh, I'm sorry -- I totally misread. Yeah, that seems completely fine.


Re: [Mesa-dev] [PATCH 07/10] i965: Add _CACHE_ in brw_cache_id enum names.

2014-11-25 Thread Matt Turner
On Tue, Nov 25, 2014 at 4:43 AM, Kenneth Graunke kenn...@whitecape.org wrote:
 BRW_CACHE_VS_PROG is more easily associated with program caches than
 plain BRW_VS_PROG.

 While we're at it, rename BRW_WM_PROG to BRW_CACHE_FS_PROG, to move away
 from the outdated Windowizer/Masker name.

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp |  4 ++--
  src/mesa/drivers/dri/i965/brw_clip.c |  4 ++--
  src/mesa/drivers/dri/i965/brw_context.h  | 28 
 ++--
  src/mesa/drivers/dri/i965/brw_ff_gs.c|  4 ++--
  src/mesa/drivers/dri/i965/brw_gs.c   |  4 ++--
  src/mesa/drivers/dri/i965/brw_sf.c   |  4 ++--
  src/mesa/drivers/dri/i965/brw_state_cache.c  | 12 ++--
  src/mesa/drivers/dri/i965/brw_state_dump.c   | 14 +++---
  src/mesa/drivers/dri/i965/brw_vs.c   |  6 +++---
  src/mesa/drivers/dri/i965/brw_wm.c   |  6 +++---
  10 files changed, 43 insertions(+), 43 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
 b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
 index 844f5e4..a103af0 100644
 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
 @@ -2119,14 +2119,14 @@ brw_blorp_blit_params::get_wm_prog(struct brw_context 
 *brw,
 brw_blorp_prog_data **prog_data) const
  {
 uint32_t prog_offset = 0;
 -   if (!brw_search_cache(&brw->cache, BRW_BLORP_BLIT_PROG,
 +   if (!brw_search_cache(&brw->cache, BRW_CACHE_BLORP_BLIT_PROG,
   &this->wm_prog_key, sizeof(this->wm_prog_key),
   &prog_offset, prog_data)) {
brw_blorp_blit_program prog(brw, &this->wm_prog_key,
INTEL_DEBUG & DEBUG_BLORP);
GLuint program_size;
const GLuint *program = prog.compile(brw, &program_size);
 -  brw_upload_cache(&brw->cache, BRW_BLORP_BLIT_PROG,
 +  brw_upload_cache(&brw->cache, BRW_CACHE_BLORP_BLIT_PROG,
 &this->wm_prog_key, sizeof(this->wm_prog_key),
 program, program_size,
 &prog.prog_data, sizeof(prog.prog_data),
 diff --git a/src/mesa/drivers/dri/i965/brw_clip.c 
 b/src/mesa/drivers/dri/i965/brw_clip.c
 index debeee5..3fef38c 100644
 --- a/src/mesa/drivers/dri/i965/brw_clip.c
 +++ b/src/mesa/drivers/dri/i965/brw_clip.c
 @@ -122,7 +122,7 @@ static void compile_clip_prog( struct brw_context *brw,
 }

 brw_upload_cache(&brw->cache,
 -   BRW_CLIP_PROG,
 +   BRW_CACHE_CLIP_PROG,
 &c.key, sizeof(c.key),
 program, program_size,
 &c.prog_data, sizeof(c.prog_data),
 @@ -248,7 +248,7 @@ brw_upload_clip_prog(struct brw_context *brw)
}
 }

 -   if (!brw_search_cache(&brw->cache, BRW_CLIP_PROG,
 +   if (!brw_search_cache(&brw->cache, BRW_CACHE_CLIP_PROG,
  &key, sizeof(key),
  &brw->clip.prog_offset, &brw->clip.prog_data)) {
compile_clip_prog( brw, &key );
 diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
 b/src/mesa/drivers/dri/i965/brw_context.h
 index fac22dc..c4e96de 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.h
 +++ b/src/mesa/drivers/dri/i965/brw_context.h
 @@ -693,13 +693,13 @@ struct brw_gs_prog_data
  #define SHADER_TIME_STRIDE 64

  enum brw_cache_id {
 -   BRW_WM_PROG,
 -   BRW_BLORP_BLIT_PROG,
 -   BRW_SF_PROG,
 -   BRW_VS_PROG,
 -   BRW_FF_GS_PROG,
 -   BRW_GS_PROG,
 -   BRW_CLIP_PROG,
 +   BRW_CACHE_FS_PROG,
 +   BRW_CACHE_BLORP_BLIT_PROG,
 +   BRW_CACHE_SF_PROG,
 +   BRW_CACHE_VS_PROG,
 +   BRW_CACHE_FF_GS_PROG,
 +   BRW_CACHE_GS_PROG,
 +   BRW_CACHE_CLIP_PROG,

 BRW_MAX_CACHE
  };
 @@ -777,13 +777,13 @@ enum shader_time_shader_type {

  /* Flags for brw->state.cache.
   */
 -#define CACHE_NEW_WM_PROG                (1<<BRW_WM_PROG)
 -#define CACHE_NEW_BLORP_BLIT_PROG        (1<<BRW_BLORP_BLIT_PROG)
 -#define CACHE_NEW_SF_PROG                (1<<BRW_SF_PROG)
 -#define CACHE_NEW_VS_PROG                (1<<BRW_VS_PROG)
 -#define CACHE_NEW_FF_GS_PROG             (1<<BRW_FF_GS_PROG)
 -#define CACHE_NEW_GS_PROG                (1<<BRW_GS_PROG)
 -#define CACHE_NEW_CLIP_PROG              (1<<BRW_CLIP_PROG)
 +#define CACHE_NEW_WM_PROG                (1 << BRW_CACHE_FS_PROG)
 +#define CACHE_NEW_BLORP_BLIT_PROG        (1 << BRW_CACHE_BLORP_BLIT_PROG)
 +#define CACHE_NEW_SF_PROG                (1 << BRW_CACHE_SF_PROG)
 +#define CACHE_NEW_VS_PROG                (1 << BRW_CACHE_VS_PROG)
 +#define CACHE_NEW_FF_GS_PROG             (1 << BRW_CACHE_FF_GS_PROG)
 +#define CACHE_NEW_GS_PROG                (1 << BRW_CACHE_GS_PROG)
 +#define CACHE_NEW_CLIP_PROG              (1 << BRW_CACHE_CLIP_PROG)

Would it be useful to go ahead and make this an enum at the same time?
(Maybe not even change the type of state.cache yet)

Re: [Mesa-dev] [PATCH 01/10] i965: Alphabetize brw_tracked_state flags and use a consistent style.

2014-11-25 Thread Matt Turner
On Tue, Nov 25, 2014 at 4:43 AM, Kenneth Graunke kenn...@whitecape.org wrote:

 4:43 AM

If I didn't see you a couple of times a week I'd wonder what timezone
you were in!

 Most of the dirty flags were listed in some arbitrary order.  Some used
 bonus parenthesis.  Some put multiple flags on one line, others put one
 per line.  Some used tabs instead of spaces...but only on some lines.

 This patch settles on one flag per line, in alphabetical order, using
 spaces instead of tabs, and sheds the unnecessary parentheses.

 Sorting was mostly done with vim's visual block feature and !sort,
 although I alphabetized short lists by hand; it was pretty manual.

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org

1-7, the non-scripted parts of 9, and 10 are (Yes, I actually reviewed #1)

Reviewed-by: Matt Turner matts...@gmail.com

8 and the scripted parts of 9 are

Acked-by: Matt Turner matts...@gmail.com


Re: [Mesa-dev] [PATCH v2 01/16] i965: Don't copy propagate sat MOVs into LOAD_PAYLOAD

2014-11-25 Thread Kristian Høgsberg
On Thu, Nov 13, 2014 at 8:49 PM, Matt Turner matts...@gmail.com wrote:
 On Thu, Nov 13, 2014 at 4:28 PM, Kristian Høgsberg k...@bitplanet.net wrote:
 The LOAD_PAYLOAD opcode can't saturate its sources, so skip
 saturating MOVs.  The register coalescing after lower_load_payload()
 will clean up the extra MOVs.

 Signed-off-by: Kristian Høgsberg k...@bitplanet.net
 ---
  src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
 index e1989cb..87ea9c2 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
 @@ -454,8 +454,12 @@ fs_visitor::try_constant_propagate(fs_inst *inst, 
 acp_entry *entry)
val.effective_width = inst->src[i].effective_width;

switch (inst->opcode) {
 -  case BRW_OPCODE_MOV:
case SHADER_OPCODE_LOAD_PAYLOAD:
 + /* LOAD_PAYLOAD can't sat its sources. */
 + if (entry->saturate)
 +break;
 + /* Otherwise, fall through */
 +  case BRW_OPCODE_MOV:
   inst->src[i] = val;
   progress = true;
   break;
 --
 2.1.0

 This looks like the same patch as 01/14 in the previous series. I
 suggested a better approach there:

 At the beginning of fs_visitor::try_constant_propagate, if
 (entry->saturate) return false, or just saturate the argument and
 proceed. We don't want to be propagating the result of a mov.sat 4.0
 into anything without saturating the result first.
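
To make the concern about mov.sat concrete, here is a small numeric sketch. It does not use any i965 code: the functions below are hypothetical and simply model the sequence "mov.sat tmp, 4.0; add dst, tmp, 1.0" in plain C, once evaluated as written, once with the constant propagated without saturation, and once with the constant saturated before propagation.

```c
#include <assert.h>

/* Clamp to [0, 1], which is what the .sat modifier does for floats. */
static float
saturate(float x)
{
   return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* mov.sat tmp, 4.0 ; add dst, tmp, 1.0 -- evaluated as written. */
static float
eval_original(void)
{
   float tmp = saturate(4.0f);
   return tmp + 1.0f;
}

/* Wrong: the raw constant 4.0 is propagated, ignoring the saturate. */
static float
eval_naive_propagation(void)
{
   return 4.0f + 1.0f;
}

/* Right: the constant is saturated first, then propagated. */
static float
eval_saturated_propagation(void)
{
   const float propagated = saturate(4.0f);
   return propagated + 1.0f;
}
```

The naive version changes the result of the program, which is why the propagation has to either bail out or saturate the value first.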

Fixed in v3.

Kristian


Re: [Mesa-dev] [PATCH] util: update hash type comments

2014-11-25 Thread Eric Anholt
Timothy Arceri t_arc...@yahoo.com.au writes:

 Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au

Reviewed-by: Eric Anholt e...@anholt.net

and I've pushed the equivalent change with your name to the upstream
repo.




Re: [Mesa-dev] [PATCH] util/hash_table: Rework the API to know about hashing

2014-11-25 Thread Eric Anholt
Jason Ekstrand ja...@jlekstrand.net writes:

 Previously, the hash_table API required the user to do all of the hashing
 of keys as it passed them in.  Since the hashing function is intrinsically
 tied to the comparison function, it makes sense for the hash table to know
 about it.  Also, it makes for a somewhat clumsy API as the user is
 constantly calling hashing functions many of which have long names.  This
 is especially bad when the standard call looks something like

 _mesa_hash_table_insert(ht, _mesa_pointer_hash(key), key, data);

 In the above case, there is no reason why the hash table shouldn't do the
 hashing for you.  We leave the option for you to do your own hashing if
 it's more efficient, but it's no longer needed.  Also, if you do do your
 own hashing, the hash table will assert that your hash matches what it
 expects out of the hashing function.  This should make it harder to mess up
 your hashing.

I went to go port this change to the upstream repo, and found a commit
laying around from cworth that did the same thing.  He called the
renamed functions _pre_hashed, and included sets as well.  I think I
prefer that name slightly?  What do you think?  I've pushed his change
to the upstream (git+ssh://people.freedesktop.org/~anholt/hash_table).

It's a really good idea, though, and I'm sorry I hadn't pushed the
change and propagated it to Mesa already. :(




Re: [Mesa-dev] [PATCH v2 14/16] i965: Add fs_visitor::run_vs() to generate scalar vertex shader code

2014-11-25 Thread Kristian Høgsberg
On Fri, Nov 14, 2014 at 4:08 PM, Kenneth Graunke kenn...@whitecape.org wrote:
 On Thursday, November 13, 2014 04:28:20 PM Kristian Høgsberg wrote:
 This patch uses the previous refactoring to add a new run_vs() method
 that generates vertex shader code using the scalar visitor and
 optimizer.

 Signed-off-by: Kristian Høgsberg k...@bitplanet.net
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp |  99 -
  src/mesa/drivers/dri/i965/brw_fs.h   |  21 +-
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 303 
 ++-
  3 files changed, 412 insertions(+), 11 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index 4dce0a2..8007977 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -1828,6 +1828,56 @@ fs_visitor::assign_urb_setup()
urb_start + prog_data->num_varying_inputs * 2;
  }

 +void
 +fs_visitor::assign_vs_urb_setup()
 +{
 +   brw_vs_prog_data *vs_prog_data = (brw_vs_prog_data *) prog_data;
 +   int grf, count, slot, channel, attr;
 +
 +   assert(stage == MESA_SHADER_VERTEX);
 +   count = _mesa_bitcount_64(vs_prog_data->inputs_read);
 +   if (vs_prog_data->uses_vertexid || vs_prog_data->uses_instanceid)
 +  count++;
 +
 +   /* Each attribute is 4 regs. */
 +   this->first_non_payload_grf =
 +  payload.num_regs + prog_data->curb_read_length + count * 4;
 +
 +   unsigned vue_entries =
 +  MAX2(count, vs_prog_data->base.vue_map.num_slots);
 +
 +   vs_prog_data->base.urb_entry_size = ALIGN(vue_entries, 4) / 4;
 +   vs_prog_data->base.urb_read_length = (count + 1) / 2;
 +
 +   assert(vs_prog_data->base.urb_read_length <= 15);
 +
 +   /* Rewrite all ATTR file references to the hw grf that they land in. */
 +   foreach_block_and_inst(block, fs_inst, inst, cfg) {
 +  for (int i = 0; i < inst->sources; i++) {
 + if (inst->src[i].file == ATTR) {
 +
 +if (inst->src[i].reg == VERT_ATTRIB_MAX) {
 +   slot = count - 1;
 +} else {
 +   attr = inst->src[i].reg + inst->src[i].reg_offset / 4;
 +   slot = _mesa_bitcount_64(vs_prog_data->inputs_read &
 +BITFIELD64_MASK(attr));

 I'm having trouble understanding this code - can you explain?

 Reading ir_set_program_inouts.cpp:98 I see that incoming vertex attributes
 are always vec4 slots, except for matrices and arrays, which use multiple
 vec4 slots.

 I expected your ATTR registers to always be size 4, so reg_offset would have
 valid values of 0..3.  But I must be mistaken, since you're doing
 reg_offset / 4, which would always be 0.  Are ATTRs 4*N where N == the # of
 matrix columns or array length?

There were cases where reg_offset was > 3, which is why I did it this
way.  It may be that that's the problem and I shouldn't work around it
here... let me assert reg_offset < 4 there and find the piglit cases
that triggered this.

 Even still - I don't see how applying BITFIELD64_MASK to a potentially
 non-power-of-two number and then doing a bitcount will give you a single
 accurate slot value.

The slot computation is functionally the same as attribute_map[attr].
vec4_vs_visitor::setup_attributes computes the number of enabled
attributes lower than attr in attribute_map[attr].  That's the number
of enabled bits in inputs_read that are lower than 1 << attr.  We can
mask out those bits using BITFIELD64_MASK(attr) and count them using
bitcount.
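
The mask-and-count idea can be sketched outside of Mesa as follows. The names `count_bits64` and `attr_to_slot` are hypothetical stand-ins for _mesa_bitcount_64 and the expression in the patch, and the mask is written out by hand rather than through BITFIELD64_MASK.

```c
#include <assert.h>
#include <stdint.h>

/* Portable population count (stand-in for _mesa_bitcount_64). */
static unsigned
count_bits64(uint64_t v)
{
   unsigned n = 0;
   while (v) {
      v &= v - 1;   /* clear the lowest set bit */
      n++;
   }
   return n;
}

/* Slot of attribute 'attr' = number of enabled attributes below it. */
static unsigned
attr_to_slot(uint64_t inputs_read, unsigned attr)
{
   assert(attr < 64);
   /* All bits strictly below bit 'attr'. */
   const uint64_t lower_mask = (UINT64_C(1) << attr) - 1;
   return count_bits64(inputs_read & lower_mask);
}
```

With inputs_read = 0x2D (attributes 0, 2, 3 and 5 enabled), attribute 3 has two enabled attributes below it and therefore lands in slot 2, and attribute 5 lands in slot 3.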

 Adding a comment would also be nice to future maintainers.

Yea, fair point.

 This was the main spot where I got confused - otherwise most of the code
 looks good to me.

 +}
 +
 +channel = inst->src[i].reg_offset & 3;
 +
 +grf = payload.num_regs +
 +   prog_data->curb_read_length +
 +   slot * 4 + channel;
 +
 +inst->src[i].file = HW_REG;
 +inst->src[i].fixed_hw_reg =
 +   retype(brw_vec8_grf(grf, 0), inst->src[i].type);
 + }
 +  }
 +   }
 +}
 +
  /**
   * Split large virtual GRFs into separate components if we can.
   *
 @@ -3405,6 +3455,13 @@ fs_visitor::setup_payload_gen6()
  }

  void
 +fs_visitor::setup_vs_payload()
 +{
 +   /* R0: thread header, R1: urb handles */
 +   payload.num_regs = 2;
 +}
 +
 +void
  fs_visitor::assign_binding_table_offsets()
  {
 assert(stage == MESA_SHADER_FRAGMENT);
 @@ -3471,6 +3528,8 @@ fs_visitor::opt_drop_redundant_mov_to_flags()
  void
  fs_visitor::optimize()
  {
 +   const char *stage_name = stage == MESA_SHADER_VERTEX ? "vs" : "fs";
 +
 calculate_cfg();

 split_virtual_grfs();
 @@ -3487,8 +3546,8 @@ fs_visitor::optimize()
  \
if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER) && this_progress) {   \
   char filename[64]; \
 - snprintf(filename, 64, "fs%d-%04d-%02d-%02d-" #pass,   \
 -  

Re: [Mesa-dev] [PATCH] util/hash_table: Rework the API to know about hashing

2014-11-25 Thread Jason Ekstrand
On Tue, Nov 25, 2014 at 3:40 PM, Eric Anholt e...@anholt.net wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  Previously, the hash_table API required the user to do all of the hashing
  of keys as it passed them in.  Since the hashing function is intrinsically
  tied to the comparison function, it makes sense for the hash table to
 know
  about it.  Also, it makes for a somewhat clumsy API as the user is
  constantly calling hashing functions many of which have long names.  This
  is especially bad when the standard call looks something like
 
  _mesa_hash_table_insert(ht, _mesa_pointer_hash(key), key, data);
 
  In the above case, there is no reason why the hash table shouldn't do the
  hashing for you.  We leave the option for you to do your own hashing if
  it's more efficient, but it's no longer needed.  Also, if you do do your
  own hashing, the hash table will assert that your hash matches what it
  expects out of the hashing function.  This should make it harder to mess
 up
  your hashing.

 I went to go port this change to the upstream repo, and found a commit
 laying around from cworth that did the same thing.  He called the
 renamed functions _pre_hashed, and included sets as well.  I think I
 prefer that name slightly?  What do you think?  I've pushed his change
 to the upstream (git+ssh://people.freedesktop.org/~anholt/hash_table).


That looks fine to me.  It looks mostly like what I did.  The one thing I
think my patch has over Carl's is that I split the insert_pre_hashed
function into an internal and an external version.  If you call the
external version, it asserts that the hash value from the internal hashing
function matches the hash value you gave it.  That way you'll notice if you
ever mix up hashing functions.  If you'd like, I can make that a patch on
top of what you have in your branch there.  Or you can do it and just push
it.
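
A rough sketch of the internal/external split described above, using a toy fixed-size table rather than the real util/hash_table code. Every name here (toy_table, toy_insert, toy_insert_pre_hashed and so on) is hypothetical; the only point is that the table stores its hash function, hashes for the caller by default, and asserts that a caller-supplied hash agrees with it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_CAPACITY 64 /* fixed, power of two; no resizing in this sketch */

typedef uint32_t (*toy_hash_fn)(const void *key);
typedef bool (*toy_equal_fn)(const void *a, const void *b);

struct toy_entry { bool used; uint32_t hash; const void *key; void *data; };

struct toy_table {
   toy_hash_fn hash;     /* the table now knows how to hash its keys */
   toy_equal_fn equal;
   struct toy_entry entries[TOY_CAPACITY];
};

static uint32_t toy_pointer_hash(const void *key)
{
   return (uint32_t) ((uintptr_t) key >> 4) * 2654435761u;
}

static bool toy_pointer_equal(const void *a, const void *b) { return a == b; }

static void toy_init(struct toy_table *ht, toy_hash_fn hash, toy_equal_fn equal)
{
   memset(ht, 0, sizeof *ht);
   ht->hash = hash;
   ht->equal = equal;
}

/* Internal version: trusts the hash it is given (linear probing). */
static struct toy_entry *
toy_insert_internal(struct toy_table *ht, uint32_t hash,
                    const void *key, void *data)
{
   for (uint32_t i = 0; i < TOY_CAPACITY; i++) {
      struct toy_entry *e = &ht->entries[(hash + i) & (TOY_CAPACITY - 1)];
      if (!e->used || (e->hash == hash && ht->equal(e->key, key))) {
         e->used = true; e->hash = hash; e->key = key; e->data = data;
         return e;
      }
   }
   return NULL; /* table full */
}

/* Default entry point: the table does the hashing for the caller. */
static struct toy_entry *
toy_insert(struct toy_table *ht, const void *key, void *data)
{
   return toy_insert_internal(ht, ht->hash(key), key, data);
}

/* External pre-hashed entry point: checks the caller's hash first. */
static struct toy_entry *
toy_insert_pre_hashed(struct toy_table *ht, uint32_t hash,
                      const void *key, void *data)
{
   assert(hash == ht->hash(key));
   return toy_insert_internal(ht, hash, key, data);
}

static void *toy_search(struct toy_table *ht, const void *key)
{
   const uint32_t hash = ht->hash(key);
   for (uint32_t i = 0; i < TOY_CAPACITY; i++) {
      struct toy_entry *e = &ht->entries[(hash + i) & (TOY_CAPACITY - 1)];
      if (!e->used)
         return NULL;
      if (e->hash == hash && ht->equal(e->key, key))
         return e->data;
   }
   return NULL;
}
```

In a release build the assert compiles away, so the pre-hashed path costs nothing extra, while debug builds catch a mismatched hashing function at the first insert.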


 It's a really good idea, though, and I'm sorry I hadn't pushed the
 change and propagated it to Mesa already. :(


That's ok.  Mesa doesn't use it that many places.  We don't really feel the
pain until NIR which uses it about twice as much as the rest of mesa.
--Jason


[Mesa-dev] [PATCH 7/7] util: Make u_atomic.h typeless.

2014-11-25 Thread Matt Turner
like how C11's stdatomic.h provides generic functions. GCC's __sync_*
builtins already take a variety of types, so that's simple.

MSVC and Sun Studio don't, but we can implement it with something that
looks a little crazy but is actually quite readable.
---
v2: - Squash in Jose's MSVC fixes
- Apply similar fixes to Sun code, like wrapping macro bodies
  in parens, including assert.h, and casting new/old params.

 src/util/u_atomic.h | 206 
 1 file changed, 93 insertions(+), 113 deletions(-)

diff --git a/src/util/u_atomic.h b/src/util/u_atomic.h
index 620191c..5867a0f 100644
--- a/src/util/u_atomic.h
+++ b/src/util/u_atomic.h
@@ -34,52 +34,15 @@
 
 #define PIPE_ATOMIC "GCC Sync Intrinsics"
 
-#ifdef __cplusplus
-extern "C" {
-#endif
-
 #define p_atomic_set(_v, _i) (*(_v) = (_i))
 #define p_atomic_read(_v) (*(_v))
-
-static inline boolean
-p_atomic_dec_zero(int32_t *v)
-{
-   return (__sync_sub_and_fetch(v, 1) == 0);
-}
-
-static inline void
-p_atomic_inc(int32_t *v)
-{
-   (void) __sync_add_and_fetch(v, 1);
-}
-
-static inline void
-p_atomic_dec(int32_t *v)
-{
-   (void) __sync_sub_and_fetch(v, 1);
-}
-
-static inline int32_t
-p_atomic_inc_return(int32_t *v)
-{
-   return __sync_add_and_fetch(v, 1);
-}
-
-static inline int32_t
-p_atomic_dec_return(int32_t *v)
-{
-   return __sync_sub_and_fetch(v, 1);
-}
-
-static inline int32_t
-p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
-{
-   return __sync_val_compare_and_swap(v, old, _new);
-}
-
-#ifdef __cplusplus
-}
-#endif
+#define p_atomic_dec_zero(v) (__sync_sub_and_fetch((v), 1) == 0)
+#define p_atomic_inc(v) (void) __sync_add_and_fetch((v), 1)
+#define p_atomic_dec(v) (void) __sync_sub_and_fetch((v), 1)
+#define p_atomic_inc_return(v) __sync_add_and_fetch((v), 1)
+#define p_atomic_dec_return(v) __sync_sub_and_fetch((v), 1)
+#define p_atomic_cmpxchg(v, old, _new) \
+   __sync_val_compare_and_swap((v), (old), (_new))
 
 #endif
 
@@ -108,58 +71,57 @@ p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
 
 #define PIPE_ATOMIC "MSVC Intrinsics"
 
+/* We use the Windows header's Interlocked* functions instead of the
+ * _Interlocked* intrinsics wherever we can, as support for the latter varies
+ * with target CPU, whereas Windows headers take care of all portability
+ * issues: using intrinsics where available, falling back to library
+ * implementations where not.
+ */
+#ifndef WIN32_LEAN_AND_MEAN
+#define WIN32_LEAN_AND_MEAN 1
+#endif
+#include <windows.h>
 #include <intrin.h>
+#include <assert.h>
 
-#pragma intrinsic(_InterlockedIncrement)
-#pragma intrinsic(_InterlockedDecrement)
-#pragma intrinsic(_InterlockedCompareExchange)
+#pragma intrinsic(_InterlockedCompareExchange8)
 
-#ifdef __cplusplus
-extern "C" {
-#endif
+/* MSVC supports decltype keyword, but it's only supported on C++ and doesn't
+ * quite work here; and if a C++-only solution is worthwhile, then it would be
+ * better to use templates / function overloading, instead of decltype magic.
+ * Therefore, we rely on implicit casting to LONGLONG for the functions that return
+ */
 
 #define p_atomic_set(_v, _i) (*(_v) = (_i))
 #define p_atomic_read(_v) (*(_v))
 
-static inline bool
-p_atomic_dec_zero(int32_t *v)
-{
-   return _InterlockedDecrement((long *)v) == 0;
-}
-
-static inline void
-p_atomic_inc(int32_t *v)
-{
-   _InterlockedIncrement((long *)v);
-}
-
-static inline int32_t
-p_atomic_inc_return(int32_t *v)
-{
-   return _InterlockedIncrement((long *)v);
-}
-
-static inline void
-p_atomic_dec(int32_t *v)
-{
-   _InterlockedDecrement((long *)v);
-}
-
-static inline int32_t
-p_atomic_dec_return(int32_t *v)
-{
-   return _InterlockedDecrement((long *)v);
-}
-
-static inline int32_t
-p_atomic_cmpxchg(int32_t *v, int32_t old, int32_t _new)
-{
-   return _InterlockedCompareExchange((long *)v, _new, old);
-}
-
-#ifdef __cplusplus
-}
-#endif
+#define p_atomic_dec_zero(_v) \
+   (p_atomic_dec_return(_v) == 0)
+
+#define p_atomic_inc(_v) \
+   ((void) p_atomic_inc_return(_v))
+
+#define p_atomic_inc_return(_v) (\
+   sizeof *(_v) == sizeof(short)   ? InterlockedIncrement16((short *)  (_v)) : \
+   sizeof *(_v) == sizeof(long)    ? InterlockedIncrement  ((long *)   (_v)) : \
+   sizeof *(_v) == sizeof(__int64) ? InterlockedIncrement64((__int64 *)(_v)) : \
+ (assert(!"should not get here"), 0))
+
+#define p_atomic_dec(_v) \
+   ((void) p_atomic_dec_return(_v))
+
+#define p_atomic_dec_return(_v) (\
+   sizeof *(_v) == sizeof(short)   ? InterlockedDecrement16((short *)  (_v)) : \
+   sizeof *(_v) == sizeof(long)    ? InterlockedDecrement  ((long *)   (_v)) : \
+   sizeof *(_v) == sizeof(__int64) ? InterlockedDecrement64((__int64 *)(_v)) : \
+ (assert(!"should not get here"), 0))
+
+#define p_atomic_cmpxchg(_v, _old, _new) (\
+   sizeof *(_v) == sizeof(char)    ? _InterlockedCompareExchange8((char *)   (_v), (char)   (_new), (char)   (_old)) : \
+   sizeof *(_v) == 

Re: [Mesa-dev] [PATCH 0/7] util: Move u_atomic.h to src/util and modify API

2014-11-25 Thread Matt Turner
On Mon, Nov 24, 2014 at 4:39 PM, Matt Turner matts...@gmail.com wrote:
 I've got some thread-safety fixes queued up after this and thought I'd
 be a good Mesa citizen and pull some code into src/util.

 I did some clean ups like replacing INLINE (MSVC knows about inline
 these days, right?) and used stdbool.h instead of the boolean type.

 I also removed the inline assembly implementations because they were
 either dead code, or only allowed *ancient* gcc to build Mesa and
 because I didn't want to update them for the next patch, which makes
 the API consist of some macros that internally do the right operation
 based on the type.

 The last patch looks funky, but I think it's actually a reasonable
 solution. I don't have MSVC or Sun Studio, so please give this a
 test.

With José's fixes merged in, I've pushed a branch to

git://people.freedesktop.org/~mattst88/mesa thread-safe

Feel free to ignore the patches on top of this series.
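
For anyone skimming the series, the "macros that internally do the right operation based on the type" idea can be sketched like this. It is not the code from the branch: the toy_inc* helpers are hypothetical stand-ins for per-width platform functions (for example the Interlocked* family on Windows), and they are implemented with GCC builtins here only so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed-width helpers, one per operand size. */
static int16_t toy_inc16(int16_t *v) { return __sync_add_and_fetch(v, 1); }
static int32_t toy_inc32(int32_t *v) { return __sync_add_and_fetch(v, 1); }
static int64_t toy_inc64(int64_t *v) { return __sync_add_and_fetch(v, 1); }

/* Typeless front end: sizeof is a compile-time constant, so the compiler
 * keeps only the branch matching the operand width.  The final arm fires
 * only for an unsupported width.
 */
#define toy_atomic_inc_return(_v) (                                   \
   sizeof *(_v) == sizeof(int16_t) ? toy_inc16((int16_t *)(_v)) :     \
   sizeof *(_v) == sizeof(int32_t) ? toy_inc32((int32_t *)(_v)) :     \
   sizeof *(_v) == sizeof(int64_t) ? toy_inc64((int64_t *)(_v)) :     \
                                     (assert(!"should not get here"), 0))
```

The whole macro body is wrapped in parentheses so that it behaves as a single expression wherever it is used, which is the same reason the nested ternary needed parentheses in the MSVC and Sun Studio cases.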


[Mesa-dev] [Bug 85203] [DRI2][RadeonSI] Pageflip completion event has impossible msc.

2014-11-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=85203

--- Comment #5 from Mario Kleiner mario.klei...@tuebingen.mpg.de ---
(In reply to Michel Dänzer from comment #4)
 Mario, any ideas?
 
 These are possible scenarios I can think of where this could still happen:
 
 1. The pflip interrupt triggers before the corresponding vblank interrupt
 2. The page flip is queued just before the vertical blank period, such that
    it's too late for the corresponding vblank interrupt to trigger, but early
    enough for the flip to complete in the same vertical blank period.
 3. Race condition when enabling the vblank interrupt, causing the vertical
    blank counter to be off by 1.
 
 AFAIK 1. shouldn't happen, not sure if 2. can happen.

3. Can't cause this: The ddx calls waitvblank ioctl() while preparing a swap,
to queue a vblank event - vblank irqs on. They stay on for at least 5 seconds
after queueing and dispatching the vblank event to the server, and it won't
take the server/ddx over 5 seconds from receiving the event to calling the
pflip ioctl, and then the pending flip keeps vblank irqs on until flip
completion. So if there were an off-by-one bug, it would consistently affect the
flip completion msc and the calculated target_msc used by the check that was
triggered and causing this log message.

2. Don't think this can happen either: vblank irq is enabled before queueing a
flip and vblank irq fires a bit before vblank iirc, and the hw is programmed to
execute and complete pageflips only at leading edge of vblank since Linux 3.16,
so the order of interrupt events in the interrupt ring should be so that the
vblank handler gets called before the corresponding pflip irq handler. Unless
the hw wouldn't fire a vblank irq.

A variation could in theory happen: The vblank irq fires shortly before vblank
(about 2 scanlines iirc) - msc++ , then a glXSwapBuffers swap gets scheduled,
ddx reads the already incremented msc for the upcoming vblank and bases its
reference value for expected msc on it, queues vblank event to the kernel -
gets dispatched immediately, triggers the pageflip ioctl and the flip gets
programmed into the hw before leading edge of the upcoming vblank - flip
completes and delivers a completion msc that's one smaller than the expected
minimum msc - This warning gets logged. But i don't think this is likely. All
this would have to happen within a tiny time-window of maybe at most 30
microseconds at exactly the right moment, about 2 scanlines before start of
vblank. From tracing this a few months ago, i seem to remember that the sequence
of actions takes longer than 30 usecs.

1. Shouldn't happen if everything works correctly. But if for some reason the
pageflip wouldn't be synchronized to vblank, we'd get a tearing flip and these
warning messages, because the flip might get queued in the middle of the
scanout and then complete immediately and report the msc of the current refresh
cycle, as the pflip irq would execute long before the vblank irq.



[Mesa-dev] [Bug 85203] [DRI2][RadeonSI] Pageflip completion event has impossible msc.

2014-11-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=85203

--- Comment #6 from Mario Kleiner mario.klei...@tuebingen.mpg.de ---
Also, looking at the current ati-ddx, there have been changes to the logic since
I last touched that code a long time ago. Functions like
radeon_dri2_extrapolate_msc_delay() and radeon_get_interpolated_vblanks() seem
to tinker with the values that go into the check that triggered these warnings
in the log. Are they some kind of adjustment for dpms off periods? Maybe there
is an off-by-one error there due to rounding?



Re: [Mesa-dev] [PATCH] util/hash_table: Rework the API to know about hashing

2014-11-25 Thread Timothy Arceri
On Tue, 2014-11-25 at 01:56 -0500, Connor Abbott wrote:
 On Tue, Nov 25, 2014 at 1:19 AM, Jason Ekstrand ja...@jlekstrand.net wrote:
  Previously, the hash_table API required the user to do all of the hashing
  of keys as it passed them in.  Since the hashing function is intrinsically
  tied to the comparison function, it makes sense for the hash table to know
  about it.  Also, it makes for a somewhat clumsy API as the user is
  constantly calling hashing functions many of which have long names.  This
  is especially bad when the standard call looks something like
 
  _mesa_hash_table_insert(ht, _mesa_pointer_hash(key), key, data);
 
  In the above case, there is no reason why the hash table shouldn't do the
  hashing for you.  We leave the option for you to do your own hashing if
  it's more efficient, but it's no longer needed.  Also, if you do do your
  own hashing, the hash table will assert that your hash matches what it
  expects out of the hashing function.  This should make it harder to mess up
  your hashing.
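
  To illustrate the idea in the commit message, here is a minimal sketch of a
  table that owns its hashing function. It is not the Mesa implementation:
  the struct, the function names and the single-slot "storage" are all
  hypothetical and only show the two entry points and the assertion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*toy_hash_fn)(const void *key);

/* Hypothetical table: it remembers the hash function it was created with.
 * Real storage is replaced by a single "last inserted" slot. */
struct owned_table {
    toy_hash_fn key_hash;
    uint32_t last_hash;
    const void *last_key;
};

/* Stand-in pointer hash, for illustration only. */
static uint32_t toy_pointer_hash(const void *key)
{
    return (uint32_t)((uintptr_t)key >> 2);
}

/* Caller supplies a precomputed hash; the table checks it against its own
 * hash function so a mismatched hash is caught in debug builds. */
static void toy_insert_pre_hashed(struct owned_table *ht, uint32_t hash,
                                  const void *key)
{
    assert(hash == ht->key_hash(key));
    ht->last_hash = hash;
    ht->last_key = key;
}

/* Common case: the table does the hashing for the caller. */
static void toy_insert(struct owned_table *ht, const void *key)
{
    toy_insert_pre_hashed(ht, ht->key_hash(key), key);
}
```

  A caller that already has the hash at hand uses the pre-hashed variant;
  everyone else just passes the key.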
 
 Yes, please! The only fear I have is that this may affect some
 microbenchmark somewhere, since every time you insert, lookup, or
 remove an element of the hash table you now have to go through a
 function pointer to get to the hashing function. This doesn't seem too
 likely to make a difference, but since the primary goal of the hash
 table and the reason it was written is performance you can never be
 too careful. Of course, if it does turn out to make a difference, the
 fix is easy.

On the subject of hash table performance, I spent way too much time
looking into this recently. In the end, benchmarking and profiling was
taking up too much time for what it was worth (at least in my test
cases), so I've moved on for now. However, for those interested, here is
what I found.

Unsurprisingly, most time is spent computing the hash and resolving
collisions. For increased hash speed I used the _mm_crc32_u8()
intrinsic, which seemed to give a nice bump in performance with no
noticeable change in collisions compared to the FNV-1a hash.
For better collision resolution I was looking at robin hood hashing,
which looks like a nice fit for the current table [1], but I moved on
before trying it out.
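
For reference, the FNV-1a hash used as the baseline above processes one byte
at a time with an xor followed by a multiply. The following is a standalone
32-bit sketch using the published FNV offset basis and prime; the function
name is made up here and this is not a copy of the Mesa code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over an arbitrary byte buffer. */
static uint32_t fnv1a_32(const void *data, size_t len)
{
    const unsigned char *bytes = data;
    uint32_t hash = 2166136261u;        /* FNV offset basis */

    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        hash ^= bytes[i];               /* xor in the next byte first... */
        hash *= 16777619u;              /* ...then multiply by the FNV prime */
    }
    return hash;
}
```

The per-byte multiply forms a serial dependency chain, which is one plausible
reason a hardware CRC32 instruction can come out ahead on longer keys.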

As for other improvements, I also split the insert function in two.
Currently there is almost always a search done before the insert, but
then the insert does a second search to find an empty slot. So I created
a _mesa_hash_table_search_pre_insert function that checks whether the
entry already exists and, if it doesn't, returns the spare slot, which can
then be passed to an insert that doesn't do a second search.
Once this is done you can also reorder the if statement in the search, as
you can now expect search calls to be lookups that return a result rather
than searches for a free slot.
Although this seemed to reduce the time spent in this area, I also noticed a
drop (very small but seemingly reproducible) in frame rates, and I couldn't
work out why this might be happening.
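
The split described above can be sketched roughly as follows. This is a toy
open-addressing table with linear probing, pointer-equality keys and no
deletion support; every name in it is hypothetical, and it is only meant to
show how one probe sequence can serve both the lookup and the later insert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SIZE 8u   /* fixed power-of-two capacity, for illustration */

struct toy_entry {
    uint32_t hash;
    const void *key;   /* NULL marks an empty slot */
    void *data;
};

struct toy_table {
    struct toy_entry slots[TOY_SIZE];
    unsigned used;
};

/* Single probe sequence: returns the matching entry if the key is present.
 * Otherwise returns NULL and stores the first empty slot in *free_slot
 * (NULL if the table is full). */
static struct toy_entry *
toy_search_pre_insert(struct toy_table *ht, uint32_t hash, const void *key,
                      struct toy_entry **free_slot)
{
    *free_slot = NULL;
    for (unsigned i = 0; i < TOY_SIZE; i++) {
        struct toy_entry *e = &ht->slots[(hash + i) & (TOY_SIZE - 1u)];
        if (e->key == NULL) {
            *free_slot = e;
            return NULL;
        }
        if (e->hash == hash && e->key == key)
            return e;
    }
    return NULL;
}

/* Insert into a slot found by toy_search_pre_insert(); no second search. */
static void
toy_insert_at(struct toy_table *ht, struct toy_entry *slot, uint32_t hash,
              const void *key, void *data)
{
    assert(slot != NULL && slot->key == NULL && key != NULL);
    slot->hash = hash;
    slot->key = key;
    slot->data = data;
    ht->used++;
}
```

A caller does the search once, uses the returned entry on a hit, and on a
miss hands the spare slot straight to toy_insert_at().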

Anyway if anyone's interested in looking at this further I'm happy to
send the patches or push it to my github repo.

[1] http://codecapsule.com/2013/11/11/robin-hood-hashing/


 
 Connor
 
 
  Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com
  ---
   src/gallium/drivers/vc4/vc4_opt_cse.c  |  7 +--
   src/glsl/ir_variable_refcount.cpp  |  9 ++--
   src/glsl/link_uniform_block_active_visitor.cpp |  6 +--
   src/glsl/link_uniform_blocks.cpp   |  3 +-
   src/mesa/main/hash.c   | 17 --
   src/util/hash_table.c  | 75 
  +++---
   src/util/hash_table.h  | 19 +--
   src/util/tests/hash_table/collision.c  | 22 
   src/util/tests/hash_table/delete_and_lookup.c  | 16 +++---
   src/util/tests/hash_table/delete_management.c  |  9 ++--
   src/util/tests/hash_table/destroy_callback.c   |  9 ++--
   src/util/tests/hash_table/insert_and_lookup.c  | 13 +++--
   src/util/tests/hash_table/insert_many.c|  6 +--
   src/util/tests/hash_table/random_entry.c   |  4 +-
   src/util/tests/hash_table/remove_null.c|  2 +-
   src/util/tests/hash_table/replacement.c| 13 +++--
   16 files changed, 138 insertions(+), 92 deletions(-)
 
  diff --git a/src/gallium/drivers/vc4/vc4_opt_cse.c 
  b/src/gallium/drivers/vc4/vc4_opt_cse.c
  index bebfb652..0e9335e 100644
  --- a/src/gallium/drivers/vc4/vc4_opt_cse.c
  +++ b/src/gallium/drivers/vc4/vc4_opt_cse.c
  @@ -84,7 +84,7 @@ vc4_find_cse(struct vc4_compile *c, struct hash_table *ht,
 
   uint32_t hash = _mesa_hash_data(key, sizeof(key));
   struct hash_entry *entry =
  -_mesa_hash_table_search(ht, hash, key);
  +_mesa_hash_table_search_with_hash(ht, hash, key);
 
   if (entry) {
   if (debug) {
  @@ -106,7 +106,7 @@ vc4_find_cse(struct vc4_compile *c, struct hash_table 
  *ht,
   

Re: [Mesa-dev] [PATCH] mesa/glsl/glapi: enable GL_EXT_draw_buffers extension

2014-11-25 Thread Tapani

On 11/25/2014 03:56 PM, Ilia Mirkin wrote:

On Tue, Nov 25, 2014 at 6:23 AM, Tapani Pälli tapani.pa...@intel.com wrote:

Patch enables ES2 extension that utilizes existing ES3 functionality.

The changes make all subtests of the WebGL conformance test
'webgl-draw-buffers' run and pass when running Chrome on OpenGL ES.

Signed-off-by: Tapani Pälli tapani.pa...@intel.com
---
  src/glsl/glcpp/glcpp-parse.y| 2 ++
  src/glsl/glsl_parser_extras.cpp | 1 +
  src/glsl/glsl_parser_extras.h   | 2 ++
  src/mapi/glapi/gen/es_EXT.xml   | 9 +
  src/mesa/main/extensions.c  | 1 +
  src/mesa/main/mtypes.h  | 1 +
  6 files changed, 16 insertions(+)

diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
index f1119eb..414f4df 100644
--- a/src/glsl/glcpp/glcpp-parse.y
+++ b/src/glsl/glcpp/glcpp-parse.y
@@ -2380,6 +2380,8 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio
  add_builtin_define(parser, "GL_OES_EGL_image_external", 1);
if (extensions->OES_standard_derivatives)
   add_builtin_define(parser, "GL_OES_standard_derivatives", 1);
+ if (extensions->EXT_draw_buffers)
+add_builtin_define(parser, "GL_EXT_draw_buffers", 1);

It appears that you used tabs here instead of spaces. Also, nothing
was ever setting extensions->EXT_draw_buffers and you were enabling
the extension unconditionally. As such, the define should be added
unconditionally... Or am I misunderstanding?


Oops, yes, this is a leftover. The patch was more complicated until I
cleaned it up to its current state, but it seems I can now remove this
boolean as well and enable the extension unconditionally.



}
 } else {
add_builtin_define(parser, "GL_ARB_draw_buffers", 1);
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 7389baa..c798737 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3785,6 +3785,7 @@ struct gl_extensions
 GLboolean EXT_blend_func_separate;
 GLboolean EXT_blend_minmax;
 GLboolean EXT_depth_bounds_test;
+   GLboolean EXT_draw_buffers;

This does not appear to be used anywhere... why did you add it?


serves no purpose, will remove it

// Tapani


[Mesa-dev] [PATCH v2] mesa/glsl/glapi: enable GL_EXT_draw_buffers extension

2014-11-25 Thread Tapani Pälli
Patch enables ES2 extension that utilizes existing ES3 functionality.

The changes make all subtests of the WebGL conformance test
'webgl-draw-buffers' run and pass when running Chrome on OpenGL ES.

v2: remove unused boolean (Ilia Mirkin)

Signed-off-by: Tapani Pälli tapani.pa...@intel.com
---
 src/glsl/glcpp/glcpp-parse.y| 1 +
 src/glsl/glsl_parser_extras.cpp | 1 +
 src/glsl/glsl_parser_extras.h   | 2 ++
 src/mapi/glapi/gen/es_EXT.xml   | 9 +
 src/mesa/main/extensions.c  | 1 +
 5 files changed, 14 insertions(+)

diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
index f1119eb..3dd2ebc 100644
--- a/src/glsl/glcpp/glcpp-parse.y
+++ b/src/glsl/glcpp/glcpp-parse.y
@@ -2374,6 +2374,7 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio
if (parser->is_gles) {
   add_builtin_define(parser, "GL_ES", 1);
add_builtin_define(parser, "GL_EXT_separate_shader_objects", 1);
+   add_builtin_define(parser, "GL_EXT_draw_buffers", 1);
 
   if (extensions != NULL) {
  if (extensions->OES_EGL_image_external)
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 27e3301..2d49c3e 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -556,6 +556,7 @@ static const _mesa_glsl_extension _mesa_glsl_supported_extensions[] = {
    EXT(AMD_shader_trinary_minmax,  true,  false, dummy_true),
    EXT(AMD_vertex_shader_layer,true,  false, AMD_vertex_shader_layer),
    EXT(AMD_vertex_shader_viewport_index, true,  false,   AMD_vertex_shader_viewport_index),
+   EXT(EXT_draw_buffers,   false,  true, dummy_true),
    EXT(EXT_separate_shader_objects,false, true,  dummy_true),
    EXT(EXT_shader_integer_mix, true,  true,  EXT_shader_integer_mix),
    EXT(EXT_texture_array,  true,  false, EXT_texture_array),
diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index c14d74c..7a13875 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -470,6 +470,8 @@ struct _mesa_glsl_parse_state {
bool AMD_vertex_shader_layer_warn;
bool AMD_vertex_shader_viewport_index_enable;
bool AMD_vertex_shader_viewport_index_warn;
+   bool EXT_draw_buffers_enable;
+   bool EXT_draw_buffers_warn;
bool EXT_separate_shader_objects_enable;
bool EXT_separate_shader_objects_warn;
bool EXT_shader_integer_mix_enable;
diff --git a/src/mapi/glapi/gen/es_EXT.xml b/src/mapi/glapi/gen/es_EXT.xml
index e2dc390..3a2adeb 100644
--- a/src/mapi/glapi/gen/es_EXT.xml
+++ b/src/mapi/glapi/gen/es_EXT.xml
@@ -837,4 +837,13 @@
 </function>
 </category>
 
+<!-- 151. GL_EXT_draw_buffers -->
+<category name="GL_EXT_draw_buffers" number="151">
+<function name="DrawBuffersEXT" alias="DrawBuffers"
+  static_dispatch="false" es2="2.0">
+<param name="n" type="GLsizei" counter="true"/>
+<param name="bufs" type="const GLenum *" count="n"/>
+</function>
+</category>
+
 </OpenGLAPI>
diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index 0df04c2..3b206bf 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -212,6 +212,7 @@ static const struct extension extension_table[] = {
    { "GL_EXT_compiled_vertex_array",   o(dummy_true),  GLL,1996 },
    { "GL_EXT_copy_texture",o(dummy_true),  GLL,1995 },
    { "GL_EXT_depth_bounds_test",   o(EXT_depth_bounds_test),   GL, 2002 },
+   { "GL_EXT_draw_buffers",o(dummy_true),  ES2, 2012 },
    { "GL_EXT_draw_buffers2",   o(EXT_draw_buffers2),   GL, 2006 },
    { "GL_EXT_draw_instanced",  o(ARB_draw_instanced),  GL, 2006 },
    { "GL_EXT_draw_range_elements", o(dummy_true),  GLL,1997 },
-- 
1.9.3



Re: [Mesa-dev] [PATCH 1/2] configure.ac: remove enable flags for EGL and GBM Gallium state trackers

2014-11-25 Thread Pekka Paalanen
On Tue, 25 Nov 2014 16:22:25 +0100
Marek Olšák mar...@gmail.com wrote:

 Hi Pekka,
 
 I explained in the introductory thread [PATCH 0/2] Disable the EGL
 state tracker for Linux/DRI builds that it may break swrast
 (llvmpipe) on Wayland. I also explained all the reasons why I was
 about to do it. Does it break something not discussed before?

Hello Marek,

it's just that those explanations never ended up in git. A user was
complaining about the Wayland breakage and the fact that the commit (he
found it completely by himself, even!) gave absolutely no hint as to
why. So I needed to search the email archives for this thread, because I
remembered it was discussed and the reason was here.

I just wish you had copied it to the commit message. Cover letters do
not end up in the git history.

Not a problem anymore for me personally, since I have now bookmarked the
thread so I can easily point people to it.

The discussion about how to add the Wayland support back is now open at
https://bugs.freedesktop.org/show_bug.cgi?id=86701
since there is an actual use case reported (testing GL apps inside
qemu).


Thanks,
pq