Re: [Mesa-dev] [PATCH 9/9] [AUTONAK] i965/nir: Call nir_sweep().

2015-03-29 Thread Eric Anholt
Kenneth Graunke  writes:

> Mostly a proof of concept that it works; we free the memory shortly
> afterwards anyway, so it's kind of dumb to do this.
>
> The plan is to instead build nir_shaders at link time, rather than when
> compiling each shader specialization, and delete the GLSL IR.

This sounds really interesting -- it might make sense for me, too, if I
had a good way to clone the NIR for doing shader specialization.


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] [RFC] egl: propose simple EGL_MESA_image_dma_buf_export v2.4

2015-03-29 Thread Dave Airlie
From: Dave Airlie 

At the moment to get an EGL image to a dma-buf file descriptor,
you have to use EGL_MESA_drm_image, and then use libdrm to
convert this to a file descriptor.

This extension just provides an API modelled on EGL_MESA_drm_image,
to return a dma-buf file descriptor.

v2: update spec for new API proposal
add internal queries to get the fourcc back from intel driver.

v2.1: add gallium pieces.

v2.2: add offsets to spec and API, rename fd->fds, stride->strides
in API. rewrite spec a bit more, add some q/a

v2.3:
add modifiers to query interface and 64-bit type for that (Daniel Stone)
specify what happens to num fds vs num planes differences. (Chad Versace)

v2.4:
fix grammar (Daniel Stone)

Signed-off-by: Dave Airlie 
---
 docs/specs/MESA_image_dma_buf_export.txt | 142 +++
 include/EGL/eglmesaext.h |   8 ++
 include/GL/internal/dri_interface.h  |   4 +-
 src/egl/drivers/dri2/egl_dri2.c  |  59 -
 src/egl/main/eglapi.c|  48 +++
 src/egl/main/eglapi.h|  10 +++
 src/egl/main/egldisplay.h|   2 +
 src/egl/main/eglfallbacks.c  |   5 ++
 src/egl/main/eglmisc.c   |   2 +
 src/gallium/state_trackers/dri/dri2.c|  32 ++-
 src/mesa/drivers/dri/i965/intel_screen.c |  25 +-
 11 files changed, 332 insertions(+), 5 deletions(-)
 create mode 100644 docs/specs/MESA_image_dma_buf_export.txt

diff --git a/docs/specs/MESA_image_dma_buf_export.txt b/docs/specs/MESA_image_dma_buf_export.txt
new file mode 100644
index 000..3bc5890
--- /dev/null
+++ b/docs/specs/MESA_image_dma_buf_export.txt
@@ -0,0 +1,142 @@
+Name
+
+MESA_image_dma_buf_export
+
+Name Strings
+
+EGL_MESA_image_dma_buf_export
+
+Contributors
+
+Dave Airlie
+
+Contact
+
+Dave Airlie (airlied 'at' redhat 'dot' com)
+
+Status
+
+Proposal
+
+Version
+
+Version 2
+
+Number
+
+ 
+
+Dependencies
+
+Requires EGL 1.4 or later.  This extension is written against the
+wording of the EGL 1.4 specification.
+
+EGL_KHR_base_image is required.
+
+The EGL implementation must be running on a Linux kernel supporting the
+dma_buf buffer sharing mechanism.
+
+Overview
+
+This extension provides entry points for integrating EGLImage with the
+dma-buf infrastructure.  The extension allows creating a Linux dma_buf
+file descriptor or multiple file descriptors, in the case of multi-plane
+YUV image, from an EGLImage.
+
+It is designed to provide the complementary functionality to
+EGL_EXT_image_dma_buf_import.
+
+IP Status
+
+Open-source; freely implementable.
+
+New Types
+
+This is a 64 bit unsigned integer.
+
+typedef khronos_uint64_t EGLuint64MESA;
+
+
+New Procedures and Functions
+
+EGLBoolean eglExportDMABUFImageQueryMESA(EGLDisplay dpy,
+  EGLImageKHR image,
+ int *fourcc,
+ int *num_planes,
+ EGLuint64MESA *modifiers);
+
+EGLBoolean eglExportDMABUFImageMESA(EGLDisplay dpy,
+EGLImageKHR image,
+int *fds,
+   EGLint *strides,
+   EGLint *offsets);
+
+New Tokens
+
+None
+
+
+Additions to the EGL 1.4 Specification:
+
+To mirror the import extension, this extension attempts to return
+enough information to enable an exported dma-buf to be imported
+via eglCreateImageKHR and EGL_LINUX_DMA_BUF_EXT token.
+
+Retrieving the information is a two step process, so two APIs
+are required.
+
+The first entrypoint
+   EGLBoolean eglExportDMABUFImageQueryMESA(EGLDisplay dpy,
+  EGLImageKHR image,
+ int *fourcc,
+ int *num_planes,
+ EGLuint64MESA *modifiers);
+
+is used to retrieve the pixel format of the buffer, as specified by
+drm_fourcc.h, the number of planes in the image and the Linux
+drm modifiers. <fourcc>, <num_planes> and <modifiers> may be NULL,
+in which case no value is retrieved.
+
+The second entrypoint retrieves the dma_buf file descriptors,
+strides and offsets for the image. The caller should pass
+arrays sized according to the num_planes values retrieved previously.
+Passing arrays of the wrong size will have undefined results.
+If the number of fds is less than the number of planes, then
+subsequent fd slots should contain -1.
+
+EGLBoolean eglExportDMABUFImageMESA(EGLDisplay dpy,
+ EGLImageKHR image,
+int *fds,
+ EGLint *strides,
+ EGLint *offsets);
+
+<fds>, <strides>, <offsets> can be NULL if the information isn't
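
The two-step flow described in the spec text above can be sketched as
follows. This is only an illustration under stated assumptions: the
typedefs are stand-ins so the sketch builds without EGL headers, and
SKETCH_MAX_PLANES, struct exported_image, export_image(), fake_query() and
fake_export() are hypothetical names that do not appear in the patch. The
entry points are passed in as function pointers whose signatures follow
the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in typedefs (assumption): real code would get these from the EGL
 * headers rather than declaring them locally. */
typedef unsigned int EGLBoolean;
typedef int32_t EGLint;
typedef void *EGLDisplay;
typedef void *EGLImageKHR;
typedef uint64_t EGLuint64MESA;

/* Function-pointer types mirroring the two proposed entry points. */
typedef EGLBoolean (*export_query_fn)(EGLDisplay dpy, EGLImageKHR image,
                                      int *fourcc, int *num_planes,
                                      EGLuint64MESA *modifiers);
typedef EGLBoolean (*export_fn)(EGLDisplay dpy, EGLImageKHR image,
                                int *fds, EGLint *strides, EGLint *offsets);

#define SKETCH_MAX_PLANES 4 /* hypothetical bound used only by this sketch */

struct exported_image {
   int fourcc;
   int num_planes;
   int fds[SKETCH_MAX_PLANES];
   EGLint strides[SKETCH_MAX_PLANES];
   EGLint offsets[SKETCH_MAX_PLANES];
};

/* Step 1 queries the format and plane count; step 2 exports fds, strides
 * and offsets into arrays large enough for that plane count.
 * Returns 0 on success and -1 on failure. */
static int
export_image(export_query_fn query, export_fn do_export,
             EGLDisplay dpy, EGLImageKHR image, struct exported_image *out)
{
   assert(out != NULL);

   /* <modifiers> may be NULL, in which case no value is retrieved. */
   if (!query(dpy, image, &out->fourcc, &out->num_planes, NULL))
      return -1;
   if (out->num_planes < 1 || out->num_planes > SKETCH_MAX_PLANES)
      return -1;

   for (int i = 0; i < SKETCH_MAX_PLANES; i++)
      out->fds[i] = -1;

   if (!do_export(dpy, image, out->fds, out->strides, out->offsets))
      return -1;
   return 0;
}

/* Fake driver, for demonstration only: a two-plane image in a single
 * buffer, so the second fd slot is -1. */
static EGLBoolean
fake_query(EGLDisplay dpy, EGLImageKHR image, int *fourcc, int *num_planes,
           EGLuint64MESA *modifiers)
{
   (void)dpy; (void)image;
   if (fourcc)
      *fourcc = 0x3231564e; /* 'N','V','1','2' packed little-endian */
   if (num_planes)
      *num_planes = 2;
   if (modifiers)
      *modifiers = 0;
   return 1;
}

static EGLBoolean
fake_export(EGLDisplay dpy, EGLImageKHR image, int *fds, EGLint *strides,
            EGLint *offsets)
{
   (void)dpy; (void)image;
   if (fds) { fds[0] = 42; fds[1] = -1; }
   if (strides) { strides[0] = 640; strides[1] = 640; }
   if (offsets) { offsets[0] = 0; offsets[1] = 640 * 480; }
   return 1;
}
```

A caller would fetch the real entry points from the EGL implementation
and pass them in place of the fake ones; the fake driver is there only so
the flow can be exercised without a display.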

Re: [Mesa-dev] [PATCH 3/3] gallium: Add tgsi_to_nir to get a nir_shader for a TGSI shader.

2015-03-29 Thread Eric Anholt
Kenneth Graunke  writes:

> On Friday, March 27, 2015 01:54:32 PM Eric Anholt wrote:
>> This will be used by the VC4 driver for doing device-independent
>> optimization, and hopefully eventually replacing its whole IR.  It also
>> may be useful to other drivers for the same reason.

> Hi Eric!
>
> I have a bunch of comments below, but overall this looks great.
>
> You should probably have someone who knows TGSI better than I do review
> it, but for what it's worth, this is:
>
> Reviewed-by: Kenneth Graunke 

Thanks!  There was definitely useful feedback in here, and I've taken
most of it.

>> +/* LOG - Approximate Logarithm Base 2
>> + *  dst.x = \lfloor\log_2{|src.x|}\rfloor
>> + *  dst.y = \frac{|src.x|}{2^{\lfloor\log_2{|src.x|}\rfloor}}
>> + *  dst.z = \log_2{|src.x|}
>> + *  dst.w = 1.0
>> + */
>> +static void
>> +ttn_log(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
>> +{
>> +   nir_ssa_def *abs_srcx = nir_fabs(b, ttn_channel(b, src[0], X));
>> +   nir_ssa_def *log2 = nir_flog2(b, abs_srcx);
>> +
>> +   ttn_move_dest_masked(b, dest, nir_ffloor(b, log2), TGSI_WRITEMASK_X);
>> +   ttn_move_dest_masked(b, dest,
>> +nir_fdiv(b, abs_srcx, nir_fexp2(b, nir_ffloor(b, log2))),
>
> You're generating two copies of floor(log2) here, which will have to be
> CSE'd later.  In prog_to_nir, I created a temporary and used it in both
> places:
>
>nir_ssa_def *floor_log2 = nir_ffloor(b, log2);
>
> We're generating tons of rubbish for NIR to optimize anyway, so it's not
> a big deal...but...may as well do the trivial improvement.

I much more expect the whole mess except for the dst.z computation to
get DCEed away, so it's just one extra DCE out of so many.

(and we generate lots of copy prop to avoid all throughout this mess,
which I've considered short-circuiting in nir_builder some day).

>> +static void
>> +ttn_sle(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
>> +{
>> +   ttn_move_dest(b, dest, nir_sge(b, src[1], src[0]));
>> +}
>
> I've got code here to generate b2f(fge(...)) instead of sge(...) since I
> didn't want to bother implementing it in my driver, and figured the b2fs
> might be able to get optimized away.
>
> That said, I suppose we could probably just add lowering transformations
> that turn sge -> b2f(fge(...)) when options->native_integers is set, and
> delete my code...

For me an SGE in hardware is:

fsub.sf(null, src0, src1)
mov.nc(dest, 1.0)
mov.ns(dest, 0)

while your plan would be... oh wait.  I didn't even have a b2f
implementation because TGSI doesn't do that (they just AND the bool with
1.0).  But an FGE is:

fsub.sf(null, src0, src1)
mov.nc(dest, ~0)
mov.ns(dest, 0)

so any more instructions would be worse.
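
To make the comparison concrete, here is a scalar C model of the two
forms. It is a sketch, not driver code: sge(), fge(), b2f() and sle() are
local helpers named after the opcodes, and b2f() is written the way the
mail describes the TGSI approach (AND the boolean mask with the bits of
1.0).

```c
#include <stdint.h>
#include <string.h>

/* SGE: float result, 1.0 when a >= b, otherwise 0.0. */
static float
sge(float a, float b)
{
   return a >= b ? 1.0f : 0.0f;
}

/* FGE: integer-style boolean, all bits set when a >= b, otherwise 0. */
static uint32_t
fge(float a, float b)
{
   return a >= b ? ~UINT32_C(0) : UINT32_C(0);
}

/* b2f modelled as a bitwise AND of the mask with the bits of 1.0f. */
static float
b2f(uint32_t mask)
{
   const float one = 1.0f;
   uint32_t bits;
   float result;

   memcpy(&bits, &one, sizeof bits);
   bits &= mask;
   memcpy(&result, &bits, sizeof result);
   return result;
}

/* SLE(a, b) is SGE with the operands swapped, as ttn_sle() does. */
static float
sle(float a, float b)
{
   return sge(b, a);
}
```

b2f(fge(a, b)) and sge(a, b) give the same value for every input, so the
choice between them only matters for instruction count on a given
backend.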

>> +static void
>> +ttn_xpd(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
>> +{
>> +   ttn_move_dest_masked(b, dest,
>> +nir_fsub(b,
>> + nir_fmul(b,
>> +  ttn_swizzle(b, src[0], Y, Z, X, X),
>> +  ttn_swizzle(b, src[1], Z, X, Y, X)),
>> + nir_fmul(b,
>> +  ttn_swizzle(b, src[1], Y, Z, X, X),
>> +  ttn_swizzle(b, src[0], Z, X, Y, X))),
>> +TGSI_WRITEMASK_XYZ);
>> +   ttn_move_dest_masked(b, dest, nir_imm_float(b, 1.0), TGSI_WRITEMASK_W);
>> +}
>> +
>> +static void
>> +ttn_dp2a(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
>> +{
>> +   ttn_move_dest(b, dest,
>> + ttn_channel(b, nir_fadd(b,
>> + ttn_channel(b, nir_fdot2(b, src[0],
>> +  src[1]),
>> + X),
>
> Do you really need to do ttn_channel(b, ..., X) on a fdot2 result?  It's
> already a scalar value.  Same comment applies to the below four.
>
> I should probably delete that from prog_to_nir as well.

Good catch, the cleanups for scalar in the builder have obsoleted this,
I think.
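
For reference, a scalar C model of what DP2A computes, assuming the usual
TGSI definition (two-component dot product plus src2.x, replicated to all
channels). struct vec4, fdot2() and dp2a() are local names for this
sketch. The dot product is already a single float here, which is why no
channel select is needed on it.

```c
struct vec4 {
   float x, y, z, w;
};

/* Two-component dot product: the result is a scalar. */
static float
fdot2(struct vec4 a, struct vec4 b)
{
   return a.x * b.x + a.y * b.y;
}

/* DP2A: dot2(src0, src1) + src2.x, written to every channel. */
static struct vec4
dp2a(struct vec4 src0, struct vec4 src1, struct vec4 src2)
{
   float r = fdot2(src0, src1) + src2.x;
   struct vec4 dst = { r, r, r, r };
   return dst;
}
```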

>
>> + src[2]),
>> + X));
>> +}

>> +static void
>> +ttn_ucmp(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
>> +{
>> +   ttn_move_dest(b, dest, nir_bcsel(b,
>> +nir_ine(b, src[0], nir_imm_float(b, 0.0)),
>
> Doing nir_imm_int(b, 0) here would make more sense.

Yeah, now that I have it :)
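
A small C sketch of why the two spellings behave the same while the
integer one reads better: the comparison is an integer compare on the raw
bits, and the bits of 0.0f are all zero. float_bits() and ucmp() are
hypothetical helpers for illustration only.

```c
#include <stdint.h>
#include <string.h>

/* Raw bit pattern of a float. */
static uint32_t
float_bits(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof u);
   return u;
}

/* UCMP modelled on raw bits: pick a when cond is non-zero as an integer. */
static uint32_t
ucmp(uint32_t cond, uint32_t a, uint32_t b)
{
   return cond != 0 ? a : b;
}
```

Note that -0.0f has a non-zero bit pattern, so it selects the first
operand under an integer compare even though it equals 0.0 as a float.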

>> +static void
>> +ttn_kill_if(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
>> +{
>> +   nir_if *if_stmt = nir_if_create(b->shader);
>> +   if_stmt->condition =
>> +  nir_src_for_ssa(nir_bany4(b, nir_flt(b, src[0], nir_imm_float(b, 0.0))));
>> +   nir_cf_node_insert_end(b->cf_node_list, &if_stmt->cf_node);
>> +
>> +   nir_intrinsic_instr *discard =
>> +  nir_int

[Mesa-dev] swrast: Correct pixel draw span endpoints computation, rid vertical lines

2015-03-29 Thread Daniel J Sebald

Hello mesa-dev,

I've created a changeset for the legacy swrast_dri.so driver which fixes
vertical lines at the GL_MAX_TEXTURE_SIZE boundaries caused by miscomputed
column ranges.  Please consider.


Basically, I was very meticulous about the mathematical formulas in 
terms of rounding according to the OpenGL definition for both positive 
and negative xfactor and yfactor.  That fixed the vertical lines, but 
after creating a suggested Piglit test for glPixelZoom I noticed the 
driver still wasn't behaving according to some intricacies of the 
standard formulas.


The other advantage of being more detailed about the formulas is sort of 
sweeping numerical issues into a corner, i.e., the behavior of ceil() 
and floor() functions.  In order to pass the expected tests, I needed to 
add a tolerance of about 0.4 to the rounding functions.  This isn't 
anywhere near the, say, ULP (units in the last place) called out in 
ARB-shader-precision, but it is the necessary value I've found when 
working with single-precision float arithmetic.  It's surprising how 
quickly precision can be lost by the division that is typically used in 
computing xfactor/yfactor before the user makes the glPixelZoom() call.
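
As a rough sketch of the rounding change, the following compares the old
endpoint computation (an int cast, which truncates toward zero) with a
ceiling that has the 0.4 tolerance subtracted first. ceil_to_int(),
edge_by_cast() and edge_by_ceil() are hypothetical helpers written for
this illustration; ceil_to_int() stands in for (GLint) ceilf() and is only
meant for values well inside the int range.

```c
#define EPSCEIL 0.4f

/* Ceiling of v as an int, for values well inside the int range. */
static int
ceil_to_int(float v)
{
   int i = (int) v;            /* truncates toward zero */
   return ((float) i < v) ? i + 1 : i;
}

/* Old form: image origin plus the truncated zoomed offset. */
static int
edge_by_cast(int imageX, int spanX, float zoomX)
{
   return imageX + (int) ((spanX - imageX) * zoomX);
}

/* New form: ceiling of the zoomed offset after subtracting the tolerance. */
static int
edge_by_ceil(int imageX, int spanX, float zoomX)
{
   return imageX + ceil_to_int((spanX - imageX) * zoomX - EPSCEIL);
}
```

With an offset of 3 and a zoom of 0.5 the zoomed offset is 1.5: the cast
gives 1, while the ceiling form gives 2. A zoomed offset that lands just
above an integer because of float error (say 2.0001) still maps to 2 with
the tolerance, where a plain ceiling would give 3.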


I've collected image files comparing the legacy driver and patched 
legacy, as well as Gallium driver here:


  https://bugs.freedesktop.org/show_bug.cgi?id=89586

The PNG files for Piglit test results, more than anything, should 
provide a good understanding of the issues.  The Gallium driver shows a 
slight discrepancy from the patched legacy driver.  I'm not saying which 
is correct, if any, just pointing out the slight difference.


Patch file is attached.  I'd be happy to answer any questions or tweak 
the code.


Regards,

Dan Sebald
From 5eba613e22a1096302c46df395f9a6d67f6b8625 Mon Sep 17 00:00:00 2001
From: Daniel J Sebald 
Date: Sun, 29 Mar 2015 22:52:08 -0500
Subject: [PATCH] swrast: Correct pixel draw span endpoints computation, rid
 vertical lines

Change the start/end indices computation to use ceiling, not an int cast,
of the encompassing rectangle.  Solves problem of dropped pixels at
GL_MAX_TEXTURE_SIZE/SWRAST_MAX_WIDTH intervals that created vertical lines
when -1.0 < xfactor < 1.0.  Also fine tune unzoom formula and add a macro
SPAN_LOOP_X() for all pixel zoom operations.
---
 src/mesa/swrast/s_zoom.c |  252 +
 1 files changed, 162 insertions(+), 90 deletions(-)

diff --git a/src/mesa/swrast/s_zoom.c b/src/mesa/swrast/s_zoom.c
index ab22652..067d1d6 100644
--- a/src/mesa/swrast/s_zoom.c
+++ b/src/mesa/swrast/s_zoom.c
@@ -34,11 +34,36 @@
 #include "s_zoom.h"
 
 
+#define SPAN_LOOP_X(OPERATION) \
+   do { \
+  if (ctx->Pixel.ZoomX > 0) { \
+ GLint i; \
+ for (i = 0; i < zoomedWidth; i++) { \
+GLint j = positive_unzoom_x(ctx->Pixel.ZoomX, imgX, x0 + i) - spanX; \
+OPERATION; \
+ } \
+  } \
+  else { \
+ GLint i; \
+ for (i = 0; i < zoomedWidth; i++) { \
+GLint j = negative_unzoom_x(ctx->Pixel.ZoomX, imgX, x0 + i) - spanX; \
+OPERATION; \
+ } \
+  } \
+   } while (0)
+
+
+/* These are meant to address numerical effects of xfactor/yfactor being
+ * single-precision floating point numbers, as opposed to real numbers.
+ */
+#define EPSFLOOR 0.4
+#define EPSCEIL  0.4
+
 /**
  * Compute the bounds of the region resulting from zooming a pixel span.
  * The resulting region will be entirely inside the window/scissor bounds
  * so no additional clipping is needed.
- * \param imageX, imageY  position of the mage being drawn (gl WindowPos)
+ * \param imageX, imageY  position of the image being drawn (gl WindowPos)
  * \param spanX, spanY  position of span being drawing
  * \param width  number of pixels in span
  * \param x0, x1  returned X bounds of zoomed region [x0, x1)
@@ -47,7 +72,7 @@
  */
 static GLboolean
 compute_zoomed_bounds(struct gl_context *ctx, GLint imageX, GLint imageY,
-  GLint spanX, GLint spanY, GLint width,
+  GLint spanX, GLint spanY, GLint spanWidth,
   GLint *x0, GLint *x1, GLint *y0, GLint *y1)
 {
const struct gl_framebuffer *fb = ctx->DrawBuffer;
@@ -58,35 +83,41 @@ compute_zoomed_bounds(struct gl_context *ctx, GLint imageX, GLint imageY,
 
/*
 * Compute destination columns: [c0, c1)
+*
+* c0 - Pixels on left rectangle edge and greater are included
+* c1 - Pixels on right rectangle edge and greater are excluded
 */
-   c0 = imageX + (GLint) ((spanX - imageX) * ctx->Pixel.ZoomX);
-   c1 = imageX + (GLint) ((spanX + width - imageX) * ctx->Pixel.ZoomX);
-   if (c1 < c0) {
-  /* swap */
+   c0 = imageX + (GLint) ceilf((spanX - imageX) * ctx->Pixel.ZoomX - EPSCEIL);
+   c1 = imageX + (GLint) ceilf((spanX + spanWidth - imageX) * ctx->Pixel.ZoomX - EPSCEIL);
+   if (ctx->Pixel.ZoomX < 0) {
+  /* swap edge roles */
   G

Re: [Mesa-dev] [PATCH] glsl: fix unreachable(!"") to unreachable("")

2015-03-29 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin 

On Mon, Mar 30, 2015 at 1:05 AM, Tapani Pälli  wrote:
> Correct error with commit 151fb1e where assert was renamed
> to unreachable without removing ! from string argument.
>
> Signed-off-by: Tapani Pälli 
> ---
>  src/glsl/loop_controls.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/glsl/loop_controls.cpp b/src/glsl/loop_controls.cpp
> index d7f0b28..51804bb 100644
> --- a/src/glsl/loop_controls.cpp
> +++ b/src/glsl/loop_controls.cpp
> @@ -139,7 +139,7 @@ calculate_iterations(ir_rvalue *from, ir_rvalue *to, ir_rvalue *increment,
>   iter = new(mem_ctx) ir_constant(double(iter_value + bias[i]));
>   break;
>default:
> -  unreachable(!"Unsupported type for loop iterator.");
> +  unreachable("Unsupported type for loop iterator.");
>}
>
>ir_expression *const mul =
> --
> 2.1.0
>


[Mesa-dev] [PATCH] glsl: fix unreachable(!"") to unreachable("")

2015-03-29 Thread Tapani Pälli
Correct error with commit 151fb1e where assert was renamed
to unreachable without removing ! from string argument.

Signed-off-by: Tapani Pälli 
---
 src/glsl/loop_controls.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/loop_controls.cpp b/src/glsl/loop_controls.cpp
index d7f0b28..51804bb 100644
--- a/src/glsl/loop_controls.cpp
+++ b/src/glsl/loop_controls.cpp
@@ -139,7 +139,7 @@ calculate_iterations(ir_rvalue *from, ir_rvalue *to, ir_rvalue *increment,
  iter = new(mem_ctx) ir_constant(double(iter_value + bias[i]));
  break;
   default:
-  unreachable(!"Unsupported type for loop iterator.");
+  unreachable("Unsupported type for loop iterator.");
   }
 
   ir_expression *const mul =
-- 
2.1.0
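
Why the stray "!" matters, as a minimal sketch: this assumes unreachable()
is built on assert(!str), in the same way as the assert(!"...") idiom it
replaced. UNREACHABLE_ASSERT_CONDITION and the two functions below are
hypothetical stand-ins for illustration, not Mesa code.

```c
/* The condition an assert-based unreachable(str) would test.  Zero means
 * the assert fires; non-zero means it passes silently. */
#define UNREACHABLE_ASSERT_CONDITION(str) (!str)

/* Correct usage: a string literal is a non-null pointer, so !"..." is 0
 * and the assert would fire. */
static int
condition_with_plain_string(void)
{
   return UNREACHABLE_ASSERT_CONDITION("Unsupported type for loop iterator.");
}

/* With the leftover "!", the macro expands to !!"...", which is 1, so the
 * assert would never fire. */
static int
condition_with_negated_string(void)
{
   return UNREACHABLE_ASSERT_CONDITION(!"Unsupported type for loop iterator.");
}
```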



[Mesa-dev] [Bug 89586] Drivers/DRI/swrast

2015-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=89586

Dan Sebald  changed:

   What|Removed |Added

 Attachment #114599|0   |1
is obsolete||

--- Comment #45 from Dan Sebald  ---
Created attachment 114715
  --> https://bugs.freedesktop.org/attachment.cgi?id=114715&action=edit
swrast Gallium with patch Piglit test images

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 89586] Drivers/DRI/swrast

2015-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=89586

Dan Sebald  changed:

   What|Removed |Added

 Attachment #114598|0   |1
is obsolete||

--- Comment #44 from Dan Sebald  ---
Created attachment 114714
  --> https://bugs.freedesktop.org/attachment.cgi?id=114714&action=edit
swrast Gallium Piglit test images



[Mesa-dev] [Bug 89586] Drivers/DRI/swrast

2015-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=89586

Dan Sebald  changed:

   What|Removed |Added

 Attachment #114597|0   |1
is obsolete||

--- Comment #43 from Dan Sebald  ---
Created attachment 114713
  --> https://bugs.freedesktop.org/attachment.cgi?id=114713&action=edit
swrast legacy with patch Piglit test images



[Mesa-dev] [Bug 89586] Drivers/DRI/swrast

2015-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=89586

Dan Sebald  changed:

   What|Removed |Added

 Attachment #114596|0   |1
is obsolete||

--- Comment #42 from Dan Sebald  ---
Created attachment 114712
  --> https://bugs.freedesktop.org/attachment.cgi?id=114712&action=edit
swrast legacy Piglit test images



[Mesa-dev] [Bug 89586] Drivers/DRI/swrast

2015-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=89586

Dan Sebald  changed:

   What|Removed |Added

 Attachment #114619|0   |1
is obsolete||

--- Comment #41 from Dan Sebald  ---
Created attachment 114711
  --> https://bugs.freedesktop.org/attachment.cgi?id=114711&action=edit
Piglit pixelzoom test suite

An update to test results after making some modifications to SWRAST legacy
changeset and the Piglit gl-1.0-pixelzoom tests:

(1) swrast-legacy: The repository legacy swrast_dri.so driver, which my
system and others are using and which exhibits vertical lines.

(2) swrast-legacy-patch: The legacy swrast_dri.so patched with the changeset I
originally attached to this bug report.

(3) swrast-gallium: The repository Gallium/llvm swrast_dri.so driver.

(4) swrast-gallium-patch: The Gallium swrast_dri.so patched by removing the
line of code that limits the size of the image.

TEST                       (1)   (2)   (3)   (4)

positive monotonic x       fail  pass  fail  fail
positive edge x            fail  pass  fail  fail
positive over/underrun x   fail  pass  pass  pass
negative monotonic x       pass  pass  fail  pass
negative edge x            fail  pass  fail  fail
negative over/underrun x   fail  pass  pass  pass
positive monotonic y       fail  pass  fail  fail
positive edge y            fail  pass  fail  fail
positive over/underrun y   fail  pass  pass  pass
negative monotonic y       pass  pass  fail  pass
negative edge y            pass  pass  fail  fail
negative over/underrun y   fail  pass  pass  pass



[Mesa-dev] [Bug 89586] Drivers/DRI/swrast

2015-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=89586

Dan Sebald  changed:

   What|Removed |Added

 Attachment #114342|0   |1
is obsolete||

--- Comment #40 from Dan Sebald  ---
Created attachment 114710
  --> https://bugs.freedesktop.org/attachment.cgi?id=114710&action=edit
Changeset to fix vertical lines and fine tune positive_unzoom_x() and
negative_unzoom_x()

Attached is an update to the SWRAST legacy changeset.  With this change, the
driver passes all tests in Piglit gl-1.0-pixelzoom check.

The main addition to the changeset over the last changeset is the inclusion of
a tolerance for the ceil() and floor() functions.  The issue is that with
single precision float division and multiplication the formula

(xz - xr) / xfactor

can be off by a fair amount, on the order of 10e-5.  I printed out some numbers
the driver was using in cases where the gl-1.0-pixelzoom alternating-line test
was failing.  The numbers agree exactly with this example result:

octave:3> single(53)/single(400) * single(400)
ans =  52.961853027

I put in a tolerance of 0.4 on the rounding functions.  By my very rough
estimate, I think it is possible to scale input images with dimension of about
100,000 down to typical screen sizes without the added tolerance causing its
own sort of artifact.



Re: [Mesa-dev] Good compiler literature?

2015-03-29 Thread Connor Abbott
On Sun, Mar 29, 2015 at 9:51 PM, Connor Abbott  wrote:
> On Sun, Mar 29, 2015 at 5:54 PM, Thomas Helland
>  wrote:
>> Does anyone have suggestions for good literature on compilers?
>> Since GPU's and CPU's are a bit different there are probably
>> books that are better suited than others for GPUs?
>> I have what is probably Norway's biggest library on the subject to rent
>> books from, so I guess I should be able to find most suggestions there.
>>
>> Regards
>> Thomas
>>
>>
>>
>
> Hi,
>
> Unfortunately there seems to be a bit of a dearth of books when it
> comes to compiler optimizations and backend things, especially when it
> comes to SSA -- which is the most interesting part! I would skip the
> dragon book unless you're interested in the front-end things (parsing,
> symbol tables, etc.); I haven't read a lot of it myself, but
> apparently the latest edition only has a passing mention of SSA and
> others don't mention it at all. GCC has a list of books:
> https://gcc.gnu.org/wiki/ListOfCompilerBooks and apparently some of
> them are better in that regard. Personally, I learned about a lot of
> this stuff through papers. I have a list of them here:
>
> http://cwabbottdev.blogspot.com/2013/06/compiler-theory-links.html
>
> although it may be a little out of date, and some of the things might
> have been more useful back when I was working on lima. Suggestions for
> adding to the list are very welcome.

Ok, since that page was a little lacking I added a few more links.
Also, as to GPU-specific things... well, there really isn't much of
anything I'm aware of that's out there. There's one paper called
"Divergence Analysis and Optimizations" that's specific to SIMD
machines that run multiple threads of execution at once like GPU's, and
we'd like to implement that in the future to give us more precise
information about how we can move derivatives and textures that take
an implicit derivative, since they can't be moved out of uniform
control flow, but other than that there isn't anything I'm aware of.
Most of the other things carry over to both CPU's and GPU's. One other
GPU-specific problem I'm aware of is how to go out of SSA on classic
vector-based architectures like i965 vec4 VS without introducing extra
copies due to writemasked operations, but afaik there isn't a
publicly available description of someone's solution -- the paper
has yet to be written :).


Re: [Mesa-dev] [PATCH] nir: acknowledge the existence of nir_builder.h

2015-03-29 Thread Eric Anholt
Emil Velikov  writes:

> The header was added with commit 2a135c470e3(nir: Add an ALU op builder
> kind of like ir_builder.h) but did not make it into the sources list,
> and its dependency of nir_builder_opcodes.h was missing.
>
> Fortunately it remained unused until resent commit faf6106c6f6(nir:

"recent"

> Implement a Mesa IR -> NIR translator.)
>
> Cc: Kenneth Graunke 
> Cc: Eric Anholt 
> Signed-off-by: Emil Velikov 
> ---
>
> Not sure how the out-of-tree build was able to finish without this, 
> although the commit looks like a must have if we want the file in the 
> tarball.
>
> Based on top of the earlier Android series.
>
> -Emil
>
> ---
>  src/glsl/Android.gen.mk   | 2 ++
>  src/glsl/Makefile.am  | 2 ++
>  src/glsl/Makefile.sources | 1 +
>  3 files changed, 5 insertions(+)
>
> diff --git a/src/glsl/Android.gen.mk b/src/glsl/Android.gen.mk
> index 82f2bf1..2f54da4 100644
> --- a/src/glsl/Android.gen.mk
> +++ b/src/glsl/Android.gen.mk
> @@ -97,6 +97,8 @@ $(intermediates)/nir/nir_builder_opcodes.h: $(nir_builder_opcodes_deps)
>   @mkdir -p $(dir $@)
>   @$(MESA_PYTHON2) $(nir_builder_opcodes_gen) $< > $@
>  
> +$(LOCAL_PATH)/nir/nir_builder.h: $(intermediates)/nir/nir_builder_opcodes.h
> +
>  nir_constant_expressions_gen := $(LOCAL_PATH)/nir/nir_constant_expressions.py
>  nir_constant_expressions_deps := \
>   $(LOCAL_PATH)/nir/nir_opcodes.py \
> diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
> index ed90366..58af166 100644
> --- a/src/glsl/Makefile.am
> +++ b/src/glsl/Makefile.am
> @@ -244,6 +244,8 @@ nir/nir_builder_opcodes.h: nir/nir_opcodes.py nir/nir_builder_opcodes_h.py
>   $(MKDIR_P) nir; \
>   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_builder_opcodes_h.py > $@
>  
> +nir/nir_builder.h: nir/nir_builder_opcodes.h
> +
>  nir/nir_constant_expressions.c: nir/nir_opcodes.py nir/nir_constant_expressions.py nir/nir_constant_expressions.h
>   $(MKDIR_P) nir; \
>   $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_constant_expressions.py > $@

This is weird -- nir_builder.h isn't a build target that needs to be
regenerated.  What's it for?

> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> index 8d29c55..c3b63d1 100644
> --- a/src/glsl/Makefile.sources
> +++ b/src/glsl/Makefile.sources
> @@ -22,6 +22,7 @@ NIR_FILES = \
>   nir/glsl_to_nir.h \
>   nir/nir.c \
>   nir/nir.h \
> + nir/nir_builder.h \
>   nir/nir_constant_expressions.h \
>   nir/nir_dominance.c \
>   nir/nir_from_ssa.c \
> -- 
> 2.3.1

This hunk is certainly needed.




Re: [Mesa-dev] [PATCH 1/4] i965: Split out brw_<stage>_populate_key into their own functions

2015-03-29 Thread Kenneth Graunke
On Friday, March 20, 2015 05:49:06 PM Carl Worth wrote:
> This commit splits portions of the existing brw_upload_vs_prog and
> brw_upload_gs_prog function into new brw_vs_populate_key and
> brw_gs_populate_key functions. This follows the same style as is
> already present for all other stages, (see brw_wm_populate_key, etc.).
> 
> This commit is intended to have no functional change. It exists in
> preparation for some upcoming code movement in preparation for the
> shader cache.
> ---
>  src/mesa/drivers/dri/i965/brw_ff_gs.c |  7 +++--
>  src/mesa/drivers/dri/i965/brw_gs.c| 39 ++-
>  src/mesa/drivers/dri/i965/brw_vs.c| 58 +--
>  3 files changed, 64 insertions(+), 40 deletions(-)

Patches 1-3 are:
Reviewed-by: Kenneth Graunke 

(assuming you like my suggestion for the rename in patch 3)




Re: [Mesa-dev] Good compiler literature?

2015-03-29 Thread Connor Abbott
On Sun, Mar 29, 2015 at 5:54 PM, Thomas Helland
 wrote:
> Does anyone have suggestions for good literature on compilers?
> Since GPU's and CPU's are a bit different there are probably
> books that are better suited than others for GPUs?
> I have what is probably Norway's biggest library on the subject to rent
> books from, so I guess I should be able to find most suggestions there.
>
> Regards
> Thomas
>
>
>

Hi,

Unfortunately there seems to be a bit of a dearth of books when it
comes to compiler optimizations and backend things, especially when it
comes to SSA -- which is the most interesting part! I would skip the
dragon book unless you're interested in the front-end things (parsing,
symbol tables, etc.); I haven't read a lot of it myself, but
apparently the latest edition only has a passing mention of SSA and
others don't mention it at all. GCC has a list of books:
https://gcc.gnu.org/wiki/ListOfCompilerBooks and apparently some of
them are better in that regard. Personally, I learned about a lot of
this stuff through papers. I have a list of them here:

http://cwabbottdev.blogspot.com/2013/06/compiler-theory-links.html

although it may be a little out of date, and some of the things might
have been more useful back when I was working on lima. Suggestions for
adding to the list are very welcome.


Re: [Mesa-dev] [PATCH 3/4] i965: Rename do_<stage>_prog to brw_<stage>_compile

2015-03-29 Thread Kenneth Graunke
On Friday, March 20, 2015 11:28:28 PM Carl Worth wrote:
> On Fri, Mar 20 2015, Chris Forbes wrote:
> > I think that having both the existing `struct brw_vs_compile` and a
> > function with the same name is going to cause confusion. (same with
> > the other non-fs stages)
> 
> In an earlier version of the patch I had brw_vs_do_compile, (there is a
> "do" precedent in the code being replaced here). I could go back to that
> if it helps.
> 
> -Carl

How about brw_compile_vs_prog?  It sounds natural and doesn't appear to
conflict with anything.




Re: [Mesa-dev] [PATCH v3 0/9] Support multiple state pipelines for i965

2015-03-29 Thread Kenneth Graunke
On Friday, March 20, 2015 05:28:55 PM Jordan Justen wrote:
> git://people.freedesktop.org/~jljusten/mesa i965-pipelines-v3
> 
> v2:
>  * Rename brw->atoms[] to render_atoms
>  * Add brw->compute_atoms[]
>  * Replace brw_pipeline_first_atom with brw_get_pipeline_atoms
> 
> v3:
>  * Avoid changing pipelines' state bits in upload path
>  * brw_clear_dirty_bits => brw_render_state_finished
>  * brw->compute_atoms[] starts with size of 1
>  * Deprecate and remove brw->state.dirty
> 
> Jordan Justen (9):
>   i965/state: Rename brw_upload_state to brw_upload_render_state
>   i965/state: Rename brw_clear_dirty_bits to brw_render_state_finished
>   i965/state: Support multiple pipelines in brw->num_atoms
>   i965/state: Create separate dirty state bits for each pipeline
>   i965/state: Only upload render programs for render state uploads
>   i965/state: Add compute pipeline with empty atom lists
>   i965/state: Don't use brw->state.dirty.brw
>   i965/state: Don't use brw->state.dirty.mesa
>   i965/state: Remove brw->state.dirty

Series is:
Reviewed-by: Kenneth Graunke 

Thanks, Jordan!





Re: [Mesa-dev] [PATCH] glsl: Update the #line behaviour on GLSL 3.30+ and GLSL ES+

2015-03-29 Thread Kenneth Graunke
On Monday, March 23, 2015 09:56:29 AM Antia Puentes wrote:
> From GLSL 3.30 and GLSL ES 1.00 on, after processing the line
> directive (including its new-line), the implementation should
> behave as if it is compiling at the line number passed as
> argument. In previous versions, it behaved as if compiling
> at the passed line number + 1.
> 
> Partially fixes https://bugs.freedesktop.org/show_bug.cgi?id=88815
> ---
>  src/glsl/glsl_lexer.ll | 17 +
>  1 file changed, 17 insertions(+)
> 
> diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll
> index f0e047e..2785ed1 100644
> --- a/src/glsl/glsl_lexer.ll
> +++ b/src/glsl/glsl_lexer.ll
> @@ -187,6 +187,15 @@ HASH ^{SPC}#{SPC}
>   * one-based.
>   */
>  yylineno = strtol(ptr, &ptr, 0) - 1;
> +
> +   /* From GLSL 3.30 and GLSL ES on, after processing the
> +* line directive (including its new-line), the implementation
> +* will behave as if it is compiling at the line number passed
> +* as argument. It was line number + 1 in older specifications.
> +*/
> +   if (yyextra->is_version(330, 100))
> +  yylineno--;
> +
>  yylloc->source = strtol(ptr, NULL, 0);
>   }
>  {HASH}line{SPCP}{INT}{SPC}$  {
> @@ -202,6 +211,14 @@ HASH ^{SPC}#{SPC}
>   * one-based.
>   */
>  yylineno = strtol(ptr, &ptr, 0) - 1;
> +
> +   /* From GLSL 3.30 and GLSL ES on, after processing the
> +* line directive (including its new-line), the implementation
> +* will behave as if it is compiling at the line number passed
> +* as argument. It was line number + 1 in older specifications.
> +*/
> +   if (yyextra->is_version(330, 100))
> +  yylineno--;
>   }
>  ^{SPC}#{SPC}pragma{SPCP}debug{SPC}\({SPC}on{SPC}\) {
> BEGIN PP;
> 

Thanks for taking the time to make our error messages better :)

Reviewed-by: Kenneth Graunke 
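
For reference, the off-by-one being fixed can be sketched in plain C. This is
only a model of the arithmetic, not the lexer code: the helper name and the
new_semantics flag are made up here (the flag stands in for the
is_version(330, 100) check in the patch), and it assumes the lexer bumps
yylineno once for the directive's own new-line.

```c
#include <stdbool.h>

/* Hypothetical model: returns the one-based line number reported for the
 * first source line that follows "#line n".
 */
static int
reported_line_after_directive(int n, bool new_semantics)
{
   int yylineno = n - 1;   /* zero-based counter, as set by the existing code */

   if (new_semantics)
      yylineno--;          /* the extra decrement added by the patch */

   yylineno++;             /* assumed: the directive's own new-line is counted */
   return yylineno + 1;    /* locations are reported one-based */
}
```

With n = 10 this gives 10 under the GLSL 3.30 / GLSL ES rule and 11 under the
older rule, which matches the wording quoted from the commit message.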




Re: [Mesa-dev] [PATCH 0/3] Hash-table improvements, V3

2015-03-29 Thread Connor Abbott
On Sun, Mar 29, 2015 at 2:05 PM, Thomas Helland
 wrote:
> Here's the latest round of fixup on the hash-table patches.
> I think I've gotten all the review feedback incorporated now.
> These patches give a nice little boost, indicated in each commit.
> As a side effect of upping the minimum size of the table and set
> there is now also less spamming of rzalloc and friends.
> _int_malloc is cut from 935'000 to 847'000 samples.
> calloc is cut from 683'000 to 655'000 samples.
> _int_free is cut from 644'000 to 617'000 samples
> The series reduced shader-db run-time with NIR on my collection
> from 180 seconds to about 160 seconds.
>
> Thomas Helland (3):
>   util: Change hash_table to use quadratic probing
>   util: Change util/set to use quadratic probing
>   util: Use 32 bit integer hash function for pointers
>
>  src/util/hash_table.c | 132 ++
>  src/util/hash_table.h |   3 +-
>  src/util/set.c| 124 ++-
>  src/util/set.h|   3 +-
>  4 files changed, 109 insertions(+), 153 deletions(-)
>
> --
> 2.3.4
>

I don't see any performance data on each commit; did you leave it out
by accident? Other than that, the series is

Reviewed-by: Connor Abbott 

but you'll want to get Eric to review it too.


Re: [Mesa-dev] [PATCH] glsl: respect the source number set by #line

2015-03-29 Thread Kenneth Graunke
On Monday, March 23, 2015 09:56:52 AM Antia Puentes wrote:
> From GLSL 1.30.10, section 3.3 (Preprocessor):
> "#line line source-string-number ... After processing this directive
> (including its new-line), the implementation will behave as if it is
> compiling at ... source string number source-string-number. Subsequent
> source strings will be numbered sequentially, until another #line
> directive overrides that numbering."
> 
> In the previous implementation the source number was always zero.
> Subsequent source strings are still not numbered sequentially, because
> in the glShaderSource implementation we are concatenating the source code
> strings into one long string.
> 
> Partially fixes https://bugs.freedesktop.org/show_bug.cgi?id=88815
> ---
>  src/glsl/glsl_lexer.ll | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll
> index 8dc3d10..f0e047e 100644
> --- a/src/glsl/glsl_lexer.ll
> +++ b/src/glsl/glsl_lexer.ll
> @@ -36,14 +36,13 @@ static int classify_identifier(struct _mesa_glsl_parse_state *, const char *);
>  
>  #define YY_USER_ACTION   \
> do {  \
> -  yylloc->source = 0;\
>yylloc->first_column = yycolumn + 1;   \
>yylloc->first_line = yylloc->last_line = yylineno + 1; \
>yycolumn += yyleng;\
>yylloc->last_column = yycolumn + 1;\
> } while(0);
>  
> -#define YY_USER_INIT yylineno = 0; yycolumn = 0;
> +#define YY_USER_INIT yylineno = 0; yycolumn = 0; yylloc->source = 0;
>  
>  /* A macro for handling reserved words and keywords across language versions.
>   *
> 

Looks good to me!  We could probably concatenate the strings together
but put "#line 0 i" between each source string's content, if we wanted
to fix the bug completely.  Seems simple enough.

Reviewed-by: Kenneth Graunke 
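
A rough sketch of the concatenation idea above, in case it helps. This is not
the glShaderSource code; the function name is hypothetical, and it only writes
a directive in front of the second and later strings so that nothing ends up
ahead of a #version line in the first one.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: joins `count` source strings and writes
 * "#line 0 <i>" in front of every string after the first.
 * Returns a malloc'ed buffer the caller frees, or NULL on failure.
 */
static char *
concat_sources_with_line(int count, const char *const *strings)
{
   size_t total = 1; /* terminating NUL */
   size_t len = 0;
   char *out;
   int i;

   for (i = 0; i < count; i++)
      total += strlen(strings[i]) + 32; /* room for "\n#line 0 <i>\n" */

   out = malloc(total);
   if (out == NULL)
      return NULL;
   out[0] = '\0';

   for (i = 0; i < count; i++) {
      size_t n = strlen(strings[i]);

      if (i > 0) {
         /* A directive has to start on its own line. */
         if (len > 0 && out[len - 1] != '\n')
            out[len++] = '\n';
         len += (size_t) snprintf(out + len, total - len, "#line 0 %d\n", i);
      }
      memcpy(out + len, strings[i], n + 1); /* copies the NUL too */
      len += n;
   }
   return out;
}
```

Whether line 0 is the right number to restart at depends on the #line
semantics of the GLSL version in use, so treat the literal 0 as a placeholder.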




[Mesa-dev] [Bug 89819] WebGL Conformance swrast failure in conformance/uniforms/uniform-default-values.html

2015-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=89819

Bug ID: 89819
   Summary: WebGL Conformance swrast failure in
conformance/uniforms/uniform-default-values.html
   Product: Mesa
   Version: unspecified
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: lukebe...@hotmail.com
QA Contact: mesa-dev@lists.freedesktop.org

The swrast renderer requires DRAW_USE_LLVM=false to pass the OGLES 2.0 Uniform
Default Values test.

Steps to reproduce:
1. set LIBGL_ALWAYS_SOFTWARE=1
2. visit
https://www.khronos.org/registry/webgl/sdk/tests/conformance/uniforms/uniform-default-values.html?webglVersion=1
3. set DRAW_USE_LLVM=false 
4.
https://www.khronos.org/registry/webgl/sdk/tests/conformance/uniforms/uniform-default-values.html?webglVersion=1


llvmpipe results:
testing: samplerCube
fragment shader
FAIL Error in program linking:null
FAIL uniform is not zero
default value should be zero
FAIL at (0, 0) expected: 0,255,0,255 was 0,0,0,0
test test by setting value
FAIL at (0, 0) expected: 255,0,0,255 was 0,0,0,0
re-linking should reset to defaults
FAIL at (0, 0) expected: 0,255,0,255 was 0,0,0,0
FAIL getError expected: NO_ERROR. Was CONTEXT_LOST_WEBGL : should be no GL
errors
...

See Bug 78875  for the i965



Re: [Mesa-dev] [PATCH 05/11] i965/inst: Add notify and gateway_subfuncid fields

2015-03-29 Thread Kenneth Graunke
On Wednesday, March 25, 2015 05:53:43 PM Ben Widawsky wrote:
> On Sun, Mar 22, 2015 at 06:49:15PM -0700, Jordan Justen wrote:
> > These fields will be used when emitting a send for the barrier function.
> > 
> > Reference: IVB PRM Volume 4, Part 2, Section 1.1.1 Message Descriptor
> > 
> > Signed-off-by: Jordan Justen 
> > Reviewed-by: Chris Forbes 
> > ---
> >  src/mesa/drivers/dri/i965/brw_inst.h | 18 +++---
> >  1 file changed, 15 insertions(+), 3 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_inst.h b/src/mesa/drivers/dri/i965/brw_inst.h
> > index 372aa2b..8701771 100644
> > --- a/src/mesa/drivers/dri/i965/brw_inst.h
> > +++ b/src/mesa/drivers/dri/i965/brw_inst.h
> > @@ -322,6 +322,9 @@ FJ(gen4_jump_count, 111,  96, brw->gen < 6)
> >  FC(gen4_pop_count,  115, 112, brw->gen < 6)
> >  /** @} */
> >  
> > +/* Message descriptor bits */
> > +#define MD(x) (x + 96)
> > +
> >  /**
> >   * Fields for SEND messages:
> >   *  @{
> > @@ -347,6 +350,12 @@ FF(header_present,
> > /* 6:   */ 115, 115,
> > /* 7:   */ 115, 115,
> > /* 8:   */ 115, 115)
> > +FF(notify,
> > +   /* 4: doesn't exist */ -1, -1, -1, -1,
> > +   /* 5: doesn't exist */ -1, -1,
> > +   /* 6: doesn't exist */ -1, -1,
> > +   /* 7:   */ MD(16), MD(15),
> > +   /* 8:   */ MD(16), MD(15))
> 
> I'm pretty sure notify has existed for much longer than Gen7. I understand that
> you don't implement it, but "doesn't exist" is at least a little confusing.
> (Also, if it does exist all the way back, you could potentially just use F())

The "Notify" bit in the "Message Gateway" message descriptor has existed
since the original 965 - it is 16:15 on all generations.

So I agree with Ben, this should be F(notify, MD(16), MD(15)).

Since this only applies to Message Gateway messages, it might make sense
to call it gateway_notify or some such...I've tried to prefix the other
descriptor bits with "math_", "sampler_", "urb_", and so on.

> If you end up modifying stuff, should you throw in AckReq?
> 
> >  FF(function_control,
> > /* 4:   */ 111,  96,
> > /* 4.5: */ 111,  96,
> > @@ -354,6 +363,12 @@ FF(function_control,
> > /* 6:   */ 114,  96,
> > /* 7:   */ 114,  96,
> > /* 8:   */ 114,  96)
> > +FF(gateway_subfuncid,
> > +   /* 4: doesn't exist */  -1, -1, -1, -1,
> > +   /* 5: doesn't exist */  -1, -1,
> > +   /* 6: doesn't exist */  -1, -1,
> > +   /* 7:   */  MD(2),  MD(0),
> > +   /* 8:   */  MD(2),  MD(0))

Likewise, these exist on older platforms too...

FF(gateway_subfuncid,
   /* 4:   */  MD(1),  MD(0),
   /* 4.5: */  MD(1),  MD(0),
   /* 5:   */  MD(1),  MD(0), /* 2:0, but bit 2 is reserved MBZ */
   /* 6:   */  MD(2),  MD(0),
   /* 7:   */  MD(2),  MD(0),
   /* 8:   */  MD(2),  MD(0))

With those changes, this would get a:
Reviewed-by: Kenneth Graunke 

> >  FF(sfid,
> > /* 4:   */ 123, 120, /* called msg_target */
> > /* 4.5  */ 123, 120,
> > @@ -364,9 +379,6 @@ FF(sfid,
> >  FC(base_mrf,   27,  24, brw->gen < 6);
> >  /** @} */
> >  
> > -/* Message descriptor bits */
> > -#define MD(x) (x + 96)
> > -
> >  /**
> >   * URB message function control bits:
> >   *  @{
> 
> I am not a huge fan of MD(x) but I suppose you didn't create that yourself. I'd
> be in favor of killing it at some point.
> 
> Patches up through this one are:
> Reviewed-by: Ben Widawsky 
> 
> (I think 1 & 2 make more sense as a single patch, but meh)
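
For anyone not familiar with MD(x): it just offsets a message descriptor bit
into the 128-bit instruction (descriptor bit 0 is instruction bit 96). A
minimal sketch of how the 16:15 notify field and the 2:0 sub-function field
map onto the instruction words; the struct and the two helpers are made up for
illustration and are not the brw_inst.h accessors.

```c
#include <stdint.h>

/* Message descriptor bits, as in the patch. */
#define MD(x) ((x) + 96)

/* Hypothetical stand-in for a 128-bit instruction. */
struct inst128 {
   uint64_t data[2];
};

/* Read bits high:low. Assumes the field does not straddle the two words. */
static uint64_t
inst_bits(const struct inst128 *inst, unsigned high, unsigned low)
{
   const unsigned word = high / 64;
   const unsigned width = high - low + 1;
   const uint64_t mask = width == 64 ? ~0ull : (1ull << width) - 1;

   return (inst->data[word] >> (low % 64)) & mask;
}

/* Write bits high:low, leaving every other bit untouched. */
static void
inst_set_bits(struct inst128 *inst, unsigned high, unsigned low,
              uint64_t value)
{
   const unsigned word = high / 64;
   const unsigned shift = low % 64;
   const unsigned width = high - low + 1;
   const uint64_t mask = (width == 64 ? ~0ull : (1ull << width) - 1) << shift;

   inst->data[word] = (inst->data[word] & ~mask) | ((value << shift) & mask);
}
```

So notify is inst_bits(inst, MD(16), MD(15)), i.e. instruction bits 112:111,
and the gateway sub-function ID is inst_bits(inst, MD(2), MD(0)), i.e. 98:96.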




[Mesa-dev] [Bug 89818] WebGL Conformance conformance/textures/texture-size-limit.html -> OUT_OF_MEMORY

2015-03-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=89818

Bug ID: 89818
   Summary: WebGL Conformance
conformance/textures/texture-size-limit.html ->
OUT_OF_MEMORY
   Product: Mesa
   Version: unspecified
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: lukebe...@hotmail.com
QA Contact: mesa-dev@lists.freedesktop.org

With the mesa llvmpipe or r300 in either Firefox or Chrome, navigate to:

https://www.khronos.org/registry/webgl/sdk/tests/webgl-conformance-tests.html

It fails with the mesa drivers, but passes with the nvidia proprietary driver
on my GTX650. The intel driver was fixed in Bug 78770.

Sample failure output:

failed: getError expected: NO_ERROR. Was OUT_OF_MEMORY : there should be no
error for level: 12 1x1
failed: getError expected: NO_ERROR. Was OUT_OF_MEMORY : there should be no
error for level: 11 2x2
failed: getError expected: NO_ERROR. Was OUT_OF_MEMORY : there should be no
error for level: 10 4x4
failed: getError expected: NO_ERROR. Was OUT_OF_MEMORY : there should be no
error for level: 9 8x8
failed: getError expected: NO_ERROR. Was OUT_OF_MEMORY : there should be no
error for level: 8 16x16



[Mesa-dev] [PATCH 1/3] gallivm: add gather support to sampler interface

2015-03-29 Thread sroland
From: Roland Scheidegger 

Luckily thanks to the revamped interface this is a lot less work now...
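
To illustrate the key layout after this change: the single FETCH bit becomes a
2-bit op type at bits 3:2, and the LOD control field moves up to bits 5:4. The
defines and the enum below are copied from the patch; make_sample_key() and
key_op_type() are hypothetical helpers written only for this sketch.

```c
enum lp_sampler_op_type {
   LP_SAMPLER_OP_TEXTURE,
   LP_SAMPLER_OP_FETCH,
   LP_SAMPLER_OP_GATHER
};

#define LP_SAMPLER_SHADOW               (1 << 0)
#define LP_SAMPLER_OFFSETS              (1 << 1)
#define LP_SAMPLER_OP_TYPE_SHIFT        2
#define LP_SAMPLER_OP_TYPE_MASK         (3 << 2)
#define LP_SAMPLER_LOD_CONTROL_SHIFT    4
#define LP_SAMPLER_LOD_CONTROL_MASK     (3 << 4)

/* Hypothetical: pack an op type, a LOD control value and the shadow flag. */
static unsigned
make_sample_key(enum lp_sampler_op_type op, unsigned lod_control, int shadow)
{
   unsigned key = 0;

   if (shadow)
      key |= LP_SAMPLER_SHADOW;
   key |= (unsigned) op << LP_SAMPLER_OP_TYPE_SHIFT;
   key |= lod_control << LP_SAMPLER_LOD_CONTROL_SHIFT;
   return key;
}

/* Hypothetical: the same extraction lp_build_sample_soa_code() does inline. */
static enum lp_sampler_op_type
key_op_type(unsigned key)
{
   return (enum lp_sampler_op_type)
      ((key & LP_SAMPLER_OP_TYPE_MASK) >> LP_SAMPLER_OP_TYPE_SHIFT);
}
```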
---
 src/gallium/auxiliary/gallivm/lp_bld_sample.h | 18 +
 src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 31 +--
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c   |  6 ++---
 3 files changed, 34 insertions(+), 21 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.h b/src/gallium/auxiliary/gallivm/lp_bld_sample.h
index b95ee6f..640b7e0 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.h
@@ -76,13 +76,21 @@ enum lp_sampler_lod_control {
 };
 
 
+enum lp_sampler_op_type {
+   LP_SAMPLER_OP_TEXTURE,
+   LP_SAMPLER_OP_FETCH,
+   LP_SAMPLER_OP_GATHER
+};
+
+
 #define LP_SAMPLER_SHADOW               (1 << 0)
 #define LP_SAMPLER_OFFSETS              (1 << 1)
-#define LP_SAMPLER_FETCH                (1 << 2)
-#define LP_SAMPLER_LOD_CONTROL_SHIFT    3
-#define LP_SAMPLER_LOD_CONTROL_MASK     (3 << 3)
-#define LP_SAMPLER_LOD_PROPERTY_SHIFT   5
-#define LP_SAMPLER_LOD_PROPERTY_MASK    (3 << 5)
+#define LP_SAMPLER_OP_TYPE_SHIFT        2
+#define LP_SAMPLER_OP_TYPE_MASK         (3 << 2)
+#define LP_SAMPLER_LOD_CONTROL_SHIFT    4
+#define LP_SAMPLER_LOD_CONTROL_MASK     (3 << 4)
+#define LP_SAMPLER_LOD_PROPERTY_SHIFT   6
+#define LP_SAMPLER_LOD_PROPERTY_MASK    (3 << 6)
 
 struct lp_sampler_params
 {
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
index 82ef359..962f478 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
@@ -2391,9 +2391,10 @@ lp_build_sample_soa_code(struct gallivm_state *gallivm,
LLVMValueRef tex_width, newcoords[5];
enum lp_sampler_lod_property lod_property;
enum lp_sampler_lod_control lod_control;
+   enum lp_sampler_op_type op_type;
LLVMValueRef lod_bias = NULL;
LLVMValueRef explicit_lod = NULL;
-   boolean is_fetch = !!(sample_key & LP_SAMPLER_FETCH);
+   boolean op_is_tex;
 
if (0) {
   enum pipe_format fmt = static_texture_state->format;
@@ -2404,6 +2405,10 @@ lp_build_sample_soa_code(struct gallivm_state *gallivm,
  LP_SAMPLER_LOD_PROPERTY_SHIFT;
lod_control = (sample_key & LP_SAMPLER_LOD_CONTROL_MASK) >>
 LP_SAMPLER_LOD_CONTROL_SHIFT;
+   op_type = (sample_key & LP_SAMPLER_OP_TYPE_MASK) >>
+ LP_SAMPLER_OP_TYPE_SHIFT;
+
+   op_is_tex = op_type == LP_SAMPLER_OP_TEXTURE;
 
if (lod_control == LP_SAMPLER_LOD_BIAS) {
   lod_bias = lod;
@@ -2534,7 +2539,7 @@ lp_build_sample_soa_code(struct gallivm_state *gallivm,
(gallivm_debug & GALLIVM_DEBUG_NO_RHO_APPROX) &&
(static_texture_state->target == PIPE_TEXTURE_CUBE ||
 static_texture_state->target == PIPE_TEXTURE_CUBE_ARRAY) &&
-   (!is_fetch && mip_filter != PIPE_TEX_MIPFILTER_NONE)) {
+   (op_is_tex && mip_filter != PIPE_TEX_MIPFILTER_NONE)) {
   /*
* special case for using per-pixel lod even for implicit lod,
* which is generally never required (ok by APIs) except to please
@@ -2548,23 +2553,23 @@ lp_build_sample_soa_code(struct gallivm_state *gallivm,
}
else if (lod_property == LP_SAMPLER_LOD_PER_ELEMENT ||
(explicit_lod || lod_bias || derivs)) {
-  if ((is_fetch && target != PIPE_BUFFER) ||
-  (!is_fetch && mip_filter != PIPE_TEX_MIPFILTER_NONE)) {
+  if ((!op_is_tex && target != PIPE_BUFFER) ||
+  (op_is_tex && mip_filter != PIPE_TEX_MIPFILTER_NONE)) {
  bld.num_mips = type.length;
  bld.num_lods = type.length;
   }
-  else if (!is_fetch && min_img_filter != mag_img_filter) {
+  else if (op_is_tex && min_img_filter != mag_img_filter) {
  bld.num_mips = 1;
  bld.num_lods = type.length;
   }
}
/* TODO: for true scalar_lod should only use 1 lod value */
-   else if ((is_fetch && explicit_lod && target != PIPE_BUFFER) ||
-(!is_fetch && mip_filter != PIPE_TEX_MIPFILTER_NONE)) {
+   else if ((!op_is_tex && explicit_lod && target != PIPE_BUFFER) ||
+(op_is_tex && mip_filter != PIPE_TEX_MIPFILTER_NONE)) {
   bld.num_mips = num_quads;
   bld.num_lods = num_quads;
}
-   else if (!is_fetch && min_img_filter != mag_img_filter) {
+   else if (op_is_tex && min_img_filter != mag_img_filter) {
   bld.num_mips = 1;
   bld.num_lods = num_quads;
}
@@ -2658,7 +2663,7 @@ lp_build_sample_soa_code(struct gallivm_state *gallivm,
   texel_out);
}
 
-   else if (is_fetch) {
+   else if (op_type == LP_SAMPLER_OP_FETCH) {
   lp_build_fetch_texel(&bld, texture_index, newcoords,
lod, offsets,
texel_out);
@@ -2786,18 +2791,18 @@ lp_build_sample_soa_code(struct gallivm_state *gallivm,
  (gallivm_debug & GALLIVM_DEBUG_NO_RHO_APPROX) &&

[Mesa-dev] [PATCH 3/3] llvmpipe: enable ARB_texture_gather

2015-03-29 Thread sroland
From: Roland Scheidegger 

Just announce support for 4 components.
While here also increase the max/min texel offsets (the limit is completely
artificial and was chosen because that's what other hardware did, but there
are other drivers using larger limits).
Over a thousand little piglits skip->pass.
---
 src/gallium/drivers/llvmpipe/lp_screen.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
index 4b45725..f4ba596 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -180,10 +180,10 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
/* this is a lie could support arbitrary large offsets */
case PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET:
case PIPE_CAP_MIN_TEXEL_OFFSET:
-  return -8;
+  return -32;
case PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET:
case PIPE_CAP_MAX_TEXEL_OFFSET:
-  return 7;
+  return 31;
case PIPE_CAP_CONDITIONAL_RENDER:
   return 1;
case PIPE_CAP_TEXTURE_BARRIER:
@@ -249,6 +249,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param)
case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT:
   return 1;
case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
+  return 4;
case PIPE_CAP_TEXTURE_GATHER_SM5:
case PIPE_CAP_TEXTURE_QUERY_LOD:
case PIPE_CAP_SAMPLE_SHADING:
-- 
1.9.1



[Mesa-dev] [PATCH 2/3] gallivm: implement TG4 for ARB_texture_gather

2015-03-29 Thread sroland
From: Roland Scheidegger 

This is quite trivial: essentially just follow all the same code you'd
use with linear min/mag (and no mip) filter, then skip the filtering
after looking up the texels in favor of direct assignment of the right channel
to the result. (This is not true, though, for the multi-offset version, should
we want to support it; that would probably need something along the lines of
4x nearest sampling, since coord wrapping has to be done individually per
texel.)
Supports multi-channel formats.
Of the features behind the SM5 gather cap bit, this should support non-constant
offsets plus shadow comparisons (the former untested), but not component
selection (should be easy to implement, but all this stuff is not really
exposable anyway for now).
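
As a scalar illustration of "same texel lookup as linear, then assign instead
of filter", here is a minimal sketch. It is not the gallivm code: the struct
and function names are made up, clamp-to-edge wrapping is assumed, and the chan
parameter is only there to make the example general (the patch itself does not
do component selection). The output order follows the ARB_texture_gather
convention: (i0,j1), (i1,j1), (i1,j0), (i0,j0).

```c
/* Hypothetical RGBA float image, row-major, 4 floats per texel. */
struct image {
   int width, height;
   const float *texels;
};

static int
clampi(int v, int lo, int hi)
{
   return v < lo ? lo : (v > hi ? hi : v);
}

/* Floor without pulling in libm. */
static int
floor_to_int(float v)
{
   int i = (int) v;
   return v < (float) i ? i - 1 : i;
}

static float
fetch_chan(const struct image *img, int x, int y, int chan)
{
   x = clampi(x, 0, img->width - 1);   /* assumed clamp-to-edge wrap */
   y = clampi(y, 0, img->height - 1);
   return img->texels[(y * img->width + x) * 4 + chan];
}

/* Same 2x2 footprint a bilinear lookup would use, but no lerp at the end. */
static void
gather4(const struct image *img, float s, float t, int chan, float out[4])
{
   const int x0 = floor_to_int(s * (float) img->width - 0.5f);
   const int y0 = floor_to_int(t * (float) img->height - 0.5f);

   out[0] = fetch_chan(img, x0,     y0 + 1, chan);
   out[1] = fetch_chan(img, x0 + 1, y0 + 1, chan);
   out[2] = fetch_chan(img, x0 + 1, y0,     chan);
   out[3] = fetch_chan(img, x0,     y0,     chan);
}
```

The wrap is applied per texel in fetch_chan(), which is the part that gets
more expensive once each of the four texels can have its own offset.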
---
 src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 137 +-
 src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c   |  36 --
 2 files changed, 133 insertions(+), 40 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
index 962f478..ff508e2 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
@@ -840,6 +840,7 @@ lp_build_masklerp2d(struct lp_build_context *bld,
  */
 static void
 lp_build_sample_image_linear(struct lp_build_sample_context *bld,
+ boolean is_gather,
  LLVMValueRef size,
  LLVMValueRef linear_mask,
  LLVMValueRef row_stride_vec,
@@ -853,6 +854,7 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
LLVMBuilderRef builder = bld->gallivm->builder;
struct lp_build_context *ivec_bld = &bld->int_coord_bld;
struct lp_build_context *coord_bld = &bld->coord_bld;
+   struct lp_build_context *texel_bld = &bld->texel_bld;
const unsigned dims = bld->dims;
LLVMValueRef width_vec;
LLVMValueRef height_vec;
@@ -875,7 +877,16 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
    seamless_cube_filter = (bld->static_texture_state->target == PIPE_TEXTURE_CUBE ||
                            bld->static_texture_state->target == PIPE_TEXTURE_CUBE_ARRAY) &&
   bld->static_sampler_state->seamless_cube_map;
-   accurate_cube_corners = ACCURATE_CUBE_CORNERS && seamless_cube_filter;
+   /*
+* XXX I don't know how this is really supposed to work with gather. From GL
+* spec wording (not gather specific) it sounds like the 4th missing texel
+* should be an average of the other 3, hence for gather could return this.
+* This is however NOT how the code here works, which just fixes up the
+* weights used for filtering instead. And of course for gather there is
+* no filter to tweak...
+*/
+   accurate_cube_corners = ACCURATE_CUBE_CORNERS && seamless_cube_filter &&
+   !is_gather;
 
lp_build_extract_image_sizes(bld,
 &bld->int_size_bld,
@@ -1160,10 +1171,11 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
  data_ptr, mipoffsets, neighbors[0][1]);
 
if (dims == 1) {
+  assert(!is_gather);
   if (bld->static_sampler_state->compare_mode == PIPE_TEX_COMPARE_NONE) {
  /* Interpolate two samples from 1D image to produce one color */
  for (chan = 0; chan < 4; chan++) {
-colors_out[chan] = lp_build_lerp(&bld->texel_bld, s_fpart,
+colors_out[chan] = lp_build_lerp(texel_bld, s_fpart,
  neighbors[0][0][chan],
  neighbors[0][1][chan],
  0);
@@ -1174,7 +1186,7 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
          cmpval0 = lp_build_sample_comparefunc(bld, coords[4], neighbors[0][0][0]);
          cmpval1 = lp_build_sample_comparefunc(bld, coords[4], neighbors[0][1][0]);
  /* simplified lerp, AND mask with weight and add */
- colors_out[0] = lp_build_masklerp(&bld->texel_bld, s_fpart,
+ colors_out[0] = lp_build_masklerp(texel_bld, s_fpart,
cmpval0, cmpval1);
  colors_out[1] = colors_out[2] = colors_out[3] = colors_out[0];
   }
@@ -1301,15 +1313,38 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
   }
 
   if (bld->static_sampler_state->compare_mode == PIPE_TEX_COMPARE_NONE) {
-         /* Bilinear interpolate the four samples from the 2D image / 3D slice */
- for (chan = 0; chan < 4; chan++) {
-colors0[chan] = lp_build_lerp_2d(&bld->texel_bld,
- s_fpart, t_fpart,
- neighbors[0][0][chan],
- neighbors[0][1][chan],
-  

Re: [Mesa-dev] [PATCH 3/3] util: Use 32 bit integer hash function for pointers

2015-03-29 Thread Jordan Justen
On 2015-03-29 13:28:02, Thomas Helland wrote:
> (Forgot to send to list)
> 
> That is indeed an issue.
> I found the original article on the wayback machine and it
> doesn't state anything particular wrt license.
> However, it seems to be used in a LOT of projects.
> (javascript, chromium, hiphop-php, kde, +++)
> 
> I found this webpage that gives some more insight:
> http://burtleburtle.net/bob/hash/integer.html
> (it seems to be the website of Bob Jenkins)
> It states the following:
> 
> "The hashes on this page (with the possible exception
> of HashMap.java's) are all public domain.
> So are the ones on Thomas Wang's page.
> Thomas recommends citing the author
> and page when using them."

Is there a reference to Thomas directly indicating this?

On that same page, Bob provides some hash functions of his own, and
directly states that they are public domain. He would probably be in a
better position to speak for his code than Thomas's. :) He does
indicate that Thomas's version is faster than all of his.

-Jordan

> So it looks like it is public domain.
> I'll add a credit to Thomas, and link
> to Bob Jenkins' webpage.
> Does that sound like an acceptable solution?
> 
> 2015-03-29 20:51 GMT+02:00 Jordan Justen :
> > On 2015-03-29 11:05:40, Thomas Helland wrote:
> >> Since a pointer is basically just an int we can use integer hashing.
> >> This one is taken from https://gist.github.com/badboy/6267743
> >
> > There doesn't seem to be a license associated with this code, nor is
> > it indicated that it is public domain code.
> >
> > -Jordan
> >
> >> A google search seems to suggest this is a common and good algorithm.
> >> Since it operates 32 bits at a time it is really effective.
> >> assert that we are hashing 32bit aligned pointers.
> >>
> >> Signed-off-by: Thomas Helland 
> >> ---
> >>  src/util/hash_table.c | 24 ++--
> >>  1 file changed, 22 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/src/util/hash_table.c b/src/util/hash_table.c
> >> index 24184c0..54d04ef 100644
> >> --- a/src/util/hash_table.c
> >> +++ b/src/util/hash_table.c
> >> @@ -393,6 +393,15 @@ _mesa_hash_table_random_entry(struct hash_table *ht,
> >> return NULL;
> >>  }
> >>
> >> +static inline uint32_t
> >> +hash_32bit_int(uint32_t a) {
> >> +   a = (a ^ 61) ^ (a >> 16);
> >> +   a = a + (a << 3);
> >> +   a = a ^ (a >> 4);
> >> +   a = a * 0x27d4eb2d;
> >> +   a = a ^ (a >> 15);
> >> +   return a;
> >> +}
> >>
> >>  /**
> >>   * Quick FNV-1a hash implementation based on:
> >> @@ -406,8 +415,19 @@ _mesa_hash_table_random_entry(struct hash_table *ht,
> >>  uint32_t
> >>  _mesa_hash_data(const void *data, size_t size)
> >>  {
> >> -   return _mesa_fnv32_1a_accumulate_block(_mesa_fnv32_1a_offset_bias,
> >> -  data, size);
> >> +   uint32_t hash = _mesa_fnv32_1a_offset_bias;
> >> +   const uint32_t *ints = (const uint32_t *) data;
> >> +
> >> +   assert((size % 4) == 0);
> >> +
> >> +   uint32_t i = size / 4;
> >> +
> >> +   while (i-- != 0) {
> >> +  hash ^= hash_32bit_int(*ints);
> >> +  ints++;
> >> +   }
> >> +
> >> +   return hash;
> >>  }
> >>
> >>  /** FNV-1a string hash implementation */
> >> --
> >> 2.3.4
> >>


[Mesa-dev] [PATCH 0.5/3] util/tests: Expand collision test for hash table

2015-03-29 Thread Thomas Helland
Add a test to exercise a worst case collision scenario
that may cause us to not be able to find an empty
slot in the table even though it is not full.
This hits the bug in my last revision of the series
converting the hash table to quadratic probing.

Signed-off-by: Thomas Helland 
---
 src/util/tests/hash_table/collision.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/util/tests/hash_table/collision.c b/src/util/tests/hash_table/collision.c
index 69a4c29..ba284d8 100644
--- a/src/util/tests/hash_table/collision.c
+++ b/src/util/tests/hash_table/collision.c
@@ -89,6 +89,20 @@ main(int argc, char **argv)
entry2 = _mesa_hash_table_search_pre_hashed(ht, bad_hash, str2);
assert(entry2->key == str2);
 
+
+   _mesa_hash_table_destroy(ht, NULL);
+
+   /* Try inserting multiple items with the same hash
+* This exercises a worst case scenario where we might fail to find
+* an empty slot in the table, even though there is free space
+*/
+   ht = _mesa_hash_table_create(NULL, NULL, _mesa_key_string_equal);
+   for (i = 0; i < 100; i++) {
+  char *key = malloc(10);
+  sprintf(key, "spam%d", i);
+  assert(_mesa_hash_table_insert_pre_hashed(ht, bad_hash, key, NULL) != NULL);
+   }
+
_mesa_hash_table_destroy(ht, NULL);
 
return 0;
-- 
2.3.4
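
To show why "cannot find an empty slot even though the table is not full" can
happen with quadratic probing, here is a small standalone sketch. Which probe
sequence the series actually uses is not shown in this mail, so both sequences
below, and the table size of 16, are assumptions picked for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define TABLE_SIZE 16u /* power of two, assumed for this sketch */

/* Counts how many distinct slots are reached in TABLE_SIZE probes when
 * every key hashes to slot 0 (the worst case the test exercises).
 *
 * triangular == false: offset i * i
 * triangular == true:  offset i * (i + 1) / 2
 */
static unsigned
count_visited(bool triangular)
{
   bool seen[TABLE_SIZE] = { false };
   unsigned visited = 0;
   uint32_t i;

   for (i = 0; i < TABLE_SIZE; i++) {
      const uint32_t offset = triangular ? i * (i + 1) / 2 : i * i;
      const uint32_t slot = offset % TABLE_SIZE;

      if (!seen[slot]) {
         seen[slot] = true;
         visited++;
      }
   }
   return visited;
}
```

count_visited(false) returns 4 (only slots 0, 1, 4 and 9 are ever probed),
while count_visited(true) returns 16. So with the plain i * i sequence a fifth
colliding key has nowhere to go in a 16-entry table, which is the kind of
failure the new test is meant to catch.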



[Mesa-dev] [PATCH v2 3/3] util: Use 32 bit integer hash function for

2015-03-29 Thread Thomas Helland
Since a pointer is basically just an int we can use integer hashing.
This implementation is found on Bob Jenkins' webpage at:
http://burtleburtle.net/bob/hash/integer.html
It states that this implementation is faster than any of his algorithms.
It also states that the algorithm is public domain.
Since it operates 32 bits at a time it is really effective.

Oprofile of shader-db run with INTEL_USE_NIR set:
mesa_hash_data 3.09 % ---> 2.15 %

V2: Feedback from Matt Turner
   - Use a for-loop for readability
   - Don't mix code and declaration

Feedback from Jordan Justen
   - Add comment regarding licensing

Signed-off-by: Thomas Helland 
---
 src/util/hash_table.c | 38 --
 1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/src/util/hash_table.c b/src/util/hash_table.c
index 24184c0..7c6b3ae 100644
--- a/src/util/hash_table.c
+++ b/src/util/hash_table.c
@@ -393,21 +393,39 @@ _mesa_hash_table_random_entry(struct hash_table *ht,
return NULL;
 }
 
-
 /**
- * Quick FNV-1a hash implementation based on:
- * http://www.isthe.com/chongo/tech/comp/fnv/
- *
- * FNV-1a is not be the best hash out there -- Jenkins's lookup3 is supposed
- * to be quite good, and it probably beats FNV.  But FNV has the advantage
- * that it involves almost no code.  For an improvement on both, see Paul
- * Hsieh's http://www.azillionmonkeys.com/qed/hash.html
+ * This hashing function is described on Bob Jenkins' website:
+ * http://burtleburtle.net/bob/hash/integer.html
+ * It states that the code originally is Thomas Wang's,
+ * and that it is public domain.
+ * The original page is down, but a copy can be found on
+ * http://web.archive.org/web/20120720045250/http://www.cris.com/~Ttwang/tech/inthash.htm
  */
+static inline uint32_t
+hash_32bit_int(uint32_t a) {
+   a = (a ^ 61) ^ (a >> 16);
+   a = a + (a << 3);
+   a = a ^ (a >> 4);
+   a = a * 0x27d4eb2d;
+   a = a ^ (a >> 15);
+   return a;
+}
+
 uint32_t
 _mesa_hash_data(const void *data, size_t size)
 {
-   return _mesa_fnv32_1a_accumulate_block(_mesa_fnv32_1a_offset_bias,
-  data, size);
+   uint32_t hash = _mesa_fnv32_1a_offset_bias;
+   const uint32_t *ints = (const uint32_t *) data;
+   uint32_t i = 0;
+
+   assert((size % 4) == 0);
+
+   uint32_t i = size / 4;
+
+   for (i = size / 4; i != 0; i--, ints++)
+  hash ^= hash_32bit_int(*ints);
+
+   return hash;
 }
 
 /** FNV-1a string hash implementation */
-- 
2.3.4
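
A standalone version of the same idea that compiles on its own, for anyone who
wants to experiment. Note that the hunk above declares `i` twice, so this
sketch uses a single loop counter. hash_words() is a made-up name, the seed is
the 32-bit FNV-1a offset basis written out as a constant, and memcpy is used
for the loads as an assumption on my part to avoid alignment concerns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Thomas Wang's 32-bit integer hash, as quoted in the patch. */
static inline uint32_t
hash_32bit_int(uint32_t a)
{
   a = (a ^ 61) ^ (a >> 16);
   a = a + (a << 3);
   a = a ^ (a >> 4);
   a = a * 0x27d4eb2d;
   a = a ^ (a >> 15);
   return a;
}

/* Hypothetical block hash: XORs the per-word hashes into a seed. */
static uint32_t
hash_words(const void *data, size_t size)
{
   uint32_t hash = 0x811c9dc5u; /* 32-bit FNV-1a offset basis, used as seed */
   const unsigned char *bytes = data;
   size_t i;

   assert((size % 4) == 0);

   for (i = 0; i < size; i += 4) {
      uint32_t word;

      memcpy(&word, bytes + i, sizeof(word));
      hash ^= hash_32bit_int(word);
   }
   return hash;
}
```

One property worth keeping in mind: because the words are combined with XOR,
the result does not depend on word order, and two identical words cancel each
other out. That is harmless for a single pointer-sized key but may matter for
larger keys.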



[Mesa-dev] Good compiler literature?

2015-03-29 Thread Thomas Helland
Does anyone have suggestions for good literature on compilers?
Since GPUs and CPUs are a bit different, there are probably
books that are better suited than others for GPUs?
I have what is probably Norway's biggest library on the subject to rent
books from, so I guess I should be able to find most suggestions there.

Regards
Thomas


[Mesa-dev] [PATCH 00/22] Expand get_range in minmax_pruning, V3

2015-03-29 Thread Thomas Helland
I just remembered the existence of this series, so
I went ahead and tried it on top of today's master.
With NIR enabled there is no benefit at all; I didn't try without NIR.
I've marked it as rejected in patchwork, so it's not floating around in there.
On 3 Jan 2015 at 20:21, "Thomas Helland" wrote:

> A couple of months ago I posted a series for expanding the get_range
> function in minmax_pruning with support for more operations.
> So I've been hacking on this during my spare time over Christmas.
> I've now gotten to a point where I think this is not getting us
> anywhere, and I need some opinions from other devs to prove me wrong.
>
> As it stands only a couple of these patches yield any results,
> and IMHO only the first patch is merge-material until we can
> confirm that this is actually going to give us significant improvements.
> The first patch gives noticeable improvements on shader-db since
> we are no longer messing up our saturate-detection.
>
> Patches up to patch 9 are quite small, and patch 5 yields
> some benefit on some Dungeon Defenders shaders that do max(exp, 0).
> These might be merge-material too, due to their small size.
> However the benefits of patch 5 could easily be had with
> a 10-line patch to opt_algebraic, and that's the only one showing
> any benefit on my collection of shaders.
>
> Patches 10-19 take on more complexity, with no apparent benefit.
> I feel there's not adequate return on investment for the code to be
> sitting around bloating the codebase.
>
> The last three patches are RFC only, as we at least need to rename
> the file to something more generic before merging them.
>
> There is still room for improvement, but I'm not sure it's worth it.
> Maybe someone can do a shader-db run that proves me wrong?
> My shader-db is dominated by TF2, DOTA2, Portal,
> Brutal Legend and Dungeon Defenders. Maybe non-Source-engine
> games show some benefit from this series?
>
> I'm not comfortable with how this might mess up our shaders.
> It gets hard to verify that things end up complying with the spec,
> and that we are not doing something wrong.
> Some lerp-instructions got removed in Brutal Legend
> (it could be guaranteed that the operation would be saturated to 1),
> and while I could tell it was likely that the pass was doing the
> right thing, it was not easily confirmable. This worries me.
>
> IMHO we need to do better for this to be worth it.
> I added a print to the get_range function to gather some stats:
> 2 million calls are made to get_range on my shader-db run.
> 500'000 of these are expressions, 2'000 of these are unsupported.
> 350'000 of these are constants.
> So our coverage in get_range is less than 50%.
> Is there anything more we could get/know the range of?
>
> Thomas Helland (22):
>   glsl: Reorder optimization-passes
>   glsl: Move common code to ir_constant_util.h
>   glsl: Add a IS_CONSTANT macro
>   glsl: Change to using switch-case in get_range
>   glsl: Add sqrt, rsq, exp, exp2 to get_range
>   glsl: Add sin, cos and sign to get_range
>   glsl: Add saturate to get_range
>   glsl: Add abs to get_range
>   glsl: Add ir_unop_neg to get_range
>   glsl: Add ir_binop_add to get_range
>   glsl: Add ir_binop_mul to get_range
>   glsl: Add ir_binop_sub to get_range
>   glsl: Add ir_binop_pow to get_range
>   glsl: Add ir_triop_fma to get_range
>   glsl: Add ir_triop_lrp to get_range
>   glsl: Add ir_binop_dot to get_range
>   glsl: Add log and log2 to get_range
>   glsl: Add ir_unop_rcp to get_range
>   glsl: Add a saturate range optimization
>   glsl: Optimize some cases of undefined behaviour.
>   glsl: Add a range based comparison opt-pass
>   glsl: Remove useless abs based on range analysis
>
>  src/glsl/glsl_parser_extras.cpp |   2 +-
>  src/glsl/ir_constant_util.h | 110 +
>  src/glsl/opt_algebraic.cpp  |  93 +---
>  src/glsl/opt_minmax.cpp | 495 ++--
>  4 files changed, 589 insertions(+), 111 deletions(-)
>  create mode 100644 src/glsl/ir_constant_util.h
>
> --
> 2.2.1
>
>


[Mesa-dev] Fwd: [PATCH 3/3] util: Use 32 bit integer hash function for pointers

2015-03-29 Thread Thomas Helland
(Forgot to send to list)


That is indeed an issue.
I found the original article on the wayback machine and it
doesn't state anything particular wrt license.
However, it seems to be used in a LOT of projects.
(javascript, chromium, hiphop-php, kde, +++)

I found this webpage that gives some more insight:
http://burtleburtle.net/bob/hash/integer.html
(it seems to be the website of Bob Jenkins)
It states the following:

"The hashes on this page (with the possible exception
of HashMap.java's) are all public domain.
So are the ones on Thomas Wang's page.
Thomas recommends citing the author
and page when using them."

So it looks like it is public domain.
I'll add a credit to Thomas, and link
to Bob Jenkins' webpage.
Does that sound like an acceptable solution?

2015-03-29 20:51 GMT+02:00 Jordan Justen :
> On 2015-03-29 11:05:40, Thomas Helland wrote:
>> Since a pointer is basically just an int we can use integer hashing.
>> This one is taken from https://gist.github.com/badboy/6267743
>
> There doesn't seem to be a license associated with this code, nor is
> it indicated that it is public domain code.
>
> -Jordan
>
>> A google search seems to suggest this is a common and good algorithm.
>> Since it operates 32 bits at a time it is really effective.
>> assert that we are hashing 32bit aligned pointers.
>>
>> Signed-off-by: Thomas Helland 
>> ---
>>  src/util/hash_table.c | 24 ++--
>>  1 file changed, 22 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/util/hash_table.c b/src/util/hash_table.c
>> index 24184c0..54d04ef 100644
>> --- a/src/util/hash_table.c
>> +++ b/src/util/hash_table.c
>> @@ -393,6 +393,15 @@ _mesa_hash_table_random_entry(struct hash_table *ht,
>> return NULL;
>>  }
>>
>> +static inline uint32_t
>> +hash_32bit_int(uint32_t a) {
>> +   a = (a ^ 61) ^ (a >> 16);
>> +   a = a + (a << 3);
>> +   a = a ^ (a >> 4);
>> +   a = a * 0x27d4eb2d;
>> +   a = a ^ (a >> 15);
>> +   return a;
>> +}
>>
>>  /**
>>   * Quick FNV-1a hash implementation based on:
>> @@ -406,8 +415,19 @@ _mesa_hash_table_random_entry(struct hash_table *ht,
>>  uint32_t
>>  _mesa_hash_data(const void *data, size_t size)
>>  {
>> -   return _mesa_fnv32_1a_accumulate_block(_mesa_fnv32_1a_offset_bias,
>> -  data, size);
>> +   uint32_t hash = _mesa_fnv32_1a_offset_bias;
>> +   const uint32_t *ints = (const uint32_t *) data;
>> +
>> +   assert((size % 4) == 0);
>> +
>> +   uint32_t i = size / 4;
>> +
>> +   while (i-- != 0) {
>> +  hash ^= hash_32bit_int(*ints);
>> +  ints++;
>> +   }
>> +
>> +   return hash;
>>  }
>>
>>  /** FNV-1a string hash implementation */
>> --
>> 2.3.4
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] util: Use 32 bit integer hash function for pointers

2015-03-29 Thread Jordan Justen
On 2015-03-29 11:05:40, Thomas Helland wrote:
> Since a pointer is basically just an int we can use integer hashing.
> This one is taken from https://gist.github.com/badboy/6267743

There doesn't seem to be a license associated with this code, nor is
it indicated that it is public domain code.

-Jordan

> A google search seems to suggest this is a common and good algorithm.
> Since it operates 32 bits at a time it is really effective.
> assert that we are hashing 32bit aligned pointers.
> 
> Signed-off-by: Thomas Helland 
> ---
>  src/util/hash_table.c | 24 ++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/src/util/hash_table.c b/src/util/hash_table.c
> index 24184c0..54d04ef 100644
> --- a/src/util/hash_table.c
> +++ b/src/util/hash_table.c
> @@ -393,6 +393,15 @@ _mesa_hash_table_random_entry(struct hash_table *ht,
> return NULL;
>  }
>  
> +static inline uint32_t
> +hash_32bit_int(uint32_t a) {
> +   a = (a ^ 61) ^ (a >> 16);
> +   a = a + (a << 3);
> +   a = a ^ (a >> 4);
> +   a = a * 0x27d4eb2d;
> +   a = a ^ (a >> 15);
> +   return a;
> +}
>  
>  /**
>   * Quick FNV-1a hash implementation based on:
> @@ -406,8 +415,19 @@ _mesa_hash_table_random_entry(struct hash_table *ht,
>  uint32_t
>  _mesa_hash_data(const void *data, size_t size)
>  {
> -   return _mesa_fnv32_1a_accumulate_block(_mesa_fnv32_1a_offset_bias,
> -  data, size);
> +   uint32_t hash = _mesa_fnv32_1a_offset_bias;
> +   const uint32_t *ints = (const uint32_t *) data;
> +
> +   assert((size % 4) == 0);
> +
> +   uint32_t i = size / 4;
> +
> +   while (i-- != 0) {
> +  hash ^= hash_32bit_int(*ints);
> +  ints++;
> +   }
> +
> +   return hash;
>  }
>  
>  /** FNV-1a string hash implementation */
> -- 
> 2.3.4
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] util: Use 32 bit integer hash function for pointers

2015-03-29 Thread Thomas Helland
2015-03-29 20:15 GMT+02:00 Matt Turner :
> On Sun, Mar 29, 2015 at 11:05 AM, Thomas Helland
>  wrote:
>> Since a pointer is basically just an int we can use integer hashing.
>> This one is taken from https://gist.github.com/badboy/6267743
>> A google search seems to suggest this is a common and good algorithm.
>> Since it operates 32 bits at a time it is really effective.
>> assert that we are hashing 32bit aligned pointers.
>>
>> Signed-off-by: Thomas Helland 
>> ---
>>  src/util/hash_table.c | 24 ++--
>>  1 file changed, 22 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/util/hash_table.c b/src/util/hash_table.c
>> index 24184c0..54d04ef 100644
>> --- a/src/util/hash_table.c
>> +++ b/src/util/hash_table.c
>> @@ -393,6 +393,15 @@ _mesa_hash_table_random_entry(struct hash_table *ht,
>> return NULL;
>>  }
>>
>> +static inline uint32_t
>> +hash_32bit_int(uint32_t a) {
>> +   a = (a ^ 61) ^ (a >> 16);
>> +   a = a + (a << 3);
>> +   a = a ^ (a >> 4);
>> +   a = a * 0x27d4eb2d;
>> +   a = a ^ (a >> 15);
>> +   return a;
>> +}
>>
>>  /**
>>   * Quick FNV-1a hash implementation based on:
>> @@ -406,8 +415,19 @@ _mesa_hash_table_random_entry(struct hash_table *ht,
>>  uint32_t
>>  _mesa_hash_data(const void *data, size_t size)
>>  {
>> -   return _mesa_fnv32_1a_accumulate_block(_mesa_fnv32_1a_offset_bias,
>> -  data, size);
>> +   uint32_t hash = _mesa_fnv32_1a_offset_bias;
>> +   const uint32_t *ints = (const uint32_t *) data;
>> +
>> +   assert((size % 4) == 0);
>
> Not sure if we can mix code and declarations. This might need to go
> after the declaration of i.
>

Darn. Rebase failure on my part.
Will post a V2 ASAP.

>> +
>> +   uint32_t i = size / 4;
>> +
>> +   while (i-- != 0) {
>> +  hash ^= hash_32bit_int(*ints);
>> +  ints++;
>> +   }
>
> This would read a lot better as
>
> for (i = size / 4; i != 0; i--) {
>    hash ^= hash_32bit_int(ints[i - 1]);
> }

I'll get this into the V2 as well.

Thanks for the fast response =)

>
>> +
>> +   return hash;
>>  }
>>
>>  /** FNV-1a string hash implementation */
>> --
>> 2.3.4


Re: [Mesa-dev] [PATCH 3/3] util: Use 32 bit integer hash function for pointers

2015-03-29 Thread Matt Turner
On Sun, Mar 29, 2015 at 11:05 AM, Thomas Helland
 wrote:
> Since a pointer is basically just an int we can use integer hashing.
> This one is taken from https://gist.github.com/badboy/6267743
> A google search seems to suggest this is a common and good algorithm.
> Since it operates 32 bits at a time it is really effective.
> assert that we are hashing 32bit aligned pointers.
>
> Signed-off-by: Thomas Helland 
> ---
>  src/util/hash_table.c | 24 ++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/src/util/hash_table.c b/src/util/hash_table.c
> index 24184c0..54d04ef 100644
> --- a/src/util/hash_table.c
> +++ b/src/util/hash_table.c
> @@ -393,6 +393,15 @@ _mesa_hash_table_random_entry(struct hash_table *ht,
> return NULL;
>  }
>
> +static inline uint32_t
> +hash_32bit_int(uint32_t a) {
> +   a = (a ^ 61) ^ (a >> 16);
> +   a = a + (a << 3);
> +   a = a ^ (a >> 4);
> +   a = a * 0x27d4eb2d;
> +   a = a ^ (a >> 15);
> +   return a;
> +}
>
>  /**
>   * Quick FNV-1a hash implementation based on:
> @@ -406,8 +415,19 @@ _mesa_hash_table_random_entry(struct hash_table *ht,
>  uint32_t
>  _mesa_hash_data(const void *data, size_t size)
>  {
> -   return _mesa_fnv32_1a_accumulate_block(_mesa_fnv32_1a_offset_bias,
> -  data, size);
> +   uint32_t hash = _mesa_fnv32_1a_offset_bias;
> +   const uint32_t *ints = (const uint32_t *) data;
> +
> +   assert((size % 4) == 0);

Not sure if we can mix code and declarations. This might need to go
after the declaration of i.

> +
> +   uint32_t i = size / 4;
> +
> +   while (i-- != 0) {
> +  hash ^= hash_32bit_int(*ints);
> +  ints++;
> +   }

This would read a lot better as

for (i = size / 4; i != 0; i--) {
   hash ^= hash_32bit_int(ints[i - 1]);
}
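
To make the two review points concrete, here is a minimal sketch (not code from the patch) that keeps all declarations ahead of the first statement and writes the loop both ways. The function names `hash_with_while` and `hash_with_for` are made up for the comparison, and the starting value 0 is an arbitrary stand-in for the seed. When the index counts down from size / 4, the element read is `ints[i - 1]`, which keeps every access inside the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The 32-bit integer hash quoted in the patch. */
static uint32_t
hash_32bit_int(uint32_t a)
{
   a = (a ^ 61) ^ (a >> 16);
   a = a + (a << 3);
   a = a ^ (a >> 4);
   a = a * 0x27d4eb2d;
   a = a ^ (a >> 15);
   return a;
}

/* Pointer-walking form, as in the first version of the patch. */
static uint32_t
hash_with_while(const uint32_t *ints, size_t size)
{
   uint32_t hash = 0;
   uint32_t i = (uint32_t)(size / 4); /* declared before any statement */

   assert((size % 4) == 0);

   while (i-- != 0) {
      hash ^= hash_32bit_int(*ints);
      ints++;
   }
   return hash;
}

/* Indexed form; i runs from size / 4 down to 1, so read ints[i - 1]. */
static uint32_t
hash_with_for(const uint32_t *ints, size_t size)
{
   uint32_t hash = 0;
   uint32_t i;

   assert((size % 4) == 0);

   for (i = (uint32_t)(size / 4); i != 0; i--)
      hash ^= hash_32bit_int(ints[i - 1]);
   return hash;
}
```

Both forms visit the same words, only in opposite order, and XOR does not care about order, so they return the same value.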

> +
> +   return hash;
>  }
>
>  /** FNV-1a string hash implementation */
> --
> 2.3.4


[Mesa-dev] [PATCH 1/3] util: Change hash_table to use quadratic probing

2015-03-29 Thread Thomas Helland
This should give better cache locality, less memory consumption,
less code, and should also be faster since we avoid a modulo operation.
Also change table size to be power of two.
This gives better performance as we can do bitmasking instead of
modulo operations for fitting the hash in the address space.
By using the algorithm hash = sh + i/2 + i*i/2
we are guaranteed that all retries from the quad probing
are distinct, and so should be able to completely fill the table.
This passes the test added to exercise a worst case collision scenario.
Also, start at size = 16 instead of 4.
This should reduce some allocation overhead
when constantly using tables larger than 3 entries.
The amount of space used before rehash is changed to 70%.
This should decrease collisions slightly, leading to better performance.
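
As a rough illustration of the probing claim above (this is not code from the patch): if the formula is read as start + (i + i*i)/2, that is, the i-th triangular number added to the initial address, then the first `size` probes land on `size` distinct slots whenever `size` is a power of two. The helper names `probe_address` and `probes_cover_table` are made up for this sketch, and the 1024-slot cap is arbitrary.

```c
#include <stdbool.h>
#include <stdint.h>

/* Address of probe number i: the start address plus the i-th triangular
 * number, wrapped with a power-of-two mask. (i + i*i) is always even, so
 * the division is exact. */
static uint32_t
probe_address(uint32_t start, uint32_t i, uint32_t size)
{
   return (start + (i + i * i) / 2) & (size - 1);
}

/* Returns true if probes 0..size-1 hit `size` distinct slots.
 * Rejects sizes that are zero, above 1024, or not a power of two. */
static bool
probes_cover_table(uint32_t start, uint32_t size)
{
   bool seen[1024] = { false };
   uint32_t i;

   if (size == 0 || size > 1024 || (size & (size - 1)) != 0)
      return false;

   for (i = 0; i < size; i++) {
      uint32_t addr = probe_address(start, i, size);

      if (seen[addr])
         return false; /* a slot was visited twice */
      seen[addr] = true;
   }
   return true;
}
```

Calling `probes_cover_table(0, 16)` checks the starting table size mentioned in this commit message; the same check passes for any start address because the start only shifts the whole sequence.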

V3: Feedback from Connor Abbott
- Remove hash_size table
- Correct comment-style

Feedback from Eric Anholt
- Correct quadratic probing algorithm

Feedback from Jason Ekstrand
- Add "unreachable" if we fail to insert in table

Signed-off-by: Thomas Helland 
---
 src/util/hash_table.c | 108 +-
 src/util/hash_table.h |   3 +-
 2 files changed, 38 insertions(+), 73 deletions(-)

diff --git a/src/util/hash_table.c b/src/util/hash_table.c
index 3247593..24184c0 100644
--- a/src/util/hash_table.c
+++ b/src/util/hash_table.c
@@ -33,11 +33,16 @@
  */
 
 /**
- * Implements an open-addressing, linear-reprobing hash table.
+ * Implements an open-addressing, quadratic probing hash table.
  *
- * For more information, see:
- *
- * http://cgit.freedesktop.org/~anholt/hash_table/tree/README
+ * We choose table sizes that are powers of two.
+ * This is computationally less expensive than primes.
+ * As a bonus the size and free space can be calculated instead of looked up.
+ * FNV-1a has good avalanche properties, so collision is not an issue.
+ * These tables are sized to have an extra 30% free to avoid
+ * exponential performance degradation as the hash table fills.
+ * The table has a starting size of 16 to avoid spamming
+ * rzalloc and friends in the start of most of our tables.
  */
 
 #include 
@@ -50,47 +55,6 @@
 
 static const uint32_t deleted_key_value;
 
-/**
- * From Knuth -- a good choice for hash/rehash values is p, p-2 where
- * p and p-2 are both prime.  These tables are sized to have an extra 10%
- * free to avoid exponential performance degradation as the hash table fills
- */
-static const struct {
-   uint32_t max_entries, size, rehash;
-} hash_sizes[] = {
-   { 2,5,  3 },
-   { 4,7,  5 },
-   { 8,13, 11},
-   { 16,   19, 17},
-   { 32,   43, 41},
-   { 64,   73, 71},
-   { 128,  151,149   },
-   { 256,  283,281   },
-   { 512,  571,569   },
-   { 1024, 1153,   1151  },
-   { 2048, 2269,   2267  },
-   { 4096, 4519,   4517  },
-   { 8192, 9013,   9011  },
-   { 16384,18043,  18041 },
-   { 32768,36109,  36107 },
-   { 65536,72091,  72089 },
-   { 131072,   144409, 144407},
-   { 262144,   288361, 288359},
-   { 524288,   576883, 576881},
-   { 1048576,  1153459,1153457   },
-   { 2097152,  2307163,2307161   },
-   { 4194304,  4613893,4613891   },
-   { 8388608,  9227641,9227639   },
-   { 16777216, 18455029,   18455027  },
-   { 33554432, 36911011,   36911009  },
-   { 67108864, 73819861,   73819859  },
-   { 134217728,147639589,  147639587 },
-   { 268435456,295279081,  295279079 },
-   { 536870912,590559793,  590559791 },
-   { 1073741824,   1181116273, 1181116271},
-   { 2147483648ul, 2362232233ul,   2362232231ul}
-};
-
 static int
 entry_is_free(const struct hash_entry *entry)
 {
@@ -121,10 +85,13 @@ _mesa_hash_table_create(void *mem_ctx,
if (ht == NULL)
   return NULL;
 
-   ht->size_index = 0;
-   ht->size = hash_sizes[ht->size_index].size;
-   ht->rehash = hash_sizes[ht->size_index].rehash;
-   ht->max_entries = hash_sizes[ht->size_index].max_entries;
+   /* Start the table at an initial size of 16
+* We use a bit more memory, but avoid spamming
+* malloc and friends when starting a new table
+*/
+   ht->size_iteration = 4;
+   ht->size = 1 << ht->size_iteration;
+   ht->max_entries = ht->size * 0.7;
ht->key_hash_function = key_hash_function;
ht->key_equals_function = key_equals_function;
ht->tabl

[Mesa-dev] [PATCH 0/3] Hash-table improvements, V3

2015-03-29 Thread Thomas Helland
Here's the latest round of fixup on the hash-table patches.
I think I've gotten all the review feedback incorporated now.
These patches give a nice little boost, indicated in each commit.
As a side effect of upping the minimum size of the table and set
there is now also less spamming of rzalloc and friends.
_int_malloc is cut from 935'000 to 847'000 samples.
calloc is cut from 683'000 to 655'000 samples.
_int_free is cut from 644'000 to 617'000 samples
The series reduced shader-db run-time with NIR on my collection
from 180 seconds to about 160 seconds.

Thomas Helland (3):
  util: Change hash_table to use quadratic probing
  util: Change util/set to use quadratic probing
  util: Use 32 bit integer hash function for pointers

 src/util/hash_table.c | 132 ++
 src/util/hash_table.h |   3 +-
 src/util/set.c| 124 ++-
 src/util/set.h|   3 +-
 4 files changed, 109 insertions(+), 153 deletions(-)

-- 
2.3.4



[Mesa-dev] [PATCH 2/3] util: Change util/set to use quadratic probing

2015-03-29 Thread Thomas Helland
The same rationale applies here as for the hash table.
Power of two size should give better performance,
and using the algorithm hash = sh + i/2 + i*i/2
should result in only distinct hash values when hitting collisions.
Should give a performance increase as we can do bitmasking instead
of a modulo operation for fitting the hash in the address space.
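
A small sketch of the bitmasking point, under the assumption that the size is always a power of two: masking with size - 1 then selects the same slot as a modulo would. The helper names below are invented for the illustration; `max_entries_for` simply mirrors the `ht->size * 0.7` expression that appears later in this patch.

```c
#include <stdint.h>

/* Map a hash to a slot with a bitmask; only valid when size is a power
 * of two. */
static uint32_t
slot_by_mask(uint32_t hash, uint32_t size)
{
   return hash & (size - 1);
}

/* Map a hash to a slot with a modulo; valid for any non-zero size. */
static uint32_t
slot_by_modulo(uint32_t hash, uint32_t size)
{
   return hash % size;
}

/* Entries allowed before a rehash at a 70% fill limit; the fractional
 * part is dropped by the conversion to an integer. */
static uint32_t
max_entries_for(uint32_t size)
{
   return (uint32_t)(size * 0.7);
}
```

For the initial size of 16, `max_entries_for(16)` gives 11, so the twelfth insertion would be the first to trigger a rehash under this reading.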

V2: Feedback from Connor Abbott
   - Don't set initial hash address before potential rehash
   - Remove hash_sizes table
   - Correct the quadratic hashing algorithm
   - Use correct comment style

   Feedback from Jason Ekstrand
   - Use unreachable() to detect if we fail to insert

Signed-off-by: Thomas Helland 
---
 src/util/set.c | 124 ++---
 src/util/set.h |   3 +-
 2 files changed, 49 insertions(+), 78 deletions(-)

diff --git a/src/util/set.c b/src/util/set.c
index f01f869..1496178 100644
--- a/src/util/set.c
+++ b/src/util/set.c
@@ -32,6 +32,19 @@
  *Keith Packard 
  */
 
+/**
+ * Implements an open-addressing, quadratic probing hash-set.
+ *
+ * We choose set sizes that are powers of two.
+ * This is computationally less expensive than primes.
+ * As a bonus the size and free space can be calculated instead of looked up.
+ * FNV-1a has good avalanche properties, so collision is not an issue.
+ * These sets are sized to have an extra 30% free to avoid
+ * exponential performance degradation as the set fills.
+ * The set has a starting size of 16 to avoid spamming
+ * rzalloc and friends in the start of most of our sets.
+ */
+
 #include 
 #include 
 
@@ -39,51 +52,9 @@
 #include "ralloc.h"
 #include "set.h"
 
-/*
- * From Knuth -- a good choice for hash/rehash values is p, p-2 where
- * p and p-2 are both prime.  These tables are sized to have an extra 10%
- * free to avoid exponential performance degradation as the hash table fills
- */
-
 uint32_t deleted_key_value;
 const void *deleted_key = &deleted_key_value;
 
-static const struct {
-   uint32_t max_entries, size, rehash;
-} hash_sizes[] = {
-   { 2,5,3},
-   { 4,7,5},
-   { 8,13,   11   },
-   { 16,   19,   17   },
-   { 32,   43,   41   },
-   { 64,   73,   71   },
-   { 128,  151,  149  },
-   { 256,  283,  281  },
-   { 512,  571,  569  },
-   { 1024, 1153, 1151 },
-   { 2048, 2269, 2267 },
-   { 4096, 4519, 4517 },
-   { 8192, 9013, 9011 },
-   { 16384,18043,18041},
-   { 32768,36109,36107},
-   { 65536,72091,72089},
-   { 131072,   144409,   144407   },
-   { 262144,   288361,   288359   },
-   { 524288,   576883,   576881   },
-   { 1048576,  1153459,  1153457  },
-   { 2097152,  2307163,  2307161  },
-   { 4194304,  4613893,  4613891  },
-   { 8388608,  9227641,  9227639  },
-   { 16777216, 18455029, 18455027 },
-   { 33554432, 36911011, 36911009 },
-   { 67108864, 73819861, 73819859 },
-   { 134217728,147639589,147639587},
-   { 268435456,295279081,295279079},
-   { 536870912,590559793,590559791},
-   { 1073741824,   1181116273,   1181116271   },
-   { 2147483648ul, 2362232233ul, 2362232231ul }
-};
-
 static int
 entry_is_free(struct set_entry *entry)
 {
@@ -114,10 +85,13 @@ _mesa_set_create(void *mem_ctx,
if (ht == NULL)
   return NULL;
 
-   ht->size_index = 0;
-   ht->size = hash_sizes[ht->size_index].size;
-   ht->rehash = hash_sizes[ht->size_index].rehash;
-   ht->max_entries = hash_sizes[ht->size_index].max_entries;
+   /* Start the set at an initial size of 16
+* We use a bit more memory, but avoid spamming
+* malloc and friends when starting a new set
+*/
+   ht->size_iteration = 4;
+   ht->size = 1 << ht->size_iteration;
+   ht->max_entries = ht->size * 0.7;
ht->key_hash_function = key_hash_function;
ht->key_equals_function = key_equals_function;
ht->table = rzalloc_array(ht, struct set_entry, ht->size);
@@ -163,12 +137,11 @@ _mesa_set_destroy(struct set *ht, void (*delete_function)(struct set_entry *entr
 static struct set_entry *
 set_search(const struct set *ht, uint32_t hash, const void *key)
 {
-   uint32_t hash_address;
+   uint32_t start_hash_address = hash & (ht->size - 1);
+   uint32_t hash_address = start_hash_address;
+   uint32_t quad_hash = 1;
 
-   hash_address = hash % ht->size;
do {
-  uint32_t double_hash;
-
   struct set_entry *entry = ht->table + hash_address;
 
   if (entry_is_free(entry)) {
@@ -179,10 +152,10 @@ set_search(const struct set *ht, uint32_t hash, const void *key)
  }
   }

[Mesa-dev] [PATCH 3/3] util: Use 32 bit integer hash function for pointers

2015-03-29 Thread Thomas Helland
Since a pointer is basically just an int we can use integer hashing.
This one is taken from https://gist.github.com/badboy/6267743
A google search seems to suggest this is a common and good algorithm.
Since it operates on 32 bits at a time it is really efficient.
Assert that the size of the data being hashed is a multiple of 32 bits.
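
For readers who want to try the idea outside Mesa, here is a self-contained sketch. `hash_32bit_int` is the function from this patch; `hash_words` and its `seed` parameter are hypothetical stand-ins for `_mesa_hash_data` and the FNV offset bias, and the sketch copies each word out with memcpy so it makes no assumption about the alignment of the input. One property worth noting: because the per-word hashes are combined with XOR alone, the result does not depend on the order of the words.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Thomas Wang's 32-bit integer hash, as quoted in the patch. */
static uint32_t
hash_32bit_int(uint32_t a)
{
   a = (a ^ 61) ^ (a >> 16);
   a = a + (a << 3);
   a = a ^ (a >> 4);
   a = a * 0x27d4eb2d;
   a = a ^ (a >> 15);
   return a;
}

/* Hypothetical helper: XOR together the hashes of each complete 32-bit
 * word in the block, starting from `seed`. Trailing bytes that do not
 * fill a whole word are ignored. */
static uint32_t
hash_words(const void *data, size_t size, uint32_t seed)
{
   const unsigned char *bytes = data;
   uint32_t hash = seed;
   size_t i;

   for (i = 0; i + 4 <= size; i += 4) {
      uint32_t word;

      memcpy(&word, bytes + i, sizeof(word));
      hash ^= hash_32bit_int(word);
   }
   return hash;
}
```

A pointer key can be hashed by passing the address of the pointer variable and `sizeof` the pointer, which covers one word on 32-bit targets and two on typical 64-bit targets.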

Signed-off-by: Thomas Helland 
---
 src/util/hash_table.c | 24 ++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/src/util/hash_table.c b/src/util/hash_table.c
index 24184c0..54d04ef 100644
--- a/src/util/hash_table.c
+++ b/src/util/hash_table.c
@@ -393,6 +393,15 @@ _mesa_hash_table_random_entry(struct hash_table *ht,
return NULL;
 }
 
+static inline uint32_t
+hash_32bit_int(uint32_t a) {
+   a = (a ^ 61) ^ (a >> 16);
+   a = a + (a << 3);
+   a = a ^ (a >> 4);
+   a = a * 0x27d4eb2d;
+   a = a ^ (a >> 15);
+   return a;
+}
 
 /**
  * Quick FNV-1a hash implementation based on:
@@ -406,8 +415,19 @@ _mesa_hash_table_random_entry(struct hash_table *ht,
 uint32_t
 _mesa_hash_data(const void *data, size_t size)
 {
-   return _mesa_fnv32_1a_accumulate_block(_mesa_fnv32_1a_offset_bias,
-  data, size);
+   uint32_t hash = _mesa_fnv32_1a_offset_bias;
+   const uint32_t *ints = (const uint32_t *) data;
+
+   assert((size % 4) == 0);
+
+   uint32_t i = size / 4;
+
+   while (i-- != 0) {
+  hash ^= hash_32bit_int(*ints);
+  ints++;
+   }
+
+   return hash;
 }
 
 /** FNV-1a string hash implementation */
-- 
2.3.4



Re: [Mesa-dev] Mesa 10.5.2

2015-03-29 Thread Emil Velikov
On 28/03/15 19:23, Emil Velikov wrote:
> Mesa 10.5.2 is now available. This release addresses bugs in the common glsl 
> code-base, the libGL and glapi libraries, and the dri modules. The tarball no 
> longer contains hardlinks and has all the haiku files. With this release one 
> can build mesa without the need of python and mako.
> 
Hi all,

It seems that there is a bug which prevents mesa 10.5.2 from going
python/mako free. The issue has been resolved and will feature in the
next stable release.

Thanks to everyone who reported this.

-Emil





Re: [Mesa-dev] [PATCH v2 06/15] st/mesa: implement GL_AMD_performance_monitor

2015-03-29 Thread Martin Peres

On 29/03/2015 17:56, Samuel Pitoiset wrote:


On 03/28/2015 09:43 PM, Martin Peres wrote:

On 22/03/2015 17:35, Samuel Pitoiset wrote:

From: Christoph Bumiller 

This is based on the original patch of Christoph Bumiller.
(source: http://people.freedesktop.org/~chrisbmr/perfmon.diff)

It would be nice if you could add "v2: Samuel Pitoiset" and tell what
you changed. Christoph may delete his perfmon.diff and no-one will be
able to diff the diffs :)

Good idea!


As for the Gallium HUD, we keep a list of busy queries in a ring
buffer in order to prevent stalls when reading queries.

Drivers must implement get_driver_query_group_info and
get_driver_query_info in order to enable this extension.

Signed-off-by: Samuel Pitoiset 
---
   src/mesa/Makefile.sources  |   2 +
   src/mesa/state_tracker/st_cb_perfmon.c | 455 +
   src/mesa/state_tracker/st_cb_perfmon.h |  70 +
   src/mesa/state_tracker/st_context.c|   4 +
   src/mesa/state_tracker/st_extensions.c |   3 +
   5 files changed, 534 insertions(+)
   create mode 100644 src/mesa/state_tracker/st_cb_perfmon.c
   create mode 100644 src/mesa/state_tracker/st_cb_perfmon.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 217be9a..e54e618 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -432,6 +432,8 @@ STATETRACKER_FILES = \
   state_tracker/st_cb_flush.h \
   state_tracker/st_cb_msaa.c \
   state_tracker/st_cb_msaa.h \
+state_tracker/st_cb_perfmon.c \
+state_tracker/st_cb_perfmon.h \
   state_tracker/st_cb_program.c \
   state_tracker/st_cb_program.h \
   state_tracker/st_cb_queryobj.c \
diff --git a/src/mesa/state_tracker/st_cb_perfmon.c b/src/mesa/state_tracker/st_cb_perfmon.c
new file mode 100644
index 000..fb6774b
--- /dev/null
+++ b/src/mesa/state_tracker/st_cb_perfmon.c
@@ -0,0 +1,455 @@
+/*
+ * Copyright (C) 2013 Christoph Bumiller
+ * Copyright (C) 2015 Samuel Pitoiset
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/**
+ * Performance monitoring counters interface to gallium.
+ */
+
+#include "st_context.h"
+#include "st_cb_bitmap.h"
+#include "st_cb_perfmon.h"
+
+#include "util/bitset.h"
+
+#include "pipe/p_context.h"
+#include "pipe/p_screen.h"
+#include "util/u_memory.h"
+
+/**
+ * Return a PIPE_QUERY_x type >= PIPE_QUERY_DRIVER_SPECIFIC, or -1 if
+ * the driver-specific query doesn't exist.
+ */
+static int
+find_query_type(struct pipe_screen *screen, const char *name)
+{
+   int num_queries;
+   int type = -1;
+   int i;
+
+   num_queries = screen->get_driver_query_info(screen, 0, NULL);
+   if (!num_queries)
+  return type;
+
+   for (i = 0; i < num_queries; i++) {
+  struct pipe_driver_query_info info;
+
+  if (!screen->get_driver_query_info(screen, i, &info))
+ continue;
+
+  if (!strncmp(info.name, name, strlen(name))) {
+ type = info.query_type;
+ break;
+  }
+   }
+   return type;
+}
+
+static bool
+init_perf_monitor(struct gl_context *ctx, struct gl_perf_monitor_object *m)
+{
+   struct st_perf_monitor_object *stm = st_perf_monitor_object(m);
+   struct pipe_screen *screen = st_context(ctx)->pipe->screen;
+   struct pipe_context *pipe = st_context(ctx)->pipe;
+   int gid, cid;
+
+   st_flush_bitmap_cache(st_context(ctx));
+
+   /* Create a query for each active counter. */
+   for (gid = 0; gid < ctx->PerfMonitor.NumGroups; gid++) {
+  const struct gl_perf_monitor_group *g = &ctx->PerfMonitor.Groups[gid];
+  for (cid = 0; cid < g->NumCounters; cid++) {
+ const struct gl_perf_monitor_counter *c = &g->Counters[cid];
+ struct st_perf_counter_object *cntr;
+ int query_type;
+
+ if (!BITSET_TEST(m->ActiveCounters[gid], cid))
+continue;

It would seem like the extension would not work with more than 32
counters per group.

This certainly is not a problem on the NVIDIA side but it may be

Re: [Mesa-dev] [PATCH v2 06/15] st/mesa: implement GL_AMD_performance_monitor

2015-03-29 Thread Martin Peres

On 29/03/2015 17:57, Samuel Pitoiset wrote:


On 03/29/2015 11:13 AM, Martin Peres wrote:

On 29/03/2015 04:02, Marek Olšák wrote:

On Sat, Mar 28, 2015 at 9:43 PM, Martin Peres wrote:

On 22/03/2015 17:35, Samuel Pitoiset wrote:

From: Christoph Bumiller 

This is based on the original patch of Christoph Bumiller.
(source: http://people.freedesktop.org/~chrisbmr/perfmon.diff)

It would be nice if you could add "v2: Samuel Pitoiset" and tell what you
changed. Christoph may delete his perfmon.diff and no-one will be able to
diff the diffs :)


As for the Gallium HUD, we keep a list of busy queries in a ring
buffer in order to prevent stalls when reading queries.

Drivers must implement get_driver_query_group_info and
get_driver_query_info in order to enable this extension.

Signed-off-by: Samuel Pitoiset 
---
src/mesa/Makefile.sources  |   2 +
src/mesa/state_tracker/st_cb_perfmon.c | 455 +
src/mesa/state_tracker/st_cb_perfmon.h |  70 +
src/mesa/state_tracker/st_context.c|   4 +
src/mesa/state_tracker/st_extensions.c |   3 +
5 files changed, 534 insertions(+)
create mode 100644 src/mesa/state_tracker/st_cb_perfmon.c
create mode 100644 src/mesa/state_tracker/st_cb_perfmon.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 217be9a..e54e618 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -432,6 +432,8 @@ STATETRACKER_FILES = \
  state_tracker/st_cb_flush.h \
  state_tracker/st_cb_msaa.c \
  state_tracker/st_cb_msaa.h \
+   state_tracker/st_cb_perfmon.c \
+   state_tracker/st_cb_perfmon.h \
  state_tracker/st_cb_program.c \
  state_tracker/st_cb_program.h \
  state_tracker/st_cb_queryobj.c \
diff --git a/src/mesa/state_tracker/st_cb_perfmon.c b/src/mesa/state_tracker/st_cb_perfmon.c
new file mode 100644
index 000..fb6774b
--- /dev/null
+++ b/src/mesa/state_tracker/st_cb_perfmon.c
@@ -0,0 +1,455 @@
+/*
+ * Copyright (C) 2013 Christoph Bumiller
+ * Copyright (C) 2015 Samuel Pitoiset
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/**
+ * Performance monitoring counters interface to gallium.
+ */
+
+#include "st_context.h"
+#include "st_cb_bitmap.h"
+#include "st_cb_perfmon.h"
+
+#include "util/bitset.h"
+
+#include "pipe/p_context.h"
+#include "pipe/p_screen.h"
+#include "util/u_memory.h"
+
+/**
+ * Return a PIPE_QUERY_x type >= PIPE_QUERY_DRIVER_SPECIFIC, or -1 if
+ * the driver-specific query doesn't exist.
+ */
+static int
+find_query_type(struct pipe_screen *screen, const char *name)
+{
+   int num_queries;
+   int type = -1;
+   int i;
+
+   num_queries = screen->get_driver_query_info(screen, 0, NULL);
+   if (!num_queries)
+  return type;
+
+   for (i = 0; i < num_queries; i++) {
+  struct pipe_driver_query_info info;
+
+  if (!screen->get_driver_query_info(screen, i, &info))
+ continue;
+
+  if (!strncmp(info.name, name, strlen(name))) {
+ type = info.query_type;
+ break;
+  }
+   }
+   return type;
+}
+
+static bool
+init_perf_monitor(struct gl_context *ctx, struct gl_perf_monitor_object *m)
+{
+   struct st_perf_monitor_object *stm = st_perf_monitor_object(m);
+   struct pipe_screen *screen = st_context(ctx)->pipe->screen;
+   struct pipe_context *pipe = st_context(ctx)->pipe;
+   int gid, cid;
+
+   st_flush_bitmap_cache(st_context(ctx));
+
+   /* Create a query for each active counter. */
+   for (gid = 0; gid < ctx->PerfMonitor.NumGroups; gid++) {
+  const struct gl_perf_monitor_group *g = &ctx->PerfMonitor.Groups[gid];
+  for (cid = 0; cid < g->NumCounters; cid++) {
+ const struct gl_perf_monitor_counter *c = &g->Counters[cid];
+ struct st_perf_counter_object *cntr;
+ int query_type;
+
+ if (!BITSET_TEST(m->ActiveCounters[gid], cid))
+continue;

It would seem like the extension wou

Re: [Mesa-dev] [PATCH v2 06/15] st/mesa: implement GL_AMD_performance_monitor

2015-03-29 Thread Samuel Pitoiset



On 03/29/2015 11:13 AM, Martin Peres wrote:

On 29/03/2015 04:02, Marek Olšák wrote:
On Sat, Mar 28, 2015 at 9:43 PM, Martin Peres wrote:

On 22/03/2015 17:35, Samuel Pitoiset wrote:

From: Christoph Bumiller 

This is based on the original patch of Christoph Bumiller.
(source: http://people.freedesktop.org/~chrisbmr/perfmon.diff)


It would be nice if you could add "v2: Samuel Pitoiset" and tell what you
changed. Christoph may delete his perfmon.diff and no-one will be able to
diff the diffs :)


As for the Gallium HUD, we keep a list of busy queries in a ring
buffer in order to prevent stalls when reading queries.

Drivers must implement get_driver_query_group_info and
get_driver_query_info in order to enable this extension.

Signed-off-by: Samuel Pitoiset 
---
   src/mesa/Makefile.sources  |   2 +
   src/mesa/state_tracker/st_cb_perfmon.c | 455 +
   src/mesa/state_tracker/st_cb_perfmon.h |  70 +
   src/mesa/state_tracker/st_context.c|   4 +
   src/mesa/state_tracker/st_extensions.c |   3 +
   5 files changed, 534 insertions(+)
   create mode 100644 src/mesa/state_tracker/st_cb_perfmon.c
   create mode 100644 src/mesa/state_tracker/st_cb_perfmon.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 217be9a..e54e618 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -432,6 +432,8 @@ STATETRACKER_FILES = \
 state_tracker/st_cb_flush.h \
 state_tracker/st_cb_msaa.c \
 state_tracker/st_cb_msaa.h \
+   state_tracker/st_cb_perfmon.c \
+   state_tracker/st_cb_perfmon.h \
 state_tracker/st_cb_program.c \
 state_tracker/st_cb_program.h \
 state_tracker/st_cb_queryobj.c \
diff --git a/src/mesa/state_tracker/st_cb_perfmon.c b/src/mesa/state_tracker/st_cb_perfmon.c
new file mode 100644
index 000..fb6774b
--- /dev/null
+++ b/src/mesa/state_tracker/st_cb_perfmon.c
@@ -0,0 +1,455 @@
+/*
+ * Copyright (C) 2013 Christoph Bumiller
+ * Copyright (C) 2015 Samuel Pitoiset
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/**
+ * Performance monitoring counters interface to gallium.
+ */
+
+#include "st_context.h"
+#include "st_cb_bitmap.h"
+#include "st_cb_perfmon.h"
+
+#include "util/bitset.h"
+
+#include "pipe/p_context.h"
+#include "pipe/p_screen.h"
+#include "util/u_memory.h"
+
+/**
+ * Return a PIPE_QUERY_x type >= PIPE_QUERY_DRIVER_SPECIFIC, or -1 if
+ * the driver-specific query doesn't exist.
+ */
+static int
+find_query_type(struct pipe_screen *screen, const char *name)
+{
+   int num_queries;
+   int type = -1;
+   int i;
+
+   num_queries = screen->get_driver_query_info(screen, 0, NULL);
+   if (!num_queries)
+  return type;
+
+   for (i = 0; i < num_queries; i++) {
+  struct pipe_driver_query_info info;
+
+  if (!screen->get_driver_query_info(screen, i, &info))
+ continue;
+
+  if (!strncmp(info.name, name, strlen(name))) {
+ type = info.query_type;
+ break;
+  }
+   }
+   return type;
+}
+
+static bool
+init_perf_monitor(struct gl_context *ctx, struct gl_perf_monitor_object *m)
+{
+   struct st_perf_monitor_object *stm = st_perf_monitor_object(m);
+   struct pipe_screen *screen = st_context(ctx)->pipe->screen;
+   struct pipe_context *pipe = st_context(ctx)->pipe;
+   int gid, cid;
+
+   st_flush_bitmap_cache(st_context(ctx));
+
+   /* Create a query for each active counter. */
+   for (gid = 0; gid < ctx->PerfMonitor.NumGroups; gid++) {
+  const struct gl_perf_monitor_group *g = &ctx->PerfMonitor.Groups[gid];
+  for (cid = 0; cid < g->NumCounters; cid++) {
+ const struct gl_perf_monitor_counter *c = &g->Counters[cid];
+ struct st_perf_counter_object *cntr;
+ int query_type;
+
+ if (!BITSET_TEST(m->ActiveCounters[gid], cid))
+continue;
It would seem like the extension would not work with more than 32 counters
per

Re: [Mesa-dev] [PATCH v2 06/15] st/mesa: implement GL_AMD_performance_monitor

2015-03-29 Thread Samuel Pitoiset



On 03/28/2015 09:43 PM, Martin Peres wrote:

On 22/03/2015 17:35, Samuel Pitoiset wrote:

From: Christoph Bumiller 

This is based on the original patch of Christoph Bumiller.
(source: http://people.freedesktop.org/~chrisbmr/perfmon.diff)


It would be nice if you could add "v2: Samuel Pitoiset" and tell what 
you changed. Christoph may delete his perfmon.diff and no-one will be 
able to diff the diffs :)


Good idea!



As for the Gallium HUD, we keep a list of busy queries in a ring
buffer in order to prevent stalls when reading queries.

Drivers must implement get_driver_query_group_info and
get_driver_query_info in order to enable this extension.

Signed-off-by: Samuel Pitoiset 
---
  src/mesa/Makefile.sources  |   2 +
  src/mesa/state_tracker/st_cb_perfmon.c | 455 +
  src/mesa/state_tracker/st_cb_perfmon.h |  70 +
  src/mesa/state_tracker/st_context.c|   4 +
  src/mesa/state_tracker/st_extensions.c |   3 +
  5 files changed, 534 insertions(+)
  create mode 100644 src/mesa/state_tracker/st_cb_perfmon.c
  create mode 100644 src/mesa/state_tracker/st_cb_perfmon.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 217be9a..e54e618 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -432,6 +432,8 @@ STATETRACKER_FILES = \
  state_tracker/st_cb_flush.h \
  state_tracker/st_cb_msaa.c \
  state_tracker/st_cb_msaa.h \
+state_tracker/st_cb_perfmon.c \
+state_tracker/st_cb_perfmon.h \
  state_tracker/st_cb_program.c \
  state_tracker/st_cb_program.h \
  state_tracker/st_cb_queryobj.c \
diff --git a/src/mesa/state_tracker/st_cb_perfmon.c b/src/mesa/state_tracker/st_cb_perfmon.c
new file mode 100644
index 000..fb6774b
--- /dev/null
+++ b/src/mesa/state_tracker/st_cb_perfmon.c
@@ -0,0 +1,455 @@
+/*
+ * Copyright (C) 2013 Christoph Bumiller
+ * Copyright (C) 2015 Samuel Pitoiset
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/**
+ * Performance monitoring counters interface to gallium.
+ */
+
+#include "st_context.h"
+#include "st_cb_bitmap.h"
+#include "st_cb_perfmon.h"
+
+#include "util/bitset.h"
+
+#include "pipe/p_context.h"
+#include "pipe/p_screen.h"
+#include "util/u_memory.h"
+
+/**
+ * Return a PIPE_QUERY_x type >= PIPE_QUERY_DRIVER_SPECIFIC, or -1 if
+ * the driver-specific query doesn't exist.
+ */
+static int
+find_query_type(struct pipe_screen *screen, const char *name)
+{
+   int num_queries;
+   int type = -1;
+   int i;
+
+   num_queries = screen->get_driver_query_info(screen, 0, NULL);
+   if (!num_queries)
+  return type;
+
+   for (i = 0; i < num_queries; i++) {
+  struct pipe_driver_query_info info;
+
+  if (!screen->get_driver_query_info(screen, i, &info))
+ continue;
+
+  if (!strncmp(info.name, name, strlen(name))) {
+ type = info.query_type;
+ break;
+  }
+   }
+   return type;
+}
+
+static bool
+init_perf_monitor(struct gl_context *ctx, struct gl_perf_monitor_object *m)
+{
+   struct st_perf_monitor_object *stm = st_perf_monitor_object(m);
+   struct pipe_screen *screen = st_context(ctx)->pipe->screen;
+   struct pipe_context *pipe = st_context(ctx)->pipe;
+   int gid, cid;
+
+   st_flush_bitmap_cache(st_context(ctx));
+
+   /* Create a query for each active counter. */
+   for (gid = 0; gid < ctx->PerfMonitor.NumGroups; gid++) {
+  const struct gl_perf_monitor_group *g = &ctx->PerfMonitor.Groups[gid];
+  for (cid = 0; cid < g->NumCounters; cid++) {
+ const struct gl_perf_monitor_counter *c = &g->Counters[cid];
+ struct st_perf_counter_object *cntr;
+ int query_type;
+
+ if (!BITSET_TEST(m->ActiveCounters[gid], cid))
+continue;
It would seem like the extension would not work with more than 32 
counters per group.
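
For reference, a minimal sketch of a bitset that spans several words, which is what testing counter ids above 31 requires. The sketch_* names are hypothetical and this is not Mesa's util/bitset.h. Whether the 32-counter concern applies depends on how ActiveCounters[gid] is sized, which this excerpt does not show.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper names for illustration only. */
typedef unsigned int sketch_word;

#define SKETCH_WORD_BITS (sizeof(sketch_word) * CHAR_BIT)
/* Number of words needed to hold nbits bits, rounded up. */
#define SKETCH_WORDS(nbits) \
   (((nbits) + SKETCH_WORD_BITS - 1) / SKETCH_WORD_BITS)

/* Mark one bit (for example, a counter id) as active. */
static void
sketch_set(sketch_word *set, size_t nbits, size_t bit)
{
   assert(bit < nbits);
   set[bit / SKETCH_WORD_BITS] |= (sketch_word)1 << (bit % SKETCH_WORD_BITS);
}

/* Test one bit; ids past the first word index into later words. */
static bool
sketch_test(const sketch_word *set, size_t nbits, size_t bit)
{
   assert(bit < nbits);
   return (set[bit / SKETCH_WORD_BITS] >> (bit % SKETCH_WORD_BITS)) & 1u;
}
```

With storage sized by SKETCH_WORDS(n), any id below n can be set and tested. With a single word, only ids below SKETCH_WORD_BITS are representable.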


This certainly is not a problem on the NVIDIA side but it may become a 
problem for another
G

[Mesa-dev] [PATCH 2/2] xmlpool: remove the clean target

2015-03-29 Thread Emil Velikov
... by folding it into CLEANFILES. Don't worry about $(LANGS), as it is
essentially the first folder of $(POS), with the latter already handled.

Signed-off-by: Emil Velikov 
---
 src/mesa/drivers/dri/common/xmlpool/Makefile.am | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/common/xmlpool/Makefile.am b/src/mesa/drivers/dri/common/xmlpool/Makefile.am
index 9700499..a6f1652 100644
--- a/src/mesa/drivers/dri/common/xmlpool/Makefile.am
+++ b/src/mesa/drivers/dri/common/xmlpool/Makefile.am
@@ -61,12 +61,10 @@ EXTRA_DIST = \
SConscript
 
 BUILT_SOURCES = options.h
-CLEANFILES = $(MOS) options.h
-
-# All generated files are cleaned up.
-clean:
-   -rm -f $(POT) options.h *~
-   -rm -rf $(LANGS)
+CLEANFILES = \
+   options.h \
+   $(POS) \
+   $(MOS)
 
 # Default target options.h
 options.h: LOCALEDIR := .
-- 
2.3.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] xmlpool: don't forget to ship the MOS

2015-03-29 Thread Emil Velikov
This will allow us to finally remove python from the build-time
dependencies list, provided that you're building from a release
tarball of course :-)

Cc: Bernd Kuhls 
Reported-by: Bernd Kuhls 
Cc: "10.5" 
Signed-off-by: Emil Velikov 
---
 src/mesa/drivers/dri/common/xmlpool/Makefile.am | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/common/xmlpool/Makefile.am b/src/mesa/drivers/dri/common/xmlpool/Makefile.am
index 5557716..9700499 100644
--- a/src/mesa/drivers/dri/common/xmlpool/Makefile.am
+++ b/src/mesa/drivers/dri/common/xmlpool/Makefile.am
@@ -52,7 +52,14 @@ POT=xmlpool.pot
 
 .PHONY: all clean pot po mo
 
-EXTRA_DIST = gen_xmlpool.py options.h t_options.h $(POS) SConscript
+EXTRA_DIST = \
+   gen_xmlpool.py \
+   options.h \
+   t_options.h \
+   $(POS) \
+   $(MOS) \
+   SConscript
+
 BUILT_SOURCES = options.h
 CLEANFILES = $(MOS) options.h
 
-- 
2.3.1



Re: [Mesa-dev] [PATCH 2/2] configure.ac: error out if python/mako is not found when required

2015-03-29 Thread Emil Velikov
On 28 March 2015 at 23:51, Bernd Kuhls  wrote:
> Hi,
>
> Emil Velikov  wrote in news:1427132964-21468-2-
> git-send-email-emil.l.veli...@gmail.com:
>
>> In case of using a distribution tarball (or a dirty git tree) one can
>> have the generated sources locally. Make configure.ac error out
>> otherwise, to alert about the unmet requirement(s) of python/mako.
>
> [...]
>
>> +if test "x$acv_mako_found" = xno; then
>> +if test ! -f "$srcdir/src/glsl/nir/nir_builder_opcodes.h" -o \
>
> I cannot find any reference to this file in the mesa3d 10.5.2 tarball. Is
> it safe to assume that the check for this file can be removed from the
> patch when applied to the 10.5 branch?
>
Indeed. I have noticed in another commit that I have overestimated
when nir_builder_opcodes.h landed.

> When python is missing on the build machine there is still an error using
> these configure options:
>
> --disable-glx --disable-xa --disable-static --enable-shared-glapi
> --with-gallium-drivers=nouveau,r600,svga,swrast --without-dri-drivers
> --disable-dri3 --enable-opengl --enable-gbm --enable-egl
> --with-egl-platforms=drm --enable-gles1 --enable-gles2
>
> make[7]: Entering directory `/home/br/br3/output/build/mesa3d-10.5.2/src/mesa/drivers/dri/common/xmlpool'
> Updating (ca) ca/LC_MESSAGES/options.mo from ca.po.
> Updating (de) de/LC_MESSAGES/options.mo from de.po.
> Updating (es) es/LC_MESSAGES/options.mo from es.po.
> Updating (nl) nl/LC_MESSAGES/options.mo from nl.po.
> Updating (fr) fr/LC_MESSAGES/options.mo from fr.po.
> Updating (sv) sv/LC_MESSAGES/options.mo from sv.po.
>   GEN  options.h
> /bin/bash: ./gen_xmlpool.py: Permission denied
> make[7]: *** [options.h] Error 126
>
> The reason is src/mesa/drivers/dri/common/xmlpool/Makefile.am, line 66
>
> options.h: t_options.h $(MOS)
>
> Files mentioned in $(MOS) are newer after their creation during the build
> than the pre-supplied options.h.
>
> This hack fixes the build error here:
>
> -options.h: t_options.h $(MOS)
> +options.h: t_options.h
>
Grr forgot about this one. Upon a closer look it seems that the MOS
are missing from the tarball causing all this.
Will give it a test and send out a patch.
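
The rebuild described above follows from make's timestamp rule: a target is remade when it is missing or when any prerequisite is newer than it. A minimal sketch of that rule follows; needs_rebuild and its parameters are hypothetical names for illustration, not code from make or Mesa.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/*
 * Return true when a make-style target would be rebuilt: either it does
 * not exist, or at least one prerequisite has a newer modification time.
 */
static bool
needs_rebuild(bool target_exists, time_t target_mtime,
              const time_t *prereq_mtimes, size_t num_prereqs)
{
   assert(prereq_mtimes != NULL || num_prereqs == 0);

   if (!target_exists)
      return true;

   for (size_t i = 0; i < num_prereqs; i++) {
      if (prereq_mtimes[i] > target_mtime)
         return true;
   }
   return false;
}
```

In the reported case, options.h ships in the tarball while the .mo files are generated during the build, so they end up newer than options.h and the rule fires even though nothing changed.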

Thanks.
Emil


Re: [Mesa-dev] [PATCH 14/16] android: add inital NIR build

2015-03-29 Thread Emil Velikov
On 29 March 2015 at 04:17, Kenneth Graunke  wrote:
> On Sunday, March 29, 2015 12:14:50 AM Emil Velikov wrote:
>> On 28 March 2015 at 20:54, Emil Velikov  wrote:
>> > From: Mauro Rossi 
>> >
>> > Required by the i965 driver.
>> >
>> > Cc: "10.5" 
>> > [Emil Velikov: Split from a larger commit]
>> > Signed-off-by: Emil Velikov 
>> > ---
>> >  src/glsl/Android.gen.mk | 62 +++--
>> >  src/glsl/Android.mk |  3 +-
>> >  src/mesa/drivers/dri/Android.mk |  1 +
>> >  3 files changed, 63 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/src/glsl/Android.gen.mk b/src/glsl/Android.gen.mk
>> > index 7ec56d4..82f2bf1 100644
>> > --- a/src/glsl/Android.gen.mk
>> > +++ b/src/glsl/Android.gen.mk
>> > @@ -33,11 +33,21 @@ sources := \
>> > glsl_lexer.cpp \
>> > glsl_parser.cpp \
>> > glcpp/glcpp-lex.c \
>> > -   glcpp/glcpp-parse.c
>> > +   glcpp/glcpp-parse.c \
>> > +   nir/nir_builder_opcodes.h \
>>
>> Seems like the nir_builder_opcodes.h addition came after the 10.5
>> branchpoint. So in order to get this for 10.5 we'll need to split them
>> out into a separate patch.
>>
>> -Emil
>
> Building NIR on 10.5 isn't really worth doing - the version we've got in
> master now is pretty solid, but the version in 10.5 offers only bugs,
> and no actual performance benefits.
>
> I'd just skip it, honestly - I've actually thought about just patching
> it out of 10.5 so people don't report bugs against outdated code.
>
> We're aiming to have NIR up and running in 10.6.
>
Hi Ken

Tbh I would be fine with either solution - patching NIR out of the
build for 10.5 or adding build support for Android. Both will allow us
to have an i965 dri module without unresolved dri symbols, although I
would assume that the latter option will be shorter/easier. About the
functionality (bugs), I would say that if people explicitly set the env
variable then they are asking for what they deserve ;-)

Cheers,
Emil


Re: [Mesa-dev] GL_TEXTURE_2D to wl_buffer

2015-03-29 Thread x414e54
Hi, thank you for the help. I will test.

The only reason for trying to use the texture to wl_buffer directly was
just to get something working and then work back from there to see if it
was an application side issue or not.

Also just a fun experiment.

On Sun, Mar 29, 2015 at 6:04 AM, Jason Ekstrand wrote:

> On Sat, Mar 28, 2015 at 6:57 AM, x414e54  wrote:
>
> > I was originally blitting the texture to the default framebuffer and then
> > trying to use eglSwapBuffers. But for some reason eglSwapBuffers was
> > returning EGL_BAD_SURFACE even though eglMakeCurrent had no errors.
>
> Can you render to it?  That sounds like something is wrong in the way
> you're setting up your EGLSurface.  If you can render you should be able to
> blit.
> --Jason
>

If I call eglSwapBuffers just after the context creation it works fine and
the buffer is committed to the Wayland surface.

Making the context current and performing the glBlit also works fine
(tested using glReadPixels). eglQuerySurface also seems to work fine.

However, eglSwapInterval and eglSwapBuffers return EGL_BAD_CONTEXT and
EGL_BAD_SURFACE respectively.

The code all runs in one function while holding a mutex, to prevent the
context from becoming current on another thread.


Re: [Mesa-dev] [PATCH v2 06/15] st/mesa: implement GL_AMD_performance_monitor

2015-03-29 Thread Martin Peres

On 29/03/2015 04:02, Marek Olšák wrote:

On Sat, Mar 28, 2015 at 9:43 PM, Martin Peres  wrote:

On 22/03/2015 17:35, Samuel Pitoiset wrote:

From: Christoph Bumiller 

This is based on the original patch of Christoph Bumiller.
(source: http://people.freedesktop.org/~chrisbmr/perfmon.diff)


It would be nice if you could add "v2: Samuel Pitoiset" and tell what you
changed. Christoph may delete his perfmon.diff and no-one will be able to
diff the diffs :)


As for the Gallium HUD, we keep a list of busy queries in a ring
buffer in order to prevent stalls when reading queries.

Drivers must implement get_driver_query_group_info and
get_driver_query_info in order to enable this extension.

Signed-off-by: Samuel Pitoiset 
---
   src/mesa/Makefile.sources  |   2 +
   src/mesa/state_tracker/st_cb_perfmon.c | 455 +
   src/mesa/state_tracker/st_cb_perfmon.h |  70 +
   src/mesa/state_tracker/st_context.c|   4 +
   src/mesa/state_tracker/st_extensions.c |   3 +
   5 files changed, 534 insertions(+)
   create mode 100644 src/mesa/state_tracker/st_cb_perfmon.c
   create mode 100644 src/mesa/state_tracker/st_cb_perfmon.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 217be9a..e54e618 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -432,6 +432,8 @@ STATETRACKER_FILES = \
 state_tracker/st_cb_flush.h \
 state_tracker/st_cb_msaa.c \
 state_tracker/st_cb_msaa.h \
+   state_tracker/st_cb_perfmon.c \
+   state_tracker/st_cb_perfmon.h \
 state_tracker/st_cb_program.c \
 state_tracker/st_cb_program.h \
 state_tracker/st_cb_queryobj.c \
diff --git a/src/mesa/state_tracker/st_cb_perfmon.c b/src/mesa/state_tracker/st_cb_perfmon.c
new file mode 100644
index 000..fb6774b
--- /dev/null
+++ b/src/mesa/state_tracker/st_cb_perfmon.c
@@ -0,0 +1,455 @@
+/*
+ * Copyright (C) 2013 Christoph Bumiller
+ * Copyright (C) 2015 Samuel Pitoiset
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/**
+ * Performance monitoring counters interface to gallium.
+ */
+
+#include "st_context.h"
+#include "st_cb_bitmap.h"
+#include "st_cb_perfmon.h"
+
+#include "util/bitset.h"
+
+#include "pipe/p_context.h"
+#include "pipe/p_screen.h"
+#include "util/u_memory.h"
+
+/**
+ * Return a PIPE_QUERY_x type >= PIPE_QUERY_DRIVER_SPECIFIC, or -1 if
+ * the driver-specific query doesn't exist.
+ */
+static int
+find_query_type(struct pipe_screen *screen, const char *name)
+{
+   int num_queries;
+   int type = -1;
+   int i;
+
+   num_queries = screen->get_driver_query_info(screen, 0, NULL);
+   if (!num_queries)
+  return type;
+
+   for (i = 0; i < num_queries; i++) {
+  struct pipe_driver_query_info info;
+
+  if (!screen->get_driver_query_info(screen, i, &info))
+ continue;
+
+  if (!strncmp(info.name, name, strlen(name))) {
+ type = info.query_type;
+ break;
+  }
+   }
+   return type;
+}
+
+static bool
+init_perf_monitor(struct gl_context *ctx, struct gl_perf_monitor_object *m)
+{
+   struct st_perf_monitor_object *stm = st_perf_monitor_object(m);
+   struct pipe_screen *screen = st_context(ctx)->pipe->screen;
+   struct pipe_context *pipe = st_context(ctx)->pipe;
+   int gid, cid;
+
+   st_flush_bitmap_cache(st_context(ctx));
+
+   /* Create a query for each active counter. */
+   for (gid = 0; gid < ctx->PerfMonitor.NumGroups; gid++) {
+  const struct gl_perf_monitor_group *g = &ctx->PerfMonitor.Groups[gid];
+  for (cid = 0; cid < g->NumCounters; cid++) {
+ const struct gl_perf_monitor_counter *c = &g->Counters[cid];
+ struct st_perf_counter_object *cntr;
+ int query_type;
+
+ if (!BITSET_TEST(m->ActiveCounters[gid], cid))
+continue;

It would seem like the extension would not work with more than 32 counters
per group.

This certainly is not a problem on the NVIDIA side b