Re: [Mesa-dev] [PATCH 9/9] [AUTONAK] i965/nir: Call nir_sweep().
Kenneth Graunke writes:

> Mostly a proof of concept that it works; we free the memory shortly
> afterwards anyway, so it's kind of dumb to do this.
>
> The plan is to instead build nir_shaders at link time, rather than when
> compiling each shader specialization, and delete the GLSL IR.

This sounds really interesting -- it might make sense for me, too, if I
had a good way to clone the NIR for doing shader specialization.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] [RFC] egl: propose simple EGL_MESA_image_dma_buf_export v2.4
From: Dave Airlie

At the moment to get an EGL image to a dma-buf file descriptor,
you have to use EGL_MESA_drm_image, and then use libdrm to
convert this to a file descriptor.

This extension just provides an API modelled on EGL_MESA_drm_image,
to return a dma-buf file descriptor.

v2: update spec for new API proposal
    add internal queries to get the fourcc back from intel driver.
v2.1: add gallium pieces.
v2.2: add offsets to spec and API, rename fd->fds, stride->strides in API.
      rewrite spec a bit more, add some q/a
v2.3: add modifiers to query interface and 64-bit type for that (Daniel Stone)
      specify what happens to num fds vs num planes differences. (Chad Versace)
v2.4: fix grammar (Daniel Stone)

Signed-off-by: Dave Airlie
---
 docs/specs/MESA_image_dma_buf_export.txt | 142 +++
 include/EGL/eglmesaext.h                 |   8 ++
 include/GL/internal/dri_interface.h      |   4 +-
 src/egl/drivers/dri2/egl_dri2.c          |  59 -
 src/egl/main/eglapi.c                    |  48 +++
 src/egl/main/eglapi.h                    |  10 +++
 src/egl/main/egldisplay.h                |   2 +
 src/egl/main/eglfallbacks.c              |   5 ++
 src/egl/main/eglmisc.c                   |   2 +
 src/gallium/state_trackers/dri/dri2.c    |  32 ++-
 src/mesa/drivers/dri/i965/intel_screen.c |  25 +-
 11 files changed, 332 insertions(+), 5 deletions(-)
 create mode 100644 docs/specs/MESA_image_dma_buf_export.txt

diff --git a/docs/specs/MESA_image_dma_buf_export.txt b/docs/specs/MESA_image_dma_buf_export.txt
new file mode 100644
index 000..3bc5890
--- /dev/null
+++ b/docs/specs/MESA_image_dma_buf_export.txt
@@ -0,0 +1,142 @@
+Name
+
+    MESA_image_dma_buf_export
+
+Name Strings
+
+    EGL_MESA_image_dma_buf_export
+
+Contributors
+
+    Dave Airlie
+
+Contact
+
+    Dave Airlie (airlied 'at' redhat 'dot' com)
+
+Status
+
+    Proposal
+
+Version
+
+    Version 2
+
+Number
+
+
+
+Dependencies
+
+    Requires EGL 1.4 or later.  This extension is written against the
+    wording of the EGL 1.4 specification.
+
+    EGL_KHR_base_image is required.
+
+    The EGL implementation must be running on a Linux kernel supporting the
+    dma_buf buffer sharing mechanism.
+
+Overview
+
+    This extension provides entry points for integrating EGLImage with the
+    dma-buf infrastructure.  The extension allows creating a Linux dma_buf
+    file descriptor or multiple file descriptors, in the case of multi-plane
+    YUV image, from an EGLImage.
+
+    It is designed to provide the complementary functionality to EGL_EXT_image_dma_buf_import.
+
+IP Status
+
+    Open-source; freely implementable.
+
+New Types
+
+    This is a 64 bit unsigned integer.
+
+    typedef khronos_uint64_t EGLuint64MESA;
+
+
+New Procedures and Functions
+
+    EGLBoolean eglExportDMABUFImageQueryMESA(EGLDisplay dpy,
+                                  EGLImageKHR image,
+                                  int *fourcc,
+                                  int *num_planes,
+                                  EGLuint64MESA *modifiers);
+
+    EGLBoolean eglExportDMABUFImageMESA(EGLDisplay dpy,
+                                        EGLImageKHR image,
+                                        int *fds,
+                                        EGLint *strides,
+                                        EGLint *offsets);
+
+New Tokens
+
+    None
+
+
+Additions to the EGL 1.4 Specification:
+
+    To mirror the import extension, this extension attempts to return
+    enough information to enable an exported dma-buf to be imported
+    via eglCreateImageKHR and EGL_LINUX_DMA_BUF_EXT token.
+
+    Retrieving the information is a two step process, so two APIs
+    are required.
+
+    The first entrypoint
+       EGLBoolean eglExportDMABUFImageQueryMESA(EGLDisplay dpy,
+                                  EGLImageKHR image,
+                                  int *fourcc,
+                                  int *num_planes,
+                                  EGLuint64MESA *modifiers);
+
+    is used to retrieve the pixel format of the buffer, as specified by
+    drm_fourcc.h, the number of planes in the image and the Linux
+    drm modifiers. <fourcc>, <num_planes> and <modifiers> may be NULL,
+    in which case no value is retrieved.
+
+    The second entrypoint retrieves the dma_buf file descriptors,
+    strides and offsets for the image. The caller should pass
+    arrays sized according to the num_planes values retrieved previously.
+    Passing arrays of the wrong size will have undefined results.
+    If the number of fds is less than the number of planes, then
+    subsequent fd slots should contain -1.
+
+    EGLBoolean eglExportDMABUFImageMESA(EGLDisplay dpy,
+                                        EGLImageKHR image,
+                                        int *fds,
+                                        EGLint *strides,
+                                        EGLint *offsets);
+
+    <fds>, <strides>, <offsets> can be NULL if the information isn't
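The two-step flow described in the spec text above (query the plane count first, then export into arrays of that size) can be sketched with mock functions. This is only an illustration of the calling pattern: `struct fake_image`, `fake_export_query()`, `fake_export()` and `export_two_step()` are hypothetical stand-ins invented here, not EGL or Mesa code, and the fd, stride and offset values are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PLANES 4

/* Hypothetical stand-in for an EGLImage; not a real EGL type. */
struct fake_image {
    int fourcc;      /* drm_fourcc.h style format code */
    int num_planes;  /* planes in the image */
    int num_fds;     /* dma-bufs backing it; may be fewer than planes */
};

/* Mock of the query step: any NULL out-pointer is simply skipped. */
static int fake_export_query(const struct fake_image *img, int *fourcc,
                             int *num_planes, uint64_t *modifiers)
{
    if (fourcc)
        *fourcc = img->fourcc;
    if (num_planes)
        *num_planes = img->num_planes;
    if (modifiers)
        for (int i = 0; i < img->num_planes; i++)
            modifiers[i] = 0; /* placeholder modifier value */
    return 1;
}

/* Mock of the export step: fd slots beyond num_fds are set to -1. */
static int fake_export(const struct fake_image *img, int *fds,
                       int32_t *strides, int32_t *offsets)
{
    for (int i = 0; i < img->num_planes; i++) {
        if (fds)
            fds[i] = i < img->num_fds ? 100 + i : -1; /* made-up fd numbers */
        if (strides)
            strides[i] = 256;
        if (offsets)
            offsets[i] = i * 4096;
    }
    return 1;
}

/* Query first, then export into caller arrays that hold `capacity` planes.
 * Returns the plane count, or -1 if the arrays are too small. */
static int export_two_step(const struct fake_image *img, int *fds,
                           int32_t *strides, int32_t *offsets, int capacity)
{
    int num_planes = 0;

    assert(img != NULL);
    if (!fake_export_query(img, NULL, &num_planes, NULL))
        return -1;
    if (num_planes > capacity)
        return -1;
    if (!fake_export(img, fds, strides, offsets))
        return -1;
    return num_planes;
}
```

With a two-plane image backed by a single buffer, the second fd slot comes back as -1, which is the case the v2.3 changelog entry about "num fds vs num planes" covers.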
Re: [Mesa-dev] [PATCH 3/3] gallium: Add tgsi_to_nir to get a nir_shader for a TGSI shader.
Kenneth Graunke writes:

> On Friday, March 27, 2015 01:54:32 PM Eric Anholt wrote:
>> This will be used by the VC4 driver for doing device-independent
>> optimization, and hopefully eventually replacing its whole IR.  It also
>> may be useful to other drivers for the same reason.
>
> Hi Eric!
>
> I have a bunch of comments below, but overall this looks great.
>
> You should probably have someone who knows TGSI better than I do review
> it, but for what it's worth, this is:
>
> Reviewed-by: Kenneth Graunke

Thanks!  There was definitely useful feedback in here, and I've taken
most of it.

>> +/* LOG - Approximate Logarithm Base 2
>> + *  dst.x = \lfloor\log_2{|src.x|}\rfloor
>> + *  dst.y = \frac{|src.x|}{2^{\lfloor\log_2{|src.x|}\rfloor}}
>> + *  dst.z = \log_2{|src.x|}
>> + *  dst.w = 1.0
>> + */
>> +static void
>> +ttn_log(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
>> +{
>> +   nir_ssa_def *abs_srcx = nir_fabs(b, ttn_channel(b, src[0], X));
>> +   nir_ssa_def *log2 = nir_flog2(b, abs_srcx);
>> +
>> +   ttn_move_dest_masked(b, dest, nir_ffloor(b, log2), TGSI_WRITEMASK_X);
>> +   ttn_move_dest_masked(b, dest,
>> +                        nir_fdiv(b, abs_srcx, nir_fexp2(b, nir_ffloor(b, log2))),
>
> You're generating two copies of floor(log2) here, which will have to be
> CSE'd later.  In prog_to_nir, I created a temporary and used it in both
> places:
>
>    nir_ssa_def *floor_log2 = nir_ffloor(b, log2);
>
> We're generating tons of rubbish for NIR to optimize anyway, so it's not
> a big deal...but...may as well do the trivial improvement.

I much more expect the whole mess except for the dst.z computation to
get DCEed away, so it's just one extra DCE out of so many.  (and we
generate lots of copy prop to avoid all throughout this mess, which I've
considered short-circuiting in nir_builder some day).
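As a worked instance of the LOG formulas quoted above, take src.x = -10 (an arbitrary value chosen here for illustration). The floor term, written t below, is the value the review suggests computing once and reusing for both dst.x and dst.y:

```latex
\begin{aligned}
t     &= \lfloor \log_2 |src.x| \rfloor = \lfloor \log_2 10 \rfloor = 3 \\
dst.x &= t = 3 \\
dst.y &= \frac{|src.x|}{2^{t}} = \frac{10}{8} = 1.25 \\
dst.z &= \log_2 |src.x| = \log_2 10 \approx 3.3219 \\
dst.w &= 1.0
\end{aligned}
```

Only dst.z needs the unrounded logarithm, which fits the expectation in the reply that everything except the dst.z computation is usually dead code.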
>> +static void
>> +ttn_sle(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
>> +{
>> +   ttn_move_dest(b, dest, nir_sge(b, src[1], src[0]));
>> +}
>
> I've got code here to generate b2f(fge(...)) instead of sge(...) since I
> didn't want to bother implementing it in my driver, and figured the b2fs
> might be able to get optimized away.
>
> That said, I suppose we could probably just add lowering transformations
> that turn sge -> b2f(fge(...)) when options->native_integers is set, and
> delete my code...

For me an SGE in hardware is:

fsub.sf(null, src0, src1)
mov.nc(dest, 1.0)
mov.ns(dest, 0)

while your plan would be...  oh wait.  I didn't even have a b2f
implementation because TGSI doesn't do that (they just AND the bool with
1.0).  But an FGE is:

fsub.sf(null, src0, src1)
mov.nc(dest, ~0)
mov.ns(dest, 0)

so any more instructions would be worse.

>> +static void
>> +ttn_xpd(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
>> +{
>> +   ttn_move_dest_masked(b, dest,
>> +                        nir_fsub(b,
>> +                                 nir_fmul(b,
>> +                                          ttn_swizzle(b, src[0], Y, Z, X, X),
>> +                                          ttn_swizzle(b, src[1], Z, X, Y, X)),
>> +                                 nir_fmul(b,
>> +                                          ttn_swizzle(b, src[1], Y, Z, X, X),
>> +                                          ttn_swizzle(b, src[0], Z, X, Y, X))),
>> +                        TGSI_WRITEMASK_XYZ);
>> +   ttn_move_dest_masked(b, dest, nir_imm_float(b, 1.0), TGSI_WRITEMASK_W);
>> +}
>> +
>> +static void
>> +ttn_dp2a(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
>> +{
>> +   ttn_move_dest(b, dest,
>> +                 ttn_channel(b, nir_fadd(b,
>> +                                         ttn_channel(b, nir_fdot2(b, src[0],
>> +                                                                  src[1]),
>> +                                                     X),
>
> Do you really need to do ttn_channel(b, ..., X) on a fdot2 result?  It's
> already a scalar value.  Same comment applies to the below four.
>
> I should probably delete that from prog_to_nir as well.

Good catch, the cleanups for scalar in the builder have obsoleted this,
I think.
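The SLE/SGE/FGE relationship discussed above can be checked with scalar stand-ins. This is a plain C sketch of the semantics only, not NIR or TGSI code; the helper names are chosen here for illustration, and the encodings (1.0/0.0 for the float form, ~0/0 for the integer boolean) follow the instruction sequences quoted in the reply.

```c
#include <assert.h>
#include <stdint.h>

/* SGE: float result, 1.0 if a >= b, else 0.0. */
static float sge(float a, float b)
{
    return a >= b ? 1.0f : 0.0f;
}

/* SLE(a, b) is SGE with the operands swapped, as in ttn_sle(). */
static float sle(float a, float b)
{
    return sge(b, a);
}

/* FGE: integer boolean result, ~0 for true and 0 for false. */
static uint32_t fge(float a, float b)
{
    return a >= b ? ~0u : 0u;
}

/* b2f: convert an integer boolean to 1.0 or 0.0. */
static float b2f(uint32_t v)
{
    return v ? 1.0f : 0.0f;
}

/* The lowering mentioned in the review: sge(a, b) == b2f(fge(a, b)). */
static void check_lowering(float a, float b)
{
    assert(sge(a, b) == b2f(fge(a, b)));
    assert(sle(a, b) == b2f(fge(b, a)));
}
```

Both forms give the same answer; the trade-off raised in the reply is purely about instruction count on hardware where the comparison already produces either encoding directly.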
>
>> +                                         src[2]),
>> +                             X));
>> +}

>> +static void
>> +ttn_ucmp(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
>> +{
>> +   ttn_move_dest(b, dest, nir_bcsel(b,
>> +                                    nir_ine(b, src[0], nir_imm_float(b, 0.0)),
>
> Doing nir_imm_int(b, 0) here would make more sense.

Yeah, now that I have it :)

>> +static void
>> +ttn_kill_if(nir_builder *b, nir_op op, nir_alu_dest dest, nir_ssa_def **src)
>> +{
>> +   nir_if *if_stmt = nir_if_create(b->shader);
>> +   if_stmt->condition =
>> +      nir_src_for_ssa(nir_bany4(b, nir_flt(b, src[0], nir_imm_float(b, 0.0))));
>> +   nir_cf_node_insert_end(b->cf_node_list, &if_stmt->cf_node);
>> +
>> +   nir_intrinsic_instr *discard =
>> +      nir_int
[Mesa-dev] swrast: Correct pixel draw span endpoints computation, rid vertical lines
Hello mesa-dev,

I've created a changeset for the legacy swrast_dri.so driver which fixes
vertical lines at the GL_MAX_TEXTURE_SIZE boundaries caused by
miscomputed column ranges.  Please consider.

Basically, I was very meticulous about the mathematical formulas in
terms of rounding according to the OpenGL definition for both positive
and negative xfactor and yfactor.  That fixed the vertical lines, but
after creating a suggested Piglit test for glPixelZoom I noticed the
driver still wasn't behaving according to some intricacies of the
standard formulas.

The other advantage of being more detailed about the formulas is sort of
sweeping numerical issues into a corner, i.e., the behavior of ceil()
and floor() functions.  In order to pass the expected tests, I needed to
add a tolerance of about 0.4 to the rounding functions.  This isn't
anywhere near the, say, ULP (units in the last place) called out in
ARB-shader-precision, but it is the necessary value I've found when
working with single-precision float arithmetic.  It's surprising how
quickly precision can be lost by the division that is typically used in
computing xfactor/yfactor before the user makes the glPixelZoom() call.

I've collected image files comparing the legacy driver and patched
legacy, as well as Gallium driver here:

https://bugs.freedesktop.org/show_bug.cgi?id=89586

The PNG files for Piglit test results, more than anything, should
provide a good understanding of the issues.  The Gallium driver shows a
slight discrepancy from the patched legacy driver.  I'm not saying which
is correct, if any, just pointing out the slight difference.

Patch file is attached.  I'd be happy to answer any questions or tweak
the code.
Regards,

Dan Sebald

From 5eba613e22a1096302c46df395f9a6d67f6b8625 Mon Sep 17 00:00:00 2001
From: Daniel J Sebald
Date: Sun, 29 Mar 2015 22:52:08 -0500
Subject: [PATCH] swrast: Correct pixel draw span endpoints computation, rid vertical lines

Change the start/end indices computation to use ceiling, not an int
cast, of the encompassing rectangle.  Solves problem of dropped pixels
at GL_MAX_TEXTURE_SIZE/SWRAST_MAX_WIDTH intervals that created vertical
lines when -1.0 < xfactor < 1.0.  Also fine tune unzoom formula and add
a macro SPAN_LOOP_X() for all pixel zoom operations.
---
 src/mesa/swrast/s_zoom.c | 252 +
 1 files changed, 162 insertions(+), 90 deletions(-)

diff --git a/src/mesa/swrast/s_zoom.c b/src/mesa/swrast/s_zoom.c
index ab22652..067d1d6 100644
--- a/src/mesa/swrast/s_zoom.c
+++ b/src/mesa/swrast/s_zoom.c
@@ -34,11 +34,36 @@
 #include "s_zoom.h"
 
 
+#define SPAN_LOOP_X(OPERATION) \
+   do { \
+      if (ctx->Pixel.ZoomX > 0) { \
+         GLint i; \
+         for (i = 0; i < zoomedWidth; i++) { \
+            GLint j = positive_unzoom_x(ctx->Pixel.ZoomX, imgX, x0 + i) - spanX; \
+            OPERATION; \
+         } \
+      } \
+      else { \
+         GLint i; \
+         for (i = 0; i < zoomedWidth; i++) { \
+            GLint j = negative_unzoom_x(ctx->Pixel.ZoomX, imgX, x0 + i) - spanX; \
+            OPERATION; \
+         } \
+      } \
+   } while (0)
+
+
+/* These are meant to address numerical effects of xfactor/yfactor being
+ * single-precision floating point numbers, as opposed to real numbers.
+ */
+#define EPSFLOOR 0.4
+#define EPSCEIL 0.4
+
 /**
  * Compute the bounds of the region resulting from zooming a pixel span.
  * The resulting region will be entirely inside the window/scissor bounds
  * so no additional clipping is needed.
- * \param imageX, imageY  position of the mage being drawn (gl WindowPos)
+ * \param imageX, imageY  position of the image being drawn (gl WindowPos)
  * \param spanX, spanY  position of span being drawing
  * \param width  number of pixels in span
  * \param x0, x1  returned X bounds of zoomed region [x0, x1)
@@ -47,7 +72,7 @@
  */
 static GLboolean
 compute_zoomed_bounds(struct gl_context *ctx, GLint imageX, GLint imageY,
-                      GLint spanX, GLint spanY, GLint width,
+                      GLint spanX, GLint spanY, GLint spanWidth,
                       GLint *x0, GLint *x1, GLint *y0, GLint *y1)
 {
    const struct gl_framebuffer *fb = ctx->DrawBuffer;
@@ -58,35 +83,41 @@ compute_zoomed_bounds(struct gl_context *ctx, GLint imageX, GLint imageY,
 
    /*
     * Compute destination columns: [c0, c1)
+    *
+    * c0 - Pixels on left rectangle edge and greater are included
+    * c1 - Pixels on right rectangle edge and greater are excluded
     */
-   c0 = imageX + (GLint) ((spanX - imageX) * ctx->Pixel.ZoomX);
-   c1 = imageX + (GLint) ((spanX + width - imageX) * ctx->Pixel.ZoomX);
-   if (c1 < c0) {
-      /* swap */
+   c0 = imageX + (GLint) ceilf((spanX - imageX) * ctx->Pixel.ZoomX - EPSCEIL);
+   c1 = imageX + (GLint) ceilf((spanX + spanWidth - imageX) * ctx->Pixel.ZoomX - EPSCEIL);
+   if (ctx->Pixel.ZoomX < 0) {
+      /* swap edge roles */
       G
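The change to the column computation in the last hunk above (int cast replaced by a ceiling with a tolerance) can be tried in isolation. The sketch below is a simplified stand-in, not the driver code: `ceil_to_int()`, `column_by_cast()` and `column_by_ceil()` are names made up here, and `ceil_to_int()` replaces ceilf() only so the example has no libm dependency. The 0.4 tolerance is the EPSCEIL value from the patch.

```c
#include <assert.h>

#define EPSCEIL 0.4f  /* tolerance value taken from the patch */

/* Integer ceiling for values well inside int range; used here in place
 * of ceilf(). */
static int ceil_to_int(float v)
{
    int i;

    assert(v > -1.0e9f && v < 1.0e9f);
    i = (int) v;                        /* truncates toward zero */
    return (v > (float) i) ? i + 1 : i;
}

/* Old formula: scale the offset from the image origin, then truncate. */
static int column_by_cast(int imageX, int col, float zoomX)
{
    return imageX + (int) ((col - imageX) * zoomX);
}

/* New formula: scale, subtract the tolerance, then take the ceiling. */
static int column_by_ceil(int imageX, int col, float zoomX)
{
    return imageX + ceil_to_int((col - imageX) * zoomX - EPSCEIL);
}
```

For a zoom of 0.5 and a source column 5 pixels from the image origin, the scaled edge sits at 2.5: the cast gives column 2 while the tolerant ceiling gives column 3. When the scaled edge lands exactly on 2.0, both give 2.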
Re: [Mesa-dev] [PATCH] glsl: fix unreachable(!"") to unreachable("")
Reviewed-by: Ilia Mirkin

On Mon, Mar 30, 2015 at 1:05 AM, Tapani Pälli wrote:
> Correct error with commit 151fb1e where assert was renamed
> to unreachable without removing ! from string argument.
>
> Signed-off-by: Tapani Pälli
> ---
>  src/glsl/loop_controls.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/glsl/loop_controls.cpp b/src/glsl/loop_controls.cpp
> index d7f0b28..51804bb 100644
> --- a/src/glsl/loop_controls.cpp
> +++ b/src/glsl/loop_controls.cpp
> @@ -139,7 +139,7 @@ calculate_iterations(ir_rvalue *from, ir_rvalue *to, ir_rvalue *increment,
>           iter = new(mem_ctx) ir_constant(double(iter_value + bias[i]));
>           break;
>        default:
> -          unreachable(!"Unsupported type for loop iterator.");
> +          unreachable("Unsupported type for loop iterator.");
>        }
>
>        ir_expression *const mul =
> --
> 2.1.0
[Mesa-dev] [PATCH] glsl: fix unreachable(!"") to unreachable("")
Correct error with commit 151fb1e where assert was renamed
to unreachable without removing ! from string argument.

Signed-off-by: Tapani Pälli
---
 src/glsl/loop_controls.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/loop_controls.cpp b/src/glsl/loop_controls.cpp
index d7f0b28..51804bb 100644
--- a/src/glsl/loop_controls.cpp
+++ b/src/glsl/loop_controls.cpp
@@ -139,7 +139,7 @@ calculate_iterations(ir_rvalue *from, ir_rvalue *to, ir_rvalue *increment,
          iter = new(mem_ctx) ir_constant(double(iter_value + bias[i]));
          break;
       default:
-          unreachable(!"Unsupported type for loop iterator.");
+          unreachable("Unsupported type for loop iterator.");
       }

       ir_expression *const mul =
--
2.1.0
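Why the leftover `!` matters can be seen by looking at the condition such a macro would test. The sketch below assumes that unreachable(str) checks something equivalent to assert(!str) in debug builds; that shape is an assumption made here for illustration, and `UNREACHABLE_CHECK` and the helper functions are hypothetical names, not Mesa's definitions.

```c
#include <assert.h>

/* Assumed shape of the debug check inside unreachable(str): assert(!str).
 * This macro only models the tested condition. */
#define UNREACHABLE_CHECK(str) (!str)

/* Old spelling: the extra '!' turns the condition into !!"...", which is 1,
 * so the assertion can never fire. */
static int check_with_bang(void)
{
    return UNREACHABLE_CHECK(!"Unsupported type for loop iterator.");
}

/* Fixed spelling: !"..." is 0, so the assertion fires when reached. */
static int check_without_bang(void)
{
    return UNREACHABLE_CHECK("Unsupported type for loop iterator.");
}

/* Models a debug build reaching the default: case with the old spelling:
 * the assert passes silently instead of reporting the message. */
static void reach_with_bang(void)
{
    assert(UNREACHABLE_CHECK(!"Unsupported type for loop iterator."));
}
```

A string literal is a non-null pointer, so a single `!` yields 0 and a double `!!` yields 1; the patch removes the stray negation so the check fails as intended.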
[Mesa-dev] [Bug 89586] Drivers/DRI/swrast
https://bugs.freedesktop.org/show_bug.cgi?id=89586

Dan Sebald changed:

           What    |Removed |Added
 ----------------------------------
 Attachment #114599|0       |1
        is obsolete|        |

--- Comment #45 from Dan Sebald ---
Created attachment 114715
  --> https://bugs.freedesktop.org/attachment.cgi?id=114715&action=edit
swrast Gallium with patch Piglit test images

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
[Mesa-dev] [Bug 89586] Drivers/DRI/swrast
https://bugs.freedesktop.org/show_bug.cgi?id=89586

Dan Sebald changed:

           What    |Removed |Added
 ----------------------------------
 Attachment #114598|0       |1
        is obsolete|        |

--- Comment #44 from Dan Sebald ---
Created attachment 114714
  --> https://bugs.freedesktop.org/attachment.cgi?id=114714&action=edit
swrast Gallium Piglit test images
[Mesa-dev] [Bug 89586] Drivers/DRI/swrast
https://bugs.freedesktop.org/show_bug.cgi?id=89586

Dan Sebald changed:

           What    |Removed |Added
 ----------------------------------
 Attachment #114597|0       |1
        is obsolete|        |

--- Comment #43 from Dan Sebald ---
Created attachment 114713
  --> https://bugs.freedesktop.org/attachment.cgi?id=114713&action=edit
swrast legacy with patch Piglit test images
[Mesa-dev] [Bug 89586] Drivers/DRI/swrast
https://bugs.freedesktop.org/show_bug.cgi?id=89586

Dan Sebald changed:

           What    |Removed |Added
 ----------------------------------
 Attachment #114596|0       |1
        is obsolete|        |

--- Comment #42 from Dan Sebald ---
Created attachment 114712
  --> https://bugs.freedesktop.org/attachment.cgi?id=114712&action=edit
swrast legacy Piglit test images
[Mesa-dev] [Bug 89586] Drivers/DRI/swrast
https://bugs.freedesktop.org/show_bug.cgi?id=89586

Dan Sebald changed:

           What    |Removed |Added
 ----------------------------------
 Attachment #114619|0       |1
        is obsolete|        |

--- Comment #41 from Dan Sebald ---
Created attachment 114711
  --> https://bugs.freedesktop.org/attachment.cgi?id=114711&action=edit
Piglit pixelzoom test suite

An update to test results after making some modifications to SWRAST
legacy changeset and the Piglit gl-1.0-pixelzoom tests:

(1) swrast-legacy: The repository legacy swrast_dri.so driver, which my
    system and others are using and which exhibits vertical lines.

(2) swrast-legacy-patch: The legacy swrast_dri.so patched with the
    changeset I originally attached to this bug report.

(3) swrast-gallium: The repository Gallium/llvm swrast_dri.so driver.

(4) swrast-gallium-patch: The Gallium swrast_dri.so patched by removing
    the line of code that limits the size of the image.

TEST                        (1)    (2)    (3)    (4)

positive monotonic x        fail   pass   fail   fail
positive edge x             fail   pass   fail   fail
positive over/underrun x    fail   pass   pass   pass
negative monotonic x        pass   pass   fail   pass
negative edge x             fail   pass   fail   fail
negative over/underrun x    fail   pass   pass   pass
positive monotonic y        fail   pass   fail   fail
positive edge y             fail   pass   fail   fail
positive over/underrun y    fail   pass   pass   pass
negative monotonic y        pass   pass   fail   pass
negative edge y             pass   pass   fail   fail
negative over/underrun y    fail   pass   pass   pass
[Mesa-dev] [Bug 89586] Drivers/DRI/swrast
https://bugs.freedesktop.org/show_bug.cgi?id=89586

Dan Sebald changed:

           What    |Removed |Added
 ----------------------------------
 Attachment #114342|0       |1
        is obsolete|        |

--- Comment #40 from Dan Sebald ---
Created attachment 114710
  --> https://bugs.freedesktop.org/attachment.cgi?id=114710&action=edit
Changeset to fix vertical lines and fine tune positive_unzoom_x() and negative_unzoom_x()

Attached is an update to the SWRAST legacy changeset.  With this change,
the driver passes all tests in Piglit gl-1.0-pixelzoom check.

The main addition to the changeset over the last changeset is the
inclusion of a tolerance for the ceil() and floor() functions.  The
issue is that with single precision float division and multiplication
the formula

  (xz - xr) / xfactor

can be off by a fair amount, on the order of 10e-5.  I printed out some
numbers the driver was using in cases where the gl-1.0-pixelzoom
alternating-line test was failing.  The numbers agree exactly with this
example result:

octave:3> single(53)/single(400) * single(400)
ans = 52.961853027

I put in a tolerance of 0.4 on the rounding functions.  By my very rough
estimate, I think it is possible to scale input images with dimension of
about 100,000 down to typical screen sizes without the added tolerance
causing its own sort of artifact.
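The divide-then-multiply round trip from the Octave example above can be reproduced in C, together with the effect of the 0.4 tolerance on a ceiling. This is a sketch under assumptions, not the driver code: `roundtrip()`, `tolerant_ceil()` and `ceil_to_int()` are names invented here, `ceil_to_int()` stands in for ceilf(), and the exact value the round trip returns depends on the platform's floating-point evaluation, so the sketch only relies on it being close to 53.

```c
#include <assert.h>

#define EPSCEIL 0.4f  /* tolerance value from the changeset */

/* Integer ceiling for values well inside int range; stands in for ceilf(). */
static int ceil_to_int(float v)
{
    int i;

    assert(v > -1.0e9f && v < 1.0e9f);
    i = (int) v;                        /* truncates toward zero */
    return (v > (float) i) ? i + 1 : i;
}

/* Compute a zoom factor by division and apply it again, rounding each
 * step to single precision (volatile keeps intermediates as floats). */
static float roundtrip(float pixel, float size)
{
    volatile float factor = pixel / size;
    volatile float back = factor * size;
    return back;
}

/* Ceiling that ignores errors smaller than the tolerance. */
static int tolerant_ceil(float v)
{
    return ceil_to_int(v - EPSCEIL);
}
```

A value that is mathematically 53 but comes out slightly above it would be pushed to 54 by a plain ceiling; with the tolerance subtracted first, anything within 0.4 above 53 still maps to 53.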
Re: [Mesa-dev] Good compiler literature?
On Sun, Mar 29, 2015 at 9:51 PM, Connor Abbott wrote:
> On Sun, Mar 29, 2015 at 5:54 PM, Thomas Helland wrote:
>> Does anyone have suggestions for good literature on compilers?
>> Since GPU's and CPU's are a bit different there are probably
>> books that are better suited than others for GPUs?
>> I have what is probably Norway's biggest library on the subject to rent
>> books from, so I guess I should be able to find most suggestions there.
>>
>> Regards
>> Thomas
>
> Hi,
>
> Unfortunately there seems to be a bit of a dearth of books when it
> comes to compiler optimizations and backend things, especially when it
> comes to SSA -- which is the most interesting part! I would skip the
> dragon book unless you're interested in the front-end things (parsing,
> symbol tables, etc.); I haven't read a lot of it myself, but
> apparently the latest edition only has a passing mention of SSA and
> others don't mention it at all. GCC has a list of books:
> https://gcc.gnu.org/wiki/ListOfCompilerBooks and apparently some of
> them are better in that regard. Personally, I learned about a lot of
> this stuff through papers. I have a list of them here:
>
> http://cwabbottdev.blogspot.com/2013/06/compiler-theory-links.html
>
> although it may be a little out of date, and some of the things might
> have been more useful back when I was working on lima. Suggestions for
> adding to the list are very welcome.

Ok, since that page was a little lacking I added a few more links.
Also, as to GPU-specific things... well, there really isn't much of
anything I'm aware of that's out there. There's one paper called
"Divergence Analysis and Optimizations" that's specific to SIMD machines
that execute multiple threads at once, like GPU's, and we'd like to
implement that in the future to give us more precise information about
how we can move derivatives and textures that take an implicit
derivative, since they can't be moved out of uniform control flow, but
other than that there isn't anything I'm aware of. Most of the other
things carry over to both CPU's and GPU's. One other GPU-specific
problem I'm aware of is how to go out of SSA on classic vector-based
architectures like i965 vec4 VS without introducing extra copies due to
writemasked operations, but afaik there isn't a publicly-available
description of someone's solution -- the paper has yet to be written :).
Re: [Mesa-dev] [PATCH] nir: acknowledge the existence of nir_builder.h
Emil Velikov writes:

> The header was added with commit 2a135c470e3(nir: Add an ALU op builder
> kind of like ir_builder.h) but did not made it into to the sources list,
> and its dependency of nir_builder_opcodes.h was missing.
>
> Fortunately it remained unused until resent commit faf6106c6f6(nir:

"recent"

> Implement a Mesa IR -> NIR translator.)
>
> Cc: Kenneth Graunke
> Cc: Eric Anholt
> Signed-off-by: Emil Velikov
> ---
>
> Not sure how the out-of-tree build was able to finish without this,
> although the commit looks like a must have if we want the file in the
> tarball.
>
> Based on top of the earlier Android series.
>
> -Emil
>
> ---
>  src/glsl/Android.gen.mk   | 2 ++
>  src/glsl/Makefile.am      | 2 ++
>  src/glsl/Makefile.sources | 1 +
>  3 files changed, 5 insertions(+)
>
> diff --git a/src/glsl/Android.gen.mk b/src/glsl/Android.gen.mk
> index 82f2bf1..2f54da4 100644
> --- a/src/glsl/Android.gen.mk
> +++ b/src/glsl/Android.gen.mk
> @@ -97,6 +97,8 @@ $(intermediates)/nir/nir_builder_opcodes.h: $(nir_builder_opcodes_deps)
>  	@mkdir -p $(dir $@)
>  	@$(MESA_PYTHON2) $(nir_builder_opcodes_gen) $< > $@
>
> +$(LOCAL_PATH)/nir/nir_builder.h: $(intermediates)/nir/nir_builder_opcodes.h
> +
>  nir_constant_expressions_gen := $(LOCAL_PATH)/nir/nir_constant_expressions.py
>  nir_constant_expressions_deps := \
>  	$(LOCAL_PATH)/nir/nir_opcodes.py \
> diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
> index ed90366..58af166 100644
> --- a/src/glsl/Makefile.am
> +++ b/src/glsl/Makefile.am
> @@ -244,6 +244,8 @@ nir/nir_builder_opcodes.h: nir/nir_opcodes.py nir/nir_builder_opcodes_h.py
>  	$(MKDIR_P) nir; \
>  	$(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_builder_opcodes_h.py > $@
>
> +nir/nir_builder.h: nir/nir_builder_opcodes.h
> +
>  nir/nir_constant_expressions.c: nir/nir_opcodes.py nir/nir_constant_expressions.py nir/nir_constant_expressions.h
>  	$(MKDIR_P) nir; \
>  	$(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_constant_expressions.py > $@

This is weird -- nir_builder.h isn't a build target that needs to be
regenerated.  What's it for?

> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> index 8d29c55..c3b63d1 100644
> --- a/src/glsl/Makefile.sources
> +++ b/src/glsl/Makefile.sources
> @@ -22,6 +22,7 @@ NIR_FILES = \
>  	nir/glsl_to_nir.h \
>  	nir/nir.c \
>  	nir/nir.h \
> +	nir/nir_builder.h \
>  	nir/nir_constant_expressions.h \
>  	nir/nir_dominance.c \
>  	nir/nir_from_ssa.c \
> --
> 2.3.1

This hunk is certainly needed.
Re: [Mesa-dev] [PATCH 1/4] i965: Split out brw_<stage>_populate_key into their own functions
On Friday, March 20, 2015 05:49:06 PM Carl Worth wrote:
> This commit splits portions of the existing brw_upload_vs_prog and
> brw_upload_gs_prog function into new brw_vs_populate_key and
> brw_gs_populate_key functions. This follows the same style as is
> already present for all other stages, (see brw_wm_populate_key, etc.).
>
> This commit is intended to have no functional change. It exists in
> preparation for some upcoming code movement in preparation for the
> shader cache.
> ---
>  src/mesa/drivers/dri/i965/brw_ff_gs.c |  7 +++--
>  src/mesa/drivers/dri/i965/brw_gs.c    | 39 ++-
>  src/mesa/drivers/dri/i965/brw_vs.c    | 58 +--
>  3 files changed, 64 insertions(+), 40 deletions(-)

Patches 1-3 are:
Reviewed-by: Kenneth Graunke

(assuming you like my suggestion for the rename in patch 3)
Re: [Mesa-dev] Good compiler literature?
On Sun, Mar 29, 2015 at 5:54 PM, Thomas Helland wrote:
> Does anyone have suggestions for good literature on compilers?
> Since GPU's and CPU's are a bit different there are probably
> books that are better suited than others for GPUs?
> I have what is probably Norway's biggest library on the subject to rent
> books from, so I guess I should be able to find most suggestions there.
>
> Regards
> Thomas

Hi,

Unfortunately there seems to be a bit of a dearth of books when it
comes to compiler optimizations and backend things, especially when it
comes to SSA -- which is the most interesting part! I would skip the
dragon book unless you're interested in the front-end things (parsing,
symbol tables, etc.); I haven't read a lot of it myself, but
apparently the latest edition only has a passing mention of SSA and
others don't mention it at all. GCC has a list of books:
https://gcc.gnu.org/wiki/ListOfCompilerBooks and apparently some of
them are better in that regard. Personally, I learned about a lot of
this stuff through papers. I have a list of them here:

http://cwabbottdev.blogspot.com/2013/06/compiler-theory-links.html

although it may be a little out of date, and some of the things might
have been more useful back when I was working on lima. Suggestions for
adding to the list are very welcome.
Re: [Mesa-dev] [PATCH 3/4] i965: Rename do_<stage>_prog to brw_<stage>_compile
On Friday, March 20, 2015 11:28:28 PM Carl Worth wrote:
> On Fri, Mar 20 2015, Chris Forbes wrote:
> > I think that having both the existing `struct brw_vs_compile` and a
> > function with the same name is going to cause confusion. (same with
> > the other non-fs stages)
>
> In an earlier version of the patch I had brw_vs_do_compile, (there is a
> "do" precedent in the code being replaced here). I could go back to that
> if it helps.
>
> -Carl

How about brw_compile_vs_prog?  It sounds natural and doesn't appear to
conflict with anything.
Re: [Mesa-dev] [PATCH v3 0/9] Support multiple state pipelines for i965
On Friday, March 20, 2015 05:28:55 PM Jordan Justen wrote:
> git://people.freedesktop.org/~jljusten/mesa i965-pipelines-v3
>
> v2:
>  * Rename brw->atoms[] to render_atoms
>  * Add brw->compute_atoms[]
>  * Replace brw_pipeline_first_atom with brw_get_pipeline_atoms
>
> v3:
>  * Avoid changing pipelines' state bits in upload path
>  * brw_clear_dirty_bits => brw_render_state_finished
>  * brw->compute_atoms[] starts with size of 1
>  * Deprecate and remove brw->state.dirty
>
> Jordan Justen (9):
>   i965/state: Rename brw_upload_state to brw_upload_render_state
>   i965/state: Rename brw_clear_dirty_bits to brw_render_state_finished
>   i965/state: Support multiple pipelines in brw->num_atoms
>   i965/state: Create separate dirty state bits for each pipeline
>   i965/state: Only upload render programs for render state uploads
>   i965/state: Add compute pipeline with empty atom lists
>   i965/state: Don't use brw->state.dirty.brw
>   i965/state: Don't use brw->state.dirty.mesa
>   i965/state: Remove brw->state.dirty

Series is:
Reviewed-by: Kenneth Graunke

Thanks, Jordan!
Re: [Mesa-dev] [PATCH] glsl: Update the #line behaviour on GLSL 3.30+ and GLSL ES+
On Monday, March 23, 2015 09:56:29 AM Antia Puentes wrote:
> From GLSL 3.30 and GLSL ES 1.00 on, after processing the line
> directive (including its new-line), the implementation should
> behave as if it is compiling at the line number passed as
> argument. In previous versions, it behaved as if compiling
> at the passed line number + 1.
>
> Partially fixes https://bugs.freedesktop.org/show_bug.cgi?id=88815
> ---
>  src/glsl/glsl_lexer.ll | 17 +
>  1 file changed, 17 insertions(+)
>
> diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll
> index f0e047e..2785ed1 100644
> --- a/src/glsl/glsl_lexer.ll
> +++ b/src/glsl/glsl_lexer.ll
> @@ -187,6 +187,15 @@ HASH ^{SPC}#{SPC}
>                      * one-based.
>                      */
>                     yylineno = strtol(ptr, &ptr, 0) - 1;
> +
> +                   /* From GLSL 3.30 and GLSL ES on, after processing the
> +                    * line directive (including its new-line), the implementation
> +                    * will behave as if it is compiling at the line number passed
> +                    * as argument. It was line number + 1 in older specifications.
> +                    */
> +                   if (yyextra->is_version(330, 100))
> +                      yylineno--;
> +
>                     yylloc->source = strtol(ptr, NULL, 0);
>                  }
>  {HASH}line{SPCP}{INT}{SPC}$ {
> @@ -202,6 +211,14 @@ HASH ^{SPC}#{SPC}
>                      * one-based.
>                      */
>                     yylineno = strtol(ptr, &ptr, 0) - 1;
> +
> +                   /* From GLSL 3.30 and GLSL ES on, after processing the
> +                    * line directive (including its new-line), the implementation
> +                    * will behave as if it is compiling at the line number passed
> +                    * as argument. It was line number + 1 in older specifications.
> +                    */
> +                   if (yyextra->is_version(330, 100))
> +                      yylineno--;
>                  }
>  ^{SPC}#{SPC}pragma{SPCP}debug{SPC}\({SPC}on{SPC}\) {
>                     BEGIN PP;
>

Thanks for taking the time to make our error messages better :)

Reviewed-by: Kenneth Graunke
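The arithmetic behind the extra decrement in the patch quoted above can be followed in a small model. This is a sketch under stated assumptions, not the lexer itself: `line_after_directive()` is a name invented here, and it assumes that the newline ending the directive advances the counter by one and that the reported line is yylineno + 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns the one-based line number reported for the source line that
 * follows "#line n".  `glsl_330_or_es` models is_version(330, 100). */
static int line_after_directive(int n, bool glsl_330_or_es)
{
    int yylineno;

    assert(n >= 0);
    yylineno = n - 1;       /* lexer line numbers are zero-based */
    if (glsl_330_or_es)
        yylineno--;         /* the extra decrement added by the patch */
    yylineno++;             /* assumed: the directive's newline bumps the counter */
    return yylineno + 1;    /* assumed: reported line is yylineno + 1 */
}
```

Under these assumptions, the line after `#line 10` is reported as 11 for older versions and as 10 from GLSL 3.30 and GLSL ES on, matching the behaviour the commit message describes.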
Re: [Mesa-dev] [PATCH 0/3] Hash-table improvements, V3
On Sun, Mar 29, 2015 at 2:05 PM, Thomas Helland wrote: > Here's the latest round of fixup on the hash-table patches. > I think I've gotten all the review feedback incorporated now. > These patches give a nice little boost, indicated in each commit. > As a side effect of upping the minimum size of the table and set > there is now also less spamming of rzalloc and friends. > _int_malloc is cut from 935'000 to 847'000 samples. > calloc is cut from 683'000 to 655'000 samples. > _int_free is cut from 644'000 to 617'000 samples > The series reduced shader-db run-time with NIR on my collection > from 180 seconds to about 160 seconds. > > Thomas Helland (3): > util: Change hash_table to use quadratic probing > util: Change util/set to use quadratic probing > util: Use 32 bit integer hash function for pointers > > src/util/hash_table.c | 132 > ++ > src/util/hash_table.h | 3 +- > src/util/set.c| 124 ++- > src/util/set.h| 3 +- > 4 files changed, 109 insertions(+), 153 deletions(-) > > -- > 2.3.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev I don't see any performance data on each commit, did you leave it out by accident? Other than that, the series is Reviewed-by: Connor Abbott but you'll want to get Eric to review it too. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: respect the source number set by #line
On Monday, March 23, 2015 09:56:52 AM Antia Puentes wrote: > From GLSL 1.30.10, section 3.3 (Preprocessor): > "#line line source-string-number ... After processing this directive > (including its new-line), the implementation will behave as if it is > compiling at ... source string number source-string-number. Subsequent > source strings will be numbered sequentially, until another #line > directive overrides that numbering." > > In the previous implementation the source number was always zero. > Subsequent source strings are still not numbered sequentially, because > in the glShaderSource implementation we are concatenating the source code > strings into one long string. > > Partially fixes https://bugs.freedesktop.org/show_bug.cgi?id=88815 > --- > src/glsl/glsl_lexer.ll | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll > index 8dc3d10..f0e047e 100644 > --- a/src/glsl/glsl_lexer.ll > +++ b/src/glsl/glsl_lexer.ll > @@ -36,14 +36,13 @@ static int classify_identifier(struct _mesa_glsl_parse_state *, const char *); > > #define YY_USER_ACTION \ > do { \ > - yylloc->source = 0;\ >yylloc->first_column = yycolumn + 1; \ >yylloc->first_line = yylloc->last_line = yylineno + 1; \ >yycolumn += yyleng;\ >yylloc->last_column = yycolumn + 1;\ > } while(0); > > -#define YY_USER_INIT yylineno = 0; yycolumn = 0; > +#define YY_USER_INIT yylineno = 0; yycolumn = 0; yylloc->source = 0; > > /* A macro for handling reserved words and keywords across language versions. > * > Looks good to me! We could probably concatenate the strings together but put "#line 0 i" between each source string's content, if we wanted to fix the bug completely. Seems simple enough. Reviewed-by: Kenneth Graunke signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
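The "#line 0 i" idea from the review could look roughly like the sketch below. It is not the glShaderSource implementation: `concat_with_line_directives` is a hypothetical name, the line value 0 is taken literally from the suggestion, and the sketch assumes that no token straddles two source strings, since it adds a new-line after any string that lacks one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: joins the source strings into one buffer, prefixing
 * string i with "#line 0 i". The caller frees the result. */
static char *
concat_with_line_directives(const char *const *strings, int count)
{
   size_t total = 1; /* terminating NUL */
   size_t used = 0;
   char *out;
   int i;

   /* 32 bytes per string is enough for "#line 0 <int>\n" plus one '\n'. */
   for (i = 0; i < count; i++)
      total += strlen(strings[i]) + 32;

   out = malloc(total);
   if (out == NULL)
      return NULL;
   out[0] = '\0';

   for (i = 0; i < count; i++) {
      size_t len = strlen(strings[i]);
      /* Keep the next directive at the start of a line. */
      bool needs_newline = len > 0 && strings[i][len - 1] != '\n';

      used += (size_t) snprintf(out + used, total - used, "#line 0 %d\n%s%s",
                                i, strings[i], needs_newline ? "\n" : "");
   }

   return out;
}
```

Each chunk then carries its own source-string number, so the lexer change in the patch above would pick it up without glShaderSource having to keep the strings separate.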
[Mesa-dev] [Bug 89819] WebGL Conformance swrast failure in conformance/uniforms/uniform-default-values.html
https://bugs.freedesktop.org/show_bug.cgi?id=89819 Bug ID: 89819 Summary: WebGL Conformance swrast failure in conformance/uniforms/uniform-default-values.html Product: Mesa Version: unspecified Hardware: Other OS: All Status: NEW Severity: normal Priority: medium Component: Mesa core Assignee: mesa-dev@lists.freedesktop.org Reporter: lukebe...@hotmail.com QA Contact: mesa-dev@lists.freedesktop.org The swrast renderer requires DRAW_USE_LLVM=false to pass the OGLES 2.0 Uniform Default Values test. Steps to reproduce: 1. set LIBGL_ALWAYS_SOFTWARE=1 2. visit https://www.khronos.org/registry/webgl/sdk/tests/conformance/uniforms/uniform-default-values.html?webglVersion=1 3. set DRAW_USE_LLVM=false 4. https://www.khronos.org/registry/webgl/sdk/tests/conformance/uniforms/uniform-default-values.html?webglVersion=1 llvmpipe results: testing: samplerCube fragment shaderFAIL Error in program linking:null FAIL uniform is not zero default value should be zero FAIL at (0, 0) expected: 0,255,0,255 was 0,0,0,0 test test by setting value FAIL at (0, 0) expected: 255,0,0,255 was 0,0,0,0 re-linking should reset to defaults FAIL at (0, 0) expected: 0,255,0,255 was 0,0,0,0 FAIL getError expected: NO_ERROR. Was CONTEXT_LOST_WEBGL : should be no GL errors ... See Bug 78875 for the i965 -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 05/11] i965/inst: Add notify and gateway_subfuncid fields
On Wednesday, March 25, 2015 05:53:43 PM Ben Widawsky wrote: > On Sun, Mar 22, 2015 at 06:49:15PM -0700, Jordan Justen wrote: > > These fields will be used when emitting a send for the barrier function. > > > > Reference: IVB PRM Volume 4, Part 2, Section 1.1.1 Message Descriptor > > > > Signed-off-by: Jordan Justen > > Reviewed-by: Chris Forbes > > --- > > src/mesa/drivers/dri/i965/brw_inst.h | 18 +++--- > > 1 file changed, 15 insertions(+), 3 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_inst.h b/src/mesa/drivers/dri/i965/brw_inst.h > > index 372aa2b..8701771 100644 > > --- a/src/mesa/drivers/dri/i965/brw_inst.h > > +++ b/src/mesa/drivers/dri/i965/brw_inst.h > > @@ -322,6 +322,9 @@ FJ(gen4_jump_count, 111, 96, brw->gen < 6) > > FC(gen4_pop_count, 115, 112, brw->gen < 6) > > /** @} */ > > > > +/* Message descriptor bits */ > > +#define MD(x) (x + 96) > > + > > /** > > * Fields for SEND messages: > > * @{ > > @@ -347,6 +350,12 @@ FF(header_present, > > /* 6: */ 115, 115, > > /* 7: */ 115, 115, > > /* 8: */ 115, 115) > > +FF(notify, > > + /* 4: doesn't exist */ -1, -1, -1, -1, > > + /* 5: doesn't exist */ -1, -1, > > + /* 6: doesn't exist */ -1, -1, > > + /* 7: */ MD(16), MD(15), > > + /* 8: */ MD(16), MD(15)) > > I'm pretty sure notify has existed for much longer than Gen7. I understand that > you don't implement it, but "doesn't exist is at least a little confusing." > (Also, if it does exist all the way back, you could potentially just use F()) The "Notify" bit in the "Message Gateway" message descriptor has existed since the original 965 - it is 16:15 on all generations. So I agree with Ben, this should be F(notify, MD(16), MD(15)). Since this only applies to Message Gateway messages, it might make sense to call it gateway_notify or some such...I've tried to prefix the other descriptor bits with "math_", "sampler_", "urb_", and so on. > If you end up modifying stuff, should you throw in AckReq? 
> > > FF(function_control, > > /* 4: */ 111, 96, > > /* 4.5: */ 111, 96, > > @@ -354,6 +363,12 @@ FF(function_control, > > /* 6: */ 114, 96, > > /* 7: */ 114, 96, > > /* 8: */ 114, 96) > > +FF(gateway_subfuncid, > > + /* 4: doesn't exist */ -1, -1, -1, -1, > > + /* 5: doesn't exist */ -1, -1, > > + /* 6: doesn't exist */ -1, -1, > > + /* 7: */ MD(2), MD(0), > > + /* 8: */ MD(2), MD(0)) Likewise, these exist on older platforms too... FF(gateway_subfuncid, /* 4: */ MD(1), MD(0), /* 4.5: */ MD(1), MD(0), /* 5: */ MD(1), MD(0), /* 2:0, but bit 2 is reserved MBZ */ /* 6: */ MD(2), MD(0), /* 7: */ MD(2), MD(0), /* 8: */ MD(2), MD(0)) With those changes, this would get a: Reviewed-by: Kenneth Graunke > > FF(sfid, > > /* 4: */ 123, 120, /* called msg_target */ > > /* 4.5 */ 123, 120, > > @@ -364,9 +379,6 @@ FF(sfid, > > FC(base_mrf, 27, 24, brw->gen < 6); > > /** @} */ > > > > -/* Message descriptor bits */ > > -#define MD(x) (x + 96) > > - > > /** > > * URB message function control bits: > > * @{ > > I am not a huge fan of MD(x) but I suppose you didn't create that yourself. I'd > be in favor of killing it at some point. > > Patches up through this one are: > Reviewed-by: Ben Widawsky > > (I think 1 & 2 make more sense as a single patch, but meh) signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
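For readers unfamiliar with the MD() convention discussed in this thread, here is a small sketch of how a descriptor-relative bit range maps onto a 128-bit instruction. Only `MD(x)` and the bit positions (notify at 16:15, sub-function ID at 2:0) come from the thread; `struct inst128`, `inst_bits` and the two accessors are hypothetical stand-ins, not the generated brw_inst.h accessors.

```c
#include <assert.h>
#include <stdint.h>

/* Message descriptor bit x sits at instruction bit 96 + x. */
#define MD(x) ((x) + 96)

/* Hypothetical stand-in for a 128-bit instruction: data[0] holds bits 0..63,
 * data[1] holds bits 64..127. */
struct inst128 {
   uint64_t data[2];
};

/* Extracts bits high..low (inclusive); the range must not cross a word. */
static uint64_t
inst_bits(const struct inst128 *inst, unsigned high, unsigned low)
{
   assert(high >= low && high < 128 && high / 64 == low / 64);

   const unsigned width = high - low + 1;
   const uint64_t mask = width == 64 ? ~0ull : (1ull << width) - 1;

   return (inst->data[low / 64] >> (low % 64)) & mask;
}

/* Notify field of a Message Gateway descriptor, bits 16:15. */
static unsigned
gateway_notify(const struct inst128 *inst)
{
   return (unsigned) inst_bits(inst, MD(16), MD(15));
}

/* Sub-function ID of a Message Gateway descriptor, bits 2:0. */
static unsigned
gateway_subfuncid(const struct inst128 *inst)
{
   return (unsigned) inst_bits(inst, MD(2), MD(0));
}
```

Because the positions are the same on every generation in this sketch, a single range per field is enough, which is the point of the F() suggestion above.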
[Mesa-dev] [Bug 89818] WebGL Conformance conformance/textures/texture-size-limit.html -> OUT_OF_MEMORY
https://bugs.freedesktop.org/show_bug.cgi?id=89818 Bug ID: 89818 Summary: WebGL Conformance conformance/textures/texture-size-limit.html -> OUT_OF_MEMORY Product: Mesa Version: unspecified Hardware: Other OS: All Status: NEW Severity: normal Priority: medium Component: Mesa core Assignee: mesa-dev@lists.freedesktop.org Reporter: lukebe...@hotmail.com QA Contact: mesa-dev@lists.freedesktop.org With the mesa llvmpipe or r300 in either Firefox or Chrome, navigate to: https://www.khronos.org/registry/webgl/sdk/tests/webgl-conformance-tests.html It fails with the mesa drivers, but passes with the nvidia proprietary driver on my GTX650. The intel driver was fixed here Bug 78770. Sample failure output: failed: getError expected: NO_ERROR. Was OUT_OF_MEMORY : there should be no error for level: 12 1x1 failed: getError expected: NO_ERROR. Was OUT_OF_MEMORY : there should be no error for level: 11 2x2 failed: getError expected: NO_ERROR. Was OUT_OF_MEMORY : there should be no error for level: 10 4x4 failed: getError expected: NO_ERROR. Was OUT_OF_MEMORY : there should be no error for level: 9 8x8 failed: getError expected: NO_ERROR. Was OUT_OF_MEMORY : there should be no error for level: 8 16x16 -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] gallivm: add gather support to sampler interface
From: Roland Scheidegger Luckily thanks to the revamped interface this is a lot less work now... --- src/gallium/auxiliary/gallivm/lp_bld_sample.h | 18 + src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 31 +-- src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 6 ++--- 3 files changed, 34 insertions(+), 21 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.h b/src/gallium/auxiliary/gallivm/lp_bld_sample.h index b95ee6f..640b7e0 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample.h +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.h @@ -76,13 +76,21 @@ enum lp_sampler_lod_control { }; +enum lp_sampler_op_type { + LP_SAMPLER_OP_TEXTURE, + LP_SAMPLER_OP_FETCH, + LP_SAMPLER_OP_GATHER +}; + + #define LP_SAMPLER_SHADOW (1 << 0) #define LP_SAMPLER_OFFSETS(1 << 1) -#define LP_SAMPLER_FETCH (1 << 2) -#define LP_SAMPLER_LOD_CONTROL_SHIFT3 -#define LP_SAMPLER_LOD_CONTROL_MASK (3 << 3) -#define LP_SAMPLER_LOD_PROPERTY_SHIFT 5 -#define LP_SAMPLER_LOD_PROPERTY_MASK (3 << 5) +#define LP_SAMPLER_OP_TYPE_SHIFT2 +#define LP_SAMPLER_OP_TYPE_MASK (3 << 2) +#define LP_SAMPLER_LOD_CONTROL_SHIFT4 +#define LP_SAMPLER_LOD_CONTROL_MASK (3 << 4) +#define LP_SAMPLER_LOD_PROPERTY_SHIFT 6 +#define LP_SAMPLER_LOD_PROPERTY_MASK (3 << 6) struct lp_sampler_params { diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c index 82ef359..962f478 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c @@ -2391,9 +2391,10 @@ lp_build_sample_soa_code(struct gallivm_state *gallivm, LLVMValueRef tex_width, newcoords[5]; enum lp_sampler_lod_property lod_property; enum lp_sampler_lod_control lod_control; + enum lp_sampler_op_type op_type; LLVMValueRef lod_bias = NULL; LLVMValueRef explicit_lod = NULL; - boolean is_fetch = !!(sample_key & LP_SAMPLER_FETCH); + boolean op_is_tex; if (0) { enum pipe_format fmt = static_texture_state->format; @@ -2404,6 +2405,10 @@ 
lp_build_sample_soa_code(struct gallivm_state *gallivm, LP_SAMPLER_LOD_PROPERTY_SHIFT; lod_control = (sample_key & LP_SAMPLER_LOD_CONTROL_MASK) >> LP_SAMPLER_LOD_CONTROL_SHIFT; + op_type = (sample_key & LP_SAMPLER_OP_TYPE_MASK) >> + LP_SAMPLER_OP_TYPE_SHIFT; + + op_is_tex = op_type == LP_SAMPLER_OP_TEXTURE; if (lod_control == LP_SAMPLER_LOD_BIAS) { lod_bias = lod; @@ -2534,7 +2539,7 @@ lp_build_sample_soa_code(struct gallivm_state *gallivm, (gallivm_debug & GALLIVM_DEBUG_NO_RHO_APPROX) && (static_texture_state->target == PIPE_TEXTURE_CUBE || static_texture_state->target == PIPE_TEXTURE_CUBE_ARRAY) && - (!is_fetch && mip_filter != PIPE_TEX_MIPFILTER_NONE)) { + (op_is_tex && mip_filter != PIPE_TEX_MIPFILTER_NONE)) { /* * special case for using per-pixel lod even for implicit lod, * which is generally never required (ok by APIs) except to please @@ -2548,23 +2553,23 @@ lp_build_sample_soa_code(struct gallivm_state *gallivm, } else if (lod_property == LP_SAMPLER_LOD_PER_ELEMENT || (explicit_lod || lod_bias || derivs)) { - if ((is_fetch && target != PIPE_BUFFER) || - (!is_fetch && mip_filter != PIPE_TEX_MIPFILTER_NONE)) { + if ((!op_is_tex && target != PIPE_BUFFER) || + (op_is_tex && mip_filter != PIPE_TEX_MIPFILTER_NONE)) { bld.num_mips = type.length; bld.num_lods = type.length; } - else if (!is_fetch && min_img_filter != mag_img_filter) { + else if (op_is_tex && min_img_filter != mag_img_filter) { bld.num_mips = 1; bld.num_lods = type.length; } } /* TODO: for true scalar_lod should only use 1 lod value */ - else if ((is_fetch && explicit_lod && target != PIPE_BUFFER) || -(!is_fetch && mip_filter != PIPE_TEX_MIPFILTER_NONE)) { + else if ((!op_is_tex && explicit_lod && target != PIPE_BUFFER) || +(op_is_tex && mip_filter != PIPE_TEX_MIPFILTER_NONE)) { bld.num_mips = num_quads; bld.num_lods = num_quads; } - else if (!is_fetch && min_img_filter != mag_img_filter) { + else if (op_is_tex && min_img_filter != mag_img_filter) { bld.num_mips = 1; bld.num_lods = num_quads; } @@ 
-2658,7 +2663,7 @@ lp_build_sample_soa_code(struct gallivm_state *gallivm, texel_out); } - else if (is_fetch) { + else if (op_type == LP_SAMPLER_OP_FETCH) { lp_build_fetch_texel(&bld, texture_index, newcoords, lod, offsets, texel_out); @@ -2786,18 +2791,18 @@ lp_build_sample_soa_code(struct gallivm_state *gallivm, (gallivm_debug & GALLIVM_DEBUG_NO_RHO_APPROX) &&
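The sample_key layout introduced by this patch can be exercised in isolation. The sketch below copies the enum and the shift/mask defines from the lp_bld_sample.h hunk; the two helper functions are hypothetical and exist only to show how the two-bit op type is packed and recovered without disturbing the neighbouring fields.

```c
#include <assert.h>

enum lp_sampler_op_type {
   LP_SAMPLER_OP_TEXTURE,
   LP_SAMPLER_OP_FETCH,
   LP_SAMPLER_OP_GATHER
};

#define LP_SAMPLER_SHADOW             (1 << 0)
#define LP_SAMPLER_OFFSETS            (1 << 1)
#define LP_SAMPLER_OP_TYPE_SHIFT      2
#define LP_SAMPLER_OP_TYPE_MASK       (3 << 2)
#define LP_SAMPLER_LOD_CONTROL_SHIFT  4
#define LP_SAMPLER_LOD_CONTROL_MASK   (3 << 4)
#define LP_SAMPLER_LOD_PROPERTY_SHIFT 6
#define LP_SAMPLER_LOD_PROPERTY_MASK  (3 << 6)

/* Hypothetical helper: stores op in bits 3:2 and leaves other bits alone. */
static unsigned
sample_key_set_op(unsigned key, enum lp_sampler_op_type op)
{
   return (key & ~(unsigned) LP_SAMPLER_OP_TYPE_MASK) |
          ((unsigned) op << LP_SAMPLER_OP_TYPE_SHIFT);
}

/* Hypothetical helper: mirrors the decode in lp_build_sample_soa_code(). */
static enum lp_sampler_op_type
sample_key_get_op(unsigned key)
{
   return (enum lp_sampler_op_type)
      ((key & LP_SAMPLER_OP_TYPE_MASK) >> LP_SAMPLER_OP_TYPE_SHIFT);
}
```

Replacing the single LP_SAMPLER_FETCH bit with a two-bit field is what moves the LOD control and LOD property fields up by one bit each.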
[Mesa-dev] [PATCH 3/3] llvmpipe: enable ARB_texture_gather
From: Roland Scheidegger Just announce support for 4 components. While here also increase the max/min texel offsets (the limit is completely artificial, was chosen because that's what other hardware did, however there's other drivers using larger limits). Over a thousand little piglits skip->pass. --- src/gallium/drivers/llvmpipe/lp_screen.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c index 4b45725..f4ba596 100644 --- a/src/gallium/drivers/llvmpipe/lp_screen.c +++ b/src/gallium/drivers/llvmpipe/lp_screen.c @@ -180,10 +180,10 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param) /* this is a lie could support arbitrary large offsets */ case PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET: case PIPE_CAP_MIN_TEXEL_OFFSET: - return -8; + return -32; case PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET: case PIPE_CAP_MAX_TEXEL_OFFSET: - return 7; + return 31; case PIPE_CAP_CONDITIONAL_RENDER: return 1; case PIPE_CAP_TEXTURE_BARRIER: @@ -249,6 +249,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param) case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT: return 1; case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS: + return 4; case PIPE_CAP_TEXTURE_GATHER_SM5: case PIPE_CAP_TEXTURE_QUERY_LOD: case PIPE_CAP_SAMPLE_SHADING: -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] gallivm: implement TG4 for ARB_texture_gather
From: Roland Scheidegger This is quite trivial, essentially just follow all the same code you'd use with linear min/mag (and no mip) filter, then just skip the filtering after looking up the texels in favor of direct assignment of the right channel to the result. (This is though not true for the multi-offset version if we'd want to support it - for this would probably need to do something along the lines of 4x nearest sampling due to the necessity of doing coord wrapping individually per texel.) Supports multi-channel formats. From the SM5 gather cap bit, should support non-constant offsets, plus shadow comparisons (the former untested), but not component selection (should be easy to implement but all this stuff is not really exposable anyway for now). --- src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 137 +- src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 36 -- 2 files changed, 133 insertions(+), 40 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c index 962f478..ff508e2 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c @@ -840,6 +840,7 @@ lp_build_masklerp2d(struct lp_build_context *bld, */ static void lp_build_sample_image_linear(struct lp_build_sample_context *bld, + boolean is_gather, LLVMValueRef size, LLVMValueRef linear_mask, LLVMValueRef row_stride_vec, @@ -853,6 +854,7 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld, LLVMBuilderRef builder = bld->gallivm->builder; struct lp_build_context *ivec_bld = &bld->int_coord_bld; struct lp_build_context *coord_bld = &bld->coord_bld; + struct lp_build_context *texel_bld = &bld->texel_bld; const unsigned dims = bld->dims; LLVMValueRef width_vec; LLVMValueRef height_vec; @@ -875,7 +877,16 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld, seamless_cube_filter = (bld->static_texture_state->target == PIPE_TEXTURE_CUBE || 
bld->static_texture_state->target == PIPE_TEXTURE_CUBE_ARRAY) && bld->static_sampler_state->seamless_cube_map; - accurate_cube_corners = ACCURATE_CUBE_CORNERS && seamless_cube_filter; + /* +* XXX I don't know how this is really supposed to work with gather. From GL +* spec wording (not gather specific) it sounds like the 4th missing texel +* should be an average of the other 3, hence for gather could return this. +* This is however NOT how the code here works, which just fixes up the +* weights used for filtering instead. And of course for gather there is +* no filter to tweak... +*/ + accurate_cube_corners = ACCURATE_CUBE_CORNERS && seamless_cube_filter && + !is_gather; lp_build_extract_image_sizes(bld, &bld->int_size_bld, @@ -1160,10 +1171,11 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld, data_ptr, mipoffsets, neighbors[0][1]); if (dims == 1) { + assert(!is_gather); if (bld->static_sampler_state->compare_mode == PIPE_TEX_COMPARE_NONE) { /* Interpolate two samples from 1D image to produce one color */ for (chan = 0; chan < 4; chan++) { -colors_out[chan] = lp_build_lerp(&bld->texel_bld, s_fpart, +colors_out[chan] = lp_build_lerp(texel_bld, s_fpart, neighbors[0][0][chan], neighbors[0][1][chan], 0); @@ -1174,7 +1186,7 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld, cmpval0 = lp_build_sample_comparefunc(bld, coords[4], neighbors[0][0][0]); cmpval1 = lp_build_sample_comparefunc(bld, coords[4], neighbors[0][1][0]); /* simplified lerp, AND mask with weight and add */ - colors_out[0] = lp_build_masklerp(&bld->texel_bld, s_fpart, + colors_out[0] = lp_build_masklerp(texel_bld, s_fpart, cmpval0, cmpval1); colors_out[1] = colors_out[2] = colors_out[3] = colors_out[0]; } @@ -1301,15 +1313,38 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld, } if (bld->static_sampler_state->compare_mode == PIPE_TEX_COMPARE_NONE) { - /* Bilinear interpolate the four samples from the 2D image / 3D slice */ - for (chan = 0; chan < 4; 
chan++) { -colors0[chan] = lp_build_lerp_2d(&bld->texel_bld, - s_fpart, t_fpart, - neighbors[0][0][chan], - neighbors[0][1][chan], -
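The "skip the filtering in favor of direct assignment" step described in the commit message can be pictured with scalars instead of LLVM vectors. This is a sketch under assumptions, not the gallivm code: `gather_2x2` is a hypothetical function, `neighbors` is indexed [y][x][channel] as in the linear path, and the output order (i0,j1), (i1,j1), (i1,j0), (i0,j0) is the one given for textureGather in the GL specification.

```c
#include <assert.h>

/* Hypothetical scalar model of a gather: instead of lerping the four
 * neighbouring texels, copy one channel of each into the result. */
static void
gather_2x2(float neighbors[2][2][4], unsigned chan, float out[4])
{
   assert(chan < 4);

   out[0] = neighbors[1][0][chan]; /* (i0, j1) */
   out[1] = neighbors[1][1][chan]; /* (i1, j1) */
   out[2] = neighbors[0][1][chan]; /* (i1, j0) */
   out[3] = neighbors[0][0][chan]; /* (i0, j0) */
}
```

The texel addressing and wrapping stay the same as for bilinear filtering, which is why the patch can reuse the linear sampling path.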
Re: [Mesa-dev] [PATCH 3/3] util: Use 32 bit integer hash function for pointers
On 2015-03-29 13:28:02, Thomas Helland wrote: > (Forgot to send to list) > > That is indeed an issue. > I found the original article on the wayback machine and it > doesn't state anything particular wrt license. > However, it seems to be used in a LOT of projects. > (javascript, chromium, hiphop-php, kde, +++) > > I found this webpage that gives some more insight: > http://burtleburtle.net/bob/hash/integer.html > (it seems to be the website of Bob Jenkins) > It states the following: > > "The hashes on this page (with the possible exception > of HashMap.java's) are all public domain. > So are the ones on Thomas Wang's page. > Thomas recommends citing the author > and page when using them." Is there a reference to Thomas directly indicating this? On that same page, Bob provides some hash functions of his own, and directly states that they are public domain. He would probably be in a better position to speak for his code than Thomas's. :) He does indicate that Thomas's version is faster than all of his. -Jordan > So it looks like it is public domain. > I'll add a credit to Thomas, and link > to Bob Jenkins' webpage . > Does that sound like an acceptable solution? > > 2015-03-29 20:51 GMT+02:00 Jordan Justen : > > On 2015-03-29 11:05:40, Thomas Helland wrote: > >> Since a pointer is basically just an int we can use integer hashing. > >> This one is taken from https://gist.github.com/badboy/6267743 > > > > There doesn't seem to be a license associated with this code, nor is > > it indicated that it is public domain code. > > > > -Jordan > > > >> A google search seems to suggest this is a common and good algorithm. > >> Since it operates 32 bits at a time it is really effective. > >> assert that we are hashing 32bit aligned pointers. 
> >> > >> Signed-off-by: Thomas Helland > >> --- > >> src/util/hash_table.c | 24 ++-- > >> 1 file changed, 22 insertions(+), 2 deletions(-) > >> > >> diff --git a/src/util/hash_table.c b/src/util/hash_table.c > >> index 24184c0..54d04ef 100644 > >> --- a/src/util/hash_table.c > >> +++ b/src/util/hash_table.c > >> @@ -393,6 +393,15 @@ _mesa_hash_table_random_entry(struct hash_table *ht, > >> return NULL; > >> } > >> > >> +static inline uint32_t > >> +hash_32bit_int(uint32_t a) { > >> + a = (a ^ 61) ^ (a >> 16); > >> + a = a + (a << 3); > >> + a = a ^ (a >> 4); > >> + a = a * 0x27d4eb2d; > >> + a = a ^ (a >> 15); > >> + return a; > >> +} > >> > >> /** > >> * Quick FNV-1a hash implementation based on: > >> @@ -406,8 +415,19 @@ _mesa_hash_table_random_entry(struct hash_table *ht, > >> uint32_t > >> _mesa_hash_data(const void *data, size_t size) > >> { > >> - return _mesa_fnv32_1a_accumulate_block(_mesa_fnv32_1a_offset_bias, > >> - data, size); > >> + uint32_t hash = _mesa_fnv32_1a_offset_bias; > >> + const uint32_t *ints = (const uint32_t *) data; > >> + > >> + assert((size % 4) == 0); > >> + > >> + uint32_t i = size / 4; > >> + > >> + while (i-- != 0) { > >> + hash ^= hash_32bit_int(*ints); > >> + ints++; > >> + } > >> + > >> + return hash; > >> } > >> > >> /** FNV-1a string hash implementation */ > >> -- > >> 2.3.4 > >> > >> ___ > >> mesa-dev mailing list > >> mesa-dev@lists.freedesktop.org > >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0.5/3] util/tests: Expand collision test for hash table
Add a test to exercise a worst case collision scenario that may cause us to not be able to find an empty slot in the table even though it is not full. This hits the bug in my last revision of the series converting the hash table to quadratic probing. Signed-off-by: Thomas Helland --- src/util/tests/hash_table/collision.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/src/util/tests/hash_table/collision.c b/src/util/tests/hash_table/collision.c index 69a4c29..ba284d8 100644 --- a/src/util/tests/hash_table/collision.c +++ b/src/util/tests/hash_table/collision.c @@ -89,6 +89,20 @@ main(int argc, char **argv) entry2 = _mesa_hash_table_search_pre_hashed(ht, bad_hash, str2); assert(entry2->key == str2); + + _mesa_hash_table_destroy(ht, NULL); + + /* Try inserting multiple items with the same hash +* This exercises a worst case scenario where we might fail to find +* an empty slot in the table, even though there is free space +*/ + ht = _mesa_hash_table_create(NULL, NULL, _mesa_key_string_equal); + for (i = 0; i < 100; i++) { + char *key = malloc(10); + sprintf(key, "spam%d", i); + assert(_mesa_hash_table_insert_pre_hashed(ht, bad_hash, key, NULL) != NULL); + } + _mesa_hash_table_destroy(ht, NULL); return 0; -- 2.3.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
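Why a quadratic probe can miss free slots is worth a quick illustration. The sketch below is not the Mesa table and makes no claim about which probe sequence the series uses; `reachable_slots` is a hypothetical helper that counts how many distinct slots a probe sequence touches in `size` attempts, for either an `i * i` step or a triangular-number step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: number of distinct slots visited by `size` probes
 * starting from `hash`. Returns 0 if the scratch allocation fails. */
static uint32_t
reachable_slots(uint32_t hash, uint32_t size, bool triangular)
{
   uint32_t count = 0;
   uint32_t i;
   bool *seen;

   /* Keep size small so that the step arithmetic cannot overflow. */
   assert(size > 0 && size <= 0xffff);

   seen = calloc(size, sizeof(*seen));
   if (seen == NULL)
      return 0;

   for (i = 0; i < size; i++) {
      uint32_t step = triangular ? i * (i + 1) / 2 : i * i;
      uint32_t slot = (hash % size + step) % size;

      if (!seen[slot]) {
         seen[slot] = true;
         count++;
      }
   }

   free(seen);
   return count;
}
```

With a prime size of 11 and an `i * i` step, only 6 of the 11 slots are ever probed, so an insert can fail while the table still has room; with a power-of-two size of 16 and a triangular step, all 16 slots are reached. Inserting many keys with one shared hash, as the new test does, is the kind of input that exposes the first case.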
[Mesa-dev] [PATCH v2 3/3] util: Use 32 bit integer hash function for
Since a pointer is basically just an int we can use integer hashing. This implementation is described on Bob Jenkins' webpage: http://burtleburtle.net/bob/hash/integer.html It states that this implementation is faster than any of his algorithms. It also states that the algorithm is public domain. Since it operates 32 bits at a time it is quite efficient. Oprofile of shader-db run with INTEL_USE_NIR set: mesa_hash_data 3.09 % ---> 2.15 % V2: Feedback from Matt Turner - Use a for-loop for readability - Don't mix code and declaration Feedback from Jordan Justen - Add comment regarding licensing Signed-off-by: Thomas Helland --- src/util/hash_table.c | 38 -- 1 file changed, 28 insertions(+), 10 deletions(-) diff --git a/src/util/hash_table.c b/src/util/hash_table.c index 24184c0..7c6b3ae 100644 --- a/src/util/hash_table.c +++ b/src/util/hash_table.c @@ -393,21 +393,39 @@ _mesa_hash_table_random_entry(struct hash_table *ht, return NULL; } - /** - * Quick FNV-1a hash implementation based on: - * http://www.isthe.com/chongo/tech/comp/fnv/ - * - * FNV-1a is not be the best hash out there -- Jenkins's lookup3 is supposed - * to be quite good, and it probably beats FNV. But FNV has the advantage - * that it involves almost no code. For an improvement on both, see Paul - * Hsieh's http://www.azillionmonkeys.com/qed/hash.html + * This hashing function is described on Bob Jenkins' website: + * http://burtleburtle.net/bob/hash/integer.html + * It states that the code originally is Thomas Wang's, + * and that it is public domain. 
+ * The original page is down, but a copy can be found on + * http://web.archive.org/web/20120720045250/http://www.cris.com/~Ttwang/tech/inthash.htm */ +static inline uint32_t +hash_32bit_int(uint32_t a) { + a = (a ^ 61) ^ (a >> 16); + a = a + (a << 3); + a = a ^ (a >> 4); + a = a * 0x27d4eb2d; + a = a ^ (a >> 15); + return a; +} + uint32_t _mesa_hash_data(const void *data, size_t size) { - return _mesa_fnv32_1a_accumulate_block(_mesa_fnv32_1a_offset_bias, - data, size); + uint32_t hash = _mesa_fnv32_1a_offset_bias; + const uint32_t *ints = (const uint32_t *) data; + uint32_t i = 0; + + assert((size % 4) == 0); + + uint32_t i = size / 4; + + for (i = size / 4; i != 0; i--, ints++) + hash ^= hash_32bit_int(*ints); + + return hash; } /** FNV-1a string hash implementation */ -- 2.3.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
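As posted, the hunk above appears to declare `i` twice in `_mesa_hash_data`, which a C compiler would reject. The standalone sketch below keeps the same mixing function and the same XOR accumulation but declares the counter once. `hash_words` and `HASH_SEED` are hypothetical names; the seed is the standard 32-bit FNV-1a offset basis, assumed here to match `_mesa_fnv32_1a_offset_bias`, and the words are read with memcpy so the sketch does not depend on the alignment of `data`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard 32-bit FNV-1a offset basis; assumed to equal Mesa's constant. */
#define HASH_SEED 2166136261u

/* Thomas Wang's 32-bit integer mix, as quoted in the patch. */
static inline uint32_t
hash_32bit_int(uint32_t a)
{
   a = (a ^ 61) ^ (a >> 16);
   a = a + (a << 3);
   a = a ^ (a >> 4);
   a = a * 0x27d4eb2d;
   a = a ^ (a >> 15);
   return a;
}

/* Hypothetical variant of the patched function: hashes `size` bytes, which
 * must be a multiple of four, one 32-bit word at a time. */
static uint32_t
hash_words(const void *data, size_t size)
{
   const unsigned char *bytes = data;
   uint32_t hash = HASH_SEED;
   size_t i;

   assert((size % 4) == 0);

   for (i = 0; i < size; i += 4) {
      uint32_t word;

      memcpy(&word, bytes + i, sizeof(word));
      hash ^= hash_32bit_int(word);
   }

   return hash;
}
```

Calling `hash_words(&ptr, sizeof(ptr))` on a pointer variable covers both 32-bit and 64-bit pointers, since either size is a multiple of four bytes.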
[Mesa-dev] Good compiler literature?
Does anyone have suggestions for good literature on compilers? Since GPUs and CPUs are a bit different, some books are probably better suited to GPUs than others. I have access to what is probably Norway's biggest library on the subject to borrow books from, so I should be able to find most suggestions there. Regards, Thomas
[Mesa-dev] [PATCH 00/22] Expand get_range in minmax_pruning, V3
I just remembered the existence of this series so I went ahead and tried it on top of today's master. With NIR enabled there is no benefit at all; I didn't try without NIR. I've marked it as rejected in patchwork, so it's not floating around in there. On 3 Jan 2015 at 20:21, "Thomas Helland" wrote: > A couple of months ago I posted a series for expanding the get_range > function in minmax_pruning with support for more operations. > So I've been hacking on this during my spare time over Christmas. > I've now gotten to a point where I think this is not getting us > anywhere, and I need some opinions from other devs to prove me wrong. > > As it stands only a couple of these patches yield any results, > and IMHO only the first patch is merge-material until we can > confirm that this is actually going to give us significant improvements. > The first patch gives noticeable improvements on shader-db since > we are no longer messing up our saturate-detection. > > Patches up to patch 9 are quite small, and patch 5 yields > some benefit on some Dungeon Defenders shaders that do max(exp, 0). > These might be merge-material too, due to their small size. > However the benefits of patch 5 could easily be had with > a 10-line patch to opt_algebraic, and that's the only one showing > any benefit on my collection of shaders. > > Patches 10-19 take on more complexity, with no apparent benefit. > I feel there's not adequate return on investment for the code to be > sitting around bloating the codebase. > > The last three patches are RFC only, as we at least need to rename > the file to something more generic before merging them. > > There is still room for improvement, but I'm not sure it's worth it. > Maybe someone can do a shader-db run that proves me wrong? > My shader-db is dominated by TF2, DOTA2, Portal, > Brutal Legend and Dungeon Defenders. Maybe non-Source-engine > games show some benefit from this series? > > I'm not comfortable with how this might mess up our shaders. 
> It gets hard to verify that things end up complying to spec, > and that we are not doing something wrong. > Some lerp-instructions got removed in Brutal Legend > (it could be guaranteed that the operation would be saturated to 1), > and while I could tell it was likely that the pass was doing the > right thing, it was not easily confirmable. This worries me. > > IMHO we need to do better for this to be worth it. > I added a print to the get_range function to gather some stats: > 2 million calls are made to get_range on my shader-db run. > 500'000 of these are expressions, 2'000 of these are unsupported. > 350'000 of these are constants. > So our coverage in get_range is less than 50%. > Is there anything more we could get/know the range of? > > Thomas Helland (22): > glsl: Reorder optimization-passes > glsl: Move common code to ir_constant_util.h > glsl: Add a IS_CONSTANT macro > glsl: Change to using switch-case in get_range > glsl: Add sqrt, rsq, exp, exp2 to get_range > glsl: Add sin, cos and sign to get_range > glsl: Add saturate to get_range > glsl: Add abs to get_range > glsl: Add ir_unop_neg to get_range > glsl: Add ir_binop_add to get_range > glsl: Add ir_binop_mul to get_range > glsl: Add ir_binop_sub to get_range > glsl: Add ir_binop_pow to get_range > glsl: Add ir_triop_fma to get_range > glsl: Add ir_triop_lrp to get_range > glsl: Add ir_binop_dot to get_range > glsl: Add log and log2 to get_range > glsl: Add ir_unop_rcp to get_range > glsl: Add a saturate range optimization > glsl: Optimize some cases of undefined behaviour. 
> glsl: Add a range based comparison opt-pass > glsl: Remove useless abs based on range analysis > > src/glsl/glsl_parser_extras.cpp | 2 +- > src/glsl/ir_constant_util.h | 110 + > src/glsl/opt_algebraic.cpp | 93 +--- > src/glsl/opt_minmax.cpp | 495 > ++-- > 4 files changed, 589 insertions(+), 111 deletions(-) > create mode 100644 src/glsl/ir_constant_util.h > > -- > 2.2.1 > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
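The max(exp, 0) case mentioned in the cover letter shows what the range information buys. The sketch below is a scalar model only, assuming hypothetical names throughout (`struct range` and the helpers are not the types or functions in opt_minmax.cpp) and ignoring NaN: once an operand is known to lie in an interval, a min or max against a constant outside that interval can be dropped.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical closed interval known to contain an expression's value. */
struct range {
   float low;
   float high;
};

/* exp(x) is never negative. */
static struct range
range_of_exp(void)
{
   return (struct range) { 0.0f, INFINITY };
}

/* saturate(x) clamps to [0, 1]. */
static struct range
range_of_saturate(void)
{
   return (struct range) { 0.0f, 1.0f };
}

/* max(operand, c) always yields operand when operand can never be below c. */
static bool
max_with_constant_is_redundant(struct range operand, float c)
{
   return operand.low >= c;
}

/* min(operand, c) always yields operand when operand can never exceed c. */
static bool
min_with_constant_is_redundant(struct range operand, float c)
{
   return operand.high <= c;
}
```

Here `max_with_constant_is_redundant(range_of_exp(), 0.0f)` holds, which is the case the cover letter says could also be caught by a small opt_algebraic rule.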
[Mesa-dev] Fwd: [PATCH 3/3] util: Use 32 bit integer hash function for pointers
(Forgot to send to list)

That is indeed an issue. I found the original article on the Wayback Machine, and it doesn't state anything in particular about the license. However, the function seems to be used in a LOT of projects (javascript, chromium, hiphop-php, kde, +++).

I found this webpage that gives some more insight:
http://burtleburtle.net/bob/hash/integer.html
(it seems to be the website of Bob Jenkins)

It states the following:

"The hashes on this page (with the possible exception of HashMap.java's) are all public domain. So are the ones on Thomas Wang's page. Thomas recommends citing the author and page when using them."

So it looks like it is public domain. I'll add a credit to Thomas Wang and a link to Bob Jenkins' webpage. Does that sound like an acceptable solution?

2015-03-29 20:51 GMT+02:00 Jordan Justen :
> On 2015-03-29 11:05:40, Thomas Helland wrote:
>> Since a pointer is basically just an int we can use integer hashing.
>> This one is taken from https://gist.github.com/badboy/6267743
>
> There doesn't seem to be a license associated with this code, nor is
> it indicated that it is public domain code.
>
> -Jordan
>
>> A google search seems to suggest this is a common and good algorithm.
>> Since it operates 32 bits at a time it is really effective.
>> assert that we are hashing 32bit aligned pointers.
>> >> Signed-off-by: Thomas Helland >> --- >> src/util/hash_table.c | 24 ++-- >> 1 file changed, 22 insertions(+), 2 deletions(-) >> >> diff --git a/src/util/hash_table.c b/src/util/hash_table.c >> index 24184c0..54d04ef 100644 >> --- a/src/util/hash_table.c >> +++ b/src/util/hash_table.c >> @@ -393,6 +393,15 @@ _mesa_hash_table_random_entry(struct hash_table *ht, >> return NULL; >> } >> >> +static inline uint32_t >> +hash_32bit_int(uint32_t a) { >> + a = (a ^ 61) ^ (a >> 16); >> + a = a + (a << 3); >> + a = a ^ (a >> 4); >> + a = a * 0x27d4eb2d; >> + a = a ^ (a >> 15); >> + return a; >> +} >> >> /** >> * Quick FNV-1a hash implementation based on: >> @@ -406,8 +415,19 @@ _mesa_hash_table_random_entry(struct hash_table *ht, >> uint32_t >> _mesa_hash_data(const void *data, size_t size) >> { >> - return _mesa_fnv32_1a_accumulate_block(_mesa_fnv32_1a_offset_bias, >> - data, size); >> + uint32_t hash = _mesa_fnv32_1a_offset_bias; >> + const uint32_t *ints = (const uint32_t *) data; >> + >> + assert((size % 4) == 0); >> + >> + uint32_t i = size / 4; >> + >> + while (i-- != 0) { >> + hash ^= hash_32bit_int(*ints); >> + ints++; >> + } >> + >> + return hash; >> } >> >> /** FNV-1a string hash implementation */ >> -- >> 2.3.4 >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] util: Use 32 bit integer hash function for pointers
On 2015-03-29 11:05:40, Thomas Helland wrote: > Since a pointer is basically just an int we can use integer hashing. > This one is taken from https://gist.github.com/badboy/6267743 There doesn't seem to be a license associated with this code, nor is it indicated that it is public domain code. -Jordan > A google search seems to suggest this is a common and good algorithm. > Since it operates 32 bits at a time it is really effective. > assert that we are hashing 32bit aligned pointers. > > Signed-off-by: Thomas Helland > --- > src/util/hash_table.c | 24 ++-- > 1 file changed, 22 insertions(+), 2 deletions(-) > > diff --git a/src/util/hash_table.c b/src/util/hash_table.c > index 24184c0..54d04ef 100644 > --- a/src/util/hash_table.c > +++ b/src/util/hash_table.c > @@ -393,6 +393,15 @@ _mesa_hash_table_random_entry(struct hash_table *ht, > return NULL; > } > > +static inline uint32_t > +hash_32bit_int(uint32_t a) { > + a = (a ^ 61) ^ (a >> 16); > + a = a + (a << 3); > + a = a ^ (a >> 4); > + a = a * 0x27d4eb2d; > + a = a ^ (a >> 15); > + return a; > +} > > /** > * Quick FNV-1a hash implementation based on: > @@ -406,8 +415,19 @@ _mesa_hash_table_random_entry(struct hash_table *ht, > uint32_t > _mesa_hash_data(const void *data, size_t size) > { > - return _mesa_fnv32_1a_accumulate_block(_mesa_fnv32_1a_offset_bias, > - data, size); > + uint32_t hash = _mesa_fnv32_1a_offset_bias; > + const uint32_t *ints = (const uint32_t *) data; > + > + assert((size % 4) == 0); > + > + uint32_t i = size / 4; > + > + while (i-- != 0) { > + hash ^= hash_32bit_int(*ints); > + ints++; > + } > + > + return hash; > } > > /** FNV-1a string hash implementation */ > -- > 2.3.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] util: Use 32 bit integer hash function for pointers
2015-03-29 20:15 GMT+02:00 Matt Turner : > On Sun, Mar 29, 2015 at 11:05 AM, Thomas Helland > wrote: >> Since a pointer is basically just an int we can use integer hashing. >> This one is taken from https://gist.github.com/badboy/6267743 >> A google search seems to suggest this is a common and good algorithm. >> Since it operates 32 bits at a time it is really effective. >> assert that we are hashing 32bit aligned pointers. >> >> Signed-off-by: Thomas Helland >> --- >> src/util/hash_table.c | 24 ++-- >> 1 file changed, 22 insertions(+), 2 deletions(-) >> >> diff --git a/src/util/hash_table.c b/src/util/hash_table.c >> index 24184c0..54d04ef 100644 >> --- a/src/util/hash_table.c >> +++ b/src/util/hash_table.c >> @@ -393,6 +393,15 @@ _mesa_hash_table_random_entry(struct hash_table *ht, >> return NULL; >> } >> >> +static inline uint32_t >> +hash_32bit_int(uint32_t a) { >> + a = (a ^ 61) ^ (a >> 16); >> + a = a + (a << 3); >> + a = a ^ (a >> 4); >> + a = a * 0x27d4eb2d; >> + a = a ^ (a >> 15); >> + return a; >> +} >> >> /** >> * Quick FNV-1a hash implementation based on: >> @@ -406,8 +415,19 @@ _mesa_hash_table_random_entry(struct hash_table *ht, >> uint32_t >> _mesa_hash_data(const void *data, size_t size) >> { >> - return _mesa_fnv32_1a_accumulate_block(_mesa_fnv32_1a_offset_bias, >> - data, size); >> + uint32_t hash = _mesa_fnv32_1a_offset_bias; >> + const uint32_t *ints = (const uint32_t *) data; >> + >> + assert((size % 4) == 0); > > Not sure if we can mix code and declarations. This might need to go > after the declaration of i. > Darn. Rebase failure on my part. Will post a V2 ASAP. >> + >> + uint32_t i = size / 4; >> + >> + while (i-- != 0) { >> + hash ^= hash_32bit_int(*ints); >> + ints++; >> + } > > This would read a lot better as > > for (i = size / 4; i != 0; i--) { >hash ^= hash_32_bit_int(ints[i]); > } I'll get this into the V2 as well. 
Thanks for the fast response =)

>
>> +
>> +   return hash;
>>  }
>>
>>  /** FNV-1a string hash implementation */
>> --
>> 2.3.4
Re: [Mesa-dev] [PATCH 3/3] util: Use 32 bit integer hash function for pointers
On Sun, Mar 29, 2015 at 11:05 AM, Thomas Helland wrote: > Since a pointer is basically just an int we can use integer hashing. > This one is taken from https://gist.github.com/badboy/6267743 > A google search seems to suggest this is a common and good algorithm. > Since it operates 32 bits at a time it is really effective. > assert that we are hashing 32bit aligned pointers. > > Signed-off-by: Thomas Helland > --- > src/util/hash_table.c | 24 ++-- > 1 file changed, 22 insertions(+), 2 deletions(-) > > diff --git a/src/util/hash_table.c b/src/util/hash_table.c > index 24184c0..54d04ef 100644 > --- a/src/util/hash_table.c > +++ b/src/util/hash_table.c > @@ -393,6 +393,15 @@ _mesa_hash_table_random_entry(struct hash_table *ht, > return NULL; > } > > +static inline uint32_t > +hash_32bit_int(uint32_t a) { > + a = (a ^ 61) ^ (a >> 16); > + a = a + (a << 3); > + a = a ^ (a >> 4); > + a = a * 0x27d4eb2d; > + a = a ^ (a >> 15); > + return a; > +} > > /** > * Quick FNV-1a hash implementation based on: > @@ -406,8 +415,19 @@ _mesa_hash_table_random_entry(struct hash_table *ht, > uint32_t > _mesa_hash_data(const void *data, size_t size) > { > - return _mesa_fnv32_1a_accumulate_block(_mesa_fnv32_1a_offset_bias, > - data, size); > + uint32_t hash = _mesa_fnv32_1a_offset_bias; > + const uint32_t *ints = (const uint32_t *) data; > + > + assert((size % 4) == 0); Not sure if we can mix code and declarations. This might need to go after the declaration of i. > + > + uint32_t i = size / 4; > + > + while (i-- != 0) { > + hash ^= hash_32bit_int(*ints); > + ints++; > + } This would read a lot better as for (i = size / 4; i != 0; i--) { hash ^= hash_32_bit_int(ints[i]); } > + > + return hash; > } > > /** FNV-1a string hash implementation */ > -- > 2.3.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
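A note on the loop suggested in the review above: as written, `for (i = size / 4; i != 0; i--)` with `ints[i]` visits `ints[size / 4]` down to `ints[1]`, so it reads one word past the end of the buffer and never touches `ints[0]`. The helper in the patch is also spelled `hash_32bit_int`, not `hash_32_bit_int`. Below is a minimal sketch of a variant that keeps all declarations at the top of the block and covers every word. `hash_words` is a made-up stand-in for the patched `_mesa_hash_data()`, and the literal `2166136261u` stands in for `_mesa_fnv32_1a_offset_bias` (it is the standard 32-bit FNV offset basis); this is not the V2 patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Integer mixer copied from the patch under review. */
static uint32_t
hash_32bit_int(uint32_t a)
{
   a = (a ^ 61) ^ (a >> 16);
   a = a + (a << 3);
   a = a ^ (a >> 4);
   a = a * 0x27d4eb2d;
   a = a ^ (a >> 15);
   return a;
}

/* Hypothetical stand-in for the patched _mesa_hash_data().
 * `size` is the length of the buffer in bytes.
 */
static uint32_t
hash_words(const uint32_t *ints, size_t size)
{
   uint32_t hash = 2166136261u; /* 32-bit FNV offset basis */
   size_t i;                    /* declared before any statement */

   assert((size % 4) == 0);

   /* Index from 0 so that every word, including ints[0], is mixed in
    * and nothing past the end of the buffer is read.
    */
   for (i = 0; i < size / 4; i++)
      hash ^= hash_32bit_int(ints[i]);

   return hash;
}
```

Counting upwards keeps the index form Matt asked for while producing the same result as the original `while (i-- != 0)` pointer walk, since XOR does not care about the order of the words.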
[Mesa-dev] [PATCH 1/3] util: Change hash_table to use quadratic probing
This should give better cache locality, less memory consumption, less code, and should also be faster since we avoid a modulo operation. Also change table size to be power of two. This gives better performance as we can do bitmasking instead of modulo operations for fitting the hash in the address space. By using the algorithm hash = sh + i/2 + i*i/2 we are guaranteed that all retries from the quad probing are distinct, and so should be able to completely fill the table. This passes the test added to exercise a worst case collision scenario. Also, start at size = 16 instead of 4. This should reduce some allocation overhead when constantly using tables larger than 3 entries. The amount of space used before rehash is changed to 70%. This should decrease collisions slightly, leading to better performance. V3: Feedback from Connor Abbott - Remove hash_size table - Correct comment-style Feedback from Eric Anholt - Correct quadratic probing algorithm Feedback from Jason Ekstrand - Add "unreachable" if we fail to insert in table Signed-off-by: Thomas Helland --- src/util/hash_table.c | 108 +- src/util/hash_table.h | 3 +- 2 files changed, 38 insertions(+), 73 deletions(-) diff --git a/src/util/hash_table.c b/src/util/hash_table.c index 3247593..24184c0 100644 --- a/src/util/hash_table.c +++ b/src/util/hash_table.c @@ -33,11 +33,16 @@ */ /** - * Implements an open-addressing, linear-reprobing hash table. + * Implements an open-addressing, quadratic probing hash table. * - * For more information, see: - * - * http://cgit.freedesktop.org/~anholt/hash_table/tree/README + * We choose table sizes that's a power of two. + * This is computationally less expensive than primes. + * As a bonus the size and free space can be calculated instead of looked up. + * FNV-1a has good avalanche properties, so collision is not an issue. + * These tables are sized to have an extra 30% free to avoid + * exponential performance degradation as the hash table fills. 
+ * The table has a starting size of 16 to avoid spamming + * rzalloc and friends in the start of most of our tables. */ #include @@ -50,47 +55,6 @@ static const uint32_t deleted_key_value; -/** - * From Knuth -- a good choice for hash/rehash values is p, p-2 where - * p and p-2 are both prime. These tables are sized to have an extra 10% - * free to avoid exponential performance degradation as the hash table fills - */ -static const struct { - uint32_t max_entries, size, rehash; -} hash_sizes[] = { - { 2,5, 3 }, - { 4,7, 5 }, - { 8,13, 11}, - { 16, 19, 17}, - { 32, 43, 41}, - { 64, 73, 71}, - { 128, 151,149 }, - { 256, 283,281 }, - { 512, 571,569 }, - { 1024, 1153, 1151 }, - { 2048, 2269, 2267 }, - { 4096, 4519, 4517 }, - { 8192, 9013, 9011 }, - { 16384,18043, 18041 }, - { 32768,36109, 36107 }, - { 65536,72091, 72089 }, - { 131072, 144409, 144407}, - { 262144, 288361, 288359}, - { 524288, 576883, 576881}, - { 1048576, 1153459,1153457 }, - { 2097152, 2307163,2307161 }, - { 4194304, 4613893,4613891 }, - { 8388608, 9227641,9227639 }, - { 16777216, 18455029, 18455027 }, - { 33554432, 36911011, 36911009 }, - { 67108864, 73819861, 73819859 }, - { 134217728,147639589, 147639587 }, - { 268435456,295279081, 295279079 }, - { 536870912,590559793, 590559791 }, - { 1073741824, 1181116273, 1181116271}, - { 2147483648ul, 2362232233ul, 2362232231ul} -}; - static int entry_is_free(const struct hash_entry *entry) { @@ -121,10 +85,13 @@ _mesa_hash_table_create(void *mem_ctx, if (ht == NULL) return NULL; - ht->size_index = 0; - ht->size = hash_sizes[ht->size_index].size; - ht->rehash = hash_sizes[ht->size_index].rehash; - ht->max_entries = hash_sizes[ht->size_index].max_entries; + /* Start the table at an initial size of 16 +* We use a bit more memory, but avoid spamming +* malloc and friends when starting a new table +*/ + ht->size_iteration = 4; + ht->size = 1 << ht->size_iteration; + ht->max_entries = ht->size * 0.7; ht->key_hash_function = key_hash_function; 
ht->key_equals_function = key_equals_function; ht->tabl
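The claim in the commit message that every retry is distinct can be checked in isolation. The sketch below is not the code from this patch; it reads the formula hash = sh + i/2 + i*i/2 as the real-valued expression sh + i(i+1)/2 (evaluating i/2 and i*i/2 separately in integer math would truncate twice and give different offsets), and it assumes the table size is a power of two. `probe_address` and `probes_cover_table` are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* i-th probe address for start address `start` in a table of `size`
 * slots, where `size` must be a power of two. (i + i*i) is always even,
 * so the division is exact. Only valid while i + i*i fits in 32 bits.
 */
static uint32_t
probe_address(uint32_t start, uint32_t i, uint32_t size)
{
   return (start + (i + i * i) / 2) & (size - 1);
}

/* Returns true if the first `size` probes starting from `start` land on
 * every slot exactly once. Uses the incremental form of the same
 * sequence: the step between consecutive probes grows by one each time.
 */
static bool
probes_cover_table(uint32_t start, uint32_t size)
{
   bool *seen = calloc(size, sizeof(*seen));
   uint32_t addr = start & (size - 1);
   uint32_t i;
   bool ok = seen != NULL;

   for (i = 1; ok && i <= size; i++) {
      if (seen[addr])
         ok = false;                  /* slot visited twice */
      seen[addr] = true;
      addr = (addr + i) & (size - 1); /* next triangular offset */
   }

   free(seen);
   return ok;
}
```

For a 16-entry table starting at slot 0, the offsets are 0, 1, 3, 6, 10, 15, 5, 12, 4, 13, 7, 2, 14, 11, 9, 8, which is a permutation of all 16 slots. This is why an insert into a table that still has a free slot cannot loop forever.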
[Mesa-dev] [PATCH 0/3] Hash-table improvements, V3
Here's the latest round of fixups for the hash-table patches. I think I've incorporated all the review feedback now. These patches give a nice little boost, indicated in each commit.

As a side effect of raising the minimum size of the table and set, there is now also less spamming of rzalloc and friends:

_int_malloc is cut from 935'000 to 847'000 samples.
calloc is cut from 683'000 to 655'000 samples.
_int_free is cut from 644'000 to 617'000 samples.

The series reduced shader-db run-time with NIR on my collection from 180 seconds to about 160 seconds.

Thomas Helland (3):
  util: Change hash_table to use quadratic probing
  util: Change util/set to use quadratic probing
  util: Use 32 bit integer hash function for pointers

 src/util/hash_table.c | 132 ++
 src/util/hash_table.h |   3 +-
 src/util/set.c        | 124 ++-
 src/util/set.h        |   3 +-
 4 files changed, 109 insertions(+), 153 deletions(-)

--
2.3.4
[Mesa-dev] [PATCH 2/3] util: Change util/set to use quadratic probing
The same rationale applies here as for the hash table. Power of two size should give better performance, and using the algorithm hash = sh + i/2 + i*i/2 should result in only distinct hash values when hitting collisions. Should give a performance increase as we can do bitmasking instead of a modulo operation for fitting the hash in the address space. V2: Feedback from Connor Abbott - Don't set initial hash address before potential rehash - Remove hash_sizes table - Correct the quadratic hashing algorithm - Use correct comment style Feedback from Jason Ekstrand - Use unreachable() to detect if we fail to insert Signed-off-by: Thomas Helland --- src/util/set.c | 124 ++--- src/util/set.h | 3 +- 2 files changed, 49 insertions(+), 78 deletions(-) diff --git a/src/util/set.c b/src/util/set.c index f01f869..1496178 100644 --- a/src/util/set.c +++ b/src/util/set.c @@ -32,6 +32,19 @@ *Keith Packard */ +/** + * Implements an open-addressing, quadratic probing hash-set. + * + * We choose set sizes that's a power of two. + * This is computationally less expensive than primes. + * As a bonus the size and free space can be calculated instead of looked up. + * FNV-1a has good avalanche properties, so collision is not an issue. + * These sets are sized to have an extra 30% free to avoid + * exponential performance degradation as the set fills. + * The set has a starting size of 16 to avoid spamming + * rzalloc and friends in the start of most of our sets. + */ + #include #include @@ -39,51 +52,9 @@ #include "ralloc.h" #include "set.h" -/* - * From Knuth -- a good choice for hash/rehash values is p, p-2 where - * p and p-2 are both prime. 
These tables are sized to have an extra 10% - * free to avoid exponential performance degradation as the hash table fills - */ - uint32_t deleted_key_value; const void *deleted_key = &deleted_key_value; -static const struct { - uint32_t max_entries, size, rehash; -} hash_sizes[] = { - { 2,5,3}, - { 4,7,5}, - { 8,13, 11 }, - { 16, 19, 17 }, - { 32, 43, 41 }, - { 64, 73, 71 }, - { 128, 151, 149 }, - { 256, 283, 281 }, - { 512, 571, 569 }, - { 1024, 1153, 1151 }, - { 2048, 2269, 2267 }, - { 4096, 4519, 4517 }, - { 8192, 9013, 9011 }, - { 16384,18043,18041}, - { 32768,36109,36107}, - { 65536,72091,72089}, - { 131072, 144409, 144407 }, - { 262144, 288361, 288359 }, - { 524288, 576883, 576881 }, - { 1048576, 1153459, 1153457 }, - { 2097152, 2307163, 2307161 }, - { 4194304, 4613893, 4613891 }, - { 8388608, 9227641, 9227639 }, - { 16777216, 18455029, 18455027 }, - { 33554432, 36911011, 36911009 }, - { 67108864, 73819861, 73819859 }, - { 134217728,147639589,147639587}, - { 268435456,295279081,295279079}, - { 536870912,590559793,590559791}, - { 1073741824, 1181116273, 1181116271 }, - { 2147483648ul, 2362232233ul, 2362232231ul } -}; - static int entry_is_free(struct set_entry *entry) { @@ -114,10 +85,13 @@ _mesa_set_create(void *mem_ctx, if (ht == NULL) return NULL; - ht->size_index = 0; - ht->size = hash_sizes[ht->size_index].size; - ht->rehash = hash_sizes[ht->size_index].rehash; - ht->max_entries = hash_sizes[ht->size_index].max_entries; + /* Start the set at an initial size of 16 +* We use a bit more memory, but avoid spamming +* malloc and friends when starting a new set +*/ + ht->size_iteration = 4; + ht->size = 1 << ht->size_iteration; + ht->max_entries = ht->size * 0.7; ht->key_hash_function = key_hash_function; ht->key_equals_function = key_equals_function; ht->table = rzalloc_array(ht, struct set_entry, ht->size); @@ -163,12 +137,11 @@ _mesa_set_destroy(struct set *ht, void (*delete_function)(struct set_entry *entr static struct set_entry * set_search(const struct 
set *ht, uint32_t hash, const void *key) { - uint32_t hash_address; + uint32_t start_hash_address = hash & (ht->size - 1); + uint32_t hash_address = start_hash_address; + uint32_t quad_hash = 1; - hash_address = hash % ht->size; do { - uint32_t double_hash; - struct set_entry *entry = ht->table + hash_address; if (entry_is_free(entry)) { @@ -179,10 +152,10 @@ set_search(const struct set *ht, uint32_t hash, const void *key) } }
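The sizing and addressing rules described in this commit message can be sketched on their own. The snippet below is an illustration under stated assumptions, not the patch itself: `struct sizing`, `sizing_for_iteration` and `start_address` are hypothetical names, while the three expressions inside them mirror the `1 << size_iteration`, `size * 0.7` and `hash & (size - 1)` lines from the diff.

```c
#include <stdint.h>

/* Hypothetical holder for the two derived values. */
struct sizing {
   uint32_t size;        /* number of slots, always a power of two */
   uint32_t max_entries; /* entries allowed before a rehash (70%) */
};

/* Derive the size and rehash threshold from the size iteration,
 * instead of looking them up in a table of primes.
 */
static struct sizing
sizing_for_iteration(uint32_t size_iteration)
{
   struct sizing s;

   s.size = 1u << size_iteration;
   s.max_entries = (uint32_t)(s.size * 0.7);
   return s;
}

/* For a power-of-two size, masking with (size - 1) gives the same
 * result as hash % size without a division.
 */
static uint32_t
start_address(uint32_t hash, uint32_t size)
{
   return hash & (size - 1);
}
```

With the initial `size_iteration` of 4 this gives 16 slots and a rehash threshold of 11 entries.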
[Mesa-dev] [PATCH 3/3] util: Use 32 bit integer hash function for pointers
Since a pointer is basically just an int we can use integer hashing.
This one is taken from https://gist.github.com/badboy/6267743
A google search seems to suggest this is a common and good algorithm.
Since it operates 32 bits at a time it is really effective.
assert that we are hashing 32bit aligned pointers.

Signed-off-by: Thomas Helland
---
 src/util/hash_table.c | 24 ++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/src/util/hash_table.c b/src/util/hash_table.c
index 24184c0..54d04ef 100644
--- a/src/util/hash_table.c
+++ b/src/util/hash_table.c
@@ -393,6 +393,15 @@ _mesa_hash_table_random_entry(struct hash_table *ht,
    return NULL;
 }
 
+static inline uint32_t
+hash_32bit_int(uint32_t a) {
+   a = (a ^ 61) ^ (a >> 16);
+   a = a + (a << 3);
+   a = a ^ (a >> 4);
+   a = a * 0x27d4eb2d;
+   a = a ^ (a >> 15);
+   return a;
+}
 
 /**
  * Quick FNV-1a hash implementation based on:
@@ -406,8 +415,19 @@ _mesa_hash_table_random_entry(struct hash_table *ht,
 uint32_t
 _mesa_hash_data(const void *data, size_t size)
 {
-   return _mesa_fnv32_1a_accumulate_block(_mesa_fnv32_1a_offset_bias,
-                                          data, size);
+   uint32_t hash = _mesa_fnv32_1a_offset_bias;
+   const uint32_t *ints = (const uint32_t *) data;
+
+   assert((size % 4) == 0);
+
+   uint32_t i = size / 4;
+
+   while (i-- != 0) {
+      hash ^= hash_32bit_int(*ints);
+      ints++;
+   }
+
+   return hash;
 }
 
 /** FNV-1a string hash implementation */
--
2.3.4
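To make the "a pointer is basically just an int" idea concrete, here is a rough sketch of what the new hashing would do to a single pointer key. It assumes that pointer keys reach `_mesa_hash_data()` as the raw bytes of the pointer value, and that `sizeof(void *)` is 4 or 8 so the `size % 4` assert holds. `hash_pointer_key` is a hypothetical name, and `2166136261u` (the 32-bit FNV offset basis) stands in for `_mesa_fnv32_1a_offset_bias`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Integer mixer copied from the patch. */
static uint32_t
hash_32bit_int(uint32_t a)
{
   a = (a ^ 61) ^ (a >> 16);
   a = a + (a << 3);
   a = a ^ (a >> 4);
   a = a * 0x27d4eb2d;
   a = a ^ (a >> 15);
   return a;
}

/* Hypothetical helper: hash the bytes of the pointer value itself,
 * one 32-bit word at a time (one word on 32-bit targets, two on
 * 64-bit targets).
 */
static uint32_t
hash_pointer_key(const void *key)
{
   uint32_t words[sizeof(key) / sizeof(uint32_t)];
   uint32_t hash = 2166136261u; /* 32-bit FNV offset basis */
   size_t i;

   /* memcpy avoids reading the pointer through a mismatched type. */
   memcpy(words, &key, sizeof(key));

   for (i = 0; i < sizeof(words) / sizeof(words[0]); i++)
      hash ^= hash_32bit_int(words[i]);

   return hash;
}
```

One property of the combining step to keep in mind: the per-word results are merged with a plain XOR, so the order of the words does not influence the result, and two identical words cancel each other out. For single pointers that is unlikely to matter much, but for longer keys it means that reordered data hashes to the same value.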
Re: [Mesa-dev] Mesa 10.5.2
On 28/03/15 19:23, Emil Velikov wrote:
> Mesa 10.5.2 is now available. This release addresses bugs in the common glsl
> code-base, the libGL and glapi libraries, and the dri modules. The tarball no
> longer contains hardlinks and has all the haiku files. With this release one
> can build mesa without the need of python and mako.
>

Hi all,

It seems that there is a bug which prevents mesa 10.5.2 from going python/mako free. The issue has been resolved and the fix will feature in the next stable release.

Thanks to everyone who reported this.

-Emil
Re: [Mesa-dev] [PATCH v2 06/15] st/mesa: implement GL_AMD_performance_monitor
On 29/03/2015 17:56, Samuel Pitoiset wrote: On 03/28/2015 09:43 PM, Martin Peres wrote: On 22/03/2015 17:35, Samuel Pitoiset wrote: From: Christoph Bumiller This is based on the original patch of Christoph Bumiller. (source: http://people.freedesktop.org/~chrisbmr/perfmon.diff) It would be nice if you could add "v2: Samuel Pitoiset" and tell what you changed. Christoph may delete his perfmon.diff and no-one will be able to diff the diffs :) Good idea! As for the Gallium HUD, we keep a list of busy queries in a ring buffer in order to prevent stalls when reading queries. Drivers must implement get_driver_query_group_info and get_driver_query_info in order to enable this extension. Signed-off-by: Samuel Pitoiset --- src/mesa/Makefile.sources | 2 + src/mesa/state_tracker/st_cb_perfmon.c | 455 + src/mesa/state_tracker/st_cb_perfmon.h | 70 + src/mesa/state_tracker/st_context.c| 4 + src/mesa/state_tracker/st_extensions.c | 3 + 5 files changed, 534 insertions(+) create mode 100644 src/mesa/state_tracker/st_cb_perfmon.c create mode 100644 src/mesa/state_tracker/st_cb_perfmon.h diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources index 217be9a..e54e618 100644 --- a/src/mesa/Makefile.sources +++ b/src/mesa/Makefile.sources @@ -432,6 +432,8 @@ STATETRACKER_FILES = \ state_tracker/st_cb_flush.h \ state_tracker/st_cb_msaa.c \ state_tracker/st_cb_msaa.h \ +state_tracker/st_cb_perfmon.c \ +state_tracker/st_cb_perfmon.h \ state_tracker/st_cb_program.c \ state_tracker/st_cb_program.h \ state_tracker/st_cb_queryobj.c \ diff --git a/src/mesa/state_tracker/st_cb_perfmon.c b/src/mesa/state_tracker/st_cb_perfmon.c new file mode 100644 index 000..fb6774b --- /dev/null +++ b/src/mesa/state_tracker/st_cb_perfmon.c @@ -0,0 +1,455 @@ +/* + * Copyright (C) 2013 Christoph Bumiller + * Copyright (C) 2015 Samuel Pitoiset + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * 
to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +/** + * Performance monitoring counters interface to gallium. + */ + +#include "st_context.h" +#include "st_cb_bitmap.h" +#include "st_cb_perfmon.h" + +#include "util/bitset.h" + +#include "pipe/p_context.h" +#include "pipe/p_screen.h" +#include "util/u_memory.h" + +/** + * Return a PIPE_QUERY_x type >= PIPE_QUERY_DRIVER_SPECIFIC, or -1 if + * the driver-specific query doesn't exist. 
+ */ +static int +find_query_type(struct pipe_screen *screen, const char *name) +{ + int num_queries; + int type = -1; + int i; + + num_queries = screen->get_driver_query_info(screen, 0, NULL); + if (!num_queries) + return type; + + for (i = 0; i < num_queries; i++) { + struct pipe_driver_query_info info; + + if (!screen->get_driver_query_info(screen, i, &info)) + continue; + + if (!strncmp(info.name, name, strlen(name))) { + type = info.query_type; + break; + } + } + return type; +} + +static bool +init_perf_monitor(struct gl_context *ctx, struct gl_perf_monitor_object *m) +{ + struct st_perf_monitor_object *stm = st_perf_monitor_object(m); + struct pipe_screen *screen = st_context(ctx)->pipe->screen; + struct pipe_context *pipe = st_context(ctx)->pipe; + int gid, cid; + + st_flush_bitmap_cache(st_context(ctx)); + + /* Create a query for each active counter. */ + for (gid = 0; gid < ctx->PerfMonitor.NumGroups; gid++) { + const struct gl_perf_monitor_group *g = &ctx->PerfMonitor.Groups[gid]; + for (cid = 0; cid < g->NumCounters; cid++) { + const struct gl_perf_monitor_counter *c = &g->Counters[cid]; + struct st_perf_counter_object *cntr; + int query_type; + + if (!BITSET_TEST(m->ActiveCounters[gid], cid)) +continue; It would seem like the extension would not work with more than 32 counters per group. This certainly is not a problem on the NVIDIA side but it may be
Re: [Mesa-dev] [PATCH v2 06/15] st/mesa: implement GL_AMD_performance_monitor
On 29/03/2015 17:57, Samuel Pitoiset wrote: On 03/29/2015 11:13 AM, Martin Peres wrote: On 29/03/2015 04:02, Marek Olšák wrote: On Sat, Mar 28, 2015 at 9:43 PM, Martin Peres wrote: On 22/03/2015 17:35, Samuel Pitoiset wrote: From: Christoph Bumiller This is based on the original patch of Christoph Bumiller. (source: http://people.freedesktop.org/~chrisbmr/perfmon.diff) It would be nice if you could add "v2: Samuel Pitoiset" and tell what you changed. Christoph may delete his perfmon.diff and no-one will be able to diff the diffs :) As for the Gallium HUD, we keep a list of busy queries in a ring buffer in order to prevent stalls when reading queries. Drivers must implement get_driver_query_group_info and get_driver_query_info in order to enable this extension. Signed-off-by: Samuel Pitoiset --- src/mesa/Makefile.sources | 2 + src/mesa/state_tracker/st_cb_perfmon.c | 455 + src/mesa/state_tracker/st_cb_perfmon.h | 70 + src/mesa/state_tracker/st_context.c| 4 + src/mesa/state_tracker/st_extensions.c | 3 + 5 files changed, 534 insertions(+) create mode 100644 src/mesa/state_tracker/st_cb_perfmon.c create mode 100644 src/mesa/state_tracker/st_cb_perfmon.h diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources index 217be9a..e54e618 100644 --- a/src/mesa/Makefile.sources +++ b/src/mesa/Makefile.sources @@ -432,6 +432,8 @@ STATETRACKER_FILES = \ state_tracker/st_cb_flush.h \ state_tracker/st_cb_msaa.c \ state_tracker/st_cb_msaa.h \ + state_tracker/st_cb_perfmon.c \ + state_tracker/st_cb_perfmon.h \ state_tracker/st_cb_program.c \ state_tracker/st_cb_program.h \ state_tracker/st_cb_queryobj.c \ diff --git a/src/mesa/state_tracker/st_cb_perfmon.c b/src/mesa/state_tracker/st_cb_perfmon.c new file mode 100644 index 000..fb6774b --- /dev/null +++ b/src/mesa/state_tracker/st_cb_perfmon.c @@ -0,0 +1,455 @@ +/* + * Copyright (C) 2013 Christoph Bumiller + * Copyright (C) 2015 Samuel Pitoiset + * + * Permission is hereby granted, free of charge, to any person obtaining a 
+ * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +/** + * Performance monitoring counters interface to gallium. + */ + +#include "st_context.h" +#include "st_cb_bitmap.h" +#include "st_cb_perfmon.h" + +#include "util/bitset.h" + +#include "pipe/p_context.h" +#include "pipe/p_screen.h" +#include "util/u_memory.h" + +/** + * Return a PIPE_QUERY_x type >= PIPE_QUERY_DRIVER_SPECIFIC, or -1 if + * the driver-specific query doesn't exist. 
+ */ +static int +find_query_type(struct pipe_screen *screen, const char *name) +{ + int num_queries; + int type = -1; + int i; + + num_queries = screen->get_driver_query_info(screen, 0, NULL); + if (!num_queries) + return type; + + for (i = 0; i < num_queries; i++) { + struct pipe_driver_query_info info; + + if (!screen->get_driver_query_info(screen, i, &info)) + continue; + + if (!strncmp(info.name, name, strlen(name))) { + type = info.query_type; + break; + } + } + return type; +} + +static bool +init_perf_monitor(struct gl_context *ctx, struct gl_perf_monitor_object *m) +{ + struct st_perf_monitor_object *stm = st_perf_monitor_object(m); + struct pipe_screen *screen = st_context(ctx)->pipe->screen; + struct pipe_context *pipe = st_context(ctx)->pipe; + int gid, cid; + + st_flush_bitmap_cache(st_context(ctx)); + + /* Create a query for each active counter. */ + for (gid = 0; gid < ctx->PerfMonitor.NumGroups; gid++) { + const struct gl_perf_monitor_group *g = &ctx->PerfMonitor.Groups[gid]; + for (cid = 0; cid < g->NumCounters; cid++) { + const struct gl_perf_monitor_counter *c = &g->Counters[cid]; + struct st_perf_counter_object *cntr; + int query_type; + + if (!BITSET_TEST(m->ActiveCounters[gid], cid)) +continue; It would seem like the extension wou
Re: [Mesa-dev] [PATCH v2 06/15] st/mesa: implement GL_AMD_performance_monitor
On 03/29/2015 11:13 AM, Martin Peres wrote: On 29/03/2015 04:02, Marek Olšák wrote: On Sat, Mar 28, 2015 at 9:43 PM, Martin Peres wrote: On 22/03/2015 17:35, Samuel Pitoiset wrote: From: Christoph Bumiller This is based on the original patch of Christoph Bumiller. (source: http://people.freedesktop.org/~chrisbmr/perfmon.diff) It would be nice if you could add "v2: Samuel Pitoiset" and tell what you changed. Christoph may delete his perfmon.diff and no-one will be able to diff the diffs :) As for the Gallium HUD, we keep a list of busy queries in a ring buffer in order to prevent stalls when reading queries. Drivers must implement get_driver_query_group_info and get_driver_query_info in order to enable this extension. Signed-off-by: Samuel Pitoiset --- src/mesa/Makefile.sources | 2 + src/mesa/state_tracker/st_cb_perfmon.c | 455 + src/mesa/state_tracker/st_cb_perfmon.h | 70 + src/mesa/state_tracker/st_context.c| 4 + src/mesa/state_tracker/st_extensions.c | 3 + 5 files changed, 534 insertions(+) create mode 100644 src/mesa/state_tracker/st_cb_perfmon.c create mode 100644 src/mesa/state_tracker/st_cb_perfmon.h diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources index 217be9a..e54e618 100644 --- a/src/mesa/Makefile.sources +++ b/src/mesa/Makefile.sources @@ -432,6 +432,8 @@ STATETRACKER_FILES = \ state_tracker/st_cb_flush.h \ state_tracker/st_cb_msaa.c \ state_tracker/st_cb_msaa.h \ + state_tracker/st_cb_perfmon.c \ + state_tracker/st_cb_perfmon.h \ state_tracker/st_cb_program.c \ state_tracker/st_cb_program.h \ state_tracker/st_cb_queryobj.c \ diff --git a/src/mesa/state_tracker/st_cb_perfmon.c b/src/mesa/state_tracker/st_cb_perfmon.c new file mode 100644 index 000..fb6774b --- /dev/null +++ b/src/mesa/state_tracker/st_cb_perfmon.c @@ -0,0 +1,455 @@ +/* + * Copyright (C) 2013 Christoph Bumiller + * Copyright (C) 2015 Samuel Pitoiset + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated 
documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +/** + * Performance monitoring counters interface to gallium. + */ + +#include "st_context.h" +#include "st_cb_bitmap.h" +#include "st_cb_perfmon.h" + +#include "util/bitset.h" + +#include "pipe/p_context.h" +#include "pipe/p_screen.h" +#include "util/u_memory.h" + +/** + * Return a PIPE_QUERY_x type >= PIPE_QUERY_DRIVER_SPECIFIC, or -1 if + * the driver-specific query doesn't exist. 
+ */ +static int +find_query_type(struct pipe_screen *screen, const char *name) +{ + int num_queries; + int type = -1; + int i; + + num_queries = screen->get_driver_query_info(screen, 0, NULL); + if (!num_queries) + return type; + + for (i = 0; i < num_queries; i++) { + struct pipe_driver_query_info info; + + if (!screen->get_driver_query_info(screen, i, &info)) + continue; + + if (!strncmp(info.name, name, strlen(name))) { + type = info.query_type; + break; + } + } + return type; +} + +static bool +init_perf_monitor(struct gl_context *ctx, struct gl_perf_monitor_object *m) +{ + struct st_perf_monitor_object *stm = st_perf_monitor_object(m); + struct pipe_screen *screen = st_context(ctx)->pipe->screen; + struct pipe_context *pipe = st_context(ctx)->pipe; + int gid, cid; + + st_flush_bitmap_cache(st_context(ctx)); + + /* Create a query for each active counter. */ + for (gid = 0; gid < ctx->PerfMonitor.NumGroups; gid++) { + const struct gl_perf_monitor_group *g = &ctx->PerfMonitor.Groups[gid]; + for (cid = 0; cid < g->NumCounters; cid++) { + const struct gl_perf_monitor_counter *c = &g->Counters[cid]; + struct st_perf_counter_object *cntr; + int query_type; + + if (!BITSET_TEST(m->ActiveCounters[gid], cid)) +continue; It would seem like the extension would not work with more than 32 counters per
Re: [Mesa-dev] [PATCH v2 06/15] st/mesa: implement GL_AMD_performance_monitor
On 03/28/2015 09:43 PM, Martin Peres wrote:
On 22/03/2015 17:35, Samuel Pitoiset wrote:
From: Christoph Bumiller

This is based on the original patch of Christoph Bumiller.
(source: http://people.freedesktop.org/~chrisbmr/perfmon.diff)

It would be nice if you could add "v2: Samuel Pitoiset" and tell what you changed. Christoph may delete his perfmon.diff and no-one will be able to diff the diffs :)

Good idea!

As for the Gallium HUD, we keep a list of busy queries in a ring buffer
in order to prevent stalls when reading queries.

Drivers must implement get_driver_query_group_info and
get_driver_query_info in order to enable this extension.

Signed-off-by: Samuel Pitoiset
---
 src/mesa/Makefile.sources              |   2 +
 src/mesa/state_tracker/st_cb_perfmon.c | 455 +
 src/mesa/state_tracker/st_cb_perfmon.h |  70 +
 src/mesa/state_tracker/st_context.c    |   4 +
 src/mesa/state_tracker/st_extensions.c |   3 +
 5 files changed, 534 insertions(+)
 create mode 100644 src/mesa/state_tracker/st_cb_perfmon.c
 create mode 100644 src/mesa/state_tracker/st_cb_perfmon.h

diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
index 217be9a..e54e618 100644
--- a/src/mesa/Makefile.sources
+++ b/src/mesa/Makefile.sources
@@ -432,6 +432,8 @@ STATETRACKER_FILES = \
 	state_tracker/st_cb_flush.h \
 	state_tracker/st_cb_msaa.c \
 	state_tracker/st_cb_msaa.h \
+	state_tracker/st_cb_perfmon.c \
+	state_tracker/st_cb_perfmon.h \
 	state_tracker/st_cb_program.c \
 	state_tracker/st_cb_program.h \
 	state_tracker/st_cb_queryobj.c \
diff --git a/src/mesa/state_tracker/st_cb_perfmon.c b/src/mesa/state_tracker/st_cb_perfmon.c
new file mode 100644
index 000..fb6774b
--- /dev/null
+++ b/src/mesa/state_tracker/st_cb_perfmon.c
@@ -0,0 +1,455 @@
+/*
+ * Copyright (C) 2013 Christoph Bumiller
+ * Copyright (C) 2015 Samuel Pitoiset
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/**
+ * Performance monitoring counters interface to gallium.
+ */
+
+#include "st_context.h"
+#include "st_cb_bitmap.h"
+#include "st_cb_perfmon.h"
+
+#include "util/bitset.h"
+
+#include "pipe/p_context.h"
+#include "pipe/p_screen.h"
+#include "util/u_memory.h"
+
+/**
+ * Return a PIPE_QUERY_x type >= PIPE_QUERY_DRIVER_SPECIFIC, or -1 if
+ * the driver-specific query doesn't exist.
+ */
+static int
+find_query_type(struct pipe_screen *screen, const char *name)
+{
+   int num_queries;
+   int type = -1;
+   int i;
+
+   num_queries = screen->get_driver_query_info(screen, 0, NULL);
+   if (!num_queries)
+      return type;
+
+   for (i = 0; i < num_queries; i++) {
+      struct pipe_driver_query_info info;
+
+      if (!screen->get_driver_query_info(screen, i, &info))
+         continue;
+
+      if (!strncmp(info.name, name, strlen(name))) {
+         type = info.query_type;
+         break;
+      }
+   }
+   return type;
+}
+
+static bool
+init_perf_monitor(struct gl_context *ctx, struct gl_perf_monitor_object *m)
+{
+   struct st_perf_monitor_object *stm = st_perf_monitor_object(m);
+   struct pipe_screen *screen = st_context(ctx)->pipe->screen;
+   struct pipe_context *pipe = st_context(ctx)->pipe;
+   int gid, cid;
+
+   st_flush_bitmap_cache(st_context(ctx));
+
+   /* Create a query for each active counter. */
+   for (gid = 0; gid < ctx->PerfMonitor.NumGroups; gid++) {
+      const struct gl_perf_monitor_group *g = &ctx->PerfMonitor.Groups[gid];
+      for (cid = 0; cid < g->NumCounters; cid++) {
+         const struct gl_perf_monitor_counter *c = &g->Counters[cid];
+         struct st_perf_counter_object *cntr;
+         int query_type;
+
+         if (!BITSET_TEST(m->ActiveCounters[gid], cid))
+            continue;

It would seem like the extension would not work with more than 32 counters per group. This certainly is not a problem on the NVIDIA side but it may become a problem for another G
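On the 32-counter question: whether such a limit exists depends on how the active-counter bitset is stored. A bitset kept as an array of 32-bit words is not limited to 32 entries, because each bit index is split into a word index and a bit offset. The following is a minimal sketch of that idea. The `SKETCH_*` macros and `sketch_*` functions are hypothetical names written for illustration; they are not the definitions from Mesa's `util/bitset.h`, and the sketch makes no claim about how `ActiveCounters` is allocated in this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical multi-word bitset helpers (illustration only). */
#define SKETCH_WORDBITS 32u
#define SKETCH_WORDS(nbits) (((nbits) + SKETCH_WORDBITS - 1u) / SKETCH_WORDBITS)

/* Set one bit; the word is selected by bit / 32, the offset by bit % 32. */
static void
sketch_bitset_set(uint32_t *words, unsigned bit)
{
   words[bit / SKETCH_WORDBITS] |= (uint32_t)1 << (bit % SKETCH_WORDBITS);
}

/* Return 1 if the bit is set, 0 otherwise. */
static int
sketch_bitset_test(const uint32_t *words, unsigned bit)
{
   return (int)((words[bit / SKETCH_WORDBITS] >> (bit % SKETCH_WORDBITS)) & 1u);
}

/* Count the active counters among the first num_counters bits,
 * walking the ids one by one as a "for (cid = 0; ...)" loop would. */
static unsigned
sketch_count_active(const uint32_t *words, unsigned num_counters)
{
   unsigned cid, n = 0;

   assert(words != NULL);
   for (cid = 0; cid < num_counters; cid++) {
      if (sketch_bitset_test(words, cid))
         n++;
   }
   return n;
}
```

With 70 counters in a group such a set needs three words, and counters 40 and 69 are tested exactly like counter 3. A limit of 32 would only apply if the set were a single word.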
[Mesa-dev] [PATCH 2/2] xmlpool: remove the clean target
... by folding it into CLEANFILES.

Don't worry about $(LANG) as it is essentially the first folder of
$(POS). With the latter already handled.

Signed-off-by: Emil Velikov
---
 src/mesa/drivers/dri/common/xmlpool/Makefile.am | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/common/xmlpool/Makefile.am b/src/mesa/drivers/dri/common/xmlpool/Makefile.am
index 9700499..a6f1652 100644
--- a/src/mesa/drivers/dri/common/xmlpool/Makefile.am
+++ b/src/mesa/drivers/dri/common/xmlpool/Makefile.am
@@ -61,12 +61,10 @@ EXTRA_DIST = \
 	SConscript
 
 BUILT_SOURCES = options.h
-CLEANFILES = $(MOS) options.h
-
-# All generated files are cleaned up.
-clean:
-	-rm -f $(POT) options.h *~
-	-rm -rf $(LANGS)
+CLEANFILES = \
+	options.h
+	$(POS) \
+	$(MOS)
 
 # Default target options.h
 options.h: LOCALEDIR := .
-- 
2.3.1
[Mesa-dev] [PATCH 1/2] xmlpool: don't forget to ship the MOS
This will allow us to finally remove python from the build time
dependencies list. Considering that you're building from a release
tarball of course :-)

Cc: Bernd Kuhls
Reported-by: Bernd Kuhls
Cc: "10.5"
Signed-off-by: Emil Velikov
---
 src/mesa/drivers/dri/common/xmlpool/Makefile.am | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/common/xmlpool/Makefile.am b/src/mesa/drivers/dri/common/xmlpool/Makefile.am
index 5557716..9700499 100644
--- a/src/mesa/drivers/dri/common/xmlpool/Makefile.am
+++ b/src/mesa/drivers/dri/common/xmlpool/Makefile.am
@@ -52,7 +52,14 @@ POT=xmlpool.pot
 
 .PHONY: all clean pot po mo
 
-EXTRA_DIST = gen_xmlpool.py options.h t_options.h $(POS) SConscript
+EXTRA_DIST = \
+	gen_xmlpool.py \
+	options.h \
+	t_options.h \
+	$(POS) \
+	$(MOS) \
+	SConscript
+
 BUILT_SOURCES = options.h
 CLEANFILES = $(MOS) options.h
 
-- 
2.3.1
Re: [Mesa-dev] [PATCH 2/2] configure.ac: error out if python/mako is not found when required
On 28 March 2015 at 23:51, Bernd Kuhls wrote:
> Hi,
>
> Emil Velikov wrote in
> news:1427132964-21468-2-git-send-email-emil.l.veli...@gmail.com:
>
>> In case of using a distribution tarball (or a dirty git tree) one can
>> have the generated sources locally. Make configure.ac error out
>> otherwise, to alert that about the unmet requirement(s) of python/mako.
>
> [...]
>
>> +if test "x$acv_mako_found" = xno; then
>> +if test ! -f "$srcdir/src/glsl/nir/nir_builder_opcodes.h" -o \
>
> I cannot find any reference to this file in the mesa3d 10.5.2 tarball. Is
> it safe to assume that the check for this file can be removed from the
> patch when applied to the 10.5 branch?
>
Indeed. I have noticed in another commit that I have overestimated when
nir_builder_opcodes.h landed.

> When python is missing on the build machine there is still an error using
> these configure options:
>
> --disable-glx --disable-xa --disable-static --enable-shared-glapi
> --with-gallium-drivers=nouveau,r600,svga,swrast --without-dri-drivers
> --disable-dri3 --enable-opengl --enable-gbm --enable-egl
> --with-egl-platforms=drm --enable-gles1 --enable-gles2
>
> make[7]: Entering directory
> `/home/br/br3/output/build/mesa3d-10.5.2/src/mesa/drivers/dri/common/xmlpool'
> Updating (ca) ca/LC_MESSAGES/options.mo from ca.po.
> Updating (de) de/LC_MESSAGES/options.mo from de.po.
> Updating (es) es/LC_MESSAGES/options.mo from es.po.
> Updating (nl) nl/LC_MESSAGES/options.mo from nl.po.
> Updating (fr) fr/LC_MESSAGES/options.mo from fr.po.
> Updating (sv) sv/LC_MESSAGES/options.mo from sv.po.
> GEN options.h
> /bin/bash: ./gen_xmlpool.py: Permission denied
> make[7]: *** [options.h] Error 126
>
> The reason is src/mesa/drivers/dri/common/xmlpool/Makefile.am, line 66
>
> options.h: t_options.h $(MOS)
>
> Files mentioned in $(MOS) are newer after their creation during the build
> than the pre-supplied options.h.
>
> This hack fixes the build error here:
>
> -options.h: t_options.h $(MOS)
> +options.h: t_options.h
>
Grr, forgot about this one. Upon a closer look it seems that the MOS are
missing from the tarball, causing all this. Will give it a test and send
out a patch.

Thanks.
Emil
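The failure described above follows from make's timestamp rule: a target is rebuilt when it is missing or when any prerequisite has a newer modification time than the target. A minimal sketch of that decision is shown here, assuming plain integer timestamps. `needs_rebuild` and its parameters are hypothetical names chosen for illustration and are not part of make or Mesa.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if the target must be rebuilt, 0 otherwise.
 * A negative target_mtime stands for "target does not exist". */
static int
needs_rebuild(long target_mtime, const long *prereq_mtimes, size_t n)
{
   size_t i;

   assert(n == 0 || prereq_mtimes != NULL);

   if (target_mtime < 0)
      return 1;

   /* Any prerequisite newer than the target makes it out of date. */
   for (i = 0; i < n; i++) {
      if (prereq_mtimes[i] > target_mtime)
         return 1;
   }
   return 0;
}
```

If the shipped options.h has timestamp 200 and the .mo files are generated during the build at 300, the rule fires and gen_xmlpool.py is invoked, which is where the build breaks without python. With only t_options.h (timestamp 100) as a prerequisite, nothing is rebuilt.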
Re: [Mesa-dev] [PATCH 14/16] android: add inital NIR build
On 29 March 2015 at 04:17, Kenneth Graunke wrote:
> On Sunday, March 29, 2015 12:14:50 AM Emil Velikov wrote:
>> On 28 March 2015 at 20:54, Emil Velikov wrote:
>> > From: Mauro Rossi
>> >
>> > Required by the i965 driver.
>> >
>> > Cc: "10.5"
>> > [Emil Velikov: Split from a larger commit]
>> > Signed-off-by: Emil Velikov
>> > ---
>> >  src/glsl/Android.gen.mk         | 62 +++--
>> >  src/glsl/Android.mk             |  3 +-
>> >  src/mesa/drivers/dri/Android.mk |  1 +
>> >  3 files changed, 63 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/src/glsl/Android.gen.mk b/src/glsl/Android.gen.mk
>> > index 7ec56d4..82f2bf1 100644
>> > --- a/src/glsl/Android.gen.mk
>> > +++ b/src/glsl/Android.gen.mk
>> > @@ -33,11 +33,21 @@ sources := \
>> >         glsl_lexer.cpp \
>> >         glsl_parser.cpp \
>> >         glcpp/glcpp-lex.c \
>> > -       glcpp/glcpp-parse.c
>> > +       glcpp/glcpp-parse.c \
>> > +       nir/nir_builder_opcodes.h \
>>
>> Seems like the nir_builder_opcodes.h addition came after the 10.5
>> branchpoint. So in order to get this for 10.5 we'll need to split them
>> out into a separate patch.
>>
>> -Emil
>
> Building NIR on 10.5 isn't really worth doing - the version we've got in
> master now is pretty solid, but the version in 10.5 offers only bugs,
> and no actual performance benefits.
>
> I'd just skip it, honestly - I've actually thought about just patching
> it out of 10.5 so people don't report bugs against outdated code.
>
> We're aiming to have NIR up and running in 10.6.
>
Hi Ken,

Tbh I would be fine with either solution: patching NIR out of the build
for 10.5 or adding build support for Android. Both will allow us to have
an i965 dri module without unresolved dri symbols, although I would
assume that the latter option will be shorter/easier.

About the functionality (bugs), I would say that if people explicitly set
the env variable then they are asking for what they deserve ;-)

Cheers,
Emil
Re: [Mesa-dev] GL_TEXTURE_2D to wl_buffer
Hi, thank you for the help, I will test.

The only reason for trying to use the texture to wl_buffer directly was
just to get something working and then work back from there to see if it
was an application side issue or not. Also just a fun experiment.

On Sun, Mar 29, 2015 at 6:04 AM, Jason Ekstrand wrote:
> On Sat, Mar 28, 2015 at 6:57 AM, x414e54 wrote:
> >
> > I was originally blitting the texture to the default framebuffer and then
> > trying to use eglSwapBuffers. But for some reason eglSwapBuffers was
> > returning EGL_BAD_SURFACE even though eglMakeCurrent had no errors.
>
> Can you render to it? That sounds like something is wrong in the way
> you're setting up your EGLSurface. If you can render you should be able to
> blit.
> --Jason
>
If I call eglSwapBuffers just after the context creation it works fine
and the buffer is committed to the Wayland surface.

Making the context current and performing the glBlit also works fine
(tested using glReadPixels). eglQuerySurface also seems to work fine.

However, eglSwapInterval and eglSwapBuffers return EGL_BAD_CONTEXT and
EGL_BAD_SURFACE respectively.

The code is all run in one function and while holding a mutex, to prevent
the context becoming current on another thread.
Re: [Mesa-dev] [PATCH v2 06/15] st/mesa: implement GL_AMD_performance_monitor
On 29/03/2015 04:02, Marek Olšák wrote: On Sat, Mar 28, 2015 at 9:43 PM, Martin Peres wrote: On 22/03/2015 17:35, Samuel Pitoiset wrote: From: Christoph Bumiller This is based on the original patch of Christoph Bumiller. (source: http://people.freedesktop.org/~chrisbmr/perfmon.diff) It would be nice if you could add "v2: Samuel Pitoiset" and tell what you changed. Christoph may delete his perfmon.diff and no-one will be able to diff the diffs :) As for the Gallium HUD, we keep a list of busy queries in a ring buffer in order to prevent stalls when reading queries. Drivers must implement get_driver_query_group_info and get_driver_query_info in order to enable this extension. Signed-off-by: Samuel Pitoiset --- src/mesa/Makefile.sources | 2 + src/mesa/state_tracker/st_cb_perfmon.c | 455 + src/mesa/state_tracker/st_cb_perfmon.h | 70 + src/mesa/state_tracker/st_context.c| 4 + src/mesa/state_tracker/st_extensions.c | 3 + 5 files changed, 534 insertions(+) create mode 100644 src/mesa/state_tracker/st_cb_perfmon.c create mode 100644 src/mesa/state_tracker/st_cb_perfmon.h diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources index 217be9a..e54e618 100644 --- a/src/mesa/Makefile.sources +++ b/src/mesa/Makefile.sources @@ -432,6 +432,8 @@ STATETRACKER_FILES = \ state_tracker/st_cb_flush.h \ state_tracker/st_cb_msaa.c \ state_tracker/st_cb_msaa.h \ + state_tracker/st_cb_perfmon.c \ + state_tracker/st_cb_perfmon.h \ state_tracker/st_cb_program.c \ state_tracker/st_cb_program.h \ state_tracker/st_cb_queryobj.c \ diff --git a/src/mesa/state_tracker/st_cb_perfmon.c b/src/mesa/state_tracker/st_cb_perfmon.c new file mode 100644 index 000..fb6774b --- /dev/null +++ b/src/mesa/state_tracker/st_cb_perfmon.c @@ -0,0 +1,455 @@ +/* + * Copyright (C) 2013 Christoph Bumiller + * Copyright (C) 2015 Samuel Pitoiset + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to 
deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +/** + * Performance monitoring counters interface to gallium. + */ + +#include "st_context.h" +#include "st_cb_bitmap.h" +#include "st_cb_perfmon.h" + +#include "util/bitset.h" + +#include "pipe/p_context.h" +#include "pipe/p_screen.h" +#include "util/u_memory.h" + +/** + * Return a PIPE_QUERY_x type >= PIPE_QUERY_DRIVER_SPECIFIC, or -1 if + * the driver-specific query doesn't exist. 
+ */ +static int +find_query_type(struct pipe_screen *screen, const char *name) +{ + int num_queries; + int type = -1; + int i; + + num_queries = screen->get_driver_query_info(screen, 0, NULL); + if (!num_queries) + return type; + + for (i = 0; i < num_queries; i++) { + struct pipe_driver_query_info info; + + if (!screen->get_driver_query_info(screen, i, &info)) + continue; + + if (!strncmp(info.name, name, strlen(name))) { + type = info.query_type; + break; + } + } + return type; +} + +static bool +init_perf_monitor(struct gl_context *ctx, struct gl_perf_monitor_object *m) +{ + struct st_perf_monitor_object *stm = st_perf_monitor_object(m); + struct pipe_screen *screen = st_context(ctx)->pipe->screen; + struct pipe_context *pipe = st_context(ctx)->pipe; + int gid, cid; + + st_flush_bitmap_cache(st_context(ctx)); + + /* Create a query for each active counter. */ + for (gid = 0; gid < ctx->PerfMonitor.NumGroups; gid++) { + const struct gl_perf_monitor_group *g = &ctx->PerfMonitor.Groups[gid]; + for (cid = 0; cid < g->NumCounters; cid++) { + const struct gl_perf_monitor_counter *c = &g->Counters[cid]; + struct st_perf_counter_object *cntr; + int query_type; + + if (!BITSET_TEST(m->ActiveCounters[gid], cid)) +continue; It would seem like the extension would not work with more than 32 counters per group. This certainly is not a problem on the NVIDIA side b