Re: [Mesa-dev] [PATCH 10/12] i965/fs: Reimplement emit_texture() in terms of logical send messages.
On Wed, Jul 22, 2015 at 12:41 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: On Sat, Jul 18, 2015 at 7:34 AM, Francisco Jerez curroje...@riseup.net wrote: --- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 66 +--- 1 file changed, 49 insertions(+), 17 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 89fcc49..4011639 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -861,6 +861,14 @@ fs_visitor::emit_texture(ir_texture_opcode op, } } + if (op == ir_query_levels) { + /* textureQueryLevels() is implemented in terms of TXS so we need to + * pass a valid LOD argument. + */ + assert(lod.file == BAD_FILE); + lod = fs_reg(0u); Why can't this go as part of the switch below? It seems kind of odd to split the logic up like this. What I had in mind originally was to factor out the switch statement below into a separate function -- Later on I noticed that you wanted to rework emit_texture() and friends to stop using ir_texture_opcode as representation for the texturing operation, so you would likely want to get rid of the whole switch-case statement below or replace it with something else, so I didn't bother. What are your current plans for emit_texture()? At this point, we may just roll it into nir_emit_texture or we may leave it alone. In any case, we will want to get rid of the ir_texture_opcode eventually. Feel free to leave this patch as is. I can merge things or whatever as needed whenever we do that refactor. + } + if (coordinate.file != BAD_FILE) { /* FINISHME: Texture coordinate rescaling doesn't work with non-constant * samplers. This should only be a problem with GL_CLAMP on Gen7. @@ -873,26 +881,50 @@ fs_visitor::emit_texture(ir_texture_opcode op, * samples, so don't worry about them. */ fs_reg dst = vgrf(glsl_type::get_instance(dest_type-base_type, 4, 1)); + const fs_reg srcs[] = { + coordinate, shadow_c, lod, lod2, + sample_index, mcs, sampler_reg, offset_value, + fs_reg(coord_components), fs_reg(grad_components) + }; + enum opcode opcode; - if (devinfo-gen = 7) { - inst = emit_texture_gen7(op, dst, coordinate, coord_components, - shadow_c, lod, lod2, grad_components, - sample_index, mcs, sampler_reg, - offset_value); - } else if (devinfo-gen = 5) { - inst = emit_texture_gen5(op, dst, coordinate, coord_components, - shadow_c, lod, lod2, grad_components, - sample_index, sampler, - offset_value.file != BAD_FILE); - } else if (dispatch_width == 16) { - inst = emit_texture_gen4_simd16(op, dst, coordinate, coord_components, - shadow_c, lod, sampler); - } else { - inst = emit_texture_gen4(op, dst, coordinate, coord_components, - shadow_c, lod, lod2, grad_components, - sampler); + switch (op) { + case ir_tex: + opcode = SHADER_OPCODE_TEX_LOGICAL; + break; + case ir_txb: + opcode = FS_OPCODE_TXB_LOGICAL; + break; + case ir_txl: + opcode = SHADER_OPCODE_TXL_LOGICAL; + break; + case ir_txd: + opcode = SHADER_OPCODE_TXD_LOGICAL; + break; + case ir_txf: + opcode = SHADER_OPCODE_TXF_LOGICAL; + break; + case ir_txf_ms: + opcode = SHADER_OPCODE_TXF_CMS_LOGICAL; + break; + case ir_txs: + case ir_query_levels: + opcode = SHADER_OPCODE_TXS_LOGICAL; + break; + case ir_lod: + opcode = SHADER_OPCODE_LOD_LOGICAL; + break; + case ir_tg4: + opcode = (offset_value.file != BAD_FILE offset_value.file != IMM ? +SHADER_OPCODE_TG4_OFFSET_LOGICAL : SHADER_OPCODE_TG4_LOGICAL); + break; + default: + unreachable(not reached); Please choose something more descriptive than not reached. How about invalid texture opcode Sure, I'll fix that, thanks. } + inst = bld.emit(opcode, dst, srcs, ARRAY_SIZE(srcs)); + inst-regs_written = 4 * dispatch_width / 8; + if (shadow_c.file != BAD_FILE) inst-shadow_compare = true; -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] gallium/util: Stop bundling our snprintf implementation.
Use MSVCRT functions instead. Their semantics are slightly different but they can be made to work as expected. Also, use the same code paths for both MSVCRT and MinGW. No testing yet. Just built. https://bugs.freedesktop.org/show_bug.cgi?id=91418 --- src/gallium/auxiliary/Makefile.sources |1 - src/gallium/auxiliary/util/u_snprintf.c | 1480 --- src/gallium/auxiliary/util/u_string.h | 33 +- 3 files changed, 30 insertions(+), 1484 deletions(-) delete mode 100644 src/gallium/auxiliary/util/u_snprintf.c diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources index 62e6b94..3616d88 100644 --- a/src/gallium/auxiliary/Makefile.sources +++ b/src/gallium/auxiliary/Makefile.sources @@ -274,7 +274,6 @@ C_SOURCES := \ util/u_simple_shaders.h \ util/u_slab.c \ util/u_slab.h \ - util/u_snprintf.c \ util/u_split_prim.h \ util/u_sse.h \ util/u_staging.c \ diff --git a/src/gallium/auxiliary/util/u_snprintf.c b/src/gallium/auxiliary/util/u_snprintf.c deleted file mode 100644 index 39e9b70..000 --- a/src/gallium/auxiliary/util/u_snprintf.c +++ /dev/null @@ -1,1480 +0,0 @@ -/* - * Copyright (c) 1995 Patrick Powell. - * - * This code is based on code written by Patrick Powell papow...@astart.com. - * It may be used for any purpose as long as this notice remains intact on all - * source code distributions. - */ - -/* - * Copyright (c) 2008 Holger Weiss. - * - * This version of the code is maintained by Holger Weiss hol...@jhweiss.de. - * My changes to the code may freely be used, modified and/or redistributed for - * any purpose. It would be nice if additions and fixes to this file (including - * trivial code cleanups) would be sent back in order to let me include them in - * the version available at http://www.jhweiss.de/software/snprintf.html. - * However, this is not a requirement for using or redistributing (possibly - * modified) versions of this file, nor is leaving this notice intact mandatory. - */ - -/* - * History - * - * 2008-01-20 Holger Weiss hol...@jhweiss.de for C99-snprintf 1.1: - * - * Fixed the detection of infinite floating point values on IRIX (and - * possibly other systems) and applied another few minor cleanups. - * - * 2008-01-06 Holger Weiss hol...@jhweiss.de for C99-snprintf 1.0: - * - * Added a lot of new features, fixed many bugs, and incorporated various - * improvements done by Andrew Tridgell tri...@samba.org, Russ Allbery - * r...@stanford.edu, Hrvoje Niksic hnik...@xemacs.org, Damien Miller - * d...@mindrot.org, and others for the Samba, INN, Wget, and OpenSSH - * projects. The additions include: support the e, E, g, G, and - * F conversion specifiers (and use conversion style f or F for the - * still unsupported a and A specifiers); support the hh, ll, j, - * t, and z length modifiers; support the # flag and the (non-C99) - * ' flag; use localeconv(3) (if available) to get both the current - * locale's decimal point character and the separator between groups of - * digits; fix the handling of various corner cases of field width and - * precision specifications; fix various floating point conversion bugs; - * handle infinite and NaN floating point values; don't attempt to write to - * the output buffer (which may be NULL) if a size of zero was specified; - * check for integer overflow of the field width, precision, and return - * values and during the floating point conversion; use the OUTCHAR() macro - * instead of a function for better performance; provide asprintf(3) and - * vasprintf(3) functions; add new test cases. The replacement functions - * have been renamed to use an rpl_ prefix, the function calls in the - * main project (and in this file) must be redefined accordingly for each - * replacement function which is needed (by using Autoconf or other means). - * Various other minor improvements have been applied and the coding style - * was cleaned up for consistency. - * - * 2007-07-23 Holger Weiss hol...@jhweiss.de for Mutt 1.5.13: - * - * C99 compliant snprintf(3) and vsnprintf(3) functions return the number - * of characters that would have been written to a sufficiently sized - * buffer (excluding the '\0'). The original code simply returned the - * length of the resulting output string, so that's been fixed. - * - * 1998-03-05 Michael Elkins m...@mutt.org for Mutt 0.90.8: - * - * The original code assumed that both snprintf(3) and vsnprintf(3) were - * missing. Some systems only have snprintf(3) but not vsnprintf(3), so - * the code is now broken down under HAVE_SNPRINTF and HAVE_VSNPRINTF. - * - * 1998-01-27 Thomas Roessler roess...@does-not-exist.org for Mutt 0.89i: - * - * The PGP code was using unsigned hexadecimal formats. Unfortunately, - * unsigned formats simply didn't work. - * - *
Re: [Mesa-dev] [PATCH] targets/dri: scons: add missing link against libdrm
On 22/07/15 16:04, Emil Velikov wrote: Otherwise the final dri module will have (additional) unresolved symbols. Cc: Brian Paul bri...@vmware.com Cc: Jose Fonseca jfons...@vmware.com Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- We can only fix the remaining unresolved symbols (_glapi_foo), as we remove the non-shared glapi when building with DRI. With this we at least match the autotools build. -Emil src/gallium/targets/dri/SConscript | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/gallium/targets/dri/SConscript b/src/gallium/targets/dri/SConscript index 8d29f3b..2fb0da0 100644 --- a/src/gallium/targets/dri/SConscript +++ b/src/gallium/targets/dri/SConscript @@ -25,6 +25,8 @@ if env['llvm']: env.Append(CPPDEFINES = 'GALLIUM_LLVMPIPE') env.Prepend(LIBS = [llvmpipe]) +env.PkgUseModules('DRM') + env.Append(CPPDEFINES = [ 'GALLIUM_VMWGFX', 'GALLIUM_SOFTPIPE', Reviwed-by: Jose Fonseca jfons...@vmware.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/4] i965/fs: Implement pass to lower instructions of unsupported SIMD width.
On Wed, Jul 22, 2015 at 12:31 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: A few comments below. Mostly just asking for explanation. 1-3 are Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com Obviously, don't merge 4/4 until it actually has users. --Jason Thanks. On Thu, Jul 16, 2015 at 8:35 AM, Francisco Jerez curroje...@riseup.net wrote: This lowering pass implements an algorithm to expand SIMDN instructions into a sequence of SIMDM instructions in cases where the hardware doesn't support the original execution size natively for some particular instruction. The most important use-cases are: - Lowering send message instructions that don't support SIMD16 natively into SIMD8 (several texturing, framebuffer write and typed surface operations). - Lowering messages that don't support SIMD8 natively into SIMD16 (*cough*gen4*cough*). - 64-bit precision operations (e.g. FP64 and 64-bit integer multiplication). - SIMD32. The algorithm works by splitting the sources of the original instruction into chunks of width appropriate for the lowered instructions, and then interleaving the results component-wise into the destination of the original instruction. The pass is controlled by the get_lowered_simd_width() function that currently just returns the original execution size making the whole pass a no-op for the moment until some user is introduced. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 142 +++ src/mesa/drivers/dri/i965/brw_fs.h | 1 + 2 files changed, 143 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index d031352..eeb6938 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3204,6 +3204,147 @@ fs_visitor::lower_logical_sends() return progress; } +/** + * Get the closest native SIMD width supported by the hardware for instruction + * \p inst. The instruction will be left untouched by + * fs_visitor::lower_simd_width() if the returned value is equal to the + * original execution size. + */ +static unsigned +get_lowered_simd_width(const struct brw_device_info *devinfo, + const fs_inst *inst) +{ + switch (inst-opcode) { + default: + return inst-exec_size; + } +} + +/** + * The \p rows array of registers represents a \p num_rows by \p num_columns + * matrix in row-major order, write it in column-major order into the register + * passed as destination. \p stride gives the separation between matrix + * elements in the input in fs_builder::dispatch_width() units. + */ +static void +emit_transpose(const fs_builder bld, + const fs_reg dst, const fs_reg *rows, + unsigned num_rows, unsigned num_columns, unsigned stride) I'm not sure what I think about calling this emit_transpose. I guess it is kind of a transpose, but maybe it's more of a gather. I'm not going to quibble about it though. *Shrug*, it doesn't only gather the vectors of the rows array (that would have been emit_collect :P), it copies them out in vertical, just like a matrix transpose -- assuming you're not horrified by the idea of considering the argument a matrix. That's fine. Feel free to ignore that suggestion. +{ + fs_reg *const components = new fs_reg[num_rows * num_columns]; + + for (unsigned i = 0; i num_columns; ++i) { + for (unsigned j = 0; j num_rows; ++j) + components[num_rows * i + j] = offset(rows[j], bld, stride * i); + } + + bld.LOAD_PAYLOAD(dst, components, num_rows * num_columns, 0); + + delete[] components; +} + +bool +fs_visitor::lower_simd_width() +{ + bool progress = false; + + foreach_block_and_inst_safe(block, fs_inst, inst, cfg) { + const unsigned lower_width = get_lowered_simd_width(devinfo, inst); + + if (lower_width != inst-exec_size) { + /* Builder matching the original instruction. */ + const fs_builder ibld = bld.at(block, inst) +.exec_all(inst-force_writemask_all) +.group(inst-exec_size, inst-force_sechalf); + + /* Split the copies in chunks of the execution width of either the + * original or the lowered instruction, whichever is lower. + */ + const unsigned copy_width = MIN2(lower_width, inst-exec_size); + const unsigned n = inst-exec_size / copy_width; + const unsigned dst_size = inst-regs_written * REG_SIZE / +inst-dst.component_size(inst-exec_size); + fs_reg dsts[4]; + + assert(n 0 n = ARRAY_SIZE(dsts) +!inst-writes_accumulator !inst-mlen); + + for (unsigned i = 0; i n; i++) { +/* Emit a copy of the original instruction with the lowered width. + * If the
Re: [Mesa-dev] [PATCH] glapi: fix argument parsing in glX_proto_recv.py
It should be in master now. -Emil On 21 July 2015 at 23:34, Dylan Baker baker.dyla...@gmail.com wrote: Cool, I was under that impression too, but having it confirmed is great. Since these do matter and I don't have push access, could you make sure this lands? On Jul 21, 2015 15:33, Emil Velikov emil.l.veli...@gmail.com wrote: It seems that I was under the wrong impression on the hole thing. Some of the files generated from these scripts are used in/by xserver. Although they are imported/tracked into version control of the latter in order to prevent the xserver build directly diving into mesa source tree. The XORG_GLAPI_OUTPUTS files on the other hand, are the ones that were removed from xserver with commit be6680967a4(glx: convert to direct GL dispatch (v2)). Unfortunatelly they only cover ~20 lines of makefile rules, thus none of the python scripts can be removed/simplified. Apologies if I mislead anyone. -Emil On 04/07/15 10:42, Emil Velikov wrote: Seems like there is a misunderstanding somewhere - perhaps some on my end. Yet I do recall Adam Jackson stating on numerous occasions, on IRC, that we don't want/need any sources from mesa (outside dri_interface.h and the gl* headers) for xserver. Adam can we truly nuke (some of) the generated sources in mesa, or does X/xserver relies on them ? Cheers, Emil On 3 July 2015 at 21:13, Dylan Baker baker.dyla...@gmail.com wrote: I asked about dumping them and was told that they would leave mesa only when x itself died. If be more than happy to see them go if they're not useful On Jul 3, 2015 06:28, Emil Velikov emil.l.veli...@gmail.com wrote: On 02/07/15 18:25, Dylan Baker wrote: One of the plugins I use with vim helpfully added an underscore to the front of mode for kicks. Obviously this isn't a feature used very often because it's been broken since d986cb7c70db (since May 20th), and no one has noticed. Iirc these have a long and boring history with xserver. The short version is that they (and many other files reciding in XORG_GL{API,X}_DIR) have been dead/unused for a long while. If you want to avoid the rewriting them a few times, I would suggest that we nuke them up first. That's why I've asked about your plans earlier :) -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] mesa: Detect and provide macros for function attributes pure and const.
On 21/07/15 15:57, Emil Velikov wrote: On 18 July 2015 at 08:13, Jose Fonseca jfons...@vmware.com wrote: On 18/07/15 01:38, Eric Anholt wrote: Emil Velikov emil.l.veli...@gmail.com writes: On 14/07/15 19:45, Eric Anholt wrote: These are really useful hints to the compiler in the absence of link-time optimization, and I'm going to use them in VC4. I've made the const attribute be ATTRIBUTE_CONST unlike other function attributes, because we have other things in the tree #defining CONST for their own unrelated purposes. Mindly related: how people feel about making these macros less screamy, by following the approach used in the kernel: PURE - __pure and so on ? I'd love it. Less screamy is fine, but beware prefixing double underscore: the C standard stipulates that its use is reserved for for C/C++ runtime. [1] I though about it before posting although I've seen others define those, even do so in their public headers. Now that I have some examples from my current /usr/include Searching for __pure dwarves/dutil.h:#define __pure __attribute__ ((pure)) Searching for __attribute_const__ sys/cdefs.h:# define __attribute_const__ __attribute__ ((__const__)) sys/cdefs.h:# define __attribute_const__ /* Ignore */ Searching for __printf Searching for __always_unused Searching for __noreturn Searching for __packed libvisual-0.4/libvisual/lv_defines.h:# define __packed __attribute__ ((packed)) libvisual-0.4/libvisual/lv_defines.h:# define __packed /* no packed */ bsd/sys/cdefs.h:# define __packed __attribute__((__packed__)) bsd/sys/cdefs.h:# define __packed Searching for __deprecated pciaccess.h:#define __deprecated __attribute__((deprecated)) pciaccess.h:#define __deprecated Searching for __weak Searching for __alias With a handful of other headers defining more double underscore prefixed macros. Look at stdlibc++ implementation: every internal variable has a double underscore prefix. Unless we're talking about STL/other template library we don't care what library foo uses in it's internal implementation do we ? After all these will be resolved at compile time. Maybe kernel gets away on GLIBC (and because it doesn't use C++), but there's no guarantee it will work on other C runtimes, and even if it does, it could start failing anytime. True, it's not the best of ideas. Just worth pointing out that the cat is already out, for other projects. There are already more than 12K #define __foo cases on my system. These defines are reserved for system headers, so it's natural to be lots of them in /usr/include. MacOSX also defines some of these on its sys/cdefs.h: http://www.opensource.apple.com/source/xnu/xnu-1456.1.26/bsd/sys/cdefs.h The question is: can we expect that most systems will define these __foo, or at least not use them for other purposes. I don't know the answer. At a glance MSVC doesn't seem to rely on them for anything. So it might work. I don't oppose if you want to give it a shot. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 04/12] i965/fs: Pass a BAD_FILE header source to LOAD_PAYLOAD in emit_texture_gen7().
On Wed, Jul 22, 2015 at 12:43 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: On Sat, Jul 18, 2015 at 7:34 AM, Francisco Jerez curroje...@riseup.net wrote: So that it's left uninitialized by LOAD_PAYLOAD, we only need to reserve space for it in the message since it will be initialized implicitly by the generator. --- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 522e13e..89fcc49 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -473,8 +473,9 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, fs_reg dst, if (op == ir_tg4 || offset_value.file != BAD_FILE || is_high_sampler(devinfo, sampler)) { /* For general texture offsets (no txf workaround), we need a header to - * put them in. Note that for SIMD16 we're making space for two actual - * hardware registers here, so the emit will have to fix up for this. + * put them in. Note that we're only reserving space for it in the + * message payload as it will be initialized implicitly by the + * generator. Mind splitting this in two? Fixing the comment above is really unrelated to passing BAD_FILE below. It seemed like the sentence I changed was trying to explain what the line of code I changed was doing. It may sound unrelated because the comment was completely misleading (or just outdated), but it can only refer to the three lines of code below... It does refer to the three lines of code below, but the fact that it's bogus has nothing to do with the change below. It's been wrong for probably 8 months and I've just never noticed. Register allocation used to, in SIMD16 mode, work in terms of pairs of physical registers. In the texturing code, we needed a 1-register header so in SIMD16, we got 2 registers and then the generator knew to put the header in the second half. About 8 months ago, I changed RA to work in terms of single hardware registers even on SIMD16 and this comment has been stale ever since. --Jason * * * ir4_tg4 needs to place its channel select in the header, * for interaction with ARB_texture_swizzle @@ -483,7 +484,7 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, fs_reg dst, * need to offset the Sampler State Pointer in the header. */ header_size = 1; - sources[0] = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD); + sources[0] = fs_reg(); length++; } -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/12] i965/fs: Reimplement emit_single_fb_write() in terms of logical framebuffer writes.
On Wed, Jul 22, 2015 at 12:54 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: On Thu, Jul 16, 2015 at 8:41 AM, Francisco Jerez curroje...@riseup.net wrote: The only non-trivial thing it still has to do is figure out where to take the src/dst depth values from and predicate the instruction if discard is in use. The manual SIMD unrolling logic in the dual-source case goes away because this is now handled transparently by the SIMD lowering pass. --- src/mesa/drivers/dri/i965/brw_fs.h | 5 +- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 203 +++ 2 files changed, 20 insertions(+), 188 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 1ae79a9..64f89d4 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -290,13 +290,10 @@ public: bool optimize_frontfacing_ternary(nir_alu_instr *instr, const fs_reg result); - void setup_color_payload(fs_reg *dst, fs_reg color, unsigned components, -unsigned exec_size, bool use_2nd_half); void emit_alpha_test(); fs_inst *emit_single_fb_write(const brw::fs_builder bld, fs_reg color1, fs_reg color2, - fs_reg src0_alpha, unsigned components, - unsigned exec_size, bool use_2nd_half = false); + fs_reg src0_alpha, unsigned components); void emit_fb_writes(); void emit_urb_writes(); void emit_cs_terminate(); diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index ba4b177..bcfeaa0 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -1409,33 +1409,6 @@ fs_visitor::emit_interpolation_setup_gen6() } } -void -fs_visitor::setup_color_payload(fs_reg *dst, fs_reg color, unsigned components, -unsigned exec_size, bool use_2nd_half) -{ - brw_wm_prog_key *key = (brw_wm_prog_key*) this-key; - fs_inst *inst; - - if (key-clamp_fragment_color) { - fs_reg tmp = vgrf(glsl_type::vec4_type); - assert(color.type == BRW_REGISTER_TYPE_F); - for (unsigned i = 0; i components; i++) { - inst = bld.MOV(offset(tmp, bld, i), offset(color, bld, i)); - inst-saturate = true; - } - color = tmp; - } - - if (exec_size dispatch_width) { - unsigned half_idx = use_2nd_half ? 1 : 0; - for (unsigned i = 0; i components; i++) - dst[i] = half(offset(color, bld, i), half_idx); - } else { - for (unsigned i = 0; i components; i++) - dst[i] = offset(color, bld, i); - } -} - static enum brw_conditional_mod cond_for_alpha_func(GLenum func) { @@ -1493,146 +1466,34 @@ fs_visitor::emit_alpha_test() fs_inst * fs_visitor::emit_single_fb_write(const fs_builder bld, fs_reg color0, fs_reg color1, - fs_reg src0_alpha, unsigned components, - unsigned exec_size, bool use_2nd_half) + fs_reg src0_alpha, unsigned components) { assert(stage == MESA_SHADER_FRAGMENT); brw_wm_prog_data *prog_data = (brw_wm_prog_data*) this-prog_data; - brw_wm_prog_key *key = (brw_wm_prog_key*) this-key; - const fs_builder ubld = bld.group(exec_size, use_2nd_half); - int header_size = 2, payload_header_size; - - /* We can potentially have a message length of up to 15, so we have to set -* base_mrf to either 0 or 1 in order to fit in m0..m15. -*/ - fs_reg *sources = ralloc_array(mem_ctx, fs_reg, 15); - int length = 0; - - /* From the Sandy Bridge PRM, volume 4, page 198: -* -* Dispatched Pixel Enables. One bit per pixel indicating -* which pixels were originally enabled when the thread was -* dispatched. This field is only required for the end-of- -* thread message and on all dual-source messages. -*/ - if (devinfo-gen = 6 - (devinfo-is_haswell || devinfo-gen = 8 || !prog_data-uses_kill) - color1.file == BAD_FILE - key-nr_color_regions == 1) { - header_size = 0; - } - - if (header_size != 0) { - assert(header_size == 2); - /* Allocate 2 registers for a header */ - length += 2; - } - - if (payload.aa_dest_stencil_reg) { - sources[length] = fs_reg(GRF, alloc.allocate(1)); - bld.group(8, 0).exec_all().annotate(FB write stencil/AA alpha) - .MOV(sources[length], - fs_reg(brw_vec8_grf(payload.aa_dest_stencil_reg, 0))); - length++; - } - - if (prog_data-uses_omask) { - sources[length] = fs_reg(GRF,
Re: [Mesa-dev] [PATCH 09/12] i965/fs: Hook up SIMD lowering to handle texturing opcodes of unsupported width.
On Wed, Jul 22, 2015 at 12:55 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: On Sat, Jul 18, 2015 at 7:34 AM, Francisco Jerez curroje...@riseup.net wrote: This should match the set of cases in which we currently call fail() or no16() from the emit_texture_*() methods and the ones in which emit_texture_gen4() enables the SIMD16 workaround. Hint for reviewers: It's not a big deal if I happen to have missed some case here, it will just lead to an assertion failure down the road which is easily fixable, however being stricter than necessary won't cause any visible breakage, it would just decrease performance silently due to the unnecessary message splitting, so feel free to double-check that all cases listed here already cause a SIMD8/16 fall-back with the current texturing code -- You may want to skip over the Gen5-6 cases though if you don't have pencil and paper at hand. I'll believe you. Even if you got it wrong, the worst that happens is we hit the fail case when we try and lower. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 27 +++ 1 file changed, 27 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 043d9e9..f291202 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3918,6 +3918,33 @@ get_lowered_simd_width(const struct brw_device_info *devinfo, /* Dual-source FB writes are unsupported in SIMD16 mode. */ return (inst-src[1].file != BAD_FILE ? 8 : inst-exec_size); + case SHADER_OPCODE_TXD_LOGICAL: + /* TXD is unsupported in SIMD16 mode. */ + return 8; + + case SHADER_OPCODE_TG4_OFFSET_LOGICAL: { + /* gather4_po_c is unsupported in SIMD16 mode. */ + const fs_reg shadow_c = inst-src[1]; + return (shadow_c.file != BAD_FILE ? 8 : inst-exec_size); + } + case SHADER_OPCODE_TXL_LOGICAL: + case FS_OPCODE_TXB_LOGICAL: { + /* Gen4 doesn't have SIMD8 non-shadow-compare bias/LOD instructions, and + * Gen4-6 don't support TXL and TXB with shadow comparison in SIMD16 + * mode. + */ + const fs_reg shadow_c = inst-src[1]; + return (devinfo-gen == 4 shadow_c.file == BAD_FILE ? 16 : + devinfo-gen 7 shadow_c.file != BAD_FILE ? 8 : + inst-exec_size); Could we please use an if-ladder here. Nested ternaries are hard to read and kind of pointless given that we can return multiple places in the if. Heh, I always have the very opposite style itch -- If you can compute something succinctly with a handful of pure expressions why venture into the dangerous realm of statements? Heh... The mathematician in me is tempted to agree with you. However, the programmer in me likes his control-flow to have obvious nesting and indentation. :-) I think John Backus' article Can programming be liberated from the von Neumann style? although slightly outdated elaborates the point pretty well. But sure, if you'd like it better with an if-ladder why not. + } + case SHADER_OPCODE_TXF_LOGICAL: + case SHADER_OPCODE_TXS_LOGICAL: + /* Gen4 doesn't have SIMD8 variants for the RESINFO and LD-with-LOD + * messages. Use SIMD16 instead. + */ + return (devinfo-gen == 4 ? 16 : inst-exec_size); I'd kind of also like this ternary to go, but I'm ok with it if you want to keep it. + default: return inst-exec_size; } -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91425] [regression, bisected] Piglit spec/ext_packed_float/ getteximage-invalid-format-for-packed-type fails
https://bugs.freedesktop.org/show_bug.cgi?id=91425 --- Comment #3 from Ilia Mirkin imir...@alum.mit.edu --- (In reply to Brian Paul from comment #1) Hmm, I can't reproduce that. The test passes completely for me. What driver are you using? I'm testing llvmpipe/softpipe. Odd. I repro with llvmpipe. I guess the issue is the if (depth != 1) { _mesa_error(ctx, GL_INVALID_VALUE, %s(depth = %d), caller, depth); return true; } check for 1d/2d targets. The piglit test does: glGetTexImage(GL_TEXTURE_2D, 0, format, type, pxBuffer); Since it's checking for, roughly speaking, invalid things, I assume that the texImage isn't there, and so get_texture_image_dims will end up setting a depth = 0. Should that be depth = 1 instead? -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 18/78] i965: Take is_scalar_shader_stage() method out to allow reuse
On Wed, Jul 22, 2015 at 4:26 AM, Eduardo Lima Mitev el...@igalia.com wrote: On 07/13/2015 01:38 PM, Jason Ekstrand wrote: On Fri, Jul 10, 2015 at 8:53 AM, Eduardo Lima Mitev el...@igalia.com wrote: On 06/30/2015 06:58 PM, Jason Ekstrand wrote: On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev el...@igalia.com wrote: This patch makes public the is_scalar_shader_stage() method in brw_shader, and renames it to brw_compiler_is_scalar_shader_stage(). The plan is to later reuse it in brw_nir, to enable/disable optimization passes depending on the type of shader stage. I'm not so sure that this is a good plan. It assumes that whether we are doing a scalar or vec4 compile is based entirely on the shader stage and some static information (possibly based on environment variables). Ken and I were talking around the office and we may want to use both SIMD4x2 and SIMD8 mode for geometry shaders depending on the number of inputs, etc. This won't work in the given paradigm. If I understand correctly, what you propose is having a function to dynamically choose the type of shader (scalar vs. vec4) when compiling the shader, using not only gen and stage, but also actual application data. I think this is a good idea and will allow experimenting with different combinations of shaders with real input data. However, I wonder if we this should be added later after more elaborated thoughts on what exactly do we need and where to plug it. I have been experimenting with a function following the use case you mentioned, choosing shader backend based on inputs to a GS. But honestly it feels like a blind guess from my side to what actually you and Ken have in mind. Current patch basically reuses the function we already have to select the shader, so what I propose is to move forward with it for the moment (adding the missing MESA_SHADER_COMPUTE) and discuss how to extend it to factor in dynamic data; or perhaps you can explain us your proposal with a bit more detail. WDYT? My primary issue was with having it based on a is_scalar_stage function at all. It seems like a better long-term plan would be to simply pass an is_scalar boolean into those lowering functions. That said, I haven't taken a real hard look as to what calls those functions and whether or not that's actually what we want either. Make sense? If there's no better place to make that determination further up the call chain, then this is fine for now. Yes, I like the idea of passing the is_scalar boolean to brw_create_nir() directly, so that this decision is moved up to a higher context. By doing this we also leave the is_scalar_shader_stage() function private to brw_shader.cpp, meaning this patch is not relevant anymore and we leave that function untouched. A patch illustrating these changes is at: https://github.com/Igalia/mesa/commit/9e7bc62d43eaa30de005c40b91551ad113c4065c In the future, if we want to select shader type (scalar vs. vector) based on dynamic info, the logic that needs changing is contained in brw_shader. But this patch doesn't address that yet, lacking a more defined use-case. My (wild) idea would be to have a preferred_shader_type variable in brw_context for example, that can be modifiable dynamically at any point, and will instruct the link phase to use scalar or vec4 if possible (depending on gen, stage, etc). What do you think? Looks fine to me. The new method accepts a brw_compiler instead of a brw_context. This is done for consistency, since the actual info we need (scalar_vs) is in brw_compiler, and fetching in through brw_content-intelScreen-compiler seems like too much indirection. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580 --- src/mesa/drivers/dri/i965/brw_shader.cpp | 22 ++ src/mesa/drivers/dri/i965/brw_shader.h | 13 + 2 files changed, 19 insertions(+), 16 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp index 0b53647..3b99046 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp @@ -182,19 +182,6 @@ brw_shader_precompile(struct gl_context *ctx, return true; } -static inline bool -is_scalar_shader_stage(struct brw_context *brw, int stage) -{ - switch (stage) { - case MESA_SHADER_FRAGMENT: - return true; - case MESA_SHADER_VERTEX: - return brw-intelScreen-compiler-scalar_vs; - default: - return false; - } -} - static void brw_lower_packing_builtins(struct brw_context *brw, gl_shader_stage shader_type, @@ -205,7 +192,8 @@ brw_lower_packing_builtins(struct brw_context *brw, | LOWER_PACK_UNORM_2x16 | LOWER_UNPACK_UNORM_2x16; - if (is_scalar_shader_stage(brw, shader_type)) { + if (brw_compiler_is_scalar_shader_stage(brw-intelScreen-compiler, +
Re: [Mesa-dev] [PATCH 03/20] i965/fs: Implement lowering of logical surface instructions.
On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote: --- src/mesa/drivers/dri/i965/brw_fs.cpp | 63 +++- 1 file changed, 55 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index a996676..0b0c5e1 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3881,6 +3881,20 @@ lower_sampler_logical_send(const fs_builder bld, fs_inst *inst, opcode op) } } +/** + * Initialize the header present in some typed and untyped surface + * messages. + */ +static fs_reg +emit_surface_header(const fs_builder bld, const fs_reg sample_mask) +{ + fs_builder ubld = bld.exec_all().group(8, 0); + const fs_reg dst = ubld.vgrf(BRW_REGISTER_TYPE_UD); + ubld.MOV(dst, fs_reg(0)); + ubld.MOV(component(dst, 7), sample_mask); + return dst; +} + static void lower_surface_logical_send(const fs_builder bld, fs_inst *inst, opcode op, const fs_reg sample_mask) @@ -3889,10 +3903,43 @@ lower_surface_logical_send(const fs_builder bld, fs_inst *inst, opcode op, const fs_reg addr = inst-src[0]; const fs_reg src = inst-src[1]; const fs_reg surface = inst-src[2]; - const fs_reg dims = inst-src[3]; + const UNUSED fs_reg dims = inst-src[3]; const fs_reg arg = inst-src[4]; - assert(!Not implemented); + /* Calculate the total number of components of the payload. */ + const unsigned addr_sz = inst-components_read(0); + const unsigned src_sz = inst-components_read(1); + const unsigned header_sz = (sample_mask.file == BAD_FILE ? 0 : 1); + const unsigned sz = header_sz + addr_sz + src_sz; + + /* Allocate space for the payload. */ + fs_reg *const components = new fs_reg[sz]; + const fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, sz); + unsigned n = 0; + + /* Construct the payload. */ + if (header_sz) + components[n++] = emit_surface_header(bld, sample_mask); + + for (unsigned i = 0; i addr_sz; i++) + components[n++] = offset(addr, bld, i); + + for (unsigned i = 0; i src_sz; i++) + components[n++] = offset(src, bld, i); + + bld.LOAD_PAYLOAD(payload, components, sz, header_sz); + + /* Update the original instruction. */ + inst-opcode = op; + inst-mlen = header_sz + (addr_sz + src_sz) * inst-exec_size / 8; + inst-header_size = header_sz; + + inst-src[0] = payload; + inst-src[1] = surface; + inst-src[2] = arg; + inst-resize_sources(3); + + delete[] components; I don't think this is correct. Unless something has changed, the fs_inst constructor that takes an array doesn't make a copy so you have to ralloc the array and *not* delete it. } bool @@ -3965,37 +4012,37 @@ fs_visitor::lower_logical_sends() case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_UNTYPED_SURFACE_READ, -fs_reg()); +fs_reg(0x)); break; case SHADER_OPCODE_UNTYPED_SURFACE_WRITE_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_UNTYPED_SURFACE_WRITE, -fs_reg()); +ibld.sample_mask_reg()); break; case SHADER_OPCODE_UNTYPED_ATOMIC_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_UNTYPED_ATOMIC, -fs_reg()); +ibld.sample_mask_reg()); break; case SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_TYPED_SURFACE_READ, -fs_reg()); +fs_reg(0x)); break; case SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_TYPED_SURFACE_WRITE, -fs_reg()); +ibld.sample_mask_reg()); break; case SHADER_OPCODE_TYPED_ATOMIC_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_TYPED_ATOMIC, -fs_reg()); +ibld.sample_mask_reg()); break; default: -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org
Re: [Mesa-dev] [PATCH 2/2] mesa: rearrange texture error checking order
On 07/22/2015 11:02 AM, Ilia Mirkin wrote: This moves the width/height/depth == 0 check to the front and avoids doing any other checking when that is the case. Also moves the dimensions check after the format/type checks so that we don't bail out with success on a width/height/depth == 0 request when the format/type don't match. Bugzilla: https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.freedesktop.org_show-5Fbug.cgi-3Fid-3D91425d=BQIBAgc=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEsr=T0t4QG7chq2ZwJo6wilkFznRSFy-8uDKartPGbomVj8m=Y1TotaJmX1E7sCopvmMJc8PQKFXlHu6QxfcYOydKuIIs=L0lmmuk31G6M4hZEiwnX-IC4CPhW7rsrSCgEYtDgXjwe= Signed-off-by: Ilia Mirkin imir...@alum.mit.edu I suspect this isn't really needed in light of my patch to getteximage-invalid-format-for-packed-type.c I believe the test was in error, or at least sloppy, in that it was calling glGetTexImage to test format/type validation when there wasn't even a texture image to return. The OpenGL spec doesn't specify an order for error checking multiple things, so if there's multiple errors, you can't be sure which one will be reported first. My patch to the test removes that ambiguity. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/4] i965/fs: Implement pass to lower instructions of unsupported SIMD width.
Jason Ekstrand ja...@jlekstrand.net writes: On Wed, Jul 22, 2015 at 12:31 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: A few comments below. Mostly just asking for explanation. 1-3 are Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com Obviously, don't merge 4/4 until it actually has users. --Jason Thanks. On Thu, Jul 16, 2015 at 8:35 AM, Francisco Jerez curroje...@riseup.net wrote: This lowering pass implements an algorithm to expand SIMDN instructions into a sequence of SIMDM instructions in cases where the hardware doesn't support the original execution size natively for some particular instruction. The most important use-cases are: - Lowering send message instructions that don't support SIMD16 natively into SIMD8 (several texturing, framebuffer write and typed surface operations). - Lowering messages that don't support SIMD8 natively into SIMD16 (*cough*gen4*cough*). - 64-bit precision operations (e.g. FP64 and 64-bit integer multiplication). - SIMD32. The algorithm works by splitting the sources of the original instruction into chunks of width appropriate for the lowered instructions, and then interleaving the results component-wise into the destination of the original instruction. The pass is controlled by the get_lowered_simd_width() function that currently just returns the original execution size making the whole pass a no-op for the moment until some user is introduced. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 142 +++ src/mesa/drivers/dri/i965/brw_fs.h | 1 + 2 files changed, 143 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index d031352..eeb6938 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3204,6 +3204,147 @@ fs_visitor::lower_logical_sends() return progress; } +/** + * Get the closest native SIMD width supported by the hardware for instruction + * \p inst. The instruction will be left untouched by + * fs_visitor::lower_simd_width() if the returned value is equal to the + * original execution size. + */ +static unsigned +get_lowered_simd_width(const struct brw_device_info *devinfo, + const fs_inst *inst) +{ + switch (inst-opcode) { + default: + return inst-exec_size; + } +} + +/** + * The \p rows array of registers represents a \p num_rows by \p num_columns + * matrix in row-major order, write it in column-major order into the register + * passed as destination. \p stride gives the separation between matrix + * elements in the input in fs_builder::dispatch_width() units. + */ +static void +emit_transpose(const fs_builder bld, + const fs_reg dst, const fs_reg *rows, + unsigned num_rows, unsigned num_columns, unsigned stride) I'm not sure what I think about calling this emit_transpose. I guess it is kind of a transpose, but maybe it's more of a gather. I'm not going to quibble about it though. *Shrug*, it doesn't only gather the vectors of the rows array (that would have been emit_collect :P), it copies them out in vertical, just like a matrix transpose -- assuming you're not horrified by the idea of considering the argument a matrix. That's fine. Feel free to ignore that suggestion. +{ + fs_reg *const components = new fs_reg[num_rows * num_columns]; + + for (unsigned i = 0; i num_columns; ++i) { + for (unsigned j = 0; j num_rows; ++j) + components[num_rows * i + j] = offset(rows[j], bld, stride * i); + } + + bld.LOAD_PAYLOAD(dst, components, num_rows * num_columns, 0); + + delete[] components; +} + +bool +fs_visitor::lower_simd_width() +{ + bool progress = false; + + foreach_block_and_inst_safe(block, fs_inst, inst, cfg) { + const unsigned lower_width = get_lowered_simd_width(devinfo, inst); + + if (lower_width != inst-exec_size) { + /* Builder matching the original instruction. */ + const fs_builder ibld = bld.at(block, inst) +.exec_all(inst-force_writemask_all) +.group(inst-exec_size, inst-force_sechalf); + + /* Split the copies in chunks of the execution width of either the + * original or the lowered instruction, whichever is lower. + */ + const unsigned copy_width = MIN2(lower_width, inst-exec_size); + const unsigned n = inst-exec_size / copy_width; + const unsigned dst_size = inst-regs_written * REG_SIZE / +inst-dst.component_size(inst-exec_size); + fs_reg dsts[4]; + + assert(n 0 n = ARRAY_SIZE(dsts) +!inst-writes_accumulator !inst-mlen); + + for (unsigned i = 0; i n; i++) { +/* Emit a copy of the original
Re: [Mesa-dev] [PATCH 03/20] i965/fs: Implement lowering of logical surface instructions.
On Wed, Jul 22, 2015 at 10:02 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote: --- src/mesa/drivers/dri/i965/brw_fs.cpp | 63 +++- 1 file changed, 55 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index a996676..0b0c5e1 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3881,6 +3881,20 @@ lower_sampler_logical_send(const fs_builder bld, fs_inst *inst, opcode op) } } +/** + * Initialize the header present in some typed and untyped surface + * messages. + */ +static fs_reg +emit_surface_header(const fs_builder bld, const fs_reg sample_mask) +{ + fs_builder ubld = bld.exec_all().group(8, 0); + const fs_reg dst = ubld.vgrf(BRW_REGISTER_TYPE_UD); + ubld.MOV(dst, fs_reg(0)); + ubld.MOV(component(dst, 7), sample_mask); + return dst; +} + static void lower_surface_logical_send(const fs_builder bld, fs_inst *inst, opcode op, const fs_reg sample_mask) @@ -3889,10 +3903,43 @@ lower_surface_logical_send(const fs_builder bld, fs_inst *inst, opcode op, const fs_reg addr = inst-src[0]; const fs_reg src = inst-src[1]; const fs_reg surface = inst-src[2]; - const fs_reg dims = inst-src[3]; + const UNUSED fs_reg dims = inst-src[3]; const fs_reg arg = inst-src[4]; - assert(!Not implemented); + /* Calculate the total number of components of the payload. */ + const unsigned addr_sz = inst-components_read(0); + const unsigned src_sz = inst-components_read(1); + const unsigned header_sz = (sample_mask.file == BAD_FILE ? 0 : 1); + const unsigned sz = header_sz + addr_sz + src_sz; + + /* Allocate space for the payload. */ + fs_reg *const components = new fs_reg[sz]; + const fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, sz); + unsigned n = 0; + + /* Construct the payload. */ + if (header_sz) + components[n++] = emit_surface_header(bld, sample_mask); + + for (unsigned i = 0; i addr_sz; i++) + components[n++] = offset(addr, bld, i); + + for (unsigned i = 0; i src_sz; i++) + components[n++] = offset(src, bld, i); + + bld.LOAD_PAYLOAD(payload, components, sz, header_sz); + + /* Update the original instruction. */ + inst-opcode = op; + inst-mlen = header_sz + (addr_sz + src_sz) * inst-exec_size / 8; + inst-header_size = header_sz; + + inst-src[0] = payload; + inst-src[1] = surface; + inst-src[2] = arg; + inst-resize_sources(3); + + delete[] components; I don't think this is correct. Unless something has changed, the fs_inst constructor that takes an array doesn't make a copy so you have to ralloc the array and *not* delete it. It is, I fixed the fs_inst constructor some time ago to take the array by value (e.g. so allocation on the stack and re-use of the same array is possible for the caller). Probably a good idea. In that case, would you mind using a vla instead of new/delete? } bool @@ -3965,37 +4012,37 @@ fs_visitor::lower_logical_sends() case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_UNTYPED_SURFACE_READ, -fs_reg()); +fs_reg(0x)); break; case SHADER_OPCODE_UNTYPED_SURFACE_WRITE_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_UNTYPED_SURFACE_WRITE, -fs_reg()); +ibld.sample_mask_reg()); break; case SHADER_OPCODE_UNTYPED_ATOMIC_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_UNTYPED_ATOMIC, -fs_reg()); +ibld.sample_mask_reg()); break; case SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_TYPED_SURFACE_READ, -fs_reg()); +fs_reg(0x)); break; case SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_TYPED_SURFACE_WRITE, -fs_reg()); +ibld.sample_mask_reg()); break; case SHADER_OPCODE_TYPED_ATOMIC_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_TYPED_ATOMIC, -
Re: [Mesa-dev] [PATCH 04/12] i965/fs: Pass a BAD_FILE header source to LOAD_PAYLOAD in emit_texture_gen7().
Jason Ekstrand ja...@jlekstrand.net writes: On Wed, Jul 22, 2015 at 12:43 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: On Sat, Jul 18, 2015 at 7:34 AM, Francisco Jerez curroje...@riseup.net wrote: So that it's left uninitialized by LOAD_PAYLOAD, we only need to reserve space for it in the message since it will be initialized implicitly by the generator. --- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 522e13e..89fcc49 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -473,8 +473,9 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, fs_reg dst, if (op == ir_tg4 || offset_value.file != BAD_FILE || is_high_sampler(devinfo, sampler)) { /* For general texture offsets (no txf workaround), we need a header to - * put them in. Note that for SIMD16 we're making space for two actual - * hardware registers here, so the emit will have to fix up for this. + * put them in. Note that we're only reserving space for it in the + * message payload as it will be initialized implicitly by the + * generator. Mind splitting this in two? Fixing the comment above is really unrelated to passing BAD_FILE below. It seemed like the sentence I changed was trying to explain what the line of code I changed was doing. It may sound unrelated because the comment was completely misleading (or just outdated), but it can only refer to the three lines of code below... It does refer to the three lines of code below, but the fact that it's bogus has nothing to do with the change below. It's been wrong for probably 8 months and I've just never noticed. Register allocation used to, in SIMD16 mode, work in terms of pairs of physical registers. In the texturing code, we needed a 1-register header so in SIMD16, we got 2 registers and then the generator knew to put the header in the second half. About 8 months ago, I changed RA to work in terms of single hardware registers even on SIMD16 and this comment has been stale ever since. Alright... I'll drop the comment from this patch and change it as a follow-up. --Jason * * * ir4_tg4 needs to place its channel select in the header, * for interaction with ARB_texture_swizzle @@ -483,7 +484,7 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, fs_reg dst, * need to offset the Sampler State Pointer in the header. */ header_size = 1; - sources[0] = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD); + sources[0] = fs_reg(); length++; } -- 2.4.3 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium/util: Stop bundling our snprintf implementation.
On 07/22/2015 10:15 AM, Jose Fonseca wrote: Use MSVCRT functions instead. Their semantics are slightly different but they can be made to work as expected. Also, use the same code paths for both MSVCRT and MinGW. No testing yet. Just built. https://bugs.freedesktop.org/show_bug.cgi?id=91418 --- src/gallium/auxiliary/Makefile.sources |1 - src/gallium/auxiliary/util/u_snprintf.c | 1480 --- src/gallium/auxiliary/util/u_string.h | 33 +- 3 files changed, 30 insertions(+), 1484 deletions(-) delete mode 100644 src/gallium/auxiliary/util/u_snprintf.c Looks OK to me. Reviewed-by: Brian Paul bri...@vmware.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] mesa: adjust error message when there's a missing teximage
On 07/22/2015 11:02 AM, Ilia Mirkin wrote: The current message makes it seem like the zoffset is invalid. Signed-off-by: Ilia Mirkin imir...@alum.mit.edu --- src/mesa/main/texgetimage.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c index 59ec091..2f35ac6 100644 --- a/src/mesa/main/texgetimage.c +++ b/src/mesa/main/texgetimage.c @@ -1013,8 +1013,7 @@ dimensions_error_check(struct gl_context *ctx, texImage = select_tex_image(texObj, target, level, zoffset); if (!texImage) { /* missing texture image */ - _mesa_error(ctx, GL_INVALID_OPERATION, - %s(zoffset = %d), caller, zoffset); + _mesa_error(ctx, GL_INVALID_OPERATION, %s(missing image), caller); return true; } Reviewed-by: Brian Paul bri...@vmware.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91425] [regression, bisected] Piglit spec/ext_packed_float/ getteximage-invalid-format-for-packed-type fails
https://bugs.freedesktop.org/show_bug.cgi?id=91425 --- Comment #4 from Brian Paul bri...@vmware.com --- OK, the problem is my local copy of getteximage-invalid-format-for-packed-type.c has the patch ext_packed_float: fix getteximage-invalid-format-for-packed-type test which I posted for review (and was R-b'd by Jose) but not yet pushed to master. I believe the piglit test was defective as-is, per my check-in comment: The GL spec doesn't explicitly say that glGetTexImage should generate GL_INVALID_OPERATION when attempting to retrieve a non-existant texture image, but that's what NVIDIA's driver does. The purpose of this test is to check the format/type parameters, so let's define a packed float texture to avoid the undefined texture situation. Test now passes with NVIDIA. I'll push that patch. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/20] i965/fs: Import image memory offset calculation code.
This needs a *lot* more commentary. These calculations are extremely tricky and there are almost no comments. For instance, you are turning a 2D offset on a tiled surface into a new 2D address into the raw view of the surface. Nowhere do you explain what the raw surface looks like and how its width/height map to real tiled version. Somewhere, you need to write all this down in detail and explain why your calculations are correct. If you don't write it down here, then write it down somewhere else and put a comment here explaining where to go look. I tried to dig through it, but I'm having a lot of trouble getting any further splitting the x/y into major/minor. Cc'ing chad. He knows miptree layouts far better than I do. Maybe this is all obvious to someone familiar with the calculations. --Jason On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote: Define a function to calculate the memory address of the image location given by a vector of coordinates. This is required in cases where we need to fall back to untyped surface access, which take a raw memory offset and know nothing about surface coordinates, type conversion or memory tiling and swizzling. They are still useful because typed surface reads don't support any 64 or 128-bit formats on IVB, and they don't support any 128-bit formats on HSW and BDW. The tiling algorithm is implemented based on a number of parameters which are passed in as uniforms and determine whether the surface layout is X-tiled, Y-tiled or untiled. This allows binding surfaces of different tiling layouts to the pipeline without recompiling the program. v2: Drop VEC4 suport. v3: Rebase. --- .../drivers/dri/i965/brw_fs_surface_builder.cpp| 108 + 1 file changed, 108 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp index 5ee04de..0c879db 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp @@ -215,4 +215,112 @@ namespace { return BRW_PREDICATE_NORMAL; } } + + namespace image_coordinates { + /** + * Calculate the offset in memory of the texel given by \p coord. + * + * This is meant to be used with untyped surface messages to access a + * tiled surface, what involves taking into account the tiling and + * swizzling modes of the surface manually so it will hopefully not + * happen very often. + */ + fs_reg + emit_address_calculation(const fs_builder bld, const fs_reg image, + const fs_reg coord, unsigned dims) + { + const brw_device_info *devinfo = bld.shader-devinfo; + const fs_reg off = offset(image, bld, BRW_IMAGE_PARAM_OFFSET_OFFSET); + const fs_reg stride = offset(image, bld, BRW_IMAGE_PARAM_STRIDE_OFFSET); + const fs_reg tile = offset(image, bld, BRW_IMAGE_PARAM_TILING_OFFSET); + const fs_reg swz = offset(image, bld, BRW_IMAGE_PARAM_SWIZZLING_OFFSET); + const fs_reg addr = bld.vgrf(BRW_REGISTER_TYPE_UD, 2); + const fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_UD, 2); + const fs_reg minor = bld.vgrf(BRW_REGISTER_TYPE_UD, 2); + const fs_reg major = bld.vgrf(BRW_REGISTER_TYPE_UD, 2); + const fs_reg dst = bld.vgrf(BRW_REGISTER_TYPE_UD); + + /* Shift the coordinates by the fixed surface offset. */ + for (unsigned c = 0; c 2; ++c) +bld.ADD(offset(addr, bld, c), offset(off, bld, c), +(c dims ? + offset(retype(coord, BRW_REGISTER_TYPE_UD), bld, c) : + fs_reg(0))); + + if (dims 2) { +/* Decompose z into a major (tmp.y) and a minor (tmp.x) + * index. + */ +bld.BFE(offset(tmp, bld, 0), offset(tile, bld, 2), fs_reg(0), +offset(retype(coord, BRW_REGISTER_TYPE_UD), bld, 2)); +bld.SHR(offset(tmp, bld, 1), +offset(retype(coord, BRW_REGISTER_TYPE_UD), bld, 2), +offset(tile, bld, 2)); + +/* Take into account the horizontal (tmp.x) and vertical (tmp.y) + * slice offset. + */ +for (unsigned c = 0; c 2; ++c) { + bld.MUL(offset(tmp, bld, c), + offset(stride, bld, 2 + c), offset(tmp, bld, c)); + bld.ADD(offset(addr, bld, c), + offset(addr, bld, c), offset(tmp, bld, c)); +} Tiled surfaces are inherently 2-D, but it seems like you're treating it as if it has 3-D tiles. What's going on here? + } + + if (dims 1) { +for (unsigned c = 0; c 2; ++c) { + /* Calculate the minor x and y indices. */ +
[Mesa-dev] [PATCH 1/2] mesa: adjust error message when there's a missing teximage
The current message makes it seem like the zoffset is invalid. Signed-off-by: Ilia Mirkin imir...@alum.mit.edu --- src/mesa/main/texgetimage.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c index 59ec091..2f35ac6 100644 --- a/src/mesa/main/texgetimage.c +++ b/src/mesa/main/texgetimage.c @@ -1013,8 +1013,7 @@ dimensions_error_check(struct gl_context *ctx, texImage = select_tex_image(texObj, target, level, zoffset); if (!texImage) { /* missing texture image */ - _mesa_error(ctx, GL_INVALID_OPERATION, - %s(zoffset = %d), caller, zoffset); + _mesa_error(ctx, GL_INVALID_OPERATION, %s(missing image), caller); return true; } -- 2.3.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] mesa: rearrange texture error checking order
This moves the width/height/depth == 0 check to the front and avoids doing any other checking when that is the case. Also moves the dimensions check after the format/type checks so that we don't bail out with success on a width/height/depth == 0 request when the format/type don't match. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91425 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu --- src/mesa/main/texgetimage.c | 26 +- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c index 2f35ac6..cdbd618 100644 --- a/src/mesa/main/texgetimage.c +++ b/src/mesa/main/texgetimage.c @@ -928,6 +928,13 @@ dimensions_error_check(struct gl_context *ctx, const struct gl_texture_image *texImage; int i; + if (width == 0 || height == 0 || depth == 0) { + /* Not an error, but nothing to do. Return 'true' so that the + * caller simply returns. + */ + return true; + } + if (xoffset 0) { _mesa_error(ctx, GL_INVALID_VALUE, %s(xoffset = %d), caller, xoffset); return true; @@ -1079,13 +1086,6 @@ dimensions_error_check(struct gl_context *ctx, } } - if (width == 0 || height == 0 || depth == 0) { - /* Not an error, but nothing to do. Return 'true' so that the - * caller simply returns. - */ - return true; - } - return false; } @@ -1164,18 +1164,18 @@ getteximage_error_check(struct gl_context *ctx, return true; } - if (dimensions_error_check(ctx, texObj, target, level, - xoffset, yoffset, zoffset, - width, height, depth, caller)) { - return true; - } - err = _mesa_error_check_format_and_type(ctx, format, type); if (err != GL_NO_ERROR) { _mesa_error(ctx, err, %s(format/type), caller); return true; } + if (dimensions_error_check(ctx, texObj, target, level, + xoffset, yoffset, zoffset, + width, height, depth, caller)) { + return true; + } + if (pbo_error_check(ctx, target, width, height, depth, format, type, bufSize, pixels, caller)) { return true; -- 2.3.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/20] i965/fs: Implement lowering of logical surface instructions.
Jason Ekstrand ja...@jlekstrand.net writes: On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote: --- src/mesa/drivers/dri/i965/brw_fs.cpp | 63 +++- 1 file changed, 55 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index a996676..0b0c5e1 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3881,6 +3881,20 @@ lower_sampler_logical_send(const fs_builder bld, fs_inst *inst, opcode op) } } +/** + * Initialize the header present in some typed and untyped surface + * messages. + */ +static fs_reg +emit_surface_header(const fs_builder bld, const fs_reg sample_mask) +{ + fs_builder ubld = bld.exec_all().group(8, 0); + const fs_reg dst = ubld.vgrf(BRW_REGISTER_TYPE_UD); + ubld.MOV(dst, fs_reg(0)); + ubld.MOV(component(dst, 7), sample_mask); + return dst; +} + static void lower_surface_logical_send(const fs_builder bld, fs_inst *inst, opcode op, const fs_reg sample_mask) @@ -3889,10 +3903,43 @@ lower_surface_logical_send(const fs_builder bld, fs_inst *inst, opcode op, const fs_reg addr = inst-src[0]; const fs_reg src = inst-src[1]; const fs_reg surface = inst-src[2]; - const fs_reg dims = inst-src[3]; + const UNUSED fs_reg dims = inst-src[3]; const fs_reg arg = inst-src[4]; - assert(!Not implemented); + /* Calculate the total number of components of the payload. */ + const unsigned addr_sz = inst-components_read(0); + const unsigned src_sz = inst-components_read(1); + const unsigned header_sz = (sample_mask.file == BAD_FILE ? 0 : 1); + const unsigned sz = header_sz + addr_sz + src_sz; + + /* Allocate space for the payload. */ + fs_reg *const components = new fs_reg[sz]; + const fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, sz); + unsigned n = 0; + + /* Construct the payload. */ + if (header_sz) + components[n++] = emit_surface_header(bld, sample_mask); + + for (unsigned i = 0; i addr_sz; i++) + components[n++] = offset(addr, bld, i); + + for (unsigned i = 0; i src_sz; i++) + components[n++] = offset(src, bld, i); + + bld.LOAD_PAYLOAD(payload, components, sz, header_sz); + + /* Update the original instruction. */ + inst-opcode = op; + inst-mlen = header_sz + (addr_sz + src_sz) * inst-exec_size / 8; + inst-header_size = header_sz; + + inst-src[0] = payload; + inst-src[1] = surface; + inst-src[2] = arg; + inst-resize_sources(3); + + delete[] components; I don't think this is correct. Unless something has changed, the fs_inst constructor that takes an array doesn't make a copy so you have to ralloc the array and *not* delete it. It is, I fixed the fs_inst constructor some time ago to take the array by value (e.g. so allocation on the stack and re-use of the same array is possible for the caller). } bool @@ -3965,37 +4012,37 @@ fs_visitor::lower_logical_sends() case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_UNTYPED_SURFACE_READ, -fs_reg()); +fs_reg(0x)); break; case SHADER_OPCODE_UNTYPED_SURFACE_WRITE_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_UNTYPED_SURFACE_WRITE, -fs_reg()); +ibld.sample_mask_reg()); break; case SHADER_OPCODE_UNTYPED_ATOMIC_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_UNTYPED_ATOMIC, -fs_reg()); +ibld.sample_mask_reg()); break; case SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_TYPED_SURFACE_READ, -fs_reg()); +fs_reg(0x)); break; case SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_TYPED_SURFACE_WRITE, -fs_reg()); +ibld.sample_mask_reg()); break; case SHADER_OPCODE_TYPED_ATOMIC_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_TYPED_ATOMIC, -fs_reg()); +ibld.sample_mask_reg()); break; default: -- 2.4.3
Re: [Mesa-dev] [PATCH 03/20] i965/fs: Implement lowering of logical surface instructions.
Jason Ekstrand ja...@jlekstrand.net writes: On Wed, Jul 22, 2015 at 10:02 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote: --- src/mesa/drivers/dri/i965/brw_fs.cpp | 63 +++- 1 file changed, 55 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index a996676..0b0c5e1 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3881,6 +3881,20 @@ lower_sampler_logical_send(const fs_builder bld, fs_inst *inst, opcode op) } } +/** + * Initialize the header present in some typed and untyped surface + * messages. + */ +static fs_reg +emit_surface_header(const fs_builder bld, const fs_reg sample_mask) +{ + fs_builder ubld = bld.exec_all().group(8, 0); + const fs_reg dst = ubld.vgrf(BRW_REGISTER_TYPE_UD); + ubld.MOV(dst, fs_reg(0)); + ubld.MOV(component(dst, 7), sample_mask); + return dst; +} + static void lower_surface_logical_send(const fs_builder bld, fs_inst *inst, opcode op, const fs_reg sample_mask) @@ -3889,10 +3903,43 @@ lower_surface_logical_send(const fs_builder bld, fs_inst *inst, opcode op, const fs_reg addr = inst-src[0]; const fs_reg src = inst-src[1]; const fs_reg surface = inst-src[2]; - const fs_reg dims = inst-src[3]; + const UNUSED fs_reg dims = inst-src[3]; const fs_reg arg = inst-src[4]; - assert(!Not implemented); + /* Calculate the total number of components of the payload. */ + const unsigned addr_sz = inst-components_read(0); + const unsigned src_sz = inst-components_read(1); + const unsigned header_sz = (sample_mask.file == BAD_FILE ? 0 : 1); + const unsigned sz = header_sz + addr_sz + src_sz; + + /* Allocate space for the payload. */ + fs_reg *const components = new fs_reg[sz]; + const fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, sz); + unsigned n = 0; + + /* Construct the payload. */ + if (header_sz) + components[n++] = emit_surface_header(bld, sample_mask); + + for (unsigned i = 0; i addr_sz; i++) + components[n++] = offset(addr, bld, i); + + for (unsigned i = 0; i src_sz; i++) + components[n++] = offset(src, bld, i); + + bld.LOAD_PAYLOAD(payload, components, sz, header_sz); + + /* Update the original instruction. */ + inst-opcode = op; + inst-mlen = header_sz + (addr_sz + src_sz) * inst-exec_size / 8; + inst-header_size = header_sz; + + inst-src[0] = payload; + inst-src[1] = surface; + inst-src[2] = arg; + inst-resize_sources(3); + + delete[] components; I don't think this is correct. Unless something has changed, the fs_inst constructor that takes an array doesn't make a copy so you have to ralloc the array and *not* delete it. It is, I fixed the fs_inst constructor some time ago to take the array by value (e.g. so allocation on the stack and re-use of the same array is possible for the caller). Probably a good idea. In that case, would you mind using a vla instead of new/delete? I'd rather use new/delete (or the C++ standard library). VLAs are a non-standard feature only guaranteed to be available in C99, elsewhere they are an extension with far from universal compiler support -- The requirement was in fact again dropped from the C11 standard and the functionally-similar C++ feature was also voted out by the committee during the standardization of C++14. I'd rather stick to the standard... } bool @@ -3965,37 +4012,37 @@ fs_visitor::lower_logical_sends() case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_UNTYPED_SURFACE_READ, -fs_reg()); +fs_reg(0x)); break; case SHADER_OPCODE_UNTYPED_SURFACE_WRITE_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_UNTYPED_SURFACE_WRITE, -fs_reg()); +ibld.sample_mask_reg()); break; case SHADER_OPCODE_UNTYPED_ATOMIC_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_UNTYPED_ATOMIC, -fs_reg()); +ibld.sample_mask_reg()); break; case SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_TYPED_SURFACE_READ, -fs_reg()); +fs_reg(0x)); break; case
Re: [Mesa-dev] [PATCH 03/20] i965/fs: Implement lowering of logical surface instructions.
On Wed, Jul 22, 2015 at 10:16 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: On Wed, Jul 22, 2015 at 10:02 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote: --- src/mesa/drivers/dri/i965/brw_fs.cpp | 63 +++- 1 file changed, 55 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index a996676..0b0c5e1 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3881,6 +3881,20 @@ lower_sampler_logical_send(const fs_builder bld, fs_inst *inst, opcode op) } } +/** + * Initialize the header present in some typed and untyped surface + * messages. + */ +static fs_reg +emit_surface_header(const fs_builder bld, const fs_reg sample_mask) +{ + fs_builder ubld = bld.exec_all().group(8, 0); + const fs_reg dst = ubld.vgrf(BRW_REGISTER_TYPE_UD); + ubld.MOV(dst, fs_reg(0)); + ubld.MOV(component(dst, 7), sample_mask); + return dst; +} + static void lower_surface_logical_send(const fs_builder bld, fs_inst *inst, opcode op, const fs_reg sample_mask) @@ -3889,10 +3903,43 @@ lower_surface_logical_send(const fs_builder bld, fs_inst *inst, opcode op, const fs_reg addr = inst-src[0]; const fs_reg src = inst-src[1]; const fs_reg surface = inst-src[2]; - const fs_reg dims = inst-src[3]; + const UNUSED fs_reg dims = inst-src[3]; const fs_reg arg = inst-src[4]; - assert(!Not implemented); + /* Calculate the total number of components of the payload. */ + const unsigned addr_sz = inst-components_read(0); + const unsigned src_sz = inst-components_read(1); + const unsigned header_sz = (sample_mask.file == BAD_FILE ? 0 : 1); + const unsigned sz = header_sz + addr_sz + src_sz; + + /* Allocate space for the payload. */ + fs_reg *const components = new fs_reg[sz]; + const fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, sz); + unsigned n = 0; + + /* Construct the payload. */ + if (header_sz) + components[n++] = emit_surface_header(bld, sample_mask); + + for (unsigned i = 0; i addr_sz; i++) + components[n++] = offset(addr, bld, i); + + for (unsigned i = 0; i src_sz; i++) + components[n++] = offset(src, bld, i); + + bld.LOAD_PAYLOAD(payload, components, sz, header_sz); + + /* Update the original instruction. */ + inst-opcode = op; + inst-mlen = header_sz + (addr_sz + src_sz) * inst-exec_size / 8; + inst-header_size = header_sz; + + inst-src[0] = payload; + inst-src[1] = surface; + inst-src[2] = arg; + inst-resize_sources(3); + + delete[] components; I don't think this is correct. Unless something has changed, the fs_inst constructor that takes an array doesn't make a copy so you have to ralloc the array and *not* delete it. It is, I fixed the fs_inst constructor some time ago to take the array by value (e.g. so allocation on the stack and re-use of the same array is possible for the caller). Probably a good idea. In that case, would you mind using a vla instead of new/delete? I'd rather use new/delete (or the C++ standard library). VLAs are a non-standard feature only guaranteed to be available in C99, elsewhere they are an extension with far from universal compiler support -- The requirement was in fact again dropped from the C11 standard and the functionally-similar C++ feature was also voted out by the committee during the standardization of C++14. I'd rather stick to the standard... I can live with that. } bool @@ -3965,37 +4012,37 @@ fs_visitor::lower_logical_sends() case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_UNTYPED_SURFACE_READ, -fs_reg()); +fs_reg(0x)); break; case SHADER_OPCODE_UNTYPED_SURFACE_WRITE_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_UNTYPED_SURFACE_WRITE, -fs_reg()); +ibld.sample_mask_reg()); break; case SHADER_OPCODE_UNTYPED_ATOMIC_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_UNTYPED_ATOMIC, -fs_reg()); +ibld.sample_mask_reg()); break; case SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL: lower_surface_logical_send(ibld, inst, SHADER_OPCODE_TYPED_SURFACE_READ, -fs_reg());
Re: [Mesa-dev] [PATCH 2/2] mesa: rearrange texture error checking order
On 07/22/2015 11:29 AM, Ilia Mirkin wrote: On Wed, Jul 22, 2015 at 1:20 PM, Brian Paul bri...@vmware.com wrote: On 07/22/2015 11:02 AM, Ilia Mirkin wrote: This moves the width/height/depth == 0 check to the front and avoids doing any other checking when that is the case. Also moves the dimensions check after the format/type checks so that we don't bail out with success on a width/height/depth == 0 request when the format/type don't match. Bugzilla: https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.freedesktop.org_show-5Fbug.cgi-3Fid-3D91425d=BQIBAgc=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEsr=T0t4QG7chq2ZwJo6wilkFznRSFy-8uDKartPGbomVj8m=Y1TotaJmX1E7sCopvmMJc8PQKFXlHu6QxfcYOydKuIIs=L0lmmuk31G6M4hZEiwnX-IC4CPhW7rsrSCgEYtDgXjwe= Signed-off-by: Ilia Mirkin imir...@alum.mit.edu I suspect this isn't really needed in light of my patch to getteximage-invalid-format-for-packed-type.c I believe the test was in error, or at least sloppy, in that it was calling glGetTexImage to test format/type validation when there wasn't even a texture image to return. The OpenGL spec doesn't specify an order for error checking multiple things, so if there's multiple errors, you can't be sure which one will be reported first. My patch to the test removes that ambiguity. Well, irrespective of the piglit test, the current code is a little suspect -- it does the width/height/depth == 0 checks after it ensures that height == 1 for 1d and depth == 1 for 2d. Any thoughts on that? Yeah, that makes sense. I guess we don't have any tests that exercise getting a 0 x 0 texture image. Reviewed-by: Brian Paul bri...@vmware.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 06/14] meta: Don't do fragment color clamping in case of ReadPixels
On Tue, Jul 21, 2015 at 1:18 AM, Iago Toral ito...@igalia.com wrote: On Tue, 2015-07-21 at 09:24 +0200, Iago Toral wrote: On Tue, 2015-06-16 at 11:15 -0700, Anuj Phogat wrote: Without this patch, piglit test arb_color_buffer_float-readpixels fails, when forced to use the meta pbo path. Signed-off-by: Anuj Phogat anuj.pho...@gmail.com Cc: mesa-sta...@lists.freedesktop.org --- src/mesa/drivers/common/meta_tex_subimage.c | 5 + 1 file changed, 5 insertions(+) diff --git a/src/mesa/drivers/common/meta_tex_subimage.c b/src/mesa/drivers/common/meta_tex_subimage.c index 84cbc50..ccb7dfb 100644 --- a/src/mesa/drivers/common/meta_tex_subimage.c +++ b/src/mesa/drivers/common/meta_tex_subimage.c @@ -25,6 +25,7 @@ *Jason Ekstrand jason.ekstr...@intel.com */ +#include blend.h #include bufferobj.h #include buffers.h #include fbobject.h @@ -305,6 +306,10 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, GLuint dims, _mesa_meta_begin(ctx, ~(MESA_META_PIXEL_TRANSFER | MESA_META_PIXEL_STORE)); + /* GL_CLAMP_FRAGMENT_COLOR doesn't affect ReadPixels. */ + if (!tex_image ctx-Extensions.ARB_color_buffer_float) + _mesa_ClampColor(GL_CLAMP_FRAGMENT_COLOR, GL_FALSE); + I think you need to restore the original value of GL_CLAMP_FRAGMENT_COLOR after you are done with ReadPixels, otherwise it looks like you would be changing the setting for future API calls too. Oh, I just noticed that this setting is already being considered in the state flags passed to _mesa_meta_begin, so ignore that comment. Also, does this only apply to ReadPixels and not to GetTexImage? That looks odd to me. You're right. This should apply to GetTexImage too. I'll send out v2 and also add a piglit test for GetTexImage. Iago _mesa_GenFramebuffers(2, fbos); if (tex_image tex_image-TexObject-Target == GL_TEXTURE_1D_ARRAY) { ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 09/20] i965/fs: Import image format conversion primitives.
On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote: Define bitfield packing, unpacking and type conversion operations in terms of which the image format conversion code will be implemented. These don't directly know about image formats: The packing and unpacking functions take a 4-tuple of bit shifts and a 4-tuple of bit widths as arguments, determining the bitfield position of each component. Most of the remaining functions perform integer, fixed point normalized, and floating point type conversions, mapping between a target type with per-component bit widths given by a parameter and a matching native representation of the same type. v2: Drop VEC4 suport. v3: Rebase. --- .../drivers/dri/i965/brw_fs_surface_builder.cpp| 263 + 1 file changed, 263 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp index 0c879db..ea1c4aa 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp @@ -323,4 +323,267 @@ namespace { return dst; } } + + namespace image_format_conversion { + using image_format_info::color_u; + + namespace { + /** + * Maximum representable value in an unsigned integer with the given + * number of bits. + */ + inline unsigned + scale(unsigned n) + { +return (1 n) - 1; + } + } + + /** + * Pack the vector \p src in a bitfield given the per-component bit + * shifts and widths. You should comment in here that it assumes that either everything fits in 32 bits or that it is homogeneous with a power-of-two width. More specifically, it only works if no component spans a 32-bit boundary. + */ + fs_reg + emit_pack(const fs_builder bld, const fs_reg src, +const color_u shifts, const color_u widths) + { + const fs_reg dst = bld.vgrf(BRW_REGISTER_TYPE_UD, 4); + bool seen[4] = {}; + + for (unsigned c = 0; c 4; ++c) { +if (widths[c]) { + const fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_UD); + + /* Shift each component left to the correct bitfield position. */ + bld.SHL(tmp, offset(src, bld, c), fs_reg(shifts[c] % 32)); + + /* Add everything up. */ + if (seen[shifts[c] / 32]) { + bld.OR(offset(dst, bld, shifts[c] / 32), + offset(dst, bld, shifts[c] / 32), tmp); + } else { + bld.MOV(offset(dst, bld, shifts[c] / 32), tmp); + seen[shifts[c] / 32] = true; + } +} + } + + return dst; + } + + /** + * Unpack a vector from the bitfield \p src given the per-component bit + * shifts and widths. Same comment here. + */ + fs_reg + emit_unpack(const fs_builder bld, const fs_reg src, + const color_u shifts, const color_u widths) + { + const fs_reg dst = bld.vgrf(src.type, 4); + + for (unsigned c = 0; c 4; ++c) { +if (widths[c]) { + /* Shift left to discard the most significant bits. */ + bld.SHL(offset(dst, bld, c), + offset(src, bld, shifts[c] / 32), + fs_reg(32 - shifts[c] % 32 - widths[c])); + + /* Shift back to the least significant bits using an arithmetic +* shift to get sign extension on signed types. +*/ + bld.ASR(offset(dst, bld, c), + offset(dst, bld, c), fs_reg(32 - widths[c])); +} + } + + return dst; + } + + /** + * Convert a vector into an integer vector of the specified signedness + * and bit widths, properly handling overflow. + */ + fs_reg + emit_convert_to_integer(const fs_builder bld, const fs_reg src, + const color_u widths, bool is_signed) + { + const unsigned s = (is_signed ? 1 : 0); + const fs_reg dst = bld.vgrf( +is_signed ? BRW_REGISTER_TYPE_D : BRW_REGISTER_TYPE_UD, 4); + + for (unsigned c = 0; c 4; ++c) { +if (widths[c]) { + bld.MOV(offset(dst, bld, c), offset(src, bld, c)); + + /* Clamp to the minimum value. */ + if (is_signed) + bld.emit_minmax(offset(dst, bld, c), offset(dst, bld, c), + fs_reg(-(int)scale(widths[c] - s) - 1), + BRW_CONDITIONAL_G); If it isn't signed, shouldn't you still do a min/max with zero? Also, I think you want to do the minmax while its
Re: [Mesa-dev] [PATCH] mesa: initialize variables to silence compiler warnings
On Tue, Jul 21, 2015 at 5:42 PM, Brian Paul bri...@vmware.com wrote: --- src/mesa/main/fbobject.c | 4 ++-- src/mesa/main/shaderapi.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c index f46554b..cc342c2 100644 --- a/src/mesa/main/fbobject.c +++ b/src/mesa/main/fbobject.c @@ -3304,7 +3304,7 @@ _mesa_FramebufferTexture(GLenum target, GLenum attachment, GET_CURRENT_CONTEXT(ctx); struct gl_framebuffer *fb; struct gl_texture_object *texObj; - GLboolean layered; + GLboolean layered = GL_FALSE; const char *func = FramebufferTexture; @@ -3347,7 +3347,7 @@ _mesa_NamedFramebufferTexture(GLuint framebuffer, GLenum attachment, GET_CURRENT_CONTEXT(ctx); struct gl_framebuffer *fb; struct gl_texture_object *texObj; - GLboolean layered; + GLboolean layered = GL_FALSE; const char *func = glNamedFramebufferTexture; diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c index 3365c7a..376874e 100644 --- a/src/mesa/main/shaderapi.c +++ b/src/mesa/main/shaderapi.c @@ -1331,7 +1331,7 @@ void GLAPIENTRY _mesa_GetObjectParameterfvARB(GLhandleARB object, GLenum pname, GLfloat *params) { - GLint iparams[1]; /* XXX is one element enough? */ + GLint iparams[1] = {0}; /* XXX is one element enough? */ _mesa_GetObjectParameterivARB(object, pname, iparams); params[0] = (GLfloat) iparams[0]; } -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev Reviewed-by: Anuj Phogat anuj.pho...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glx: Fix build warnings for PURE being redefined.
We can't completely drop this #define because it's used in the xserver generated code. Until someone gets around to putting xml generation directly in the server, we're stuck #defining it for them. --- src/mapi/glapi/gen/gl_XML.py | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/src/mapi/glapi/gen/gl_XML.py b/src/mapi/glapi/gen/gl_XML.py index 67aba81..f19570b 100644 --- a/src/mapi/glapi/gen/gl_XML.py +++ b/src/mapi/glapi/gen/gl_XML.py @@ -183,10 +183,12 @@ class gl_print_base(object): The name is also added to the file's undef_list. self.undef_list.append(PURE) -print # if defined(__GNUC__) || (defined(__SUNPRO_C) (__SUNPRO_C = 0x590)) -#define PURE __attribute__((pure)) -# else -#define PURE +print # if !defined(PURE) +#if defined(__GNUC__) || (defined(__SUNPRO_C) (__SUNPRO_C = 0x590)) +# define PURE __attribute__((pure)) +#else +# define PURE +#endif # endif return -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 1/7] nouveau: implement the nvif hardware performance counters interface
On 01/07/15 01:01, Samuel Pitoiset wrote: This commit implements the base interface for hardware performance counters that will be shared between nv50 and nvc0 drivers. TODO: Bump libdrm version of mesa when nvif will be merged. Changes since v2: - remove double-query thing for domains, signals and sources Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/drivers/nouveau/Makefile.sources | 2 + src/gallium/drivers/nouveau/nouveau_perfmon.c | 290 ++ src/gallium/drivers/nouveau/nouveau_perfmon.h | 58 ++ src/gallium/drivers/nouveau/nouveau_screen.c | 5 + src/gallium/drivers/nouveau/nouveau_screen.h | 1 + 5 files changed, 356 insertions(+) create mode 100644 src/gallium/drivers/nouveau/nouveau_perfmon.c create mode 100644 src/gallium/drivers/nouveau/nouveau_perfmon.h diff --git a/src/gallium/drivers/nouveau/Makefile.sources b/src/gallium/drivers/nouveau/Makefile.sources index 3fae3bc..3da0bdc 100644 --- a/src/gallium/drivers/nouveau/Makefile.sources +++ b/src/gallium/drivers/nouveau/Makefile.sources @@ -10,6 +10,8 @@ C_SOURCES := \ nouveau_heap.h \ nouveau_mm.c \ nouveau_mm.h \ + nouveau_perfmon.c \ + nouveau_perfmon.h \ nouveau_screen.c \ nouveau_screen.h \ nouveau_statebuf.h \ diff --git a/src/gallium/drivers/nouveau/nouveau_perfmon.c b/src/gallium/drivers/nouveau/nouveau_perfmon.c new file mode 100644 index 000..e1d4546 --- /dev/null +++ b/src/gallium/drivers/nouveau/nouveau_perfmon.c @@ -0,0 +1,290 @@ +/* + * Copyright 2015 Samuel Pitoiset + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +#include errno.h + +#include util/u_memory.h + +#include nouveau_debug.h +#include nouveau_winsys.h +#include nouveau_perfmon.h + +static int +nouveau_perfmon_query_sources(struct nouveau_perfmon *pm, + struct nouveau_perfmon_dom *dom, + struct nouveau_perfmon_sig *sig) +{ + struct nvif_perfmon_query_source_v0 args = {}; + + args.iter = 1; Why start iterating from 1 and not 0? + args.domain = dom-id; + args.signal = sig-signal; + do { + struct nouveau_perfmon_src *src; + int ret; + + ret = nouveau_object_mthd(pm-object, NVIF_PERFMON_V0_QUERY_SOURCE, +args, sizeof(args)); + if (ret) + return ret; You do not check what happens if you do not expose any source for this signal. A test on args.iter != 0x with a return if not the case would be nice! + + src = CALLOC_STRUCT(nouveau_perfmon_src); + if (!src) + return -ENOMEM; + +#if 0 + debug_printf(id = %d\n, args.source); + debug_printf(name = %s\n, args.name); + debug_printf(mask = %08x\n, args.mask); + debug_printf(\n); +#endif + + src-id = args.source; + strncpy(src-name, args.name, sizeof(src-name)); + list_addtail(src-head, sig-sources); + } while (args.iter != 0xff); + + return 0; +} + +static int +nouveau_perfmon_query_signals(struct nouveau_perfmon *pm, + struct nouveau_perfmon_dom *dom) +{ + struct nvif_perfmon_query_signal_v0 args = {}; + + args.iter = 1; + args.domain = dom-id; + do { + struct nouveau_perfmon_sig *sig; + int ret; + + ret = nouveau_object_mthd(pm-object, NVIF_PERFMON_V0_QUERY_SIGNAL, +args, sizeof(args)); + if (ret) + return ret; You do not check what happens if you do not expose any source for this signal. A test on args.iter != 0x with a return if not the case would be nice! + + sig = CALLOC_STRUCT(nouveau_perfmon_sig); + if (!sig) + return -ENOMEM; + list_inithead(sig-sources); + +#if 0 + debug_printf(name = %s\n, args.name); + debug_printf(signal= 0x%02x\n, args.signal); + debug_printf(source_nr = %d\n,
Re: [Mesa-dev] [PATCH v4 3/6] mesa/es3.1: enable GL_ARB_texture_multisample for GLES 3.1
On 06/26/2015 07:38 AM, Ilia Mirkin wrote: On Fri, Jun 26, 2015 at 4:18 AM, Tapani Pälli tapani.pa...@intel.com wrote: On 06/26/2015 01:06 AM, Ilia Mirkin wrote: On Thu, Jun 25, 2015 at 4:22 PM, Ilia Mirkin imir...@alum.mit.edu wrote: On Thu, Jun 25, 2015 at 5:08 AM, Marta Lofstedt marta.lofst...@linux.intel.com wrote: From: Marta Lofstedt marta.lofst...@intel.com v4 : only expose GL_ARB_texture_multisample enums for gles 3.1 and desktop GL. I was suspicious of this logic. Based on my reading of the code, what your ARB_texture_multisample_es31 thing does is expose those enums when *either* the driver enables ARB_texture_multisample *or* the current context is ES3.1. ARB_draw_indirect_es31 has the same problem, btw. I'm not sure I understand the issue. It should work fine with 'OR', if we have either of these then we expose the enum? I could have misread the get.c extra_ext() logic, but I don't think I have. As far as I can tell there's no way to (generically) AND these things. Why do we need AND? If you have 3.1 context then you *need* to support this enum. There is no such condition where you would have gles 3.1 context but not the enum, right? What you really need to do is create a whole new [GL, GL_CORE, ES31] section in get_hash_params and update get_hash_generator.py accordingly. An alternative, BTW, is to just say screw it and expose the enums in GL ES 3.0. In that case, just move the enums into a section that includes GLES3. I'm not sure how big of a deal it is. But your current patches are definitely wrong. We do this, but we have explicit version check against version 3.1. OK, well, at the end of the day, the things anyone says don't matter, it just has to work, right? So here are the scenarios: Let's take it as a given that the driver sets Ext.ARB_texture_multisample to true, we have the following situations: GL compat glGet(SAMPLE_MASK) GL core glGet(SAMPLE_MASK) GL ES3.0 glGet(SAMPLE_MASK) GL ES3.1 glGet(SAMPLE_MASK) Where the ES3.0 one should fail, and all the others should succeed. OK, so first off, this is in the GL/GL_CORE/GLES3 section, which means that the logic will get hit for all 4, right? Then you do a check for Yes, I think you're right, and I think I led Marta and Tapani astray. I think they originally had separate sections, and I told them to change at least some of them. The whole separate section mechanism is confusing and leads to code churn (imagine if an ES3.0 extension comes along that enables this functionality). I think a good Python project for someone would be to reorganize that file so that all the enums are in a single list. Each enum would have a list of minterms to describe its availability. These would include profile and version. Instead of having a flat array of integers, we'd have an array of structures: struct requirement { gl_api API; unsigned Version; /** Minimum version required. */ unsigned extension[2]; /** Offsets of required extensions. Use * dummy_true if none. */ enum value_extra extra; /** Other requirements. */ }; This is equivalent to the current system (which is also a maximum of minterms), but I think it's a lot easier to specify things. Either way, I think that can be follow-up work. Ext.ARB_texture_multisample || ES3.1 and if that doesn't get set to true, then you return INVALID_ENUM, right? So tell me, given that the driver *always* sets Ext.ARB_texture_multisample to true, how can that ever be false? The ES3.1 thing could be useful for drivers that don't actually support ARB_texture_multisample and you want to fake it, but in practice all the relevant drivers support it. But none of that cause a GL_INVALID_ENUM to get returned in ES3.0 as it was before the patch. Hope that clears things up. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 06/14] meta: Don't do fragment color clamping in _mesa_meta_pbo_GetTexSubImage
_mesa_meta_pbo_GetTexSubImage() uses _mesa_meta_BlitFrameBuffer(), which will do fragment clamping if enabled. But fragment clamping doesn't affect ReadPixels and GetTexImage. Without this patch, piglit test arb_color_buffer_float-clear fails, when forced to use the meta pbo path. v2: Apply this fix to both glReadPixels and glGetTexImage. Cc: mesa-sta...@lists.freedesktop.org Signed-off-by: Anuj Phogat anuj.pho...@gmail.com --- src/mesa/drivers/common/meta_tex_subimage.c | 5 + 1 file changed, 5 insertions(+) diff --git a/src/mesa/drivers/common/meta_tex_subimage.c b/src/mesa/drivers/common/meta_tex_subimage.c index 6d7c1eb..038d525 100644 --- a/src/mesa/drivers/common/meta_tex_subimage.c +++ b/src/mesa/drivers/common/meta_tex_subimage.c @@ -25,6 +25,7 @@ *Jason Ekstrand jason.ekstr...@intel.com */ +#include blend.h #include bufferobj.h #include buffers.h #include fbobject.h @@ -331,6 +332,10 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, GLuint dims, _mesa_meta_begin(ctx, ~(MESA_META_PIXEL_TRANSFER | MESA_META_PIXEL_STORE)); + /* GL_CLAMP_FRAGMENT_COLOR doesn't affect ReadPixels and GettexImage */ + if (ctx-Extensions.ARB_color_buffer_float) + _mesa_ClampColor(GL_CLAMP_FRAGMENT_COLOR, GL_FALSE); + _mesa_GenFramebuffers(2, fbos); if (tex_image tex_image-TexObject-Target == GL_TEXTURE_1D_ARRAY) { -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v4 3/6] mesa/es3.1: enable GL_ARB_texture_multisample for GLES 3.1
On Wed, Jul 22, 2015 at 2:55 PM, Ian Romanick i...@freedesktop.org wrote: On 06/26/2015 07:38 AM, Ilia Mirkin wrote: On Fri, Jun 26, 2015 at 4:18 AM, Tapani Pälli tapani.pa...@intel.com wrote: On 06/26/2015 01:06 AM, Ilia Mirkin wrote: On Thu, Jun 25, 2015 at 4:22 PM, Ilia Mirkin imir...@alum.mit.edu wrote: On Thu, Jun 25, 2015 at 5:08 AM, Marta Lofstedt marta.lofst...@linux.intel.com wrote: From: Marta Lofstedt marta.lofst...@intel.com v4 : only expose GL_ARB_texture_multisample enums for gles 3.1 and desktop GL. I was suspicious of this logic. Based on my reading of the code, what your ARB_texture_multisample_es31 thing does is expose those enums when *either* the driver enables ARB_texture_multisample *or* the current context is ES3.1. ARB_draw_indirect_es31 has the same problem, btw. I'm not sure I understand the issue. It should work fine with 'OR', if we have either of these then we expose the enum? I could have misread the get.c extra_ext() logic, but I don't think I have. As far as I can tell there's no way to (generically) AND these things. Why do we need AND? If you have 3.1 context then you *need* to support this enum. There is no such condition where you would have gles 3.1 context but not the enum, right? What you really need to do is create a whole new [GL, GL_CORE, ES31] section in get_hash_params and update get_hash_generator.py accordingly. An alternative, BTW, is to just say screw it and expose the enums in GL ES 3.0. In that case, just move the enums into a section that includes GLES3. I'm not sure how big of a deal it is. But your current patches are definitely wrong. We do this, but we have explicit version check against version 3.1. OK, well, at the end of the day, the things anyone says don't matter, it just has to work, right? So here are the scenarios: Let's take it as a given that the driver sets Ext.ARB_texture_multisample to true, we have the following situations: GL compat glGet(SAMPLE_MASK) GL core glGet(SAMPLE_MASK) GL ES3.0 glGet(SAMPLE_MASK) GL ES3.1 glGet(SAMPLE_MASK) Where the ES3.0 one should fail, and all the others should succeed. OK, so first off, this is in the GL/GL_CORE/GLES3 section, which means that the logic will get hit for all 4, right? Then you do a check for Yes, I think you're right, and I think I led Marta and Tapani astray. I think they originally had separate sections, and I told them to change at least some of them. The whole separate section mechanism is confusing and leads to code churn (imagine if an ES3.0 extension comes along that enables this functionality). I think a good Python project for someone would be to reorganize that file so that all the enums are in a single list. Each enum would have a list of minterms to describe its availability. These would include profile and version. Instead of having a flat array of integers, we'd have an array of structures: struct requirement { gl_api API; unsigned Version; /** Minimum version required. */ unsigned extension[2]; /** Offsets of required extensions. Use * dummy_true if none. */ enum value_extra extra; /** Other requirements. */ }; This is equivalent to the current system (which is also a maximum of minterms), but I think it's a lot easier to specify things. Either way, I think that can be follow-up work. We currently have varying levels of correctness wrt exposing features we're not supposed to expose. I haven't thought about your new dispatch stuff, but would, e.g., glPolygonOffsetClampEXT be retrievable in an ES context right now? If it would be, do we care? I feel like a bit of a troll when I'm pointing out these inconsistencies since they're fairly trivial, but I also see that some thought and effort has gone into hiding various functionality where it's not supposed to exist. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] mesa: Detect and provide macros for function attributes pure and const.
On 22/07/15 17:13, Jose Fonseca wrote: On 21/07/15 15:57, Emil Velikov wrote: On 18 July 2015 at 08:13, Jose Fonseca jfons...@vmware.com wrote: On 18/07/15 01:38, Eric Anholt wrote: Emil Velikov emil.l.veli...@gmail.com writes: On 14/07/15 19:45, Eric Anholt wrote: These are really useful hints to the compiler in the absence of link-time optimization, and I'm going to use them in VC4. I've made the const attribute be ATTRIBUTE_CONST unlike other function attributes, because we have other things in the tree #defining CONST for their own unrelated purposes. Mindly related: how people feel about making these macros less screamy, by following the approach used in the kernel: PURE - __pure and so on ? I'd love it. Less screamy is fine, but beware prefixing double underscore: the C standard stipulates that its use is reserved for for C/C++ runtime. [1] I though about it before posting although I've seen others define those, even do so in their public headers. Now that I have some examples from my current /usr/include Searching for __pure dwarves/dutil.h:#define __pure __attribute__ ((pure)) Searching for __attribute_const__ sys/cdefs.h:# define __attribute_const__ __attribute__ ((__const__)) sys/cdefs.h:# define __attribute_const__ /* Ignore */ Searching for __printf Searching for __always_unused Searching for __noreturn Searching for __packed libvisual-0.4/libvisual/lv_defines.h:# define __packed __attribute__ ((packed)) libvisual-0.4/libvisual/lv_defines.h:# define __packed /* no packed */ bsd/sys/cdefs.h:# define __packed __attribute__((__packed__)) bsd/sys/cdefs.h:# define __packed Searching for __deprecated pciaccess.h:#define __deprecated __attribute__((deprecated)) pciaccess.h:#define __deprecated Searching for __weak Searching for __alias With a handful of other headers defining more double underscore prefixed macros. Look at stdlibc++ implementation: every internal variable has a double underscore prefix. Unless we're talking about STL/other template library we don't care what library foo uses in it's internal implementation do we ? After all these will be resolved at compile time. Maybe kernel gets away on GLIBC (and because it doesn't use C++), but there's no guarantee it will work on other C runtimes, and even if it does, it could start failing anytime. True, it's not the best of ideas. Just worth pointing out that the cat is already out, for other projects. There are already more than 12K #define __foo cases on my system. These defines are reserved for system headers, so it's natural to be lots of them in /usr/include. MacOSX also defines some of these on its sys/cdefs.h: http://www.opensource.apple.com/source/xnu/xnu-1456.1.26/bsd/sys/cdefs.h The question is: can we expect that most systems will define these __foo, or at least not use them for other purposes. I don't know the answer. At a glance MSVC doesn't seem to rely on them for anything. So it might work. I don't oppose if you want to give it a shot. Jose Ironically, Windows headers already define PURE (its used in COM interfaces). Just realized this now after this was merged. So we really need a different name... Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] mesa: rearrange texture error checking order
On Wed, Jul 22, 2015 at 1:20 PM, Brian Paul bri...@vmware.com wrote: On 07/22/2015 11:02 AM, Ilia Mirkin wrote: This moves the width/height/depth == 0 check to the front and avoids doing any other checking when that is the case. Also moves the dimensions check after the format/type checks so that we don't bail out with success on a width/height/depth == 0 request when the format/type don't match. Bugzilla: https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.freedesktop.org_show-5Fbug.cgi-3Fid-3D91425d=BQIBAgc=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEsr=T0t4QG7chq2ZwJo6wilkFznRSFy-8uDKartPGbomVj8m=Y1TotaJmX1E7sCopvmMJc8PQKFXlHu6QxfcYOydKuIIs=L0lmmuk31G6M4hZEiwnX-IC4CPhW7rsrSCgEYtDgXjwe= Signed-off-by: Ilia Mirkin imir...@alum.mit.edu I suspect this isn't really needed in light of my patch to getteximage-invalid-format-for-packed-type.c I believe the test was in error, or at least sloppy, in that it was calling glGetTexImage to test format/type validation when there wasn't even a texture image to return. The OpenGL spec doesn't specify an order for error checking multiple things, so if there's multiple errors, you can't be sure which one will be reported first. My patch to the test removes that ambiguity. Well, irrespective of the piglit test, the current code is a little suspect -- it does the width/height/depth == 0 checks after it ensures that height == 1 for 1d and depth == 1 for 2d. Any thoughts on that? -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radeon: Silence GCC unused-but-set-variable warnings.
Reviewed-by: Marek Olšák marek.ol...@amd.com Marek On Wed, Jul 22, 2015 at 7:04 AM, Vinson Lee v...@freedesktop.org wrote: radeon_fbo.c: In function 'radeon_map_renderbuffer_s8z24': radeon_fbo.c:162:9: warning: variable 'ret' set but not used [-Wunused-but-set-variable] int ret; ^ radeon_fbo.c: In function 'radeon_map_renderbuffer_z16': radeon_fbo.c:200:9: warning: variable 'ret' set but not used [-Wunused-but-set-variable] int ret; ^ radeon_fbo.c: In function 'radeon_map_renderbuffer': radeon_fbo.c:242:8: warning: variable 'ret' set but not used [-Wunused-but-set-variable] int ret; ^ radeon_fbo.c: In function 'radeon_unmap_renderbuffer': radeon_fbo.c:419:14: warning: variable 'ok' set but not used [-Wunused-but-set-variable] GLboolean ok; ^ Signed-off-by: Vinson Lee v...@freedesktop.org --- src/mesa/drivers/dri/radeon/radeon_fbo.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/radeon/radeon_fbo.c b/src/mesa/drivers/dri/radeon/radeon_fbo.c index 5e4aaca..5eece51 100644 --- a/src/mesa/drivers/dri/radeon/radeon_fbo.c +++ b/src/mesa/drivers/dri/radeon/radeon_fbo.c @@ -169,6 +169,7 @@ radeon_map_renderbuffer_s8z24(struct gl_context *ctx, rrb-map_buffer = malloc(w * h * 4); ret = radeon_bo_map(rrb-bo, !!(mode GL_MAP_WRITE_BIT)); assert(!ret); +(void) ret; untiled_s8z24_map = rrb-map_buffer; tiled_s8z24_map = rrb-bo-ptr; @@ -207,6 +208,7 @@ radeon_map_renderbuffer_z16(struct gl_context *ctx, rrb-map_buffer = malloc(w * h * 2); ret = radeon_bo_map(rrb-bo, !!(mode GL_MAP_WRITE_BIT)); assert(!ret); +(void) ret; untiled_z16_map = rrb-map_buffer; tiled_z16_map = rrb-bo-ptr; @@ -324,6 +326,7 @@ radeon_map_renderbuffer(struct gl_context *ctx, ret = radeon_bo_map(rrb-bo, !!(mode GL_MAP_WRITE_BIT)); assert(!ret); + (void) ret; map = rrb-bo-ptr; stride = rrb-map_pitch; @@ -416,7 +419,6 @@ radeon_unmap_renderbuffer(struct gl_context *ctx, { struct radeon_context *const rmesa = RADEON_CONTEXT(ctx); struct radeon_renderbuffer *rrb = radeon_renderbuffer(rb); - GLboolean ok; if ((rmesa-radeonScreen-chip_flags RADEON_CHIPSET_DEPTH_ALWAYS_TILED) !rrb-has_surface) { if (rb-Format == MESA_FORMAT_Z24_UNORM_S8_UINT || rb-Format == MESA_FORMAT_Z24_UNORM_X8_UINT) { @@ -438,6 +440,7 @@ radeon_unmap_renderbuffer(struct gl_context *ctx, radeon_bo_unmap(rrb-map_bo); if (rrb-map_mode GL_MAP_WRITE_BIT) { + GLboolean ok; ok = rmesa-vtbl.blit(ctx, rrb-map_bo, 0, rb-Format, rrb-map_pitch / rrb-cpp, rrb-map_w, rrb-map_h, @@ -449,6 +452,7 @@ radeon_unmap_renderbuffer(struct gl_context *ctx, rrb-map_w, rrb-map_h, GL_FALSE); assert(ok); + (void) ok; } radeon_bo_unref(rrb-map_bo); -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 2/7] nv50: allocate a software object class
On 01/07/15 01:01, Samuel Pitoiset wrote: This will allow to monitor global performance counters through the command stream of the GPU instead of using ioctls. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/drivers/nouveau/nv50/nv50_screen.c | 11 +++ src/gallium/drivers/nouveau/nv50/nv50_screen.h | 1 + src/gallium/drivers/nouveau/nv50/nv50_winsys.h | 1 + 3 files changed, 13 insertions(+) diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c index 6583a35..c985344 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c @@ -367,6 +367,7 @@ nv50_screen_destroy(struct pipe_screen *pscreen) nouveau_object_del(screen-eng2d); nouveau_object_del(screen-m2mf); nouveau_object_del(screen-sync); + nouveau_object_del(screen-sw); nouveau_screen_fini(screen-base); @@ -437,6 +438,9 @@ nv50_screen_init_hwctx(struct nv50_screen *screen) BEGIN_NV04(push, SUBC_3D(NV01_SUBCHAN_OBJECT), 1); PUSH_DATA (push, screen-tesla-handle); + BEGIN_NV04(push, SUBC_SW(NV01_SUBCHAN_OBJECT), 1); + PUSH_DATA (push, screen-sw-handle); + BEGIN_NV04(push, NV50_3D(COND_MODE), 1); PUSH_DATA (push, NV50_3D_COND_MODE_ALWAYS); @@ -768,6 +772,13 @@ nv50_screen_create(struct nouveau_device *dev) goto fail; } + ret = nouveau_object_new(chan, 0xbeef506e, 0x506e, I guess the 0x506e needs to be defined in libdrm, right? Other than that, it is Reviewed-by: Martin Peres martin.pe...@free.fr +NULL, 0, screen-sw); + if (ret) { + NOUVEAU_ERR(Failed to allocate SW object: %d\n, ret); + goto fail; + } + ret = nouveau_object_new(chan, 0xbeef5039, NV50_M2MF_CLASS, NULL, 0, screen-m2mf); if (ret) { diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.h b/src/gallium/drivers/nouveau/nv50/nv50_screen.h index 881051b..69fdfdb 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.h +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.h @@ -93,6 +93,7 @@ struct nv50_screen { struct nouveau_object *tesla; struct nouveau_object *eng2d; struct nouveau_object *m2mf; + struct nouveau_object *sw; }; static INLINE struct nv50_screen * diff --git a/src/gallium/drivers/nouveau/nv50/nv50_winsys.h b/src/gallium/drivers/nouveau/nv50/nv50_winsys.h index e8578c8..5cb33ef 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_winsys.h +++ b/src/gallium/drivers/nouveau/nv50/nv50_winsys.h @@ -60,6 +60,7 @@ PUSH_REFN(struct nouveau_pushbuf *push, struct nouveau_bo *bo, uint32_t flags) #define SUBC_COMPUTE(m) 6, (m) #define NV50_COMPUTE(n) SUBC_COMPUTE(NV50_COMPUTE_##n) +#define SUBC_SW(m) 7, (m) static INLINE uint32_t NV50_FIFO_PKHDR(int subc, int mthd, unsigned size) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/14] mesa: Set green, blue channels to zero only for formats with these components
On Tue, 2015-07-21 at 11:13 -0700, Anuj Phogat wrote: On Tue, Jul 21, 2015 at 12:50 AM, Iago Toral ito...@igalia.com wrote: On Tue, 2015-06-16 at 11:15 -0700, Anuj Phogat wrote: Signed-off-by: Anuj Phogat anuj.pho...@gmail.com --- src/mesa/drivers/common/meta.c | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c index 214a68a..fceb25d 100644 --- a/src/mesa/drivers/common/meta.c +++ b/src/mesa/drivers/common/meta.c @@ -3132,9 +3132,16 @@ decompress_texture_image(struct gl_context *ctx, * returned as red and two-channel texture values are returned as * red/alpha. */ - if ((baseTexFormat == GL_LUMINANCE || - baseTexFormat == GL_LUMINANCE_ALPHA || - baseTexFormat == GL_INTENSITY) || + if (((baseTexFormat == GL_LUMINANCE || +baseTexFormat == GL_LUMINANCE_ALPHA || +baseTexFormat == GL_INTENSITY) + (destBaseFormat == GL_RGBA || +destBaseFormat == GL_RGB || +destBaseFormat == GL_RG || +destBaseFormat == GL_GREEN || +destBaseFormat == GL_BLUE || +destBaseFormat == GL_BGRA || +destBaseFormat == GL_BGR)) || Is this needed to achieve correct behavior or just an optimization? I would expect that if the dest format does not have G/B channels, setting pixel transfer options for these channels would not have any functional effect anyway. This is just an optimization. We set pixel transfer operations based on these conditions and then call _mesa_ReadPixels, which falls back to slower path if transfer operations are set. I'll bump up the commit message of this patch. Yes, a note in the commit log is worth it, thanks, Reviewed-by: Iago Toral Quiroga ito...@igalia.com /* If we're reading back an RGB(A) texture (using glGetTexImage) as * luminance then we need to return L=tex(R). */ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/4] i965/fs: Implement pass to lower instructions of unsupported SIMD width.
Jason Ekstrand ja...@jlekstrand.net writes: A few comments below. Mostly just asking for explanation. 1-3 are Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com Obviously, don't merge 4/4 until it actually has users. --Jason Thanks. On Thu, Jul 16, 2015 at 8:35 AM, Francisco Jerez curroje...@riseup.net wrote: This lowering pass implements an algorithm to expand SIMDN instructions into a sequence of SIMDM instructions in cases where the hardware doesn't support the original execution size natively for some particular instruction. The most important use-cases are: - Lowering send message instructions that don't support SIMD16 natively into SIMD8 (several texturing, framebuffer write and typed surface operations). - Lowering messages that don't support SIMD8 natively into SIMD16 (*cough*gen4*cough*). - 64-bit precision operations (e.g. FP64 and 64-bit integer multiplication). - SIMD32. The algorithm works by splitting the sources of the original instruction into chunks of width appropriate for the lowered instructions, and then interleaving the results component-wise into the destination of the original instruction. The pass is controlled by the get_lowered_simd_width() function that currently just returns the original execution size making the whole pass a no-op for the moment until some user is introduced. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 142 +++ src/mesa/drivers/dri/i965/brw_fs.h | 1 + 2 files changed, 143 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index d031352..eeb6938 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3204,6 +3204,147 @@ fs_visitor::lower_logical_sends() return progress; } +/** + * Get the closest native SIMD width supported by the hardware for instruction + * \p inst. The instruction will be left untouched by + * fs_visitor::lower_simd_width() if the returned value is equal to the + * original execution size. + */ +static unsigned +get_lowered_simd_width(const struct brw_device_info *devinfo, + const fs_inst *inst) +{ + switch (inst-opcode) { + default: + return inst-exec_size; + } +} + +/** + * The \p rows array of registers represents a \p num_rows by \p num_columns + * matrix in row-major order, write it in column-major order into the register + * passed as destination. \p stride gives the separation between matrix + * elements in the input in fs_builder::dispatch_width() units. + */ +static void +emit_transpose(const fs_builder bld, + const fs_reg dst, const fs_reg *rows, + unsigned num_rows, unsigned num_columns, unsigned stride) I'm not sure what I think about calling this emit_transpose. I guess it is kind of a transpose, but maybe it's more of a gather. I'm not going to quibble about it though. *Shrug*, it doesn't only gather the vectors of the rows array (that would have been emit_collect :P), it copies them out in vertical, just like a matrix transpose -- assuming you're not horrified by the idea of considering the argument a matrix. +{ + fs_reg *const components = new fs_reg[num_rows * num_columns]; + + for (unsigned i = 0; i num_columns; ++i) { + for (unsigned j = 0; j num_rows; ++j) + components[num_rows * i + j] = offset(rows[j], bld, stride * i); + } + + bld.LOAD_PAYLOAD(dst, components, num_rows * num_columns, 0); + + delete[] components; +} + +bool +fs_visitor::lower_simd_width() +{ + bool progress = false; + + foreach_block_and_inst_safe(block, fs_inst, inst, cfg) { + const unsigned lower_width = get_lowered_simd_width(devinfo, inst); + + if (lower_width != inst-exec_size) { + /* Builder matching the original instruction. */ + const fs_builder ibld = bld.at(block, inst) +.exec_all(inst-force_writemask_all) +.group(inst-exec_size, inst-force_sechalf); + + /* Split the copies in chunks of the execution width of either the + * original or the lowered instruction, whichever is lower. + */ + const unsigned copy_width = MIN2(lower_width, inst-exec_size); + const unsigned n = inst-exec_size / copy_width; + const unsigned dst_size = inst-regs_written * REG_SIZE / +inst-dst.component_size(inst-exec_size); + fs_reg dsts[4]; + + assert(n 0 n = ARRAY_SIZE(dsts) +!inst-writes_accumulator !inst-mlen); + + for (unsigned i = 0; i n; i++) { +/* Emit a copy of the original instruction with the lowered width. + * If the EOT flag was set throw it away except for the last + * instruction to avoid killing the thread prematurely. + */
Re: [Mesa-dev] [PATCH 09/12] i965/fs: Hook up SIMD lowering to handle texturing opcodes of unsupported width.
Jason Ekstrand ja...@jlekstrand.net writes: On Sat, Jul 18, 2015 at 7:34 AM, Francisco Jerez curroje...@riseup.net wrote: This should match the set of cases in which we currently call fail() or no16() from the emit_texture_*() methods and the ones in which emit_texture_gen4() enables the SIMD16 workaround. Hint for reviewers: It's not a big deal if I happen to have missed some case here, it will just lead to an assertion failure down the road which is easily fixable, however being stricter than necessary won't cause any visible breakage, it would just decrease performance silently due to the unnecessary message splitting, so feel free to double-check that all cases listed here already cause a SIMD8/16 fall-back with the current texturing code -- You may want to skip over the Gen5-6 cases though if you don't have pencil and paper at hand. I'll believe you. Even if you got it wrong, the worst that happens is we hit the fail case when we try and lower. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 27 +++ 1 file changed, 27 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 043d9e9..f291202 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3918,6 +3918,33 @@ get_lowered_simd_width(const struct brw_device_info *devinfo, /* Dual-source FB writes are unsupported in SIMD16 mode. */ return (inst-src[1].file != BAD_FILE ? 8 : inst-exec_size); + case SHADER_OPCODE_TXD_LOGICAL: + /* TXD is unsupported in SIMD16 mode. */ + return 8; + + case SHADER_OPCODE_TG4_OFFSET_LOGICAL: { + /* gather4_po_c is unsupported in SIMD16 mode. */ + const fs_reg shadow_c = inst-src[1]; + return (shadow_c.file != BAD_FILE ? 8 : inst-exec_size); + } + case SHADER_OPCODE_TXL_LOGICAL: + case FS_OPCODE_TXB_LOGICAL: { + /* Gen4 doesn't have SIMD8 non-shadow-compare bias/LOD instructions, and + * Gen4-6 don't support TXL and TXB with shadow comparison in SIMD16 + * mode. + */ + const fs_reg shadow_c = inst-src[1]; + return (devinfo-gen == 4 shadow_c.file == BAD_FILE ? 16 : + devinfo-gen 7 shadow_c.file != BAD_FILE ? 8 : + inst-exec_size); Could we please use an if-ladder here. Nested ternaries are hard to read and kind of pointless given that we can return multiple places in the if. Heh, I always have the very opposite style itch -- If you can compute something succinctly with a handful of pure expressions why venture into the dangerous realm of statements? I think John Backus' article Can programming be liberated from the von Neumann style? although slightly outdated elaborates the point pretty well. But sure, if you'd like it better with an if-ladder why not. + } + case SHADER_OPCODE_TXF_LOGICAL: + case SHADER_OPCODE_TXS_LOGICAL: + /* Gen4 doesn't have SIMD8 variants for the RESINFO and LD-with-LOD + * messages. Use SIMD16 instead. + */ + return (devinfo-gen == 4 ? 16 : inst-exec_size); I'd kind of also like this ternary to go, but I'm ok with it if you want to keep it. + default: return inst-exec_size; } -- 2.4.3 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 90264] [Regression, bisected] Tooltip corruption in Chrome
https://bugs.freedesktop.org/show_bug.cgi?id=90264 --- Comment #38 from Heiko lil_...@web.de --- (In reply to Antoine Labour from comment #37) Can you make glXWaitX re-evaluate window sizes? That's the reason we call it - we just resized the window, we want to make sure we're drawing to the right back buffer. This is a UI workload, we may draw only once and we don't want flashing (what a glXSwapBuffers would do), so we want to make sure we're drawing to a buffer of the right size - glXWaitX is meant to synchronize GL with changes in X. In this trace, you're only seeing parts of the calls - in particular there should be a glViewport and draws and a glXSwapBuffers after the resize. I don't think there's confusion from the app side. Parts of what you're seeing is chromium restoring state between its virtual contexts (we have different subsystems that all need to access GL, and are multiplexed, possibly on a single GL context, depending on the version), so it's not necessarily surprising to see scissor/viewport that doesn't match the window size - until we get to actually draw the picture on screen. In particular, the texture is correctly loaded with 370x72, so no confusion there. Well, most of my findings were pretty vague, because I'm neither a GL expert nor a mesa expert. With your hint, I just noticed that dri2_wait_x()/dri2_wait_gl() don't do anything, because of priv-have_fake_front being zero all the time. It would be set to 1 for buffer attachments of __DRI_BUFFER_FAKE_FRONT_LEFT, but that's nowhere set in mesa. Thus gcc just optimized it away completely along with dri2_copy_drawable(). Seems to be related to [1]? [1] http://lists.freedesktop.org/archives/mesa-dev/2011-June/008241.html -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH mesa 2/2] egl: android: remove DRM_GRALLOC_TOP hack
On Tuesday, July 21, 2015 03:36:32 PM Emil Velikov wrote: Now that the drm_gralloc module exports the correct includes we can get rid of this hack. Cc: Varad Gautam varadgau...@gmail.com Cc: Chih-Wei Huang cwhu...@android-x86.org Cc: Eric Anholt e...@anholt.net Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- these two Tested-by: Varad Gautam varadgau...@gmail.com Android.mk | 2 -- src/egl/Android.mk | 1 - 2 files changed, 3 deletions(-) diff --git a/Android.mk b/Android.mk index 17fd5f5..ed160fb 100644 --- a/Android.mk +++ b/Android.mk @@ -45,8 +45,6 @@ endif MESA_COMMON_MK := $(MESA_TOP)/Android.common.mk MESA_PYTHON2 := python -DRM_GRALLOC_TOP := hardware/drm_gralloc - classic_drivers := i915 i965 gallium_drivers := swrast freedreno i915g ilo nouveau r300g r600g radeonsi vmwgfx vc4 diff --git a/src/egl/Android.mk b/src/egl/Android.mk index 516b53c..ebd67af 100644 --- a/src/egl/Android.mk +++ b/src/egl/Android.mk @@ -52,7 +52,6 @@ LOCAL_CFLAGS += -DDEFAULT_DRIVER_DIR=\/system/lib/dri\ endif LOCAL_C_INCLUDES := \ - $(DRM_GRALLOC_TOP) \ $(MESA_TOP)/src/egl/main \ $(MESA_TOP)/src/egl/drivers/dri2 \ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] android: rework the EGL build
On Tuesday, July 21, 2015 03:27:47 PM Emil Velikov wrote: See previous two commits for details. v2: Don't forget git mv, bring back DRM_GRALLOC_TOP. Spotted by Varad. Cc: Varad Gautam varadgau...@gmail.com Cc: Chih-Wei Huang cwhu...@android-x86.org Cc: Eric Anholt e...@anholt.net Signed-off-by: Emil Velikov emil.l.veli...@gmail.com Tested-by: Varad Gautam varadgau...@gmail.com --- Patch generated with -M to ease diff/review process. -Emil Android.mk | 3 +- src/egl/{main = }/Android.mk | 36 --- src/egl/drivers/dri2/Android.mk | 63 - 3 files changed, 26 insertions(+), 76 deletions(-) rename src/egl/{main = }/Android.mk (76%) delete mode 100644 src/egl/drivers/dri2/Android.mk diff --git a/Android.mk b/Android.mk index 69e0d33..17fd5f5 100644 --- a/Android.mk +++ b/Android.mk @@ -91,8 +91,7 @@ SUBDIRS := \ src/glsl \ src/mesa \ src/util \ - src/egl/main \ - src/egl/drivers/dri2 \ + src/egl \ src/mesa/drivers/dri ifeq ($(strip $(MESA_BUILD_GALLIUM)),true) diff --git a/src/egl/main/Android.mk b/src/egl/Android.mk similarity index 76% rename from src/egl/main/Android.mk rename to src/egl/Android.mk index 270c165..516b53c 100644 --- a/src/egl/main/Android.mk +++ b/src/egl/Android.mk @@ -27,19 +27,37 @@ LOCAL_PATH := $(call my-dir) include $(LOCAL_PATH)/Makefile.sources -SOURCES := \ - ${LIBEGL_C_FILES} - # --- # Build libGLES_mesa # --- include $(CLEAR_VARS) -LOCAL_SRC_FILES := $(SOURCES) +LOCAL_SRC_FILES := \ + $(LIBEGL_C_FILES) \ + $(dri2_backend_core_FILES) \ + drivers/dri2/platform_android.c LOCAL_CFLAGS := \ - -D_EGL_NATIVE_PLATFORM=_EGL_PLATFORM_ANDROID + -D_EGL_NATIVE_PLATFORM=_EGL_PLATFORM_ANDROID \ + -D_EGL_BUILT_IN_DRIVER_DRI2 \ + -DHAVE_ANDROID_PLATFORM + +ifeq ($(MESA_LOLLIPOP_BUILD),true) +LOCAL_CFLAGS_arm := -DDEFAULT_DRIVER_DIR=\/system/lib/dri\ +LOCAL_CFLAGS_x86 := -DDEFAULT_DRIVER_DIR=\/system/lib/dri\ +LOCAL_CFLAGS_x86_64 := -DDEFAULT_DRIVER_DIR=\/system/lib64/dri\ +else +LOCAL_CFLAGS += -DDEFAULT_DRIVER_DIR=\/system/lib/dri\ +endif + +LOCAL_C_INCLUDES := \ + $(DRM_GRALLOC_TOP) \ + $(MESA_TOP)/src/egl/main \ + $(MESA_TOP)/src/egl/drivers/dri2 \ + +LOCAL_STATIC_LIBRARIES := \ + libmesa_loader LOCAL_SHARED_LIBRARIES := \ libdl \ @@ -53,12 +71,11 @@ LOCAL_SHARED_LIBRARIES += libsync endif # add libdrm if there are hardware drivers -ifneq ($(MESA_GPU_DRIVERS),swrast) +ifneq ($(filter-out swrast,$(MESA_GPU_DRIVERS)),) +LOCAL_CFLAGS += -DHAVE_LIBDRM LOCAL_SHARED_LIBRARIES += libdrm endif -LOCAL_CFLAGS += -D_EGL_BUILT_IN_DRIVER_DRI2 - ifeq ($(strip $(MESA_BUILD_CLASSIC)),true) # require i915_dri and/or i965_dri LOCAL_REQUIRED_MODULES += \ @@ -69,9 +86,6 @@ ifeq ($(strip $(MESA_BUILD_GALLIUM)),true) LOCAL_REQUIRED_MODULES += gallium_dri endif # MESA_BUILD_GALLIUM -LOCAL_STATIC_LIBRARIES := \ - libmesa_egl_dri2 \ - libmesa_loader LOCAL_MODULE := libGLES_mesa ifeq ($(MESA_LOLLIPOP_BUILD),true) diff --git a/src/egl/drivers/dri2/Android.mk b/src/egl/drivers/dri2/Android.mk deleted file mode 100644 index 76be3b2..000 --- a/src/egl/drivers/dri2/Android.mk +++ /dev/null @@ -1,63 +0,0 @@ -# Mesa 3-D graphics library -# -# Copyright (C) 2010-2011 Chia-I Wu olva...@gmail.com -# Copyright (C) 2010-2011 LunarG Inc. -# -# Permission is hereby granted, free of charge, to any person obtaining a -# copy of this software and associated documentation files (the Software), -# to deal in the Software without restriction, including without limitation -# the rights to use, copy, modify, merge, publish, distribute, sublicense, -# and/or sell copies of the Software, and to permit persons to whom the -# Software is furnished to do so, subject to the following conditions: -# -# The above copyright notice and this permission notice shall be included -# in all copies or substantial portions of the Software. -# -# THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL -# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING -# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER -# DEALINGS IN THE SOFTWARE. - -# Android.mk for egl_dri2 - -LOCAL_PATH := $(call my-dir) - -include $(CLEAR_VARS) - -LOCAL_SRC_FILES := \ - egl_dri2.c \ - platform_android.c - -LOCAL_CFLAGS := \ - -DHAVE_ANDROID_PLATFORM - -ifeq ($(MESA_LOLLIPOP_BUILD),true) -LOCAL_CFLAGS_arm := -DDEFAULT_DRIVER_DIR=\/system/lib/dri\
Re: [Mesa-dev] [PATCH 05/14] meta: Abort meta pbo path if readpixels need signed-unsigned conversion
On Tue, 2015-07-21 at 17:05 -0700, Anuj Phogat wrote: On Tue, Jul 21, 2015 at 1:36 AM, Iago Toral ito...@igalia.com wrote: On Tue, 2015-07-21 at 08:13 +0200, Iago Toral wrote: On Mon, 2015-07-20 at 10:56 -0700, Anuj Phogat wrote: On Mon, Jul 20, 2015 at 5:10 AM, Iago Toral ito...@igalia.com wrote: On Fri, 2015-06-19 at 13:40 -0700, Anuj Phogat wrote: On Tue, Jun 16, 2015 at 9:21 PM, Jason Ekstrand ja...@jlekstrand.net wrote: On Jun 16, 2015 11:15, Anuj Phogat anuj.pho...@gmail.com wrote: Without this patch, piglit test fbo_integer_readpixels_sint_uint fails, when forced to use the meta pbo path. Signed-off-by: Anuj Phogat anuj.pho...@gmail.com Cc: mesa-sta...@lists.freedesktop.org --- src/mesa/drivers/common/meta_tex_subimage.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/drivers/common/meta_tex_subimage.c b/src/mesa/drivers/common/meta_tex_subimage.c index 00364f8..84cbc50 100644 --- a/src/mesa/drivers/common/meta_tex_subimage.c +++ b/src/mesa/drivers/common/meta_tex_subimage.c @@ -283,6 +283,9 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, GLuint dims, if (_mesa_need_rgb_to_luminance_conversion(rb-Format, format)) return false; + + if (_mesa_need_signed_unsigned_int_conversion(rb-Format, format, type)) + return false; Hrm... This seems fishy. Isn't glBlitFramebuffers supposed to handle format conversion with integers? If so we should probably fix it rather than just skip it for the meta pbo path. As discussed offline, here is relevant text for glBlitFrameBuffer() from OpenGL 4.5 spec, section 18.3.1: An INVALID_OPERATION error is generated if format conversions are not supported, which occurs under any of the following conditions: -The read buffer contains fixed-point or floating-point values and any draw buffer contains neither fixed-point nor floating-point values. -The read buffer contains unsigned integer values and any draw buffer does not contain unsigned integer values. - The read buffer contains signed integer values and any draw buffer does not contain signed integer values. I'll add a comment here explaining the reason to avoid meta path. Is this code going to run only for glBlitFramebuffer? I see this function being called from code paths that implement glReadPixels and glGetTexImage too. _mesa_meta_pbo_GetTexSubImage() is used only for glReadPixels and glGetTexImage. I quoted the glBliFrameBuffer restriction above because the function is later using _mesa_meta_BlitFramebuffer(), which doesn't support some format conversions. If this function can be used to resolve ReadPixels and GetTexImage but the checks you add are *specific* to BlitFramebuffer, it does not look like this is the right place for them. Shouldn't you put them inside _mesa_meta_BlitFramebuffer instead? Otherwise they would affect to ReadPixels and GetTexImage too and I don't see the same restrictions applying to ReadPixels for example. We already have error checks in place for glBlitFrameBuffer(). Take a look at compatible_color_datatypes() in _mesa_blit_framebuffer(). Specifically for ReadPixels I only see this in the spec: An INVALID_OPERATION error is generated if format is an integer format and the color buffer is not an integer format, or if the color buffer is an integer format and format is not an integer format. So, unlike BlitFramebuffer, it seems that ReadPixels is fine as long as both formats are integer, no matter if the types have the same sign or not. Right. That's the reason this patch doesn't generate any GL error for signed-unsigned int mismatch. It just decides not to use meta pbo path because of unsupported format conversions in _mesa_meta_BlitFrameBuffer(), and fallback to using other paths. Ah, I see now, thanks for clarifying this. Reviewed-by: Iago Toral Quiroga ito...@igalia.com Iago } /* For arrays, use a tall (height * depth) 2D texture but taking into -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org
Re: [Mesa-dev] [PATCH 18/78] i965: Take is_scalar_shader_stage() method out to allow reuse
On 07/13/2015 01:38 PM, Jason Ekstrand wrote: On Fri, Jul 10, 2015 at 8:53 AM, Eduardo Lima Mitev el...@igalia.com wrote: On 06/30/2015 06:58 PM, Jason Ekstrand wrote: On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev el...@igalia.com wrote: This patch makes public the is_scalar_shader_stage() method in brw_shader, and renames it to brw_compiler_is_scalar_shader_stage(). The plan is to later reuse it in brw_nir, to enable/disable optimization passes depending on the type of shader stage. I'm not so sure that this is a good plan. It assumes that whether we are doing a scalar or vec4 compile is based entirely on the shader stage and some static information (possibly based on environment variables). Ken and I were talking around the office and we may want to use both SIMD4x2 and SIMD8 mode for geometry shaders depending on the number of inputs, etc. This won't work in the given paradigm. If I understand correctly, what you propose is having a function to dynamically choose the type of shader (scalar vs. vec4) when compiling the shader, using not only gen and stage, but also actual application data. I think this is a good idea and will allow experimenting with different combinations of shaders with real input data. However, I wonder if we this should be added later after more elaborated thoughts on what exactly do we need and where to plug it. I have been experimenting with a function following the use case you mentioned, choosing shader backend based on inputs to a GS. But honestly it feels like a blind guess from my side to what actually you and Ken have in mind. Current patch basically reuses the function we already have to select the shader, so what I propose is to move forward with it for the moment (adding the missing MESA_SHADER_COMPUTE) and discuss how to extend it to factor in dynamic data; or perhaps you can explain us your proposal with a bit more detail. WDYT? My primary issue was with having it based on a is_scalar_stage function at all. It seems like a better long-term plan would be to simply pass an is_scalar boolean into those lowering functions. That said, I haven't taken a real hard look as to what calls those functions and whether or not that's actually what we want either. Make sense? If there's no better place to make that determination further up the call chain, then this is fine for now. Yes, I like the idea of passing the is_scalar boolean to brw_create_nir() directly, so that this decision is moved up to a higher context. By doing this we also leave the is_scalar_shader_stage() function private to brw_shader.cpp, meaning this patch is not relevant anymore and we leave that function untouched. A patch illustrating these changes is at: https://github.com/Igalia/mesa/commit/9e7bc62d43eaa30de005c40b91551ad113c4065c In the future, if we want to select shader type (scalar vs. vector) based on dynamic info, the logic that needs changing is contained in brw_shader. But this patch doesn't address that yet, lacking a more defined use-case. My (wild) idea would be to have a preferred_shader_type variable in brw_context for example, that can be modifiable dynamically at any point, and will instruct the link phase to use scalar or vec4 if possible (depending on gen, stage, etc). What do you think? The new method accepts a brw_compiler instead of a brw_context. This is done for consistency, since the actual info we need (scalar_vs) is in brw_compiler, and fetching in through brw_content-intelScreen-compiler seems like too much indirection. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580 --- src/mesa/drivers/dri/i965/brw_shader.cpp | 22 ++ src/mesa/drivers/dri/i965/brw_shader.h | 13 + 2 files changed, 19 insertions(+), 16 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp index 0b53647..3b99046 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp @@ -182,19 +182,6 @@ brw_shader_precompile(struct gl_context *ctx, return true; } -static inline bool -is_scalar_shader_stage(struct brw_context *brw, int stage) -{ - switch (stage) { - case MESA_SHADER_FRAGMENT: - return true; - case MESA_SHADER_VERTEX: - return brw-intelScreen-compiler-scalar_vs; - default: - return false; - } -} - static void brw_lower_packing_builtins(struct brw_context *brw, gl_shader_stage shader_type, @@ -205,7 +192,8 @@ brw_lower_packing_builtins(struct brw_context *brw, | LOWER_PACK_UNORM_2x16 | LOWER_UNPACK_UNORM_2x16; - if (is_scalar_shader_stage(brw, shader_type)) { + if (brw_compiler_is_scalar_shader_stage(brw-intelScreen-compiler, + shader_type)) { ops |= LOWER_UNPACK_UNORM_4x8 | LOWER_UNPACK_SNORM_4x8
[Mesa-dev] [PATCH] i965/fs: Fix calculation of the number of registers read in opt_copy_propagate.
This seems to have been multiplying by stride twice since fs_inst::regs_read/regs_written were changed to return the value in register units rather than in dispatch_width-wide components. The value returned by fs_inst::regs_read() already takes into account the stride so it's wrong to do it again here. --- src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp index 269bdb5..3940158 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp @@ -309,7 +309,7 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry) */ if (inst-src[arg].reg_offset entry-dst.reg_offset || (inst-src[arg].reg_offset * 32 + inst-src[arg].subreg_offset + -inst-regs_read(arg) * inst-src[arg].stride * 32) +inst-regs_read(arg) * 32) (entry-dst.reg_offset + entry-regs_written) * 32) return false; @@ -477,7 +477,7 @@ fs_visitor::try_constant_propagate(fs_inst *inst, acp_entry *entry) */ if (inst-src[i].reg_offset entry-dst.reg_offset || (inst-src[i].reg_offset * 32 + inst-src[i].subreg_offset + - inst-regs_read(i) * inst-src[i].stride * 32) + inst-regs_read(i) * 32) (entry-dst.reg_offset + entry-regs_written) * 32) continue; -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/12] st/dri: fix EGL_KHR_fence_sync since the last radeonsi change broke it
On 17.07.2015 01:54, Marek Olšák wrote: From: Marek Olšák marek.ol...@amd.com Broken by f1be3d8cdde17a9b9ae283e1bab2f46b992d3bf3, which returns NULL if no commands have been submitted. I've fixed that in radeonsi, so you can drop this patch. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 16/78] i965/nir/vec4: Implement store_output intrinsic
On 07/13/2015 01:57 PM, Jason Ekstrand wrote: On Wed, Jul 8, 2015 at 11:54 PM, Eduardo Lima Mitev el...@igalia.com wrote: On 06/30/2015 06:51 PM, Jason Ekstrand wrote: On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev el...@igalia.com wrote: The index into the output_reg array where to store the destination register is fetched from the nir_outputs map built during nir_setup_outputs stage. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580 --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 17 +++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 8a2d335..55d4490 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp @@ -520,10 +520,23 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr) } case nir_intrinsic_store_output_indirect: + has_indirect = true; /* fallthrough */ - case nir_intrinsic_store_output: - /* @TODO: Not yet implemented */ + case nir_intrinsic_store_output: { + int offset = instr-const_index[0]; + int output = nir_outputs[offset]; + + src = get_nir_src(instr-src[0], nir_output_types[offset]); + dest = dst_reg(src); + + dest.writemask = brw_writemask_for_size(instr-num_components); + + if (has_indirect) + dest.reladdr = new(mem_ctx) src_reg(get_nir_src(instr-src[1])); + + output_reg[output] = dest; I'm very confused about the amount of indirection going on here. It seems to me that we should be setting these outputs up in setup_outputs() rather than storring off a map from ints to other ints and setting it up here. I didn't make this comment on the patch for setup_outputs() because I wanted to wait to see it used before I commented on it. I'm guessing you did it this way because the nir_assign_var_locations is giving you bogus values. If so, then it might be better to just assign variable locations in setup_outputs() rather than having a remap table. The whole point of nir_lower_io is to make IO easy for the back-end. If you need a re-map table, then it's no longer making it easy and we need to think more about what's going on. --Jason That double indirection felt bad since the beginning, but it was needed to store the original variable's location (var-data.location). Let me explain: We are (re)using the plumbering in vec4_visitor to setup URB, so the only thing we need to do is to store the out register in output_reg map at the correct location. And that location is given by the original location in the shader (var-data.location). So, in this case, nir_assign_var_locations pass, which constructs var-data.driver_location, is not useful to us, except to give us consecutive indexes to construct the other map we have, the type map, which is needed to carry the correct type from the original variable to the output register. If nir_assign_var_locations isn't doing anything for you, don't call it. You'll need to do something with var-data.driver_location. If what you really want is var-data.location, then just copy that to var-data.driver_location when you do nir_setup_outputs. Or (depending on how the URB setup works, I don't actually know), put the actual URB location in var-data.driver_location when you walk the outputs. From there, you have two options. One would be to setup output_reg at the same time with the correct types right away and emit a MOV when you get a store_output. (Copy propagation should clean up the MOV.) For what it's worth, I don't think the type matters; a URB write just writes data to something so as long as you don't have a type mismatch in a MOV, the hardware won't care. The other option, would be to directly emit the URB write in store_output. At the moment, it may be better to take the first option since that better matches what the FS does right now. But both should work fine. Thanks for these hints, they were very useful. I rewrote the implementation of store_output intrinsic to avoid the setup phase completely. The type, as you suggested, was not important as long as they match while MOVing the contents of output_reg. To guarantee that, I had to patch the emit_urb_slot() to guarantee the types always match. This code is shared with vec4_visitor, so it makes sense to move the safeguards there instead of having both backends provide the correct register type in output_reg entries. For reference, this is the patch that implements it: https://github.com/Igalia/mesa/commit/8c703937f285c0b3a1e7bf6681c7ed7fe09815aa . I also put var-data.location in const_index[1] of the intrinsic op, and disabled nir_assign_var_locations() for output variables, since I don't need var-data.driver_location. I could have used const_index[0], but I prefer to leave driver_location there, and use const_index[1], to avoid
Re: [Mesa-dev] [PATCH 10/12] i965/fs: Reimplement emit_texture() in terms of logical send messages.
Jason Ekstrand ja...@jlekstrand.net writes: On Sat, Jul 18, 2015 at 7:34 AM, Francisco Jerez curroje...@riseup.net wrote: --- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 66 +--- 1 file changed, 49 insertions(+), 17 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 89fcc49..4011639 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -861,6 +861,14 @@ fs_visitor::emit_texture(ir_texture_opcode op, } } + if (op == ir_query_levels) { + /* textureQueryLevels() is implemented in terms of TXS so we need to + * pass a valid LOD argument. + */ + assert(lod.file == BAD_FILE); + lod = fs_reg(0u); Why can't this go as part of the switch below? It seems kind of odd to split the logic up like this. What I had in mind originally was to factor out the switch statement below into a separate function -- Later on I noticed that you wanted to rework emit_texture() and friends to stop using ir_texture_opcode as representation for the texturing operation, so you would likely want to get rid of the whole switch-case statement below or replace it with something else, so I didn't bother. What are your current plans for emit_texture()? + } + if (coordinate.file != BAD_FILE) { /* FINISHME: Texture coordinate rescaling doesn't work with non-constant * samplers. This should only be a problem with GL_CLAMP on Gen7. @@ -873,26 +881,50 @@ fs_visitor::emit_texture(ir_texture_opcode op, * samples, so don't worry about them. */ fs_reg dst = vgrf(glsl_type::get_instance(dest_type-base_type, 4, 1)); + const fs_reg srcs[] = { + coordinate, shadow_c, lod, lod2, + sample_index, mcs, sampler_reg, offset_value, + fs_reg(coord_components), fs_reg(grad_components) + }; + enum opcode opcode; - if (devinfo-gen = 7) { - inst = emit_texture_gen7(op, dst, coordinate, coord_components, - shadow_c, lod, lod2, grad_components, - sample_index, mcs, sampler_reg, - offset_value); - } else if (devinfo-gen = 5) { - inst = emit_texture_gen5(op, dst, coordinate, coord_components, - shadow_c, lod, lod2, grad_components, - sample_index, sampler, - offset_value.file != BAD_FILE); - } else if (dispatch_width == 16) { - inst = emit_texture_gen4_simd16(op, dst, coordinate, coord_components, - shadow_c, lod, sampler); - } else { - inst = emit_texture_gen4(op, dst, coordinate, coord_components, - shadow_c, lod, lod2, grad_components, - sampler); + switch (op) { + case ir_tex: + opcode = SHADER_OPCODE_TEX_LOGICAL; + break; + case ir_txb: + opcode = FS_OPCODE_TXB_LOGICAL; + break; + case ir_txl: + opcode = SHADER_OPCODE_TXL_LOGICAL; + break; + case ir_txd: + opcode = SHADER_OPCODE_TXD_LOGICAL; + break; + case ir_txf: + opcode = SHADER_OPCODE_TXF_LOGICAL; + break; + case ir_txf_ms: + opcode = SHADER_OPCODE_TXF_CMS_LOGICAL; + break; + case ir_txs: + case ir_query_levels: + opcode = SHADER_OPCODE_TXS_LOGICAL; + break; + case ir_lod: + opcode = SHADER_OPCODE_LOD_LOGICAL; + break; + case ir_tg4: + opcode = (offset_value.file != BAD_FILE offset_value.file != IMM ? +SHADER_OPCODE_TG4_OFFSET_LOGICAL : SHADER_OPCODE_TG4_LOGICAL); + break; + default: + unreachable(not reached); Please choose something more descriptive than not reached. How about invalid texture opcode Sure, I'll fix that, thanks. } + inst = bld.emit(opcode, dst, srcs, ARRAY_SIZE(srcs)); + inst-regs_written = 4 * dispatch_width / 8; + if (shadow_c.file != BAD_FILE) inst-shadow_compare = true; -- 2.4.3 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 04/12] i965/fs: Pass a BAD_FILE header source to LOAD_PAYLOAD in emit_texture_gen7().
Jason Ekstrand ja...@jlekstrand.net writes: On Sat, Jul 18, 2015 at 7:34 AM, Francisco Jerez curroje...@riseup.net wrote: So that it's left uninitialized by LOAD_PAYLOAD, we only need to reserve space for it in the message since it will be initialized implicitly by the generator. --- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 522e13e..89fcc49 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -473,8 +473,9 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, fs_reg dst, if (op == ir_tg4 || offset_value.file != BAD_FILE || is_high_sampler(devinfo, sampler)) { /* For general texture offsets (no txf workaround), we need a header to - * put them in. Note that for SIMD16 we're making space for two actual - * hardware registers here, so the emit will have to fix up for this. + * put them in. Note that we're only reserving space for it in the + * message payload as it will be initialized implicitly by the + * generator. Mind splitting this in two? Fixing the comment above is really unrelated to passing BAD_FILE below. It seemed like the sentence I changed was trying to explain what the line of code I changed was doing. It may sound unrelated because the comment was completely misleading (or just outdated), but it can only refer to the three lines of code below... --Jason * * * ir4_tg4 needs to place its channel select in the header, * for interaction with ARB_texture_swizzle @@ -483,7 +484,7 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, fs_reg dst, * need to offset the Sampler State Pointer in the header. */ header_size = 1; - sources[0] = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD); + sources[0] = fs_reg(); length++; } -- 2.4.3 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/12] i965/fs: Reimplement emit_single_fb_write() in terms of logical framebuffer writes.
Jason Ekstrand ja...@jlekstrand.net writes: On Thu, Jul 16, 2015 at 8:41 AM, Francisco Jerez curroje...@riseup.net wrote: The only non-trivial thing it still has to do is figure out where to take the src/dst depth values from and predicate the instruction if discard is in use. The manual SIMD unrolling logic in the dual-source case goes away because this is now handled transparently by the SIMD lowering pass. --- src/mesa/drivers/dri/i965/brw_fs.h | 5 +- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 203 +++ 2 files changed, 20 insertions(+), 188 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 1ae79a9..64f89d4 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -290,13 +290,10 @@ public: bool optimize_frontfacing_ternary(nir_alu_instr *instr, const fs_reg result); - void setup_color_payload(fs_reg *dst, fs_reg color, unsigned components, -unsigned exec_size, bool use_2nd_half); void emit_alpha_test(); fs_inst *emit_single_fb_write(const brw::fs_builder bld, fs_reg color1, fs_reg color2, - fs_reg src0_alpha, unsigned components, - unsigned exec_size, bool use_2nd_half = false); + fs_reg src0_alpha, unsigned components); void emit_fb_writes(); void emit_urb_writes(); void emit_cs_terminate(); diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index ba4b177..bcfeaa0 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -1409,33 +1409,6 @@ fs_visitor::emit_interpolation_setup_gen6() } } -void -fs_visitor::setup_color_payload(fs_reg *dst, fs_reg color, unsigned components, -unsigned exec_size, bool use_2nd_half) -{ - brw_wm_prog_key *key = (brw_wm_prog_key*) this-key; - fs_inst *inst; - - if (key-clamp_fragment_color) { - fs_reg tmp = vgrf(glsl_type::vec4_type); - assert(color.type == BRW_REGISTER_TYPE_F); - for (unsigned i = 0; i components; i++) { - inst = bld.MOV(offset(tmp, bld, i), offset(color, bld, i)); - inst-saturate = true; - } - color = tmp; - } - - if (exec_size dispatch_width) { - unsigned half_idx = use_2nd_half ? 1 : 0; - for (unsigned i = 0; i components; i++) - dst[i] = half(offset(color, bld, i), half_idx); - } else { - for (unsigned i = 0; i components; i++) - dst[i] = offset(color, bld, i); - } -} - static enum brw_conditional_mod cond_for_alpha_func(GLenum func) { @@ -1493,146 +1466,34 @@ fs_visitor::emit_alpha_test() fs_inst * fs_visitor::emit_single_fb_write(const fs_builder bld, fs_reg color0, fs_reg color1, - fs_reg src0_alpha, unsigned components, - unsigned exec_size, bool use_2nd_half) + fs_reg src0_alpha, unsigned components) { assert(stage == MESA_SHADER_FRAGMENT); brw_wm_prog_data *prog_data = (brw_wm_prog_data*) this-prog_data; - brw_wm_prog_key *key = (brw_wm_prog_key*) this-key; - const fs_builder ubld = bld.group(exec_size, use_2nd_half); - int header_size = 2, payload_header_size; - - /* We can potentially have a message length of up to 15, so we have to set -* base_mrf to either 0 or 1 in order to fit in m0..m15. -*/ - fs_reg *sources = ralloc_array(mem_ctx, fs_reg, 15); - int length = 0; - - /* From the Sandy Bridge PRM, volume 4, page 198: -* -* Dispatched Pixel Enables. One bit per pixel indicating -* which pixels were originally enabled when the thread was -* dispatched. This field is only required for the end-of- -* thread message and on all dual-source messages. -*/ - if (devinfo-gen = 6 - (devinfo-is_haswell || devinfo-gen = 8 || !prog_data-uses_kill) - color1.file == BAD_FILE - key-nr_color_regions == 1) { - header_size = 0; - } - - if (header_size != 0) { - assert(header_size == 2); - /* Allocate 2 registers for a header */ - length += 2; - } - - if (payload.aa_dest_stencil_reg) { - sources[length] = fs_reg(GRF, alloc.allocate(1)); - bld.group(8, 0).exec_all().annotate(FB write stencil/AA alpha) - .MOV(sources[length], - fs_reg(brw_vec8_grf(payload.aa_dest_stencil_reg, 0))); - length++; - } - - if (prog_data-uses_omask) { - sources[length] = fs_reg(GRF, alloc.allocate(1), - BRW_REGISTER_TYPE_UD); - - /* Hand
Re: [Mesa-dev] [PATCH 01/12] i965/fs: Define logical framebuffer write opcode.
Kenneth Graunke kenn...@whitecape.org writes: On Thursday, July 16, 2015 06:41:16 PM Francisco Jerez wrote: The logical variant is largely equivalent to the original opcode but instead of taking a single payload source it expects the arguments that make up the payload separately as individual sources, like: fb_write_logical null, color0, color1, src0_alpha, src_depth, dst_depth, sample_mask, num_components This patch defines the opcode and usual instruction boilerplate, including a placeholder lowering function provided mostly as self-documentation of its source registers. --- src/mesa/drivers/dri/i965/brw_defines.h | 1 + src/mesa/drivers/dri/i965/brw_fs.cpp | 34 src/mesa/drivers/dri/i965/brw_fs.h | 2 +- src/mesa/drivers/dri/i965/brw_shader.cpp | 2 ++ 4 files changed, 38 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index 5bf53e3..65685a9 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -875,6 +875,7 @@ enum opcode { * instructions. */ FS_OPCODE_FB_WRITE = 128, I'd like to start including block comments above our virtual opcodes describing their sources and semantics - it's a natural place to look for that information. For this, perhaps something like: /** * Framebuffer write. * * Source 0: [required] Color 0 * Source 1: [optional] Color 1 (for dual source blend messages) * Source 2: [optional] Src0 Alpha * Source 3: [optional] Source Depth (passthrough from the thread payload) * Source 4: [optional] Destination Depth (gl_FragDepth) * Source 5: [optional] Sample Mask (gl_SampleMask) * Source 6: [required] Number of color components (as a UD immediate). */ Sounds good to me. I'll steal your comment for this and write similar ones for the other logical-send virtual opcodes I've introduced lately. Patches 1-10 are: Reviewed-by: Kenneth Graunke kenn...@whitecape.org Patches 11-12 are: Acked-by: Kenneth Graunke kenn...@whitecape.org Thanks. + FS_OPCODE_FB_WRITE_LOGICAL, FS_OPCODE_BLORP_FB_WRITE, FS_OPCODE_REP_FB_WRITE, SHADER_OPCODE_RCP, signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91425] [regression, bisected] Piglit spec/ext_packed_float/ getteximage-invalid-format-for-packed-type fails
https://bugs.freedesktop.org/show_bug.cgi?id=91425 Bug ID: 91425 Summary: [regression, bisected] Piglit spec/ext_packed_float/ getteximage-invalid-format-for-packed-type fails Product: Mesa Version: git Hardware: Other OS: All Status: NEW Severity: normal Priority: medium Component: Mesa core Assignee: mesa-dev@lists.freedesktop.org Reporter: sigles...@igalia.com QA Contact: mesa-dev@lists.freedesktop.org CC: bri...@vmware.com, imir...@alum.mit.edu I found a piglit regression when testing today's master: $ bin/getteximage-invalid-format-for-packed-type -auto -fbo Mesa: User error: GL_INVALID_VALUE in glGetTextImage(depth = 0) Unexpected GL error: GL_INVALID_VALUE 0x501 (Error at /home/siglesias/devel/piglit/tests/spec/ext_packed_float/getteximage-invalid-format-for-packed-type.c:124) Expected GL error: GL_INVALID_OPERATION 0x502 PIGLIT: {subtest: {GL_UNSIGNED_BYTE_3_3_2, GL_RGBA : fail}} Mesa: User error: GL_INVALID_VALUE in glGetTextImage(depth = 0) Unexpected GL error: GL_INVALID_VALUE 0x501 (Error at /home/siglesias/devel/piglit/tests/spec/ext_packed_float/getteximage-invalid-format-for-packed-type.c:118) PIGLIT: {subtest: {GL_UNSIGNED_BYTE_3_3_2, GL_RGB : fail}} Mesa: User error: GL_INVALID_VALUE in glGetTextImage(depth = 0) Unexpected GL error: GL_INVALID_VALUE 0x501 (Error at /home/siglesias/devel/piglit/tests/spec/ext_packed_float/getteximage-invalid-format-for-packed-type.c:124) Expected GL error: GL_INVALID_OPERATION 0x502 PIGLIT: {subtest: {GL_UNSIGNED_BYTE_3_3_2, GL_RED : fail}} Mesa: User error: GL_INVALID_VALUE in glGetTextImage(depth = 0) Unexpected GL error: GL_INVALID_VALUE 0x501 (Error at /home/siglesias/devel/piglit/tests/spec/ext_packed_float/getteximage-invalid-format-for-packed-type.c:124) [...] I bisected it and found the first bad commit: --- f20cfc5a409b69d5ae3d10a870d90e0b4e493ddf is the first bad commit commit f20cfc5a409b69d5ae3d10a870d90e0b4e493ddf Author: Brian Paul bri...@vmware.com Date: Tue Jul 21 18:35:38 2015 -0600 mesa: overhaul the glGetTexImage code 1. Reorganize the error checking code. 2. Lay groundwork for getting sub images by passing image offset and dimensions to the error checking code. 3. Implement _mesa_GetnTexImageARB(), _mesa_GetTexImage() and _mesa_GetTextureImage() all in terms of get_texture_image(). v2: pass offset/width/height/depth arguments to the error checking function, avoid using magic width/height/depth values. v3: remove unused bufSize param to get_texture_image() Reviewed-by: Ilia Mirkin imir...@alum.mit.edu --- Mesa master HEAD: 800efb0690e962750b9a072bcbab279fdaae24a1 $ glxinfo | grep git OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.7.0-devel (git-800efb0) OpenGL version string: 3.0 Mesa 10.7.0-devel (git-800efb0) OpenGL ES profile version string: OpenGL ES 3.0 Mesa 10.7.0-devel (git-800efb0) -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 4/7] nv50: configure the ring buffer for reading back PM counters
On 01/07/15 01:01, Samuel Pitoiset wrote: To write data at the right offset, the kernel has to know some parameters of this ring buffer, like the number of domains and the maximum number of queries. Changes since v2: - only configure the ring buffer if the notifier BO is allocated - only use one BEGIN_NV04() Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/drivers/nouveau/nv50/nv50_screen.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c index ab95d65..335bff1 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c @@ -442,6 +442,16 @@ nv50_screen_init_hwctx(struct nv50_screen *screen) BEGIN_NV04(push, SUBC_SW(NV01_SUBCHAN_OBJECT), 1); PUSH_DATA (push, screen-sw-handle); + if (screen-query) { + /* Do not need to configure the ring buffer used to read back + * global performance counters when it is not allocated. */ + BEGIN_NV04(push, SUBC_SW(0x0190), 1); + PUSH_DATA (push, screen-query-handle); + BEGIN_NV04(push, SUBC_SW(0x0600), 2); Shouldn't we have the sw method's handle defined in libdrm? Anyway, patches 3 and 4 are Reviewed-by: Martin Peres martin.pe...@free.fr + PUSH_DATA (push, NV50_HW_PM_RING_BUFFER_MAX_QUERIES); + PUSH_DATA (push, NV50_HW_PM_RING_BUFFER_NUM_DOMAINS); + } + BEGIN_NV04(push, NV50_3D(COND_MODE), 1); PUSH_DATA (push, NV50_3D_COND_MODE_ALWAYS); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 17/20] i965/fs: Handle image uniforms in NIR programs.
On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote: v2: Move the image_params array back to brw_stage_prog_data. --- src/mesa/drivers/dri/i965/brw_fs.h | 1 + src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 50 +++- 2 files changed, 43 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 4749c47..97df784 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -264,6 +264,7 @@ public: nir_jump_instr *instr); fs_reg get_nir_src(nir_src src); fs_reg get_nir_dest(nir_dest dest); + fs_reg get_nir_image_deref(const nir_deref_var *deref); void emit_percomp(const brw::fs_builder bld, const fs_inst inst, unsigned wr_mask); diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 6ec96c4..31024b7 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -234,17 +234,26 @@ fs_visitor::nir_setup_uniform(nir_variable *var) continue; } - unsigned slots = storage-type-component_slots(); - if (storage-array_elements) - slots *= storage-array_elements; + if (storage-type-is_image()) { + /* Images don't get a valid location assigned by nir_lower_io() + * because their size is driver-specific, so we need to allocate + * space for them here at the end of the parameter array. + */ + var-data.driver_location = uniforms; + param_size[uniforms] = +BRW_IMAGE_PARAM_SIZE * MAX2(storage-array_elements, 1); + + setup_image_uniform_values(storage); + } else { + unsigned slots = storage-type-component_slots(); + if (storage-array_elements) +slots *= storage-array_elements; - for (unsigned i = 0; i slots; i++) { - stage_prog_data-param[index++] = storage-storage[i]; + for (unsigned i = 0; i slots; i++) { +stage_prog_data-param[index++] = storage-storage[i]; + } } } - - /* Make sure we actually initialized the right amount of stuff here. */ - assert(var-data.driver_location + var-type-component_slots() == index); } void @@ -1193,6 +1202,31 @@ fs_visitor::get_nir_dest(nir_dest dest) dest.reg.indirect); } +fs_reg +fs_visitor::get_nir_image_deref(const nir_deref_var *deref) +{ + fs_reg image(UNIFORM, deref-var-data.driver_location, +BRW_REGISTER_TYPE_UD); + + if (deref-deref.child) { + const nir_deref_array *deref_array = + nir_deref_as_array(deref-deref.child); + assert(deref_array-deref.child == NULL); You should probably assert that the deref type of the child is an array before casting. + + image = offset(image, bld, + deref_array-base_offset * BRW_IMAGE_PARAM_SIZE); + + if (deref_array-deref_array_type == nir_deref_array_type_indirect) { + fs_reg *tmp = new(mem_ctx) fs_reg(vgrf(glsl_type::int_type)); + bld.MUL(*tmp, get_nir_src(deref_array-indirect), + fs_reg(BRW_IMAGE_PARAM_SIZE)); + image.reladdr = tmp; + } + } + + return image; +} + void fs_visitor::emit_percomp(const fs_builder bld, const fs_inst inst, unsigned wr_mask) -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 18/20] i965/fs: Translate image load, store and atomic NIR intrinsics.
On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote: --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 149 +++ 1 file changed, 149 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 31024b7..76297b7 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -24,6 +24,7 @@ #include glsl/ir.h #include glsl/ir_optimization.h #include glsl/nir/glsl_to_nir.h +#include main/shaderimage.h #include program/prog_to_nir.h #include brw_fs.h #include brw_fs_surface_builder.h @@ -1245,6 +1246,97 @@ fs_visitor::emit_percomp(const fs_builder bld, const fs_inst inst, } } +/** + * Get the matching channel register datatype for an image intrinsic of the + * specified GLSL image type. + */ +static brw_reg_type +get_image_base_type(const glsl_type *type) +{ + switch ((glsl_base_type)type-sampler_type) { + case GLSL_TYPE_UINT: + return BRW_REGISTER_TYPE_UD; + case GLSL_TYPE_INT: + return BRW_REGISTER_TYPE_D; + case GLSL_TYPE_FLOAT: + return BRW_REGISTER_TYPE_F; + default: + unreachable(Not reached.); + } +} + +/** + * Get the appropriate atomic op for an image atomic intrinsic. + */ +static unsigned +get_image_atomic_op(nir_intrinsic_op op, const glsl_type *type) +{ + switch (op) { + case nir_intrinsic_image_atomic_add: + return BRW_AOP_ADD; + case nir_intrinsic_image_atomic_min: + return (get_image_base_type(type) == BRW_REGISTER_TYPE_D ? + BRW_AOP_IMIN : BRW_AOP_UMIN); + case nir_intrinsic_image_atomic_max: + return (get_image_base_type(type) == BRW_REGISTER_TYPE_D ? + BRW_AOP_IMAX : BRW_AOP_UMAX); + case nir_intrinsic_image_atomic_and: + return BRW_AOP_AND; + case nir_intrinsic_image_atomic_or: + return BRW_AOP_OR; + case nir_intrinsic_image_atomic_xor: + return BRW_AOP_XOR; + case nir_intrinsic_image_atomic_exchange: + return BRW_AOP_MOV; + case nir_intrinsic_image_atomic_comp_swap: + return BRW_AOP_CMPWR; + default: + unreachable(Not reachable.); + } +} + +/** + * Return true if the image is a 1D array and the implementation + * requires the array index to be passed in as the Z component of the + * coordinate vector. + */ +static bool +needs_zero_y_image_coordinate(const fs_builder bld, + const nir_variable *image) +{ + const glsl_type *type = image-type-without_array(); + const mesa_format format = + _mesa_get_shader_image_format(image-data.image.format); + /* HSW in vec4 mode and our software coordinate handling for untyped +* reads want the array index to be at the Z component. +*/ + const bool array_index_at_z = (!image-data.image.write_only + !image_format_info::has_matching_typed_format( + bld.shader-devinfo, format)); + + return (type-sampler_dimensionality == GLSL_SAMPLER_DIM_1D + type-sampler_array array_index_at_z); +} + +/** + * Transform image coordinates into the form expected by the + * implementation. + */ +static fs_reg +fix_image_address(const fs_builder bld, + const nir_variable *image, const fs_reg addr) Is there a good reason why this is happening here and not inside of emit_image_load/store? It seems to break encaptulation substantially if the code that calls emit_image_load/store has to think about this stuff and adjust the address type based on whether a typed or untyped message will eventually be used. +{ + if (needs_zero_y_image_coordinate(bld, image)) { + assert(image-type-without_array()-coordinate_components() == 2); + const fs_reg srcs[] = { addr, fs_reg(0), offset(addr, bld, 1) }; + const fs_reg tmp = bld.vgrf(addr.type, ARRAY_SIZE(srcs)); + bld.LOAD_PAYLOAD(tmp, srcs, ARRAY_SIZE(srcs), 0); + return tmp; + } else { + return addr; + } +} + void fs_visitor::nir_emit_intrinsic(const fs_builder bld, nir_intrinsic_instr *instr) { @@ -1319,6 +1411,63 @@ fs_visitor::nir_emit_intrinsic(const fs_builder bld, nir_intrinsic_instr *instr break; } + case nir_intrinsic_image_load: + case nir_intrinsic_image_store: + case nir_intrinsic_image_atomic_add: + case nir_intrinsic_image_atomic_min: + case nir_intrinsic_image_atomic_max: + case nir_intrinsic_image_atomic_and: + case nir_intrinsic_image_atomic_or: + case nir_intrinsic_image_atomic_xor: + case nir_intrinsic_image_atomic_exchange: + case nir_intrinsic_image_atomic_comp_swap: { + using namespace image_access; + + /* Get the referenced image variable and type. */ + const nir_variable *var = instr-variables[0]-var; + const glsl_type *type = var-type-without_array(); +
[Mesa-dev] [Bug 90264] [Regression, bisected] Tooltip corruption in Chrome
https://bugs.freedesktop.org/show_bug.cgi?id=90264 Antoine Labour pi...@chromium.org changed: What|Removed |Added CC||pi...@chromium.org --- Comment #39 from Antoine Labour pi...@chromium.org --- (In reply to Heiko from comment #38) (In reply to Antoine Labour from comment #37) Can you make glXWaitX re-evaluate window sizes? That's the reason we call it - we just resized the window, we want to make sure we're drawing to the right back buffer. This is a UI workload, we may draw only once and we don't want flashing (what a glXSwapBuffers would do), so we want to make sure we're drawing to a buffer of the right size - glXWaitX is meant to synchronize GL with changes in X. In this trace, you're only seeing parts of the calls - in particular there should be a glViewport and draws and a glXSwapBuffers after the resize. I don't think there's confusion from the app side. Parts of what you're seeing is chromium restoring state between its virtual contexts (we have different subsystems that all need to access GL, and are multiplexed, possibly on a single GL context, depending on the version), so it's not necessarily surprising to see scissor/viewport that doesn't match the window size - until we get to actually draw the picture on screen. In particular, the texture is correctly loaded with 370x72, so no confusion there. Well, most of my findings were pretty vague, because I'm neither a GL expert nor a mesa expert. With your hint, I just noticed that dri2_wait_x()/dri2_wait_gl() don't do anything, because of priv-have_fake_front being zero all the time. It would be set to 1 for buffer attachments of __DRI_BUFFER_FAKE_FRONT_LEFT, but that's nowhere set in mesa. Thus gcc just optimized it away completely along with dri2_copy_drawable(). Seems to be related to [1]? [1] http://lists.freedesktop.org/archives/mesa-dev/2011-June/008241.html Maybe, I'm not a DRI2 (or 3 or n) expert. There is fundamentally a race between X and GL during resize, because the back buffer is supposed to be the same size as the window, but the window gets resized on the X server, whereas the back buffer is a property of the client and/or the GL stream. So, at some point, the client needs to take a snapshot of the window dimensions to operate, and problems come up if the window gets resized between the time the snapshot is taken and the time glXSwapBuffers occur. I think that's what's happening in this bug. It sounds like the referenced commit changes when the snapshot is taken. From my quick reading of the radeon driver (sorry, I'm absolutely not familiar with the codebase), if I understand correctly, it looks like that snapshot is taken with the DRI2GetBuffer{,WithFormat}, via radeon_update_renderbuffers, which is called in a few places, notably in radeon_viewport, which was always called on every glViewport before http://cgit.freedesktop.org/mesa/mesa/commit/?id=95073a2dca03a48f4c77bc846a4a6d1f0eb81ae6 , but after which is only called on non-redundant glViewport. radeon_update_renderbuffers is also called on radeonMakeCurrent, which is called on non-redundant glXMakeCurrent, and on radeon_prepare_render when dri2 stamps don't match which I think is meant to capture querying new buffers after glXSwapBuffers. There's a couple of other places too, but they may not matter for this. So, I guess there could be a couple of workarounds in chrome to force radeon_update_renderbuffers (e.g. non-idempotent glViewport or glXMakeCurrent), but they're hacks. Or the driver could make sure it invalidates whatever is needed in glXWaitX, because that's what the user uses to synchronize between X and GL (i.e. you can assume that something in X has changed). -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl/tests: fix varying_test since tess changes.
From: Dave Airlie airl...@redhat.com This fixes make check since the tess changes. Signed-off-by: Dave Airlie airl...@redhat.com --- src/glsl/tests/varyings_test.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/glsl/tests/varyings_test.cpp b/src/glsl/tests/varyings_test.cpp index 4573529..62f8c6b 100644 --- a/src/glsl/tests/varyings_test.cpp +++ b/src/glsl/tests/varyings_test.cpp @@ -70,7 +70,7 @@ public: hash_table *consumer_interface_inputs; const glsl_type *simple_interface; - ir_variable *junk[VARYING_SLOT_MAX]; + ir_variable *junk[VARYING_SLOT_TESS_MAX]; }; link_varyings::link_varyings() -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: fix warning since tess merge.
From: Dave Airlie airl...@redhat.com Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 6c97933..33469d4 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -102,6 +102,8 @@ fs_visitor::nir_setup_inputs(nir_shader *shader) } case MESA_SHADER_GEOMETRY: case MESA_SHADER_COMPUTE: + case MESA_SHADER_TESS_CTRL: + case MESA_SHADER_TESS_EVAL: unreachable(fs_visitor not used for these stages yet.); break; case MESA_SHADER_FRAGMENT: -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl/tests: fix varying_test since tess changes.
On 23.07.2015 10:14, Dave Airlie wrote: From: Dave Airlie airl...@redhat.com This fixes make check since the tess changes. Signed-off-by: Dave Airlie airl...@redhat.com Tested-by: Michel Dänzer michel.daen...@amd.com Please push this to fix make check. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] i965: fix warning since tess merge.
From: Dave Airlie airl...@redhat.com Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 6c97933..33469d4 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -102,6 +102,8 @@ fs_visitor::nir_setup_inputs(nir_shader *shader) } case MESA_SHADER_GEOMETRY: case MESA_SHADER_COMPUTE: + case MESA_SHADER_TESS_CTRL: + case MESA_SHADER_TESS_EVAL: unreachable(fs_visitor not used for these stages yet.); break; case MESA_SHADER_FRAGMENT: -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] docs/GL3.txt: ARB_shader_precision
From: Dave Airlie airl...@redhat.com This extension is about setting expectation on GL4.1 implementations rather than actually enforcing things. So once you support GLSL 410 then you support this in theory. Signed-off-by: Dave Airlie airl...@redhat.com --- docs/GL3.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index 197b676..e299cb9 100644 --- a/docs/GL3.txt +++ b/docs/GL3.txt @@ -126,7 +126,7 @@ GL 4.1, GLSL 4.10: GL_ARB_ES2_compatibility DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe) GL_ARB_get_program_binaryDONE (0 binary formats) GL_ARB_separate_shader_objects DONE (all drivers) - GL_ARB_shader_precision started (Micah) + GL_ARB_shader_precision DONE (all drivers) GL_ARB_vertex_attrib_64bit DONE (nvc0, radeonsi, llvmpipe, softpipe) GL_ARB_viewport_arrayDONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe) -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] docs/GL3.txt: ARB_shader_precision
In actuality it'll only be exposed on nvc0 atm... On Wed, Jul 22, 2015 at 9:20 PM, Dave Airlie airl...@gmail.com wrote: From: Dave Airlie airl...@redhat.com This extension is about setting expectation on GL4.1 implementations rather than actually enforcing things. So once you support GLSL 410 then you support this in theory. Signed-off-by: Dave Airlie airl...@redhat.com --- docs/GL3.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index 197b676..e299cb9 100644 --- a/docs/GL3.txt +++ b/docs/GL3.txt @@ -126,7 +126,7 @@ GL 4.1, GLSL 4.10: GL_ARB_ES2_compatibility DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe) GL_ARB_get_program_binaryDONE (0 binary formats) GL_ARB_separate_shader_objects DONE (all drivers) - GL_ARB_shader_precision started (Micah) + GL_ARB_shader_precision DONE (all drivers) GL_ARB_vertex_attrib_64bit DONE (nvc0, radeonsi, llvmpipe, softpipe) GL_ARB_viewport_arrayDONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe) -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: Allow token pasting in ESSL
All vendors I've tested allow token pasting in their ESSL shaders. At least one application expects this to be supported even though it is against spec. This would be the Dolphin GameCube/Wii emulator that requires this. The vendors I've tested that is known to support this is: ARM Mali Qualcomm Adreno IMGTec PowerVR Nvidia Tegra K1/X1 The vendors that I don't have access to testing: Intel Android Vivante --- src/glsl/glcpp/glcpp-lex.l | 2 -- 1 file changed, 2 deletions(-) diff --git a/src/glsl/glcpp/glcpp-lex.l b/src/glsl/glcpp/glcpp-lex.l index fa9aa50..3e2ad23 100644 --- a/src/glsl/glcpp/glcpp-lex.l +++ b/src/glsl/glcpp/glcpp-lex.l @@ -494,8 +494,6 @@ HEXADECIMAL_INTEGER 0[xX][0-9a-fA-F]+[uU]? ## { if (! parser-skipping) { - if (parser-is_gles) - glcpp_error(yylloc, yyextra, Token pasting (##) is illegal in GLES); RETURN_TOKEN (PASTE); } } -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 06/14] meta: Don't do fragment color clamping in _mesa_meta_pbo_GetTexSubImage
Reviewed-by: Iago Toral Quiroga ito...@igalia.com On Wed, 2015-07-22 at 11:54 -0700, Anuj Phogat wrote: _mesa_meta_pbo_GetTexSubImage() uses _mesa_meta_BlitFrameBuffer(), which will do fragment clamping if enabled. But fragment clamping doesn't affect ReadPixels and GetTexImage. Without this patch, piglit test arb_color_buffer_float-clear fails, when forced to use the meta pbo path. v2: Apply this fix to both glReadPixels and glGetTexImage. Cc: mesa-sta...@lists.freedesktop.org Signed-off-by: Anuj Phogat anuj.pho...@gmail.com --- src/mesa/drivers/common/meta_tex_subimage.c | 5 + 1 file changed, 5 insertions(+) diff --git a/src/mesa/drivers/common/meta_tex_subimage.c b/src/mesa/drivers/common/meta_tex_subimage.c index 6d7c1eb..038d525 100644 --- a/src/mesa/drivers/common/meta_tex_subimage.c +++ b/src/mesa/drivers/common/meta_tex_subimage.c @@ -25,6 +25,7 @@ *Jason Ekstrand jason.ekstr...@intel.com */ +#include blend.h #include bufferobj.h #include buffers.h #include fbobject.h @@ -331,6 +332,10 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, GLuint dims, _mesa_meta_begin(ctx, ~(MESA_META_PIXEL_TRANSFER | MESA_META_PIXEL_STORE)); + /* GL_CLAMP_FRAGMENT_COLOR doesn't affect ReadPixels and GettexImage */ + if (ctx-Extensions.ARB_color_buffer_float) + _mesa_ClampColor(GL_CLAMP_FRAGMENT_COLOR, GL_FALSE); + _mesa_GenFramebuffers(2, fbos); if (tex_image tex_image-TexObject-Target == GL_TEXTURE_1D_ARRAY) { ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 18/20] mesa: fill out the ARB_shader_subroutine APIs
On Tuesday, July 21, 2015 03:19:25 PM Dave Airlie wrote: From: Dave Airlie airl...@redhat.com This fleshes out the APIs, using the program resource APIs where they should match. It also sets the default values to valid subroutines. Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/main/shaderapi.c | 450 +- src/mesa/main/shaderapi.h | 3 + 2 files changed, 450 insertions(+), 3 deletions(-) So, one question I have...according to the spec... Subroutine uniform variables are similar to uniform variables, except they are context state rather than program state. Having subroutine uniforms be context state allows them to have different values if the program is used in multiple contexts simultaneously. There is a set of subroutine uniforms for each shader stage. but it looks like we're storing them in the shader, rather than per-context. Bug? Or am I missing something? At this point I'm just inclined to give everything I haven't reviewed an Acked-by: Kenneth Graunke kenn...@whitecape.org I don't think anybody wants to spend much more time on this, and it'd be nice to check off the box and move on. signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 16/78] i965/nir/vec4: Implement store_output intrinsic
On Wed, Jul 22, 2015 at 4:37 AM, Eduardo Lima Mitev el...@igalia.com wrote: On 07/13/2015 01:57 PM, Jason Ekstrand wrote: On Wed, Jul 8, 2015 at 11:54 PM, Eduardo Lima Mitev el...@igalia.com wrote: On 06/30/2015 06:51 PM, Jason Ekstrand wrote: On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev el...@igalia.com wrote: The index into the output_reg array where to store the destination register is fetched from the nir_outputs map built during nir_setup_outputs stage. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89580 --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 17 +++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 8a2d335..55d4490 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp @@ -520,10 +520,23 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr) } case nir_intrinsic_store_output_indirect: + has_indirect = true; /* fallthrough */ - case nir_intrinsic_store_output: - /* @TODO: Not yet implemented */ + case nir_intrinsic_store_output: { + int offset = instr-const_index[0]; + int output = nir_outputs[offset]; + + src = get_nir_src(instr-src[0], nir_output_types[offset]); + dest = dst_reg(src); + + dest.writemask = brw_writemask_for_size(instr-num_components); + + if (has_indirect) + dest.reladdr = new(mem_ctx) src_reg(get_nir_src(instr-src[1])); + + output_reg[output] = dest; I'm very confused about the amount of indirection going on here. It seems to me that we should be setting these outputs up in setup_outputs() rather than storring off a map from ints to other ints and setting it up here. I didn't make this comment on the patch for setup_outputs() because I wanted to wait to see it used before I commented on it. I'm guessing you did it this way because the nir_assign_var_locations is giving you bogus values. If so, then it might be better to just assign variable locations in setup_outputs() rather than having a remap table. The whole point of nir_lower_io is to make IO easy for the back-end. If you need a re-map table, then it's no longer making it easy and we need to think more about what's going on. --Jason That double indirection felt bad since the beginning, but it was needed to store the original variable's location (var-data.location). Let me explain: We are (re)using the plumbering in vec4_visitor to setup URB, so the only thing we need to do is to store the out register in output_reg map at the correct location. And that location is given by the original location in the shader (var-data.location). So, in this case, nir_assign_var_locations pass, which constructs var-data.driver_location, is not useful to us, except to give us consecutive indexes to construct the other map we have, the type map, which is needed to carry the correct type from the original variable to the output register. If nir_assign_var_locations isn't doing anything for you, don't call it. You'll need to do something with var-data.driver_location. If what you really want is var-data.location, then just copy that to var-data.driver_location when you do nir_setup_outputs. Or (depending on how the URB setup works, I don't actually know), put the actual URB location in var-data.driver_location when you walk the outputs. From there, you have two options. One would be to setup output_reg at the same time with the correct types right away and emit a MOV when you get a store_output. (Copy propagation should clean up the MOV.) For what it's worth, I don't think the type matters; a URB write just writes data to something so as long as you don't have a type mismatch in a MOV, the hardware won't care. The other option, would be to directly emit the URB write in store_output. At the moment, it may be better to take the first option since that better matches what the FS does right now. But both should work fine. Thanks for these hints, they were very useful. I rewrote the implementation of store_output intrinsic to avoid the setup phase completely. The type, as you suggested, was not important as long as they match while MOVing the contents of output_reg. To guarantee that, I had to patch the emit_urb_slot() to guarantee the types always match. This code is shared with vec4_visitor, so it makes sense to move the safeguards there instead of having both backends provide the correct register type in output_reg entries. For reference, this is the patch that implements it: https://github.com/Igalia/mesa/commit/8c703937f285c0b3a1e7bf6681c7ed7fe09815aa Seems reasonable. I also put var-data.location in const_index[1] of the intrinsic op, and disabled nir_assign_var_locations() for output variables, since I don't need var-data.driver_location. I could
[Mesa-dev] [Bug 91425] [regression, bisected] Piglit spec/ext_packed_float/ getteximage-invalid-format-for-packed-type fails
https://bugs.freedesktop.org/show_bug.cgi?id=91425 Samuel Iglesias sigles...@igalia.com changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #5 from Samuel Iglesias sigles...@igalia.com --- (In reply to Brian Paul from comment #4) OK, the problem is my local copy of getteximage-invalid-format-for-packed-type.c has the patch ext_packed_float: fix getteximage-invalid-format-for-packed-type test which I posted for review (and was R-b'd by Jose) but not yet pushed to master. I believe the piglit test was defective as-is, per my check-in comment: The GL spec doesn't explicitly say that glGetTexImage should generate GL_INVALID_OPERATION when attempting to retrieve a non-existant texture image, but that's what NVIDIA's driver does. The purpose of this test is to check the format/type parameters, so let's define a packed float texture to avoid the undefined texture situation. Test now passes with NVIDIA. I'll push that patch. It passes now. Thanks Brian! -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] doxygen: Link GLvector4f struct members properly, avoiding invalid XML/HTML warning
From: Rhys Kidd rhysk...@gmail.com Signed-off-by: Rhys Kidd rhysk...@gmail.com --- src/mesa/math/m_vector.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/math/m_vector.h b/src/mesa/math/m_vector.h index 3b7f583..5bd76b8 100644 --- a/src/mesa/math/m_vector.h +++ b/src/mesa/math/m_vector.h @@ -61,7 +61,7 @@ */ typedef struct { GLfloat (*data)[4]; /** may be malloc'd or point to client data */ - GLfloat *start; /** points somewhere inside of data */ + GLfloat *start; /** points somewhere inside of GLvector4f::data */ GLuint count; /** size of the vector (in elements) */ GLuint stride; /** stride from one element to the next (in bytes) */ GLuint size;/** 2-4 for vertices and 1-4 for texcoords */ -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] doxygen: Correct grammatical typo in math/m_vector.h
From: Rhys Kidd rhysk...@gmail.com Signed-off-by: Rhys Kidd rhysk...@gmail.com --- src/mesa/math/m_vector.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/math/m_vector.h b/src/mesa/math/m_vector.h index 8551ee7..3b7f583 100644 --- a/src/mesa/math/m_vector.h +++ b/src/mesa/math/m_vector.h @@ -51,7 +51,7 @@ /** * Wrap all the information about vectors up in a struct. Has - * additional fields compared to the other vectors to help us track of + * additional fields compared to the other vectors to help us track * different vertex sizes, and whether we need to clean columns out * because they contain non-(0,0,0,1) values. * -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: Fix a bug where LHS swizzles of swizzles were too small.
A simple shader such as vec4 color; color.xy.x = 1.0; would cause ir_assignment::set_lhs() to generate bogus IR: (swiz xy (swiz x (constant float (1.0 We were setting the number of components of each new RHS swizzle based on the highest channel used in the LHS swizzle. So, .xy.y would generate (swiz xy (swiz xx ...)), while .xy.x would break. Our existing Piglit test happened to use .xzy.z, which worked, since 'z' is the third component, resulting in an xxx swizzle. This patch sets the number of swizzle components based on the size of the LHS swizzle's inner value, so we always have the correct number at each step. Fixes new Piglit tests glsl-vs-swizzle-swizzle-lhs-[23]. Fixes ir_validate assertions in in Metro 2033 Redux. Cc: i...@freedesktop.org Cc: mesa-sta...@lists.freedesktop.org Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/glsl/ir.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp index 8fb7ca4..7350dfa 100644 --- a/src/glsl/ir.cpp +++ b/src/glsl/ir.cpp @@ -95,6 +95,8 @@ ir_assignment::set_lhs(ir_rvalue *lhs) write_mask |= (((this-write_mask i) 1) c); update_rhs_swizzle(rhs_swiz, i, c); + assert(rhs_swiz.num_components = swiz-val-type-vector_elements); + rhs_swiz.num_components = swiz-val-type-vector_elements; } this-write_mask = write_mask; -- 2.4.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 18/20] mesa: fill out the ARB_shader_subroutine APIs
On 23 July 2015 at 15:01, Kenneth Graunke kenn...@whitecape.org wrote: On Tuesday, July 21, 2015 03:19:25 PM Dave Airlie wrote: From: Dave Airlie airl...@redhat.com This fleshes out the APIs, using the program resource APIs where they should match. It also sets the default values to valid subroutines. Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/main/shaderapi.c | 450 +- src/mesa/main/shaderapi.h | 3 + 2 files changed, 450 insertions(+), 3 deletions(-) So, one question I have...according to the spec... Subroutine uniform variables are similar to uniform variables, except they are context state rather than program state. Having subroutine uniforms be context state allows them to have different values if the program is used in multiple contexts simultaneously. There is a set of subroutine uniforms for each shader stage. but it looks like we're storing them in the shader, rather than per-context. Bug? Or am I missing something? No I think it's a bug in that I may have convinced myself the constant buffer we give to drivers is per context, when it really isn't. The question is whether this is some feature anyone is ever going to care about at this point, I'd still like to land this, and I try and go write a piglit test that isn't crap to test for that case. At this point I'm just inclined to give everything I haven't reviewed an Acked-by: Kenneth Graunke kenn...@whitecape.org I don't think anybody wants to spend much more time on this, and it'd be nice to check off the box and move on. Yup me to :-0 Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 5/7] nv50: add support for compute/graphics global performance counters
On 01/07/15 01:01, Samuel Pitoiset wrote: This commit adds support for both compute and graphics global performance counters which have been reverse engineered with CUPTI (Linux) and PerfKit (Windows). Currently, only one query type can be monitored at the same time because the Gallium's HUD doesn't fit pretty well. This will be improved later. Changes since v2: - replace \% by percentage - remove one extra call to PUSH_SPACE - use nouveau_fence instead of my hand-made fence mechanism Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/drivers/nouveau/nv50/nv50_query.c | 1066 +++- src/gallium/drivers/nouveau/nv50/nv50_screen.h | 35 + 2 files changed, 1096 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c b/src/gallium/drivers/nouveau/nv50/nv50_query.c index 81f7474..7fb6f3a 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_query.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c @@ -27,6 +27,8 @@ #include nv50/nv50_context.h #include nv_object.xml.h +#include nouveau_perfmon.h + #define NV50_QUERY_STATE_READY 0 #define NV50_QUERY_STATE_ACTIVE 1 #define NV50_QUERY_STATE_ENDED 2 @@ -51,10 +53,25 @@ struct nv50_query { boolean is64bit; struct nouveau_mm_allocation *mm; struct nouveau_fence *fence; + struct nouveau_object *perfdom; }; #define NV50_QUERY_ALLOC_SPACE 256 +#ifdef DEBUG No need to guard the definition of this function. The compiler will get rid of it if it has no users. +static void nv50_hw_pm_dump_perfdom(struct nvif_perfdom_v0 *args); +#endif + +static boolean +nv50_hw_pm_query_create(struct nv50_context *, struct nv50_query *); +static void +nv50_hw_pm_query_destroy(struct nv50_context *, struct nv50_query *); +static boolean +nv50_hw_pm_query_begin(struct nv50_context *, struct nv50_query *); +static void nv50_hw_pm_query_end(struct nv50_context *, struct nv50_query *); +static boolean nv50_hw_pm_query_result(struct nv50_context *, +struct nv50_query *, boolean, void *); + static INLINE struct nv50_query * nv50_query(struct pipe_query *pipe) { @@ -96,9 +113,15 @@ nv50_query_allocate(struct nv50_context *nv50, struct nv50_query *q, int size) static void nv50_query_destroy(struct pipe_context *pipe, struct pipe_query *pq) { - nv50_query_allocate(nv50_context(pipe), nv50_query(pq), 0); - nouveau_fence_ref(NULL, nv50_query(pq)-fence); - FREE(nv50_query(pq)); + struct nv50_context *nv50 = nv50_context(pipe); + struct nv50_query *q = nv50_query(pq); + + if ((q-type = NV50_HW_PM_QUERY(0) q-type = NV50_HW_PM_QUERY_LAST)) + nv50_hw_pm_query_destroy(nv50, q); + + nv50_query_allocate(nv50, q, 0); + nouveau_fence_ref(NULL, q-fence); + FREE(q); } static struct pipe_query * @@ -120,6 +143,12 @@ nv50_query_create(struct pipe_context *pipe, unsigned type, unsigned index) type == PIPE_QUERY_PRIMITIVES_EMITTED || type == PIPE_QUERY_SO_STATISTICS || type == PIPE_QUERY_PIPELINE_STATISTICS); + if (type = NV50_HW_PM_QUERY(0) q-type = NV50_HW_PM_QUERY_LAST) { + /* Hardware global performance counters are not 64 bits, but we also use + * a fence to make sure the query is ready. */ I do not understand the logic of this comment. + q-is64bit = TRUE; + } + q-type = type; if (q-type == PIPE_QUERY_OCCLUSION_COUNTER) { @@ -127,6 +156,11 @@ nv50_query_create(struct pipe_context *pipe, unsigned type, unsigned index) q-data -= 32 / sizeof(*q-data); /* we advance before query_begin ! */ } + if ((q-type = NV50_HW_PM_QUERY(0) q-type = NV50_HW_PM_QUERY_LAST)) { + if (!nv50_hw_pm_query_create(nv50, q)) + return NULL; + } + return (struct pipe_query *)q; } @@ -151,6 +185,7 @@ nv50_query_begin(struct pipe_context *pipe, struct pipe_query *pq) struct nv50_context *nv50 = nv50_context(pipe); struct nouveau_pushbuf *push = nv50-base.pushbuf; struct nv50_query *q = nv50_query(pq); + boolean ret = TRUE; /* For occlusion queries we have to change the storage, because a previous * query might set the initial render conition to FALSE even *after* we re- @@ -205,10 +240,13 @@ nv50_query_begin(struct pipe_context *pipe, struct pipe_query *pq) nv50_query_get(push, q, 0x10, 0x5002); break; default: + if ((q-type = NV50_HW_PM_QUERY(0) q-type = NV50_HW_PM_QUERY_LAST)) { + ret = nv50_hw_pm_query_begin(nv50, q); + } break; } q-state = NV50_QUERY_STATE_ACTIVE; - return true; + return ret; } static void @@ -265,7 +303,9 @@ nv50_query_end(struct pipe_context *pipe, struct pipe_query *pq) q-state = NV50_QUERY_STATE_READY; break; default: - assert(0); + if ((q-type = NV50_HW_PM_QUERY(0) q-type = NV50_HW_PM_QUERY_LAST)) { +
Re: [Mesa-dev] [PATCH 1/2] mesa: Detect and provide macros for function attributes pure and const.
On 22/07/15 21:01, Jose Fonseca wrote: On 22/07/15 17:13, Jose Fonseca wrote: On 21/07/15 15:57, Emil Velikov wrote: On 18 July 2015 at 08:13, Jose Fonseca jfons...@vmware.com wrote: On 18/07/15 01:38, Eric Anholt wrote: Emil Velikov emil.l.veli...@gmail.com writes: On 14/07/15 19:45, Eric Anholt wrote: These are really useful hints to the compiler in the absence of link-time optimization, and I'm going to use them in VC4. I've made the const attribute be ATTRIBUTE_CONST unlike other function attributes, because we have other things in the tree #defining CONST for their own unrelated purposes. Mindly related: how people feel about making these macros less screamy, by following the approach used in the kernel: PURE - __pure and so on ? I'd love it. Less screamy is fine, but beware prefixing double underscore: the C standard stipulates that its use is reserved for for C/C++ runtime. [1] I though about it before posting although I've seen others define those, even do so in their public headers. Now that I have some examples from my current /usr/include Searching for __pure dwarves/dutil.h:#define __pure __attribute__ ((pure)) Searching for __attribute_const__ sys/cdefs.h:# define __attribute_const__ __attribute__ ((__const__)) sys/cdefs.h:# define __attribute_const__ /* Ignore */ Searching for __printf Searching for __always_unused Searching for __noreturn Searching for __packed libvisual-0.4/libvisual/lv_defines.h:# define __packed __attribute__ ((packed)) libvisual-0.4/libvisual/lv_defines.h:# define __packed /* no packed */ bsd/sys/cdefs.h:# define __packed __attribute__((__packed__)) bsd/sys/cdefs.h:# define __packed Searching for __deprecated pciaccess.h:#define __deprecated __attribute__((deprecated)) pciaccess.h:#define __deprecated Searching for __weak Searching for __alias With a handful of other headers defining more double underscore prefixed macros. Look at stdlibc++ implementation: every internal variable has a double underscore prefix. Unless we're talking about STL/other template library we don't care what library foo uses in it's internal implementation do we ? After all these will be resolved at compile time. Maybe kernel gets away on GLIBC (and because it doesn't use C++), but there's no guarantee it will work on other C runtimes, and even if it does, it could start failing anytime. True, it's not the best of ideas. Just worth pointing out that the cat is already out, for other projects. There are already more than 12K #define __foo cases on my system. These defines are reserved for system headers, so it's natural to be lots of them in /usr/include. MacOSX also defines some of these on its sys/cdefs.h: http://www.opensource.apple.com/source/xnu/xnu-1456.1.26/bsd/sys/cdefs.h The question is: can we expect that most systems will define these __foo, or at least not use them for other purposes. I don't know the answer. At a glance MSVC doesn't seem to rely on them for anything. So it might work. I don't oppose if you want to give it a shot. Jose Ironically, Windows headers already define PURE (its used in COM interfaces). Just realized this now after this was merged. So we really need a different name... Question is which name: ATTRIBUTE_PURE, __pure, something else? As I said, I don't oppose __pure, but some logic will be necessary to avoid redefition when system headers (like sys/cdefs.h) already define it, which is not trivial. It will need something like #if HAVE_SYS_CDEFS_H #include sys/cdefs.h // for __pure #endif #ifndef __pure #endif Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 7/7] nv50: enable GL_AMD_performance_monitor
On 01/07/15 01:01, Samuel Pitoiset wrote: This exposes a group of global performance counters that enables GL_AMD_performance_monitor. All piglit tests are okay. Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/drivers/nouveau/nv50/nv50_query.c | 35 ++ src/gallium/drivers/nouveau/nv50/nv50_screen.c | 1 + src/gallium/drivers/nouveau/nv50/nv50_screen.h | 6 + 3 files changed, 42 insertions(+) diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c b/src/gallium/drivers/nouveau/nv50/nv50_query.c index 7dadb77..6d57305 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_query.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c @@ -1563,6 +1563,7 @@ nv50_screen_get_driver_query_info(struct pipe_screen *pscreen, info-name = cfg-event-name; info-query_type = NV50_HW_PM_QUERY(id); + info-group_id = NV50_HW_PM_QUERY_GROUP; info-max_value.u64 = (cfg-event-display == NV50_HW_PM_EVENT_DISPLAY_RATIO) ? 100 : 0; return 1; @@ -1573,6 +1574,40 @@ nv50_screen_get_driver_query_info(struct pipe_screen *pscreen, return 0; } +int +nv50_screen_get_driver_query_group_info(struct pipe_screen *pscreen, +unsigned id, +struct pipe_driver_query_group_info *info) +{ + struct nv50_screen *screen = nv50_screen(pscreen); + int count = 0; + + // TODO: Check DRM version when nvif will be merged in libdrm! + if (screen-base.perfmon) { + count++; /* NV50_HW_PM_QUERY_GROUP */ + } + + if (!info) + return count; + + if (id == NV50_HW_PM_QUERY_GROUP) { + if (screen-base.perfmon) { + info-name = Global performance counters; + info-type = PIPE_DRIVER_QUERY_GROUP_TYPE_GPU; + info-num_queries = NV50_HW_PM_QUERY_COUNT; + info-max_active_queries = 1; /* TODO: get rid of this limitation! */ + return 1; + } + } + + /* user asked for info about non-existing query group */ + info-name = this_is_not_the_query_group_you_are_looking_for; + info-max_active_queries = 0; + info-num_queries = 0; + info-type = 0; + return 0; +} + void nv50_init_query_functions(struct nv50_context *nv50) { diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c index ac1acd1..05f921d 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c @@ -749,6 +749,7 @@ nv50_screen_create(struct nouveau_device *dev) pscreen-get_shader_param = nv50_screen_get_shader_param; pscreen-get_paramf = nv50_screen_get_paramf; pscreen-get_driver_query_info = nv50_screen_get_driver_query_info; + pscreen-get_driver_query_group_info = nv50_screen_get_driver_query_group_info; nv50_screen_init_resource_functions(pscreen); diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.h b/src/gallium/drivers/nouveau/nv50/nv50_screen.h index 69127c0..807ae0e 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.h +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.h @@ -114,6 +114,9 @@ nv50_screen(struct pipe_screen *screen) return (struct nv50_screen *)screen; } +/* Hardware global performance counters groups. */ +#define NV50_HW_PM_QUERY_GROUP 0 + /* Hardware global performance counters. */ #define NV50_HW_PM_QUERY_COUNT 24 #define NV50_HW_PM_QUERY(i)(PIPE_QUERY_DRIVER_SPECIFIC + (i)) @@ -146,6 +149,9 @@ nv50_screen(struct pipe_screen *screen) int nv50_screen_get_driver_query_info(struct pipe_screen *, unsigned, struct pipe_driver_query_info *); +int nv50_screen_get_driver_query_group_info(struct pipe_screen *, unsigned, +struct pipe_driver_query_group_info *); + boolean nv50_blitter_create(struct nv50_screen *); void nv50_blitter_destroy(struct nv50_screen *); Everything looks good to me! Excellent work Samuel! Reviewed-by: Martin Peres martin.pe...@free.fr IIRC, the kernel patches were supposed to land in 4.2 as there was no pull request from Ben, it will likely end up in 4.3. As for the libdrm patches, did you review them? Looking forward to seeing this series merged! It would also be nice to start a discussion to rework the Gallium HUD to avoid the stupid problem that we have where we can only monitor one signal at a time! Congrats again Samuel! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 6/7] nv50: expose global performance counters to the HUD
On 01/07/15 01:01, Samuel Pitoiset wrote: Signed-off-by: Samuel Pitoiset samuel.pitoi...@gmail.com --- src/gallium/drivers/nouveau/nv50/nv50_query.c | 41 ++ src/gallium/drivers/nouveau/nv50/nv50_screen.c | 1 + src/gallium/drivers/nouveau/nv50/nv50_screen.h | 3 ++ 3 files changed, 45 insertions(+) diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c b/src/gallium/drivers/nouveau/nv50/nv50_query.c index 7fb6f3a..7dadb77 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_query.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c @@ -1532,6 +1532,47 @@ nv50_hw_pm_query_result(struct nv50_context *nv50, struct nv50_query *q, return TRUE; } +int +nv50_screen_get_driver_query_info(struct pipe_screen *pscreen, + unsigned id, + struct pipe_driver_query_info *info) +{ + struct nv50_screen *screen = nv50_screen(pscreen); + int count = 0; + + // TODO: Check DRM version when nvif will be merged in libdrm! + if (screen-base.perfmon) { + nv50_identify_events(screen); + count += NV50_HW_PM_QUERY_COUNT; + } + + if (!info) + return count; + + /* Init default values. */ + info-name = this_is_not_the_query_you_are_looking_for; The spirit of calim is still in the code. Must be the ghost in the shell! The patch is Reviewed-by: Martin Peres martin.pe...@free.fr + info-query_type = 0xdeadd01d; + info-type = PIPE_DRIVER_QUERY_TYPE_UINT64; + info-max_value.u64 = 0; + info-group_id = -1; + + if (id count) { + if (screen-base.perfmon) { + const struct nv50_hw_pm_query_cfg *cfg = +nv50_hw_pm_query_get_cfg(screen, NV50_HW_PM_QUERY(id)); + + info-name = cfg-event-name; + info-query_type = NV50_HW_PM_QUERY(id); + info-max_value.u64 = +(cfg-event-display == NV50_HW_PM_EVENT_DISPLAY_RATIO) ? 100 : 0; + return 1; + } + } + + /* User asked for info about non-existing query. */ + return 0; +} + void nv50_init_query_functions(struct nv50_context *nv50) { diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c index 335bff1..ac1acd1 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c @@ -748,6 +748,7 @@ nv50_screen_create(struct nouveau_device *dev) pscreen-get_param = nv50_screen_get_param; pscreen-get_shader_param = nv50_screen_get_shader_param; pscreen-get_paramf = nv50_screen_get_paramf; + pscreen-get_driver_query_info = nv50_screen_get_driver_query_info; nv50_screen_init_resource_functions(pscreen); diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.h b/src/gallium/drivers/nouveau/nv50/nv50_screen.h index 0449659..69127c0 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.h +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.h @@ -143,6 +143,9 @@ nv50_screen(struct pipe_screen *screen) #define NV50_HW_PM_QUERY_TEX_CACHE_HIT 22 #define NV50_HW_PM_QUERY_TEX_WAITS_FOR_FB 23 +int nv50_screen_get_driver_query_info(struct pipe_screen *, unsigned, + struct pipe_driver_query_info *); + boolean nv50_blitter_create(struct nv50_screen *); void nv50_blitter_destroy(struct nv50_screen *); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC 17/16] glsl: add AoA support for atmoic counters
--- src/glsl/link_atomics.cpp | 73 +++ 1 file changed, 55 insertions(+), 18 deletions(-) diff --git a/src/glsl/link_atomics.cpp b/src/glsl/link_atomics.cpp index 100d03c..21f9af7 100644 --- a/src/glsl/link_atomics.cpp +++ b/src/glsl/link_atomics.cpp @@ -95,6 +95,56 @@ namespace { y-data.atomic.offset x-data.atomic.offset + x-type-atomic_size())); } + void + process_atomic_variable(const glsl_type *t, struct gl_shader_program *prog, + char **name, size_t name_length, ir_variable *var, + active_atomic_buffer *const buffers, + unsigned *num_buffers, int *offset, + const unsigned shader_stage) + { + /* FIXME: Arrays of arrays get counted separately. For example: + * x1[3][3][2] = 9 counters + * x2[3][2]= 3 counters + * x3[2] = 1 counter + * + * However this code marks all the counters as active even when they + * might not be used. + */ + if (t-is_array() t-fields.array-is_array()) { + for (unsigned i = 0; i t-length; i++) { + size_t new_length = name_length; + + /* Append the subscript to the current variable name */ + ralloc_asprintf_rewrite_tail(name, new_length, [%u], i); + +process_atomic_variable(t-fields.array, prog, name, new_length, +var, buffers, num_buffers, offset, +shader_stage); + } + } else { + unsigned id = 0; + bool found = prog-UniformHash-get(id, *name); + assert(found); + (void) found; + active_atomic_buffer *buf = buffers[var-data.binding]; + gl_uniform_storage *const storage = prog-UniformStorage[id]; + + /* If this is the first time the buffer is used, increment + * the counter of buffers used. + */ + if (buf-size == 0) +(*num_buffers)++; + + buf-push_back(id, var); + + buf-stage_references[shader_stage]++; + buf-size = MAX2(buf-size, *offset + t-atomic_size()); + + storage-offset = *offset; + *offset += t-atomic_size(); + } + } + active_atomic_buffer * find_active_atomic_counters(struct gl_context *ctx, struct gl_shader_program *prog, @@ -114,23 +164,11 @@ namespace { ir_variable *var = node-as_variable(); if (var var-type-contains_atomic()) { - unsigned id = 0; - bool found = prog-UniformHash-get(id, var-name); - assert(found); - (void) found; - active_atomic_buffer *buf = buffers[var-data.binding]; - - /* If this is the first time the buffer is used, increment -* the counter of buffers used. -*/ - if (buf-size == 0) - (*num_buffers)++; - - buf-push_back(id, var); - - buf-stage_references[i]++; - buf-size = MAX2(buf-size, var-data.atomic.offset + -var-type-atomic_size()); + char *name = ralloc_strdup(NULL, var-name); + int offset = var-data.atomic.offset; + process_atomic_variable(var-type, prog, name, strlen(name), + var, buffers, num_buffers, offset, i); + ralloc_free(name); } } } @@ -205,7 +243,6 @@ link_assign_atomic_counter_resources(struct gl_context *ctx, var-data.binding = i; storage-atomic_buffer_index = i; - storage-offset = var-data.atomic.offset; storage-array_stride = (var-type-is_array() ? var-type-without_array()-atomic_size() : 0); } -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Using the right context in st_texture_release_all_sampler_views()
Hi Marek, This is regarding your commit st/mesa: use pipe_sampler_view_release for releasing sampler views from last October. Basically, we have: void st_texture_release_all_sampler_views(struct st_context *st, struct st_texture_object *stObj) { GLuint i; /* XXX This should use sampler_views[i]-pipe, not st-pipe */ for (i = 0; i stObj-num_sampler_views; ++i) pipe_sampler_view_release(st-pipe, stObj-sampler_views[i]); } Our VMware/svga driver has an issue when pipe_context::sampler_view_destroy() is called with one context and a sampler view which was created with another context. I can hack around it in our driver code, but it would be nice to fix this in the state tracker. Ideally, the above code should be something like: for (i = 0; i stObj-num_sampler_views; ++i) pipe_sampler_view_reference(stObj-sampler_views[i], NULL); The current code which uses the st-pipe context came from the bug https://bugs.freedesktop.org/show_bug.cgi?id=81680 AFAICT, you were just working around an R600 driver issue. Any chance we could fix the state tracker and re-test Firefox on R600? -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC 18/16] nir: wrapper for glsl_type arrays_of_arrays_size()
--- src/glsl/nir/nir_types.cpp | 6 ++ src/glsl/nir/nir_types.h | 2 ++ 2 files changed, 8 insertions(+) diff --git a/src/glsl/nir/nir_types.cpp b/src/glsl/nir/nir_types.cpp index 62176f5..06f5b0a 100644 --- a/src/glsl/nir/nir_types.cpp +++ b/src/glsl/nir/nir_types.cpp @@ -106,6 +106,12 @@ glsl_get_length(const struct glsl_type *type) return type-is_matrix() ? type-matrix_columns : type-length; } +unsigned +glsl_get_aoa_size(const struct glsl_type *type) +{ + return type-arrays_of_arrays_size(); +} + const char * glsl_get_struct_elem_name(const struct glsl_type *type, unsigned index) { diff --git a/src/glsl/nir/nir_types.h b/src/glsl/nir/nir_types.h index 276d4ad..3f75e46 100644 --- a/src/glsl/nir/nir_types.h +++ b/src/glsl/nir/nir_types.h @@ -59,6 +59,8 @@ unsigned glsl_get_matrix_columns(const struct glsl_type *type); unsigned glsl_get_length(const struct glsl_type *type); +unsigned glsl_get_aoa_size(const struct glsl_type *type); + const char *glsl_get_struct_elem_name(const struct glsl_type *type, unsigned index); -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC 19/16] nir: add atomic lowering support for AoA
--- src/glsl/nir/nir_lower_atomics.c | 20 +++- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/src/glsl/nir/nir_lower_atomics.c b/src/glsl/nir/nir_lower_atomics.c index ce3615a..e5ec008 100644 --- a/src/glsl/nir/nir_lower_atomics.c +++ b/src/glsl/nir/nir_lower_atomics.c @@ -72,15 +72,17 @@ lower_instr(nir_intrinsic_instr *instr, nir_function_impl *impl) nir_ssa_def *offset_def = offset_const-def; - if (instr-variables[0]-deref.child != NULL) { - assert(instr-variables[0]-deref.child-deref_type == - nir_deref_type_array); - nir_deref_array *deref_array = - nir_deref_as_array(instr-variables[0]-deref.child); - assert(deref_array-deref.child == NULL); - - offset_const-value.u[0] += - deref_array-base_offset * ATOMIC_COUNTER_SIZE; + nir_deref *tail = instr-variables[0]-deref; + while (tail-child != NULL) { + assert(tail-child-deref_type == nir_deref_type_array); + nir_deref_array *deref_array = nir_deref_as_array(tail-child); + tail = tail-child; + + unsigned child_array_elements = tail != NULL ? + glsl_get_aoa_size(tail-type) : 1; + + offset_const-value.u[0] += deref_array-base_offset * + child_array_elements * ATOMIC_COUNTER_SIZE; if (deref_array-deref_array_type == nir_deref_array_type_indirect) { nir_load_const_instr *atomic_counter_size = -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 90311] Fail to build libglx with clang at linking stage
https://bugs.freedesktop.org/show_bug.cgi?id=90311 Julien Isorce julien.iso...@gmail.com changed: What|Removed |Added CC||emil.l.veli...@gmail.com --- Comment #3 from Julien Isorce julien.iso...@gmail.com --- Emil I am not gonna try your solution soon so I copy/past your comment from mailing list here in case someone else finds this bug and wants to go ahead: So as mentioned previously glx (libGL) should interact with the dispatch directly, rather than have any knowledge in the implementation detail (which lies in mesa). src/glx/indirect_glx.c is a nice example, where __glXNewIndirectAPI (autogenerated from mapi/glapi/gen/glX_proto_send.py) is used to create the dispatch table and then set it with _glapi_set_dispatch(). AppleDRI uses similar approach _glapi_create_table_from_handle(generated from mapi/glapi/gen/gl_gentable.py), but there are things that could be improved/fixed. - __ogl_framework_api does not need to be over a thousand entries long. There are six overrides + __ogl_framework_api-Scissor used. So tweaking the generator to directly produce __applegl_api and(?) a slimmed down __ogl_framework_api sounds like a good idea to me. Further on one could also: - Drop the multiple forward declarations of __ogl_framework_api. - Wrap/implement appledri around __GLXDRI(display|screen) shoving storing things like dl_handle, thus allowing things to be torn down on exit. Namely I'm suggesting to cleanup and test Jon's earlier work. - apple_glapi_set_dispatch is called too late - at applegl_bind_context. According to the docs/spec one should be able to fetch the function pointers without any context let alone a currently bound one. - As a follow up from the above struct glx_context_vtable::get_proc_address could be nuked, making the struct fit nicely into 32/64 byte cache :-) Obviously if you can find an alternative way (but not as hacky as this one) that'll be great. Cheers, Emil P.S. Why is there no appledriproto package, similar to xf86driproto and friends ? Having appledri{,str}.h duplicated seems fragile and ill-advised. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 10/20] i965/fs: Implement image load, store and atomic.
This all looks correct as far as I can tell. However, I'm very concerned about the number of checks such as has_matching_typed_format() that are built-in to the compiler (via surface_builder) where we then go on to do something that is highly dependant on state setup doing the exact same check (not via surface_builder) and doing the right thing. Another example would be the interplay with has_split_bit_layout. How do we know these won't get out of sync? On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote: v2: Drop VEC4 suport. v3: Rebase. --- .../drivers/dri/i965/brw_fs_surface_builder.cpp| 216 + src/mesa/drivers/dri/i965/brw_fs_surface_builder.h | 17 ++ 2 files changed, 233 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp index ea1c4aa..46b449f 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp @@ -587,3 +587,219 @@ namespace { } } } + +namespace brw { + namespace image_access { + /** + * Load a vector from a surface of the given format and dimensionality + * at the given coordinates. + */ + fs_reg + emit_image_load(const fs_builder bld, + const fs_reg image, const fs_reg addr, + mesa_format format, unsigned dims) + { + using namespace image_format_info; + using namespace image_format_conversion; + using namespace image_validity; + using namespace image_coordinates; + using namespace surface_access; + const brw_device_info *devinfo = bld.shader-devinfo; + const mesa_format lower_format = +brw_lower_mesa_image_format(devinfo, format); + fs_reg tmp; + + if (has_matching_typed_format(devinfo, format)) { +/* Hopefully we get here most of the time... */ +tmp = emit_typed_read(bld, image, addr, dims, + _mesa_format_num_components(lower_format)); + } else { +/* Untyped surface reads return 32 bits of the surface per + * component, without any sort of unpacking or type conversion, + */ +const unsigned size = _mesa_get_format_bytes(format) / 4; + +/* they don't properly handle out of bounds access, so we have to + * check manually if the coordinates are valid and predicate the + * surface read on the result, + */ +const brw_predicate pred = + emit_bounds_check(bld, image, addr, dims); + +/* and they don't know about surface coordinates, we need to + * convert them to a raw memory offset. + */ +const fs_reg laddr = emit_address_calculation(bld, image, addr, dims); + +tmp = emit_untyped_read(bld, image, laddr, 1, size, pred); + +/* An out of bounds surface access should give zero as result. */ +for (unsigned c = 0; c 4; ++c) + set_predicate(pred, bld.SEL(offset(tmp, bld, c), + offset(tmp, bld, c), fs_reg(0))); + } + + /* Set the register type to D instead of UD if the data type is + * represented as a signed integer in memory so that sign extension + * is handled correctly by unpack. + */ + if (needs_sign_extension(format)) +tmp = retype(tmp, BRW_REGISTER_TYPE_D); + + if (!has_supported_bit_layout(devinfo, format)) { +/* Unpack individual vector components from the bitfield if the + * hardware is unable to do it for us. + */ +if (has_split_bit_layout(devinfo, format)) + tmp = emit_pack(bld, tmp, get_bit_shifts(lower_format), + get_bit_widths(lower_format)); +else + tmp = emit_unpack(bld, tmp, get_bit_shifts(format), + get_bit_widths(format)); + + } else if ((needs_sign_extension(format) + !is_conversion_trivial(devinfo, format)) || +has_undefined_high_bits(devinfo, format)) { +/* Perform a trivial unpack even though the bit layout matches in + * order to get the most significant bits of each component + * initialized properly. + */ +tmp = emit_unpack(bld, tmp, color_u(0, 32, 64, 96), + get_bit_widths(format)); + } + + if (!_mesa_is_format_integer(format)) { +if (is_conversion_trivial(devinfo, format)) { + /* Just need to cast the vector to the target type. */ + tmp = retype(tmp,
Re: [Mesa-dev] [PATCH 20/20] i965: Expose ARB_shader_image_load_store.
On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote: Reviewed-by: Paul Berry stereotype...@gmail.com I'm sure that Paul still thinks this patch does what the commit message says. However, does the r-b really still apply to the rest of it? --- src/mesa/drivers/dri/i965/intel_extensions.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c index 6b3bd12..5a5e308 100644 --- a/src/mesa/drivers/dri/i965/intel_extensions.c +++ b/src/mesa/drivers/dri/i965/intel_extensions.c @@ -324,6 +324,7 @@ intelInitExtensions(struct gl_context *ctx) ctx-Extensions.ARB_framebuffer_no_attachments = true; ctx-Extensions.ARB_gpu_shader5 = true; ctx-Extensions.ARB_shader_atomic_counters = true; + ctx-Extensions.ARB_shader_image_load_store = true; ctx-Extensions.ARB_texture_compression_bptc = true; ctx-Extensions.ARB_texture_view = true; -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/20] glsl/types: add new subroutine type (v3.1)
On Tuesday, July 21, 2015 03:19:14 PM Dave Airlie wrote: From: Dave Airlie airl...@redhat.com This type will be used to store the name of subroutine types as in subroutine void myfunc(void); will store myfunc into a subroutine type. This is required to the parser can identify a subroutine type in a uniform decleration as a valid type, and also for looking up the type later. Also add contains_subroutine method. v2: handle subroutine to int comparisons, needed for lowering pass. v3: do subroutine to int with it's own IR operation to avoid hacking on asserts (Kayden) v3.1: fix warnings in this patch, fix nir, fix tgsi Reviewed-by: Chris Forbes chr...@ijw.co.nz Signed-off-by: Dave Airlie airl...@redhat.com Patches 7-9 are: Reviewed-by: Kenneth Graunke kenn...@whitecape.org signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/4] i965/fs: Implement pass to lower instructions of unsupported SIMD width.
On Wed, Jul 22, 2015 at 10:05 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: On Wed, Jul 22, 2015 at 12:31 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: A few comments below. Mostly just asking for explanation. 1-3 are Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com Obviously, don't merge 4/4 until it actually has users. --Jason Thanks. On Thu, Jul 16, 2015 at 8:35 AM, Francisco Jerez curroje...@riseup.net wrote: This lowering pass implements an algorithm to expand SIMDN instructions into a sequence of SIMDM instructions in cases where the hardware doesn't support the original execution size natively for some particular instruction. The most important use-cases are: - Lowering send message instructions that don't support SIMD16 natively into SIMD8 (several texturing, framebuffer write and typed surface operations). - Lowering messages that don't support SIMD8 natively into SIMD16 (*cough*gen4*cough*). - 64-bit precision operations (e.g. FP64 and 64-bit integer multiplication). - SIMD32. The algorithm works by splitting the sources of the original instruction into chunks of width appropriate for the lowered instructions, and then interleaving the results component-wise into the destination of the original instruction. The pass is controlled by the get_lowered_simd_width() function that currently just returns the original execution size making the whole pass a no-op for the moment until some user is introduced. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 142 +++ src/mesa/drivers/dri/i965/brw_fs.h | 1 + 2 files changed, 143 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index d031352..eeb6938 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3204,6 +3204,147 @@ fs_visitor::lower_logical_sends() return progress; } +/** + * Get the closest native SIMD width supported by the hardware for instruction + * \p inst. The instruction will be left untouched by + * fs_visitor::lower_simd_width() if the returned value is equal to the + * original execution size. + */ +static unsigned +get_lowered_simd_width(const struct brw_device_info *devinfo, + const fs_inst *inst) +{ + switch (inst-opcode) { + default: + return inst-exec_size; + } +} + +/** + * The \p rows array of registers represents a \p num_rows by \p num_columns + * matrix in row-major order, write it in column-major order into the register + * passed as destination. \p stride gives the separation between matrix + * elements in the input in fs_builder::dispatch_width() units. + */ +static void +emit_transpose(const fs_builder bld, + const fs_reg dst, const fs_reg *rows, + unsigned num_rows, unsigned num_columns, unsigned stride) I'm not sure what I think about calling this emit_transpose. I guess it is kind of a transpose, but maybe it's more of a gather. I'm not going to quibble about it though. *Shrug*, it doesn't only gather the vectors of the rows array (that would have been emit_collect :P), it copies them out in vertical, just like a matrix transpose -- assuming you're not horrified by the idea of considering the argument a matrix. That's fine. Feel free to ignore that suggestion. +{ + fs_reg *const components = new fs_reg[num_rows * num_columns]; + + for (unsigned i = 0; i num_columns; ++i) { + for (unsigned j = 0; j num_rows; ++j) + components[num_rows * i + j] = offset(rows[j], bld, stride * i); + } + + bld.LOAD_PAYLOAD(dst, components, num_rows * num_columns, 0); + + delete[] components; +} + +bool +fs_visitor::lower_simd_width() +{ + bool progress = false; + + foreach_block_and_inst_safe(block, fs_inst, inst, cfg) { + const unsigned lower_width = get_lowered_simd_width(devinfo, inst); + + if (lower_width != inst-exec_size) { + /* Builder matching the original instruction. */ + const fs_builder ibld = bld.at(block, inst) +.exec_all(inst-force_writemask_all) +.group(inst-exec_size, inst-force_sechalf); + + /* Split the copies in chunks of the execution width of either the + * original or the lowered instruction, whichever is lower. + */ + const unsigned copy_width = MIN2(lower_width, inst-exec_size); + const unsigned n = inst-exec_size / copy_width; + const unsigned dst_size = inst-regs_written * REG_SIZE / +inst-dst.component_size(inst-exec_size); + fs_reg dsts[4]; + + assert(n 0 n = ARRAY_SIZE(dsts) +!inst-writes_accumulator !inst-mlen); + + for
[Mesa-dev] [RFC] AoA atomic counter follow-up fixes
Hi Guys, These patches fix up AoA support for atomic counters following my change in approach to uniforms in V2 of my AoA GLES series. However I wanted to get some feedback mainly on the first patch as the end result is the whole AoA is marked as active which doesnt happen on Nvidia drivers. I guess my question is do you think this patch is ok as a first pass at just getting this stuff up and running or should I spend a bunch of time writting a pass that splits the outer arrays into separate variables to make them easier to handle separatly. At this stage in the glsl ir life cycle I don't know if its worth the extra effort. Thanks, Tim ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Using the right context in st_texture_release_all_sampler_views()
On 07/22/2015 05:31 PM, Jose Fonseca wrote: On 22/07/15 23:32, Brian Paul wrote: Hi Marek, This is regarding your commit st/mesa: use pipe_sampler_view_release for releasing sampler views from last October. Basically, we have: void st_texture_release_all_sampler_views(struct st_context *st, struct st_texture_object *stObj) { GLuint i; /* XXX This should use sampler_views[i]-pipe, not st-pipe */ for (i = 0; i stObj-num_sampler_views; ++i) pipe_sampler_view_release(st-pipe, stObj-sampler_views[i]); } Our VMware/svga driver has an issue when pipe_context::sampler_view_destroy() is called with one context and a sampler view which was created with another context. I can hack around it in our driver code, but it would be nice to fix this in the state tracker. Ideally, the above code should be something like: for (i = 0; i stObj-num_sampler_views; ++i) pipe_sampler_view_reference(stObj-sampler_views[i], NULL); The current code which uses the st-pipe context came from the bug https://bugs.freedesktop.org/show_bug.cgi?id=81680 AFAICT, you were just working around an R600 driver issue. Any chance we could fix the state tracker and re-test Firefox on R600? Freeing a view from a different context with another context is wrong. But freeing a view with a context that might be current on another thread is also wrong, as pipe_context are not thread safe. If that was the previous behavior, then maybe Firefox crashed due to the race conditions. Looking at the stack trace, I suspect that what happened was that one context was freeing the st_texture, and all its views, while the other context was destroying itself, and all its own views. In short, st_texture_release_sampler_view was being called simultanouesly from two different threads. The proper fix IMO is to let the pipe context that owns the view to destroyed from whatever thread it is current (or whenever it is made current.) OK, in our off-list discussion it wasn't clear to me that multi-threading was your main concern until your last message. I see what you're saying now. So, when we destroy a texture object which may be shared by multiple contexts, we theoretically need to move the texture's sampler views to new, per-context lists. At some later time, when the context is used, we'd check if the list of sampler views to be destroyed was non-empty and free them. Unfortunately, that involves one context/thread reaching into another context. And that sounds messy. The ultimate solution may be to get rid of the per-texture list of sampler views and instead store sampler views in a context-private data structure. I'll have to think that through. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Using the right context in st_texture_release_all_sampler_views()
On 22/07/15 23:32, Brian Paul wrote: Hi Marek, This is regarding your commit st/mesa: use pipe_sampler_view_release for releasing sampler views from last October. Basically, we have: void st_texture_release_all_sampler_views(struct st_context *st, struct st_texture_object *stObj) { GLuint i; /* XXX This should use sampler_views[i]-pipe, not st-pipe */ for (i = 0; i stObj-num_sampler_views; ++i) pipe_sampler_view_release(st-pipe, stObj-sampler_views[i]); } Our VMware/svga driver has an issue when pipe_context::sampler_view_destroy() is called with one context and a sampler view which was created with another context. I can hack around it in our driver code, but it would be nice to fix this in the state tracker. Ideally, the above code should be something like: for (i = 0; i stObj-num_sampler_views; ++i) pipe_sampler_view_reference(stObj-sampler_views[i], NULL); The current code which uses the st-pipe context came from the bug https://bugs.freedesktop.org/show_bug.cgi?id=81680 AFAICT, you were just working around an R600 driver issue. Any chance we could fix the state tracker and re-test Firefox on R600? Freeing a view from a different context with another context is wrong. But freeing a view with a context that might be current on another thread is also wrong, as pipe_context are not thread safe. If that was the previous behavior, then maybe Firefox crashed due to the race conditions. Looking at the stack trace, I suspect that what happened was that one context was freeing the st_texture, and all its views, while the other context was destroying itself, and all its own views. In short, st_texture_release_sampler_view was being called simultanouesly from two different threads. The proper fix IMO is to let the pipe context that owns the view to destroyed from whatever thread it is current (or whenever it is made current.) Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Using the right context in st_texture_release_all_sampler_views()
On 23/07/15 01:00, Brian Paul wrote: On 07/22/2015 05:31 PM, Jose Fonseca wrote: On 22/07/15 23:32, Brian Paul wrote: Hi Marek, This is regarding your commit st/mesa: use pipe_sampler_view_release for releasing sampler views from last October. Basically, we have: void st_texture_release_all_sampler_views(struct st_context *st, struct st_texture_object *stObj) { GLuint i; /* XXX This should use sampler_views[i]-pipe, not st-pipe */ for (i = 0; i stObj-num_sampler_views; ++i) pipe_sampler_view_release(st-pipe, stObj-sampler_views[i]); } Our VMware/svga driver has an issue when pipe_context::sampler_view_destroy() is called with one context and a sampler view which was created with another context. I can hack around it in our driver code, but it would be nice to fix this in the state tracker. Ideally, the above code should be something like: for (i = 0; i stObj-num_sampler_views; ++i) pipe_sampler_view_reference(stObj-sampler_views[i], NULL); The current code which uses the st-pipe context came from the bug https://bugs.freedesktop.org/show_bug.cgi?id=81680 AFAICT, you were just working around an R600 driver issue. Any chance we could fix the state tracker and re-test Firefox on R600? Freeing a view from a different context with another context is wrong. But freeing a view with a context that might be current on another thread is also wrong, as pipe_context are not thread safe. If that was the previous behavior, then maybe Firefox crashed due to the race conditions. Looking at the stack trace, I suspect that what happened was that one context was freeing the st_texture, and all its views, while the other context was destroying itself, and all its own views. In short, st_texture_release_sampler_view was being called simultanouesly from two different threads. The proper fix IMO is to let the pipe context that owns the view to destroyed from whatever thread it is current (or whenever it is made current.) OK, in our off-list discussion it wasn't clear to me that multi-threading was your main concern until your last message. I see what you're saying now. So, when we destroy a texture object which may be shared by multiple contexts, we theoretically need to move the texture's sampler views to new, per-context lists. At some later time, when the context is used, we'd check if the list of sampler views to be destroyed was non-empty and free them. Unfortunately, that involves one context/thread reaching into another context. And that sounds messy. Yes. We could have a sepeare mutex for this list. Still, it's hard to prevent dead-locks (e.g, two contexts trying to add a deferred view to each-others defferred list.) The ultimate solution may be to get rid of the per-texture list of sampler views and instead store sampler views in a context-private data structure. I'll have to think that through. Yes, that's another possibility. I haven't thought it through either. One way or another, you'll still want to notify a context to garbagge collect its views, even if it's just an atomic flag on each context, otherwise a texture that was shared by two contexts might end up lingering around indefinitely because contexts are holding to its views. Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] dispatch_sanity.cpp: remove commented out tess entries
From: Dave Airlie airl...@redhat.com These entries were put in the GL4.0 section, so removed the commented out ones. Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/main/tests/dispatch_sanity.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/src/mesa/main/tests/dispatch_sanity.cpp b/src/mesa/main/tests/dispatch_sanity.cpp index 7e3a97b..d1bd0ce 100644 --- a/src/mesa/main/tests/dispatch_sanity.cpp +++ b/src/mesa/main/tests/dispatch_sanity.cpp @@ -1577,8 +1577,6 @@ const struct function gl_core_functions_possible[] = { // { glUniformSubroutinesuiv, 43, -1 }, // XXX: Add to xml // { glGetUniformSubroutineuiv, 43, -1 }, // XXX: Add to xml // { glGetProgramStageiv, 43, -1 }, // XXX: Add to xml -// { glPatchParameteri, 43, -1 }, // XXX: Add to xml -// { glPatchParameterfv, 43, -1 },// XXX: Add to xml { glBindTransformFeedback, 43, -1 }, { glDeleteTransformFeedbacks, 43, -1 }, -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 13/14] mesa: Change the signature of _mesa_need_rgb_to_luminance_conversion()
Looks good to me, if this did not introduce any regressions: Reviewed-by: Iago Toral Quiroga ito...@igalia.com On Tue, 2015-06-16 at 11:15 -0700, Anuj Phogat wrote: This allows us to handle cases when texImage-_BaseFormat doesn't match _mesa_format_get_base_format(texImage-Format). _BaseFormat is what we care about in this function. Signed-off-by: Anuj Phogat anuj.pho...@gmail.com --- src/mesa/drivers/common/meta_tex_subimage.c | 4 +++- src/mesa/main/readpix.c | 28 +++- src/mesa/main/readpix.h | 3 ++- 3 files changed, 20 insertions(+), 15 deletions(-) diff --git a/src/mesa/drivers/common/meta_tex_subimage.c b/src/mesa/drivers/common/meta_tex_subimage.c index 6d52014..43e1210 100644 --- a/src/mesa/drivers/common/meta_tex_subimage.c +++ b/src/mesa/drivers/common/meta_tex_subimage.c @@ -262,6 +262,7 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, GLuint dims, int full_height, image_height; struct gl_texture_image *pbo_tex_image; struct gl_renderbuffer *rb = NULL; + GLenum dstBaseFormat = _mesa_unpack_format_to_base_format(format); GLenum status, base_format; bool success = false, clear_channels_to_zero = false; float save_clear_color[4]; @@ -284,7 +285,8 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, GLuint dims, type, GL_FALSE)) return false; - if (_mesa_need_rgb_to_luminance_conversion(rb-Format, format)) + if (_mesa_need_rgb_to_luminance_conversion(rb-_BaseFormat, + dstBaseFormat)) return false; if (_mesa_need_signed_unsigned_int_conversion(rb-Format, format, type)) diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c index c98975f..3a9b766 100644 --- a/src/mesa/main/readpix.c +++ b/src/mesa/main/readpix.c @@ -47,17 +47,14 @@ * Return true if the conversion L=R+G+B is needed. */ GLboolean -_mesa_need_rgb_to_luminance_conversion(mesa_format texFormat, GLenum format) +_mesa_need_rgb_to_luminance_conversion(GLenum srcBaseFormat, + GLenum dstBaseFormat) { - GLenum baseTexFormat = _mesa_get_format_base_format(texFormat); - - return (baseTexFormat == GL_RG || - baseTexFormat == GL_RGB || - baseTexFormat == GL_RGBA) - (format == GL_LUMINANCE || - format == GL_LUMINANCE_ALPHA || - format == GL_LUMINANCE_INTEGER_EXT || - format == GL_LUMINANCE_ALPHA_INTEGER_EXT); + return (srcBaseFormat == GL_RG || + srcBaseFormat == GL_RGB || + srcBaseFormat == GL_RGBA) + (dstBaseFormat == GL_LUMINANCE || + dstBaseFormat == GL_LUMINANCE_ALPHA); } /** @@ -89,6 +86,8 @@ _mesa_get_readpixels_transfer_ops(const struct gl_context *ctx, GLboolean uses_blit) { GLbitfield transferOps = ctx-_ImageTransferState; + GLenum srcBaseFormat = _mesa_get_format_base_format(texFormat); + GLenum dstBaseFormat = _mesa_unpack_format_to_base_format(format); if (format == GL_DEPTH_COMPONENT || format == GL_DEPTH_STENCIL || @@ -125,7 +124,7 @@ _mesa_get_readpixels_transfer_ops(const struct gl_context *ctx, * have any effect anyway. */ if (_mesa_get_format_datatype(texFormat) == GL_UNSIGNED_NORMALIZED - !_mesa_need_rgb_to_luminance_conversion(texFormat, format)) { + !_mesa_need_rgb_to_luminance_conversion(srcBaseFormat, dstBaseFormat)) { transferOps = ~IMAGE_CLAMP_BIT; } @@ -164,6 +163,7 @@ _mesa_readpixels_needs_slow_path(const struct gl_context *ctx, GLenum format, { struct gl_renderbuffer *rb = _mesa_get_read_renderbuffer_for_format(ctx, format); + GLenum dstBaseFormat = _mesa_unpack_format_to_base_format(format); assert(rb); @@ -184,7 +184,8 @@ _mesa_readpixels_needs_slow_path(const struct gl_context *ctx, GLenum format, default: /* Color formats. */ - if (_mesa_need_rgb_to_luminance_conversion(rb-Format, format)) { + if (_mesa_need_rgb_to_luminance_conversion(rb-_BaseFormat, + dstBaseFormat)) { return GL_TRUE; } @@ -458,6 +459,7 @@ read_rgba_pixels( struct gl_context *ctx, uint8_t rebase_swizzle[4]; struct gl_framebuffer *fb = ctx-ReadBuffer; struct gl_renderbuffer *rb = fb-_ColorReadBuffer; + GLenum dstBaseFormat = _mesa_unpack_format_to_base_format(format); if (!rb) return; @@ -469,7 +471,7 @@ read_rgba_pixels( struct gl_context *ctx, dst_stride = _mesa_image_row_stride(packing, width, format, type); dst_format = _mesa_format_from_format_and_type(format, type); convert_rgb_to_lum = - _mesa_need_rgb_to_luminance_conversion(rb-Format, format); +
Re: [Mesa-dev] [PATCH 14/14] meta: Use _mesa_need_rgb_to_luminance_conversion() in decompress_texture_image()
Reviewed-by: Iago Toral Quiroga ito...@igalia.com On Tue, 2015-06-16 at 11:15 -0700, Anuj Phogat wrote: Signed-off-by: Anuj Phogat anuj.pho...@gmail.com --- src/mesa/drivers/common/meta.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c index 6108d98..e123500 100644 --- a/src/mesa/drivers/common/meta.c +++ b/src/mesa/drivers/common/meta.c @@ -3137,11 +3137,8 @@ decompress_texture_image(struct gl_context *ctx, /* If we're reading back an RGB(A) texture (using glGetTexImage) as * luminance then we need to return L=tex(R). */ - ((baseTexFormat == GL_RGBA || -baseTexFormat == GL_RGB || -baseTexFormat == GL_RG) - (destBaseFormat == GL_LUMINANCE || - destBaseFormat == GL_LUMINANCE_ALPHA))) { + _mesa_need_rgb_to_luminance_conversion(baseTexFormat, + destBaseFormat)) { /* Green and blue must be zero */ _mesa_PixelTransferf(GL_GREEN_SCALE, 0.0f); _mesa_PixelTransferf(GL_BLUE_SCALE, 0.0f); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91425] [regression, bisected] Piglit spec/ext_packed_float/ getteximage-invalid-format-for-packed-type fails
https://bugs.freedesktop.org/show_bug.cgi?id=91425 --- Comment #2 from Michel Dänzer mic...@daenzer.net --- FWIW, I got the same regression with radeonsi. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] targets/dri: scons: add missing link against libdrm
Otherwise the final dri module will have (additional) unresolved symbols. Cc: Brian Paul bri...@vmware.com Cc: Jose Fonseca jfons...@vmware.com Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- We can only fix the remaining unresolved symbols (_glapi_foo), as we remove the non-shared glapi when building with DRI. With this we at least match the autotools build. -Emil src/gallium/targets/dri/SConscript | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/gallium/targets/dri/SConscript b/src/gallium/targets/dri/SConscript index 8d29f3b..2fb0da0 100644 --- a/src/gallium/targets/dri/SConscript +++ b/src/gallium/targets/dri/SConscript @@ -25,6 +25,8 @@ if env['llvm']: env.Append(CPPDEFINES = 'GALLIUM_LLVMPIPE') env.Prepend(LIBS = [llvmpipe]) +env.PkgUseModules('DRM') + env.Append(CPPDEFINES = [ 'GALLIUM_VMWGFX', 'GALLIUM_SOFTPIPE', -- 2.4.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91425] [regression, bisected] Piglit spec/ext_packed_float/ getteximage-invalid-format-for-packed-type fails
https://bugs.freedesktop.org/show_bug.cgi?id=91425 --- Comment #1 from Brian Paul bri...@vmware.com --- Hmm, I can't reproduce that. The test passes completely for me. What driver are you using? I'm testing llvmpipe/softpipe. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91387] Mesa 10.6.1 implementation error: invalid target in _swrast_choose_texture_sample_func
https://bugs.freedesktop.org/show_bug.cgi?id=91387 --- Comment #3 from Michael Godfrey michaeldgodf...@gmail.com --- Created attachment 117297 -- https://bugs.freedesktop.org/attachment.cgi?id=117297action=edit backtrace -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 12/14] meta: Fix reading luminance texture as rgba in _mesa_meta_pbo_GetTexSubImage()
The problem here is that the _mesa_meta_BlitFramebuffer is not setting G/B channels to 0.0 when doing Luminance/Intensity to RGBA conversions, so why not implement the fix in _mesa_meta_BlitFramebuffer directly? The GL spec expects frambuffer blits to handle these conversions properly, so it looks like a win for all uses of that function. Iago On Fri, 2015-07-17 at 10:28 -0700, Anuj Phogat wrote: After recent addition of pbo testing in piglit test getteximage-luminance, it fails on i965. This patch makes a sub test pass. This patch adds a clear color operation to meta pbo path, which I think is better than falling back to software path. V2: Fix color mask for GL_LUMINANCE_ALPHA Signed-off-by: Anuj Phogat anuj.pho...@gmail.com Cc: mesa-sta...@lists.freedesktop.org --- src/mesa/drivers/common/meta_tex_subimage.c | 36 +++-- 1 file changed, 34 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/common/meta_tex_subimage.c b/src/mesa/drivers/common/meta_tex_subimage.c index 13f8292..f4d5ac3 100644 --- a/src/mesa/drivers/common/meta_tex_subimage.c +++ b/src/mesa/drivers/common/meta_tex_subimage.c @@ -28,6 +28,7 @@ #include blend.h #include bufferobj.h #include buffers.h +#include clear.h #include fbobject.h #include glformats.h #include glheader.h @@ -278,8 +279,9 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, GLuint dims, int full_height, image_height; struct gl_texture_image *pbo_tex_image; struct gl_renderbuffer *rb = NULL; - GLenum status; - bool success = false; + GLenum status, src_base_format; + bool success = false, clear_channels_to_zero = false; + float save_clear_color[4]; int z; if (!_mesa_is_bufferobj(packing-BufferObj)) @@ -380,6 +382,27 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, GLuint dims, GL_COLOR_BUFFER_BIT, GL_NEAREST)) goto fail; + src_base_format = tex_image ? + tex_image-_BaseFormat : + ctx-ReadBuffer-_ColorReadBuffer-_BaseFormat; + + /* Depending on the base formats involved we might need to rebase some +* values. For example if we download from a Luminance format to RGBA +* format, we want G=0 and B=0. +*/ + clear_channels_to_zero = + _mesa_need_luminance_to_rgb_conversion(src_base_format, + pbo_tex_image-_BaseFormat); + + if (clear_channels_to_zero) { + memcpy(save_clear_color, ctx-Color.ClearColor.f, 4 * sizeof(float)); + /* Clear the Green, Blue channels. */ + _mesa_ColorMask(GL_FALSE, GL_TRUE, GL_TRUE, + src_base_format != GL_LUMINANCE_ALPHA); + _mesa_ClearColor(0.0, 0.0, 0.0, 1.0); + _mesa_Clear(GL_COLOR_BUFFER_BIT); + } + for (z = 1; z depth; z++) { _mesa_meta_bind_fbo_image(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, tex_image, zoffset + z); @@ -392,6 +415,15 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, GLuint dims, 0, z * image_height, width, z * image_height + height, GL_COLOR_BUFFER_BIT, GL_NEAREST); + if (clear_channels_to_zero) + _mesa_Clear(GL_COLOR_BUFFER_BIT); + } + + /* Unmask the color channels and restore the saved clear color values. */ + if (clear_channels_to_zero) { + _mesa_ColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE); + _mesa_ClearColor(save_clear_color[0], save_clear_color[1], + save_clear_color[2], save_clear_color[3]); } success = true; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 11/14] meta: Use _mesa_need_luminance_to_rgb_conversion() in decompress_texture_image()
Patches 10-11 are Reviewed-by: Iago Toral Quiroga ito...@igalia.com On Tue, 2015-06-16 at 11:15 -0700, Anuj Phogat wrote: Signed-off-by: Anuj Phogat anuj.pho...@gmail.com --- src/mesa/drivers/common/meta.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c index c9e58d8..6108d98 100644 --- a/src/mesa/drivers/common/meta.c +++ b/src/mesa/drivers/common/meta.c @@ -3132,16 +3132,8 @@ decompress_texture_image(struct gl_context *ctx, * returned as red and two-channel texture values are returned as * red/alpha. */ - if (((baseTexFormat == GL_LUMINANCE || -baseTexFormat == GL_LUMINANCE_ALPHA || -baseTexFormat == GL_INTENSITY) - (destBaseFormat == GL_RGBA || -destBaseFormat == GL_RGB || -destBaseFormat == GL_RG || -destBaseFormat == GL_GREEN || -destBaseFormat == GL_BLUE || -destBaseFormat == GL_BGRA || -destBaseFormat == GL_BGR)) || + if (_mesa_need_luminance_to_rgb_conversion(baseTexFormat, + destBaseFormat) || /* If we're reading back an RGB(A) texture (using glGetTexImage) as * luminance then we need to return L=tex(R). */ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91387] Mesa 10.6.1 implementation error: invalid target in _swrast_choose_texture_sample_func
https://bugs.freedesktop.org/show_bug.cgi?id=91387 Brian Paul bri...@vmware.com changed: What|Removed |Added CC||anuj.pho...@gmail.com --- Comment #4 from Brian Paul bri...@vmware.com --- Ugh, looks like there's some sort of unintended recursion going on here. I think Anuj might want to take a look at this. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev