Re: [Mesa-dev] [PATCH 01/13] i965/fs_cse: Factor out code to create copy instructions
On Wed, Apr 01, 2015 at 06:19:12PM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 75
>  1 file changed, 38 insertions(+), 37 deletions(-)

Just a few small notes but the logic looks right. Always nice to see things
getting clearer:

Reviewed-by: Topi Pohjolainen <topi.pohjolai...@intel.com>

> diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> index f2c4098..dd199fa 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> @@ -183,6 +183,29 @@ instructions_match(fs_inst *a, fs_inst *b, bool *negate)
>            operands_match(a, b, negate);
>  }
>
> +static fs_inst *
> +create_copy_instr(fs_visitor *v, fs_inst *inst, fs_reg src, bblock_t *block,

You can drop 'block', it isn't used.

> +                  bool negate)
> +{
> +   int written = inst->regs_written;
> +   int dst_width = inst->dst.width / 8;
> +   fs_reg dst = inst->dst;
> +   fs_inst *copy;

Perhaps a separating empty line here? Your choice.

> +   if (written > dst_width) {
> +      fs_reg *sources = ralloc_array(v->mem_ctx, fs_reg, written / dst_width);
> +      for (int i = 0; i < written / dst_width; i++)
> +         sources[i] = offset(src, i);

Other people have instructed me to leave {} out only in the first
indentation level. I know that the original code didn't have them either, so
this is up to you.

> +      copy = v->LOAD_PAYLOAD(dst, sources, written / dst_width);
> +   } else {
> +      copy = v->MOV(dst, src);
> +      copy->force_writemask_all = inst->force_writemask_all;
> +      copy->src[0].negate = negate;
> +   }
> +   assert(copy->regs_written == written);
> +
> +   return copy;
> +}
> +
>  bool
>  fs_visitor::opt_cse_local(bblock_t *block)
>  {
> @@ -228,49 +251,27 @@ fs_visitor::opt_cse_local(bblock_t *block)
>              bool no_existing_temp = entry->tmp.file == BAD_FILE;
>              if (no_existing_temp && !entry->generator->dst.is_null()) {
>                 int written = entry->generator->regs_written;
> -               int dst_width = entry->generator->dst.width / 8;
> -               assert(written % dst_width == 0);
> -
> -               fs_reg orig_dst = entry->generator->dst;
> -               fs_reg tmp = fs_reg(GRF, alloc.allocate(written),
> -                                   orig_dst.type, orig_dst.width);
> -               entry->tmp = tmp;
> -               entry->generator->dst = tmp;
> -
> -               fs_inst *copy;
> -               if (written > dst_width) {
> -                  fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written / dst_width);
> -                  for (int i = 0; i < written / dst_width; i++)
> -                     sources[i] = offset(tmp, i);
> -                  copy = LOAD_PAYLOAD(orig_dst, sources, written / dst_width);
> -               } else {
> -                  copy = MOV(orig_dst, tmp);
> -                  copy->force_writemask_all =
> -                     entry->generator->force_writemask_all;
> -               }
> +               assert((written * 8) % entry->generator->dst.width == 0);
> +
> +               entry->tmp = fs_reg(GRF, alloc.allocate(written),
> +                                   entry->generator->dst.type,
> +                                   entry->generator->dst.width);
> +
> +               fs_inst *copy = create_copy_instr(this, entry->generator,
> +                                                 entry->tmp, block, false);
>                 entry->generator->insert_after(block, copy);
> +
> +               entry->generator->dst = entry->tmp;
>              }
>
>              /* dest <- temp */
>              if (!inst->dst.is_null()) {
> -               int written = inst->regs_written;
> -               int dst_width = inst->dst.width / 8;
> -               assert(written == entry->generator->regs_written);
> -               assert(dst_width == entry->generator->dst.width / 8);
> +               assert(inst->regs_written == entry->generator->regs_written);
> +               assert(inst->dst.width == entry->generator->dst.width);
>                 assert(inst->dst.type == entry->tmp.type);
> -               fs_reg dst = inst->dst;
> -               fs_reg tmp = entry->tmp;
> -               fs_inst *copy;
> -               if (written > dst_width) {
> -                  fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written / dst_width);
> -                  for (int i = 0; i < written / dst_width; i++)
> -                     sources[i] = offset(tmp, i);
> -                  copy = LOAD_PAYLOAD(dst, sources, written / dst_width);
> -               } else {
> -                  copy = MOV(dst, tmp);
> -                  copy->force_writemask_all = inst->force_writemask_all;
> -                  copy->src[0].negate = negate;
> -               }
> +
> +               fs_inst *copy = create_copy_instr(this, inst,
> +                                                 entry->tmp, block, negate);
>                 inst->insert_before(block, copy);
>              }
> --
> 2.3.4
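Editorial sketch: the choice that the factored-out create_copy_instr() makes between a LOAD_PAYLOAD and a plain MOV can be modelled in a few lines of Python. This is only an illustration under stated assumptions, not Mesa code: choose_copy and the tuples it returns are hypothetical names, and the model assumes (as the assert in the patch does) that the number of registers written is a multiple of dst.width / 8.

```python
def choose_copy(regs_written, dst_width_channels, negate=False):
    """Model which copy the helper in the patch would emit.

    regs_written: registers written by the original instruction.
    dst_width_channels: destination width in channels (8 or 16).
    The returned tuple layout is invented for this sketch.
    """
    regs_per_value = dst_width_channels // 8
    if regs_per_value == 0 or regs_written % regs_per_value != 0:
        raise ValueError("regs_written must be a multiple of dst.width / 8")

    if regs_written > regs_per_value:
        # The result spans several destination-wide values, so each one
        # becomes a separate source of a LOAD_PAYLOAD.
        num_sources = regs_written // regs_per_value
        return ("LOAD_PAYLOAD", [("offset", i) for i in range(num_sources)])

    # A single destination-wide value: a plain MOV is enough. In the
    # quoted patch the negate flag is only applied on this path.
    return ("MOV", negate)
```

For example, an instruction writing four registers with an 8-wide destination maps to a four-source LOAD_PAYLOAD in this model, while one writing two registers with a 16-wide destination maps to a single MOV.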
Re: [Mesa-dev] [PATCH 02/13] i965: Change header_present to header_size in backend_instruction
On Wed, Apr 01, 2015 at 06:19:13PM -0700, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp  |  4 ++--
>  src/mesa/drivers/dri/i965/brw_fs.cpp             |  8 +++
>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp         |  2 +-
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   | 20
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp     | 30 +---
>  src/mesa/drivers/dri/i965/brw_shader.h           |  3 ++-
>  src/mesa/drivers/dri/i965/brw_vec4.cpp           |  2 +-
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |  6 ++---
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp   | 14 +--
>  9 files changed, 46 insertions(+), 43 deletions(-)

I can't think the logic through all the way, and therefore I need to ask. I
noticed that there is now a comparison checking for an exact match of header
size. Before, such a comparison would have passed, as the values two and one
were both represented by boolean true. The message length also changes in at
least one occurrence, as does the parameter base offset (header size one vs.
two).

So the question is: is this patch expected to change the runtime behaviour,
and if not, what guarantees that? In fact, were we doing something wrong in
the past because of this?

> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
> index 32919b1..c1b7609 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp
> @@ -88,7 +88,7 @@ brw_blorp_eu_emitter::emit_texture_lookup(const struct brw_reg dst,
>     inst->base_mrf = base_mrf;
>     inst->mlen = msg_length;
> -   inst->header_present = false;
> +   inst->header_size = 0;
>     insts.push_tail(inst);
>  }
> @@ -104,7 +104,7 @@ brw_blorp_eu_emitter::emit_render_target_write(const struct brw_reg src0,
>     inst->src[0] = src0;
>     inst->base_mrf = msg_reg_nr;
>     inst->mlen = msg_length;
> -   inst->header_present = use_header;
> +   inst->header_size = use_header ? 2 : 0;
>     inst->target = BRW_BLORP_RENDERBUFFER_BINDING_TABLE_INDEX;
>     insts.push_tail(inst);
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 9c2ccce..852abbe 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -430,7 +430,7 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_reg dst,
>     if (brw->gen < 7) {
>        inst->base_mrf = 13;
> -      inst->header_present = true;
> +      inst->header_size = 1;
>        if (brw->gen == 4)
>           inst->mlen = 3;
>        else
> @@ -478,7 +478,7 @@ fs_inst::equals(fs_inst *inst) const
>             base_mrf == inst->base_mrf &&
>             target == inst->target &&
>             eot == inst->eot &&
> -           header_present == inst->header_present &&
> +           header_size == inst->header_size &&
>             shadow_compare == inst->shadow_compare &&
>             exec_size == inst->exec_size &&
>             offset == inst->offset);
> @@ -2835,7 +2835,7 @@ fs_visitor::emit_repclear_shader()
>        write->saturate = key->clamp_fragment_color;
>        write->base_mrf = color_mrf;
>        write->target = 0;
> -      write->header_present = false;
> +      write->header_size = 0;
>        write->mlen = 1;
>     } else {
>        assume(key->nr_color_regions > 0);
> @@ -2844,7 +2844,7 @@ fs_visitor::emit_repclear_shader()
>           write->saturate = key->clamp_fragment_color;
>           write->base_mrf = base_mrf;
>           write->target = i;
> -         write->header_present = true;
> +         write->header_size = 2;
>           write->mlen = 3;
>        }
>     }
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> index dd199fa..61837d2 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> @@ -177,7 +177,7 @@ instructions_match(fs_inst *a, fs_inst *b, bool *negate)
>             a->regs_written == b->regs_written &&
>             a->base_mrf == b->base_mrf &&
>             a->eot == b->eot &&
> -           a->header_present == b->header_present &&
> +           a->header_size == b->header_size &&
>             a->shadow_compare == b->shadow_compare)
>            : true) &&
>           operands_match(a, b, negate);
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> index bd12147..f88c041 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> @@ -243,7 +243,7 @@ fs_generator::fire_fb_write(fs_inst *inst,
>                  0,
>                  inst->eot,
>                  last_render_target,
> -                inst->header_present);
> +                inst->header_size != 0);
>     brw_mark_surface_used(&prog_data->base, surf_index);
>  }
> @@ -265,7 +265,7 @@ fs_generator::generate_fb_write(fs_inst *inst, struct brw_reg payload)
>     /* Header is 2
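
Editorial sketch: the reviewer's concern about the equality check can be made concrete with a toy model. Two instructions that differ only in header size (one register vs. two) compare equal when only a header-present flag is compared, but not when the size itself is compared. The Inst tuple and both helper functions below are hypothetical and only mirror the shape of the comparison in the patch; whether such a pair of instructions can actually occur in the driver is exactly the open question raised in the review.

```python
from collections import namedtuple

# Toy stand-in for the few fields that matter here; not the real fs_inst.
Inst = namedtuple("Inst", ["base_mrf", "mlen", "header_size"])


def equals_with_flag(a, b):
    """Old-style check: only compares whether a header is present."""
    return (a.base_mrf == b.base_mrf and
            a.mlen == b.mlen and
            (a.header_size != 0) == (b.header_size != 0))


def equals_with_size(a, b):
    """New-style check: compares the header size itself."""
    return (a.base_mrf == b.base_mrf and
            a.mlen == b.mlen and
            a.header_size == b.header_size)


one_reg_header = Inst(base_mrf=2, mlen=3, header_size=1)
two_reg_header = Inst(base_mrf=2, mlen=3, header_size=2)
```

With these two values, equals_with_flag() treats the instructions as identical, whereas equals_with_size() distinguishes them, so a pass such as CSE would no longer merge them.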
Re: [Mesa-dev] [Piglit] GSoC 2015: Request for Registration for Mentorship
Hello +Mentors,

Thanks very much for signing up.

Regards,
Juliet

On Fri, Apr 3, 2015 at 3:47 AM, Matt Turner <matts...@gmail.com> wrote:
> On Thu, Apr 2, 2015 at 6:51 PM, Brian Paul <brian.e.p...@gmail.com> wrote:
>> Thanks, Matt.
>>
>> I can't complete the mentor registration now because at the bottom of
>> the participation agreement, there's no check-box or button next to
>> the "I agree to the terms" line at the end! I tried Firefox and
>> Safari. Sheesh. I'll try again tomorrow.
>
> I think you have to scroll to the very bottom of the agreement. I
> think the checkbox is actually inside the scroll box, unexpectedly.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] osmesa with gallium
Hello Olivier,

On 03/04/15 11:27, Olivier PENA wrote:
> Hi,
>
> I successfully built osmesa with the gallium state tracker on Windows by
> adding a new target (gallium-osmesa) to the scons build system. Both
> llvmpipe and softpipe work.
>
> May I send a patch?

That would be great, thank you. I'm pretty sure that Jose won't mind ;-)

May I ask what your use case for osmesa is? Is it a public/open-source
project, or something in-house that will be using it?

Regarding the patch in question, please use git send-email to send it to the
mailing list.

Thanks
Emil
[Mesa-dev] [PATCH] tnl: replace __FUNCTION__ with __func__
Consistently just use C99's __func__ everywhere.
The patch was verified with the Microsoft Visual Studio 2013
redistributable package (RTM version number: 18.0.21005.1).
Future MSVC versions intend to support __func__.
No functional changes.

Signed-off-by: Marius Predut <marius.pre...@intel.com>
---
 src/mesa/tnl_dd/t_dd_dmatmp.h   | 34 +-
 src/mesa/tnl_dd/t_dd_dmatmp2.h  | 22 +++---
 src/mesa/tnl_dd/t_dd_triemit.h  |  8
 src/mesa/tnl_dd/t_dd_tritmp.h   |  2 +-
 src/mesa/tnl_dd/t_dd_unfilled.h |  2 +-
 5 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/src/mesa/tnl_dd/t_dd_dmatmp.h b/src/mesa/tnl_dd/t_dd_dmatmp.h
index 52ea2bf..667e2a6 100644
--- a/src/mesa/tnl_dd/t_dd_dmatmp.h
+++ b/src/mesa/tnl_dd/t_dd_dmatmp.h
@@ -128,7 +128,7 @@ static void TAG(render_points_verts)( struct gl_context *ctx,
       }
    } else {
-      fprintf(stderr, "%s - cannot draw primitive\n", __FUNCTION__);
+      fprintf(stderr, "%s - cannot draw primitive\n", __func__);
       return;
    }
 }
@@ -163,7 +163,7 @@ static void TAG(render_lines_verts)( struct gl_context *ctx,
       }
    } else {
-      fprintf(stderr, "%s - cannot draw primitive\n", __FUNCTION__);
+      fprintf(stderr, "%s - cannot draw primitive\n", __func__);
       return;
    }
 }
@@ -195,7 +195,7 @@ static void TAG(render_line_strip_verts)( struct gl_context *ctx,
       FLUSH();
    } else {
-      fprintf(stderr, "%s - cannot draw primitive\n", __FUNCTION__);
+      fprintf(stderr, "%s - cannot draw primitive\n", __func__);
       return;
    }
 }
@@ -261,7 +261,7 @@ static void TAG(render_line_loop_verts)( struct gl_context *ctx,
       FLUSH();
    } else {
-      fprintf(stderr, "%s - cannot draw primitive\n", __FUNCTION__);
+      fprintf(stderr, "%s - cannot draw primitive\n", __func__);
       return;
    }
 }
@@ -331,7 +331,7 @@ static void TAG(render_tri_strip_verts)( struct gl_context *ctx,
       FLUSH();
    } else {
-      fprintf(stderr, "%s - cannot draw primitive\n", __FUNCTION__);
+      fprintf(stderr, "%s - cannot draw primitive\n", __func__);
       return;
    }
 }
@@ -370,7 +370,7 @@ static void TAG(render_tri_fan_verts)( struct gl_context *ctx,
       /* Could write code to emit these as indexed vertices (for the
        * g400, for instance).
        */
-      fprintf(stderr, "%s - cannot draw primitive\n", __FUNCTION__);
+      fprintf(stderr, "%s - cannot draw primitive\n", __func__);
       return;
    }
 }
@@ -409,7 +409,7 @@ static void TAG(render_poly_verts)( struct gl_context *ctx,
    else if (HAVE_TRI_FANS && ctx->Light.ShadeModel == GL_SMOOTH) {
       TAG(render_tri_fan_verts)( ctx, start, count, flags );
    } else {
-      fprintf(stderr, "%s - cannot draw primitive\n", __FUNCTION__);
+      fprintf(stderr, "%s - cannot draw primitive\n", __func__);
       return;
    }
 }
@@ -500,7 +500,7 @@ static void TAG(render_quad_strip_verts)( struct gl_context *ctx,
       /* Vertices won't fit in a single buffer or elts not
        * available - should never happen.
        */
-      fprintf(stderr, "%s - cannot draw primitive\n", __FUNCTION__);
+      fprintf(stderr, "%s - cannot draw primitive\n", __func__);
       return;
    }
 }
@@ -534,7 +534,7 @@ static void TAG(render_quad_strip_verts)( struct gl_context *ctx,
       FLUSH();
    } else {
-      fprintf(stderr, "%s - cannot draw primitive\n", __FUNCTION__);
+      fprintf(stderr, "%s - cannot draw primitive\n", __func__);
       return;
    }
 }
@@ -644,7 +644,7 @@ static void TAG(render_quads_verts)( struct gl_context *ctx,
    else {
       /* Vertices won't fit in a single buffer, should never happen.
        */
-      fprintf(stderr, "%s - cannot draw primitive\n", __FUNCTION__);
+      fprintf(stderr, "%s - cannot draw primitive\n", __func__);
       return;
    }
 }
@@ -705,7 +705,7 @@ static void TAG(render_points_elts)( struct gl_context *ctx,
          currentsz = dmasz;
       }
    } else {
-      fprintf(stderr, "%s - cannot draw primitive\n", __FUNCTION__);
+      fprintf(stderr, "%s - cannot draw primitive\n", __func__);
       return;
    }
 }
@@ -743,7 +743,7 @@ static void TAG(render_lines_elts)( struct gl_context *ctx,
          currentsz = dmasz;
       }
    } else {
-      fprintf(stderr, "%s - cannot draw primitive\n", __FUNCTION__);
+      fprintf(stderr, "%s - cannot draw primitive\n", __func__);
       return;
    }
 }
@@ -777,7 +777,7 @@ static void TAG(render_line_strip_elts)( struct gl_context *ctx,
    } else {
       /* TODO: Try to emit as indexed lines.
        */
-      fprintf(stderr, "%s - cannot draw primitive\n", __FUNCTION__);
+      fprintf(stderr, "%s - cannot draw primitive\n", __func__);
       return;
    }
 }
@@ -845,7 +845,7 @@ static void TAG(render_line_loop_elts)( struct gl_context *ctx,
       FLUSH();
    } else {
       /* TODO: Try to emit as indexed lines */
-      fprintf(stderr, "%s - cannot draw primitive\n", __FUNCTION__);
+      fprintf(stderr, "%s - cannot
Re: [Mesa-dev] [PATCH] tnl: replace __FUNCTION__ with __func__
Hi Marius,

On 03/04/15 13:11, Marius Predut wrote:
> Consistently just use C99's __func__ everywhere.
> The patch was verified with the Microsoft Visual Studio 2013
> redistributable package (RTM version number: 18.0.21005.1).
> Future MSVC versions intend to support __func__.
> No functional changes.

Small note: for compilers that lack __func__ (and inline, among others) we
provide a reasonable workaround via include/c99_compat.h.

Thanks for going through these :-)
Emil
[Mesa-dev] osmesa with gallium
Hi,

I successfully built osmesa with the gallium state tracker on Windows by
adding a new target (gallium-osmesa) to the scons build system. Both
llvmpipe and softpipe work.

May I send a patch?
Re: [Mesa-dev] Building Mesa for Windows using Visual Studio
How I build Mesa on Windows:

1. Install Microsoft Visual Studio 2013 (not 2012 or earlier).

2. Install the latest Python 2.7: https://www.python.org/downloads/
   Install pywin32 from
   http://heanet.dl.sourceforge.net/project/pywin32/pywin32/Build%20219/pywin32-219.win32-py2.7.exe
   Download win flex-bison from http://sourceforge.net/projects/winflexbison/
   and unzip it into c:\win_flex_bison

3. Environment settings. Add these near the top of your PATH:
     C:\Python27
     C:\Python27\Scripts
     c:\win_flex_bison
   If you use a proxy, add the FTP_PROXY and HTTP_PROXY environment variables.

4. Install pip by downloading the get-pip.py file from
   https://pip.pypa.io/en/latest/installing.html and then run:
     python get-pip.py --proxy=[user:passwd@]proxy.server:port

5. Add the Mako, lxml and NumPy Python modules with pip:
     pip install Mako
     pip install lxml
     pip install NumPy (not mandatory, I think)

6. Install scons from http://sourceforge.net/projects/scons/?source=typ_redirect

7. Build Mesa:
     scons build=release machine=x86 platform=windows libgl-gdi
   or simply run: scons.
   This should create an opengl32.dll in
   build\windows-x86\gallium\targets\libgl-gdi

OBS: Why VS 2013 and not VS 2012: VS 2012 only partially implements the
C++ TR1 C99 standard library, which the latest upstream Mesa already uses
(http://blogs.msdn.com/b/vcblog/archive/2013/07/19/c99-library-support-in-visual-studio-2013.aspx).
In VS 2012, math.h doesn't provide the rint, rintf and rintl functions used
in Mesa, even though MSDN says the contrary
(https://msdn.microsoft.com/nl-nl/dn465165). Mesa uses those functions in
mesa\src\util\rounding.h.

marius

From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On Behalf Of Shervin Sharifi
Sent: Wednesday, March 25, 2015 3:01 AM
To: mesa-dev@lists.freedesktop.org
Subject: [Mesa-dev] Building Mesa for Windows using Visual Studio

Hi,

I'm new to Mesa. I'm trying to build Mesa for Windows using Visual Studio,
but couldn't find instructions for that. The related threads on this mailing
list also seem outdated.
Could anyone give me a hint or point me to instructions, if there are any?

Thanks,
Shervin
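
Editorial sketch: before running scons it can help to confirm that the tools and Python modules from the steps above are actually reachable. The helper below is a hypothetical convenience script, not part of Mesa. It assumes the winflexbison executables are named win_flex and win_bison, and it is written for Python 3 (shutil.which is not available in the Python 2.7 used for the build itself), so treat it as a rough checklist rather than a supported tool.

```python
import importlib
import shutil

# Names are assumptions for this sketch; adjust them to your installation.
TOOLS = ("python", "scons", "win_flex", "win_bison")
MODULES = ("mako", "lxml")


def check_prerequisites(tools=TOOLS, modules=MODULES):
    """Return a list describing every tool or module that was not found."""
    missing = []
    for tool in tools:
        # shutil.which searches the directories listed in PATH.
        if shutil.which(tool) is None:
            missing.append("tool: " + tool)
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append("module: " + name)
    return missing
```

Calling check_prerequisites() with no arguments returns an empty list when everything was found; any entry in the result points at a PATH entry or pip install step that still needs attention.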
Re: [Mesa-dev] [PATCH 4/4] RFC: nir: add lowering for idiv/udiv/umod
On Wed, Apr 1, 2015 at 9:50 AM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
> On Wed, Apr 1, 2015 at 7:09 AM, Roland Scheidegger <srol...@vmware.com> wrote:
>> On 01.04.2015 at 03:44, Rob Clark wrote:
>>> On Tue, Mar 31, 2015 at 9:03 PM, Roland Scheidegger <srol...@vmware.com> wrote:
>>>> On 01.04.2015 at 00:57, Rob Clark wrote:
>>>>> +/* Lowers idiv/udiv/umod
>>>>> + * Based on NV50LegalizeSSA::handleDIV()
>>>>> + *
>>>>> + * Note that this is probably not enough precision for compute shaders.
>>>>> + * Perhaps we want a second higher precision (looping) version of this?
>>>>> + * Or perhaps we assume if you can do compute shaders you can also
>>>>> + * branch out to a pre-optimized shader library routine..
>>>>
>>>> So if this is not enough precision, maybe should state how large the
>>>> error can be?
>>>
>>> tbh, if I knew what the error for this approach was, I would have
>>> included it. I'm not the original author, but this is based on
>>> nouveau codegen code (as mentioned in the comment). I guess it is
>>> better than converting to float and dividing and converting back, but
>>> worse than an iterative (ie. looping, ie. divergent flow control)
>>> approach. It is apparently enough to keep piglit happy.
>>>
>>> The original algo in nv50 lowering code is from
>>> 322bc7ed68ed92233c97168c036d0aa50c11a20e (ie. 'nv50/ir: import nv50
>>> target') which doesn't really give more clue about the origin.. if
>>> anyone knows, I'm all ears and will add relevant links/info to
>>> comment..
>>
>> Ah ok. Well it isn't even obvious to me if the results are not actually
>> always exact.
>
> Should be easy enough to take the algo, express it in terms of e.g.
> numpy (or even, *gasp*, a C program), and then do a randomized search
> over the 32bit x 32bit input space to see if there are any errors, and
> what they are. (Since the full input space would take too long...)
>
> Looks like I did just that when debugging the freedreno impl...
> available at http://hastebin.com/ewimuvobin.py

fwiw, looks like you still had some broken hacks in that script,
probably left overs from your earlier experiments..

I fixed it up (or at least it seems to be giving the same results
piglit expects for the same inputs) and also added udiv vs idiv
support.. guess I should add umod support too and commit it along side
the idiv lowering (when that actually works too)

would appreciate a second set of eyes on this since I'm pretty much a
python and numpy newbie:

http://hastebin.com/orogikadey.vhdl

now to figure out what my idiv lowering is doing differently :-P

BR,
-R

> -ilia
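
Editorial sketch: the randomized search suggested in this thread could look roughly like the following. It re-expresses the unsigned (udiv) steps of the div-lowering.py script from the later patch in pure Python, emulating float32 arithmetic with the struct module so that NumPy is not required. The function names are hypothetical, the signed and umod paths are left out, and the sketch makes no claim about whether the lowering is exact for all inputs; it only reports the mismatches it happens to find.

```python
import random
import struct

MASK32 = 0xFFFFFFFF


def f32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def f32_to_bits(x):
    """Bit pattern of a float32 value as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(u):
    """float32 value with the given 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", u & MASK32))[0]


def lowered_udiv(a, b):
    """Unsigned 32-bit division via the reciprocal-based lowering."""
    af = f32(float(a))
    bf = f32(1.0 / f32(float(b)))
    # Step the reciprocal down by two ulps, as the script does.
    bf = bits_to_f32(f32_to_bits(bf) - 2)

    # First estimate of the quotient (float-to-uint conversion truncates).
    q = int(f32(af * bf)) & MASK32

    # Estimate the error of the first result and add it to the quotient.
    r = (a - q * b) & MASK32
    q = (q + int(f32(f32(float(r)) * bf))) & MASK32

    # Final correction: if the remainder is still >= divisor, add 1.
    r = (a - q * b) & MASK32
    if r >= b:
        q = (q + 1) & MASK32
    return q


def find_mismatches(samples, seed=0):
    """Compare the lowering against exact division on random inputs."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(samples):
        # Random bit widths give a mix of small and large operands.
        a = rng.getrandbits(rng.randint(1, 32))
        b = rng.getrandbits(rng.randint(1, 32)) or 1  # avoid dividing by 0
        got, want = lowered_udiv(a, b), a // b
        if got != want:
            mismatches.append((a, b, got, want))
    return mismatches
```

Running find_mismatches() with a large sample count and several seeds gives a quick, non-exhaustive check; an empty result only means that no counterexample was hit, not that none exists.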
Re: [Mesa-dev] Building Mesa for Windows using Visual Studio
Hi Marius.

Thank you for the write-up.

On 03/04/15 12:34, Predut, Marius wrote:
> How I build Mesa on Windows:
>
> 1. Install Microsoft Visual Studio 2013 (not 2012 or earlier).
>
> 2. Install the latest Python 2.7: https://www.python.org/downloads/
>    Install pywin32 from
>    http://heanet.dl.sourceforge.net/project/pywin32/pywin32/Build%20219/pywin32-219.win32-py2.7.exe
>    Download win flex-bison from http://sourceforge.net/projects/winflexbison/
>    and unzip it into c:\win_flex_bison
>
> 3. Environment settings. Add these near the top of your PATH:
>      C:\Python27
>      C:\Python27\Scripts
>      c:\win_flex_bison
>    If you use a proxy, add the FTP_PROXY and HTTP_PROXY environment variables.
>
> 4. Install pip by downloading the get-pip.py file from
>    https://pip.pypa.io/en/latest/installing.html and then run:
>      python get-pip.py --proxy=[user:passwd@]proxy.server:port
>
> 5. Add the Mako, lxml and NumPy Python modules with pip:
>      pip install Mako
>      pip install lxml
>      pip install NumPy (not mandatory, I think)
>
> 6. Install scons from http://sourceforge.net/projects/scons/?source=typ_redirect
>
> 7. Build Mesa:
>      scons build=release machine=x86 platform=windows libgl-gdi
>    or simply run: scons.
>    This should create an opengl32.dll in
>    build\windows-x86\gallium\targets\libgl-gdi
>
> OBS: Why VS 2013 and not VS 2012: VS 2012 only partially implements the
> C++ TR1 C99 standard library, which the latest upstream Mesa already uses
> (http://blogs.msdn.com/b/vcblog/archive/2013/07/19/c99-library-support-in-visual-studio-2013.aspx).
> In VS 2012, math.h doesn't provide the rint, rintf and rintl functions used
> in Mesa, even though MSDN says the contrary
> (https://msdn.microsoft.com/nl-nl/dn465165). Mesa uses those functions in
> mesa\src\util\rounding.h.

Just a couple of small details: Mesa has a fall-back for the mentioned
functions (plus others) in $(top)/include/*h.

That said, I believe the overall consensus is that MSVC 2008 is the bare
minimum for building Mesa, with MSVC 2013 strongly recommended.
Afaik, as soon as the VMware guys give us the go-ahead we'll drop all the
workarounds for pre-2013 versions and bump the requirement.

Cheers,
Emil
[Mesa-dev] [PATCH] vbo: replace __FUNCTION__ with __func__
Consistently just use C99's __func__ everywhere.
The patch was verified with the Microsoft Visual Studio 2013
redistributable package (RTM version number: 18.0.21005.1).
Future MSVC versions intend to support __func__.
No functional changes.

Signed-off-by: Marius Predut <marius.pre...@intel.com>
---
 src/mesa/vbo/vbo_exec_api.c  |    2 +-
 src/mesa/vbo/vbo_exec_draw.c |    4 ++--
 src/mesa/vbo/vbo_rebase.c    |    2 +-
 src/mesa/vbo/vbo_save_api.c  |    2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/vbo/vbo_exec_api.c b/src/mesa/vbo/vbo_exec_api.c
index 02741c2..859078f 100644
--- a/src/mesa/vbo/vbo_exec_api.c
+++ b/src/mesa/vbo/vbo_exec_api.c
@@ -439,7 +439,7 @@ do { \
 } while (0)
 
-#define ERROR(err) _mesa_error( ctx, err, __FUNCTION__ )
+#define ERROR(err) _mesa_error( ctx, err, __func__ )
 #define TAG(x) vbo_##x
 
 #include "vbo_attrib_tmp.h"
diff --git a/src/mesa/vbo/vbo_exec_draw.c b/src/mesa/vbo/vbo_exec_draw.c
index 91f2ca4..37b53a8 100644
--- a/src/mesa/vbo/vbo_exec_draw.c
+++ b/src/mesa/vbo/vbo_exec_draw.c
@@ -45,7 +45,7 @@ vbo_exec_debug_verts( struct vbo_exec_context *exec )
    GLuint i;
 
    printf("%s: %u vertices %d primitives, %d vertsize\n",
-          __FUNCTION__,
+          __func__,
           count,
           exec->vtx.prim_count,
           exec->vtx.vertex_size);
@@ -402,7 +402,7 @@ vbo_exec_vtx_flush(struct vbo_exec_context *exec, GLboolean keepUnmapped)
          }
 
          if (0)
-            printf("%s %d %d\n", __FUNCTION__, exec->vtx.prim_count,
+            printf("%s %d %d\n", __func__, exec->vtx.prim_count,
                    exec->vtx.vert_count);
 
          vbo_context(ctx)->draw_prims( ctx,
diff --git a/src/mesa/vbo/vbo_rebase.c b/src/mesa/vbo/vbo_rebase.c
index b06df4a..c3c4b64 100644
--- a/src/mesa/vbo/vbo_rebase.c
+++ b/src/mesa/vbo/vbo_rebase.c
@@ -142,7 +142,7 @@ void vbo_rebase_prims( struct gl_context *ctx,
    assert(min_index != 0);
 
    if (0)
-      printf("%s %d..%d\n", __FUNCTION__, min_index, max_index);
+      printf("%s %d..%d\n", __func__, min_index, max_index);
 
    /* XXX this path is disabled for now.
diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c
index fd9a5de..5927bee 100644
--- a/src/mesa/vbo/vbo_save_api.c
+++ b/src/mesa/vbo/vbo_save_api.c
@@ -763,7 +763,7 @@ _save_reset_vertex(struct gl_context *ctx)
 
-#define ERROR(err) _mesa_compile_error(ctx, err, __FUNCTION__);
+#define ERROR(err) _mesa_compile_error(ctx, err, __func__);
 
 /* Only one size for each attribute may be active at once.  Eg. if
-- 
1.7.9.5
[Mesa-dev] [Bug 89889] Multiple use after free bugs in fxt1 code
https://bugs.freedesktop.org/show_bug.cgi?id=89889

            Bug ID: 89889
           Summary: Multiple use after free bugs in fxt1 code
           Product: Mesa
           Version: unspecified
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Mesa core
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: conc...@web.de
        QA Contact: mesa-dev@lists.freedesktop.org

Coverity found use after free bugs in the fxt1_encode code. This report can
be found under https://scan.coverity.com/projects/4391 as CID 38222.

Richard Goedeken fixed this in the mupen64plus-video-glide64mk2 fork of this
code in patch
https://github.com/mupen64plus/mupen64plus-video-glide64mk2/commit/438dd4fb69dbe48504bca53d80c284e0f4027c7f

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
[Mesa-dev] [Bug 89889] Multiple use after free bugs in fxt1 code
https://bugs.freedesktop.org/show_bug.cgi?id=89889

conc...@web.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|FIXED                       |INVALID
[Mesa-dev] [Bug 89889] Multiple use after free bugs in fxt1 code
https://bugs.freedesktop.org/show_bug.cgi?id=89889

conc...@web.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #1 from conc...@web.de ---
Seems like the code has since then diverged too much to be comparable
Re: [Mesa-dev] [PATCH 09/13] SQUASH: i965/fs: Rework fs_visitor::lower_load_payload
On Fri, Apr 3, 2015 at 7:28 AM, Francisco Jerez <curroje...@riseup.net> wrote:
> Jason Ekstrand <ja...@jlekstrand.net> writes:
>> On Thu, Apr 2, 2015 at 3:01 AM, Francisco Jerez <curroje...@riseup.net> wrote:
>>> Jason Ekstrand <ja...@jlekstrand.net> writes:
>>>> Instead of the complicated and broken-by-design pile of heuristics we had
>>>> before, we now have a straightforward lowering:
>>>>
>>>> 1) All header sources are copied directly using force_writemask_all and,
>>>>    since they are guaranteed to be a single register, there are no
>>>>    force_sechalf issues.
>>>>
>>>> 2) All non-header sources are copied using the exact same force_sechalf
>>>>    and saturate modifiers as the LOAD_PAYLOAD operation itself.
>>>>
>>>> 3) In order to accommodate older gens that need interleaved colors,
>>>>    lower_load_payload detects when the destination is a COMPR4 register
>>>>    and automatically interleaves the non-header sources. The
>>>>    lower_load_payload pass does the right thing here regardless of whether
>>>>    or not the hardware actually supports COMPR4.
>>>
>>> I had a quick glance at the series and it seems to be going in the right
>>> direction. One thing I honestly don't like is the ad-hoc and IMHO
>>> premature treatment of payload headers, it still feels like the
>>> LOAD_PAYLOAD instruction has more complex semantics than necessary and
>>> the benefit is unclear to me.
>>>
>>> I suppose that your motivation was to avoid setting force_writemask_all
>>> in LOAD_PAYLOAD instructions with header. The optimizer should be able
>>> to cope with those flags and get rid of them from the resulting moves
>>> where they are redundant, and if it's not able to it looks like
>>> something that should be fixed anyway. The explicit handling of headers
>>> is responsible for much of the churn in this series and is likely to
>>> complicate users of LOAD_PAYLOAD and optimization passes that have to
>>> manipulate them.
>>
>> Avoiding force_writemask_all is only half of the motivation and the
>> small half at that. A header source, more properly defined, is a
>> single physical register that, conceptually, applies to all channels.
>> Effectively, a header source (I should have stated this clearly) has
>> two properties:
>>
>> 1) It has force_writemask_all set
>> 2) It is exactly one physical hardware register.
>>
>> This second property is the more important of the two. Most of the
>> disaster of the previous LOAD_PAYLOAD implementation was that we did a
>> pile of guesswork and had an ill-conceived effective width thing in
>> order to figure out how big the register actually was. Making the
>> user specify which sources are header sources eliminates that
>> guesswork. It also has the nice side-effect that we can do the right
>> force_writemask_all and we can properly handle COMPR4 for the user.
>
> Yeah, true, but this seems like the least orthogonal and most annoying
> to use solution for this problem, it forces the caller to provide
> redundant information, it takes into account the saturate flag on some
> arguments and not others, it shuffles sources with respect to the
> specified order when COMPR4 is set, but only for the first four
> non-header sources. I think any of the following solutions would be
> better-behaved than the current approach:

I don't know that saying which sources are headers is really
redundant. It's explicit, which is what we want. Yes, the COMPR4
thing is a bit magical but we have to do COMPR4 in lower_load_payload
so we have to have some way of doing it and this method puts the
interleaving code in one place instead of two.

> 1/ Use the source width to determine the size of each copy. This would
>    imply that the source width carries semantic information and hence
>    would have to be left alone by e.g. copy propagation.

That's what we do now and it's terrible. The effective width field
was basically a width that gets kept.

> 2/ Use the instruction exec size and flags to determine the properties
>    of *all* copies. This means that if a header is present the exec
>    size would necessarily have to be 8 and the halves of a 16-wide
>    register would have to be specified separately, which sounds annoying
>    at first but in practice wouldn't necessarily be because it could be
>    handled by the LOAD_PAYLOAD() helper based on the argument widths
>    without running into problems with optimization passes changing the
>    meaning of the instruction. The semantics of the instruction itself
>    would be as stupid as possible, but the implementation could still
>    trivially recognise 16-wide and COMPR4 copies using the exact same
>    mechanism you are using now.

Yes, that might work. I'll try and take a swing at it today. It will
*hopefully* have less code churn than the solution in this series
because the magic will still happen, just in a different place. Of
course, this solution also requires that we lower everything with
force_writemask_all and *hopefully* we can get rid of those and set
force_sechalf appropriately in optimization passes.

> 3/ Split LOAD_PAYLOAD into two separate instructions, each of them
[Mesa-dev] [PATCH] nir: add lowering for idiv/udiv/umod
From: Rob Clark <robcl...@freedesktop.org>

Based on the algo from NV50LegalizeSSA::handleDIV() and handleMOD().
See also trans_idiv() in freedreno/ir3/ir3_compiler.c (which was an
adaptation of the nv50 code from Ilia Mirkin).

Also, including a py script that implements the same algo with numpy,
based on something written by Ilia (and beaten on with a hammer a bit
by me).

I've tested this on i965 hacked up to insert the idiv lowering pass.

Signed-off-by: Rob Clark <robcl...@freedesktop.org>
---
 src/glsl/Makefile.sources     |   1 +
 src/glsl/nir/div-lowering.py  |  75
 src/glsl/nir/nir.h            |   1 +
 src/glsl/nir/nir_lower_idiv.c | 157 ++
 4 files changed, 234 insertions(+)
 create mode 100755 src/glsl/nir/div-lowering.py
 create mode 100644 src/glsl/nir/nir_lower_idiv.c

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index ffce706..5d70e88 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -33,6 +33,7 @@ NIR_FILES = \
 	nir/nir_lower_atomics.c \
 	nir/nir_lower_global_vars_to_local.c \
 	nir/nir_lower_locals_to_regs.c \
+	nir/nir_lower_idiv.c \
 	nir/nir_lower_io.c \
 	nir/nir_lower_phis_to_scalar.c \
 	nir/nir_lower_samplers.cpp \
diff --git a/src/glsl/nir/div-lowering.py b/src/glsl/nir/div-lowering.py
new file mode 100755
index 0000000..87db784
--- /dev/null
+++ b/src/glsl/nir/div-lowering.py
@@ -0,0 +1,75 @@
+#!/usr/bin/python
+
+import numpy as np
+import sys
+
+op = sys.argv[1]
+
+if op not in ("idiv", "udiv", "umod"):
+    print "invalid op:", op
+    exit(1)
+
+is_signed = op == "idiv"
+
+if is_signed:
+    numer = np.int32(sys.argv[2])
+    denom = np.int32(sys.argv[3])
+else:
+    numer = np.uint32(sys.argv[2])
+    denom = np.uint32(sys.argv[3])
+
+print op, numer, denom, "\n"
+
+
+if is_signed:
+    af = np.float32(numer)
+    bf = np.float32(denom)
+    af = np.abs(af)
+    bf = np.abs(bf)
+    a = np.abs(numer).view(np.uint32)
+    b = np.abs(denom).view(np.uint32)
+else:
+    af = np.float32(numer)
+    bf = np.float32(denom)
+    a = numer
+    b = denom
+
+# get first result:
+bf = np.reciprocal(bf)
+bf = (bf.view(np.uint32) - np.uint32(2)).view(np.float32)
+q = af * bf
+
+if is_signed:
+    q = np.int32(q).view(np.uint32)
+else:
+    q = np.uint32(q).view(np.uint32)
+
+# get error of first result:
+r = q * b
+r = a - r
+r = np.float32(r)
+r = r * bf
+r = np.uint32(r)
+
+# add quotients:
+q = q + r
+
+# correction: if modulus >= divisor, add 1
+r = q * b
+r = a - r
+
+r = np.uint32(1) if r.view(np.uint32) >= b.view(np.uint32) else np.uint32(0)
+q = q + r
+
+if is_signed:
+    r = np.bitwise_xor(numer, denom)
+    r = np.right_shift(r, 31)
+    b = -q
+    q = b if r else q
+
+if op == "umod":
+    r = q * b
+    q = a - r
+
+print "=", q.view(np.int32)
+
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index c14c51c..20984e9 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1605,6 +1605,7 @@ void nir_lower_samplers(nir_shader *shader,
 void nir_lower_system_values(nir_shader *shader);
 void nir_lower_tex_projector(nir_shader *shader);
+void nir_lower_idiv(nir_shader *shader);
 
 void nir_lower_atomics(nir_shader *shader);
 void nir_lower_to_source_mods(nir_shader *shader);
diff --git a/src/glsl/nir/nir_lower_idiv.c b/src/glsl/nir/nir_lower_idiv.c
new file mode 100644
index 0000000..c2f08df
--- /dev/null
+++ b/src/glsl/nir/nir_lower_idiv.c
@@ -0,0 +1,157 @@
+/*
+ * Copyright © 2015 Red Hat
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Rob Clark <robcl...@freedesktop.org>
+ */
+
+#include "nir.h"
+#include "nir_builder.h"
+
+/* Lowers idiv/udiv/umod
+ * Based on NV50LegalizeSSA::handleDIV()
+ *
+ * Note that this is probably not enough precision for compute shaders.
+ * Perhaps we want a second
[Mesa-dev] [PATCH] scons: add target gallium-osmesa
From: Olivier Pena op...@isagri.fr --- src/gallium/SConscript | 5 src/gallium/state_trackers/osmesa/SConscript | 25 + src/gallium/state_trackers/osmesa/osmesa.def | 16 +++ src/gallium/targets/osmesa/SConscript| 41 4 files changed, 87 insertions(+) create mode 100644 src/gallium/state_trackers/osmesa/SConscript create mode 100644 src/gallium/state_trackers/osmesa/osmesa.def create mode 100644 src/gallium/targets/osmesa/SConscript diff --git a/src/gallium/SConscript b/src/gallium/SConscript index 680ad92..eeb1c78 100644 --- a/src/gallium/SConscript +++ b/src/gallium/SConscript @@ -60,6 +60,11 @@ SConscript([ ]) if not env['embedded']: +SConscript([ +'state_trackers/osmesa/SConscript', +'targets/osmesa/SConscript', +]) + if env['x11']: SConscript([ 'state_trackers/glx/xlib/SConscript', diff --git a/src/gallium/state_trackers/osmesa/SConscript b/src/gallium/state_trackers/osmesa/SConscript new file mode 100644 index 000..fa7c968 --- /dev/null +++ b/src/gallium/state_trackers/osmesa/SConscript @@ -0,0 +1,25 @@ +import os + +Import('*') + +env = env.Clone() + +env.Append(CPPPATH = [ +'#src/mapi', +'#src/mesa', +'.', +]) + +env.AppendUnique(CPPDEFINES = [ +'BUILD_GL32', # declare gl* as __declspec(dllexport) in Mesa headers +'WIN32_LEAN_AND_MEAN', # http://msdn2.microsoft.com/en-us/library/6dwk3a1z.aspx +]) +if not env['gles']: +# prevent _glapi_* from being declared __declspec(dllimport) +env.Append(CPPDEFINES = ['_GLAPI_NO_EXPORTS']) + +st_osmesa = env.ConvenienceLibrary( +target ='st_osmesa', +source = env.ParseSourceList('Makefile.sources', 'C_SOURCES'), +) +Export('st_osmesa') diff --git a/src/gallium/state_trackers/osmesa/osmesa.def b/src/gallium/state_trackers/osmesa/osmesa.def new file mode 100644 index 000..e2a31ab --- /dev/null +++ b/src/gallium/state_trackers/osmesa/osmesa.def @@ -0,0 +1,16 @@ +;DESCRIPTION 'Mesa OSMesa lib for Win32' +VERSION 4.1 + +EXPORTS + OSMesaCreateContext + OSMesaCreateContextExt + OSMesaDestroyContext + OSMesaMakeCurrent + 
OSMesaGetCurrentContext + OSMesaPixelStore + OSMesaGetIntegerv + OSMesaGetDepthBuffer + OSMesaGetColorBuffer + OSMesaGetProcAddress + OSMesaColorClamp + OSMesaPostprocess diff --git a/src/gallium/targets/osmesa/SConscript b/src/gallium/targets/osmesa/SConscript new file mode 100644 index 000..2c936cf --- /dev/null +++ b/src/gallium/targets/osmesa/SConscript @@ -0,0 +1,41 @@ +Import('*') + +env = env.Clone() + +env.Prepend(CPPPATH = [ +'#src/mapi', +'#src/mesa', +#Dir('../../../mapi'), # src/mapi build path for python-generated GL API files/headers +]) + +sources = [ +'target.c', +] +sources += ['#src/gallium/state_trackers/osmesa/osmesa.def'] + +drivers = [] + +if env['llvm']: +env.Append(CPPDEFINES = 'GALLIUM_LLVMPIPE') +env.Append(CPPDEFINES = 'GALLIUM_TRACE') +drivers += [llvmpipe] +else: +env.Append(CPPDEFINES = 'GALLIUM_SOFTPIPE') +env.Append(CPPDEFINES = 'GALLIUM_TRACE') +drivers += [softpipe] + +if env['platform'] == 'windows': +env.AppendUnique(CPPDEFINES = [ +'BUILD_GL32', # declare gl* as __declspec(dllexport) in Mesa headers +]) +if not env['gles']: +# prevent _glapi_* from being declared __declspec(dllimport) +env.Append(CPPDEFINES = ['_GLAPI_NO_EXPORTS']) + +gallium_osmesa = env.SharedLibrary( +target ='osmesa', +source = sources, +LIBS = drivers + st_osmesa + ws_null + glapi + mesa + gallium + trace + glsl + mesautil + env['LIBS'], +) + +env.Alias('gallium-osmesa', gallium_osmesa) -- 1.9.4.msysgit.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 89599] symbol 'x86_64_entry_start' is already defined when building with LLVM/clang
https://bugs.freedesktop.org/show_bug.cgi?id=89599

Tomasz Paweł Gajc tpg...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|symbol 'x86_64_entry_start' |symbol 'x86_64_entry_start'
                   |is already defined          |is already defined when
                   |                            |building with LLVM/clang

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH 1/2] build: add libnir.la
On Fri, Apr 3, 2015 at 2:07 PM, Rob Clark robdcl...@gmail.com wrote:
> From: Rob Clark robcl...@freedesktop.org
>
> If we want to use NIR from state trackers that don't already pull in the
> whole of glsl (ie. anything other than mesa state tracker), we need a
> separate more minimal libnir. Possibly NIR should be better split out
> from glsl, but for now, generate a second smaller libnir.la for those
> who just want NIR but not all of glsl.
>
> Signed-off-by: Rob Clark robcl...@freedesktop.org
> ---
>  src/glsl/Makefile.am | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
> index 6cef973..23c6fe8 100644
> --- a/src/glsl/Makefile.am
> +++ b/src/glsl/Makefile.am
> @@ -68,7 +68,7 @@ TESTS_ENVIRONMENT= \
>  	export PYTHON2=$(PYTHON2); \
>  	export PYTHON_FLAGS=$(PYTHON_FLAGS);
>
> -noinst_LTLIBRARIES = libglsl.la libglcpp.la
> +noinst_LTLIBRARIES = libnir.la libglsl.la libglcpp.la
>  check_PROGRAMS = \
>  	glcpp/glcpp \
>  	glsl_test \
> @@ -148,6 +148,12 @@ libglsl_la_SOURCES = \
>  	$(LIBGLSL_FILES) \
>  	$(NIR_FILES)

We still have the line above, so doesn't this mean we'll build all the
NIR files twice, once in libglsl.a and once in libnir.a? Isn't that a
bad thing? Or am I missing something?

>
> +libnir_la_SOURCES = \
> +	glsl_types.cpp \
> +	builtin_types.cpp \
> +	glsl_symbol_table.cpp \
> +	$(NIR_FILES)
> +
>  glsl_compiler_SOURCES = \
>  	$(GLSL_COMPILER_CXX_FILES)
>
> --
> 2.1.0
[Mesa-dev] [PATCH 1/2] build: add libnir.la
From: Rob Clark robcl...@freedesktop.org

If we want to use NIR from state trackers that don't already pull in the
whole of glsl (ie. anything other than mesa state tracker), we need a
separate more minimal libnir. Possibly NIR should be better split out
from glsl, but for now, generate a second smaller libnir.la for those
who just want NIR but not all of glsl.

Signed-off-by: Rob Clark robcl...@freedesktop.org
---
 src/glsl/Makefile.am | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
index 6cef973..23c6fe8 100644
--- a/src/glsl/Makefile.am
+++ b/src/glsl/Makefile.am
@@ -68,7 +68,7 @@ TESTS_ENVIRONMENT= \
 	export PYTHON2=$(PYTHON2); \
 	export PYTHON_FLAGS=$(PYTHON_FLAGS);

-noinst_LTLIBRARIES = libglsl.la libglcpp.la
+noinst_LTLIBRARIES = libnir.la libglsl.la libglcpp.la
 check_PROGRAMS = \
 	glcpp/glcpp \
 	glsl_test \
@@ -148,6 +148,12 @@ libglsl_la_SOURCES = \
 	$(LIBGLSL_FILES) \
 	$(NIR_FILES)

+libnir_la_SOURCES = \
+	glsl_types.cpp \
+	builtin_types.cpp \
+	glsl_symbol_table.cpp \
+	$(NIR_FILES)
+
 glsl_compiler_SOURCES = \
 	$(GLSL_COMPILER_CXX_FILES)

--
2.1.0
[Mesa-dev] [PATCH 2/2] xa: support for drivers which use NIR
From: Rob Clark robcl...@freedesktop.org

We need to pull in libnir.la and its dependency libglsl_util.la. Also,
_mesa_error_no_memory() must be defined. Fortunately with libnir.la (vs
pulling in all of libglsl.la) we don't also need libstdc++.

Signed-off-by: Rob Clark robcl...@freedesktop.org
---
 src/gallium/state_trackers/xa/xa_tracker.c | 7 +++++++
 src/gallium/targets/xa/Makefile.am         | 2 ++
 2 files changed, 9 insertions(+)

diff --git a/src/gallium/state_trackers/xa/xa_tracker.c b/src/gallium/state_trackers/xa/xa_tracker.c
index f69ac8e..3f22d64 100644
--- a/src/gallium/state_trackers/xa/xa_tracker.c
+++ b/src/gallium/state_trackers/xa/xa_tracker.c
@@ -535,3 +535,10 @@ xa_surface_format(const struct xa_surface *srf)
 {
     return srf->fdesc.xa_format;
 }
+
+void _mesa_error_no_memory(const char *caller);
+void
+_mesa_error_no_memory(const char *caller)
+{
+   debug_printf("Mesa error: out of memory in %s", caller);
+}
diff --git a/src/gallium/targets/xa/Makefile.am b/src/gallium/targets/xa/Makefile.am
index a1eae2a..8ddb967 100644
--- a/src/gallium/targets/xa/Makefile.am
+++ b/src/gallium/targets/xa/Makefile.am
@@ -37,6 +37,8 @@ libxatracker_la_LIBADD = \
 	$(top_builddir)/src/gallium/state_trackers/xa/libxatracker.la \
 	$(top_builddir)/src/gallium/auxiliary/libgalliumvl_stub.la \
 	$(top_builddir)/src/gallium/auxiliary/libgallium.la \
+	$(top_builddir)/src/glsl/libnir.la \
+	$(top_builddir)/src/libglsl_util.la \
 	$(top_builddir)/src/util/libmesautil.la \
 	$(LIBDRM_LIBS) \
 	$(GALLIUM_COMMON_LIB_DEPS)
--
2.1.0
Re: [Mesa-dev] [PATCH 09/13] SQUASH: i965/fs: Rework fs_visitor::lower_load_payload
On Fri, Apr 3, 2015 at 8:37 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: On Fri, Apr 3, 2015 at 7:28 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: On Thu, Apr 2, 2015 at 3:01 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: Instead of the complicated and broken-by-design pile of heuristics we had before, we now have a straightforward lowering: 1) All header sources are copied directly using force_writemask_all and, since they are guaranteed to be a single register, there are no force_sechalf issues. 2) All non-header sources are copied using the exact same force_sechalf and saturate modifiers as the LOAD_PAYLOAD operation itself. 3) In order to accommodate older gens that need interleaved colors, lower_load_payload detects when the destination is a COMPR4 register and automatically interleaves the non-header sources. The lower_load_payload pass does the right thing here regardless of whether or not the hardware actually supports COMPR4. I had a quick glance at the series and it seems to be going in the right direction. One thing I honestly don't like is the ad-hoc and IMHO premature treatment of payload headers, it still feels like the LOAD_PAYLOAD instruction has more complex semantics than necessary and the benefit is unclear to me. I suppose that your motivation was to avoid setting force_writemask_all in LOAD_PAYLOAD instructions with header. The optimizer should be able to cope with those flags and get rid of them from the resulting moves where they are redundant, and if it's not able to it looks like something that should be fixed anyway. The explicit handling of headers is responsible for much of the churn in this series and is likely to complicate users of LOAD_PAYLOAD and optimization passes that have to manipulate them. Avoiding force_writemask_all is only half of the motivation and the small half at that. 
A header source, more properly defined, is a single physical register that, conceptually, applies to all channels. Effectively, a header source (I should have stated this clearly) has two properties:

1) It has force_writemask_all set
2) It is exactly one physical hardware register.

This second property is the more important of the two. Most of the disaster of the previous LOAD_PAYLOAD implementation was that we did a pile of guesswork and had an ill-conceived effective-width thing in order to figure out how big the register actually was. Making the user specify which sources are header sources eliminates that guesswork. It also has the nice side-effect that we can do the right force_writemask_all and we can properly handle COMPR4 for the user.

Ok, Allow me to be a bit more explicit as to what all we need to keep track of:

1) How big is the source register for real. Even immediates can end up being two registers in the copy.
2) Do we want force_writemask_all?
3) If not, do we want force_sechalf?
4) On g45 and gen5, we want to use COMPR4 for interlaced movs
5) When lowering, we want to use 16-wide moves when possible in SIMD16

With the patch series I sent, all of this is explicit except for COMPR4 which is, admittedly, kind of magic. "Which of these sources are headers?" is a reasonable question for the caller to answer. It knows explicitly and it would take the LOAD_PAYLOAD helper some work to guess it correctly. Another option would be to guess that based on exec sizes but then the caller has to know not to pass in the wrong register type or the guess will be wrong. I like explicit.

Yeah, true, but this seems like the least orthogonal and most annoying to use solution for this problem, it forces the caller to provide redundant information, it takes into account the saturate flag on some arguments and not others, it shuffles sources with respect to the specified order when COMPR4 is set, but only for the first four non-header sources.
I think any of the following solutions would be better-behaved than the current approach:

I don't know that saying which sources are headers is really redundant. It's explicit, which is what we want. Yes, the COMPR4 thing is a bit magical but we have to do COMPR4 in lower_load_payload so we have to have some way of doing it and this method puts the interleaving code in one place instead of two.

Well, at least with the previous approach LOAD_PAYLOAD had consistent (if broken) semantics across its arguments, and regardless of COMPR4 being used or not, which IMHO is preferable to the modest code saving.

To be clear, I don't really like the way I did COMPR4 either. I just couldn't come up with anything better.

1/ Use the source width to determine the size of each copy. This would imply that the source width carries semantic information and hence would have to be left alone by e.g. copy
Re: [Mesa-dev] [PATCH 2/3] i965: Add a NIR-based cubemap normalizing pass
Jason Ekstrand ja...@jlekstrand.net writes:

> ---
>  src/mesa/drivers/dri/i965/Makefile.sources         |   1 +
>  src/mesa/drivers/dri/i965/brw_nir.h                |   2 ++
>  .../drivers/dri/i965/brw_nir_cubemap_normalize.c   | 111 +
>  3 files changed, 114 insertions(+)
>  create mode 100644 src/mesa/drivers/dri/i965/brw_nir_cubemap_normalize.c

Could this go in src/glsl/nir? vc4 also lowers cubemaps the same way,
so I might want to use it. (Probably won't immediately, due to the same
"do I really want to make my rcp that accurate for this operation?
probably not." concern as for txp).

> diff --git a/src/mesa/drivers/dri/i965/brw_nir_cubemap_normalize.c b/src/mesa/drivers/dri/i965/brw_nir_cubemap_normalize.c
> new file mode 100644
> index 0000000..6464f41
> --- /dev/null
> +++ b/src/mesa/drivers/dri/i965/brw_nir_cubemap_normalize.c
> @@ -0,0 +1,111 @@
> +/*
> + * Copyright © 2015 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + * Authors:
> + *    Jason Ekstrand ja...@jlekstrand.net
> + */
> +
> +#include "brw_nir.h"
> +#include "glsl/nir/nir_builder.h"
> +
> +/**
> + * This file implements a NIR lowering pass to perform the normalization of
> + * the cubemap coordinates to have the largest magnitude component be -1.0
> + * or 1.0.  This is based on the old GLSL IR based pass by Eric.
> + */
> +
> +static nir_ssa_def *
> +channel(nir_builder *b, nir_ssa_def *def, int c)
> +{
> +   return nir_swizzle(b, def, (unsigned[4]){c, c, c, c}, 1, false);
> +}
> +
> +static bool
> +cubemap_normalize_block(nir_block *block, void *void_state)
> +{
> +   nir_builder *b = void_state;
> +
> +   nir_foreach_instr(block, instr) {
> +      if (instr->type != nir_instr_type_tex)
> +         continue;
> +
> +      nir_tex_instr *tex = nir_instr_as_tex(instr);
> +      if (tex->sampler_dim != GLSL_SAMPLER_DIM_CUBE)
> +         continue;
> +
> +      nir_builder_insert_before_instr(b, &tex->instr);
> +
> +      for (unsigned i = 0; i < tex->num_srcs; i++) {
> +         if (tex->src[i].src_type != nir_tex_src_coord)
> +            continue;
> +
> +         nir_ssa_def *orig_coord =
> +            nir_ssa_for_src(b, tex->src[i].src, nir_tex_instr_src_size(tex, i));
> +         assert(orig_coord->num_components >= 3);
> +
> +         nir_ssa_def *abs0 = nir_fabs(b, channel(b, orig_coord, 0));
> +         nir_ssa_def *abs1 = nir_fabs(b, channel(b, orig_coord, 1));
> +         nir_ssa_def *abs2 = nir_fabs(b, channel(b, orig_coord, 2));
> +
> +         nir_ssa_def *norm1 = nir_fmax(b, abs0, nir_fmax(b, abs1, abs2));

This could just be:

    nir_ssa_def *abs = nir_fabs(b, orig_coord);
    nir_ssa_def *norm = nir_fmax(b, channel(b, abs, 0),
                                 nir_fmax(b, channel(b, abs, 1),
                                          channel(b, abs, 2)));

right? Just in case vec4 NIR ends up being a thing.

Other than these little comments,

Reviewed-by: Eric Anholt e...@anholt.net
Re: [Mesa-dev] [PATCH] nir: add lowering for idiv/udiv/umod
On Fri, Apr 3, 2015 at 8:21 AM, Rob Clark robdcl...@gmail.com wrote:
> From: Rob Clark robcl...@freedesktop.org
>
> Based on the algo from NV50LegalizeSSA::handleDIV() and handleMOD().
> See also trans_idiv() in freedreno/ir3/ir3_compiler.c (which was an
> adaptation of the nv50 code from Ilia Mirkin).
>
> Also, including a py script that implements the same algo with numpy,
> based on something written by Ilia (and beaten on with a hammer a bit
> by me).
>
> I've tested this on i965 hacked up to insert the idiv lowering pass.
>
> Signed-off-by: Rob Clark robcl...@freedesktop.org
> ---
>  src/glsl/Makefile.sources     |  1 +
>  src/glsl/nir/div-lowering.py  | 75

I have no idea if it's valuable to include this file in Mesa (sort of
doubt it is?), but if it is it needs to be included in
src/glsl/Makefile.am's EXTRA_DIST. It also needs a license header.
Re: [Mesa-dev] [PATCH 1/3] i965: Check the INTEL_USE_NIR environment variable once at context creation
On Fri, Apr 3, 2015 at 9:46 AM, Matt Turner matts...@gmail.com wrote:
> On Fri, Apr 3, 2015 at 1:07 AM, Jordan Justen jordan.l.jus...@intel.com wrote:
>> On 2015-04-02 20:56:15, Jason Ekstrand wrote:
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_context.c | 10 +++++++++-
>>>  src/mesa/drivers/dri/i965/brw_fs.cpp    |  4 ++--
>>>  src/mesa/drivers/dri/i965/brw_vec4.cpp  |  4 +++-
>>>  3 files changed, 14 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
>>> index 84818f0..f0de711 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_context.c
>>> +++ b/src/mesa/drivers/dri/i965/brw_context.c
>>> @@ -560,6 +560,12 @@ brw_initialize_context_constants(struct brw_context *brw)
>>>        .lower_ffma = true,
>>>     };
>>>
>>> +   bool use_nir_default[MESA_SHADER_STAGES];
>>> +   use_nir_default[MESA_SHADER_VERTEX] = false;
>>> +   use_nir_default[MESA_SHADER_GEOMETRY] = false;
>>> +   use_nir_default[MESA_SHADER_FRAGMENT] = false;
>>> +   use_nir_default[MESA_SHADER_COMPUTE] = false;
>>
>> How about memset to 0 for now to make sure all stages are set? We can
>> add use_nir_default[MESA_SHADER_FOO] = true; after the memset to
>> update the default for the shader stage.

Sure, we could do that. I'm not sure if it really saves us anything. I
guess it would make sure that we initialize everything.

> Isn't this sufficient?
>
> bool use_nir_default[MESA_SHADER_STAGES] = {false};

Yes, that would accomplish the memset in less code.

> and use C99 designated initializers when we want to change the default
> per-stage.

No, we can't do this. When we flip the switch, we're going to have

   use_nir_default[MESA_SHADER_VERTEX] = brw->gen >= 8;

and you can only use compile-time constants in initializers.

>>> +      if (brw_env_var_as_boolean("INTEL_USE_NIR", use_nir_default[i]))
>>
>> This will read the var once per shader type, right? Maybe read
>> INTEL_USE_NIR once before the loop?

I'd like to have a single read of the INTEL_USE_NIR variable. However,
doing that *and* handling defaults will be annoying. Since this code
isn't going to be around for a real long time, I'm not terribly
concerned about reading it extra times. As is, we read it once per
shader compile anyway.

--Jason
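[Editorial sketch] The single-read-plus-per-stage-defaults idea discussed above could look roughly like the following. This is a minimal illustration in Python, not code from the patch: `STAGES`, `env_var_as_boolean` and `resolve_use_nir` are hypothetical names, the accepted true/false spellings are an assumption, and the `gen >= 8` vertex default is only the future default Jason mentions, not current behaviour.

```python
# Hypothetical stage names standing in for MESA_SHADER_VERTEX etc.
STAGES = ("vertex", "geometry", "fragment", "compute")


def env_var_as_boolean(raw, default):
    """Interpret an already-read environment string, falling back to default.

    The accepted spellings are an assumption for this sketch.
    """
    if raw is None:
        return default
    text = raw.strip().lower()
    if text in ("1", "true", "yes"):
        return True
    if text in ("0", "false", "no"):
        return False
    return default


def resolve_use_nir(environ, gen):
    """Return a per-stage use-NIR table, reading INTEL_USE_NIR only once."""
    raw = environ.get("INTEL_USE_NIR")        # the single read
    defaults = dict.fromkeys(STAGES, False)   # every stage gets a default
    defaults["vertex"] = gen >= 8             # run-time default from the thread
    return {stage: env_var_as_boolean(raw, defaults[stage]) for stage in STAGES}
```

Passing `os.environ` as `environ` gives the real lookup; a plain dict keeps the sketch testable. The defaults are filled in at run time, which sidesteps the initializer question raised in the thread.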
Re: [Mesa-dev] [PATCH] vbo: replace __FUNCTION__ with __func__
On Fri, Apr 3, 2015 at 5:02 AM, Marius Predut marius.pre...@intel.com wrote: Consistently just use C99's __func__ everywhere. The patch was verified with Microsoft Visual studio 2013 redistributable package(RTM version number: 18.0.21005.1) Next MSVC versions intends to support __func__. No functional changes. Signed-off-by: Marius Predut marius.pre...@intel.com --- src/mesa/vbo/vbo_exec_api.c |2 +- src/mesa/vbo/vbo_exec_draw.c |4 ++-- src/mesa/vbo/vbo_rebase.c|2 +- src/mesa/vbo/vbo_save_api.c |2 +- 4 files changed, 5 insertions(+), 5 deletions(-) diff --git a/src/mesa/vbo/vbo_exec_api.c b/src/mesa/vbo/vbo_exec_api.c index 02741c2..859078f 100644 --- a/src/mesa/vbo/vbo_exec_api.c +++ b/src/mesa/vbo/vbo_exec_api.c @@ -439,7 +439,7 @@ do { \ } while (0) -#define ERROR(err) _mesa_error( ctx, err, __FUNCTION__ ) +#define ERROR(err) _mesa_error( ctx, err, __func__ ) #define TAG(x) vbo_##x #include vbo_attrib_tmp.h diff --git a/src/mesa/vbo/vbo_exec_draw.c b/src/mesa/vbo/vbo_exec_draw.c index 91f2ca4..37b53a8 100644 --- a/src/mesa/vbo/vbo_exec_draw.c +++ b/src/mesa/vbo/vbo_exec_draw.c @@ -45,7 +45,7 @@ vbo_exec_debug_verts( struct vbo_exec_context *exec ) GLuint i; printf(%s: %u vertices %d primitives, %d vertsize\n, - __FUNCTION__, + __func__, count, exec-vtx.prim_count, exec-vtx.vertex_size); @@ -402,7 +402,7 @@ vbo_exec_vtx_flush(struct vbo_exec_context *exec, GLboolean keepUnmapped) } if (0) -printf(%s %d %d\n, __FUNCTION__, exec-vtx.prim_count, +printf(%s %d %d\n, __func__, exec-vtx.prim_count, exec-vtx.vert_count); vbo_context(ctx)-draw_prims( ctx, diff --git a/src/mesa/vbo/vbo_rebase.c b/src/mesa/vbo/vbo_rebase.c index b06df4a..c3c4b64 100644 --- a/src/mesa/vbo/vbo_rebase.c +++ b/src/mesa/vbo/vbo_rebase.c @@ -142,7 +142,7 @@ void vbo_rebase_prims( struct gl_context *ctx, assert(min_index != 0); if (0) - printf(%s %d..%d\n, __FUNCTION__, min_index, max_index); + printf(%s %d..%d\n, __func__, min_index, max_index); /* XXX this path is disabled for now. 
diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c index fd9a5de..5927bee 100644 --- a/src/mesa/vbo/vbo_save_api.c +++ b/src/mesa/vbo/vbo_save_api.c @@ -763,7 +763,7 @@ _save_reset_vertex(struct gl_context *ctx) -#define ERROR(err) _mesa_compile_error(ctx, err, __FUNCTION__); +#define ERROR(err) _mesa_compile_error(ctx, err, __func__); /* Only one size for each attribute may be active at once. Eg. if -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev Both patches are: Reviewed-by: Anuj Phogat anuj.pho...@gmail.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/3] i965: Add a NIR-based cubemap normalizing pass
On Fri, Apr 3, 2015 at 11:41 AM, Eric Anholt e...@anholt.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: --- src/mesa/drivers/dri/i965/Makefile.sources | 1 + src/mesa/drivers/dri/i965/brw_nir.h| 2 + .../drivers/dri/i965/brw_nir_cubemap_normalize.c | 111 + 3 files changed, 114 insertions(+) create mode 100644 src/mesa/drivers/dri/i965/brw_nir_cubemap_normalize.c Could this go in src/glsl/nir? vc4 also lowers cubemaps the same way, so I might want to use it. (Probably won't immediately, due to the same do I really want to make my rcp that accurate for this operation? probably not. concern as for txp). diff --git a/src/mesa/drivers/dri/i965/brw_nir_cubemap_normalize.c b/src/mesa/drivers/dri/i965/brw_nir_cubemap_normalize.c new file mode 100644 index 000..6464f41 --- /dev/null +++ b/src/mesa/drivers/dri/i965/brw_nir_cubemap_normalize.c @@ -0,0 +1,111 @@ +/* + * Copyright © 2015 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Authors: + *Jason Ekstrand ja...@jlekstrand.net + */ + +#include brw_nir.h +#include glsl/nir/nir_builder.h + +/** + * This file implements a NIR lowering pass to perform the normalization of + * the cubemap coordinates to have the largest magnitude component be -1.0 + * or 1.0. This is based on the old GLSL IR based pass by Eric. + */ + +static nir_ssa_def * +channel(nir_builder *b, nir_ssa_def *def, int c) +{ + return nir_swizzle(b, def, (unsigned[4]){c, c, c, c}, 1, false); +} + +static bool +cubemap_normalize_block(nir_block *block, void *void_state) +{ + nir_builder *b = void_state; + + nir_foreach_instr(block, instr) { + if (instr-type != nir_instr_type_tex) + continue; + + nir_tex_instr *tex = nir_instr_as_tex(instr); + if (tex-sampler_dim != GLSL_SAMPLER_DIM_CUBE) + continue; + + nir_builder_insert_before_instr(b, tex-instr); + + for (unsigned i = 0; i tex-num_srcs; i++) { + if (tex-src[i].src_type != nir_tex_src_coord) +continue; + + nir_ssa_def *orig_coord = +nir_ssa_for_src(b, tex-src[i].src, nir_tex_instr_src_size(tex, i)); + assert(orig_coord-num_components = 3); + + nir_ssa_def *abs0 = nir_fabs(b, channel(b, orig_coord, 0)); + nir_ssa_def *abs1 = nir_fabs(b, channel(b, orig_coord, 1)); + nir_ssa_def *abs2 = nir_fabs(b, channel(b, orig_coord, 2)); + + nir_ssa_def *norm1 = nir_fmax(b, abs0, nir_fmax(b, abs1, abs2)); This could just be: nir_ssa_def *abs = nir_fabs(b, orig_coord); nir_ssa_def *norm = nir_fmax(b, channel(b, abs, 0), nir_fmax(b, channel(b, abs, 1), channel(b, abs, 2))); right? Just in case vec4 NIR ends up being a thing. D'oh! Yeah, I'll change that. Other than these little comments, Reviewed-by: Eric Anholt e...@anholt.net Thanks. 
--Jason ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
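[Editorial sketch] For readers skimming the thread, the normalization the pass comment describes (scale the coordinate so its largest-magnitude component becomes -1.0 or 1.0) can be written in a few lines of Python. This is only an illustration of the arithmetic, not the NIR pass: `cubemap_normalize` is a hypothetical name, and passing any components beyond the first three through unchanged is an assumption, since the rest of the patch is not visible in this thread.

```python
def cubemap_normalize(coord):
    """Scale x, y, z so the largest-magnitude component is -1.0 or 1.0.

    Components past the first three are passed through unchanged
    (an assumption of this sketch). An all-zero direction is not handled.
    """
    x, y, z = coord[:3]
    # Same shape as the review suggestion: take abs of the vector, then
    # a max over the three channels.
    norm = max(abs(x), abs(y), abs(z))
    return tuple(c / norm for c in (x, y, z)) + tuple(coord[3:])
```

For example, `cubemap_normalize((2.0, -4.0, 1.0))` gives `(0.5, -1.0, 0.25)`.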
Re: [Mesa-dev] [PATCH 1/2] build: add libnir.la
On Fri, Apr 3, 2015 at 11:13 AM, Connor Abbott cwabbo...@gmail.com wrote:
> On Fri, Apr 3, 2015 at 2:07 PM, Rob Clark robdcl...@gmail.com wrote:
>> From: Rob Clark robcl...@freedesktop.org
>>
>> If we want to use NIR from state trackers that don't already pull in the
>> whole of glsl (ie. anything other than mesa state tracker), we need a
>> separate more minimal libnir. Possibly NIR should be better split out
>> from glsl, but for now, generate a second smaller libnir.la for those
>> who just want NIR but not all of glsl.
>>
>> Signed-off-by: Rob Clark robcl...@freedesktop.org
>> ---
>>  src/glsl/Makefile.am | 8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
>> index 6cef973..23c6fe8 100644
>> --- a/src/glsl/Makefile.am
>> +++ b/src/glsl/Makefile.am
>> @@ -68,7 +68,7 @@ TESTS_ENVIRONMENT= \
>>  	export PYTHON2=$(PYTHON2); \
>>  	export PYTHON_FLAGS=$(PYTHON_FLAGS);
>>
>> -noinst_LTLIBRARIES = libglsl.la libglcpp.la
>> +noinst_LTLIBRARIES = libnir.la libglsl.la libglcpp.la
>>  check_PROGRAMS = \
>>  	glcpp/glcpp \
>>  	glsl_test \
>> @@ -148,6 +148,12 @@ libglsl_la_SOURCES = \
>>  	$(LIBGLSL_FILES) \
>>  	$(NIR_FILES)
>
> We still have the line above, so doesn't this mean we'll build all the
> NIR files twice, once in libglsl.a and once in libnir.a? Isn't that a
> bad thing? Or am I missing something?

I'm pretty sure it's okay. The source files should be compiled only
once, and then linked into the two .a archives.

Whether we should list $(NIR_FILES) twice instead of adding libnir.la
to libglsl_la_LIBADD... I'm not sure.
Re: [Mesa-dev] [PATCH] nir: add lowering for idiv/udiv/umod
Rob Clark robdcl...@gmail.com writes:

> From: Rob Clark robcl...@freedesktop.org
>
> Based on the algo from NV50LegalizeSSA::handleDIV() and handleMOD().
> See also trans_idiv() in freedreno/ir3/ir3_compiler.c (which was an
> adaptation of the nv50 code from Ilia Mirkin).
>
> Also, including a py script that implements the same algo with numpy,
> based on something written by Ilia (and beaten on with a hammer a bit
> by me).
>
> I've tested this on i965 hacked up to insert the idiv lowering pass.

Tested-by: Eric Anholt e...@anholt.net (vc4)

I don't think we should commit the python file, though.
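[Editorial sketch] The lowering under discussion divides integers using only a float32 reciprocal, float32 multiplies and integer ops. The steps of the numpy script from the patch can be restated without numpy as below. This is a sketch under stated assumptions, not the NIR pass itself: the function names are hypothetical, float32 is emulated by round-tripping through `struct`, a zero divisor is not handled, and (as the patch's own comment warns about precision) exactness is only checked here for small operands.

```python
import struct

MASK32 = 0xFFFFFFFF


def f32(x):
    """Round a Python number to the nearest IEEE-754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]


def f32_bits(x):
    """Bit pattern of a float32 value as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(u):
    """Float32 value for a 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", u & MASK32))[0]


def lowered_udiv(a, b):
    """Unsigned 32-bit a / b following the steps of the script (b != 0)."""
    # Reciprocal of the divisor, with 2 subtracted from its bit pattern,
    # which makes the estimate slightly smaller.
    rcp = bits_f32(f32_bits(f32(1.0 / f32(b))) - 2)
    # First quotient estimate.
    q = int(f32(f32(a) * rcp)) & MASK32
    # Error of the first estimate, divided again and added to q.
    r = (a - q * b) & MASK32
    q = (q + int(f32(f32(r) * rcp))) & MASK32
    # Correction: if the remainder is still >= the divisor, add 1.
    r = (a - q * b) & MASK32
    if r >= b:
        q = (q + 1) & MASK32
    return q


def lowered_umod(a, b):
    """Unsigned remainder derived from the lowered quotient."""
    return (a - lowered_udiv(a, b) * b) & MASK32


def lowered_idiv(n, d):
    """Signed division: divide magnitudes, negate if the signs differ."""
    q = lowered_udiv(abs(n), abs(d))
    return -q if (n < 0) != (d < 0) else q
```

For instance `lowered_udiv(100, 7)` works out to 14: the first estimate already truncates to 14, the refinement adds 0, and the remainder 2 is below the divisor so no correction is applied.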
Re: [Mesa-dev] [PATCH 1/3] i965: Check the INTEL_USE_NIR environment variable once at context creation
On Fri, Apr 3, 2015 at 1:07 AM, Jordan Justen jordan.l.jus...@intel.com wrote:
On 2015-04-02 20:56:15, Jason Ekstrand wrote:
---
 src/mesa/drivers/dri/i965/brw_context.c | 10 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp    |  4 ++--
 src/mesa/drivers/dri/i965/brw_vec4.cpp  |  4 +++-
 3 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 84818f0..f0de711 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -560,6 +560,12 @@ brw_initialize_context_constants(struct brw_context *brw)
       .lower_ffma = true,
    };
+   bool use_nir_default[MESA_SHADER_STAGES];
+   use_nir_default[MESA_SHADER_VERTEX] = false;
+   use_nir_default[MESA_SHADER_GEOMETRY] = false;
+   use_nir_default[MESA_SHADER_FRAGMENT] = false;
+   use_nir_default[MESA_SHADER_COMPUTE] = false;

How about memset to 0 for now to make sure all stages are set? We can add use_nir_default[MESA_SHADER_FOO] = true; after the memset to update the default for the shader stage.

Isn't this sufficient?

bool use_nir_default[MESA_SHADER_STAGES] = {false};

and use C99 designated initializers when we want to change the default per-stage.
Re: [Mesa-dev] [PATCH 3/4] nir: Add a lowering pass for texture projectors.
On Wed, Apr 1, 2015 at 3:11 PM, Eric Anholt e...@anholt.net wrote:
Not much hardware wants them these days, and it might give us a chance to do CSE or algebraic at the NIR level.
---
I wrote this originally for vc4, but I'm not sure I'm going to turn it on -- I'm using a non-Newton-Raphson RCP in my TXP handling right now, and if I do this pass then I get the N-R step added. If it's not necessary, I probably shouldn't spend those instructions on it.

 src/glsl/Makefile.sources              |   1 +
 src/glsl/nir/nir.h                     |   1 +
 src/glsl/nir/nir_lower_tex_projector.c | 142 +
 3 files changed, 144 insertions(+)
 create mode 100644 src/glsl/nir/nir_lower_tex_projector.c

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index b56fa26..ffce706 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -37,6 +37,7 @@ NIR_FILES = \
    nir/nir_lower_phis_to_scalar.c \
    nir/nir_lower_samplers.cpp \
    nir/nir_lower_system_values.c \
+   nir/nir_lower_tex_projector.c \
    nir/nir_lower_to_source_mods.c \
    nir/nir_lower_vars_to_ssa.c \
    nir/nir_lower_var_copies.c \
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index 24deb82..6e2aa97 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1601,6 +1601,7 @@ void nir_lower_samplers(nir_shader *shader,
                         struct gl_program *prog);
 void nir_lower_system_values(nir_shader *shader);
+void nir_lower_tex_projector(nir_shader *shader);
 void nir_lower_atomics(nir_shader *shader);
 void nir_lower_to_source_mods(nir_shader *shader);
diff --git a/src/glsl/nir/nir_lower_tex_projector.c b/src/glsl/nir/nir_lower_tex_projector.c
new file mode 100644
index 000..6327b23
--- /dev/null
+++ b/src/glsl/nir/nir_lower_tex_projector.c
@@ -0,0 +1,142 @@
+/*
+ * Copyright © 2015 Broadcom
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+/*
+ * This lowering pass converts the coordinate division for texture projection
+ * to be done in ALU instructions instead of asking the texture operation to
+ * do so.
+ */
+
+#include "nir.h"
+#include "nir_builder.h"
+
+static nir_ssa_def *
+channel(nir_builder *b, nir_ssa_def *def, int c)
+{
+   return nir_swizzle(b, def, (unsigned[4]){c, c, c, c}, 1, false);
+}
+
+static bool
+nir_lower_tex_projector_block(nir_block *block, void *void_state)
+{
+   nir_builder *b = void_state;
+
+   nir_foreach_instr_safe(block, instr) {
+      if (instr->type != nir_instr_type_tex)
+         continue;
+
+      nir_tex_instr *tex = nir_instr_as_tex(instr);
+      nir_builder_insert_before_instr(b, &tex->instr);
+
+      /* Find the projector in the srcs list, if present. */
+      int proj_index;
+      for (proj_index = 0; proj_index < tex->num_srcs; proj_index++) {
+         if (tex->src[proj_index].src_type == nir_tex_src_projector)
+            break;
+      }
+      if (proj_index == tex->num_srcs)
+         continue;
+      nir_ssa_def *inv_proj =
+         nir_frcp(b, nir_ssa_for_src(b, tex->src[proj_index].src, 1));
+
+      /* Walk through the sources projecting the arguments. */
+      for (int i = 0; i < tex->num_srcs; i++) {
+         switch (tex->src[i].src_type) {
+         case nir_tex_src_coord:
+         case nir_tex_src_comparitor:
+            break;
+         default:
+            continue;
+         }
+         nir_ssa_def *unprojected =
+            nir_ssa_for_src(b, tex->src[i].src, nir_tex_instr_src_size(tex, i));
+         nir_ssa_def *projected = nir_fmul(b, unprojected, inv_proj);
+
+         /* Array indices don't get projected, so make an new vector with the
+          * coordinate's array index untouched.
+          */
+         if (tex->is_array && tex->src[i].src_type == nir_tex_src_coord) {

I don't think
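As a plain-C illustration of the arithmetic the pass moves into ALU instructions: each projected component is multiplied by the reciprocal of the projector, and a trailing array index is passed through untouched. The function and parameter names below are made up for this sketch and are not part of NIR.

```c
#include <assert.h>
#include <stddef.h>

/* Multiplies the first `num_projected` components of `coord` by 1/q and
 * copies any remaining components (for example an array index) as-is.
 * Names and signature are hypothetical, for illustration only. */
static void project_coord(const float *coord, size_t num_comps,
                          size_t num_projected, float q, float *out)
{
   assert(q != 0.0f);
   assert(num_projected <= num_comps);

   const float inv_q = 1.0f / q;   /* one reciprocal, shared by all components */
   for (size_t i = 0; i < num_comps; i++)
      out[i] = (i < num_projected) ? coord[i] * inv_q : coord[i];
}
```

Computing the reciprocal once and reusing it is what gives later optimization passes a chance to share or simplify it.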
Re: [Mesa-dev] [PATCH 1/2] egl/dri2: implement platform_null (v2).
Time to revive this patch! Why?

- I don't like large patchsets living in Chrome OS for too long.
- Frank submitted Waffle patches to support this, and I hesitate to add Waffle support unless the platform is upstream.

Of course, the patch no longer applies to master. So I rebased and pushed it to a personal branch:

git fetch git://github.com/chadversary/mesa refs/heads/cooking/hshi/egl-platform-null
git checkout FETCH_HEAD

https://github.com/chadversary/mesa/tree/cooking/hshi/egl-platform-null

On Tue 17 Feb 2015, Haixia Shi wrote:
The NULL platform is for off-screen rendering only. Render node support is required.

Naming it the NULL platform seems very odd. It actually *does* stuff. Usually things named null do nothing. Also, this platform is not unique in its NULL requirement. That is, there do exist other EGL platforms in which eglGetDisplay accepts only NULL, such as Android. I strongly suspect this is also the case for other lesser known operating systems.

But there isn't an obviously better name. Me and Jordan chatted about this for a long time and came to no conclusion. Usually, an EGL platform is named after (1) the operating system, (2) the underlying window system, or (3) the real type of EGLNativeDisplayType. None of those precedents help here.

However, this platform does have a single, unique property that distinguishes it from all other EGL platforms: ___this is the only platform that has no support for EGLSurface___. Perhaps we should name the platform after that property? Perhaps EGL_MESA_platform_surfaceless and platform_surfaceless.c?

Only consider the render nodes. Do not use normal nodes as they require auth hooks.

I agree with that decision. The platform should not fallback to /dev/dri/card*, at least not in this initial patch. That adds too much complication. I have comments below on how the rendernode gets selected.

Signed-off-by: Haixia Shi h...@chromium.org
---
 src/egl/drivers/dri2/Makefile.am     |   5 ++
 src/egl/drivers/dri2/egl_dri2.c      |  13 ++-
 src/egl/drivers/dri2/egl_dri2.h      |   3 +
 src/egl/drivers/dri2/platform_null.c | 169 +++
 4 files changed, 187 insertions(+), 3 deletions(-)
 create mode 100644 src/egl/drivers/dri2/platform_null.c

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 86e5f24..6ed137e 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -632,6 +632,13 @@ dri2_initialize(_EGLDriver *drv, _EGLDisplay *disp)
       return EGL_FALSE;
    switch (disp->Platform) {
+#ifdef HAVE_NULL_PLATFORM
+   case _EGL_PLATFORM_NULL:
+      if (disp->Options.TestOnly)
+         return EGL_TRUE;
+      return dri2_initialize_null(drv, disp);
+#endif

The platform has a major deficiency in this hunk, but I don't think it needs fixing in this initial patch. Due to the way eglGetDisplay(EGLNativeDisplay dpy) auto-detects the real type of dpy, it's impossible to build Mesa with platform_null and platform_x11 enabled and actually have both usable. eglGetDisplay will work for only one, and the one that works will be the first that occurs in the platform list given to --with-egl-platforms=... . In other words,

--with-egl-platforms=x11,null => eglGetDisplay(NULL) opens the default X11 display
--with-egl-platforms=null,x11 => eglGetDisplay(NULL) opens a render node

Again, I don't think you need to fix this in the initial patch. The proper solution is to implement a platform extension, like EGL_MESA_platform_gbm [1], which can be done in a follow-up patch. Without a platform extension, distro packagers and most Mesa developers will not be able to ever test this platform.

[1] https://www.khronos.org/registry/egl/extensions/MESA/EGL_MESA_platform_gbm.txt

diff --git a/src/egl/drivers/dri2/platform_null.c b/src/egl/drivers/dri2/platform_null.c
new file mode 100644
index 000..55ceab6
--- /dev/null
+++ b/src/egl/drivers/dri2/platform_null.c
@@ -0,0 +1,169 @@
+static __DRIbuffer *
+null_get_buffers_with_format(__DRIdrawable * driDrawable,
+                             int *width, int *height,
+                             unsigned int *attachments, int count,
+                             int *out_count, void *loaderPrivate)
+{
+   struct dri2_egl_surface *dri2_surf = loaderPrivate;
+   struct dri2_egl_display *dri2_dpy =
+      dri2_egl_display(dri2_surf->base.Resource.Display);

dri2_dpy is unused.

+
+   dri2_surf->buffer_count = 1;
+   if (width)
+      *width = dri2_surf->base.Width;
+   if (height)
+      *height = dri2_surf->base.Height;
+   *out_count = dri2_surf->buffer_count;;
+   return dri2_surf->buffers;
+}
+
+#define DRM_RENDER_DEV_NAME "%s/renderD%d"
+
+EGLBoolean
+dri2_initialize_null(_EGLDriver *drv, _EGLDisplay *disp)
+{
+   struct dri2_egl_display *dri2_dpy;
+   const char* err;
+   int i, render_node;

render_node is unused.

+   int driver_loaded = 0;
+
+   loader_set_logger(_eglLog);
+
+   dri2_dpy = calloc(1, sizeof
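Since the quoted patch is cut off before the render node is actually opened, here is a small sketch of how a format string like DRM_RENDER_DEV_NAME is typically turned into a device path. It assumes the usual Linux convention that render node minors start at 128 under /dev/dri; the helper name and macro name are hypothetical and this is not the selection loop from the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Same shape as the format string in the patch; the name is local to
 * this sketch. */
#define RENDER_DEV_NAME_FMT "%s/renderD%d"

/* Writes "<dir>/renderD<minor>" into buf. Returns 0 on success and -1 if
 * the result did not fit or formatting failed. Hypothetical helper. */
static int format_render_node(char *buf, size_t len,
                              const char *dir, int minor)
{
   assert(buf != NULL && dir != NULL);

   int n = snprintf(buf, len, RENDER_DEV_NAME_FMT, dir, minor);
   return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

A caller would then try to open each candidate path in turn (for example minors 128 and up) and stop at the first one whose driver loads.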
Re: [Mesa-dev] [PATCH 1/4] nir: Add a src_get_parent_instr function
1-3 Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

Shouldn't we hold off on 4 given the lost count?

On 2015-04-02 21:05:22, Jason Ekstrand wrote:
---
 src/glsl/nir/nir.h | 10 ++
 .../drivers/dri/i965/brw_nir_analyze_boolean_resolves.c | 16 ++--
 2 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index 24deb82..94b0f49 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -529,6 +529,16 @@ nir_src_for_reg(nir_register *reg)
    return src;
 }
+static inline nir_instr *
+nir_src_get_parent_instr(const nir_src *src)
+{
+   if (src->is_ssa) {
+      return src->ssa->parent_instr;
+   } else {
+      return src->reg.reg->parent_instr;
+   }
+}
+
 static inline nir_dest
 nir_dest_for_reg(nir_register *reg)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_nir_analyze_boolean_resolves.c b/src/mesa/drivers/dri/i965/brw_nir_analyze_boolean_resolves.c
index 3a27cf1..f0b018c 100644
--- a/src/mesa/drivers/dri/i965/brw_nir_analyze_boolean_resolves.c
+++ b/src/mesa/drivers/dri/i965/brw_nir_analyze_boolean_resolves.c
@@ -43,13 +43,7 @@
 static uint8_t
 get_resolve_status_for_src(nir_src *src)
 {
-   nir_instr *src_instr;
-   if (src->is_ssa) {
-      src_instr = src->ssa->parent_instr;
-   } else {
-      src_instr = src->reg.reg->parent_instr;
-   }
-
+   nir_instr *src_instr = nir_src_get_parent_instr(src);
    if (src_instr) {
       uint8_t resolve_status = src_instr->pass_flags & BRW_NIR_BOOLEAN_MASK;
@@ -72,13 +66,7 @@ get_resolve_status_for_src(nir_src *src)
 static bool
 src_mark_needs_resolve(nir_src *src, void *void_state)
 {
-   nir_instr *src_instr;
-   if (src->is_ssa) {
-      src_instr = src->ssa->parent_instr;
-   } else {
-      src_instr = src->reg.reg->parent_instr;
-   }
-
+   nir_instr *src_instr = nir_src_get_parent_instr(src);
    if (src_instr) {
       uint8_t resolve_status = src_instr->pass_flags & BRW_NIR_BOOLEAN_MASK;
--
2.3.4
[Mesa-dev] [PATCH 3/6] android: export the path of the generated headers
Modules that need the generated headers can then get the path automatically.

Signed-off-by: Chih-Wei Huang cwhu...@linux.org.tw
---
 src/mesa/Android.libmesa_dricore.mk    | 1 -
 src/mesa/Android.libmesa_st_mesa.mk    | 1 -
 src/mesa/drivers/dri/Android.mk        | 2 --
 src/mesa/drivers/dri/common/Android.mk | 2 ++
 src/mesa/program/Android.mk            | 2 ++
 5 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/Android.libmesa_dricore.mk b/src/mesa/Android.libmesa_dricore.mk
index da6176a..7758d54 100644
--- a/src/mesa/Android.libmesa_dricore.mk
+++ b/src/mesa/Android.libmesa_dricore.mk
@@ -60,7 +60,6 @@ LOCAL_CFLAGS += \
 endif
 LOCAL_C_INCLUDES := \
-   $(call intermediates-dir-for STATIC_LIBRARIES,libmesa_program,,) \
    $(MESA_TOP)/src/mapi \
    $(MESA_TOP)/src/mesa/main \
    $(MESA_TOP)/src/glsl \
diff --git a/src/mesa/Android.libmesa_st_mesa.mk b/src/mesa/Android.libmesa_st_mesa.mk
index e02030b..b4b7fd9 100644
--- a/src/mesa/Android.libmesa_st_mesa.mk
+++ b/src/mesa/Android.libmesa_st_mesa.mk
@@ -52,7 +52,6 @@ LOCAL_CFLAGS := \
 endif
 LOCAL_C_INCLUDES := \
-   $(call intermediates-dir-for STATIC_LIBRARIES,libmesa_program,,) \
    $(MESA_TOP)/src/mapi \
    $(MESA_TOP)/src/mesa/main \
    $(MESA_TOP)/src/glsl \
diff --git a/src/mesa/drivers/dri/Android.mk b/src/mesa/drivers/dri/Android.mk
index 09ce55a..d399666 100644
--- a/src/mesa/drivers/dri/Android.mk
+++ b/src/mesa/drivers/dri/Android.mk
@@ -34,8 +34,6 @@ MESA_DRI_CFLAGS := \
    -DHAVE_ANDROID_PLATFORM
 MESA_DRI_C_INCLUDES := \
-   $(call intermediates-dir-for,STATIC_LIBRARIES,libmesa_dri_common) \
-   $(call intermediates-dir-for,STATIC_LIBRARIES,libmesa_glsl)/nir \
    $(addprefix $(MESA_TOP)/, $(mesa_dri_common_INCLUDES)) \
    $(MESA_TOP)/src/gallium/include \
    $(MESA_TOP)/src/gallium/auxiliary \
diff --git a/src/mesa/drivers/dri/common/Android.mk b/src/mesa/drivers/dri/common/Android.mk
index 458e4e9..511d4af 100644
--- a/src/mesa/drivers/dri/common/Android.mk
+++ b/src/mesa/drivers/dri/common/Android.mk
@@ -39,6 +39,8 @@ intermediates := $(call local-generated-sources-dir)
 LOCAL_C_INCLUDES := \
    $(MESA_DRI_C_INCLUDES)
+LOCAL_EXPORT_C_INCLUDE_DIRS := $(intermediates)
+
 # swrast only
 ifeq ($(MESA_GPU_DRIVERS),swrast)
 LOCAL_CFLAGS := -D__NOT_HAVE_DRM_H
diff --git a/src/mesa/program/Android.mk b/src/mesa/program/Android.mk
index 9a6f962..ccb0fa5 100644
--- a/src/mesa/program/Android.mk
+++ b/src/mesa/program/Android.mk
@@ -78,5 +78,7 @@ LOCAL_C_INCLUDES := \
    $(MESA_TOP)/src/gallium/auxiliary \
    $(MESA_TOP)/src/gallium/include
+LOCAL_EXPORT_C_INCLUDE_DIRS := $(intermediates)
+
 include $(MESA_COMMON_MK)
 include $(BUILD_STATIC_LIBRARY)
--
1.9.1
[Mesa-dev] [PATCH 1/6] android: fix building issues of host binaries
Define _GNU_SOURCE to enable features (__USE_XOPEN2K and __USE_UNIX98) required to build the host binaries.

Signed-off-by: Chih-Wei Huang cwhu...@linux.org.tw
---
 Android.common.mk                    | 2 +-
 src/mesa/Android.mesa_gen_matypes.mk | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/Android.common.mk b/Android.common.mk
index a4ee181..edf52d6 100644
--- a/Android.common.mk
+++ b/Android.common.mk
@@ -24,7 +24,7 @@
 # use c99 compiler by default
 ifeq ($(LOCAL_CC),)
 ifeq ($(LOCAL_IS_HOST_MODULE),true)
-LOCAL_CC := $(HOST_CC) -std=c99
+LOCAL_CC := $(HOST_CC) -std=c99 -D_GNU_SOURCE
 else
 LOCAL_CC := $(TARGET_CC) -std=c99
 endif
diff --git a/src/mesa/Android.mesa_gen_matypes.mk b/src/mesa/Android.mesa_gen_matypes.mk
index 5521087..6e301f9 100644
--- a/src/mesa/Android.mesa_gen_matypes.mk
+++ b/src/mesa/Android.mesa_gen_matypes.mk
@@ -33,7 +33,6 @@ include $(CLEAR_VARS)
 LOCAL_MODULE := mesa_gen_matypes
 LOCAL_IS_HOST_MODULE := true
-LOCAL_CFLAGS := -D_POSIX_C_SOURCE=199309L
 LOCAL_C_INCLUDES := \
    $(MESA_TOP)/src/mapi \
--
1.9.1
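For context on why the define matters under -std=c99: with a strict ISO mode, glibc hides declarations that are not part of ISO C unless a feature test macro asks for them. The sketch below uses posix_memalign(), which glibc declares only when the XOPEN2K feature set is enabled, as one example of such a function. Which functions the Mesa host binaries actually need is not stated in the patch, so the choice here is an assumption, and alloc_aligned() is a made-up helper.

```c
/* Must be defined before the first system header is included; on the
 * command line this is what -D_GNU_SOURCE does. */
#define _GNU_SOURCE

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns `size` bytes aligned to `alignment`, or NULL on failure.
 * Without a suitable feature test macro, a strict -std=c99 build on glibc
 * would not see a declaration for posix_memalign(). */
static void *alloc_aligned(size_t alignment, size_t size)
{
   void *ptr = NULL;

   assert(alignment >= sizeof(void *));
   if (posix_memalign(&ptr, alignment, size) != 0)
      return NULL;
   return ptr;
}
```

Putting the define on the compiler command line, as the patch does, avoids having to repeat it at the top of every source file.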
[Mesa-dev] [PATCH 2/6] android: fix the building rules for Android 5.0
Android 5.0 allows modules to generate source into $OUT/gen, which will then be copied into $OUT/obj and $OUT/obj_$(TARGET_2ND_ARCH) as necessary. Modules will need to change calls to local-intermediates-dir into local-generated-sources-dir.

The patch changes local-intermediates-dir into local-generated-sources-dir. If the Android version is less than 5.0, fall back to local-intermediates-dir. The patch also fixes the 64-bit build issue on Android 5.0.

Signed-off-by: Chih-Wei Huang cwhu...@linux.org.tw
---
 Android.mk                             | 7 +++
 src/egl/drivers/dri2/Android.mk        | 8 +++-
 src/egl/main/Android.mk                | 4
 src/gallium/auxiliary/Android.mk       | 2 +-
 src/gallium/auxiliary/os/os_mman.h     | 2 +-
 src/glsl/Android.gen.mk                | 3 +--
 src/mapi/Android.mk                    | 2 +-
 src/mesa/Android.gen.mk                | 2 +-
 src/mesa/drivers/dri/Android.mk        | 1 -
 src/mesa/drivers/dri/common/Android.mk | 3 +--
 src/mesa/drivers/dri/i915/Android.mk   | 5 -
 src/mesa/drivers/dri/i965/Android.mk   | 5 -
 src/mesa/program/Android.mk            | 3 +--
 src/util/Android.mk                    | 4 ++--
 14 files changed, 35 insertions(+), 16 deletions(-)

diff --git a/Android.mk b/Android.mk
index 87ed464..b19419b 100644
--- a/Android.mk
+++ b/Android.mk
@@ -34,6 +34,13 @@ MESA_TOP := $(call my-dir)
 MESA_ANDROID_MAJOR_VERSION := $(word 1, $(subst ., , $(PLATFORM_VERSION)))
 MESA_ANDROID_MINOR_VERSION := $(word 2, $(subst ., , $(PLATFORM_VERSION)))
 MESA_ANDROID_VERSION := $(MESA_ANDROID_MAJOR_VERSION).$(MESA_ANDROID_MINOR_VERSION)
+ifeq ($(filter 1 2 3 4,$(MESA_ANDROID_MAJOR_VERSION)),)
+MESA_LOLLIPOP_BUILD := true
+else
+define local-generated-sources-dir
+$(call local-intermediates-dir)
+endef
+endif
 MESA_COMMON_MK := $(MESA_TOP)/Android.common.mk
 MESA_PYTHON2 := python
diff --git a/src/egl/drivers/dri2/Android.mk b/src/egl/drivers/dri2/Android.mk
index d48506a..5931ce8 100644
--- a/src/egl/drivers/dri2/Android.mk
+++ b/src/egl/drivers/dri2/Android.mk
@@ -32,10 +32,16 @@ LOCAL_SRC_FILES := \
    platform_android.c
 LOCAL_CFLAGS := \
-   -DDEFAULT_DRIVER_DIR=\"/system/lib/dri\" \
    -DHAVE_SHARED_GLAPI \
    -DHAVE_ANDROID_PLATFORM
+ifeq ($(MESA_LOLLIPOP_BUILD),true)
+LOCAL_CFLAGS_x86 := -DDEFAULT_DRIVER_DIR=\"/system/lib/dri\"
+LOCAL_CFLAGS_x86_64 := -DDEFAULT_DRIVER_DIR=\"/system/lib64/dri\"
+else
+LOCAL_CFLAGS += -DDEFAULT_DRIVER_DIR=\"/system/lib/dri\"
+endif
+
 LOCAL_C_INCLUDES := \
    $(MESA_TOP)/src/mapi \
    $(MESA_TOP)/src/egl/main \
diff --git a/src/egl/main/Android.mk b/src/egl/main/Android.mk
index 4d0cc57..12b66d0 100644
--- a/src/egl/main/Android.mk
+++ b/src/egl/main/Android.mk
@@ -154,7 +154,11 @@ LOCAL_STATIC_LIBRARIES := \
    libmesa_loader
 LOCAL_MODULE := libGLES_mesa
+ifeq ($(MESA_LOLLIPOP_BUILD),true)
+LOCAL_MODULE_RELATIVE_PATH := egl
+else
 LOCAL_MODULE_PATH := $(TARGET_OUT_SHARED_LIBRARIES)/egl
+endif
 include $(MESA_COMMON_MK)
 include $(BUILD_SHARED_LIBRARY)
diff --git a/src/gallium/auxiliary/Android.mk b/src/gallium/auxiliary/Android.mk
index c7b2634..96a2125 100644
--- a/src/gallium/auxiliary/Android.mk
+++ b/src/gallium/auxiliary/Android.mk
@@ -39,7 +39,7 @@ LOCAL_MODULE := libmesa_gallium
 # generate sources
 LOCAL_MODULE_CLASS := STATIC_LIBRARIES
-intermediates := $(call local-intermediates-dir)
+intermediates := $(call local-generated-sources-dir)
 LOCAL_GENERATED_SOURCES := $(addprefix $(intermediates)/, $(GENERATED_SOURCES))
 $(LOCAL_GENERATED_SOURCES): PRIVATE_PYTHON := $(MESA_PYTHON2)
diff --git a/src/gallium/auxiliary/os/os_mman.h b/src/gallium/auxiliary/os/os_mman.h
index 3fc8c43..e892610 100644
--- a/src/gallium/auxiliary/os/os_mman.h
+++ b/src/gallium/auxiliary/os/os_mman.h
@@ -54,7 +54,7 @@ extern "C" {
 #endif
-#if defined(PIPE_OS_ANDROID)
+#if defined(PIPE_OS_ANDROID) && !defined(__LP64__)
 extern void *__mmap2(void *, size_t, int, int, int, size_t);
diff --git a/src/glsl/Android.gen.mk b/src/glsl/Android.gen.mk
index 35d79f2..5591f9d 100644
--- a/src/glsl/Android.gen.mk
+++ b/src/glsl/Android.gen.mk
@@ -27,7 +27,7 @@ ifeq ($(LOCAL_MODULE_CLASS),)
 LOCAL_MODULE_CLASS := STATIC_LIBRARIES
 endif
-intermediates := $(call local-intermediates-dir)
+intermediates := $(call local-generated-sources-dir)
 sources := \
    glsl_lexer.cpp \
@@ -43,7 +43,6 @@ sources := \
 LOCAL_SRC_FILES := $(filter-out $(sources), $(LOCAL_SRC_FILES))
 LOCAL_C_INCLUDES += \
-   $(intermediates) \
    $(intermediates)/glcpp \
    $(intermediates)/nir \
    $(MESA_TOP)/src/glsl/glcpp \
diff --git a/src/mapi/Android.mk b/src/mapi/Android.mk
index f104378..4445218 100644
--- a/src/mapi/Android.mk
+++ b/src/mapi/Android.mk
@@ -53,7 +53,7 @@ LOCAL_C_INCLUDES := \
 LOCAL_MODULE := libglapi
 LOCAL_MODULE_CLASS := SHARED_LIBRARIES
-intermediates := $(call local-intermediates-dir)
+intermediates := $(call local-generated-sources-dir)
 abi_header :=
[Mesa-dev] [PATCH 4/6] android: refine the rules to generate xmlpool/options.h
The patch gets rid of the last use of intermediates-dir-for.

Signed-off-by: Chih-Wei Huang cwhu...@linux.org.tw
---
 src/mesa/drivers/dri/Android.mk        |  3 ---
 src/mesa/drivers/dri/common/Android.mk | 18 --
 2 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/dri/Android.mk b/src/mesa/drivers/dri/Android.mk
index d399666..b18c8a2 100644
--- a/src/mesa/drivers/dri/Android.mk
+++ b/src/mesa/drivers/dri/Android.mk
@@ -54,9 +54,6 @@ MESA_DRI_SHARED_LIBRARIES := \
    libglapi \
    liblog
-# All DRI modules must add this to LOCAL_GENERATED_SOURCES.
-MESA_DRI_OPTIONS_H := $(call intermediates-dir-for,STATIC_LIBRARIES,libmesa_dri_common)/xmlpool/options.h
-
 #---
 # Build drivers and libmesa_dri_common
diff --git a/src/mesa/drivers/dri/common/Android.mk b/src/mesa/drivers/dri/common/Android.mk
index 511d4af..6ccfe27 100644
--- a/src/mesa/drivers/dri/common/Android.mk
+++ b/src/mesa/drivers/dri/common/Android.mk
@@ -50,8 +50,8 @@ endif
 LOCAL_SRC_FILES := $(DRI_COMMON_FILES)
-LOCAL_GENERATED_SOURCES := \
-    $(intermediates)/xmlpool/options.h
+MESA_DRI_OPTIONS_H := $(intermediates)/xmlpool/options.h
+LOCAL_GENERATED_SOURCES := $(MESA_DRI_OPTIONS_H)
 #
 # Generate options.h from gettext translations.
@@ -81,16 +81,14 @@ $(intermediates)/xmlpool/%/LC_MESSAGES/options.mo: $(intermediates)/xmlpool/%.po
    mkdir -p $(dir $@)
    msgfmt -o $@ $<
-$(intermediates)/xmlpool/options.h: PRIVATE_SCRIPT := $(LOCAL_PATH)/xmlpool/gen_xmlpool.py
-$(intermediates)/xmlpool/options.h: PRIVATE_LOCALEDIR := $(intermediates)/xmlpool
-$(intermediates)/xmlpool/options.h: PRIVATE_TEMPLATE_HEADER := $(LOCAL_PATH)/xmlpool/t_options.h
-$(intermediates)/xmlpool/options.h: PRIVATE_MO_FILES := $(MESA_DRI_OPTIONS_LANGS:%=$(intermediates)/xmlpool/%/LC_MESSAGES/options.mo)
+$(MESA_DRI_OPTIONS_H): PRIVATE_SCRIPT := $(LOCAL_PATH)/xmlpool/gen_xmlpool.py
+$(MESA_DRI_OPTIONS_H): PRIVATE_TEMPLATE_HEADER := $(LOCAL_PATH)/xmlpool/t_options.h
+$(MESA_DRI_OPTIONS_H): PRIVATE_MO_FILES := $(MESA_DRI_OPTIONS_LANGS:%=$(intermediates)/xmlpool/%/LC_MESSAGES/options.mo)
 .SECONDEXPANSION:
-$(intermediates)/xmlpool/options.h: $$(PRIVATE_SCRIPT) $$(PRIVATE_TEMPLATE_HEADER) $$(PRIVATE_MO_FILES)
-   mkdir -p $(dir $@)
-   mkdir -p $(PRIVATE_LOCALEDIR)
+$(MESA_DRI_OPTIONS_H): $$(PRIVATE_SCRIPT) $$(PRIVATE_TEMPLATE_HEADER) $$(PRIVATE_MO_FILES)
+   @mkdir -p $(@D)
    $(MESA_PYTHON2) $(PRIVATE_SCRIPT) $(PRIVATE_TEMPLATE_HEADER) \
-      $(PRIVATE_LOCALEDIR) $(MESA_DRI_OPTIONS_LANGS) $@
+      $(@D) $(MESA_DRI_OPTIONS_LANGS) $@
 include $(MESA_COMMON_MK)
 include $(BUILD_STATIC_LIBRARY)
--
1.9.1
[Mesa-dev] [PATCH 0/6] Fix Android 5.x building issues
This is a series of patches to fix build issues with Android 5.0 (and newer versions), based on Emil's 'submit/android-fixes#1' branch.
[Mesa-dev] [PATCH 5/6] android: build x86(-64) assembly for Android 5.0
Android 5.0 changed HOST_ARCH to x86_64, which broke the asm build rules. This patch fixes the rules so that asm is built for both x86 and x86_64 targets. Note that mesa_gen_matypes is built for 32-bit only.

Signed-off-by: Chih-Wei Huang cwhu...@linux.org.tw
---
 Android.common.mk                    | 9 +
 Android.mk                           | 2 +-
 src/mapi/Android.mk                  | 2 ++
 src/mesa/Android.gen.mk              | 2 --
 src/mesa/Android.libmesa_dricore.mk  | 3 +++
 src/mesa/Android.libmesa_st_mesa.mk  | 3 +++
 src/mesa/Android.mesa_gen_matypes.mk | 3 +--
 src/mesa/main/imports.h              | 6 +++---
 8 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/Android.common.mk b/Android.common.mk
index edf52d6..0ca6d13 100644
--- a/Android.common.mk
+++ b/Android.common.mk
@@ -61,10 +61,11 @@ LOCAL_CFLAGS += \
 ifeq ($(strip $(MESA_ENABLE_ASM)),true)
 ifeq ($(TARGET_ARCH),x86)
-LOCAL_CFLAGS += \
-   -DUSE_X86_ASM \
-   -DHAVE_DLOPEN \
-
+LOCAL_CFLAGS += -DUSE_X86_ASM -DHAVE_DLOPEN
+else ifeq ($(TARGET_2ND_ARCH),x86)
+LOCAL_CFLAGS_x86 += -DUSE_X86_ASM -DHAVE_DLOPEN
+LOCAL_CFLAGS_x86_64 += -DUSE_X86_64_ASM -DHAVE_DLOPEN
+LOCAL_ASFLAGS_x86_64 := -DUSE_X86_64_ASM
 endif
 endif
diff --git a/Android.mk b/Android.mk
index b19419b..cd85937 100644
--- a/Android.mk
+++ b/Android.mk
@@ -62,7 +62,7 @@ MESA_GPU_DRIVERS := $(filter-out $(invalid_drivers), $(MESA_GPU_DRIVERS))
 endif
 # host and target must be the same arch to generate matypes.h
-ifeq ($(TARGET_ARCH),$(HOST_ARCH))
+ifneq ($(filter $(TARGET_ARCH) $(TARGET_2ND_ARCH),x86),)
 MESA_ENABLE_ASM := true
 else
 MESA_ENABLE_ASM := false
diff --git a/src/mapi/Android.mk b/src/mapi/Android.mk
index 4445218..c909d68 100644
--- a/src/mapi/Android.mk
+++ b/src/mapi/Android.mk
@@ -47,6 +47,8 @@ LOCAL_CFLAGS := \
    -DMAPI_MODE_GLAPI \
    -DMAPI_ABI_HEADER=\"$(abi_header)\"
+LOCAL_LDFLAGS := -Wl,--no-warn-shared-textrel
+
 LOCAL_C_INCLUDES := \
    $(MESA_TOP)/src/mapi
diff --git a/src/mesa/Android.gen.mk b/src/mesa/Android.gen.mk
index 27656cd..fb4a616 100644
--- a/src/mesa/Android.gen.mk
+++ b/src/mesa/Android.gen.mk
@@ -45,11 +45,9 @@ LOCAL_SRC_FILES := $(filter-out $(sources), $(LOCAL_SRC_FILES))
 LOCAL_C_INCLUDES += $(intermediates)/main
 ifeq ($(strip $(MESA_ENABLE_ASM)),true)
-ifeq ($(TARGET_ARCH),x86)
 sources += x86/matypes.h
 LOCAL_C_INCLUDES += $(intermediates)/x86
 endif
-endif
 sources += main/git_sha1.h
diff --git a/src/mesa/Android.libmesa_dricore.mk b/src/mesa/Android.libmesa_dricore.mk
index 7758d54..0636b2f 100644
--- a/src/mesa/Android.libmesa_dricore.mk
+++ b/src/mesa/Android.libmesa_dricore.mk
@@ -44,6 +44,9 @@ LOCAL_SRC_FILES := \
 ifeq ($(strip $(MESA_ENABLE_ASM)),true)
 ifeq ($(TARGET_ARCH),x86)
    LOCAL_SRC_FILES += $(X86_FILES)
+else ifeq ($(TARGET_2ND_ARCH),x86)
+   LOCAL_SRC_FILES_x86 := $(X86_FILES)
+   LOCAL_SRC_FILES_x86_64 := $(X86_64_FILES)
 endif # x86
 endif # MESA_ENABLE_ASM
diff --git a/src/mesa/Android.libmesa_st_mesa.mk b/src/mesa/Android.libmesa_st_mesa.mk
index b4b7fd9..cec59a9 100644
--- a/src/mesa/Android.libmesa_st_mesa.mk
+++ b/src/mesa/Android.libmesa_st_mesa.mk
@@ -43,6 +43,9 @@ LOCAL_SRC_FILES := \
 ifeq ($(strip $(MESA_ENABLE_ASM)),true)
 ifeq ($(TARGET_ARCH),x86)
    LOCAL_SRC_FILES += $(X86_FILES)
+else ifeq ($(TARGET_2ND_ARCH),x86)
+   LOCAL_SRC_FILES_x86 := $(X86_FILES)
+   LOCAL_SRC_FILES_x86_64 := $(X86_64_FILES)
 endif # x86
 endif # MESA_ENABLE_ASM
diff --git a/src/mesa/Android.mesa_gen_matypes.mk b/src/mesa/Android.mesa_gen_matypes.mk
index 6e301f9..296191e 100644
--- a/src/mesa/Android.mesa_gen_matypes.mk
+++ b/src/mesa/Android.mesa_gen_matypes.mk
@@ -25,7 +25,6 @@
 # -
 ifeq ($(strip $(MESA_ENABLE_ASM)),true)
-ifeq ($(TARGET_ARCH),x86)
 LOCAL_PATH := $(call my-dir)
@@ -33,6 +32,7 @@ include $(CLEAR_VARS)
 LOCAL_MODULE := mesa_gen_matypes
 LOCAL_IS_HOST_MODULE := true
+LOCAL_MULTILIB := 32
 LOCAL_C_INCLUDES := \
    $(MESA_TOP)/src/mapi \
@@ -44,5 +44,4 @@ LOCAL_SRC_FILES := \
 include $(MESA_COMMON_MK)
 include $(BUILD_HOST_EXECUTABLE)
-endif # x86
 endif # MESA_ENABLE_ASM
diff --git a/src/mesa/main/imports.h b/src/mesa/main/imports.h
index 29f2499..627fbb8 100644
--- a/src/mesa/main/imports.h
+++ b/src/mesa/main/imports.h
@@ -179,7 +179,7 @@ static inline int IROUND_POS(float f)
  */
 static inline int F_TO_I(float f)
 {
-#if defined(USE_X86_ASM) && defined(__GNUC__) && defined(__i386__)
+#if defined(USE_X86_ASM) && (defined(__GNUC__) || defined(ANDROID)) && defined(__i386__)
    int r;
    __asm__ ("fistpl %0" : "=m" (r) : "t" (f) : "st");
    return r;
@@ -201,7 +201,7 @@ static inline int F_TO_I(float f)
 /** Return (as an integer) floor of float */
 static inline int IFLOOR(float f)
 {
-#if defined(USE_X86_ASM) && defined(__GNUC__) && defined(__i386__)
+#if defined(USE_X86_ASM) && (defined(__GNUC__) || defined(ANDROID)) && defined(__i386__)
    /*
     *
[Mesa-dev] [PATCH 6/6] android: re-build all mesa binaries properly
The clean steps ensure both 32-bit and 64-bit objects are cleaned.

Signed-off-by: Chih-Wei Huang cwhu...@linux.org.tw
---
 CleanSpec.mk | 8
 1 file changed, 8 insertions(+)

diff --git a/CleanSpec.mk b/CleanSpec.mk
index 820a1c7..2068163 100644
--- a/CleanSpec.mk
+++ b/CleanSpec.mk
@@ -5,3 +5,11 @@ $(call add-clean-step, rm -rf $(PRODUCT_OUT)/obj/SHARED_LIBRARIES/libGLES_mesa_i
 $(call add-clean-step, rm -rf $(OUT_DIR)/host/$(HOST_OS)-$(HOST_ARCH)/obj/EXECUTABLES/mesa_*_intermediates)
 $(call add-clean-step, rm -rf $(OUT_DIR)/host/$(HOST_OS)-$(HOST_ARCH)/obj/EXECUTABLES/glsl_compiler_intermediates)
 $(call add-clean-step, rm -rf $(OUT_DIR)/host/$(HOST_OS)-$(HOST_ARCH)/obj/STATIC_LIBRARIES/libmesa_glsl_utils_intermediates)
+
+$(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/STATIC_LIBRARIES/libmesa_*_intermediates)
+$(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/SHARED_LIBRARIES/i9?5_dri_intermediates)
+$(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/SHARED_LIBRARIES/libglapi_intermediates)
+$(call add-clean-step, rm -rf $(PRODUCT_OUT)/*/SHARED_LIBRARIES/libGLES_mesa_intermediates)
+$(call add-clean-step, rm -rf $(HOST_OUT_release)/*/EXECUTABLES/mesa_*_intermediates)
+$(call add-clean-step, rm -rf $(HOST_OUT_release)/*/EXECUTABLES/glsl_compiler_intermediates)
+$(call add-clean-step, rm -rf $(HOST_OUT_release)/*/STATIC_LIBRARIES/libmesa_*_intermediates)
--
1.9.1
Re: [Mesa-dev] [PATCH v2 2/3] i965/fs: Change SEL and MOV types as needed to propagate source modifiers
On Fri, Apr 3, 2015 at 2:06 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
SEL and MOV instructions, as long as they don't have source modifiers, are just copying bits around. This commit adds support to copy propagation to switch the type of a SEL or MOV instruction as needed so that it can propagate source modifiers. This is needed because NIR generates integer SEL and MOV instructions whenever it doesn't know what else to generate.

shader-db results with NIR:

total FS instructions in shared programs: 4360910 - 4360186 (-0.02%)
FS instructions in affected programs: 59094 - 58370 (-1.23%)
helped: 341
HURT: 0
GAINED: 2
LOST: 0
---
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 33 +++---
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index e8d092c..d321509 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -275,6 +275,16 @@ is_logic_op(enum opcode opcode)
            opcode == BRW_OPCODE_NOT);
 }
+static bool
+can_change_source_types(fs_inst *inst)
+{
+   return !inst->src[0].abs && !inst->src[0].negate &&
+          (inst->opcode == BRW_OPCODE_MOV ||
+           (inst->opcode == BRW_OPCODE_SEL &&
+            inst->conditional_mod == BRW_CONDITIONAL_NONE &&
+            !inst->src[1].abs && !inst->src[1].negate));
+}
+
 bool
 fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
 {
@@ -346,7 +356,9 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
         type_sz(inst->src[arg].type)) % type_sz(entry->src.type) != 0)
       return false;
-   if (has_source_modifiers && entry->dst.type != inst->src[arg].type)
+   if (has_source_modifiers &&
+       entry->dst.type != inst->src[arg].type &&
+       !can_change_source_types(inst))
       return false;
    if (brw->gen >= 8 && (entry->src.negate || entry->src.abs) &&
@@ -412,9 +424,22 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
       break;
    }
-   if (!inst->src[arg].abs) {
-      inst->src[arg].abs = entry->src.abs;
-      inst->src[arg].negate ^= entry->src.negate;
+   if (has_source_modifiers) {
+      if (entry->dst.type != inst->src[arg].type) {
+         /* We are propagating source modifiers from a MOV with a different

from a MOV or SEL ?

+          * type. If we got here, then we can just change the source and
+          * destination types of the instruction and keep going.
+          */
+         assert(can_change_source_types(inst));
+         for (int i = 0; i < inst->sources; i++)
+            inst->src[i].type = entry->dst.type;
+         inst->dst.type = entry->dst.type;
+      }
+
+      if (!inst->src[arg].abs) {
+         inst->src[arg].abs = entry->src.abs;
+         inst->src[arg].negate ^= entry->src.negate;
+      }
    }
    return true;
--
2.3.4

Patches 1-2 are:
Reviewed-by: Anuj Phogat anuj.pho...@gmail.com
Re: [Mesa-dev] [PATCH 09/13] SQUASH: i965/fs: Rework fs_visitor::lower_load_payload
Jason Ekstrand ja...@jlekstrand.net writes: On Fri, Apr 3, 2015 at 8:37 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: On Fri, Apr 3, 2015 at 7:28 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: On Thu, Apr 2, 2015 at 3:01 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: Instead of the complicated and broken-by-design pile of heuristics we had before, we now have a straightforward lowering: 1) All header sources are copied directly using force_writemask_all and, since they are guaranteed to be a single register, there are no force_sechalf issues. 2) All non-header sources are copied using the exact same force_sechalf and saturate modifiers as the LOAD_PAYLOAD operation itself. 3) In order to accommodate older gens that need interleaved colors, lower_load_payload detects when the destination is a COMPR4 register and automatically interleaves the non-header sources. The lower_load_payload pass does the right thing here regardless of whether or not the hardware actually supports COMPR4. I had a quick glance at the series and it seems to be going in the right direction. One thing I honestly don't like is the ad-hoc and IMHO premature treatment of payload headers, it still feels like the LOAD_PAYLOAD instruction has more complex semantics than necessary and the benefit is unclear to me. I suppose that your motivation was to avoid setting force_writemask_all in LOAD_PAYLOAD instructions with header. The optimizer should be able to cope with those flags and get rid of them from the resulting moves where they are redundant, and if it's not able to it looks like something that should be fixed anyway. The explicit handling of headers is responsible for much of the churn in this series and is likely to complicate users of LOAD_PAYLOAD and optimization passes that have to manipulate them. 
Avoiding force_writemask_all is only half of the motivation and the small half at that. A header source, more properly defined, is a single physical register that, conceptually, applies to all channels. Effectively, a header source (I should have stated this clearly) has two properties:
1) It has force_writemask_all set
2) It is exactly one physical hardware register.
This second property is the more important of the two. Most of the disaster of the previous LOAD_PAYLOAD implementation was that we did a pile of guesswork and had an ill-conceived effective width thing in order to figure out how big the register actually was. Making the user specify which sources are header sources eliminates that guesswork. It also has the nice side-effect that we can do the right force_writemask_all and we can properly handle COMPR4 for the user.

Ok, Allow me to be a bit more explicit as to what all we need to keep track of:
1) How big is the source register for real. Even immediates can end up being two registers in the copy.
2) Do we want force_writemask_all?
3) If not, do we want force_sechalf?
4) On g45 and gen5, we want to use COMPR4 for interlaced movs
5) When lowering, we want to use 16-wide moves when possible in SIMD16
With the patch series I sent, all of this is explicit except for COMPR4 which is, admittedly, kind of magic. "Which of these sources are headers?" is a reasonable question for the caller to answer. It knows explicitly and it would take the LOAD_PAYLOAD helper some work to guess it correctly. Another option would be to guess that based on exec sizes but then the caller has to know not to pass in the wrong register type or the guess will be wrong. I like explicit.

I'm not advocating LOAD_PAYLOAD to perform any guesswork, I'm advocating it to be more stupid -- Let it do as much as it can sensibly do with the information it already has, and no more.
Yeah, true, but this seems like the least orthogonal and most annoying to use solution for this problem, it forces the caller to provide redundant information, it takes into account the saturate flag on some arguments and not others, it shuffles sources with respect to the specified order when COMPR4 is set, but only for the first four non-header sources. I think any of the following solutions would be better-behaved than the current approach:

I don't know that saying which sources are headers is really redundant. It's explicit which is what we want. Yes, the COMPR4 thing is a bit magical but we have to do COMPR4 in lower_load_payload so we have to have some way of doing it and this method puts the interleaving code in one place instead of two.

Well, at least with the previous approach LOAD_PAYLOAD had consistent (if broken) semantics across its arguments, and regardless of COMPR4 being used or not, which IMHO is preferable to the modest code saving.

To be clear, I don't really like the way I
[Mesa-dev] [PATCH v2 1/3] i965/fs: Use the source type when looking for UD negations in copy prop
There can be problems with floats and conditional modifiers when
copy-propagating a negated UD source.  Previously, we checked the source to
be copied for the negate and then checked the source being propagated to
for the type.  This isn't quite what we want because we are really just
looking for negated UD sources.  A check later in the file ensures that
both ends of the propagate have the right type so it works.  However, if we
relax the restriction that both ends of the propagation have the same type,
it ends up causing us to bail early in cases we don't want.
---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 764741d..e8d092c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -307,7 +307,7 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
     * instead.  See also resolve_ud_negate() and comment in
     * fs_generator::generate_code.
     */
-   if (inst->src[arg].type == BRW_REGISTER_TYPE_UD &&
+   if (entry->src.type == BRW_REGISTER_TYPE_UD &&
        entry->src.negate)
       return false;

--
2.3.4
Re: [Mesa-dev] [PATCH 1/4] nir: Add a src_get_parent_instr function
On Fri, Apr 3, 2015 at 12:24 PM, Jordan Justen jordan.l.jus...@intel.com wrote:
> 1-3 Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

Thanks!  I'm actually in the process of replacing 3 with what I think is a
better long-term solution.  I'll put your R-B on 1 and 2 when I send the v2.

> Shouldn't we hold off on 4 given the lost count?

Maybe.  That's also the one that gives us a nice benefit, and the shaders it
loses are all shaders we don't have in GLSL IR anyway, presumably because
it's also lowering the ifs to selects and running into the same scheduling
problems.

--Jason

> On 2015-04-02 21:05:22, Jason Ekstrand wrote:
>> ---
>>  src/glsl/nir/nir.h | 10 ++
>>  .../drivers/dri/i965/brw_nir_analyze_boolean_resolves.c | 16 ++--
>>  2 files changed, 12 insertions(+), 14 deletions(-)
>>
>> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
>> index 24deb82..94b0f49 100644
>> --- a/src/glsl/nir/nir.h
>> +++ b/src/glsl/nir/nir.h
>> @@ -529,6 +529,16 @@ nir_src_for_reg(nir_register *reg)
>>     return src;
>>  }
>>
>> +static inline nir_instr *
>> +nir_src_get_parent_instr(const nir_src *src)
>> +{
>> +   if (src->is_ssa) {
>> +      return src->ssa->parent_instr;
>> +   } else {
>> +      return src->reg.reg->parent_instr;
>> +   }
>> +}
>> +
>>  static inline nir_dest
>>  nir_dest_for_reg(nir_register *reg)
>>  {
>> diff --git a/src/mesa/drivers/dri/i965/brw_nir_analyze_boolean_resolves.c b/src/mesa/drivers/dri/i965/brw_nir_analyze_boolean_resolves.c
>> index 3a27cf1..f0b018c 100644
>> --- a/src/mesa/drivers/dri/i965/brw_nir_analyze_boolean_resolves.c
>> +++ b/src/mesa/drivers/dri/i965/brw_nir_analyze_boolean_resolves.c
>> @@ -43,13 +43,7 @@
>>  static uint8_t
>>  get_resolve_status_for_src(nir_src *src)
>>  {
>> -   nir_instr *src_instr;
>> -   if (src->is_ssa) {
>> -      src_instr = src->ssa->parent_instr;
>> -   } else {
>> -      src_instr = src->reg.reg->parent_instr;
>> -   }
>> -
>> +   nir_instr *src_instr = nir_src_get_parent_instr(src);
>>     if (src_instr) {
>>        uint8_t resolve_status = src_instr->pass_flags & BRW_NIR_BOOLEAN_MASK;
>>
>> @@ -72,13 +66,7 @@ get_resolve_status_for_src(nir_src *src)
>>  static bool
>>  src_mark_needs_resolve(nir_src *src, void *void_state)
>>  {
>> -   nir_instr *src_instr;
>> -   if (src->is_ssa) {
>> -      src_instr = src->ssa->parent_instr;
>> -   } else {
>> -      src_instr = src->reg.reg->parent_instr;
>> -   }
>> -
>> +   nir_instr *src_instr = nir_src_get_parent_instr(src);
>>     if (src_instr) {
>>        uint8_t resolve_status = src_instr->pass_flags & BRW_NIR_BOOLEAN_MASK;
>>
>> --
>> 2.3.4
[Mesa-dev] [PATCH v2 3/3] nir: Allow abs/neg in select peephole pass.
From: Matt Turner matts...@gmail.com

total instructions in shared programs: 4314531 -> 4308949 (-0.13%)
instructions in affected programs: 429085 -> 423503 (-1.30%)
helped: 1680
HURT: 0
GAINED: 0
LOST: 111

Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
---
 src/glsl/nir/nir_opt_peephole_select.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/glsl/nir/nir_opt_peephole_select.c b/src/glsl/nir/nir_opt_peephole_select.c
index b89451b..f400cfd 100644
--- a/src/glsl/nir/nir_opt_peephole_select.c
+++ b/src/glsl/nir/nir_opt_peephole_select.c
@@ -84,7 +84,9 @@ block_check_for_allowed_instrs(nir_block *block)
       case nir_instr_type_alu: {
          /* It must be a move operation */
          nir_alu_instr *mov = nir_instr_as_alu(instr);
-         if (mov->op != nir_op_fmov && mov->op != nir_op_imov)
+         if (mov->op != nir_op_fmov && mov->op != nir_op_imov &&
+             mov->op != nir_op_fneg && mov->op != nir_op_ineg &&
+             mov->op != nir_op_fabs && mov->op != nir_op_iabs)
             return false;

          /* Can't handle saturate */
--
2.3.4
[Mesa-dev] [PATCH v2 2/3] i965/fs: Change SEL and MOV types as needed to propagate source modifiers
SEL and MOV instructions, as long as they don't have source modifiers, are
just copying bits around.  This commit adds support to copy propagation to
switch the type of a SEL or MOV instruction as needed so that it can
propagate source modifiers.  This is needed because NIR generates integer
SEL and MOV instructions whenever it doesn't know what else to generate.

shader-db results with NIR:
total FS instructions in shared programs: 4360910 -> 4360186 (-0.02%)
FS instructions in affected programs: 59094 -> 58370 (-1.23%)
helped: 341
HURT: 0
GAINED: 2
LOST: 0
---
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 33 +++--
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index e8d092c..d321509 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -275,6 +275,16 @@ is_logic_op(enum opcode opcode)
            opcode == BRW_OPCODE_NOT);
 }

+static bool
+can_change_source_types(fs_inst *inst)
+{
+   return !inst->src[0].abs && !inst->src[0].negate &&
+          (inst->opcode == BRW_OPCODE_MOV ||
+           (inst->opcode == BRW_OPCODE_SEL &&
+            inst->conditional_mod == BRW_CONDITIONAL_NONE &&
+            !inst->src[1].abs && !inst->src[1].negate));
+}
+
 bool
 fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
 {
@@ -346,7 +356,9 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
        type_sz(inst->src[arg].type)) % type_sz(entry->src.type) != 0)
       return false;

-   if (has_source_modifiers && entry->dst.type != inst->src[arg].type)
+   if (has_source_modifiers &&
+       entry->dst.type != inst->src[arg].type &&
+       !can_change_source_types(inst))
       return false;

    if (brw->gen >= 8 && (entry->src.negate || entry->src.abs) &&
@@ -412,9 +424,22 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
       break;
    }

-   if (!inst->src[arg].abs) {
-      inst->src[arg].abs = entry->src.abs;
-      inst->src[arg].negate ^= entry->src.negate;
+   if (has_source_modifiers) {
+      if (entry->dst.type != inst->src[arg].type) {
+         /* We are propagating source modifiers from a MOV with a different
+          * type.  If we got here, then we can just change the source and
+          * destination types of the instruction and keep going.
+          */
+         assert(can_change_source_types(inst));
+         for (int i = 0; i < inst->sources; i++)
+            inst->src[i].type = entry->dst.type;
+         inst->dst.type = entry->dst.type;
+      }
+
+      if (!inst->src[arg].abs) {
+         inst->src[arg].abs = entry->src.abs;
+         inst->src[arg].negate ^= entry->src.negate;
+      }
    }

    return true;
--
2.3.4
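The rule this patch adds can be tried out in isolation. What follows is only a minimal sketch, assuming simplified stand-in types: Inst, Src, RegType, Opcode, CondMod and propagate_modifiers are hypothetical names, not the real fs_inst/fs_reg API, and only the modifier handling of try_copy_propagate is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins for the i965 IR types.
enum class Opcode { MOV, SEL, ADD };
enum class RegType { F, D, UD };
enum class CondMod { NONE, GE };

struct Src {
   RegType type = RegType::D;
   bool abs = false;
   bool negate = false;
};

struct Inst {
   Opcode opcode = Opcode::MOV;
   CondMod conditional_mod = CondMod::NONE;
   RegType dst_type = RegType::D;
   std::vector<Src> src;
};

// A MOV, or a SEL without a conditional modifier, only copies bits as long
// as none of its sources carries a modifier, so retyping it is harmless.
static bool can_change_source_types(const Inst &inst)
{
   for (const Src &s : inst.src)
      if (s.abs || s.negate)
         return false;
   if (inst.opcode == Opcode::MOV)
      return true;
   return inst.opcode == Opcode::SEL &&
          inst.conditional_mod == CondMod::NONE;
}

// Fold the modifiers of a copy "entry_dst = (mods) entry_src" into source
// `arg` of `inst`.  Returns false when the propagation has to be skipped.
static bool propagate_modifiers(Inst &inst, std::size_t arg,
                                RegType entry_dst_type, const Src &entry_src)
{
   const bool has_source_modifiers = entry_src.abs || entry_src.negate;
   if (!has_source_modifiers)
      return true; // nothing to fold in this simplified model

   if (entry_dst_type != inst.src[arg].type) {
      if (!can_change_source_types(inst))
         return false;
      // Retype every source and the destination to match the copy.
      for (Src &s : inst.src)
         s.type = entry_dst_type;
      inst.dst_type = entry_dst_type;
   }

   if (!inst.src[arg].abs) {
      inst.src[arg].abs = entry_src.abs;
      inst.src[arg].negate ^= entry_src.negate;
   }
   return true;
}
```

In this model an integer SEL fed by a negated float copy is retyped to float and picks up the negate, while an ADD with mismatched types is left alone.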
Re: [Mesa-dev] [PATCH] Fix automatic indentation mode for recent emacs, use fewer columns in .git
On 2015-04-02 14:38:53, Carl Worth wrote:
> I recently noticed (after upgrading to emacs 24?) that I was no longer
> getting automatic C-style settings in emacs like I was accustomed to
> getting.  That is, I was now getting a default indentation of 8 and
> indentation with tabs instead of spaces.
>
> It appears that the .dir-locals.el file is no longer taking effect.
> Presumably, emacs was previously using prog-mode for C and C++ source
> files but is now using a mode with some other name?  I didn't chase down
> the name of the current mode, but just using nil makes these variables get
> set on all files, (which should be mostly harmless), and should be
> compatible with both old and new emacs.
>
> I did verify that the later change in this file (to indent with tabs when
> in makefile-mode) still takes precedence as desired.
>
> While editing these files, I've also set things up to use a smaller value
> for fill-column when editing a file within the .git directory.  This will
> help avoid commit messages getting wrapped when "git log" adds some extra
> indentation.
>
> Note: If this change causes .dir-locals.el to take effect for someone when
> it never had before, then emacs may prompt about the potentially unsafe
> eval block here.  User can reply to that prompt with "!" to permanently
> whitelist this particular eval block as safe so that prompt will not be
> seen again in the future.
--- .dir-locals.el| 4 ++-- src/gallium/drivers/freedreno/.dir-locals.el | 2 +- src/gallium/drivers/r600/.dir-locals.el | 2 +- src/gallium/drivers/radeon/.dir-locals.el | 2 +- src/gallium/drivers/radeonsi/.dir-locals.el | 2 +- src/gallium/drivers/vc4/.dir-locals.el| 2 +- src/gallium/drivers/vc4/kernel/.dir-locals.el | 2 +- src/gallium/winsys/radeon/.dir-locals.el | 2 +- src/mesa/drivers/dri/nouveau/.dir-locals.el | 2 +- 9 files changed, 10 insertions(+), 10 deletions(-) diff --git a/.dir-locals.el b/.dir-locals.el index d95eb48..f44d964 100644 --- a/.dir-locals.el +++ b/.dir-locals.el @@ -1,12 +1,12 @@ -((prog-mode +((nil (indent-tabs-mode . nil) (tab-width . 8) (c-basic-offset . 3) (c-file-style . stroustrup) - (fill-column . 78) Do we want to remove this? Or does it match the default? (eval . (progn (c-set-offset 'innamespace '0) (c-set-offset 'inline-open '0))) ) + (.git (nil (fill-column . 70))) Should the commit subject line be under 70 characters? I notice that yours is 74. :) https://www.kernel.org/doc/Documentation/SubmittingPatches says the subject line should be limited to 70-75 characters. It seems like the same should be used for subsequent line wrapping. Maybe the .git part should be moved that into a separate patch? Not a huge deal though, so, Reviewed-by: Jordan Justen jordan.l.jus...@intel.com (makefile-mode (indent-tabs-mode . t)) ) diff --git a/src/gallium/drivers/freedreno/.dir-locals.el b/src/gallium/drivers/freedreno/.dir-locals.el index aa20d49..c26578b 100644 --- a/src/gallium/drivers/freedreno/.dir-locals.el +++ b/src/gallium/drivers/freedreno/.dir-locals.el @@ -1,4 +1,4 @@ -((prog-mode +((nil (indent-tabs-mode . true) (tab-width . 4) (c-basic-offset . 4) diff --git a/src/gallium/drivers/r600/.dir-locals.el b/src/gallium/drivers/r600/.dir-locals.el index 4e35c12..8be6a30 100644 --- a/src/gallium/drivers/r600/.dir-locals.el +++ b/src/gallium/drivers/r600/.dir-locals.el @@ -1,4 +1,4 @@ -((prog-mode +((nil (indent-tabs-mode . true) (tab-width . 
8) (c-basic-offset . 8) diff --git a/src/gallium/drivers/radeon/.dir-locals.el b/src/gallium/drivers/radeon/.dir-locals.el index 4e35c12..8be6a30 100644 --- a/src/gallium/drivers/radeon/.dir-locals.el +++ b/src/gallium/drivers/radeon/.dir-locals.el @@ -1,4 +1,4 @@ -((prog-mode +((nil (indent-tabs-mode . true) (tab-width . 8) (c-basic-offset . 8) diff --git a/src/gallium/drivers/radeonsi/.dir-locals.el b/src/gallium/drivers/radeonsi/.dir-locals.el index 4e35c12..8be6a30 100644 --- a/src/gallium/drivers/radeonsi/.dir-locals.el +++ b/src/gallium/drivers/radeonsi/.dir-locals.el @@ -1,4 +1,4 @@ -((prog-mode +((nil (indent-tabs-mode . true) (tab-width . 8) (c-basic-offset . 8) diff --git a/src/gallium/drivers/vc4/.dir-locals.el b/src/gallium/drivers/vc4/.dir-locals.el index ac94242..ed10dc2 100644 --- a/src/gallium/drivers/vc4/.dir-locals.el +++ b/src/gallium/drivers/vc4/.dir-locals.el @@ -1,4 +1,4 @@ -((prog-mode +((nil (indent-tabs-mode . nil) (tab-width . 8) (c-basic-offset . 8) diff --git a/src/gallium/drivers/vc4/kernel/.dir-locals.el b/src/gallium/drivers/vc4/kernel/.dir-locals.el index 49403de..2e58e90 100644 --- a/src/gallium/drivers/vc4/kernel/.dir-locals.el +++ b/src/gallium/drivers/vc4/kernel/.dir-locals.el @@ -1,4 +1,4 @@ -((prog-mode +((nil (indent-tabs-mode . t) (tab-width . 8) (c-basic-offset . 8)
[Mesa-dev] [PATCH] nir/lower_samplers: Use the right memory context for realloc'ing tex sources
As of da5ec2a, we allocate instruction sources out of the instruction
itself.  When we realloc the texture sources we need to use the right
memory context or ralloc will get angry and assert-fail.

Cc: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/nir/nir_lower_samplers.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/nir/nir_lower_samplers.cpp b/src/glsl/nir/nir_lower_samplers.cpp
index 3015dbd..1e509a9 100644
--- a/src/glsl/nir/nir_lower_samplers.cpp
+++ b/src/glsl/nir/nir_lower_samplers.cpp
@@ -90,7 +90,7 @@ lower_sampler(nir_tex_instr *instr, struct gl_shader_program *shader_program,
             ralloc_asprintf_append(&name, "[%u]", deref_array->base_offset);
             break;
          case nir_deref_array_type_indirect: {
-            instr->src = reralloc(mem_ctx, instr->src, nir_tex_src,
+            instr->src = reralloc(instr, instr->src, nir_tex_src,
                                   instr->num_srcs + 1);
             memset(&instr->src[instr->num_srcs], 0, sizeof *instr->src);
             instr->src[instr->num_srcs].src_type = nir_tex_src_sampler_offset;
--
2.3.4
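The failure mode fixed here can be illustrated with a toy model. This is not ralloc and uses none of its API: ToyArena, alloc and realloc_checked are hypothetical names, and the sketch only assumes what the commit message says, namely that a reallocation must name the context the block was originally allocated from.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>
#include <unordered_map>

// Toy model only: every allocation remembers its parent context, and a
// reallocation that names a different parent is rejected.
class ToyArena {
public:
   void *alloc(const void *parent, std::size_t size)
   {
      void *p = std::malloc(size);
      if (!p)
         throw std::bad_alloc();
      parent_of_[p] = parent;
      return p;
   }

   void *realloc_checked(const void *ctx, void *ptr, std::size_t size)
   {
      auto it = parent_of_.find(ptr);
      if (it == parent_of_.end() || it->second != ctx)
         throw std::logic_error("realloc with the wrong memory context");
      void *q = std::realloc(ptr, size);
      if (!q)
         throw std::bad_alloc();
      // The block may have moved, so re-register it under the same parent.
      parent_of_.erase(it);
      parent_of_[q] = ctx;
      return q;
   }

   ~ToyArena()
   {
      for (auto &kv : parent_of_)
         std::free(kv.first);
   }

private:
   std::unordered_map<void *, const void *> parent_of_;
};
```

With sources allocated out of the instruction, growing them against the shader-level context is rejected in this model, while naming the instruction as the context succeeds, which mirrors the one-line change in the patch.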
Re: [Mesa-dev] [PATCH 1/2] egl/dri2: implement platform_null (v2).
On Fri, Apr 3, 2015 at 1:35 PM, Chad Versace chad.vers...@intel.com wrote: Time to revive this patch! Why? - I don't like large patchsets living in Chrome OS for too long. - Frank submitted Waffle patches to support this, and I hesitate to add Waffle support unless the platform is upstream. Of course, the patch no longer applies to master. So I rebased and pushed it to a personal branch: git fetch git://github.com/chadversary/mesa refs/heads/cooking/hshi/egl-platform-null git checkout FETCH_HEAD https://github.com/chadversary/mesa/tree/cooking/hshi/egl-platform-null On Tue 17 Feb 2015, Haixia Shi wrote: The NULL platform is for off-screen rendering only. Render node support is required. Naming it the NULL platform seems very odd. It actually *does* stuff. Usually things named null do nothing. Also, this platform is not unique in its NULL requirement. That is, there do exist other EGL platforms in which eglGetDisplay accepts only NULL, such as Android. I strongly suspect this is also the case for other lesser known operating systems. But there isn't an obviously better name. Me and Jordan chatted about this for a long time and came to no conclusion. Usually, an EGL platform is named after (1) the operating system, (2) the underlying window system, or (3) the real type of EGLNativeDisplayType. None of those precedents help here. However, this platform does have a single, unique property that distinguishes it from all other EGL platforms: ___this is the only platform that has no support for EGLSurface___. Perhaps we should name the platform after that property? Perhaps EGL_MESA_platform_surfaceless and platform_surfaceless.c? That's a very good name. As it happens, it also matches Chrome's naming. Only consider the render nodes. Do not use normal nodes as they require auth hooks. I agree with that decision. The platform should not fallback to /dev/dri/card*, at least not in this initial patch. That adds too much complication. 
I have comments below on how the rendernode gets selected. Signed-off-by: Haixia Shi h...@chromium.org --- src/egl/drivers/dri2/Makefile.am | 5 ++ src/egl/drivers/dri2/egl_dri2.c | 13 ++- src/egl/drivers/dri2/egl_dri2.h | 3 + src/egl/drivers/dri2/platform_null.c | 169 +++ 4 files changed, 187 insertions(+), 3 deletions(-) create mode 100644 src/egl/drivers/dri2/platform_null.c diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c index 86e5f24..6ed137e 100644 --- a/src/egl/drivers/dri2/egl_dri2.c +++ b/src/egl/drivers/dri2/egl_dri2.c @@ -632,6 +632,13 @@ dri2_initialize(_EGLDriver *drv, _EGLDisplay *disp) return EGL_FALSE; switch (disp-Platform) { +#ifdef HAVE_NULL_PLATFORM + case _EGL_PLATFORM_NULL: + if (disp-Options.TestOnly) + return EGL_TRUE; + return dri2_initialize_null(drv, disp); +#endif The platform has a major deficiency in this hunk, but I don't think it needs fixing in this initial patch. Due to the way eglGetDisplay(EGLNativeDisplay dpy) auto-detects the real type of dpy, it's impossible to build Mesa with platform_null and platform_x11 enabled and actually have both usable. eglGetDisplay will work for only one, and the one that works will be the first that occurs in the platform list given to --with-egl-platforms=... . In other words, --with-egl-platforms=x11,null = eglGetDisplay(NULL) opens the default X11 display --with-egl-platforms=null,x11 = eglGetDisplay(NULL) opens a render node Again, I don't think you need to fix this in the initial patch. The proper solution is to implement a platform extension, like EGL_MESA_platform_gbm [1], which can be done in a follow-up patch. Without a platform extension, distro packagers and most Mesa developers will not be able to ever test this platform. 
[1] https://www.khronos.org/registry/egl/extensions/MESA/EGL_MESA_platform_gbm.txt diff --git a/src/egl/drivers/dri2/platform_null.c b/src/egl/drivers/dri2/platform_null.c new file mode 100644 index 000..55ceab6 --- /dev/null +++ b/src/egl/drivers/dri2/platform_null.c @@ -0,0 +1,169 @@ +static __DRIbuffer * +null_get_buffers_with_format(__DRIdrawable * driDrawable, + int *width, int *height, + unsigned int *attachments, int count, + int *out_count, void *loaderPrivate) +{ + struct dri2_egl_surface *dri2_surf = loaderPrivate; + struct dri2_egl_display *dri2_dpy = + dri2_egl_display(dri2_surf-base.Resource.Display); dri2_dpy is unused. + + dri2_surf-buffer_count = 1; + if (width) + *width = dri2_surf-base.Width; + if (height) + *height = dri2_surf-base.Height; + *out_count = dri2_surf-buffer_count;; + return dri2_surf-buffers; +} + +#define DRM_RENDER_DEV_NAME %s/renderD%d + +EGLBoolean
Re: [Mesa-dev] Building Mesa for Windows using Visual Studio
Thank you for the useful information. I was able to build Mesa with VS 2013 with a similar scheme with scons.

Thanks,
Shervin

On Fri, Apr 3, 2015 at 7:01 AM, Emil Velikov emil.l.veli...@gmail.com wrote:
> On 3 April 2015 at 14:43, Predut, Marius marius.pre...@intel.com wrote:
>>> Just a couple of small details - mesa has a fall-back for the mentioned
>>> functions (plus others) in $(top)/include/*h.
>>>
>>> That said, I believe that the overall consensus is that building mesa
>>> with MSVC 2008 is the bare minimum, with MSVC 2013 strongly recommended.
>>> Afaik, as the VMWare guys give us the go ahead we'll drop all the
>>> workarounds for pre-2013 versions and bump the requirement.
>>>
>>> Cheers,
>>> Emil
>>
>> Hmm, nice to know, but in this case the build system have to take in
>> consideration and this is a bug or, by hand we should copy headers?
>
> Not sure I fully understand your statement here.
>
> Currently there are two types of headers - A) provide official (like)
> implementation, and B) that wrap around existing ones.
>
> Example:
> A) include/c99/* provides stdint.h & co for MSVC 2012 and older as they
> lack the headers.
> B) include/c99_math.h provides compat wrapper, as MSVC 2013's math.h is
> not in C99 land yet.
>
> About using those - everything is handled already. Where needed the extra
> include is added by the build (for A) and where the code is known to be
> built with MSVC a header from B is used rather than the system one.
>
> Hope that clears things up.
> Emil
[Mesa-dev] [PATCH] nv50: allocate more offset space for occlusion queries
Commit 1a170980a09 started writing to q->data[4]/[5] but kept the per-query
space at 16, which meant that in some cases we would write past the end of
the buffer.  Rotate by 32, like nvc0 does.

Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
Tested-by: Nick Tenney nick.ten...@gmail.com
Cc: 10.4 10.5 mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/nouveau/nv50/nv50_query.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c b/src/gallium/drivers/nouveau/nv50/nv50_query.c
index e81ac5a..6a23de4 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_query.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c
@@ -116,8 +116,8 @@ nv50_query_create(struct pipe_context *pipe, unsigned type, unsigned index)
    q->type = type;

    if (q->type == PIPE_QUERY_OCCLUSION_COUNTER) {
-      q->offset -= 16;
-      q->data -= 16 / sizeof(*q->data); /* we advance before query_begin ! */
+      q->offset -= 32;
+      q->data -= 32 / sizeof(*q->data); /* we advance before query_begin ! */
    }

    return (struct pipe_query *)q;
@@ -150,8 +150,8 @@ nv50_query_begin(struct pipe_context *pipe, struct pipe_query *pq)
     * initialized it to TRUE.
     */
    if (q->type == PIPE_QUERY_OCCLUSION_COUNTER) {
-      q->offset += 16;
-      q->data += 16 / sizeof(*q->data);
+      q->offset += 32;
+      q->data += 32 / sizeof(*q->data);
       if (q->offset - q->base == NV50_QUERY_ALLOC_SPACE)
          nv50_query_allocate(nv50, q, NV50_QUERY_ALLOC_SPACE);

--
2.0.5
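The arithmetic behind the overflow can be checked directly. The sketch below assumes that query results are stored as 32-bit words (the patch divides byte counts by sizeof(*q->data), but the element type is not shown in this message); bytes_needed and fits_in_slot are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: one result word is a 32-bit value.
constexpr std::size_t kWordSize = sizeof(std::uint32_t);

// Number of bytes a slot must span so that word index `max_index` is inside.
constexpr std::size_t bytes_needed(std::size_t max_index)
{
   return (max_index + 1) * kWordSize;
}

// True when writing data[max_index] stays within a slot of `stride` bytes.
constexpr bool fits_in_slot(std::size_t max_index, std::size_t stride)
{
   return bytes_needed(max_index) <= stride;
}

static_assert(bytes_needed(5) == 24, "data[0..5] covers 24 bytes");
static_assert(!fits_in_slot(5, 16), "a 16-byte slot ends after data[3]");
static_assert(fits_in_slot(5, 32), "a 32-byte slot holds data[0..5]");
```

Under that assumption, data[4] and data[5] land in the following 16-byte slot, or past the end of the buffer for the last slot, which is why the stride has to grow to 32.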
Re: [Mesa-dev] [PATCH] nir/lower_samplers: Use the right memory context for realloc'ing tex sources
On Friday, April 03, 2015 03:50:05 PM Jason Ekstrand wrote:
> As of da5ec2a, we allocate instruction sources out of the instruction
> itself.  When we realloc the texture sources we need to use the right
> memory context or ralloc will get angry and assert-fail
>
> Cc: Kenneth Graunke kenn...@whitecape.org
> ---
>  src/glsl/nir/nir_lower_samplers.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/glsl/nir/nir_lower_samplers.cpp b/src/glsl/nir/nir_lower_samplers.cpp
> index 3015dbd..1e509a9 100644
> --- a/src/glsl/nir/nir_lower_samplers.cpp
> +++ b/src/glsl/nir/nir_lower_samplers.cpp
> @@ -90,7 +90,7 @@ lower_sampler(nir_tex_instr *instr, struct gl_shader_program *shader_program,
>              ralloc_asprintf_append(&name, "[%u]", deref_array->base_offset);
>              break;
>           case nir_deref_array_type_indirect: {
> -            instr->src = reralloc(mem_ctx, instr->src, nir_tex_src,
> +            instr->src = reralloc(instr, instr->src, nir_tex_src,
>                                    instr->num_srcs + 1);
>              memset(&instr->src[instr->num_srcs], 0, sizeof *instr->src);
>              instr->src[instr->num_srcs].src_type = nir_tex_src_sampler_offset;

Oops, thanks!

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
Re: [Mesa-dev] [PATCH 09/13] SQUASH: i965/fs: Rework fs_visitor::lower_load_payload
Jason Ekstrand ja...@jlekstrand.net writes: On Thu, Apr 2, 2015 at 3:01 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: Instead of the complicated and broken-by-design pile of heuristics we had before, we now have a straightforward lowering: 1) All header sources are copied directly using force_writemask_all and, since they are guaranteed to be a single register, there are no force_sechalf issues. 2) All non-header sources are copied using the exact same force_sechalf and saturate modifiers as the LOAD_PAYLOAD operation itself. 3) In order to accommodate older gens that need interleaved colors, lower_load_payload detects when the destination is a COMPR4 register and automatically interleaves the non-header sources. The lower_load_payload pass does the right thing here regardless of whether or not the hardware actually supports COMPR4. I had a quick glance at the series and it seems to be going in the right direction. One thing I honestly don't like is the ad-hoc and IMHO premature treatment of payload headers, it still feels like the LOAD_PAYLOAD instruction has more complex semantics than necessary and the benefit is unclear to me. I suppose that your motivation was to avoid setting force_writemask_all in LOAD_PAYLOAD instructions with header. The optimizer should be able to cope with those flags and get rid of them from the resulting moves where they are redundant, and if it's not able to it looks like something that should be fixed anyway. The explicit handling of headers is responsible for much of the churn in this series and is likely to complicate users of LOAD_PAYLOAD and optimization passes that have to manipulate them. Avoiding force_writemask_all is only half of the motivation and the small half at that. A header source, more properly defined, is a single physical register that, conceptually, applies to all channels. 
> Effectively, a header source (I should have stated this clearly) has two
> properties:
> 1) It has force_writemask_all set
> 2) It is exactly one physical hardware register.
> This second property is the more important of the two. Most of the
> disaster of the previous LOAD_PAYLOAD implementation was that we did a
> pile of guesswork and had an ill-conceived effective width thing in order
> to figure out how big the register actually was. Making the user specify
> which sources are header sources eliminates that guesswork. It also has
> the nice side-effect that we can do the right force_writemask_all and we
> can properly handle COMPR4 for the user.

Yeah, true, but this seems like the least orthogonal and most annoying to use solution for this problem, it forces the caller to provide redundant information, it takes into account the saturate flag on some arguments and not others, it shuffles sources with respect to the specified order when COMPR4 is set, but only for the first four non-header sources. I think any of the following solutions would be better-behaved than the current approach:

1/ Use the source width to determine the size of each copy. This would imply that the source width carries semantic information and hence would have to be left alone by e.g. copy propagation.

2/ Use the instruction exec size and flags to determine the properties of *all* copies. This means that if a header is present the exec size would necessarily have to be 8 and the halves of a 16-wide register would have to be specified separately, which sounds annoying at first but in practice wouldn't necessarily be because it could be handled by the LOAD_PAYLOAD() helper based on the argument widths without running into problems with optimization passes changing the meaning of the instruction. The semantics of the instruction itself would be as stupid as possible, but the implementation could still trivially recognise 16-wide and COMPR4 copies using the exact same mechanism you are using now.
3/ Split LOAD_PAYLOAD into two separate instructions, each of them dead simple, say COLLECT and LOAD_HEADER. COLLECT dst, src0, ..., srcn would be equivalent to the LOAD_PAYLOAD instruction described in 2, but it would only be used to load full-width non-header sources of the payload, so you would avoid the need to split 16-wide registers in half. LOAD_HEADER dst, header, payload would handle the asymmetric requirements of prepending a header, like using 8-wide instead of exec_size-wide copies and setting force_writemask_all. You could use mlen to specify the size of payload as is usual for instructions taking a payload source.

> --Jason
>
>> Thanks for looking into this BTW.
>>
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_fs.cpp         | 154 ++-
>>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |   3 -
>>>  2 files changed, 80 insertions(+), 77 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
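
For readers following the thread, rules 1) and 2) of the lowering described in the commit message can be modelled roughly as below. This is only a sketch in Python, not Mesa code: the `Mov` record, the `lower_load_payload` function and its parameters are hypothetical names, it assumes 32-bit channels so that one register holds 8 of them, and it leaves out the COMPR4 interleaving of rule 3) entirely.

```python
from dataclasses import dataclass
from typing import List

REG_WIDTH = 8  # assumption: one physical register holds 8 channels of a 32-bit type


@dataclass
class Mov:
    """One copy emitted by the lowering (hypothetical model, not Mesa's fs_inst)."""
    dst_offset: int                 # destination offset inside the payload, in registers
    src: str                        # symbolic name of the source register
    exec_size: int                  # number of channels this copy writes
    force_writemask_all: bool = False
    force_sechalf: bool = False
    saturate: bool = False


def lower_load_payload(sources: List[str], header_size: int, exec_size: int,
                       force_sechalf: bool = False,
                       saturate: bool = False) -> List[Mov]:
    """Expand one LOAD_PAYLOAD into MOVs following rules 1) and 2) only."""
    movs = []
    offset = 0
    for i, src in enumerate(sources):
        if i < header_size:
            # Rule 1: a header is exactly one register, copied for all channels.
            movs.append(Mov(offset, src, REG_WIDTH, force_writemask_all=True))
            offset += 1
        else:
            # Rule 2: non-header sources inherit the instruction's own flags.
            movs.append(Mov(offset, src, exec_size,
                            force_sechalf=force_sechalf, saturate=saturate))
            offset += exec_size // REG_WIDTH
    return movs


if __name__ == "__main__":
    # Two header registers followed by two 16-wide colour sources.
    for mov in lower_load_payload(["hdr0", "hdr1", "r", "g"], header_size=2,
                                  exec_size=16, saturate=True):
        print(mov)
```

In this model the caller states `header_size` explicitly, which is the point under debate: the size of each copy never has to be guessed from the source width, at the cost of the caller supplying information that proposals 1/ to 3/ would derive differently.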
Re: [Mesa-dev] Building Mesa for Windows using Visual Studio
On 3 April 2015 at 14:43, Predut, Marius marius.pre...@intel.com wrote:
>> Just a couple of small details - mesa has a fall-back for the mentioned functions (plus others) in $(top)/include/*h.
>>
>> That said, I believe that the overall consensus is that building mesa with MSVC 2008 is the bare minimum, with MSVC 2013 strongly recommended. Afaik, as the VMWare guys give us the go ahead we'll drop all the workarounds for pre-2013 versions and bump the requirement.
>>
>> Cheers,
>> Emil
>
> Hmm, nice to know, but in this case the build system have to take in consideration and this is a bug or, by hand we should copy headers?

Not sure I fully understand your statement here.

Currently there are two types of headers - A) provide official (like) implementation, and B) that wrap around existing ones.

Example:
A) include/c99/* provides stdint.h & co for MSVC 2012 and older as they lack the headers.
B) include/c99_math.h provides compat wrapper, as MSVC 2013's math.h is not in C99 land yet.

About using those - everything is handled already. Where needed the extra include is added by the build (for A) and where the code is known to be built with MSVC a header from B is used rather than the system one.

Hope that clears things up.
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Building Mesa for Windows using Visual Studio
> Just a couple of small details - mesa has a fall-back for the mentioned functions (plus others) in $(top)/include/*h.
>
> That said, I believe that the overall consensus is that building mesa with MSVC 2008 is the bare minimum, with MSVC 2013 strongly recommended. Afaik, as the VMWare guys give us the go ahead we'll drop all the workarounds for pre-2013 versions and bump the requirement.
>
> Cheers,
> Emil

Hmm, nice to know, but in this case the build system have to take in consideration and this is a bug or, by hand we should copy headers?

Thanks for this info
marius
Re: [Mesa-dev] osmesa with gallium
Hello Emil

Thanks for your quick reply.

We sell Windows software containing a mapping component. Maps are drawn using OpenGL. We use mesa libgl-gdi as a software fallback renderer in case of buggy old Windows drivers. osmesa is used for offscreen map rendering in a non-windowed context.

I'll forward the patch with git send-email, I hope it will be correctly formatted, I had a hard time configuring git and msmtp under Windows.

-----Original Message-----
From: Emil Velikov [mailto:emil.l.veli...@gmail.com]
Sent: Friday 3 April 2015 15:09
To: Olivier PENA; mesa-dev@lists.freedesktop.org
Cc: emil.l.veli...@gmail.com
Subject: Re: [Mesa-dev] osmesa with gallium

Hello Olivier,

On 03/04/15 11:27, Olivier PENA wrote:
> Hi,
>
> I successfully built osmesa with gallium state tracker on windows by adding a new target (gallium-osmesa) in the scons build system. Both llvmpipe and softpipe work.
>
> May I send a patch ?

That would be great thank you. I'm pretty sure that Jose won't mind ;-)

May I ask what is your usecase for osmesa - is it a public/open-source project or something in house that'll be using it ?

Regarding the patch in question - please use git send-email to send it to the mailing list.

Thanks
Emil
Re: [Mesa-dev] [PATCH] nir: add lowering for idiv/udiv/umod
On Fri, Apr 3, 2015 at 11:21 AM, Rob Clark robdcl...@gmail.com wrote:
> From: Rob Clark robcl...@freedesktop.org
>
> Based on the algo from NV50LegalizeSSA::handleDIV() and handleMOD(). See also trans_idiv() in freedreno/ir3/ir3_compiler.c (which was an adaptation of the nv50 code from Ilia Mirkin).
>
> Also, including a py script that implements the same algo with numpy, based on something written by Ilia (and beaten on with a hammer a bit by me). I've tested this on i965 hacked up to insert the idiv lowering pass.
>
> Signed-off-by: Rob Clark robcl...@freedesktop.org
> ---
>  src/glsl/Makefile.sources | 1 +
>  src/glsl/nir/div-lowering.py | 75

Python *really* hates files with - in their name. You can't import them, so you have to use underscores. Admittedly it's not designed for importing, but let's not prevent it in the future.

>  src/glsl/nir/nir.h | 1 +
>  src/glsl/nir/nir_lower_idiv.c | 157 ++
>  4 files changed, 234 insertions(+)
>  create mode 100755 src/glsl/nir/div-lowering.py
>  create mode 100644 src/glsl/nir/nir_lower_idiv.c
>
> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> index ffce706..5d70e88 100644
> --- a/src/glsl/Makefile.sources
> +++ b/src/glsl/Makefile.sources
> @@ -33,6 +33,7 @@ NIR_FILES = \
>  	nir/nir_lower_atomics.c \
>  	nir/nir_lower_global_vars_to_local.c \
>  	nir/nir_lower_locals_to_regs.c \
> +	nir/nir_lower_idiv.c \
>  	nir/nir_lower_io.c \
>  	nir/nir_lower_phis_to_scalar.c \
>  	nir/nir_lower_samplers.cpp \
> diff --git a/src/glsl/nir/div-lowering.py b/src/glsl/nir/div-lowering.py
> new file mode 100755
> index 000..87db784
> --- /dev/null
> +++ b/src/glsl/nir/div-lowering.py
> @@ -0,0 +1,75 @@
> +#!/usr/bin/python

I think it's BS, but you're going to get yelled at by people who have foolishly set up python to point to python3 (despite the *huge* quantity of programs that will never change and assume that python == python2). Probably just hard-code it to python2 for now, which is a symlink available in most, but not all, python installations.
> +
> +import numpy as np
> +import sys
> +
> +op = sys.argv[1]
> +
> +if op not in ("idiv", "udiv", "umod"):
> +    print "invalid op:", op
> +    exit(1)
> +
> +is_signed = op == "idiv"
> +
> +if is_signed:
> +    numer = np.int32(sys.argv[2])
> +    denom = np.int32(sys.argv[3])
> +else:
> +    numer = np.uint32(sys.argv[2])
> +    denom = np.uint32(sys.argv[3])
> +
> +print op, numer, denom, "\n"

print prints a newline by default, no need for the \n. Unless there's a , at the end, in which case it skips the newline. Which is what I was doing in my version since I wanted like "a / b = 5" or whatever.

> +
> +
> +if is_signed:
> +    af = np.float32(numer)
> +    bf = np.float32(denom)
> +    af = np.abs(af)
> +    bf = np.abs(bf)
> +    a = np.abs(numer).view(np.uint32)
> +    b = np.abs(denom).view(np.uint32)
> +else:
> +    af = np.float32(numer)
> +    bf = np.float32(denom)
> +    a = numer
> +    b = denom
> +
> +# get first result:
> +bf = np.reciprocal(bf)
> +bf = (bf.view(np.uint32) - np.uint32(2)).view(np.float32)
> +q = af * bf
> +
> +if is_signed:
> +    q = np.int32(q).view(np.uint32)
> +else:
> +    q = np.uint32(q).view(np.uint32)
> +
> +# get error of first result:
> +r = q * b
> +r = a - r
> +r = np.float32(r)
> +r = r * bf
> +r = np.uint32(r)
> +
> +# add quotients:
> +q = q + r
> +
> +# correction: if modulus >= divisor, add 1
> +r = q * b
> +r = a - r
> +
> +r = np.uint32(1) if r.view(np.uint32) >= b.view(np.uint32) else np.uint32(0)
> +q = q + r
> +
> +if is_signed:
> +    r = np.bitwise_xor(numer, denom)
> +    r = np.right_shift(r, 31)
> +    b = -q
> +    q = b if r else q
> +
> +if op == "umod":
> +    r = q * b
> +    q = a - r
> +
> +print "=", q.view(np.int32)
> +
> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> index c14c51c..20984e9 100644
> --- a/src/glsl/nir/nir.h
> +++ b/src/glsl/nir/nir.h
> @@ -1605,6 +1605,7 @@ void nir_lower_samplers(nir_shader *shader,
>  void nir_lower_system_values(nir_shader *shader);
>  void nir_lower_tex_projector(nir_shader *shader);
> +void nir_lower_idiv(nir_shader *shader);
>  void nir_lower_atomics(nir_shader *shader);
>  void nir_lower_to_source_mods(nir_shader *shader);
> diff --git a/src/glsl/nir/nir_lower_idiv.c b/src/glsl/nir/nir_lower_idiv.c
> new file mode 100644
> index 000..c2f08df
> --- /dev/null
> +++ b/src/glsl/nir/nir_lower_idiv.c
> @@ -0,0 +1,157 @@
> +/*
> + * Copyright © 2015 Red Hat
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> +
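
The script quoted above targets Python 2 and needs numpy. For readers who want to step through the same sequence (float reciprocal estimate nudged down by two ULPs, one refinement of the quotient, then a final "+1" correction) without either, here is a rough Python 3 sketch. It swaps numpy out for the standard `struct` module to emulate float32 rounding, so it is a re-expression rather than the patch's script; `lowered_div` and the three helpers are made-up names, a zero divisor is rejected outright, and the result has only been spot-checked against ordinary integer division, not proven equivalent to the numpy version or to nir_lower_idiv.c.

```python
import struct

MASK32 = 0xFFFFFFFF


def to_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def f32_to_bits(x: float) -> int:
    """Reinterpret a float32 value as its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(u: int) -> float:
    """Reinterpret a 32-bit pattern as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", u & MASK32))[0]


def lowered_div(op: str, numer: int, denom: int) -> int:
    """Emulate the idiv/udiv/umod lowering steps on 32-bit operands."""
    if op not in ("idiv", "udiv", "umod"):
        raise ValueError("invalid op: " + op)
    signed = op == "idiv"
    a = abs(numer) if signed else numer & MASK32
    b = abs(denom) if signed else denom & MASK32
    if b == 0:
        raise ZeroDivisionError("this sketch does not model division by zero")

    af = to_f32(float(a))
    bf = to_f32(float(b))

    # get first result: reciprocal estimate, lowered by 2 ULPs via its bit pattern
    rcp = bits_to_f32(f32_to_bits(to_f32(1.0 / bf)) - 2)
    q = int(to_f32(af * rcp)) & MASK32

    # get error of first result and add the extra quotient it yields
    r = (a - q * b) & MASK32
    r = int(to_f32(to_f32(float(r)) * rcp)) & MASK32
    q = (q + r) & MASK32

    # correction: if the remaining modulus >= divisor, add 1
    r = (a - q * b) & MASK32
    if r >= b:
        q = (q + 1) & MASK32

    if signed and (numer ^ denom) < 0:
        return -q                      # operands had opposite signs
    if op == "umod":
        return (a - q * b) & MASK32    # remainder from the final quotient
    return q


if __name__ == "__main__":
    for args in (("udiv", 100, 7), ("umod", 100, 7), ("idiv", -7, 2)):
        print(args, "=", lowered_div(*args))
```

Using plain Python integers masked to 32 bits sidesteps the overflow warnings numpy scalars can raise on wrap-around, which is the main reason for the swap.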
Re: [Mesa-dev] [PATCH 09/13] SQUASH: i965/fs: Rework fs_visitor::lower_load_payload
Jason Ekstrand ja...@jlekstrand.net writes:

> On Fri, Apr 3, 2015 at 7:28 AM, Francisco Jerez curroje...@riseup.net wrote:
>> Jason Ekstrand ja...@jlekstrand.net writes:
>>
>>> On Thu, Apr 2, 2015 at 3:01 AM, Francisco Jerez curroje...@riseup.net wrote:
>>>> Jason Ekstrand ja...@jlekstrand.net writes:
>>>>
>>>>> Instead of the complicated and broken-by-design pile of heuristics we had before, we now have a straightforward lowering:
>>>>>
>>>>> 1) All header sources are copied directly using force_writemask_all and, since they are guaranteed to be a single register, there are no force_sechalf issues.
>>>>>
>>>>> 2) All non-header sources are copied using the exact same force_sechalf and saturate modifiers as the LOAD_PAYLOAD operation itself.
>>>>>
>>>>> 3) In order to accommodate older gens that need interleaved colors, lower_load_payload detects when the destination is a COMPR4 register and automatically interleaves the non-header sources. The lower_load_payload pass does the right thing here regardless of whether or not the hardware actually supports COMPR4.
>>>>
>>>> I had a quick glance at the series and it seems to be going in the right direction. One thing I honestly don't like is the ad-hoc and IMHO premature treatment of payload headers, it still feels like the LOAD_PAYLOAD instruction has more complex semantics than necessary and the benefit is unclear to me. I suppose that your motivation was to avoid setting force_writemask_all in LOAD_PAYLOAD instructions with header. The optimizer should be able to cope with those flags and get rid of them from the resulting moves where they are redundant, and if it's not able to it looks like something that should be fixed anyway. The explicit handling of headers is responsible for much of the churn in this series and is likely to complicate users of LOAD_PAYLOAD and optimization passes that have to manipulate them.
>>>
>>> Avoiding force_writemask_all is only half of the motivation and the small half at that.
>>> A header source, more properly defined, is a single physical register that, conceptually, applies to all channels. Effectively, a header source (I should have stated this clearly) has two properties:
>>>
>>> 1) It has force_writemask_all set
>>> 2) It is exactly one physical hardware register.
>>>
>>> This second property is the more important of the two. Most of the disaster of the previous LOAD_PAYLOAD implementation was that we did a pile of guesswork and had an ill-conceived effective-width thing in order to figure out how big the register actually was. Making the user specify which sources are header sources eliminates that guesswork. It also has the nice side-effect that we can do the right force_writemask_all and we can properly handle COMPR4 for the user.
>>
>> Yeah, true, but this seems like the least orthogonal and most annoying to use solution for this problem, it forces the caller to provide redundant information, it takes into account the saturate flag on some arguments and not others, it shuffles sources with respect to the specified order when COMPR4 is set, but only for the first four non-header sources. I think any of the following solutions would be better-behaved than the current approach:
>
> I don't know that saying which sources are headers is really redundant. It's explicit, which is what we want. Yes, the COMPR4 thing is a bit magical but we have to do COMPR4 in lower_load_payload so we have to have some way of doing it and this method puts the interleaving code in one place instead of two.

Well, at least with the previous approach LOAD_PAYLOAD had consistent (if broken) semantics across its arguments, and regardless of COMPR4 being used or not, which IMHO is preferable to the modest code saving.

>> 1/ Use the source width to determine the size of each copy. This would imply that the source width carries semantic information and hence would have to be left alone by e.g. copy propagation.
>
> That's what we do now and it's terrible.
> The effective width field was basically a width that gets kept.

Couldn't you just prevent copy propagation from propagating a source of different width into a LOAD_PAYLOAD? You would still be able to get rid of the metadata guess and effective_width, but you may have to re-run copy propagate after lower_load_payload() in case you missed any optimization opportunities.

>> 2/ Use the instruction exec size and flags to determine the properties of *all* copies. This means that if a header is present the exec size would necessarily have to be 8 and the halves of a 16-wide register would have to be specified separately, which sounds annoying at first but in practice wouldn't necessarily be because it could be handled by the LOAD_PAYLOAD() helper based on the argument widths without running into problems with optimization passes changing the meaning of the instruction. The semantics of the instruction itself would be as stupid as possible, but the