[Mesa-dev] [PATCH V4] mesa: use build flag to ensure stack is realigned on x86
Nowadays GCC assumes stack pointer is 16-byte aligned even on 32-bits, but that is an assumption OpenGL drivers (or any dynamic library for that matter) can't afford to make as there are many closed- and open- source application binaries out there that only assume 4-byte stack alignment. V4: fix comment and indentation V3: move all sse4.1 build flag config to the same location and add comment as to why we need to do the realign V2: use $target_cpu rather than $host_cpu and setup build flags in config rather than makefile https://bugs.freedesktop.org/show_bug.cgi?id=86788 Signed-off-by: Timothy Arceri Reviewed-by: Matt Turner CC: "10.4" --- The last hunk should be dropped when applying to 10.4 configure.ac | 11 ++- src/mesa/Makefile.am | 2 +- src/mesa/main/sse_minmax.c | 3 --- 3 files changed, 11 insertions(+), 5 deletions(-) diff --git a/configure.ac b/configure.ac index b0df1bb..4bdf75d 100644 --- a/configure.ac +++ b/configure.ac @@ -253,8 +253,16 @@ AC_SUBST([VISIBILITY_CXXFLAGS]) dnl dnl Optional flags, check for compiler support dnl +SSE41_CFLAGS="-msse4.1" +dnl Code compiled by GCC with -msse* assumes a 16 byte aligned +dnl stack, but on x86-32 such alignment is not guaranteed. +case "$target_cpu" in +i?86) +SSE41_CFLAGS="$SSE41_CFLAGS -mstackrealign" +;; +esac save_CFLAGS="$CFLAGS" -CFLAGS="-msse4.1 $CFLAGS" +CFLAGS="$SSE41_CFLAGS $CFLAGS" AC_COMPILE_IFELSE([AC_LANG_SOURCE([[ #include int main () { @@ -267,6 +275,7 @@ if test "x$SSE41_SUPPORTED" = x1; then DEFINES="$DEFINES -DUSE_SSE41" fi AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1]) +AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS) dnl Can't have static and shared libraries, default to static if user dnl explicitly requested. 
If both disabled, set to static since shared diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am index 932db4f..3b68573 100644 --- a/src/mesa/Makefile.am +++ b/src/mesa/Makefile.am @@ -153,7 +153,7 @@ libmesagallium_la_LIBADD = \ libmesa_sse41_la_SOURCES = \ main/streaming-load-memcpy.c \ main/sse_minmax.c -libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) -msse4.1 +libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) $(SSE41_CFLAGS) pkgconfigdir = $(libdir)/pkgconfig pkgconfig_DATA = gl.pc diff --git a/src/mesa/main/sse_minmax.c b/src/mesa/main/sse_minmax.c index 93cf2a6..222ac14 100644 --- a/src/mesa/main/sse_minmax.c +++ b/src/mesa/main/sse_minmax.c @@ -31,9 +31,6 @@ #include void -#if !defined(__x86_64__) - __attribute__((force_align_arg_pointer)) -#endif _mesa_uint_array_min_max(const unsigned *ui_indices, unsigned *min_index, unsigned *max_index, const unsigned count) { -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
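As a rough illustration of what the patch addresses (this is not code from Mesa): a function built with SSE flags may keep 16-byte vectors on the stack, so on 32-bit x86 it has to realign the stack on entry when callers only guarantee 4 bytes. GCC's -mstackrealign does that for every function in a translation unit, while the attribute removed in the last hunk does it per function. The macro and function names below are placeholders, and the body is scalar so that the sketch builds on any target.

```c
#include <stddef.h>

/* On 32-bit x86 only, ask GCC to realign the stack on entry to the
 * function. Building the file with -mstackrealign has the same effect
 * for all of its functions, which is why the patch can drop the
 * per-function attribute. */
#if defined(__GNUC__) && defined(__i386__)
#define EXAMPLE_REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define EXAMPLE_REALIGN_STACK
#endif

/* Hypothetical stand-in for an SSE4.1 routine such as the min/max scan. */
EXAMPLE_REALIGN_STACK void
example_uint_min_max(const unsigned *v, size_t count,
                     unsigned *min_out, unsigned *max_out)
{
   unsigned lo = ~0u, hi = 0;
   for (size_t i = 0; i < count; i++) {
      if (v[i] < lo) lo = v[i];
      if (v[i] > hi) hi = v[i];
   }
   *min_out = lo;
   *max_out = hi;
}
```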
Re: [Mesa-dev] [PATCH] i965: Disable unlit-centroid workaround on Gen < 6.
Possibly mark this for 10.4, since the assertion failures we hit when emitting a pointless centroid workaround make other issues hard to debug? On Tue, Dec 9, 2014 at 8:44 PM, Chris Forbes wrote: > Reviewed-by: Chris Forbes > > On Tue, Dec 9, 2014 at 8:08 PM, Matt Turner wrote: >> Back to the original commit (8313f444) adding the workaround, we were >> enabling it on gens <= 7, even though gens <= 5 can't do multisampling. >> >> I cannot find documentation that says that Sandybridge needs this >> workaround but in practice disabling it causes these piglit tests to >> fail: >> >> EXT_framebuffer_multisample/interpolation {2,4} centroid-deriv{,-disabled} >> --- >> src/mesa/drivers/dri/i965/brw_device_info.c | 3 --- >> 1 file changed, 3 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c >> b/src/mesa/drivers/dri/i965/brw_device_info.c >> index bbd907b..65942c2 100644 >> --- a/src/mesa/drivers/dri/i965/brw_device_info.c >> +++ b/src/mesa/drivers/dri/i965/brw_device_info.c >> @@ -28,7 +28,6 @@ >> static const struct brw_device_info brw_device_info_i965 = { >> .gen = 4, >> .has_negative_rhw_bug = true, >> - .needs_unlit_centroid_workaround = true, >> .max_vs_threads = 16, >> .max_gs_threads = 2, >> .max_wm_threads = 8 * 4, >> @@ -42,7 +41,6 @@ static const struct brw_device_info brw_device_info_g4x = { >> .has_pln = true, >> .has_compr4 = true, >> .has_surface_tile_offset = true, >> - .needs_unlit_centroid_workaround = true, >> .is_g4x = true, >> .max_vs_threads = 32, >> .max_gs_threads = 2, >> @@ -57,7 +55,6 @@ static const struct brw_device_info brw_device_info_ilk = { >> .has_pln = true, >> .has_compr4 = true, >> .has_surface_tile_offset = true, >> - .needs_unlit_centroid_workaround = true, >> .max_vs_threads = 72, >> .max_gs_threads = 32, >> .max_wm_threads = 12 * 6, >> -- >> 2.0.4 >> ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Mesa-stable] [PATCH V3] mesa: use build flag to ensure stack is realigned on x86
On Mon, 2014-12-08 at 22:08 -0800, Matt Turner wrote: > On Mon, Dec 8, 2014 at 9:43 PM, Timothy Arceri wrote: > > Nowadays GCC assumes stack pointer is 16-byte aligned even on 32-bits, but > > that is an assumption OpenGL drivers (or any dynamic library for that > > matter) > > can't afford to make as there are many closed- and open- source application > > binaries out there that only assume 4-byte stack alignment. > > > > V3: move all sse4.1 build flag config to the same location > > and add comment as to why we need to do the realign > > > > V2: use $target_cpu rather than $host_cpu > > and setup build flags in config rather than makefile > > > > https://bugs.freedesktop.org/show_bug.cgi?id=86788 > > Signed-off-by: Timothy Arceri > > Reviewed-by: Matt Turner > > CC: "10.4" > > --- > > If there are no other comments I'll commit it to master > > tomorrow. > > > > The last hunk should be dropped when applying to 10.4. > > > > configure.ac | 11 ++- > > src/mesa/Makefile.am | 2 +- > > src/mesa/main/sse_minmax.c | 3 --- > > 3 files changed, 11 insertions(+), 5 deletions(-) > > > > diff --git a/configure.ac b/configure.ac > > index b0df1bb..e510bcf 100644 > > --- a/configure.ac > > +++ b/configure.ac > > @@ -253,8 +253,16 @@ AC_SUBST([VISIBILITY_CXXFLAGS]) > > dnl > > dnl Optional flags, check for compiler support > > dnl > > +SSE41_CFLAGS="-msse4.1" > > +dnl 32-bit Mesa is 4-byte aligned to allow support for old applications > > +dnl therefore we need to realign the stack when using SSE > > I don't think this comment is right. It doesn't have anything to do > with Mesa. The 32-bit ABI just doesn't require the stack to be >4 byte > aligned. Thanks for the clarification I was wondering where the 4-byte align config switch was but it was just me misreading Jose's comments on the bug report. 
> > Søren's pixman patch has this comment, which I think is fine: > > Code compiled by GCC with -msse2 and -mssse3 assumes a 16 byte aligned > stack, but on x86-32 such alignment is not guaranteed. > > (Just replace "-msse2 and -mssse3" with "-msse*") > > > +case "$target_cpu" in > > +i?86) > > +SSE41_CFLAGS="$SSE41_CFLAGS -mstackrealign" > > +;; > > Okay, starting to wonder if you're trolling me with this. :-| uh sorry I did indent it, but not to the right place. The i?86) shouldn't be indented either. All fixed in the next version, thanks for all the feedback. > > Look at the other instances of ;; -- they're either on the same line > as the previous statement or they're aligned with it. > ___ > mesa-stable mailing list > mesa-sta...@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-stable ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Disable unlit-centroid workaround on Gen < 6.
On Mon, Dec 8, 2014 at 11:08 PM, Matt Turner wrote: > I cannot find documentation that says that Sandybridge needs this > workaround In fact, the BSpec specifically says that this workaround is needed for HSW, IVB, and VLV (no SNB). I was apparently incorrect that it doesn't apply to Haswell when I made commit f6db414f, although it did not regress anything. I just tested disabling it on IVB as well -- everything still passes. So who knows what's up. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Disable unlit-centroid workaround on Gen < 6.
Reviewed-by: Chris Forbes On Tue, Dec 9, 2014 at 8:08 PM, Matt Turner wrote: > Back to the original commit (8313f444) adding the workaround, we were > enabling it on gens <= 7, even though gens <= 5 can't do multisampling. > > I cannot find documentation that says that Sandybridge needs this > workaround but in practice disabling it causes these piglit tests to > fail: > > EXT_framebuffer_multisample/interpolation {2,4} centroid-deriv{,-disabled} > --- > src/mesa/drivers/dri/i965/brw_device_info.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c > b/src/mesa/drivers/dri/i965/brw_device_info.c > index bbd907b..65942c2 100644 > --- a/src/mesa/drivers/dri/i965/brw_device_info.c > +++ b/src/mesa/drivers/dri/i965/brw_device_info.c > @@ -28,7 +28,6 @@ > static const struct brw_device_info brw_device_info_i965 = { > .gen = 4, > .has_negative_rhw_bug = true, > - .needs_unlit_centroid_workaround = true, > .max_vs_threads = 16, > .max_gs_threads = 2, > .max_wm_threads = 8 * 4, > @@ -42,7 +41,6 @@ static const struct brw_device_info brw_device_info_g4x = { > .has_pln = true, > .has_compr4 = true, > .has_surface_tile_offset = true, > - .needs_unlit_centroid_workaround = true, > .is_g4x = true, > .max_vs_threads = 32, > .max_gs_threads = 2, > @@ -57,7 +55,6 @@ static const struct brw_device_info brw_device_info_ilk = { > .has_pln = true, > .has_compr4 = true, > .has_surface_tile_offset = true, > - .needs_unlit_centroid_workaround = true, > .max_vs_threads = 72, > .max_gs_threads = 32, > .max_wm_threads = 12 * 6, > -- > 2.0.4 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Disable unlit-centroid workaround on Gen < 6.
Back to the original commit (8313f444) adding the workaround, we were enabling it on gens <= 7, even though gens <= 5 can't do multisampling. I cannot find documentation that says that Sandybridge needs this workaround but in practice disabling it causes these piglit tests to fail: EXT_framebuffer_multisample/interpolation {2,4} centroid-deriv{,-disabled} --- src/mesa/drivers/dri/i965/brw_device_info.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c b/src/mesa/drivers/dri/i965/brw_device_info.c index bbd907b..65942c2 100644 --- a/src/mesa/drivers/dri/i965/brw_device_info.c +++ b/src/mesa/drivers/dri/i965/brw_device_info.c @@ -28,7 +28,6 @@ static const struct brw_device_info brw_device_info_i965 = { .gen = 4, .has_negative_rhw_bug = true, - .needs_unlit_centroid_workaround = true, .max_vs_threads = 16, .max_gs_threads = 2, .max_wm_threads = 8 * 4, @@ -42,7 +41,6 @@ static const struct brw_device_info brw_device_info_g4x = { .has_pln = true, .has_compr4 = true, .has_surface_tile_offset = true, - .needs_unlit_centroid_workaround = true, .is_g4x = true, .max_vs_threads = 32, .max_gs_threads = 2, @@ -57,7 +55,6 @@ static const struct brw_device_info brw_device_info_ilk = { .has_pln = true, .has_compr4 = true, .has_surface_tile_offset = true, - .needs_unlit_centroid_workaround = true, .max_vs_threads = 72, .max_gs_threads = 32, .max_wm_threads = 12 * 6, -- 2.0.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
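Why deleting the three initializer lines is enough: in C, members left out of a designated initializer are zero-initialized, so the flag simply reads as false for those generations. A minimal sketch with a hypothetical, cut-down struct (the real brw_device_info has many more fields, and the helper name is made up):

```c
#include <stdbool.h>

/* Hypothetical, reduced version of a per-generation device description. */
struct example_device_info {
   int gen;
   bool needs_unlit_centroid_workaround;
};

/* Flag omitted: implicitly false (no multisampling before Gen6). */
static const struct example_device_info example_info_gen5 = {
   .gen = 5,
};

/* Flag kept, since the thread reports piglit failures without it. */
static const struct example_device_info example_info_gen6 = {
   .gen = 6,
   .needs_unlit_centroid_workaround = true,
};

static bool
example_wants_workaround(const struct example_device_info *info)
{
   return info->needs_unlit_centroid_workaround;
}
```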
[Mesa-dev] [PATCH] r600g/sb: implement r600 gpr index workaround.
From: Dave Airlie r600, rv610 and rv630 all have a bug in their GPR indexing and how the hw inserts access to PV. If the base index for the src is the same as the dst gpr in a previous group, then it will use PV instead of using the indexed gpr correctly. The workaround is to insert a NOP when you detect this. This is half the fix, there is also a problem where the dst gpr is indexed and a subsequent src reads it, that is next. Signed-off-by: Dave Airlie --- src/gallium/drivers/r600/sb/sb_bc.h| 2 ++ src/gallium/drivers/r600/sb/sb_bc_finalize.cpp | 50 +- src/gallium/drivers/r600/sb/sb_context.cpp | 2 ++ src/gallium/drivers/r600/sb/sb_ir.h| 1 + src/gallium/drivers/r600/sb/sb_pass.h | 5 +-- 5 files changed, 50 insertions(+), 10 deletions(-) diff --git a/src/gallium/drivers/r600/sb/sb_bc.h b/src/gallium/drivers/r600/sb/sb_bc.h index d03da98..6d3dc4d 100644 --- a/src/gallium/drivers/r600/sb/sb_bc.h +++ b/src/gallium/drivers/r600/sb/sb_bc.h @@ -616,6 +616,8 @@ public: unsigned num_slots; bool uses_mova_gpr; + bool r6xx_gpr_index_workaround; + bool stack_workaround_8xx; bool stack_workaround_9xx; diff --git a/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp b/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp index 3f362c4..c55d2cf 100644 --- a/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp +++ b/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp @@ -38,6 +38,18 @@ namespace r600_sb { +void bc_finalizer::insert_rv6xx_load_ar_workaround(alu_group_node *b4) { + + alu_group_node *g = sh.create_alu_group(); + alu_node *a = sh.create_alu(); + + a->bc.set_op(ALU_OP0_NOP); + a->bc.last = 1; + + g->push_back(a); + b4->insert_before(g); +} + int bc_finalizer::run() { run_on(sh.root); @@ -211,12 +223,12 @@ void bc_finalizer::finalize_if(region_node* r) { } void bc_finalizer::run_on(container_node* c) { - + node *prev_node = NULL; for (node_iterator I = c->begin(), E = c->end(); I != E; ++I) { node *n = *I; if (n->is_alu_group()) { - finalize_alu_group(static_cast(n)); + 
finalize_alu_group(static_cast(n), prev_node); } else { if (n->is_alu_clause()) { cf_node *c = static_cast(n); @@ -251,19 +263,26 @@ void bc_finalizer::run_on(container_node* c) { if (n->is_container()) run_on(static_cast(n)); } + prev_node = n; } } -void bc_finalizer::finalize_alu_group(alu_group_node* g) { +void bc_finalizer::finalize_alu_group(alu_group_node* g, node *prev_node) { alu_node *last = NULL; + alu_group_node *prev_g = NULL; + bool add_nop = false; + if (prev_node && prev_node->is_alu_group()) { + prev_g = static_cast(prev_node); + } + for (int i = 0; i < 5; i++) + g->dst_slot_regs[i] = -1; for (node_iterator I = g->begin(), E = g->end(); I != E; ++I) { alu_node *n = static_cast(*I); unsigned slot = n->bc.slot; - value *d = n->dst.empty() ? NULL : n->dst[0]; - + bool local_nop; if (d && d->is_special_reg()) { assert(n->bc.op_ptr->flags & AF_MOVA); d = NULL; @@ -286,6 +305,7 @@ void bc_finalizer::finalize_alu_group(alu_group_node* g) { n->bc.dst_rel = 0; } + g->dst_slot_regs[slot] = n->bc.dst_gpr; n->bc.write_mask = d != NULL; n->bc.last = 0; @@ -299,17 +319,24 @@ void bc_finalizer::finalize_alu_group(alu_group_node* g) { update_ngpr(n->bc.dst_gpr); - finalize_alu_src(g, n); + local_nop = finalize_alu_src(g, n, prev_g); + if (local_nop) + add_nop = true; last = n; } + if (add_nop) { + if (sh.get_ctx().r6xx_gpr_index_workaround) { + insert_rv6xx_load_ar_workaround(g); + } + } last->bc.last = 1; } -void bc_finalizer::finalize_alu_src(alu_group_node* g, alu_node* a) { +bool bc_finalizer::finalize_alu_src(alu_group_node* g, alu_node* a, alu_group_node *prev) { vvec &sv = a->src; - + bool add_nop = false; FBC_DUMP( sblog << "finalize_alu_src: "; dump::dump_op(a); @@ -336,6 +363,12 @@ void bc_finalizer::finalize_alu_src(alu_group_node* g, alu_node* a) { if (!v->rel->is_const()) { src.rel = 1; update_ngpr(v->array->gpr.sel() + v->array->array_size -1); + if (prev) { +
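The detection described in the commit message can be sketched roughly as follows. This is an illustration in plain C, not the code from the patch, whose source-side check is cut off in this archive. The use of -1 for an unused slot follows the dst_slot_regs initialisation visible in the diff; the function name and everything else are assumptions.

```c
#include <stdbool.h>

#define EXAMPLE_NUM_SLOTS 5   /* x, y, z, w and trans */

/* Returns true when a relatively-addressed source with base GPR
 * src_base_gpr matches a GPR written by the previous ALU group. On the
 * affected chips the hardware would then read PV instead of the indexed
 * GPR, so a NOP group should be placed between the two groups. */
static bool
example_needs_nop(const int prev_dst_regs[EXAMPLE_NUM_SLOTS],
                  int src_base_gpr)
{
   for (int i = 0; i < EXAMPLE_NUM_SLOTS; i++) {
      /* -1 marks a slot that wrote no GPR in the previous group. */
      if (prev_dst_regs[i] != -1 && prev_dst_regs[i] == src_base_gpr)
         return true;
   }
   return false;
}
```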
[Mesa-dev] [Bug 87137] Unable to build when configured with openmp and CFLAGS/LDFLAGS contain -fopenmp
https://bugs.freedesktop.org/show_bug.cgi?id=87137

--- Comment #2 from Bob ---
(In reply to Matt Turner from comment #1)
> Dear Vlad/Bob/Chris (which is it?!),

None of the above.

> Please submit a patch to mesa-dev@lists.freedesktop.org adding
>
> #ifdef _OPENMP
> #include <omp.h>
> #endif
>
> to the files that call omp_* functions.
>
> Also, don't put -fopenmp in your system CFLAGS. It doesn't do anything
> without the package being written with OpenMP.

I don't remember why I added that. I did it long ago because the mythtv box is a low-powered Atom that struggles with video playback. Maybe I was just clutching at straws back then, when I spent a few minutes trying to use it for video playback.

> Sincerely, a Gentoo Developer.

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 87137] Unable to build when configured with openmp and CFLAGS/LDFLAGS contain -fopenmp
https://bugs.freedesktop.org/show_bug.cgi?id=87137

--- Comment #1 from Matt Turner ---
Dear Vlad/Bob/Chris (which is it?!),

Please submit a patch to mesa-dev@lists.freedesktop.org adding

#ifdef _OPENMP
#include <omp.h>
#endif

to the files that call omp_* functions.

Also, don't put -fopenmp in your system CFLAGS. It doesn't do anything unless the package was written to use OpenMP.

Sincerely, a Gentoo Developer.

--
You are receiving this mail because:
You are the assignee for the bug.
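A small sketch of the guard suggested in this comment, showing a file that builds both with and without -fopenmp. The helper name is hypothetical; _OPENMP, omp.h and omp_get_num_threads() are standard OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: number of threads in the current team, or 1 when
 * the file is compiled without OpenMP support. */
static int
example_num_threads(void)
{
#ifdef _OPENMP
   return omp_get_num_threads();
#else
   return 1;
#endif
}
```

With this shape, callers never reference an omp_* symbol directly, so a missing declaration cannot break a non-OpenMP build.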
[Mesa-dev] [Bug 87136] Incorrect/undefined behavior from shifting an integer too far
https://bugs.freedesktop.org/show_bug.cgi?id=87136

Matt Turner changed:

           What    |Removed                     |Added
             Status|NEW                         |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #1 from Matt Turner ---
(In reply to Bruce Dawson from comment #0)
> drivers\dri\i965\brw_fs.cpp(2164):
>    for (int i = 0; i < FRAG_ATTRIB_MAX; i++) {
>       if (!(fp->Base.InputsRead & BITFIELD64_BIT(i)))
>          continue;
>
>       if (prog->Name == 0)
>          key.proj_attrib_mask |= 1 << i; // BUG

This doesn't exist in the code base. In fact, the proj_attrib_mask field was removed in April 2013! (commit 705c8247)

I'm marking as WONTFIX, because there's no ALREADYFIX and bugzilla isn't configured to allow OBSOLETE statuses.

I haven't checked whether the other two are still present. Please just submit fixes directly to the mailing list.

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 87137] Unable to build when configured with openmp and CFLAGS/LDFLAGS contain -fopenmp
https://bugs.freedesktop.org/show_bug.cgi?id=87137

            Bug ID: 87137
           Summary: Unable to build when configured with openmp and CFLAGS/LDFLAGS contain -fopenmp
           Product: Mesa
           Version: 10.3
          Hardware: x86-64 (AMD64)
                OS: All
            Status: NEW
          Severity: trivial
          Priority: medium
         Component: Mesa core
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: chris.anderson2...@yandex.com

For some time now I have had a mythtv machine that I built using Gentoo. I update it occasionally and have noticed that, for some time, the build fails when I enable openmp in Gentoo's USE flags (does that mean it gets configured with --openmp or something similar?) and also add -fopenmp to CFLAGS and LDFLAGS.

The bug is here: https://bugs.gentoo.org/show_bug.cgi?id=532020, which the Gentoo maintainers will probably reject because there is no line that contains my CPU id. The solution for me is to add a declaration for

extern int omp_get_num_threads();

in src/mesa/swrast/s_aatritemp.h, and the missing header

#include <omp.h>

in src/mesa/swrast/s_context.c and src/mesa/swrast/s_texcombine.c.

I think that you need to add this with some goddamn ugly #if defined(...) nastiness, but I could be wrong.

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 87136] Incorrect/undefined behavior from shifting an integer too far
https://bugs.freedesktop.org/show_bug.cgi?id=87136

            Bug ID: 87136
           Summary: Incorrect/undefined behavior from shifting an integer too far
           Product: Mesa
           Version: unspecified
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Mesa core
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: brucedaw...@cygnus-software.com

While compiling Google Chrome with VC++'s /analyze (static code analysis) I discovered a class of bugs in the mesa code base. There are three instances of this bug where the integer '1' is shifted by an amount that ranges from zero to 47. '1' will have type 'int' and on basically every modern platform 'int' is 32 bits. Shifting an int more than 30 positions leads to undefined behavior because the result will necessarily be unrepresentable.

Even if we ignore the annoying spectre of undefined behavior, the behavior will most definitely not be what is intended. On Intel processors the likely result is going to be equivalent to shifting by shiftAmount&31, which means that 16 mask values will be repeated.

The warning is:

warning C6297: Arithmetic overflow: 32-bit value is shifted, then cast to 64-bit value. Results might not be an expected value.

The fix is to use BITFIELD64_BIT(i) instead of (1 << i).

The locations where I have noticed this bug are:

drivers\dri\i965\brw_fs.cpp(2164):
   for (int i = 0; i < FRAG_ATTRIB_MAX; i++) {
      if (!(fp->Base.InputsRead & BITFIELD64_BIT(i)))
         continue;

      if (prog->Name == 0)
         key.proj_attrib_mask |= 1 << i; // BUG

swrast\s_span.c(767):
      for (i = 0; i < FRAG_ATTRIB_MAX; i++) {
         if (span->interpMask & (1 << i)) {
            GLuint j;
            for (j = 0; j < 4; j++) {
               span->attrStart[i][j] += leftClip * span->attrStepX[i][j];
            }
         }
      }

swrast\s_span.c(788):
      for (i = 0; i < FRAG_ATTRIB_MAX; i++) {
         if (span->arrayAttribs & (1 << i)) {
            /* shift array elements left by 'leftClip' */
            SHIFT_ARRAY(span->array->attribs[i], leftClip, n - leftClip);
         }
      }

--
You are receiving this mail because:
You are the assignee for the bug.
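To make the report concrete, here is a minimal sketch of the safe form of the shift. EXAMPLE_BITFIELD64_BIT is a local stand-in written for this example so that it is self-contained; it is assumed to behave like the BITFIELD64_BIT macro named in the report, by doing the shift in a 64-bit unsigned type.

```c
#include <stdint.h>

/* Local stand-in for a 64-bit single-bit mask macro. */
#define EXAMPLE_BITFIELD64_BIT(b) (UINT64_C(1) << (b))

/* Build a mask with bits [0, count) set, for count <= 64. */
static uint64_t
example_attrib_mask(unsigned count)
{
   uint64_t mask = 0;
   for (unsigned i = 0; i < count; i++)
      mask |= EXAMPLE_BITFIELD64_BIT(i);   /* not (1 << i): int is 32 bits */
   return mask;
}
```

Writing the constant as a 64-bit value before shifting is what matters; casting the result of (1 << i) afterwards would be too late.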
Re: [Mesa-dev] [PATCH V3] mesa: use build flag to ensure stack is realigned on x86
On Mon, Dec 8, 2014 at 9:43 PM, Timothy Arceri wrote: > Nowadays GCC assumes stack pointer is 16-byte aligned even on 32-bits, but > that is an assumption OpenGL drivers (or any dynamic library for that matter) > can't afford to make as there are many closed- and open- source application > binaries out there that only assume 4-byte stack alignment. > > V3: move all sse4.1 build flag config to the same location > and add comment as to why we need to do the realign > > V2: use $target_cpu rather than $host_cpu > and setup build flags in config rather than makefile > > https://bugs.freedesktop.org/show_bug.cgi?id=86788 > Signed-off-by: Timothy Arceri > Reviewed-by: Matt Turner > CC: "10.4" > --- > If there are no other comments I'll commit it to master > tomorrow. > > The last hunk should be dropped when applying to 10.4. > > configure.ac | 11 ++- > src/mesa/Makefile.am | 2 +- > src/mesa/main/sse_minmax.c | 3 --- > 3 files changed, 11 insertions(+), 5 deletions(-) > > diff --git a/configure.ac b/configure.ac > index b0df1bb..e510bcf 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -253,8 +253,16 @@ AC_SUBST([VISIBILITY_CXXFLAGS]) > dnl > dnl Optional flags, check for compiler support > dnl > +SSE41_CFLAGS="-msse4.1" > +dnl 32-bit Mesa is 4-byte aligned to allow support for old applications > +dnl therefore we need to realign the stack when using SSE I don't think this comment is right. It doesn't have anything to do with Mesa. The 32-bit ABI just doesn't require the stack to be >4 byte aligned. Søren's pixman patch has this comment, which I think is fine: Code compiled by GCC with -msse2 and -mssse3 assumes a 16 byte aligned stack, but on x86-32 such alignment is not guaranteed. (Just replace "-msse2 and -mssse3" with "-msse*") > +case "$target_cpu" in > +i?86) > +SSE41_CFLAGS="$SSE41_CFLAGS -mstackrealign" > +;; Okay, starting to wonder if you're trolling me with this. 
:-| Look at the other instances of ;; -- they're either on the same line as the previous statement or they're aligned with it. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] llvmpipe: fix lp_test_arit denorm handling
On 09.12.2014 at 06:40, Matt Turner wrote:
> On Mon, Nov 24, 2014 at 2:37 PM, wrote:
>> From: Roland Scheidegger
>>
>> llvmpipe disables denorms on purpose (on x86/sse only), because denorms are
>> generally neither required nor desired for graphic apis (and in case of
>> d3d10, they are forbidden).
>> However, this caused some arithmetic tests using denorms to fail on some
>> systems, because the reference did not generate the same results anymore.
>> (It did not fail on all systems - behavior of these math functions is sort
>> of undefined when called with non-standard floating point mode, hence the
>> result differing depending on implementation and in particular the sse
>> capabilities.)
>> So, for the reference, simply flush all (input/output) denorms manually
>> to zero in this case.
>>
>> This fixes https://bugs.freedesktop.org/show_bug.cgi?id=67672.
>> ---
>
> Can we pick this to 10.4? I've had a Gentoo bug open about this
> failure since 10.0.
>
> (commit 8148a06b8fdb734f7f9a11ce787ee6505939fdaa in master)

Well, I guess why not. It's not like it actually fixes a bug in the driver (it just fixes the test), so it shouldn't hurt in any case either :-).

Roland
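A reference implementation can be made to agree with hardware running in flush-to-zero mode by zeroing denormal inputs and outputs by hand. The helper below is a minimal sketch of that idea, assuming IEEE-754 single precision; the function name is made up and this is not the code from the commit.

```c
#include <stdint.h>
#include <string.h>

/* Flush a single-precision denormal to a zero of the same sign. All other
 * values, including NaN and infinity, pass through unchanged. */
static float
example_flush_denorm(float x)
{
   uint32_t bits;
   memcpy(&bits, &x, sizeof bits);

   uint32_t exponent = bits & 0x7f800000u;
   uint32_t mantissa = bits & 0x007fffffu;
   if (exponent == 0 && mantissa != 0) {
      bits &= 0x80000000u;            /* keep only the sign bit */
      memcpy(&x, &bits, sizeof x);
   }
   return x;
}
```

Working on the bit pattern rather than on floating-point comparisons keeps the helper independent of the current floating-point mode.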
[Mesa-dev] [Bug 86939] test_vf_float_conversions.cpp:63:12: error: expected primary-expression before ‘union’
https://bugs.freedesktop.org/show_bug.cgi?id=86939

Vinson Lee changed:

           What    |Removed                     |Added
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Vinson Lee ---
commit d20235f79a4b2786c984175b502b97ac73648781
Author: Vinson Lee
Date:   Fri Dec 5 18:05:06 2014 -0800

    i965: Fix union usage for G++ <= 4.6.

    This patch fixes this build error with G++ <= 4.6.

      CXX      test_vf_float_conversions.o
    test_vf_float_conversions.cpp: In function ‘unsigned int f2u(float)’:
    test_vf_float_conversions.cpp:63:20: error: expected primary-expression before ‘.’ token

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86939
    Signed-off-by: Vinson Lee
    Reviewed-by: Matt Turner

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 79706] [TRACKER] Mesa regression tracker
https://bugs.freedesktop.org/show_bug.cgi?id=79706
Bug 79706 depends on bug 86939, which changed state.

Bug 86939 Summary: test_vf_float_conversions.cpp:63:12: error: expected primary-expression before ‘union’
https://bugs.freedesktop.org/show_bug.cgi?id=86939

           What    |Removed                     |Added
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED

--
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH] r600g: fix regression since UCMP change
On 09.12.2014 at 02:31, Dave Airlie wrote:
> From: Dave Airlie
>
> Since d8da6deceadf5e48201d848b7061dad17a5b7cac where the
> state tracker started using UCMP on cayman a number of tests
> regressed.
>
> this seems to be r600g is doing CNDGE_INT for UCMP which is >= 0,
> we should be doing CNDE_INT with reverse arguments.
>
> Signed-off-by: Dave Airlie
> ---
>  src/gallium/drivers/r600/r600_shader.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
> index 0b988df..28137e1 100644
> --- a/src/gallium/drivers/r600/r600_shader.c
> +++ b/src/gallium/drivers/r600/r600_shader.c
> @@ -6082,7 +6082,7 @@ static int tgsi_ucmp(struct r600_shader_ctx *ctx)
>                         continue;
>
>                 memset(&alu, 0, sizeof(struct r600_bytecode_alu));
> -               alu.op = ALU_OP3_CNDGE_INT;
> +               alu.op = ALU_OP3_CNDE_INT;
>                 r600_bytecode_src(&alu.src[0], &ctx->src[0], i);
>                 r600_bytecode_src(&alu.src[1], &ctx->src[2], i);
>                 r600_bytecode_src(&alu.src[2], &ctx->src[1], i);
>

Oh, the state tracker used UCMP before (for triop_csel), which is also why I didn't even think about checking whether the drivers do the right thing... But possibly only with true booleans, so you'd have got only 0 and -1 as input (though again I have no idea whether you'd actually see such ops at all from glsl). So maybe it was indeed the optimization that tries to avoid the extra comparison, when it is an equal/not-equal comparison against 0, which caused this :-).

Looks good in any case.

Roland
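The difference between the two opcodes can be checked with a small scalar model. The functions below are illustrative models written for this note, not driver code: UCMP is taken to select its second operand when the condition is non-zero, CNDE_INT to select src1 when src0 == 0, and CNDGE_INT to select src1 when src0 >= 0, which is how the thread describes them.

```c
#include <stdint.h>

/* UCMP: pick a when the condition is non-zero, otherwise b. */
static int32_t
model_ucmp(int32_t cond, int32_t a, int32_t b)
{
   return cond != 0 ? a : b;
}

/* Models of the two hardware ops; operand order is src0, src1, src2. */
static int32_t
model_cnde_int(int32_t s0, int32_t s1, int32_t s2)
{
   return s0 == 0 ? s1 : s2;
}

static int32_t
model_cndge_int(int32_t s0, int32_t s1, int32_t s2)
{
   return s0 >= 0 ? s1 : s2;
}
```

With the operands swapped as in the patch, model_cnde_int(cond, b, a) agrees with model_ucmp(cond, a, b) for every cond, whereas model_cndge_int(cond, b, a) only agrees for 0 and negative values. That matches the observation that true booleans (0 and -1) would have hidden the bug.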
[Mesa-dev] [PATCH V3] mesa: use build flag to ensure stack is realigned on x86
Nowadays GCC assumes stack pointer is 16-byte aligned even on 32-bits, but that is an assumption OpenGL drivers (or any dynamic library for that matter) can't afford to make as there are many closed- and open- source application binaries out there that only assume 4-byte stack alignment. V3: move all sse4.1 build flag config to the same location and add comment as to why we need to do the realign V2: use $target_cpu rather than $host_cpu and setup build flags in config rather than makefile https://bugs.freedesktop.org/show_bug.cgi?id=86788 Signed-off-by: Timothy Arceri Reviewed-by: Matt Turner CC: "10.4" --- If there are no other comments I'll commit it to master tomorrow. The last hunk should be dropped when applying to 10.4. configure.ac | 11 ++- src/mesa/Makefile.am | 2 +- src/mesa/main/sse_minmax.c | 3 --- 3 files changed, 11 insertions(+), 5 deletions(-) diff --git a/configure.ac b/configure.ac index b0df1bb..e510bcf 100644 --- a/configure.ac +++ b/configure.ac @@ -253,8 +253,16 @@ AC_SUBST([VISIBILITY_CXXFLAGS]) dnl dnl Optional flags, check for compiler support dnl +SSE41_CFLAGS="-msse4.1" +dnl 32-bit Mesa is 4-byte aligned to allow support for old applications +dnl therefore we need to realign the stack when using SSE +case "$target_cpu" in +i?86) +SSE41_CFLAGS="$SSE41_CFLAGS -mstackrealign" +;; +esac save_CFLAGS="$CFLAGS" -CFLAGS="-msse4.1 $CFLAGS" +CFLAGS="$SSE41_CFLAGS $CFLAGS" AC_COMPILE_IFELSE([AC_LANG_SOURCE([[ #include int main () { @@ -267,6 +275,7 @@ if test "x$SSE41_SUPPORTED" = x1; then DEFINES="$DEFINES -DUSE_SSE41" fi AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1]) +AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS) dnl Can't have static and shared libraries, default to static if user dnl explicitly requested. 
If both disabled, set to static since shared diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am index 932db4f..3b68573 100644 --- a/src/mesa/Makefile.am +++ b/src/mesa/Makefile.am @@ -153,7 +153,7 @@ libmesagallium_la_LIBADD = \ libmesa_sse41_la_SOURCES = \ main/streaming-load-memcpy.c \ main/sse_minmax.c -libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) -msse4.1 +libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) $(SSE41_CFLAGS) pkgconfigdir = $(libdir)/pkgconfig pkgconfig_DATA = gl.pc diff --git a/src/mesa/main/sse_minmax.c b/src/mesa/main/sse_minmax.c index 93cf2a6..222ac14 100644 --- a/src/mesa/main/sse_minmax.c +++ b/src/mesa/main/sse_minmax.c @@ -31,9 +31,6 @@ #include void -#if !defined(__x86_64__) - __attribute__((force_align_arg_pointer)) -#endif _mesa_uint_array_min_max(const unsigned *ui_indices, unsigned *min_index, unsigned *max_index, const unsigned count) { -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] llvmpipe: fix lp_test_arit denorm handling
On Mon, Nov 24, 2014 at 2:37 PM, wrote:
> From: Roland Scheidegger
>
> llvmpipe disables denorms on purpose (on x86/sse only), because denorms are
> generally neither required nor desired for graphic apis (and in case of d3d10,
> they are forbidden).
> However, this caused some arithmetic tests using denorms to fail on some
> systems, because the reference did not generate the same results anymore.
> (It did not fail on all systems - behavior of these math functions is sort
> of undefined when called with non-standard floating point mode, hence the
> result differing depending on implementation and in particular the sse
> capabilities.)
> So, for the reference, simply flush all (input/output) denorms manually
> to zero in this case.
>
> This fixes https://bugs.freedesktop.org/show_bug.cgi?id=67672.
> ---

Can we pick this to 10.4? I've had a Gentoo bug open about this failure since 10.0.

(commit 8148a06b8fdb734f7f9a11ce787ee6505939fdaa in master)
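The commit message describes flushing denormal inputs and outputs to zero on the reference side. A minimal sketch of that idea follows, under assumptions: the function names are hypothetical, the actual lp_test_arit change may be structured differently, and preserving the sign of the flushed zero is my choice, not something the message states.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical helper: map any subnormal float to a zero of the same sign.
 * A value is subnormal when it is nonzero and its magnitude is below
 * FLT_MIN. NaN fails every comparison and is returned unchanged. */
static float
flush_denorm_to_zero(float x)
{
    if (x != 0.0f && x > -FLT_MIN && x < FLT_MIN)
        return x < 0.0f ? -0.0f : 0.0f;
    return x;
}

/* Example operation for the wrapper below: halving FLT_MIN yields a
 * subnormal result. */
static float
halve(float x)
{
    return x * 0.5f;
}

/* Reference evaluation: flush the input, apply the operation, then flush
 * the output, so the reference agrees with hardware running in
 * flush-to-zero mode. */
static float
reference_unary(float (*op)(float), float x)
{
    assert(op != NULL);
    return flush_denorm_to_zero(op(flush_denorm_to_zero(x)));
}
```

With this wrapper, reference_unary(halve, FLT_MIN) evaluates to zero rather than the subnormal FLT_MIN / 2.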
Re: [Mesa-dev] r600/sb loop issue
On 12/09/2014 05:18 AM, Dave Airlie wrote: On 8 December 2014 at 20:41, Vadim Girlin wrote: On 12/06/2014 07:13 AM, Vadim Girlin wrote: On 12/04/2014 01:43 AM, Dave Airlie wrote: Hi Vadim, I've been looking with Glenn's help into a bug in sb for a couple of weeks now triggered by a change in how GLSL generates switch statements. I understand you probably aren't too interested in r600g but I believe I'm hitting a design level problem and I would like some advice. So it appears that GLSL can create loops that don't repeat for switch statements, and it appears SB wasn't ready to handle such a thing. Hi, Dave, I suspect we should rather get rid of such loops somehow, i.e. convert to something else, the loop that never repeats is not really a loop anyway. AFAICS "continue" is not supported in switch statements according to GLSL specs, so the loops generated for switch will never be repeated. Am I missing something? Even if repeating is possible somehow, at least we can get rid of the loops that are not repeated. I think loops are less efficient than other control flow instructions on r600g hw (at least because they increase stack usage), and possibly on other hw too. In fact it seems sb basically gets rid of it already in IR, it just doesn't know how to translate resulting control flow to ISA, because so far it only supports specific control flow structure for if-then-else that was previously preserved during optimizations. I think it may be not very hard to implement support for that in finalizer, I'll look into it. In fact handling that control flow in finalizer is not as easy as I hoped, probably impossible, at least if we want to make it efficient. I forgot about the limitations of R600 ISA. OTOH it seems I've managed to fix the issues with loops, the patch is attached (it's meant to be used instead of 7b0067d2). There are no piglit regressions on evergreen, but I didn't test any real apps. 
> This does seem to fix the problems in piglit, and looks close to what
> I was attempting but written by someone who knows what they are doing :-)
>
> What is the sb_sched.cpp change at the end for?

It fixes those scheduler/regalloc errors for switch tests.

Unfortunately, now I've installed some benchmarks for testing and AFAICS this patch breaks at least lightsmark 2008, so it seems the condition removed by the patch was there for a reason. I'll probably try to come up with a better fix.

Vadim

> Dave.
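To make the "loop that never repeats" in this thread concrete, here is a rough C analogy of how a switch can be lowered into a loop whose body always ends in a break. The function names and return values are made up for illustration; the GLSL compiler's actual lowering works on its own IR, not on C.

```c
#include <assert.h>

/* What the shader author writes (C stand-in for a GLSL switch). */
static int
switch_original(int x)
{
    switch (x) {
    case 0:
        return 10;
    case 1:
        return 20;
    default:
        return 30;
    }
}

/* Lowered form: each case becomes an if that sets the result and breaks.
 * The body ends with an unconditional break, so the back-edge of the
 * loop is never taken. */
static int
switch_lowered(int x)
{
    int result = 0;

    for (;;) {                  /* loop begin */
        if (x == 0) {
            result = 10;
            break;
        }
        if (x == 1) {
            result = 20;
            break;
        }
        result = 30;            /* default case */
        break;                  /* always leaves: the loop runs once */
    }                           /* loop end */
    return result;
}

/* Sanity check that both forms agree for a given input. */
static void
check_lowering(int x)
{
    assert(switch_original(x) == switch_lowered(x));
}
```

Since GLSL does not allow continue to target a switch, nothing in the lowered body can jump back to the top, which is the property Vadim suggests exploiting to get rid of such loops.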
Re: [Mesa-dev] [PATCH] [v2] i965: implement ARB_pipeline_statistics_query
On Tue, Dec 02, 2014 at 11:07:34PM -0800, Ian Romanick wrote:
> Since there will be a v3 anyway, nits below...
>
> On 12/02/2014 06:33 PM, Ben Widawsky wrote:
> > This patch implements ARB_pipeline_statistics_query. This addition to GL does
> > not add a new API. Instead, it adds new tokens to the existing query APIs. The
> > work to hook up the new tokens is trivial due to its similarity to the previous
> > work done for the query APIs. I've implemented all the new tokens to some
> > degree, but have stubbed out the untested ones at the entry point for Begin().
> > Doing this should allow the remainder of the code to be left in.
> >
> > The new tokens give GL clients a way to obtain stats about the GL pipeline.
> > Generally, you get the number of things going in, invocations, and number of
> > things coming out, primitives, of the various stages. There are two immediate
> > uses for this, performance information, and debugging various types of
> > misrendering. I doubt one can use these for debugging very complex applications,
> > but for piglit tests, it should be quite useful.
> >
> > Tessellation shaders, and compute shaders are not addressed in this patch
> > because there is no upstream implementation. I've implemented how I believe
> > tessellation shader stats will work for Intel hardware (though there is a bit of
> > ambiguity). Compute shaders are a bit more interesting though, and I don't yet
> > know what we'll do there.
> >
> > For the lazy, here is a link to the relevant part of the spec:
> > https://www.opengl.org/registry/specs/ARB/pipeline_statistics_query.txt
> >
> > Running the piglit tests
> > http://lists.freedesktop.org/archives/piglit/2014-November/013321.html
> > (http://cgit.freedesktop.org/~bwidawsk/piglit/log/?h=pipe_stats)
> > yield the following results:
> >
> >> python2 ./piglit-run.py -t stats tests/all.py output/pipeline_stats
> >> [5/5] pass: 5 Running Test(s): 5
> >
> > Previously I was seeing the adjacent vertex test failing on certain Intel
> > hardware. I am currently not able to reproduce this, and therefore for now, I'll
> > assume it was some transient issue which has been fixed.
> >
> > v2:
> > - Don't allow pipeline_stats to be per stream (Ilia). This would be needed for
> >   AMD_transform_feedback4, which we do not support.
> >> If AMD_transform_feedback4 is supported then
> >> GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB counts primitives emitted to any of
> >> the vertex streams for which STREAM_RASTERIZATION_AMD is enabled.
> > - Remove comment from GL3.txt because it is only used for extensions that > > are > > part of required versions (Ilia) > > - Move the new tokens to a new XML doc instead of using the main GL4x.xml > > (Ilia) > > - Add a fallthrough comment (Ilia) > > - Only divide PS invocations by 4 on HSW+ (Ben) > > > > Cc: Ilia Mirkin > > Signed-off-by: Ben Widawsky > > --- > > .../glapi/gen/ARB_pipeline_statistics_query.xml| 24 > > src/mesa/drivers/dri/i965/gen6_queryobj.c | 121 > > + > > src/mesa/drivers/dri/i965/intel_extensions.c | 1 + > > src/mesa/main/config.h | 3 + > > src/mesa/main/extensions.c | 1 + > > src/mesa/main/mtypes.h | 15 +++ > > src/mesa/main/queryobj.c | 77 + > > 7 files changed, 242 insertions(+) > > create mode 100644 src/mapi/glapi/gen/ARB_pipeline_statistics_query.xml > > > > diff --git a/src/mapi/glapi/gen/ARB_pipeline_statistics_query.xml > > b/src/mapi/glapi/gen/ARB_pipeline_statistics_query.xml > > new file mode 100644 > > index 000..db37267 > > --- /dev/null > > +++ b/src/mapi/glapi/gen/ARB_pipeline_statistics_query.xml > > @@ -0,0 +1,24 @@ > > + > > + > > + > > + > > + > > + > > + > > + > > + > > + > > + > > + > > + > > + > > + > > + > > + > > + > > + > > + > > + > > + > > + > > + > > diff --git a/src/mesa/drivers/dri/i965/gen6_queryobj.c > > b/src/mesa/drivers/dri/i965/gen6_queryobj.c > > index 130236e..f8b9bc3 100644 > > --- a/src/mesa/drivers/dri/i965/gen6_queryobj.c > > +++ b/src/mesa/drivers/dri/i965/gen6_queryobj.c > > @@ -109,6 +109,73 @@ write_xfb_primitives_written(struct brw_context *brw, > > } > > } > > > > +static inline const int > > +pipeline_target_to_index(int target) > > +{ > > + if (target == GL_GEOMETRY_SHADER_INVOCATIONS) > > + return MAX_PIPELINE_STATISTICS - 1; > > + else > > + return target - GL_VERTICES_SUBMITTED_ARB; > > +} > > + > > +static void > > +emit_pipeline_stat(struct brw_context *brw, drm_intel_bo *bo, > > + int stream, int target, int idx) > > +{ > > + /* > > +* There are 2 confusing parts to implementing the 
various target. The > > first is > > +* the distinction between vertices submitted and primitives submitted. > > The > > +* spec
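The pipeline_target_to_index() helper quoted in the patch relies on the new tokens being numerically contiguous, with GL_GEOMETRY_SHADER_INVOCATIONS as the one outlier because it already existed in core GL. A standalone sketch of that mapping is below. The TOK_ names are local stand-ins, the numeric values are as I recall them from the extension spec and should be checked against glext.h, and NUM_PIPELINE_STATS = 11 is my assumption for the patch's MAX_PIPELINE_STATISTICS.

```c
#include <assert.h>

/* Local stand-ins for the GL tokens; values should be verified against
 * the ARB_pipeline_statistics_query spec before relying on them. */
enum {
    TOK_VERTICES_SUBMITTED          = 0x82EE,
    TOK_PRIMITIVES_SUBMITTED        = 0x82EF,
    TOK_VERTEX_SHADER_INVOCATIONS   = 0x82F0,
    TOK_TESS_CONTROL_SHADER_PATCHES = 0x82F1,
    TOK_TESS_EVAL_SHADER_INVOCATIONS = 0x82F2,
    TOK_GEOMETRY_SHADER_PRIMITIVES_EMITTED = 0x82F3,
    TOK_FRAGMENT_SHADER_INVOCATIONS = 0x82F4,
    TOK_COMPUTE_SHADER_INVOCATIONS  = 0x82F5,
    TOK_CLIPPING_INPUT_PRIMITIVES   = 0x82F6,
    TOK_CLIPPING_OUTPUT_PRIMITIVES  = 0x82F7,
    /* Pre-existing core token, far outside the contiguous range. */
    TOK_GEOMETRY_SHADER_INVOCATIONS = 0x887F
};

/* Assumed number of pipeline statistics targets. */
#define NUM_PIPELINE_STATS 11

/* Map a query target to a dense array index: the contiguous tokens take
 * slots 0..9 and the outlier takes the last slot. */
static int
pipeline_target_to_index(int target)
{
    int idx;

    if (target == TOK_GEOMETRY_SHADER_INVOCATIONS)
        idx = NUM_PIPELINE_STATS - 1;
    else
        idx = target - TOK_VERTICES_SUBMITTED;

    assert(idx >= 0 && idx < NUM_PIPELINE_STATS);
    return idx;
}
```

Note that the const in the patch's "static inline const int" return type has no effect on a value returned by copy, so the sketch drops it.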
Re: [Mesa-dev] r600/sb loop issue
On 8 December 2014 at 20:41, Vadim Girlin wrote: > On 12/06/2014 07:13 AM, Vadim Girlin wrote: >> >> On 12/04/2014 01:43 AM, Dave Airlie wrote: >>> >>> Hi Vadim, >>> >>> I've been looking with Glenn's help into a bug in sb for a couple of >>> weeks now triggered by a change in how GLSL generates switch >>> statements. >>> >>> I understand you probably aren't too interested in r600g but I believe >>> I'm hitting a design level problem and I would like some advice. >>> >>> So it appears that GLSL can create loops that don't repeat for switch >>> statements, and it appears SB wasn't ready to handle such a thing. >> >> >> Hi, Dave, >> >> I suspect we should rather get rid of such loops somehow, i.e. convert >> to something else, the loop that never repeats is not really a loop >> anyway. AFAICS "continue" is not supported in switch statements >> according to GLSL specs, so the loops generated for switch will never be >> repeated. Am I missing something? Even if repeating is possible somehow, >> at least we can get rid of the loops that are not repeated. >> >> I think loops are less efficient than other control flow instructions on >> r600g hw (at least because they increase stack usage), and possibly on >> other hw too. >> >> In fact it seems sb basically gets rid of it already in IR, it just >> doesn't know how to translate resulting control flow to ISA, because so >> far it only supports specific control flow structure for if-then-else >> that was previously preserved during optimizations. I think it may be >> not very hard to implement support for that in finalizer, I'll look into >> it. > > > In fact handling that control flow in finalizer is not as easy as I hoped, > probably impossible, at least if we want to make it efficient. I forgot > about the limitations of R600 ISA. > > OTOH it seems I've managed to fix the issues with loops, the patch is > attached (it's meant to be used instead of 7b0067d2). 
There are no piglit > regressions on evergreen, but I didn't test any real apps. > This does seem to fix the problems in piglit, and looks close to what I was attempting but written by someone who knows what they are doing :-) What is the sb_sched.cpp change for at the end for? Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] r600/sb loop issue
On 9 December 2014 at 10:25, Dave Airlie wrote: > On 8 December 2014 at 20:41, Vadim Girlin wrote: >> On 12/06/2014 07:13 AM, Vadim Girlin wrote: >>> >>> On 12/04/2014 01:43 AM, Dave Airlie wrote: Hi Vadim, I've been looking with Glenn's help into a bug in sb for a couple of weeks now triggered by a change in how GLSL generates switch statements. I understand you probably aren't too interested in r600g but I believe I'm hitting a design level problem and I would like some advice. So it appears that GLSL can create loops that don't repeat for switch statements, and it appears SB wasn't ready to handle such a thing. >>> >>> >>> Hi, Dave, >>> >>> I suspect we should rather get rid of such loops somehow, i.e. convert >>> to something else, the loop that never repeats is not really a loop >>> anyway. AFAICS "continue" is not supported in switch statements >>> according to GLSL specs, so the loops generated for switch will never be >>> repeated. Am I missing something? Even if repeating is possible somehow, >>> at least we can get rid of the loops that are not repeated. >>> >>> I think loops are less efficient than other control flow instructions on >>> r600g hw (at least because they increase stack usage), and possibly on >>> other hw too. >>> >>> In fact it seems sb basically gets rid of it already in IR, it just >>> doesn't know how to translate resulting control flow to ISA, because so >>> far it only supports specific control flow structure for if-then-else >>> that was previously preserved during optimizations. I think it may be >>> not very hard to implement support for that in finalizer, I'll look into >>> it. >> >> >> In fact handling that control flow in finalizer is not as easy as I hoped, >> probably impossible, at least if we want to make it efficient. I forgot >> about the limitations of R600 ISA. >> >> OTOH it seems I've managed to fix the issues with loops, the patch is >> attached (it's meant to be used instead of 7b0067d2). 
There are no piglit >> regressions on evergreen, but I didn't test any real apps. >> > > This fixes one thing, but the switches are still broken here on cayman at > least > Actually ignore that, another regression snuck into r600g that I had to fix. Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g: fix regression since UCMP change
On Tue, 09 Dec 2014 02:31:01 +0100, Dave Airlie wrote:
> From: Dave Airlie
>
> Since d8da6deceadf5e48201d848b7061dad17a5b7cac, where the state tracker
> started using UCMP on cayman, a number of tests regressed. This seems to be
> because r600g implements UCMP with CNDGE_INT, which tests for >= 0; we should
> be using CNDE_INT with the arguments reversed.
>
> Signed-off-by: Dave Airlie
> ---
>  src/gallium/drivers/r600/r600_shader.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
> index 0b988df..28137e1 100644
> --- a/src/gallium/drivers/r600/r600_shader.c
> +++ b/src/gallium/drivers/r600/r600_shader.c
> @@ -6082,7 +6082,7 @@ static int tgsi_ucmp(struct r600_shader_ctx *ctx)
>  continue;
>
>  memset(&alu, 0, sizeof(struct r600_bytecode_alu));
> - alu.op = ALU_OP3_CNDGE_INT;
> + alu.op = ALU_OP3_CNDE_INT;
>  r600_bytecode_src(&alu.src[0], &ctx->src[0], i);
>  r600_bytecode_src(&alu.src[1], &ctx->src[2], i);
>  r600_bytecode_src(&alu.src[2], &ctx->src[1], i);

Reviewed-by: Glenn Kennard
[Mesa-dev] [PATCH] r600g: fix regression since UCMP change
From: Dave Airlie

Since d8da6deceadf5e48201d848b7061dad17a5b7cac, where the state tracker
started using UCMP on cayman, a number of tests regressed. This seems to be
because r600g implements UCMP with CNDGE_INT, which tests for >= 0; we should
be using CNDE_INT with the arguments reversed.

Signed-off-by: Dave Airlie
---
 src/gallium/drivers/r600/r600_shader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 0b988df..28137e1 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -6082,7 +6082,7 @@ static int tgsi_ucmp(struct r600_shader_ctx *ctx)
 continue;

 memset(&alu, 0, sizeof(struct r600_bytecode_alu));
- alu.op = ALU_OP3_CNDGE_INT;
+ alu.op = ALU_OP3_CNDE_INT;
 r600_bytecode_src(&alu.src[0], &ctx->src[0], i);
 r600_bytecode_src(&alu.src[1], &ctx->src[2], i);
 r600_bytecode_src(&alu.src[2], &ctx->src[1], i);
--
1.9.3
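The difference between the two opcodes is easiest to see with a small model. The sketch below assumes the usual semantics: TGSI UCMP selects its second operand when the condition is nonzero, CNDE_INT selects its second operand when the condition equals zero, and CNDGE_INT selects its second operand when the condition is >= 0 as a signed integer. The C functions are illustrative models, not driver code.

```c
#include <assert.h>
#include <stdint.h>

/* TGSI UCMP: dst = (cond != 0) ? a : b */
static uint32_t
tgsi_ucmp(uint32_t cond, uint32_t a, uint32_t b)
{
    return cond != 0 ? a : b;
}

/* Model of CNDE_INT: dst = (src0 == 0) ? src1 : src2 */
static uint32_t
cnde_int(uint32_t src0, uint32_t src1, uint32_t src2)
{
    return src0 == 0 ? src1 : src2;
}

/* Model of CNDGE_INT: dst = (src0 >= 0, signed) ? src1 : src2.
 * The cast relies on the usual two's-complement conversion. */
static uint32_t
cndge_int(uint32_t src0, uint32_t src1, uint32_t src2)
{
    return (int32_t)src0 >= 0 ? src1 : src2;
}

/* Both lowerings pass the operands swapped, as the driver code does. */
static uint32_t
ucmp_via_cnde(uint32_t cond, uint32_t a, uint32_t b)
{
    return cnde_int(cond, b, a);
}

static uint32_t
ucmp_via_cndge(uint32_t cond, uint32_t a, uint32_t b)
{
    return cndge_int(cond, b, a);
}

/* The CNDE_INT lowering should agree with UCMP for any condition. */
static void
check_cnde_matches(uint32_t cond)
{
    assert(ucmp_via_cnde(cond, 1, 2) == tgsi_ucmp(cond, 1, 2));
}
```

Under this model the CNDGE_INT lowering happens to be right for conditions of 0 and 0xFFFFFFFF, but a positive nonzero condition such as 1 picks the wrong operand, which would be consistent with the regression only appearing once UCMP started being fed conditions that are not canonical booleans.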
Re: [Mesa-dev] r600/sb loop issue
On 8 December 2014 at 20:41, Vadim Girlin wrote: > On 12/06/2014 07:13 AM, Vadim Girlin wrote: >> >> On 12/04/2014 01:43 AM, Dave Airlie wrote: >>> >>> Hi Vadim, >>> >>> I've been looking with Glenn's help into a bug in sb for a couple of >>> weeks now triggered by a change in how GLSL generates switch >>> statements. >>> >>> I understand you probably aren't too interested in r600g but I believe >>> I'm hitting a design level problem and I would like some advice. >>> >>> So it appears that GLSL can create loops that don't repeat for switch >>> statements, and it appears SB wasn't ready to handle such a thing. >> >> >> Hi, Dave, >> >> I suspect we should rather get rid of such loops somehow, i.e. convert >> to something else, the loop that never repeats is not really a loop >> anyway. AFAICS "continue" is not supported in switch statements >> according to GLSL specs, so the loops generated for switch will never be >> repeated. Am I missing something? Even if repeating is possible somehow, >> at least we can get rid of the loops that are not repeated. >> >> I think loops are less efficient than other control flow instructions on >> r600g hw (at least because they increase stack usage), and possibly on >> other hw too. >> >> In fact it seems sb basically gets rid of it already in IR, it just >> doesn't know how to translate resulting control flow to ISA, because so >> far it only supports specific control flow structure for if-then-else >> that was previously preserved during optimizations. I think it may be >> not very hard to implement support for that in finalizer, I'll look into >> it. > > > In fact handling that control flow in finalizer is not as easy as I hoped, > probably impossible, at least if we want to make it efficient. I forgot > about the limitations of R600 ISA. > > OTOH it seems I've managed to fix the issues with loops, the patch is > attached (it's meant to be used instead of 7b0067d2). 
There are no piglit > regressions on evergreen, but I didn't test any real apps. > This fixes one thing, but the switches are still broken here on cayman at least tests/spec/glsl-1.30/execution/switch/fs-default_last.shader_test -- FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL OUT[0], COLOR DCL CONST[0] DCL TEMP[0..2], LOCAL IMM[0] FLT32 {0., 1., 0., 0.} IMM[1] UINT32 {0, 4294967295, 0, 0} IMM[2] INT32 {1, 0, 0, 0} 0: MOV TEMP[0], IMM[0]. 1: MOV TEMP[1].x, IMM[1]. 2: BGNLOOP :0 3: UCMP TEMP[1].x, CONST[0]., TEMP[1]., IMM[1]. 4: UIF TEMP[1]. :0 5: MOV TEMP[0].x, IMM[0]. 6: BRK 7: ENDIF 8: USEQ TEMP[2].x, IMM[2]., CONST[0]. 9: UCMP TEMP[1].x, TEMP[2]., IMM[1]., TEMP[1]. 10: UIF TEMP[1]. :0 11: MOV TEMP[0].y, IMM[0]. 12: BRK 13: ENDIF 14: MOV TEMP[1].x, IMM[1]. 15: MOV TEMP[0].z, IMM[0]. 16: BRK 17: ENDLOOP :0 18: MOV OUT[0], TEMP[0] 19: END = SHADER #13 PS/CAYMAN/CAYMAN = = 72 dw = 6 gprs = 2 stack = 0012 a010 ALU 5 @36 0036 00f8 00200c90 1 x: MOVR1.x, 0 0038 00f8 20200c90y: MOVR1.y, 0 0040 00f8 40200c90z: MOVR1.z, 0 0042 80f8 60200c90w: MOVR1.w, 0 0044 80f8 00400c90 2 x: MOVR2.x, 0 0002 000f 8180 LOOP_START_DX10 @30 0004 4017 a404 ALU_PUSH_BEFORE 2 @46 KC0[CB0:0-15] 0046 809f6080 0043c002 3 x: CNDGE_INT R2.x, KC0[0].x, -1, R2.x 0048 801f00fe 00a0229c 4 MP x: PRED_SETNE_INT R5.x, PV.x, 0 0006 0007 8281 JUMP @14 POP:1 0008 0019 a000 ALU 1 @50 0050 84f9 00200c90 5 x: MOVR1.x, 1.0 0010 000e 8240 LOOP_BREAK @28 0012 0007 8381 POP @14 POP:1 0014 401a a408 ALU_PUSH_BEFORE 3 @52 KC0[CB0:0-15] 0052 801000fa 00601d10 6 x: SETE_INT R3.x, 1, KC0[0].x 0054 800040fe 0043c4fb 7 x: CNDGE_INT R2.x, PV.x, R2.x, -1 0056 801f00fe 00a0229c 8 MP x: PRED_SETNE_INT R5.x, PV.x, 0 0016 000c 8281 JUMP @24 POP:1 0018 001d a000 ALU 1 @58 0058 84f9 20200c90 9 y: MOVR1.y, 1.0 0020 000e 8240 LOOP_BREAK @28 0022 000c 8381 POP @24 POP:1 0024 001e a004 ALU 2 @60 0060 04fb 00400c9010 x: MOVR2.x, -1 0062 84f9 40200c90z: MOVR1.z, 1.0 0026 000e 8240 LOOP_BREAK @28 0028 0002 8140 LOOP_END @4 0030 0020 
a00c ALU 4 @64 0064 0001 0c9011 x: MOVR0.x, R1.x 0066 0401 2c90y: MOVR0.y, R1.y 0068 0801 4c90z: MOV
Re: [Mesa-dev] [PATCH 1/3] Remove useless checks for NULL before freeing
No, we depend on the MALLOC/FREE debug wrappers, etc. for memory debugging on Windows. -Brian On 12/08/2014 02:40 PM, Ian Romanick wrote: Didn't Gallium also remove the FREE business? Either way, patch 1 and 2 are Reviewed-by: Ian Romanick I sent a comment on patch 3. With that comment resolved in patch 3 or as a patch 4, patch 3 is also Reviewed-by: Ian Romanick On 12/08/2014 11:56 AM, Matt Turner wrote: See commits 5067506e and b6109de3 for the Coccinelle script. --- src/gallium/auxiliary/util/u_debug_flush.c | 12 src/gallium/drivers/i915/i915_state.c| 10 -- src/gallium/drivers/ilo/shader/toy_tgsi.c| 6 ++ src/gallium/drivers/nouveau/nv50/nv50_context.c | 3 +-- src/gallium/drivers/nouveau/nv50/nv84_video.c| 3 +-- src/gallium/drivers/nouveau/nvc0/nvc0_context.c | 3 +-- src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 3 +-- src/gallium/drivers/nouveau/nvc0/nvc0_surface.c | 3 +-- src/gallium/drivers/r600/r600_isa.c | 12 src/gallium/drivers/softpipe/sp_tile_cache.c | 3 +-- src/gallium/state_trackers/hgl/hgl.c | 6 ++ src/gallium/state_trackers/nine/nine_shader.c| 6 ++ src/gallium/state_trackers/nine/pixelshader9.c | 3 +-- src/gallium/state_trackers/nine/stateblock9.c| 8 src/gallium/state_trackers/nine/swapchain9.c | 2 +- src/gallium/state_trackers/nine/vertexdeclaration9.c | 9 +++-- src/gallium/state_trackers/nine/vertexshader9.c | 3 +-- src/gallium/winsys/svga/drm/vmw_screen_ioctl.c | 6 ++ src/mesa/drivers/dri/common/xmlconfig.c | 3 +-- src/mesa/main/objectlabel.c | 7 ++- 20 files changed, 39 insertions(+), 72 deletions(-) diff --git a/src/gallium/auxiliary/util/u_debug_flush.c b/src/gallium/auxiliary/util/u_debug_flush.c index fdb248c..cdefca2 100644 --- a/src/gallium/auxiliary/util/u_debug_flush.c +++ b/src/gallium/auxiliary/util/u_debug_flush.c @@ -132,8 +132,7 @@ debug_flush_buf_reference(struct debug_flush_buf **dst, struct debug_flush_buf *fbuf = *dst; if (pipe_reference(&(*dst)->reference, &src->reference)) { - if (fbuf->map_frame) - FREE(fbuf->map_frame); + 
FREE(fbuf->map_frame); FREE(fbuf); } @@ -146,8 +145,7 @@ debug_flush_item_destroy(struct debug_flush_item *item) { debug_flush_buf_reference(&item->fbuf, NULL); - if (item->ref_frame) - FREE(item->ref_frame); + FREE(item->ref_frame); FREE(item); } @@ -263,10 +261,8 @@ debug_flush_unmap(struct debug_flush_buf *fbuf) fbuf->mapped_sync = FALSE; fbuf->mapped = FALSE; - if (fbuf->map_frame) { - FREE(fbuf->map_frame); - fbuf->map_frame = NULL; - } + FREE(fbuf->map_frame); + fbuf->map_frame = NULL; pipe_mutex_unlock(fbuf->mutex); } diff --git a/src/gallium/drivers/i915/i915_state.c b/src/gallium/drivers/i915/i915_state.c index c90fcfd..6ba9646 100644 --- a/src/gallium/drivers/i915/i915_state.c +++ b/src/gallium/drivers/i915/i915_state.c @@ -628,12 +628,10 @@ void i915_delete_fs_state(struct pipe_context *pipe, void *shader) FREE(ifs->decl); ifs->decl = NULL; - if (ifs->program) { - FREE(ifs->program); - ifs->program = NULL; - FREE((struct tgsi_token *)ifs->state.tokens); - ifs->state.tokens = NULL; - } + FREE(ifs->program); + ifs->program = NULL; + FREE((struct tgsi_token *)ifs->state.tokens); + ifs->state.tokens = NULL; ifs->program_len = 0; ifs->decl_len = 0; diff --git a/src/gallium/drivers/ilo/shader/toy_tgsi.c b/src/gallium/drivers/ilo/shader/toy_tgsi.c index 57501ea..65e47bf 100644 --- a/src/gallium/drivers/ilo/shader/toy_tgsi.c +++ b/src/gallium/drivers/ilo/shader/toy_tgsi.c @@ -2296,10 +2296,8 @@ add_imm(struct toy_tgsi *tgsi, enum toy_type type, const uint32_t *buf) cur_size * sizeof(new_types[0]), new_size * sizeof(new_types[0])); if (!new_buf || !new_types) { - if (new_buf) -FREE(new_buf); - if (new_types) -FREE(new_types); + FREE(new_buf); + FREE(new_types); return -1; } diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.c b/src/gallium/drivers/nouveau/nv50/nv50_context.c index 1a53579..2cfd5db 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_context.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_context.c @@ -338,8 +338,7 @@ out_err: 
nouveau_bufctx_del(&nv50->bufctx_3d); if (nv50->bufctx) nouveau_bufctx_del(&nv50->bufctx); - if (nv50->blit) - FREE(nv50->blit); + FREE(nv50->blit); FREE(nv50); return NULL; } diff --git a/src/gallium/drivers/nouveau/nv50/nv84_video.c b/src/gallium/drivers/nouveau/nv50/nv84_video.c index 395bd7a..7a4670f 100644 --- a/src/gallium/drivers/nouveau/nv50/nv84_video.c +++ b/
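The mechanical change in this patch rests on one guarantee from the C standard: free(NULL) does nothing. A minimal sketch with plain malloc/free follows; struct flush_buf and the helper names are hypothetical, and it assumes that Gallium's FREE() wrapper tolerates NULL the same way, which the debug wrappers Brian mentions would need to preserve.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object with an optional heap-allocated member. */
struct flush_buf {
    int *map_frame;
};

/* Allocate a buffer; map_frame stays NULL unless requested. */
static struct flush_buf *
create_buf(int with_frame)
{
    struct flush_buf *buf = calloc(1, sizeof(*buf));

    assert(buf != NULL);
    if (with_frame) {
        buf->map_frame = malloc(4 * sizeof(int));
        assert(buf->map_frame != NULL);
    }
    return buf;
}

/* Before the cleanup: the NULL check is redundant. */
static void
destroy_buf_checked(struct flush_buf *buf)
{
    if (buf->map_frame)
        free(buf->map_frame);
    free(buf);
}

/* After the cleanup: free(NULL) is a no-op, so the check can go. */
static void
destroy_buf(struct flush_buf *buf)
{
    free(buf->map_frame);
    free(buf);
}
```

Both destroy functions behave identically whether or not map_frame was allocated. Only the check around the member goes away; the object pointer itself is still dereferenced and must be valid.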
Re: [Mesa-dev] [PATCH V2] mesa: use build flag to ensure stack is realigned on x86
On Mon, Dec 8, 2014 at 1:59 PM, Ian Romanick wrote:
> On 12/08/2014 01:56 PM, Matt Turner wrote:
>> On Mon, Dec 8, 2014 at 1:54 PM, Ian Romanick wrote:
>>>> diff --git a/configure.ac b/configure.ac
>>>> index b0df1bb..7dc435a 100644
>>>> --- a/configure.ac
>>>> +++ b/configure.ac
>>>> @@ -253,8 +253,9 @@ AC_SUBST([VISIBILITY_CXXFLAGS])
>>>> dnl
>>>> dnl Optional flags, check for compiler support
>>>> dnl
>>>> +SSE41_CFLAGS="-msse4.1"
>>>> save_CFLAGS="$CFLAGS"
>>>> -CFLAGS="-msse4.1 $CFLAGS"
>>>> +CFLAGS="$SSE41_CFLAGS $CFLAGS"
>>>> AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
>>>> #include
>>>> int main () {
>>>> @@ -474,6 +475,12 @@ fi
>>>> dnl
>>>> dnl Arch/platform-specific settings
>>>> dnl
>>>> +case "$target_cpu" in
>>>> +i?86)
>>>> +SSE41_CFLAGS="$SSE41_CFLAGS -mstackrealign"
>>>> +;;
>>>> +esac
>>>
>>> Should we only add -mstackrealign if SSE41_CFLAGS is not empty?
>>
>> It looks like it's unconditionally set.
>
> Right... I'm asking if we should conditionally add it instead. Is there
> any harm in doing -mstackrealign when it's not needed?

Ah, I see. Unclear whether -mstackrealign does anything on x86-64. I do see some reports like [1] or [2] where some version of gcc errored out with "error: -mstackrealign not supported in the 64bit mode" or ICE'd so I suspect we shouldn't use it unconditionally.

[1] http://tremulous.net/forum/index.php?topic=16205.10;wap2
[2] https://www.mail-archive.com/gcc-bugs@gcc.gnu.org/msg256007.html
Re: [Mesa-dev] [PATCH V2] mesa: use build flag to ensure stack is realigned on x86
On Sun, Dec 7, 2014 at 4:13 AM, Timothy Arceri wrote: > Nowadays GCC assumes stack pointer is 16-byte aligned even on 32-bits, but > that is an assumption OpenGL drivers (or any dynamic library for that matter) > can't afford to make as there are many closed- and open- source application > binaries out there that only assume 4-byte stack alignment. Line wrap the commit message. > > V2: use $target_cpu rather than $host_cpu > and setup build flags in config rather than makefile > > https://bugs.freedesktop.org/show_bug.cgi?id=86788 > Signed-off-by: Timothy Arceri > --- > Tested by cross compiling and running 32-bit version of > UrbanTerror. > > Please note if this patch is ok it should also be applied to 10.4 with > the last hunk removed. > > configure.ac | 11 ++- > src/mesa/Makefile.am | 2 +- > src/mesa/main/sse_minmax.c | 3 --- > 3 files changed, 11 insertions(+), 5 deletions(-) > > diff --git a/configure.ac b/configure.ac > index b0df1bb..7dc435a 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -253,8 +253,9 @@ AC_SUBST([VISIBILITY_CXXFLAGS]) > dnl > dnl Optional flags, check for compiler support > dnl > +SSE41_CFLAGS="-msse4.1" > save_CFLAGS="$CFLAGS" > -CFLAGS="-msse4.1 $CFLAGS" > +CFLAGS="$SSE41_CFLAGS $CFLAGS" > AC_COMPILE_IFELSE([AC_LANG_SOURCE([[ > #include > int main () { > @@ -474,6 +475,12 @@ fi > dnl > dnl Arch/platform-specific settings > dnl > +case "$target_cpu" in > +i?86) > +SSE41_CFLAGS="$SSE41_CFLAGS -mstackrealign" > +;; ;; should be indented. > +esac I'd put this immediately after the SSE41_CFLAGS="..." assignment so that we're compiling the test program with -mstackrealign as well. 
> + > AC_ARG_ENABLE([asm], > [AS_HELP_STRING([--disable-asm], > [disable assembly usage @<:@default=enabled on supported > plaforms@:>@])], > @@ -2091,6 +2098,8 @@ AM_CONDITIONAL(HAVE_X86_ASM, test "x$asm_arch" = xx86 > -o "x$asm_arch" = xx86_64) > AM_CONDITIONAL(HAVE_X86_64_ASM, test "x$asm_arch" = xx86_64) > AM_CONDITIONAL(HAVE_SPARC_ASM, test "x$asm_arch" = xsparc) > > +AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS) I'd put this immediately before or after the AM_CONDITIONAL([SSE41_SUPPORTED], ...) statement. With those comments addressed, it's Reviewed-by: Matt Turner Thanks Timothy! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2] mesa: use build flag to ensure stack is realigned on x86
On 12/08/2014 01:56 PM, Matt Turner wrote: > On Mon, Dec 8, 2014 at 1:54 PM, Ian Romanick wrote: >>> diff --git a/configure.ac b/configure.ac >>> index b0df1bb..7dc435a 100644 >>> --- a/configure.ac >>> +++ b/configure.ac >>> @@ -253,8 +253,9 @@ AC_SUBST([VISIBILITY_CXXFLAGS]) >>> dnl >>> dnl Optional flags, check for compiler support >>> dnl >>> +SSE41_CFLAGS="-msse4.1" >>> save_CFLAGS="$CFLAGS" >>> -CFLAGS="-msse4.1 $CFLAGS" >>> +CFLAGS="$SSE41_CFLAGS $CFLAGS" >>> AC_COMPILE_IFELSE([AC_LANG_SOURCE([[ >>> #include >>> int main () { >>> @@ -474,6 +475,12 @@ fi >>> dnl >>> dnl Arch/platform-specific settings >>> dnl >>> +case "$target_cpu" in >>> +i?86) >>> +SSE41_CFLAGS="$SSE41_CFLAGS -mstackrealign" >>> +;; >>> +esac >> >> Should we only add -mstackrealign if SSE41_CFLAGS is not empty? > > It looks like it's unconditionally set. Right... I'm asking if we should conditionally add it instead. Is there any harm in doing -mstackrealign when it's not needed? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Finishing make distcheck
On 12/07/2014 06:57 PM, Matt Turner wrote:
> I've finished fixing up make distcheck.
>
> git://people.freedesktop.org/~mattst88/mesa make-dist
>
> I've seen some (sporadic?) failures of the glcpp/tests/glcpp-test. I
> think it's because it's trying to write out files into the
> distribution directory, which isn't allowed. I'll try to track that
> down.
>
> Other than that, I don't know of any problems. I've diffed the lists
> of files in git vs the distribution tarball and it looks as expected.
>
> It's 79 small (or mechanical, like alphabetizing) patches that aren't
> interesting to read, so I'm not going to send them to the list. I do
> hope Emil will have a little time to give it a once-over.
>
> The only question I really have is what archive formats we want to
> ship? As the branch is now, it generates tar.gz (11 MiB) and tar.xz
> (6.5 MiB). I think bzip2 is pretty useless these days (larger than xz
> and takes longer to decompress). Do we still want zip?

I think we still want zip for Windows. As far as xz vs bz2 goes, what do our other distro partners want? I guess Gentoo wants xz. :)

> With distcheck working, we should probably start using release.sh from
> git://anongit.freedesktop.org/xorg/util/modular to generate the
> announce emails and do the uploads like the X.Org projects do. I expect
> I can make that modification unless someone else wants to.

That sounds good to me, but I don't do releases any more. At least having the announce e-mails look the same as the rest of X.Org may have some benefit.
Re: [Mesa-dev] [PATCH V2] mesa: use build flag to ensure stack is realigned on x86
On Mon, Dec 8, 2014 at 1:54 PM, Ian Romanick wrote: >> diff --git a/configure.ac b/configure.ac >> index b0df1bb..7dc435a 100644 >> --- a/configure.ac >> +++ b/configure.ac >> @@ -253,8 +253,9 @@ AC_SUBST([VISIBILITY_CXXFLAGS]) >> dnl >> dnl Optional flags, check for compiler support >> dnl >> +SSE41_CFLAGS="-msse4.1" >> save_CFLAGS="$CFLAGS" >> -CFLAGS="-msse4.1 $CFLAGS" >> +CFLAGS="$SSE41_CFLAGS $CFLAGS" >> AC_COMPILE_IFELSE([AC_LANG_SOURCE([[ >> #include >> int main () { >> @@ -474,6 +475,12 @@ fi >> dnl >> dnl Arch/platform-specific settings >> dnl >> +case "$target_cpu" in >> +i?86) >> +SSE41_CFLAGS="$SSE41_CFLAGS -mstackrealign" >> +;; >> +esac > > Should we only add -mstackrealign if SSE41_CFLAGS is not empty? It looks like it's unconditionally set. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2] mesa: use build flag to ensure stack is realigned on x86
On 12/07/2014 04:13 AM, Timothy Arceri wrote:
> Nowadays GCC assumes stack pointer is 16-byte aligned even on 32-bits, but
> that is an assumption OpenGL drivers (or any dynamic library for that matter)
> can't afford to make as there are many closed- and open- source application
> binaries out there that only assume 4-byte stack alignment.
>
> V2: use $target_cpu rather than $host_cpu
> and setup build flags in config rather than makefile
>
> https://bugs.freedesktop.org/show_bug.cgi?id=86788
> Signed-off-by: Timothy Arceri
> ---
> Tested by cross compiling and running 32-bit version of
> UrbanTerror.
>
> Please note if this patch is ok it should also be applied to 10.4 with
> the last hunk removed.
>
> configure.ac | 11 ++-
> src/mesa/Makefile.am | 2 +-
> src/mesa/main/sse_minmax.c | 3 ---
> 3 files changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index b0df1bb..7dc435a 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -253,8 +253,9 @@ AC_SUBST([VISIBILITY_CXXFLAGS])
>  dnl
>  dnl Optional flags, check for compiler support
>  dnl
> +SSE41_CFLAGS="-msse4.1"
>  save_CFLAGS="$CFLAGS"
> -CFLAGS="-msse4.1 $CFLAGS"
> +CFLAGS="$SSE41_CFLAGS $CFLAGS"
>  AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
>  #include
>  int main () {
> @@ -474,6 +475,12 @@ fi
>  dnl
>  dnl Arch/platform-specific settings
>  dnl
> +case "$target_cpu" in
> +i?86)
> +SSE41_CFLAGS="$SSE41_CFLAGS -mstackrealign"
> +;;
> +esac

Should we only add -mstackrealign if SSE41_CFLAGS is not empty?

> +
>  AC_ARG_ENABLE([asm],
>  [AS_HELP_STRING([--disable-asm],
>  [disable assembly usage @<:@default=enabled on supported
> plaforms@:>@])],
> @@ -2091,6 +2098,8 @@ AM_CONDITIONAL(HAVE_X86_ASM, test "x$asm_arch" = xx86
> -o "x$asm_arch" = xx86_64)
>  AM_CONDITIONAL(HAVE_X86_64_ASM, test "x$asm_arch" = xx86_64)
>  AM_CONDITIONAL(HAVE_SPARC_ASM, test "x$asm_arch" = xsparc)
>
> +AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS)
> +
>  AC_SUBST([NINE_MAJOR], 1)
>  AC_SUBST([NINE_MINOR], 0)
>  AC_SUBST([NINE_TINY], 0)
> diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
> index 932db4f..3b68573 100644
> --- a/src/mesa/Makefile.am
> +++ b/src/mesa/Makefile.am
> @@ -153,7 +153,7 @@ libmesagallium_la_LIBADD = \
>  libmesa_sse41_la_SOURCES = \
>  main/streaming-load-memcpy.c \
>  main/sse_minmax.c
> -libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) -msse4.1
> +libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) $(SSE41_CFLAGS)
>
>  pkgconfigdir = $(libdir)/pkgconfig
>  pkgconfig_DATA = gl.pc
> diff --git a/src/mesa/main/sse_minmax.c b/src/mesa/main/sse_minmax.c
> index 93cf2a6..222ac14 100644
> --- a/src/mesa/main/sse_minmax.c
> +++ b/src/mesa/main/sse_minmax.c
> @@ -31,9 +31,6 @@
>  #include
>
>  void
> -#if !defined(__x86_64__)
> - __attribute__((force_align_arg_pointer))
> -#endif
>  _mesa_uint_array_min_max(const unsigned *ui_indices, unsigned *min_index,
>  unsigned *max_index, const unsigned count)
>  {
>
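For readers following the thread, here is a minimal C sketch of the per-function approach that the last hunk removes in favour of the -mstackrealign build flag. The macro EXAMPLE_REALIGN_STACK and both functions are hypothetical names made up for illustration; only the GCC attribute force_align_arg_pointer and the flag itself come from the patch. The sketch assumes a C11 compiler for _Alignas.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper macro (not part of Mesa): request stack realignment
 * only on 32-bit x86 with a GNU-compatible compiler, mirroring the
 * "#if !defined(__x86_64__)" guard that the patch removes.
 */
#if defined(__GNUC__) && defined(__i386__)
#define EXAMPLE_REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define EXAMPLE_REALIGN_STACK
#endif

/*
 * Returns 1 when a local that asks for 16-byte alignment really ends up on
 * a 16-byte boundary. SSE code that spills to the stack depends on this.
 * On x86-32 a caller may only guarantee 4-byte alignment, so the function
 * realigns the stack on entry instead of trusting the caller.
 */
EXAMPLE_REALIGN_STACK
static int
example_local_is_aligned(void)
{
   _Alignas(16) volatile unsigned char scratch[16] = {0};

   return ((uintptr_t) scratch % 16) == 0;
}

/* Small usage demo for the helper above. */
static void
example_demo(void)
{
   assert(example_local_is_aligned() == 1);
}
```

The attribute has to be repeated on every affected function, whereas -mstackrealign covers every function in the files built with SSE41_CFLAGS, which appears to be the motivation for moving the fix into the build flags.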
Re: [Mesa-dev] [PATCH] drirc: set allow_glsl_extension_directive_midshader for Dead Island.
Has the game vendor been notified of their bug? This won't work on any
OpenGL ES 2.x or 3.x implementation (there *is* a conformance test), and
AMD has said they're going to make their driver be conformant too.

On 12/08/2014 10:43 AM, Sven Arvidsson wrote:
> Signed-off-by: Sven Arvidsson
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87076
> ---
> src/mesa/drivers/dri/common/drirc | 4
> 1 file changed, 4 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/common/drirc
> b/src/mesa/drivers/dri/common/drirc
> index 4b9841b..cecd6a9 100644
> --- a/src/mesa/drivers/dri/common/drirc
> +++ b/src/mesa/drivers/dri/common/drirc
> @@ -87,5 +87,9 @@ TODO: document the other workarounds.
>
>
>
> +
> +
> + value="true" />
> +
>
>
>
[Mesa-dev] [PATCH] program: Delete dead _mesa_realloc_instructions.
Dead since 2010 (commit 284ce209).

Reviewed-by: Ian Romanick
---
 src/mesa/program/prog_instruction.c | 17 -
 src/mesa/program/prog_instruction.h | 4
 2 files changed, 21 deletions(-)

diff --git a/src/mesa/program/prog_instruction.c b/src/mesa/program/prog_instruction.c
index c1b9527..254c012 100644
--- a/src/mesa/program/prog_instruction.c
+++ b/src/mesa/program/prog_instruction.c
@@ -75,23 +75,6 @@ _mesa_alloc_instructions(GLuint numInst)


 /**
- * Reallocate memory storing an array of program instructions.
- * This is used when we need to append additional instructions onto an
- * program.
- * \param oldInst pointer to first of old/src instructions
- * \param numOldInst number of instructions at
- * \param numNewInst desired size of new instruction array.
- * \return pointer to start of new instruction array.
- */
-struct prog_instruction *
-_mesa_realloc_instructions(struct prog_instruction *oldInst,
-                           GLuint numOldInst, GLuint numNewInst)
-{
-   return realloc(oldInst, numNewInst * sizeof(struct prog_instruction));
-}
-
-
-/**
  * Copy an array of program instructions.
  * \param dest pointer to destination.
  * \param src pointer to source.
diff --git a/src/mesa/program/prog_instruction.h b/src/mesa/program/prog_instruction.h
index de78804..0957bd9 100644
--- a/src/mesa/program/prog_instruction.h
+++ b/src/mesa/program/prog_instruction.h
@@ -385,10 +385,6 @@ extern struct prog_instruction *
 _mesa_alloc_instructions(GLuint numInst);

 extern struct prog_instruction *
-_mesa_realloc_instructions(struct prog_instruction *oldInst,
-                           GLuint numOldInst, GLuint numNewInst);
-
-extern struct prog_instruction *
 _mesa_copy_instructions(struct prog_instruction *dest,
                         const struct prog_instruction *src, GLuint n);

--
2.0.4
Re: [Mesa-dev] [PATCH] egl: Added NULL check in eglCreateContext
On 12/02/2014 12:10 AM, Valentin Corfu wrote:
> With this check we can avoid segmentation fault when invalid value used
> during eglCreateContext.
>
> Cc: mesa-sta...@lists.freedesktop.org
> Cc: mesa-dev@lists.freedesktop.org
> Signed-off-by: Valentin Corfu
> ---
> src/egl/drivers/dri2/egl_dri2.c | 5 +
> 1 file changed, 5 insertions(+)
>
> diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
> index d795a2f..819cb77 100644
> --- a/src/egl/drivers/dri2/egl_dri2.c
> +++ b/src/egl/drivers/dri2/egl_dri2.c
> @@ -808,6 +808,11 @@ dri2_create_context(_EGLDriver *drv, _EGLDisplay *disp,
> _EGLConfig *conf,
>
>     (void) drv;
>
> +   if (conf == NULL) {
> +      _eglError(EGL_BAD_CONFIG, "dri2_create_context");
> +      return NULL;
> +   }
> +

Can't conf be NULL when used with MESA_configless_context? See also the
conf != NULL check at line 853.

Also, parameter validation etc. should go in eglCreateContext.

>     dri2_ctx = malloc(sizeof *dri2_ctx);
>     if (!dri2_ctx) {
>        _eglError(EGL_BAD_ALLOC, "eglCreateContext");
>
Re: [Mesa-dev] [PATCH 3/3] Don't cast the return value of malloc/realloc
On Mon, Dec 8, 2014 at 1:39 PM, Ian Romanick wrote:
> On 12/08/2014 11:56 AM, Matt Turner wrote:
>
>> diff --git a/src/mesa/program/prog_instruction.c
>> b/src/mesa/program/prog_instruction.c
>> index 976024e..c1b9527 100644
>> --- a/src/mesa/program/prog_instruction.c
>> +++ b/src/mesa/program/prog_instruction.c
>> @@ -87,13 +87,7 @@ struct prog_instruction *
>>  _mesa_realloc_instructions(struct prog_instruction *oldInst,
>>                             GLuint numOldInst, GLuint numNewInst)
>>  {
>> -   struct prog_instruction *newInst;
>> -
>> -   newInst = (struct prog_instruction *)
>> -      realloc(oldInst,
>> -              numNewInst * sizeof(struct prog_instruction));
>> -
>> -   return newInst;
>> +   return realloc(oldInst, numNewInst * sizeof(struct prog_instruction));
>>  }
>
> I don't see any callers of this function. Delete it instead?

Indeed! Dead since 2010 in fact (commit 284ce209).

I'll send another patch.
Re: [Mesa-dev] [PATCH 1/3] Remove useless checks for NULL before freeing
Didn't Gallium also remove the FREE business? Either way, patch 1 and 2 are Reviewed-by: Ian Romanick I sent a comment on patch 3. With that comment resolved in patch 3 or as a patch 4, patch 3 is also Reviewed-by: Ian Romanick On 12/08/2014 11:56 AM, Matt Turner wrote: > See commits 5067506e and b6109de3 for the Coccinelle script. > --- > src/gallium/auxiliary/util/u_debug_flush.c | 12 > src/gallium/drivers/i915/i915_state.c| 10 -- > src/gallium/drivers/ilo/shader/toy_tgsi.c| 6 ++ > src/gallium/drivers/nouveau/nv50/nv50_context.c | 3 +-- > src/gallium/drivers/nouveau/nv50/nv84_video.c| 3 +-- > src/gallium/drivers/nouveau/nvc0/nvc0_context.c | 3 +-- > src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 3 +-- > src/gallium/drivers/nouveau/nvc0/nvc0_surface.c | 3 +-- > src/gallium/drivers/r600/r600_isa.c | 12 > src/gallium/drivers/softpipe/sp_tile_cache.c | 3 +-- > src/gallium/state_trackers/hgl/hgl.c | 6 ++ > src/gallium/state_trackers/nine/nine_shader.c| 6 ++ > src/gallium/state_trackers/nine/pixelshader9.c | 3 +-- > src/gallium/state_trackers/nine/stateblock9.c| 8 > src/gallium/state_trackers/nine/swapchain9.c | 2 +- > src/gallium/state_trackers/nine/vertexdeclaration9.c | 9 +++-- > src/gallium/state_trackers/nine/vertexshader9.c | 3 +-- > src/gallium/winsys/svga/drm/vmw_screen_ioctl.c | 6 ++ > src/mesa/drivers/dri/common/xmlconfig.c | 3 +-- > src/mesa/main/objectlabel.c | 7 ++- > 20 files changed, 39 insertions(+), 72 deletions(-) > > diff --git a/src/gallium/auxiliary/util/u_debug_flush.c > b/src/gallium/auxiliary/util/u_debug_flush.c > index fdb248c..cdefca2 100644 > --- a/src/gallium/auxiliary/util/u_debug_flush.c > +++ b/src/gallium/auxiliary/util/u_debug_flush.c > @@ -132,8 +132,7 @@ debug_flush_buf_reference(struct debug_flush_buf **dst, > struct debug_flush_buf *fbuf = *dst; > > if (pipe_reference(&(*dst)->reference, &src->reference)) { > - if (fbuf->map_frame) > - FREE(fbuf->map_frame); > + FREE(fbuf->map_frame); > >FREE(fbuf); > } > @@ -146,8 +145,7 @@ 
debug_flush_item_destroy(struct debug_flush_item *item) > { > debug_flush_buf_reference(&item->fbuf, NULL); > > - if (item->ref_frame) > - FREE(item->ref_frame); > + FREE(item->ref_frame); > > FREE(item); > } > @@ -263,10 +261,8 @@ debug_flush_unmap(struct debug_flush_buf *fbuf) > > fbuf->mapped_sync = FALSE; > fbuf->mapped = FALSE; > - if (fbuf->map_frame) { > - FREE(fbuf->map_frame); > - fbuf->map_frame = NULL; > - } > + FREE(fbuf->map_frame); > + fbuf->map_frame = NULL; > pipe_mutex_unlock(fbuf->mutex); > } > > diff --git a/src/gallium/drivers/i915/i915_state.c > b/src/gallium/drivers/i915/i915_state.c > index c90fcfd..6ba9646 100644 > --- a/src/gallium/drivers/i915/i915_state.c > +++ b/src/gallium/drivers/i915/i915_state.c > @@ -628,12 +628,10 @@ void i915_delete_fs_state(struct pipe_context *pipe, > void *shader) > FREE(ifs->decl); > ifs->decl = NULL; > > - if (ifs->program) { > - FREE(ifs->program); > - ifs->program = NULL; > - FREE((struct tgsi_token *)ifs->state.tokens); > - ifs->state.tokens = NULL; > - } > + FREE(ifs->program); > + ifs->program = NULL; > + FREE((struct tgsi_token *)ifs->state.tokens); > + ifs->state.tokens = NULL; > > ifs->program_len = 0; > ifs->decl_len = 0; > diff --git a/src/gallium/drivers/ilo/shader/toy_tgsi.c > b/src/gallium/drivers/ilo/shader/toy_tgsi.c > index 57501ea..65e47bf 100644 > --- a/src/gallium/drivers/ilo/shader/toy_tgsi.c > +++ b/src/gallium/drivers/ilo/shader/toy_tgsi.c > @@ -2296,10 +2296,8 @@ add_imm(struct toy_tgsi *tgsi, enum toy_type type, > const uint32_t *buf) > cur_size * sizeof(new_types[0]), > new_size * sizeof(new_types[0])); >if (!new_buf || !new_types) { > - if (new_buf) > -FREE(new_buf); > - if (new_types) > -FREE(new_types); > + FREE(new_buf); > + FREE(new_types); > return -1; >} > > diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.c > b/src/gallium/drivers/nouveau/nv50/nv50_context.c > index 1a53579..2cfd5db 100644 > --- a/src/gallium/drivers/nouveau/nv50/nv50_context.c > +++ 
b/src/gallium/drivers/nouveau/nv50/nv50_context.c > @@ -338,8 +338,7 @@ out_err: >nouveau_bufctx_del(&nv50->bufctx_3d); > if (nv50->bufctx) >nouveau_bufctx_del(&nv50->bufctx); > - if (nv50->blit) > - FREE(nv50->blit); > + FREE(nv50->blit); > FREE(nv50); > return NULL; > } > diff --git a/src/gallium/drivers/nouveau/nv50/nv84_video.c > b/src/gallium/drivers/nouveau/nv50/nv84_video.c > index 395bd7a..7a4670f 1006
Re: [Mesa-dev] [PATCH 3/3] Don't cast the return value of malloc/realloc
On 12/08/2014 11:56 AM, Matt Turner wrote:
> diff --git a/src/mesa/program/prog_instruction.c
> b/src/mesa/program/prog_instruction.c
> index 976024e..c1b9527 100644
> --- a/src/mesa/program/prog_instruction.c
> +++ b/src/mesa/program/prog_instruction.c
> @@ -87,13 +87,7 @@ struct prog_instruction *
>  _mesa_realloc_instructions(struct prog_instruction *oldInst,
>                             GLuint numOldInst, GLuint numNewInst)
>  {
> -   struct prog_instruction *newInst;
> -
> -   newInst = (struct prog_instruction *)
> -      realloc(oldInst,
> -              numNewInst * sizeof(struct prog_instruction));
> -
> -   return newInst;
> +   return realloc(oldInst, numNewInst * sizeof(struct prog_instruction));
>  }

I don't see any callers of this function. Delete it instead?
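The quoted hunk relies on the fact that, in C, the void * returned by realloc converts implicitly to any object pointer type, so the cast adds nothing. A minimal sketch of the same shape follows; struct example_inst and both function names are hypothetical stand-ins, not Mesa code. Note that this only holds for C; a C++ compiler would still require the cast.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct prog_instruction. */
struct example_inst {
   int opcode;
   int dst;
   int src[3];
};

/*
 * Resize an instruction array. No cast is needed on the return value:
 * C converts void * to struct example_inst * implicitly.
 * Passing old == NULL makes realloc behave like malloc.
 */
static struct example_inst *
example_grow_instructions(struct example_inst *old, size_t new_count)
{
   return realloc(old, new_count * sizeof(struct example_inst));
}

/* Small usage demo: grow an array and check that old contents survive. */
static void
example_grow_demo(void)
{
   struct example_inst *insts = example_grow_instructions(NULL, 2);
   struct example_inst *bigger;

   assert(insts != NULL);
   insts[1].opcode = 7;

   bigger = example_grow_instructions(insts, 8);
   assert(bigger != NULL);
   assert(bigger[1].opcode == 7);

   free(bigger);
}
```

As in the original function, the old element count is not needed by realloc itself, which is one hint that a thin wrapper like this carries little value once it has no callers.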
Re: [Mesa-dev] [PATCH] r200: Avoid out of bounds array access.
On Mon, Dec 8, 2014 at 1:34 PM, Ian Romanick wrote:
> Reviewed-by: Ian Romanick
>
> How'd you come across this?

I saw a scary looking warning during the make distcheck build. :)
Re: [Mesa-dev] [PATCH] egl: Added NULL check in eglCreateContext
On Tue, Dec 2, 2014 at 12:10 AM, Valentin Corfu wrote:
> With this check we can avoid segmentation fault when invalid value used
> during eglCreateContext.

Kristian, Ian, can one of you review this?
Re: [Mesa-dev] [PATCH] r200: Avoid out of bounds array access.
Reviewed-by: Ian Romanick

How'd you come across this?

On 12/08/2014 11:34 AM, Matt Turner wrote:
> ---
> Patch formatted with -U22 so that reviewers can see regs definition,
> and last element initialization with -1.
>
> src/mesa/drivers/dri/r200/r200_sanity.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/r200/r200_sanity.c
> b/src/mesa/drivers/dri/r200/r200_sanity.c
> index dd3cf81..34d83d8 100644
> --- a/src/mesa/drivers/dri/r200/r200_sanity.c
> +++ b/src/mesa/drivers/dri/r200/r200_sanity.c
> @@ -603,45 +603,45 @@ struct reg {
>     int idx;
>     struct reg_names *closest;
>     int flags;
>     union fi current;
>     union fi *values;
>     int nvalues;
>     int nalloc;
>     float vmin, vmax;
>  };
>
>
>  static struct reg regs[Elements(reg_names)+1];
>  static struct reg scalars[512+1];
>  static struct reg vectors[512*4+1];
>
>  static int total, total_changed, bufs;
>
>  static void init_regs( void )
>  {
>     struct reg_names *tmp;
>     int i;
>
> -   for (i = 0 ; i < Elements(regs) ; i++) {
> +   for (i = 0 ; i < Elements(reg_names) ; i++) {
>        regs[i].idx = reg_names[i].idx;
>        regs[i].closest = &reg_names[i];
>        regs[i].flags = 0;
>     }
>
>     for (i = 0, tmp = scalar_names ; i < Elements(scalars) ; i++) {
>        if (tmp[1].idx == i) tmp++;
>        scalars[i].idx = i;
>        scalars[i].closest = tmp;
>        scalars[i].flags = ISFLOAT;
>     }
>
>     for (i = 0, tmp = vector_names ; i < Elements(vectors) ; i++) {
>        if (tmp[1].idx*4 == i) tmp++;
>        vectors[i].idx = i;
>        vectors[i].closest = tmp;
>        vectors[i].flags = ISFLOAT|ISVEC;
>     }
>
>     regs[Elements(regs)-1].idx = -1;
>     scalars[Elements(scalars)-1].idx = -1;
>     vectors[Elements(vectors)-1].idx = -1;
>
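The one-line fix above is easier to see in a reduced form. regs has one more element than reg_names (the extra slot holds the -1 sentinel), so a loop bounded by Elements(regs) reads one entry past the end of reg_names. The sketch below reproduces that layout with made-up table contents; EXAMPLE_ELEMENTS and all example_* names are hypothetical and only mimic the shape of the r200 code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical equivalent of Mesa's Elements() macro. */
#define EXAMPLE_ELEMENTS(x) (sizeof(x) / sizeof((x)[0]))

struct example_name {
   int idx;
   const char *name;
};

struct example_reg {
   int idx;
   const struct example_name *closest;
};

/* Source table with three entries (contents invented for illustration). */
static const struct example_name example_names[] = {
   { 0x10, "REG_A" },
   { 0x20, "REG_B" },
   { 0x30, "REG_C" },
};

/* Destination table: one extra slot reserved for the -1 sentinel. */
static struct example_reg example_regs[EXAMPLE_ELEMENTS(example_names) + 1];

static void
example_init_regs(void)
{
   size_t i;

   /*
    * Bound the loop by the table being read. Using
    * EXAMPLE_ELEMENTS(example_regs) here would run one iteration too many
    * and read example_names[3], which is out of bounds.
    */
   for (i = 0; i < EXAMPLE_ELEMENTS(example_names); i++) {
      assert(i < EXAMPLE_ELEMENTS(example_regs));
      example_regs[i].idx = example_names[i].idx;
      example_regs[i].closest = &example_names[i];
   }

   /* The last slot is only ever written as a sentinel. */
   example_regs[EXAMPLE_ELEMENTS(example_regs) - 1].idx = -1;
}
```

The scalars and vectors loops in the quoted code are not affected in the same way, since they index their name tables through tmp rather than through i.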
Re: [Mesa-dev] [PATCH 1/3] ir_to_mesa: Remove sat to clamp lowering pass
Series is

Reviewed-by: Ian Romanick

On 12/08/2014 04:05 AM, Abdiel Janulgue wrote:
> Fixes an infinite loop in swrast where the lowering pass unpacks saturate into
> clamp but the opt_algebraic pass tries to do the opposite.
>
> v3 (Ian):
> This is a revert of commit cfa8c1cb "ir_to_mesa: lower ir_unop_saturate" on
> the ir_to_mesa.cpp portion. prog_execute.c can handle saturates in vertex
> shaders, so classic swrast shouldn't need this lowering pass.
>
> Cc: "10.4"
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83463
> Signed-off-by: Abdiel Janulgue
> ---
> src/mesa/program/ir_to_mesa.cpp | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp
> index 5cd9058..68e2597 100644
> --- a/src/mesa/program/ir_to_mesa.cpp
> +++ b/src/mesa/program/ir_to_mesa.cpp
> @@ -2946,9 +2946,7 @@ _mesa_ir_link_shader(struct gl_context *ctx, struct
> gl_shader_program *prog)
>        GLenum target =
> _mesa_shader_stage_to_program(prog->_LinkedShaders[i]->Stage);
>        lower_instructions(ir, (MOD_TO_FRACT | DIV_TO_MUL_RCP | EXP_TO_EXP2
>                                | LOG_TO_LOG2 | INT_DIV_TO_MUL_RCP
> -                              | ((options->EmitNoPow) ? POW_TO_EXP2 : 0)
> -                              | ((target == GL_VERTEX_PROGRAM_ARB) ?
> SAT_TO_CLAMP
> -                                 : 0)));
> +                              | ((options->EmitNoPow) ? POW_TO_EXP2 : 0)));
>
>        progress = do_lower_jumps(ir, true, true, options->EmitNoMainReturn,
> options->EmitNoCont, options->EmitNoLoops) || progress;
>
Re: [Mesa-dev] [PATCH] i965: Fix union usage for GCC <= 4.6.
On Fri, Dec 5, 2014 at 6:18 PM, Vinson Lee wrote:
> This patch fixes this build error with GCC <= 4.6.

Change GCC to G++ here and in the commit message, and

Reviewed-by: Matt Turner
Re: [Mesa-dev] [PATCH] swrast: Remove 'inline' from tex filter functions.
Reviewed-by: Brian Paul On 12/08/2014 01:02 PM, Matt Turner wrote: Reduces .text size of mesa_dri_drivers.so (i965-only) by 62k, or 1.4%. Note that we don't remove inline from lerp_2d(), which has a comment above it saying it definitely should be inlined. Though, removing the inline keyword from it doesn't actually change the compiled code for me. --- src/mesa/swrast/s_texfilter.c | 52 +-- 1 file changed, 26 insertions(+), 26 deletions(-) diff --git a/src/mesa/swrast/s_texfilter.c b/src/mesa/swrast/s_texfilter.c index 65cf52e..faeccae 100644 --- a/src/mesa/swrast/s_texfilter.c +++ b/src/mesa/swrast/s_texfilter.c @@ -73,7 +73,7 @@ lerp_2d(GLfloat a, GLfloat b, * Do 3D/trilinear interpolation of float values. * \sa lerp_2d */ -static inline GLfloat +static GLfloat lerp_3d(GLfloat a, GLfloat b, GLfloat c, GLfloat v000, GLfloat v100, GLfloat v010, GLfloat v110, GLfloat v001, GLfloat v101, GLfloat v011, GLfloat v111) @@ -91,7 +91,7 @@ lerp_3d(GLfloat a, GLfloat b, GLfloat c, /** * Do linear interpolation of colors. */ -static inline void +static void lerp_rgba(GLfloat result[4], GLfloat t, const GLfloat a[4], const GLfloat b[4]) { result[0] = LERP(t, a[0], b[0]); @@ -104,7 +104,7 @@ lerp_rgba(GLfloat result[4], GLfloat t, const GLfloat a[4], const GLfloat b[4]) /** * Do bilinear interpolation of colors. */ -static inline void +static void lerp_rgba_2d(GLfloat result[4], GLfloat a, GLfloat b, const GLfloat t00[4], const GLfloat t10[4], const GLfloat t01[4], const GLfloat t11[4]) @@ -119,7 +119,7 @@ lerp_rgba_2d(GLfloat result[4], GLfloat a, GLfloat b, /** * Do trilinear interpolation of colors. 
*/ -static inline void +static void lerp_rgba_3d(GLfloat result[4], GLfloat a, GLfloat b, GLfloat c, const GLfloat t000[4], const GLfloat t100[4], const GLfloat t010[4], const GLfloat t110[4], @@ -155,7 +155,7 @@ lerp_rgba_3d(GLfloat result[4], GLfloat a, GLfloat b, GLfloat c, *i0, i1 = returns two nearest texel indexes *weight = returns blend factor between texels */ -static inline void +static void linear_texel_locations(GLenum wrapMode, const struct gl_texture_image *img, GLint size, GLfloat s, @@ -285,7 +285,7 @@ linear_texel_locations(GLenum wrapMode, /** * Used to compute texel location for nearest sampling. */ -static inline GLint +static GLint nearest_texel_location(GLenum wrapMode, const struct gl_texture_image *img, GLint size, GLfloat s) @@ -410,7 +410,7 @@ nearest_texel_location(GLenum wrapMode, /* Power of two image sizes only */ -static inline void +static void linear_repeat_texel_location(GLuint size, GLfloat s, GLint *i0, GLint *i1, GLfloat *weight) { @@ -424,7 +424,7 @@ linear_repeat_texel_location(GLuint size, GLfloat s, /** * Do clamp/wrap for a texture rectangle coord, GL_NEAREST filter mode. */ -static inline GLint +static GLint clamp_rect_coord_nearest(GLenum wrapMode, GLfloat coord, GLint max) { switch (wrapMode) { @@ -444,7 +444,7 @@ clamp_rect_coord_nearest(GLenum wrapMode, GLfloat coord, GLint max) /** * As above, but GL_LINEAR filtering. */ -static inline void +static void clamp_rect_coord_linear(GLenum wrapMode, GLfloat coord, GLint max, GLint *i0out, GLint *i1out, GLfloat *weight) { @@ -486,7 +486,7 @@ clamp_rect_coord_linear(GLenum wrapMode, GLfloat coord, GLint max, /** * Compute slice/image to use for 1D or 2D array texture. */ -static inline GLint +static GLint tex_array_slice(GLfloat coord, GLsizei size) { GLint slice = IFLOOR(coord + 0.5f); @@ -499,7 +499,7 @@ tex_array_slice(GLfloat coord, GLsizei size) * Compute nearest integer texcoords for given texobj and coordinate. * NOTE: only used for depth texture sampling. 
*/ -static inline void +static void nearest_texcoord(const struct gl_sampler_object *samp, const struct gl_texture_object *texObj, GLuint level, @@ -548,7 +548,7 @@ nearest_texcoord(const struct gl_sampler_object *samp, * Compute linear integer texcoords for given texobj and coordinate. * NOTE: only used for depth texture sampling. */ -static inline void +static void linear_texcoord(const struct gl_sampler_object *samp, const struct gl_texture_object *texObj, GLuint level, @@ -607,7 +607,7 @@ linear_texcoord(const struct gl_sampler_object *samp, * For linear interpolation between mipmap levels N and N+1, this function * computes N. */ -static inline GLint +static GLint linear_mipmap_level(const struct gl_texture_object *tObj, GLfloat lambda) { if (lambda < 0.0F) @@ -622,7 +622,7 @@ linear_mipmap_level(const struct gl_texture_object *tObj, GLfloa
Re: [Mesa-dev] [PATCH 1/3] Remove useless checks for NULL before freeing
On 12/08/2014 12:56 PM, Matt Turner wrote:

See commits 5067506e and b6109de3 for the Coccinelle script.
---
 src/gallium/auxiliary/util/u_debug_flush.c | 12
 src/gallium/drivers/i915/i915_state.c | 10 --
 src/gallium/drivers/ilo/shader/toy_tgsi.c | 6 ++
 src/gallium/drivers/nouveau/nv50/nv50_context.c | 3 +--
 src/gallium/drivers/nouveau/nv50/nv84_video.c | 3 +--
 src/gallium/drivers/nouveau/nvc0/nvc0_context.c | 3 +--
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 3 +--
 src/gallium/drivers/nouveau/nvc0/nvc0_surface.c | 3 +--
 src/gallium/drivers/r600/r600_isa.c | 12
 src/gallium/drivers/softpipe/sp_tile_cache.c | 3 +--
 src/gallium/state_trackers/hgl/hgl.c | 6 ++
 src/gallium/state_trackers/nine/nine_shader.c | 6 ++
 src/gallium/state_trackers/nine/pixelshader9.c | 3 +--
 src/gallium/state_trackers/nine/stateblock9.c | 8
 src/gallium/state_trackers/nine/swapchain9.c | 2 +-
 src/gallium/state_trackers/nine/vertexdeclaration9.c | 9 +++--
 src/gallium/state_trackers/nine/vertexshader9.c | 3 +--
 src/gallium/winsys/svga/drm/vmw_screen_ioctl.c | 6 ++
 src/mesa/drivers/dri/common/xmlconfig.c | 3 +--
 src/mesa/main/objectlabel.c | 7 ++-
 20 files changed, 39 insertions(+), 72 deletions(-)

For the series:

Reviewed-by: Brian Paul
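The series under review rests on a guarantee of the C standard: free(NULL) does nothing, so guarding it with an if is redundant. A minimal sketch of the before and after shape follows, using plain free rather than Gallium's FREE macro (the sketch assumes FREE tolerates NULL the same way, as the patch implies). struct example_buf and the function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct debug_flush_buf. */
struct example_buf {
   char *map_frame;
   int mapped;
};

/*
 * After the cleanup: no "if (buf->map_frame)" guard. free(NULL) is defined
 * to be a no-op, so the call is safe whether or not a frame was recorded.
 */
static void
example_unmap(struct example_buf *buf)
{
   buf->mapped = 0;
   free(buf->map_frame);
   buf->map_frame = NULL;
}

/* Small usage demo: unmapping twice is harmless. */
static void
example_unmap_demo(void)
{
   struct example_buf buf = { malloc(16), 1 };

   example_unmap(&buf);          /* frees the allocation */
   assert(buf.map_frame == NULL);

   example_unmap(&buf);          /* second call passes NULL to free */
   assert(buf.mapped == 0);
}
```

Resetting the pointer to NULL after freeing, as the quoted debug_flush_unmap hunk does, is what keeps the second call from becoming a double free.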
[Mesa-dev] [PATCH] swrast: Remove 'inline' from tex filter functions.
Reduces .text size of mesa_dri_drivers.so (i965-only) by 62k, or 1.4%. Note that we don't remove inline from lerp_2d(), which has a comment above it saying it definitely should be inlined. Though, removing the inline keyword from it doesn't actually change the compiled code for me. --- src/mesa/swrast/s_texfilter.c | 52 +-- 1 file changed, 26 insertions(+), 26 deletions(-) diff --git a/src/mesa/swrast/s_texfilter.c b/src/mesa/swrast/s_texfilter.c index 65cf52e..faeccae 100644 --- a/src/mesa/swrast/s_texfilter.c +++ b/src/mesa/swrast/s_texfilter.c @@ -73,7 +73,7 @@ lerp_2d(GLfloat a, GLfloat b, * Do 3D/trilinear interpolation of float values. * \sa lerp_2d */ -static inline GLfloat +static GLfloat lerp_3d(GLfloat a, GLfloat b, GLfloat c, GLfloat v000, GLfloat v100, GLfloat v010, GLfloat v110, GLfloat v001, GLfloat v101, GLfloat v011, GLfloat v111) @@ -91,7 +91,7 @@ lerp_3d(GLfloat a, GLfloat b, GLfloat c, /** * Do linear interpolation of colors. */ -static inline void +static void lerp_rgba(GLfloat result[4], GLfloat t, const GLfloat a[4], const GLfloat b[4]) { result[0] = LERP(t, a[0], b[0]); @@ -104,7 +104,7 @@ lerp_rgba(GLfloat result[4], GLfloat t, const GLfloat a[4], const GLfloat b[4]) /** * Do bilinear interpolation of colors. */ -static inline void +static void lerp_rgba_2d(GLfloat result[4], GLfloat a, GLfloat b, const GLfloat t00[4], const GLfloat t10[4], const GLfloat t01[4], const GLfloat t11[4]) @@ -119,7 +119,7 @@ lerp_rgba_2d(GLfloat result[4], GLfloat a, GLfloat b, /** * Do trilinear interpolation of colors. 
*/ -static inline void +static void lerp_rgba_3d(GLfloat result[4], GLfloat a, GLfloat b, GLfloat c, const GLfloat t000[4], const GLfloat t100[4], const GLfloat t010[4], const GLfloat t110[4], @@ -155,7 +155,7 @@ lerp_rgba_3d(GLfloat result[4], GLfloat a, GLfloat b, GLfloat c, *i0, i1 = returns two nearest texel indexes *weight = returns blend factor between texels */ -static inline void +static void linear_texel_locations(GLenum wrapMode, const struct gl_texture_image *img, GLint size, GLfloat s, @@ -285,7 +285,7 @@ linear_texel_locations(GLenum wrapMode, /** * Used to compute texel location for nearest sampling. */ -static inline GLint +static GLint nearest_texel_location(GLenum wrapMode, const struct gl_texture_image *img, GLint size, GLfloat s) @@ -410,7 +410,7 @@ nearest_texel_location(GLenum wrapMode, /* Power of two image sizes only */ -static inline void +static void linear_repeat_texel_location(GLuint size, GLfloat s, GLint *i0, GLint *i1, GLfloat *weight) { @@ -424,7 +424,7 @@ linear_repeat_texel_location(GLuint size, GLfloat s, /** * Do clamp/wrap for a texture rectangle coord, GL_NEAREST filter mode. */ -static inline GLint +static GLint clamp_rect_coord_nearest(GLenum wrapMode, GLfloat coord, GLint max) { switch (wrapMode) { @@ -444,7 +444,7 @@ clamp_rect_coord_nearest(GLenum wrapMode, GLfloat coord, GLint max) /** * As above, but GL_LINEAR filtering. */ -static inline void +static void clamp_rect_coord_linear(GLenum wrapMode, GLfloat coord, GLint max, GLint *i0out, GLint *i1out, GLfloat *weight) { @@ -486,7 +486,7 @@ clamp_rect_coord_linear(GLenum wrapMode, GLfloat coord, GLint max, /** * Compute slice/image to use for 1D or 2D array texture. */ -static inline GLint +static GLint tex_array_slice(GLfloat coord, GLsizei size) { GLint slice = IFLOOR(coord + 0.5f); @@ -499,7 +499,7 @@ tex_array_slice(GLfloat coord, GLsizei size) * Compute nearest integer texcoords for given texobj and coordinate. * NOTE: only used for depth texture sampling. 
*/ -static inline void +static void nearest_texcoord(const struct gl_sampler_object *samp, const struct gl_texture_object *texObj, GLuint level, @@ -548,7 +548,7 @@ nearest_texcoord(const struct gl_sampler_object *samp, * Compute linear integer texcoords for given texobj and coordinate. * NOTE: only used for depth texture sampling. */ -static inline void +static void linear_texcoord(const struct gl_sampler_object *samp, const struct gl_texture_object *texObj, GLuint level, @@ -607,7 +607,7 @@ linear_texcoord(const struct gl_sampler_object *samp, * For linear interpolation between mipmap levels N and N+1, this function * computes N. */ -static inline GLint +static GLint linear_mipmap_level(const struct gl_texture_object *tObj, GLfloat lambda) { if (lambda < 0.0F) @@ -622,7 +622,7 @@ linear_mipmap_level(const struct gl_texture_object *tObj, GLfloat lambda) /** * Compute the nearest mipmap level to take texels from. */ -static inline GLint +static GLint nearest_mipmap_level(const struct gl
[Mesa-dev] [PATCH 3/3] Don't cast the return value of malloc/realloc
See commit 2b7a972e for the Coccinelle script. --- src/gallium/state_trackers/glx/xlib/glx_api.c | 12 +--- src/gallium/state_trackers/glx/xlib/glx_usefont.c | 2 +- src/gallium/state_trackers/glx/xlib/xm_api.c | 2 +- src/gallium/state_trackers/wgl/stw_tls.c | 2 +- src/mesa/drivers/x11/fakeglx.c| 3 +-- src/mesa/drivers/x11/xm_api.c | 2 +- src/mesa/main/imports.c | 4 ++-- src/mesa/main/objectlabel.c | 2 +- src/mesa/main/shaderapi.c | 5 ++--- src/mesa/program/prog_instruction.c | 8 +--- src/mesa/program/prog_parameter.c | 2 +- 11 files changed, 17 insertions(+), 27 deletions(-) diff --git a/src/gallium/state_trackers/glx/xlib/glx_api.c b/src/gallium/state_trackers/glx/xlib/glx_api.c index 1807edb..ad80dc0 100644 --- a/src/gallium/state_trackers/glx/xlib/glx_api.c +++ b/src/gallium/state_trackers/glx/xlib/glx_api.c @@ -253,8 +253,7 @@ save_glx_visual( Display *dpy, XVisualInfo *vinfo, */ xmvis->vishandle = vinfo; /* Allocate more space for additional visual */ - VisualTable = (XMesaVisual *) realloc( VisualTable, - sizeof(XMesaVisual) * (NumVisuals + 1)); + VisualTable = realloc(VisualTable, sizeof(XMesaVisual) * (NumVisuals + 1)); /* add xmvis to the list */ VisualTable[NumVisuals] = xmvis; NumVisuals++; @@ -1078,7 +1077,7 @@ glXChooseVisual( Display *dpy, int screen, int *list ) xmvis = choose_visual(dpy, screen, list, GL_FALSE); if (xmvis) { /* create a new vishandle - the cached one may be stale */ - xmvis->vishandle = (XVisualInfo *) malloc(sizeof(XVisualInfo)); + xmvis->vishandle = malloc(sizeof(XVisualInfo)); if (xmvis->vishandle) { memcpy(xmvis->vishandle, xmvis->visinfo, sizeof(XVisualInfo)); } @@ -1829,8 +1828,7 @@ glXGetFBConfigs( Display *dpy, int screen, int *nelements ) visTemplate.screen = screen; visuals = XGetVisualInfo(dpy, visMask, &visTemplate, nelements); if (*nelements > 0) { - XMesaVisual *results; - results = (XMesaVisual *) malloc(*nelements * sizeof(XMesaVisual)); + XMesaVisual *results = malloc(*nelements * sizeof(XMesaVisual)); if (!results) 
{ *nelements = 0; return NULL; @@ -1864,7 +1862,7 @@ glXChooseFBConfig(Display *dpy, int screen, xmvis = choose_visual(dpy, screen, attribList, GL_TRUE); if (xmvis) { - GLXFBConfig *config = (GLXFBConfig *) malloc(sizeof(XMesaVisual)); + GLXFBConfig *config = malloc(sizeof(XMesaVisual)); if (!config) { *nitems = 0; return NULL; @@ -1889,7 +1887,7 @@ glXGetVisualFromFBConfig( Display *dpy, GLXFBConfig config ) return xmvis->vishandle; #else /* create a new vishandle - the cached one may be stale */ - xmvis->vishandle = (XVisualInfo *) malloc(sizeof(XVisualInfo)); + xmvis->vishandle = malloc(sizeof(XVisualInfo)); if (xmvis->vishandle) { memcpy(xmvis->vishandle, xmvis->visinfo, sizeof(XVisualInfo)); } diff --git a/src/gallium/state_trackers/glx/xlib/glx_usefont.c b/src/gallium/state_trackers/glx/xlib/glx_usefont.c index de123f2..f7ee68b 100644 --- a/src/gallium/state_trackers/glx/xlib/glx_usefont.c +++ b/src/gallium/state_trackers/glx/xlib/glx_usefont.c @@ -241,7 +241,7 @@ glXUseXFont(Font font, int first, int count, int listbase) max_bm_width = (max_width + 7) / 8; max_bm_height = max_height; - bm = (GLubyte *) malloc((max_bm_width * max_bm_height) * sizeof(GLubyte)); + bm = malloc((max_bm_width * max_bm_height) * sizeof(GLubyte)); if (!bm) { XFreeFontInfo(NULL, fs, 1); _mesa_error(NULL, GL_OUT_OF_MEMORY, diff --git a/src/gallium/state_trackers/glx/xlib/xm_api.c b/src/gallium/state_trackers/glx/xlib/xm_api.c index 2aa5ac4..34be1f7 100644 --- a/src/gallium/state_trackers/glx/xlib/xm_api.c +++ b/src/gallium/state_trackers/glx/xlib/xm_api.c @@ -705,7 +705,7 @@ XMesaVisual XMesaCreateVisual( Display *display, * the struct but we may need some of the information contained in it * at a later time. 
*/ - v->visinfo = (XVisualInfo *) malloc(sizeof(*visinfo)); + v->visinfo = malloc(sizeof(*visinfo)); if (!v->visinfo) { free(v); return NULL; diff --git a/src/gallium/state_trackers/wgl/stw_tls.c b/src/gallium/state_trackers/wgl/stw_tls.c index 4b51845..ca27a53 100644 --- a/src/gallium/state_trackers/wgl/stw_tls.c +++ b/src/gallium/state_trackers/wgl/stw_tls.c @@ -120,7 +120,7 @@ stw_tls_data_create(DWORD dwThreadId) debug_printf("%s(0x%04lx)\n", __FUNCTION__, dwThreadId); } - data = (struct stw_tls_data *)calloc(1, sizeof *data); + data = calloc(1, sizeof *data); if (!data) { goto no_data; } diff --git a/src/mesa/drivers/x11/fakeglx.c b/src/mesa/drivers/x11/fakeglx.c index ee05f8a..
[Mesa-dev] [PATCH 2/3] Use calloc instead of malloc/memset-0
See commit 6bda027e for the Coccinelle script.
---
 src/egl/drivers/dri2/platform_wayland.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_wayland.c b/src/egl/drivers/dri2/platform_wayland.c
index 59b2792..ba0eb10 100644
--- a/src/egl/drivers/dri2/platform_wayland.c
+++ b/src/egl/drivers/dri2/platform_wayland.c
@@ -130,13 +130,12 @@ dri2_wl_create_surface(_EGLDriver *drv, _EGLDisplay *disp, EGLint type,

    (void) drv;

-   dri2_surf = malloc(sizeof *dri2_surf);
+   dri2_surf = calloc(1, sizeof *dri2_surf);
    if (!dri2_surf) {
       _eglError(EGL_BAD_ALLOC, "dri2_create_surface");
       return NULL;
    }

-   memset(dri2_surf, 0, sizeof *dri2_surf);
    if (!_eglInitSurface(&dri2_surf->base, disp, type, conf, attrib_list))
       goto cleanup_surf;

--
2.0.4
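The transformation in this patch can be sketched in isolation as follows. struct example_surface and the function names are hypothetical; the point is only that calloc hands back zero-filled memory, so a malloc followed by a memset to 0 of the whole object collapses into one call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct dri2_egl_surface. */
struct example_surface {
   int width;
   int height;
   void *native_window;
};

/* Before: allocate, check, then clear by hand. */
static struct example_surface *
example_surface_create_malloc(void)
{
   struct example_surface *surf = malloc(sizeof *surf);

   if (!surf)
      return NULL;

   memset(surf, 0, sizeof *surf);
   return surf;
}

/* After: calloc returns memory whose bytes are already all zero. */
static struct example_surface *
example_surface_create_calloc(void)
{
   return calloc(1, sizeof(struct example_surface));
}

/* Small usage demo: both variants produce byte-identical objects. */
static void
example_surface_demo(void)
{
   struct example_surface *a = example_surface_create_malloc();
   struct example_surface *b = example_surface_create_calloc();

   assert(a != NULL && b != NULL);
   assert(memcmp(a, b, sizeof *a) == 0);

   free(a);
   free(b);
}
```

Besides being shorter, the calloc form removes the window in which the object exists but has not been cleared yet, which is presumably why the Coccinelle script targets this pattern.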
[Mesa-dev] [PATCH 1/3] Remove useless checks for NULL before freeing
See commits 5067506e and b6109de3 for the Coccinelle script. --- src/gallium/auxiliary/util/u_debug_flush.c | 12 src/gallium/drivers/i915/i915_state.c| 10 -- src/gallium/drivers/ilo/shader/toy_tgsi.c| 6 ++ src/gallium/drivers/nouveau/nv50/nv50_context.c | 3 +-- src/gallium/drivers/nouveau/nv50/nv84_video.c| 3 +-- src/gallium/drivers/nouveau/nvc0/nvc0_context.c | 3 +-- src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 3 +-- src/gallium/drivers/nouveau/nvc0/nvc0_surface.c | 3 +-- src/gallium/drivers/r600/r600_isa.c | 12 src/gallium/drivers/softpipe/sp_tile_cache.c | 3 +-- src/gallium/state_trackers/hgl/hgl.c | 6 ++ src/gallium/state_trackers/nine/nine_shader.c| 6 ++ src/gallium/state_trackers/nine/pixelshader9.c | 3 +-- src/gallium/state_trackers/nine/stateblock9.c| 8 src/gallium/state_trackers/nine/swapchain9.c | 2 +- src/gallium/state_trackers/nine/vertexdeclaration9.c | 9 +++-- src/gallium/state_trackers/nine/vertexshader9.c | 3 +-- src/gallium/winsys/svga/drm/vmw_screen_ioctl.c | 6 ++ src/mesa/drivers/dri/common/xmlconfig.c | 3 +-- src/mesa/main/objectlabel.c | 7 ++- 20 files changed, 39 insertions(+), 72 deletions(-) diff --git a/src/gallium/auxiliary/util/u_debug_flush.c b/src/gallium/auxiliary/util/u_debug_flush.c index fdb248c..cdefca2 100644 --- a/src/gallium/auxiliary/util/u_debug_flush.c +++ b/src/gallium/auxiliary/util/u_debug_flush.c @@ -132,8 +132,7 @@ debug_flush_buf_reference(struct debug_flush_buf **dst, struct debug_flush_buf *fbuf = *dst; if (pipe_reference(&(*dst)->reference, &src->reference)) { - if (fbuf->map_frame) - FREE(fbuf->map_frame); + FREE(fbuf->map_frame); FREE(fbuf); } @@ -146,8 +145,7 @@ debug_flush_item_destroy(struct debug_flush_item *item) { debug_flush_buf_reference(&item->fbuf, NULL); - if (item->ref_frame) - FREE(item->ref_frame); + FREE(item->ref_frame); FREE(item); } @@ -263,10 +261,8 @@ debug_flush_unmap(struct debug_flush_buf *fbuf) fbuf->mapped_sync = FALSE; fbuf->mapped = FALSE; - if (fbuf->map_frame) { - 
FREE(fbuf->map_frame); - fbuf->map_frame = NULL; - } + FREE(fbuf->map_frame); + fbuf->map_frame = NULL; pipe_mutex_unlock(fbuf->mutex); } diff --git a/src/gallium/drivers/i915/i915_state.c b/src/gallium/drivers/i915/i915_state.c index c90fcfd..6ba9646 100644 --- a/src/gallium/drivers/i915/i915_state.c +++ b/src/gallium/drivers/i915/i915_state.c @@ -628,12 +628,10 @@ void i915_delete_fs_state(struct pipe_context *pipe, void *shader) FREE(ifs->decl); ifs->decl = NULL; - if (ifs->program) { - FREE(ifs->program); - ifs->program = NULL; - FREE((struct tgsi_token *)ifs->state.tokens); - ifs->state.tokens = NULL; - } + FREE(ifs->program); + ifs->program = NULL; + FREE((struct tgsi_token *)ifs->state.tokens); + ifs->state.tokens = NULL; ifs->program_len = 0; ifs->decl_len = 0; diff --git a/src/gallium/drivers/ilo/shader/toy_tgsi.c b/src/gallium/drivers/ilo/shader/toy_tgsi.c index 57501ea..65e47bf 100644 --- a/src/gallium/drivers/ilo/shader/toy_tgsi.c +++ b/src/gallium/drivers/ilo/shader/toy_tgsi.c @@ -2296,10 +2296,8 @@ add_imm(struct toy_tgsi *tgsi, enum toy_type type, const uint32_t *buf) cur_size * sizeof(new_types[0]), new_size * sizeof(new_types[0])); if (!new_buf || !new_types) { - if (new_buf) -FREE(new_buf); - if (new_types) -FREE(new_types); + FREE(new_buf); + FREE(new_types); return -1; } diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.c b/src/gallium/drivers/nouveau/nv50/nv50_context.c index 1a53579..2cfd5db 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_context.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_context.c @@ -338,8 +338,7 @@ out_err: nouveau_bufctx_del(&nv50->bufctx_3d); if (nv50->bufctx) nouveau_bufctx_del(&nv50->bufctx); - if (nv50->blit) - FREE(nv50->blit); + FREE(nv50->blit); FREE(nv50); return NULL; } diff --git a/src/gallium/drivers/nouveau/nv50/nv84_video.c b/src/gallium/drivers/nouveau/nv50/nv84_video.c index 395bd7a..7a4670f 100644 --- a/src/gallium/drivers/nouveau/nv50/nv84_video.c +++ 
b/src/gallium/drivers/nouveau/nv50/nv84_video.c @@ -256,8 +256,7 @@ nv84_decoder_destroy(struct pipe_video_codec *decoder) nouveau_client_del(&dec->client); - if (dec->mpeg12_bs) - FREE(dec->mpeg12_bs); + FREE(dec->mpeg12_bs); FREE(dec); } diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.c b/src/gallium/drivers/nouveau/nvc0/nvc0_context.c index 3992460..7662fb5 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.c +++
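The reasoning behind this series can be sketched as follows, on the assumption that Mesa's `FREE()` macro ends up calling the C library's `free()`. The C standard specifies that `free(NULL)` performs no action, so a guard is only needed where the pointer itself is dereferenced. `struct decoder` and the two helpers below are hypothetical names, not taken from the patch.

```c
#include <stdlib.h>

/* Hypothetical object that owns one optional buffer. */
struct decoder {
   void *bitstream;   /* may legitimately be NULL */
};

static struct decoder *
decoder_create(size_t bitstream_size)
{
   struct decoder *dec = calloc(1, sizeof *dec);
   if (!dec)
      return NULL;
   if (bitstream_size)
      dec->bitstream = malloc(bitstream_size);
   return dec;
}

static void
decoder_destroy(struct decoder *dec)
{
   if (!dec)
      return;              /* still needed: dec is dereferenced below */
   free(dec->bitstream);   /* no NULL check: free(NULL) does nothing */
   free(dec);
}
```

Note that the check on `dec` stays, because removing it would turn a harmless call into a NULL dereference; only the check on the member being freed is redundant.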
[Mesa-dev] [PATCH] r200: Avoid out of bounds array access.
---
Patch formatted with -U22 so that reviewers can see regs definition, and last element initialization with -1.

 src/mesa/drivers/dri/r200/r200_sanity.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/r200/r200_sanity.c b/src/mesa/drivers/dri/r200/r200_sanity.c
index dd3cf81..34d83d8 100644
--- a/src/mesa/drivers/dri/r200/r200_sanity.c
+++ b/src/mesa/drivers/dri/r200/r200_sanity.c
@@ -603,45 +603,45 @@ struct reg {
    int idx;
    struct reg_names *closest;
    int flags;
    union fi current;
    union fi *values;
    int nvalues;
    int nalloc;
    float vmin, vmax;
 };
 static struct reg regs[Elements(reg_names)+1];
 static struct reg scalars[512+1];
 static struct reg vectors[512*4+1];
 static int total, total_changed, bufs;
 static void init_regs( void )
 {
    struct reg_names *tmp;
    int i;
-   for (i = 0 ; i < Elements(regs) ; i++) {
+   for (i = 0 ; i < Elements(reg_names) ; i++) {
       regs[i].idx = reg_names[i].idx;
       regs[i].closest = &reg_names[i];
       regs[i].flags = 0;
    }
    for (i = 0, tmp = scalar_names ; i < Elements(scalars) ; i++) {
       if (tmp[1].idx == i) tmp++;
       scalars[i].idx = i;
       scalars[i].closest = tmp;
       scalars[i].flags = ISFLOAT;
    }
    for (i = 0, tmp = vector_names ; i < Elements(vectors) ; i++) {
       if (tmp[1].idx*4 == i) tmp++;
       vectors[i].idx = i;
       vectors[i].closest = tmp;
       vectors[i].flags = ISFLOAT|ISVEC;
    }
    regs[Elements(regs)-1].idx = -1;
    scalars[Elements(scalars)-1].idx = -1;
    vectors[Elements(vectors)-1].idx = -1;
--
2.0.4
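The bug fixed here is an off-by-one between a source table and a destination array that carries one extra sentinel slot. A minimal sketch of the corrected pattern, with a made-up three-entry table and a hypothetical `count_regs()` helper (neither is from r200_sanity.c):

```c
#include <stddef.h>

#define ELEMENTS(a) (sizeof(a) / sizeof((a)[0]))

struct reg_name {
   int idx;
   const char *name;
};

/* Made-up source table with three entries. */
static const struct reg_name reg_names[] = {
   { 0x10, "A" },
   { 0x20, "B" },
   { 0x30, "C" },
};

struct reg {
   int idx;
   const struct reg_name *closest;
};

/* One extra slot at the end holds the -1 sentinel. */
static struct reg regs[ELEMENTS(reg_names) + 1];

static void
init_regs(void)
{
   size_t i;

   /* Bound the loop by the SOURCE table. Using ELEMENTS(regs) here
    * would read reg_names[3], one element past its end. */
   for (i = 0; i < ELEMENTS(reg_names); i++) {
      regs[i].idx = reg_names[i].idx;
      regs[i].closest = &reg_names[i];
   }

   /* The extra slot is filled separately with the terminator. */
   regs[ELEMENTS(regs) - 1].idx = -1;
   regs[ELEMENTS(regs) - 1].closest = NULL;
}

/* Hypothetical consumer: walks the array until the sentinel. */
static size_t
count_regs(void)
{
   size_t n = 0;
   while (regs[n].idx != -1)
      n++;
   return n;
}
```

The design point is that the loop bound comes from the array being read, while the sentinel write uses the size of the array being written.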
Re: [Mesa-dev] [PATCH] draw: implement TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION
It's actually quite nice in our hardware drivers. The viewport transformation enable bit and the computation of 1/w bit now lives in the VS state. The clip enable bits have to be derived from both the VS and the rasterizer state on most/all drivers anyway (based on whether ClipDistances or ClipVertex is used or none of them - yeah we have 3 ways of doing clipping), so the clip disable bit is just part of that state now. Marek On Mon, Dec 8, 2014 at 7:25 PM, Roland Scheidegger wrote: > Am 06.12.2014 um 18:24 schrieb Marek Olšák: >> Ping. >> >> On Mon, Nov 17, 2014 at 10:43 PM, Marek Olšák wrote: >>> From: Marek Olšák >>> >>> Required by Nine. Tested with util_run_tests. >>> It's added to softpipe, llvmpipe, and r300g/swtcl. >>> --- >>> src/gallium/auxiliary/draw/draw_context.c | 40 >>> ++ >>> src/gallium/auxiliary/draw/draw_llvm.c | 2 +- >>> src/gallium/auxiliary/draw/draw_private.h | 4 +++ >>> .../auxiliary/draw/draw_pt_fetch_shade_emit.c | 2 +- >>> .../auxiliary/draw/draw_pt_fetch_shade_pipeline.c | 2 +- >>> .../draw/draw_pt_fetch_shade_pipeline_llvm.c | 2 +- >>> src/gallium/auxiliary/draw/draw_vs.c | 2 ++ >>> src/gallium/drivers/llvmpipe/lp_screen.c | 2 ++ >>> src/gallium/drivers/r300/r300_screen.c | 2 +- >>> src/gallium/drivers/softpipe/sp_screen.c | 2 ++ >>> 10 files changed, 49 insertions(+), 11 deletions(-) >>> >>> diff --git a/src/gallium/auxiliary/draw/draw_context.c >>> b/src/gallium/auxiliary/draw/draw_context.c >>> index 2b640b6..d473cfc 100644 >>> --- a/src/gallium/auxiliary/draw/draw_context.c >>> +++ b/src/gallium/auxiliary/draw/draw_context.c >>> @@ -267,21 +267,48 @@ void draw_set_zs_format(struct draw_context *draw, >>> enum pipe_format format) >>> } >>> >>> >>> -static void update_clip_flags( struct draw_context *draw ) >>> +static bool >>> +draw_is_vs_window_space(struct draw_context *draw) >>> { >>> - draw->clip_xy = !draw->driver.bypass_clip_xy; >>> + if (draw->vs.vertex_shader) { >>> + struct tgsi_shader_info *info = 
&draw->vs.vertex_shader->info; >>> + >>> + return info->properties[TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION] != 0; >>> + } >>> + return false; >>> +} >>> + >>> + >>> +void >>> +draw_update_clip_flags(struct draw_context *draw) >>> +{ >>> + bool window_space = draw_is_vs_window_space(draw); >>> + >>> + draw->clip_xy = !draw->driver.bypass_clip_xy && !window_space; >>> draw->guard_band_xy = (!draw->driver.bypass_clip_xy && >>>draw->driver.guard_band_xy); >>> draw->clip_z = (!draw->driver.bypass_clip_z && >>> - draw->rasterizer && draw->rasterizer->depth_clip); >>> + draw->rasterizer && draw->rasterizer->depth_clip) && >>> + !window_space; >>> draw->clip_user = draw->rasterizer && >>> - draw->rasterizer->clip_plane_enable != 0; >>> + draw->rasterizer->clip_plane_enable != 0 && >>> + !window_space; >>> draw->guard_band_points_xy = draw->guard_band_xy || >>> (draw->driver.bypass_clip_points && >>> (draw->rasterizer && >>> draw->rasterizer->point_tri_clip)); >>> } >>> >>> + >>> +void >>> +draw_update_viewport_flags(struct draw_context *draw) >>> +{ >>> + bool window_space = draw_is_vs_window_space(draw); >>> + >>> + draw->bypass_viewport = window_space || draw->identity_viewport; >>> +} >>> + >>> + >>> /** >>> * Register new primitive rasterization/rendering state. >>> * This causes the drawing pipeline to be rebuilt. 
>>> @@ -295,7 +322,7 @@ void draw_set_rasterizer_state( struct draw_context >>> *draw, >>> >>>draw->rasterizer = raster; >>>draw->rast_handle = rast_handle; >>> - update_clip_flags(draw); >>> + draw_update_clip_flags(draw); >>> } >>> } >>> >>> @@ -322,7 +349,7 @@ void draw_set_driver_clipping( struct draw_context >>> *draw, >>> draw->driver.bypass_clip_z = bypass_clip_z; >>> draw->driver.guard_band_xy = guard_band_xy; >>> draw->driver.bypass_clip_points = bypass_clip_points; >>> - update_clip_flags(draw); >>> + draw_update_clip_flags(draw); >>> } >>> >>> >>> @@ -376,6 +403,7 @@ void draw_set_viewport_states( struct draw_context >>> *draw, >>> viewport->translate[0] == 0.0f && >>> viewport->translate[1] == 0.0f && >>> viewport->translate[2] == 0.0f); >>> + draw_update_viewport_flags(draw); >>> } >>> >>> >>> diff --git a/src/gallium/auxiliary/draw/draw_llvm.c >>> b/src/gallium/auxiliary/draw/draw_llvm.c >>> index 3a1b057..fbbe08b 100644 >>> --- a/src/gallium/auxiliary/draw/draw_llvm.c >>> +++ b/src/gallium/auxiliary/draw/draw_llvm.c >>> @@ -1831,7 +1831,7 @@ draw_llvm_make_variant_key(struct draw_llvm *llvm, >>> char *store) >>> key->clip_xy = llvm->draw->clip_xy; >>
Re: [Mesa-dev] [PATCH] drirc: set allow_glsl_extension_directive_midshader for Dead Island.
On Mon, Dec 8, 2014 at 10:43 AM, Sven Arvidsson wrote:
> Signed-off-by: Sven Arvidsson
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87076
> ---

Thanks!

(When someone commits this, please tag it for 10.4)
[Mesa-dev] [PATCH] drirc: set allow_glsl_extension_directive_midshader for Dead Island.
Signed-off-by: Sven Arvidsson Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87076 --- src/mesa/drivers/dri/common/drirc | 4 1 file changed, 4 insertions(+) diff --git a/src/mesa/drivers/dri/common/drirc b/src/mesa/drivers/dri/common/drirc index 4b9841b..cecd6a9 100644 --- a/src/mesa/drivers/dri/common/drirc +++ b/src/mesa/drivers/dri/common/drirc @@ -87,5 +87,9 @@ TODO: document the other workarounds. + + + + -- 2.1.3
Re: [Mesa-dev] [PATCH] draw: implement TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION
Am 06.12.2014 um 18:24 schrieb Marek Olšák: > Ping. > > On Mon, Nov 17, 2014 at 10:43 PM, Marek Olšák wrote: >> From: Marek Olšák >> >> Required by Nine. Tested with util_run_tests. >> It's added to softpipe, llvmpipe, and r300g/swtcl. >> --- >> src/gallium/auxiliary/draw/draw_context.c | 40 >> ++ >> src/gallium/auxiliary/draw/draw_llvm.c | 2 +- >> src/gallium/auxiliary/draw/draw_private.h | 4 +++ >> .../auxiliary/draw/draw_pt_fetch_shade_emit.c | 2 +- >> .../auxiliary/draw/draw_pt_fetch_shade_pipeline.c | 2 +- >> .../draw/draw_pt_fetch_shade_pipeline_llvm.c | 2 +- >> src/gallium/auxiliary/draw/draw_vs.c | 2 ++ >> src/gallium/drivers/llvmpipe/lp_screen.c | 2 ++ >> src/gallium/drivers/r300/r300_screen.c | 2 +- >> src/gallium/drivers/softpipe/sp_screen.c | 2 ++ >> 10 files changed, 49 insertions(+), 11 deletions(-) >> >> diff --git a/src/gallium/auxiliary/draw/draw_context.c >> b/src/gallium/auxiliary/draw/draw_context.c >> index 2b640b6..d473cfc 100644 >> --- a/src/gallium/auxiliary/draw/draw_context.c >> +++ b/src/gallium/auxiliary/draw/draw_context.c >> @@ -267,21 +267,48 @@ void draw_set_zs_format(struct draw_context *draw, >> enum pipe_format format) >> } >> >> >> -static void update_clip_flags( struct draw_context *draw ) >> +static bool >> +draw_is_vs_window_space(struct draw_context *draw) >> { >> - draw->clip_xy = !draw->driver.bypass_clip_xy; >> + if (draw->vs.vertex_shader) { >> + struct tgsi_shader_info *info = &draw->vs.vertex_shader->info; >> + >> + return info->properties[TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION] != 0; >> + } >> + return false; >> +} >> + >> + >> +void >> +draw_update_clip_flags(struct draw_context *draw) >> +{ >> + bool window_space = draw_is_vs_window_space(draw); >> + >> + draw->clip_xy = !draw->driver.bypass_clip_xy && !window_space; >> draw->guard_band_xy = (!draw->driver.bypass_clip_xy && >>draw->driver.guard_band_xy); >> draw->clip_z = (!draw->driver.bypass_clip_z && >> - draw->rasterizer && draw->rasterizer->depth_clip); >> + 
draw->rasterizer && draw->rasterizer->depth_clip) && >> + !window_space; >> draw->clip_user = draw->rasterizer && >> - draw->rasterizer->clip_plane_enable != 0; >> + draw->rasterizer->clip_plane_enable != 0 && >> + !window_space; >> draw->guard_band_points_xy = draw->guard_band_xy || >> (draw->driver.bypass_clip_points && >> (draw->rasterizer && >> draw->rasterizer->point_tri_clip)); >> } >> >> + >> +void >> +draw_update_viewport_flags(struct draw_context *draw) >> +{ >> + bool window_space = draw_is_vs_window_space(draw); >> + >> + draw->bypass_viewport = window_space || draw->identity_viewport; >> +} >> + >> + >> /** >> * Register new primitive rasterization/rendering state. >> * This causes the drawing pipeline to be rebuilt. >> @@ -295,7 +322,7 @@ void draw_set_rasterizer_state( struct draw_context >> *draw, >> >>draw->rasterizer = raster; >>draw->rast_handle = rast_handle; >> - update_clip_flags(draw); >> + draw_update_clip_flags(draw); >> } >> } >> >> @@ -322,7 +349,7 @@ void draw_set_driver_clipping( struct draw_context *draw, >> draw->driver.bypass_clip_z = bypass_clip_z; >> draw->driver.guard_band_xy = guard_band_xy; >> draw->driver.bypass_clip_points = bypass_clip_points; >> - update_clip_flags(draw); >> + draw_update_clip_flags(draw); >> } >> >> >> @@ -376,6 +403,7 @@ void draw_set_viewport_states( struct draw_context *draw, >> viewport->translate[0] == 0.0f && >> viewport->translate[1] == 0.0f && >> viewport->translate[2] == 0.0f); >> + draw_update_viewport_flags(draw); >> } >> >> >> diff --git a/src/gallium/auxiliary/draw/draw_llvm.c >> b/src/gallium/auxiliary/draw/draw_llvm.c >> index 3a1b057..fbbe08b 100644 >> --- a/src/gallium/auxiliary/draw/draw_llvm.c >> +++ b/src/gallium/auxiliary/draw/draw_llvm.c >> @@ -1831,7 +1831,7 @@ draw_llvm_make_variant_key(struct draw_llvm *llvm, >> char *store) >> key->clip_xy = llvm->draw->clip_xy; >> key->clip_z = llvm->draw->clip_z; >> key->clip_user = llvm->draw->clip_user; >> - key->bypass_viewport = 
llvm->draw->identity_viewport; >> + key->bypass_viewport = llvm->draw->bypass_viewport; >> key->clip_halfz = llvm->draw->rasterizer->clip_halfz; >> key->need_edgeflags = (llvm->draw->vs.edgeflag_output ? TRUE : FALSE); >> key->ucp_enable = llvm->draw->rasterizer->clip_plane_enable; >> diff --git a/src/gallium/auxiliary/draw/draw_private.h >> b/src/gallium/auxiliary/draw/draw_private.h >> index d8dc2ab..8d4e1cd 100644 >> --- a/src/gallium/auxiliary/draw/draw_private.h >> +++ b/src/g
Re: [Mesa-dev] [PATCH 1/3] ir_to_mesa: Remove sat to clamp lowering pass
Thanks Abdiel. This series looks good to me.

Reviewed-by: Matt Turner
Re: [Mesa-dev] [PATCH V2] mesa: use build flag to ensure stack is realigned on x86
I'm not particularly knowledgeable about autoconf/make, but it looks correct AFAICT. Thanks for doing this.

I think it would be useful to also add a comment somewhere on why -mstackrealign is necessary for 32-bits, for future reference.

Reviewed-by: Jose Fonseca

Jose

On 07/12/14 12:13, Timothy Arceri wrote:
Nowadays GCC assumes stack pointer is 16-byte aligned even on 32-bits, but that is an assumption OpenGL drivers (or any dynamic library for that matter) can't afford to make as there are many closed- and open- source application binaries out there that only assume 4-byte stack alignment.

V2: use $target_cpu rather than $host_cpu and setup build flags in config rather than makefile

https://bugs.freedesktop.org/show_bug.cgi?id=86788

Signed-off-by: Timothy Arceri
---
Tested by cross compiling and running 32-bit version of UrbanTerror.

Please note if this patch is ok it should also be applied to 10.4 with the last hunk removed.
configure.ac | 11 ++- src/mesa/Makefile.am | 2 +- src/mesa/main/sse_minmax.c | 3 --- 3 files changed, 11 insertions(+), 5 deletions(-) diff --git a/configure.ac b/configure.ac index b0df1bb..7dc435a 100644 --- a/configure.ac +++ b/configure.ac @@ -253,8 +253,9 @@ AC_SUBST([VISIBILITY_CXXFLAGS]) dnl dnl Optional flags, check for compiler support dnl +SSE41_CFLAGS="-msse4.1" save_CFLAGS="$CFLAGS" -CFLAGS="-msse4.1 $CFLAGS" +CFLAGS="$SSE41_CFLAGS $CFLAGS" AC_COMPILE_IFELSE([AC_LANG_SOURCE([[ #include int main () { @@ -474,6 +475,12 @@ fi dnl dnl Arch/platform-specific settings dnl +case "$target_cpu" in +i?86) +SSE41_CFLAGS="$SSE41_CFLAGS -mstackrealign" +;; +esac + AC_ARG_ENABLE([asm], [AS_HELP_STRING([--disable-asm], [disable assembly usage @<:@default=enabled on supported plaforms@:>@])], @@ -2091,6 +2098,8 @@ AM_CONDITIONAL(HAVE_X86_ASM, test "x$asm_arch" = xx86 -o "x$asm_arch" = xx86_64) AM_CONDITIONAL(HAVE_X86_64_ASM, test "x$asm_arch" = xx86_64) AM_CONDITIONAL(HAVE_SPARC_ASM, test "x$asm_arch" = xsparc) +AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS) + AC_SUBST([NINE_MAJOR], 1) AC_SUBST([NINE_MINOR], 0) AC_SUBST([NINE_TINY], 0) diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am index 932db4f..3b68573 100644 --- a/src/mesa/Makefile.am +++ b/src/mesa/Makefile.am @@ -153,7 +153,7 @@ libmesagallium_la_LIBADD = \ libmesa_sse41_la_SOURCES = \ main/streaming-load-memcpy.c \ main/sse_minmax.c -libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) -msse4.1 +libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) $(SSE41_CFLAGS) pkgconfigdir = $(libdir)/pkgconfig pkgconfig_DATA = gl.pc diff --git a/src/mesa/main/sse_minmax.c b/src/mesa/main/sse_minmax.c index 93cf2a6..222ac14 100644 --- a/src/mesa/main/sse_minmax.c +++ b/src/mesa/main/sse_minmax.c @@ -31,9 +31,6 @@ #include void -#if !defined(__x86_64__) - __attribute__((force_align_arg_pointer)) -#endif _mesa_uint_array_min_max(const unsigned *ui_indices, unsigned *min_index, unsigned *max_index, const unsigned count) { ___ mesa-dev mailing list 
mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
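For context, the per-function alternative that this patch removes from sse_minmax.c can be sketched as below. This is an illustration only: the `REALIGN_STACK` macro and the function name are hypothetical, and the attribute is applied only when building for 32-bit x86 with a GCC-compatible compiler. On other targets the macro expands to nothing, on the assumption that the platform ABI already provides the needed alignment.

```c
#include <stdint.h>

/* Ask the compiler to realign the stack on entry, but only where the
 * caller may have left it merely 4-byte aligned (32-bit x86). */
#if defined(__GNUC__) && defined(__i386__)
#define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define REALIGN_STACK
#endif

/* Returns 1 if a local that requests 16-byte alignment really got it.
 * SSE code that spills 128-bit values relies on this property. */
REALIGN_STACK int
local_is_16_byte_aligned(void)
{
   _Alignas(16) unsigned char buf[16] = { 0 };
   return ((uintptr_t)buf % 16u) == 0;
}
```

The build-flag approach in the patch (`-mstackrealign` in `SSE41_CFLAGS`) has the same intent, but covers every function in the SSE4.1 convenience library rather than relying on each entry point being annotated by hand.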
[Mesa-dev] [Bug 86788] (bisected) 32bit UrbanTerror 4.1 timedemo sse4.1 segfault...
https://bugs.freedesktop.org/show_bug.cgi?id=86788

--- Comment #13 from José Fonseca ---
Timothy,

Sorry for jumping to the wrong conclusion, and thanks for pursuing a more comprehensive fix. I appreciate it.

Jose
[Mesa-dev] [PATCH 1/3] ir_to_mesa: Remove sat to clamp lowering pass
Fixes an infinite loop in swrast where the lowering pass unpacks saturate into clamp but the opt_algebraic pass tries to do the opposite. v3 (Ian): This is a revert of commit cfa8c1cb "ir_to_mesa: lower ir_unop_saturate" on the ir_to_mesa.cpp portion. prog_execute.c can handle saturates in vertex shaders, so classic swrast shouldn't need this lowering pass. Cc: "10.4" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83463 Signed-off-by: Abdiel Janulgue --- src/mesa/program/ir_to_mesa.cpp | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp index 5cd9058..68e2597 100644 --- a/src/mesa/program/ir_to_mesa.cpp +++ b/src/mesa/program/ir_to_mesa.cpp @@ -2946,9 +2946,7 @@ _mesa_ir_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) GLenum target = _mesa_shader_stage_to_program(prog->_LinkedShaders[i]->Stage); lower_instructions(ir, (MOD_TO_FRACT | DIV_TO_MUL_RCP | EXP_TO_EXP2 | LOG_TO_LOG2 | INT_DIV_TO_MUL_RCP -| ((options->EmitNoPow) ? POW_TO_EXP2 : 0) -| ((target == GL_VERTEX_PROGRAM_ARB) ? SAT_TO_CLAMP -: 0))); +| ((options->EmitNoPow) ? POW_TO_EXP2 : 0))); progress = do_lower_jumps(ir, true, true, options->EmitNoMainReturn, options->EmitNoCont, options->EmitNoLoops) || progress; -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] glsl: Don't optimize min/max into saturate when EmitNoSat is set
v3: Fix multi-line comment format (Ian) Signed-off-by: Abdiel Janulgue --- src/glsl/opt_algebraic.cpp | 2 +- src/mesa/main/mtypes.h | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp index c4f883b..c6f4a9c 100644 --- a/src/glsl/opt_algebraic.cpp +++ b/src/glsl/opt_algebraic.cpp @@ -689,7 +689,7 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir) case ir_binop_min: case ir_binop_max: - if (ir->type->base_type != GLSL_TYPE_FLOAT) + if (ir->type->base_type != GLSL_TYPE_FLOAT || options->EmitNoSat) break; /* Replace min(max) operations and its commutative combinations with diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 7389baa..cee11a3 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -2990,6 +2990,7 @@ struct gl_shader_compiler_options GLboolean EmitNoMainReturn;/**< Emit CONT/RET opcodes? */ GLboolean EmitNoNoise; /**< Emit NOISE opcodes? */ GLboolean EmitNoPow; /**< Emit POW opcodes? */ + GLboolean EmitNoSat; /**< Emit SAT opcodes? */ GLboolean LowerClipDistance; /**< Lower gl_ClipDistance from float[8] to vec4[2]? */ /** -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] st/mesa: For vertex shaders, don't emit saturate when SM 3.0 is unsupported
There is a bug in the current lowering pass implementation where we lower saturate to clamp only for vertex shaders on drivers supporting SM 3.0. The correct behavior is to actually lower to clamp only when we don't support saturate which happens on drivers that don't support SM 3.0 Reviewed-by: Marek Olšák Signed-off-by: Abdiel Janulgue --- src/mesa/state_tracker/st_context.c| 2 ++ src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 5 + 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/src/mesa/state_tracker/st_context.c b/src/mesa/state_tracker/st_context.c index 1723513..9da0c77 100644 --- a/src/mesa/state_tracker/st_context.c +++ b/src/mesa/state_tracker/st_context.c @@ -271,6 +271,8 @@ st_create_context_priv( struct gl_context *ctx, struct pipe_context *pipe, */ st->ctx->Point.MaxSize = MAX2(ctx->Const.MaxPointSize, ctx->Const.MaxPointSizeAA); + /* For vertex shaders, make sure not to emit saturate when SM 3.0 is not supported */ + ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoSat = !st->has_shader_model3; _mesa_compute_version(ctx); diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index fd51595..80dd102 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -5419,9 +5419,6 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) if (!pscreen->get_param(pscreen, PIPE_CAP_TEXTURE_GATHER_OFFSETS)) lower_offset_arrays(ir); do_mat_op_to_vec(ir); - /* Emit saturates in the vertex shader only if SM 3.0 is supported. */ - bool vs_sm3 = (_mesa_shader_stage_to_program(prog->_LinkedShaders[i]->Stage) == - GL_VERTEX_PROGRAM_ARB) && st_context(ctx)->has_shader_model3; lower_instructions(ir, MOD_TO_FRACT | DIV_TO_MUL_RCP | @@ -5432,7 +5429,7 @@ st_link_shader(struct gl_context *ctx, struct gl_shader_program *prog) BORROW_TO_ARITH | (options->EmitNoPow ? POW_TO_EXP2 : 0) | (!ctx->Const.NativeIntegers ? 
INT_DIV_TO_MUL_RCP : 0) | - (vs_sm3 ? SAT_TO_CLAMP : 0)); + (options->EmitNoSat ? SAT_TO_CLAMP : 0)); lower_ubo_reference(prog->_LinkedShaders[i], ir); do_vec_index_to_cond_assign(ir); -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
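The corrected logic can be sketched in isolation. The names below (`struct compiler_options`, `vertex_shader_options`, `lowering_mask`, and the `LOWER_*` flags) are simplified stand-ins, not the real Mesa definitions. The point is that the option records what the driver cannot emit, and the lowering mask is derived from that option alone.

```c
#include <stdbool.h>

/* Simplified stand-ins for the lower_instructions() flag bits. */
enum lowering_flag {
   LOWER_POW_TO_EXP2  = 1 << 0,
   LOWER_SAT_TO_CLAMP = 1 << 1
};

/* Simplified stand-in for gl_shader_compiler_options. */
struct compiler_options {
   bool emit_no_pow;
   bool emit_no_sat;
};

/* Set once at context creation: saturate is unavailable in vertex
 * shaders exactly when Shader Model 3.0 is NOT supported. */
static struct compiler_options
vertex_shader_options(bool has_shader_model3)
{
   struct compiler_options o = { false, !has_shader_model3 };
   return o;
}

/* At link time the mask depends only on the options, so the same
 * expression works for every shader stage. */
static unsigned
lowering_mask(const struct compiler_options *o)
{
   unsigned mask = 0;
   if (o->emit_no_pow)
      mask |= LOWER_POW_TO_EXP2;
   if (o->emit_no_sat)
      mask |= LOWER_SAT_TO_CLAMP;
   return mask;
}
```

With this arrangement a driver without SM 3.0 gets `SAT_TO_CLAMP`-style lowering in the vertex stage, and a driver with SM 3.0 keeps saturate, which is the reverse of the behavior the commit message describes as the bug.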
Re: [Mesa-dev] [PATCH] i965: Fix union usage for GCC <= 4.6.
Hi,

On 12/06/2014 05:46 AM, Jonathan Gray wrote:
Along with the ABI-check scripts it seems at the very least all occurrences of "#!/bin/bash" should be changed to "#!/usr/bin/env bash" if they are actually bash specific.

Debian's devscripts package's checkbashisms checker shows which scripts need Bash:
--
$ checkbashisms bin/bugzilla_mesa.sh
could not find any possible bashisms in bash script bin/bugzilla_mesa.sh
$ checkbashisms bin/shortlog_mesa.sh
$ checkbashisms src/egl/wayland/wayland-egl/wayland-egl-symbols-check
could not find any possible bashisms in bash script src/egl/wayland/wayland-egl/wayland-egl-symbols-check
$ checkbashisms src/gallium/targets/gbm/gallium-gbm-symbols-check
could not find any possible bashisms in bash script src/gallium/targets/gbm/gallium-gbm-symbols-check
$ checkbashisms src/gallium/tools/addr2line.sh
$ checkbashisms src/gallium/tools/trace/tracediff.sh
$ checkbashisms src/gbm/gbm-symbols-check
could not find any possible bashisms in bash script src/gbm/gbm-symbols-check
--

- Eero
Re: [Mesa-dev] [PATCH v2 1/7] ir_to_mesa: Only lower saturate to clamp when EmitNoSat is set
Hi Ian, On 12/04/2014 01:01 AM, Ian Romanick wrote: > On 12/01/2014 05:47 AM, Abdiel Janulgue wrote: >> Fixes an infinite loop in swrast where the lowering pass unpacks saturate >> into clamp > > Which swrast are we talking about here? Classic swrast? softpipe? > llvmpipe? Classic swrast. Although there is also another separate issue in llvmpipe that I fixed in patch 5 within this series. > > prog_execute.c can handle saturates in vertex shaders, so classic swrast > shouldn't need this lowering pass. The only classic hardware driver > that can't do saturates in vertex shaders is r200... GLSL is not enabled > there, so it doesn't matter. > > What happens if you just revert the ir_to_mesa.cpp hunk from cfa8c1cb? Reverting the ir_to_mesa.cpp change in cfa8c1cb does fix the issue as well. I'll submit this change today together with the llvmpipe fix. But I'll drop the i915, i965, and r200 patches. > >> but the opt_algebraic pass tries to do the opposite. >> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83463 >> Signed-off-by: Abdiel Janulgue >> --- >> src/mesa/main/mtypes.h | 1 + >> src/mesa/program/ir_to_mesa.cpp | 3 +-- >> 2 files changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h >> index 7389baa..cee11a3 100644 >> --- a/src/mesa/main/mtypes.h >> +++ b/src/mesa/main/mtypes.h >> @@ -2990,6 +2990,7 @@ struct gl_shader_compiler_options >> GLboolean EmitNoMainReturn;/**< Emit CONT/RET opcodes? */ >> GLboolean EmitNoNoise; /**< Emit NOISE opcodes? */ >> GLboolean EmitNoPow; /**< Emit POW opcodes? */ >> + GLboolean EmitNoSat; /**< Emit SAT opcodes? */ >> GLboolean LowerClipDistance; /**< Lower gl_ClipDistance from float[8] to >> vec4[2]? 
*/ >> >> /** >> diff --git a/src/mesa/program/ir_to_mesa.cpp >> b/src/mesa/program/ir_to_mesa.cpp >> index 5cd9058..7e7aded 100644 >> --- a/src/mesa/program/ir_to_mesa.cpp >> +++ b/src/mesa/program/ir_to_mesa.cpp >> @@ -2947,8 +2947,7 @@ _mesa_ir_link_shader(struct gl_context *ctx, struct >> gl_shader_program *prog) >> lower_instructions(ir, (MOD_TO_FRACT | DIV_TO_MUL_RCP | EXP_TO_EXP2 >> | LOG_TO_LOG2 | INT_DIV_TO_MUL_RCP >> | ((options->EmitNoPow) ? POW_TO_EXP2 : 0) >> - | ((target == GL_VERTEX_PROGRAM_ARB) ? >> SAT_TO_CLAMP >> -: 0))); >> + | ((options->EmitNoSat) ? SAT_TO_CLAMP : 0))); >> >> progress = do_lower_jumps(ir, true, true, options->EmitNoMainReturn, >> options->EmitNoCont, options->EmitNoLoops) || progress; >> >> > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] r600/sb loop issue
On 12/06/2014 07:13 AM, Vadim Girlin wrote: On 12/04/2014 01:43 AM, Dave Airlie wrote: Hi Vadim, I've been looking with Glenn's help into a bug in sb for a couple of weeks now triggered by a change in how GLSL generates switch statements. I understand you probably aren't too interested in r600g but I believe I'm hitting a design level problem and I would like some advice. So it appears that GLSL can create loops that don't repeat for switch statements, and it appears SB wasn't ready to handle such a thing. Hi, Dave, I suspect we should rather get rid of such loops somehow, i.e. convert to something else, the loop that never repeats is not really a loop anyway. AFAICS "continue" is not supported in switch statements according to GLSL specs, so the loops generated for switch will never be repeated. Am I missing something? Even if repeating is possible somehow, at least we can get rid of the loops that are not repeated. I think loops are less efficient than other control flow instructions on r600g hw (at least because they increase stack usage), and possibly on other hw too. In fact it seems sb basically gets rid of it already in IR, it just doesn't know how to translate resulting control flow to ISA, because so far it only supports specific control flow structure for if-then-else that was previously preserved during optimizations. I think it may be not very hard to implement support for that in finalizer, I'll look into it. In fact handling that control flow in finalizer is not as easy as I hoped, probably impossible, at least if we want to make it efficient. I forgot about the limitations of R600 ISA. OTOH it seems I've managed to fix the issues with loops, the patch is attached (it's meant to be used instead of 7b0067d2). There are no piglit regressions on evergreen, but I didn't test any real apps. Vadim sb has the ->is_loop() and it just checks !repeats.empty(), so this meant in the finalizer code we'd fall into the if statement which would then assert. 
I hacked/fixed (more hacked), this in 7b0067d23a6f64cf83c42e7f11b2cd4100c569fe which attempts to detect single pass loops and handle things that way. However this lead to stack depth calculations being incorrectly done, so I moved the single loop detect into the is_loop check, (see attached patch). This fixes the rendering in some places, but lead to a regression in tests/shaders/glsl-vs-continue-in-switch-in-do-while.shader_test error at : PHI t76||FP@R3.x, t128||FP@R3.x, t115||FP@R3.x, t102||FP@R3.x, t89||FP@R3.x : expected operand value t115||FP@R3.x, gpr contains t17||FP@R3.x error at : PHI t76||FP@R3.x, t128||FP@R3.x, t115||FP@R3.x, t102||FP@R3.x, t89||FP@R3.x : expected operand value t102||FP@R3.x, gpr contains t17||FP@R3.x Now Glenn suspected this was due to the is_loop check in sb_shader.cpp:create_bbs, and changing that check to only detect repeating loops removes that issue, but introduces stack sizing issues again, resulting in lockups/random rendering. So I just want to ask had you considered single loops with an always break in sb design, I didn't see such loops with any test cases, so I didn't even think about it. and perhaps some idea where things are going so wrong with the register alloc above. Not sure, but as long as the only "repeat" node is optimized away in bc_parser because it's useless due to unconditional break, I suspect it may be not easy to make all other code think that it's still a loop. I've tried a quick fix to not optimize the repeat away for such loops, but it results in other issues, probably it will require handling this as a special case in other places, so it doesn't look like a good idea either. I'll try to implement the solution that I described above, that is, translate resulting control flow back to ISA. If it won't be too much work, it's probably the best way and it won't use loop instructions in the end. I suspect I'll keep digging into this, but its getting to the edges of the brain space/time I can find! Dave. 
>From 4967ef90847f921fc0ef7c018ae7ae8048d2a6ce Mon Sep 17 00:00:00 2001 From: Vadim Girlin Date: Mon, 8 Dec 2014 13:11:48 +0300 Subject: [PATCH] r600g/sb: fix issues with loops created for switch statements --- src/gallium/drivers/r600/sb/sb_bc_finalize.cpp | 2 ++ src/gallium/drivers/r600/sb/sb_bc_parser.cpp | 2 ++ src/gallium/drivers/r600/sb/sb_if_conversion.cpp | 4 ++-- src/gallium/drivers/r600/sb/sb_ir.h | 9 +++-- src/gallium/drivers/r600/sb/sb_sched.cpp | 2 +- 5 files changed, 14 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp b/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp index f0849ca..3f362c4 100644 --- a/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp +++ b/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp @@ -110,6 +110,8 @@ int bc_finalizer::run() { void bc_finalizer::finalize_loop(region_node* r) { + update_n
Re: [Mesa-dev] Finishing make distcheck
On Sun, Dec 7, 2014 at 6:57 PM, Matt Turner wrote:
> I've seen some (sporadic?) failures of the glcpp/tests/glcpp-test. I
> think it's because it's trying to write out files into the
> distribution directory, which isn't allowed. I'll try to track that
> down.

That's now fixed. glcpp-test was writing its .out files into the srcdir instead of the builddir.