[PATCH 2/9] rs6000: New type attribute value halfmul
This is for the legacy integer multiply-accumulate instructions. Quite a mouthful, and mulhw is also a terrible name since we already have a machine instruction called exactly that. Hence halfmul. Also fixes the titan automaton description for this. 2014-05-22 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/rs6000.md (type): Add new value halfmul. (*macchwc, *macchw, *macchwuc, *macchwu, *machhwc, *machhw, *machhwuc, *machhwu, *maclhwc, *maclhw, *maclhwuc, *maclhwu, *nmacchwc, *nmacchw, *nmachhwc, *nmachhw, *nmaclhwc, *nmaclhw, *mulchwc, *mulchw, *mulchwuc, *mulchwu, *mulhhwc, *mulhhw, *mulhhwuc, *mulhhwu, *mullhwc, *mullhw, *mullhwuc, *mullhwu): Use it. * config/rs6000/40x.md (ppc405-imul3): Add type halfmul. * config/rs6000/440.md (ppc440-imul2): Add type halfmul. * config/rs6000/476.md (ppc476-imul): Add type halfmul. * config/rs6000/titan.md: Delete nonsensical comment. (titan_imul): Add type imul3. (titan_mulhw): Remove type imul3; add type halfmul. --- gcc/config/rs6000/40x.md| 2 +- gcc/config/rs6000/440.md| 2 +- gcc/config/rs6000/476.md| 2 +- gcc/config/rs6000/rs6000.md | 62 ++--- gcc/config/rs6000/titan.md | 8 ++ 5 files changed, 36 insertions(+), 40 deletions(-) diff --git a/gcc/config/rs6000/40x.md b/gcc/config/rs6000/40x.md index ed236a4..5510767 100644 --- a/gcc/config/rs6000/40x.md +++ b/gcc/config/rs6000/40x.md @@ -73,7 +73,7 @@ (define_insn_reservation ppc405-imul2 3 iu_40x*2) (define_insn_reservation ppc405-imul3 2 - (and (eq_attr type imul3) + (and (eq_attr type imul3,halfmul) (eq_attr cpu ppc405)) iu_40x) diff --git a/gcc/config/rs6000/440.md b/gcc/config/rs6000/440.md index 2dcc58d..df3a3b5 100644 --- a/gcc/config/rs6000/440.md +++ b/gcc/config/rs6000/440.md @@ -76,7 +76,7 @@ (define_insn_reservation ppc440-imul 3 ppc440_issue,ppc440_i_pipe) (define_insn_reservation ppc440-imul2 2 - (and (eq_attr type imul2,imul3) + (and (eq_attr type imul2,imul3,halfmul) (eq_attr cpu ppc440)) ppc440_issue,ppc440_i_pipe) diff --git a/gcc/config/rs6000/476.md b/gcc/config/rs6000/476.md index 8b4e65f..acfe063 100644 --- a/gcc/config/rs6000/476.md +++ b/gcc/config/rs6000/476.md @@ -82,7 +82,7 @@ (define_insn_reservation ppc476-compare 4 ppc476_i_pipe) (define_insn_reservation ppc476-imul 4 - (and (eq_attr type imul,imul_compare,imul2,imul3) + (and (eq_attr type imul,imul_compare,imul2,imul3,halfmul) (eq_attr cpu ppc476)) ppc476_issue,\ ppc476_i_pipe) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 667aac1..3e9686e 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -160,7 +160,7 @@ (define_c_enum unspecv (define_attr type integer,two,three, shift,var_shift_rotate,insert_word,insert_dword, - imul,imul2,imul3,lmul,idiv,ldiv, + imul,imul2,imul3,lmul,halfmul,idiv,ldiv, exts,cntlz,popcnt,isel, load,store,fpload,fpstore,vecload,vecstore, cmp, @@ -1248,7 +1248,7 @@ (define_insn *macchwc (match_dup 4)))] TARGET_MULHW macchw. %0,%1,%2 - [(set_attr type imul3)]) + [(set_attr type halfmul)]) (define_insn *macchw [(set (match_operand:SI 0 gpc_reg_operand =r) @@ -1260,7 +1260,7 @@ (define_insn *macchw (match_operand:SI 3 gpc_reg_operand 0)))] TARGET_MULHW macchw %0,%1,%2 - [(set_attr type imul3)]) + [(set_attr type halfmul)]) (define_insn *macchwuc [(set (match_operand:CC 3 cc_reg_operand =x) @@ -1280,7 +1280,7 @@ (define_insn *macchwuc (match_dup 4)))] TARGET_MULHW macchwu. %0,%1,%2 - [(set_attr type imul3)]) + [(set_attr type halfmul)]) (define_insn *macchwu [(set (match_operand:SI 0 gpc_reg_operand =r) @@ -1292,7 +1292,7 @@ (define_insn *macchwu (match_operand:SI 3 gpc_reg_operand 0)))] TARGET_MULHW macchwu %0,%1,%2 - [(set_attr type imul3)]) + [(set_attr type halfmul)]) (define_insn *machhwc [(set (match_operand:CC 3 cc_reg_operand =x) @@ -1314,7 +1314,7 @@ (define_insn *machhwc (match_dup 4)))] TARGET_MULHW machhw. %0,%1,%2 - [(set_attr type imul3)]) + [(set_attr type halfmul)]) (define_insn *machhw [(set (match_operand:SI 0 gpc_reg_operand =r) @@ -1327,7 +1327,7 @@ (define_insn *machhw (match_operand:SI 3 gpc_reg_operand 0)))] TARGET_MULHW machhw %0,%1,%2 - [(set_attr type imul3)]) + [(set_attr type halfmul)]) (define_insn *machhwuc [(set (match_operand:CC 3 cc_reg_operand =x) @@ -1349,7 +1349,7 @@ (define_insn *machhwuc (match_dup 4)))] TARGET_MULHW machhwu. %0,%1,%2 - [(set_attr type imul3)]) + [(set_attr type halfmul)]) (define_insn *machhwu [(set (match_operand:SI 0 gpc_reg_operand =r) @@ -1362,7 +1362,7 @@ (define_insn *machhwu
[PATCH 1/9] rs6000: Clean up the type attribute
Get rid of the one huge line. Group and order things a bit. Further changes will follow so this doesn't try to make it perfect. The rest of this patch series reduces the number of different integer instruction types by folding many together using attributes size (the data size), dot (does this instruction set CR0), and var_shift (for shift instructions: is the shift amount from a register). Many scheduling descriptions are incomplete; many instruction patterns use the wrong instruction type. Hopefully things will be better if there aren't that many different types to handle. Each patch bootstrapped on powerpc64-linux, tested with -m64,-m64/-mtune=power8,-m32,-m32/-mpowerpc64; no regressions (and nothing magically fixed either). Okay to apply? Segher 2014-05-22 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/rs6000.md (type): Reorder, reformat. --- gcc/config/rs6000/rs6000.md | 17 - 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 300bd36..667aac1 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -157,7 +157,22 @@ (define_c_enum unspecv ;; Define an insn type attribute. This is used in function unit delay ;; computations. -(define_attr type integer,two,three,load,store,fpload,fpstore,vecload,vecstore,imul,imul2,imul3,lmul,idiv,ldiv,insert_word,branch,cmp,fast_compare,compare,var_delayed_compare,delayed_compare,imul_compare,lmul_compare,fpcompare,cr_logical,delayed_cr,mfcr,mfcrf,mtcr,mfjmpr,mtjmpr,fp,fpsimple,dmul,sdiv,ddiv,ssqrt,dsqrt,jmpreg,brinc,vecsimple,veccomplex,vecdiv,veccmp,veccmpsimple,vecperm,vecfloat,vecfdiv,vecdouble,isync,sync,load_l,store_c,shift,trap,insert_dword,var_shift_rotate,cntlz,exts,mffgpr,mftgpr,isel,popcnt,crypto,htm +(define_attr type + integer,two,three, + shift,var_shift_rotate,insert_word,insert_dword, + imul,imul2,imul3,lmul,idiv,ldiv, + exts,cntlz,popcnt,isel, + load,store,fpload,fpstore,vecload,vecstore, + cmp, + branch,jmpreg,mfjmpr,mtjmpr,trap,isync,sync,load_l,store_c, + compare,fast_compare,delayed_compare,var_delayed_compare, + imul_compare,lmul_compare, + cr_logical,delayed_cr,mfcr,mfcrf,mtcr, + fpcompare,fp,fpsimple,dmul,sdiv,ddiv,ssqrt,dsqrt, + brinc, + vecsimple,veccomplex,vecdiv,veccmp,veccmpsimple,vecperm, + vecfloat,vecfdiv,vecdouble,mffgpr,mftgpr,crypto, + htm (const_string integer)) ;; Does this instruction sign-extend its result? -- 1.8.1.4
[PATCH 3/9] rs6000: Make all multiply instructions one type
This uses the attributes size and dot to specify the differences: imul3 - mul size=8 imul2 - mul size=16 imul - mul size=32 lmul - mul size=64 imul_compare - mul size=32 dot=yes lmul_compare - mul size=64 dot=yes 2014-05-22 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/rs6000.md (type): Add mul. Delete imul, imul2, imul3, lmul, imul_compare, lmul_compare. (size): New attribute. (dot): New attribute. (cell_micro): Adjust. (mulsi3, *mulsi3_internal1, *mulsi3_internal2, mulsidi3, umulsidi3, smulsi3_highpart, umulsi3_highpart, muldi3, *muldi3_internal1, *muldi3_internal2, smuldi3_highpart, umuldi3_highpart): Adjust. * config/rs6000/rs6000.c (rs6000_adjust_cost, is_cracked_insn, rs6000_adjust_priority, is_nonpipeline_insn, insn_must_be_first_in_group, insn_must_be_last_in_group): Adjust. * config/rs6000/40x.md (ppc403-imul, ppc405-imul, ppc405-imul2, ppc405-imul3): Adjust. * config/rs6000/440.md (ppc440-imul, ppc440-imul2): Adjust. * config/rs6000/476.md (ppc476-imul): Adjust. * config/rs6000/601.md (ppc601-imul): Adjust. * config/rs6000/603.md (ppc603-imul, ppc603-imul2): Adjust. * config/rs6000/6xx.md (ppc604-imul, ppc604e-imul, ppc620-imul, ppc620-imul2, ppc620-imul3, ppc620-lmul): Adjust. * config/rs6000/7450.md (ppc7450-imul, ppc7450-imul2): Adjust. * config/rs6000/7xx.md (ppc750-imul, ppc750-imul2, ppc750-imul3): Adjust. * config/rs6000/8540.md (ppc8540_multiply): Adjust. * config/rs6000/a2.md (ppca2-imul, ppca2-lmul): Adjust. * config/rs6000/cell.md (cell-lmul, cell-lmul-cmp, cell-imul23, cell-imul): Adjust. * config/rs6000/e300c2c3.md (ppce300c3_multiply): Adjust. * config/rs6000/e500mc.md (e500mc_multiply): Adjust. * config/rs6000/e500mc64.md (e500mc64_multiply): Adjust. * config/rs6000/e5500.md (e5500_multiply, e5500_multiply_i): Adjust. * config/rs6000/e6500.md (e6500_multiply, e6500_multiply_i): Adjust. * config/rs6000/mpc.md (mpccore-imul): Adjust. * config/rs6000/power4.md (power4-lmul-cmp, power4-imul-cmp, power4-lmul, power4-imul, power4-imul3): Adjust. * config/rs6000/power5.md (power5-lmul-cmp, power5-imul-cmp, power5-lmul, power5-imul, power5-imul3): Adjust. * config/rs6000/power6.md (power6-lmul-cmp, power6-imul-cmp, power6-lmul, power6-imul, power6-imul3): Adjust. * config/rs6000/power7.md (power7-mul, power7-mul-compare): Adjust. * config/rs6000/power8.md (power8-mul, power8-mul-compare): Adjust. * config/rs6000/rs64.md (rs64a-imul, rs64a-imul2, rs64a-imul3, rs64a-lmul): Adjust. * config/rs6000/titan.md (titan_imul): Adjust. --- gcc/config/rs6000/40x.md | 12 ++--- gcc/config/rs6000/440.md | 7 +++-- gcc/config/rs6000/476.md | 2 +- gcc/config/rs6000/601.md | 2 +- gcc/config/rs6000/603.md | 6 +++-- gcc/config/rs6000/6xx.md | 16 +++- gcc/config/rs6000/7450.md | 6 +++-- gcc/config/rs6000/7xx.md | 9 --- gcc/config/rs6000/8540.md | 2 +- gcc/config/rs6000/a2.md | 6 +++-- gcc/config/rs6000/cell.md | 15 --- gcc/config/rs6000/e300c2c3.md | 2 +- gcc/config/rs6000/e500mc.md | 2 +- gcc/config/rs6000/e500mc64.md | 2 +- gcc/config/rs6000/e5500.md| 8 -- gcc/config/rs6000/e6500.md| 8 -- gcc/config/rs6000/mpc.md | 2 +- gcc/config/rs6000/power4.md | 19 ++ gcc/config/rs6000/power5.md | 19 ++ gcc/config/rs6000/power6.md | 19 ++ gcc/config/rs6000/power7.md | 6 +++-- gcc/config/rs6000/power8.md | 6 +++-- gcc/config/rs6000/rs6000.c| 52 +--- gcc/config/rs6000/rs6000.md | 61 --- gcc/config/rs6000/rs64.md | 12 ++--- gcc/config/rs6000/titan.md| 2 +- 26 files changed, 188 insertions(+), 115 deletions(-) diff --git a/gcc/config/rs6000/40x.md b/gcc/config/rs6000/40x.md index 5510767..7ec2801 100644 --- a/gcc/config/rs6000/40x.md +++ b/gcc/config/rs6000/40x.md @@ -58,22 +58,26 @@ (define_insn_reservation ppc403-compare 3 iu_40x,nothing,bpu_40x) (define_insn_reservation ppc403-imul 4 - (and (eq_attr type imul,imul2,imul3,imul_compare) + (and (eq_attr type mul) (eq_attr cpu ppc403)) iu_40x*4) (define_insn_reservation ppc405-imul 5 - (and (eq_attr type imul,imul_compare) + (and (eq_attr type mul) + (eq_attr size 32) (eq_attr cpu ppc405)) iu_40x*4) (define_insn_reservation ppc405-imul2 3 - (and (eq_attr type imul2) + (and (eq_attr type mul) + (eq_attr size 16) (eq_attr cpu ppc405)) iu_40x*2) (define_insn_reservation ppc405-imul3 2 - (and (eq_attr type
[PATCH 4/9] rs6000: Make all insert instructions one type
This uses the attribute size to specify the differences: insert_word - insert size=32 insert_dword - insert size=64 It could use dot as well, but the current code doesn't handle that. 2014-05-22 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/rs6000.md (type): Delete insert_word, insert_dword. Add insert. (size): Update comment. * config/rs6000/rs6000.c (rs6000_adjust_cost, is_cracked_insn, insn_must_be_first_in_group): Adjust. (insvsi_internal, *insvsi_internal1, *insvsi_internal2, *insvsi_internal3, *insvsi_internal4, *insvsi_internal5, *insvsi_internal6, insvdi_internal): Adjust. * config/rs6000/40x.md (ppc403-integer): Adjust. * config/rs6000/440.md (ppc440-integer): Adjust. * config/rs6000/476.md (ppc476-simple-integer): Adjust. * config/rs6000/601.md (ppc601-integer): Adjust. * config/rs6000/603.md (ppc603-integer): Adjust. * config/rs6000/6xx.md (ppc604-integer): Adjust. * config/rs6000/7450.md (ppc7450-integer): Adjust. * config/rs6000/7xx.md (ppc750-integer): Adjust. * config/rs6000/8540.md (ppc8540_su): Adjust. * config/rs6000/cell.md (cell-integer, cell-insert): Adjust. * config/rs6000/e300c2c3.md (ppce300c3_iu): Adjust. * config/rs6000/e500mc.md (e500mc_su): Adjust. * config/rs6000/e500mc64.md (e500mc64_su): Adjust. * config/rs6000/e5500.md (e5500_sfx): Adjust. * config/rs6000/e6500.md (e6500_sfx): Adjust. * config/rs6000/mpc.md (mpccore-integer): Adjust. * config/rs6000/power4.md (power4-integer, power4-insert): Adjust. * config/rs6000/power5.md (power5-integer, power5-insert): Adjust. * config/rs6000/power6.md (power6-insert, power6-insert-dword): Adjust. * config/rs6000/power7.md (power7-integer): Adjust. * config/rs6000/power8.md (power8-1cyc): Adjust. * config/rs6000/rs64.md (rs64a-integer): Adjust. * config/rs6000/titan.md (titan_fxu_shift_and_rotate): Adjust. --- gcc/config/rs6000/40x.md | 2 +- gcc/config/rs6000/440.md | 2 +- gcc/config/rs6000/476.md | 2 +- gcc/config/rs6000/601.md | 2 +- gcc/config/rs6000/603.md | 2 +- gcc/config/rs6000/6xx.md | 2 +- gcc/config/rs6000/7450.md | 2 +- gcc/config/rs6000/7xx.md | 2 +- gcc/config/rs6000/8540.md | 2 +- gcc/config/rs6000/cell.md | 9 ++--- gcc/config/rs6000/e300c2c3.md | 2 +- gcc/config/rs6000/e500mc.md | 2 +- gcc/config/rs6000/e500mc64.md | 2 +- gcc/config/rs6000/e5500.md| 2 +- gcc/config/rs6000/e6500.md| 2 +- gcc/config/rs6000/mpc.md | 2 +- gcc/config/rs6000/power4.md | 9 ++--- gcc/config/rs6000/power5.md | 9 ++--- gcc/config/rs6000/power6.md | 6 -- gcc/config/rs6000/power7.md | 2 +- gcc/config/rs6000/power8.md | 2 +- gcc/config/rs6000/rs6000.c| 12 +--- gcc/config/rs6000/rs6000.md | 21 +++-- gcc/config/rs6000/rs64.md | 2 +- gcc/config/rs6000/titan.md| 2 +- 25 files changed, 57 insertions(+), 47 deletions(-) diff --git a/gcc/config/rs6000/40x.md b/gcc/config/rs6000/40x.md index 7ec2801..02971cb 100644 --- a/gcc/config/rs6000/40x.md +++ b/gcc/config/rs6000/40x.md @@ -36,7 +36,7 @@ (define_insn_reservation ppc403-store 2 iu_40x) (define_insn_reservation ppc403-integer 1 - (and (eq_attr type integer,insert_word,insert_dword,shift,trap,\ + (and (eq_attr type integer,insert,shift,trap,\ var_shift_rotate,cntlz,exts,isel) (eq_attr cpu ppc403,ppc405)) iu_40x) diff --git a/gcc/config/rs6000/440.md b/gcc/config/rs6000/440.md index 55d1155..292177d 100644 --- a/gcc/config/rs6000/440.md +++ b/gcc/config/rs6000/440.md @@ -53,7 +53,7 @@ (define_insn_reservation ppc440-fpstore 3 ppc440_issue,ppc440_l_pipe) (define_insn_reservation ppc440-integer 1 - (and (eq_attr type integer,insert_word,insert_dword,shift,\ + (and (eq_attr type integer,insert,shift,\ trap,var_shift_rotate,cntlz,exts,isel) (eq_attr cpu ppc440)) ppc440_issue,ppc440_i_pipe|ppc440_j_pipe) diff --git a/gcc/config/rs6000/476.md b/gcc/config/rs6000/476.md index 7b00632..403752a 100644 --- a/gcc/config/rs6000/476.md +++ b/gcc/config/rs6000/476.md @@ -63,7 +63,7 @@ (define_insn_reservation ppc476-fpstore 4 ppc476_lj_pipe) (define_insn_reservation ppc476-simple-integer 1 - (and (eq_attr type integer,insert_word,var_shift_rotate,exts,shift) + (and (eq_attr type integer,insert,var_shift_rotate,exts,shift) (eq_attr cpu ppc476)) ppc476_issue,\ ppc476_i_pipe|ppc476_lj_pipe) diff --git a/gcc/config/rs6000/601.md b/gcc/config/rs6000/601.md index c1a0043..d0afcf7 100644 --- a/gcc/config/rs6000/601.md +++ b/gcc/config/rs6000/601.md @@ -45,7 +45,7 @@ (define_insn_reservation ppc601-fpstore 3 iu_ppc601+fpu_ppc601)
Re: libsanitizer merge from upstream r208536
On Fri, May 23, 2014 at 09:30:12AM +0400, Yury Gribov wrote: much better would be just dlsym a couple of interesting symbols to verify that libasan.so.1 is ahead of libc.so.6, libstdc++.so.6, libpthread.so.0 (whatever else you want to override). One problem with dlsym() that I've seen is that it causes a call to malloc on failure (to allocate buffer for dlerror()) which forces allocator initialization and breaks Asan's delayed initialization feature. Also on Linux dlsym() tends to return address of executable's PLT entry which is useless. You don't need to use dlsym actually, just comparing if (malloc != __hidden_malloc_alias) would do it. You're right that will return PLT slots in the executable though. Otherwise libasan apps will simply stop working altogether if LD_PRELOAD is set, to whatever library, even if it doesn't define any symbols you care about. Right but I'm not sure whether failing fast here is necessarily bad. I think it is very bad. In fact, if you really want such a check, I'd say it shouldn't be at least enabled by default, unless some env var requests it; and document that if you are having troubles with asan sanitized programs, try this magic env var to get better troubleshooting. Even before this exaggerated check asan imposes far more restrictions than good, and this just makes asan less usable just for fear that it wouldn't work right. Most preloaded libs will just provide symbols asan never cares about, or even if say they override malloc, it could be just some malloc wrappers that add some bookkeeping and call the original malloc through dlsym RTLD_NEXT, or even if you say override malloc completely without calling the original implementation, the world doesn't end, the shadow mem of those allocations just won't be surrounded by protected paddings, so what, you don't detect out of bounds for malloc, but can still detect out of bounds in your program's stack etc. Ditto for string ops etc. What really matters is that to avoid crashes, libasan unfortunately has to be constructed very early, but this check doesn't help with that, furthermore this code is run during the libasan construction and thus if it is not early, the library has already crashed by then. Imagine preloaded library has an initializer which calls intercepted APIs. Asan didn't get a chance to initialize at the point of call and if interceptor doesn't contain a sanity call to asan_init, we are risking hard-to-debug runtime error (call to NULL, etc.). I've seen numerous bugs like this (both locally and on mailing lists) and they were main motivation to add this check. That is nonsense. Early in the symbol search scope is the opposite of being initialized early, on the contrary, such libraries are initialized last. That is the reason why LD_PRELOAD=libasan.so.1 still doesn't help with the __asan_init_v3 being performed early, you need either .preinit_array in the executable, or the ctor called by some library late in the symbol search scope (== early constructed). Typically people in LD_PRELOAD override malloc (which we want to diagnose), or far more rarely stringops (e.g. memstomp, also undesirable). I wonder whether overriding Asan's malloc, etc. is expected to work at all? Perhaps banning it altogether is just the safest thing to do? Don't know why you want to ban everything. Jakub
[PATCH 5/9] rs6000: Make all divide instructions one type
This uses the attribute size to specify the differences: idiv - div size=32 ldiv - div size=64 It could use dot as well, but the current code doesn't handle that. 2014-05-22 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/rs6000.md (type): Delete idiv, ldiv. Add div. (bits): New mode_attr. (idiv_ldiv): Delete mode_attr. (udivmode3, *divmode3, divdiv_extend_mode): Adjust. * config/rs6000/rs6000.c (rs6000_adjust_cost, is_cracked_insn, rs6000_adjust_priority, is_nonpipeline_insn, insn_must_be_first_in_group, insn_must_be_last_in_group): Adjust. * config/rs6000/40x.md (ppc403-idiv): Adjust. * config/rs6000/440.md (ppc440-idiv): Adjust. * config/rs6000/476.md (ppc476-idiv): Adjust. * config/rs6000/601.md (ppc601-idiv): Adjust. * config/rs6000/603.md (ppc603-idiv): Adjust. * config/rs6000/6xx.md (ppc604-idiv, ppc620-idiv, ppc630-idiv, ppc620-ldiv): Adjust. * config/rs6000/7450.md (ppc7450-idiv): Adjust. * config/rs6000/7xx.md (ppc750-idiv): Adjust. * config/rs6000/8540.md (ppc8540_divide): Adjust. * config/rs6000/a2.md (ppca2-idiv, ppca2-ldiv): Adjust. * config/rs6000/cell.md (cell-idiv, cell-ldiv): Adjust. * config/rs6000/e300c2c3.md (ppce300c3_divide): Adjust. * config/rs6000/e500mc.md (e500mc_divide): Adjust. * config/rs6000/e500mc64.md (e500mc64_divide): Adjust. * config/rs6000/e5500.md (e5500_divide, e5500_divide_d): Adjust. * config/rs6000/e6500.md (e6500_divide, e6500_divide_d): Adjust. * config/rs6000/mpc.md (mpccore-idiv): Adjust. * config/rs6000/power4.md (power4-idiv, power4-ldiv): Adjust. * config/rs6000/power5.md (power5-idiv, power5-ldiv): Adjust. * config/rs6000/power6.md (power6-idiv, power6-ldiv): Adjust. * config/rs6000/power7.md (power7-idiv, power7-ldiv): Adjust. * config/rs6000/power8.md (power8-idiv, power8-ldiv): Adjust. * config/rs6000/rs64.md (rs64a-idiv, rs64a-ldiv): Adjust. * config/rs6000/titan.md (titan_fxu_div): Adjust. --- gcc/config/rs6000/40x.md | 2 +- gcc/config/rs6000/440.md | 2 +- gcc/config/rs6000/476.md | 2 +- gcc/config/rs6000/601.md | 2 +- gcc/config/rs6000/603.md | 2 +- gcc/config/rs6000/6xx.md | 11 +++ gcc/config/rs6000/7450.md | 2 +- gcc/config/rs6000/7xx.md | 2 +- gcc/config/rs6000/8540.md | 2 +- gcc/config/rs6000/a2.md | 6 -- gcc/config/rs6000/cell.md | 6 -- gcc/config/rs6000/e300c2c3.md | 2 +- gcc/config/rs6000/e500mc.md | 2 +- gcc/config/rs6000/e500mc64.md | 2 +- gcc/config/rs6000/e5500.md| 6 -- gcc/config/rs6000/e6500.md| 6 -- gcc/config/rs6000/mpc.md | 2 +- gcc/config/rs6000/power4.md | 6 -- gcc/config/rs6000/power5.md | 6 -- gcc/config/rs6000/power6.md | 6 -- gcc/config/rs6000/power7.md | 6 -- gcc/config/rs6000/power8.md | 6 -- gcc/config/rs6000/rs6000.c| 45 ++- gcc/config/rs6000/rs6000.md | 19 +- gcc/config/rs6000/rs64.md | 6 -- gcc/config/rs6000/titan.md| 2 +- 26 files changed, 89 insertions(+), 72 deletions(-) diff --git a/gcc/config/rs6000/40x.md b/gcc/config/rs6000/40x.md index 02971cb..8ddccba 100644 --- a/gcc/config/rs6000/40x.md +++ b/gcc/config/rs6000/40x.md @@ -82,7 +82,7 @@ (define_insn_reservation ppc405-imul3 2 iu_40x) (define_insn_reservation ppc403-idiv 33 - (and (eq_attr type idiv) + (and (eq_attr type div) (eq_attr cpu ppc403,ppc405)) iu_40x*33) diff --git a/gcc/config/rs6000/440.md b/gcc/config/rs6000/440.md index 292177d..e6c28a7 100644 --- a/gcc/config/rs6000/440.md +++ b/gcc/config/rs6000/440.md @@ -84,7 +84,7 @@ (define_insn_reservation ppc440-imul2 2 ppc440_issue,ppc440_i_pipe) (define_insn_reservation ppc440-idiv 34 - (and (eq_attr type idiv) + (and (eq_attr type div) (eq_attr cpu ppc440)) ppc440_issue,ppc440_i_pipe*33) diff --git a/gcc/config/rs6000/476.md b/gcc/config/rs6000/476.md index 403752a..5acd668 100644 --- a/gcc/config/rs6000/476.md +++ b/gcc/config/rs6000/476.md @@ -88,7 +88,7 @@ (define_insn_reservation ppc476-imul 4 ppc476_i_pipe) (define_insn_reservation ppc476-idiv 11 - (and (eq_attr type idiv) + (and (eq_attr type div) (eq_attr cpu ppc476)) ppc476_issue,\ ppc476_i_pipe*11) diff --git a/gcc/config/rs6000/601.md b/gcc/config/rs6000/601.md index d0afcf7..85892c8 100644 --- a/gcc/config/rs6000/601.md +++ b/gcc/config/rs6000/601.md @@ -66,7 +66,7 @@ (define_insn_reservation ppc601-imul 5 iu_ppc601*5) (define_insn_reservation ppc601-idiv 36 - (and (eq_attr type idiv) + (and (eq_attr type div) (eq_attr cpu ppc601)) iu_ppc601*36) diff --git a/gcc/config/rs6000/603.md b/gcc/config/rs6000/603.md index
[PATCH 9/9] rs6000: Make all rlw*nm and rld*c* type shift
They are often labeled just integer currently. Fix that. Also handle shift properly in those scheduling descriptions that neglected it. 2014-05-22 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/440.md (ppc440-integer): Include shift without dot. (ppc440-compare): Include shift with dot. * config/rs6000/e300c2c3.md (ppce300c3_iu): Include shift without dot. * config/rs6000/e5500.md (e5500_sfx2): Include constant shift without dot. * config/rs6000/e6500.md (e6500_sfx): Exclude constant shift without dot. (e6500_sfx2): Include it. * config/rs6000/rs6000.md ( *zero_extendmodedi2_internal1, *zero_extendmodedi2_internal2, *zero_extendmodedi2_internal3, *zero_extendsidi2_lfiwzx, andsi3_mc, andsi3_nomc, andsi3_internal0_nomc, extzvsi_internal, extzvdi_internal, *extzvdi_internal1, *extzvdi_internal2, rotlsi3, *rotlsi3_64, *rotlsi3_internal4, *rotlsi3_internal7le, *rotlsi3_internal7be, *rotlsi3_internal10le, *rotlsi3_internal10be, rlwinm, *lshiftrt_internal1le, *lshiftrt_internal1be, *lshiftrt_internal4le, *lshiftrt_internal4be, rotldi3, *rotldi3_internal4, *rotldi3_internal7le, *rotldi3_internal7be, *rotldi3_internal10le, *rotldi3_internal10be, *rotldi3_internal13le, *rotldi3_internal13be, *ashldi3_internal4, ashldi3_internal5, *ashldi3_internal6, *ashldi3_internal7, ashldi3_internal8, *ashldi3_internal9, anddi3_mc, anddi3_nomc, *anddi3_internal2_mc, *anddi3_internal3_mc, and 4 anonymous define_insns): Use type shift in the appropriate alternatives. --- gcc/config/rs6000/440.md | 6 +-- gcc/config/rs6000/e300c2c3.md | 2 +- gcc/config/rs6000/e5500.md| 5 +- gcc/config/rs6000/e6500.md| 6 ++- gcc/config/rs6000/rs6000.md | 113 +- 5 files changed, 81 insertions(+), 51 deletions(-) diff --git a/gcc/config/rs6000/440.md b/gcc/config/rs6000/440.md index bc8da3e..f956bd6 100644 --- a/gcc/config/rs6000/440.md +++ b/gcc/config/rs6000/440.md @@ -53,8 +53,8 @@ (define_insn_reservation ppc440-fpstore 3 ppc440_issue,ppc440_l_pipe) (define_insn_reservation ppc440-integer 1 - (and (ior (eq_attr type integer,insert,shift,trap,cntlz,exts,isel) - (and (eq_attr type add,logical) + (and (ior (eq_attr type integer,insert,trap,cntlz,exts,isel) + (and (eq_attr type add,logical,shift) (eq_attr dot no))) (eq_attr cpu ppc440)) ppc440_issue,ppc440_i_pipe|ppc440_j_pipe) @@ -96,7 +96,7 @@ (define_insn_reservation ppc440-branch 1 (define_insn_reservation ppc440-compare 2 (and (ior (eq_attr type cmp,compare,cr_logical,delayed_cr,mfcr) - (and (eq_attr type add,logical) + (and (eq_attr type add,logical,shift) (eq_attr dot yes))) (eq_attr cpu ppc440)) ppc440_issue,ppc440_i_pipe) diff --git a/gcc/config/rs6000/e300c2c3.md b/gcc/config/rs6000/e300c2c3.md index 6ac585b..f80ef30 100644 --- a/gcc/config/rs6000/e300c2c3.md +++ b/gcc/config/rs6000/e300c2c3.md @@ -93,7 +93,7 @@ (define_insn_reservation ppce300c3_cmp 1 ;; Other one cycle IU insns (define_insn_reservation ppce300c3_iu 1 (and (ior (eq_attr type integer,insert,isel) - (and (eq_attr type add,logical) + (and (eq_attr type add,logical,shift) (eq_attr dot no))) (ior (eq_attr cpu ppce300c2) (eq_attr cpu ppce300c3))) ppce300c3_decode,ppce300c3_issue+ppce300c3_iu_stage0+ppce300c3_retire) diff --git a/gcc/config/rs6000/e5500.md b/gcc/config/rs6000/e5500.md index 49a5c39..8d784e0 100644 --- a/gcc/config/rs6000/e5500.md +++ b/gcc/config/rs6000/e5500.md @@ -67,7 +67,10 @@ (define_insn_reservation e5500_sfx 1 (define_insn_reservation e5500_sfx2 2 (and (ior (eq_attr type cmp,compare,trap) (and (eq_attr type add,logical) -(eq_attr dot yes))) +(eq_attr dot yes)) + (and (eq_attr type shift) +(eq_attr dot yes) +(eq_attr var_shift no))) (eq_attr cpu ppce5500)) e5500_decode,e5500_sfx) diff --git a/gcc/config/rs6000/e6500.md b/gcc/config/rs6000/e6500.md index deec34b..a013a94 100644 --- a/gcc/config/rs6000/e6500.md +++ b/gcc/config/rs6000/e6500.md @@ -63,6 +63,7 @@ (define_insn_reservation e6500_sfx 1 (and (eq_attr type add,logical) (eq_attr dot no)) (and (eq_attr type shift) +(eq_attr dot no) (eq_attr var_shift no))) (eq_attr cpu ppce6500)) e6500_decode,e6500_sfx) @@ -70,7 +71,10 @@ (define_insn_reservation e6500_sfx 1 (define_insn_reservation e6500_sfx2 2 (and (ior (eq_attr type cmp,compare,trap) (and (eq_attr type add,logical) -(eq_attr dot yes))) +(eq_attr dot yes)) + (and (eq_attr type shift) +
[PATCH 7/9] rs6000: Make all add instructions one type
They are currently just integer, but the dot version is fast_compare. This makes them all add. Later we should introduce attributes to distinguish e.g. addc and adde (which aren't currently handled as separate instructions at all, only in groups). 2014-05-22 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/rs6000.md (type): Add add. (*addmode3_internal1, addsi3_high, *addmode3_internal2, *addmode3_internal3, *negmode2_internal, and 5 anonymous define_insns): Use it. * config/rs6000/rs6000.c (rs6000_adjust_cost): Adjust. * config/rs6000/40x.md (ppc403-integer, ppc403-compare): Adjust. * config/rs6000/440.md (ppc440-integer, ppc440-compare): Adjust. * config/rs6000/476.md (ppc476-simple-integer, ppc476-compare): Adjust. * config/rs6000/601.md (ppc601-integer): Adjust. * config/rs6000/603.md (ppc603-integer, ppc603-compare): Adjust. * config/rs6000/6xx.md (ppc604-integer, ppc604-compare): Adjust. * config/rs6000/7450.md (ppc7450-integer, ppc7450-compare): Adjust. * config/rs6000/7xx.md (ppc750-integer, ppc750-compare): Adjust. * config/rs6000/8540.md (ppc8540_su): Adjust. * config/rs6000/cell.md (cell-integer, cell-fast-cmp, cell-cmp-microcoded): Adjust. * config/rs6000/e300c2c3.md (ppce300c3_cmp, ppce300c3_iu): Adjust. * config/rs6000/e500mc.md (e500mc_su): Adjust. * config/rs6000/e500mc64.md (e500mc64_su, e500mc64_su2): Adjust. * config/rs6000/e5500.md (e5500_sfx, e5500_sfx2): Adjust. * config/rs6000/e6500.md (e6500_sfx, e6500_sfx2): Adjust. * config/rs6000/mpc.md (mpccore-integer, mpccore-compare): Adjust. * config/rs6000/power4.md (power4-integer, power4-cmp): Adjust. * config/rs6000/power5.md (power5-integer, power5-cmp): Adjust. * config/rs6000/power6.md (power6-integer, power6-fast-compare): Adjust. * config/rs6000/power7.md (power7-integer, power7-cmp): Adjust. * config/rs6000/power8.md (power8-1cyc, power8-fast-compare): Adjust. * config/rs6000/rs64.md (rs64a-integer, rs64a-compare): Adjust. * config/rs6000/titan.md (titan_fxu_adder, titan_fxu_alu): Adjust. --- gcc/config/rs6000/40x.md | 4 ++-- gcc/config/rs6000/440.md | 8 ++-- gcc/config/rs6000/476.md | 4 ++-- gcc/config/rs6000/601.md | 2 +- gcc/config/rs6000/603.md | 4 ++-- gcc/config/rs6000/6xx.md | 4 ++-- gcc/config/rs6000/7450.md | 4 ++-- gcc/config/rs6000/7xx.md | 4 ++-- gcc/config/rs6000/8540.md | 2 +- gcc/config/rs6000/cell.md | 6 +++--- gcc/config/rs6000/e300c2c3.md | 6 -- gcc/config/rs6000/e500mc.md | 2 +- gcc/config/rs6000/e500mc64.md | 4 gcc/config/rs6000/e5500.md| 6 +- gcc/config/rs6000/e6500.md| 6 +- gcc/config/rs6000/mpc.md | 4 ++-- gcc/config/rs6000/power4.md | 6 -- gcc/config/rs6000/power5.md | 6 -- gcc/config/rs6000/power6.md | 8 ++-- gcc/config/rs6000/power7.md | 6 -- gcc/config/rs6000/power8.md | 6 -- gcc/config/rs6000/rs6000.c| 2 ++ gcc/config/rs6000/rs6000.md | 30 +++--- gcc/config/rs6000/rs64.md | 4 ++-- gcc/config/rs6000/titan.md| 8 ++-- 25 files changed, 95 insertions(+), 51 deletions(-) diff --git a/gcc/config/rs6000/40x.md b/gcc/config/rs6000/40x.md index 30ac01d..85b9e41 100644 --- a/gcc/config/rs6000/40x.md +++ b/gcc/config/rs6000/40x.md @@ -37,7 +37,7 @@ (define_insn_reservation ppc403-store 2 (define_insn_reservation ppc403-integer 1 (and (ior (eq_attr type integer,insert,trap,cntlz,exts,isel) - (and (eq_attr type shift) + (and (eq_attr type add,shift) (eq_attr dot no))) (eq_attr cpu ppc403,ppc405)) iu_40x) @@ -54,7 +54,7 @@ (define_insn_reservation ppc403-three 1 (define_insn_reservation ppc403-compare 3 (and (ior (eq_attr type cmp,fast_compare,compare) - (and (eq_attr type shift) + (and (eq_attr type add,shift) (eq_attr dot yes))) (eq_attr cpu ppc403,ppc405)) iu_40x,nothing,bpu_40x) diff --git a/gcc/config/rs6000/440.md b/gcc/config/rs6000/440.md index 3a36ffb..23f69b1 100644 --- a/gcc/config/rs6000/440.md +++ b/gcc/config/rs6000/440.md @@ -53,7 +53,9 @@ (define_insn_reservation ppc440-fpstore 3 ppc440_issue,ppc440_l_pipe) (define_insn_reservation ppc440-integer 1 - (and (eq_attr type integer,insert,shift,trap,cntlz,exts,isel) + (and (ior (eq_attr type integer,insert,shift,trap,cntlz,exts,isel) + (and (eq_attr type add) +(eq_attr dot no))) (eq_attr cpu ppc440)) ppc440_issue,ppc440_i_pipe|ppc440_j_pipe) @@ -93,7 +95,9 @@ (define_insn_reservation ppc440-branch 1 ppc440_issue,ppc440_i_pipe) (define_insn_reservation ppc440-compare 2 - (and (eq_attr type
[PATCH 8/9] rs6000: Make all logical instructions one type
They are currently just integer, but the dot version is fast_compare. This makes them all logical. 2014-05-22 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/rs6000.md (type): Add logical. Delete fast_compare. (dot): Adjust comment. (andsi3_mc, *andsi3_internal2_mc, *andsi3_internal3_mc, *andsi3_internal4, *andsi3_internal5_mc, *boolsi3_internal2, *boolsi3_internal3, *boolccsi3_internal2, *boolccsi3_internal3, anddi3_mc, *anddi3_internal2_mc, *anddi3_internal3_mc, *booldi3_internal2, *booldi3_internal3, *boolcdi3_internal2, *boolcdi3_internal3, *boolccdi3_internal2, *boolccdi3_internal3, *movmode_internal2, and 10 anonymous define_insns): Use logical. * config/rs6000/rs6000.c (rs6000_adjust_cost): Adjust. * config/rs6000/40x.md: (ppc403-integer, ppc403-compare): Adjust. * config/rs6000/440.md: (ppc440-integer, ppc440-compare): Adjust. * config/rs6000/476.md: (ppc476-simple-integer, ppc476-compare): Adjust. * config/rs6000/603.md: (ppc603-integer, ppc603-compare): Adjust. * config/rs6000/6xx.md: (ppc604-integer, ppc604-compare): Adjust. * config/rs6000/7450.md: (ppc7450-integer, ppc7450-compare): Adjust. * config/rs6000/7xx.md: (ppc750-integer, ppc750-compare): Adjust. * config/rs6000/8540.md: (ppc8540_su): Adjust. * config/rs6000/cell.md: (cell-integer, cell-fast-cmp, cell-cmp-microcoded): Adjust. * config/rs6000/e300c2c3.md: (ppce300c3_cmp, ppce300c3_iu): Adjust. * config/rs6000/e500mc.md: (e500mc_su): Adjust. * config/rs6000/e500mc64.md: (e500mc64_su, e500mc64_su2): Adjust. * config/rs6000/e5500.md: (e5500_sfx, e5500_sfx2): Adjust. * config/rs6000/e6500.md: (e6500_sfx, e6500_sfx2): Adjust. * config/rs6000/mpc.md: (mpccore-integer, mpccore-compare): Adjust. * config/rs6000/power4.md: (power4-integer, power4-cmp): Adjust. * config/rs6000/power5.md: (power5-integer, power5-cmp): Adjust. * config/rs6000/power6.md: (power6-integer, power6-fast-compare): Adjust. * config/rs6000/power7.md: (power7-integer, power7-cmp): Adjust. * config/rs6000/power8.md: (power8-1cyc, power8-fast-compare): Adjust. Adjust comment. * config/rs6000/rs64.md: (rs64a-integer, rs64a-compare): Adjust. * config/rs6000/titan.md: (titan_fxu_adder, titan_fxu_alu): Adjust. --- gcc/config/rs6000/40x.md | 6 +-- gcc/config/rs6000/440.md | 6 +-- gcc/config/rs6000/476.md | 7 ++-- gcc/config/rs6000/603.md | 6 +-- gcc/config/rs6000/6xx.md | 6 +-- gcc/config/rs6000/7450.md | 6 +-- gcc/config/rs6000/7xx.md | 6 +-- gcc/config/rs6000/8540.md | 2 +- gcc/config/rs6000/cell.md | 10 ++--- gcc/config/rs6000/e300c2c3.md | 6 +-- gcc/config/rs6000/e500mc.md | 2 +- gcc/config/rs6000/e500mc64.md | 6 +-- gcc/config/rs6000/e5500.md| 6 +-- gcc/config/rs6000/e6500.md| 6 +-- gcc/config/rs6000/mpc.md | 6 +-- gcc/config/rs6000/power4.md | 6 +-- gcc/config/rs6000/power5.md | 6 +-- gcc/config/rs6000/power6.md | 7 ++-- gcc/config/rs6000/power7.md | 6 +-- gcc/config/rs6000/power8.md | 9 ++--- gcc/config/rs6000/rs6000.c| 4 +- gcc/config/rs6000/rs6000.md | 89 ++- gcc/config/rs6000/rs64.md | 6 +-- gcc/config/rs6000/titan.md| 6 +-- 24 files changed, 122 insertions(+), 104 deletions(-) diff --git a/gcc/config/rs6000/40x.md b/gcc/config/rs6000/40x.md index 85b9e41..b29e06a 100644 --- a/gcc/config/rs6000/40x.md +++ b/gcc/config/rs6000/40x.md @@ -37,7 +37,7 @@ (define_insn_reservation ppc403-store 2 (define_insn_reservation ppc403-integer 1 (and (ior (eq_attr type integer,insert,trap,cntlz,exts,isel) - (and (eq_attr type add,shift) + (and (eq_attr type add,logical,shift) (eq_attr dot no))) (eq_attr cpu ppc403,ppc405)) iu_40x) @@ -53,8 +53,8 @@ (define_insn_reservation ppc403-three 1 iu_40x,iu_40x,iu_40x) (define_insn_reservation ppc403-compare 3 - (and (ior (eq_attr type cmp,fast_compare,compare) - (and (eq_attr type add,shift) + (and (ior (eq_attr type cmp,compare) + (and (eq_attr type add,logical,shift) (eq_attr dot yes))) (eq_attr cpu ppc403,ppc405)) iu_40x,nothing,bpu_40x) diff --git a/gcc/config/rs6000/440.md b/gcc/config/rs6000/440.md index 23f69b1..bc8da3e 100644 --- a/gcc/config/rs6000/440.md +++ b/gcc/config/rs6000/440.md @@ -54,7 +54,7 @@ (define_insn_reservation ppc440-fpstore 3 (define_insn_reservation ppc440-integer 1 (and (ior (eq_attr type integer,insert,shift,trap,cntlz,exts,isel) - (and (eq_attr type add) + (and (eq_attr type add,logical) (eq_attr dot no))) (eq_attr cpu ppc440))
Re: libsanitizer merge from upstream r208536
On 05/23/2014 10:34 AM, Jakub Jelinek wrote: Otherwise libasan apps will simply stop working altogether if LD_PRELOAD is set, to whatever library, even if it doesn't define any symbols you care about. Right but I'm not sure whether failing fast here is necessarily bad. I think it is very bad. In fact, if you really want such a check, I'd say it shouldn't be at least enabled by default, unless some env var requests it; and document that if you are having troubles with asan sanitized programs, try this magic env var to get better troubleshooting. As I said, we can remove that Die() there. Warning can also be hidden under ASAN_OPTIONS=verbosity=1. Even before this exaggerated check asan imposes far more restrictions than good, and this just makes asan less usable just for fear that it wouldn't work right. Ok, we seem to approach this from two different angles. I usually try to prohibit functionality that's not proven to work It could be later enabled if users need it. Most preloaded libs will just provide symbols asan never cares about Maybe. In my experience it's all libc decorations. even if you say override malloc completely without calling the original implementation, the world doesn't end, the shadow mem of those allocations just won't be surrounded by protected paddings, so what, you don't detect out of bounds for malloc, but can still detect out of bounds in your program's stack etc. Ditto for string ops etc. Let's wait for Konstantin to comment on this. I don't know runtime well enough to guarantee that arbitrary symbol overloads are expected to work. Imagine preloaded library has an initializer which calls intercepted APIs. Asan didn't get a chance to initialize at the point of call and if interceptor doesn't contain a sanity call to asan_init, we are risking hard-to-debug runtime error (call to NULL, etc.). I've seen numerous bugs like this (both locally and on mailing lists) and they were main motivation to add this check. That is nonsense. Early in the symbol search scope is the opposite of being initialized early, on the contrary, such libraries are initialized last. I may be wrong but my understanding was that ld.so performs a reverse topological sort of dependencies and initializes them in that order. Given that libasan depends on standard libs (librt, lipthread, libdl, etc.) it'll be initialized after them but before user libs. Initializers of std libs may indeed cause problems but we at least make sure to initialize before arbitrary user libraries. I wonder whether overriding Asan's malloc, etc. is expected to work at all? Perhaps banning it altogether is just the safest thing to do? Don't know why you want to ban everything. To guarantee predictable and consistent behavior. -Y
[PATCH] Fix PR rtl-optimization/61278
Hi, The patch fixes PR rtl-optimization/61278. Root cause for issue is that df_live does not exist at -O1. Bootstrap and no make check regression on X86-64. OK for trunk? Thanks! -Zhenqiang ChangeLog: 2014-05-23 Zhenqiang Chen zhenqiang.c...@linaro.org PR rtl-optimization/61278 * shrink-wrap.c (move_insn_for_shrink_wrap): Check df_live. testsuite/ChangeLog: 2014-05-23 Zhenqiang Chen zhenqiang.c...@linaro.org * gcc.dg/lto/pr61278_0.c: New test. * gcc.dg/lto/pr61278_1.c: New test. diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c index f09cfe7..be17829 100644 --- a/gcc/shrink-wrap.c +++ b/gcc/shrink-wrap.c @@ -204,8 +204,15 @@ move_insn_for_shrink_wrap (basic_block bb, rtx insn, /* Create a new basic block on the edge. */ if (EDGE_COUNT (next_block-preds) == 2) { + /* If DF_LIVE doesn't exist, i.e. at -O1, just give up. */ + if (!df_live) + return false; + next_block = split_edge (live_edge); + /* We create a new basic block. Call df_grow_bb_info to make sure +all data structures are allocated. */ + df_grow_bb_info (df_live); bitmap_copy (df_get_live_in (next_block), df_get_live_out (bb)); df_set_bb_dirty (next_block); diff --git a/gcc/testsuite/gcc.dg/lto/pr61278_0.c b/gcc/testsuite/gcc.dg/lto/pr61278_0.c new file mode 100644 index 000..03a24ae --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr61278_0.c @@ -0,0 +1,30 @@ +/* { dg-lto-do link } */ +/* { dg-lto-options { { -flto -O0 } } } */ +/* { dg-extra-ld-options -flto -O1 } */ + +static unsigned int +fn1 (int p1, int p2) +{ + return 0; +} + +char a, b, c; + +char +foo (char *p) +{ + int i; + for (b = 1 ; b 0; b++) +{ + for (i = 0; i 2; i++) + ; + for (a = 1; a 0; a++) + { + char d[1] = { 0 }; + if (*p) + break; + c ^= fn1 (fn1 (fn1 (0, 0), 0), 0); + } +} + return 0; +} diff --git a/gcc/testsuite/gcc.dg/lto/pr61278_1.c b/gcc/testsuite/gcc.dg/lto/pr61278_1.c new file mode 100644 index 000..b02c8ac --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr61278_1.c @@ -0,0 +1,13 @@ +/* { dg-lto-do link } */ +/* { dg-lto-options { { -flto -O1 } } } */ + +extern char foo (char *); + +char d; + +int +main () +{ + foo (d); + return 0; +}
Re: [PATCH, sched] Cleanup and improve multipass_dfa_lookahead_guard
../../gcc/config/ia64/ia64.c: In function 'int ia64_first_cycle_multipass_dfa_lookahead_guard(rtx, int)': ../../gcc/config/ia64/ia64.c:7551:1: error: control reaches end of non-void function [-Werror=return-type] Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 And now for something completely different.
Re: libsanitizer merge from upstream r208536
On Thu, May 22, 2014 at 7:31 AM, Konstantin Serebryany konstantin.s.serebry...@gmail.com wrote: On Wed, May 21, 2014 at 11:43 PM, Jakub Jelinek ja...@redhat.com wrote: On Wed, May 21, 2014 at 04:09:19PM +0400, Konstantin Serebryany wrote: A new patch based on r209283. This one has the H.J.'s patches for x32. Ok for trunk then. But please help the ppc*/arm*/sparc* maintainers if issues on those targets are reported. Of course. arm should be in a good shape since there are arm users upstream, including ourselves. On ARM the asan tests have always been a random generator of PASS / FAIL on qemu despite efforts to nobble qemu for /proc/self/maps outputs. On a board where this appears to work well ( my A15 / A7 Odroid XU at home) https://gcc.gnu.org/ml/gcc-testresults/2014-05/msg01902.html the set of results from before the merge. indicates the following failures. Running target unix FAIL: c-c++-common/asan/asan-interface-1.c -O0 (test for excess errors) UNRESOLVED: c-c++-common/asan/asan-interface-1.c -O0 compilation failed to produce executable FAIL: c-c++-common/asan/asan-interface-1.c -O1 (test for excess errors) UNRESOLVED: c-c++-common/asan/asan-interface-1.c -O1 compilation failed to produce executable FAIL: c-c++-common/asan/asan-interface-1.c -O2 (test for excess errors) UNRESOLVED: c-c++-common/asan/asan-interface-1.c -O2 compilation failed to produce executable FAIL: c-c++-common/asan/asan-interface-1.c -O3 -fomit-frame-pointer (test for excess errors) UNRESOLVED: c-c++-common/asan/asan-interface-1.c -O3 -fomit-frame-pointer compilation failed to produce executable FAIL: c-c++-common/asan/asan-interface-1.c -O3 -g (test for excess errors) UNRESOLVED: c-c++-common/asan/asan-interface-1.c -O3 -g compilation failed to produce executable FAIL: c-c++-common/asan/asan-interface-1.c -Os (test for excess errors) UNRESOLVED: c-c++-common/asan/asan-interface-1.c -Os compilation failed to produce executable FAIL: c-c++-common/asan/asan-interface-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors) UNRESOLVED: c-c++-common/asan/asan-interface-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none compilation failed to produce executable FAIL: c-c++-common/asan/asan-interface-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) UNRESOLVED: c-c++-common/asan/asan-interface-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects compilation failed to produce executable After the merge I see these new failures instead https://gcc.gnu.org/ml/gcc-testresults/2014-05/msg02018.html FAIL: c-c++-common/asan/heap-overflow-1.c -O0 output pattern test, is = FAIL: c-c++-common/asan/heap-overflow-1.c -O1 output pattern test, is = FAIL: c-c++-common/asan/heap-overflow-1.c -O2 output pattern test, is = FAIL: c-c++-common/asan/heap-overflow-1.c -O3 -fomit-frame-pointer output pattern test, is = FAIL: c-c++-common/asan/heap-overflow-1.c -O3 -g output pattern test, is = FAIL: c-c++-common/asan/heap-overflow-1.c -Os output pattern test, is = FAIL: c-c++-common/asan/heap-overflow-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none output pattern test, is = FAIL: c-c++-common/asan/heap-overflow-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects output pattern test, is = FAIL: c-c++-common/asan/sanity-check-pure-c-1.c -O0 output pattern test, is = FAIL: c-c++-common/asan/sanity-check-pure-c-1.c -O1 output pattern test, is = FAIL: c-c++-common/asan/sanity-check-pure-c-1.c -O2 output pattern test, is = FAIL: c-c++-common/asan/sanity-check-pure-c-1.c -O3 -fomit-frame-pointer output pattern test, is = FAIL: c-c++-common/asan/sanity-check-pure-c-1.c -O3 -g output pattern test, is = FAIL: c-c++-common/asan/sanity-check-pure-c-1.c -Os output pattern test, is = FAIL: c-c++-common/asan/sanity-check-pure-c-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none output pattern test, is = FAIL: c-c++-common/asan/sanity-check-pure-c-1.c -O2 -flto
Re: libsanitizer merge from upstream r208536
On Fri, May 23, 2014 at 11:20 AM, Yury Gribov y.gri...@samsung.com wrote: On 05/23/2014 10:34 AM, Jakub Jelinek wrote: Otherwise libasan apps will simply stop working altogether if LD_PRELOAD is set, to whatever library, even if it doesn't define any symbols you care about. Right but I'm not sure whether failing fast here is necessarily bad. I think it is very bad. In fact, if you really want such a check, I'd say it shouldn't be at least enabled by default, unless some env var requests it; and document that if you are having troubles with asan sanitized programs, try this magic env var to get better troubleshooting. As I said, we can remove that Die() there. Warning can also be hidden under ASAN_OPTIONS=verbosity=1. Even before this exaggerated check asan imposes far more restrictions than good, and this just makes asan less usable just for fear that it wouldn't work right. Ok, we seem to approach this from two different angles. I usually try to prohibit functionality that's not proven to work It could be later enabled if users need it. Most preloaded libs will just provide symbols asan never cares about Maybe. In my experience it's all libc decorations. even if you say override malloc completely without calling the original implementation, the world doesn't end, the shadow mem of those allocations just won't be surrounded by protected paddings, so what, you don't detect out of bounds for malloc, but can still detect out of bounds in your program's stack etc. Ditto for string ops etc. Let's wait for Konstantin to comment on this. I don't know runtime well enough to guarantee that arbitrary symbol overloads are expected to work. [These things are really better discussed at address-saniti...@googlegroups.com, where more asan people will read. maybe we should start a separate topic there. This conversation is already too long to comprehend. ] Failing to intercept something may cause not just false negatives, but also false positives. These cases are often exceptionally hard to debug, so any checking that the interception machinery works as intended is good. Of course if these checks are wrong we should fix them. Imagine preloaded library has an initializer which calls intercepted APIs. Asan didn't get a chance to initialize at the point of call and if interceptor doesn't contain a sanity call to asan_init, we are risking hard-to-debug runtime error (call to NULL, etc.). I've seen numerous bugs like this (both locally and on mailing lists) and they were main motivation to add this check. That is nonsense. Early in the symbol search scope is the opposite of being initialized early, on the contrary, such libraries are initialized last. I may be wrong but my understanding was that ld.so performs a reverse topological sort of dependencies and initializes them in that order. Given that libasan depends on standard libs (librt, lipthread, libdl, etc.) it'll be initialized after them but before user libs. Initializers of std libs may indeed cause problems but we at least make sure to initialize before arbitrary user libraries. I wonder whether overriding Asan's malloc, etc. is expected to work at all? Perhaps banning it altogether is just the safest thing to do? Don't know why you want to ban everything. To guarantee predictable and consistent behavior. -Y
Re: [PATCH, sched] Cleanup and improve multipass_dfa_lookahead_guard
On May 23, 2014, at 7:23 PM, Andreas Schwab sch...@linux-m68k.org wrote: ../../gcc/config/ia64/ia64.c: In function 'int ia64_first_cycle_multipass_dfa_lookahead_guard(rtx, int)': ../../gcc/config/ia64/ia64.c:7551:1: error: control reaches end of non-void function [-Werror=return-type] Fixed, sorry about the breakage. The patch is trivial. Thank you, -- Maxim Kuvyrkov www.linaro.org 2014-05-23 Maxim Kuvyrkov maxim.kuvyr...@linaro.org Fix bootstrap error on ia64 * config/ia64/ia64.c (ia64_first_cycle_multipass_dfa_lookahead_guard): Return default value. Index: gcc/config/ia64/ia64.c === --- gcc/config/ia64/ia64.c (revision 210844) +++ gcc/config/ia64/ia64.c (working copy) @@ -7548,6 +7548,8 @@ ia64_first_cycle_multipass_dfa_lookahead_guard || !is_load_p (insn) || mem_ops_in_group[current_cycle % 4] ia64_max_memory_insns)) return 0; + + return 1; } /* The following variable value is pseudo-insn used by the DFA insn
Re: libsanitizer merge from upstream r208536
On Fri, May 23, 2014 at 11:20:01AM +0400, Yury Gribov wrote: Even before this exaggerated check asan imposes far more restrictions than good, and this just makes asan less usable just for fear that it wouldn't work right. Ok, we seem to approach this from two different angles. I usually try to prohibit functionality that's not proven to work It could be later enabled if users need it. No other shared library does anything close to that, for each such library you can interpose any of its public symbols, either you know what you are doing when interposing it, or it breaks. That is nonsense. Early in the symbol search scope is the opposite of being initialized early, on the contrary, such libraries are initialized last. I may be wrong but my understanding was that ld.so performs a reverse topological sort of dependencies and initializes them in that order. Given that libasan depends on standard libs (librt, lipthread, libdl, etc.) it'll be initialized after them but before user libs. Initializers of std libs may indeed cause problems but we at least make sure to initialize before arbitrary user libraries. Just try say LD_DEBUG=all LD_PRELOAD=libasan.so.1 /bin/bash to see (non-instrumented bash or any of its shared libraries). ... 30218: object=/bin/bash [0] 30218: scope 0: /bin/bash ./libasan.so.1.0.0 /lib64/libtinfo.so.5 /lib64/libdl.so.2 /lib64/libc.so.6 /lib64/libpthread.so.0 /usr/src/gcc/obj2/x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.6 /lib64/libm.so.6 /lib64/ld-linux-x86-64.so.2 /lib64/libgcc_s.so.1 ... 30218: calling init: /lib64/libpthread.so.0 30218: calling init: /lib64/libc.so.6 30218: calling init: /lib64/libgcc_s.so.1 30218: calling init: /lib64/libm.so.6 30218: calling init: /usr/src/gcc/obj2/x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.6 30218: calling init: /lib64/libdl.so.2 30218: calling init: /lib64/libtinfo.so.5 30218: calling init: ./libasan.so.1.0.0 libasan.so.1 doesn't depend on libtinfo.so.5, only bash itself does, yet libtinfo.so.5 is constructed before libasan.so.1. Jakub
Re: libsanitizer merge from upstream r208536
On Fri, May 23, 2014 at 11:32 AM, Ramana Radhakrishnan ramana@googlemail.com wrote: On Thu, May 22, 2014 at 7:31 AM, Konstantin Serebryany konstantin.s.serebry...@gmail.com wrote: On Wed, May 21, 2014 at 11:43 PM, Jakub Jelinek ja...@redhat.com wrote: On Wed, May 21, 2014 at 04:09:19PM +0400, Konstantin Serebryany wrote: A new patch based on r209283. This one has the H.J.'s patches for x32. Ok for trunk then. But please help the ppc*/arm*/sparc* maintainers if issues on those targets are reported. Of course. arm should be in a good shape since there are arm users upstream, including ourselves. On ARM the asan tests have always been a random generator of PASS / FAIL on qemu despite efforts to nobble qemu for /proc/self/maps outputs. We ourselves test only Android ARM on a real box. There the tests work. On ARM Linux there are quite a few known test failures (mostly due to unwinding), and no public regular testing. See e.g. http://reviews.llvm.org/D3857 We don't have any experience with running asan on quemu and I remember some complaints regarding quemu itself, e.g. https://code.google.com/p/address-sanitizer/issues/detail?id=160 As usual: if you are interested in supporting asan on any given platform, please work with us upstream. On a board where this appears to work well ( my A15 / A7 Odroid XU at home) https://gcc.gnu.org/ml/gcc-testresults/2014-05/msg01902.html the set of results from before the merge. indicates the following failures. Running target unix FAIL: c-c++-common/asan/asan-interface-1.c -O0 (test for excess errors) UNRESOLVED: c-c++-common/asan/asan-interface-1.c -O0 compilation failed to produce executable FAIL: c-c++-common/asan/asan-interface-1.c -O1 (test for excess errors) UNRESOLVED: c-c++-common/asan/asan-interface-1.c -O1 compilation failed to produce executable FAIL: c-c++-common/asan/asan-interface-1.c -O2 (test for excess errors) UNRESOLVED: c-c++-common/asan/asan-interface-1.c -O2 compilation failed to produce executable FAIL: c-c++-common/asan/asan-interface-1.c -O3 -fomit-frame-pointer (test for excess errors) UNRESOLVED: c-c++-common/asan/asan-interface-1.c -O3 -fomit-frame-pointer compilation failed to produce executable FAIL: c-c++-common/asan/asan-interface-1.c -O3 -g (test for excess errors) UNRESOLVED: c-c++-common/asan/asan-interface-1.c -O3 -g compilation failed to produce executable FAIL: c-c++-common/asan/asan-interface-1.c -Os (test for excess errors) UNRESOLVED: c-c++-common/asan/asan-interface-1.c -Os compilation failed to produce executable FAIL: c-c++-common/asan/asan-interface-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors) UNRESOLVED: c-c++-common/asan/asan-interface-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none compilation failed to produce executable FAIL: c-c++-common/asan/asan-interface-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) UNRESOLVED: c-c++-common/asan/asan-interface-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects compilation failed to produce executable After the merge I see these new failures instead https://gcc.gnu.org/ml/gcc-testresults/2014-05/msg02018.html FAIL: c-c++-common/asan/heap-overflow-1.c -O0 output pattern test, is = FAIL: c-c++-common/asan/heap-overflow-1.c -O1 output pattern test, is = FAIL: c-c++-common/asan/heap-overflow-1.c -O2 output pattern test, is = FAIL: c-c++-common/asan/heap-overflow-1.c -O3 -fomit-frame-pointer output pattern test, is = FAIL: c-c++-common/asan/heap-overflow-1.c -O3 -g output pattern test, is = FAIL: c-c++-common/asan/heap-overflow-1.c -Os output pattern test, is = FAIL: c-c++-common/asan/heap-overflow-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none output pattern test, is = FAIL: c-c++-common/asan/heap-overflow-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects output pattern test, is = FAIL: c-c++-common/asan/sanity-check-pure-c-1.c -O0 output pattern test, is = FAIL: c-c++-common/asan/sanity-check-pure-c-1.c -O1 output pattern test, is = FAIL: c-c++-common/asan/sanity-check-pure-c-1.c -O2 output pattern test, is = FAIL: c-c++-common/asan/sanity-check-pure-c-1.c -O3
Re: libsanitizer merge from upstream r208536
On Fri, May 23, 2014 at 11:34:38AM +0400, Konstantin Serebryany wrote: Failing to intercept something may cause not just false negatives, but also false positives. These cases are often exceptionally hard to debug, so any checking that the interception machinery works as intended is good. Of course if these checks are wrong we should fix them. But at least don't report anything when the program starts, but at the end of the ASAN failure report (if anything has been detected). Like: WARNING: libasan.so.1 isn't early in the symbol search scope, some symbols in it might be interposed. Try LD_PRELOAD=libasan.so.1. or so. Of course, LD_PRELOAD=libasan.so.1 wouldn't help with this check, because the vDSO still can come up earlier. Jakub
Re: libsanitizer merge from upstream r208536
On ARM the asan tests have always been a random generator of PASS / FAIL on qemu despite efforts to nobble qemu for /proc/self/maps outputs. This should improve once upstream Asan sets up an ARM build bot. This has been discussed recently but noone has yet volunteered to do the server installation and setup. After the merge I see these new failures instead A pity that gcc-testresults does not report mismatched lines. My guess is this is caused by some unwinding failures when reporting heap allocations: $ grep -R 'allocated by thread' ./gcc/testsuite/c-c++-common/asan -l ./gcc/testsuite/c-c++-common/asan/strncpy-overflow-1.c ./gcc/testsuite/c-c++-common/asan/heap-overflow-1.c ./gcc/testsuite/c-c++-common/asan/use-after-free-1.c -Y
Re: libsanitizer merge from upstream r208536
On 05/23/14 08:50, Yury Gribov wrote: On ARM the asan tests have always been a random generator of PASS / FAIL on qemu despite efforts to nobble qemu for /proc/self/maps outputs. This should improve once upstream Asan sets up an ARM build bot. This has been discussed recently but noone has yet volunteered to do the server installation and setup. After the merge I see these new failures instead A pity that gcc-testresults does not report mismatched lines. My guess is this is caused by some unwinding failures when reporting heap allocations: Ok, so we need to fix that if there is a bug. $ grep -R 'allocated by thread' ./gcc/testsuite/c-c++-common/asan -l ./gcc/testsuite/c-c++-common/asan/strncpy-overflow-1.c ./gcc/testsuite/c-c++-common/asan/heap-overflow-1.c ./gcc/testsuite/c-c++-common/asan/use-after-free-1.c This is what I see from the log and all failures look identical. ==14627==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x41a007fa at pc 0x88c4 bp 0xbebd0084 sp 0xbebd007c READ of size 1 at 0x41a007fa thread T0 #0 0x88c3 in main /work/gcc/gcc/testsuite/c-c++-common/asan/heap-overflow-1.c:21 #1 0x40626631 in __libc_start_main (/lib/arm-linux-gnueabihf/libc.so.6+0x17631) 0x41a007fa is located 0 bytes to the right of 10-byte region [0x41a007f0,0x41a007fa) allocated by thread T0 here: #0 0x400cd587 in __interceptor_malloc /work/gcc/libsanitizer/asan/asan_malloc_linux.cc:73 SUMMARY: AddressSanitizer: heap-buffer-overflow /work/gcc/gcc/testsuite/c-c++-common/asan/heap-overflow-1.c:21 main Shadow bytes around the buggy address: 0x283400a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x283400b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x283400c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x283400d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x283400e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa =0x283400f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa 00[02] 0x28340100: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x28340110: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x28340120: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x28340130: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x28340140: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Heap right redzone: fb Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack partial redzone: f4 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user:f7 Container overflow: fc ASan internal: fe ==14627==ABORTING -Y
Re: libsanitizer merge from upstream r208536
On Fri, May 23, 2014 at 11:56 AM, Ramana Radhakrishnan ramana.radhakrish...@arm.com wrote: On 05/23/14 08:50, Yury Gribov wrote: On ARM the asan tests have always been a random generator of PASS / FAIL on qemu despite efforts to nobble qemu for /proc/self/maps outputs. This should improve once upstream Asan sets up an ARM build bot. This has been discussed recently but noone has yet volunteered to do the server installation and setup. After the merge I see these new failures instead A pity that gcc-testresults does not report mismatched lines. My guess is this is caused by some unwinding failures when reporting heap allocations: Ok, so we need to fix that if there is a bug. Yep. $ grep -R 'allocated by thread' ./gcc/testsuite/c-c++-common/asan -l ./gcc/testsuite/c-c++-common/asan/strncpy-overflow-1.c ./gcc/testsuite/c-c++-common/asan/heap-overflow-1.c ./gcc/testsuite/c-c++-common/asan/use-after-free-1.c This is what I see from the log and all failures look identical. ==14627==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x41a007fa at pc 0x88c4 bp 0xbebd0084 sp 0xbebd007c READ of size 1 at 0x41a007fa thread T0 #0 0x88c3 in main /work/gcc/gcc/testsuite/c-c++-common/asan/heap-overflow-1.c:21 #1 0x40626631 in __libc_start_main (/lib/arm-linux-gnueabihf/libc.so.6+0x17631) 0x41a007fa is located 0 bytes to the right of 10-byte region [0x41a007f0,0x41a007fa) allocated by thread T0 here: #0 0x400cd587 in __interceptor_malloc /work/gcc/libsanitizer/asan/asan_malloc_linux.cc:73 Looks indeed like wrong unwind, similar to what has been recently discussed here: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140519/218239.html SUMMARY: AddressSanitizer: heap-buffer-overflow /work/gcc/gcc/testsuite/c-c++-common/asan/heap-overflow-1.c:21 main Shadow bytes around the buggy address: 0x283400a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x283400b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x283400c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x283400d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x283400e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa =0x283400f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa 00[02] 0x28340100: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x28340110: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x28340120: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x28340130: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x28340140: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Heap right redzone: fb Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack partial redzone: f4 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user:f7 Container overflow: fc ASan internal: fe ==14627==ABORTING -Y
[PATCH] Fix PR61266
The following reverts the un-XFAILing and adjusts the testcase according to reality (as already noted in testcase comments). Committed. Richard. 2014-05-23 Richard Biener rguent...@suse.de PR testsuite/61266 * gcc.dg/Wstrict-overflow-18.c: Revert un-XFAILing and adjust testcase to reflect reality. Index: gcc/testsuite/gcc.dg/Wstrict-overflow-18.c === --- gcc/testsuite/gcc.dg/Wstrict-overflow-18.c (revision 210845) +++ gcc/testsuite/gcc.dg/Wstrict-overflow-18.c (working copy) @@ -1,11 +1,8 @@ /* { dg-do compile } */ /* { dg-options -fstrict-overflow -O2 -Wstrict-overflow } */ -/* Don't warn about an overflow when folding i 0. The loop analysis - should determine that i does not wrap. - - The test is really bogus, p-a - p-b can be larger than INT_MAX - and thus i can very well wrap. */ +/* Warn about an overflow when folding i 0, p-a - p-b can be larger + than INT_MAX and thus i can wrap. */ struct c { unsigned int a; unsigned int b; }; extern void bar (struct c *); @@ -17,7 +14,7 @@ foo (struct c *p) for (i = 0; i p-a - p-b; ++i) { - if (i 0) /* { dg-bogus warning } */ + if (i 0) /* { dg-warning signed overflow } */ sum += 2; bar (p); }
Re: libsanitizer merge from upstream r208536
Hi, On 05/23/2014 09:47 AM, Jakub Jelinek wrote: On Fri, May 23, 2014 at 11:34:38AM +0400, Konstantin Serebryany wrote: Failing to intercept something may cause not just false negatives, but also false positives. These cases are often exceptionally hard to debug, so any checking that the interception machinery works as intended is good. Of course if these checks are wrong we should fix them. But at least don't report anything when the program starts, but at the end of the ASAN failure report (if anything has been detected). Like: WARNING: libasan.so.1 isn't early in the symbol search scope, some symbols in it might be interposed. Try LD_PRELOAD=libasan.so.1. or so. Yes please. For me, and I don't think my setup is such uncommon, the noise is quite annoying. Paolo.
Re: [patch i386]: Sibcall tail-call improvement and partial fix PR/60104
Hello, Does this touch or address the problem raised in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46219#c3 ? [Uros Bizjak] For some reason, memory operand is prohibited in a sibcall, see predicates.md [...] [Richard Henderson] That would be because we have no good way to say: global memory is fine, but the on-stack memory that we just deallocated is not. In addition for this case, we have to ensure that the registers used to do the indexing are still valid after call-saved registers have been restored, and avoid any call-clobbered registers that might be needed to execute the epilogue. In general I don't think this is solvable, but for this specific case we could add a peephole. Thanks. Alexander
Re: libsanitizer merge from upstream r208536
Paolo, I've checked all available systems and wasn't able to repro this. Every time vDSO was already filtered with if (!info-dlpi_name || info-dlpi_name[0] == 0) return 0; in FindFirstDSOCallback. Could you provide additional details of your setup? Or perhaps print dlpi_name of offending library? -Y
Re: [PATCH] Disable unroll loop that has header count less than iteration count.
On Thu, May 22, 2014 at 11:36 PM, Dehao Chen de...@google.com wrote: If a loop's header count is less than iteration count, the iteration estimation is apparently incorrect for this loop. Thus disable unrolling of such loops. Testing on going. OK for trunk if test pass? No. Why don't you instead plug the hole in expected_loop_iterations ()? That is, why may not loop-header be bogus? Isn't it maybe the bounding you run into? /* Returns expected number of LOOP iterations. The returned value is bounded by REG_BR_PROB_BASE. */ unsigned expected_loop_iterations (const struct loop *loop) { gcov_type expected = expected_loop_iterations_unbounded (loop); return (expected REG_BR_PROB_BASE ? REG_BR_PROB_BASE : expected); } I miss a testcase as well. Richard. Thanks, Dehao gcc/ChangeLog: 2014-05-21 Dehao Chen de...@google.com * cfgloop.h (expected_loop_iterations_reliable_p): New func. * cfgloopanal.c (expected_loop_iterations_reliable_p): Likewise. * loop-unroll.c (decide_unroll_runtime_iterations): Disable unroll loop that has unreliable iteration counts. Index: gcc/cfgloop.h === --- gcc/cfgloop.h (revision 210717) +++ gcc/cfgloop.h (working copy) @@ -307,8 +307,8 @@ extern bool just_once_each_iteration_p (const stru gcov_type expected_loop_iterations_unbounded (const struct loop *); extern unsigned expected_loop_iterations (const struct loop *); extern rtx doloop_condition_get (rtx); +extern bool expected_loop_iterations_reliable_p (const struct loop *); - /* Loop manipulation. */ extern bool can_duplicate_loop_p (const struct loop *loop); Index: gcc/cfgloopanal.c === --- gcc/cfgloopanal.c (revision 210717) +++ gcc/cfgloopanal.c (working copy) @@ -285,6 +285,15 @@ expected_loop_iterations (const struct loop *loop) return (expected REG_BR_PROB_BASE ? REG_BR_PROB_BASE : expected); } +/* Returns true if the loop header's profile count is smaller than expected + loop iteration. */ + +bool +expected_loop_iterations_reliable_p (const struct loop *loop) +{ + return expected_loop_iterations (loop) loop-header-count; +} + /* Returns the maximum level of nesting of subloops of LOOP. */ unsigned Index: gcc/loop-unroll.c === --- gcc/loop-unroll.c (revision 210717) +++ gcc/loop-unroll.c (working copy) @@ -988,6 +988,15 @@ decide_unroll_runtime_iterations (struct loop *loo return; } + if (profile_status_for_fn (cfun) == PROFILE_READ + expected_loop_iterations_reliable_p (loop)) +{ + if (dump_file) + fprintf (dump_file, ;; Not unrolling loop, loop iteration + not reliable.); + return; +} + /* Check whether the loop rolls. */ if ((get_estimated_loop_iterations (loop, iterations) || get_max_loop_iterations (loop, iterations))
Re: [patch i386]: Sibcall tail-call improvement and partial fix PR/60104
Hello, yes the underlying issue is the same as for PR/46219. Nevertheless the patch doesn't solve this mentioned PR as I used for know a pretty conservative checking of allowed memories. By extending x86_sibcall_memory_p_1 function about allowing register-arguments too for memory, this problem can be solved. Kai - Original Message - Hello, Does this touch or address the problem raised in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46219#c3 ? [Uros Bizjak] For some reason, memory operand is prohibited in a sibcall, see predicates.md [...] [Richard Henderson] That would be because we have no good way to say: global memory is fine, but the on-stack memory that we just deallocated is not. In addition for this case, we have to ensure that the registers used to do the indexing are still valid after call-saved registers have been restored, and avoid any call-clobbered registers that might be needed to execute the epilogue. In general I don't think this is solvable, but for this specific case we could add a peephole. Thanks. Alexander
Re: libsanitizer merge from upstream r208536
Hi, On 05/23/2014 10:50 AM, Yury Gribov wrote: Paolo, I've checked all available systems and wasn't able to repro this. Given your exchanges with Jakub I thought that at this point it was clear that the issue is real. Every time vDSO was already filtered with if (!info-dlpi_name || info-dlpi_name[0] == 0) return 0; in FindFirstDSOCallback. Could you provide additional details of your setup? Or perhaps print dlpi_name of offending library? How do I print dlpi_name? And which detail do you want? It's a very, very, standard Linux machine, running a 3.11.10 kernel and a 2.18 glibc, two days ago everything was fine, 4_9-branch is fine. Paolo.
Re: [PATCH] Fix PR rtl-optimization/61278
On Fri, May 23, 2014 at 9:23 AM, Zhenqiang Chen zhenqiang.c...@linaro.org wrote: Hi, The patch fixes PR rtl-optimization/61278. Root cause for issue is that df_live does not exist at -O1. Bootstrap and no make check regression on X86-64. OK for trunk? Why do you need to give up? It seems you can simply avoid marking the block as dirty (though df_get_live_in/out also hands you back DF_LR_IN/OUT if !df_live). So isn't the df_grow_bb_info the real fix? Note that df_get_live_in/out are functions tailored to IRA that knows that they handle both df_live and df_lr dependent on optimization level. Is shrink-wrapping supposed to work with both problems as well? Thanks, Richard. Thanks! -Zhenqiang ChangeLog: 2014-05-23 Zhenqiang Chen zhenqiang.c...@linaro.org PR rtl-optimization/61278 * shrink-wrap.c (move_insn_for_shrink_wrap): Check df_live. testsuite/ChangeLog: 2014-05-23 Zhenqiang Chen zhenqiang.c...@linaro.org * gcc.dg/lto/pr61278_0.c: New test. * gcc.dg/lto/pr61278_1.c: New test. diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c index f09cfe7..be17829 100644 --- a/gcc/shrink-wrap.c +++ b/gcc/shrink-wrap.c @@ -204,8 +204,15 @@ move_insn_for_shrink_wrap (basic_block bb, rtx insn, /* Create a new basic block on the edge. */ if (EDGE_COUNT (next_block-preds) == 2) { + /* If DF_LIVE doesn't exist, i.e. at -O1, just give up. */ + if (!df_live) + return false; + next_block = split_edge (live_edge); + /* We create a new basic block. Call df_grow_bb_info to make sure +all data structures are allocated. */ + df_grow_bb_info (df_live); bitmap_copy (df_get_live_in (next_block), df_get_live_out (bb)); df_set_bb_dirty (next_block); diff --git a/gcc/testsuite/gcc.dg/lto/pr61278_0.c b/gcc/testsuite/gcc.dg/lto/pr61278_0.c new file mode 100644 index 000..03a24ae --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr61278_0.c @@ -0,0 +1,30 @@ +/* { dg-lto-do link } */ +/* { dg-lto-options { { -flto -O0 } } } */ +/* { dg-extra-ld-options -flto -O1 } */ + +static unsigned int +fn1 (int p1, int p2) +{ + return 0; +} + +char a, b, c; + +char +foo (char *p) +{ + int i; + for (b = 1 ; b 0; b++) +{ + for (i = 0; i 2; i++) + ; + for (a = 1; a 0; a++) + { + char d[1] = { 0 }; + if (*p) + break; + c ^= fn1 (fn1 (fn1 (0, 0), 0), 0); + } +} + return 0; +} diff --git a/gcc/testsuite/gcc.dg/lto/pr61278_1.c b/gcc/testsuite/gcc.dg/lto/pr61278_1.c new file mode 100644 index 000..b02c8ac --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr61278_1.c @@ -0,0 +1,13 @@ +/* { dg-lto-do link } */ +/* { dg-lto-options { { -flto -O1 } } } */ + +extern char foo (char *); + +char d; + +int +main () +{ + foo (d); + return 0; +}
Re: libsanitizer merge from upstream r208536
I've checked all available systems and wasn't able to repro this. Given your exchanges with Jakub I thought that at this point it was clear that the issue is real. There are three issues here: 1) whether warning should cause termination 2) whether warning should be displayed by default 3) why warning occurs on your machine We mainly discussed 1 and 2 with Jakub but even if we hide this warning it still should have come up on your machine in the first place. It's either false positive which needs to be fixed or symptom of some real problem. I believe it still makes sense to investigate the original problem. Every time vDSO was already filtered with if (!info-dlpi_name || info-dlpi_name[0] == 0) return 0; in FindFirstDSOCallback. Could you provide additional details of your setup? Or perhaps print dlpi_name of offending library? How do I print dlpi_name? Could you add something like Report('%s'\n, info-dlpi_name); after if (!info-dlpi_name || info-dlpi_name[0] == 0) check in FindFirstDSOCallback? This should give us the name of library which causes problems. And which detail do you want? Just the name of the library. It's a very, very, standard Linux machine, running a 3.11.10 kernel and a 2.18 glibc, two days ago everything was fine, 4_9-branch is fine. True but being unable to repro this, I'd need some additional help to diagnose the problem. -Y
Re: libsanitizer merge from upstream r208536
still should have come up on your machine in the first place. should not have
Re: libsanitizer merge from upstream r208536
Hi, On 05/23/2014 11:21 AM, Yury Gribov wrote: I've checked all available systems and wasn't able to repro this. Given your exchanges with Jakub I thought that at this point it was clear that the issue is real. There are three issues here: 1) whether warning should cause termination 2) whether warning should be displayed by default 3) why warning occurs on your machine We mainly discussed 1 and 2 with Jakub but even if we hide this warning it still should have come up on your machine in the first place. It's either false positive which needs to be fixed or symptom of some real problem. I believe it still makes sense to investigate the original problem. Every time vDSO was already filtered with if (!info-dlpi_name || info-dlpi_name[0] == 0) return 0; in FindFirstDSOCallback. Could you provide additional details of your setup? Or perhaps print dlpi_name of offending library? How do I print dlpi_name? Could you add something like Report('%s'\n, info-dlpi_name); after if (!info-dlpi_name || info-dlpi_name[0] == 0) check in FindFirstDSOCallback? This should give us the name of library which causes problems. It's always linux-vdso.so.1, but wasn't that already known, given the ldd requested by Jakub?!? Paolo.
Re: emit __float128 typeinfo
On Wed, 21 May 2014, Jason Merrill wrote: On 04/25/2014 05:04 AM, Marc Glisse wrote: Does this approach seem ok, or do we need to try harder to find a way to get this typeinfo into libsupc++? The latter, I think; these are base types, so they should go in the library. Hmm, ok. Because of the arm target, we can't just use the register_builtin_type hook as it is. The things I can think of right now are: 1) change the prototype of register_builtin_type so it takes an extra bool parameter that says if we want to generate runtime stuff like typeinfo (in addition to what register_builtin_type is already doing). I would then update all target calls with , false and let target maintainers switch to true when they are ready. or 2) create a new register_builtin_type_runtime lang hook that would be defined only in (obj-)c++ and would be used only to generate typeinfo. Does one of those seem acceptable? Another alternative might be to wait for the intN_t work to land and do the same for floatN_t, but that's too big for me. -- Marc Glisse
Re: [PATCH 7/7] Plug ipa-prop escape analysis into gimple_call_arg_flags
On Thu, May 22, 2014 at 8:11 PM, Jan Hubicka hubi...@ucw.cz wrote: It won't be so easy, because struct function is really built at relatively convoluted places within frontend before cgraph node is assigned to them (I tried that few years back). Well, just call cgraph create node from struct Funktion allocation. That will make uninstantiated templates to land symbol table (and if you have aliases, also do the assembler name mangling) that is not that cool either :( Well, allocate_struct_function has a abstract_p argument for that. But yes, a simple patch like Index: gcc/function.c === --- gcc/function.c (revision 210845) +++ gcc/function.c (working copy) @@ -64,6 +64,7 @@ along with GCC; see the file COPYING3. #include params.h #include bb-reorder.h #include shrink-wrap.h +#include cgraph.h /* So we can assign to cfun in this file. */ #undef cfun @@ -4512,6 +4513,8 @@ allocate_struct_function (tree fndecl, b if (fndecl != NULL_TREE) { + if (!abstract_p) + cgraph_get_create_node (fndecl); DECL_STRUCT_FUNCTION (fndecl) = cfun; cfun-decl = fndecl; current_function_funcdef_no = get_next_funcdef_no (); ICEs during bootstrap with (at least) /space/rguenther/src/svn/trunk/libgcc/config/i386/cpuinfo.c:405:1: error: node differs from symtab decl hashtable } ^ __get_cpuid_max.constprop.0/42 (__get_cpuid_max.constprop) @0x7ff486232290 Type: function definition analyzed Visibility: artificial previous sharing asm name: 43 References: Referring: Function __get_cpuid_max.constprop/42 is inline copy in __get_cpuid_output/40 Availability: local First run: 0 Function flags: local only_called_at_startup Called by: __get_cpuid_output/40 (1.00 per call) (inlined) Calls: /space/rguenther/src/svn/trunk/libgcc/config/i386/cpuinfo.c:405:1: internal compiler error: verify_cgraph_node failed so I guess we would need to have a way to create a dummy cgraph node first and later populate it properly. But as we currently have a back-pointer from struct function to fndecl it would be nice to hook the cgraph node in there - that way we get away without any extra pointer (we could even save symtab decl pointer and create a cyclic fndecl - cgraph - function - fndecl chain ...). I'm fine with enlarging tree_function_decl for now - ideally we'd push stuff from it elsewhere (like target and optimization option tree nodes, or most of the visibility and symbol related stuff). Not sure why tree_type_decl inherits from tree_decl_non_common (and thus tree_decl_with_vis). Probably because of the non-common parts being (ab-)used by FEs. Otherwise I'd say simply put a symtab node pointer into tree_decl_with_vis ... (can we move section_name and comdat_group more easily than assembler_name?) Richard. Honza Richard. I think we may be on better track moving DECL_ASSEMBLER_NAME that is calculated later, but then we have problem with DECL_ASSEMBLER_NAME being set for assembler names and const decls, too that still go around symtab. Given that decl_assembler_name is a function, I suppose we could go with extra conditoinal in there. Getting struct function out of frontend busyness would be nice indeed, too, but probably should be independent of Martin's work here. Honza Thanks, Richard. Thanks, Martin + } } + + return ret; } /* Detects return flags for the call STMT. */
Re: [PATCH] __attribute__ ((malloc)) doc fix (PR other/56955)
On Thu, May 22, 2014 at 4:15 PM, Paul Eggert egg...@cs.ucla.edu wrote: Richard Biener wrote: Can you try to clarify the wording (I'm not a native speaker). Sure. I've filed a clarified version on PR 56955 and am attaching it here for convenience. Thanks, installed. Richard.
Re: libsanitizer merge from upstream r208536
On Fri, May 23, 2014 at 11:25:02AM +0200, Paolo Carlini wrote: How do I print dlpi_name? Could you add something like Report('%s'\n, info-dlpi_name); after if (!info-dlpi_name || info-dlpi_name[0] == 0) check in FindFirstDSOCallback? This should give us the name of library which causes problems. It's always linux-vdso.so.1, but wasn't that already known, given the ldd requested by Jakub?!? Bet it depends on the versions of glibc and kernel you have, where exactly in the search scope glibc inserts the vDSO and whether the kernel provides DT_SONAME for it. Jakub
Re: libsanitizer merge from upstream r208536
Could you add something like It's always linux-vdso.so.1, but wasn't that already known, given the ldd requested by Jakub?!? Well, for me dlpi_name for vdso was empty string hence I kept asking. I also thought that ldd and dl_iterate_phdr might have used slightly different code paths when quering information from dynamic linker. Could you check if the attached patch fixes the problem for you? Note that I only did limited testing (RUNTESTFLAGS=asan.exp). -Y diff --git a/libsanitizer/asan/asan_linux.cc b/libsanitizer/asan/asan_linux.cc index d893b23..11137d9 100644 --- a/libsanitizer/asan/asan_linux.cc +++ b/libsanitizer/asan/asan_linux.cc @@ -89,6 +89,10 @@ static int FindFirstDSOCallback(struct dl_phdr_info *info, size_t size, if (!info-dlpi_name || info-dlpi_name[0] == 0) return 0; + // Ignore vDSO + if (internal_strncmp(info-dlpi_name, linux-, sizeof(linux-)) == 0) +return 0; + *(const char **)data = info-dlpi_name; return 1; }
Re: [PATCH] Fix PR rtl-optimization/61278
On 23 May 2014 17:05, Richard Biener richard.guent...@gmail.com wrote: On Fri, May 23, 2014 at 9:23 AM, Zhenqiang Chen zhenqiang.c...@linaro.org wrote: Hi, The patch fixes PR rtl-optimization/61278. Root cause for issue is that df_live does not exist at -O1. Bootstrap and no make check regression on X86-64. OK for trunk? Why do you need to give up? It seems you can simply avoid marking the block as dirty (though df_get_live_in/out also hands you back DF_LR_IN/OUT if !df_live). So isn't the df_grow_bb_info the real fix? The df_get_live_in of the new basic block will be used to analyse later INSNs. If it is not set or incorrect, it will impact on later analysis. df_grow_bb_info is to make sure the live_in data structure is allocated for the new basic block (although I have not found any case fail without it). After bitmap_copy(...), we can use it for later INSNs. Note that df_get_live_in/out are functions tailored to IRA that knows that they handle both df_live and df_lr dependent on optimization level. Is shrink-wrapping supposed to work with both problems as well? Yes. But it seams not perfect to handle df_lr problem. When I fixed PR 57637 (https://gcc.gnu.org/ml/gcc-patches/2013-07/msg00897.html), we selected if DF_LIVE doesn't exist, i.e. at -O1, just give up searching NEXT_BLOCK. Thanks! -Zhenqiang ChangeLog: 2014-05-23 Zhenqiang Chen zhenqiang.c...@linaro.org PR rtl-optimization/61278 * shrink-wrap.c (move_insn_for_shrink_wrap): Check df_live. testsuite/ChangeLog: 2014-05-23 Zhenqiang Chen zhenqiang.c...@linaro.org * gcc.dg/lto/pr61278_0.c: New test. * gcc.dg/lto/pr61278_1.c: New test. diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c index f09cfe7..be17829 100644 --- a/gcc/shrink-wrap.c +++ b/gcc/shrink-wrap.c @@ -204,8 +204,15 @@ move_insn_for_shrink_wrap (basic_block bb, rtx insn, /* Create a new basic block on the edge. */ if (EDGE_COUNT (next_block-preds) == 2) { + /* If DF_LIVE doesn't exist, i.e. at -O1, just give up. */ + if (!df_live) + return false; + next_block = split_edge (live_edge); + /* We create a new basic block. Call df_grow_bb_info to make sure +all data structures are allocated. */ + df_grow_bb_info (df_live); bitmap_copy (df_get_live_in (next_block), df_get_live_out (bb)); df_set_bb_dirty (next_block); diff --git a/gcc/testsuite/gcc.dg/lto/pr61278_0.c b/gcc/testsuite/gcc.dg/lto/pr61278_0.c new file mode 100644 index 000..03a24ae --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr61278_0.c @@ -0,0 +1,30 @@ +/* { dg-lto-do link } */ +/* { dg-lto-options { { -flto -O0 } } } */ +/* { dg-extra-ld-options -flto -O1 } */ + +static unsigned int +fn1 (int p1, int p2) +{ + return 0; +} + +char a, b, c; + +char +foo (char *p) +{ + int i; + for (b = 1 ; b 0; b++) +{ + for (i = 0; i 2; i++) + ; + for (a = 1; a 0; a++) + { + char d[1] = { 0 }; + if (*p) + break; + c ^= fn1 (fn1 (fn1 (0, 0), 0), 0); + } +} + return 0; +} diff --git a/gcc/testsuite/gcc.dg/lto/pr61278_1.c b/gcc/testsuite/gcc.dg/lto/pr61278_1.c new file mode 100644 index 000..b02c8ac --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr61278_1.c @@ -0,0 +1,13 @@ +/* { dg-lto-do link } */ +/* { dg-lto-options { { -flto -O1 } } } */ + +extern char foo (char *); + +char d; + +int +main () +{ + foo (d); + return 0; +}
[patch] libstdc++/60793 Add *-*-dragonfly* to testsuite target selectors
A mechanical patch to run tests on DragonFlyBSD. Tested x86_64--unknown-linux-gnu and x86_64-unknown-dragonfly3.6, committed to trunk. patch.txt.bz2 Description: BZip2 compressed data
[patch] Adjust target selectors for some libstdc++ tests
This marks a test that is xfail for darwin as also xfail for dragonfly. I'm surprised it fails on darwin and dragonfly but not freebsd, but Gerald's testresults don't show it failing. Also fix a couple of tests which were missing target selectors on the { dg-do compile } directive, so the tests ran everywhere but failed on any platform that wasn't listed as a target in the { dg-options } directive. Tested x86_64-unknown-linux-gnu and x86_64-unknown-dragonfly3.6, committed to trunk. commit 85ca54c780636cc2c3f93de54e03178488e6d8f2 Author: Jonathan Wakely jwak...@redhat.com Date: Fri May 23 00:25:02 2014 +0100 * testsuite/23_containers/vector/capacity/resize/1.cc: Add xfail for dragonfly. * testsuite/30_threads/call_once/60497.cc: Add target selectors. * testsuite/30_threads/condition_variable/members/53841.cc: Likewise. diff --git a/libstdc++-v3/testsuite/23_containers/vector/capacity/resize/1.cc b/libstdc++-v3/testsuite/23_containers/vector/capacity/resize/1.cc index 1c30ff5..c4cd790 100644 --- a/libstdc++-v3/testsuite/23_containers/vector/capacity/resize/1.cc +++ b/libstdc++-v3/testsuite/23_containers/vector/capacity/resize/1.cc @@ -22,7 +22,7 @@ // This fails on some versions of Darwin 8 because malloc doesn't return // NULL even if an allocation fails (filed as Radar 3884894). -// { dg-do run { xfail *-*-darwin8.[0-4].* } } +// { dg-do run { xfail *-*-darwin8.[0-4].* *-*-dragonfly* } } #include vector #include stdexcept diff --git a/libstdc++-v3/testsuite/30_threads/call_once/60497.cc b/libstdc++-v3/testsuite/30_threads/call_once/60497.cc index b058204..a82b88f 100644 --- a/libstdc++-v3/testsuite/30_threads/call_once/60497.cc +++ b/libstdc++-v3/testsuite/30_threads/call_once/60497.cc @@ -1,4 +1,4 @@ -// { dg-do compile } +// { dg-do compile { target *-*-freebsd* *-*-dragonfly* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* *-*-cygwin *-*-darwin* powerpc-ibm-aix* } } // { dg-options -std=gnu++11 -pthread { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* powerpc-ibm-aix* } } // { dg-options -std=gnu++11 -pthreads { target *-*-solaris* } } // { dg-options -std=gnu++11 { target *-*-cygwin *-*-darwin* } } diff --git a/libstdc++-v3/testsuite/30_threads/condition_variable/members/53841.cc b/libstdc++-v3/testsuite/30_threads/condition_variable/members/53841.cc index e8b7008b..90d02d9 100644 --- a/libstdc++-v3/testsuite/30_threads/condition_variable/members/53841.cc +++ b/libstdc++-v3/testsuite/30_threads/condition_variable/members/53841.cc @@ -1,4 +1,4 @@ -// { dg-do compile } +// { dg-do compile { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* *-*-cygwin *-*-darwin* powerpc-ibm-aix* hppa*-hp-hpux11* } } // { dg-options -std=gnu++0x -pthread { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* powerpc-ibm-aix* hppa*-hp-hpux11* } } // { dg-options -std=gnu++0x -pthreads { target *-*-solaris* } } // { dg-options -std=gnu++0x { target *-*-cygwin *-*-darwin* } }
[PATCH/RFC, ARM] Improve static checking of tune_params
One of the things that worries me about all the static tuning tables we have in the compiler is that it is easy to get the order of elements wrong, especially when adding a lot of new fields to existing descriptions. This patch attempts to improve the static checking in this area by making use of enums to replace all those true/false fields; this means that an element that is out of order should now fail a compile-time check (verified by deliberately switching two elements around). A fringe benefit of this approach is that the table is now pretty-much self-commented. I'd like to gather opinions on this; it's not quite as simple as I'd hoped in that you have to use base::enum_val, rather than just enum_val in the tables. Either that or make the declarations global, which I think is slightly worse from a purist point of view. I'll wait a couple of days before committing this. * arm-protos.h (struct tune_params): Re-organise. Convert bool entries to enums. * arm.c (arm_slowmul_tune): Update accordingly. (arm_fastmul_tune): Likewise. (arm_strongarm_tune, arm_xscale_tune, arm_9e_tune): Likewise. (arm_v6t2_tune, arm_cortex_tune, arm_cortex_a8_tune): Likewise. (arm_cortex_a7_tune, arm_cortex_a15_tune): Likewise. (arm_cortex_a53_tune, arm_cortex_a57_tune): Likewise. (arm_cortex_a5_tune, arm_cortex_a9_tune): Likewise. (arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune): Likewise. (arm_fa726te_tune): Likewise. (thumb2_reorg): Update accordingly. * arm.h (LOGICAL_OP_NON_SHORTCIRCUIT_P): Update accordingly.diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index 74645ee..cae56ae 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -254,29 +254,30 @@ struct tune_params bool (*rtx_costs) (rtx, RTX_CODE, RTX_CODE, int *, bool); const struct cpu_cost_table *insn_extra_cost; bool (*sched_adjust_cost) (rtx, rtx, rtx, int *); + int (*branch_cost) (bool, bool); + /* Vectorizer costs. */ + const struct cpu_vec_costs* vec_costs; int constant_limit; /* Maximum number of instructions to conditionalise. */ int max_insns_skipped; int num_prefetch_slots; int l1_cache_size; int l1_cache_line_size; - bool prefer_constant_pool; - int (*branch_cost) (bool, bool); + + enum {PREFER_CONST_POOL_FALSE, PREFER_CONST_POOL_TRUE} + prefer_constant_pool: 1; /* Prefer STRD/LDRD instructions over PUSH/POP/LDM/STM. */ - bool prefer_ldrd_strd; + enum {PREFER_LDRD_FALSE, PREFER_LDRD_TRUE} prefer_ldrd_strd: 1; /* The preference for non short cirtcuit operation when optimizing for performance. The first element covers Thumb state and the second one is for ARM state. */ - bool logical_op_non_short_circuit[2]; - /* Vectorizer costs. */ - const struct cpu_vec_costs* vec_costs; - /* Prefer Neon for 64-bit bitops. */ - bool prefer_neon_for_64bits; + enum log_op_non_sc {LOG_OP_NON_SC_NEVER, LOG_OP_NON_SC_ARM, + LOG_OP_NON_SC_THUMB, LOG_OP_NON_SC_ALL}; + log_op_non_sc logical_op_non_short_circuit: 2; + enum {PREFER_NEON_64_FALSE, PREFER_NEON_64_TRUE} prefer_neon_for_64bits: 1; /* Prefer 32-bit encoding instead of flag-setting 16-bit encoding. */ - bool disparage_flag_setting_t16_encodings; - /* Prefer 32-bit encoding instead of 16-bit encoding where subset of flags - would be set. */ - bool disparage_partial_flag_setting_t16_encodings; + enum {DISPARAGE_FLAGS_NEITHER, DISPARAGE_FLAGS_PARTIAL, DISPARAGE_FLAGS_ALL} +disparage_flag_setting_t16_encodings: 2; }; extern const struct tune_params *current_tune; diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index ccad548..73e7c9c 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -1580,16 +1580,16 @@ const struct tune_params arm_slowmul_tune = arm_slowmul_rtx_costs, NULL, NULL,/* Sched adj cost. */ + arm_default_branch_cost, + arm_default_vec_cost,/* Vectorizer costs. */ 3, /* Constant limit. */ 5, /* Max cond insns. */ ARM_PREFETCH_NOT_BENEFICIAL, - true,/* Prefer constant pool. */ - arm_default_branch_cost, - false, /* Prefer LDRD/STRD. */ - {true, true},/* Prefer non short circuit. */ - arm_default_vec_cost,/* Vectorizer costs. */ - false,/* Prefer Neon for 64-bits bitops. */ - false, false /* Prefer 32-bit encodings. */ + tune_params::PREFER_CONST_POOL_TRUE, + tune_params::PREFER_LDRD_FALSE, + tune_params::LOG_OP_NON_SC_ALL, + tune_params::PREFER_NEON_64_FALSE, + tune_params::DISPARAGE_FLAGS_NEITHER }; const
Re: libsanitizer merge from upstream r208536
Hi, On 05/23/2014 12:26 PM, Yury Gribov wrote: Could you add something like It's always linux-vdso.so.1, but wasn't that already known, given the ldd requested by Jakub?!? Well, for me dlpi_name for vdso was empty string hence I kept asking. I also thought that ldd and dl_iterate_phdr might have used slightly different code paths when quering information from dynamic linker. Could you check if the attached patch fixes the problem for you? Note that I only did limited testing (RUNTESTFLAGS=asan.exp). Thanks. It appears to work great for me modulo a trivial off-by-one (you want sizeof(...) - 1) and the asan_test.C issue already discussed by Jakub. Paolo.
Re: [patch] Adjust target selectors for some libstdc++ tests
Jonathan Wakely jwak...@redhat.com writes: diff --git a/libstdc++-v3/testsuite/30_threads/call_once/60497.cc b/libstdc++-v3/testsuite/30_threads/call_once/60497.cc index b058204..a82b88f 100644 --- a/libstdc++-v3/testsuite/30_threads/call_once/60497.cc +++ b/libstdc++-v3/testsuite/30_threads/call_once/60497.cc @@ -1,4 +1,4 @@ -// { dg-do compile } +// { dg-do compile { target *-*-freebsd* *-*-dragonfly* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* *-*-cygwin *-*-darwin* powerpc-ibm-aix* } } Any reason to list dragonfly twice? Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[COMMITTED] Fix some bool vs. tree confusion.
From: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 gcc/c/ * c-parser.c (c_parser_omp_target): Return bool values. gcc/cp/ * parser.c (cp_parser_omp_target): Return bool values. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@210851 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/c/ChangeLog | 4 gcc/c/c-parser.c | 14 +- gcc/cp/ChangeLog | 4 gcc/cp/parser.c | 14 +- 4 files changed, 26 insertions(+), 10 deletions(-) diff --git gcc/c/ChangeLog gcc/c/ChangeLog index 9acc6f7..c21f68f 100644 --- gcc/c/ChangeLog +++ gcc/c/ChangeLog @@ -1,3 +1,7 @@ +2014-05-23 Thomas Schwinge tho...@codesourcery.com + + * c-parser.c (c_parser_omp_target): Return bool values. + 2014-05-22 Thomas Schwinge tho...@codesourcery.com * c-parser.c (c_parser_omp_clause_thread_limit): Rename diff --git gcc/c/c-parser.c gcc/c/c-parser.c index a7e33b0..88edf36 100644 --- gcc/c/c-parser.c +++ gcc/c/c-parser.c @@ -12720,15 +12720,19 @@ c_parser_omp_target (c_parser *parser, enum pragma_context context) c_parser_consume_token (parser); strcpy (p_name, #pragma omp target); if (!flag_openmp) /* flag_openmp_simd */ - return c_parser_omp_teams (loc, parser, p_name, - OMP_TARGET_CLAUSE_MASK, cclauses); + { + tree stmt = c_parser_omp_teams (loc, parser, p_name, + OMP_TARGET_CLAUSE_MASK, + cclauses); + return stmt != NULL_TREE; + } keep_next_level (); tree block = c_begin_compound_stmt (true); tree ret = c_parser_omp_teams (loc, parser, p_name, OMP_TARGET_CLAUSE_MASK, cclauses); block = c_end_compound_stmt (loc, block, true); - if (ret == NULL) - return ret; + if (ret == NULL_TREE) + return false; tree stmt = make_node (OMP_TARGET); TREE_TYPE (stmt) = void_type_node; OMP_TARGET_CLAUSES (stmt) = cclauses[C_OMP_CLAUSE_SPLIT_TARGET]; @@ -12739,7 +12743,7 @@ c_parser_omp_target (c_parser *parser, enum pragma_context context) else if (!flag_openmp) /* flag_openmp_simd */ { c_parser_skip_to_pragma_eol (parser); - return NULL_TREE; + return false; } else if (strcmp (p, data) == 0) { diff --git gcc/cp/ChangeLog gcc/cp/ChangeLog index a594e93..b9a22f9 100644 --- gcc/cp/ChangeLog +++ gcc/cp/ChangeLog @@ -1,3 +1,7 @@ +2014-05-23 Thomas Schwinge tho...@codesourcery.com + + * parser.c (cp_parser_omp_target): Return bool values. + 2014-05-22 Paolo Carlini paolo.carl...@oracle.com PR c++/61088 diff --git gcc/cp/parser.c gcc/cp/parser.c index 7f06106..c4440af 100644 --- gcc/cp/parser.c +++ gcc/cp/parser.c @@ -30337,8 +30337,12 @@ cp_parser_omp_target (cp_parser *parser, cp_token *pragma_tok, cp_lexer_consume_token (parser-lexer); strcpy (p_name, #pragma omp target); if (!flag_openmp) /* flag_openmp_simd */ - return cp_parser_omp_teams (parser, pragma_tok, p_name, - OMP_TARGET_CLAUSE_MASK, cclauses); + { + tree stmt = cp_parser_omp_teams (parser, pragma_tok, p_name, + OMP_TARGET_CLAUSE_MASK, + cclauses); + return stmt != NULL_TREE; + } keep_next_level (true); tree sb = begin_omp_structured_block (); unsigned save = cp_parser_begin_omp_structured_block (parser); @@ -30346,8 +30350,8 @@ cp_parser_omp_target (cp_parser *parser, cp_token *pragma_tok, OMP_TARGET_CLAUSE_MASK, cclauses); cp_parser_end_omp_structured_block (parser, save); tree body = finish_omp_structured_block (sb); - if (ret == NULL) - return ret; + if (ret == NULL_TREE) + return false; tree stmt = make_node (OMP_TARGET); TREE_TYPE (stmt) = void_type_node; OMP_TARGET_CLAUSES (stmt) = cclauses[C_OMP_CLAUSE_SPLIT_TARGET]; @@ -30358,7 +30362,7 @@ cp_parser_omp_target (cp_parser *parser, cp_token *pragma_tok, else if (!flag_openmp) /* flag_openmp_simd */ { cp_parser_require_pragma_eol (parser, pragma_tok); - return NULL_TREE; + return false; } else if (strcmp (p, data) == 0) { -- 1.9.1
[COMMITTED] Be more explicit.
From: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 gcc/ * gimplify.c (omp_notice_variable) case OMP_CLAUSE_DEFAULT_NONE: Explicitly enumerate the expected region types. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@210852 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog | 5 + gcc/gimplify.c | 15 +-- 2 files changed, 14 insertions(+), 6 deletions(-) diff --git gcc/ChangeLog gcc/ChangeLog index aedf2d0..d351c0b 100644 --- gcc/ChangeLog +++ gcc/ChangeLog @@ -1,3 +1,8 @@ +2014-05-23 Thomas Schwinge tho...@codesourcery.com + + * gimplify.c (omp_notice_variable) case OMP_CLAUSE_DEFAULT_NONE: + Explicitly enumerate the expected region types. + 2014-05-23 Paul Eggert egg...@cs.ucla.edu PR other/56955 diff --git gcc/gimplify.c gcc/gimplify.c index 3241633..39b2750 100644 --- gcc/gimplify.c +++ gcc/gimplify.c @@ -5683,7 +5683,14 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code) switch (default_kind) { case OMP_CLAUSE_DEFAULT_NONE: - if ((ctx-region_type ORT_TASK) != 0) + if (ctx-region_type == ORT_PARALLEL + || ctx-region_type == ORT_COMBINED_PARALLEL) + { + error (%qE not specified in enclosing parallel, +DECL_NAME (lang_hooks.decls.omp_report_decl (decl))); + error_at (ctx-location, enclosing parallel); + } + else if ((ctx-region_type ORT_TASK) != 0) { error (%qE not specified in enclosing task, DECL_NAME (lang_hooks.decls.omp_report_decl (decl))); @@ -5696,11 +5703,7 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code) error_at (ctx-location, enclosing teams construct); } else - { - error (%qE not specified in enclosing parallel, -DECL_NAME (lang_hooks.decls.omp_report_decl (decl))); - error_at (ctx-location, enclosing parallel); - } + gcc_unreachable (); /* FALLTHRU */ case OMP_CLAUSE_DEFAULT_SHARED: flags |= GOVD_SHARED; -- 1.9.1
[COMMITTED] Remove duplicated variable initialization.
From: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 gcc/c/ * c-typeck.c (c_finish_omp_clauses): Remove duplicated variable initialization. gcc/cp/ * semantics.c (finish_omp_clauses): Remove duplicated variable initialization. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@210853 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/c/ChangeLog| 3 +++ gcc/c/c-typeck.c | 2 +- gcc/cp/ChangeLog | 3 +++ gcc/cp/semantics.c | 2 +- 4 files changed, 8 insertions(+), 2 deletions(-) diff --git gcc/c/ChangeLog gcc/c/ChangeLog index c21f68f..5bee1ca 100644 --- gcc/c/ChangeLog +++ gcc/c/ChangeLog @@ -1,5 +1,8 @@ 2014-05-23 Thomas Schwinge tho...@codesourcery.com + * c-typeck.c (c_finish_omp_clauses): Remove duplicated variable + initialization. + * c-parser.c (c_parser_omp_target): Return bool values. 2014-05-22 Thomas Schwinge tho...@codesourcery.com diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c index 6f4bd4a..74a5ebd 100644 --- gcc/c/c-typeck.c +++ gcc/c/c-typeck.c @@ -11762,7 +11762,7 @@ c_finish_omp_clauses (tree clauses) { bitmap_head generic_head, firstprivate_head, lastprivate_head; bitmap_head aligned_head; - tree c, t, *pc = clauses; + tree c, t, *pc; bool branch_seen = false; bool copyprivate_seen = false; tree *nowait_clause = NULL; diff --git gcc/cp/ChangeLog gcc/cp/ChangeLog index b9a22f9..90ded5b 100644 --- gcc/cp/ChangeLog +++ gcc/cp/ChangeLog @@ -1,5 +1,8 @@ 2014-05-23 Thomas Schwinge tho...@codesourcery.com + * semantics.c (finish_omp_clauses): Remove duplicated variable + initialization. + * parser.c (cp_parser_omp_target): Return bool values. 2014-05-22 Paolo Carlini paolo.carl...@oracle.com diff --git gcc/cp/semantics.c gcc/cp/semantics.c index 7e144a6..edab330 100644 --- gcc/cp/semantics.c +++ gcc/cp/semantics.c @@ -5222,7 +5222,7 @@ finish_omp_clauses (tree clauses) { bitmap_head generic_head, firstprivate_head, lastprivate_head; bitmap_head aligned_head; - tree c, t, *pc = clauses; + tree c, t, *pc; bool branch_seen = false; bool copyprivate_seen = false; -- 1.9.1
[COMMITTED 2/2] GF_OMP_FOR_SIMD: Flag for SIMD variants of OMP_FOR kinds.
From: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 gcc/ * gimple.h (enum gf_mask): Add and use GF_OMP_FOR_SIMD. * omp-low.c: Update accordingly. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@210855 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog | 3 +++ gcc/gimple.h | 6 -- gcc/omp-low.c | 18 +- 3 files changed, 16 insertions(+), 11 deletions(-) diff --git gcc/ChangeLog gcc/ChangeLog index fa2f3c3..c1b2416 100644 --- gcc/ChangeLog +++ gcc/ChangeLog @@ -1,5 +1,8 @@ 2014-05-23 Thomas Schwinge tho...@codesourcery.com + * gimple.h (enum gf_mask): Add and use GF_OMP_FOR_SIMD. + * omp-low.c: Update accordingly. + * gimple.h (enum gf_mask): Rewrite 0 shift expressions used for GF_OMP_FOR_KIND_MASK, GF_OMP_FOR_KIND_FOR, GF_OMP_FOR_KIND_DISTRIBUTE, GF_OMP_FOR_KIND_SIMD, diff --git gcc/gimple.h gcc/gimple.h index b1970e5..ceefbc0 100644 --- gcc/gimple.h +++ gcc/gimple.h @@ -94,8 +94,10 @@ enum gf_mask { GF_OMP_FOR_KIND_MASK = (1 2) - 1, GF_OMP_FOR_KIND_FOR= 0, GF_OMP_FOR_KIND_DISTRIBUTE = 1, -GF_OMP_FOR_KIND_SIMD = 2, -GF_OMP_FOR_KIND_CILKSIMD = 3, +/* Flag for SIMD variants of OMP_FOR kinds. */ +GF_OMP_FOR_SIMD= 1 1, +GF_OMP_FOR_KIND_SIMD = GF_OMP_FOR_SIMD | 0, +GF_OMP_FOR_KIND_CILKSIMD = GF_OMP_FOR_SIMD | 1, GF_OMP_FOR_COMBINED= 1 2, GF_OMP_FOR_COMBINED_INTO = 1 3, GF_OMP_TARGET_KIND_MASK= (1 2) - 1, diff --git gcc/omp-low.c gcc/omp-low.c index 95b0e52..54e837f 100644 --- gcc/omp-low.c +++ gcc/omp-low.c @@ -298,7 +298,7 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd, int i; struct omp_for_data_loop dummy_loop; location_t loc = gimple_location (for_stmt); - bool simd = gimple_omp_for_kind (for_stmt) GF_OMP_FOR_KIND_SIMD; + bool simd = gimple_omp_for_kind (for_stmt) GF_OMP_FOR_SIMD; bool distribute = gimple_omp_for_kind (for_stmt) == GF_OMP_FOR_KIND_DISTRIBUTE; @@ -1020,7 +1020,7 @@ build_outer_var_ref (tree var, omp_context *ctx) x = build_receiver_ref (var, by_ref, ctx); } else if (gimple_code (ctx-stmt) == GIMPLE_OMP_FOR - gimple_omp_for_kind (ctx-stmt) GF_OMP_FOR_KIND_SIMD) + gimple_omp_for_kind (ctx-stmt) GF_OMP_FOR_SIMD) { /* #pragma omp simd isn't a worksharing construct, and can reference even private vars in its linear etc. clauses. */ @@ -2249,7 +2249,7 @@ check_omp_nesting_restrictions (gimple stmt, omp_context *ctx) if (ctx != NULL) { if (gimple_code (ctx-stmt) == GIMPLE_OMP_FOR - gimple_omp_for_kind (ctx-stmt) GF_OMP_FOR_KIND_SIMD) + gimple_omp_for_kind (ctx-stmt) GF_OMP_FOR_SIMD) { error_at (gimple_location (stmt), OpenMP constructs may not be nested inside simd region); @@ -2272,7 +2272,7 @@ check_omp_nesting_restrictions (gimple stmt, omp_context *ctx) switch (gimple_code (stmt)) { case GIMPLE_OMP_FOR: - if (gimple_omp_for_kind (stmt) GF_OMP_FOR_KIND_SIMD) + if (gimple_omp_for_kind (stmt) GF_OMP_FOR_SIMD) return true; if (gimple_omp_for_kind (stmt) == GF_OMP_FOR_KIND_DISTRIBUTE) { @@ -2598,7 +2598,7 @@ scan_omp_1_stmt (gimple_stmt_iterator *gsi, bool *handled_ops_p, if (setjmp_or_longjmp_p (fndecl) ctx gimple_code (ctx-stmt) == GIMPLE_OMP_FOR - gimple_omp_for_kind (ctx-stmt) GF_OMP_FOR_KIND_SIMD) + gimple_omp_for_kind (ctx-stmt) GF_OMP_FOR_SIMD) { remove = true; error_at (gimple_location (stmt), @@ -3034,7 +3034,7 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist, bool reduction_omp_orig_ref = false; int pass; bool is_simd = (gimple_code (ctx-stmt) == GIMPLE_OMP_FOR - gimple_omp_for_kind (ctx-stmt) GF_OMP_FOR_KIND_SIMD); + gimple_omp_for_kind (ctx-stmt) GF_OMP_FOR_SIMD); int max_vf = 0; tree lane = NULL_TREE, idx = NULL_TREE; tree ivar = NULL_TREE, lvar = NULL_TREE; @@ -3774,7 +3774,7 @@ lower_lastprivate_clauses (tree clauses, tree predicate, gimple_seq *stmt_list, } if (gimple_code (ctx-stmt) == GIMPLE_OMP_FOR - gimple_omp_for_kind (ctx-stmt) GF_OMP_FOR_KIND_SIMD) + gimple_omp_for_kind (ctx-stmt) GF_OMP_FOR_SIMD) { simduid = find_omp_clause (orig_clauses, OMP_CLAUSE__SIMDUID_); if (simduid) @@ -3877,7 +3877,7 @@ lower_reduction_clauses (tree clauses, gimple_seq *stmt_seqp, omp_context *ctx) /* SIMD reductions are handled in lower_rec_input_clauses. */ if (gimple_code (ctx-stmt) == GIMPLE_OMP_FOR - gimple_omp_for_kind (ctx-stmt) GF_OMP_FOR_KIND_SIMD) + gimple_omp_for_kind (ctx-stmt) GF_OMP_FOR_SIMD) return; /* First see if there is exactly one reduction
[COMMITTED 1/2] Just enumerate all GF_OMP_FOR_KIND_* and GF_OMP_TARGET_KIND_*.
From: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 gcc/ * gimple.h (enum gf_mask): Rewrite 0 shift expressions used for GF_OMP_FOR_KIND_MASK, GF_OMP_FOR_KIND_FOR, GF_OMP_FOR_KIND_DISTRIBUTE, GF_OMP_FOR_KIND_SIMD, GF_OMP_FOR_KIND_CILKSIMD, GF_OMP_TARGET_KIND_MASK, GF_OMP_TARGET_KIND_REGION, GF_OMP_TARGET_KIND_DATA, GF_OMP_TARGET_KIND_UPDATE. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@210854 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog | 7 +++ gcc/gimple.h | 18 +- 2 files changed, 16 insertions(+), 9 deletions(-) diff --git gcc/ChangeLog gcc/ChangeLog index d351c0b..fa2f3c3 100644 --- gcc/ChangeLog +++ gcc/ChangeLog @@ -1,5 +1,12 @@ 2014-05-23 Thomas Schwinge tho...@codesourcery.com + * gimple.h (enum gf_mask): Rewrite 0 shift expressions used + for GF_OMP_FOR_KIND_MASK, GF_OMP_FOR_KIND_FOR, + GF_OMP_FOR_KIND_DISTRIBUTE, GF_OMP_FOR_KIND_SIMD, + GF_OMP_FOR_KIND_CILKSIMD, GF_OMP_TARGET_KIND_MASK, + GF_OMP_TARGET_KIND_REGION, GF_OMP_TARGET_KIND_DATA, + GF_OMP_TARGET_KIND_UPDATE. + * gimplify.c (omp_notice_variable) case OMP_CLAUSE_DEFAULT_NONE: Explicitly enumerate the expected region types. diff --git gcc/gimple.h gcc/gimple.h index 9df45de..b1970e5 100644 --- gcc/gimple.h +++ gcc/gimple.h @@ -91,17 +91,17 @@ enum gf_mask { GF_CALL_ALLOCA_FOR_VAR = 1 5, GF_CALL_INTERNAL = 1 6, GF_OMP_PARALLEL_COMBINED = 1 0, -GF_OMP_FOR_KIND_MASK = 3 0, -GF_OMP_FOR_KIND_FOR= 0 0, -GF_OMP_FOR_KIND_DISTRIBUTE = 1 0, -GF_OMP_FOR_KIND_SIMD = 2 0, -GF_OMP_FOR_KIND_CILKSIMD = 3 0, +GF_OMP_FOR_KIND_MASK = (1 2) - 1, +GF_OMP_FOR_KIND_FOR= 0, +GF_OMP_FOR_KIND_DISTRIBUTE = 1, +GF_OMP_FOR_KIND_SIMD = 2, +GF_OMP_FOR_KIND_CILKSIMD = 3, GF_OMP_FOR_COMBINED= 1 2, GF_OMP_FOR_COMBINED_INTO = 1 3, -GF_OMP_TARGET_KIND_MASK= 3 0, -GF_OMP_TARGET_KIND_REGION = 0 0, -GF_OMP_TARGET_KIND_DATA= 1 0, -GF_OMP_TARGET_KIND_UPDATE = 2 0, +GF_OMP_TARGET_KIND_MASK= (1 2) - 1, +GF_OMP_TARGET_KIND_REGION = 0, +GF_OMP_TARGET_KIND_DATA= 1, +GF_OMP_TARGET_KIND_UPDATE = 2, /* True on an GIMPLE_OMP_RETURN statement if the return does not require a thread synchronization via some sort of barrier. The exact barrier -- 1.9.1
Re: [COMMITTED] Be more explicit.
On Fri, May 23, 2014 at 01:08:09PM +0200, Thomas Schwinge wrote: @@ -5683,7 +5683,14 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code) switch (default_kind) { case OMP_CLAUSE_DEFAULT_NONE: - if ((ctx-region_type ORT_TASK) != 0) + if (ctx-region_type == ORT_PARALLEL + || ctx-region_type == ORT_COMBINED_PARALLEL) This should have been (ctx-region_type ORT_PARALLEL) != 0 instead. Jakub
Re: wide-int, rtl-2
This looks OK to me. I obviously didn't look carefully enough, there are a few thinkos in the output_constructor_bitfield hunk: - if (shift HOST_BITS_PER_WIDE_INT - shift + this_time HOST_BITS_PER_WIDE_INT) - { - this_time = shift + this_time - HOST_BITS_PER_WIDE_INT; - shift = HOST_BITS_PER_WIDE_INT; - } + if ((shift / HOST_BITS_PER_WIDE_INT) + != ((shift + this_time) / HOST_BITS_PER_WIDE_INT)) + this_time = (shift + this_time) (HOST_BITS_PER_WIDE_INT - 1); The tests aren't equivalent since the original one is false e.g. if shift == 1 and shift + this_time == HOST_BITS_PER_WIDE_INT, but not the new one. The new computation for this_time is not fully correct since it could yield zero but may not. And the shift = HOST_BITS_PER_WIDE_INT; line shouldn't have been dropped but adjusted. - if (shift HOST_BITS_PER_WIDE_INT - shift + this_time HOST_BITS_PER_WIDE_INT) + if ((shift / HOST_BITS_PER_WIDE_INT) + != ((shift + this_time) / HOST_BITS_PER_WIDE_INT)) this_time = (HOST_BITS_PER_WIDE_INT - shift); Likewise for the test. And the assignment should have been adjusted since this_time may not be zero or negative. This fixes: WARNING: program timed out. FAIL: gnat.dg/nested_agg_bitfield_constructor.adb (test for excess errors) WARNING: program timed out. FAIL: gnat.dg/outer_agg_bitfield_constructor.adb (test for excess errors) WARNING: gnat.dg/outer_agg_bitfield_constructor.adb compilation failed to produce executable on big-endian platforms. Tested on x86_64-suse-linux and SPARC/Solaris, applied on the mainline. 2014-05-23 Eric Botcazou ebotca...@adacore.com * varasm.c (output_constructor_bitfield): Fix thinkos in latest change. -- Eric BotcazouIndex: varasm.c === --- varasm.c (revision 210676) +++ varasm.c (working copy) @@ -5082,24 +5082,27 @@ output_constructor_bitfield (oc_local_st this_time = MIN (end_offset - next_offset, BITS_PER_UNIT - next_bit); if (BYTES_BIG_ENDIAN) { - /* On big-endian machine, take the most significant bits - first (of the bits that are significant) - and put them into bytes from the most significant end. */ + /* On big-endian machine, take the most significant bits (of the + bits that are significant) first and put them into bytes from + the most significant end. */ shift = end_offset - next_offset - this_time; /* Don't try to take a bunch of bits that cross - the word boundary in the INTEGER_CST. We can - only select bits from the LOW or HIGH part - not from both. */ + the word boundary in the INTEGER_CST. We can + only select bits from one element. */ if ((shift / HOST_BITS_PER_WIDE_INT) - != ((shift + this_time) / HOST_BITS_PER_WIDE_INT)) - this_time = (shift + this_time) (HOST_BITS_PER_WIDE_INT - 1); + != ((shift + this_time - 1) / HOST_BITS_PER_WIDE_INT)) + { + const int end = shift + this_time - 1; + shift = end -HOST_BITS_PER_WIDE_INT; + this_time = end - shift + 1; + } /* Now get the bits from the appropriate constant word. */ value = TREE_INT_CST_ELT (local-val, shift / HOST_BITS_PER_WIDE_INT); shift = shift (HOST_BITS_PER_WIDE_INT - 1); - /* Get the result. This works only when: + /* Get the result. This works only when: 1 = this_time = HOST_BITS_PER_WIDE_INT. */ local-byte |= (((value shift) (((HOST_WIDE_INT) 2 (this_time - 1)) - 1)) @@ -5107,25 +5110,24 @@ output_constructor_bitfield (oc_local_st } else { - /* On little-endian machines, - take first the least significant bits of the value - and pack them starting at the least significant + /* On little-endian machines, take the least significant bits of + the value first and pack them starting at the least significant bits of the bytes. */ shift = next_offset - byte_relative_ebitpos; /* Don't try to take a bunch of bits that cross - the word boundary in the INTEGER_CST. We can - only select bits from the LOW or HIGH part - not from both. */ + the word boundary in the INTEGER_CST. We can + only select bits from one element. */ if ((shift / HOST_BITS_PER_WIDE_INT) - != ((shift + this_time) / HOST_BITS_PER_WIDE_INT)) - this_time = (HOST_BITS_PER_WIDE_INT - shift); + != ((shift + this_time - 1) / HOST_BITS_PER_WIDE_INT)) + this_time + = HOST_BITS_PER_WIDE_INT - (shift (HOST_BITS_PER_WIDE_INT - 1)); /* Now get the bits from the appropriate constant word. */ value = TREE_INT_CST_ELT (local-val, shift / HOST_BITS_PER_WIDE_INT); shift = shift (HOST_BITS_PER_WIDE_INT - 1); - /* Get the result. This works only when: + /* Get the result. This works only when: 1 = this_time =
[COMMITTED] Make it easier to diff expand_omp_for_* functions.
From: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 gcc/ * omp-low.c (expand_omp_for_static_chunk): Rename variable si to gsi, and variables v_* to v*. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@210858 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog | 5 +++ gcc/omp-low.c | 118 +- 2 files changed, 64 insertions(+), 59 deletions(-) diff --git gcc/ChangeLog gcc/ChangeLog index 01f3ca1..3d74b6f 100644 --- gcc/ChangeLog +++ gcc/ChangeLog @@ -1,3 +1,8 @@ +2014-05-23 Thomas Schwinge tho...@codesourcery.com + + * omp-low.c (expand_omp_for_static_chunk): Rename variable si to + gsi, and variables v_* to v*. + 2014-05-23 Eric Botcazou ebotca...@adacore.com * varasm.c (output_constructor_bitfield): Fix thinkos in latest change. diff --git gcc/omp-low.c gcc/omp-low.c index 54e837f..129d513 100644 --- gcc/omp-low.c +++ gcc/omp-low.c @@ -6166,10 +6166,10 @@ expand_omp_for_static_chunk (struct omp_region *region, { tree n, s0, e0, e, t; tree trip_var, trip_init, trip_main, trip_back, nthreads, threadid; - tree type, itype, v_main, v_back, v_extra; + tree type, itype, vmain, vback, vextra; basic_block entry_bb, exit_bb, body_bb, seq_start_bb, iter_part_bb; basic_block trip_update_bb = NULL, cont_bb, collapse_bb = NULL, fin_bb; - gimple_stmt_iterator si; + gimple_stmt_iterator gsi; gimple stmt; edge se; enum built_in_function get_num_threads = BUILT_IN_OMP_GET_NUM_THREADS; @@ -6202,8 +6202,8 @@ expand_omp_for_static_chunk (struct omp_region *region, exit_bb = region-exit; /* Trip and adjustment setup goes in ENTRY_BB. */ - si = gsi_last_bb (entry_bb); - gcc_assert (gimple_code (gsi_stmt (si)) == GIMPLE_OMP_FOR); + gsi = gsi_last_bb (entry_bb); + gcc_assert (gimple_code (gsi_stmt (gsi)) == GIMPLE_OMP_FOR); if (gimple_omp_for_kind (fd-for_stmt) == GF_OMP_FOR_KIND_DISTRIBUTE) { @@ -6217,7 +6217,7 @@ expand_omp_for_static_chunk (struct omp_region *region, basic_block l2_dom_bb = NULL; counts = XALLOCAVEC (tree, fd-collapse); - expand_omp_for_init_counts (fd, si, entry_bb, counts, + expand_omp_for_init_counts (fd, gsi, entry_bb, counts, fin_bb, first_zero_iter, l2_dom_bb); t = NULL_TREE; @@ -6233,21 +6233,21 @@ expand_omp_for_static_chunk (struct omp_region *region, (t == NULL_TREE || !integer_onep (t))) { n1 = fold_convert (type, unshare_expr (fd-loop.n1)); - n1 = force_gimple_operand_gsi (si, n1, true, NULL_TREE, + n1 = force_gimple_operand_gsi (gsi, n1, true, NULL_TREE, true, GSI_SAME_STMT); n2 = fold_convert (type, unshare_expr (fd-loop.n2)); - n2 = force_gimple_operand_gsi (si, n2, true, NULL_TREE, + n2 = force_gimple_operand_gsi (gsi, n2, true, NULL_TREE, true, GSI_SAME_STMT); stmt = gimple_build_cond (fd-loop.cond_code, n1, n2, NULL_TREE, NULL_TREE); - gsi_insert_before (si, stmt, GSI_SAME_STMT); + gsi_insert_before (gsi, stmt, GSI_SAME_STMT); if (walk_tree (gimple_cond_lhs_ptr (stmt), expand_omp_regimplify_p, NULL, NULL) || walk_tree (gimple_cond_rhs_ptr (stmt), expand_omp_regimplify_p, NULL, NULL)) { - si = gsi_for_stmt (stmt); - gimple_regimplify_operands (stmt, si); + gsi = gsi_for_stmt (stmt); + gimple_regimplify_operands (stmt, gsi); } se = split_block (entry_bb, stmt); se-flags = EDGE_TRUE_VALUE; @@ -6258,25 +6258,25 @@ expand_omp_for_static_chunk (struct omp_region *region, if (gimple_in_ssa_p (cfun)) { int dest_idx = find_edge (entry_bb, fin_bb)-dest_idx; - for (si = gsi_start_phis (fin_bb); - !gsi_end_p (si); gsi_next (si)) + for (gsi = gsi_start_phis (fin_bb); + !gsi_end_p (gsi); gsi_next (gsi)) { - gimple phi = gsi_stmt (si); + gimple phi = gsi_stmt (gsi); add_phi_arg (phi, gimple_phi_arg_def (phi, dest_idx), se, UNKNOWN_LOCATION); } } - si = gsi_last_bb (entry_bb); + gsi = gsi_last_bb (entry_bb); } t = build_call_expr (builtin_decl_explicit (get_num_threads), 0); t = fold_convert (itype, t); - nthreads = force_gimple_operand_gsi (si, t, true, NULL_TREE, + nthreads = force_gimple_operand_gsi (gsi, t, true, NULL_TREE, true, GSI_SAME_STMT); t = build_call_expr (builtin_decl_explicit (get_thread_num), 0); t = fold_convert (itype, t); - threadid = force_gimple_operand_gsi (si, t, true, NULL_TREE, + threadid = force_gimple_operand_gsi (gsi, t, true, NULL_TREE, true,
[PATCH][2/n] Always 64bit-HWI cleanups
The following changes the configury to insist on [u]int64_t being available and removes the very old __int64 case. Autoconf doesn't check for it, support came in via a big merge in Dec 2002, r60174, and it was never used on the libcpp side until I fixed that with the last patch of this series, so we couldn't have relied on it at least since libcpp was introduced. Both libcpp and vmsdbg.h now use [u]int64_t, switching HOST_WIDE_INT to literally use int64_t has to be done with the grand renaming of all users due to us using 'unsigned HOST_WIDE_INT'. Btw, I couldn't find any standard way of writing [u]int64_t literals (substitution for HOST_WIDE_INT_C) nor one for printf formats (substitutions for HOST_WIDE_INT_PRINT and friends). I'll consider doing s/HOST_WIDE_INT/[U]INT64/ there if nobody comes up with a better plan. Unfortunately any followup will be the whole renaming game at once due to the 'unsigned' issue. I'll make sure to propose a hwint.h-only patch with a renaming guide for review and expect the actual renaming to take place using a script. Bootstrap and regtest running on x86_64-unknown-linux-gnu, ok? After this patch you may use [u]int64_t freely in host sources (lto-plugin already does that - heh). Thanks, Richard. 2014-05-23 Richard Biener rguent...@suse.de libcpp/ * configure.ac: Remove long long and __int64 type checks, add check for uint64_t and fail if that wasn't found. * include/cpplib.h (cpp_num_part): Use uint64_t. * config.in: Regenerate. * configure: Likewise. gcc/ * configure.ac: Drop __int64 type check. Insist that we found uint64_t and int64_t. * hwint.h (HOST_BITS_PER___INT64): Remove. (HOST_BITS_PER_WIDE_INT): Define to 64 and remove __int64 case. (HOST_WIDE_INT_PRINT_*): Remove 32bit case. (HOST_WIDEST_INT*): Define to HOST_WIDE_INT*. (HOST_WIDEST_FAST_INT): Remove __int64 case. * vmsdbg.h (struct _DST_SRC_COMMAND): Use int64_t for dst_q_src_df_rms_cdt. * configure: Regenerate. * config.in: Likewise. Index: libcpp/config.in === *** libcpp/config.in(revision 210847) --- libcpp/config.in(working copy) *** *** 180,188 /* Define to 1 if you have the locale.h header file. */ #undef HAVE_LOCALE_H - /* Define to 1 if the system has the type `long long'. */ - #undef HAVE_LONG_LONG - /* Define to 1 if you have the memory.h header file. */ #undef HAVE_MEMORY_H --- 180,185 *** *** 231,239 /* Define to 1 if you have the unistd.h header file. */ #undef HAVE_UNISTD_H - /* Define to 1 if the system has the type `__int64'. */ - #undef HAVE___INT64 - /* Define as const if the declaration of iconv() needs const. */ #undef ICONV_CONST --- 228,233 *** *** 264,275 /* The size of `long', as computed by sizeof. */ #undef SIZEOF_LONG - /* The size of `long long', as computed by sizeof. */ - #undef SIZEOF_LONG_LONG - - /* The size of `__int64', as computed by sizeof. */ - #undef SIZEOF___INT64 - /* If using the C implementation of alloca, define if you know the direction of stack growth for your system; otherwise it will be automatically deduced at runtime. --- 258,263 *** *** 340,345 --- 328,338 /* Define to 1 if you need to in order for `stat' and other things to work. */ #undef _POSIX_SOURCE + /* Define for Solaris 2.5.1 so the uint64_t typedef from sys/synch.h, +pthread.h, or semaphore.h is not used. If the typedef were allowed, the +#define below would cause a syntax error. */ + #undef _UINT64_T + /* Define to empty if `const' does not conform to ANSI C. */ #undef const *** *** 361,366 --- 354,363 /* Define to `int' if sys/types.h does not define. */ #undef ssize_t + /* Define to the type of an unsigned integer type of width exactly 64 bits if +such a type exists and the standard includes do not define it. */ + #undef uint64_t + /* Define to the type of an unsigned integer type wide enough to hold a pointer, if such a type exists, and if the system does not define it. */ #undef uintptr_t Index: libcpp/configure === *** libcpp/configure(revision 210847) --- libcpp/configure(working copy) *** $as_echo $ac_res 6; } *** 1822,1827 --- 1822,1879 } # ac_fn_c_check_type + # ac_fn_c_find_uintX_t LINENO BITS VAR + # + # Finds an unsigned integer type with width BITS, setting cache variable VAR + # accordingly. + ac_fn_c_find_uintX_t () + { + as_lineno=${as_lineno-$1} as_lineno_stack=as_lineno_stack=$as_lineno_stack + { $as_echo $as_me:${as_lineno-$LINENO}: checking for uint$2_t 5 + $as_echo_n checking for uint$2_t... 6; } + if
Re: [patch] Adjust target selectors for some libstdc++ tests
On 23/05/14 12:49 +0200, Rainer Orth wrote: Any reason to list dragonfly twice? Oops, no - that's due to me reordering commits with rebase so the sed command added it to some files I'd already added it to. Fixed with the attached patch, thanks. commit 12ad71abb1122314f38e98d382c86eacd3cbb022 Author: Jonathan Wakely jwak...@redhat.com Date: Fri May 23 12:30:37 2014 +0100 * testsuite/30_threads/async/54297.cc: Remove duplicate dragonfly selector. * testsuite/30_threads/call_once/60497.cc: Likewise. * testsuite/30_threads/condition_variable/54185.cc: Likewise. * testsuite/30_threads/condition_variable_any/53830.cc: Likewise. * testsuite/30_threads/packaged_task/60564.cc: Likewise. * testsuite/30_threads/packaged_task/cons/56492.cc: Likewise. * testsuite/30_threads/promise/60966.cc: Likewise. * testsuite/30_threads/shared_lock/cons/1.cc: Likewise. * testsuite/30_threads/shared_lock/cons/2.cc: Likewise. * testsuite/30_threads/shared_lock/cons/3.cc: Likewise. * testsuite/30_threads/shared_lock/cons/4.cc: Likewise. * testsuite/30_threads/shared_lock/cons/5.cc: Likewise. * testsuite/30_threads/shared_lock/cons/6.cc: Likewise. * testsuite/30_threads/shared_lock/locking/1.cc: Likewise. * testsuite/30_threads/shared_lock/locking/2.cc: Likewise. * testsuite/30_threads/shared_lock/locking/3.cc: Likewise. * testsuite/30_threads/shared_lock/locking/4.cc: Likewise. * testsuite/30_threads/shared_lock/modifiers/1.cc: Likewise. * testsuite/30_threads/shared_lock/modifiers/2.cc: Likewise. * testsuite/30_threads/shared_timed_mutex/cons/1.cc: Likewise. * testsuite/30_threads/shared_timed_mutex/try_lock/1.cc: * testsuite/30_threads/shared_timed_mutex/try_lock/2.cc: Likewise. * testsuite/30_threads/thread/native_handle/cancel.cc: Likewise. * testsuite/30_threads/timed_mutex/try_lock_until/57641.cc: Likewise. diff --git a/libstdc++-v3/testsuite/30_threads/async/54297.cc b/libstdc++-v3/testsuite/30_threads/async/54297.cc index b2a8eba..281916d 100644 --- a/libstdc++-v3/testsuite/30_threads/async/54297.cc +++ b/libstdc++-v3/testsuite/30_threads/async/54297.cc @@ -1,5 +1,5 @@ -// { dg-do run { target *-*-freebsd* *-*-dragonfly* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* *-*-cygwin *-*-darwin* powerpc-ibm-aix* } } -// { dg-options -std=gnu++0x -pthread { target *-*-freebsd* *-*-dragonfly* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* powerpc-ibm-aix* } } +// { dg-do run { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* *-*-cygwin *-*-darwin* powerpc-ibm-aix* } } +// { dg-options -std=gnu++0x -pthread { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* powerpc-ibm-aix* } } // { dg-options -std=gnu++0x -pthreads { target *-*-solaris* } } // { dg-options -std=gnu++0x { target *-*-cygwin *-*-darwin* } } // { dg-require-cstdint } diff --git a/libstdc++-v3/testsuite/30_threads/call_once/60497.cc b/libstdc++-v3/testsuite/30_threads/call_once/60497.cc index a82b88f..05edc61 100644 --- a/libstdc++-v3/testsuite/30_threads/call_once/60497.cc +++ b/libstdc++-v3/testsuite/30_threads/call_once/60497.cc @@ -1,4 +1,4 @@ -// { dg-do compile { target *-*-freebsd* *-*-dragonfly* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* *-*-cygwin *-*-darwin* powerpc-ibm-aix* } } +// { dg-do compile { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* *-*-cygwin *-*-darwin* powerpc-ibm-aix* } } // { dg-options -std=gnu++11 -pthread { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* powerpc-ibm-aix* } } // { dg-options -std=gnu++11 -pthreads { target *-*-solaris* } } // { dg-options -std=gnu++11 { target *-*-cygwin *-*-darwin* } } diff --git a/libstdc++-v3/testsuite/30_threads/condition_variable/54185.cc b/libstdc++-v3/testsuite/30_threads/condition_variable/54185.cc index 1ce3b1e..509d5db 100644 --- a/libstdc++-v3/testsuite/30_threads/condition_variable/54185.cc +++ b/libstdc++-v3/testsuite/30_threads/condition_variable/54185.cc @@ -1,5 +1,5 @@ -// { dg-do run { target *-*-freebsd* *-*-dragonfly* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* *-*-cygwin *-*-darwin1[1-9]* powerpc-ibm-aix* } } -// { dg-options -std=gnu++0x -pthread { target *-*-freebsd* *-*-dragonfly* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* powerpc-ibm-aix* } } +// { dg-do run { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* *-*-cygwin *-*-darwin1[1-9]* powerpc-ibm-aix* } } +// { dg-options -std=gnu++0x -pthread { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* powerpc-ibm-aix* } } // { dg-options -std=gnu++0x -pthreads { target *-*-solaris* } } // { dg-options -std=gnu++0x { target *-*-cygwin *-*-darwin* } } // { dg-require-cstdint } diff --git a/libstdc++-v3/testsuite/30_threads/condition_variable_any/53830.cc
Re: [COMMITTED 1/2] Just enumerate all GF_OMP_FOR_KIND_* and GF_OMP_TARGET_KIND_*.
I think it was supposed to note that it uses two bits in the mask. Did Jakub approve these patches you are committing now? Thanks, Richard. On Fri, May 23, 2014 at 1:32 PM, Thomas Schwinge tho...@codesourcery.com wrote: From: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 gcc/ * gimple.h (enum gf_mask): Rewrite 0 shift expressions used for GF_OMP_FOR_KIND_MASK, GF_OMP_FOR_KIND_FOR, GF_OMP_FOR_KIND_DISTRIBUTE, GF_OMP_FOR_KIND_SIMD, GF_OMP_FOR_KIND_CILKSIMD, GF_OMP_TARGET_KIND_MASK, GF_OMP_TARGET_KIND_REGION, GF_OMP_TARGET_KIND_DATA, GF_OMP_TARGET_KIND_UPDATE. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@210854 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog | 7 +++ gcc/gimple.h | 18 +- 2 files changed, 16 insertions(+), 9 deletions(-) diff --git gcc/ChangeLog gcc/ChangeLog index d351c0b..fa2f3c3 100644 --- gcc/ChangeLog +++ gcc/ChangeLog @@ -1,5 +1,12 @@ 2014-05-23 Thomas Schwinge tho...@codesourcery.com + * gimple.h (enum gf_mask): Rewrite 0 shift expressions used + for GF_OMP_FOR_KIND_MASK, GF_OMP_FOR_KIND_FOR, + GF_OMP_FOR_KIND_DISTRIBUTE, GF_OMP_FOR_KIND_SIMD, + GF_OMP_FOR_KIND_CILKSIMD, GF_OMP_TARGET_KIND_MASK, + GF_OMP_TARGET_KIND_REGION, GF_OMP_TARGET_KIND_DATA, + GF_OMP_TARGET_KIND_UPDATE. + * gimplify.c (omp_notice_variable) case OMP_CLAUSE_DEFAULT_NONE: Explicitly enumerate the expected region types. diff --git gcc/gimple.h gcc/gimple.h index 9df45de..b1970e5 100644 --- gcc/gimple.h +++ gcc/gimple.h @@ -91,17 +91,17 @@ enum gf_mask { GF_CALL_ALLOCA_FOR_VAR = 1 5, GF_CALL_INTERNAL = 1 6, GF_OMP_PARALLEL_COMBINED = 1 0, -GF_OMP_FOR_KIND_MASK = 3 0, -GF_OMP_FOR_KIND_FOR= 0 0, -GF_OMP_FOR_KIND_DISTRIBUTE = 1 0, -GF_OMP_FOR_KIND_SIMD = 2 0, -GF_OMP_FOR_KIND_CILKSIMD = 3 0, +GF_OMP_FOR_KIND_MASK = (1 2) - 1, +GF_OMP_FOR_KIND_FOR= 0, +GF_OMP_FOR_KIND_DISTRIBUTE = 1, +GF_OMP_FOR_KIND_SIMD = 2, +GF_OMP_FOR_KIND_CILKSIMD = 3, GF_OMP_FOR_COMBINED= 1 2, GF_OMP_FOR_COMBINED_INTO = 1 3, -GF_OMP_TARGET_KIND_MASK= 3 0, -GF_OMP_TARGET_KIND_REGION = 0 0, -GF_OMP_TARGET_KIND_DATA= 1 0, -GF_OMP_TARGET_KIND_UPDATE = 2 0, +GF_OMP_TARGET_KIND_MASK= (1 2) - 1, +GF_OMP_TARGET_KIND_REGION = 0, +GF_OMP_TARGET_KIND_DATA= 1, +GF_OMP_TARGET_KIND_UPDATE = 2, /* True on an GIMPLE_OMP_RETURN statement if the return does not require a thread synchronization via some sort of barrier. The exact barrier -- 1.9.1
Re: [PATCH] Fix PR rtl-optimization/61278
On Fri, May 23, 2014 at 12:33 PM, Zhenqiang Chen zhenqiang.c...@linaro.org wrote: On 23 May 2014 17:05, Richard Biener richard.guent...@gmail.com wrote: On Fri, May 23, 2014 at 9:23 AM, Zhenqiang Chen zhenqiang.c...@linaro.org wrote: Hi, The patch fixes PR rtl-optimization/61278. Root cause for issue is that df_live does not exist at -O1. Bootstrap and no make check regression on X86-64. OK for trunk? Why do you need to give up? It seems you can simply avoid marking the block as dirty (though df_get_live_in/out also hands you back DF_LR_IN/OUT if !df_live). So isn't the df_grow_bb_info the real fix? The df_get_live_in of the new basic block will be used to analyse later INSNs. If it is not set or incorrect, it will impact on later analysis. df_grow_bb_info is to make sure the live_in data structure is allocated for the new basic block (although I have not found any case fail without it). After bitmap_copy(...), we can use it for later INSNs. Note that df_get_live_in/out are functions tailored to IRA that knows that they handle both df_live and df_lr dependent on optimization level. Is shrink-wrapping supposed to work with both problems as well? Yes. But it seams not perfect to handle df_lr problem. When I fixed PR 57637 (https://gcc.gnu.org/ml/gcc-patches/2013-07/msg00897.html), we selected if DF_LIVE doesn't exist, i.e. at -O1, just give up searching NEXT_BLOCK. Ok, I see. Maybe it would be better to completely disable shrink-wrapping when LIVE is not available. Patch is ok. Thanks, Richard. Thanks! -Zhenqiang ChangeLog: 2014-05-23 Zhenqiang Chen zhenqiang.c...@linaro.org PR rtl-optimization/61278 * shrink-wrap.c (move_insn_for_shrink_wrap): Check df_live. testsuite/ChangeLog: 2014-05-23 Zhenqiang Chen zhenqiang.c...@linaro.org * gcc.dg/lto/pr61278_0.c: New test. * gcc.dg/lto/pr61278_1.c: New test. diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c index f09cfe7..be17829 100644 --- a/gcc/shrink-wrap.c +++ b/gcc/shrink-wrap.c @@ -204,8 +204,15 @@ move_insn_for_shrink_wrap (basic_block bb, rtx insn, /* Create a new basic block on the edge. */ if (EDGE_COUNT (next_block-preds) == 2) { + /* If DF_LIVE doesn't exist, i.e. at -O1, just give up. */ + if (!df_live) + return false; + next_block = split_edge (live_edge); + /* We create a new basic block. Call df_grow_bb_info to make sure +all data structures are allocated. */ + df_grow_bb_info (df_live); bitmap_copy (df_get_live_in (next_block), df_get_live_out (bb)); df_set_bb_dirty (next_block); diff --git a/gcc/testsuite/gcc.dg/lto/pr61278_0.c b/gcc/testsuite/gcc.dg/lto/pr61278_0.c new file mode 100644 index 000..03a24ae --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr61278_0.c @@ -0,0 +1,30 @@ +/* { dg-lto-do link } */ +/* { dg-lto-options { { -flto -O0 } } } */ +/* { dg-extra-ld-options -flto -O1 } */ + +static unsigned int +fn1 (int p1, int p2) +{ + return 0; +} + +char a, b, c; + +char +foo (char *p) +{ + int i; + for (b = 1 ; b 0; b++) +{ + for (i = 0; i 2; i++) + ; + for (a = 1; a 0; a++) + { + char d[1] = { 0 }; + if (*p) + break; + c ^= fn1 (fn1 (fn1 (0, 0), 0), 0); + } +} + return 0; +} diff --git a/gcc/testsuite/gcc.dg/lto/pr61278_1.c b/gcc/testsuite/gcc.dg/lto/pr61278_1.c new file mode 100644 index 000..b02c8ac --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr61278_1.c @@ -0,0 +1,13 @@ +/* { dg-lto-do link } */ +/* { dg-lto-options { { -flto -O1 } } } */ + +extern char foo (char *); + +char d; + +int +main () +{ + foo (d); + return 0; +}
[COMMITTED] Be a bit less explicit.
From: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 gcc/ * gimplify.c (omp_notice_variable) case OMP_CLAUSE_DEFAULT_NONE: Rewrite check for ORT_PARALLEL and ORT_COMBINED_PARALLEL. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@210860 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog | 3 +++ gcc/gimplify.c | 3 +-- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git gcc/ChangeLog gcc/ChangeLog index 3d74b6f..397893d 100644 --- gcc/ChangeLog +++ gcc/ChangeLog @@ -1,5 +1,8 @@ 2014-05-23 Thomas Schwinge tho...@codesourcery.com + * gimplify.c (omp_notice_variable) case OMP_CLAUSE_DEFAULT_NONE: + Rewrite check for ORT_PARALLEL and ORT_COMBINED_PARALLEL. + * omp-low.c (expand_omp_for_static_chunk): Rename variable si to gsi, and variables v_* to v*. diff --git gcc/gimplify.c gcc/gimplify.c index 39b2750..654b05c 100644 --- gcc/gimplify.c +++ gcc/gimplify.c @@ -5683,8 +5683,7 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code) switch (default_kind) { case OMP_CLAUSE_DEFAULT_NONE: - if (ctx-region_type == ORT_PARALLEL - || ctx-region_type == ORT_COMBINED_PARALLEL) + if ((ctx-region_type ORT_PARALLEL) != 0) { error (%qE not specified in enclosing parallel, DECL_NAME (lang_hooks.decls.omp_report_decl (decl))); -- 1.9.1
Re: libsanitizer merge from upstream r208536
2) it doesn't still deal with unaligned power of two accesses properly, but neither does llvm (at least not 3.4). Am not talking about undefined behavior cases where the compiler isn't told the access is misaligned, but e.g. when accessing struct S { int x; } __attribute__((packed)) and similar (or aligned(1)). Supposedly we could force __asan_report_*_n for that case too, because normal wider check assumes it is aligned Yep, we don't do it. Now we do: http://llvm.org/viewvc/llvm-project?rev=209508view=rev
Re: [COMMITTED 1/2] Just enumerate all GF_OMP_FOR_KIND_* and GF_OMP_TARGET_KIND_*.
Hi! On Fri, 23 May 2014 13:50:40 +0200, Richard Biener richard.guent...@gmail.com wrote: I think it was supposed to note that it uses two bits in the mask. That is, you'd like me to add a comment saying that? (Which I can certainly do.) Did Jakub approve these patches you are committing now? For the two in this thread, that's how I understood his 2014-03-20 email, http://news.gmane.org/find-root.php?message_id=%3C20140320144512.GK1817%40tucnak.redhat.com%3E. The others I committed today as well as yesterday evening, I considered obvious cleanups and fixes. On Fri, May 23, 2014 at 1:32 PM, Thomas Schwinge tho...@codesourcery.com wrote: From: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 gcc/ * gimple.h (enum gf_mask): Rewrite 0 shift expressions used for GF_OMP_FOR_KIND_MASK, GF_OMP_FOR_KIND_FOR, GF_OMP_FOR_KIND_DISTRIBUTE, GF_OMP_FOR_KIND_SIMD, GF_OMP_FOR_KIND_CILKSIMD, GF_OMP_TARGET_KIND_MASK, GF_OMP_TARGET_KIND_REGION, GF_OMP_TARGET_KIND_DATA, GF_OMP_TARGET_KIND_UPDATE. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@210854 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog | 7 +++ gcc/gimple.h | 18 +- 2 files changed, 16 insertions(+), 9 deletions(-) diff --git gcc/ChangeLog gcc/ChangeLog index d351c0b..fa2f3c3 100644 --- gcc/ChangeLog +++ gcc/ChangeLog @@ -1,5 +1,12 @@ 2014-05-23 Thomas Schwinge tho...@codesourcery.com + * gimple.h (enum gf_mask): Rewrite 0 shift expressions used + for GF_OMP_FOR_KIND_MASK, GF_OMP_FOR_KIND_FOR, + GF_OMP_FOR_KIND_DISTRIBUTE, GF_OMP_FOR_KIND_SIMD, + GF_OMP_FOR_KIND_CILKSIMD, GF_OMP_TARGET_KIND_MASK, + GF_OMP_TARGET_KIND_REGION, GF_OMP_TARGET_KIND_DATA, + GF_OMP_TARGET_KIND_UPDATE. + * gimplify.c (omp_notice_variable) case OMP_CLAUSE_DEFAULT_NONE: Explicitly enumerate the expected region types. diff --git gcc/gimple.h gcc/gimple.h index 9df45de..b1970e5 100644 --- gcc/gimple.h +++ gcc/gimple.h @@ -91,17 +91,17 @@ enum gf_mask { GF_CALL_ALLOCA_FOR_VAR = 1 5, GF_CALL_INTERNAL = 1 6, GF_OMP_PARALLEL_COMBINED = 1 0, -GF_OMP_FOR_KIND_MASK = 3 0, -GF_OMP_FOR_KIND_FOR= 0 0, -GF_OMP_FOR_KIND_DISTRIBUTE = 1 0, -GF_OMP_FOR_KIND_SIMD = 2 0, -GF_OMP_FOR_KIND_CILKSIMD = 3 0, +GF_OMP_FOR_KIND_MASK = (1 2) - 1, +GF_OMP_FOR_KIND_FOR= 0, +GF_OMP_FOR_KIND_DISTRIBUTE = 1, +GF_OMP_FOR_KIND_SIMD = 2, +GF_OMP_FOR_KIND_CILKSIMD = 3, GF_OMP_FOR_COMBINED= 1 2, GF_OMP_FOR_COMBINED_INTO = 1 3, -GF_OMP_TARGET_KIND_MASK= 3 0, -GF_OMP_TARGET_KIND_REGION = 0 0, -GF_OMP_TARGET_KIND_DATA= 1 0, -GF_OMP_TARGET_KIND_UPDATE = 2 0, +GF_OMP_TARGET_KIND_MASK= (1 2) - 1, +GF_OMP_TARGET_KIND_REGION = 0, +GF_OMP_TARGET_KIND_DATA= 1, +GF_OMP_TARGET_KIND_UPDATE = 2, Grüße, Thomas pgp7LM21nvcZ7.pgp Description: PGP signature
Re: [PATCH 1/9] rs6000: Clean up the type attribute
On Fri, May 23, 2014 at 2:09 AM, Segher Boessenkool seg...@kernel.crashing.org wrote: Get rid of the one huge line. Group and order things a bit. Further changes will follow so this doesn't try to make it perfect. The rest of this patch series reduces the number of different integer instruction types by folding many together using attributes size (the data size), dot (does this instruction set CR0), and var_shift (for shift instructions: is the shift amount from a register). Many scheduling descriptions are incomplete; many instruction patterns use the wrong instruction type. Hopefully things will be better if there aren't that many different types to handle. Each patch bootstrapped on powerpc64-linux, tested with -m64,-m64/-mtune=power8,-m32,-m32/-mpowerpc64; no regressions (and nothing magically fixed either). Okay to apply? Segher 2014-05-22 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/rs6000.md (type): Reorder, reformat. Okay. thanks, David
Re: [PATCH 2/9] rs6000: New type attribute value halfmul
On Fri, May 23, 2014 at 2:09 AM, Segher Boessenkool seg...@kernel.crashing.org wrote: This is for the legacy integer multiply-accumulate instructions. Quite a mouthful, and mulhw is also a terrible name since we already have a machine instruction called exactly that. Hence halfmul. Also fixes the titan automaton description for this. 2014-05-22 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/rs6000.md (type): Add new value halfmul. (*macchwc, *macchw, *macchwuc, *macchwu, *machhwc, *machhw, *machhwuc, *machhwu, *maclhwc, *maclhw, *maclhwuc, *maclhwu, *nmacchwc, *nmacchw, *nmachhwc, *nmachhw, *nmaclhwc, *nmaclhw, *mulchwc, *mulchw, *mulchwuc, *mulchwu, *mulhhwc, *mulhhw, *mulhhwuc, *mulhhwu, *mullhwc, *mullhw, *mullhwuc, *mullhwu): Use it. * config/rs6000/40x.md (ppc405-imul3): Add type halfmul. * config/rs6000/440.md (ppc440-imul2): Add type halfmul. * config/rs6000/476.md (ppc476-imul): Add type halfmul. * config/rs6000/titan.md: Delete nonsensical comment. (titan_imul): Add type imul3. (titan_mulhw): Remove type imul3; add type halfmul. Okay. thanks, David
Re: [PATCH 3/9] rs6000: Make all multiply instructions one type
On Fri, May 23, 2014 at 2:09 AM, Segher Boessenkool seg...@kernel.crashing.org wrote: This uses the attributes size and dot to specify the differences: imul3 - mul size=8 imul2 - mul size=16 imul - mul size=32 lmul - mul size=64 imul_compare - mul size=32 dot=yes lmul_compare - mul size=64 dot=yes 2014-05-22 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/rs6000.md (type): Add mul. Delete imul, imul2, imul3, lmul, imul_compare, lmul_compare. (size): New attribute. (dot): New attribute. (cell_micro): Adjust. (mulsi3, *mulsi3_internal1, *mulsi3_internal2, mulsidi3, umulsidi3, smulsi3_highpart, umulsi3_highpart, muldi3, *muldi3_internal1, *muldi3_internal2, smuldi3_highpart, umuldi3_highpart): Adjust. * config/rs6000/rs6000.c (rs6000_adjust_cost, is_cracked_insn, rs6000_adjust_priority, is_nonpipeline_insn, insn_must_be_first_in_group, insn_must_be_last_in_group): Adjust. * config/rs6000/40x.md (ppc403-imul, ppc405-imul, ppc405-imul2, ppc405-imul3): Adjust. * config/rs6000/440.md (ppc440-imul, ppc440-imul2): Adjust. * config/rs6000/476.md (ppc476-imul): Adjust. * config/rs6000/601.md (ppc601-imul): Adjust. * config/rs6000/603.md (ppc603-imul, ppc603-imul2): Adjust. * config/rs6000/6xx.md (ppc604-imul, ppc604e-imul, ppc620-imul, ppc620-imul2, ppc620-imul3, ppc620-lmul): Adjust. * config/rs6000/7450.md (ppc7450-imul, ppc7450-imul2): Adjust. * config/rs6000/7xx.md (ppc750-imul, ppc750-imul2, ppc750-imul3): Adjust. * config/rs6000/8540.md (ppc8540_multiply): Adjust. * config/rs6000/a2.md (ppca2-imul, ppca2-lmul): Adjust. * config/rs6000/cell.md (cell-lmul, cell-lmul-cmp, cell-imul23, cell-imul): Adjust. * config/rs6000/e300c2c3.md (ppce300c3_multiply): Adjust. * config/rs6000/e500mc.md (e500mc_multiply): Adjust. * config/rs6000/e500mc64.md (e500mc64_multiply): Adjust. * config/rs6000/e5500.md (e5500_multiply, e5500_multiply_i): Adjust. * config/rs6000/e6500.md (e6500_multiply, e6500_multiply_i): Adjust. * config/rs6000/mpc.md (mpccore-imul): Adjust. * config/rs6000/power4.md (power4-lmul-cmp, power4-imul-cmp, power4-lmul, power4-imul, power4-imul3): Adjust. * config/rs6000/power5.md (power5-lmul-cmp, power5-imul-cmp, power5-lmul, power5-imul, power5-imul3): Adjust. * config/rs6000/power6.md (power6-lmul-cmp, power6-imul-cmp, power6-lmul, power6-imul, power6-imul3): Adjust. * config/rs6000/power7.md (power7-mul, power7-mul-compare): Adjust. * config/rs6000/power8.md (power8-mul, power8-mul-compare): Adjust. * config/rs6000/rs64.md (rs64a-imul, rs64a-imul2, rs64a-imul3, rs64a-lmul): Adjust. * config/rs6000/titan.md (titan_imul): Adjust. Okay. thanks, David
Re: libsanitizer merge from upstream r208536
On Mon, May 12, 2014 at 03:20:37PM +0400, Konstantin Serebryany wrote: 5 months' worth of changes may break any platform we are not testing ourselves (that includes Ubuntu 12.04, 13.10, 14.04, Mac 10.9, Windows 7, Android ARM), please help us test this patch on your favorite platform. On powerpc64 I hit /home/polacek/gcc/libsanitizer/asan/asan_linux.cc:209:3: error: #error Unsupported arch # error Unsupported arch because the merge (aka clang's r196802) removed ppc64 hunk of code: -# elif defined(__powerpc__) || defined(__powerpc64__) - ucontext_t *ucontext = (ucontext_t*)context; - *pc = ucontext-uc_mcontext.regs-nip; - *sp = ucontext-uc_mcontext.regs-gpr[PT_R1]; - // The powerpc{,64}-linux ABIs do not specify r31 as the frame - // pointer, but GCC always uses r31 when we need a frame pointer. - *bp = ucontext-uc_mcontext.regs-gpr[PT_R31]; -# elif defined(__sparc__) Marek
Re: [PATCH 4/9] rs6000: Make all insert instructions one type
On Fri, May 23, 2014 at 2:09 AM, Segher Boessenkool seg...@kernel.crashing.org wrote: This uses the attribute size to specify the differences: insert_word - insert size=32 insert_dword - insert size=64 It could use dot as well, but the current code doesn't handle that. 2014-05-22 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/rs6000.md (type): Delete insert_word, insert_dword. Add insert. (size): Update comment. * config/rs6000/rs6000.c (rs6000_adjust_cost, is_cracked_insn, insn_must_be_first_in_group): Adjust. (insvsi_internal, *insvsi_internal1, *insvsi_internal2, *insvsi_internal3, *insvsi_internal4, *insvsi_internal5, *insvsi_internal6, insvdi_internal): Adjust. * config/rs6000/40x.md (ppc403-integer): Adjust. * config/rs6000/440.md (ppc440-integer): Adjust. * config/rs6000/476.md (ppc476-simple-integer): Adjust. * config/rs6000/601.md (ppc601-integer): Adjust. * config/rs6000/603.md (ppc603-integer): Adjust. * config/rs6000/6xx.md (ppc604-integer): Adjust. * config/rs6000/7450.md (ppc7450-integer): Adjust. * config/rs6000/7xx.md (ppc750-integer): Adjust. * config/rs6000/8540.md (ppc8540_su): Adjust. * config/rs6000/cell.md (cell-integer, cell-insert): Adjust. * config/rs6000/e300c2c3.md (ppce300c3_iu): Adjust. * config/rs6000/e500mc.md (e500mc_su): Adjust. * config/rs6000/e500mc64.md (e500mc64_su): Adjust. * config/rs6000/e5500.md (e5500_sfx): Adjust. * config/rs6000/e6500.md (e6500_sfx): Adjust. * config/rs6000/mpc.md (mpccore-integer): Adjust. * config/rs6000/power4.md (power4-integer, power4-insert): Adjust. * config/rs6000/power5.md (power5-integer, power5-insert): Adjust. * config/rs6000/power6.md (power6-insert, power6-insert-dword): Adjust. * config/rs6000/power7.md (power7-integer): Adjust. * config/rs6000/power8.md (power8-1cyc): Adjust. * config/rs6000/rs64.md (rs64a-integer): Adjust. * config/rs6000/titan.md (titan_fxu_shift_and_rotate): Adjust. Okay. Thanks, David
Re: [PATCH 6/9] rs6000: Make all shift instructions one type
On Fri, May 23, 2014 at 2:09 AM, Segher Boessenkool seg...@kernel.crashing.org wrote: This uses the attributes var_shift and dot to specify the differences: var_shift_rotate- shift var_shift=yes delayed_compare - shift var_shift=no dot=yes var_delayed_compare - shift var_shift=yes dot=yes 2014-05-22 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/rs6000.md (type): Delete var_shift_rotate, delayed_compare, var_delayed_compare. (var_shift): New attribute. (cell_micro): Adjust. (*andsi3_internal2_mc, *andsi3_internal3_mc, *andsi3_internal4, *andsi3_internal5_mc, *extzvsi_internal1, *extzvsi_internal2, rotlsi3, *rotlsi3_64, *rotlsi3_internal2, *rotlsi3_internal3, *rotlsi3_internal4, *rotlsi3_internal5, *rotlsi3_internal6, *rotlsi3_internal8le, *rotlsi3_internal8be, *rotlsi3_internal9le, *rotlsi3_internal9be, *rotlsi3_internal10le, *rotlsi3_internal10be, *rotlsi3_internal11le, *rotlsi3_internal11be, *rotlsi3_internal12le, *rotlsi3_internal12be, ashlsi3, *ashlsi3_64, lshrsi3, *lshrsi3_64, *lshiftrt_internal2le, *lshiftrt_internal2be, *lshiftrt_internal3le, *lshiftrt_internal3be, *lshiftrt_internal5le, *lshiftrt_internal5be, *lshiftrt_internal5le, *lshiftrt_internal5be, ashrsi3, *ashrsi3_64, rotldi3, *rotldi3_internal2, *rotldi3_internal3, *rotldi3_internal4, *rotldi3_internal5, *rotldi3_internal6, *rotldi3_internal7le, *rotldi3_internal7be, *rotldi3_internal8le, *rotldi3_internal8be, *rotldi3_internal9le, *rotldi3_internal9be, *rotldi3_internal10le, *rotldi3_internal10be, *rotldi3_internal11le, *rotldi3_internal11be, *rotldi3_internal12le, *rotldi3_internal12be, *rotldi3_internal13le, *rotldi3_internal13be, *rotldi3_internal14le, *rotldi3_internal14be, *rotldi3_internal15le, *rotldi3_internal15be, *ashldi3_internal1, *ashldi3_internal2, *ashldi3_internal3, *lshrdi3_internal1, *lshrdi3_internal2, *lshrdi3_internal3, *ashrdi3_internal1, *ashrdi3_internal2, *ashrdi3_internal3, *anddi3_internal2_mc, *anddi3_internal3_mc, as well as 11 anonymous define_insns): Adjust. * config/rs6000/rs6000.c (rs6000_adjust_cost, is_cracked_insn, insn_must_be_first_in_group, insn_must_be_last_in_group): Adjust. * config/rs6000/40x.md (ppc403-integer, ppc403-compare): Adjust. * config/rs6000/440.md (ppc440-integer): Adjust. * config/rs6000/476.md (ppc476-simple-integer, ppc476-compare): Adjust. * config/rs6000/601.md (ppc601-integer, ppc601-compare): Adjust. * config/rs6000/603.md (ppc603-integer, ppc603-compare): Adjust. * config/rs6000/6xx.md (ppc604-integer, ppc604-compare): Adjust. * config/rs6000/7450.md (ppc7450-integer, ppc7450-compare): Adjust. * config/rs6000/7xx.md (ppc750-integer, ppc750-compare): Adjust. * config/rs6000/8540.md (ppc8540_su): Adjust. * config/rs6000/cell.md (cell-integer, cell-fast-cmp, cell-cmp-microcoded): Adjust. * config/rs6000/e300c2c3.md (ppce300c3_cmp): Adjust. * config/rs6000/e500mc.md (e500mc_su): Adjust. * config/rs6000/e500mc64.md (e500mc64_su, e500mc64_su2, e500mc64_delayed): Adjust. * config/rs6000/e5500.md (e5500_sfx, e5500_delayed): Adjust. * config/rs6000/e6500.md (e6500_sfx, e6500_delayed): Adjust. * config/rs6000/mpc.md (mpccore-integer, mpccore-compare): Adjust. * config/rs6000/power4.md (power4-integer, power4-compare): Adjust. * config/rs6000/power5.md (power5-integer, power5-compare): Adjust. * config/rs6000/power6.md (power6-shift, power6-var-rotate, power6-delayed-compare, power6-var-delayed-compare): Adjust. * config/rs6000/power7.md (power7-integer, power7-compare): Adjust. * config/rs6000/power8.md (power8-1cyc, power8-compare): Adjust. Adjust comment. * config/rs6000/rs64.md (rs64a-integer, rs64a-compare): Adjust. * config/rs6000/titan.md (titan_fxu_shift_and_rotate): Adjust. Okay. thanks, David
Re: [PATCH 7/9] rs6000: Make all add instructions one type
On Fri, May 23, 2014 at 2:09 AM, Segher Boessenkool seg...@kernel.crashing.org wrote: They are currently just integer, but the dot version is fast_compare. This makes them all add. Later we should introduce attributes to distinguish e.g. addc and adde (which aren't currently handled as separate instructions at all, only in groups). 2014-05-22 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/rs6000.md (type): Add add. (*addmode3_internal1, addsi3_high, *addmode3_internal2, *addmode3_internal3, *negmode2_internal, and 5 anonymous define_insns): Use it. * config/rs6000/rs6000.c (rs6000_adjust_cost): Adjust. * config/rs6000/40x.md (ppc403-integer, ppc403-compare): Adjust. * config/rs6000/440.md (ppc440-integer, ppc440-compare): Adjust. * config/rs6000/476.md (ppc476-simple-integer, ppc476-compare): Adjust. * config/rs6000/601.md (ppc601-integer): Adjust. * config/rs6000/603.md (ppc603-integer, ppc603-compare): Adjust. * config/rs6000/6xx.md (ppc604-integer, ppc604-compare): Adjust. * config/rs6000/7450.md (ppc7450-integer, ppc7450-compare): Adjust. * config/rs6000/7xx.md (ppc750-integer, ppc750-compare): Adjust. * config/rs6000/8540.md (ppc8540_su): Adjust. * config/rs6000/cell.md (cell-integer, cell-fast-cmp, cell-cmp-microcoded): Adjust. * config/rs6000/e300c2c3.md (ppce300c3_cmp, ppce300c3_iu): Adjust. * config/rs6000/e500mc.md (e500mc_su): Adjust. * config/rs6000/e500mc64.md (e500mc64_su, e500mc64_su2): Adjust. * config/rs6000/e5500.md (e5500_sfx, e5500_sfx2): Adjust. * config/rs6000/e6500.md (e6500_sfx, e6500_sfx2): Adjust. * config/rs6000/mpc.md (mpccore-integer, mpccore-compare): Adjust. * config/rs6000/power4.md (power4-integer, power4-cmp): Adjust. * config/rs6000/power5.md (power5-integer, power5-cmp): Adjust. * config/rs6000/power6.md (power6-integer, power6-fast-compare): Adjust. * config/rs6000/power7.md (power7-integer, power7-cmp): Adjust. * config/rs6000/power8.md (power8-1cyc, power8-fast-compare): Adjust. * config/rs6000/rs64.md (rs64a-integer, rs64a-compare): Adjust. * config/rs6000/titan.md (titan_fxu_adder, titan_fxu_alu): Adjust. Okay. Thanks, David
Re: [PATCH 8/9] rs6000: Make all logical instructions one type
On Fri, May 23, 2014 at 2:09 AM, Segher Boessenkool seg...@kernel.crashing.org wrote: They are currently just integer, but the dot version is fast_compare. This makes them all logical. 2014-05-22 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/rs6000.md (type): Add logical. Delete fast_compare. (dot): Adjust comment. (andsi3_mc, *andsi3_internal2_mc, *andsi3_internal3_mc, *andsi3_internal4, *andsi3_internal5_mc, *boolsi3_internal2, *boolsi3_internal3, *boolccsi3_internal2, *boolccsi3_internal3, anddi3_mc, *anddi3_internal2_mc, *anddi3_internal3_mc, *booldi3_internal2, *booldi3_internal3, *boolcdi3_internal2, *boolcdi3_internal3, *boolccdi3_internal2, *boolccdi3_internal3, *movmode_internal2, and 10 anonymous define_insns): Use logical. * config/rs6000/rs6000.c (rs6000_adjust_cost): Adjust. * config/rs6000/40x.md: (ppc403-integer, ppc403-compare): Adjust. * config/rs6000/440.md: (ppc440-integer, ppc440-compare): Adjust. * config/rs6000/476.md: (ppc476-simple-integer, ppc476-compare): Adjust. * config/rs6000/603.md: (ppc603-integer, ppc603-compare): Adjust. * config/rs6000/6xx.md: (ppc604-integer, ppc604-compare): Adjust. * config/rs6000/7450.md: (ppc7450-integer, ppc7450-compare): Adjust. * config/rs6000/7xx.md: (ppc750-integer, ppc750-compare): Adjust. * config/rs6000/8540.md: (ppc8540_su): Adjust. * config/rs6000/cell.md: (cell-integer, cell-fast-cmp, cell-cmp-microcoded): Adjust. * config/rs6000/e300c2c3.md: (ppce300c3_cmp, ppce300c3_iu): Adjust. * config/rs6000/e500mc.md: (e500mc_su): Adjust. * config/rs6000/e500mc64.md: (e500mc64_su, e500mc64_su2): Adjust. * config/rs6000/e5500.md: (e5500_sfx, e5500_sfx2): Adjust. * config/rs6000/e6500.md: (e6500_sfx, e6500_sfx2): Adjust. * config/rs6000/mpc.md: (mpccore-integer, mpccore-compare): Adjust. * config/rs6000/power4.md: (power4-integer, power4-cmp): Adjust. * config/rs6000/power5.md: (power5-integer, power5-cmp): Adjust. * config/rs6000/power6.md: (power6-integer, power6-fast-compare): Adjust. * config/rs6000/power7.md: (power7-integer, power7-cmp): Adjust. * config/rs6000/power8.md: (power8-1cyc, power8-fast-compare): Adjust. Adjust comment. * config/rs6000/rs64.md: (rs64a-integer, rs64a-compare): Adjust. * config/rs6000/titan.md: (titan_fxu_adder, titan_fxu_alu): Adjust. Okay. thanks, David
Re: libsanitizer merge from upstream r208536
On Fri, May 23, 2014 at 5:41 PM, Marek Polacek pola...@redhat.com wrote: On Mon, May 12, 2014 at 03:20:37PM +0400, Konstantin Serebryany wrote: 5 months' worth of changes may break any platform we are not testing ourselves (that includes Ubuntu 12.04, 13.10, 14.04, Mac 10.9, Windows 7, Android ARM), please help us test this patch on your favorite platform. On powerpc64 I hit /home/polacek/gcc/libsanitizer/asan/asan_linux.cc:209:3: error: #error Unsupported arch # error Unsupported arch because the merge (aka clang's r196802) removed ppc64 hunk of code: -# elif defined(__powerpc__) || defined(__powerpc64__) - ucontext_t *ucontext = (ucontext_t*)context; - *pc = ucontext-uc_mcontext.regs-nip; - *sp = ucontext-uc_mcontext.regs-gpr[PT_R1]; - // The powerpc{,64}-linux ABIs do not specify r31 as the frame - // pointer, but GCC always uses r31 when we need a frame pointer. - *bp = ucontext-uc_mcontext.regs-gpr[PT_R31]; -# elif defined(__sparc__) Someone will have to send this patch via llvm-commits :( (I've pinged Peter Bergner once with no luck). Marek
Re: [PATCH 5/9] rs6000: Make all divide instructions one type
On Fri, May 23, 2014 at 2:09 AM, Segher Boessenkool seg...@kernel.crashing.org wrote: This uses the attribute size to specify the differences: idiv - div size=32 ldiv - div size=64 It could use dot as well, but the current code doesn't handle that. 2014-05-22 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/rs6000.md (type): Delete idiv, ldiv. Add div. (bits): New mode_attr. (idiv_ldiv): Delete mode_attr. (udivmode3, *divmode3, divdiv_extend_mode): Adjust. * config/rs6000/rs6000.c (rs6000_adjust_cost, is_cracked_insn, rs6000_adjust_priority, is_nonpipeline_insn, insn_must_be_first_in_group, insn_must_be_last_in_group): Adjust. * config/rs6000/40x.md (ppc403-idiv): Adjust. * config/rs6000/440.md (ppc440-idiv): Adjust. * config/rs6000/476.md (ppc476-idiv): Adjust. * config/rs6000/601.md (ppc601-idiv): Adjust. * config/rs6000/603.md (ppc603-idiv): Adjust. * config/rs6000/6xx.md (ppc604-idiv, ppc620-idiv, ppc630-idiv, ppc620-ldiv): Adjust. * config/rs6000/7450.md (ppc7450-idiv): Adjust. * config/rs6000/7xx.md (ppc750-idiv): Adjust. * config/rs6000/8540.md (ppc8540_divide): Adjust. * config/rs6000/a2.md (ppca2-idiv, ppca2-ldiv): Adjust. * config/rs6000/cell.md (cell-idiv, cell-ldiv): Adjust. * config/rs6000/e300c2c3.md (ppce300c3_divide): Adjust. * config/rs6000/e500mc.md (e500mc_divide): Adjust. * config/rs6000/e500mc64.md (e500mc64_divide): Adjust. * config/rs6000/e5500.md (e5500_divide, e5500_divide_d): Adjust. * config/rs6000/e6500.md (e6500_divide, e6500_divide_d): Adjust. * config/rs6000/mpc.md (mpccore-idiv): Adjust. * config/rs6000/power4.md (power4-idiv, power4-ldiv): Adjust. * config/rs6000/power5.md (power5-idiv, power5-ldiv): Adjust. * config/rs6000/power6.md (power6-idiv, power6-ldiv): Adjust. * config/rs6000/power7.md (power7-idiv, power7-ldiv): Adjust. * config/rs6000/power8.md (power8-idiv, power8-ldiv): Adjust. * config/rs6000/rs64.md (rs64a-idiv, rs64a-ldiv): Adjust. * config/rs6000/titan.md (titan_fxu_div): Adjust. Okay. Thanks, David
Re: [PATCH 9/9] rs6000: Make all rlw*nm and rld*c* type shift
On Fri, May 23, 2014 at 2:09 AM, Segher Boessenkool seg...@kernel.crashing.org wrote: They are often labeled just integer currently. Fix that. Also handle shift properly in those scheduling descriptions that neglected it. 2014-05-22 Segher Boessenkool seg...@kernel.crashing.org gcc/ * config/rs6000/440.md (ppc440-integer): Include shift without dot. (ppc440-compare): Include shift with dot. * config/rs6000/e300c2c3.md (ppce300c3_iu): Include shift without dot. * config/rs6000/e5500.md (e5500_sfx2): Include constant shift without dot. * config/rs6000/e6500.md (e6500_sfx): Exclude constant shift without dot. (e6500_sfx2): Include it. * config/rs6000/rs6000.md ( *zero_extendmodedi2_internal1, *zero_extendmodedi2_internal2, *zero_extendmodedi2_internal3, *zero_extendsidi2_lfiwzx, andsi3_mc, andsi3_nomc, andsi3_internal0_nomc, extzvsi_internal, extzvdi_internal, *extzvdi_internal1, *extzvdi_internal2, rotlsi3, *rotlsi3_64, *rotlsi3_internal4, *rotlsi3_internal7le, *rotlsi3_internal7be, *rotlsi3_internal10le, *rotlsi3_internal10be, rlwinm, *lshiftrt_internal1le, *lshiftrt_internal1be, *lshiftrt_internal4le, *lshiftrt_internal4be, rotldi3, *rotldi3_internal4, *rotldi3_internal7le, *rotldi3_internal7be, *rotldi3_internal10le, *rotldi3_internal10be, *rotldi3_internal13le, *rotldi3_internal13be, *ashldi3_internal4, ashldi3_internal5, *ashldi3_internal6, *ashldi3_internal7, ashldi3_internal8, *ashldi3_internal9, anddi3_mc, anddi3_nomc, *anddi3_internal2_mc, *anddi3_internal3_mc, and 4 anonymous define_insns): Use type shift in the appropriate alternatives. Okay. Thanks, David
Re: libsanitizer merge from upstream r208536
Hi, Since merge from upstream r209283 (210743 in GCC), my build fails on ARM, because rpc/xdr.h is not found. Is this expected? Thanks, Christophe. On 23 May 2014 15:45, Konstantin Serebryany konstantin.s.serebry...@gmail.com wrote: On Fri, May 23, 2014 at 5:41 PM, Marek Polacek pola...@redhat.com wrote: On Mon, May 12, 2014 at 03:20:37PM +0400, Konstantin Serebryany wrote: 5 months' worth of changes may break any platform we are not testing ourselves (that includes Ubuntu 12.04, 13.10, 14.04, Mac 10.9, Windows 7, Android ARM), please help us test this patch on your favorite platform. On powerpc64 I hit /home/polacek/gcc/libsanitizer/asan/asan_linux.cc:209:3: error: #error Unsupported arch # error Unsupported arch because the merge (aka clang's r196802) removed ppc64 hunk of code: -# elif defined(__powerpc__) || defined(__powerpc64__) - ucontext_t *ucontext = (ucontext_t*)context; - *pc = ucontext-uc_mcontext.regs-nip; - *sp = ucontext-uc_mcontext.regs-gpr[PT_R1]; - // The powerpc{,64}-linux ABIs do not specify r31 as the frame - // pointer, but GCC always uses r31 when we need a frame pointer. - *bp = ucontext-uc_mcontext.regs-gpr[PT_R31]; -# elif defined(__sparc__) Someone will have to send this patch via llvm-commits :( (I've pinged Peter Bergner once with no luck). Marek
Re: libsanitizer merge from upstream r208536
On 24/05/14 00:06, Christophe Lyon wrote: Hi, Since merge from upstream r209283 (210743 in GCC), my build fails on ARM, because rpc/xdr.h is not found. Is this expected? I also have the same issue. I had to build glibc with --enable-obsolete-rpc to bootstrap now. Thanks, Kugan
Re: libsanitizer merge from upstream r208536
On Fri, May 23, 2014 at 6:11 PM, Kugan kugan.vivekanandara...@linaro.org wrote: On 24/05/14 00:06, Christophe Lyon wrote: Hi, Since merge from upstream r209283 (210743 in GCC), my build fails on ARM, because rpc/xdr.h is not found. Is this expected? I also have the same issue. I had to build glibc with --enable-obsolete-rpc to bootstrap now. +eugenis This comes from http://llvm.org/viewvc/llvm-project?view=revisionrevision=205627 : Thanks, Kugan
Re: libsanitizer merge from upstream r208536
On Fri, May 23, 2014 at 6:12 PM, Konstantin Serebryany konstantin.s.serebry...@gmail.com wrote: On Fri, May 23, 2014 at 6:11 PM, Kugan kugan.vivekanandara...@linaro.org wrote: On 24/05/14 00:06, Christophe Lyon wrote: Hi, Since merge from upstream r209283 (210743 in GCC), my build fails on ARM, because rpc/xdr.h is not found. Is this expected? I also have the same issue. I had to build glibc with --enable-obsolete-rpc to bootstrap now. +eugenis This comes from http://llvm.org/viewvc/llvm-project?view=revisionrevision=205627 : Thanks, Kugan
Re: libsanitizer merge from upstream r208536
On Fri, 2014-05-23 at 17:45 +0400, Konstantin Serebryany wrote: On Fri, May 23, 2014 at 5:41 PM, Marek Polacek pola...@redhat.com wrote: On Mon, May 12, 2014 at 03:20:37PM +0400, Konstantin Serebryany wrote: 5 months' worth of changes may break any platform we are not testing ourselves (that includes Ubuntu 12.04, 13.10, 14.04, Mac 10.9, Windows 7, Android ARM), please help us test this patch on your favorite platform. On powerpc64 I hit /home/polacek/gcc/libsanitizer/asan/asan_linux.cc:209:3: error: #error Unsupported arch # error Unsupported arch because the merge (aka clang's r196802) removed ppc64 hunk of code: -# elif defined(__powerpc__) || defined(__powerpc64__) - ucontext_t *ucontext = (ucontext_t*)context; - *pc = ucontext-uc_mcontext.regs-nip; - *sp = ucontext-uc_mcontext.regs-gpr[PT_R1]; - // The powerpc{,64}-linux ABIs do not specify r31 as the frame - // pointer, but GCC always uses r31 when we need a frame pointer. - *bp = ucontext-uc_mcontext.regs-gpr[PT_R31]; -# elif defined(__sparc__) Someone will have to send this patch via llvm-commits :( (I've pinged Peter Bergner once with no luck). Sorry, I'm not purposely ignoring you. Just had a lot of other higher priority items on my TODO list when you asked before. I'll try and reapply the code that was removed and resubmit to the correct mailing list. Sorry for being so tardy and non-responsive. Peter
Re: [PATCH] Implement -fsanitize=float-cast-overflow (take 3)
On Fri, May 23, 2014 at 04:19:00PM +0200, Marek Polacek wrote: This is the latest patch for -fsanitize=float-cast-overflow. Since last version it: - adds tons of tests written by Jakub; - patches libubsan so it can handle 96-bit floating-point types (that is, long double and __float80 in -m32 mode); CCing Kostya on this one liner, which has been posted to llvm-commits, but nothing has been done yet. I'm approving this anyway, I don't see anything controversial on it and clang fails without that change the same (supposedly insufficient test coverage on the compiler-rt side). - includes a hack for printing __float{80,128}/_Decimal* types in libubsan. Since libubsan handles only float/double/long double floating-point types, we use TK_Unknown for other types, meaning that libubsan prints unknown instead of the value. I think this is for now good, while in theory I can imagine not very long code to print _Decimal* to string (convert to binary integer format if in the densely packed format (hey, ppc*), print __int128 significand (or do 2x wide long long division/modulo) into string using snprintf, take care of exponent and sign and putting in decimal dot), it is not high prio for me, and for __float128 you can hardly avoid libquadmath or something similarly large, unless you want to print it as C99 hexadecimal float (that would be again pretty easy). Regtested/bootstrapped on x86_64-linux. Couldn't test ppc64, as libsanitizer currently doesn't build on this architecture. Ok for trunk? 2014-05-23 Marek Polacek pola...@redhat.com Jakub Jelinek ja...@redhat.com * builtins.def: Change SANITIZE_FLOAT_DIVIDE to SANITIZE_NONDEFAULT. * gcc.c (sanitize_spec_function): Likewise. * convert.c (convert_to_integer): Include ubsan.h. Add floating-point to integer instrumentation. * doc/invoke.texi: Document -fsanitize=float-cast-overflow. * flag-types.h (enum sanitize_code): Add SANITIZE_FLOAT_CAST and SANITIZE_NONDEFAULT. * opts.c (common_handle_option): Handle -fsanitize=float-cast-overflow. * sanitizer.def (BUILT_IN_UBSAN_HANDLE_FLOAT_CAST_OVERFLOW, BUILT_IN_UBSAN_HANDLE_FLOAT_CAST_OVERFLOW_ABORT): Add. * ubsan.c: Include realmpfr.h and dfp.h. (get_ubsan_type_info_for_type): Handle REAL_TYPEs. (ubsan_type_descriptor): Set tkind to 0x for types other than float/double/long double. (ubsan_instrument_float_cast): New function. * ubsan.h (ubsan_instrument_float_cast): Declare. testsuite/ * c-c++-common/ubsan/float-cast-overflow-1.c: New test. * c-c++-common/ubsan/float-cast-overflow-10.c: New test. * c-c++-common/ubsan/float-cast-overflow-2.c: New test. * c-c++-common/ubsan/float-cast-overflow-3.c: New test. * c-c++-common/ubsan/float-cast-overflow-4.c: New test. * c-c++-common/ubsan/float-cast-overflow-5.c: New test. * c-c++-common/ubsan/float-cast-overflow-6.c: New test. * c-c++-common/ubsan/float-cast-overflow-7.c: New test. * c-c++-common/ubsan/float-cast-overflow-7.h: New file. * c-c++-common/ubsan/float-cast-overflow-8.c: New test. * c-c++-common/ubsan/float-cast-overflow-9.c: New test. * c-c++-common/ubsan/float-cast.h: New file. * g++.dg/ubsan/float-cast-overflow-bf.C: New test. * gcc.dg/ubsan/float-cast-overflow-bf.c: New test. libsanitizer/ * ubsan/ubsan_value.cc (getFloatValue): Handle 96-bit floating-point types. Ok, thanks. Jakub
Re: [PATCH] Implement -fsanitize=float-cast-overflow (take 3)
On Fri, May 23, 2014 at 6:28 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, May 23, 2014 at 04:19:00PM +0200, Marek Polacek wrote: This is the latest patch for -fsanitize=float-cast-overflow. Since last version it: - adds tons of tests written by Jakub; - patches libubsan so it can handle 96-bit floating-point types (that is, long double and __float80 in -m32 mode); CCing Kostya on this one liner, which has been posted to llvm-commits, but nothing has been done yet. I'm approving this anyway, I don't see anything controversial on it and clang fails without that change the same (supposedly insufficient test coverage on the compiler-rt side). ubsan is not my domain, but since the patch is so simple let me try to handle it. - includes a hack for printing __float{80,128}/_Decimal* types in libubsan. Since libubsan handles only float/double/long double floating-point types, we use TK_Unknown for other types, meaning that libubsan prints unknown instead of the value. I think this is for now good, while in theory I can imagine not very long code to print _Decimal* to string (convert to binary integer format if in the densely packed format (hey, ppc*), print __int128 significand (or do 2x wide long long division/modulo) into string using snprintf, take care of exponent and sign and putting in decimal dot), it is not high prio for me, and for __float128 you can hardly avoid libquadmath or something similarly large, unless you want to print it as C99 hexadecimal float (that would be again pretty easy). Regtested/bootstrapped on x86_64-linux. Couldn't test ppc64, as libsanitizer currently doesn't build on this architecture. Ok for trunk? 2014-05-23 Marek Polacek pola...@redhat.com Jakub Jelinek ja...@redhat.com * builtins.def: Change SANITIZE_FLOAT_DIVIDE to SANITIZE_NONDEFAULT. * gcc.c (sanitize_spec_function): Likewise. * convert.c (convert_to_integer): Include ubsan.h. Add floating-point to integer instrumentation. * doc/invoke.texi: Document -fsanitize=float-cast-overflow. * flag-types.h (enum sanitize_code): Add SANITIZE_FLOAT_CAST and SANITIZE_NONDEFAULT. * opts.c (common_handle_option): Handle -fsanitize=float-cast-overflow. * sanitizer.def (BUILT_IN_UBSAN_HANDLE_FLOAT_CAST_OVERFLOW, BUILT_IN_UBSAN_HANDLE_FLOAT_CAST_OVERFLOW_ABORT): Add. * ubsan.c: Include realmpfr.h and dfp.h. (get_ubsan_type_info_for_type): Handle REAL_TYPEs. (ubsan_type_descriptor): Set tkind to 0x for types other than float/double/long double. (ubsan_instrument_float_cast): New function. * ubsan.h (ubsan_instrument_float_cast): Declare. testsuite/ * c-c++-common/ubsan/float-cast-overflow-1.c: New test. * c-c++-common/ubsan/float-cast-overflow-10.c: New test. * c-c++-common/ubsan/float-cast-overflow-2.c: New test. * c-c++-common/ubsan/float-cast-overflow-3.c: New test. * c-c++-common/ubsan/float-cast-overflow-4.c: New test. * c-c++-common/ubsan/float-cast-overflow-5.c: New test. * c-c++-common/ubsan/float-cast-overflow-6.c: New test. * c-c++-common/ubsan/float-cast-overflow-7.c: New test. * c-c++-common/ubsan/float-cast-overflow-7.h: New file. * c-c++-common/ubsan/float-cast-overflow-8.c: New test. * c-c++-common/ubsan/float-cast-overflow-9.c: New test. * c-c++-common/ubsan/float-cast.h: New file. * g++.dg/ubsan/float-cast-overflow-bf.C: New test. * gcc.dg/ubsan/float-cast-overflow-bf.c: New test. libsanitizer/ * ubsan/ubsan_value.cc (getFloatValue): Handle 96-bit floating-point types. Ok, thanks. Jakub
Re: [PATCH 3/7] IPA-CP escape and clobber analysis
Hi, sorry, I should have added a better description of the overall algorithm, I will try to do that now, I hope I will at least clarify what stage does what. At summary generation time, we process one function after another, looking at their bodies. There are three new things in the generated summaries: 1) How do actual arguments of calls relate to the formal parameters and how they relate to each other. If an argument in one call refers to the same object as an argument in another call or as a formal parameter, we want to know they are the same memory object. This is captured in jump functions, jump functions which did not have one get an integer index. 2) Which pointer formal parameters escape in this function and which pointer actual arguments of all calls escape in this function. This is stored as a bit per each formal parameter and each other (locally unescaped) memory object passed as an actual argument in one or more calls. 3) Whether we modify the memory pointed to by the formal and actual arguments in this function. Also just a but per memory object. When doing the local analysis, we allocate an ipa_escape structure for each SSA name and each addressable local declaration and after examining SSA definitions, store how they relate to each other (ie which point to the same thing and to what offsets which is useful for subsequent patches). We also look at uses of SSA names to see what objects escape locally. These structures are also used when building jump functions. When we now that passed data escapes locally, we mark it directly to the jump function. If it does not, we store an index into the jump function which identifies the memory object - it is an index to vector of ipa_ref_descriptor structures. Which are allocated to have one element for each formal parameter - locally escaped and non-pointer ones are marked as escaped - and every other locally unescaped memory object which is passed to a called function. ipa_escapes and associated data are then deallocated and we move on to another function. During WPA, we basically propagate escape and clobber flags across the call graph. Escape flags are propagated more or less in both directions, it is perhaps best described by figure 4 of http://sysrun.haifa.il.ibm.com/hrl/greps2007/papers/ipa-agg-no_copyright.pdf (I called them unusable flags in that paper some seven years ago) Modified flags are propagated only from callees to callers, of course. On Wed, May 21, 2014 at 04:50:33PM +0200, Richard Biener wrote: On Wed, May 21, 2014 at 3:16 PM, Martin Jambor mjam...@suse.cz wrote: Hi, this patch is rather big but not overly complicated. Its goal is to figure out whether data passed to a function by reference escapes (somewhere, not necessarily in that particular function) and is potentially clobbered (in that one function or its callees). The result is stored into call graph node global structure, at least for now, because it is supposed to live longer than IPA-CP optimization info and be available for PTA later in the pipeline. Before that, however, quite a lot of intermediate results are stored in a number of places. First of all, there is a vector describing all SSA names and address taken local aggregates which is used to figure out relations between them and do the local escape and clobber analysis (I am aware that a local aggregate might incorrectly pass as non-clobbered, that is fixed by the fifth patch, this one is big enough as it is and it does not really matter here). We then store the local results describing formal parameters and so-far-presumed-unescaped aggregates and malloced data that is passed as actual arguments to other functions into a new vector ref_descs. I did not store this into the existing descriptors vector because there are often more elements. Also, I had to extend the UNKNOWN, KNOWN_TYPE and CONSTANT jump functions with an index into this new vector (PASS_THROUGH and ANCESTOR reuse the index into parameters), so there is quite a lot of new getter and setter methods. This information is used by simple queue based interprocedural propagation. Eventually, the information is stored into the call graph node, as described above. After propagation, data in ref_descs and in the call graph are the same, only the call graph can live much longer. One set of flags that is not copied to call graph nodes are callee_clobbered flags, which only IPA-CP uses it in a subsequent patch (and which would require maintenance during inlining). There are more uses of the flags introduced by subsequent patches. In this one, the only one is that IPA-CP modification phase is able to use the results instead of querying AA and is capable of doing more replacements of aggregate values when the aggregate is unescaped and not clobbered. The following table summarizes what the pass can discover now. All
Re: [PATCH] Implement -fsanitize=float-cast-overflow (take 3)
On Fri, May 23, 2014 at 6:35 PM, Konstantin Serebryany konstantin.s.serebry...@gmail.com wrote: On Fri, May 23, 2014 at 6:28 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, May 23, 2014 at 04:19:00PM +0200, Marek Polacek wrote: This is the latest patch for -fsanitize=float-cast-overflow. Since last version it: - adds tons of tests written by Jakub; - patches libubsan so it can handle 96-bit floating-point types (that is, long double and __float80 in -m32 mode); CCing Kostya on this one liner, which has been posted to llvm-commits, but nothing has been done yet. I'm approving this anyway, I don't see anything controversial on it and clang fails without that change the same (supposedly insufficient test coverage on the compiler-rt side). ubsan is not my domain, but since the patch is so simple let me try to handle it. http://llvm.org/viewvc/llvm-project?view=revisionrevision=209516 - includes a hack for printing __float{80,128}/_Decimal* types in libubsan. Since libubsan handles only float/double/long double floating-point types, we use TK_Unknown for other types, meaning that libubsan prints unknown instead of the value. I think this is for now good, while in theory I can imagine not very long code to print _Decimal* to string (convert to binary integer format if in the densely packed format (hey, ppc*), print __int128 significand (or do 2x wide long long division/modulo) into string using snprintf, take care of exponent and sign and putting in decimal dot), it is not high prio for me, and for __float128 you can hardly avoid libquadmath or something similarly large, unless you want to print it as C99 hexadecimal float (that would be again pretty easy). Regtested/bootstrapped on x86_64-linux. Couldn't test ppc64, as libsanitizer currently doesn't build on this architecture. Ok for trunk? 2014-05-23 Marek Polacek pola...@redhat.com Jakub Jelinek ja...@redhat.com * builtins.def: Change SANITIZE_FLOAT_DIVIDE to SANITIZE_NONDEFAULT. * gcc.c (sanitize_spec_function): Likewise. * convert.c (convert_to_integer): Include ubsan.h. Add floating-point to integer instrumentation. * doc/invoke.texi: Document -fsanitize=float-cast-overflow. * flag-types.h (enum sanitize_code): Add SANITIZE_FLOAT_CAST and SANITIZE_NONDEFAULT. * opts.c (common_handle_option): Handle -fsanitize=float-cast-overflow. * sanitizer.def (BUILT_IN_UBSAN_HANDLE_FLOAT_CAST_OVERFLOW, BUILT_IN_UBSAN_HANDLE_FLOAT_CAST_OVERFLOW_ABORT): Add. * ubsan.c: Include realmpfr.h and dfp.h. (get_ubsan_type_info_for_type): Handle REAL_TYPEs. (ubsan_type_descriptor): Set tkind to 0x for types other than float/double/long double. (ubsan_instrument_float_cast): New function. * ubsan.h (ubsan_instrument_float_cast): Declare. testsuite/ * c-c++-common/ubsan/float-cast-overflow-1.c: New test. * c-c++-common/ubsan/float-cast-overflow-10.c: New test. * c-c++-common/ubsan/float-cast-overflow-2.c: New test. * c-c++-common/ubsan/float-cast-overflow-3.c: New test. * c-c++-common/ubsan/float-cast-overflow-4.c: New test. * c-c++-common/ubsan/float-cast-overflow-5.c: New test. * c-c++-common/ubsan/float-cast-overflow-6.c: New test. * c-c++-common/ubsan/float-cast-overflow-7.c: New test. * c-c++-common/ubsan/float-cast-overflow-7.h: New file. * c-c++-common/ubsan/float-cast-overflow-8.c: New test. * c-c++-common/ubsan/float-cast-overflow-9.c: New test. * c-c++-common/ubsan/float-cast.h: New file. * g++.dg/ubsan/float-cast-overflow-bf.C: New test. * gcc.dg/ubsan/float-cast-overflow-bf.c: New test. libsanitizer/ * ubsan/ubsan_value.cc (getFloatValue): Handle 96-bit floating-point types. Ok, thanks. Jakub
Re: libsanitizer merge from upstream r208536
Hi Christophe, is there anything special about your setup? We've seen it build on arm/linux and arm/android correctly. On Fri, May 23, 2014 at 6:06 PM, Christophe Lyon christophe.l...@linaro.org wrote: Hi, Since merge from upstream r209283 (210743 in GCC), my build fails on ARM, because rpc/xdr.h is not found. Is this expected? Thanks, Christophe. On 23 May 2014 15:45, Konstantin Serebryany konstantin.s.serebry...@gmail.com wrote: On Fri, May 23, 2014 at 5:41 PM, Marek Polacek pola...@redhat.com wrote: On Mon, May 12, 2014 at 03:20:37PM +0400, Konstantin Serebryany wrote: 5 months' worth of changes may break any platform we are not testing ourselves (that includes Ubuntu 12.04, 13.10, 14.04, Mac 10.9, Windows 7, Android ARM), please help us test this patch on your favorite platform. On powerpc64 I hit /home/polacek/gcc/libsanitizer/asan/asan_linux.cc:209:3: error: #error Unsupported arch # error Unsupported arch because the merge (aka clang's r196802) removed ppc64 hunk of code: -# elif defined(__powerpc__) || defined(__powerpc64__) - ucontext_t *ucontext = (ucontext_t*)context; - *pc = ucontext-uc_mcontext.regs-nip; - *sp = ucontext-uc_mcontext.regs-gpr[PT_R1]; - // The powerpc{,64}-linux ABIs do not specify r31 as the frame - // pointer, but GCC always uses r31 when we need a frame pointer. - *bp = ucontext-uc_mcontext.regs-gpr[PT_R31]; -# elif defined(__sparc__) Someone will have to send this patch via llvm-commits :( (I've pinged Peter Bergner once with no luck). Marek
Re: [RS6000] Fix PR61098, Poor code setting count register
OK, let's start again from scratch. This patch fixes PR61098, a problem caused by trying to do arithmetic on the count register. The fix is to provide a new pseudo in rs6000_emit_set_long_const so arithmetic will be done in a gpr. Additionally, the patch fixes a number of other bugs and cleanup issues with rs6000_emit_set_{,long_}const. - rs6000_emit_set_long_const took two HWI constant parameters, a relic from the days when HWI might be 32 bits on powerpc. We're only setting a 64-bit value, so remove the unnecessary parameter. - The !TARGET_POWERPC64 handling of DImode assumed a 32 bit HWI, and the insn setting the low 32-bit reg was wrongly marked with a reg_equiv note saying the reg contained the entire 64-bit constant. I hadn't spotted the bad reg_equiv when writing the previous patch. - The comments describing the functions are inaccurate and misleading. - rs6000_emit_set_long_const always returns DEST, so it's caller can assume this and rs6000_emit_set_long_const return void. - The code testing for a NULL DEST in rs6000_emit_set_const is dead. DEST cannot be NULL, since the three uses of the function are in rs6000.md splitters where DEST (operand[0]) satisfies gpc_reg_operand. - The above two points mean that rs6000_emit_set_const always returns DEST, which in turn would allow rs6000_emit_set_const to return void. However, in view of a possible future change that might need to return status on whether rs6000_emit_set_const emitted anything, return a bool. - rs6000_emit_set_const parameter N is currently unused, and MODE always matches GET_MODE (DEST), so N and MODE can be removed. - The code is liberally sprinkled with copy_rtx. DEST/TEMP is always used once without copy_rtx, but which insn uses copy_rtx varies. I changed the code to always use a bare DEST as the last insn for consistency. (Writing the code this way might allow us to omit the copy_rtx on DEST/TEMP entirely. Before reload TEMP will be a new pseudo reg, thus doesn't need copy_rtx, and after reload we shouldn't have a SUBREG DEST. I wasn't sure of exactly what might happen during reload, so left well enough alone.) Bootstrapped and regression tested powerpc64-linux. OK to apply mainline? PR target/61098 * config/rs6000/rs6000.c (rs6000_emit_set_const): Remove unneeded params and return a bool. Remove dead code. Update comment. Assert we have a const_int source. Remove bogus code from 32-bit HWI days. Move !TARGET_POWERPC64 handling, and correct handling of constants 2G and reg_equal note, from.. (rs6000_emit_set_long_const): ..here. Remove unneeded param and return value. Update comment. If we can, use a new pseudo for intermediate calculations. * config/rs6000/rs6000-protos.h (rs6000_emit_set_const): Update prototype. * config/rs6000/rs6000.md (movsi_internal1_single+1): Update call to rs6000_emit_set_const in splitter. (movdi_internal64+2, +3): Likewise. Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 210835) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -1068,7 +1068,7 @@ static tree rs6000_handle_longcall_attribute (tree static tree rs6000_handle_altivec_attribute (tree *, tree, tree, int, bool *); static tree rs6000_handle_struct_attribute (tree *, tree, tree, int, bool *); static tree rs6000_builtin_vectorized_libmass (tree, tree, tree); -static rtx rs6000_emit_set_long_const (rtx, HOST_WIDE_INT, HOST_WIDE_INT); +static void rs6000_emit_set_long_const (rtx, HOST_WIDE_INT); static int rs6000_memory_move_cost (enum machine_mode, reg_class_t, bool); static bool rs6000_debug_rtx_costs (rtx, int, int, int, int *, bool); static int rs6000_debug_address_cost (rtx, enum machine_mode, addr_space_t, @@ -7849,53 +7849,50 @@ rs6000_conditional_register_usage (void) } -/* Try to output insns to set TARGET equal to the constant C if it can - be done in less than N insns. Do all computations in MODE. - Returns the place where the output has been placed if it can be - done and the insns have been emitted. If it would take more than N - insns, zero is returned and no insns and emitted. */ +/* Output insns to set DEST equal to the constant SOURCE as a series of + lis, ori and shl instructions and return TRUE. */ -rtx -rs6000_emit_set_const (rtx dest, enum machine_mode mode, - rtx source, int n ATTRIBUTE_UNUSED) +bool +rs6000_emit_set_const (rtx dest, rtx source) { - rtx result, insn, set; - HOST_WIDE_INT c0, c1; + enum machine_mode mode = GET_MODE (dest); + rtx temp, insn, set; + HOST_WIDE_INT c; + gcc_checking_assert (CONST_INT_P (source)); + c = INTVAL (source); switch (mode) { -case QImode: +case QImode: case HImode: - if (dest == NULL) - dest = gen_reg_rtx (mode);
Re: [PATCH][ARM] Vectorize bswap[16,32,64] operations
Looks good except for : diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index ef370fe..7e1ec71 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -3280,7 +3280,8 @@ proc check_effective_target_vect_bswap { } { verbose check_effective_target_vect_bswap: using cached result 2 } else { set et_vect_bswap_saved 0 -if { [istarget aarch64*-*-*] } { +if { [istarget aarch64*-*-*] + || [istarget arm*-*-*] } { This condition should have [check_effective_target_arm_neon] or you'll break testing on AArch32 non-neon targets especially as this is an implicit run time test. Ok with that change. Ramana On Fri, May 16, 2014 at 3:45 PM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote: Hi all, This is the aarch32 version of https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00850.html that allows us to (auto-)vectorise the __builtin_bswap[16,32,64] functions using the vrev instructions. For that we create some new NEON builtins and get arm_builtin_vectorized_function to map to them when asked to vectorise the corresponding builtin. The tests for this were added with the aarch64 patch mentioned above but were disabled for arm. This patch enables them now that we support the operations (of course they now pass on arm) Tested arm-none-eabi and bootstrapped on arm-none-linux-gnueabihf. Ok for trunk? Thanks, Kyrill 2014-05-16 Kyrylo Tkachov kyrylo.tkac...@arm.com * config/arm/neon.md (neon_bswapmode): New pattern. * config/arm/arm.c (neon_itype): Add NEON_BSWAP. (arm_init_neon_builtins): Handle NEON_BSWAP. Define required type nodes. (arm_expand_neon_builtin): Handle NEON_BSWAP. (arm_builtin_vectorized_function): Handle BUILTIN_BSWAP builtins. * config/arm/arm_neon_builtins.def (bswap): Define builtins. * config/arm/iterators.md (VDQHSD): New mode iterator. 2014-05-16 Kyrylo Tkachov kyrylo.tkac...@arm.com * lib/target-supports.exp (check_effective_target_vect_bswap): Specify arm*-*-* support.
patch to add test for PR61215
The following patch adds missed test for the PR. Committed to the trunk as rev. 210838. 2014-05-23 Vladimir Makarov vmaka...@redhat.com PR rtl-optimization/61215 * gcc.target/i386/pr61215.c: New. Index: testsuite/gcc.target/i386/pr61215.c === --- testsuite/gcc.target/i386/pr61215.c (revision 0) +++ testsuite/gcc.target/i386/pr61215.c (working copy) @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target ia32 } */ +/* { dg-options -O2 -march=i686 } */ + +void fn1 (int *, ...); +int fn2 (int p1) +{ + fn1 (0, (short)(int)p1); + return 0; +}
Re: patch to add test for PR61215
On Fri, May 23, 2014 at 11:36:33AM -0400, Vladimir Makarov wrote: The following patch adds missed test for the PR. Committed to the trunk as rev. 210838. 2014-05-23 Vladimir Makarov vmaka...@redhat.com PR rtl-optimization/61215 * gcc.target/i386/pr61215.c: New. Index: testsuite/gcc.target/i386/pr61215.c === --- testsuite/gcc.target/i386/pr61215.c (revision 0) +++ testsuite/gcc.target/i386/pr61215.c (working copy) @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target ia32 } */ +/* { dg-options -O2 -march=i686 } */ + +void fn1 (int *, ...); +int fn2 (int p1) +{ + fn1 (0, (short)(int)p1); + return 0; +} What is i?86 specific on this testcase? I'd say move it to gcc.dg/, remove effective-target, /* { dg-options -O2 } */ /* { dg-additional-options -march=i686 { target ia32 } } */ and use (__INTPTR_TYPE__) instead of (int) - no change for i?86. Jakub
Re: libsanitizer merge from upstream r208536
On Fri, May 23, 2014 at 11:40 AM, Jakub Jelinek ja...@redhat.com wrote: No other shared library does anything close to that, for each such library you can interpose any of its public symbols, either you know what you are doing when interposing it, or it breaks. I think I have finally recalled why we added this check. They main usecase was sanitized dynamic library linked into unsanitized executable. This of course requires that libasan comes earlier in loaded DSO list than libc libraries so that malloc, etc. can be successfully intercepted with dlsym(..., RTLD_NEXT) in libasan (for more details grep for LD_PRELOAD in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56393). Jakub is probably right that forcing libasan to be the first library in DSO list (that's what AsanCheckDynamicRTPrereqs currently does) is an overkill but it felt a lot easier to do back in the day. As already said, removing abort and hiding warning under verbosity0 runtime option is probably enough to support usecases mentioned by Jakub (preloading some other DSO before libasan). As Konstantin mentioned, this should be discussed in address-sanitizer ML which I'll probably do on Monday if noone takes the lead. -Y
Re: [PATCH 3/9] rs6000: Make all multiply instructions one type
On 05/23/2014 01:09 AM, Segher Boessenkool wrote: @@ -27385,6 +27371,11 @@ insn_must_be_first_in_group (rtx insn) case TYPE_MFJMPR: case TYPE_MTJMPR: return true; +case TYPE_MUL: + if (get_attr_dot (insn) == DOT_YES) +return true; + else +break; case TYPE_LOAD: if (get_attr_sign_extend (insn) == SIGN_EXTEND_YES || get_attr_update (insn) == UPDATE_YES) @@ -27415,8 +27406,6 @@ insn_must_be_first_in_group (rtx insn) case TYPE_COMPARE: case TYPE_DELAYED_COMPARE: case TYPE_VAR_DELAYED_COMPARE: -case TYPE_IMUL_COMPARE: -case TYPE_LMUL_COMPARE: case TYPE_SYNC: case TYPE_ISYNC: case TYPE_LOAD_L: This looks like you added it to the POWER7 case and removed from the POWER8 case. The MUL_COMPARE types should have been listed for the POWER7 case leg also, so the addition there is fine, but the new code should also be duplicated in the POWER8 case leg. -Pat
Re: libsanitizer merge from upstream r208536
On Fri, May 23, 2014 at 04:11:48PM +0400, Konstantin Serebryany wrote: 2) it doesn't still deal with unaligned power of two accesses properly, but neither does llvm (at least not 3.4). Am not talking about undefined behavior cases where the compiler isn't told the access is misaligned, but e.g. when accessing struct S { int x; } __attribute__((packed)) and similar (or aligned(1)). Supposedly we could force __asan_report_*_n for that case too, because normal wider check assumes it is aligned Yep, we don't do it. Now we do: http://llvm.org/viewvc/llvm-project?rev=209508view=rev Here is the GCC side of the thing, ok for trunk if it bootstraps/regtests? (on top of the earlier posted two patches). Note, I think some work is needed on the library side, ERROR: AddressSanitizer: unknown-crash on address 0xffc439cf at pc 0x804898e bp 0xffc438d8 sp 0xffc438cc READ of size 4 at 0xffc439cf thread T0 #0 0x804898d in foo /usr/src/gcc/gcc/testsuite/c-c++-common/asan/misalign-1.c:10 #1 0x8048746 in main /usr/src/gcc/gcc/testsuite/c-c++-common/asan/misalign-1.c:34 #2 0x49e39b72 in __libc_start_main (/lib/libc.so.6+0x49e39b72) #3 0x8048848 (/usr/src/gcc/obj2/gcc/testsuite/gcc/misalign-1.exe+0x8048848) Address 0xffc439cf is located in stack of thread T0 at offset 175 in frame #0 0x804868f in main /usr/src/gcc/gcc/testsuite/c-c++-common/asan/misalign-1.c:27 This frame has 3 object(s): [32, 36) 'v' [96, 100) 'w' [160, 176) 't' == Memory access at offset 175 partially overflows this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext (longjmp and C++ exceptions *are* supported) SUMMARY: AddressSanitizer: unknown-crash /usr/src/gcc/gcc/testsuite/c-c++-common/asan/misalign-1.c:10 foo Shadow bytes around the buggy address: 0x3ff886e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x3ff886f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x3ff88700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x3ff88710: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x3ff88720: 00 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f2 f2 f2 f2 =0x3ff88730: 04 f4 f4 f4 f2 f2 f2 f2 00[00]f4 f4 f3 f3 f3 f3 0x3ff88740: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x3ff88750: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x3ff88760: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x3ff88770: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x3ff88780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Heap right redzone: fb Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack partial redzone: f4 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user:f7 Container overflow: fc ASan internal: fe is just too ugly (I mean, it shouldn't print unknown-crash). It is true that the first byte of the __asan_report_load_n range corresponds to shadow byte 0, but for _n you should either check it for all bytes in that range, or at least the first and last byte (which would correspond to what the caller of __asan_report_*_n actually does right now). 2014-05-23 Jakub Jelinek ja...@redhat.com * asan.c (report_error_func): Add SLOW_P argument, use BUILT_IN_ASAN_*_N if set. (build_check_stmt): Likewise. (instrument_derefs): If T has insufficient alignment, force same handling as for odd sizes. * c-c++-common/asan/misalign-1.c: New test. * c-c++-common/asan/misalign-2.c: New test. --- gcc/asan.c.jj 2014-05-23 17:17:46.0 +0200 +++ gcc/asan.c 2014-05-23 18:14:08.630807014 +0200 @@ -1319,7 +1319,7 @@ asan_protect_global (tree decl) IS_STORE is either 1 (for a store) or 0 (for a load). */ static tree -report_error_func (bool is_store, HOST_WIDE_INT size_in_bytes) +report_error_func (bool is_store, HOST_WIDE_INT size_in_bytes, bool slow_p) { static enum built_in_function report[2][6] = { { BUILT_IN_ASAN_REPORT_LOAD1, BUILT_IN_ASAN_REPORT_LOAD2, @@ -1329,7 +1329,8 @@ report_error_func (bool is_store, HOST_W BUILT_IN_ASAN_REPORT_STORE4, BUILT_IN_ASAN_REPORT_STORE8, BUILT_IN_ASAN_REPORT_STORE16, BUILT_IN_ASAN_REPORT_STORE_N } }; if ((size_in_bytes (size_in_bytes - 1)) != 0 - || size_in_bytes 16) + || size_in_bytes 16 + || slow_p) return builtin_decl_implicit (report[is_store][5]); return builtin_decl_implicit (report[is_store][exact_log2 (size_in_bytes)]); } @@ -1508,7 +1509,8 @@ build_shadow_mem_access (gimple_stmt_ite static void build_check_stmt (location_t location, tree base, gimple_stmt_iterator *iter,
Re: libsanitizer merge from upstream r208536
On Fri, 2014-05-23 at 09:25 -0500, Peter Bergner wrote: xagsmtp4.20140523142452.1...@vmsdvm6.vnet.ibm.com X-Xagent-Gateway: vmsdvm6.vnet.ibm.com (XAGSMTP4 at VMSDVM6) On Fri, 2014-05-23 at 17:45 +0400, Konstantin Serebryany wrote: On Fri, May 23, 2014 at 5:41 PM, Marek Polacek pola...@redhat.com wrote: On Mon, May 12, 2014 at 03:20:37PM +0400, Konstantin Serebryany wrote: 5 months' worth of changes may break any platform we are not testing ourselves (that includes Ubuntu 12.04, 13.10, 14.04, Mac 10.9, Windows 7, Android ARM), please help us test this patch on your favorite platform. On powerpc64 I hit /home/polacek/gcc/libsanitizer/asan/asan_linux.cc:209:3: error: #error Unsupported arch # error Unsupported arch because the merge (aka clang's r196802) removed ppc64 hunk of code: -# elif defined(__powerpc__) || defined(__powerpc64__) - ucontext_t *ucontext = (ucontext_t*)context; - *pc = ucontext-uc_mcontext.regs-nip; - *sp = ucontext-uc_mcontext.regs-gpr[PT_R1]; - // The powerpc{,64}-linux ABIs do not specify r31 as the frame - // pointer, but GCC always uses r31 when we need a frame pointer. - *bp = ucontext-uc_mcontext.regs-gpr[PT_R31]; -# elif defined(__sparc__) Someone will have to send this patch via llvm-commits :( (I've pinged Peter Bergner once with no luck). The following patch gets bootstrap working again, but there seems to be a large number of testsuite failures I will look into before submitting the patch to the LLVM mailing list. Peter Index: libsanitizer/asan/asan_linux.cc === --- libsanitizer/asan/asan_linux.cc (revision 210861) +++ libsanitizer/asan/asan_linux.cc (working copy) @@ -186,6 +186,13 @@ void GetPcSpBp(void *context, uptr *pc, *bp = ucontext-uc_mcontext.gregs[REG_EBP]; *sp = ucontext-uc_mcontext.gregs[REG_ESP]; # endif +#elif defined(__powerpc__) || defined(__powerpc64__) + ucontext_t *ucontext = (ucontext_t*)context; + *pc = ucontext-uc_mcontext.regs-nip; + *sp = ucontext-uc_mcontext.regs-gpr[PT_R1]; + // The powerpc{,64}-linux ABIs do not specify r31 as the frame + // pointer, but GCC always uses r31 when we need a frame pointer. + *bp = ucontext-uc_mcontext.regs-gpr[PT_R31]; #elif defined(__sparc__) ucontext_t *ucontext = (ucontext_t*)context; uptr *stk_ptr; Index: libsanitizer/asan/asan_mapping.h === --- libsanitizer/asan/asan_mapping.h(revision 210861) +++ libsanitizer/asan/asan_mapping.h(working copy) @@ -85,6 +85,7 @@ static const u64 kDefaultShadowOffset64 static const u64 kDefaultShort64bitShadowOffset = 0x7FFF8000; // 2G. static const u64 kAArch64_ShadowOffset64 = 1ULL 36; static const u64 kMIPS32_ShadowOffset32 = 0x0aaa8000; +static const u64 kPPC64_ShadowOffset64 = 1ULL 41; static const u64 kFreeBSD_ShadowOffset32 = 1ULL 30; // 0x4000 static const u64 kFreeBSD_ShadowOffset64 = 1ULL 46; // 0x4000 @@ -107,6 +108,8 @@ static const u64 kFreeBSD_ShadowOffset64 # else # if defined(__aarch64__) #define SHADOW_OFFSET kAArch64_ShadowOffset64 +# elif defined(__powerpc64__) +#define SHADOW_OFFSET kPPC64_ShadowOffset64 # elif SANITIZER_FREEBSD #define SHADOW_OFFSET kFreeBSD_ShadowOffset64 # elif SANITIZER_MAC Index: libsanitizer/sanitizer_common/sanitizer_common.h === --- libsanitizer/sanitizer_common/sanitizer_common.h(revision 210861) +++ libsanitizer/sanitizer_common/sanitizer_common.h(working copy) @@ -26,7 +26,11 @@ struct StackTrace; const uptr kWordSize = SANITIZER_WORDSIZE / 8; const uptr kWordSizeInBits = 8 * kWordSize; -const uptr kCacheLineSize = 64; +#if defined(__powerpc__) || defined(__powerpc64__) + const uptr kCacheLineSize = 128; +#else + const uptr kCacheLineSize = 64; +#endif const uptr kMaxPathLength = 512; Index: libsanitizer/sanitizer_common/sanitizer_stacktrace.cc === --- libsanitizer/sanitizer_common/sanitizer_stacktrace.cc (revision 210861) +++ libsanitizer/sanitizer_common/sanitizer_stacktrace.cc (working copy) @@ -16,11 +16,13 @@ namespace __sanitizer { uptr StackTrace::GetPreviousInstructionPc(uptr pc) { -#ifdef __arm__ +#if defined(__arm__) // Cancel Thumb bit. pc = pc (~1); -#endif -#if defined(__sparc__) +#elif defined(__powerpc__) || defined(__powerpc64__) + // PCs are always 4 byte aligned. + return pc - 4; +#elif defined(__sparc__) return pc - 8; #else return pc - 1;
Re: [PATCH] Disable unroll loop that has header count less than iteration count.
On Thu, May 22, 2014 at 11:36 PM, Dehao Chen de...@google.com wrote: If a loop's header count is less than iteration count, the iteration estimation is apparently incorrect for this loop. Thus disable unrolling of such loops. Testing on going. OK for trunk if test pass? No. Why don't you instead plug the hole in expected_loop_iterations ()? That is, why may not loop-header be bogus? Isn't it maybe the bounding It is autoFDO thing. Dehao is basically pushing out changes that should make compiler more sane about bogus profiles. At the moment we have tests bb-count != 0 to figure out if the profile is reliable. I.e. we assume that profile feedback is always good, guessed profile is never good. Loop optimizers then do not trust guessed profile to give realistic estimates on number of iterations and bail out into simple heuristics for estimated profiles that are usually not good on this job - i.e. int niters = 0; if (desc-const_iter) niters = desc-niter; else if (loop-header-count) niters = expected_loop_iterations (loop); ... With FDO this change, as the FDO profiles are sometimes bogus (and there is not much to do about it). I would say that for loop optimization, we want a flag whether expected number of iterations is reliable. We probably want if (expected_loop_iterations_reliable_p (loop)) niters = expected_loop_iterations (loop); We probably also want to store this information into loop structure rather than computing it all the time from profile, since the profile may get inaccurate and we already know how to maintain upper bounds on numbers of iterations, so it should be easy to implement. This would allow us to preserve info like for (i=0 ;i __bulitin_expect (n,10); i++) that would be nice feature to have. Honza you run into? /* Returns expected number of LOOP iterations. The returned value is bounded by REG_BR_PROB_BASE. */ unsigned expected_loop_iterations (const struct loop *loop) { gcov_type expected = expected_loop_iterations_unbounded (loop); return (expected REG_BR_PROB_BASE ? REG_BR_PROB_BASE : expected); } I miss a testcase as well. Richard. Thanks, Dehao gcc/ChangeLog: 2014-05-21 Dehao Chen de...@google.com * cfgloop.h (expected_loop_iterations_reliable_p): New func. * cfgloopanal.c (expected_loop_iterations_reliable_p): Likewise. * loop-unroll.c (decide_unroll_runtime_iterations): Disable unroll loop that has unreliable iteration counts. Index: gcc/cfgloop.h === --- gcc/cfgloop.h (revision 210717) +++ gcc/cfgloop.h (working copy) @@ -307,8 +307,8 @@ extern bool just_once_each_iteration_p (const stru gcov_type expected_loop_iterations_unbounded (const struct loop *); extern unsigned expected_loop_iterations (const struct loop *); extern rtx doloop_condition_get (rtx); +extern bool expected_loop_iterations_reliable_p (const struct loop *); - /* Loop manipulation. */ extern bool can_duplicate_loop_p (const struct loop *loop); Index: gcc/cfgloopanal.c === --- gcc/cfgloopanal.c (revision 210717) +++ gcc/cfgloopanal.c (working copy) @@ -285,6 +285,15 @@ expected_loop_iterations (const struct loop *loop) return (expected REG_BR_PROB_BASE ? REG_BR_PROB_BASE : expected); } +/* Returns true if the loop header's profile count is smaller than expected + loop iteration. */ + +bool +expected_loop_iterations_reliable_p (const struct loop *loop) +{ + return expected_loop_iterations (loop) loop-header-count; +} + /* Returns the maximum level of nesting of subloops of LOOP. */ unsigned Index: gcc/loop-unroll.c === --- gcc/loop-unroll.c (revision 210717) +++ gcc/loop-unroll.c (working copy) @@ -988,6 +988,15 @@ decide_unroll_runtime_iterations (struct loop *loo return; } + if (profile_status_for_fn (cfun) == PROFILE_READ + expected_loop_iterations_reliable_p (loop)) +{ + if (dump_file) + fprintf (dump_file, ;; Not unrolling loop, loop iteration + not reliable.); + return; +} + /* Check whether the loop rolls. */ if ((get_estimated_loop_iterations (loop, iterations) || get_max_loop_iterations (loop, iterations))
Re: [C++ patch] Reduce vtable alignment
Hi, I would like to ping these two patches. If we conclude it is absolutely unsafe to not align virtual tables to 16byte boundary (that is an x86_64 ABI requirement for array datastructures but I would like to argue that vtables are compiler controlled ones and do not need to follow ABI here), I can add a code to while program visibility pass to bump up alignment of vtables that are externally visible. Vtables are always accessed via the vtbl pointer otherwise (that is almost always misaligned because of the offset to RTTI pointer), so for vtables static to given compilation unit, there is no way other compiler can derive the alignment based on ABI promises. This would save the data segment size more progressively at least for -flto. Honza
Re: [PATCH] Disable unroll loop that has header count less than iteration count.
On May 23, 2014 7:26:04 PM CEST, Jan Hubicka hubi...@ucw.cz wrote: On Thu, May 22, 2014 at 11:36 PM, Dehao Chen de...@google.com wrote: If a loop's header count is less than iteration count, the iteration estimation is apparently incorrect for this loop. Thus disable unrolling of such loops. Testing on going. OK for trunk if test pass? No. Why don't you instead plug the hole in expected_loop_iterations ()? That is, why may not loop-header be bogus? Isn't it maybe the bounding It is autoFDO thing. Dehao is basically pushing out changes that should make compiler more sane about bogus profiles. At the moment we have tests bb-count != 0 to figure out if the profile is reliable. I.e. we assume that profile feedback is always good, guessed profile is never good. Loop optimizers then do not trust guessed profile to give realistic estimates on number of iterations and bail out into simple heuristics for estimated profiles that are usually not good on this job - i.e. int niters = 0; if (desc-const_iter) niters = desc-niter; else if (loop-header-count) niters = expected_loop_iterations (loop); ... With FDO this change, as the FDO profiles are sometimes bogus (and there is not much to do about it). I would say that for loop optimization, we want a flag whether expected number of iterations is reliable. We probably want if (expected_loop_iterations_reliable_p (loop)) niters = expected_loop_iterations (loop); But why not simply never return bogus values from expected-loop-iterations? We probably want a flag in the .gcda file on whether it was from auto-fdo and only not trust profiles from those. Richard. We probably also want to store this information into loop structure rather than computing it all the time from profile, since the profile may get inaccurate and we already know how to maintain upper bounds on numbers of iterations, so it should be easy to implement. This would allow us to preserve info like for (i=0 ;i __bulitin_expect (n,10); i++) that would be nice feature to have. Honza you run into? /* Returns expected number of LOOP iterations. The returned value is bounded by REG_BR_PROB_BASE. */ unsigned expected_loop_iterations (const struct loop *loop) { gcov_type expected = expected_loop_iterations_unbounded (loop); return (expected REG_BR_PROB_BASE ? REG_BR_PROB_BASE : expected); } I miss a testcase as well. Richard. Thanks, Dehao gcc/ChangeLog: 2014-05-21 Dehao Chen de...@google.com * cfgloop.h (expected_loop_iterations_reliable_p): New func. * cfgloopanal.c (expected_loop_iterations_reliable_p): Likewise. * loop-unroll.c (decide_unroll_runtime_iterations): Disable unroll loop that has unreliable iteration counts. Index: gcc/cfgloop.h === --- gcc/cfgloop.h (revision 210717) +++ gcc/cfgloop.h (working copy) @@ -307,8 +307,8 @@ extern bool just_once_each_iteration_p (const stru gcov_type expected_loop_iterations_unbounded (const struct loop *); extern unsigned expected_loop_iterations (const struct loop *); extern rtx doloop_condition_get (rtx); +extern bool expected_loop_iterations_reliable_p (const struct loop *); - /* Loop manipulation. */ extern bool can_duplicate_loop_p (const struct loop *loop); Index: gcc/cfgloopanal.c === --- gcc/cfgloopanal.c (revision 210717) +++ gcc/cfgloopanal.c (working copy) @@ -285,6 +285,15 @@ expected_loop_iterations (const struct loop *loop) return (expected REG_BR_PROB_BASE ? REG_BR_PROB_BASE : expected); } +/* Returns true if the loop header's profile count is smaller than expected + loop iteration. */ + +bool +expected_loop_iterations_reliable_p (const struct loop *loop) +{ + return expected_loop_iterations (loop) loop-header-count; +} + /* Returns the maximum level of nesting of subloops of LOOP. */ unsigned Index: gcc/loop-unroll.c === --- gcc/loop-unroll.c (revision 210717) +++ gcc/loop-unroll.c (working copy) @@ -988,6 +988,15 @@ decide_unroll_runtime_iterations (struct loop *loo return; } + if (profile_status_for_fn (cfun) == PROFILE_READ + expected_loop_iterations_reliable_p (loop)) +{ + if (dump_file) + fprintf (dump_file, ;; Not unrolling loop, loop iteration + not reliable.); + return; +} + /* Check whether the loop rolls. */ if ((get_estimated_loop_iterations (loop, iterations) || get_max_loop_iterations (loop, iterations))
Re: [PATCH] Disable unroll loop that has header count less than iteration count.
if (expected_loop_iterations_reliable_p (loop)) niters = expected_loop_iterations (loop); But why not simply never return bogus values from expected-loop-iterations? Hmm, actually we do have: /* If we have a measured profile, use it to estimate the number of iterations. */ if (loop-header-count != 0) { gcov_type nit = expected_loop_iterations_unbounded (loop) + 1; bound = gcov_type_to_wide_int (nit); record_niter_bound (loop, bound, true, false); } and get_estimated_loop_iterations that has the behaviour intended. Forgot about this. So probably we want to revisit remaining uses of expected_loop_iterations and use get_estimated_loop_iterations (most of compiler actually does that). I did some of these changes in past, so there are not many left. I would move the logic setting the actual estimate based on profile from estimate_numbers_of_iterations_loop into tree-profile pass (i.e. do it once at profile load time and maintain it all the way through compilation them, such as in inlining). AtoFDO can do its own analysis: I suppose loop count is known to autoFDO when it can find source line that is always executed in the loop and source line that is known to have same count as header. This may be implementable as a separate analysis rather than having heuristic based on overall sanity of the profile around the loop. Honza We probably want a flag in the .gcda file on whether it was from auto-fdo and only not trust profiles from those. Richard. We probably also want to store this information into loop structure rather than computing it all the time from profile, since the profile may get inaccurate and we already know how to maintain upper bounds on numbers of iterations, so it should be easy to implement. This would allow us to preserve info like for (i=0 ;i __bulitin_expect (n,10); i++) that would be nice feature to have. Honza you run into? /* Returns expected number of LOOP iterations. The returned value is bounded by REG_BR_PROB_BASE. */ unsigned expected_loop_iterations (const struct loop *loop) { gcov_type expected = expected_loop_iterations_unbounded (loop); return (expected REG_BR_PROB_BASE ? REG_BR_PROB_BASE : expected); } I miss a testcase as well. Richard. Thanks, Dehao gcc/ChangeLog: 2014-05-21 Dehao Chen de...@google.com * cfgloop.h (expected_loop_iterations_reliable_p): New func. * cfgloopanal.c (expected_loop_iterations_reliable_p): Likewise. * loop-unroll.c (decide_unroll_runtime_iterations): Disable unroll loop that has unreliable iteration counts. Index: gcc/cfgloop.h === --- gcc/cfgloop.h (revision 210717) +++ gcc/cfgloop.h (working copy) @@ -307,8 +307,8 @@ extern bool just_once_each_iteration_p (const stru gcov_type expected_loop_iterations_unbounded (const struct loop *); extern unsigned expected_loop_iterations (const struct loop *); extern rtx doloop_condition_get (rtx); +extern bool expected_loop_iterations_reliable_p (const struct loop *); - /* Loop manipulation. */ extern bool can_duplicate_loop_p (const struct loop *loop); Index: gcc/cfgloopanal.c === --- gcc/cfgloopanal.c (revision 210717) +++ gcc/cfgloopanal.c (working copy) @@ -285,6 +285,15 @@ expected_loop_iterations (const struct loop *loop) return (expected REG_BR_PROB_BASE ? REG_BR_PROB_BASE : expected); } +/* Returns true if the loop header's profile count is smaller than expected + loop iteration. */ + +bool +expected_loop_iterations_reliable_p (const struct loop *loop) +{ + return expected_loop_iterations (loop) loop-header-count; +} + /* Returns the maximum level of nesting of subloops of LOOP. */ unsigned Index: gcc/loop-unroll.c === --- gcc/loop-unroll.c (revision 210717) +++ gcc/loop-unroll.c (working copy) @@ -988,6 +988,15 @@ decide_unroll_runtime_iterations (struct loop *loo return; } + if (profile_status_for_fn (cfun) == PROFILE_READ + expected_loop_iterations_reliable_p (loop)) +{ + if (dump_file) + fprintf (dump_file, ;; Not unrolling loop, loop iteration + not reliable.); + return; +} + /* Check whether the loop rolls. */ if ((get_estimated_loop_iterations (loop, iterations) || get_max_loop_iterations (loop, iterations))
Re: [PATCH][3/n] Always 64bit-HWI cleanups (drop HOST_WIDEST_INT)
On 05/23/14 07:49, Richard Biener wrote: This patch does the exercise of a grand rename and drops HOST_WIDEST_INT which is equal to HOST_WIDE_INT now. But we use [u]int64_t and the C99 inttypes.h PRI[dux]64 printf modifiers (which I provide in hwint.h if they are not available). Certainly most of the HOST_WIDEST_INT was to get reliable 64bit counters for debug counting and printing. Bootstrap and regtest ongoing on x86_64-unknown-linux-gnu, ok? Will do a mmix cross-cc1 as well, just to make sure I didn't mess up anything there. Thanks, Richard. 2014-05-23 Richard Biener rguent...@suse.de * system.h: Define __STDC_FORMAT_MACROS before including inttypes.h. * hwint.h (HOST_WIDEST_INT, HOST_BITS_PER_WIDEST_INT, HOST_WIDEST_INT_PRINT, HOST_WIDEST_INT_PRINT_DEC, HOST_WIDEST_INT_PRINT_DEC_C, HOST_WIDEST_INT_PRINT_UNSIGNED, HOST_WIDEST_INT_PRINT_HEX, HOST_WIDEST_INT_PRINT_DOUBLE_HEX, HOST_WIDEST_INT_C): Remove. (PRId64, PRIi64, PRIo64, PRIu64, PRIx64, PRIX64): Define if C99 inttypes.h is not available. * coretypes.h (gcov_type, gcov_type_unsigned): Use [u]int64_t. * gcov-io.h (gcov_type, gcov_type_unsigned): Likewise. * cfgloop.h (struct niter_desc): Use uint64_t for niter field. * bitmap.c (struct bitmap_descriptor_d): Use uint64_t for counters. (struct output_info): Likewise. (print_statistics): Adjust. (dump_bitmap_statistics): Likewise. * bt-load.c (migrate_btr_defs): Print with PRId64. * cfg.c (dump_edge_info, dump_bb_info): Likewise. (MAX_SAFE_MULTIPLIER): Adjust. * cfghooks.c (dump_bb_for_graph): Print with PRId64. * cgraph.c (cgraph_redirect_edge_call_stmt_to_callee, dump_cgraph_node): Likewise. * final.c (dump_basic_block_info): Likewise. * gcov-dump.c (tag_counters, tag_summary, dump_working_sets): Likewise. * gcov.c (format_gcov): Likewise. * ipa-cp.c (good_cloning_opportunity_p): Likewise. Use int64_t for calculation. (get_clone_agg_value): Use HOST_WIDE_INT for offset. * ipa-inline.c (compute_max_insns): Use int64_t for calcuation. (inline_small_functions, dump_overall_stats, dump_inline_stats): Use PRId64 for dumping. * ipa-profile.c (dump_histogram, ipa_profile): Likewise. * ira-color.c (struct allocno_hard_regs): Use int64_t for cost. (add_allocno_hard_regs): Adjust. * loop-doloop.c (doloop_modify): Print using PRId64. * loop-iv.c (inverse): Compute in uint64_t. (determine_max_iter, iv_number_of_iterations): Likewise. * loop-unroll.c (decide_peel_completely, decide_peel_simple): Print using PRId64. * lto-streamer-out.c (write_symbol): Use uint64_t. * mcf.c (CAP_INFINITY): Use int64_t maximum. (dump_fixup_edge, create_fixup_graph, cancel_negative_cycle, find_max_flow, adjust_cfg_counts): Use int64_t and dump with PRId64. * modulo-sched.c (const_iteration_count): Use int64_t. (sms_schedule): Dump using PRId64. * predict.c (dump_prediction): Likewise. * pretty-print.h (pp_widest_integer): Remove. * profile.c (get_working_sets, is_edge_inconsistent, is_inconsistent, read_profile_edge_counts): Dump using PRId64. * tree-pretty-print.c (pp_double_int): Remove case handling HOST_BITS_PER_DOUBLE_INT == HOST_BITS_PER_WIDEST_INT. * tree-ssa-math-opts.c (struct symbolic_number): Use uint64_t and adjust users. (pass_optimize_bswap::execute): Remove restriction on hosts. * tree-streamer-in.c (streamer_alloc_tree): Use HOST_WIDE_INT. * tree-streamer-out.c (streamer_write_tree_header): Likewise. * tree.c (widest_int_cst_value): Remove. * tree.h (widest_int_cst_value): Likewise. * value-prof.c (dump_histogram_value): Print using PRId64. * gengtype.c (main): Also inject int64_t. * ggc-page.c (struct max_alignment): Use int64_t. * alloc-pool.c (struct allocation_object_def): Likewise. * ira-conflicts.c (build_conflict_bit_table): Use uint64_t for computation. * doc/tm.texi.in: Remove reference to HOST_WIDEST_INT. * doc/tm.texi: Regenerated. * gengtype-lex.l (IWORD): Handle [u]int64_t. * config/sh/sh.c (expand_cbranchdi4): Use gcov_type. * config/mmix/mmix-protos.h (mmix_intval, mmix_shiftable_wyde_value, mmix_output_register_setting): Use [u]int64_t in prototypes. * config/mmix/mmix.c (mmix_print_operand, mmix_output_register_setting, mmix_shiftable_wyde_value, mmix_output_shiftvalue_op_from_str, mmix_output_octa, mmix_output_shifted_value): Adjust. (mmix_intval): Adjust. Remove unreachable case. lto/ * lto.c (lto_parse_hex): Use int64_t. (lto_resolution_read): Likewise. Given the highly
Re: [PATCH][2/n] Always 64bit-HWI cleanups
On 05/23/14 05:45, Richard Biener wrote: The following changes the configury to insist on [u]int64_t being available and removes the very old __int64 case. Autoconf doesn't check for it, support came in via a big merge in Dec 2002, r60174, and it was never used on the libcpp side until I fixed that with the last patch of this series, so we couldn't have relied on it at least since libcpp was introduced. Both libcpp and vmsdbg.h now use [u]int64_t, switching HOST_WIDE_INT to literally use int64_t has to be done with the grand renaming of all users due to us using 'unsigned HOST_WIDE_INT'. Btw, I couldn't find any standard way of writing [u]int64_t literals (substitution for HOST_WIDE_INT_C) nor one for printf formats (substitutions for HOST_WIDE_INT_PRINT and friends). I'll consider doing s/HOST_WIDE_INT/[U]INT64/ there if nobody comes up with a better plan. Unfortunately any followup will be the whole renaming game at once due to the 'unsigned' issue. I'll make sure to propose a hwint.h-only patch with a renaming guide for review and expect the actual renaming to take place using a script. Bootstrap and regtest running on x86_64-unknown-linux-gnu, ok? After this patch you may use [u]int64_t freely in host sources (lto-plugin already does that - heh). Thanks, Richard. 2014-05-23 Richard Biener rguent...@suse.de libcpp/ * configure.ac: Remove long long and __int64 type checks, add check for uint64_t and fail if that wasn't found. * include/cpplib.h (cpp_num_part): Use uint64_t. * config.in: Regenerate. * configure: Likewise. gcc/ * configure.ac: Drop __int64 type check. Insist that we found uint64_t and int64_t. * hwint.h (HOST_BITS_PER___INT64): Remove. (HOST_BITS_PER_WIDE_INT): Define to 64 and remove __int64 case. (HOST_WIDE_INT_PRINT_*): Remove 32bit case. (HOST_WIDEST_INT*): Define to HOST_WIDE_INT*. (HOST_WIDEST_FAST_INT): Remove __int64 case. * vmsdbg.h (struct _DST_SRC_COMMAND): Use int64_t for dst_q_src_df_rms_cdt. * configure: Regenerate. * config.in: Likewise. OK. Jeff
Re: [PATCH, sched] Cleanup and improve multipass_dfa_lookahead_guard
On 05/23/14 01:35, Maxim Kuvyrkov wrote: On May 23, 2014, at 7:23 PM, Andreas Schwab sch...@linux-m68k.org wrote: ../../gcc/config/ia64/ia64.c: In function 'int ia64_first_cycle_multipass_dfa_lookahead_guard(rtx, int)': ../../gcc/config/ia64/ia64.c:7551:1: error: control reaches end of non-void function [-Werror=return-type] Fixed, sorry about the breakage. The patch is trivial. Thank you, -- Maxim Kuvyrkov www.linaro.org 2014-05-23 Maxim Kuvyrkov maxim.kuvyr...@linaro.org Fix bootstrap error on ia64 * config/ia64/ia64.c (ia64_first_cycle_multipass_dfa_lookahead_guard): Return default value. If you haven't done so already, go ahead and check this in. jeff
Re: [patch i386]: Sibcall tail-call improvement and partial fix PR/60104
On 05/23/14 02:58, Kai Tietz wrote: Hello, yes the underlying issue is the same as for PR/46219. Nevertheless the patch doesn't solve this mentioned PR as I used for know a pretty conservative checking of allowed memories. By extending x86_sibcall_memory_p_1 function about allowing register-arguments too for memory, this problem can be solved. BTW, do you want to add 46219 to your list? At the least, I think we should add the test from 46219 to the suite, xfailed if you don't tackle it as a part of this work. jeff
Re: [patch i386]: Expand sibling-tail-calls via accumulator register
On 05/22/14 16:07, H.J. Lu wrote: On Thu, May 22, 2014 at 2:33 PM, Kai Tietz kti...@redhat.com wrote: Hello, This patch avoids for sibling-tail-calls the use of pseudo-register. Instead it uses for load of memory-address the accumulator-register. By this we avoid that IRA/LRA need to choose a register. So we reduce register-pressure. The accumulator-register is always a valid register on tail-call case. All other registers might be callee-saved, or used for argument-passing. The only case where we would use the accumulator on call is the variadic-case for x86_64 ABI. Just that this function never is a candidate for sibling-tail-calls. ChangeLog 2014-05-22 Kai Tietz kti...@redhat.com * config/i386/i386.c (ix86_expand_call): Enforce for sibcalls on memory the use of accumulator-register. Regression tested for x86_64-unknown-linux-gnu, x86_64-w64-mingw32, and i686-pc-cygwin. Ok for apply? Regards, Kai Index: i386.c === --- i386.c (Revision 210412) +++ i386.c (Arbeitskopie) @@ -24898,8 +24898,19 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx call ? !sibcall_insn_operand (XEXP (fnaddr, 0), word_mode) : !call_insn_operand (XEXP (fnaddr, 0), word_mode)) { + rtx r; fnaddr = convert_to_mode (word_mode, XEXP (fnaddr, 0), 1); - fnaddr = gen_rtx_MEM (QImode, copy_to_mode_reg (word_mode, fnaddr)); + if (!sibcall) + r = copy_to_mode_reg (word_mode, fnaddr); + else + { + r = gen_rtx_REG (word_mode, AX_REG) If fnaddr points to a function with variable argument list in 64-bit, AX_REG may be used to store number of FP arguments passed in registers. Right, but as Kai mentioned earlier, a varardic function should have been rejected by now as a sibcall target. Regardless, I think adding a check here shouldn't hurt and makes the backend more bullet-proof if the target independent bits get smarter in the future. jeff
[Google/4_8] Support for embedding build info into gcda files
Support for embedding arbitrary build information from the profile-generate compile into the gcda file in a new BUILD_INFO record. Lines from a file passed to the -fprofile-generate compile via a new -fprofile-generate-buildinfo=filename option are embedded as strings in the gcov_info struct and emitted as-is to a new GCOV_TAG_BUILD_INFO record. They are ignored on profile-use compiles, but emitted by gcov-dump. This is useful for recording information about, for example, source revision info that can be helpful for diagnosing profile mis-matches. For example: $ cat buildinfo.txt Build timestamp Build source revision r12345 Other random build data $ g++ foo.cc -fprofile-generate -fprofile-generate-buildinfo=buildinfo.txt $ a.out $ gcov-dump foo.gcda foo.gcda:data:magic `gcda':version `408*' foo.gcda:stamp 708902860 foo.gcda: a300: 22:PROGRAM_SUMMARY checksum=0x86a3bc55 foo.gcda: counts=1, runs=1, sum_all=1, run_max=1, sum_max=1 foo.gcda: counter histogram: foo.gcda: 1: num counts=1, min counter=1, cum_counter=1 foo.gcda: a700: 24:BUILD INFO num_strings=3 foo.gcda: Build timestamp foo.gcda: Build source revision r12345 foo.gcda: Other random build data foo.gcda: 0100: 3:FUNCTION ident=1, lineno_checksum=0x17c79156, cfg_checksum=0xdb5de9e8 foo.gcda: 01a1: 2:COUNTERS arcs 1 counts Tested manually, passes regression tests. Ok for Google/4_8? Thanks, Teresa -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413 Support for embedding arbitrary build information from the profile-generate compile into the gcda file in a new BUILD_INFO record. Lines from a file passed to the -fprofile-generate compile via a new -fprofile-generate-buildinfo=filename option are embedded as strings in the gcov_info struct and emitted as-is to a new GCOV_TAG_BUILD_INFO record. They are ignored on profile-use compiles, but emitted by gcov-dump. This is useful for recording information about, for example, source revision info that can be helpful for diagnosing profile mis-matches. For example: $ cat buildinfo.txt Build timestamp Build source revision r12345 Other random build data $ g++ foo.cc -fprofile-generate -fprofile-generate-buildinfo=buildinfo.txt $ a.out $ gcov-dump foo.gcda foo.gcda:data:magic `gcda':version `408*' foo.gcda:stamp 708902860 foo.gcda: a300: 22:PROGRAM_SUMMARY checksum=0x86a3bc55 foo.gcda: counts=1, runs=1, sum_all=1, run_max=1, sum_max=1 foo.gcda: counter histogram: foo.gcda: 1: num counts=1, min counter=1, cum_counter=1 foo.gcda: a700: 24:BUILD INFO num_strings=3 foo.gcda: Build timestamp foo.gcda: Build source revision r12345 foo.gcda: Other random build data foo.gcda: 0100: 3:FUNCTION ident=1, lineno_checksum=0x17c79156, cfg_checksum=0xdb5de9e8 foo.gcda: 01a1: 2:COUNTERS arcs 1 counts Tested manually, passes regression tests. Ok for Google/4_8? Thanks, Teresa 2014-05-23 Teresa Johnson tejohn...@google.com Google ref b/14794433 * gcc/common.opt (flag_profile_generate_buildinfo): New flag. * gcc/coverage.c (read_counts_file): Handle build info tag. (build_info_type): Initialize new build_info gcov_info fields. (build_info): Ditto. (str_list_append): Move earlier in file. (read_buildinfo): New function. (coverage_obj_init): Handle flag_profile_generate_buildinfo. * gcc/gcov.c (read_count_file): Handle build info tag. * gcc/gcov-dump.c (tag_table): Ditto. (tag_build_info): New function. * gcc/gcov-io.c (gcov_compute_string_array_len): Outline from gcov_write_module_info. (gcov_write_string_array): Ditto. (gcov_read_string_array): Outline from gcov_read_module_info. (gcov_read_build_info): New function. (gcov_read_module_info): Invoke outlined gcov_read_string_array. * gcc/gcov-io.h (GCOV_TAG_BUILD_INFO): New tag. (gcov_read_build_info): Declare. (gcov_read_string_array): Ditto. (gcov_compute_string_array_len): Ditto. (gcov_write_string_array): Ditto. * libgcc/dyn-ipa.c (gcov_write_module_info): Invoke outlined gcov_compute_string_array_len and gcov_write_string_array. * libgcc/libgcov-driver.c (gcov_exit_merge_gcda): Read build info. (gcov_write_build_info): New function. (gcov_exit_write_gcda): Write build info. * libgcc/libgcov.h (struct gcov_info): Add new build info fields. Index: gcc/common.opt === --- gcc/common.opt (revision 210862) +++ gcc/common.opt (working copy) @@ -1798,6 +1798,10 @@ fprofile-generate-sampling Common Var(flag_profile_generate_sampling) Turn on instrumentation sampling with -fprofile-generate with rate set by --param
Re: [patch i386]: Expand sibling-tail-calls via accumulator register
On 05/22/14 15:33, Kai Tietz wrote: Hello, This patch avoids for sibling-tail-calls the use of pseudo-register. Instead it uses for load of memory-address the accumulator-register. By this we avoid that IRA/LRA need to choose a register. So we reduce register-pressure. The accumulator-register is always a valid register on tail-call case. All other registers might be callee-saved, or used for argument-passing. The only case where we would use the accumulator on call is the variadic-case for x86_64 ABI. Just that this function never is a candidate for sibling-tail-calls. ChangeLog 2014-05-22 Kai Tietz kti...@redhat.com * config/i386/i386.c (ix86_expand_call): Enforce for sibcalls on memory the use of accumulator-register. I'm generally not a fan of explicitly mentioning hard registers in RTL. Though most of the major problems related to doing that have been resolved through the years. Regression tested for x86_64-unknown-linux-gnu, x86_64-w64-mingw32, and i686-pc-cygwin. Ok for apply? In the interest of defensive programming, can you verify that fnaddr doesn't refer to a varardic function? Hmm, I guess we can't get to a signature here. So, never mind. So I think the way to go is to ensure that the x86 port always rejects sibcalls to a varardic target, which I think can be done in ix86-function_ok_for_sibcall. With that change this patch should be OK. But please post it one more time for a final review. jeff
Re: RFA: cache enabled attribute by insn code
On 05/20/14 15:36, Richard Sandiford wrote: This is OK for the trunk (referring to the follow-up message which fixed EWRONGPATCH. Sorry, while working on the follow-up LRA patch, I realised I hadn't accounted for target changes that happen directly via target_reinit (rather than SWITCHABLE_TARGETS) and cases where reinit_regs is called to change just the register information. Both could potentially affect the enabled attribute. This version adds a recog_init function that clears the data if necessary. There are no other changes from first time. Is this still OK? Thanks for letting me know, that's a minor twiddle -- the patch is still OK. Jeff
Re: [PATCH] Fix ICE in rtl-optimization/PR61220, PR61225
On 05/20/14 20:12, Zhenqiang Chen wrote: In the code, there are 4 combinations of EDGE_COUNT: 1, 1, 1, 2, 2, 1 and 2, 2. 2, 1 is illegal. 2, 2 is legal, but need split_edge. 1, * can bypass the second check. EDGE_CRITICAL_P can only distinguish 2, 2 from others. So I think two explicit checks is more efficient than EDGE_CRITICAL_P. OK. It was more a question of clarity than efficiency. Your patch is fine as-is. jeff