Re: [PATCH] Check the STRING_CSTs in varasm.c
On 08/03/18 23:36, Jeff Law wrote: > On 08/01/2018 05:35 AM, Bernd Edlinger wrote: >> Hi, >> >> this completes the previous patches, and adds a check in varasm.c >> that ensures that all string constants are NUL terminated, >> And that varasm does not strip anything but _exactly_ one NUL >> character. >> >> Bootstrapped and reg-tested on x86_64-pc-linux-gnu. >> Is it OK for trunk? >> >> >> Thanks >> Bernd. >> >> >> patch-varasm.diff >> >> >> 2018-08-01 Bernd Edlinger >> >> * varasm.c (check_string_literal): New checking function. >> (output_constant): Use it. > My biggest concern here is that we've explicitly defined STRING_CST > nodes as not needing NUL termination. See generic.texi. That explicit > definition is almost certainly derived from existing practice that long > pre-dates the introduction of gimple/generic. > > Even so I'm generally OK with the concept of tightening the rules here. > If we need to fault in more fixes, that's fine with me. I'm assuming > that you and Ian have sorted out what to do WRT Go. > > If we decide to go forward, you'll definitely want to update the > STRING_CST documentation in generic.texi. > This is the same patch with STRING_CST documentation updated. Bernd. 2018-08-01 Bernd Edlinger * doc/generic.texi (STRING_CST): Update. * varasm.c (check_string_literal): New checking function. (output_constant): Use it. diff -pur gcc/varasm.c gcc/varasm.c --- gcc/varasm.c 2018-07-17 11:19:27.0 +0200 +++ gcc/varasm.c 2018-07-31 10:16:12.058827505 +0200 @@ -4774,6 +4774,29 @@ initializer_constant_valid_for_bitfield_ return false; } +/* Check if a STRING_CST fits into the field. + Tolerate only the case when the NUL termination + does not fit into the field. 
 */
+
+bool
+check_string_literal (tree string, unsigned HOST_WIDE_INT size)
+{
+  tree eltype = TREE_TYPE (TREE_TYPE (string));
+  unsigned HOST_WIDE_INT elts = tree_to_uhwi (TYPE_SIZE_UNIT (eltype));
+  const char *p = TREE_STRING_POINTER (string);
+  int len = TREE_STRING_LENGTH (string);
+
+  if (elts != 1 && elts != 2 && elts != 4)
+    return false;
+  if (len <= 0 || len % elts != 0)
+    return false;
+  if ((unsigned)len != size && (unsigned)len != size + elts)
+    return false;
+  if (memcmp (p + len - elts, "\0\0\0\0", elts) != 0)
+    return false;
+  return true;
+}
+
 /* output_constructor outer state of relevance in recursive calls,
    typically for nested aggregate bitfields.  */
@@ -4942,6 +4965,7 @@ output_constant (tree exp, unsigned HOST
     case STRING_CST:
       thissize = MIN ((unsigned HOST_WIDE_INT)TREE_STRING_LENGTH (exp), size);
+      gcc_checking_assert (check_string_literal (exp, thissize));
       assemble_string (TREE_STRING_POINTER (exp), thissize);
       break;
     case VECTOR_CST:
Index: gcc/doc/generic.texi
===
--- gcc/doc/generic.texi (revision 263072)
+++ gcc/doc/generic.texi (working copy)
@@ -1162,9 +1162,13 @@ These nodes represent string-constants. The @code
 returns the length of the string, as an @code{int}.  The
 @code{TREE_STRING_POINTER} is a @code{char*} containing the string
 itself.  The string may not be @code{NUL}-terminated, and it may contain
-embedded @code{NUL} characters.  Therefore, the
-@code{TREE_STRING_LENGTH} includes the trailing @code{NUL} if it is
-present.
+embedded @code{NUL} characters.  However, the
+@code{TREE_STRING_LENGTH} always includes a trailing @code{NUL} that
+is not part of the language string literal but appended by the front end.
+If the string shall not be @code{NUL}-terminated the @code{TREE_TYPE}
+is one character shorter than @code{TREE_STRING_LENGTH}.
+Excess characters other than one trailing @code{NUL} character are not
+permitted.
 For wide string constants, the @code{TREE_STRING_LENGTH} is the
 number of bytes in the string, and the @code{TREE_STRING_POINTER}
@@ -1173,6 +1177,11 @@ target system (that is, as integers in the target
 non-wide string constants are distinguished only by the
 @code{TREE_TYPE} of the @code{STRING_CST}.
+String constants may also be used for other purposes like assembler
+constraints or attributes.  These have no @code{TREE_TYPE} and
+need no explicit @code{NUL}-termination.  Note there is always
+another @code{NUL}-byte beyond @code{TREE_STRING_LENGTH}.
+
 FIXME: The formats of string constants are not well-defined when the
 target system bytes are not the same width as host system bytes.
Re: [PATCH v2 6/7] Remaining support for clobber high
On 07/26/2018 03:13 AM, Alan Hayward wrote: > Add the remainder of clobber high checks. > Happy to split this into smaller patches if required (there didn't > seem anything obvious to split into). > > 2018-07-25 Alan Hayward > > * alias.c (record_set): Check for clobber high. > * cfgexpand.c (expand_gimple_stmt): Likewise. > * combine-stack-adj.c (single_set_for_csa): Likewise. > * combine.c (find_single_use_1): Likewise. > (set_nonzero_bits_and_sign_copies): Likewise. > (get_combine_src_dest): Likewise. > (is_parallel_of_n_reg_sets): Likewise. > (try_combine): Likewise. > (record_dead_and_set_regs_1): Likewise. > (reg_dead_at_p_1): Likewise. > (reg_dead_at_p): Likewise. > * dce.c (deletable_insn_p): Likewise. > (mark_nonreg_stores_1): Likewise. > (mark_nonreg_stores_2): Likewise. > * df-scan.c (df_find_hard_reg_defs): Likewise. > (df_uses_record): Likewise. > (df_get_call_refs): Likewise. > * dwarf2out.c (mem_loc_descriptor): Likewise. > * haifa-sched.c (haifa_classify_rtx): Likewise. > * ira-build.c (create_insn_allocnos): Likewise. > * ira-costs.c (scan_one_insn): Likewise. > * ira.c (equiv_init_movable_p): Likewise. > (rtx_moveable_p): Likewise. > (interesting_dest_for_shprep): Likewise. > * jump.c (mark_jump_label_1): Likewise. > * postreload-gcse.c (record_opr_changes): Likewise. > * postreload.c (reload_cse_simplify): Likewise. > (struct reg_use): Add source expr. > (reload_combine): Check for clobber high. > (reload_combine_note_use): Likewise. > (reload_cse_move2add): Likewise. > (move2add_note_store): Likewise. > * print-rtl.c (print_pattern): Likewise. > * recog.c (decode_asm_operands): Likewise. > (store_data_bypass_p): Likewise. > (if_test_bypass_p): Likewise. > * regcprop.c (kill_clobbered_value): Likewise. > (kill_set_value): Likewise. > * reginfo.c (reg_scan_mark_refs): Likewise. > * reload1.c (maybe_fix_stack_asms): Likewise. > (eliminate_regs_1): Likewise. > (elimination_effects): Likewise. > (mark_not_eliminable): Likewise. 
> (scan_paradoxical_subregs): Likewise. > (forget_old_reloads_1): Likewise. > * reorg.c (find_end_label): Likewise. > (try_merge_delay_insns): Likewise. > (redundant_insn): Likewise. > (own_thread_p): Likewise. > (fill_simple_delay_slots): Likewise. > (fill_slots_from_thread): Likewise. > (dbr_schedule): Likewise. > * resource.c (update_live_status): Likewise. > (mark_referenced_resources): Likewise. > (mark_set_resources): Likewise. > * rtl.c (copy_rtx): Likewise. > * rtlanal.c (reg_referenced_p): Likewise. > (single_set_2): Likewise. > (noop_move_p): Likewise. > (note_stores): Likewise. > * sched-deps.c (sched_analyze_reg): Likewise. > (sched_analyze_insn): Likewise. I only spot-checked on this. It's looks like it's relatively mechanical. I'm going to assume you fixed all the places that needed fixing. OK Jeff
Re: [PATCH v2 7/7] Enable clobber high for tls descs on Aarch64
On 07/26/2018 03:13 AM, Alan Hayward wrote: > Add the clobber high expressions to tls_desc for aarch64. > It also adds three tests. > > In addition I also tested by taking the gcc torture test suite and making > all global variables __thread. Then emended the suite to compile with -fpic, > save the .s file and only for one given O level. > I ran this before and after the patch and compared the resulting .s files, > ensuring that there were no ASM changes. > I discarded the 10% of tests that failed to compile (due to the code in > the test now being invalid C). > I did this for O0,O2,O3 on both x86 and aarch64 and observed no difference > between ASM files before and after the patch. > > Alan. > > 2018-07-25 Alan Hayward > > gcc/ > * config/aarch64/aarch64.md: Add clobber highs to tls_desc. > > gcc/testsuite/ > * gcc.target/aarch64/sve_tls_preserve_1.c: New test. > * gcc.target/aarch64/sve_tls_preserve_2.c: New test. > * gcc.target/aarch64/sve_tls_preserve_3.c: New test. AArch64 specific, so leaving it to the AArch64 maintainers to handle review. jeff
Re: [PATCH v2 5/7] cse support for clobber_high
On 07/26/2018 03:13 AM, Alan Hayward wrote: > The cse specific changes for clobber_high. > > 2018-07-25 Alan Hayward > > * cse.c (invalidate_reg): New function extracted from... > (invalidate): ...here. > (canonicalize_insn): Check for clobber high. > (invalidate_from_clobbers): invalidate clobber highs. > (invalidate_from_sets_and_clobbers): Likewise. > (count_reg_usage): Check for clobber high. > (insn_live_p): Likewise. > * cselib.c (cselib_expand_value_rtx_1):Likewise. > (cselib_invalidate_regno): Check for clobber in setter. > (cselib_invalidate_rtx): Pass through setter. > (cselib_invalidate_rtx_note_stores): > (cselib_process_insn): Check for clobber high. > * cselib.h (cselib_invalidate_rtx): Add operand. OK. jeff
Re: [PATCH v2 4/7] lra support for clobber_high
On 07/26/2018 03:13 AM, Alan Hayward wrote: > The lra specific changes for clobber_high. > > 2018-07-25 Alan Hayward > > * lra-eliminations.c (lra_eliminate_regs_1): Check for clobber high. > (mark_not_eliminable): Likewise. > * lra-int.h (struct lra_insn_reg): Add clobber high marker. > * lra-lives.c (process_bb_lives): Check for clobber high. > * lra.c (new_insn_reg): Remember clobber highs. > (collect_non_operand_hard_regs): Check for clobber high. > (lra_set_insn_recog_data): Likewise. > (add_regs_to_insn_regno_info): Likewise. > (lra_update_insn_regno_info): Likewise. OK. jeff
Re: [PATCH v2 3/7] Add func to check if register is clobbered by clobber_high
On 07/26/2018 03:13 AM, Alan Hayward wrote: > Given a CLOBBER_HIGH expression and a register, it checks if > the register will be clobbered. > > A second version exists for the cases where the expressions are > not available. > > The function will be used throughout the following patches. > > 2018-07-25 Alan Hayward > > * rtl.h (reg_is_clobbered_by_clobber_high): Add declarations. > * rtlanal.c (reg_is_clobbered_by_clobber_high): Add function. OK jeff
Re: [PATCH v2 2/7] Generation support for CLOBBER_HIGH
On 07/26/2018 03:13 AM, Alan Hayward wrote: > Ensure clobber high is a register expression. > Info is passed through for the error case. > > 2018-07-25 Alan Hayward > > * emit-rtl.c (verify_rtx_sharing): Check for CLOBBER_HIGH. > (copy_insn_1): Likewise. > (gen_hard_reg_clobber_high): New gen function. > * genconfig.c (walk_insn_part): Check for CLOBBER_HIGH. > * genemit.c (gen_exp): Likewise. > (gen_emit_seq): Pass through info. > (gen_insn): Check for CLOBBER_HIGH. > (gen_expand): Pass through info. > (gen_split): Likewise. > (output_add_clobbers): Likewise. > * genrecog.c (validate_pattern): Check for CLOBBER_HIGH. > (remove_clobbers): Likewise. > * rtl.h (gen_hard_reg_clobber_high): New declaration. > --- > gcc/emit-rtl.c | 16 > gcc/genconfig.c | 1 + > gcc/genemit.c | 51 +++ > gcc/genrecog.c | 3 ++- > gcc/rtl.h | 1 + > 5 files changed, 51 insertions(+), 21 deletions(-) > > diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c > index e4b070486e8..6a32bcbdaf6 100644 > --- a/gcc/emit-rtl.c > +++ b/gcc/emit-rtl.c > @@ -6508,6 +6511,19 @@ gen_hard_reg_clobber (machine_mode mode, unsigned int > regno) > gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (mode, regno))); > } > > +static GTY((deletable)) rtx > +hard_reg_clobbers_high[NUM_MACHINE_MODES][FIRST_PSEUDO_REGISTER]; > + > +rtx > +gen_hard_reg_clobber_high (machine_mode mode, unsigned int regno) > +{ > + if (hard_reg_clobbers_high[mode][regno]) > +return hard_reg_clobbers_high[mode][regno]; > + else > +return (hard_reg_clobbers_high[mode][regno] > + = gen_rtx_CLOBBER_HIGH (VOIDmode, gen_rtx_REG (mode, regno))); > +} You need a function comment on gen_hard_reg_clobber_high. OK with that change. jeff
Re: [PATCH v2 1/7] Add CLOBBER_HIGH expression
On 07/26/2018 03:13 AM, Alan Hayward wrote: > Includes documentation. > > 2018-07-25 Alan Hayward > > * doc/rtl.texi (clobber_high): Add. > (parallel): Add in clobber high > * rtl.c (rtl_check_failed_code3): Add function. > * rtl.def (CLOBBER_HIGH): Add expression. > * rtl.h (RTL_CHECKC3): Add macro. > (rtl_check_failed_code3): Add declaration. > (XC3EXP): Add macro. OK. jeff
[nios2, committed] Define TARGET_HAVE_SPECULATION_SAFE_VALUE
I've checked in this patch to fix the c-c++-common/spec-barrier-1.c test failure on nios2. Per previous discussions about Spectre vulnerabilities with Altera/Intel, this architecture is not affected so no special handling is required here. -Sandra 2018-08-03 Sandra Loosemore gcc/ * config/nios2/nios2.c (TARGET_HAVE_SPECULATION_SAFE_VALUE): Define. Index: gcc/config/nios2/nios2.c === --- gcc/config/nios2/nios2.c (revision 263298) +++ gcc/config/nios2/nios2.c (working copy) @@ -5572,6 +5572,9 @@ nios2_adjust_reg_alloc_order (void) #undef TARGET_CONSTANT_ALIGNMENT #define TARGET_CONSTANT_ALIGNMENT constant_alignment_word_strings +#undef TARGET_HAVE_SPECULATION_SAFE_VALUE +#define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed + struct gcc_target targetm = TARGET_INITIALIZER; #include "gt-nios2.h"
Re: Async I/O patch with compilation fix
Hi Christophe, this is seriously weird - there is not even an I/O statement in that test case. One question: Is this real hardware or an emulator? Also, could you try a few things? Run the test case manually. Does it still fail? Is there an error if the executable is run under valgrind? If you have two compilers, one with the patch and one without: Is there a difference in the generated files for -fdump-tree-original, -fdump-tree-optimized and -S? Regards, Thomas
Re: [PATCH, testsuite]: Fix PR 86153, test case g++.dg/pr83239.C fails
On 08/01/2018 04:34 AM, Uros Bizjak wrote: > Hello! > > The testcase fails with: > > FAIL: g++.dg/pr83239.C -std=gnu++11 scan-tree-dump-not optimized > "_ZNSt6vectorIiSaIiEE17_M_default_appendEm" > FAIL: g++.dg/pr83239.C -std=gnu++14 scan-tree-dump-not optimized > "_ZNSt6vectorIiSaIiEE17_M_default_appendEm" > > the test depends on _M_default_append to be inlined, so it verifies > the inlining with: > > // Verify that std::vector::_M_default_append() has been inlined > // (the absence of warnings depends on it). > // { dg-final { scan-tree-dump-not > "_ZNSt6vectorIiSaIiEE17_M_default_appendEm" optimized } } > // { dg-final { scan-tree-dump-not > "_ZNSt6vectorIPvSaIS0_EE17_M_default_appendEm" optimized } } > > However, this is not the case with the default -finline-limit, so > raise it to 500 (the same approach is taken in g++.dg/ > tree-ssa/copyprop.C). > > Unfortunately, the fixed testcase exposes some issue with -std=gnu++98: > > FAIL: g++.dg/pr83239.C -std=gnu++98 (test for excess errors) > > In function 'void test_loop() [with T = int]': > cc1plus: warning: 'void* __builtin_memset(void*, int, long unsigned > int)' specified size 18446744073709551608 exceeds maximum object size > 9223372036854775807 [-Wstringop-overflow=] > In function 'void test_if(std::vector&, int) [with T = long int]': > cc1plus: warning: 'void* __builtin_memset(void*, int, long unsigned > int)' specified size 18446744073709551600 exceeds maximum object size > 9223372036854775807 [-Wstringop-overflow=] > > 2018-08-01 Uros Bizjak > > PR testsuite/86153 > * g++.dg/pr83239.C (dg-options): Add -finline-limit=500. > > OK for mainline and gcc-8 branch? OK. jeff
Re: [PATCH, testsuite]: Fix PR 86153, test case g++.dg/pr83239.C fails
On 08/01/2018 11:21 AM, Martin Sebor wrote: > On 08/01/2018 04:34 AM, Uros Bizjak wrote: >> Hello! >> >> The testcase fails with: >> >> FAIL: g++.dg/pr83239.C -std=gnu++11 scan-tree-dump-not optimized >> "_ZNSt6vectorIiSaIiEE17_M_default_appendEm" >> FAIL: g++.dg/pr83239.C -std=gnu++14 scan-tree-dump-not optimized >> "_ZNSt6vectorIiSaIiEE17_M_default_appendEm" >> >> the test depends on _M_default_append to be inlined, so it verifies >> the inlining with: >> >> // Verify that std::vector::_M_default_append() has been inlined >> // (the absence of warnings depends on it). >> // { dg-final { scan-tree-dump-not >> "_ZNSt6vectorIiSaIiEE17_M_default_appendEm" optimized } } >> // { dg-final { scan-tree-dump-not >> "_ZNSt6vectorIPvSaIS0_EE17_M_default_appendEm" optimized } } >> >> However, this is not the case with the default -finline-limit, so >> raise it to 500 (the same approach is taken in g++.dg/ >> tree-ssa/copyprop.C). >> >> Unfortunately, the fixed testcase exposes some issue with -std=gnu++98: >> >> FAIL: g++.dg/pr83239.C -std=gnu++98 (test for excess errors) >> >> In function 'void test_loop() [with T = int]': >> cc1plus: warning: 'void* __builtin_memset(void*, int, long unsigned >> int)' specified size 18446744073709551608 exceeds maximum object size >> 9223372036854775807 [-Wstringop-overflow=] >> In function 'void test_if(std::vector&, int) [with T = long int]': >> cc1plus: warning: 'void* __builtin_memset(void*, int, long unsigned >> int)' specified size 18446744073709551600 exceeds maximum object size >> 9223372036854775807 [-Wstringop-overflow=] >> >> 2018-08-01 Uros Bizjak >> >> PR testsuite/86153 >> * g++.dg/pr83239.C (dg-options): Add -finline-limit=500. >> >> OK for mainline and gcc-8 branch? > > Thanks for spending time on this! I just looked into it > earlier this week and was going to touch base with Jeff after > he comes back from PTO later this week to see what to do about > the test and the outstanding warning. 
In comment #20 on > pr83239 Jeff said he has a patch for the missed optimization > that presumably is behind the false positive, so that should > presumably fix the other part of the problem here. Yes. It's a relatively simple patch to the VRP bits to evaluate conditionals -- it extends our ability to detect overflow tests and the resultant ranges that inputs must take to trigger overflow. All that really needs to happen is for me to recreate the lost tests and submit the patch. jeff
Re: [PATCH] Check the STRING_CSTs in varasm.c
On 08/03/18 23:36, Jeff Law wrote: > On 08/01/2018 05:35 AM, Bernd Edlinger wrote: >> Hi, >> >> this completes the previous patches, and adds a check in varasm.c >> that ensures that all string constants are NUL terminated, >> And that varasm does not strip anything but _exactly_ one NUL >> character. >> >> Bootstrapped and reg-tested on x86_64-pc-linux-gnu. >> Is it OK for trunk? >> >> >> Thanks >> Bernd. >> >> >> patch-varasm.diff >> >> >> 2018-08-01 Bernd Edlinger >> >> * varasm.c (check_string_literal): New checking function. >> (output_constant): Use it. > My biggest concern here is that we've explicitly defined STRING_CST > nodes as not needing NUL termination. See generic.texi. That explicit > definition is almost certainly derived from existing practice that long > pre-dates the introduction of gimple/generic. > > Even so I'm generally OK with the concept of tightening the rules here. > If we need to fault in more fixes, that's fine with me. I'm assuming > that you and Ian have sorted out what to do WRT Go. > > If we decide to go forward, you'll definitely want to update the > STRING_CST documentation in generic.texi. > Yes, note however that STRING_CSTs have more uses than string literals, for instance ASM constraints, and __attribute__ values, and these do not have explicit NUL termination, and no TREE_TYPE, I'd suppose. I am open to discussing how the string constants should be handled in the middle-end. Bernd.
Re: [PATCH] Fix wrong code with truncated string literals (PR 86711/86714)
On 08/03/18 23:15, Jeff Law wrote: > On 07/30/2018 02:21 PM, Bernd Edlinger wrote: >> On 07/30/18 21:52, Martin Sebor wrote: >>> On 07/30/2018 09:24 AM, Bernd Edlinger wrote: On 07/30/18 01:05, Martin Sebor wrote: > On 07/29/2018 04:56 AM, Bernd Edlinger wrote: >> Hi! >> >> This fixes two wrong code bugs where string_constant >> returns over length string constants. Initializers >> like that are rejected in C++, but valid in C. > > If by valid you are referring to declarations like the one in > the added test: > > const char a[2][3] = { "1234", "xyz" }; > > then (as I explained), the excess elements in "1234" make > the char[3] initialization and thus the test case undefined. > I have resolved bug 86711 as invalid on those grounds. > > Bug 86711 has a valid test case that needs to be fixed, along > with bug 86688 that I raised for the same underlying problem: > considering the excess nul as part of the string. As has been > discussed in a separate bug, rather than working around > the excessively long strings in the middle-end, it would be > preferable to avoid creating them to begin with. > > I'm already working on a fix for bug 86688, in part because > I introduced the code change and also because I'm making other > changes in this area -- bug 86552. Both of these in response > to your comments. > Sorry, I must admit, I have completely lost track on how many things you are trying to work in parallel. Nevertheless I started to review you pr86552 patch here: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01593.html But so far you did not respond to me. Well actually I doubt your patch does apply to trunk, maybe you start to re-base that one, and post it again instead? >>> >>> I read your comments and have been busy working on enhancing >>> the patch (among other things). There are a large number of >>> additional contexts where constant strings are expected and >>> where a missing nul needs to be detected. 
Some include >>> additional instances of strlen calls that my initial patch >>> didn't handle, many more others that involve other string >>> functions. I have posted an updated patch that applies >>> cleanly and that handles the first set. >>> >>> There is also a class of problems involving constant character >>> arrays initialized by a braced list, as in char [] = { x, y, z }; >>> Those are currently not recognized as strings even if they are >>> nul-terminated, but they are far more likely to be a source of >>> these problems than string literals, certainly in C++ where >>> string initializers must fit in the array. I am testing a patch >>> to convert those into STRING_CST so they can be handled as well. >>> >>> Since initializing arrays with more elements than fit is >>> undefined in C and since the behavior is undefined at compile >>> time it seems to me that rejecting such initializers with >>> a hard error (as opposed to a warning) would be appropriate >>> and obviate having to deal with them in the middle-end. >>> >> >> We do not want to change what is currently accepted by the >> front end. period. > ?!? I don't follow this. When we can detect an error, we should issue > a suitable diagnostic. As is mentioned later in the thread, some > diagnostics are considered "pedantic" and are conditional. But I don't > see how you get to "We do not want to change what is currently accepted > by the front end. period." That seems like far too strong of a statement. > What I mean here is: we should fix a middle-end consistency issue with the string constants. When that is done, we can decide to change a pedantic error into a hard error, but IMHO it should be an independent patch of its own. By the way, I found that Fortran has non-zero terminated strings with index range starting from 1. Then there are also overlength strings in Fortran, that have excess precision. Should we forbid that Fortran feature as well? 
Then Ada and Go do not have zero-terminated strings, but I do not know if these can hit the strlen pass. C++ has a -fpermissive option that also allows overlength strings, but all test cases are C only. etc. etc. > I'm not sure I agree with this being a pedantic error only. See below... >> >> But there is no reason why ambiguous string constants >> have to be passed to the middle end. > Agreed -- if we're not outright rejecting, then we should present the > middle end with something reasonable. But > >> >> For instance char c[2] = "a\0"; should look like c[1] = "a"; >> while c[2] = "aaa"; should look like c[2] = "aa"; varasm.c >> will cut the excess precision off anyway. > I get the first case. The second is less clear. One could argue that > c[1] should be NUL or one could argue that c[1] should be 'a'. It's > the inability to know what is "more correct" and the security > implications
Re: [PATCH] Check the STRING_CSTs in varasm.c
On 08/01/2018 05:35 AM, Bernd Edlinger wrote: > Hi, > > this completes the previous patches, and adds a check in varasm.c > that ensures that all string constants are NUL terminated, > And that varasm does not strip anything but _exactly_ one NUL > character. > > Bootstrapped and reg-tested on x86_64-pc-linux-gnu. > Is it OK for trunk? > > > Thanks > Bernd. > > > patch-varasm.diff > > > 2018-08-01 Bernd Edlinger > > * varasm.c (check_string_literal): New checking function. > (output_constant): Use it. My biggest concern here is that we've explicitly defined STRING_CST nodes as not needing NUL termination. See generic.texi. That explicit definition is almost certainly derived from existing practice that long pre-dates the introduction of gimple/generic. Even so I'm generally OK with the concept of tightening the rules here. If we need to fault in more fixes, that's fine with me. I'm assuming that you and Ian have sorted out what to do WRT Go. If we decide to go forward, you'll definitely want to update the STRING_CST documentation in generic.texi. jeff
Re: [PATCH] Fix wrong code with truncated string literals (PR 86711/86714)
On Fri, Aug 03, 2018 at 03:16:41PM -0600, Jeff Law wrote: > On 07/31/2018 12:33 AM, Jakub Jelinek wrote: > > On Mon, Jul 30, 2018 at 10:01:38PM -0600, Martin Sebor wrote: > >>> We do not want to change what is currently accepted by the > >>> front end. period. > >> > >> On whose behalf are you making such categorical statements? > >> It was Jakub and Richard's suggestion in bug 86714 to reject > >> the undefined excessive initializers and I happen to like > >> the idea. I don't recall anyone making a decision about what > > > > It was not my suggestion and it is already rejected with -pedantic-errors, > > which is all that is needed, reject it in pedantic mode. > > Otherwise there is a warning emitted by default which is enough. > But I think that's a mistake. I think a hard error at this point is > warranted and the safest thing to do. The normal behavior when it isn't an error is that the initializer is truncated. That is what happens if I use struct S { char a[4]; }; struct S r = { "abc" }; struct S s = { "abcd" }; struct S t = { "abcde" }; C says that the s.a initializer is actually just 'a', 'b', 'c', 'd' rather than 'a', 'b', 'c', 'd', '\0', so there is a silent truncation in the language naturally, so the extension that truncates even more with a warning enabled by default is IMHO natural and doesn't need to be more pedantic. We've always truncated, so there should be no surprises. > > The suggestion was that if we don't reject (and no reason to change when we > > reject it), that we handle it right, which is what your optimization broke. > But it's not always clear what is right. That's my concern. If we had > rules which were clearly right, then applying those rules and continuing > is much more reasonable. Perhaps we haven't written them down, but we've always behaved that way. clang also truncates with a warning enabled by default, only rejects it like gcc with -pedantic-errors. So does icc. Jakub
Re: [PATCH] Fix wrong code with truncated string literals (PR 86711/86714)
On 07/30/2018 01:52 PM, Martin Sebor wrote: > On 07/30/2018 09:24 AM, Bernd Edlinger wrote: >> On 07/30/18 01:05, Martin Sebor wrote: >>> On 07/29/2018 04:56 AM, Bernd Edlinger wrote: Hi! This fixes two wrong code bugs where string_constant returns over length string constants. Initializers like that are rejected in C++, but valid in C. >>> >>> If by valid you are referring to declarations like the one in >>> the added test: >>> >>> const char a[2][3] = { "1234", "xyz" }; >>> >>> then (as I explained), the excess elements in "1234" make >>> the char[3] initialization and thus the test case undefined. >>> I have resolved bug 86711 as invalid on those grounds. >>> >>> Bug 86711 has a valid test case that needs to be fixed, along >>> with bug 86688 that I raised for the same underlying problem: >>> considering the excess nul as part of the string. As has been >>> discussed in a separate bug, rather than working around >>> the excessively long strings in the middle-end, it would be >>> preferable to avoid creating them to begin with. >>> >>> I'm already working on a fix for bug 86688, in part because >>> I introduced the code change and also because I'm making other >>> changes in this area -- bug 86552. Both of these in response >>> to your comments. >>> >> >> Sorry, I must admit, I have completely lost track on how many things >> you are trying to work in parallel. >> >> Nevertheless I started to review you pr86552 patch here: >> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01593.html >> >> But so far you did not respond to me. >> >> Well actually I doubt your patch does apply to trunk, >> maybe you start to re-base that one, and post it again >> instead? > > I read your comments and have been busy working on enhancing > the patch (among other things). There are a large number of > additional contexts where constant strings are expected and > where a missing nul needs to be detected. 
Some include > additional instances of strlen calls that my initial patch > didn't handle, many more others that involve other string > functions. I have posted an updated patch that applies > cleanly and that handles the first set. So without seeing the code my worry here is we end up with a patch that gets increasingly complex because it's trying to handle a large number of cases. If at all possible let's try to make incremental improvements, focusing first on correctness issues. > > There is also a class of problems involving constant character > arrays initialized by a braced list, as in char [] = { x, y, z }; > Those are currently not recognized as strings even if they are > nul-terminated, but they are far more likely to be a source of > these problems than string literals, certainly in C++ where > string initializers must fit in the array. I am testing a patch > to convert those into STRING_CST so they can be handled as well. This feels like an independent, but very useful change. > Since initializing arrays with more elements than fit is > undefined in C and since the behavior is undefined at compile > time it seems to me that rejecting such initializers with > a hard error (as opposed to a warning) would be appropriate > and obviate having to deal with them in the middle-end. I tend to agree when there's no good set of rules we can apply to allow compilation to continue. However, I think that means getting some consensus to change existing practice where this is just a pedantic error. jeff
Re: [PATCH] Fix wrong code with truncated string literals (PR 86711/86714)
On 07/31/2018 12:33 AM, Jakub Jelinek wrote: > On Mon, Jul 30, 2018 at 10:01:38PM -0600, Martin Sebor wrote: >>> We do not want to change what is currently accepted by the >>> front end. period. >> >> On whose behalf are you making such categorical statements? >> It was Jakub and Richard's suggestion in bug 86714 to reject >> the undefined excessive initializers and I happen to like >> the idea. I don't recall anyone making a decision about what > > It was not my suggestion and it is already rejected with -pedantic-errors, > which is all that is needed, reject it in pedantic mode. > Otherwise there is a warning emitted by default which is enough. But I think that's a mistake. I think a hard error at this point is warranted and the safest thing to do. > > The suggestion was that if we don't reject (and no reason to change when we > reject it), that we handle it right, which is what your optimization broke. But it's not always clear what is right. That's my concern. If we had rules which were clearly right, then applying those rules and continuing is much more reasonable. jeff
Re: [PATCH] Fix wrong code with truncated string literals (PR 86711/86714)
On 07/30/2018 02:21 PM, Bernd Edlinger wrote: > On 07/30/18 21:52, Martin Sebor wrote: >> On 07/30/2018 09:24 AM, Bernd Edlinger wrote: >>> On 07/30/18 01:05, Martin Sebor wrote: On 07/29/2018 04:56 AM, Bernd Edlinger wrote: > Hi! > > This fixes two wrong code bugs where string_constant > returns over length string constants. Initializers > like that are rejected in C++, but valid in C. If by valid you are referring to declarations like the one in the added test: const char a[2][3] = { "1234", "xyz" }; then (as I explained), the excess elements in "1234" make the char[3] initialization and thus the test case undefined. I have resolved bug 86711 as invalid on those grounds. Bug 86711 has a valid test case that needs to be fixed, along with bug 86688 that I raised for the same underlying problem: considering the excess nul as part of the string. As has been discussed in a separate bug, rather than working around the excessively long strings in the middle-end, it would be preferable to avoid creating them to begin with. I'm already working on a fix for bug 86688, in part because I introduced the code change and also because I'm making other changes in this area -- bug 86552. Both of these in response to your comments. >>> >>> Sorry, I must admit, I have completely lost track on how many things >>> you are trying to work in parallel. >>> >>> Nevertheless I started to review you pr86552 patch here: >>> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01593.html >>> >>> But so far you did not respond to me. >>> >>> Well actually I doubt your patch does apply to trunk, >>> maybe you start to re-base that one, and post it again >>> instead? >> >> I read your comments and have been busy working on enhancing >> the patch (among other things). There are a large number of >> additional contexts where constant strings are expected and >> where a missing nul needs to be detected. 
Some include >> additional instances of strlen calls that my initial patch >> didn't handle, many more others that involve other string >> functions. I have posted an updated patch that applies >> cleanly and that handles the first set. >> >> There is also a class of problems involving constant character >> arrays initialized by a braced list, as in char [] = { x, y, z }; >> Those are currently not recognized as strings even if they are >> nul-terminated, but they are far more likely to be a source of >> these problems than string literals, certainly in C++ where >> string initializers must fit in the array. I am testing a patch >> to convert those into STRING_CST so they can be handled as well. >> >> Since initializing arrays with more elements than fit is >> undefined in C and since the behavior is undefined at compile >> time it seems to me that rejecting such initializers with >> a hard error (as opposed to a warning) would be appropriate >> and obviate having to deal with them in the middle-end. >> > > We do not want to change what is currently accepted by the > front end. period. ?!? I don't follow this. When we can detect an error, we should issue a suitable diagnostic. As is mentioned later in the thread, some diagnostics are considered "pedantic" and are conditional. But I don't see how you get to "We do not want to change what is currently accepted by the front end. period." That seems like far too strong of a statement. I'm not sure I agree with this being a pedantic error only. See below... > > But there is no reason why ambiguous string constants > have to be passed to the middle end. Agreed -- if we're not outright rejecting, then we should present the middle end with something reasonable. But > > For instance char c[2] = "a\0"; should look like c[1] = "a"; > while c[2] = "aaa"; should look like c[2] = "aa"; varasm.c > will cut the excess precision off anyway. I get the first case. The second is less clear. 
One could argue that c[1] should be NUL or one could argue that c[1] should be 'a'. It's the inability to know what is "more correct" and the security implications of leaving the string unterminated that drives my opinion that this should not be a pedantic error, but instead a hard error. > > That is TREE_STRING_LENGTH (str) == 3 and TREE_STRING_POINTER(str) = "aa\0"; See above. That's going to leave an unterminated string in the resulting code. That has a negative security impact. > > I propose to have all STRING_CST always be created by the > FE with explicit nul termination, but the > TYPE_SIZE_UNIT (TREE_TYPE (str)) >= TREE_STRING_LENGTH(str) in normal case > (null-terminated) > TYPE_SIZE_UNIT (TREE_TYPE (str)) < TREE_STRING_LENGTH(str) if non zero > terminated, > truncated in the initializer. > > Do you understand what I mean? I think you're starting from the wrong basis, namely that we shouldn't be issuing a hard error for excess initializers like
Re: [PATCH v2] libitm: sh: avoid absolute relocation in shared library (PR 86712)
On 07/28/2018 07:04 AM, slyfox.inbox.ru via gcc-patches wrote: > From: Sergei Trofimovich > > Cc: Andreas Schwab > Cc: Torvald Riegel > Cc: Alexandre Oliva > Cc: Oleg Endo > Cc: Kaz Kojima > Signed-off-by: Sergei Trofimovich > --- > libitm/config/sh/sjlj.S | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/libitm/config/sh/sjlj.S b/libitm/config/sh/sjlj.S > index 043f36749be..f265ab8f898 100644 > --- a/libitm/config/sh/sjlj.S > +++ b/libitm/config/sh/sjlj.S > @@ -53,7 +53,7 @@ _ITM_beginTransaction: > #else > cfi_def_cfa_offset (4*10) > #endif > -#if defined HAVE_ATTRIBUTE_VISIBILITY || !defined __PIC__ > +#if !defined __PIC__ > mov.l .Lbegin, r1 > jsr @r1 > mov r15, r5 > @@ -78,7 +78,7 @@ _ITM_beginTransaction: > > .align 2 > .Lbegin: > -#if defined HAVE_ATTRIBUTE_VISIBILITY || !defined __PIC__ > +#if !defined __PIC__ > .long GTM_begin_transaction > #else > .long GTM_begin_transaction@PCREL-(.Lbegin0-.) > Thanks. I installed this version. jeff
Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552, 86711, 86714) )
On 08/03/2018 07:00 AM, Bernd Edlinger wrote: On 08/02/18 22:34, Martin Sebor wrote: On 08/02/2018 12:56 PM, Bernd Edlinger wrote: On 08/02/18 15:26, Bernd Edlinger wrote: /* If the length can be computed at compile-time, return it. */ - len = c_strlen (src, 0); + tree array; + tree len = c_strlen (src, 0, &array); You know that c_strlen tries to compute wide character sizes, but strlen does not do that, strlen (L"abc") should give 1 (or 0 on a BE machine) I wonder if that is correct. [snip] static tree -fold_builtin_strlen (location_t loc, tree type, tree arg) +fold_builtin_strlen (location_t loc, tree fndecl, tree type, tree arg) { if (!validate_arg (arg, POINTER_TYPE)) return NULL_TREE; else { - tree len = c_strlen (arg, 0); - + tree arr = NULL_TREE; + tree len = c_strlen (arg, 0, &arr); Is it possible to write a test case where strlen(L"test") reaches this point? what will c_strlen return then? Yes, of course it is: $ cat y.c int f(char *x) { return __builtin_strlen(x); } int main () { return f((char*)&L"abcdef"[0]); } $ gcc -O3 -S y.c $ cat y.s main: .LFB1: .cfi_startproc movl $6, %eax ret .cfi_endproc The reason is that c_strlen tries to fold wide chars at all. I do not know when that was introduced, was that already before your last patches? The function itself was introduced in 1992 if not earlier, before wide strings even existed. AFAICS, it has always accepted strings of all widths. Until r241489 (in GCC 7) it computed their length in bytes, not characters. I don't know if that was on purpose or if it was just never changed to compute the length in characters when wide strings were first introduced. From the name I assume it's the latter. The difference wasn't detected until sprintf tests were added for wide string directives. The ChangeLog description for the change reads: Correctly handle wide strings. I didn't consider pathological cases like strlen (L"abc"). It shouldn't be difficult to change to fix this case.
Oh, oh, oh $ cat y3.c int main () { char c[100]; int x = __builtin_sprintf (c, "%S", L"\u"); __builtin_printf("%d %ld\n", x, __builtin_strlen(c)); } $ gcc-4.8 -O3 -std=c99 y3.c $ ./a.out -1 0 $ gcc -O3 y3.c $ ./a.out 1 0 $ echo $LANG de_DE.UTF-8 I would have expected L"\u" to be converted to UTF-8 or another encoding, so the return value of sprintf is far from obvious, and probably locale dependent. Why do you think it is a good idea to use really every opportunity to optimize totally unnecessary things like using the return value from the sprintf function as it is? Did you never think this adds a significant maintenance burden on the rest of us as well? Your condescending tone is uncalled for, and you clearly speak out of ignorance. I don't owe you an explanation but as I have said multiple times: most of my work, including the sprintf pass, is primarily motivated by detecting bugs like buffer overflow. Optimization is only a secondary goal (but bug detection depends on it). It may come as a shock to you but mistakes happen. That's why it's important to make an effort to detect them. This one is a simple typo (handling %S the same way as %s instead of %ls). If you are incapable of a professional tone I would suggest you go harass someone else. Martin
Re: [PATCH] Add fix-it hint for missing return statement in assignment operators (PR c++/85523)
On Tue, 2018-05-01 at 07:18 -0400, Nathan Sidwell wrote: > On 04/30/2018 08:29 PM, David Malcolm wrote: > > Following on from the thread on the "gcc" list here: > > > >https://gcc.gnu.org/ml/gcc/2018-04/msg00172.html > > > > here's an updated version of Jonathan's patch, which: > > + { > > + tree valtype = TREE_TYPE (DECL_RESULT (fndecl)); > > + if (TREE_CODE (valtype) == REFERENCE_TYPE > > + && same_type_ignoring_top_level_qualifiers_p > > + (TREE_TYPE (valtype), TREE_TYPE > > (current_class_ref))) > > While this is probably close enough, you could > *) use convert_to_base to include cases where the return type's a > base > of the current object. > *) convert_to_base would also allow dropping the REFERENCE_TYPE > check > here, so icky code returning by-value could also be caught. > > Up to you. But otherwise ok. Sorry about the belated response; this fell off my radar for some reason. I looked at updating it to support the cases you suggest, but I wasn't happy with issuing the fix-it hint for them, so I've gone with the patch as-is. Committed to trunk as r263298 (after rebasing and re-testing) Thanks Dave
Re: [PATCH] [Ada] Make middle-end string literals NUL terminated
On 08/03/18 18:50, Olivier Hainque wrote: > Hi Bernd, > >> On 31 Jul 2018, at 14:07, Bernd Edlinger wrote: >> >> Hi! >> >> >> This fixes a couple STRING_CST that are not explicitly NUL terminated. >> These were caught in a new check in varasm.c I am currently working on. >> >> Having a NUL terminated string does not change the binary output, but it >> makes things easier for the middle-end. >> >> >> Bootstrapped and reg-tested on x86_64-pc-linux-gnu. >> Is it OK for trunk? > > --- gcc/ada/gcc-interface/utils2.c 2017-12-21 07:57:41.0 +0100 > +++ gcc/ada/gcc-interface/utils2.c 2018-07-31 11:44:01.517117923 +0200 > @@ -1844,7 +1844,7 @@ expand_sloc (Node_Id gnat_node, tree *fi > } > > const int len = strlen (str); > - *filename = build_string (len, str); > + *filename = build_string (len + 1, str); > TREE_TYPE (*filename) = build_array_type (char_type_node, > build_index_type (size_int > (len))); > *line = build_int_cst (NULL_TREE, line_number); > > This one looks good to me. I'm not officially listed as a maintainer > so I'll let Eric have the final word. I'm answering because ... > > > --- gcc/ada/gcc-interface/trans.c 2018-07-17 10:10:04.0 +0200 > +++ gcc/ada/gcc-interface/trans.c 2018-07-31 11:16:27.350728886 +0200 > @@ -6079,7 +6079,7 @@ gnat_to_gnu (Node_Id gnat_node) > where GCC will want to treat it as a C string. */ > string[i] = 0; > > - gnu_result = build_string (length, string); > + gnu_result = build_string (length + 1, string); > > /* Strings in GCC don't normally have types, but we want > this to not be converted to the array type. */ > > We have a local variant of this one, on which I worked after we realized > that it was not enough to achieve the intended effect. > > The difference at this spot is simply that we prevent the +1 if the > original string happens to be explicitly nul terminated already. Like: > > build_string > ((length > 0 && string[length-1] == 0) ?
length : length + 1, > string); > > This is however not enough because the extra zero is only conveyed > through the string value, not the corresponding type, and > > varasm.c: mergeable_string_section > > currently uses the type bounds to search for a zero termination. > > We can't really change the type, so came up with an adjustment to > varasm. The motivation for using the type bounds was > > https://gcc.gnu.org/ml/gcc-patches/2006-06/msg01004.html > > which, IIUC, was tightly linked to string constants used as > initializers for objects which have a size of their own. > (Jakub cc'ed) > > For string constants not used as initializers, it seems that > we might be able to use the string bounds instead, possibly > wider. The attached patch does that and passed testing a few weeks > ago. > > So, while we're at it, does that look right, in light of all the > string length related issues that have been discussed lately ? > > I'm not sure I grasped all the ramifications. > > Thanks in advance! > > > * varasm.c (mergeable_string_section): Accept an extra SCAT > argument conveying the section category from which the request > originates. Only restrict the search to the string type bounds > if we're queried for an initializer. Consider the string bounds > otherwise. > (default_elf_select_section, STR and STR_INIT): Pass the section > category to mergeable_string_section. > > > > > Oh, how interesting. My patch only intends to add a dummy byte that is stripped again in the assembly. The purpose is to make it trivial to find out if a string constant is NUL-terminated in the middle-end by comparing TYPE_SIZE_UNIT (TREE_TYPE (decl)) with TREE_STRING_LENGTH (decl). TYPE_SIZE_UNIT >= TREE_STRING_LENGTH means zero terminated TYPE_SIZE_UNIT < TREE_STRING_LENGTH means not necessarily zero terminated, Such strings shall never chop off more than one nul character. Ada strings are generally not zero terminated, right? 
So in your search loop you would not have to look past type_len, because that byte would be guaranteed to be zero. That is, if we agree that we want to restrict the string constants in that way as I proposed. In the case str_len > type_len I do not understand if that is right. Because varasm will only output type_size bytes, this seems only to select a string section, but the last nul byte will not be assembled. Something that is not contained in the type_size should not be assembled. What exactly do you want to accomplish? Thanks Bernd.
Re: [PATCH][GCC][AArch64] Cleanup the AArch64 testsuite when stack-clash is on [Patch (6/6)]
On 07/24/2018 04:28 AM, tamar.christ...@arm.com wrote: > Hi All, > > This patch cleans up the testsuite when a run is done with stack clash > protection turned on. > > Concretely this switches off -fstack-clash-protection for a couple of tests: > > * sve: We don't yet support stack-clash-protection and sve, so for now turn > these off. > * assembler scan: some tests are quite fragile in that they check for exact >assembly output, e.g. check for exact amount of sub etc. These won't >match now. > * vla: Some of the ubsan tests use negative array indices. Because the arrays > weren't >used before, the incorrect $sp wouldn't have been used. The correct > value is >restored on ret. Now however we probe the $sp which causes a segfault. > * params: When testing the parameters we have to skip these on AArch64 > because of our > custom constraints on them. We already test them separately so > this isn't a > loss. > > Note that the testsuite is not entirely clean due to a gdb failure caused by > alloca with > stack clash. On AArch64 we output an incorrect .loc directive, but this is > already the > case with the current implementation in GCC and is a bug unrelated to this > patch series. > > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no > issues. > Both targets were tested with stack clash on and off by default. > > Ok for trunk? > > Thanks, > Tamar > > gcc/testsuite/ > 2018-07-24 Tamar Christina > > PR target/86486 > * gcc.dg/pr82788.c: Skip for AArch64. > * gcc.dg/guality/vla-1.c: Turn off stack-clash. > * gcc.target/aarch64/subsp.c: Likewise. > * gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise. > * gcc.target/aarch64/sve/mask_struct_store_3.c: Likewise. > * gcc.target/aarch64/sve/mask_struct_store_4.c: Likewise. > * gcc.dg/params/blocksort-part.c: Skip stack-clash checks > on AArch64. > * gcc.dg/stack-check-10.c: Add AArch64 specific checks. > * gcc.dg/stack-check-5.c: Add AArch64 specific checks.
> * gcc.dg/stack-check-6a.c: Skip on AArch64, we don't support this. > * testsuite/lib/target-supports.exp > (check_effective_target_frame_pointer_for_non_leaf): AArch64 does not > require frame pointer for non-leaf functions. > >> -Original Message- >> From: Tamar Christina >> Sent: Wednesday, July 11, 2018 12:23 >> To: gcc-patches@gcc.gnu.org >> Cc: nd ; James Greenhalgh ; >> Richard Earnshaw ; Marcus Shawcroft >> >> Subject: [PATCH][GCC][AArch64] Cleanup the AArch64 testsuite when stack- >> clash is on [Patch (6/6)] >> >> Hi All, >> >> This patch cleans up the testsuite when a run is done with stack clash >> protection turned on. >> >> Concretely this switches off -fstack-clash-protection for a couple of tests: >> >> * sve: We don't yet support stack-clash-protection and sve, so for now turn >> these off. >> * assembler scan: some tests are quite fragile in that they check for exact >>assembly output, e.g. check for exact amount of sub etc. These won't >>match now. >> * vla: Some of the ubsan tests negative array indices. Because the arrays >> weren't >>used before the incorrect $sp wouldn't have been used. The correct >> value is >>restored on ret. Now however we probe the $sp which causes a >> segfault. >> * params: When testing the parameters we have to skip these on AArch64 >> because of our >> custom constraints on them. We already test them separately so >> this >> isn't a >> loss. >> >> Note that the testsuite is not entire clean due to gdb failure caused by >> alloca >> with stack clash. On AArch64 we output an incorrect .loc directive, but this >> is >> already the case with the current implementation in GCC and is a bug >> unrelated to this patch series. >> >> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu >> and no issues. >> Both targets were tested with stack clash on and off by default. >> >> Ok for trunk? 
>> >> Thanks, >> Tamar >> >> gcc/testsuite/ >> 2018-07-11 Tamar Christina >> >> PR target/86486 >> gcc.dg/pr82788.c: Skip for AArch64. >> gcc.dg/guality/vla-1.c: Turn off stack-clash. >> gcc.target/aarch64/subsp.c: Likewise. >> gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise. >> gcc.target/aarch64/sve/mask_struct_store_3.c: Likewise. >> gcc.target/aarch64/sve/mask_struct_store_4.c: Likewise. >> gcc.dg/params/blocksort-part.c: Skip stack-clash checks >> on AArch64. This is fine. FWIW, I'd been ignoring vla-1 and one or two others that were clearly invalid if stack-clash were on by default locally, but didn't push any kind of patch for that upstream since I didn't think anyone builds with stack-clash on by default (I did for testing purposes of course). Jeff
Re: [PATCH] Alias -Warray-bounds to -Warray-bounds=1
On 07/31/2018 02:58 PM, Martin Sebor wrote: > I can't approve patches but this one seems to be in > the obvious category so I think it could be checked in > without formal approval. > > It is however missing a couple of things: 1) a test case, > and 2) a reference to the bug it fixes in the ChangeLog > and in the test. > > With that, if no one objects, I will commit the patch for > you. I think the patch is fine. Go forward with it whenever it's convenient for you. jeff
Re: [PATCH] include more detail in -Warray-bounds (PR 86650)
On 07/31/2018 12:50 PM, Martin Sebor wrote: > Attached is a much scaled-down patch that only adds a note > to the -Warray-bounds warnings mentioning the declaration > to which the out-of-bounds index or offset applies. > > Printing the inlining context needs a bit more work but > this small improvement can be made independently of it. > > On 07/23/2018 05:49 PM, Martin Sebor wrote: >> (David, I'm hoping for your help here. Please see the end.) >> >> While looking into a recent -Warray-bounds instance in Glibc >> involving inlining of large functions it became apparent that >> GCC could do a better job of pinpointing the source of >> the problem. >> >> The attached patch makes a few adjustments to both >> the pretty printer infrastructure and to VRP to make this >> possible. The diagnostic pretty printer already has directives >> to print the inlining context for both tree and gcall* arguments, >> so most of the changes just adjust things to be able to pass in >> gimple* argument instead. >> >> The only slightly interesting change is to print the declaration >> to which the out-of-bounds array refers if one is known. >> >> Tested on x86_64-linux with one regression. >> >> The regression is in the gcc.dg/Warray-bounds.c test: the column >> numbers of the warnings are off. Adding the %G specifier to >> the array bounds warnings in VRP has the unexpected effect of >> expanding the extent of the underline. For instance, for a test >> case like this: >> >> int a[10]; >> >> void f (void) >> { >> a[-1] = 0; >> } >> >> from the expected: >> >> a[-1] = 0; >> ~^~~~ >> >> to this: >> >> a[-1] = 0; >> ~~^~~ >> >> David, do you have any idea how to avoid this? >> >> Martin > > > gcc-86650.diff > > > PR tree-optimization/86650 - -Warray-bounds missing inlining context > > gcc/ChangeLog: > > PR tree-optimization/86650 > * tree-vrp.c (vrp_prop::check_array_ref): Print an inform message. > (vrp_prop::check_mem_ref): Same.
> > gcc/testsuite/ChangeLog: > > PR tree-optimization/86650 > * gcc.dg/Warray-bounds-33.c: New test. OK. jeff
Re: [patch] improve internals documentation for nested function descriptors
On 07/27/2018 03:44 PM, Eric Botcazou wrote: >> Apropos of the discussion about improving the docs for >> TARGET_CUSTOM_FUNCTION_DESCRIPTORS in the context of the C-SKY port >> submission, >> >> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01454.html >> >> here is the patch I've come up with based on reading the source. Is >> this technically accurate? Any suggestions on how to improve it further? > > ""Define this hook to 0 if the target implements custom support" > > "custom" was precisely chosen to distinguish this kind of descriptors from > the > standard descriptors used on IA-64 or HP-PA, so it's contradictory. Moreover > I would really start with the "custom" case and not the standard case as was > originally written, the "custom" case being the common case for targets. > > I'm not really convinced by the substitution misalignment -> tag either, but > if others find the new version more understandable, then OK with me. I think the new version is an improvement, but I don't think it's ideal due to the issues noted above. We're trying to distinguish between the ABI mandated function descriptors and those generated internally for supporting nested functions. Wordsmithing that into the docs might make things clearer. I'm not sure how to express that clearly in the name of the target hook though. jeff
Re: [PATCH] [MSP430] Fix PR/86662
On 07/27/2018 07:09 AM, Jozef Lawrynowicz wrote: > As reported in PR/86662, use of __int20 in a program built with -mlarge and > -flto causes a segfault for msp430 due to endless recursion in > gimple_get_alias_set. > The attached patch fixes this. > The segfault can be observed on the gcc-7 and gcc-8 branches, and on trunk. > The testcase works in gcc-6 > > Successfully boostrapped and regtested on x86_64-pc-linux-gnu and > msp430-elf. > This fixes many LTO C and C++ tests for msp430 when the testsuite is > invoked > with the -mlarge target flag. > > Ok for gcc-7-branch, gcc-8-branch, and trunk? > OK for the trunk. Please give it a week or two of soak time before backporting. jeff
Re: [PATCH] -fsave-optimization-record: add contrib/optrecord.py
On 07/25/2018 08:59 AM, David Malcolm wrote: > On Tue, 2018-07-24 at 16:11 +0200, Richard Biener wrote: >> On Mon, Jul 23, 2018 at 9:20 PM David Malcolm >> wrote: >>> >>> On Mon, 2018-07-23 at 11:46 +0200, Richard Biener wrote: On Fri, Jul 20, 2018 at 6:27 PM David Malcolm >>> m> wrote: > > This patch adds a Python 3 module to "contrib" for reading the > output of > -fsave-optimization-record. > > It can be imported from other Python code, or run standalone as > a > script, > in which case it prints the saved messages in a form resembling > GCC > diagnostics. > > OK for trunk? OK, but shouldn't there maybe be a user-visible (and thus installed) tool for this kind of stuff? Which would mean to place it somewhere else. >>> >>> As well as this support code, I've got code that uses it to >>> generate >>> HTML reports. I'm thinking that all this Python code might be >>> better >>> to maintain in an entirely separate repository, as a third-party >>> project (maintained by me, under some suitable Free Software >>> license, >>> accessible via PyPI), since I suspect that the release cycle ought >>> to >>> be different from that of gcc itself. >>> >>> Would that be a better approach? >> >> Possibly. >> >> Richard. > > [CCing Rainer and Mike] > > A related matter that may affect this: currently there's not much test > coverage for -fsave-optimization-record in trunk (sorry). > > "trunk" currently has: > > (a) selftest::test_building_json_from_dump_calls, which captures the > results of some dump calls, and does very minimal textual verification > of the JSON that would be emitted by -fsave-optimization-record. > > (b) gcc.c-torture/compile/pr86636.c, which merely verifies that we > don't ICE with a particular usage of -fsave-optimization-record.
> > Ideally we'd have some test coverage of the file written out by -fsave- > optimization-record: that it's valid JSON, that it conforms to the > expected internal structure, and that the expected data is correct and > complete (relative to some known dump calls; I have a plugin for > testing this if need be, in gcc.dg/plugins). > > I don't know if Tcl has any JSON support, but in Python, JSON support > is built into the standard library, so I wonder if there's a case for > having a DejaGnu directive to (optionally) call out to a Python script > to check the JSON file that's been written, using this optrecord.py > module to handle loading the JSON. Doing so would implicitly check > that the emitted JSON adheres to the expected internal structure, and > the script could add additional testcase-specific verifications. > > The directive would have to check for the presence of Python, and emit > an UNSUPPORTED if unavailable. I'd support this. When a suitable python isn't available it fails gracefully and gives us independent testing that the resulting bits are proper JSON. jeff
Re: [PATCH][GCC][AArch64] Ensure that outgoing argument size is at least 8 bytes when alloca and stack-clash. [Patch (3/6)]
On 07/25/2018 05:09 AM, Tamar Christina wrote: > Hi All, > > Attached an updated patch which documents what the test cases are expecting > as requested. > > Ok for trunk? > > Thanks, > Tamar > > gcc/ > 2018-07-25 Tamar Christina > > PR target/86486 > * config/aarch64/aarch64.h (STACK_CLASH_OUTGOING_ARGS, > STACK_DYNAMIC_OFFSET): New. > * config/aarch64/aarch64.c (aarch64_layout_frame): > Update outgoing args size. > (aarch64_stack_clash_protection_alloca_probe_range, > TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE): New. > > gcc/testsuite/ > 2018-07-25 Tamar Christina > > PR target/86486 > * gcc.target/aarch64/stack-check-alloca-1.c: New. > * gcc.target/aarch64/stack-check-alloca-10.c: New. > * gcc.target/aarch64/stack-check-alloca-2.c: New. > * gcc.target/aarch64/stack-check-alloca-3.c: New. > * gcc.target/aarch64/stack-check-alloca-4.c: New. > * gcc.target/aarch64/stack-check-alloca-5.c: New. > * gcc.target/aarch64/stack-check-alloca-6.c: New. > * gcc.target/aarch64/stack-check-alloca-7.c: New. > * gcc.target/aarch64/stack-check-alloca-8.c: New. > * gcc.target/aarch64/stack-check-alloca-9.c: New. > * gcc.target/aarch64/stack-check-alloca.h: New. > * gcc.target/aarch64/stack-check-14.c: New. > * gcc.target/aarch64/stack-check-15.c: New. > >> -Original Message- >> From: Tamar Christina >> Sent: Friday, July 13, 2018 17:36 >> To: Tamar Christina ; gcc-patches@gcc.gnu.org >> Cc: nd ; James Greenhalgh ; >> Richard Earnshaw ; Marcus Shawcroft >> >> Subject: RE: [PATCH][GCC][AArch64] Ensure that outgoing argument size is at >> least 8 bytes when alloca and stack-clash. [Patch (3/6)] >> >> Hi All, >> >> I'm sending an updated patch which updates a testcase that hits one of our >> corner cases. >> This is an assembler scan only update in a testcase. 
>> >> Regards, >> Tamar >> >>> -Original Message- >>> From: Tamar Christina >>> Sent: Wednesday, July 11, 2018 12:21 >>> To: gcc-patches@gcc.gnu.org >>> Cc: nd ; James Greenhalgh ; >>> Richard Earnshaw ; Marcus Shawcroft >>> >>> Subject: [PATCH][GCC][AArch64] Ensure that outgoing argument size is >>> at least 8 bytes when alloca and stack-clash. [Patch (3/6)] >>> >>> Hi All, >>> >>> This patch adds a requirement that the number of outgoing arguments >>> for a function is at least 8 bytes when using stack-clash protection. >>> >>> By using this condition we can avoid a check in the alloca code and so >>> have smaller and simpler code there. >>> >>> A simplified version of the AArch64 stack frames is: >>> >>>+---+ >>>| | >>>| | >>>| | >>>+---+ >>>|LR | >>>+---+ >>>|FP | >>>+---+ >>>|dynamic allocations| expanding area which will push the >>> outgoing >>>+---+ args down during each allocation. >>>|padding| >>>+---+ >>>|outgoing stack args| safety buffer of 8 bytes (aligned) >>>+---+ >>> >>> By always defining an outgoing argument, alloca(0) effectively is safe >>> to probe at $sp due to the reserved buffer being there. It will never >>> corrupt the stack. >>> >>> This is also safe for alloca(x) where x is 0 or x % page_size == 0. >>> In the former it is the same case as alloca(0) while the latter is >>> safe because any allocation pushes the outgoing stack args down: >>> >>>|FP | >>>+---+ >>>| | >>>|dynamic allocations| alloca (x) >>>| | >>>+---+ >>>|padding| >>>+---+ >>>|outgoing stack args| safety buffer of 8 bytes (aligned) >>>+---+ >>> >>> Which means when you probe for the residual, if it's 0 you'll again >>> just probe in the outgoing stack args range, which we know is non-zero (at >> least 8 bytes). >>> >>> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. >>> Target was tested with stack clash on and off by default. >>> >>> Ok for trunk? 
>>> >>> Thanks, >>> Tamar >>> >>> gcc/ >>> 2018-07-11 Tamar Christina >>> >>> PR target/86486 >>> * config/aarch64/aarch64.h (STACK_CLASH_OUTGOING_ARGS, >>> STACK_DYNAMIC_OFFSET): New. >>> * config/aarch64/aarch64.c (aarch64_layout_frame): >>> Update outgoing args size. >>> (aarch64_stack_clash_protection_alloca_probe_range, >>> TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE): >>> New. >>> >>> gcc/testsuite/ >>> 2018-07-11 Tamar Christina >>> >>> PR target/86486 >>> * gcc.target/aarch64/stack-check-alloca-1.c: New. >>> *
Re: [PATCH][GCC][AArch64] Updated stack-clash implementation supporting 64k probes. [patch (1/6)]
On 07/25/2018 05:09 AM, Tamar Christina wrote: > Hi All, > > Attached is an updated patch that clarifies some of the comments in the patch > and adds comments to the individual testcases > as requested. > > Ok for trunk? > > Thanks, > Tamar > > gcc/ > 2018-07-25 Jeff Law > Richard Sandiford > Tamar Christina > > PR target/86486 > * config/aarch64/aarch64.md (cmp, > probe_stack_range): Add k (SP) constraint. > * config/aarch64/aarch64.h (STACK_CLASH_CALLER_GUARD, > STACK_CLASH_MAX_UNROLL_PAGES): New. > * config/aarch64/aarch64.c (aarch64_output_probe_stack_range): Emit > stack probes for stack clash. > (aarch64_allocate_and_probe_stack_space): New. > (aarch64_expand_prologue): Use it. > (aarch64_expand_epilogue): Likewise and update IP regs re-use criteria. > (aarch64_sub_sp): Add emit_move_imm optional param. > > gcc/testsuite/ > 2018-07-25 Jeff Law > Richard Sandiford > Tamar Christina > > PR target/86486 > * gcc.target/aarch64/stack-check-12.c: New. > * gcc.target/aarch64/stack-check-13.c: New. > * gcc.target/aarch64/stack-check-cfa-1.c: New. > * gcc.target/aarch64/stack-check-cfa-2.c: New. > * gcc.target/aarch64/stack-check-prologue-1.c: New. > * gcc.target/aarch64/stack-check-prologue-10.c: New. > * gcc.target/aarch64/stack-check-prologue-11.c: New. > * gcc.target/aarch64/stack-check-prologue-2.c: New. > * gcc.target/aarch64/stack-check-prologue-3.c: New. > * gcc.target/aarch64/stack-check-prologue-4.c: New. > * gcc.target/aarch64/stack-check-prologue-5.c: New. > * gcc.target/aarch64/stack-check-prologue-6.c: New. > * gcc.target/aarch64/stack-check-prologue-7.c: New. > * gcc.target/aarch64/stack-check-prologue-8.c: New. > * gcc.target/aarch64/stack-check-prologue-9.c: New. > * gcc.target/aarch64/stack-check-prologue.h: New. > * lib/target-supports.exp > (check_effective_target_supports_stack_clash_protection): Add AArch64. OK on my end. AArch64 maintainers have the final say since this is all AArch64 specific bits. jeff
Re: [PATCH][GCC][front-end][build-machinery][opt-framework] Allow setting of stack-clash via configure options. [Patch (4/6)]
On 07/24/2018 08:14 AM, Tamar Christina wrote: > Hi All, > > Here's an updated patch with documentation. > > > Ok for trunk? > > Thanks, > Tamar > > gcc/ > 2018-07-24 Tamar Christina > > PR target/86486 > * configure.ac: Add stack-clash-protection-guard-size. > * doc/install.texi: Document it. > * config.in (DEFAULT_STK_CLASH_GUARD_SIZE): New. > * params.def: Update comment for guard-size. > * configure: Regenerate. > > The 07/24/2018 11:34, Joseph Myers wrote: >> On Tue, 24 Jul 2018, tamar.christ...@arm.com wrote: >> >>> This patch defines a configure option to allow the setting of the default >>> guard size via configure flags when building the target. >> If you add a configure option, you must also add documentation for it in >> install.texi. >> >> -- >> Joseph S. Myers >> jos...@codesourcery.com > -- > > > rb9473-3.patch > OK. jeff
Re: [PATCH][GCC][front-end][opt-framework] Update options framework for parameters to properly handle and validate configure time params. [Patch (2/3)]
On 07/24/2018 04:29 AM, tamar.christ...@arm.com wrote:
> Hi All,
>
> This patch is re-spun to handle the configure changes in patch 4 / 6 of the
> previous series.
>
> This patch now changes it so that default parameters are validated during
> initialization.  This change is needed to ensure parameters set via configure
> or by the target specific common initialization routines still keep the
> parameters within the valid range.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no
> issues.
> Both targets were tested with stack clash on and off by default.
>
> Ok for trunk?
>
> Thanks,
> Tamar
>
> gcc/
> 2018-07-24  Tamar Christina
>
> 	* params.c (validate_param): New.
> 	(add_params): Use it.
> 	(set_param_value): Refactor param validation into validate_param.
> 	(diagnostic.h): Include.
> 	* diagnostic.h (diagnostic_ready_p): New.
>
>> -----Original Message-----
>> From: Jeff Law
>> Sent: Wednesday, July 11, 2018 20:24
>> To: Tamar Christina ; gcc-patches@gcc.gnu.org
>> Cc: nd ; jos...@codesourcery.com
>> Subject: Re: [PATCH][GCC][front-end][opt-framework] Update options
>> framework for parameters to properly handle and validate configure time
>> params. [Patch (2/3)]
>>
>> On 07/11/2018 05:24 AM, Tamar Christina wrote:
>>> Hi All,
>>>
>>> This patch builds on a previous patch to pass param options down from
>>> configure by adding more expansive validation and correctness checks.
>>>
>>> These are set very early on and allow the target to validate or reject
>>> the values as they see fit.
>>>
>>> To do this compiler_param has been extended to hold a value set at
>>> configure time, this value is used to be able to distinguish between
>>>
>>> 1) default value
>>> 2) configure value
>>> 3) back-end default
>>> 4) user specific value.
>>>
>>> The priority of the values should be 4 > 2 > 3 > 1.  The compiler will
>>> now also validate the values in params.def after setting them.  This
>>> means invalid values will no longer be accepted.
>>> >>> This also changes it so that default parameters are validated during >>> initialization. This change is needed to ensure parameters set via >>> configure or by the target specific common initialization routines >>> still keep the parameters within the valid range. >>> >>> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu >> and no issues. >>> Both targets were tested with stack clash on and off by default. >>> >>> Ok for trunk? >>> >>> Thanks, >>> Tamar >>> >>> gcc/ >>> 2018-07-11 Tamar Christina >>> >>> * params.h (struct param_info): Add configure_value. >>> * params.c (DEFPARAMCONF): New. >>> (DEFPARAM, DEFPARAMENUM5): Set configure_value. >>> (validate_param): New. >>> (add_params): Use it. >>> (set_param_value): Refactor param validation into validate_param. >>> (maybe_set_param_value): Don't override value from configure. >>> (diagnostic.h): Include. >>> * params-enum.h (DEFPARAMCONF): New. >>> * params-list.h: Likewise. >>> * params-options.h: Likewise. >>> * params.def (PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE): >> Use it. >>> * diagnostic.h (diagnostic_ready_p): New. >> Generally OK, though probably should depend on what we decide WRT >> configurability. ie, I'm not convinced we need to be able to set the default >> via a configure time option. And if we don't support that this patch gets >> somewhat simpler. >> >> jeff > > > rb9153-2.patch [ ... ] I think this is fine now. jeff
[committed] Fix target/86795 (mn10300)
AFAIK there's not enough building blocks on the mn103 series to mount a spectre v1 attack. Committed to the trunk. Jeff commit 2419ebf7b73827b1266f265b325d24c2e9daf0f1 Author: law Date: Fri Aug 3 17:39:00 2018 + PR target/86795 * config/mn10300/mn10300.c (TARGET_HAVE_SPECULATION_SAFE_VALUE): Define to speculation_safe_value_not_needed. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@263296 138bc75d-0d04-0410-961f-82ee72b054a4 diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 0c4918bf1ef..0b451cef341 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,9 @@ +2018-08-03 Jeff Law + + PR target/86795 + * config/mn10300/mn10300.c (TARGET_HAVE_SPECULATION_SAFE_VALUE): + Define to speculation_safe_value_not_needed. + 2018-08-03 David Malcolm * doc/gcov.texi (-x): Remove duplicate "to". diff --git a/gcc/config/mn10300/mn10300.c b/gcc/config/mn10300/mn10300.c index 1247f32e3d5..a7e5e6b24f5 100644 --- a/gcc/config/mn10300/mn10300.c +++ b/gcc/config/mn10300/mn10300.c @@ -3437,4 +3437,7 @@ mn10300_reorg (void) #undef TARGET_MODES_TIEABLE_P #define TARGET_MODES_TIEABLE_P mn10300_modes_tieable_p +#undef TARGET_HAVE_SPECULATION_SAFE_VALUE +#define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed + struct gcc_target targetm = TARGET_INITIALIZER;
Re: [PATCH 00/11] (v2) Mitigation against unsafe data speculation (CVE-2017-5753)
On 08/02/2018 02:19 PM, John David Anglin wrote:
> On 2018-08-02 2:40 PM, Jeff Law wrote:
>> It's been eons.  I think there's enough building blocks on the PA to
>> mount a spectre v1 attack.  They've got branch prediction with varying
>> degrees of speculative execution, caches and user accessible cycle
>> timers.
> Yes.
>>
>> There's varying degrees of out of order execution all the way back in
>> the PA7xxx processors (hit-under-miss) to full o-o-o execution in the
>> PA8xxx series (including the PA8900 that's in the rp3440).
> However, as far as I know, loads and stores are always ordered.
I'm pretty sure that's not true on PA8000 class machines.  You can get
the details here:

http://web.archive.org/web/20040214092531/http://www.cpus.hp.com/technical_references/advperf.shtml

It describes in reasonable detail how the load/store reorder buffer and
the address reorder buffer work, as well as the tag checking to detect
when a speculative load was executed and its results had to be thrown
away due to a store-to-load dependency check in the ARB.

But again, given the state of the target, I'm not at all concerned about
mitigating spectre v1.

Jeff
Re: [PATCH] [Ada] Make middle-end string literals NUL terminated
Hi Bernd,

> On 31 Jul 2018, at 14:07, Bernd Edlinger wrote:
>
> Hi!
>
> This fixes a couple of STRING_CSTs that are not explicitly NUL terminated.
> These were caught in a new check in varasm.c I am currently working on.
>
> Having a NUL terminated string does not change the binary output, but it
> makes things easier for the middle-end.
>
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?

--- gcc/ada/gcc-interface/utils2.c 2017-12-21 07:57:41.0 +0100
+++ gcc/ada/gcc-interface/utils2.c 2018-07-31 11:44:01.517117923 +0200
@@ -1844,7 +1844,7 @@ expand_sloc (Node_Id gnat_node, tree *fi
     }

   const int len = strlen (str);
-  *filename = build_string (len, str);
+  *filename = build_string (len + 1, str);
   TREE_TYPE (*filename)
     = build_array_type (char_type_node, build_index_type (size_int (len)));
   *line = build_int_cst (NULL_TREE, line_number);

This one looks good to me.  I'm not officially listed as a maintainer so
I'll let Eric have the final word.  I'm answering because ...

--- gcc/ada/gcc-interface/trans.c 2018-07-17 10:10:04.0 +0200
+++ gcc/ada/gcc-interface/trans.c 2018-07-31 11:16:27.350728886 +0200
@@ -6079,7 +6079,7 @@ gnat_to_gnu (Node_Id gnat_node)
          where GCC will want to treat it as a C string.  */
       string[i] = 0;

-      gnu_result = build_string (length, string);
+      gnu_result = build_string (length + 1, string);

       /* Strings in GCC don't normally have types, but we want
          this to not be converted to the array type.  */

We have a local variant of this one, on which I worked after we realized
that it was not enough to achieve the intended effect.  The difference at
this spot is simply that we prevent the +1 if the original string happens
to be explicitly NUL terminated already.  Like:

  build_string ((length > 0 && string[length-1] == 0)
                ? length : length + 1, string);

This is however not enough because the extra zero is only conveyed through
the string value, not the corresponding type, and
varasm.c:mergeable_string_section currently uses the type bounds to search
for a zero termination.  We can't really change the type, so we came up
with an adjustment to varasm.

The motivation for using the type bounds was
https://gcc.gnu.org/ml/gcc-patches/2006-06/msg01004.html
which, IIUC, was tightly linked to string constants used as initializers
for objects which have a size of their own.  (Jakub cc'ed)

For string constants not used as initializers, it seems that we might be
able to use the string bounds instead, possibly wider.  The attached patch
does that and passed testing a few weeks ago.

So, while we're at it, does that look right, in light of all the string
length related issues that have been discussed lately?  I'm not sure I
grasped all the ramifications.

Thanks in advance!

	* varasm.c (mergeable_string_section): Accept an extra SCAT argument
	conveying the section category from which the request originates.
	Only restrict the search to the string type bounds if we're queried
	for an initializer.  Consider the string bounds otherwise.
	(default_elf_select_section, STR and STR_INIT): Pass the section
	category to mergeable_string_section.

ada-string.diff
Description: Binary data
Re: [PATCH][GCC][AARCH64] Use stdint integers in vect_su_add_sub.c
On 02/08/18 20:18, James Greenhalgh wrote: On Tue, Jul 31, 2018 at 04:53:19AM -0500, Matthew Malcomson wrote: Fixing the ilp32 issue that Christophe found. The existing testcase uses `long` to represent a 64 bit integer. This breaks when compiled using the `-mabi=ilp32` flag. We switch the use of int/long for int32_t/int64_t to avoid this problem and show the requirement of a widening operation more clearly. Normally we try to avoid pulling more headers than we need in the testsuite. Have you seen this construct: typedef unsigned __attribute__((mode(DI))) uint64_t; It can be a good way to ensure you have the right type for the patterns you are adding. Thanks, James Thanks! No I hadn't seen that construct before. Have attached a patch using that construct. Matthew diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vect_su_add_sub.c b/gcc/testsuite/gcc.target/aarch64/simd/vect_su_add_sub.c index 338da54f6281c90e1c6b1c59fa50d9b719005c77..921c5f15c74ce02d106f9b5290244532fc56d480 100644 --- a/gcc/testsuite/gcc.target/aarch64/simd/vect_su_add_sub.c +++ b/gcc/testsuite/gcc.target/aarch64/simd/vect_su_add_sub.c @@ -1,21 +1,27 @@ /* { dg-do compile } */ /* { dg-options "-O3" } */ +typedef int __attribute__ ((mode (SI))) int32_t; +typedef int __attribute__ ((mode (DI))) int64_t; +typedef unsigned __attribute__ ((mode (SI))) size_t; +typedef unsigned __attribute__ ((mode (SI))) uint32_t; +typedef unsigned __attribute__ ((mode (DI))) uint64_t; + /* Ensure we use the signed/unsigned extend vectorized add and sub instructions. 
*/ #define N 1024 -int a[N]; -long c[N]; -long d[N]; -unsigned int ua[N]; -unsigned long uc[N]; -unsigned long ud[N]; +int32_t a[N]; +int64_t c[N]; +int64_t d[N]; +uint32_t ua[N]; +uint64_t uc[N]; +uint64_t ud[N]; void add () { - for (int i = 0; i < N; i++) + for (size_t i = 0; i < N; i++) d[i] = a[i] + c[i]; } /* { dg-final { scan-assembler-times "\[ \t\]saddw2\[ \t\]+" 1 } } */ @@ -24,7 +30,7 @@ add () void subtract () { - for (int i = 0; i < N; i++) + for (size_t i = 0; i < N; i++) d[i] = c[i] - a[i]; } /* { dg-final { scan-assembler-times "\[ \t\]ssubw2\[ \t\]+" 1 } } */ @@ -33,7 +39,7 @@ subtract () void uadd () { - for (int i = 0; i < N; i++) + for (size_t i = 0; i < N; i++) ud[i] = ua[i] + uc[i]; } /* { dg-final { scan-assembler-times "\[ \t\]uaddw2\[ \t\]+" 1 } } */ @@ -42,7 +48,7 @@ uadd () void usubtract () { - for (int i = 0; i < N; i++) + for (size_t i = 0; i < N; i++) ud[i] = uc[i] - ua[i]; } /* { dg-final { scan-assembler-times "\[ \t\]usubw2\[ \t\]+" 1 } } */
Re: [2/5] C-SKY port: Backend implementation
On 08/02/2018 04:27 PM, Jeff Law wrote: I think the cse_cc pass is really just working around one or more bugs in CSE and/or a backend bug. The RTL above clearly shows a common subexpression that is not eliminated by CSE. What CSE should be trying to do is changing the second and third occurrences of (gt:CC (reg 77) (reg 78)) with (reg 33) which would create nop-sets which would be subsequently deleted. I suspect you do not have an insn which matches that nop set of the CC register. If you fix that I suspect CSE will work better and eliminate the need for your cse_cc pass. Thanks for the tip about that! I hacked things up to do as you suggest and it appears to work. I'll post a new patch set once I've had time for more testing, probably Monday-ish. -Sandra
[C++ PATCH] Fix constexpr ICE with poor man's offsetof (PR c++/86738)
Hi! Some time ago, I've moved the poor man's offsetof recognizing hack to cp_fold. On the following testcase that means we don't fold it early during parsing. Now, if we try to evaluate those inside of constexpr contexts with !ctx->quiet, it is diagnosed as invalid (but *non_constant_p is not set). Worse, with ctx->quiet, we pretend it is a constant expression but don't actually fold it, and then we have e.g. in array index evaluation VERIFY_CONSTANT (index); ... VERIFY_CONSTANT (nelts); if ((lval ? !tree_int_cst_le (index, nelts) : !tree_int_cst_lt (index, nelts)) || tree_int_cst_sgn (index) < 0) { diag_array_subscript (ctx, ary, index); *non_constant_p = true; return t; } where VERIFY_CONSTANT is happy about it, even when index is not an INTEGER_CST, but large complex TREE_CONSTANT expression. The above though assumes INTEGER_CST. Perhaps we should check for INTEGER_CST somewhere (and in other similar code too), but it isn't clear to me what exactly we should do if those trees aren't INTEGER_CSTs, especially with !ctx->quiet. This patch changes a different thing, the usual case (including other spots for NULL pointer dereferences or arith) in constexpr.c is if (some condition) { if (!ctx->quiet) error* (...); *non_constant_p = true; return t; } but the following two spots were different and that caused the array handling to see those complex unsimplified constant expressions. With this, it is not treated as constant for maybe_constant_value etc. purposes, though following cp_fold can still fold it into a constant. Bootstrapped/regtested on x86_64-linux and i686-linux (including check-c++-all testing on both), ok for trunk and 8.3 after a while? 2018-08-03 Jakub Jelinek PR c++/86738 * constexpr.c (cxx_eval_binary_expression): For arithmetics involving NULL pointer set *non_constant_p to true. (cxx_eval_component_reference): For dereferencing of a NULL pointer, set *non_constant_p to true and return t. * g++.dg/opt/pr86738.C: New test. 
--- gcc/cp/constexpr.c.jj	2018-07-31 23:57:24.193432388 +0200
+++ gcc/cp/constexpr.c	2018-08-03 14:54:13.302817282 +0200
@@ -2082,6 +2082,7 @@ cxx_eval_binary_expression (const conste
 	{
 	  if (!ctx->quiet)
 	    error ("arithmetic involving a null pointer in %qE", lhs);
+	  *non_constant_p = true;
 	  return t;
 	}
       else if (code == POINTER_PLUS_EXPR)
@@ -2522,9 +2523,13 @@ cxx_eval_component_reference (const cons
 					     lval,
 					     non_constant_p, overflow_p);
   if (INDIRECT_REF_P (whole)
-      && integer_zerop (TREE_OPERAND (whole, 0))
-      && !ctx->quiet)
-    error ("dereferencing a null pointer in %qE", orig_whole);
+      && integer_zerop (TREE_OPERAND (whole, 0)))
+    {
+      if (!ctx->quiet)
+	error ("dereferencing a null pointer in %qE", orig_whole);
+      *non_constant_p = true;
+      return t;
+    }

   if (TREE_CODE (whole) == PTRMEM_CST)
     whole = cplus_expand_constant (whole);
--- gcc/testsuite/g++.dg/opt/pr86738.C.jj	2018-08-03 15:03:51.477358712 +0200
+++ gcc/testsuite/g++.dg/opt/pr86738.C	2018-08-03 15:02:51.940201694 +0200
@@ -0,0 +1,12 @@
+// PR c++/86738
+// { dg-do compile }
+
+struct S { int s; };
+unsigned char a[20];
+unsigned char *p = &a[(__UINTPTR_TYPE__) &((S *) 0)->s];
+
+void
+foo ()
+{
+  __builtin_memcpy (&a[15], &a[(__UINTPTR_TYPE__) &((S *) 0)->s], 2);
+}

Jakub
RE: [PATCH][GCC][AArch64] Cleanup the AArch64 testsuite when stack-clash is on [Patch (6/6)]
Hi James, Thanks for the review. > -Original Message- > From: James Greenhalgh > Sent: Tuesday, July 31, 2018 22:07 > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > > Subject: Re: [PATCH][GCC][AArch64] Cleanup the AArch64 testsuite when > stack-clash is on [Patch (6/6)] > > On Tue, Jul 24, 2018 at 05:28:03AM -0500, Tamar Christina wrote: > > Hi All, > > > > This patch cleans up the testsuite when a run is done with stack clash > > protection turned on. > > > > Concretely this switches off -fstack-clash-protection for a couple of tests: > > > > * sve: We don't yet support stack-clash-protection and sve, so for now turn > these off. > > * assembler scan: some tests are quite fragile in that they check for exact > >assembly output, e.g. check for exact amount of sub etc. These won't > >match now. > > * vla: Some of the ubsan tests negative array indices. Because the arrays > weren't > >used before the incorrect $sp wouldn't have been used. The correct > value is > >restored on ret. Now however we probe the $sp which causes a > segfault. > > * params: When testing the parameters we have to skip these on AArch64 > because of our > > custom constraints on them. We already test them separately so > > this > isn't a > > loss. > > > > Note that the testsuite is not entire clean due to gdb failure caused > > by alloca with stack clash. On AArch64 we output an incorrect .loc > > directive, but this is already the case with the current implementation in > GCC and is a bug unrelated to this patch series. > > > > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu > and no issues. > > Both targets were tested with stack clash on and off by default. > > > > Ok for trunk? > > For each of the generic tests you skip because of mismatched bounds, I think > we should ensure we have an equivalent test checking that behaviour in > gcc.target/aarch64/ . 
If we have that, it might be good to cross-reference > them with a comment above your skip lines. The problem is, fundamentally what these two tests are trying to test we don't support. pr82788.c is explicitly checking for a bug that happened when you have a 1KB probe interval with a 4KB guard-size. And stack-check-6a.c is checking that increasing the guard size with smaller probe interval reduces the amount of probing you would have to do. Both these cases we can't support as we force stack guard to be the same as probing interval. I can't think of any equivalent AArch64 test as these class of failures just don't happen for us. > > > * vla: Some of the ubsan tests negative array indices. Because the arrays > weren't > >used before the incorrect $sp wouldn't have been used. The correct > value is > >restored on ret. Now however we probe the $sp which causes a > segfault. > > This is interesting behaviour; is it a desirable side effect of your changes? They should fail for every correct implementation of stack clash protection. The programs were always invalid. These test could probably be made to work by allocating some number of stack space before the negative Index arrays. They wouldn't crash then as the negative indices would still keep you within your stack space, but that's arguably a different test then. > > Otherwise, this patch is OK. > > Thanks, > James > > > > gcc/testsuite/ > > 2018-07-24 Tamar Christina > > > > PR target/86486 > > * gcc.dg/pr82788.c: Skip for AArch64. > > * gcc.dg/guality/vla-1.c: Turn off stack-clash. > > * gcc.target/aarch64/subsp.c: Likewise. > > * gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise. > > * gcc.target/aarch64/sve/mask_struct_store_3.c: Likewise. > > * gcc.target/aarch64/sve/mask_struct_store_4.c: Likewise. > > * gcc.dg/params/blocksort-part.c: Skip stack-clash checks > > on AArch64. > > * gcc.dg/stack-check-10.c: Add AArch64 specific checks. > > * gcc.dg/stack-check-5.c: Add AArch64 specific checks. 
> > * gcc.dg/stack-check-6a.c: Skip on AArch64, we don't support this. > > * testsuite/lib/target-supports.exp > > (check_effective_target_frame_pointer_for_non_leaf): AArch64 > does not > > require frame pointer for non-leaf functions. > > > > > -Original Message- > > > From: Tamar Christina > > > Sent: Wednesday, July 11, 2018 12:23 > > > To: gcc-patches@gcc.gnu.org > > > Cc: nd ; James Greenhalgh > ; > > > Richard Earnshaw ; Marcus Shawcroft > > > > > > Subject: [PATCH][GCC][AArch64] Cleanup the AArch64 testsuite when > > > stack- clash is on [Patch (6/6)] > > > > > > Hi All, > > > > > > This patch cleans up the testsuite when a run is done with stack > > > clash protection turned on. > > > > > > Concretely this switches off -fstack-clash-protection for a couple of > > > tests: > > > > > > * sve: We don't yet support stack-clash-protection and sve, so for > > > now turn these off. >
[PATCH] Fix thinko in insert_reciprocals (PR tree-optimization/86835)
Hi!

As the comment says, we want to insert (if should_insert_square_recip is
true) new_square_stmt right after new_stmt.  The current code does that
correctly if new_stmt is inserted with
gsi_insert_before (&gsi, new_stmt, GSI_SAME_STMT); because the following
gsi_insert_before (&gsi, new_square_stmt, GSI_SAME_STMT); will keep
inserting before whatever gsi points to.  But one of the 3 cases inserts
new_stmt instead after def_gsi, and the current
  gsi = *def_gsi;
  gsi_insert_after (def_gsi, new_stmt, GSI_NEW_STMT);
  gsi_insert_before (&gsi, new_square_stmt, GSI_SAME_STMT);
certainly doesn't do that, it inserts new_stmt after the def_gsi stmt and
then inserts new_square_stmt before the def_gsi stmt, so we have order of:
  new_square_stmt
  whatever-def_gsi-pointed-to-stmt
  new_stmt
where new_square_stmt uses the lhs of new_stmt.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok
for trunk?

2018-08-03  Jakub Jelinek

	PR tree-optimization/86835
	* tree-ssa-math-opts.c (insert_reciprocals): Even when inserting
	new_stmt after def_gsi, make sure to insert new_square_stmt after
	that stmt, not 2 stmts before it.

	* gcc.dg/pr86835.c: New test.

--- gcc/tree-ssa-math-opts.c.jj	2018-07-12 21:31:27.698410833 +0200
+++ gcc/tree-ssa-math-opts.c	2018-08-03 12:58:27.475961037 +0200
@@ -422,6 +422,8 @@ insert_reciprocals (gimple_stmt_iterator
 	  gsi_next (&gsi);

 	  gsi_insert_before (&gsi, new_stmt, GSI_SAME_STMT);
+	  if (should_insert_square_recip)
+	    gsi_insert_before (&gsi, new_square_stmt, GSI_SAME_STMT);
 	}
       else if (def_gsi && occ->bb == def_gsi->bb)
 	{
@@ -429,21 +431,19 @@ insert_reciprocals (gimple_stmt_iterator
 	     never happen if the definition statement can throw, because in
 	     that case the sole successor of the statement's basic block will
 	     dominate all the uses as well.  */
-	  gsi = *def_gsi;
 	  gsi_insert_after (def_gsi, new_stmt, GSI_NEW_STMT);
+	  if (should_insert_square_recip)
+	    gsi_insert_after (def_gsi, new_square_stmt, GSI_NEW_STMT);
 	}
       else
 	{
 	  /* Case 3: insert in a basic block not containing defs/uses.  */
 	  gsi = gsi_after_labels (occ->bb);
 	  gsi_insert_before (&gsi, new_stmt, GSI_SAME_STMT);
+	  if (should_insert_square_recip)
+	    gsi_insert_before (&gsi, new_square_stmt, GSI_SAME_STMT);
 	}

-      /* Regardless of which case the reciprocal as inserted in,
-	 we insert the square immediately after the reciprocal.  */
-      if (should_insert_square_recip)
-	gsi_insert_before (&gsi, new_square_stmt, GSI_SAME_STMT);
-
       reciprocal_stats.rdivs_inserted++;

       occ->recip_def_stmt = new_stmt;

--- gcc/testsuite/gcc.dg/pr86835.c.jj	2018-08-03 13:07:28.834159126 +0200
+++ gcc/testsuite/gcc.dg/pr86835.c	2018-08-03 13:07:14.119126559 +0200
@@ -0,0 +1,29 @@
+/* PR tree-optimization/86835 */
+/* { dg-do run } */
+/* { dg-options "-O2 -ffast-math -Wuninitialized" } */
+
+__attribute__((noipa)) void
+foo (int n, double *x, double *y)
+{ /* { dg-bogus "is used uninitialized in this function" "" { target *-*-* } 0 } */
+  int i;
+  double b = y[4];
+  for (i = 0; i < n; ++i)
+    y[3] += __builtin_sin (x[i] / b);
+  y[0] /= b;
+  y[1] /= b * b;
+  y[2] /= b;
+}
+
+int
+main ()
+{
+  double y[] = { 16.0, 64.0, 128.0, 0.0, 2.0 };
+  foo (0, y, y);
+  if (__builtin_fabs (y[0] - 8.0) > 0.0001
+      || __builtin_fabs (y[1] - 16.0) > 0.0001
+      || __builtin_fabs (y[2] - 64.0) > 0.0001
+      || y[3] != 0.0
+      || y[4] != 2.0)
+    __builtin_abort ();
+  return 0;
+}

Jakub
[C++ PATCH] Fix tsubst of structured bindings (PR c++/86836)
Hi!

As mentioned in the PR, for valid structured bindings this patch should be
unnecessary, because the identifiers from the structured binding shouldn't
be used in the initializer of the structured binding, but for invalid
source it can matter.  When tsubst_init is called before
tsubst_decomp_names, the local specializations for the decomp id VAR_DECLs
aren't created and so the tsubst of those VAR_DECLs gives the PARM_DECL in
this testcase, or something else unrelated to the decomp.

Fixed by doing tsubst_decomp_names first, then tsubst_init the initializer
and then the rest.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
8.3 after a while?

2018-08-03  Jakub Jelinek

	PR c++/86836
	* pt.c (tsubst_expr): For structured bindings, call
	tsubst_decomp_names before tsubst_init, not after it.

	* g++.dg/cpp1z/decomp46.C: New test.

--- gcc/cp/pt.c.jj	2018-08-03 11:36:25.550755429 +0200
+++ gcc/cp/pt.c	2018-08-03 11:48:51.144567965 +0200
@@ -16740,7 +16740,17 @@ tsubst_expr (tree t, tree args, tsubst_f
 	      else
 		{
 		  int const_init = false;
+		  unsigned int cnt = 0;
+		  tree first = NULL_TREE, ndecl = error_mark_node;
 		  maybe_push_decl (decl);
+
+		  if (VAR_P (decl)
+		      && DECL_DECOMPOSITION_P (decl)
+		      && TREE_TYPE (pattern_decl) != error_mark_node)
+		    ndecl = tsubst_decomp_names (decl, pattern_decl, args,
+						 complain, in_decl, &first,
+						 &cnt);
+
 		  if (VAR_P (decl)
 		      && DECL_PRETTY_FUNCTION_P (decl))
 		    {
@@ -16756,23 +16766,14 @@ tsubst_expr (tree t, tree args, tsubst_f
 		  if (VAR_P (decl))
 		    const_init
		      = (DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (pattern_decl));
-		  if (VAR_P (decl)
-		      && DECL_DECOMPOSITION_P (decl)
-		      && TREE_TYPE (pattern_decl) != error_mark_node)
-		    {
-		      unsigned int cnt;
-		      tree first;
-		      tree ndecl
-			= tsubst_decomp_names (decl, pattern_decl, args,
-					       complain, in_decl, &first, &cnt);
-		      if (ndecl != error_mark_node)
-			cp_maybe_mangle_decomp (ndecl, first, cnt);
-		      cp_finish_decl (decl, init, const_init, NULL_TREE, 0);
-		      if (ndecl != error_mark_node)
-			cp_finish_decomp (ndecl, first, cnt);
-		    }
-		  else
-		    cp_finish_decl (decl, init, const_init, NULL_TREE, 0);
+
+		  if (ndecl != error_mark_node)
+		    cp_maybe_mangle_decomp (ndecl, first, cnt);
+
+		  cp_finish_decl (decl, init, const_init, NULL_TREE, 0);
+
+		  if (ndecl != error_mark_node)
+		    cp_finish_decomp (ndecl, first, cnt);
 		}
 	    }
 	}
--- gcc/testsuite/g++.dg/cpp1z/decomp46.C.jj	2018-08-03 12:00:10.524066454 +0200
+++ gcc/testsuite/g++.dg/cpp1z/decomp46.C	2018-08-03 11:59:49.925018174 +0200
@@ -0,0 +1,25 @@
+// PR c++/86836
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+struct A {
+  int operator*();
+  void operator++();
+  bool operator!=(A);
+};
+template <typename T> class map {
+public:
+  A begin();
+  A end();
+};
+
+template <typename T> void mergemap(map<T> orig, map<T> toadd) {
+  for (auto p : toadd)
+    auto [orig] = orig;	// { dg-error "use of 'orig' before deduction of 'auto'" }
+}	// { dg-warning "structured bindings only available with" "" { target c++14_down } .-1 }
+
+int
+main() {
+  map<int> x, y;
+  mergemap(x, y);
+}

Jakub
Re: [PATCH,nvptx] Use CUDA driver API to select default runtime launch, geometry
On 08/03/2018 08:22 AM, Tom de Vries wrote: > On 08/01/2018 09:11 PM, Cesar Philippidis wrote: >> On 08/01/2018 07:12 AM, Tom de Vries wrote: >> >> + gangs = grids * (blocks / warp_size); > > So, we launch with gangs == grids * workers ? Is that intentional? Yes. At least that's what I've been using in og8. Setting num_gangs = grids alone caused significant slow downs. >>> >>> Well, what you're saying here is: increasing num_gangs increases >>> performance. >>> >>> You don't explain why you multiply with workers specifically. >> >> I set it that way because I think the occupancy calculator is >> determining the occupancy of a single multiprocessor unit, rather than >> the entire GPU. Looking at the og8 code again, I had >> >>num_gangs = 2 * threads_per_sm / warp_size * dev_size >> >> which corresponds to >> >>2 * grids * blocks / warp_size >> > > I've done an experiment using the sample simpleOccupancy. The kernel is > small, so the blocks returned is the maximum: max_threads_per_block (1024). > > The grids returned is 10, which I tentatively interpret as num_dev * > (max_threads_per_multi_processor / blocks). [ Where num_dev == 5, and > max_threads_per_multi_processor == 2048. ] > > Substituting that into the og8 code, and equating > max_threads_per_multi_processor with threads_per_sm, I indeed get > > num_gangs = 2 * grids * blocks / warp_size. > > So with this extra information I see how you got there. > > But I still see no rationale why blocks is used here, and I wonder > whether something like num_gangs = grids * 64 would give similar results. My original intent was to keep the load proportional to the block size. So, in the case were a block size is limited by shared-memory or the register file capacity, the runtime wouldn't excessively over assign gangs to the multiprocessor units if their state is going to be swapped out even more than necessary. With that said, I could be wrong here. 
It would be nice if Nvidia provided us with more insights into their
hardware.

> Anyway, given that this is what is used on og8, I'm ok with using that,
> so let's go with:
> ...
> gangs = 2 * grids * (blocks / warp_size);
> ...
> [ so, including the factor two you explicitly left out from the original
> patch. Unless you see a pressing reason not to include it. ]
>
> Can you repost after retesting? [ note: the updated patch I posted
> earlier doesn't apply on trunk anymore due to the cuda-lib.def change. ]

Thanks for looking into this.  I got bogged down tracking a problem with
allocatable scalars in Fortran.  I'll repost this patch after I've tested
it with an older version of CUDA (probably CUDA 5.5 using the Nvidia
driver 331.113 on a K40).

Cesar
[AArch64] Fix -mlow-precision-div (PR 86838)
The "@" handling broke -mlow-precision-div, because the scalar forms of the instruction were provided by a pattern that also provided FRECPX (and so were parameterised on an unspec code as well as a mode), while the SIMD versions had a dedicated FRECPE pattern. This patch moves the scalar FRECPE handling to the SIMD pattern too (as for FRECPS) and uses a separate pattern for FRECPX. The convention in aarch64-simd-builtins.def seemed to be to add comments only if the mapping wasn't obvious (i.e. not just sticking "aarch64_" on the beginning and "" on the end), so the patch deletes the reference to the combined pattern instead of rewording it. There didn't seem to be any coverage of -mlow-precision-div in the testsuite, so the patch adds some tests for it. Tested on aarch64-linux-gnu. OK to install? Richard 2018-08-03 Richard Sandiford gcc/ PR target/86838 * config/aarch64/iterators.md (FRECP, frecp_suffix): Delete. * config/aarch64/aarch64-simd.md (aarch64_frecp): Fold FRECPE into... (@aarch64_frecpe): ...here and the move FRECPX to... (aarch64_frecpx): ...this new pattern. * config/aarch64/aarch64-simd-builtins.def: Remove comment about aarch64_frecp. gcc/testsuite/ PR target/86838 * gcc.target/aarch64/frecpe_1.c: New test. * gcc.target/aarch64/frecpe_2.c: Likewise. 
Index: gcc/config/aarch64/iterators.md === --- gcc/config/aarch64/iterators.md 2018-07-31 19:10:15.744291661 +0100 +++ gcc/config/aarch64/iterators.md 2018-08-03 16:32:07.531492221 +0100 @@ -1537,8 +1537,6 @@ (define_int_iterator FCVT [UNSPEC_FRINTZ (define_int_iterator FCVT_F2FIXED [UNSPEC_FCVTZS UNSPEC_FCVTZU]) (define_int_iterator FCVT_FIXED2F [UNSPEC_SCVTF UNSPEC_UCVTF]) -(define_int_iterator FRECP [UNSPEC_FRECPE UNSPEC_FRECPX]) - (define_int_iterator CRC [UNSPEC_CRC32B UNSPEC_CRC32H UNSPEC_CRC32W UNSPEC_CRC32X UNSPEC_CRC32CB UNSPEC_CRC32CH UNSPEC_CRC32CW UNSPEC_CRC32CX]) @@ -1788,8 +1786,6 @@ (define_int_attr hi_lanes_optab [(UNSPEC (UNSPEC_UNPACKSLO "BYTES_BIG_ENDIAN") (UNSPEC_UNPACKULO "BYTES_BIG_ENDIAN")]) -(define_int_attr frecp_suffix [(UNSPEC_FRECPE "e") (UNSPEC_FRECPX "x")]) - (define_int_attr crc_variant [(UNSPEC_CRC32B "crc32b") (UNSPEC_CRC32H "crc32h") (UNSPEC_CRC32W "crc32w") (UNSPEC_CRC32X "crc32x") (UNSPEC_CRC32CB "crc32cb") (UNSPEC_CRC32CH "crc32ch") Index: gcc/config/aarch64/aarch64-simd.md === --- gcc/config/aarch64/aarch64-simd.md 2018-08-02 11:59:06.851355923 +0100 +++ gcc/config/aarch64/aarch64-simd.md 2018-08-03 16:32:07.531492221 +0100 @@ -5879,21 +5879,22 @@ (define_insn "aarch64_simd_ld1_x2" (define_insn "@aarch64_frecpe" - [(set (match_operand:VHSDF 0 "register_operand" "=w") - (unspec:VHSDF [(match_operand:VHSDF 1 "register_operand" "w")] + [(set (match_operand:VHSDF_HSDF 0 "register_operand" "=w") + (unspec:VHSDF_HSDF +[(match_operand:VHSDF_HSDF 1 "register_operand" "w")] UNSPEC_FRECPE))] "TARGET_SIMD" - "frecpe\\t%0., %1." 
+ "frecpe\t%0, %1" [(set_attr "type" "neon_fp_recpe_")] ) -(define_insn "aarch64_frecp" +(define_insn "aarch64_frecpx" [(set (match_operand:GPF_F16 0 "register_operand" "=w") (unspec:GPF_F16 [(match_operand:GPF_F16 1 "register_operand" "w")] -FRECP))] +UNSPEC_FRECPX))] "TARGET_SIMD" - "frecp\\t%0, %1" - [(set_attr "type" "neon_fp_recp_")] + "frecpx\t%0, %1" + [(set_attr "type" "neon_fp_recpx_")] ) (define_insn "@aarch64_frecps" Index: gcc/config/aarch64/aarch64-simd-builtins.def === --- gcc/config/aarch64/aarch64-simd-builtins.def2018-06-14 12:27:40.672026808 +0100 +++ gcc/config/aarch64/aarch64-simd-builtins.def2018-08-03 16:32:07.531492221 +0100 @@ -413,8 +413,6 @@ BUILTIN_VALL (BINOP, trn1, 0) BUILTIN_VALL (BINOP, trn2, 0) - /* Implemented by - aarch64_frecp. */ BUILTIN_GPF_F16 (UNOP, frecpe, 0) BUILTIN_GPF_F16 (UNOP, frecpx, 0) Index: gcc/testsuite/gcc.target/aarch64/frecpe_1.c === --- /dev/null 2018-07-26 10:26:13.137955424 +0100 +++ gcc/testsuite/gcc.target/aarch64/frecpe_1.c 2018-08-03 16:32:07.531492221 +0100 @@ -0,0 +1,18 @@ +/* { dg-options "-Ofast -mlow-precision-div" } */ +/* { dg-do compile } */ + +float +f1 (float x) +{ + return 1 / x; +} + +/* { dg-final { scan-assembler {\tfrecpe\t(s[0-9]+), s0\n\tfrecps\t(s[0-9]+), \1, s0\n\tfmul\ts0, \1, \2\n} } } */ + +double +f2 (double x) +{ + return 1 / x; +} + +/* { dg-final { scan-assembler {\tfrecpe\t(d[0-9]+), d0\n\tfrecps\t(d[0-9]+), \1, d0\n\tfmul\t\1, \1,
Re: [PATCH] Avoid infinite loop with duplicate anonymous union fields
On Wed, 1 Aug 2018, Bogdan Harjoc wrote: > > seen_error () is the idiomatic way of testing whether an error has been > > reported. > > The updated patch is attached and includes a test that passes with: > > make check-gcc RUNTESTFLAGS="dg.exp=union-duplicate-field.c" Thanks, committed. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH,nvptx] Use CUDA driver API to select default runtime launch, geometry
On 08/01/2018 09:11 PM, Cesar Philippidis wrote: > On 08/01/2018 07:12 AM, Tom de Vries wrote: > > + gangs = grids * (blocks / warp_size); So, we launch with gangs == grids * workers ? Is that intentional? >>> >>> Yes. At least that's what I've been using in og8. Setting num_gangs = >>> grids alone caused significant slow downs. >>> >> >> Well, what you're saying here is: increasing num_gangs increases >> performance. >> >> You don't explain why you multiply with workers specifically. > > I set it that way because I think the occupancy calculator is > determining the occupancy of a single multiprocessor unit, rather than > the entire GPU. Looking at the og8 code again, I had > >num_gangs = 2 * threads_per_sm / warp_size * dev_size > > which corresponds to > >2 * grids * blocks / warp_size > I've done an experiment using the sample simpleOccupancy. The kernel is small, so the blocks returned is the maximum: max_threads_per_block (1024). The grids returned is 10, which I tentatively interpret as num_dev * (max_threads_per_multi_processor / blocks). [ Where num_dev == 5, and max_threads_per_multi_processor == 2048. ] Substituting that into the og8 code, and equating max_threads_per_multi_processor with threads_per_sm, I indeed get num_gangs = 2 * grids * blocks / warp_size. So with this extra information I see how you got there. But I still see no rationale why blocks is used here, and I wonder whether something like num_gangs = grids * 64 would give similar results. Anyway, given that this is what is used on og8, I'm ok with using that, so let's go with: ... gangs = 2 * grids * (blocks / warp_size); ... [ so, including the factor two you explicitly left out from the original patch. Unless you see a pressing reason not to include it. ] Can you repost after retesting? [ note: the updated patch I posted earlier doesn't apply on trunk anymore due to the cuda-lib.def change. ] Thanks, - Tom
Re: [PATCH] docs: fix stray duplicated words
On Fri, 3 Aug 2018, David Malcolm wrote: > A manpage checker [1] detected various stray repeated words > in the generated manpages for gcc and gcov, which this patch > fixes. > > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. > > OK for trunk? OK (in my view this patch is obvious). -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH, Ada] Makefile patches from initial RISC-V cross/native build.
On Aug 03 2018, Pierre-Marie de Rodat wrote: > Understood, thank you for the context. Let’s try to fix the underlying > problem if you have issues with canadian crosses again. :-) There should at least be a way to find the build gnatmake under a different name. Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different."
Re: [PATCH][GCC][AARCH64] Use STLUR for atomic_store
On 02/08/18 17:26, matthew.malcom...@arm.com wrote: > Use the STLUR instruction introduced in Armv8.4-a. > This instruction has the store-release semantic like STLR but can take a > 9-bit unscaled signed immediate offset. > > Example test case: > ``` > void > foo () > { > int32_t *atomic_vals = calloc (4, sizeof (int32_t)); > atomic_store_explicit (atomic_vals + 1, 2, memory_order_release); > } > ``` > > Before patch generates > ``` > foo: > stp x29, x30, [sp, -16]! > mov x1, 4 > mov x0, x1 > mov x29, sp > bl calloc > mov w1, 2 > add x0, x0, 4 > stlr w1, [x0] > ldp x29, x30, [sp], 16 > ret > ``` > > After patch generates > ``` > foo: > stp x29, x30, [sp, -16]! > mov x1, 4 > mov x0, x1 > mov x29, sp > bl calloc > mov w1, 2 > stlur w1, [x0, 4] > ldp x29, x30, [sp], 16 > ret > ``` > > Full bootstrap and regression test done on aarch64. > > Ok for trunk? > > gcc/ > 2018-07-26 Matthew Malcomson > > * config/aarch64/aarch64-protos.h > (aarch64_offset_9bit_signed_unscaled_p): New declaration. > * config/aarch64/aarch64.c > (aarch64_offset_9bit_signed_unscaled_p): Rename from > offset_9bit_signed_unscaled_p. > * config/aarch64/aarch64.h (TARGET_ARMV8_4): Add feature macro. > * config/aarch64/atomics.md (atomic_store): Allow offset > and use stlur. > * config/aarch64/constraints.md (Ust): New constraint. > * config/aarch64/predicates.md. > (aarch64_sync_or_stlur_memory_operand): New predicate. Thinking further ahead, there were several new instructions added that have the same basic form; not all of them are STLUR. At some point I expect we will want to add support for those into GCC. So I think you should consider a more generic name, perhaps aarch64_sync_offset_memory_operand? R. > > gcc/testsuite/ > 2018-07-26 Matthew Malcomson > > * gcc.target/aarch64/atomic-store.c: New.
> > > use-stlur-instruction.patch > > > diff --git a/gcc/config/aarch64/aarch64-protos.h > b/gcc/config/aarch64/aarch64-protos.h > index > af5db9c595385f7586692258f750b6aceb3ed9c8..630a75bf776fcdc374aa9ffa4bb020fea3719320 > 100644 > --- a/gcc/config/aarch64/aarch64-protos.h > +++ b/gcc/config/aarch64/aarch64-protos.h > @@ -393,6 +393,7 @@ void aarch64_split_add_offset (scalar_int_mode, rtx, rtx, > rtx, rtx, rtx); > bool aarch64_mov_operand_p (rtx, machine_mode); > rtx aarch64_reverse_mask (machine_mode, unsigned int); > bool aarch64_offset_7bit_signed_scaled_p (machine_mode, poly_int64); > +bool aarch64_offset_9bit_signed_unscaled_p (machine_mode, poly_int64); > char *aarch64_output_sve_cnt_immediate (const char *, const char *, rtx); > char *aarch64_output_sve_addvl_addpl (rtx, rtx, rtx); > char *aarch64_output_sve_inc_dec_immediate (const char *, rtx); > diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h > index > c1218503bab19323eee1cca8b7e4bea8fbfcf573..328512e11f4230e24223bc51e55bdca8b31f6a20 > 100644 > --- a/gcc/config/aarch64/aarch64.h > +++ b/gcc/config/aarch64/aarch64.h > @@ -237,6 +237,9 @@ extern unsigned aarch64_architecture_version; > /* ARMv8.3-A features. */ > #define TARGET_ARMV8_3 (AARCH64_ISA_V8_3) > > +/* ARMv8.4-A features. */ > +#define TARGET_ARMV8_4 (AARCH64_ISA_V8_4) > + > /* Make sure this is always defined so we don't have to check for ifdefs > but rather use normal ifs. */ > #ifndef TARGET_FIX_ERR_A53_835769_DEFAULT > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > index > fa01475aa9ee579b6a3b2526295b622157120660..8560150d552b4f830335ccd420a0749f01d4bb6a > 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -4546,8 +4546,8 @@ aarch64_offset_7bit_signed_scaled_p (machine_mode mode, > poly_int64 offset) > > /* Return true if OFFSET is a signed 9-bit value. 
*/ > > -static inline bool > -offset_9bit_signed_unscaled_p (machine_mode mode ATTRIBUTE_UNUSED, > +bool > +aarch64_offset_9bit_signed_unscaled_p (machine_mode mode ATTRIBUTE_UNUSED, > poly_int64 offset) > { >HOST_WIDE_INT const_offset; > @@ -5823,7 +5823,7 @@ aarch64_classify_address (struct aarch64_address_info > *info, >instruction memory accesses. */ > if (mode == TImode || mode == TFmode) > return (aarch64_offset_7bit_signed_scaled_p (DImode, offset) > - && (offset_9bit_signed_unscaled_p (mode, offset) > + && (aarch64_offset_9bit_signed_unscaled_p (mode, offset) > || offset_12bit_unsigned_scaled_p (mode, offset))); > > /* A 7bit offset check because OImode will emit a ldp/stp > @@ -5837,7 +5837,7 @@ aarch64_classify_address (struct aarch64_address_info > *info, >ldr/str instructions (only big
Re: [C++ PATCH] Fix ICE in build_base_path (PR c++/86706)
On 08/03/2018 02:45 AM, Jakub Jelinek wrote: Hi! This is Jason's patch from the PR for which I've added a reduced testcase and bootstrapped/regtested on x86_64-linux and i686-linux. Ok for trunk and 8.3 (perhaps after a while)? 2018-07-30 Jason Merrill PR c++/86706 * class.c (build_base_path): Use currently_open_class. * g++.dg/template/pr86706.C: New test. Ok nathan -- Nathan Sidwell
Re: [PATCH] Use getentropy() for seeding PRNG
> On Aug 3, 2018, at 9:19 AM, Janne Blomqvist wrote: > > The getentropy function, found on Linux, OpenBSD, and recently also > FreeBSD, can be used to get random bytes to initialize the PRNG. It > is similar to the traditional way of reading from /dev/urandom, but > being a system call rather than a special file, it doesn't suffer from > problems like running out of file descriptors, or failure when running > in a container where /dev/urandom is not available. I don't understand why this is useful. getrandom, and /dev/random, are for strong (secure) RNGs. A PRNG is something entirely different. By saying we use entropy to seed it, we blur the distinction and create the false impression that the PRNG has security properties. It would be better to initialize with something more obviously insecure, like gettimeofday(). paul
Re: [PATCH] Use getentropy() for seeding PRNG
On Fri, Aug 3, 2018 at 4:28 PM, Jakub Jelinek wrote: > On Fri, Aug 03, 2018 at 04:19:03PM +0300, Janne Blomqvist wrote: > > --- a/libgfortran/intrinsics/random.c > > +++ b/libgfortran/intrinsics/random.c > > @@ -309,12 +309,9 @@ getosrandom (void *buf, size_t buflen) > >for (size_t i = 0; i < buflen / sizeof (unsigned int); i++) > > rand_s ([i]); > >return buflen; > > +#elif defined(HAVE_GETENTROPY) > > + return getentropy (buf, buflen); > > #else > > I wonder if it wouldn't be better to use getentropy only if it is > successful > and fall back to whatever you were doing before. > > E.g. on Linux, I believe getentropy in glibc just uses the getrandom > syscall, which has only been introduced in Linux kernel 3.17. > Yes, that is my understanding as well. > So, if somebody is running glibc 2.25 or later on kernel < 3.17, it will > fail, even though reads from /dev/urandom could work. > Hmm, fair enough. So replace the random.c part of the patch with diff --git a/libgfortran/intrinsics/random.c b/libgfortran/intrinsics/random.c index 234c5ff95fd..229fa6995c0 100644 --- a/libgfortran/intrinsics/random.c +++ b/libgfortran/intrinsics/random.c @@ -310,11 +310,10 @@ getosrandom (void *buf, size_t buflen) rand_s ([i]); return buflen; #else - /* - TODO: When glibc adds a wrapper for the getrandom() system call - on Linux, one could use that. - - TODO: One could use getentropy() on OpenBSD. */ +#ifdef HAVE_GETENTROPY + if (getentropy (buf, buflen) == 0) +return 0; +#endif int flags = O_RDONLY; #ifdef O_CLOEXEC flags |= O_CLOEXEC; Just to be sure, I regtested this slightly modified patch as well. Ok for trunk? -- Janne Blomqvist
Re: [PATCH] Use getentropy() for seeding PRNG
On Fri, Aug 03, 2018 at 04:19:03PM +0300, Janne Blomqvist wrote: > --- a/libgfortran/intrinsics/random.c > +++ b/libgfortran/intrinsics/random.c > @@ -309,12 +309,9 @@ getosrandom (void *buf, size_t buflen) >for (size_t i = 0; i < buflen / sizeof (unsigned int); i++) > rand_s ([i]); >return buflen; > +#elif defined(HAVE_GETENTROPY) > + return getentropy (buf, buflen); > #else I wonder if it wouldn't be better to use getentropy only if it is successful and fall back to whatever you were doing before. E.g. on Linux, I believe getentropy in glibc just uses the getrandom syscall, which has only been introduced in Linux kernel 3.17. So, if somebody is running glibc 2.25 or later on kernel < 3.17, it will fail, even though reads from /dev/urandom could work. > - /* > - TODO: When glibc adds a wrapper for the getrandom() system call > - on Linux, one could use that. > - > - TODO: One could use getentropy() on OpenBSD. */ >int flags = O_RDONLY; > #ifdef O_CLOEXEC >flags |= O_CLOEXEC; Jakub
Re: [Patch][GCC] Document and fix -r (partial linking)
Hi Allan, > On 3 Aug 2018, at 12:56, Allan Sandfeld Jensen wrote: > > On Mittwoch, 1. August 2018 18:32:30 CEST Joseph Myers wrote: >> On Wed, 1 Aug 2018, Allan Sandfeld Jensen wrote: >>> gcc/ > > 2018-07-29 Allan Sandfeld Jensen > > gcc/doc > >* invoke.texi: Document -r > > gcc/ >* gcc.c (LINK_COMMAND_SPEC): Handle -r like -nostdlib >* config/darwin.h (LINK_COMMAND_SPEC): Handle -r like -nostdlib > --- > gcc/config/darwin.h | 8 > gcc/doc/invoke.texi | 7 ++- > gcc/gcc.c | 6 +++--- > 3 files changed, 13 insertions(+), 8 deletions(-) > > diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h > index 980ad9b4057..6ad657a4958 100644 > --- a/gcc/config/darwin.h > +++ b/gcc/config/darwin.h > @@ -178,20 +178,20 @@ extern GTY(()) int darwin_ms_struct; >"%X %{s} %{t} %{Z} %{u*} \ > %{e*} %{r} \ > %{o*}%{!o:-o a.out} \ > -%{!nostdlib:%{!nostartfiles:%S}} \ > +%{!nostdlib:%{!r:%{!nostartfiles:%S}}} \ > %{L*} %(link_libgcc) %o > %{fprofile-arcs|fprofile-generate*|coverage:-lgcov} \ > %{fopenacc|fopenmp|%:gt(%{ftree-parallelize-loops=*:%*} 1): \ > %{static|static-libgcc|static-libstdc++|static-libgfortran: > libgomp.a%s; : -lgomp } } \ > %{fgnu-tm: \ > %{static|static-libgcc|static-libstdc++|static-libgfortran: libitm.a%s; > : -litm } } \ > -%{!nostdlib:%{!nodefaultlibs:\ > +%{!nostdlib:%{!r:%{!nodefaultlibs:\ > %{%:sanitize(address): -lasan } \ > %{%:sanitize(undefined): -lubsan } \ > %(link_ssp) \ > " DARWIN_EXPORT_DYNAMIC " % %(link_gcc_c_sequence) \ > -}}\ > -%{!nostdlib:%{!nostartfiles:%E}} %{T*} %{F*} }}}" > +}}}\ > +%{!nostdlib:%{!r:%{!nostartfiles:%E}}} %{T*} %{F*} }}}" > > #define DSYMUTIL "\ndsymutil” The Darwin part looks reasonable to me. Thanks for doing this, Iain
Re: [PATCH, Ada] Makefile patches from initial RISC-V cross/native build.
On 08/02/2018 04:44 PM, Jim Wilson wrote: I only needed it for the first canadian cross build. If it is causing problems for you it can be dropped. If I need it again, it is easy enough to write again. Ok, thank you! I’ve just committed the revert. I ran into a problem where the search paths were wrong when running the canadian cross built compiler for the native build, and I had to add something like -cargs -L$installdir/lib to gnatmake to make it work, but it is possible I did something wrong. These canadian cross builds aren't something I do often, so I might have made a mistake. Also, RISC-V is a new target, so there are lots of things that are just a little broken and have to be worked around. This might have been related to one of those problems. Understood, thank you for the context. Let’s try to fix the underlying problem if you have issues with canadian crosses again. :-) -- Pierre-Marie de Rodat
[PATCH] Use getentropy() for seeding PRNG
The getentropy function, found on Linux, OpenBSD, and recently also FreeBSD, can be used to get random bytes to initialize the PRNG. It is similar to the traditional way of reading from /dev/urandom, but being a system call rather than a special file, it doesn't suffer from problems like running out of file descriptors, or failure when running in a container where /dev/urandom is not available. Regtested on x86_64-pc-linux-gnu, Ok for trunk? 2018-08-03 Janne Blomqvist * configure.ac: Check for getentropy. * intrinsics/random.c (getosrandom): Use getentropy if available. * config.h.in: Regenerated. * configure: Regenerated. --- libgfortran/configure.ac| 3 ++- libgfortran/intrinsics/random.c | 7 ++- 2 files changed, 4 insertions(+), 6 deletions(-) diff --git a/libgfortran/configure.ac b/libgfortran/configure.ac index bf6d3634dda..900c7466dec 100644 --- a/libgfortran/configure.ac +++ b/libgfortran/configure.ac @@ -312,7 +312,8 @@ if test "${hardwire_newlib:-0}" -eq 1; then fi else AC_CHECK_FUNCS_ONCE(getrusage times mkstemp strtof strtold snprintf \ - ftruncate chsize chdir getlogin gethostname kill link symlink sleep ttyname \ + ftruncate chsize chdir getentropy getlogin gethostname kill link symlink \ + sleep ttyname \ alarm access fork setmode fcntl \ gettimeofday stat fstat lstat getpwuid vsnprintf dup \ getcwd localtime_r gmtime_r getpwuid_r ttyname_r clock_gettime \ diff --git a/libgfortran/intrinsics/random.c b/libgfortran/intrinsics/random.c index 234c5ff95fd..1f47f32188b 100644 --- a/libgfortran/intrinsics/random.c +++ b/libgfortran/intrinsics/random.c @@ -309,12 +309,9 @@ getosrandom (void *buf, size_t buflen) for (size_t i = 0; i < buflen / sizeof (unsigned int); i++) rand_s ([i]); return buflen; +#elif defined(HAVE_GETENTROPY) + return getentropy (buf, buflen); #else - /* - TODO: When glibc adds a wrapper for the getrandom() system call - on Linux, one could use that. - - TODO: One could use getentropy() on OpenBSD. 
*/ int flags = O_RDONLY; #ifdef O_CLOEXEC flags |= O_CLOEXEC; -- 2.17.1
Re: PING [PATCH] warn for strlen of arrays with missing nul (PR 86552, 86711, 86714) )
On 08/02/18 22:34, Martin Sebor wrote: > On 08/02/2018 12:56 PM, Bernd Edlinger wrote: >> On 08/02/18 15:26, Bernd Edlinger wrote: /* If the length can be computed at compile-time, return it. */ - len = c_strlen (src, 0); + tree array; + tree len = c_strlen (src, 0, ); >>> >>> You know the c_strlen tries to compute wide character sizes, >>> but strlen does not do that, strlen (L"abc") should give 1 >>> (or 0 on a BE machine) >>> I wonder if that is correct. >>> >> [snip] static tree -fold_builtin_strlen (location_t loc, tree type, tree arg) +fold_builtin_strlen (location_t loc, tree fndecl, tree type, tree arg) { if (!validate_arg (arg, POINTER_TYPE)) return NULL_TREE; else { - tree len = c_strlen (arg, 0); - + tree arr = NULL_TREE; + tree len = c_strlen (arg, 0, ); >>> >>> Is it possible to write a test case where strlen(L"test") reaches this >>> point? >>> what will c_strlen return then? >>> >> >> Yes, of course it is: >> >> $ cat y.c >> int f(char *x) >> { >> return __builtin_strlen(x); >> } >> >> int main () >> { >> return f((char*)"abcdef"[0]); >> } >> $ gcc -O3 -S y.c >> $ cat y.s >> main: >> .LFB1: >> .cfi_startproc >> movl $6, %eax >> ret >> .cfi_endproc >> >> The reason is that c_strlen tries to fold wide chars at all. >> I do not know when that was introduced, was that already before your last >> patches? > > The function itself was introduced in 1992 if not earlier, > before wide strings even existed. AFAICS, it has always > accepted strings of all widths. Until r241489 (in GCC 7) > it computed their length in bytes, not characters. I don't > know if that was on purpose or if it was just never changed > to compute the length in characters when wide strings were > first introduced. From the name I assume it's the latter. > The difference wasn't detected until sprintf tests were added > for wide string directives. The ChangeLog description for > the change reads: Correctly handle wide strings. I didn't > consider pathological cases like strlen (L"abc"). 
> It shouldn't be difficult to change to fix this case. Oh, oh, oh $ cat y3.c int main () { char c[100]; int x = __builtin_sprintf (c, "%S", L"\u"); __builtin_printf("%d %ld\n", x,__builtin_strlen(c)); } $ gcc-4.8 -O3 -std=c99 y3.c $ ./a.out -1 0 $ gcc -O3 y3.c $ ./a.out 1 0 $ echo $LANG de_DE.UTF-8 I would have expected L"\u" to be converted to UTF-8 or another encoding, so the return value of sprintf is far from obvious, and probably language dependent. Why do you think it is a good idea to use really every opportunity to optimize totally unnecessary things like using the return value from the sprintf function as it is? Did you never think this adds a significant maintenance burden on the rest of us as well? Bernd.
Re: Handle SLP of call pattern statements
Richard Biener writes: > On Fri, Jul 20, 2018 at 12:22 PM Richard Sandiford > wrote: >> >> We couldn't vectorise: >> >> for (int j = 0; j < n; ++j) >> { >> for (int i = 0; i < 16; ++i) >> a[i] = (b[i] + c[i]) >> 1; >> a += step; >> b += step; >> c += step; >> } >> >> at -O3 because cunrolli unrolled the inner loop and SLP couldn't handle >> AVG_FLOOR patterns (see also PR86504). The problem was some overly >> strict checking of pattern statements compared to normal statements >> in vect_get_and_check_slp_defs: >> >> switch (gimple_code (def_stmt)) >> { >> case GIMPLE_PHI: >> case GIMPLE_ASSIGN: >> break; >> >> default: >> if (dump_enabled_p ()) >> dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, >> "unsupported defining stmt:\n"); >> return -1; >> } >> >> The easy fix would have been to add GIMPLE_CALL to the switch, >> but I don't think the switch is doing anything useful. We only create >> pattern statements that the rest of the vectoriser can handle, and code >> in this function and elsewhere check whether SLP is possible. >> >> I'm also not sure why: >> >> if (!first && !oprnd_info->first_pattern >> /* Allow different pattern state for the defs of the >> first stmt in reduction chains. */ >> && (oprnd_info->first_dt != vect_reduction_def >> >> is necessary. All that should matter is that the statements in the >> node are "similar enough". It turned out to be quite hard to find a >> convincing example that used a mixture of pattern and non-pattern >> statements, so bb-slp-pow-1.c is the best I could come up with. >> But it does show that the combination of "xi * xi" statements and >> "pow (xj, 2) -> xj * xj" patterns are handled correctly. >> >> The patch therefore just removes the whole if block. >> >> The loop also needed commutative swapping to be extended to at least >> AVG_FLOOR. >> >> This gives +3.9% on 525.x264_r at -O3. >> >> Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf >> and x86_64-linux-gnu. OK to install? 
> > Always nice to see seemingly dead code go. OK if you can still > build SPEC with this change and pass a test run. > > At least I _do_ seem to remember having "issues" in this area... Thanks. I've now applied it after testing SPEC2006 and SPEC2017 on aarch64-linux-gnu and x86_64-linux-gnu. I also ran it through our internal SVE benchmark set, which includes those two and various other things. Richard
Re: [PATCH] PR libstdc++/60555 std::system_category() should recognise POSIX errno values
On 02/08/18 13:05 -0400, David Edelsohn wrote: On Thu, Aug 2, 2018 at 12:12 PM Jonathan Wakely wrote: On 02/08/18 11:32 -0400, David Edelsohn wrote: >Jonathan > >This patch broke AIX bootstrap. > >/nasfarm/edelsohn/src/src/libstdc++-v3/src/c++11/system_error.cc: In >member function 'virtual std::error_condition >{anonymous}::system_error_category::default_error_condition(int) >const': >/nasfarm/edelsohn/src/src/libstdc++-v3/src/c++11/system_error.cc:245:7: >error: duplicate case value > case ENOTEMPTY: > ^~~~ >/nasfarm/edelsohn/src/src/libstdc++-v3/src/c++11/system_error.cc:136:7: >note: previously used here > case EEXIST: > ^~~~ > >AIX errno.h > >/* > * AIX returns EEXIST where 4.3BSD used ENOTEMPTY; > * but, the standards insist on unique errno values for each errno. > * A unique value is reserved for users that want to code case > * statements for systems that return either EEXIST or ENOTEMPTY. > */ >#if defined(_ALL_SOURCE) && !defined(_LINUX_SOURCE_COMPAT) >#define ENOTEMPTY EEXIST /* Directory not empty */ >#else /* not _ALL_SOURCE */ >#define ENOTEMPTY 87 >#endif /* _ALL_SOURCE */ Strange that "linux source compat" is required for POSIX conformance. >I will add _LINUX_SOURCE_COMPAT and see how bootstrap proceeds. If it >works, it also will be required in the backport. Let's just workaround the maybe-not-unique values instead: --- a/libstdc++-v3/src/c++11/system_error.cc +++ b/libstdc++-v3/src/c++11/system_error.cc @@ -241,7 +241,8 @@ namespace #ifdef ENOTDIR case ENOTDIR: #endif -#ifdef ENOTEMPTY +#if defined ENOTEMPTY && (!defined EEXIST || ENOTEMPTY != EEXIST) + // AIX sometimes uses the same value for EEXIST and ENOTEMPTY case ENOTEMPTY: #endif #ifdef ENOTRECOVERABLE Whatever you prefer. _LINUX_SOURCE_COMPAT seems to be working for bootstrap so far. I've committed this to trunk. Tested x86_64-linux, powerpc64le-linux, powerpc-aix. 
commit f2848bf67225467c71853921d071f3c4a94431a0 Author: Jonathan Wakely Date: Fri Aug 3 13:24:06 2018 +0100 Add workaround for non-unique errno values on AIX * src/c++11/system_error.cc (system_error_category::default_error_condition): Add workaround for ENOTEMPTY and EEXIST having the same value on AIX. * testsuite/19_diagnostics/error_category/system_category.cc: Add extra testcases for EDOM, EILSEQ, ERANGE, EEXIST and ENOTEMPTY. diff --git a/libstdc++-v3/src/c++11/system_error.cc b/libstdc++-v3/src/c++11/system_error.cc index 82b4cb5f98c..07f44c0af9c 100644 --- a/libstdc++-v3/src/c++11/system_error.cc +++ b/libstdc++-v3/src/c++11/system_error.cc @@ -241,7 +241,8 @@ namespace #ifdef ENOTDIR case ENOTDIR: #endif -#ifdef ENOTEMPTY +#if defined ENOTEMPTY && (!defined EEXIST || ENOTEMPTY != EEXIST) + // AIX sometimes uses the same value for EEXIST and ENOTEMPTY case ENOTEMPTY: #endif #ifdef ENOTRECOVERABLE diff --git a/libstdc++-v3/testsuite/19_diagnostics/error_category/system_category.cc b/libstdc++-v3/testsuite/19_diagnostics/error_category/system_category.cc index 6076d735513..77cd9c5df83 100644 --- a/libstdc++-v3/testsuite/19_diagnostics/error_category/system_category.cc +++ b/libstdc++-v3/testsuite/19_diagnostics/error_category/system_category.cc @@ -34,6 +34,22 @@ test02() const std::error_category& cat = std::system_category(); std::error_condition cond; + // As of 2011, ISO C only defines EDOM, EILSEQ and ERANGE: + cond = cat.default_error_condition(EDOM); + VERIFY( cond.value() == EDOM ); + VERIFY( cond == std::errc::argument_out_of_domain ); + VERIFY( cond.category() == std::generic_category() ); + cond = cat.default_error_condition(EILSEQ); + VERIFY( cond.value() == EILSEQ ); + VERIFY( cond == std::errc::illegal_byte_sequence ); + VERIFY( cond.category() == std::generic_category() ); + cond = cat.default_error_condition(ERANGE); + VERIFY( cond.value() == ERANGE ); + VERIFY( cond == std::errc::result_out_of_range ); + VERIFY( cond.category() == 
std::generic_category() ); + + // EBADF and EACCES are defined on all targets, + // according to config/os/*/error_constants.h cond = cat.default_error_condition(EBADF); VERIFY( cond.value() == EBADF ); VERIFY( cond == std::errc::bad_file_descriptor ); @@ -52,8 +68,29 @@ test02() VERIFY( cond.category() == cat ); // PR libstdc++/60555 + VERIFY( std::error_code(EDOM, cat) == std::errc::argument_out_of_domain ); + VERIFY( std::error_code(EILSEQ, cat) == std::errc::illegal_byte_sequence ); + VERIFY( std::error_code(ERANGE, cat) == std::errc::result_out_of_range ); VERIFY( std::error_code(EBADF, cat) == std::errc::bad_file_descriptor ); VERIFY( std::error_code(EACCES, cat) == std::errc::permission_denied ); + + // As shown at https://gcc.gnu.org/ml/libstdc++/2018-08/msg00018.html + // these two error codes might have the same value on AIX, but we still + // expect both to be matched by
Re: [PATCH v2 0/4] some vxworks/powerpc patches
Hello Rasmus, (Back after a summer break & some backlog preemption) Thanks for resubmitting those patches. Re-reviewing; will get back to you when I'm done, hopefully soon :) Olivier > On 28 Jun 2018, at 10:43, Rasmus Villemoes wrote: > > Patches 1, 2 and 4 (the latter was number 6 previously) are unchanged > from the first round, apart from small changes in the commit log > wording. Patch 3 now includes parentheses in the macro > definition. Patches 4 and 5 are dropped. > > 4/4 is mostly a minor optimization, omitting a tiny bit of dead code > from the compiler binary. But the preprocessor directives also serve > as a reminder that the custom vxworks functions will not be used if > the compiler is built with --enable-initfini-array. > > Rasmus Villemoes (4): > vxworks: add target/h/wrn/coreip to the set of system include paths > libgcc: add crt{begin,end} for powerpc-wrs-vxworks target > vxworks: enable use of .init_array/.fini_array for cdtors > vxworks: don't define vxworks_asm_out_constructor when using >.init_array > > gcc/config/vxworks.c | 9 +++-- > gcc/config/vxworks.h | 21 +++-- > libgcc/config.host | 1 + > 3 files changed, 23 insertions(+), 8 deletions(-) > > -- > 2.16.4 >
Re: [PATCH][GCC][AARCH64] Use STLUR for atomic_store
Hi Matthew On 02/08/18 17:26, matthew.malcom...@arm.com wrote: Use the STLUR instruction introduced in Armv8.4-a. This instruction has the store-release semantic like STLR but can take a 9-bit unscaled signed immediate offset. Example test case: ``` void foo () { int32_t *atomic_vals = calloc (4, sizeof (int32_t)); atomic_store_explicit (atomic_vals + 1, 2, memory_order_release); } ``` Before patch generates ``` foo: stp x29, x30, [sp, -16]! mov x1, 4 mov x0, x1 mov x29, sp bl calloc mov w1, 2 add x0, x0, 4 stlr w1, [x0] ldp x29, x30, [sp], 16 ret ``` After patch generates ``` foo: stp x29, x30, [sp, -16]! mov x1, 4 mov x0, x1 mov x29, sp bl calloc mov w1, 2 stlur w1, [x0, 4] ldp x29, x30, [sp], 16 ret ``` Full bootstrap and regression test done on aarch64. Ok for trunk? gcc/ 2018-07-26 Matthew Malcomson * config/aarch64/aarch64-protos.h (aarch64_offset_9bit_signed_unscaled_p): New declaration. * config/aarch64/aarch64.c (aarch64_offset_9bit_signed_unscaled_p): Rename from offset_9bit_signed_unscaled_p. * config/aarch64/aarch64.h (TARGET_ARMV8_4): Add feature macro. * config/aarch64/atomics.md (atomic_store): Allow offset and use stlur. * config/aarch64/constraints.md (Ust): New constraint. * config/aarch64/predicates.md. (aarch64_sync_or_stlur_memory_operand): New predicate. gcc/testsuite/ 2018-07-26 Matthew Malcomson * gcc.target/aarch64/atomic-store.c: New. Thank you for doing this. I am not a maintainer but I have a few nits on this patch: diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index af5db9c595385f7586692258f750b6aceb3ed9c8..630a75bf776fcdc374aa9ffa4bb020fea3719320 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -393,6 +393,7 @@ void aarch64_split_add_offset (scalar_int_mode, rtx, rtx, rtx, rtx, rtx); bool aarch64_mov_operand_p (rtx, machine_mode); ...
-static inline bool -offset_9bit_signed_unscaled_p (machine_mode mode ATTRIBUTE_UNUSED, +bool +aarch64_offset_9bit_signed_unscaled_p (machine_mode mode ATTRIBUTE_UNUSED, poly_int64 offset) This needs to be aligned with the first argument ... @@ -5837,7 +5837,7 @@ aarch64_classify_address (struct aarch64_address_info *info, ldr/str instructions (only big endian will get here). */ if (mode == CImode) return (aarch64_offset_7bit_signed_scaled_p (TImode, offset) - && (offset_9bit_signed_unscaled_p (V16QImode, offset + 32) + && (aarch64_offset_9bit_signed_unscaled_p (V16QImode, offset + 32) This is not less than 80 characters ... +;; STLUR instruction constraint requires Armv8.4 +(define_special_memory_constraint "Ust" + "@internal + A memory address suitable for use with an stlur instruction." + (and (match_operand 0 "aarch64_sync_or_stlur_memory_operand") + (match_test "TARGET_ARMV8_4"))) + You are already checking for TARGET_ARMV8_4 inside aarch64_sync_or_stlur_memory_operand. Also see my comment below for this function. ... +;; True if the operand is memory reference valid for one of a str or stlur +;; operation. +(define_predicate "aarch64_sync_or_stlur_memory_operand" + (ior (match_operand 0 "aarch64_sync_memory_operand") + (and (match_operand 0 "memory_operand") + (match_code "plus" "0") + (match_code "reg" "00") + (match_code "const_int" "01"))) +{ + if (aarch64_sync_memory_operand (op, mode)) +return true; + + if (!TARGET_ARMV8_4) +return false; + + rtx mem_op = XEXP (op, 0); + rtx plus_op0 = XEXP (mem_op, 0); + rtx plus_op1 = XEXP (mem_op, 1); + + if (GET_MODE (plus_op0) != DImode) +return false; + + poly_int64 offset; + poly_int_rtx_p (plus_op1, &offset); + return aarch64_offset_9bit_signed_unscaled_p (mode, offset); +}) + This predicate body makes it a bit mixed up with the two types of operands that you want to test especially looking at it from the constraint check perspective.
I am assuming you would not want to use the non-immediate form of stlur and instead only use it in the form: STLUR <Wt>, [<Xn|SP>, #<simm>] and use stlr for the no-immediate alternative. Thus the constraint does not need to check for aarch64_sync_memory_operand. My suggestion would be to make this operand check separate. Something like: +(define_predicate "aarch64_sync_or_stlur_memory_operand" + (ior (match_operand 0 "aarch64_sync_memory_operand") + (match_operand 0 "aarch64_stlur_memory_operand"))) Where you define aarch64_stlur_memory_operand as +bool aarch64_stlur_memory_operand (rtx op) +{ + if (!TARGET_ARMV8_4) +return false; + + rtx mem_op = XEXP
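For reference, the legality test the predicate ultimately delegates to boils down to a 9-bit signed, unscaled (byte-granular) range check on the offset. A minimal sketch in plain C (hypothetical helper name, not the exact GCC implementation):

```c
#include <assert.h>
#include <stdbool.h>

/* STLUR takes a 9-bit signed immediate byte offset, i.e. [-256, 255],
   with no scaling by the access size -- sketch of the range test.  */
static bool
offset_9bit_signed_unscaled (long offset)
{
  return offset >= -256 && offset <= 255;
}
```

So the `stlur w1, [x0, 4]` in the example above is legal, while an offset of 256 would force the base register to be adjusted first.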
Re: [Patch][GCC] Document and fix -r (partial linking)
On Mittwoch, 1. August 2018 18:32:30 CEST Joseph Myers wrote: > On Wed, 1 Aug 2018, Allan Sandfeld Jensen wrote: > > gcc/ > > > > * gcc.c: Correct default specs for -r > > I don't follow why your changes (which would need describing for each > individual spec changed) are corrections. > > > /* config.h can define LIB_SPEC to override the default libraries. */ > > #ifndef LIB_SPEC > > > > -#define LIB_SPEC "%{!shared:%{g*:-lg} > > %{!p:%{!pg:-lc}}%{p:-lc_p}%{pg:-lc_p}}" +#define LIB_SPEC > > "%{!shared|!r:%{g*:-lg} %{!p:%{!pg:-lc}}%{p:-lc_p}%{pg:-lc_p}}"> > > #endif > > '!' binds more closely than '|' in specs. That is, !shared|!r means the > following specs are used unless both -shared and -r are specified, which > seems nonsensical to me. I'd expect something more like "shared|r:;" to > expand to nothing if either -shared or -r is passed and to what follows if > neither is passed. > > And that ignores that this LIB_SPEC value in gcc.c is largely irrelevant, > as it's generally overridden by targets - and normally for targets using > ELF shared libraries, for example, -lc *does* have to be used when linking > with -shared. > > I think you're changing the wrong place for this. If you want -r to be > usable with GCC without using -nostdlib (which is an interesting > question), you actually need to change LINK_COMMAND_SPEC (also sometimes > overridden for targets) to handle -r more like -nostdlib -nostartfiles. > Okay, so like this? >From 9de68f2ef0b77a0c0bcf0d83232e7fc34b006406 Mon Sep 17 00:00:00 2001 From: Allan Sandfeld Jensen Date: Wed, 1 Aug 2018 18:07:05 +0200 Subject: [PATCH] Fix and document -r option The option has existed and been working for years, make sure it implies the right extra options, and list it in the documentation. 
2018-07-29 Allan Sandfeld Jensen gcc/doc * invoke.texi: Document -r gcc/ * gcc.c (LINK_COMMAND_SPEC): Handle -r like -nostdlib * config/darwin.h (LINK_COMMAND_SPEC): Handle -r like -nostdlib --- gcc/config/darwin.h | 8 gcc/doc/invoke.texi | 7 ++- gcc/gcc.c | 6 +++--- 3 files changed, 13 insertions(+), 8 deletions(-) diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h index 980ad9b4057..6ad657a4958 100644 --- a/gcc/config/darwin.h +++ b/gcc/config/darwin.h @@ -178,20 +178,20 @@ extern GTY(()) int darwin_ms_struct; "%X %{s} %{t} %{Z} %{u*} \ %{e*} %{r} \ %{o*}%{!o:-o a.out} \ -%{!nostdlib:%{!nostartfiles:%S}} \ +%{!nostdlib:%{!r:%{!nostartfiles:%S}}} \ %{L*} %(link_libgcc) %o %{fprofile-arcs|fprofile-generate*|coverage:-lgcov} \ %{fopenacc|fopenmp|%:gt(%{ftree-parallelize-loops=*:%*} 1): \ %{static|static-libgcc|static-libstdc++|static-libgfortran: libgomp.a%s; : -lgomp } } \ %{fgnu-tm: \ %{static|static-libgcc|static-libstdc++|static-libgfortran: libitm.a%s; : -litm } } \ -%{!nostdlib:%{!nodefaultlibs:\ +%{!nostdlib:%{!r:%{!nodefaultlibs:\ %{%:sanitize(address): -lasan } \ %{%:sanitize(undefined): -lubsan } \ %(link_ssp) \ " DARWIN_EXPORT_DYNAMIC " %
Re: [c++] Don't emit exception tables for UI_NONE
On Fri, Aug 03, 2018 at 11:45:31AM +0200, Tom de Vries wrote: > If a target does not support exceptions, it can indicate this by returning > UI_NONE in TARGET_EXCEPT_UNWIND_INFO. Currently the compiler still emits > exception tables for such a target. > > This patch makes sure that no exception tables are emitted if the target does > not support exceptions. This allows us to remove a workaround in > TARGET_ASM_BYTE_OP in the nvptx port. > > Build on x86_64 with nvptx accelerator, and tested libgomp. > > OK for trunk if currently running bootstrap and reg-test on x86_64 succeeds? LGTM (for UI_NONE there is no personality function either, so nothing to decode .gcc_except_table ...). > 2018-08-03 Tom de Vries > > * common/config/nvptx/nvptx-common.c (nvptx_except_unwind_info): Return > UI_NONE. > * config/nvptx/nvptx.c (TARGET_ASM_BYTE_OP): Remove define. > * except.c (output_function_exception_table): Do early exit if > targetm_common.except_unwind_info (_options) == UI_NONE. Jakub
[c++] Don't emit exception tables for UI_NONE
[ was: Re: [PATCH][nvptx] Ignore c++ exceptions ] On Thu, Aug 02, 2018 at 06:10:50PM +0200, Tom de Vries wrote: > On 08/02/2018 05:39 PM, Jakub Jelinek wrote: > > On Thu, Aug 02, 2018 at 05:30:53PM +0200, Tom de Vries wrote: > >> The nvptx port can't support exceptions using sjlj, because ptx does not > >> support sjlj. However, default_except_unwind_info still returns UI_SJLJ, > >> even > >> even if we configure with --disable-sjlj-exceptions, because UI_SJLJ is the > >> fallback option. > >> > >> The reason default_except_unwind_info doesn't return UI_DWARF2 is because > >> DWARF2_UNWIND_INFO is not defined in defaults.h, because > >> INCOMING_RETURN_ADDR_RTX is not defined, because there's no ptx equivalent. > >> > >> Testcase libgomp.c++/for-15.C currently doesn't compile unless > >> fno-exceptions > >> is added because: > >> - it tries to generate sjlj exception handling code, and > >> - it tries to generate exception tables using label-addressed .byte > >> sequence. > >> Ptx doesn't support generating random data at a label, nor being able to > >> load/write data relative to a label. > >> > >> This patch fixes the first problem by using UI_TARGET for nvptx. > >> > >> The second problem is worked around by generating all .byte sequences > >> commented > >> out. It would be better to have a narrower workaround, and define > >> TARGET_ASM_BYTE_OP to "error: .byte unsupported " or some such. > > > > Hopefully this part is temporary and there is some way to figure this out > > later. > > > > I except that for UI_NONE we don't want to write out exception tables, > so that's something we can fix, and then we can return UI_NONE in > nvptx_except_unwind_info. > Hi, If a target does not support exceptions, it can indicate this by returning UI_NONE in TARGET_EXCEPT_UNWIND_INFO. Currently the compiler still emits exception tables for such a target. This patch makes sure that no exception tables are emitted if the target does not support exceptions. 
This allows us to remove a workaround in TARGET_ASM_BYTE_OP in the nvptx port. Build on x86_64 with nvptx accelerator, and tested libgomp. OK for trunk if currently running bootstrap and reg-test on x86_64 succeeds? Thanks, - Tom [c++] Don't emit exception tables for UI_NONE 2018-08-03 Tom de Vries * common/config/nvptx/nvptx-common.c (nvptx_except_unwind_info): Return UI_NONE. * config/nvptx/nvptx.c (TARGET_ASM_BYTE_OP): Remove define. * except.c (output_function_exception_table): Do early exit if targetm_common.except_unwind_info (_options) == UI_NONE. --- gcc/common/config/nvptx/nvptx-common.c | 2 +- gcc/config/nvptx/nvptx.c | 3 --- gcc/except.c | 3 ++- 3 files changed, 3 insertions(+), 5 deletions(-) diff --git a/gcc/common/config/nvptx/nvptx-common.c b/gcc/common/config/nvptx/nvptx-common.c index f31e0696281..3a124e8844d 100644 --- a/gcc/common/config/nvptx/nvptx-common.c +++ b/gcc/common/config/nvptx/nvptx-common.c @@ -33,7 +33,7 @@ along with GCC; see the file COPYING3. If not see enum unwind_info_type nvptx_except_unwind_info (struct gcc_options *opts ATTRIBUTE_UNUSED) { - return UI_TARGET; + return UI_NONE; } #undef TARGET_HAVE_NAMED_SECTIONS diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c index 62fefda8612..c0b0a2ec3ab 100644 --- a/gcc/config/nvptx/nvptx.c +++ b/gcc/config/nvptx/nvptx.c @@ -6051,9 +6051,6 @@ nvptx_can_change_mode_class (machine_mode, machine_mode, reg_class_t) #undef TARGET_HAVE_SPECULATION_SAFE_VALUE #define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed -#undef TARGET_ASM_BYTE_OP -#define TARGET_ASM_BYTE_OP "// .byte " - struct gcc_target targetm = TARGET_INITIALIZER; #include "gt-nvptx.h" diff --git a/gcc/except.c b/gcc/except.c index 84666d9041e..728b1e1d349 100644 --- a/gcc/except.c +++ b/gcc/except.c @@ -3189,7 +3189,8 @@ output_function_exception_table (int section) rtx personality = get_personality_function (current_function_decl); /* Not all functions need anything. 
*/ - if (!crtl->uses_eh_lsda) + if (!crtl->uses_eh_lsda + || targetm_common.except_unwind_info (_options) == UI_NONE) return; /* No need to emit any boilerplate stuff for the cold part. */
[C++ PATCH] Fix ICE in build_base_path (PR c++/86706)
Hi! This is Jason's patch from the PR for which I've added a reduced testcase and bootstrapped/regtested on x86_64-linux and i686-linux. Ok for trunk and 8.3 (perhaps after a while)? 2018-07-30 Jason Merrill PR c++/86706 * class.c (build_base_path): Use currently_open_class. * g++.dg/template/pr86706.C: New test. --- gcc/cp/class.c.jj 2018-07-30 00:21:26.507313463 +0200 +++ gcc/cp/class.c 2018-07-30 00:22:49.184447971 +0200 @@ -278,6 +278,9 @@ build_base_path (enum tree_code code, probe = TYPE_MAIN_VARIANT (TREE_TYPE (expr)); if (want_pointer) probe = TYPE_MAIN_VARIANT (TREE_TYPE (probe)); + if (dependent_type_p (probe)) +if (tree open = currently_open_class (probe)) + probe = open; if (code == PLUS_EXPR && !SAME_BINFO_TYPE_P (BINFO_TYPE (d_binfo), probe)) --- gcc/testsuite/g++.dg/template/pr86706.C.jj 2018-07-30 11:09:33.002035612 +0200 +++ gcc/testsuite/g++.dg/template/pr86706.C 2018-07-30 11:07:21.069813029 +0200 @@ -0,0 +1,16 @@ +// PR c++/86706 +// { dg-do compile } + +class A { int b; }; + +template <class T> +class C : A { C (); static C *f; }; + +template <class T> +C<T> *C<T>::f; + +template <class T> +C<T>::C () +{ + f->b; +} Jakub
Re: [PATCH 00/11] (v2) Mitigation against unsafe data speculation (CVE-2017-5753)
On 02/08/18 21:19, John David Anglin wrote: > On 2018-08-02 2:40 PM, Jeff Law wrote: >> It's been eons. I think there's enough building blocks on the PA to >> mount a spectre v1 attack. They've got branch prediction with varying >> degress of speculative execution, caches and user accessable cycle >> timers. > Yes. >> >> There's varying degrees of out of order execution all the way back in >> the PA7xxx processors (hit-under-miss) to full o-o-o execution in the >> PA8xxx series (including the PA8900 that's in the rp3440). > However, as far as I know, loads and stores are always ordered. >> >> I suspect that given enough time we could figure out why the test didn't >> indicate spectre v1 vulnerability on your system and twiddle it, but >> given it's a dead processor, I doubt it's worth the effort. > Spectre output looks like this: > dave@mx3210:~/meltdown$ ./spectre > Reading 40 bytes: > Reading at malicious_x = 0xef10... Unclear: 0xFE='?' score=999 > (second best: 0xFC score=999) > Reading at malicious_x = 0xef11... Unclear: 0xFC='?' score=999 > (second best: 0xFB score=999) > Reading at malicious_x = 0xef12... Unclear: 0xFE='?' score=999 > (second best: 0xFC score=999) > > I don't think there's a suitable barrier. The sync instruction seems > like overkill. > > So, I'm going to install attached change after testing is complete. > It's your call as port maintainers. I've created a PR for each unfixed architecture. Please can you commit the patch against that so that I can track things for back-porting. Thanks, R. 
> Dave > > > pa-spectre.d > > > Index: config/pa/pa.c > === > --- config/pa/pa.c(revision 263228) > +++ config/pa/pa.c(working copy) > @@ -428,6 +428,9 @@ > #undef TARGET_STARTING_FRAME_OFFSET > #define TARGET_STARTING_FRAME_OFFSET pa_starting_frame_offset > > +#undef TARGET_HAVE_SPECULATION_SAFE_VALUE > +#define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed > + > struct gcc_target targetm = TARGET_INITIALIZER; > > /* Parse the -mfixed-range= option string. */ >
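For context, the gadget that the spectre test program exercises is the classic bounds-check-bypass pattern; index masking approximates in plain C the mitigation this patch series provides through a builtin. A sketch (not the actual test harness):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 16           /* power of two, so masking works */
static uint8_t table[TABLE_SIZE] = { 42 };

/* Vulnerable shape: under misprediction of the bounds check, the
   out-of-range load may still execute speculatively and leave a
   cache footprint dependent on secret data.  */
static uint8_t
read_checked (size_t x)
{
  if (x < TABLE_SIZE)
    return table[x];
  return 0;
}

/* Masked variant: even a mispredicted branch cannot form an
   out-of-bounds address, because the index is clamped with a data
   dependency rather than control flow.  (GCC's
   __builtin_speculation_safe_value, added by this series,
   generalizes this idea with target-specific barriers.)  */
static uint8_t
read_masked (size_t x)
{
  if (x < TABLE_SIZE)
    return table[x & (TABLE_SIZE - 1)];
  return 0;
}
```

Architecturally both functions behave identically; the difference only matters under speculation, which is why a plain functional test like the one above cannot show the vulnerability that the timing-based spectre program probes for.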
[PATCH] docs: fix stray duplicated words
A manpage checker [1] detected various stray repeated words in the generated manpages for gcc and gcov, which this patch fixes. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. OK for trunk? Thanks Dave [1] https://bugzilla.redhat.com/show_bug.cgi?id=1611292 gcc/ChangeLog: * doc/gcov.texi (-x): Remove duplicate "to". * doc/invoke.texi (-Wnoexcept-type): Remove duplicate "calls". (-Wif-not-aligned): Remove duplicate "is". (-flto): Remove duplicate "the". (MicroBlaze Options): In examples of "-mcpu=cpu-type", remove duplicate "v5.00.b". (MSP430 Options): Remove duplicate "and" from the description of "-mgprel-sec=regexp". (x86 Options): Remove duplicate copies of "vmldLog102" and vmlsLog104 from description of "-mveclibabi=type". --- gcc/doc/gcov.texi | 2 +- gcc/doc/invoke.texi | 14 +++--- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi index dfa86f1..f33dc8f 100644 --- a/gcc/doc/gcov.texi +++ b/gcc/doc/gcov.texi @@ -340,7 +340,7 @@ Print verbose informations related to basic blocks and arcs. @item -x @itemx --hash-filenames -By default, gcov uses the full pathname of the source files to to create +By default, gcov uses the full pathname of the source files to create an output filename. This can lead to long filenames that can overflow filesystem limits. This option creates names of the form @file{@var{source-file}##@var{md5}.gcov}, diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 6047d82..841cb47 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -3056,7 +3056,7 @@ void h() @{ f(g); @} @end smallexample @noindent -In C++14, @code{f} calls calls @code{f}, but in +In C++14, @code{f} calls @code{f}, but in C++17 it calls @code{f}. @item -Wclass-memaccess @r{(C++ and Objective-C++ only)} @@ -4587,7 +4587,7 @@ The @option{-Wimplicit-fallthrough=3} warning is enabled by @option{-Wextra}. 
@opindex Wif-not-aligned @opindex Wno-if-not-aligned Control if warning triggered by the @code{warn_if_not_aligned} attribute -should be issued. This is is enabled by default. +should be issued. This is enabled by default. Use @option{-Wno-if-not-aligned} to disable it. @item -Wignored-qualifiers @r{(C and C++ only)} @@ -9613,7 +9613,7 @@ for LTO, use @command{gcc-ar} and @command{gcc-ranlib} instead of @command{ar} and @command{ranlib}; to show the symbols of object files with GIMPLE bytecode, use @command{gcc-nm}. Those commands require that @command{ar}, @command{ranlib} -and @command{nm} have been compiled with plugin support. At link time, use the the +and @command{nm} have been compiled with plugin support. At link time, use the flag @option{-fuse-linker-plugin} to ensure that the library participates in the LTO optimization process: @@ -20159,7 +20159,7 @@ Use features of, and schedule code for, the given CPU. Supported values are in the format @samp{v@var{X}.@var{YY}.@var{Z}}, where @var{X} is a major version, @var{YY} is the minor version, and @var{Z} is compatibility code. Example values are @samp{v3.00.a}, -@samp{v4.00.b}, @samp{v5.00.a}, @samp{v5.00.b}, @samp{v5.00.b}, @samp{v6.00.a}. +@samp{v4.00.b}, @samp{v5.00.a}, @samp{v5.00.b}, @samp{v6.00.a}. @item -mxl-soft-mul @opindex mxl-soft-mul @@ -21839,7 +21839,7 @@ GP-relative addressing. It is most useful in conjunction with The @var{regexp} is a POSIX Extended Regular Expression. This option does not affect the behavior of the @option{-G} option, and -and the specified sections are in addition to the standard @code{.sdata} +the specified sections are in addition to the standard @code{.sdata} and @code{.sbss} small-data sections that are recognized by @option{-mgpopt}. @item -mr0rel-sec=@var{regexp} @@ -27613,11 +27613,11 @@ To use this option, both @option{-ftree-vectorize} and ABI-compatible library must be specified at link time. 
GCC currently emits calls to @code{vmldExp2}, -@code{vmldLn2}, @code{vmldLog102}, @code{vmldLog102}, @code{vmldPow2}, +@code{vmldLn2}, @code{vmldLog102}, @code{vmldPow2}, @code{vmldTanh2}, @code{vmldTan2}, @code{vmldAtan2}, @code{vmldAtanh2}, @code{vmldCbrt2}, @code{vmldSinh2}, @code{vmldSin2}, @code{vmldAsinh2}, @code{vmldAsin2}, @code{vmldCosh2}, @code{vmldCos2}, @code{vmldAcosh2}, -@code{vmldAcos2}, @code{vmlsExp4}, @code{vmlsLn4}, @code{vmlsLog104}, +@code{vmldAcos2}, @code{vmlsExp4}, @code{vmlsLn4}, @code{vmlsLog104}, @code{vmlsPow4}, @code{vmlsTanh4}, @code{vmlsTan4}, @code{vmlsAtan4}, @code{vmlsAtanh4}, @code{vmlsCbrt4}, @code{vmlsSinh4}, @code{vmlsSin4}, @code{vmlsAsinh4}, @code{vmlsAsin4}, @code{vmlsCosh4}, -- 1.8.5.3
Re: Async I/O patch with compilation fix
On Thu, 2 Aug 2018 at 19:05, Nicolas Koenig wrote: > > On Thu, Aug 02, 2018 at 05:42:46PM +0200, Christophe Lyon wrote: > > On Thu, 2 Aug 2018 at 13:35, Nicolas Koenig > > wrote: > > > > > > > > > Hello everyone, > > > > > > Here is an updated version of the patch that hopefully fixes the > > > compilation > > > problems by disabling async I/O if conditions are not supported by the > > > target. > > > > > > I would appreciate if people could test it on systems on which it failed > > > before. As for the array_constructor_8.f90 failure reported in the PR, why > > > it fails is beyond me, it doesn't even use I/O. Maybe/Probably something > > > unrelated? > > > > > > > Hi, > > I'm probably missing something obvious, but after applying this patch > > on top of r263136, the builds fail while building libgfortran: > > /tmp/9271913_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgfortran/runtime/error.c:28:10: > > fatal error: async.h: No such file or directory > > #include "async.h" > > ^ > > compilation terminated. > > make[3]: *** [error.lo] Error 1 > > > > Hi, > > It wasn't you who missed something obvious. Typing `svn add` is hard. > Here is a version of the patch with the two new files. > OK, I applied this patch, and again I still see regressions on armeb-none-linux-gnueabihf --with-cpu cortex-a9 --with-fpu neon-fp16 FAIL: gfortran.dg/array_constructor_8.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test FAIL: gfortran.dg/array_constructor_8.f90 -O3 -g execution test gfortran.log contains: STOP 2 STOP 2 Christophe > Nicolas > > > > Nicolas > > > > > > > > > 2018-08-02 Nicolas Koenig > > > Thomas Koenig > > > > > > PR fortran/25829 > > > * gfortran.texi: Add description of asynchronous I/O. > > > * trans-decl.c (gfc_finish_var_decl): Treat asynchronous variables > > > as volatile. > > > * trans-io.c (gfc_build_io_library_fndecls): Rename st_wait to > > > st_wait_async and change argument spec from ".X" to ".w". 
> > > (gfc_trans_wait): Pass ID argument via reference. > > > > > > 2018-08-02 Nicolas Koenig > > > Thomas Koenig > > > > > > PR fortran/25829 > > > * gfortran.dg/f2003_inquire_1.f03: Add write statement. > > > * gfortran.dg/f2003_io_1.f03: Add wait statement. > > > > > > 2018-08-02 Nicolas Koenig > > > Thomas Koenig > > > > > > PR fortran/25829 > > > * Makefile.am: Add async.c to gfor_io_src. > > > Add async.h to gfor_io_headers. > > > * Makefile.in: Regenerated. > > > * gfortran.map: Add _gfortran_st_wait_async. > > > * io/async.c: New file. > > > * io/async.h: New file. > > > * io/close.c: Include async.h. > > > (st_close): Call async_wait for an asynchronous unit. > > > * io/file_pos.c (st_backspace): Likewise. > > > (st_endfile): Likewise. > > > (st_rewind): Likewise. > > > (st_flush): Likewise. > > > * io/inquire.c: Add handling for asynchronous PENDING > > > and ID arguments. > > > * io/io.h (st_parameter_dt): Add async bit. > > > (st_parameter_wait): Correct. > > > (gfc_unit): Add au pointer. > > > (st_wait_async): Add prototype. > > > (transfer_array_inner): Likewise. > > > (st_write_done_worker): Likewise. > > > * io/open.c: Include async.h. > > > (new_unit): Initialize asynchronous unit. > > > * io/transfer.c (async_opt): New struct. > > > (wrap_scalar_transfer): New function. > > > (transfer_integer): Call wrap_scalar_transfer to do the work. > > > (transfer_real): Likewise. > > > (transfer_real_write): Likewise. > > > (transfer_character): Likewise. > > > (transfer_character_wide): Likewise. > > > (transfer_complex): Likewise. > > > (transfer_array_inner): New function. > > > (transfer_array): Call transfer_array_inner. > > > (transfer_derived): Call wrap_scalar_transfer. > > > (data_transfer_init): Check for asynchronous I/O. > > > Perform a wait operation on any pending asynchronous I/O > > > if the data transfer is synchronous. Copy PDT and enqueue > > > thread for data transfer. > > > (st_read_done_worker): New function. 
> > > (st_read_done): Enqueue transfer or call st_read_done_worker. > > > (st_write_done_worker): New function. > > > (st_write_done): Enqueue transfer or call st_read_done_worker. > > > (st_wait): Document as no-op for compatibility reasons. > > > (st_wait_async): New function. > > > * io/unit.c (insert_unit): Use macros LOCK, UNLOCK and TRYLOCK; > > > add NOTE where necessary. > > > (get_gfc_unit): Likewise. > > > (init_units): Likewise. > > > (close_unit_1):
Re: [2/5] C-SKY port: Backend implementation
> On Aug 3, 2018, at 06:27, Jeff Law wrote: > > On 07/26/2018 12:06 AM, 瞿仙淼 wrote: >> >>> On Jul 25, 2018, at 5:24 AM, Jeff Law wrote: >>> >>> On 07/24/2018 12:18 PM, Sandra Loosemore wrote: On 07/24/2018 09:45 AM, Jeff Law wrote: > On 07/23/2018 10:21 PM, Sandra Loosemore wrote: > I'm not a big fan of more awk code, but I'm not going to object to it :-) > > Why does the port have its own little pass for condition code > optimization (cse_cc)? What is it doing that can't be done with our > generic optimizers? This pass was included in the initial patch set we got from C-SKY, and as it didn't seem to break anything I left it in. Perhaps C-SKY can provide a testcase that demonstrates why it's still useful in the current version of GCC; otherwise we can remove this from the initial port submission and restore it later if some performance analysis shows it is still worthwhile. >>> FWIW it looks like we model CC setting on just a few insns, (add, >>> subtract) so I'd be surprised if this little mini pass found much. I'd >>> definitely like to hear from the csky authors here. >>> >>> Alternately, you could do add some instrumentation to flag when it >>> triggers, take a test or two that does, reduce it and we can then look >>> at the key RTL sequences and see what the pass is really doing. >>> >> >> I wrote a case to reproduce this problem on C-SKY. C code is as follows: >> --- >> int e1, e2; >> >> void func (int a, int b, int c, int d, int f, int g) >> { >> e1 = a > b ? f : g; >> e2 = a > b ? c : d; >> >> return; >> } >> --- >> >> compile to assembler with option “-O3 -S” : >> --- >> func: >> cmplt a1, a0 >> ld.w t1, (sp, 0) >> ld.w t0, (sp, 4) >> movt t0, t1 >> cmplt a1, a0 >> movt a3, a2 >> lrw a1, e2 >> lrw a2, e1 >> st.w a3, (a1, 0) >> st.w t0, (a2, 0) >> rts >> --- >> There is an extra “cmplt a1, a0” in the above code without cse_cc. This >> situation mainly occurs when a relatively short branch jump is converted >> into a conditional execution instruction.
And the CSE pass can not reduce >> the same conditional comparison instruction . Here is the rtx sequence after >> “cse2” pass. >> >> --- >> (insn 28 13 29 2 (set (reg:CC 33 c) >>(gt:CC (reg/v:SI 77 [ a ]) >>(reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi} >> (nil)) >> (insn 29 28 30 2 (set (reg/v:SI 82 [ g ]) >>(if_then_else:SI (eq (reg:CC 33 c) >>(const_int 0 [0])) >>(reg/v:SI 82 [ g ]) >>(reg/v:SI 81 [ f ]))) func.c:5 983 {movf} >> (expr_list:REG_DEAD (reg/v:SI 81 [ f ]) >>(expr_list:REG_DEAD (reg:CC 33 c) >>(nil >> (insn 30 29 31 2 (set (reg:CC 33 c) >>(gt:CC (reg/v:SI 77 [ a ]) >>(reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi} >> (expr_list:REG_DEAD (reg/v:SI 78 [ b ]) >>(expr_list:REG_DEAD (reg/v:SI 77 [ a ]) >>(nil >> (insn 31 30 18 2 (set (reg/v:SI 80 [ d ]) >>(if_then_else:SI (eq (reg:CC 33 c) >>(const_int 0 [0])) >>(reg/v:SI 80 [ d ]) >>(reg/v:SI 79 [ c ]))) func.c:5 983 {movf} >> (expr_list:REG_DEAD (reg/v:SI 79 [ c ]) >>(expr_list:REG_DEAD (reg:CC 33 c) >>(nil >> --- >> >> It doesn't seem to check the same conditional comparison instruction .So I >> wrote this to solve this problem, but I am not sure if this is the best way >> : ) >> >> PS, the same conditional comparison instruction cannot be reduced with the >> latest version gcc with C-SKY because I just insert the “cse_cc” after >> “cse1”, when I insert after “cse2”, this problem can be solved very well. > I think the cse_cc pass is really just working around one or more bugs > in CSE and/or a backend bug. The RTL above clearly shows a common > subexpression that is not eliminated by CSE. > > What CSE should be trying to do is changing the second and third > occurrences of (gt:CC (reg 77) (reg 78)) with (reg 33) which would > create nop-sets which would be subsequently deleted. I suspect you do > not have an insn which matches that nop set of the CC register. If you > fix that I suspect CSE will work better and eliminate the need for your > cse_cc pass. 
Thank you for your suggestions; we will try this method. > > jeff
Re: [PATCH] Make strlen range computations more conservative
On Fri, Aug 03, 2018 at 01:19:14AM -0600, Jeff Law wrote: > I'm leaning towards a similar conclusion, namely that we can only rely > on type information for the pointer that actually gets passed to strlen, > which 99.9% of the time is (char *), potentially with const qualifiers. You can't derive anything from the pointer type of the strlen argument, because pointer conversions are useless in the middle-end, so if there was some conversion at some point, it might be gone, or you might get there a completely different pointer type of something that happened to have the same value. That is why the information, if it matters, needs to be stored elsewhere, on the memory access (MEM_REF has such info, TARGET_MEM_REF too, handled_component_p do too). Jakub
Re: [PATCH] Make strlen range computations more conservative
On 08/01/2018 09:13 PM, Martin Sebor wrote: > On 08/01/2018 01:19 AM, Richard Biener wrote: >> On Tue, 31 Jul 2018, Martin Sebor wrote: >> >>> On 07/31/2018 09:48 AM, Jakub Jelinek wrote: On Tue, Jul 31, 2018 at 09:17:52AM -0600, Martin Sebor wrote: > On 07/31/2018 12:38 AM, Jakub Jelinek wrote: >> On Mon, Jul 30, 2018 at 09:45:49PM -0600, Martin Sebor wrote: >>> Even without _FORTIFY_SOURCE GCC diagnoses (some) writes past >>> the end of subobjects by string functions. With _FORTIFY_SOURCE=2 >>> it calls abort. This is the default on popular distributions, >> >> Note that _FORTIFY_SOURCE=2 is the mode that goes beyond what the >> standard >> requires, imposes extra requirements. So from what this mode >> accepts or >> rejects we shouldn't determine what is or isn't considered valid. > > I'm not sure what the additional requirements are but the ones > I am referring to are the enforcing of struct member boundaries. > This is in line with the standard requirements of not accessing > [sub]objects via pointers derived from other [sub]objects. In the middle-end the distinction between what was originally a reference to subobjects and what was a reference to objects is quickly lost (whether through SCCVN or other optimizations). We've run into this many times with the __builtin_object_size already. So, if e.g. struct S { char a[3]; char b[5]; } s = { "abc", "defg" }; ... strlen ((char *) &s) is well defined but strlen (s.a) is not in C, for the middle-end you might not figure out which one is which. >>> >>> Yes, I'm aware of the middle-end transformation to MEM_REF >>> -- it's one of the reasons why detecting invalid accesses >>> by the middle end warnings, including -Warray-bounds, >>> -Wformat-overflow, -Wsprintf-overflow, and even -Wrestrict, >>> is less than perfect. >>> >>> But is strlen(s.a) also meant to be well-defined in the middle >>> end (with the semantics of computing the length of "abcdefg"?) >> >> Yes. >> >>> And if so, what makes it well defined?
>> The fact that strlen takes a char * argument and thus inline-expansion >> of a trivial implementation like >> >> int len = 0; >> for (; *p; ++p) >> ++len; >> >> will have >> >> p = &s.a[0] >> >> and the middle-end doesn't reconstruct s.a[..] from the pointer >> access. >> >>> >>> Certainly not every "strlen" has these semantics. For example, >>> this open-coded one doesn't: >>> >>> int len = 0; >>> for (int i = 0; s.a[i]; ++i) >>> ++len; >>> >>> It computes 2 (with no warning for the out-of-bounds access). >> >> Yes. > > If that's not a problem then why is it one when strlen() does > the same thing? Presumably the answer is: "because here > the access is via array indexing and in strlen via pointer > dereferences." (But in C there is no difference between > the two. Also see below.) But the semantics within GCC aren't necessarily C, they are GIMPLE. While they are usually very similar, they can differ. That also means that if array indexing gets turned into pointer dereferences we lose semantic information. > I have seen and I think shown in this discussion examples > where this is not so. For instance: > > struct S { char a[1], b[1]; }; > > void f (struct S *s, int i) > { > char *p = &s->a[i]; > char *q = &s->b[0]; > > char x = *p; > *q = 11; > > if (x != *p) // folded to false > __builtin_abort (); // eliminated > } > > Is this a bug? (I hope not.) What I keep coming back to is what is the type of the object we pass to strlen in GIMPLE. If it is a char array, then we can use the array bounds to construct bounds for the result. If it is a char *, then we can not. This example is really looking at the aliasing model (which is another place where the semantics may not match precisely). But I don't think you can equate what happens in the aliasing model of GIMPLE with the strlen case. They're apples and oranges. > 
(This, by the way, is what the C memory model >>> group is trying to address -- the language or feature that's >>> missing from the standard that says when, if ever, these things >>> might be valid.) >> >> Well, you simply have to not compare apples and oranges, >> a strlen implementation that isn't a strlen implementation >> and strlen. > > As I'm sure you know, the C standard doesn't differentiate > between the semantics of array subscript expressions and > pointer dereferencing. They both mean the same thing. > (Nothing prevents an implementation from defining strlen > as a macro that expands into a loop using array indices > for array arguments.) But this is an area were the C standard and GIMPLE differ. It's easy to miss (as I did in the initial review). But that's the way it is. Jeff
Re: [PATCH] Make strlen range computations more conservative
On Thu, Aug 02, 2018 at 09:59:13PM -0600, Martin Sebor wrote:
> > If I call this with foo (2, 1), do you still claim it is not valid C?
>
> String functions like strlen operate on character strings stored
> in character arrays.  Calling strlen ([1]) is invalid because
> [1] is not the address of a character array.  The fact that
> objects can be represented as arrays of bytes doesn't change
> that.  The standard may be somewhat loose with words on this
> distinction but the intent certainly isn't for strlen to traverse
> arbitrary sequences of bytes that cross subobject boundaries.
> (That is the intent behind the raw memory functions, but
> the current text doesn't make the distinction clear.)

But the standard doesn't say that right now.  Plus, at least from the
middle-end POV, there is also the case of placement new and stores
changing the dynamic type of the object: previously say a struct with
two fields, then a placement new with a single char array over it (the
placement new will not survive in the middle-end, so it will be just a
memcpy or strcpy or some other byte copy over the original object).
And because CSE/SCCVN etc. treat pointer-to-pointer conversions as
useless in the middle-end, you can see a pointer to the struct with two
fields rather than a pointer to the char array.  Consider e.g.

typedef __typeof__ (sizeof 0) size_t;
void *operator new (size_t, void *p) { return p; }
void *operator new[] (size_t, void *p) { return p; }

struct S { char a; char b[64]; };

void baz (char *);

size_t
foo (S *p)
{
  baz (&p->a);
  char *q = new (p) char[16];
  baz (q);
  return __builtin_strlen (q);
}

I don't think it is correct to say that strlen must be 0.  In this
testcase the pointer passed to strlen is still S *, though I think with
enough tweaking you could also have something where the argument is
&p->a.

I have no problem for strlen to return 0 if it sees a toplevel object
of size 1, but note that if it is extern, it already might be a problem
in some cases:

struct T { char a; char a2[]; } b;
extern struct T c;

void foo (int *p)
{
  p[0] = strlen (b);
  p[1] = strlen (c);
}

If c's definition is

struct T c = { ' ', "abcde" };

then the object doesn't have length of 1.  But for subobjects, and
especially for trying to derive anything from what kind of pointer you
are called with, it just doesn't work, unless you do it in the FE only.

	Jakub
Re: [PATCH] Make strlen range computations more conservative
On 08/01/2018 01:19 AM, Richard Biener wrote:
> On Tue, 31 Jul 2018, Martin Sebor wrote:
>
>> On 07/31/2018 09:48 AM, Jakub Jelinek wrote:
>>> On Tue, Jul 31, 2018 at 09:17:52AM -0600, Martin Sebor wrote:
>>>> On 07/31/2018 12:38 AM, Jakub Jelinek wrote:
>>>>> On Mon, Jul 30, 2018 at 09:45:49PM -0600, Martin Sebor wrote:
>>>>>> Even without _FORTIFY_SOURCE GCC diagnoses (some) writes past
>>>>>> the end of subobjects by string functions.  With _FORTIFY_SOURCE=2
>>>>>> it calls abort.  This is the default on popular distributions,
>>>>>
>>>>> Note that _FORTIFY_SOURCE=2 is the mode that goes beyond what the
>>>>> standard requires, imposes extra requirements.  So from what this
>>>>> mode accepts or rejects we shouldn't determine what is or isn't
>>>>> considered valid.
>>>>
>>>> I'm not sure what the additional requirements are but the ones I am
>>>> referring to are the enforcing of struct member boundaries.  This is
>>>> in line with the standard requirements of not accessing [sub]objects
>>>> via pointers derived from other [sub]objects.
>>>
>>> In the middle-end the distinction between what was originally a
>>> reference to subobjects and what was a reference to objects is
>>> quickly lost (whether through SCCVN or other optimizations).
>>> We've run into this many times with the __builtin_object_size
>>> already.  So, if e.g.
>>>
>>>   struct S { char a[3]; char b[5]; } s = { "abc", "defg" };
>>>   ...
>>>
>>> strlen ((char *) &s) is well defined but strlen (s.a) is not in C;
>>> for the middle-end you might not figure out which one is which.
>>
>> Yes, I'm aware of the middle-end transformation to MEM_REF
>> -- it's one of the reasons why detecting invalid accesses
>> by the middle-end warnings, including -Warray-bounds,
>> -Wformat-overflow, -Wsprintf-overflow, and even -Wrestrict,
>> is less than perfect.
>>
>> But is strlen(s.a) also meant to be well-defined in the middle
>> end (with the semantics of computing the length of "abcdefg")?
>
> Yes.
>
>> And if so, what makes it well defined?
>
> The fact that strlen takes a char * argument and thus inline-expansion
> of a trivial implementation like
[ ... ]
And ISTM again the key here is the type of the object that actually
gets passed to strlen at the gimple level.

If it's a char *, then the type does not constrain the return value in
any way, shape or form.

Jeff
Re: [PATCH] Make strlen range computations more conservative
On 07/25/2018 01:36 PM, Martin Sebor wrote:
>> BUT - for the string_constant and c_strlen functions we are,
>> in all cases we return something interesting, able to look
>> at an initializer which then determines that type.  Hopefully.
>> I think the strlen() folding code when it sets SSA ranges
>> now looks at types ...?
>>
>> Consider
>>
>>   struct X { int i; char c[4]; int j; };
>>   struct Y { char c[16]; };
>>
>>   void foo (struct X *p, struct Y *q)
>>   {
>>     memcpy (p, q, sizeof (struct Y));
>>     if (strlen ((char *)(struct Y *)p + 4) < 7)
>>       abort ();
>>   }
>>
>> here the GIMPLE IL looks like
>>
>>   const char * _1;
>>
>>   <bb 2> [local count: 1073741825]:
>>   _5 = MEM[(char * {ref-all})q_4(D)];
>>   MEM[(char * {ref-all})p_6(D)] = _5;
>>   _1 = p_6(D) + 4;
>>   _2 = __builtin_strlen (_1);
>>
>> and I guess Martin would argue that since p is of type struct X,
>> + 4 gets you to c[4] and thus strlen of that cannot be larger
>> than 3.  But of course the middle-end doesn't work like that
>> and luckily we do not try to draw such conclusions, or we
>> are somehow lucky that for the testcase as written above we do not
>> (I'm not sure whether Martin's changes in this area would derive
>> such conclusions in principle).
>
> Only if the strlen argument were p->c.
Right.  In that case the argument passed to strlen is of type char[4]
and we can derive range info from that.  In the testcase as posted, the
type is char * and we can't derive anything from that.

>
>> NOTE - we do not know the dynamic type here since we do not know
>> the dynamic type of the memory pointed-to by q!  We can only
>> derive that at q+4 there must be some object that we can
>> validly call strlen on (where Martin again thinks strlen
>> imposes constraints that memchr does not - sth I do not agree
>> with from a QOI perspective)
>
> The dynamic type is a murky area.  As you said, above we don't
> know whether *p is an allocated object or not.  Strictly speaking,
> we would need to treat it as such.  It would basically mean
> throwing out all type information and treating objects simply
> as blobs of bytes.  But that's not what GCC or other compilers do
> either.  For instance, in the modified foo below, GCC eliminates
> the test because it assumes that *p and *q don't overlap.  It
> does that because they are members of structs of unrelated types,
> access to which cannot alias.  I.e., not just the type of
> the access matters (here int and char) but so does the type of
> the enclosing object.  If it were otherwise and only the type
> of the access mattered then eliminating the test below wouldn't
> be valid (objects can have their stored value accessed by either
> an lvalue of a compatible type or char).
I think we're getting off base here.  The more I think about this
problem, the more it seems to me like the real issue is we can't look
through casts unless doing so allows us to get to an initializer (which
in turn allows us to compute the length as a compile-time constant).

Jeff
Re: [PATCH] Make strlen range computations more conservative
On 07/25/2018 01:23 AM, Richard Biener wrote:
> On Tue, 24 Jul 2018, Bernd Edlinger wrote:
>
>> On 07/24/18 23:46, Jeff Law wrote:
>>> On 07/24/2018 01:59 AM, Bernd Edlinger wrote:
>>>> Hi!
>>>>
>>>> This patch makes strlen range computations more conservative.
>>>>
>>>> Firstly if there is a visible type cast from type A to B before
>>>> passing the value to strlen, don't expect the type layout of B to
>>>> restrict the possible return value range of strlen.
>>> Why do you think this is the right thing to do?  ie, is there
>>> language in the standards that makes you think the code as it stands
>>> today is incorrect from a conformance standpoint?  Is there a
>>> significant body of code that is affected in an adverse way by the
>>> current code?  If so, what code?
>>
>> I think if you have an object of an effective type A, say char[100],
>> then you can cast the address of A to B, say typedef char (*B)[2] for
>> instance, and then to const char *, say for use in strlen.  I may be
>> wrong, but I think that we should at least try not to pick up char[2]
>> from B, but instead use A for strlen ranges, or leave this range
>> open.  Currently the range info for strlen is [0..1] in this case,
>> even if we see the type cast in the generic tree.
>
> You raise a valid point - namely that the middle-end allows
> any object (including storage with a declared type) to change
> its dynamic type (even of a piece of it).  So unless you can
> prove that the dynamic type of the thing you are looking at
> matches your idea of that type you may not derive any string
> lengths (or ranges) from it.
>
> BUT - for the string_constant and c_strlen functions we are,
> in all cases we return something interesting, able to look
> at an initializer which then determines that type.  Hopefully.
> I think the strlen() folding code when it sets SSA ranges
> now looks at types ...?
I'm leaning towards a similar conclusion, namely that we can only rely
on type information for the pointer that actually gets passed to
strlen, which 99.9% of the time is (char *), potentially with const
qualifiers.

It's tempting to look back through the cast to find a cast from a char
array, but I'm more and more concerned that it's not safe unless we can
walk back to an initializer.

What this might argue is that we need to distinguish between a known
range and a likely range.  I really dislike doing that again.  We may
have to see more real-world cases where the likely range allows us to
improve the precision of the sprintf warnings (since that's really the
goal of improved string length ranges).

> Consider
>
>   struct X { int i; char c[4]; int j; };
>   struct Y { char c[16]; };
>
>   void foo (struct X *p, struct Y *q)
>   {
>     memcpy (p, q, sizeof (struct Y));
>     if (strlen ((char *)(struct Y *)p + 4) < 7)
>       abort ();
>   }
>
> here the GIMPLE IL looks like
>
>   const char * _1;
>
>   <bb 2> [local count: 1073741825]:
>   _5 = MEM[(char * {ref-all})q_4(D)];
>   MEM[(char * {ref-all})p_6(D)] = _5;
>   _1 = p_6(D) + 4;
>   _2 = __builtin_strlen (_1);
>
> and I guess Martin would argue that since p is of type struct X,
> + 4 gets you to c[4] and thus strlen of that cannot be larger
> than 3.
But _1 is of type const char * and that's what's passed to strlen.  The
types of p and q are irrelevant, ISTM.

Jeff
Re: [PATCH] Make strlen range computations more conservative
On 07/24/2018 05:18 PM, Bernd Edlinger wrote:
> On 07/24/18 23:46, Jeff Law wrote:
>> On 07/24/2018 01:59 AM, Bernd Edlinger wrote:
>>> Hi!
>>>
>>> This patch makes strlen range computations more conservative.
>>>
>>> Firstly if there is a visible type cast from type A to B before
>>> passing the value to strlen, don't expect the type layout of B to
>>> restrict the possible return value range of strlen.
>> Why do you think this is the right thing to do?  ie, is there
>> language in the standards that makes you think the code as it stands
>> today is incorrect from a conformance standpoint?  Is there a
>> significant body of code that is affected in an adverse way by the
>> current code?  If so, what code?
>
> I think if you have an object of an effective type A, say char[100],
> then you can cast the address of A to B, say typedef char (*B)[2] for
> instance, and then to const char *, say for use in strlen.  I may be
> wrong, but I think that we should at least try not to pick up char[2]
> from B, but instead use A for strlen ranges, or leave this range open.
> Currently the range info for strlen is [0..1] in this case, even if we
> see the type cast in the generic tree.
ISTM that you're essentially saying that the cast to const char *
destroys any type information we can exploit here.  But if that's the
case, then I don't think we can even derive a range of [0,99].  What's
to say that "A" didn't result from a similar cast of some object that
was char[200] that happened out of the scope of what we could see
during the strlen range computation?

If that is what you're arguing, then I think there's a re-evaluation
that needs to happen WRT strlen range computation.

And just to be clear, I do see this as a significant correctness
question.

Martin, thoughts?

Jeff