Re: PR83648
On Tue, 9 Jan 2018, Prathamesh Kulkarni wrote: As Jakub pointed out for the case: void *f() { return __builtin_malloc (0); } The malloc propagation would set f() to malloc. However AFAIU, malloc(0) returns NULL (?) and the function shouldn't be marked as malloc ? Why not? Even for void*f(){return 0;} are any of the properties of the malloc attribute violated? It seems to me that if you reject malloc(0), then even the standard malloc function should be rejected as well... -- Marc Glisse
Re: std::vector default default and move constructors
On Thu, 11 Jan 2018, Marc Glisse wrote: On Thu, 11 Jan 2018, François Dumont wrote: - void _M_swap_data(_Vector_impl& __x) _GLIBCXX_NOEXCEPT + void + _M_swap_data(_Vector_impl_data& __x) _GLIBCXX_NOEXCEPT { std::swap(_M_start, __x._M_start); std::swap(_M_finish, __x._M_finish); std::swap(_M_end_of_storage, __x._M_end_of_storage); } I don't remember earlier discussions about this patch, but is this piece of code still needed? std::swap(*this, __x) looks like it should work. Ah, no, there is a non-trivial move constructor, so forget it. I'll still need to rewrite that function later :-( -- Marc Glisse
Re: std::vector default default and move constructors
On Thu, 11 Jan 2018, François Dumont wrote: - void _M_swap_data(_Vector_impl& __x) _GLIBCXX_NOEXCEPT + void + _M_swap_data(_Vector_impl_data& __x) _GLIBCXX_NOEXCEPT { std::swap(_M_start, __x._M_start); std::swap(_M_finish, __x._M_finish); std::swap(_M_end_of_storage, __x._M_end_of_storage); } I don't remember earlier discussions about this patch, but is this piece of code still needed? std::swap(*this, __x) looks like it should work. -- Marc Glisse
Re: std::vector default default and move constructors
Hi Here is an updated patch. Tested under Linux x86_64. Ok to commit ? François On 21/08/2017 21:15, François Dumont wrote: Following feedback on std::list patch this one had the same problem of unused code being deleted. So here is a new version. Ok to commit ? François On 28/07/2017 18:45, François Dumont wrote: Hi There was a little issue in this patch, here is the correct version. François On 23/07/2017 19:41, François Dumont wrote: Hi Is it time now to consider this patch ? * include/bits/stl_vector.h (_Vector_impl_data): New. (_Vector_impl): Inherit from latter. (_Vertor_impl(_Vector_impl&&, _Tp_alloc_type&&)): New. (_Vector_base(_Vector_base&&, const allocator_type&)): Use latter. (_Vector_base()): Default. (_Vector_base(size_t)): Delete. (_Vector_base(_Tp_alloc_type&&)): Delete. (_Vector_base(_Vector_base&&)): Default. (vector()): Default. (vector(vector&&, const allocator_type&, true_type)): New. (vector(vector&&, const allocator_type&, false_type)): New. (vector(vector&&, const allocator_type&)): Use latters. Tested under linux x86_64. François diff --git a/libstdc++-v3/include/bits/stl_vector.h b/libstdc++-v3/include/bits/stl_vector.h index acec501..3dfb4a6 100644 --- a/libstdc++-v3/include/bits/stl_vector.h +++ b/libstdc++-v3/include/bits/stl_vector.h @@ -85,34 +85,54 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER typedef typename __gnu_cxx::__alloc_traits<_Tp_alloc_type>::pointer pointer; - struct _Vector_impl - : public _Tp_alloc_type + struct _Vector_impl_data { pointer _M_start; pointer _M_finish; pointer _M_end_of_storage; - _Vector_impl() - : _Tp_alloc_type(), _M_start(), _M_finish(), _M_end_of_storage() - { } - - _Vector_impl(_Tp_alloc_type const& __a) _GLIBCXX_NOEXCEPT - : _Tp_alloc_type(__a), _M_start(), _M_finish(), _M_end_of_storage() + _Vector_impl_data() _GLIBCXX_NOEXCEPT + : _M_start(), _M_finish(), _M_end_of_storage() { } #if __cplusplus >= 201103L - _Vector_impl(_Tp_alloc_type&& __a) noexcept - : _Tp_alloc_type(std::move(__a)), - _M_start(), _M_finish(), _M_end_of_storage() - { } + _Vector_impl_data(_Vector_impl_data&& __x) noexcept + : _M_start(__x._M_start), _M_finish(__x._M_finish), + _M_end_of_storage(__x._M_end_of_storage) + { __x._M_start = __x._M_finish = __x._M_end_of_storage = pointer(); } #endif - void _M_swap_data(_Vector_impl& __x) _GLIBCXX_NOEXCEPT + void + _M_swap_data(_Vector_impl_data& __x) _GLIBCXX_NOEXCEPT { std::swap(_M_start, __x._M_start); std::swap(_M_finish, __x._M_finish); std::swap(_M_end_of_storage, __x._M_end_of_storage); } + }; + + struct _Vector_impl + : public _Tp_alloc_type, public _Vector_impl_data + { + _Vector_impl() _GLIBCXX_NOEXCEPT_IF( noexcept(_Tp_alloc_type()) ) + : _Tp_alloc_type() + { } + + _Vector_impl(_Tp_alloc_type const& __a) _GLIBCXX_NOEXCEPT + : _Tp_alloc_type(__a) + { } + +#if __cplusplus >= 201103L + _Vector_impl(_Vector_impl&&) = default; + + _Vector_impl(_Tp_alloc_type&& __a) noexcept + : _Tp_alloc_type(std::move(__a)) + { } + + _Vector_impl(_Tp_alloc_type&& __a, _Vector_impl&& __rv) noexcept + : _Tp_alloc_type(std::move(__a)), _Vector_impl_data(std::move(__rv)) + { } +#endif #if _GLIBCXX_SANITIZE_STD_ALLOCATOR && _GLIBCXX_SANITIZE_VECTOR template @@ -235,38 +255,42 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER _Tp_alloc_type& _M_get_Tp_allocator() _GLIBCXX_NOEXCEPT - { return *static_cast<_Tp_alloc_type*>(&this->_M_impl); } + { return this->_M_impl; } const _Tp_alloc_type& _M_get_Tp_allocator() const _GLIBCXX_NOEXCEPT - { return *static_cast(&this->_M_impl); } + { return this->_M_impl; } allocator_type get_allocator() const _GLIBCXX_NOEXCEPT { return allocator_type(_M_get_Tp_allocator()); } - _Vector_base() - : _M_impl() { } +#if __cplusplus >= 201103L + _Vector_base() = default; +#else + _Vector_base() { } +#endif _Vector_base(const allocator_type& __a) _GLIBCXX_NOEXCEPT : _M_impl(__a) { } +#if !_GLIBCXX_INLINE_VERSION _Vector_base(size_t __n) : _M_impl() { _M_create_storage(__n); } +#endif _Vector_base(size_t __n, const allocator_type& __a) : _M_impl(__a) { _M_create_storage(__n); } #if __cplusplus >= 201103L + _Vector_base(_Vector_base&&) = default; + +# if !_GLIBCXX_INLINE_VERSION _Vector_base(_Tp_alloc_type&& __a) noexcept : _M_impl(std::move(__a)) { } - _Vector_base(_Vector_base&& __x) noexcept - : _M_impl(std::move(__x._M_get_Tp_allocator())) - { this->_M_impl._M_swap_data(__x._M_impl); } - _Vector_base(_Vector_base&& __x, const allocator_type& __a) : _M_impl(__a) { @@ -278,6 +302,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER _M_create_storage(__n); } } +# endif + + _Vector_base(const allocator_type& __a, _Vec
Re: PR83648
On 11 January 2018 at 10:34, Prathamesh Kulkarni wrote: > On 11 January 2018 at 04:50, Jeff Law wrote: >> On 01/09/2018 05:57 AM, Prathamesh Kulkarni wrote: >>> >>> As Jakub pointed out for the case: >>> void *f() >>> { >>> return __builtin_malloc (0); >>> } >>> >>> The malloc propagation would set f() to malloc. >>> However AFAIU, malloc(0) returns NULL (?) and the function shouldn't >>> be marked as malloc ? >> This seems like a pretty significant concern. Given: >> >> >> return n ? 0 : __builtin_malloc (n); >> >> Is the function malloc-like enough to allow it to be marked? >> >> If not, then ISTM we have to be very conservative in what we mark. >> >> foo (n, m) >> { >> return n ? 0 : __builtin_malloc (m); >> } >> >> Is that malloc-like enough to mark? > Not sure. Should I make it more conservative by marking it as malloc > only if the argument to __builtin_malloc > is constant or it's value-range is known not to include 0? And > similarly for __builtin_calloc ? But I suppose this constraint will make malloc propagation almost useless :( > > Thanks, > Prathamesh >> Jeff
Re: [PATCH improve early strlen range folding (PR 83671)
On 01/10/2018 06:30 PM, H.J. Lu wrote: On Sat, Jan 6, 2018 at 2:04 PM, Martin Sebor wrote: Bug 83671 - Fix for false positive reported by -Wstringop-overflow does not work at -O1, points out that the string length range optimization implemented as a solution for bug 83373 doesn't help at -O1. The root cause is that the fix was added to the strlen pass that doesn't run at -O1. The string length range computation doesn't depend on the strlen pass, and so the range can be set earlier, in gimple-fold, and its results made available even at -O1. The attached patch changes the gimple_fold_builtin_strlen() function to do that. While testing the change I came across a number of other simple strlen cases that currently aren't handled, some at -O1, others at all. I added code to handle some of the simplest of them and opened bugs to remind us/myself to get back to the rest in the future (pr83693 and pr83702). The significant enhancement is handling arrays of arrays with non-constant indices and pointers to such things, such as in: char a[2][7]; void f (int i) { if (strlen (a[i]) > 6) // eliminated with the patch abort (); } Attached is a near-minimal patch to handle PR 83671. This may have caused: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83781. Yes, it did. I committed r256477 to fix the problem. With it, plain x86_64 bootstrap as well as with --with-arch=corei7 --with-cpu=corei7 succeed. Sorry for the breakage. Martin
Re: PR83648
On 11 January 2018 at 04:50, Jeff Law wrote: > On 01/09/2018 05:57 AM, Prathamesh Kulkarni wrote: >> >> As Jakub pointed out for the case: >> void *f() >> { >> return __builtin_malloc (0); >> } >> >> The malloc propagation would set f() to malloc. >> However AFAIU, malloc(0) returns NULL (?) and the function shouldn't >> be marked as malloc ? > This seems like a pretty significant concern. Given: > > > return n ? 0 : __builtin_malloc (n); > > Is the function malloc-like enough to allow it to be marked? > > If not, then ISTM we have to be very conservative in what we mark. > > foo (n, m) > { > return n ? 0 : __builtin_malloc (m); > } > > Is that malloc-like enough to mark? Not sure. Should I make it more conservative by marking it as malloc only if the argument to __builtin_malloc is constant or it's value-range is known not to include 0? And similarly for __builtin_calloc ? Thanks, Prathamesh > Jeff
libgo patch committed: Fix handling of DW_FORM_strp for 64-bit DWARF
This libgo patch fixes the handling of DW_FORM_strp when using 64-bit DWARF. This is an early backport of https://golang.org/cl/84379, which will be in Go 1.11. Backporting now for AIX support in gccgo. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 256450) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -19d94969c5202c07b3b166079b9f4ebbb52dfa6b +1176dd2b53f2d2b826b599a126f3f9828283cec3 The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: libgo/go/debug/dwarf/entry.go === --- libgo/go/debug/dwarf/entry.go (revision 256366) +++ libgo/go/debug/dwarf/entry.go (working copy) @@ -461,7 +461,18 @@ func (b *buf) entry(atab abbrevTable, ub case formString: val = b.string() case formStrp: - off := b.uint32() // offset into .debug_str + var off uint64 // offset into .debug_str + is64, known := b.format.dwarf64() + if !known { + b.error("unknown size for DW_FORM_strp") + } else if is64 { + off = b.uint64() + } else { + off = uint64(b.uint32()) + } + if uint64(int(off)) != off { + b.error("DW_FORM_strp offset out of range") + } if b.err != nil { return nil }
Re: [PATCH improve early strlen range folding (PR 83671)
On Sat, Jan 6, 2018 at 2:04 PM, Martin Sebor wrote: > Bug 83671 - Fix for false positive reported by -Wstringop-overflow > does not work at -O1, points out that the string length range > optimization implemented as a solution for bug 83373 doesn't help > at -O1. The root cause is that the fix was added to the strlen > pass that doesn't run at -O1. > > The string length range computation doesn't depend on the strlen > pass, and so the range can be set earlier, in gimple-fold, and > its results made available even at -O1. The attached patch > changes the gimple_fold_builtin_strlen() function to do that. > > While testing the change I came across a number of other simple > strlen cases that currently aren't handled, some at -O1, others > at all. I added code to handle some of the simplest of them > and opened bugs to remind us/myself to get back to the rest in > the future (pr83693 and pr83702). The significant enhancement > is handling arrays of arrays with non-constant indices and > pointers to such things, such as in: > > char a[2][7]; > > void f (int i) > { > if (strlen (a[i]) > 6) // eliminated with the patch > abort (); > } > > Attached is a near-minimal patch to handle PR 83671. > This may have caused: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83781. -- H.J.
Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre
On 01/10/2018 06:14 AM, Jakub Jelinek wrote: > On Wed, Jan 10, 2018 at 02:08:48PM +0100, Richard Biener wrote: >> On Wed, Jan 10, 2018 at 11:18 AM, Eric Botcazou >> wrote: It's really just a couple of new primitives to emit a jump as a call and one to slam in a new return address. Given those I think you can do the entire implementation as RTL at expansion time and you've got a damn good shot at protecting most architectures from these kinds of attacks. >>> >>> I think that you're a bit optimistic here and that implementing a generic >>> and >>> robust framework at the RTL level might require some time. Given the time >>> and >>> (back-)portability constraints, it might be wiser to rush into architecture- >>> specific countermeasures than to rush into an half-backed RTL framework. >> >> Let me also say that while it might be nice to commonize code introducing >> these >> mitigations as late as possible to not disrupt optimization is important. >> So I >> don't see a very strong motivation in trying very hard to make this more >> middle-endish, apart from maybe sharing helper functions where possible. > > That and perhaps a common option to handle the cases that are common to > multiple backends (i.e. move some options from -m* namespace to -f*). > I'd say the decision about the options and ABI of what we emit is more > important than where we actually emit it, we can easily change where we do > that over time, but not the options nor the ABI. >From a UI standpoint, I think the decision has already been made as LLVM has already thrown -mretpolines into their tree. Sigh. So I think the one thing we ought to seriously consider is at least reserving -mretpoline for this style of mitigation of spectre v2. ALl target's don't have to implementation this style mitigation, but if they do, they use -mretpoline. Jeff
Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre
On 01/08/2018 07:23 AM, Alan Modra wrote: > On Sun, Jan 07, 2018 at 04:36:20PM -0700, Jeff Law wrote: >> On 01/07/2018 03:58 PM, H.J. Lu wrote: >>> This set of patches for GCC 8 mitigates variant #2 of the speculative >>> execution >>> vulnerabilities on x86 processors identified by CVE-2017-5715, aka Spectre. > [snip] >> My fundamental problem with this patchkit is that it is 100% x86/x86_64 >> specific. > > It's possible that x86 needs spectre variant 2 mitigation that isn't > necessary on other modern processors like ARM and PowerPC, so let's > not rush into general solutions designed around x86.. >From what I know about variant 2 mitigation it's going to be needed on a variety of chip families, not just the Intel architecture. However, I'm seeing signals that other chips vendors are looking towards approaches that don't use retpolines. So even though I think we could build them fairly easy for most targets out of simple primitives, it may not be the best use of our time. > > Here's a quick overview of Spectre. For more, see > https://spectreattack.com/spectre.pdf > https://googleprojectzero.blogspot.com.au/2018/01/reading-privileged-memory-with-side.html > https://developer.arm.com/-/media/Files/pdf/Cache_Speculation_Side-channels.pdf Yup. Already familiar with this stuff :-) > > However, x86 has the additional problem of variable length > instructions. Gadgets might be hiding in code when executed at an > offset from the start of the "real" instructions. Which is why x86 is > more at risk from this attack than other processors, and why x86 needs > something like the posted variant 2 mitigation, slowing down all > indirect branches. > True, but largely beside the point. I'm not aware of anyone serious looking at mating ROP with Spectre at this point, though it is certainly possible. The bad guys don't need to work that hard at this time. Jeff
Re: [PATCH 0/3] Add __builtin_load_no_speculate
On 01/09/2018 08:29 AM, Richard Earnshaw (lists) wrote: > I'm in two minds about that. There are certainly cases where you might > want to use the generic expansion as part of handling a case where you > have a more standard speculation barrier; in those cases you might want > to emit your barrier and then let the generic code expand and provide > the logical behaviour of the builtin. > > However, if you don't have a barrier you need to ask yourself, will my > architecture ever have an implementation that does do speculation? > Unless you can be certain that it won't, you probably need to be warned > that some code the programmer thinks might be vulnerable to spectre has > not been compiled with that protection, otherwise if you run that code > on a later implementation that does speculate, it might be vulnerable. > > Let me give an example, we use the generic code expansion if we > encounter the builtin when generating code for Thumb on pre-ARMv7 > devices. We don't have instructions in 'thumb1' to guard against > speculation and we really want to warn users that they've done this (it > might be a mistake in how they're invoking the compiler). > > I could add an extra parameter to the helper (bool warn_unimplemented), > which would be true if called directly from the expanders, but would now > allow back-ends to implement a trivial variant that just suppressed the > warning. They could also then use the generic expansion if all that was > needed was a subsequent fence instruction. Yea, I see your side as well on this -- advisable or not I suspect we're going to see folks saying "my embedded chip doesn't have these problems, I don't want to pay any of these costs" and I don't want to be warned about a problem I know can't happen (today). Anyway, I think relatively speaking this is minor compared to the stuff we're discussing around the semantics of the builtin. I can live with iterating on this aspect based on feedback. jeff
Re: [C++ Patch] PR 78344 ("ICE on invalid c++ code (tree check: expected tree_list, have error_mark in cp_check_const_attributes, at cp/decl2.c:1347")
Hi again, thus the below is a rather "dull" solution at the level of cplus_decl_attributes itself: cp_check_const_attributes is tweaked to check for error_mark_node at each outer iteration and consistently return a bool, which is then checked by the caller in order to possibly bail out (this is similar to the check_for_bare_parameter_packs call a few lines before). Testing is Ok on x86_64-linux. Thanks, Paolo. /cp 2018-01-11 Paolo Carlini PR c++/78344 * decl2.c (cp_check_const_attributes): Check for error_mark_node at each outer iterations; return a bool. (cplus_decl_attributes): Adjust cp_check_const_attributes call. /testsuite 2018-01-10 Paolo Carlini PR c++/78344 * g++.dg/cpp0x/alignas13.C: New. Index: cp/decl2.c === --- cp/decl2.c (revision 256451) +++ cp/decl2.c (working copy) @@ -1396,19 +1396,17 @@ cp_reconstruct_complex_type (tree type, tree botto } /* Replaces any constexpr expression that may be into the attributes - arguments with their reduced value. */ + arguments with their reduced value. Returns FALSE if encounters + error_mark_node, TRUE otherwise. */ -static void +static bool cp_check_const_attributes (tree attributes) { - if (attributes == error_mark_node) -return; - - tree attr; - for (attr = attributes; attr; attr = TREE_CHAIN (attr)) + for (tree attr = attributes; attr; attr = TREE_CHAIN (attr)) { - tree arg; - for (arg = TREE_VALUE (attr); arg; arg = TREE_CHAIN (arg)) + if (attr == error_mark_node) + return false; + for (tree arg = TREE_VALUE (attr); arg; arg = TREE_CHAIN (arg)) { tree expr = TREE_VALUE (arg); if (EXPR_P (expr)) @@ -1415,6 +1413,7 @@ cp_check_const_attributes (tree attributes) TREE_VALUE (arg) = maybe_constant_value (expr); } } + return true; } /* Return true if TYPE is an OpenMP mappable type. */ @@ -1528,7 +1527,8 @@ cplus_decl_attributes (tree *decl, tree attributes save_template_attributes (&attributes, decl, flags); } - cp_check_const_attributes (attributes); + if (!cp_check_const_attributes (attributes)) +return; if (TREE_CODE (*decl) == TEMPLATE_DECL) decl = &DECL_TEMPLATE_RESULT (*decl); Index: testsuite/g++.dg/cpp0x/alignas13.C === --- testsuite/g++.dg/cpp0x/alignas13.C (nonexistent) +++ testsuite/g++.dg/cpp0x/alignas13.C (working copy) @@ -0,0 +1,5 @@ +// PR c++/78344 +// { dg-do compile { target c++11 } } + +alignas(double) int f alignas; // { dg-error "30:expected '\\('" } +alignas(double) int g alignas(double; // { dg-error "37:expected '\\)'" }
Re: [PATCH 0/3] Add __builtin_load_no_speculate
On 01/09/2018 03:47 AM, Richard Earnshaw (lists) wrote: > On 05/01/18 13:08, Alexander Monakov wrote: >> On Fri, 5 Jan 2018, Richard Earnshaw (lists) wrote: >>> This is quite tricky. For ARM we have to have a speculated load. >> >> Sorry, I don't follow. On ARM, it is surprising that CSEL-CSDB-LDR sequence >> wouldn't work (applying CSEL to the address rather than loaded value), and >> if it wouldn't, then ARM-specific lowering of the builtin can handle that >> anyhow, right? (by spilling the pointer) > > The load has to feed /in/ to the csel/csdb sequence, not come after it. > >> >> (on x86 the current Intel's recommendation is to emit LFENCE prior to the >> load) > > That can be supported in the way you expand the builtin. The builtin > expander is given a (MEM (ptr)) , but it's up to the back-end where to > put that in the expanded sequence to materialize the load, so you could > write (sorry, don't know x86 asm very well, but I think this is how > you'd put it) > > lfence > mov (ptr), dest > > with branches around that as appropriate to support the remainder of the > builtin's behaviour. I think the argument is going to be that they don't want the branches around to support the other test + failval semantics. Essentially the same position as IBM has with PPC. > >> Is the main issue expressing the CSEL condition in the source code? Perhaps >> it is >> possible to introduce >> >> int guard = __builtin_nontransparent(predicate); >> >> if (predicate) >> foo = __builtin_load_no_speculate(&arr[addr], guard); >> >> ... or maybe even >> >> if (predicate) >> foo = arr[__builtin_loadspecbarrier(addr, guard)]; >> >> where internally __builtin_nontransparent is the same as >> >> guard = predicate; >> asm volatile("" : "+g"(guard)); >> >> although admittedly this is not perfect since it forces evaluation of 'guard' >> before the branch. > > As I explained to Bernd last night, I think this is likely be unsafe. > If there's some control path before __builtin_nontransparent that allows > 'predicate' to be simplified (eg by value range propagation), then your > guard doesn't protect against the speculation that you think it does. > Changing all the optimizers to guarantee that wouldn't happen (and > guaranteeing that all future optimizers won't introduce new problems of > that nature) is, I suspect, very non-trivial. Agreed. Whatever PREDICATE happens to be, the compiler is going to go through extreme measures to try and collapse PREDICATE down to a compile-time constant, including splitting paths to the point where PREDICATE is used in the conditional so that on one side it's constant and the other it's non-constant. It seems like this approach is likely to be compromised by the optimizers. Jeff
Re: [PATCH] Add PowerPC configuration option --with-long-double-format={ibm,ieee}
On Wed, 10 Jan 2018, Michael Meissner wrote: > This patch is next in my series of patches to enable us to configure the long > double type on PowerPC systems. This patch is only about the configuration > option. A future patch will contain the multilib support. In general we expect configure options to be documented in install.texi. Is this one being deliberately omitted because it is only actually usable for toolchain development at present, with various ways in which the support for IEEE long double is not yet completely functional and library support is missing - with documentation intended to be added later once fully functional? -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH 1/3] [builtins] Generic support for __builtin_load_no_speculate()
On 01/08/2018 02:03 PM, Bill Schmidt wrote: > > I agree 100% with this approach. I just wanted to raise the point in case > other architectures have different needs. Power can work around this > by just ignoring 4 of the 5 arguments. As long as nobody else needs > *additional* arguments, this should work out just fine. But I want to be > clear > that the only guarantee of the semantics for everybody is that "speculation > stops here," while on some processors it may be "speculation stops here > if out of range." If we can write this into the documentation, then I'm fine > writing a target expander for Power as discussed. My recollection of other micro-architectures and how they handle conditional moves makes me believe that the test and conditional move may be enough to stop the rampant speculation that causes the problems. They're just enough of a fence to provide a level of mitigation. So I see value in providing those arguments for architectures other than ARM/AArch64. What I think we're really trying to nail down is how crisply the semantics of this builtin are, particularly around the need to test and provide a failval. I'm actually going to be in a meeting with another chip vendor tomorrow AM and will make a point to discuss this with them and see what direction they want to go. I suspect they're closer to PPC in terms of the semantics they want, but need to verify. > > I had a brief interchange with Richi last week, and he suggested that for > the automatic detection we might look into flagging MEM_REFs rather > than inserting a built-in; a target hook can still handle such a flag. That > has some advantages and some disadvantages that I can think of, so > we'll have to talk that out on the list over time after we get through the > crisis mode reactions. I think someone could likely spend a huge amount of time in this space. Both with the analysis around finding potentially vulnerable sequences, generating appropriate mitigation code and optimizing that to be as painless as possible. But as you say, this is something we should hash out post-crisis. jeff
[Ada] Fix ICE on Component_Size clause with atomic type
The compiler aborts on the assignment to an array component if the component type is an atomic type and the array is subject to a Component_Size clause that is too large for atomic access on the target. A proper error message must be issued instead. Tested on x86-64/Linux, applied on the mainline. 2018-01-10 Eric Botcazou * gcc-interface/decl.c (gnat_to_gnu_component_type): Apply the check for atomic access once the component size is taken into account and also do it if the component type is Atomic or Volatile_Full_Access. 2018-01-10 Eric Botcazou * gnat.dg/atomic10.adb: New test. -- Eric BotcazouIndex: gcc-interface/decl.c === --- gcc-interface/decl.c (revision 256275) +++ gcc-interface/decl.c (working copy) @@ -5022,9 +5022,6 @@ gnat_to_gnu_component_type (Entity_Id gn && tree_fits_uhwi_p (TYPE_SIZE (gnu_type))) gnu_type = make_packable_type (gnu_type, false, max_align); - if (Has_Atomic_Components (gnat_array)) -check_ok_for_atomic_type (gnu_type, gnat_array, true); - /* Get and validate any specified Component_Size. */ gnu_comp_size = validate_size (Component_Size (gnat_array), gnu_type, gnat_array, @@ -5071,6 +5068,9 @@ gnat_to_gnu_component_type (Entity_Id gn gnat_array); } + if (Has_Atomic_Components (gnat_array) || Is_Atomic_Or_VFA (gnat_type)) +check_ok_for_atomic_type (gnu_type, gnat_array, true); + /* If the component type is a padded type made for a non-bit-packed array of scalars with reverse storage order, we need to propagate the reverse storage order to the padding type since it is the innermost enclosing -- { dg-do compile } -- { dg-options "-gnatws" } with System.Multiprocessors; procedure Atomic10 is type Atomic_Unsigned is mod 2 ** 32; pragma Atomic (Atomic_Unsigned); Max : Positive := Positive (System.Multiprocessors.Number_Of_CPUs); Comp_Size : constant := 64 * 8; subtype Index_Type is Positive range 1 .. Max; type Array_Type is array (Index_Type) of aliased Atomic_Unsigned; -- { dg-error "cannot be guaranteed" } for Array_Type'Component_Size use Comp_Size; Slots : Array_Type; begin for Index in Index_Type loop Slots (Index) := 0; end loop; end;
Re: [PATCH 1/3] [builtins] Generic support for __builtin_load_no_speculate()
On 01/09/2018 10:11 AM, Bill Schmidt wrote: > On Jan 9, 2018, at 4:21 AM, Richard Earnshaw (lists) > wrote: >> >> On 08/01/18 16:01, Bill Schmidt wrote: >>> On Jan 8, 2018, at 8:06 AM, Richard Earnshaw (lists) >>> wrote: On 08/01/18 02:20, Bill Schmidt wrote: > Hi Richard, > > Unfortunately, I don't see any way that this will be useful for the ppc > targets. We don't > have a way to force resolution of a condition prior to continuing > speculation, so this > will just introduce another comparison that we would speculate past. For > our mitigation > we will have to introduce an instruction that halts all speculation at > that point, and place > it in front of all dangerous loads. I wish it were otherwise. So can't you make the builtin expand to (in pseudo code): if (bounds_check) { __asm ("barrier"); result = *ptr; } else result = failval; >>> >>> Could, but this just generates unnecessary code for Power. We would >>> instead generate >>> >>> __asm ("barrier"); >>> result = *ptr; >>> >>> without any checks. We would ignore everything but the first argument. >> >> You can't do that with the builtin as it is currently specified as it >> also has a defined behaviour for the result it returns. You can, >> however, expand the code as normal RTL and let the optimizers remove any >> redundant code if they can make that deduction and you don't need the >> additional behaviour. > > But that's my original point. "As currently specified" is overspecified for > our architecture, and expanding the extra code *hoping* it will go away > is not something I feel we should do. If our hopes are dashed, we end up > with yet worse performance. If we're going to use something generic > for everyone, then I argue that the semantics may not be the same for > all targets, and that this needs to be specified in the documentation. Other than in the case where bounds_check is a compile-time constant I don't see that the compiler is likely to drop the else clause above. So relying on the compiler to optimize away the redundant code doesn't seem like a viable option. However, if we relax the semantics on failval, then I think we get enough freedom to generate the desired code on PPC. > I'm just looking for a solution that works for everyone. Likewise. It'd be unfortunate if we end up with two distinct builtins and the kernel buys have to write some level of abstraction on top of them to select between them based on the target. jeff
[Committed] PR Fortran/82367 -- Check for NULL pointer.
Thomas Koenig approved the patch in the PR. Regression tested on x86_64-*-freebsd 2018-01-10 Steven G. Kargl PR fortran/82367 * resolve.c (resolve_allocate_expr): Check for NULL pointer. 2018-01-10 Steven G. Kargl PR fortran/82367 * gfortran.dg/deferred_character_18.f90: New test. -- Steve Index: gcc/fortran/resolve.c === --- gcc/fortran/resolve.c (revision 256455) +++ gcc/fortran/resolve.c (working copy) @@ -7484,8 +7484,13 @@ resolve_allocate_expr (gfc_expr *e, gfc_code *code, bo if (code->ext.alloc.ts.type == BT_CHARACTER && !e->ts.deferred && !UNLIMITED_POLY (e)) { - int cmp = gfc_dep_compare_expr (e->ts.u.cl->length, - code->ext.alloc.ts.u.cl->length); + int cmp; + + if (!e->ts.u.cl->length) + goto failure; + + cmp = gfc_dep_compare_expr (e->ts.u.cl->length, + code->ext.alloc.ts.u.cl->length); if (cmp == 1 || cmp == -1 || cmp == -3) { gfc_error ("Allocating %s at %L with type-spec requires the same " Index: gcc/testsuite/gfortran.dg/deferred_character_18.f90 === --- gcc/testsuite/gfortran.dg/deferred_character_18.f90 (nonexistent) +++ gcc/testsuite/gfortran.dg/deferred_character_18.f90 (working copy) @@ -0,0 +1,29 @@ +! { dg-do compile } +! PR Fortran/82367 +! Contributed by Walter Spector +module cls_allocmod + implicit none + +contains + + subroutine cls_alloc (n, str) +integer, intent(in) :: n +character(*), allocatable, intent(out) :: str +! Note: Star ^ should have been a colon (:) + +allocate (character(n)::str) + + end subroutine + +end module + +program cls + use cls_allocmod + implicit none + + character(:), allocatable :: s + + call cls_alloc(42, s) ! { dg-error "allocatable or pointer dummy argument" } + print *, 'string len =', len(s) + +end program
Re: [PATCH 1/3] [builtins] Generic support for __builtin_load_no_speculate()
On 01/08/2018 09:01 AM, Bill Schmidt wrote: > On Jan 8, 2018, at 8:06 AM, Richard Earnshaw (lists) > wrote: >> >> On 08/01/18 02:20, Bill Schmidt wrote: >>> Hi Richard, >>> >>> Unfortunately, I don't see any way that this will be useful for the ppc >>> targets. We don't >>> have a way to force resolution of a condition prior to continuing >>> speculation, so this >>> will just introduce another comparison that we would speculate past. For >>> our mitigation >>> we will have to introduce an instruction that halts all speculation at that >>> point, and place >>> it in front of all dangerous loads. I wish it were otherwise. >> >> So can't you make the builtin expand to (in pseudo code): >> >> if (bounds_check) >>{ >> __asm ("barrier"); >> result = *ptr; >> } >>else >>result = failval; > > Could, but this just generates unnecessary code for Power. We would instead > generate > > __asm ("barrier"); > result = *ptr; > > without any checks. We would ignore everything but the first argument. So how bad would it be to drop the whole concept of failval and make the result indeterminate in the out of bounds case. Would that give us enough freedom to generate an appropriate sequence for aarch64 and ppc? It feels like these two architectures are essentially on opposite sides of the spectrum and if we can find a reasonable way to handle them that we'd likely have semantics we can use on just about any architecture. jeff
Re: PR83648
On 01/09/2018 05:57 AM, Prathamesh Kulkarni wrote: > > As Jakub pointed out for the case: > void *f() > { > return __builtin_malloc (0); > } > > The malloc propagation would set f() to malloc. > However AFAIU, malloc(0) returns NULL (?) and the function shouldn't > be marked as malloc ? This seems like a pretty significant concern. Given: return n ? 0 : __builtin_malloc (n); Is the function malloc-like enough to allow it to be marked? If not, then ISTM we have to be very conservative in what we mark. foo (n, m) { return n ? 0 : __builtin_malloc (m); } Is that malloc-like enough to mark? Jeff
Fix couple of -fdump-ada-spec issues on preprocessor macros
The -fdump-ada-spec currently generates invalid Ada for preprocessor macros containing floating-point constants and string concatenations. Tested on x86-64/Linux, applied on the mainline. 2018-01-10 Eric Botcazou c-family/ * c-ada-spec.c (dump_number): Add FLOAT_P parameter. Skip 'f' and 'F' characters if it is true. (store_ada_macro): Minor tweak. (dump_ada_macros) : Likewise. : Likewise. : Output '&' in the buffer if not the first string. : Adjust calls to dump_number. -- Eric BotcazouIndex: c-ada-spec.c === --- c-ada-spec.c (revision 256275) +++ c-ada-spec.c (working copy) @@ -113,14 +113,15 @@ macro_length (const cpp_macro *macro, in } /* Dump all digits/hex chars from NUMBER to BUFFER and return a pointer - to the character after the last character written. */ + to the character after the last character written. If FLOAT_P is true, + this is a floating-point number. */ static unsigned char * -dump_number (unsigned char *number, unsigned char *buffer) +dump_number (unsigned char *number, unsigned char *buffer, bool float_p) { while (*number != '\0' - && *number != 'U' - && *number != 'u' + && *number != (float_p ? 'F' : 'U') + && *number != (float_p ? 'f' : 'u') && *number != 'l' && *number != 'L') *buffer++ = *number++; @@ -192,7 +193,8 @@ store_ada_macro (cpp_reader *pfile ATTRI { const cpp_macro *macro = node->value.macro; - if (node->type == NT_MACRO && !(node->flags & NODE_BUILTIN) + if (node->type == NT_MACRO + && !(node->flags & NODE_BUILTIN) && macro->count && *NODE_NAME (node) != '_' && LOCATION_FILE (macro->line) == macro_source_file) @@ -345,7 +347,8 @@ dump_ada_macros (pretty_printer *pp, con is_one = prev_is_one; break; - case CPP_COMMENT: break; + case CPP_COMMENT: + break; case CPP_WSTRING: case CPP_STRING16: @@ -359,11 +362,18 @@ dump_ada_macros (pretty_printer *pp, con if (!macro->fun_like) supported = 0; else - buffer = cpp_spell_token (parse_in, token, buffer, false); + buffer + = cpp_spell_token (parse_in, token, buffer, false); break; case CPP_STRING: - is_string = 1; + if (is_string) + { + *buffer++ = '&'; + *buffer++ = ' '; + } + else + is_string = 1; { const unsigned char *s = token->val.str.text; @@ -428,7 +438,7 @@ dump_ada_macros (pretty_printer *pp, con *buffer++ = '1'; *buffer++ = '6'; *buffer++ = '#'; -buffer = dump_number (tmp + 2, buffer); +buffer = dump_number (tmp + 2, buffer, false); *buffer++ = '#'; break; @@ -436,19 +446,20 @@ dump_ada_macros (pretty_printer *pp, con case 'B': *buffer++ = '2'; *buffer++ = '#'; -buffer = dump_number (tmp + 2, buffer); +buffer = dump_number (tmp + 2, buffer, false); *buffer++ = '#'; break; default: -/* Dump floating constants unmodified. */ +/* Dump floating-point constant unmodified. */ if (strchr ((const char *)tmp, '.')) - buffer = dump_number (tmp, buffer); + buffer = dump_number (tmp, buffer, true); else { *buffer++ = '8'; *buffer++ = '#'; -buffer = dump_number (tmp + 1, buffer); +buffer + = dump_number (tmp + 1, buffer, false); *buffer++ = '#'; } break; @@ -456,19 +467,23 @@ dump_ada_macros (pretty_printer *pp, con break; case '1': - if (tmp[1] == '\0' || tmp[1] == 'l' || tmp[1] == 'u' - || tmp[1] == 'L' || tmp[1] == 'U') + if (tmp[1] == '\0' + || tmp[1] == 'u' + || tmp[1] == 'U' + || tmp[1] == 'l' + || tmp[1] == 'L') { is_one = 1; char_one = buffer; *buffer++ = '1'; + break; } - else - buffer = dump_number (tmp, buffer); - break; + /* fallthrough */ default: - buffer = dump_number (tmp, buffer); + buffer + = dump_number (tmp, buffer, + strchr ((const char *)tmp, '.')); break; } break;
[PATCH] RISC-V: Add naked function support.
This adds naked function support to the RISC-V port. The basic structure was copied from other ports, so there should be nothing unexpected here. This was tested with a riscv64-linux build and make check. There were no regressions. I also hand checked the info docs to make sure the extend.texi change looked OK. 2018-01-10 Kito Cheng gcc/ * config/riscv/riscv-protos.h (riscv_output_return): New. * config/riscv/riscv.c (struct machine_function): New naked_p field. (riscv_attribute_table, riscv_output_return), (riscv_handle_fndecl_attribute, riscv_naked_function_p), (riscv_allocate_stack_slots_for_args, riscv_warn_func_return): New. (riscv_compute_frame_info): Only compute frame->mask if not a naked function. (riscv_expand_prologue): Add early return for naked function. (riscv_expand_epilogue): Likewise. (riscv_function_ok_for_sibcall): Return false for naked function. (riscv_set_current_function): New. (TARGET_SET_CURRENT_FUNCTION, TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS), (TARGET_ATTRIBUTE_TABLE, TARGET_WARN_FUNC_RETURN): New. * config/riscv/riscv.md (simple_return): Call riscv_output_return. * doc/extend.texi (RISC-V Function Attributes): New. --- gcc/config/riscv/riscv-protos.h | 1 + gcc/config/riscv/riscv.c| 157 +++- gcc/config/riscv/riscv.md | 4 +- gcc/doc/extend.texi | 19 + 4 files changed, 163 insertions(+), 18 deletions(-) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 1cf016d850b..0538ede77e4 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -54,6 +54,7 @@ extern bool riscv_split_64bit_move_p (rtx, rtx); extern void riscv_split_doubleword_move (rtx, rtx); extern const char *riscv_output_move (rtx, rtx); extern const char *riscv_output_gpr_save (unsigned); +extern const char *riscv_output_return (); #ifdef RTX_CODE extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx); extern void riscv_expand_float_scc (rtx, enum rtx_code, rtx, rtx); diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c index b6270f7bfd7..d260c0ebae1 100644 --- a/gcc/config/riscv/riscv.c +++ b/gcc/config/riscv/riscv.c @@ -127,6 +127,9 @@ struct GTY(()) machine_function { This area is allocated by the callee at the very top of the frame. */ int varargs_size; + /* True if current function is a naked function. */ + bool naked_p; + /* The current frame information, calculated by riscv_compute_frame_info. */ struct riscv_frame_info frame; }; @@ -269,6 +272,23 @@ static const struct riscv_tune_info optimize_size_tune_info = { false, /* slow_unaligned_access */ }; +static tree riscv_handle_fndecl_attribute (tree *, tree, tree, int, bool *); + +/* Defining target-specific uses of __attribute__. */ +static const struct attribute_spec riscv_attribute_table[] = +{ + /* Syntax: { name, min_len, max_len, decl_required, type_required, + function_type_required, affects_type_identity, handler, + exclude } */ + + /* The attribute telling no prologue/epilogue. */ + { "naked", 0, 0, true, false, false, false, +riscv_handle_fndecl_attribute, NULL }, + + /* The last attribute spec is set to be NULL. */ + { NULL, 0, 0, false, false, false, false, NULL, NULL } +}; + /* A table describing all the processors GCC knows about. */ static const struct riscv_cpu_info riscv_cpu_info_table[] = { { "rocket", &rocket_tune_info }, @@ -1827,6 +1847,16 @@ riscv_output_move (rtx dest, rtx src) } gcc_unreachable (); } + +const char * +riscv_output_return () +{ + if (cfun->machine->naked_p) +return ""; + + return "ret"; +} + /* Return true if CMP1 is a suitable second operand for integer ordering test CODE. See also the *sCC patterns in riscv.md. */ @@ -2647,6 +2677,50 @@ riscv_setup_incoming_varargs (cumulative_args_t cum, machine_mode mode, cfun->machine->varargs_size = gp_saved * UNITS_PER_WORD; } +/* Handle an attribute requiring a FUNCTION_DECL; + arguments as in struct attribute_spec.handler. */ +static tree +riscv_handle_fndecl_attribute (tree *node, tree name, + tree args ATTRIBUTE_UNUSED, + int flags ATTRIBUTE_UNUSED, bool *no_add_attrs) +{ + if (TREE_CODE (*node) != FUNCTION_DECL) +{ + warning (OPT_Wattributes, "%qE attribute only applies to functions", + name); + *no_add_attrs = true; +} + + return NULL_TREE; +} + +/* Return true if func is a naked function. */ +static bool +riscv_naked_function_p (tree func) +{ + tree func_decl = func; + if (func == NULL_TREE) +func_decl = current_function_decl; + return NULL_TREE != lookup_attribute ("naked", DECL_ATTRIBUTES (func_decl)); +} + +/* Implement T
Re: [PATCH] expandargv: fix check for dynamic allocation of argument vector
On 12/30/2017 12:46 PM, Daniel van Gerpen wrote: > > > When the code interpolates the contents of response files, the > argument vector is reallocated to the new size. This only works > if it was dynamically allocated once before -- we do not want to > mess with the argv memory given to us by the init code. > > The code tried to detect this with a flag, but that was never > written to, leading to multiple dynamic allocations -- one > for each response file. > --- > libiberty/argv.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) THanks. I created a suitable ChangeLog entry and committed your patch to the trunk. Jeff
Re: [PATCH] PR Fortran/83093 -- Check the type of character length
On Wed, Jan 10, 2018 at 08:41:17AM +0200, Janne Blomqvist wrote: > On Wed, Jan 10, 2018 at 3:18 AM, Steve Kargl > wrote: > > When parsing code and implicit typing is used, the > > type of an entity used as a character length is not > > known until after resolution. The attach patch > > checks the type of length and response accordingly. > > Regression tested on x86_64-*-freebsd. Ok to commit? > > Ok, thanks. Thanks. Committed. -- Steve
[PATCH] Add PowerPC configuration option --with-long-double-format={ibm,ieee}
This patch is next in my series of patches to enable us to configure the long double type on PowerPC systems. This patch is only about the configuration option. A future patch will contain the multilib support. This patch needs the previous patch I submitted today (January 10th, 2018) applied before it will work. https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00806.html I built a bootstrap configuratrion on a little endian power8 system using the --with-long-double-format-ibm option and it passed with no regressions. I also built a non-bootstrap version using --with-long-double-format-ieee and I verified that it defaulted to IEEE 128-bit for long double. Can I check this patch into the trunk once the previous patch has been checked in? 2018-01-10 Michael Meissner * configure.ac (--with-long-double-format): Add support for the configuration option to change the default long double format on PowerPC systems. * config.gcc (powerpc*-linux*-*): Likewise. * configure: Regenerate. * config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): If long double is IEEE, define __KC__ and __KF__ to allow floatn.h to be used without modification. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797 Index: gcc/configure.ac === --- gcc/configure.ac(revision 256356) +++ gcc/configure.ac(working copy) @@ -5885,6 +5885,43 @@ if test x$gcc_cv_target_ldbl128 = xyes; [Define if TFmode long double should be the default]) fi +# Check if TFmode long double target should use the IBM extended double or IEEE +# 128-bit floating point formats if long doubles are 128-bits long. The long +# double type can only be switched on powerpc64 bit Linux systems where VSX is +# supported. Other PowerPC systems do not build the IEEE 128-bit emulator in +# liggcc. +AC_ARG_WITH([long-double-format], + [AS_HELP_STRING([--with-long-double-format={ieee,ibm}] + [Specify whether PowerPC long double uses IEEE or IBM format])],[ +case x"$target:$with_long_double_format" in + xpowerpc64le-*-linux*:ieee | xpowerpc64le-*-linux*:ibm) +: +;; + xpowerpc64-*-linux*:ieee | xpowerpc64-*-linux*:*:ibm) +# IEEE 128-bit emulationr is only built on 64-bit VSX Linux systems +case "$with_cpu" in + power7 | power8 | power9 | power1*) + : + ;; + *) + AC_MSG_ERROR([Configuration option --with-long-double-format is only \ +supported if the default cpu is power7 or newer]) + with_long_double_format="" + ;; + esac + ;; + xpowerpc64*-*-linux*:*) +AC_MSG_ERROR([--with-long-double-format argument should be ibm or ieee]) +with_long_double_format="" +;; + *) +AC_MSG_ERROR([Configure option --with-long-double-format is only supported \ +on 64-bit PowerPC VSX Linux systems]) +with_long_double_format="" +;; +esac], + []) + # Check if the target LIBC supports exporting the AT_PLATFORM and AT_HWCAP # values in the TCB. Currently, only GLIBC 2.23 and later support this. gcc_cv_libc_provides_hwcap_in_tcb=no Index: gcc/configure === --- gcc/configure (revision 256356) +++ gcc/configure (working copy) @@ -945,6 +945,7 @@ enable_linker_build_id enable_libssp enable_default_ssp with_long_double_128 +with_long_double_format with_gc with_system_zlib enable_maintainer_mode @@ -1738,6 +1739,9 @@ Optional Packages: --with-glibc-version=M.N assume GCC used with glibc version M.N or later --with-long-double-128 use 128-bit long double by default + --with-long-double-format={ieee,ibm} + Specify whether PowerPC long double uses IEEE or IBM format + --with-gc={page,zone} this option is not supported anymore. It used to choose the garbage collection mechanism to use with the compiler @@ -18442,7 +18446,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 18445 "configure" +#line 18449 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -18548,7 +18552,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 18551 "configure" +#line 18555 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -29185,6 +29189,45 @@ $as_echo "#define TARGET_DEFAULT_LONG_DO fi +# Check if TFmode long double target should use the IBM extended double or IEEE +# 128-bit floating point formats if long doubles are 128-bits long. The long +# double type can only be switched on powerpc64 bit Linux systems where VSX is +# supported. Other PowerPC systems do not build the IEEE 128-bit emulator in +# liggc
Re: [PATCH], Add checks for -mlong-double-128 in PowerPC ibm/float128 code
On Wed, Jan 10, 2018 at 03:19:49PM -0500, Michael Meissner wrote: > As I add support for making -mabi=ieeelongdouble default, I noticed some > issues > in building libstdc++ with the default for the code that supports the > -mlong-double-64 option. These tests add checks for -mlong-double-128 before > considering whether TFmode/TCmode are IEEE or IBM extended double. > > I went into the two primary macros (FLOAT128_{IBM,IEEE}_P) and added the > checks. I also looked at all of the TFmode/TCmode support in rs6000.c and I > found one case where it wasn't checking the long double size first. > > I have done a bootstrap and regression test with this patch on a little endian > power8 system. I previously did a bootstrap wtih the original patch (which > includes the rs6000.h change and a similar change to rs6000.c) to add long > double configuration and multilib on a big endian power8 system. In both > cases, there was no regression. Can I install this patch into the trunk? Okay. Thanks! Segher > 2018-01-10 Michael Meissner > > * config/rs6000/rs6000.c (is_complex_IBM_long_double): Explicitly > check for 128-bit long double before checking TCmode. > * config/rs6000/rs6000.h (FLOAT128_IEEE_P): Explicitly check for > 128-bit long doubles before checking TFmode or TCmode. > (FLOAT128_IBM_P): Likewise.
Re: [PR/middle-end 81897] make tree-ssa-uninit.c handle longer sequences
> OK. > > Sorry for mucking things up and making more work :( > Not at all. I didn't know PHIs were going to behave that way either ;-). Thanks for the review. Committed to trunk. Aldy
Re: PR81703 and Martin's fix for PR83501
On 01/10/2018 11:42 AM, Prathamesh Kulkarni wrote: > Hi, > I have attached patch for PR81703 rebased on Martin's fix for PR83501 > posted here since both had considerable overlaps: > https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00180.html > > The patch passes bootstrap+test on x86_64-unknown-linux-gnu > and cross-tested on aarch64-*-*. > Currently it fails to pass validation on arm targets because of PR83775. > > Does it look OK? > > Thanks, > Prathamesh > > > pr81703-1.txt > > > 2018-10-01 Martin Sebor > Prathamesh Kulkarni > > PR tree-optimization/83501 > PR tree-optimization/81703 > > * tree-ssa-strlen.c (get_string_cst): Rename... > (get_string_len): ...to this. Handle global constants. > (handle_char_store): Adjust. > > testsuite/ > * gcc.dg/strlenopt-39.c: New test-case. > * gcc.dg/pr81703.c: Likewise. OK. Jeff
Re: [PR/middle-end 81897] make tree-ssa-uninit.c handle longer sequences
On 01/10/2018 10:45 AM, Aldy Hernandez wrote: >>> @@ -671,11 +668,9 @@ convert_control_dep_chain_into_preds (vec >>> *dep_chains, >>> e = one_cd_chain[j]; >>> guard_bb = e->src; >>> gsi = gsi_last_bb (guard_bb); >>> + /* Ignore empty BBs as they're basically forwarder blocks. */ >>> if (gsi_end_p (gsi)) >>> - { >>> - has_valid_pred = false; >>> - break; >>> - } >>> + continue; >>> cond_stmt = gsi_stmt (gsi); >>> if (is_gimple_call (cond_stmt) && EDGE_COUNT (e->src->succs) >= 2) >>> /* Ignore EH edge. Can add assertion on the other edge's flag. >>> */ >> ISTM that you want to use empty_block_p (bb) && single_succ_p (bb) to >> detect the forwarder block. Otherwise ISTM that labels and debug >> statements would affect the uninit analysis. > We still need to check for gsi_end_p() because guard_bb can have no > statements but be considered non empty according to empty_block_p(). > This is the case with a seemingly empty basic block that actually has > an incoming PHI. > > Jakub suggested the following patch which fixes the new ICE in the PR. > I've adjusted the comments accordingly. > > OK? > > > curr.patch > > > gcc/ > > PR middle-end/81897 > * tree-ssa-uninit.c (convert_control_dep_chain_into_preds): Skip > empty blocks. OK. Sorry for mucking things up and making more work :( jeff
Re: PR83775
On 01/10/2018 12:21 PM, Martin Sebor wrote: > On 01/10/2018 12:01 PM, Prathamesh Kulkarni wrote: >> Hi, >> The attached patch tries to fix PR83775. >> Validation in progress. >> OK to commit if passes ? > > FWIW, the patch makes sense to me as it simplifies things for > me when debugging using a cross-compiler. I reported the same > ICE in bug 83775 and it was closed as WontFix but perhaps with > a patch the decision will be reconsidered. It's not very user > friendly to crash on the wrong/missing options. If the patch > isn't accepted then perhaps one where the compiler issues > an error would be. I've found the current behavior extremely user unfriendly as well. jeff
Re: [PATCH, rs6000] Fix PR83399, ICE During LRA with 2-op rtl pattern for lvx instruction
On 1/10/18 2:38 PM, Segher Boessenkool wrote: > They don't have that name (they don't have any name). > > I often say things like > > (8 unnamed splitters): Likewise. Ok, committed with the following updated ChangeLog which we discussed offline. Thanks! gcc/ PR target/83399 * config/rs6000/rs6000.c (print_operand) <'y'>: Use VECTOR_MEM_ALTIVEC_OR_VSX_P. * config/rs6000/vsx.md (*vsx_le_perm_load_ for VSX_D): Use indexed_or_indirect_operand predicate. (*vsx_le_perm_load_ for VSX_W): Likewise. (*vsx_le_perm_load_v8hi): Likewise. (*vsx_le_perm_load_v16qi): Likewise. (*vsx_le_perm_store_ for VSX_D): Likewise. (*vsx_le_perm_store_ for VSX_W): Likewise. (*vsx_le_perm_store_v8hi): Likewise. (*vsx_le_perm_store_v16qi): Likewise. (eight unnamed splitters): Likewise. gcc/testsuite/ PR target/83399 * gcc.target/powerpc/pr83399.c: New test. Peter
Re: [PATCH PR82096] Fix ICE in int_mode_for_mode, at stor-layout.c:403 with arm-linux-gnueabi
On 01/10/2018 09:25 AM, Sudakshina Das wrote: > Hi Jeff > > On 10/01/18 10:44, Sudakshina Das wrote: >> Hi Jeff >> >> On 09/01/18 23:43, Jeff Law wrote: >>> On 01/05/2018 12:25 PM, Sudakshina Das wrote: Hi Jeff On 05/01/18 18:44, Jeff Law wrote: > On 01/04/2018 08:35 AM, Sudakshina Das wrote: >> Hi >> >> The bug reported a particular test di-longlong64-sync-1.c failing >> when >> run on arm-linux-gnueabi with options -mthumb -march=armv5t >> -O[g,1,2,3] >> and -mthumb -march=armv6 -O[g,1,2,3]. >> >> According to what I could see, the crash was caused because of the >> explicit VOIDmode argument that was sent to emit_store_flag_force (). >> Since the comparing argument was a long long, it was being forced >> into a >> VOID type register before the comparison (in prepare_cmp_insn()) is done. >> >> >> >> As pointed out by Kyrill, there is a comment on emit_store_flag() >> which >> says "MODE is the mode to use for OP0 and OP1 should they be >> CONST_INTs. >> If it is VOIDmode, they cannot both be CONST_INT". This >> condition is >> not true in this case and thus I think it is suitable to change the >> argument. >> >> Testing done: Checked for regressions on bootstrapped >> arm-none-linux-gnueabi and arm-none-linux-gnueabihf and added new >> test >> cases. >> >> Sudi >> >> ChangeLog entries: >> >> *** gcc/ChangeLog *** >> >> 2017-01-04 Sudakshina Das >> >> PR target/82096 >> * optabs.c (expand_atomic_compare_and_swap): Change argument >> to emit_store_flag_force. >> >> *** gcc/testsuite/ChangeLog *** >> >> 2017-01-04 Sudakshina Das >> >> PR target/82096 >> * gcc.c-torture/compile/pr82096-1.c: New test. >> * gcc.c-torture/compile/pr82096-2.c: Likwise. > In the case where both (op0/op1) to > emit_store_flag/emit_store_flag_force are constants, don't we know the > result of the comparison and shouldn't we have optimized the store > flag > to something simpler? > > I feel like I must be missing something here. > emit_store_flag_force () is comparing a register to op0. >>> ? >>> /* Emit a store-flags instruction for comparison CODE on OP0 and OP1 >>> and storing in TARGET. Normally return TARGET. >>> Return 0 if that cannot be done. >>> >>> MODE is the mode to use for OP0 and OP1 should they be >>> CONST_INTs. If >>> it is VOIDmode, they cannot both be CONST_INT. >>> >>> >>> So we're comparing op0 and op1 AFAICT. One, but not both can be a >>> CONST_INT. If both are a CONST_INT, then you need to address the >>> problem in the caller (by optimizing away the condition). If you've got >>> a REG and a CONST_INT, then the mode should be taken from the REG >>> operand. >>> >>> >>> >>> >>> The 2 constant arguments are to the expand_atomic_compare_and_swap () function. emit_store_flag_force () is used in case when this function is called by the bool variant of the built-in function where the bool return value is computed by comparing the result register with the expected op0. >>> So if only one of the two objects is a CONST_INT, then the mode should >>> come from the other object. I think that's the fundamental problem here >>> and that you're just papering over it by changing the caller. >>> >> I think my earlier explanation was a bit misleading and I may have >> rushed into quoting the comment about both operands being const for >> emit_store_flag_force(). The problem is with the function and I do >> agree with your suggestion of changing the function to add the code >> below to be a better approach than the changing the caller. I will >> change the patch and test it. >> > > This is the updated patch according to your suggestions. > > Testing: Checked for regressions on arm-none-linux-gnueabihf and added > new test case. > > Thanks > Sudi > > ChangeLog entries: > > *** gcc/ChangeLog *** > > 2017-01-10 Sudakshina Das > > PR target/82096 > * expmed.c (emit_store_flag_force): Swap if const op0 > and change VOIDmode to mode of op0. > > *** gcc/testsuite/ChangeLog *** > > 2017-01-10 Sudakshina Das > > PR target/82096 > * gcc.c-torture/compile/pr82096.c: New test. OK. jeff
Re: [PATCH] Add some reproducers for issues found when developing the location-wrappers patch
On 01/09/2018 02:13 PM, David Malcolm wrote: > Whilst developing the location-wrapper patch kit, I accumulated various > source files for which my work-in-progress patched-cc1plus had different > behavior to an unpatched cc1plus: some of these were crashes, others > were erroneous diagnostics. > > All of these are now fixed, but it seems appropriate to capture them > in our testsuite, hence this patch. > > Tested in conjunction with the patch kit; > adds 12 PASS, 2 UNSUPPORTED and 3 XFAILs to g++.sum. > > The XFAILs are due to the use of dg-excess-errors in > g++.dg/wrappers/cp-stdlib.C. > This source file is full of syntax errors, but used to ICE my cc1plus > *somewhere* in error recovery, and I wasn't able to minimize it further > at the time. I fixed the ICE, but it seems worth keeping as a test case. > I checked hacking in an ICE, and dg-excess-errors will FAIL on it > (rather than merely XFAIL, on all of the syntax errors). > > OK for trunk? > > gcc/testsuite/ChangeLog: > PR c++/43486 > * g++.dg/wrappers: New subdirectory. > * g++.dg/wrappers/README: New file. > * g++.dg/wrappers/alloc.C: New test case. > * g++.dg/wrappers/cow-istream-string.C: New test case. > * g++.dg/wrappers/cp-stdlib.C: New test case. > * g++.dg/wrappers/sanitizer_coverage_libcdep_new.C: New test case. > * g++.dg/wrappers/wrapper-around-type-pack-expansion.C: New test > case. OK. jeff
Fix buglet in dwarf2out_var_location
Passing a null pointer as argument to formatting functions corresponding to the %s specifier makes the libc choke on old versions of Solaris 10. The attached patchlet fixes a recently added case and thus eliminates: -FAIL: gcc.dg/debug/debug-7.c -gdwarf-2 -O (internal compiler error) -FAIL: gcc.dg/debug/debug-7.c -gdwarf-2 -O (test for excess errors) -FAIL: gcc.dg/debug/debug-7.c -gdwarf-2 -O3 (internal compiler error) -FAIL: gcc.dg/debug/debug-7.c -gdwarf-2 -O3 (test for excess errors) -FAIL: gcc.dg/debug/debug-7.c -gdwarf-2 -g3 -O (internal compiler error) -FAIL: gcc.dg/debug/debug-7.c -gdwarf-2 -g3 -O (test for excess errors) -FAIL: gcc.dg/debug/debug-7.c -gdwarf-2 -g3 -O3 (internal compiler error) -FAIL: gcc.dg/debug/debug-7.c -gdwarf-2 -g3 -O3 (test for excess errors) -FAIL: gcc.dg/debug/dwarf2/pr43237.c (internal compiler error) -FAIL: gcc.dg/debug/dwarf2/pr43237.c (test for excess errors) -UNRESOLVED: gcc.dg/debug/dwarf2/pr43237.c scan-assembler-not LLST[^r\\\ \n]*DW_AT_upper_bound Tested on SPARC/Solaris and x86-64/Linux, applied on the mainline as obvious. 2018-01-10 Eric Botcazou * dwarf2out.c (dwarf2out_var_location): Do not pass NULL to fprintf. -- Eric BotcazouIndex: dwarf2out.c === --- dwarf2out.c (revision 256275) +++ dwarf2out.c (working copy) @@ -26584,11 +26584,16 @@ create_label: if (var_loc_p && flag_debug_asm) { - const char *name = NULL, *sep = " => ", *patstr = NULL; + const char *name, *sep, *patstr; if (decl && DECL_NAME (decl)) name = IDENTIFIER_POINTER (DECL_NAME (decl)); + else + name = ""; if (NOTE_VAR_LOCATION_LOC (loc_note)) - patstr = str_pattern_slim (NOTE_VAR_LOCATION_LOC (loc_note)); + { + sep = " => "; + patstr = str_pattern_slim (NOTE_VAR_LOCATION_LOC (loc_note)); + } else { sep = " ";
Re: PR82665 - missing value range optimization for memchr
On 01/06/2018 11:58 PM, Prathamesh Kulkarni wrote: [ Snip ] >> I think with those changes we're probably in good shape. But please >> repost for final approval. > I have the updated the attached version with your suggestions. > Does it look OK ? > Bootstrap+test passes on x86_64-unknown-linux-gnu. > > Thanks, > Prathamesh >> jeff > > pr82665-9.diff > > > diff --git a/gcc/vr-values.c b/gcc/vr-values.c > index 794b4635f9e..41a4a0b041f 100644 > --- a/gcc/vr-values.c > +++ b/gcc/vr-values.c > @@ -793,6 +793,39 @@ vr_values::extract_range_from_binary_expr (value_range > *vr, > >extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &vr1); > > + /* Set value_range for n in following sequence: > + def = __builtin_memchr (arg, 0, sz) > + n = def - arg > + Here the range for n can be set to [0, PTRDIFF_MAX - 1]. */ > + > + if (vr->type == VR_VARYING > + && code == POINTER_DIFF_EXPR > + && TREE_CODE (op0) == SSA_NAME > + && TREE_CODE (op1) == SSA_NAME) > +{ > + tree op0_ptype = TREE_TYPE (TREE_TYPE (op0)); > + tree op1_ptype = TREE_TYPE (TREE_TYPE (op1)); > + gcall *call_stmt = NULL; > + > + if (TYPE_MODE (op0_ptype) == TYPE_MODE (char_type_node) > + && TYPE_PRECISION (op0_ptype) == TYPE_PRECISION (char_type_node) > + && TYPE_MODE (op1_ptype) == TYPE_MODE (char_type_node) > + && TYPE_PRECISION (op1_ptype) == TYPE_PRECISION (char_type_node) Note that while the operands of POINTER_DIFF_EXPR can be pointers to different types, we do require that they have the same mode and precision. So the tests of TYPE_MODE and TYPE_PRECISION for op1_ptype are not needed. OK with those two checks removed. Jeff ps. FWIW, the close a stage in our development cycle is a patch submission deadline. So if a patch is submitted prior to the deadline, then it can move forward, even if review/approval happens after the deadline.
Re: [PATCH] MicroBlaze use default ident output generation
On 01/09/2018 01:26 AM, Nathan Rossi wrote: > On 18 November 2017 at 22:13, Nathan Rossi wrote: >> On 18 November 2017 at 04:25, Jeff Law wrote: >>> On 11/15/2017 11:58 PM, Nathan Rossi wrote: Remove the MicroBlaze specific TARGET_ASM_OUTPUT_IDENT definition, and use the default. This resolves issues associated with the use of the .sdata2 operation in cases where emitted assembly after the ident output is incorrectly in the .sdata2 section instead of .text or any other expected section. Which results in assembly failures including operations with symbols across different segments. gcc/ChangeLog 2017-11-16 Nathan Rossi PR target/83013 * config/microblaze/microblaze-protos.h (microblaze_asm_output_ident): Delete * config/microblaze/microblaze.c (microblaze_asm_output_ident): Delete * config/microblaze/microblaze.h (TARGET_ASM_OUTPUT_IDENT): Default >>> But isn't the purpose of the override to force certain small-enough >>> objects into the .sdata2 section and by removing the override aren't you >>> losing that capability? >>> >>> It does seem like the override is broken in that it changes the current >>> section behind the back of the generic code. >>> >>> Wouldn't a better fix be to ensure that the override arranges to switch >>> back to whatever the current section is? Perhaps using .pushsection and >>> .popsection would help here? >>> >> >> That would be a better fix, however I sent this change first as it >> seemed it might be preferred to remove the target specific behavior >> instead of fixing it. Since it is the only target that actually uses >> the TARGET_ASM_OUTPUT_IDENT to change the output asm content (others >> either define the default or have a function that calls the default). >> >> But I can sort out a patch that fixes the behavior instead if that is >> preferred? > > Ping. Should I sort out a patch which uses the push/pop of the section > or is this patch preferred? I'd approve a push/pop. THe current patch as-is seems broken to me. jeff
Re: Fix folding of Inf/NaN comparisons for -ftrapping-math (PR tree-optimization/64811)
On Wed, Jan 10, 2018 at 08:17:08PM +, Joseph Myers wrote: > On Wed, 10 Jan 2018, Bernhard Reutner-Fischer wrote: > > > > >+/* x <= +Inf is the same as x == x, i.e. !isnan(x), but this > > > >loses > > > >+ an "invalid" exception. */ > > > >+(if (!flag_trapping_math) > > > >+ (eq @0 @0 > > > >+ /* x == +Inf and x >= +Inf are always equal to x > DBL_MAX, but > > > >+ for == this introduces an exception for x a NaN. */ > > > > What does "x a NaN" mean? x OP NaN resp. x CMP NaN ? > > That x is a NaN. fpclassify (x) == FP_NAN. It's not a C comparison So s/for x a NaN/if x is a NaN/ wouldn't have thrown me off. > operator since NaNs compare unequal to everything. > > > Shouldn't run tests return 0 here? Also drop the exit() declaration > > since it's unused? > > C99 automatically returns 0 from the end of main, but it's true the > default style would be exit (0) in tests. Ah right fenv_exceptions even sets -std=gnu99 and fenv support would require a minimum of C99 anyway so you're of course right. Obviously i'm half asleep, sorry for the noise.. thanks,
[PATCH] i386: Align stack frame if argument is passed on stack
When a function call is removed, it may become a leaf function. But if argument may be passed on stack, we need to align the stack frame when there is no tail call. Tested on Linux/i686 and Linux/x86-64. OK for trunk? H.J. --- gcc/ PR target/83330 * config/i386/i386.c (ix86_compute_frame_layout): Align stack frame if argument is passed on stack. gcc/testsuite/ PR target/83330 * gcc.target/i386/pr83330.c: New test. --- gcc/config/i386/i386.c | 7 ++- gcc/testsuite/gcc.target/i386/pr83330.c | 29 + 2 files changed, 35 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr83330.c diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 5e17b694d7f..d6ff096d466 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -11339,11 +11339,16 @@ ix86_compute_frame_layout (void) offset += frame->va_arg_size; } - /* Align start of frame for local function. */ + /* Align start of frame for local function. When a function call + is removed, it may become a leaf function. But if argument may + be passed on stack, we need to align the stack when there is no + tail call. */ if (m->call_ms2sysv || frame->va_arg_size != 0 || size != 0 || !crtl->is_leaf + || (!crtl->tail_call_emit + && cfun->machine->outgoing_args_on_stack) || cfun->calls_alloca || ix86_current_function_calls_tls_descriptor) offset = ROUND_UP (offset, stack_alignment_needed); diff --git a/gcc/testsuite/gcc.target/i386/pr83330.c b/gcc/testsuite/gcc.target/i386/pr83330.c new file mode 100644 index 000..8a63fbd5d09 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr83330.c @@ -0,0 +1,29 @@ +/* { dg-do run { target int128 } } */ +/* { dg-options "-O2 -fno-tree-dce -mno-push-args" } */ + +typedef unsigned long long u64; +typedef unsigned __int128 u128; + +u64 v; +u64 g; + +u64 __attribute__ ((noinline, noclone)) +bar (u128 d, u64 e, u64 f, u64 g, u128 h) +{ + (void)d, (void)e, (void)f, (void)g, (void)h; + return 0; +} + +static u64 __attribute__ ((noipa)) +foo (void) +{ + (void)(v - bar (0, 0, 0, 0, 0)); + return g; +} + +int +main (void) +{ + (void)foo (); + return 0; +} -- 2.14.3
Re: [PATCH, rs6000] Fix PR83399, ICE During LRA with 2-op rtl pattern for lvx instruction
Hi, On Wed, Jan 10, 2018 at 01:55:34PM -0600, Peter Bergner wrote: > >> @@ -570,7 +570,7 @@ (define_split > >> ;; The post-reload split requires that we re-permute the source > >> ;; register in case it is still live. > >> (define_split > >> - [(set (match_operand:VSX_D 0 "memory_operand" "") > >> + [(set (match_operand:VSX_D 0 "indexed_or_indirect_operand" "") > >> (match_operand:VSX_D 1 "vsx_register_operand" ""))] > >>"!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && > >> reload_completed" > >>[(set (match_dup 1) > > > > You don't mention these in the changelog. > > I tried to by mentioning splitters in the entry above. How are these > unnamed splitters supposed to be mentioned? Maybe like: > > (*vsx_le_perm_store_ for ): Likewise/ > (*vsx_le_perm_store_ for splitter): Likewise. They don't have that name (they don't have any name). I often say things like (8 unnamed splitters): Likewise. (you can try to merge a define_split first, to a define_insn_and_split, which is a nice cleanup by itself; but that won't work here I think). Segher
RE: [PATCH 5/5][AArch64] fp16fml support
Okay will put on my to-do list for post GCC 8. -Original Message- From: James Greenhalgh [mailto:james.greenha...@arm.com] Sent: Wednesday, January 10, 2018 12:21 PM To: Michael Collison Cc: Richard Sandiford ; GCC Patches ; nd Subject: Re: [PATCH 5/5][AArch64] fp16fml support On Tue, Jan 09, 2018 at 06:28:09PM +, Michael Collison wrote: > Patch updated per Richard's comments. Ok for trunk? This patch adds a lot of code, much of which looks like it ought to be possible to common up using the iterators. I'm going to OK it as is, as I'd like to see this make GCC 8, and we've sat on it for long enough, but I would really appreciate futurec refactoring in this area. I'm worried about maintainability as it stands. OK. Thanks, James > > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@linaro.org] > Sent: Thursday, January 4, 2018 8:02 AM > To: Michael Collison > Cc: GCC Patches ; nd > Subject: Re: [PATCH 5/5][AArch64] fp16fml support > > Hi Michael, > > Not a review of the full patch, just a comment about the patterns: > > Michael Collison writes: > > +(define_expand "aarch64_fmll_lane_lowv2sf" > > + [(set (match_operand:V2SF 0 "register_operand" "") > > + (unspec:V2SF [(match_operand:V2SF 1 "register_operand" "") > > + (match_operand:V4HF 2 "register_operand" "") > > + (match_operand:V4HF 3 "register_operand" "") > > + (match_operand:SI 4 "aarch64_imm2" "")] > > +VFMLA16_LOW))] > > + "TARGET_F16FML" > > +{ > > +rtx p1 = aarch64_simd_vect_par_cnst_half (V4HFmode, > > + GET_MODE_NUNITS (V4HFmode), > > + false); > > +rtx lane = GEN_INT (ENDIAN_LANE_N (GET_MODE_NUNITS (SImode), > > +INTVAL (operands[4]))); > > Please use the newly-introduced aarch64_endian_lane_rtx for this. > > GET_MODE_NUNITS (SImode) doesn't seem right though, since that's always 1. > Should it be using V4HFmode instead? > > Same for the other patterns. > > Thanks, > Richard
[PATCH] suppress -Wstringop-overflow when no-warning is set (PR 83508)
To avoid issuing duplicate warnings for the same function call in the source code the -Wrestrict warning code makes sure the no-warning bit is propagated between trees and GIMPLE and tested before issuing a warning. But the warning also detects some of the same problems as -Wstringop-overflow, and that warning was not updated to pay attention to the no-warning bit. This can in turn lead to two warnings for what boils down to the same bug. The warnings can be confusing when the first one references the function as it appears in the source code and the second one the one it was transformed to by GCC after the first warning was issued. The attached patch corrects this oversight by having the buffer overflow checker test the no-warning bit and skip issuing a diagnostic. (The function that does the overflow checking still runs so that it can continue to indicate to its callers whether an overflow has been detected.) Bootstrap on x86_64-linux in progress. Martin PR other/83508 - c-c++-common/Wrestrict.c fails since r255836 gcc/testsuite/ChangeLog: PR other/83508 * gcc.dg/Wstringop-overflow-2.c: New test. gcc/ChangeLog: PR other/83508 * builtins.c (check_access): Avoid warning when the no-warning bit is set. diff --git a/gcc/builtins.c b/gcc/builtins.c index 1d6e69d..a1b6a4e 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -3150,6 +3150,9 @@ check_access (tree exp, tree, tree, tree dstwrite, || (tree_fits_uhwi_p (dstwrite) && tree_int_cst_lt (dstwrite, range[0] { + if (TREE_NO_WARNING (exp)) + return false; + location_t loc = tree_nonartificial_location (exp); loc = expansion_point_location_if_in_system_header (loc); @@ -3209,6 +3212,9 @@ check_access (tree exp, tree, tree, tree dstwrite, if (tree_int_cst_lt (maxobjsize, range[0])) { + if (TREE_NO_WARNING (exp)) + return false; + /* Warn about crazy big sizes first since that's more likely to be meaningful than saying that the bound is greater than the object size if both are big. */ @@ -3230,6 +3236,9 @@ check_access (tree exp, tree, tree, tree dstwrite, if (dstsize != maxobjsize && tree_int_cst_lt (dstsize, range[0])) { + if (TREE_NO_WARNING (exp)) + return false; + if (tree_int_cst_equal (range[0], range[1])) warning_at (loc, opt, "%K%qD specified bound %E " @@ -3253,6 +3262,9 @@ check_access (tree exp, tree, tree, tree dstwrite, && dstwrite && range[0] && tree_int_cst_lt (slen, range[0])) { + if (TREE_NO_WARNING (exp)) + return false; + location_t loc = tree_nonartificial_location (exp); if (tree_int_cst_equal (range[0], range[1])) diff --git a/gcc/testsuite/gcc.dg/Wstringop-overflow-2.c b/gcc/testsuite/gcc.dg/Wstringop-overflow-2.c new file mode 100644 index 000..6e3e2ca --- /dev/null +++ b/gcc/testsuite/gcc.dg/Wstringop-overflow-2.c @@ -0,0 +1,30 @@ +/* PR tree-optimization/83508 - c-c++-common/Wrestrict.c fails since r255836 + Test to verify that only one of -Wrestrict and -Wstringop-overflow is + issued for a problem where either would be appropriate. + { dg-do compile } + { dg-options "-O2 -Wrestrict -Wstringop-overflow" } */ + +#define DIFF_MAX __PTRDIFF_MAX__ + +typedef __PTRDIFF_TYPE__ ptrdiff_t; +typedef __SIZE_TYPE__ size_t; + +void sink (void*); + +void f (ptrdiff_t i, size_t n) +{ + if (i < DIFF_MAX - 2 || DIFF_MAX - 1 > i) +i = DIFF_MAX - 2; + + if (n < 4 || 5 < n) +n = 4; + + char a[8] = "012"; + + /* The following could very well be diagnosed by -Wstringop-overflow + instead but there's no way to verify that only one of the two + warnings is issued and the choice of -Wrestrict simply reflects + the fact that -Wrestrict runs before -Wstringop-overflow. */ + __builtin_strncpy (a + i, a, n); /* { dg-warning "\\\[-Wrestrict]" } */ + sink (a); +}
Re: [PATCH, rs6000] Change x86 intrinsic compat headers to use #error
On 1/10/18 2:14 PM, Segher Boessenkool wrote: > On Tue, Jan 09, 2018 at 04:09:25PM -0600, Peter Bergner wrote: >> The following patch changes the x86 intrinsic compat headers to use #error >> instead of #warning. We do this for two reasons. Firstly, we want the user >> to really be sure they want/need to use the x86 intrinsic compat support >> before doing so, because a warning is too easy to ignore. Secondly, we don't >> want configure time tests that check for whether x86 intrinsic support exists >> and then use them if they do, to automatically enable usage of them. >> The x86 intrinsic compat support should only be used when there is no other >> option to the user (ie, no generic or ppc specific code exists). >> >> Is this ok for trunk? > > Okay, thanks! Committed, thanks! Peter
[PATCH], Add checks for -mlong-double-128 in PowerPC ibm/float128 code
As I add support for making -mabi=ieeelongdouble default, I noticed some issues in building libstdc++ with the default for the code that supports the -mlong-double-64 option. These tests add checks for -mlong-double-128 before considering whether TFmode/TCmode are IEEE or IBM extended double. I went into the two primary macros (FLOAT128_{IBM,IEEE}_P) and added the checks. I also looked at all of the TFmode/TCmode support in rs6000.c and I found one case where it wasn't checking the long double size first. I have done a bootstrap and regression test with this patch on a little endian power8 system. I previously did a bootstrap wtih the original patch (which includes the rs6000.h change and a similar change to rs6000.c) to add long double configuration and multilib on a big endian power8 system. In both cases, there was no regression. Can I install this patch into the trunk? 2018-01-10 Michael Meissner * config/rs6000/rs6000.c (is_complex_IBM_long_double): Explicitly check for 128-bit long double before checking TCmode. * config/rs6000/rs6000.h (FLOAT128_IEEE_P): Explicitly check for 128-bit long doubles before checking TFmode or TCmode. (FLOAT128_IBM_P): Likewise. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797 Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 256355) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -11429,7 +11429,7 @@ rs6000_must_pass_in_stack (machine_mode static inline bool is_complex_IBM_long_double (machine_mode mode) { - return mode == ICmode || (!TARGET_IEEEQUAD && mode == TCmode); + return mode == ICmode || (mode == TCmode && FLOAT128_IBM_P (TCmode)); } /* Whether ABI_V4 passes MODE args to a function in floating point Index: gcc/config/rs6000/rs6000.h === --- gcc/config/rs6000/rs6000.h (revision 256355) +++ gcc/config/rs6000/rs6000.h (working copy) @@ -437,11 +437,13 @@ extern const char *host_detect_local_cpu Similarly IFmode is the IBM long double format even if the default is IEEE 128-bit. Don't allow IFmode if -msoft-float. */ #define FLOAT128_IEEE_P(MODE) \ - ((TARGET_IEEEQUAD && ((MODE) == TFmode || (MODE) == TCmode)) \ + ((TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128 \ +&& ((MODE) == TFmode || (MODE) == TCmode)) \ || ((MODE) == KFmode) || ((MODE) == KCmode)) #define FLOAT128_IBM_P(MODE) \ - ((!TARGET_IEEEQUAD && ((MODE) == TFmode || (MODE) == TCmode)) \ + ((!TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128 \ +&& ((MODE) == TFmode || (MODE) == TCmode)) \ || (TARGET_HARD_FLOAT && ((MODE) == IFmode || (MODE) == ICmode))) /* Helper macros to say whether a 128-bit floating point type can go in a
Re: Fix folding of Inf/NaN comparisons for -ftrapping-math (PR tree-optimization/64811)
On Wed, 10 Jan 2018, Bernhard Reutner-Fischer wrote: > > >+ /* x <= +Inf is the same as x == x, i.e. !isnan(x), but this loses > > >+ an "invalid" exception. */ > > >+ (if (!flag_trapping_math) > > >+ (eq @0 @0 > > >+ /* x == +Inf and x >= +Inf are always equal to x > DBL_MAX, but > > >+ for == this introduces an exception for x a NaN. */ > > What does "x a NaN" mean? x OP NaN resp. x CMP NaN ? That x is a NaN. fpclassify (x) == FP_NAN. It's not a C comparison operator since NaNs compare unequal to everything. > Shouldn't run tests return 0 here? Also drop the exit() declaration > since it's unused? C99 automatically returns 0 from the end of main, but it's true the default style would be exit (0) in tests. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH, rs6000] Change x86 intrinsic compat headers to use #error
On Tue, Jan 09, 2018 at 04:09:25PM -0600, Peter Bergner wrote: > The following patch changes the x86 intrinsic compat headers to use #error > instead of #warning. We do this for two reasons. Firstly, we want the user > to really be sure they want/need to use the x86 intrinsic compat support > before doing so, because a warning is too easy to ignore. Secondly, we don't > want configure time tests that check for whether x86 intrinsic support exists > and then use them if they do, to automatically enable usage of them. > The x86 intrinsic compat support should only be used when there is no other > option to the user (ie, no generic or ppc specific code exists). > > Is this ok for trunk? Okay, thanks! Segher > * config/rs6000/x86intrin.h: Change #warning to #error. Update message. > * config/rs6000/emmintrin.h: Likewise. > * config/rs6000/mmintrin.h: Likewise. > * config/rs6000/xmmintrin.h: Likewise.
Re: [C++ Patch] PR 78344 ("ICE on invalid c++ code (tree check: expected tree_list, have error_mark in cp_check_const_attributes, at cp/decl2.c:1347")
Hi, On 10/01/2018 17:58, Jason Merrill wrote: On Wed, Jan 10, 2018 at 11:33 AM, Paolo Carlini wrote: Hi, On 10/01/2018 16:32, Jason Merrill wrote: On Fri, Dec 22, 2017 at 7:34 PM, Paolo Carlini wrote: in this error recovery issue cp_check_const_attributes and more generally cplus_decl_attributes have lots of troubles handling the error_mark_node returned by cp_parser_std_attribute_spec_seq, as called by cp_parser_direct_declarator. I fiddled quite a bit with these parsing facilities to eventually notice that boldly changing cp_parser_std_attribute_spec_seq to return NULL_TREE instead of error_mark_node when cp_parser_std_attribute_spec returns error_mark_node in the loop cures the bug at issue as a special case. Hmm, I'm skeptical. In general, we want to use error_mark_node to distinguish between something not being there and being there but wrong. I see what you mean. But consider that we already emitted diagnostic anyway, I'm not sure we can rely on that; cp_parser_error doesn't emit anything when parsing tentatively. Ok. I'm going to investigate that a bit more - the obvious idea would be limiting somehow the approach to when we are *not* parsing tentatively - otherwise I'll see if we can handle elegantly enough those error_mark_nodes. I committed the other straightforward change which you approved, thanks about that. Paolo.
Re: [PATCH, rs6000] Fix PR83399, ICE During LRA with 2-op rtl pattern for lvx instruction
On 1/10/18 1:44 PM, Segher Boessenkool wrote: > On Tue, Jan 09, 2018 at 12:22:38PM -0600, Peter Bergner wrote: >> * config/rs6000/rs6000.c (print_operand): Use >> VECTOR_MEM_ALTIVEC_OR_VSX_P. > > (print_operand) <'y'>: ... Will fix. >> * config/rs6000/vsx.md (*vsx_le_perm_load_): Use >> indexed_or_indirect_operand predicate. > > It's *vsx_le_perm_load_ -- is not part of the name. It > makes sense to mention it, maybe like > * config/rs6000/vsx.md (*vsx_le_perm_load_ for VSX_D): ... I didn't know how to handle that, so I guessed! :-) Will fix. >> (*vsx_le_perm_load_v8hi): Likewise. >> (*vsx_le_perm_load_v8hi): Likewise. > > You duplicated this line. Will fix. >> (*vsx_le_perm_store_): Likewise in pattern and splitters. [snip] >> (define_split >> - [(set (match_operand:VSX_D 0 "memory_operand" "") >> + [(set (match_operand:VSX_D 0 "indexed_or_indirect_operand" "") >> (match_operand:VSX_D 1 "vsx_register_operand" ""))] >>"!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && >> !reload_completed" >>[(set (match_dup 2) >> @@ -570,7 +570,7 @@ (define_split >> ;; The post-reload split requires that we re-permute the source >> ;; register in case it is still live. >> (define_split >> - [(set (match_operand:VSX_D 0 "memory_operand" "") >> + [(set (match_operand:VSX_D 0 "indexed_or_indirect_operand" "") >> (match_operand:VSX_D 1 "vsx_register_operand" ""))] >>"!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && reload_completed" >>[(set (match_dup 1) > > You don't mention these in the changelog. I tried to by mentioning splitters in the entry above. How are these unnamed splitters supposed to be mentioned? Maybe like: (*vsx_le_perm_store_ for ): Likewise/ (*vsx_le_perm_store_ for splitter): Likewise. ? Peter
Re: [1/4] [AArch64] SVE backend support
Thanks for the review! James Greenhalgh writes: > On Fri, Jan 05, 2018 at 11:41:25AM +, Richard Sandiford wrote: >> Here's the patch updated to apply on top of the v8.4 and >> __builtin_load_no_speculate support. It also handles the new >> vec_perm_indices and CONST_VECTOR encoding and uses VNx... names >> for the SVE modes. >> >> Richard Sandiford writes: >> > This patch adds support for ARM's Scalable Vector Extension. >> > The patch just contains the core features that work with the >> > current vectoriser framework; later patches will add extra >> > capabilities to both the target-independent code and AArch64 code. >> > The patch doesn't include: >> > >> > - support for unwinding frames whose size depends on the vector length >> > - modelling the effect of __tls_get_addr on the SVE registers >> > >> > These are handled by later patches instead. >> > >> > Some notes: >> > >> > - The copyright years for aarch64-sve.md start at 2009 because some of >> > the code is based on aarch64.md, which also starts from then. >> > >> > - The patch inserts spaces between items in the AArch64 section >> > of sourcebuild.texi. This matches at least the surrounding >> > architectures and looks a little nicer in the info output. >> > >> > - aarch64-sve.md includes a pattern: >> > >> > while_ult >> > >> > A later patch adds a matching "while_ult" optab, but the pattern >> > is also needed by the predicate vec_duplicate expander. > > I'm keen to take this. The code is good quality overall, I'm confident in your > reputation and implementation. There are some parts of the design that I'm > less happy about, but pragmatically, we should take this now to get the > behaviour correct, and look to optimise, refactor, and clean-up in future. > > Sorry it took me a long time to get to the review. I've got no meaningful > design concerns here, and certainly nothing so critical that we couldn't > fix it after the fact in GCC 9 and up. > > That said... > >> (aarch64_sve_cnt_immediate_p, aarch64_sve_addvl_addpl_immediate_p) > > I'm not a big fan of these sorts of functions which return a char* where > we've dumped the text we want to print out in the short term. The interface > (fill a static char[] we can then leak on return) is pretty ugly. Yeah, it's not pretty, but I think the various possible ways of doing the addition do justify using output functions here. The distinction between INC[BHWD], DEC[BHWD], ADDVL and ADDPL doesn't really affect anything other than the final output, so it isn't something that should be exposed as different constraints (for example). We should probably "just" have a nicer interface for target code to construct instruction format strings. > One consideration for future work would be refactoring out aarch64.c - it is > getting to be too big for my liking (near 18,000 lines). > >> (aarch64_expand_sve_mem_move) > > Do we have a good description of how SVE big-endian vectors work, comments - I found the detailed comment at the top of aarch64-sve.md> > > The sort of comment you write later ("see the comment at the head of > aarch64-sve.md for details") would also be useful here as a reference. Ah, yeah, will add a reference there too. >> aarch64_get_reg_raw_mode > > Do we assert/warn anywhere for users of __builtin_apply that they are > fundamentally unsupported? Not as far as I know. FWIW, this doesn't affect SVE (yet), because we don't yet support any types that would be passed in the SVE-specific part of the registers. >> offset_4bit_signed_scaled_p > > So much code duplication here and in similair functions. Would a single > interface (unsigned bits, bool signed, bool scaled) let you avoid the many > identical functions? We just kept to the existing style here. I agree it might be a good idea to consolidate them, but personally I'd prefer to keep the signed/scaled distinction in the function name, since it's more readable than booleans and shorter than a new enum. >> aarch64_evpc_rev_local > > I'm likely missing something obvious, but what is the distinction you're > drawing between global and local? Could you comment it? "global" reverses the whole vector: the first and last elements switch places. "local" reverses within groups of N consecutive elements but not between them. But yet again names are probably my downfall here. :-) I'm happy to call them something else instead. Either way I'll expand the comments. >> aarch64-sve.md - scheduling types > > None of the instructions here have types for scheduling. That's going to > make for a future headache. Adding them to the existing scheduling types > is going to cause all sorts of trouble when building GCC (we already have > too many types for some compilers to handle the structure!). We'll need > to finds a solution to how we'll direct scheduling for SVE. Yeah. I didn't want to add scheduling attributes now without scheduling descriptions to go with them, since there's no way of knowing what the di
libgo patch committed: Handle _st_timespec for AIX stat
The AIX stat function apparently uses an _st_timespec type. This libgo patch by Tony Reix handles that correctly. Bootstrapped on x86_64-pc-linux-gnu. Committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 256446) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -d8a9f7433a9e8a81c992ad2908818d2e84f3698b +19d94969c5202c07b3b166079b9f4ebbb52dfa6b The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: libgo/go/os/stat_aix.go === --- libgo/go/os/stat_aix.go (nonexistent) +++ libgo/go/os/stat_aix.go (working copy) @@ -0,0 +1,51 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package os + +import ( + "syscall" + "time" +) + +func fillFileStatFromSys(fs *fileStat, name string) { + fs.name = basename(name) + fs.size = int64(fs.sys.Size) + fs.modTime = stTimespecToTime(fs.sys.Mtim) + fs.mode = FileMode(fs.sys.Mode & 0777) + switch fs.sys.Mode & syscall.S_IFMT { + case syscall.S_IFBLK: + fs.mode |= ModeDevice + case syscall.S_IFCHR: + fs.mode |= ModeDevice | ModeCharDevice + case syscall.S_IFDIR: + fs.mode |= ModeDir + case syscall.S_IFIFO: + fs.mode |= ModeNamedPipe + case syscall.S_IFLNK: + fs.mode |= ModeSymlink + case syscall.S_IFREG: + // nothing to do + case syscall.S_IFSOCK: + fs.mode |= ModeSocket + } + if fs.sys.Mode&syscall.S_ISGID != 0 { + fs.mode |= ModeSetgid + } + if fs.sys.Mode&syscall.S_ISUID != 0 { + fs.mode |= ModeSetuid + } + if fs.sys.Mode&syscall.S_ISVTX != 0 { + fs.mode |= ModeSticky + } +} + +func stTimespecToTime(ts syscall.StTimespec) time.Time { + return time.Unix(int64(ts.Sec), int64(ts.Nsec)) +} + +// For testing. +func atime(fi FileInfo) time.Time { + return stTimespecToTime(fi.Sys().(*syscall.Stat_t).Atim) +} Index: libgo/go/os/stat_atim.go === --- libgo/go/os/stat_atim.go(revision 256366) +++ libgo/go/os/stat_atim.go(working copy) @@ -2,7 +2,7 @@ // Use of this source code is governed by a BSD-style // license that can be found in the LICENSE file. -// +build aix linux openbsd solaristag +// +build linux openbsd solaristag package os Index: libgo/go/syscall/syscall_aix.go === --- libgo/go/syscall/syscall_aix.go (revision 256366) +++ libgo/go/syscall/syscall_aix.go (working copy) @@ -6,6 +6,14 @@ package syscall import "unsafe" +func (ts *StTimespec) Unix() (sec int64, nsec int64) { + return int64(ts.Sec), int64(ts.Nsec) +} + +func (ts *StTimespec) Nano() int64 { + return int64(ts.Sec)*1e9 + int64(ts.Nsec) +} + func direntIno(buf []byte) (uint64, bool) { return readInt(buf, unsafe.Offsetof(Dirent{}.Ino), unsafe.Sizeof(Dirent{}.Ino)) } Index: libgo/mksysinfo.sh === --- libgo/mksysinfo.sh (revision 256366) +++ libgo/mksysinfo.sh (working copy) @@ -443,6 +443,12 @@ grep '^type _tms ' gen-sysinfo.go | \ -e 's/tms_cstime/Cstime/' \ >> ${OUT} +# AIX uses st_timespec struct for stat. +grep '^type _st_timespec ' gen-sysinfo.go | \ +sed -e 's/type _st_timespec /type StTimespec /' \ + -e 's/tv_sec/Sec/' \ + -e 's/tv_nsec/Nsec/' >> ${OUT} + # The stat type. # Prefer largefile variant if available. stat=`grep '^type _stat64 ' gen-sysinfo.go || true` @@ -467,7 +473,7 @@ fi | sed -e 's/type _stat64/type Stat_t/ -e 's/st_ctim/Ctim/' \ -e 's/\([^a-zA-Z0-9_]\)_timeval\([^a-zA-Z0-9_]\)/\1Timeval\2/g' \ -e 's/\([^a-zA-Z0-9_]\)_timespec_t\([^a-zA-Z0-9_]\)/\1Timespec\2/g' \ - -e 's/\([^a-zA-Z0-9_]\)_st_timespec_t\([^a-zA-Z0-9_]\)/\1Timespec\2/g' \ + -e 's/\([^a-zA-Z0-9_]\)_st_timespec_t\([^a-zA-Z0-9_]\)/\1StTimespec\2/g' \ -e 's/\([^a-zA-Z0-9_]\)_timespec\([^a-zA-Z0-9_]\)/\1Timespec\2/g' \ -e 's/\([^a-zA-Z0-9_]\)_timestruc_t\([^a-zA-Z0-9_]\)/\1Timestruc\2/g' \ -e 's/Godump_[0-9] struct { \([^;]*;\) };/\1/g' \
[committed] Preserving locations for variable-uses and constants (PR c++/43486)
All of the various parts of this kit have now been approved; I've committed it to trunk as r256448. (The initial version of the kit was posted 2017-10-20, and the followup on 2017-11-10, both in stage 1; the final patch approval was 2018-01-09, towards the end of stage 3). Successfully bootstrapped®rtested on x86_64-pc-linux-gnu. Successfully built stage1 on powerpc-ibm-aix7.1.3.0 (the rest of the bootstrap there is in progress). Manually tested with "make s-selftest-c++" on both hosts (since we don't run the selftests for cc1plus by default). Light benchmark testing (kdecore.cc) showed no significant compile-time difference, and a slight (~0.1%) ggc increase. Blurb follows: This patch implements location wrapper nodes, preserving source locations of the uses of variables and constants in various places in the C++ frontend: at the arguments at callsites, and for typeid, alignof, sizeof, and offsetof. For example, it allows the C++ FE to underline the pertinent argument for mismatching calls, for such expressions, improving: extern int callee (int one, const char *two, float three); int caller (int first, int second, float third) { return callee (first, second, third); } from test.cc: In function 'int caller(int, int, float)': test.cc:5:38: error: invalid conversion from 'int' to 'const char*' [-fpermissive] return callee (first, second, third); ^ test.cc:1:41: note: initializing argument 2 of 'int callee(int, const char*, float)' extern int callee (int one, const char *two, float three); ^~~ to: test.cc: In function 'int caller(int, int, float)': test.cc:5:25: error: invalid conversion from 'int' to 'const char*' [-fpermissive] return callee (first, second, third); ^~ test.cc:1:41: note: initializing argument 2 of 'int callee(int, const char*, float)' extern int callee (int one, const char *two, float three); ^~~ This is the combination of the following patches: "[PATCH 01/14] C++: preserve locations within build_address" https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00883.html "[PATCH v2.4 of 02/14] Support for adding and stripping location_t wrapper nodes" https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00591.html "[PATCH] Eliminate location wrappers in tree_nop_conversion/STRIP_NOPS" https://gcc.gnu.org/ml/gcc-patches/2017-12/msg01330.html "[PATCH v4 of 03/14] C++: add location_t wrapper nodes during parsing (minimal impl)" https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00660.html "[PATCH 04/14] Update testsuite to show improvements" https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00891.html "[v3 of 05/14] C++: handle locations wrappers when calling warn_for_memset" https://gcc.gnu.org/ml/gcc-patches/2017-12/msg01378.html "[PATCH 07/14] reject_gcc_builtin: strip any location wrappers" https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00886.html "[v3 of PATCH 08/14] cp/tree.c: strip location wrappers in lvalue_kind" https://gcc.gnu.org/ml/gcc-patches/2017-12/msg01433.html "[PATCH 09/14] Strip location wrappers in null_ptr_cst_p" https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00888.html "[PATCH 11/14] Handle location wrappers in string_conv_p" https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00890.html "[PATCH 12/14] C++: introduce null_node_p" https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00894.html "[v3 of PATCH 13/14] c-format.c: handle location wrappers" https://gcc.gnu.org/ml/gcc-patches/2017-12/msg01494.html "[PATCH 14/14] pp_c_cast_expression: don't print casts for location wrappers" https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00893.html "[v3 of PATCH 15/14] Use fold_for_warn in get_atomic_generic_size" https://gcc.gnu.org/ml/gcc-patches/2017-12/msg01380.html "[PATCH] Add selftest for "fold_for_warn (error_mark_node)"" https://gcc.gnu.org/ml/gcc-patches/2017-12/msg01385.html gcc/c-family/ChangeLog: PR c++/43486 * c-common.c: Include "selftest.h". (get_atomic_generic_size): Perform the test for integral type before the range test for any integer constant, fixing indentation of braces. Call fold_for_warn before testing for an INTEGER_CST. (reject_gcc_builtin): Strip any location wrapper from EXPR. (selftest::test_fold_for_warn): New function. (selftest::c_common_c_tests): New function. (selftest::c_family_tests): Call it, and selftest::c_pretty_print_c_tests. * c-common.h (selftest::c_pretty_print_c_tests): New decl. * c-format.c (check_format_arg): Convert VAR_P check to a fold_for_warn. * c-pretty-print.c: Include "selftest.h". (pp_c_cast_expression): Don't print casts for location wrappers. (selftest::assert_c_pretty_printer_output): New function. (ASSERT_C_PRETTY_PRINTER_OUTPUT): New macro.
Re: [PATCH, rs6000] Fix PR83399, ICE During LRA with 2-op rtl pattern for lvx instruction
On Tue, Jan 09, 2018 at 12:22:38PM -0600, Peter Bergner wrote: > PR83399 shows a problem where we emit an altivec load using a builtin > that forces us to use a specific altivec load pattern. The generated > rtl pattern has a use of sfp (frame pointer) and during LRA, we eliminate > it's use to the sp (lra-eliminations.c:process_insn_for_elimination). > During this process, we re-recog the insn and end up matching a different > vsx pattern, because it exists earlier in the machine description file. > That vsx pattern uses a "Z" constraint for its address operand, which will > not accept the "special" altivec address we have, but the memory_operand > predicate the pattern uses allows it. The recog'ing to a different pattern > than we want, causes us to ICE later on. > > The solution here is to tighten the predicate used for the address in the > vsx pattern to use the indexed_or_indirect_operand instead, which will > reject the altivec address our rtl pattern has. > > Once this is fixed, we end up hitting another issue in print_operand when > outputing altivec addresses when using -mvsx. This was fixed by allowing > both ALTIVEC or VSX VECTOR MEMs. > > This passed bootstrap and regtesting on powerpc64le-linux with no regressions. > Ok for trunk? > > Is this ok for the open release branches too, once testing has completed > there? > > Peter > > > gcc/ > PR target/83399 > * config/rs6000/rs6000.c (print_operand): Use > VECTOR_MEM_ALTIVEC_OR_VSX_P. (print_operand) <'y'>: ... > * config/rs6000/vsx.md (*vsx_le_perm_load_): Use > indexed_or_indirect_operand predicate. It's *vsx_le_perm_load_ -- is not part of the name. It makes sense to mention it, maybe like * config/rs6000/vsx.md (*vsx_le_perm_load_ for VSX_D): ... > (*vsx_le_perm_load_): Likewise. Similar. > (*vsx_le_perm_load_v8hi): Likewise. > (*vsx_le_perm_load_v8hi): Likewise. You duplicated this line. > (*vsx_le_perm_load_v16qi): Likewise. > (*vsx_le_perm_store_): Likewise in pattern and splitters. > (*vsx_le_perm_store_): Likewise. Same for these two. > (*vsx_le_perm_store_v8hi): Likewise. > (*vsx_le_perm_store_v16qi): Likewise. > > gcc/testsuite/ > PR target/83399 > * gcc.target/powerpc/pr83399.c: New test. > (define_split > - [(set (match_operand:VSX_D 0 "memory_operand" "") > + [(set (match_operand:VSX_D 0 "indexed_or_indirect_operand" "") > (match_operand:VSX_D 1 "vsx_register_operand" ""))] >"!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && !reload_completed" >[(set (match_dup 2) > @@ -570,7 +570,7 @@ (define_split > ;; The post-reload split requires that we re-permute the source > ;; register in case it is still live. > (define_split > - [(set (match_operand:VSX_D 0 "memory_operand" "") > + [(set (match_operand:VSX_D 0 "indexed_or_indirect_operand" "") > (match_operand:VSX_D 1 "vsx_register_operand" ""))] >"!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && reload_completed" >[(set (match_dup 1) You don't mention these in the changelog. > (define_split > - [(set (match_operand:VSX_W 0 "memory_operand" "") > + [(set (match_operand:VSX_W 0 "indexed_or_indirect_operand" "") > (match_operand:VSX_W 1 "vsx_register_operand" ""))] >"!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && !reload_completed" >[(set (match_dup 2) > @@ -617,7 +617,7 @@ (define_split > ;; The post-reload split requires that we re-permute the source > ;; register in case it is still live. > (define_split > - [(set (match_operand:VSX_W 0 "memory_operand" "") > + [(set (match_operand:VSX_W 0 "indexed_or_indirect_operand" "") > (match_operand:VSX_W 1 "vsx_register_operand" ""))] >"!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && reload_completed" >[(set (match_dup 1) > @@ -638,7 +638,7 @@ (define_split >"") Or these. > (define_split > - [(set (match_operand:V8HI 0 "memory_operand" "") > + [(set (match_operand:V8HI 0 "indexed_or_indirect_operand" "") > (match_operand:V8HI 1 "vsx_register_operand" ""))] >"!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && !reload_completed" >[(set (match_dup 2) > @@ -671,7 +671,7 @@ (define_split > ;; The post-reload split requires that we re-permute the source > ;; register in case it is still live. > (define_split > - [(set (match_operand:V8HI 0 "memory_operand" "") > + [(set (match_operand:V8HI 0 "indexed_or_indirect_operand" "") > (match_operand:V8HI 1 "vsx_register_operand" ""))] >"!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && reload_completed" >[(set (match_dup 1) Or these. > (define_split > - [(set (match_operand:V16QI 0 "memory_operand" "") > + [(set (match_operand:V16QI 0 "indexed_or_indirect_operand" "") > (match_operand:V16QI 1 "vsx_register_operand" ""))] >"!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR && !reload_completed" >
Re: [PATCH improve early strlen range folding (PR 83671)
On 01/06/2018 03:04 PM, Martin Sebor wrote: > Bug 83671 - Fix for false positive reported by -Wstringop-overflow > does not work at -O1, points out that the string length range > optimization implemented as a solution for bug 83373 doesn't help > at -O1. The root cause is that the fix was added to the strlen > pass that doesn't run at -O1. > > The string length range computation doesn't depend on the strlen > pass, and so the range can be set earlier, in gimple-fold, and > its results made available even at -O1. The attached patch > changes the gimple_fold_builtin_strlen() function to do that. > > While testing the change I came across a number of other simple > strlen cases that currently aren't handled, some at -O1, others > at all. I added code to handle some of the simplest of them > and opened bugs to remind us/myself to get back to the rest in > the future (pr83693 and pr83702). The significant enhancement > is handling arrays of arrays with non-constant indices and > pointers to such things, such as in: > > char a[2][7]; > > void f (int i) > { > if (strlen (a[i]) > 6) // eliminated with the patch > abort (); > } > > Attached is a near-minimal patch to handle PR 83671. > > Martin > > gcc-83671.diff > > > PR tree-optimization/83671 - Fix for false positive reported by > -Wstringop-overflow does not work with inlining > > gcc/testsuite/ChangeLog: > > PR tree-optimization/83671 > * gcc.dg/strlenopt-40.c: New test. > * gcc.dg/strlenopt-41.c: New test. > > gcc/ChangeLog: > > PR tree-optimization/83671 > * builtins.c (c_strlen): Unconditionally return zero for the empty > string. > Use -Warray-bounds for warnings. > * gimple-fold.c (get_range_strlen): Handle non-constant lengths > for non-constant array indices with COMPONENT_REF, arrays of > arrays, and pointers to arrays. > (gimple_fold_builtin_strlen): Determine and set length range for > non-constant character arrays. > > @@ -1311,14 +1311,30 @@ get_range_strlen (tree arg, tree length[2], bitmap > *visited, int type, [ ... ] > + else if (TREE_CODE (TREE_OPERAND (op, 0)) == COMPONENT_REF && fuzzy) > + { > + tree idx = TREE_OPERAND (op, 1); > + > + arg = TREE_OPERAND (op, 0); > + tree optype = TREE_TYPE (arg); > + if (tree dom = TYPE_DOMAIN (optype)) > + if (tree bound = TYPE_MAX_VALUE (dom)) > + if (TREE_CODE (bound) == INTEGER_CST > + && TREE_CODE (idx) == INTEGER_CST > + && tree_int_cst_lt (bound, idx)) > + return false; This deserves a comment what are you looking for and why do you return false when you find it. I think I know the answers, but it'd be clearer to future readers to spell it out in a comment. With that comment and removal of the controversial set_range_info, I think this is OK. Jeff
Re: Fix folding of Inf/NaN comparisons for -ftrapping-math (PR tree-optimization/64811)
Joseph, On Sat, Jan 06, 2018 at 08:45:33AM +0100, Richard Biener wrote: > On January 5, 2018 10:13:34 PM GMT+01:00, Joseph Myers > wrote: > >unrelated. OK to commit? > > OK. > > Richard. > >Index: gcc/match.pd > >=== > >--- gcc/match.pd (revision 256279) > >+++ gcc/match.pd (working copy) > >+/* x <= +Inf is the same as x == x, i.e. !isnan(x), but this loses > >+ an "invalid" exception. */ > >+(if (!flag_trapping_math) > >+ (eq @0 @0 > >+ /* x == +Inf and x >= +Inf are always equal to x > DBL_MAX, but > >+ for == this introduces an exception for x a NaN. */ What does "x a NaN" mean? x OP NaN resp. x CMP NaN ? > >+ (if ((code == EQ_EXPR && !(HONOR_NANS (@0) && > >flag_trapping_math)) > >+ || code == GE_EXPR) > >(with { real_maxval (&max, neg, TYPE_MODE (TREE_TYPE (@0))); } > > (if (neg) > > (lt @0 { build_real (TREE_TYPE (@0), max); }) > >@@ -3072,7 +3076,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > > (if (neg) > > (ge @0 { build_real (TREE_TYPE (@0), max); }) > > (le @0 { build_real (TREE_TYPE (@0), max); } > >- /* x != +Inf is always equal to !(x > DBL_MAX). */ > >+ /* x != +Inf is always equal to !(x > DBL_MAX), but this > >introduces > >+ an exception for x a NaN so use an unordered comparison. */ Likewise. > >Index: gcc/testsuite/gcc.c-torture/execute/ieee/fp-cmp-7.x > >=== > >--- gcc/testsuite/gcc.c-torture/execute/ieee/fp-cmp-7.x (nonexistent) > >+++ gcc/testsuite/gcc.c-torture/execute/ieee/fp-cmp-7.x (working copy) > >@@ -0,0 +1,2 @@ > >+lappend additional_flags "-fno-trapping-math" > >+return 0 > >Index: gcc/testsuite/gcc.dg/torture/inf-compare-1.c > >=== > >--- gcc/testsuite/gcc.dg/torture/inf-compare-1.c (nonexistent) > >+++ gcc/testsuite/gcc.dg/torture/inf-compare-1.c (working copy) > >@@ -0,0 +1,19 @@ > >+/* { dg-do run } */ > >+/* { dg-add-options ieee } */ > >+/* { dg-require-effective-target fenv_exceptions } */ > >+ > >+#include > >+ > >+extern void abort (void); > >+extern void exit (int); > >+ > >+volatile double x = __builtin_nan (""); > >+volatile int i; > >+ > >+int > >+main (void) > >+{ > >+ i = x > __builtin_inf (); > >+ if (i != 0 || !fetestexcept (FE_INVALID)) > >+abort (); > >+} Shouldn't run tests return 0 here? Also drop the exit() declaration since it's unused? Same for all new tests in this patch. thanks, > >Index: gcc/testsuite/gcc.dg/torture/inf-compare-2.c > >Index: gcc/testsuite/gcc.dg/torture/inf-compare-3.c > >Index: gcc/testsuite/gcc.dg/torture/inf-compare-4.c > >Index: gcc/testsuite/gcc.dg/torture/inf-compare-5.c > >Index: gcc/testsuite/gcc.dg/torture/inf-compare-6.c > >Index: gcc/testsuite/gcc.dg/torture/inf-compare-7.c > >Index: gcc/testsuite/gcc.dg/torture/inf-compare-8.c
Re: PR83775
On 01/10/2018 12:01 PM, Prathamesh Kulkarni wrote: Hi, The attached patch tries to fix PR83775. Validation in progress. OK to commit if passes ? FWIW, the patch makes sense to me as it simplifies things for me when debugging using a cross-compiler. I reported the same ICE in bug 83775 and it was closed as WontFix but perhaps with a patch the decision will be reconsidered. It's not very user friendly to crash on the wrong/missing options. If the patch isn't accepted then perhaps one where the compiler issues an error would be. Martin
Re: [PATCH 5/5][AArch64] fp16fml support
On Tue, Jan 09, 2018 at 06:28:09PM +, Michael Collison wrote: > Patch updated per Richard's comments. Ok for trunk? This patch adds a lot of code, much of which looks like it ought to be possible to common up using the iterators. I'm going to OK it as is, as I'd like to see this make GCC 8, and we've sat on it for long enough, but I would really appreciate futurec refactoring in this area. I'm worried about maintainability as it stands. OK. Thanks, James > > -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@linaro.org] > Sent: Thursday, January 4, 2018 8:02 AM > To: Michael Collison > Cc: GCC Patches ; nd > Subject: Re: [PATCH 5/5][AArch64] fp16fml support > > Hi Michael, > > Not a review of the full patch, just a comment about the patterns: > > Michael Collison writes: > > +(define_expand "aarch64_fmll_lane_lowv2sf" > > + [(set (match_operand:V2SF 0 "register_operand" "") > > + (unspec:V2SF [(match_operand:V2SF 1 "register_operand" "") > > + (match_operand:V4HF 2 "register_operand" "") > > + (match_operand:V4HF 3 "register_operand" "") > > + (match_operand:SI 4 "aarch64_imm2" "")] > > +VFMLA16_LOW))] > > + "TARGET_F16FML" > > +{ > > +rtx p1 = aarch64_simd_vect_par_cnst_half (V4HFmode, > > + GET_MODE_NUNITS (V4HFmode), > > + false); > > +rtx lane = GEN_INT (ENDIAN_LANE_N (GET_MODE_NUNITS (SImode), INTVAL > > (operands[4]))); > > Please use the newly-introduced aarch64_endian_lane_rtx for this. > > GET_MODE_NUNITS (SImode) doesn't seem right though, since that's always 1. > Should it be using V4HFmode instead? > > Same for the other patterns. > > Thanks, > Richard
libgo patch committed: Add SH support
This libgo patch by John Paul Adrian Glaubitz adds SH support to libgo. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu, which I admit doesn't prove much. Committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 256437) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -9705a1f4c37ad2c099e9fe6cd587d22a2a2ab2c3 +d8a9f7433a9e8a81c992ad2908818d2e84f3698b The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: libgo/configure.ac === --- libgo/configure.ac (revision 256366) +++ libgo/configure.ac (working copy) @@ -208,10 +208,10 @@ AC_SUBST(USE_DEJAGNU) # supported by the gofrontend and all architectures supported by the # gc toolchain. # N.B. Keep in sync with gcc/testsuite/go.test/go-test.exp (go-set-goarch). -ALLGOARCH="386 alpha amd64 amd64p32 arm armbe arm64 arm64be ia64 m68k mips mipsle mips64 mips64le mips64p32 mips64p32le ppc ppc64 ppc64le s390 s390x sparc sparc64" +ALLGOARCH="386 alpha amd64 amd64p32 arm armbe arm64 arm64be ia64 m68k mips mipsle mips64 mips64le mips64p32 mips64p32le ppc ppc64 ppc64le s390 s390x sh shbe sparc sparc64" # All known GOARCH_FAMILY values. -ALLGOARCHFAMILY="I386 ALPHA AMD64 ARM ARM64 IA64 M68K MIPS MIPS64 PPC PPC64 S390 S390X SPARC SPARC64" +ALLGOARCHFAMILY="I386 ALPHA AMD64 ARM ARM64 IA64 M68K MIPS MIPS64 PPC PPC64 S390 S390X SH SPARC SPARC64" GOARCH=unknown GOARCH_FAMILY=unknown @@ -366,6 +366,36 @@ GOARCH_MINFRAMESIZE=8 GOARCH_CACHELINESIZE=256 GOARCH_PCQUANTUM=2 ;; + sh3eb*-*-*) +GOARCH=shbe +GOARCH_FAMILY=SH +GOARCH_BIGENDIAN=true +GOARCH_CACHELINESIZE=16 +GOARCH_PCQUANTUM=2 +GOARCH_MINFRAMESIZE=4 +;; + sh3*-*-*) +GOARCH=sh +GOARCH_FAMILY=SH +GOARCH_CACHELINESIZE=16 +GOARCH_PCQUANTUM=2 +GOARCH_MINFRAMESIZE=4 +;; + sh4eb*-*-*) +GOARCH=shbe +GOARCH_FAMILY=SH +GOARCH_BIGENDIAN=true +GOARCH_CACHELINESIZE=32 +GOARCH_PCQUANTUM=2 +GOARCH_MINFRAMESIZE=4 +;; + sh4*-*-*) +GOARCH=sh +GOARCH_FAMILY=SH +GOARCH_CACHELINESIZE=32 +GOARCH_PCQUANTUM=2 +GOARCH_MINFRAMESIZE=4 +;; sparc*-*-*) AC_COMPILE_IFELSE([ #if defined(__sparcv9) || defined(__arch64__) Index: libgo/go/cmd/cgo/main.go === --- libgo/go/cmd/cgo/main.go(revision 256366) +++ libgo/go/cmd/cgo/main.go(working copy) @@ -170,6 +170,8 @@ var ptrSizeMap = map[string]int64{ "ppc64le": 8, "s390":4, "s390x": 8, + "sh": 4, + "shbe":4, "sparc": 4, "sparc64": 8, } @@ -192,6 +194,8 @@ var intSizeMap = map[string]int64{ "ppc64le": 8, "s390":4, "s390x": 8, + "sh": 4, + "shbe":4, "sparc": 4, "sparc64": 8, } Index: libgo/go/go/build/syslist.go === --- libgo/go/go/build/syslist.go(revision 256366) +++ libgo/go/go/build/syslist.go(working copy) @@ -5,4 +5,4 @@ package build const goosList = "aix android darwin dragonfly freebsd linux nacl netbsd openbsd plan9 solaris windows zos " -const goarchList = "386 amd64 amd64p32 arm armbe arm64 arm64be alpha m68k ppc64 ppc64le mips mipsle mips64 mips64le mips64p32 mips64p32le ppc s390 s390x sparc sparc64 " +const goarchList = "386 amd64 amd64p32 arm armbe arm64 arm64be alpha m68k ppc64 ppc64le mips mipsle mips64 mips64le mips64p32 mips64p32le ppc s390 s390x sh shbe sparc sparc64" Index: libgo/go/runtime/hash32.go === --- libgo/go/runtime/hash32.go (revision 256366) +++ libgo/go/runtime/hash32.go (working copy) @@ -6,7 +6,7 @@ // xxhash: https://code.google.com/p/xxhash/ // cityhash: https://code.google.com/p/cityhash/ -// +build 386 arm armbe m68k mips mipsle ppc s390 sparc +// +build 386 arm armbe m68k mips mipsle ppc s390 sh shbe sparc package runtime Index: libgo/go/runtime/lfstack_32bit.go === --- libgo/go/runtime/lfstack_32bit.go (revision 256366) +++ libgo/go/runtime/lfstack_32bit.go (working copy) @@ -2,7 +2,7 @@ // Use of this source code is governed by a BSD-style // license that can be found in the LICENSE file. -// +build 386 arm nacl armbe m68k mips mipsle mips64p32 mips64p32le ppc s390 sparc +// +build 386 arm nacl armbe m68k mips mipsle mips64p32 mips64p32le ppc s390 sh shbe sparc package runtime Index: libgo/go/runtime/unaligned2.go === --- libgo/go/runtime/unaligned2.go (revision 256366) +++ libgo/go/run
Re: [1/4] [AArch64] SVE backend support
On Fri, Jan 05, 2018 at 11:41:25AM +, Richard Sandiford wrote: > Here's the patch updated to apply on top of the v8.4 and > __builtin_load_no_speculate support. It also handles the new > vec_perm_indices and CONST_VECTOR encoding and uses VNx... names > for the SVE modes. > > Richard Sandiford writes: > > This patch adds support for ARM's Scalable Vector Extension. > > The patch just contains the core features that work with the > > current vectoriser framework; later patches will add extra > > capabilities to both the target-independent code and AArch64 code. > > The patch doesn't include: > > > > - support for unwinding frames whose size depends on the vector length > > - modelling the effect of __tls_get_addr on the SVE registers > > > > These are handled by later patches instead. > > > > Some notes: > > > > - The copyright years for aarch64-sve.md start at 2009 because some of > > the code is based on aarch64.md, which also starts from then. > > > > - The patch inserts spaces between items in the AArch64 section > > of sourcebuild.texi. This matches at least the surrounding > > architectures and looks a little nicer in the info output. > > > > - aarch64-sve.md includes a pattern: > > > > while_ult > > > > A later patch adds a matching "while_ult" optab, but the pattern > > is also needed by the predicate vec_duplicate expander. I'm keen to take this. The code is good quality overall, I'm confident in your reputation and implementation. There are some parts of the design that I'm less happy about, but pragmatically, we should take this now to get the behaviour correct, and look to optimise, refactor, and clean-up in future. Sorry it took me a long time to get to the review. I've got no meaningful design concerns here, and certainly nothing so critical that we couldn't fix it after the fact in GCC 9 and up. That said... > (aarch64_sve_cnt_immediate_p, aarch64_sve_addvl_addpl_immediate_p) I'm not a big fan of these sorts of functions which return a char* where we've dumped the text we want to print out in the short term. The interface (fill a static char[] we can then leak on return) is pretty ugly. One consideration for future work would be refactoring out aarch64.c - it is getting to be too big for my liking (near 18,000 lines). > (aarch64_expand_sve_mem_move) Do we have a good description of how SVE big-endian vectors work, The sort of comment you write later ("see the comment at the head of aarch64-sve.md for details") would also be useful here as a reference. > aarch64_get_reg_raw_mode Do we assert/warn anywhere for users of __builtin_apply that they are fundamentally unsupported? > offset_4bit_signed_scaled_p So much code duplication here and in similair functions. Would a single interface (unsigned bits, bool signed, bool scaled) let you avoid the many identical functions? > aarch64_evpc_rev_local I'm likely missing something obvious, but what is the distinction you're drawing between global and local? Could you comment it? > aarch64-sve.md - scheduling types None of the instructions here have types for scheduling. That's going to make for a future headache. Adding them to the existing scheduling types is going to cause all sorts of trouble when building GCC (we already have too many types for some compilers to handle the structure!). We'll need to finds a solution to how we'll direct scheduling for SVE. > aarch64-sve.md - predicated operands It is a shame this ends up being so ugly and requiring UNSPEC_MERGE_PTRUE everywhere. That will block a lot of useful optimisation. Otherwise, this is OK for trunk. I'm happy to take it as is, and have the above suggestions applied as follow-ups if you think they are worth doing. Thanks, James
Re: [PR 81616] Deferring FMA transformations in tight loops
On 01/10/2018 11:46 AM, Martin Jambor wrote: > Hello, > > I would really like to ping the FMA transformation prevention patch that > I sent here in December, which, after incorporating a suggestion from > Richi, re-base and re-testing, I re-post below. I really think that it > should make into gcc 8 in some form, because the performance wins are > really big. > > I am still opened to all sorts of comments, of course, especially to > suggestions about the form of the param controlling this behavior (or > how to communicate that we want to do this on Zen in general). It might > even be a binary value since not forming FMAs does not seem to harm > bigger vectors either (as far as we know, I should add). > > For the record, I have not yet received any information from AMD why > FMAs on 256bit vectors do not have this problem, I assume all people > that could give an authoritative answer are now looking into covert > channel mitigation stuff. But very probably it is just that the > internal split FMAs can be scheduled so that while one is still waiting > for its addend, another can already execute. Both are likely true... Hoping Richi will look at the patch, but it's certainly in my queue of things to look at if he doesn't get to it. jeff
PR83775
Hi, The attached patch tries to fix PR83775. Validation in progress. OK to commit if passes ? Thanks, Prathamesh 2018-01-11 Prathamesh Kulkarni PR target/83775 * config/arm/arm.c (arm_declare_function_name): Set arch_to_print if targ_options->x_arm_arch_string is non NULL. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 196aa6de1ac..868251a154c 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -30954,7 +30954,10 @@ arm_declare_function_name (FILE *stream, const char *name, tree decl) /* Only update the assembler .arch string if it is distinct from the last such string we printed. */ - std::string arch_to_print = targ_options->x_arm_arch_string; + std::string arch_to_print; + if (targ_options->x_arm_arch_string) +arch_to_print = targ_options->x_arm_arch_string; + if (arch_to_print != arm_last_printed_arch_string) { std::string arch_name
Re: [PR 81616] Deferring FMA transformations in tight loops
Hello, I would really like to ping the FMA transformation prevention patch that I sent here in December, which, after incorporating a suggestion from Richi, re-base and re-testing, I re-post below. I really think that it should make into gcc 8 in some form, because the performance wins are really big. I am still opened to all sorts of comments, of course, especially to suggestions about the form of the param controlling this behavior (or how to communicate that we want to do this on Zen in general). It might even be a binary value since not forming FMAs does not seem to harm bigger vectors either (as far as we know, I should add). For the record, I have not yet received any information from AMD why FMAs on 256bit vectors do not have this problem, I assume all people that could give an authoritative answer are now looking into covert channel mitigation stuff. But very probably it is just that the internal split FMAs can be scheduled so that while one is still waiting for its addend, another can already execute. Thanks, Martin On Fri, Dec 15 2017, Martin Jambor wrote: > Hello, > > the patch below prevents creation if fused-multiply-and-add instructions > in the widening_mul gimple pass on the Zen-based AMD CPUs and as a > result fixes regressions of native znver1 tuning when compared to > generic tuning in: > > - the matrix.c testcase of PR 81616 (straightforward matrix > multiplication) at -O2 and -O3 which is currently 60% (!), > > - SPEC 2006 454.calculix at -O2, which is currently over 20%, and > > - SPEC 2017 510.parest at -O2 and -Ofast, which is currently also > about 20% in both cases. > > The basic idea is to detect loops in the following form: > > > # accumulator_111 = PHI <0.0(5), accumulator_66(6)> > ... > _65 = _14 * _16; > accumulator_66 = _65 + accumulator_111; > > and prevents from creating FMA for it. Because at least in the parest > and calculix cases it has to, it also deals with more than one chain of > FMA candidates that feed the next one's addend: > > > > # accumulator_111 = PHI <0.0(5), accumulator_66(6)> > ... > _65 = _14 * _16; > accumulator_55 = _65 + accumulator_111; > _65 = _24 * _36; > accumulator_66 = _65 + accumulator_55; > > Unfortunately, to really get rid of the calculix regression, the > algorithm cannot just look at one BB at a time but also has to work for > cases like the following: > > 1 void mult(void) > 2 { > 3 int i, j, k, l; > 4 > 5 for(i=0; i 6 { > 7for(j=0; j 8{ > 9 for(l=0; l 10 { > 11 c[i][j] += a[i][l] * b[k][l]; > 12 for(k=1; k<10; ++k) > 13 { > 14 c[i][j] += a[i][l+k] * b[k][l+k]; > 15 } > 16 > 17 } > 18} > 19 } > 20 } > > where the FMA on line 14 feeds into the one on line 11 in an > encompassing loop. Therefore I have changed the structure of the pass > to work in reverse dominance order and it keeps a hash set of results of > rejected FMAs candidates which it checks when looking at PHI nodes of > the current BB. Without this reorganization, calculix was still 8% > slower with native tuning than with generic one. > > When the deferring mechanism realizes that in the current BB, the FMA > candidates do not all form a one chain tight-loop like in the examples > above, it goes back to all the previously deferred candidates (in the > current BB only) and performs the transformation. > > The main reason is to keep the patch conservative (and also simple), but > it also means that the following function is not affected and is still > 20% slower when compiled with native march and tuning compared to the > generic one: > > 1 void mult(struct s *p1, struct s *p2) > 2 { > 3 int i, j, k; > 4 > 5 for(i=0; i 6 { > 7for(j=0; j 8{ > 9 for(k=0; k 10 { > 11 p1->c[i][j] += p1->a[i][k] * p1->b[k][j]; > 12 p2->c[i][j] += p2->a[i][k] * p2->b[k][j]; > 13 } > 14} > 15 } > 16 } > > I suppose that the best optimization for the above would be to split the > loops, but one could probably construct at least an artificial testcase > where the FMAs would keep enough locality that it is not the case. The > mechanism can be easily extended to keep track of not just one chain but > a few, preferably as a followup, if people think it makes sense. > > An interesting observation is that the matrix multiplication does not > suffer the penalty when compiled with -O3 -mprefer-vector-width=256. > Apparently the 256 vector processing can hide the latency penalty when > internally it is split into two halves. The same goes for 512 bit > vectors. That is why the patch leaves those be - well, there is a param > fo
PR81703 and Martin's fix for PR83501
Hi, I have attached patch for PR81703 rebased on Martin's fix for PR83501 posted here since both had considerable overlaps: https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00180.html The patch passes bootstrap+test on x86_64-unknown-linux-gnu and cross-tested on aarch64-*-*. Currently it fails to pass validation on arm targets because of PR83775. Does it look OK? Thanks, Prathamesh 2018-10-01 Martin Sebor Prathamesh Kulkarni PR tree-optimization/83501 PR tree-optimization/81703 * tree-ssa-strlen.c (get_string_cst): Rename... (get_string_len): ...to this. Handle global constants. (handle_char_store): Adjust. testsuite/ * gcc.dg/strlenopt-39.c: New test-case. * gcc.dg/pr81703.c: Likewise. diff --git a/gcc/testsuite/gcc.dg/pr81703.c b/gcc/testsuite/gcc.dg/pr81703.c new file mode 100644 index 000..190f4a833dd --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr81703.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-strlen" } */ + +unsigned g (void) +{ + char d[8]; + const char s[] = "0123"; + __builtin_memcpy (d, s, sizeof s); + return __builtin_strlen (d); +} + +/* { dg-final { scan-tree-dump-not "__builtin_strlen" "strlen" } } */ diff --git a/gcc/testsuite/gcc.dg/strlenopt-39.c b/gcc/testsuite/gcc.dg/strlenopt-39.c new file mode 100644 index 000..a4177c918ad --- /dev/null +++ b/gcc/testsuite/gcc.dg/strlenopt-39.c @@ -0,0 +1,66 @@ +/* PR tree-optimization/83444 + { dg-do compile } + { dg-options "-O2 -fdump-tree-optimized" } */ + +#include "strlenopt.h" + +#define STR "1234567" + +const char str[] = STR; + +char dst[10]; + +void copy_from_global_str (void) +{ + strcpy (dst, str); + + if (strlen (dst) != sizeof str - 1) +abort (); +} + +void copy_from_local_str (void) +{ + const char s[] = STR; + + strcpy (dst, s); + + if (strlen (dst) != sizeof s - 1) +abort (); +} + +void copy_from_local_memstr (void) +{ + struct { +char s[sizeof STR]; + } x = { STR }; + + strcpy (dst, x.s); + + if (strlen (dst) != sizeof x.s - 1) +abort (); +} + +void copy_to_local_str (void) +{ + char d[sizeof STR]; + + strcpy (d, str); + + if (strlen (d) != sizeof str - 1) +abort (); +} + +void copy_to_local_memstr (void) +{ + struct { +char d[sizeof STR]; + } x; + + strcpy (x.d, str); + + if (strlen (x.d) != sizeof str- 1) +abort (); +} + +/* Verify that all calls to strlen have been eliminated. + { dg-final { scan-tree-dump-not "(abort|strlen) \\(\\)" "optimized" } } */ diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c index aae242d93d6..4e363278ea2 100644 --- a/gcc/tree-ssa-strlen.c +++ b/gcc/tree-ssa-strlen.c @@ -2773,18 +2773,40 @@ handle_pointer_plus (gimple_stmt_iterator *gsi) } /* Check if RHS is string_cst possibly wrapped by mem_ref. */ -static tree -get_string_cst (tree rhs) +static int +get_string_len (tree rhs) { if (TREE_CODE (rhs) == MEM_REF && integer_zerop (TREE_OPERAND (rhs, 1))) { - rhs = TREE_OPERAND (rhs, 0); + tree rhs_addr = rhs = TREE_OPERAND (rhs, 0); if (TREE_CODE (rhs) == ADDR_EXPR) - rhs = TREE_OPERAND (rhs, 0); + { + rhs = TREE_OPERAND (rhs, 0); + if (TREE_CODE (rhs) != STRING_CST) + { + int idx = get_stridx (rhs_addr); + if (idx > 0) + { + strinfo *si = get_strinfo (idx); + if (si && si->full_string_p) + return tree_to_shwi (si->nonzero_chars); + } + } + } } - return (TREE_CODE (rhs) == STRING_CST) ? rhs : NULL_TREE; + if (TREE_CODE (rhs) == VAR_DECL + && TREE_READONLY (rhs)) +rhs = DECL_INITIAL (rhs); + + if (rhs && TREE_CODE (rhs) == STRING_CST) +{ + unsigned HOST_WIDE_INT ilen = strlen (TREE_STRING_POINTER (rhs)); + return ilen <= INT_MAX ? ilen : -1; +} + + return -1; } /* Handle a single character store. */ @@ -2799,6 +2821,9 @@ handle_char_store (gimple_stmt_iterator *gsi) tree rhs = gimple_assign_rhs1 (stmt); unsigned HOST_WIDE_INT offset = 0; + /* Set to the length of the string being assigned if known. */ + int rhslen; + if (TREE_CODE (lhs) == MEM_REF && TREE_CODE (TREE_OPERAND (lhs, 0)) == SSA_NAME) { @@ -2942,19 +2967,18 @@ handle_char_store (gimple_stmt_iterator *gsi) } } else if (idx == 0 - && (rhs = get_string_cst (gimple_assign_rhs1 (stmt))) + && (rhslen = get_string_len (gimple_assign_rhs1 (stmt))) >= 0 && ssaname == NULL_TREE && TREE_CODE (TREE_TYPE (lhs)) == ARRAY_TYPE) { - size_t l = strlen (TREE_STRING_POINTER (rhs)); HOST_WIDE_INT a = int_size_in_bytes (TREE_TYPE (lhs)); - if (a > 0 && (unsigned HOST_WIDE_INT) a > l) + if (a > 0 && (unsigned HOST_WIDE_INT) a > (unsigned HOST_WIDE_INT) rhslen) { int idx = new_addr_stridx (lhs); if (idx != 0)
Re: [PATCH][AArch64] Use LDP/STP in shrinkwrapping
On Tue, Jan 09, 2018 at 09:13:23PM -0800, Andrew Pinski wrote: > On Tue, Jan 9, 2018 at 6:54 AM, Segher Boessenkool > wrote: > > On Tue, Jan 09, 2018 at 12:23:42PM +, Wilco Dijkstra wrote: > >> Segher Boessenkool wrote: > >> > On Mon, Jan 08, 2018 at 0:25:47PM +, Wilco Dijkstra wrote: > >> >> > Always pairing two registers together *also* degrades code quality. > >> >> > >> >> No, while it's not optimal, it means smaller code and fewer memory > >> >> accesses. > >> > > >> > It means you execute *more* memory accesses. Always. This may be > >> > sometimes hidden, sure. I'm not saying you do not want more ldp's; > >> > I'm saying this particular strategy is very far from ideal. > >> > >> No it means less since the number of memory accesses reduces (memory > >> bandwidth may increase but that's not an issue). > > > > The problem is *more* memory accesses are executed at runtime. Which is > > why separate shrink-wrapping does what it does: to have *fewer* executed. > > (It's not just the direct execution cost why that helps: more important > > are latencies to dependent ops, microarchitectural traps, etc.). > > On most micro-arch of AARCH64, having one LDP/STP will take just as > long as one LDR/STR as long as it is on the same cache line. > So having one LDP/STP compared to two LDR?STR is much better. LDP/STP > is considered one memory access really and that is where the confusion > is coming from. We are reducing the overall number of memory accesses > or keeping it the same on that path. > Hope this explanation allows you to understand why pairing does not > degrade the code quality but improves it overall. Of course I see that ldp is useful. I don't think that this particular way of forcing more pairs is a good idea. Needs testing / benchmarking / instrumentation, and we haven't seen any of that. Forcing pairs before separate shrink-wrapping reduces the effectiveness of the latter by a lot. Segher
Re: [wwwdocs,avr] Mention PR83738 in release notes
On 01/10/2018 07:48 AM, Georg-Johann Lay wrote: > This patch adds a bit more the the avr section of the v8 release notes. > > Ok? OK. jeff
Re: [PATCH] reduce runtime of gcc.dg/memcmp-1.c test
On 01/10/2018 10:46 AM, Aaron Sawdey wrote: > This brings it back not quite to where it was but a lot more reasonable > than what I put into 256351. > > 2018-01-10 Aaron Sawdey > > * gcc.dg/memcmp-1.c: Reduce runtime to something reasonable. > > OK for trunk? > > Thanks, >Aaron > > OK jeff
Re: [PATCH] Fix PR81968
On January 10, 2018 6:28:57 PM GMT+01:00, Jeff Law wrote: >On 01/10/2018 03:05 AM, Richard Biener wrote: >> >> This joint work rewrites LTO debug section copying to not leave >> discarded sections around as SHT_NULL but to really discard them >> and deal with the fallout (remapping all remaining section >references). >> This is to avoid diagnostics from the Solaris linker which doesn't >> like those. >> >> LTO bootstrapped on x86_64-unknown-linux-gnu, I also tested the >> "incredibly large # of sections" testcase to verify SHN_XINDEX >> handling. A regular bootstrap & test run is currently in progress. >> >> Rainer, can you check this patch on Solaris? Maybe we can finally >> close that PR ... >> >> Ok for trunk? >> >> Thanks, >> Richard. >> >> 2017-01-10 Richard Biener >> Rainer Orth >> >> PR lto/81968 >> libiberty/ >> * simple-object-common.h (struct simple_object_functions): >> Change copy_lto_debug_sections callback signature. >> * simple-object-elf.c (SHN_HIRESERVE, SHT_SYMTAB_SHNDX, >> SHF_INFO_LINK): Add defines. >> (simple_object_elf_copy_lto_debug_sections): Instead of >> leaving not to be copied sections empty unnamed SHT_NULL >> remove them from the target section headers and adjust section >> reference everywhere. Handle SHN_XINDEX in the symbol table >> processing properly. >> * simple-object.c (handle_lto_debug_sections): Change >> interface to return a modified string and handle renaming >> of relocation sections. >Note there's also 82005 which affects Darwin. It might be worth >reaching out to the Darwin folks and see if this helps them as well. Darwin doesn't use ELF and thus is not affected by this patch. Richard. > >jeff
Re: Go patch committed: Update to Go1.10beta1
On Wed, Jan 10, 2018 at 7:43 AM, Rainer Orth wrote: > Hi Ian, > >> On Wed, Jan 10, 2018 at 3:44 AM, Rainer Orth >> wrote: >>> >>> thanks. Testing has now concluded as well. x86 results are good (no >>> regressions except for cmd/internal/buildid which fails on Linux, too), >>> as are 64-bit sparc results. >> >> The cmd/internal/buildid test does pass on my system. What are you seeing? > > the log shows > > --- FAIL: TestReadFile (0.01s) > buildid_test.go:40: ReadFile(testdata/p.a) = "", open testdata/p.a: > no such file or directory, want > "abcdefghijklmnopqrstuvwxyz.1234567890123456789012345678901234567890123456789012345678901234", > nil > buildid_test.go:47: ReadFile(testdata/p.a) [readSize=2k] = "", open > testdata/p.a: no such file or directory, want > "abcdefghijklmnopqrstuvwxyz.1234567890123456789012345678901234567890123456789012345678901234", > nil > buildid_test.go:52: open testdata/p.a: no such file or directory > > libgo/go/cmd/internal/buildid/testdata/p.a is just missing. Ah. Turns out it was being ignored by the default global-ignores setting in the Subversion configuration, which ignores all .a files. I found a couple of other .a files that were missing in the gofrontend git repo (dropped by merge.sh) and copied them over. Then I committed all the missing .a files to the gcc Subversion repository (there were five missing files; the others were not causing a test failure). Ian
Re: [PR/middle-end 81897] make tree-ssa-uninit.c handle longer sequences
>> @@ -671,11 +668,9 @@ convert_control_dep_chain_into_preds (vec >> *dep_chains, >> e = one_cd_chain[j]; >> guard_bb = e->src; >> gsi = gsi_last_bb (guard_bb); >> + /* Ignore empty BBs as they're basically forwarder blocks. */ >> if (gsi_end_p (gsi)) >> - { >> - has_valid_pred = false; >> - break; >> - } >> + continue; >> cond_stmt = gsi_stmt (gsi); >> if (is_gimple_call (cond_stmt) && EDGE_COUNT (e->src->succs) >= 2) >> /* Ignore EH edge. Can add assertion on the other edge's flag. */ > ISTM that you want to use empty_block_p (bb) && single_succ_p (bb) to > detect the forwarder block. Otherwise ISTM that labels and debug > statements would affect the uninit analysis. We still need to check for gsi_end_p() because guard_bb can have no statements but be considered non empty according to empty_block_p(). This is the case with a seemingly empty basic block that actually has an incoming PHI. Jakub suggested the following patch which fixes the new ICE in the PR. I've adjusted the comments accordingly. OK? gcc/ PR middle-end/81897 * tree-ssa-uninit.c (convert_control_dep_chain_into_preds): Skip empty blocks. diff --git a/gcc/testsuite/gcc.dg/uninit-pr81897-2.c b/gcc/testsuite/gcc.dg/uninit-pr81897-2.c new file mode 100644 index 000..3960af454e5 --- /dev/null +++ b/gcc/testsuite/gcc.dg/uninit-pr81897-2.c @@ -0,0 +1,35 @@ +/* { dg-do compile } */ +/* { dg-options "-O1 -fno-tree-ccp -Wmaybe-uninitialized" } */ + +int oo; + +void +pc (int *tt) +{ + int cf = 0; + + if (*tt != 0) +{ + if (0) +{ + int *qg; + int uj = 0; + + t6: + tt = &cf; + if (oo != 0) +{ + ++uj; /* { dg-warning "may be used uninit" } */ + *qg = !!oo && !!uj; /* { dg-warning "may be used uninit" } */ +} +} + cf = 0; + goto t6; +} + + if (oo != 0) +{ + *tt = 1; + goto t6; +} +} diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c index 38239476286..8ccbc85970a 100644 --- a/gcc/tree-ssa-uninit.c +++ b/gcc/tree-ssa-uninit.c @@ -669,9 +669,16 @@ convert_control_dep_chain_into_preds (vec *dep_chains, e = one_cd_chain[j]; guard_bb = e->src; gsi = gsi_last_bb (guard_bb); - /* Ignore empty BBs as they're basically forwarder blocks. */ + /* Ignore empty forwarder blocks. */ if (empty_block_p (guard_bb) && single_succ_p (guard_bb)) continue; + /* An empty basic block here is likely a PHI, and is not one + of the cases we handle below. */ + if (gsi_end_p (gsi)) + { + has_valid_pred = false; + break; + } cond_stmt = gsi_stmt (gsi); if (is_gimple_call (cond_stmt) && EDGE_COUNT (e->src->succs) >= 2) /* Ignore EH edge. Can add assertion on the other edge's flag. */
[PATCH] reduce runtime of gcc.dg/memcmp-1.c test
This brings it back not quite to where it was but a lot more reasonable than what I put into 256351. 2018-01-10 Aaron Sawdey * gcc.dg/memcmp-1.c: Reduce runtime to something reasonable. OK for trunk? Thanks, Aaron -- Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com 050-2/C113 (507) 253-7520 home: 507/263-0782 IBM Linux Technology Center - PPC ToolchainIndex: /home/sawdey/src/gcc/trunk/trunk/gcc/testsuite/gcc.dg/memcmp-1.c === --- /home/sawdey/src/gcc/trunk/trunk/gcc/testsuite/gcc.dg/memcmp-1.c (revision 256437) +++ /home/sawdey/src/gcc/trunk/trunk/gcc/testsuite/gcc.dg/memcmp-1.c (working copy) @@ -12,8 +12,20 @@ int lib_strncmp(const char *a, const char *b, size_t n) asm("strncmp"); #ifndef NRAND +#ifdef TEST_ALL #define NRAND 1 +#else +#define NRAND 500 #endif +#endif +#ifndef TZONE +#ifdef TEST_ALL +#define TZONE 16 +#else +#define TZONE 8 +#endif +#endif + #define MAX_SZ 600 #define DEF_RS(ALIGN) \ @@ -33,9 +45,7 @@ b = four+i*ALIGN+j*(4096-2*i*ALIGN); \ memcpy(a,str1,sz); \ memcpy(b,str2,sz); \ - asm(" "); \ r = memcmp(a,b,sz); \ - asm(" "); \ if ( r < 0 && !(expect < 0) ) abort(); \ if ( r > 0 && !(expect > 0) ) abort(); \ if ( r == 0 && !(expect == 0) ) abort(); \ @@ -67,15 +77,13 @@ { for (a1=0; a1 < 2*sizeof(void *); a1++) { + a = three+i*a1+j*(4096-2*i*a1); + memcpy(a,str1,sz); for (a2=0; a2 < 2*sizeof(void *); a2++) { - a = three+i*a1+j*(4096-2*i*a1); b = four+i*a2+j*(4096-2*i*a2); - memcpy(a,str1,sz); memcpy(b,str2,sz); - asm(" "); r = memcmp(a,b,sz); - asm(" "); if ( r < 0 && !(expect < 0) ) abort(); if ( r > 0 && !(expect > 0) ) abort(); if ( r == 0 && !(expect == 0) ) abort(); @@ -89,7 +97,7 @@ void (test_strncmp)(const char *, const char *, int), size_t sz, int align) { - char buf1[MAX_SZ*2+10],buf2[MAX_SZ*2+10]; + char buf1[MAX_SZ*2+TZONE],buf2[MAX_SZ*2+TZONE]; size_t test_sz = (sz10)?(test_sz-10):0); diff_pos < test_sz+10; diff_pos++) -for(zero_pos = ((test_sz>10)?(test_sz-10):0); zero_pos < test_sz+10; zero_pos++) + for(diff_pos = ((test_sz>TZONE)?(test_sz-TZONE):0); diff_pos < test_sz+TZONE; diff_pos++) +for(zero_pos = ((test_sz>TZONE)?(test_sz-TZONE):0); zero_pos < test_sz+TZONE; zero_pos++) { memset(buf1, 'A', 2*test_sz); memset(buf2, 'A', 2*test_sz); @@ -125,7 +133,6 @@ (*test_memcmp)(buf2,buf2,0); test_memcmp_runtime_size (buf1, buf2, sz, e); test_memcmp_runtime_size (buf2, buf1, sz, -e); - test_memcmp_runtime_size (buf2, buf2, sz, 0); e = lib_strncmp(buf1,buf2,sz); (*test_strncmp)(buf1,buf2,e); (*test_strncmp)(buf2,buf1,-e); @@ -470,10 +477,8 @@ DEF_TEST(9,1) DEF_TEST(16,1) DEF_TEST(32,1) -DEF_TEST(100,1) -DEF_TEST(100,8) -DEF_TEST(180,1) -DEF_TEST(180,8) +DEF_TEST(33,8) +DEF_TEST(49,1) #endif int @@ -753,9 +758,7 @@ RUN_TEST(9,1) RUN_TEST(16,1) RUN_TEST(32,1) -RUN_TEST(100,1) -RUN_TEST(100,8) -RUN_TEST(180,1) -RUN_TEST(180,8) +RUN_TEST(33,8) +RUN_TEST(49,1) #endif }
Re: [PATCH] Fix PR81968
On 01/10/2018 03:05 AM, Richard Biener wrote: > > This joint work rewrites LTO debug section copying to not leave > discarded sections around as SHT_NULL but to really discard them > and deal with the fallout (remapping all remaining section references). > This is to avoid diagnostics from the Solaris linker which doesn't > like those. > > LTO bootstrapped on x86_64-unknown-linux-gnu, I also tested the > "incredibly large # of sections" testcase to verify SHN_XINDEX > handling. A regular bootstrap & test run is currently in progress. > > Rainer, can you check this patch on Solaris? Maybe we can finally > close that PR ... > > Ok for trunk? > > Thanks, > Richard. > > 2017-01-10 Richard Biener > Rainer Orth > > PR lto/81968 > libiberty/ > * simple-object-common.h (struct simple_object_functions): > Change copy_lto_debug_sections callback signature. > * simple-object-elf.c (SHN_HIRESERVE, SHT_SYMTAB_SHNDX, > SHF_INFO_LINK): Add defines. > (simple_object_elf_copy_lto_debug_sections): Instead of > leaving not to be copied sections empty unnamed SHT_NULL > remove them from the target section headers and adjust section > reference everywhere. Handle SHN_XINDEX in the symbol table > processing properly. > * simple-object.c (handle_lto_debug_sections): Change > interface to return a modified string and handle renaming > of relocation sections. Note there's also 82005 which affects Darwin. It might be worth reaching out to the Darwin folks and see if this helps them as well. jeff
Re: [C++ PATCH] Fix some -Wused-but-set-variable regressions (PR c++/82728, PR c++/82799, PR c++/83690)
On Wed, Jan 10, 2018 at 11:56 AM, Jakub Jelinek wrote: > On Wed, Jan 10, 2018 at 11:52:16AM -0500, Jason Merrill wrote: >> On Fri, Jan 5, 2018 at 5:14 PM, Jakub Jelinek wrote: >> > Jason's recent change removed a mark_rvalue_use call from constant_value_1, >> > which unfortunately regressed quite a few cases where >> > -Wunused-but-set-variable now has false positives. >> >> > The easiest fix seems to be just deal with the -Wunused-but-set-variable >> > issue at that point. >> >> Hmm, we ought to have called mark_rvalue_use before we get here. I'm >> concerned that these issues indicate that lambda captures won't work >> in the situations in the testcase, since we rely on mark_rvalue_use to >> look through them. > > Unless you have ideas where to put those mark_rvalue_use calls, I'll defer > these PRs to you then, this was just an attempt for an easy way out of it > for the warning. At least the testcases should be usable for future patch. Makes sense, thanks. Jason
Re: [C++ Patch] PR 78344 ("ICE on invalid c++ code (tree check: expected tree_list, have error_mark in cp_check_const_attributes, at cp/decl2.c:1347")
On Wed, Jan 10, 2018 at 11:33 AM, Paolo Carlini wrote: > Hi, > > On 10/01/2018 16:32, Jason Merrill wrote: >> >> On Fri, Dec 22, 2017 at 7:34 PM, Paolo Carlini >> wrote: >>> >>> in this error recovery issue cp_check_const_attributes and more generally >>> cplus_decl_attributes have lots of troubles handling the error_mark_node >>> returned by cp_parser_std_attribute_spec_seq, as called by >>> cp_parser_direct_declarator. I fiddled quite a bit with these parsing >>> facilities to eventually notice that boldly changing >>> cp_parser_std_attribute_spec_seq to return NULL_TREE instead of >>> error_mark_node when cp_parser_std_attribute_spec returns error_mark_node >>> in >>> the loop cures the bug at issue as a special case. >> >> Hmm, I'm skeptical. In general, we want to use error_mark_node to >> distinguish between something not being there and being there but >> wrong. > > I see what you mean. But consider that we already emitted diagnostic anyway, I'm not sure we can rely on that; cp_parser_error doesn't emit anything when parsing tentatively. > and, important detail, isn't that we are dropping some correct attributes, > we are dropping all of them anyway with error_mark_node or NULL_TREE the > same way. It's just that with NULL_TREE we are behaving during error > recovery as if the attributes weren't there in the first place. I'm not sure > if this was 100% clear... Would you like to have some additional details? It's clear, I'm just not convinced it's what we want. Jason
Re: [C++ PATCH] -Wclass-memaccess fixes and improvements (PR c++/81327)
On Wed, Jan 10, 2018 at 11:25 AM, Jakub Jelinek wrote: > On Wed, Jan 10, 2018 at 10:46:06AM -0500, Jason Merrill wrote: >> >> This patch moves the warning tiny bit earlier (from build_cxx_call to the >> >> caller) where we still have information about the original parsed >> >> arguments before conversions to the builtin argument types. >> >> OK. > > Is that an ack for the whole patch and docs can be handled incrementally > (e.g. if Martin volunteers), or shall that be documented first? > Though, even before this patch memcpy ((char *) whatever, ...) > already disabled the warning and wasn't documented either. The docs can follow on. Jason
Re: [C++ PATCH] Fix some -Wused-but-set-variable regressions (PR c++/82728, PR c++/82799, PR c++/83690)
On Wed, Jan 10, 2018 at 11:52:16AM -0500, Jason Merrill wrote: > On Fri, Jan 5, 2018 at 5:14 PM, Jakub Jelinek wrote: > > Jason's recent change removed a mark_rvalue_use call from constant_value_1, > > which unfortunately regressed quite a few cases where > > -Wunused-but-set-variable now has false positives. > > > The easiest fix seems to be just deal with the -Wunused-but-set-variable > > issue at that point. > > Hmm, we ought to have called mark_rvalue_use before we get here. I'm > concerned that these issues indicate that lambda captures won't work > in the situations in the testcase, since we rely on mark_rvalue_use to > look through them. Unless you have ideas where to put those mark_rvalue_use calls, I'll defer these PRs to you then, this was just an attempt for an easy way out of it for the warning. At least the testcases should be usable for future patch. Jakub
Re: C++ PATCH to fix bogus warning with a non-type argument (PR c++/82541)
OK. On Tue, Jan 2, 2018 at 9:51 AM, Marek Polacek wrote: > This PR complains about a bogus -Wduplicated-branches warning with a non-type > template argument. That can be easily fixed with a new sentinel. I also > noticed a missing tf_warning warning check, so I added it for good measure. > > Bootstrapped/regtested on x86_64-linux, ok for trunk? > > 2018-01-02 Marek Polacek > > PR c++/82541 > * call.c (build_conditional_expr_1): Check complain before warning. > * pt.c (tsubst_copy_and_build) : Suppress > -Wduplicated-branches. > > * g++.dg/warn/Wduplicated-branches4.C: New test. > > diff --git gcc/cp/call.c gcc/cp/call.c > index bd7666d72bb..c23a733978e 100644 > --- gcc/cp/call.c > +++ gcc/cp/call.c > @@ -5343,6 +5343,7 @@ build_conditional_expr_1 (location_t loc, tree arg1, > tree arg2, tree arg3, >/* If the ARG2 and ARG3 are the same and don't have side-effects, > warn here, because the COND_EXPR will be turned into ARG2. */ >if (warn_duplicated_branches > + && (complain & tf_warning) >&& (arg2 == arg3 || operand_equal_p (arg2, arg3, 0))) > warning_at (EXPR_LOCATION (result), OPT_Wduplicated_branches, > "this condition has identical branches"); > diff --git gcc/cp/pt.c gcc/cp/pt.c > index a8144e85a39..2c216eaebbe 100644 > --- gcc/cp/pt.c > +++ gcc/cp/pt.c > @@ -17846,6 +17846,7 @@ tsubst_copy_and_build (tree t, > exp2 = RECUR (TREE_OPERAND (t, 2)); > } > > + warning_sentinel s(warn_duplicated_branches); > RETURN (build_x_conditional_expr (EXPR_LOCATION (t), > cond, exp1, exp2, complain)); >} > diff --git gcc/testsuite/g++.dg/warn/Wduplicated-branches4.C > gcc/testsuite/g++.dg/warn/Wduplicated-branches4.C > index e69de29bb2d..7963c9e8ab5 100644 > --- gcc/testsuite/g++.dg/warn/Wduplicated-branches4.C > +++ gcc/testsuite/g++.dg/warn/Wduplicated-branches4.C > @@ -0,0 +1,16 @@ > +// PR c++/82541 > +// { dg-do compile } > +// { dg-options "-Wduplicated-branches" } > + > +template > +struct AR > +{ > +char a1[N > 0 ? N : 1]; // { dg-bogus "this condition has identical > branches" } > +char a2[N > 0 ? 1 : 1]; // { dg-warning "this condition has identical > branches" } > +}; > + > +int > +main () > +{ > +AR<1> w; > +} > > Marek
Re: [C++ PATCH] Fix some -Wused-but-set-variable regressions (PR c++/82728, PR c++/82799, PR c++/83690)
On Fri, Jan 5, 2018 at 5:14 PM, Jakub Jelinek wrote: > Jason's recent change removed a mark_rvalue_use call from constant_value_1, > which unfortunately regressed quite a few cases where > -Wunused-but-set-variable now has false positives. > The easiest fix seems to be just deal with the -Wunused-but-set-variable > issue at that point. Hmm, we ought to have called mark_rvalue_use before we get here. I'm concerned that these issues indicate that lambda captures won't work in the situations in the testcase, since we rely on mark_rvalue_use to look through them. [checks] And indeed the first testcase breaks if the switch is wrapped in a lambda that can capture i. void foo () { const int i = 1; // { dg-bogus "set but not used" } [=]() { switch (0) { case i: break; } }; } Jason
Re: [C++ Patch] PR 78344 ("ICE on invalid c++ code (tree check: expected tree_list, have error_mark in cp_check_const_attributes, at cp/decl2.c:1347")
Hi, On 10/01/2018 16:32, Jason Merrill wrote: On Fri, Dec 22, 2017 at 7:34 PM, Paolo Carlini wrote: in this error recovery issue cp_check_const_attributes and more generally cplus_decl_attributes have lots of troubles handling the error_mark_node returned by cp_parser_std_attribute_spec_seq, as called by cp_parser_direct_declarator. I fiddled quite a bit with these parsing facilities to eventually notice that boldly changing cp_parser_std_attribute_spec_seq to return NULL_TREE instead of error_mark_node when cp_parser_std_attribute_spec returns error_mark_node in the loop cures the bug at issue as a special case. Hmm, I'm skeptical. In general, we want to use error_mark_node to distinguish between something not being there and being there but wrong. I see what you mean. But consider that we already emitted diagnostic anyway, and, important detail, isn't that we are dropping some correct attributes, we are dropping all of them anyway with error_mark_node or NULL_TREE the same way. It's just that with NULL_TREE we are behaving during error recovery as if the attributes weren't there in the first place. I'm not sure if this was 100% clear... Would you like to have some additional details? Thanks! Paolo.
Re: [C++ PATCH] -Wclass-memaccess fixes and improvements (PR c++/81327)
On Wed, Jan 10, 2018 at 10:46:06AM -0500, Jason Merrill wrote: > >> This patch moves the warning tiny bit earlier (from build_cxx_call to the > >> caller) where we still have information about the original parsed > >> arguments before conversions to the builtin argument types. > > OK. Is that an ack for the whole patch and docs can be handled incrementally (e.g. if Martin volunteers), or shall that be documented first? Though, even before this patch memcpy ((char *) whatever, ...) already disabled the warning and wasn't documented either. Jakub
Re: [PATCH PR82096] Fix ICE in int_mode_for_mode, at stor-layout.c:403 with arm-linux-gnueabi
Hi Jeff On 10/01/18 10:44, Sudakshina Das wrote: Hi Jeff On 09/01/18 23:43, Jeff Law wrote: On 01/05/2018 12:25 PM, Sudakshina Das wrote: Hi Jeff On 05/01/18 18:44, Jeff Law wrote: On 01/04/2018 08:35 AM, Sudakshina Das wrote: Hi The bug reported a particular test di-longlong64-sync-1.c failing when run on arm-linux-gnueabi with options -mthumb -march=armv5t -O[g,1,2,3] and -mthumb -march=armv6 -O[g,1,2,3]. According to what I could see, the crash was caused because of the explicit VOIDmode argument that was sent to emit_store_flag_force (). Since the comparing argument was a long long, it was being forced into a VOID type register before the comparison (in prepare_cmp_insn()) is done. As pointed out by Kyrill, there is a comment on emit_store_flag() which says "MODE is the mode to use for OP0 and OP1 should they be CONST_INTs. If it is VOIDmode, they cannot both be CONST_INT". This condition is not true in this case and thus I think it is suitable to change the argument. Testing done: Checked for regressions on bootstrapped arm-none-linux-gnueabi and arm-none-linux-gnueabihf and added new test cases. Sudi ChangeLog entries: *** gcc/ChangeLog *** 2017-01-04 Sudakshina Das PR target/82096 * optabs.c (expand_atomic_compare_and_swap): Change argument to emit_store_flag_force. *** gcc/testsuite/ChangeLog *** 2017-01-04 Sudakshina Das PR target/82096 * gcc.c-torture/compile/pr82096-1.c: New test. * gcc.c-torture/compile/pr82096-2.c: Likwise. In the case where both (op0/op1) to emit_store_flag/emit_store_flag_force are constants, don't we know the result of the comparison and shouldn't we have optimized the store flag to something simpler? I feel like I must be missing something here. emit_store_flag_force () is comparing a register to op0. ? /* Emit a store-flags instruction for comparison CODE on OP0 and OP1 and storing in TARGET. Normally return TARGET. Return 0 if that cannot be done. MODE is the mode to use for OP0 and OP1 should they be CONST_INTs. If it is VOIDmode, they cannot both be CONST_INT. So we're comparing op0 and op1 AFAICT. One, but not both can be a CONST_INT. If both are a CONST_INT, then you need to address the problem in the caller (by optimizing away the condition). If you've got a REG and a CONST_INT, then the mode should be taken from the REG operand. The 2 constant arguments are to the expand_atomic_compare_and_swap () function. emit_store_flag_force () is used in case when this function is called by the bool variant of the built-in function where the bool return value is computed by comparing the result register with the expected op0. So if only one of the two objects is a CONST_INT, then the mode should come from the other object. I think that's the fundamental problem here and that you're just papering over it by changing the caller. I think my earlier explanation was a bit misleading and I may have rushed into quoting the comment about both operands being const for emit_store_flag_force(). The problem is with the function and I do agree with your suggestion of changing the function to add the code below to be a better approach than the changing the caller. I will change the patch and test it. This is the updated patch according to your suggestions. Testing: Checked for regressions on arm-none-linux-gnueabihf and added new test case. Thanks Sudi ChangeLog entries: *** gcc/ChangeLog *** 2017-01-10 Sudakshina Das PR target/82096 * expmed.c (emit_store_flag_force): Swap if const op0 and change VOIDmode to mode of op0. *** gcc/testsuite/ChangeLog *** 2017-01-10 Sudakshina Das PR target/82096 * gcc.c-torture/compile/pr82096.c: New test. Thanks Sudi For example in emit_store_flag_1 we have this code: /* If one operand is constant, make it the second one. Only do this if the other operand is not constant as well. */ if (swap_commutative_operands_p (op0, op1)) { std::swap (op0, op1); code = swap_condition (code); } if (mode == VOIDmode) mode = GET_MODE (op0); I think if you do this in emit_store_flag_force as well everything will "just work". You can put it after this call/test pair: /* First see if emit_store_flag can do the job. */ tem = emit_store_flag (target, code, op0, op1, mode, unsignedp, normalizep); if (tem != 0) return tem; jeff diff --git a/gcc/expmed.c b/gcc/expmed.c index 6b22946..142d542 100644 --- a/gcc/expmed.c +++ b/gcc/expmed.c @@ -6084,6 +6084,18 @@ emit_store_flag_force (rtx target, enum rtx_code code, rtx op0, rtx op1, if (tem != 0) return tem; + /* If one operand is constant, make it the second one. Only do this + if the other operand is not constant as well. */ + + if (swap_commutative_operands_p (op0, op1)) +{ + std::swap (op0, op1); + code = swap_condition (code); +} +
Re: [C++ PATCH] -Wclass-memaccess fixes and improvements (PR c++/81327)
On Fri, Jan 5, 2018 at 6:37 PM, Martin Sebor wrote: > On 01/05/2018 03:02 PM, Jakub Jelinek wrote: >> >> Hi! >> >> Apparently LLVM allows similar warning to -Wclass-memaccess (is it just >> similar or same?; if the latter, perhaps we should use the same option for >> that) to be disabled not just by casting the destination pointer to >> e.g. (char *) or some other pointer to non-class, but also to (void *), >> which is something that Qt uses or wants to use. > > Clang has an option/warning called -Wdynamic-class-memaccess. > It detects a limited subset of the problems -Wclass-memaccess > detects: AFAIK, just calling raw memory functions like memcpy > on objects of polymorphic types, which the standard says is > undefined. > > I didn't know about the Clang option when I wrote the GCC code. > Had I found out about it sooner I would have probably considered > splitting up -Wclass-memaccess and providing the same option as > Clang for compatibility, and another for the rest of the checks. > But there might still be time to split it up before the release > if that's important. > > Some Qt code apparently uses memcpy to copy polymorphic objects > but gets around the Clang warning by casting the pointers to > void*, which doesn't suppress the GCC warning. I had considered > allowing the void* cast as a possible suppression mechanism but > ultimately decided against it. IIRC, partly because it would > have complicated the implementation and partly because I wasn't > convinced that it was a good idea (and I also didn't know about > the Clang escape hatch). > >> The problem is that the warning is diagnosed too late and whether we had >> explicit or implicit conversion to void * is lost at that point. >> >> This patch moves the warning tiny bit earlier (from build_cxx_call to the >> caller) where we still have information about the original parsed >> arguments before conversions to the builtin argument types. OK. > I'm still not convinced that letting programs get away with > byte-wise copying polymorphic objects is a good thing, but I > agree that supporting the void* cast suppression will be useful > for portability with Clang (presumably both C-style cast and > C++-style static_cast work). IMO It's good to have a way to suppress any warning by being more verbose. Warnings are to highlight inadventent mistakes, not deliberate ones. > If we want to make it a feature then we should document it in > the manual. Otherwise, if it's supposed to be a back door for > poorly written code, then I think it would be helpful to mention > it in comments in the implementation and above the assertion in > the test. I don't like undocumented back doors so I'm leaning > toward documenting it in the manual. I'm happy to add some > text to the manual if/when this is approved and decided. I agree that this should be documented. Jason
Re: Go patch committed: Update to Go1.10beta1
Hi Ian, > On Wed, Jan 10, 2018 at 3:44 AM, Rainer Orth > wrote: >> >> thanks. Testing has now concluded as well. x86 results are good (no >> regressions except for cmd/internal/buildid which fails on Linux, too), >> as are 64-bit sparc results. > > The cmd/internal/buildid test does pass on my system. What are you seeing? the log shows --- FAIL: TestReadFile (0.01s) buildid_test.go:40: ReadFile(testdata/p.a) = "", open testdata/p.a: no such file or directory, want "abcdefghijklmnopqrstuvwxyz.1234567890123456789012345678901234567890123456789012345678901234", nil buildid_test.go:47: ReadFile(testdata/p.a) [readSize=2k] = "", open testdata/p.a: no such file or directory, want "abcdefghijklmnopqrstuvwxyz.1234567890123456789012345678901234567890123456789012345678901234", nil buildid_test.go:52: open testdata/p.a: no such file or directory libgo/go/cmd/internal/buildid/testdata/p.a is just missing. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] libgo: add missing prototypes (PR 82922)
On Thu, Jan 4, 2018 at 6:53 PM, Martin Sebor wrote: > > PR 82922 asks to enable -Wstrict-prototypes. The attached > patch handles the errors in an x86_64 bootstrap. With it, > GCC bootstraps successfully with --enable-languages=all,jit, > but there are many FAILs in the test suite but I think those > could be handled by a script so unless there are objections > it seems feasible to me to enable the optionin GCC 8. Either > way, the libgo patch can be considered independently of that > decision. Thanks. I committed a slightly modified version of this as follows. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 256435) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -4b8036b3f995cdb0b99a9fa26c2af1e2420b4fa2 +9705a1f4c37ad2c099e9fe6cd587d22a2a2ab2c3 The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: libgo/go/syscall/errno.c === --- libgo/go/syscall/errno.c(revision 256366) +++ libgo/go/syscall/errno.c(working copy) @@ -11,11 +11,11 @@ /* errno is typically a macro. These functions set and get errno specific to the libc being used. */ -uintptr_t GetErrno() __asm__ (GOSYM_PREFIX "syscall.GetErrno"); +uintptr_t GetErrno(void) __asm__ (GOSYM_PREFIX "syscall.GetErrno"); void SetErrno(uintptr_t) __asm__ (GOSYM_PREFIX "syscall.SetErrno"); uintptr_t -GetErrno() +GetErrno(void) { return (uintptr_t) errno; } Index: libgo/runtime/go-now.c === --- libgo/runtime/go-now.c (revision 256366) +++ libgo/runtime/go-now.c (working copy) @@ -16,11 +16,11 @@ struct walltime_ret int32_t nsec; }; -struct walltime_ret now() __asm__ (GOSYM_PREFIX "runtime.walltime") +struct walltime_ret now(void) __asm__ (GOSYM_PREFIX "runtime.walltime") __attribute__ ((no_split_stack)); struct walltime_ret -now() +now(void) { struct timespec ts; struct walltime_ret ret; Index: libgo/runtime/go-runtime-error.c === --- libgo/runtime/go-runtime-error.c(revision 256366) +++ libgo/runtime/go-runtime-error.c(working copy) @@ -55,7 +55,7 @@ enum GO_NIL = 11 }; -extern void __go_runtime_error () __attribute__ ((noreturn)); +extern void __go_runtime_error (int32) __attribute__ ((noreturn)); void __go_runtime_error (int32 i) Index: libgo/runtime/proc.c === --- libgo/runtime/proc.c(revision 256366) +++ libgo/runtime/proc.c(working copy) @@ -369,8 +369,6 @@ runtime_mcall(FuncVal *fv) // // Design doc at http://golang.org/s/go11sched. -extern bool* runtime_getCgoHasExtraM() - __asm__ (GOSYM_PREFIX "runtime.getCgoHasExtraM"); extern G* allocg(void) __asm__ (GOSYM_PREFIX "runtime.allocg"); @@ -560,7 +558,7 @@ void setGContext(void) __asm__ (GOSYM_PR // setGContext sets up a new goroutine context for the current g. void -setGContext() +setGContext(void) { int val; G *gp; Index: libgo/runtime/runtime.h === --- libgo/runtime/runtime.h (revision 256366) +++ libgo/runtime/runtime.h (working copy) @@ -473,9 +473,7 @@ extern void typedmemmove(const Type *, v __asm__ (GOSYM_PREFIX "runtime.typedmemmove"); extern void setncpu(int32) __asm__(GOSYM_PREFIX "runtime.setncpu"); -extern P** runtime_getAllP() - __asm__ (GOSYM_PREFIX "runtime.getAllP"); -extern Sched* runtime_getsched() +extern Sched* runtime_getsched(void) __asm__ (GOSYM_PREFIX "runtime.getsched"); extern void setpagesize(uintptr_t) __asm__(GOSYM_PREFIX "runtime.setpagesize");
Re: [PATCH v2] libgo: Add support for sh
On 01/10/2018 04:22 PM, Ian Lance Taylor wrote: Ok, so basically we should end up having only two GOARCHes for SH, being "sh" and "shbe", correct? Yes, I would like to start there. If, yes, I can rebase my patch, make the requested changes and resubmit it to Gerrit. Thanks! Done: https://go-review.googlesource.com/c/gofrontend/+/84555 -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaub...@debian.org `. `' Freie Universitaet Berlin - glaub...@physik.fu-berlin.de `-GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
libgo patch committed: Remove old exp packages
This libgo patch removes the exp/proxy and exp/terminal packages. The exp/proxy package was removed from the master library in https://golang.org/cl/6461056 (August, 2012). The exp/terminal package was removed from the master library in https://golang.org/cl/5970044 (March, 2012). I'm not sure why they lingered in the gofrontend copy, but let's finally remove them now. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed to mainline. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 256433) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -87df767807acac466edb3bb6445ad83a48141d17 +4b8036b3f995cdb0b99a9fa26c2af1e2420b4fa2 The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: libgo/Makefile.am === --- libgo/Makefile.am (revision 256366) +++ libgo/Makefile.am (working copy) @@ -235,12 +235,6 @@ toolexeclibgoencoding_DATA = \ encoding/pem.gox \ encoding/xml.gox -toolexeclibgoexpdir = $(toolexeclibgodir)/exp - -toolexeclibgoexp_DATA = \ - exp/proxy.gox \ - exp/terminal.gox - toolexeclibgogodir = $(toolexeclibgodir)/go toolexeclibgogo_DATA = \ @@ -759,8 +753,6 @@ PACKAGES = \ encoding/pem \ encoding/xml \ errors \ - exp/proxy \ - exp/terminal \ expvar \ flag \ fmt \ @@ -1066,7 +1058,6 @@ CHECK_DEPS = \ $(toolexeclibgocrypto_DATA) \ $(toolexeclibgodebug_DATA) \ $(toolexeclibgoencoding_DATA) \ - $(toolexeclibgoexp_DATA) \ $(toolexeclibgogo_DATA) \ $(toolexeclibgohash_DATA) \ $(toolexeclibgoimage_DATA) \ @@ -1352,8 +1343,6 @@ TEST_PACKAGES = \ encoding/json/check \ encoding/pem/check \ encoding/xml/check \ - exp/proxy/check \ - exp/terminal/check \ html/template/check \ go/ast/check \ go/build/check \ Index: libgo/go/exp/README === --- libgo/go/exp/README (revision 256366) +++ libgo/go/exp/README (nonexistent) @@ -1,3 +0,0 @@ -This directory tree contains experimental packages and -unfinished code that is subject to even more change than the -rest of the Go tree. Index: libgo/go/exp/proxy/direct.go === --- libgo/go/exp/proxy/direct.go(revision 256366) +++ libgo/go/exp/proxy/direct.go(nonexistent) @@ -1,18 +0,0 @@ -// Copyright 2011 The Go Authors. All rights reserved. -// Use of this source code is governed by a BSD-style -// license that can be found in the LICENSE file. - -package proxy - -import ( - "net" -) - -type direct struct{} - -// Direct is a direct proxy: one that makes network connections directly. -var Direct = direct{} - -func (direct) Dial(network, addr string) (net.Conn, error) { - return net.Dial(network, addr) -} Index: libgo/go/exp/proxy/per_host.go === --- libgo/go/exp/proxy/per_host.go (revision 256366) +++ libgo/go/exp/proxy/per_host.go (nonexistent) @@ -1,140 +0,0 @@ -// Copyright 2011 The Go Authors. All rights reserved. -// Use of this source code is governed by a BSD-style -// license that can be found in the LICENSE file. - -package proxy - -import ( - "net" - "strings" -) - -// A PerHost directs connections to a default Dailer unless the hostname -// requested matches one of a number of exceptions. -type PerHost struct { - def, bypass Dialer - - bypassNetworks []*net.IPNet - bypassIPs []net.IP - bypassZones[]string - bypassHosts[]string -} - -// NewPerHost returns a PerHost Dialer that directs connections to either -// defaultDialer or bypass, depending on whether the connection matches one of -// the configured rules. -func NewPerHost(defaultDialer, bypass Dialer) *PerHost { - return &PerHost{ - def:defaultDialer, - bypass: bypass, - } -} - -// Dial connects to the address addr on the network net through either -// defaultDialer or bypass. -func (p *PerHost) Dial(network, addr string) (c net.Conn, err error) { - host, _, err := net.SplitHostPort(addr) - if err != nil { - return nil, err - } - - return p.dialerForRequest(host).Dial(network, addr) -} - -func (p *PerHost) dialerForRequest(host string) Dialer { - if ip := net.ParseIP(host); ip != nil { - for _, net := range p.bypassNetworks { - if net.Contains(ip) { - return p.bypass - } - } - for _, bypassIP := range p.bypassIPs { - if bypassIP.Equal(ip) { -
Re: [C++ Patch] PR 78344 ("ICE on invalid c++ code (tree check: expected tree_list, have error_mark in cp_check_const_attributes, at cp/decl2.c:1347")
On Fri, Dec 22, 2017 at 7:34 PM, Paolo Carlini wrote: > in this error recovery issue cp_check_const_attributes and more generally > cplus_decl_attributes have lots of troubles handling the error_mark_node > returned by cp_parser_std_attribute_spec_seq, as called by > cp_parser_direct_declarator. I fiddled quite a bit with these parsing > facilities to eventually notice that boldly changing > cp_parser_std_attribute_spec_seq to return NULL_TREE instead of > error_mark_node when cp_parser_std_attribute_spec returns error_mark_node in > the loop cures the bug at issue as a special case. Hmm, I'm skeptical. In general, we want to use error_mark_node to distinguish between something not being there and being there but wrong. > I also noticed that in cp_parser_std_attribute_spec we are using > token_pair::require_open / require_close very peculiarly, issuing a > cp_parser_error when the returned bool is false instead of simply bailing > out with error_mark_node and that in fact causes duplicate diagnostics about > expected '(' / ')', respectively. The hunks for this issue are OK. Jason
Re: [PATCH v2] libgo: Add support for sh
On Wed, Jan 10, 2018 at 7:05 AM, John Paul Adrian Glaubitz wrote: > On 01/10/2018 03:51 PM, Ian Lance Taylor wrote: >> >> I suppose I shouldn't have said calling convention, as that cuts too >> finely. There are similar issues with MIPS and even x86 has softfloat >> variants. Those options are selectable via command line options in >> GCC, and in the gc toolchain they are selected using environment >> variables (GO386, GOARM, etc.). > > Ok, so basically we should end up having only two GOARCHes for SH, > being "sh" and "shbe", correct? Yes, I would like to start there. > If, yes, I can rebase my patch, make the requested changes and resubmit > it to Gerrit. Thanks! Ian
Re: [PATCH, rs6000] generate loop code for memcmp inline expansion
I'll check the runtime of that --- I added some test cases to memcmp- 1.c and probably it is now taking too long. I will revise it so it's no longer than it was before. Aaron On Wed, 2018-01-10 at 14:25 +, Szabolcs Nagy wrote: > On 08/01/18 19:37, Aaron Sawdey wrote: > > On Tue, 2017-12-12 at 10:13 -0600, Segher Boessenkool wrote: > > > > Please fix those trivialities, and it's okay for trunk (after > > > > the > > > > rtlanal patch is approved too). Thanks! > > > > Here's the final version of this, which is committed as 256351. > > > > > > 2018-01-08 Aaron Sawdey > > > > * config/rs6000/rs6000-string.c > > (do_load_for_compare_from_addr): New > > function. > > (do_ifelse): New function. > > (do_isel): New function. > > (do_sub3): New function. > > (do_add3): New function. > > (do_load_mask_compare): New function. > > (do_overlap_load_compare): New function. > > (expand_compare_loop): New function. > > (expand_block_compare): Call expand_compare_loop() when > > appropriate. > > * config/rs6000/rs6000.opt (-mblock-compare-inline-limit): > > Change > > option description. > > (-mblock-compare-inline-loop-limit): New option. > > > > ... > > Index: gcc/testsuite/gcc.dg/memcmp-1.c > > === > > --- gcc/testsuite/gcc.dg/memcmp-1.c (revision 256350) > > +++ gcc/testsuite/gcc.dg/memcmp-1.c (working copy) > > @@ -14,11 +14,80 @@ > > #ifndef NRAND > > #define NRAND 1 > > #endif > > -#define MAX_SZ 200 > > +#define MAX_SZ 600 > > > > i see timeouts when running aarch64-none-elf tests in some > emulator environments: > > WARNING: program timed out. > FAIL: gcc.dg/memcmp-1.c execution test > > if there is a way to reduce the iteration count or the > tested variants that would help slow targets. > > > +#define > > DEF_RS(ALIGN) > > \ > > +static void test_memcmp_runtime_size_ ## ALIGN (const char *str1, > >\ > > + const char *str2, > >\ > > + size_t sz, int > > expect)\ > > +{ > >\ > > + char three[8192] __attribute__ ((aligned (4096))); > >\ > > + char four[8192] __attribute__ ((aligned (4096))); > >\ > > + char *a, *b; > >\ > > + int i,j,a1,a2,r; > >\ > > + for (j = 0; j < 2; j++) > >\ > > +{ > >\ > > + for (i = 0; i < 2; i++) > >\ > > + { > >\ > > + a = three+i*ALIGN+j*(4096-2*i*ALIGN); > >\ > > + b = four+i*ALIGN+j*(4096-2*i*ALIGN); > >\ > > + memcpy(a,str1,sz); > >\ > > + memcpy(b,str2,sz); > >\ > > + asm(" "); > >\ > > + r = memcmp(a,b,sz); > >\ > > + asm(" "); > >\ > > + if ( r < 0 && !(expect < 0) ) abort(); > >\ > > + if ( r > 0 && !(expect > 0) ) abort(); > >\ > > + if ( r == 0 && !(expect == 0) ) abort(); > >\ > > + } > >\ > > +} > >\ > > +} > > + > > +DEF_RS(1) > > +DEF_RS(2) > > +DEF_RS(4) > > +DEF_RS(8) > > +DEF_RS(16) > > + > > +static void test_memcmp_runtime_size (const char *str1, const char > > *str2, > > + size_t sz, int expect) > > +{ > > + char three[8192] __attribute__ ((aligned (4096))); > > + char four[8192] __attribute__ ((aligned (4096))); > > + char *a, *b; > > + int i,j,a1,a2,r; > > + test_memcmp_runtime_size_1 (str1,str2,sz,expect); > > + test_memcmp_runtime_size_2 (str1,str2,sz,expect); > > + test_memcmp_runtime_size_4 (str1,str2,sz,expect); > > + test_memcmp_runtime_size_8 (str1,str2,sz,expect); > > + test_memcmp_runtime_size_16 (str1,str2,sz,expect); > > + for (j = 0; j < 2; j++) > > +{ > > + for (i = 0; i < 2; i++) > > + { > > + for (a1=0; a1 < 2*sizeof(void *); a1++) > > + { > > + for (a2=0; a2 < 2*sizeof(void *); a2++) > > + { > > + a = three+i*a1+j*(4096-2*i*a1); > > + b = four+i*a2+j*(4096-2*i*a2); > > + memcpy(a,str1,sz); > > + memcpy(b,str2,sz
Re: C++ PATCH to stop folding constants to their values in convert_like_real
On Fri, Jun 09, 2017 at 03:46:41PM -0700, Jason Merrill wrote: > Something that was missed previously in the delayed folding work: > there's no good reason to be pulling values out of constants in > convert_like_real. Well, it breaks nontype10.C on 32-bit targets. Or, let's use a modified version of nontype10.C: #define NULL __null template struct A {}; template struct B {}; template struct C {}; A a; // { dg-warning "NULL" } B b; // { dg-warning "NULL" } C c; // { dg-warning "NULL" } previously we'd emit for both -m32 and -m64 just 3 warnings, but the trunk emits many further ones: nontype10.C: In instantiation of ‘struct B<0>’: nontype10.C:8:9: required from here nontype10.C:4:28: warning: converting to non-pointer type ‘long int’ from NULL [-Wconversion-null] template struct B {}; ^ nontype10.C:4:28: warning: converting to non-pointer type ‘long int’ from NULL [-Wconversion-null] nontype10.C:4:28: warning: converting to non-pointer type ‘long int’ from NULL [-Wconversion-null] on top of the 3 desired ones with -m64, or: nontype10.C: In instantiation of ‘struct A<0>’: nontype10.C:7:9: required from here nontype10.C:3:27: warning: converting to non-pointer type ‘int’ from NULL [-Wconversion-null] template struct A {}; ^ nontype10.C:3:27: warning: converting to non-pointer type ‘int’ from NULL [-Wconversion-null] nontype10.C:3:27: warning: converting to non-pointer type ‘int’ from NULL [-Wconversion-null] on top of the 3 desired ones with -m32. The problem is that the conversion of null_node to int (for -m32) or long int (for -m64) remains to be null_node rather than some other INTEGER_CST, so we warn again and again. Jakub
Re: [gofrontend-dev] Re: Go patch committed: Update to Go1.10beta1
On Wed, Jan 10, 2018 at 6:34 AM, Ian Lance Taylor wrote: > > Thanks. I think https://golang.org/cl/87137 will fix it. Committed as follows. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 256431) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -c22eb29a62b4fd72ad2ea09ebe5fcea5b8ed78b8 +87df767807acac466edb3bb6445ad83a48141d17 The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: libgo/go/cmd/go/internal/work/exec.go === --- libgo/go/cmd/go/internal/work/exec.go (revision 256366) +++ libgo/go/cmd/go/internal/work/exec.go (working copy) @@ -1857,9 +1857,11 @@ func (b *Builder) gccSupportsFlag(compil // GCC says "unrecognized command line option". // clang says "unknown argument". // Older versions of GCC say "unrecognised debug output level". + // For -fsplit-stack GCC says "'-fsplit-stack' is not supported". supported := !bytes.Contains(out, []byte("unrecognized")) && !bytes.Contains(out, []byte("unknown")) && - !bytes.Contains(out, []byte("unrecognised")) + !bytes.Contains(out, []byte("unrecognised")) && + !bytes.Contains(out, []byte("is not supported")) b.flagCache[key] = supported return supported }
Re: [Patch][ARM] Add -mbranch-cost option, and update a few tests
On 10 January 2018 at 15:44, Jakub Jelinek wrote: > On Mon, Oct 23, 2017 at 02:30:24PM +0200, Christophe Lyon wrote: >> After Jakub's suggestion in PR82120 and PR81184, the attached patch >> adds the -mbranch-cost option to the ARM target. My understanding >> is that it's intended to be used internally for testing and does not >> require user-facing documentation. >> >> I have updated a few tests, validation on aarch64 & arm targets shows >> no regression, >> and a few improvements when targeting cortex-a5 or cortex-m3: >> gcc.dg/tree-ssa/reassoc-3[3456].c now pass. >> >> That being said, I'm not sure about the other targets for which I >> changed the condition, >> and I am also concerned by the fact that it has no impact on >> gcc.dg/pr21643.c and gcc.dg/tree-ssa/phi-opt-11.c (PR81184). >> >> Should I restrict my patch to the only tests where it has an impact >> (gcc.dg/tree-ssa/reassoc-3[3456].c) ? > > Let's change all and watch the effects on all targets in testresults, > we can fine tune later on. > OK, thanks > Does pr21643.c really fail somewhere on arm*? Tried -mcpu=cortex-a5 > and don't see the failure in x86_64-linux -> armv7a-hardfloat-linux-gnueabi > cross. Yes, for me still fails on arm-none-linux-gnueabihf --with-cpu=cortex-a5 --with-fpu=vfpv3-d16-fp16 > >> gcc/ChangeLog: >> >> 2017-10-23 Christophe Lyon >> >> * config/arm/arm.opt (-mbranch-cost): New option. >> * config/arm/arm.h (BRANCH_COST): Take arm_branch_cost into >> account. >> >> gcc/testsuite/ChangeLog: >> >> 2017-10-23 Christophe Lyon >> >> * lib/target-supports.exp (check_effective_target_branch_cost): >> New function. >> * gcc.dg/builtin-bswap-7.c: Use branch_cost effective target. >> * gcc.dg/pr21643.c: Likewise. >> * gcc.dg/pr46309.c: Likewise. >> * gcc.dg/tree-ssa/phi-opt-11.c: Likewise. >> * gcc.dg/tree-ssa/phi-opt-2.c: Likewise. >> * gcc.dg/tree-ssa/reassoc-32.c: Likewise. >> * gcc.dg/tree-ssa/reassoc-33.c: Likewise. >> * gcc.dg/tree-ssa/reassoc-34.c: Likewise. >> * gcc.dg/tree-ssa/reassoc-35.c: Likewise. >> * gcc.dg/tree-ssa/reassoc-36.c: Likewise. >> * gcc.dg/tree-ssa/ssa-ifcombine-13.c: Likewise. >> * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-1.c: Likewise. >> * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-2.c: Likewise. >> * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-3.c: Likewise. >> * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-4.c: Likewise. >> * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-5.c: Likewise. >> * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-6.c: Likewise. > > Ok for trunk. > > Note, unreviewed patches should be pinged from time to time. Sorry, I thought the patch was too crappy ;-) Thanks, Christophe > > Jakub
Re: [C++ Patch] PR 81055 ("[6/7/8 Regression] ICE with invalid initializer for array new")
OK. On Thu, Dec 21, 2017 at 2:21 PM, Paolo Carlini wrote: > Hi, > > On 21/12/2017 17:04, Jason Merrill wrote: >> >> On Wed, Dec 20, 2017 at 10:37 AM, Paolo Carlini >> wrote: >>> >>> in this error recovery regression, after a sensible error produced by >>> unqualified_name_lookup_error we ICE much later when gimplify_modify_expr >>> encounters a corresponding error_mark_node as second argument of a >>> MODIFY_EXPR. I believe we have a very general error recovery weakness >>> with >>> errors like unqualified_name_lookup_error and functions like >>> cp_parser_initializer_list returning a vec: certainly we don't want to >>> give >>> up the parsing too early but then we have to cope with error_mark_nodes >>> filtering down and reappearing much later in the compilation. The present >>> bug is a rather clear example, but I have seen many others in the past: a >>> couple of times I even tried doing something about it, but I have yet to >>> figure out something worth sending to the mailing list. Anyway, here I'm >>> wondering if at this stage it would make sense to handle the >>> error_mark_node >>> in gimplify_modify_expr - I believe we do have a couple other cases of >>> such >>> late handling in the gimplifier. Tested x86_64-linux. >> >> This seems fine, but the front end shouldn't have created such a >> MODIFY_EXPR in the first place. How does this happen? > > Thanks for asking: a good cure for my laziness ;) > > In fact, it's an INIT_EXPR, created by build_vec_init around line #4402. The > below works for the testcase and I'm finishing regtesting it. Alternately, > only setting errors = true when init == error_mark_node, which eventually > leads anyway to build_vec_init returning error_mark_node, also works for the > testcase: it's a little lighter, so to speak, but less explicit (setting > elt_init = error_mark_node leads to errors = true too a few lines below). > > Thanks! > Paolo. > >
Re: [PATCH v2] libgo: Add support for sh
On 01/10/2018 03:51 PM, Ian Lance Taylor wrote: I suppose I shouldn't have said calling convention, as that cuts too finely. There are similar issues with MIPS and even x86 has softfloat variants. Those options are selectable via command line options in GCC, and in the gc toolchain they are selected using environment variables (GO386, GOARM, etc.). Ok, so basically we should end up having only two GOARCHes for SH, being "sh" and "shbe", correct? If, yes, I can rebase my patch, make the requested changes and resubmit it to Gerrit. Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaub...@debian.org `. `' Freie Universitaet Berlin - glaub...@physik.fu-berlin.de `-GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Re: Go patch committed: Update to Go1.10beta1
On Wed, Jan 10, 2018 at 5:42 AM, Ian Lance Taylor wrote: > > Whoops, there's a bug on big-endian 32-bit systems. I'm testing > https://golang.org/cl/87135. Committed as follows. Ian Index: gcc/go/gofrontend/MERGE === --- gcc/go/gofrontend/MERGE (revision 256419) +++ gcc/go/gofrontend/MERGE (working copy) @@ -1,4 +1,4 @@ -c18c6bd80e0995827ad3396eb1c2401451de88fd +c22eb29a62b4fd72ad2ea09ebe5fcea5b8ed78b8 The first line of this file holds the git revision number of the last merge done from the gofrontend repository. Index: libgo/runtime/go-construct-map.c === --- libgo/runtime/go-construct-map.c(revision 256366) +++ libgo/runtime/go-construct-map.c(working copy) @@ -11,8 +11,8 @@ #include "runtime.h" #include "go-type.h" -extern void *makemap (const struct __go_map_type *, int64_t hint, - void *, void *) +extern void *makemap (const struct __go_map_type *, intgo hint, + void *) __asm__ (GOSYM_PREFIX "runtime.makemap"); extern void *mapassign (const struct __go_map_type *, void *hmap, @@ -29,7 +29,7 @@ __go_construct_map (const struct __go_ma uintptr_t i; void *p; - ret = makemap(type, (int64_t) count, NULL, NULL); + ret = makemap(type, (intgo) count, NULL); entries = (const unsigned char *) ventries; for (i = 0; i < count; ++i)
Re: [PATCH][RFC] Radically simplify emission of balanced tree for switch statements.
On 01/10/2018 02:13 PM, Richard Biener wrote: > On Tue, Jan 9, 2018 at 7:29 PM, Jeff Law wrote: >> On 01/09/2018 07:43 AM, Martin Liška wrote: >>> On 09/20/2017 05:00 PM, Jeff Law wrote: On 09/20/2017 01:24 AM, Martin Liška wrote: > > Hello. > > Thank you Jeff for very verbose explanation what's happening. I'm > planning to do > follow-up of this patch that will include clustering for bit-tests and > jump tables. > Maybe that will make aforementioned issues even more difficult, but we'll > see. FWIW, the DOM changes to simplify the conditionals seem to help both cases, trigger reasonably consistently in a bootstrap and for some subset of the triggers actually result in transformations that allow other passes to do a better job in the common (-O2) case. So my inclination is to polish them a bit further get them on the trunk. My recommendation is to ignore the two regressions for now and focus on the cleanups you're trying to do. jeff >>> >>> Hello. >>> >>> Some time ago I've decided that I'll make patch submission of switch >>> clustering >>> in next stage1. However, this patch can be applied as is in this stage3. >>> Would >>> it be possible or is it too late? >> I'll let Richi make the call here. FWIW, the DOM changes to avoid the >> two missed-optimization regressions you ran into are on the trunk, so >> that's no longer a blocking issue. > > If you are fine with waiting then please wait ;) Yep, it's not urgent. Thanks. Martin > > Richard. > >> jeff
Re: Go patch committed: Update to Go1.10beta1
On Wed, Jan 10, 2018 at 3:44 AM, Rainer Orth wrote: > > thanks. Testing has now concluded as well. x86 results are good (no > regressions except for cmd/internal/buildid which fails on Linux, too), > as are 64-bit sparc results. The cmd/internal/buildid test does pass on my system. What are you seeing? Ian
[PATCH] rs6000: Wrap diff of immediates in const (PR83629)
In various of our 32-bit load_toc patterns we take the difference of two immediates (labels) as a term to something bigger; but this isn't canonical RTL, it needs to be wrapped in CONST. This fixes it. Tested on powerpc64-linux {-m32,-m64}. Committing. Segher 2018-01-10 Segher Boessenkool PR target/83629 * config/rs6000/rs6000.md (load_toc_v4_PIC_2, load_toc_v4_PIC_3b, load_toc_v4_PIC_3c): Wrap const term in CONST RTL. testsuite/ PR target/83629 * gcc.target/powerpc/pr83629.c: New testcase. --- gcc/config/rs6000/rs6000.md| 26 -- gcc/testsuite/gcc.target/powerpc/pr83629.c | 9 + 2 files changed, 25 insertions(+), 10 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/pr83629.c diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index fc9daf6..fd8f10d 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -10109,27 +10109,33 @@ (define_insn "load_toc_v4_PIC_1b_476" (define_insn "load_toc_v4_PIC_2" [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (mem:SI (plus:SI (match_operand:SI 1 "gpc_reg_operand" "b") - (minus:SI (match_operand:SI 2 "immediate_operand" "s") -(match_operand:SI 3 "immediate_operand" "s")] + (mem:SI (plus:SI + (match_operand:SI 1 "gpc_reg_operand" "b") + (const + (minus:SI (match_operand:SI 2 "immediate_operand" "s") + (match_operand:SI 3 "immediate_operand" "s"))] "TARGET_ELF && DEFAULT_ABI == ABI_V4 && flag_pic == 2" "lwz %0,%2-%3(%1)" [(set_attr "type" "load")]) (define_insn "load_toc_v4_PIC_3b" [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (plus:SI (match_operand:SI 1 "gpc_reg_operand" "b") -(high:SI - (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s") -(match_operand:SI 3 "symbol_ref_operand" "s")] + (plus:SI + (match_operand:SI 1 "gpc_reg_operand" "b") + (high:SI + (const + (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s") + (match_operand:SI 3 "symbol_ref_operand" "s"))] "TARGET_ELF && TARGET_SECURE_PLT && DEFAULT_ABI == ABI_V4 && flag_pic" "addis %0,%1,%2-%3@ha") (define_insn "load_toc_v4_PIC_3c" [(set (match_operand:SI 0 "gpc_reg_operand" "=r") - (lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "b") - (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s") -(match_operand:SI 3 "symbol_ref_operand" "s"] + (lo_sum:SI + (match_operand:SI 1 "gpc_reg_operand" "b") + (const + (minus:SI (match_operand:SI 2 "symbol_ref_operand" "s") + (match_operand:SI 3 "symbol_ref_operand" "s")] "TARGET_ELF && TARGET_SECURE_PLT && DEFAULT_ABI == ABI_V4 && flag_pic" "addi %0,%1,%2-%3@l") diff --git a/gcc/testsuite/gcc.target/powerpc/pr83629.c b/gcc/testsuite/gcc.target/powerpc/pr83629.c new file mode 100644 index 000..aeff699 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr83629.c @@ -0,0 +1,9 @@ +/* { dg-options "-O2 -fPIC -frename-registers --param=sched-autopref-queue-depth=0 -mcpu=603" } */ + +extern void bar (void *); + +void +foo (void) +{ + bar (""); +} -- 1.8.3.1
[PATCH] Fix PR82770
Applied. Richard. 2018-01-10 Richard Biener PR testsuite/78768 * gcc.dg/pr78768.c: Un-XFAIL. Index: gcc/testsuite/gcc.dg/pr78768.c === --- gcc/testsuite/gcc.dg/pr78768.c (revision 256428) +++ gcc/testsuite/gcc.dg/pr78768.c (working copy) @@ -9,7 +9,7 @@ int main (void) { char *d = (char *)__builtin_alloca (12); /* { dg-warning "argument to .alloca. is too large" } */ - __builtin_sprintf (d, "%32s", "x"); /* { dg-warning "directive writing 32 bytes into a region of size 12" "-Wformat-overflow" { xfail *-*-* } } */ + __builtin_sprintf (d, "%32s", "x"); /* { dg-warning "directive writing 32 bytes into a region of size 12" "-Wformat-overflow" } */ return 0; }
Re: [PATCH v2] libgo: Add support for sh
On Wed, Jan 10, 2018 at 6:30 AM, Oleg Endo wrote: > On Wed, 2018-01-10 at 06:25 -0800, Ian Lance Taylor wrote: >> >> Thanks. I finally took a look at this. I don't know much about SH, >> but I don't think we want to add each SH variant as a separate GOARCH >> value. As you can see from the list you modified in >> ibgo/go/go/build/syslist.go, the difference between GOARCH values is >> essentially the calling convention. There are many different kinds of >> x86 processors, but since the only calling convention difference is >> between 32-bit and 64-bit, the list has only 386 and amd64. Similarly >> it seems to me we should have only sh and shbe in the list for SH >> processors. > > On SH the calling convention depends on the processor features which > are available/used. For example there is a "no-fpu" mode which uses > software fp and passes all fp values in gp regs, while the "normal" > mode is to pass fp values in fp regs. Some of the CPU variants like > SH2 imply "no-fpu". I suppose I shouldn't have said calling convention, as that cuts too finely. There are similar issues with MIPS and even x86 has softfloat variants. Those options are selectable via command line options in GCC, and in the gc toolchain they are selected using environment variables (GO386, GOARM, etc.). Ian
[wwwdocs,avr] Mention PR83738 in release notes
This patch adds a bit more the the avr section of the v8 release notes. Ok? Johann Index: changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-8/changes.html,v retrieving revision 1.25 diff -r1.25 changes.html 240a241,246 > > The compiler no more saves / restores registers in main; > the effect is the same as if attribute OS_task was > specified for main. This optimization can be switched > off by the new command-line option -mno-main-is-OS_task. >
Re: [PATCH v2] libgo: Add support for sh
On 01/10/2018 03:40 PM, Ian Lance Taylor wrote: The main purpose of a GOARCH value is to specify build targets, which are the +build lines seen in files like libgo/go/internal/syscall/unix/getrandom_linux_mipsx.go. They also appear in file names. It seems to me unlikely that it will ever be necessary to distinguish SH3 and SH4 when choosing which files to build. What about Oleg's comment on the issue? Adrian -- .''`. John Paul Adrian Glaubitz : :' : Debian Developer - glaub...@debian.org `. `' Freie Universitaet Berlin - glaub...@physik.fu-berlin.de `-GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913