RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2
Hi, >> What I would suggest is to have a -fgnu-strict-volatile-bit-fields > > Why a new option? The -fstrict-volatile-bitfields option is already > GCC-specific, and I think it can do what you want anyway. As I understand Richard's comment, he proposes to have an option for true AAPCS compliance, which will be allowed to break the C++11 memory model and which will _not_ be the default on any target. Name it -fstrict-volatile-bitfields. And an option that addresses your requirements, which will _not_ break the C++11 memory model and which will be the default on some targets, dependent on the respective ABI requirements. Name it -fgnu-strict-volatile-bit-fields. Bernd.
RE: Fix asymmetric volatile handling in optimize_bit_field_compare
Hi, OnĀ Fri, 18 Oct 2013 14:06:57, Richard Biener wrote: > > On Fri, Oct 18, 2013 at 1:56 PM, Bernd Edlinger > wrote: >> Hello Richard, >> >> I see it the same way. >> >> The existing code in optimize_bit_field_compare looks wrong. It is very >> asymmetric, >> and handles lvolatilep and rvolatilep differently. And the original handling >> of >> flag_strict_volatile_bitfields also was asymmetric. >> >> However this optimize-bit-field-compare check at the beginning of that >> function was not >> introduced because of volatile issues I think: >> >> There was a discussion in April 2012 on this thread: "Continue >> strict-volatile-bitfields fixes" >> >> http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01094.html >> I played with a sh cross-compiler and I checked the mentioned test cases manually. Result: They will regress with this patch. But the true reason for this is that get_inner_reference returns a different mode for non-volatile bit-fields dependent on the flag_strict_volatile_bitfields, which is wrong per definition. So I will prepare another patch that restores the code in stor-layout.c to what it was before 2010, and uses DECL_BIT_FIELD_TYPE in get_inner_reference if the reference is volatile and flag_strict_volatile_bitfields. >> The result was that this optimization seems to break other possible >> optimizations later on, >> when -fstrict-volatile-bitfields was enabled on the SH target. Even when the >> bit fields are NOT volatile. >> (Of course you should not touch volatile bit fields at all) >> >> And this was added to optimize_bit_field_compare as a result: >> >> /* In the strict volatile bitfields case, doing code changes here may prevent >> other optimizations, in particular in a SLOW_BYTE_ACCESS setting. */ >> if (flag_strict_volatile_bitfields> 0) >> return 0; > > I see it as that this check breaked the testcases which required the > testcase fixes. So yes, if > that referenced patch got in then it can likely be reverted. > >> Well, I dont know if that still applies to the trunk, and probably this was >> the completely wrong way to fix >> this issue it here. >> >> Maybe that had to do with the other place in stor-layout.c, >> which also could be solved differently. >> >> I think it could be possible to remove the flag_strict_volatile_bitfields >> special case >> also there, and instead use the DECL_BIT_FIELD_TYPE instead of >> DECL_BIT_FIELD in >> get_inner_reference IF we have flag_strict_volatile_bitfields and that field >> is really volatile. At least this helped in the portable volatiliy patch. >> What do you think? > > Well, sure using DECL_BIT_FIELD is wrong if AACPS specifies the > handling is conditional > on being declared as bitfield. Otherwise > > struct { > int c : 8; > }; > > will be not handled correctly (DECL_BIT_FIELD is not set, the field > occupies a QImode). > > DECL_BIT_FIELD_TYPE is the proper way to check whether a field was _declared_ > as bitfield. > > So yes, removing this special-case in stor-layout.c seems possible - > but beware of > possible ABI issues here, for example passing an argument of the above type > by value might result in passing it in a register or on the stack > dependent on whether > the struct has QImode or BLKmode (I'd say that's a target bug then as the > target > should look at the type, not at its mode... but reality is that I > fully can believe such > a bug exists - and whoops you have a call ABI that depends on > -fstrict-volatile-bitfields ;)) > I've done a grep "DECL_BIT_FIELD[^_]" and looked at all references of this attribute. There are even places in config/i386 and config/frv where it is used. All of these places look like latent bugs, if it depends on -fstrict-volatile-bitfields. None of them check for volatiles. I think restoring the original semantic of that attribute will not cause any problems. >> Anyway, here is my patch for review. >> >> OK for trunk after regression testing? > > Ok. > > Thanks, > Richard. > >> Thanks >> Bernd
[PATCH, rs6000] Adjust vec_unpacku patterns for little endian
Hi, For little endian, the permute control vector for unpacking high and low halves of a vector register must be reversed from the one used for big endian. Fixing this corrects 27 failing tests for powerpc64le-unknown-linux-gnu. Bootstrapped and tested for powerpc64{,le}-unknown-linux-gnu with no new regressions. Is this ok for trunk? Thanks, Bill 2013-10-19 Bill Schmidt * altivec.md (vec_unpacku_hi_v16qi): Adjust for little endian. (vec_unpacku_hi_v8hi): Likewise. (vec_unpacku_lo_v16qi): Likewise. (vec_unpacku_lo_v8hi): Likewise. Index: gcc/config/rs6000/altivec.md === --- gcc/config/rs6000/altivec.md(revision 203792) +++ gcc/config/rs6000/altivec.md(working copy) @@ -2035,25 +2035,26 @@ rtx vzero = gen_reg_rtx (V8HImode); rtx mask = gen_reg_rtx (V16QImode); rtvec v = rtvec_alloc (16); + bool be = BYTES_BIG_ENDIAN; emit_insn (gen_altivec_vspltish (vzero, const0_rtx)); - RTVEC_ELT (v, 0) = gen_rtx_CONST_INT (QImode, 16); - RTVEC_ELT (v, 1) = gen_rtx_CONST_INT (QImode, 0); - RTVEC_ELT (v, 2) = gen_rtx_CONST_INT (QImode, 16); - RTVEC_ELT (v, 3) = gen_rtx_CONST_INT (QImode, 1); - RTVEC_ELT (v, 4) = gen_rtx_CONST_INT (QImode, 16); - RTVEC_ELT (v, 5) = gen_rtx_CONST_INT (QImode, 2); - RTVEC_ELT (v, 6) = gen_rtx_CONST_INT (QImode, 16); - RTVEC_ELT (v, 7) = gen_rtx_CONST_INT (QImode, 3); - RTVEC_ELT (v, 8) = gen_rtx_CONST_INT (QImode, 16); - RTVEC_ELT (v, 9) = gen_rtx_CONST_INT (QImode, 4); - RTVEC_ELT (v, 10) = gen_rtx_CONST_INT (QImode, 16); - RTVEC_ELT (v, 11) = gen_rtx_CONST_INT (QImode, 5); - RTVEC_ELT (v, 12) = gen_rtx_CONST_INT (QImode, 16); - RTVEC_ELT (v, 13) = gen_rtx_CONST_INT (QImode, 6); - RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, 16); - RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, 7); + RTVEC_ELT (v, 0) = gen_rtx_CONST_INT (QImode, be ? 16 : 7); + RTVEC_ELT (v, 1) = gen_rtx_CONST_INT (QImode, be ? 0 : 16); + RTVEC_ELT (v, 2) = gen_rtx_CONST_INT (QImode, be ? 16 : 6); + RTVEC_ELT (v, 3) = gen_rtx_CONST_INT (QImode, be ? 1 : 16); + RTVEC_ELT (v, 4) = gen_rtx_CONST_INT (QImode, be ? 16 : 5); + RTVEC_ELT (v, 5) = gen_rtx_CONST_INT (QImode, be ? 2 : 16); + RTVEC_ELT (v, 6) = gen_rtx_CONST_INT (QImode, be ? 16 : 4); + RTVEC_ELT (v, 7) = gen_rtx_CONST_INT (QImode, be ? 3 : 16); + RTVEC_ELT (v, 8) = gen_rtx_CONST_INT (QImode, be ? 16 : 3); + RTVEC_ELT (v, 9) = gen_rtx_CONST_INT (QImode, be ? 4 : 16); + RTVEC_ELT (v, 10) = gen_rtx_CONST_INT (QImode, be ? 16 : 2); + RTVEC_ELT (v, 11) = gen_rtx_CONST_INT (QImode, be ? 5 : 16); + RTVEC_ELT (v, 12) = gen_rtx_CONST_INT (QImode, be ? 16 : 1); + RTVEC_ELT (v, 13) = gen_rtx_CONST_INT (QImode, be ? 6 : 16); + RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 : 0); + RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 7 : 16); emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v))); emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask)); @@ -2070,25 +2071,26 @@ rtx vzero = gen_reg_rtx (V4SImode); rtx mask = gen_reg_rtx (V16QImode); rtvec v = rtvec_alloc (16); + bool be = BYTES_BIG_ENDIAN; emit_insn (gen_altivec_vspltisw (vzero, const0_rtx)); - RTVEC_ELT (v, 0) = gen_rtx_CONST_INT (QImode, 16); - RTVEC_ELT (v, 1) = gen_rtx_CONST_INT (QImode, 17); - RTVEC_ELT (v, 2) = gen_rtx_CONST_INT (QImode, 0); - RTVEC_ELT (v, 3) = gen_rtx_CONST_INT (QImode, 1); - RTVEC_ELT (v, 4) = gen_rtx_CONST_INT (QImode, 16); - RTVEC_ELT (v, 5) = gen_rtx_CONST_INT (QImode, 17); - RTVEC_ELT (v, 6) = gen_rtx_CONST_INT (QImode, 2); - RTVEC_ELT (v, 7) = gen_rtx_CONST_INT (QImode, 3); - RTVEC_ELT (v, 8) = gen_rtx_CONST_INT (QImode, 16); - RTVEC_ELT (v, 9) = gen_rtx_CONST_INT (QImode, 17); - RTVEC_ELT (v, 10) = gen_rtx_CONST_INT (QImode, 4); - RTVEC_ELT (v, 11) = gen_rtx_CONST_INT (QImode, 5); - RTVEC_ELT (v, 12) = gen_rtx_CONST_INT (QImode, 16); - RTVEC_ELT (v, 13) = gen_rtx_CONST_INT (QImode, 17); - RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, 6); - RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, 7); + RTVEC_ELT (v, 0) = gen_rtx_CONST_INT (QImode, be ? 16 : 7); + RTVEC_ELT (v, 1) = gen_rtx_CONST_INT (QImode, be ? 17 : 6); + RTVEC_ELT (v, 2) = gen_rtx_CONST_INT (QImode, be ? 0 : 17); + RTVEC_ELT (v, 3) = gen_rtx_CONST_INT (QImode, be ? 1 : 16); + RTVEC_ELT (v, 4) = gen_rtx_CONST_INT (QImode, be ? 16 : 5); + RTVEC_ELT (v, 5) = gen_rtx_CONST_INT (QImode, be ? 17 : 4); + RTVEC_ELT (v, 6) = gen_rtx_CONST_INT (QImode, be ? 2 : 17); + RTVEC_ELT (v, 7) = gen_rtx_CONST_INT (QImode, be ? 3 : 16); + RTVEC_ELT (v, 8) = gen_rtx_CONST_INT (QImode, be ? 16 : 3); + RTVEC_ELT (v, 9) = gen_rtx_CONST_INT (QImode, be ? 17 : 2); + RTVEC_ELT (v, 10) = gen_rtx_CONST_INT (QImode, be ? 4 : 17); + RTVEC_ELT (v, 11) = gen_rtx_CONST_INT (QImode, be ? 5 : 16); + RTVEC_ELT (v, 12) = gen_rtx_CONST_INT (QImod
[committed] Skip long double test in c-c++-common/opaque-vector.c on hppa
The test fails on the 32-bit hpux target because the long double type is larger than the largest integer type. Test passes on 64-bit hpux target. On linux, long double is the same as double. So, it's simplest to just skip long double test. Tested on hppa2.0w-hp-hpux11.11. Dave -- John David Anglin dave.ang...@bell.net 2013-10-19 John David Anglin * c-c++-common/opaque-vector.c: Skip long double test on hppa. Index: c-c++-common/opaque-vector.c === --- c-c++-common/opaque-vector.c(revision 203832) +++ c-c++-common/opaque-vector.c(working copy) @@ -16,7 +16,7 @@ T_TEST(float) T_TEST(double) /* Avoid trouble with non-power-of-two sizes. */ -#if !defined(__i386__) && !defined(__x86_64__) && !defined(__m68k__) && !defined(__ia64__) +#if !defined(__i386__) && !defined(__x86_64__) && !defined(__m68k__) && !defined(__ia64__) && !defined(__hppa__) T_TEST(long double) #endif }
[committed] Skip gnat.dg/specs/linker_alias.ads on hppa*-*-hpux*
Skipped because alias definitions are not supported on 32-bit hpux targets. Tested on hppa2.0w-hp-hpux11.11. Dave -- John David Anglin dave.ang...@bell.net 2013-10-19 John David Anglin PR testsuite/58645 * gnat.dg/specs/linker_alias.ads: Skip on hppa*-*-hpux*. Index: gnat.dg/specs/linker_alias.ads === --- gnat.dg/specs/linker_alias.ads (revision 203166) +++ gnat.dg/specs/linker_alias.ads (working copy) @@ -1,5 +1,5 @@ -- { dg-do compile } --- { dg-skip-if "missing alias support" { *-*-darwin* } } +-- { dg-skip-if "missing alias support" { *-*-darwin* hppa*-*-hpux* } } package Linker_Alias is
[committed] Fix identifier conflict on hppa*-*-hpux*
The attached change fixes an identifer conflict caused by renaming the member _slot to m_slot. This fixes PR target/58603. Committed as obvious. Tested on hppa2.0w-hp-hpux11.11, hppa64-hp- hpux11.11 and hppa1.1-hp-hpux11.11. Dave -- John David Anglin dave.ang...@bell.net 2013-10-19 John David Anglin PR target/58603 * system.h: Undef m_slot. Index: system.h === --- system.h(revision 203160) +++ system.h(working copy) @@ -264,8 +264,9 @@ #ifdef HAVE_SYS_PARAM_H # include -/* We use this identifier later and it appears in some vendor param.h's. */ +/* We use these identifiers later and they appear in some vendor param.h's. */ # undef PREFETCH +# undef m_slot #endif #if HAVE_LIMITS_H
Re: [Patch] Change default executor to DFS in regex
On Sat, Oct 19, 2013 at 3:12 PM, Paolo Carlini wrote: > Yes, but giving it a name doesn't buy us much wrt the issue I pointed out. > For comparison, in similar cases, the compiler driver has --params which the > user can fine tune on the command line. The best approximation we have got > in the library - for the time being at least, in principle the driver could > also forward parameters to the library - is a macro, which is defined and > documented in a comment a few lines earlier and then is possibly used to > initialize a const static. Again, see the stl_deque.h example. I can't > imagine any other simple (and conforming! eg, no additional template parms) > solution. I see. Here's the macro version. Thanks! -- Tim Shen a.patch Description: Binary data
libstdc++ Testsuite extension for sort, partial_sort, partial_sort_copy, nth_element
The following patch (related to my 58800 patch) adds many more tests for several important functions. While 58800 is my fault, the reason it was not caught earlier is that many functions in libstdc++ have almost no testing. This works toward fixing that. These tests are kept to an acceptable amount of time (~2 seconds/ function) at the moment (and automatic reduction for simulators), although I would like to take much more time and do much more testing, but I realise that might annoy some people so it might not be on by default. I plan to write my tests if these are accepted (although writing this reminded me why gcc is the worst open source project for submitting code to I have ever worked with). 2013-10-16 Edward Smith-Rowland <3dw...@verizon.net> * testsuite/util/testsuite_iterators.h : Add 'val' method to read testsuite container. * testsuite/25_algorithms/partial_sort/nth_element.cc : New * testsuite/25_algorithms/partial_sort/random_test.cc : New * testsuite/25_algorithms/partial_sort_copy/random_test.cc : New * testsuite/25_algorithms/sort/random_test.cc : New * testsuite/util/testsuite_containergen.h : New header for automatically testing functions on a large set of autogenerated headers sorting_test.patch Description: Binary data
nth_element fix [58800]
A previous fix to a performance problem in std::sort unfortunately has lead to a crashing bug in std::nth_element. The following minimal patch fixes the problem (there are a couple of different ways of fixing it, this is the shortest and safest). 2013-09-19 Chris Jefferson PR libstdc++/58800 * include/bits/stl_algo.h (__unguarded_partition_pivot) : Change __last - 2 to __last - 1, so we correctly work on arrays of length 4. * testsuite/25_algorithms/nth_element/58800.cc : New nth_element1.patch Description: Binary data
Re: Honnor ix86_accumulate_outgoing_args again
> Jan, > > Does this seem reasonable to you? Oops, sorry, I missed your email. (I was travelling and I am finishing a paper now). > > Thanks, > Igor > > > -Original Message- > > From: Zamyatin, Igor > > Sent: Tuesday, October 15, 2013 3:48 PM > > To: Jan Hubicka > > Subject: RE: Honnor ix86_accumulate_outgoing_args again > > > > Jan, > > > > Now we have following prologue in, say, phi0 routine in equake > > > > 0x804aa90 1 push %ebp > > 0x804aa91 2 mov%esp,%ebp > > 0x804aa93 3 sub$0x18,%esp > > 0x804aa96 4 vmovsd 0x80ef7a8,%xmm0 > > 0x804aa9e 5 vmovsd 0x8(%ebp),%xmm1 > > 0x804aaa3 6 vcomisd %xmm1,%xmm0 <-- we see big stall somewhere here > > or 1-2 instructions above > > > > While earlier it was > > > > 0x804abd0 1 sub$0x2c,%esp > > 0x804abd3 2 vmovsd 0x30(%esp),%xmm1 > > 0x804abd9 3 vmovsd 0x80efcc8,%xmm0 > > 0x804abe1 4 vcomisd %xmm1,%xmm0 Thanks for analysis! It is a different benchmark than for bulldozer, but apparently same case. Again we used to eliminate frame pointer here but IRS now doesn't Do you see the same regression with -fno-omit-frame-pointer -maccumulate-outgoing-args? I suppose this is a conflict in between the push instruction hanled by stack engine and initialization of EBP that isn't. That would explain why bulldozer don't seem to care about this particular benchmark (its stack engine seems to have quite different design). This is a bit sad situation - accumulate-outgoing-args is expensive code size wise and it seems we don't really need esp with -mno-accumulate-outgoing-args. The non-accumulation code path was mistakely disabled for too long ;( Vladimir, how much effort do you think it will be to fix the frame pointer elimination here? I can imagine it is a quite tricky case. If so I would suggest adding m_CORE_ALL to X86_TUNE_ACCUMULATE_OUTGOING_ARGS with a comment explaining the problem and mentioning the regression on equake on core and mgrid on Bulldizer and opening an enhancement request for this... I also wonder if direct ESP use and push/pop instructions are causing so noticeable issues, I wonder if we can't "shrink wrap" this into red-zone in the 64bit compilation. It seems that even with -maccumulate-outgoing-args pushing the frame allocation as late as possible in the function would be a good idea so it is not close to the push/pop/call/ret. Honza > > > > Note that for Atom and SLM no regression happens because they are already > > in x86_accumulate_outgoing_args. As for other machines - seems now > > (after your change) they don't get that > > MASK_ACCUMULATE_OUTGOING_ARGS and it leads to using ebp in the > > prologue. > > > > Thanks, > > Igor > > > > -Original Message- > > From: Jan Hubicka [mailto:hubi...@ucw.cz] > > Sent: Monday, October 14, 2013 8:44 PM > > To: Zamyatin, Igor > > Cc: Jan Hubicka > > Subject: Re: Honnor ix86_accumulate_outgoing_args again > > > > > Jan, > > > > > > I see that you haven't committed this change. Any particular reason for > > this? > > > > No, seems I forgot about it. > > > > > > BTW, after r203171 (http://gcc.gnu.org/ml/gcc-cvs/2013- > > 10/msg00122.html) we see lot of performance degradation on spec2000 and > > spec2006 tests on SandyBridge and IvyBridge in 32 bits mode. E.g. > > 183.equake got ~-15%. Have you seen it? > > > > I tested this only on Bulldozer chips where I saw large regression from > > mgrid, > > but curiously not from equake. I tracked it down to frame pointer being > > forced by IRA that Vladimir promised to look at later. > > > > I assumed that arg accumulation was tested before the flags was set, so I > > did > > not benchmarked chips that previously dsabled it. Perhaps things changed > > because IRA was merged in meantime while this feature was accidentally > > disabled. > > > > It would be great if you can figure out why equake regress - in FP > > benchmarks we do not use push/pop so perhaps we just end up emitting > > something really stupid or we manage to confuse stack engine... > > > > Honza > > > > > > > > Thanks, > > > Igor > > > > > > -Original Message- > > > From: gcc-patches-ow...@gcc.gnu.org > > > [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Jan Hubicka > > > Sent: Thursday, October 10, 2013 10:40 PM > > > To: Jan Hubicka > > > Cc: Vladimir Makarov; gcc-patches@gcc.gnu.org; hjl.to...@gmail.com; > > > ubiz...@gmail.com; r...@redhat.com; > > ganesh.gopalasubraman...@amd.com > > > Subject: Re: Honnor ix86_accumulate_outgoing_args again > > > > > > Hi, > > > this patch makes ACCUMULATE_OUTGOING_ARGS to disable itself when > > function is cold. I did some extra testing and to my amusement we now > > seem to output more compact unwind info when > > ACCUMULATE_OUTGOING_ARGS is disabled, so this seems quite consistent > > code size win. > > > > > > We actually can do better and enable ACCUMULATE_OUTGOING_ARGS > > only when function contains hot calls. This should also avoid need for > > frame > > allocation in prologue/epilogu
Re: [wide-int] int_traits
On 10/19/2013 10:18 AM, Richard Sandiford wrote: Kenneth Zadeck writes: On 10/19/2013 05:01 AM, Richard Sandiford wrote: Mike Stump writes: + // We optimize x < y, where y is 64 or fewer bits. + // We have to be careful to not allow comparison to a large positive + // unsigned value like 0x8000, those would be encoded + // with a y.len == 2. + if (y.precision <= HOST_BITS_PER_WIDE_INT + && y.len == 1) I don't get this. If y.precision <= HOST_BITS_PER_WIDE_INT then y.len must be 1. I realise that tree constants can be stored with TREE_INT_CST_NUNITS > TYPE_PRECISION / HOST_BITS_PER_WIDE_INT (so that extensions beyond TYPE_PRECISION are free). But the wide-int code is shielded from that by the ::decompose routine. A wide int never has len > precision / HOST_BITS_PER_WIDE_INT. Thanks, Richard I think that part of this is that neither mike or i really understand how this stuff works anymore. in the old version where we had precision 0, we would wait to canonicalize a constant or a simple integer until we saw what the precision of the other operand was. That was what precison 0 meant. it was kind of like what you are now proposing with this new trait, but for the reason that we actually did not know what to do than some concern about escapement. What i do not understand is what happens what do you get when you bring in an integer variable that is an unsigned HOST_WIDE_INT with the top bit set. In the precision 0 days, if the prec of the other side was 64 or less, the incoming integer took 1 hwi and if the precision was larger, it took two hwis. The canonicalization happened late enough so that there was never a question. Ah, I think I know what you mean. The original implementation was: template static inline const HOST_WIDE_INT* to_shwi1 (HOST_WIDE_INT *s, unsigned int *l, unsigned int *p, const T& x) { s[0] = x; if (signedp(x) || sizeof (T) < sizeof (HOST_WIDE_INT) || ! top_bit_set (x)) { *l = 1; } else { s[1] = 0; *l = 2; } *p = 0; return s; } void wide_int_ro::check_precision (unsigned int *p1, unsigned int *p2, bool check_equal ATTRIBUTE_UNUSED, bool check_zero ATTRIBUTE_UNUSED) { gcc_checking_assert ((!check_zero) || *p1 != 0 || *p2 != 0); if (*p1 == 0) *p1 = *p2; if (*p2 == 0) *p2 = *p1; gcc_checking_assert ((!check_equal) || *p1 == *p2); } /* Return true if C1 < C2 using signed comparisons. */ template static inline bool lts_p (const T1 &c1, const T2 &c2) { bool result; HOST_WIDE_INT ws1[WIDE_INT_MAX_ELTS]; HOST_WIDE_INT ws2[WIDE_INT_MAX_ELTS]; const HOST_WIDE_INT *s1, *s2; /* Returned data */ unsigned int cl1, cl2; /* array lengths */ unsigned int p1, p2; /* precisions */ s1 = to_shwi1 (ws1, &cl1, &p1, c1); s2 = to_shwi1 (ws2, &cl2, &p2, c2); check_precision (&p1, &p2, false, true); if (p1 <= HOST_BITS_PER_WIDE_INT && p2 <= HOST_BITS_PER_WIDE_INT) { HOST_WIDE_INT x0 = sext_hwi (s1[0], p1); HOST_WIDE_INT x1 = sext_hwi (s2[0], p2); result = x0 < x1; } else result = lts_p_large (s1, cl1, p1, s2, cl2, p2); #ifdef DEBUG_WIDE_INT debug_vaa ("wide_int_ro:: %d = (%s lts_p %s\n", result, s1, cl1, p1, s2, cl2, p2); #endif return result; } you need to be careful about asserting too much from the old code. the time line was: 1) we developed the stuff on x86-64 2) you did your patch 3) we ported everything to ppc and our private port. i really only became very sensitive to this issue during step 3 because the x86 does not exhibit these bugs. So if you had a 128-bit wide_int and T == unsigned HOST_WIDE_INT, this lts_p would zero-extend the unsigned HOST_WIDE_INT to 128 bits and then do a signed comparison. The thing here is that the "check_equal" argument is false. So if instead you were comparing a 128-bit wide_int with a 64-bit tree constant, lts_p would treat that tree constant as a signed 64-bit number, even if it was TYPE_UNSIGNED. Similarly if you were comparing a 128-bit tree constant and a 64-bit tree constant. You also allowed a comparison of a 128-bit wide_int with a 64-bit rtx, again treating the 64-bit rtx as signed. I do not think that this is what check_equal meant because the 0 precision was a wild card. The 0 precision allowed the values to come in from simple vars and constants and be converted on the fly. However, as i said above, i am not sure how well this worked since the testing for wide stuff was so thin. So when doing the wi:: conversion, I'd interpreted the desired semantics for lts_p as being "treat both inputs as signed without extending them", since that's what the function did in most cases. It seemed inconsistent t
Re: [wide-int] Go back to having undefined exccess bits on read
On 10/19/2013 02:41 PM, Mike Stump wrote: On Oct 19, 2013, at 7:22 AM, Richard Sandiford wrote: As discussed, this patch effectively goes back to your original idea of having excess upper bits in a HWI being undefined on read (at least as the default assumption). wide_int itself still ensures that the excess bits are stored as signs though. Or we could decide that it isn't worth the hassle and just leave excess upper bits as undefined on write too, which really is going back to your original model. :-) I don't get it. If the bits are undefined upon read, then, they should be undefined upon write. It doesn't make any sense to me to have them be undefined upon read and defined upon write. To me, one describes the semantics of the data, then once that is done, all else falls out from that. If no one is allowed to deviate their behavior upon a bit (that bit is don't care), then, trivially reads need to zap it out, and writes can write anything into it. no, they should not be undefined on write.if they are undefined on write, then it makes tree_fits* (the successor functions from integer_hostp) much more complex. kenny
Re: [C PATCH] Warn for _Alignas in an array declarator (PR c/58267)
On Wed, 16 Oct 2013, Marek Polacek wrote: > @@ -2946,7 +2957,8 @@ c_parser_declarator (c_parser *parser, b >struct c_declspecs *quals_attrs = build_null_declspecs (); >struct c_declarator *inner; >c_parser_consume_token (parser); > - c_parser_declspecs (parser, quals_attrs, false, false, true, > cla_prefer_id); > + c_parser_declspecs (parser, quals_attrs, false, false, true, > + true, cla_prefer_id); >inner = c_parser_declarator (parser, type_seen_p, kind, seen_id); >if (inner == NULL) > return NULL; Looking at this again, shouldn't the new argument be "false" (with associated testcase)? This is parsing pointer declarators, and _Alignas isn't allowed there any more than it is in array declarators > @@ -3715,7 +3730,8 @@ c_parser_type_name (c_parser *parser) >struct c_declarator *declarator; >struct c_type_name *ret; >bool dummy = false; > - c_parser_declspecs (parser, specs, false, true, true, cla_prefer_type); > + c_parser_declspecs (parser, specs, false, true, true, false, > + cla_prefer_type); >if (!specs->declspecs_seen_p) > { >c_parser_error (parser, "expected specifier-qualifier-list"); And this should get a testcase added, that _Alignas is correctly rejected in type names where previously it would have been wrongly accepted. (Strictly by the standard it should be "false" in c_parser_struct_declaration as well - the syntax there doesn't allow _Alignas - but it appears to have been intended to allow it there, so probably best not to change anything there until WG14 reaches a conclusion on the issues I raised in N1731.) -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH] Fix PR54702 and LTO bootstrap (for me)
On Oct 16, 2013, at 1:11 AM, Richard Biener wrote: > I suppose it's fine to move the testcase to the C++ lto testsuite > if you verify it will use the C compiler for the C parts. Thanks. Testing pointed out one additional feature lto testing need to make thing whole thing go. Committed. 2013-10-19 Mike Stump * g++.dg/lto/lto.exp: Add support for C/C++ mix language testing. * gcc.dg/lto/pr54625-1_0.c: Move from here... * g++.dg/lto/pr54625-1_0.c: ... to here. * gcc.dg/lto/pr54625-1_1.C: Likewise. * g++.dg/lto/pr54625-1_1.C: Likewise. * gcc.dg/lto/pr54625-2_0.c: Likewise. * g++.dg/lto/pr54625-2_0.c: Likewise. * gcc.dg/lto/pr54625-2_1.C: Likewise. * g++.dg/lto/pr54625-2_1.C: Likewise. Index: g++.dg/lto/lto.exp === --- g++.dg/lto/lto.exp (revision 203827) +++ g++.dg/lto/lto.exp (working copy) @@ -48,7 +48,7 @@ if { ![check_effective_target_lto] } { } # Main loop. -foreach src [lsort [find $srcdir/$subdir *_0.C]] { +foreach src [lsort [find $srcdir/$subdir *_0.\[cC\]]] { # If we're only testing specific files and this isn't one of them, skip it. if ![runtest_file_p $runtests $src] then { continue
Re: [Patch] Change default executor to DFS in regex
On 10/19/2013 07:51 PM, Tim Shen wrote: On Sat, Oct 19, 2013 at 4:03 AM, Paolo Carlini wrote: About the < 2, in general hardcoding a parameter value in the code isn't a nice idea. Why don't we take it out to a macro, say _GLIBCXX_REGEX_NFA_QUANTIFIERS_LIMIT? In stl_deque.h we have something similar and in the present case it would be even safe from the ABI point of view, if I'm not mistaken. I here use a const static local variable to hide it from other parts. Yes, but giving it a name doesn't buy us much wrt the issue I pointed out. For comparison, in similar cases, the compiler driver has --params which the user can fine tune on the command line. The best approximation we have got in the library - for the time being at least, in principle the driver could also forward parameters to the library - is a macro, which is defined and documented in a comment a few lines earlier and then is possibly used to initialize a const static. Again, see the stl_deque.h example. I can't imagine any other simple (and conforming! eg, no additional template parms) solution. Thanks, Paolo.
Re: [wide-int] Go back to having undefined exccess bits on read
On Oct 19, 2013, at 7:22 AM, Richard Sandiford wrote: > As discussed, this patch effectively goes back to your original idea of > having excess upper bits in a HWI being undefined on read (at least as > the default assumption). wide_int itself still ensures that the excess > bits are stored as signs though. > Or we could decide that it isn't worth the hassle and just leave excess > upper bits as undefined on write too, which really is going back to your > original model. :-) I don't get it. If the bits are undefined upon read, then, they should be undefined upon write. It doesn't make any sense to me to have them be undefined upon read and defined upon write. To me, one describes the semantics of the data, then once that is done, all else falls out from that. If no one is allowed to deviate their behavior upon a bit (that bit is don't care), then, trivially reads need to zap it out, and writes can write anything into it.
Re: [wide-int] int_traits
On Oct 19, 2013, at 7:18 AM, Richard Sandiford wrote: > So I think Mike's original patch was right and that this extra > "y.len == 1" check is redundant. So, gcc's optimizer agrees with your view, it is redundant. :-) I checked this in: Index: wide-int.h === --- wide-int.h (revision 203860) +++ wide-int.h (working copy) @@ -1464,11 +1464,7 @@ inline bool wi::lts_p (const wide_int_ref &x, const wide_int_ref &y) { // We optimize x < y, where y is 64 or fewer bits. - // We have to be careful to not allow comparison to a large positive - // unsigned value like 0x8000, those would be encoded - // with a y.len == 2. - if (y.precision <= HOST_BITS_PER_WIDE_INT - && y.len == 1) + if (y.precision <= HOST_BITS_PER_WIDE_INT) { // If x fits directly into a shwi, we can compare directly. if (wi::fits_shwi_p (x))
Re: Help with mode switching (create_pre_exit)
Hello! > I'm fighting with mode switching (to be more precise with > create_pre_exit function) trying to make it work for MPX. I saw > create_pre_exit had some stability issues before and now I'm facing > similar issues trying to have it working when bound register is > returned by function in addition to GPR. > > The more I look into create_pre_exit code, the more I think it makes > some assumptions which are wrong. Also it is very sensitive to the > code structure and simple changes in exit block crashes pre-exit block > creation. And in some cases it seems to me function does not fail by > accident. [...] > create_pre_exit code assumes value is returned in a single register dx > which is wrong from the beginning. Then it does not realize set to dx > and ax are copy of returned value, finally reaches the end of BB and > splits there. It does not crash assuming it is a case with EH where > there are no return value copy in exit block. > > If we exchange ax and dx usages then create_pre_exit will fail. If we > have value returned on more registers and thus will have more usages > then create_pre_exit will fail. This should be fixed in [1]. > There is an additional hard reg used for returned value and now > returned value is stored in non subsequent registers. I think that to > successfully cover all cases here, function_value target hook should > be used to determine returned hard regs, rather than use the last > 'use' insn to determine required hard regs. Does it sound reasonable? > Am I missing something here? You should add BND register REGNO(s) used to return bounds to ix86_function_value_regno_p. These additional registers will be detected as valid mult-register outputs in create_pre_exit, and this will bypass the picky assert. Please also see patch at [2]. [1] http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01606.html [2] http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01602.html Uros.
Re: [Patch] Change default executor to DFS in regex
On Sat, Oct 19, 2013 at 4:03 AM, Paolo Carlini wrote: > About the < 2, in general hardcoding a parameter value in the code isn't a > nice idea. Why don't we take it out to a macro, say > _GLIBCXX_REGEX_NFA_QUANTIFIERS_LIMIT? In stl_deque.h we have something > similar and in the present case it would be even safe from the ABI point of > view, if I'm not mistaken. I here use a const static local variable to hide it from other parts. > Finally, you mentioned the DFA optimization: I would like to see some > details about it. Would it be a big win in all cases? Before we freeze the > next ABI we should keep open all the possibilities... It'll be great help to implement partial DFA compilation for all cases, though DFA cannot handle back-references neither. I need more time to investigate. Thanks! -- Tim Shen a.patch Description: Binary data
Re: [wide-int] int_traits
On Oct 19, 2013, at 5:20 AM, "Bernhard Reutner-Fischer" wrote: >> WIDE? > > s/positve/positive/g Thanks. Here is what I put in: Index: wide-int.h === --- wide-int.h (revision 203835) +++ wide-int.h (working copy) @@ -185,7 +185,7 @@ along with GCC; see the file COPYING3. assuming t is a int_cst. - Note, the bits past the precision up to the nearest HOST_WDE_INT + Note, the bits past the precision up to the nearest HOST_WIDE_INT boundary are defined to be copies of the top bit of the value, however the bits above those defined bits not defined and the algorithms used here are careful not to depend on their value. In @@ -1477,7 +1477,7 @@ wi::lts_p (const wide_int_ref &x, const // negative than any value in y, and hence smaller than y. if (neg_p (x, SIGNED)) return true; - // If x is positve, then it must be larger than any value in y, + // If x is positive, then it must be larger than any value in y, // and hence greater than y. return false; }
[wide-int] Go back to having undefined exccess bits on read
Hi Kenny, As discussed, this patch effectively goes back to your original idea of having excess upper bits in a HWI being undefined on read (at least as the default assumption). wide_int itself still ensures that the excess bits are stored as signs though. This patch is already quite big, so I've left ::is_sign_extended itself to a follow-on patch. I've also not done anything wrt the scratch array. More to follow soon, if this patch is OK. The easiest way of maintaining the excess bits as signs seemed to be to have wide_int_storage::set_len do the sign extension. This means that we do it in one place and also means that the *_large routines don't waste time doing it for addr_wide_int and max_wide_int, which don't have excess bits. The current code took advantage of the fact that operations like AND and OR on two sign-extended inputs also produce a sign-extended result. We no longer know without ::is_sign_extended whether the inputs are sign-extended though, so there are very few places that can guarantee that the final extension is redundant. We can look at optimising extension away later in cases where operations like AND and OR are applied to ::is_sign_extended inputs, but that'd need the follow-on patch above. Or we could decide that it isn't worth the hassle and just leave excess upper bits as undefined on write too, which really is going back to your original model. :-) I've used to_shwi and to_uhwi instead of slow and ulow in cases where we need the upper bits to be signs or zeros respectively. There were several places that still assumed undefined upper bits, so the patch isn't as big as it could have been. I got rid of a couple of signed right shifts while there (and a test function at the end of wide-int.cc that I'd accidentally committed at one point -- oops). OK to install? Thanks, Richard Index: gcc/tree.h === --- gcc/tree.h 2013-10-19 09:54:45.504387762 +0100 +++ gcc/tree.h 2013-10-19 09:54:59.671508531 +0100 @@ -5175,7 +5175,7 @@ wi::int_traits ::get_precisi /* Convert the tree_cst X into a wide_int of PRECISION. */ inline wi::storage_ref -wi::int_traits ::decompose (HOST_WIDE_INT *scratch, +wi::int_traits ::decompose (HOST_WIDE_INT *, unsigned int precision, const_tree x) { unsigned int len = TREE_INT_CST_NUNITS (x); @@ -5186,32 +5186,14 @@ wi::int_traits ::decompose ( gcc_assert (precision >= xprecision); - /* Got to be careful of precision 0 values. */ - if (precision) -len = MIN (len, max_len); - if (TYPE_SIGN (TREE_TYPE (x)) == UNSIGNED) -{ - unsigned int small_prec = precision & (HOST_BITS_PER_WIDE_INT - 1); - if (small_prec) - { - /* We have to futz with this because the canonization for -short unsigned numbers in wide-int is different from the -canonized short unsigned numbers in the tree-cst. */ - if (len == max_len) - { - for (unsigned int i = 0; i < len - 1; i++) - scratch[i] = val[i]; - scratch[len - 1] = sext_hwi (val[len - 1], precision); - return wi::storage_ref (scratch, len, precision); - } - } - /* We have to futz here because a large unsigned int with -precision 128 may look (0x0 0x 0xF...) as a -tree-cst and as (0xF...) as a wide-int. */ - else if (precision == xprecision && len == max_len) -while (len > 1 && val[len - 1] == (HOST_WIDE_INT)-1) - len--; -} + /* If an unsigned constant occupies a whole number of HWIs and has the + upper bit set, its representation includes an extra zero HWI, + so that the representation can be used for wider precisions. + Trim the length if we're accessing the tree in its own precision. */ + if (__builtin_expect (len > max_len, 0)) +do + len--; +while (len > 1 && val[len - 1] == -1); /* Signed and the rest of the unsigned cases are easy. */ return wi::storage_ref (val, len, precision); Index: gcc/wide-int.cc === --- gcc/wide-int.cc 2013-10-15 18:33:44.789339673 +0100 +++ gcc/wide-int.cc 2013-10-19 13:08:12.180902396 +0100 @@ -46,7 +46,7 @@ static const HOST_WIDE_INT zeros[WIDE_IN #define BLOCK_OF(TARGET) ((TARGET) / HOST_BITS_PER_WIDE_INT) #define BLOCKS_NEEDED(PREC) \ (PREC ? (((PREC) + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT) : 1) -#define SIGN_MASK(X) (((HOST_WIDE_INT)X) >> (HOST_BITS_PER_WIDE_INT - 1)) +#define SIGN_MASK(X) ((HOST_WIDE_INT) (X) < 0 ? -1 : 0) /* Return the value a VAL[I] if I < LEN, otherwise, return 0 or -1 based on the top existing bit of VAL. */ @@ -65,7 +65,6 @@ safe_uhwi (const HOST_WIDE_INT *val, uns static unsigned int canonize (HOST_WIDE_INT *val, unsigned int len, unsigned int precision) { - unsigned int small_prec = precision & (HOST_B
Re: [wide-int] int_traits
Kenneth Zadeck writes: > On 10/19/2013 05:01 AM, Richard Sandiford wrote: >> Mike Stump writes: >>> + // We optimize x < y, where y is 64 or fewer bits. >>> + // We have to be careful to not allow comparison to a large positive >>> + // unsigned value like 0x8000, those would be encoded >>> + // with a y.len == 2. >>> + if (y.precision <= HOST_BITS_PER_WIDE_INT >>> + && y.len == 1) >> I don't get this. If y.precision <= HOST_BITS_PER_WIDE_INT then >> y.len must be 1. I realise that tree constants can be stored with >> TREE_INT_CST_NUNITS > TYPE_PRECISION / HOST_BITS_PER_WIDE_INT >> (so that extensions beyond TYPE_PRECISION are free). But the >> wide-int code is shielded from that by the ::decompose routine. >> A wide int never has len > precision / HOST_BITS_PER_WIDE_INT. >> >> Thanks, >> Richard > I think that part of this is that neither mike or i really understand > how this stuff works anymore. > > in the old version where we had precision 0, we would wait to > canonicalize a constant or a simple integer until we saw what the > precision of the other operand was. That was what precison 0 meant. > it was kind of like what you are now proposing with this new trait, but > for the reason that we actually did not know what to do than some > concern about escapement. > > What i do not understand is what happens what do you get when you bring > in an integer variable that is an unsigned HOST_WIDE_INT with the top > bit set. In the precision 0 days, if the prec of the other side was 64 > or less, the incoming integer took 1 hwi and if the precision was > larger, it took two hwis. The canonicalization happened late enough so > that there was never a question. Ah, I think I know what you mean. The original implementation was: template static inline const HOST_WIDE_INT* to_shwi1 (HOST_WIDE_INT *s, unsigned int *l, unsigned int *p, const T& x) { s[0] = x; if (signedp(x) || sizeof (T) < sizeof (HOST_WIDE_INT) || ! top_bit_set (x)) { *l = 1; } else { s[1] = 0; *l = 2; } *p = 0; return s; } void wide_int_ro::check_precision (unsigned int *p1, unsigned int *p2, bool check_equal ATTRIBUTE_UNUSED, bool check_zero ATTRIBUTE_UNUSED) { gcc_checking_assert ((!check_zero) || *p1 != 0 || *p2 != 0); if (*p1 == 0) *p1 = *p2; if (*p2 == 0) *p2 = *p1; gcc_checking_assert ((!check_equal) || *p1 == *p2); } /* Return true if C1 < C2 using signed comparisons. */ template static inline bool lts_p (const T1 &c1, const T2 &c2) { bool result; HOST_WIDE_INT ws1[WIDE_INT_MAX_ELTS]; HOST_WIDE_INT ws2[WIDE_INT_MAX_ELTS]; const HOST_WIDE_INT *s1, *s2; /* Returned data */ unsigned int cl1, cl2; /* array lengths */ unsigned int p1, p2; /* precisions */ s1 = to_shwi1 (ws1, &cl1, &p1, c1); s2 = to_shwi1 (ws2, &cl2, &p2, c2); check_precision (&p1, &p2, false, true); if (p1 <= HOST_BITS_PER_WIDE_INT && p2 <= HOST_BITS_PER_WIDE_INT) { HOST_WIDE_INT x0 = sext_hwi (s1[0], p1); HOST_WIDE_INT x1 = sext_hwi (s2[0], p2); result = x0 < x1; } else result = lts_p_large (s1, cl1, p1, s2, cl2, p2); #ifdef DEBUG_WIDE_INT debug_vaa ("wide_int_ro:: %d = (%s lts_p %s\n", result, s1, cl1, p1, s2, cl2, p2); #endif return result; } So if you had a 128-bit wide_int and T == unsigned HOST_WIDE_INT, this lts_p would zero-extend the unsigned HOST_WIDE_INT to 128 bits and then do a signed comparison. The thing here is that the "check_equal" argument is false. So if instead you were comparing a 128-bit wide_int with a 64-bit tree constant, lts_p would treat that tree constant as a signed 64-bit number, even if it was TYPE_UNSIGNED. Similarly if you were comparing a 128-bit tree constant and a 64-bit tree constant. You also allowed a comparison of a 128-bit wide_int with a 64-bit rtx, again treating the 64-bit rtx as signed. So when doing the wi:: conversion, I'd interpreted the desired semantics for lts_p as being "treat both inputs as signed without extending them", since that's what the function did in most cases. It seemed inconsistent to treat a 64-bit unsigned primitive integer differently from a 64-bit unsigned tree constant. So at the moment, it doesn't matter whether any HOST_WIDE_INT input to lts_p is signed or unsigned, just like it didn't and doesn't matter whether any tree input is signed or unsigned. If instead we want lts_p to operate to a single unified precision, like eq_p did: s1 = to_shwi1 (ws1, &cl1, &p1, c1); s2 = to_shwi1 (ws2, &cl2, &p2, c2); check_precision (&p1, &p2, true, false); and still does, then that's easy enough to change. Then all extendable inputs will be extended according to their natural signedness and then compared sig
[SH, committed] Fix test pr54089-3.c
Hello, The test case pr54089-3.c started to fail a while ago, because of the broken test for load of constant 31. Committed as obvious to trunk and 4.8 branch. Cheers, Oleg testsuite/ChangeLog: * gcc.target/sh/pr54089-3.c: Fix test for load of constant 31. Index: gcc/testsuite/gcc.target/sh/pr54089-3.c === --- gcc/testsuite/gcc.target/sh/pr54089-3.c (revision 203000) +++ gcc/testsuite/gcc.target/sh/pr54089-3.c (working copy) @@ -5,7 +5,7 @@ /* { dg-options "-O1" } */ /* { dg-skip-if "" { "sh*-*-*" } { "*" } { "-m1*" "-m2" "-m2e*" } } */ /* { dg-final { scan-assembler-not "and" } } */ -/* { dg-final { scan-assembler-not "31" } } */ +/* { dg-final { scan-assembler-not "#31" } } */ int test00 (unsigned int a, int* b, int c, int* d, unsigned int e)
Re: [PATCH, i386]: Fix PR 58792 by adding missing registers to ix86_function_value_regno_p
On Sat, Oct 19, 2013 at 2:55 PM, Uros Bizjak wrote: > > The list of possible return registers in ix86_function_value_regno_p > is incomplete. 32bit and 64bit targets use %[er]dx register for > DImode and TImode values respectively and %st(1) and %xmm1 for > imaginary part of complex values. Additionally, 64bit SYSV targets use > %rdi and %rsi registers to pass structures (as declared in > x86_64_int_return_registers array). > > function_value_regno_p is mainly used in (obsolete) __builtin_apply > and __builtin_return builtins, but it is also used in mode-switching > pass to check if regno is OK as multiple register value return. > > Attached patch fixes PR 58792 by adding missing registers to > function_value_regno_p. > > 2013-10-19 Uros Bizjak > > PR target/58792 > * config/i386/i386.c (ix86_function_value_regno): Add DX_REG, > ST1_REG and XMM1_REG for 32bit and 64bit targets. Also add DI_REG > and SI_REG for 64bit SYSV ABI targets. > > The patch was tested on x86_64-pc-linux-gnu {,-m32}, also with > '--with-arch=core-avx-i --with-cpu=core-avx-i --with-fpmath=avx' > configured compiler for all default languages, plus obj-c++ and go. > > I plan to commit the patch to all release branches after a week in mainline. > > Patch was committed to mainline, so automatic SPEC testers will pick > it for additional testing. Oh, I forgot to say that I tried to construct a C testcase to be put in gcc.target/i386, but I was not able to trigger the assert. Since the problem turned out to be quite obvious and with a straightforward fix, I gave up on the testcase. Uros.
[PATCH, i386]: Fix PR 58792 by adding missing registers to ix86_function_value_regno_p
Hello! The list of possible return registers in ix86_function_value_regno_p is incomplete. 32bit and 64bit targets use %[er]dx register for DImode and TImode values respectively and %st(1) and %xmm1 for imaginary part of complex values. Additionally, 64bit SYSV targets use %rdi and %rsi registers to pass structures (as declared in x86_64_int_return_registers array). function_value_regno_p is mainly used in (obsolete) __builtin_apply and __builtin_return builtins, but it is also used in mode-switching pass to check if regno is OK as multiple register value return. Attached patch fixes PR 58792 by adding missing registers to function_value_regno_p. 2013-10-19 Uros Bizjak PR target/58792 * config/i386/i386.c (ix86_function_value_regno): Add DX_REG, ST1_REG and XMM1_REG for 32bit and 64bit targets. Also add DI_REG and SI_REG for 64bit SYSV ABI targets. The patch was tested on x86_64-pc-linux-gnu {,-m32}, also with '--with-arch=core-avx-i --with-cpu=core-avx-i --with-fpmath=avx' configured compiler for all default languages, plus obj-c++ and go. I plan to commit the patch to all release branches after a week in mainline. Patch was committed to mainline, so automatic SPEC testers will pick it for additional testing. Uros. Index: config/i386/i386.c === --- config/i386/i386.c (revision 203845) +++ config/i386/i386.c (working copy) @@ -7406,9 +7406,15 @@ ix86_function_value_regno_p (const unsigned int re switch (regno) { case AX_REG: +case DX_REG: return true; +case DI_REG: +case SI_REG: + return TARGET_64BIT && ix86_abi != MS_ABI; -case FIRST_FLOAT_REG: + /* Complex values are returned in %st(0)/%st(1) pair. */ +case ST0_REG: +case ST1_REG: /* TODO: The function should depend on current function ABI but builtins.c would need updating then. Therefore we use the default ABI. */ @@ -7416,10 +7422,12 @@ ix86_function_value_regno_p (const unsigned int re return false; return TARGET_FLOAT_RETURNS_IN_80387; -case FIRST_SSE_REG: + /* Complex values are returned in %xmm0/%xmm1 pair. */ +case XMM0_REG: +case XMM1_REG: return TARGET_SSE; -case FIRST_MMX_REG: +case MM0_REG: if (TARGET_MACHO || TARGET_64BIT) return false; return TARGET_MMX;
Re: [wide-int] int_traits
On 10/19/2013 05:01 AM, Richard Sandiford wrote: Mike Stump writes: + // We optimize x < y, where y is 64 or fewer bits. + // We have to be careful to not allow comparison to a large positive + // unsigned value like 0x8000, those would be encoded + // with a y.len == 2. + if (y.precision <= HOST_BITS_PER_WIDE_INT + && y.len == 1) I don't get this. If y.precision <= HOST_BITS_PER_WIDE_INT then y.len must be 1. I realise that tree constants can be stored with TREE_INT_CST_NUNITS > TYPE_PRECISION / HOST_BITS_PER_WIDE_INT (so that extensions beyond TYPE_PRECISION are free). But the wide-int code is shielded from that by the ::decompose routine. A wide int never has len > precision / HOST_BITS_PER_WIDE_INT. Thanks, Richard I think that part of this is that neither mike or i really understand how this stuff works anymore. in the old version where we had precision 0, we would wait to canonicalize a constant or a simple integer until we saw what the precision of the other operand was. That was what precison 0 meant. it was kind of like what you are now proposing with this new trait, but for the reason that we actually did not know what to do than some concern about escapement. What i do not understand is what happens what do you get when you bring in an integer variable that is an unsigned HOST_WIDE_INT with the top bit set. In the precision 0 days, if the prec of the other side was 64 or less, the incoming integer took 1 hwi and if the precision was larger, it took two hwis. The canonicalization happened late enough so that there was never a question. Here we are trying to do this at compile time to avoid the escape. This is why my emails about this have continued to talk about the unsigned HOST_WIDE_INT as a boundary case.It is clear, that if the value is a constant, then you should be able to see at compile time if the top bit is set, but if the value is a simple integer variable, you should still be able to do the non escaping test as long as the type is signed HOST_WIDE_INT or smaller. I think that mike certainly has not captured this yet. But those are the issues as i see them.
RE: FW: MAX_PATH problems with mingw gcc
> -Original Message- > From: Joey Ye [mailto:joey.ye...@gmail.com] > Sent: Friday, October 18, 2013 6:37 AM > > I do encourage you to upstream this patch, though I'm not the maintainer of > libiberty to approve it. OK. Thank you. Here it is. Could you push it to appropriate people? Attaching test for it also. To do tests: - copy patched filename_cmp.c and filenames.h to new dir - touch safe-ctype.h - create hashtab.h with typedef int size_t; typedef int hashval_t; #define TOLOWER(c) (c) This is enough for me to test it on cygwin32. It may be that other platforms will require another fake includes. - gcc -O2 filename_cmp.c main.c -o test1 Best regards Vladimir gcc-4.8.1-filename-normalize-2.patch Description: gcc-4.8.1-filename-normalize-2.patch #include "filenames.h" char *tests[] = { "test", /* 0 */ "test", "./test", /* 1 */ "./test", "/./test", /* 2 */ "/test", ".\\test", /* 3 */ "./test", "\\.\\test", /* 4 */ "/test", "C", /* 5 */ "C", ".", /* 6 */ ".", "", /* 7 */ "", "/../test", /* 8 */ "/test", "C:/test/dir1/dir2", /* 9 */ "C:/test/dir1/dir2", "/test/./dir2", /* 10 */ "/test/dir2", "/test/././dir2", /* 11 */ "/test/dir2", "/test/././/dir2", /* 12 */ "/test/dir2", "/test/../dir2", /* 13 */ "/dir2", "C:/test/dir1/dir2/../dir3", /* 14 */ "C:/test/dir1/dir3", "/test/dir1/dir2/../../dir3", /* 15 */ "/test/dir3", "/test/dir1/dir2/../../../dir3", /* 16 */ "/dir3", "/test/dir1/dir2/../../../../dir3", /* 17 */ "/dir3", "C:/test/dir1/dir2/../../../../aaa/../../././././dir3", /* 18 */ "C:/dir3", "/../..//././././dir3", /* 19 */ "/dir3", "\\..\\test", /* 20 */ "/test", "C:\\test\\dir1\\dir2", /* 21 */ "C:/test/dir1/dir2", "\\test\\.\\dir2", /* 22 */ "/test/dir2", "\\test\\.\\.\\dir2", /* 23 */ "/test/dir2", "\\test\\.\\.dir2", /* 24 */ "/test/dir2", "\\test\\..\\dir2", /* 25 */ "/dir2", "C:\\test\\dir1\\dir2\\..\\dir3", /* 26 */ "C:/test/dir1/dir3", "\\test\\dir1\\dir2\\..\\..\\dir3", /* 27 */ "/test/dir3", "\\test\\dir1\\dir2\\..\\..\\..\\dir3", /* 28 */ "/dir3", "\\test\\dir1\\dir2\\..\\..\\..\\..\\dir3", /* 29 */ "/dir3", "\\test\\dir1\\dir2\\..\\..\\..\\..\\aaa\\..\\..\\..\\.\\.\\dir3", /* 30 */ "/dir3", "C:\\..\\...\\.\\.\\.\\dir3", /* 31 */ "C:/dir3", "../test", /* 32 */ "../test", "C:test/dir1/dir2", /* 33 */ "C:test/dir1/dir2", "test/./dir2", /* 34 */ "test/dir2", "test/././dir2", /* 35 */ "test/dir2", "test/././/dir2", /* 36 */ "test/dir2", "test/../dir2", /* 37 */ "dir2", "C:test/dir1/dir2/../dir3", /* 38 */ "C:test/dir1/dir3", "test/dir1/dir2/../../dir3", /* 39 */ "test/dir3", "C:test/dir1/dir2/../../../dir3", /* 40 */ "C:dir3", "test/dir1/dir2/../../../../dir3", /* 41 */ "../dir3", "test/dir1/dir2/../../../../aaa/../../././././dir3", /* 42 */ "../../dir3", "../..//././././dir3", /* 43 */ "../../dir3", "../../../../dir3/aaa", /* 44 */ "../../../../dir3/aaa", "..\\test", /* 45 */ "../test", "C:test\\dir1\\dir2", /* 46 */ "C:test/dir1/dir2", "test\\.\\dir2", /* 47 */ "test/dir2", "test\\.\\.\\dir2", /* 48 */ "test/dir2", "test\\.\\.dir2", /* 49 */ "test/dir2", "test\\..\\dir2", /* 50 */ "dir2", "C:test\\dir1\\dir2\\..\\dir3", /* 51 */ "C:test/dir1/dir3", "test\\dir1\\dir2\\..\\..\\dir3", /* 52 */ "test/dir3", "C:test\\dir1\\dir2\\..\\..\\..\\dir3", /* 53 */ "C:dir3", "test\\dir1\\dir2\\..\\..\\..\\..\\dir3", /* 54 */ "../dir3", "test\\dir1\\dir2\\..\\..\\..\\..\\aaa\\..\\..\\.\\.\\.\\.\\dir3", /* 55 */ "../../dir3", "..\\...\\.\\.\\.\\dir3", /* 56 */ "../../dir3", "..\\..\\..\\..\\dir3\\aaa", /* 57 */ "../../../../dir3/aaa", "/", /* 58 */ "/", "\\", /* 59 */ "/", "C:test\\dir1\\dir2\\..\\..\\..\\dir3\\", /* 60 */ "C:dir3/", "C:test/dir1/dir2/..\\..\\..\\dir3/", /* 61 */ "C:dir3/", "C:\\test\\dir1\\dir2\\..\\..\\..\\dir3\\", /* 62 */ "C:/dir3/", "C:\\test/dir1/dir2/..\\..\\..\\dir3/", /* 63 */ "C:/dir3/", 0 }; int main(int argc, char **argv) { int i; i = 0; while (tests[i]) { char *p = strdup(tests[i]); filename_normalize (p); if (strcmp(p, tests[i+1])) { printf("Error[%d]: <%s> != <%s>, orig <%s>\n", i/2, p, tests[i+1], tests[i]); } free(p); i += 2; } }
Re: PR libstdc++/58729 - tr2::dynamic_bitset::resize fails
Paolo Carlini wrote: > Patch is Ok with me. Before committing you may want to test -m32 too. Obviously, you did not!-(see http://gcc.gnu.org/ml/gcc-testresults/2013-10/msg01471.html FAIL: tr2/dynamic_bitset/pr58729.cc execution test On x86_64-apple-darwin12, the failure is Assertion failed: (pdb2.count() == 60), function test01, file /opt/gcc/work/libstdc++-v3/testsuite/tr2/dynamic_bitset/pr58729.cc, line 56. TIA Dominique
[PATCH]: Clarify multiple return registers handling in mode-switching.c
Hello! The testcase in PR58792 [1], where a function returns its result in multiple registers, triggered a very picky assert in mode-switching code. The problem was, that x86_64 didn't mark correctly all of its possible return registers, so code assumed that something went wrong here. The mode-switching code already handles __builtin_apply builtin, which also generates multiple return registers. When a sequence of USEs is detected, maybe_builtin_apply flag is raised to bypass the assert. However, functions that pass their result in multiple registers also generate multiple USEs, so the patch clarifies that __builtin_apply situation actually applies to all multiple return register exits. The patch clarifies this situation. No functional changes (modulo converting a couple of variables to boolean type). 2013-10-19 Uros Bizjak * mode-switching.c (create_pre_exit): Rename maybe_builtin_apply to multi_reg_return. Clarify that we are skipping USEs of multiple return registers. Use bool type where appropriate. Tested on x86_64-pc-linux-gnu and committed to mainline SVN. [1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58792 Uros. Index: mode-switching.c === --- mode-switching.c(revision 203845) +++ mode-switching.c(working copy) @@ -229,9 +229,9 @@ create_pre_exit (int n_entities, int *entity_map, int ret_start = REGNO (ret_reg); int nregs = hard_regno_nregs[ret_start][GET_MODE (ret_reg)]; int ret_end = ret_start + nregs; - int short_block = 0; - int maybe_builtin_apply = 0; - int forced_late_switch = 0; + bool short_block = false; + bool multi_reg_return = false; + bool forced_late_switch = false; rtx before_return_copy; do @@ -251,19 +251,20 @@ create_pre_exit (int n_entities, int *entity_map, copy yet, the copy must have been deleted. */ if (CALL_P (return_copy)) { - short_block = 1; + short_block = true; break; } return_copy_pat = PATTERN (return_copy); switch (GET_CODE (return_copy_pat)) { case USE: - /* Skip __builtin_apply pattern. */ + /* Skip return values in multiple registers. + __builtin_apply pattern is also handled here. */ if (GET_CODE (XEXP (return_copy_pat, 0)) == REG && (targetm.calls.function_value_regno_p (REGNO (XEXP (return_copy_pat, 0) { - maybe_builtin_apply = 1; + multi_reg_return = true; last_insn = return_copy; continue; } @@ -326,7 +327,7 @@ create_pre_exit (int n_entities, int *entity_map, there are no return copy insns at all. This avoids an ice on that invalid function. */ if (ret_start + nregs == ret_end) - short_block = 1; + short_block = true; break; } if (!targetm.calls.function_value_regno_p (copy_start)) @@ -354,10 +355,10 @@ create_pre_exit (int n_entities, int *entity_map, another mode than MODE_EXIT, even if it is unrelated to the return value, so we want to put the final mode switch after it. */ - if (maybe_builtin_apply + if (multi_reg_return && targetm.calls.function_value_regno_p (copy_start)) - forced_late_switch = 1; + forced_late_switch = true; /* For the SH4, floating point loads depend on fpscr, thus we might need to put the final mode switch @@ -367,7 +368,7 @@ create_pre_exit (int n_entities, int *entity_map, if (copy_start >= ret_start && copy_start + copy_num <= ret_end && OBJECT_P (SET_SRC (return_copy_pat))) - forced_late_switch = 1; + forced_late_switch = true; break; } if (copy_num == 0) @@ -379,7 +380,7 @@ create_pre_exit (int n_entities, int *entity_map, if (copy_start >= ret_start && copy_start + copy_num <= ret_end) nregs -= copy_num; - else if (!
Re: [wide-int] int_traits
On 18 October 2013 22:55:39 Jakub Jelinek wrote: On Fri, Oct 18, 2013 at 01:52:54PM -0700, Mike Stump wrote: > On Oct 18, 2013, at 6:11 AM, Kenneth Zadeck wrote: > >>> Does this look ok? Kenny, can you double check the cases, think I have them right, but? a double check would be good. > >> That works for me. > >> > i talked to mike about this last night, but did not follow up with an email until now. The problem is that this code is wrong!!! He is working to fix that and so i would expect something from him later (tomorrow for those in europe). > Ok, updated the patch, here is what I checked in: > > diff --git a/gcc/wide-int.h b/gcc/wide-int.h > index 2ff130b..738ae11 100644 > --- a/gcc/wide-int.h > +++ b/gcc/wide-int.h > @@ -185,7 +185,9 @@ along with GCC; see the file COPYING3. If not see > assuming t is a int_cst. > - Note that the bits above the precision are not defined and the > + Note, the bits past the precision up to the nearest HOST_WDE_INT WIDE? s/positve/positive/g falls into the same category, without commenting on the patch itself. Thanks, Jakub Sent with AquaMail for Android http://www.aqua-mail.com
[Ada] Adjust path names in debug info
This ensures the path names in the debug info of the library are consistent with those of libgcc. Tested on x86_64-suse-linux, applied on the mainline. 2013-10-19 Thomas Quinot * gcc-interface/Makefile.in: Use canonical absolute path to refer to the top source directory and to the libgcc subidrectories. -- Eric BotcazouIndex: gcc-interface/Makefile.in === --- gcc-interface/Makefile.in (revision 203844) +++ gcc-interface/Makefile.in (working copy) @@ -167,6 +167,13 @@ tmake_file = @tmake_file@ # Directory where sources are, from where we are. VPATH = $(srcdir)/ada +# Full path to top source directory +# In particular this is used to access libgcc headers, so that references to +# these headers from GNAT runtime objects have path names in debugging info +# that are consistent with libgcc objects. Also used for other references to +# the top source directory for consistency. +ftop_srcdir := $(shell cd $(srcdir)/..;${PWD_COMMAND}) + fsrcdir := $(shell cd $(srcdir);${PWD_COMMAND}) fsrcpfx := $(shell cd $(srcdir);${PWD_COMMAND})/ fcurdir := $(shell ${PWD_COMMAND}) @@ -262,7 +269,7 @@ TOOLS_LIBS = ../link.o ../targext.o ../. # Both . and srcdir are used, in that order, # so that tm.h and config.h will be found in the compilation # subdirectory rather than in the source directory. -INCLUDES = -I- -I. -I.. -I$(srcdir)/ada -I$(srcdir) -I$(srcdir)/../include $(GMPINC) +INCLUDES = -I- -I. -I.. -I$(srcdir)/ada -I$(srcdir) -I$(ftop_srcdir)/include $(GMPINC) ADA_INCLUDES = -I- -I. -I$(srcdir)/ada @@ -272,11 +279,11 @@ ADA_INCLUDES = -I- -I. -I$(srcdir)/ada ifneq ($(findstring vxworks,$(target_os)),) INCLUDES_FOR_SUBDIR = -iquote . -iquote .. -iquote ../.. \ -iquote $(fsrcdir)/ada \ - -I$(fsrcdir)/../include $(GMPINC) + -I$(ftop_srcdir)/include $(GMPINC) else INCLUDES_FOR_SUBDIR = -iquote . -iquote .. -iquote ../.. \ -iquote $(fsrcdir)/ada -iquote $(fsrcdir) \ - -I$(fsrcdir)/../include $(GMPINC) + -I$(ftop_srcdir)/include $(GMPINC) endif ADA_INCLUDES_FOR_SUBDIR = -I. -I$(fsrcdir)/ada @@ -3015,7 +3024,7 @@ vx_stack_info.o : vx_stack_info.c raise-gcc.o : raise-gcc.c raise.h $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ADA_CFLAGS) \ - -iquote $(srcdir) -iquote $(srcdir)/../libgcc \ + -iquote $(srcdir) -iquote $(ftop_srcdir)/libgcc \ $(ALL_CPPFLAGS) $(INCLUDES) $< $(OUTPUT_OPTION) cio.o : cio.c
[Ada] Fix debug info for record type with dynamic-sized field
The compiler may emit an incorrect alignment in the XVE descriptive type associated with a record type containing a dynamic-sized field. Tested on x86_64-suse-linux, applied on the mainline and 4.8 branch. 2013-10-19 Eric Botcazou * gcc-interface/utils.c (scale_by_factor_of): New function. (rest_of_record_type_compilation): Use scale_by_factor_of in order to scale the original offset for both rounding cases; in the second case, take into accout the addend to compute the alignment. Tidy up. -- Eric BotcazouIndex: gcc-interface/utils.c === --- gcc-interface/utils.c (revision 203848) +++ gcc-interface/utils.c (working copy) @@ -232,6 +232,7 @@ static tree compute_related_constant (tr static tree split_plus (tree, tree *); static tree float_type_for_precision (int, enum machine_mode); static tree convert_to_fat_pointer (tree, tree); +static unsigned int scale_by_factor_of (tree, unsigned int); static bool potential_alignment_gap (tree, tree, tree); /* Initialize data structures of the utils.c module. */ @@ -1708,93 +1709,74 @@ rest_of_record_type_compilation (tree re TYPE_SIZE_UNIT (new_record_type) = size_int (TYPE_ALIGN (record_type) / BITS_PER_UNIT); - /* Now scan all the fields, replacing each field with a new - field corresponding to the new encoding. */ + /* Now scan all the fields, replacing each field with a new field + corresponding to the new encoding. */ for (old_field = TYPE_FIELDS (record_type); old_field; old_field = DECL_CHAIN (old_field)) { tree field_type = TREE_TYPE (old_field); tree field_name = DECL_NAME (old_field); - tree new_field; tree curpos = bit_position (old_field); + tree pos, new_field; bool var = false; unsigned int align = 0; - tree pos; - /* See how the position was modified from the last position. - - There are two basic cases we support: a value was added - to the last position or the last position was rounded to - a boundary and they something was added. Check for the - first case first. If not, see if there is any evidence - of rounding. If so, round the last position and try - again. + /* We're going to do some pattern matching below so remove as many + conversions as possible. */ + curpos = remove_conversions (curpos, true); - If this is a union, the position can be taken as zero. */ + /* See how the position was modified from the last position. - /* Some computations depend on the shape of the position expression, - so strip conversions to make sure it's exposed. */ - curpos = remove_conversions (curpos, true); + There are two basic cases we support: a value was added + to the last position or the last position was rounded to + a boundary and they something was added. Check for the + first case first. If not, see if there is any evidence + of rounding. If so, round the last position and retry. + If this is a union, the position can be taken as zero. */ if (TREE_CODE (new_record_type) == UNION_TYPE) - pos = bitsize_zero_node, align = 0; + pos = bitsize_zero_node; else pos = compute_related_constant (curpos, last_pos); - if (!pos && TREE_CODE (curpos) == MULT_EXPR + if (!pos + && TREE_CODE (curpos) == MULT_EXPR && host_integerp (TREE_OPERAND (curpos, 1), 1)) { tree offset = TREE_OPERAND (curpos, 0); align = tree_low_cst (TREE_OPERAND (curpos, 1), 1); - - /* An offset which is a bitwise AND with a mask increases the - alignment according to the number of trailing zeros. */ - offset = remove_conversions (offset, true); - if (TREE_CODE (offset) == BIT_AND_EXPR - && TREE_CODE (TREE_OPERAND (offset, 1)) == INTEGER_CST) - { - unsigned HOST_WIDE_INT mask - = TREE_INT_CST_LOW (TREE_OPERAND (offset, 1)); - unsigned int i; - - for (i = 0; i < HOST_BITS_PER_WIDE_INT; i++) - { - if (mask & 1) - break; - mask >>= 1; - align *= 2; - } - } - - pos = compute_related_constant (curpos, - round_up (last_pos, align)); + align = scale_by_factor_of (offset, align); + last_pos = round_up (last_pos, align); + pos = compute_related_constant (curpos, last_pos); } - else if (!pos && TREE_CODE (curpos) == PLUS_EXPR - && TREE_CODE (TREE_OPERAND (curpos, 1)) == INTEGER_CST + else if (!pos + && TREE_CODE (curpos) == PLUS_EXPR + && host_integerp (TREE_OPERAND (curpos, 1), 1) && TREE_CODE (TREE_OPERAND (curpos, 0)) == MULT_EXPR - && host_integerp (TREE_OPERAND - (TREE_OPERAND (curpos, 0), 1), - 1)) + && host_integerp + (TREE_OPERAND (TREE_OPERAND (curpos, 0), 1), 1)) { + tree offset = TREE_OPERAND (TREE_OPERAND (curpos, 0), 0); + unsigned HOST_WIDE_INT addend + = tree_low_cst (TREE_OPERAND (cur
[Ada] Minor code cleanup
Tested on x86_64-suse-linux, applied on the mainline. 2013-10-19 Eric Botcazou * gcc-interface/cuintp.c: Remove useless include directives. (build_cst_from_int): Use standard predicate. (UI_To_gnu): Simplify. (UI_From_gnu): Fix formatting. * gcc-interface/trans.c (post_error): Likewise. (post_error_ne): Likewise. -- Eric BotcazouIndex: gcc-interface/cuintp.c === --- gcc-interface/cuintp.c (revision 203844) +++ gcc-interface/cuintp.c (working copy) @@ -6,7 +6,7 @@ * * * C Implementation File * * * - * Copyright (C) 1992-2012, Free Software Foundation, Inc. * + * Copyright (C) 1992-2013, Free Software Foundation, Inc. * * * * GNAT is free software; you can redistribute it and/or modify it under * * terms of the GNU General Public License as published by the Free Soft- * @@ -23,8 +23,8 @@ * * / -/* This file corresponds to the Ada package body Uintp. It was created - manually from the files uintp.ads and uintp.adb. */ +/* This file corresponds to the Ada package body Uintp. It was created + manually from the files uintp.ads and uintp.adb. */ #include "config.h" #include "system.h" @@ -35,11 +35,6 @@ #include "ada.h" #include "types.h" #include "uintp.h" -#include "atree.h" -#include "elists.h" -#include "nlists.h" -#include "stringt.h" -#include "fe.h" #include "ada-tree.h" #include "gigi.h" @@ -53,13 +48,13 @@ the integer value itself. The origin of the Uints_Ptr table is adjusted so that a Uint value of Uint_Bias indexes the first element. - First define a utility function that operates like build_int_cst for - integral types and does a conversion to floating-point for real types. */ + First define a utility function that operates like build_int_cst_type for + integral types and does a conversion for floating-point types. */ static tree build_cst_from_int (tree type, HOST_WIDE_INT low) { - if (TREE_CODE (type) == REAL_TYPE) + if (SCALAR_FLOAT_TYPE_P (type)) return convert (type, build_int_cst (NULL_TREE, low)); else return build_int_cst_type (type, low); @@ -73,20 +68,15 @@ build_cst_from_int (tree type, HOST_WIDE tree UI_To_gnu (Uint Input, tree type) { + /* We might have a TYPE with biased representation and be passed an unbiased + value that doesn't fit. We always use an unbiased type to be able to hold + any such possible value for intermediate computations and then rely on a + conversion back to TYPE to perform the bias adjustment when need be. */ + tree comp_type += TREE_CODE (type) == INTEGER_TYPE && TYPE_BIASED_REPRESENTATION_P (type) + ? get_base_type (type) : type; tree gnu_ret; - /* We might have a TYPE with biased representation and be passed an - unbiased value that doesn't fit. We always use an unbiased type able - to hold any such possible value for intermediate computations, and - then rely on a conversion back to TYPE to perform the bias adjustment - when need be. */ - - int biased_type_p -= (TREE_CODE (type) == INTEGER_TYPE - && TYPE_BIASED_REPRESENTATION_P (type)); - - tree comp_type = biased_type_p ? get_base_type (type) : type; - if (Input <= Uint_Direct_Last) gnu_ret = build_cst_from_int (comp_type, Input - Uint_Direct_Bias); else @@ -188,12 +178,13 @@ UI_From_gnu (tree Input) { v[i] = tree_low_cst (fold_build1 (ABS_EXPR, gnu_type, fold_build2 (TRUNC_MOD_EXPR, gnu_type, - gnu_temp, gnu_base)), - 0); + gnu_temp, gnu_base)), 0); gnu_temp = fold_build2 (TRUNC_DIV_EXPR, gnu_type, gnu_temp, gnu_base); } - temp.Low_Bound = 1, temp.High_Bound = Max_For_Dint; - vec.Array = v, vec.Bounds = &temp; + temp.Low_Bound = 1; + temp.High_Bound = Max_For_Dint; + vec.Bounds = &temp; + vec.Array = v; return Vector_To_Uint (vec, tree_int_cst_sgn (Input) < 0); } Index: gcc-interface/trans.c === --- gcc-interface/trans.c (revision 203844) +++ gcc-interface/trans.c (working copy) @@ -9221,10 +9221,14 @@ post_error (const char *msg, Node_Id nod String_Template temp; Fat_Pointer fp; - temp.Low_Bound = 1, temp.High_Bound = strlen (msg); - fp.Array = msg, fp.Bounds = &temp; - if (Present (node)) -Error_Msg_N (fp, node); + if (No (node)) +return; + + temp.Low_Bound = 1; + temp.High_Bound = strlen (msg); + fp.Bounds = &temp; + fp.Array = msg; + E
[Ada] Fix wrong scope for descriptive type in debug info
This is a regression present on all active branches. The descriptive types emitted in the debug info by the Ada compiler can have a different scope than that of the type they are associated with, which goes against the spec. Tested on x86_64-suse-linux, applied on all active branches. 2013-10-19 Eric Botcazou * gcc-interface/utils.c (gnat_set_type_context): New function. (gnat_pushdecl): Use it to set the context of the type. -- Eric BotcazouIndex: gcc-interface/utils.c === --- gcc-interface/utils.c (revision 297217) +++ gcc-interface/utils.c (revision 297218) @@ -531,6 +531,22 @@ gnat_zaplevel (void) free_binding_level = level; } +/* Set the context of TYPE and its parallel types (if any) to CONTEXT. */ + +static void +gnat_set_type_context (tree type, tree context) +{ + tree decl = TYPE_STUB_DECL (type); + + TYPE_CONTEXT (type) = context; + + while (decl && DECL_PARALLEL_TYPE (decl)) +{ + TYPE_CONTEXT (DECL_PARALLEL_TYPE (decl)) = context; + decl = TYPE_STUB_DECL (DECL_PARALLEL_TYPE (decl)); +} +} + /* Record DECL as belonging to the current lexical scope and use GNAT_NODE for location information and flag propagation. */ @@ -612,7 +628,7 @@ gnat_pushdecl (tree decl, Node_Id gnat_n if (TREE_CODE (t) == POINTER_TYPE) TYPE_NEXT_PTR_TO (t) = tt; TYPE_NAME (tt) = DECL_NAME (decl); - TYPE_CONTEXT (tt) = DECL_CONTEXT (decl); + gnat_set_type_context (tt, DECL_CONTEXT (decl)); TYPE_STUB_DECL (tt) = TYPE_STUB_DECL (t); DECL_ORIGINAL_TYPE (decl) = tt; } @@ -622,7 +638,7 @@ gnat_pushdecl (tree decl, Node_Id gnat_n /* We need a variant for the placeholder machinery to work. */ tree tt = build_variant_type_copy (t); TYPE_NAME (tt) = decl; - TYPE_CONTEXT (tt) = DECL_CONTEXT (decl); + gnat_set_type_context (tt, DECL_CONTEXT (decl)); TREE_USED (tt) = TREE_USED (t); TREE_TYPE (decl) = tt; if (DECL_ORIGINAL_TYPE (TYPE_NAME (t))) @@ -644,7 +660,7 @@ gnat_pushdecl (tree decl, Node_Id gnat_n if (!(TYPE_NAME (t) && TREE_CODE (TYPE_NAME (t)) == TYPE_DECL)) { TYPE_NAME (t) = decl; - TYPE_CONTEXT (t) = DECL_CONTEXT (decl); + gnat_set_type_context (t, DECL_CONTEXT (decl)); } } }
Re: PR libstdc++/58729 - tr2::dynamic_bitset::resize fails
Paolo Carlini writes: > Before committing you may want to test -m32 too. Where it in fact fails, see PR58804. Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different."
Re: [PATCH, SH] Add support for inlined builtin-strcmp (2/2)
On Fri, 2013-10-18 at 09:59 +0200, Christian Bruel wrote: > On 10/18/2013 01:05 AM, Oleg Endo wrote: > > I was wondering, in file sh-mem.c, the new function > > 'sh4_expand_cmpstr' ... why is it SH4-something? It's a bit confusing, > > since cmp/str has been around since ever (i.e. since SH1). Maybe just > > rename it to 'sh_expand_cmpstr' instead? > > Just historical. (SH4* are our primary SH platforms). The code is > enabled/tested for all SH1 of course, I will rename. Thanks . > > > Maybe just > > rename it to 'sh_expand_cmpstr' instead? The function always returns > > 'true', so maybe just make it return 'void'? > > yes, it's for genericity as I plan to reuse/specialize the code based on > the count parameter for strncmp to be contributed next. I already assumed so :) > > > > Also, in the expander ... > > > > + [(set (match_operand:SI 0 "register_operand" "") > > + (compare:SI (match_operand:BLK 1 "memory_operand" "") > > > > ... no need to use empty "" constraints > > OK, thanks Could you also please remove the quotes around the preparation block: " { if (! optimize_insn_for_size_p () && sh4_expand_cmpstr(operands)) DONE; else FAIL; }") I've attached two test cases, tested with make -k check-gcc RUNTESTFLAGS="sh.exp=strcmp* --target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}" Could you please include them? Cheers, Oleg Index: gcc/testsuite/gcc.target/sh/strcmp-2.c === --- gcc/testsuite/gcc.target/sh/strcmp-2.c (revision 0) +++ gcc/testsuite/gcc.target/sh/strcmp-2.c (revision 0) @@ -0,0 +1,13 @@ +/* Check that the __builtin_strcmp function is not inlined when optimizing + for size. */ +/* { dg-do compile { target "sh*-*-*" } } */ +/* { dg-options "-Os" } */ +/* { dg-skip-if "" { "sh*-*-*" } { "-m5*" } { "" } } */ +/* { dg-final { scan-assembler-not "cmp/str" } } */ +/* { dg-final { scan-assembler "jsr|jmp|bsr|bra" } } */ + +int +test00 (const char* a, const char* b) +{ + return __builtin_strcmp (a, b); +} Index: gcc/testsuite/gcc.target/sh/strcmp-1.c === --- gcc/testsuite/gcc.target/sh/strcmp-1.c (revision 0) +++ gcc/testsuite/gcc.target/sh/strcmp-1.c (revision 0) @@ -0,0 +1,12 @@ +/* Check that the __builtin_strcmp function is inlined utilizing cmp/str insn + when optimizing for speed. */ +/* { dg-do compile { target "sh*-*-*" } } */ +/* { dg-options "-O2" } */ +/* { dg-skip-if "" { "sh*-*-*" } { "-m5*" } { "" } } */ +/* { dg-final { scan-assembler "cmp/str" } } */ + +int +test00 (const char* a, const char* b) +{ + return __builtin_strcmp (a, b); +}
Re: [wide-int] int_traits
Mike Stump writes: > + // We optimize x < y, where y is 64 or fewer bits. > + // We have to be careful to not allow comparison to a large positive > + // unsigned value like 0x8000, those would be encoded > + // with a y.len == 2. > + if (y.precision <= HOST_BITS_PER_WIDE_INT > + && y.len == 1) I don't get this. If y.precision <= HOST_BITS_PER_WIDE_INT then y.len must be 1. I realise that tree constants can be stored with TREE_INT_CST_NUNITS > TYPE_PRECISION / HOST_BITS_PER_WIDE_INT (so that extensions beyond TYPE_PRECISION are free). But the wide-int code is shielded from that by the ::decompose routine. A wide int never has len > precision / HOST_BITS_PER_WIDE_INT. Thanks, Richard
Re: [Patch] Change default executor to DFS in regex
Hi, On 10/19/2013 05:53 AM, Tim Shen wrote: On Fri, Oct 18, 2013 at 9:13 PM, Tim Shen wrote: As the comment in this patch said, DFS approach is faster in simple regex, but exponentially slower in complex(many quantifier) cases. Actually I suggest to use DFS where number of quantifiers < 2, to make this 'optimization' more conservative. Now `split regex`, say "\s+" can be optimized. The idea of course makes sense to me, but I think it needs a few small tweaks. First, I don't like much the name _S_anti_auto, sounds like something which goes against auto (?). What about _S_alternate? The comment: The _S_auto policy now is as coded, should probably be: The _S_auto policy is the following: In the text, remember to always put a space before open parentheses, it adds clarity. About the < 2, in general hardcoding a parameter value in the code isn't a nice idea. Why don't we take it out to a macro, say _GLIBCXX_REGEX_NFA_QUANTIFIERS_LIMIT? In stl_deque.h we have something similar and in the present case it would be even safe from the ABI point of view, if I'm not mistaken. Finally, you mentioned the DFA optimization: I would like to see some details about it. Would it be a big win in all cases? Before we freeze the next ABI we should keep open all the possibilities... Thanks! Paolo.
Re: [PATCH, SH] Add support for inlined builtin-strcmp (1/2)
On Fri, 2013-10-18 at 09:38 +0200, Christian Bruel wrote: > On 10/18/2013 12:53 AM, Oleg Endo wrote: > > Hi, > > > > On Thu, 2013-10-17 at 16:13 +0200, Christian Bruel wrote: > >> Hello, > >> > >> This patch just reorganizes the SH code used for memory builtins into > >> its own file, in preparation of the RTL strcmp hoisting in the next part. > >> > > Since GCC is now being compiled as C++, it's probably better to name > > newly added source files .cc instead of .c. Could you please rename the > > new file to sh-mem.cc? > > > > Thanks, > > Oleg > Hello Oleg, > > I have no objection to rename a pure C file to a c++ suffixed file. > I'll conform to whatever > the general guidelines for pure C code is. > > For now it doesn't seem to be the tendency. > > grep -i "ew File" ChangeLog | grep .c: > * gimple-builder.c: New File. > * config/winnt-c.c: New file > * ipa-profile.c: New file. > * ubsan.c: New file. > * ipa-devirt.c: New file. > * vtable-verify.c: New file. > * config/arm/aarch-common.c: ... here. New file. > * diagnostic-color.c: New file. > * config/linux-android.c: New file. > Yeah, the thing is that pure C files (.c) are also compiled as C++. There is no distinction of how .c and .cc files are compiled. Thus it's quite easy for C++ code to "slip in", e.g. by follow up changes. So I think it's better to have new files checked in as .cc in the first place in order to avoid more confusion. > I haven't seen any reference to this in the GCC coding guidelines, > should we prefer .cc, cxx, C, cpp., c++.. ? There was some discussion on the GCC list regarding this. gcc/ChangeLog can be used as a summarizing conclusion: 2013-03-27 Gabriel Dos Reis * Makefile.in (.SUFFIXES): Add .cc. (.c.o): Apply same recipe for implicit rule .cc.o. 2013-09-20 Basile Starynkevitch * gengtype.c (file_rules): Added rule for *.cc files. 2013-09-25 Tom Tromey * Makefile.in (CCDEPMODE, DEPDIR, depcomp, COMPILE.base) (COMPILE, POSTCOMPILE): New variables. (.cc.o .c.o): Use COMPILE, POSTCOMPILE. > Also I'm wondering if there is any plan to rename all files in the tree > so we have a consistent source tree. It's been discussed a while ago on the GCC list. The decision was not to do it in order to avoid SVN mass renames for now. See also: http://gcc.gnu.org/ml/gcc/2012-08/msg00311.html http://gcc.gnu.org/ml/gcc/2012-08/msg00312.html http://gcc.gnu.org/ml/gcc/2012-08/msg00315.html Cheers, Oleg