Re: load reverse
On 08/12/2013 05:22 AM, sravan megan wrote:
> Anyone please help me to get out of this issue

It's hard for anyone to do that because we don't have your code. Did you step through insn-output.c with GDB when compiling your test case? What happened?

Andrew.
Re: How to specify multiple OSDIRNAME suffixes for multilib (Multilib usage with MPX)?
Hi Terry,

Thanks a lot for your reply! I suppose I have to introduce some new option like MULTILIB_COMPATIBLE to produce additional search locations for libraries. Does that sound reasonable? Any advice on implementation?

Thanks,
Ilya

2013/8/12 Terry Guo terry@arm.com:
>> -----Original Message-----
>> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Ilya Enkovich
>> Sent: Friday, August 09, 2013 8:37 PM
>> To: GCC Development
>> Subject: How to specify multiple OSDIRNAME suffixes for multilib (Multilib usage with MPX)?
>>
>> Hi,
>>
>> I'm currently trying to create multilib libraries compiled with MPX. The main difference from the existing multilib variants on the i386 target is that the new targets (32/mpx, 64/mpx) are compatible with the old variants (32, 64). Also we should not prevent the user from using MPX if he does not have MPX variants for some libraries - legacy versions should be used instead. Thus we need to check several suffixes instead of one. E.g. for a 64-bit MPX binary we should first check ../lib64/mpx, then check ../lib64 and finally the default one.
>>
>> I looked at MULTILIB_REUSE and thought it might solve my problem according to the documentation: "And for some targets it is better to reuse an existing multilib than to fall back to default multilib when there is no corresponding multilib." [1]. So I tried the following declarations:
>>
>> MULTILIB_OSDIRNAMES+= m64/fmpx=../lib64/mpx
>> MULTILIB_REUSE = m64=m64/fmpx
>>
>> But it appeared that only the first entry for a given option set counts when multilibs are parsed in gcc.c, and my reuse here is just ignored. Is it a wrong implementation of MULTILIB_REUSE or my wrong understanding of this option? Is there a way to implement MPX multilibs while still allowing legacy ones when some MPX libs are missing?
>>
>> [1] http://gcc.gnu.org/onlinedocs/gccint/Target-Fragment.html#Target-Fragment
>>
>> Thanks,
>> Ilya
>
> Hi Ilya,
>
> Sorry for the late response. I am the author of MULTILIB_REUSE. So far this feature is not flexible enough to meet your requirement. It can't dynamically decide to choose m64/fmpx if such libraries are there, and then choose m64 if m64/fmpx doesn't exist. This feature only makes a static decision. The following statement:
>
> MULTILIB_REUSE = m64=m64/fmpx
>
> means that when options m64 and fmpx are given, we should always reuse the libraries for m64. And for this purpose, we also need:
>
> MULTILIB_EXCEPTIONS = m64/fmpx
>
> to make sure libraries for m64 fmpx won't be built. If m64/fmpx isn't excluded, MULTILIB_REUSE will think the required libraries are there and that there is no need to reuse.
>
> IMHO, the way gcc selects a multilib is based on string matching rather than on detecting the existence of libraries. So the flexible approach you want isn't supported yet.
>
> BR,
> Terry
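The lookup order Ilya asks for (../lib64/mpx first, then ../lib64, then the default) can be sketched in C as below. This is only an illustration of the requested behaviour, not how gcc.c works today: as Terry notes, the driver makes a static, string-based choice. The names pick_libdir, exists_fn, legacy_only and mpx64_order are hypothetical and do not come from GCC.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate: nonzero if the wanted library is present in DIR.  */
typedef int (*exists_fn) (const char *dir);

/* Return the first directory in CANDIDATES (most specific first) for which
   EXISTS reports the library, or FALLBACK if none of them has it.  */
static const char *
pick_libdir (const char *const *candidates, size_t n,
             exists_fn exists, const char *fallback)
{
  for (size_t i = 0; i < n; i++)
    if (exists (candidates[i]))
      return candidates[i];
  return fallback;
}

/* Example predicate modelling a system that only has legacy 64-bit libs.  */
static int
legacy_only (const char *dir)
{
  return strcmp (dir, "../lib64") == 0;
}

/* Search order for a 64-bit MPX link, as described in the thread.  */
static const char *const mpx64_order[] = { "../lib64/mpx", "../lib64" };
```

With legacy_only as the predicate, pick_libdir (mpx64_order, 2, legacy_only, ".") skips ../lib64/mpx and settles on ../lib64, which is the fallback an MPX binary would need when no MPX build of a library is installed.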
RE: How to specify multiple OSDIRNAME suffixes for multilib (Multilib usage with MPX)?
> -----Original Message-----
> From: Ilya Enkovich [mailto:enkovich@gmail.com]
> Sent: Monday, August 12, 2013 4:37 PM
> To: Terry Guo
> Cc: GCC Development
> Subject: Re: How to specify multiple OSDIRNAME suffixes for multilib (Multilib usage with MPX)?
>
> Hi Terry,
>
> Thanks a lot for your reply! I suppose I have to introduce some new option like MULTILIB_COMPATIBLE to produce additional search locations for libraries. Does it sound reasonable? Any advice on implementation?
>
> Thanks,
> Ilya

Makes sense to me. And I think the feature you mentioned can cover MULTILIB_REUSE, so to keep things simple, I would prefer unifying them into one term, either MULTILIB_COMPATIBLE or MULTILIB_REUSE. I am OK with both names.

In terms of implementation, I think gcc as a driver program only decides the path to libraries based on command line options and the multilib configuration; the linker then searches for the libraries and links them together. When MULTILIB_COMPATIBLE is provided, gcc can select more than one path and pass them to the linker. When there is only one compatible library, the linker can find it by searching all paths, and the whole thing can work. But when there is more than one compatible library spread across different paths, I am not sure it works. You can try it out.

BR,
Terry
Re: [RFC] gcc feature request: Moving blocks into sections
On Mon, Aug 05, 2013 at 12:55:15PM -0400, Steven Rostedt wrote:
> [ sent to both Linux kernel mailing list and to gcc list ]

Let me hijack this thread for something related...

I've been wanting to 'abuse' static_key/asm-goto to sort-of JIT if-forest functions like perf_prepare_sample() and perf_output_sample().

They are of the form:

void func(obj, args..)
{
        unsigned long f = ...;

        if (f & F1)
                do_f1();

        if (f & F2)
                do_f2();

        ...

        if (f & FN)
                do_fn();
}

Where f is constant for the entire lifetime of the particular object.

So I was thinking of having these functions use static_key/asm-goto; then write the proper static key values unsafe so as to avoid all trickery (as these functions would never actually be used) and copy the end result into object private memory. The object will then use indirect calls into these functions.

The advantage of using something like this is that it would work for all architectures that now support the asm-goto feature. For arch/gcc combinations that do not we'd simply revert to the current state of affairs.

I suppose the question is, do people strenuously object to creativity like that and/or is there something GCC can do to make this easier/better still?
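For readers unfamiliar with the pattern, here is a small self-contained C sketch of such an "if-forest", assuming three flag bits. It only illustrates the shape of the problem (the same flag tests repeated on every call although the flags never change for a given object); it is not the perf code, and struct obj, the trace field and the handler bodies are made up for the example.

```c
/* Hypothetical flag bits; the real perf sample-type bits differ.  */
enum { F1 = 1 << 0, F2 = 1 << 1, F3 = 1 << 2 };

struct obj {
  unsigned long flags;   /* fixed when the object is created */
  unsigned long trace;   /* records which handlers ran, for demonstration */
};

static void do_f1 (struct obj *o) { o->trace |= F1; }
static void do_f2 (struct obj *o) { o->trace |= F2; }
static void do_f3 (struct obj *o) { o->trace |= F3; }

/* Every call re-tests flags that cannot change for this object.  */
static void
func (struct obj *o)
{
  unsigned long f = o->flags;

  if (f & F1)
    do_f1 (o);
  if (f & F2)
    do_f2 (o);
  if (f & F3)
    do_f3 (o);
}
```

For an object created with flags F1 | F3, each call to func runs do_f1 and do_f3 and skips do_f2; the proposal in the thread is about making those per-call tests disappear for a given object.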
Re: [x86-64 psABI] RFC: Extend x86-64 PLT entry to support MPX
On 09.08.13 at 19:03, H.J. Lu hjl.to...@gmail.com wrote:
> On Fri, Aug 9, 2013 at 12:08 AM, Jan Beulich jbeul...@suse.com wrote:
>> On 08.08.13 at 18:01, H.J. Lu hjl.to...@gmail.com wrote:
>>> On Thu, Aug 8, 2013 at 12:19 AM, Jan Beulich jbeul...@suse.com wrote:
>>>> On 08.08.13 at 02:33, H.J. Lu hjl.to...@gmail.com wrote:
>>>>> We use the .gnu_attribute directive to record an object attribute:
>>>>>
>>>>> enum
>>>>> {
>>>>>   Tag_GNU_X86_EXTERN_BRANCH = 4,
>>>>> };
>>>>>
>>>>> for the types of external branch instructions in relocatable files.
>>>>>
>>>>> enum
>>>>> {
>>>>>   /* All external branch instructions are legacy. */
>>>>>   Val_GNU_X86_EXTERN_BRANCH_LEGACY = 0,
>>>>>   /* There is at least one external branch instruction with BND prefix. */
>>>>>   Val_GNU_X86_EXTERN_BRANCH_BND = 1,
>>>>> };
>>>>>
>>>>> An x86 feature note section, .note.x86-feature, is used to indicate features in executables and shared libraries. The contents of this note section are:
>>>>>
>>>>>         .section .note.x86-feature
>>>>>         .align 4
>>>>>         .long .L1 - .L0
>>>>>         .long .L3 - .L2
>>>>>         .long 1
>>>>> .L0:
>>>>>         .asciz "x86 feature"
>>>>> .L1:
>>>>>         .align 4
>>>>> .L2:
>>>>>         .long FeatureFlag (Feature flag)
>>>>> .L3:
>>>>>
>>>>> The current valid bits in FeatureFlag are
>>>>>
>>>>> #define NT_X86_FEATURE_PLT_BND (0x1 << 0)
>>>>>
>>>>> It should be set if the PLT entry has a BND prefix to preserve bound registers. The remaining bits in FeatureFlag are reserved.
>>>>>
>>>>> When merging Tag_GNU_X86_EXTERN_BRANCH, if any input relocatable file has Tag_GNU_X86_EXTERN_BRANCH set to Val_GNU_X86_EXTERN_BRANCH_BND, the resulting Tag_GNU_X86_EXTERN_BRANCH value should be Val_GNU_X86_EXTERN_BRANCH_BND.
>>>>>
>>>>> When generating an executable or shared library, if a PLT is needed and the Tag_GNU_X86_EXTERN_BRANCH value is Val_GNU_X86_EXTERN_BRANCH_BND, the 32-byte PLT entry should be used, the feature note section should be generated with the NT_X86_FEATURE_PLT_BND bit set to 1, and the feature note section should be included in the PT_NOTE segment. The benefit of the note section is that it is backward compatible with existing run-time and tools.
>>>>
>>>> While I can see the purpose of the attribute section, I don't see what the note section is for: You don't mention at all what it's consumed for, and I also can't see how it validly would be for anything. That's because iirc note section contents, if not understood by the consumer, are required to not have any effect on the correctness of the program. Hence if loaded on a system that is MPX capable, has an MPX aware kernel, but no MPX aware user space (apart from this one executable or shared library, or a set thereof), it ought to still work correctly. Which - afaict - it won't if the dynamic loader itself isn't MPX aware.
>>>
>>> The note section isn't required for correctness. But it can be used by ld.so to select an alternate MPX aware shared library in a different directory, instead of a legacy one.
>>
>> Okay, that clarifies your intentions with the note section. However, then you need something else to make sure an MPX aware app can't load on an MPX enabled kernel without MPX-enabled ld.so.
>
> The MPX enabled app will still run correctly. ld.so will clear the bound registers (that makes unlimited bound) for the first call with lazy binding.

Only if those registers are used for their primary purpose. The documentation specifically says that this isn't a requirement.

But anyway, I see we're once again not going to get anywhere with this...

Jan
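As an illustration of how a consumer such as ld.so might read the proposed note, here is a C sketch that checks the NT_X86_FEATURE_PLT_BND bit in a note laid out as in the assembly above (name size, descriptor size, type 1, the name "x86 feature", then a 32-bit flag word). It assumes the note bytes are already in memory in host byte order; note_has_plt_bnd is a hypothetical helper, not an existing glibc or binutils function, and the proposal itself was still under discussion in this thread.

```c
#include <stdint.h>
#include <string.h>

#define NT_X86_FEATURE_PLT_BND (0x1u << 0)

/* Return 1 if NOTE (LEN bytes) is an "x86 feature" note of type 1 whose
   flag word has NT_X86_FEATURE_PLT_BND set, and 0 otherwise.  */
static int
note_has_plt_bnd (const unsigned char *note, size_t len)
{
  static const char name[] = "x86 feature";   /* 12 bytes including the NUL */
  uint32_t namesz, descsz, type, flags;
  size_t desc_off;

  if (len < 12)
    return 0;
  memcpy (&namesz, note, 4);
  memcpy (&descsz, note + 4, 4);
  memcpy (&type, note + 8, 4);
  if (namesz != sizeof name || type != 1 || descsz < 4)
    return 0;

  /* The descriptor starts after the name, padded to a 4-byte boundary.  */
  desc_off = 12 + ((namesz + 3u) & ~(size_t) 3u);
  if (len < desc_off + 4 || memcmp (note + 12, name, sizeof name) != 0)
    return 0;

  memcpy (&flags, note + desc_off, 4);
  return (flags & NT_X86_FEATURE_PLT_BND) != 0;
}
```

A loader that does not know this note simply never calls such a check, which matches the point made in the thread that the note cannot be relied on for correctness.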
Re: How to specify multiple OSDIRNAME suffixes for multilib (Multilib usage with MPX)?
2013/8/12 Terry Guo terry@arm.com:
> Makes sense to me. And I think the feature you mentioned can cover MULTILIB_REUSE, so to keep things simple, I would prefer unifying them into one term, either MULTILIB_COMPATIBLE or MULTILIB_REUSE. I am OK with both names.
>
> In terms of implementation, I think gcc as a driver program only decides the path to libraries based on command line options and the multilib configuration; the linker then searches for the libraries and links them together. When MULTILIB_COMPATIBLE is provided, gcc can select more than one path and pass them to the linker. When there is only one compatible library, the linker can find it by searching all paths, and the whole thing can work. But when there is more than one compatible library spread across different paths, I am not sure it works. You can try it out.

Thanks for the tips! I do not want to change the semantics of the existing option and will try to implement a new option. I hope it will work fine with multiple compatible libraries available. At least a simple test providing two paths with the same library worked fine for me. The linker just chooses the first path.

Thanks,
Ilya
Re: [RFC] vector subscripts/BIT_FIELD_REF in Big Endian.
>> What's interesting to me here is the bitpos - does this not need BYTES_BIG_ENDIAN correction? This seems to be inconsistent with what happens with reduction operations in the autovectorizer where the scalar result in the reduction epilogue gets extracted with a BIT_FIELD_REF but the bitpos there is corrected for BIG_ENDIAN.
>
> a[0] is at the left end of the array in BIG_ENDIAN, and big-endian machines number bits from the left, so bit position 0 is correct.
>
>> ...
>> vect_sum_9.17_74 = [reduc_plus_expr] vect_sum_9.15_73;
>> stmp_sum_9.16_75 = BIT_FIELD_REF <vect_sum_9.17_74, 32, 96>;
>> sum_76 = stmp_sum_9.16_75 + sum_47;
>>
>> the BIT_FIELD_REF here seems to have been corrected for BYTES_BIG_ENDIAN
>
> Yes, because something else is going on here. This is a reduction operation where the sum ends up in the rightmost element of a vector register that contains four 32-bit integers. This is at position 96 from the left end of the register according to big-endian numbering.

Thanks for your reply. Sorry, I'm still a bit confused here. The reduc_splus_ documentation says

"Compute the sum of the signed elements of a vector. The vector is operand 1, and the scalar result is stored in the least significant bits of operand 0 (also a vector)."

Shouldn't this mean the scalar result should be in bitpos 0 which is the left end of the register in BIG ENDIAN?

Thanks,
Tejas

>> If vec_extract is defined in the back-end, how does one figure out if the BIT_FIELD_REF is a product of the gimplifier's indirect ref folding or the vectorizer's bit-field extraction and apply the appropriate correction in vec_extract's expansion? Or am I missing something that corrects BIT_FIELD_REFs between the gimplifier and the RTL expander?
>
> There is no inconsistency here.
>
> Hope this helps!
> Bill
>
>> Thanks,
>> Tejas.
Re: [RFC] vector subscripts/BIT_FIELD_REF in Big Endian.
On Mon, 2013-08-12 at 11:54 +0100, Tejas Belagod wrote:
> Thanks for your reply. Sorry, I'm still a bit confused here. The reduc_splus_ documentation says
>
> "Compute the sum of the signed elements of a vector. The vector is operand 1, and the scalar result is stored in the least significant bits of operand 0 (also a vector)."
>
> Shouldn't this mean the scalar result should be in bitpos 0 which is the left end of the register in BIG ENDIAN?

No. The least significant bits of any register are the rightmost bits, and big-endian numbering begins at the left. (I don't really like the commentary, since "least significant bits" isn't a very good term to use with vectors.) Analogously, a 64-bit integer is numbered with 0 on the left being the most significant bit, and 63 on the right being the least significant bit.

Thanks,
Bill
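The arithmetic behind Bill's two statements can be written down as a short C sketch. The helper names be_bitpos and be_bit_number are made up for illustration and are not GCC functions; they only encode "big-endian numbering starts at the left".

```c
/* Big-endian bit position, counted from the left end of the register,
   of element INDEX in a vector whose elements are ELT_BITS wide.  */
static unsigned
be_bitpos (unsigned index, unsigned elt_bits)
{
  return index * elt_bits;
}

/* Big-endian number of the bit with weight 2^K in a WIDTH-bit integer:
   the most significant bit is 0, the least significant is WIDTH - 1.  */
static unsigned
be_bit_number (unsigned k, unsigned width)
{
  return width - 1 - k;
}
```

For a register holding four 32-bit integers, a[0] is the leftmost element at bit position 0 and the rightmost element (index 3), where the reduction result lands, is at position 3 * 32 = 96. For a 64-bit integer, the least significant bit (weight 2^0) is bit number 63.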
Re: i686 elf return values
On Tue, 6 Aug 2013, Gabriel Dos Reis wrote:
> On Tue, Aug 6, 2013 at 1:46 PM, Nathan Sidwell nat...@acm.org wrote:
> Hi,
> i386elf.h defines:
>
> /* The ELF ABI for the i386 says that records and unions are returned
>    in memory. */
> #define SUBTARGET_RETURN_IN_MEMORY(TYPE, FNTYPE) \
>   (TYPE_MODE (TYPE) == BLKmode \
>    || (VECTOR_MODE_P (TYPE_MODE (TYPE)) && int_size_in_bytes (TYPE) == 8))
>
> and as such differs from the regular i86 return mechanism. Notice that the comment doesn't match the code:
> *) some structs/unions are non BLKmode
> *) some vectors can be BLKmode, some might not be -- the vector mode check appears to be an attempt to catch DImode vectors.
>
> Basing your ABI on the internal modes used by the compiler is not, IMHO, a sensible design choice.
>
> This code doesn't appear at first glance to cope with transparent_union. In fact it looks pretty bitrotted. Is it best just to junk the different behaviour at this point?
>
> Yes and yes :-)

This piece was introduced along with i386elf.h itself at r28057 (1999-07-11) as:

#define RETURN_IN_MEMORY(TYPE) \
  (TYPE_MODE (TYPE) == BLKmode)

while the corresponding i386.h piece was:

#define RETURN_IN_MEMORY(TYPE) \
  ((TYPE_MODE (TYPE) == BLKmode) || int_size_in_bytes (TYPE) > 12)

AFAICT at that time for the i386 target GCC only supported integer and x87 data types. The largest native (hardware) data type therefore was the x87 80-bit extended represented in C as a 12-byte `long double' type. For this and narrower types the two macros produce the same result. Also at that time IIUC all structs/unions were BLKmode. So the difference between the two macros only applied to complex types. The i386.h version would only put `float complex' data in registers (EDX:EAX) -- because that type fits in 8 bytes -- and any other results in memory. The i386elf.h version however wanted to put all complex data in registers, including `double complex' (16 bytes) and `long double complex' (24 bytes). As observed by Nathan in:

http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00373.html

this can't possibly work, as i386 only makes 3 GPRs possibly available (defined as call-clobbered) for results: EAX, EDX and ECX.

Jeff, given the above -- do you happen to remember what made you make the i386/ELF target different from the base i386 target? Did I miss anything in the consideration above? The decision looks to have been deliberate: the i386.h default would normally apply (as i386elf.h included it at the beginning), but the macro was explicitly undefined and then defined as above.

Of course even the decision made by the base i386 target to put `float complex' data in GPRs seems odd to me -- that's awkward and costly to handle as there's no way to pass such data between FPRs and GPRs without going through memory (so what's the point not to just leave it there?). And then this data is obviously useless in GPRs as it has to be put back to FPRs by the caller by the same long route if any further arithmetic is required. If in registers at all, I would expect complex results to be returned in ST(1):ST(0) -- that would make them accessible right away and the ABI consistent with real results returned in ST(0).

Then, as further evolution, with the addition of MMX and SSE support at r34721 (2000-06-26) both macros were changed, to:

#define RETURN_IN_MEMORY(TYPE) \
  (TYPE_MODE (TYPE) == BLKmode \
   || (VECTOR_MODE_P (TYPE_MODE (TYPE)) && int_size_in_bytes (TYPE) == 8))

and:

#define RETURN_IN_MEMORY(TYPE) \
  ((TYPE_MODE (TYPE) == BLKmode) \
   || (VECTOR_MODE_P (TYPE_MODE (TYPE)) && int_size_in_bytes (TYPE) == 8) \
   || (int_size_in_bytes (TYPE) > 12 && TYPE_MODE (TYPE) != TImode \
       && ! VECTOR_MODE_P (TYPE_MODE (TYPE))))

respectively, which IIUC made MMX data returned in memory and SSE data in registers -- in both cases.

Bernd, if you still remember: am I missing anything here? Especially the TImode piece is unobvious to me -- why would it matter for MMX/SSE? Neither supports 128-bit integers.

From this point onwards no further changes were made to the version of RETURN_IN_MEMORY in i386elf.h. The version in i386.h was updated with r39693 (2001-02-14):

#define RETURN_IN_MEMORY(TYPE) \
  ((TYPE_MODE (TYPE) == BLKmode) \
   || (VECTOR_MODE_P (TYPE_MODE (TYPE)) && int_size_in_bytes (TYPE) == 8) \
   || (int_size_in_bytes (TYPE) > 12 && TYPE_MODE (TYPE) != TImode \
       && TYPE_MODE (TYPE) != TFmode && ! VECTOR_MODE_P (TYPE_MODE (TYPE))))

and then in r45726 (2001-09-21) factored out to ix86_return_in_memory:

#define RETURN_IN_MEMORY(TYPE) \
  ix86_return_in_memory (TYPE)

-- which eventually evolved to its current form.

My conclusion therefore is i386/ELF was not maintained, as far as the return convention is concerned, beyond r34721 and it looks to me like it should have been converted with r45726 to make use of ix86_return_in_memory just like generic i386, perhaps with a special exception for complex types (although as I noted above, this exception was probably a mistake from the beginning). Any thoughts?
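To make the r28057 difference concrete, here is a toy C model of the two predicates. It assumes the comparison lost from the quoted i386.h macro is "> 12", and it replaces GCC's tree types with a made-up struct toy_type that carries only the two properties the macros test; none of these names exist in GCC.

```c
#include <stdbool.h>

/* Toy stand-in for a GCC type: is its mode BLKmode, and its size in bytes.  */
struct toy_type {
  bool blkmode;
  int size;
};

/* Model of i386elf.h at r28057: only BLKmode values go in memory.  */
static bool
elf_return_in_memory (struct toy_type t)
{
  return t.blkmode;
}

/* Model of i386.h at r28057 (assuming "> 12"): BLKmode values or
   anything wider than 12 bytes go in memory.  */
static bool
base_return_in_memory (struct toy_type t)
{
  return t.blkmode || t.size > 12;
}
```

Under this model an 8-byte `float complex' and a 12-byte `long double' are returned in registers by both variants, while a 16-byte `double complex' or a 24-byte `long double complex' is returned in memory by the base i386 rule but in registers by the i386/ELF rule, which is the case that cannot fit in EAX, EDX and ECX.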
Re: i686 elf return values
On 08/12/13 08:07, Maciej W. Rozycki wrote:
> My conclusion therefore is i386/ELF was not maintained, as far as the return convention is concerned, beyond r34721 and it looks to me like it should have been converted with r45726 to make use of ix86_return_in_memory just like generic i386, perhaps with a special exception for complex types (although as I noted above, this exception was probably a mistake from the beginning). Any thoughts?

Thanks for digging into this. It does look like the ABI is accidental, and i386elf.h should not define SUBTARGET_RETURN_IN_MEMORY.

nathan
Re: [RFC] gcc feature request: Moving blocks into sections
On 08/12/2013 02:17 AM, Peter Zijlstra wrote:
> I've been wanting to 'abuse' static_key/asm-goto to sort-of JIT if-forest functions like perf_prepare_sample() and perf_output_sample().
>
> They are of the form:
>
> void func(obj, args..)
> {
>         unsigned long f = ...;
>
>         if (f & F1)
>                 do_f1();
>
>         if (f & F2)
>                 do_f2();
>
>         ...
>
>         if (f & FN)
>                 do_fn();
> }

Am I reading this right that f can be a combination of any of these?

> Where f is constant for the entire lifetime of the particular object.
>
> So I was thinking of having these functions use static_key/asm-goto; then write the proper static key values unsafe so as to avoid all trickery (as these functions would never actually be used) and copy the end result into object private memory. The object will then use indirect calls into these functions.

I'm really not following what you are proposing here, especially not "copy the end result into object private memory."

With asm goto you end up with at minimum a jump or NOP for each of these function entries, whereas an actual JIT can elide that as well.

On the majority of architectures, including x86, you cannot simply copy a piece of code elsewhere and have it still work. You end up doing a bunch of the work that a JIT would do anyway, and would end up with considerably higher complexity and worse results than a true JIT. You also say "the object will then use indirect calls into these functions"... you mean the JIT or pseudo-JIT generated functions, or the calls inside them?

> I suppose the question is, do people strenuously object to creativity like that and/or is there something GCC can do to make this easier/better still?

I think it would be much easier to just write a minimal JIT for this, even though it is per architecture. However, I would really like to understand what the value is.

	-hpa
Re: HAVE_ATTR_enabled mishandling?
On 7/10/13 5:51 AM, David Given wrote:
> I think I have found a bug. This is in stock gcc 4.8.1...
>
> My backend does not use the 'enabled' attribute; therefore the following code in insn-attr.h kicks in:
>
> #ifndef HAVE_ATTR_enabled
> #define HAVE_ATTR_enabled 0
> #endif
>
> Therefore the following code in gcc/lra-constraints.c is enabled:
>
> #ifdef HAVE_ATTR_enabled
>   if (curr_id->alternative_enabled_p != NULL
>       && ! curr_id->alternative_enabled_p[nalt])
>     continue;
> #endif
>
> ->alternative_enabled_p is bogus; therefore segfault.
>
> Elsewhere I see structures of the form:
>
> #if HAVE_ATTR_enabled
> ...
> #endif
>
> So I think that #ifdef above is a straight typo. Certainly, changing it to a #if makes the crash go away...

Hi, Vladimir,

Apparently the issue that David mentioned has already been fixed earlier:

http://gcc.gnu.org/r198344

2013-04-26  Vladimir Makarov  vmaka...@redhat.com

        ...
        * lra-constraints.c (curr_insn_set): New.
        ...
        (process_alt_operands): Use it. Use #if HAVE_ATTR_enabled instead of #ifdef. Add code to remove cycling.
        ...

However, this change has only been applied on trunk and not on the 4.8 branch. Since the 4.8 branch is still open and this issue seems to be a bug, perhaps it is a good idea to backport this part.

What do you think? :)

Best regards,
jasonwucj
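The difference David ran into is plain C preprocessor behaviour and can be reproduced outside GCC. The sketch below mimics the insn-attr.h default and shows that #ifdef tests whether the macro is defined, while #if tests its value. The two function names are made up for the demonstration.

```c
/* Mimics insn-attr.h for a backend without an 'enabled' attribute.  */
#ifndef HAVE_ATTR_enabled
#define HAVE_ATTR_enabled 0
#endif

/* Returns 1: the macro is defined, even though its value is 0.  */
static int
guarded_by_ifdef (void)
{
#ifdef HAVE_ATTR_enabled
  return 1;
#else
  return 0;
#endif
}

/* Returns 0: #if evaluates the macro's value, which is 0.  */
static int
guarded_by_if (void)
{
#if HAVE_ATTR_enabled
  return 1;
#else
  return 0;
#endif
}
```

This is why the #ifdef-guarded block in lra-constraints.c was compiled in for a backend without the attribute, and why switching to #if removes it.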
Re: [RFC] gcc feature request: Moving blocks into sections
H. Peter Anvin h...@linux.intel.com writes:
> However, I would really like to understand what the value is.

Probably very little. When I last looked at it, the main overhead in perf currently seems to be backtraces and the ring buffer, not this code.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only
Re: [RFC] gcc feature request: Moving blocks into sections
On Mon, Aug 12, 2013 at 07:56:10AM -0700, H. Peter Anvin wrote:
> On 08/12/2013 02:17 AM, Peter Zijlstra wrote:
>> I've been wanting to 'abuse' static_key/asm-goto to sort-of JIT if-forest functions like perf_prepare_sample() and perf_output_sample().
>>
>> They are of the form:
>>
>> void func(obj, args..)
>> {
>>         unsigned long f = ...;
>>
>>         if (f & F1)
>>                 do_f1();
>>
>>         if (f & F2)
>>                 do_f2();
>>
>>         ...
>>
>>         if (f & FN)
>>                 do_fn();
>> }
>
> Am I reading this right that f can be a combination of any of these?

Correct.

>> Where f is constant for the entire lifetime of the particular object.
>>
>> So I was thinking of having these functions use static_key/asm-goto; then write the proper static key values unsafe so as to avoid all trickery (as these functions would never actually be used) and copy the end result into object private memory. The object will then use indirect calls into these functions.
>
> I'm really not following what you are proposing here, especially not "copy the end result into object private memory."
>
> With asm goto you end up with at minimum a jump or NOP for each of these function entries, whereas an actual JIT can elide that as well.
>
> On the majority of architectures, including x86, you cannot simply copy a piece of code elsewhere and have it still work.

I thought we used -fPIC which would allow just that.

> You end up doing a bunch of the work that a JIT would do anyway, and would end up with considerably higher complexity and worse results than a true JIT.

Well, less complexity but worse result, yes. We'd only poke the specific static_branch sites with either NOPs or the (relative) jump target for each of these branches. Then copy the result.

> You also say "the object will then use indirect calls into these functions"... you mean the JIT or pseudo-JIT generated functions, or the calls inside them?

The calls to these pseudo-JIT generated functions.

>> I suppose the question is, do people strenuously object to creativity like that and/or is there something GCC can do to make this easier/better still?
>
> I think it would be much easier to just write a minimal JIT for this, even though it is per architecture. However, I would really like to understand what the value is.

Removing a lot of the conditionals from the sample path. Depending on the configuration these can be quite expensive.
Re: [RFC] gcc feature request: Moving blocks into sections
On 08/12/2013 09:09 AM, Peter Zijlstra wrote:
>> On the majority of architectures, including x86, you cannot simply copy a piece of code elsewhere and have it still work.
>
> I thought we used -fPIC which would allow just that.

Doubly wrong. The kernel is not compiled with -fPIC, nor does -fPIC allow this kind of movement for code that contains intramodule references (that is *all* references in the kernel). Since we really don't want to burden the kernel with a GOT and a PLT, that is life.

>> You end up doing a bunch of the work that a JIT would do anyway, and would end up with considerably higher complexity and worse results than a true JIT.
>
> Well, less complexity but worse result, yes. We'd only poke the specific static_branch sites with either NOPs or the (relative) jump target for each of these branches. Then copy the result.

Once again, you can't copy the result. You end up with a full disassembler.

	-hpa
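One concrete reason a byte-for-byte copy breaks on x86 is that near call and jmp instructions encode their destination as a 32-bit displacement relative to the address of the next instruction. The C sketch below does that address arithmetic; rel32_target and rel32_fixup are hypothetical helper names, and the addresses used afterwards are invented for the example.

```c
#include <stdint.h>

/* Destination of an x86 near call/jmp with a 32-bit displacement: the
   displacement is relative to the address of the following instruction.  */
static uint64_t
rel32_target (uint64_t insn_addr, unsigned insn_len, int32_t rel32)
{
  return insn_addr + insn_len + (uint64_t) (int64_t) rel32;
}

/* Displacement the instruction needs after being copied from OLD_ADDR to
   NEW_ADDR so that it still reaches its original destination.  Assumes
   the adjusted value still fits in 32 bits.  */
static int32_t
rel32_fixup (uint64_t old_addr, uint64_t new_addr, int32_t rel32)
{
  return (int32_t) ((int64_t) rel32 - (int64_t) (new_addr - old_addr));
}
```

A 5-byte call at 0x1000 with displacement 0x100 targets 0x1105. Copied unchanged to 0x5000 it would target 0x5105 instead; the displacement has to become 0x100 - 0x4000 to reach 0x1105 again. Finding every such displacement in a copied function requires decoding the instruction stream, which is the "full disassembler" hpa refers to.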
Combine pass with reused sources
Hi,

I'm working on a compiler for an architecture with a multiply instruction that takes two 32-bit factors, sign-extends both factors to 64 bits, does a 64-bit multiplication and stores the result to a destination register. The combine pass successfully generates the pattern (mulhizi3) for this instruction twice for the following function.

long long res0;
long long res1;

long f1(long a, long b, long c, long d)
{
  res0=((long long) a)*((long long) b);
  res1=((long long) c)*((long long) d);
}

The generated RTL from combine looks like:

(insn 10 9 11 2 g.c:5 (set (reg:ZI 176)
        (mult:ZI (sign_extend:ZI (reg:HI 9 r6 [ b ]))
            (sign_extend:ZI (reg:HI 6 r4 [ a ])))) 262 {*mulhizi3}
     (nil))

However, if I modify the function so that one of the factors is reused,

long f1(long a, long b, long c)
{
  res0=((long long) a)*((long long) b);
  res1=((long long) c)*((long long) b);
}

combine will not fuse the reused sign-extension result to generate the mulhizi3 pattern.

I am wondering if anyone else has hit this issue or if I have done something wrong in my port. Any help would be greatly appreciated.

Thanks,
John Lu
[Bug fortran/56666] Suppression flag for DO loop at (1) will be executed zero times
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=5 --- Comment #3 from Thomas Koenig tkoenig at gcc dot gnu.org --- Files modified in the GCC repository. Log entry: 2013-08-12 Thomas Koenig tkoe...@gcc.gnu.org * gcc-4.9/changes.html: Document -Wzerotrip.
[Bug middle-end/58134] New: -ftree-vectorizer-verbose=n shows vectorized loops only for N == 1 and N > 2 but not for N == 2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58134
Bug ID: 58134
Summary: -ftree-vectorizer-verbose=n shows vectorized loops only for N == 1 and N > 2 but not for N == 2
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Keywords: diagnostic
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: burnus at gcc dot gnu.org

    ... -ftree-vectorizer-verbose=1 test.cc
    test.cc:8: note: Vectorized loop

But no result for

    ... -ftree-vectorizer-verbose=2 test.cc 2>&1|grep 'Vectorized loop'

Again with n = 3:

    ... -ftree-vectorizer-verbose=3 test.cc 2>&1|grep 'Vectorized loop'
    test.cc:8: note: Vectorized loop

    #include <algorithm>
    typedef int myint;
    void max(__restrict myint *data, myint val, int n) {
      //__assume_aligned(data,64);
      data = (myint*) __builtin_assume_aligned(data, 64);
      for (int i = 0; i < n; i++)
        data[i] = std::max(data[i], val);
    }
[Bug middle-end/58096] [4.9 Regression] gcc.dg/tree-ssa/attr-alias.c fails with r201439
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58096 Yufeng Zhang yufeng at gcc dot gnu.org changed: What|Removed |Added CC||yufeng at gcc dot gnu.org --- Comment #3 from Yufeng Zhang yufeng at gcc dot gnu.org --- The test case also failed on ARM and AArch64
[Bug middle-end/58125] [4.9 Regression] ICE: in operator[], at vec.h:827 with -fno-inline-small-functions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58125 Marek Polacek mpolacek at gcc dot gnu.org changed: What|Removed |Added CC||hubicka at gcc dot gnu.org --- Comment #2 from Marek Polacek mpolacek at gcc dot gnu.org --- Started with r201439.
[Bug middle-end/58125] [4.9 Regression] ICE: in operator[], at vec.h:827 with -fno-inline-small-functions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58125 --- Comment #3 from Marek Polacek mpolacek at gcc dot gnu.org --- Seems like we're trying to access (*inline_summary_vec)[node->uid]; where the node->uid is 8, but inline_summary_vec's length is 8.
[Bug fortran/52153] REAL128 gives extended precision, not quad precision
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52153 A. Kasahara latlon90180+gcc_bugzilla at gmail dot com changed: What|Removed |Added CC||latlon90180+gcc_bugzilla@gm ||ail.com --- Comment #8 from A. Kasahara latlon90180+gcc_bugzilla at gmail dot com --- Is there any progress on this? REAL128 of gfortran4.8 is still 10.
[Bug c++/58129] [C++11] Lack of access control checking using auto type deduction
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58129 Jonathan Wakely redi at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #1 from Jonathan Wakely redi at gcc dot gnu.org --- Access control applies to names, and you don't use the private name Private so there's no error.
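The rule in this comment can be illustrated with a small sketch. The bug's own test case is not reproduced in this message, so the class and function names below (`Outer`, `make`, `read_via_auto`) are hypothetical; only the private name `Private` comes from the comment.

```cpp
#include <cassert>

class Outer {
    // The *name* Private is private to Outer. Its member `value` is public
    // within Private itself.
    struct Private {
        int value;
        Private() : value(42) {}
    };

public:
    // A public function may still return an object of the private type.
    static Private make() { return Private(); }
};

// Outer::Private p = Outer::make();   // rejected: the name Private is private

// Hypothetical helper: `auto` deduces the type without the caller ever
// writing the private name, so there is no name for access control to check.
inline int read_via_auto() {
    auto p = Outer::make();
    assert(p.value == 42);
    return p.value;
}
```

The commented-out declaration is the only spot where the private name would be spelled outside the class, which is why it is the only line that would be diagnosed.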
[Bug gcov-profile/58127] [4.9 Regression] 37 failures in gcc.dg/tree-prof/ for x86_64-apple-darwin10
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58127 --- Comment #1 from Dominique d'Humieres dominiq at lps dot ens.fr --- Revision 201632 is OK, r201634 is not.
[Bug tree-optimization/57980] [4.7/4.8/4.9 Regression] gcc 4.8.1 -foptimize-sibling-calls -O1 ICE in build_int_cst_wide, at tree.c:1210
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57980 --- Comment #3 from Marek Polacek mpolacek at gcc dot gnu.org --- Author: mpolacek Date: Mon Aug 12 08:46:41 2013 New Revision: 201660 URL: http://gcc.gnu.org/viewcvs?rev=201660&root=gcc&view=rev Log: PR tree-optimization/57980 Added: trunk/gcc/testsuite/gcc.dg/pr57980.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-tailcall.c
[Bug tree-optimization/57980] [4.7/4.8/4.9 Regression] gcc 4.8.1 -foptimize-sibling-calls -O1 ICE in build_int_cst_wide, at tree.c:1210
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57980 --- Comment #4 from Marek Polacek mpolacek at gcc dot gnu.org --- Fixed on trunk.
[Bug tree-optimization/58006] [4.8/4.9 Regression] ICE compiling VegaStrike with -ffast-math -ftree-parallelize-loops=2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58006 vincent.legoll at gmail dot com changed: What|Removed |Added CC||vincent.legoll at gmail dot com --- Comment #7 from vincent.legoll at gmail dot com --- Hello, I got the same under Debian Jessie $ gcc-4.8 -v Using built-in specs. COLLECT_GCC=gcc-4.8 COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.8/lto-wrapper Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Debian 4.8.1-2' --with-bugurl=file:///usr/share/doc/gcc-4.8/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.8 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.8 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.8-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --with-arch-32=i586 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 4.8.1 (Debian 4.8.1-2)
[Bug tree-optimization/58039] -ftree-vectorizer makes a loop crash on a non-aligned memory
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58039 --- Comment #4 from Alexander Barkov bar at mariadb dot org ---
Mikael, thanks for your comment on this.

(In reply to Mikael Pettersson from comment #3)
> Your code performs mis-aligned uint16_t stores, which x86 allows.

Right, this is done for performance purposes.

> The vectorizer turns those into larger and still mis-aligned `movdqa' stores, which x86 does not allow, hence the SEGV.

Can you please clarify: is it a bug in the recent gcc versions? Note, we've used such performance improvement tricks for years. It worked perfectly fine until now. Has anything changed in how the gcc vectorizer works recently?

> Replace the non-portable mis-aligned stores with portable code like
>
>     #define int2store_little_endian(s,A) memcpy((s), &(A), 2)
>
> or gcc-specific code like
>
>     struct __attribute__((__packed__)) packed_uint16 { uint16_t u16; };
>     #define int2store_little_endian(s,A) ((struct packed_uint16*)(s))->u16 = (A)
>
> and then the vectorizer generates large `movdqu' stores, which is pretty much the best you can hope for unless you rewrite the code to avoid mis-aligned stores.

Unfortunately it's not possible to avoid mis-aligned stores due to the project architecture. I've read somewhere that gcc vectorizer generates two code branches, for aligned memory and for non-aligned memory (but can't find the reference now). Can you please confirm this?

Thanks.
[Bug tree-optimization/58135] New: [x86] Missed opportunities for partial SLP
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58135 Bug ID: 58135 Summary: [x86] Missed opportunities for partial SLP Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ysrumyan at gmail dot com If we consider the following simple test-case int a[100]; void foo() { a[0] = a[1] = a[2] = a[3] = 0; } SLP vectorization of basic block takes place: gcc -S -O3 -m32 t.c -ftree-vectorizer-verbose=1 t.c:4: note: Vectorized basic-block but if we add at least one more assignment it won't be vectorized: a[0] = a[1] = a[2] = a[3] = a[4] = 0; t11.c:4: note: Build SLP failed: unrolling required in basic block SLP It is clear that gcc can do partial BB vectorization, i.e. vectorize the first 4 assignments only.
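To make "partial" basic-block vectorization concrete, here is a hand-written sketch of the split the report asks for. It is an illustration of the idea only, not what GCC emits: `std::memset` stands in for a single vector store of the first four elements (assuming a 4-byte `int`), and the helper names `foo_scalar`, `foo_partial` and `variants_agree` are hypothetical.

```cpp
#include <cassert>
#include <cstring>

int a[100];

// The five-assignment variant from the report, written as scalar stores.
void foo_scalar() {
    a[0] = a[1] = a[2] = a[3] = a[4] = 0;
}

// Stand-in for partial vectorization: the first four elements are cleared
// as one group, and the leftover fifth element is stored as a scalar.
void foo_partial() {
    std::memset(&a[0], 0, 4 * sizeof a[0]);
    a[4] = 0;
}

// Hypothetical check that both variants leave the array in the same state.
bool variants_agree() {
    int after_scalar[6];
    int after_partial[6];

    for (int i = 0; i < 6; ++i) a[i] = 7;
    foo_scalar();
    std::memcpy(after_scalar, a, sizeof after_scalar);

    for (int i = 0; i < 6; ++i) a[i] = 7;
    foo_partial();
    std::memcpy(after_partial, a, sizeof after_partial);

    assert(a[5] == 7);  // the element after the five stores is untouched
    return std::memcmp(after_scalar, after_partial, sizeof after_scalar) == 0;
}
```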
[Bug tree-optimization/58039] -ftree-vectorizer makes a loop crash on a non-aligned memory
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58039 --- Comment #5 from Mikael Pettersson mikpe at it dot uu.se ---
(In reply to Alexander Barkov from comment #4)
> > The vectorizer turns those into larger and still mis-aligned `movdqa' stores, which x86 does not allow, hence the SEGV.
>
> Can you please clarify: is it a bug in the recent gcc versions? Note, we've used such performance improvement tricks for years. It worked perfectly fine until now. Has anything changed in how the gcc vectorizer works recently?

I know next to nothing about the vectorizer, so I cannot comment on this.

> Unfortunately it's not possible to avoid mis-aligned stores due to the project architecture.

Mis-aligned accesses are Ok, as long as they are expressed using the proper mechanisms (memcpy, attribute packed, or pragma packed).

> I've read somewhere that gcc vectorizer generates two code branches, for aligned memory and for non-aligned memory (but can't find the reference now). Can you please confirm this?

I don't know, see above.
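As a minimal sketch of two of the mechanisms named above, the following expresses a mis-aligned 16-bit store through `memcpy` and through a packed struct. The function names (`store_u16`, `load_u16`, `store_u16_packed`) are hypothetical, the values are stored in host byte order, and the packed variant relies on the GCC-specific `__attribute__((__packed__))` mentioned in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Portable form: memcpy places no alignment requirement on either pointer.
inline void store_u16(void *dst, std::uint16_t v) {
    assert(dst != nullptr);
    std::memcpy(dst, &v, sizeof v);
}

inline std::uint16_t load_u16(const void *src) {
    std::uint16_t v;
    std::memcpy(&v, src, sizeof v);
    return v;
}

// GCC-specific form: the packed attribute tells the compiler that the
// member may live at any address, so it must not assume natural alignment.
struct __attribute__((__packed__)) packed_uint16 {
    std::uint16_t u16;
};

inline void store_u16_packed(void *dst, std::uint16_t v) {
    static_cast<packed_uint16 *>(dst)->u16 = v;
}
```

Either form gives the compiler accurate alignment information, which is the point of the advice; what instructions it then selects is up to the optimizer.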
[Bug regression/58084] FAIL: gcc.dg/torture/pr8081.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58084 Jan Hubicka hubicka at gcc dot gnu.org changed: What|Removed |Added Status|NEW |ASSIGNED CC||hubicka at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |hubicka at gcc dot gnu.org --- Comment #4 from Jan Hubicka hubicka at gcc dot gnu.org --- Mine...
[Bug lto/58108] [4.9 regression] 32-bit g++.dg/torture/covariant-1.C -O2 -flto FAILs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58108 Jan Hubicka hubicka at gcc dot gnu.org changed: What|Removed |Added CC||hubicka at gcc dot gnu.org --- Comment #1 from Jan Hubicka hubicka at gcc dot gnu.org --- Does this bug still reproduce (I fixed problem related to x86 local calls that may fix this too)
[Bug libgomp/38724] Segfault caused by derived-type with allocatable component in private clause
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38724 --- Comment #7 from janus at gcc dot gnu.org --- see also https://groups.google.com/forum/?fromgroups#!topic/comp.lang.fortran/vPs4MJamnCM
[Bug regression/58084] FAIL: gcc.dg/torture/pr8081.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58084 --- Comment #5 from Jan Hubicka hubicka at gcc dot gnu.org --- OK, the problem is that the return type of the nested function is a variable-sized type of the outer function. These types go to function sections and are not merged. We used to not ICE just by luck - RESULT_DECL went to the global section, which created yet another unmerged version of the type that got into RESULT_DECL. This is not the only problem of this kind and I am not quite sure what to do here: either we need to invent a way to refer to items in the other function section, or we need to put all abstract origins into the global stream completely. The second would be very expensive... In this particular case we probably can just teach tree-inline to VOID_CONVERT_EXPR when needed?
[Bug fortran/56655] [F03] ASSOCIATE construct with OpenMP triggers ICE
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56655 --- Comment #4 from janus at gcc dot gnu.org --- The final specification of OpenMP 4.0 has been published by now and apparently supports the ASSOCIATE construct.
[Bug c/58136] New: Initialized static global variables cause segfault on AIX with debugging symbols
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58136
Bug ID: 58136
Summary: Initialized static global variables cause segfault on AIX with debugging symbols
Product: gcc
Version: 4.8.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: gcc at rkeene dot org

Tested with gcc 4.6.3 and gcc 4.8.1 with binutils 2.22

Program Listing #1 (test-1.c):

    static unsigned int test = 3;
    int main(int argc, char **argv) {
        test = 4;
        return(0);
    }

Program Listing #2 (test-2.c):

    static unsigned int test;
    int main(int argc, char **argv) {
        test = 4;
        return(0);
    }

Compiling program listing #1 (above) with the -gxcoff argument causes a segfault. Leaving off the -gxcoff argument, or not initializing the static global (program listing #2, above) causes the program to not segfault.

    $ powerpc-ibm-aix5.3.0.0-gcc -gxcoff -o test-1_g test-1.c
    $ ./test-1_g
    Segmentation fault (core dumped)
    $ powerpc-ibm-aix5.3.0.0-gcc -o test-1 test-1.c
    $ ./test-1
    $ powerpc-ibm-aix5.3.0.0-gcc -gxcoff -o test-2_g test-2.c
    $ ./test-2_g
    $ powerpc-ibm-aix5.3.0.0-gcc -o test-2 test-2.c
    $ ./test-2
[Bug fortran/38724] Segfault caused by derived-type with allocatable component in private clause
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38724 janus at gcc dot gnu.org changed:

What|Removed|Added
Keywords||accepts-invalid
Status|UNCONFIRMED|NEW
Last reconfirmed||2013-08-12
Component|libgomp|fortran
Ever confirmed|0|1

--- Comment #8 from janus at gcc dot gnu.org ---
(In reply to Steve Kargl from comment #6)
> I agree gfortran should reject the program until we have some idea of the behavior with regards to OpenMP 4.0.

It seems that the final OpenMP 4.0 specification does not support allocatable components. In particular it lists "Allocatable enhancement" as unsupported, which supposedly refers to TR 15581 and therefore includes allocatable components, see http://openmp.org/wp/openmp-specifications/

So the test case should probably be rejected by the front end (alternatively: support it as a GNU extension).
[Bug tree-optimization/58137] New: [trunk, ICE] full unroll + AVX2 vectorization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58137
Bug ID: 58137
Summary: [trunk, ICE] full unroll + AVX2 vectorization
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: kirill.yukhin at intel dot com

Created attachment 30635 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30635&action=edit
Reproducer

Hello, the attached test produces an ICE when compiled as

    $ gcc -S -O3 1.c -mavx2

It seems that full unroll or copyprop (or whatever) introduces something wrong.

    1.c: In function 'more_xrv':
    1.c:23:1: error: type mismatch in pointer plus expression
     more_xrv(void)
     ^
    struct XRV *
    struct XRV *
    struct XRV *
    vect_vec_iv_.15_88 = vect_cst_.13_60 + { 64B, 64B, 64B, 64B };
    1.c:23:1: error: type mismatch in pointer plus expression
    struct XRV *
    struct XRV *
    struct XRV *
    ...
[Bug tree-optimization/58137] [trunk, ICE] full unroll + AVX2 vectorization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58137 --- Comment #1 from Yukhin Kirill kirill.yukhin at intel dot com --- Actually, this case came up while debugging Spec2000's perl workload on AVX-512 changes (with bigger tripcount).
[Bug fortran/46271] [F03] OpenMP default(none) and procedure pointers
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46271 janus at gcc dot gnu.org changed:

What|Removed|Added
Status|NEW|ASSIGNED
CC||janus at gcc dot gnu.org
Assignee|unassigned at gcc dot gnu.org|janus at gcc dot gnu.org
Summary|OpenMP default(none) and procedure pointers|[F03] OpenMP default(none) and procedure pointers

--- Comment #2 from janus at gcc dot gnu.org ---
Here is a simple patch to accept version A:

Index: gcc/fortran/openmp.c
===================================================================
--- gcc/fortran/openmp.c    (revision 201653)
+++ gcc/fortran/openmp.c    (working copy)
@@ -847,7 +847,7 @@ resolve_omp_clauses (gfc_code *code)
     for (n = omp_clauses->lists[list]; n; n = n->next)
       {
         n->sym->mark = 0;
-        if (n->sym->attr.flavor == FL_VARIABLE)
+        if (n->sym->attr.flavor == FL_VARIABLE || n->sym->attr.proc_pointer)
           continue;
         if (n->sym->attr.flavor == FL_PROCEDURE
             && n->sym->result == n->sym
@@ -876,8 +876,6 @@ resolve_omp_clauses (gfc_code *code)
                 if (el)
                   continue;
               }
-            if (n->sym->attr.proc_pointer)
-              continue;
           }
         gfc_error ("Object '%s' is not a variable at %L", n->sym->name,
                    &code->loc);
[Bug fortran/46271] [F03] OpenMP default(none) and procedure pointers
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46271 --- Comment #3 from janus at gcc dot gnu.org --- (In reply to mrestelli from comment #0) With version B: gfortran -fopenmp omp_test.f90 -o omp_test omp_test.f90: In function ‘test’: omp_test.f90:25:0: error: ‘pf’ not specified in enclosing parallel omp_test.f90:23:0: error: enclosing parallel What is actually the problem here? That error message looks correct to me, doesn't it?
[Bug target/57717] error: unrecognizable insn compiling ./strtod_l.c from glibc on powerpc-gnuspe
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57717 jules at gcc dot gnu.org changed: What|Removed |Added CC||jules at gcc dot gnu.org --- Comment #7 from jules at gcc dot gnu.org --- Here's another candidate patch: http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00668.html
[Bug fortran/52153] REAL128 gives extended precision, not quad precision
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52153 --- Comment #9 from Steve Kargl sgk at troutmask dot apl.washington.edu ---
On Mon, Aug 12, 2013 at 08:08:18AM +, latlon90180+gcc_bugzilla at gmail dot com wrote:
> Is there any progress on this? REAL128 of gfortran4.8 is still 10.

Need a short example. gfortran has supported a 128-bit real type for quite some time (since 4.6).

    real(4) a
    real(8) b
    real(10) c
    real(16) d
    print '(4(I0,1X))', digits(a), digits(b), digits(c), digits(d)
    end

    % gfortran46 -o z a.f90 && ./z
    24 53 53 113

PS: yes, the output is correct for real(10). FreeBSD-i386's long double only has 53 bits of precision.
[Bug fortran/52153] REAL128 gives extended precision, not quad precision
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52153 --- Comment #10 from kargl at gcc dot gnu.org ---
(In reply to Steve Kargl from comment #9)
> On Mon, Aug 12, 2013 at 08:08:18AM +, latlon90180+gcc_bugzilla at gmail dot com wrote:
> > Is there any progress on this? REAL128 of gfortran4.8 is still 10.
>
> Need a short example. gfortran has supported a 128-bit real type for quite some time (since 4.6).
>
>     real(4) a
>     real(8) b
>     real(10) c
>     real(16) d
>     print '(4(I0,1X))', digits(a), digits(b), digits(c), digits(d)
>     end
>
>     % gfortran46 -o z a.f90 && ./z
>     24 53 53 113
>
> PS: yes, the output is correct for real(10). FreeBSD-i386's long double only has 53 bits of precision.

Ignore. I should have read the audit trail first.
[Bug fortran/46271] [F03] OpenMP default(none) and procedure pointers
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46271 --- Comment #4 from janus at gcc dot gnu.org --- (In reply to janus from comment #2) Here is a simple patch to accept version A: ... which regtests cleanly!
[Bug lto/58108] [4.9 regression] 32-bit g++.dg/torture/covariant-1.C -O2 -flto FAILs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58108 --- Comment #2 from ro at CeBiTec dot Uni-Bielefeld.DE ro at CeBiTec dot Uni-Bielefeld.DE --- --- Comment #1 from Jan Hubicka hubicka at gcc dot gnu.org --- Does this bug still reproduce (I fixed problem related to x86 local calls that may fix this too) The failure still exists in a i386-pc-solaris2.10 bootstrap as of r201663. Rainer
[Bug rtl-optimization/57451] Incorrect debug ranges emitted for -freorder-blocks-and-partition -g
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57451 --- Comment #9 from ccoutant at google dot com --- + if (!active_insn_p (insn)) +continue; I'm not clear on why this is needed. Is it because after the change_scope, insn will now be a NOTE? If that's it, just put the continue in the previous if clause. Because the notes were being skipped by the iteration over instructions, which previously only walked active instructions (notes are not active instructions). So to see the switch section note I had to walk all instructions, and just skip non-active instructions after I am done checking for the note of interest. Oh, right. I didn't notice the change in the for loop. -cary
[Bug c++/58138] New: #include <random> gives warning: macro __code_model_small__ is not used
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58138
Bug ID: 58138
Summary: #include <random> gives warning: macro __code_model_small__ is not used
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: sbergman at redhat dot com

At least with a trunk revision 201654 (aka LATEST-4.9) build (on Fedora 18 x86_64):

    $ cat test1.cc
    #include <random>
    namespace {}
    $ ~/gcc/LATEST-4.9/inst/bin/g++ -std=gnu++11 -Wunused-macros -c test1.cc
    test1.cc:2:12: warning: macro __code_model_small__ is not used [-Wunused-macros]
     namespace {}
                ^

I was able to strip that down to the following excerpt of ~/gcc/LATEST-4.9/inst/lib/gcc/x86_64-unknown-linux-gnu/4.9.0/include/ia32intrin.h:

    $ cat test2.cc
    #include "test2.h"
    namespace {}
    $ cat test2.h
    #pragma GCC system_header
    #pragma GCC push_options
    #pragma GCC target("sse4.2")
    #pragma GCC pop_options
    $ ~/gcc/LATEST-4.9/inst/bin/g++ -Wunused-macros -c test2.cc
    test2.cc:2:12: warning: macro __code_model_small__ is not used [-Wunused-macros]
     namespace {}
                ^

With a build of tags/gcc_4_8_1_release, compiling test1.cc does not give a warning while test2.cc does. And with a random old build of branches/gcc-4_6-branch, compiling neither test1.cc nor test2.cc gives a warning (replacing -std=gnu++11 with -std=gnu++0x when compiling test1.cc).
[Bug fortran/46271] [F03] OpenMP default(none) and procedure pointers
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46271 --- Comment #5 from mrestelli mrestelli at gmail dot com --- (In reply to janus from comment #3) (In reply to mrestelli from comment #0) With version B: gfortran -fopenmp omp_test.f90 -o omp_test omp_test.f90: In function ‘test’: omp_test.f90:25:0: error: ‘pf’ not specified in enclosing parallel omp_test.f90:23:0: error: enclosing parallel What is actually the problem here? That error message looks correct to me, doesn't it? Janus, you are probably right that version B should not compile. I guess when I posted the bug report I was not sure which was the correct version according to the OpenMP specifications, since fp is a variable (requiring an OpenMP attribute), but it behaves like a subroutine (so, no OpenMP attribute). Clearly however at least one of the two versions should work, hence my pointing out that both alternatives do not work. Well, at least this is my recollection, since it was quite a while ago. As a note, I mention that ifort (version 13.1) accepts both versions, but maybe this is an issue with ifort itself. Regards, Marco Restelli
[Bug fortran/46271] [F03] OpenMP default(none) and procedure pointers
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46271 --- Comment #6 from janus at gcc dot gnu.org ---
Hi Marco,

> Janus, you are probably right that version B should not compile. I guess when I posted the bug report I was not sure which was the correct version according to the OpenMP specifications, since fp is a variable (requiring an OpenMP attribute), but it behaves like a subroutine (so, no OpenMP attribute).

well, since a procedure pointer can be assigned and change its value, I would say it counts as a variable and one should make up one's mind whether it is supposed to be shared or private in an OpenMP loop (as for any other variable, this can clearly make a difference). Hence my interpretation that the error message is correct. However, I should note that I'm not much of an OpenMP expert and haven't checked whether the OpenMP specification makes any definitive statement about this. It's merely my 'gut feeling'.

> As a note, I mention that ifort (version 13.1) accepts both versions, but maybe this is an issue with ifort itself.

ifort is not exactly known for its strictness on invalid programs, and of course it may have bugs. I don't know if this is allowed on purpose or if the missing error is an oversight. If ifort accepts the program, it would be interesting to know whether it treats the procptr as private or shared with default(none), and whether this behavior is documented somewhere (either in the OpenMP spec or the ifort docs). Some people claim that documentation is the only thing that distinguishes a feature from a bug ;)

Cheers,
Janus
[Bug target/58139] New: PowerPC volatile VSX register live across call
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58139
Bug ID: 58139
Summary: PowerPC volatile VSX register live across call
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: dje at gcc dot gnu.org

    void tightness3_intrinsics2(double* A, double* B, int N)
    {
      __vector double * vA = (__vector double*)A;
      __vector double * vB = (__vector double*)B;
      __vector double va0, va1;
      double b0, b1, b2, b3;
      va0 = vA[0];
      va1 = vA[1];
      b0 = log(vec_extract(va0, 0));
      b1 = log(vec_extract(va0, 1));
      b2 = log(vec_extract(va1, 0));
      b3 = log(vec_extract(va1, 1));
      __vector double vb0 = {b0, b1};
      __vector double vb1 = {b2, b3};
      vB[0] = vb0;
      vB[1] = vb1;
    }

    xxpermdi 1,63,63,2
    xxpermdi 30,30,29,0
    bl log
    nop
    addi 1,1,192
    li 0,-80
    stxvd2x 30,0,30

GCC should not expect VSX 30 to be preserved across the call to log().
[Bug target/58139] PowerPC volatile VSX register live across call
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58139 David Edelsohn dje at gcc dot gnu.org changed:

What|Removed|Added
Target||powerpc*-*-*
Status|UNCONFIRMED|NEW
Keywords||wrong-code
Last reconfirmed||2013-08-12
CC||bergner at gcc dot gnu.org
Host||powerpc*-*-*
Ever confirmed|0|1
Known to fail||4.6.3, 4.7.3, 4.8.1
Build||powerpc*-*-*

--- Comment #1 from David Edelsohn dje at gcc dot gnu.org ---
Confirmed.
[Bug c++/58140] New: -Wnon-virtual-dtor shouldn't fire for classes declared final
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58140 Bug ID: 58140 Summary: -Wnon-virtual-dtor shouldn't fire for classes declared final Product: gcc Version: 4.7.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: tudorb at fb dot com Created attachment 30636 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30636&action=edit Test case In C++11, we can declare a class as final to indicate that it can't be derived from. In that case, having a public non-virtual destructor is fine, even if the class has virtual methods (no derived classes exist, so deleting an instance via a pointer is always safe). In the attached example, the warning should fire for NonFinalDerived, but not for FinalDerived.
[Bug c++/58140] -Wnon-virtual-dtor shouldn't fire for classes declared final
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58140 --- Comment #1 from Tudor Bosman tudorb at fb dot com --- (Tested with gcc 4.7.1, compiled with -std=c++11 -Wnon-virtual-dtor)
[Bug middle-end/58134] -ftree-vectorizer-verbose=n shows vectorized loops only for N == 1 and N > 2 but not for N == 2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58134 --- Comment #1 from Tobias Burnus burnus at gcc dot gnu.org ---
The reason is the following:

    dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location, "Vectorized loop\n");

And in opts-global.c's dump_remap_tree_vectorizer_verbose:

    switch (value)
      {
      case 0:
        break;
      case 1:
        remapped_opt_info = "optimized";
        break;
      case 2:
        remapped_opt_info = "missed";
        break;
      default:
        remapped_opt_info = "all";
        break;
      }

And dumpfile.h:

    #define MSG_OPTIMIZED_LOCATIONS (1 << 26) /* -fopt-info optimized sources */
    #define MSG_MISSED_OPTIMIZATION (1 << 27) /* missed opportunities */
    #define MSG_NOTE (1 << 28) /* general optimization info */
    #define MSG_ALL (MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION \
                     | MSG_NOTE)
[Bug middle-end/58134] [4.8/4.9 Regression] -ftree-vectorizer-verbose=n shows vectorized loops only for N == 1 and N > 2 but not for N == 2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58134 Tobias Burnus burnus at gcc dot gnu.org changed:

What|Removed|Added
CC||singhai at gcc dot gnu.org
Summary|-ftree-vectorizer-verbose=n shows vectorized loops only for N == 1 and N > 2 but not for N == 2|[4.8/4.9 Regression] -ftree-vectorizer-verbose=n shows vectorized loops only for N == 1 and N > 2 but not for N == 2

--- Comment #2 from Tobias Burnus burnus at gcc dot gnu.org ---
Using g++-4.7 -O3 -ftree-vectorizer-verbose=2 it works as one gets:

    7: LOOP VECTORIZED.

Seemingly caused by r193061
[Bug middle-end/58134] [4.8/4.9 Regression] -ftree-vectorizer-verbose=n shows vectorized loops only for N == 1 and N > 2 but not for N == 2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58134 --- Comment #3 from Sharad Singhai singhai at gcc dot gnu.org ---
I think this is the intended behavior. While working on the new dump infrastructure, I modified the behavior of -ftree-vectorizer-verbose. Thus right now

    -ftree-vectorizer-verbose=1 : dump info about optimized loops
    ...=2 : dump info about missed loops
    ...>2 : dump info about optimized _and_ missed loops

Thus at 3 and greater, you are again seeing info available at 1. But really, only 1 and 2 are meaningful. Anything higher is a combination of these two kinds of information. This was a way to preserve compatibility with old scripts, while deprecating this flag. I didn't see any tests relying on the old behavior.

Here is the current documentation about this flag in gcc.info:

`-ftree-vectorizer-verbose=N'
    This option is deprecated and is implemented in terms of `-fopt-info'. Please use `-fopt-info-KIND' form instead, where KIND is one of the valid opt-info options. It prints additional optimization information. For N=0 no diagnostic information is reported. If N=1 the vectorizer reports each loop that got vectorized, and the total number of loops that got vectorized. If N=2 the vectorizer reports locations which could not be vectorized and the reasons for those. For any higher verbosity levels all the analysis and transformation information from the vectorizer is reported.
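The mapping described in this thread can be modelled in a few lines, which shows why the "Vectorized loop" note disappears only at N == 2. This is a sketch, not GCC source: it assumes the three message kinds are the single-bit flags at positions 26, 27 and 28 listed in the dumpfile.h excerpt of comment #1, and the helpers `verbose_level_to_mask` and `shows_vectorized_loop` are hypothetical names.

```cpp
#include <cassert>

// Assumed bit values, following the dumpfile.h excerpt in comment #1.
constexpr unsigned MSG_OPTIMIZED_LOCATIONS = 1u << 26;
constexpr unsigned MSG_MISSED_OPTIMIZATION = 1u << 27;
constexpr unsigned MSG_NOTE = 1u << 28;
constexpr unsigned MSG_ALL =
    MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION | MSG_NOTE;

// Hypothetical model of the remapping: 0 -> nothing, 1 -> "optimized",
// 2 -> "missed", anything else -> "all".
constexpr unsigned verbose_level_to_mask(int n) {
    return n == 0 ? 0u
         : n == 1 ? MSG_OPTIMIZED_LOCATIONS
         : n == 2 ? MSG_MISSED_OPTIMIZATION
                  : MSG_ALL;
}

// "Vectorized loop" is emitted under MSG_OPTIMIZED_LOCATIONS, so it is
// printed only when that bit is part of the selected mask.
constexpr bool shows_vectorized_loop(int n) {
    return (verbose_level_to_mask(n) & MSG_OPTIMIZED_LOCATIONS) != 0;
}

static_assert(!shows_vectorized_loop(2), "N == 2 selects only missed info");
```

Because level 2 selects a different bit rather than a superset of level 1, the levels are not cumulative, which is the behavior the original report found surprising.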
[Bug c++/58140] -Wnon-virtual-dtor shouldn't fire for classes declared final
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58140 Jonathan Wakely redi at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2013-08-12 Ever confirmed|0 |1 --- Comment #2 from Jonathan Wakely redi at gcc dot gnu.org --- This should be pretty simple to fix, but why use -Wnon-virtual-dtor anyway, when -Wdelete-non-virtual-dtor is more accurate and more useful?
[Bug c++/58140] -Wnon-virtual-dtor shouldn't fire for classes declared final
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58140 --- Comment #3 from Jonathan Wakely redi at gcc dot gnu.org --- (In reply to Tudor Bosman from comment #0) In C++11, we can declare a class as final to indicate that it can't be derived from. In that case, having a public non-virtual destructor is fine, even if the class has virtual methods (no derived classes exist, so deleting an instance via a pointer is always safe). N.B. this is only true if there's no base class with a public destructor, which is true for your example, but not in general.
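The distinction discussed in this thread can be sketched as follows. The attached test case is not reproduced in these messages, so the class names here (`Shape`, `Triangle`, `Square`) and the helper `sides_of_heap_square` are hypothetical. The base class is given a protected destructor so that the caveat above (no base class with a public destructor) holds.

```cpp
#include <cassert>
#include <type_traits>

// Polymorphic base with a protected, non-virtual destructor: objects
// cannot be deleted through a Shape*, so no virtual destructor is needed.
struct Shape {
    virtual int sides() const = 0;

protected:
    ~Shape() {}
};

// Non-final class with a public non-virtual destructor: a further-derived
// object could be deleted through a Triangle*, so a warning is reasonable.
struct Triangle : Shape {
    int sides() const override { return 3; }
};

// Final class: nothing can derive from it, so a Square* always points to
// an object whose dynamic type is exactly Square.
struct Square final : Shape {
    int sides() const override { return 4; }
};

static_assert(!std::has_virtual_destructor<Square>::value,
              "Square deliberately has a non-virtual destructor");

// Hypothetical helper: deleting through Square* is safe because the
// static and dynamic types are guaranteed to match.
inline int sides_of_heap_square() {
    Square *s = new Square;
    int n = s->sides();
    delete s;
    return n;
}
```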
[Bug c/58141] New: [bfin]: ICE: Segmentation fault
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58141
Bug ID: 58141
Summary: [bfin]: ICE: Segmentation fault
Product: gcc
Version: 4.8.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: canyon at recursivebliss dot com

Created attachment 30637 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30637&action=edit
Preprocessed source

Description of problem: When compiling Das U-Boot for a Blackfin target, there is an internal compiler error: Segmentation fault.

Version-Release number of selected component (if applicable): bfin-uclinux-gcc 4.8.1

How reproducible: Every time.

Steps to Reproduce:
1. git clone git://git.denx.de/u-boot.git
2. cd u-boot
3. make bf518f-ezbrd

Actual results:

    main.c: In function 'builtin_run_command':
    main.c:1434:1: internal compiler error: Segmentation fault
     }

Expected results: Successful build.

Additional info: If you comment out the call of process_macros in builtin_run_command the build is successful.
[Bug tree-optimization/58137] [trunk, ICE] full unroll + AVX2 vectorization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58137 Bernd Edlinger bernd.edlinger at hotmail dot de changed: What|Removed |Added CC||bernd.edlinger at hotmail dot de --- Comment #2 from Bernd Edlinger bernd.edlinger at hotmail dot de --- reproduced also with arm-none-eabi: ../arm-eabi/bin/arm-eabi-gcc -O3 -mfpu=neon -mfloat-abi=softfp 1.c
[Bug tree-optimization/58121] [4.9 regression] FAIL: cc1224a
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58121

Eric Botcazou ebotcazou at gcc dot gnu.org changed:

           What    |Removed      |Added
 ------------------------------------------------
             Status|UNCONFIRMED  |WAITING
   Last reconfirmed|             |2013-08-12
                 CC|             |ebotcazou at gcc dot gnu.org
     Ever confirmed|0            |1

--- Comment #1 from Eric Botcazou ebotcazou at gcc dot gnu.org ---
I cannot reproduce:

=== acats tests ===

=== acats Summary ===
# of expected passes       2320
# of unexpected failures   0
Native configuration is ia64-unknown-linux-gnu

=== gnat tests ===

Running target unix

=== gnat Summary ===

# of expected passes       1168
# of expected failures     18
# of unsupported tests     10
[Bug other/58133] GCC should emit arm assembly following the unified syntax
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58133

--- Comment #1 from Sven sven.koehler at gmail dot com ---
It seems that for targets like -mcpu=cortex-m4, GCC does generate unified syntax. So is the unified syntax only used for newer targets that use the thumb2 instruction set, whereas the divided syntax is used for older thumb1 targets?
[Bug c++/58142] New: _pthread_tsd_cleanup called before destructors are called
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58142

Bug ID: 58142
Summary: _pthread_tsd_cleanup called before destructors are called
Product: gcc
Version: 4.8.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: soonhok at cs dot cmu.edu

Created attachment 30638
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30638&action=edit
preprocessed input file

It seems that when a thread finishes, its thread cleanup routine is called before the destructors of TLS (Thread Local Storage) variables are called, and this causes (possible) segmentation faults. I provided a simplified small program which reproduces the behavior. Even if it doesn't generate a segmentation fault, running valgrind over it indicates that the same problem is going on at run time.

This problem happens only on OSX. When I tried the same C++ code on Ubuntu 12.04 with g++-4.8.1, there was no problem. I also tried with clang++-3.3 on OSX. There was no problem either.

1. The exact version of GCC: gcc-4.8.1
2. the system type: OSX 10.8.4, Darwin air 12.4.0 Darwin Kernel Version 12.4.0
3. the options given when GCC was configured/built: g++-4.8 -std=c++11 thread.cpp -O thread
4. the complete command line that triggers the bug; valgrind thread
5. the compiler output (error messages, warnings, etc.); and the preprocessed file (*.i*) that triggers the bug, generated by adding -save-temps to the complete compilation command, or, in the case of a bug report for the GNAT front end, a complete set of source files (see below).

Preprocessed file is attached. Here is the original source code (much shorter):

==
#include <thread>
#include <iostream>
#include <mutex>
#include <vector>

static void foo() {
    static thread_local std::vector<int> v(1024);
    if (v.size() != 1024) {
        std::cerr << "Error\n";
        exit(1);
    }
}

static void tst1() {
    unsigned n = 5;
    for (unsigned i = 0; i < n; i++) {
        std::thread t([](){ foo(); });
        t.join();
    }
}

int main() {
    tst1();
}
==

The following is the output of valgrind:
...
==18408== Invalid read of size 8
==18408==    at 0x121B3: std::_Vector_base<int, std::allocator<int> >::~_Vector_base() (in ./a.out)
==18408==    by 0x12054: std::vector<int, std::allocator<int> >::~vector() (in ./a.out)
==18408==    by 0xB9F5: (anonymous namespace)::run(void*) (in /usr/local/lib/libstdc++.6.dylib)
==18408==    by 0x29CA01: _pthread_exit (in /usr/lib/system/libsystem_c.dylib)
==18408==    by 0x29C7AC: _pthread_start (in /usr/lib/system/libsystem_c.dylib)
==18408==    by 0x2891E0: thread_start (in /usr/lib/system/libsystem_c.dylib)
==18408==  Address 0x1000257a8 is 8 bytes inside a block of size 32 free'd
==18408==    at 0x5632: free (in /usr/local/Cellar/valgrind/3.8.1/lib/valgrind/vgpreload_memcheck-amd64-darwin.so)
==18408==    by 0x1CAA12: emutls_destroy (in /usr/local/lib/libgcc_s.1.dylib)
==18408==    by 0x101: ???
==18408==    by 0xB0080E9F: ???
==18408==    by 0xB008186F: ???
==18408==    by 0x2A34DF: _pthread_tsd_cleanup (in /usr/lib/system/libsystem_c.dylib)
==18408==    by 0xB008105F: ???
...
[Bug target/58139] PowerPC volatile VSX register live across call
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58139 --- Comment #2 from Peter Bergner bergner at gcc dot gnu.org --- This looks like a scheduling bug. Just before sched2, we have: (call_insn 29 28 31 2 (parallel [ (set (reg:DF 33 1) (call (mem:SI (symbol_ref:DI (log) [flags 0x41] function_decl 0xfff92c41200 log) [0 __builtin_log S4 A8]) (const_int 64 [0x40]))) (use (const_int 0 [0])) (clobber (reg:SI 65 lr)) ]) bug.c:17 509 {*call_value_nonlocal_aix64} (expr_list:REG_EH_REGION (const_int 0 [0]) (nil)) (expr_list:REG_NON_LOCAL_GOTO (use (reg:DF 33 1)) (nil))) (insn 31 29 34 2 (set (reg:V2DF 62 30 [orig:140 vb0 ] [140]) (unspec:V2DF [ (reg/v:DF 62 30 [orig:123 b0 ] [123]) (reg/v:DF 61 29 [orig:125 b1 ] [125]) ] UNSPEC_VSX_CONCAT)) bug.c:18 920 {vsx_concat_v2df} (expr_list:REG_DEAD (reg/v:DF 61 29 [orig:125 b1 ] [125]) (expr_list:REG_EQUIV (mem:V2DF (reg/v/f:DI 30 30 [orig:133 B ] [133]) [2 MEM[(__vector double *)B_2(D)]+0 S16 A128]) (nil Here, insn 31 sets VSX reg 62 (ie, fpr30,vsr30). In DFmode, reg 62 is a non-volatile register, but in V2DFmode, it is volatile. 
After sched2, we have: insn:TI 31 28 29 2 (set (reg:V2DF 62 30 [orig:140 vb0 ] [140]) (unspec:V2DF [ (reg/v:DF 62 30 [orig:123 b0 ] [123]) (reg/v:DF 61 29 [orig:125 b1 ] [125]) ] UNSPEC_VSX_CONCAT)) bug.c:18 920 {vsx_concat_v2df} (expr_list:REG_DEAD (reg/v:DF 61 29 [orig:125 b1 ] [125]) (expr_list:REG_EQUIV (mem:V2DF (reg/v/f:DI 30 30 [orig:133 B ] [133]) [2 MEM[(__vector double *)B_2(D)]+0 S16 A128]) (nil (call_insn 29 31 72 2 (parallel [ (set (reg:DF 33 1) (call (mem:SI (symbol_ref:DI (log) [flags 0x41] function_decl 0xfff92c41200 log) [0 __builtin_log S4 A8]) (const_int 64 [0x40]))) (use (const_int 0 [0])) (clobber (reg:SI 65 lr)) ]) bug.c:17 509 {*call_value_nonlocal_aix64} (expr_list:REG_EH_REGION (const_int 0 [0]) (nil)) (expr_list:REG_NON_LOCAL_GOTO (use (reg:DF 33 1)) (nil))) So it looks like the scheduler is somehow thinking that reg 62 is non-volatile when it's really volatile in V2DFmode and moving it before the call which ends up clobbering it. Still digging.
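The point of the analysis above is that "is this register preserved across a call?" depends on the mode as well as the register number. A toy model of that rule, as a sketch only: the types and function names below are hypothetical and are not GCC's interface, and the register numbering (FPRs as hard registers 32..63, with fpr14..fpr31 callee-saved in their low 64 bits only) is an assumption consistent with the dumps quoted here.

```cpp
#include <cassert>

// Simplified, hypothetical model of the situation in this comment;
// it is not GCC's real data structure or hook.
enum class Mode { DF, V2DF };

// Assumption: FPRs are hard registers 32..63, and fpr14..fpr31 (46..63)
// are callee-saved only in their low 64 bits, i.e. when used in DFmode.
constexpr bool survives_call(int regno, Mode mode) {
    return regno >= 46 && regno <= 63 && mode == Mode::DF;
}

// A value that is set before a call and used after it is only safe if
// the register survives the call in the mode that was written.
constexpr bool safe_to_hoist_above_call(int regno, Mode mode) {
    return survives_call(regno, mode);
}

static_assert(survives_call(62, Mode::DF), "reg 62 is non-volatile in DFmode");
static_assert(!survives_call(62, Mode::V2DF), "reg 62 is volatile in V2DFmode");
```

Under this model, moving insn 31 (a V2DF set of reg 62) above the call to log is rejected, which is the behaviour the scheduler is expected to have.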
[Bug c++/57416] internal compiler error: in gimple_expand_cfg, at cfgexpand.c:4575
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57416 --- Comment #8 from Paolo Carlini paolo.carlini at oracle dot com --- The ICE is indeed fixed in mainline. I'm going to commit a (reduced) testcase and close the issue.
[Bug c++/57416] internal compiler error: in gimple_expand_cfg, at cfgexpand.c:4575
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57416 Paolo Carlini paolo.carlini at oracle dot com changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED Target Milestone|--- |4.9.0 --- Comment #9 from Paolo Carlini paolo.carlini at oracle dot com --- Done.
[Bug go/58075] Unable to build go on ia64-hp-hpux11.31
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58075 --- Comment #2 from Paul Ackersviller pda at freeshell dot org --- Thanks, I have sent this on to HP. Should I report back a patch number, or whatever they end up responding with?
[Bug tree-optimization/58137] [trunk, ICE] full unroll + AVX2 vectorization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58137

--- Comment #3 from Bernd Edlinger bernd.edlinger at hotmail dot de ---
Created attachment 30639
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30639&action=edit
possible fix

This seems to be a bug in the constant folding of constant vector values at forwprop4. Could someone check whether the generated code is now correct? Thanks.
[Bug go/58075] Unable to build go on ia64-hp-hpux11.31
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58075 --- Comment #3 from Ian Lance Taylor ian at airs dot com --- Yes, please. Thanks.
[Bug middle-end/58143] New: wrong code at -O3 on x86_64-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58143

Bug ID: 58143
Summary: wrong code at -O3 on x86_64-linux-gnu
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: su at cs dot ucdavis.edu

The current gcc trunk and gcc 4.8 produce wrong code for the following testcase on x86_64-linux when compiled at -O3 (in both 32-bit and 64-bit modes). This is a regression from 4.7.x.

$ gcc-trunk -v
gcc version 4.9.0 20130812 (experimental) [trunk revision 201658] (GCC)
$ gcc-trunk -O2 small.c
$ a.out
0
$ gcc-4.7 -O3 small.c
$ a.out
0
$ gcc-trunk -O3 small.c
$ a.out
-1
$ gcc-4.8 -O3 small.c
$ a.out
-1
$

--

int printf (const char *, ...);

int a, b, c, d, e, f, g, h = 1, i;

int foo (int p)
{
  return p < 0 && a < -2147483647 - 1 - p ? 0 : 1;
}

int *bar ()
{
  int j;
  i = h ? 0 : 1 % h;
  for (j = 0; j < 1; j++)
    for (d = 0; d; d++)
      for (e = 1; e;)
        return 0;
  return 0;
}

int baz ()
{
  for (; b >= 0; b--)
    for (c = 1; c >= 0; c--)
      {
        int *k = &c;
        for (;;)
          {
            for (f = 0; f < 1; f++)
              {
                g = foo (*k);
                bar ();
              }
            if (*k)
              break;
            return 0;
          }
      }
  return 0;
}

int main ()
{
  baz ();
  printf ("%d\n", b);
  return 0;
}
[Bug middle-end/58143] wrong code at -O3 on x86_64-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58143

--- Comment #1 from Andrew Pinski pinskia at gcc dot gnu.org ---
> -2147483647 - 1 - p

Hmm, this overflows for p >= 1.
[Bug middle-end/58143] wrong code at -O3 on x86_64-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58143

--- Comment #2 from Zhendong Su su at cs dot ucdavis.edu ---
Andrew, because of short-circuiting, when p >= 0, the expression -2147483647 - 1 - p isn't actually evaluated.

Thanks for looking into this so quickly!

Zhendong
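The short-circuit argument can be checked on a stand-alone version of the guard. This is a sketch, not the reporter's code: foo_like is a made-up name, the global a is turned into a parameter, and INT_MIN stands in for the literal -2147483647 - 1 (which assumes a 32-bit int).

```cpp
#include <cassert>
#include <climits>

// Same shape as foo() in the testcase. The subtraction INT_MIN - p is
// only reached when p < 0, and for negative p the result lies between
// INT_MIN + 1 and 0, so it cannot overflow. For p >= 0 the right-hand
// operand of && is never evaluated.
inline int foo_like(int p, int a) {
    return (p < 0 && a < INT_MIN - p) ? 0 : 1;
}
```

In the testcase foo is only ever called with *k, that is with c equal to 1 or 0, so the subtraction is never executed there at all.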
[Bug c++/58144] New: Receive virtual memory exhausted: Cannot allocate memory while compiling
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58144 Bug ID: 58144 Summary: Receive virtual memory exhausted: Cannot allocate memory while compiling Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: amit.chitnis at gmail dot com g++ (GCC) 4.4.6 20130305 (Red Hat 4.4.6-4). Steps to reproduce 1. create a small hello world program with iostream,stdio.h and stdlib.h using namespace std; 2. create a big file (say 900M) named new in a location which is a part of your include path. 3. Compile the hello world cpp and it should fail with the above error. This seems to be because of the size and name of the file created in step 2 above. g++ -g -pthread -D_THREAD_SAFE -D_REENTRANT -I/opt/performance -o helloworld.o -c helloworld.cpp file new was created at location /opt/performance/
[Bug middle-end/58145] New: [Regression]: volatileness of write is discarded, perhaps in lim1 related to loop optimizations
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145

Bug ID: 58145
Summary: [Regression]: volatileness of write is discarded, perhaps in lim1 related to loop optimizations
Product: gcc
Version: 4.9.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: hp at gcc dot gnu.org
Target: cris-*-*, crisv32-*-*

Created attachment 30640
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30640&action=edit
Preprocessed code; compile at -O2, e.g. cc1 -O2 y.i -o y.s

The exact revision in which the bug appeared has not been triaged yet: it is present on r201675 of trunk, r201652 of the 4.8 branch and r190527 of the 4.7 branch (!), but appears not to be present in r135713 of the 4.3 branch (!).

The bug is that the volatileness of the write (the assignment through a pointer to a volatile structure) in function pb_out is discarded, leaving a single write after the loop. Note also that together with the discarded-volatileness bug there seems to be a missed-optimization bug in that the loop is redundant: the loop awkwardly iterates over 0..31 and computes 1<<i, but the intermediate computations aren't used; then the last value is written after the loop.

Editing the code to manually inline pb_out makes no difference to the bug. The wrong code is evident already in the .expand dump on trunk (according to -da). It is not present (according to -fdump-tree-all-all) in y.i.096t.loopinit but appears present in y.i.097t.lim1.

Until someone (including myself) has repeated the observation for another target, I'll set the target-specifier to cris*-* but it seems obviously generic, affecting all targets.
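For readers without the attachment, the kind of loop being described looks roughly like the sketch below. It is an assumption-laden reconstruction, not the attached code: Port and pulse_bits are invented names, and only the "loop over 0..31 storing 1<<i through a pointer to volatile" shape is taken from the report.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the device structure in the attachment.
struct Port {
    std::uint32_t data;
};

// Every iteration must perform its own store, because the object is
// accessed through a pointer to volatile. Collapsing the loop into one
// final store, as the report describes, would be wrong code: the 31
// intermediate writes are observable behaviour.
inline void pulse_bits(volatile Port* p) {
    for (unsigned i = 0; i < 32; ++i)
        p->data = std::uint32_t{1} << i;
}
```

A test on ordinary memory can only observe the last value written; the bug itself is visible in the generated assembly or in the lim1 dump, where 32 stores should remain inside the loop.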
Re: [patch, fortran] RFD: PR 56666 Allow suppression of zero-trip DO loop warning
Hi Janus,

> > OK for trunk?
>
> Looks good to me.

Committed as rev. 201658; also committed a snippet to the documentation.

Thanks for the review!

Thomas
[PATCH] TREE-SSA remove redundant condition checks in get_default_value
In function get_default_value of tree-ssa-ccp.c,

  else if (is_gimple_assign (stmt)
           /* Value-returning GIMPLE_CALL statements assign to
              a variable, and are treated similarly to GIMPLE_ASSIGN.  */
           || (is_gimple_call (stmt)
               && gimple_call_lhs (stmt) != NULL_TREE)
           || gimple_code (stmt) == GIMPLE_PHI)
    {
      tree cst;
      if (gimple_assign_single_p (stmt)
          && DECL_P (gimple_assign_rhs1 (stmt))
          && (cst = get_symbol_constant_value (gimple_assign_rhs1 (stmt))))
        {
          val.lattice_val = CONSTANT;
          val.value = cst;
        }
      else
        /* Any other variable defined by an assignment or a PHI node
           is considered UNDEFINED.  */
        val.lattice_val = UNDEFINED;

if the stmt is a gimple call node or a gimple phi node, it will never satisfy the condition gimple_assign_single_p (stmt), so there are redundant condition checks. The attached patch tries to remove them.

Bootstrap passed. Regression tested on x86_64-unknown-linux-gnu (pc).

ChangeLog:
2013-08-13  Zhouyi Zhou  yizhouz...@ict.ac.cn

        * tree-ssa-ccp.c (get_default_value): remove redundant condition checks

--
Zhouyi Zhou yizhouz...@ict.ac.cn

diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index 6472f48..7fbb687 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -258,12 +258,7 @@ get_default_value (tree var)
           val.mask = double_int_minus_one;
         }
     }
-  else if (is_gimple_assign (stmt)
-           /* Value-returning GIMPLE_CALL statements assign to
-              a variable, and are treated similarly to GIMPLE_ASSIGN.  */
-           || (is_gimple_call (stmt)
-               && gimple_call_lhs (stmt) != NULL_TREE)
-           || gimple_code (stmt) == GIMPLE_PHI)
+  else if (is_gimple_assign (stmt))
     {
       tree cst;
       if (gimple_assign_single_p (stmt)
@@ -274,10 +269,18 @@ get_default_value (tree var)
           val.value = cst;
         }
       else
-        /* Any other variable defined by an assignment or a PHI node
+        /* Any other variable defined by an assignment
            is considered UNDEFINED.  */
         val.lattice_val = UNDEFINED;
     }
+  else if ((is_gimple_call (stmt)
+            && gimple_call_lhs (stmt) != NULL_TREE)
+           || gimple_code (stmt) == GIMPLE_PHI)
+    {
+      /* Variable defined by a call or a PHI node
+         is considered UNDEFINED.  */
+      val.lattice_val = UNDEFINED;
+    }
   else
     {
       /* Otherwise, VAR will never take on a constant value.  */
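The control flow after the patch can be summarised with a small stand-alone model. This is a sketch with invented types: StmtKind, Lattice and default_value are hypothetical and do not exist in GCC, the single_const_rhs flag stands for the whole gimple_assign_single_p / DECL_P / get_symbol_constant_value test, and the Varying result for the final else branch is an assumption, since the quoted hunk is cut off before that assignment.

```cpp
#include <cassert>

// Hypothetical stand-ins for the GIMPLE statement kinds and the CCP
// lattice values; the real ones live in GCC's gimple and tree-ssa-ccp code.
enum class StmtKind { Assign, CallWithLhs, Phi, Other };
enum class Lattice { Undefined, Constant, Varying };

// single_const_rhs models "the statement is a single assignment whose
// right-hand side is a declaration with a known constant value".
inline Lattice default_value(StmtKind kind, bool single_const_rhs) {
    if (kind == StmtKind::Assign)
        return single_const_rhs ? Lattice::Constant : Lattice::Undefined;
    if (kind == StmtKind::CallWithLhs || kind == StmtKind::Phi)
        return Lattice::Undefined;  // the single-assign test is never needed here
    return Lattice::Varying;        // assumed value for the final else branch
}
```

The results are the same as before the patch; the only difference is that calls and PHI nodes no longer go through a test they can never pass.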
Re: [PATCH] x86-64 gcc generate wrong assembly instruction movabs for intel syntax
Hello!

> movabs is incorrectly translated into mov [rax], -1, and causes compile error
> Error: ambiguous operand size for `mov'.
> It should be mov QWORD PTR [rax], -1
>
> Bootstrap passed. Regression tested on x86_64-unknown-linux-gnu (pc).
>
> 2013-08-10  Perez Read  netfirew...@gmail.com
>
>         * config/i386/i386.md (*movabs<mode>_1): Add <ptrsize> PTR before
>         operand 0 for intel asm alternative.
>         * testsuite/gcc.target/i386/movabs-1.c: New test.

You should mention PR number in the ChangeLog.

Looks OK, but I think that for consistency this decoration should also be added to *movabs<mode>_2 pattern.

Uros.
Re: [PATCH v2 00/18] resurrect automatic dependencies
>>>>> "Tom" == Tom Tromey tro...@redhat.com writes:

Tom> This is a refresh of my series to resurrect automatic dependency
Tom> tracking.

Ping.

Tom
Re: [PATCH] Fix PR57980
On Fri, Aug 09, 2013 at 08:40:00PM +0200, Richard Biener wrote:
> Marek Polacek pola...@redhat.com wrote:
> > In this PR the problem was that when dealing with the gimple assign in
> > the tailcall optimization, we, when the rhs operand is of a vector
> > type, need to create -1 also of a vector type, but build_int_cst
> > doesn't create vectors (ICEs). Instead, we should use
> > build_minus_one_cst because that can create even the VECTOR_TYPE
> > constant (and, it can create even REAL_TYPE/COMPLEX_TYPE), as
> > suggested by Marc.
> >
> > Regtested/bootstrapped on x86_64-linux, ok for trunk and 4.8?
>
> Ok. Double-check that this function exists on the branch please.

It does not :(. So not backporting to 4.8...

Marek
Re: [wwwdocs] Add link to @gnutools on Twitter
On Mon, Aug 12, 2013 at 01:20:03AM +0100, Gerald Pfeifer wrote: David suggested adding this link, and I think it fits nicely. Does this also deserve a news post? I certainly found it to be interesting news! James
Re: [PATCH] x86-64 gcc generate wrong assembly instruction movabs for intel syntax
On Mon, Aug 12, 2013 at 2:52 PM, Uros Bizjak ubiz...@gmail.com wrote:
> Hello!
>
> > movabs is incorrectly translated into mov [rax], -1, and causes compile error
> > Error: ambiguous operand size for `mov'.
> > It should be mov QWORD PTR [rax], -1
> >
> > Bootstrap passed. Regression tested on x86_64-unknown-linux-gnu (pc).
> >
> > 2013-08-10  Perez Read  netfirew...@gmail.com
> >
> >         * config/i386/i386.md (*movabs<mode>_1): Add <ptrsize> PTR before
> >         operand 0 for intel asm alternative.
> >         * testsuite/gcc.target/i386/movabs-1.c: New test.
>
> You should mention PR number in the ChangeLog.
>
> Looks OK, but I think that for consistency this decoration should also
> be added to *movabs<mode>_2 pattern.
>
> Uros.

Hello,

After the test, I think we can skip this pattern. Because operand 0 must be a register, the assembler will determine the size automatically.

Perez

Fixed ChangeLog

2013-08-10  Perez Read  netfirew...@gmail.com

        PR target/58132
        * config/i386/i386.md (*movabs<mode>_1): Add <ptrsize> PTR before
        operand 0 for intel asm alternative.
        * testsuite/gcc.target/i386/movabs-1.c: New test.
Re: [PATCH] x86-64 gcc generate wrong assembly instruction movabs for intel syntax
On Mon, Aug 12, 2013 at 11:24 AM, Perez Read netfirew...@gmail.com wrote:
> > > movabs is incorrectly translated into mov [rax], -1, and causes compile error
> > > Error: ambiguous operand size for `mov'.
> > > It should be mov QWORD PTR [rax], -1
> > >
> > > Bootstrap passed. Regression tested on x86_64-unknown-linux-gnu (pc).
> > >
> > > 2013-08-10  Perez Read  netfirew...@gmail.com
> > >
> > >         * config/i386/i386.md (*movabs<mode>_1): Add <ptrsize> PTR before
> > >         operand 0 for intel asm alternative.
> > >         * testsuite/gcc.target/i386/movabs-1.c: New test.
> >
> > You should mention PR number in the ChangeLog.
> >
> > Looks OK, but I think that for consistency this decoration should also
> > be added to *movabs<mode>_2 pattern.
> >
> > Uros.
>
> Hello,
>
> After the test, I think we can skip this pattern. Because the operand 0
> must be the register, the assembler will determine the size automatically.

As said, I don't want two similar patterns with a different asm template in i386.md. So, if decorating movabs<mode>_2 works OK, I propose to change both patterns with your change.

Uros.
Backport from trunk:
I think this one is obvious/trivial, but I'll ask anyway. OK?

Andrew.

2013-08-12  Andrew Haley  a...@redhat.com

        Backport from mainline:
        * 2013-07-11  Andreas Schwab  sch...@suse.de

        * config/aarch64/aarch64-linux.h (CPP_SPEC): Define.

Index: gcc/config/aarch64/aarch64-linux.h
===================================================================
--- gcc/config/aarch64/aarch64-linux.h  (revision 201661)
+++ gcc/config/aarch64/aarch64-linux.h  (working copy)
@@ -23,6 +23,8 @@
 
 #define GLIBC_DYNAMIC_LINKER "/lib/ld-linux-aarch64.so.1"
 
+#define CPP_SPEC "%{pthread:-D_REENTRANT}"
+
 #define LINUX_TARGET_LINK_SPEC  "%{h*}          \
    %{static:-Bstatic}                           \
    %{shared:-shared}                            \
RFA: AVR: Support building AVR Linux targets
Hi Dennis, Hi Anatoly, Hi Eric,

I have run into a small problem building GCC for an AVR Linux target - glibc-c.o is not being built. It turns out that the section handling avr-*-* in the config.gcc file is redefining tmake_file without allowing for the fact that t-glibc has already been added to it.

The patch below is the obvious fix for this problem, but I have not committed it because it occurred to me that there might be some AVR specific reason for not including t-glibc. So - is the patch OK, or is there some other way of fixing the problem?

Cheers
  Nick

gcc/ChangeLog
2013-08-12  Nick Clifton  ni...@redhat.com

        * config.gcc (avr-*-*): Allow for tmake_file not being empty.

Index: gcc/config.gcc
===================================================================
--- gcc/config.gcc      (revision 201658)
+++ gcc/config.gcc      (working copy)
@@ -1001,7 +1001,7 @@
            tm_file="${tm_file} ${cpu_type}/avrlibc.h"
            tm_defines="${tm_defines} WITH_AVRLIBC"
        fi
-       tmake_file="avr/t-avr avr/t-multilib"
+       tmake_file="${tmake_file} avr/t-avr avr/t-multilib"
        use_gcc_stdint=wrap
        extra_gcc_objs="driver-avr.o avr-devices.o"
        extra_objs="avr-devices.o avr-log.o"
[RFC] Bare bones of virtual call tracking
Hi,
this patch represents the bare bones of what I hope will give me the possible targets of a virtual call. I basically added a One Definition Rule based hash that unifies all types that are the same in the C++ sense (with LTO many of those are still not merged - I hope that with a few dumps I can improve the merging, too).

So every type used in a virtual method declaration gets assigned an odr_type entry. Then I use BINFO_BASE_BINFOS to walk direct bases and produce a type inheritance graph linking each type with its bases but also with its derived types. So I get:

jan@linux-9ure:~/trunk/build/gcc> ./xgcc -B ./ -O2 devirt-1.C
type 0: struct A
 defined at: devirt-1.C:7
 methods:
  virtual int A::foo(int)/0
 derived types:
  type 1: struct C
   defined at: devirt-1.C:20
   base odr type ids: 0
   methods:
    virtual int C::foo(int)/2
  type 2: struct B
   defined at: devirt-1.C:14
   base odr type ids: 0
   methods:
    virtual int B::foo(int)/1

I think in future I can also use this for LTO merging (i.e. merge binfos of all types equivalent by ODR) and perhaps canonical types can be refined to honor ODR when there is no non-ODR language type of the same layout.

Now for single inheritance, I think my work is easy: I have the token and type of the virtual call taken from OBJ_TYPE_REF. I think I can just walk my inheritance graph now and on each entry look for the method with the given token (I can take it from the virtual table, or I can actually use DECL_VINFO and complete my current partial tracking of them) and put them into a set. Those should be all possible virtual call targets (defined in the current unit) of the call.

With multiple inheritance I need to adjust offsets. I assume for every type, I can simply walk binfos, look for the matching type of the call and look for the method at the given token within the binfo. This will be quadratic. The other option would be to track the offsets in my base to derived type link. But I do not know how to obtain it, since BINFO_BASE_BINFOS does not track them. Shall I look for TYPE_FIELDs instead?
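The single-inheritance walk described above can be sketched outside of GCC as follows. Everything here is hypothetical: TypeNode, collect, possible_targets and demo_targets_of_A are illustrative names, methods are reduced to one string per type for a single vtable token, and the graph is the A/B/C example from the dump.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical miniature of the type inheritance graph: each node knows
// its own override for one fixed vtable token (empty if it has none)
// and the types directly derived from it.
struct TypeNode {
    std::string name;
    std::string method;
    std::vector<const TypeNode*> derived;
};

// Depth-first walk from the static type of the call through all derived
// types, collecting every method that could be reached by the call.
inline void collect(const TypeNode& t, std::set<std::string>& out) {
    if (!t.method.empty())
        out.insert(t.method);
    for (const TypeNode* d : t.derived)
        collect(*d, out);
}

inline std::set<std::string> possible_targets(const TypeNode& static_type) {
    std::set<std::string> out;
    collect(static_type, out);
    return out;
}

// The example from the dump: B and C both derive from A and override foo.
inline std::set<std::string> demo_targets_of_A() {
    TypeNode b{"B", "B::foo", {}};
    TypeNode c{"C", "C::foo", {}};
    TypeNode a{"A", "A::foo", {&b, &c}};
    return possible_targets(a);
}
```

For a call through an A*, the sketch yields A::foo, B::foo and C::foo. The multiple-inheritance case is not modelled, because it needs the offset adjustment that the message leaves as an open question.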
Does this approach seem to make sense? Honza Index: Makefile.in === --- Makefile.in (revision 201654) +++ Makefile.in (working copy) @@ -1275,6 +1275,7 @@ init-regs.o \ internal-fn.o \ ipa-cp.o \ + ipa-devirt.o \ ipa-split.o \ ipa-inline.o \ ipa-inline-analysis.o \ @@ -2945,6 +2946,10 @@ $(TREE_PASS_H) $(GIMPLE_H) $(TARGET_H) $(GGC_H) pointer-set.h \ $(IPA_UTILS_H) tree-inline.h $(HASH_TABLE_H) profile.h $(PARAMS_H) \ $(LTO_STREAMER_H) $(DATA_STREAMER_H) +ipa-devirt.o : ipa-devirt.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(CGRAPH_H) \ + $(TREE_PASS_H) $(GIMPLE_H) $(TARGET_H) $(GGC_H) pointer-set.h \ + $(IPA_UTILS_H) tree-inline.h $(HASH_TABLE_H) profile.h $(PARAMS_H) \ + $(LTO_STREAMER_H) $(DATA_STREAMER_H) ipa-prop.o : ipa-prop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ langhooks.h $(GGC_H) $(TARGET_H) $(CGRAPH_H) $(IPA_PROP_H) $(DIAGNOSTIC_H) \ $(TREE_FLOW_H) $(TM_H) $(TREE_PASS_H) $(FLAGS_H) $(TREE_H) \ Index: ipa-devirt.c === --- ipa-devirt.c(revision 0) +++ ipa-devirt.c(working copy) @@ -0,0 +1,267 @@ +/* Basic IPA optimizations and utilities. + Copyright (C) 2003-2013 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +http://www.gnu.org/licenses/. 
*/
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "cgraph.h"
+#include "tree-pass.h"
+#include "gimple.h"
+#include "ggc.h"
+#include "flags.h"
+#include "pointer-set.h"
+#include "target.h"
+#include "tree-iterator.h"
+#include "pointer-set.h"
+#include "hash-table.h"
+#include "params.h"
+#include "tree-pretty-print.h"
+
+struct odr_type_d
+{
+  int id;
+  vec<tree> types;
+  pointer_set_t *types_set;
+  vec<struct odr_type_d *> bases;
+  vec<struct odr_type_d *> derived_types;
+  vec<struct cgraph_node *> methods;
+};
+
+typedef odr_type_d *odr_type;
+
+/* One Definition Rule hashtable helpers.  */
+
+struct odr_hasher
+{
+  typedef odr_type_d value_type;
+  typedef odr_type_d compare_type;
+  static inline hashval_t hash (const value_type *);
+  static inline bool equal (const value_type *, const compare_type *);
+  static inline void remove (value_type *);
+};
+
+/* Return the computed hashcode for ODR_TYPE.  */
+
+inline hashval_t
+odr_hasher::hash (const
Re: RFA: AVR: Support building AVR Linux targets
2013/8/12 Nick Clifton ni...@redhat.com: Hi Dennis, Hi Anatoly, Hi Eric, I have run into a small problem building GCC for an AVR Linux target - glibc-c.o is not being built. It turns out that the section handling avr-*-* in the config.gcc file is redefining tmake_file without allowing for the fact that t-glibc has already been added to it. The patch below is the obvious fix for this problem, but I have not committed it because it occurred to me that there might be some AVR specific reason for not including t-glibc. I can't remember such reasons. So - is the patch OK, or is there some other way of fixing the problem ? Cheers Nick gcc/ChangeLog 2013-08-12 Nick Clifton ni...@redhat.com * config.gcc (avr-*-*): Allow for tmake_file not being empty. Index: gcc/config.gcc Please Apply. Denis.
Re: [PATCH, PR 57748] Set mode of structures with zero sized arrays to be BLK
Hi,

Ping. Any news of the following patch being included into the trunk?

Thanks,
david

On Aug 2, 2013, at 1:45 PM, Martin Jambor wrote:

> Hi,
>
> while compute_record_mode in stor-layout.c makes sure it assigns BLK
> mode to structs with flexible arrays, it has no such provisions for
> zero length arrays
> (http://gcc.gnu.org/onlinedocs/gcc-4.8.1/gcc/Zero-Length.html). I
> think that in order to avoid problems and surprises like PR 57748
> (where this triggered code that was intended for small structures that
> fit into a scalar mode and ICEd), we should assign both variable array
> possibilities the same mode.
>
> Bootstrapped and tested on x86_64-linux without any problems. OK for
> trunk and the 4.8 branch? (I'm not sure about the 4.7, this PR does
> not happen there despite the wrong mode so I'd ignore it for now.)
>
> Thanks,
>
> Martin
>
>
> 2013-08-01  Martin Jambor  mjam...@suse.cz
>
>         PR middle-end/57748
>         * stor-layout.c (compute_record_mode): Treat zero-sized array fields
>         like incomplete types.
>
> testsuite/
>         * gcc.dg/torture/pr57748.c: New test.
>
>
> *** /tmp/lV6Ba8_stor-layout.c   Thu Aug  1 16:28:25 2013
> --- gcc/stor-layout.c   Thu Aug  1 15:36:18 2013
> *************** compute_record_mode (tree type)
> *** 1604,1610 ****
>                      && integer_zerop (TYPE_SIZE (TREE_TYPE (field)))))
>             || ! host_integerp (bit_position (field), 1)
>             || DECL_SIZE (field) == 0
> !           || ! host_integerp (DECL_SIZE (field), 1))
>           return;
>
>         /* If this field is the whole struct, remember its mode so
> --- 1604,1612 ----
>                      && integer_zerop (TYPE_SIZE (TREE_TYPE (field)))))
>             || ! host_integerp (bit_position (field), 1)
>             || DECL_SIZE (field) == 0
> !           || ! host_integerp (DECL_SIZE (field), 1)
> !           || (TREE_CODE (TREE_TYPE (field)) == ARRAY_TYPE
> !               && tree_low_cst (DECL_SIZE (field), 1) == 0))
>           return;
>
>         /* If this field is the whole struct, remember its mode so
> *** /dev/null   Tue Jun  4 12:34:56 2013
> --- gcc/testsuite/gcc.dg/torture/pr57748.c      Thu Aug  1 15:42:14 2013
> *************** ***************
> *** 0 ****
> --- 1,45 ----
> + /* PR middle-end/57748 */
> + /* { dg-do run } */
> +
> + #include <stdlib.h>
> +
> + extern void abort (void);
> +
> + typedef long long V
> +   __attribute__ ((vector_size (2 * sizeof (long long)), may_alias));
> +
> + typedef struct S { V a; V b[0]; } P __attribute__((aligned (1)));
> +
> + struct __attribute__((packed)) T { char c; P s; };
> +
> + void __attribute__((noinline, noclone))
> + check (struct T *t)
> + {
> +   if (t->s.b[0][0] != 3 || t->s.b[0][1] != 4)
> +     abort ();
> + }
> +
> + int __attribute__((noinline, noclone))
> + get_i (void)
> + {
> +   return 0;
> + }
> +
> + void __attribute__((noinline, noclone))
> + foo (P *p)
> + {
> +   V a = { 3, 4 };
> +   int i = get_i();
> +   p->b[i] = a;
> + }
> +
> + int
> + main ()
> + {
> +   struct T *t = (struct T *) malloc (128);
> +
> +   foo (&t->s);
> +   check (t);
> +
> +   return 0;
> + }
Re: Fwd: [x86, PATCH] More effecient code for short unsigned conversion to float-point.
On 12 Aug 16:12, Yuri Rumyantsev wrote: Hello, part of the thread was accidentally removed from gcc-patches. I've committed Yuri's patch into ML: http://gcc.gnu.org/ml/gcc-cvs/2013-08/msg00272.html Since the discussion took place off the ML, feel free to object. Thanks, K -- Forwarded message -- From: Uros Bizjak ubiz...@gmail.com Date: 2013/8/7 Subject: Re: [x86, PATCH] More effecient code for short unsigned conversion to float-point. To: Yuri Rumyantsev ysrum...@gmail.com Ah, OK, I see where I did a thinko. The patch looks OK, then. Uros.
[AArch64] Fix name of macros called in the vdup_lane Neon intrinsics
Ugh.

Typos in arm_neon.h macro names mean that scalar intrinsics end up calling macros which don't exist. So wherever I have written vget_laneq I should have written vgetq_lane.

This gets fixed by:

http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00535.html

which I was testing at the same time. But, yuck, that shouldn't have happened.

Tested on aarch64-none-elf with no regressions.

OK?

Thanks,
James

---
gcc/

    * config/aarch64/arm_neon.h (vdup<bhsd>_lane_<su><8,16,32,64>): Fix
    macro call.

diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 73a5400..4a480fb 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -19780,49 +19780,49 @@ vcvtpq_u64_f64 (float64x2_t __a)
 __extension__ static __inline int8x1_t __attribute__ ((__always_inline__))
 vdupb_lane_s8 (int8x16_t a, int const b)
 {
-  return __aarch64_vget_laneq_s8 (a, b);
+  return __aarch64_vgetq_lane_s8 (a, b);
 }

 __extension__ static __inline uint8x1_t __attribute__ ((__always_inline__))
 vdupb_lane_u8 (uint8x16_t a, int const b)
 {
-  return __aarch64_vget_laneq_u8 (a, b);
+  return __aarch64_vgetq_lane_u8 (a, b);
 }

 __extension__ static __inline int16x1_t __attribute__ ((__always_inline__))
 vduph_lane_s16 (int16x8_t a, int const b)
 {
-  return __aarch64_vget_laneq_s16 (a, b);
+  return __aarch64_vgetq_lane_s16 (a, b);
 }

 __extension__ static __inline uint16x1_t __attribute__ ((__always_inline__))
 vduph_lane_u16 (uint16x8_t a, int const b)
 {
-  return __aarch64_vget_laneq_u16 (a, b);
+  return __aarch64_vgetq_lane_u16 (a, b);
 }

 __extension__ static __inline int32x1_t __attribute__ ((__always_inline__))
 vdups_lane_s32 (int32x4_t a, int const b)
 {
-  return __aarch64_vget_laneq_s32 (a, b);
+  return __aarch64_vgetq_lane_s32 (a, b);
 }

 __extension__ static __inline uint32x1_t __attribute__ ((__always_inline__))
 vdups_lane_u32 (uint32x4_t a, int const b)
 {
-  return __aarch64_vget_laneq_u32 (a, b);
+  return __aarch64_vgetq_lane_u32 (a, b);
 }

 __extension__ static __inline int64x1_t __attribute__ ((__always_inline__))
 vdupd_lane_s64 (int64x2_t a, int const b)
 {
-  return __aarch64_vget_laneq_s64 (a, b);
+  return __aarch64_vgetq_lane_s64 (a, b);
 }

 __extension__ static __inline uint64x1_t __attribute__ ((__always_inline__))
 vdupd_lane_u64 (uint64x2_t a, int const b)
 {
-  return __aarch64_vget_laneq_s64 (a, b);
+  return __aarch64_vgetq_lane_u64 (a, b);
 }

 /* vld1 */
Re: [C++ PATCH] Grammar fix in pt.c comments.
Thank you for this patch, Adam.

Adam Butcher a...@jessamine.co.uk wrote:

* pt.c: Grammar fix in comments (it's to its).

FWIW, this change seems to fall under the obvious rule and thus ought to be committed.

---
 gcc/cp/pt.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index ce899ef..78b7a97 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -1986,7 +1986,7 @@ determine_specialization (tree template_id,
          tree decl_arg_types;

          /* This is an ordinary member function.  However, since
-            we're here, we can assume it's enclosing class is a
+            we're here, we can assume its enclosing class is a
             template class.  For example,

               template <typename T> struct S { void f(); };
@@ -4337,7 +4337,7 @@ check_default_tmpl_args (tree decl, tree parms, bool is_primary,
          || DECL_INITIALIZED_IN_CLASS_P (decl)))
     /* We already checked these parameters when the template was
        declared, so there's no need to do it again now.  This function
-       was defined in class scope, but we're processing it's body now
+       was defined in class scope, but we're processing its body now
        that the class is complete.  */
     return true;

@@ -7555,7 +7555,7 @@ lookup_template_class_1 (tree d1, tree arglist, tree in_decl, tree context,
             the one of #0.

             When we encounter #1, we want to store the partial instantiation
-            of M (template<class T> S<int>::M<T>) in it's CLASSTYPE_TI_TEMPLATE.
+            of M (template<class T> S<int>::M<T>) in its CLASSTYPE_TI_TEMPLATE.

             For all cases other than this explicit specialization of member of a
             class template, we just want to store the most general template into

--
Dodji
Re: [RFC Patch, Aarch64] : Macros for profile code generation to enable gprof support
Marcus,

On 9 August 2013 18:17, Marcus Shawcroft marcus.shawcr...@arm.com wrote:

On 03/08/13 19:01, Venkataramanan Kumar wrote:

2013-08-02 Venkataramanan Kumar venkataramanan.ku...@linaro.org

    * config/aarch64/aarch64.h (MCOUNT_NAME): Define.
    (NO_PROFILE_COUNTERS): Likewise.
    (PROFILE_HOOK): Likewise.
    (FUNCTION_PROFILER): Likewise.
    * config/aarch64/aarch64.c (aarch64_function_profiler): Remove.

regards,
Venkat.

Hi Venkat,

Looking at the various other ports, it looks like the majority choose to use FUNCTION_PROFILER_HOOK rather than PROFILE_HOOK.

Using PROFILE_HOOK to inject a regular call to _mcount() means that all arguments passed in registers in every function will be spilled and reloaded because the _mcount call will kill the caller save registers.

Using the FUNCTION_PROFILER_HOOK and taking care not to kill the caller save registers would be less invasive. The LR argument to _mcount would need to be passed in a temporary register, say x9, and _mcount would also need to ensure caller save registers are saved and restored.

The latter seems to be a better option to me. Is there a compelling reason to choose PROFILE_HOOK over FUNCTION_PROFILER_HOOK?

(I think you mean FUNCTION_PROFILER rather than FUNCTION_PROFILER_HOOK in all the above.)

Using either PROFILE_HOOK or FUNCTION_PROFILER results in a call chain that looks like the following (assuming the C Library is glibc):

Function -> _mcount -> _mcount_internal.

Here _mcount_internal is the C function that does the real work and is provided in glibc. Importantly, this means that _mcount_internal follows the normal ABI, so we have to save the caller saved registers somewhere.

Using FUNCTION_PROFILER requires us to write assembler which saves and restores all caller saved registers every time it is called, and requires (as you say) a special ABI. This means _mcount ends up being a piece of assembly that saves all caller-saved registers (i.e. parameter-passing temporary registers) and then makes the call to _mcount_internal before restoring everything on _mcount's return.

Using PROFILE_HOOK will cause the compiler to do all the heavy lifting, and it will do the minimum required (for example, for a function with one parameter it will only save and restore x0). _mcount in this case can be a simple function that sets up some parameters and calls _mcount_internal (or _mcount could even just alias _mcount_internal).

As to which of PROFILE_HOOK or FUNCTION_PROFILER is the right way (TM): I don't know, the documentation isn't very clear at all.

PROFILE_HOOK was introduced to support profiling for AIX 4.3. http://gcc.gnu.org/ml/gcc-patches/2000-12/msg00580.html is the initial patch, with a reworked patch here: http://gcc.gnu.org/ml/gcc-patches/2001-02/msg00112.html. The final commit happened on 2001-02-05. The patch was introduced because it was impossible to make FUNCTION_PROFILER work for AIX 4.3 and so a new hook that worked earlier in the compiler was needed. There doesn't seem to have been a discussion about preferring one form over the other.

In conclusion, I prefer the PROFILE_HOOK method because it makes the compiler do all the work, and results in less impact on stack usage and performance. FUNCTION_PROFILER may impact the code generated by the compiler less and produce a smaller overall image, but I'm not sure that's more beneficial.

Thanks,

Matt

--
Matthew Gretton-Dann
Linaro Toolchain Working Group
matthew.gretton-d...@linaro.org
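To make the PROFILE_HOOK variant a little more concrete, the following is a rough sketch of the "simple function that sets up some parameters" shape described above. Every name here (toy_mcount, toy_mcount_internal, the arc table) is a hypothetical stand-in and not glibc's or GCC's actual code; __builtin_return_address is a GCC builtin, so GCC or a compatible compiler is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_ARCS 16

/* One caller -> callee arc and how often it was taken.  */
struct toy_arc
{
  uintptr_t frompc;
  uintptr_t selfpc;
  unsigned long count;
};

static struct toy_arc toy_arcs[TOY_MAX_ARCS];
static int toy_arc_count;

/* Hypothetical stand-in for the C routine that does the real work.
   It is ordinary C, so it follows the normal ABI and may clobber
   caller-saved registers.  */
static void
toy_mcount_internal (uintptr_t frompc, uintptr_t selfpc)
{
  for (int i = 0; i < toy_arc_count; i++)
    if (toy_arcs[i].frompc == frompc && toy_arcs[i].selfpc == selfpc)
      {
        toy_arcs[i].count++;
        return;
      }

  /* New arc: record it if there is room, otherwise drop it.  */
  if (toy_arc_count < TOY_MAX_ARCS)
    {
      toy_arcs[toy_arc_count].frompc = frompc;
      toy_arcs[toy_arc_count].selfpc = selfpc;
      toy_arcs[toy_arc_count].count = 1;
      toy_arc_count++;
    }
}

/* Hypothetical entry point in the PROFILE_HOOK style: a regular call
   that receives the instrumented function's return address as its
   argument and uses this call's own return address to identify the
   instrumented function.  */
void
toy_mcount (void *frompc)
{
  toy_mcount_internal ((uintptr_t) frompc,
                       (uintptr_t) __builtin_return_address (0));
}
```

Because toy_mcount is a plain C function, the compiler inserting a call to it is responsible for preserving any live argument registers around the call, which is the trade-off discussed in the message above.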
Re: [PATCH] x86-64 gcc generate wrong assembly instruction movabs for intel syntax
On Mon, Aug 12, 2013 at 5:26 PM, Uros Bizjak ubiz...@gmail.com wrote:

On Mon, Aug 12, 2013 at 11:24 AM, Perez Read netfirew...@gmail.com wrote:

movabs is incorrectly translated into mov [rax], -1, which causes the compile error "Error: ambiguous operand size for `mov'". It should be mov QWORD PTR [rax], -1.

Bootstrap passed. Regression tested on x86_64-unknown-linux-gnu (pc).

2013-08-10 Perez Read netfirew...@gmail.com

    * config/i386/i386.md (*movabs<mode>_1): Add <ptrsize> PTR before operand 0 for intel asm alternative.
    * testsuite/gcc.target/i386/movabs-1.c: New test.

You should mention PR number in the ChangeLog. Looks OK, but I think that for consistency this decoration should also be added to *movabs<mode>_2 pattern.

Uros.

Hello, After the test, I think we can skip this pattern. Because operand 0 must be a register, the assembler will determine the size automatically.

As said, I don't want two similar patterns with a different asm template in i386.md. So, if decorating movabs<mode>_2 works OK, I propose to change both patterns with your change.

Uros.

Sorry for forgetting to Cc the mailing list. Here are the new patch and ChangeLog. <ptrsize> PTR is now added to both patterns.

Bootstrap passed. Regression tested on x86_64-unknown-linux-gnu (pc).

2013-08-10 Perez Read netfirew...@gmail.com

    PR target/58132
    * config/i386/i386.md (*movabs<mode>_1): Add <ptrsize> PTR before operand 0 for intel asm alternative.
    * testsuite/gcc.target/i386/movabs-1.c: New test.

2013-08-12 Perez Read netfirew...@gmail.com

    PR target/58132
    * config/i386/i386.md (*movabs<mode>_2): Add <ptrsize> PTR before operand 1 for intel asm alternative.

Thanks,
Perez

movabs.patch Description: Binary data
Re: [PATCH] Convert more passes to new dump framework
On Tue, Aug 6, 2013 at 10:23 PM, Teresa Johnson tejohn...@google.com wrote:

On Tue, Aug 6, 2013 at 9:29 AM, Teresa Johnson tejohn...@google.com wrote:

On Tue, Aug 6, 2013 at 9:01 AM, Martin Jambor mjam...@suse.cz wrote:

Hi,

On Tue, Aug 06, 2013 at 07:14:42AM -0700, Teresa Johnson wrote:

On Tue, Aug 6, 2013 at 5:37 AM, Martin Jambor mjam...@suse.cz wrote:

On Mon, Aug 05, 2013 at 10:37:00PM -0700, Teresa Johnson wrote:

This patch ports messages to the new dump framework,

It would be great if this new framework was documented somewhere. I lost track of what was agreed it would be and from the uses in the vectorizer I was never quite sure how to utilize it in other passes.

Cc'ing Sharad who implemented this - Sharad, is this documented on a wiki or elsewhere?

Thanks

I'd also like to point out two other minor things inline:

[...]

2013-08-06 Teresa Johnson tejohn...@google.com
           Dehao Chen de...@google.com

    * dumpfile.c (dump_loc): Add column number to output, make newlines consistent.
    * dumpfile.h (OPTGROUP_OTHER): Add and enable under OPTGROUP_ALL.
    * ipa-inline-transform.c (clone_inlined_nodes):
    (cgraph_node_opt_info): New function.
    (cgraph_node_call_chain): Ditto.
    (dump_inline_decision): Ditto.
    (inline_call): Invoke dump_inline_decision.
    * doc/invoke.texi: Document optall -fopt-info flag.
    * profile.c (read_profile_edge_counts): Use new dump framework.
    (compute_branch_probabilities): Ditto.
    * passes.c (pass_manager::register_one_dump_file): Use OPTGROUP_OTHER when pass not in any opt group.
    * value-prof.c (check_counter): Use new dump framework.
    (find_func_by_funcdef_no): Ditto.
    (check_ic_target): Ditto.
    * coverage.c (get_coverage_counts): Ditto.
    (coverage_init): Setup new dump framework.
    * ipa-inline.c (inline_small_functions): Set is_in_ipa_inline.
    * ipa-inline.h (is_in_ipa_inline): Declare.
    * testsuite/gcc.dg/pr40209.c: Use -fopt-info.
    * testsuite/gcc.dg/pr26570.c: Ditto.
    * testsuite/gcc.dg/pr32773.c: Ditto.
    * testsuite/g++.dg/tree-ssa/dom-invalid.C (struct C): Ditto.
[...]

Index: ipa-inline-transform.c
===================================================================
--- ipa-inline-transform.c (revision 201461)
+++ ipa-inline-transform.c (working copy)
@@ -192,6 +192,108 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
 }

+#define MAX_INT_LENGTH 20
+
+/* Return NODE's name and profile count, if available.  */
+
+static const char *
+cgraph_node_opt_info (struct cgraph_node *node)
+{
+  char *buf;
+  size_t buf_size;
+  const char *bfd_name = lang_hooks.dwarf_name (node->symbol.decl, 0);
+
+  if (!bfd_name)
+    bfd_name = "unknown";
+
+  buf_size = strlen (bfd_name) + 1;
+  if (profile_info)
+    buf_size += (MAX_INT_LENGTH + 3);
+
+  buf = (char *) xmalloc (buf_size);
+
+  strcpy (buf, bfd_name);
+
+  if (profile_info)
+    sprintf (buf, "%s ("HOST_WIDEST_INT_PRINT_DEC")", buf, node->count);
+  return buf;
+}

I'm not sure if output of this function is aimed only at the user or if it is supposed to be used by gcc developers as well. If the latter, an incredibly useful thing is to also dump node->symbol.order too. We usually dump it after / sign separating it from node name. It is invaluable when examining decisions in C++ code where you can have lots of clones of a node (and also because existing dumps print it, it is easy to combine them).

The output is useful for both power users doing performance tuning of their application, and by gcc developers. Adding the id is not so useful for the former, but I agree that it is very useful for compiler developers. In fact, in the google branch version we emit more verbose information (the lipo module id and the funcdef_no) to help uniquely identify the routines and to aid in post-processing by humans and tools. So it is probably useful to add something similar here too. Is the node->symbol.order more or less unique than the funcdef_no? I see that you added a patch a few months ago to print the node->symbol.order in the function header, and it also has the advantage as you note of matching up with existing ipa dumps.
node->symbol.order is unique and if I remember correctly, it is not even recycled. Clones, inline clones, thunks, every symbol table node gets its own symbol order so it should be more unique than funcdef_no. On the other hand it may be a bit cryptic for users but at the same time it is only one number.

Ok, I am going to go ahead and add this to the output.

[...]

Index: ipa-inline.c
===================================================================
--- ipa-inline.c (revision 201461)
+++ ipa-inline.c
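One detail worth noting in the quoted cgraph_node_opt_info: it passes buf as both the destination and a %s argument of sprintf, and the C standard leaves copying between overlapping objects in sprintf undefined. Below is a minimal stand-alone sketch of the same "name (count)" formatting done without that overlap. It is not GCC code: format_node_info is a hypothetical helper, and plain malloc and long long stand in for xmalloc and the profile count type.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated "NAME (COUNT)" string, or just "NAME"
   when HAVE_COUNT is zero.  A NULL NAME is printed as "unknown".
   The caller frees the result.  Returns NULL if allocation fails.  */
static char *
format_node_info (const char *name, int have_count, long long count)
{
  if (name == NULL)
    name = "unknown";

  if (!have_count)
    {
      size_t len = strlen (name) + 1;
      char *copy = malloc (len);
      if (copy != NULL)
        memcpy (copy, name, len);
      return copy;
    }

  /* The first call only measures the output; the second writes it.
     The source string and the destination never overlap.  */
  int needed = snprintf (NULL, 0, "%s (%lld)", name, count);
  if (needed < 0)
    return NULL;

  char *buf = malloc ((size_t) needed + 1);
  if (buf != NULL)
    snprintf (buf, (size_t) needed + 1, "%s (%lld)", name, count);
  return buf;
}
```

Measuring with snprintf first also removes the need for a fixed MAX_INT_LENGTH style bound, at the cost of formatting twice.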
[PATCH] Possible fix for PR57717 (PowerPC E500v2)
Hi,

At present, mainline fails to build a PowerPC E500v2 cross-compiler for me because of the bug described in PR57717. The attached patch is a possible fix for that, although I have been struggling to obtain good evidence that it is correct due to lack of a working current baseline.

Without the patch, the partially-built compiler ICEs during a cross-build trying to reload a TImode load instruction: I think this is because the RTL generated by the clause modified by the attached patch in rs6000_legitimize_reload_address is not valid for TARGET_E500_DOUBLE. Simply disallowing all greater-than UNITS_PER_WORD-sized modes seems to suffice to fix this.

I have tested on current mainline with the candidate patch in http://gcc.gnu.org/bugzilla//show_bug.cgi?id=57717#c3 and compared the results with my patch: this gives the same results. I configured with:

[...] --enable-e500_double --with-long-double-128 --with-cpu=8548 --disable-decimal-float --disable-libvtv

with a target of powerpc-linux-gnuspe (this is with our internal build tools, which unfortunately I can't share), and tested on real hardware. (The last two options given are just working around build errors.)

The other test cases in PR57717 appear to work correctly with my patch too.

Unfortunately results show significant degradation relative to r189800 (before the patch identified in PR57717 was applied), though I believe this to be due to a cause other than my patch (there seems to be some kind of stack corruption in execute tests -- I've not yet tracked this down).

Also -- possibly related -- I had to add a hack to rs6000_dwarf_register_span to get through the build, i.e.:

@@ -28940,6 +28940,9 @@ rs6000_dwarf_register_span (rtx reg)
   unsigned regno = REGNO (reg);
   enum machine_mode mode = GET_MODE (reg);

+  /* FIXME: This function causes an ICE when emitting Dwarf.  */
+  return NULL_RTX;
+
   if (TARGET_SPE
       && regno < 32
       && (SPE_VECTOR_MODE (GET_MODE (reg))

I am not proposing that particular patch for committing, of course.

OK to commit, or any comments? If anyone's in a position to do some further testing on the patch, I'd be grateful for that!

Thanks,

Julian

ChangeLog

gcc/
    * config/rs6000/rs6000.c (rs6000_legitimize_reload_address): Don't
    perform invalid legitimization on greater-than-word-size modes for
    TARGET_E500_DOUBLE.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 201609)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -6930,9 +6930,7 @@ rs6000_legitimize_reload_address (rtx x,
       && GET_CODE (XEXP (x, 1)) == CONST_INT
       && reg_offset_p
       && !SPE_VECTOR_MODE (mode)
-      && !(TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode
-                                  || mode == DDmode || mode == TDmode
-                                  || mode == DImode))
+      && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (mode) > UNITS_PER_WORD)
       && (!VECTOR_MODE_P (mode) || VECTOR_MEM_NONE_P (mode)))
     {
       HOST_WIDE_INT val = INTVAL (XEXP (x, 1));
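The effect of the one-line change can be illustrated with a toy model. The enum, the size table and both predicate functions below are invented for illustration; only the mode names and the comparison against the word size come from the patch. The byte sizes are the usual ones for these modes, and the word size of 4 assumes a 32-bit E500 target.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a handful of GCC machine modes.  */
enum toy_mode
{
  TOY_SImode, TOY_SFmode,
  TOY_DImode, TOY_DFmode, TOY_DDmode,
  TOY_TImode, TOY_TFmode, TOY_TDmode,
  TOY_NUM_MODES
};

/* Size in bytes of each toy mode.  */
static const unsigned toy_mode_size[TOY_NUM_MODES] = {
  [TOY_SImode] = 4,  [TOY_SFmode] = 4,
  [TOY_DImode] = 8,  [TOY_DFmode] = 8,  [TOY_DDmode] = 8,
  [TOY_TImode] = 16, [TOY_TFmode] = 16, [TOY_TDmode] = 16,
};

/* Assumed word size in bytes for a 32-bit target.  */
#define TOY_UNITS_PER_WORD 4

/* Old test: an explicit list of modes, which leaves out TImode.  */
static int
old_excluded (enum toy_mode mode)
{
  return (mode == TOY_DFmode || mode == TOY_TFmode
          || mode == TOY_DDmode || mode == TOY_TDmode
          || mode == TOY_DImode);
}

/* New test: anything wider than one word.  */
static int
new_excluded (enum toy_mode mode)
{
  return toy_mode_size[mode] > TOY_UNITS_PER_WORD;
}
```

In this model every mode rejected by the old list is also rejected by the size test, and TImode, the mode involved in the reported ICE, is rejected only by the size test.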
Re: [PATCH] x86-64 gcc generate wrong assembly instruction movabs for intel syntax
On Mon, Aug 12, 2013 at 9:51 PM, Uros Bizjak ubiz...@gmail.com wrote:

On Mon, Aug 12, 2013 at 3:39 PM, Perez Read netfirew...@gmail.com wrote:

movabs is incorrectly translated into mov [rax], -1, which causes the compile error "Error: ambiguous operand size for `mov'". It should be mov QWORD PTR [rax], -1.

Bootstrap passed. Regression tested on x86_64-unknown-linux-gnu (pc).

2013-08-10 Perez Read netfirew...@gmail.com

    * config/i386/i386.md (*movabs<mode>_1): Add <ptrsize> PTR before operand 0 for intel asm alternative.
    * testsuite/gcc.target/i386/movabs-1.c: New test.

You should mention PR number in the ChangeLog. Looks OK, but I think that for consistency this decoration should also be added to *movabs<mode>_2 pattern.

Uros.

Hello, After the test, I think we can skip this pattern. Because operand 0 must be a register, the assembler will determine the size automatically.

As said, I don't want two similar patterns with a different asm template in i386.md. So, if decorating movabs<mode>_2 works OK, I propose to change both patterns with your change.

Uros.

Here are the new patch and ChangeLog. <ptrsize> PTR is now added to both patterns.

Bootstrap passed. Regression tested on x86_64-unknown-linux-gnu (pc).

2013-08-10 Perez Read netfirew...@gmail.com

    PR target/58132
    * config/i386/i386.md (*movabs<mode>_1): Add <ptrsize> PTR before operand 0 for intel asm alternative.
    * testsuite/gcc.target/i386/movabs-1.c: New test.

2013-08-12 Perez Read netfirew...@gmail.com

    PR target/58132
    * config/i386/i386.md (*movabs<mode>_2): Add <ptrsize> PTR before operand 1 for intel asm alternative.

Just merge these two ChangeLog entries. OK with this change.

BTW: Do you have SVN commit access? Otherwise, I will take care to merge your change.

Uros.

OK, and I added a space before the second <ptrsize> PTR, which corrects the coding style. I don't have SVN commit access, so thanks for helping me.

2013-08-12 Perez Read netfirew...@gmail.com

    PR target/58132
    * config/i386/i386.md (*movabs<mode>_1): Add <ptrsize> PTR before operand 0 for intel asm alternative.
    (*movabs<mode>_2): Ditto for operand 1.
    * testsuite/gcc.target/i386/movabs-1.c: New test.

Thanks,
Perez

movabs.patch Description: Binary data
Commit: M32R: Fix config problem building m32r-linux toolchains
Hi Guys,

I am applying the patch below to fix a small problem building m32r-linux toolchains - the glibc-c.o object file was not being built because the definition of tmake_file in the M32R section of config.gcc was not allowing for the inclusion of t-glibc.

Cheers
Nick

gcc/ChangeLog
2013-08-12  Nick Clifton  ni...@redhat.com

    * config.gcc (m32r-linux): Allow for tmake_file not being empty.

Index: gcc/config.gcc
===================================================================
--- gcc/config.gcc (revision 201658)
+++ gcc/config.gcc (working copy)
@@ -1705,8 +1705,7 @@
     ;;
 m32r-*-linux*)
     tm_file="dbxelf.h elfos.h gnu-user.h linux.h glibc-stdint.h ${tm_file} m32r/linux.h"
-    # We override the tmake_file for linux -- why?
-    tmake_file="m32r/t-linux t-slibgcc"
+    tmake_file="${tmake_file} m32r/t-linux t-slibgcc"
     gnu_ld=yes
     if test x$enable_threads = xyes; then
         thread_file='posix'
@@ -1714,8 +1713,7 @@
     ;;
 m32rle-*-linux*)
     tm_file="dbxelf.h elfos.h gnu-user.h linux.h glibc-stdint.h m32r/little.h ${tm_file} m32r/linux.h"
-    # We override the tmake_file for linux -- why?
-    tmake_file="m32r/t-linux t-slibgcc"
+    tmake_file="${tmake_file} m32r/t-linux t-slibgcc"
     gnu_ld=yes
     if test x$enable_threads = xyes; then
         thread_file='posix'
Re: [RFC] Bare bones of virtual call tracking
On 08/12/2013 08:16 AM, Jan Hubicka wrote:

With multiple inheritance I need to adjust offsets.

It's not clear to me that you need to worry about that in your search. A call through a particular vptr can only call overrides that go into a vtable that vptr can point to, and you can look up any thunk adjustments from the vtable.

+  /* First skip wrappers that C++ FE puts randomly into types.  */
+  while (TREE_CODE (t) == TYPE_DECL
+         && DECL_ORIGINAL_TYPE (t))

How can you get a decl in your types array?

Jason