Re: shrink-wrap leads to ICE at dwarf2cfi.c
On 11 September 2012 03:37, Richard Henderson r...@redhat.com wrote:
> On 09/10/2012 01:41 AM, Zhenqiang Chen wrote:
>> In function maybe_record_trace_start, there is a check:
>>
>>   /* We ought to have the same state incoming to a given trace no
>>      matter how we arrive at the trace.  Anything else means we've
>>      got some kind of optimization error.  */
>>   gcc_checking_assert (cfi_row_equal_p (cur_row, ti->beg_row));
>
> The assert is most definitely valid. The check makes certain that the unwind info as seen by any two paths leading to a common location is the same.
>
> When this fails, typically one of two things has happened:
> (1) The notes for the epilogue unwind info are incorrect,
> (2) We have applied an invalid code transformation in some earlier optimization pass.
>
> We can't tell what the real problem is without more information.

Thank you for the comments. I will do more checks on the transformation of shrink-wrap.

-Zhenqiang
Re: gcc-c-api
On Tue, Sep 11, 2012 at 3:10 AM, David Malcolm dmalc...@redhat.com wrote: On Mon, 2012-09-10 at 17:20 +0200, Michael Matz wrote: Hi David, On Mon, 10 Sep 2012, David Malcolm wrote: Is it possible for you to post your work-in-progress code somewhere? Attached. Many thanks for posting this! Various comments inline below. I know that you don't feel it's ready for committing, but I would find it helpful - I'm interested in understanding the general approach, rather than seeing completeness or perfection. Some sort of brain dump follows: The idea is as follows: as first cut an introspection API that is tied to compiler IR concepts rather than GCC specifics. As such it should be implementable also for other compilers, at least the trivial things that every traditional compiler will have. So, we have functions, basic blocks, instructions, operands and operators. Nothing of that should relate to tree or gimple or RTL. I see. So there's a terminology issue here: we shouldn't refer to gimple or rtl, we should refer to instructions or statements. [Possibly crazy idea: should the API actually refer to itself as GCC? (with gcc_ prefixes etc) If it's implementable by other compilers, would another prefix be suitable? I don't think this is a good idea, but it makes for an interesting thought experiment] I think the API shouldn't refer to GCC itself - in fact I was hoping that someone implemented the very same API for LLVM or Open64. At least introspection should be compiler agnostic (in my tiny ideal world ;)). I also think that we can easily backport plugin API changes (read: additions, the API of course never changes) to active release branches, so plugins using this API should run against all released GCC versions (for additions the API needs a way to identify its version though) and other compilers without re-compiling the plugin itself. Take for instance the (included) dump-plugin. 
The goal would be, that depending on where you'd put that dumper in the pass pipeline it would work _unchanged_ on GENERIC, on GIMPLE and on RTL. That goal isn't reached yet, once because the internal iteration just isn't implemented for e.g. RTL instruction stream, and once because the operand iterator API isn't well suited to the tree-like nesting in GENERIC and RTL currently. Interesting idea. I prefer having a little more type-safety, but it's a pain to achieve it in C. If you look at my proposed API [1], there are dozens of tiny casting functions. I like that it's typesafe, but it's somewhat inelegant. I suppose one could wrap a more type-safe C++ interface around the C API (well, or simply wrap a nice python API around it ;)). [The intermediate goal was to redo the operand API to be tree-like at the base, and possibly write small wrappers to again expose the nicer interface that GIMPLE would provide (i.e. direct access to all read operands of an instruction).] Another thing I want is simplicity. E.g. only the bare minimum of types should be exposed. Note how the API itself for instance doesn't expose different types of collections, only a general Range which can enumerate all things, depending on how it's used (though the implementation has runtime checks for wrong usage). I seem to remember from earlier mailing list discussion there being a preference for explicit iterator objects, rather than for_each functions taking callbacks (my API uses the latter approach). Yes (I didn't look into Michas patch ... but I believe it uses iterators). FWIW I don't expose any iterators directly in my plugin, I simply generate a list of wrapper objects and return that. But I suppose others might. There are some questions to be solved, e.g. memory management for those objects that aren't directly tied to GCC objects, e.g. Ranges right now. I do have a strong feeling about the relation of e.g. 
plugin Instructions and GCC gimple/rtx, in the sense that plugin authors should _not_ be required to manage memory for those things (same for BBs, functions, operands). I'm not sure how to parse what you wrote above, so I'm not quite sure what your preference here is. I see that you have range-creation functions (e.g. gcc_stmts), which return a Range that's owned by the caller, together with a cleanup function (gcc_free_range) that must be called exactly once assuming the Range was successfully created. The other entities (e.g. Function) are in fact really just gcc structs internally (e.g. Function = struct Function_* = cast to (struct function *) ) and those are GC-managed. Yes, they are handlers that can be passed/returned by value and thus need no memory management. That they relate to the internal GC pointer is an implementation detail. Currently in my proposed API I assume that all objects are GC-managed, and that the user is required to register a callback hook to mark any objects that they're referring to. Does that seem like a sane
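The ownership rule discussed above (a range-creation call returns a Range owned by the caller, and a cleanup function must be called exactly once if creation succeeded) can be sketched in plain C. This is only an illustration under stated assumptions: `IntRange`, `int_range_new`, `int_range_next`, `int_range_free` and `sum_all` are hypothetical names working on a plain array of ints. They are not the `gcc_stmts` / `gcc_free_range` interface from the patch, whose exact signatures are not shown in this thread.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical caller-owned range over an int array (illustration only). */
typedef struct IntRange {
    const int *items;   /* borrowed: the range does not own the elements */
    size_t count;
    size_t pos;
} IntRange;

/* Create a range; returns NULL on allocation failure.  The caller owns
   the result and must release it with int_range_free exactly once. */
IntRange *int_range_new(const int *items, size_t count)
{
    IntRange *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    r->items = items;
    r->count = count;
    r->pos = 0;
    return r;
}

/* Store the next element in *out and return 1, or return 0 at the end. */
int int_range_next(IntRange *r, int *out)
{
    if (r->pos >= r->count)
        return 0;
    *out = r->items[r->pos++];
    return 1;
}

/* Release a range obtained from int_range_new. */
void int_range_free(IntRange *r)
{
    free(r);
}

/* Example caller: create, enumerate, and free the range exactly once. */
int sum_all(const int *items, size_t count)
{
    int total = 0, value;
    IntRange *r = int_range_new(items, count);
    if (r == NULL)
        return 0;   /* creation failed, so there is nothing to free */
    while (int_range_next(r, &value))
        total += value;
    int_range_free(r);
    return total;
}
```

The elements themselves are only borrowed, which mirrors the point that handles to functions, basic blocks and instructions need no memory management by the plugin author; only the range object has an owner.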
Libtool update for gcc-4.8 (slim-lto bootstrap)?
Is there any interest in updating the in-tree libtool to something newer? Such an update would make it possible to use a -fno-fat-lto-objects lto-bootstrap target, which should speed up the (LTO) build time.

If there is interest, when would be the best time for such an update?

Thanks.
--
Markus
Re: Libtool update for gcc-4.8 (slim-lto bootstrap)?
> Is there any interest in updating the in-tree libtool to something newer? Such an update would make it possible to use a -fno-fat-lto-objects lto-bootstrap target, which should speed up the (LTO) build time.
>
> If there is interest, when would be the best time for such an update?

There is definitely an interest. I still hope we will be able to switch to slim-lto by default in the foreseeable future...

Honza

> Thanks.
> --
> Markus
Re: C++'ization of cp/parser.c/h, limited C++ parsing support for gengtype, Remove dependency of cp/cp-lang.c on cp/parser.h
Hi,

On Mon, 10 Sep 2012, Gabriel Dos Reis wrote:
>> You could also do this with an explicit pointer-to-context-struct parameter that's passed around (and that version of the patch I posted), but the class-based approach seems nicer to me.
>
> Are we talking about encapsulation or looking nice? In either case, I respectfully disagree -- talking specifically about the patch posted. Having a giant struct with a zillion member functions defies any reasonable notion of looking nice, and of encapsulation.

Amen.

Ciao,
Michael.
GCC's Decimal Floating Point extension problem
Hi All,

I'm trying to write a small program to check the decimal floating point GCC extension, but I encountered some problems. The program just converts a _Decimal64 number to double in order to print it, and I used the function (double __bid_truncdddf (_Decimal64 a)) as shown in the GNU online docs:

    #include <stdio.h>

    int main ()
    {
      _Decimal64 d = 12.5DD;
      printf ("%lf\n", __bid_truncdddf (d));
      return 0;
    }

    $ gcc test.c -Wall -g
    test.c: In function ‘main’:
    test.c:23: warning: implicit declaration of function ‘__bid_truncdddf’
    test.c:23: warning: format ‘%lf’ expects type ‘double’, but argument 2 has type ‘int’
    $ ./a.out
    0.00

I don't know why the result is zero or why the second warning appears, although I wrote the function call properly!

I'm using gcc version 4.4.3 on Ubuntu 10.04.

Finally, I suffer from a lack of good docs about the DFP GCC extension. Does anyone know a good tutorial that explains the functions and gives some examples?

Best Regards,
M. Ahmed
Re: GCC's Decimal Floating Point extension problem
On Tue, Sep 11, 2012 at 11:31 AM, Mohamed Abou Samra my_abousa...@yahoo.com wrote:
> Hi All,
>
> I'm trying to write a small program to check the decimal floating point GCC extension, but I encountered some problems. The program just converts a _Decimal64 number to double in order to print it, and I used the function (double __bid_truncdddf (_Decimal64 a)) as shown in the GNU online docs:
>
>     #include <stdio.h>
>
>     int main ()
>     {
>       _Decimal64 d = 12.5DD;
>       printf ("%lf\n", __bid_truncdddf (d));
>       return 0;
>     }
>
>     $ gcc test.c -Wall -g
>     test.c: In function ‘main’:
>     test.c:23: warning: implicit declaration of function ‘__bid_truncdddf’
>     test.c:23: warning: format ‘%lf’ expects type ‘double’, but argument 2 has type ‘int’
>     $ ./a.out
>     0.00
>
> I don't know why the result is zero or why the second warning appears, although I wrote the function call properly!

__bid_truncdddf is a libgcc internal function. Don't ever use it in user programs. Just cast DFP to double.

--
H.J.
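The advice above (convert with a cast instead of naming the libgcc-internal routine) can be sketched as follows. This is a minimal illustration, assuming a GCC build and target with decimal floating point support, compiled as C; the helper names `decimal64_to_double` and `print_decimal64` are hypothetical and not part of any library. It also explains both warnings in the question: with no declaration in scope, the call was implicitly declared as returning int, so printf received an int where %lf expects a double.

```c
#include <stdio.h>

/* Hypothetical helper: convert a _Decimal64 to double with an ordinary
   cast.  The compiler emits whatever conversion routine it needs, so
   user code never has to name a libgcc-internal function. */
double decimal64_to_double(_Decimal64 d)
{
    return (double) d;
}

/* Hypothetical helper: print a _Decimal64 by way of its double value.
   %f is the printf conversion for a double argument. */
void print_decimal64(_Decimal64 d)
{
    printf("%f\n", decimal64_to_double(d));
}
```

Calling `print_decimal64(12.5DD)` from `main` should print 12.500000, since 12.5 is exactly representable in both formats. Values such as 0.1DD are not exactly representable in binary, so the cast rounds them.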
Re: Cgraph Modification Plan
We do not yet seem to have consensus on a long-term plan. Would it be reasonable to start on short-term preparatory work?

In particular, I was thinking we could do

    Add converters and testers.
    Change callers to use those.

and maybe

    Change callers to use type-safe parameters.

Where those mean what I stated earlier.

Comments?


CONVERTERS AND TESTERS ###

add

    symtab_node_base &symtab_node_def::ref_symbol()
        { return symbol; }
    symtab_node_base &cgraph_node::ref_symbol()
        { return symbol; }
    symtab_node_base &varpool_node::ref_symbol()
        { return symbol; }

change
    node->symbol.whatever
to
    node->ref_symbol().whatever

should not need to add these

    cgraph_node &symtab_node_def::ref_cgraph()
        { gcc_assert (symbol.type == SYMTAB_FUNCTION);
          return x_function; }
    varpool_node &symtab_node_def::ref_varpool()
        { gcc_assert (symbol.type == SYMTAB_VARIABLE);
          return x_variable; }

add

    symtab_node_base *symtab_node_def::try_symbol()
        { return &symbol; }
    cgraph_node *symtab_node_def::try_cgraph()
        { return symbol.type == SYMTAB_FUNCTION ? &x_function : NULL; }
    varpool_node *symtab_node_def::try_varpool()
        { return symbol.type == SYMTAB_VARIABLE ? &x_variable : NULL; }

change
    if (symtab_function_p (node) && cgraph (node)->analyzed)
      return cgraph (node);
to
    if (cgraph_node *p = node->try_cgraph ())
      if (p->analyzed)
        return p;

change
    if (symtab_function_p (node) && cgraph (node)->callers)
to
    if (cgraph_node *p = node->try_cgraph ())
      if (p->callers)

change
    if (symtab_function_p (node))
      {
        struct cgraph_node *cnode = cgraph (node);
to
    if (cgraph_node *cnode = node->try_cgraph ())
      {

likewise symtab_variable_p (node) and varpool (node)

If there are any symtab_function_p (node) expressions left, add

    bool symtab_node_def::is_cgraph()
        { return symbol.type == SYMTAB_FUNCTION; }
    bool symtab_node_def::is_varpool()
        { return symbol.type == SYMTAB_VARIABLE; }

change
    symtab_function_p (node)
to
    node->is_cgraph ()

likewise symtab_variable_p (node)

Though we would like to avoid doing so, if there are any cgraph (node) or varpool (node) expressions left, add

    symtab_node_base *symtab_node_def::ptr_symbol()
        { return &symbol; }
    cgraph_node *symtab_node_def::ptr_cgraph()
        { gcc_assert (symbol.type == SYMTAB_FUNCTION);
          return &x_function; }
    varpool_node *symtab_node_def::ptr_varpool()
        { gcc_assert (symbol.type == SYMTAB_VARIABLE);
          return &x_variable; }

change
    cgraph (node)
to
    node->ptr_cgraph()

likewise varpool (node)


TYPE SAFETY ###

If a function asserts that its symtab_node parameter is symtab_function_p, then convert the function to take a cgraph_node* and change the callers to convert as above.

--
Lawrence Crowl
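The difference between the testing ("is"), checked ("ptr") and fallible ("try") converters proposed above can be illustrated with a tagged union in plain C. This is a rough analogue only, not the proposed C++ member functions: every type and function name below (`sym_node`, `func_node`, `is_func`, `try_func`, `ptr_func`, `analyzed_func_or_null`) is a hypothetical, heavily simplified stand-in for the real symbol-table structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the symbol-table types. */
enum node_kind { NODE_FUNCTION, NODE_VARIABLE };

struct node_base { enum node_kind kind; };
struct func_node { struct node_base base; int analyzed; };
struct var_node  { struct node_base base; int finalized; };

/* A node is one of the variants; every variant starts with the base,
   so the kind can be read through any member. */
union sym_node {
    struct node_base base;
    struct func_node func;
    struct var_node  var;
};

/* Tester: is this node a function? */
int is_func(const union sym_node *n)
{
    return n->base.kind == NODE_FUNCTION;
}

/* "try" converter: yields NULL when the node is not a function. */
struct func_node *try_func(union sym_node *n)
{
    return is_func(n) ? &n->func : NULL;
}

/* "ptr" converter: the caller promises the kind; an assert checks it. */
struct func_node *ptr_func(union sym_node *n)
{
    assert(n->base.kind == NODE_FUNCTION);
    return &n->func;
}

/* Caller pattern: test and convert in one step, then use the result. */
struct func_node *analyzed_func_or_null(union sym_node *n)
{
    struct func_node *p = try_func(n);
    return (p != NULL && p->analyzed) ? p : NULL;
}
```

The "try" form folds the kind test and the conversion into one call, which is what lets the two-step `symtab_function_p (node)` plus `cgraph (node)` idiom collapse into a single condition.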
contrib/config-list.mk
The contrib/config-list.mk tool appears to be suffering from bitrot. The make failures for a limited subset of configurations consisted exclusively of: cc1: warnings being treated as errors ../../../../gcc/fixincludes/server.c: In function 'server_setup': ../../../../gcc/fixincludes/server.c:195: error: ignoring return value of 'getcwd', declared with attribute warn_unused_result The warning is correct. It is not clear what one should do upon testing the return value, as server_setup does not signal errors. Suggestions? Do we consider contrib/config-list.mk dead? -- Lawrence Crowl
variable tracking vs. delay slots question
I have a question about the variable tracking and delay slot passes. In some configuration files there is a comment that says that the variable tracking pass should be run after all optimizations which change the order of instructions and that it requires a valid control flow graph to work. But my understanding of the delay slot pass is that it can change the order of instructions and that it can only be run after the control flow graph has been freed. These requirements seem to conflict. Am I right about this or are the comments wrong or am I confused? I think this problem is the basis of bug 54128, a bootstrap failure on MIPS, though the problem seems generic to any system with delay slots.

Steve Ellcey
sell...@mips.com
Re: contrib/config-list.mk
On Tue, Sep 11, 2012 at 3:18 PM, Lawrence Crowl cr...@googlers.com wrote: The contrib/config-list.mk tool appears to be suffering from bitrot. The make failures for a limited subset of configurations consisted exclusively of: cc1: warnings being treated as errors ../../../../gcc/fixincludes/server.c: In function 'server_setup': ../../../../gcc/fixincludes/server.c:195: error: ignoring return value of 'getcwd', declared with attribute warn_unused_result The warning is correct. It is not clear what one should do upon testing the return value, as server_setup does not signal errors. Suggestions? Do we consider contrib/config-list.mk dead? I don't know whether contrib/config-list.mk is dead or not. But I do know that you will only get that error on Debian or Ubuntu systems, which by default pass some rather aggressive warning options. It would be fine to have the program crash if getcwd somehow fails. There is nothing useful that it can do in that unlikely case. Ian
gfortran error: Statement order error: declaration after DATA
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I am trying to compile the cactuscode package and cannot get past the error:

    Statement order error: declaration after DATA

Can you point me in the direction of a fix? I have included the offending file as an attachment.

Dave kb9qhd
Amateur Radio Service
Technician class Licence
Grid EN43
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJQT9tfAAoJEIHvsckbl2dBLMIH/0LR4lA3w9W6lhaB3lkyX9WB
dQJmYHAM59LsGmi+9fmhODG1KkoVfIMIqI8AaDHAFQiqkN2QCr1BNGTFgifFFcV9
BijJt4OtcZTzS0LwIzLTGOEbBJIT2xP1HQmVm/7gYr90HlWvLMHLoPJgqnNsJyNT
mxWMEJojD/xeKaHE6yUIZxRlbnM/pC7UYSIruQ7YjsxC7gKpHfBeOM9Op4AkwJ0k
H4IaKRDpYOKBbEHP6LLPZFTdosjQgWaFnTBILvLaHjSqa9mskU4yTDLdLHFNjUz9
i5hC2ihlIJBcQx1QVLwt/AvjSDtqPqLPKo3h2OBH0IJzlcS+kOkfeSQ+AvkWghU=
=snlv
-----END PGP SIGNATURE-----

C Fast Fourier Transform subroutines
C
C Copyright (c) 1978 Clive Temperton (European Centre for Medium-Range
C Weather Forecasts, Reading, UK)
C Copyright (c) 1980 Russ Rew (National Center for Atmospheric Research,
C Boulder, Colorado, USA)
C
C References:
C --
C C. Temperton: ``Self-Sorting Mixed-Radix Fast Fourier
C Transforms.'', Journal of Computational Physics 52, 1 (1983)
C C. Temperton: ``Fast Mixed-Radix Real Fourier Transforms.'',
C Journal of Computational Physics 52, 340 (1983)
C
C
C $Id: fax.f,v 1.2 2004/10/04 19:22:00 e_gourgoulhon Exp $
C $Log: fax.f,v $
C Revision 1.2 2004/10/04 19:22:00 e_gourgoulhon
C Added copyright and references.
C
C Revision 1.1.1.1 2001/11/20 15:19:30 e_gourgoulhon
C LORENE
C
c Revision 1.3 2000/12/14 15:41:16 eric
c subroutine VPASSM (line 152): the DATA values are forced to double
c precision by appending D00 to the numeric values
c (this caused double/single precision errors with the
c g77 and NAG f95 compilers under Linux).
c
c Revision 1.2 1997/05/23 11:45:49 hyc
c *** empty log message ***
c
C Revision 1.1 1997/03/17 20:05:32 hyc
C Initial revision
C
C
C $Header$
C
C
      SUBROUTINE FAX(IFAX,N,MODE)
      IMPLICIT double PRECISION (A-H,O-Z)
      character*100 header
      data header/'$Header$'/
      DIMENSION IFAX(*)
      NN=N
      IF (IABS(MODE).EQ.1) GO TO 10
      IF (IABS(MODE).EQ.8) GO TO 10
      NN=N/2
      IF ((NN+NN).EQ.N) GO TO 10
      IFAX(1)=-99
      RETURN
   10 K=1
C     TEST FOR FACTORS OF 4
   20 IF (MOD(NN,4).NE.0) GO TO 30
      K=K+1
      IFAX(K)=4
      NN=NN/4
      IF (NN.EQ.1) GO TO 80
      GO TO 20
C     TEST FOR EXTRA FACTOR OF 2
   30 IF (MOD(NN,2).NE.0) GO TO 40
      K=K+1
      IFAX(K)=2
      NN=NN/2
      IF (NN.EQ.1) GO TO 80
C     TEST FOR FACTORS OF 3
   40 IF (MOD(NN,3).NE.0) GO TO 50
      K=K+1
      IFAX(K)=3
      NN=NN/3
      IF (NN.EQ.1) GO TO 80
      GO TO 40
C     NOW FIND REMAINING FACTORS
   50 L=5
      INC=2
C     INC ALTERNATELY TAKES ON VALUES 2 AND 4
   60 IF (MOD(NN,L).NE.0) GO TO 70
      K=K+1
      IFAX(K)=L
      NN=NN/L
      IF (NN.EQ.1) GO TO 80
      GO TO 60
   70 L=L+INC
      INC=6-INC
      GO TO 60
   80 IFAX(1)=K-1
C     IFAX(1) CONTAINS NUMBER OF FACTORS
C     IFAX(1) CONTAINS NUMBER OF FACTORS
      NFAX=IFAX(1)
C     SORT FACTORS INTO ASCENDING ORDER
      IF (NFAX.EQ.1) GO TO 110
      DO 100 II=2,NFAX
      ISTOP=NFAX+2-II
      DO 90 I=2,ISTOP
      IF (IFAX(I+1).GE.IFAX(I)) GO TO 90
      ITEM=IFAX(I)
      IFAX(I)=IFAX(I+1)
      IFAX(I+1)=ITEM
   90 CONTINUE
  100 CONTINUE
  110 CONTINUE
      RETURN
      END
C
      SUBROUTINE FFTRIG(TRIGS,N,MODE)
      IMPLICIT double PRECISION (A-H,O-Z)
      DIMENSION TRIGS(*)
      X1=1
      PI=2.*ASIN(X1)
      IMODE=IABS(MODE)
      NN=N
      IF (IMODE.GT.1.AND.IMODE.LT.6) NN=N/2
      DEL=(PI+PI)/DFLOAT(NN)
      L=NN+NN
      DO 10 I=1,L,2
      ANGLE=.5D00*DFLOAT(I-1)*DEL
      TRIGS(I)=COS(ANGLE)
      TRIGS(I+1)=SIN(ANGLE)
   10 CONTINUE
      IF (IMODE.EQ.1) RETURN
      IF (IMODE.EQ.8) RETURN
      DEL=.5D00*DEL
      NH=(NN+1)/2
      L=NH+NH
      LA=NN+NN
      DO 20 I=1,L,2
      ANGLE=.5D00*DFLOAT(I-1)*DEL
      TRIGS(LA+I)=COS(ANGLE)
      TRIGS(LA+I+1)=SIN(ANGLE)
   20 CONTINUE
      IF (IMODE.LE.3) RETURN
      DEL=.5D00*DEL
      LA=LA+NN
      IF (MODE.EQ.5) GO TO 40
      DO 30 I=2,NN
      ANGLE=DFLOAT(I-1)*DEL
      TRIGS(LA+I)=2*SIN(ANGLE)
   30 CONTINUE
      RETURN
   40 CONTINUE
      DEL=.5D00*DEL
      DO 50 I=2,N
      ANGLE=DFLOAT(I-1)*DEL
      TRIGS(LA+I)=SIN(ANGLE)
   50 CONTINUE
      RETURN
      END
C
C     SUBROUTINE 'VPASSMD' - MULTIPLE VERSION OF 'VPASSA'
C     PERFORMS ONE PASS THROUGH DATA
C     AS PART OF MULTIPLE COMPLEX FFT ROUTINE
C     A IS FIRST REAL INPUT VECTOR
C     B IS FIRST IMAGINARY INPUT VECTOR
C     C IS FIRST REAL OUTPUT
Re: contrib/config-list.mk
Quoting Ian Lance Taylor i...@google.com:
> On Tue, Sep 11, 2012 at 3:18 PM, Lawrence Crowl cr...@googlers.com wrote:
>> The contrib/config-list.mk tool appears to be suffering from bitrot. The make failures for a limited subset of configurations consisted exclusively of:
>>
>> cc1: warnings being treated as errors
>> ../../../../gcc/fixincludes/server.c: In function 'server_setup':
>> ../../../../gcc/fixincludes/server.c:195: error: ignoring return value of 'getcwd', declared with attribute warn_unused_result
>>
>> The warning is correct. It is not clear what one should do upon testing the return value, as server_setup does not signal errors.
>>
>> Suggestions? Do we consider contrib/config-list.mk dead?
>
> I don't know whether contrib/config-list.mk is dead or not.

I certainly hope not.

> But I do know that you will only get that error on Debian or Ubuntu systems, which by default pass some rather aggressive warning options.

So does that mean that bootstrap is broken there too?

> It would be fine to have the program crash if getcwd somehow fails. There is nothing useful that it can do in that unlikely case.

However, less desirable would be if the program silently continues and makes the user think everything is fine when it isn't. Maybe even alter some files it isn't supposed to alter in the process. As buff is an automatic variable, it is likely it contains something that can be interpreted as a string. So calling abort / exit (1) or something similar when things go wrong would make sense.
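The "fail loudly instead of continuing with an uninitialized buffer" approach suggested above could look roughly like this. It is a minimal sketch, not the actual fixincludes code: `getcwd_or_die` is a hypothetical helper name, and the real `server_setup` may need a different error path. Testing the return value also counts as using it, so the warn_unused_result diagnostic no longer applies.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: fill BUF with the current working directory, or
   terminate the program.  Because the buffer is never read after a
   failed call, stale stack contents cannot be mistaken for a path. */
char *getcwd_or_die(char *buf, size_t size)
{
    if (getcwd(buf, size) == NULL) {
        perror("getcwd");       /* report why the call failed */
        exit(EXIT_FAILURE);     /* do not continue with garbage in buf */
    }
    return buf;
}
```

A caller would replace a bare `getcwd (buff, sizeof (buff));` with `getcwd_or_die (buff, sizeof (buff));`. Whether `exit` or `abort` is the better reaction is a policy choice the thread leaves open.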
Re: variable tracking vs. delay slots question
On 09/11/2012 04:33 PM, Steve Ellcey wrote:
> In some configuration files there is a comment that says that the variable tracking pass should be run after all optimizations which change the order of instructions and that it requires a valid control flow graph to work. But my understanding of the delay slot pass is that it can change the order of instructions and that it can only be run after the control flow graph has been freed.

It changes the order of instructions, but IIRC it leaves a little breadcrumb in the instruction's original position. No idea if var-tracking would utilize that breadcrumb. reorg clobbers the CFG as well.

> These requirements seem to conflict. Am I right about this or are the comments wrong or am I confused? I think this problem is the basis of bug 54128, a bootstrap failure on MIPS, though the problem seems generic to any system with delay slots.

I haven't looked at 54128, but yes, I think you're generally right about the conflict. Not sure what the implications are in terms of the failure mode -- it would seem to me that we wouldn't get good debug info. However, I'm not sure offhand how it'd cause a bootstrap error.

jeff
[Bug other/54398] Incorrect ARM assembly when building with -fno-omit-frame-pointer -O2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54398 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #6 from Jakub Jelinek jakub at gcc dot gnu.org 2012-09-11 06:09:13 UTC --- cselib.c has for this the various spots that special case HARD_FRAME_POINTER_REGNUM (or STACK_POINTER_REGNUM or FRAME_POINTER_REGNUM). Please see why that doesn't work in this case.
[Bug preprocessor/54528] [4.8 Regression] system.h:288:78: error: integer overflow in expression
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54528 Mikael Pettersson mikpe at it dot uu.se changed: What|Removed |Added CC||mikpe at it dot uu.se --- Comment #2 from Mikael Pettersson mikpe at it dot uu.se 2012-09-11 07:21:57 UTC --- I got these errors too, when trying to bootstrap gcc-4.8-20120909 on m68k-linux using g++ 4.5.3 as the bootstrap compiler.
[Bug middle-end/54515] cc1plus sigsegv -O2 anonymous namespace
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54515 --- Comment #10 from Richard Guenther rguenth at gcc dot gnu.org 2012-09-11 08:32:43 UTC --- Author: rguenth Date: Tue Sep 11 08:32:29 2012 New Revision: 191174 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=191174 Log: 2012-09-11 Richard Guenther rguent...@suse.de PR middle-end/54515 * gimple.c (get_base_address): Do not return NULL_TREE apart from for WITH_SIZE_EXPR. * gimple-fold.c (canonicalize_constructor_val): Do not call get_base_address when not necessary. * g++.dg/tree-ssa/pr54515.C: New testcase. Added: trunk/gcc/testsuite/g++.dg/tree-ssa/pr54515.C Modified: trunk/gcc/ChangeLog trunk/gcc/gimple-fold.c trunk/gcc/gimple.c trunk/gcc/testsuite/ChangeLog
[Bug middle-end/54515] cc1plus sigsegv -O2 anonymous namespace
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54515 Richard Guenther rguenth at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Known to work||4.6.4, 4.7.2, 4.8.0 Resolution||FIXED Known to fail||4.6.3, 4.7.1 --- Comment #10 from Richard Guenther rguenth at gcc dot gnu.org 2012-09-11 08:32:43 UTC --- Author: rguenth Date: Tue Sep 11 08:32:29 2012 New Revision: 191174 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=191174 Log: 2012-09-11 Richard Guenther rguent...@suse.de PR middle-end/54515 * gimple.c (get_base_address): Do not return NULL_TREE apart from for WITH_SIZE_EXPR. * gimple-fold.c (canonicalize_constructor_val): Do not call get_base_address when not necessary. * g++.dg/tree-ssa/pr54515.C: New testcase. Added: trunk/gcc/testsuite/g++.dg/tree-ssa/pr54515.C Modified: trunk/gcc/ChangeLog trunk/gcc/gimple-fold.c trunk/gcc/gimple.c trunk/gcc/testsuite/ChangeLog --- Comment #11 from Richard Guenther rguenth at gcc dot gnu.org 2012-09-11 08:33:36 UTC --- Fixed.
[Bug target/54546] New: SH: Enable -fshrink-wrap
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54546

             Bug #: 54546
           Summary: SH: Enable -fshrink-wrap
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: ch...@gcc.gnu.org

Created attachment 28169
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28169
simple-return pattern

Implement the simple_return pattern to enable shrink-wrapping, which is beneficial on SH when the prologue/epilogue is small enough or when not optimizing for size. This adds the sh_can_use_return_insn_p function so that refinements based on epilogue size can be added later.

However, this exposes a -freorder-blocks-and-partition -fprofile-use regression in the testsuite that must be investigated.
[Bug target/54546] SH: Enable -fshrink-wrap
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54546 chrbr at gcc dot gnu.org changed: What|Removed |Added Severity|normal |enhancement
[Bug c++/54545] diagnostic overflow
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54545 Richard Guenther rguenth at gcc dot gnu.org changed: What|Removed |Added Keywords||diagnostic Status|UNCONFIRMED |NEW Last reconfirmed||2012-09-11 Ever Confirmed|0 |1 Severity|normal |enhancement --- Comment #1 from Richard Guenther rguenth at gcc dot gnu.org 2012-09-11 09:46:45 UTC --- Confirmed. t.C:1:15: error: expected '{' instead of '(' enum class d(a,b,c}; ^ would be nice ;)
[Bug debug/54534] [4.7 Regression] Missing location for unused variable
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54534

--- Comment #3 from Richard Guenther rguenth at gcc dot gnu.org 2012-09-11 09:49:27 UTC ---
I am testing

Index: gcc/cgraph.h
===================================================================
--- gcc/cgraph.h        (revision 191174)
+++ gcc/cgraph.h        (working copy)
@@ -951,7 +951,7 @@ varpool_can_remove_if_no_refs (struct va
   return (!node->force_output && !node->used_from_other_partition
          && ((DECL_COMDAT (node->decl)
               && !varpool_used_from_object_file_p (node))
-             || !node->externally_visible
+             || (flag_toplevel_reorder && !node->externally_visible)
              || DECL_HAS_VALUE_EXPR_P (node->decl)));
 }

which restores previous behavior.
[Bug debug/54534] [4.7 Regression] Missing location for unused variable
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54534 --- Comment #4 from Jan Hubicka hubicka at gcc dot gnu.org 2012-09-11 10:19:04 UTC --- Well, the patch really is quite symptomatic - i.e. dwarf2out should not forget about the decl when it is removed from varpool. The way things are supposed to work (I believe) is to call global_decl debug hook via emit_debug_global_declarations that is called from frontend at global decl wrapup time (i.e. the thing that is executed after whole compilation and should not be frontend specific :) I made some cleanups in this area as part of symtab work, but I do not see how those should affect the debug output...
[Bug c++/54403] [C++11] operator! applied to a member of a templated class in a lambda expression that captures 'this' pointer crashes compiler
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54403 Paolo Carlini paolo.carlini at oracle dot com changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2012-09-11 Ever Confirmed|0 |1
[Bug debug/54534] [4.7 Regression] Missing location for unused variable
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54534 --- Comment #5 from rguenther at suse dot de rguenther at suse dot de 2012-09-11 10:38:32 UTC --- On Tue, 11 Sep 2012, hubicka at gcc dot gnu.org wrote: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54534 --- Comment #4 from Jan Hubicka hubicka at gcc dot gnu.org 2012-09-11 10:19:04 UTC --- Well, the patch really is quite symptomatic - i.e. dwarf2out should not forget about the decl when it is removed from varpool. The way things are supposed to work (I believe) is to call global_decl debug hook via emit_debug_global_declarations that is called from frontend at global decl wrapup time (i.e. the thing that is executed after whole compilation and should not be frontend specific :) I made some cleanups in this area as part of symtab work, but I do not see how those should affect the debug output... The regression is that it prints optimized_out which it really is. But at -O0 we don't IMHO want unused decls to go away. Not sure why it doesn't regress on trunk - somehow the decl is output anyway. Richard.
[Bug debug/54534] [4.7 Regression] Missing location for unused variable
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54534 --- Comment #6 from Richard Guenther rguenth at gcc dot gnu.org 2012-09-11 10:43:17 UTC --- Author: rguenth Date: Tue Sep 11 10:43:13 2012 New Revision: 191176 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=191176 Log: 2012-09-11 Richard Guenther rguent...@suse.de PR debug/54534 * cgraph.h (varpool_can_remove_if_no_refs): Restore dependence on flag_toplevel_reorder. Modified: branches/gcc-4_7-branch/gcc/ChangeLog branches/gcc-4_7-branch/gcc/cgraph.h
[Bug debug/54534] [4.7 Regression] Missing location for unused variable
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54534 Richard Guenther rguenth at gcc dot gnu.org changed: What|Removed |Added Status|WAITING |RESOLVED Resolution||FIXED --- Comment #6 from Richard Guenther rguenth at gcc dot gnu.org 2012-09-11 10:43:17 UTC --- Author: rguenth Date: Tue Sep 11 10:43:13 2012 New Revision: 191176 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=191176 Log: 2012-09-11 Richard Guenther rguent...@suse.de PR debug/54534 * cgraph.h (varpool_can_remove_if_no_refs): Restore dependence on flag_toplevel_reorder. Modified: branches/gcc-4_7-branch/gcc/ChangeLog branches/gcc-4_7-branch/gcc/cgraph.h --- Comment #7 from Richard Guenther rguenth at gcc dot gnu.org 2012-09-11 10:43:40 UTC --- Fixed.
[Bug other/54398] Incorrect ARM assembly when building with -fno-omit-frame-pointer -O2
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54398 --- Comment #7 from ramrad01 at arm dot com 2012-09-11 10:44:30 UTC --- On 09/11/12 07:09, jakub at gcc dot gnu.org wrote: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54398 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #6 from Jakub Jelinek jakub at gcc dot gnu.org 2012-09-11 06:09:13 UTC --- cselib.c has for this the various spots that special case HARD_FRAME_POINTER_REGNUM (or STACK_POINTER_REGNUM or FRAME_POINTER_REGNUM). Please see why that doesn't work in this case. This rings a bell. Maybe the patch mentioned below needs backporting given Carrot is reporting this against the 4.6 branch. What's not clear if this is reproducible on anything later though. http://old.nabble.com/-PATCH--Prevent-cselib-substitution-of-FP,-SP,-SFP-td33080657.html Full disclaimer that I've not investigated whether the same problem occurs on trunk. HTH, Ramana
[Bug c++/54548] New: unclear error message for ambiguous type lookup.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54548

             Bug #: 54548
           Summary: unclear error message for ambiguous type lookup.
    Classification: Unclassified
           Product: gcc
           Version: 4.7.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: pl...@agmk.net

#include <new>
struct X;
namespace { struct X; }

int main() {
 new X();
}

The GCC error message is hard to read for an end user when the first 'struct X' declaration is hidden deep in the #include forest.

$ LANG=C g++ -Wall t.cpp -c
t.cpp: In function 'int main()':
t.cpp:6:6: error: expected type-specifier before 'X'
t.cpp:6:6: error: expected ';' before 'X'

clang is more user-friendly and shows the problem directly.

$ LANG=C clang++ -Wall t.cpp -c
t.cpp:6:6: error: reference to 'X' is ambiguous
 new X();
     ^
t.cpp:2:8: note: candidate found by name lookup is 'X'
struct X;
       ^
t.cpp:3:20: note: candidate found by name lookup is 'anonymous namespace::X'
namespace { struct X; }
                   ^
t.cpp:6:6: error: allocation of incomplete type 'X'
 new X();
     ^
t.cpp:2:8: note: forward declaration of 'X'
struct X;
       ^
2 errors generated.
[Bug lto/54312] uniquify_nodes takes 12% of Mozilla LTO build
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54312 --- Comment #2 from Richard Guenther rguenth at gcc dot gnu.org 2012-09-11 11:03:52 UTC --- Patch pre-approved (also for 4.7) if it passes your testing.
[Bug middle-end/54149] write introduction incorrect wrt the C11 memory model
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54149 --- Comment #3 from Aldy Hernandez aldyh at gcc dot gnu.org 2012-09-11 12:28:11 UTC --- Author: aldyh Date: Tue Sep 11 12:28:02 2012 New Revision: 191179 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=191179 Log: PR middle-end/54149 * tree-ssa-loop-im.c (execute_sm_if_changed_flag_set): Only set flag for writes. Added: trunk/gcc/testsuite/gcc.dg/simulate-thread/speculative-store-4.c Modified: trunk/gcc/ChangeLog trunk/gcc/tree-ssa-loop-im.c
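For readers unfamiliar with the transformation behind this commit, here is a rough sketch of what store motion with a "changed" flag looks like at the source level. It is an illustration only, not the code GCC generates: the names shared_counter, update_source and update_hoisted are made up, and the shape of the transformed loop is an assumption based on the log entry ("Only set flag for writes"). The point is that the flag guarding the final store is set on the write path only, so a loop that merely reads the variable never introduces a store to it.

```c
#include <assert.h>

/* Hypothetical shared variable; another thread may read or write it. */
static int shared_counter;

/* Source form: reads on every iteration, writes only when cond[i] is set. */
int update_source(const int *cond, int n)
{
    assert(cond != 0 || n == 0);
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += shared_counter;      /* read */
        if (cond[i])
            shared_counter = i;     /* conditional write */
    }
    return sum;
}

/* Assumed shape after store motion: the variable lives in a temporary
   inside the loop and is written back once, guarded by a flag. */
int update_hoisted(const int *cond, int n)
{
    assert(cond != 0 || n == 0);
    int tmp = shared_counter;       /* load hoisted out of the loop */
    int changed = 0;                /* set only where the source stores */
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += tmp;                 /* read: must NOT set the flag */
        if (cond[i]) {
            tmp = i;
            changed = 1;            /* write: sets the flag */
        }
    }
    if (changed)
        shared_counter = tmp;       /* no store unless the source stored */
    return sum;
}
```

If the flag were also set on the read path, update_hoisted would store to shared_counter even when every cond[i] is zero, which is a write the source program never performs.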
[Bug middle-end/54149] write introduction incorrect wrt the C11 memory model
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54149 Aldy Hernandez aldyh at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED --- Comment #4 from Aldy Hernandez aldyh at gcc dot gnu.org 2012-09-11 12:30:36 UTC --- fixed on trunk
[Bug ada/54549] New: Compilation Error : Assertion Failure
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54549

Bug #: 54549
Summary: Compilation Error : Assertion Failure
Classification: Unclassified
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: blocker
Priority: P3
Component: ada
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: webe...@gmail.com

Created attachment 28170
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28170
The source code to be compiled.

    gnatmake -ws -c -u -P/home/someone/Projects/RTES_Train/rtes_train.gpr layout.ads
    gcc-4.6 -c -I- -gnatA /home/someone/Projects/RTES_Train/layout.ads
    +===========================GNAT BUG DETECTED==============================+
    | 4.6.3 (i686-pc-linux-gnu) Assert_Failure sinfo.adb:2547                  |
    | Error detected at layout.ads:9:4                                         |
    | Please submit a bug report; see http://gcc.gnu.org/bugs.html.            |
    | Use a subject line meaningful to you and us to track the bug.            |
    | Include the entire contents of this bug box in the report.               |
    | Include the exact gcc-4.6 or gnatmake command that you entered.          |
    | Also include sources listed below in gnatchop format                     |
    | (concatenated together with no headers between files).                   |
    +==========================================================================+

    Please include these source files with error report
    Note that list may not be accurate in some cases,
    so please double check that the problem can still
    be reproduced with the set of files listed.
    Consider also -gnatd.n switch (see debug.adb).

    /home/someone/Projects/RTES_Train/layout.ads
    /home/someone/Projects/RTES_Train/bin/GNAT-TEMP-01.TMP
    /home/someone/Projects/RTES_Train/bin/GNAT-TEMP-02.TMP

    raised SYSTEM.ASSERTIONS.ASSERT_FAILURE : namet.adb:655
    gnatmake: /home/someone/Projects/RTES_Train/layout.ads compilation error
    [2012-09-11 07:32:17] process exited with status 4 (elapsed time: 00.12s)
[Bug debug/54519] [4.6/4.7/4.8 Regression] Debug info quality regression due to (pointless) partial inlining
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54519

Jakub Jelinek jakub at gcc dot gnu.org changed:

           What    |Removed                       |Added
         Status|NEW                           |ASSIGNED
     AssignedTo|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org 2012-09-11 13:26:54 UTC ---
Created attachment 28171
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28171
gcc48-pr54519.patch

Generic solution patch. This doesn't attempt to special case inlining of FN.part.N back into FN or FN inlined into BAR (which is going to be harder than I initially thought, because both the (possibly inlined) FN and FN.part.N originally have a full copy of the BLOCK tree of FN, but after some optimizations in between fnsplit and inlining, some of the BLOCKs or BLOCK_VARS in either or both of them might have been removed).

So, expand_call_inline would probably need to avoid attaching remap_blocks as children of the new BLOCK it creates; instead it should somehow merge the two BLOCK trees back into one (it could use BLOCK_ABSTRACT_ORIGIN to find the matching blocks, and for the blocks that already exist in FN just insert_decl_map from FN.part.N's BLOCK to the corresponding FN BLOCK). Similarly, BLOCK_VARS need to be handled by preferring to remap to vars in the caller FN BLOCK_VARS (just insert_decl_map those), and just re-adding the rest that was dropped on the floor in the meantime. And it would need to drop some of the debug bind and debug source bind stmts for parameters that this patch adds.

Several of the tests fail sometimes; I'm going to file a separate PR for that, because the problem is during the first df_analyze.
[Bug fortran/50640] [4.7 Regression] [OOP] FAIL: gfortran.dg/select_type_12.f03 -O (internal compiler error)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50640

--- Comment #26 from Tobias Burnus burnus at gcc dot gnu.org 2012-09-11 13:44:44 UTC ---
The solution of comment 3, fixed by comment 24, seems to break the test case of PR fortran/53718. Reverting the patch (comment 24, except for the unrelated class.c part) fixes the issue of PR 53718.

For some reason, reverting the patch no longer triggers the issue of this PR, i.e. gfortran.dg/select_type_12.f03 gives no ICE. Hence, it seems as if comment 2 no longer applies.

To recap (rough version): gfortran generates in MAIN__ the nested function __copy_MAIN___T1 and assigns it (as a function pointer) to a field of the static struct __vtab_MAIN___T1. But __copy_MAIN___T1 does not get called in MAIN__, only in foo, which is also a nested function of MAIN__. Thus, __vtab_MAIN___T1 didn't get marked as referenced, causing the ICE. The patch in comment 24 hoisted __copy_MAIN___T1 out of MAIN__ into the TU space.

Does anyone see a reason why the patch shouldn't be reverted? (I assume one of Richard's patches in May fixed the issue. That probably means that one has to find another solution for 4.7. Suggestions?)

Comments - especially from the middle-end side?
[Bug fortran/53718] [4.7/4.8 regression] [OOP] gfortran generates asm label twice in the same output file
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53718

--- Comment #10 from Tobias Burnus burnus at gcc dot gnu.org 2012-09-11 13:47:59 UTC ---
(In reply to comment #7)
(In reply to comment #6)
Could it be revision 181505?
Very likely. If it is, I'm betting on the PR50640 part of that commit. Indeed the following patch, which is practically a revert of the trans-decl.c part of the above commit, makes the errors go away:

We should back out everything - except for the class.c part of the commit in bug 50640 comment 24.

My suspicion is that one of Richard's commits in May fixed the issue. In turn, that probably means that backing out the patch for PR50640 only works for 4.7 and not for 4.8.

See also some comments/analysis/RFC at bug 50640 comment 26.
[Bug c/54550] New: GCC -O3 breaks floating point equality comparison
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54550

Bug #: 54550
Summary: GCC -O3 breaks floating point equality comparison
Classification: Unclassified
Product: gcc
Version: 4.6.3
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: veio...@gmail.com

32-bit X86 on Linux.

First of all, this is not one of those bug reports that states that floating point is implemented differently on different platforms. This is the same C source on the same platform.

Secondly, this isn't a blanket statement. It happens in a very specific set of circumstances. If there's interest from this forum, then I'll try to hack together a demo. Doing so is nontrivial due to the sensitive (to surrounding code) nature of the bug, and my inability to upload my entire source.

Here's the bug, from a high level perspective:

1. Fill a list with a bunch of random (but valid) 64-bit double-precision values.

2. Sort the list from negative to positive, into another list.

3. For each random value in the first list, scan through the second (sorted) list to find it, using floating point comparison for equality. This should work because the numbers are bit-for-bit identical to those in the original list. (Only the order of the numbers is different between the lists.)

4. If you find any value which is not on the list, print an error.

With -O2, it's fine. With -O3, it prints an error. However, if I use 80-bit long double precision, it works again with -O3.

I suspect that, at higher optimization levels, GCC is caching doubles in X87 registers. That's great, but somehow equality comparison is getting messed up under _some_ scenarios. When I print out the values that are different, they're identical, down to the bit. Again, no math is done on any of these values after generating them. It's purely loads and stores.

It's worth noting that if, instead of -O3, I use -O2 and add all the command line options that -O3 enables beyond -O2 _except_ for -finline-functions, then it also works fine. I suspect that function inlining, in this case, causes the above register caching issue to break equality comparison, probably because the compiler is inlining the search function in step 3.

Maybe there's some subtle spec violation here, i.e. maybe floating point equality is somehow illegitimate. If it is, sorry for submitting this bug.
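Since the reporter could not attach a demo, the four steps of the report can be sketched roughly as follows. This is not the reporter's code; count_missing and cmp_double are made-up names, and the sketch only shows the structure of the test (copy, sort, search by ==). Whether it reproduces the reported failure depends on the target, the optimization level, and how the values were produced, none of which the report pins down.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator ordering doubles from negative to positive. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Steps 2-4 of the report: sort a copy of src, then look every original
   value up in the sorted copy using == and count the values not found.
   Returns (size_t)-1 if the copy cannot be allocated. */
size_t count_missing(const double *src, size_t n)
{
    assert(src != NULL || n == 0);
    if (n == 0)
        return 0;

    double *sorted = malloc(n * sizeof *sorted);
    if (sorted == NULL)
        return (size_t)-1;
    memcpy(sorted, src, n * sizeof *sorted);
    qsort(sorted, n, sizeof *sorted, cmp_double);

    size_t missing = 0;
    for (size_t i = 0; i < n; i++) {
        int found = 0;
        for (size_t j = 0; j < n; j++) {
            if (sorted[j] == src[i]) {   /* floating point equality */
                found = 1;
                break;
            }
        }
        if (!found)
            missing++;
    }
    free(sorted);
    return missing;
}
```

When both arrays hold values that were stored to memory as doubles, as here, every lookup should succeed and the function should return 0. The report describes a case where the compared value apparently still carries extra x87 precision.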
[Bug tree-optimization/52445] [4.6 Regression] conditional store replacement causes segfault in generated code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52445 Mikael Pettersson mikpe at it dot uu.se changed: What|Removed |Added CC||mikpe at it dot uu.se --- Comment #13 from Mikael Pettersson mikpe at it dot uu.se 2012-09-11 14:21:09 UTC --- Could this be applied to gcc-4.6.4 please? A recently reported miscompilation of a device driver in the Linux/ARM kernel by gcc-4.6.3 was traced to this bug. Applying the trunk patch to 4.6.3 fixed that test case. FWIW, I've been using and testing this fix in my own 4.6-based branch since early March, on multiple targets, without regressions.
[Bug debug/54551] New: DF resets some DEBUG_INSNs unnecessarily
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54551

Bug #: 54551
Summary: DF resets some DEBUG_INSNs unnecessarily
Classification: Unclassified
Product: gcc
Version: 4.7.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: debug
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: ja...@gcc.gnu.org
CC: aol...@gcc.gnu.org, hubi...@gcc.gnu.org, jan.kratoch...@redhat.com, m...@gcc.gnu.org, rgue...@gcc.gnu.org
Depends on: 54519

+++ This bug was initially created as a clone of Bug #54519 +++

Several of the PR54519 tests fail. The problem can be reproduced even without any partial inlining though, e.g. on:

    void bar (void);

    int
    foo (int x, int y, int z)
    {
      if (x != z)
        {
          int a = z + 8;
          bar ();
          bar ();
        }
      return y;
    }

at -g -O2. At *.dfinit we still have:

    (insn 4 3 5 2 (set (reg/v:SI 62 [ z ])
            (reg:SI 1 dx [ z ])) vu.c:5 65 {*movsi_internal}
         (nil))
    (note 5 4 8 2 NOTE_INSN_FUNCTION_BEG)
    (insn 8 5 9 2 (set (reg:CCZ 17 flags)
            (compare:CCZ (reg/v:SI 60 [ x ])
                (reg/v:SI 62 [ z ]))) vu.c:6 7 {*cmpsi_1}
         (nil))
    (jump_insn 9 8 10 2 (set (pc)
            (if_then_else (eq (reg:CCZ 17 flags)
                    (const_int 0 [0]))
                (label_ref 14)
                (pc))) vu.c:6 595 {*jcc_1}
         (expr_list:REG_BR_PROB (const_int 3784 [0xec8])
            (nil))
     -> 14)
    (note 10 9 11 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
    (debug_insn 11 10 12 3 (var_location:SI a (plus:SI (reg/v:SI 62 [ z ])
            (const_int 8 [0x8]))) vu.c:8 -1
         (nil))

but during CSE1, when fast DCE is performed, the debug insn for a is reset, as pseudo 62 isn't live in that basic block. We have the valtrack.c infrastructure for this kind of thing, but that apparently works only within basic blocks, while in this case we have a BB (2) where the pseudo dies and a BB (3) that is dominated by that BB and has a debug insn using that pseudo.

Perhaps in further RTL passes that use DF that is sufficient, but the first time DF liveness is computed, as this testcase or PR54519 shows, we drop on the floor debug insns that could still refer to debug temporaries if we initialized them in the bbs where they die and that dominate the debug uses. For this particular testcase it could still live in %edx on the first call bar insn, then in DW_OP_GNU_entry_value (%edx).

Alex, what do you think about this?
[Bug fortran/53957] Polyhedron 11 benchmark: MP_PROP_DESIGN twice as long as other compiler
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53957 --- Comment #10 from Richard Guenther rguenth at gcc dot gnu.org 2012-09-11 15:02:00 UTC --- There are a lot more reasons why we do not vectorize this loop :(
[Bug libstdc++/54172] [4.7 Regression] __cxa_guard_acquire thread-safety issue
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54172 --- Comment #16 from Jakub Jelinek jakub at gcc dot gnu.org 2012-09-11 15:23:01 UTC --- Author: jakub Date: Tue Sep 11 15:22:54 2012 New Revision: 191190 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=191190 Log: PR libstdc++/54172 * libsupc++/guard.cc (__cxa_guard_acquire): Fix up the last argument of the first __atomic_compare_exchange_n. Modified: trunk/libstdc++-v3/ChangeLog trunk/libstdc++-v3/libsupc++/guard.cc
[Bug libstdc++/54172] [4.7 Regression] __cxa_guard_acquire thread-safety issue
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54172 --- Comment #17 from Jakub Jelinek jakub at gcc dot gnu.org 2012-09-11 15:24:12 UTC --- Author: jakub Date: Tue Sep 11 15:24:06 2012 New Revision: 191191 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=191191 Log: PR libstdc++/54172 * libsupc++/guard.cc (__cxa_guard_acquire): Fix up the last argument of the first __atomic_compare_exchange_n. Modified: branches/gcc-4_7-branch/libstdc++-v3/ChangeLog branches/gcc-4_7-branch/libstdc++-v3/libsupc++/guard.cc
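For context on what "the last argument" refers to: the GCC builtin __atomic_compare_exchange_n takes (ptr, expected, desired, weak, success_memorder, failure_memorder), so the last argument is the memory order used when the comparison fails. The sketch below is not the libsupc++ code; the guard states, function names and the particular memory orders are assumptions chosen for illustration, and the commit log does not say which value was changed.

```c
#include <assert.h>

/* Hypothetical guard states, for illustration only. */
enum { GUARD_UNINIT = 0, GUARD_PENDING = 1, GUARD_DONE = 2 };

/* Returns 1 if the caller won the right to run the initializer,
   0 if the guard was already pending or done.  A real implementation
   would wait while the guard is pending; this sketch does not. */
int guard_acquire(int *guard)
{
    assert(guard != 0);
    int expected = GUARD_UNINIT;
    /* Last argument: memory order applied when the exchange fails.
       A failing caller still needs to observe the initializer's
       writes, so an acquire order is used here (an assumption). */
    if (__atomic_compare_exchange_n(guard, &expected, GUARD_PENDING,
                                    0 /* strong */,
                                    __ATOMIC_ACQ_REL,
                                    __ATOMIC_ACQUIRE))
        return 1;
    return 0;
}

/* Publishes the initialized object to later callers. */
void guard_release(int *guard)
{
    assert(guard != 0);
    __atomic_store_n(guard, GUARD_DONE, __ATOMIC_RELEASE);
}
```

A thread that loses the compare-exchange and finds the guard done goes on to use the initialized object, which is why the failure order matters for thread safety.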
[Bug c/54550] GCC -O3 breaks floating point equality comparison
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54550 --- Comment #1 from Jonathan Wakely redi at gcc dot gnu.org 2012-09-11 15:34:59 UTC --- Have you read http://gcc.gnu.org/bugs/#nonbugs_general and PR 323?
[Bug c/54550] GCC -O3 breaks floating point equality comparison
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54550 Michael Matz matz at gcc dot gnu.org changed: What|Removed |Added CC||matz at gcc dot gnu.org --- Comment #2 from Michael Matz matz at gcc dot gnu.org 2012-09-11 15:46:08 UTC --- In particular your speculation about the x87 registers is most probably right. If so, it's a known deficiency in the 32bit x86 backend, and you should be able to work around this with -ffloat-store. We have no intention to change this behaviour in GCC.
[Bug c/54550] GCC -O3 breaks floating point equality comparison
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54550 --- Comment #3 from Michael Matz matz at gcc dot gnu.org 2012-09-11 15:48:10 UTC --- Or with the more recent -fexcess-precision=standard option.
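Where changing compiler flags is not an option, a source-level workaround in the same spirit is to push each value through a volatile double before comparing, so that it is stored and reloaded at its declared 64-bit width. This is a sketch of a commonly used idiom, not something proposed in this bug; force_double and equal_as_double are made-up names, and the assumption is that the excess precision comes from values kept in x87 registers.

```c
#include <assert.h>

/* Store x to memory as a double and read it back; the volatile
   qualifier keeps the compiler from eliding the store and reload. */
static double force_double(double x)
{
    volatile double tmp = x;
    return tmp;
}

/* Compare two values after both were rounded to double width.
   NaN inputs are rejected, since NaN never compares equal. */
int equal_as_double(double a, double b)
{
    assert(a == a && b == b);
    return force_double(a) == force_double(b);
}
```

On targets that do double arithmetic in SSE registers the helper changes nothing; it only matters where intermediate values may be wider than double.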
[Bug fortran/53718] [4.7/4.8 regression] [OOP] gfortran generates asm label twice in the same output file
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53718

--- Comment #11 from janus at gcc dot gnu.org 2012-09-11 15:57:35 UTC ---
(In reply to comment #10)
My suspicion is that one of Richard's commits in May fixed the issue. In turn, that probably means that backing out the patch for PR50640 only works for 4.7 and not for 4.8.

I assume you mean the other way around, right? The patch of comment 7 *does* work on trunk. I just checked that applying it on the 4.7 branch revives the old ICE on select_type_12.

So, should we apply the patch only on trunk for now?
[Bug tree-optimization/54492] [4.8 Regression] SLSR takes ages
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54492 William J. Schmidt wschmidt at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||FIXED --- Comment #2 from William J. Schmidt wschmidt at gcc dot gnu.org 2012-09-11 16:00:11 UTC --- Fixed with r191178. I apparently typoed the PR number in the ChangeLog.
[Bug c/54552] New: Cast to pointer to VLA crash the compiler
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54552

Bug #: 54552
Summary: Cast to pointer to VLA crash the compiler
Classification: Unclassified
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: jens.gust...@loria.fr

Created attachment 28172
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28172
code to reproduce the problem

I stumbled into a case where a cast (double(*)[n]) crashes the compiler, whereas first having a typedef for the VLA works OK. I attach code that reproduces the problem for me.

The bug is present in gcc 4.6.3 as well as in 4.7.0. This is on an Ubuntu x86_64, but I don't think that this should matter.

Jens
[Bug c/54552] Cast to pointer to VLA crash the compiler
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54552 --- Comment #1 from Jens Gustedt jens.gustedt at loria dot fr 2012-09-11 16:11:59 UTC --- The compiler error is test-p99-gcc-bug.c:9:6: internal compiler error: in gimplify_expr, at gimplify.c:7584
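The attachment is not reproduced in this archive, so the following is only a guess at the two forms being contrasted: a direct cast to a pointer-to-VLA type, and the same cast spelled through a typedef. The function names and the row-sum body are invented for illustration, and nothing here is claimed to be the reporter's test case or to trigger the ICE on any particular GCC version.

```c
#include <assert.h>
#include <stddef.h>

/* Direct cast to pointer-to-VLA, the form reported to crash. */
double sum_row_cast(void *buf, size_t n, size_t row)
{
    assert(buf != NULL && n > 0);
    double (*m)[n] = (double (*)[n])buf;
    double s = 0.0;
    for (size_t j = 0; j < n; j++)
        s += m[row][j];
    return s;
}

/* Same access, with the VLA type named by a typedef first, the form
   reported to work. */
double sum_row_typedef(void *buf, size_t n, size_t row)
{
    assert(buf != NULL && n > 0);
    typedef double row_t[n];
    row_t *m = (row_t *)buf;
    double s = 0.0;
    for (size_t j = 0; j < n; j++)
        s += m[row][j];
    return s;
}
```

Both functions are meant to be equivalent; the typedef only changes how the variably modified type is spelled.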
[Bug c/54552] [4.6/4.7/4.8 Regression] Cast to pointer to VLA crash the compiler
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54552

Jakub Jelinek jakub at gcc dot gnu.org changed:

           What    |Removed                     |Added
         Status|UNCONFIRMED                 |NEW
Last reconfirmed|                            |2012-09-11
             CC|                            |jakub at gcc dot gnu.org
Target Milestone|---                         |4.6.4
        Summary|Cast to pointer to VLA      |[4.6/4.7/4.8 Regression]
               |crash the compiler          |Cast to pointer to VLA
               |                            |crash the compiler
 Ever Confirmed|0                           |1

--- Comment #2 from Jakub Jelinek jakub at gcc dot gnu.org 2012-09-11 16:18:17 UTC ---
Fails since http://gcc.gnu.org/viewcvs?root=gccview=revrev=145254
[Bug middle-end/54544] Option -Wuninitialized does not work as documented with volatile
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54544

Zakhar jimfr06 at gmail dot com changed:

           What    |Removed                     |Added
         Status|RESOLVED                    |UNCONFIRMED
     Resolution|INVALID                     |

--- Comment #3 from Zakhar jimfr06 at gmail dot com 2012-09-11 16:41:43 UTC ---
On second thoughts, I reopen the issue!

WHY:
---
The 'correction' of the code above is contrary to what the function is intended to do and thus breaks its logic: the declarations in the second version of the code are inconsistent with what the function was intended to do. That is because, I suppose, the message gcc delivers and its correction do not match what really happens.

FURTHER EXPLANATIONS:
-
You are right Andrew, the variable 'bar' is indeed (in the first version of the code) a 'pointer to a volatile memory location' (the location of an int). *And that was intended.*

The code is a simplification of a 'memory protection algorithm' where the int that is pointed to represents the count of threads using a given piece of memory. This count, being accessed from several threads, is indeed volatile, as no thread can assume the value didn't change since the last time it read or wrote the value. (Atomic instructions were removed from the code example.)

The pointer itself is not at all volatile. Once it is passed to the function, it sits on the stack, where its value should be unchanged as long as the function lives, and the same goes for the 'bar' automatic variable. I'm not doing crazy threaded things on variables on the stack of this function!

I was confused by the warning message saying:

    'bar' may be used uninitialized in this function

... because the message is indeed confusing. 'bar' is NOT used uninitialized (as demonstrated), but the *content pointed to by 'bar'* could be. I must confess that I didn't step back far enough to interpret the message this way.

HYPOTHESIS:
---
So could it be that gcc saw that, and warns incorrectly on 'bar' instead of '*bar'?

If so, yes: because the function receives a pointer to a memory location, the function itself cannot know whether the location pointed to was initialized or not. 'bar' gets the same address ( bar = p ), thus, indeed, the location pointed to by 'bar' could be uninitialized.

This would also be coherent with the fact that when we change the function to static, the warning disappears. Being static, all the callers have to be in the same source file, thus the compiler can check whether the callers properly initialize the memory pointed to.

But then, shouldn't the message be:

    '*bar' may be used uninitialized in this function.

And that would indeed be correct, because '*bar', being the memory pointed to by bar, could indeed be uninitialized (if the caller didn't initialize it). And thus, the compiler would do a good job signaling that, as '*bar' (the memory which bar points to) is declared as volatile, but it is NOT an automatic variable (the pointer is, not the memory pointed to).

My hypothesis is probably wrong, because if gcc warned about uninitialized memory pointed to, it would have to warn on this:

    int
    foo( p )
        int *p;
    {
        int foobar;

        foobar= *p;

        return foobar + 2;
    }

(we don't know if '*p' has ever been initialized). And this short code snippet produces no warning at all, which is fine because a lot of code does that (passing a variable 'by reference'), and it is perfectly correct not to warn.

CONCLUSION:
--
As previously concluded, it is not strictly a 'bug', as the documentation clearly states that in some cases the detection is broken, but I assume this issue can go in the general thread better wuninitialized.

- either gcc saw that '*bar' is uninitialized and should report that (and not report the pointer instead)
- or gcc saw 'bar' (the pointer) to be uninitialized, and that is a test case where it can do better work, as we can demonstrate it IS initialized.

VERSION 3 of the code:
-
Of course, NOT to break my program logic, the correct declaration of the variable should be:

    /*09*/ volatile int * volatile bar;

The 1st 'volatile' because indeed the memory we point to MUST be declared as volatile. The 2nd 'volatile' to suppress the 'false positive' detection on the pointer.

QUESTION @Andrew:
Should I post something on the general thread better wuninitialized (unless my deductions are wrong again), or do you attach the use case directly from this bug report?
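The two qualifier positions discussed in this comment can be illustrated as below. The functions are invented for the purpose (the original function from the earlier comments is not reproduced in this message), so this shows only what each 'volatile' applies to, not the code that triggers the warning.

```c
#include <assert.h>

/* 'volatile' before the '*': the int that is pointed to is volatile,
   so every *bar is a real memory access.  The pointer 'bar' itself is
   an ordinary automatic variable. */
int read_count(volatile int *p)
{
    assert(p != 0);
    volatile int *bar = p;
    return *bar + 2;
}

/* 'volatile' on both sides of the '*', as in "VERSION 3": the pointee
   is volatile and the pointer variable 'bar' is volatile too, so the
   pointer is also reloaded from its stack slot on each use. */
int read_count_v3(volatile int *p)
{
    assert(p != 0);
    volatile int * volatile bar = p;
    return *bar + 2;
}
```

Both versions read the same location and return the same value; the second qualifier only affects how the pointer variable itself is accessed.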
[Bug gcov-profile/54487] [4.8 Regression] profiledbootstrap broken by r190952
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54487

--- Comment #16 from Teresa Johnson tejohnson at google dot com 2012-09-11 17:24:58 UTC ---
On Thu, Sep 6, 2012 at 10:18 PM, Teresa Johnson tejohn...@google.com wrote:
On Thu, Sep 6, 2012 at 1:49 PM, hjl.tools at gmail dot com gcc-bugzi...@gcc.gnu.org wrote:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54487
--- Comment #13 from H.J. Lu hjl.tools at gmail dot com 2012-09-06 20:49:02 UTC ---
It works for me now after syncing with revision 191037.

Unfortunately, I have now hit this myself a couple times in further testing, so I am still digging...

After digging into this for several days now, I am convinced that the gcda file is changing out from underneath the profile-use compile during the read. I just don't understand how, since fcntl is being used to lock the file. Details below. I am hoping someone has some ideas or advice on where to go.

First, some info on the compile step and gcda file locking, then I will explain why I believe the gcda file is being written mid-read.

The failure is occurring sporadically during the profile-use step of a parallel profiledbootstrap. In this step, not only are the gcc profiles being read in by the profile-use compilation, but the gcc binaries being used in the compile are instrumented, so the gcda files are also being written by the parallel build processes. However, the fcntl file locking should prevent interference between the reads and the writes of the same gcda file. According to the fcntl documentation, the lock is lost on any fclose on the file by the same process, even if it uses a different file descriptor. But for a given profile-use compile, the read step should be complete before the write step when gcc exits and updates the gcda files with libgcov. So I don't think the lock should be lost this way.

The failure is not reproducible manually.
Therefore, I have been adding instrumentation to the compiler to spit out information at both the point of failure (error: corrupted profile info: edge from 54 to 55 exceeds maximal count from read_profile_edge_counts()), and then at the point when the gcda counts are being read in by read_counts_file(). This is the loop that looks like:

    for (ix = 0; ix != n_counts; ix++)
      entry->counts[ix] += gcov_read_counter ();

I modified the above loop to check each count as it is read in against the sum_max, and if it exceeds sum_max to print the summary and counters at that point. I then compared this to the counters obtained for that function from a gcov-dump of the gcda file afterwards. The counters in the gcda file from gcov-dump are slightly higher because of subsequent merges into it by other parallel builds before compilation aborts, but it is still fairly easy to correlate the counter values between them.

What I found is that the counter values being read by read_counts_file look good up to a certain point, and then they go bad. Looking at the huge bad values in hex, they are actually valid values from the gcda file, but it looks like we suddenly jumped back or ahead several locations. So for example, in some cases where it looks like we suddenly jumped ahead in the gcda file, I see that some of the last counter values being read into the array as counters are actually the tag values from just after the counter array in the gcda file. Or in some cases, the counter values are being read mis-aligned by one word, so they look huge because instead of having 0x in the most significant of the 2 counter words, the 0x is in the least significant of the 2 counter words. In other words, we jumped some odd number of words ahead (or behind) in the gcda file.

If another process is writing into the gcda file at the same time, this could happen. 
Specifically, since the histogram now included in the gcda file summaries with my patch only write non-zero values, after a merge the size of the histogram, and therefore the size of the summaries, could change. This will cause the starting offsets of different tags/sections throughout the rest of the gcda file to shift. This shouldn't be a problem if the file locking is working though. But if the file locking is not working, then that would explain why suddenly during the read we suddenly seem to jump to a different spot. A couple of other notes: - I added some code after each counter read in the above loop to seek back to the offset where we read the tag, re-read it, and compare it to the tag we read earlier (then re-seek back to the current location in the counter array before continuing the read). Normally this check succeeds, but in the cases where I am hitting the above errors, the tag at that location has changed to something that doesn't look like a tag. - Jumping to a different spot should corrupt the reads of the rest of the file. But the main loop in read_counts_file will simply ignore any tag it doesn't understand, and exits when it reads a '0' tag. That's why there was no
[Bug gcov-profile/54487] [4.8 Regression] profiledbootstrap broken by r190952
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54487 --- Comment #17 from H.J. Lu hjl.tools at gmail dot com 2012-09-11 17:29:15 UTC --- Thanks for looking into it. This is a long standing problem. I have seen random profiledbootstrap failures for a long time.
[Bug gcov-profile/54487] [4.8 Regression] profiledbootstrap broken by r190952
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54487 --- Comment #18 from Teresa Johnson tejohnson at google dot com 2012-09-11 17:39:00 UTC --- On Tue, Sep 11, 2012 at 10:29 AM, hjl.tools at gmail dot com gcc-bugzi...@gcc.gnu.org wrote: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54487 --- Comment #17 from H.J. Lu hjl.tools at gmail dot com 2012-09-11 17:29:15 UTC --- Thanks for looking into it. This is a long standing problem. I have seen random profiledbootstrap failures for a long time. Thanks for confirming that this has happened prior. Unfortunately the addition of the histogram is likely making this more frequent, due to the changing summary sizes after merging. One way to deal with this for now might be to write all histogram entries (even the 0 ones) into the summary to keep the size static. Honza, what do you think? Teresa
[Bug gcov-profile/54487] [4.8 Regression] profiledbootstrap broken by r190952
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54487

--- Comment #19 from davidxl at google dot com 2012-09-11 17:44:29 UTC ---
How much do we save by not writing out the 0 entries? With the proposed change, how much less frequently does the problem occur?

David
[Bug gcov-profile/54487] [4.8 Regression] profiledbootstrap broken by r190952
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54487 --- Comment #20 from Teresa Johnson tejohnson at google dot com 2012-09-11 18:05:13 UTC --- On Tue, Sep 11, 2012 at 10:44 AM, Xinliang David Li davi...@google.com wrote: How much saving do we get by not writing out the 0 entries? With the proposed change, how less frequent is the problem occuring? Let me get back with some stats. Each histogram entry requires 5 words, and there are a max of 252 entries. In the few cases I checked just now, we were printing about 60 entries per summary, with 3 summaries per gcda file. So printing the whole thing in these cases would require 5*(252-60)*3 = 2880 extra words, or 11520 bytes. Unfortunately, that is a significant increase over the current sizes of those files, which are currently only double or triple that. I also need to verify that changing this would reduce the frequency. A couple other possibilities since this is not frequent: - change the existing error to a warning (as we do under flag_profile_correction) - after finishing reading the counts, re-read the tag as I am doing in my debugging, and if it is no longer valid, throw everything away and re-read the file. - check the counters after reading each one, and if it is sum_max, ignore it and abort the profile read with a warning but continue compiling. Obviously the best solution would be to figure out how the lock is being lost/ignored and fix that, but that may take some time. Teresa David On Tue, Sep 11, 2012 at 10:38 AM, Teresa Johnson tejohn...@google.com wrote: On Tue, Sep 11, 2012 at 10:29 AM, hjl.tools at gmail dot com gcc-bugzi...@gcc.gnu.org wrote: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54487 --- Comment #17 from H.J. Lu hjl.tools at gmail dot com 2012-09-11 17:29:15 UTC --- Thanks for looking into it. This is a long standing problem. I have seen random profiledbootstrap failures for a long time. Thanks for confirming that this has happened prior. 
Unfortunately the addition of the histogram is likely making this more frequent, due to the changing summary sizes after merging. One way to deal with this for now might be to write all histogram entries (even the 0 ones) into the summary to keep the size static. Honza, what do you think? Teresa
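The size estimate in comment #20 can be checked with a few lines of C. This is only a sketch of the arithmetic: the constant and function names are made up for illustration, and the 4-byte word size is an assumption inferred from the quoted figures (11520 / 2880).

```c
#include <assert.h>
#include <stddef.h>

/* Figures quoted in comment #20; the names are illustrative only. */
enum {
  WORDS_PER_HISTOGRAM_ENTRY = 5,
  MAX_HISTOGRAM_ENTRIES = 252,
  BYTES_PER_WORD = 4  /* assumption: implied by 11520 bytes / 2880 words */
};

/* Extra words written per gcda file if every summary stored all
   histogram entries instead of only `written_entries` of them.  */
size_t extra_words(size_t written_entries, size_t summaries)
{
  assert(written_entries <= MAX_HISTOGRAM_ENTRIES);
  return WORDS_PER_HISTOGRAM_ENTRY
         * (MAX_HISTOGRAM_ENTRIES - written_entries)
         * summaries;
}

/* The same quantity expressed in bytes.  */
size_t extra_bytes(size_t written_entries, size_t summaries)
{
  return extra_words(written_entries, summaries) * BYTES_PER_WORD;
}
```

With 60 entries written and 3 summaries per file, this reproduces the 2880 words and 11520 bytes given above.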
[Bug gcov-profile/54487] [4.8 Regression] profiledbootstrap broken by r190952
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54487 --- Comment #21 from davidxl at google dot com 2012-09-11 18:08:26 UTC --- Assuming the size of histogram for the same file does not vary that much, is it better to round the size to the next power of 2 -- 60 entries will need print out 64 etc? David
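The rounding idea in comment #21 might look roughly like the following. It is a sketch, not code from libgcov: the function names are hypothetical, and clamping to the 252-entry maximum mentioned in comment #20 is an added assumption, since the next power of two above 128 would otherwise exceed that maximum.

```c
#include <assert.h>

#define MAX_HISTOGRAM_ENTRIES 252u  /* maximum quoted in comment #20 */

/* Smallest power of two that is >= n (returns 1 for n == 0).  */
unsigned round_up_pow2(unsigned n)
{
  unsigned p = 1;
  assert(n <= (1u << 30));  /* keep the shift below from overflowing */
  while (p < n)
    p <<= 1;
  return p;
}

/* Number of entries to write for a summary with `used` non-zero
   entries: round up to a power of two, but never past the maximum.  */
unsigned padded_entries(unsigned used)
{
  unsigned p = round_up_pow2(used);
  return p > MAX_HISTOGRAM_ENTRIES ? MAX_HISTOGRAM_ENTRIES : p;
}
```

The point of such padding would be that small changes in the number of non-zero entries after merging no longer change the summary size, at a lower cost than always writing all 252 entries.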
[Bug gcov-profile/54487] [4.8 Regression] profiledbootstrap broken by r190952
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54487 --- Comment #22 from H.J. Lu hjl.tools at gmail dot com 2012-09-11 18:10:55 UTC --- (In reply to comment #20) Obviously the best solution would be to figure out how the lock is being lost/ignored and fix that, but that may take some time. Can we use a lockfile to verify that fcntl lock is working correctly?
[Bug gcov-profile/54487] [4.8 Regression] profiledbootstrap broken by r190952
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54487 --- Comment #23 from Markus Trippelsdorf markus at trippelsdorf dot de 2012-09-11 18:14:52 UTC --- gcc/gcov-io.h has: #if defined (HOST_HAS_F_SETLKW) #define GCOV_LOCKED 1 #else #define GCOV_LOCKED 0 #endif But HOST_HAS_F_SETLKW isn't defined anywhere else AFAICS: gcc % git grep HOST_HAS_F_SETLKW gcc/gcov-io.h:#if defined (HOST_HAS_F_SETLKW) gcc %
[Bug fortran/53306] [4.6/4.7/4.8 Regression] ICE on invalid 'array(*) =' statement
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53306 --- Comment #4 from Dominique d'Humieres dominiq at lps dot ens.fr 2012-09-11 18:16:59 UTC --- This PR is fixed by the patch at http://gcc.gnu.org/ml/fortran/2012-09/msg00035.html for pr54225. Isn't it a duplicate?
[Bug fortran/54225] [4.6/4.7/4.8 Regression] fortran compiler segfault processing ' print *, A(1,*) '
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54225 --- Comment #3 from Dominique d'Humieres dominiq at lps dot ens.fr 2012-09-11 18:19:29 UTC --- This PR seems to be a duplicate of pr53306.
[Bug middle-end/54550] GCC -O3 breaks floating point equality comparison
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54550 Andrew Pinski pinskia at gcc dot gnu.org changed:

What             |Removed              |Added
Target           |32-bit X86 on Linux  |i?86-*-*
Status           |UNCONFIRMED          |WAITING
Last reconfirmed |                     |2012-09-11
Component        |c                    |middle-end
Ever Confirmed   |0                    |1

--- Comment #4 from Andrew Pinski pinskia at gcc dot gnu.org 2012-09-11 18:48:50 UTC --- And we need a testcase.
[Bug gcov-profile/54487] [4.8 Regression] profiledbootstrap broken by r190952
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54487 --- Comment #24 from Teresa Johnson tejohnson at google dot com 2012-09-11 18:57:05 UTC --- On Tue, Sep 11, 2012 at 11:14 AM, markus at trippelsdorf dot de gcc-bugzi...@gcc.gnu.org wrote: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54487 --- Comment #23 from Markus Trippelsdorf markus at trippelsdorf dot de 2012-09-11 18:14:52 UTC --- gcc/gcov-io.h has: #if defined (HOST_HAS_F_SETLKW) #define GCOV_LOCKED 1 #else #define GCOV_LOCKED 0 #endif But HOST_HAS_F_SETLKW isn't defined anywhere else AFAICS: gcc % git grep HOST_HAS_F_SETLKW gcc/gcov-io.h:#if defined (HOST_HAS_F_SETLKW) gcc % Maybe it is as simple as that?! I thought I saw that GCOV_LOCKED was set for my compile, but that may have been on the libgcov compile. In fact, just above the code Markus shows from gcov-io.h, when IN_LIBGCOV, GCOV_LOCKED is set based on TARGET_POSIX_IO: #if defined (TARGET_POSIX_IO) #define GCOV_LOCKED 1 #else #define GCOV_LOCKED 0 #endif Indeed, when I look at the preprocessed libgcov.c output from its compile command, GCOV_LOCKED is clearly set (judging by the preprocessed gcov_open() code). But when I use the compile command for coverage.c, which includes gcov-io.c but is !IN_LIBGCOV (so GCOV_LOCKED is set based on HOST_HAS_F_SETLKW), the preprocessed gcov_open code is that of a !GCOV_LOCKED compile, without the call to fcntl. So perhaps it is just the case that the libgcov code that writes the gcda files is doing the locking, but the read on profile-use is not! Anyone know how HOST_HAS_F_SETLKW was supposed to be set? I do see that my configure is setting HAVE_FCNTL_H, perhaps that was intended? Teresa
[Bug gcov-profile/54487] [4.8 Regression] profiledbootstrap broken by r190952
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54487 --- Comment #25 from Jakub Jelinek jakub at gcc dot gnu.org 2012-09-11 18:58:59 UTC --- Indeed, seems http://gcc.gnu.org/ml/gcc-patches/2003-05/msg00571.html has introduced use of that macro, but didn't bother to actually define it anywhere.
[Bug gcov-profile/54487] [4.8 Regression] profiledbootstrap broken by r190952
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54487 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #26 from Jakub Jelinek jakub at gcc dot gnu.org 2012-09-11 19:04:47 UTC --- For the check, I guess you want to check that you can actually compile on host something like: #include <fcntl.h> int main () { struct flock fl; fl.l_whence = 0; fl.l_start = 0; fl.l_len = 0; fl.l_pid = 0; return fcntl (1, F_SETLKW, &fl); }
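For readers unfamiliar with F_SETLKW, the kind of whole-file advisory lock that GCOV_LOCKED is meant to enable can be sketched as below. This is not the gcov_open() code; lock_whole_file and lock_roundtrip are hypothetical helpers, and the only real APIs used are POSIX fcntl with struct flock and the stdio calls.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Block until a lock of the given type (F_RDLCK, F_WRLCK or F_UNLCK)
   covering the whole file is granted.  Returns 0 on success and -1 on
   error, exactly as fcntl does.  */
int lock_whole_file(int fd, short type)
{
  struct flock fl;
  assert(fd >= 0);
  fl.l_type = type;
  fl.l_whence = SEEK_SET;
  fl.l_start = 0;
  fl.l_len = 0;  /* a length of 0 means "through end of file" */
  fl.l_pid = 0;
  return fcntl(fd, F_SETLKW, &fl);
}

/* Hypothetical demo: take and release a write lock on a temporary
   file.  Returns 0 if both operations succeed.  */
int lock_roundtrip(void)
{
  FILE *f = tmpfile();
  int rc;
  if (!f)
    return -1;
  rc = lock_whole_file(fileno(f), F_WRLCK);
  if (rc == 0)
    rc = lock_whole_file(fileno(f), F_UNLCK);
  fclose(f);
  return rc;
}
```

Such locks are advisory, so they only help if both the writer (libgcov) and the reader (the compiler under -fprofile-use) take them, which is the gap this thread identifies.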
[Bug gcov-profile/54487] [4.8 Regression] profiledbootstrap broken by r190952
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54487 --- Comment #27 from Teresa Johnson tejohnson at google dot com 2012-09-11 19:08:07 UTC --- Thanks for the pointers, Jakub. I'll work on adding this check. Teresa
[Bug target/40836] ICE: insn does not satisfy its constraints (iwmmxt_movsi_insn)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40836 --- Comment #31 from Daniel Drake dsd at laptop dot org 2012-09-11 19:11:27 UTC --- Created attachment 28173 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28173 preprocessed source that crashes Another preprocessed source example that shows this crasher, from glibc gconv_cache compilation. Compile with: gcc -march=iwmmxt -O3 -c test.c Note: -O3 is needed to cause the crash, with -O2 it compiles OK.
[Bug middle-end/54544] Option -Wuninitialized does not work as documented with volatile
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54544 --- Comment #4 from Zakhar jimfr06 at gmail dot com 2012-09-11 21:09:28 UTC --- MORE Unfortunately, I don't think the hypothesis of the uninitialized pointed-to memory holds. This should prove it, if we add:

/*01*/ int fct(volatile int *p);
/*02*/
/*03*/ static int
/*04*/ foo( p )
/*05*/ volatile int * p;
/*06*/ {
/*07*/     volatile int foobar,barfoo;
/*08*/     volatile int flag=0;
/*09*/     volatile int * bar;
/*10*/
/*11*/     do
/*12*/     {
/*13*/         if ( *p )
/*14*/         {
/*15*/             flag= fct( p );
/*16*/             bar = p;
/*17*/         }
/*18*/         if ( fct( p ) ) break;
/*19*/         if ( flag )
/*20*/         {
/*21*/             barfoo = *bar;
/*22*/             if ( bar == (int *)0 ) break;
/*23*/             foobar = *bar;
/*24*/             return foobar + barfoo;
/*25*/         }
/*26*/     }
/*27*/     while ( fct( p ) );
/*28*/
/*29*/     return 0;
/*30*/ }
/*31*/
/*32*/ int
/*33*/ main()
/*34*/ {
/*35*/     int i;
/*36*/
/*37*/     return foo( &i );
/*38*/
/*40*/ }

Here 'main' calls the 'foo' function with a pointer to a variable which for sure is NOT initialized, and there is no warning whatsoever when we compile with: $ gcc -O3 -c uninit.c -o /dev/null -Wall

In this example, if we go to line 23, for sure the returned value is totally unpredictable, as it depends on the value of 'i' in the main function. 'i' is on the stack and has not been initialized, so it gets any value that was there previously on the stack! If we remove 'static' in front of the function, this time we get our warning back... but probably a 'false positive' on 'bar', and not related to tracking down pointed-to memory. In this new use-case, if we add 'inline' after static (which -O3 should do by itself here?) we are for sure doing something wrong. Shouldn't -Wuninitialized output something instead of remaining silent?
[Bug debug/54551] DF resets some DEBUG_INSNs unnecessarily
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54551 --- Comment #1 from Alexandre Oliva aoliva at gcc dot gnu.org 2012-09-11 21:20:11 UTC --- I guess we have to somehow locate all death points of the pseudo in paths towards the debug use, and insert debug insns binding the same debug temp to the pseudo before all of the death points, then replace the debug use with a use of the debug temp. I'm not sure how well this fits in the general structure of the DF machinery. Presumably we just need to look up a table of (lists of?) debug temps as we reach death points.
[Bug target/28896] -fstack-limit-symbol and m68k and non 68020
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28896 Larry Baker baker at usgs dot gov changed:

What                          |Removed |Added
Attachment #28165 is obsolete |0       |1

--- Comment #26 from Larry Baker baker at usgs dot gov 2012-09-11 21:33:55 UTC --- Created attachment 28174 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28174 Patch for trunk version 2012-08-12 of gcc/config/m68k/m68k.c Missed second LEGITIMATE_CONSTANT_P; should be m68k_legitimate_constant_p.
[Bug preprocessor/44191] -E output broken for gcc-4.5.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44191 Israel Pinkas ipinkas at nds dot com changed: What|Removed |Added CC||ipinkas at nds dot com --- Comment #2 from Israel Pinkas ipinkas at nds dot com 2012-09-11 22:40:08 UTC --- I am not sure what Jakub means by Locus, but the C and C++ standards are clear that during the preprocessing stage, the sequence backslash-newline is dropped from the stream. Line splicing occurs before tokenization. While in most situations there is no difference, there are some cases which are affected by this bug: 1. Token splicing. Two tokens that are separated by only a backslash-newline in the source are supposed to be concatenated into a single token. In a bizarre twist to this bug, this behavior is preserved. See example below. 2. Use of cpp for other purposes. There exist multiple software packages which are not compilers for C-like languages that use cpp as a preprocessor, accepting that cpp is C-oriented in a number of ways. This includes some assemblers and other packages that need file inclusion, conditional compilation, and simple macros. Many of these packages rely on the line splicing. The bug description needs to be slightly amended. The splicing behavior was changed only when the first character following the newline is a space or tab. The following demonstrates: x.txt Test\ ing Test\ case END $ cpp x.txt # 1 "x.txt" # 1 "<built-in>" # 1 "<command-line>" # 1 "x.txt" Testing Test code == The first instance spliced the tokens. However, the second instance left the newline, a change in behavior and a deviation from the spec.
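The splicing rule described above (every backslash immediately followed by a newline is deleted before tokenization) is simple enough to sketch in C. This is an illustration of the rule only, not how cpp implements it; splice_lines is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Copy `in` to `out`, deleting each backslash that is immediately
   followed by a newline, together with that newline.  `out` must have
   room for strlen(in) + 1 bytes.  Returns the number of characters
   written, not counting the terminating NUL.  */
size_t splice_lines(const char *in, char *out)
{
  size_t n = 0;
  assert(in != NULL && out != NULL);
  while (*in != '\0') {
    if (in[0] == '\\' && in[1] == '\n') {
      in += 2;  /* drop the backslash-newline pair */
      continue;
    }
    out[n++] = *in++;
  }
  out[n] = '\0';
  return n;
}
```

Under this rule both inputs from the example are joined: "Test\" followed by "ing" becomes "Testing", and "Test\" followed by " case" becomes "Test case", regardless of whether the continuation line starts with a space or tab.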
[Bug libstdc++/54482] failures in static linking with libstdc++, due to versioned symbols
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54482 Benjamin Kosnik bkoz at gcc dot gnu.org changed: What|Removed |Added CC||bkoz at gcc dot gnu.org, ||Ralf.Wildenhues at gmx dot ||de --- Comment #2 from Benjamin Kosnik bkoz at gcc dot gnu.org 2012-09-12 03:26:23 UTC --- FYI this bug is a duplicate of 28811. The summary/diagnosis is wrong. It's not about versioned symbols, at least not in 4.7.x and above (after fix of 52689). The issue is that even for all libstdc++ sources that are destined for the static library, ie libstdc++.a, compile flags should include -fPIC or equivalent, so that -static-libstdc++ works. These files are not compiled this way at the moment. So, compat*.o files need to be compiled with -prefer-pic. But, using that means that the delicate balance of the non-convenience library files in src, ie compat*.cc files is disturbed. At the moment, compat*.cc files are special, and have code suitable for static and shared libs, and some code intended only for shared libs. Right now, the compile-time macro to designate these shared-only sections is PIC. But, using libtool's -prefer-pic for the compat*.cc files means -fPIC -DPIC, which messes up static/shared code paths. So, one solution may be to use -prefer-pic when using libtool to create all object files, but to use another macro, say _GLIBCXX_SHARED when compiling only shared code. (Another solution is to make yet-another convenience library, that is only shared, and manually separate the source file dependencies. Let's try not to do it this way.) So, what is desired is a compile-time hook or flag into libtool that deals with just the static or just the shared compiled objects. There are currently configure-time hooks for this (--enable-shared/-static, etc). One hook is to create an override for libtool's pic_flag variable, using CXX_pic_flag, that is special for libstdc++.
ie, from the generated libtool: from: pic_flag= -fPIC -DPIC to: pic_flag=-D_GLIBCXX_SHARED -fPIC -DPIC Sadly, I cannot figure out the correct way to do this, perhaps Ralf can help me. In the meantime: A hacky way to do this in configure.ac: # Use _GLIBCXX_SHARED as a compile-time switch just for libstdc++ to designate # a shared-only codepath. AC_CONFIG_COMMANDS([libtool-pic-patch], [echo config.status: patching libtool's pic_flag with -D_GLIBCXX_SHARED; sed < libtool > libtool.tmp 's/^pic_flag=/pic_flag=-D_GLIBCXX_SHARED /'; mv libtool.tmp libtool; chmod 775 libtool; ])
[Bug middle-end/54550] GCC -O3 breaks floating point equality comparison
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54550 --- Comment #5 from Veiokej veiokej at gmail dot com 2012-09-12 03:28:43 UTC --- Johnathan, Yes, I've read the floating point nonbug stuff. This isn't a nonbug. Michael, I understand your point, and thanks for the command line option. However, this is a subtly different issue than saying 64-bit double precision is slightly more accurate on X86 platforms due to 80-bit temporaries, vs. X64 platforms. The reason it's different is because there is no math done, and 80-bit precision can hold 100% of 64-bit values with no loss of precision. So no matter what, my equality test should not fail. Andrew, I appreciate that. Let me see if I can come up with something short of uploading the entire codebase. It's extremely sensitive to neighboring code, and is thus hard to isolate.
[Bug middle-end/54550] GCC -O3 breaks floating point equality comparison
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54550 --- Comment #6 from Veiokej veiokej at gmail dot com 2012-09-12 04:14:52 UTC --- In the process of trying to create a demo, I think I found the problem. Indeed, no math is taking place between when the value X is first computed and stored to the list, and when it's compared for equality with the nearest value Y found in the list. (They should be identical.) I think that X is observed to be unequal to Y because X is cached in a register, despite the fact that Y is computed in a (later) function. (Evidence: removing -finline-functions fixes the problem, as does adding -ffloat-store.) When I was printing the value previously, I was inadvertently causing both X and Y to be stored to memory as 64-bit doubles, then taking the difference after the fact, which of course turned out to be 0. The difference was actually closer to 6x10^(-17). So this demonstrates how the previously well known excess precision issue can actually cause equality testing to fail, even when no math is involved between the store to memory, and the compare with memory. Yikes! Thankfully this is fixable by using full 80-bit precision, or migrating to X64. Let's leave this on the record for documentation, but I don't think it's worth further pursuit. Thanks to all for your input.
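One common way to sidestep the effect described in comment #6 in source code, without relying on -ffloat-store, is to force both operands through a 64-bit memory slot before comparing. The sketch below is an assumption about how one might do that, not something proposed in the bug; force_double, equal_as_stored and excess_precision_demo are hypothetical names.

```c
#include <assert.h>

/* Round x to a true 64-bit double by storing it to memory and loading
   it back.  The volatile qualifier keeps the compiler from holding
   the value in a wider register across the store.  */
double force_double(double x)
{
  volatile double stored = x;
  return stored;
}

/* Compare two values only after both have been narrowed to 64 bits,
   so a register copy and a memory copy of the same computation
   compare equal.  */
int equal_as_stored(double a, double b)
{
  return force_double(a) == force_double(b);
}

/* Small usage example.  */
void excess_precision_demo(void)
{
  double x = 1.0 / 3.0;
  assert(equal_as_stored(x, x));
}
```

On targets that already evaluate double arithmetic in 64 bits (x86-64 with SSE2, for example) the helper changes nothing; it only matters where temporaries can carry extra precision, as with x87.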
[Bug libstdc++/54482] failures in static linking with libstdc++, due to versioned symbols
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54482 --- Comment #3 from Ollie Wild aaw at gcc dot gnu.org 2012-09-12 04:58:29 UTC --- Note, however, that simply changing pic_flag to pic_flag=-D_GLIBCXX_SHARED -fPIC -DPIC is insufficient. It suffers from the same issue as the original problem, namely that, when configured with --with-pic, pic_flag is passed even when compiling objects for libstdc++.a. Since your solution passes -DPIC and -D_GLIBCXX_SHARED in tandem, keying off the new macro suffers from the same limitations as the old one. See my previous comment. In particular, the link there points to a more detailed analysis, as well as a patch which outlines the basics of what I think needs to change. As you correctly suggest, we need a mechanism to pass flags based on whether we're compiling for a static or shared library independently of whether or not -fPIC is used. The aforementioned patch does that in our branch, but that will need to be cleaned up and submitted to upstream libtool before incorporating it in GCC trunk.
[Bug debug/54551] DF resets some DEBUG_INSNs unnecessarily
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54551 --- Comment #2 from Jakub Jelinek jakub at gcc dot gnu.org 2012-09-12 05:54:29 UTC --- If there is a death point of the pseudo that dominates bbs with uses in some debug insns, then I think best is to insert the debug temporary immediately before the death point. If the death point of the pseudo doesn't dominate the bb with debug uses, or if there are multiple death points in different branches, but if the setter of the pseudo dominates the bb with debug uses or if there is some bb where the pseudo is live, isn't changed afterwards and that spot dominates the debug uses, then the best spot to insert the debug temporary is probably before the conditional jump/whatever other control changing insn at the end of that bb. E.g. for: [ASCII control-flow diagram garbled in extraction: set blocks, death blocks, dbguse blocks, and insertion points labelled (1) and (2)] I think we want to insert the debug temp at (1) resp. (2). If there is no such spot, I think we have to give up, trying to build (if_then_else (condition) D#1234 D#2345) would bloat the debug info too much. Still handling even the dominating cases would be better than what we have right now. Perhaps we could handle single setters first if DF has computed that already. Perhaps this handling could be keyed off some new DF flag which would only be set in the first cse pass.
Re: [SH] Add simple_return pattern
On 09/11/2012 12:28 AM, Oleg Endo wrote: On Mon, 2012-09-10 at 15:51 +0200, Christian Bruel wrote: This patch implements the simple_return pattern to enable -fshrink-wrap on SH. It also cleans up some redundancies for expand_epilogue (called twice from the return and epilogue patterns) and the sh_expand_prologue parameter type. No regressions with sh-superh-elf and sh4-linux gcc testsuites. Thanks Christian Regarding the iterators, maybe it's better to put them in config/sh/iterators.md. The optab code attr is not needed in this case, code is sufficient. How about the attached patch instead? Yes, there is this new iterators.md file. I'm moving the iterator there. Will resend. Thanks Christian
Re: [SH] Add simple_return pattern
On 09/11/2012 03:05 AM, Kaz Kojima wrote: Christian Bruel christian.br...@st.com wrote: This patch implements the simple_return pattern to enable -fshrink-wrap on SH. It also cleans up some redundancies for expand_epilogue (called twice from the return and epilogue patterns) and the sh_expand_prologue parameter type. No regressions with sh-superh-elf and sh4-linux gcc testsuites. With the patch + revision 191106, I've got a new failure: FAIL: gcc.dg/tree-prof/bb-reorg.c compilation, -fprofile-use -D_PROFILE_USE (internal compiler error) for sh4-unknown-linux-gnu. My testsuite/gcc/gcc.log says /exp/ldroot/dodes/xsh-gcc/gcc/xgcc -B/exp/ldroot/dodes/xsh-gcc/gcc/ /exp/ldroot/dodes/LOCAL/trunk/gcc/testsuite/gcc.dg/tree-prof/bb-reorg.c -fno-diagnostics-show-caret -O2 -freorder-blocks-and-partition -fprofile-use -D_PROFILE_USE -lm -o /exp/ldroot/dodes/xsh-gcc/gcc/testsuite/gcc/bb-reorg.x02 /exp/ldroot/dodes/LOCAL/trunk/gcc/testsuite/gcc.dg/tree-prof/bb-reorg.c: In function 'main': /exp/ldroot/dodes/LOCAL/trunk/gcc/testsuite/gcc.dg/tree-prof/bb-reorg.c:38:1: error: EDGE_CROSSING missing across section boundary /exp/ldroot/dodes/LOCAL/trunk/gcc/testsuite/gcc.dg/tree-prof/bb-reorg.c:38:1: internal compiler error: verify_flow_info failed Please submit a full bug report, Regards, Ugh, indeed, I forgot a SPEC file that set the release mode on my SH-Linux distribution, so verify_flow_info was not called :-(. I need to test again. Thanks! Christian kaz
Bootstrap fails (was: Remove unnecessary VEC function overloads.)
On 09/11/2012 01:52 AM, Diego Novillo wrote: Remove unnecessary VEC function overloads. Several VEC member functions that accept an element 'T' used to have two overloads: one taking 'T', the second taking 'T *'. They might be unnecessary, but with your patch bootstrapping fails here with the following error. Did you test with or without Graphite? Tobias /home/tob/projects/gcc-git/gcc/gcc/graphite-scop-detection.c: In function ‘void move_sd_regions(vec_t<sd_region_p>**, vec_t<sd_region_p>**)’: /home/tob/projects/gcc-git/gcc/gcc/vec.h:408:63: error: no matching function for call to ‘vec_t<sd_region_p>::safe_push(vec_t<sd_region_p>**, sd_region*, const char [61], int, const char [16])’ (vec_t<T>::safe_push<A> (&(V), O VEC_CHECK_INFO MEM_STAT_INFO)) ^ /home/tob/projects/gcc-git/gcc/gcc/graphite-scop-detection.c:146:5: note: in expansion of macro 'VEC_safe_push' VEC_safe_push (sd_region, heap, *target, s); ^ /home/tob/projects/gcc-git/gcc/gcc/vec.h:408:63: note: candidate is: (vec_t<T>::safe_push<A> (&(V), O VEC_CHECK_INFO MEM_STAT_INFO))
Ping [SH] Define NO_IMPLICIT_EXTERN_C for newlib targets
Hi Kaz, Any news for my sh-superh-elf --with-newlib patch ? http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00137.html Thanks Christian
Re: [PATCH] Fix PR54492
On Mon, 10 Sep 2012, William J. Schmidt wrote: Here's the revised patch with a param. Bootstrapped and tested in the same manner. Ok for trunk? Ok. Thanks, Richard. Thanks, Bill 2012-08-10 Bill Schmidt wschm...@linux.vnet.ibm.com * doc/invoke.texi (max-slsr-cand-scan): New description. * gimple-ssa-strength-reduction.c (find_basis_for_candidate): Limit the time spent searching for a basis. * params.def (PARAM_MAX_SLSR_CANDIDATE_SCAN): New param. Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 191135) +++ gcc/doc/invoke.texi (working copy) @@ -9407,6 +9407,11 @@ having a regular register file and accurate regist See @file{haifa-sched.c} in the GCC sources for more details. The default choice depends on the target. + +@item max-slsr-cand-scan +Set the maximum number of existing candidates that will be considered when +seeking a basis for a new straight-line strength reduction candidate. + @end table @end table Index: gcc/gimple-ssa-strength-reduction.c === --- gcc/gimple-ssa-strength-reduction.c (revision 191135) +++ gcc/gimple-ssa-strength-reduction.c (working copy) @@ -54,6 +54,7 @@ along with GCC; see the file COPYING3. If not see #include domwalk.h #include pointer-set.h #include expmed.h +#include params.h /* Information about a strength reduction candidate. Each statement in the candidate table represents an expression of one of the @@ -353,10 +354,14 @@ find_basis_for_candidate (slsr_cand_t c) cand_chain_t chain; slsr_cand_t basis = NULL; + // Limit potential of N^2 behavior for long candidate chains. 
+ int iters = 0; + int max_iters = PARAM_VALUE (PARAM_MAX_SLSR_CANDIDATE_SCAN); + mapping_key.base_expr = c->base_expr; chain = (cand_chain_t) htab_find (base_cand_map, &mapping_key); - for (; chain; chain = chain->next) + for (; chain && iters < max_iters; chain = chain->next, ++iters) { slsr_cand_t one_basis = chain->cand; Index: gcc/params.def === --- gcc/params.def(revision 191135) +++ gcc/params.def(working copy) @@ -973,6 +973,13 @@ DEFPARAM (PARAM_SCHED_PRESSURE_ALGORITHM, "Which -fsched-pressure algorithm to apply", 1, 1, 2) +/* Maximum length of candidate scans in straight-line strength reduction. */ +DEFPARAM (PARAM_MAX_SLSR_CANDIDATE_SCAN, + "max-slsr-cand-scan", + "Maximum length of candidate scans for straight-line " + "strength reduction", + 50, 1, 99) + /* Local variables: mode:c -- Richard Biener rguent...@suse.de SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend
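The technique in the patch above, capping how many chain entries are examined so that long chains cannot cause quadratic behavior, can be shown in isolation. The sketch below is a simplified stand-in: struct cand_node and find_bounded are hypothetical, and it returns the first match, whereas the real find_basis_for_candidate applies its own criteria to each candidate it visits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked chain of candidates.  */
struct cand_node {
  int cand;
  struct cand_node *next;
};

/* Return the first node whose `cand` equals `wanted`, looking at no
   more than `max_iters` nodes.  Returns NULL if no match is found
   within that budget, even if one exists further down the chain.  */
const struct cand_node *
find_bounded(const struct cand_node *chain, int wanted, int max_iters)
{
  int iters = 0;
  assert(max_iters >= 1);
  for (; chain && iters < max_iters; chain = chain->next, ++iters)
    if (chain->cand == wanted)
      return chain;
  return NULL;
}
```

The trade-off is the usual one for such limits: a match beyond the budget is missed, so an optimization opportunity may be lost, but the cost per lookup is bounded by the parameter.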
[PATCH] Fix PR54515
This is the trunk variant of the 54515 fix - we shouldn't really return NULL_TREE from get_base_address apart from for invalid inputs (and then it's just GIGO). This makes us go half-way to fix the PR, I'll follow up with a patch to look through WITH_SIZE_EXPR (after thinking about effects on alias analysis). Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2012-09-11 Richard Guenther rguent...@suse.de PR middle-end/54515 * gimple.c (get_base_address): Do not return NULL_TREE apart from for WITH_SIZE_EXPR. * gimple-fold.c (canonicalize_constructor_val): Do not call get_base_address when not necessary. * g++.dg/tree-ssa/pr54515.C: New testcase. Index: gcc/gimple.c === --- gcc/gimple.c(revision 191143) +++ gcc/gimple.c(working copy) @@ -2878,16 +2878,12 @@ get_base_address (tree t) && TREE_CODE (TREE_OPERAND (t, 0)) == ADDR_EXPR) t = TREE_OPERAND (TREE_OPERAND (t, 0), 0); - if (TREE_CODE (t) == SSA_NAME - || DECL_P (t) - || TREE_CODE (t) == STRING_CST - || TREE_CODE (t) == CONSTRUCTOR - || INDIRECT_REF_P (t) - || TREE_CODE (t) == MEM_REF - || TREE_CODE (t) == TARGET_MEM_REF) -return t; - else + /* ??? Either the alias oracle or all callers need to properly deal + with WITH_SIZE_EXPRs before we can look through those.
*/ + if (TREE_CODE (t) == WITH_SIZE_EXPR) return NULL_TREE; + + return t; } void Index: gcc/gimple-fold.c === --- gcc/gimple-fold.c (revision 191143) +++ gcc/gimple-fold.c (working copy) @@ -154,13 +154,15 @@ canonicalize_constructor_val (tree cval, } if (TREE_CODE (cval) == ADDR_EXPR) { - tree base = get_base_address (TREE_OPERAND (cval, 0)); - if (!base && TREE_CODE (TREE_OPERAND (cval, 0)) == COMPOUND_LITERAL_EXPR) + tree base = NULL_TREE; + if (TREE_CODE (TREE_OPERAND (cval, 0)) == COMPOUND_LITERAL_EXPR) { base = COMPOUND_LITERAL_EXPR_DECL (TREE_OPERAND (cval, 0)); if (base) TREE_OPERAND (cval, 0) = base; } + else + base = get_base_address (TREE_OPERAND (cval, 0)); if (!base) return NULL_TREE; Index: gcc/testsuite/g++.dg/tree-ssa/pr54515.C === --- gcc/testsuite/g++.dg/tree-ssa/pr54515.C (revision 0) +++ gcc/testsuite/g++.dg/tree-ssa/pr54515.C (working copy) @@ -0,0 +1,19 @@ +// { dg-do compile } +// { dg-options "-O2" } + +template <typename T> T h2le (T) +{ +T a; +unsigned short &b = a; +short c = 0; +unsigned char (&d)[2] = reinterpret_cast <unsigned char (&)[2]> (c); +unsigned char (&e)[2] = reinterpret_cast <unsigned char (&)[2]> (b); +e[0] = d[0]; +return a; +} + +void +bar () +{ +h2le ((unsigned short) 0); +}
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Mon, Sep 10, 2012 at 6:30 PM, Richard Henderson r...@redhat.com wrote: On 09/10/2012 09:11 AM, Iyer, Balaji V wrote: Can you please help me get a start on how to get can be done? From what I understand (please correct me if I am wrong), this requires rearranging and duplicating a lot of passes and can potentially open up to a lot of bugs. Certainly not duplicating passes. And probably not even rearranging them. The Important parts are: (1) Having a bit in struct loop that indicates the special semantics you have for #pragma simd. I don't know if maybe all loops inside an elemental function are so automatically marked? (2) Have bits in struct function that summarize the contents of the bit from struct loop, for all loops in the function. Note that this bit would need to be updated during inlining. (3) Change the gate predicates for the relevant function to also check the bit from struct function. In some cases the pass might need to run globally (perhaps if-conversion?) and in some cases the pass might be able to restrict work to specific loops (e.g. the vectorizer), skipping loops for which the optimization is not enabled. Note that we do not preserve the loop tree before the gimple loop optimizer passes. Nor do we have a convenient way (currently) to transfer per-loop information from GENERIC to the point where we can first create the loop tree (after the CFG is built). The former is because I didn't want to think about the inlining case (I'm still chasing bugs for preserving the loop tree from the start of gimple loop optimizer passes ...), the latter could be done in a similar way we handle predications or OMP annotations - have special instructions in the IL. Richard. r~
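The three points above (a per-loop bit, a per-function summary of that bit that is merged on inlining, and gate predicates that consult the summary) can be sketched as follows. All struct, field and function names here are hypothetical and deliberately do not mirror GCC's struct loop or struct function; the sketch only illustrates the bookkeeping being proposed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* (1) Hypothetical per-loop flag: set for loops marked with
   #pragma simd.  */
struct loop_info {
  bool simd_requested;
};

/* (2) Hypothetical per-function summary of the per-loop flags.  */
struct function_info {
  bool has_simd_loops;
};

/* Recompute the function summary from its loops.  */
void summarize_loops(struct function_info *fn,
                     const struct loop_info *loops, size_t n)
{
  size_t i;
  assert(fn != NULL);
  fn->has_simd_loops = false;
  for (i = 0; i < n; i++)
    if (loops[i].simd_requested)
      fn->has_simd_loops = true;
}

/* When a callee is inlined, the caller inherits its marked loops, so
   the summary bit has to be merged.  */
void merge_on_inline(struct function_info *caller,
                     const struct function_info *callee)
{
  caller->has_simd_loops = caller->has_simd_loops || callee->has_simd_loops;
}

/* (3) Gate predicate: run the pass if it is enabled globally or if
   the function contains at least one marked loop.  */
bool gate_pass(const struct function_info *fn, bool enabled_globally)
{
  return enabled_globally || fn->has_simd_loops;
}

/* Inside a pass that works loop by loop, skip loops for which the
   optimization is not enabled.  */
bool should_process_loop(const struct loop_info *loop, bool enabled_globally)
{
  return enabled_globally || loop->simd_requested;
}
```

As the reply notes, the harder part in practice is carrying the per-loop information from the front end to the point where the loop tree exists; the sketch assumes that information is already available.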
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Mon, Sep 10, 2012 at 6:37 PM, Richard Henderson r...@redhat.com wrote: On 09/10/2012 09:09 AM, Iyer, Balaji V wrote: If that's the case, what's the point in defining an external ABI and defining what __attribute__((vector)) placed on a function declaration means? When you have __attribute__((vector)) you are asking the compiler to create a vector AND a scalar version of the function. The advantage is that if the function is used, for example, in 2 loops where 1 can be vectorized and another cannot, the vectorizable loop won't suffer (i.e. suffer from being not-vectorized). You've totally mis-understood my point. Whether or not the compiler creates a clone COULD BE totally up to the compiler, based on whether or not vectorization is enabled, whether the loop has been analyzed such that vectorization may proceed, or indeed the phase of the moon. But in order for that to happen, the clone must be totally private to the module for which we are generating code (in the LTO sense, this is the entire program or dll; without LTO, this is just the object file). It means that we never attempt to generate clones for functions for which the body of the function is not visible. On the other hand, if you insist on assuming a clone exists merely because a declaration bears an attribute, then you must address ALL of the problems with respect to defining a stable ABI in the face of different cpu revisions, different ISAs, and different vector lengths. I've not seen you address ANY of these problems, despite having the problem pointed out multiple times. Indeed, if the definition of an elemental function is always visible to the vectorizer the vectorizer itself can instruct the creation of the clone if it does not already exist (just make those clones managed by the callgraph). 
Then the clones are visible to the current TU only and no ABI issues exist (though you could say that the vectorizer or the inliner could as well force inlining of elemental functions into places it wants to vectorize - one complication even with local clones is that the x86 ABI has no callee-saved XMM registers which makes function calls inside loops especially expensive). Richard. r~
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Tue, Sep 11, 2012 at 10:41 AM, Richard Guenther richard.guent...@gmail.com wrote: On Mon, Sep 10, 2012 at 6:37 PM, Richard Henderson r...@redhat.com wrote: On 09/10/2012 09:09 AM, Iyer, Balaji V wrote: If that's the case, what's the point in defining an external ABI and defining what __attribute__((vector)) placed on a function declaration means? When you have __attribute__((vector)) you are asking the compiler to create a vector AND a scalar version of the function. The advantage is that if the function is used, for example, in 2 loops where 1 can be vectorized and another cannot, the vectorizable loop won't suffer (i.e. suffer from being not-vectorized). You've totally mis-understood my point. Whether or not the compiler creates a clone COULD BE totally up to the compiler, based on whether or not vectorization is enabled, whether the loop has been analyzed such that vectorization may proceed, or indeed the phase of the moon. But in order for that to happen, the clone must be totally private to the module for which we are generating code (in the LTO sense, this is the entire program or dll; without LTO, this is just the object file). It means that we never attempt to generate clones for functions for which the body of the function is not visible. On the other hand, if you insist on assuming a clone exists merely because a declaration bears an attribute, then you must address ALL of the problems with respect to defining a stable ABI in the face of different cpu revisions, different ISAs, and different vector lengths. I've not seen you address ANY of these problems, despite having the problem pointed out multiple times. Indeed, if the definition of an elemental function is always visible to the vectorizer the vectorizer itself can instruct the creation of the clone if it does not already exist (just make those clones managed by the callgraph). 
Then the clones are visible to the current TU only and no ABI issues exist (though you could say that the vectorizer or the inliner could as well force inlining of elemental functions into places it wants to vectorize - one complication even with local clones is that the x86 ABI has no callee-saved XMM registers which makes function calls inside loops especially expensive). Btw, this then happily fits into my suggestion that the elementalness can be autodetected by the compiler simply by means of a proper IPA pass and thus be fully LTO / whole-program aware. No need for an attribute (where you'd need to handle the case that the attribute was placed there by error). Richard. Richard. r~
Re: Ping [SH] Define NO_IMPLICIT_EXTERN_C for newlib targets
Christian Bruel christian.br...@st.com wrote: Any news for my sh-superh-elf --with-newlib patch ? http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00137.html The patch is OK for both 4.7 and 4.8. Sorry for the delay. Regards, kaz
Re: [PATCH] Combine location with block using block_locations
On Mon, Sep 10, 2012 at 5:27 PM, Dehao Chen de...@google.com wrote: On Mon, Sep 10, 2012 at 3:01 AM, Richard Guenther richard.guent...@gmail.com wrote: On Sun, Sep 9, 2012 at 12:26 AM, Dehao Chen de...@google.com wrote: Hi, Diego, Thanks a lot for the review. I've updated the patch. This patch is large and may easily break builds because it reserves more complete information for TREE_BLOCK as well as gimple_block (may trigger bugs that was hided when these info are unavailable). I've done more rigorous testing to ensure that most bugs are caught before checking in. * Sync to the head and retest all gcc testsuite. * Port the patch to google-4_7 branch to retest all gcc testsuite, as well as build many large applications. Through these tests, I've found two additional bugs that was omitted in the original implementation. A new patch is attached (patch.txt) to fix these problems. After this fix, all gcc testsuites pass for both trunk and google-4_7 branch. I've also copy pasted the new fixes (lto.c and tree-cfg.c) below. Now I'd say this patch is in good shape. But it may not be perfect. I'll look into build failures as soon as it arises. Richard and Diego, could you help me take a look at the following two fixes? Thanks, Dehao New fixes: --- gcc/lto/lto.c (revision 191083) +++ gcc/lto/lto.c (working copy) @@ -1559,8 +1559,6 @@ lto_fixup_prevailing_decls (tree t) { enum tree_code code = TREE_CODE (t); LTO_NO_PREVAIL (TREE_TYPE (t)); - if (CODE_CONTAINS_STRUCT (code, TS_COMMON)) -LTO_NO_PREVAIL (TREE_CHAIN (t)); That change is odd. Can you show us how it breaks? This will break LTO build of gcc.c-torture/execute/pr38051.c There is data structure like: union { long int l; char c[sizeof (long int)]; } u; Once the block info is reserved for this, it'll reserve this data structure. And inside this data structure, there is VAR_DECL. Thus LTO_NO_PREVAIL assertion does not satisfy here for TREE_CHAIN (t). 
I see - the issue here is that this data structure is not reached at the time we call free_lang_data (via find_decls_types_r). But maybe I do not understand "once the block info is reserved for this". So the patch papers over an issue elsewhere I believe. Maybe Micha can add some clarification here though, how BLOCK_VARS should be visible here Richard. if (DECL_P (t)) { LTO_NO_PREVAIL (DECL_NAME (t)); Index: gcc/tree-cfg.c === --- gcc/tree-cfg.c (revision 191083) +++ gcc/tree-cfg.c (working copy) @@ -5980,9 +5974,21 @@ move_stmt_op (tree *tp, int *walk_subtrees, void * tree t = *tp; if (EXPR_P (t)) -/* We should never have TREE_BLOCK set on non-statements. */ -gcc_assert (!TREE_BLOCK (t)); - +{ + tree block = TREE_BLOCK (t); + if (p->orig_block == NULL_TREE + || block == p->orig_block + || block == NULL_TREE) + TREE_SET_BLOCK (t, p->new_block); +#ifdef ENABLE_CHECKING + else if (block != p->new_block) + { + while (block && block != p->orig_block) + block = BLOCK_SUPERCONTEXT (block); + gcc_assert (block); + } +#endif I think what this means is that TREE_BLOCK on non-stmts are meaningless (thus only gimple_block is interesting on GIMPLE, not BLOCKs on trees). So instead of setting a BLOCK in some cases you should clear BLOCK if it happens to be set, or alternatively, only re-set it if there was a block associated with it. Yeah, makes sense. New change: @@ -5980,9 +5974,10 @@ tree t = *tp; if (EXPR_P (t)) -/* We should never have TREE_BLOCK set on non-statements. */ -gcc_assert (!TREE_BLOCK (t)); - +{ + if (TREE_BLOCK (t)) + TREE_SET_BLOCK (t, p->new_block); +} else if (DECL_P (t) || TREE_CODE (t) == SSA_NAME) { if (TREE_CODE (t) == SSA_NAME) Thanks, Dehao Richard. +} else if (DECL_P (t) || TREE_CODE (t) == SSA_NAME) { if (TREE_CODE (t) == SSA_NAME) Whole patch: gcc/ChangeLog: 2012-09-08 Dehao Chen de...@google.com * toplev.c (general_init): Init block_locations. * tree.c (tree_set_block): New. (tree_block): Change to use LOCATION_BLOCK. * tree.h (TREE_SET_BLOCK): New. 
* final.c (reemit_insn_block_notes): Change to use LOCATION_BLOCK. (final_start_function): Likewise. * input.c (expand_location_1): Likewise. * input.h (LOCATION_LOCUS): New. (LOCATION_BLOCK): New. (IS_UNKNOWN_LOCATION): New. * fold-const.c (expr_location_or): Change to use new location. * reorg.c (emit_delay_sequence): Likewise. (try_merge_delay_insns): Likewise. * modulo-sched.c (dump_insn_location): Likewise. * lto-streamer-out.c (lto_output_location_bitpack): Likewise. * jump.c (rtx_renumbered_equal_p): Likewise.
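For readers following the move_stmt_op discussion in this message, the revised rule ("only re-set it if there was a block associated with it") amounts to something like the sketch below. The types sk_block and sk_expr and the function sk_move_expr_block are hypothetical stand-ins for illustration; the real code operates on trees through TREE_BLOCK and TREE_SET_BLOCK.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a lexical BLOCK and an expression tree node.  */
struct sk_block { struct sk_block *supercontext; };
struct sk_expr  { struct sk_block *block; };

/* When moving an expression into another function, re-point its block
   only if it had one; an expression without a block stays without one.  */
void sk_move_expr_block(struct sk_expr *t, struct sk_block *new_block)
{
    if (t->block != NULL)
        t->block = new_block;
}
```

The earlier version of the change also assigned the new block to expressions that had none, which is the behaviour the review asked to drop.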
Re: [patch] PR54149: fix data race in LIM pass
On Tue, Sep 11, 2012 at 1:15 AM, Aldy Hernandez al...@redhat.com wrote: In this failing testcase the LIM pass writes to g_13 regardless of the initial value of g_13, which is the test protecting the write. This causes an incorrect store data race wrt both the C++ memory model and transactional memory (the latter if the store occurs inside of a transaction). The problem here is that the ``lsm_flag'' temporary should only be set to true on the code paths where we actually set the original global. As it stands, we are setting lsm_flag to true for reads or writes. Fixed by only setting lsm_flag=1 when the original code path has a write. Tested on x86-64 Linux. OK for trunk? + /* Only set the flag for writes. */ + if (is_gimple_assign (loc->stmt) + && gimple_assign_lhs (loc->stmt) == *loc->ref) ok with gimple_assign_lhs_ptr (loc->stmt) == loc->ref instead. Let's hope we conservatively catch all writes to ref this way (which is what we need, right)? Thanks, Richard.
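A source-level picture of the problem described above may help. The sketch below is not the PR54149 testcase and not what LIM emits; it is a hand-written C analogue, assuming a global g in place of g_13 and a hypothetical store_g helper that counts stores so the difference is observable.

```c
#include <stdbool.h>

int g;            /* shared global, stands in for g_13 */
int store_count;  /* counts stores to g, for illustration only */

void store_g(int v) { g = v; store_count++; }

/* Original loop: g is written only when the guarding test succeeds.  */
void loop_original(int n)
{
    for (int i = 0; i < n; i++)
        if (g == 0)
            store_g(1);
}

/* Store motion with the flag set on any access: the final store happens
   even when the original code never wrote g, which is the data race.  */
void loop_lim_buggy(int n)
{
    int tmp = g;
    bool lsm_flag = false;
    for (int i = 0; i < n; i++) {
        lsm_flag = true;        /* set for the read as well */
        if (tmp == 0)
            tmp = 1;
    }
    if (lsm_flag)
        store_g(tmp);
}

/* Store motion with the flag set only where the original path writes.  */
void loop_lim_fixed(int n)
{
    int tmp = g;
    bool lsm_flag = false;
    for (int i = 0; i < n; i++) {
        if (tmp == 0) {
            tmp = 1;
            lsm_flag = true;    /* only set for writes */
        }
    }
    if (lsm_flag)
        store_g(tmp);
}
```

Starting from a nonzero g, loop_original and loop_lim_fixed perform no store, while loop_lim_buggy writes the old value back, a store another thread could observe.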
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Tue, Sep 11, 2012 at 3:42 AM, Richard Guenther richard.guent...@gmail.com wrote: Btw, this then happily fits into my suggestion that the elementalness can be autodetected by the compiler simply by means of a proper IPA pass and thus be fully LTO / whole-program aware. No need for an attribute (where you'd need to handle the case that the attribute was placed there by error). We are in violent agreement. -- Gaby
[PATCH,i386] Enable prefetchw in processor alias table for AMD targets
Hi Maintainers, This patch enables prefetchw ISA in the processor alias table for targets amdfam10,barcelona and bdver1,2 and btver1,2. GCC regression test passes with the patch. Ok for trunk? Change log: 2012-09-11 Venkataramanan Kumar venkataramanan.ku...@amd.com * config/i386/i386.c (processor_alias_table): Enable PTA_PRFCHW for targets amdfam10, barcelona, bdver1, bdver2, btver1 and btver2. Index: gcc/config/i386/i386.c === --- gcc/config/i386/i386.c (revision 190345) +++ gcc/config/i386/i386.c (working copy) @@ -3151,31 +3151,33 @@ | PTA_SSE2 | PTA_NO_SAHF}, {amdfam10, PROCESSOR_AMDFAM10, CPU_AMDFAM10, PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE - | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM}, + | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM + | PTA_PRFCHW}, {barcelona, PROCESSOR_AMDFAM10, CPU_AMDFAM10, PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE - | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM}, + | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM + | PTA_PRFCHW}, {bdver1, PROCESSOR_BDVER1, CPU_BDVER1, PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX | PTA_FMA4 - | PTA_XOP | PTA_LWP}, + | PTA_XOP | PTA_LWP | PTA_PRFCHW}, {bdver2, PROCESSOR_BDVER2, CPU_BDVER2, PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX | PTA_XOP | PTA_LWP | PTA_BMI | PTA_TBM | PTA_F16C - | PTA_FMA}, + | PTA_FMA | PTA_PRFCHW}, {btver1, PROCESSOR_BTVER1, CPU_GENERIC64, PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 -| PTA_SSSE3 | PTA_SSE4A |PTA_ABM | PTA_CX16}, +| PTA_SSSE3 | PTA_SSE4A |PTA_ABM | PTA_CX16 | PTA_PRFCHW}, {generic32, PROCESSOR_GENERIC32, CPU_PENTIUMPRO, PTA_HLE /* flags are only used for -march switch. 
*/ }, {btver2, PROCESSOR_BTVER2, CPU_GENERIC64, PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3 | PTA_SSE4A |PTA_ABM | PTA_CX16 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX - | PTA_BMI | PTA_F16C | PTA_MOVBE}, + | PTA_BMI | PTA_F16C | PTA_MOVBE | PTA_PRFCHW}, {generic64, PROCESSOR_GENERIC64, CPU_GENERIC64, PTA_64BIT | PTA_HLE /* flags are only used for -march switch. */ },
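As a rough illustration of what adding a PTA_* bit to a table entry does, here is a much reduced sketch of a flag table and its lookup. Everything in it is an assumption made for the example: the SK_PTA_* bit values, the two-field sk_pta struct and sk_arch_enables are hypothetical, and the real processor_alias_table entries carry more fields and many more flags.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative bit values only; the real PTA_* constants differ.  */
#define SK_PTA_SSE4A  (1u << 0)
#define SK_PTA_ABM    (1u << 1)
#define SK_PTA_PRFCHW (1u << 2)

/* Reduced table entry: architecture name plus the ISA flags it implies.  */
struct sk_pta {
    const char *name;
    unsigned flags;
};

static const struct sk_pta sk_alias_table[] = {
    { "amdfam10",  SK_PTA_SSE4A | SK_PTA_ABM | SK_PTA_PRFCHW },
    { "btver1",    SK_PTA_SSE4A | SK_PTA_ABM | SK_PTA_PRFCHW },
    { "generic64", 0 },
};

/* Return true if selecting ARCH implies the ISA extension FLAG.  */
bool sk_arch_enables(const char *arch, unsigned flag)
{
    size_t n = sizeof sk_alias_table / sizeof sk_alias_table[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(sk_alias_table[i].name, arch) == 0)
            return (sk_alias_table[i].flags & flag) != 0;
    return false;
}
```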
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Tue, Sep 11, 2012 at 03:57:44AM -0500, Gabriel Dos Reis wrote: On Tue, Sep 11, 2012 at 3:42 AM, Richard Guenther richard.guent...@gmail.com wrote: Btw, this then happily fits into my suggestion that the elementalness can be autodetected by the compiler simply by means of a proper IPA pass and thus be fully LTO / whole-program aware. No need for an attribute (where you'd need to handle the case that the attribute was placed there by error). We are in violent agreement. For locally defined functions sure, the question is if we want the attribute to be something for external functions. Something that would have ABI implications (the external symbol would need to be provided in two forms (or more?), one scalar with normal mangling, one vector with some other kind of mangling/suffix/whatever), when compiling the definition of function with such an attribute the compiler could verify its properties (i.e. autodetect and if it is not autodetected elemental, complain?), and when using extern function just rely on it being provided twice. Even with LTO, the function can be defined in some other shared library etc. Nothing says the implementation of the vector version of the elemental function necessary has to be vectorized, just that the arguments would need to be passed in the expected vector registers, similarly for return value. Say if the elemental function is compiled with -O0, then there could just be a loop executing the scalar body several times and creating vectors. Jakub
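The -O0 case mentioned at the end of this message could look roughly like the sketch below: the vector entry point is simply a loop over the scalar body. The names scale_add and scale_add_vec, the lane count and the vec4f struct are assumptions for illustration. In particular, a plain struct is not passed in vector registers, so this shows only the shape of the fallback, not the calling convention or the mangling an ABI would have to define.

```c
#define VLEN 4

/* Stand-in for a 4-lane vector value.  */
typedef struct { float lane[VLEN]; } vec4f;

/* Scalar version of the elemental function.  */
float scale_add(float x, float y)
{
    return 2.0f * x + y;
}

/* Unvectorized "vector" version: run the scalar body once per lane and
   assemble the result vector.  */
vec4f scale_add_vec(vec4f x, vec4f y)
{
    vec4f r;
    for (int i = 0; i < VLEN; i++)
        r.lane[i] = scale_add(x.lane[i], y.lane[i]);
    return r;
}
```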
RE: [PATCH] Enable bbro for -Os
Thank you for the detail comments. The updated patched is attached. Is it OK? Thanks! -Zhenqiang -Original Message- From: Eric Botcazou [mailto:ebotca...@adacore.com] Sent: Tuesday, September 11, 2012 1:01 AM To: Zhenqiang Chen Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] Enable bbro for -Os All other comments are accepted. The updated patch is attached. Is it OK? As you probably gathered, I had missed that Steven and Richard had already commented on your patch before posting my message. Sorry about that... I think that the patch is interesting because, even if it doesn't exactly implement what the comment in gate_handle_reorder_blocks was talking about, it fixes code layout regressions without increasing the code size (and even decreasing it). So, assuming that Steven and Richard don't strongly oppose, I think the patch is OK modulo the following nits: + The above description is for the full algorithm, which is used when the + function is optimized for speed. When the function is optimized for size, + in order to reduce long jump and connect more fall through edges, + the long jumps... bb-reorder.c uses fallthru edges consistently. + algorithm is modified as follows: + (1) Break long trace to short ones. The trace is broken at a block, which + has multi-predecessors/successors during finding traces. long traces... A trace is broken at a block that has multiple predecessors/ successors during trace discovery. + (2) Ignore the edge probability and frequency for fall through edges. fallthru + (3) Keep its original order when there is no chance to fall through. + bbro Keep the original order of blocks... We rely on the results of cfg_cleanup + bases on the result of cfg_cleanup, which does lots of optimizations + on cfg. + So the order is expected to be kept if no fall through. + + To implement the change for code size optimization, block's index is + selected as the key and all traces are found in one round. 
+ /* If the best destination has multiple successors or predecessors, + don't allow it to be added when optimizing for size. This makes + sure predecessors with smaller index handled before the best + destination. It breaks long trace and reduces long jumps. missing "are" before "handled" + After removing the best edge, the final result will be ABCD/ACBD. + It does not add jump compared with the previous order. But it + reduce the possibility of long jump. */ Double space before But. + if (optimize_function_for_size_p (cfun)) +{ + e_index = src_index_p ? e->src->index : e->dest->index; + b_index = src_index_p ? cur_best_edge->src->index + : cur_best_edge->dest->index; + /* The smaller one is better to keep the original order. */ + return b_index > e_index; +} Trailing space after the last parenthesis. + /* If dest has multiple predecessors, skip it. We expect + that one predecessor with smaller index connect with it + later. */ connects + /* Only connect Trace n with Trace n + 1. It is conservative + to keep the order as close as possible to the original order. + It also helps to reduce long jump. */ long jumps Thanks for working on this. -- Eric Botcazou Enable-bbro-for-size-updated3.patch Description: Binary data
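The size-mode tie-break quoted in the review (prefer the edge whose block has the smaller index, so the original order is kept) can be written out as a small standalone function. This is a sketch with hypothetical types sk_bb and sk_edge and a hypothetical function name; the direction of the comparison follows the patch comment "The smaller one is better to keep the original order".

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a basic block and a CFG edge.  */
struct sk_bb   { int index; };
struct sk_edge { struct sk_bb *src, *dest; };

/* Return true if edge E is preferable to CUR_BEST when optimizing for
   size: the edge whose source (or destination) block has the smaller
   index wins.  */
bool sk_better_edge_for_size(const struct sk_edge *e,
                             const struct sk_edge *cur_best,
                             bool src_index_p)
{
    int e_index = src_index_p ? e->src->index : e->dest->index;
    int b_index = src_index_p ? cur_best->src->index
                              : cur_best->dest->index;
    return b_index > e_index;
}
```

Because the comparison is strict, two edges with equal indexes leave the current best edge in place.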
Re: Bootstrap fails (was: Remove unnecessary VEC function overloads.)
On Tue, Sep 11, 2012 at 9:58 AM, Tobias Burnus bur...@net-b.de wrote: On 09/11/2012 01:52 AM, Diego Novillo wrote: Remove unnecessary VEC function overloads. Several VEC member functions that accept an element 'T' used to have two overloads: one taking 'T', the second taking 'T *'. They might be unnecessary, but with your patch bootstrapping fails here with the following failure. Did you test with or without Graphite? Fixed with the attached. Richard. Tobias /home/tob/projects/gcc-git/gcc/gcc/graphite-scop-detection.c: In function ‘void move_sd_regions(vec_tsd_region_p**, vec_tsd_region_p**)’: /home/tob/projects/gcc-git/gcc/gcc/vec.h:408:63: error: no matching function for call to ‘vec_tsd_region_p::safe_push(vec_tsd_region_p**, sd_region*, const char [61], int, const char [16])’ (vec_tT::safe_pushA ((V), O VEC_CHECK_INFO MEM_STAT_INFO)) ^ /home/tob/projects/gcc-git/gcc/gcc/graphite-scop-detection.c:146:5: note: in expansion of macro 'VEC_safe_push' VEC_safe_push (sd_region, heap, *target, s); ^ /home/tob/projects/gcc-git/gcc/gcc/vec.h:408:63: note: candidate is: (vec_tT::safe_pushA ((V), O VEC_CHECK_INFO MEM_STAT_INFO)) p Description: Binary data
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Tue, Sep 11, 2012 at 11:06 AM, Jakub Jelinek ja...@redhat.com wrote: On Tue, Sep 11, 2012 at 03:57:44AM -0500, Gabriel Dos Reis wrote: On Tue, Sep 11, 2012 at 3:42 AM, Richard Guenther richard.guent...@gmail.com wrote: Btw, this then happily fits into my suggestion that the elementalness can be autodetected by the compiler simply by means of a proper IPA pass and thus be fully LTO / whole-program aware. No need for an attribute (where you'd need to handle the case that the attribute was placed there by error). We are in violent agreement. For locally defined functions sure, the question is if we want the attribute to be something for external functions. Something that would have ABI implications (the external symbol would need to be provided in two forms (or more?), one scalar with normal mangling, one vector with some other kind of mangling/suffix/whatever), when compiling the definition of function with such an attribute the compiler could verify its properties (i.e. autodetect and if it is not autodetected elemental, complain?), and when using extern function just rely on it being provided twice. Even with LTO, the function can be defined in some other shared library etc. Nothing says the implementation of the vector version of the elemental function necessary has to be vectorized, just that the arguments would need to be passed in the expected vector registers, similarly for return value. Say if the elemental function is compiled with -O0, then there could just be a loop executing the scalar body several times and creating vectors. Sure. And the versioning can happen from the C frontend then. Of course this one has the requirement of documenting the ABI. Richard. Jakub
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Tue, Sep 11, 2012 at 4:06 AM, Jakub Jelinek ja...@redhat.com wrote: On Tue, Sep 11, 2012 at 03:57:44AM -0500, Gabriel Dos Reis wrote: On Tue, Sep 11, 2012 at 3:42 AM, Richard Guenther richard.guent...@gmail.com wrote: Btw, this then happily fits into my suggestion that the elementalness can be autodetected by the compiler simply by means of a proper IPA pass and thus be fully LTO / whole-program aware. No need for an attribute (where you'd need to handle the case that the attribute was placed there by error). We are in violent agreement. For locally defined functions sure, the question is if we want the attribute to be something for external functions. Something that would have ABI implications (the external symbol would need to be provided in two forms (or more?), one scalar with normal mangling, one vector with some other kind of mangling/suffix/whatever), when compiling the definition of function with such an attribute the compiler could verify its properties (i.e. autodetect and if it is not autodetected elemental, complain?), and when using extern function just rely on it being provided twice. Even with LTO, the function can be defined in some other shared library etc. Nothing says the implementation of the vector version of the elemental function necessary has to be vectorized, just that the arguments would need to be passed in the expected vector registers, similarly for return value. Say if the elemental function is compiled with -O0, then there could just be a loop executing the scalar body several times and creating vectors. As it was pointed out earlier (by Marc?), there is also an issue of overload resolution if these automatically synthetized functions have to be something visible, which of course entails the whole ABI issues. This is really a language design issue, not just compiler implementation. If the synthetized functions do not need to have the same status as real functions (hence no need for attributes), then these issues evaporate. -- Gaby
Re: [PATCH] PowerPC VLE port
2012-09-10 Maciej W. Rozycki ma...@codesourcery.com gcc/ * config/rs6000/rs6000.c (print_operand) 'c': Remove. * config/rs6000/spe.md: Remove a leftover comment. Okay. This patch wasn't sent to gcc-patches -- can we see it please? Segher
Re: Bootstrap fails (was: Remove unnecessary VEC function overloads.)
Fixed with the attached. Followed by the same failure on darwin. Fixed with --- ../_clean/gcc/config/darwin.c 2012-07-09 22:06:21.0 +0200 +++ ../p_work/gcc/config/darwin.c 2012-09-11 11:53:02.0 +0200 @@ -1878,7 +1878,7 @@ darwin_asm_named_section (const char *na the assumption of how this is done. */ if (lto_section_names == NULL) lto_section_names = VEC_alloc (darwin_lto_section_e, gc, 16); - VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, &e); + VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, e); } else if (strncmp (name, "__DWARF,", 8) == 0) darwin_asm_dwarf_section (name, flags, decl); @@ -2698,7 +2698,7 @@ darwin_asm_dwarf_section (const char *na fprintf (asm_out_file, "Lsection%.*s:\n", namelen, sname); e.count = 1; e.name = xstrdup (sname); - VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, &e); + VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, e); } } (now at stage 2). TIA Dominique