Re: Register constraints + and =

2012-05-08 Thread Jan Hubicka
(use (match_operand:SI 5 "register_operand" "2")) > (set (match_operand:SI 0 "register_operand" "=D") > (plus:SI (match_operand:SI 3 "address_operand" "0") (match_dup 5))) > (set (match_operand:SI 1 "

Re: Register constraints + and =

2012-05-08 Thread Jan Hubicka
> > I can accept the issue as a matter of documentation, but I don't > understand the rest. Remember that all the patterns are executed in > parallel. I don't see how adding a USE in parallel could affect > anything about how the operand is used. > >> >> (define_insn "*rep_movqi" > >> >>[(s

Re: Register constraints + and =

2012-05-08 Thread Jan Hubicka
> Jan Hubicka writes: > > >> > >> I can accept the issue as a matter of documentation, but I don't > >> understand the rest. Remember that all the patterns are executed in > >> parallel. I don't see how adding a USE in parallel coul

Re: What do do with the exceptional case of expand_case for SJLJ exceptions

2012-05-12 Thread Jan Hubicka
> On Wed, Apr 18, 2012 at 2:44 PM, Richard Henderson wrote: > > On 04/18/2012 05:39 AM, Jan Hubicka wrote: > >> Well, if SJLJ lowering happens as gimple pass somewhere near the end of > >> gimple > >> queue, this should not be problem at all. (and i

Re: LTO inlining of transactional builtins

2012-06-22 Thread Jan Hubicka
> On Fri, Jun 22, 2012 at 2:47 PM, Aldy Hernandez wrote: > > Hi gentlemen. > > > > I am looking again at LTO + TM.  The goal is to be able to link with the > > implemented _ITM_* functions in libitm.a, and have them inlined into the > > transaction code when profitable. > > > > To refresh everyone

Re: LTO inlining of transactional builtins

2012-06-25 Thread Jan Hubicka
> a) If a user provides a builtin implementation to LTO, it is discarded, since by design LTO prefers builtins to user-provided versions of them. In LTO, builtins are their own prevailing decl. There is an enhancement request PR here: http://gcc.gnu.org/bugzi

Re: LTO inlining of transactional builtins

2012-06-26 Thread Jan Hubicka
> On Mon, Jun 25, 2012 at 4:32 PM, Richard Henderson wrote: > > On 2012-06-22 06:08, Richard Guenther wrote: > >> Do I understand correctly that inlining the builtin at expansion time is > >> not > >> good because the implementation detail may depend on how libitm was > >> configured? > > > > Mor

Re: LTO inlining of transactional builtins

2012-06-26 Thread Jan Hubicka
> > I'm not sure TM people care about double streaming cost ;) As far as I can > see TM people want the non-lowered form go through at least loop > optimizations, > so I don't see how even a proper IPA pass would help here. As of > cherry-picking :) Yep, this is kind of similar to what we may

Re: GCC and Clang produce undefined references to functions with vague linkage

2012-07-02 Thread Jan Hubicka
Quoting John McCall : On Jun 29, 2012, at 2:23 PM, Rafael Espíndola wrote: There's no "for a long time" here. The ABI does not allow us to emit these symbols with non-coalescing linkage. We're not going to break ABI just because people didn't consider a particular code pattern when they hacke

Re: Cgraph Modification Plan

2012-09-05 Thread Jan Hubicka
> What do you think of the following plan for turning cgraph into > a class hierarchy? We cannot finish it until we have gengtype > understanding single inheritance, but we can start changing APIs > in preparation. Good that you told me; I was about to try that myself. I did not know gengtype does not und

Re: Cgraph Modification Plan

2012-09-05 Thread Jan Hubicka
> On 9/5/12, Xinliang David Li wrote: > > On Sep 5, 2012 Jan Hubicka wrote: > > > OK, the basic idea is that symtab_node is basetype of > > > cgraph_node and varpool_node. We may want to drop the historical > > > cgraph/varpool names here, since function_

Re: Cgraph Modification Plan

2012-09-05 Thread Jan Hubicka
> > The cgraph redesign probably deserves more discussion. > > 1) It may be worthwhile to abstract the graph manipulation code into a > utility class which is templatized. > > graph, node with node inheriting from T. > > 2) Introduce a global symbol table containing a function table and a > g

Re: Cgraph Modification Plan

2012-09-05 Thread Jan Hubicka
> > On 9/5/12, Xinliang David Li wrote: > > > On Sep 5, 2012 Jan Hubicka wrote: > > > > OK, the basic idea is that symtab_node is basetype of > > > > cgraph_node and varpool_node. We may want to drop the historical > > > > cgraph/va

Re: Cgraph Modification Plan

2012-09-05 Thread Jan Hubicka
> On Wed, Sep 5, 2012 at 5:41 PM, Lawrence Crowl wrote: > > On 9/5/12, Xinliang David Li wrote: > >> On Sep 5, 2012 Jan Hubicka wrote: > >> > OK, the basic idea is that symtab_node is basetype of > >> > cgraph_node and varpool_node. We may want to drop

Re: Cgraph Modification Plan

2012-09-05 Thread Jan Hubicka
> > Areas that are confusing and need clean up (IMO) include: > 1) handling of aliases and clones I am slowly cleaning up alias stuff, it had major reorg in 4.7 and further cleanups in 4.8. Do you have more specific suggestions? > 2) reachability, needed, analyzed bits. The needed bit is not i

Re: Cgraph Modification Plan

2012-09-07 Thread Jan Hubicka
> Sorry to interrupt here, but please finish the existing partial C++ > transitions > instead of starting to work on new ones. Current stage1 will not last forever > (stage1 is usually 6 months, so its natural end would be end of September). > I'd rather have the current transition to a symbol ta

Re: Libtool update for gcc-4.8 (slim-lto bootstrap)?

2012-09-11 Thread Jan Hubicka
> Is there any interest in updating the in-tree libtool to something > newer? This update would allow to use a -fno-fat-lto-objects > lto-bootstrap target, that should speed up the (lto) build time. > > If there is interest, when would be the best date for such an update? There is definitely an i

Re: Cgraph Modification Plan

2012-09-12 Thread Jan Hubicka
> We do not yet seem to have consensus on a long term plan. > Would it be reasonable to start on short term preparatory work? > > In particular, I was thinking we could do > >Add converters and testers. >Change callers to use those. > > and maybe > >Change callers to use type-safe parame

Re: g++.dg/tree-ssa/pr45453.C time out

2012-10-31 Thread Jan Hubicka
> On Wed, Oct 31, 2012 at 3:17 PM, Paolo Carlini > wrote: > > Hi, > > > > whoever a few days ago or so broke this test, can please either fix the > > testcase, the compiler or just xfail for now the testcase itself, to avoid > > everybody the waste of time? > > > > If you want me to do go ahead w

Re: g++.dg/tree-ssa/pr45453.C time out

2012-10-31 Thread Jan Hubicka
> > On Wed, Oct 31, 2012 at 3:17 PM, Paolo Carlini > > wrote: > > > Hi, > > > > > > whoever a few days ago or so broke this test, can please either fix the > > > testcase, the compiler or just xfail for now the testcase itself, to avoid > > > everybody the waste of time? > > > > > > If you want m

Re: RFC: [ARM] Disable peeling

2012-12-10 Thread Jan Hubicka
> > I agree that this is a sledgehammer. If aligned/unaligned loads/stores have > the same cost then reflect that in the vectorized stmt cost hook. If that > alone does not prevent peeling for alignment to happen then the fix is to > not consider doing peeling for alignment if aligned/unaligned

Re: Identical basic blocks live long in RTL flow.

2013-01-16 Thread Jan Hubicka
> > > >Basic blocks 8/9/10 are identical and live until pass jump2, which is > >after register allocation. > >I think these duplicated BBs do not contain additional information and > >should be better to be removed ASAP, because they might interfere with > >other passes like ifcvt. > > > >So should

Re: System V Application Binary Interface 0.99.5

2013-01-31 Thread Jan Hubicka
> > Well, it's hardly an optimization if it's incorrect, and it seems to be > > incorrect. As the old saying goes, I can make your code infinitely fast > > if you don't care about the results. > > It's incorrect to rely on the extension taking place. It's not incorrect to > do the extension. Th

Re: System V Application Binary Interface 0.99.5

2013-01-31 Thread Jan Hubicka
> On 01/30/2013 04:49 PM, Michael Matz wrote: > > Hmm? GCC generates code that doesn't rely on the extension taking place. > > Sure, I didn't mean to suggest it was: it's LLVM that's incorrect. Yes, that is an LLVM bug. I am surprised that it went unnoticed for so long, but I guess it is difficult

Re: System V Application Binary Interface 0.99.5

2013-02-03 Thread Jan Hubicka
> On 02/01/2013 12:38 AM, Jan Hubicka wrote: > > Doing the extensions at caller side always is however IMO a performance bug > > in > > GCC. We can definitely drop them at -Os, for non-PRS targets and for calls > > within compilation unit where we know that GCC is no

Re: make_decl_one_only and inlining

2013-02-17 Thread Jan Hubicka
> Hi, > > We recently got a bug report for the GCC D compiler frontend which shows that > we > currently don't inline any templated functions. The reason seems to be that > decl_replaceable_p always returns true for D template functions. > > We currently just mark such template function instance

Re: make_decl_one_only and inlining

2013-02-17 Thread Jan Hubicka
> >> Set DECL_COMDAT. You said that didn't work but you didn't fully > >> explain why. A DECL_COMDAT function should be output in every object > >> file in which it is referenced. > > > > I wasn't sure if that's the correct approach. If it is, some > > further investigation will be necessary why

Re: How to stop gcc from not calling noinline functions

2008-01-14 Thread Jan Hubicka
> On 14 January 2008 11:03, Hans-Peter Nilsson wrote: > > >> Date: Sat, 12 Jan 2008 11:16:23 +0100 > >> From: Paolo Bonzini <[EMAIL PROTECTED]> > > > >>> (Yeah, new attributes "impure" and/or "nonconst" would solve > >>> this, but only for IPA and there's already the existing option > >>> and asm

Re: Why is DImode aligned at 8 byte for i386?

2008-02-04 Thread Jan Hubicka
> > -malign-double is (was?) indeed a performance improvement for > numerical applications on 32bits. But DImode is still not 8 bytes aligned > there (which makes a next-gen 32bit ABI for 64bit x86 difficult there, > if you want to retain DImode/DFmode 8 byte alignment and re-use the > kernel 32b

Re: Why is DImode aligned at 8 byte for i386?

2008-02-04 Thread Jan Hubicka
> On Feb 4, 2008 4:11 AM, H.J. Lu <[EMAIL PROTECTED]> wrote: > > DImode is aligned at 8 byte in i386. Since 32bit doesn't have > > 64bit register, can we align DImode at 4byte instead of 8 > > for i386? It shouldn't have any negative impact on performance. > > I don't think DImode is aligned at 8

Re: Why is DImode aligned at 8 byte for i386?

2008-02-04 Thread Jan Hubicka
> On Mon, Feb 04, 2008 at 12:24:33PM +0100, Jan Hubicka wrote: > > > > > > -malign-double is (was?) indeed a performance improvement for > > > numerical applications on 32bits. But DImode is still not 8 bytes aligned > > > there (which makes a next-gen

Re: [discuss] When is RBX used for base pointer?

2008-02-13 Thread Jan Hubicka
> Hi, > > On Wed, 13 Feb 2008, H.J. Lu wrote: > > > We need a callee-saved register for stack alignment. > > Can you expand on why? > > > In 64bit, our choices are rbx, and r12-r15. r12-r15 need the REX byte > > and r12 also needs the SIB byte. So I'd like to use rbx. x86-64 psABI > > says rb

Re: [discuss] When is RBX used for base pointer?

2008-02-13 Thread Jan Hubicka
> On Feb 13, 2008 2:49 PM, Michael Matz <[EMAIL PROTECTED]> wrote: > > Hi, > > > > On Wed, 13 Feb 2008, H.J. Lu wrote: > > > > > Our proposal is at > > > > > > http://gcc.gnu.org/ml/gcc/2007-12/msg00567.html > > > > > > For most cases, we can align stack with RBP/RSP. But > > > we need an extra reg

Re: lto

2008-02-14 Thread Jan Hubicka
Hi, > diego and honza > > diego asked on irc were we planning to be able to serialize out all of > the combined declarations. > My response we could certainly use the machinery that we currently have > to do this. > > However, I believe that Diego's motivation is somewhat flawed, and for > that m

Re: [discuss] When is RBX used for base pointer?

2008-02-14 Thread Jan Hubicka
> > > Hi, > > > > > > On Wed, 13 Feb 2008, H.J. Lu wrote: > > > > > > > We need a callee-saved register for stack alignment. > > > > > > Can you expand on why? > > > > > > > In 64bit, our choices are rbx, and r12-r15. r12-r15 need the REX byte > > > > and r12 also needs the SIB byte. So I'

Re: lto

2008-02-14 Thread Jan Hubicka
> I have to admit that i had not considered the issue of having the later > stages of compilation of one function feed into the early stages of > another. The problem with this is that it implies a serial ordering of > the compilation and that is not going to yield a usable system for > compiling

Re: Some 4.4 project musings

2008-02-14 Thread Jan Hubicka
> Diego Novillo wrote: > >On Fri, Feb 1, 2008 at 3:55 PM, Andrew MacLeod <[EMAIL PROTECTED]> wrote: > > > > > >> 1 - Pass cleanup. There have been rumblings about this, but I haven't > >> > > > >Yes, this is an area that is in desperate need of TLC. Your plan > >looks good to me. We need t

Re: lto

2008-02-15 Thread Jan Hubicka
> Your idea seems fine to me. Unless I'm not understanding you > completely, it does not really conflict with what we're trying to do in > whopr. > > The main goal of whopr is to support transformations that can be > expressed in terms of a local generation phase, a global analysis that > dec

Re: lto

2008-02-15 Thread Jan Hubicka
> IP RA as currently implemented in IRA does propagate info only down in > topological order. But a good IP RA (e.g. Minimal cost inter-procedural > register allocator http://citeseer.ist.psu.edu/kurlander96minimum.html) > needs to propagate info up and down. > > But I am quite skeptical about I

Re: lto

2008-02-15 Thread Jan Hubicka
> I agree with you. We should not forget embedded market where the > code size is more important but we should provide an option for market > on which Pathscale/Intel/Sun compilers are oriented. > > With this point of view, we have a lot of resources because gcc > generates the smallest co

API for callgraph and IPA passes for whole program optimization

2008-02-17 Thread Jan Hubicka
Hi, since it seems that we are getting ready with LTO, I think it is time to write updated design document for callgraph and implementation of whole program optimizer as outlined in Whopr document http://gcc.gnu.org/wiki/whopr As usual, all comments or ideas are welcome ;) Basic organization =

Re: API for callgraph and IPA passes for whole program optimization

2008-02-19 Thread Jan Hubicka
> I find 'analyze' for the first stage confusing. We do no analysis > there, we just produce summary info. The analysis is actually done by > what you call 'read'. How about some variant of: > > generate_summary_{function/variable} > analyze_{function/variable} > transform_{function/variable}

Re: API for callgraph and IPA passes for whole program optimization

2008-02-19 Thread Jan Hubicka
> Currently the job to drive compilation process is implemented in > cgraphunit and passmanager. I am leaning to plan to do as much work as BTW for a while I think that name of cgraphunit outlived its original meaning. Just as historical note, it was introduced so because some bits wasn't possibl

Re: [discuss] When is RBX used for base pointer?

2008-02-20 Thread Jan Hubicka
Hi, > Michael, Jan, > > When aligning stack for those functions who have dynamic stack > allocation, we must use an available callee-saved register in prologue. > We named this hard register DRAP. It is worthwhile to emphasize that > *free* here means "free in prologue". After prologue, a virtual

Re: optimizing predictable branches on x86

2008-02-26 Thread Jan Hubicka
Hi, > Core2 follows a similar pattern, although it's not seeing any > slowdown in the "no deps, predictable, jmp" case like K8 does. > > Any comments? (please cc me) Should gcc be using conditional jumps > more often eg. in the case of __builtin_expect())? The problem is that in general GCC's bra

Re: optimizing predictable branches on x86

2008-02-26 Thread Jan Hubicka
> Hi, > > Core2 follows a similar pattern, although it's not seeing any > > slowdown in the "no deps, predictable, jmp" case like K8 does. > > > > Any comments? (please cc me) Should gcc be using conditional jumps > > more often eg. in the case of __builtin_expect())? > > The problem is that in g

Re: optimizing predictable branches on x86

2008-02-27 Thread Jan Hubicka
> > At least on x86 it should also be a good idea to know which way > > the branch is going to go, because it doesn't have explicit branch > > hints, you really want to be able to optimize the cold branch > > predictor case if converting from cmov to conditional branches. > > x86 as of Pentium 4 d

Re: optimizing predictable branches (Was: ... on x86)

2008-02-27 Thread Jan Hubicka
> This is also interesting for the ARC700 processor. > > There is also an issue if the flag for the conditionalized instruction is > set in the immediately preceding instruction, and the result of the > conditionalized instruction is required in the immediately following > instruction, and if usin

Re: [PATCH] linux/fs.h - Convert debug functions declared inline __attribute__((format (printf,x,y) to statement expression macros

2008-02-27 Thread Jan Hubicka
> > -static inline void __attribute__((format(printf, 1, 2))) > -__simple_attr_check_format(const char *fmt, ...) It would be nice to have a testcase, but I guess it is because GCC can't inline variadic functions. The function gets identified as const and removed as unused by DCE, but this happ

Re: [PATCH] linux/fs.h - Convert debug functions declared inline __attribute__((format (printf,x,y) to statement expression macros

2008-02-28 Thread Jan Hubicka
> On Thu, Feb 28, 2008 at 12:58:35AM +0100, Jan Hubicka wrote: > > We probably also can simply allow inlining variadic functions not > > calling va_start. I must say that this option appeared to me but I was > > unable to think of any sane use case. This probably is one ;)

Re: birthpoints in rtl.

2008-02-28 Thread Jan Hubicka
Hi, as discussed briefly on IRC yesterday, I would be very happy to see current DU/UD infrastructure changed to FUD chains (or on side SSA form). This way it will be more symmetric to how tree level virtual operands are handled and will hopefully make whole compiler more standard and easier to foll

Re: A few questions on XFmode for x86

2008-02-29 Thread Jan Hubicka
> i386-modes.def has > > --- > /* In ILP32 mode, XFmode has size 12 and alignment 4. >In LP64 mode, XFmode has size and alignment 16. */ > ADJUST_FLOAT_FORMAT (XF, (TARGET_128BIT_LONG_DOUBLE > ? &ieee_extended_intel_128_format > : TARGET_96_

Re: birthpoints in rtl.

2008-03-01 Thread Jan Hubicka
> The two complications are: > 1) libcalls I am probably dense here, but why we can't just ignore existence of libcalls for dataflow framework? This exist so we can effectively remove blocks of code in dead code removal and do some other changes, but I don't see how they can be less friendly to F

Re: birthpoints in rtl.

2008-03-01 Thread Jan Hubicka
> Diego, > > I am leaning to just adding noop moves at the birthpoints (dominance > frontiers) as real noop move insns in the streams in the passes that use > ud or du chains. The back end is tolerant of noop moves and without Hi, while I am with Diego that would prefer PHI nodes on side espec

Re: birthpoints in rtl.

2008-03-01 Thread Jan Hubicka
> > Not libcalls, but libcall *notes*. > > > This exist so we can effectively remove > > blocks of code in dead code removal and do some other changes, but I > > don't see how they can be less friendly to FUD than they are to DU/UD. > > Sure the optimization has to care to not break the extra

[RFA] optimizing predictable branches on x86

2008-03-03 Thread Jan Hubicka
Hi, I had to tweak the testcase a bit to not compute minimum: GCC optimizes this early into MIN_EXPR throwing away any profile information. If we get serious here we can maintain it via histogram, but I am not sure it is worth the effort at least until IL is sanitized and expansion cleaned up with

Re: [RFA] optimizing predictable branches on x86

2008-03-03 Thread Jan Hubicka
> > >/* High branch cost, expand as the bitwise OR of the conditions. > > Do the same if the RHS has side effects, because we're effectively > > turning a TRUTH_OR_EXPR into a TRUTH_ORIF_EXPR. */ > >! if (BRANCH_COST (!optimize_size, false)>= 4 > >! || TREE_SIDE_EFFEC

Re: [RFA] optimizing predictable branches on x86

2008-03-03 Thread Jan Hubicka
> But I can also hide the cfun->function_frequency trick in > DEFAULT_BRANCH_COST macro if it seems to help. (in longer term I hope > they will all go away as expansion needs to be aware of hotness info > anyway) Well, it definitely helps. I originally hoped there will be fewer places querying BRA

Re: [RFA] optimizing predictable branches on x86

2008-03-03 Thread Jan Hubicka
> > But I can also hide the cfun->function_frequency trick in > > DEFAULT_BRANCH_COST macro if it seems to help. (in longer term I hope > > they will all go away as expansion needs to be aware of hotness info > > anyway) > > Well, it definitely helps. I originally hoped there will be fewer places

Re: [RFA] optimizing predictable branches on x86

2008-03-03 Thread Jan Hubicka
> On Monday 03 March 2008 22:38, Jan Hubicka wrote: > > Hi, > > I had to tweak the testcase a bit to not compute minimum: GCC optimizes > > this early into MIN_EXPR throwing away any profile information. If we > > get serious here we can maintain it via histogram,

Re: [RFA] optimizing predictable branches on x86

2008-03-03 Thread Jan Hubicka
> > >>>I hope so too. For the kernel we have some parts where > >>>__builtin_expect is used quite a lot and noticeably helps, and could > >>>help even more if we cut down the use of cmov too. I guess on > >>>architectures with even more predicated instructions it could be > >>>even more useful too

Re: birthpoints in rtl.

2008-03-04 Thread Jan Hubicka
Hi, > > 1) In ssa, the operands of the phis and the renaming contain > information. The operands are paired with the cfg edges that the > values come in on. In fud/birthpoints there is no such pairing or > renaming. For some problems, like conditional constant, this pairing > and renaming is

Re: birthpoints in rtl.

2008-03-04 Thread Jan Hubicka
> I think that at this point, i have been convinced to: > > 1) use fud's rather than birthpoints because these do keep a slot for > the value along each in edge. > 2) keep the info on the side (see rsandifors diverging thread). > > I am not there on keeping extra names on the side. The advantag

Re: birthpoints in rtl.

2008-03-04 Thread Jan Hubicka
> > I think that at this point, i have been convinced to: > > > > 1) use fud's rather than birthpoints because these do keep a slot for > > the value along each in edge. > > 2) keep the info on the side (see rsandifors diverging thread). > > > > I am not there on keeping extra names on the side.

Re: Linux doesn't follow x86/x86-64 ABI wrt direction flag

2008-03-05 Thread Jan Hubicka
> On Wed, Mar 05, 2008 at 09:38:13PM +0100, Michael Matz wrote: > > Hi, > > > > On Wed, 5 Mar 2008, Aurelien Jarno wrote: > > > > > > So I think gcc at least needs an *option* to revert to the old behavior, > > > > and there's a good argument to make it the default for now, at least for > > > > x

Re: [RFC] GCC caret diagnostics

2008-03-08 Thread Jan Hubicka
> Ian Lance Taylor <[EMAIL PROTECTED]> writes: > > >Another approach would be to only use the carets for parse errors, > >which is where they are the most helpful. > > And preprocessor if possible > > [also sometimes I would love to have an option in gcc to just > display the preprocessed inpu

Re: API for callgraph and IPA passes for whole program optimization

2008-03-09 Thread Jan Hubicka
Hi, based on the discussion, this is the change I would like to make to the passmanager. I am sending the header change only first, because the actual change will need updating all PM datastructure initializers and compensate testsuite and documentation for the removal of RTL dump letters so I would rat

Re: API for callgraph and IPA passes for whole program optimization

2008-03-09 Thread Jan Hubicka
> Jan Hubicka wrote: > > This looks mostly fine to me. note that i added you to pr35094 since > this patch will resolve that issue. > > I guess that one of the questions that i would have is why not have > there be a base structure for the core passmanager fields, an

Re: API for callgraph and IPA passes for whole program optimization

2008-03-12 Thread Jan Hubicka
> On 3/9/08 7:26 AM, Jan Hubicka wrote: > > >compensate testsuite and documentation for the removal of RTL dump > >letters so I would rather do that just once. Does this seem OK? > > Yup, thanks for doing this. > > > >The patch include the read/write me

Re: gcc 4.3.0 i386 default question

2008-03-12 Thread Jan Hubicka
> Hi, > > Did the default i386 CPU model that gcc generates > code for change between 4.2.x and 4.3.0? I didn't > see anything in the release notes that jumps out at > me about this. There wasn't any intent to change the codebase. However, the default tuning has now changed to the "generic" model.

Re: gcc 4.3.0 i386 default question

2008-03-12 Thread Jan Hubicka
> David Edelsohn wrote: > >>Joel Sherrill writes: > >> > > > >Joel> Those all look like checks to see if the compiler itself > >Joel> supports Altivec -- not a run-time check on the hardware > >Joel> like the Neon check_effective_target_arm_neon_hw appears > >Joel> to be. > > >

Re: Bootstrap comparison failures on i586

2008-04-04 Thread Jan Hubicka
> > if (parts.base) > { > if (REGNO_POINTER_ALIGN (REGNO (parts.base)) < 32) <-- 820 > return 0; > } > > I think parts.base is OK so it's probably REGNO_POINTER_ALIGN Uh, while converting the regno_pointer_align from GGC to malloced memory, I mistakenly used xmalloc instead

Re: Bootstrap comparison failures on i586

2008-04-05 Thread Jan Hubicka
t reproduce for me, but I've committed the following as obvious. I am sorry for all the fallout... Index: ChangeLog === *** ChangeLog (revision 133932) --- ChangeLog (working copy) *** *** 1,5 --- 1,7 ---- 2008

Re: Bootstrap failure on i686-apple-darwin9

2008-04-15 Thread Jan Hubicka
> At revision 134333, bootstrap fails on i686-apple-darwin9 at stage 1 > with: > > ... > gcc -c -g -fkeep-inline-functions -DIN_GCC -W -Wall -Wwrite-strings > -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition > -Wmissing-format-attribute -fno-common -DHAVE_CONFIG_H -I. -I. > -I

Re: Bootstrap failure on i686-apple-darwin9

2008-04-15 Thread Jan Hubicka
> > Does this help? > > Thanks for the answer, but now I have: > > ... > ../../gcc-4.4-work/gcc/except.c: In function 'set_nothrow_function_flags': > ../../gcc-4.4-work/gcc/except.c:2787: error: 'struct rtl_data' has no member > named 'epilogue_delay_list' > make[3]: *** [except.o] Error 1 > ...

Re: ICE while bootstrap for x86_64-pc-mingw32 in cp-demangle.c: 2905

2008-05-02 Thread Jan Hubicka
> Hi, > > it seems, that the compilation of libiberty for x86_64-pc-mingw32 leads to > an ICE in cp-demangle.c:2905: internal compiler error verify_cgraph_node > failed breaks bootstrap. Hi, this was caused by my PM patch. I've committed a fix for it this morning, so please let me know if any prob

Re: apparent memory increase

2008-05-23 Thread Jan Hubicka
> You may have seen this warning from the memory consumption tester: > > http://gcc.gnu.org/ml/gcc-regression/2008-05/msg00041.html > > ... related to the recent identifier GC patch. > > I looked into this a little. My theory is that this is an artifact of > how the tester collects its data. I

Re: Why is the length of *sse_prologue_save_insn 135?

2008-05-24 Thread Jan Hubicka
> Hi Jan, Uros, Hi, I guess I was just too lazy to figure out the size at the time of writing the pattern. Length is not used for anything useful at the moment, but fixing it definitely won't hurt. Honza > > i386.md has > > (define_insn "*sse_prologue_save_insn" > [(set (mem:BLK (plus:DI (match_oper

Re: [lto] Streaming out language-specific DECL/TYPEs

2008-06-03 Thread Jan Hubicka
> On Mon, Jun 2, 2008 at 5:10 PM, Diego Novillo <[EMAIL PROTECTED]> wrote: > > In g++.dg/torture/20070621-1.C we are trying to stream out a structure > > that contains a TEMPLATE_DECL. This currently causes a failure in > > lto-function-out.c:output_tree because not only TEMPLATE_DECL is > > C++-s

Re: [lto] Streaming out language-specific DECL/TYPEs

2008-06-03 Thread Jan Hubicka
> On Mon, Jun 2, 2008 at 20:37, Kenneth Zadeck <[EMAIL PROTECTED]> wrote: > > > the problem with making this a langhook is that there is no "there-there" in > > that on the serialize in side, you would have to recreate the c++ front end > > code that expects this tree code. (if there is no such c

Re: Is this a typo in setup_incoming_varargs_64?

2008-06-05 Thread Jan Hubicka
> Hi, > > setup_incoming_varargs_64 in i386.c has > > /* Compute address to jump to : > label - 5*eax + nnamed_sse_arguments*5 */ > > The comments don't match the code. Should the comments be > > /* Compute address to jump to : > label - 4*eax + nnamed_sse_arguments

Re: [lto] Streaming out language-specific DECL/TYPEs

2008-06-05 Thread Jan Hubicka
> Jan Hubicka wrote: > > >Sure if it works, we should be lowering the types during gimplification > >so we don't need to store all this in memory... > >But C++ FE still use its local data later in stuff like thunks, but we > >will need to cgraphize them any

Re: [whopr] Design/implementation alternatives for the driver and WPA

2008-06-05 Thread Jan Hubicka
Hi, I am jumping in somewhat late, as yesterday I was in meetings without internet access. (and I probably will be offline again tomorrow) I think that in basic terms we all mostly agree (we want to implement optimization scheme that does not get everything into memory, we want to parallelize the

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-05 Thread Jan Hubicka
> > 1. Extend the register save area to put upper 128bit at the end. > Pros: > Aligned access. > Save stack space if 256bit registers are used. > Cons > Split access. Require more split access beyond 256bit. > > 2. Extend the register save area to put full 256bit YMMs at the end.

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-06 Thread Jan Hubicka
> > ymm0 and xmm0 are the same register. xmm0 is the lower 128bit > of ymm0. I am not sure if we need separate XMM registers from > YMM registers. Yes, I know that xmm0 is lower part of ymm0. I still think we ought to be able to support varargs that do save ymm0 registers only when ymm values a

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-09 Thread Jan Hubicka
> On Fri, Jun 06, 2008 at 06:50:26AM -0700, H.J. Lu wrote: > > On Fri, Jun 06, 2008 at 10:28:34AM +0200, Jan Hubicka wrote: > > > > > > > > ymm0 and xmm0 are the same register. xmm0 is the lower 128bit > > > > of ymm0. I am not sure if we need sepa

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-10 Thread Jan Hubicka
> > I don't understand why you want to pass __m256 and 256-bit vector values > to anonymous arguments in registers. The only thing the vararg functions > would do with it would be save it somewhere on the stack. > Given the x86_64 ABI, you can't expect calling an implicitly > prototyped or non-va

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-10 Thread Jan Hubicka
> On Tue, Jun 10, 2008 at 4:32 AM, Jan Hubicka <[EMAIL PROTECTED]> wrote: > >> > >> I don't understand why you want to pass __m256 and 256-bit vector values > >> to anonymous arguments in registers. The only thing the vararg functions > >> would

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-10 Thread Jan Hubicka
> On Tue, Jun 10, 2008 at 8:11 AM, Jakub Jelinek <[EMAIL PROTECTED]> wrote: > > On Tue, Jun 10, 2008 at 04:50:14PM +0200, Jan Hubicka wrote: > >> 1) make __m256 passed on stack on variadic functions and in registers > >> otherwise. Then we don't need

Re: RFC: Extend x86-64 psABI for 256bit AVX register

2008-06-15 Thread Jan Hubicka
> On Wed, Jun 11, 2008 at 07:49:12AM -0700, H.J. Lu wrote: > > > I guess we all agree on passing variadic arguments on stack (that is > > > only those belonging on ...) and rest in registers. It seems easiest in > > > regard to future register set extensions too. Only negative thing is > > > that

Re: newlib & libgcov

2008-06-15 Thread Jan Hubicka
> Hello, > In our GCC porting, we use newlib instead of libc. Today I tried to use > profiling feedback based optimization with option -fprofile-arcs. But > the executable doesn't produce .gcda file. I examined the disassembled > binary file and found the following functions are basically just d

Re: Results for 4.4.0 20080618 (experimental) (GCC) testsuite on i686-pc-linux-gnu

2008-06-18 Thread Jan Hubicka
> > FAIL: gcc.dg/weak/weak-6.c (test for errors, line 5) > > FAIL: gcc.dg/weak/weak-6.c (test for excess errors) > > FAIL: gcc.dg/weak/weak-7.c (test for errors, line 5) > > FAIL: gcc.dg/weak/weak-7.c (test for excess errors) > > These look like they were caused by one of your patches. Yes, the

Re: Merging tuples branch into mainline today

2008-07-25 Thread Jan Hubicka
> > I think that someone, though, should be committed to fixing this pass ASAP > > after it's checked in; waiting until late August to fix it seems bad. Is > > there someone else who can commit to working on it as a high priority after > > the main tuples checkin? > > I would obviously vote in

Re: Merging tuples branch into mainline today

2008-07-25 Thread Jan Hubicka
> Jan Hubicka wrote: > >So while the passes are probably now well in "benchmark toy" category > >and they will need many changes to be useful in general, I think it is > >good to have something we can test the framework at. > > Do these passes actually

Re: [tuples] New memory/time comparison vs trunk

2008-07-27 Thread Jan Hubicka
> > - The rest of the memory utilization difference is mostly in inlining > (240Kb) and SSA update (50Kb). > > I think the main focus points should be DSE and trying to get a good > way of measuring the memory utilization differences. Jan, any > suggestion? I've switched memory tester to tuples

Re: Recent warning regression: no return statement in function returning non-void

2008-07-27 Thread Jan Hubicka
> On Sun, Jul 27, 2008 at 1:18 PM, Gerald Pfeifer <[EMAIL PROTECTED]> wrote: > > I believe the following happened in the last 48 or so hours; I saw > > this triggered by my nightly Wine builds which in turn use my nightly > > GCC builds. ;-) > > > > For code like the following where we have an infi

Enabling IPCP by default

2008-08-24 Thread Jan Hubicka
Hi, Since most of the issues with IPCP should be fixed now and it should be as strong as possible with the elementary textbook quality algorithm it uses, I would like to enable it by default. I tested it on SPEC and C++ benchmarks yesterday and didn't measure any significant improvements. There is qu

Re: Enabling IPCP by default

2008-08-24 Thread Jan Hubicka
> Jan Hubicka <[EMAIL PROTECTED]> writes: > > > If there are no complaints, I will enable ipcp as proposed after remaining > > patches are tested and committed (that would be about the day after tomorrow) > > It breaks Ada on ia64: I was hitting the same problem on x86

Re: Enabling IPCP by default

2008-08-28 Thread Jan Hubicka
Hi, after IRA I've re-done x86-64 SPECint testing (SPECfp, CSiBE and C++ benchmark failed because the tree was broken at that point, I will get results tomorrow, but there were no surprises before) also with the new code to eliminate arguments. Luis also did PPC SPEC runs. The most important re

Re: Enabling IPCP by default

2008-08-29 Thread Jan Hubicka
Hi, tonight's testing on x86_64, i386 and IA-64 didn't seem to bring any new surprises, so I've committed the following patch. I will also update the changes page for 4.4. * doc/invoke.texi (-fipa-cp): Enabled by default at -O2/-Os/-O3 (-fipa-cp-clone): Enabled by default at -O3. *
