(use (match_operand:SI 5 "register_operand" "2"))
> (set (match_operand:SI 0 "register_operand" "=D")
> (plus:SI (match_operand:SI 3 "address_operand" "0") (match_dup 5)))
> (set (match_operand:SI 1 "
>
> I can accept the issue as a matter of documentation, but I don't
> understand the rest. Remember that all the patterns are executed in
> parallel. I don't see how adding a USE in parallel could affect
> anything about how the operand is used.
> >> >> (define_insn "*rep_movqi"
> >> >>[(s
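A minimal sketch of the "executed in parallel" point quoted above, assuming a hypothetical two-register state (the struct and function names are illustrative, not GCC code). In the parallel reading every source is taken from the incoming state before any destination is written:

```c
/* Hypothetical two-register state, used only for this illustration. */
struct regs {
    int r0;
    int r1;
};

/* Sequential reading: the second assignment sees the first one's result. */
static struct regs run_sequential(struct regs in)
{
    struct regs out = in;
    out.r0 = out.r1;
    out.r1 = out.r0;   /* reads the already overwritten r0 */
    return out;
}

/* Parallel reading: every source comes from the incoming state,
   so the two sets together act as a swap. */
static struct regs run_parallel(struct regs in)
{
    struct regs out;
    out.r0 = in.r1;
    out.r1 = in.r0;
    return out;
}
```

Starting from r0=1, r1=2, the parallel form yields r0=2, r1=1, while the sequential form yields r0=2, r1=2.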
> Jan Hubicka writes:
>
> >>
> >> I can accept the issue as a matter of documentation, but I don't
> >> understand the rest. Remember that all the patterns are executed in
> >> parallel. I don't see how adding a USE in parallel coul
> On Wed, Apr 18, 2012 at 2:44 PM, Richard Henderson wrote:
> > On 04/18/2012 05:39 AM, Jan Hubicka wrote:
> >> Well, if SJLJ lowering happens as gimple pass somewhere near the end of
> >> gimple
> >> queue, this should not be problem at all. (and i
> On Fri, Jun 22, 2012 at 2:47 PM, Aldy Hernandez wrote:
> > Hi gentlemen.
> >
> > I am looking again at LTO + TM. The goal is to be able to link with the
> > implemented _ITM_* functions in libitm.a, and have them inlined into the
> > transaction code when profitable.
> >
> > To refresh everyone
>
a) If a user provides a builtin implementation to LTO, it is discarded,
since by design LTO prefers builtins to user-provided versions of them. In
LTO, builtins are their own prevailing decl. There is an enhancement
request PR here:
http://gcc.gnu.org/bugzi
> On Mon, Jun 25, 2012 at 4:32 PM, Richard Henderson wrote:
> > On 2012-06-22 06:08, Richard Guenther wrote:
> >> Do I understand correctly that inlining the builtin at expansion time is
> >> not
> >> good because the implementation detail may depend on how libitm was
> >> configured?
> >
> > Mor
>
> I'm not sure TM people care about double streaming cost ;) As far as I can
> see TM people want the non-lowered form go through at least loop
> optimizations,
> so I don't see how even a proper IPA pass would help here. As of
> cherry-picking
:) Yep, this is kind of similar to what we may
Quoting John McCall :
On Jun 29, 2012, at 2:23 PM, Rafael Espíndola wrote:
There's no "for a long time" here. The ABI does not allow us to emit these
symbols with non-coalescing linkage. We're not going to break ABI
just because people didn't consider a particular code pattern when they
hacke
> What do you think of the following plan for turning cgraph into
> a class hierarchy? We cannot finish it until we have gengtype
> understanding single inheritance, but we can start changing APIs
> in preparation.
Good you told me, I was about to try that myself. I did not know gengtype
does not und
> On 9/5/12, Xinliang David Li wrote:
> > On Sep 5, 2012 Jan Hubicka wrote:
> > > OK, the basic idea is that symtab_node is basetype of
> > > cgraph_node and varpool_node. We may want to drop the historical
> > > cgraph/varpool names here, since function_
>
> The cgraph redesign probably deserves more discussion.
>
> 1) It may be worthwhile to abstract the graph manipulation code into a
> utility class which is templatized.
>
> graph, node with node inheriting from T.
>
> 2) Introduce a global symbol table containing a function table and a
> g
> > On 9/5/12, Xinliang David Li wrote:
> > > On Sep 5, 2012 Jan Hubicka wrote:
> > > > OK, the basic idea is that symtab_node is basetype of
> > > > cgraph_node and varpool_node. We may want to drop the historical
> > > > cgraph/va
> On Wed, Sep 5, 2012 at 5:41 PM, Lawrence Crowl wrote:
> > On 9/5/12, Xinliang David Li wrote:
> >> On Sep 5, 2012 Jan Hubicka wrote:
> >> > OK, the basic idea is that symtab_node is basetype of
> >> > cgraph_node and varpool_node. We may want to drop
>
> Areas that are confusing and need clean up (IMO) include:
> 1) handling of aliases and clones
I am slowly cleaning up the alias code; it had a major reorg in 4.7 and further
cleanups in 4.8. Do you have more specific suggestions?
> 2) reachability, needed, analyzed bits. The needed bit is not i
> Sorry to interrupt here, but please finish the existing partial C++
> transitions
> instead of starting to work on new ones. Current stage1 will not last forever
> (stage1 is usually 6 months, so its natural end would be end of September).
> I'd rather have the current transition to a symbol ta
> Is there any interest in updating the in-tree libtool to something
> newer? This update would allow to use a -fno-fat-lto-objects
> lto-bootstrap target, that should speed up the (lto) build time.
>
> If there is interest, when would be the best date for such an update?
There is definitely an i
> We do not yet seem to have consensus on a long term plan.
> Would it be reasonable to start on short term preparatory work?
>
> In particular, I was thinking we could do
>
>Add converters and testers.
>Change callers to use those.
>
> and maybe
>
>Change callers to use type-safe parame
> On Wed, Oct 31, 2012 at 3:17 PM, Paolo Carlini
> wrote:
> > Hi,
> >
> > whoever a few days ago or so broke this test, can please either fix the
> > testcase, the compiler or just xfail for now the testcase itself, to avoid
> > everybody the waste of time?
> >
> > If you want me to do go ahead w
> > On Wed, Oct 31, 2012 at 3:17 PM, Paolo Carlini
> > wrote:
> > > Hi,
> > >
> > > whoever a few days ago or so broke this test, can please either fix the
> > > testcase, the compiler or just xfail for now the testcase itself, to avoid
> > > everybody the waste of time?
> > >
> > > If you want m
>
> I agree that this is a sledgehammer. If aligned/unaligned loads/stores have
> the same cost then reflect that in the vectorized stmt cost hook. If that
> alone does not prevent peeling for alignment to happen then the fix is to
> not consider doing peeling for alignment if aligned/unaligned
> >
> >Basic blocks 8/9/10 are identical and live until pass jump2, which is
> >after register allocation.
> >I think these duplicated BBs do not contain additional information and
> >should be better to be removed ASAP, because they might interfere with
> >other passes like ifcvt.
> >
> >So should
> > Well, it's hardly an optimization if it's incorrect, and it seems to be
> > incorrect. As the old saying goes, I can make your code infinitely fast
> > if you don't care about the results.
>
> It's incorrect to rely on the extension taking place. It's not incorrect to
> do the extension.
Th
> On 01/30/2013 04:49 PM, Michael Matz wrote:
> > Hmm? GCC generates code that doesn't rely on the extension taking place.
>
> Sure, I didn't mean to suggest it was: it's LLVM that's incorrect.
Yes, that is an LLVM bug. I am surprised that it went unnoticed for so long,
but I guess it is difficult
> On 02/01/2013 12:38 AM, Jan Hubicka wrote:
> > Doing the extensions at caller side always is however IMO a performance bug
> > in
> > GCC. We can definitely drop them at -Os, for non-PRS targets and for calls
> > within compilation unit where we know that GCC is no
> Hi,
>
> We recently got a bug report for the GCC D compiler frontend which shows that
> we
> currently don't inline any templated functions. The reason seems to be that
> decl_replaceable_p always returns true for D template functions.
>
> We currently just mark such template function instance
> >> Set DECL_COMDAT. You said that didn't work but you didn't fully
> >> explain why. A DECL_COMDAT function should be output in every object
> >> file in which it is referenced.
> >
> > I wasn't sure if that's the correct approach. If it is, some
> > further investigation will be necessary why
> On 14 January 2008 11:03, Hans-Peter Nilsson wrote:
>
> >> Date: Sat, 12 Jan 2008 11:16:23 +0100
> >> From: Paolo Bonzini <[EMAIL PROTECTED]>
> >
> >>> (Yeah, new attributes "impure" and/or "nonconst" would solve
> >>> this, but only for IPA and there's already the existing option
> >>> and asm
>
> -malign-double is (was?) indeed a performance improvement for
> numerical applications on 32bits. But DImode is still not 8 bytes aligned
> there (which makes a next-gen 32bit ABI for 64bit x86 difficult there,
> if you want to retain DImode/DFmode 8 byte alignment and re-use the
> kernel 32b
> On Feb 4, 2008 4:11 AM, H.J. Lu <[EMAIL PROTECTED]> wrote:
> > DImode is aligned at 8 byte in i386. Since 32bit doesn't have
> > 64bit register, can we align DImode at 4byte instead of 8
> > for i386? It shouldn't have any negative impact on performance.
>
> I don't think DImode is aligned at 8
> On Mon, Feb 04, 2008 at 12:24:33PM +0100, Jan Hubicka wrote:
> > >
> > > -malign-double is (was?) indeed a performance improvement for
> > > numerical applications on 32bits. But DImode is still not 8 bytes aligned
> > > there (which makes a next-gen
> Hi,
>
> On Wed, 13 Feb 2008, H.J. Lu wrote:
>
> > We need a callee-saved register for stack alignment.
>
> Can you expand on why?
>
> > In 64bit, our choices are rbx, and r12-r15. r12-r15 need the REX byte
> > and r12 also needs the SIB byte. So I'd like to use rbx. x86-64 psABI
> > says rb
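The encoding argument quoted above (r12-r15 need REX, r12 also needs SIB) can be sketched as two small predicates. This is an illustration under the usual x86-64 register numbering; the enum and function names are made up for the example:

```c
#include <stdbool.h>

/* Hardware register numbers as used in x86-64 instruction encoding. */
enum x86_64_reg {
    RAX = 0, RCX, RDX, RBX, RSP, RBP, RSI, RDI,
    R8, R9, R10, R11, R12, R13, R14, R15
};

/* Registers 8..15 need an extension bit supplied by a REX prefix. */
static bool needs_rex_extension(enum x86_64_reg reg)
{
    return reg >= R8;
}

/* A base register whose low three bits are 100 (rsp, r12) cannot be
   named in ModRM alone, so a SIB byte is required. */
static bool base_needs_sib(enum x86_64_reg reg)
{
    return (reg & 7) == 4;
}
```

Under these rules rbx needs neither extra byte when used as a base register, which is the size argument made for preferring it.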
> On Feb 13, 2008 2:49 PM, Michael Matz <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > On Wed, 13 Feb 2008, H.J. Lu wrote:
> >
> > > Our proposal is at
> > >
> > > http://gcc.gnu.org/ml/gcc/2007-12/msg00567.html
> > >
> > > For most cases, we can align stack with RBP/RSP. But
> > > we need an extra reg
Hi,
> diego and honza
>
> Diego asked on IRC whether we were planning to be able to serialize out all of
> the combined declarations.
> My response was that we could certainly use the machinery that we currently have
> to do this.
>
> However, I believe that Diego's motivation is somewhat flawed, and for
> that m
> > > Hi,
> > >
> > > On Wed, 13 Feb 2008, H.J. Lu wrote:
> > >
> > > > We need a callee-saved register for stack alignment.
> > >
> > > Can you expand on why?
> > >
> > > > In 64bit, our choices are rbx, and r12-r15. r12-r15 need the REX byte
> > > > and r12 also needs the SIB byte. So I'
> I have to admit that i had not considered the issue of having the later
> stages of compilation of one function feed into the early stages of
> another. The problem with this is that it implies a serial ordering of
> the compilation and that is not going to yield a usable system for
> compiling
> Diego Novillo wrote:
> >On Fri, Feb 1, 2008 at 3:55 PM, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> >
> >
> >> 1 - Pass cleanup. There have been rumblings about this, but I haven't
> >>
> >
> >Yes, this is an area that is in desperate need of TLC. Your plan
> >looks good to me. We need t
> Your idea seems fine to me. Unless I'm not understanding you
> completely, it does not really conflict with what we're trying to do in
> whopr.
>
> The main goal of whopr is to support transformations that can be
> expressed in terms of a local generation phase, a global analysis that
> dec
> IP RA as currently implemented in IRA does propagate info only down in
> topological order. But a good IP RA (e.g. Minimal cost inter-procedural
> register allocator http://citeseer.ist.psu.edu/kurlander96minimum.html)
> needs to propagate info up and down.
>
> But I am quite skeptical about I
> I agree with you. We should not forget the embedded market where the
> code size is more important, but we should provide an option for the market
> on which Pathscale/Intel/Sun compilers are oriented.
>
> With this point of view, we have a lot of resources because gcc
> generates the smallest co
Hi,
since it seems that we are getting ready with LTO, I think it is time to write
an updated design document for the callgraph and the implementation of the whole
program optimizer as outlined in the Whopr document http://gcc.gnu.org/wiki/whopr
As usual, all comments or ideas are welcome ;)
Basic organization
=
> I find 'analyze' for the first stage confusing. We do no analysis
> there, we just produce summary info. The analysis is actually done by
> what you call 'read'. How about some variant of:
>
> generate_summary_{function/variable}
> analyze_{function/variable}
> transform_{function/variable}
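The three-stage naming suggested above could be pictured as a table of hooks. The sketch below is hypothetical: the struct, the trace type and the demo callbacks are invented for illustration and do not correspond to any real GCC data structure.

```c
#include <stddef.h>

enum stage { STAGE_GENERATE_SUMMARY, STAGE_ANALYZE, STAGE_TRANSFORM };

/* Records the order in which the hooks ran. */
struct trace {
    enum stage seen[3];
    size_t count;
};

/* Hypothetical hook table using the names proposed in the message. */
struct stage_hooks {
    void (*generate_summary_function)(struct trace *t);
    void (*analyze_function)(struct trace *t);
    void (*transform_function)(struct trace *t);
};

static void record(struct trace *t, enum stage s)
{
    if (t->count < 3)
        t->seen[t->count++] = s;
}

static void demo_generate_summary(struct trace *t) { record(t, STAGE_GENERATE_SUMMARY); }
static void demo_analyze(struct trace *t)          { record(t, STAGE_ANALYZE); }
static void demo_transform(struct trace *t)        { record(t, STAGE_TRANSFORM); }

static const struct stage_hooks demo_hooks = {
    demo_generate_summary, demo_analyze, demo_transform
};

/* Runs the stages in the proposed order: summary, analysis, transform. */
static void run_stages(const struct stage_hooks *hooks, struct trace *t)
{
    hooks->generate_summary_function(t);
    hooks->analyze_function(t);
    hooks->transform_function(t);
}
```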
> Currently the job to drive compilation process is implemented in
> cgraphunit and passmanager. I am leaning to plan to do as much work as
BTW for a while I have thought that the name of cgraphunit outlived its original
meaning.
Just as a historical note, it was introduced because some bits weren't
possibl
Hi,
> Michael, Jan,
>
> When aligning stack for those functions who have dynamic stack
> allocation, we must use an available callee-saved register in prologue.
> We named this hard register DRAP. It is worthwhile to emphasize that
> *free* here means "free in prologue". After prologue, a virtual
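As a rough picture of why a spare register is wanted in the prologue: once the stack pointer is rounded down, the incoming value has to be kept somewhere. The sketch below only models the arithmetic; the struct and function names are hypothetical and the alignment is assumed to be a power of two.

```c
#include <stdint.h>

/* Round sp down to a multiple of align (align must be a power of two). */
static uintptr_t align_stack_down(uintptr_t sp, uintptr_t align)
{
    return sp & ~(align - 1);
}

/* Hypothetical model of the prologue: one slot remembers the incoming
   stack pointer (the role the message gives to the spare register),
   the other holds the realigned value. */
struct frame {
    uintptr_t incoming_sp;
    uintptr_t aligned_sp;
};

static struct frame enter_aligned(uintptr_t sp, uintptr_t align)
{
    struct frame f;
    f.incoming_sp = sp;
    f.aligned_sp = align_stack_down(sp, align);
    return f;
}
```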
Hi,
> Core2 follows a similar pattern, although it's not seeing any
> slowdown in the "no deps, predictable, jmp" case like K8 does.
>
> Any comments? (please cc me) Should gcc be using conditional jumps
> more often eg. in the case of __builtin_expect())?
The problem is that in general GCC's bra
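For readers unfamiliar with the builtin mentioned in the question, a minimal usage sketch follows. `__builtin_expect` is a GCC extension (also accepted by Clang); the `likely`/`unlikely` macros and the helper function are illustrative names, and whether the compiler emits a jump or a cmov for such code is exactly what the thread is debating.

```c
/* Tell the compiler which outcome of a condition is expected. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: negative inputs are assumed to be rare. */
static int clamp_to_zero(int value)
{
    if (unlikely(value < 0))
        return 0;
    return value;
}
```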
> Hi,
> > Core2 follows a similar pattern, although it's not seeing any
> > slowdown in the "no deps, predictable, jmp" case like K8 does.
> >
> > Any comments? (please cc me) Should gcc be using conditional jumps
> > more often eg. in the case of __builtin_expect())?
>
> The problem is that in g
> > At least on x86 it should also be a good idea to know which way
> > the branch is going to go, because it doesn't have explicit branch
> > hints, you really want to be able to optimize the cold branch
> > predictor case if converting from cmov to conditional branches.
>
> x86 as of Pentium 4 d
> This is also interesting for the ARC700 processor.
>
> There is also an issue if the flag for the conditionalized instruction is
> set in the immediately preceding instruction, and the result of the
> conditionalized instruction is required in the immediately following
> instruction, and if usin
>
> -static inline void __attribute__((format(printf, 1, 2)))
> -__simple_attr_check_format(const char *fmt, ...)
It would be nice to have a testcase, but I guess it is because GCC can't
inline variadic functions. The function gets identified as const and
removed as unused by DCE, but this happ
> On Thu, Feb 28, 2008 at 12:58:35AM +0100, Jan Hubicka wrote:
> > We probably also can simply allow inlining variadic functions not
> > calling va_start. I must say that this option appeared to me but I was
> > unable to think of any sane use case. This probably is one ;)
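A sketch of the use case being discussed: a variadic function that exists only so the compiler checks a format string, and that never calls va_start. The names `check_format` and `count_chars` are hypothetical; the attribute syntax is the GCC one quoted in the patch above.

```c
#include <stdio.h>

/* Empty variadic function: its only purpose is to let the compiler
   verify the format string against the arguments. */
static inline void __attribute__((format(printf, 1, 2)))
check_format(const char *fmt, ...)
{
    (void)fmt;   /* never calls va_start */
}

/* Hypothetical caller: the check costs nothing at run time if the
   empty call can be inlined or removed. */
static int count_chars(int value)
{
    char buf[32];
    check_format("%d", value);
    return snprintf(buf, sizeof buf, "%d", value);
}
```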
Hi,
as discussed briefly on IRC yesterday, I would be very happy to see the
current DU/UD infrastructure changed to FUD chains (or on-the-side SSA
form). This way it will be more symmetric to how tree level virtual
operands are handled and will hopefully make the whole compiler more
standard and easier to foll
> i386-modes.def has
>
> ---
> /* In ILP32 mode, XFmode has size 12 and alignment 4.
>In LP64 mode, XFmode has size and alignment 16. */
> ADJUST_FLOAT_FORMAT (XF, (TARGET_128BIT_LONG_DOUBLE
> ? &ieee_extended_intel_128_format
> : TARGET_96_
> The two complications are:
> 1) libcalls
I am probably dense here, but why can't we just ignore the existence of
libcalls for the dataflow framework? They exist so we can effectively remove
blocks of code in dead code removal and do some other changes, but I
don't see how they can be less friendly to F
> Diego,
>
> I am leaning to just adding noop moves at the birthpoints (dominance
> frontiers) as real noop move insns in the streams in the passes that use
> ud or du chains. The back end is tolerant of noop moves and without
Hi,
while I am with Diego in that I would prefer PHI nodes on the side, espec
>
> Not libcalls, but libcall *notes*.
>
> > This exist so we can effectively remove
> > blocks of code in dead code removal and do some other changes, but I
> > don't see how they can be less friendly to FUD than they are to DU/UD.
> > Sure the optimization has to care to not break the extra
Hi,
I had to tweak the testcase a bit to not compute minimum: GCC optimizes
this early into MIN_EXPR throwing away any profile information. If we
get serious here we can maintain it via histogram, but I am not sure it
is worth the effort at least until IL is sanitized and expansion cleaned
up with
>
> >/* High branch cost, expand as the bitwise OR of the conditions.
> > Do the same if the RHS has side effects, because we're effectively
> > turning a TRUTH_OR_EXPR into a TRUTH_ORIF_EXPR. */
> >! if (BRANCH_COST (!optimize_size, false)>= 4
> >! || TREE_SIDE_EFFEC
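The transformation the quoted comment describes can be shown at the source level. This is only an analogy for the RTL expansion, with made-up function names: when neither operand has side effects, the short-circuit form and the bitwise form give the same answer, and the second needs no intermediate branch.

```c
#include <stdbool.h>

/* Short-circuit form: may be expanded with a conditional jump
   after testing the first operand. */
static bool either_branchy(int a, int b)
{
    return a != 0 || b != 0;
}

/* Bitwise form: both conditions are evaluated and ORed together,
   which is only valid because neither has side effects. */
static bool either_branchless(int a, int b)
{
    return ((a != 0) | (b != 0)) != 0;
}
```

An optimizing compiler is free to turn either function into the other, so the difference is visible only in how a particular branch cost setting steers expansion.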
> But I can also hide the cfun->function_frequency trick in
> DEFAULT_BRANCH_COST macro if it seems to help. (in longer term I hope
> they will all go away as expansion needs to be aware of hotness info
> anyway)
Well, it definitely helps. I originally hoped there will be fewer places
querying BRA
> > But I can also hide the cfun->function_frequency trick in
> > DEFAULT_BRANCH_COST macro if it seems to help. (in longer term I hope
> > they will all go away as expansion needs to be aware of hotness info
> > anyway)
>
> Well, it definitely helps. I originally hoped there will be fewer places
> On Monday 03 March 2008 22:38, Jan Hubicka wrote:
> > Hi,
> > I had to tweak the testcase a bit to not compute minimum: GCC optimizes
> > this early into MIN_EXPR throwing away any profile information. If we
> > get serious here we can maintain it via histogram,
>
> >>>I hope so too. For the kernel we have some parts where
> >>>__builtin_expect is used quite a lot and noticably helps, and could
> >>>help even more if we cut down the use of cmov too. I guess on
> >>>architectures with even more predicated instructions it could be
> >>>even more useful too
Hi,
>
> 1) In ssa, the operands of the phis and the renaming contain
> information. The operands are paired with the cfg edges that the
> values come in on. In fud/birthpoints there is no such pairing or
> renaming. For some problems, like conditional constant, this pairing
> and renaming is
> I think that at this point, i have been convinced to:
>
> 1) use fud's rather than birthpoints because these do keep a slot for
> the value along each in edge.
> 2) keep the info on the side (see rsandifors diverging thread).
>
> I am not there on keeping extra names on the side. The advantag
> > I think that at this point, i have been convinced to:
> >
> > 1) use fud's rather than birthpoints because these do keep a slot for
> > the value along each in edge.
> > 2) keep the info on the side (see rsandifors diverging thread).
> >
> > I am not there on keeping extra names on the side.
> On Wed, Mar 05, 2008 at 09:38:13PM +0100, Michael Matz wrote:
> > Hi,
> >
> > On Wed, 5 Mar 2008, Aurelien Jarno wrote:
> >
> > > > So I think gcc at least needs an *option* to revert to the old behavior,
> > > > and there's a good argument to make it the default for now, at least for
> > > > x
> Ian Lance Taylor <[EMAIL PROTECTED]> writes:
>
> >Another approach would be to only use the carets for parse errors,
> >which is where they are the most helpful.
>
> And preprocessor if possible
>
> [also sometimes I would love to have an option in gcc to just
> display the preprocessed inpu
Hi,
based on the discussion, this is change I would like to do to the
passmanager. I am sending the header change only first, because the
actual change will need updating all PM datastructure initializers and
compensate testsuite and documentation for the removal of RTL dump
letters so I would rat
> Jan Hubicka wrote:
>
> This looks mostly fine to me. note that i added you to pr35094 since
> this patch will resolve that issue.
>
> I guess that one of the questions that i would have is why not have
> there be a base structure for the core passmanager fields, an
> On 3/9/08 7:26 AM, Jan Hubicka wrote:
>
> >compensate testsuite and documentation for the removal of RTL dump
> >letters so I would rather do that just once. Does this seem OK?
>
> Yup, thanks for doing this.
>
>
> >The patch include the read/write me
> Hi,
>
> Did the default i386 CPU model that gcc generates
> code for change between 4.2.x and 4.3.0? I didn't
> see anything in the release notes that jumps out at
> me about this.
There wasn't any intent to change the code base. However, the default
tuning has now changed to the "generic" model.
> David Edelsohn wrote:
> >>Joel Sherrill writes:
> >>
> >
> >Joel> Those all look like checks to see if the compiler itself
> >Joel> supports Altivec -- not a run-time check on the hardware
> >Joel> like the Neon check_effective_target_arm_neon_hw appears
> >Joel> to be.
> >
>
>
> if (parts.base)
> {
> if (REGNO_POINTER_ALIGN (REGNO (parts.base)) < 32) <-- 820
> return 0;
> }
>
> I think parts.base is OK so it's probably REGNO_POINTER_ALIGN
Uh, while converting the regno_pointer_align from GGC to malloced
memory, I mistakenly used xmalloc instead
t reproduce for me, but I've committed the following as
obvious. I am sorry for all the fallout...
Index: ChangeLog
===
*** ChangeLog (revision 133932)
--- ChangeLog (working copy)
***
*** 1,5
--- 1,7 ----
2008
> At revision 134333, bootstrap fails on i686-apple-darwin9 at stage 1
> with:
>
> ...
> gcc -c -g -fkeep-inline-functions -DIN_GCC -W -Wall -Wwrite-strings
> -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition
> -Wmissing-format-attribute -fno-common -DHAVE_CONFIG_H -I. -I.
> -I
> > Does this help?
>
> Thanks for the answer, but now I have:
>
> ...
> ../../gcc-4.4-work/gcc/except.c: In function 'set_nothrow_function_flags':
> ../../gcc-4.4-work/gcc/except.c:2787: error: 'struct rtl_data' has no member
> named 'epilogue_delay_list'
> make[3]: *** [except.o] Error 1
> ...
> Hi,
>
> it seems, that the compilation of libiberty for x86_64-pc-mingw32 leads to
> an ICE in cp-demangle.c:2905: internal compiler error verify_cgraph_node
> failed breaks bootstrap.
Hi,
this was caused by my PM patch. I've committed a fix for it this morning,
so please let me know if any prob
> You may have seen this warning from the memory consumption tester:
>
> http://gcc.gnu.org/ml/gcc-regression/2008-05/msg00041.html
>
> ... related to the recent identifier GC patch.
>
> I looked into this a little. My theory is that this is an artifact of
> how the tester collects its data. I
> Hi Jan, Uros,
Hi,
I guess I was just too lazy to figure out the size at the time of writing the
pattern.
Length is not used for anything useful at the moment, but fixing it
definitely won't hurt.
Honza
>
> i386.md has
>
> (define_insn "*sse_prologue_save_insn"
> [(set (mem:BLK (plus:DI (match_oper
> On Mon, Jun 2, 2008 at 5:10 PM, Diego Novillo <[EMAIL PROTECTED]> wrote:
> > In g++.dg/torture/20070621-1.C we are trying to stream out a structure
> > that contains a TEMPLATE_DECL. This currently causes a failure in
> > lto-function-out.c:output_tree because not only TEMPLATE_DECL is
> > C++-s
> On Mon, Jun 2, 2008 at 20:37, Kenneth Zadeck <[EMAIL PROTECTED]> wrote:
>
> > the problem with making this a langhook is that there is no "there-there" in
> > that on the serialize in side, you would have to recreate the c++ front end
> > code that expects this tree code. (if there is no such c
> Hi,
>
> setup_incoming_varargs_64 in i386.c has
>
> /* Compute address to jump to :
> label - 5*eax + nnamed_sse_arguments*5 */
>
> The comments don't match the code. Should the comments be
>
> /* Compute address to jump to :
> label - 4*eax + nnamed_sse_arguments
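The address computation in the two quoted comments has the same shape and differs only in the per-instruction multiplier. A small sketch, assuming the jump lands inside a run of fixed-length save instructions that ends at `label`; the function and parameter names are hypothetical, and which multiplier matches the code is the open question in the message:

```c
/* Byte offset of the jump target relative to label, following the
   formula in the comment: label - len*eax + nnamed*len. */
static long save_jump_offset(long insn_len, long sse_regs_used,
                             long named_sse_args)
{
    return -insn_len * sse_regs_used + named_sse_args * insn_len;
}
```

With a multiplier of 5, three SSE registers in use and one named SSE argument, the target is 10 bytes before the label.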
> Jan Hubicka wrote:
>
> >Sure if it works, we should be lowering the types during gimplification
> >so we don't need to store all this in memory...
> >But C++ FE still use its local data later in stuff like thunks, but we
> >will need to cgraphize them any
Hi,
I am jumping in somewhat late, as yesterday I was in meetings without
internet access (and I will probably be offline again tomorrow).
I think that in basic terms we all mostly agree (we want to implement
optimization scheme that does not get everything into memory, we want to
parallelize the
>
> 1. Extend the register save area to put upper 128bit at the end.
> Pros:
> Aligned access.
> Save stack space if 256bit registers are used.
> Cons
> Split access. Require more split access beyond 256bit.
>
> 2. Extend the register save area to put full 256bit YMMs at the end.
>
> ymm0 and xmm0 are the same register. xmm0 is the lower 128bit
> of ymm0. I am not sure if we need separate XMM registers from
> YMM registers.
Yes, I know that xmm0 is lower part of ymm0. I still think we ought to
be able to support varargs that do save ymm0 registers only when ymm
values a
> On Fri, Jun 06, 2008 at 06:50:26AM -0700, H.J. Lu wrote:
> > On Fri, Jun 06, 2008 at 10:28:34AM +0200, Jan Hubicka wrote:
> > > >
> > > > ymm0 and xmm0 are the same register. xmm0 is the lower 128bit
> > > > of ymm0. I am not sure if we need sepa
>
> I don't understand why you want to pass __m256 and 256-bit vector values
> to anonymous arguments in registers. The only thing the vararg functions
> would do with it would be save it somewhere on the stack.
> Given the x86_64 ABI, you can't expect calling an implicitly
> prototyped or non-va
> On Tue, Jun 10, 2008 at 4:32 AM, Jan Hubicka <[EMAIL PROTECTED]> wrote:
> >>
> >> I don't understand why you want to pass __m256 and 256-bit vector values
> >> to anonymous arguments in registers. The only thing the vararg functions
> >> would
> On Tue, Jun 10, 2008 at 8:11 AM, Jakub Jelinek <[EMAIL PROTECTED]> wrote:
> > On Tue, Jun 10, 2008 at 04:50:14PM +0200, Jan Hubicka wrote:
> >> 1) make __m256 passed on stack on variadic functions and in registers
> >> otherwise. Then we don't need
> On Wed, Jun 11, 2008 at 07:49:12AM -0700, H.J. Lu wrote:
> > > I guess we all agree on passing variadic arguments on stack (that is
> > > only those belonging on ...) and rest in registers. It seems easiest in
> > > regard to future register set extensions too. Only negative thing is
> > > that
> Hello,
> In our GCC porting, we use newlib instead of libc. Today I tried to use
> profiling feedback based optimization with option -fprofile-arcs. But
> the executable doesn't produce .gcda file. I examined the disassembled
> binary file and found the following functions are basically just d
> > FAIL: gcc.dg/weak/weak-6.c (test for errors, line 5)
> > FAIL: gcc.dg/weak/weak-6.c (test for excess errors)
> > FAIL: gcc.dg/weak/weak-7.c (test for errors, line 5)
> > FAIL: gcc.dg/weak/weak-7.c (test for excess errors)
>
> These look like they were caused by one of your patches.
Yes, the
> > I think that someone, though, should be committed to fixing this pass ASAP
> > after it's checked in; waiting until late August to fix it seems bad. Is
> > there someone else who can commit to working on it as a high priority after
> > the main tuples checkin?
>
> I would obviously vote in
> Jan Hubicka wrote:
> >So while the passes are probably now well in "benchmark toy" category
> >and they will need many changes to be useful in general, I think it is
> >good to have something we can test the framework at.
>
> Do these passes actually
>
> - The rest of the memory utilization difference is mostly in inlining
> (240Kb) and SSA update (50Kb).
>
> I think the main focus points should be DSE and trying to get a good
> way of measuring the memory utilization differences. Jan, any
> suggestion?
I've switched memory tester to tuples
> On Sun, Jul 27, 2008 at 1:18 PM, Gerald Pfeifer <[EMAIL PROTECTED]> wrote:
> > I believe the following happened in the last 48 or so hours; I saw
> > this triggered by my nightly Wine builds which in turn use my nightly
> > GCC builds. ;-)
> >
> > For code like the following where we have an infi
Hi,
Since most of the issues with IPCP should be fixed now and it should be as
strong as possible with the elementary textbook quality algorithm it
uses, I would like to enable it by default. I tested it on SPEC and
C++ benchmarks yesterday and didn't measure any significant improvements.
There is qu
> Jan Hubicka <[EMAIL PROTECTED]> writes:
>
> > If there are no complaints, I will enable ipcp as proposed after remaining
> > patches are tested and committed (that would be about day after tomorrow)
>
> It breaks Ada on ia64:
I was hitting the same problem on x86
Hi,
after IRA I've re-done x86-64 SPECint testing (SPECfp, CSiBE and the C++
benchmark failed because the tree was broken at that point; I will get
results tomorrow, but there were no surprises before) also with
the new code to eliminate arguments. Luis also did PPC SPEC runs.
The most important re
Hi,
tonight's testing on x86_64, i386 and IA-64 didn't seem to bring any new
surprises, so I've committed the following patch.
I will also update the changes page of 4.4.
* doc/invoke.texi (-fipa-cp): Enabled by default at -O2/-Os/-O3
(-fipa-cp-clone): Enabled by default at -O3.
*