from:"chrbr at gcc dot gnu dot org"

[Bug target/29996] sh-elf: should enable -fomit-frame-pointer

2010-07-16 Thread chrbr at gcc dot gnu dot org



--- Comment #1 from chrbr at gcc dot gnu dot org  2010-07-16 11:34 ---
Done since http://gcc.gnu.org/ml/gcc-patches/2010-01/msg01147.html

This needed ACCUMULATE_OUTGOING_ARGS to fix unwinding (you can go back to the
previous behavior with -mno-accumulate-outgoing-args -fno-omit-frame-pointer).


-- 

chrbr at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29996

[Bug target/42841] [4.3/4.4/4.5 Regression] SH: Assembler complains pcrel too far.

2010-02-10 Thread chrbr at gcc dot gnu dot org



--- Comment #36 from chrbr at gcc dot gnu dot org  2010-02-10 12:02 ---
(In reply to comment #33)
> Your fix of the middle end looks plausible but I think the target
> shouldn't generate a CP at the eh landing pad anyway.  I'll commit
> the hunk below anyway after your patch for pic problem is installed.

Done, you can commit your workaround.

> @@ -4654,6 +4654,13 @@ find_barrier (int num_mova, rtx mova, rt
>    if (last_got)
>      from = PREV_INSN (last_got);
>  
> +  /* Don't insert the constant pool table at the position which
> +     may be the landing pad.  */
> +  if (flag_exceptions
> +      && CALL_P (from)
> +      && find_reg_note (from, REG_EH_REGION, NULL_RTX))
> +    from = PREV_INSN (from);
> +
>    /* Walk back to be just before any jump or label.
>       Putting it before a label reduces the number of times the branch
>       around the constant pool table will be hit.  Putting it before
 


-- 

chrbr at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42841

[Bug target/42841] [4.3/4.4/4.5 Regression] SH: Assembler complains pcrel too far.

2010-02-05 Thread chrbr at gcc dot gnu dot org



--- Comment #34 from chrbr at gcc dot gnu dot org  2010-02-05 08:26 ---
(In reply to comment #33)
> Your fix of the middle end looks plausible but I think the target
> shouldn't generate a CP at the eh landing pad anyway.  I'll commit
> the hunk below anyway after your patch for pic problem is installed.

OK. I didn't check the code quality difference between the middle-end fix and
yours. Since there is no fallthrough to the landing pad, and locality with the
upcoming exception region is not important (assuming that the exception
handler is not on the critical path), I was expecting, on the contrary, that
the landing pad would be a good place for the constant pool.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42841

[Bug target/42841] [4.3/4.4/4.5 Regression] SH: Assembler complains pcrel too far.

2010-02-04 Thread chrbr at gcc dot gnu dot org



--- Comment #32 from chrbr at gcc dot gnu dot org  2010-02-05 07:05 ---
> Looks smart and clean!  One minor nit, I guess that the occurrence of
> gbr and GBR in ChangeLog and comments should be replaced with GOT to
> avoid confusion with GBR register of SH CPU.

Thanks for catching this error in the comment. I meant GP of course, which
is even preferable to GOT (which is what we load, not what we compute).

(In reply to comment #31)
> When you propose it to the list, could you please separate the third
> hunk which is for the original PR42841 as an independent patch.  Also
> don't forget to update the copyright years in the first one.

OK, it was also my intention to submit the 3rd hunk (the one that fixes the
jump to the landing pad around the constant table, right?) as a separate patch,
as it will require the approval of a middle-end maintainer.
If it cannot go into the trunk before the 4.5 freeze, I suggest that you commit
your workaround (comment #23) so as not to block the regression. Then we can
revert it once the proper patch has been discussed and accepted. (I'm a little
bit late for that, sorry.)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42841

[Bug target/42841] [4.3/4.4/4.5 Regression] SH: Assembler complains pcrel too far.

2010-02-03 Thread chrbr at gcc dot gnu dot org



--- Comment #28 from chrbr at gcc dot gnu dot org  2010-02-03 08:30 ---
Hello Kaj, thanks for your proposal.

But I'm wondering whether preventing the scheduling of the mov.l and mova
instructions isn't overkill. (sh_reorg comes after the scheduler, but even if
it didn't, it should be OK to move instructions up. The R0 live range between
the add and the load is another, more general problem.)
Am I missing something?

We only want to prevent the CP from being inserted between those 2 instructions;
it's not necessary to have more blockages. I'm working on something that tracks
the GOT loading access during the find_barrier walk and then reverts at the end
to the latest safe place. It is OK on the example, but the full Linux
distribution rebuild and validation is still ongoing.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42841

[Bug target/42841] [4.3/4.4/4.5 Regression] SH: Assembler complains pcrel too far.

2010-02-03 Thread chrbr at gcc dot gnu dot org



--- Comment #30 from chrbr at gcc dot gnu dot org  2010-02-03 13:12 ---
Created an attachment (id=19794)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19794&action=view)
patch to fix GOT access load with constant pool

Patch under validation.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42841

[Bug target/42841] [4.3/4.4/4.5 Regression] SH: Assembler complains pcrel too far.

2010-02-01 Thread chrbr at gcc dot gnu dot org



--- Comment #26 from chrbr at gcc dot gnu dot org  2010-02-01 16:30 ---
I'm afraid the unaligned access SIGBUS regression is another latent bug just
exposed by the fix for the original PR :-(

What happens is that the GOT loading sequence is broken by a constant pool.
We end up emitting:

        mov.l   .L542,r12       (X)
        bra     .L516
        nop
.L542:
        .align 2
        .long   _GLOBAL_OFFSET_TABLE_
        ...
.L516:
        mova    .L545,r0        (Y) !
        add     r0,r12
.L545:
        .long   _GLOBAL_OFFSET_TABLE_

The reason is that the second mova instruction is unluckily now out of
range by 2 bytes (which could happen in any other situation, even without
this patch).
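
To make the range problem concrete, here is a rough C sketch of how far a PC-relative mova or mov.l can reach on SH. It assumes the usual encoding (an unsigned 8-bit displacement scaled by 4, relative to (PC & ~3) + 4) and ignores delay-slot subtleties. The helper name pcrel_long_in_range is made up for illustration; this is not code from sh.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: can an SH "mova" or "mov.l @(disp,PC)" located at
   INSN_ADDR reach a 4-byte-aligned constant at TARGET_ADDR?
   Assumption: disp is an unsigned 8-bit field scaled by 4 and applied
   to (INSN_ADDR & ~3) + 4, so only forward references work.  */
bool
pcrel_long_in_range (uint32_t insn_addr, uint32_t target_addr)
{
  uint32_t base = (insn_addr & ~(uint32_t) 3) + 4;

  /* Backward or misaligned targets cannot be encoded.  */
  if (target_addr < base || (target_addr & 3) != 0)
    return false;

  /* The scaled displacement must fit in 8 bits.  */
  return (target_addr - base) / 4 <= 255;
}
```

With these assumptions the farthest reachable constant sits 1020 bytes past the base, so a pool entry pushed slightly further by one extra instruction or by alignment padding is enough to force a second, duplicated constant.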

IMHO we should forbid the duplication of a _GLOBAL_OFFSET_TABLE_ loading
constant while in an UNSPEC_MOVA sequence.

We should probably reduce si_limit in find_barrier when a
(set (reg:SI 0 r0)
    (unspec:SI [
        (const:SI (unspec:SI [
            (symbol_ref (*_GLOBAL_OFFSET_TABLE_))
is met and the next is
(set (reg:SI 12 r12)
    (const:SI (unspec:SI [
        (symbol_ref (*_GLOBAL_OFFSET_TABLE_)

in PIC.

I'm experimenting with a couple of different solutions in this direction.

This PR was a really interesting bug finder!


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42841

[Bug target/42841] [4.3/4.4/4.5 Regression] SH: Assembler complains pcrel too far.

2010-01-29 Thread chrbr at gcc dot gnu dot org



--- Comment #25 from chrbr at gcc dot gnu dot org  2010-01-29 08:59 ---
By the way, FYI, to try to explain the differences between your results and
mine for sh4-linux: my build is configured with --enable-target-optspace,
so all my runtime build tests are run with -Os, not -O2 like yours, which could
make a huge difference in CP layout...
I'll rerun with -O2 over the weekend.
Cheers


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42841

[Bug target/42841] [4.3/4.4/4.5 Regression] SH: Assembler complains pcrel too far.

2010-01-28 Thread chrbr at gcc dot gnu dot org



--- Comment #22 from chrbr at gcc dot gnu dot org  2010-01-28 13:09 ---
Hmm, looks like a latent bug. By accident the CP is inserted before a
compact_jump, which enables a further redirect-jump optimisation. I don't think
it is directly related to the fix, but let's work on it a little more.

So we have, just before dbr:
jump_insn -> 2586
a constant pool
L2586
jump_insn -> 3394

L3394: ...

Then in reorg_redirect_jump we redirect the jump over the CP and call
delete_related_insn, so the code between the CP and the jump becomes dead.

And we have

jump_insn -> 3394
a constant pool

L3394
...

But the label L2586 is used in the exception table... and thus remains
undefined.

Now my question: how can the exception table refer to a region delimited by
deleted labels? It should be built after dbr, shouldn't it?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42841

[Bug target/42841] [4.3/4.4/4.5 Regression] SH: Assembler complains pcrel too far.

2010-01-28 Thread chrbr at gcc dot gnu dot org



--- Comment #24 from chrbr at gcc dot gnu dot org  2010-01-29 07:46 ---
Created an attachment (id=19747)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19747&action=view)
fixed removal of landing pad label rtx

The landing_pad label rtx was created and recorded in tree_inline
(duplicate_eh_regions). It seems that reorg_redirect_jump or delete_insn should
check for it before deciding that it can be removed.

I'm testing this patch, which does this.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42841

[Bug target/42841] [4.3/4.4/4.5 Regression] SH: Assembler complains pcrel too far.

2010-01-27 Thread chrbr at gcc dot gnu dot org



--- Comment #17 from chrbr at gcc dot gnu dot org  2010-01-27 12:50 ---
Strange, I didn't see that, not even the undefined symbol in the assembler.

OK, I'll disable the fix until this is clarified.

Let me recheck on the silicon; I will let you know.

-c

(In reply to comment #16)
> I've got some new libstdc++-v3 testsuite failures with the patch
> on my nightly sh4-linux tester:
>
> Running
> /exp/ldroot/dodes/ORIG/trunk/libstdc++-v3/testsuite/libstdc++-dg/conformance.exp
> ...
> FAIL: 23_containers/deque/requirements/exception/basic.cc (test for excess errors)
> WARNING: 23_containers/deque/requirements/exception/basic.cc compilation failed to produce executable
> FAIL: 23_containers/deque/requirements/exception/propagation_consistent.cc (test for excess errors)
> WARNING: 23_containers/deque/requirements/exception/propagation_consistent.cc compilation failed to produce executable
> FAIL: 30_threads/packaged_task/members/get_future.cc execution test
> FAIL: 30_threads/shared_future/members/get.cc execution test
>
> The first failure is
>
> /tmp/ccl5TCl4.s: Assembler messages:
> /tmp/ccl5TCl4.s:43070: Error: undefined symbol `.L3394' in operation
>
> FAIL: 23_containers/deque/requirements/exception/basic.cc (test for excess errors)
>
> The last 2 failures result from unaligned accesses.  I saw
>
> Sending SIGBUS to get_future.exe due to unaligned access (PC 296554a8 PR 2965549a)
> Sending SIGBUS to get.exe due to unaligned access (PC 296554a8 PR 2965549a)
>
> on the target machine.
> With reverting the first hunk of the patch, these errors go away.
> Christian, could you please revert or disable the first hunk
> of patches temporarily?  Sorry I didn't catch this earlier.
 


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42841

[Bug target/42841] [4.3/4.4/4.5 Regression] SH: Assembler complains pcrel too far.

2010-01-27 Thread chrbr at gcc dot gnu dot org



--- Comment #18 from chrbr at gcc dot gnu dot org  2010-01-27 13:24 ---
Subject: Bug 42841

Author: chrbr
Date: Wed Jan 27 13:24:40 2010
New Revision: 156282

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=156282
Log:
temporarily revert fix for PR target/42841

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/sh/sh.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42841

[Bug target/42841] [4.3/4.4/4.5 Regression] SH: Assembler complains pcrel too far.

2010-01-27 Thread chrbr at gcc dot gnu dot org



--- Comment #19 from chrbr at gcc dot gnu dot org  2010-01-27 13:40 ---
to make sure we are in the same testing/configuration environment could you
please send me the preprocessed file for
23_containers/deque/requirements/exception/propagation_consistent.cc as well as
the compilation line in libstdc++.log that you used ?

many thanks

Christian


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42841

[Bug target/42841] [4.3/4.4/4.5 Regression] SH: Assembler complains pcrel too far.

2010-01-27 Thread chrbr at gcc dot gnu dot org



--- Comment #21 from chrbr at gcc dot gnu dot org  2010-01-27 18:13 ---
This one is marked as unsupported in my sh-superh-elf log, but I can reproduce
it now on sh4-linux (even though I have rebuilt a whole distribution without
seeing it :O).

Anyway I'm investigating. I'm reopening the bug and will revert in the branches
as well if I don't find a quick solution.

Regards

(In reply to comment #20)
> Created an attachment (id=19729)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19729&action=view)
> A test case
>
> cc1plus -std=gnu++0x -O2 propagation_consistent.ii produces
> a problematic code here.
 


-- 

chrbr at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42841

[Bug target/42841] [4.3/4.4/4.5 Regression] SH: Assembler complains pcrel too far.

2010-01-25 Thread chrbr at gcc dot gnu dot org



--- Comment #11 from chrbr at gcc dot gnu dot org  2010-01-26 07:20 ---
Subject: Bug 42841

Author: chrbr
Date: Tue Jan 26 07:20:27 2010
New Revision: 156229

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=156229
Log:
fix PR target/42841

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/sh/sh.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42841

[Bug target/42841] [4.3/4.4/4.5 Regression] SH: Assembler complains pcrel too far.

2010-01-25 Thread chrbr at gcc dot gnu dot org



--- Comment #12 from chrbr at gcc dot gnu dot org  2010-01-26 07:22 ---
Subject: Bug 42841

Author: chrbr
Date: Tue Jan 26 07:21:57 2010
New Revision: 156230

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=156230
Log:
fix PR target/42841

Modified:
branches/gcc-4_4-branch/gcc/ChangeLog
branches/gcc-4_4-branch/gcc/config/sh/sh.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42841

[Bug target/42841] [4.3/4.4/4.5 Regression] SH: Assembler complains pcrel too far.

2010-01-25 Thread chrbr at gcc dot gnu dot org



--- Comment #13 from chrbr at gcc dot gnu dot org  2010-01-26 07:28 ---
Subject: Bug 42841

Author: chrbr
Date: Tue Jan 26 07:28:05 2010
New Revision: 156231

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=156231
Log:
fix PR target/42841

Modified:
branches/gcc-4_3-branch/gcc/ChangeLog
branches/gcc-4_3-branch/gcc/config/sh/sh.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42841

[Bug target/42841] [4.3/4.4/4.5 Regression] SH: Assembler complains pcrel too far.

2010-01-25 Thread chrbr at gcc dot gnu dot org



--- Comment #14 from chrbr at gcc dot gnu dot org  2010-01-26 07:29 ---
fixed in 4.5, 4.3 and 4.4


-- 

chrbr at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED
   Target Milestone|--- |4.5.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42841

[Bug treelang/41639] New: synchronisation primitives take unsigned as input and output values.

2009-10-09 Thread chrbr at gcc dot gnu dot org

The current implementation of the synchronization builtins in gcc (from
http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/cpp/lin/compiler_c/intref_cls/common/intref_itanium_synchro_prim.htm)
describes the type as unsigned, although it is stated that the type is either a
32-bit or 64-bit integer.

Consequently, testsuite tests such as sync-2.c:

  if (__sync_sub_and_fetch(AI+13, 12) != (char)-12)
    abort ();

might fail (unless the target/runtime dependent primitive implementation
artificially changes the return type).
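
The effect of the signedness can be seen in a small standalone sketch. It assumes a GCC-compatible compiler that provides the __sync builtins for byte-sized objects; the two function names are made up here, and the code only mirrors the comparison in sync-2.c, it is not the testsuite file.

```c
#include <stdbool.h>

/* The builtin returns the pointed-to type.  With a signed byte,
   0 - 12 is -12, which compares equal to (signed char) -12.  */
bool
signed_result_matches (void)
{
  signed char v = 0;

  return __sync_sub_and_fetch (&v, 12) == (signed char) -12;
}

/* With an unsigned byte the same operation wraps to 244, which is
   promoted to the int 244 and no longer compares equal to -12.  */
bool
unsigned_result_matches (void)
{
  unsigned char v = 0;

  return __sync_sub_and_fetch (&v, 12) == (signed char) -12;
}
```

A runtime helper that always hands back an unsigned value would behave like the second function, which is the failure mode described above.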


-- 
   Summary: synchronisation primitives take unsigned as input and
output values.
   Product: gcc
   Version: 4.5.0
Status: UNCONFIRMED
  Severity: trivial
  Priority: P3
 Component: treelang
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: chrbr at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41639

[Bug treelang/41639] synchronisation primitives take unsigned as input and output values.

2009-10-09 Thread chrbr at gcc dot gnu dot org



--- Comment #1 from chrbr at gcc dot gnu dot org  2009-10-09 07:12 ---
Created an attachment (id=18758)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18758&action=view)
Fix synchronisation parameter/output signedness

The attached patch gives the correct semantics, but it should be checked on the
targets using them (pa/arm) for possible legacy regressions.

(Tested on SH with a non-Linux, in-house runtime implementation.)

2009-10-08  Christian Bruel  christian.br...@st.com

* builtin-types.def (BT_I[1,2,4,8,16): Set signed.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41639

[Bug tree-optimization/41486] New: cselim is not dse aware

2009-09-28 Thread chrbr at gcc dot gnu dot org

The cselim pass introduces a conditional store, but does not remove the
original one. If the former is not removed by DSE, this results in worse code.

Original thread: http://gcc.gnu.org/ml/gcc-patches/2009-09/msg01955.html.

On machines with no predicated stores, disabling this optimization is generally
a win, but only as a workaround.
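
As a rough illustration of the kind of rewrite involved, here are two hand-written C functions with the same observable effect. The second is only my reading of what a conditional-store rewrite looks like at source level (a select followed by a store); it is not GCC's GIMPLE output, and the function names are invented.

```c
/* Source form: the second store only happens when COND is true.  */
void
store_guarded (int *p, int a, int b, int cond)
{
  *p = a;
  if (cond)
    *p = b;
}

/* Rewritten form: the guarded store becomes a select plus a store.
   The first store to *p is now redundant, but it only disappears if a
   later dead-store pass removes it.  */
void
store_selected (int *p, int a, int b, int cond)
{
  int tmp;

  *p = a;
  tmp = cond ? b : a;
  *p = tmp;
}
```

On a target without predicated stores the second form costs two stores per call whenever the first one survives, which is the worse code the report refers to.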


-- 
   Summary: cselim is not dse aware
   Product: gcc
   Version: tree-ssa
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: chrbr at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41486

[Bug tree-optimization/41486] cselim is not dse aware

2009-09-28 Thread chrbr at gcc dot gnu dot org



--- Comment #1 from chrbr at gcc dot gnu dot org  2009-09-28 10:53 ---
Created an attachment (id=18665)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18665&action=view)
case


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41486

[Bug target/39423] [SH] performance regression: lost mov @(disp,Rn)

2009-03-12 Thread chrbr at gcc dot gnu dot org



--- Comment #9 from chrbr at gcc dot gnu dot org  2009-03-12 15:04 ---
The attached patch improves the SH code generation, but I noticed a small
regression on ARM, which could previously make use of a shifted constant
addressing mode and so did not need the extra register for the value.

A target description check should be done while expanding the
canonicalization, that should not be done only when a base+cst addressing mode
exists and the cst must be shifted in a register. Any input is welcome on how
to make this transformation target-conditional.

No performance impact was noticed on i686, however.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39423

[Bug target/39423] [SH] performance regression: lost mov @(disp,Rn)

2009-03-12 Thread chrbr at gcc dot gnu dot org



--- Comment #10 from chrbr at gcc dot gnu dot org  2009-03-12 15:10 ---
Created an attachment (id=17447)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17447&action=view)
SH illustrative patch

For feedback only: a win on SH, a loss on ARM.

2009-03-12  Christian Bruel  christian.br...@st.com

* fold-const.c (fold_plusminus_mult_expr): Move canonicalization of
 index+cst...
* expr.c (expand_expr_real_1): ... here.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39423

[Bug target/39423] [SH] performance regression: lost mov @(disp,Rn)

2009-03-11 Thread chrbr at gcc dot gnu dot org



--- Comment #2 from chrbr at gcc dot gnu dot org  2009-03-11 08:46 ---
I observed some large performance regressions in 4.3 and upwards on many
benchmarks for the superh targets. There are many causes, but the main one
reduces to the indirect+offset access:

int
foo (int tab[], int index)
{
  return tab [index+1];
}

compiles (-O2 -fomit-frame-pointer) into
        mov     r5,r0
        add     #1,r0
        shll2   r0
        rts
        mov.l   @(r0,r4),r0

instead of
        shll2   r5
        add     r4,r5
        rts
        mov.l   @(4,r5),r0

Note that in more complex code the problem is amplified, because only the r0
register class can be used as an indirect index register, which puts extra
pressure on reload.

It seems to me that the problem is in the way the constant index is now
hidden by gimple, so we now have

 return *(tab + ((unsigned int) index + 1) * 4)

instead of

 return *(tab + 4B + (int *) ((unsigned int) index * 4))

It seems easier to change gimple, but this is a target-dependent
transformation. On the other hand, the RTL code gen should be able to
redistribute the factorization, but that seems like extra work to undo what
was done previously.
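
The two gimple forms compute the same byte offset; only the shape differs. A small sketch in C (function names are mine, and unsigned 32-bit arithmetic stands in for the "unsigned int" in the dumps above):

```c
#include <stdint.h>

/* Offset as the newer form computes it: the constant is folded into
   the index before scaling, so no displacement is visible.  */
uint32_t
offset_factored (uint32_t index)
{
  return (index + 1u) * 4u;
}

/* Offset with the multiplication distributed: the scaled index plus a
   constant 4, which is what a base+displacement load can absorb.  */
uint32_t
offset_distributed (uint32_t index)
{
  return index * 4u + 4u;
}
```

Because the arithmetic is unsigned, the two functions agree for every input, including values that wrap, so distributing the multiplication is always valid here; the open question is only where in the compiler to do it.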


-- 

chrbr at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|RESOLVED|UNCONFIRMED
 Resolution|INVALID |
 Summary|[   |[SH]  performance regression: lost mov @(disp,Rn)


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39423

[Bug target/39423] [SH] performance regression: lost mov @(disp,Rn)

2009-03-11 Thread chrbr at gcc dot gnu dot org



-- 

chrbr at gcc dot gnu dot org changed:

   What|Removed |Added

 AssignedTo|unassigned at gcc dot gnu dot org |chrbr at gcc dot gnu dot org
 Status|UNCONFIRMED |ASSIGNED
 Ever Confirmed|0   |1
 Last reconfirmed date|-00-00 00:00:00 |2009-03-11 08:52:58


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39423

[Bug target/39423] [SH] performance regression: lost mov @(disp,Rn)

2009-03-11 Thread chrbr at gcc dot gnu dot org



--- Comment #4 from chrbr at gcc dot gnu dot org  2009-03-11 09:30 ---
(In reply to comment #3)
> See http://gcc.gnu.org/ml/gcc-patches/2008-12/msg01134.html

Thanks, I tried your patch against a 4.3.3 base but it didn't fix the problem.
Your patch canonicalizes, while what I need is a distribution:
(base + 1) * 4 = base * 4 + 4


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39423

[Bug target/39423] [SH] performance regression: lost mov @(disp,Rn)

2009-03-11 Thread chrbr at gcc dot gnu dot org



--- Comment #8 from chrbr at gcc dot gnu dot org  2009-03-11 14:07 ---
I have selectively disabled the canonicalization in fold_plusminus_mult_expr for
identical constants that are powers of 2, so my mov @(disp,Rn) is back :-(. For
some reason your patch lets the base+index computation factorization through.

This is experimental for now, because expand_expr needs to be extended to
repair expressions like return ((a * 4) + 4) that are not an indirect_ref
(thankfully we distinguish PLUS_EXPR from POINTER_PLUS_EXPR).

+++ fold-const.c        2009-03-11 13:49:40.0 +0100
@@ -7431,7 +7431,10 @@
   same = NULL_TREE;

   if (operand_equal_p (arg01, arg11, 0))
-    same = arg01, alt0 = arg00, alt1 = arg10;
+    {
+      if (code != PLUS_EXPR || exact_log2 (TREE_INT_CST_LOW (arg01)) == -1)
+       same = arg01, alt0 = arg00, alt1 = arg10;
+    }
   else if (operand_equal_p (arg00, arg10, 0))
     same = arg00, alt0 = arg01, alt1 = arg11;
   else if (operand_equal_p (arg00, arg11, 0))
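
For readers not familiar with exact_log2, the condition in the hunk can be read as the following standalone sketch. exact_log2_sketch is a stand-in written for this note, not GCC's implementation, and should_factor with its is_plus flag is a hypothetical wrapper around the test "code != PLUS_EXPR || exact_log2 (...) == -1".

```c
/* Stand-in for exact_log2: return log2 (X) when X is a power of two,
   otherwise -1 (including for X == 0).  */
int
exact_log2_sketch (unsigned long x)
{
  int n = 0;

  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* Mirror of the patched condition: keep factoring out the common
   multiplier unless this is an addition whose common factor is a
   power of two.  IS_PLUS stands for "code == PLUS_EXPR".  */
int
should_factor (int is_plus, unsigned long common_factor)
{
  return !is_plus || exact_log2_sketch (common_factor) == -1;
}
```

So index * 4 + 1 * 4 is left alone (4 is a power of two), while something like a * 6 + b * 6 is still factored into (a + b) * 6.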


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39423

[Bug target/39423] New: [

2009-03-10 Thread chrbr at gcc dot gnu dot org




-- 
   Summary: [
   Product: gcc
   Version: 4.3.3
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: chrbr at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39423

[Bug c++/39391] New: argument dependant name lookup don't catch pointer to function

2009-03-06 Thread chrbr at gcc dot gnu dot org

Ref: ISO/IEC C++ section 3.4.2

gcc correctly reports an error when the argument is of a fundamental type
and the associated namespace set is empty, like the call to 'f' in the attached
example.

However, if the argument is a pointer to function, the associated namespace
should be the one associated with the function. So it seems to me that the call
to 'h' should not generate an error.


-- 
   Summary: argument dependant name lookup don't catch pointer to
function
   Product: gcc
   Version: 4.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: chrbr at gcc dot gnu dot org
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39391

[Bug c++/39391] argument dependant name lookup don't catch pointer to function

2009-03-06 Thread chrbr at gcc dot gnu dot org



--- Comment #1 from chrbr at gcc dot gnu dot org  2009-03-06 14:05 ---
Created an attachment (id=17406)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17406&action=view)
Example


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39391

[Bug tree-optimization/35178] Misaligned Accesses on arrays of packed stucts

2008-02-18 Thread chrbr at gcc dot gnu dot org



--- Comment #4 from chrbr at gcc dot gnu dot org  2008-02-19 07:53 ---
fixed in mainline


-- 

chrbr at gcc dot gnu dot org changed:

   What|Removed |Added

 URL||http://gcc.gnu.org/ml/gcc-patches/2008-02/msg00690.html
 Status|UNCONFIRMED |RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35178

[Bug middle-end/23868] [4.1/4.2/4.3 regression] builtin_apply uses wrong mode for multi-hard-register return values

2008-02-15 Thread chrbr at gcc dot gnu dot org



--- Comment #11 from chrbr at gcc dot gnu dot org  2008-02-15 13:08 ---
If no one has started to do so, I'm resurrecting this patch for the mainline. I
can test builtin-apply4.c on sh4.

BTW, builtin-apply4.c doesn't currently fail with the testsuite because it is
restricted to { target { { i?86-*-* x86_64-*-* } && ilp32 } }


-- 

chrbr at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||kkojima at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23868

[Bug c/35178] New: Misaligned Accesses on arrays of packed stucts

2008-02-13 Thread chrbr at gcc dot gnu dot org

This bug was noticed on sh but potentially impacts other STRICT_ALIGNMENT
targets.

The attached reduced test case shows a misaligned field access from an array of
packed structs. In this example, compiled with -O2, the field wMaxPacketSize is
accessed (on sh4) with a mov.w instruction although it is only byte aligned.

It seems that tree-ssa-loop ivopts does not check packed struct offsets
indexed by an induction variable. Thus, if the struct size is not aligned, the
field becomes unaligned after the first iteration even if it was aligned
relative to the base of the structure.

The proposed patch solves this problem by extending may_be_unaligned_p to check
that a loop-carried offset is a multiple of the desired alignment.
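
A small sketch of the arithmetic behind that check. The struct layout below is invented for illustration (only the field name wMaxPacketSize comes from the report), it relies on the GCC packed attribute, and the helper names are hypothetical; this is not the code of may_be_unaligned_p.

```c
#include <stddef.h>
#include <stdint.h>

/* Invented 5-byte packed record: the 16-bit field sits at offset 2,
   so it is 2-byte aligned in element 0, but the odd element size
   breaks that for element 1.  */
struct __attribute__ ((packed)) endpoint
{
  uint16_t wFlags;
  uint16_t wMaxPacketSize;
  uint8_t bInterval;
};

/* Is the field at byte OFFSET of element I aligned to ALIGN, assuming
   the array base itself is aligned and elements are STRIDE bytes?  */
int
field_aligned_at (size_t stride, size_t offset, size_t i, size_t align)
{
  return (i * stride + offset) % align == 0;
}

/* The field is aligned on every iteration only if both its offset and
   the loop-carried step are multiples of the alignment.  */
int
field_always_aligned (size_t stride, size_t offset, size_t align)
{
  return offset % align == 0 && stride % align == 0;
}
```

For the record above, field_aligned_at holds for element 0 and fails for element 1, which is the "unaligned after the first iteration" case.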


-- 
   Summary: Misaligned Accesses on arrays of packed stucts
   Product: gcc
   Version: 4.3.0
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: chrbr at gcc dot gnu dot org
GCC target triplet: sh4-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35178

[Bug c/35178] Misaligned Accesses on arrays of packed stucts

2008-02-13 Thread chrbr at gcc dot gnu dot org



--- Comment #1 from chrbr at gcc dot gnu dot org  2008-02-13 13:42 ---
Created an attachment (id=15138)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15138&action=view)
gcc testsuite case


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35178

[Bug c/35178] Misaligned Accesses on arrays of packed stucts

2008-02-13 Thread chrbr at gcc dot gnu dot org



--- Comment #2 from chrbr at gcc dot gnu dot org  2008-02-13 13:45 ---
Created an attachment (id=15139)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15139&action=view)
Proposed patch

Regression tested on sh-superh-elf and sh4-linux-gnu,
and bootstrapped for i686-pc-linux-gnu.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35178

[Bug target/34807] New: SH4 'R0_REGS' spill failure when using asm

2008-01-16 Thread chrbr at gcc dot gnu dot org

Building the attached file (extracted and reduced from the uclibc) with
-O[1,2,3,s] -fPIC fails :

test.c: In function _start:
test.c:14: error: unable to find a register to spill in class R0_REGS
test.c:14: error: this is the insn:
(insn 16 28 17 0 (set (mem/c/i:SI (plus:SI (reg:SI 12 r12)
(reg/f:SI 1 r1 [160])) [0 buf+0 S4 A32])
(reg/v:SI 0 r0 [ __sc0 ])) 179 {movsi_ie} (insn_list:REG_DEP_TRUE 13
(insn_list:REG_DEP_TRUE 11 (nil)))
(expr_list:REG_DEAD (reg/f:SI 1 r1 [160])
(expr_list:REG_DEAD (reg/v:SI 0 r0 [ __sc0 ])
(nil
test.c:14: internal compiler error: in spill_failure


-- 
   Summary: SH4 R0_REGS spill failure when using asm
   Product: gcc
   Version: 4.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: chrbr at gcc dot gnu dot org
ReportedBy: chrbr at gcc dot gnu dot org
GCC target triplet: sh-superh-elf,sh4-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34807

[Bug target/34807] SH4 'R0_REGS' spill failure when using asm

2008-01-16 Thread chrbr at gcc dot gnu dot org



--- Comment #1 from chrbr at gcc dot gnu dot org  2008-01-16 08:47 ---
Created an attachment (id=14945)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14945&action=view)
Test case

build with sh-superh-elf-gcc -O1 -fPIC test.c -S


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34807

[Bug target/34807] SH4 'R0_REGS' spill failure when using asm

2008-01-16 Thread chrbr at gcc dot gnu dot org



--- Comment #2 from chrbr at gcc dot gnu dot org  2008-01-16 11:15 ---
Created an attachment (id=14946)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14946&action=view)
fails with 4.2.2 and 4.3.0


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34807

[Bug c++/19531] NRV is performed on volatile temporary

2007-10-31 Thread chrbr at gcc dot gnu dot org



--- Comment #7 from chrbr at gcc dot gnu dot org  2007-10-31 07:56 ---
Subject: Bug 19531

Author: chrbr
Date: Wed Oct 31 07:55:46 2007
New Revision: 129792

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=129792
Log:
fix PR c++/19531: NRV is performed on volatile temporary

Added:
trunk/gcc/testsuite/g++.dg/opt/nrv8.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/typeck.c
trunk/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19531

[Bug c++/19531] NRV is performed on volatile temporary

2007-10-31 Thread chrbr at gcc dot gnu dot org



--- Comment #8 from chrbr at gcc dot gnu dot org  2007-10-31 08:01 ---
fixed check_return_expr


-- 

chrbr at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19531

[Bug rtl-optimization/15473] Sibcall optimization for libcalls.

2007-10-09 Thread chrbr at gcc dot gnu dot org



--- Comment #4 from chrbr at gcc dot gnu dot org  2007-10-09 08:36 ---
*** Bug 32684 has been marked as a duplicate of this bug. ***


-- 

chrbr at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||pinskia at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15473

[Bug tree-optimization/32684] Missed tail call with sin/cos and sincos pass

2007-10-09 Thread chrbr at gcc dot gnu dot org



--- Comment #2 from chrbr at gcc dot gnu dot org  2007-10-09 08:36 ---
I think this is a duplicate of #15473 (Sibcall optimization for libcalls).

*** This bug has been marked as a duplicate of 15473 ***


-- 

chrbr at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||DUPLICATE


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32684

[Bug tree-optimization/32684] Missed tail call with sin/cos and sincos pass

2007-10-09 Thread chrbr at gcc dot gnu dot org



--- Comment #5 from chrbr at gcc dot gnu dot org  2007-10-09 13:12 ---
You are right, it's not a sibcall, my mistake.
But even at the tree level I still don't see the builtin marked as a tail call.
On a reduced case, when entering find_tail_calls I have

D.1177_2 = __builtin_cos (phi_1(D));
D.1176_3 = COMPLEX_EXPR <D.1177_2, 0.0>;
return D.1176_3;

and this is not recognized as a tail call candidate because the
GIMPLE_MODIFY_STMT operand 1 is a complex_expr, not a call.

Note that in the absence of a complex_expr, such as with a builtin_memset, all
is fine.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32684

[Bug tree-optimization/32684] Missed tail call with sin/cos and sincos pass

2007-10-09 Thread chrbr at gcc dot gnu dot org



--- Comment #6 from chrbr at gcc dot gnu dot org  2007-10-09 13:15 ---
> you are right, it's not a sibcall, my mistake.

typo, I meant libcall not sibcall


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32684

[Bug c++/19531] NRV is performed on volatile temporary

2007-10-08 Thread chrbr at gcc dot gnu dot org



--- Comment #6 from chrbr at gcc dot gnu dot org  2007-10-08 11:02 ---
patch+testcase at http://gcc.gnu.org/ml/gcc-patches/2007-09/msg01902.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19531

[Bug c++/19531] NRV is performed on volatile temporary

2007-09-24 Thread chrbr at gcc dot gnu dot org



--- Comment #4 from chrbr at gcc dot gnu dot org  2007-09-24 07:10 ---
Created an attachment (id=14248)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14248&action=view)
volatile nrv patch


-- 

chrbr at gcc dot gnu dot org changed:

   What|Removed |Added

 AssignedTo|unassigned at gcc dot gnu   |chrbr at gcc dot gnu dot org
   |dot org |
 Status|NEW |ASSIGNED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19531

[Bug c++/19531] NRV is performed on volatile temporary

2007-09-24 Thread chrbr at gcc dot gnu dot org



--- Comment #5 from chrbr at gcc dot gnu dot org  2007-09-24 07:14 ---
the attached patch was hanging in my sandbox. will submit it along with a
testsuite case.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19531

[Bug rtl-optimization/15473] Sibcall optimization for libcalls.

2007-09-03 Thread chrbr at gcc dot gnu dot org



--- Comment #3 from chrbr at gcc dot gnu dot org  2007-09-03 13:24 ---
This report is quite old, but worth reviving:

We found similar problems with implicit memory block copies when structs are
copied by value (frequent in C++).

Soft-float architectures, which make very extensive use of libcalls, are also
very sensitive to this lost optimisation (it is a performance regression, since
the optimisation was done correctly by gcc 3.4.3). At that time the rtl was
emitted both for normal calls and sibling calls and stored in a placeholder;
which of the two to emit was decided after all the stmts were expanded. Since
gcc 4.0 the placeholders have disappeared, so we lost the ability to optimise
libcalls in the backend.

I will try to make use of the cfg information available in expand to decide if
we can pass BLOCK_OP_TAILCALL to emit_block_move. I expect that libcalls can
share the same interface.
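
As a rough illustration of the two situations described above, here is a
minimal C sketch. The type and function names are hypothetical. It assumes a
target where a struct copy of this size is expanded into a block-move call
(commonly memcpy) and where float arithmetic is lowered to a runtime library
routine (for example __addsf3 in libgcc on soft-float targets).

```c
/* A struct large enough that a by-value copy is usually not expanded
   inline on small targets.  */
struct packet {
    int words[64];
};

/* Aggregate assignment: the copy is the last thing the function does,
   so if it becomes a call, that call sits in tail position.  */
void copy_packet(struct packet *dst, const struct packet *src)
{
    *dst = *src;
}

/* On a soft-float target this addition typically becomes a libcall,
   again as the last operation before the return.  */
float add_floats(float a, float b)
{
    return a + b;
}
```

In both functions nothing remains to be done after the implicit call, which is
the case where a jump to the library routine could replace a call and return.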


-- 

chrbr at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||chrbr at gcc dot gnu dot org
   Last reconfirmed|2006-03-01 02:40:48 |2007-09-03 13:24:57
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15473

[Bug target/29953] [SH-4] Perfomance regression in loops. cmp/eq used instead of dt

2007-06-08 Thread chrbr at gcc dot gnu dot org

--- Comment #7 from chrbr at gcc dot gnu dot org  2007-06-08 07:58 ---
Subject: Bug 29953

Author: chrbr
Date: Fri Jun  8 07:58:41 2007
New Revision: 125564

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125564
Log:
PR target/29953
* config/sh/sh.md (doloop_end): New pattern and splitter.
* loop-iv.c (simple_rhs_p): Check for hardware registers.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/sh/sh.md
trunk/gcc/loop-iv.c

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29953

[Bug target/29953] [SH-4] Perfomance regression in loops. cmp/eq used instead of dt

2007-06-08 Thread chrbr at gcc dot gnu dot org



--- Comment #8 from chrbr at gcc dot gnu dot org  2007-06-08 08:18 ---
doloop_optimize does the iv inversion with the doloop_end insn support in the
machine description.


-- 

chrbr at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED
   Target Milestone|--- |4.3.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29953

[Bug target/29953] [SH-4] Perfomance regression in loops. cmp/eq used instead of dt

2007-05-15 Thread chrbr at gcc dot gnu dot org



--- Comment #6 from chrbr at gcc dot gnu dot org  2007-05-15 10:30 ---

I dropped the 4.1 and implemented a -finvert-loops option on the trunk.

This option allows a basic induction variable to be decremented instead of
incremented to support exit testing against 0.
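
The transformation can be pictured at the source level with the following
minimal C sketch. The function names are hypothetical and the code is only an
illustration of the idea, not the output of the option.

```c
/* Count-up form: the exit test compares i against n on every
   iteration.  */
long sum_count_up(const int *data, int n)
{
    long total = 0;
    for (int i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Inverted form: a counter runs from n down to 0, so the exit test is
   a comparison against zero, which a decrement-and-test instruction
   (such as dt on SH) can combine with the decrement.  */
long sum_count_down(const int *data, int n)
{
    long total = 0;
    for (int remaining = n; remaining > 0; remaining--)
        total += *data++;
    return total;
}
```

Both functions visit the same elements in the same order; only the variable
that controls the loop exit changes.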

I'm validating a patch on intel and sh. 


-- 

chrbr at gcc dot gnu dot org changed:

   What|Removed |Added

 AssignedTo|unassigned at gcc dot gnu   |chrbr at gcc dot gnu dot org
   |dot org |
 Status|NEW |ASSIGNED
   Last reconfirmed|2007-04-03 16:34:17 |2007-05-15 10:30:36
   date||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29953

[Bug target/31403] wrong branch instructions generated with -m2a on sh-elf

2007-04-23 Thread chrbr at gcc dot gnu dot org



--- Comment #3 from chrbr at gcc dot gnu dot org  2007-04-23 07:59 ---
Hi Kaj,

The same problem seems to affect the movsf_ie pattern for the sh2a-fpu, which
also has 32-bit memory instructions. So your fix also applies there.

Note that traditional sh memory move instructions can also have a length of 2,
so your fix is conservative (but no more so than the previous code). Shouldn't
the new 4-byte instructions be described later with a new memory constraint?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31403

[Bug target/31640] New: cache align alignment is too aggressive on sh-elf

2007-04-20 Thread chrbr at gcc dot gnu dot org

The sh4 port aligns blocks that have no fallthrus and that are either
frequently executed (JUMP_ALIGN) or preceded by a barrier
(LABEL_ALIGN_AFTER_BARRIER) on a cache line.

While in theory this helps to avoid cache misses if the block splits over 2
cache lines, in practice it reduces cache locality and lengthens the distance
between blocks.
The number of issued instructions is also impacted. For example, the relative
indirect address in jump tables needs a byte zero-extend instruction if the
distance occupies 8 bits instead of 7 bits.

I ran some experiments and benchmarked (eembc) with 2 strategies:
1) -falign-jumps=1
2) Align the block if the size is bigger than a given threshold (empirically
set to 16 bytes, half of the cache line size). See the illustrating attached
patch.

My conclusion is that at -O3 the performance never degrades when removing this
padding (option 2 is a little bit better, even improving dhrystone by 3%),
and the text size improves by ~15%.

So I was not able to measure any benefit from the cache line padding, although
the code size impact is big (even at -O2/-O3 a code size bloat should be
motivated by some performance improvement).

Is there a motivating test that justifies this micro-optimisation?

In the illustrating patch I still align the basic blocks on 4 bytes to account
for better instruction fetch accesses.
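
Strategy 2 can be summarised by the following minimal C sketch. It is a
hypothetical helper written for illustration, not the attached patch and not
GCC code; the function name and parameters are assumptions, and the boundary
case (a block exactly as large as the threshold) is treated here as small.

```c
#include <assert.h>

/* Return the log2 of the alignment to request for a basic block of
   block_size bytes.  cache_line_log2 is 5 for a 32-byte cache line.  */
int block_align_log2(unsigned block_size, unsigned threshold,
                     int cache_line_log2)
{
    assert(cache_line_log2 >= 2);

    /* Blocks bigger than the threshold keep the cache-line alignment.  */
    if (block_size > threshold)
        return cache_line_log2;

    /* Small blocks are only aligned on 4 bytes, for instruction fetch.  */
    return 2;
}
```

With a threshold of 16 and a 32-byte cache line, a 24-byte block would still
be aligned on 32 bytes, while an 8-byte block would only be aligned on 4.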


-- 
   Summary: cache align alignment is too aggressive on sh-elf
   Product: gcc
   Version: 4.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: chrbr at gcc dot gnu dot org
GCC target triplet: sh-superh-elf


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31640

[Bug target/31640] cache block alignment is too aggressive on sh-elf

2007-04-20 Thread chrbr at gcc dot gnu dot org



-- 

chrbr at gcc dot gnu dot org changed:

   What|Removed |Added

   Severity|normal  |minor
Summary|cache align alignment is too|cache block alignment is too
   |aggressive on sh-elf|aggressive on sh-elf


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31640

[Bug target/31640] cache block alignment is too aggressive on sh-elf

2007-04-20 Thread chrbr at gcc dot gnu dot org



--- Comment #1 from chrbr at gcc dot gnu dot org  2007-04-20 14:13 ---
Created an attachment (id=13391)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13391&action=view)
Illustrative patch to not align small basic blocks

I used this patch to reduce the number of basic blocks aligned on cache lines.
Not aligning blocks smaller than 16 bytes (I also tried 32 bytes) seems to give
the best results.

Note that never aligning doesn't degrade eembc perfs (similar to
-falign-jumps=1)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31640

[Bug target/31640] cache block alignment is too aggressive on sh-elf

2007-04-20 Thread chrbr at gcc dot gnu dot org



--- Comment #2 from chrbr at gcc dot gnu dot org  2007-04-20 15:51 ---
Created an attachment (id=13393)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13393&action=view)
testcase for new instruction introduced by increased distance

In this example, the max distance between the jump table and the cases is
artificially increased by the padding, although each basic block is very small
and has very little chance of spreading over several cache blocks.

In addition, without the padding the
extu.b  r1,r1
instruction can be avoided.
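
The reason the extra instruction appears can be modelled with a minimal C
sketch. It assumes that the byte-sized table entry is fetched with a
sign-extending load, as mov.b does on SH; the function names are hypothetical
and the code models the arithmetic only, not the generated assembly.

```c
#include <stdint.h>

/* Model of a sign-extending byte load: the stored byte comes back as a
   signed value (two's complement conversion assumed).  */
int load_byte_sign_extended(uint8_t stored)
{
    return (int8_t)stored;
}

/* A table offset survives a sign-extending load unchanged only if it
   fits in 7 bits; a larger byte offset needs a zero extension.  */
int needs_zero_extend(unsigned max_offset)
{
    return max_offset > 127;
}

/* What the extra zero-extend instruction computes.  */
unsigned zero_extend_byte(int loaded)
{
    return (unsigned)loaded & 0xffu;
}
```

An offset of 100 is loaded back as 100, but an offset of 200 is loaded back as
a negative value and only becomes 200 again after the zero extension.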


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31640
