[Bug tree-optimization/60454] New: Code mistakenly detected as doing bswap

2014-03-06 Thread thomas.preudhomme at arm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60454

Bug ID: 60454
   Summary: Code mistakenly detected as doing bswap
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: thomas.preudhomme at arm dot com

Created attachment 32296
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32296&action=edit
Testcase for bswap incorrect detection

The optimize_bswap pass in tree-ssa-math-opts.c incorrectly detects a
bswap being performed in certain cases, leading to wrong code being generated.
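
For reference, a genuine 32-bit byte swap moves every byte to its mirrored
position. The sketch below is only an illustration of that pattern and is not
the attached testcase; the helper names manual_bswap32 and builtin_bswap32 are
assumptions introduced here, while __builtin_bswap32 is the GCC builtin that
the pass is meant to substitute for such code.

```c
#include <stdint.h>

/* Hand-written 32-bit byte swap: byte 0 <-> byte 3, byte 1 <-> byte 2.
   (Hypothetical helper, not taken from the attached testcase.)  */
static uint32_t
manual_bswap32 (uint32_t x)
{
  return ((x & 0x000000ffu) << 24)
         | ((x & 0x0000ff00u) << 8)
         | ((x & 0x00ff0000u) >> 8)
         | ((x & 0xff000000u) >> 24);
}

/* The form the bswap pass is expected to produce for the code above.  */
static uint32_t
builtin_bswap32 (uint32_t x)
{
  return __builtin_bswap32 (x);
}
```

Code that combines shifted bytes in any other arrangement should not be
reported as a bswap in the dump file.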

Please find attached a testcase that exhibits the problem. Compile it with trunk
gcc using gcc -O2 -fdump-tree-bswap -c testcase.c and observe that
testcase.c.*bswap contains the string "32 bit bswap implementation found at"
although the code does not perform a byte swap. This was tried on Ubuntu
13.10 (saucy) with gcc 4.8.1 and trunk gcc.

gcc 4.8.1 was configured with:

../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.8.1-10ubuntu9'
--with-bugurl=file:///usr/share/doc/gcc-4.8/README.Bugs
--enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-4.8 --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--with-gxx-include-dir=/usr/include/c++/4.8 --libdir=/usr/lib --enable-nls
--with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin
--with-system-zlib --disable-browser-plugin --enable-java-awt=gtk
--enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64/jre
--enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.8-amd64
--with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar
--enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686
--with-abi=m64 --with-multilib-list=m32,m64,mx32 --with-tune=generic
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=x86_64-linux-gnu

while trunk gcc was configured with:

../src/configure --prefix=../install --target=arm-none-eabi
--enable-languages=c,c++ --with-mode=thumb --with-cpu=cortex-m3 --with-newlib
--with-headers=../src/newlib/libc/include --enable-newlib-register-fini
--disable-newlib-supplied-syscalls --disable-multilib --with-libexpat
--with-system-zlib --disable-gdbtk --enable-plugins --disable-libgomp
--disable-libmudflap --disable-libquadmath --disable-libssp
--disable-libstdcxx-pch --disable-nls --disable-rda --disable-sid --disable-tui
--disable-utils --disable-werror --disable-fixed-point

I'm currently cleaning up a patch to solve this issue.

Best regards.


[Bug tree-optimization/60454] Code mistakenly detected as doing bswap

2014-03-07 Thread thomas.preudhomme at arm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60454

--- Comment #1 from Thomas Preud'homme  ---
Created attachment 32297
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32297&action=edit
Unpreprocessed testcase for incorrect bswap detection


[Bug tree-optimization/60454] [4.7/4.8/4.9 Regression] Code mistakenly detected as doing bswap

2014-03-07 Thread thomas.preudhomme at arm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60454

--- Comment #3 from Thomas Preud'homme  ---
Created attachment 32299
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32299&action=edit
Fix_bswap_detection

See the attachment for the patch I wrote to fix the issue. Any comments on it
are welcome.


[Bug tree-optimization/60454] [4.7/4.8/4.9 Regression] Code mistakenly detected as doing bswap

2014-03-07 Thread thomas.preudhomme at arm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60454

Thomas Preud'homme  changed:

               What|Removed |Added

  Attachment #32299|0       |1
        is obsolete|        |

--- Comment #4 from Thomas Preud'homme  ---
Created attachment 32300
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32300&action=edit
Fix_bswap_detection_with_ChangeLog

Added ChangeLog entries to previous patch.


[Bug tree-optimization/60454] [4.7/4.8/4.9 Regression] Code mistakenly detected as doing bswap

2014-03-07 Thread thomas.preudhomme at arm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60454

--- Comment #5 from Thomas Preud'homme  ---
I have posted the patch on gcc-patches mailing list. The discussion can be
followed from http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00313.html.


[Bug tree-optimization/60454] [4.7/4.8 Regression] Code mistakenly detected as doing bswap

2014-03-12 Thread thomas.preudhomme at arm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60454

Thomas Preud'homme  changed:

       What|Removed |Added

     Status|NEW     |RESOLVED
 Resolution|---     |FIXED

--- Comment #8 from Thomas Preud'homme  ---
As mentioned by Jakub Jelinek, this is fixed in trunk.


[Bug tree-optimization/54733] Missing opportunity to optimize endian independent load/store

2014-03-18 Thread thomas.preudhomme at arm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54733

Thomas Preud'homme  changed:

   What|Removed |Added

     CC|        |thomas.preudhomme at arm dot com

--- Comment #2 from Thomas Preud'homme  ---
A patch to fix this is currently under discussion on gcc-patches at:

http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00925.html


[Bug middle-end/39246] FAIL: gcc.dg/uninit-13.c

2014-05-04 Thread thomas.preudhomme at arm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39246

Thomas Preud'homme  changed:

   What|Removed |Added

     CC|        |thomas.preudhomme at arm dot com

--- Comment #12 from Thomas Preud'homme  ---
A patch to fix this is currently under discussion on gcc-patches at:
http://gcc.gnu.org/ml/gcc-patches/2014-05/msg00164.html


[Bug target/60109] __builtin_frame_address does not work as documented on ARM

2014-05-05 Thread thomas.preudhomme at arm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60109

--- Comment #4 from Thomas Preud'homme  ---
Sorry for the late reply, I wasn't aware of this bug report until today.

(In reply to Richard Earnshaw from comment #1)
> This is an unresolvable problem.
> 
> If we made __builtin_frame_address(N > 0) always return 0, then some useful
> use cases for debugging would be excluded.
> 
> On the other hand, it is impossible to know whether it will return a useful
> value in other cases, since it is dependent on all code being:
> a) built with the same instruction set (arm or thumb)

Doesn't gcc use the same instruction set for a given compilation unit? I
thought an application (without its libraries) would typically not mix
instruction sets, and therefore __builtin_(return|frame)_address could be made
to work within an application in 99% of the cases even if it breaks across
library calls (libraries could be compiled with a different instruction set).

> b) Having a consistent use of the frame pointer.

It was my understanding that using __builtin_frame_address disables frame
pointer elimination, based on the following comment in
expand_builtin_return_addr ():

"For a nonzero count, or a zero count with __builtin_frame_address, we require
a stable offset from the current frame pointer to the previous one, so we must
use the hard frame pointer, and we must disable frame pointer elimination."

The fact that other compilation units might not have a frame pointer because
they were compiled with different compilers is OK. It would be nice if it
worked within a given compilation unit (as in the example in the initial email),
though.
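
As a point of reference, here is a minimal sketch of the two builtins at level
0, which refers to the current function and is the only level used here. The
wrapper names current_frame and current_return_address are assumptions made for
this illustration; levels greater than 0 are the ones whose reliability is
being discussed above.

```c
#include <stddef.h>

/* Frame address of this helper itself (level 0).  The noinline attribute
   keeps the helper as a real function with its own frame.  */
__attribute__ ((noinline)) static void *
current_frame (void)
{
  return __builtin_frame_address (0);
}

/* Address this helper will return to (level 0).  */
__attribute__ ((noinline)) static void *
current_return_address (void)
{
  return __builtin_return_address (0);
}
```

Passing a count greater than 0 asks the compiler to walk up the chain of
frames, which only works if every frame on the way uses the frame pointer
consistently.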


[Bug tree-optimization/60172] [4.9/4.10 Regression] ARM performance regression from trunk@207239

2014-05-09 Thread thomas.preudhomme at arm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60172

Thomas Preud'homme  changed:

   What|Removed |Added

     CC|        |thomas.preudhomme at arm dot com

--- Comment #14 from Thomas Preud'homme  ---
(In reply to Steven Bosscher from comment #12)
> Annotated "bad expansion":
> ;; _40 = Arr_2_Par_Ref_22(D) + _12;
> 22: r138=r128+r121
> 23: r127=r132+r138  // r127=Arr_2_Par_Ref+r128+r121
> 
> ;; _32 = _20 + 1000;
> 29: r124=r121+1000
> 
> ;; MEM[(int[25] *)_51 + 20B] = _34;
> 32: r141=r132+r124  // r141=Arr_2_Par_Ref+r121+1000
> 33: r142=r141+r128  // r142=Arr_2_Par_Ref+r128+r121+1000 (==r127+1000)
> 34: [r142+20]=r126

So in GIMPLE the two offsets are added first and the sum is then added to the
pointer, while after expansion the first offset is added to the pointer and
then the second offset is added. Is it normal that the order of operations
changes?

[Bug tree-optimization/60172] [4.9/4.10 Regression] ARM performance regression from trunk@207239

2014-05-14 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60172

--- Comment #16 from Thomas Preud'homme  ---
Hi Richard,

could you expand on what you said in comment #13? I don't see how reassoc could
help cse here. From what I understood, reassoc tries to group per rank. In our
case, we have (view of the source with loop unrolling):

Arr_2_Par_Ref [Int_Loc] [Int_Loc] = Int_Loc;
/* some stmts */
Arr_2_Par_Ref [Int_Loc+10] [Int_Loc] = Arr_1_Par_Ref [Int_Loc];

If I'm not mistaken, in the first case you'd have:

Int_Loc * 4
Int_Loc * 100
Arr_2_Par_Ref

that would be added together with several statements. However in the second
case you'd have:

Int_Loc * 4
Int_Loc * 100
1000
Arr_2_Par_Ref

that would be added together with several statements. I don't see how 1000
could be added first or last; it seems to me that it's always going to be in
an intermediate statement and thus not all redundancy would be eliminated by
CSE.

Please let me know if my reasoning is flawed so that I can progress toward a
solution.
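
To make the arithmetic concrete, here is a small sketch showing where the
constant 1000 comes from. It assumes rows of 25 ints, as suggested by the
(int[25] *) type in the dump quoted earlier, and a 4-byte int; the backing
array storage, its row count and the helper name are assumptions made for this
illustration.

```c
#include <stddef.h>

/* Rows of 25 ints, matching the (int[25] *) type in the dump.  The backing
   array and its row count are assumptions for this sketch.  */
static int storage[40][25];

/* Byte distance between Arr_2_Par_Ref[Int_Loc + 10][Int_Loc] and
   Arr_2_Par_Ref[Int_Loc][Int_Loc].  Int_Loc must be in the range 0..24.  */
static ptrdiff_t
offset_between_accesses (int Int_Loc)
{
  int (*Arr_2_Par_Ref)[25] = storage;
  char *first = (char *) &Arr_2_Par_Ref[Int_Loc][Int_Loc];
  char *second = (char *) &Arr_2_Par_Ref[Int_Loc + 10][Int_Loc];
  return second - first;
}
```

With a 4-byte int the result is 10 * 25 * 4 = 1000 bytes whatever the value of
Int_Loc, so the second address is the first one plus a constant.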


[Bug tree-optimization/60172] [4.9/4.10 Regression] ARM performance regression from trunk@207239

2014-05-15 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60172

--- Comment #18 from Thomas Preud'homme  ---
(In reply to Richard Biener from comment #17)
> 
> Citing myself:
> 
> On the GIMPLE level before expansion we have
> 
>  +40 = Arr_2_Par_Ref_22(D) + (_41 + pretmp_20);
> 
>  _51 = Arr_2_Par_Ref_22(D) + (_41 + (pretmp_20 + 1000));
> 
> so if _51 were Arr_2_Par_Ref_22(D) + ((_41 + pretmp_20) + 1000);
> 
> then _41 + pretmp_20 would be fully redundant with the expression needed
> by _40.

Yes, I saw that, but I was wondering why reassoc would try this association
rather than another, since the header of the file doesn't mention any special
treatment of explicit integer constants.

Besides, wouldn't it still miss the fact that _51 = _40 + 1000?

> 
> Note that IIRC one issue with TER is that it is no longer happening as
> there are dead stmts around that confuse its has_single_use logic.  Thus
> placing a dce pass right before expand would fix that and might be a good
> idea anyway (see comment #3).  Implementing a "proper" poor-mans SSA-based
> DCE would be a good way out (out-of-SSA already has one to remove dead
> PHIs).

Ok


[Bug tree-optimization/60172] [4.9/4.10 Regression] ARM performance regression from trunk@207239

2014-05-15 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60172

--- Comment #20 from Thomas Preud'homme  ---
(In reply to rguent...@suse.de from comment #19)
> On Thu, 15 May 2014, thomas.preudhomme at arm dot com wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60172
> > 
> > --- Comment #18 from Thomas Preud'homme  
> > ---
> > (In reply to Richard Biener from comment #17)
> > > 
> > > Citing myself:
> > > 
> > > On the GIMPLE level before expansion we have
> > > 
> > >  +40 = Arr_2_Par_Ref_22(D) + (_41 + pretmp_20);
> > > 
> > >  _51 = Arr_2_Par_Ref_22(D) + (_41 + (pretmp_20 + 1000));
> > > 
> > > so if _51 were Arr_2_Par_Ref_22(D) + ((_41 + pretmp_20) + 1000);
> > > 
> > > then _41 + pretmp_20 would be fully redundant with the expression needed
> > > by _40.
> > 
> > Yes I saw that but I was wondering why would reassoc try this association
> > rather than another since the header of the file doesn't mention any special
> > treatment of explicit integer constants.
> > 
> > Besides, wouldn't it still misses that fact that _51 = _40 + 1000?
> 
> Yes.  But reassoc doesn't associate across POINTER_PLUS_EXPRs.

Is there a reason for that?

> 
> RTL CSE could catch it, but for it the association would have to
> be the same for both.  If we start from the proposed form
> then at RTL expansion time we could associate
> pointer + (X + CST) to (pointer + X) + CST.

Right.

> 
> Feels all somewhat hacky, of course (and relies on TER).  There
> may be cases where doing the opposite is better (for example
> if you have ptr1 + (X + 1000) and ptr2 + (X + 1000)).  Association
> to make CSE possible is always hard if CSE itself cannot associate
> to maximize the number of CSE opportunities.  So at the moment
> any choice is just canonicalization.

Exactly my thought. I'm not sure if that's what you have in mind when you write
association for CSE, but I was thinking about a scheme that resembles what
tree_to_aff_combination_expand does and organizes all expanded expressions so
that they can be compared easily (read: efficiently). With such a capability it
would then not be necessary to do the first replacement with
forwprop+reassoc+dom, as everything could be done in CSE.
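
A rough sketch of the kind of canonical form meant here: each address is kept
as a constant offset plus a list of terms sorted by variable, so two
expressions can be compared regardless of how they were associated. This is
not GCC's affine combination code; every type and function name below is
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

enum { MAX_TERMS = 8 };

/* One term coef * var; variables are identified by small integers.  */
struct term { int var; long coef; };

/* Canonical form: constant offset plus terms sorted by variable id.  */
struct affine { long offset; size_t n; struct term terms[MAX_TERMS]; };

static void
affine_init (struct affine *a)
{
  a->offset = 0;
  a->n = 0;
}

/* Add coef * var, merging with an existing term for the same variable and
   dropping terms that cancel out.  Returns false if the array is full.  */
static bool
affine_add_term (struct affine *a, int var, long coef)
{
  size_t i = 0;
  if (coef == 0)
    return true;
  while (i < a->n && a->terms[i].var < var)
    i++;
  if (i < a->n && a->terms[i].var == var)
    {
      a->terms[i].coef += coef;
      if (a->terms[i].coef == 0)
        {
          for (size_t j = i + 1; j < a->n; j++)
            a->terms[j - 1] = a->terms[j];
          a->n--;
        }
      return true;
    }
  if (a->n == MAX_TERMS)
    return false;
  for (size_t j = a->n; j > i; j--)
    a->terms[j] = a->terms[j - 1];
  a->terms[i].var = var;
  a->terms[i].coef = coef;
  a->n++;
  return true;
}

/* True if both combinations differ at most by a constant.  */
static bool
affine_same_variable_part (const struct affine *a, const struct affine *b)
{
  if (a->n != b->n)
    return false;
  for (size_t i = 0; i < a->n; i++)
    if (a->terms[i].var != b->terms[i].var
        || a->terms[i].coef != b->terms[i].coef)
      return false;
  return true;
}

enum { VAR_PTR = 0, VAR_41 = 1, VAR_PRETMP_20 = 2 };

/* Build _40 and _51 in different orders and return their constant
   distance, or -1 if their variable parts differ.  */
static long
distance_51_from_40 (void)
{
  struct affine e40, e51;

  affine_init (&e40);
  affine_add_term (&e40, VAR_PTR, 1);
  affine_add_term (&e40, VAR_41, 1);
  affine_add_term (&e40, VAR_PRETMP_20, 1);

  affine_init (&e51);
  e51.offset += 1000;
  affine_add_term (&e51, VAR_PRETMP_20, 1);
  affine_add_term (&e51, VAR_41, 1);
  affine_add_term (&e51, VAR_PTR, 1);

  if (!affine_same_variable_part (&e40, &e51))
    return -1;
  return e51.offset - e40.offset;
}
```

Because the terms are kept sorted, the comparison is a linear scan, and the
relation _51 = _40 + 1000 falls out of the two offsets directly.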


[Bug tree-optimization/61306] [4.10 Regression] wrong code at -Os and above on x86_64-linux-gnu

2014-05-26 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61306

--- Comment #3 from Thomas Preud'homme  ---
Indeed. I also noticed that the original bswap code would happily accept signed
SSA values and signed casts, which can lead to disaster. I worked out a patch
for this issue that checks the sign of the lhs of the bitwise OR expression and
uses the (unsigned_)?int.I_type_node accordingly, but I now get a bootstrap
failure. I'll provide a patch asap.
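
To illustrate why signedness matters when bytes are ORed together, here is a
hypothetical sketch (it is not the testcase from this PR, and both function
names are assumptions). When the low byte is signed and negative, its sign
extension sets all the upper bits of the result, so the expression is no
longer a plain 4-byte load.

```c
#include <stdint.h>

/* The low byte is sign-extended to 32 bits before the OR.  */
static uint32_t
combine_signed_low_byte (const signed char b[4])
{
  return (uint32_t) (int32_t) b[0]
         | ((uint32_t) (uint8_t) b[1] << 8)
         | ((uint32_t) (uint8_t) b[2] << 16)
         | ((uint32_t) (uint8_t) b[3] << 24);
}

/* Same expression with unsigned bytes: equivalent to a little-endian load.  */
static uint32_t
combine_unsigned (const unsigned char b[4])
{
  return (uint32_t) b[0]
         | ((uint32_t) b[1] << 8)
         | ((uint32_t) b[2] << 16)
         | ((uint32_t) b[3] << 24);
}
```

For the byte values -128, 1, 2, 3 the signed variant yields 0xffffff80, whereas
the unsigned variant yields 0x03020180 for the same bit patterns.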


[Bug bootstrap/61320] [4.10 regression] ICE in jcf-parse.c:1622 (parse_class_file

2014-05-27 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61320

--- Comment #6 from Thomas Preud'homme  ---
Sure, I'll push a patch for this as soon as I finish fixing the regressions
that popped up due to the change I made to the bswap pass.


[Bug c/61328] valgrind finds problem in find_bswap_or_nop_1

2014-05-27 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61328

--- Comment #1 from Thomas Preud'homme  ---
*facepalm*

Yes indeed. Does this qualify as an obvious fix as per the committing rules?


[Bug tree-optimization/61306] [4.10 Regression] wrong code at -Os and above on x86_64-linux-gnu

2014-05-28 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61306

--- Comment #4 from Thomas Preud'homme  ---
I finally managed to find the root cause of the bootstrap failure with my
current fix. I should be able to improve my fix, and it should hopefully be
ready tomorrow.


[Bug c/61328] valgrind finds problem in find_bswap_or_nop_1

2014-05-28 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61328

--- Comment #4 from Thomas Preud'homme  ---
Oh I think I see. When I wrote find_bswap_or_nop_load () I assumed that it
would only return in find_bswap_or_nop_1 as called in the GIMPLE_UNARY_RHS
case. It seems I was wrong.


[Bug tree-optimization/61306] [4.10 Regression] wrong code at -Os and above on x86_64-linux-gnu

2014-05-29 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61306

--- Comment #5 from Thomas Preud'homme  ---
I have a working patch that also passes bootstrap. I'll do a bit more testing
and post it for review.


[Bug bootstrap/61320] [4.10 regression] ICE in jcf-parse.c:1622 (parse_class_file

2014-06-02 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61320

--- Comment #9 from Thomas Preud'homme  ---
Sorry, I didn't realize it was preventing bootstrap. I have a small patch that
disables the optimization for STRICT_ALIGNMENT targets but was reluctant to use
it as is because it effectively disables this optimization for ARM. But given
the situation, the patch could be applied temporarily to avoid the bootstrap
failure and a better solution committed later.


[Bug bootstrap/61320] [4.10 regression] ICE in jcf-parse.c:1622 (parse_class_file

2014-06-03 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61320

--- Comment #10 from Thomas Preud'homme  ---
I am testing the patch right now and should be able to send it tomorrow.
However, the code should already mark the load with the actual alignment
the access is being done with. It therefore seems to me that something in the
backend fails to split the unaligned load into several aligned loads. Could you
break after the line align = get_object_alignment (src); in
tree-ssa-math-opts.c when compiling gcc/java/jcf-parse.c in stage 1 (I suppose
it breaks in stage 2)?

What does print align give? What about print load_type->type_common.align?

Best regards.


[Bug bootstrap/61320] [4.10 regression] ICE in jcf-parse.c:1622 (parse_class_file

2014-06-03 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61320

--- Comment #17 from Thomas Preud'homme  ---
(In reply to Richard Biener from comment #12)
> 
> I'd say
> 
> Index: tree-ssa-math-opts.c
> ===
> --- tree-ssa-math-opts.c(revision 211170)
> +++ tree-ssa-math-opts.c(working copy)
> @@ -2149,7 +2149,8 @@ bswap_replace (gimple stmt, gimple_stmt_
>unsigned align;
>  
>align = get_object_alignment (src);
> -  if (bswap && SLOW_UNALIGNED_ACCESS (TYPE_MODE (load_type), align))
> +  if (align < GET_MODE_ALIGNMENT (TYPE_MODE (load_type))
> + && SLOW_UNALIGNED_ACCESS (TYPE_MODE (load_type), align))
> return false;
>  
>/*  Compute address to load from and cast according to the size
> 
> is obvious (and pre-approved).

Alright, but the tests need to be modified to add an xfail for the targets
impacted by this. I made such a change and also rewrote the tests to use
aligned variables as much as possible so that they are more meaningful on
STRICT_ALIGNMENT targets. I'll post it for review today (at least for the
changes in the testsuite).
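
The condition in the quoted patch can be read as the following predicate. This
is a standalone sketch with hypothetical names: the parameters stand in for the
values that get_object_alignment, GET_MODE_ALIGNMENT and SLOW_UNALIGNED_ACCESS
would provide inside GCC, and both alignments are assumed to be expressed in
the same unit.

```c
#include <stdbool.h>

/* Returns true if the wide load may replace the narrower accesses.
   known_align:    alignment known for the source object
   mode_align:     natural alignment of the wide load's mode
   slow_unaligned: whether the target handles unaligned accesses slowly  */
static bool
may_use_wide_load (unsigned known_align, unsigned mode_align,
                   bool slow_unaligned)
{
  /* Reject only when the access is under-aligned AND the target is slow
     at unaligned accesses; sufficiently aligned accesses always pass.  */
  if (known_align < mode_align && slow_unaligned)
    return false;
  return true;
}
```

Compared with the previous condition, the rejection no longer depends on
whether a byte swap is needed, so plain wide loads are checked as well.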


[Bug bootstrap/61320] [4.10 regression] ICE in jcf-parse.c:1622 (parse_class_file

2014-06-03 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61320

--- Comment #18 from Thomas Preud'homme  ---
(In reply to Eric Botcazou from comment #16)
> > unsigned int foo (unsigned short *x)
> > {
> >   return x[0] << 16 | x[1];
> > }
> > 
> > [...]
> > gets you
> > 
> > foo:
> > lduh[%o0], %g1
> > lduh[%o0+2], %o0
> > sll %g1, 16, %g1
> > jmp %o7+8
> >  or %o0, %g1, %o0
> > 
> > which looks perfect to me.
> 
> Indeed, but after having gone through a perfectly useless transformation and
> wasted cycles.  This reminds me of the ipa-split + inlining round trip.
> 
> Really SPARC machines aren't fast enough to allow such a silliness...

Fair enough but the information about alignment is only available late in the
pass so that most of the code is already executed. Only when the whole OR
expression has been processed do we know what is the lowest address and the
range of the memory access and therefore whether that access is aligned or not.

Also, if the expression were loading a 32 bit value byte by byte then the
transformation would be useful. I'm already working on a patch to add a cost
model, but this will just add more code to execute before making the decision.
It will however prevent rewriting statements if the result would execute slower
on the target.

Maybe a better solution for sparc would be to add a switch for this pass and
disable it by default on sparc. What do you think about that?
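
For illustration, this is the kind of byte-by-byte 32 bit load where the
rewrite pays off: four byte loads, three shifts and three ORs on one side, one
load plus an optional byte swap on the other. The function names are
assumptions made for this sketch, and memcpy is used to express the single load
without assuming anything about alignment.

```c
#include <stdint.h>
#include <string.h>

/* Detect host byte order at run time.  */
static int
host_is_little_endian (void)
{
  const uint16_t probe = 1;
  unsigned char first;
  memcpy (&first, &probe, 1);
  return first == 1;
}

/* Big-endian 32 bit value assembled one byte at a time.  */
static uint32_t
load_be32_bytewise (const unsigned char *p)
{
  return ((uint32_t) p[0] << 24)
         | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8)
         | (uint32_t) p[3];
}

/* Same value from a single 32 bit load, byte swapped on little-endian
   hosts.  */
static uint32_t
load_be32_single (const unsigned char *p)
{
  uint32_t v;
  memcpy (&v, p, sizeof v);
  return host_is_little_endian () ? __builtin_bswap32 (v) : v;
}
```

Both functions return the same value for any four input bytes; only the number
of memory accesses differs.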


[Bug bootstrap/61320] [4.10 regression] ICE in jcf-parse.c:1622 (parse_class_file

2014-06-04 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61320

--- Comment #21 from Thomas Preud'homme  ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #19)
> 
> I've now regtested that patch on sparc-sun-solaris2.11 (compared to a
> bootstrap without java before) and i386-pc-solaris2.11.  No regressions,
> but gcc.c-torture/execute/bswap-2.c is still failing on sparc.

There is a patch for bswap-2.c ready [0]. I'm just waiting for Andreas to
confirm that it works for him on m68k. I'd be interested in knowing if that
solves your issue as well.

[0] https://gcc.gnu.org/ml/gcc-patches/2014-05/msg02519.html


[Bug bootstrap/61320] [4.10 regression] ICE in jcf-parse.c:1622 (parse_class_file

2014-06-04 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61320

--- Comment #23 from Thomas Preud'homme  ---
(In reply to Eric Botcazou from comment #20)
> 
> > Maybe a better solution for sparc would be to add a switch for this pass and
> > disable it by default on sparc. What do you think about that?
> 
> There is nothing special about SPARC, it's the same for every strict
> alignment architecture supported by GCC and SLOW_UNALIGNED_ACCESS is a valid
> predicate.

My point was twofold:

1) Even if the pass does nothing for unaligned accesses on targets where they
are slow, a bunch of code is still executed to determine that the access is
unaligned (in fact most of the pass is executed before the address of the
access is known).

2) For some unaligned accesses the rewrite might still be worthwhile, like
rewriting this:

tab[1] | (tab [2] << 8) | (tab[3] << 16) | (tab[4] << 24)

into this:

*((uint32_t *) &tab[1])

(considering tab[0] to be 4 byte aligned) which could end up doing one 32 bit
load at address &tab[0], one shift and one byte load at address &tab[4].
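
The equivalence can be checked with a small sketch, assuming a little-endian
view of memory. The helper load32_le models the aligned 32 bit load in an
endian-independent way, and all function names are hypothetical. Note that
merging the extra byte takes a second shift and an OR in this form.

```c
#include <stdint.h>

/* Models a little-endian 32 bit load from a 4 byte aligned address.  */
static uint32_t
load32_le (const unsigned char *p)
{
  return (uint32_t) p[0]
         | ((uint32_t) p[1] << 8)
         | ((uint32_t) p[2] << 16)
         | ((uint32_t) p[3] << 24);
}

/* The original expression: four byte loads starting at tab[1].  */
static uint32_t
bytewise (const unsigned char *tab)
{
  return (uint32_t) tab[1]
         | ((uint32_t) tab[2] << 8)
         | ((uint32_t) tab[3] << 16)
         | ((uint32_t) tab[4] << 24);
}

/* One aligned load at &tab[0], shifted, plus one byte load at &tab[4].  */
static uint32_t
aligned_load_shift_byte (const unsigned char *tab)
{
  uint32_t word = load32_le (&tab[0]);
  return (word >> 8) | ((uint32_t) tab[4] << 24);
}
```

For tab = { 0x10, 0x11, 0x22, 0x33, 0x44 } both variants produce 0x44332211.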


[Bug bootstrap/61320] [4.10 regression] ICE in jcf-parse.c:1622 (parse_class_file

2014-06-04 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61320

--- Comment #24 from Thomas Preud'homme  ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #22)
> 
> I'm giving both patches combined a try right now, though SPARC bootstrap
> will take 7+ hours to complete.

Great, thanks.

> 
> Please remember to add proposed patches to the URL field of the PR,
> otherwise they are easily overlooked.

Sorry I'm not very familiar with bugzilla yet and I didn't know this was
possible. It doesn't seem I can edit anything in the PR beyond my subscription
to it and adding comments though. Have I missed something?


[Bug bootstrap/61320] [4.10 regression] ICE in jcf-parse.c:1622 (parse_class_file

2014-06-04 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61320

--- Comment #26 from Thomas Preud'homme  ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #25)
> 
> Ah, I see: write-after-approval maintainers do get bugzilla write
> access, but your not according to the MAINTAINERS file.

Oops, my mistake, I forgot to update the file. Will do it now, thanks for
reminding me.


[Bug bootstrap/61320] [4.10 regression] ICE in jcf-parse.c:1622 (parse_class_file

2014-06-05 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61320

--- Comment #28 from Thomas Preud'homme  ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #22)
> > --- Comment #21 from Thomas Preud'homme  
> > ---
> >
> > There is a patch for bswap-2.c ready [0]. I'm just waiting for Andreas to
> > confirm me it works for him on m68k. I'd be interested in knowing if that
> > solves your issue as well.
> >
> > [0] https://gcc.gnu.org/ml/gcc-patches/2014-05/msg02519.html
> 
> I'm giving both patches combined a try right now, though SPARC bootstrap
> will take 7+ hours to complete.

Did it work?


[Bug bootstrap/61320] [4.10 regression] ICE in jcf-parse.c:1622 (parse_class_file

2014-06-05 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61320

--- Comment #30 from Thomas Preud'homme  ---
Can you run the test manually under gdb and tell me what is the value for the
"out" variable in hex format?


[Bug bootstrap/61320] [4.10 regression] ICE in jcf-parse.c:1622 (parse_class_file

2014-06-06 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61320

--- Comment #32 from Thomas Preud'homme  ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #31)
> > --- Comment #30 from Thomas Preud'homme  
> > ---
> > Can you run the test manually under gdb and tell me what is the value for 
> > the
> > "out" variable in hex format?
> 
> Sure: the -O0 test aborts at line 78, where out is
> 
> (gdb) p/x out
> $11 = 0x44434241
> (gdb) p (char[4])out
> $12 = "DCBA"
> 
>   Rainer

Are you sure the patch was applied to this test? At line 78 I have "bfin.inval =
(struct ok) { 0x83, 0x85, 0x87, 0x89 };"

The next abort about this line is under a "if (out == 0x89878583)" so would not
abort either. By the way, no need to do a bootstrap again or run the whole
testsuite to try this patch, only this test was changed.

[Bug bootstrap/61320] [4.10 regression] ICE in jcf-parse.c:1622 (parse_class_file

2014-06-06 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61320

--- Comment #34 from Thomas Preud'homme  ---
Ok, committed then.


[Bug tree-optimization/61301] missed optimization of move if vector passed by reference

2014-06-08 Thread thomas.preudhomme at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61301

--- Comment #4 from Thomas Preud'homme  ---
(In reply to Richard Biener from comment #3)
>   _3 = MEM[(const float *)this_1(D) + 4B];
>   _4 = MEM[(const float *)this_1(D)];
>   _5 = MEM[(const float *)this_1(D) + 12B];
>   _6 = MEM[(const float *)this_1(D) + 8B];
>   _7 = {_3, _4, _5, _6};
>   return _7;
> 
> does look like an opportunity for a bswap pass improvement.  Basically
> handle CONSTRUCTOR as supported composition operation (and then support
> vector loads and shuffle, of course).

I had started working on shuffle support in bswap but I realized this would
probably not help ARM, as moves between floating-point registers and
general-purpose registers are quite slow. I then moved on to higher priority
tasks. This doesn't mean I'm finished with the bswap task, as several
improvements were suggested to me during the review of the first patch to
improve bswap.

So I wouldn't hold my breath waiting for me to do this work for now; feel free
to beat me to it.
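
For readers unfamiliar with the suggestion, the quoted GIMPLE builds a vector
from the elements at byte offsets 4, 0, 12 and 8, i.e. elements 1, 0, 3, 2 of
the source. A scalar sketch of the two equivalent forms follows; the function
names are assumptions, and plain loops stand in for the vector load and the
shuffle instruction.

```c
/* Element-by-element construction, as in the quoted GIMPLE.  */
static void
build_elementwise (const float *p, float out[4])
{
  out[0] = p[1];   /* offset 4B  */
  out[1] = p[0];   /* offset 0B  */
  out[2] = p[3];   /* offset 12B */
  out[3] = p[2];   /* offset 8B  */
}

/* One contiguous load followed by one permutation.  */
static void
load_then_shuffle (const float *p, float out[4])
{
  static const int perm[4] = { 1, 0, 3, 2 };
  float v[4];

  for (int i = 0; i < 4; i++)   /* stands in for a single vector load */
    v[i] = p[i];
  for (int i = 0; i < 4; i++)   /* stands in for a single shuffle */
    out[i] = v[perm[i]];
}
```

Recognizing the first form and emitting the second is the CONSTRUCTOR handling
suggested in comment #3.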