date:20131210

Re: [PATCH, libgfortran]: Emit SSE instructions when __SSE_MATH__ is defined

2013-12-10 Thread Janne Blomqvist

On Wed, Dec 11, 2013 at 12:42 AM, Tobias Burnus  wrote:
> The patch does the same for libgfortran as Uros did for libgcc/libatomic,
> cf. http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00878.html
>
> "This is also how glibc generates exceptions."
>
> Build and tested on x86-64-gnu-linux.
> OK ?

Ok, thanks. (resending in plain text mode so that it doesn't bounce
from gcc-patches, gaah. Sorry.)



-- 
Janne Blomqvist

ping Re: PR c++/58567: Fix ICE on invalid code with -fopenmp in cp/pt.c

2013-12-10 Thread Tobias Burnus


Tobias Burnus wrote:
A rather simple fix for an ICE on invalid bug (low-priority 4.8/4.9 
regression).


Bootstrapped and regtested without new failure on x86-64-gnu-linux.
OK for the trunk and 4.8?

Tobias

Re: [PATCH PR41488]Recognize more induction variables by simplifying PEELED chrec in scalar evolution

2013-12-10 Thread Jeff Law


On 12/10/13 23:35, Bin.Cheng wrote:


I know little about GC, so when ggc_collect may be called (between two
passes)? If so, I have to call free_affine_expand_cache just after the
calling to tree_to_affine_combination_expand in SCEV because it's an
analyzer and as you pointed out, the analyzing result is used by
different optimizers.  The pointer map would be maintained between
different passes if it's not instantly freed.
The garbage collector only runs between passes.  If you have roots in 
static storage, then  you'll need to decorate them so the garbage 
collector scans them.


There's a fairly extensive discussion of the garbage collector in the 
GCC internals manual.



jeff

[RFA][PATCH][PR tree-optimization/45685]

2013-12-10 Thread Jeff Law



So for this source, compiled for x86_64 with -O3:

typedef unsigned long int uint64_t;
typedef long int int64_t;
int summation_helper_1(int64_t* products, uint64_t count)
{
int s = 0;
uint64_t i;
for(i=0; i0) ? 1 : -1;
products[i] *= val;
if(products[i] != i)
val = -val;
products[i] = val;
s += val;
}
return s;
}


int summation_helper_2(int64_t* products, uint64_t count)
{
int s = 0;
uint64_t i;
for(i=0; i0) ? 1 : -1;
products[i] *= val;
if(products[i] != i)
val = -val;
products[i] = val;
s += val;
}
return s;
}


The loops we generate are pretty bad and have regressed relative to 
older versions of GCC.


For the first loop, we have the following .optimized output for the loop:

  :
  # s_28 = PHI 
  # i_27 = PHI 
  _11 = MEM[base: products_9(D), index: i_27, step: 8, offset: 0B];
  val_4 = _11 > 0 ? 1 : -1;
  prephitmp_38 = _11 > 0 ? -1 : 1;
  prephitmp_39 = _11 > 0 ? 4294967295 : 1;
  prephitmp_41 = _11 > 0 ? 1 : 4294967295;
  _12 = val_4 * _11;
  _14 = (long unsigned int) _12;
  val_3 = _14 != i_27 ? prephitmp_38 : val_4;
  prephitmp_44 = _14 != i_27 ? prephitmp_39 : prephitmp_41;
  MEM[base: products_9(D), index: i_27, step: 8, offset: 0B] = val_3;
  s.1_18 = (unsigned int) s_28;
  _19 = prephitmp_44 + s.1_18;
  s_20 = (int) _19;
  i_21 = i_27 + 1;
  if (i_21 != count_7(D))
goto ;
  else
goto ;

  :
  goto ;


Note the series of COND_EXPRs.  A couple are just conditional negation 
which can be implemented with a straight-line code sequence.   Using 
that straight-line sequence results in:


  :
  # s_31 = PHI 
  # i_32 = PHI 
  _11 = MEM[base: products_9(D), index: i_32, step: 8, offset: 0B];
  val_4 = _11 > 0 ? 1 : -1;
  _12 = val_4 * _11;
  _14 = (long unsigned int) _12;
  _24 = _14 != i_32;
  _25 = (int64_t) _24;
  _29 = -_25;
  _28 = _29 ^ val_4;
  _27 = _28 + _25;
  MEM[base: products_9(D), index: i_32, step: 8, offset: 0B] = _27;
  _17 = (unsigned int) _27;
  s.1_18 = (unsigned int) s_31;
  _19 = _17 + s.1_18;
  s_20 = (int) _19;
  i_21 = i_32 + 1;
  if (i_21 != count_7(D))
goto ;
  else
goto ;

  :
  goto ;

Which *appears* worse.  However, that code can much more easily be 
handled by the RTL optimizers.When we look at what the trunk 
generates at the assembly level we have:


.L3:
movq(%rdi,%rcx,8), %rdx
testq   %rdx, %rdx
setg%r8b
movzbl  %r8b, %r10d
movzbl  %r8b, %r8d
leaq-1(%r10,%r10), %r10
leal-1(%r8,%r8), %r8d
movq%r10, %r11
imulq   %rdx, %r11
testq   %rdx, %rdx
setle   %dl
movzbl  %dl, %r9d
movzbl  %dl, %edx
leaq-1(%r9,%r9), %r9
leal-1(%rdx,%rdx), %edx
cmpq%rcx, %r11
cmove   %r10, %r9
cmove   %r8d, %edx
movq%r9, (%rdi,%rcx,8)
addq$1, %rcx
addl%edx, %eax
cmpq%rsi, %rcx
jne .L3
(Ick)

With the conditional negation patch that turns into:

L3:
movq(%rdi,%rcx,8), %r8
xorl%edx, %edx
testq   %r8, %r8
setg%dl
leaq-1(%rdx,%rdx), %rdx
imulq   %rdx, %r8
cmpq%rcx, %r8
setne   %r8b
movzbl  %r8b, %r8d
movq%r8, %r9
negq%r9
xorq%r9, %rdx
addq%r8, %rdx
movq%rdx, (%rdi,%rcx,8)
addq$1, %rcx
addl%edx, %eax
cmpq%rsi, %rcx
jne .L3

No branches within the loop, no conditional moves either.  In all it's 5 
instructions shorter.



The second loop shows similar effects, though they're not as dramatic.

Before:
  :
  # s_27 = PHI 
  # i_26 = PHI 
  _11 = MEM[base: products_9(D), index: i_26, step: 8, offset: 0B];
  val_4 = _11 > 0 ? 1 : -1;
  prephitmp_32 = _11 > 0 ? 1 : -1;
  prephitmp_33 = _11 > 0 ? -1 : 1;
  prephitmp_34 = _11 > 0 ? -1 : 1;
  _13 = _11 * prephitmp_32;
  _15 = (long unsigned int) _13;
  val_3 = _15 != i_26 ? prephitmp_33 : val_4;
  prephitmp_36 = _15 != i_26 ? prephitmp_34 : prephitmp_32;
  MEM[base: products_9(D), index: i_26, step: 8, offset: 0B] = 
prephitmp_36;

  s_19 = val_3 + s_27;
  i_20 = i_26 + 1;
  if (i_20 != count_7(D))
goto ;
  else
goto ;

 :
  goto ;


Which results in the following assembly:

.L8:
movq(%rdi,%r8,8), %rdx
testq   %rdx, %rdx
movq%rdx, %r11
setg%cl
movzbl  %cl, %r10d
movzbl  %cl, %ecx
leaq-1(%rcx,%rcx), %rcx
leal-1(%r10,%r10), %r10d
imulq   %rcx, %r11
testq   %rdx, %rdx
setle   %dl
movzbl  %dl, %r9d
movzbl  %dl, %edx
leaq-1(%r9,%r9), %r9
leal-1(%rdx,%rdx), %edx
cmpq%r8, %r11
cmovne  %r9, %r

Re: [PATCH PR41488]Recognize more induction variables by simplifying PEELED chrec in scalar evolution

2013-12-10 Thread Bin.Cheng

On Wed, Dec 11, 2013 at 6:50 AM, Jakub Jelinek  wrote:
> On Tue, Dec 10, 2013 at 10:46:32AM +0800, bin.cheng wrote:
>> This is the new version patch computing the difference in tree affine then
>> comparing against integer_zero_node.
>> Bootstrap and test on current upstream.  I will commit it if there is no
>> other objection.
>
> This breaks bootstrap on x86_64-linux for me too, though in a different
> pass.
> Anyway, the bugs I'm seeing here:
> 1) other passes that use a cache for tree-affine computations
>don't initialize it by pointer_map_create (), tree-affine.c does that
>for you
> 2) the cache shouldn't be destroyed by pointer_map_destroy, but
>instead by free_affine_expand_cache that will actually make sure
>it doesn't leak memory, etc.
> 3) the actual issue why this breaks bootstrap is that you've added
>the pointer_map_destroy (should be free_affine_expand_cache per 2) )
>to scev_finalize, but scev_initialize is performed at the start
>of the loop passes and scev_finalize several passes later, and
>in between passes ggc_collect is called.  As the cache isn't registered
>with GC, if you are unlucky and ggc_collect actually collects
>in between scev_initialize and scev_finalize and garbage collects
>some trees only mentioned in the affine cache peeled_chrec_map,
>things can break completely.
>
> So, IMHO please revert this patch for now and try to find some
> better place for the free_affine_expand_cache (&peeled_chrec_map);
> call so that the cache is never non-NULL when ggc_collect may be called.
> Alternatively you'd need to either tell GC about it (but the structures
> are heap allocated, not GC), or say instruct through some GTY magic
> that during collection free_affine_expand_cache (&peeled_chrec_map);
> should be called.

Thanks for helping on the issue, I shouldn't use the affine facility
before fully understanding it.

I know little about GC, so when ggc_collect may be called (between two
passes)? If so, I have to call free_affine_expand_cache just after the
calling to tree_to_affine_combination_expand in SCEV because it's an
analyzer and as you pointed out, the analyzing result is used by
different optimizers.  The pointer map would be maintained between
different passes if it's not instantly freed.

Thanks,
bin
>
> Jakub



-- 
Best Regards.

Re: [PATCH PR41488]Recognize more induction variables by simplifying PEELED chrec in scalar evolution

2013-12-10 Thread Bin.Cheng

On Wed, Dec 11, 2013 at 4:31 AM, H.J. Lu  wrote:
> On Mon, Dec 9, 2013 at 6:46 PM, bin.cheng  wrote:
>>
>>>
>> Hi,
>> This is the new version patch computing the difference in tree affine then
>> comparing against integer_zero_node.
>> Bootstrap and test on current upstream.  I will commit it if there is no
>> other objection.
>>
>
> This caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59445
I found out the cause and addressed it in the bug entry.

Thanks,
bin

>
> --
> H.J.



-- 
Best Regards.

Re: [PING] [PATCH][RFA][PR middle-end/59285] Handle BARRIERS between blocks in rtl_merge_blocks

2013-12-10 Thread Jeff Law


On 12/10/13 16:22, Steven Bosscher wrote:

On Tue, Dec 10, 2013 at 7:30 AM, Jeff Law wrote:



Ping!


I'd prefer going with the first patch to merge_if_blocks (with a big
"???" or FIXME), to contain this problem to where it occurs.
Otherwise, before you know it, there are other places in the compiler
that assume it's OK to merge blocks that are not can_merge_block_p.
I think the patch as posted in the RFA is better based on the robustness 
principle.  Nothing stops an RTL pass from asking for two blocks to be 
merged when there's a BARRIER between them.  So rtl_merge_blocks should 
do the right thing when presented with that situation.


Furthermore, I do not believe that can_merge_blocks_p should reject 
merging of blocks where there's a BARRIER between them.  There is a very 
legitimate use for that that arises from conditional execution and from 
the use of __builtin_unreachable.


Obviously if you succeed in removing BARRIERS after 4.9, then all this 
becomes moot.  But as long as BARRIERS exist, we should handle them 
properly.


Jeff

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread H.J. Lu

On Tue, Dec 10, 2013 at 2:33 PM, H.J. Lu  wrote:
> On Tue, Dec 10, 2013 at 2:25 PM, Tejas Belagod  
> wrote:
>> On 10 December 2013 21:51, H.J. Lu  wrote:
>>> On Tue, Dec 10, 2013 at 1:09 PM, H.J. Lu  wrote:
 On Tue, Dec 10, 2013 at 12:39 PM, Richard Henderson  
 wrote:
> On 12/10/2013 10:44 AM, Richard Sandiford wrote:
>> Sorry, I don't understand.  I never said it was invalid.  I said
>> (subreg:SF (reg:V4SF X) 1) was invalid if (reg:V4SF X) represents
>> a single register.  On a little-endian target, the offset cannot be
>> anything other than 0 in that case.
>>
>> So the CANNOT_CHANGE_MODE_CLASS code above seems to be checking for
>> something that is always invalid, regardless of the target.  That kind
>> of situation should be rejected by target-independent code instead.
>
> But, we want to disable the subreg before we know whether or not 
> (reg:V4SF X)
> will be allocated to a single hard register.  That is something that we 
> can't
> know in target-independent code before register allocation.

 I tried Kirill's patch.  But LRA isn't prepared to handle it:

 spawn -ignore SIGHUP /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
 -B/export/build/gnu/gcc/build-x86_64-linux/gcc/
 /export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-1.c
 -B/export/build/gnu/gcc/build-x86_64-linux/x86_64-unknown-linux-gnu/32/libatomic/
 -L/export/build/gnu/gcc/build-x86_64-linux/x86_64-unknown-linux-gnu/32/libatomic/.libs
 -latomic -fno-diagnostics-show-caret -fdiagnostics-color=never -O1
 -std=c11 -pedantic-errors -lm -m32 -o ./c11-atomic-exec-1.exe^M
 /export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-1.c:
 In function 'test_simple_assign':^M
 /export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-1.c:81:1:
 internal compiler error: Maximum number of LRA constraint passes is
 achieved (30)^M
 ^M
 0x88ed77 lra_constraints(bool)^M
 /export/gnu/import/git/gcc/gcc/lra-constraints.c:3871^M
 0x87fe8c lra(_IO_FILE*)^M
 /export/gnu/import/git/gcc/gcc/lra.c:2331^M
 0x840f76 do_reload^M
 /export/gnu/import/git/gcc/gcc/ira.c:5455^M
 0x840f76 rest_of_handle_reload^M
 /export/gnu/import/git/gcc/gcc/ira.c:5584^M
 0x840f76 execute^M
 /export/gnu/import/git/gcc/gcc/ira.c:5613^M

>>>
>>> I got several hundred failures like this in GCC
>>> testsuite with -m32 and -m64 on Linux/x86-64.
>>>
>>
>> I think this is the same as:
>>
>> http://gcc.gnu.org/ml/gcc/2013-12/msg00086.html
>>
>> LRA does not seem to know how to resolve subregs with non-zero offsets
>> that don't map to a full-hardreg.
>>
>
> Looks like it.  I am rebuilding and retesting Kirill's patch +
> Vladimir's patch.
>

Vladimir's patch cuts down the ICEs to 53. I still got

spawn -ignore SIGHUP /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/gcc/build-x86_64-linux/gcc/
/export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/vect/slp-2.c
-fno-diagnostics-show-caret -fdiagnostics-color=never -msse2
-ftree-vectorize -fno-vect-cost-model -fno-common -O2
-fdump-tree-vect-details -lm -o ./slp-2.exe^M
/export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/vect/slp-2.c: In
function 'main1':^M
/export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/vect/slp-2.c:131:1:
internal compiler error: Max. number of generated reload insns per
insn is achieved (90)^M
^M
0x88ecb3 lra_constraints(bool)^M
/export/gnu/import/git/gcc/gcc/lra-constraints.c:3986^M
0x87fe8c lra(_IO_FILE*)^M
/export/gnu/import/git/gcc/gcc/lra.c:2331^M
0x840f76 do_reload^M
/export/gnu/import/git/gcc/gcc/ira.c:5455^M
0x840f76 rest_of_handle_reload^M
/export/gnu/import/git/gcc/gcc/ira.c:5584^M
0x840f76 execute^M
/export/gnu/import/git/gcc/gcc/ira.c:5613^M
Please submit a full bug report,^M
with preprocessed source if appropriate.^M
Please include the complete backtrace with any bug report.^M
See  for instructions.^M
compiler exited with status 1

I also got the wrong code regression:

FAIL: gfortran.dg/round_4.f90  -O0  execution test
FAIL: gfortran.dg/round_4.f90  -O1  execution test
FAIL: gfortran.dg/round_4.f90  -O2  execution test
FAIL: gfortran.dg/round_4.f90  -O3 -fomit-frame-pointer  execution test
FAIL: gfortran.dg/round_4.f90  -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions  execution test
FAIL: gfortran.dg/round_4.f90  -O3 -fomit-frame-pointer -funroll-loops
 execution test
FAIL: gfortran.dg/round_4.f90  -O3 -g  execution test
FAIL: gfortran.dg/round_4.f90  -Os  execution test

My hjl/subreg branch doesn't have those regressions.

-- 
H.J.

Re: [PATCH PR41488]Recognize more induction variables by simplifying PEELED chrec in scalar evolution

2013-12-10 Thread Bin.Cheng

On Wed, Dec 11, 2013 at 6:50 AM, Jakub Jelinek  wrote:
> On Tue, Dec 10, 2013 at 10:46:32AM +0800, bin.cheng wrote:
>> This is the new version patch computing the difference in tree affine then
>> comparing against integer_zero_node.
>> Bootstrap and test on current upstream.  I will commit it if there is no
>> other objection.
>
> This breaks bootstrap on x86_64-linux for me too, though in a different
> pass.
> Anyway, the bugs I'm seeing here:
> 1) other passes that use a cache for tree-affine computations
>don't initialize it by pointer_map_create (), tree-affine.c does that
>for you
> 2) the cache shouldn't be destroyed by pointer_map_destroy, but
>instead by free_affine_expand_cache that will actually make sure
>it doesn't leak memory, etc.
> 3) the actual issue why this breaks bootstrap is that you've added
>the pointer_map_destroy (should be free_affine_expand_cache per 2) )
>to scev_finalize, but scev_initialize is performed at the start
>of the loop passes and scev_finalize several passes later, and
>in between passes ggc_collect is called.  As the cache isn't registered
>with GC, if you are unlucky and ggc_collect actually collects
>in between scev_initialize and scev_finalize and garbage collects
>some trees only mentioned in the affine cache peeled_chrec_map,
>things can break completely.
>
> So, IMHO please revert this patch for now and try to find some
> better place for the free_affine_expand_cache (&peeled_chrec_map);
> call so that the cache is never non-NULL when ggc_collect may be called.
> Alternatively you'd need to either tell GC about it (but the structures
> are heap allocated, not GC), or say instruct through some GTY magic
> that during collection free_affine_expand_cache (&peeled_chrec_map);
> should be called.

Sorry for bothering, I have reverted the patch.  Will investigate it.

Thanks,
bin
>
> Jakub



-- 
Best Regards.

Re: [RFC, patch] Detect lack of 32-bit devel environment on x86_64-linux targets

2013-12-10 Thread Paolo Bonzini

Il 09/12/2013 12:08, Gerald Pfeifer ha scritto:
> On Mon, 9 Dec 2013, FX wrote:
>> > Look at this as a diagnostics bug: our current diagnostics for this 
>> > pretty common situation sucks. It comes late in the compilation, and 
>> > the message itself isn’t helpful.
> Totally seconded.
> 
> Paolo, I have been running into this myself and was confused at
> first.  If that happens to me, building GCC for some 20 years,
> imagine a new user.
> 
> This is really something to be identified at configure time,
> with a reasonable error message.
> 
> If FX's specific patch is not ideal yet, can you help rework it?
> That would be great.

The patch is okay, but if other architecture maintainers could add
similar checks for their ports (SPARC and PPC, I guess), it would be nice.

Paolo

Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-10 Thread Uros Bizjak

Hello!

> PR gcc/59422
>
> This patch extends the supported targets for function multi versiong to also
> include Haswell, Silvermont, and the most recent AMD models. It also
> prioritizes AVX2 versions over AMD specific pre-AVX2 versions.

Please add a ChangeLog entry and attach the complete patch. Please
also state how you tested the patch, as outlined in the instructions
[1].

[1] http://gcc.gnu.org/contribute.html

Thanks,
Uros.

Re: PATCH: PR target/59458: alternative 13 in *movsf_internal is mishandled

2013-12-10 Thread H.J. Lu

On Tue, Dec 10, 2013 at 3:19 PM, Uros Bizjak  wrote:
> On Tue, Dec 10, 2013 at 11:23 PM, H.J. Lu  wrote:
>> Hi,
>>
>> We have
>>
>> (define_insn "*movsf_internal"
>>   [(set (match_operand:SF 0 "nonimmediate_operand"
>>   "=Yf*f,m   ,Yf*f,?r ,?m,v,v,v,m,?r,?Yi,!*y,!*y,!m,!r ,!*Ym")
>> (match_operand:SF 1 "general_operand"
>>   "Yf*fm,Yf*f,G   ,rmF,rF,C,v,m,v,Yj,r  ,*y ,m  ,*y,*Yn,r"))]
>>
>> alternative 13 is MMXMOV instruction to store "*y" to "!m".  But
>> alternative 13 gets the default mode, SF.  But MMXMOV is
>>
>> case TYPE_MMXMOV:
>>   switch (get_attr_mode (insn))
>> {
>> case MODE_DI:
>>   return "movq\t{%1, %0|%0, %1}";
>> case MODE_SI:
>>   return "movd\t{%1, %0|%0, %1}";
>>
>> default:
>>   gcc_unreachable ();
>> }
>>
>> MODE_SF gets gcc_unreachable.  This patch sets mode to SI for
>> alternative 13. Tested on Linux/x86-64.  OK to install?
>>
>> Thanks.
>>
>>
>> H.J.
>> ---
>> 2013-12-10   H.J. Lu  
>>
>> PR target/59458
>> * config/i386/i386.md (*movsf_internal): Set mode to SI for
>> alternative 13.
>
> OK for mainline and release branches.
>

Fixed ion trunk.  4.7 and 4.8 branches are OK:

(define_insn "*movsi_internal"
  [(set (match_operand:SI 0 "nonimmediate_operand"
"=r,m ,*y,*y,?rm,?*y,*x,*x,?r ,m ,?*Yi,*x")
(match_operand:SI 1 "general_operand"
"g ,re,C ,*y,*y ,rm ,C ,*x,*Yi,*x,r   ,m "))]

...
case TYPE_MMXMOV:
  if (get_attr_mode (insn) == MODE_DI)
return "movq\t{%1, %0|%0, %1}";
  return "movd\t{%1, %0|%0, %1}";

-- 
H.J.

Re: [PING] [PATCH][RFA][PR middle-end/59285] Handle BARRIERS between blocks in rtl_merge_blocks

2013-12-10 Thread Steven Bosscher

On Tue, Dec 10, 2013 at 7:30 AM, Jeff Law wrote:
>
>
> Ping!

I'd prefer going with the first patch to merge_if_blocks (with a big
"???" or FIXME), to contain this problem to where it occurs.
Otherwise, before you know it, there are other places in the compiler
that assume it's OK to merge blocks that are not can_merge_block_p.

Ciao!
Steven

Re: PATCH: PR target/59458: alternative 13 in *movsf_internal is mishandled

2013-12-10 Thread Uros Bizjak

On Tue, Dec 10, 2013 at 11:23 PM, H.J. Lu  wrote:
> Hi,
>
> We have
>
> (define_insn "*movsf_internal"
>   [(set (match_operand:SF 0 "nonimmediate_operand"
>   "=Yf*f,m   ,Yf*f,?r ,?m,v,v,v,m,?r,?Yi,!*y,!*y,!m,!r ,!*Ym")
> (match_operand:SF 1 "general_operand"
>   "Yf*fm,Yf*f,G   ,rmF,rF,C,v,m,v,Yj,r  ,*y ,m  ,*y,*Yn,r"))]
>
> alternative 13 is MMXMOV instruction to store "*y" to "!m".  But
> alternative 13 gets the default mode, SF.  But MMXMOV is
>
> case TYPE_MMXMOV:
>   switch (get_attr_mode (insn))
> {
> case MODE_DI:
>   return "movq\t{%1, %0|%0, %1}";
> case MODE_SI:
>   return "movd\t{%1, %0|%0, %1}";
>
> default:
>   gcc_unreachable ();
> }
>
> MODE_SF gets gcc_unreachable.  This patch sets mode to SI for
> alternative 13. Tested on Linux/x86-64.  OK to install?
>
> Thanks.
>
>
> H.J.
> ---
> 2013-12-10   H.J. Lu  
>
> PR target/59458
> * config/i386/i386.md (*movsf_internal): Set mode to SI for
> alternative 13.

OK for mainline and release branches.

Thanks,
Uros.

Re: Fix PR rtl-optimization/58295

2013-12-10 Thread Eric Botcazou

> The patch does indeed fix unsigned-extend-1.c on arm and bootstraps
> succesfully on arm-none-linux-gnueabihf.

Thanks for the testing.

> Thanks for fixing this, that test failure has been spoiling my test runs for
> quite a while now :)

You're welcome.  Now applied on mainline and 4.8 branch, after testing on 
PowerPC/Linux for the latter.

-- 
Eric Botcazou

Re: [PATCH PR41488]Recognize more induction variables by simplifying PEELED chrec in scalar evolution

2013-12-10 Thread Jakub Jelinek

On Tue, Dec 10, 2013 at 10:46:32AM +0800, bin.cheng wrote:
> This is the new version patch computing the difference in tree affine then
> comparing against integer_zero_node.
> Bootstrap and test on current upstream.  I will commit it if there is no
> other objection.

This breaks bootstrap on x86_64-linux for me too, though in a different
pass.
Anyway, the bugs I'm seeing here:
1) other passes that use a cache for tree-affine computations
   don't initialize it by pointer_map_create (), tree-affine.c does that
   for you
2) the cache shouldn't be destroyed by pointer_map_destroy, but
   instead by free_affine_expand_cache that will actually make sure
   it doesn't leak memory, etc.
3) the actual issue why this breaks bootstrap is that you've added
   the pointer_map_destroy (should be free_affine_expand_cache per 2) )
   to scev_finalize, but scev_initialize is performed at the start
   of the loop passes and scev_finalize several passes later, and
   in between passes ggc_collect is called.  As the cache isn't registered
   with GC, if you are unlucky and ggc_collect actually collects
   in between scev_initialize and scev_finalize and garbage collects
   some trees only mentioned in the affine cache peeled_chrec_map,
   things can break completely.

So, IMHO please revert this patch for now and try to find some
better place for the free_affine_expand_cache (&peeled_chrec_map);
call so that the cache is never non-NULL when ggc_collect may be called.
Alternatively you'd need to either tell GC about it (but the structures
are heap allocated, not GC), or say instruct through some GTY magic
that during collection free_affine_expand_cache (&peeled_chrec_map);
should be called.

Jakub

[PATCH, libgfortran]: Emit SSE instructions when __SSE_MATH__ is defined

2013-12-10 Thread Tobias Burnus

The patch does the same for libgfortran as Uros did for 
libgcc/libatomic, cf. 
http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00878.html


"This is also how glibc generates exceptions."

Build and tested on x86-64-gnu-linux.
OK ?

Tobias
2013-12-10  Tobias Burnus  

	* config/fpu-387.h (sigill_hdlr, get_fpu_rounding_mode): Emit SSE
	instructions when __SSE_MATH__ is defined.

diff --git a/libgfortran/config/fpu-387.h b/libgfortran/config/fpu-387.h
index b35c315..0317ef1 100644
--- a/libgfortran/config/fpu-387.h
+++ b/libgfortran/config/fpu-387.h
@@ -23,7 +23,7 @@ a copy of the GCC Runtime Library Exception along with this program;
 see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 .  */
 
-#ifndef __x86_64__
+#ifndef __SSE_MATH__
 #include "cpuid.h"
 #endif
 
@@ -50,7 +50,7 @@ sigill_hdlr (int sig __attribute((unused)),
 static int
 has_sse (void)
 {
-#ifndef __x86_64__
+#ifndef __SSE_MATH__
   unsigned int eax, ebx, ecx, edx;
 
   if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
@@ -227,7 +227,7 @@ get_fpu_rounding_mode (void)
 {
   int round_mode;
 
-#ifdef __x86_64__
+#ifdef __SSE_MATH__
   unsigned int cw;
 
   __asm__ __volatile__ ("%vstmxcsr\t%0" : "=m" (cw));

Re: [buildrobot] score: Silence warnings to fix config-list.mk build

2013-12-10 Thread Jan-Benedict Glaw

Hi Jeff,

On Thu, 2013-12-05 21:14:23 -0700, Jeff Law  wrote:
> On 12/05/13 21:10, Jan-Benedict Glaw wrote:
> > On Thu, 2013-12-05 20:49:06 -0700, Jeff Law  wrote:
> > > I'd just change this one to be correct for the GNU style.  If
> > > you wanted to follow-up with another patch to fix these
> > > throughout the port, that'd be appreciated.
[...]
> If you do the followup (as you hint you might), consider that
> pre-approved if all it does is fix style/indention issues that we've
> touched on within the score port.

Looking into this right now. There are a number of braces-on-same-line
cases, patch for those is done.

  I'm also cooking an additional patch cleaning up eight spaces vs.
tabs. That patch however will be quite large.

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of:  Träume nicht von Deinem Leben: Lebe Deinen Traum!
the second  :


signature.asc
Description: Digital signature

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread H.J. Lu

On Tue, Dec 10, 2013 at 2:25 PM, Tejas Belagod  wrote:
> On 10 December 2013 21:51, H.J. Lu  wrote:
>> On Tue, Dec 10, 2013 at 1:09 PM, H.J. Lu  wrote:
>>> On Tue, Dec 10, 2013 at 12:39 PM, Richard Henderson  wrote:
 On 12/10/2013 10:44 AM, Richard Sandiford wrote:
> Sorry, I don't understand.  I never said it was invalid.  I said
> (subreg:SF (reg:V4SF X) 1) was invalid if (reg:V4SF X) represents
> a single register.  On a little-endian target, the offset cannot be
> anything other than 0 in that case.
>
> So the CANNOT_CHANGE_MODE_CLASS code above seems to be checking for
> something that is always invalid, regardless of the target.  That kind
> of situation should be rejected by target-independent code instead.

 But, we want to disable the subreg before we know whether or not (reg:V4SF 
 X)
 will be allocated to a single hard register.  That is something that we 
 can't
 know in target-independent code before register allocation.
>>>
>>> I tried Kirill's patch.  But LRA isn't prepared to handle it:
>>>
>>> spawn -ignore SIGHUP /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
>>> -B/export/build/gnu/gcc/build-x86_64-linux/gcc/
>>> /export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-1.c
>>> -B/export/build/gnu/gcc/build-x86_64-linux/x86_64-unknown-linux-gnu/32/libatomic/
>>> -L/export/build/gnu/gcc/build-x86_64-linux/x86_64-unknown-linux-gnu/32/libatomic/.libs
>>> -latomic -fno-diagnostics-show-caret -fdiagnostics-color=never -O1
>>> -std=c11 -pedantic-errors -lm -m32 -o ./c11-atomic-exec-1.exe^M
>>> /export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-1.c:
>>> In function 'test_simple_assign':^M
>>> /export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-1.c:81:1:
>>> internal compiler error: Maximum number of LRA constraint passes is
>>> achieved (30)^M
>>> ^M
>>> 0x88ed77 lra_constraints(bool)^M
>>> /export/gnu/import/git/gcc/gcc/lra-constraints.c:3871^M
>>> 0x87fe8c lra(_IO_FILE*)^M
>>> /export/gnu/import/git/gcc/gcc/lra.c:2331^M
>>> 0x840f76 do_reload^M
>>> /export/gnu/import/git/gcc/gcc/ira.c:5455^M
>>> 0x840f76 rest_of_handle_reload^M
>>> /export/gnu/import/git/gcc/gcc/ira.c:5584^M
>>> 0x840f76 execute^M
>>> /export/gnu/import/git/gcc/gcc/ira.c:5613^M
>>>
>>
>> I got several hundred failures like this in GCC
>> testsuite with -m32 and -m64 on Linux/x86-64.
>>
>
> I think this is the same as:
>
> http://gcc.gnu.org/ml/gcc/2013-12/msg00086.html
>
> LRA does not seem to know how to resolve subregs with non-zero offsets
> that don't map to a full-hardreg.
>

Looks like it.  I am rebuilding and retesting Kirill's patch +
Vladimir's patch.

-- 
H.J.

[Fortran] RFC / RFA patch for using build_predict_expr instead of builtin_expect / PR 58721

2013-12-10 Thread Tobias Burnus


Pre-remark:

I had hoped that something like the following patch would work. However, 
it will lead to a bunch of run-time segfaults in the test suite - but 
the original dump seems to be fine and I also fail to spot the problem 
when looking at the patch. Thus, instead of posting a working patch, I 
have below a nonworking draft patch. Questions:


(a) Does the build_predict_expr() usage look in principle right - or do 
I do something obviously wrong?


(b) Is it possible to use build_predict_expr() for code like "a = (cond) 
? (expr1) : (expr2)"? After gimplifying, there should be a BB [and a 
phi] - thus, adding the predict to the BB should be possible. But is it 
also possible on tree level with code like above? That mainly applies to 
trans-array.c but also in another case.


(c) How to handle:  if (cond1 || overflow_cond) {...}? Currently, 
"overflow_cond" is marked as unlikely but as build_predict_expr() 
applies to the branch/basic block and not to the condition, how should 
it be handled?


(d) And in general: Does the approach look right and the choice of PRED_ 
and their values?


* * *

gfortran uses internally builtin_expect [i.e. gfc_likely()/gfc_unlikely()].

On the other hand, builtin_expect is also used by (C/C++) users. Google 
did some tests and empirical result was that unlikely is more likely 
than GCC's prediction handling expected. I believe, the likelyhood for 
likely() reduced from a hit rate of 99 to 90.


For gfortran, the change leads to PR58721, where some code is regarded 
as less likely as before and to no inlining. In many cases, the 
unlikely() is not necessary as those branches have a call to an error 
function, annotated with "noreturn". As explicit user request overrides 
other heurisitic, the builtin_expect will cause that the predictor 
regards the gfc_unlikely() as *more* likely than without (as "noreturn" 
is less likely).



This patch tries to address this for the different cases:

a) If there is a "noreturn" function called, the gfc_(un)likely has been 
removed. [Unless, I wrongly classified is in one of the next items; 
however, a "noreturn" will override another PRED_*. Thus, it shouldn't 
matter.]


b) As OVERFLOWs should only occur extremely rarely, I used PRED_OVERFLOW 
with PROB_VERY_LIKELY (i.e. the branch is very likely to be NOT_TAKEN).


c) In many cases, GCC permits to replace run-time aborts for failures by 
asking for a status, e.g. "DEALLOCATE(var, stat=status)"; I have used 
PRED_FAIL with HITRATE(80) for those - but one can argue about the value.


d) gfortran supports a run-time warning. That's currently only used for 
copy-in/out. I think both branches can be approximately equally likely 
(with no-warning being more common in most codes). However, the warning 
is only shown once. I have set PRED_WARN_CALL's hitrate to  to 70 - but 
that's also an arbitrary value.


e) As in the trans_image_index: There one had:  "if (cond || 
invalid_bound) {... } else {...}". The "invalid_bound" is marked as 
unlikely but PRED_* cannot be use as only part of the branch condition 
is unlikely. [I've removed the unlikely.]



Does the patch look sensible? Does the addition of the PRED_* make 
sense? How about their values?



Tobias
 predict.def   |   16 +++

 fortran/trans-array.c |   26 +++-
 fortran/trans-expr.c  |   15 ---
 fortran/trans-intrinsic.c |3 -
 fortran/trans-io.c|1 
 fortran/trans-stmt.c  |   26 
 fortran/trans.c   |   97 +++---
 fortran/trans.h   |4 -

 8 files changed, 107 insertions(+), 81 deletions(-)

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 78b08d7..8f4db5a 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -79,6 +79,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "tree.h"
+#include "predict.h"	/* For PRED_FAIL.  */
 #include "gimple-expr.h"
 #include "diagnostic-core.h"	/* For internal_error/fatal_error.  */
 #include "flags.h"
@@ -4987,13 +4988,12 @@ gfc_array_init_size (tree descriptor, int rank, int corank, tree * poffset,
 			 fold_convert (gfc_array_index_type,
 	   TYPE_MAX_VALUE (gfc_array_index_type)),
 	   size);
-  cond = gfc_unlikely (fold_build2_loc (input_location, LT_EXPR,
-	boolean_type_node, tmp, stride));
+  cond = fold_build2_loc (input_location, LT_EXPR, boolean_type_node, tmp, stride);
   tmp = fold_build3_loc (input_location, COND_EXPR, integer_type_node, cond,
 			 integer_one_node, integer_zero_node);
-  cond = gfc_unlikely (fold_build2_loc (input_location, EQ_EXPR,
+  cond = /*gfc_unlikely (*/fold_build2_loc (input_location, EQ_EXPR,
 	boolean_type_node, size,
-	gfc_index_zero_node));
+	gfc_index_zero_node);
   tmp = fold_build3_loc (input_location, COND_EXPR, integer_type_node, cond,
 			 integer_zero

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread Tejas Belagod

On 10 December 2013 21:51, H.J. Lu  wrote:
> On Tue, Dec 10, 2013 at 1:09 PM, H.J. Lu  wrote:
>> On Tue, Dec 10, 2013 at 12:39 PM, Richard Henderson  wrote:
>>> On 12/10/2013 10:44 AM, Richard Sandiford wrote:
 Sorry, I don't understand.  I never said it was invalid.  I said
 (subreg:SF (reg:V4SF X) 1) was invalid if (reg:V4SF X) represents
 a single register.  On a little-endian target, the offset cannot be
 anything other than 0 in that case.

 So the CANNOT_CHANGE_MODE_CLASS code above seems to be checking for
 something that is always invalid, regardless of the target.  That kind
 of situation should be rejected by target-independent code instead.
>>>
>>> But, we want to disable the subreg before we know whether or not (reg:V4SF 
>>> X)
>>> will be allocated to a single hard register.  That is something that we 
>>> can't
>>> know in target-independent code before register allocation.
>>
>> I tried Kirill's patch.  But LRA isn't prepared to handle it:
>>
>> spawn -ignore SIGHUP /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
>> -B/export/build/gnu/gcc/build-x86_64-linux/gcc/
>> /export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-1.c
>> -B/export/build/gnu/gcc/build-x86_64-linux/x86_64-unknown-linux-gnu/32/libatomic/
>> -L/export/build/gnu/gcc/build-x86_64-linux/x86_64-unknown-linux-gnu/32/libatomic/.libs
>> -latomic -fno-diagnostics-show-caret -fdiagnostics-color=never -O1
>> -std=c11 -pedantic-errors -lm -m32 -o ./c11-atomic-exec-1.exe^M
>> /export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-1.c:
>> In function 'test_simple_assign':^M
>> /export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-1.c:81:1:
>> internal compiler error: Maximum number of LRA constraint passes is
>> achieved (30)^M
>> ^M
>> 0x88ed77 lra_constraints(bool)^M
>> /export/gnu/import/git/gcc/gcc/lra-constraints.c:3871^M
>> 0x87fe8c lra(_IO_FILE*)^M
>> /export/gnu/import/git/gcc/gcc/lra.c:2331^M
>> 0x840f76 do_reload^M
>> /export/gnu/import/git/gcc/gcc/ira.c:5455^M
>> 0x840f76 rest_of_handle_reload^M
>> /export/gnu/import/git/gcc/gcc/ira.c:5584^M
>> 0x840f76 execute^M
>> /export/gnu/import/git/gcc/gcc/ira.c:5613^M
>>
>
> I got several hundred failures like this in GCC
> testsuite with -m32 and -m64 on Linux/x86-64.
>

I think this is the same as:

http://gcc.gnu.org/ml/gcc/2013-12/msg00086.html

LRA does not seem to know how to resolve subregs with non-zero offsets
that don't map to a full-hardreg.

Thanks,
Tejas.

PATCH: PR target/59458: alternative 13 in *movsf_internal is mishandled

2013-12-10 Thread H.J. Lu

Hi,

We have

(define_insn "*movsf_internal"
  [(set (match_operand:SF 0 "nonimmediate_operand"
  "=Yf*f,m   ,Yf*f,?r ,?m,v,v,v,m,?r,?Yi,!*y,!*y,!m,!r ,!*Ym")
(match_operand:SF 1 "general_operand"
  "Yf*fm,Yf*f,G   ,rmF,rF,C,v,m,v,Yj,r  ,*y ,m  ,*y,*Yn,r"))]

alternative 13 is MMXMOV instruction to store "*y" to "!m".  But
alternative 13 gets the default mode, SF.  But MMXMOV is

case TYPE_MMXMOV:
  switch (get_attr_mode (insn))
{
case MODE_DI:
  return "movq\t{%1, %0|%0, %1}";
case MODE_SI:
  return "movd\t{%1, %0|%0, %1}";

default:
  gcc_unreachable ();
}

MODE_SF gets gcc_unreachable.  This patch sets mode to SI for
alternative 13. Tested on Linux/x86-64.  OK to install?

Thanks.


H.J.
---
2013-12-10   H.J. Lu  

PR target/59458
* config/i386/i386.md (*movsf_internal): Set mode to SI for
alternative 13.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 7138868..aa2664f 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -3126,7 +3140,7 @@
(const_string "1")
(const_string "*")))
(set (attr "mode")
-(cond [(eq_attr "alternative" "3,4,9,10,14,15")
+(cond [(eq_attr "alternative" "3,4,9,10,13,14,15")
 (const_string "SI")
   (eq_attr "alternative" "11")
 (const_string "DI")

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread H.J. Lu

On Tue, Dec 10, 2013 at 1:09 PM, H.J. Lu  wrote:
> On Tue, Dec 10, 2013 at 12:39 PM, Richard Henderson  wrote:
>> On 12/10/2013 10:44 AM, Richard Sandiford wrote:
>>> Sorry, I don't understand.  I never said it was invalid.  I said
>>> (subreg:SF (reg:V4SF X) 1) was invalid if (reg:V4SF X) represents
>>> a single register.  On a little-endian target, the offset cannot be
>>> anything other than 0 in that case.
>>>
>>> So the CANNOT_CHANGE_MODE_CLASS code above seems to be checking for
>>> something that is always invalid, regardless of the target.  That kind
>>> of situation should be rejected by target-independent code instead.
>>
>> But, we want to disable the subreg before we know whether or not (reg:V4SF X)
>> will be allocated to a single hard register.  That is something that we can't
>> know in target-independent code before register allocation.
>
> I tried Kirill's patch.  But LRA isn't prepared to handle it:
>
> spawn -ignore SIGHUP /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/gcc/build-x86_64-linux/gcc/
> /export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-1.c
> -B/export/build/gnu/gcc/build-x86_64-linux/x86_64-unknown-linux-gnu/32/libatomic/
> -L/export/build/gnu/gcc/build-x86_64-linux/x86_64-unknown-linux-gnu/32/libatomic/.libs
> -latomic -fno-diagnostics-show-caret -fdiagnostics-color=never -O1
> -std=c11 -pedantic-errors -lm -m32 -o ./c11-atomic-exec-1.exe^M
> /export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-1.c:
> In function 'test_simple_assign':^M
> /export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-1.c:81:1:
> internal compiler error: Maximum number of LRA constraint passes is
> achieved (30)^M
> ^M
> 0x88ed77 lra_constraints(bool)^M
> /export/gnu/import/git/gcc/gcc/lra-constraints.c:3871^M
> 0x87fe8c lra(_IO_FILE*)^M
> /export/gnu/import/git/gcc/gcc/lra.c:2331^M
> 0x840f76 do_reload^M
> /export/gnu/import/git/gcc/gcc/ira.c:5455^M
> 0x840f76 rest_of_handle_reload^M
> /export/gnu/import/git/gcc/gcc/ira.c:5584^M
> 0x840f76 execute^M
> /export/gnu/import/git/gcc/gcc/ira.c:5613^M
>

I got several hundred failures like this in GCC
testsuite with -m32 and -m64 on Linux/x86-64.

-- 
H.J.

Re: [Patch, Fortran] PR 35831: Checking characteristics of dummy arguments

2013-12-10 Thread Janus Weil

2013/12/10 Tobias Burnus :
> Hi Janus,
>
>
> Janus Weil wrote:
>>
>> here is a straightforward patch for a rather old PR, which deals with
>> argument checking. The patch at hand fixes one of the outstanding todo
>> items of the PR (see comment 17), namely extending the attribute
>> checking of dummy arguments. It adds checks for the four missing
>> attributes (asynchronous, contiguous, value, volatile). The relevant
>> standard reference is F08:12.3.2.
>>
>> Included is also a test case and a fix for one testsuite case which
>> was rejected with the patch.
>>
>> Regtested on x86_64-unknown-linux-gnu. Ok for trunk?
>
>
> Looks fine to me. Thanks!

Thanks for reviewing. Committed as r205873.

Cheers,
Janus

Re: [Patch, Fortran] PR 35831: Checking characteristics of dummy arguments

2013-12-10 Thread Tobias Burnus


Hi Janus,

Janus Weil wrote:

here is a straightforward patch for a rather old PR, which deals with
argument checking. The patch at hand fixes one of the outstanding todo
items of the PR (see comment 17), namely extending the attribute
checking of dummy arguments. It adds checks for the four missing
attributes (asynchronous, contiguous, value, volatile). The relevant
standard reference is F08:12.3.2.

Included is also a test case and a fix for one testsuite case which
was rejected with the patch.

Regtested on x86_64-unknown-linux-gnu. Ok for trunk?


Looks fine to me. Thanks!

Tobias


2013-12-10  Janus Weil  

 PR fortran/35831
 * interface.c (check_dummy_characteristics): Add checks for several
 attributes.


2013-12-10  Janus Weil  

 PR fortran/35831
 * gfortran.dg/c_by_val_5.f90: Modified.
 * gfortran.dg/dummy_procedure_10.f90: New.

[Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-10 Thread Allan Sandfeld Jensen

PR gcc/59422

This patch extends the supported targets for function multi versiong to also 
include Haswell, Silvermont, and the most recent AMD models. It also 
prioritizes AVX2 versions over AMD specific pre-AVX2 versions.

[Patch, Fortran] PR 35831: Checking characteristics of dummy arguments

2013-12-10 Thread Janus Weil

Hi all,

here is a straightforward patch for a rather old PR, which deals with
argument checking. The patch at hand fixes one of the outstanding todo
items of the PR (see comment 17), namely extending the attribute
checking of dummy arguments. It adds checks for the four missing
attributes (asynchronous, contiguous, value, volatile). The relevant
standard reference is F08:12.3.2.

Included is also a test case and a fix for one testsuite case which
was rejected with the patch.

Regtested on x86_64-unknown-linux-gnu. Ok for trunk?

Cheers,
Janus



2013-12-10  Janus Weil  

PR fortran/35831
* interface.c (check_dummy_characteristics): Add checks for several
attributes.


2013-12-10  Janus Weil  

PR fortran/35831
* gfortran.dg/c_by_val_5.f90: Modified.
* gfortran.dg/dummy_procedure_10.f90: New.
Index: gcc/fortran/interface.c
===
--- gcc/fortran/interface.c (revision 205870)
+++ gcc/fortran/interface.c (working copy)
@@ -1114,9 +1114,38 @@ check_dummy_characteristics (gfc_symbol *s1, gfc_s
   return false;
 }
 
-  /* FIXME: Do more comprehensive testing of attributes, like e.g.
-   ASYNCHRONOUS, CONTIGUOUS, VALUE, VOLATILE, etc.  */
+  /* Check ASYNCHRONOUS attribute.  */
+  if (s1->attr.asynchronous != s2->attr.asynchronous)
+{
+  snprintf (errmsg, err_len, "ASYNCHRONOUS mismatch in argument '%s'",
+   s1->name);
+  return false;
+}
 
+  /* Check CONTIGUOUS attribute.  */
+  if (s1->attr.contiguous != s2->attr.contiguous)
+{
+  snprintf (errmsg, err_len, "CONTIGUOUS mismatch in argument '%s'",
+   s1->name);
+  return false;
+}
+
+  /* Check VALUE attribute.  */
+  if (s1->attr.value != s2->attr.value)
+{
+  snprintf (errmsg, err_len, "VALUE mismatch in argument '%s'",
+   s1->name);
+  return false;
+}
+
+  /* Check VOLATILE attribute.  */
+  if (s1->attr.volatile_ != s2->attr.volatile_)
+{
+  snprintf (errmsg, err_len, "VOLATILE mismatch in argument '%s'",
+   s1->name);
+  return false;
+}
+
   /* Check interface of dummy procedures.  */
   if (s1->attr.flavor == FL_PROCEDURE)
 {
Index: gcc/testsuite/gfortran.dg/c_by_val_5.f90
===
--- gcc/testsuite/gfortran.dg/c_by_val_5.f90(revision 205870)
+++ gcc/testsuite/gfortran.dg/c_by_val_5.f90(working copy)
@@ -23,7 +23,7 @@ module x
  ! "external" only.
  interface
subroutine bmp_write(nx)
- integer :: nx
+ integer, value :: nx
end subroutine bmp_write
  end interface
 contains
! { dg-do compile }
!
! PR 35831: [F95] Shape mismatch check missing for dummy procedure argument
!
! Contributed by Janus Weil 

program test_attributes

  call tester1 (a1)   ! { dg-error "ASYNCHRONOUS mismatch in argument" }
  call tester2 (a2)   ! { dg-error "CONTIGUOUS mismatch in argument" }
  call tester3 (a1)   ! { dg-error "VALUE mismatch in argument" }
  call tester4 (a1)   ! { dg-error "VOLATILE mismatch in argument" }

contains

  subroutine a1(aa)
real :: aa
  end subroutine
  
  subroutine a2(bb)
real :: bb(:)
  end subroutine

  subroutine tester1 (f1)
interface
  subroutine f1 (a)
real, asynchronous :: a
  end subroutine
end interface
  end subroutine

  subroutine tester2 (f2)
interface
  subroutine f2 (b)
real, contiguous :: b(:)
  end subroutine
end interface
  end subroutine
  
  subroutine tester3 (f3)
interface
  subroutine f3 (c)
real, value :: c
  end subroutine
end interface
  end subroutine
  
  subroutine tester4 (f4)
interface
  subroutine f4 (d)
real, volatile :: d
  end subroutine
end interface
  end subroutine

end

Re: RFA (cgraph): C++ 'structor decloning patch, Mark III

2013-12-10 Thread Jason Merrill


On 12/10/2013 04:48 AM, Jan Hubicka wrote:

The case where I played with local comdats was the following.
I made cgraph_body_availability to get context argument (i.e. instead of saying
if something is available in current unit, it was saying if it is available
in current function/variable).

If two symbols are in the same comdat group and refering each other, they are
available even though they may seem overwritable to others. I then started to
produce local symbols for those local references that are equivalent to your 
comdat
local.

We probably want to get in this extension too. (I did not get data on how often
it fires, but it does i.e. for recursive calls of C++ inlines).


I wouldn't expect it to affect anything other than recursive calls, 
since before my patch functions in the same COMDAT don't call each 
other, and with my patch they only call functions that are already local.


Also, this optimization would seem to apply to all recursive functions, 
not just those in comdat groups.


Are you thinking to add this after my patch is in?


+  /* Don't clone decls local to a comdat group.  */
+  if (comdat_local_p (node))
+return false;


Why? It seems this should work if the clone becomes another comdat local?


Right, I think the problem was that the clone wasn't becoming comdat 
local.  And for the specific case of decloning, there's no point in 
cloning the decloned constructor.



On the other hand, I think you want to prevent ipa-cp propagating addresses of 
comdat
locals out of the function. (i.e. make it to check can_refer predicate for its 
subtitutions)


Right, I wasn't worrying about that because it can't come up with decloning.


So we should check here if both caller and callee are in the same group and 
allow
inlining then?


Makes sense.


Probably you want to get make_decl_local to preserve comdat group; then you 
won't need
to care about it here and clonning could work.


OK.


Probably also when declaring a comdat symbol local, we want to turn all 
associated
comdats to local, right? (i.e. with LTO and hidden comdat)


I don't think so; if we change a public comdat symbol to local, that's a 
change to the ABI of the comdat, so it can't be the same comdat anymore, 
and dissolving the comdat seems appropriate.



Also i think this change needs more work.  FROM_DECL is not the function you
are going to get the reference, it is variable whose constructor the value was
take from.
I think we need to rename it to can_refer_decl_in_symbol and add symbol argument
to get proper check. I am not sure where we use it without being sure the symbol
is current_function_decl.  We definitly make use of it during devirt and we 
need to
start using it in ipa-cp.


Would it be OK for me to just drop this change, since it isn't needed 
for decloning?



Also this is not correct - because it ignores functions inlined into NODE and
use in can_inline_edge_p will lead to quadratic time issues.  We probably want
ipa-inline to compute flag if function refer to local symbol just before
inlining starts and propagate it across inlining decisions in make_edge_inline
and only check the flag in can_inline_edge_p.


OK.

Jason

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread H.J. Lu

On Tue, Dec 10, 2013 at 12:39 PM, Richard Henderson  wrote:
> On 12/10/2013 10:44 AM, Richard Sandiford wrote:
>> Sorry, I don't understand.  I never said it was invalid.  I said
>> (subreg:SF (reg:V4SF X) 1) was invalid if (reg:V4SF X) represents
>> a single register.  On a little-endian target, the offset cannot be
>> anything other than 0 in that case.
>>
>> So the CANNOT_CHANGE_MODE_CLASS code above seems to be checking for
>> something that is always invalid, regardless of the target.  That kind
>> of situation should be rejected by target-independent code instead.
>
> But, we want to disable the subreg before we know whether or not (reg:V4SF X)
> will be allocated to a single hard register.  That is something that we can't
> know in target-independent code before register allocation.

I tried Kirill's patch.  But LRA isn't prepared to handle it:

spawn -ignore SIGHUP /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/gcc/build-x86_64-linux/gcc/
/export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-1.c
-B/export/build/gnu/gcc/build-x86_64-linux/x86_64-unknown-linux-gnu/32/libatomic/
-L/export/build/gnu/gcc/build-x86_64-linux/x86_64-unknown-linux-gnu/32/libatomic/.libs
-latomic -fno-diagnostics-show-caret -fdiagnostics-color=never -O1
-std=c11 -pedantic-errors -lm -m32 -o ./c11-atomic-exec-1.exe^M
/export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-1.c:
In function 'test_simple_assign':^M
/export/gnu/import/git/gcc/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-1.c:81:1:
internal compiler error: Maximum number of LRA constraint passes is
achieved (30)^M
^M
0x88ed77 lra_constraints(bool)^M
/export/gnu/import/git/gcc/gcc/lra-constraints.c:3871^M
0x87fe8c lra(_IO_FILE*)^M
/export/gnu/import/git/gcc/gcc/lra.c:2331^M
0x840f76 do_reload^M
/export/gnu/import/git/gcc/gcc/ira.c:5455^M
0x840f76 rest_of_handle_reload^M
/export/gnu/import/git/gcc/gcc/ira.c:5584^M
0x840f76 execute^M
/export/gnu/import/git/gcc/gcc/ira.c:5613^M

>> In other words I'm arguing against the idea of passing the offset to
>> CANNOT_CHANGE_MODE_CLASS (which you seemed to be supporting in the
>> quote above).  I think Kirill's patch to remove the i386.c check was
>> the right way to go.
>
>
> Unless you can figure a way around the above, I think passing the offset to
> C_C_M_C is probably the way to go.  I need to have a look over the patches
> though...
>
>>
>> There's no need for a separate insn though.  Once you allow the subregs
>> (as per Kirill's patch), the normal move patterns will handle them.
>
> Absolutely.
>

We may need to adjust the existing patterns if subreg is allowed.
I have a few small testcases I can try.

-- 
H.J.

Re: [RFC, patch] Detect lack of 32-bit devel environment on x86_64-linux targets

2013-12-10 Thread Richard Henderson

On 12/09/2013 02:46 AM, Gerald Pfeifer wrote:
> Hmm, it looks like this has not been approved/applied, but I also
> have not seen any NACK.
> 
> This does address an annoying (and hard for novices to understand)
> roadblock for someone installing GCC manually.  Can this go in?

The patch looks ok to me.

Totally agreed with down-thread re avoiding messages to gcc-help for the 99%.


r~

Go patch committed: Test r* tests in parallel

2013-12-10 Thread Ian Lance Taylor

Among the slower tests in the Go gcc testsuite are the rotate* tests.
This patch changes the GCC testsuite driver to test them in parallel
when using make -j.  Tested by running the testsuite on
x86_64-unknown-linux-gnu and verifying that there were no differences.
Committed to mainline.

Ian


2013-12-10  Ian Lance Taylor  

* Make-lang.in (check_go_parallelize): Test go-test.exp r* tests
separately.


Index: Make-lang.in
===
--- Make-lang.in	(revision 205710)
+++ Make-lang.in	(working copy)
@@ -132,9 +132,10 @@ go.srcman: doc/gccgo.1
 
 lang_checks += check-go
 lang_checks_parallelized += check-go
-check_go_parallelize = go-test.exp=*/test/\[0-57-9a-bd-hj-zA-Z\]* \
+check_go_parallelize = go-test.exp=*/test/\[0-57-9a-bd-hj-qs-zA-Z\]* \
 		   go-test.exp=*/test/c* \
 		   go-test.exp=*/test/i* \
+		   go-test.exp=*/test/r* \
 		   go-test.exp=*/test/6*
 
 # Install hooks.

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread Richard Henderson

On 12/10/2013 10:44 AM, Richard Sandiford wrote:
> Sorry, I don't understand.  I never said it was invalid.  I said
> (subreg:SF (reg:V4SF X) 1) was invalid if (reg:V4SF X) represents
> a single register.  On a little-endian target, the offset cannot be
> anything other than 0 in that case.
> 
> So the CANNOT_CHANGE_MODE_CLASS code above seems to be checking for
> something that is always invalid, regardless of the target.  That kind
> of situation should be rejected by target-independent code instead.

But, we want to disable the subreg before we know whether or not (reg:V4SF X)
will be allocated to a single hard register.  That is something that we can't
know in target-independent code before register allocation.

> In other words I'm arguing against the idea of passing the offset to
> CANNOT_CHANGE_MODE_CLASS (which you seemed to be supporting in the
> quote above).  I think Kirill's patch to remove the i386.c check was
> the right way to go.


Unless you can figure a way around the above, I think passing the offset to
C_C_M_C is probably the way to go.  I need to have a look over the patches
though...

> 
> There's no need for a separate insn though.  Once you allow the subregs
> (as per Kirill's patch), the normal move patterns will handle them.

Absolutely.


r~

Re: [PATCH PR41488]Recognize more induction variables by simplifying PEELED chrec in scalar evolution

2013-12-10 Thread H.J. Lu

On Mon, Dec 9, 2013 at 6:46 PM, bin.cheng  wrote:
>
>>
> Hi,
> This is the new version patch computing the difference in tree affine then
> comparing against integer_zero_node.
> Bootstrap and test on current upstream.  I will commit it if there is no
> other objection.
>

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59445


-- 
H.J.

Re: Fix tsan tests.

2013-12-10 Thread Mike Stump

On Dec 10, 2013, at 7:06 AM, Yury Gribov  wrote:
> Jakub wrote:
> > HJ wrote:
> >>> Done, r205853.
> >> I think it caused:
> >>
> >> http://gcc.gnu.org/ml/gcc-regression/2013-12/msg00214.html
> >
> > Missing torture-finish before dg-finish? At least looking at other
> > *.exp files that do set-torture-options they all do that.
> 
> Full make-check is still running but I was able to repro this with `make 
> check RUNTESTFLAGS='tsan.exp i386-prefetch.exp math-torture.exp'. Attached 
> patch is tested against these exps as well. Ok to commit or should I wait for 
> make-check to complete? I'd prefer to unblock other developers ASAP.

I'm ok with correcting things with subset testing, but just be careful.  The 
full test suite can in some instances show how one test case driver can knock 
into the next in a bad way, so when changing things that might knock, one still 
needs to do a full run and ensure no fallout.

Re: AARCH64 configure check for gas -mabi support

2013-12-10 Thread Kugan


[snip]

>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index b1b4eef..a53febc 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -5187,6 +5187,13 @@ aarch64_override_options (void)
>> aarch64_parse_tune ();
>>   }
>>
>> +/* Issue error if assembler does not support -mabi and option ilp32
>> +  is selected.  */
>
>I'd prefer the comment to be "The compiler may have been configured
>with 2.23.* binutils, which does not have support for ILP32."


>> +#ifndef HAVE_AS_MABI_OPTION
>> +  if (TARGET_ILP32)
>> +error ("Assembler does not supprt -mabi=ilp32");
>> +#endif
> 
> supprt/support
> 
[snip]

> I'm not very sure about the indent rules for configury files, but in
> other areas of configure.ac, it seems using a similar indent convention
> as in .c files.
> 

Thanks Yufeng. I have updated the patch based on the comments above.

Marcus, is this OK for trunk now?

Thanks,
Kugan


gcc/

+2013-12-11  Kugan Vivekanandarajah  
+   * configure.ac: Add check for aarch64 assembler -mabi support.
+   * configure: Regenerate.
+   * config.in: Regenerate.
+   * config/aarch64/aarch64-elf.h (ASM_MABI_SPEC): New define.
+   (ASM_SPEC): Update to substitute -mabi with ASM_MABI_SPEC.
+   * config/aarch64/aarch64.h (aarch64_override_options):  Issue error if
+   assembler does not support -mabi and option ilp32 is selected.
+   * doc/install.texi: Added note that building gcc 4.9 and after with pre
+   2.24 binutils will not support -mabi=ilp32.
+
diff --git a/gcc/config/aarch64/aarch64-elf.h b/gcc/config/aarch64/aarch64-elf.h
index 4757d22..a66c3db 100644
--- a/gcc/config/aarch64/aarch64-elf.h
+++ b/gcc/config/aarch64/aarch64-elf.h
@@ -134,13 +134,19 @@
   " %{!mbig-endian:%{!mlittle-endian:" ENDIAN_SPEC "}}" \
   " %{!mabi=*:" ABI_SPEC "}"
 
+#ifdef HAVE_AS_MABI_OPTION
+#define ASM_MABI_SPEC  "%{mabi=*:-mabi=%*}"
+#else
+#define ASM_MABI_SPEC  "%{mabi=lp64:}"
+#endif
+
 #ifndef ASM_SPEC
 #define ASM_SPEC "\
 %{mbig-endian:-EB} \
 %{mlittle-endian:-EL} \
 %{mcpu=*:-mcpu=%*} \
-%{march=*:-march=%*} \
-%{mabi=*:-mabi=%*}"
+%{march=*:-march=%*}" \
+ASM_MABI_SPEC
 #endif
 
 #undef TYPE_OPERAND_FMT
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b1b4eef..01dbe23 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5187,6 +5187,13 @@ aarch64_override_options (void)
   aarch64_parse_tune ();
 }
 
+#ifndef HAVE_AS_MABI_OPTION
+  /* The compiler may have been configured with 2.23.* binutils, which does
+ not have support for ILP32.  */
+  if (TARGET_ILP32)
+error ("Assembler does not support -mabi=ilp32");
+#endif
+
   initialize_aarch64_code_model ();
 
   aarch64_build_bitmask_table ();
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 91a22d5..0a3b97b 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -3495,6 +3495,35 @@ AC_DEFINE_UNQUOTED(HAVE_LTO_PLUGIN, $gcc_cv_lto_plugin,
 AC_MSG_RESULT($gcc_cv_lto_plugin)
 
 case "$target" in
+  aarch64*-*-*)
+gcc_GAS_CHECK_FEATURE([-mabi option], gcc_cv_as_aarch64_mabi,,
+  [-mabi=lp64], [.text],,,)
+if test x$gcc_cv_as_aarch64_mabi = xyes; then
+  AC_DEFINE(HAVE_AS_MABI_OPTION, 1,
+[Define if your assembler supports the -mabi option.])
+else
+  if test x$with_abi = xilp32; then
+AC_MSG_ERROR([Assembler does not support -mabi=ilp32.\
+ Upgrade the Assembler.])
+  fi
+  if test x"$with_multilib_list" = xdefault; then
+TM_MULTILIB_CONFIG=lp64
+  else
+aarch64_multilibs=`echo $with_multilib_list | sed -e 's/,/ /g'`
+for aarch64_multilib in ${aarch64_multilibs}; do
+  case ${aarch64_multilib} in
+ilp32)
+  AC_MSG_ERROR([Assembler does not support -mabi=ilp32.\
+Upgrade the Assembler.])
+  ;;
+*)
+  ;;
+  esac
+done
+  fi
+fi
+;;
+
   # All TARGET_ABI_OSF targets.
   alpha*-*-linux* | alpha*-*-*bsd*)
 gcc_GAS_CHECK_FEATURE([explicit relocation support],
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index a8f9f8a..00c4f0d 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -3735,6 +3735,15 @@ removed and the system libunwind library will always be 
used.
 
 @html
 
+@end html
+@anchor{aarch64-x-x}
+@heading aarch64*-*-*
+Pre 2.24 binutils does not have support for selecting -mabi and does not
+support ILP32.  If GCC 4.9 or later is built with pre 2.24, GCC will not
+support option -mabi=ilp32.
+
+@html
+
 
 @end html
 @anchor{x-ibm-aix}

Re: [Ping]Two pending IVOPT patches

2013-12-10 Thread Jeff Law


On 12/10/13 00:01, bin.cheng wrote:

Emm, some kind of.  See the cost of iv candidate set consists of several
parts, the representation cost in cost pair; the register pressure cost
falls in dependence on invariant expressions, etc..  Here iv_ca_has_deps
checks whether new cost pair depends on other invariant expression which is
not involved before, if so, current algorithm considers the new cost pair
has higher cost and just skips.  In fact, the new cost pair may give us
lower overall cost even it introduces new dependence(similar to the case I
gave).  That's why I used the overall cost comparison for good.

OK.  Thanks for the explanation.



Is this new version patch looks good to you? I will re-test if it's ok.

It does.  Please do some final testing and it should be good to go.
jeff

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-12-10 Thread Teresa Johnson

On Tue, Dec 10, 2013 at 11:01 AM, Xinliang David Li  wrote:
> I agree with the staged checkin plan proposed.  Teresa, can you help
> with spliting out the libgcov.h/gcov-io.h change out (Rong is on a
> trip..)

Sure, I will work on that part and send a patch once it is tested.

Teresa

>
> thanks,
>
> David
>
> On Fri, Dec 6, 2013 at 6:23 AM, Jan Hubicka  wrote:
>>> Hi, all
>>>
>>> This is the new patch for gcov-tool (previously profile-tool).
>>>
>>> Honza: can you comment on the new merge interface? David posted some
>>> comments in an earlier email and we want to know what's your opinion.
>>>
>>> Test patch has been tested with boostrap, regresssion,
>>> profiledbootstrap and SPEC2006.
>>>
>>> Noticeable changes from the earlier version:
>>>
>>> 1. create a new file libgcov.h and move libgcov-*.h headers to libgcov.h
>>> So we can included multiple libgcov-*.c without adding new macros.
>>>
>>> 2. split libgcov.h specific code in gcvo-io.h to libcc/libgcov.h
>>> Avoid multiple-page of code under IN_LIBGCOV macro -- this
>>> improves the readability.
>>>
>>> 3. make gcov_var static, and move the definition from gcov-io.h to
>>> gcov-io.c. Also
>>>move some static functions accessing gcov_var to gcvo-io.c
>>> Current code rely on GCOV_LINKAGE tricks to avoid multi-definition. I don't 
>>> see
>>> a reason that gcov_var needs to exposed as a global.
>>>
>>> 4. expose gcov_write_strings() and gcov_sync() to gcov_tool usage
>>>
>>> 5. rename profile-tool to gcov-tool per Honza's suggestion.
>>>
>>> Thanks,
>>
>> Hi,
>> I did not read in deatil the gcov-tool source itself, but lets first make 
>> the interface changes
>> needed.
>>
>>> 2013-11-18  Rong Xu  
>>>
>>>   * gcc/gcov-io.c (gcov_var): Moved from gcov-io.h and make it static.
>>>   (gcov_position): Move from gcov-io.h
>>>   (gcov_is_error): Ditto.
>>>   (gcov_rewrite): Ditto.
>>>   * gcc/gcov-io.h: Re-factoring. Move gcov_var to gcov-io.h and
>>> move the libgcov only part of libgcc/libgcov.h.
>>>   * libgcc/libgcov.h: New common header files for libgcov-*.h
>>>   * libgcc/Makefile.in: Add dependence to libgcov.h
>>>   * libgcc/libgcov-profiler.c: Use libgcov.h
>>>   * libgcc/libgcov-driver.c: Ditto.
>>>   * libgcc/libgcov-interface.c: Ditto.
>>>   * libgcc/libgcov-driver-system.c (allocate_filename_struct): use
>>>   xmalloc instread of malloc.
>>>   * libgcc/libgcov-merge.c (void __gcov_merge_delta): Add more
>>>   parameters to merge function.
>>>   (__gcov_merge_add): Ditto.
>>>   (__gcov_merge_ior): Ditto.
>>>   (__gcov_merge_time_profile): Ditto.
>>>   (__gcov_merge_single): Ditto.
>>>   (__gcov_merge_delta): Ditto.
>>>   * libgcc/libgcov-tool.c (void gcov_tool_set_verbose): New for
>>>   gcov-tool support.
>>>   (set_fn_ctrs): Ditto.
>>>   (tag_function): Ditto.
>>>   (tag_blocks): Ditto.
>>>   (tag_arcs): Ditto.
>>>   (tag_lines): Ditto.
>>>   (tag_counters): Ditto.
>>>   (tag_summary): Ditto.
>>>   (read_gcda_finalize): Ditto.
>>>   (read_gcda_file): Ditto.
>>>   (ftw_read_file): Ditto.
>>>   (read_profile_dir_init) Ditto.:
>>>   (gcov_read_profile_dir): Ditto.
>>>   (gcov_merge): Ditto.
>>>   (find_match_gcov_inf Ditto.o):
>>>   (gcov_profile_merge): Ditto.
>>>   (__gcov_scale_add): Ditto.
>>>   (__gcov_scale_ior): Ditto.
>>>   (__gcov_scale_delta): Ditto.
>>>   (__gcov_scale_single): Ditto.
>>>   (gcov_profile_scale): Ditto.
>>>   (gcov_profile_normalize): Ditto.
>>>   (__gcov_scale2_add): Ditto.
>>>   (__gcov_scale2_ior): Ditto.
>>>   (__gcov_scale2_delta): Ditto.
>>>   (__gcov_scale2_single): Ditto.
>>>   (gcov_profile_scale2): Ditto.
>>>   * gcc/gcov-tool.c (unlink_file): Gcov-tool driver support.
>>>   (unlink_dir): Ditto.
>>>   (profile_merge): Ditto.
>>>   (print_merge_usage_message): Ditto.
>>>   (merge_usage): Ditto.
>>>   (do_merge): Ditto.
>>>   (profile_rewrite2): Ditto.
>>>   (profile_rewrite): Ditto.
>>>   (print_rewrite_usage_message): Ditto.
>>>   (rewrite_usage): Ditto.
>>>   (do_rewrite): Ditto.
>>>   (print_usage): Ditto.
>>>   (print_version): Ditto.
>>>   (process_args): Ditto.
>>>   (main): Ditto.
>>>   * gcc/Makefile.in: Build and install gcov-tool.
>>
>>> Index: gcc/gcov-io.c
>>> ===
>>> --- gcc/gcov-io.c (revision 204895)
>>> +++ gcc/gcov-io.c (working copy)
>>> @@ -36,6 +36,37 @@ static const gcov_unsigned_t *gcov_read_words (uns
>>>  static void gcov_allocate (unsigned);
>>>  #endif
>>>
>>> +/* Moved for gcov-io.h and make it static.  */
>>> +static struct gcov_var gcov_var;
>>
>> This is more an changelog message than a comment in source file.
>> Just describe what gcov_var is.
>>
>> Do you know how the size of libgcov changed with your patch?
>> Quick check of current mainline on co

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread Paul_Koning


On Dec 10, 2013, at 2:12 PM, H.J. Lu  wrote:

> On Tue, Dec 10, 2013 at 11:05 AM, Tejas Belagod  wrote:
>> ...
>> So, if (subreg:DI (match_operand:V4SF 1 "register_operand" "x,x") 0) is a
>> valid subreg, why not allow it in CANNOT_CHANGE_MODE_CLASS (like in Kirill's
>> patch http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00987.html) and resolve
>> the actual register later in subreg_get_info ()?
> 
> Let's wait for Kirill's results on GCC testsuite.

I'm puzzled.  What is the connection between testsuite results and a design 
decision about a code change?  The question remains valid even if the testsuite 
didn't exist at all.

paul

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread H.J. Lu

On Tue, Dec 10, 2013 at 11:05 AM, Tejas Belagod  wrote:
> H.J. Lu wrote:
>>
>> On Tue, Dec 10, 2013 at 9:26 AM, Tejas Belagod  wrote:
>>>
>>> H.J. Lu wrote:

 On Tue, Dec 10, 2013 at 9:04 AM, Kirill Yukhin 
 wrote:
>
> On 10 Dec 08:23, H.J. Lu wrote:
>>
>> What is wrong to pass the correct offset to
>> CANNOT_CHANGE_MODE_CLASS?  Backends are free to
>> ignore it.
>
> Yes, but as fas as understand this hook as a predicate
> saying if it not-safe to change mode1 to mode2 for given


 In many places, the macro is used with the known offset.
 I have a follow up patch which improves x86 code generation,
 in cases like:

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45198

>>> What is it that subreg_get_info () can't resolve that
>>> CANNOT_CHANGE_MODE_CLASS with an offset can?
>>>
>>
>> We have
>>
>> int
>> simplify_subreg_regno (unsigned int xregno, enum machine_mode xmode,
>>unsigned int offset, enum machine_mode ymode)
>> {
>>   struct subreg_info info;
>>   unsigned int yregno;
>>
>> #ifdef CANNOT_CHANGE_MODE_CLASS
>>   /* Give the backend a chance to disallow the mode change.  */
>>   if (GET_MODE_CLASS (xmode) != MODE_COMPLEX_INT
>>   && GET_MODE_CLASS (xmode) != MODE_COMPLEX_FLOAT
>>   && REG_CANNOT_CHANGE_MODE_P (xregno, xmode, ymode)
>>   /* We can use mode change in LRA for some transformations.  */
>>   && ! lra_in_progress)
>> return -1;
>> #endif
>>
>> CANNOT_CHANGE_MODE_CLASS is checked before subreg_get_info is
>> called.  When REG_CANNOT_CHANGE_MODE_P returns false,
>> there is nothing subreg_get_info can do.
>>
>
> So, if (subreg:DI (match_operand:V4SF 1 "register_operand" "x,x") 0) is a
> valid subreg, why not allow it in CANNOT_CHANGE_MODE_CLASS (like in Kirill's
> patch http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00987.html) and resolve
> the actual register later in subreg_get_info ()?

Let's wait for Kirill's results on GCC testsuite.

> In general, as I understand it, CANNOT_CHANGE_MODE_CLASS doesn't need an
> offset because it only checks for validity of a mode-change in a regclass
> from the point of view of bit-correct-representation (eg. a register when
> subreg'ed still has the same order of bits) - the actual register reference
> is done elsewhere using the offset.
>
> Thanks,
> Tejas.
>



-- 
H.J.

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread Tejas Belagod


H.J. Lu wrote:

On Tue, Dec 10, 2013 at 9:26 AM, Tejas Belagod  wrote:

H.J. Lu wrote:

On Tue, Dec 10, 2013 at 9:04 AM, Kirill Yukhin 
wrote:

On 10 Dec 08:23, H.J. Lu wrote:

What is wrong to pass the correct offset to
CANNOT_CHANGE_MODE_CLASS?  Backends are free to
ignore it.

Yes, but as fas as understand this hook as a predicate
saying if it not-safe to change mode1 to mode2 for given


In many places, the macro is used with the known offset.
I have a follow up patch which improves x86 code generation,
in cases like:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45198


What is it that subreg_get_info () can't resolve that
CANNOT_CHANGE_MODE_CLASS with an offset can?



We have

int
simplify_subreg_regno (unsigned int xregno, enum machine_mode xmode,
   unsigned int offset, enum machine_mode ymode)
{
  struct subreg_info info;
  unsigned int yregno;

#ifdef CANNOT_CHANGE_MODE_CLASS
  /* Give the backend a chance to disallow the mode change.  */
  if (GET_MODE_CLASS (xmode) != MODE_COMPLEX_INT
  && GET_MODE_CLASS (xmode) != MODE_COMPLEX_FLOAT
  && REG_CANNOT_CHANGE_MODE_P (xregno, xmode, ymode)
  /* We can use mode change in LRA for some transformations.  */
  && ! lra_in_progress)
return -1;
#endif

CANNOT_CHANGE_MODE_CLASS is checked before subreg_get_info is
called.  When REG_CANNOT_CHANGE_MODE_P returns false,
there is nothing subreg_get_info can do.



So, if (subreg:DI (match_operand:V4SF 1 "register_operand" "x,x") 0) is a valid 
subreg, why not allow it in CANNOT_CHANGE_MODE_CLASS (like in Kirill's patch 
http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00987.html) and resolve the actual 
register later in subreg_get_info ()?


In general, as I understand it, CANNOT_CHANGE_MODE_CLASS doesn't need an offset 
because it only checks for validity of a mode-change in a regclass from the 
point of view of bit-correct-representation (eg. a register when subreg'ed still 
has the same order of bits) - the actual register reference is done elsewhere 
using the offset.


Thanks,
Tejas.

Re: [RFC] libgcov.c re-factoring and offline profile-tool

2013-12-10 Thread Xinliang David Li

I agree with the staged checkin plan proposed.  Teresa, can you help
with spliting out the libgcov.h/gcov-io.h change out (Rong is on a
trip..)

thanks,

David

On Fri, Dec 6, 2013 at 6:23 AM, Jan Hubicka  wrote:
>> Hi, all
>>
>> This is the new patch for gcov-tool (previously profile-tool).
>>
>> Honza: can you comment on the new merge interface? David posted some
>> comments in an earlier email and we want to know what's your opinion.
>>
>> Test patch has been tested with boostrap, regresssion,
>> profiledbootstrap and SPEC2006.
>>
>> Noticeable changes from the earlier version:
>>
>> 1. create a new file libgcov.h and move libgcov-*.h headers to libgcov.h
>> So we can included multiple libgcov-*.c without adding new macros.
>>
>> 2. split libgcov.h specific code in gcvo-io.h to libcc/libgcov.h
>> Avoid multiple-page of code under IN_LIBGCOV macro -- this
>> improves the readability.
>>
>> 3. make gcov_var static, and move the definition from gcov-io.h to
>> gcov-io.c. Also
>>move some static functions accessing gcov_var to gcvo-io.c
>> Current code rely on GCOV_LINKAGE tricks to avoid multi-definition. I don't 
>> see
>> a reason that gcov_var needs to exposed as a global.
>>
>> 4. expose gcov_write_strings() and gcov_sync() to gcov_tool usage
>>
>> 5. rename profile-tool to gcov-tool per Honza's suggestion.
>>
>> Thanks,
>
> Hi,
> I did not read in deatil the gcov-tool source itself, but lets first make the 
> interface changes
> needed.
>
>> 2013-11-18  Rong Xu  
>>
>>   * gcc/gcov-io.c (gcov_var): Moved from gcov-io.h and make it static.
>>   (gcov_position): Move from gcov-io.h
>>   (gcov_is_error): Ditto.
>>   (gcov_rewrite): Ditto.
>>   * gcc/gcov-io.h: Re-factoring. Move gcov_var to gcov-io.h and
>> move the libgcov only part of libgcc/libgcov.h.
>>   * libgcc/libgcov.h: New common header files for libgcov-*.h
>>   * libgcc/Makefile.in: Add dependence to libgcov.h
>>   * libgcc/libgcov-profiler.c: Use libgcov.h
>>   * libgcc/libgcov-driver.c: Ditto.
>>   * libgcc/libgcov-interface.c: Ditto.
>>   * libgcc/libgcov-driver-system.c (allocate_filename_struct): use
>>   xmalloc instread of malloc.
>>   * libgcc/libgcov-merge.c (void __gcov_merge_delta): Add more
>>   parameters to merge function.
>>   (__gcov_merge_add): Ditto.
>>   (__gcov_merge_ior): Ditto.
>>   (__gcov_merge_time_profile): Ditto.
>>   (__gcov_merge_single): Ditto.
>>   (__gcov_merge_delta): Ditto.
>>   * libgcc/libgcov-tool.c (void gcov_tool_set_verbose): New for
>>   gcov-tool support.
>>   (set_fn_ctrs): Ditto.
>>   (tag_function): Ditto.
>>   (tag_blocks): Ditto.
>>   (tag_arcs): Ditto.
>>   (tag_lines): Ditto.
>>   (tag_counters): Ditto.
>>   (tag_summary): Ditto.
>>   (read_gcda_finalize): Ditto.
>>   (read_gcda_file): Ditto.
>>   (ftw_read_file): Ditto.
>>   (read_profile_dir_init) Ditto.:
>>   (gcov_read_profile_dir): Ditto.
>>   (gcov_merge): Ditto.
>>   (find_match_gcov_inf Ditto.o):
>>   (gcov_profile_merge): Ditto.
>>   (__gcov_scale_add): Ditto.
>>   (__gcov_scale_ior): Ditto.
>>   (__gcov_scale_delta): Ditto.
>>   (__gcov_scale_single): Ditto.
>>   (gcov_profile_scale): Ditto.
>>   (gcov_profile_normalize): Ditto.
>>   (__gcov_scale2_add): Ditto.
>>   (__gcov_scale2_ior): Ditto.
>>   (__gcov_scale2_delta): Ditto.
>>   (__gcov_scale2_single): Ditto.
>>   (gcov_profile_scale2): Ditto.
>>   * gcc/gcov-tool.c (unlink_file): Gcov-tool driver support.
>>   (unlink_dir): Ditto.
>>   (profile_merge): Ditto.
>>   (print_merge_usage_message): Ditto.
>>   (merge_usage): Ditto.
>>   (do_merge): Ditto.
>>   (profile_rewrite2): Ditto.
>>   (profile_rewrite): Ditto.
>>   (print_rewrite_usage_message): Ditto.
>>   (rewrite_usage): Ditto.
>>   (do_rewrite): Ditto.
>>   (print_usage): Ditto.
>>   (print_version): Ditto.
>>   (process_args): Ditto.
>>   (main): Ditto.
>>   * gcc/Makefile.in: Build and install gcov-tool.
>
>> Index: gcc/gcov-io.c
>> ===
>> --- gcc/gcov-io.c (revision 204895)
>> +++ gcc/gcov-io.c (working copy)
>> @@ -36,6 +36,37 @@ static const gcov_unsigned_t *gcov_read_words (uns
>>  static void gcov_allocate (unsigned);
>>  #endif
>>
>> +/* Moved for gcov-io.h and make it static.  */
>> +static struct gcov_var gcov_var;
>
> This is more an changelog message than a comment in source file.
> Just describe what gcov_var is.
>
> Do you know how the size of libgcov changed with your patch?
> Quick check of current mainline on compiling empty main gives:
>
> jh@gcc10:~/trunk/build/gcc$ cat t.c
> main()
> {
> }
> jh@gcc10:~/trunk/build/gcc$ ./xgcc -B ./ -O2 -fprofile-generate -o a.out-new 
> --static t.c
> jh@gcc10:~/trunk/build/gcc$ gcc -O2 -fprofile-generate -o a.out-old --static 
> t.c
> jh@gcc

[PATCH, preprocessor] Fix 56896

2013-12-10 Thread Ryan Mansfield

Fixes missing DIR_SEPARATOR if --with-gxx-include-dir configured as 
subdir of sysroot.


2013-12-10  Ryan Mansfield  

PR preprocessor/56896
* incpath.c (add_standard_paths): Strip trailing sysroot 
DIR_SEPARATOR

only if appended path begins with DIR_SEPARATOR.

Regards,

Ryan Mansfield
Index: incpath.c
===
--- incpath.c	(revision 205857)
+++ incpath.c	(working copy)
@@ -179,7 +179,8 @@
 	  char *sysroot_no_trailing_dir_separator = xstrdup (sysroot);
 	  size_t sysroot_len = strlen (sysroot);
 
-	  if (sysroot_len > 0 && sysroot[sysroot_len - 1] == DIR_SEPARATOR)
+	  if (sysroot_len > 0 && sysroot[sysroot_len - 1] == DIR_SEPARATOR
+		  && p->fname[0] == DIR_SEPARATOR)
 		sysroot_no_trailing_dir_separator[sysroot_len - 1] = '\0';
 	  str = concat (sysroot_no_trailing_dir_separator, p->fname, NULL);
 	  free (sysroot_no_trailing_dir_separator);

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread H.J. Lu

On Tue, Dec 10, 2013 at 10:44 AM, Richard Sandiford
 wrote:
> "H.J. Lu"  writes:
>> On Tue, Dec 10, 2013 at 10:26 AM, Richard Sandiford
>>  wrote:
>>> "H.J. Lu"  writes:
 On Tue, Dec 10, 2013 at 9:57 AM, Richard Sandiford
  wrote:
> "H.J. Lu"  writes:
>> On Tue, Dec 10, 2013 at 8:05 AM, Kirill Yukhin
>>  wrote:
>>> On 09 Dec 14:08, H.J. Lu wrote:

 There are no regressions on Linux/x86-64 with -m32 and -m64.
 Can you check if it improves code quality on x886?
>>>
>>> As second thought. If Tejas and Richard are right and it is simply
>>> incorrect
>>> to check any offsets in this hook, may be we can end up with patch in 
>>> the
>>> bottom?
>>
>> What is wrong to pass the correct offset to
>> CANNOT_CHANGE_MODE_CLASS?  Backends are free to
>> ignore it.
>
> The point is that:
>
>>> - /* Vector registers do not support subreg with nonzero offsets,
>>> which
>>> -are otherwise valid for integer registers.  Since we can't see
>>> -whether we have a nonzero offset from here, prohibit all
>>> - nonparadoxical subregs changing size.  */
>>> -  if (GET_MODE_SIZE (to) < GET_MODE_SIZE (from))
>>> -   return true;
>
> seems to be trying to reject things like (subreg:SF (reg:V4SF X) 1),
> which is always invalid for a single-register V4SF.  See:

 That is correct.
>>>
>>> Sorry, what I mean is: that subreg is always invalid for single-
>>> register V4SFs regardless of the target.  This isn't something that
>>> CANNOT_CHANGE_MODE_CLASS should be expected to check.
>>>
>>
>> Why is
>>
>> (define_insn "*movv4sfdi_subreg"
>>  [(set (match_operand:DI 0 "nonimmediate_operand"   "=rxm")
>>(subreg:DI (match_operand:V4SF 1 "register_operand" "x") 0))]
>>
>> invalid?
>
> Sorry, I don't understand.  I never said it was invalid.  I said
> (subreg:SF (reg:V4SF X) 1) was invalid if (reg:V4SF X) represents
> a single register.  On a little-endian target, the offset cannot be
> anything other than 0 in that case.
>
> So the CANNOT_CHANGE_MODE_CLASS code above seems to be checking for
> something that is always invalid, regardless of the target.  That kind
> of situation should be rejected by target-independent code instead.
>
> In other words I'm arguing against the idea of passing the offset to
> CANNOT_CHANGE_MODE_CLASS (which you seemed to be supporting in the
> quote above).  I think Kirill's patch to remove the i386.c check was
> the right way to go.
>
> There's no need for a separate insn though.  Once you allow the subregs
> (as per Kirill's patch), the normal move patterns will handle them.
>

We will wait for Kirill's results.


-- 
H.J.

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread Richard Sandiford

"H.J. Lu"  writes:
> On Tue, Dec 10, 2013 at 10:26 AM, Richard Sandiford
>  wrote:
>> "H.J. Lu"  writes:
>>> On Tue, Dec 10, 2013 at 9:57 AM, Richard Sandiford
>>>  wrote:
 "H.J. Lu"  writes:
> On Tue, Dec 10, 2013 at 8:05 AM, Kirill Yukhin
>  wrote:
>> On 09 Dec 14:08, H.J. Lu wrote:
>>>
>>> There are no regressions on Linux/x86-64 with -m32 and -m64.
>>> Can you check if it improves code quality on x886?
>>
>> As second thought. If Tejas and Richard are right and it is simply
>> incorrect
>> to check any offsets in this hook, may be we can end up with patch in the
>> bottom?
>
> What is wrong to pass the correct offset to
> CANNOT_CHANGE_MODE_CLASS?  Backends are free to
> ignore it.

 The point is that:

>> - /* Vector registers do not support subreg with nonzero offsets,
>> which
>> -are otherwise valid for integer registers.  Since we can't see
>> -whether we have a nonzero offset from here, prohibit all
>> - nonparadoxical subregs changing size.  */
>> -  if (GET_MODE_SIZE (to) < GET_MODE_SIZE (from))
>> -   return true;

 seems to be trying to reject things like (subreg:SF (reg:V4SF X) 1),
 which is always invalid for a single-register V4SF.  See:
>>>
>>> That is correct.
>>
>> Sorry, what I mean is: that subreg is always invalid for single-
>> register V4SFs regardless of the target.  This isn't something that
>> CANNOT_CHANGE_MODE_CLASS should be expected to check.
>>
>
> Why is
>
> (define_insn "*movv4sfdi_subreg"
>  [(set (match_operand:DI 0 "nonimmediate_operand"   "=rxm")
>(subreg:DI (match_operand:V4SF 1 "register_operand" "x") 0))]
>
> invalid?

Sorry, I don't understand.  I never said it was invalid.  I said
(subreg:SF (reg:V4SF X) 1) was invalid if (reg:V4SF X) represents
a single register.  On a little-endian target, the offset cannot be
anything other than 0 in that case.

So the CANNOT_CHANGE_MODE_CLASS code above seems to be checking for
something that is always invalid, regardless of the target.  That kind
of situation should be rejected by target-independent code instead.

In other words I'm arguing against the idea of passing the offset to
CANNOT_CHANGE_MODE_CLASS (which you seemed to be supporting in the
quote above).  I think Kirill's patch to remove the i386.c check was
the right way to go.

There's no need for a separate insn though.  Once you allow the subregs
(as per Kirill's patch), the normal move patterns will handle them.

Thanks,
Richard

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread H.J. Lu

On Tue, Dec 10, 2013 at 10:26 AM, Richard Sandiford
 wrote:
> "H.J. Lu"  writes:
>> On Tue, Dec 10, 2013 at 9:57 AM, Richard Sandiford
>>  wrote:
>>> "H.J. Lu"  writes:
 On Tue, Dec 10, 2013 at 8:05 AM, Kirill Yukhin
  wrote:
> On 09 Dec 14:08, H.J. Lu wrote:
>>
>> There are no regressions on Linux/x86-64 with -m32 and -m64.
>> Can you check if it improves code quality on x886?
>
> As second thought. If Tejas and Richard are right and it is simply 
> incorrect
> to check any offsets in this hook, may be we can end up with patch in the
> bottom?

 What is wrong to pass the correct offset to
 CANNOT_CHANGE_MODE_CLASS?  Backends are free to
 ignore it.
>>>
>>> The point is that:
>>>
> -  /* Vector registers do not support subreg with nonzero offsets, 
> which
> -are otherwise valid for integer registers.  Since we can't see
> -whether we have a nonzero offset from here, prohibit all
> - nonparadoxical subregs changing size.  */
> -  if (GET_MODE_SIZE (to) < GET_MODE_SIZE (from))
> -   return true;
>>>
>>> seems to be trying to reject things like (subreg:SF (reg:V4SF X) 1),
>>> which is always invalid for a single-register V4SF.  See:
>>
>> That is correct.
>
> Sorry, what I mean is: that subreg is always invalid for single-
> register V4SFs regardless of the target.  This isn't something that
> CANNOT_CHANGE_MODE_CLASS should be expected to check.
>

Why is

(define_insn "*movv4sfdi_subreg"
 [(set (match_operand:DI 0 "nonimmediate_operand"   "=rxm")
   (subreg:DI (match_operand:V4SF 1 "register_operand" "x") 0))]

invalid?  It can be used in libffi.call/struct8.c.  hjl/subreg branch
generates

movq%xmm0, %xmm0# 39*movv4sfdi_subreg[length = 4]

instead of

movq-56(%rsp), %xmm0# 44*movdi_internal/15[length = 7]

both clears upper 64 bits in %xmm0.

-- 
H.J.

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread Richard Sandiford

"H.J. Lu"  writes:
> On Tue, Dec 10, 2013 at 9:57 AM, Richard Sandiford
>  wrote:
>> "H.J. Lu"  writes:
>>> On Tue, Dec 10, 2013 at 8:05 AM, Kirill Yukhin
>>>  wrote:
 On 09 Dec 14:08, H.J. Lu wrote:
>
> There are no regressions on Linux/x86-64 with -m32 and -m64.
> Can you check if it improves code quality on x886?

 As second thought. If Tejas and Richard are right and it is simply 
 incorrect
 to check any offsets in this hook, may be we can end up with patch in the
 bottom?
>>>
>>> What is wrong to pass the correct offset to
>>> CANNOT_CHANGE_MODE_CLASS?  Backends are free to
>>> ignore it.
>>
>> The point is that:
>>
 -  /* Vector registers do not support subreg with nonzero offsets, 
 which
 -are otherwise valid for integer registers.  Since we can't see
 -whether we have a nonzero offset from here, prohibit all
 - nonparadoxical subregs changing size.  */
 -  if (GET_MODE_SIZE (to) < GET_MODE_SIZE (from))
 -   return true;
>>
>> seems to be trying to reject things like (subreg:SF (reg:V4SF X) 1),
>> which is always invalid for a single-register V4SF.  See:
>
> That is correct.

Sorry, what I mean is: that subreg is always invalid for single-
register V4SFs regardless of the target.  This isn't something that
CANNOT_CHANGE_MODE_CLASS should be expected to check.

Thanks,
Richard

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread H.J. Lu

On Tue, Dec 10, 2013 at 9:57 AM, Richard Sandiford
 wrote:
> "H.J. Lu"  writes:
>> On Tue, Dec 10, 2013 at 8:05 AM, Kirill Yukhin  
>> wrote:
>>> On 09 Dec 14:08, H.J. Lu wrote:

 There are no regressions on Linux/x86-64 with -m32 and -m64.
 Can you check if it improves code quality on x886?
>>>
>>> As second thought. If Tejas and Richard are right and it is simply incorrect
>>> to check any offsets in this hook, may be we can end up with patch in the
>>> bottom?
>>
>> What is wrong to pass the correct offset to
>> CANNOT_CHANGE_MODE_CLASS?  Backends are free to
>> ignore it.
>
> The point is that:
>
>>> -  /* Vector registers do not support subreg with nonzero offsets, which
>>> -are otherwise valid for integer registers.  Since we can't see
>>> -whether we have a nonzero offset from here, prohibit all
>>> - nonparadoxical subregs changing size.  */
>>> -  if (GET_MODE_SIZE (to) < GET_MODE_SIZE (from))
>>> -   return true;
>
> seems to be trying to reject things like (subreg:SF (reg:V4SF X) 1),
> which is always invalid for a single-register V4SF.  See:

That is correct.

> http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00824.html
>
> for the longer version.

In all places where CANNOT_CHANGE_MODE_CLASS is used,
only mode_change_ok, record_subregs_of_mode and inherit_piecemeal_p
don't have the known subreg offset.  In all other places, we know
what exactly the subreg offset is. When the subreg offset is passed
to CANNOT_CHANGE_MODE_CLASS, a backend can have

(subreg:DI (match_operand:V4SF 1 "register_operand" "x,x") 0)

in patterns.  I pushed hjl/subreg branch to GCC git repo with
a new pattern:


(define_insn "*mov_subreg"
 [(set (match_operand:VMOVE_SWI48 0 "nonimmediate_operand"   "=rxm")
   (subreg:VMOVE_SWI48 (match_operand:VMOVE 1 "register_operand" "x") 0))]
 ""
{
#if 1
  /* Help check where the subreg pattern is used.  */
  debug_rtx (insn);
  abort ();
#else
  /* Handle broken assemblers that require movd instead of movq.  */
  if (mode == SImode
  || (!HAVE_AS_IX86_INTERUNIT_MOVQ
  && (GENERAL_REG_P (operands[0]
return "%vmovd\t{%x1, %0|%0, %x1}";
  return "%vmovq\t{%x1, %0|%0, %1x}";
#endif
}
  [(set_attr "type" "ssemov")
   (set_attr "prefix" "maybe_vex")
   (set_attr "mode" "")])

I ran GCC testsuites with all languages enabled.  This pattern
is triggered 1178 times.  I checked a few of them.  The new patten
leads to reg-reg move instead of mem-reg load.


-- 
H.J.

Re: [PATCH, testsuite] Fix alignment in movapd tests

2013-12-10 Thread Uros Bizjak

Hello!

> 2013-12-10  Ryan Mansfield  
>
> PR testsuite/59442
> * gcc.target/i386/sse2-movapd-1.c: Fix alignment attributes.
> * gcc.target/i386/sse2-movapd-2.c: Likewise.
> * gcc.target/i386/avx-vmovapd-256-1.c: Likewise.
> * gcc.target/i386/avx-vmovapd-256-2.c: Likewise.

OK for mainline and release branches.

Thanks,
Uros.

Re: [PATCH] Hexadecimal numbers in option arguments

2013-12-10 Thread Joseph S. Myers

On Tue, 10 Dec 2013, Chung-Lin Tang wrote:

> Hi Joseph,
> 
> Forgot to follow up on this patch. Here it is with a small update to
> check if 'p' got updated to a difference position. Does this now look okay?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] Deal with promotions for internal functions (PR sanitizer/59399)

2013-12-10 Thread Richard Sandiford

Richard Biener  writes:
> On Mon, 9 Dec 2013, Marek Polacek wrote:
>
>> Back in April 2011, Richard S. submitted the implementation of
>> internal functions [1].  It originally had this hunk of code:
>> 
>>if (code == SSA_NAME
>>&& (g = SSA_NAME_DEF_STMT (ssa_name))
>> -  && gimple_code (g) == GIMPLE_CALL)
>> +  && gimple_code (g) == GIMPLE_CALL
>> +  && !gimple_call_internal_p (g))
>>  pmode = promote_function_mode (type, mode, &unsignedp,
>> TREE_TYPE
>> (TREE_TYPE (gimple_call_fn (g))),
>> 
>> but the !gimple_call_internal_p (g) was turned into an assert.  This patch
>> turns the assert back into a condition.  That's because I actually hit that
>> assert with -fsanitize=signed-integer-overflow on PPC64, where we expand 
>> internal
>> calls at expand time.  On PPC64, integers are promoted to long when passed to
>> a function, that is why the assert doesn't trigger on x86_64.
>> There shouldn't be any harm in not calling promote_function_mode since 
>> internal
>> functions do not result in a call instruction.
>> 
>> I'm not attaching any new testcase, because that's not needed - we ICE on 
>> PPC64
>> with current testsuite.
>> 
>> Regtested/bootstrapped on powerpc64-unknown-linux-gnu and 
>> x86_64-unknown-linux-gnu.
>> Ok for trunk?
>
> Looks good to me - Richard?

Yeah, fine with me too.  I suppose this wasn't ever likely to hit with
the original ARM use.

Thanks,
Richard

Re: [PING]: [GOMP4] [PATCH] SIMD-Enabled Functions (formerly Elemental functions) for C

2013-12-10 Thread Aldy Hernandez




But aren't both OpenMP and Cilk Plus simd clones marked as "omp
declare simd"?  In which case you shouldn't have to do anything?
Are the Cilk Plus clones not being marked as "omp declare simd" in
the front-ends?



Didn't you mention in this thread
(http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03506.html) that Cilk
Plus SIMD-enabled functions must be marked as "cilk plus elementals"
(as it wsa called then)? Did I misunderstand you?


Both OpenMP and Cilk Plus clones should be marked with "omp declare
simd".  But Cilk Plus should _also_ be marked with "cilk plus
elementals" (or "cilk simd function" now).  In which case the
aforementioned function doesn't require any changes because Cilk Plus
clones will be seen as they are marked with "omp declare simd".

So...

OpenMP declare simd gets tagged as:
"omp declare simd"
Cilk Plus vector functions gets tagged as:
"omp declare simd"
"cilk simd function"


if (is_simd_fn) { c_token new_token; /* If we are here then it
has be a number.  */ new_token.type = CPP_NUMBER;
new_token.keyword = RID_MAX;


I thought we agreed integral constants were allowed.  Why would we
be expecting only a number?  Also, it should have been "it has TO
be a number".



Whatever that is reading this token only checks if this token is not
a keyword or a variable.  So, I put it as a number since it is the
most common case. I clarified it in the comment.


Ah, I see, perfect.  Thanks for the comment explaining so.

Aldy

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread Richard Sandiford

"H.J. Lu"  writes:
> On Tue, Dec 10, 2013 at 8:05 AM, Kirill Yukhin  
> wrote:
>> On 09 Dec 14:08, H.J. Lu wrote:
>>>
>>> There are no regressions on Linux/x86-64 with -m32 and -m64.
>>> Can you check if it improves code quality on x886?
>>
>> As second thought. If Tejas and Richard are right and it is simply incorrect
>> to check any offsets in this hook, may be we can end up with patch in the
>> bottom?
>
> What is wrong to pass the correct offset to
> CANNOT_CHANGE_MODE_CLASS?  Backends are free to
> ignore it.

The point is that:

>> -  /* Vector registers do not support subreg with nonzero offsets, which
>> -are otherwise valid for integer registers.  Since we can't see
>> -whether we have a nonzero offset from here, prohibit all
>> - nonparadoxical subregs changing size.  */
>> -  if (GET_MODE_SIZE (to) < GET_MODE_SIZE (from))
>> -   return true;

seems to be trying to reject things like (subreg:SF (reg:V4SF X) 1),
which is always invalid for a single-register V4SF.  See:

http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00824.html

for the longer version.

Thanks,
Richard

Re: Remove some typedefs (was: Silence class vs. struct warnings (opt_pass, ipa_opt_pass_d))

2013-12-10 Thread Oleg Endo

On Tue, 2013-12-10 at 09:49 -0800, H.J. Lu wrote:
> On Tue, Dec 10, 2013 at 9:44 AM, Jakub Jelinek  wrote:
> > On Mon, Dec 09, 2013 at 08:49:59PM +0100, Oleg Endo wrote:
> >> Tested with make all-gcc.
> >
> > You should be testing such changes by full bootstrap/regtest.
> 
> I checked in r205866 to restore bootstrap.
> 
> >> gcc/ChangeLog:
> >>   * gcc/cgraph.h (cgraph_node_set_iterator,
> >>   varpool_node_set_iterator): Remove typedef.
> >>   (cgraph_inline_failed_enum, cgraph_inline_failed_t): Remove
> >>   typedef and rename to cgraph_inline_failed_t.
> >
> > Apparently you have installed the patch with the incorrect prefixes
> > in the ChangeLog entries.  Please fix it up.
> >
> 
> I checked in r205865 to strip gcc/ from ChangeLog.

I'm sorry for the mess.  Thanks for fixing it.

Cheers,
Oleg

Re: Remove some typedefs (was: Silence class vs. struct warnings (opt_pass, ipa_opt_pass_d))

2013-12-10 Thread H.J. Lu

On Tue, Dec 10, 2013 at 9:44 AM, Jakub Jelinek  wrote:
> On Mon, Dec 09, 2013 at 08:49:59PM +0100, Oleg Endo wrote:
>> Tested with make all-gcc.
>
> You should be testing such changes by full bootstrap/regtest.

I checked in r205866 to restore bootstrap.

>> gcc/ChangeLog:
>>   * gcc/cgraph.h (cgraph_node_set_iterator,
>>   varpool_node_set_iterator): Remove typedef.
>>   (cgraph_inline_failed_enum, cgraph_inline_failed_t): Remove
>>   typedef and rename to cgraph_inline_failed_t.
>
> Apparently you have installed the patch with the incorrect prefixes
> in the ChangeLog entries.  Please fix it up.
>

I checked in r205865 to strip gcc/ from ChangeLog.

-- 
H.J.

Re: Remove some typedefs (was: Silence class vs. struct warnings (opt_pass, ipa_opt_pass_d))

2013-12-10 Thread Jakub Jelinek

On Mon, Dec 09, 2013 at 08:49:59PM +0100, Oleg Endo wrote:
> Tested with make all-gcc.

You should be testing such changes by full bootstrap/regtest.

> gcc/ChangeLog:
>   * gcc/cgraph.h (cgraph_node_set_iterator, 
>   varpool_node_set_iterator): Remove typedef.
>   (cgraph_inline_failed_enum, cgraph_inline_failed_t): Remove 
>   typedef and rename to cgraph_inline_failed_t.

Apparently you have installed the patch with the incorrect prefixes
in the ChangeLog entries.  Please fix it up.

Jakub

Re: Remove some typedefs (was: Silence class vs. struct warnings (opt_pass, ipa_opt_pass_d))

2013-12-10 Thread H.J. Lu

On Tue, Dec 10, 2013 at 9:33 AM, H.J. Lu  wrote:
> On Mon, Dec 9, 2013 at 11:49 AM, Oleg Endo  wrote:
>> On Thu, 2013-12-05 at 15:34 +0100, Oleg Endo wrote:
>>> On Thu, 2013-12-05 at 14:56 +0100, Richard Biener wrote:
>>> >
>>> > grep for 'typedef struct.*{' in headers.  The typedef name is usually
>>> > the desired one and is used without 'struct'.  So it's an orthogonal
>>> > issue.
>>>
>>> Ah, do you mean converting this stuff ...
>>>
>>> typedef struct
>>> {
>>>   cgraph_node_set set;
>>>   unsigned index;
>>> } cgraph_node_set_iterator;
>>>
>>> ... to ...
>>>
>>> struct cgraph_node_set_iterator
>>> {
>>> 
>>>
>>> right?
>>> Sure, no problem.  But I'd rather do it step by step in separate
>>> patches.
>>
>> So here it is.
>> Tested with make all-gcc.
>> Is this OK?
>>
>> Cheers,
>> Oleg
>>
>> gcc/ChangeLog:
>> * gcc/cgraph.h (cgraph_node_set_iterator,
>> varpool_node_set_iterator): Remove typedef.
>> (cgraph_inline_failed_enum, cgraph_inline_failed_t): Remove
>> typedef and rename to cgraph_inline_failed_t.
>> * gcc/tree-ssa-alias.h (ao_ref_s, ao_ref): Remove typedef and
>> rename to ao_ref.
>> * gcc/reload.h (reg_equivs_s, reg_equivs_t): Remove typedef and
>> rename to reg_equivs_t.
>> * gcc/conditions.h (CC_STATUS): Remove typedef.
>> * gcc/bitmap.h (bitmap_obstack): Remove typedef.
>> (bitmap_element_def, bitmap_element): Remove typedef and rename
>> to bitmap_element.
>> (bitmap_head_def, bitmap_head): Remove typedef and rename to
>> bitmap_head.
>> (bitmap_iterator): Remove typedef.
>> * gcc/target.h (cumulative_args_t, print_switch_type,
>> secondary_reload_info): Remove typedef.
>> * gcc/dwarf2out.h (dw_cfi_oprnd_struct, dw_cfi_oprnd): Remove
>> dw_cfi_oprnd_struct alias.
>> (dw_cfi_struct, dw_cfi_node): Remove typedef and rename to
>> dw_cfi_node.
>> (dw_fde_struct, dw_fde_node): Remove typedef and rename to
>> dw_fde_node.
>> (cfa_loc, dw_cfa_location): Remove typedef and rename to
>> dw_cfa_location.
>> (dw_vec_struct, dw_vec_const): Remove typedef and rename to
>> dw_vec_const.
>> (dw_val_struct, dw_val_node): Remove typedef and rename to
>> dw_val_node.
>> (dw_loc_descr_struct, dw_loc_descr_node): Remove typedef and
>> rename to dw_loc_descr_node.
>> * gcc/params.h (param_info, compiler_param): Remove typedef.
>> * gcc/opts.h (cl_deferred_param): Remove typedef.
>> * gcc/sreal.h (sreal): Remove typedef.
>> * gcc/ddg.h (dep_type, dep_data_type): Remove typedef.
>> * gcc/graphite-clast-to-gimple.h (cloog_prog_clast, bb_pbb_def):
>> Remove typedef.
>> * gcc/lto-streamer.h (lto_decl_stream_e_t, lto_encoder_entry,
>> lto_symtab_encoder_iterator, res_pair): Remove typedef.
>> * gcc/tree-affine.h (affine_tree_combination, aff_tree): Remove
>> typedef and rename to aff_tree.
>> * gcc/sched-int.h (region): Remove typedef.
>> * gcc/diagnostic.h (diagnostic_info,
>> diagnostic_classification_change_t): Remove typedef.
>> * gcc/tree-ssa-loop.h (affine_iv_d): Remove typedef and rename
>> to affine_iv.
>> * gcc/sbitmap.h (sbitmap_iterator): Remove typedef.
>> * gcc/ssa-iterators.h (immediate_use_iterator_d,
>> imm_use_iterator): Remove typedef and rename to
>> imm_use_iterator.
>> (ssa_operand_iterator_d, ssa_op_iter): Remove typedef and rename
>> to ssa_op_iter.
>> * gcc/ggc-internal.h (ggc_statistics): Remove typedef.
>> * gcc/cselib.h (cselib_val_struct, cselib_val): Remove typedef
>> and rename to cselib_val.
>> * gcc/tree-core.h (alias_pair): Remove typedef.
>> (constructor_elt_d, constructor_elt): Remove typedef and rename
>> to constructor_elt.
>> (ssa_use_operand_d, ssa_use_operand_t): Remove typedef and
>> rename to ssa_use_operand_t.
>> * gcc/graphite-sese-to-poly.h (base_alias_pair): Remove typedef.
>> * gcc/tree-data-ref.h (conflict_function): Remove typedef.
>> * gcc/tree-inline.h (copy_body_data): Remove typedef.
>> * gcc/ipa-inline.h (condition, size_time_entry,
>> inline_param_summary_t, edge_growth_cache_entry): Remove
>> typedef.
>> * gcc/regrename.h (operand_rr_info, insn_rr_info): Remove
>> typedef.
>> * gcc/gimple-iterator.h (gimple_stmt_iterator_d,
>> gimple_stmt_iterator): Remove typedef and rename to
>> gimple_stmt_iterator.
>> * gcc/basic-block.h (ce_if_block, ce_if_block_t): Remove typedef
>> and rename to ce_if_block.
>> (edge_iterator): Remove typedef.
>> * gcc/ipa-prop.h (ipa_agg_jf_item, ipa_agg_jf_item_t): Remove
>> typedef and rename to ipa_agg_jf_item.
>> (ipa_agg

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread H.J. Lu

On Tue, Dec 10, 2013 at 9:26 AM, Tejas Belagod  wrote:
> H.J. Lu wrote:
>>
>> On Tue, Dec 10, 2013 at 9:04 AM, Kirill Yukhin 
>> wrote:
>>>
>>> On 10 Dec 08:23, H.J. Lu wrote:

 What is wrong to pass the correct offset to
 CANNOT_CHANGE_MODE_CLASS?  Backends are free to
 ignore it.
>>>
>>> Yes, but as fas as understand this hook as a predicate
>>> saying if it not-safe to change mode1 to mode2 for given
>>
>>
>> In many places, the macro is used with the known offset.
>> I have a follow up patch which improves x86 code generation,
>> in cases like:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45198
>>
>
> What is it that subreg_get_info () can't resolve that
> CANNOT_CHANGE_MODE_CLASS with an offset can?
>

We have

int
simplify_subreg_regno (unsigned int xregno, enum machine_mode xmode,
   unsigned int offset, enum machine_mode ymode)
{
  struct subreg_info info;
  unsigned int yregno;

#ifdef CANNOT_CHANGE_MODE_CLASS
  /* Give the backend a chance to disallow the mode change.  */
  if (GET_MODE_CLASS (xmode) != MODE_COMPLEX_INT
  && GET_MODE_CLASS (xmode) != MODE_COMPLEX_FLOAT
  && REG_CANNOT_CHANGE_MODE_P (xregno, xmode, ymode)
  /* We can use mode change in LRA for some transformations.  */
  && ! lra_in_progress)
return -1;
#endif

CANNOT_CHANGE_MODE_CLASS is checked before subreg_get_info is
called.  When REG_CANNOT_CHANGE_MODE_P returns false,
there is nothing subreg_get_info can do.

-- 
H.J.

Re: Remove some typedefs (was: Silence class vs. struct warnings (opt_pass, ipa_opt_pass_d))

2013-12-10 Thread H.J. Lu

On Mon, Dec 9, 2013 at 11:49 AM, Oleg Endo  wrote:
> On Thu, 2013-12-05 at 15:34 +0100, Oleg Endo wrote:
>> On Thu, 2013-12-05 at 14:56 +0100, Richard Biener wrote:
>> >
>> > grep for 'typedef struct.*{' in headers.  The typedef name is usually
>> > the desired one and is used without 'struct'.  So it's an orthogonal
>> > issue.
>>
>> Ah, do you mean converting this stuff ...
>>
>> typedef struct
>> {
>>   cgraph_node_set set;
>>   unsigned index;
>> } cgraph_node_set_iterator;
>>
>> ... to ...
>>
>> struct cgraph_node_set_iterator
>> {
>> 
>>
>> right?
>> Sure, no problem.  But I'd rather do it step by step in separate
>> patches.
>
> So here it is.
> Tested with make all-gcc.
> Is this OK?
>
> Cheers,
> Oleg
>
> gcc/ChangeLog:
> * gcc/cgraph.h (cgraph_node_set_iterator,
> varpool_node_set_iterator): Remove typedef.
> (cgraph_inline_failed_enum, cgraph_inline_failed_t): Remove
> typedef and rename to cgraph_inline_failed_t.
> * gcc/tree-ssa-alias.h (ao_ref_s, ao_ref): Remove typedef and
> rename to ao_ref.
> * gcc/reload.h (reg_equivs_s, reg_equivs_t): Remove typedef and
> rename to reg_equivs_t.
> * gcc/conditions.h (CC_STATUS): Remove typedef.
> * gcc/bitmap.h (bitmap_obstack): Remove typedef.
> (bitmap_element_def, bitmap_element): Remove typedef and rename
> to bitmap_element.
> (bitmap_head_def, bitmap_head): Remove typedef and rename to
> bitmap_head.
> (bitmap_iterator): Remove typedef.
> * gcc/target.h (cumulative_args_t, print_switch_type,
> secondary_reload_info): Remove typedef.
> * gcc/dwarf2out.h (dw_cfi_oprnd_struct, dw_cfi_oprnd): Remove
> dw_cfi_oprnd_struct alias.
> (dw_cfi_struct, dw_cfi_node): Remove typedef and rename to
> dw_cfi_node.
> (dw_fde_struct, dw_fde_node): Remove typedef and rename to
> dw_fde_node.
> (cfa_loc, dw_cfa_location): Remove typedef and rename to
> dw_cfa_location.
> (dw_vec_struct, dw_vec_const): Remove typedef and rename to
> dw_vec_const.
> (dw_val_struct, dw_val_node): Remove typedef and rename to
> dw_val_node.
> (dw_loc_descr_struct, dw_loc_descr_node): Remove typedef and
> rename to dw_loc_descr_node.
> * gcc/params.h (param_info, compiler_param): Remove typedef.
> * gcc/opts.h (cl_deferred_param): Remove typedef.
> * gcc/sreal.h (sreal): Remove typedef.
> * gcc/ddg.h (dep_type, dep_data_type): Remove typedef.
> * gcc/graphite-clast-to-gimple.h (cloog_prog_clast, bb_pbb_def):
> Remove typedef.
> * gcc/lto-streamer.h (lto_decl_stream_e_t, lto_encoder_entry,
> lto_symtab_encoder_iterator, res_pair): Remove typedef.
> * gcc/tree-affine.h (affine_tree_combination, aff_tree): Remove
> typedef and rename to aff_tree.
> * gcc/sched-int.h (region): Remove typedef.
> * gcc/diagnostic.h (diagnostic_info,
> diagnostic_classification_change_t): Remove typedef.
> * gcc/tree-ssa-loop.h (affine_iv_d): Remove typedef and rename
> to affine_iv.
> * gcc/sbitmap.h (sbitmap_iterator): Remove typedef.
> * gcc/ssa-iterators.h (immediate_use_iterator_d,
> imm_use_iterator): Remove typedef and rename to
> imm_use_iterator.
> (ssa_operand_iterator_d, ssa_op_iter): Remove typedef and rename
> to ssa_op_iter.
> * gcc/ggc-internal.h (ggc_statistics): Remove typedef.
> * gcc/cselib.h (cselib_val_struct, cselib_val): Remove typedef
> and rename to cselib_val.
> * gcc/tree-core.h (alias_pair): Remove typedef.
> (constructor_elt_d, constructor_elt): Remove typedef and rename
> to constructor_elt.
> (ssa_use_operand_d, ssa_use_operand_t): Remove typedef and
> rename to ssa_use_operand_t.
> * gcc/graphite-sese-to-poly.h (base_alias_pair): Remove typedef.
> * gcc/tree-data-ref.h (conflict_function): Remove typedef.
> * gcc/tree-inline.h (copy_body_data): Remove typedef.
> * gcc/ipa-inline.h (condition, size_time_entry,
> inline_param_summary_t, edge_growth_cache_entry): Remove
> typedef.
> * gcc/regrename.h (operand_rr_info, insn_rr_info): Remove
> typedef.
> * gcc/gimple-iterator.h (gimple_stmt_iterator_d,
> gimple_stmt_iterator): Remove typedef and rename to
> gimple_stmt_iterator.
> * gcc/basic-block.h (ce_if_block, ce_if_block_t): Remove typedef
> and rename to ce_if_block.
> (edge_iterator): Remove typedef.
> * gcc/ipa-prop.h (ipa_agg_jf_item, ipa_agg_jf_item_t): Remove
> typedef and rename to ipa_agg_jf_item.
> (ipa_agg_jump_function_t, ipa_param_descriptor_t,
> ipa_node_params_t, ipa_parm_adjustment_t): Remove typedef.
> (ipa_jump_func, ipa_jump_func_t): Remove

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread Tejas Belagod


H.J. Lu wrote:

On Tue, Dec 10, 2013 at 9:04 AM, Kirill Yukhin  wrote:

On 10 Dec 08:23, H.J. Lu wrote:

What is wrong to pass the correct offset to
CANNOT_CHANGE_MODE_CLASS?  Backends are free to
ignore it.

Yes, but as fas as understand this hook as a predicate
saying if it not-safe to change mode1 to mode2 for given


In many places, the macro is used with the known offset.
I have a follow up patch which improves x86 code generation,
in cases like:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45198



What is it that subreg_get_info () can't resolve that CANNOT_CHANGE_MODE_CLASS 
with an offset can?


Thanks,
Tejas.


register class. I don't think that offsets should be
involved here. IMHO it is safe to change V4SF->SF
for SSE/AVX register...

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread H.J. Lu

On Tue, Dec 10, 2013 at 9:04 AM, Kirill Yukhin  wrote:
> On 10 Dec 08:23, H.J. Lu wrote:
>> What is wrong to pass the correct offset to
>> CANNOT_CHANGE_MODE_CLASS?  Backends are free to
>> ignore it.
>
> Yes, but as fas as understand this hook as a predicate
> saying if it not-safe to change mode1 to mode2 for given

In many places, the macro is used with the known offset.
I have a follow up patch which improves x86 code generation,
in cases like:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45198

> register class. I don't think that offsets should be
> involved here. IMHO it is safe to change V4SF->SF
> for SSE/AVX register...


-- 
H.J.

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread H.J. Lu

On Tue, Dec 10, 2013 at 9:09 AM, Kirill Yukhin  wrote:
> On 10 Dec 09:02, H.J. Lu wrote:
>> On Tue, Dec 10, 2013 at 8:05 AM, Kirill Yukhin  
>> wrote:
>> > On 09 Dec 14:08, H.J. Lu wrote:
>> >  NOINLINE float
>> >  foo32x2_le (float32x2_t x)
>> >  {
>> > +#ifdef __i386__
>> > +  __builtin_ia32_emms ();
>> > +#endif
>> >return bar (x[0]);
>> >  }
>> You should check both __i386__ and __x86_64__.
> Why? I thought that we pass using MMX only in 32-bit mode.
> This built-in is useless on 64-bit x86.
>

Can you double check it?

-- 
H.J.

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread Kirill Yukhin

On 10 Dec 09:02, H.J. Lu wrote:
> On Tue, Dec 10, 2013 at 8:05 AM, Kirill Yukhin  
> wrote:
> > On 09 Dec 14:08, H.J. Lu wrote:
> >  NOINLINE float
> >  foo32x2_le (float32x2_t x)
> >  {
> > +#ifdef __i386__
> > +  __builtin_ia32_emms ();
> > +#endif
> >return bar (x[0]);
> >  }
> You should check both __i386__ and __x86_64__.
Why? I thought that we pass using MMX only in 32-bit mode.
This built-in is useless on 64-bit x86.

K

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread Kirill Yukhin

On 10 Dec 08:23, H.J. Lu wrote:
> What is wrong to pass the correct offset to
> CANNOT_CHANGE_MODE_CLASS?  Backends are free to
> ignore it.

Yes, but as fas as understand this hook as a predicate
saying if it not-safe to change mode1 to mode2 for given
register class. I don't think that offsets should be
involved here. IMHO it is safe to change V4SF->SF
for SSE/AVX register...

--
Thanks, K

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread H.J. Lu

On Tue, Dec 10, 2013 at 8:05 AM, Kirill Yukhin  wrote:
> On 09 Dec 14:08, H.J. Lu wrote:
>>
>> There are no regressions on Linux/x86-64 with -m32 and -m64.
>> Can you check if it improves code quality on x886?
>
> As second thought. If Tejas and Richard are right and it is simply incorrect
> to check any offsets in this hook, may be we can end up with patch in the
> bottom?
>
> Test is passing (however I still don't know how to prohibit it for 32 bit 
> x86),
> bootstrap in progress.
>
> Ideas?
>
> This change belongs to rth.


>
>  NOINLINE float
>  foo32x2_le (float32x2_t x)
>  {
> +#ifdef __i386__
> +  __builtin_ia32_emms ();
> +#endif
>return bar (x[0]);
>  }
>

You should check both __i386__ and __x86_64__.


-- 
H.J.

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread Paul_Koning


On Dec 10, 2013, at 9:50 AM, Kirill Yukhin  wrote:

> Hello,
> On 09 Dec 14:08, H.J. Lu wrote:
>> There are no regressions on Linux/x86-64 with -m32 and -m64.
>> Can you check if it improves code quality on x886?
> 
> That is exactly what I was talking about. However I wasn't sure
> that we can change already defined (and used throughout ports)
> target hook.
> 
> ...
> 
> Attached patch + updated test.

You're missing the documentation change needed for this.

paul

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread H.J. Lu

On Tue, Dec 10, 2013 at 8:05 AM, Kirill Yukhin  wrote:
> On 09 Dec 14:08, H.J. Lu wrote:
>>
>> There are no regressions on Linux/x86-64 with -m32 and -m64.
>> Can you check if it improves code quality on x886?
>
> As second thought. If Tejas and Richard are right and it is simply incorrect
> to check any offsets in this hook, may be we can end up with patch in the
> bottom?

What is wrong to pass the correct offset to
CANNOT_CHANGE_MODE_CLASS?  Backends are free to
ignore it.

>
> Test is passing (however I still don't know how to prohibit it for 32 bit 
> x86),
> bootstrap in progress.
>
> Ideas?
>
> This change belongs to rth.
>
> --
> Thanks, K
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 382f8fb..0d0bb67 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -35002,22 +35002,13 @@ ix86_cannot_change_mode_class (enum machine_mode 
> from, enum machine_mode to,
>if (MAYBE_FLOAT_CLASS_P (regclass))
>  return true;
>
> -  if (MAYBE_SSE_CLASS_P (regclass) || MAYBE_MMX_CLASS_P (regclass))
> -{
> -  /* Vector registers do not support QI or HImode loads.  If we don't
> -disallow a change to these modes, reload will assume it's ok to
> -drop the subreg from (subreg:SI (reg:HI 100) 0).  This affects
> -the vec_dupv4hi pattern.  */
> -  if (GET_MODE_SIZE (from) < 4)
> -   return true;
> -
> -  /* Vector registers do not support subreg with nonzero offsets, which
> -are otherwise valid for integer registers.  Since we can't see
> -whether we have a nonzero offset from here, prohibit all
> - nonparadoxical subregs changing size.  */
> -  if (GET_MODE_SIZE (to) < GET_MODE_SIZE (from))
> -   return true;
> -}
> +  /* Vector registers do not support QI or HImode loads.  If we don't
> + disallow a change to these modes, reload will assume it's ok to
> + drop the subreg from (subreg:SI (reg:HI 100) 0).  This affects
> + the vec_dupv4hi pattern.  */
> +  if ((MAYBE_SSE_CLASS_P (regclass) || MAYBE_MMX_CLASS_P (regclass))
> +  && (GET_MODE_SIZE (from) < 4))
> +return true;
>
>

You need to run full "make check" for both -m32 and -m64.

-- 
H.J.

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread Kirill Yukhin

On 09 Dec 14:08, H.J. Lu wrote:
> 
> There are no regressions on Linux/x86-64 with -m32 and -m64.
> Can you check if it improves code quality on x886?

As second thought. If Tejas and Richard are right and it is simply incorrect
to check any offsets in this hook, may be we can end up with patch in the
bottom?

Test is passing (however I still don't know how to prohibit it for 32 bit x86),
bootstrap in progress.

Ideas?

This change belongs to rth.

--
Thanks, K

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 382f8fb..0d0bb67 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -35002,22 +35002,13 @@ ix86_cannot_change_mode_class (enum machine_mode 
from, enum machine_mode to,
   if (MAYBE_FLOAT_CLASS_P (regclass))
 return true;
 
-  if (MAYBE_SSE_CLASS_P (regclass) || MAYBE_MMX_CLASS_P (regclass))
-{
-  /* Vector registers do not support QI or HImode loads.  If we don't
-disallow a change to these modes, reload will assume it's ok to
-drop the subreg from (subreg:SI (reg:HI 100) 0).  This affects
-the vec_dupv4hi pattern.  */
-  if (GET_MODE_SIZE (from) < 4)
-   return true;
-
-  /* Vector registers do not support subreg with nonzero offsets, which
-are otherwise valid for integer registers.  Since we can't see
-whether we have a nonzero offset from here, prohibit all
- nonparadoxical subregs changing size.  */
-  if (GET_MODE_SIZE (to) < GET_MODE_SIZE (from))
-   return true;
-}
+  /* Vector registers do not support QI or HImode loads.  If we don't
+ disallow a change to these modes, reload will assume it's ok to
+ drop the subreg from (subreg:SI (reg:HI 100) 0).  This affects
+ the vec_dupv4hi pattern.  */
+  if ((MAYBE_SSE_CLASS_P (regclass) || MAYBE_MMX_CLASS_P (regclass))
+  && (GET_MODE_SIZE (from) < 4))
+return true;
 
   return false;
 }
diff --git a/gcc/testsuite/gcc.dg/vect/vect-nop-move.c 
b/gcc/testsuite/gcc.dg/vect/vect-nop-move.c
index 1941933..e863c1b 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-nop-move.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-nop-move.c
@@ -1,4 +1,4 @@
-/* { dg-do run } */ 
+/* { dg-do run } */
 /* { dg-require-effective-target vect_float } */
 /* { dg-options "-O3 -fdump-rtl-combine-details" } */
 
@@ -16,9 +16,15 @@ foo32x4_be (float32x4_t x)
 }
 
 NOINLINE float
-foo32x4_le (float32x4_t x)
+bar_2 (float a, float b)
 {
-  return x[0];
+  return a;
+}
+
+NOINLINE float
+foo32x4_le (float32x4_t x, float32x4_t y)
+{
+  return bar_2 (x[0], y[0]);
 }
 
 NOINLINE float
@@ -30,12 +36,18 @@ bar (float a)
 NOINLINE float
 foo32x2_be (float32x2_t x)
 {
+#ifdef __i386__
+  __builtin_ia32_emms ();
+#endif
   return bar (x[1]);
 }
 
 NOINLINE float
 foo32x2_le (float32x2_t x)
 {
+#ifdef __i386__
+  __builtin_ia32_emms ();
+#endif
   return bar (x[0]);
 }
 
@@ -48,7 +60,7 @@ main()
   if (foo32x4_be (a) != 3.0f)
 abort ();
 
-  if (foo32x4_le (a) != 0.0f)
+  if (foo32x4_le (a, a) != 0.0f)
 abort ();
 
   if (foo32x2_be (b) != 1.0f)
@@ -60,5 +72,5 @@ main()
   return 0;
 }
 
-/* { dg-final { scan-rtl-dump "deleting noop move" "combine" { target 
aarch64*-*-* } } } */
+/* { dg-final { scan-rtl-dump "deleting noop move" "combine" { target { 
aarch64*-*-* || x86_64-*-* } } } } */
 /* { dg-final { cleanup-rtl-dump "combine" } } */

Re: Fix tsan tests.

2013-12-10 Thread Yury Gribov


Done in r205858.

---
From: Jakub Jelinek 
Sent:  Tuesday, December 10, 2013 7:37PM
To: Yury Gribov 
Cc: H.J. Lu , Maxim Ostapenko 
, GCC Patches 
, Slava Garbuzov 

Subject: Re: Fix tsan tests.

On 12/10/2013 07:37 PM, Jakub Jelinek wrote:

On Tue, Dec 10, 2013 at 07:06:21PM +0400, Yury Gribov wrote:


--- a/gcc/testsuite/g++.dg/tsan/tsan.exp
+++ b/gcc/testsuite/g++.dg/tsan/tsan.exp
@@ -42,5 +42,6 @@ gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c 
$srcdir/c-c++-common
  }
  
  # All done.

+torture-finish
  tsan_finish
  dg-finish
diff --git a/gcc/testsuite/gcc.dg/tsan/tsan.exp 
b/gcc/testsuite/gcc.dg/tsan/tsan.exp
index a4a5b72..c70021c 100644
--- a/gcc/testsuite/gcc.dg/tsan/tsan.exp
+++ b/gcc/testsuite/gcc.dg/tsan/tsan.exp
@@ -42,5 +42,6 @@ gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c 
$srcdir/c-c++-common
  }
  
  # All done.

+torture-finish
  tsan_finish
  dg-finish


I'd prefer to put torture-finish after tsan_finish, so that
destruction is done in reverse order from construction.
Ok with that change.


2013-12-10  Yury Gribov  

* gcc-dg/tsan/tsan.exp: Added missing call to torture-finish.
* g++-dg/tsan/tsan.exp: Likewise.


Jakub

Re: Fix tsan tests.

2013-12-10 Thread Jakub Jelinek

On Tue, Dec 10, 2013 at 07:06:21PM +0400, Yury Gribov wrote:
> --- a/gcc/testsuite/g++.dg/tsan/tsan.exp
> +++ b/gcc/testsuite/g++.dg/tsan/tsan.exp
> @@ -42,5 +42,6 @@ gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c 
> $srcdir/c-c++-common
>  }
>  
>  # All done.
> +torture-finish
>  tsan_finish
>  dg-finish
> diff --git a/gcc/testsuite/gcc.dg/tsan/tsan.exp 
> b/gcc/testsuite/gcc.dg/tsan/tsan.exp
> index a4a5b72..c70021c 100644
> --- a/gcc/testsuite/gcc.dg/tsan/tsan.exp
> +++ b/gcc/testsuite/gcc.dg/tsan/tsan.exp
> @@ -42,5 +42,6 @@ gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c 
> $srcdir/c-c++-common
>  }
>  
>  # All done.
> +torture-finish
>  tsan_finish
>  dg-finish

I'd prefer to put torture-finish after tsan_finish, so that
destruction is done in reverse order from construction.
Ok with that change.

> 2013-12-10  Yury Gribov  
> 
>   * gcc-dg/tsan/tsan.exp: Added missing call to torture-finish.
>   * g++-dg/tsan/tsan.exp: Likewise.

Jakub

Re: [REPOST] Invalid Code when reading from unaligned zero-sized array

2013-12-10 Thread Richard Biener

On Tue, Dec 10, 2013 at 4:02 PM, Richard Biener
 wrote:
> On Tue, Dec 10, 2013 at 11:53 AM, Eric Botcazou  wrote:
>>> What we support is out of bounds accesses for heap vars if the var's type
>>> has flexible array member or something we treat similarly and there is the
>>> possibility that there could be payload after the heap var that could be
>>> accessed from the flexible array members or similar arrays.
>>
>> My question was about the above similar arrays, i.e. whether we consider all
>> trailing arrays in structures as flexible-like or not.  No strong opinion.
>
> Yes we do, even for struct { struct { int a; char a[1] } }; (note the not 
> really
> "trailing" as there is padding after the trailing array).  We do take
> size limitations from a DECL (if we see one) into account to limit the
> effect of this trailing-array-supporting, so it effectively only applies to
> indirect accesses (and the padding example above, you can use the whole
> padding if DECL_SIZE allows that).
>
>>> So, I don't see what is the big deal with BLKmode, because all the cases
>>> which actually could have flexible array member extra payloads (or similar)
>>> must necessarily live in memory, if it is the compiler that decides whether
>>> to put it into memory or keep in registers etc., then it can't be heap
>>> allocated.
>>
>> The invariant is that types for which objects can effectively have variable
>> size must have BLKmode, otherwise you need to add very ugly code in the RTL
>> expander to mask the lie.
>
> I wonder if we can make the expander more rely on the DECLs mode
> and optimize only the DECLs mode (if we can constrain its size via
> DECL_SIZE) to non-BLKmode instead of doing that for the TYPEs mode.
> Yes, you'd have DECL_MODE != TYPE_MODE that way.
>
> Or rather I wonder if the expander doesn't already work that way
> (looks at DECL_MODE).

To speak in patches (completely untested) - sth like the following ontop
of making more types have BLKmode:

Index: gcc/stor-layout.c
===
--- gcc/stor-layout.c   (revision 205857)
+++ gcc/stor-layout.c   (working copy)
@@ -569,8 +569,17 @@ layout_decl (tree decl, unsigned int kno
  bitsize_unit_node));

   if (code != FIELD_DECL)
-/* For non-fields, update the alignment from the type.  */
-do_type_align (type, decl);
+{
+  /* For non-fields, update the alignment from the type.  */
+  do_type_align (type, decl);
+  /* Optimize the mode of the decl.
+???  Ensure choosen mode alignment fits decl alignment.  */
+  if (DECL_MODE (decl) == BLKmode
+ && DECL_SIZE (decl)
+ && compare_tree_int (DECL_SIZE (t), MAX_FIXED_MODE_SIZE) <= 0)
+   DECL_MODE (decl)
+ = mode_for_size (TREE_INT_CST_LOW (DECL_SIZE (decl)), MODE_INT, 1);
+}
   else
 /* For fields, it's a bit more complicated...  */
 {


> Richard.
>
>> --
>> Eric Botcazou

Re: Fix tsan tests.

2013-12-10 Thread Yury Gribov


Jakub wrote:
> HJ wrote:
>>> Done, r205853.
>> I think it caused:
>>
>> http://gcc.gnu.org/ml/gcc-regression/2013-12/msg00214.html
>
> Missing torture-finish before dg-finish? At least looking at other
> *.exp files that do set-torture-options they all do that.

Full make-check is still running but I was able to repro this with `make 
check RUNTESTFLAGS='tsan.exp i386-prefetch.exp math-torture.exp'. 
Attached patch is tested against these exps as well. Ok to commit or 
should I wait for make-check to complete? I'd prefer to unblock other 
developers ASAP.


-Y
diff --git a/gcc/testsuite/g++.dg/tsan/tsan.exp b/gcc/testsuite/g++.dg/tsan/tsan.exp
index 68b1d83..f54092e 100644
--- a/gcc/testsuite/g++.dg/tsan/tsan.exp
+++ b/gcc/testsuite/g++.dg/tsan/tsan.exp
@@ -42,5 +42,6 @@ gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c $srcdir/c-c++-common
 }
 
 # All done.
+torture-finish
 tsan_finish
 dg-finish
diff --git a/gcc/testsuite/gcc.dg/tsan/tsan.exp b/gcc/testsuite/gcc.dg/tsan/tsan.exp
index a4a5b72..c70021c 100644
--- a/gcc/testsuite/gcc.dg/tsan/tsan.exp
+++ b/gcc/testsuite/gcc.dg/tsan/tsan.exp
@@ -42,5 +42,6 @@ gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c $srcdir/c-c++-common
 }
 
 # All done.
+torture-finish
 tsan_finish
 dg-finish
2013-12-10  Yury Gribov  

* gcc-dg/tsan/tsan.exp: Added missing call to torture-finish.
* g++-dg/tsan/tsan.exp: Likewise.

Re: [REPOST] Invalid Code when reading from unaligned zero-sized array

2013-12-10 Thread Richard Biener

On Tue, Dec 10, 2013 at 11:53 AM, Eric Botcazou  wrote:
>> What we support is out of bounds accesses for heap vars if the var's type
>> has flexible array member or something we treat similarly and there is the
>> possibility that there could be payload after the heap var that could be
>> accessed from the flexible array members or similar arrays.
>
> My question was about the above similar arrays, i.e. whether we consider all
> trailing arrays in structures as flexible-like or not.  No strong opinion.

Yes we do, even for struct { struct { int a; char a[1] } }; (note the not really
"trailing" as there is padding after the trailing array).  We do take
size limitations from a DECL (if we see one) into account to limit the
effect of this trailing-array-supporting, so it effectively only applies to
indirect accesses (and the padding example above, you can use the whole
padding if DECL_SIZE allows that).

>> So, I don't see what is the big deal with BLKmode, because all the cases
>> which actually could have flexible array member extra payloads (or similar)
>> must necessarily live in memory, if it is the compiler that decides whether
>> to put it into memory or keep in registers etc., then it can't be heap
>> allocated.
>
> The invariant is that types for which objects can effectively have variable
> size must have BLKmode, otherwise you need to add very ugly code in the RTL
> expander to mask the lie.

I wonder if we can make the expander more rely on the DECLs mode
and optimize only the DECLs mode (if we can constrain its size via
DECL_SIZE) to non-BLKmode instead of doing that for the TYPEs mode.
Yes, you'd have DECL_MODE != TYPE_MODE that way.

Or rather I wonder if the expander doesn't already work that way
(looks at DECL_MODE).

Richard.

> --
> Eric Botcazou

Re: [PATCH] Strict volatile bit-fields clean-up, Take 2

2013-12-10 Thread Richard Biener

On Tue, Dec 10, 2013 at 10:31 AM, Bernd Edlinger
 wrote:
> Hi,
>
>
> On Mon, 9 Dec 2013 14:18:23, Richard Biener wrote:
>>
>> On Mon, Dec 9, 2013 at 1:39 PM, Bernd Edlinger
>>  wrote:
>>> On Fri, 6 Dec 2013 11:51:15, Richard Biener wrote:

 On Fri, Dec 6, 2013 at 11:15 AM, Bernd Edlinger
  wrote:
> Hi,
>
> On Thu, 5 Dec 2013 15:10:51, Richard Biener wrote:
>>
>> On Thu, Dec 5, 2013 at 1:27 PM, Bernd Edlinger
>>  wrote:
>>> Hi Richard,
>>>
>>> I had just an idea how to solve that recursion problem without 
>>> completely ignoring the
>>> memory mode. I hope you are gonna like it.
>>>
>>> This time I even added a comment :-)
>>
>> Ehm, ...
>>
>> + /* If MODE has no size i.e. BLKmode or is larger than WORD_MODE
>> + or BITREGION_END is used we can use WORD_MODE as upper limit.
>> + However, if the field is too unaligned to be fetched with a single
>> + access, we also have to use WORD_MODE. This can happen in Ada. */
>> if (GET_MODE_BITSIZE (mode) == 0
>> - || GET_MODE_BITSIZE (mode)> GET_MODE_BITSIZE (word_mode))
>> + || GET_MODE_BITSIZE (mode)> GET_MODE_BITSIZE (word_mode)
>> + || bitregion_end != 0
>> + || bitnum % BITS_PER_UNIT + bitsize> GET_MODE_SIZE (mode)
>> + || (STRICT_ALIGNMENT
>> + && bitnum % GET_MODE_ALIGNMENT (mode) + bitsize
>> +> GET_MODE_BITSIZE (mode)))
>> mode = word_mode;
>>
>> If the field is unaligned how does choosing a larger mode help? We should
>> always be able to use multiple accesses with a smaller mode in this case.
>>
>> So - I'd buy the || bitregion_end != 0 thing but the rest ...? So, ok if
>> that alone fixes the bug. Note that when bitregion_end == 0 the
>> access should be limited by the mode we pass to get_best_mode.
>>
>> Btw, it should be always possible to return QImode here, so I wonder
>> how enlarging the mode fixes the underlying issue (maybe the bug
>> is really elsewhere?)
>>
>
> This comment in store_split_bit_field explains everything:
>
> /* THISSIZE must not overrun a word boundary. Otherwise,
> store_fixed_bit_field will call us again, and we will mutually
> recurse forever. */
> thissize = MIN (bitsize - bitsdone, BITS_PER_WORD);
> thissize = MIN (thissize, unit - thispos);
>
>
> This comment was added in 1993. At that time the code in 
> store_fixed_bit_field
> looked like this:
>
> mode = get_best_mode (bitsize, bitnum, bitregion_start, bitregion_end,
> MEM_ALIGN (op0), word_mode, MEM_VOLATILE_P (op0));
> if (mode == VOIDmode)
> {
> /* The only way this should occur is if the field spans word
> boundaries. */
> store_split_bit_field (op0, bitsize, bitnum, bitregion_start,
> bitregion_end, value);
> return;
> }
>
> And in 2001 that was changed to this:
>
> mode = GET_MODE (op0);
> if (GET_MODE_BITSIZE (mode) == 0
> || GET_MODE_BITSIZE (mode)> GET_MODE_BITSIZE (word_mode))
> mode = word_mode;
> mode = get_best_mode (bitsize, bitnum, bitregion_start, bitregion_end,
> MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
>
> Which broke the symmetry, and invalidated the above comment.
> Therefore we have again a recursion problem.

 This was rev. 43864 btw, no testcase but added the comment
 "We don't want a mode bigger than the destination".

 So isn't the bug fixed by calling your new store_fixed_bit_field_1
 from store_split_bit_field? In fact that caller knows the mode
 it wants to do the store with, so why not even have an explicit
 mode argument for store_fixed_bit_field_1 instead of extracting
 it from op0?

 Note I just want to help as well and I am not very familiar with
 the details of the implementation here. So I'd rather have
 a patch "obviously correct" to me - which expanding a condition
 by several more checks isn't ;)

>>>
>>> Hmm...
>>>
>>> I think if I solve it from the other side, everything looks
>>> much more obvious indeed.
>>>
>>> How about this new version: For the inconsistent values,
>>> MODE = QImode, bitpos=11, bitsize=8, it just generates two
>>> QImode accesses now, instead of a single HI or SImode.
>>>
>>> Boot-strapped and regression-test passed on X86-linux-gnu.
>>>
>>> OK for trunk?
>>
>> Looks good to me.
>>
>> Thanks,
>> Richard.
>>
>
> Great!
>
> Then I think we can put all bits together now:
>
> 1. Let Sandra apply her Bit-fields patch "reimplement
> -fstrict-volatile-bitfields v4, part 1/2" which was
> posted here: http://gcc.gnu.org/ml/gcc-patches/2013-09/msg02058.html
> and approved here: http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01476.html
>
> 2. As follow-Up I'd like to apply this update-patch, which fixes the
> recursion in the extract_split_bit_field and fixes the C++ memory
> model for -fstrict-volatile-bitfields:
>
> which was posted here: http

Re: [Patch, RTL] Eliminate redundant vec_select moves.

2013-12-10 Thread Kirill Yukhin

Hello,
On 09 Dec 14:08, H.J. Lu wrote:
> There are no regressions on Linux/x86-64 with -m32 and -m64.
> Can you check if it improves code quality on x886?

That is exactly what I was talking about. However I wasn't sure
that we can change already defined (and used throughout ports)
target hook.

Anyway, this patch is not working for given test, because combine
of these insns is blocked:
(insn 2 4 3 2 (set (reg/v:V4SF 85 [ x ])
(reg:V4SF 21 xmm0 [ x ]))
 (expr_list:REG_DEAD (reg:V4SF 21 xmm0 [ x ])
(nil)))

(insn 6 3 11 2 (set (reg:SF 86 [ D.1819 ])
(vec_select:SF (reg/v:V4SF 85 [ x ])
(parallel [
(const_int 0 [0])
])))
 (expr_list:REG_DEAD (reg/v:V4SF 85 [ x ])
(nil)))

(insn 11 6 14 2 (set (reg/i:SF 21 xmm0)
(reg:SF 86 [ D.1819 ]))
 (expr_list:REG_DEAD (reg:SF 86 [ D.1819 ])
(nil)))

This is because XMM0 is SSE_FIRST_REG which is likely_spilled_p.
Which I suspect is correct, since it is return value register.
Anyway, we may change the test so, that VEC_SELECT won't contain
XMM0 and will be successfully combined with input and output,
resulting to this pattern:
(insn 9 8 10 2 (set (reg:SF 22 xmm1)
(vec_select:SF (reg:V4SF 22 xmm1 [ y ])
(parallel [
(const_int 0 [0])
])))
 (nil))
Which is noop-erased with the patch.

Attached patch + updated test.

Note. It still not working for 32-bit x86 because of different
paramter passing. I cannot invent solution :)
For x86_64 all workks fine.

Note2. Even without the patch such VEC_SELECT are removed during
split2 pass, due to such split (sse.md):
(define_split
  [(set (match_operand:DF 0 "register_operand")
(vec_select:DF
  (match_operand:V2DF 1 "nonimmediate_operand")
  (parallel [(const_int 0)])))]
  "TARGET_SSE2 && reload_completed"
  [(set (match_dup 0) (match_dup 1))]

But I believe earlier we git rid of redundant code is better.

--
Thanks, K

---
 gcc/combine.c |  2 ++
 gcc/config/aarch64/aarch64.h  |  6 +++---
 gcc/config/alpha/alpha.h  |  2 +-
 gcc/config/arm/arm.h  |  8 
 gcc/config/i386/i386-protos.h |  4 +++-
 gcc/config/i386/i386.c| 12 ++--
 gcc/config/i386/i386.h|  7 ---
 gcc/config/i386/i386.md   |  1 +
 gcc/config/ia64/ia64.h|  2 +-
 gcc/config/m32c/m32c.h|  2 +-
 gcc/config/mep/mep.h  |  2 +-
 gcc/config/mips/mips.h|  2 +-
 gcc/config/msp430/msp430.h| 10 +-
 gcc/config/pa/pa32-regs.h |  2 +-
 gcc/config/pa/pa64-regs.h |  2 +-
 gcc/config/pdp11/pdp11.h  |  2 +-
 gcc/config/rs6000/rs6000.h|  2 +-
 gcc/config/s390/s390.h|  2 +-
 gcc/config/score/score.h  |  4 ++--
 gcc/config/sh/sh.h|  2 +-
 gcc/config/sparc/sparc.h  |  2 +-
 gcc/config/spu/spu.h  |  2 +-
 gcc/emit-rtl.c|  2 +-
 gcc/hard-reg-set.h|  6 +++---
 gcc/postreload.c  |  4 
 gcc/recog.c   |  3 ++-
 gcc/regcprop.c|  4 +++-
 gcc/reginfo.c |  1 +
 gcc/reload.c  | 10 +++---
 gcc/reload1.c | 10 +++---
 gcc/rtlanal.c |  2 +-
 gcc/testsuite/gcc.dg/vect/vect-nop-move.c | 22 +-
 32 files changed, 89 insertions(+), 55 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index c7eb5e5..4575b16 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -5084,6 +5084,7 @@ subst (rtx x, rtx from, rtx to, int in_dest, int in_cond, 
int unique_copy)
  && REGNO (to) < FIRST_PSEUDO_REGISTER
  && REG_CANNOT_CHANGE_MODE_P (REGNO (to),
   GET_MODE (to),
+  SUBREG_BYTE (x),
   GET_MODE (x)))
return gen_rtx_CLOBBER (VOIDmode, const0_rtx);
 #endif
@@ -6450,6 +6451,7 @@ simplify_set (rtx x)
   && ! (REG_P (dest) && REGNO (dest) < FIRST_PSEUDO_REGISTER
&& REG_CANNOT_CHANGE_MODE_P (REGNO (dest),
 GET_MODE (SUBREG_REG (src)),
+SUBREG_BYTE (src),
 GET_MODE (src)))
 #endif
   && (REG_P (dest)
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index cead022..5b3bead 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -820,9 +820,9 @@ do {

Re: [PATCH] Deal with promotions for internal functions (PR sanitizer/59399)

2013-12-10 Thread Richard Biener

On Mon, 9 Dec 2013, Marek Polacek wrote:

> Back in April 2011, Richard S. submitted the implementation of
> internal functions [1].  It originally had this hunk of code:
> 
> if (code == SSA_NAME
> && (g = SSA_NAME_DEF_STMT (ssa_name))
> -   && gimple_code (g) == GIMPLE_CALL)
> +   && gimple_code (g) == GIMPLE_CALL
> +   && !gimple_call_internal_p (g))
>   pmode = promote_function_mode (type, mode, &unsignedp,
>  TREE_TYPE
>  (TREE_TYPE (gimple_call_fn (g))),
> 
> but the !gimple_call_internal_p (g) was turned into an assert.  This patch
> turns the assert back into a condition.  That's because I actually hit that
> assert with -fsanitize=signed-integer-overflow on PPC64, where we expand 
> internal
> calls at expand time.  On PPC64, integers are promoted to long when passed to
> a function, that is why the assert doesn't trigger on x86_64.
> There shouldn't be any harm in not calling promote_function_mode since 
> internal
> functions do not result in a call instruction.
> 
> I'm not attaching any new testcase, because that's not needed - we ICE on 
> PPC64
> with current testsuite.
> 
> Regtested/bootstrapped on powerpc64-unknown-linux-gnu and 
> x86_64-unknown-linux-gnu.
> Ok for trunk?

Looks good to me - Richard?

Thanks,
Richard.

> [1] http://gcc.gnu.org/ml/gcc-patches/2011-04/msg00609.html
> 
> 2013-12-09  Marek Polacek  
> 
>   PR sanitizer/59399
>   * expr.c (expand_expr_real_1): Remove assert dealing with
>   internal calls and turn that into a condition instead.
> 
> --- gcc/expr.c.mp22013-12-09 18:54:16.235624592 +0100
> +++ gcc/expr.c2013-12-09 18:55:03.580792572 +0100
> @@ -9482,13 +9482,11 @@ expand_expr_real_1 (tree exp, rtx target
>the same mode we got when the variable was declared.  */
> if (code == SSA_NAME
> && (g = SSA_NAME_DEF_STMT (ssa_name))
> -   && gimple_code (g) == GIMPLE_CALL)
> - {
> -   gcc_assert (!gimple_call_internal_p (g));
> -   pmode = promote_function_mode (type, mode, &unsignedp,
> -  gimple_call_fntype (g),
> -  2);
> - }
> +   && gimple_code (g) == GIMPLE_CALL
> +   && !gimple_call_internal_p (g))
> + pmode = promote_function_mode (type, mode, &unsignedp,
> +gimple_call_fntype (g),
> +2);
> else
>   pmode = promote_decl_mode (exp, &unsignedp);
> gcc_assert (GET_MODE (decl_rtl) == pmode);
> 
>   Marek
> 
>

Re: Remove some typedefs (was: Silence class vs. struct warnings (opt_pass, ipa_opt_pass_d))

2013-12-10 Thread Richard Biener

On Mon, Dec 9, 2013 at 8:49 PM, Oleg Endo  wrote:
> On Thu, 2013-12-05 at 15:34 +0100, Oleg Endo wrote:
>> On Thu, 2013-12-05 at 14:56 +0100, Richard Biener wrote:
>> >
>> > grep for 'typedef struct.*{' in headers.  The typedef name is usually
>> > the desired one and is used without 'struct'.  So it's an orthogonal
>> > issue.
>>
>> Ah, do you mean converting this stuff ...
>>
>> typedef struct
>> {
>>   cgraph_node_set set;
>>   unsigned index;
>> } cgraph_node_set_iterator;
>>
>> ... to ...
>>
>> struct cgraph_node_set_iterator
>> {
>> 
>>
>> right?
>> Sure, no problem.  But I'd rather do it step by step in separate
>> patches.
>
> So here it is.
> Tested with make all-gcc.
> Is this OK?

Ok.

Thanks,
Richard.

> Cheers,
> Oleg
>
> gcc/ChangeLog:
> * gcc/cgraph.h (cgraph_node_set_iterator,
> varpool_node_set_iterator): Remove typedef.
> (cgraph_inline_failed_enum, cgraph_inline_failed_t): Remove
> typedef and rename to cgraph_inline_failed_t.
> * gcc/tree-ssa-alias.h (ao_ref_s, ao_ref): Remove typedef and
> rename to ao_ref.
> * gcc/reload.h (reg_equivs_s, reg_equivs_t): Remove typedef and
> rename to reg_equivs_t.
> * gcc/conditions.h (CC_STATUS): Remove typedef.
> * gcc/bitmap.h (bitmap_obstack): Remove typedef.
> (bitmap_element_def, bitmap_element): Remove typedef and rename
> to bitmap_element.
> (bitmap_head_def, bitmap_head): Remove typedef and rename to
> bitmap_head.
> (bitmap_iterator): Remove typedef.
> * gcc/target.h (cumulative_args_t, print_switch_type,
> secondary_reload_info): Remove typedef.
> * gcc/dwarf2out.h (dw_cfi_oprnd_struct, dw_cfi_oprnd): Remove
> dw_cfi_oprnd_struct alias.
> (dw_cfi_struct, dw_cfi_node): Remove typedef and rename to
> dw_cfi_node.
> (dw_fde_struct, dw_fde_node): Remove typedef and rename to
> dw_fde_node.
> (cfa_loc, dw_cfa_location): Remove typedef and rename to
> dw_cfa_location.
> (dw_vec_struct, dw_vec_const): Remove typedef and rename to
> dw_vec_const.
> (dw_val_struct, dw_val_node): Remove typedef and rename to
> dw_val_node.
> (dw_loc_descr_struct, dw_loc_descr_node): Remove typedef and
> rename to dw_loc_descr_node.
> * gcc/params.h (param_info, compiler_param): Remove typedef.
> * gcc/opts.h (cl_deferred_param): Remove typedef.
> * gcc/sreal.h (sreal): Remove typedef.
> * gcc/ddg.h (dep_type, dep_data_type): Remove typedef.
> * gcc/graphite-clast-to-gimple.h (cloog_prog_clast, bb_pbb_def):
> Remove typedef.
> * gcc/lto-streamer.h (lto_decl_stream_e_t, lto_encoder_entry,
> lto_symtab_encoder_iterator, res_pair): Remove typedef.
> * gcc/tree-affine.h (affine_tree_combination, aff_tree): Remove
> typedef and rename to aff_tree.
> * gcc/sched-int.h (region): Remove typedef.
> * gcc/diagnostic.h (diagnostic_info,
> diagnostic_classification_change_t): Remove typedef.
> * gcc/tree-ssa-loop.h (affine_iv_d): Remove typedef and rename
> to affine_iv.
> * gcc/sbitmap.h (sbitmap_iterator): Remove typedef.
> * gcc/ssa-iterators.h (immediate_use_iterator_d,
> imm_use_iterator): Remove typedef and rename to
> imm_use_iterator.
> (ssa_operand_iterator_d, ssa_op_iter): Remove typedef and rename
> to ssa_op_iter.
> * gcc/ggc-internal.h (ggc_statistics): Remove typedef.
> * gcc/cselib.h (cselib_val_struct, cselib_val): Remove typedef
> and rename to cselib_val.
> * gcc/tree-core.h (alias_pair): Remove typedef.
> (constructor_elt_d, constructor_elt): Remove typedef and rename
> to constructor_elt.
> (ssa_use_operand_d, ssa_use_operand_t): Remove typedef and
> rename to ssa_use_operand_t.
> * gcc/graphite-sese-to-poly.h (base_alias_pair): Remove typedef.
> * gcc/tree-data-ref.h (conflict_function): Remove typedef.
> * gcc/tree-inline.h (copy_body_data): Remove typedef.
> * gcc/ipa-inline.h (condition, size_time_entry,
> inline_param_summary_t, edge_growth_cache_entry): Remove
> typedef.
> * gcc/regrename.h (operand_rr_info, insn_rr_info): Remove
> typedef.
> * gcc/gimple-iterator.h (gimple_stmt_iterator_d,
> gimple_stmt_iterator): Remove typedef and rename to
> gimple_stmt_iterator.
> * gcc/basic-block.h (ce_if_block, ce_if_block_t): Remove typedef
> and rename to ce_if_block.
> (edge_iterator): Remove typedef.
> * gcc/ipa-prop.h (ipa_agg_jf_item, ipa_agg_jf_item_t): Remove
> typedef and rename to ipa_agg_jf_item.
> (ipa_agg_jump_function_t, ipa_param_descriptor_t,
> ipa_node_params_t, ipa_parm_adjustment_t): Remove typedef.
> (ipa_jump_func, ipa

[PATCH, testsuite] Fix alignment in movapd tests

2013-12-10 Thread Ryan Mansfield

The movapd tests in the i386 testsuite fail if built with 
-fstack-protector-strong or -fstack-protector-all, due to 8byte 
alignment instead of 16, or 32byte alignment.


2013-12-10  Ryan Mansfield  

PR testsuite/59442
* gcc.target/i386/sse2-movapd-1.c: Fix alignment attributes.
* gcc.target/i386/sse2-movapd-2.c: Likewise.
* gcc.target/i386/avx-vmovapd-256-1.c: Likewise.
* gcc.target/i386/avx-vmovapd-256-2.c: Likewise.

Regards,

Ryan Mansfield
Index: gcc/testsuite/gcc.target/i386/avx-vmovapd-256-1.c
===
--- gcc/testsuite/gcc.target/i386/avx-vmovapd-256-1.c	(revision 205857)
+++ gcc/testsuite/gcc.target/i386/avx-vmovapd-256-1.c	(working copy)
@@ -15,7 +15,7 @@
 avx_test (void)
 {
   union256d u;
-  double e [4] __attribute__ ((aligned (8)))  = {41124.234,2344.2354,8653.65635,856.43576};
+  double e [4] __attribute__ ((aligned (32)))  = {41124.234,2344.2354,8653.65635,856.43576};
 
   u.x = test (e);
 
Index: gcc/testsuite/gcc.target/i386/avx-vmovapd-256-2.c
===
--- gcc/testsuite/gcc.target/i386/avx-vmovapd-256-2.c	(revision 205857)
+++ gcc/testsuite/gcc.target/i386/avx-vmovapd-256-2.c	(working copy)
@@ -15,7 +15,7 @@
 avx_test (void)
 {
   union256d u;
-  double e [4] __attribute__ ((aligned (8))) = {0.0};
+  double e [4] __attribute__ ((aligned (32))) = {0.0};
 
   u.x = _mm256_set_pd (39578.467285, 7856.342941, 85632.783567, 47563.234215);
 
Index: gcc/testsuite/gcc.target/i386/sse2-movapd-1.c
===
--- gcc/testsuite/gcc.target/i386/sse2-movapd-1.c	(revision 205857)
+++ gcc/testsuite/gcc.target/i386/sse2-movapd-1.c	(working copy)
@@ -25,7 +25,7 @@
 TEST (void)
 {
   union128d u;
-  double e[2] __attribute__ ((aligned (8))) = {2134.3343,1234.635654};
+  double e[2] __attribute__ ((aligned (16))) = {2134.3343,1234.635654};
 
   u.x = test (e);   
 
Index: gcc/testsuite/gcc.target/i386/sse2-movapd-2.c
===
--- gcc/testsuite/gcc.target/i386/sse2-movapd-2.c	(revision 205857)
+++ gcc/testsuite/gcc.target/i386/sse2-movapd-2.c	(working copy)
@@ -25,7 +25,7 @@
 TEST (void)
 {
   union128d u;
-  double e[2] __attribute__ ((aligned (8))) = {0.0};
+  double e[2] __attribute__ ((aligned (16))) = {0.0};
 
   u.x = _mm_set_pd (2134.3343,1234.635654);

Re: [PATCH/AARCH64 6/6] Support ILP32 multi-lib

2013-12-10 Thread Marcus Shawcroft

Hi,


On 10 December 2013 01:52, Andrew Pinski  wrote:
> On Mon, Dec 9, 2013 at 12:12 PM, Yufeng Zhang  wrote:
>> To be more explicit and consistent, the name of the ILP32 loader shall have
>> 'ilp32' instead of '32'.  The extension field shall be appended to
>> 'aarch64', separated by '_', and we should probably add the big-endian name
>> at the same time.  With the extension fields sorted alphabetically,
>> GLIBC_DYNAMIC_LINKER can be defined as:

>> "/lib/ld-linux-aarch64%{mbig-endian:_be}%{mabi=ilp32:_ilp32}.so.1"


> AS mentioned in another email I think _be is incorrect.  I also think

There was recent discussion on this topic over here:
http://lists.linaro.org/pipermail/cross-distro/2013-November/000570.html

> aarch64 becomes redundant for _ilp32 and really it should be just:
> "/lib/ld-linux-%{mabi=ilp32:ilp32;:aarch64}.so.

Stick with: /lib/ld-linux-.so. as outlined in Yufeng's reply.

Cheers
/Marcus

Re: Fix tsan tests.

2013-12-10 Thread Yury Gribov


HJ wrote:
>> Done, r205853.
> I think it caused:
> http://gcc.gnu.org/ml/gcc-regression/2013-12/msg00214.html

Right, I think we are missing torture-finish. Will send a fix after test.

-Y

Re: Fix tsan tests.

2013-12-10 Thread Jakub Jelinek

On Tue, Dec 10, 2013 at 05:10:44AM -0800, H.J. Lu wrote:
> On Tue, Dec 10, 2013 at 2:44 AM, Yury Gribov  wrote:
> > Done, r205853.
> 
> I think it caused:
> 
> http://gcc.gnu.org/ml/gcc-regression/2013-12/msg00214.html

Missing torture-finish before dg-finish?  At least looking at other
*.exp files that do set-torture-options they all do that.

Jakub

Re: Fix tsan tests.

2013-12-10 Thread H.J. Lu

On Tue, Dec 10, 2013 at 2:44 AM, Yury Gribov  wrote:
> Done, r205853.

I think it caused:

http://gcc.gnu.org/ml/gcc-regression/2013-12/msg00214.html

-- 
H.J.

Re: [patch i386]: Fix PR 56807

2013-12-10 Thread Kai Tietz

Yes, I sent update to HJ's comment.

2013/12/6 Kai Tietz :
> Upps ... here is the missing Changlog
>
> ChangeLog
>
> 2013-12-06  Kai Tietz  
>
> PR target/56807
> * config/i386/i386.c (ix86_expand_prologue): Address saved
> registers stack-relative, not via frame-pointer.

[C++ Patch] PR 59165 (aka Core/1442)

2013-12-10 Thread Paolo Carlini


Hi,

as far as I can see, this bug asks for the implementation of Core/1442, 
thus don't do a special Koenig lookup including namespace std in 
cp_parser_perform_range_for_lookup. Tested x86_64-linux.


Thanks,
Paolo.

/
/cp
2013-12-10  Paolo Carlini  

Core DR 1442
PR c++/59165
* parser.c (cp_parser_perform_range_for_lookup): Don't pass true
as include_std to perform_koenig_lookup.
(cp_parser_postfix_expression): Adjust.
* pt.c (tsubst_copy_and_build): Likewise.
* semantics.c (perform_koenig_lookup): Remove bool parameter.
(omp_reduction_lookup): Adjust.
* name-lookup.c (lookup_arg_dependent_1): Remove bool parameter.
(lookup_arg_dependent): Likewise.
(lookup_function_nonclass): Adjust.
* name-lookup.h: Adjust declaration.
* cp-tree.h: Likewise.

/testsuite
2013-12-10  Paolo Carlini  

Core DR 1442
PR c++/59165
* g++.dg/cpp0x/range-for28.C: New.
* g++.dg/cpp0x/range-for3.C: Update.
Index: cp/cp-tree.h
===
--- cp/cp-tree.h(revision 205850)
+++ cp/cp-tree.h(working copy)
@@ -5743,7 +5743,7 @@ extern tree finish_stmt_expr_expr (tree, tree);
 extern tree finish_stmt_expr   (tree, bool);
 extern tree stmt_expr_value_expr   (tree);
 bool empty_expr_stmt_p (tree);
-extern tree perform_koenig_lookup  (tree, vec *, bool,
+extern tree perform_koenig_lookup  (tree, vec *,
 tsubst_flags_t);
 extern tree finish_call_expr   (tree, vec **, 
bool,
 bool, tsubst_flags_t);
Index: cp/name-lookup.c
===
--- cp/name-lookup.c(revision 205850)
+++ cp/name-lookup.c(working copy)
@@ -4879,7 +4879,7 @@ lookup_function_nonclass (tree name, vec *args,
-   bool include_std)
+lookup_arg_dependent_1 (tree name, tree fns, vec *args)
 {
   struct arg_lookup k;
 
@@ -5617,8 +5616,6 @@ static tree
   else
 k.fn_set = NULL;
 
-  if (include_std)
-arg_assoc_namespace (&k, std_node);
   arg_assoc_args_vec (&k, args);
 
   fns = k.functions;
@@ -5643,13 +5640,12 @@ static tree
 /* Wrapper for lookup_arg_dependent_1.  */
 
 tree
-lookup_arg_dependent (tree name, tree fns, vec *args,
-  bool include_std)
+lookup_arg_dependent (tree name, tree fns, vec *args)
 {
   tree ret;
   bool subtime;
   subtime = timevar_cond_start (TV_NAME_LOOKUP);
-  ret = lookup_arg_dependent_1 (name, fns, args, include_std);
+  ret = lookup_arg_dependent_1 (name, fns, args);
   timevar_cond_stop (TV_NAME_LOOKUP, subtime);
   return ret;
 }
Index: cp/name-lookup.h
===
--- cp/name-lookup.h(revision 205850)
+++ cp/name-lookup.h(working copy)
@@ -338,7 +338,7 @@ extern void do_toplevel_using_decl (tree, tree, tr
 extern void do_local_using_decl (tree, tree, tree);
 extern tree do_class_using_decl (tree, tree);
 extern void do_using_directive (tree);
-extern tree lookup_arg_dependent (tree, tree, vec *, bool);
+extern tree lookup_arg_dependent (tree, tree, vec *);
 extern bool is_associated_namespace (tree, tree);
 extern void parse_using_directive (tree, tree);
 extern tree innermost_non_namespace_value (tree);
Index: cp/parser.c
===
--- cp/parser.c (revision 205850)
+++ cp/parser.c (working copy)
@@ -6012,7 +6012,6 @@ cp_parser_postfix_expression (cp_parser *parser, b
if (!any_type_dependent_arguments_p (args))
  postfix_expression
= perform_koenig_lookup (postfix_expression, args,
-/*include_std=*/false,
 complain);
  }
else
@@ -6038,7 +6037,6 @@ cp_parser_postfix_expression (cp_parser *parser, b
if (!any_type_dependent_arguments_p (args))
  postfix_expression
= perform_koenig_lookup (postfix_expression, args,
-/*include_std=*/false,
 complain);
  }
  }
@@ -10287,12 +10285,10 @@ cp_parser_perform_range_for_lookup (tree range, tr
  vec_safe_push (vec, range);
 
  member_begin = perform_koenig_lookup (id_begin, vec,
-   /*include_std=*/true,
tf_warning_or_error);
  *begin = finish_call_expr (member_begin, &vec, false, true,

[gomp4] OpenACC structured blocks (was: PING: Fwd: Re: [patch] implement Cilk Plus simd loops on trunk)

2013-12-10 Thread Thomas Schwinge

Hi!

At the end of this email you can find the patch that I'd like to apply to
gomp-4_0-branch for OpenACC structured blocks, after the following two
have been approved:

On Fri, 06 Dec 2013 19:33:35 +0100, I wrote:
> On Fri, 15 Nov 2013 14:44:45 -0700, Aldy Hernandez  wrote:
> > --- a/gcc/omp-low.c
> > +++ b/gcc/omp-low.c
> > @@ -10177,12 +10210,33 @@ diagnose_sb_0 (gimple_stmt_iterator *gsi_p,
> >  error ("invalid entry to OpenMP structured block");
> >  #endif
> >  
> > +  bool cilkplus_block = false;
> > +  if (flag_enable_cilkplus)
> > +{
> > +  if ((branch_ctx
> > +  && gimple_code (branch_ctx) == GIMPLE_OMP_FOR
> > +  && gimple_omp_for_kind (branch_ctx) == GF_OMP_FOR_KIND_CILKSIMD)
> > + || (gimple_code (label_ctx) == GIMPLE_OMP_FOR
> > + && gimple_omp_for_kind (label_ctx) == GF_OMP_FOR_KIND_CILKSIMD))
> > +   cilkplus_block = true;
> > +}
> 
> There is one issue here: consider the following code:
> 
> void baz()
> {
>   bad1:
>   #pragma omp parallel
> goto bad1;
> }
> 
> Then, if both -fcilkplus and -fopenmp are specified, that will run into a
> SIGSEGV/ICE because of label_ctx == NULL.  The fix is simple enough; OK
> for trunk and gomp-4_0-branch (after full testing)?

Testing looks good.

> The testcase is
> basically a concatenation of gcc.dg/cilk-plus/jump.c and
> gcc.dg/gomp/block-1.c -- should this be done differently/better?
> 
> commit eee16f8aad4527b705d327476b00bf9f5ba6dcce
> Author: Thomas Schwinge 
> Date:   Fri Dec 6 18:55:41 2013 +0100
> 
> Fix possible ICE (null pointer dereference) introduced in r204863.
> 
>   gcc/
>   * omp-low.c (diagnose_sb_0): Make sure label_ctx is valid to
>   dereference.
>   gcc/testsuite/
>   * gcc.dg/cilk-plus/jump-openmp.c: New file.
> 
> diff --git gcc/omp-low.c gcc/omp-low.c
> index e0f7d1d..91221c0 100644
> --- gcc/omp-low.c
> +++ gcc/omp-low.c
> @@ -10865,7 +10865,8 @@ diagnose_sb_0 (gimple_stmt_iterator *gsi_p,
>if ((branch_ctx
>  && gimple_code (branch_ctx) == GIMPLE_OMP_FOR
>  && gimple_omp_for_kind (branch_ctx) == GF_OMP_FOR_KIND_CILKSIMD)
> -   || (gimple_code (label_ctx) == GIMPLE_OMP_FOR
> +   || (label_ctx
> +   && gimple_code (label_ctx) == GIMPLE_OMP_FOR
> && gimple_omp_for_kind (label_ctx) == GF_OMP_FOR_KIND_CILKSIMD))
>   cilkplus_block = true;
>  }
> diff --git gcc/testsuite/gcc.dg/cilk-plus/jump-openmp.c 
> gcc/testsuite/gcc.dg/cilk-plus/jump-openmp.c
> new file mode 100644
> index 000..95e6b2d
> --- /dev/null
> +++ gcc/testsuite/gcc.dg/cilk-plus/jump-openmp.c
> @@ -0,0 +1,49 @@
> +/* { dg-do compile } */
> +/* { dg-options "-fcilkplus -fopenmp" } */
> +/* { dg-require-effective-target fopenmp } */
> +
> +int *a, *b, c;
> +
> +void foo()
> +{
> +#pragma simd
> +  for (int i=0; i < 1000; ++i)
> +{
> +  a[i] = b[i];
> +  if (c == 5)
> + return; /* { dg-error "invalid branch to/from a Cilk Plus structured 
> block" } */
> +}
> +}
> +
> +void bar()
> +{
> +#pragma simd
> +  for (int i=0; i < 1000; ++i)
> +{
> +lab:
> +  a[i] = b[i];
> +}
> +  if (c == 6)
> +goto lab; /* { dg-error "invalid entry to Cilk Plus structured block" } 
> */
> +}
> +
> +void baz()
> +{
> +  bad1:
> +  #pragma omp parallel
> +goto bad1; /* { dg-error "invalid branch to/from an OpenMP structured 
> block" } */
> +
> +  goto bad2; /* { dg-error "invalid entry to OpenMP structured block" } */
> +  #pragma omp parallel
> +{
> +  bad2: ;
> +}
> +
> +  #pragma omp parallel
> +{
> +  int i;
> +  goto ok1;
> +  for (i = 0; i < 10; ++i)
> + { ok1: break; }
> +}
> +}
> 
> 
> >/* If it's obvious we have an invalid entry, be specific about the 
> > error.  */
> >if (branch_ctx == NULL)
> > -error ("invalid entry to OpenMP structured block");
> > +{
> > +  if (cilkplus_block)
> > +   error ("invalid entry to Cilk Plus structured block");
> > +  else
> > +   error ("invalid entry to OpenMP structured block");
> > +}
> >else
> > -/* Otherwise, be vague and lazy, but efficient.  */
> > -error ("invalid branch to/from an OpenMP structured block");
> > +{
> > +  /* Otherwise, be vague and lazy, but efficient.  */
> > +  if (cilkplus_block)
> > +   error ("invalid branch to/from a Cilk Plus structured block");
> > +  else
> > +   error ("invalid branch to/from an OpenMP structured block");
> > +}
> 
> In fact, and keeping in mind that we're currently adding OpenACC support,
> I'd suggest to do this differently; OK for trunk and gomp-4_0-branch?

Testing looks good.

> commit 367dabfcc94a3e96d63b48c38d0dd94ca9f517f8
> Author: Thomas Schwinge 
> Date:   Fri Dec 6 19:23:47 2013 +0100
> 
> Generalize diagnose_omp_blocks' structured block logic.
> 
>   gcc/
>   * omp-low.c (diagnose_sb_0): Generalize detection which kind of
>   structured block we're in.
>   gcc/testsuite/
>

[PATCH][2/2] Speedup PTA (PR38474)

2013-12-10 Thread Richard Biener


This is the 2nd piece.  We can cache solution_set_expand when processing
complex constraints.  This also fixes spurious bits leaking into
constraints that don't need the expansion as the delta was expanded
in-place previously (seen by the ipa-pta-14.c XFAIL).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-12-10  Richard Biener  

PR middle-end/38474
* tree-ssa-structalias.c (solution_set_expand): Expand into
a different possibly cached bitmap and return the result.
(set_union_with_increment): Pass in a shared expanded bitmap
and adjust.
(do_sd_constraint): Likewise.
(do_ds_constraint): Likewise.
(do_complex_constraint): Likewise.
(solve_graph): Manage the shared expanded bitmap.

* gcc.dg/ipa/ipa-pta-14.c: Un-XFAIL.

Index: gcc/tree-ssa-structalias.c
===
*** gcc/tree-ssa-structalias.c  (revision 205808)
--- gcc/tree-ssa-structalias.c  (working copy)
*** constraint_set_union (vec
*** 917,928 
  
  /* Expands the solution in SET to all sub-fields of variables included.  */
  
! static void
! solution_set_expand (bitmap set)
  {
bitmap_iterator bi;
unsigned j;
  
/* In a first pass expand to the head of the variables we need to
   add all sub-fields off.  This avoids quadratic behavior.  */
EXECUTE_IF_SET_IN_BITMAP (set, 0, j, bi)
--- 917,933 
  
  /* Expands the solution in SET to all sub-fields of variables included.  */
  
! static bitmap
! solution_set_expand (bitmap set, bitmap *expanded)
  {
bitmap_iterator bi;
unsigned j;
  
+   if (*expanded)
+ return *expanded;
+ 
+   *expanded = BITMAP_ALLOC (&iteration_obstack);
+ 
/* In a first pass expand to the head of the variables we need to
   add all sub-fields off.  This avoids quadratic behavior.  */
EXECUTE_IF_SET_IN_BITMAP (set, 0, j, bi)
*** solution_set_expand (bitmap set)
*** 931,981 
if (v->is_artificial_var
  || v->is_full_var)
continue;
!   bitmap_set_bit (set, v->head);
  }
  
/* In the second pass now expand all head variables with subfields.  */
!   EXECUTE_IF_SET_IN_BITMAP (set, 0, j, bi)
  {
varinfo_t v = get_varinfo (j);
!   if (v->is_artificial_var
! || v->is_full_var
! || v->head != j)
continue;
for (v = vi_next (v); v != NULL; v = vi_next (v))
!   bitmap_set_bit (set, v->id);
  }
  }
  
! /* Union solution sets TO and FROM, and add INC to each member of FROM in the
 process.  */
  
  static bool
! set_union_with_increment  (bitmap to, bitmap from, HOST_WIDE_INT inc)
  {
bool changed = false;
bitmap_iterator bi;
unsigned int i;
  
!   /* If the solution of FROM contains anything it is good enough to transfer
   this to TO.  */
!   if (bitmap_bit_p (from, anything_id))
  return bitmap_set_bit (to, anything_id);
  
/* If the offset is unknown we have to expand the solution to
   all subfields.  */
if (inc == UNKNOWN_OFFSET)
  {
!   bitmap tmp = BITMAP_ALLOC (&iteration_obstack);
!   bitmap_copy (tmp, from);
!   solution_set_expand (tmp);
!   changed |= bitmap_ior_into (to, tmp);
!   BITMAP_FREE (tmp);
return changed;
  }
  
/* For non-zero offset union the offsetted solution into the destination.  
*/
!   EXECUTE_IF_SET_IN_BITMAP (from, 0, i, bi)
  {
varinfo_t vi = get_varinfo (i);
  
--- 936,987 
if (v->is_artificial_var
  || v->is_full_var)
continue;
!   bitmap_set_bit (*expanded, v->head);
  }
  
/* In the second pass now expand all head variables with subfields.  */
!   EXECUTE_IF_SET_IN_BITMAP (*expanded, 0, j, bi)
  {
varinfo_t v = get_varinfo (j);
!   if (v->head != j)
continue;
for (v = vi_next (v); v != NULL; v = vi_next (v))
!   bitmap_set_bit (*expanded, v->id);
  }
+ 
+   /* And finally set the rest of the bits from SET.  */
+   bitmap_ior_into (*expanded, set);
+ 
+   return *expanded;
  }
  
! /* Union solution sets TO and DELTA, and add INC to each member of DELTA in 
the
 process.  */
  
  static bool
! set_union_with_increment  (bitmap to, bitmap delta, HOST_WIDE_INT inc,
!  bitmap *expanded_delta)
  {
bool changed = false;
bitmap_iterator bi;
unsigned int i;
  
!   /* If the solution of DELTA contains anything it is good enough to transfer
   this to TO.  */
!   if (bitmap_bit_p (delta, anything_id))
  return bitmap_set_bit (to, anything_id);
  
/* If the offset is unknown we have to expand the solution to
   all subfields.  */
if (inc == UNKNOWN_OFFSET)
  {
!   delta = solution_set_expand (delta, expanded_delta);
!   changed |= bitmap_ior_into (to, delta);
return changed;
  }
  
/* For non-zero offset union the offsetted solution

Re: Fix PR rtl-optimization/58295

2013-12-10 Thread Kyrill Tkachov


On 09/12/13 23:18, Eric Botcazou wrote:

It's the pessimization introduced on the ARM (and other RISC targets) by the
distribution of truncations in simplify_truncation.  Further information at:
   http://gcc.gnu.org/ml/gcc/2013-12/msg00019.html

The change started as a simple address reassociation trick for x32 and evolved
into a general distribution of truncations for common operations.  But the
outcome doesn't really play nice with other transformations that help on RISC
targets so I think that we should restrict it.  I'm not sure whether it really
helps on CISC targets either, but it's at least neutral and it's too late to
backpedal anyway since further changes were installed in the x86 back-end.

Hence the proposed minimal patch based on WORD_REGISTER_OPERATIONS.  Not clear
whether it's the best long term solution, but I think that we want to fix the
regression on the 4.8 branch as well.

Tested on x86-64/Linux, PowerPC/Linux and IA-64/Linux.  As reported by Kyrill,
this fixes gcc.target/arm/unsigned-extend-1.c on the ARM.

Comments?


Hi Eric,

The patch does indeed fix unsigned-extend-1.c on arm and bootstraps succesfully 
on arm-none-linux-gnueabihf.


Thanks for fixing this, that test failure has been spoiling my test runs for 
quite a while now :)


Kyrill




2013-12-09  Eric Botcazou  

PR rtl-optimization/58295
* simplify-rtx.c (simplify_truncation): Restrict the distribution for
WORD_REGISTER_OPERATIONS targets.

Re: [patch i386]: Fix PR 56807

2013-12-10 Thread Jan Hubicka

> On Tue, Dec 10, 2013 at 12:48:20PM +0100, Jan Hubicka wrote:
> > > Hi,
> > > 
> > > 
> > > ChangeLog
> > > 
> > > 2013-12-06  Kai Tietz  
> > > 
> > > PR target/56807
> > > * config/i386/i386.c (ix86_expand_prologue):
> > > 
> > > Tested for i686-w64-mingw32, x86_64-unknown-linux-gnu.  Ok for apply?
> > 
> > So the code in question does spilling relative to stack pointer before 
> > function
> > call and relative to base pointer after the function call and this leads to 
> > mismatch
> > when stack alignment is in the way?
> > 
> > The chagne seems OK if that is the case.
> 
> The ChangeLog entry is not ok though.

Kai sent updated one in reply to Jeff's email I beleive.

honza
> 
>   Jakub

Re: [patch i386]: Fix PR 56807

2013-12-10 Thread Jakub Jelinek

On Tue, Dec 10, 2013 at 12:48:20PM +0100, Jan Hubicka wrote:
> > Hi,
> > 
> > 
> > ChangeLog
> > 
> > 2013-12-06  Kai Tietz  
> > 
> > PR target/56807
> > * config/i386/i386.c (ix86_expand_prologue):
> > 
> > Tested for i686-w64-mingw32, x86_64-unknown-linux-gnu.  Ok for apply?
> 
> So the code in question does spilling relative to stack pointer before 
> function
> call and relative to base pointer after the function call and this leads to 
> mismatch
> when stack alignment is in the way?
> 
> The chagne seems OK if that is the case.

The ChangeLog entry is not ok though.

Jakub

Re: [patch i386]: Fix PR 56807

2013-12-10 Thread Jan Hubicka

> Hi,
> 
> 
> ChangeLog
> 
> 2013-12-06  Kai Tietz  
> 
> PR target/56807
> * config/i386/i386.c (ix86_expand_prologue):
> 
> Tested for i686-w64-mingw32, x86_64-unknown-linux-gnu.  Ok for apply?

So the code in question does spilling relative to stack pointer before function
call and relative to base pointer after the function call and this leads to 
mismatch
when stack alignment is in the way?

The chagne seems OK if that is the case.

Honza
> 
> Regards,
> Kai
> 
> Index: config/i386/i386.c
> ===
> --- config/i386/i386.c(Revision 205719)
> +++ config/i386/i386.c(Arbeitskopie)
> @@ -10934,18 +10937,21 @@ ix86_expand_prologue (void)
>  }
>m->fs.sp_offset += allocate;
> 
> +  /* Use stack_pointer_rtx for relative addressing so that code
> + works for realigned stack, too.  */
>if (r10_live && eax_live)
>  {
> -  t = choose_baseaddr (m->fs.sp_offset - allocate);
> +  t = plus_constant (Pmode, stack_pointer_rtx, allocate);
>emit_move_insn (gen_rtx_REG (word_mode, R10_REG),
>gen_frame_mem (word_mode, t));
> -  t = choose_baseaddr (m->fs.sp_offset - allocate - UNITS_PER_WORD);
> +  t = plus_constant (Pmode, stack_pointer_rtx,
> + allocate - UNITS_PER_WORD);
>emit_move_insn (gen_rtx_REG (word_mode, AX_REG),
>gen_frame_mem (word_mode, t));
>  }
>else if (eax_live || r10_live)
>  {
> -  t = choose_baseaddr (m->fs.sp_offset - allocate);
> +  t = plus_constant (Pmode, stack_pointer_rtx, allocate);
>emit_move_insn (gen_rtx_REG (word_mode,
> (eax_live ? AX_REG : R10_REG)),
>gen_frame_mem (word_mode, t));

[PATCH] libsanitizer demangling using cp-demangle.c

2013-12-10 Thread Jakub Jelinek

On Fri, Dec 06, 2013 at 06:40:52AM -0800, Ian Lance Taylor wrote:
> There was a recent buggy patch to the demangler that added calls to
> malloc and realloc (2013-10-25 Gary Benson ).
> That patch must be fixed or reverted before the 4.9 release.  The main
> code in the demangler must not call malloc/realloc.
> 
> When that patch is fixed, you can use the cplus_demangle_v3_callback
> function to get a demangler that never calls malloc.

AFAIK Gary is working on a fix, when that is fixed, with the following
patch libsanitizer (when using libbacktrace for symbolization) will not
use system malloc/realloc/free for the demangling at all.

Tested on x86_64-linux (-m64/-m32).  Note that the changes for the 3 files
unfortunately will need to be applied upstream to compiler-rt, is that
possible?

2013-12-10  Jakub Jelinek  

* sanitizer_common/sanitizer_symbolizer_libbacktrace.h
(LibbacktraceSymbolizer::Demangle): New declaration.
* sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc
(POSIXSymbolizer::Demangle): Use libbacktrace_symbolizer_'s Demangle
method if possible.
* sanitizer_common/sanitizer_symbolizer_libbacktrace.cc: Include
"demangle.h" if SANITIZE_CP_DEMANGLE is defined.
(struct CplusV3DemangleData): New type.
(CplusV3DemangleCallback, CplusV3Demangle): New functions.
(SymbolizeCodePCInfoCallback, SymbolizeCodeCallback,
SymbolizeDataCallback): Use CplusV3Demangle.
* sanitizer_common/Makefile.am (AM_CXXFLAGS): Add
-DSANITIZE_CP_DEMANGLE and -I $(top_srcdir)/../include.
* libbacktrace/backtrace-rename.h (cplus_demangle_builtin_types,
cplus_demangle_fill_ctor, cplus_demangle_fill_dtor,
cplus_demangle_fill_extended_operator, cplus_demangle_fill_name,
cplus_demangle_init_info, cplus_demangle_mangled_name,
cplus_demangle_operators, cplus_demangle_print,
cplus_demangle_print_callback, cplus_demangle_type, cplus_demangle_v3,
cplus_demangle_v3_callback, is_gnu_v3_mangled_ctor,
is_gnu_v3_mangled_dtor, java_demangle_v3, java_demangle_v3_callback):
Define.
(__asan_internal_memcmp, __asan_internal_strncmp): New prototypes.
(memcmp, strncmp): Redefine.
* libbacktrace/Makefile.am (libsanitizer_libbacktrace_la_SOURCES): Add
../../libiberty/cp-demangle.c.
* libbacktrace/bridge.cc (__asan_internal_memcmp,
__asan_internal_strncmp): New functions.
* sanitizer_common/Makefile.in: Regenerated.
* libbacktrace/Makefile.in: Regenerated.
* configure: Regenerated.
* configure.ac: Regenerated.
* config.h.in: Regenerated.

--- libsanitizer/sanitizer_common/sanitizer_symbolizer_libbacktrace.h.jj
2013-12-05 12:04:28.0 +0100
+++ libsanitizer/sanitizer_common/sanitizer_symbolizer_libbacktrace.h   
2013-12-10 11:01:26.777371566 +0100
@@ -29,6 +29,8 @@ class LibbacktraceSymbolizer {
 
   bool SymbolizeData(DataInfo *info);
 
+  const char *Demangle(const char *name);
+
  private:
   explicit LibbacktraceSymbolizer(void *state) : state_(state) {}
 
--- libsanitizer/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc.jj  
2013-12-05 12:04:28.0 +0100
+++ libsanitizer/sanitizer_common/sanitizer_symbolizer_posix_libcdep.cc 
2013-12-10 11:03:02.971876505 +0100
@@ -513,6 +513,11 @@ class POSIXSymbolizer : public Symbolize
 SymbolizerScope sym_scope(this);
 if (internal_symbolizer_ != 0)
   return internal_symbolizer_->Demangle(name);
+if (libbacktrace_symbolizer_ != 0) {
+  const char *demangled = libbacktrace_symbolizer_->Demangle(name);
+  if (demangled)
+   return demangled;
+}
 return DemangleCXXABI(name);
   }
 
--- libsanitizer/sanitizer_common/sanitizer_symbolizer_libbacktrace.cc.jj   
2013-12-09 14:32:06.0 +0100
+++ libsanitizer/sanitizer_common/sanitizer_symbolizer_libbacktrace.cc  
2013-12-10 11:48:19.803830291 +0100
@@ -20,6 +20,10 @@
 # include "backtrace-supported.h"
 # if SANITIZER_POSIX && BACKTRACE_SUPPORTED && !BACKTRACE_USES_MALLOC
 #  include "backtrace.h"
+#  if SANITIZER_CP_DEMANGLE
+#   undef ARRAY_SIZE
+#   include "demangle.h"
+#  endif
 # else
 #  define SANITIZER_LIBBACKTRACE 0
 # endif
@@ -31,6 +35,60 @@ namespace __sanitizer {
 
 namespace {
 
+#if SANITIZER_CP_DEMANGLE
+struct CplusV3DemangleData {
+  char *buf;
+  uptr size, allocated;
+};
+
+extern "C" {
+static void CplusV3DemangleCallback(const char *s, size_t l, void *vdata) {
+  CplusV3DemangleData *data = (CplusV3DemangleData *)vdata;
+  uptr needed = data->size + l + 1;
+  if (needed > data->allocated) {
+data->allocated *= 2;
+if (needed > data->allocated)
+  data->allocated = needed;
+char *buf = (char *)InternalAlloc(data->allocated);
+if (data->buf) {
+  internal_memcpy(buf, data->buf, data->size);
+  InternalFree(data->buf);
+}
+data->buf = buf;
+  }
+  internal_memcpy(data->bu

[PATCH] Allow building if libsanitizer on RHEL5 (i.e. with 2.6.18-ish kernel headers, take 2)

2013-12-10 Thread Jakub Jelinek

Hi!

Here is an updated version which doesn't warn about #include_next.
Ok for trunk?

2013-12-10  Jakub Jelinek  

* sanitizer_common/Makefile.am (AM_CPPFLAGS): Add
-isystem $(top_srcdir)/include/system.
* sanitizer_common/Makefile.in: Regenerated.
* include/system/linux/aio_abi.h: New header.
* include/system/linux/mroute.h: New header.
* include/system/linux/mroute6.h: New header.
* include/system/linux/perf_event.h: New header.
* include/system/linux/types.h: New header.

--- libsanitizer/sanitizer_common/Makefile.am.jj2013-12-10 
09:55:19.0 +0100
+++ libsanitizer/sanitizer_common/Makefile.am   2013-12-10 09:56:36.748472017 
+0100
@@ -1,5 +1,5 @@
-AM_CPPFLAGS = -I $(top_srcdir)/include
- 
+AM_CPPFLAGS = -I $(top_srcdir)/include -isystem $(top_srcdir)/include/system
+
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 
--- libsanitizer/include/system/linux/mroute.h.jj   2013-12-10 
09:56:03.888640325 +0100
+++ libsanitizer/include/system/linux/mroute.h  2013-12-10 09:56:03.888640325 
+0100
@@ -0,0 +1,8 @@
+#include 
+/*  before 2.6.26 included 
+   which clashes with userspace headers.  */
+#if LINUX_VERSION_CODE < 132634
+#define _LINUX_IN_H
+#include 
+#endif
+#include_next 
--- libsanitizer/include/system/linux/aio_abi.h.jj  2013-12-10 
09:56:03.887640312 +0100
+++ libsanitizer/include/system/linux/aio_abi.h 2013-12-10 09:56:03.887640312 
+0100
@@ -0,0 +1,7 @@
+#include 
+#include_next 
+/* IOCB_CMD_PREADV/PWRITEV has been added in 2.6.19 */
+#if LINUX_VERSION_CODE < 132627
+#define IOCB_CMD_PREADV 7
+#define IOCB_CMD_PWRITEV 8
+#endif
--- libsanitizer/include/system/linux/types.h.jj2013-12-10 
09:56:03.888640325 +0100
+++ libsanitizer/include/system/linux/types.h   2013-12-10 09:56:03.888640325 
+0100
@@ -0,0 +1,12 @@
+#ifndef LINUX_TYPES_WRAPPER_H
+#define LINUX_TYPES_WRAPPER_H
+
+/* Before
+   
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/include/linux/types.h?id=6c7c6afbb8c0e60d32a563cae7c6889211e9d9d8
+   linux/types.h conflicted with sys/ustat.h.  Work around it.  */
+
+#define ustat __asan_bad_ustat
+#include_next 
+#undef ustat
+
+#endif
--- libsanitizer/include/system/linux/perf_event.h.jj   2013-12-10 
09:56:03.888640325 +0100
+++ libsanitizer/include/system/linux/perf_event.h  2013-12-10 
09:56:03.888640325 +0100
@@ -0,0 +1,7 @@
+#include 
+/*  has been added in 2.6.32 */
+#if LINUX_VERSION_CODE >= 132640
+#include_next 
+#else
+#define perf_event_attr __sanitizer_perf_event_attr
+#endif
--- libsanitizer/include/system/linux/mroute6.h.jj  2013-12-10 
09:56:03.888640325 +0100
+++ libsanitizer/include/system/linux/mroute6.h 2013-12-10 09:56:03.888640325 
+0100
@@ -0,0 +1,5 @@
+#include 
+/*  has been added in 2.6.26 */
+#if LINUX_VERSION_CODE >= 132634
+#include_next 
+#endif
--- libsanitizer/sanitizer_common/Makefile.in.jj2013-12-10 
09:55:19.0 +0100
+++ libsanitizer/sanitizer_common/Makefile.in   2013-12-10 09:57:05.079324403 
+0100
@@ -235,7 +235,7 @@ toolexeclibdir = @toolexeclibdir@
 top_build_prefix = @top_build_prefix@
 top_builddir = @top_builddir@
 top_srcdir = @top_srcdir@
-AM_CPPFLAGS = -I $(top_srcdir)/include
+AM_CPPFLAGS = -I $(top_srcdir)/include -isystem $(top_srcdir)/include/system
 
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)

Jakub

Re: AARCH64 configure check for gas -mabi support

2013-12-10 Thread Yufeng Zhang


Hi Kugan,

The latest patch looks good to me; I only have a couple of minor 
comments inlined below.  Please ask Marcus to review and approve it. 
Thanks again for fixing this issue!


On 12/10/13 06:21, Kugan wrote:
[snip]


Updated it and tested with

1. binutils 2.23.2
   a. bootstrapped with defaults and tested gcc for -mabi=lp64
(compiles) and -mabi=ilp32 gives error
   b. Trying to boottsrap with --with-multilibs-list=lp64,ilp32 fails
with error msg
   c. Trying to bootstrap with --with-multilibs-list=ilp32 fails with
error msg
   d. Bootstrap with --with-multilibs-list=lp64 works.

2. binutils 2.24.51
a. bootstrapped with defaults and tested gcc for -mabi=lp64
(compiles) and -mabi=ilp32 (compiles)
   b. Bootstrap with --with-multilibs-list=lp64,ilp32 works and tested
gcc for -mabi=lp64
compiles and -mabi=ilp32  compiles(* gives linker error in my setup -
aarch64:ilp32 architecture of input file `/tmp/ccIFqSxU.o' is
incompatible with aarch64 output; I believe this is not related to what
I am testing)
   c. Bootstrap with default works


Thanks for the comprehensive testing.  The linker error you see is 
because that the ILP32 support for aarch64*-*-linux* has not been added 
(Andrew Pinski has sent the patch series to enable the support here 
http://gcc.gnu.org/ml/gcc-patches/2013-12/msg00282.html)


I also test the patch by building aarch64-none-elf cross compilers with 
binutils 2.23.2 and mainline, with default --with-multilibs-list.  It 
works well.


[snip]


diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b1b4eef..a53febc 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5187,6 +5187,13 @@ aarch64_override_options (void)
aarch64_parse_tune ();
  }

+/* Issue error if assembler does not support -mabi and option ilp32
+  is selected.  */


I'd prefer the comment to be "The compiler may have been configured with 
2.23.* binutils, which does not have support for ILP32."



+#ifndef HAVE_AS_MABI_OPTION
+  if (TARGET_ILP32)
+error ("Assembler does not supprt -mabi=ilp32");
+#endif


supprt/support


+
initialize_aarch64_code_model ();

aarch64_build_bitmask_table ();
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 91a22d5..a951b82 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -3495,6 +3495,35 @@ AC_DEFINE_UNQUOTED(HAVE_LTO_PLUGIN, $gcc_cv_lto_plugin,
  AC_MSG_RESULT($gcc_cv_lto_plugin)

  case "$target" in
+
+  aarch64*-*-*)
+gcc_GAS_CHECK_FEATURE([-mabi option],
+  gcc_cv_as_aarch64_mabi,,
+  [-mabi=lp64], [.text],,,)
+if test x$gcc_cv_as_aarch64_mabi = xyes; then
+   AC_DEFINE(HAVE_AS_MABI_OPTION, 1,
+ [Define if your assembler supports the -mabi option.])
+else
+   if test x$with_abi = xilp32; then
+   AC_MSG_ERROR([Assembler does not support -mabi=ilp32.  Upgrade the 
Assembler.])
+   fi
+if test x"$with_multilib_list" = xdefault; then
+   TM_MULTILIB_CONFIG=lp64
+else
+   aarch64_multilibs=`echo $with_multilib_list | sed -e 's/,/ /g'`
+   for aarch64_multilib in ${aarch64_multilibs}; do
+   case ${aarch64_multilib} in
+   ilp32)
+   AC_MSG_ERROR([Assembler does not support -mabi=ilp32.  
Upgrade the Assembler.])
+   ;;
+   *)
+   ;;
+   esac
+   done
+   fi
+fi
+;;
+


I'm not very sure about the indent rules for configury files, but in 
other areas of configure.ac, it seems using a similar indent convention 
as in .c files.



Thanks,
Yufeng



# All TARGET_ABI_OSF targets.
alpha*-*-linux* | alpha*-*-*bsd*)
  gcc_GAS_CHECK_FEATURE([explicit relocation support],
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index a8f9f8a..00c4f0d 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -3735,6 +3735,15 @@ removed and the system libunwind library will always be 
used.

  @html
  
+@end html
+@anchor{aarch64-x-x}
+@heading aarch64*-*-*
+Pre 2.24 binutils does not have support for selecting -mabi and does not
+support ILP32.  If GCC 4.9 or later is built with pre 2.24, GCC will not
+support option -mabi=ilp32.
+
+@html
+
  
  @end html
  @anchor{x-ibm-aix}

Re: [PATCH/AARCH64 1/6] Fix size and pointer different types for ILP32.

2013-12-10 Thread Marcus Shawcroft

On 3 December 2013 21:24, Andrew Pinski  wrote:
>
> While compiling some programs, GCC and glibc (and newlib)'s definitions of 
> size_t
> were not agreeing and causing format warnings to happen.  The simple testcase 
> for this is:
> #include 
> #include 
>
> int main(void)
> {
>   ssize_t t = 0x1;
>   printf("%zd\n", t);
>   return 0;
> }

Hi Andrew,

The PCS IHI0056C defines SIZE_TYPE as 'unsigned long' for both ILP32
and LP64 and PTRDIFF_TYPE as "signed long" for both ILP32 and LP64.
This seems like a sane choice to me.

Trying to recreate the failure with the test fragment above doesn't
given a warning:

$ aarch64-none-elf-gcc -Wall -mabi=ilp32 test-size_t.c -specs=rdimon.specs
$ aarch64-none-linux-gnu-gcc -Wall -mabi=ilp32 test-size_t.c -c
$

..what am I doing differently to your test run?

Cheers
/Marcus

[PATCH PR41488]Recognize more induction variables by simplifying PEELED chrec in scalar evolution

2013-12-10 Thread Dominique Dhumieres

Revision 205848 breaks bootstrap on x86_64-apple-darwin13: pr59445.

TIA

Dominique

Re: [patch] Fix gnat.dg/pack19.adb on some platforms

2013-12-10 Thread Eric Botcazou

> :-)  From a cleanup standpoint, I think the expansion code is ripe for
> someone to spend (condsiderable) time killing dead code.  I suspect
> there's still significant gcc-1.91 (or even older) bits in there that
> have been dead for at least a decade.

The existing test was added for Ada a decade ago:
  http://gcc.gnu.org/ml/gcc-patches/2003-10/msg00823.html
but I think that it's obsolete now (at least replacing it with an assert 
doesn't change anything) so I've installed the attached patch.

-- 
Eric BotcazouIndex: expr.c
===
--- expr.c	(revision 205848)
+++ expr.c	(working copy)
@@ -10157,6 +10157,8 @@ expand_expr_real_1 (tree exp, rtx target
 		if (target == 0)
 		  target = assign_temp (type, 1, 1);
 
+		/* ??? Unlike the similar test a few lines below, this one is
+		   very likely obsolete.  */
 		if (bitsize == 0)
 		  return target;
 
@@ -10177,6 +10179,13 @@ expand_expr_real_1 (tree exp, rtx target
 		return target;
 	  }
 
+	/* If we have nothing to extract, the result will be 0 for targets
+	   with SHIFT_COUNT_TRUNCATED == 0 and garbage otherwise.  Always
+	   return 0 for the sake of consistency, as reading a zero-sized
+	   bitfield is valid in Ada and the value is fully specified.  */
+	if (bitsize == 0)
+	  return const0_rtx;
+
 	op0 = validize_mem (op0);
 
 	if (MEM_P (op0) && REG_P (XEXP (op0, 0)))

Re: [REPOST] Invalid Code when reading from unaligned zero-sized array

2013-12-10 Thread Eric Botcazou

> What we support is out of bounds accesses for heap vars if the var's type
> has flexible array member or something we treat similarly and there is the
> possibility that there could be payload after the heap var that could be 
> accessed from the flexible array members or similar arrays.

My question was about the above similar arrays, i.e. whether we consider all 
trailing arrays in structures as flexible-like or not.  No strong opinion.

> So, I don't see what is the big deal with BLKmode, because all the cases
> which actually could have flexible array member extra payloads (or similar)
> must necessarily live in memory, if it is the compiler that decides whether
> to put it into memory or keep in registers etc., then it can't be heap
> allocated.

The invariant is that types for which objects can effectively have variable 
size must have BLKmode, otherwise you need to add very ugly code in the RTL 
expander to mask the lie.

-- 
Eric Botcazou

Re: Fix tsan tests.

2013-12-10 Thread Yury Gribov


Done, r205853.

Re: [Patch, AArch64] [5/6] Implement support for Crypto -- SHA256.

2013-12-10 Thread Marcus Shawcroft

On 6 December 2013 17:36, Tejas Belagod  wrote:
>
> Hi,
>
> The attached patch implements support for crypto sha256.

Same comments as previous crypto patch.

/Marcus

Re: [Patch, AArch64] [4/6] Implement support for Crypto -- SHA1.

2013-12-10 Thread Marcus Shawcroft

Same comments as previous patch:

On 6 December 2013 17:36, Tejas Belagod  wrote:

> testsuite/
> * gcc.target/aarch64/sha1.c: New.

Add _1 on the test case file name (see http://gcc.gnu.org/wiki/TestCaseWriting)

> +static __inline uint32x4_t
> +vsha1cq_u32 (uint32x4_t hash_abcd, uint32_t hash_e, uint32x4_t wk)
> +{
> +  return
> +(uint32x4_t) __builtin_aarch64_crypto_sha1cv4si ((int32x4_t) hash_abcd,
> +(int32_t) hash_e,
> +(int32x4_t) wk);
> +}

James G fixed the infrastructure to allow properly typed builtins, see:

http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02005.html
and
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02880.html

> +/* { dg-final { scan-assembler "sha1c\\tq" } } */

Use scan-assembler-times 1

Cheers
/Marcus

Re: [REPOST] Invalid Code when reading from unaligned zero-sized array

2013-12-10 Thread Jakub Jelinek

On Tue, Dec 10, 2013 at 11:04:53AM +0100, Eric Botcazou wrote:
> > What are the transformations that are enabled by making something not
> > BLKmode?
> > 
> > On the gimple level I cannot think of one..
> 
> On the RTL level, it's simple: anything BLKmode is forced to memory instead 
> of 
> being loaded into registers.
> 
> > >This could work as well, although I'd restrict this to arrays,
> > >recursively.
> >
> > Works for me.
> 
> Are we sure that we really support out-of-bounds accesses for arbitrary 
> arrays 
> though?  It seems to me that we easily take advantage of them e.g. in loops 
> to 
> invoke undefined behavior.  IMO it's not clear whether we want to risk more 
> accidental ABI changes if they are not supported throughout the compiler.

I think we don't support out-of-bounds accesses for global vars (ok, there
are vars with real flexible array members as GNU extension initialized by
initializers that initialize the flexible array members, but then the decl
size etc. should be adjusted for that), similarly for automatic vars (and in
both cases that includes the hard register vars).  What we support is
out of bounds accesses for heap vars if the var's type has flexible array member
or something we treat similarly and there is the possibility that there
could be payload after the heap var that could be accessed from the flexible
array members or similar arrays.  And of course if you have just pointer to
some var and can't see what it points to, we need to treat it as if it could
be heap var.

So, I don't see what is the big deal with BLKmode, because all the cases
which actually could have flexible array member extra payloads (or similar)
must necessarily live in memory, if it is the compiler that decides whether
to put it into memory or keep in registers etc., then it can't be heap
allocated.

Jakub

1 2 >

1 - 100 of 123 matches

Mail list logo