Re: Missing gen_sse2_cvtdq2p in convert splitter?

2008-10-07 Thread Uros Bizjak
On Wed, Oct 8, 2008 at 8:29 AM, H.J. Lu <[EMAIL PROTECTED]> wrote:

> I386.md has
>
> (define_split
>   [(set (match_operand:MODEF 0 "register_operand" "")
>         (float:MODEF (match_operand:SI 1 "register_operand" "")))]
>   "TARGET_SSE2 && TARGET_SSE_MATH
>    && TARGET_USE_VECTOR_CONVERTS && optimize_function_for_speed_p (cfun)
>    && reload_completed
>    && (SSE_REG_P (operands[0])
>        || (GET_CODE (operands[0]) == SUBREG
>            && SSE_REG_P (operands[0])))"
>   [(const_int 0)]
> {
>   rtx op1 = operands[1];
>
>   operands[3] = simplify_gen_subreg (mode, operands[0],
>                                      mode, 0);
>   if (GET_CODE (op1) == SUBREG)
>     op1 = SUBREG_REG (op1);
>
>   if (GENERAL_REG_P (op1) && TARGET_INTER_UNIT_MOVES)
>     {
>       operands[4] = simplify_gen_subreg (V4SImode, operands[0], mode, 0);
>       emit_insn (gen_sse2_loadld (operands[4],
>                                   CONST0_RTX (V4SImode), operands[1]));
>     }
>   /* We can ignore possible trapping value in the
>      high part of SSE register for non-trapping math. */
>   else if (SSE_REG_P (op1) && !flag_trapping_math)
>     operands[4] = simplify_gen_subreg (V4SImode, operands[1], SImode, 0);
>   else
>     gcc_unreachable ();
> })
>
> Aren't
>
>   emit_insn
>     (gen_sse2_cvtdq2p (operands[3], operands[4]));
>   DONE;
>
> missing at the end?

Uh, yes.

The patch is pre-approved as obvious.

Thanks,
Uros.


Missing gen_sse2_cvtdq2p in convert splitter?

2008-10-07 Thread H.J. Lu
Hi,

I386.md has

(define_split
  [(set (match_operand:MODEF 0 "register_operand" "")
        (float:MODEF (match_operand:SI 1 "register_operand" "")))]
  "TARGET_SSE2 && TARGET_SSE_MATH
   && TARGET_USE_VECTOR_CONVERTS && optimize_function_for_speed_p (cfun)
   && reload_completed
   && (SSE_REG_P (operands[0])
       || (GET_CODE (operands[0]) == SUBREG
           && SSE_REG_P (operands[0])))"
  [(const_int 0)]
{
  rtx op1 = operands[1];

  operands[3] = simplify_gen_subreg (mode, operands[0],
                                     mode, 0);
  if (GET_CODE (op1) == SUBREG)
    op1 = SUBREG_REG (op1);

  if (GENERAL_REG_P (op1) && TARGET_INTER_UNIT_MOVES)
    {
      operands[4] = simplify_gen_subreg (V4SImode, operands[0], mode, 0);
      emit_insn (gen_sse2_loadld (operands[4],
                                  CONST0_RTX (V4SImode), operands[1]));
    }
  /* We can ignore possible trapping value in the
     high part of SSE register for non-trapping math. */
  else if (SSE_REG_P (op1) && !flag_trapping_math)
    operands[4] = simplify_gen_subreg (V4SImode, operands[1], SImode, 0);
  else
    gcc_unreachable ();
})

Aren't

  emit_insn
    (gen_sse2_cvtdq2p (operands[3], operands[4]));
  DONE;

missing at the end?

-- 
H.J.


register class constraints question

2008-10-07 Thread DJ Delorie

I've got this code:

(define_insn "andhi3_24"
  [(set (match_operand:HI 0 "mra_operand" "=Sd,Sd,*Rhl,*Rhl,RhiSd,??Rmm,RhiSd,??Rmm")
        (and:HI (match_operand:HI 1 "mra_operand" "%0,0,*0,*0,0,0,0,0")
                (match_operand:HI 2 "mrai_operand" "Imb,Imw,*Imb,*Imw,iRhiSd,?Rmm,?Rmm,iRhiSd")))]
  "TARGET_A24"
  "@
   bclr\t%B2,%0
   bclr\t%B2-8,1+%0
   bclr\t%B2,%h0
   bclr\t%B2-8,%H0
   and.w\t%X2,%0
   and.w\t%X2,%0
   and.w\t%X2,%0
   and.w\t%X2,%0"
  [(set_attr "flags" "n,n,n,n,sz,sz,sz,sz")]
  )

Originally, the '*' constraints were missing.  It failed:

/greed/dj/ges/gnupro/head/gnupro/gcc/testsuite/gcc.c-torture/execute/pr17133.c: In function 'pure_alloc':
/greed/dj/ges/gnupro/head/gnupro/gcc/testsuite/gcc.c-torture/execute/pr17133.c:19: error: unable to find a register to spill in class 'HL_REGS'
/greed/dj/ges/gnupro/head/gnupro/gcc/testsuite/gcc.c-torture/execute/pr17133.c:19: error: this is the insn:
(insn 31 30 32 6 /greed/dj/ges/gnupro/head/gnupro/gcc/testsuite/gcc.c-torture/execute/pr17133.c:13 (set (reg:HI 0 r0 [41])
        (and:HI (subreg:HI (reg/f:PSI 5 a1 [orig:29 bar.0 ] [29]) 0)
            (const_int -2 [0xfffe]))) 26 {andhi3_24} (expr_list:REG_DEAD (reg/f:PSI 5 a1 [orig:29 bar.0 ] [29])
        (nil)))
/greed/dj/ges/gnupro/head/gnupro/gcc/testsuite/gcc.c-torture/execute/pr17133.c:19: internal compiler error: in spill_failure, at reload1.c:2093

I added the '*' constraints to keep it from using HL_REGS class (HL
includes R0 and R1, HI includes R0 through R3) but it seems to be
ignoring them.

If I remove those alternatives completely, the code compiles properly.

How can I get register allocation to use HI_REGS as the allocation
class?

I still want the bclr opcodes to be used *if* the constraints hold.  I
just don't want them to limit register choices.

What am I missing?


Re: cpp found limits.h in FIXED_INCLUDE_DIR, but not in STANDARD_INCLUDE_DIR

2008-10-07 Thread Zhang Le
Hi, all,

I think I've found the reason.
It all comes from this gentoo patch:
http://sources.gentoo.org/viewcvs.py/gentoo-x86/sys-devel/gcc/files/4.1.0/gcc-4.1.0-cross-compile.patch?rev=1.1&view=markup
Specifically:
  -elif test "x$TARGET_SYSTEM_ROOT" != x; then
  +elif test "x$TARGET_SYSTEM_ROOT" != x -o $build != $host; then
   SYSTEM_HEADER_DIR=$build_system_header_dir
  fi
BTW, I haven't got time to learn more about that debate, but I will do it.

Since my build != host, SYSTEM_HEADER_DIR=$build_system_header_dir, which is
in turn CROSS_SYSTEM_HEADER_DIR.

So this test will fail
  LIMITS_H_TEST = [ -f $(SYSTEM_HEADER_DIR)/limits.h ]

And then the else branch is taken, so tmp-xlimits.h is built from glimits.h alone:
  if $(LIMITS_H_TEST) ; then \
    cat $(srcdir)/limitx.h $(srcdir)/glimits.h $(srcdir)/limity.h > tmp-xlimits.h; \
  else \
    cat $(srcdir)/glimits.h > tmp-xlimits.h; \
  fi; \

The solution is easy: just turn on the 'vanilla' USE flag in Gentoo.
Sorry for the noise. 

Zhang Le


Re: [PATCH]: bump minimum MPFR version, (includes some fortran bits)

2008-10-07 Thread Adrian Bunk
On Mon, Oct 06, 2008 at 04:10:04PM -0700, Kaveh R. Ghazi wrote:
> From: "Adrian Bunk" <[EMAIL PROTECTED]>
>
>> On Sat, Oct 04, 2008 at 09:33:48PM -0400, Kaveh R. GHAZI wrote:
>>> Since we're in stage3, I'm raising the issue of the MPFR version we
>>> require for GCC, just as in last year's stage3 for gcc-4.3:
>>> http://gcc.gnu.org/ml/gcc/2007-12/msg00298.html
>>>
>>> I'd like to increase the "minimum" MPFR version to 2.3.0, (which has been
>>> released since Aug 2007).  The "recommended" version of MPFR can be bumped
>>> to the latest which is 2.3.2.
>>> ...
>>
>> Considering that your patch removes the conditionals on MPFR versions
>> from the code (good!), is there any reason for gcc to keep this unusual
>> minimum/recommended split in the requirement?
>>
>> Either 2.3.0 is good enough, or 2.3.2 contains some critical fix
>> and should be the minimum version.
>
> The last time this came up, the consensus was that we should not hard 
> fail the configure script even if the user would then be missing some 
> mpfr bugfix in the latest/greatest release.  That's why we have the 
> minimum/recommended split.
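
The minimum/recommended split described above can be sketched as a three-way check. This is only an illustration of the policy under discussion, not GCC's actual configure logic: the `Version` struct, the `Check` enum and `check_mpfr` are hypothetical names, and the 2.3.0 and 2.3.2 values are the ones proposed in this thread.

```cpp
#include <tuple>

// Hypothetical (major, minor, patch) triple; field names are illustrative.
struct Version {
  int maj;
  int min;
  int patch;
};

// True when v is the same as or newer than floor (lexicographic compare).
bool at_least(const Version &v, const Version &floor) {
  return std::tie(v.maj, v.min, v.patch) >=
         std::tie(floor.maj, floor.min, floor.patch);
}

enum class Check { Reject, Warn, Accept };

// Assumed policy: hard-fail below the minimum, only warn between the
// minimum and the recommended version, accept silently otherwise.
Check check_mpfr(const Version &found) {
  const Version minimum{2, 3, 0};      // proposed minimum in this thread
  const Version recommended{2, 3, 2};  // proposed recommended version
  if (!at_least(found, minimum))
    return Check::Reject;
  if (!at_least(found, recommended))
    return Check::Warn;
  return Check::Accept;
}
```

Under these assumptions, 2.2.1 would be rejected, 2.3.0 would pass with a warning, and 2.3.2 would pass silently.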

I see the point for the 2.2.1/2.3.0 versions since 2.3.0 introduced
additional functionality gcc can use.

> But I see no reason not to encourage people and/or make them aware of the 
> need to upgrade if they are so inclined.  Whether a particular fix is  
> "critical" can be in the eye of the beholder.

But is there any "need to upgrade" to 2.3.2 since it would fix a bug gcc 
ran into?

IMHO it's not "in the eye of the beholder" whether 2.3.2 contains a
"critical" fix _for usage by gcc_.

>--Kaveh

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: Status of the DLX backend for GCC?

2008-10-07 Thread Peter Bergner
On Sat, 2008-10-04 at 18:48 +0200, Gerald Pfeifer wrote:
> Thanks for the background on this, Peter, and the background on this
> site disappearing.
> 
> The reason I asked was that we have that reference from our site to that
> URL and I failed to find any replacement so far.  The first two hits that
> I get in Google actually are mails by you in the gcc archives. ;-)
> 
> I guess we'll just have to remove that reference?

I talked with Aaron Sawdey and he still had the tarballs which he has
given me.  Let me go through a build process with them to make sure they
still build and then I'll post them somewhere you can link to.

Peter






Re: Help with IA64 profiling bug - g++.dg/tree-prof/indir-call-prof.C

2008-10-07 Thread Andreas Schwab
Steve Ellcey <[EMAIL PROTECTED]> writes:

> This is about as far as I have gotten.  I am not sure why there is this
> difference or how to fix it.  I *think* it may be related to the fact
> that IA64 GCC defines TARGET_VTABLE_USES_DESCRIPTORS but my only reason
> for thinking that is that IA64 is the only platform that defines this
> macro and I think that the profiler must be getting callee addresses out
> of the vtable (though I am not sure about that and I don't know where it
> would be doing it from).

I think to make that work tree_gen_ic_profiler and
tree_gen_ic_func_profiler would have to dereference the function
descriptor to extract the code address, which would then have the
necessary uniqueness which the vtable function descriptor lacks.
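
A toy model of the mismatch may help here. It is a sketch only: `FunctionDescriptor` and both helper functions are hypothetical names, and nothing below is taken from the gcov or tree-profile sources. It assumes the IA64 convention that a function pointer refers to a descriptor holding the code address and a global pointer value.

```cpp
#include <cstdint>

// Toy descriptor: on IA64 an indirect call goes through a pair like this
// rather than directly through a code address. Field names are illustrative.
struct FunctionDescriptor {
  std::uintptr_t code_address;    // entry point of the function
  std::uintptr_t global_pointer;  // gp value loaded for the callee
};

// Comparing the descriptor's own address with a code address: these live in
// different places, so the two values do not agree.
bool matches_without_deref(const FunctionDescriptor *callee,
                           std::uintptr_t cur_func) {
  return reinterpret_cast<std::uintptr_t>(callee) == cur_func;
}

// Dereferencing the descriptor first compares code address with code address,
// which is the same for every descriptor that refers to the same function.
bool matches_with_deref(const FunctionDescriptor *callee,
                        std::uintptr_t cur_func) {
  return callee->code_address == cur_func;
}
```

In this model, two distinct descriptors for one function (say, one embedded in a vtable and one elsewhere) compare equal only after the dereference.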

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: Help with IA64 profiling bug - g++.dg/tree-prof/indir-call-prof.C

2008-10-07 Thread Andreas Schwab
Steve Ellcey <[EMAIL PROTECTED]> writes:

> Comparing x86 (where things work) and IA64 (where they do not), I see
> the test case, when compiled with -fprofile-generate, has calls to
> __gcov_indirect_call_profiler in both cases.  But on IA64, cur_func is
> never equal to callee_func

That's because cur_func points to the function address, but callee_func
to the function descriptor.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: A question regarding recognition of nop

2008-10-07 Thread Ian Lance Taylor
Revital1 Eres <[EMAIL PROTECTED]> writes:

> Is there a general way to recognize a nop insn in RTL (using attributes?),
> or should I add a target hook for that?
>
> For example, I would like to recognize the following spu insn as a nop:
>
> (insn 555 210 203 11 (unspec_volatile [
> (const_int 0 [0x0])
> ] 14) 393 {lnop} (nil))

There is no general way to recognize a nop as such.  There are various
ways to recognize an insn which does nothing; such insns are normally
removed (there is still some code which checks and preserves
set_noop_p, but I have a feeling that code is now obsolete).

By gcc's standards your insn is not a nop, because it is volatile.

Ian


Re: Autovectorizing does not work with classes

2008-10-07 Thread Tim Prince
Georg Martius wrote:
> Dear gcc developers,
> 
> I am new to this list. 
> I tried to use the auto-vectorization (4.2.1 (SUSE Linux)) but unfortunately 
> with limited success.
> My code is basically a matrix library in C++. The vectorizer does not like 
> the member variables. Consider this code compiled with 
> gcc -ftree-vectorize -msse2 -ftree-vectorizer-verbose=5 
> -funsafe-math-optimizations
> that gives basically  "not vectorized: unhandled data-ref"
> 
> class P{
> public:
>   P() : m(5),n(3) {
>     double *d = data;
>     for (int i=0; i<m*n; i++)
>       d[i] = i/10.2;
>   }
>   void test(const double& sum);
> private:
>   int m;
>   int n;
>   double data[15];
> };
> 
> void P::test(const double& sum) {  
>   double *d = this->data;
>   for(int i=0; i<m*n; i++){
>     d[i]+=sum;
>   }
> }
> 
> whereas the more or less equivalent C version works just fine:
> 
> int m=5;
> int n=3;
> double data[15];
> 
> void test(const double& sum) {  
>   int mn = m*n;
>   for(int i=0; i<mn; i++){
>     data[i]+=sum;
>   }
> }
> 
> 
> Is there a fundamental problem in using the vectorizer in C++?
> 

I don't see any C code above.  As another reply indicated, the most likely
C idiom would be to pass sum by value.  Alternatively, you could use a
local copy of sum, in cases where that is a problem.
The only fundamental vectorization problem I can think of which is
specific to C++ is the lack of a standard restrict keyword.  In g++,
__restrict__ is available.  A local copy (or value parameter) of sum
avoids a need for the compiler to recognize const or restrict as an
assurance of no value modification.
The loop has to have known fixed bounds at entry, in order to vectorize.
If your C++ style doesn't support that, e.g. by calculating the end value
outside the loop, as you show in your latter version, then you do have a
problem with vectorization.
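
A minimal sketch of the two suggestions above, a local copy of sum and a loop bound computed before the loop, could look like this. The class name `Matrix` and the members `add_to_all`, `at` and `size` are hypothetical; only `m`, `n` and `data` follow the original post. Whether a given GCC version actually vectorizes this loop is not something the sketch can promise.

```cpp
// Hypothetical stand-in for the poster's class P.
class Matrix {
public:
  Matrix() : m(5), n(3) {
    const int count = m * n;  // bound computed once, outside the loop
    for (int i = 0; i < count; ++i)
      data[i] = i / 10.2;
  }

  // sum is still taken by reference, but it is copied into a local first,
  // so the loop body reads a plain scalar instead of a memory reference.
  void add_to_all(const double &sum) {
    const double s = sum;     // local copy of the referenced value
    const int count = m * n;  // loop bound is fixed at loop entry
    for (int i = 0; i < count; ++i)
      data[i] += s;
  }

  double at(int i) const { return data[i]; }
  int size() const { return m * n; }

private:
  int m;
  int n;
  double data[15];
};
```

The local copy matters most when the caller might pass a reference to an element of the array itself, since the compiler then cannot assume sum stays unchanged while the array is written.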


Issue in building the libgcc-Os-4-200.a library for SH target

2008-10-07 Thread Cecilia Rodrigues
Hi,

We have built a cross-compiled toolchain for the SH target using the
following sources:
gcc-4.3.1 [released]
newlib-1.16.0 [released]
binutils-2.18.50 [snapshot dated 30th July 2008]

We encountered the following error when building a C++ application
with a toolchain built from the above sources:

"sh-elf-ld.exe: sh-elf\lib\gcc\sh-elf\4.3.1\ml\m2\libgcc-Os-4-200.a
(unwind-dw2-Os-4-200.o): compiled for a big endian system and target is
little endian
sh-elf-ld.exe: \sh-elf\lib\gcc\sh-elf\4.3.1\ml\m2\libgcc-Os-4-200.a
(unwind-dw2-Os-4-200.o): uses instructions which are incompatible with
instructions used in previous modules
sh-elf-ld.exe: failed to merge target specific data of file
sh-elf\lib\gcc\sh-elf\4.3.1\ml\m2\libgcc-Os-4-200.a(unwind-dw2-Os-4-200.o)"

The libgcc-Os-4-200.a archive built for SH target consists of the
following object files, udivsi3_i4i-Os-4-200.o sdivsi3_i4i-Os-4-200.o
unwind-dw2-Os-4-200.o

We observed that the object file "unwind-dw2-Os-4-200.o" gets built
for big endian instead of the little endian target, whereas the other
two object files, "udivsi3_i4i-Os-4-200.o" and "sdivsi3_i4i-Os-4-200.o",
from the same archive are successfully built for the little endian SH
target.

The libraries built for the little endian SH-2/SH-3 target series reside at
the following path,
sh-elf/lib/gcc/sh-elf/4.3.1/ml/m2

We have also observed that the target specific options such as ml, m2,
ml m2 etc are not passed to the compiler while building the
"unwind-dw2-Os-4-200" object.

It appears that, somewhere in the GCC makefiles, the required options
have been missed while building the "unwind-dw2-Os-4-200.o" component. 

Has anyone faced a similar problem? Any possible workaround? 

Regards,
Cecilia Rodrigues   
KPIT Cummins Infosystems Ltd.   
Pune, India


Re: Autovectorizing does not work with classes

2008-10-07 Thread Ira Rosen


[EMAIL PROTECTED] wrote on 07/10/2008 10:48:29:

> Dear gcc developers,
>
> I am new to this list.
> I tried to use the auto-vectorization (4.2.1 (SUSE Linux)) but unfortunately
> with limited success.
> My code is basically a matrix library in C++. The vectorizer does not like
> the member variables. Consider this code compiled with
> gcc -ftree-vectorize -msse2 -ftree-vectorizer-verbose=5
> -funsafe-math-optimizations
> that gives basically "not vectorized: unhandled data-ref"

The unhandled data-ref here is sum. It is invariant in the loop, and
invariant data-refs are currently unsupported by the data dependence
analysis. If you can change your code to pass sum by value, it will get
vectorized (at least with gcc 4.3).
This is not a C++-specific problem (for me, your C version does not get
vectorized either, for the same reason).
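
The by-value change can be sketched on the free-function version as follows. This is an illustration, not the poster's code: the array is renamed to `values` here (a hypothetical name), and `m` and `n` follow the original post.

```cpp
int m = 5;
int n = 3;
double values[15];  // hypothetical name for the poster's array

// sum is passed by value, so inside the loop it is a local scalar and no
// longer a loop-invariant memory reference.
void test(double sum) {
  int mn = m * n;  // bound computed before the loop
  for (int i = 0; i < mn; i++) {
    values[i] += sum;
  }
}
```

Callers are unaffected by the signature change, since a double argument binds equally well to a by-value parameter.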

HTH,
Ira,

> 
> class P{
> public:
>   P() : m(5),n(3) {
>     double *d = data;
>     for (int i=0; i<m*n; i++)
>       d[i] = i/10.2;
>   }
>   void test(const double& sum);
> private:
>   int m;
>   int n;
>   double data[15];
> };
>
> void P::test(const double& sum) {
>   double *d = this->data;
>   for(int i=0; i<m*n; i++){
>     d[i]+=sum;
>   }
> }
> 
> whereas the more or less equivalent C version works just fine:
> 
> int m=5;
> int n=3;
> double data[15];
>
> void test(const double& sum) {
>   int mn = m*n;
>   for(int i=0; i<mn; i++){
>     data[i]+=sum;
>   }
> }
> 
>
> Is there a fundamental problem in using the vectorizer in C++?
>
> Regards!
>Georg
> [attachment "signature.asc" deleted by Ira Rosen/Haifa/IBM]



A question regarding recognition of nop

2008-10-07 Thread Revital1 Eres

Hello,

Is there a general way to recognize a nop insn in RTL (using attributes?),
or should I add a target hook for that?

For example, I would like to recognize the following spu insn as a nop:

(insn 555 210 203 11 (unspec_volatile [
(const_int 0 [0x0])
] 14) 393 {lnop} (nil))

Thanks,
Revital



Autovectorizing does not work with classes

2008-10-07 Thread Georg Martius
Dear gcc developers,

I am new to this list. 
I tried to use the auto-vectorization (4.2.1 (SUSE Linux)) but unfortunately 
with limited success.
My code is basically a matrix library in C++. The vectorizer does not like 
the member variables. Consider this code compiled with 
gcc -ftree-vectorize -msse2 -ftree-vectorizer-verbose=5 
-funsafe-math-optimizations
that gives basically  "not vectorized: unhandled data-ref"

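class P{
public:
  P() : m(5),n(3) {
    double *d = data;
    for (int i=0; i<m*n; i++)
      d[i] = i/10.2;
  }
  void test(const double& sum);
private:
  int m;
  int n;
  double data[15];
};

void P::test(const double& sum) {  
  double *d = this->data;
  for(int i=0; i<m*n; i++){
    d[i]+=sum;
  }
}
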
whereas the more or less equivalent C version works just fine:

int m=5;
int n=3;
double data[15];

void test(const double& sum) {  
  int mn = m*n;
  for(int i=0; i<mn; i++){
    data[i]+=sum;
  }
}

Is there a fundamental problem in using the vectorizer in C++?

Regards!
Georg




Re: [PATCH]: bump minimum MPFR version, (includes some fortran bits)

2008-10-07 Thread Janne Blomqvist
On Tue, Oct 7, 2008 at 2:15 AM, Kaveh R. Ghazi <[EMAIL PROTECTED]> wrote:
> From: "Richard Guenther" <[EMAIL PROTECTED]>
>
>> On Sun, Oct 5, 2008 at 3:33 AM, Kaveh R. GHAZI <[EMAIL PROTECTED]>
>> wrote:
>>>
>>> Okay for mainline?
>>
>> Ok if there are no objections within the week.
>>
>> Thanks,
>> Richard.
>
> Great, thanks.  Can I get an explicit ack from a fortran maintainer as well?

Ok.

-- 
Janne Blomqvist


Re: Help with IA64 profiling bug - g++.dg/tree-prof/indir-call-prof.C

2008-10-07 Thread Richard Guenther
On Tue, Oct 7, 2008 at 1:18 AM, Steve Ellcey <[EMAIL PROTECTED]> wrote:
> I have been looking at why g++.dg/tree-prof/indir-call-prof.C fails on
> IA64 (HP-UX and Linux).  It looks like the optimization (turning an
> indirect call into a direct call) does not happen because the initial
> run with -fprofile-generate is not generating any count data about
> indirect calls.
>
> Comparing x86 (where things work) and IA64 (where they do not), I see
> the test case, when compiled with -fprofile-generate, has calls to
> __gcov_indirect_call_profiler in both cases.  But on IA64, cur_func is
> never equal to callee_func and so __gcov_one_value_profiler_body is
> never called.  On x86 we do have cur_func equal to callee_func and so
> __gcov_one_value_profiler_body is called to write out profile
> information.
>
> This is about as far as I have gotten.  I am not sure why there is this
> difference or how to fix it.  I *think* it may be related to the fact
> that IA64 GCC defines TARGET_VTABLE_USES_DESCRIPTORS but my only reason
> for thinking that is that IA64 is the only platform that defines this
> macro and I think that the profiler must be getting callee addresses out
> of the vtable (though I am not sure about that and I don't know where it
> would be doing it from).
>
> So this is a request to anyone who might know the profiling code to help
> me with some advice about what I should look at next or about how to go
> about fixing this bug.

If these testcases never worked on IA64 I suggest you XFAIL
them for IA64 and file a missed-optimization bugreport.

Richard.

> Steve Ellcey
> [EMAIL PROTECTED]
>