Re: New GCC releases comparison and comparison of GCC4.4 and LLVM2.5 on SPEC2000

2009-05-25 Thread Vincent Lefevre
On 2009-05-25 12:53:49 -0700, Chris Lattner wrote:
> On May 13, 2009, at 5:26 AM, Duncan Sands wrote:
>>> -mpc64 sets the x87 floating point control register to not use the
>>> 80bit extended precision. This causes some x87 floating point
>>> operations to operate faster and there are no issues with the
>>> extra roundings you get when storing an 80bit precision register
>>> to a 64bit memory location.
>
> However, this does break long double, right?

Unless the parameters associated with long double (among LDBL_MANT_DIG,
LDBL_DIG, LDBL_MIN_EXP, LDBL_MIN_10_EXP, LDBL_MAX_EXP, LDBL_MAX_10_EXP,
LDBL_MAX, LDBL_EPSILON, LDBL_MIN) are changed accordingly.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)


arm926 branch cost

2009-05-25 Thread Phil Fong

At the end of config/arm/arm926ejs.md, branch costs are modeled with:

>;; Branch instructions are difficult to model accurately.  The ARM
>;; core can predict most branches.  If the branch is predicted
>;; correctly, and predicted early enough, the branch can be completely
>;; eliminated from the instruction stream.  Some branches can
>;; therefore appear to require zero cycles to execute.  We assume that
>;; all branches are predicted correctly, and that the latency is
>;; therefore the minimum value.
>
>(define_insn_reservation "9_branch_op" 0
> (and (eq_attr "tune" "arm926ejs")
>  (eq_attr "type" "branch"))
> "nothing")

In arm.md "*arm_cond_branch" and "*arm_cond_branch_reversed" have attr "type"
set to "branch".

This seems to disagree with Section 8.3 of the ARM9EJ-S Technical Reference
Manual (ARM DDI 0222B), which says:

"Any ARM or Thumb branch, and an ARM branch with link operation
takes three cycles"

Presumably, branches that are not taken take 1 cycle, like any other
non-executed conditional instruction.

In addition, arm926ejs.md does not model the cost of ALU instructions such as
mov with the PC as the destination.  According to the reference manual, these
take either 3 or 4 cycles.

The section quoted above from arm926ejs.md also appears in arm1026ejs.md.
According to their respective reference manuals, the arm1026 has branch
prediction while the arm926 does not.

Am I misunderstanding what "define_insn_reservation" means?  There
does not appear to be anything in arm_adjust_cost in arm.c that
affects branching costs.

Phil


  


Re: Bugfix request

2009-05-25 Thread Piotr Wyderski
Robert Dewar wrote:

> Since this is particularly important to you

It is not "particularly important" to me, it's just a bug
with a known workaround (i.e. a cast to the enum's
base type). But a very annoying one.

> why not take the opportunity to dig in and see if you
> can figure out the necessary fix.

Frankly: because I know nothing about GCC internals
and am not interested in becoming a wizard GCC
developer. I do enjoy writing my hobbyist asynchronous
execution framework in C++0x using GCC as an excellent
tool for the job (on many levels: as the primary code base
compiler and a semi-dynamic code generator) but I do not
enjoy writing the tool itself. GCC wishes to support C++0x
as soon as possible, so the bug will eventually be fixed anyway.
My message was merely user feedback from the trenches
of a large-scale application of the experimental C++0x
support in GCC, intended to help its real developers
assign an appropriate priority to the bug. Call it
profile-guided prioritization if you wish. You may help fix the
bug or completely ignore this feedback -- both possibilities
are perfectly OK, it's up to you. But I see nothing here to
discuss, so I rest my case.

> Bugs only get fixed if someone volunteers to do the fix!

Indeed. And, as you can see, there are volunteers who
enjoy GCC development and share my opinion that the
bug is quite important: its status has been rapidly changed
to confirmed and a volunteer has been assigned to fix it.
So I would like to thank these fellows very much.

Best regards
Piotr Wyderski


Re: New GCC releases comparison and comparison of GCC4.4 and LLVM2.5 on SPEC2000

2009-05-25 Thread Chris Lattner


On May 13, 2009, at 5:26 AM, Duncan Sands wrote:
> Hi Richard,
>
>> -mpc64 sets the x87 floating point control register to not use the 80bit
>> extended precision.  This causes some x87 floating point operations
>> to operate faster and there are no issues with the extra roundings you
>> get when storing an 80bit precision register to a 64bit memory location.

However, this does break long double, right?

>> Does LLVM support x87 arithmetic at all or does it default to SSE
>> arithmetic in 32bits?  I guess using SSE math for both would be a more
>> fair comparison?
>
> LLVM does support the x86 floating point stack, though it doesn't support
> all asm expressions for it (which is why llvm-gcc disables math inlines).
> My understanding is that no effort has been made to produce optimal code
> when using the x86 fp stack, and all the effort went into SSE instead.


As long as you configure llvm-gcc with --with-arch=pentium-4 or later,  
you should be fine.  I personally use --with-arch=nocona.


-Chris


Re: Bugfix request

2009-05-25 Thread Robert Dewar

Piotr Wyderski wrote:
> Hello,
>
> I would like to kindly ask somebody to fix PR38064,
> as the bug is very annoying -- it makes the use of
> enum class virtually impossible. Most of my "GCCBUG"
> workaround comments refer to this one.

Since this is particularly important to you, why not take
the opportunity to dig in and see if you can figure out
the necessary fix. Bugs only get fixed if someone volunteers
to do the fix!

> For a quick reference:
>
> enum class E { elem };
>
> int main()
> {
>     E e = E::elem;
>     if (e == E::elem);
>     return 1;
> }
>
> g++ -std=c++0x tc1.cpp
> tc1.cpp: In function 'int main()':
> tc1.cpp:6: error: invalid operands of types 'E' and 'E' to binary 'operator=='
>
> Best regards
> Piotr Wyderski




Re: Seeking suggestion

2009-05-25 Thread Jamie Prescott

> From: Michael Meissner 
> To: Jamie Prescott 
> Cc: gcc@gcc.gnu.org
> Sent: Sunday, May 24, 2009 1:57:19 PM
> Subject: Re: Seeking suggestion
> 
> One way is to use match_scratch, and different register classes for the two
> cases.
> 
> (define_insn "add3"
>   [(set (match_operand:SI 0 "register_operand" "=x,y")
>         (plus:SI (match_operand:SI 1 "register_operand" "%x,y")
>                  (match_operand:SI 2 "register_operand" "x,y")))
>    (clobber (match_scratch:CC 3 "=X,z"))]
>   ""
>   "add %0,%1,%2")
> 
> 
> (define_register_constraint "x" "TARGET_MACHINE ? GENERAL_REGS : NO_REGS"
>   "@internal")
> 
> (define_register_constraint "y" "!TARGET_MACHINE ? GENERAL_REGS : NO_REGS"
>   "@internal")
> 
> (define_register_constraint "z" "CR_REGS" "@internal")
> 
> This assumes you have a register class for the condition code register.  Most
> machines however, use the normal define_expand with two different insns.
> 
> In theory, you could define a second condition code register that doesn't
> actually exist in the machine, and change the clobber from the main CC to the
> fake one.

Hmm, interesting. Thanks Michael.
Though, as you were saying, I'll probably leave them as separate insns. These
should have been two separate targets probably, but I'm too lazy to split them
up ATM.



> > But now I get an invalid rtx sharing from the push/pop parallels:
> > 
> > 
> > .c: In function 'test_dashr':
> > .c:32: error: invalid rtl sharing found in the insn
> > (insn 26 3 28 2 .c:26 (parallel [
> > (insn/f 25 0 0 (set (reg/f:SI 51 SP)
> > (minus:SI (reg/f:SI 51 SP)
> > (const_int 4 [0x4]))) -1 (nil))
> > (set/f (mem:SI (reg/f:SI 51 SP) [0 S4 A8])
> > (reg:SI 8 r8))
> > ]) -1 (nil))
> > .c:32: error: shared rtx
> > (insn/f 25 0 0 (set (reg/f:SI 51 SP)
> > (minus:SI (reg/f:SI 51 SP)
> > (const_int 4 [0x4]))) -1 (nil))
> > .c:32: internal compiler error: internal consistency failure
> 
> I suspect you don't have the proper guards on the push/pop insns, and the
> combiner is eliminating the clobber.  You probably need to have parallel insns
> for the push and pop.

Dunno exactly what was happening. The push/pop were generated with a parallel,
but I was issuing a gen_addsi3() directly, and this somehow was creating the
problem.
Once I open coded that with SET(SP, PLUS(SP, SIZE)), the issue disappeared.


- Jamie


  


Inner loop unable to compute sufficient information during vectorization

2009-05-25 Thread Abhishek Shrivastav
for a loop like

1 for(i=0;i

Re: extern const (Was: Re: [gcc-in-cxx]: patches from multi-target-4_4-branch)

2009-05-25 Thread Gabriel Dos Reis
On Sun, May 24, 2009 at 11:13 PM, Ian Lance Taylor  wrote:
> Joern Rennecke  writes:
>
>> Quoting Ian Lance Taylor :
>>> Joern Rennecke  writes:
>>>>     * config/sh/sh.c (sh_attribute_table): Use extern in forward
>>>>     declaration.
>>>> Common issue with declaring/defining const variables in C++.
>>>
>>> I've been doing this as
>>>
>>> #ifdef __cplusplus
>>> extern
>>> #endif
>>
>> These #ifdefs sprinkled over the code are awkward.  Could we use a #define
>> for this?  E.g. put in system.h
>> #ifdef __cplusplus
>> #define CONST_VAR_DECL extern const
>> #else
>> #define CONST_VAR_DECL const
>> #endif
>
> Yes, that is certainly the way to go if it is in fact not safe to use
> "extern const int i = 1;" for all C compilers.  I hadn't planned to deal
> with this issue yet, but since you bring it up, we should decide whether
> that construct is safe, or whether we need the macro.
>
> Ian
>

I believe 'extern const' should be pretty safe in any C90 compiler.
Do we know a compiler that does not handle that correctly?


Re: extern const (Was: Re: [gcc-in-cxx]: patches from multi-target-4_4-branch)

2009-05-25 Thread Gabriel Dos Reis
On Sun, May 24, 2009 at 10:23 PM, Joern Rennecke  wrote:
> Quoting Ian Lance Taylor :
>>
>> Joern Rennecke  writes:
>>>
>>>        * config/sh/sh.c (sh_attribute_table): Use extern in forward
>>>        declaration.
>>> Common issue with declaring/defining const variables in C++.
>>
>> I've been doing this as
>>
>> #ifdef __cplusplus
>> extern
>> #endif
>
> These #ifdefs sprinkled over the code are awkward.  Could we use a #define
> for this?  E.g. put in system.h
> #ifdef __cplusplus
> #define CONST_VAR_DECL extern const
> #else
> #define CONST_VAR_DECL const
> #endif

It is much better than defining 'extern' to nothing.

I'm a bit surprised that a C90 compiler will not accept
'extern const'.



Bugfix request

2009-05-25 Thread Piotr Wyderski
Hello,

I would like to kindly ask somebody to fix PR38064,
as the bug is very annoying -- it makes the use of
enum class virtually impossible. Most of my "GCCBUG"
workaround comments refer to this one.

For a quick reference:

enum class E { elem };

int main()
{
    E e = E::elem;
    if (e == E::elem);
    return 1;
}

g++ -std=c++0x tc1.cpp
tc1.cpp: In function 'int main()':
tc1.cpp:6: error: invalid operands of types 'E' and 'E' to binary 'operator=='

Best regards
Piotr Wyderski


Re: Do we have optimizations to reduce cache miss?

2009-05-25 Thread Ben Elliston
> I just want to ask whether we have any special pass to reduce cache
> misses? Or is there any idea or branch to improve this?

There are various data layout optimisations.  There is also -Os ;-)

Ben




Re: relative include search path

2009-05-25 Thread Dave Korn
Denis Onischenko wrote:
> I have a problem with gcc not finding location for stddef.h include
> file when it is invoked from directory other than
> /bin.

  Questions about the usage of GCC should go to the gcc-help mailing list,
rather than this one which is about the development of the internals of GCC.
What you have discovered here may appear to be a bug, but in fact it is by
design, so please send any follow-ups to the gcc-help list, thank you.

> Output from gcc invocation with -v option contains the following:
>
> ignoring nonexistent directory "../lib/gcc..."
>
> i.e. gcc is trying to find include files in a directory with a relative path.
> So it works only when the working directory is /usr/bin, where gcc is
> installed.
>
> Why does gcc look for headers in directories with relative paths?

  So that you can move the entire installation somewhere else and it will all
work because it will all be in the same relative locations compared to the new
$prefix as it was when installed in the original $prefix.

  What you have done is move a single part of the installation to a new
location and leave the rest behind.  That is not supported, and there's no
simple and direct way in which it could be.

cheers,
  DaveK


relative include search path

2009-05-25 Thread Denis Onischenko
I have a problem with gcc not finding location for stddef.h include
file when it is invoked from directory other than
/bin.
Output from gcc invocation with -v option contains the following:

ignoring nonexistent directory "../lib/gcc..."

i.e. gcc is trying to find include files in a directory with a relative path.
So it works only when the working directory is /usr/bin, where gcc is installed.

Why does gcc look for headers in directories with relative paths?
I configured gcc with --prefix=/usr. The gcc was built with a cross-compiler.

Thanks in advance.


Do we have optimizations to reduce cache miss?

2009-05-25 Thread Eric Fisher
Hi,

I just want to ask whether we have any special pass to reduce cache
misses? Or is there any idea or branch to improve this?

Thanks
Eric Fisher