Re: [PATCH] improved algorithm for gcc/expmed.c::choose_multiplier()

2006-08-02 Thread Roger Sayle

Hi Denis,

On Mon, 31 Jul 2006, Jim Wilson wrote:
> At the moment, there is probably no one who understands this code as
> well as you do, so you may not get much help from others.

In my defence, I'm struggling to get up to speed with all of the
issues.  The first and obvious comments are that patches should be
submitted to [EMAIL PROTECTED] rather than gcc@gcc.gnu.org,
we're currently in stage 3, which is a code freeze for regression
fixes, and therefore missed optimization improvements such as yours
can't be committed until gcc 4.2 has branched, which means reviewers
tend not to review complex patches until then, and finally you really
should take a look at GCC's coding standards documentation as
the formatting of your submission is just all wrong.

In the meantime, however, I have been running your example code,
and it does indeed find better correct multipliers than the algorithm
currently used by GCC.

As mentioned by Jim, the current GCC algorithm is based upon the
paper by Torbjorn Granlund and Peter Montgomery, "Division by
Invariant Integers using Multiplication", PLDI-94.
http://swox.se/~tg/divcnst-pldi94.pdf

Another excellent reference describing division by integer constant
algorithms is Henry S. Warren Jr.'s "Hacker's Delight", Addision-Wesley
publishers, 2003.

The strange thing is that all three of GCC, the paper and the book
agree on the a preferred multiplier, which differs from the one
selected by your code.  Following through both sets of correctness
proofs is tedious, but it looks like Granlund's work attempts to
find multipliers whose residual error is less than 1/n, which is
sufficient by not necessary.  Your code relies on round to zero
semantics of truncating division, so that provided the residual is
less than 1, we'll always truncate to the correct result.

One issue is that Grunlund's derivation starts with signed division
and is then extended/tweaked to handle unsigned division.  Your
work on the other hand, starts its derivation with unsigned division.


So currently I'm trying to find the catch.  For which divisions
are 33-bit operations required.  Your proof looks reasonable and
the code does the correct thing for the divisions in your examples,
though the coding style issues prevent it being used in the compiler
without being cleaned up.  There's also a return clause where it gives
up after failing to find a suitable multiplier, where falling back
to the exisiting code might be a better compromise.  I've not played
with your algorithm enough to determine how often it triggers.


Congratulations for the impressive work!  You'll forgive me if it
takes a while to understand why your method is better than previous
works, i.e. what it is they overlooked (or that you've overlooked).
To be honest all this discrete math makes my head hurt.  Indeed,
when its appreciated that its not the smallest multiplier that we'd
prefer, but the cheapest (using shifts/adds etc...) and that
choose_multiplier is potentially intertwined with synth_mult, the
world starts spinning and I need to reach for the headache tablets.

Roger
--



Re: a request

2006-08-02 Thread Diego Novillo
Russell Whitaker wrote on 08/02/06 22:31:

> Could I get a copy of the implementation of the OpenMP 2.5 interface,
> please?
> 
Try one of the weekly snapshots http://gcc.gnu.org/snapshots.html for
mainline (to become 4.2).  Or, if you are running Fedora Core 5, the GCC
4.1 version in there already contains the OMP bits.


Re: a request

2006-08-02 Thread Andrew Pinski
> 
> hi
> 
> Could I get a copy of the implementation of the OpenMP 2.5 interface, 
> please?

You download the 4.2 snapshot and you have it.
Or you could grab a GCC via svn.

Thanks,
Andrew Pinski


a request

2006-08-02 Thread Russell Whitaker

hi

Could I get a copy of the implementation of the OpenMP 2.5 interface, 
please?


  thanks,
Russ


Eric Botcazou appointed RTL maintainer

2006-08-02 Thread Mark Mitchell
Eric --

The GCC SC has appointed you an "RTL maintainer".  Congratulations!

That means that you have maintainership of all machine-independent RTL
optimization passes, like jump, CSE, GCSE, flow, sched2,
shorten_branches, etc.  This post doesn't cover back ends, dwarf2out.c,
or other things that aren't optimization passes.  I know that it's hard
to be exactly sure where the boundaries lie, but I've every confidence
you'll use your judgment well.  Please feel free to ask if you're not sure.

Please adjust MAINTAINERS accordingly -- and then please fix PRs and
approve patches for same. :-)

Thanks!

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Re: [PATCH] improved algorithm for gcc/expmed.c::choose_multiplier()

2006-08-02 Thread Denis Vlasenko
On Tuesday 01 August 2006 00:34, Jim Wilson wrote:
> Denis Vlasenko wrote:
> > I still cannot figure out what precision is, so I restricted new code to
> > (n == HOST_BITS_PER_WIDE_INT && precision == HOST_BITS_PER_WIDE_INT) case.
> > Need help here.
> 
> At the moment, there is probably no one who understands this code as 
> well as you do, so you may not get much help from others.
> 
> I looked at this a little, and I think precision is the number of 
> significant bits in the divisor.  Note that unsigned divides use N for 
> precision, but signed divides use N-1.  Also, in the pre_shift case, 
> where we shift the divisors right if they have zeros in the low bits, we 
> subtract the shift count from the precision.

Yes.

> This probably has some effect on how many bits we can safely use for the 
> resulting inverse multiplier.
> 
> This code is based on a paper writte by Torbjorn Granlund and Peter 
> Montgomery.  This was published in the 1994 SIGPLAN PLDI conference 
> proceedings.  You should read this paper if you haven't already done so. 
>   There may be clues in there about how the gcc algorithm works.  The 
> gcc code was written by one of the co-authors, Torbjorn Granlund.

I read it.

I uploaded newer patch to http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417.
My code basically differs in

1) I never do 33-bit math. I do 32-bit and if the best 32-bit value of K
   for multiplier still doesn't work (there exist such 32-bit A
   that A*K/L != A/B) then I return K*2-1 (which is a 33-bit value)
   _without_ ever checking that it works (because it works always
   because of math stuff (read the paper)). Old code calculates
   33-bit K (it calls it "m", not K) and needs to jumt thru hoops
   due to 33-bitness.

2) I do not do mhigh/mlow stuff. This is the core of my improvement.
   I directly calculate the smallest bad_A for which bad_A*K/L != A/B.
   For some values of B, it turns out that bad_A is > 2^32-1 even though
   mhigh/mlow method says that K is maybe bad (i.e. my method
   proves that this K is actually okay).
   It also allows for finding better K values if we know that A is
   bounded (from value range analysis), but current patch doesn't
   use it yet.

My algorithm's comment (taken from patch):


+/*
+[below: 'div' is unsigned integer division]
+['/' is real division with infinite precision]
+[A,B,C... - integers, a,b,c... - reals]
+
+Theory: we want to calculate A div B, fast.
+B is const > 2 and is not a power of 2.
+
+In fp: A/B = A*(L/B)/L. (If L is a large power of 2,
+say 2^32, then it could be done really fast).
+Let k := L/B, K := L div B + 1.
+Then A/B = A*k/L.
+
+Then this is true:
+
+A div B <= A * K div L.
+
+For small enough A: A div B = A * K div L.
+Lets find first A for which it is not true.
+
+Lets compare k/L and K/L. K/L is larger by a small value d:
+
+d := K/L - k/L = (L div B + 1) / L - L/B/L =
+= (L div B * B + B) / (L*B) - L/(L*B) =
+= (L div B * B + B - L) / (L*B)
+
+A*K/L is larger than A*k/L by A*d.
+
+A*k/L is closest to 'overflowing into next integer'
+when A = N*B-1. The 'deficit' with such A is exactly 1/B.
+If A*d >= 1/B, then A*K/L will 'overflow'.
+
+Thus bad_A >= 1/B / d = (1/B) * (L*B)/(L div B * B + B - L) =
+= L/(L div B * B + B - L). Then you need to 'walk up' to next
+A representable as N*B-1: bad_A2 = (bad_A div B) * B + B-1
+This is the first A which will have wrong result
+(i.e. for which (A*K div L) = (A div B)+1, not (A div B).

--
vda


Re: [PATCH] improved algorithm for gcc/expmed.c::choose_multiplier()

2006-08-02 Thread Denis Vlasenko
On Wednesday 02 August 2006 08:30, Daniel Berlin wrote:
> Denis Vlasenko wrote:
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417
> > 
> > Right now Bugzilla internal problem prevents me from creating
> > an attachement there. So it goes here.
> 
> What problems?
> Please let me know.
> 
> The only issue i ever see is stuff about Fh::slice.
> 
> This happens when the ip address or session changes between the create
> attachment click and the uploading you do :)

Yes it is Fh::slice, and it is intermittent.

Our admins must be doing something funny on the Inet gateway.
--
vda


Re: if() and trailing ;

2006-08-02 Thread Denis Vlasenko
On Tuesday 01 August 2006 16:54, Gabriel Dos Reis wrote:
> "Denis Vlasenko" <[EMAIL PROTECTED]> writes:
> | if()
> | (void)0; /* do nothing */
> | 
> | will make you happy.
> 
> No, I'm not.  I find it Very Silly.

Do you prefer buggy code like this?

| > > After a couple hours debugging code, I figured our an if() somewhere
| > > had a trailing ;  like this:
| > >
| > >                 if (memcmp(p, COMMUNITY, strlen(COMMUNITY)) != 0);
| > >                         continue; /* failed */
| > >
| > > The code above will always reach "continue" even when memcmp() == 0.

We want gcc to warn about such cases (about empty if's, that is).
Thus we need to have non-empty statement in if() body, always.

Of course you do not write "if(expr) (void)0;" directly,
it's what compiler sees.

Usage is

#if 
#define do_something() ((void)0)
#endif
...

if(expr) do_something();
--
vda


Re: native or cross libssp?

2006-08-02 Thread Mark Mitchell
Ben Elliston wrote:

> The former will be far easier, unless you desparately want ssp support
> for your target.  libssp's configure script checks for certain
> behaviour from vsnprintf and needs to run the test program to
> determine that.

It's our general policy that all libraries have configuration options
that permit them to be built in cross environment, by explicitly passing
appropriate configuration options.  Until those are added to libssp, it
 should be automatically disabled for cross builds.  Please file a PR
for this problem.

There have been several libraries contributed of late which have had
serious build and installation problems, including placing files in
incorrect directories and not being disabled in casses where they do not
work.

In future, I plan to be looking hard at any new library submissions and
asking contributors to test and fix such issues before checking in.
These problems are not hard to fix, but sometimes folks that are used to
working only in native environments do not think about them.

Thanks,

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


New branch: insn-select

2006-08-02 Thread Michael Matz
Hi,

I just created the insn-select branch (off of trunk) containing currently 
the code I wrote shortly after the summit.  It intends to implement to 
what I agreed during the summit register allocation BoF, some sort of 
instruction and regclass selection, plus compensation code generation when 
a pseudo is used in conflicting contexts.  Currently it doesn't even 
select one regclass for each pseudo, doesn't do compensation code, and has 
no heuristics, and generally doesn't do very much :-)

I might be going off the dataflow branch somewhen, as I will need 
incremental updating of the regrefs (no dataflow problems need to be 
solved, just replacing one regref in some insns with other pseudos, and 
adding new instructions).  That once was possible with the old df code, 
but I think it currently isn't anymore (?)


Ciao,
Michael.
-- 

Index: svn.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/svn.html,v
retrieving revision 1.31
retrieving revision 1.32
diff -u -p -r1.31 -r1.32
--- svn.html2 Aug 2006 03:21:22 -   1.31
+++ svn.html2 Aug 2006 17:18:50 -   1.32
@@ -310,6 +310,19 @@ list therefore provides only some repres
   [ra-improvements] in the subject line.  The branch
   is maintained by mailto:[EMAIL PROTECTED]">Peter
   Bergner.
+
+  insn-select
+  This branch aims to implement in early instruction selection
+  and register class selection pass, which runs before register allocation
+  and subsumes the current regclass pass.  In particular
+  the goal is to chose an alternative per instruction, usable as a base
+  during register allocation, which ideally is not changed during reload
+  if registers could be allocated.  This will not be possible in all cases,
+  especially when addresses generated during spilling will be invalid on
+  the target machine.  But we should be able to do away with fake register
+  classes representing strict unions of other register classes.
+  Patch should be marked with [isel] or [insn-select]
+  in the subject line.  The branch is maintained by Michael Matz.
 

 Architecture-specific



libgfortran build failure

2006-08-02 Thread David Edelsohn
After the aclocal/automake regeneration, I now am seeing a
bootstrap failure on AIX because the generated Makefile is not
substituting a variable.  My generated Makefile looks like:

CC = /tmp/20060801/./gcc/xgcc -B/tmp/20060801/./gcc/ 
-B/farm/dje/install/powerpc-ibm-aix5.2.0.0-20060801/powerpc-ibm-aix5.2.0.0/bin/ 
-B/farm/dje/install/powerpc-ibm-aix5.2.0.0-20060801/powerpc-ibm-aix5.2.0.0/lib/ 
-isystem 
/farm/dje/install/powerpc-ibm-aix5.2.0.0-20060801/powerpc-ibm-aix5.2.0.0/include
 -isystem 
/farm/dje/install/powerpc-ibm-aix5.2.0.0-20060801/powerpc-ibm-aix5.2.0.0/sys-include
CFLAGS = @CFLAGS@
CPP = /tmp/20060801/./gcc/xgcc -B/tmp/20060801/./gcc/ 
-B/farm/dje/install/powerpc-ibm-aix5.2.0.0-20060801/powerpc-ibm-aix5.2.0.0/bin/ 
-B/farm/dje/install/powerpc-ibm-aix5.2.0.0-20060801/powerpc-ibm-aix5.2.0.0/lib/ 
-isystem 
/farm/dje/install/powerpc-ibm-aix5.2.0.0-20060801/powerpc-ibm-aix5.2.0.0/include
 -isystem 
/farm/dje/install/powerpc-ibm-aix5.2.0.0-20060801/powerpc-ibm-aix5.2.0.0/sys-include
 -E

GCC does not like @CFLAGS@ as a commandline option.  Something was
not regenerated properly.

David



Re: if() and trailing ;

2006-08-02 Thread Thomas R. Truscott
> But it is common to have an empty action on a condition.  You'll often
> see code like

if (condition)
/* nothing */;

Yes, the intent of the warning is catch people who stick a ; at the end
of a line (out of habit) when there should not be one.
That is what the warning should target, not good code like the above.

gcc should suppress the warning when the ';' is on a different line
than the `if'.  I do that (and other filtering) in a similar warning
for `for' and `while' with good results.

If the pre-processor would mark comment and macro boundaries,
then we could do better.  E.g. these should not trigger the warning:

if (condition) /* nothing */;
if (condition) SPEW();

And we could easily detect the problem with

if(...)
   SOME_MACRO(with complex arguments)
do_something();

since we could determine that the control is spanning a null macro.

Tom Truscott