Re: g++ 4.3, troubles with C++ indexing idioms

2007-07-25 Thread Paolo Bonzini



For performance small arrays should be the same as individual members
(I can see the annoying fact that initialization is a headache - this has
annoyed me as well).  For larger arrays (4 members), aliasing will
make a difference possibly, making the array variant slower.  Any union
variant is expected to be slower for aliasing reasons (we do not do
field-sensitive aliasing for unions).


I wonder if the fabled union as VIEW_CONVERT_EXPR patch would help... 
 Richi, do you know?


Paolo


Re: g++ 4.3, troubles with C++ indexing idioms

2007-07-25 Thread Richard Guenther

On 7/25/07, Paolo Bonzini [EMAIL PROTECTED] wrote:


 For performance small arrays should be the same as individual members
 (I can see the annoying fact that initialization is a headache - this has
 annoyed me as well).  For larger arrays (4 members), aliasing will
 make a difference possibly, making the array variant slower.  Any union
 variant is expected to be slower for aliasing reasons (we do not do
 field-sensitive aliasing for unions).

I wonder if the fabled union as VIEW_CONVERT_EXPR patch would help...
  Richi, do you know?


I don't think so.  As long as our memory optimizers are weak and only work
well if they are faced with single SFT.xx VUSEs and VDEFs then using a union
will be hit by create_variable_info_for not creating field variables
for unions at all
(and thus memory optimizers not be able to tell apart fields).

Richard.


Re: g++ 4.3, troubles with C++ indexing idioms

2007-07-24 Thread Richard Guenther

On 7/21/07, tbp [EMAIL PROTECTED] wrote:

On 7/19/07, Richard Guenther [EMAIL PROTECTED] wrote:
 Of course, if any then the array indexing variant is fixed.  It would be nice
 to see a complete testcase with a pessimization, maybe you can file
 a bugreport about this?
There's many issues for all alternatives and i'm not qualified to
pinpoint them further.
I've taken http://ompf.org/ray/sphereflake/ which is used as a
benchmark already here
http://www.suse.de/~gcctest/c++bench/raytracer/, because it's small,
self contained and has such a basic 3 component class that's used all
over.
It doesn't use any kind of array access operator, but it's good enough
to show the price one has to pay before even thinking of providing
some. It has been adjusted to use floats and access members through
accessors (to allow for a straighter comparison of all cases).

variation 0 is the reference, a mere struct { float x,y,z; ...};,
performs as good as the original, but wouldn't allow for any 'valid'
indexing.
variation 1 is struct { float f[3]; ... }
variations 2,3,4,5 try to use some union

# /usr/local/gcc-4.3-20070720/bin/g++ -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../configure --prefix=/usr/local/gcc-4.3-20070720
--enable-languages=c,c++ --enable-threads=posix --disable-checking
--disable-nls --disable-shared --disable-win32-registry
--with-system-zlib --disable-multilib --verbose --with-gcc=gcc-4.2
--with-gnu-ld --with-gnu-as --enable-checking=none --disable-bootstrap
Thread model: posix
gcc version 4.3.0 20070720 (experimental)
# make bench
[snip]
sf.v0

real0m3.963s
user0m3.812s
sys 0m0.152s
sf.v1

real0m3.972s
user0m3.864s
sys 0m0.104s
sf.v2

real0m10.384s
user0m10.261s
sys 0m0.120s
sf.v3

real0m10.390s
user0m10.289s
sys 0m0.104s
sf.v4

real0m10.388s
user0m10.265s
sys 0m0.124s
sf.v5

real0m10.399s
user0m10.281s
sys 0m0.116s

There's some inlining  difference between union variations and the
first two, but they clearly stand in their own league anyway.
So we can only seriously consider the first two.
Variation #0 would ask for invalid c++ (pointer arithmetic abuse, not
an option anymore) or forbidding array access operator and going to
set/get + memcpy, but pretty optimal.
Variation #1 (straight array) is quite annoying in C++ (no initializer
list, need to reformulate all access etc...) and already show some
slight pessimization, but it's not easy to track. Apparently g++ got a
bit better lately in this regard, or it's only blatant on larger data
or more complex cases.

I hope this shows how problematic it is for the end user.


For performance small arrays should be the same as individual members
(I can see the annoying fact that initialization is a headache - this has
annoyed me as well).  For larger arrays (4 members), aliasing will
make a difference possibly, making the array variant slower.  Any union
variant is expected to be slower for aliasing reasons (we do not do
field-sensitive aliasing for unions).

In the end I would still recommend to go with array variants.

Richard.


Re: g++ 4.3, troubles with C++ indexing idioms

2007-07-24 Thread tbp

On 7/24/07, Richard Guenther [EMAIL PROTECTED] wrote:

For performance small arrays should be the same as individual members
(I can see the annoying fact that initialization is a headache - this has
annoyed me as well).  For larger arrays (4 members), aliasing will
make a difference possibly, making the array variant slower.  Any union
variant is expected to be slower for aliasing reasons (we do not do
field-sensitive aliasing for unions).

Confirmed :)
And thanks for the clue about the threshold.


In the end I would still recommend to go with array variants.

I guess wishful thinking, or heresy, got me asking for a sanctioned
address-this-as-an-array idiom.; now i'll go with the flow and use
those 2nd class citizens of C++ aka array, even if i'm a bit sceptical
about the performance equivalence (granted it isn't as obvious as it
used to be, i need to investigate some more).
But for sure it can't be as terrible as unions...


Re: g++ 4.3, troubles with C++ indexing idioms

2007-07-21 Thread tbp

On 7/19/07, Richard Guenther [EMAIL PROTECTED] wrote:

Of course, if any then the array indexing variant is fixed.  It would be nice
to see a complete testcase with a pessimization, maybe you can file
a bugreport about this?

There's many issues for all alternatives and i'm not qualified to
pinpoint them further.
I've taken http://ompf.org/ray/sphereflake/ which is used as a
benchmark already here
http://www.suse.de/~gcctest/c++bench/raytracer/, because it's small,
self contained and has such a basic 3 component class that's used all
over.
It doesn't use any kind of array access operator, but it's good enough
to show the price one has to pay before even thinking of providing
some. It has been adjusted to use floats and access members through
accessors (to allow for a straighter comparison of all cases).

variation 0 is the reference, a mere struct { float x,y,z; ...};,
performs as good as the original, but wouldn't allow for any 'valid'
indexing.
variation 1 is struct { float f[3]; ... }
variations 2,3,4,5 try to use some union

# /usr/local/gcc-4.3-20070720/bin/g++ -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../configure --prefix=/usr/local/gcc-4.3-20070720
--enable-languages=c,c++ --enable-threads=posix --disable-checking
--disable-nls --disable-shared --disable-win32-registry
--with-system-zlib --disable-multilib --verbose --with-gcc=gcc-4.2
--with-gnu-ld --with-gnu-as --enable-checking=none --disable-bootstrap
Thread model: posix
gcc version 4.3.0 20070720 (experimental)
# make bench
[snip]
sf.v0

real0m3.963s
user0m3.812s
sys 0m0.152s
sf.v1

real0m3.972s
user0m3.864s
sys 0m0.104s
sf.v2

real0m10.384s
user0m10.261s
sys 0m0.120s
sf.v3

real0m10.390s
user0m10.289s
sys 0m0.104s
sf.v4

real0m10.388s
user0m10.265s
sys 0m0.124s
sf.v5

real0m10.399s
user0m10.281s
sys 0m0.116s

There's some inlining  difference between union variations and the
first two, but they clearly stand in their own league anyway.
So we can only seriously consider the first two.
Variation #0 would ask for invalid c++ (pointer arithmetic abuse, not
an option anymore) or forbidding array access operator and going to
set/get + memcpy, but pretty optimal.
Variation #1 (straight array) is quite annoying in C++ (no initializer
list, need to reformulate all access etc...) and already show some
slight pessimization, but it's not easy to track. Apparently g++ got a
bit better lately in this regard, or it's only blatant on larger data
or more complex cases.

I hope this shows how problematic it is for the end user.
// sphere flake bvh raytracer (c) 2005, thierry berger-perrin [EMAIL PROTECTED]
// this code is released under the GNU Public License.
// see http://ompf.org/ray/sphereflake/
// compile with ie g++ -O2 -ffast-math sphereflake.cc
// usage: ./sphereflake [lvl=6] pix.ppm
#include cmath
#include iostream
#include cstdlib
#include limits
#define GIMME_SHADOWS

enum { childs = 9, ss= 2, ss_sqr = ss*ss }; /* not really tweakable anymore */
static const float infinity = std::numeric_limitsfloat::infinity(), epsilon = 1e-4f;


#if VARIATION == 5
union v_t {
	// straight union; array left unharmed; just as horrible as the others.
	struct { float _x, _y, _z; };
	float f[3];
	v_t(const float a, const float b, const float c) : _x(a), _y(b), _z(c) {}
	float x() const { return _x; }
	float x()  { return _x; }
	float y() const { return _y; }
	float y()  { return _y; }
	float z() const { return _z; }
	float z()  { return _z; }
	
#else
struct v_t {
#endif
#if VARIATION == 0
	// best of the breed, but doesn't give way for an 'array access' operator.
	float _x, _y, _z;
	v_t(const float a, const float b, const float c) : _x(a), _y(b), _z(c) {}
	float x() const { return _x; }
	float x()  { return _x; }
	float y() const { return _y; }
	float y()  { return _y; }
	float z() const { return _z; }
	float z()  { return _z; }
#elif VARIATION == 1
	// not as good, obvious 'array access' but forbids initializer lists
	float f[3];
	v_t(const float a, const float b, const float c) { f[0] = a; f[1] = b; f[2] = c; }
	float x() const { return f[0]; }
	float x()  { return f[0]; }
	float y() const { return f[1]; }
	float y()  { return f[1]; }
	float z() const { return f[2]; }
	float z()  { return f[2]; }
#elif VARIATION == 2
	// Richard Guenther's suggestion, worst of the worst.
	union {
		struct { float x, y, z; } a;
		float b[3];
	} u;
	v_t(const float i, const float j, const float k) { u.a.x = i; u.a.y = j; u.a.z = k; }
	float x() const { return u.a.x; }
	float x()  { return u.a.x; }
	float y() const { return u.a.y; }
	float y()  { return u.a.y; }
	float z() const { return u.a.z; }
	float z()  { return u.a.z; }
 
#elif VARIATION == 3
	// slightly better than variation #2, but still terrible.
	union {
		struct { float _x, _y, _z; };
		float f[3];
	};
	v_t(const float a, const float b, const float c) : _x(a), _y(b), _z(c) 

g++ 4.3, troubles with C++ indexing idioms

2007-07-19 Thread tbp

I have that usual heavy duty 3 fp components class that needs to be
reasonably efficient and takes this form for g++
struct vec_t {
float x,y,z;
const float operator()(const uint_t i) const { return *(x + i); }
float operator()(const uint_t i) { return *(x + i); } // -- guilty
[snip ctors, operators  related cruft]
};

I use this notation because g++ does silly things with straight arrays
(and C++ gets in the way), doesn't like
union vec_t {
struct { float x,y,z; };
float f[3];
const float operator()(const uint_t i) const { return m[i]; }
float operator()(const uint_t i) { return m[i]; }
};
much either, and seems to enjoy the first form (+ ctors with
initializer lists) much. So far, so good.

Alas, somewhere between gcc-4.3-20070608 (ok) and gcc-4.3-20070707
(not ok ever since), the non const indexing started to trigger bogus
codegen with some skipped stores on x86-64, but of course only in
convoluted situations. So, i can't produce a simple testcase. I can
kludge around either by:
. marking it __attribute__((noinline))
. turning it into a set operation doing a std::memcpy(x + i, f,
sizeof(float))
. annoying the optimizer with the entertaining
union vec_t {
struct { float x,y,z; };
float f[3];
const float operator()(const uint_t i) const { return *(x + i); }
float operator()(const uint_t i) { return *(x + i); }
};

At this point i'd need some guidance from compiler developers because
the compiler itself provides none (no warning whatsoever in any of
those variations) and what i thought was acceptable apparently isn't
anymore.
What kind of idiom am i supposed to write such thing in to get back
efficient and correct code?


Re: g++ 4.3, troubles with C++ indexing idioms

2007-07-19 Thread Richard Guenther

On 7/19/07, tbp [EMAIL PROTECTED] wrote:

I have that usual heavy duty 3 fp components class that needs to be
reasonably efficient and takes this form for g++
struct vec_t {
float x,y,z;
const float operator()(const uint_t i) const { return *(x + i); }
float operator()(const uint_t i) { return *(x + i); } // -- guilty
[snip ctors, operators  related cruft]
};

I use this notation because g++ does silly things with straight arrays
(and C++ gets in the way), doesn't like
union vec_t {
struct { float x,y,z; };
float f[3];
const float operator()(const uint_t i) const { return m[i]; }
float operator()(const uint_t i) { return m[i]; }
};
much either, and seems to enjoy the first form (+ ctors with
initializer lists) much. So far, so good.

Alas, somewhere between gcc-4.3-20070608 (ok) and gcc-4.3-20070707
(not ok ever since), the non const indexing started to trigger bogus
codegen with some skipped stores on x86-64, but of course only in
convoluted situations. So, i can't produce a simple testcase. I can
kludge around either by:
. marking it __attribute__((noinline))
. turning it into a set operation doing a std::memcpy(x + i, f,
sizeof(float))
. annoying the optimizer with the entertaining
union vec_t {
struct { float x,y,z; };
float f[3];
const float operator()(const uint_t i) const { return *(x + i); }
float operator()(const uint_t i) { return *(x + i); }
};

At this point i'd need some guidance from compiler developers because
the compiler itself provides none (no warning whatsoever in any of
those variations) and what i thought was acceptable apparently isn't
anymore.
What kind of idiom am i supposed to write such thing in to get back
efficient and correct code?



Well, I always used the array variant, but you should be able to do

struct vec {
...
 union {
struct { float x, y, z; } a;
float x[3] b;
 } u;
}

if you need to (why does the array form not work for you?)

Richard.


Re: g++ 4.3, troubles with C++ indexing idioms

2007-07-19 Thread tbp

On 7/19/07, Richard Guenther [EMAIL PROTECTED] wrote:

Well, I always used the array variant, but you should be able to do

[snip]

if you need to (why does the array form not work for you?)

Because if you bench in some non trivial program, on x86/x86-64 at
least, those variations (struct { float x,y,z; }, struct { float f[3];
} and some additional union layer) the last 2 consistently come out as
slower. In the array case addressing seems to be the main issue
(redundant scaling etc...); for the union variant, it's less clear but
it seems it prohibits some copy/return value optimizations.
Plus gcc apparently likes (well, used to) very much the *(x + i)
idiom; all in all i had something to work with.

Now i'm seeing *some* stores indexed in this way vanish, array
addressing is still as bad as it was, unions still get me some
pessimization and using the memcpy idiom asks me to give up on the
idea of an array acces operator altogether.

So i'm asking, which is going to be fixed in the foreseeable future.


Re: g++ 4.3, troubles with C++ indexing idioms

2007-07-19 Thread Joel Dice

On Thu, 19 Jul 2007, tbp wrote:


On 7/19/07, Richard Guenther [EMAIL PROTECTED] wrote:

Well, I always used the array variant, but you should be able to do

[snip]

if you need to (why does the array form not work for you?)

Because if you bench in some non trivial program, on x86/x86-64 at
least, those variations (struct { float x,y,z; }, struct { float f[3];
} and some additional union layer) the last 2 consistently come out as
slower. In the array case addressing seems to be the main issue
(redundant scaling etc...); for the union variant, it's less clear but
it seems it prohibits some copy/return value optimizations.
Plus gcc apparently likes (well, used to) very much the *(x + i)
idiom; all in all i had something to work with.


It's worth noting that C++ does not guarantee that (y == x + 1) in your 
example.  The compiler is allowed to insert in arbitrary amount of 
padding between the fields of a structure.  If it works, you got lucky.



Now i'm seeing *some* stores indexed in this way vanish, array
addressing is still as bad as it was, unions still get me some
pessimization and using the memcpy idiom asks me to give up on the
idea of an array acces operator altogether.

So i'm asking, which is going to be fixed in the foreseeable future.


RE: g++ 4.3, troubles with C++ indexing idioms

2007-07-19 Thread Dave Korn
On 19 July 2007 14:08, tbp wrote:

 I have that usual heavy duty 3 fp components class that needs to be
 reasonably efficient and takes this form for g++
 struct vec_t {
   float x,y,z;
   const float operator()(const uint_t i) const { return *(x + i); }
   float operator()(const uint_t i) { return *(x + i); } // -- guilty
   [snip ctors, operators  related cruft]
 };
 
 I use this notation 

  That's not notation.  That's illegal invalid code that invokes undefined
behaviour.  struct vec_t is not a POD type (since it has ctors) and you don't
actually have any right to assume that x, y and z are arranged in consecutive
memory locations.

 Alas, somewhere between gcc-4.3-20070608 (ok) and gcc-4.3-20070707
 (not ok ever since), the non const indexing started to trigger bogus
 codegen with 

  Bogus codegen is the inevitable result of bogus code.  Garbage in, garbage
out.

  BTW, the const indexing is completely undefined too.

 At this point i'd need some guidance from compiler developers 

  No, you need to read a book on how to program in C++.

 What kind of idiom am i supposed to write such thing in to get back
 efficient and correct code?

  This is a question for gcc-help, not for the main gcc list.  Legal valid
code would be a good start.

cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: g++ 4.3, troubles with C++ indexing idioms

2007-07-19 Thread Richard Guenther

On 7/19/07, tbp [EMAIL PROTECTED] wrote:

On 7/19/07, Richard Guenther [EMAIL PROTECTED] wrote:
 Well, I always used the array variant, but you should be able to do
[snip]
 if you need to (why does the array form not work for you?)
Because if you bench in some non trivial program, on x86/x86-64 at
least, those variations (struct { float x,y,z; }, struct { float f[3];
} and some additional union layer) the last 2 consistently come out as
slower. In the array case addressing seems to be the main issue
(redundant scaling etc...); for the union variant, it's less clear but
it seems it prohibits some copy/return value optimizations.
Plus gcc apparently likes (well, used to) very much the *(x + i)
idiom; all in all i had something to work with.

Now i'm seeing *some* stores indexed in this way vanish, array
addressing is still as bad as it was, unions still get me some
pessimization and using the memcpy idiom asks me to give up on the
idea of an array acces operator altogether.

So i'm asking, which is going to be fixed in the foreseeable future.


Of course, if any then the array indexing variant is fixed.  It would be nice
to see a complete testcase with a pessimization, maybe you can file
a bugreport about this?

Thanks,
Richard


Re: g++ 4.3, troubles with C++ indexing idioms

2007-07-19 Thread tbp

On 7/19/07, Richard Guenther [EMAIL PROTECTED] wrote:

Of course, if any then the array indexing variant is fixed.  It would be nice
to see a complete testcase with a pessimization, maybe you can file
a bugreport about this?

By essence they're hard to trigger in small testcases (that's not
where they matter anyway), and by my own previous experience large
hairy bug reports get forgotten on the side of the road.
But i'll see if can make up something convincing, provided i got the
cause for the relative slowdown right.


Re: g++ 4.3, troubles with C++ indexing idioms

2007-07-19 Thread tbp

On 7/19/07, Dave Korn [EMAIL PROTECTED] wrote:

  Bogus codegen is the inevitable result of bogus code.  Garbage in, garbage
out.

  BTW, the const indexing is completely undefined too.

That's the kind of answer i'd get from gcc-help and at that point i'd
be none wiser because i already know that. I also know that up to
gcc-4.3-20070608 it was provably giving correct results faster than
any other variants. Being no language lawyer, that's the only metric i
consider.
It's no portability issue either because every compiler asks for a
specific work around; which is quite sad considering how mundane that
code is.


RE: g++ 4.3, troubles with C++ indexing idioms

2007-07-19 Thread Dave Korn
On 19 July 2007 20:38, tbp wrote:

 On 7/19/07, Dave Korn wrote:
   Bogus codegen is the inevitable result of bogus code.  Garbage in,
 garbage out. 
 
   BTW, the const indexing is completely undefined too.
 That's the kind of answer i'd get from gcc-help and at that point i'd
 be none wiser because i already know that. I also know that up to
 gcc-4.3-20070608 it was provably giving correct results faster than
 any other variants. Being no language lawyer, that's the only metric i
 consider.
 It's no portability issue either because every compiler asks for a
 specific work around; which is quite sad considering how mundane that
 code is.

  The workaround is to just do it right in the first place: union the three
single ints with the array of three.  That way you won't mislead the aliasing
and dataflow analysis.  If you really analysed the generated code in great
depth, you'd be able to classify all the differences between the codegen when
you do it the right way and the codegen when you do it your original way into
two categories: a) missed optimisations, and b) extra instructions that are
actually needed for technical correctness in aliasing situations.  Those in
category a) you can file PRs and they'll be fixed if possible; those in
category b) would actually be bad codegen if the compiler generated the same
code for the array/union case as it did for your pointer-arithmetic-abuse
case, although in some cases you might be able to use restricted pointers to
get back to the kind of codegen you're looking for.


cheers,
  DaveK
-- 
Can't think of a witty .sigline today