At 10:55 AM 12/15/2006, Merlin Moncure wrote:
On 12/15/06, Ron <[EMAIL PROTECTED]> wrote:

There are many instances of x86-compatible code that get
30-40% speedups just because they get access to 16 rather than 8 GPRs
when recompiled for x86-64.

...We benchmarked PostgreSQL internally here and found it to be fastest in 32 bit mode running on a 64 bit platform -- this was on a quad Opteron 870 running our specific software stack; your results might be different, of course.

On AMD Kx's, you probably will get best performance in 64b mode (so you get all those extra registers and other stuff) while using 32b pointers (to keep code size and memory footprint down).
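To make the memory footprint point concrete, here is a rough sketch of how pointer width changes the size of a small pointer-heavy structure. The node layout (a 4-byte key plus one "next" pointer) and the helper name struct_size are hypothetical, chosen only for illustration, and the calculation assumes each field is aligned to its own size, as typical C ABIs do for these scalar types.

```python
def struct_size(field_sizes):
    """Size in bytes of a C-style struct whose fields are naturally aligned.

    Assumption: each field's alignment equals its size, and the struct is
    padded at the end to a multiple of its largest field alignment.
    """
    offset = 0
    max_align = 1
    for size in field_sizes:
        align = size
        # Round the current offset up to the field's alignment.
        offset = (offset + align - 1) // align * align
        offset += size
        max_align = max(max_align, align)
    # Tail padding so that arrays of the struct stay aligned.
    return (offset + max_align - 1) // max_align * max_align


KEY_BYTES = 4  # hypothetical 32-bit key stored in each node

node_64b_ptr = struct_size([KEY_BYTES, 8])  # key + 8-byte pointer
node_32b_ptr = struct_size([KEY_BYTES, 4])  # key + 4-byte pointer

print(node_64b_ptr, node_32b_ptr)
```

Under these assumptions the node is 16 bytes with 64b pointers and 8 bytes with 32b pointers, so twice as many nodes fit in each cache line. Real structures mix in more non-pointer data, so the actual savings are usually smaller.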

On Intel C2's, things are more complicated since Intel's x86-64 implementation and memory IO architecture are both different enough from AMD's to have caused some consternation on occasion when Intel's 64b performance did not match AMD's.



The big arch specific differences in Kx's are in 64b mode, not 32b.

I don't think so.  IMO all the processor specific instruction sets were
hacks of 32 bit mode to optimize specific tasks.  Except for certain
things these instructions are rarely, if ever, used in 64 bit mode,
especially in integer math (i.e. database binaries).  Since Intel's and
AMD's 64 bit modes are virtually identical I submit that -march is not
really important anymore except for very, very specific (but
important) cases like spinlocks.

Take a good look at the processor specific manuals and the x86-64 benches around the net. The evidence outside the DBMS domain is pretty solidly in contrast to your statement and position. Admittedly, DBMS are not web servers or FPS games or ... That's why we need to do our own rigorous study of the subject.


This thread is about how much architecture dependent binaries can beat standard ones. I say they don't, by very much at all, and with the specific exception of Daniel's benchmarking the results posted to this list bear that out.
...and IMHO the issue is still very much undecided given that we don't have enough properly obtained and documented evidence.

ATM, the most we can say is that in a number of systems with modest physical IO subsystems that are not running Gentoo Linux we have not been able to duplicate the results. (We've also gotten some interesting results like yours suggesting the arch specific optimizations are bad for pg performance in your environment.)

In the process, questions have been asked and issues raised regarding both the tools involved and the proper way to use them.

We really do need to have someone other than Daniel duplicate his Gentoo environment and independently try to duplicate his results.
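As one possible way to document such a comparison, the sketch below summarizes repeated benchmark runs of two builds and reports how large the difference is relative to run-to-run noise, using Welch's t statistic. The function name compare_builds and the idea of feeding it transactions-per-second samples are assumptions for illustration; nothing here comes from Daniel's setup, and the numbers in the usage lines are placeholders, not measurements.

```python
import math
import statistics


def compare_builds(baseline_runs, tuned_runs):
    """Compare two lists of benchmark scores (e.g. tps from repeated runs).

    Returns (relative_gain, welch_t), where relative_gain is the change in
    mean score of the tuned build relative to the baseline, and welch_t is
    Welch's t statistic for the difference of the two means.
    """
    mean_base = statistics.mean(baseline_runs)
    mean_tuned = statistics.mean(tuned_runs)
    # Sample variances; each list needs at least two runs.
    var_base = statistics.variance(baseline_runs)
    var_tuned = statistics.variance(tuned_runs)
    std_err = math.sqrt(var_base / len(baseline_runs)
                        + var_tuned / len(tuned_runs))
    relative_gain = (mean_tuned - mean_base) / mean_base
    welch_t = (mean_tuned - mean_base) / std_err if std_err > 0 else 0.0
    return relative_gain, welch_t


# Placeholder scores only, NOT real results from any system.
generic_build = [100.0, 102.0, 98.0, 101.0, 99.0]
march_build = [100.0, 101.0, 99.0, 102.0, 98.0]

gain, t_stat = compare_builds(generic_build, march_build)
print(f"relative gain: {gain:+.1%}, Welch t: {t_stat:.2f}")
```

A small |t| means the two builds cannot be told apart from noise at that number of runs. Reporting the run count and spread alongside the means is what lets someone else check whether they have reproduced a result.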


...and let us bear in mind that this is not just intellectual curiosity. The less pg is mysterious, the better the odds pg will be adopted in any specific case.
Ron Peacetree
