Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

2006-04-26 Thread mark
On Tue, Apr 25, 2006 at 11:07:17PM -0400, Ron Peacetree wrote:
 THROUGHPUT is better with DDR2 if and only if there is enough data
 to be fetched in a serial fashion from memory.
 LATENCY however is dependent on the base clock rate of the RAM
 involved.  So PC3200, 200MHz x2, is going to actually perform better
 than PC2-5400, 166MHz x4, for almost any memory access pattern
 except those that are highly sequential.

I had forgotten about this. Still, it's not quite as simple as you say.

DDR2 has higher latency; however, it has a greater upper limit, and
when run at the same clock speed (200 MHz for 200 MHz), it is not
going to perform worse. Add in double the prefetching capability, and
what you get is that most benchmarks show DDR2-5400 as being slightly
faster than DDR-3200.

AMD is switching to DDR2, and I believe that, even after making such a
big deal about latency and why they wouldn't switch to DDR2, they are
now saying that their on-chip memory controller will be able to access
DDR2 memory (when they support it soon) faster than Intel can, since
Intel does not have an on-chip memory controller.

You said that DB accesses are random. I'm not so sure. In PostgreSQL,
are not the individual pages often scanned sequentially, especially
because all records are variable length? You don't think PostgreSQL
will regularly read 32 bytes (8 bytes x 4) at a time, in sequence?
Whether for table pages, or index pages - I'm not seeing why the
accesses wouldn't be sequential. You believe PostgreSQL will access
the table pages and index pages randomly on a per-byte basis? What
is the minimum PostgreSQL record size again? Isn't it 32 bytes or
over? :-)

I wish my systems were running the same OS, and I'd run a test for
you. Alas, I don't think comparing Windows to Linux would be valuable.

 A minor point to be noted in addition here is that most DB servers
 under load are limited by their physical IO subsystem, their HDs,
 and not the speed of their RAM.

It seems like a pretty major point to me. :-)

It's why Opteron with RAID kicks ass over HyperTransport.

 All of the above comments about the relative performance of
 different RAM types become insignificant when performance is gated
 by the HD subsystem.

Yes.

Luckily - we don't all have terabyte databases... :-)

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/




Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

2006-04-26 Thread Ron Peacetree
I'm posting this to the entire performance list in the hopes that it will be 
generally useful.
=r

-Original Message-
From: [EMAIL PROTECTED]
Sent: Apr 26, 2006 3:25 AM
To: Ron Peacetree [EMAIL PROTECTED]
Subject: Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

Hi Ron:

As a result of your post on the matter, I've been redoing some of my
online research on this subject, to see whether I do have one or more
things wrong.

I'm always in favor of independent investigation to find the truth. :-)


You say:

 THROUGHPUT is better with DDR2 if and only if there is enough data
 to be fetched in a serial fashion from memory.
...
 So PC3200, 200MHz x2, is going to actually perform better than
 PC2-5400, 166MHz x4, for almost any memory access pattern except
 those that are highly sequential.
...
 For the mostly random memory access patterns that comprise many DB
 applications, the base latency of the RAM involved is going to
 matter more than the peak throughput AKA the bandwidth of that RAM.

I'm trying to understand right now - why does DDR2 require data to be
fetched in a serial fashion, in order for it to maximize bandwidth?

SDR transfers data on either the rising or falling edge of its clock cycle.

DDR transfers data on both the rising and falling edges of the base clock
signal, provided there is a contiguous chunk of 2+ datums to be transferred.

DDR2 basically has a second clock that cycles at 2x the rate of the base
clock, so we get 4 data transfers per base clock cycle, provided there is a
contiguous chunk of 4+ datums to be transferred.

Note also what happens when transferring the first datum after a lull period.
For purposes of example, let's pretend that we are talking about a base clock 
rate of 200MHz= 5ns.

The SDR still transfers data every 5ns no matter what.
The DDR transfers the 1st datum in 10ns and then assuming there are at least 2 
sequential datums to be transferred will transfer the 2nd and subsequent 
sequential pieces of data every 2.5ns.
The DDR2 transfers the 1st datum in 20ns and then assuming there are at least 4 
sequential datums to be transferred will transfer the 2nd and subsequent 
sequential pieces of data every 1.25ns.

Thus we can see that randomly accessing RAM degrades performance significantly 
for DDR and DDR2.   We can also see that the conditions for optimal RAM 
performance become more restrictive as we go from SDR to DDR to DDR2.
The reason DDR2 with a low base clock rate excelled at tasks like streaming 
multimedia and stank at things like small transaction OLTP DB applications is 
now apparent.

Factors like CPU prefetching and victim buffers can muddy this picture a bit.
Also, if the CPU's off die IO is slower than the RAM it is talking to, how fast 
that RAM is becomes unimportant.

The reasons AMD has held off from supporting DDR2 until now are:
1.  DDR is EOL.  JEDEC is not ratifying any DDR faster than 200x2 while DDR2 
standards as fast as 333x4 are likely to be ratified (note that Intel pretty 
much avoided DDR, leaving it to AMD, while DDR2 is Intel's main RAM technology. 
 Guess who has more pull with JEDEC?)

2.  DDR and DDR2 RAM with equal base clock rates are finally available, 
removing the biggest performance difference between DDR and DDR2.

3.  Due to the larger demand for DDR2, more of it is produced.  That in turn
has resulted in larger supplies of DDR2 than DDR, which in turn, especially
when combined with the factors above, has by now resulted in lower prices for
DDR2 than for DDR of the same or faster base clock rate.

Hope this is helpful,
Ron



Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

2006-04-26 Thread David Boreham




The reasons AMD has held off from supporting DDR2 until now are:
1.  DDR is EOL.  JEDEC is not ratifying any DDR faster than 200x2 while DDR2 
standards as fast as 333x4 are likely to be ratified (note that Intel pretty 
much avoided DDR, leaving it to AMD, while DDR2 is Intel's main RAM technology. 
 Guess who has more pull with JEDEC?)

 


DDR2 is to RDRAM as C# is to Java

;)





Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

2006-04-26 Thread William Yu

[EMAIL PROTECTED] wrote:


I have an Intel Pentium D 920, and an AMD X2 3800+. These are very
close in performance. The retail price difference is:

Intel Pentium D 920 is selling for $310 CDN
AMD X2 3800+ is selling for $347 CDN

Anybody who claims that Intel is 2X more expensive for the same
performance isn't considering all factors. No question at all - the
Opteron is good, and the Xeon isn't - but the original poster didn't
ask about Opteron or Xeon, did he? For the desktop lines - X2 is not
double Pentium D. Maybe 10%. Maybe not at all. Especially now that
Intel is dropping its prices due to overstock.


There's part of the equation you are missing here. This is a PostgreSQL 
mailing list which means we're usually talking about performance of just 
this specific server app. While in general there may not be that much of 
a % difference between the 2 chips, there's a huge gap in Postgres. For 
whatever reason, Postgres likes Opterons. Way more than Intel 
P4-architecture chips. (And it appears way more than IBM Power4 chips 
and a host of other chips also.)


Here's one of the many discussions we had about this issue last year:

http://qaix.com/postgresql-database-development/337-670-re-opteron-vs-xeon-was-what-to-do-with-6-disks-read.shtml

The exact reasons why Opteron runs PostgreSQL so much better than P4s,
we're not 100% sure of. We have guesses -- lower memory latency, lack of
a shared FSB, better 64-bit support, a 64-bit IOMMU, context-switch storms
on P4, a better dual-core implementation and so on. Perhaps it's a
combination of all the above factors, but somehow the general experience
people have had is that equivalently priced Opteron servers run PostgreSQL
2X faster than P4 servers as the baseline, and the gap increases as you
add more sockets and more cores.




Re: [PERFORM] slow deletes on pgsql 7.4

2006-04-26 Thread Junaili Lie
It was on my first email.
Here it is again:
MONSOON=# explain delete from scenario where id='1099';
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Index Scan using scenario_pkey on scenario  (cost=0.00..3.14 rows=1 width=6)
   Index Cond: (id = 1099::bigint)
(2 rows)

MONSOON=# explain delete from scenario where id='1023';
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Index Scan using scenario_pkey on scenario  (cost=0.00..3.14 rows=1 width=6)
   Index Cond: (id = 1023::bigint)
(2 rows)
Thanks,
J
On 4/25/06, Tom Lane [EMAIL PROTECTED] wrote:
Junaili Lie [EMAIL PROTECTED] writes:
 ie. delete from scenario where id='1023' is very fast, but delete from
 scenario where id='1099' is running forever.

What does EXPLAIN show for each of those cases?

regards, tom lane


Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

2006-04-26 Thread David Boreham


While in general there may not be that much of a % difference between 
the 2 chips,
there's a huge gap in Postgres. For whatever reason, Postgres likes 
Opterons.

Way more than Intel P4-architecture chips.

It isn't only Postgres. I work on a number of other server applications
that also run much faster on Opterons than the published benchmark
figures would suggest they should. They're all compiled with gcc4,
so possibly there's a compiler issue. I don't run Windows on any
of our Opteron boxes so I can't easily compare using the MS compiler.







Re: [PERFORM] Introducing a new linux readahead framework

2006-04-26 Thread Michael Stone
From my initial testing this is very promising for a postgres server. 
Benchmark-wise, a simple dd with an 8k blocksize gets ~200MB/s as 
compared to ~140MB/s on the same hardware without the patch. Also, that 
200MB/s seems to be unaffected by the dd blocksize, whereas without the 
patch a 512k blocksize would get ~100MB/s. I'm now watching to see how 
it does over a couple of days on real-world workloads. 
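
(For anyone who wants to reproduce this kind of quick check, here is a
minimal sketch of the dd test described above; the file name is a
placeholder, and the file should be larger than RAM so the read is not
served from cache.)

dd if=/path/to/bigfile of=/dev/null bs=8k      # 8k blocksize read
dd if=/path/to/bigfile of=/dev/null bs=512k    # same data, 512k blocksize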


Mike Stone



Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

2006-04-26 Thread Ron Peacetree
Mea Culpa.  There is a mistake in my example for SDR vs DDR vs DDR2.
This is what I get for posting before my morning coffee.

The base latency for all of the memory types is that of the base clock rate; 
200MHz= 5ns in my given examples.

I double factored, making DDR and DDR2 worse than they actually are.

Again, my apologies.
Ron

-Original Message-
From: Ron Peacetree [EMAIL PROTECTED]
Sent: Apr 26, 2006 8:40 AM
To: [EMAIL PROTECTED], pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

I'm posting this to the entire performance list in the hopes that it will be 
generally useful.
=r
snip

Note also what happens when transferring the first datum after a lull period.
For purposes of example, let's pretend that we are talking about a base clock 
rate of 200MHz= 5ns.

The SDR still transfers data every 5ns no matter what.
The DDR transfers the 1st datum in 10ns and then assuming there are at least 2 
sequential datums to be transferred will transfer the 2nd and subsequent 
sequential pieces of data every 2.5ns.
The DDR2 transfers the 1st datum in 20ns and then assuming there are at least 
4 sequential datums to be transferred will transfer the 2nd and subsequent 
sequential pieces of data every 1.25ns.

=5= ns to first transfer in all 3 cases.  Bad Ron.   No Biscuit!


Thus we can see that randomly accessing RAM degrades performance significantly 
for DDR and DDR2.   We can also see that the conditions for optimal RAM 
performance become more restrictive as we go from SDR to DDR to DDR2.
The reason DDR2 with a low base clock rate excelled at tasks like streaming 
multimedia and stank at things like small transaction OLTP DB applications is 
now apparent.

Factors like CPU prefetching and victim buffers can muddy this picture a bit.
Also, if the CPU's off die IO is slower than the RAM it is talking to, how 
fast that RAM is becomes unimportant.

These statements, and everything else I posted, are accurate.



Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

2006-04-26 Thread PFC


	Have a look at this Wikipedia page which outlines some differences  
between the AMD and Intel versions of 64-bit :


http://en.wikipedia.org/wiki/EM64T


It isn't only Postgres. I work on a number of other server applications
that also run much faster on Opterons than the published benchmark
figures would suggest they should. They're all compiled with gcc4,
so possibly there's a compiler issue. I don't run Windows on any
of our Opteron boxes so I can't easily compare using the MS compiler.





Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

2006-04-26 Thread Scott Marlowe
On Tue, 2006-04-25 at 18:55, Jim C. Nasby wrote:
 On Tue, Apr 25, 2006 at 01:33:38PM -0500, Scott Marlowe wrote:
  On Tue, 2006-04-25 at 13:14, Bill Moran wrote:
   I've been given the task of making some hardware recommendations for
   the next round of server purchases.  The machines to be purchased
   will be running FreeBSD  PostgreSQL.
   
   Where I'm stuck is in deciding whether we want to go with dual-core
   pentiums with 2M cache, or with HT pentiums with 8M cache.
  
  Given a choice between those two processors, I'd choose the AMD 64 x 2
  CPU.  It's a significantly better processor than either of the Intel
  choices.  And if you get the HT processor, you might as well turn off HT
  on a PostgreSQL machine.  I've yet to see it make postgresql run faster,
  but I've certainly seen HT make it run slower.
 
 Actually, believe it or not, a coworker just saw HT double the
 performance of pgbench on his desktop machine. Granted, not really a
 representative test case, but it still blew my mind. This was with a
 database that fit in his 1G of memory, and running windows XP. Both
 cases were newly minted pgbench databases with a scale of 40. Testing
 was 40 connections and 100 transactions. With HT he saw 47.6 TPS,
 without it was 21.1.
 
 I actually had IT put a w2k3 server on an HT box specifically so I
 could do more testing.

Just to clarify, this is PostgreSQL on Windows, right?

I wonder if the latest Linux kernel can do that well...  I'm guessing
that the kernel scheduler in Windows has had a lot more work put into
scheduling well on an HT architecture than the Linux kernel has.



Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

2006-04-26 Thread William Yu

David Boreham wrote:

It isn't only Postgres. I work on a number of other server applications
that also run much faster on Opterons than the published benchmark
figures would suggest they should. They're all compiled with gcc4,
so possibly there's a compiler issue. I don't run Windows on any
of our Opteron boxes so I can't easily compare using the MS compiler.



Maybe it's just a fact that the majority of x86 64-bit development for 
open source software happens on Opteron/A64 machines. 64-bit AMD 
machines were selling a good year before 64-bit Intel machines were 
available. And even after Intel EM64T was available, anybody in their 
right mind would have picked AMD machines over Intel due to 
cost/heat/performance. So you end up with 64-bit OSS being 
developed/optimized for Opterons, with the 10% running Intel EM64T 
handling compatibility issues.


Would be interesting to see a survey of what machines OSS developers use 
to write/test/optimize their code.




Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

2006-04-26 Thread Scott Marlowe
On Tue, 2006-04-25 at 20:17, [EMAIL PROTECTED] wrote:
 On Tue, Apr 25, 2006 at 08:54:40PM -0400, [EMAIL PROTECTED] wrote:
  I made the choice I describe based on a lot of research. I was going
  to go both Intel, until I noticed that the Intel prices were dropping
  fast. 30% price cut in 2 months. AMD didn't drop at all during the
  same time.
 
 Errr.. big mistake. That was going to be - I was going to go both AMD.
 
  There are plenty of reasons to choose one over the other. Generally
  the AMD comes out on top. It is *not* 2X though. Anybody who claims
  this is being highly selective about which benchmarks they consider.
 
 I have an Intel Pentium D 920, and an AMD X2 3800+. These are very
 close in performance. The retail price difference is:
 
 Intel Pentium D 920 is selling for $310 CDN
 AMD X2 3800+ is selling for $347 CDN

Let me be clear.  The performance difference between those boxes running
the latest first person shooter is not what I was alluding to in my
first post.  While the price of the Intel's may have dropped, there's a
huge difference (often 2x or more) in performance when running
PostgreSQL on otherwise similar chips from Intel and AMD.

Note that my workstation at work, my workstation at home, and my laptop
are all intel based machines.  They work fine for that.  But if I needed
to build a big fast oracle or postgresql server, I'd almost certainly go
with the AMD, especially so if I needed 2 cores, where the performance
difference becomes greater and greater.

You'd likely find that for PostgreSQL, the slowest dual-core AMDs out
there would still beat the fastest Intel dual cores, because of the issue
we've seen on the list with context switching storms.

If you haven't actually run a heavy benchmark of postgresql on the two
architectures, please don't make your decision based on other
benchmarks.  Since you've got both a D920 and an X2 3800, that'd be a
great place to start.  Mock up some benchmark with a couple dozen
threads hitting the server at once and see if the Intel can keep up.  It
should do OK, but not great.  If you can get your hands on a dual
dual-core setup for either, you should really start to see the advantage
going to AMD, and by the time you get to a quad dual core setup, it
won't even be a contest.
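
(For illustration, a minimal pgbench sketch of the kind of test Scott
describes, mirroring the scale-40, 40-client, 100-transaction run
mentioned earlier in the thread; the database name is a placeholder.
Run it on both boxes and compare the reported TPS.)

pgbench -i -s 40 benchdb        # initialize a scale-factor-40 pgbench database
pgbench -c 40 -t 100 benchdb    # 40 concurrent clients, 100 transactions each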



Re: [PERFORM] Introducing a new linux readahead framework

2006-04-26 Thread Steve Poe
I found an average 14% improvement using Pg 7.4.11 with odbc-bench as my
test bed and Wu's kernel patch. I have not tried version 8.x yet.

Thanks Wu.  

Steve Poe

Using PostgreSQL 7.4.11, on a dual Opteron with 4GB of RAM.

On Fri, 2006-04-21 at 09:38 +0800, Wu Fengguang wrote:
 Greetings,
 
 I'd like to introduce a new readahead framework for the linux kernel:
 http://www.ussg.iu.edu/hypermail/linux/kernel/0603.2/1021.html
 
 HOW IT WORKS
 
 In adaptive readahead, the context based method may be of particular
 interest to postgresql users. It works by peeking into the file cache
 and checking whether there are any history pages present or accessed. In this
 way it can detect almost all forms of sequential / semi-sequential read
 patterns, e.g.
   - parallel / interleaved sequential scans on one file
   - sequential reads across file open/close
   - mixed sequential / random accesses
   - sparse / skimming sequential read
 
 It also has methods to detect some less common cases:
   - reading backward
   - seeking all over reading N pages
 
 WAYS TO BENEFIT FROM IT
 
 As we know, postgresql relies on the kernel to do proper readahead.
 The adaptive readahead might help performance in the following cases:
   - concurrent sequential scans
   - sequential scan on a fragmented table
 (some DBs suffer from this problem, not sure for pgsql)
   - index scan with clustered matches
   - index scan on majority rows (in case the planner goes wrong)
 
 TUNABLE PARAMETERS
 
 There are two parameters which are described in this email:
 http://www.ussg.iu.edu/hypermail/linux/kernel/0603.2/1024.html
 
 Here are some more targeted guidelines for postgresql users:
 
 - /proc/sys/vm/readahead_ratio
 Since most DB servers have plenty of memory, the danger of readahead
 thrashing is near zero. In this case, you can set readahead_ratio to
 100 (or even 200), which helps the readahead window to scale up rapidly.
 
 - /proc/sys/vm/readahead_hit_rate
 Sparse sequential reads are read patterns like {0, 2, 4, 5, 8, 11, ...}.
 In this case we might prefer to do readahead to get good I/O performance
 with the overhead of some useless pages. But if you prefer not to do so,
 setting readahead_hit_rate to 1 will disable this feature.
 
 - /sys/block/sdX/queue/read_ahead_kb
 Set it to a large value (e.g. 4096) as you used to do.
 RAID users might want to use a bigger number.
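
(Putting the three settings above together as shell commands is a minimal
sketch only; the device name and the values are illustrative, taken from
the guidelines quoted above.)

echo 100 > /proc/sys/vm/readahead_ratio          # let the readahead window scale up quickly
echo 1 > /proc/sys/vm/readahead_hit_rate         # only if you want sparse-read readahead disabled
echo 4096 > /sys/block/sda/queue/read_ahead_kb   # 4MB device readahead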
 
 TRYING IT OUT
 
 The latest patch for stable kernels can be downloaded here:
 http://www.vanheusden.com/ara/
 
 Before compiling, make sure that the following options are enabled:
 Processor type and features - Adaptive file readahead
 Processor type and features -   Readahead debug and accounting
 
 HELPING AND CONTRIBUTING
 
 The patch is open to fine-tuning advices :)
 Comments and benchmarking results are highly appreciated.
 
 Thanks,
 Wu
 


Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

2006-04-26 Thread Jim C. Nasby
On Wed, Apr 26, 2006 at 10:27:18AM -0500, Scott Marlowe wrote:
 If you haven't actually run a heavy benchmark of postgresql on the two
 architectures, please don't make your decision based on other
 benchmarks.  Since you've got both a D920 and an X2 3800, that'd be a
 great place to start.  Mock up some benchmark with a couple dozen
 threads hitting the server at once and see if the Intel can keep up.  It

Or better yet, use dbt* or even pgbench so others can reproduce...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

2006-04-26 Thread Jim C. Nasby
On Tue, Apr 25, 2006 at 11:07:17PM -0400, Ron Peacetree wrote:
 A minor point to be noted in addition here is that most DB servers under load 
 are limited by their physical IO subsystem, their HDs, and not the speed of 
 their RAM.

I think if that were the only consideration we wouldn't be seeing such a
dramatic difference between AMD and Intel though. Even in a disk-bound
server, caching is going to have a tremendous impact, and that's
essentially entirely bound by memory bandwidth and latency.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

2006-04-26 Thread Bruce Momjian
Jim C. Nasby wrote:
 On Wed, Apr 26, 2006 at 10:27:18AM -0500, Scott Marlowe wrote:
  If you haven't actually run a heavy benchmark of postgresql on the two
  architectures, please don't make your decision based on other
  benchmarks.  Since you've got both a D920 and an X2 3800, that'd be a
  great place to start.  Mock up some benchmark with a couple dozen
  threads hitting the server at once and see if the Intel can keep up.  It
 
 Or better yet, use dbt* or even pgbench so others can reproduce...

For why Opterons are superior to Intel for PostgreSQL, see:

http://techreport.com/reviews/2005q2/opteron-x75/index.x?pg=2

Section "MESI-MESI-MOESI Banana-fana".  Specifically, this part about
the Intel implementation:

The processor with the Invalid data in its cache (CPU 0, let's say)
might then wish to modify that chunk of data, but it could not do so
while the only valid copy of the data is in the cache of the other
processor (CPU 1). Instead, CPU 0 would have to wait until CPU 1 wrote
the modified data back to main memory before proceeding, and that takes
time, bus bandwidth, and memory bandwidth. This is the great drawback of
MESI.

AMD transfers the dirty cache line directly from cpu to cpu.  I can
imagine that helping our test-and-set shared memory usage quite a bit.

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PERFORM] Introducing a new linux readahead framework

2006-04-26 Thread Jim C. Nasby
(including bizgres-general)

Has anyone done any testing on bizgres? It's got some patches that
eliminate a lot of IO bottlenecks, so it might present even larger
gains.

On Wed, Apr 26, 2006 at 03:08:59PM -0500, Steve Poe wrote:
 I found an average 14% improvement Using Pg 7.4.11 with odbc-bench as my
 test bed with Wu's kernel patch. I have not tried version 8.x yet.
 
 Thanks Wu.  
 
 Steve Poe
 
 Using Postgresql 7.4.11, on an dual Opteron with 4GB
 
 On Fri, 2006-04-21 at 09:38 +0800, Wu Fengguang wrote:
 snip

-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

2006-04-26 Thread Jim C. Nasby
On Wed, Apr 26, 2006 at 06:16:46PM -0400, Bruce Momjian wrote:
 Jim C. Nasby wrote:
  On Wed, Apr 26, 2006 at 10:27:18AM -0500, Scott Marlowe wrote:
   If you haven't actually run a heavy benchmark of postgresql on the two
   architectures, please don't make your decision based on other
   benchmarks.  Since you've got both a D920 and an X2 3800, that'd be a
   great place to start.  Mock up some benchmark with a couple dozen
   threads hitting the server at once and see if the Intel can keep up.  It
  
  Or better yet, use dbt* or even pgbench so others can reproduce...
 
 For why Opterons are superior to Intel for PostgreSQL, see:
 
   http://techreport.com/reviews/2005q2/opteron-x75/index.x?pg=2
 
 snip
 
 AMD transfers the dirty cache line directly from cpu to cpu.  I can
 imagine that helping our test-and-set shared memory usage quite a bit.

Wasn't the whole point of test-and-set that it's the recommended way to
do lightweight spinlocks according to AMD/Intel? You'd think they'd have
a way to make that performant on multiple CPUs (though if it's relying
on possibly modifying an underlying data page I can't really think of
how to do that without snaking through the cache...)
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] WAL logging of SELECT ... INTO command

2006-04-26 Thread Bruce Momjian

Backpatched to 8.0.X and 8.1.X.

---

Kris Jurka wrote:
 
 
 On Fri, 24 Mar 2006, Jim C. Nasby wrote:
 
  On Wed, Mar 22, 2006 at 02:37:28PM -0500, Kris Jurka wrote:
 
  On Wed, 22 Mar 2006, Jim C. Nasby wrote:
 
  Ok, I saw disk activity on the base directory and assumed it was pg_xlog
  stuff. Turns out that both SELECT INTO and CREATE TABLE AS ignore
  default_tablespace and create the new tables in the base directory. I'm
  guessing that's a bug... (this is on 8.1.2, btw).
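
(A minimal sketch of the behaviour being reported, assuming a tablespace
named someplace already exists; on an unpatched 8.1.2, both new tables
land in the default location rather than in someplace.)

psql -d testdb <<'SQL'
SET default_tablespace = someplace;
CREATE TABLE t1 AS SELECT 1 AS x;   -- CREATE TABLE AS ignores default_tablespace here
SELECT 2 AS x INTO t2;              -- as does SELECT ... INTO
SQL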
 
  This has been fixed in CVS HEAD as part of a patch to allow additional
  options to CREATE TABLE AS.
 
  http://archives.postgresql.org/pgsql-patches/2006-02/msg00211.php
 
  I'll argue that the current behavior is still a bug and should be fixed.
  Would it be difficult to patch 8.1 (and 8.0 if there were tablespaces
  then...) to honor default_tablespace?
 
 Here are patches that fix this for 8.0 and 8.1.
 
 Kris Jurka

Content-Description: 

[ Attachment, skipping... ]

Content-Description: 

[ Attachment, skipping... ]

 

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [Bizgres-general] [PERFORM] Introducing a new linux

2006-04-26 Thread Luke Lonergan

Jim,

I'm thinking about it; we're already using a fixed read-ahead of 16MB using
blockdev on the stock Redhat 2.6.9 kernel. It would be nice to not have to
set this, so we may try it.
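
(For reference, a minimal sketch of the fixed read-ahead setting Luke
describes; the device name is a placeholder, and blockdev counts in
512-byte sectors.)

blockdev --setra 32768 /dev/sdX   # 32768 x 512-byte sectors = 16MB read-ahead
blockdev --getra /dev/sdX         # verify the current setting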

- Luke 


On 4/26/06 3:28 PM, Jim C. Nasby [EMAIL PROTECTED] wrote:

(including bizgres-general)

Has anyone done any testing on bizgres? It's got some patches that
eliminate a lot of IO bottlenecks, so it might present even larger
gains.

On Wed, Apr 26, 2006 at 03:08:59PM -0500, Steve Poe wrote:
snip









Re: [PERFORM] Slow deletes in 8.1 when FKs are involved

2006-04-26 Thread Stef T

Hey there Will,
I would assume that, perhaps, just perhaps, the FK doesn't have an
index on the field on both sides, so you're seeing a potential sequential
scan happening. Can you fling up an explain analyze for everyone, please?
Anything more will be merely shooting in the dark, and, tracer bullets
aside, I have heard that -that- can be dangerous ;p
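
(By way of illustration only -- the table and column names here are
hypothetical -- the usual fix when the referencing side is unindexed looks
something like this, with the EXPLAIN ANALYZE wrapped in a transaction so
the delete can be rolled back:)

psql -d mydb <<'SQL'
CREATE INDEX child_parent_id_idx ON child (parent_id);  -- index the referencing (FK) column
ANALYZE child;
BEGIN;
EXPLAIN ANALYZE DELETE FROM parent WHERE id = 6;
ROLLBACK;
SQL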

Regards
Stef

Will Reese wrote:
 I'm preparing for an upgrade from PostgreSQL 7.4.5 to 8.1.3, and I
 noticed a potential performance issue.

 I have two servers, a dual proc Dell with raid 5 running PostgreSQL
 7.4, and a quad proc Dell with a storage array running PostgreSQL 8.1.
 Both servers have identical postgresql.conf settings and were restored
 from the same 7.4 backup. Almost everything is faster on the 8.1
 server (mostly due to hardware), except one thing...deletes from
 tables with foreign keys.

 I have table A with around 100,000 rows, that has foreign keys to
 around 50 other tables.  Some of these other tables (table B, for
 example) have around 10 million rows.

 On the 7.4 server, I can delete a single row from a table A in well
 under a second (as expected).  On the 8.1 server, it takes over a
 minute to delete.  I tried all the usual stuff, recreating indexes,
 vacuum analyzing, explain analyze.  Everything is identical between
 the systems.  If I hit ctrl-c while the delete was running on 8.1, I
 repeatedly got the following message...

 db=# delete from A where ID in ('6');
 Cancel request sent
 ERROR:  canceling statement due to user request
 CONTEXT:  SQL statement SELECT 1 FROM ONLY public.B x WHERE
 A_ID = $1 FOR SHARE OF x

 It looks to me like the SELECT ... FOR SHARE functionality in 8.1 is
 the culprit. Has anyone else run into this issue?


 Will Reese -- http://blog.rezra.com






Re: [Bizgres-general] [PERFORM] Introducing a new linux

2006-04-26 Thread Michael Stone

On Wed, Apr 26, 2006 at 04:33:40PM -0700, Luke Lonergan wrote:

I'm thinking about it; we're already using a fixed read-ahead of 16MB using
blockdev on the stock Redhat 2.6.9 kernel. It would be nice to not have to
set this, so we may try it.


FWIW, I never saw much performance difference from doing that. Wu's 
patch, OTOH, gave a big boost.


Mike Stone



[PERFORM] Running on an NFS Mounted Directory

2006-04-26 Thread Ketema Harris



I was wondering if there were any performance issues with having a data directory that was an nfs mounted drive? Say like a SAN or NAS device? Has anyone done this before?






Re: [PERFORM] Slow deletes in 8.1 when FKs are involved

2006-04-26 Thread Will Reese

Stef:

There is already a post explaining the solution.  All the proper  
indexes were there, and it works great on 7.4.  The problem lies with  
leftover 7.4 RI triggers being carried over to an 8.1 database.  The  
solution is to drop the triggers and add the constraint.  Hopefully  
this will not cause as many locking issues with FKs on 8.1 as it did  
in 7.4 (which is why one of the RI triggers was removed in the first  
place).
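
(A rough sketch of that fix; the trigger, table, and column names below
are placeholders -- the real trigger names come from \d or pg_trigger,
and there are usually several RI triggers per constraint, on both the
referencing and referenced tables.)

psql -d mydb <<'SQL'
BEGIN;
DROP TRIGGER "RI_ConstraintTrigger_1234567" ON b;  -- repeat for each leftover 7.4-era RI trigger
ALTER TABLE b ADD CONSTRAINT b_a_id_fkey
  FOREIGN KEY (a_id) REFERENCES a (id);
COMMIT;
SQL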


Will Reese  -- http://blog.rezra.com

On Apr 26, 2006, at 6:43 PM, Stef T wrote:

snip




Re: [PERFORM] Large (8M) cache vs. dual-core CPUs

2006-04-26 Thread mark
On Wed, Apr 26, 2006 at 05:37:31PM -0500, Jim C. Nasby wrote:
 On Wed, Apr 26, 2006 at 06:16:46PM -0400, Bruce Momjian wrote:
  AMD transfers the dirty cache line directly from cpu to cpu.  I can
  imaging that helping our test-and-set shared memory usage quite a bit.
 Wasn't the whole point of test-and-set that it's the recommended way to
 do lightweight spinlocks according to AMD/Intel? You'd think they'd have
 a way to make that performant on multiple CPUs (though if it's relying
 on possibly modifying an underlying data page I can't really think of
 how to do that without snaking through the cache...)

It's expensive no matter what. One method might be less expensive than
another. :-)

AMD definitely seems to have things right for lowest absolute latency.
2X still sounds like an extreme case - but until I've actually tried a
very large, or thread intensive PostgreSQL db on both, I probably
shouldn't doubt the work of others too much. :-)

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/




Re: [PERFORM] Running on an NFS Mounted Directory

2006-04-26 Thread Steve Wampler
On Wed, Apr 26, 2006 at 10:06:58PM -0400, Ketema Harris wrote:
 I was wondering if there were any performance issues with having a data
 directory that was an nfs mounted drive?  Say like a SAN or NAS device? Has
 anyone done this before?
 
My understanding is that NFS is pretty poor in performance in general,
so I would expect it to be particularly bad for a DB.  You might run
some (non-DB) performance tests to get a feel for how bad it might be.
(Someone once told me that NFS topped out at around 12MB/s, but I don't
know if that's really true [they were trying to sell a competitive
networked filesystem]).

In any event, you're at least limited by ethernet speeds, if not more.
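
(For example, a crude non-DB sequential-throughput check of the NFS mount;
the paths are placeholders, and the test file should be larger than RAM so
the read-back is not served from cache.)

dd if=/dev/zero of=/mnt/nfs/ddtest bs=8k count=262144   # ~2GB sequential write
dd if=/mnt/nfs/ddtest of=/dev/null bs=8k                # read it back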

-- 
Steve Wampler -- [EMAIL PROTECTED]
The gods that smiled on your birth are now laughing out loud.



Re: [PERFORM] Running on an NFS Mounted Directory

2006-04-26 Thread Jim C. Nasby
On Wed, Apr 26, 2006 at 07:35:42PM -0700, Steve Wampler wrote:
 On Wed, Apr 26, 2006 at 10:06:58PM -0400, Ketema Harris wrote:
  I was wondering if there were any performance issues with having a data
  directory that was an nfs mounted drive?  Say like a SAN or NAS device? Has
  anyone done this before?
  
 My understanding is that NFS is pretty poor in performance in general,
 so I would expect it to be particularly bad for a DB.  You might run
 some (non-DB) performance tests to get a feel for how bad it might me.
 (Someone once told me that NFS topped out at around 12MB/s, but I don't
 know if that's really true [they were trying to sell a competitive
 networked filesystem]).
 
 In any event, you're at least limited by ethernet speeds, if not more.

More importantly, the latency involved will kill commit performance. If
it doesn't then it's likely that fsync isn't being obeyed, which means 0
data integrity.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] Running on an NFS Mounted Directory

2006-04-26 Thread Dan Gorman

We have gotten very good performance from NetApp and postgres 7.4.11.

I was able to push about 100MB/s over gigE, but that was limited by
our NetApp.

DAS will generally always be faster, but if, for example, you have 2
local disks vs. 100 NFS-mounted disks, NFS will be faster.

NFS is very reliable, and I would stay away from iSCSI.



Regards,
Dan Gorman

On Apr 26, 2006, at 7:35 PM, Steve Wampler wrote:


snip



