Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
Tom Lane [EMAIL PROTECTED] writes:
> Jack Orenstein [EMAIL PROTECTED] writes:
>> I'm looking at one case in which two successive transactions, each
>> updating a handful of records, take 26 and 18 *seconds* (not msec) to
>> complete. These transactions normally complete in under 30 msec.
>
> I've seen installations in which it seemed that the normal query load
> was close to saturating the available disk bandwidth, and the extra
> load imposed by a background VACUUM just pushed the entire system's
> response time over a cliff. In an installation that has I/O capacity
> to spare, ...

Me, standing up and waving my hand... ;-)  This is my only killer
problem left. I always peg my disk usage at 100% when VACUUM runs, and
other queries are slow too. When not running VACUUM, my queries are
incredibly zippy fast, including joins and counts and GROUP BYs on
upwards of 100k rows at a time.

> I suspect that the same observations hold true for checkpoints, though
> I haven't specifically seen an installation suffering from that effect.

I don't see that. But I also set checkpoint_segments to about 50 on my
big server.

--
Vivek Khera, Ph.D.    Khera Communications, Inc.
Rockville, MD    http://www.khera.org/~vivek/
Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
Quoting Vivek Khera [EMAIL PROTECTED]:
> Tom Lane [EMAIL PROTECTED] writes:
>> Jack Orenstein [EMAIL PROTECTED] writes:
>>> I'm looking at one case in which two successive transactions, each
>>> updating a handful of records, take 26 and 18 *seconds* (not msec)
>>> to complete. These transactions normally complete in under 30 msec.
>>
>> I've seen installations in which it seemed that the normal query load
>> was close to saturating the available disk bandwidth, and the extra
>> load imposed by a background VACUUM just pushed the entire system's
>> response time over a cliff. In an installation that has I/O capacity
>> to spare, ...
>>
>> I suspect that the same observations hold true for checkpoints, though
>> I haven't specifically seen an installation suffering from that effect.
>
> I don't see that. But I also set checkpoint_segments to about 50 on my
> big server.

But wouldn't that affect checkpoint frequency, not checkpoint cost?

Jack Orenstein
Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
On May 12, 2004, at 3:22 PM, [EMAIL PROTECTED] wrote:
>> I don't see that. But I also set checkpoint_segments to about 50 on
>> my big server.
>
> But wouldn't that affect checkpoint frequency, not checkpoint cost?

Seems reasonable. I suppose checkpointing doesn't cost as much disk I/O
as VACUUM does. My checkpoints are also on a separate RAID volume on a
separate RAID channel, so perhaps that gives me extra bandwidth to
perform the checkpoints.
Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
On Tue, 11 May 2004, Paul Tuckfield wrote:
> If you are having a write storm or bursty writes that's burying
> performance, a SCSI RAID controller with writeback cache will greatly
> improve the situation, but I do believe they run around $1-2k. If it's
> a write-specific problem, the cache matters more than the striping,
> except to say that write-specific perf problems should avoid RAID 5.

Actually, a single-channel MegaRAID 320-1 (single channel Ultra 320) is
only $421 at http://www.siliconmechanics.com/c248/u320-scsi.php

It works pretty well for me, having 6 months of a production server on
one with zero hiccups and very good performance. They have a
dual-channel Intel card for only $503, but I'm not at all familiar with
that card. The top of the line MegaRAID is the 320-4, which is only
$1240, which ain't bad for a four-channel RAID controller.
Battery-backed cache is an add-on, but I think it's only about $80 or so.
Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
Love that Froogle. It looks like a nice card. One thing I didn't get
straight is whether the cache is write-through or write-back. If the
original poster's problem is truly a burst-write problem (and not Linux
caching or virtual memory overcommitment), then write-back is key.

> On Tue, 11 May 2004, Paul Tuckfield wrote:
>> If you are having a write storm or bursty writes that's burying
>> performance, a SCSI RAID controller with writeback cache will greatly
>> improve the situation [...]
>
> Actually, a single-channel MegaRAID 320-1 (single channel Ultra 320) is
> only $421 at http://www.siliconmechanics.com/c248/u320-scsi.php
> [...]
> Battery-backed cache is an add-on, but I think it's only about $80 or so.
Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
-----Original Message-----
From: scott.marlowe
Sent: Tuesday, May 11, 2004 2:23 PM
Subject: Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints

> Actually, a single-channel MegaRAID 320-1 (single channel Ultra 320) is
> only $421 at http://www.siliconmechanics.com/c248/u320-scsi.php
>
> It works pretty well for me, having 6 months of a production server on
> one with zero hiccups and very good performance. [...]
> Battery-backed cache is an add-on, but I think it's only about $80 or so.

If you don't mind slumming on eBay :-) keep an eye out for PERC III
cards; they are Dell-branded LSI cards (PERC = PowerEdge RAID
Controller). There are dual-channel U320 models on there, and Dell
usually sells them with battery-backed cache. That's how I have
acquired all my high-end RAID cards.

Rob
Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
On Tue, 11 May 2004, Paul Tuckfield wrote:
> Love that Froogle. It looks like a nice card. One thing I didn't get
> straight is whether the cache is write-through or write-back. If the
> original poster's problem is truly a burst-write problem (and not
> Linux caching or virtual memory overcommitment), then write-back is key.

The MegaRAID can be configured either way. It defaults to write-back if
the battery-backed cache is present, I believe.
Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
> Jack Orenstein [EMAIL PROTECTED] writes:
>> I'm looking at one case in which two successive transactions, each
>> updating a handful of records, take 26 and 18 *seconds* (not msec) to
>> complete. These transactions normally complete in under 30 msec.
>
> ...
>
> None of this is necessarily going to fix matters for an installation
> that has no spare I/O capacity, though. And from the numbers you're
> quoting I fear you may be in that category. "Buy faster disks" may be
> the only answer ...

I had a computer once that had an out-of-the-box hard drive
configuration that provided horrible disk performance. I found a
tutorial at O'Reilly that explained how to use hdparm to dramatically
speed up disk performance on Linux. I've noticed on other computers
I've set up recently that hdparm seems to be used by default out of the
box to give good performance.

Maybe your computer is using all of its I/O capacity because it's using
PIO mode or some other non-optimal method of accessing the disk.

Just a suggestion; I hope it helps.

Matthew Nuzum
www.followers.net
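[A minimal sketch of the kind of checks such an hdparm tutorial covers.
The device name /dev/hda is a placeholder for your actual IDE disk, and
settings should be tested carefully, since unsupported modes can hang
some drives.]

    hdparm -i /dev/hda     # drive identification info, including supported DMA modes
    hdparm /dev/hda        # current settings; look for "using_dma = 0 (off)"
    hdparm -tT /dev/hda    # quick buffered and cached read throughput test

    # If DMA is off, enabling it (and 32-bit I/O) often makes a dramatic
    # difference compared with PIO mode:
    hdparm -d1 -c1 /dev/hda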
Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
Jan Wieck [EMAIL PROTECTED] writes:
> If we would combine the background writer and the checkpointer, ...
> which in fact is on my agenda of things to do ... then a checkpoint
> flush could actually be implemented as a temporary change in that
> activity that basically is done by not reevaluating the list of
> to-be-flushed blocks any more and switching to a constant amount of
> blocks flushed per cycle. When that list gets empty, the checkpoint
> flush is done, the checkpoint can complete and the background writer
> resumes normal business.

Sounds like a plan. I'll do it that way. However, we might want to have
different configuration settings controlling the write rate during
checkpoint and the rate during normal background writing --- what do
you think?

Also, presumably a shutdown checkpoint should just whomp out the data
without any delays. We can't afford to wait around and risk having init
decide we took too long.

>> None of this is necessarily going to fix matters for an installation
>> that has no spare I/O capacity, though.
>
> As a matter of fact, the background writer increases the overall I/O.
> It writes buffers that possibly get modified again before a checkpoint,
> or their replacement requires them to be finally written. So if there
> is no spare I/O bandwidth, it makes things worse.

Right, the trickle writes could be wasted effort.

regards, tom lane
Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
Matthew Nuzum wrote:
>> Jack Orenstein [EMAIL PROTECTED] writes:
>>> I'm looking at one case in which two successive transactions, each
>>> updating a handful of records, take 26 and 18 *seconds* (not msec)
>>> to complete. These transactions normally complete in under 30 msec.
>>
>> None of this is necessarily going to fix matters for an installation
>> that has no spare I/O capacity, though. And from the numbers you're
>> quoting I fear you may be in that category. "Buy faster disks" may be
>> the only answer ...
>
> I had a computer once that had an out-of-the-box hard drive
> configuration that provided horrible disk performance. I found a
> tutorial at O'Reilly that explained how to use hdparm to dramatically
> speed up disk performance on Linux. [...]
>
> Maybe your computer is using all of its I/O capacity because it's
> using PIO mode or some other non-optimal method of accessing the disk.

There's certainly some scope there. I have an SGI Octane whose SCSI 2
disks were set up by default with no write buffer and a CTQ depth of
zero :/  The IDE drivers in Linux may not be detecting your IDE chipset
correctly and stepping down; however unlikely, there may be something
odd going on, but you could check hdparm out. Ensure correct cables
too, and that they aren't crushed or twisted too badly. But I digress...

Assuming you're running with optimal schema and index design (i.e.
you're not doing extra work unnecessarily), and your backend has
better-than-default config options set up (plenty of tips around here),
then disk arrangement is critical to smoothing the ride.

Taking things to a relative extreme, we implemented a set-up with
issues similar sounding to yours. It was resolved by first optimising
everything but hardware, then finally optimising hardware. This served
us because it meant we squeezed as much as possible out of the
available hardware before finally throwing more at it, getting us the
best possible returns (plus further post-optimisation on the new
hardware).

First tip would be to take your pg_xlog and put it on another disk (and
channel). Next, if you're running a journalled fs, get that journal off
onto another disk (and channel). Finally, get as many disks as you can
for the data store and spread the load across spindles. You're aiming
here to distribute the contention and disk I/O more evenly to remove
the congestion. sar and iostat help out as part of the analysis.

You say you're using IDE, for which I'd highly recommend switching to
SCSI and multiple controllers, because IDE isn't great for lots of
other reasons. Obviously budgets count, and playing with SCSI certainly
limits that. We took a total of 8 disks across 2 SCSI 160 channels and
split up the drives into a number of software RAID arrays: RAID 1
mirrors for the OS, pg_xlog, data disk journal and swap, and the rest
became a RAID 5 array for the data. You could instead implement your
data disk as RAID 1+0 if you wanted more performance at the cost of
free space.

Anyway, it's certainly not the fastest config out there, but it made
all the difference to this particular application. In fact, we had so
much free I/O we recently installed another app on there (based on
MySQL, sorry) which runs concurrently, and itself runs 4 times faster
than it originally did... YMMV, just my 2p.

--
Rob Fielding
www.dsvr.co.uk    Development Designer Servers Ltd
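[A minimal sketch of the first tip above, moving pg_xlog onto a separate
disk, as it is usually done on PostgreSQL 7.x: a symlink from the data
directory to the other spindle. The paths are placeholders; the cluster
is assumed to live in /var/lib/pgsql/data and /mnt/disk2 is assumed to
be a filesystem on a different disk and channel.]

    # stop the postmaster before touching pg_xlog
    pg_ctl stop -D /var/lib/pgsql/data

    # create the new WAL location on the other disk
    mkdir /mnt/disk2/pg_xlog
    chown postgres:postgres /mnt/disk2/pg_xlog
    chmod 700 /mnt/disk2/pg_xlog

    # move the existing WAL segments and replace the directory with a symlink
    mv /var/lib/pgsql/data/pg_xlog/* /mnt/disk2/pg_xlog/
    rmdir /var/lib/pgsql/data/pg_xlog
    ln -s /mnt/disk2/pg_xlog /var/lib/pgsql/data/pg_xlog

    pg_ctl start -D /var/lib/pgsql/data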
Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
Quoting Rob Fielding [EMAIL PROTECTED]:
> Assuming you're running with optimal schema and index design (i.e.
> you're not doing extra work unnecessarily), and your backend has
> better-than-default config options set up (plenty of tips around
> here), then disk arrangement is critical to smoothing the ride.

The schema and queries are extremely simple. I've been experimenting
with config options. One possibility I'm looking into is whether
shared_buffers is too high, at 12000. We have some preliminary evidence
that setting it lower (1000) reduces the demand for I/O bandwidth to a
point where the spikes become almost tolerable.

> First tip would be to take your pg_xlog and put it on another disk
> (and channel).

That's on my list of things to try.

> Next, if you're running a journalled fs, get that journal off onto
> another disk (and channel). Finally, get as many disks for the data
> store and spread the load across spindles.

Dumb question: how do I spread the data across spindles? Do you have a
pointer to something I could read?

Jack Orenstein
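[One common way to spread the data across spindles, following the
software-RAID arrangement Rob describes, is to build an array from
several disks with mdadm and put the data directory on it. This is only
an illustrative sketch; the device names, mount point, and database
directory are placeholders for your own hardware.]

    # a 4-disk software RAID 5 array for the data store
    mdadm --create /dev/md0 --level=5 --raid-devices=4 \
        /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

    mke2fs -j /dev/md0        # ext3 filesystem on the array
    mkdir /mnt/pgdata
    mount /dev/md0 /mnt/pgdata

    # then initdb into /mnt/pgdata, or move the existing cluster there
    # and start the postmaster with -D /mnt/pgdata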
Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
The king of statistics in these cases is probably vmstat. One can drill
down on specific things from there, but first you should send some
vmstat output.

That reducing the cache reduces I/O suggests to me the OS might be
paging out shared buffers. This is indicated by activity in the si and
so columns of vmstat. Intentional disk activity by the application
(postgres) shows up in the bi and bo columns.

If you are having a write storm or bursty writes that's burying
performance, a SCSI RAID controller with writeback cache will greatly
improve the situation, but I do believe they run around $1-2k. If it's
a write-specific problem, the cache matters more than the striping,
except to say that write-specific perf problems should avoid RAID 5.

Please send the output of "vmstat 10" for about 10 minutes, spanning
good performance and bad performance.

On May 11, 2004, at 9:52 AM, [EMAIL PROTECTED] wrote:
> The schema and queries are extremely simple. I've been experimenting
> with config options. One possibility I'm looking into is whether
> shared_buffers is too high, at 12000. We have some preliminary
> evidence that setting it lower (1000) reduces the demand for I/O
> bandwidth to a point where the spikes become almost tolerable.
> [...]
> Dumb question: how do I spread the data across spindles? Do you have a
> pointer to something I could read?
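[A sketch of how one might capture the output being asked for: one
sample every 10 seconds for roughly 10 minutes, saved to a file. The
filename is arbitrary.]

    vmstat 10 60 > vmstat-checkpoint-spike.log

    # Columns to watch, per the discussion above:
    #   si / so  - swap-in / swap-out; nonzero values suggest the OS is
    #              paging, possibly pushing out shared buffers
    #   bi / bo  - blocks in / blocks out; actual disk I/O issued by
    #              processes such as the postgres backends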
[PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
My company is developing a PostgreSQL application. We're using 7.3.4
but will soon upgrade to 7.4.x. Our OS is RedHat 9. Our production
machines have 512 MB RAM and IDE disks. So far we've been using default
configuration settings, but I have started to examine performance and
to modify these settings.

Our typical transaction involves 5-10 SELECTs, INSERTs or UPDATEs
(usually about half SELECTs and the remainder a mixture of INSERTs and
UPDATEs). There are a few aggregation queries which need to scan an
entire table. We observed highly uneven performance for the small
transactions. A transaction usually runs in under 100 msec, but we
would see spikes as high as 40,000 msec. These spikes occurred
regularly, every 4-5 minutes, and I speculated that checkpointing might
be the issue.

I created a test case, based on a single table:

    create table test(
        id int not null,
        count int not null,
        filler varchar(200),
        primary key(id)
    )

I loaded a database with 1,000,000 rows, with the filler column always
filled with 200 characters. I then ran a test in which a random row was
selected, and the count column incremented. Each transaction contained
ten such updates. In this test, I set

    shared_buffers = 2000
    checkpoint_segments = 40
    checkpoint_timeout = 600
    wal_debug = 1

I set checkpoint_segments high because I wanted to see whether the
spikes correlated with checkpoints.

Most transactions completed in under 60 msec. Approximately every 10th
transaction, the time went up to 500-600 msec (which is puzzling, but
not my major concern). I did see a spike every 10 minutes, in which
transaction time goes up to 5000-8000 msec. The spikes were correlated
with checkpoint activity, occurring slightly before a log entry that
looks like this:

    2004-05-09 16:34:19 LOG:  INSERT @ 2/C2A0F628: prev 2/C2A0F5EC;
    xprev 0/0; xid 0: XLOG - checkpoint: redo 2/C2984D4C; undo 0/0;
    sui 36; xid 1369741; oid 6321782; online

Questions:

1. Can someone provide an overview of checkpoint processing, to help me
   understand the performance issues?

2. Is the spike due to the checkpoint process keeping the disk busy? Or
   is there some locking involved that blocks my application until the
   checkpoint completes?

3. The spikes are quite problematic for us. What can I do to minimize
   the impact of checkpointing on my application? I understand how
   checkpoint_segments and checkpoint_timeout determine when a
   checkpoint occurs; what can I do to lessen the impact of a
   checkpoint?

4. I understand that a background writer is being contemplated for 7.5.
   Will that replace or augment the checkpoint process? Any comments on
   how that work will apply to my problem would be appreciated.

I wouldn't mind seeing the average transaction time (without the
spikes) go up by, let's say, 10% in exchange for more uniform
performance. These spikes are a real problem.

Jack Orenstein
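[For concreteness, one transaction of the test workload described above
would look roughly like this: ten single-row updates of randomly chosen
ids. The database name and the literal id values are placeholders, and
the use of psql is illustrative; the actual test client is not shown in
the thread.]

    psql testdb <<'SQL'
    BEGIN;
    UPDATE test SET count = count + 1 WHERE id = 482113;
    UPDATE test SET count = count + 1 WHERE id = 90210;
    -- ... eight more single-row updates like the two above ...
    COMMIT;
    SQL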
Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
[EMAIL PROTECTED] wrote:
> 4. I understand that a background writer is being contemplated for
>    7.5. Will that replace or augment the checkpoint process? Any
>    comments on how that work will apply to my problem would be
>    appreciated.
>
> I wouldn't mind seeing the average transaction time (without the
> spikes) go up by, let's say, 10% in exchange for more uniform
> performance. These spikes are a real problem.

The background writer is designed to address your specific problem. We
will still checkpoint, but the spike should be greatly minimized.

--
  Bruce Momjian    http://candle.pha.pa.us
Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
Bruce Momjian wrote:
> The background writer is designed to address your specific problem. We
> will still checkpoint, but the spike should be greatly minimized.

Thanks. Do you know when 7.5 is expected to be released? Until then, is
a workaround known?

Also, are the delays I'm seeing out of the ordinary? I'm looking at one
case in which two successive transactions, each updating a handful of
records, take 26 and 18 *seconds* (not msec) to complete. These
transactions normally complete in under 30 msec.

Jack Orenstein
Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
Jack Orenstein wrote:
>> The background writer is designed to address your specific problem.
>> We will still checkpoint, but the spike should be greatly minimized.
>
> Thanks. Do you know when 7.5 is expected to be released?

3-6 months.

> Until then, is a workaround known? Also, are the delays I'm seeing out
> of the ordinary? I'm looking at one case in which two successive
> transactions, each updating a handful of records, take 26 and 18
> *seconds* (not msec) to complete. These transactions normally complete
> in under 30 msec.

Wow. Others might know the answer to that.

--
  Bruce Momjian    http://candle.pha.pa.us
Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints
Jack Orenstein [EMAIL PROTECTED] writes:
> I'm looking at one case in which two successive transactions, each
> updating a handful of records, take 26 and 18 *seconds* (not msec) to
> complete. These transactions normally complete in under 30 msec.

I've seen installations in which it seemed that the normal query load
was close to saturating the available disk bandwidth, and the extra
load imposed by a background VACUUM just pushed the entire system's
response time over a cliff. In an installation that has I/O capacity to
spare, a VACUUM doesn't really hurt foreground query response at all.

I suspect that the same observations hold true for checkpoints, though
I haven't specifically seen an installation suffering from that effect.

Already-committed changes for 7.5 include a background writer, which
basically will trickle out dirty pages between checkpoints, thereby
hopefully reducing the volume of I/O forced at a checkpoint. We have
also got code in place that throttles the rate of I/O requests during
VACUUM. It seems like it might be useful to similarly throttle the I/O
request rate during a CHECKPOINT, though I'm not sure if there'd be any
bad side effects from lengthening the elapsed time for a checkpoint.
(Jan, any thoughts here?)

None of this is necessarily going to fix matters for an installation
that has no spare I/O capacity, though. And from the numbers you're
quoting I fear you may be in that category. "Buy faster disks" may be
the only answer ...

regards, tom lane