Re: [PERFORM] Oddly slow queries
On 22.04.2008, at 17:25, Scott Marlowe wrote: On Tue, Apr 22, 2008 at 7:42 AM, Thomas Spreng <[EMAIL PROTECTED]> wrote: I think I'll upgrade PostgreSQL to the latest 8.3 version in the next few days anyway, along with a memory upgrade (from 1.5GB to 4GB) and a new 2x RAID-1 (instead of RAID-5) disk configuration. I hope that this has already a noticeable impact on the performance. Note that if you have a good RAID controller with battery backed cache and write back enabled, then you're probably better or / at least as well off using four disks in a RAID-10 than two separate RAID-1 sets (one for xlog and one for data). I just wanted to let you know that upgrading Postgres from 8.1 to 8.3, RAM from 1.5GB to 4GB and changing from a 3 disk RAID5 to 2x RAID1 (OS & WAL, Tablespace) led to a significant speed increase. What's especially important is that those randomly slow queries seem to be gone (for now). Thanks for all the help. Cheers, Tom -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Oddly slow queries
On Tue, Apr 22, 2008 at 7:42 AM, Thomas Spreng <[EMAIL PROTECTED]> wrote: > > I think I'll upgrade PostgreSQL to the latest 8.3 version in the next > few days anyway, along with a memory upgrade (from 1.5GB to 4GB) and a > new 2x RAID-1 (instead of RAID-5) disk configuration. I hope that this > has already a noticeable impact on the performance. Note that if you have a good RAID controller with battery backed cache and write back enabled, then you're probably better or / at least as well off using four disks in a RAID-10 than two separate RAID-1 sets (one for xlog and one for data). Test to see. I've had better performance in general with the RAID-10 setup. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Oddly slow queries
On 19.04.2008, at 19:11, Christopher Browne wrote: Martha Stewart called it a Good Thing when [EMAIL PROTECTED] (Thomas Spreng) wrote: On 16.04.2008, at 17:42, Chris Browne wrote: What I meant is if there are no INSERT's or UPDATE's going on it shouldn't affect SELECT queries, or am I wrong? Yes, that's right. (Caveat: VACUUM would be a form of update, in this context...) thanks for pointing that out, at the moment we don't run autovacuum but VACUUM ANALYZE VERBOSE twice a day. 2. On the other hand, if you're on 8.1 or so, you may be able to configure the Background Writer to incrementally flush checkpoint data earlier, and avoid the condition of 1. Mind you, you'd have to set BgWr to be pretty aggressive, based on the "10s periodicity" that you describe; that may not be a nice configuration to have all the time :-(. I've just seen that the daily vacuum tasks didn't run, apparently. The DB has almost doubled it's size since some days ago. I guess I'll have to VACUUM FULL (dump/restore might be faster, though) and check if that helps anything. If you're locking out users, then it's probably a better idea to use CLUSTER to reorganize the tables, as that simultaneously eliminates empty space on tables *and indices.* In contrast, after running VACUUM FULL, you may discover you need to reindex tables, because the reorganization of the *table* leads to bloating of the indexes. I don't VACUUM FULL but thanks for the hint. Pre-8.3 (I *think*), there's a transactional issue with CLUSTER where it doesn't fully follow MVCC, so that "dead, but still accessible, to certain transactions" tuples go away. That can cause surprises (e.g. - queries missing data) if applications are accessing the database concurrently with the CLUSTER. It's safe as long as the DBA can take over the database and block out applications. And at some point, the MVCC bug got fixed. I think I'll upgrade PostgreSQL to the latest 8.3 version in the next few days anyway, along with a memory upgrade (from 1.5GB to 4GB) and a new 2x RAID-1 (instead of RAID-5) disk configuration. I hope that this has already a noticeable impact on the performance. Note that you should check the output of a VACUUM VERBOSE run, and/or use the contrib function pgsstattuples() to check how sparse the storage usage is. There may only be a few tables that are behaving badly, and cleaning up a few tables will be a lot less intrusive than cleaning up the whole database. That surely is the case because about 90% of all data is stored in one big table and most of the rows are deleted and newly INSERT'ed every night. cheers, tom -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Oddly slow queries
that's correct, there are nightly (at least at the moment) processes that insert around 2-3 mio rows and delete about the same amount. I can see that those 'checkpoints are occurring too frequently' messages are only logged during that timeframe. Perhaps you should increase the quantity of xlog PG is allowed to write between each checkpoint (this is checkpoint_segments). Checkpointing every 10 seconds is going to slow down your inserts also, because of the need to fsync()'ing all those pages, not to mention nuking your IO-bound SELECTs. Increase it till it checkpoints every 5 minutes or something. I assume that it's normal that so many INSERT's and DELETE's cause the Well, also, do you use batch-processing or plpgsql or issue a huge mass of individual INSERTs via some script ? If you use a script, make sure that each INSERT doesn't have its own transaction (I think you know that since with a few millions of rows it would take forever... unless you can do 1 commits/s, in which case either you use 8.3 and have activated the "one fsync every N seconds" feature, or your battery backed up cache works, or your disk is lying)... If you use a script and the server is under heavy load you can : BEGIN Process N rows (use multi-values INSERT and DELETE WHERE .. IN (...)), or execute a prepared statement multiple times, or copy to temp table and process with SQL (usually much faster) COMMIT Sleep Wash, rinse, repeat background writer to choke a little bit. I guess I really need to adjust the processes to INSERT and DELETE rows in a slower pace if I want to do other queries during the same time. cheers, tom -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Oddly slow queries
On 19.04.2008, at 19:04, Scott Marlowe wrote: No, that will certainly NOT just affect write performance; if the postmaster is busy writing out checkpoints, that will block SELECT queries that are accessing whatever is being checkpointed. What I meant is if there are no INSERT's or UPDATE's going on it shouldn't affect SELECT queries, or am I wrong? But checkpoints only occur every 10 seconds because of a high insert / update rate. So, there ARE inserts and updates going on, and a lot of them, and they are blocking your selects when checkpoint hits. While adjusting your background writer might be called for, and might provide you with some relief, you REALLY need to find out what's pushing so much data into your db at once that it's causing a checkpoint storm. that's correct, there are nightly (at least at the moment) processes that insert around 2-3 mio rows and delete about the same amount. I can see that those 'checkpoints are occurring too frequently' messages are only logged during that timeframe. I assume that it's normal that so many INSERT's and DELETE's cause the background writer to choke a little bit. I guess I really need to adjust the processes to INSERT and DELETE rows in a slower pace if I want to do other queries during the same time. cheers, tom -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Oddly slow queries
Martha Stewart called it a Good Thing when [EMAIL PROTECTED] (Thomas Spreng) wrote: > On 16.04.2008, at 17:42, Chris Browne wrote: >> [EMAIL PROTECTED] (Thomas Spreng) writes: >>> On 16.04.2008, at 01:24, PFC wrote: > The queries in question (select's) occasionally take up to 5 mins > even if they take ~2-3 sec under "normal" conditions, there are no > sequencial scans done in those queries. There are not many users > connected (around 3, maybe) to this database usually since it's > still in a testing phase. I tried to hunt down the problem by > playing around with resource usage cfg options but it didn't really > made a difference. Could that be caused by a CHECKPOINT ? >>> >>> actually there are a few log (around 12 per day) entries concerning >>> checkpoints: >>> >>> LOG: checkpoints are occurring too frequently (10 seconds apart) >>> HINT: Consider increasing the configuration parameter >>> "checkpoint_segments". >>> >>> But wouldn't that only affect write performance? The main problems >>> I'm >>> concerned about affect SELECT queries. >> >> No, that will certainly NOT just affect write performance; if the >> postmaster is busy writing out checkpoints, that will block SELECT >> queries that are accessing whatever is being checkpointed. > > What I meant is if there are no INSERT's or UPDATE's going on it > shouldn't affect SELECT queries, or am I wrong? Yes, that's right. (Caveat: VACUUM would be a form of update, in this context...) > All the data modification tasks usually run at night, during the day > there shouldn't be many INSERT's or UPDATE's going on. > >> When we were on 7.4, we would *frequently* see SELECT queries that >> should be running Very Quick that would get blocked by the >> checkpoint flush. > > How did you actually see they were blocked by the checkpoint > flushes? Do they show up as separate processes? We inferred this based on observed consistency of behaviour, and based on having highly observant people like Andrew Sullivan around :-). It definitely wasn't blatantly obvious. It *might* be easier to see in more recent versions, although BgWr makes the issue go away ;-). >> There are two things worth considering: >> >> 1. If the checkpoints are taking place "too frequently," then that >> is clear evidence that something is taking place that is injecting >> REALLY heavy update load on your database at those times. >> >> If the postmaster is checkpointing every 10s, that implies Rather >> Heavy Load, so it is pretty well guaranteed that performance of >> other activity will suck at least somewhat because this load is >> sucking up all the I/O bandwidth that it can. >> >> So, to a degree, there may be little to be done to improve on this. > > I strongly assume that those log entries showed up at night when the > heavy insert routines are being run. I'm more concerned about the > query performance under "normal" conditions when there are very few > modifications done. I rather thought as much. You *do* have to accept that when you get heavy update load, there will be a lot of I/O, and in the absence of "disk array fairies" that magically make bits get to the disks via automated mental telepathy ;-), you have to live with the notion that there will be *some* side-effects on activity taking place at such times. Or you have to spend, spend, spend on heftier hardware. Sometimes too expensive... >> 2. On the other hand, if you're on 8.1 or so, you may be able to >> configure the Background Writer to incrementally flush checkpoint data >> earlier, and avoid the condition of 1. >> >> Mind you, you'd have to set BgWr to be pretty aggressive, based on the >> "10s periodicity" that you describe; that may not be a nice >> configuration to have all the time :-(. > > I've just seen that the daily vacuum tasks didn't run, > apparently. The DB has almost doubled it's size since some days > ago. I guess I'll have to VACUUM FULL (dump/restore might be faster, > though) and check if that helps anything. If you're locking out users, then it's probably a better idea to use CLUSTER to reorganize the tables, as that simultaneously eliminates empty space on tables *and indices.* In contrast, after running VACUUM FULL, you may discover you need to reindex tables, because the reorganization of the *table* leads to bloating of the indexes. Pre-8.3 (I *think*), there's a transactional issue with CLUSTER where it doesn't fully follow MVCC, so that "dead, but still accessible, to certain transactions" tuples go away. That can cause surprises (e.g. - queries missing data) if applications are accessing the database concurrently with the CLUSTER. It's safe as long as the DBA can take over the database and block out applications. And at some point, the MVCC bug got fixed. Note that you should check the output of a VACUUM VERBOSE run, and/or use the contrib function pgsstattuples() to check how sparse the storage usage is. There may only be a few
Re: [PERFORM] Oddly slow queries
On Wed, Apr 16, 2008 at 3:48 PM, Thomas Spreng <[EMAIL PROTECTED]> wrote: > > On 16.04.2008, at 17:42, Chris Browne wrote: > > > [EMAIL PROTECTED] (Thomas Spreng) writes: > > > > > On 16.04.2008, at 01:24, PFC wrote: > > > > > > > > > > > > > > > > The queries in question (select's) occasionally take up to 5 mins > > > > > even if they take ~2-3 sec under "normal" conditions, there are no > > > > > sequencial scans done in those queries. There are not many users > > > > > connected (around 3, maybe) to this database usually since it's > > > > > still in a testing phase. I tried to hunt down the problem by > > > > > playing around with resource usage cfg options but it didn't really > > > > > made a difference. > > > > > > > > > > > > >Could that be caused by a CHECKPOINT ? > > > > > > > > > > actually there are a few log (around 12 per day) entries concerning > > > checkpoints: > > > > > > LOG: checkpoints are occurring too frequently (10 seconds apart) > > > HINT: Consider increasing the configuration parameter > > > "checkpoint_segments". > > > > > > But wouldn't that only affect write performance? The main problems I'm > > > concerned about affect SELECT queries. > > > > > > > No, that will certainly NOT just affect write performance; if the > > postmaster is busy writing out checkpoints, that will block SELECT > > queries that are accessing whatever is being checkpointed. > > > > What I meant is if there are no INSERT's or UPDATE's going on it shouldn't > affect SELECT queries, or am I wrong? But checkpoints only occur every 10 seconds because of a high insert / update rate. So, there ARE inserts and updates going on, and a lot of them, and they are blocking your selects when checkpoint hits. While adjusting your background writer might be called for, and might provide you with some relief, you REALLY need to find out what's pushing so much data into your db at once that it's causing a checkpoint storm. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Oddly slow queries
On Wed, Apr 16, 2008 at 11:48:21PM +0200, Thomas Spreng wrote: > What I meant is if there are no INSERT's or UPDATE's going on it > shouldn't > affect SELECT queries, or am I wrong? CHECKPOINTs also happen on a time basis. They should be short in that case, but they still have to happen. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Oddly slow queries
On 16.04.2008, at 17:42, Chris Browne wrote: [EMAIL PROTECTED] (Thomas Spreng) writes: On 16.04.2008, at 01:24, PFC wrote: The queries in question (select's) occasionally take up to 5 mins even if they take ~2-3 sec under "normal" conditions, there are no sequencial scans done in those queries. There are not many users connected (around 3, maybe) to this database usually since it's still in a testing phase. I tried to hunt down the problem by playing around with resource usage cfg options but it didn't really made a difference. Could that be caused by a CHECKPOINT ? actually there are a few log (around 12 per day) entries concerning checkpoints: LOG: checkpoints are occurring too frequently (10 seconds apart) HINT: Consider increasing the configuration parameter "checkpoint_segments". But wouldn't that only affect write performance? The main problems I'm concerned about affect SELECT queries. No, that will certainly NOT just affect write performance; if the postmaster is busy writing out checkpoints, that will block SELECT queries that are accessing whatever is being checkpointed. What I meant is if there are no INSERT's or UPDATE's going on it shouldn't affect SELECT queries, or am I wrong? All the data modification tasks usually run at night, during the day there shouldn't be many INSERT's or UPDATE's going on. When we were on 7.4, we would *frequently* see SELECT queries that should be running Very Quick that would get blocked by the checkpoint flush. How did you actually see they were blocked by the checkpoint flushes? Do they show up as separate processes? There are two things worth considering: 1. If the checkpoints are taking place "too frequently," then that is clear evidence that something is taking place that is injecting REALLY heavy update load on your database at those times. If the postmaster is checkpointing every 10s, that implies Rather Heavy Load, so it is pretty well guaranteed that performance of other activity will suck at least somewhat because this load is sucking up all the I/O bandwidth that it can. So, to a degree, there may be little to be done to improve on this. I strongly assume that those log entries showed up at night when the heavy insert routines are being run. I'm more concerned about the query performance under "normal" conditions when there are very few modifications done. 2. On the other hand, if you're on 8.1 or so, you may be able to configure the Background Writer to incrementally flush checkpoint data earlier, and avoid the condition of 1. Mind you, you'd have to set BgWr to be pretty aggressive, based on the "10s periodicity" that you describe; that may not be a nice configuration to have all the time :-(. I've just seen that the daily vacuum tasks didn't run, apparently. The DB has almost doubled it's size since some days ago. I guess I'll have to VACUUM FULL (dump/restore might be faster, though) and check if that helps anything. Does a bloated DB affect the performance alot or does it only use up disk space? Thanks for all the hints/help so far from both of you. Cheers, Tom -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Oddly slow queries
[EMAIL PROTECTED] (Thomas Spreng) writes: > On 16.04.2008, at 01:24, PFC wrote: >> >>> The queries in question (select's) occasionally take up to 5 mins >>> even if they take ~2-3 sec under "normal" conditions, there are no >>> sequencial scans done in those queries. There are not many users >>> connected (around 3, maybe) to this database usually since it's >>> still in a testing phase. I tried to hunt down the problem by >>> playing around with resource usage cfg options but it didn't really >>> made a difference. >> >> Could that be caused by a CHECKPOINT ? > > actually there are a few log (around 12 per day) entries concerning > checkpoints: > > LOG: checkpoints are occurring too frequently (10 seconds apart) > HINT: Consider increasing the configuration parameter > "checkpoint_segments". > > But wouldn't that only affect write performance? The main problems I'm > concerned about affect SELECT queries. No, that will certainly NOT just affect write performance; if the postmaster is busy writing out checkpoints, that will block SELECT queries that are accessing whatever is being checkpointed. When we were on 7.4, we would *frequently* see SELECT queries that should be running Very Quick that would get blocked by the checkpoint flush. We'd periodically see hordes of queries of the form: select id from some_table where unique_field = 'somevalue.something'; which would normally run in less than 1ms running for (say) 2s. And the logs would show something looking rather like the following: 2008-04-03 09:01:52 LOG select id from some_table where unique_field = 'somevalue.something'; - 952ms 2008-04-03 09:01:52 LOG select id from some_table where unique_field = 'somevalue.something'; - 742ms 2008-04-03 09:01:52 LOG select id from some_table where unique_field = 'another.something'; - 1341ms 2008-04-03 09:01:52 LOG select id from some_table where unique_field = 'somevalue.something'; - 911ms 2008-04-03 09:01:52 LOG select id from some_table where unique_field = 'another.something'; - 1244ms 2008-04-03 09:01:52 LOG select id from some_table where unique_field = 'another.something'; - 2311ms 2008-04-03 09:01:52 LOG select id from some_table where unique_field = 'another.something'; - 1799ms 2008-04-03 09:01:52 LOG select id from some_table where unique_field = 'somevalue.something'; - 1992ms This was happening because the checkpoint was flushing those two tuples, and hence blocking 8 SELECTs that came in during the flush. There are two things worth considering: 1. If the checkpoints are taking place "too frequently," then that is clear evidence that something is taking place that is injecting REALLY heavy update load on your database at those times. If the postmaster is checkpointing every 10s, that implies Rather Heavy Load, so it is pretty well guaranteed that performance of other activity will suck at least somewhat because this load is sucking up all the I/O bandwidth that it can. So, to a degree, there may be little to be done to improve on this. 2. On the other hand, if you're on 8.1 or so, you may be able to configure the Background Writer to incrementally flush checkpoint data earlier, and avoid the condition of 1. Mind you, you'd have to set BgWr to be pretty aggressive, based on the "10s periodicity" that you describe; that may not be a nice configuration to have all the time :-(. -- output = reverse("ofni.sesabatadxunil" "@" "enworbbc") http://cbbrowne.com/info/multiplexor.html Nagging is the repetition of unpalatable truths. --Baroness Edith Summerskill -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Oddly slow queries
On Wed, 16 Apr 2008 06:07:04 +0200, Thomas Spreng <[EMAIL PROTECTED]> wrote: On 16.04.2008, at 01:24, PFC wrote: The queries in question (select's) occasionally take up to 5 mins even if they take ~2-3 sec under "normal" conditions, there are no sequencial scans done in those queries. There are not many users connected (around 3, maybe) to this database usually since it's still in a testing phase. I tried to hunt down the problem by playing around with resource usage cfg options but it didn't really made a difference. Could that be caused by a CHECKPOINT ? actually there are a few log (around 12 per day) entries concerning checkpoints: LOG: checkpoints are occurring too frequently (10 seconds apart) HINT: Consider increasing the configuration parameter "checkpoint_segments". But wouldn't that only affect write performance? The main problems I'm concerned about affect SELECT queries. OK, so if you get 12 of those per day, this means your checkpoint interval isn't set to 10 seconds... I hope... Those probably correspond to some large update or insert query that comes from a cron or archive job ?... or a developer doing tests or filling a table... So, if it is checkpointing every 10 seconds it means you have a pretty high write load at that time ; and having to checkpoint and flush the dirty pages makes it worse, so it is possible that your disk(s) choke on writes, also killing the selects in the process. -> Set your checkpoint log segments to a much higher value -> Set your checkpoint timeout to a higher value (5 minutes or something), to be tuned afterwards -> Tune bgwriter settings to taste (this means you need a realistic load, not a test load) -> Use separate disk(s) for the xlog -> For the love of God, don't keep the RAID5 for production ! (RAID5 + 1 small write = N reads + N writes, N=3 in your case) Since this is a test server I would suggest RAID1 for the OS and database files and the third disk for the xlog, if it dies you just recreate the DB... -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Oddly slow queries
On 16.04.2008, at 01:24, PFC wrote: The queries in question (select's) occasionally take up to 5 mins even if they take ~2-3 sec under "normal" conditions, there are no sequencial scans done in those queries. There are not many users connected (around 3, maybe) to this database usually since it's still in a testing phase. I tried to hunt down the problem by playing around with resource usage cfg options but it didn't really made a difference. Could that be caused by a CHECKPOINT ? actually there are a few log (around 12 per day) entries concerning checkpoints: LOG: checkpoints are occurring too frequently (10 seconds apart) HINT: Consider increasing the configuration parameter "checkpoint_segments". But wouldn't that only affect write performance? The main problems I'm concerned about affect SELECT queries. Regards, Tom PS: WAL settings are all set to defaults. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Oddly slow queries
The queries in question (select's) occasionally take up to 5 mins even if they take ~2-3 sec under "normal" conditions, there are no sequencial scans done in those queries. There are not many users connected (around 3, maybe) to this database usually since it's still in a testing phase. I tried to hunt down the problem by playing around with resource usage cfg options but it didn't really made a difference. Could that be caused by a CHECKPOINT ? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance