Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS
When grilled further on (Sun, 17 Jul 2005 23:43:29 -0600), Robert Creager [EMAIL PROTECTED] confessed:

I've 6 runs going concurrently. Just saw (vmstat 1) a set of 8 seconds where the CS didn't drop below 90k, but right now it's at ~300 for over 30 seconds... It's bouncing all over the place, but staying reasonably well behaved overall. Against 741 and the same load, CS is steady around 300 with spikes up to 4k, but it's only been running for about 15 minutes. All queries are < .2 seconds.

Cheers, Rob
Re: [PERFORM] slow joining very large table to smaller ones
On 7/15/05, Bruno Wolff III [EMAIL PROTECTED] wrote:
> On Thu, Jul 14, 2005 at 16:29:58 -0600, Dan Harris [EMAIL PROTECTED] wrote:
>> Ok, I tried this one. My ssh keeps getting cut off by a router somewhere between me and the server due to inactivity timeouts, so all I know is that both the select and explain analyze are taking over an hour to run. Here's the explain select for that one, since that's the best I can get.
> Are you using NAT at home? That's probably where the issue is. If you have control of that box you can probably increase the timeout to a couple of hours.

Some versions of ssh have a configuration option for this (in .ssh/config):

    Host *
        ServerAliveInterval 600

...it means that ssh will send a ping packet to an sshd after every 10 minutes of inactivity. This way the NAT box sees activity and won't kill the session. I'm using OpenSSH_4.1p1 for this... Oh, and it doesn't have anything to do with TCP keepalive, which is for finding dead connections rather than for keeping connections alive. ;)

Regards, Dawid
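The same keepalive can also be requested for a single session from the command line instead of .ssh/config; a minimal sketch, assuming a stock OpenSSH client and a hypothetical host name:

    # send an application-level keepalive every 600 seconds of inactivity
    ssh -o ServerAliveInterval=600 user@db.example.com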
Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS
Robert Creager [EMAIL PROTECTED] writes:
> Tom Lane [EMAIL PROTECTED] confessed:
>> The context swap problem was no worse in 8.0 than in prior versions, so that hardly seems like a good explanation. Have you tried reverting to the cron-based vacuuming method you used in 7.4?
> I've vacuum_cost_delay = 10 in the conf file for 803.

Hmm, did you read this thread? http://archives.postgresql.org/pgsql-performance/2005-07/msg00088.php

It's still far from clear what's going on there, but it might be interesting to see if turning off the vacuum delay changes your results with 8.0.

regards, tom lane
Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS
When grilled further on (Mon, 18 Jul 2005 00:10:53 -0400), Tom Lane [EMAIL PROTECTED] confessed:
> The context swap problem was no worse in 8.0 than in prior versions, so that hardly seems like a good explanation. Have you tried reverting to the cron-based vacuuming method you used in 7.4?

Ran 7 hours on 741 with VACUUM ANALYZE every 5 minutes. The largest CS I saw was 40k, with an average of 500 (via a script which monitors vmstat output). I've done a VACUUM FULL ANALYZE on 803 and have switched the cron-based VACUUM ANALYZE to 803 also. The tests are now running again.

> Hmm, did you read this thread? http://archives.postgresql.org/pgsql-performance/2005-07/msg00088.php

I just glanced at it. Once I've reproduced (or not) the problem on 803 with the VACUUM FULL, I'll turn off the vacuum delay.

Cheers, Rob
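For reference, cron-based vacuuming of the kind described above is typically just a crontab entry invoking psql; a minimal sketch, assuming Vixie cron's */5 step syntax and a hypothetical database named mydb:

    # run VACUUM ANALYZE every 5 minutes
    */5 * * * * psql -d mydb -c "VACUUM ANALYZE" >/dev/null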
Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS
When grilled further on (Mon, 18 Jul 2005 09:23:11 -0400), Tom Lane [EMAIL PROTECTED] confessed:
> It's still far from clear what's going on there, but it might be interesting to see if turning off the vacuum delay changes your results with 8.0.

Can that be affected by HUPing the server, or do I need a restart?

Thanks, Rob
Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS
Robert Creager [EMAIL PROTECTED] writes:
> Tom Lane [EMAIL PROTECTED] confessed:
>> It's still far from clear what's going on there, but it might be interesting to see if turning off the vacuum delay changes your results with 8.0.
> Can that be affected by HUPing the server, or do I need a restart?

sighup should be fine.

regards, tom lane
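Concretely, the reload-and-verify cycle is: edit vacuum_cost_delay in postgresql.conf, signal the postmaster, then check from a new session. A sketch, assuming a hypothetical data directory of /usr/local/pgsql/data:

    # pg_ctl reload sends SIGHUP to the postmaster
    pg_ctl reload -D /usr/local/pgsql/data

Then, in a fresh psql session:

    -- verify the active setting
    SHOW vacuum_cost_delay;
    -- or override it for the current session only
    SET vacuum_cost_delay = 0;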
Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?
Tom Lane wrote:
> Robert Creager [EMAIL PROTECTED] writes:
>> I've vacuum_cost_delay = 10 in the conf file for 803.
> Hmm, did you read this thread? http://archives.postgresql.org/pgsql-performance/2005-07/msg00088.php It's still far from clear what's going on there, but it might be interesting to see if turning off the vacuum delay changes your results with 8.0.

With the contrib autovacuum code, if you don't specify vacuum delay settings from the command line, autovacuum doesn't touch them. Therefore (if you aren't specifying them from the command line) the vacuum delay settings on 803 should be the same for a cron-issued vacuum and an autovacuum-issued vacuum. So if the vacuum delay settings are the problem, it should show up either way.

Matt
Re: [PERFORM] join and query planner
Just out of curiosity, does it do any better with the following?

SELECT ...
FROM a
JOIN b ON (a.key = b.key)
LEFT JOIN c ON (c.key = a.key)
LEFT JOIN d ON (d.key = a.key)
WHERE (b.column = 100)

>>> Dario Pudlo [EMAIL PROTECTED] 07/06/05 4:54 PM >>>
(first of all, sorry for my English)
Hi.
- Does LEFT JOIN restrict the order in which the planner must join tables? I've read about JOIN, but I'm not sure about LEFT JOIN...
- If so: can I avoid this behavior? I mean, make the planner resolve the query using statistics (uniqueness, data distribution) rather than join order.

My query looks like:

SELECT ...
FROM a, b
LEFT JOIN c ON (c.key = a.key)
LEFT JOIN d ON (d.key = a.key)
WHERE (a.key = b.key) AND (b.column = 100)

b.column has much better selectivity, but the planner insists on resolving c.key = a.key first. Of course, I could rewrite it as something like:

SELECT ...
FROM (SELECT ...
      FROM a, b
      LEFT JOIN d ON (d.key = a.key)
      WHERE (b.column = 100)) AS aa
LEFT JOIN c ON (c.key = aa.key)

but this query is constructed by an application with a multicolumn filter. It's dynamic. It means that a user could choose to look for c.column = 1000, and also combinations of filters. So I need the planner to choose the best plan... I've already changed statistics, clustered the tables with CLUSTER, run VACUUM ANALYZE, and changed work_mem and shared_buffers...

Greetings. TIA.
Re: [PERFORM] Impact of checkpoint_segments under continual load conditions
On 7/18/05, Tom Lane [EMAIL PROTECTED] wrote:
> Christopher Petrilli [EMAIL PROTECTED] writes:
>> http://blog.amber.org/diagrams/comparison_mysql_pgsql.png
>> Notice the VERY steep drop off.
> Hmm. Whatever that is, it's not checkpoint's fault. I would interpret the regular upticks in the Postgres times (every several hundred iterations) as being the effects of checkpoints. You could probably smooth out those peaks some with appropriate hacking on bgwriter parameters, but that's not the issue at hand (is it?).

I tried hacking that. Turning it up to be more aggressive, it got worse. Turned it down, it got worse :-)

> I have no idea at all what's causing the sudden falloff in performance after about 1 iterations. COPY per se ought to be about a constant-time operation, since APPEND is (or should be) constant-time. What indexes, foreign keys, etc do you have on this table? What else was going on at the time?

The table has 15 columns and 5 indexes (character, inet and timestamp). No foreign keys. The only other thing running on the machine was the application actually DOING the benchmarking, written in Python (psycopg), but it was, according to top, using less than 1% of the CPU. It was just talking through a pipe to a psql prompt to do the COPY.

Chris
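A standard way to test whether index maintenance is behind a bulk-load slowdown like this is to drop the indexes, run the COPY, and rebuild afterwards; a sketch with hypothetical table and index names (the real 15-column schema and its 5 indexes aren't shown in the thread):

    -- hypothetical names, for illustration only
    DROP INDEX events_src_ip_idx;
    DROP INDEX events_ts_idx;
    COPY events FROM '/tmp/batch.dat';
    CREATE INDEX events_src_ip_idx ON events (src_ip);
    CREATE INDEX events_ts_idx ON events (ts);

If the per-iteration COPY time stays flat without the indexes, that would point at index maintenance (e.g. the btrees outgrowing cache) rather than COPY itself.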
Re: [PERFORM] Impact of checkpoint_segments under continual load conditions
On Jul 17, 2005, at 1:08 PM, Christopher Petrilli wrote:
> Normally, checkpoint_segments can help absorb some of that, but my experience is that if I crank the number up, it simply delays the impact, and when it occurs, it takes a VERY long time (minutes) to clear.

There comes a point where your only recourse is to throw hardware at the problem. I would suspect that getting faster disks and splitting the checkpoint log to its own RAID partition would help you here. Adding more RAM while you're at it always does wonders for me :-)

Vivek Khera, Ph.D. +1-301-869-4449 x806
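In the 8.0 era, splitting the WAL ("checkpoint log") onto its own spindles was done by relocating the pg_xlog directory and symlinking it back; a sketch, assuming a hypothetical data directory and a dedicated RAID mounted at /raid_wal:

    # with the postmaster stopped:
    pg_ctl stop -D /usr/local/pgsql/data
    mv /usr/local/pgsql/data/pg_xlog /raid_wal/pg_xlog
    ln -s /raid_wal/pg_xlog /usr/local/pgsql/data/pg_xlog
    pg_ctl start -D /usr/local/pgsql/data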
Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?
On Mon, 18 Jul 2005 13:52:53 -0400, Tom Lane [EMAIL PROTECTED] wrote:
> Start a fresh psql session and SHOW vacuum_cost_delay to verify what the active setting is.

Thanks. It does show 0 for 803 in a session that has been up since I thought I had HUPed the server with the new value. This is leading me to believe that 803 doesn't do very well with VACUUM ANALYZE running often, at least in my particular application... I will provide a more definitive statement to that effect, hopefully tonight.

Cheers, Rob
Re: [PERFORM] Impact of checkpoint_segments under continual load conditions
On 7/18/05, Vivek Khera [EMAIL PROTECTED] wrote:
> On Jul 17, 2005, at 1:08 PM, Christopher Petrilli wrote:
>> Normally, checkpoint_segments can help absorb some of that, but my experience is that if I crank the number up, it simply delays the impact, and when it occurs, it takes a VERY long time (minutes) to clear.
> There comes a point where your only recourse is to throw hardware at the problem. I would suspect that getting faster disks and splitting the checkpoint log to its own RAID partition would help you here. Adding more RAM while you're at it always does wonders for me :-)

My concern is less with absolute performance than with the nosedive it goes into. I published some of my earlier findings and comparisons on my blog, but there's a graph here: http://blog.amber.org/diagrams/comparison_mysql_pgsql.png

Notice the VERY steep drop off. I'm still trying to get rid of it, but honestly, I am not smart enough to know where it's originating. I have no desire to ever use MySQL, but it is a reference point, and since I don't particularly need transactional integrity, a valid comparison.

Chris
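For anyone following along, the knobs under discussion live in postgresql.conf; a sketch using the 8.0-era parameter names, with purely illustrative values (not a recommendation):

    checkpoint_segments = 30    # WAL segments between checkpoints (16MB each)
    checkpoint_timeout = 300    # force a checkpoint after this many seconds
    bgwriter_delay = 200        # ms the background writer sleeps between rounds
    bgwriter_percent = 1        # max % of dirty buffers written per round
    bgwriter_maxpages = 100     # upper bound on pages written per round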
Re: [PERFORM] Huge performance problem between 7.4.1 and 8.0.3 - CS issue?
Robert Creager [EMAIL PROTECTED] writes:
> Around 8:15 I started to receive bursts of a few seconds of high CS counts, higher than in the previous 7-hour run on 741. I changed the vacuum delay to 0 and HUP'ed the server (how can I see the value of vacuum_cost_delay at run time?).

Start a fresh psql session and SHOW vacuum_cost_delay to verify what the active setting is.

regards, tom lane
[PERFORM] Insert performance (OT?)
Hi,

Suppose I have a table with 4 fields (f1, f2, f3, f4), and I define 2 unique indexes:

u1 (f1, f2, f3) and u2 (f1, f2, f4)

I have 3 records:

A, B, C, D (this will be inserted)
A, B, C, E (this will pass u2, but not u1, thus not inserted)
A, B, F, D (this will pass u1, but not u2, thus not inserted)

Now, for performance... I have tables like this with 500,000 records, where a new upload is approx. 20,000 records. It is only now that we consider index u2 necessary. So until now I did something like:

insert into ... select f1, f2, f3, max(f4) group by f1, f2, f3

That is OK, and also logically correct given the data definition. But I cannot do this with 2 group by's. I tried it on paper and I'm not succeeding. So I must use a function that will check against u1 and u2, and then insert if it is OK. I know that such a function is way slower than my insert query. So, my question: how can I keep the same performance, but also satisfy the new index?

Met vriendelijke groeten, Bien à vous, Kind regards,

Yves Vindevogel
Implements
Mail: [EMAIL PROTECTED] - Mobile: +32 (478) 80 82 91
Kempische Steenweg 206 - 3500 Hasselt - Tel-Fax: +32 (11) 43 55 76
Web: http://www.implements.be

"First they ignore you. Then they laugh at you. Then they fight you. Then you win." -- Mahatma Gandhi
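One set-based way to respect both unique indexes without a row-by-row function is to anti-join the upload against the target table on both keys; a sketch, assuming a hypothetical target table t(f1, f2, f3, f4) with unique indexes u1 and u2 as above, and the new rows staged in a table named upload:

    INSERT INTO t (f1, f2, f3, f4)
    SELECT u.f1, u.f2, u.f3, u.f4
    FROM upload u
    WHERE NOT EXISTS (SELECT 1 FROM t            -- would violate u1
                      WHERE t.f1 = u.f1 AND t.f2 = u.f2 AND t.f3 = u.f3)
      AND NOT EXISTS (SELECT 1 FROM t            -- would violate u2
                      WHERE t.f1 = u.f1 AND t.f2 = u.f2 AND t.f4 = u.f4);

The batch itself still has to be de-duplicated against both keys first (for example with SELECT DISTINCT ON (f1, f2, f3) ... in a preparatory step), or two new rows that collide with each other, rather than with existing data, can still trip the indexes.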
Re: [PERFORM] Impact of checkpoint_segments under continual load conditions
Christopher Petrilli [EMAIL PROTECTED] writes:
> http://blog.amber.org/diagrams/comparison_mysql_pgsql.png
> Notice the VERY steep drop off.

Hmm. Whatever that is, it's not checkpoint's fault. I would interpret the regular upticks in the Postgres times (every several hundred iterations) as being the effects of checkpoints. You could probably smooth out those peaks some with appropriate hacking on bgwriter parameters, but that's not the issue at hand (is it?).

I have no idea at all what's causing the sudden falloff in performance after about 1 iterations. COPY per se ought to be about a constant-time operation, since APPEND is (or should be) constant-time. What indexes, foreign keys, etc do you have on this table? What else was going on at the time?

regards, tom lane
Re: [PERFORM] join and query planner
Hi.

> Just out of curiosity, does it do any better with the following? SELECT ...

Yes, it does. But my query could also be:

SELECT ...
FROM a
JOIN b ON (a.key = b.key)
LEFT JOIN c ON (c.key = a.key)
LEFT JOIN d ON (d.key = a.key)
/*new*/ , e
WHERE (b.column = 100)
/*new*/ AND (e.key = a.key) AND (e.field = 'filter')

because it's constructed by an application. I needed to know if, somehow, someway, I can unforce the join order. The only way to solve it so far is changing the application. It must build something like:

SELECT ...
FROM b
JOIN (a JOIN e ON (e.key = a.key)) ON (a.key = b.key)
LEFT JOIN c ON (c.key = a.key)
LEFT JOIN d ON (d.key = a.key)
WHERE (b.column = 100) AND (e.field = 'filter')

supposing that e.field has (should have) better selectivity. But now this problem belongs to the programmers' group :-)

The query, in fact, has more tables to join. I wonder if lowering the geqo threshold could do the work...

Thank you. Greetings. Long life, little spam and prosperity!
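For the record, PostgreSQL does expose knobs for this: explicit JOIN syntax only pins the join order once the number of FROM items exceeds join_collapse_limit, and geqo_threshold sets where the genetic optimizer takes over. Note, though, that in the 7.4/8.0 planner the placement of outer joins (LEFT JOIN) is never reordered regardless of these settings. A sketch of experimenting per-session, with illustrative values:

    SET join_collapse_limit = 10;  -- let the planner reorder explicit inner JOINs
    SET from_collapse_limit = 10;  -- same, for flattening sub-SELECTs
    SET geqo_threshold = 15;       -- raise the point where GEQO kicks in
    EXPLAIN ANALYZE SELECT ...;    -- re-check the chosen plan after each change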
Re: [PERFORM] Impact of checkpoint_segments under continual load conditions
Christopher Petrilli [EMAIL PROTECTED] writes:
> On 7/18/05, Tom Lane [EMAIL PROTECTED] wrote:
>> I have no idea at all what's causing the sudden falloff in performance after about 1 iterations. COPY per se ought to be about a constant-time operation, since APPEND is (or should be) constant-time. What indexes, foreign keys, etc do you have on this table? What else was going on at the time?
> The table has 15 columns and 5 indexes (character, inet and timestamp). No foreign keys. The only other thing running on the machine was the application actually DOING the benchmarking, written in Python (psycopg), but it was, according to top, using less than 1% of the CPU. It was just talking through a pipe to a psql prompt to do the COPY.

Sounds pretty plain-vanilla all right. Are you in a position to try the same benchmark against CVS tip? (The nightly snapshot tarball would be plenty close enough.) I'm just wondering if the old bgwriter behavior of locking down the bufmgr while it examined the ARC/2Q data structures is causing this...

regards, tom lane