Re: [PERFORM] 10x faster sort performance on Skylake CPU vs Ivy Bridge
> On Aug 25, 2017, at 17:07, Tom Lane wrote:
>
> Felix Geisendörfer writes:
>> I recently came across a performance difference between two machines that
>> surprised me:
>> ...
>> As you can see, Machine A spends 5889ms on the Sort Node vs 609ms on Machine
>> B when looking at the "Exclusive" time with explain.depesz.com [3][4]. I.e.
>> Machine B is ~10x faster at sorting than Machine A (for this particular
>> query).
>
> I doubt this is a hardware issue, it's more likely that you're comparing
> apples and oranges. The first theory that springs to mind is that the
> sort keys are strings and you're using C locale on the faster machine but
> some non-C locale on the slower. strcoll() is pretty darn expensive
> compared to strcmp() :-(

You're right, that seems to be it:

Machine A was using strcoll() (lc_collate=en_US.UTF-8)
Machine B was using strcmp() (lc_collate=C)

After switching Machine A to use lc_collate=C, I get:

CTE Scan on zulu  (cost=40673.620..40742.300 rows=3434 width=56) (actual time=1368.610..1368.698 rows=58 loops=1)
  CTE zulu
    ->  HashAggregate  (cost=40639.280..40673.620 rows=3434 width=56) (actual time=1368.607..1368.659 rows=58 loops=1)
          Group Key: mike.two, ((mike.golf)::text)
          ->  Unique  (cost=37656.690..40038.310 rows=34341 width=104) (actual time=958.493..1168.128 rows=298104 loops=1)
                ->  Sort  (cost=37656.690..38450.560 rows=317549 width=104) (actual time=958.491..1055.635 rows=316982 loops=1)
                      Sort Key: mike.two, ((mike.lima)::text) COLLATE "papa", mike.echo DESC, mike.quebec
                      Sort Method: quicksort  Memory: 56834kB
                      ->  Seq Scan on mike  (cost=0.000..8638.080 rows=317549 width=104) (actual time=0.043..172.496 rows=316982 loops=1)
                            Filter: (golf five NOT NULL)
                            Rows Removed by Filter: 26426

So Machine A needs 883ms [1] for the sort vs 609ms [2] for Machine B. That's
~1.4x faster, which seems reasonable :).

Sorry for the delayed response, I didn't have access to Machine B to confirm
this right away.
> regards, tom lane

This is my first post to a PostgreSQL mailing list, but I've been lurking for
a while. Thank you for taking the time to reply to e-mails such as mine and
for all the work you've put into PostgreSQL over the years. I'm deeply
grateful.

> On Aug 25, 2017, at 17:43, Peter Geoghegan wrote:
>
> On Fri, Aug 25, 2017 at 8:07 AM, Tom Lane wrote:
>> I doubt this is a hardware issue, it's more likely that you're comparing
>> apples and oranges. The first theory that springs to mind is that the
>> sort keys are strings and you're using C locale on the faster machine but
>> some non-C locale on the slower. strcoll() is pretty darn expensive
>> compared to strcmp() :-(
>
> strcoll() is very noticeably slower on macOS, too.

Thanks. This immediately explains what I saw when testing this query on a
Linux machine that was also using lc_collate=en_US.UTF-8 but not being slowed
down by it as much as the macOS machine.

[1] https://explain.depesz.com/s/LOqa
[2] https://explain.depesz.com/s/zVe

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
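The strcoll() vs strcmp() cost difference is easy to reproduce outside Postgres. Below is a minimal sketch in Python (stdlib only, not from the thread): a plain string sort compares byte-wise like strcmp() in the C locale, while a sort through locale.strcoll() pays the collation overhead. The en_US.UTF-8 locale may not be installed on every system, so the sketch falls back to C.

```python
import functools
import locale
import timeit

# 50k short ASCII strings to sort, similar in spirit to the ~300k-row sort above.
words = ["row%06d" % i for i in range(50_000)]

# Plain sort: Python compares strings byte-wise, like strcmp() in the C locale.
t_strcmp = timeit.timeit(lambda: sorted(words), number=1)

# Locale-aware sort through strcoll(). en_US.UTF-8 may not be available
# everywhere, so fall back to the always-present C locale.
try:
    locale.setlocale(locale.LC_COLLATE, "en_US.UTF-8")
except locale.Error:
    locale.setlocale(locale.LC_COLLATE, "C")

t_strcoll = timeit.timeit(
    lambda: sorted(words, key=functools.cmp_to_key(locale.strcoll)), number=1
)

print(f"strcmp-style sort:  {t_strcmp:.3f}s")
print(f"strcoll-based sort: {t_strcoll:.3f}s")
```

With a UTF-8 locale active, the strcoll()-based sort is typically several times slower, which is the same effect the thread saw between lc_collate=C and lc_collate=en_US.UTF-8.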
[PERFORM] 10x faster sort performance on Skylake CPU vs Ivy Bridge
Hi,

I recently came across a performance difference between two machines that
surprised me.

Postgres Version / OS on both machines: v9.6.3 / macOS 10.12.5

Machine A: MacBook Pro Mid 2012, 2.7 GHz Intel Core i7 (Ivy Bridge), 8 MB L3
Cache, 16 GB 1600 MHz DDR3 [1]
Machine B: MacBook Pro Late 2016, 2.6 GHz Intel Core i7 (Skylake), 6 MB L3
Cache, 16 GB 2133 MHz LPDDR3 [2]

Query Performance on Machine A: [3]

CTE Scan on zulu  (cost=40673.620..40742.300 rows=3434 width=56) (actual time=6339.404..6339.462 rows=58 loops=1)
  CTE zulu
    ->  HashAggregate  (cost=40639.280..40673.620 rows=3434 width=31) (actual time=6339.400..6339.434 rows=58 loops=1)
          Group Key: mike.two, mike.golf
          ->  Unique  (cost=37656.690..40038.310 rows=34341 width=64) (actual time=5937.934..6143.161 rows=298104 loops=1)
                ->  Sort  (cost=37656.690..38450.560 rows=317549 width=64) (actual time=5937.933..6031.925 rows=316982 loops=1)
                      Sort Key: mike.two, mike.lima, mike.echo DESC, mike.quebec
                      Sort Method: quicksort  Memory: 56834kB
                      ->  Seq Scan on mike  (cost=0.000..8638.080 rows=317549 width=64) (actual time=0.019..142.831 rows=316982 loops=1)
                            Filter: (golf five NOT NULL)
                            Rows Removed by Filter: 26426

Query Performance on Machine B: [4]

CTE Scan on zulu  (cost=40621.420..40690.100 rows=3434 width=56) (actual time=853.436..853.472 rows=58 loops=1)
  CTE zulu
    ->  HashAggregate  (cost=40587.080..40621.420 rows=3434 width=31) (actual time=853.433..853.448 rows=58 loops=1)
          Group Key: mike.two, mike.golf
          ->  Unique  (cost=37608.180..39986.110 rows=34341 width=64) (actual time=634.412..761.678 rows=298104 loops=1)
                ->  Sort  (cost=37608.180..38400.830 rows=317057 width=64) (actual time=634.411..694.719 rows=316982 loops=1)
                      Sort Key: mike.two, mike.lima, mike.echo DESC, mike.quebec
                      Sort Method: quicksort  Memory: 56834kB
                      ->  Seq Scan on mike  (cost=0.000..8638.080 rows=317057 width=64) (actual time=0.047..85.534 rows=316982 loops=1)
                            Filter: (golf five NOT NULL)
                            Rows Removed by Filter: 26426

As you can see, Machine A spends 5889ms on the Sort Node vs 609ms on Machine
B when looking at the "Exclusive" time with explain.depesz.com [3][4]. I.e.
Machine B is ~10x faster at sorting than Machine A (for this particular
query).

My question is: Why?

I understand that this is a 3rd gen CPU vs a 6th gen, and that things have
gotten faster despite stagnant clock speeds, but seeing a 10x difference
still caught me off guard. Does anybody have some pointers to understand
where those gains are coming from? Is it the CPU, memory, or both?

And in particular, why does Sort benefit so massively from the advancement
here (~10x), while Seq Scan, Unique and HashAggregate don't benefit as much
(~2x)?

As you can probably tell, my hardware knowledge is very superficial, so I
apologize if this is a stupid question. But I'd genuinely like to improve my
understanding and intuition about these things.

Cheers,
Felix Geisendörfer

[1] http://www.everymac.com/systems/apple/macbook_pro/specs/macbook-pro-core-i7-2.7-15-mid-2012-retina-display-specs.html
[2] http://www.everymac.com/systems/apple/macbook_pro/specs/macbook-pro-core-i7-2.6-15-late-2016-retina-display-touch-bar-specs.html
[3] https://explain.depesz.com/s/hmn
[4] https://explain.depesz.com/s/zVe
[PERFORM] Slow Performance on a XEON E5504
Hello List,

I've got a system at a customer's location which has a XEON E5504 @ 2.00GHz
processor (HP ProLiant). It's Postgres 8.4 on a Debian Squeeze system running
with 8GB of RAM.

The Postgres performance on this system measured with pgbench is very poor:

transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 40
number of transactions per client: 100
number of transactions actually processed: 4000/4000
tps = 158.283272 (including connections establishing)
tps = 158.788545 (excluding connections establishing)

The same database on a Core i7 CPU 920 @ 2.67GHz, 8 cores with 8GB RAM, same
distro and PostgreSQL version, is much faster:

transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 40
number of transactions per client: 100
number of transactions actually processed: 4000/4000
tps = 1040.534002 (including connections establishing)
tps = 1065.215134 (excluding connections establishing)

Even optimizing the postgresql.conf values doesn't change a lot on the tps
values (less than 10%).

Tried PostgreSQL 9.1 on the ProLiant:

transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 40
number of threads: 1
number of transactions per client: 100
number of transactions actually processed: 4000/4000
tps = 53.114978 (including connections establishing)
tps = 53.198667 (excluding connections establishing)

Next was to compare the disk performance, which was much better on the XEON
than on the Intel i7.

Any idea where to search for the bottleneck?

Best regards,

Felix Schubert
FEScon ... and work flows!
haspelgasse 5
69117 heidelberg
mobil: +49-151-25337718
mail: in...@fescon.de
skype: fesmac
Re: [PERFORM] Slow Performance on a XEON E5504
Hi Scott,

the controller is a HP i410 running 3x300GB SAS 15K / RAID 5.

Best regards,

Felix Schubert

Sent from my iPhone :-)

On 25.08.2012 at 14:42, Scott Marlowe scott.marl...@gmail.com wrote:

> On Sat, Aug 25, 2012 at 6:07 AM, Felix Schubert in...@fescon.de wrote:
>> Hello List,
>>
>> I've got a system at a customer's location which has a XEON E5504 @
>> 2.00GHz processor (HP ProLiant). It's Postgres 8.4 on a Debian Squeeze
>> system running with 8GB of RAM. The Postgres performance on this system
>> measured with pgbench is very poor:
>>
>> transaction type: TPC-B (sort of)
>> scaling factor: 1
>> query mode: simple
>> number of clients: 40
>> number of transactions per client: 100
>> number of transactions actually processed: 4000/4000
>> tps = 158.283272 (including connections establishing)
>> tps = 158.788545 (excluding connections establishing)
>
> For a single thread on a 10k RPM drive the maximum number of times per
> second you can write and get a proper fsync back is 166. This is quite
> close to that theoretical max.
>
>> The same database on a Core i7 CPU 920 @ 2.67GHz, 8 cores with 8GB RAM,
>> same distro and PostgreSQL version, is much faster:
>>
>> transaction type: TPC-B (sort of)
>> scaling factor: 1
>> query mode: simple
>> number of clients: 40
>> number of transactions per client: 100
>> number of transactions actually processed: 4000/4000
>> tps = 1040.534002 (including connections establishing)
>> tps = 1065.215134 (excluding connections establishing)
>
> This is much faster than the theoretical limit of a single 10k RPM drive
> obeying fsync. I'll ignore the rest of your post where you get 53 tps
> after optimization. The important thing you forgot to mention was your
> drive subsystem here. I'm gonna take a wild guess that they are both on a
> single drive and that the older machine is using an older SATA or PATA
> interface HD that is lying about fsync, and the new machine is using a
> 10k RPM drive that is not lying about fsync and you are getting a proper
> ~150 tps from it.
>
> So, what kind of IO subsystems you got in those things?
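Scott's "166 per second" figure comes straight from the drive's rotation rate: a drive that honors fsync can durably commit at best once per platter revolution. A rough back-of-the-envelope sketch (not from the thread; it assumes exactly one commit per revolution and no write-back cache):

```python
# Upper bound on fsync'd commits/sec for a single spinning disk:
# at best one durable commit per platter revolution (no write-back cache,
# no commit grouping).
def max_fsync_tps(rpm: int) -> float:
    return rpm / 60.0

print(max_fsync_tps(10_000))  # ~166.7, matching the ~158 tps seen on the ProLiant
print(max_fsync_tps(15_000))  # ~250 per spindle for the 15K SAS drives
```

By this arithmetic the ~1040 tps on the i7 box is far above what any single honest spindle can deliver, which is what prompted the question about the IO subsystem and, later, the battery-backed cache.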
Re: [PERFORM] Slow Performance on a XEON E5504
Don't know, but I forwarded the question to the system administrator. Anyhow,
thanks for the information up to now!

best regards,

Felix

On 25.08.2012 at 14:59, Scott Marlowe scott.marl...@gmail.com wrote:

> Well it sounds like it does NOT have a battery-backed caching module on
> it, am I right?
[PERFORM] queries are fast after dump-restore but slow again after some days despite vacuum
Hi,

I am running a 9.1 server on Ubuntu. When I upgraded to the current version I
did a pg_dump followed by pg_restore and found that the db was much faster,
but it slowed down again after two days. I did the dump-restore again and
could now compare the two (actually identical) databases.

This is a query on the old one, directly after a VACUUM ANALYSE:

                                QUERY PLAN
WindowAgg  (cost=2231.56..2232.17 rows=22 width=59) (actual time=16748.382..16749.203 rows=340 loops=1)
  ->  Sort  (cost=2231.56..2231.62 rows=22 width=59) (actual time=16748.360..16748.575 rows=340 loops=1)
        Sort Key: ba.bookid, (CASE WHEN (e.languageid = 123) THEN 1 WHEN (e.languageid = 401) THEN 2 WHEN (e.languageid = 150) THEN 3 ELSE 4 END)
        Sort Method: quicksort  Memory: 60kB
        ->  Nested Loop  (cost=0.00..2231.07 rows=22 width=59) (actual time=0.125..16747.395 rows=340 loops=1)
              ->  Index Scan using authorid1 on book_author ba  (cost=0.00..73.94 rows=20 width=8) (actual time=0.034..11.453 rows=99 loops=1)
                    Index Cond: (authorid = 543)
              ->  Index Scan using foreign_key_bookid on editions e  (cost=0.00..107.76 rows=8 width=51) (actual time=90.741..169.031 rows=3 loops=99)
                    Index Cond: (bookid = ba.bookid)
                    Filter: mainname
Total runtime: 16752.146 ms
(11 rows)

And here after dump-restore:

                                QUERY PLAN
WindowAgg  (cost=2325.78..2326.41 rows=23 width=58) (actual time=18.583..19.387 rows=340 loops=1)
  ->  Sort  (cost=2325.78..2325.84 rows=23 width=58) (actual time=18.562..18.823 rows=340 loops=1)
        Sort Key: ba.bookid, (CASE WHEN (e.languageid = 123) THEN 1 WHEN (e.languageid = 401) THEN 2 WHEN (e.languageid = 150) THEN 3 ELSE 4 END)
        Sort Method: quicksort  Memory: 60kB
        ->  Nested Loop  (cost=0.00..2325.26 rows=23 width=58) (actual time=0.385..18.060 rows=340 loops=1)
              ->  Index Scan using authorid1 on book_author ba  (cost=0.00..73.29 rows=20 width=8) (actual time=0.045..0.541 rows=99 loops=1)
                    Index Cond: (authorid = 543)
              ->  Index Scan using foreign_key_bookid on editions e  (cost=0.00..112.49 rows=9 width=50) (actual time=0.056..0.168 rows=3 loops=99)
                    Index Cond: (bookid = ba.bookid)
                    Filter: mainname
Total runtime: 19.787 ms
(11 rows)

Server settings:

shared_buffers = 680MB
work_mem = 10MB
maintenance_work_mem = 64MB
checkpoint_segments = 32
checkpoint_completion_target = 0.9
effective_cache_size = 1500MB

No matter how much I vacuum or analyse the slow db, I don't get it faster. I
also checked for dead tuples - there are none.
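One way to see where the 16.75 s goes: in EXPLAIN ANALYZE output, a node's "actual time" is reported per loop, so the inner index scan on editions has to be multiplied by loops=99. A quick check of the arithmetic (not from the thread) shows that scan accounts for essentially the whole runtime of the slow plan:

```python
# EXPLAIN ANALYZE reports per-loop actual times for inner nodes of a
# nested loop; multiply by the loop count to get the total contribution.

# Slow plan: index scan on editions, 169.031 ms per loop x 99 loops.
per_loop_ms = 169.031
loops = 99
total_s = per_loop_ms * loops / 1000.0
print(f"{total_s:.1f}s")          # ~16.7s of the 16.75s total runtime

# Fast plan after dump/restore: 0.168 ms per loop x 99 loops.
print(f"{0.168 * loops:.1f}ms")   # ~16.6ms
```

So the slowdown is entirely in the repeated index scan, which is consistent with the later CLUSTER finding: after a restore (or CLUSTER) the rows for one bookid sit close together on disk, while in the aged table each probe scatters across the bloated heap.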
Re: [PERFORM] queries are fast after dump-restore but slow again after some days despite vacuum
Andrew Dunstan andrew at dunslane.net writes:

> Try running CLUSTER on the relevant tables and see if it makes a
> difference. If it does you might want to look into using pg_reorg
> periodically.

That worked like a charm! Many thanks.

But how come the queries are also fast after a restore, without the CLUSTER?

regards,

Felix
[PERFORM] good old VACUUM FULL
I posted many weeks ago about a severe problem with a table that was
obviously bloated and was stunningly slow. Up to 70 seconds just to get a row
count on 300k rows. I removed the text column, so it really was just a few
columns of fixed data. Still very bloated. Table size was 450M.

The advice I was given was to do CLUSTER, but this did not reduce the table
size in the least, nor the performance. Also to resize my free space map
(which still does need to be done). Since that involves tweaking the kernel
settings, taking the site down, rebooting postgres and exposing the system to
all kinds of risks, unknowns and expensive experimentation, I was unable to
do it and have had to hobble along with a slow table in my backend holding up
jobs.

Much swearing that nobody should ever do VACUUM FULL. The manual advises
against it. Only crazy people do that.

Finally I decided to stop taking advice.

ns=> explain analyze select count(*) from fastadder_fastadderstatus;

Aggregate  (cost=62602.08..62602.09 rows=1 width=0) (actual time=25320.000..25320.000 rows=1 loops=1)
  ->  Seq Scan on fastadder_fastadderstatus  (cost=0.00..61815.86 rows=314486 width=0) (actual time=180.000..25140.000 rows=314493 loops=1)
Total runtime: *25320.000* ms

ns=> vacuum full fastadder_fastadderstatus;

(took about 20 minutes)

ns=> explain analyze select count(*) from fastadder_fastadderstatus;

Aggregate  (cost=7478.03..7478.04 rows=1 width=0) (actual time=940.000..940.000 rows=1 loops=1)
  ->  Seq Scan on fastadder_fastadderstatus  (cost=0.00..6691.82 rows=314482 width=0) (actual time=0.000..530.000 rows=314493 loops=1)
Total runtime: *940.000 ms*

Moral of the story: if your table is really bloated, just do VACUUM FULL.
CLUSTER will not reduce table bloat in an identical fashion.
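A rough sanity check of how bloated the table was before the VACUUM FULL, using the numbers in the post (this arithmetic is mine, not from the thread, and assumes "450M" means mebibytes; the row count is taken from the EXPLAIN output):

```python
# Rough bloat check: ~450 MB on disk for ~314k rows of mostly fixed-width columns.
table_bytes = 450 * 1024 ** 2     # assuming "450M" means mebibytes
rows = 314_493                    # row count from the EXPLAIN ANALYZE output
print(round(table_bytes / rows))  # ~1500 bytes per row

# A row of ~16 mostly int/bool/timestamp columns plus the ~24-byte tuple
# header should be on the order of 100-150 bytes, so the heap was carrying
# roughly 10x its live data in dead space.
```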
Re: [PERFORM] Really really slow select count(*)
On Tue, Feb 8, 2011 at 3:23 PM, Shaun Thomas stho...@peak6.com wrote:

> With 300k rows, count(*) isn't a good test, really. That's just on the
> edge of big-enough that it could be 1-second to fetch from the disk
> controller,

1 second you say ? excellent, sign me up. 70 seconds is way out of bounds. I
don't want a more efficient query to test with, I want the shitty query that
performs badly and isolates an obvious problem.

> The default settings are not going to cut it for a database of your size,
> with the volume you say it's getting.

not to mention the map reduce jobs I'm hammering it with all night :) but I
did pause those until this is solved

> But you need to put in those kernel parameters I suggested. And I know
> this sucks, but you also have to raise your shared_buffers and possibly
> your work_mem and then restart the DB. But this time, pg_ctl to invoke a
> fast stop, and then use the init script in /etc/init.d to restart it.

I'm getting another slicehost slice. Hopefully I can clone the whole thing
over without doing a full install and go screw around with it there. It's a
fairly complicated install, even with buildout doing most of the
configuration.

=felix
Re: [PERFORM] Really really slow select count(*)
+1

This is exactly what I was looking for at the time: a -t (configtest) option
to pg_ctl. And I think it should fall back to lower shared buffers and log
it; SHOW ALL; would show the value actually used.

On Mon, Feb 7, 2011 at 11:30 AM, Marti Raudsepp ma...@juffo.org wrote:

> On Mon, Feb 7, 2011 at 05:03, Craig Ringer cr...@postnewspapers.com.au wrote:
>> What would possibly help would be if Pg could fall back to lower
>> shared_buffers automatically, screaming about it in the logs but still
>> launching. OTOH, many people don't check the logs, so they'd think their
>> new setting had taken effect and it hadn't - you've traded one usability
>> problem for another. Even if Pg issued WARNING messages to each client
>> that connected, lots of (non-psql) clients don't display them, so many
>> users would never know. Do you have a suggestion about how to do this
>> better? The current approach is known to be rather unlovely, but nobody's
>> come up with a better one that works reasonably and doesn't trample on
>> other System V shared memory users that may exist on the system.
>
> We could do something similar to what Apache does -- provide distros with
> a binary to check the configuration file in advance. This check program
> is launched before the restart command, and if it fails, the server is
> not restarted.
>
> Regards,
> Marti
Re: [PERFORM] Really really slow select count(*)
On Mon, Feb 7, 2011 at 6:05 AM, Shaun Thomas stho...@peak6.com wrote:

> That's one of the things I talked about. To be safe, PG will start to
> shut down but disallow new connections, and *that's all*. Old connections
> are grandfathered in until they disconnect, and when they all go away, it
> shuts down gracefully.

Well, it said "Failed to shutdown ..." and then returned control, and then
proceeded to run for about an hour. I'm not sure how graceful that is. I
generally take programs at their word. "Failed" is clearly past tense.

> So far as your Django install, have you activated the memcache contrib.
> module? Your pages should be lazy-caching and rarely depend on the DB, if
> they can.

yes thanks, my web app is very finely tuned and is working splendidly. I've
been working on very large sites since 1998 and this client has been with me
for 10 years already. It's a fairly high traffic site. I've only been using
postgres since we migrated in May, but it is one particular table on postgres
that has shit the sock drawer.

> You should also rarely be doing count(*) on a 300k row table, even if
> everything is cached and speedy.

I'm not. This is a test query that is obviously way out of bounds for
acceptable response. There is something very very wrong with this table and I
need to solve it ASAP. Other tables that have less updates but similar sizes
are not having this problem.

There are foreign keys pointing to this table, so it's a bit tricky to just
refill it, but I can think of one way. I'll have to do that. It's only
conjecture that the issue is file space bloat or free map problems. Those are
overall issues that I will get to as soon as I can, but this is table
specific.

> That's an application design issue you need to address before it's too
> late, or you have to rush and implement a hasty fix.

It is not an application design issue, though there are always improvements
being made.

> Being a DBA sucks sometimes. :)

I am not a DBA, I'm just trying to query a 300k row table. Though I am happy
to learn more. I know an awful lot about a lot of things, but you can't
specialize in everything.
Re: [PERFORM] Really really slow select count(*)
BRUTAL

http://www.postgresql.org/docs/8.3/interactive/runtime-config-resource.html

  max_fsm_pages ... See Section 17.4.1
  http://www.postgresql.org/docs/8.3/interactive/kernel-resources.html#SYSVIPC
  for information on how to adjust those parameters, if necessary.

I see absolutely nothing in there about how to set those parameters.

several hours later (

  where is my data directory ? 8.4 shows it in SHOW ALL; 8.3 does not.
  conf files ? in the data directory no, they're in /etc/postgres/8.3/main
  where is pg_ctl ?
  what user do I need to be ? postgres
  then why was it installed in the home dir of a user that does not have
  permissions to use it ??

)

cd /home/crucial/bin
/home/crucial/bin/pg_ctl -D /var/lib/postgresql/8.3/main reload

reload does not reset max_fsm_pages, I need to actually restart the server.

postgres@nestseekers:/home/crucial/bin$ /home/crucial/bin/pg_ctl -D /var/lib/postgresql/8.3/main restart
waiting for server to shut down... failed
pg_ctl: server does not shut down

OK, my mistake. Probably I have to disconnect all clients. I don't want to do
a planned maintenance right now, so I go to sleep.

The server restarts itself an hour later. But no, it fails to restart,
because this memory setting you recommend is not possible without
reconfiguring the kernel:

postgres@nestseekers:/home/crucial/bin$
2011-02-06 05:18:00 EST LOG:  could not load root certificate file "root.crt": No such file or directory
2011-02-06 05:18:00 EST DETAIL:  Will not verify client certificates.
2011-02-06 05:18:00 EST FATAL:  could not create shared memory segment: Invalid argument
2011-02-06 05:18:00 EST DETAIL:  Failed system call was shmget(key=5432001, size=35463168, 03600).
2011-02-06 05:18:00 EST HINT:  This error usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX parameter. You can either reduce the request size or reconfigure the kernel with larger SHMMAX. To reduce the request size (currently 35463168 bytes), reduce PostgreSQL's shared_buffers parameter (currently 3072) and/or its max_connections parameter (currently 103). If the request size is already small, it's possible that it is less than your kernel's SHMMIN parameter, in which case raising the request size or reconfiguring SHMMIN is called for. The PostgreSQL documentation contains more information about shared memory configuration.
^C

*and the website is down for the next 6 hours while I sleep.*

total disaster

After a few tries I get it to take a max_fsm_pages of 300k:

postgres@nestseekers:/home/crucial/bin$
2011-02-06 05:19:26 EST LOG:  could not load root certificate file "root.crt": No such file or directory
2011-02-06 05:19:26 EST DETAIL:  Will not verify client certificates.
2011-02-06 05:19:26 EST LOG:  database system was shut down at 2011-02-06 00:07:41 EST
2011-02-06 05:19:27 EST LOG:  autovacuum launcher started
2011-02-06 05:19:27 EST LOG:  database system is ready to accept connections
^C

2011-02-06 05:33:45 EST LOG:  checkpoints are occurring too frequently (21 seconds apart)
2011-02-06 05:33:45 EST HINT:  Consider increasing the configuration parameter "checkpoint_segments".

??

From my perspective: the defaults for postgres 8.3 result in a database that
does not scale and fails dramatically after 6 months. Changing that default
is brutally difficult and can only really be done by adjusting something in
the kernel.

I have clustered that table, it's still unbelievably slow. I still don't know
if this bloat due to the small free space map has anything to do with why the
table is performing like this.

On Fri, Feb 4, 2011 at 5:35 PM, Shaun Thomas stho...@peak6.com wrote:

> You can stop the bloating by setting the right max_fsm_pages setting, but
> you'll either have to go through and VACUUM FULL every table in your
> database, or dump/restore to regain all the lost space and performance
> (the latter would actually be faster).
>
> Before I even touch an older PostgreSQL DB, I set it to some value over
> 3-million just as a starting value to be on the safe side. A little used
> memory is a small price to pay for stopping gradual expansion.
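The numbers in that shmget() failure decompose cleanly: shared_buffers is counted in 8 kB buffer pages, and the rest of the ~35 MB request covers other shared state. A quick breakdown (my arithmetic, not from the thread; the attribution of the remainder to lock tables, connection slots and WAL buffers is an assumption about typical 8.3 shared-memory layout):

```python
# The failed shmget() asked for 35,463,168 bytes.
# shared_buffers=3072 is counted in 8 kB buffer pages:
shared_buffers_bytes = 3072 * 8192
print(shared_buffers_bytes)               # 25165824 bytes (24 MB)

# The remainder covers other shared structures (lock tables, slots for
# max_connections=103, WAL buffers, ...):
print(35_463_168 - shared_buffers_bytes)  # 10297344 bytes (~10 MB)
```

This is why the HINT suggests lowering either shared_buffers or max_connections when the total request exceeds the kernel's SHMMAX.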
Re: [PERFORM] Really really slow select count(*)
On Sun, Feb 6, 2011 at 4:23 PM, Scott Marlowe scott.marl...@gmail.com wrote:

> Let's review:
> 1: No test or staging system used before production

no, I do not have a full ubuntu machine replicating the exact memory and
application load of the production server. This was changing one
configuration parameter, something I was advised to do, read about quite a
bit, tested on my development server (mac), and then proceeded to do at 6 am
on Sunday morning, our slowest time.

> 2: DB left in an unknown state (trying to shut down, not able)

what ? I checked the site, everything was normal. I went in via psql and
tried some queries for about half an hour and continued to monitor the site.
Then I went to bed at 7am (EU time). Why did it shut down so much later?

I have never restarted postgres before, so this was all new to me. I
apologize that I wasn't born innately with such knowledge. So is it normal
for postgres to report that it failed to shut down, operate for an hour, and
then go ahead and restart itself?

> 3: No monitoring software to tell you when the site is down

of course I have monitoring software, both external and internal. But it
doesn't come and kick me out of bed. Yes, I need an automated cell phone
call. That was the first thing I saw to afterwards.

> 4: I'm gonna just go ahead and guess no backups were taken either, or are
> regularly taken.

WTF ? Of course I have backups.

I just went through a very harsh down period event. I fail to see why it is
now necessary for you to launch such an attack on me. Perhaps the tone of my
post sounded like I was blaming you, or at least you felt that way. Why do
you feel that way ? Why not respond with: "ouch ! did you check this ...
that" - say something nice and helpful, correct my mistakes.

> This website can't be very important, if that's the way you treat it.

Just to let you know, that is straight up offensive. This is a high traffic
real estate site. Downtime is unacceptable. I had less downtime than this
when I migrated to the new platform.

I spent rather a large amount of time reading and questioning here. I asked
many questions for clarification and didn't do ANYTHING until I was sure it
was the correct solution. I didn't just pull some shit off a blog and start
changing settings at random. I double checked opinions against different
people and I searched for more docs on that param. Amazingly none of the ones
I found commented on the shared memory issue, and I didn't even understand
the docs discussing shared memory because it didn't seem to apply to what I
was doing. That's my misunderstanding. I came here to share my
misunderstanding.

And my phone starts complaining a minute after the site stops responding if
something does go wrong the rest of the time.

> Do not lay this at anyone else's feet.

I didn't. There is not even the slightest hint of that in my post. I came
here and posted the details of where I went wrong and what confused me about
the documentation that I followed. That's so other people can follow it and
so somebody here can comment on it.

>> changing that default is brutally difficult and can only really be done
>> by adjusting something in the kernel.
>
> Please, that's a gross exaggeration. The sum total of changing them is:
>
> run sysctl -a|grep shm
> copy out proper lines to change
> edit sysctl.conf
> put new lines in there with changes
> sudo sysctl -p # applies changes
> edit the appropriate postgresql.conf, make changes
> sudo /etc/init.d/postgresql-8.3 stop
> sudo /etc/init.d/postgresql-8.3 start

Considering how splendidly the experiment with changing max_fsm_pages went, I
think you can understand that I have no desire to experiment with kernel
settings. It is easy for you because you ALREADY KNOW everything involved. I
am not a sysadmin and we don't have one. My apologies for that.

So does the above mean that I don't have to restart the entire server, just
postgres? I assumed that changing kernel settings meant rebooting the server.

>> I have clustered that table, it's still unbelievably slow.
>
> Did you actually delete the old entries before clustering it? if it's
> still got 4G of old sessions or whatever in it, clustering ain't gonna
> help.

It's a different table. The problem one has only 300k rows. The problem is
not the size, the problem is the speed is catastrophic.

>> I still don't know if this bloat due to the small free space map has
>> anything to do with why the table is performing like this.
>
> Since you haven't shown us what changes, if any, have happened to the
> table, neither do we :)

Sorry, it didn't seem to be the most important topic when I got out of bed.
Re: [PERFORM] Really really slow select count(*)
Yeah, it already uses memcached with db save; nothing important in session
anyway. The session table is not the issue, and I never clustered that one or
ever will.

Thanks for the tip, also the other one about HOT.

On Sun, Feb 6, 2011 at 8:19 PM, Pierre C li...@peufeu.com wrote:

>> I have clustered that table, its still unbelievably slow.
>
> Did you actually delete the old entries before clustering it? if it's
> still got 4G of old sessions or whatever in it, clustering ain't gonna
> help.
>
> Also, IMHO it is a lot better to store sessions in something like
> memcached, rather than imposing this rather large load on the main
> database...
>
> PS : if your site has been down for 6 hours, you can TRUNCATE your
> sessions table...
[PERFORM] Really really slow select count(*)
I am having huge performance problems with a table. Performance deteriorates every day and I have to run REINDEX and ANALYZE on it every day. Autovacuum is on. Yes, I am reading the other thread about count(*) :) but obviously I'm doing something wrong here.

explain analyze select count(*) from fastadder_fastadderstatus;

 Aggregate  (cost=62458.73..62458.74 rows=1 width=0) (actual time=77130.000..77130.000 rows=1 loops=1)
   ->  Seq Scan on fastadder_fastadderstatus  (cost=0.00..61701.18 rows=303018 width=0) (actual time=50.000..76930.000 rows=302479 loops=1)
 Total runtime: *77250.000 ms*

Directly after REINDEX and ANALYZE:

 Aggregate  (cost=62348.70..62348.71 rows=1 width=0) (actual time=15830.000..15830.000 rows=1 loops=1)
   ->  Seq Scan on fastadder_fastadderstatus  (cost=0.00..61613.16 rows=294216 width=0) (actual time=30.000..15570.000 rows=302479 loops=1)
 Total runtime: 15830.000 ms

Still very bad for a 300k row table. A similar table:

explain analyze select count(*) from fastadder_fastadderstatuslog;

 Aggregate  (cost=8332.53..8332.54 rows=1 width=0) (actual time=1270.000..1270.000 rows=1 loops=1)
   ->  Seq Scan on fastadder_fastadderstatuslog  (cost=0.00..7389.02 rows=377402 width=0) (actual time=0.000..910.000 rows=377033 loops=1)
 Total runtime: 1270.000 ms

It gets updated quite a bit each day, and this is perhaps the problem. To me it doesn't seem like that many updates:

100-500 rows inserted per day
no deletes
10k-50k updates per day, mostly of this sort: set priority=1 where id=12345

Is it perhaps this that is causing the performance problem? I could rework the app to be more efficient and do updates using batches with where id IN (1,2,3,4...). I assume that means a more efficient index update compared to individual updates. There is one routine that updates position_in_queue using a lot of (too many) update statements. Is that likely to be the culprit?
*What else can I do to investigate ?*

                         Table "public.fastadder_fastadderstatus"
      Column       |           Type           |                              Modifiers
-------------------+--------------------------+---------------------------------------------------------------------
 id                | integer                  | not null default nextval('fastadder_fastadderstatus_id_seq'::regclass)
 apt_id            | integer                  | not null
 service_id        | integer                  | not null
 agent_priority    | integer                  | not null
 priority          | integer                  | not null
 last_validated    | timestamp with time zone |
 last_sent         | timestamp with time zone |
 last_checked      | timestamp with time zone |
 last_modified     | timestamp with time zone | not null
 running_status    | integer                  |
 validation_status | integer                  |
 position_in_queue | integer                  |
 sent              | boolean                  | not null default false
 built             | boolean                  | not null default false
 webid_suffix      | integer                  |
 build_cache       | text                     |
Indexes:
    "fastadder_fastadderstatus_pkey" PRIMARY KEY, btree (id)
    "fastadder_fastadderstatus_apt_id_key" UNIQUE, btree (apt_id, service_id)
    "fastadder_fastadderstatus_agent_priority" btree (agent_priority)
    "fastadder_fastadderstatus_apt_id" btree (apt_id)
    "fastadder_fastadderstatus_built" btree (built)
    "fastadder_fastadderstatus_last_checked" btree (last_checked)
    "fastadder_fastadderstatus_last_validated" btree (last_validated)
    "fastadder_fastadderstatus_position_in_queue" btree (position_in_queue)
    "fastadder_fastadderstatus_priority" btree (priority)
    "fastadder_fastadderstatus_running_status" btree (running_status)
    "fastadder_fastadderstatus_service_id" btree (service_id)
Foreign-key constraints:
    "fastadder_fastadderstatus_apt_id_fkey" FOREIGN KEY (apt_id) REFERENCES nsproperties_apt(id) DEFERRABLE INITIALLY DEFERRED
    "fastadder_fastadderstatus_service_id_fkey" FOREIGN KEY (service_id) REFERENCES fastadder_fastadderservice(id) DEFERRABLE INITIALLY DEFERRED

thanks !
Fwd: [PERFORM] Really really slow select count(*)
Sorry, reply was meant to go to the list.

---------- Forwarded message ----------
From: felix crucialfe...@gmail.com
Date: Fri, Feb 4, 2011 at 5:17 PM
Subject: Re: [PERFORM] Really really slow select count(*)
To: stho...@peak6.com

On Fri, Feb 4, 2011 at 4:00 PM, Shaun Thomas stho...@peak6.com wrote:

> How big is this table when it's acting all bloated and ugly?

458MB

> Is this the only thing running when you're doing your tests? What does your disk IO look like?

This is on a live site. Best not to scare the animals. I have the same config on the dev environment, but not the same table size.

>> 10k-50k updates per day mostly of this sort: set priority=1 where id=12345
>
> Well... that's up to 16% turnover per day, but even then, regular vacuuming should keep it manageable.

Something is definitely amiss with this table. I'm not sure if it's something that happened at one point when killing a task that was writing to it, or if it's something about the way the app is updating. It SHOULDN'T be that much of a problem, though I can find ways to improve it.

> No. Don't do that. You'd be better off loading everything into a temp table and doing this:
>
> UPDATE fastadder_fastadderstatus s
>    SET priority = 1
>   FROM temp_statuses t
>  WHERE t.id = s.id;

OK, that is one of the solutions I was thinking about. Are updates of the "where id IN (1,2,3,4)" form generally not efficient? How about for select queries?
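For the archive, a fuller sketch of that temp-table pattern. The temp_statuses table and the sample values are illustrative, following the snippet quoted above, not code from the app:

```sql
-- stage the new priorities in one shot
CREATE TEMP TABLE temp_statuses (id integer PRIMARY KEY, priority integer);

-- load it however is convenient, e.g. COPY from the app or a multi-row insert
INSERT INTO temp_statuses (id, priority)
VALUES (12345, 1), (12346, 1), (12347, 2);

-- a single UPDATE then touches all the rows at once, instead of
-- thousands of individual single-row statements
UPDATE fastadder_fastadderstatus s
   SET priority = t.priority
  FROM temp_statuses t
 WHERE t.id = s.id;
```

One statement means one scan/join and one round of index maintenance, which is the point of the advice.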
>> fastadder_fastadderstatus_pkey PRIMARY KEY, btree (id)
>> fastadder_fastadderstatus_apt_id_key UNIQUE, btree (apt_id, service_id)
>> fastadder_fastadderstatus_agent_priority btree (agent_priority)
>> fastadder_fastadderstatus_apt_id btree (apt_id)
>> fastadder_fastadderstatus_built btree (built)
>> fastadder_fastadderstatus_last_checked btree (last_checked)
>> fastadder_fastadderstatus_last_validated btree (last_validated)
>> fastadder_fastadderstatus_position_in_queue btree (position_in_queue)
>> fastadder_fastadderstatus_priority btree (priority)
>> fastadder_fastadderstatus_running_status btree (running_status)
>> fastadder_fastadderstatus_service_id btree (service_id)
>
> Whoa! Hold on, here. That looks like *way* too many indexes.

I actually just added most of those yesterday in an attempt to improve performance. priority and agent_priority were missing indexes, and that was a big mistake; overall performance went way up on my primary selects.

> Definitely will slow down your insert/update performance.

There are a lot more selects happening throughout the day.

> The index on 'built', for example, is a boolean. If it's evenly distributed, that's 150k matches for true or false,

OK, for built, True is in the minority. Here is the test query that caused me to add indices to the booleans. This is a 30k table which is doing selects on two booleans constantly.
Again: True is the minority.

explain analyze SELECT nsproperties_apt.id, nsproperties_apt.display_address, nsproperties_apt.apt_num, nsproperties_apt.bldg_id, nsproperties_apt.is_rental, nsproperties_apt.is_furnished, nsproperties_apt.listing_type, nsproperties_apt.list_on_web, nsproperties_apt.is_approved, nsproperties_apt.status, nsproperties_apt.headline, nsproperties_apt.slug, nsproperties_apt.cross_street, nsproperties_apt.show_apt_num, nsproperties_apt.show_building_name, nsproperties_apt.external_url, nsproperties_apt.listed_on, nsproperties_bldg.id, nsproperties_bldg.name
FROM nsproperties_apt
LEFT OUTER JOIN nsproperties_bldg ON (nsproperties_apt.bldg_id = nsproperties_bldg.id)
WHERE (nsproperties_apt.list_on_web = True AND nsproperties_apt.is_available = True);

QUERY PLAN

 Hash Left Join  (cost=408.74..10062.18 rows=3344 width=152) (actual time=12.688..2442.542 rows=2640 loops=1)
   Hash Cond: (nsproperties_apt.bldg_id = nsproperties_bldg.id)
   ->  Seq Scan on nsproperties_apt  (cost=0.00..9602.52 rows=3344 width=139) (actual time=0.025..2411.644 rows=2640 loops=1)
         Filter: (list_on_web AND is_available)
   ->  Hash  (cost=346.66..346.66 rows=4966 width=13) (actual time=12.646..12.646 rows=4966 loops=1)
         ->  Seq Scan on nsproperties_bldg  (cost=0.00..346.66 rows=4966 width=13) (actual time=0.036..8.236 rows=4966 loops=1)
 Total runtime: 2444.067 ms
(7 rows)

=

 Hash Left Join  (cost=1232.45..9784.18 rows=5690 width=173) (actual time=30.000..100.000 rows=5076 loops=1)
   Hash Cond: (nsproperties_apt.bldg_id = nsproperties_bldg.id)
   ->  Bitmap Heap Scan on nsproperties_apt  (cost=618.23..9075.84 rows=5690 width=157) (actual time=10.000..60.000 rows=5076 loops=1)
         Filter: (list_on_web AND is_available)
         ->  BitmapAnd  (cost=618.23..618.23 rows=5690 width=0) (actual time=10.000..10.000 rows=0 loops=1)
               ->  Bitmap Index Scan on nsproperties_apt_is_available  (cost=0.00..131.81 rows=6874 width=0) (actual time=0.000..0.000 rows
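Since the matching rows are a small minority, one option worth mentioning for that boolean pair is a partial index, which indexes only the rows the query actually wants. This is a sketch against the schema above, untested; the index name is made up:

```sql
-- index only the rows satisfying the common WHERE clause; the planner
-- can use it for exactly this filter, and it stays tiny because the
-- majority (false) rows are never stored in it
CREATE INDEX nsproperties_apt_web_available
    ON nsproperties_apt (bldg_id)
 WHERE list_on_web AND is_available;
```

Two full-boolean btree indexes combined with BitmapAnd work, as the second plan shows, but a single partial index avoids indexing the uninteresting majority at all.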
[PERFORM] Really really slow select count(*)
Reply was meant for the list.

---------- Forwarded message ----------
From: felix crucialfe...@gmail.com
Date: Fri, Feb 4, 2011 at 4:39 PM
Subject: Re: [PERFORM] Really really slow select count(*)
To: Greg Smith g...@2ndquadrant.com

On Fri, Feb 4, 2011 at 3:56 PM, Greg Smith g...@2ndquadrant.com wrote:

> PostgreSQL version? If you're running on 8.3 or earlier, I would be suspicious that your Free Space Map has been overrun.

8.3

> What you are seeing is that the table itself is much larger on disk than it's supposed to be.

Which part of the explain told you that? Per Shaun Thomas:

SELECT relpages*8/1024 FROM pg_class WHERE relname='fastadder_fastadderstatus';

458MB. Way too big. build_cache is text between 500-1k chars.

> That can be caused by frequent UPDATEs if you don't have vacuum cleanup working effectively; you'll get lots of dead sections left behind from UPDATEs in the middle.

OK, I just vacuumed it (did this manually a few times as well), and auto is on. Still: 32840.000 ms, and still 458MB.

> The best way to fix all this is to run CLUSTER on the table.

http://www.postgresonline.com/journal/archives/10-How-does-CLUSTER-ON-improve-index-performance.html

Now that would order the data on disk by id (primary key). The usage of the table is either by a query or by position_in_queue, which is rewritten often (I might change this part of the app and pull it out of this table). Is this definitely the best way to fix this? Thanks for your help!

> That will introduce a bit of downtime while it holds a lock on the table (only a few minutes based on what you've shown here), but the copy you'll have afterwards won't be spread all over disk anymore.
>
> --
> Greg Smith  2ndQuadrant US  g...@2ndquadrant.com  Baltimore, MD
> PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
> PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books
Re: [PERFORM] Really really slow select count(*)
On Fri, Feb 4, 2011 at 5:35 PM, Shaun Thomas stho...@peak6.com wrote:

> vacuumdb -a -v -z > vacuum.log
>
> And at the end of the log, it'll tell you how many pages it wants, and how many pages were available.

This is the dev, not live. But this is after it gets done with that table:

CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: analyzing "public.fastadder_fastadderstatus"
INFO: "fastadder_fastadderstatus": scanned 2492 of 2492 pages, containing 154378 live rows and 0 dead rows; 3 rows in sample, 154378 estimated total rows

and there's nothing at the end of the whole vacuum output about pages.

Actual command: vacuumdb -U postgres -W -v -z djns4 > vacuum.log

I tried it with all databases too.

Thanks
Re: [PERFORM] Really really slow select count(*)
vacuumdb -a -v -z -U postgres -W > vacuum.log

That's all, isn't it? It did each db. 8.3, in case that matters. The very end:

There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: analyzing "public.seo_partnerlinkcategory"
INFO: "seo_partnerlinkcategory": scanned 0 of 0 pages, containing 0 live rows and 0 dead rows; 0 rows in sample, 0 estimated total rows

On Fri, Feb 4, 2011 at 6:40 PM, Scott Marlowe scott.marl...@gmail.com wrote:

>> I tried it with all databases too
>
> I believe you have to run it on the whole db to get that output.
Re: [PERFORM] Really really slow select count(*)
vacuumdb -a -v -z -U postgres -W > vacuum.log
Password: Password: Password: Password: Password: Password: Password: Password: Password: Password: Password:
cruxnu:nsbuildout crucial$

Do you think it's possible that it just doesn't have anything to complain about? Or that the password prompting is affecting it? In any case I'm not sure I want to run this even at night on production.

What is the downside to estimating max_fsm_pages too high? 300k should be safe; it's certainly not 150k. I have one very large table (10m rows) that is being analyzed before I warehouse it. That could've been the monster that ate the free space map. I think today I've learned that even unused tables affect postgres performance.

And do you agree that I should turn CLUSTER ON? I have no problem stopping all tasks to this table at night and just reloading it.

On Fri, Feb 4, 2011 at 6:47 PM, Shaun Thomas stho...@peak6.com wrote:

> On 02/04/2011 11:44 AM, felix wrote:
>
>> the very end:
>> There were 0 unused item pointers.
>> 0 pages are entirely empty.
>> CPU 0.00s/0.00u sec elapsed 0.00 sec.
>> INFO: analyzing public.seo_partnerlinkcategory
>> INFO: seo_partnerlinkcategory: scanned 0 of 0 pages, containing 0 live rows and 0 dead rows; 0 rows in sample, 0 estimated total rows
>
> That looks to me like it didn't finish. Did you fork it off with '&' or run it and wait until it gave control back to you? It really should be telling you how many pages it wanted, and are in use. If not, something odd is going on.
>
> --
> Shaun Thomas
> OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
> 312-676-8870
> stho...@peak6.com
> See http://www.peak6.com/email_disclaimer.php for terms and conditions related to this email
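For the record, raising it is just a postgresql.conf edit plus a restart; the numbers below are illustrative placeholders, not recommendations (and note these settings exist only through 8.3; the FSM became self-managing in 8.4):

```
# postgresql.conf (8.3)
max_fsm_pages = 300000       # upper bound on disk pages tracked in the free space map
max_fsm_relations = 1000     # upper bound on relations tracked

# then: sudo /etc/init.d/postgresql-8.3 restart
```

The cost of overestimating is modest: the FSM is sized up front in shared memory (roughly six bytes per page slot), so a too-high value just wastes a little RAM, whereas a too-low value lets dead space go untracked and the table bloats.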
Re: [PERFORM] Really really slow select count(*)
On Fri, Feb 4, 2011 at 7:34 PM, Shaun Thomas stho...@peak6.com wrote:

> Why is it asking for the password over and over again? It shouldn't be doing that.

Because I asked it to: -W. On the production server I need to enter a password, and I'm testing on dev first. I just sudo-tried it, but still no report.

>> and do you agree that I should turn CLUSTER ON ?
>
> Cluster isn't really something you turn on, but something you do.

djns4=# cluster fastadder_fastadderstatus;
ERROR: there is no previously clustered index for table "fastadder_fastadderstatus"

http://www.postgresonline.com/journal/archives/10-How-does-CLUSTER-ON-improve-index-performance.html

djns4=# alter table fastadder_fastadderstatus CLUSTER ON fastadder_fastadderstatus_pkey;
ALTER TABLE
djns4=# CLUSTER fastadder_fastadderstatus;
CLUSTER

OK, that's why I figured I was turning something on. The table has been altered. It will be pk-ordered, new entries always at the end, and no deletes. But this means I have to manually run CLUSTER from time to time, right? Not that there will be much or any reordering. Or it should be fine going forward with vacuum and an enlarged free space map.

> It's like vacuum full, in that it basically rebuilds the table and all indexes from scratch. The major issue you'll run into is that it reorders the table by the index you chose, so you'd best select the primary key unless you have reasons to use something else. And you have to do it table by table, which will really suck since we already know your whole db has bloated, not just one or two tables.

Do we know that? Many of the tables are fairly static. Only this one is seriously borked, and yet other related tables seem to be fine.

> You're going to be doing some scripting, buddy. :) Well, unless you just do a dump/restore and start over with sane postgresql.conf settings.

Well, who knew the defaults were unsane? :) Scripting this is trivial; I already have the script. I have made the mistake of doing VACUUM FULL in the past,
in fact on this table, and it had to be killed because it took down my entire website! That may well be the major borking event. A credit to postgres that the table still functions, if that's the case.

Scott Marlowe:

begin;
select * into temporaryholdingtable from oldtable order by somefield;
truncate oldtable;
insert into oldtable select * from temporaryholdingtable;
commit;

That sounds like a good approach.

Gentlemen, 300,000+ thanks for your generous time! (a small number, I know)
-felix
Re: [PERFORM] Really really slow select count(*)
Ah right, duh. Yes, I did it as -U postgres, verified as a superuser. Just now did it from inside psql as postgres:

\c djns4
vacuum verbose analyze;

Still no advice on the pages.

On Fri, Feb 4, 2011 at 8:34 PM, Scott Marlowe scott.marl...@gmail.com wrote:

> On Fri, Feb 4, 2011 at 12:26 PM, felix crucialfe...@gmail.com wrote:
>
>> I just sudo tried it but still no report
>
> It's not about who you are in Unix / Linux, it's about who you are in Postgresql. \du will show you who is a superuser. psql -U username will let you connect as that user.
Re: [PERFORM] Really really slow select count(*)
It probably has good reason to hate me.

SELECT n.nspname AS schema_name, c.relname AS table_name,
       c.reltuples AS row_count, c.relpages*8/1024 AS mb_used,
       pg_total_relation_size(c.oid)/1024/1024 AS total_mb_used
  FROM pg_class c
  JOIN pg_namespace n ON (n.oid = c.relnamespace)
 WHERE c.relkind = 'r'
 ORDER BY total_mb_used DESC
 LIMIT 20;

 schema_name |            table_name            |  row_count  | mb_used | total_mb_used
-------------+----------------------------------+-------------+---------+---------------
 public      | django_session                   | 1.47843e+07 |    4122 |         18832
 public      | traffic_tracking2010             | 9.81985e+06 |     811 |          1653
 public      | mailer_mailingmessagelog         | 7.20214e+06 |     441 |          1082
 public      | auth_user                        | 3.20077e+06 |     572 |           791
 public      | fastadder_fastadderstatus        |      302479 |     458 |           693
 public      | registration_registrationprofile | 3.01345e+06 |     248 |           404
 public      | reporting_dp_6c93734c            |  1.1741e+06 |      82 |           224
 public      | peoplez_contact                  |       79759 |      18 |           221
 public      | traffic_tracking201101           | 1.49972e+06 |     163 |           204
 public      | reporting_dp_a3439e2a            | 1.32739e+06 |      82 |           187
 public      | nsproperties_apthistory          |       44906 |      69 |           126
 public      | nsproperties_apt                 |       30780 |      71 |           125
 public      | clients_showingrequest           |       85175 |      77 |           103
 public      | reporting_dp_4ffe04ad            |      330252 |      26 |            63
 public      | fastadder_fastadderstatuslog     |      377402 |      28 |            60
 public      | nsmailings_officememotoagent     |      268345 |      15 |            52
 public      | celery_taskmeta                  |        5041 |      12 |            32
 public      | mailer_messagelog                |      168298 |      24 |            32
 public      | datapoints_job                   |        9167 |      12 |            23
 public      | fastadder_fastadderstatus_errors |      146314 |       7 |            21

Oh, and there in the footnotes to django they say don't forget to run the delete-expired-sessions management command every once in a while. Thanks guys. It won't run now because the table is too big, but I can delete them from psql.

Well, just think how sprightly my website will run tomorrow once I fix these.

On Fri, Feb 4, 2011 at 9:00 PM, Shaun Thomas stho...@peak6.com wrote:

> On 02/04/2011 01:59 PM, felix wrote:
>
>> still no advice on the pages
>
> I think it just hates you.
>
> --
> Shaun Thomas
> OptionsHouse | 141 W. Jackson Blvd.
| Suite 800 | Chicago IL, 60604
312-676-8870
stho...@peak6.com
Re: [PERFORM] Update problem on large table
Thanks for the replies! Actually I did figure out how to kill it, but pg_cancel_backend didn't work. Here are some notes.

This had been hung for 5 days:

ns | 32681 | nssql | <IDLE> in transaction | f | 2010-12-01 15...

resulting in:

fastadder_fastadderstatus: scanned 3000 of 58551 pages, containing 13587 live rows and 254709 dead rows;

and resulting in general pandemonium.

You need to become the postgres superuser to use pg_cancel_backend:

su postgres
psql

and then:

select pg_cancel_backend(32681);

But this does not kill the <IDLE> in transaction processes. It returns true, but the process is still there. From the linux shell I tried:

pg_ctl kill INT 32681

but it still would not die. The docs for pg_ctl state: "Use pg_ctl --help to see a list of supported signal names." Doing so does indeed tell me the names (HUP INT QUIT ABRT TERM USR1 USR2), but nothing about them whatsoever :)

Throwing caution to the wind:

pg_ctl kill TERM 32681

and that did it. Ran VACUUM, and now performance has returned to normal. Lessons learned. I guess, as Josh says, pg_cancel_backend is the same as SIGINT, which also failed for me via pg_ctl; not sure why.

The hung transaction was doing something like: update table set field = null where service_id = x

On Mon, Dec 6, 2010 at 9:26 PM, Kenneth Marshall k...@rice.edu wrote:

> On Mon, Dec 06, 2010 at 03:24:31PM -0500, Josh Kupershmidt wrote:
>
>> On Mon, Dec 6, 2010 at 2:48 PM, Jon Nelson jnelson+pg...@jamponi.net wrote:
>>
>>> On Mon, Dec 6, 2010 at 1:46 PM, bricklen brick...@gmail.com wrote:
>>>
>>>> Not sure if anyone replied about killing your query, but you can do it like so:
>>>> select pg_cancel_backend(5902);  -- assuming 5902 is the pid of the query you want canceled.
>>>
>>> How does this differ from just killing the pid?
>>
>> pg_cancel_backend(5902) does the same thing as: kill -SIGINT 5902
>>
>> Josh
>
> Yes, but you can use it from within the database. The kill command requires shell access to the backend.
>
> Cheers,
> Ken
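Worth noting for anyone who finds this thread later: on 8.3 the pg_ctl kill TERM route above is what's available, but PostgreSQL 8.4 and later expose the same signal from SQL, so no shell access to the backend is needed:

```sql
-- 8.4+: sends SIGTERM to the backend, which does terminate
-- "<IDLE> in transaction" sessions that pg_cancel_backend() (SIGINT) won't touch
SELECT pg_terminate_backend(32681);
```

Like pg_cancel_backend, it requires superuser (or, in modern versions, ownership of the target session) and returns true if the signal was sent.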
Re: [PERFORM] Update problem on large table
OK, I caught one: an update that is stuck in waiting. The first one blocks the second one.

ns | 5902 | nssql | UPDATE fastadder_fastadderstatus SET built = false WHERE fastadder_fastadderstatus.service_id = 1 | f | 2010-12-04 13:44:38.5228-05 | 127.0.0.1
ns | 7000 | nssql | UPDATE fastadder_fastadderstatus SET last_sent = E'2010-12-04 13:50:51.452800', sent = true WHERE (fastadder_fastadderstatus.built = true AND fastadder_fastadderstatus.service_id = 1) | t | 2010-12-04 13:50:51.4628-05 | 127.0.0.1

Is it possible to release the lock and/or cancel the query? The process that initiated the first one has long since ceased.

On Fri, Nov 26, 2010 at 6:02 PM, bricklen brick...@gmail.com wrote:

> No problem!

On Fri, Nov 26, 2010 at 8:34 AM, felix crucialfe...@gmail.com wrote:

> thanks ! of course now, 2 hours later, the queries run fine. the first one was locked up for so long that I interrupted it. maybe that caused it to get blocked. saved your query for future reference, thanks again !

On Fri, Nov 26, 2010 at 5:00 PM, bricklen brick...@gmail.com wrote:

> On Fri, Nov 26, 2010 at 6:22 AM, felix crucialfe...@gmail.com wrote:
>
>> Hello, I have a very large table that I'm not too fond of. I'm revising the design now. Up until now its been insert only, storing tracking codes from incoming webtraffic. It has 8m rows. It appears to insert fine, but simple updates using psql are hanging.
>>
>> update ONLY traffic_tracking2010 set src_content_type_id = 90 where id = 90322;
>>
>> I am also now trying to remove the constraints, this also hangs.
>>
>> alter table traffic_tracking2010 drop constraint traffic_tracking2010_src_content_type_id_fkey;
>>
>> thanks in advance for any advice.
>
> Try your update or alter and in another session, run the following query and look at the waiting column. A true value means that that transaction is blocked.
SELECT pg_stat_activity.datname, pg_stat_activity.procpid, pg_stat_activity.usename,
       pg_stat_activity.current_query, pg_stat_activity.waiting,
       pg_stat_activity.query_start, pg_stat_activity.client_addr
  FROM pg_stat_activity
 WHERE pg_stat_activity.procpid <> pg_backend_pid()
   AND pg_stat_activity.current_query <> '<IDLE>'
 ORDER BY pg_stat_activity.query_start;

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
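When that waiting column shows true, pg_locks can point at the session holding the conflicting lock. A rough sketch for the 8.3-era catalogs (untested here; on 9.2+ pg_stat_activity's procpid became pid, but pg_locks has always used pid):

```sql
-- pair ungranted lock requests with the granted lock on the same
-- relation to see which backend is blocking which
SELECT w.pid AS waiting_pid,
       h.pid AS holding_pid,
       w.relation::regclass AS locked_table,
       w.mode AS wanted_mode,
       h.mode AS held_mode
  FROM pg_locks w
  JOIN pg_locks h ON h.relation = w.relation
                 AND h.granted
                 AND h.pid <> w.pid
 WHERE NOT w.granted;
```

With the holding pid in hand, the pg_cancel_backend / signal discussion elsewhere in this thread applies.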
[PERFORM] Update problem on large table
Hello, I have a very large table that I'm not too fond of. I'm revising the design now. Up until now it's been insert only, storing tracking codes from incoming webtraffic. It has 8m rows.

It appears to insert fine, but simple updates using psql are hanging:

update ONLY traffic_tracking2010 set src_content_type_id = 90 where id = 90322;

I am also now trying to remove the constraints; this also hangs:

alter table traffic_tracking2010 drop constraint traffic_tracking2010_src_content_type_id_fkey;

Thanks in advance for any advice.

                    Table "public.traffic_tracking2010"
       Column        |           Type           |           Modifiers
---------------------+--------------------------+--------------------------------
 id                  | integer                  | not null default nextval('traffic_tracking2010_id_seq'::regclass)
 action_time         | timestamp with time zone | not null
 user_id             | integer                  |
 content_type_id     | integer                  |
 object_id           | integer                  |
 action_type         | smallint                 | not null
 src_type            | smallint                 |
 src_content_type_id | integer                  |
 src_object_id       | integer                  |
 http_referrer       | character varying(100)   |
 search_term         | character varying(50)    |
 remote_addr         | inet                     | not null
Indexes:
    "traffic_tracking2010_pkey" PRIMARY KEY, btree (id)
    "traffic_tracking2010_content_type_id" btree (content_type_id)
    "traffic_tracking2010_src_content_type_id" btree (src_content_type_id)
    "traffic_tracking2010_user_id" btree (user_id)
Foreign-key constraints:
    "traffic_tracking2010_content_type_id_fkey" FOREIGN KEY (content_type_id) REFERENCES django_content_type(id) DEFERRABLE INITIALLY DEFERRED
    "traffic_tracking2010_src_content_type_id_fkey" FOREIGN KEY (src_content_type_id) REFERENCES django_content_type(id) DEFERRABLE INITIALLY DEFERRED
    "traffic_tracking2010_user_id_fkey" FOREIGN KEY (user_id) REFERENCES auth_user(id) DEFERRABLE INITIALLY DEFERRED

This is generated by Django's ORM. The hang may be due to having other clients connected, though I have tried doing the update when I know all tracking inserts are stopped. But the other client (the webapp) is still connected.
Based on this: http://postgresql.1045698.n5.nabble.com/slow-full-table-update-td2070754.html

ns=# ANALYZE traffic_tracking2010;
ANALYZE
ns=# SELECT relpages, reltuples FROM pg_class WHERE relname = 'traffic_tracking2010';
 relpages |  reltuples
----------+-------------
    99037 | 8.38355e+06

And I did vacuum it:

vacuum verbose traffic_tracking2010;
INFO: vacuuming "public.traffic_tracking2010"
INFO: scanned index "traffic_tracking2010_pkey" to remove 1057 row versions
DETAIL: CPU 0.09s/0.37u sec elapsed 10.70 sec.
INFO: scanned index "traffic_tracking2010_user_id" to remove 1057 row versions
DETAIL: CPU 0.12s/0.30u sec elapsed 13.53 sec.
INFO: scanned index "traffic_tracking2010_content_type_id" to remove 1057 row versions
DETAIL: CPU 0.11s/0.28u sec elapsed 13.99 sec.
INFO: scanned index "traffic_tracking2010_src_content_type_id" to remove 1057 row versions
DETAIL: CPU 0.09s/0.26u sec elapsed 15.57 sec.
INFO: "traffic_tracking2010": removed 1057 row versions in 535 pages
DETAIL: CPU 0.01s/0.02u sec elapsed 2.83 sec.
INFO: index "traffic_tracking2010_pkey" now contains 8315147 row versions in 22787 pages
DETAIL: 1057 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "traffic_tracking2010_user_id" now contains 8315147 row versions in 29006 pages
DETAIL: 1057 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "traffic_tracking2010_content_type_id" now contains 8315147 row versions in 28980 pages
DETAIL: 1057 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "traffic_tracking2010_src_content_type_id" now contains 8315147 row versions in 28978 pages
DETAIL: 1057 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "traffic_tracking2010": found 336 removable, 8315147 nonremovable row versions in 99035 pages
DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. 25953 pages contain useful free space. 0 pages are entirely empty.
CPU 0.78s/1.49u sec elapsed 100.43 sec.
INFO: vacuuming "pg_toast.pg_toast_165961"
INFO: index "pg_toast_165961_index" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO: "pg_toast_165961": found 0 removable, 0 nonremovable row
Re: [PERFORM] Unsubscribe
On Mon, Oct 02, 2006 at 01:36:17PM -0400, uwcssa wrote: Please unsubscribe me! Thank you! Also, it would be better to have a message foot saying how to unsubscribe. Will this do? It's too big for a footer. Here's how to unsubscribe: First, ask your Internet Provider to mail you an Unsubscribing Kit. Then follow these directions. The kit will most likely be the standard no-fault type. Depending on requirements, System A and/or System B can be used. When operating System A, depress lever and a plastic dalkron unsubscriber will be dispensed through the slot immediately underneath. When you have fastened the adhesive lip, attach connection marked by the large X outlet hose. Twist the silver-coloured ring one inch below the connection point until you feel it lock. The kit is now ready for use. The Cin-Eliminator is activated by the small switch on the lip. When securing, twist the ring back to its initial condition, so that the two orange lines meet. Disconnect. Place the dalkron unsubscriber in the vacuum receptacle to the rear. Activate by pressing the blue button. The controls for System B are located on the opposite side. The red release switch places the Cin-Eliminator into position; it can be adjusted manually up or down by pressing the blue manual release button. The opening is self-adjusting. To secure after use, press the green button, which simultaneously activates the evaporator and returns the Cin-Eliminator to its storage position. You may log off if the green exit light is on over the evaporator. If the red light is illuminated, one of the Cin-Eliminator requirements has not been properly implemented. Press the List Guy call button on the right of the evaporator. He will secure all facilities from his control panel. To use the Auto-Unsub, first undress and place all your clothes in the clothes rack. Put on the velcro slippers located in the cabinet immediately below. Enter the shower, taking the entire kit with you. 
On the control panel to your upper right upon entering you will see a Shower seal button. Press to activate. A green light will then be illuminated immediately below. On the intensity knob, select the desired setting. Now depress the Auto-Unsub activation lever. Bathe normally. The Auto-Unsub will automatically go off after three minutes unless you activate the Manual off override switch by flipping it up. When you are ready to leave, press the blue Shower seal release button. The door will open and you may leave. Please remove the velcro slippers and place them in their container. If you prefer the ultrasonic log-off mode, press the indicated blue button. When the twin panels open, pull forward by rings A B. The knob to the left, just below the blue light, has three settings, low, medium or high. For normal use, the medium setting is suggested. After these settings have been made, you can activate the device by switching to the ON position the clearly marked red switch. If during the unsubscribing operation you wish to change the settings, place the manual off override switch in the OFF position. You may now make the change and repeat the cycle. When the green exit light goes on, you may log off and have lunch. Please close the door behind you. -- ... _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._. Felix Finch: scarecrow repairman rocket surgeon / [EMAIL PROTECTED] GPG = E987 4493 C860 246C 3B1E 6477 7838 76E9 182E 8151 ITAR license #4933 I've found a solution to Fermat's Last Theorem but I see I've run out of room o ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match