Re: [PERFORM] Do I have a hardware or a software problem?
On Wed, Dec 12, 2012 at 8:46 AM, Niels Kristian Schjødt nielskrist...@autouncle.com wrote:

> Den 11/12/2012 kl. 18.25 skrev Jeff Janes jeff.ja...@gmail.com:
>> On Tue, Dec 11, 2012 at 2:04 AM, Niels Kristian Schjødt nielskrist...@autouncle.com wrote:
>>> Maybe I should mention that I never see more than max 5GB out of my total 32GB being in use on the server… Can I somehow utilize more of it?
>>
>> What tool do you use to determine that? Is that on top of the 4GB shared_buffers, or including it?
>
> Okay, I might not have made myself clear: I was talking about physical memory utilization. Here are the stats:
>
> free -m
>              total       used       free     shared    buffers     cached
> Mem:         32075      25554       6520          0         69      22694
> -/+ buffers/cache:       2791      29284
> Swap:         2046        595       1451

I don't know how you get 5 Gig from that, though. You have 22 Gig of cached file system, which for your purposes probably counts as being utilized. Although we don't know how much of this is for postgres data files, chances are it is a pretty good chunk.

>> How big is your entire data set? Maybe all your data fits in 5GB (believable, as all your indexes listed below sum to 2.5GB) so there is no need to use more.
>
> It doesn't; we are a search engine for used cars, and there are quite a lot of those out there :-)

But how big? More than 22GB? (You can use \l+ in psql, or du -s on the data directory.)

> However, my indexes are almost all partial indexes, which means they only cover cars which are still for sale. So in that sense the indexes themselves don't really grow, but the tables do.

So maybe this reverses things. If your car table is huge and the active cars are scattered randomly among all the inactive ones, then updating random active cars is going to generate a lot of randomly scattered writing which can't be combined into sequential writes. Do you have plans for archiving cars that are no longer for sale? Why do you keep them around in the first place, i.e. what types of queries do you run on the inactive ones?
Unfortunately you currently can't use CLUSTER with partial indexes, otherwise that might be a good idea. You could build a full index on whatever it is you use as the criterion for the partial indexes, cluster on that, and then drop it. But the table would eventually become unclustered again, so if this works you might want to implement partitioning between active and inactive partitions so as to maintain the clustering.

>> You could really crank up shared_buffers or vm.dirty_background_ratio, but doing so might cause problems with checkpoints stalling and latency spikes. That would probably not be a problem during the night, but could be during the day.
>
> What do you have in mind here? Tweaking what parameters to what values?

>> I'd set shared_buffers to 20GB (or 10GB, if that will hold all of your data)
>
> I had that before; Shaun suggested that I change it to 4GB, as he was talking about strange behavior when larger than that on 12.04. But I can say that there has not been any notable difference between having it at 4GB and at 8GB.

It is almost an all-or-nothing thing. If you need 16 or 20GB, just going from 4 to 8 isn't going to show much difference. If you can test this easily, I'd just set it to 24 or even 28GB and run the bulk update. I don't think you'd want to run a server permanently at those settings, but it is an easy way to rule in or out different theories about what is going on. But if your database is growing so rapidly that it soon won't fit on 240GB, then cranking up shared_buffers won't do for long.

>> If you can get your tables and all of their indexes clustered together, then you can do the updates in an order that makes IO more efficient. Maybe partitioning would help.
>
> Can you explain a little more about this, or provide me a good link?

If all your partial btree indexes use the same WHERE clause, then your indexes are already clustered together in a sense: a partial index is kind of like a composite index with the WHERE clause as the first column.
So the trick would be to get the table clustered on the same thing, either by partitioning or by the CLUSTER command, or something equivalent to those. I don't know of a good link, other than the documentation (which is more about how to do it, rather than why you would want to or how to design it).

Cheers,

Jeff

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
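Jeff's build-cluster-drop suggestion might look like the following sketch. The table and column names (`cars`, `for_sale`) are hypothetical stand-ins, not the poster's actual schema:

```sql
-- Hypothetical sketch of the cluster-then-drop idea.
CREATE INDEX cars_for_sale_full ON cars (for_sale);  -- full (non-partial) index
CLUSTER cars USING cars_for_sale_full;               -- rewrite the table in index order
ANALYZE cars;                                        -- refresh statistics after the rewrite
DROP INDEX cars_for_sale_full;                       -- the partial indexes remain usable
```

Note that CLUSTER takes an ACCESS EXCLUSIVE lock and rewrites the whole table, so this would have to run in a maintenance window.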
Re: [PERFORM] hash join vs nested loop join
Huan Ruan wrote:

> Hash 1st run
>
> Hash Join  (cost=1681.87..6414169.04 rows=48261 width=171) (actual time=2182.450..88158.645 rows=48257 loops=1)
>   ->  Seq Scan on invtran bigtable  (cost=0.00..4730787.28 rows=168121728 width=108) (actual time=0.051..32581.052 rows=168121657 loops=1)

194 nanoseconds per row suggests 100% cache hits.

> NL 1st run
>
> Nested Loop  (cost=0.00..6451637.88 rows=48261 width=171) (actual time=0.056..551.438 rows=48257 loops=1)
>   ->  Index Scan using pk_invtran on invtran bigtable  (cost=0.00..133.65 rows=1 width=108) (actual time=0.010..0.010 rows=1 loops=48261)

10 microseconds per index scan (each index scan requiring multiple random accesses) also suggests 100% cache hits.

> I originally reduced random_page_cost to 2 to achieve the nested loop join. Now I have set cpu_tuple_cost to 0.05 and reset random_page_cost back to 4, and I can also achieve a nested loop join. I'm still new to Postgres, but I'm worried that a random_page_cost of 2 is too low, so maybe increasing cpu_tuple_cost is a better choice.

If these are typical of what you would expect in production, then the fact that with default cost factors the costs are barely different (by 0.6%) for actual run times which differ by two orders of magnitude (the chosen plan is 160 times slower) means that the modeling of cost factors is off by a lot.

If you expect the active portion of your database to be fully cached like this, it makes sense to reduce random_page_cost to be equal to seq_page_cost. But that only adjusts the costs by at most a factor of four, and we've established that in the above query they're off by a factor of 160. To help make up the difference, it makes sense to de-emphasize page access compared to cpu-related costs by reducing both page costs to 0.1. Combined, these adjustments still can't compensate for how far off the estimate was. In my experience the default cpu_tuple_cost is understated compared to other cpu-related costs, so I would do the above *plus* a boost to cpu_tuple_cost.
Personally, I have never seen a difference between plans chosen with that set to 0.03 and 0.05, so I can't say where in that range the ideal value lies; you should feel free to experiment if there is a query which seems to be choosing a bad plan. If the above results really do represent the cache hit levels you expect in production, the combination of the above changes should come reasonably close to modeling costs realistically, resulting in better plan choices. If you don't expect such high cache hit ratios in production, you probably don't want to go so low with page costs.

> - shared_buffers = 6GB
> - effective_cache_size = 18GB
> - work_mem = 10MB
> - maintenance_work_mem = 3GB
>
> Can you see any obvious issues with the other memory settings I changed?

I might bump work_mem up to somewhere in the 20MB to 60MB range, as long as you're not going crazy with max_connections. I would probably take maintenance_work_mem down to 1GB to 2GB -- you can have several of these allocations at one time, and you don't want to blow away your cache. (I think it might actually be adjusted down to 2GB internally anyway, but I would need to check.)

-Kevin
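The combination Kevin describes can be tried per-session before committing anything to postgresql.conf. A sketch, assuming the active data really is fully cached:

```sql
-- Per-session experiment with the cost factors discussed above.
SET seq_page_cost = 0.1;       -- de-emphasize page access for a cached database
SET random_page_cost = 0.1;    -- equal to seq_page_cost when everything is in RAM
SET cpu_tuple_cost = 0.03;     -- boosted from the default of 0.01
-- then re-run EXPLAIN ANALYZE on the problem query to see which plan is chosen
```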
[PERFORM] problem with large inserts
Hi,

I currently have some trouble with inserts into a table:

INSERT INTO LPP (PPID, LID)
SELECT DISTINCT PPid, LID
FROM (SELECT * FROM PP WHERE s_id = sid) pp
  INNER JOIN has_protein hp1 ON pp.p1id = hp1.pid
  INNER JOIN has_protein hp2 ON pp.p2_id = hp2.pid
  INNER JOIN (SELECT * FROM L WHERE s_id = sid) l
    ON (hp1.pid = l.p1id AND hp2.pid = l.p2id AND hp1.ppos + pp.s1 = l.s1 AND hp2.ppos + pp.s2 = l.s2)
    OR (hp1.pid = l.p2id AND hp2.pid = l.p1id AND hp1.ppos + pp.s1 = l.s2 AND hp2.ppos + pp.s2 = l.s1);

If I run only the SELECT part:

SELECT DISTINCT PPid, LID
FROM (SELECT * FROM PP WHERE s_id = 708) pp
  INNER JOIN has_protein hp1 ON pp.p1id = hp1.pid
  INNER JOIN has_protein hp2 ON pp.p2_id = hp2.pid
  INNER JOIN (SELECT * FROM L WHERE s_id = 708) l
    ON (hp1.pid = l.p1id AND hp2.pid = l.p2id AND hp1.ppos + pp.s1 = l.s1 AND hp2.ppos + pp.s2 = l.s2)
    OR (hp1.pid = l.p2id AND hp2.pid = l.p1id AND hp1.ppos + pp.s1 = l.s2 AND hp2.ppos + pp.s2 = l.s1);

it returns 200620 rows in 170649 ms (that's just under 3 minutes). I stopped the actual insert after about 8h.

The table that the insert goes into is the following:

CREATE TABLE LPP (
  ppid bigint NOT NULL,
  lid bigint NOT NULL,
  CONSTRAINT pk_lpp PRIMARY KEY (ppid, lid)
);

I also tried without the primary key, but that one is still running after more than a day. Currently the table LPP holds 471139 rows. It links the PP and the L tables. There are no foreign keys referring to that table, nor are there any other constraints on it. Previously I had foreign keys on lid and ppid referring to the L and PP tables, but in a desperate attempt to get some speedup I deleted these. But still...

I am running PostgreSQL 9.2 on a Windows 2008 R2 server with 256 GB, and the database is on something like a RAID 1+0 (actually a RAID 1E) consisting of 3x4TB disks (the limit of what could easily be fitted into the server). At the given time there was no concurrent access to any of the involved tables.
Has anybody any idea why the insert takes so long, and/or how to speed things up a bit? I could live with something like half an hour; better would be minutes.

Thanks for any responses,

Lutz Fischer

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
Re: [PERFORM] Limit offset effect on query plans
Pavan Deolasee wrote:

> Amitabh Kant amitabhk...@gmail.com wrote:
>> Our scripts automatically add LIMIT ALL OFFSET 0 to every select query if no values are passed in for these parameters. I remember reading on the mailing list that it's better not to pass them if they are not needed, as they add a cost to the query plan. Is this the case, or am I looking at a very minor optimization?
>
> I would tend to think it is the latter. While the limit/offset clause will undoubtedly add another node during query planning and execution, AFAICS the OFFSET 0 and LIMIT ALL cases are optimized to a good extent, so the overhead of having them will not be significant.

I ran some quick tests on my i7 under Linux. Plan time was increased by about 40 microseconds (based on EXPLAIN runtime) and a limit node was added to the plan. Execution time on a SELECT * FROM tenk1 in the regression database went up by 1.35 ms on fully cached runs.

-Kevin
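Kevin's measurement can be reproduced by comparing the two query forms directly in the regression database:

```sql
-- The plain query: no Limit node in the plan.
EXPLAIN ANALYZE SELECT * FROM tenk1;

-- The same query with the no-op clause: the plan gains a Limit node,
-- and "Total runtime" goes up accordingly.
EXPLAIN ANALYZE SELECT * FROM tenk1 LIMIT ALL OFFSET 0;
```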
Re: [PERFORM] problem with large inserts
I would strongly discourage you from dropping the referential integrity. You risk data corruption, which will cost you a good deal of time to sort out properly, and corruption prevents you from applying the R.I. again. Also, it has hardly any performance impact.

Are the plans different? (I guess you've looked at http://wiki.postgresql.org/wiki/Slow_Query_Questions ?)

> Date: Thu, 13 Dec 2012 15:37:33 +
> From: lfisc...@staffmail.ed.ac.uk
> To: pgsql-performance@postgresql.org
> Subject: [PERFORM] problem with large inserts
>
> [original message quoted in full; trimmed]
Re: [PERFORM] problem with large inserts
Just an idea - how long does it take to run _only_

CREATE TEMP TABLE foo AS your SELECT here

On Thu, Dec 13, 2012 at 4:37 PM, Lutz Fischer lfisc...@staffmail.ed.ac.uk wrote:

> [original message quoted in full; trimmed]
Re: [PERFORM] problem with large inserts
Lutz Fischer lfisc...@staffmail.ed.ac.uk writes:

> I have currently some trouble with inserts into a table. If I run only [ the select part ] it returns 200620 rows in 170649 ms (that's just under 3 minutes). I stopped the actual insert after about 8h.

It should not take 8h to insert 200k rows on any machine made this century. Frankly, I'm wondering if the insert is doing anything at all, or is blocked on a lock somewhere. You say there's no concurrent activity, but how hard did you look? Did you check that, say, the physical disk file for the table is growing?

> I am running postgresql 9.2 on a windows 2008 R2 server with 256 GB and the database is on something like a raid 1+0 (actually a raid1e) consisting of 3x4TB disks (limit of what could easily be fitted into the server).

A different line of thought is that there's something seriously broken about the raid configuration. Have you tried basic disk-speed benchmarks? (Perhaps there's an equivalent of bonnie++ for windows.)

			regards, tom lane
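Tom's two suspicions (an insert blocked on a lock, and a table file that isn't growing) can both be checked from another session. A sketch against the 9.2-era catalogs:

```sql
-- Any ungranted locks? If the INSERT's backend pid shows up here, it is waiting.
SELECT pid, relation::regclass AS rel, mode, granted
FROM pg_locks
WHERE NOT granted;

-- Is the target table actually growing? Run this a few times while
-- the INSERT is in progress and compare.
SELECT pg_size_pretty(pg_total_relation_size('lpp'));
```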
Re: [PERFORM] Limit offset effect on query plans
Kevin Grittner kgri...@mail.com writes:

> I ran some quick tests on my i7 under Linux. Plan time was increased by about 40 microseconds (based on EXPLAIN runtime) and a limit node was added to the plan. Execution time on a SELECT * FROM tenk1 in the regression database went up by 1.35 ms on fully cached runs.

1.35ms out of what?

FWIW, I've been considering teaching the planner to not bother with an actual Limit plan node if the limit clause is an obvious no-op. I wasn't thinking about applications that blindly insert such clauses, but rather about not penalizing subqueries when someone uses one of these as an optimization fence. (The clauses would still work as an opt fence, you'd just not see any Limit node in the final plan.)

			regards, tom lane
Re: [PERFORM] problem with large inserts
Thanks a lot, you saved my day!

CREATE TEMP TABLE foo AS SELECT DISTINCT ...

took a mere 77464.744 ms, and the additional

INSERT INTO LPP SELECT * FROM foo;

just 576.909 ms.

I don't really understand why it works via a temp table but not directly (or in any reasonable amount of time) - but at least I have a solution I can work with.

On 13/12/12 16:09, Filip Rembiałkowski wrote:

> Just an idea - how long does it take to run _only_ CREATE TEMP TABLE foo AS your SELECT here
>
> [earlier messages quoted in full; trimmed]

--
Lutz Fischer
lfisc...@staffmail.ed.ac.uk
+44 131 6517057
Re: [PERFORM] problem with large inserts
Hmm, so it is some kind of file / table locking issue, not a general IO system malfunction. It would be interesting and useful to run this use case on another postgres instance (or several instances), including non-Windows ones. OTOH, Pg-on-Windows housekeeping was always fun - I advise all my clients to avoid it for production purposes.

On Thu, Dec 13, 2012 at 5:33 PM, Lutz Fischer lfisc...@staffmail.ed.ac.uk wrote:

> Thanks a lot, you saved my day!
>
> [earlier messages quoted in full; trimmed]
Re: [PERFORM] Limit offset effect on query plans
Tom Lane wrote:

> Kevin Grittner kgri...@mail.com writes:
>> I ran some quick tests on my i7 under Linux. [...] Execution time on a SELECT * FROM tenk1 in the regression database went up by 1.35 ms on fully cached runs.
>
> 1.35ms out of what?

Without the limit node, the runtimes (after priming the cache) were:

1.805, 2.533
1.805, 2.495
1.800, 2.446
1.818, 2.470
1.804, 2.502

The first time for each run is the Total runtime reported by EXPLAIN; the second is what psql reported from having \timing on. With the limit node:

3.237, 3.914
3.243, 3.918
3.263, 4.010
3.265, 3.943
3.272, 3.953

I eyeballed that in the console window and said 1.35 based on rough in-my-head calculations, although with it laid out in a nicer format, I think I was a little low.

-Kevin
Re: [PERFORM] Limit offset effect on query plans
Kevin Grittner kgri...@mail.com writes:

> Tom Lane wrote:
>> 1.35ms out of what?
>
> [timing numbers above]
>
> I eyeballed that in the console window and said 1.35 based on rough in-my-head calculations, although with it laid out in a nicer format, I think I was a little low.

Huh, so on a percentage basis the Limit-node overhead is actually pretty significant, at least for a trivial seqscan plan like this case. (This is probably about the worst-case scenario, really, since it's tough to beat a simple seqscan for cost-per-emitted-row. Also I gather you're not actually transmitting any data to the client ...)

			regards, tom lane
Re: [PERFORM] Limit offset effect on query plans
Tom Lane wrote:

> Huh, so on a percentage basis the Limit-node overhead is actually pretty significant, at least for a trivial seqscan plan like this case. (This is probably about the worst-case scenario, really, since it's tough to beat a simple seqscan for cost-per-emitted-row. Also I gather you're not actually transmitting any data to the client ...)

Right, I was trying to isolate the cost, and in a more complex query, or with results streaming back, that could easily be lost in the noise. Assuming that the setup time for the node is trivial compared to filtering 10,000 rows, the time per row which passes through the limit node seems to be (very roughly) 140 nanoseconds on an i7. I don't know whether that will vary based on the number or types of columns.

I just tried returning the results rather than running EXPLAIN ANALYZE, and any difference was lost in the noise with only five samples each way. I wonder how much of the difference with EXPLAIN ANALYZE might have come from the additional time checking. Maybe on a normal run the difference would be less significant.

-Kevin
Re: [PERFORM] Why is PostgreSQL 9.2 slower than 9.1 in my tests?
On Mon, Dec 10, 2012 at 4:53 AM, Jeff Janes jeff.ja...@gmail.com wrote:

> On Wed, Dec 5, 2012 at 4:09 AM, Patryk Sidzina patryk.sidz...@gmail.com wrote:
>
>> CREATE TEMP TABLE test_table_md_speed(id serial primary key, n integer);
>>
>> CREATE OR REPLACE FUNCTION TEST_DB_SPEED(cnt integer) RETURNS text AS $$
>> DECLARE
>>     time_start timestamp;
>>     time_stop timestamp;
>>     time_total interval;
>> BEGIN
>>     time_start := cast(timeofday() AS TIMESTAMP);
>>     FOR i IN 1..cnt LOOP
>>         INSERT INTO test_table_md_speed(n) VALUES (i);
>>     END LOOP;
>>     time_stop := cast(timeofday() AS TIMESTAMP);
>>     time_total := time_stop - time_start;
>>     RETURN extract (milliseconds from time_total);
>> END;
>> $$ LANGUAGE plpgsql;
>>
>> SELECT test_db_speed(100);
>>
>> I see strange results. For PostgreSQL 9.1.5 I get 8254.769, and for 9.2.1 I get 9022.219. This means the new version is slower. I cannot find why. Any ideas why those results differ?
>
> Did you just run it once each? The run-to-run variability in timing can be substantial. I put the above into a custom file for pgbench ("pgbench -f sidzina.sql -t 1 -p $port") and ran it on both versions in random order for several hundred iterations. There was no detectable difference in timing.

Sorry for the mix-up. The above results are from one of our test machines. I wanted to simplify the function as much as possible. Unfortunately, I didn't test this on a different machine. I did that after your post and, like you said, there isn't much difference in the results.
The differences come up when you change the INSERT to EXECUTE 'INSERT' (and I checked this time on 3 machines, one of which was Windows):

CREATE TEMP TABLE test_table_md_speed(id serial primary key, n integer);

CREATE OR REPLACE FUNCTION test_db_speed(cnt integer) RETURNS text
LANGUAGE plpgsql AS $function$
DECLARE
    time_start timestamp;
    time_stop timestamp;
    time_total interval;
BEGIN
    time_start := cast(timeofday() AS TIMESTAMP);
    FOR i IN 1..cnt LOOP
        EXECUTE 'INSERT INTO test_table_md_speed(n) VALUES (' || i || ')';
    END LOOP;
    time_stop := cast(timeofday() AS TIMESTAMP);
    time_total := time_stop - time_start;
    RETURN extract (milliseconds from time_total);
END;
$function$;

SELECT test_db_speed(10);

I ran the above several times and got 4029.356 on PGSQL 9.1.6 and 5015.073 on PGSQL 9.2.1. Again, sorry for not double-checking my results.

--
Patryk Sidzina
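One plausible factor in the EXECUTE variant's slowness, independent of the version difference: EXECUTE re-parses and re-plans the statement on every loop iteration, whereas the plain INSERT runs from a plan that plpgsql caches. Passing the loop variable with USING at least avoids rebuilding the SQL string each time. A sketch (not a claim that it closes the 9.1/9.2 gap):

```sql
-- Inside the loop body: parameterized dynamic SQL instead of string
-- concatenation. EXECUTE still re-plans each iteration, but the value
-- is passed as a parameter rather than spliced into the query text.
EXECUTE 'INSERT INTO test_table_md_speed(n) VALUES ($1)' USING i;
```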
Re: [PERFORM] encouraging index-only scans
Andrew Dunstan and...@dunslane.net writes: A client is testing a migration from 9.1 to 9.2, and has found that a large number of queries run much faster if they use index-only scans. However, the only way he has found to get such a plan is by increasing the seq_page_cost to insanely high levels (3.5). Is there any approved way to encourage such scans that's a bit less violent than this? Is the pg_class.relallvisible estimate for the table realistic? They might need a few more VACUUM and ANALYZE cycles to get it into the neighborhood of reality, if not. Keep in mind also that small values of random_page_cost necessarily decrease the apparent advantage of index-only scans. If you think 3.5 is an insanely high setting, I wonder whether you haven't driven those numbers too far in the other direction to compensate for something else. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
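[Editor's note: Tom's relallvisible suggestion can be checked directly; a sketch, with the table name as a placeholder:]

```sql
-- Index-only scans only look cheap to the planner when relallvisible is
-- close to relpages (i.e. most pages are known all-visible).
SELECT relname, relpages, relallvisible
FROM pg_class
WHERE relname = 'your_table';   -- placeholder table name

-- If relallvisible lags far behind relpages, a VACUUM refreshes it:
VACUUM ANALYZE your_table;
```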
Re: [PERFORM] hash join vs nested loop join
On 13 December 2012 03:28, Jeff Janes jeff.ja...@gmail.com wrote: This looks like the same large-index over-penalty as discussed in the recent thread [PERFORM] Slow query: bitmap scan troubles. Back-patching the log(npages) change is starting to look like a good idea. Cheers, Jeff Thanks for the information Jeff. That does seem to be related.
[PERFORM] How do I track stats on foreign table access through foreign data wrapper?
Hi, I'm using a foreign data wrapper to access mongodb and I'm looking for a way to monitor query stats against foreign tables. It looks like the common methods have limited support for foreign tables at this time. pg_stat_statements collects the query, total time, and rows returned, which is useful. But all the disk access stats are zero (understandably). Looks like pg_stat_all_tables doesn't have any info on foreign tables from my tests. I'm interested in the following: - Foreign table rows read - Foreign table bytes read - Foreign table read time I'm working with my own fork of the FDW, so I could build these in myself, but I was wondering if there's more generic support for this sort of stuff. Or at the least, if I do implement it can I push it into another stats collection framework rather than logging it. Thanks, David Crawford
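[Editor's note: as the poster observes, pg_stat_statements does capture per-statement counters for queries touching foreign tables, even though block-level I/O counters stay at zero because the FDW bypasses shared buffers. A sketch using the column names of that era (total_time was later renamed total_exec_time):]

```sql
-- Surface the timing and row counts that *are* available for statements
-- hitting foreign tables; disk counters will be zero for FDW access.
SELECT query, calls, total_time, rows
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 20;
```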
Re: [PERFORM] hash join vs nested loop join
Hi Kevin On 13 December 2012 10:47, Kevin Grittner kgri...@mail.com wrote: Huan Ruan wrote: is a lot slower than a nested loop join. Giving actual numbers is more useful than terms like "a lot". Even better is to provide the output of EXPLAIN ANALYZE rather than just EXPLAIN. This shows estimates against actual numbers, and gives timings. For more suggestions see this page: http://wiki.postgresql.org/wiki/SlowQueryQuestions You are right. I realised my information wasn't accurate. I was a bit slack and cancelled the slower one. The full outputs are:

Hash 1st run

QUERY PLAN
Hash Join  (cost=1681.87..6414169.04 rows=48261 width=171) (actual time=2182.450..88158.645 rows=48257 loops=1)
  Hash Cond: (bigtable.invtranref = smalltable.invtranref)
  Buffers: shared hit=3950 read=3046219
  ->  Seq Scan on invtran bigtable  (cost=0.00..4730787.28 rows=168121728 width=108) (actual time=0.051..32581.052 rows=168121657 loops=1)
        Buffers: shared hit=3351 read=3046219
  ->  Hash  (cost=1078.61..1078.61 rows=48261 width=63) (actual time=21.751..21.751 rows=48261 loops=1)
        Buckets: 8192  Batches: 1  Memory Usage: 4808kB
        Buffers: shared hit=596
        ->  Seq Scan on im_match_table smalltable  (cost=0.00..1078.61 rows=48261 width=63) (actual time=0.007..8.299 rows=48261 loops=1)
              Buffers: shared hit=596
Total runtime: 88162.417 ms

Hash 2nd run (after disconnect and reconnect)

QUERY PLAN
Hash Join  (cost=1681.87..6414169.04 rows=48261 width=171) (actual time=2280.390..87934.540 rows=48257 loops=1)
  Hash Cond: (bigtable.invtranref = smalltable.invtranref)
  Buffers: shared hit=3982 read=3046187
  ->  Seq Scan on invtran bigtable  (cost=0.00..4730787.28 rows=168121728 width=108) (actual time=0.052..32747.805 rows=168121657 loops=1)
        Buffers: shared hit=3383 read=3046187
  ->  Hash  (cost=1078.61..1078.61 rows=48261 width=63) (actual time=62.161..62.161 rows=48261 loops=1)
        Buckets: 8192  Batches: 1  Memory Usage: 4808kB
        Buffers: shared hit=596
        ->  Seq Scan on im_match_table smalltable  (cost=0.00..1078.61 rows=48261 width=63) (actual time=0.006..8.209 rows=48261 loops=1)
              Buffers: shared hit=596
Total runtime: 87938.584 ms

NL 1st run

QUERY PLAN
Nested Loop  (cost=0.00..6451637.88 rows=48261 width=171) (actual time=0.056..551.438 rows=48257 loops=1)
  Buffers: shared hit=242267
  ->  Seq Scan on im_match_table smalltable  (cost=0.00..1078.61 rows=48261 width=63) (actual time=0.009..7.353 rows=48261 loops=1)
        Buffers: shared hit=596
  ->  Index Scan using pk_invtran on invtran bigtable  (cost=0.00..133.65 rows=1 width=108) (actual time=0.010..0.010 rows=1 loops=48261)
        Index Cond: (invtranref = smalltable.invtranref)
        Buffers: shared hit=241671
Total runtime: 555.336 ms

NL 2nd run (after disconnect and reconnect)

QUERY PLAN
Nested Loop  (cost=0.00..6451637.88 rows=48261 width=171) (actual time=0.058..554.215 rows=48257 loops=1)
  Buffers: shared hit=242267
  ->  Seq Scan on im_match_table smalltable  (cost=0.00..1078.61 rows=48261 width=63) (actual time=0.009..7.416 rows=48261 loops=1)
        Buffers: shared hit=596
  ->  Index Scan using pk_invtran on invtran bigtable  (cost=0.00..133.65 rows=1 width=108) (actual time=0.010..0.010 rows=1 loops=48261)
        Index Cond: (invtranref = smalltable.invtranref)
        Buffers: shared hit=241671
Total runtime: 558.095 ms

I don't understand why the optimiser chooses the hash join in favour of the nested loop. What can I do to get the optimiser to make a better decision (nested loop in this case)? I have run analyze on both tables. Config changes are:
- shared_buffers = 6GB
- effective_cache_size = 18GB
- work_mem = 10MB
- maintenance_work_mem = 3GB

As already suggested, there was a change made in 9.2 which may have over-penalized nested loops using index scans. This may be fixed in the next minor release. Will keep this in mind. Also, as already suggested, you may want to reduce random_page_cost to bring it in line with the actual cost relative to seq_page_cost, based on your cache hit ratio. Additionally, I just routinely set cpu_tuple_cost higher than the default of 0.01.
I find that 0.03 to 0.05 better models the actual relative cost of processing a tuple. I originally reduced random_page_cost to 2 to achieve the nested loop join. Now that I have set cpu_tuple_cost to 0.05 and reset random_page_cost back to 4, I can also get a nested loop join. I'm still new to Postgres, but I'm worried that a random_page_cost of 2 is too low, so maybe increasing cpu_tuple_cost is the better choice. All this tuning probably also depends on the possible fix mentioned above. Can you see any obvious issues with the other memory settings I changed? Thanks for your help. Cheers Huan -Kevin
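[Editor's note: the cost-factor experiments discussed above can be tried per session before touching postgresql.conf. A sketch; the join is inferred from the plans quoted earlier in the thread:]

```sql
-- Session-local experiment: nothing is written to postgresql.conf and
-- other sessions are unaffected.
SET random_page_cost = 4;    -- keep the default page cost
SET cpu_tuple_cost = 0.05;   -- raise the per-tuple CPU cost instead

EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM im_match_table smalltable
JOIN invtran bigtable ON bigtable.invtranref = smalltable.invtranref;

RESET random_page_cost;
RESET cpu_tuple_cost;
```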
[PERFORM] Slow queries after vacuum analyze
Hello, I have a customer that experiences a strange behaviour related to statistics. There is a vacuum analyze planned during the night. In the morning, 1 day out of 2, there are some extremely slow queries. Those queries last more than 5 minutes (never waited more and cancelled them) whereas when everything is OK they last less than 300ms. In order to resolve this I have to execute at least one analyze, sometimes more. My configuration: Windows, PostgreSQL 8.4.8, default_statistics_target = 100. In addition to an increase for shared_buffers, work_mem, ..., I changed the default_statistics_target to 500 with no effect. It was even worse, as I never managed to get rid of the slow queries after running many analyzes. So I fell back to default_statistics_target = 100 in order to get rid of those slow queries. I have no idea what I can do to solve this issue. Any help would be greatly appreciated. Regards, *Ghislain ROUVIGNAC* R&D Engineer g...@sylob.com 7 rue Marcel Dassault - Z.A. La Mouline - 81990 Cambon d'Albi - FRANCE Tel : 05 63 53 08 18 - Fax : 05 63 53 07 42 - www.sylob.com Support : 05 63 53 78 35 - supp...@sylob.com Company certified ISO 9001:2008 by Bureau Veritas.
Re: [PERFORM] Slow queries after vacuum analyze
Ghislain ROUVIGNAC wrote: There is a vacuum analyze planned during the night. In the morning, 1 day out of 2, there are some extremely slow queries. Those queries last more than 5 minutes (never waited more and cancelled them) whereas when everything is OK they last less than 300ms. In order to resolve this I have to execute at least one analyze, sometimes more. My configuration: Windows, PostgreSQL 8.4.8, default_statistics_target = 100. In addition to an increase for shared_buffers, work_mem, ..., I changed the default_statistics_target to 500 with no effect. It was even worse, as I never managed to get rid of the slow queries after running many analyzes. So I fell back to default_statistics_target = 100 in order to get rid of those slow queries. You probably need to adjust your cost factors to more accurately reflect the actual costs of various activities on your system. What is probably happening is that there are two plans which are very close together in estimated costs using the current values, while the actual costs are very different. The particular random sample chosen can push the balance one way or the other. Please show the results from running the query on this page: http://wiki.postgresql.org/wiki/Server_Configuration Also, a description of the run environment would help. Other information listed on this page would help, although cores, RAM, and storage system information would probably be most important. http://wiki.postgresql.org/wiki/Server_Configuration -Kevin
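[Editor's note: the query on the Server_Configuration wiki page is essentially the following (reproduced from memory, so verify against the page itself); it lists every setting changed from its default:]

```sql
-- Show the server version plus all non-default settings, which is what
-- reviewers on the list usually ask for first.
SELECT version();

SELECT name, current_setting(name), source
FROM pg_settings
WHERE source NOT IN ('default', 'override');
```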
Re: [PERFORM] Occasional timeouts on TRUNCATE and simple INSERTs
You all were right. The time-outs for TRUNCATE were due to a rogue pg_dump. And the issue with the inserts was due to an unrelated code change. Thanks for your help! --Jeff O On Dec 11, 2012, at 5:34 PM, Osborn, Jeff wrote: Yeah I've been running a cron pulling relevant info from pg_stat_activity. Haven't seen anything yet. Currently looking into the pg_dump situation. --Jeff O On Dec 11, 2012, at 5:16 PM, Merlin Moncure wrote: On Tue, Dec 11, 2012 at 3:38 PM, Sergey Konoplev gray...@gmail.com wrote: Yeah: absolute first thing to check is if your statements are being blocked -- you can get that via pg_stat_activity from another session. It's a completely different beast if that's the case. merlin
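[Editor's note: Merlin's "check pg_stat_activity from another session" advice can be sketched as a query like this; the column names match PostgreSQL 9.2 (older releases use procpid and current_query, and 9.6+ replaces waiting with wait_event):]

```sql
-- Show sessions that are currently blocked waiting on a lock, together
-- with the statement they are trying to run.
SELECT pid, state, waiting, query
FROM pg_stat_activity
WHERE waiting;
```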
[PERFORM] Why does the number of rows are different in actual and estimated.
Why does the number of rows are different in actual and estimated? The default_statistics_target is set to 100.

explain analyze select * FROM (
    SELECT entity.id AS con_s_id, entity.setype AS con_s_setype,
        con_details.salutation AS con_s_salutationtype, con_details.firstname AS con_s_firstname,
        con_details.phone AS con_s_phone, con_details.lastname AS con_s_lastname,
        con_details.accountid AS con_s_account_id_entityid, con_details.mobile AS con_s_mobile,
        con_details.title AS con_s_title, con_details.donotcall AS con_s_donotcall,
        con_details.fax AS con_s_fax, con_details.department AS con_s_department,
        con_details.email AS con_s_email, con_details.yahooid AS con_s_yahooid,
        con_details.emailoptout AS con_s_emailoptout, con_details.reportsto AS con_s_con__id_entityid,
        con_details.reference AS con_s_reference, entity.smownerid AS con_s_assigned_user_id_entityid,
        CASE WHEN entity.owner_type='U' THEN users.user_name ELSE groups.groupname END AS con_s_assigned_user_id_name,
        CASE WHEN entity.owner_type='U' THEN users.first_name || ' ' || users.last_name ELSE groups.groupname END AS con_s_assigned_user_id,
        CASE WHEN entity.owner_type='U' THEN 'Users' ELSE 'Groups' END AS con_s_assigned_user_id_linkmodule,
        entity.modifiedtime AS con_s_modifiedtime, con_details.notify_owner AS con_s_notify_owner,
        entity.createdtime AS con_s_createdtime, entity.description AS con_s_description,
        con_details.imagename AS con_s_imagename
    FROM con_details
    INNER JOIN entity ON con_details.con_id = entity.id
    LEFT JOIN groups ON groups.groupid = entity.smownerid
    LEFT JOIN users ON entity.smownerid = users.id
    WHERE entity.setype = 'con_s' AND entity.deleted = 0 AND (((con_details.email ILIKE '%@%')))
) con_s_base
INNER JOIN con_scf ON con_s_base.con_s_id = con_scf.con_id
INNER JOIN con_subdetails ON con_s_base.con_s_id = con_subdetails.con_subscriptionid
INNER JOIN customerdetails ON con_s_base.con_s_id = customerdetails.customerid
INNER JOIN con_address ON con_s_base.con_s_id = con_address.con_addressid

Nested Loop  (cost=18560.97..26864.83 rows=24871 width=535) (actual time=1335.157..8492.414 rows=157953 loops=1)
  ->  Hash Left Join  (cost=18560.97..26518.91 rows=116 width=454) (actual time=1335.117..6996.585 rows=205418 loops=1)
        Hash Cond: (entity.smownerid = users.id)
        ->  Hash Left Join  (cost=18547.22..26503.57 rows=116 width=419) (actual time=1334.354..6671.442 rows=205418 loops=1)
              Hash Cond: (entity.smownerid = groups.groupid)
              ->  Nested Loop  (cost=18546.83..26502.72 rows=116 width=398) (actual time=1334.314..6385.664 rows=205418 loops=1)
                    ->  Nested Loop  (cost=18546.83..26273.40 rows=774 width=319) (actual time=1334.272..5025.175 rows=205418 loops=1)
                          ->  Hash Join  (cost=18546.83..24775.02 rows=5213 width=273) (actual time=1334.238..3666.748 rows=205420 loops=1)
                                Hash Cond: (con_subdetails.con_subscriptionid = entity.id)
                                ->  Index Scan using con_subdetails_pkey on con_subdetails  (cost=0.00..4953.41 rows=326040 width=29) (actual time=0.019..350.736 rows=327328 loops=1)
                                ->  Hash  (cost=18115.71..18115.71 rows=34489 width=244) (actual time=1334.147..1334.147 rows=205420 loops=1)
                                      Buckets: 4096  Batches: 1  Memory Usage: 19417kB
                                      ->  Hash Join  (cost=9337.97..18115.71 rows=34489 width=244) (actual time=418.054..1156.453 rows=205420 loops=1)
                                            Hash Cond: (customerdetails.customerid = entity.id)
                                            ->  Seq Scan on customerdetails  (cost=0.00..4752.46 rows=327146 width=13) (actual time=0.021..176.389 rows=327328 loops=1)
                                            ->  Hash  (cost=6495.65..6495.65 rows=227386 width=231) (actual time=417.839..417.839 rows=205420 loops=1)
                                                  Buckets: 32768  Batches: 1  Memory Usage: 16056kB
                                                  ->  Index Scan using entity_setype_idx on entity  (cost=0.00..6495.65 rows=227386 width=231) (actual time=0.033..253.880 rows=205420 loops=1)
                                                        Index Cond: ((setype)::text = 'con_s'::text)
                          ->  Index Scan using con_address_pkey on con_address  (cost=0.00..0.27 rows=1 width=46) (actual time=0.003..0.004 rows=1 loops=205420)
                                Index Cond: (con_addressid = entity.id)
                    ->  Index Scan using con_scf_pkey on con_scf  (cost=0.00..0.28 rows=1 width=79) (actual time=0.003..0.004 rows=1 loops=205418)
                          Index Cond: (con_id = entity.id)
        ->  Hash  (cost=0.34..0.34 rows=4 width=25) (actual time=0.016..0.016 rows=4 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 1kB
              ->  Index Scan using groups_pkey on
Re: [PERFORM] Why does the number of rows are different in actual and estimated.
On 12/13/2012 05:12 PM, AI Rumman wrote: Why does the number of rows are different in actual and estimated? Isn't that in the nature of estimates? An estimate is a heuristic guess at the number of rows it will find for the given query or part of a query. It's not uncommon for estimates to be out by several orders of magnitude. Guaranteeing estimates within bounded accuracy and in a given short amount of time (you don't want your planning time to overwhelm your execution time) isn't possible. cheers andrew
Re: [PERFORM] Why does the number of rows are different in actual and estimated.
On Dec 14, 2012, at 2:36 AM, Andrew Dunstan and...@dunslane.net wrote: On 12/13/2012 05:12 PM, AI Rumman wrote: Why does the number of rows are different in actual and estimated? Isn't that in the nature of estimates? An estimate is a heuristic guess at the number of rows it will find for the given query or part of a query. It's not uncommon for estimates to be out by several orders of magnitude. Guaranteeing estimates within bounded accuracy and in a given short amount of time (you don't want your planning time to overwhelm your execution time) isn't possible. The main question I think is what to do with it. The problem starts here:

->  Hash Join  (cost=9337.97..18115.71 rows=34489 width=244) (actual time=418.054..1156.453 rows=205420 loops=1)
      Hash Cond: (customerdetails.customerid = entity.id)
      ->  Seq Scan on customerdetails  (cost=0.00..4752.46 rows=327146 width=13) (actual time=0.021..176.389 rows=327328 loops=1)
      ->  Hash  (cost=6495.65..6495.65 rows=227386 width=231) (actual time=417.839..417.839 rows=205420 loops=1)
            Buckets: 32768  Batches: 1  Memory Usage: 16056kB
            ->  Index Scan using entity_setype_idx on entity  (cost=0.00..6495.65 rows=227386 width=231) (actual time=0.033..253.880 rows=205420 loops=1)
                  Index Cond: ((setype)::text = 'con_s'::text)
->  Index Scan using con_address_pkey on con_address  (cost=0.00..0.27 rows=1 width=46) (actual time=0.003..0.004 rows=1 loops=205420)

As you see, the access-method estimates are OK; it is the join result set which is wrong. How to deal with it? Maybe a hack with a CTE can help, but is there a way to improve the statistics correlation? cheers andrew
Re: [PERFORM] Why does the number of rows are different in actual and estimated.
On Dec 14, 2012, at 3:09 AM, Andrew Dunstan and...@dunslane.net wrote: On 12/13/2012 05:42 PM, Claudio Freire wrote: And it looks like it all may be starting to go south here: ->  Hash Join  (cost=9337.97..18115.71 rows=34489 width=244) (actual time=418.054..1156.453 rows=205420 loops=1)  Hash Cond: (customerdetails.customerid = entity.id) Well, it looks like it's choosing a join order that's quite a bit different from the way the query is expressed, so the OP might need to play around with forcing the join order some. The OP joins 8 tables, and I suppose join_collapse_limit is set to the default of 8. I thought PostgreSQL's optimiser is not MySQL's. cheers andrew
Re: [PERFORM] Why does the number of rows are different in actual and estimated.
Evgeny Shishkin itparan...@gmail.com writes: On Dec 14, 2012, at 3:09 AM, Andrew Dunstan and...@dunslane.net wrote: Well, it looks like it's choosing a join order that's quite a bit different from the way the query is expressed, so the OP might need to play around with forcing the join order some. OP joins 8 tables, and i suppose join collapse limit is set to default 8. I thought postgresql's optimiser is not mysql's. It's not obvious to me that there's anything very wrong with the plan. An 8-way join that produces 150K rows is unlikely to run in milliseconds no matter what the plan. The planner would possibly have done the last join step differently if it had had a better rowcount estimate, but even if that were free the query would still have been 7 seconds (vs 8.5). regards, tom lane
Re: [PERFORM] Why does the number of rows are different in actual and estimated.
On Dec 14, 2012, at 3:36 AM, Tom Lane t...@sss.pgh.pa.us wrote: Evgeny Shishkin itparan...@gmail.com writes: On Dec 14, 2012, at 3:09 AM, Andrew Dunstan and...@dunslane.net wrote: Well, it looks like it's choosing a join order that's quite a bit different from the way the query is expressed, so the OP might need to play around with forcing the join order some. OP joins 8 tables, and i suppose join collapse limit is set to default 8. I thought postgresql's optimiser is not mysql's. It's not obvious to me that there's anything very wrong with the plan. An 8-way join that produces 150K rows is unlikely to run in milliseconds no matter what the plan. The planner would possibly have done the last join step differently if it had had a better rowcount estimate, but even if that were free the query would still have been 7 seconds (vs 8.5). Maybe in this case it is. I once wrote to this list regarding a similar problem - joining 4 tables, with result set estimates off by 2257 times - 750ms vs less than 1ms. Unfortunately the question was not accepted to the list. I spoke to Bruce Momjian about that problem at one local conference; he said shit happens :) regards, tom lane
Re: [PERFORM] hash join vs nested loop join
Hi Kevin Again, many thanks for your time and help. On 14 December 2012 02:26, Kevin Grittner kgri...@mail.com wrote: Huan Ruan wrote: Hash 1st run Hash Join (cost=1681.87..6414169.04 rows=48261 width=171) (actual time=2182.450..88158.645 rows=48257 loops=1) - Seq Scan on invtran bigtable (cost=0.00..4730787.28 rows=168121728 width=108) (actual time=0.051..32581.052 rows=168121657 loops=1) 194 nanoseconds per row suggests 100% cache hits. NL 1st run Nested Loop (cost=0.00..6451637.88 rows=48261 width=171) (actual time=0.056..551.438 rows=48257 loops=1) - Index Scan using pk_invtran on invtran bigtable (cost=0.00..133.65 rows=1 width=108) (actual time=0.010..0.010 rows=1 loops=48261) 10 microseconds per index scan (each index scan requiring multiple random accesses) also suggests 100% cache hits. Interesting to see how you derived 100% cache hits. I assume by 'cache' you mean the pg shared buffer plus the OS cache? Because the table is 23GB but the shared buffer is only 6GB. Even then, I'm not completely convinced because the total RAM is just 24GB, part of which will have to be used for other data and indexes. I read somewhere that a pg shared buffer that's too big can hurt the performance and it's better just leave it to the OS cache. I'm not sure why but for now, I just configured the shared buffer to be 1/4 of the total RAM. I originally reduced random_page_cost to 2 to achieve the nested loop join. Now I set cpu_tuple_cost to 0.05 and reset random_page_cost back to 4, I can also achieve a nested loop join. I'm still new in Postgres, but I'm worried about random_page_cost being 2 is too low, so maybe increasing cpu_tuple_cost is a better choice. If these are typical of what you would expect in production, then the fact that with default cost factors the costs are barely different (by 0.6%) for actual run times which differ by two orders of magnitude (the chosen plan is 160 times slower) means that the modeling of cost factors is off by a lot. 
If you expect the active portion of your database to be fully cached like this, it makes sense to reduce random_page_cost to be equal to seq_page_cost. But that only adjusts the costs by at most a factor of four, and we've established that in the above query they're off by a factor of 160. To help make up the difference, it makes sense to de-emphasize page access compared to cpu-related costs by reducing both page costs to 0.1. Combined, these adjustments still can't compensate for how far off the estimate was. In my experience the default cpu_tuple_cost is understated compared to other cpu-related costs, so I would do the above *plus* a boost to cpu_tuple_cost. Personally, I have never seen a difference between plans chosen with that set to 0.03 and 0.05, so I can't say where in that range the ideal value lies; you should feel free to experiment if there is a query which seems to be choosing a bad plan. If the above results really do represent the cache hit levels you expect in production, the combination of the above changes should come reasonably close to modeling costs realistically, resulting in better plan choices. In production, about 60% of the database would fit in RAM, but roughly all of the active data we need to use should fit 100%. On the test server I'm playing with now, RAM is only 8% of the database size. Nonetheless, I will play with these parameters as you suggested. I was wondering: on our production server, where the effective_cache_size will be much bigger, will pg then assume that most data is probably cached anyway and therefore lean towards a nested loop join rather than a sequential scan for a hash join? Even on a test server where the cache hit rate is much smaller, for a big table like this, under what circumstances will a hash join perform better than a nested loop join though? If you don't expect such high cache hit ratios in production, you probably don't want to go so low with page costs.
- shared_buffers = 6GB - effective_cache_size = 18GB - work_mem = 10MB - maintenance_work_mem = 3GB Can you see any obvious issues with the other memory settings I changed? I might bump up work_mem to 20MB to 60MB, as long as you're not going crazy with max_connections. I would probably take maintenance_work_mem down to 1GB to 2GB -- you can have several of these allocations at one time, and you don't want to blow away your cache. (I think it might actually be adjusted down to 2GB internally anyway; but I would need to check.) Yes, I had bumped up work_mem yesterday to speed up another big group by query. I used 80MB. I assumed this memory will only be used if the query needs it and will be released as soon as it's finished, so it won't be too much an issue as long as I don't have too many concurrently sorting queries running (which is true in our production). Is this correct? I increased maintenance_work_mem initially to speed up the
Re: [PERFORM] Why does the number of rows are different in actual and estimated.
On Thu, Dec 13, 2012 at 8:50 PM, Evgeny Shishkin itparan...@gmail.com wrote: OP joins 8 tables, and i suppose join collapse limit is set to default 8. I thought postgresql's optimiser is not mysql's. It's not obvious to me that there's anything very wrong with the plan. An 8-way join that produces 150K rows is unlikely to run in milliseconds no matter what the plan. The planner would possibly have done the last join step differently if it had had a better rowcount estimate, but even if that were free the query would still have been 7 seconds (vs 8.5). May be in this case it is. I once wrote to this list regarding similar problem - joining 4 tables, result set are off by 2257 times - 750ms vs less then 1ms. Unfortunately the question was not accepted to the list. I spoke to Bruce Momjian about that problem on one local conference, he said shit happens :) I think it's more likely a missing FK constraint.
Re: [PERFORM] hash join vs nested loop join
Huan Ruan wrote: Interesting to see how you derived 100% cache hits. I assume by 'cache' you mean the pg shared buffer plus the OS cache? Because the table is 23GB but the shared buffer is only 6GB. Even then, I'm not completely convinced because the total RAM is just 24GB, part of which will have to be used for other data and indexes. Well, you can't have more than a few disk hits, which typically take something like 10 ms each, and still get an average less than 200 nanoseconds. I read somewhere that a pg shared buffer that's too big can hurt the performance and it's better just leave it to the OS cache. I'm not sure why but for now, I just configured the shared buffer to be 1/4 of the total RAM. PostgreSQL goes through the OS and its filesystems, unlike some other products. The right balance of memory in the PostgreSQL shared buffers versus what is left for a combination of OS caching and other memory allocations can be hard to determine. 25% is a good starting point, but your best performance might be higher or lower. It's a good idea to try incremental adjustments using your actual workload. Just remember you need to allow enough for several maintenance_work_mem allocations, about one work_mem allocation per max_connections setting, plus a reasonable OS cache size. I was wondering on our production server where the effetive_cache_size will be much bigger, will pg then guess that probably most data is cached anyway therefore leaning towards nested loop join rather than a scan for hash join? Once effective_cache_size is larger than your largest index, its exact value doesn't matter all that much. Even on a test server where the cache hit rate is much smaller, for a big table like this, under what circumstances, will a hash join perform better than nested loop join though? With a low cache hit rate, that would generally be when the number of lookups into the table exceeds about 10% of the table's rows. 
Yes, I had bumped up work_mem yesterday to speed up another big group by query. I used 80MB. I assumed this memory will only be used if the query needs it and will be released as soon as it's finished, so it won't be too much of an issue as long as I don't have too many concurrently sorting queries running (which is true in our production). Is this correct? Each connection running a query can allocate one work_mem allocation per plan node (depending on node type), which will be freed after the query completes. A common rule of thumb is to plan on peaks of max_connections allocations of work_mem. I increased maintenance_work_mem initially to speed up the index creation when I first pump in the data. In production environment, we don't do run time index creation, so I think only the vacuum and analyze will consume this memory? You'll probably be creating indexes from time to time. Figure an occasional one of those plus up to one allocation per autovacuum worker (and you probably shouldn't go below three of those). -Kevin
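[Editor's note: since work_mem is budgeted per sort/hash node per connection, one common pattern is to keep the global setting modest and raise it only around the query that needs it. A sketch:]

```sql
BEGIN;
-- SET LOCAL applies only to this transaction and reverts automatically
-- at COMMIT/ROLLBACK, so the global max_connections * work_mem budget
-- is not inflated for every session.
SET LOCAL work_mem = '80MB';
-- ... run the expensive GROUP BY query here ...
COMMIT;
```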
Re: [PERFORM] hash join vs nested loop join
With a low cache hit rate, that would generally be when the number of lookups into the table exceeds about 10% of the table's rows. So far, my main performance issue comes down to this pattern, where Postgres chooses a hash join that's slower than an indexed nested loop join. By changing those cost parameters, this query works as expected now, but there are others that fall into the same category and appear to be harder to convince the optimiser about. I'm still a bit worried about this query, as Postgres gets the record count right and knows the index is a primary key index; it therefore knows it's 0.05m out of 170m records (0.03%), but it still chooses the sequential scan. Hopefully this is just related to that big-index penalty bug introduced in 9.2.
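[Editor's note: when investigating cases like this, a quick session-local way to see how the planner costs the alternative is to disable the chosen join method and re-run EXPLAIN. A diagnostic sketch (the query shape is inferred from the plans quoted earlier; not something to leave enabled in production):]

```sql
-- First, the plan the planner picks on its own (hash join here).
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM im_match_table smalltable
JOIN invtran bigtable ON bigtable.invtranref = smalltable.invtranref;

-- Forbid hash joins for this session only and compare the costing and
-- actual runtime of the nested loop plan the planner rejected.
SET enable_hashjoin = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM im_match_table smalltable
JOIN invtran bigtable ON bigtable.invtranref = smalltable.invtranref;

RESET enable_hashjoin;
```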