Re: [PERFORM] slow query
> I am reposting as my original query was mangled. The link to the
> explain plan is here, as it does not paste well into the email body:
>
> http://explain.depesz.com/s/kHa
>
> The machine is a 2 cpu quad core 5430 with 32G RAM and a 6x450G 15K
> single raid-10 array
>
> 1G work_mem
> default_statistics_target=1000
> random_page_cost=1

Are you sure it's wise to set the work_mem to 1G? Do you really need
it? Don't forget this is not a 'total' or 'per query' limit - each
query may allocate multiple work areas (and thus occupy multiple GB).
But I guess this is not the cause of the original problem.

The last row, 'random_page_cost=1', basically says that reading data at
random is just as cheap as reading it sequentially, which may result in
poor performance due to bad plans. Why have you set this value? Sure,
there are rare cases where 'random_page_cost=1' is OK.

> I am curious why the hash join takes so long. The main table
> dev4_act_dy_fact_2010_05_t has 25 million rows. The table is
> partitioned into 3 parts per month. The remaining tables are very
> small (< 1000 rows).

Well, the real cause that makes your query slow is the 'index scan'
part:

  Index Scan using dev4_act_dy_fact_2010_05_t3_thedate on
  dev4_act_dy_fact_2010_05_t3 a (cost=0.00..94041.89 rows=204276
  width=60) (actual time=164533.725..164533.725 rows=0 loops=1)

The first thing to note here is the difference between the expected and
actual number of rows - the planner expects 204276 but gets 0 rows. How
large is this partition? Try to analyze it, set the random_page_cost to
something reasonable (e.g. 4) and try to run the query again.

Tomas

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
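[Editor's sketch: Tomas's suggestions can be tried per session before touching postgresql.conf. The table name is taken from the plan above; the final step stands in for the original query.]

```sql
-- Experiment with planner settings for this session only.
SET random_page_cost = 4;      -- conventional value; 1 claims random I/O is free
SET work_mem = '64MB';         -- per sort/hash work area, not a per-query total
ANALYZE dev4_act_dy_fact_2010_05_t3;  -- refresh the partition's statistics
-- then re-run EXPLAIN ANALYZE on the slow query and compare plans
```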
Re: [PERFORM] slow query
On Thu, 3 Jun 2010, Anj Adu wrote:
> http://explain.depesz.com/s/kHa

I'm interested in why the two partitions dev4_act_dy_fact and
dev4_act_dy_fact_2010_05_t3 are treated so differently. I'm guessing
that the former is the parent and the latter the child table?

When accessing the parent table, Postgres is able to use a bitmap AND
index scan, because it has the two indexes dev4_act_dy_dm_nd_indx and
dev4_act_dy_dm_cd_indx. Do the child tables have a similar index setup?

Incidentally, you could do even better than a bitmap AND index scan by
creating an index on (node_id, thedate) on each table.

> random_page_cost=1

I agree with Tomas that this is rarely a useful setting.

Matthew

--
You can configure Windows, but don't ask me how.       -- Bill Gates
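[Editor's sketch of Matthew's composite-index suggestion; it assumes the query filters on node_id and thedate, per the index names in the plan, and the index name here is a placeholder.]

```sql
-- A single (node_id, thedate) index can satisfy both predicates in one
-- index scan, instead of BitmapAnd-ing two single-column indexes.
CREATE INDEX dev4_act_dy_fact_2010_05_t3_node_date
    ON dev4_act_dy_fact_2010_05_t3 (node_id, thedate);
```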
Re: [PERFORM] Weird XFS WAL problem
On Thu, 3 Jun 2010, Greg Smith wrote:
> And it's also quite reasonable for a RAID controller to respond to
> that "flush the whole cache" call by flushing its cache.

Remember that the RAID controller is presenting itself to the OS as one
large disc, and hiding the individual discs from the OS. Why should the
OS care what has actually happened to the individual discs' caches, as
long as that "flush the whole cache" command guarantees that the data
is persistent? Taking the RAID array as a whole, that happens when the
data hits the write-back cache.

The only circumstance where you actually need to flush the data to the
individual discs is when you need to take a disc away and read it on
another system. That's quite a rare use case for a RAID array
(http://thedailywtf.com/Articles/RAIDing_Disks.aspx notwithstanding).

If the controller had some logic that said "it's OK not to flush the
cache when that call comes in, if my battery is working fine", that
would make this whole problem go away. The only place this can be
properly sorted is the RAID controller. Anywhere else would be crazy.

Matthew

--
To err is human; to really louse things up requires root privileges.
  -- Alexander Pope, slightly paraphrased
[PERFORM] How filesystems matter with PostgreSQL
Some interesting data about different filesystems I tried with
PostgreSQL, and how it came out. I have an application that is backed
by postgres, using Java JDBC to access it. The tests were all done on
an openSUSE 11.2 64-bit machine, on the same hard drive (I just ran
mkfs between each test), on the same input, with the same code base.
All filesystems were created with the default options.

XFS (logbufs=8): ~4 hours to finish
ext4:            ~1 hour 50 minutes to finish
ext3:            15 minutes to finish
ext3 on LVM:     15 minutes to finish

--
Jon Schewe | http://mtu.net/~jpschewe
If you see an attachment named signature.asc, this is my digital
signature. See http://www.gnupg.org for more information.
Re: [PERFORM] How filesystems matter with PostgreSQL
On Friday 04 June 2010 14:17:35 Jon Schewe wrote:
> Some interesting data about different filesystems I tried with
> PostgreSQL, and how it came out. The tests were all done on an
> openSUSE 11.2 64-bit machine, on the same hard drive, on the same
> input, with the same code base. All filesystems were created with
> the default options.
>
> XFS (logbufs=8): ~4 hours to finish
> ext4: ~1 hour 50 minutes to finish
> ext3: 15 minutes to finish
> ext3 on LVM: 15 minutes to finish

Hi Jon,

Any chance you can do the same test with reiserfs?

Thanks,
Joost
Re: [PERFORM] How filesystems matter with PostgreSQL
On Friday 04 June 2010 14:17:35 Jon Schewe wrote:
> XFS (logbufs=8): ~4 hours to finish
> ext4: ~1 hour 50 minutes to finish
> ext3: 15 minutes to finish
> ext3 on LVM: 15 minutes to finish

My guess is that some of the difference comes from barrier differences.
ext4 uses barriers by default, ext3 does not.

Andres
[PERFORM] performance regression with Linux 2.6.33 and glibc 2.12
Hi.

I hope I'm not going to expose an already known problem, but I couldn't
find it in the mailing list archives (I only found
http://archives.postgresql.org/pgsql-hackers/2009-12/msg01543.php).

On one of my (non-production) machines, I've just seen a very big
performance regression (I was doing a very simple insert test). I had
an 'old' 8.4 postgresql compiled a few months ago, performing very
well, and my 'bleeding edge' 9.0 doing the same insert very slowly.

I managed to find the cause of the regression: with Linux 2.6.33,
O_DSYNC is now available. With glibc 2.12, O_DSYNC is available in
userspace. Having both (they are both very new; 2.12 isn't even
official on the glibc website), my new build defaulted to
open_datasync. The problem is that it is much slower. I tested it on 2
small machines (no big raid, just basic machines, with SATA or software
RAID).

Here is the trivial test. The configuration is the default
configuration, just after initdb:

CREATE TABLE test (a int);
CREATE INDEX idxtest ON test (a);

with wal_sync_method = open_datasync (new default):

marc=# INSERT INTO test SELECT generate_series(1,10);
INSERT 0 10
Time: 16083,912 ms

with wal_sync_method = fdatasync (old default):

marc=# INSERT INTO test SELECT generate_series(1,10);
INSERT 0 10
Time: 954,000 ms

Doing synthetic benchmarks with test_fsync (open_datasync performance,
glibc 2.12, 2.6.34, 1 SATA drive):

Simple 8k write timing:
        write                         0.037511

Compare file sync methods using one 8k write:
        open_datasync write          56.998797
        open_sync write             168.653995
        write, fdatasync             55.359279
        write, fsync                166.854911

Compare file sync methods using two 8k writes:
        open_datasync write, write  113.342738
        open_sync write, write      339.066883
        write, write, fdatasync      57.336820
        write, write, fsync         166.847923

Compare open_sync sizes:
        16k open_sync write         169.423723
        2 8k open_sync writes       336.457119

Compare fsync times on write() and new file descriptors (if the times
are similar, fsync() can sync data written on a different descriptor):
        write, fsync, close         166.264048
        write, close, fsync         168.702035

This is it. I just wanted to raise an alert on this: the degradation
was 16-fold with this test. We won't see Linux 2.6.33 + glibc 2.12 in
production for months (I hope), but shouldn't PostgreSQL default to
fdatasync on Linux, seeing this result?

By the way, I re-did my tests with 2.6.33, 2.6.34 and 2.6.35-rc1 and
got the exact same result (O_DSYNC present, obviously, but also the
performance degradation).

Cheers

Marc
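[Editor's sketch: the workaround Marc is implying is a one-line change in postgresql.conf; wal_sync_method is picked up on a server reload.]

```
# postgresql.conf
wal_sync_method = fdatasync    # override the new open_datasync default
```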
Re: [PERFORM] performance regression with Linux 2.6.33 and glibc 2.12
Marc Cousin <cousinm...@gmail.com> writes:
> I hope I'm not going to expose an already known problem, but I
> couldn't find it in the mailing list archives (I only found
> http://archives.postgresql.org/pgsql-hackers/2009-12/msg01543.php).

You sure this isn't the well-known "ext4 actually implements fsync
where ext3 didn't" issue?

			regards, tom lane
Re: [PERFORM] Performance tuning for postgres
Yogesh Naik <yogesh_n...@persistent.co.in> wrote:
> I am performing a DB insertion and update for 3000+ records, and
> while doing so I get CPU utilization to 100%, with 67% of CPU used
> by postgres.
>
> I have also done optimization on queries too...
>
> Is there any way to optimize the CPU utilization for postgres?

We'd need a lot more information before we could make useful
suggestions. Knowing something about your hardware, OS, exact
PostgreSQL version, postgresql.conf contents, the table definition, any
foreign keys or other constraints, and exactly how you're doing the
inserts would all be useful. Please read this and repost:

http://wiki.postgresql.org/wiki/SlowQueryQuestions

-Kevin
Re: [PERFORM] How filesystems matter with PostgreSQL
Andres Freund <and...@anarazel.de> writes:
> On Friday 04 June 2010 14:17:35 Jon Schewe wrote:
>> XFS (logbufs=8): ~4 hours to finish
>> ext4: ~1 hour 50 minutes to finish
>> ext3: 15 minutes to finish
>> ext3 on LVM: 15 minutes to finish

> My guess is that some of the difference comes from barrier
> differences. ext4 uses barriers by default, ext3 does not.

Or, to put it more clearly: the reason ext3 is fast is that it's
unsafe.

			regards, tom lane
Re: [PERFORM] How filesystems matter with PostgreSQL
On Friday 04 June 2010 16:25:30 Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
>> My guess is that some of the difference comes from barrier
>> differences. ext4 uses barriers by default, ext3 does not.
>
> Or, to put it more clearly: the reason ext3 is fast is that it's
> unsafe.

Jon: To verify, you can enable it via the barrier=1 option during
mounting...

Andres
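[Editor's sketch of the mount option Andres mentions, as it might appear in /etc/fstab; the device and mount point are placeholders.]

```
# /etc/fstab -- enable write barriers on the postgres data volume
/dev/sdb1   /var/lib/pgsql   ext3   defaults,barrier=1   0 2
```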
Re: [PERFORM] Performance tuning for postgres
In my opinion it depends on the application, the priority of the
application, and whether or not it is a commercially sold product, but
depending on your needs you might want to consider having a 3rd-party
vendor who has expertise in this process review and help tune the
application. One vendor that I know does this is EnterpriseDB. I've
worked with other SQL engines and have a lot of experience tuning
queries in a couple of those environments, but PostgreSQL isn't one of
them. Having an experienced DBA review your system can make the
difference between night and day.

Best Regards

Michael Gould

Kevin Grittner <kevin.gritt...@wicourts.gov> wrote:
> Yogesh Naik <yogesh_n...@persistent.co.in> wrote:
>> I am performing a DB insertion and update for 3000+ records and
>> while doing so i get CPU utilization to 100% with 67% of CPU used
>> by postgres
>> I have also done optimization on queries too...
>> Is there any way to optimized the CPU utilization for postgres
>
> We'd need a lot more information before we could make useful
> suggestions. Knowing something about your hardware, OS, exact
> PostgreSQL version, postgresql.conf contents, the table definition,
> any foreign keys or other constraints, and exactly how you're doing
> the inserts would all be useful. Please read this and repost:
> http://wiki.postgresql.org/wiki/SlowQueryQuestions
>
> -Kevin

--
Michael Gould, Managing Partner
Intermodal Software Solutions, LLC
904.226.0978
904.592.5250 fax
Re: [PERFORM] Weird XFS WAL problem
Greg Smith wrote:
> Kevin Grittner wrote:
>> I've seen this, too (with xfs). Our RAID controller, in spite of
>> having BBU cache configured for write-back, waits for actual
>> persistence on disk for write barriers (unlike for fsync). This
>> does strike me as surprising to the point of bordering on
>> qualifying as a bug.
>
> Completely intentional, and documented at
> http://xfs.org/index.php/XFS_FAQ#Q._Should_barriers_be_enabled_with_storage_which_has_a_persistent_write_cache.3F
>
> The issue is that XFS will actually send the full "flush your cache"
> call to the controller, rather than just the usual fsync call, and
> that eliminates the benefit of having a write cache there in the
> first place. Good controllers respect that and flush their whole
> write cache out. And ext4 has adopted the same mechanism.
>
> This is very much a good thing from the perspective of database
> reliability for people with regular hard drives who don't have a
> useful write cache on their cheap hard drives. It allows them to
> keep the disk's write cache on for other things, while still getting
> the proper cache flushes when the database commits demand them. It
> does mean that everyone with a non-volatile battery-backed cache,
> via RAID card typically, needs to turn barriers off manually. I've
> already warned on this list that PostgreSQL commit performance on
> ext4 is going to appear really terrible to many people. If you
> benchmark and don't recognize ext3 wasn't operating in a reliable
> mode before, the performance drop now that ext4 is doing the right
> thing with barriers looks impossibly bad.

Well, this is depressing. Now that we finally have common
battery-backed-cache RAID controller cards, the file system developers
have thrown down another roadblock in ext4 and xfs. Do we need to
document this?

On another topic, I am a little unclear on how things behave when the
drive is write-back. If the RAID controller card writes to the drive,
but the data isn't on the platters, how does it know when it can
discard that information from the BBU RAID cache?

--
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + None of us is going to be here forever. +
Re: [PERFORM] Weird XFS WAL problem
Bruce Momjian <br...@momjian.us> wrote:
> On another topic, I am a little unclear on how things behave when
> the drive is write-back. If the RAID controller card writes to the
> drive, but the data isn't on the platters, how does it know when it
> can discard that information from the BBU RAID cache?

The controller waits for the drive to tell it that it has made it to
the platter before it discards it. What made you think otherwise?

-Kevin
Re: [PERFORM] Weird XFS WAL problem
Kevin Grittner wrote:
> Bruce Momjian <br...@momjian.us> wrote:
>> If the RAID controller card writes to the drive, but the data
>> isn't on the platters, how does it know when it can discard that
>> information from the BBU RAID cache?
>
> The controller waits for the drive to tell it that it has made it
> to the platter before it discards it. What made you think
> otherwise?

Because a write-back drive cache says it is on the drive before it
hits the platters, which I think is the default for SATA drives. Is
that inaccurate?
Re: [PERFORM] Weird XFS WAL problem
Bruce Momjian <br...@momjian.us> wrote:
> Kevin Grittner wrote:
>> The controller waits for the drive to tell it that it has made it
>> to the platter before it discards it. What made you think
>> otherwise?
>
> Because a write-back drive cache says it is on the drive before it
> hits the platters, which I think is the default for SATA drives. Is
> that inaccurate?

Any decent RAID controller will ensure that the drives themselves
aren't using write-back caching. When we've mentioned write-back
versus write-through on this thread, we've been talking about the
behavior of the *controller*. We have our controllers configured to
use write-back through the BBU cache as long as the battery is good,
but to automatically switch to write-through if the battery goes bad.

-Kevin
Re: [PERFORM] performance regression with Linux 2.6.33 and glibc 2.12
On Friday 04 June 2010 15:59:05 Tom Lane wrote:
> Marc Cousin <cousinm...@gmail.com> writes:
>> I hope I'm not going to expose an already known problem, but I
>> couldn't find it in the mailing list archives (I only found
>> http://archives.postgresql.org/pgsql-hackers/2009-12/msg01543.php).
>
> You sure this isn't the well-known "ext4 actually implements fsync
> where ext3 didn't" issue?

I doubt it. It reads to me like he is testing the two methods on the
same installation with the same kernel:

> with wal_sync_method = open_datasync (new default):
> marc=# INSERT INTO test SELECT generate_series(1,10);
> INSERT 0 10
> Time: 16083,912 ms
>
> with wal_sync_method = fdatasync (old default):
> marc=# INSERT INTO test SELECT generate_series(1,10);
> INSERT 0 10
> Time: 954,000 ms

It's not actually surprising that in such a setup open_datasync is
hugely slower than fdatasync. With open_datasync every single write
will be synchronous, and very likely not reordered/batched/whatever. In
contrast, with fdatasync the WAL will only be synced in much bigger
batches. Or am I missing something?

I always thought the synchronous write methods were a fallback kludge
and didn't realize one of them is actually the preferred method...

Andres
Re: [PERFORM] performance regression with Linux 2.6.33 and glibc 2.12
On Friday 04 June 2010 15:59:05, Tom Lane wrote:
> Marc Cousin <cousinm...@gmail.com> writes:
>> I hope I'm not going to expose an already known problem, but I
>> couldn't find it in the mailing list archives (I only found
>> http://archives.postgresql.org/pgsql-hackers/2009-12/msg01543.php).
>
> You sure this isn't the well-known "ext4 actually implements fsync
> where ext3 didn't" issue?
>
> 			regards, tom lane

Everything is ext4, so I should have fsync working with write barriers
on all the tests. I don't think this problem is of the same kind: I
think it is really because of O_DSYNC appearing in 2.6.33, and
PostgreSQL now using it by default. If my filesystem were lying to me
about barriers, I should take no more of a performance hit with
open_datasync than with fdatasync, should I?
Re: [PERFORM] Weird XFS WAL problem
Kevin Grittner wrote:
> Bruce Momjian <br...@momjian.us> wrote:
>> Because a write-back drive cache says it is on the drive before it
>> hits the platters, which I think is the default for SATA drives.
>> Is that inaccurate?
>
> Any decent RAID controller will ensure that the drives themselves
> aren't using write-back caching. When we've mentioned write-back
> versus write-through on this thread, we've been talking about the
> behavior of the *controller*. We have our controllers configured to
> use write-back through the BBU cache as long as the battery is
> good, but to automatically switch to write-through if the battery
> goes bad.

OK, good. But why would a BBU RAID controller flush stuff to disk on a
flush-all command? I thought the whole goal of the BBU was to avoid
such flushes. What is unique about the command ext4/xfs is sending?
Re: [PERFORM] Weird XFS WAL problem
Bruce Momjian <br...@momjian.us> wrote:
> Kevin Grittner wrote:
>> Any decent RAID controller will ensure that the drives themselves
>> aren't using write-back caching. When we've mentioned write-back
>> versus write-through on this thread, we've been talking about the
>> behavior of the *controller*. We have our controllers configured
>> to use write-back through the BBU cache as long as the battery is
>> good, but to automatically switch to write-through if the battery
>> goes bad.
>
> OK, good. But why would a BBU RAID controller flush stuff to disk
> on a flush-all command? I thought the whole goal of the BBU was to
> avoid such flushes.

That has been *precisely* my point. I don't know at the protocol
level; I just know that write barriers do *something* which causes our
controllers to wait for actual disk-platter persistence, while fsync
does not. The write barrier concept seems good to me, and I wish it
could be used at the OS level without killing performance. I blame the
controller, for not treating it the same as fsync (i.e., as long as
it's in write-back mode it should treat data as persisted as soon as
it's in BBU cache).

-Kevin
Re: [PERFORM] Weird XFS WAL problem
Kevin Grittner wrote:
> Bruce Momjian <br...@momjian.us> wrote:
>> OK, good. But why would a BBU RAID controller flush stuff to disk
>> on a flush-all command? I thought the whole goal of the BBU was to
>> avoid such flushes.
>
> That has been *precisely* my point. I don't know at the protocol
> level; I just know that write barriers do *something* which causes
> our controllers to wait for actual disk-platter persistence, while
> fsync does not. The write barrier concept seems good to me, and I
> wish it could be used at the OS level without killing performance.
> I blame the controller, for not treating it the same as fsync
> (i.e., as long as it's in write-back mode it should treat data as
> persisted as soon as it's in BBU cache).

Yeah. I wonder if it honors the cache flush because it might think it
is replacing disks, or something odd. I think we are going to have to
document this in 9.0, because obviously you have seen it already. Is
this an issue with SAS cards/drives as well?
Re: [PERFORM] slow query
> I'm interested in why the two partitions dev4_act_dy_fact and
> dev4_act_dy_fact_2010_05_t3 are treated so differently. I'm guessing
> that the former is the parent and the latter the child table?

Yes.. you are correct.

> When accessing the parent table, Postgres is able to use a bitmap
> AND index scan, because it has the two indexes dev4_act_dy_dm_nd_indx
> and dev4_act_dy_dm_cd_indx. Do the child tables have a similar index
> setup?

Yes.. the child tables have indexes on those fields as well.

> Incidentally, you could get even better than a bitmap AND index scan
> by creating an index on (node_id, thedate) on each table.

Will this perform better than separate indexes?

>> random_page_cost=1
>
> I agree with Tomas that this is rarely a useful setting.
Re: [PERFORM] slow query
2010/6/4 <t...@fuzzy.cz>:
>> I am reposting as my original query was mangled. The link to the
>> explain plan is here, as it does not paste well into the email body:
>> http://explain.depesz.com/s/kHa
>>
>> The machine is a 2 cpu quad core 5430 with 32G RAM and a 6x450G 15K
>> single raid-10 array
>> 1G work_mem
>> default_statistics_target=1000
>> random_page_cost=1
>
> Are you sure it's wise to set the work_mem to 1G? Do you really need
> it? Don't forget this is not a 'total' or 'per query' limit - each
> query may allocate multiple work areas (and occupy multiple GB). But
> I guess this does not cause the original problem.
>
> The last row 'random_page_cost=1' - this basically says that reading
> data at random is just as cheap as reading data sequentially, which
> may result in poor performance due to bad plans. Why have you set
> this value? Sure, there are rare cases where 'random_page_cost=1' is
> OK.

The default for 8.4 is 2. I tried with 2 and 1.. but the results are
not very different. I understand that for fast disks (which we have
with a decent Raid 10 setup) the random_page_cost can be lowered as
needed.. but I guess it did not make a difference here.

>> I am curious why the hash join takes so long. The main table
>> dev4_act_dy_fact_2010_05_t has 25 million rows. The table is
>> partitioned into 3 parts per month. Remaining tables are very small
>> (< 1000 rows)
>
> Well, the real cause that makes your query slow is the 'index scan'
> part:
>
>  Index Scan using dev4_act_dy_fact_2010_05_t3_thedate on
>  dev4_act_dy_fact_2010_05_t3 a (cost=0.00..94041.89 rows=204276
>  width=60) (actual time=164533.725..164533.725 rows=0 loops=1)
>
> The first thing to note here is the difference in expected and
> actual number of rows - the planner expects 204276 but gets 0 rows.
> How large is this partition?

The partition has 25 million rows, with indexes on thedate and
node_id. I altered the random_page_cost to 4.. still slow. These
tables are analyzed every day, and I have an index on each field used
in the where criteria.

> Try to analyze it, set the random_page_cost to something reasonable
> (e.g. 4) and try to run the query again.
>
> Tomas
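[Editor's sketch: one way to see whether the bad row estimate is a statistics problem is to inspect what the planner believes about the filter columns. The column names here follow the index names in the plan and are assumptions.]

```sql
-- Compare the planner's view of the columns against reality.
SELECT attname, n_distinct, null_frac, correlation
  FROM pg_stats
 WHERE tablename = 'dev4_act_dy_fact_2010_05_t3'
   AND attname IN ('thedate', 'node_id');
```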
Re: [PERFORM] slow query
Does the difference between expected and actual rows, as seen by the
planner, matter much? Even after an analyze, the results are similar
(there is a big diff between expected and actual). The partition has
25 million rows.

On Fri, Jun 4, 2010 at 10:41 AM, Anj Adu <fotogra...@gmail.com> wrote:
> 2010/6/4 <t...@fuzzy.cz>:
>> The first thing to note here is the difference in expected and
>> actual number of rows - the planner expects 204276 but gets 0
>> rows. How large is this partition?
>
> The partition has 25 million rows, with indexes on thedate and
> node_id. I altered the random_page_cost to 4.. still slow. These
> tables are analyzed every day, and I have an index on each field
> used in the where criteria.
>
> [...]
Re: [PERFORM] Performance tuning for postgres
Is this a bulk insert? Are you wrapping your statements within a transaction(s)? How many columns in the table? What do the table statistics look like? On Fri, Jun 4, 2010 at 9:21 AM, Michael Gould mgo...@intermodalsoftwaresolutions.net wrote: In my opinion it depends on the application, the priority of the application and whether or not it is a commercially sold product, but depending on your needs you might want to consider having a 3rd party vendor who has expertise in this process review and help tune the application. One vendor that I know does this is EnterpriseDB. I've worked with other SQL engines and have a lot of experience tuning queries in a couple of the environments but PostGresql isn't one of them. Having an experienced DBA review your system can make the difference between night and day. Best Regards Michael Gould Kevin Grittner kevin.gritt...@wicourts.gov wrote: Yogesh Naik yogesh_n...@persistent.co.in wrote: I am performing a DB insertion and update for 3000+ records and while doing so i get CPU utilization to 100% with 67% of CPU used by postgres I have also done optimization on queries too... Is there any way to optimized the CPU utilization for postgres We'd need a lot more information before we could make useful suggestions. Knowing something about your hardware, OS, exact PostgreSQL version, postgresql.conf contents, the table definition, any foreign keys or other constraints, and exactly how you're doing the inserts would all be useful. 
>> Please read this and repost: http://wiki.postgresql.org/wiki/SlowQueryQuestions
>>
>> -Kevin
>
> --
> Michael Gould, Managing Partner
> Intermodal Software Solutions, LLC
> 904.226.0978
> 904.592.5250 fax
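[Editor's note: a minimal illustration of the "wrap it in a transaction" question above. The table and columns are hypothetical — the original poster's schema is not shown. Issuing many INSERTs inside one transaction avoids a per-row commit, a common cause of overhead in bulk loads.]

```sql
-- Hypothetical table standing in for the poster's (undisclosed) schema.
CREATE TABLE records (id integer PRIMARY KEY, payload text);

-- One transaction for the whole batch: one commit instead of 3000+.
BEGIN;
INSERT INTO records (id, payload) VALUES (1, 'a');
INSERT INTO records (id, payload) VALUES (2, 'b');
-- ... remaining rows ...
COMMIT;

-- For large loads, COPY is usually faster still:
-- COPY records (id, payload) FROM '/path/to/data.csv' WITH CSV;
```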
Re: [PERFORM] How filesystems matter with PostgreSQL
UFS2 w/ soft updates on FreeBSD might be an interesting addition to the list of test cases.

On Fri, Jun 4, 2010 at 9:33 AM, Andres Freund and...@anarazel.de wrote:
> On Friday 04 June 2010 16:25:30 Tom Lane wrote:
>> Andres Freund and...@anarazel.de writes:
>>> On Friday 04 June 2010 14:17:35 Jon Schewe wrote:
>>>> XFS (logbufs=8): ~4 hours to finish
>>>> ext4: ~1 hour 50 minutes to finish
>>>> ext3: 15 minutes to finish
>>>> ext3 on LVM: 15 minutes to finish
>>>
>>> My guess is that some of the difference comes from barrier differences. ext4 uses barriers by default, ext3 does not.
>>
>> Or, to put it more clearly: the reason ext3 is fast is that it's unsafe.
>
> Jon: To verify, you can enable it via the barrier=1 option during mounting.
>
> Andres
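[Editor's sketch of Andres's suggestion. Device and mount point are placeholders; ext3 does not use write barriers by default, so enabling them makes the ext3/ext4 comparison apples-to-apples.]

```shell
# Remount an existing ext3 data volume with barriers enabled (placeholder paths).
mount -o remount,barrier=1 /dev/sdb1 /var/lib/pgsql

# Or persist it in /etc/fstab:
# /dev/sdb1  /var/lib/pgsql  ext3  defaults,barrier=1  0 2

# Conversely, ext4 can be tested with barriers disabled:
# mount -o remount,nobarrier /dev/sdb1 /var/lib/pgsql
```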
Re: [PERFORM] slow query
The behaviour is different in postgres 8.1.9 (much faster): the table has 9 million rows instead of 25 million, but the query comes back very fast (8 seconds). Wonder if this is very specific to 8.4.0.

On Fri, Jun 4, 2010 at 11:01 AM, Anj Adu fotogra...@gmail.com wrote:
> Is the difference in expected and actual rows, as seen by the planner, a big factor? Even after an analyze the results are similar (there is a big diff between expected and actual). Partition has 25 million rows.

On Fri, Jun 4, 2010 at 10:41 AM, Anj Adu fotogra...@gmail.com wrote:
> 2010/6/4 t...@fuzzy.cz:
>> Are you sure it's wise to set the work_mem to 1G? Do you really need it? Don't forget this is not a 'total' or 'per query' - each query may allocate multiple work areas (and occupy multiple GB). But I guess this does not cause the original problem.
>>
>> The last row 'random_page_cost=1' - this basically says that reading data by random is just as cheap as reading data sequentially. Which may result in poor performance due to bad plans. Why have you set this value? Sure, there are rare cases where 'random_page_cost=1' is OK.
>
> The default for 8.4 is 2. I tried with 2 and 1, but the results are not very different. I understand that for fast disks (which we have with a decent Raid 10 setup) the random_page_cost can be lowered as needed, but I guess it did not make a difference here.
>
>> The main table dev4_act_dy_fact_2010_05_t has 25 million rows. The table is partitioned into 3 parts per month. Remaining tables are very small (< 1000 rows).
>>
>> Well, the real cause that makes your query slow is the 'index scan' part.
> Index Scan using dev4_act_dy_fact_2010_05_t3_thedate on dev4_act_dy_fact_2010_05_t3 a (cost=0.00..94041.89 rows=204276 width=60) (actual time=164533.725..164533.725 rows=0 loops=1)
>
> The first thing to note here is the difference in expected and actual number of rows - the planner expects 204276 but gets 0 rows. How large is this partition?

The partition has 25 million rows, with indexes on theDate and node_id. I altered the random_page_cost to 4 (1 more than the default); still slow. These tables are analyzed every day, and I have an index on each field used in the where criteria.

> Try to analyze it, set the random_page_cost to something reasonable (e.g. 4) and try to run the query again.
>
> Tomas
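[Editor's sketch of Matthew's earlier suggestion from this thread — a composite index on (node_id, thedate) for each partition. The index name is invented; table and column names are as given in the thread.]

```sql
-- A composite index lets the planner satisfy both predicates with a single
-- index scan instead of a bitmap AND of two separate single-column indexes.
CREATE INDEX dev4_act_dy_fact_2010_05_t3_node_date_idx
    ON dev4_act_dy_fact_2010_05_t3 (node_id, thedate);

-- Repeat for each child partition, then refresh statistics:
ANALYZE dev4_act_dy_fact_2010_05_t3;
```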
Re: [PERFORM] How filesystems matter with PostgreSQL
I'm running on Linux, so that's not really an option here.

On 6/4/10 1:20 PM, Bryan Hinton wrote:
> UFS2 w/ soft updates on FreeBSD might be an interesting addition to the list of test cases.

--
Jon Schewe | http://mtu.net/~jpschewe
If you see an attachment named signature.asc, this is my digital signature. See http://www.gnupg.org for more information.
Re: [PERFORM] How filesystems matter with PostgreSQL
On 6/4/10 9:33 AM, Andres Freund wrote:
> Or, to put it more clearly: the reason ext3 is fast is that it's unsafe.
>
> Jon: To verify, you can enable it via the barrier=1 option during mounting.

First some details:
Linux kernel 2.6.31
postgres version: 8.4.2

More test results:
reiserfs: ~1 hour 50 minutes
ext3 barrier=1: ~15 minutes
ext4 nobarrier: ~15 minutes
jfs: ~15 minutes

--
Jon Schewe | http://mtu.net/~jpschewe
If you see an attachment named signature.asc, this is my digital signature. See http://www.gnupg.org for more information.
Re: [PERFORM] How filesystems matter with PostgreSQL
What types of journaling on each fs?

On Fri, Jun 4, 2010 at 1:26 PM, Jon Schewe jpsch...@mtu.net wrote:
> First some details:
> Linux kernel 2.6.31
> postgres version: 8.4.2
>
> More test results:
> reiserfs: ~1 hour 50 minutes
> ext3 barrier=1: ~15 minutes
> ext4 nobarrier: ~15 minutes
> jfs: ~15 minutes
>
> --
> Jon Schewe | http://mtu.net/~jpschewe
Re: [PERFORM] How filesystems matter with PostgreSQL
I just used standard mkfs for each filesystem and mounted them without options, unless otherwise specified.

On 6/4/10 1:37 PM, Bryan Hinton wrote:
> What types of journaling on each fs?

--
Jon Schewe | http://mtu.net/~jpschewe
If you see an attachment named signature.asc, this is my digital signature. See http://www.gnupg.org for more information.
Re: [PERFORM] How filesystems matter with PostgreSQL
On Friday 04 June 2010 20:26:27 Jon Schewe wrote:
> ext3 barrier=1: ~15 minutes
> ext4 nobarrier: ~15 minutes

Any message in the kernel log about barriers or similar?

Andres
Re: [PERFORM] How filesystems matter with PostgreSQL
On 6/4/10 1:46 PM, Andres Freund wrote:
> On Friday 04 June 2010 20:26:27 Jon Schewe wrote:
>> ext3 barrier=1: ~15 minutes
>> ext4 nobarrier: ~15 minutes
>
> Any message in the kernel log about barriers or similar?

No.

--
Jon Schewe | http://mtu.net/~jpschewe
If you see an attachment named signature.asc, this is my digital signature. See http://www.gnupg.org for more information.
Re: [PERFORM] planner costs in warm cache tests
On Mon, May 31, 2010 at 3:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Jesper Krogh jes...@krogh.cc writes:
>> On 2010-05-30 20:34, Tom Lane wrote:
>>> Well, hmm, I really doubt that that represents reality either. A page access is by no means free even when the page is already in cache. I don't recall anyone suggesting that you set these numbers to less than perhaps 0.01.
>>
>> Thank you for the prompt response. Is it a false assumption that the cost should, in some metric between different plans, be a measurement of actual run-time in a dead-disk run?
>
> Well, the default cost parameters (seq_page_cost=1, random_page_cost=4) are intended to model the non-cached state where most page fetches actually do require a disk access. They are definitely too large relative to the cpu_xxx_cost parameters when you have a fully-cached database, but what I've seen people recommending for that condition is to set them both to the same value in the vicinity of 0.1 or 0.01 or so. If it's only mostly cached you might try intermediate settings.

I have had to set it as low as .005 to get the right things to happen. Could have been a fluke, I suppose.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
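[Editor's sketch of Tom's fully-cached recommendation as session-level settings one could test; the values are from his message, the query is a placeholder.]

```sql
-- For a fully-cached database, make sequential and random page fetches
-- equally cheap and much closer in scale to the cpu_xxx_cost parameters:
SET seq_page_cost = 0.1;
SET random_page_cost = 0.1;

-- Then compare plans before and after:
EXPLAIN SELECT ...;  -- query of interest (none given in the thread)
```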
Re: [PERFORM] how to force hashaggregate plan?
On Thu, May 27, 2010 at 3:34 PM, Slava Moudry smou...@4info.com wrote:
> 1) Is there a way to force a plan that uses hashaggregate for the second query?

No, although if you crank work_mem up high enough you should get it, I think.

> 2) I am not trying to achieve any particular execution time for the query, but I noticed that when disk sort kicks in (and that happens eventually once the dataset is large enough) the query drastically slows down, even if there is no physical IO going on. I wonder if it's possible to have predictable performance rather than a sudden drop.

No. The planner has to choose one algorithm or the other - there's not really a way it can do a mix.

> 3) Why does the hashAggregate plan use so much less memory (work_mem) than the plan with groupAggregate/sort? The hashAggregate plan for Query1 works even with work_mem='2GB'; the second plan decides to use disk sort even with work_mem='4GB'. Why is sort so memory greedy? Are there any plans to address the sorting memory efficiency issues?

Well, if you select more columns, then the tuples that are buffered in memory take up more space, right? Twice the columns = twice the memory. What I'd be curious to know is how accurate the memory estimates are - figure out the lowest value of work_mem needed to get a particular plan, and then compare that to the amount of memory used when you execute the query...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
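[Editor's sketch of the work_mem experiment Robert suggests. The table and columns are hypothetical - the poster's queries are not shown. The planner switches from Sort + GroupAggregate to HashAggregate once it believes the hash table fits in work_mem.]

```sql
-- Raise the per-operation memory budget for this session only.
SET work_mem = '512MB';

-- With enough work_mem the plan should show HashAggregate; with too little,
-- Sort + GroupAggregate (possibly spilling the sort to disk).
EXPLAIN ANALYZE
SELECT some_key, count(*)   -- hypothetical aggregate query
FROM some_fact_table
GROUP BY some_key;
```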
Re: [PERFORM] PgAdmin iii - Explain.
On Tue, Jun 1, 2010 at 1:47 PM, Jeres Caldeira Gomes jeres.go...@gmail.com wrote:
> I'm needing some tutorial to use and understand the graphical Explain feature of PgAdmin III. Do you have one?

Hmm... you might want to ask about this on the pgadmin-support list.

http://archives.postgresql.org/pgadmin-support/

If you're looking for documentation of the explain format in general, you might read the PostgreSQL documentation for EXPLAIN.

http://www.postgresql.org/docs/current/static/sql-explain.html

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
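[Editor's note: the textual plan that pgAdmin III renders graphically can be produced directly in psql; the catalog query here is just a stand-in.]

```sql
-- Plain plan (cost estimates only):
EXPLAIN SELECT * FROM pg_class WHERE relname = 'pg_proc';

-- Plan plus actual execution timings and row counts:
EXPLAIN ANALYZE SELECT * FROM pg_class WHERE relname = 'pg_proc';
```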