Re: [PERFORM] Why is creating a GIN table index so much slower than inserting data into an empty table with the same index?
Tom Lane wrote:
> Sergey Burladyan <eshkin...@gmail.com> writes:
>> show maintenance_work_mem ;
>>  maintenance_work_mem
>> ----------------------
>>  128MB
>>
>> create table a (i1 int, i2 int, i3 int, i4 int, i5 int, i6 int);
>> insert into a select n, n, n, n, n, n from generate_series(1, 10) as n;
>> create index arr_gin on a using gin ( (array[i1, i2, i3, i4, i5, i6]) );
>> [ takes forever ]
>
> Seems the reason this is so awful is that the incoming data is exactly
> presorted, meaning that the tree structure that ginInsertEntry() is trying
> to build degenerates to a linear list (ie, every new element becomes the
> right child of the prior one). So the processing becomes O(N^2) up till
> you reach maintenance_work_mem and flush the tree. With a large setting
> for maintenance_work_mem it gets spectacularly slow.

Yes, this is probably the same issue I bumped into a while ago:
http://archives.postgresql.org/message-id/49350a13.3020...@enterprisedb.com

I think a reasonable solution for this might be to keep an eye on maxdepth and force a flush if that gets too large (more than a few hundred, perhaps?). Something like this:

      /* If we've maxed out our available memory, dump everything to the index */
+     /* Also dump if the tree seems to be getting too unbalanced */
-     if (buildstate->accum.allocatedMemory >= maintenance_work_mem * 1024L)
+     if (buildstate->accum.allocatedMemory >= maintenance_work_mem * 1024L ||
+         buildstate->accum.maxdepth > DEPTH_LIMIT)
      {

The new fast-insert code likely will need a similar defense. I fooled around with a balanced tree, which solved the problem but unfortunately made the unsorted case slower. Limiting the depth like that should avoid that, so it's worth trying.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] High CPU Utilization
On Mar 20, 2009, at 4:58 PM, Scott Marlowe wrote:
> On Fri, Mar 20, 2009 at 2:49 PM, Joe Uhl <joe...@gmail.com> wrote:
>> On Mar 20, 2009, at 4:29 PM, Scott Marlowe wrote:
>>> What does the cs entry on vmstat say at this time? If your cs is
>>> skyrocketing then you're getting a context switch storm, which is
>>> usually a sign that there are just too many things going on at once /
>>> you've got an old kernel, things like that.
>>
>> The cs column (plus cpu columns) of vmstat 1 30 reads as follows:
>>
>>    cs  us  sy  id  wa
>> 11172  95   4   1   0
>> 12498  94   5   1   0
>> 14121  91   7   1   1
>> 11310  90   7   1   1
>> 12918  92   6   1   1
>> 10613  93   6   1   1
>>  9382  94   4   1   1
>> 14023  89   8   2   1
>> 10138  92   6   1   1
>> 11932  94   4   1   1
>> 15948  93   5   2   1
>> 12919  92   5   3   1
>> 10879  93   4   2   1
>> 14014  94   5   1   1
>>  9083  92   6   2   0
>> 11178  94   4   2   0
>> 10717  94   5   1   0
>>  9279  97   2   1   0
>> 12673  94   5   1   0
>>  8058  82  17   1   1
>>  8150  94   5   1   1
>> 11334  93   6   0   0
>> 13884  91   8   1   0
>> 10159  92   7   0   0
>>  9382  96   4   0   0
>> 11450  95   4   1   0
>> 11947  96   3   1   0
>>  8616  95   4   1   0
>> 10717  95   3   1   0
>>
>> We are running on the 2.6.28.7-2 kernel. I am unfamiliar with vmstat
>> output but reading the man page (and that cs = context switches per
>> second) makes my numbers seem very high.
>
> No, those aren't really all that high. If you were hitting cs contention,
> I'd expect it to be in the 25k to 100k range. 10k average under load is
> pretty reasonable.
>
>> Our combined JDBC pools currently top out at 400 connections (and we are
>> doing work on all 400 right now). I may try dropping those pools down
>> even smaller. Are there any general rules of thumb for figuring out how
>> many connections you should service at maximum? I know of the memory
>> constraints, but thinking more along the lines of connections per CPU
>> core.
>
> Well, maximum efficiency is usually somewhere in the range of 1 to 2 times
> the number of cores you have, so trying to get the pool down to a dozen or
> two connections would be the direction to generally head. May not be
> reasonable or doable though.

Turns out we may have an opportunity to purchase a new database server with this increased load. 
Seems that the best route, based on feedback to this thread, is to go whitebox, get quad Opterons, and get a very good disk controller. Can anyone recommend a whitebox vendor? Is there a current controller anyone on this list has experience with that they could recommend? This will be a bigger purchase so we will be doing research and benchmarking, but any general pointers to a vendor/controller are greatly appreciated.
Re: [PERFORM] High CPU Utilization
On Tue, 24 Mar 2009, Joe Uhl wrote: Can anyone recommend a whitebox vendor? I dumped a list of recommended vendors from a discussion here a while back at http://wiki.postgresql.org/wiki/SCSI_vs._IDE/SATA_Disks you could get started with. -- * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
Re: [PERFORM] High CPU Utilization
On Tue, Mar 24, 2009 at 1:29 PM, Greg Smith gsm...@gregsmith.com wrote: On Tue, 24 Mar 2009, Joe Uhl wrote: Can anyone recommend a whitebox vendor? I dumped a list of recommended vendors from a discussion here a while back at http://wiki.postgresql.org/wiki/SCSI_vs._IDE/SATA_Disks you could get started with. I'd add Aberdeen Inc to that list. They supply quality white box servers with 3ware, areca, or LSI controllers, and provide a 5 year all inclusive warranty. Their customer service is top notch too.
Re: [PERFORM] Help Me Understand Why I'm Getting a Bad Query Plan
I hate to nag, but could anybody help me with this? We have a few related queries that are causing noticeable service delays in our production system. I've tried a number of different things, but I'm running out of ideas and don't know what to do next.

Thanks,
Bryan

On Mon, Mar 23, 2009 at 2:03 PM, Bryan Murphy <bmurphy1...@gmail.com> wrote:
> Hey Guys,
>
> I've got a query on our production system that isn't choosing a good
> plan. I can't see why it's choosing to do a sequential scan on the
> ItemExperienceLog table. That table is about 800mb and has about 2.5
> million records. This example query only returns 4 records. I've tried
> upping the statistics for ItemExperienceLog.VisitorId and
> ItemExperienceLog.ItemId to 1000 (from our default of 100) with no
> success.
>
> Our primary keys are guids stored as char(32) not null. Our database uses
> UTF-8 encoding and is currently version 8.3.5.
>
> The queries:
>
> --SET enable_seqscan = off
> --SET enable_seqscan = on
> --ALTER TABLE ItemExperienceLog ALTER COLUMN VisitorId SET STATISTICS 1000
> --ALTER TABLE ItemExperienceLog ALTER COLUMN ItemId SET STATISTICS 1000
> --ANALYZE ItemExperienceLog
>
> SELECT MAX(l.Id) as Id, l.ItemId
> FROM ItemExperienceLog l
> INNER JOIN Items_Primary p ON p.Id = l.ItemId
> INNER JOIN Feeds f ON f.Id = p.FeedId
> INNER JOIN Visitors v ON v.Id = l.VisitorId
> WHERE v.UserId = 'fbe2537f21d94f519605612c0bf7c2c5'
>   AND LOWER(f.Slug) = LOWER('Wealth_Building_by_NightingaleConant')
> GROUP BY l.ItemId
>
> Explain output (set enable_seqscan = on):
>
> HashAggregate  (cost=124392.54..124392.65 rows=9 width=37) (actual time=7765.650..7765.654 rows=4 loops=1)
>   ->  Nested Loop  (cost=2417.68..124392.49 rows=9 width=37) (actual time=1706.703..7765.611 rows=11 loops=1)
>         ->  Nested Loop  (cost=2417.68..123868.75 rows=1807 width=70) (actual time=36.374..7706.677 rows=3174 loops=1)
>               ->  Hash Join  (cost=2417.68..119679.50 rows=1807 width=37) (actual time=36.319..7602.221 rows=3174 loops=1)
>                     Hash Cond: (l.visitorid = v.id)
>                     ->  Seq Scan on itemexperiencelog l  (cost=0.00..107563.09 rows=2581509 width=70) (actual time=0.010..4191.251 rows=2579880 loops=1)
>                     ->  Hash  (cost=2401.43..2401.43 rows=1300 width=33) (actual time=3.673..3.673 rows=897 loops=1)
>                           ->  Bitmap Heap Scan on visitors v  (cost=22.48..2401.43 rows=1300 width=33) (actual time=0.448..2.523 rows=897 loops=1)
>                                 Recheck Cond: (userid = 'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
>                                 ->  Bitmap Index Scan on visitors_userid_index2  (cost=0.00..22.16 rows=1300 width=0) (actual time=0.322..0.322 rows=897 loops=1)
>                                       Index Cond: (userid = 'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
>               ->  Index Scan using items_primary_pkey on items_primary p  (cost=0.00..2.31 rows=1 width=66) (actual time=0.027..0.029 rows=1 loops=3174)
>                     Index Cond: (p.id = l.itemid)
>         ->  Index Scan using feeds_pkey on feeds f  (cost=0.00..0.28 rows=1 width=33) (actual time=0.016..0.016 rows=0 loops=3174)
>               Index Cond: (f.id = p.feedid)
>               Filter: (lower((f.slug)::text) = 'wealth_building_by_nightingaleconant'::text)
> Total runtime: 7765.767 ms
>
> Explain output (set enable_seqscan = off):
>
> HashAggregate  (cost=185274.71..185274.82 rows=9 width=37) (actual time=185.024..185.028 rows=4 loops=1)
>   ->  Nested Loop  (cost=0.00..185274.67 rows=9 width=37) (actual time=0.252..184.989 rows=11 loops=1)
>         ->  Nested Loop  (cost=0.00..184751.21 rows=1806 width=70) (actual time=0.223..134.943 rows=3174 loops=1)
>               ->  Nested Loop  (cost=0.00..180564.28 rows=1806 width=37) (actual time=0.192..60.214 rows=3174 loops=1)
>                     ->  Index Scan using visitors_userid_index2 on visitors v  (cost=0.00..2580.97 rows=1300 width=33) (actual time=0.052..2.342 rows=897 loops=1)
>                           Index Cond: (userid = 'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
>                     ->  Index Scan using itemexperiencelog__index__visitorid on itemexperiencelog l  (cost=0.00..134.04 rows=230 width=70) (actual time=0.013..0.040 rows=4 loops=897)
>                           Index Cond: (l.visitorid = v.id)
>               ->  Index Scan using items_primary_pkey on items_primary p  (cost=0.00..2.31 rows=1 width=66) (actual time=0.019..0.020 rows=1 loops=3174)
>                     Index Cond: (p.id = l.itemid)
>         ->  Index Scan using feeds_pkey on feeds f  (cost=0.00..0.28 rows=1 width=33) (actual time=0.014..0.014 rows=0 loops=3174)
>               Index Cond: (f.id = p.feedid)
>               Filter: (lower((f.slug)::text) = 'wealth_building_by_nightingaleconant'::text)
> Total runtime: 185.117 ms
>
> The relevant portions of postgresql.conf:
>
> max_connections = 100
> shared_buffers = 2GB
> temp_buffers = 32MB
> work_mem = 64MB
Re: [PERFORM] High CPU Utilization
At 02:47 PM 3/24/2009, Joe Uhl wrote: Turns out we may have an opportunity to purchase a new database server with this increased load. Seems that the best route, based on feedback to this thread, is to go whitebox, get quad opterons, and get a very good disk controller. Can anyone recommend a whitebox vendor? I'll 2nd the Aberdeen recommendation. I'll add Pogolinux to that list as well. Is there a current controller anyone on this list has experience with that they could recommend? The 2 best performing RAID controller vendors at this time are AMCC (AKA 3Ware) and Areca. In general, the 8+ port Areca's with their BB cache maxed outperform every other controller available. This will be a bigger purchase so will be doing research and benchmarking but any general pointers to a vendor/controller greatly appreciated. Be =very= careful to thoroughly bench both the AMD and Intel CPU options. It is far from clear which is the better purchase. I'd be very interested to see the results of your research and benchmarks posted here on pgsql-performance. Ron Peacetree
Re: [PERFORM] High CPU Utilization
On Tue, Mar 24, 2009 at 4:58 PM, Ron rjpe...@earthlink.net wrote: At 02:47 PM 3/24/2009, Joe Uhl wrote: Turns out we may have an opportunity to purchase a new database server with this increased load. Seems that the best route, based on feedback to this thread, is to go whitebox, get quad opterons, and get a very good disk controller. Can anyone recommend a whitebox vendor? I'll 2nd the Aberdeen recommendation. I'll add Pogolinux to that list as well. Is there a current controller anyone on this list has experience with that they could recommend? The 2 best performing RAID controller vendors at this time are AMCC (AKA 3Ware) and Areca. In general, the 8+ port Areca's with their BB cache maxed outperform every other controller available. This will be a bigger purchase so will be doing research and benchmarking but any general pointers to a vendor/controller greatly appreciated. Be =very= careful to thoroughly bench both the AMD and Intel CPU options. It is far from clear which is the better purchase. My anecdotal experience has been that the Opterons stay afloat longer as load increases, but I haven't had machines with similar enough hardware to really test that. I'd be very interested to see the results of your research and benchmarks posted here on pgsql-performance. Me too. I'm gonna spend some time this summer benchmarking and tuning the database servers that I pretty much had to burn in and put in production this year due to time pressures.
Re: [PERFORM] High CPU Utilization
On 3/24/09 4:16 PM, Scott Marlowe scott.marl...@gmail.com wrote: On Tue, Mar 24, 2009 at 4:58 PM, Ron rjpe...@earthlink.net wrote: At 02:47 PM 3/24/2009, Joe Uhl wrote: Turns out we may have an opportunity to purchase a new database server with this increased load. Seems that the best route, based on feedback to this thread, is to go whitebox, get quad opterons, and get a very good disk controller. Can anyone recommend a whitebox vendor? I'll 2nd the Aberdeen recommendation. I'll add Pogolinux to that list as well. Is there a current controller anyone on this list has experience with that they could recommend? The 2 best performing RAID controller vendors at this time are AMCC (AKA 3Ware) and Areca. In general, the 8+ port Areca's with their BB cache maxed outperform every other controller available. I personally have had rather bad performance experiences with 3Ware 9550/9650 SATA cards. I have no experience with the AMCC SAS stuff though. Adaptec demolished the 9650 on arrays larger than 4 drives, and Areca will do better at the very high end. However, if CPU is the issue for this particular case, then the RAID controller details are less significant. I don't know how much data you have, but don't forget the option of SSDs, or a mix of hard drives and SSDs for different data. Ideally, you would want the OS to just extend its pagecache onto a SSD, but only OpenSolaris can do that right now and it is rather new (needs to be persistent across reboots). http://blogs.sun.com/brendan/entry/test http://blogs.sun.com/brendan/entry/l2arc_screenshots This will be a bigger purchase so will be doing research and benchmarking but any general pointers to a vendor/controller greatly appreciated. Be =very= careful to thoroughly bench both the AMD and Intel CPU options. It is far from clear which is the better purchase. 
> My anecdotal experience has been that the Opterons stay afloat longer as
> load increases, but I haven't had machines with similar enough hardware
> to really test that.

One may want to note that Intel's next generation servers are due out within 45 days from what I can sense ('Q2' traditionally means ~April 1 for Intel when on time). These should be a rather significant bump for a database as they adopt the AMD / Alpha style memory-controller-on-CPU architecture and add a lot of cache. Other relevant improvements: increased performance on compare-and-swap operations, the return of hyper-threading, and ridiculous memory bandwidth per CPU (3 DDR3 memory channels per CPU).

>> I'd be very interested to see the results of your research and
>> benchmarks posted here on pgsql-performance.
>
> Me too. I'm gonna spend some time this summer benchmarking and tuning the
> database servers that I pretty much had to burn in and put in production
> this year due to time pressures.
[PERFORM] Raid 10 chunksize
I'm trying to pin down some performance issues with a machine where I work; we are seeing (read only) query response times blow out by an order of magnitude or more at busy times. Initially we blamed autovacuum, but after a tweak of the cost_delay it is *not* the problem. Then I looked at checkpoints... and although there was some correlation with them and the query response, I'm thinking that the raid chunksize may well be the issue.

Fortunately there is an identical DR box, so I could do a little testing. Details follow:

Sun 4140 2x quad-core Opteron 2356
16G RAM, 6x 15K 140G SAS
Debian Lenny
Pg 8.3.6

The disk is laid out using software (md) raid:

4 drives raid 10 *4K* chunksize with database files (ext3 ordered, noatime)
2 drives raid 1 with database transaction logs (ext3 ordered, noatime)

The relevant non-default .conf params are:

shared_buffers = 2048MB
work_mem = 4MB
maintenance_work_mem = 1024MB
max_fsm_pages = 153600
bgwriter_lru_maxpages = 200
wal_buffers = 2MB
checkpoint_segments = 32
effective_cache_size = 4096MB
autovacuum_vacuum_scale_factor = 0.1
autovacuum_vacuum_cost_delay = 60    # This is high, but seemed to help...
I've run pgbench:

transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 24
number of transactions per client: 12000
number of transactions actually processed: 288000/288000
tps = 655.335102 (including connections establishing)
tps = 655.423232 (excluding connections establishing)

Looking at iostat while it is running shows (note sda-sdd raid10, sde and sdf raid 1):

Device:  rrqm/s   wrqm/s  r/s   w/s     rMB/s  wMB/s  avgrq-sz  avgqu-sz  await   svctm  %util
sda      0.00     56.80   0.00  579.00  0.00   2.47   8.74      133.76    235.10  1.73   100.00
sdb      0.00     45.60   0.00  583.60  0.00   2.45   8.59       52.65     90.03  1.71   100.00
sdc      0.00     49.00   0.00  579.80  0.00   2.45   8.66       72.56    125.09  1.72   100.00
sdd      0.00     58.40   0.00  565.00  0.00   2.42   8.79      135.31    235.52  1.77   100.00
sde      0.00      0.00   0.00    0.00  0.00   0.00   0.00        0.00      0.00  0.00     0.00
sdf      0.00      0.00   0.00    0.00  0.00   0.00   0.00        0.00      0.00  0.00     0.00

Device:  rrqm/s   wrqm/s  r/s   w/s     rMB/s  wMB/s  avgrq-sz  avgqu-sz  await   svctm  %util
sda      0.00     12.80   0.00   23.40  0.00   0.15   12.85       3.04    103.38  4.27    10.00
sdb      0.00     12.80   0.00   22.80  0.00   0.14   12.77       2.31     73.51  3.58     8.16
sdc      0.00     12.80   0.00   21.40  0.00   0.13   12.86       2.38     79.21  3.63     7.76
sdd      0.00     12.80   0.00   21.80  0.00   0.14   12.70       2.66     90.02  3.93     8.56
sde      0.00   2546.80   0.00  146.80  0.00  10.53  146.94       0.97      6.38  5.34    78.40
sdf      0.00   2546.80   0.00  146.60  0.00  10.53  147.05       0.97      6.38  5.53    81.04

Device:  rrqm/s   wrqm/s  r/s   w/s     rMB/s  wMB/s  avgrq-sz  avgqu-sz  await   svctm  %util
sda      0.00    231.40   0.00  566.80  0.00   3.16   11.41     124.92    228.26  1.76    99.52
sdb      0.00    223.00   0.00  558.00  0.00   3.06   11.23      46.64     83.55  1.70    94.88
sdc      0.00    230.60   0.00  551.60  0.00   3.07   11.40      94.38    171.54  1.76    96.96
sdd      0.00    231.40   0.00  528.60  0.00   2.94   11.37     122.55    220.81  1.83    96.48
sde      0.00   1495.80   0.00   99.00  0.00   6.23  128.86       0.81      8.15  7.76    76.80
sdf      0.00   1495.80   0.00   99.20  0.00   6.26  129.24       0.73      7.40  7.10    70.48

Top looks like:

Cpu(s): 2.5%us, 1.9%sy, 0.0%ni, 71.9%id, 23.4%wa, 0.2%hi, 0.2%si, 0.0%st
Mem: 16474084k total, 15750384k used, 723700k free, 1654320k buffers
Swap: 2104440k total, 944k used, 2103496k free, 13552720k cached

It looks to me like 
we are maxing out the raid 10 array, and I suspect the chunksize (4K) is the culprit. However as this is a pest to change (!) I'd like some opinions on whether I'm jumping to conclusions. I'd also appreciate comments about what chunksize to use (I've tended to use 256K in the past, but what are folks preferring these days?) regards Mark
Re: [PERFORM] Raid 10 chunksize
On 3/24/09 6:09 PM, Mark Kirkwood mar...@paradise.net.nz wrote: I'm trying to pin down some performance issues with a machine where I work, we are seeing (read only) query response times blow out by an order of magnitude or more at busy times. Initially we blamed autovacuum, but after a tweak of the cost_delay it is *not* the problem. Then I looked at checkpoints... and altho there was some correlation with them and the query response - I'm thinking that the raid chunksize may well be the issue. Fortunately there is an identical DR box, so I could do a little testing. Details follow: Sun 4140 2x quad-core opteron 2356 16G RAM, 6x 15K 140G SAS Debian Lenny Pg 8.3.6 The disk is laid out using software (md) raid: 4 drives raid 10 *4K* chunksize with database files (ext3 ordered, noatime) 2 drives raid 1 with database transaction logs (ext3 ordered, noatime) Top looks like: Cpu(s): 2.5%us, 1.9%sy, 0.0%ni, 71.9%id, 23.4%wa, 0.2%hi, 0.2%si, 0.0%st Mem: 16474084k total, 15750384k used, 723700k free, 1654320k buffers Swap: 2104440k total, 944k used, 2103496k free, 13552720k cached It looks to me like we are maxing out the raid 10 array, and I suspect the chunksize (4K) is the culprit. However as this is a pest to change (!) I'd like some opinions on whether I'm jumping to conclusions. I'd also appreciate comments about what chunksize to use (I've tended to use 256K in the past, but what are folks preferring these days?) regards Mark md tends to work great at 1MB chunk sizes with RAID 1 or 10 for whatever reason. Unlike a hardware raid card, smaller chunks aren't going to help random i/o as it won't read the whole 1MB or bother caching much. Make sure any partitions built on top of md are 1MB aligned if you go that route. Random I/O on files smaller than 1MB would be affected -- but that's not a problem on a 16GB RAM server running a database that won't fit in RAM. Your xlogs are occasionally close to max usage too -- which is suspicious at 10MB/sec. 
There is no reason for them to be on ext3, since they are a transaction log that syncs its writes, so file system journaling doesn't buy anything. Ext2 there will lower the sync times and reduce i/o utilization. I also tend to use xfs if sequential access is important at all (obviously not so in pgbench). ext3 is slightly safer in a power failure with unsynced data, but Postgres has that covered with its own journal anyway, so those differences are irrelevant.
Re: [PERFORM] Raid 10 chunksize
On Tue, Mar 24, 2009 at 6:48 PM, Scott Carey <sc...@richrelevance.com> wrote:
> Your xlogs are occasionally close to max usage too -- which is suspicious
> at 10MB/sec. There is no reason for them to be on ext3 since they are a
> transaction log that syncs writes so file system journaling doesn't mean
> anything. Ext2 there will lower the sync times and reduce i/o
> utilization.

I would tend to recommend ext3 in data=writeback, and make sure that it's mounted with noatime, over using ext2 -- for the sole reason that if the system shuts down unexpectedly, you don't have to worry about a long fsck when bringing it back up. The performance difference between the two filesystems should be negligible for Postgres logging.

-Dave
Re: [PERFORM] Raid 10 chunksize
On Tue, Mar 24, 2009 at 7:09 PM, Mark Kirkwood mar...@paradise.net.nz wrote: I'm trying to pin down some performance issues with a machine where I work, we are seeing (read only) query response times blow out by an order of magnitude or more at busy times. Initially we blamed autovacuum, but after a tweak of the cost_delay it is *not* the problem. Then I looked at checkpoints... and altho there was some correlation with them and the query response - I'm thinking that the raid chunksize may well be the issue. Sounds to me like you're mostly just running out of bandwidth on your RAID array. Whether or not you can tune it to run faster is the real issue. This problem becomes worse as you add clients and the RAID array starts to thrash. Thrashing is likely to be worse with a small chunk size, so that's definitely worth a look at fixing. Fortunately there is an identical DR box, so I could do a little testing. Can you try changing the chunksize on the test box you're testing on to see if that helps?
Re: [PERFORM] Help Me Understand Why I'm Getting a Bad Query Plan
Bryan,

> I hate to nag, but could anybody help me with this? We have a few related
> queries that are causing noticeable service delays in our production
> system. I've tried a number of different things, but I'm running out of
> ideas and don't know what to do next.

For some reason, your first post didn't make it to the list, which is why nobody responded.

> I've got a query on our production system that isn't choosing a good
> plan. I can't see why it's choosing to do a sequential scan on the
> ItemExperienceLog table. That table is about 800mb and has about 2.5
> million records. This example query only returns 4 records. I've tried
> upping the statistics for ItemExperienceLog.VisitorId and
> ItemExperienceLog.ItemId to 1000 (from our default of 100) with no
> success.

Yes, that is kind of inexplicable. For some reason, it's assigning a very high cost to the nestloops, which is why it wants to avoid them with a seq scan. Can you try lowering cpu_index_cost to 0.001 and see how that affects the plan?

--Josh
Re: [PERFORM] Help Me Understand Why I'm Getting a Bad Query Plan
There is one thing I don't understand:

->  Nested Loop  (cost=0.00..180564.28 rows=1806 width=37) (actual time=0.192..60.214 rows=3174 loops=1)
      ->  Index Scan using visitors_userid_index2 on visitors v  (cost=0.00..2580.97 rows=1300 width=33) (actual time=0.052..2.342 rows=897 loops=1)
            Index Cond: (userid = 'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
      ->  Index Scan using itemexperiencelog__index__visitorid on itemexperiencelog l  (cost=0.00..134.04 rows=230 width=70) (actual time=0.013..0.040 rows=4 loops=897)
            Index Cond: (l.visitorid = v.id)

If it expects 1300 visitors with the userid, and for each of them to have 230 entries in itemexperiencelog, how can it come up with 1806 returned rows (and be about right!)?

Greetings
Marcin
Re: [PERFORM] Raid 10 chunksize
On Tue, 24 Mar 2009, David Rees wrote:
> I would tend to recommend ext3 in data=writeback and make sure that it's
> mounted with noatime over using ext2 - for the sole reason that if the
> system shuts down unexpectedly, you don't have to worry about a long fsck
> when bringing it back up.

Well, Mark's system is already using noatime, and if you believe http://www.commandprompt.com/blogs/joshua_drake/2008/04/is_that_performance_i_smell_ext2_vs_ext3_on_50_spindles_testing_for_postgresql/ there's little difference between writeback and ordered on the WAL disk. Might squeeze out some improvements with ext2 though, and if there's nothing besides the WAL on there fsck isn't ever going to take very long anyway--not much of a directory tree to traverse there.

--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
Re: [PERFORM] Help Me Understand Why I'm Getting a Bad Query Plan
On Tue, Mar 24, 2009 at 10:04 PM, marcin mank <marcin.m...@gmail.com> wrote:
> There is one thing I don`t understand:
>
> ->  Nested Loop  (cost=0.00..180564.28 rows=1806 width=37) (actual time=0.192..60.214 rows=3174 loops=1)
>       ->  Index Scan using visitors_userid_index2 on visitors v  (cost=0.00..2580.97 rows=1300 width=33) (actual time=0.052..2.342 rows=897 loops=1)
>             Index Cond: (userid = 'fbe2537f21d94f519605612c0bf7c2c5'::bpchar)
>       ->  Index Scan using itemexperiencelog__index__visitorid on itemexperiencelog l  (cost=0.00..134.04 rows=230 width=70) (actual time=0.013..0.040 rows=4 loops=897)
>             Index Cond: (l.visitorid = v.id)
>
> If it expects 1300 visitors with the userid, and for each of them to have
> 230 entries in itemexperiencelog, how can it come up with 1806 returned
> rows (and be about right!)?

I'm not sure I follow what you're saying. One thing to keep in mind -- due to a lapse in our judgement at the time, this itemexperiencelog table serves as both a current state table and a log table. Therefore, it potentially has multiple redundant entries, but we typically only look at the most recent entry to figure out the state of the current item.

We're in the process of re-factoring this now, as well as denormalizing some of the tables to eliminate unnecessary joins, but I keep running into these problems and need to understand what is going on so that I know we're fixing the correct things.

Thanks,
Bryan
Re: [PERFORM] Help Me Understand Why I'm Getting a Bad Query Plan
On Tue, Mar 24, 2009 at 9:30 PM, Josh Berkus <j...@agliodbs.com> wrote:
> For some reason, your first post didn't make it to the list, which is why
> nobody responded.

Weird... I've been having problems with gmail and google reader all week.

> Yes, that is kind of inexplicable. For some reason, it's assigning a very
> high cost to the nestloops, which is why it wants to avoid them with a
> seq scan. Can you try lowering cpu_index_cost to 0.001 and see how that
> affects the plan?

I'm assuming you meant cpu_index_tuple_cost. I changed that to 0.001 as you suggested, forced postgres to reload its configuration, and I'm still getting the same execution plan. Looking through our configuration one more time, I see that at some point I set random_page_cost to 2.0, but I don't see any other changes to query planner settings from their default values.

Bryan
Re: [PERFORM] Raid 10 chunksize
Scott Marlowe wrote:
> Sounds to me like you're mostly just running out of bandwidth on your RAID
> array. Whether or not you can tune it to run faster is the real issue.
> This problem becomes worse as you add clients and the RAID array starts
> to thrash. Thrashing is likely to be worse with a small chunk size, so
> that's definitely worth a look at fixing.

Yeah, I was wondering if we are maxing out the bandwidth...

> Can you try changing the chunksize on the test box you're testing on to
> see if that helps?

Yes - or I am hoping to anyway (part of posting here was to collect some outside validation for the idea). Thanks for your input!

Cheers

Mark