Re: [PERFORM] Hash Anti Join performance degradation
On 24/05/11 12:14, panam wrote:
> Hi,
>
> In my DB, there is a large table containing messages and one that contains
> message boxes. Messages are assigned to boxes via a child-parent link m->b.
> In order to obtain the last message for a specific box, I use the following
> SQL:
>
> SELECT m1.id FROM message m1 LEFT JOIN message m2 ON (m1.box_id = m2.box_id
> AND m1.id < m2.id) WHERE m2.id IS NULL AND m1.box_id = id;
>
> This worked quite well for a long time. But now, suddenly, the performance
> of this query drastically degraded as new messages were added. If these new
> messages are removed again, everything is back to normal. If other messages
> are removed instead, the problem remains, so it does not seem to be a
> memory issue. I fear I have difficulties understanding what is going wrong
> here.

The usual cause is that the statistics for estimated row counts cross a
threshold that makes the query planner think that a different kind of plan
will be faster. If the query planner is using bad information about the
performance of the storage, then it will be making bad decisions about which
approach is faster. So the usual thing to do is to adjust seq_page_cost and
random_page_cost to more closely reflect the real performance of your
hardware, and to make sure that effective_cache_size matches the real amount
of memory your computer has free for disk cache use. Newer versions of
PostgreSQL always include query planning and statistics improvements too.

BTW, it can be really helpful to paste your query plans into
http://explain.depesz.com/ , which will provide an easier-to-read visual
analysis of the plan. This will only work with query plans that haven't been
butchered by mail client word wrapping, so I can't do it for you, but if you
paste them there and post the links, that'd be really handy. Also have a
look at http://wiki.postgresql.org/wiki/SlowQueryQuestions

I found the plans you posted a bit hard to read. Not your fault; it's stupid
mail clients.
Maybe depesz.com needs to be taught to de-munge the damage done to plans by
common mail clients.

--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
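For reference, the settings Craig mentions live in postgresql.conf. A hedged starting point might look like the following — the values shown are illustrative assumptions for a box whose working set mostly fits in cache, not recommendations; measure your own hardware first:

```
# postgresql.conf -- illustrative values only, tune to your hardware
seq_page_cost = 1.0          # baseline cost of a sequential page fetch
random_page_cost = 2.0       # below the 4.0 default when random I/O is mostly cached
effective_cache_size = 6GB   # roughly shared_buffers plus the OS disk cache
```

Changing the cost parameters does not make queries faster by itself; it only changes which plans the planner believes are cheapest.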
Re: [PERFORM] Performance degradation of inserts when database size grows
As near as I can tell from your test configuration description, you have
JMeter --> J2EE --> Postgres. Have you ruled out the J2EE server as the
problem? This problem may not be the database. I would take a look at your
app server's health and look for any potential issues there before spending
too much time on the database. Perhaps there are memory issues or excessive
garbage collection on the app server?

Terry

On Tue, May 17, 2011 at 5:45 AM, Andrey Vorobiev
<andrey.o.vorob...@gmail.com> wrote:
> Hi, guys.
>
> I have the following environment configuration:
>
> - Postgres 8.4.7 with the following postgresql.conf settings modified:
>
>     listen_addresses = '*'
>     max_connections = 100
>     shared_buffers = 2048MB
>     max_prepared_transactions = 100
>     wal_buffers = 1024kB
>     checkpoint_segments = 64
>     checkpoint_completion_target = 0.8
>     log_checkpoints = on
>
> - Two databases. Let's call them db_1 and db_2.
> - A J2EE application server that performs inserts into the databases
>   defined above (distributed transactions are used).
> - All constraints and indexes are on.
> - JMeter, which acts as an HTTP client and sends requests to the server,
>   causing it to insert rows (the case of new user registration).
>
> After running the scenario described above (with 10 concurrent threads) I
> have observed the following behavior:
>
> At first everything is fine and the J2EE server handles about 700
> requests/sec (about 2500 inserts into several tables per second). But
> after some amount of time I observe performance degradation.
> In general it looks like the following:
>
>   Total number of requests passed; Requests per second;
>   382000; 768;
>   546000; 765;
>   58; 723;
>   65; 700;
>   671000; 656;
>   70; 628;
>
> Checkpoint logging gives me the following:
>
> 2011-05-17 18:55:51 NOVST LOG: checkpoint starting: xlog
> 2011-05-17 18:57:20 NOVST LOG: checkpoint complete: wrote 62861 buffers
>   (24.0%); 0 transaction log file(s) added, 0 removed, 0 recycled;
>   write=89.196 s, sync=0.029 s, total=89.242 s
> 2011-05-17 18:57:47 NOVST LOG: checkpoint starting: xlog
> 2011-05-17 18:59:02 NOVST LOG: checkpoint complete: wrote 83747 buffers
>   (31.9%); 0 transaction log file(s) added, 0 removed, 64 recycled;
>   write=75.001 s, sync=0.043 s, total=75.061 s
> 2011-05-17 18:59:29 NOVST LOG: checkpoint starting: xlog
> 2011-05-17 19:00:30 NOVST LOG: checkpoint complete: wrote 97341 buffers
>   (37.1%); 0 transaction log file(s) added, 0 removed, 64 recycled;
>   write=60.413 s, sync=0.050 s, total=60.479 s
> 2011-05-17 19:00:55 NOVST LOG: checkpoint starting: xlog
> 2011-05-17 19:01:48 NOVST LOG: checkpoint complete: wrote 110149 buffers
>   (42.0%); 0 transaction log file(s) added, 0 removed, 64 recycled;
>   write=52.285 s, sync=0.072 s, total=52.379 s
> 2011-05-17 19:02:11 NOVST LOG: checkpoint starting: xlog
> 2011-05-17 19:02:58 NOVST LOG: checkpoint complete: wrote 120003 buffers
>   (45.8%); 0 transaction log file(s) added, 0 removed, 64 recycled;
>   write=46.766 s, sync=0.082 s, total=46.864 s
> 2011-05-17 19:03:20 NOVST LOG: checkpoint starting: xlog
> 2011-05-17 19:04:18 NOVST LOG: checkpoint complete: wrote 122296 buffers
>   (46.7%); 0 transaction log file(s) added, 0 removed, 64 recycled;
>   write=57.795 s, sync=0.054 s, total=57.867 s
> 2011-05-17 19:04:38 NOVST LOG: checkpoint starting: xlog
> 2011-05-17 19:05:34 NOVST LOG: checkpoint complete: wrote 128165 buffers
>   (48.9%); 0 transaction log file(s) added, 0 removed, 64 recycled;
>   write=55.061 s, sync=0.087 s, total=55.188 s
> 2011-05-17 19:05:53 NOVST LOG: checkpoint starting: xlog
> 2011-05-17 19:06:51 NOVST LOG: checkpoint complete: wrote 138508 buffers
>   (52.8%); 0 transaction log file(s) added, 0 removed, 64 recycled;
>   write=57.919 s, sync=0.106 s, total=58.068 s
> 2011-05-17 19:07:08 NOVST LOG: checkpoint starting: xlog
> 2011-05-17 19:08:21 NOVST LOG: checkpoint complete: wrote 132485 buffers
>   (50.5%); 0 transaction log file(s) added, 0 removed, 64 recycled;
>   write=72.949 s, sync=0.081 s, total=73.047 s
> 2011-05-17 19:08:40 NOVST LOG: checkpoint starting: xlog
> 2011-05-17 19:09:48 NOVST LOG: checkpoint complete: wrote 139542 buffers
>   (53.2%); 0 transaction log file(s) added, 0 removed, 64 recycled;
>   write=68.193 s, sync=0.107 s, total=68.319 s
> 2011-05-17 19:10:06 NOVST LOG: checkpoint starting: xlog
> 2011-05-17 19:11:31 NOVST LOG: checkpoint complete: wrote 137657 buffers
>   (52.5%); 0 transaction log file(s) added, 0 removed, 64 recycled;
>   write=84.575 s, sync=0.047 s, total=84.640 s
>
> I also observed heavier IO from the iostat utility.
>
> So my questions are:
>
> 1. How does database size affect insert performance?
> 2. Why does the number of written buffers increase when database size
>    grows?
> 3. How can I further analyze this problem?
>
> --
> Best regards.
[PERFORM] Hash Anti Join performance degradation
Hi,

In my DB, there is a large table containing messages and one that contains
message boxes. Messages are assigned to boxes via a child-parent link m->b.
In order to obtain the last message for a specific box, I use the following
SQL:

SELECT m1.id FROM message m1 LEFT JOIN message m2 ON (m1.box_id = m2.box_id
AND m1.id < m2.id) WHERE m2.id IS NULL AND m1.box_id = id;

This worked quite well for a long time. But now, suddenly, the performance
of this query drastically degraded as new messages were added. If these new
messages are removed again, everything is back to normal. If other messages
are removed instead, the problem remains, so it does not seem to be a
memory issue. I fear I have difficulties understanding what is going wrong
here.

This is the query plan when everything is fine:

Seq Scan on public.box this_  (cost=0.00..10467236.32 rows=128 width=696) (actual time=0.169..7683.978 rows=128 loops=1)
  Output: this_.id, this_.login, (SubPlan 1)
  Buffers: shared hit=188413 read=94635 written=135, temp read=22530 written=22374
  SubPlan 1
    ->  Hash Anti Join  (cost=41323.25..81775.25 rows=20427 width=8) (actual time=59.571..59.877 rows=1 loops=128)
          Output: m1.id
          Hash Cond: (m1.box_id = m2.box_id)
          Join Filter: (m1.id < m2.id)
          Buffers: shared hit=188412 read=94633 written=135, temp read=22530 written=22374
          ->  Bitmap Heap Scan on public.message m1  (cost=577.97..40212.28 rows=30640 width=16) (actual time=3.152..9.514 rows=17982 loops=128)
                Output: m1.id, m1.box_id
                Recheck Cond: (m1.box_id = $0)
                Buffers: shared hit=131993 read=9550 written=23
                ->  Bitmap Index Scan on "message_box_Idx"  (cost=0.00..570.31 rows=30640 width=0) (actual time=2.840..2.840 rows=18193 loops=128)
                      Index Cond: (m1.box_id = $0)
                      Buffers: shared hit=314 read=6433 written=23
          ->  Hash  (cost=40212.28..40212.28 rows=30640 width=16) (actual time=26.840..26.840 rows=20014 loops=115)
                Output: m2.box_id, m2.id
                Buckets: 4096  Batches: 4 (originally 2)  Memory Usage: 5444kB
                Buffers: shared hit=56419 read=85083 written=112, temp written=7767
                ->  Bitmap Heap Scan on public.message m2  (cost=577.97..40212.28 rows=30640 width=16) (actual time=2.419..20.007 rows=20014 loops=115)
                      Output: m2.box_id, m2.id
                      Recheck Cond: (m2.box_id = $0)
                      Buffers: shared hit=56419 read=85083 written=112
                      ->  Bitmap Index Scan on "message_box_Idx"  (cost=0.00..570.31 rows=30640 width=0) (actual time=2.166..2.166 rows=20249 loops=115)
                            Index Cond: (m2.box_id = $0)
                            Buffers: shared hit=6708
Total runtime: 7685.202 ms

This is the plan when the query gets sluggish:

Seq Scan on public.box this_  (cost=0.00..10467236.32 rows=128 width=696) (actual time=0.262..179333.086 rows=128 loops=1)
  Output: this_.id, this_.login, (SubPlan 1)
  Buffers: shared hit=189065 read=93983 written=10, temp read=22668 written=22512
  SubPlan 1
    ->  Hash Anti Join  (cost=41323.25..81775.25 rows=20427 width=8) (actual time=1264.700..1400.886 rows=1 loops=128)
          Output: m1.id
          Hash Cond: (m1.box_id = m2.box_id)
          Join Filter: (m1.id < m2.id)
          Buffers: shared hit=189064 read=93981 written=10, temp read=22668 written=22512
          ->  Bitmap Heap Scan on public.message m1  (cost=577.97..40212.28 rows=30640 width=16) (actual time=3.109..9.850 rows=18060 loops=128)
                Output: m1.id, m1.box_id
                Recheck Cond: (m1.box_id = $0)
                Buffers: shared hit=132095 read=9448
                ->  Bitmap Index Scan on "message_box_Idx"  (cost=0.00..570.31 rows=30640 width=0) (actual time=2.867..2.867 rows=18193 loops=128)
                      Index Cond: (m1.box_id = $0)
                      Buffers: shared hit=312 read=6435
          ->  Hash  (cost=40212.28..40212.28 rows=30640 width=16) (actual time=27.533..27.533 rows=20102 loops=115)
                Output: m2.box_id, m2.id
                Buckets: 4096  Batches: 4 (originally 2)  Memory Usage: 5522kB
                Buffers: shared hit=56969 read=84533 written=10, temp written=7811
                ->  Bitmap Heap Scan on public.message m2  (cost=577.97..40212.28 rows=30640 width=16) (actual time=2.406..20.492 rows=20102 loops=115)
                      Output: m2.box_id, m2.id
                      Recheck Cond: (m2.box_id = $0)
                      Buffers: shared hit=56969 read=84533 written=10
                      ->  Bitmap Index Scan on "message_box_Idx"  (cost=0.00..570.31 rows=30640 width=0) (actual time=2.170..2.170 rows=20249 loops=115)
                            Index Cond: (m2.box_id = $0)
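An aside on the query itself: the self-anti-join above compares every message in a box against every other message in that box, which is quadratic in box size. A common PostgreSQL-specific rewrite picks the newest row per box directly, under the same assumption the original query makes (that a higher id means a newer message). A hedged sketch against the schema described in the post:

```sql
-- Newest message per box, assuming id increases with recency.
-- An index on message (box_id, id DESC) helps this plan.
SELECT DISTINCT ON (box_id) id
FROM message
ORDER BY box_id, id DESC;

-- Or, for one specific box only:
SELECT id
FROM message
WHERE box_id = $1
ORDER BY id DESC
LIMIT 1;
```

Either form lets the planner walk an index backwards instead of hashing the whole box twice, so its cost grows with the number of boxes rather than with the square of the messages per box.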
Re: [PERFORM] reducing random_page_cost from 4 to 2 to force index scan
On Mon, May 23, 2011 at 3:08 PM, Josh Berkus wrote:
>> Well, all of that stuff sounds impractically expensive to me... but I
>> just work here.
>
> I'll point out that the simple version, which just checks for hot tables
> and indexes, would improve estimates greatly and be a LOT less
> complicated than these proposals.

I realize I'm sounding like a broken record here, but as far as I can tell
there is absolutely zero evidence that that would be better. I'm sure
you're in good company thinking so, but the list of things that could skew
(or should I say, screw) the estimates is long and painful; and if those
estimates are wrong, you'll end up with something that is both worse and
less predictable than the status quo.

First, I haven't seen a shred of hard evidence that the contents of the
buffer cache or OS cache are stable enough to be relied upon, and we've
repeatedly discussed workloads where that might not be true. Has anyone
done a systematic study of this on a variety of real production systems?
If so, the results haven't been posted here, at least not that I can
recall.

Second, even if we were willing to accept that we could obtain relatively
stable and accurate measurements of this data, who is to say that basing
plans on it would actually result in an improvement in plan quality? That
may seem obvious, but I don't think it is. The proposed method is a bit
like trying to determine the altitude of a hot air balloon by throwing the
ballast over the side and timing how long it takes to hit the ground.
Executing plans that are based on the contents of the cache will change the
contents of the cache, which will in turn change the plans. The idea that
we can know, without any experimentation, how that's going to shake out
seems to me to be an exercise in unjustified optimism of the first order.

Sorry to sound grumpy and pessimistic, but I really think we're letting our
enthusiasm get way, way ahead of the evidence.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [PERFORM] postgres not use index, IN statement
On 24/05/2011 5:30 AM, Anibal David Acosta wrote:
> So, is this some bug?

Hard to know with the information provided. Please post EXPLAIN ANALYZE
output. See http://wiki.postgresql.org/wiki/SlowQueryQuestions

--
Craig Ringer
Tech-related writing at http://soapyfrogs.blogspot.com/
Re: [PERFORM] "error with invalid page header" while vacuuming pgbench data
John Rouillard wrote:
> I seem to be able to provoke this error:
>
>   vacuum...ERROR: invalid page header in
>   block 2128910 of relation base/16385/21476

What version of PostgreSQL?

-Kevin
[PERFORM] "error with invalid page header" while vacuuming pgbench data
Hi all:

Not sure if this is a performance question or a generic admin question.

I have the following script running on a host different from the database,
using pgbench to test the database:

  pgbench -i (initial mode)
  psql vacuum analyze; (and some other code to dump table sizes)
  pgbench (multiple connections, jobs etc.)

with a loop for setting different scales.

I seem to be able to provoke this error:

  vacuum...ERROR: invalid page header in block 2128910 of relation
  base/16385/21476

on a pgbench database created with a scale factor of 1000, relatively
reliably (2 for 2). I am not seeing any disk errors from the raid
controller or the operating system. Running pg_dumpall to check for errors
reports:

  pg_dump: Error message from server: ERROR: invalid page header in block
  401585 of relation base/16385/21476

which is a different block from the originally reported one. Does anybody
have any suggestions?

Configuration details:

  OS: CentOS 5.5
  Filesystem: data - ext4 (note 4, not 3); 6.6T formatted
              wal  - ext4; 1.5T formatted
  Raid: data - level 10, 8 disk wd2003; controller LSI MegaRAID SAS 9260-4i
        wal  - level 1, 2 disk wd2003; controller LSI MegaRAID SAS 9260-4i

Could it be an ext4 issue? It seems that ext4 may still be at the bleeding
edge for postgres use.

Thanks for any thoughts, even if it's "go to the admin list".

--
-- rouilj
John Rouillard
System Administrator
Renesys Corporation
603-244-9084 (cell)
603-643-9300 x 111
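One small diagnostic step worth taking before anything else: the error message names the relation only by its file path. The file node can be mapped back to a relation name from inside the affected database, which tells you whether the damage is in a table, an index (rebuildable), or TOAST data. A hedged sketch (the numbers come from the error message above; run it while connected to the pgbench database):

```sql
-- base/16385/21476: 16385 is the database OID, 21476 the relfilenode.
-- Identify which relation the damaged file belongs to:
SELECT relname, relkind
FROM pg_class
WHERE relfilenode = 21476;
```

If it turns out to be an index, a REINDEX may clear the symptom; if it is a heap file, the page really is damaged on disk and the filesystem/controller angle deserves the scrutiny suggested above.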
[PERFORM] postgres not use index, IN statement
I have a strange situation. I have a table of detail with millions of rows
and a table of items with thousands of rows.

When I do:

  select count(*) from wiz_application_response where application_item_id
  in (select id from wiz_application_item where application_id=110)

this query does NOT use the index on column application_item_id; instead it
is doing a sequential scan. BUT when I add the range of min and max id of
the subquery, Postgres uses the INDEX. This is the second query:

  select count(*) from wiz_application_response where application_item_id
  between 908 and 1030 and application_item_id in (select id from
  wiz_application_item where application_id=110)

908 and 1030 are the limits (lower and upper) of the subquery; the subquery
returns about 100 elements.

So, is this some bug?

Thanks!

Anibal
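Whatever the planner's reason for the seq scan, the `IN (subquery)` form can be expressed in equivalent ways that the planner sometimes treats differently, which is a cheap experiment before concluding it is a bug. A hedged sketch using the table and column names from the post (`r`/`i` are just aliases introduced here):

```sql
-- Semi-join spelled as EXISTS:
SELECT count(*)
FROM wiz_application_response r
WHERE EXISTS (
    SELECT 1
    FROM wiz_application_item i
    WHERE i.id = r.application_item_id
      AND i.application_id = 110);

-- Or as an explicit join (equivalent here if item ids are unique):
SELECT count(*)
FROM wiz_application_response r
JOIN wiz_application_item i ON i.id = r.application_item_id
WHERE i.application_id = 110;
```

Comparing EXPLAIN ANALYZE output for all three forms, together with the row estimates for the subquery, usually shows whether the seq scan is a mis-estimate or a genuinely cheaper plan.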
Re: [PERFORM] Pushing LIMIT into sub-queries of a UNION ALL?
On 5/23/11 8:54 AM, Dave Johansen wrote:
> I apologize for the multiple posts. I sent this email right after joining
> the list and after it hadn't shown up a day later I figured that it had
> been lost or something and sent the other one.

List moderation took a holiday while all of us were at pgCon.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [PERFORM] Pushing LIMIT into sub-queries of a UNION ALL?
On Mon, May 23, 2011 at 5:54 PM, Dave Johansen wrote:
> I apologize for the multiple posts. I sent this email right after joining
> the list and after it hadn't shown up a day later I figured that it had
> been lost or something and sent the other one.

Sorry for the nitpicking, but I even see _three_ instances of this posting
(first on May 18th).

> Also, the database I posted this about does not have internet access and
> so I'm working on getting it moved over to a machine that does, or getting
> the info onto a machine where I can post the pertinent information about
> the schema and explain outputs.

Great!

Cheers

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
Re: [PERFORM] Performance degradation of inserts when database size grows
Dne 23.5.2011 21:05, Robert Haas napsal(a):
> On Mon, May 23, 2011 at 2:46 PM, Tomas Vondra wrote:
>> Really? He already has 64 checkpoint segments, which is about 1GB of
>> xlog data. The real problem is that the amount of buffers to write is
>> constantly growing. At the beginning there's 62861 buffers (500MB) and
>> at the end there's 137657 buffers (1GB).
>>
>> IMHO increasing the number of checkpoint segments would make this
>> disruption even worse.
>
> Maybe - but it would also make the checkpoints less frequent, which
> might be a good thing.
>
>> In 9.1 there's a feature that spreads checkpoint writes, but with 8.4
>> that's not possible.
>
> What feature are you referring to here? Checkpoint spreading was
> added in 8.3, IIRC.

You're absolutely right, I was talking about checkpoint_completion_target
and it was added in 8.3. Your memory is obviously better than mine.

Tomas
Re: [PERFORM] Performance degradation of inserts when database size grows
On Tue, May 17, 2011 at 8:45 AM, Andrey Vorobiev wrote:
> 1. How does database size affect insert performance?

Well, if your database gets bigger, then your indexes will become deeper,
requiring more time to update. But I'm not sure that's your problem here.

> 2. Why does number of written buffers increase when database size grows?

It normally doesn't.

> 3. How can I further analyze this problem?

Are you actually inserting more user data into these tables, so that they
have more and more rows as time goes by, or are the data files getting
larger out of proportion to the amount of useful data in them?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
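Robert's last question — whether the files are growing out of proportion to the useful data — can be answered by tracking physical size and dead-tuple counts over the course of the test. A hedged sketch using standard catalog functions available in 8.4 (`your_table` is a placeholder for the tables being inserted into):

```sql
-- Physical size of one table's heap and of the table plus its indexes:
SELECT pg_size_pretty(pg_relation_size('your_table'))       AS heap_size,
       pg_size_pretty(pg_total_relation_size('your_table')) AS total_size;

-- Live vs. dead tuple statistics across user tables:
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
```

If total size climbs while n_live_tup stays flat (or n_dead_tup climbs), the workload is bloating the files rather than adding data, which points at vacuum or long-running transactions rather than checkpoint tuning.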
Re: [PERFORM] reducing random_page_cost from 4 to 2 to force index scan
> Well, all of that stuff sounds impractically expensive to me... but I
> just work here.

I'll point out that the simple version, which just checks for hot tables
and indexes, would improve estimates greatly and be a LOT less complicated
than these proposals. Certainly having some form of block-based or
range-based stats would be better, but it also sounds hard enough to maybe
never get done.

Having user-accessible "hot" stats would also be useful to DBAs.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [PERFORM] Performance degradation of inserts when database size grows
On Mon, May 23, 2011 at 2:46 PM, Tomas Vondra wrote:
> Really? He already has 64 checkpoint segments, which is about 1GB of
> xlog data. The real problem is that the amount of buffers to write is
> constantly growing. At the beginning there's 62861 buffers (500MB) and
> at the end there's 137657 buffers (1GB).
>
> IMHO increasing the number of checkpoint segments would make this
> disruption even worse.

Maybe - but it would also make the checkpoints less frequent, which might
be a good thing.

> In 9.1 there's a feature that spreads checkpoint writes, but with 8.4
> that's not possible.

What feature are you referring to here? Checkpoint spreading was added in
8.3, IIRC.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [PERFORM] Performance degradation of inserts when database size grows
Dne 23.5.2011 15:30, Shaun Thomas napsal(a):
> On 05/17/2011 07:45 AM, Andrey Vorobiev wrote:
>
>> 2011-05-17 18:55:51 NOVST LOG: checkpoint starting: xlog
>> 2011-05-17 18:57:20 NOVST LOG: checkpoint complete: wrote 62861 buffers
>> (24.0%); 0 transaction log file(s) added, 0 removed, 0 recycled;
>> write=89.196 s, sync=0.029 s, total=89.242 s
>
> Increase your checkpoint_segments. If you see "checkpoint starting:
> xlog" instead of "checkpoint starting: time", you don't have enough
> checkpoint segments to handle your writes. Checkpoints *will* degrade
> your throughput.

Really? He already has 64 checkpoint segments, which is about 1GB of xlog
data. The real problem is that the amount of buffers to write is constantly
growing. At the beginning there's 62861 buffers (500MB) and at the end
there's 137657 buffers (1GB).

IMHO increasing the number of checkpoint segments would make this
disruption even worse.

What I don't understand is that the checkpoint time does not increase with
the amount of data to write:

   62861 buffers, total=89.242 s  ( 5 MB/s)
   83747 buffers, total=75.061 s  ( 9 MB/s)
   97341 buffers, total=60.479 s  (13 MB/s)
  110149 buffers, total=52.379 s  (17 MB/s)
  120003 buffers, total=46.864 s  (20 MB/s)
  122296 buffers, total=57.867 s  (17 MB/s)
  128165 buffers, total=55.188 s  (18 MB/s)
  138508 buffers, total=58.068 s  (19 MB/s)
  132485 buffers, total=73.047 s  (14 MB/s)
  139542 buffers, total=68.319 s  (16 MB/s)
  137657 buffers, total=84.640 s  (13 MB/s)

Maybe this depends on which sections of the files are modified (contiguous
vs. not contiguous), but I doubt it.

In 9.1 there's a feature that spreads checkpoint writes, but with 8.4
that's not possible. I think this might be tuned using the background
writer; just make it more aggressive:
- bgwriter_delay (decrease)
- bgwriter_lru_maxpages (increase)
- bgwriter_lru_multiplier (increase)

regards
Tomas
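In postgresql.conf terms, making the background writer more aggressive along the lines Tomas suggests might look like the following. The 8.4 defaults are 200ms / 100 / 2.0; the values below are illustrative assumptions, not recommendations:

```
# postgresql.conf -- illustrative values only
bgwriter_delay = 50ms           # wake the bgwriter more often (default 200ms)
bgwriter_lru_maxpages = 500     # allow more pages per round (default 100)
bgwriter_lru_multiplier = 4.0   # scale writes to recent demand (default 2.0)
```

The idea is to have dirty buffers trickled out between checkpoints so each checkpoint has less to write; the trade-off is somewhat higher steady-state write traffic.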
[PERFORM] Logfile
Hi,

I am using the contrib/cube code. Building a GiST index on the cube data
type leads to a very large log file (nearly 220 MB for only 12k records),
while creating a GiST index on a geometry field gives a 1 KB log file for
17 lakh records. Can someone please tell me how to stop Postgres from
logging so much data in the cube case?

Thanks
Nick
Re: [PERFORM] SORT performance - slow?
Dne 23.5.2011 19:01, Maciek Sakrejda napsal(a):
>> You're probably reading it wrong. The sort itself takes about 1 ms (just
>> subtract the numbers in "actual=").
>
> I thought it was cost=startup_cost..total_cost. That is not quite the
> same thing, since startup_cost is effectively "cost to produce first
> row", and Sort can't really operate in a "streaming" fashion (well,
> theoretically, something like selection sort could, but that's beside
> the point) so it needs to do all the work up front. I'm no explain
> expert, so someone please correct me if I'm wrong.

Good point, thanks. In that case the second number (2.3 sec) is correct.

I still think the problem is not the sorting but the inaccurate estimates -
fixing this might yield a much better / faster plan. But the OP posted just
a small part of the plan, so it's hard to guess.

regards
Tomas
Re: [PERFORM] SORT performance - slow?
On Mon, May 23, 2011 at 1:01 PM, Maciek Sakrejda wrote:
>> You're probably reading it wrong. The sort itself takes about 1 ms (just
>> subtract the numbers in "actual=").
>
> I thought it was cost=startup_cost..total_cost. That is not quite the
> same thing, since startup_cost is effectively "cost to produce first
> row", and Sort can't really operate in a "streaming" fashion (well,
> theoretically, something like selection sort could, but that's beside
> the point) so it needs to do all the work up front. I'm no explain
> expert, so someone please correct me if I'm wrong.

You are right.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [PERFORM] Pushing LIMIT into sub-queries of a UNION ALL
On Mon, May 16, 2011 at 3:38 PM, Dave Johansen wrote:
> I am using Postgres 8.3 and I have an issue very closely related to the
> one described here:
> http://archives.postgresql.org/pgsql-general/2005-06/msg00488.php
>
> Basically, I have a VIEW which is a UNION ALL of two tables but when I do
> a select on the view using a LIMIT, it scans the entire tables and takes
> significantly longer than writing out the query with the LIMITs in the
> sub-queries themselves. Is there a solution to get the view to perform
> like the sub-query version?

I believe this is fixed by MergeAppend in 9.1. You might want to try
9.1beta1 and see if that works better for you.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
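Until an upgrade to 9.1 is possible, the usual 8.3-era workaround is the one the original post alludes to: pushing the LIMIT (and its ORDER BY) into each branch by hand. A hedged sketch with hypothetical names — `t1`/`t2` stand for the two tables behind the view and `ts` for the ordering column, none of which appear in the post:

```sql
-- Each branch is limited independently, so each can use its own index;
-- the outer ORDER BY + LIMIT then keeps the overall top 10.
SELECT *
FROM (
    (SELECT * FROM t1 ORDER BY ts DESC LIMIT 10)
    UNION ALL
    (SELECT * FROM t2 ORDER BY ts DESC LIMIT 10)
) AS u
ORDER BY ts DESC
LIMIT 10;
```

This is exactly the transformation MergeAppend performs automatically in 9.1, minus the hand-maintenance; the downside here is that the LIMIT is baked into the query text rather than inherited from the caller of the view.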
Re: FW: [PERFORM] KVP table vs. hstore - hstore performance (Was: Postgres NoSQL emulation)
On Tue, May 17, 2011 at 11:10 AM, wrote:
> For Hstore I'm using a GIST index.

I would have thought that GIN would be a better choice for this workload.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [PERFORM] SORT performance - slow?
> You're probably reading it wrong. The sort itself takes about 1 ms (just
> subtract the numbers in "actual=").

I thought it was cost=startup_cost..total_cost. That is not quite the same
thing, since startup_cost is effectively "cost to produce first row", and
Sort can't really operate in a "streaming" fashion (well, theoretically,
something like selection sort could, but that's beside the point) so it
needs to do all the work up front. I'm no explain expert, so someone please
correct me if I'm wrong.

---
Maciek Sakrejda | System Architect | Truviso

1065 E. Hillsdale Blvd., Suite 215
Foster City, CA 94404
(650) 242-3500 Main
www.truviso.com
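To make the startup-vs-total distinction concrete, here is a hand-written, annotated plan line (the numbers are invented for illustration, not taken from the thread):

```
Sort  (cost=877.00..879.50 rows=1000 width=8)
      (actual time=2300.000..2301.000 rows=1000 loops=1)
                    ^         ^
                    |         total: elapsed time when the LAST row
                    |         was emitted
                    startup: elapsed time before the FIRST row was
                    emitted -- for Sort this includes sorting all
                    the input, i.e. essentially all of the work
```

So for a Sort node the interesting cost is the startup figure, and subtracting it from the child node's total time attributes the sorting work itself; the gap between startup and total is just handing the sorted rows to the parent.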
Re: [PERFORM] Pushing LIMIT into sub-queries of a UNION ALL?
I apologize for the multiple posts. I sent this email right after joining
the list, and after it hadn't shown up a day later I figured that it had
been lost or something and sent the other one.

Also, the database I posted this about does not have internet access, so
I'm working on getting it moved over to a machine that does, or getting the
info onto a machine where I can post the pertinent information about the
schema and explain outputs.

Thanks,
Dave

--
Dave Johansen
phone: (520) 302-4526

On Sun, May 22, 2011 at 10:34 AM, Robert Klemme wrote:
> Dave,
>
> how often do you want to repeat that posting? What about instead
> replying to the answers you got so far?
>
> Cheers
>
> robert
>
> On Tue, May 17, 2011 at 5:31 PM, Dave Johansen wrote:
>> I am using Postgres 8.3 and I have an issue very closely related to the
>> one described here:
>> http://archives.postgresql.org/pgsql-general/2005-06/msg00488.php
>>
>> Basically, I have a VIEW which is a UNION ALL of two tables but when I
>> do a select on the view using a LIMIT, it scans the entire tables and
>> takes significantly longer than writing out the query with the LIMITs
>> in the sub-queries themselves. Is there a solution to get the view to
>> perform like the sub-query version?
>>
>> Thanks,
>> Dave
>
> --
> remember.guy do |as, often| as.you_can - without end
> http://blog.rubybestpractices.com/
Re: [PERFORM] Postgres refusing to use >1 core
> It's always good to hear when these things work out. Thanks for
> reporting back.
>
> Using the set-based nature of relational databases to your advantage,
> writing smarter queries that do more work server-side with fewer
> round-trips, and effective batching can make a huge difference.

Glad I could be a good digital citizen! :-)

Correction: it's going to run for significantly more than 2 hours, but far
less than 30 hours! I'm loving seeing the CPU meter showing all 8 of my
(fake) cores being pounded mercilessly!

Aren
Re: [PERFORM] Performance degradation of inserts when database size grows
On 05/17/2011 07:45 AM, Andrey Vorobiev wrote:
> 2011-05-17 18:55:51 NOVST LOG: checkpoint starting: xlog
> 2011-05-17 18:57:20 NOVST LOG: checkpoint complete: wrote 62861 buffers
> (24.0%); 0 transaction log file(s) added, 0 removed, 0 recycled;
> write=89.196 s, sync=0.029 s, total=89.242 s

Increase your checkpoint_segments. If you see "checkpoint starting: xlog"
instead of "checkpoint starting: time", you don't have enough checkpoint
segments to handle your writes. Checkpoints *will* degrade your throughput.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
stho...@peak6.com

__

See http://www.peak6.com/email_disclaimer.php for terms and conditions
related to this email
Re: [PERFORM] [OT]: Confidentiality disclosures in list posts (Was: SORT performance - slow?)
On 05/20/2011 11:47 AM, Lew wrote:
> I don't even know why people bother even putting such nonsense into their
> emails, let alone Usenet or mailing-list posts.

This may sound like a surprise, but many of us don't. Several companies use
an auto-append on any outgoing message not sent to an internal recipient.
You can see this for yourselves in this message, as my company's little
blurb gets attached after my signature lines. It's just standard
boilerplate meant as a CYA measure, really.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
stho...@peak6.com

__

See http://www.peak6.com/email_disclaimer.php for terms and conditions
related to this email
Re: FW: [PERFORM] KVP table vs. hstore - hstore performance (Was: Postgres NoSQL emulation)
Hi Merlin

The analyze command gave the following result.

On the KVP table:

Index Scan using kvpidx on bench_kvp  (cost=0.00..8.53 rows=1 width=180) (actual time=0.037..0.038 rows=1 loops=1)
  Index Cond: (bench_id = '20_20'::text)
Total runtime: 0.057 ms

And on the Hstore table:

Bitmap Heap Scan on bench_hstore  (cost=32.22..3507.54 rows=1000 width=265) (actual time=145.040..256.173 rows=1 loops=1)
  Recheck Cond: (bench_hstore @> '"bench_id"=>"20_20"'::hstore)
  ->  Bitmap Index Scan on hidx  (cost=0.00..31.97 rows=1000 width=0) (actual time=114.748..114.748 rows=30605 loops=1)
        Index Cond: (bench_hstore @> '"bench_id"=>"20_20"'::hstore)
Total runtime: 256.211 ms

For Hstore I'm using a GIST index.

Try to create a btree index on "(bench_hstore->bench_id) WHERE
(bench_hstore->bench_id) IS NOT NULL".
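Spelled out, the suggested expression index might look like the sketch below. This is a hedged reading of the suggestion: it assumes the hstore column is itself named bench_hstore (as the plan's Recheck Cond implies), and note that hstore's `->` operator takes a quoted text key; the index name is invented:

```sql
-- Partial btree index on the extracted hstore value:
CREATE INDEX bench_hstore_bench_id_idx
    ON bench_hstore ((bench_hstore -> 'bench_id'))
    WHERE (bench_hstore -> 'bench_id') IS NOT NULL;

-- Queries must use the same expression to match the index:
SELECT count(*)
FROM bench_hstore
WHERE bench_hstore -> 'bench_id' = '20_20';
```

Unlike the GiST `@>` containment scan above, this turns the single-key lookup into a plain btree probe on a scalar value, which is the shape of the fast KVP plan.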