Re: [PERFORM] Constraint propagating for equal fields
On 30 January 2010 04:30, Greg Stark <gsst...@mit.edu> wrote:
> 2010/1/28 Віталій Тимчишин <tiv...@gmail.com>:
>> I've always thought that PostgreSQL would propagate a constraint from field1 to field2 if a condition says field1 = field2, but this does not seem to be the case.
>
> version?

PostgreSQL 8.3.7 on amd64-portbld-freebsd7.2, compiled by GCC cc (GCC) 4.2.1 20070719 [FreeBSD]
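For what it's worth, the behavior being asked about can be shown with a minimal sketch (table and column names invented for illustration; this is my understanding of 8.3 behavior, not something stated in the thread):

```sql
-- For plain equality the planner builds an equivalence class over
-- {t1.f1, t2.f2, 10} and derives t2.f2 = 10 on its own:
SELECT *
FROM t1
JOIN t2 ON t1.f1 = t2.f2
WHERE t1.f1 = 10;

-- For range conditions, however, the restriction is not propagated;
-- t2.f2 > 10 must be written out explicitly for an index on t2.f2 to help:
SELECT *
FROM t1
JOIN t2 ON t1.f1 = t2.f2
WHERE t1.f1 > 10
  AND t2.f2 > 10;  -- logically redundant, but not to the planner
```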
Re: [PERFORM] Slow query: table iteration (8.3)
Glenn Maynard wrote:
> Hitting a performance issue that I'm not sure how to diagnose.
>
> SELECT highscores_for_steps_and_card(s.id, 591, 1) FROM stomp_steps s;
>  Seq Scan on stomp_steps s  (cost=0.00..793.52 rows=2902 width=4) (actual time=26509.919..26509.919 rows=0 loops=1)
>  Total runtime: 26509.972 ms

Hello Glenn,

stomp_steps is analyzed to 2902 rows, but when you run the query the actual row count is 0. That means either the highscore function is not called, or the number 0 is incorrect. Suppose the row count really is about 2900; then 26 seconds works out to roughly 9 ms per function call. That is still a lot compared to the 0.054 ms EXPLAIN ANALYZE result below.

The truth is probably that you got that result by explaining the query inside the function with actual parameter values. That plan differs from the one made when the function is called from SQL, where it is planned once, without parameters, and the resulting plan is probably different.

A way to check the plan of that query is to turn on debug_print_plan and watch the server log. It takes a bit of getting used to. The plan starts with "CONTEXT: SQL function functionname during startup" and can also be recognized because in the opexpr (operator expression) one of the operands is a parameter. What matters is the total cost of the top plan node (the Limit).

I know 8.3 is mentioned in the subject, but I think a WITH query (http://www.postgresql.org/docs/8.4/interactive/queries-with.html) could be a good solution to your problem and may be worth trying out, if you have the possibility to try 8.4.
Regards,
Yeb Havinga

> The inner function looks like this:
>
> CREATE FUNCTION highscores_for_steps_and_card(steps_id int, card_id int, count int)
> RETURNS SETOF INTEGER LANGUAGE SQL AS $$
>     SELECT r.id FROM stomp_round r
>     WHERE ($1 IS NULL OR r.steps_id = $1)
>       AND ($2 IS NULL OR r.user_card_id = $2)
>     ORDER BY r.score DESC
>     LIMIT $3
> $$
>
>  Limit  (cost=13.12..13.12 rows=1 width=8) (actual time=0.054..0.054 rows=0 loops=1)
>    ->  Sort  (cost=13.12..13.12 rows=1 width=8) (actual time=0.051..0.051 rows=0 loops=1)
>          Sort Key: score
>          Sort Method:  quicksort  Memory: 17kB
>          ->  Bitmap Heap Scan on stomp_round r  (cost=9.09..13.11 rows=1 width=8) (actual time=0.036..0.036 rows=0 loops=1)
>                Recheck Cond: ((280 = steps_id) AND (user_card_id = 591))
>                ->  BitmapAnd  (cost=9.09..9.09 rows=1 width=0) (actual time=0.032..0.032 rows=0 loops=1)
>                      ->  Bitmap Index Scan on stomp_round_steps_id  (cost=0.00..4.40 rows=20 width=0) (actual time=0.030..0.030 rows=0 loops=1)
>                            Index Cond: (280 = steps_id)
>                      ->  Bitmap Index Scan on stomp_round_user_card_id  (cost=0.00..4.44 rows=25 width=0) (never executed)
>                            Index Cond: (user_card_id = 591)
>  Total runtime: 0.153 ms
> (12 rows)
>
> stomp_steps has about 1500 rows, so it finds 1500 high scores, one for each stage. I expected scalability issues from this on a regular drive, since it'll be doing a ton of index seeking when not working out of cache, so I expected to need to change to an SSD at some point (when it no longer easily fits in cache). However, I/O doesn't seem to be the bottleneck yet. If I run it several times, it consistently takes 26 seconds. The entire database is in OS cache (find | xargs cat: 250 ms).
>
> I'm not sure why the full query (26 s) is orders of magnitude slower than 1500 * 0.150 ms (225 ms). It's not a very complex query, and I'd hope it's not being re-planned on every iteration through the loop. Any thoughts?
> Using SELECT to iterate over a table like this is very useful (and I don't know any practical alternative), but it's difficult to profile since it doesn't play nicely with EXPLAIN ANALYZE.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
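One standard workaround for the single generic plan of a SQL function (my suggestion, not something proposed in the thread) is to move the query into PL/pgSQL and build it with EXECUTE, so it is planned on every call with the actual values and the "$n IS NULL" branches drop out entirely. A sketch against the function shown above (RETURN QUERY EXECUTE does not exist on 8.3, hence the FOR loop):

```sql
CREATE OR REPLACE FUNCTION highscores_for_steps_and_card(steps_id int, card_id int, cnt int)
RETURNS SETOF integer LANGUAGE plpgsql AS $$
DECLARE
    q   text := 'SELECT r.id FROM stomp_round r WHERE true';
    rec record;
BEGIN
    -- Inline the concrete integer values so each EXECUTE is planned fresh;
    -- with constants in place the planner can choose the bitmap index scans.
    IF steps_id IS NOT NULL THEN
        q := q || ' AND r.steps_id = ' || steps_id;
    END IF;
    IF card_id IS NOT NULL THEN
        q := q || ' AND r.user_card_id = ' || card_id;
    END IF;
    q := q || ' ORDER BY r.score DESC LIMIT ' || cnt;

    FOR rec IN EXECUTE q LOOP
        RETURN NEXT rec.id;
    END LOOP;
END;
$$;
```

The trade-off is a re-plan on every call, which is only a win when the per-call planning cost is small compared to the cost of running the generic plan.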
Re: [PERFORM] Slow query: table iteration (8.3)
On Mon, Feb 1, 2010 at 6:15 AM, Yeb Havinga <yhavi...@gmail.com> wrote:
> Glenn Maynard wrote:
>> SELECT highscores_for_steps_and_card(s.id, 591, 1) FROM stomp_steps s;
>>  Seq Scan on stomp_steps s  (cost=0.00..793.52 rows=2902 width=4) (actual time=26509.919..26509.919 rows=0 loops=1)
>>  Total runtime: 26509.972 ms
>
> Stomp_steps is analyzed to 2902 rows but when you run the query the actual rows are 0. This means that the highscore function is not called or the number 0 is incorrect.

This SELECT returns 0 rows: it calls the function 1500 times, and each time the function returns no data, because there simply aren't any results for these parameters.

> The truth might be that you got that result by explaining the query in the function with actual parameter values. This plan differs from the one that is made when the function is called from sql and is planned (once) without parameters, and in that case the plan is probably different.

Yeah. It would help a lot if EXPLAIN could show the query plans of functions used by the statement, and not just the top-level query.

> A way to check the plan of that query is to turn on debug_print_plan and watch the server log. It takes a bit of getting used to. The plan starts with "CONTEXT: SQL function functionname during startup" and is also recognizable because in the opexpr (operator expression) one of the operands is a parameter. Important is the total cost of the top plan node (the limit).

Thanks. For SELECT highscores_for_steps_and_card(s.id, 591, 1) FROM stomp_steps s, squinting at the output, it definitely looks like a less optimized plan: it's using a SEQSCAN instead of a BITMAPHEAPSCAN. (I've attached the output.)

Does the planner not optimize functions based on their calling context? That seems like a huge class of optimizations. The first NULLTEST can be optimized away, since that parameter comes from a NOT NULL source (a PK). The second NULLTEST can also be optimized away, since it's a constant value (591).
The search could be a BITMAPHEAPSCAN, substituting the s.id value on each call, instead of a SEQSCAN. (Not that I'm concerned about a few cheap NULLTESTs; I'm just surprised at it using such a generic plan.)

If I create a new function with the constant parameters hard-coded, it's back to a BITMAPHEAPSCAN: 175 ms. This suggests a horrible workaround: creating temporary functions every time I make this type of query, with the fixed values substituted textually. I'd really love to know a less awful fix.

> I know 8.3 is mentioned in the subject, but I think that a WITH query (http://www.postgresql.org/docs/8.4/interactive/queries-with.html) could be a good solution to your problem and may be worth trying out, if you have the possibility to try out 8.4.

I can't see how to apply WITH to this. Non-recursive WITH seems like syntactic sugar that doesn't do anything a plain SELECT can't do, and I don't think what I'm doing here can be done with a regular SELECT.

--
Glenn Maynard

SELECT highscores_for_steps_and_card(s.id, 591, 1) FROM stomp_steps s;
                          QUERY PLAN
 Seq Scan on stomp_steps s  (cost=0.00..793.52 rows=2902 width=4) (actual time=1726.998..26729.717 rows=17 loops=1)
 Total runtime: 26729.822 ms

DEBUG:  plan:
DETAIL:    {PLANNEDSTMT
   :commandType 1
   :canSetTag true
   :planTree
      {SEQSCAN
      :startup_cost 0.00
      :total_cost 793.52
      :plan_rows 2902
      :plan_width 4
      :targetlist (
         {TARGETENTRY
         :expr
            {FUNCEXPR
            :funcid 240532
            :funcresulttype 23
            :funcretset true
            :funcformat 0
            :args (
               {VAR :varno 1 :varattno 1 :vartype 23 :vartypmod -1 :varlevelsup 0 :varnoold 1 :varoattno 1}
               {CONST :consttype 23 :consttypmod -1 :constlen 4 :constbyval true :constisnull false :constvalue 4 [ 79 2 0 0 ]}
               {CONST :consttype 23 :consttypmod -1 :constlen 4 :constbyval true :constisnull false :constvalue 4 [ 1 0 0 0 ]}
            )
            }
         :resno 1
         :resname highscores_for_steps_and_card
         :ressortgroupref 0
         :resorigtbl 0
         :resorigcol 0
         :resjunk false
         }
      )
      :qual
      :lefttree
      :righttree
      :initPlan
      :extParam (b)
      :allParam (b)
      :scanrelid 1
      }
   :rtable (
      {RTE
      :alias {ALIAS :aliasname s :colnames }
      :eref {ALIAS :aliasname s :colnames (id ...)
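As an aside (my own suggestion, not something raised in the thread): on 8.4 this whole "call a LIMITed function once per row" pattern can often be collapsed into a single window-function query, which EXPLAIN can then analyze end to end and which avoids the generic function plan entirely. A sketch, assuming the goal is the top-scoring round per steps_id for a given card:

```sql
SELECT id
FROM (SELECT r.id,
             row_number() OVER (PARTITION BY r.steps_id
                                ORDER BY r.score DESC) AS rn
      FROM stomp_round r
      WHERE r.user_card_id = 591) ranked
WHERE rn <= 1;   -- raise the bound for top-N per stage
```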
[PERFORM] the jokes for pg concurrency write performance
hi, first, thanks for making such a good open-source db.

About half a year ago I began to use pg in a big insurance project. As the project went on, I found some performance problems in concurrent-write situations, so I did some research on the concurrent-write strategy of postgresql, and I found a joke. Maybe this joke of a concurrency strategy is the designer's proud idea, but I think it is a joke. Next let me describe the problems:

* joke 1: an insert operation takes an exclusive lock on the row referenced by the foreign key. A big big big performance killer; I think this is a stupid design.

* joke 2: a concurrent update on the same row forces the other transaction to wait until the earlier transaction completes, which kills concurrency performance in long-running-transaction situations. A stupid design too. The reason for this joke design is to avoid conflicts under read committed isolation, in situations such as:

UPDATE webpages SET hits = hits + 1 WHERE url = 'some url';

With concurrent write transactions at read committed isolation, the hits count may come out wrong. This joke design serializes the writes, but I think no developer, however stupid, would write code like this stupid sample; good code would take an exclusive lock to serialize writes on that same row itself. But the joker thinks he should help us do this. I say: you should not kill concurrency performance just to help me with this fucking stupid sample code. I would use a SELECT ... FOR UPDATE to do it:

select 1 from lock_table where lockId = 'lock1' for update;
UPDATE webpages SET hits = hits + 1 WHERE url = 'some url';

* joke 3: updating 10 rows in a table with no index at all costs 5-8 seconds; this is not acceptable in bulk-update situations.
Re: [PERFORM] use pgsql in a big project, but i found pg has some big problem on concurrency write operation, maybe a joke for myself !
wrote:
> hi, first, thanks u for make so good opensource db .

Not a bad start.

> [insults and hand-waving]

Not a good way to continue.

If there's some particular performance problem you would like to try to solve, please read these pages, and try again:

http://wiki.postgresql.org/wiki/SlowQueryQuestions
http://wiki.postgresql.org/wiki/Guide_to_reporting_problems

It would also help if you posted something where the plain-text format was a bit more readable.

-Kevin
Re: [PERFORM] the jokes for pg concurrency write performance
On Tue, 2 Feb 2010, wyx6...@sina.com wrote:

> hi, first, thanks u for make so good opensource db .

First you thank the developers, then you insult them. Are you asking for help, or just trying to cause problems?

> * joke 1: insert operation would use a excluse lock on reference row by the foreign key . a big big big performance killer , i think this is a stupid design .

This one I don't know enough to answer.

> * joke 2: concurrency update on same row would lead to that other transaction must wait the earlier transaction complete [...] i would use a select .. for update to do this :
>
> select 1 from lock_table where lockId='lock1' for update ;
> UPDATE webpages SET hits = hits + 1 WHERE url ='some url ';

If you have one transaction start modifying a row, and then have another one start, how do you not have one wait for the other? Remember that any transaction can end up running for a long time and may roll back at any time. Why would you want to lock the entire table for an update as simple as the one you describe?

> * joke 3: update 10 rows on a table no any index , it cost 5-8 seconds , this is not acceptable in some bulk update situation .

This one is easy: you did the 10 inserts as separate transactions. If you do them all as one transaction (or better still as a COPY) they will complete much faster.

You seem to be assuming incompetence on the part of the developers whenever you run into a problem. If you want them to help you, I would suggest that you assume that they know what they are doing (after all, if they didn't, you wouldn't want to use their code for anything important anyway), and instead ask what the right way is to do what you are trying to do.

David Lang
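The batching point above can be made concrete (a sketch of my own, with an invented table; the cost difference comes from paying one WAL flush per commit instead of one per statement):

```sql
-- Slow: in autocommit mode each statement is its own transaction,
-- so each of the 10 updates waits for its own WAL fsync at commit.

-- Faster: group them into a single transaction with one commit.
BEGIN;
UPDATE scores SET value = value + 1 WHERE id = 1;
UPDATE scores SET value = value + 1 WHERE id = 2;
-- ... remaining rows ...
COMMIT;

-- Fastest for bulk-loading new rows: a single COPY
-- (path is illustrative; COPY FROM a server-side file needs superuser).
COPY scores (id, value) FROM '/tmp/scores.dat';
```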
Re: [PERFORM] the jokes for pg concurrency write performance
On Feb 1, 2010, at 8:57 PM, wyx6...@sina.com wrote:

> i found a joke ,maybe this joke concurrency strategy is the designer's pround idea, but i think it is a joke , next let me describe the problems:

I would suggest that the behavior you dislike so much is really not the idea of the postgresql developers so much as that of Prof. Codd and the ANSI SQL committee. I wonder if a credible relational DBMS exists that doesn't behave in exactly the way you've described.

> UPDATE webpages SET hits = hits + 1 WHERE url ='some url ';
> i say ,u should no kill concurrency performance

One alternative design would be to log the timestamp of each web-page hit rather than update a hit-count field. Once you have this design, if the table becomes voluminous with historical logs, you have the choice of horizontal table partitioning, or you can roll all of the historical logs up into an aggregating materialized view (table).

Regarding all of the jokes you mentioned, I found them all to be very funny indeed. :)

Regards,
Richard
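Richard's log-then-aggregate idea might look something like this (a sketch; the table and column names are invented):

```sql
-- Append-only hit log: concurrent INSERTs never contend on a shared row.
CREATE TABLE page_hits (
    url      text        NOT NULL,
    hit_time timestamptz NOT NULL DEFAULT now()
);

INSERT INTO page_hits (url) VALUES ('some url');

-- Periodic rollup into a hand-maintained "materialized view" table:
CREATE TABLE page_hit_counts AS
    SELECT url, count(*) AS hits
    FROM page_hits
    GROUP BY url;
```

The hot path becomes a lock-free append, and the counter is only computed in batch, where the aggregation cost is paid once.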
[PERFORM] Sincere apology for my insults and hand-waving in these mails: the jokes for pg concurrency write performance
I want to say to the developers of pg: thank you very much, you have made such a great project.

I think I am not a rude guy. Please forgive me.