Re: [PERFORM] Creating and updating table using function parameter reference
thanks, i posted in this listed because it was related to my previous query. Anyway, I am able to achieve, with the help in this mailing list, what I wanted but is there any way to further optimize this. Thanks CREATE OR REPLACE FUNCTION test ( t1 text ) RETURNS numeric AS $$ declare cmd1 text; declare cmd2 text; declare t2row RECORD; begin cmd1=' select product, (case when sum(pd) 0 then sum(gd)/sum(pd)*100 else 0 end ) as gppp, (case when sum(tld) 0 then sum(pd)/sum(tld) else 0 end ) as ppd from '|| t1 || ' as dummy group by dummy.product' ; for t2row in execute cmd1 loop cmd2 = 'update ' || t1 || ' as t set GPPP=' ||t2row.gppp||' where product='||quote_literal(t2row.product); execute cmd2; cmd2 = 'update ' || t1 || ' as t set PPD=' ||t2row.ppd||' where product='||quote_literal(t2row.product); execute cmd2; END LOOP; RETURN NULL; end; $$ LANGUAGE plpgsql VOLATILE On Thu, Feb 14, 2008 at 5:54 PM, Albert Cervera Areny [EMAIL PROTECTED] wrote: You need the string concatenation operator ||. Take a look at http://www.postgresql.org/docs/8.3/static/functions-string.html By the way, this is off-topic in this list please, post general non-performance questions to pgsql-general. A Dijous 14 Febrer 2008 13:35, Linux Guru va escriure: I still cannot pass tablename, what is wrong? Is this the right way? CREATE OR REPLACE FUNCTION test ( t1 text,t2 text ) RETURNS numeric AS $$ declare temp1 text; declare temp2 text; declare cmd text; declare t2row RECORD; begin temp1=t1; temp2=t2; cmd='select product, (case when sum(pd) 0 then sum(gd)/sum(pd)*100 else 0 end ) as gppp from ' temp1 ' as dummy group by dummy.product,dummy.totalclaimsgroup, dummy.avgmems,dummy.months'; execute cmd into t2row --After executing above, I need here to update table t1 end; $$ LANGUAGE plpgsql ERROR: syntax error at or near $1 LINE 2: from ' $1 ' as dummy group by dummy.product,dummy.totalcla... ^ QUERY: SELECT 'select product, (case when sum(pd) 0 then sum(gd)/sum(pd)*100 else 0 end ) as gppp from ' $1 ' as dummy group by dummy.product,dummy.totalclaimsgroup, dummy.avgmems,dummy.months' CONTEXT: SQL statement in PL/PgSQL function test near line 9 ** Error ** ERROR: syntax error at or near $1 SQL state: 42601 Context: SQL statement in PL/PgSQL function test near line 9 On Wed, Feb 13, 2008 at 8:23 PM, Albert Cervera Areny [EMAIL PROTECTED] wrote: A Dimecres 13 Febrer 2008 15:25, Linux Guru va escriure: I want to create and update two tables in a function such as below, but using parameters as tablename is not allowed and gives an error. Is there any way I could achieve this? You're looking for EXECUTE: http://www.postgresql.org/docs/8.3/static/plpgsql-statements.html#PLPGSQL -STATEMENTS-EXECUTING-DYN CREATE OR REPLACE FUNCTION test ( t1 text,t2 text ) RETURNS numeric AS $$ declare temp1 text; declare temp2 text; begin temp1=t1; temp2=t2; select product, (case when sum(pd) 0 then sum(gd)/sum(pd)*100 else 0 end ) as gppp into temp2 from temp1 as dummy group by dummy.product,dummy.totalclaimsgroup,dummy.avgmems,dummy.months ; update temp1 as t set GPPP=(select gppp from temp2 as dummy where dummy.product=t.product), end $$ LANGUAGE plpgsql -- ERROR: syntax error at or near $1 LINE 1: ...en sum(gd)/sum(pd)*100 else 0 end ) as gppp from $1 as dum... ^ QUERY: select product, (case when sum(pd) 0 then sum(gd)/sum(pd)*100 else 0 end ) as gppp from $1 as dummy group by dummy.product, dummy.totalclaimsgroup,dummy.avgmems,dummy.months CONTEXT: SQL statement in PL/PgSQL function test near line 10 ** Error ** ERROR: syntax error at or near $1 SQL state: 42601 Context: SQL statement in PL/PgSQL function test near line 10 -- Albert Cervera Areny Dept. Informàtica Sedifa, S.L. Av. Can Bordoll, 149 08202 - Sabadell (Barcelona) Tel. 93 715 51 11 Fax. 93 715 51 12 AVISO LEGAL La presente comunicación y sus anexos tiene como destinatario la persona a la que va dirigida, por lo que si usted lo recibe por error debe notificarlo al remitente y eliminarlo de su sistema, no pudiendo utilizarlo, total o parcialmente, para ningún fin. Su contenido puede tener información confidencial o protegida legalmente y únicamente expresa la opinión del remitente. El uso del correo electrónico vía Internet no permite asegurarni la
Re: [PERFORM] Creating and updating table using function parameter reference
I still cannot pass tablename, what is wrong? Is this the right way? CREATE OR REPLACE FUNCTION test ( t1 text,t2 text ) RETURNS numeric AS $$ declare temp1 text; declare temp2 text; declare cmd text; declare t2row RECORD; begin temp1=t1; temp2=t2; cmd='select product, (case when sum(pd) 0 then sum(gd)/sum(pd)*100 else 0 end ) as gppp from ' temp1 ' as dummy group by dummy.product,dummy.totalclaimsgroup, dummy.avgmems,dummy.months'; execute cmd into t2row --After executing above, I need here to update table t1 end; $$ LANGUAGE plpgsql ERROR: syntax error at or near $1 LINE 2: from ' $1 ' as dummy group by dummy.product,dummy.totalcla... ^ QUERY: SELECT 'select product, (case when sum(pd) 0 then sum(gd)/sum(pd)*100 else 0 end ) as gppp from ' $1 ' as dummy group by dummy.product,dummy.totalclaimsgroup, dummy.avgmems,dummy.months' CONTEXT: SQL statement in PL/PgSQL function test near line 9 ** Error ** ERROR: syntax error at or near $1 SQL state: 42601 Context: SQL statement in PL/PgSQL function test near line 9 On Wed, Feb 13, 2008 at 8:23 PM, Albert Cervera Areny [EMAIL PROTECTED] wrote: A Dimecres 13 Febrer 2008 15:25, Linux Guru va escriure: I want to create and update two tables in a function such as below, but using parameters as tablename is not allowed and gives an error. Is there any way I could achieve this? You're looking for EXECUTE: http://www.postgresql.org/docs/8.3/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN CREATE OR REPLACE FUNCTION test ( t1 text,t2 text ) RETURNS numeric AS $$ declare temp1 text; declare temp2 text; begin temp1=t1; temp2=t2; select product, (case when sum(pd) 0 then sum(gd)/sum(pd)*100 else 0 end ) as gppp into temp2 from temp1 as dummy group by dummy.product,dummy.totalclaimsgroup,dummy.avgmems,dummy.months ; update temp1 as t set GPPP=(select gppp from temp2 as dummy where dummy.product=t.product), end $$ LANGUAGE plpgsql -- ERROR: syntax error at or near $1 LINE 1: ...en sum(gd)/sum(pd)*100 else 0 end ) as gppp from $1 as dum... ^ QUERY: select product, (case when sum(pd) 0 then sum(gd)/sum(pd)*100 else 0 end ) as gppp from $1 as dummy group by dummy.product, dummy.totalclaimsgroup,dummy.avgmems,dummy.months CONTEXT: SQL statement in PL/PgSQL function test near line 10 ** Error ** ERROR: syntax error at or near $1 SQL state: 42601 Context: SQL statement in PL/PgSQL function test near line 10
Re: [PERFORM] Update with Subquery Performance
yes, I also thought of this method and tested it before I got your mail and this solution seems workable. Thanks for the help On Feb 12, 2008 9:18 PM, Tom Lane [EMAIL PROTECTED] wrote: Linux Guru [EMAIL PROTECTED] writes: Analyzing did not help, here is the out of EXPLAIN ANALYZE of update query Seq Scan on dummy (cost=0.00..56739774.24 rows=23441 width=275) (actual time=18.927..577929.014 rows=22712 loops=1) SubPlan - Aggregate (cost=2420.41..2420.43 rows=1 width=19) (actual time= 25.423..25.425 rows=1 loops=22712) - Seq Scan on dummy temp (cost=0.00..2416.01 rows=586 width=19) (actual time=0.049..17.834 rows=2414 loops=22712) Filter: ((product)::text = ($0)::text) Total runtime: 578968.885 ms Yeah, that's just not going to be fast. An index on the product column might help a bit, but the real issue is that you're repetitively calculating the same aggregates. I think you need a separate temp table, along the lines of create temp table dummy_agg as select product, (case when sum(pd) 0 then sum(gd)/sum(pd)*100 else 0 end) as s from dummy group by product; create index dummy_agg_i on dummy_agg(product); -- optional update dummy set gp= (select s from dummy_agg where dummy_agg.product = dummy.product ); The index would only be needed if you expect a lot of rows (lot of different product values). regards, tom lane
[PERFORM] Creating and updating table using function parameter reference
I want to create and update two tables in a function such as below, but using parameters as tablename is not allowed and gives an error. Is there any way I could achieve this? CREATE OR REPLACE FUNCTION test ( t1 text,t2 text ) RETURNS numeric AS $$ declare temp1 text; declare temp2 text; begin temp1=t1; temp2=t2; select product, (case when sum(pd) 0 then sum(gd)/sum(pd)*100 else 0 end ) as gppp into temp2 from temp1 as dummy group by dummy.product,dummy.totalclaimsgroup,dummy.avgmems,dummy.months; update temp1 as t set GPPP=(select gppp from temp2 as dummy where dummy.product=t.product), end $$ LANGUAGE plpgsql -- ERROR: syntax error at or near $1 LINE 1: ...en sum(gd)/sum(pd)*100 else 0 end ) as gppp from $1 as dum... ^ QUERY: select product, (case when sum(pd) 0 then sum(gd)/sum(pd)*100 else 0 end ) as gppp from $1 as dummy group by dummy.product, dummy.totalclaimsgroup,dummy.avgmems,dummy.months CONTEXT: SQL statement in PL/PgSQL function test near line 10 ** Error ** ERROR: syntax error at or near $1 SQL state: 42601 Context: SQL statement in PL/PgSQL function test near line 10
Re: [PERFORM] Update with Subquery Performance
See, its calculating sum by grouping the product field. Here is an example Product GP - --- A 30 B 40 A 30 C 50 C 50 Now the query calculates aggregated sum and divide by grouping product so all A's must have same the result, so with B's and C's. Is this supposed to be updating every single row with one value? Cause I'm guessing it's running that sub select over and over instead of one time. yes you are right that its calculating every time for all elements in each group i.e. GP(A) is calculated twice for A, where it should only calculated once for each group. Is there any way to achieve this? analyze; set work_mem = 128000; between the alter and update and see if that helps. that did not help Also, as Tom said, post explain analyze output of the statement. Seq Scan on dummy (cost=0.00..56739774.24 rows=23441 width=275) (actual time=18.927..577929.014 rows=22712 loops=1) SubPlan - Aggregate (cost=2420.41..2420.43 rows=1 width=19) (actual time= 25.423..25.425 rows=1 loops=22712) - Seq Scan on dummy temp (cost=0.00..2416.01 rows=586 width=19) (actual time=0.049..17.834 rows=2414 loops=22712) Filter: ((product)::text = ($0)::text) Total runtime: 578968.885 ms Thanks On Feb 12, 2008 2:29 AM, Scott Marlowe [EMAIL PROTECTED] wrote: On Feb 11, 2008 5:06 AM, Linux Guru [EMAIL PROTECTED] wrote: We have a large datawarehouse stored in postgres and temp tables are created based on user query. The process of temp table creation involves selecting data from main fact table, this includes several select and update statements and one of the following update statement is having performance issues. The newly temp table created for this scenario contains 22712 rows. Here is the query alter table dummy add column gp numeric(40,15); update dummy set gp=(select ( case when sum(temp.pd) 0 then sum(temp.gd)/sum(temp.pd)*100 else 0 end ) from dummy as temp where temp.product=dummy.product) Is this supposed to be updating every single row with one value? Cause I'm guessing it's running that sub select over and over instead of one time. I'm guessing that with more work_mem the planner might use a more efficient plan. Try adding analyze; set work_mem = 128000; between the alter and update and see if that helps. Also, as Tom said, post explain analyze output of the statement. Now this query basically updates a table using values within itself in the subquery but it takes to much time i.e. approx 5 mins. The whole temp table creation process is stucked in this query (there are 4 additional such updates with same problem). Index creation is useless here since its only a one time process. Here is the strip down version (the part making performance issue) of above query i.e. only select statement --- select (case when sum(temp.pd) 0 then sum(temp.gd)/sum(temp.pd)*100 else 0 end ) from dummy as temp, dummy as temp2 where temp.product=temp2.product group by temp.product HashAggregate (cost=1652480.98..1652481.96 rows=39 width=39) - Hash Join (cost=1636.07..939023.13 rows=71345785 width=39) Hash Cond: ((temp.product)::text = (temp2.product)::text) - Seq Scan on dummy temp (cost=0.00..1311.03 rows=26003 width=39) - Hash (cost=1311.03..1311.03 rows=26003 width=21) - Seq Scan on dummy temp2 (cost=0.00..1311.03rows=26003 width=21) --- Whats the solution of this problem, or any alternate way to write this query?
Re: [PERFORM] Update with Subquery Performance
Analyzing did not help, here is the out of EXPLAIN ANALYZE of update query Seq Scan on dummy (cost=0.00..56739774.24 rows=23441 width=275) (actual time=18.927..577929.014 rows=22712 loops=1) SubPlan - Aggregate (cost=2420.41..2420.43 rows=1 width=19) (actual time= 25.423..25.425 rows=1 loops=22712) - Seq Scan on dummy temp (cost=0.00..2416.01 rows=586 width=19) (actual time=0.049..17.834 rows=2414 loops=22712) Filter: ((product)::text = ($0)::text) Total runtime: 578968.885 ms On Feb 11, 2008 9:59 PM, Tom Lane [EMAIL PROTECTED] wrote: Linux Guru [EMAIL PROTECTED] writes: We have a large datawarehouse stored in postgres and temp tables are created based on user query. The process of temp table creation involves selecting data from main fact table, this includes several select and update statements and one of the following update statement is having performance issues. Try ANALYZEing the temp table before the step that's too slow. If that doesn't help, let's see EXPLAIN ANALYZE (not just EXPLAIN) output. regards, tom lane
[PERFORM] Update with Subquery Performance
We have a large datawarehouse stored in postgres and temp tables are created based on user query. The process of temp table creation involves selecting data from main fact table, this includes several select and update statements and one of the following update statement is having performance issues. The newly temp table created for this scenario contains 22712 rows. Here is the query alter table dummy add column gp numeric(40,15); update dummy set gp=(select ( case when sum(temp.pd) 0 then sum(temp.gd)/sum(temp.pd)*100 else 0 end ) from dummy as temp where temp.product=dummy.product) Now this query basically updates a table using values within itself in the subquery but it takes to much time i.e. approx 5 mins. The whole temp table creation process is stucked in this query (there are 4 additional such updates with same problem). Index creation is useless here since its only a one time process. Here is the strip down version (the part making performance issue) of above query i.e. only select statement --- select (case when sum(temp.pd) 0 then sum(temp.gd)/sum(temp.pd)*100 else 0 end ) from dummy as temp, dummy as temp2 where temp.product=temp2.product group by temp.product HashAggregate (cost=1652480.98..1652481.96 rows=39 width=39) - Hash Join (cost=1636.07..939023.13 rows=71345785 width=39) Hash Cond: ((temp.product)::text = (temp2.product)::text) - Seq Scan on dummy temp (cost=0.00..1311.03 rows=26003 width=39) - Hash (cost=1311.03..1311.03 rows=26003 width=21) - Seq Scan on dummy temp2 (cost=0.00..1311.03 rows=26003 width=21) --- Whats the solution of this problem, or any alternate way to write this query?
[PERFORM] Index ot being used
Hi all, I have an index on a table that doesn't seem to want to be used. I'm hopig someone might be able to help point me in the right direction. My index is (typed, not copied): tle-bu= \d file_info_7_display_idx; Index public.file_info_7_display_idx Column | Type -+-- file_type | character varying(2) file_parent_dir | text file_name | text btree, for table public.file_info_7 tle-bu= EXPLAIN ANALYZE SELECT file_type, file_parent_dir, file_name FROM file_info_7; QUERY PLAN -- Seq Scan on file_info_7 (cost=0.00..11028.35 rows=294035 width=118) (actual time=0.122..2707.764 rows=294035 loops=1) Total runtime: 3717.862 ms (2 rows) Can anyone see what's wrong? Should I post the table schema? Thanks all! Madison ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match