Re: [SQL] Ask To Optimize Looping
Hello,

I would try to replace the loop with a single UPDATE ... FROM statement:

  Update EP_ES06_N_TEMP2
  Set ...
  FROM (
     select kodedivisi, kodeseksi, kodewip, nobatch,
            ket1, ket2, ket3, ket4, NILAIPROP as nilaiygdibagi
     from EDP040_07_23
     --order by kodedivisi, kodeseksi, kodewip, nobatch, ket1, ket2, ket3, ket4
  ) i
  WHERE ..

Here is a simple example of this syntax:

  create table test (i int);
  insert into test select * from generate_series(1,20);
  update test set i = 0 from (select * from generate_series(1,10) s) i where test.i = i.s;
  --Query returned successfully: 10 rows affected

But beware the limitation of UPDATE ... FROM (from the documentation):

"When a FROM clause is present, what essentially happens is that the target table is joined to the tables mentioned in the from_list, and each output row of the join represents an update operation for the target table. When using FROM you should ensure that the join produces at most one output row for each row to be modified. In other words, a target row shouldn't join to more than one row from the other table(s). If it does, then only one of the join rows will be used to update the target row, but which one will be used is not readily predictable. Because of this indeterminacy, referencing other tables only within sub-selects is safer, though often harder to read and slower than using a join."

HTH,

Marc Mamin
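To illustrate the indeterminacy the documentation warns about, here is a small self-contained sketch (hypothetical table and column names, not from the thread): the single target row joins to two source rows, and only one of them, unpredictably, wins.

```sql
-- Pitfall demo for UPDATE ... FROM: the target row joins to two source rows.
create table target (k int, v int);
insert into target values (1, 0);

create table source (k int, v int);
insert into source values (1, 10), (1, 20);   -- two candidates for k = 1

update target set v = source.v
from source
where target.k = source.k;
-- target.v is now 10 or 20 -- whichever join row happened to be used.
```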
[SQL] record to columns: syntax question and strange behaviour
Hello,

How should I retrieve the result from a function with some OUT parameters? (PG is 8.3.7)

Here is a short example to illustrate my question:

  CREATE OR REPLACE FUNCTION test (IN a int, OUT b int, OUT c int) AS
  $BODY$
  BEGIN
    b := a+1;
    c := a+2;
    raise notice 'done: %', a;
  END
  $BODY$
  LANGUAGE 'plpgsql' IMMUTABLE;

  select column1, test(column1) FROM (values(1),(2)) foo

  => 1, (2,3)
     2, (3,4)

  NOTICE: done: 1
  NOTICE: done: 2

What I want is just

  1,2,3
  2,3,4

The following returns the expected result, but the function is called once for each OUT parameter:

  select column1, (test(column1)).* FROM (values(1),(2)) foo

  => 1,2,3
     2,3,4

  NOTICE: done: 1
  NOTICE: done: 1
  NOTICE: done: 2
  NOTICE: done: 2

Is there a way to avoid this?

Thanks,

Marc Mamin
Re: [SQL] record to columns: syntax question and strange behaviour
Hello,

Your proposal unfortunately does not work if you try to query more than one value and want additional columns in the result, as in

  select column1, test(column1) FROM (values(1),(2)) foo

cheers,

Marc Mamin

>IMO easiest would be to include a RETURNS SETOF record in the
>function declaration and a return next; statement in the function
>body. E.g.
>
>CREATE OR REPLACE FUNCTION test (In a int, OUT b int, OUT c int)
>RETURNS SETOF record
>AS
>$BODY$
>BEGIN
>  b:=a+1;
>  c:=a+2;
>  return next;
>END
>$BODY$
>LANGUAGE 'plpgsql'
>
>and then issue
>
>SELECT * FROM test(1);
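A workaround I believe works here (untested sketch) is to materialize the function call in a subquery and expand the record only in the outer query, so the function body runs once per input row; the OFFSET 0 is an old trick to keep the planner from flattening the subquery:

```sql
-- Evaluate test() once per row, then expand the OUT columns.
select column1, (t).b, (t).c
from (select column1, test(column1) as t
      from (values (1), (2)) foo (column1)
      offset 0) s;
```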
[SQL] workaround for missing ROWNUM feature with the help of GUC variables
Hello,

Here are my two pence on this recurring theme (just a workaround).

regards,

Marc Mamin

The PG parameter custom_variable_classes must be set to allow defining your own configuration variables:

  #------------------------------------------------------------------
  # CUSTOMIZED OPTIONS
  #------------------------------------------------------------------
  custom_variable_classes = 'public'	# list of custom variable class names

usage example:

  select my_rownum(), * from generate_series(10,15);

wrong usage:

  select my_rownum() as n1, my_rownum() as n2, * from generate_series(10,15);

solution:

  select my_rownum('1') as n1, my_rownum('2') as n2, * from generate_series(10,15);

Code:

  CREATE OR REPLACE FUNCTION public.my_rownum() returns int AS
  $BODY$
  /*
    equivalent to the Oracle rownum
    (the previous row value is attached to a GUC variable, valid in the
    current transaction only)
    quite slow :-(
  */
  DECLARE
    current_rownum int;
    config_id varchar = 'public.my_rownum';
  BEGIN
    BEGIN
      current_rownum := cast(current_setting(config_id) as int);
    EXCEPTION when others then
      return cast(set_config(config_id, cast(1 as text), true) as int);
    END;
    RETURN cast(set_config(config_id, cast(current_rownum + 1 as text), true) as int);
  END;
  $BODY$
  LANGUAGE 'plpgsql' VOLATILE;

  /* -- For multiple usage: -- */

  CREATE OR REPLACE FUNCTION public.my_rownum(id varchar) returns int AS
  $BODY$
  /*
    equivalent to the Oracle rownum, quite slow :-(
    (the previous row value is attached to a GUC variable, valid in the
    current transaction only)
    $1: when more than one my_rownum is used within a query, each call must
        have its own id in order to get a different GUC variable.
  */
  DECLARE
    current_rownum int;
    config_id varchar = 'public.my_rownum' || id;
  BEGIN
    BEGIN
      current_rownum := cast(current_setting(config_id) as int);
    EXCEPTION when others then
      return cast(set_config(config_id, cast(1 as text), true) as int);
    END;
    RETURN cast(set_config(config_id, cast(current_rownum + 1 as text), true) as int);
  END;
  $BODY$
  LANGUAGE 'plpgsql' VOLATILE;
Re: [SQL] workaround for missing ROWNUM feature with the help of GUC variables
> I don't understand what you mean by missing ROWNUM feature; PG got this
> with window functions in 8.4:
> http://www.postgresql.org/docs/8.4/interactive/functions-window.html

I'm unfortunately still using 8.3; sorry not to have mentioned that.

Marc Mamin
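For readers on 8.4 or later, the built-in window function makes the whole GUC workaround unnecessary:

```sql
-- 8.4+: row_number() replaces the GUC-based my_rownum()
select row_number() over () as my_rownum, s
from generate_series(10, 15) s;
```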
[SQL] Controlling join order with parenthesis
Hello,

According to the documentation, I thought it was possible to force a given join order with parentheses. But in this small example this seems to have no effect: with the first query I'd have expected to see t3 in the Merge Right Join, but both queries return the same query plan. Am I missing a point? (Postgres 8.3)

  create temp table t1(i int);
  create temp table t2(i int);
  create temp table t3(i int);

  select * from ( ( t1 LEFT OUTER JOIN t2 on (t1.i=t2.i) ) LEFT OUTER JOIN t3 on (t2.i=t3.i) )

  select * from ( t1 LEFT OUTER JOIN (t2 LEFT OUTER JOIN t3 on (t2.i=t3.i) ) on (t1.i=t2.i) )

  Merge Right Join  (cost=506.24..6206.24 rows=345600 width=12)
    Merge Cond: (t2.i = t1.i)
    ->  Merge Left Join  (cost=337.49..781.49 rows=28800 width=8)
          Merge Cond: (t2.i = t3.i)
          ->  Sort  (cost=168.75..174.75 rows=2400 width=4)
                Sort Key: t2.i
                ->  Seq Scan on t2  (cost=0.00..34.00 rows=2400 width=4)
          ->  Sort  (cost=168.75..174.75 rows=2400 width=4)
                Sort Key: t3.i
                ->  Seq Scan on t3  (cost=0.00..34.00 rows=2400 width=4)
    ->  Sort  (cost=168.75..174.75 rows=2400 width=4)
          Sort Key: t1.i
          ->  Seq Scan on t1  (cost=0.00..34.00 rows=2400 width=4)

best regards,

Marc Mamin

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
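Parenthesized join syntax only pins the join order when the planner's search is restricted; by default (join_collapse_limit = 8) it is free to reorder joins it can prove equivalent. An untested sketch of how to make the textual order binding for the session:

```sql
-- With join_collapse_limit = 1 the planner keeps the explicit join order
-- instead of searching for a cheaper one.
SET join_collapse_limit = 1;

explain
select *
from (t1 LEFT OUTER JOIN t2 on (t1.i = t2.i))
     LEFT OUTER JOIN t3 on (t2.i = t3.i);
```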
Re: [SQL] Overlapping Ranges- Query Alternative
or:

  Select Groups, generate_series
  FROM ranges
  JOIN generate_series(10,50,10) on (ColumnA < generate_series)
  ORDER by Groups, generate_series;

regards,

Marc Mamin

From: pgsql-sql-ow...@postgresql.org [mailto:pgsql-sql-ow...@postgresql.org] On Behalf Of Andreas Gaab
Sent: Friday, 12 November 2010 09:23
To: 'Ozer, Pam'; pgsql-sql@postgresql.org
Subject: Re: [SQL] Overlapping Ranges- Query Alternative

Hi,

the following works:

  Create temp table ranges (Groups int, ColumnA int);
  Insert into ranges Values(2,45);
  Insert into ranges Values(3,15);
  Insert into ranges Values(4,25);
  Insert into ranges Values(5,35);

  Select Groups,
    Case when ColumnA between 0 and 19 then 0
         when ColumnA >=20 AND ColumnA < 30 then generate_series(20,20,10)
         when ColumnA >=30 AND ColumnA < 40 then generate_series(20,30,10)
         when ColumnA >=40 AND ColumnA < 50 then generate_series(20,40,10)
         when ColumnA >=50 then generate_series(20,50,10)
    end MinRange
  from ranges;

  --or even only

  Select Groups,
    CASE WHEN ColumnA < 20 then 0
         ELSE generate_series(20, (floor(ColumnA / 10.0) * 10)::integer, 10)
    END MinRange
  from ranges;

Best, Andreas

From: pgsql-sql-ow...@postgresql.org [mailto:pgsql-sql-ow...@postgresql.org] On Behalf Of Ozer, Pam
Sent: Thursday, 11 November 2010 20:07
To: pgsql-sql@postgresql.org
Subject: [SQL] Overlapping Ranges- Query Alternative

I have the following problem:

  Create temp table ranges (Groups int, ColumnA int);
  Insert into ranges Values(2,45);

  Select Groups,
    Case when ColumnA between 0 and 19 then 0
         when ColumnA >=20 then 20
         when ColumnA >=30 then 30
         when ColumnA >=40 then 40
         when ColumnA >=50 then 50
    end MinRange
  from ranges

  Results:
  Groups  minrange
  2;20

What I want is: one column can fall into multiple ranges. For example, 45 > 20, 30, and 40, so I want the following results:

  2;20
  2;30
  2;40

I know I could do a union with each range, but is there any way to bring back all ranges in one query?
I need to bring back the values in one column so having separate columns for each range is not an option. Thank you in advance for any help Pam Ozer
[SQL] First aggregate with null support
Hello,

I'm looking for a "first" aggregate which may return null. Following the example at http://wiki.postgresql.org/index.php?title=First_%28aggregate%29, I have just defined a non-strict function that returns the first value:

  CREATE OR REPLACE FUNCTION public.first_agg_wnull ( anyelement, anyelement )
  RETURNS anyelement AS
  $$ SELECT $1; $$
  LANGUAGE SQL IMMUTABLE COST 1;

And an aggregate:

  CREATE AGGREGATE public.first_wnull (
    sfunc    = first_agg_wnull,
    basetype = anyelement,
    stype    = anyelement
  );

But this always returns null, which is the default init condition of the aggregate :-(

I also have a working one using an array function (first_wnull_a, below), but I wonder if there is a simpler solution...

best regards,

Marc Mamin

  CREATE OR REPLACE FUNCTION first_element_state(anyarray, anyelement)
  RETURNS anyarray AS
  $$ SELECT CASE WHEN array_upper($1,1) IS NULL THEN array_append($1,$2) ELSE $1 END; $$
  LANGUAGE 'sql' IMMUTABLE COST 2;

  CREATE OR REPLACE FUNCTION first_element(anyarray)
  RETURNS anyelement AS
  $$ SELECT ($1)[1]; $$
  LANGUAGE 'sql' IMMUTABLE COST 2;

  CREATE AGGREGATE first_wnull_a(anyelement) (
    SFUNC     = first_element_state,
    STYPE     = anyarray,
    FINALFUNC = first_element
  );

  select first_wnull(s)   from generate_series(1,10) s  => NULL
  select first_wnull_a(s) from generate_series(1,10) s  => 1
Re: [SQL] conditional aggregates
Something like this?

  Select min(case when X > 0 then X end)

HTH,

Marc Mamin

From: pgsql-sql-ow...@postgresql.org [mailto:pgsql-sql-ow...@postgresql.org] On Behalf Of Marcin Krawczyk
Sent: Wednesday, 8 December 2010 14:20
To: Pavel Stehule
Cc: pgsql-sql@postgresql.org
Subject: Re: [SQL] conditional aggregates

Yeah, I know, but I'm having problems creating the sfunc function for the aggregate.

regards mk

2010/12/8 Pavel Stehule

Hello

use a CASE statement http://www.postgresql.org/docs/7.4/static/functions-conditional.html

Regards Pavel Stehule

2010/12/8 Marcin Krawczyk :
> Hi list,
> Can anyone advise me on creating an aggregate that would take an additional
> parameter as a condition? For example, say I have a table like this
> id;value
> 1;45
> 2;13
> 3;0
> 4;90
> I'd like to do something like this
> SELECT min_some_cond_aggregate(value,0) FROM table
> to get the minimal value from table greater than 0, in this case 13.
> I can't do SELECT min(value) FROM table WHERE value > 0 as this will mess up
> my other computations. My current solution involves a function operating on
> the output of array_accum from the docs, but I'm looking for a more elegant
> solution.
> Is this possible at all? I'm running 8.1.
>
> regards
> mk
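For reference, on PostgreSQL 9.4 and later (so not on the poster's 8.1) the same conditional aggregate can be written with a FILTER clause; mytable stands in for the poster's table:

```sql
-- 9.4+: conditional aggregate without a CASE expression
SELECT min(value) FILTER (WHERE value > 0) AS min_positive
FROM mytable;
```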
[SQL] checking for the existence of a current_setting ?
Hello,

(Postgres 8.3)

I'm misusing the current settings in some places to store session variables. The function below retrieves such a variable, or initializes it with a default value. It works as expected, but performance is slow due to the exception block.

Is there a way to make the check more smoothly, i.e. without relying on the exception? Maybe some undocumented internal function?

many thanks,

Marc Mamin

  CREATE OR REPLACE FUNCTION public.var_get_check(int, text)
  RETURNS text AS
  $BODY$
  BEGIN
    return current_setting('public.' || $2 || pg_backend_pid());
  EXCEPTION when undefined_object then
    perform set_config('public.' || $2 || pg_backend_pid(), $1::text, false);
    return $1::text;
  END;
  $BODY$
  LANGUAGE plpgsql VOLATILE COST 100;
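For later readers: PostgreSQL 9.6 added a two-argument form, current_setting(name, missing_ok), which returns NULL instead of raising undefined_object, so no exception block is needed. A sketch (with a hypothetical variable name; COALESCE evaluates its arguments in order, so set_config only runs when the setting is absent):

```sql
-- 9.6+ sketch: check-or-initialize without an exception block
SELECT coalesce(current_setting('public.myvar', true),   -- NULL if unset
                set_config('public.myvar', 'default', false));
```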
Re: [SQL] Prevent double entries ... no simple unique index
> > Or this one:
>
> test=*# create unique index on log((case when state = 0 then 0 when
> state = 1 then 1 else null end));
> CREATE INDEX
>
> Now you can insert one '0' and one '1' value - but no more.

Hi,

A partial index would do the same, but requires less space:

  create unique index on log(state) WHERE state IN (0,1);

best regards,

Marc Mamin
Re: [SQL] reduce many loosely related rows down to one
> From: pgsql-sql-ow...@postgresql.org [pgsql-sql-ow...@postgresql.org]" on
> behalf of "Bill MacArthur [webmas...@dhs-club.com]
> Sent: Saturday, 25 May 2013 09:19
> To: pgsql-sql@postgresql.org
> Subject: [SQL] reduce many loosely related rows down to one
>
> Here is a boiled-down example of a scenario which I am having a bit of
> difficulty solving.
> This is a catchall table where all the rows are related to the "id" but are
> entered by different unrelated processes that do not necessarily have access
> to the other data bits.
>
> -- raw data now looks like this:
>
> select * from test;
>
>  id | rspid | nspid | cid | iac | newp | oldp |   ppv   |  tppv
> ----+-------+-------+-----+-----+------+------+---------+---------
>   1 |     2 |     3 |   4 | t   |      |      |         |
>   1 |     2 |     3 |     |     |  100 |      |         |
>   1 |     2 |     3 |     |     |      |  200 |         |
>   1 |     2 |     3 |     |     |      |      |         | 4100.00
>   1 |     2 |     3 |     |     |      |      |         | 3100.00
>   1 |     2 |     3 |     |     |      |      | -100.00 |
>   1 |     2 |     3 |     |     |      |      |  250.00 |
>   2 |     7 |     8 |   4 |     |      |      |         |
> (8 rows)
>
> -- I want this result (where ppv and tppv are summed and the other distinct
> values are boiled down into one row)
> -- I want to avoid writing explicit UNIONs that will break if, say, the "cid"
> was entered as a discrete row from the row containing "iac"
> -- in this example "rspid" and "nspid" are always the same for a given ID,
> however they could possibly be absent for a given row as well
>
>  id | rspid | nspid | cid | iac | newp | oldp |  ppv   |  tppv
> ----+-------+-------+-----+-----+------+------+--------+---------
>   1 |     2 |     3 |   4 | t   |  100 |  200 | 150.00 | 7200.00
>   2 |     7 |     8 |   4 |     |      |      |   0.00 |    0.00
>
> I have experimented with doing the aggregates as a CTE and then joining that
> to various incarnations of DISTINCT and DISTINCT ON, but those do not do
> what I want. Trying to find the right combination of terms to get an answer
> from Google has been unfruitful.

Hello,

If I understand you correctly, you want to perform a GROUP BY where null values are coalesced to existing not-null values. This does not seem logically feasible.
What should the result look like if your "raw" data are as follows:

   id | rspid | nspid | cid | iac | newp | oldp | ppv | tppv
  ----+-------+-------+-----+-----+------+------+-----+------
    1 |     2 |     3 |   4 | t   |      |      |     |
    1 |     2 |     3 |   5 | t   |      |      |     |
    1 |     2 |     3 |     |     |  100 |      |     |

(to which cid should newp be summed?)

regards,

Marc Mamin
Re: [SQL] Advice on re-writing a SELECT query.
> I have a query like this:
>
> SELECT
>   lpt_titulo AS tmt_titulo,
>   tmd_nombre AS tmt_nombre,
>   tmd_album AS tmt_album
>   SUM(lpt_puntos) AS tmt_puntos,
>   lpt_fuente AS tmt_fuente
> FROM listas_pre_titulos, temp_lista_titulos
> WHERE
>   listas_pre_titulos.lpt_tipo = 3 AND
>   listas_pre_titulos.lpt_titulo <> temp_lista_titulos.tmt_titulo AND
>   listas_pre_titulos.tmd_album <> temp_lista_titulos.tmt_album AND
>   listas_pre_titulos.lpt_fuente <> temp_lista_titulos.tmt_fuente
> GROUP BY
>   lpt_fuente, lpt_titulo, tmd_album
> ORDER BY tmt_puntos ASC
>
> Is it valid to re-write the FROM and WHERE statements as follows?
>
> FROM listas_pre_titulos
> INNER JOIN temp_lista_titulos ON
> (listas_pre_titulos.lpt_titulo, listas_pre_titulos.tmd_album,
> listas_pre_titulos.lpt_fuente)
> NOT IN
> (temp_lista_titulos.tmt_titulo, temp_lista_titulos.tmt_album,
> temp_lista_titulos.tmt_fuente)
> WHERE listas_pre_titulos.lpt_tipo = 3

Hello,

Your second syntax is not valid SQL, but you can achieve the same as in this example:

  create temp table a(a int, b int, c int, d int);
  create temp table b(a int, b int, c int, d int);

  select * from a join b ON ((a.a, a.b, a.c) <> (b.a, b.b, b.c))

But beware if null values are involved (1 <> NULL => NULL). In this case you can use:

  select * from a join b ON ((a.a, a.b, a.c) IS DISTINCT FROM (b.a, b.b, b.c))

regards,

Marc Mamin
Re: [SQL] reduce many loosely related rows down to one
> SELECT id,
>        (array_agg(rspid))[1] AS rspid,    -- (1)

For such cases I have created a new aggregate function:

  SELECT firstnotnull(rspid) AS rspid,

This avoids first collecting all rspid values only to keep the first one...

  CREATE OR REPLACE FUNCTION public.first_agg_nn ( anyelement, anyelement )
  RETURNS anyelement AS
  $$ SELECT $1; $$
  LANGUAGE SQL IMMUTABLE STRICT COST 1;

  CREATE AGGREGATE public.firstnotnull (
    sfunc    = public.first_agg_nn,
    basetype = anyelement,
    stype    = anyelement
  );

regards,

Marc Mamin

> -----Original Message-----
> From: pgsql-sql-ow...@postgresql.org [mailto:pgsql-sql-ow...@postgresql.org] On Behalf Of Torsten Grust
> Sent: Tuesday, 28 May 2013 17:08
> To: pgsql-sql@postgresql.org
> Subject: Re: [SQL] reduce many loosely related rows down to one
>
> On 25 May 2013, at 9:19, Bill MacArthur wrote (with possible deletions):
> > [...]
> > select * from test;
> >
> >  id | rspid | nspid | cid | iac | newp | oldp |   ppv   |  tppv
> > ----+-------+-------+-----+-----+------+------+---------+---------
> >   1 |     2 |     3 |   4 | t   |      |      |         |
> >   1 |     2 |     3 |     |     |  100 |      |         |
> >   1 |     2 |     3 |     |     |      |  200 |         |
> >   1 |     2 |     3 |     |     |      |      |         | 4100.00
> >   1 |     2 |     3 |     |     |      |      |         | 3100.00
> >   1 |     2 |     3 |     |     |      |      | -100.00 |
> >   1 |     2 |     3 |     |     |      |      |  250.00 |
> >   2 |     7 |     8 |   4 |     |      |      |         |
> > (8 rows)
> >
> > -- I want this result (where ppv and tppv are summed and the other
> > distinct values are boiled down into one row)
> > -- I want to avoid writing explicit UNIONs that will break if, say, the
> > "cid" was entered as a discrete row from the row containing "iac"
> > -- in this example "rspid" and "nspid" are always the same for a given
> > ID, however they could possibly be absent for a given row as well
> >
> >  id | rspid | nspid | cid | iac | newp | oldp |  ppv   |  tppv
> > ----+-------+-------+-----+-----+------+------+--------+---------
> >   1 |     2 |     3 |   4 | t   |  100 |  200 | 150.00 | 7200.00
> >   2 |     7 |     8 |   4 |     |      |      |   0.00 |    0.00
>
> One possible option could be
>
> SELECT id,
>        (array_agg(rspid))[1] AS rspid,    -- (1)
>        (array_agg(nspid))[1] AS nspid,
>        (array_agg(cid))[1]   AS cid,
>        bool_or(iac)          AS iac,      -- (2)
>        max(newp)             AS newp,     -- (3)
>        min(oldp)             AS oldp,     -- (4)
>        coalesce(sum(ppv), 0) AS ppv,
>        coalesce(sum(tppv),0) AS tppv
> FROM test
> GROUP BY id;
>
> This query computes the desired output for your example input.
>
> There's a caveat here: your description of the problem has been
> somewhat vague and it remains unclear how the query should respond if
> the functional dependency id -> rspid does not hold. In this case, the
> array_agg(rspid)[1] in the line marked (1) will pick one among many
> different(!) rspid values.
> I don't know your scenario well enough to judge whether this would be
> an acceptable behavior. Other possible behaviors have been implemented
> in the lines (2), (3), (4) where different aggregation functions are
> used to reduce sets to a single value (e.g., pick the largest/smallest
> of many values ...).
>
> Cheers,
>   --Torsten
>
> --
> | Torsten "Teggy" Grust
> | torsten.gr...@gmail.com
Re: [SQL] delete where not in another table
> Subject: [SQL] delete where not in another table
> DELETE FROM T1 WHERE T1.user_id NOT IN (SELECT user_id FROM T2 WHERE
> T2.user_id=T1.user_id);

The following query uses an anti join and is much faster:

  delete from t1
  where not exists (select user_id from t2 where t2.user_id = t1.user_id)

regards,

Marc Mamin
Re: [SQL] value from max row in group by
> From: pgsql-sql-ow...@postgresql.org [pgsql-sql-ow...@postgresql.org]" on
> behalf of "Venky Kandaswamy [ve...@adchemy.com]
>
> You can use Postgres WINDOW functions for this in several different ways.
> For example, one way of doing it:
>
> select stts_id,
>        last_value(stts_offset) over (partition by stts_id order by stts_offset desc)
>        + last_value(stts_duration) over (partition by stts_id order by stts_offset desc)
> from table
> group by stts_id;

Another simple solution, with DISTINCT ON:

  select distinct on (stts_id)
         stts_id, stts_offset + stts_duration
  from table
  order by stts_id, stts_offset desc

Marc Mamin

From: pgsql-sql-ow...@postgresql.org on behalf of Gary Stainburn
Sent: Thursday, July 25, 2013 10:57 AM
To: pgsql-sql@postgresql.org
Subject: Re: [SQL] value from max row in group by

As usual, once I've asked the question, I find the answer myself. However, it *feels* like there should be a more efficient way. Can anyone comment or suggest a better method?

  timetable=> select stts_id, stts_offset+stts_duration as total_duration
  timetable-> from standard_trip_sections
  timetable-> where (stts_id, stts_offset) in
  timetable->   (select stts_id, max(stts_offset) from standard_trip_sections group by stts_id);
   stts_id | total_duration
  ---------+----------------
         1 | 01:35:00
         2 | 01:35:00
         3 | 01:08:00
         4 | 01:38:00
         5 | 01:03:00
         6 | 01:06:00
  (6 rows)

  timetable=>
Re: [SQL] Need to subtract values between previous and current row
I see 3 solutions.

A) a self join

B) define a procedure that returns a set of records; this uses only a single table scan on the ordered table.

Not tested, just the general idea:

  CREATE OR REPLACE FUNCTION foo() RETURNS SETOF record AS
  ...
  DECLARE
    previous_time int8;  -- (or whatever datatype you have)
    rec  record;
    rec2 record;
  BEGIN
    FOR rec in select id, time from yourtable ORDER BY id LOOP
      select into rec2 rec.id, rec.time - previous_time;
      previous_time := rec.time;
      return next rec2;
    END LOOP;
  END;
  ...

C) embedding R in Postgres

  http://www.joeconway.com/oscon-pres-2003-1.pdf
  http://www.omegahat.org/RSPostgres/

This may be a solution to implement complex cross-row aggregations, but I never got the time to test it. I'd be interested to know which concerns this solution can show (stability, memory & CPU load, concurrent requests).

Cheers,

Marc
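On PostgreSQL 8.4 or later there is a much simpler option: the lag() window function yields the previous row's value in a single pass, with no procedure at all (sketch using the column names from option B above):

```sql
-- 8.4+: difference between current and previous row in one scan
select id,
       "time" - lag("time") over (order by id) as diff_to_previous
from yourtable;
```

The first row gets NULL, since it has no predecessor.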
[SQL] a way to generate functions dynamically ?
Hello,

I need to generate some procedures that depend on the data models stored in my DBs. As I have different models in different databases, the stored procedures will differ.

My idea is to generate the required stored procedures dynamically once a model is defined. I will probably do this within the application, but is there a way to achieve this with plpgsql?

Here is a naive try to illustrate my idea:

  CREATE OR REPLACE FUNCTION test(p1 int) RETURNS integer AS
  $BODY$
  EXECUTE '
    CREATE OR REPLACE FUNCTION generated(p2 int) RETURNS integer AS
    $BODY$
    BEGIN
      return p2 + ' || p1 || ';
    END;
    $BODY$ LANGUAGE ''plpgsql'' VOLATILE';

  ERROR: syntax error at or near "BEGIN"
  SQL state: 42601
  Character: 156

Thanks,

Marc
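The error comes from the inner $BODY$ terminating the outer function body prematurely: dollar-quote tags nest only if they differ. Using distinct tags for the two levels (and a proper plpgsql block around the EXECUTE) makes the naive example work; this is a sketch along the lines of the attempt above:

```sql
CREATE OR REPLACE FUNCTION test(p1 int) RETURNS integer AS
$outer$
BEGIN
  -- The generated function's body is quoted with $inner$,
  -- so it does not clash with the enclosing $outer$ tag.
  EXECUTE '
    CREATE OR REPLACE FUNCTION generated(p2 int) RETURNS integer AS
    $inner$
    BEGIN
      return p2 + ' || p1 || ';
    END;
    $inner$ LANGUAGE plpgsql VOLATILE';
  RETURN 0;
END;
$outer$ LANGUAGE plpgsql VOLATILE;
```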
[SQL] EXECUTE 'EXECUTE ...' or how to use dynamic prepared statements ?
Hello,

I have a large update to perform on tables which are dynamically generated (dynamic names). In this simplified example, the operation should replace in each family the "mygroup" of the item of rang=0 with the "mygroup" value of the element of rang=1 (the * indicate the modified values):

  id  family  rang  mygroup
   1    1      0       1
   2    1      1       2
   3    1      2       3
   4    1      3       4
   5    2      0       6
   6    2      1       6
   7    2      2       7
   8    2      3       7
   9    3      0      10
  10    3      1      20
  11    3      2      21

After the update:

   1    1      0       2  *
   2    1      1       2
   3    1      2       3
   4    1      3       4
   5    2      0       6
   6    2      1       6
   7    2      2       7
   8    2      3       7
   9    3      0      20  *
  10    3      1      20
  11    3      2      21

In the following function, I would like to use a prepared statement for the update command, but I get stuck with the two different meanings of EXECUTE... Is there a way to achieve this?

Thanks,

Marc

  CREATE OR REPLACE FUNCTION test_function(tablename varchar)
  RETURNS integer AS
  $BODY$
  DECLARE
    rec record;
    top_group  int;
    top_family character(16);
    top_id     int;
  BEGIN
    /* the prepared statement must be generated dynamically in order to
       include the table name. */
    EXECUTE 'PREPARE update_stmt (int, int) AS update '||tablename||' set mygroup = $1 where id = $2';

    /* using "select distinct on" allows to retrieve and sort the required
       information for the update.
       this is faster than a self join on the table */
    for rec in execute 'select DISTINCT on (family, rang) family, rang, mygroup, id from '||tablename||' where rang < 2 order by family, rang'
    loop
      IF rec.rang = 0 THEN
        top_group  := rec.mygroup;
        top_family := rec.family;
        top_id     := rec.id;
      ELSIF rec.family = top_family AND rec.mygroup <> top_group THEN
        /* Update without using a prepared statement:
           EXECUTE 'update '||tablename||' set mygroup = '||rec.mygroup||' where id = '||top_id;
        */
        -- This works, but the command has to be computed for each iteration
        EXECUTE 'EXECUTE update_stmt('||rec.mygroup||','||top_id||')';
        /* The following syntax would be fine:
           PERFORM EXECUTE update_stmt(rec.mygroup, top_id);
        */
      END IF;
    end loop;
    DEALLOCATE update_stmt;
    RETURN 0;
  END;
  $BODY$
  LANGUAGE 'plpgsql' VOLATILE;

  /* test data: === */

  --drop table test_table;
  create table test_table(id int, family int, rang int, mygroup int);
  insert into test_table values (1,1,0,1);
  insert into test_table values (2,1,1,2);
  insert into test_table values (3,1,2,3);
  insert into test_table values (4,1,3,4);
  insert into test_table values (5,2,0,6);
  insert into test_table values (6,2,1,6);
  insert into test_table values (7,2,2,7);
  insert into test_table values (8,2,3,7);
  insert into test_table values (9, 3,0,10);
  insert into test_table values (10,3,1,20);
  insert into test_table values (11,3,2,21);

  select test_function('test_table');
  select * from test_table order by id;

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
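Since PostgreSQL 8.4, EXECUTE ... USING passes parameters to a dynamic statement directly, which removes the need for the inner PREPARE/EXECUTE dance entirely (a sketch based on the function above; the dynamic statement is still planned per call, but no string interpolation of values is needed):

```sql
-- 8.4+ sketch: inside the loop, instead of EXECUTE 'EXECUTE update_stmt(...)'
EXECUTE 'update ' || tablename || ' set mygroup = $1 where id = $2'
  USING rec.mygroup, top_id;
```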
Re: [SQL] EXECUTE 'EXECUTE ...' or how to use dynamic prepared statements ?
Hello Dirk,

I have to disagree. Your first update query is very slow; it probably implies running the sub-select statement for each row to be updated.

The following update statement is already much faster (using UPDATE ... FROM):

  update test_table
  set mygroup = t.mygroup
  from test_table as t
  where t.family = test_table.family
    and t.rang = 1
    and test_table.rang = 0
    -- perform the update only when required
    and test_table.mygroup <> t.mygroup;

But when you are dealing with "parent - child" relations within a single table, as in my case, a single table scan with SELECT DISTINCT ON and a row-by-row comparison on the result set appears to be faster.

I tested both approaches on tables with ca. 14,000,000 rows where 25% of them needed to be updated. The above update statement ran in 5h30' where my function did the job in 2h. (As my tables are very large, much time is lost in I/O wait.)

Cheers,

Marc
Re: [SQL] SQL stored function inserting and returning data in a row.
> What about
> $$
> INSERT INTO <your_table> ...;
> select currval('seq_matchmaking_session_id');
> $$ language sql;
>
> ?

Hello,

I'm not sure that this would return the correct id in case of concurrent calls to your function.

I'm using the following kind of function to manage reference tables:

HTH,

Marc Mamin

  CREATE TABLE xxx (
    id serial NOT NULL,
    mycolumn character varying,
    CONSTRAINT xxx_pk PRIMARY KEY (id),
    CONSTRAINT xxx_uk UNIQUE (mycolumn)
  );

  CREATE OR REPLACE FUNCTION get_or_insert_id_xxx(input_value varchar)
  RETURNS INT AS
  $$
  DECLARE
    id_value int;
  BEGIN
    select into id_value id from xxx where mycolumn = input_value;
    IF FOUND THEN
      return id_value;
    ELSE
      insert into xxx (mycolumn) values (input_value);
      return (select id from xxx where mycolumn = input_value);
    END IF;
  EXCEPTION WHEN unique_violation THEN
    return (select id from xxx where mycolumn = input_value);
  END;
  $$ LANGUAGE plpgsql;
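For readers on PostgreSQL 9.5 or later, the whole select-insert-retry logic collapses into a single race-free statement (sketch against the xxx table above; the no-op DO UPDATE makes RETURNING yield the id even when the row already existed):

```sql
-- 9.5+: get-or-insert in one statement
insert into xxx (mycolumn) values ('some value')
on conflict (mycolumn) do update set mycolumn = excluded.mycolumn
returning id;
```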
Re: [SQL] SQL question: Highest column value of unique column pairs
Hello Kevin,

I would use "select distinct on" to first isolate the candidates in (1) and (2), and then reiterate the query on this sub-result. (The query below will retrieve the last score, not the best one...)

Something like (not tested):

  select distinct on (date, name) date, name, score
  from (
    select distinct on (date, LName1) date, LName1 as name, score1 as score
    from table order by date desc, LName1
    union all
    select distinct on (date, LName2) date, LName2 as name, score2 as score
    from table order by date desc, LName2
  ) foo
  order by date desc, name

regards,

Marc Mamin

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Kevin Jenkins
Sent: Saturday, January 12, 2008 1:10 AM
To: pgsql-sql@postgresql.org
Subject: [SQL] SQL question: Highest column value of unique column pairs

Hi,

I have the following table which holds the results of 1-on-1 matches:

  FName1, LName1, Score1, FName2, LName2, Score2, Date
  John,   Doe,    85      Bill,   Gates,  20      Jan 1
  John,   Archer, 90      John,   Doe,    120     Jan 5
  Bob,    Barker, 70      Calvin, Klien,  8       Jan 8
  John,   Doe,    60      Bill,   Gates,  25      Jan 3

So columns 1 and 2 hold the first person. Column 3 holds his score. Columns 4 and 5 hold the second person. Column 6 holds his score.

I want to return the most recent score for each person (be they an opponent or myself). And the resulting table shouldn't care if they are person 1 or 2. So the end result would be:

  FName,  LName,  Score, Date
  John,   Doe,    120    Jan 5
  John,   Archer, 90     Jan 5
  Bob,    Barker, 70     Jan 8
  Bill,   Gates,  25     Jan 3
  Calvin, Klien,  8      Jan 8

Thanks for any help!
Re: [SQL] dynamic events categorization
Hello,

I guess that the time offsets (now-21 and now-28) are evaluated each time the corresponding condition is met. It may be faster to put them into a separate sub-query. I'm not sure about putting "now" itself within the sub-query...

It may also be better to put your query in a procedure, where you can put these constants into variables instead of using a sub-query.

Depending on the distribution of a) 2_past, 1_future, 0_current and '' and b) t.type, it may be worth having different queries, bound with UNION ALL. This would simplify the CASE construct, and at least part of the tests should happen on indexes only.

If the query is run very often, you may want to add a boolean column is_past to show_date, and have a separate job that sets the concerned records to true every x minutes...

HTH,

Marc Mamin

  SELECT s.id_event_subtype, s.subtype, t.id_event_type, t.type,
         e.id_event, e.created_by, e.created_on, e.modified_by, e.modified_on,
         e.id_image, e.show_name, e.length,
         d.id_date, d.start_date, d.end_date, d.low_price, d.high_price,
         d.id_location, d.showtime,
         CASE
           WHEN d.start_date <= 'now'::text::date
                AND CASE WHEN t.type = 'movie'::text THEN d.start_date >= c.a
                         WHEN t.type = 'book'::text  THEN e.created_on  >= c.b
                         ELSE d.end_date >= 'now'::text::date OR d.end_date IS NULL
                    END
             THEN '0_current'::text
           WHEN d.start_date > 'now'::text::date THEN '1_future'::text
           WHEN d.start_date IS NOT NULL THEN '2_past'::text
           ELSE ''::text
         END AS timing
  FROM
    -- added sub query:
    (select 'now'::text::date - 21 as a, 'now'::text::date - 28 as b) c,
    event e
    NATURAL JOIN event_type2 t
    LEFT JOIN event_subtype2 s USING (id_event_subtype)
    LEFT JOIN show_date d USING (id_event);
Re: [SQL] exclude part of result
Hi,

Two other ideas...

SELECT DISTINCT p.a, p.b, p.c, now(), count(item.item_pk)
FROM product p
JOIN (select distinct a, b, c from product
      except
      select distinct a, b, c from navigation
     ) foo USING (a, b, c)
LEFT JOIN item ON item.product_fk = product_pk
WHERE ...
GROUP BY p.a, p.b, p.c

or maybe

SELECT DISTINCT foo.*, now(), count(item.item_pk)
FROM (select distinct a, b, c from product
      WHERE ...
      except
      select distinct a, b, c from navigation
     ) foo
LEFT JOIN item ON item.product_fk = product_pk
WHERE ...
GROUP BY foo.a, foo.b, foo.c

HTH,

Marc Mamin
[SQL] varchar::timezone conversion
Hi,

I'm facing an issue where I build timestamps from a set of strings. In this non-working example, I'm looking for a way to cast my strings to the expected 'timestamp' and timezone types...

CREATE OR REPLACE FUNCTION "public"."timestamp_to_utcms" (varchar, varchar)
RETURNS int8 AS
$body$
  SELECT EXTRACT(EPOCH FROM $1 AT TIME ZONE $2)::int8*1000;
$body$
LANGUAGE 'sql' IMMUTABLE RETURNS NULL ON NULL INPUT SECURITY INVOKER;

Thanks for your help,

Marc Mamin
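One possible fix, sketched here as an untested assumption: cast the varchar argument to timestamp explicitly before applying AT TIME ZONE, so the expression operates on a timestamp value rather than on a string:

```sql
CREATE OR REPLACE FUNCTION "public"."timestamp_to_utcms" (varchar, varchar)
RETURNS int8 AS
$body$
  -- assumes $1 holds a parseable timestamp literal (e.g. '2009-01-26 15:00:00')
  -- and $2 a time zone name (e.g. 'Europe/Zurich')
  SELECT (EXTRACT(EPOCH FROM $1::timestamp AT TIME ZONE $2) * 1000)::int8;
$body$
LANGUAGE 'sql' IMMUTABLE RETURNS NULL ON NULL INPUT SECURITY INVOKER;
```

For example: SELECT timestamp_to_utcms('2009-01-26 15:00:00', 'UTC');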
Re: [SQL] Query how-to
Hi,

What about something like this?

select adate, sum(openedCount) as openedCount, sum(closedCount) as closedCount
from
(
  select sum(case when status = 'Closed' then stop_date else start_date end) as adate,
         sum(case when status = 'Closed' then 1 else 0 end) as closedCount,
         sum(case when status = 'New' then 1 else 0 end) as openedCount
  from Yourtable
  where status in ('Closed', 'New')
) x
group by adate
order by adate

HTH,

Marc
Re: [SQL] Query how-to
That was silly of me! It should naturally look like this:

select case when status = 'Closed' then stop_date else start_date end as adate,
       sum(case when status = 'Closed' then 1 else 0 end) as closedCount,
       sum(case when status = 'New' then 1 else 0 end) as openedCount
from Yourtable
where status in ('Closed', 'New')
group by case when status = 'Closed' then stop_date else start_date end
order by adate

Marc

> Hi,
> What about something like this?
> select adate, sum(openedCount) as openedCount, sum(closedCount) as closedCount
> from
> (
>   select sum(case when status = 'Closed' then stop_date else start_date end) as adate,
>          sum(case when status = 'Closed' then 1 else 0 end) as closedCount,
>          sum(case when status = 'New' then 1 else 0 end) as openedCount
>   from Yourtable
>   where status in ('Closed', 'New')
> ) x
> group by adate
> order by adate
[SQL] aggregation problem: first/last/count(*)
Hello,

I have a query to aggregate data which is too slow :-)

Here a simplified example:

create table test
(
  time int8,        -- store the time as epoch
  a_group varchar,
  category varchar
)

For each group, I need the first/last times and categories, the number of distinct categories and the number of records.

Here my best solution until now:

SELECT
  FIRST.a_group,
  FIRST.time as first_time,
  FIRST.category as first_category,
  LAST.time as last_time,
  LAST.category as last_category,
  AGG.c_count,
  AGG.c_all
FROM
  (select distinct on (a_group) a_group, time, category
   from test order by a_group, time) FIRST,
  (select distinct on (a_group) a_group, time, category
   from test order by a_group, time DESC) LAST,
  (select a_group, count(distinct category) as c_count, count(*) as c_all
   from test group by a_group order by a_group) AGG
where FIRST.a_group = LAST.a_group
  and LAST.a_group = AGG.a_group

Each sub-query is quite fast -- thanks for the DISTINCT ON feature :-) -- but the whole is really slow, as Postgres starts to swap due to the large amount of data to merge.

I guess there must be a better solution, since the three sub-queries return exactly one row for each 'a_group' and are moreover already sorted (the table does not contain any NULL value). But in the query plan below, most of the cost comes from the merges.

I imagine there must be a way using custom aggregate functions, but I'm not confident with those: is it possible to define aggregates that retrieve the first/last values of an ordered result set? This would allow a single scan of the table. Something like:

select a_group,
       first(category) as first_category,
       last(category) as last_category,
       ...
from test
order by a_group, time

Many thanks for any hints.
Marc Mamin

Here are some dummy values if you'd like to play with this issue:

insert into test select s,'G'||s , 'C1' from(select generate_series(1,1)as s)s;
insert into test select s+10,'G'||s , 'C2' from(select generate_series(1,1)as s)s;
insert into test select s+13,'G'||s , 'C3' from(select generate_series(1,1)as s)s;
insert into test select s+1,'G'||s , 'C2' from(select generate_series(1,1,5)as s)s;
insert into test select s,'G'||s%10 , 'C3' from(select generate_series(1,1,5)as s)s;
insert into test select s+1,'G'||s%5 , 'C2' from(select generate_series(1,1,5)as s)s;
insert into test select s+1,'G'||s , 'C1' from(select generate_series(1,100)as s)s; --10^6 !!

create index test_i on test(a_group);
analyze test;

=>

Merge Join (cost=259000.31..34904377039.75 rows=1550421099181 width=128)
  Merge Cond: ((test.a_group)::text = (last.a_group)::text)
  ->  Merge Join (cost=129500.16..17814340.14 rows=783387153 width=120)
        Merge Cond: ((test.a_group)::text = (test.a_group)::text)
        ->  GroupAggregate (cost=0.00..53681.23 rows=395825 width=10)
              ->  Index Scan using test_i on test (cost=0.00..39973.53 rows=1036043 width=10)
        ->  Materialize (cost=129500.16..133458.41 rows=395825 width=72)
              ->  Unique (cost=119965.87..125146.08 rows=395825 width=18)
                    ->  Sort (cost=119965.87..122555.97 rows=1036043 width=18)
                          Sort Key: test.a_group, test."time"
                          ->  Seq Scan on test (cost=0.00..16451.43 rows=1036043 width=18)
  ->  Materialize (cost=129500.16..133458.41 rows=395825 width=72)
        ->  Subquery Scan last (cost=119965.87..129104.33 rows=395825 width=72)
              ->  Unique (cost=119965.87..125146.08 rows=395825 width=18)
                    ->  Sort (cost=119965.87..122555.97 rows=1036043 width=18)
                          Sort Key: test.a_group, test."time"
                          ->  Seq Scan on test (cost=0.00..16451.43 rows=1036043 width=18)
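For reference, first/last can indeed be defined as custom aggregates. A sketch of the widely published recipe (the single-scan query below relies on the row order delivered by the sorted sub-query, which PostgreSQL does not strictly guarantee for aggregates):

```sql
-- STRICT makes PostgreSQL replace the initial NULL state with the
-- first non-null input, so first_agg keeps that first value forever
CREATE FUNCTION first_agg(anyelement, anyelement)
RETURNS anyelement AS $$ SELECT $1; $$
LANGUAGE sql IMMUTABLE STRICT;

-- last_agg always forwards the newest input value
CREATE FUNCTION last_agg(anyelement, anyelement)
RETURNS anyelement AS $$ SELECT $2; $$
LANGUAGE sql IMMUTABLE STRICT;

CREATE AGGREGATE first (anyelement) (SFUNC = first_agg, STYPE = anyelement);
CREATE AGGREGATE last  (anyelement) (SFUNC = last_agg,  STYPE = anyelement);

-- single scan over an ordered sub-query
SELECT a_group,
       first(time)              AS first_time,
       first(category)          AS first_category,
       last(time)               AS last_time,
       last(category)           AS last_category,
       count(distinct category) AS c_count,
       count(*)                 AS c_all
FROM (SELECT * FROM test ORDER BY a_group, time) t
GROUP BY a_group;
```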
Re: [SQL] aggregation problem: first/last/count(*)
> I think the problem in here is that you want to collect the first and last values in the same row

Your idea is ok, but it just postpones the problem, and I need the result within the DB for further calculations/aggregations. What I need is really something like:

test=# SELECT foo.ts, foo.grp, foo.val, foo2.val
       FROM (SELECT grp, MAX(ts) AS max_ts, MIN(ts) AS min_ts
             FROM foo GROUP BY grp) AS bar
       INNER JOIN foo  ON foo.grp  = bar.grp AND foo.ts  = bar.min_ts
       INNER JOIN foo2 ON foo2.grp = bar.grp AND foo2.ts = bar.max_ts

I've tested different solutions, and the one with the DISTINCT ON clause was better. (I guess the best solution depends on the distribution of grp and val.)

I've also just found aggregate functions for first/last:
http://www.postgresonline.com/journal/index.php?/archives/68-More-Aggregate-Fun-Whos-on-First-and-Whos-on-Last.html

But it is slightly slower than my solution. I'll still make a test with more data, as I guess that swapping will grow faster with my query than with the first/last aggregate functions.

cheers,

Marc Mamin

-Original Message-
From: Volkan YAZICI [mailto:yazic...@ttmail.com]
Sent: Monday, January 26, 2009 4:27 PM
To: Marc Mamin
Cc: pgsql-sql@postgresql.org
Subject: Re: aggregation problem: first/last/count(*)

On Mon, 26 Jan 2009, "Marc Mamin" writes:
> create table test
> (
>   time int8, -- store the time as epoch
>   a_group varchar,
>   category varchar
> )
>
> ...
>
> SELECT
>   FIRST.a_group,
>   FIRST.time as first_time,
>   FIRST.category as first_category,
>   LAST.time as last_time,
>   LAST.category as last_category,
>   AGG.c_count,
>   AGG.c_all
> FROM
> ...

I think the problem in here is that you want to collect the first and last values in the same row. Instead, splitting them into two sequential rows would suit your database schema design better, and you can rebuild the data structure as you want in the application tier later.
For instance, consider the below example:

test=# SELECT ts, grp, val FROM foo;
 ts | grp | val
----+-----+-----
  1 |   1 |   1
  2 |   1 |   2
  3 |   1 |   3
  4 |   2 |   1
  4 |   2 |   2
  5 |   3 |   1
(6 rows)

test=# SELECT foo.ts, foo.grp, foo.val
       FROM (SELECT grp, MAX(ts) AS max_ts, MIN(ts) AS min_ts
             FROM foo GROUP BY grp) AS bar
       INNER JOIN foo ON foo.grp = bar.grp
                     AND (foo.ts = bar.min_ts OR foo.ts = bar.max_ts);
 ts | grp | val
----+-----+-----
  1 |   1 |   1
  3 |   1 |   3
  4 |   2 |   1
  4 |   2 |   2
  5 |   3 |   1
(5 rows)

After receiving the above output, you can traverse the returned rows one by one in the application layer and output the desired results.

Regards.
[SQL] select regexp_matches('a a a', '([a-z]) a','g');
Hello,

I wonder if someone has an idea for this problem:

I have a string that contains a series of characters, separated by single spaces, e.g. 'a b x n r a b c b'.

Given such a string, I'd like to get a list of all predecessors of a given character. In the example, the predecessors of 'b' are a, a, c. If I instead have the string 'a a a', the predecessors of 'a' are a, a.

I tried to use regexp_matches for this:

select regexp_matches('a a a', '([a-z]) a', 'g');

=> {a} only

As the pattern consumes the first two 'a's for the first match, only the trailing ' a' remains to be searched for further matches...

Cheers,

Marc Mamin
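A possible workaround (an assumption, not from the original thread): PostgreSQL's advanced regular expressions support lookahead constraints, so the trailing character can be tested without being consumed, which keeps overlapping matches:

```sql
-- (?= a) is a positive lookahead: it requires a following ' a'
-- but does not consume it, so the next search resumes right after the match
select regexp_matches('a a a', '([a-z])(?= a)', 'g');
-- one row per match: {a}, {a}

-- predecessors of 'b':
select regexp_matches('a b x n r a b c b', '([a-z])(?= b)', 'g');
-- one row per match: {a}, {a}, {c}
```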