Re: [GENERAL] table / query as a prameter for PL/pgSQL function

2011-08-09 Thread Merlin Moncure
2011/8/8 Ondrej Ivanič ondrej.iva...@gmail.com:
 Hi,

 2011/8/9 Merlin Moncure mmonc...@gmail.com:
 You have a few of different methods for passing sets between functions.

 I do not want to pass data between functions. The ideal solution
 should look like this:
 select * from my_map_func(select query)

well, the method still applies: you'd just do:
select * from my_map_func(array(select query that grabs foo_t type))

...but, it sounds like that method is not appropriate -- see below.

 1) refcursor as David noted.  reasonably fast. however, I find the
 'FETCH' mechanic a little inflexible.

 I've came across this but manual example wasn't (isn't) clear to me:
 CREATE TABLE test (col text);
 INSERT INTO test VALUES ('123');

 CREATE FUNCTION reffunc(refcursor) RETURNS refcursor AS '
 BEGIN
    OPEN $1 FOR SELECT col FROM test;
    RETURN $1;
 END;
 ' LANGUAGE plpgsql;

 BEGIN;
 SELECT reffunc('funccursor');
 FETCH ALL IN funccursor;
 COMMIT;

 What is the funccursor?

funccursor is the name -- just a string.  refcursors can be named with
a variable string and later fetched as an identifier -- they are kinda
unique in that respect.

 2) stage data to TABLE/TEMP TABLE; extremely flexible, but can be a
 headache because a non temp table can get thrashed pretty hard a and a
 'TEMP' can cause severe function plan invalidation issues if you're
 not careful

 I'm not familiar with this issue (function plan invalidation issues).
 Could you please provide more details/links about it?

well, in your particular case it's probably not so much of an issue.
plpgsql, when a function is executed for the first time in a session,
'compiles' the source code into a plan that is kept around until it
invalidates.  one of the things that causes a plan to invalidate is a
table getting dropped that is inside the plan -- temp tables are
notorious for doing that (in older postgres we'd get annoying OID
errors).  if your application is even partially cpu bound, and you
have a lot of plpgsql flying around,  that can add up in a surprising
hurry.  temp tables also write to the system catalogs, so if your
function calls are numerous, short, and sweet, array passing is the
way to go because it's a completely in-memory structure that can be
used like a set (via unnest) without those issues.  for 'big' data
though, it's not good.

 3) arrays of composites -- the most flexible and very fast for *small*
 amounts of records (say less than 10,000):

 My data set is huge: between 1 and 5 mil rows and avg row size is 100
 - 400 bytes

your best bet is probably a cursor IMO.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] table / query as a prameter for PL/pgSQL function

2011-08-08 Thread Merlin Moncure
2011/8/7 Ondrej Ivanič ondrej.iva...@gmail.com:
 Hi,

 It is possible to pass query result (or cursor?) as function
 parameter? I need a function which emits zero or more rows per input
 row (map function from mapreduce paradigm). Function returns record
 (or array): (value1, value2, value3)
 I've tried the following:

 1) create or replace function test (r record) returns setof record as $$ ...
 Doesn't work: PL/pgSQL functions cannot accept type record

 2) pass query as text parameter and open no scroll cursor inside the function
 It works but it's ugly.

 3) hardcode the query inside function
 Similar to (2) and looks better but I need several functions with
 different queries inside:
 ...
 for r in (query) loop
    ...
 end loop;
 ...

 4) use function in select clause:
 select my_map_func(col1, col2, col3, col4) from ... -- the rest of the query
 In this case I wasn't able figure out how to access record members
 returned by the function:

 select ?, ?, ?, count(*) from (
   select my_map_func(col1, col2, col3, col4) as map_func_result from ...
 ) as map
 group by 1, 2, 3

 The '?' should be something like map.map_func_result.value1 (both
 map.value1 and map_func_result.value1 doesn't not work). If function
 returns array then I can access value1 by using map_func_result[1]

 Is there a better way how to solve this? I'm kind of satisfied with 4
 (maybe 3) but it is little bit cumbersome

You have a few of different methods for passing sets between functions.
1) refcursor as David noted.  reasonably fast. however, I find the
'FETCH' mechanic a little inflexible.
2) stage data to TABLE/TEMP TABLE; extremely flexible, but can be a
headache because a non temp table can get thrashed pretty hard a and a
'TEMP' can cause severe function plan invalidation issues if you're
not careful
3) arrays of composites -- the most flexible and very fast for *small*
amounts of records (say less than 10,000):

#3 is my favorite method unless the data being passed is very large.
Here is an example of it in use:

CREATE TYPE foo_t as (a int, b text);
CREATE FUNCTION get_foos() RETURNS SETOF foo_t AS
$$
BEGIN
  RETURN QUERY SELECT 1, 'abc' UNION ALL SELECT 2, 'def';
END;
$$ LANGUAGE PLPGSQL;

CREATE FUNCTION do_foos(_foos foo_t[]) returns VOID AS
$$
DECLARE
  f foo_t;
BEGIN
  FOR f in SELECT * FROM UNNEST(_foos)
  LOOP
RAISE NOTICE '% %', f.a, f.b;
  END LOOP;
END;
$$ LANGUAGE PLPGSQL;

postgres=# SELECT do_foos(ARRAY(SELECT (a,b)::foo_t FROM get_foos()));
NOTICE:  1 abc
NOTICE:  2 def
 do_foos
-

(1 row)

Also, if you are deploying vs 9.1, be sure to check out Pavel's
for-in-array which is better method to do the unnest() which expands
the array.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] table / query as a prameter for PL/pgSQL function

2011-08-08 Thread Ondrej Ivanič
Hi,

2011/8/9 Merlin Moncure mmonc...@gmail.com:
 You have a few of different methods for passing sets between functions.

I do not want to pass data between functions. The ideal solution
should look like this:
select * from my_map_func(select query)

 1) refcursor as David noted.  reasonably fast. however, I find the
 'FETCH' mechanic a little inflexible.

I've came across this but manual example wasn't (isn't) clear to me:
CREATE TABLE test (col text);
INSERT INTO test VALUES ('123');

CREATE FUNCTION reffunc(refcursor) RETURNS refcursor AS '
BEGIN
OPEN $1 FOR SELECT col FROM test;
RETURN $1;
END;
' LANGUAGE plpgsql;

BEGIN;
SELECT reffunc('funccursor');
FETCH ALL IN funccursor;
COMMIT;

What is the funccursor?

Anyway, I will try to use something like this
CREATE FUNCTION my_map_func(refcursor) returns setof text[] as $$
...
$$ LANGUAGE 'plpgsql';

BEGIN;
DECLARE my_cursor NO SCROLL CURSOR FOR query;
SELECT * FROM my_map_func(my_cursor);
COMMIT;

I'll keep you posted.

 2) stage data to TABLE/TEMP TABLE; extremely flexible, but can be a
 headache because a non temp table can get thrashed pretty hard a and a
 'TEMP' can cause severe function plan invalidation issues if you're
 not careful

I'm not familiar with this issue (function plan invalidation issues).
Could you please provide more details/links about it?

 3) arrays of composites -- the most flexible and very fast for *small*
 amounts of records (say less than 10,000):

My data set is huge: between 1 and 5 mil rows and avg row size is 100
- 400 bytes

-- 
Ondrej Ivanic
(ondrej.iva...@gmail.com)

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] table / query as a prameter for PL/pgSQL function

2011-08-07 Thread David Johnston
On Aug 7, 2011, at 23:24, Ondrej Ivanič ondrej.iva...@gmail.com wrote:

 Hi,
 
 It is possible to pass query result (or cursor?) as function
 parameter? I need a function which emits zero or more rows per input
 row (map function from mapreduce paradigm). Function returns record
 (or array): (value1, value2, value3)
 I've tried the following:
 
 1) create or replace function test (r record) returns setof record as $$ ...
 Doesn't work: PL/pgSQL functions cannot accept type record
 

From the docs you can try using the refcursor data type though I have never 
done so myself.
 
 
 4) use function in select clause:
 select my_map_func(col1, col2, col3, col4) from ... -- the rest of the query
 In this case I wasn't able figure out how to access record members
 returned by the function:
 
 select ?, ?, ?, count(*) from (
   select my_map_func(col1, col2, col3, col4) as map_func_result from ...
 ) as map
 group by 1, 2, 3
 
 The '?' should be something like map.map_func_result.value1 (both
 map.value1 and map_func_result.value1 doesn't not work). If function
 returns array then I can access value1 by using map_func_result[1]
 

Try  (map.map_func_result).value1  - the parenthesis around the table alias 
and column are necessary.

 Is there a better way how to solve this? I'm kind of satisfied with 4
 (maybe 3) but it is little bit cumbersome
 
 Thanks,
 -- 
 Ondrej Ivanic
 (ondrej.iva...@gmail.com)
 
 -- 
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general