Execute command in PL/pgSQL function not executing

2021-11-18 Thread Abdul Mohammed
Hello everyone,
Please I am having a problem with a function I am writing. The first part
uses a loop that pulls the values from a column and concatenates them into
a string. This first part works fine. The second part tries to use the
string to build a pivot table using the crosstab function. The function is
as follows:

CREATE OR REPLACE FUNCTION field_values_ct ()
RETURNS VOID AS $$
DECLARE rec RECORD;
DECLARE str text;
BEGIN
str := '"participant_id" integer,';
   -- looping to get column heading string
   FOR rec IN SELECT DISTINCT text
FROM question
ORDER BY text
LOOP
str :=  str || '"' || rec.text || '" text' ||',';
END LOOP;
str:= substring(str, 0, length(str));

EXECUTE 'SELECT *
FROM crosstab(''select sp.participant_id, distinct qu.text, sr.text
from survey_progress sp
join question qu
on sp.survey_id = qu.survey_id
join survey_response sr
on qu.id = sr.question_id
where qu.question_type_id = 8
order by 1,2'')

 AS final_result ('|| str ||')';
RAISE NOTICE 'Got to the end of the function';
END;
$$ LANGUAGE plpgsql;

The Execute Select statement doesn't seem to execute. There aren't any
error or hint messages either. It only prints a context message as follows:

CONTEXT:  PL/pgSQL function field_values_ct() line 15 at EXECUTE

Please I would be very grateful for any hints as to what I could be doing
wrong.

Regards


Re: Execute command in PL/pgSQL function not executing

2021-11-18 Thread Pavel Stehule
On Thu, 18 Nov 2021 at 12:24, Abdul Mohammed wrote:

> Hello everyone,
> Please I am having a problem with a function I am writing. The first part
> uses a loop that pulls the values from a column and concatenates them into
> a string. This first part works fine. The second part tries to use the
> string to build a pivot table using the crosstab function. The function is
> as follows:
>
> [function definition snipped - same as above]
>
> The Execute Select statement doesn't seem to execute. There aren't any
> error or hint messages either. It only prints a context message as follows:
>
> CONTEXT:  PL/pgSQL function field_values_ct() line 15 at EXECUTE
>
> Please I would be very grateful for any hints as to what I could be doing
> wrong.
>

This is not MS SQL - the result of the last query is not the result of the
function.

When you want to see the result, you should use a RETURN statement - in this
case RETURN QUERY EXECUTE - and your function should return SETOF text
instead of VOID.

Regards

Pavel Stehule

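Following Pavel's suggestion, here is a sketch of how the function could be
reshaped (table and column names are taken from the original post; the stray
`distinct` is also moved to the front of the inner select, since
`select x, distinct y` is not valid SQL; building the column list by
appending `, "name"` avoids the trailing-comma trim as well):

```sql
CREATE OR REPLACE FUNCTION field_values_ct()
RETURNS SETOF record AS $$
DECLARE
    rec RECORD;
    str text := '"participant_id" integer';
BEGIN
    -- build the column definition list for crosstab's output
    FOR rec IN SELECT DISTINCT text FROM question ORDER BY text LOOP
        str := str || ', "' || rec.text || '" text';
    END LOOP;

    -- RETURN QUERY EXECUTE hands the rows of the dynamic query back to
    -- the caller instead of silently discarding them
    RETURN QUERY EXECUTE
        'SELECT * FROM crosstab(
            ''select distinct sp.participant_id, qu.text, sr.text
              from survey_progress sp
              join question qu on sp.survey_id = qu.survey_id
              join survey_response sr on qu.id = sr.question_id
              where qu.question_type_id = 8
              order by 1,2'')
         AS final_result (' || str || ')';
END;
$$ LANGUAGE plpgsql;
```

Because the function returns SETOF record, the caller still has to attach a
matching column definition list at call time, e.g.
`SELECT * FROM field_values_ct() AS t("participant_id" integer, "Q1" text);`
- the usual awkwardness of a fully dynamic crosstab.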


General Performance Question

2021-11-18 Thread DAVID ROTH
I am working on a large Oracle to Postgres migration.
The existing code frequently constructs a string and then uses Oracle's 
"EXECUTE IMMEDIATE" to run it.
"EXECUTE" has the same functionality in Postgres.

For example:
CREATE or REPLACE FUNCTION djr_foo_fnc (p_emp_no IN number)
RETURN VARCHAR2
AS
v_sql VARCHAR2(1000);
v_name VARCHAR2(30);
BEGIN
v_sql :='SELECT name FROM employees';
v_sql := v_sql ||' WHERE employee_number = '||p_emp_no;
EXECUTE IMMEDIATE v_sql INTO v_name;
RETURN v_name;
END;
/

CREATE or REPLACE FUNCTION djr_foo_fnc (p_emp_no IN number)
RETURN VARCHAR2
AS
v_name VARCHAR2(30);
BEGIN
SELECT name INTO v_name FROM employees
WHERE employee_number = p_emp_no;
RETURN v_name;
END;
/

These are oversimplified samples of some very complex queries I need to migrate.

How does the Postgres optimizer handle these 2 formats?
Which format is likely to perform better?
Thanks
Dave

Re: General Performance Question

2021-11-18 Thread Thomas Kellerer
DAVID ROTH wrote on 18.11.2021 at 15:15:
> I am working on a large Oracle to Postgres migration.
> The existing code frequently constructs a string and then uses Oracle's 
> "EXECUTE IMMEDIATE" to run it.
> "EXECUTE" has the same functionality in Postgres.
>
> For example:
> CREATE or REPLACE FUNCTION djr_foo_fnc (p_emp_no IN number)
> RETURN VARCHAR2
> AS
> v_sql VARCHAR2(1000);
> v_name VARCHAR2(30);
> BEGIN
> v_sql :=            'SELECT name FROM employees';
> v_sql := v_sql ||' WHERE employee_number = '||p_emp_no;
> EXECUTE IMMEDIATE v_sql INTO v_name;
> RETURN v_name;
> END;
> /
>
> CREATE or REPLACE FUNCTION djr_foo_fnc (p_emp_no IN number)
> RETURN VARCHAR2
> AS
> v_name VARCHAR2(30);
> BEGIN
> SELECT name INTO v_name FROM employees
> WHERE employee_number = p_emp_no;
> RETURN v_name;
> END;
> /
>
> These are oversimplified samples of some very complex queries I need to 
> migrate.
>
> How does the Postgres optimizer handle these 2 formats?
> Which format is likely to perform better?

The query does not use any dynamic parts, so EXECUTE is not needed to begin
with (not even in the Oracle implementation).

For functions that just encapsulate a SQL query, a "language sql" function
might be better:

CREATE OR REPLACE FUNCTION djr_foo_fnc (p_emp_no integer)
  RETURNS text
AS
$$
  SELECT name
  FROM employees
  WHERE employee_number = p_emp_no;
$$
language sql;

(Note: a ROWS clause is only allowed on functions that return a set, so it
is omitted here for the scalar version.)


They are optimized together with the calling query, which can be an advantage,
though not in all cases.

If the function is called frequently, avoiding the overhead of PL/pgSQL can
make a difference though.
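For example, assuming the `employees` table from the post, the
single-statement SQL function is simply called inline, and the planner can
fold ("inline") its body into the calling query:

```sql
-- Calling the function; for a simple single-SELECT "language sql" body
-- the planner can inline it, so it behaves like the plain query below
-- and can use an index on employee_number.
SELECT djr_foo_fnc(1234);

-- Roughly what the planner sees after inlining:
SELECT name FROM employees WHERE employee_number = 1234;
```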





check scripts after database code change

2021-11-18 Thread Dennis
Hi,

Are there any scripts that we can check after the changes we made in the
database code?
In summary, are there any automatic post-development scripts before they
come to the beta stage?

Dennis


Re: General Performance Question

2021-11-18 Thread DAVID ROTH
Agreed.
The code I am migrating uses the EXECUTE IMMEDIATE form for almost everything.
The path of least resistance for me is to just copy the code to Postgres and
change "EXECUTE IMMEDIATE" to "EXECUTE".
I am asking about performance differences to see if it is worth converting
code that does not have any dynamic elements.

I am also hoping to get a better understanding of the way the Postgres 
optimizer works. Any links you could suggest for this would be appreciated.

> On 11/18/2021 9:27 AM Thomas Kellerer wrote:
>
> The query does not use any dynamic parts, so EXECUTE is not needed to begin
> with (not even in the Oracle implementation).
>
> [rest of quoted message snipped]

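The performance difference discussed above can be sketched as follows
(a hypothetical `employees` table; the static form lets PL/pgSQL cache a
prepared plan per session, while plain EXECUTE re-parses and re-plans the
statement on every call):

```sql
-- Static SQL inside PL/pgSQL: the statement is prepared once per session
-- and the cached plan can be reused on subsequent calls.
CREATE OR REPLACE FUNCTION emp_name_static(p_emp_no integer)
RETURNS text AS $$
DECLARE v_name text;
BEGIN
    SELECT name INTO v_name FROM employees WHERE employee_number = p_emp_no;
    RETURN v_name;
END;
$$ LANGUAGE plpgsql;

-- Dynamic SQL: parsed and planned on every call. If EXECUTE has to stay,
-- passing values with USING at least avoids per-value query strings and
-- SQL injection.
CREATE OR REPLACE FUNCTION emp_name_dynamic(p_emp_no integer)
RETURNS text AS $$
DECLARE v_name text;
BEGIN
    EXECUTE 'SELECT name FROM employees WHERE employee_number = $1'
        INTO v_name USING p_emp_no;
    RETURN v_name;
END;
$$ LANGUAGE plpgsql;
```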



restore with pg engine upgrade in a middle

2021-11-18 Thread Marcin Giedz
Hi, the case is:

one day we did full backup of database version 12 using standard procedure
of continuous backup so: pg_start_backup/rsync/pg_stop_backup

snip of WAL files archive directory:
ver 12 PG
Jul 28 04:37 000103340004
Jul 28 04:41 000103340005
Jul 28 05:16 000103340006
Jul 28 05:16 000103340006.0028.backup
Jul 28 05:16 000103340007
Jul 28 05:18 000103340008
ver 13 PG
Jul 28 05:18 000103340068
Jul 28 05:18 000103340069

once backup was finished, we migrated PG engine from 12 to 13 - as you can
see in WAL filename structure. @ circa 5:18 all went OK however we have
forgotten to adjust full backup script to newer version of PG so till mid
of Nov we didn't have any full backup of our database except for WAL files
and now we need to restore database with target time set to end of Sep.

in this case how should I proceed with db restore including pg version
switch from 12 to 13 ?

Thx
Marcin


Re: General Performance Question

2021-11-18 Thread Tom Lane
DAVID ROTH  writes:
> The code I am migrating uses the EXECUTE immediate form for almost 
> everything.  The path of least resistance for me is to just copy the code to 
> Postgres and change "EXECUTE IMMEDIATE" to "EXECUTE".
> I am asking about performance differences to see if it is worth converting 
> code that does not have any dynamic elements.

Yes, absolutely.  (I would've imagined that that habit is pretty awful
for performance on Oracle, too.)

Having said that, the golden rule in such things is to get it to work
first, and make it fast later.  I wouldn't worry about cleaning up
unnecessary EXECUTE usage till you have a working port.

regards, tom lane




Re: check scripts after database code change

2021-11-18 Thread Karsten Hilbert
On Thu, Nov 18, 2021 at 05:45:37PM +0300, Dennis wrote:

> Are there any scripts that we can check after the changes we made in the
> database code?
> In summary, are there any automatic post-development scripts before they
> come to the beta stage?

What do you want those scripts to do?

Karsten
--
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B




Re: check scripts after database code change

2021-11-18 Thread Adrian Klaver

On 11/18/21 06:45, Dennis wrote:

> Hi,
>
> Are there any scripts that we can check after the changes we made in the
> database code?

Who made what changes in what code?

> In summary, are there any automatic post-development scripts before they
> come to the beta stage?

Development of what and whose beta?

This is going to need a more detailed explanation before there can be an
answer more sophisticated than: maybe?



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Plans with bad estimates/paths

2021-11-18 Thread Joe Wildish
Hi Tomas,

On Tue, 16 Nov 2021, at 22:03, Tomas Vondra wrote:

> It sure smells like a case of correlated data for some clients but not
> others, but who knows ... Hard to say without seeing the nodes below the
> join. If the lower nodes are estimated accurately, then it's the join
> selectivity that is estimated poorly, and there's not much we can do
> about it :-(

Here is a link to a segment of a plan where the estimations degrade.
I've also pasted the plan segment at the end of this message.

https://gist.githubusercontent.com/joewildish/8eb66815d728399687b24647df941746/raw/ef8c62455dd34d76807feff5d164e6d8857014e2/gistfile1.txt

The nodes beneath the Merge Join seem to be estimated accurately (within 2x of 
the actual)? But, the Merge Join itself is a 10x under-estimate and the ones 
above that are even further out.  You can see in the plan that this particular 
execution is for multiple clients (144, 145, 146 & 155). In my experiments, I 
am getting better results with a single client, although I don't know if that 
is down to the actual data size being smaller, or if the estimation for 
multiple values is inherently more inaccurate. Anyway, is this an example of a 
join selectivity problem?

> Do the "good" plans have the same underestimate? Maybe there's just much
> less data for those clients, and the "poor" plan ends up being fast anyway?

I think that may be happening but haven't been able to capture a plan yet that 
confirms it.

>> I am assuming this underestimation is the source of the planner
>> choosing the "wrong" path; in production, we have had to resort to
>> setting the join and from collapse limits to 1 to force a naive plan
>> to be generated.  This is giving us execution times in the 10/20
>> second range vs. >45m in some cases.
>
> That may be happening, yes. So is it the join order that ends up being
> wrong, or the join methods?

I've seen both. For example, in the worst-performing plans, the root node has 
its first input estimated to produce 1 row and its second input estimated to 
produce c.40,000 rows. The second input is a SeqScan, presumably because of the 
single-row estimate of its sibling. Of course, the estimate of 1 turns out to 
be wildly inaccurate, the join produces c.2BN rows, and most are then filtered 
out.

In other (better) plans, the troublesome SeqScan doesn't exist: the relation in 
question gets joined lower down the tree, and it is not traversed by a SeqScan.

> Have you tried increasing the collapse limit
> instead? Although, if it works for some queries but not others, that's
> likely not going to help.

Yes but it often creates poorer plans rather than better plans. In fact, I 
increase those limits locally when testing, to get the planner to consistently 
produce what it thinks is the best plan. (I find without this, sub-paths can be 
materialised, presumably because one of the collapse limits has been hit. 
Obviously I'd rather remove this particular variability when trying to debug).

> The first thing I'd do is reduce the query size as much as possible. In
> this case I'd try removing as many joins as possible until the issue
> disappears. The simpler the query, the easier it is to investigate.
>
> And yes, replacing parts of a query with a temporary table is a common
> solution, because it's possible to collect statistics on it, build
> indexes etc. That usually solves estimation issues in multi-tenancy.
> Sometimes even a CTE with materialization is enough.

Thank you. It seems, then, that the solution lies in simplifying the queries 
such that the chances of poor estimation are reduced/removed. (I have had some 
success with this today. One of the queries was bringing in a view which 
resulted in needless self-joins). However, such a solution begs the question -- 
which bits of the query should be pre-computed? And, will such work survive 
further changes in the underlying data distributions?
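As an illustration of the temp-table approach Tomas describes (table and
column names here are hypothetical, not from the actual schema):

```sql
-- Materialize the troublesome subquery so the planner gets real statistics.
CREATE TEMP TABLE tmp_item_totals AS
SELECT client_id, order_item_id, count(*) AS n
FROM order_items
GROUP BY client_id, order_item_id;

ANALYZE tmp_item_totals;   -- collect stats: later joins are then estimated
                           -- from real data, not guessed selectivities

-- Or keep it inside the query with a materialized CTE (PostgreSQL 12+),
-- which at least stops the subquery from being flattened into the join tree:
WITH totals AS MATERIALIZED (
    SELECT client_id, order_item_id, count(*) AS n
    FROM order_items
    GROUP BY client_id, order_item_id
)
SELECT * FROM totals WHERE client_id = 144;
```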

Thanks,
-Joe

Plan segment:

Nested Loop Left Join  (cost=2.29..348095.52 rows=3 width=93) (actual time=828.619..3368.362 rows=517367 loops=1)
  ->  Nested Loop  (cost=1.86..348094.00 rows=3 width=81) (actual time=828.609..2655.136 rows=517367 loops=1)
        Join Filter: (clients.id = order_items.client_id)
        ->  Nested Loop  (cost=1.43..347875.73 rows=400 width=60) (actual time=828.603..1890.900 rows=517367 loops=1)
              Join Filter: (clients.id = order_items_1.client_id)
              ->  Merge Join  (cost=1.00..322712.24 rows=50370 width=48) (actual time=828.584..1224.993 rows=517367 loops=1)
                    Merge Cond: (order_items_2.client_id = clients.id)
                    ->  GroupAggregate  (cost=0.85..290718.67 rows=2518498 width=44) (actual time=0.040..1126.298 rows=1856351 loops=1)
                          Group Key: order_items_2.client_id, order_items_2.order_item_id
                          ->  Merge Left Join  (cost=0.85..240348.71 rows=2518498 width=18) (actual time=0.033..535.466 rows=1858076 loops=1)