[PERFORM] Performance difference between Slon master and slave

2015-12-14 Thread Matthew Lunnon

Hello all, I hope someone can help me with this.

Postgres 9.4.4
Slon 2.2.4
Linux

I am using slony-i to replicate a production database which is in the 
order of 70GB. I have a reasonably complex select query that runs in 40 
seconds on the master but takes in the region of 30-40 minutes on the 
slave. The postgres configurations are identical and the machines are a 
similar specifications (12 core hyper threaded HP server and the slave 
has slightly less RAM: 132GB vs 148GB) The server running the slave 
database has a higher load than the one running the master though the 
load average on the slave machine was low (1-2) when running the test 
and the postgres process on the slave machine runs at 100% of a CPU with 
very little iowait on the server.


Inspecting the execution plan shows that there are some differences, for 
example, the slave is using a HashAggregate when the master is simply 
grouping. There also seems to be a difference with the ordering of the 
sub plans. Armed with this knowledge I have set enable_hashagg to off 
and run the query again and it now takes 53 seconds on the slave which 
is a more acceptable difference and the execution plans now look very 
similar (one difference being that there is another HashAggregate in the 
master which is now missing on the slave and may account for the 13 
seconds). I have isolated a much simpler query which I have detailed 
below with their execution plans which shows the difference on line 4. I 
would rather not disable hash aggregation on the slave as this might 
have other consequences so this raises a number of questions. Firstly Is 
there anything that I can do to stop this feature? Why is the slave 
behaving differently to the master?


Thanks in advance for any help.

Cheers
Matthew

explain
with my_view_booking_pax_breakdown as (
SELECT bev.booking_id,
( SELECT count(*) AS count
   FROM passenger_version
  WHERE passenger_version.current_version = 'T'::bpchar AND 
passenger_version.deleted = 'F'::bpchar AND 
passenger_version.indicative_pax_type = 'A'::bpchar AND 
passenger_version.booking_id = bev.booking_id) AS adult_count,

( SELECT count(*) AS count
   FROM passenger_version
  WHERE passenger_version.current_version = 'T'::bpchar AND 
passenger_version.deleted = 'F'::bpchar AND 
passenger_version.indicative_pax_type = 'C'::bpchar AND 
passenger_version.booking_id = bev.booking_id) AS child_count,

( SELECT count(*) AS count
   FROM passenger_version
  WHERE passenger_version.current_version = 'T'::bpchar AND 
passenger_version.deleted = 'F'::bpchar AND 
passenger_version.indicative_pax_type = 'I'::bpchar AND 
passenger_version.booking_id = bev.booking_id) AS infant_count

   FROM booking_expanded_version bev
  GROUP BY bev.booking_id
)
select * from "my_view_booking_pax_breakdown" "view_booking_pax_breakdown"
INNER JOIN "booking"."booking_expanded_version" 
"booking_expanded_version" ON 
"view_booking_pax_breakdown"."booking_id"="booking_expanded_version"."booking_id" 



Master

"Merge Join  (cost=5569138.32..6158794.12 rows=2461265 width=1375)"
"  Merge Cond: (booking_expanded_version.booking_id = 
view_booking_pax_breakdown.booking_id)"

"  CTE my_view_booking_pax_breakdown"
*"->  Group  (cost=0.43..5545692.19 rows=215891 width=4)"*
"  Group Key: bev.booking_id"
"  ->  Index Only Scan using 
booking_expanded_version_booking_idx on booking_expanded_version bev  
(cost=0.43..64607.40 rows=2461265 width=4)"

"  SubPlan 1"
"->  Aggregate  (cost=8.57..8.58 rows=1 width=0)"
"  ->  Index Scan using passenger_version_idx_4 on 
passenger_version  (cost=0.43..8.55 rows=5 width=0)"

"Index Cond: (booking_id = bev.booking_id)"
"  SubPlan 2"
"->  Aggregate  (cost=8.45..8.46 rows=1 width=0)"
"  ->  Index Scan using passenger_version_idx_3 on 
passenger_version passenger_version_1  (cost=0.42..8.45 rows=1 width=0)"

"Index Cond: (booking_id = bev.booking_id)"
"  SubPlan 3"
"->  Aggregate  (cost=8.31..8.32 rows=1 width=0)"
"  ->  Index Scan using passenger_version_idx_2 on 
passenger_version passenger_version_2  (cost=0.29..8.31 rows=1 width=0)"

"Index Cond: (booking_id = bev.booking_id)"
"  ->  Index Scan using booking_expanded_version_booking_idx on 
booking_expanded_version  (cost=0.43..546584.09 rows=2461265 width=1347)"

"  ->  Sort  (cost=23445.70..23985.43 rows=215891 width=28)"
"Sort Key: view_booking_pax_breakdown.booking_id"
"->  CTE Scan on my_view_booking_pax_breakdown 
view_booking_pax_breakdown  (cost=0.00..4317.82 rows=215891 width=28)"


Slave

"Merge Join  (cost=6168518.91..6764756.86 rows=2505042 width=1299)"
"  Merge Cond: (booking_expanded_version.booking_id = 
view_booking_pax_breakdown.booking_id)"

"  CTE my_view_booking_pax_breakdown"
*"->  

Re: [PERFORM] Performance difference between Slon master and slave

2015-12-14 Thread Mattthew Lunnon
Hi Jim,
Thanks for your response. Yes the tables have been analysed and I have also 
re-indexed and vacuumed the slave database. 
Regards
Matthew 

Sent from my iPad

> On 14 Dec 2015, at 17:49, Jim Nasby  wrote:
> 
>> On 12/14/15 11:16 AM, Matthew Lunnon wrote:
>> Inspecting the execution plan shows that there are some differences, for
>> example, the slave is using a HashAggregate when the master is simply
>> grouping. There also seems to be a difference with the ordering of the
>> sub plans.
> 
> Have you tried analyzing the tables on the slave?
> 
> Also, keep in mind that the first time you access rows on a Slony slave after 
> they're replicated Postgres will need to write hint bits out, which will take 
> some time. But that's clearly not the issue here.
> -- 
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com


This message has been scanned for malware by Websense. www.websense.com


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Getting an optimal plan on the first execution of a pl/pgsql function

2015-12-14 Thread David G. Johnston
On Mon, Dec 14, 2015 at 11:53 AM, Pedro França 
wrote:

> I have a really busy function that I need to optimize the best way I can.
> This function is just a nested select statement that is requested several
> times a sec by a legacy application. I'm running a PostgreSQL 9.4 on a
> CentOS 6;
>
> The indexes are in place but I've noticed that it is only used after the
> first execution of the function.
>

​How do you know this?​

I think that the problem is that Postgres isn't getting the best execution
> plan at first because of a parameter that it is highly exclusive in the
> majority of the cases, but it can be not as good sometimes. We can't change
> the way we call the function to a plain sql statement or a view because we
> can't change the application code itself.
>
> When I test with EXPLAIN ANALYZE after the first execution, the query runs
> really fast but the aplication sessions call the function only once and
> then are terminated. I need that the first execution use the actual
> optimized plan.
>
> We tried messing around with the connector driver that manage the
> connection pooling to issue a DISCARD TEMP instead of DISCARD ALL, so it
> could keep the cached plan of the sessions and the performance improved a
> lot, but I don't want to do that in a production environment.
>

Given the constraints you've listed this seems like it might be your only
avenue of improvement.​  Your problem that the performance improvement is
seen due to caching effects.  If you throw away the cache you loose the
improvement.


> I've tried to change the language to a sql function but it didn't help as
> the execution time didn't drop after the first execution.
>

​Yes, this likely would make thing worse...depending upon how it is called.

I've tried to add the "SET LOCAL join_collapse_limit = 1" too but it
> appears it doesn't work inside a function;
>

​I wouldn't expect that parameter to have any effect in this scenario.

Here is the function code:
>
> CREATE OR REPLACE FUNCTION public.ap_keepalive_geteqpid_veiid(
> IN tcbserie bigint,
> IN protocolo integer)
>   RETURNS TABLE(eqpid integer, veiid integer, tcbid integer, veiplaca
> character varying, veiproprietariocliid integer, tcbtppid integer,
> tcbversao character, veirpmparametro double precision, tcbconfiguracao
> bigint, tcbevtconfig integer, veibitsalertas integer, sluid integer, harid
> integer) AS
> $BODY$
> BEGIN
>
> RETURN QUERY
> SELECT teqp.eqpID,
> teqp.eqpveiID AS veiID,
> tcb.tcbID,
> tvei.veiPlaca,
> tvei.veiProprietariocliID,
> tcb.tcbtppID,
> tcb.tcbVersao,
> tvei.veiRPMParametro,
> COALESCE(COALESCE(NULLIF(tcb.tcbConfiguracao, 0),
> tcc.clcConfiguracaoBitsVeic), 0) AS tcbConfiguracao,
> COALESCE(tcb.tcbevtConfig, 0) AS tcbevtConfig,
> COALESCE(tvei.veiBitsAlertas, 0) AS veiBitsAlertas,
> COALESCE(tvei.veisluID, 0) AS sluID,
> COALESCE(tcb.tcbharID, 0) AS harID
> FROM TabComputadorBordo tcb
> INNER JOIN TabEquipamento teqp ON teqp.eqptcbID = tcb.tcbID
> INNER JOIN TabPacoteProduto tpp ON teqp.eqptppID = tpp.tppID
> INNER JOIN TabVeiculos tvei ON teqp.eqpveiID = tvei.veiID
> LEFT JOIN TabCliente tcli ON tcli.cliid = tvei.veiProprietariocliID
> LEFT JOIN TabClienteConfig tcc ON tcc.clcCliID = tcli.cliID
> WHERE   tcb.tcbserie = $1
> AND teqp.eqpAtivo = 1
> AND tpp.tppIDProtocolo = $2
> AND tvei.veiBloqueioSinal = 0;
>
> END
> $BODY$
>   LANGUAGE plpgsql VOLATILE
>   COST 1
>   ROWS 1;
>
> Execution plan in the first execution:
>

​You likely could make this STABLE instead of VOLATILE; though that doesn't
solve your problem.​


> "Function Scan on ap_keepalive_geteqpid_veiid  (cost=0.25..0.26 rows=1
> width=116) (actual time=3.268..3.268 rows=1 loops=1)"
> "Planning time: 0.032 ms"
> "Execution time: 3.288 ms"
>
> Second execution:
>
> "Function Scan on ap_keepalive_geteqpid_veiid  (cost=0.25..0.26 rows=1
> width=116) (actual time=0.401..0.402 rows=1 loops=1)"
> "Planning time: 0.058 ms"
> "Execution time: 0.423 ms"
>
>
​I'm doubting the query inside of the function is the problem here...it is
the function usage itself.  Calling a function has overhead in that the
body of function needs to be processed.  This only has to happen once per
session.  The first call of the function incurs this overhead while
subsequent calls do not.

Pending others correcting me...I fairly certain regarding my conclusions
though somewhat inexperienced in doing this kind of diagnostics.

David J.


Re: [PERFORM] Performance difference between Slon master and slave

2015-12-14 Thread Jim Nasby

On 12/14/15 11:16 AM, Matthew Lunnon wrote:

Inspecting the execution plan shows that there are some differences, for
example, the slave is using a HashAggregate when the master is simply
grouping. There also seems to be a difference with the ordering of the
sub plans.


Have you tried analyzing the tables on the slave?

Also, keep in mind that the first time you access rows on a Slony slave 
after they're replicated Postgres will need to write hint bits out, 
which will take some time. But that's clearly not the issue here.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Getting an optimal plan on the first execution of a pl/pgsql function

2015-12-14 Thread Pedro França
I have a really busy function that I need to optimize the best way I can.
This function is just a nested select statement that is requested several
times a sec by a legacy application. I'm running a PostgreSQL 9.4 on a
CentOS 6;

The indexes are in place but I've noticed that it is only used after the
first execution of the function. I think that the problem is that Postgres
isn't getting the best execution plan at first because of a parameter that
it is highly exclusive in the majority of the cases, but it can be not as
good sometimes. We can't change the way we call the function to a plain sql
statement or a view because we can't change the application code itself.

When I test with EXPLAIN ANALYZE after the first execution, the query runs
really fast but the aplication sessions call the function only once and
then are terminated. I need that the first execution use the actual
optimized plan.

We tried messing around with the connector driver that manage the
connection pooling to issue a DISCARD TEMP instead of DISCARD ALL, so it
could keep the cached plan of the sessions and the performance improved a
lot, but I don't want to do that in a production environment.

I've tried to change the language to a sql function but it didn't help as
the execution time didn't drop after the first execution. I've tried to add
the "SET LOCAL join_collapse_limit = 1" too but it appears it doesn't work
inside a function; Here is the function code:

CREATE OR REPLACE FUNCTION public.ap_keepalive_geteqpid_veiid(
IN tcbserie bigint,
IN protocolo integer)
  RETURNS TABLE(eqpid integer, veiid integer, tcbid integer, veiplaca
character varying, veiproprietariocliid integer, tcbtppid integer,
tcbversao character, veirpmparametro double precision, tcbconfiguracao
bigint, tcbevtconfig integer, veibitsalertas integer, sluid integer, harid
integer) AS
$BODY$
BEGIN

RETURN QUERY
SELECT teqp.eqpID,
teqp.eqpveiID AS veiID,
tcb.tcbID,
tvei.veiPlaca,
tvei.veiProprietariocliID,
tcb.tcbtppID,
tcb.tcbVersao,
tvei.veiRPMParametro,
COALESCE(COALESCE(NULLIF(tcb.tcbConfiguracao, 0),
tcc.clcConfiguracaoBitsVeic), 0) AS tcbConfiguracao,
COALESCE(tcb.tcbevtConfig, 0) AS tcbevtConfig,
COALESCE(tvei.veiBitsAlertas, 0) AS veiBitsAlertas,
COALESCE(tvei.veisluID, 0) AS sluID,
COALESCE(tcb.tcbharID, 0) AS harID
FROM TabComputadorBordo tcb
INNER JOIN TabEquipamento teqp ON teqp.eqptcbID = tcb.tcbID
INNER JOIN TabPacoteProduto tpp ON teqp.eqptppID = tpp.tppID
INNER JOIN TabVeiculos tvei ON teqp.eqpveiID = tvei.veiID
LEFT JOIN TabCliente tcli ON tcli.cliid = tvei.veiProprietariocliID
LEFT JOIN TabClienteConfig tcc ON tcc.clcCliID = tcli.cliID
WHERE   tcb.tcbserie = $1
AND teqp.eqpAtivo = 1
AND tpp.tppIDProtocolo = $2
AND tvei.veiBloqueioSinal = 0;

END
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 1
  ROWS 1;

Execution plan in the first execution:

"Function Scan on ap_keepalive_geteqpid_veiid  (cost=0.25..0.26 rows=1
width=116) (actual time=3.268..3.268 rows=1 loops=1)"
"Planning time: 0.032 ms"
"Execution time: 3.288 ms"

Second execution:

"Function Scan on ap_keepalive_geteqpid_veiid  (cost=0.25..0.26 rows=1
width=116) (actual time=0.401..0.402 rows=1 loops=1)"
"Planning time: 0.058 ms"
"Execution time: 0.423 ms"

Thank you in advance,
Pedro Ivo


Re: [PERFORM] Getting an optimal plan on the first execution of a pl/pgsql function

2015-12-14 Thread Tom Lane
"David G. Johnston"  writes:
> On Mon, Dec 14, 2015 at 11:53 AM, Pedro França 
> wrote:
>> When I test with EXPLAIN ANALYZE after the first execution, the query runs
>> really fast but the aplication sessions call the function only once and
>> then are terminated. I need that the first execution use the actual
>> optimized plan.

> Your problem that the performance improvement is
> seen due to caching effects.  If you throw away the cache you loose the
> improvement.

Yeah.  And it's not only the function itself, it's catalog caches and a
bunch of other stuff.  Basically, you should expect that the first few
queries executed by any PG session are going to be slower than those
executed later.  If you can't fix your application to hold sessions open
for a reasonable amount of time, use a connection pooler to do it for you
(pgpooler for instance).

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Getting an optimal plan on the first execution of a pl/pgsql function

2015-12-14 Thread Pedro França
Thank you for the replies guys, The output of auto-explain pratically
comfirms what you say (sorry there are some portuguese words in there). I
will try pgpooler.

< 2015-12-14 18:10:02.314 BRST >LOG:  duration: 0.234 ms  plan:
Query Text: SELECT teqp.eqpID,
teqp.eqpveiID AS veiID,
tcb.tcbID,
tvei.veiPlaca,
tvei.veiProprietariocliID,
tcb.tcbtppID,
tcb.tcbVersao,
tvei.veiRPMParametro,
COALESCE(COALESCE(NULLIF(tcb.tcbConfiguracao, 0),
tcc.clcConfiguracaoBitsVeic), 0) AS tcbConfiguracao,
COALESCE(tcb.tcbevtConfig, 0) AS tcbevtConfig,
COALESCE(tvei.veiBitsAlertas, 0) AS veiBitsAlertas,
COALESCE(tvei.veisluID, 0) AS sluID,
COALESCE(tcb.tcbharID, 0) AS harID
FROM TabComputadorBordo tcb
INNER JOIN TabEquipamento teqp ON teqp.eqptcbID = tcb.tcbID
INNER JOIN TabPacoteProduto tpp ON teqp.eqptppID = tpp.tppID
INNER JOIN TabVeiculos tvei ON teqp.eqpveiID = tvei.veiID
LEFT JOIN TabCliente tcli ON tcli.cliid = tvei.veiProprietariocliID
LEFT JOIN TabClienteConfig tcc ON tcc.clcCliID = tcli.cliID
WHERE   tcb.tcbserie = $1
AND teqp.eqpAtivo = 1
AND tpp.tppIDProtocolo = $2
AND tvei.veiBloqueioSinal = 0
Nested Loop Left Join  (cost=1.29..18.65 rows=1 width=75) (actual
time=0.226..0.230 rows=1 loops=1)
 Join Filter: (tcc.clccliid = tcli.cliid)
 Rows Removed by Join Filter: 3
 ->  Nested Loop Left Join  (cost=1.29..17.57 rows=1 width=75) (actual
time=0.205..0.209 rows=1 loops=1)
   ->  Nested Loop  (cost=1.01..17.26 rows=1 width=71) (actual
time=0.200..0.203 rows=1 loops=1)
 ->  Nested Loop  (cost=0.72..16.80 rows=1 width=43) (actual
time=0.097..0.098 rows=1 loops=1)
   ->  Nested Loop  (cost=0.58..16.63 rows=1 width=47)
(actual time=0.079..0.080 rows=1 loops=1)
 ->  Index Scan using
ix_tabcomputadorbordo_tcbserie on tabcomputadorbordo tcb  (cost=0.29..8.31
rows=1 width=35) (actual time=0.046..0.046 rows=1 loops=1)
   Index Cond: (tcbserie = $1)
 ->  Index Scan using
ix_tabequipamento_eqptcbid_eqpativo_eqptppid_eqpveiid on tabequipamento
teqp  (cost=0.29..8.31 rows=1 width=16) (actual time=0.030..0.031 rows=1
loops=1)
   Index Cond: ((eqptcbid = tcb.tcbid) AND
(eqpativo = 1))
   ->  Index Only Scan using
ix_tabpacoteproduto_tppidprotocolo on tabpacoteproduto tpp
 (cost=0.14..0.16 rows=1 width=4) (actual time=0.015..0.015 rows=1 loops=1)
 Index Cond: ((tppidprotocolo = $2) AND (tppid =
teqp.eqptppid))
 Heap Fetches: 1
 ->  Index Scan using pk_tabveiculos on tabveiculos tvei
 (cost=0.29..0.45 rows=1 width=32) (actual time=0.100..0.101 rows=1 loops=1)
   Index Cond: (veiid = teqp.eqpveiid)
   Filter: (veibloqueiosinal = 0)
   ->  Index Only Scan using pk_tabcliente on tabcliente tcli
 (cost=0.28..0.30 rows=1 width=4) (actual time=0.004..0.005 rows=1 loops=1)
 Index Cond: (cliid = tvei.veiproprietariocliid)
 Heap Fetches: 1
 ->  Seq Scan on tabclienteconfig tcc  (cost=0.00..1.03 rows=3 width=8)
(actual time=0.014..0.015 rows=3 loops=1)
< 2015-12-14 18:10:02.314 BRST >CONTEXTO:  função PL/pgSQL
ap_keepalive_geteqpid_veiid(bigint,integer) linha 4 em RETURN QUERY
< 2015-12-14 18:10:02.314 BRST >LOG:  duration: 4.057 ms  plan:
Query Text: SELECT * FROM ap_keepalive_geteqpid_veiid (tcbSerie := 8259492,
protocolo:= 422);