Re: [PERFORM] Stored Procedure Performance

2017-10-14 Thread phb07


On 11/10/2017 at 16:11, Purav Chovatia wrote:

Thanks.

We looked at pg_stat_statements and we can see the execution count & total 
time taken. But that still does not help me identify why it is slow, what 
is taking time, or where the wait is.


By the way, does pg_stat_statements add considerable overhead? Coming from 
the Oracle world, we are very used to such execution stats, and hence 
we are planning to add this extension as a default on all our 
production deployments.


It's a single-row select using the PK, a single-row update using the PK, and 
a single-row insert, so I don't see anything wrong with the code. So 
auto_explain would not add any value, I believe.


Basically, on an Oracle server, I would minimally look at a 
statspack/AWR report & OS stats (like cpu, iostat & memory) to start 
with. What should I look for in the case of a Postgres server?

You could have a look at the PoWA extension 
(http://dalibo.github.io/powa/). It has the same purpose as AWR.
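
For reference, pg_stat_statements itself is quick to set up; a minimal 
sketch (total_time is the pre-v13 column name, current at the time):

-- In postgresql.conf (a restart is needed):
--   shared_preload_libraries = 'pg_stat_statements'
CREATE EXTENSION pg_stat_statements;

-- Top statements by cumulated execution time (total_time is in milliseconds):
SELECT calls, total_time, total_time / calls AS avg_ms, query
  FROM pg_stat_statements
 ORDER BY total_time DESC
 LIMIT 10;

As for the overhead question: it is commonly reported as low (a few 
percent at worst), which is why many sites do enable it on all 
production instances.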




Thanks & Regards

On 3 October 2017 at 20:58, Pavel Stehule wrote:




2017-10-03 17:17 GMT+02:00 Adam Brusselback:

There is also the option of pg_stat_statements:
https://www.postgresql.org/docs/current/static/pgstatstatements.html

and auto_explain:
https://www.postgresql.org/docs/current/static/auto-explain.html


These should help you identify what is slowing things down. 
There is no reason I can think of that you should be seeing a 10x
slowdown between Postgres and Oracle, so you'll likely have to
just profile it to find out.
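
To see the plans of the statements running inside the procedure, a 
minimal auto_explain setup might look like this (session-level for a 
test; the same parameters can go in postgresql.conf):

LOAD 'auto_explain';
SET auto_explain.log_min_duration = '10ms';   -- log any statement slower than this
SET auto_explain.log_analyze = on;            -- include actual row counts and timings
SET auto_explain.log_nested_statements = on;  -- also cover statements inside functions

The log_nested_statements parameter matters here: without it, only 
top-level statements are logged, not the ones executed inside the 
procedure.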


It depends what is inside.

A 10x slowdown is possible if you hit some unoptimized cases. Times of 
about 1ms - 10ms show that the procedure (code) can be very sensitive 
to such impacts.

Regards

Pavel






Re: [PERFORM] Rollback table data.

2017-06-10 Thread phb07

Hi Dinesh,

On 07/06/2017 at 14:48, Andreas Kretschmer wrote:



On 07.06.2017 at 13:33, Dinesh Chandra 12108 wrote:


Dear Expert,

Is there any way to rollback table data in PostgreSQL?


If you are looking for something similar to flashback in Oracle, the 
answer is no.


Well, if this is what you are looking for, the E-Maj extension may help 
you. In a few words, it allows you to 1) log updates on sets of tables 
(using triggers), 2) set marks on these table sets when they are in a 
stable state, and 3) travel back and forth between these marks.
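
As a rough sketch of the workflow (group and mark names are illustrative, 
the tables of a group are registered beforehand, and the function names 
are taken from the project documentation - check the docs for the exact API):

SELECT emaj.emaj_create_group('my_group');
SELECT emaj.emaj_start_group('my_group', 'INIT');        -- logging starts, first mark set
-- ... application updates run here ...
SELECT emaj.emaj_set_mark_group('my_group', 'STABLE_1');
-- ... more updates, then travel back to the stable state:
SELECT emaj.emaj_rollback_group('my_group', 'STABLE_1');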

Some pointers:
- pgxn, to download a stable version: https://pgxn.org/dist/e-maj/ (the 
downloadable zip file also contains a presentation that may help to 
quickly get a good view of the extension - doc/emaj.2.0.1_doc_en.pdf)
- on-line documentation: http://emaj.readthedocs.io/
- github projects: https://github.com/beaud76/emaj and 
https://github.com/beaud76/emaj_ppa_plugin


Best regards. Philippe.

Regards, Andreas







Re: [PERFORM] Monitoring tool for Postgres Database

2017-05-27 Thread phb07

Hi Ravi,

We, at Dalibo, contribute to the postgres community mostly through the 
development of open source administration tools.


Among them:
- pgBadger, from Gilles Darold, extracts a lot of data from the postgres 
logs and presents it via a web interface 
(https://github.com/dalibo/pgbadger and http://dalibo.github.io/pgbadger/),
- OPM - Open PostgreSQL Monitoring - monitors the activity of instances 
and sends alerts if needed (http://opm.io/, https://github.com/OPMDG and 
http://opm.readthedocs.io/index.html). It uses the check_pgactivity agent,
- PoWA - PostgreSQL Workload Analyzer - captures and stores the SQL 
activity of instances (using the pg_stat_statements extension) and 
reports it through a web interface 
(http://powa.readthedocs.io/en/latest/ and 
https://github.com/dalibo/powa). Several plugins help the DBA in 
understanding and improving SQL statement performance:
   - pg_qualstats evaluates the selectivity of the predicates and WHERE 
clauses encountered in SQL statements,
   - pg_stat_kcache captures additional statistics from the OS, like 
CPU and physical I/Os,
   - HypoPG allows you to create hypothetical indexes and shows the access 
plan that the postgres optimizer would choose if these indexes existed 
(see the sketch after this list).
- PgCluu, from Gilles Darold, performs a full performance audit of a 
PostgreSQL cluster. A collector grabs statistics on the PostgreSQL 
cluster using psql and sar, and a grapher generates all the HTML and 
chart output (http://pgcluu.darold.net/ and https://github.com/darold/pgcluu).
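
As an illustration of the hypothetical index workflow (table and column 
names are illustrative):

CREATE EXTENSION hypopg;

-- Simulate an index without building it (backend-local, no disk writes):
SELECT * FROM hypopg_create_index('CREATE INDEX ON orders (customer_id)');

-- A plain EXPLAIN (without ANALYZE) now considers the hypothetical index:
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;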


FYI, we are also working on a new project named TemBoard 
(http://temboard.io/ and https://github.com/dalibo/temboard). It is not 
yet production-ready, but it has been presented at the latest postgres 
conference in Russia (https://pgconf.ru/en/2017/93881).


Philippe Beaudoin.

On 27/05/2017 at 06:58, Dave Stibrany wrote:
We've found pghero to be a good first line of defence. It doesn't have 
alerting yet, but it's great for a quick high level healthcheck.


Also +1 for Datadog. Extremely flexible and elegant UI + powerful 
alerting capabilities.


On Fri, May 26, 2017 at 10:32 AM, Sunkara, Amrutha wrote:


We have been using Nagios to monitor the system-level stats. The
database-level stats that we gather come from custom scripts that we
have Nagios poll to get the database health. You could use pgBadger
to generate reports against your database logs as well.
pgBadger reports are your BFFs for performance-related specs...
very close to the AWR reports that Oracle provides.

Storage/disk latencies -- we have Oracle's OS Watcher running
regularly on these hosts to generate iostats as well.

Thanks.
-Amrutha.

On Thu, May 25, 2017 at 3:48 PM, Ravi Tammineni wrote:

Hi,

What is the best monitoring tool for a Postgres database?
Something like Oracle Enterprise Manager.

Specifically I am interested in tools to help:

Alert DBAs to both configuration and performance issues

Deadlocks, long-running queries, etc.

Monitoring of overall system performance

General performance tuning

Storage/Disk latencies

Thanks

ravi







Re: [PERFORM] Strange nested loop for an INSERT

2016-09-23 Thread phb07


On 21/09/2016 at 23:42, Jim Nasby wrote:

On 9/12/16 1:05 PM, phb07 wrote:

The drawback is the overhead of this added ANALYZE statement. With
heavy processing like in this test case, it is worth doing. But for
common cases, it's a little bit expensive.


You could always look at the number of rows affected by a command and 
make a decision on whether to ANALYZE based on that, possibly by 
looking at pg_stat_all_tables.n_mod_since_analyze.

I have solved the issue by adding an ANALYZE between both statements. To 
avoid the associated overhead in cases when it is not worthwhile, the 
ANALYZE is only performed when more than 1000 rows have just been 
deleted by the first statement (as the logic is embedded in a plpgsql 
function, the GET DIAGNOSTICS statement provides the information). This 
threshold is approximately the point where the potential loss due to bad 
estimates equals the ANALYZE cost.
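
In outline, the logic looks like the following sketch (names are 
illustrative and the key is simplified to a single column; the real code 
is embedded in an E-Maj plpgsql function):

DO $$
DECLARE
  v_deleted bigint;
BEGIN
  -- First statement: delete the rows to cancel (the log trigger records them).
  DELETE FROM app_table USING keys_tmp WHERE app_table.pk = keys_tmp.pk;
  GET DIAGNOSTICS v_deleted = ROW_COUNT;

  -- Only pay the ANALYZE cost when the statistics have really moved.
  IF v_deleted > 1000 THEN
    ANALYZE app_table_log;
  END IF;

  -- Second statement: the INSERT ... SELECT now gets planned with fresh stats.
END $$;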

But the idea of using the n_mod_since_analyze data, to also take into 
account other recent updates not yet reflected in the statistics, is 
very interesting.
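
A sketch of such a check (the column exists since 9.4; the schema name 
is illustrative):

SELECT n_mod_since_analyze
  FROM pg_stat_all_tables
 WHERE schemaname = 'public' AND relname = 't1_log';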


Thanks.




Re: [PERFORM] Strange nested loop for an INSERT

2016-09-12 Thread phb07

Thanks, Tom, for this quick answer.


On 12/09/2016 at 16:41, Tom Lane wrote:

phb07 <ph...@apra.asso.fr> writes:

The performance issue, encountered in very specific situations, is the
time needed to cancel a significant number of insertions.
I have built a simple test case that reproduces the problem without
needing the extension. It just mimics the behaviour.

At least for this example, the problem is that the DELETE enormously
alters the statistics for the t1_log.tuple column (going from 100% "NEW"
to 50% "NEW" and 50% "OLD"), but the plan for your last command is
generated with stats saying there are no "OLD" entries.  So you get a plan
that would be fast for small numbers of "OLD" entries, but it sucks when
there are lots of them.  The fix I would recommend is to do a manual
"ANALYZE t1_log" after such a large data change.  Auto-ANALYZE would fix
it for you after a minute or so, probably, but if your script doesn't want
to wait around then an extra ANALYZE is the ticket.

regards, tom lane

I understand the point (and I now realize that I should have found the 
answer by myself...)
Adding an ANALYZE on the log table effectively changes the plan and 
brings good performance for the INSERT statement.
The drawback is the overhead of this added ANALYZE statement. With 
heavy processing like in this test case, it is worth doing. But for 
common cases, it's a little bit expensive.

But I will keep the idea and study the best solution to implement.

Regards. Philippe.





[PERFORM] Strange nested loop for an INSERT

2016-09-12 Thread phb07

Hi all,
One of my customers has reported a performance problem to me when using 
the E-Maj extension.
It is a tool that allows logging the updates performed on application 
tables, with the capability to cancel them.
It is based on classic triggers, with a log table associated to each 
application table.
The performance issue, encountered in very specific situations, is the 
time needed to cancel a significant number of insertions.
I have built a simple test case that reproduces the problem without 
needing the extension. It just mimics the behaviour.

Attached are the psql script and its result.
The updates cancellation operation is done in 3 steps (sketched below):
- create a temporary table that holds each primary key to process
- delete from the application table all rows that are no longer wanted 
(previously inserted rows and new values of updated rows)
- insert into the application table the old rows we want to see again 
(previously deleted rows or old values of updated rows)
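
A simplified sketch of the first two steps (the real test case uses a 
7-column key; the selection condition is illustrative):

-- Step 1: collect the primary keys to process.
CREATE TEMP TABLE tmp AS
  SELECT c1, c2, c3, c4, c5, c6, c7, gid
    FROM t1_log
   WHERE gid > 42;

-- Step 2: delete the rows that must disappear from the application table.
DELETE FROM t1
 USING tmp
 WHERE t1.c1 = tmp.c1 AND t1.c2 = tmp.c2 AND t1.c3 = tmp.c3
   AND t1.c4 = tmp.c4 AND t1.c5 = tmp.c5 AND t1.c6 = tmp.c6
   AND t1.c7 = tmp.c7;

-- Step 3 is the INSERT ... SELECT shown below.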

The performance problem only concerns the third statement (the INSERT).
I have run this test case in various recent postgres versions, from 9.1 
to 9.6, with the same results.

The problem appears when:
- the application table has a primary key with a large number of columns 
(at least 7 columns in this test case),
- nothing but INSERT statements have been executed on the 
application table,
- and the log trigger remains active (to provide a nice feature: 
cancelling the cancellation!).
In the test case, I create a table and populate it with 100,000 rows, 
create the log mechanism, then insert 10,000 rows and finally cancel 
the insertion of these 10,000 rows.

The faulting INSERT statement has the following explain:
explain analyze
INSERT INTO t1
  SELECT t1_log.c1, t1_log.c2, t1_log.c3, t1_log.c4, t1_log.c5, t1_log.c6, t1_log.c7, t1_log.c8
    FROM t1_log, tmp
   WHERE t1_log.c1 = tmp.c1 AND t1_log.c2 = tmp.c2 AND t1_log.c3 = tmp.c3
     AND t1_log.c4 = tmp.c4 AND t1_log.c5 = tmp.c5 AND t1_log.c6 = tmp.c6
     AND t1_log.c7 = tmp.c7
     AND t1_log.gid = tmp.gid AND t1_log.tuple = 'OLD';

                                 QUERY PLAN
------------------------------------------------------------------------------
 Insert on t1  (cost=0.00..890.90 rows=1 width=32) (actual time=434571.193..434571.193 rows=0 loops=1)
   ->  Nested Loop  (cost=0.00..890.90 rows=1 width=32) (actual time=434571.187..434571.187 rows=0 loops=1)
         Join Filter: ((t1_log.c1 = tmp.c1) AND (t1_log.c2 = tmp.c2) AND (t1_log.c3 = tmp.c3) AND (t1_log.c4 = tmp.c4) AND (t1_log.c5 = tmp.c5) AND (t1_log.c6 = tmp.c6) AND (t1_log.c7 = tmp.c7) AND (t1_log.gid = tmp.gid))
         Rows Removed by Join Filter: 1
         ->  Index Scan using t1_log_gid_tuple_idx on t1_log  (cost=0.00..423.22 rows=1 width=40) (actual time=0.378..69.594 rows=1 loops=1)
               Index Cond: ((tuple)::text = 'OLD'::text)
         ->  Seq Scan on tmp  (cost=0.00..176.17 rows=9717 width=36) (actual time=0.006..21.678 rows=1 loops=1)
 Total runtime: 434571.243 ms
(8 rows)

Time: 434572,146 ms
When the conditions are not exactly met, I get:
explain analyze
INSERT INTO t1
  SELECT t1_log.c1, t1_log.c2, t1_log.c3, t1_log.c4, t1_log.c5, t1_log.c6, t1_log.c7, t1_log.c8
    FROM t1_log, tmp
   WHERE t1_log.c1 = tmp.c1 AND t1_log.c2 = tmp.c2 AND t1_log.c3 = tmp.c3
     AND t1_log.c4 = tmp.c4 AND t1_log.c5 = tmp.c5 AND t1_log.c6 = tmp.c6
     AND t1_log.c7 = tmp.c7
     AND t1_log.gid = tmp.gid AND t1_log.tuple = 'OLD';

                                 QUERY PLAN
------------------------------------------------------------------------------
 Insert on t1  (cost=438.65..906.34 rows=1 width=32) (actual time=111.526..111.526 rows=0 loops=1)
   ->  Hash Join  (cost=438.65..906.34 rows=1 width=32) (actual time=111.521..111.521 rows=0 loops=1)
         Hash Cond: ((tmp.c1 = t1_log.c1) AND (tmp.c2 = t1_log.c2) AND (tmp.c3 = t1_log.c3) AND (tmp.c4 = t1_log.c4) AND (tmp.c5 = t1_log.c5) AND (tmp.c6 = t1_log.c6) AND (tmp.c7 = t1_log.c7) AND (tmp.gid = t1_log.gid))
         ->  Seq Scan on tmp  (cost=0.00..176.17 rows=9717 width=36) (actual time=0.007..22.444 rows=1 loops=1)
         ->  Hash  (cost=435.68..435.68 rows=99 width=40) (actual time=58.300..58.300 rows=1 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 586kB
               ->  Seq Scan on t1_log  (cost=0.00..435.68 rows=99 width=40) (actual time=2.281..28.430 rows=1 loops=1)
                     Filter: ((tuple)::text = 'OLD'::text)
                     Rows Removed by Filter: 1
 Total runtime: 111.603 ms
(10 rows)
So we get a nested loop in the bad case, instead of a hash join.
But what looks strange to me in this nested loop is that the seq scan on 
the tmp table is executed 1 times 

[PERFORM] Performance on AIX

2011-03-19 Thread phb07

Hi all,

At Bull, we want to answer a call for tender from a large company, and 
we have been asked for information about PostgreSQL performance under 
AIX on Power 7 servers.


By chance, does someone have some data about this?
Has someone performed a benchmark using AIX quite recently?

Are there any reasons why the performance level would be significantly 
different between AIX and, let's say, Linux, on a given platform?


Thanks in advance for any help.

Philippe BEAUDOIN




Re: [PERFORM] Auto-clustering?

2010-12-17 Thread phb07


Royce Ausburn wrote:
All of the queries on this table are reporting on a single collection, so ideally a collection's data would all be stored in the same part of the disk... or at least clumped together. This can be achieved using CLUSTER; however, as far as I know there's no automated, non-cron-esque means of clustering, and having the table become unusable during the CLUSTER is not ideal.


If the lock level used by CLUSTER is a problem for you, you could 
consider the pg_reorg contrib tool. AFAIK, it does similar work to 
CLUSTER but allows concurrent read and write activity on the table.
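
For reference, the built-in, lock-taking approach looks like this (names 
are illustrative):

-- One-off reorder of the table along the per-collection index;
-- takes an exclusive lock for the duration:
CLUSTER samples USING samples_collection_idx;

Note that new rows are not kept in order afterwards, so the operation 
has to be repeated periodically whichever tool performs it.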


Regards. Philippe.




Re: [PERFORM] Slow query with planner row strange estimation

2010-07-12 Thread phb07


Dimitri wrote:

It's probably one of the cases where having HINTS in PostgreSQL may be
very helpful.

SELECT /*+ enable_nestloop=off */ ... FROM ...

would just fix this query without impacting other queries and without
adding any additional instructions to the application code.

So, why is there such resistance to implementing hints within SQL
queries in PG?

Rgds,
-Dimitri

+1.
Another typical case where it would be helpful is setting the 
cursor_tuple_fraction GUC variable for a specific statement, without 
being obliged to issue two SET statements, one before the SELECT and the 
other after.
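
That is, what must be written today looks like this (table name 
illustrative):

BEGIN;
SET cursor_tuple_fraction = 1.0;   -- plan the cursor for total runtime, not first rows
DECLARE c CURSOR FOR SELECT * FROM big_table ORDER BY col1;
RESET cursor_tuple_fraction;       -- restore the default for subsequent statements
FETCH 100 FROM c;
COMMIT;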



On 7/9/10, Robert Haas robertmh...@gmail.com wrote:

On Fri, Jul 9, 2010 at 6:13 AM, damien hostin damien.hos...@axege.com wrote:


Have you tried running ANALYZE on the production server?

You might also want to try ALTER TABLE ... SET STATISTICS to a large
value on some of the join columns involved in the query.
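
For example (names illustrative; ANALYZE must be re-run for the new 
target to take effect):

ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 1000;
ANALYZE orders;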


Hello,

Before comparing the test case on the two machines, I ran ANALYZE on the
whole database and looked at the pg_stats table to see if changes occurred
for the columns. But on the production server the stats never became as
good as on the desktop computer. I set the statistics target at 1 on the
columns used by the join, ran ANALYZE, which takes a 300-row sample, then
looked at the stats. The stats are not as good as on the desktop. The row
count is nearly the same, but only 1 or 2 values are found.

The data are not balanced the same way on the two computers:
- Desktop: 12000 rows with 6000 implicated in the query (50%),
- Production (actually a dev/test server): 6 million rows with 6000
implicated in the query (0.1%).
The columns used in the query are nullable, and in the 5994000 other rows
that are not implicated in the query these columns are null.

I don't know if the statistics target is a % or a number of values to
obtain,

It's a number of values to obtain.



but even set at the max (1), it didn't manage to collect good stats (for
this particular query).

I think there's a cutoff where it won't collect values unless they
occur significantly more often than the average frequency.  I wonder
if that might be biting you here: without the actual values in the MCV
table, the join selectivity estimates probably aren't too good.



As I don't know what more to do, my conclusion is that the data needs to
be better balanced to allow ANALYZE to gather better stats. But if there
is a way to improve the stats/query with this badly balanced data, I'm
open to it!

I hope that in real production, data will never be loaded this way. If
this happened, we would maybe set enable_nestloop to off, but I don't
think it's a good solution; other queries would have a chance of getting
slower.

Yeah, that usually works out poorly.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company




Regards.
Philippe Beaudoin.
