Re: [PERFORM] Hot Standby performance issue

2013-10-28 Thread sparikh
Hi,

Yes, you are right. The table is the biggest one . Please find below the
information you requested. I agree the fact that autovacuum ran on this
table would fix the performance issue on standby does not sound very
convincing. But that is the only thing I could correlate when the query on
standby started working again. Otherwise there is absolutely no changes at
code level , database level or OS level.
As of now query is still working fine on standby.

I may be wrong, but could it be the case that standby disk was too much
fragmented compare to primary and autovaccum on primary fixed that.
(Assuming autovacuum on primary internally triggers the same on standby)


Sequential Scans18  
Sequential Tuples Read  1355777067  
Index Scans 102566124   
Index Tuples Fetched67155748
Tuples Inserted 16579520
Tuples Updated  17144291
Tuples Deleted  24383607
Tuples HOT Updated  1214531 
Live Tuples 101712125   
Dead Tuples 207 
Heap Blocks Read420703920   
Heap Blocks Hit 496135814   
Index Blocks Read   66807468
Index Blocks Hit916783267   
Toast Blocks Read   310677  
Toast Blocks Hit557735  
Toast Index Blocks Read 6959
Toast Index Blocks Hit  936473  
Last Vacuum 
Last Autovacuum 2013-10-25 02:47:09.914775-04   
Last Analyze
Last Autoanalyze2013-10-25 18:39:25.386091-04   
Vacuum counter  0   
Autovacuum counter  2   
Analyze counter 0   
Autoanalyze counter 4   
Table Size  46 GB   
Toast Table Size615 MB  
Indexes Size20 GB



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Hot-Standby-performance-issue-tp5774673p5776156.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hot Standby performance issue

2013-10-28 Thread sparikh
Table statistics I sent before were from primary. Following are from standby.

Index Tuples Fetched25910277
Tuples Inserted 0   
Tuples Updated  0   
Tuples Deleted  0   
Tuples HOT Updated  0   
Live Tuples 0   
Dead Tuples 0   
Heap Blocks Read138482386   
Heap Blocks Hit 1059169445  
Index Blocks Read   4730561 
Index Blocks Hit9702556 
Toast Blocks Read   1165
Toast Blocks Hit82  
Toast Index Blocks Read 85  
Toast Index Blocks Hit  3055
Last Vacuum 
Last Autovacuum 
Last Analyze
Last Autoanalyze
Vacuum counter  0   
Autovacuum counter  0   
Analyze counter 0   
Autoanalyze counter 0   
Table Size  46 GB   
Toast Table Size615 MB  
Indexes Size20 GB



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Hot-Standby-performance-issue-tp5774673p5776160.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hot Standby performance issue

2013-10-22 Thread sparikh
From Primary:

relname relpages
pg_toast_17673  1812819
pg_toast_17594  161660
pg_toast_17972  121902
pg_toast_17587  77190
pg_toast_18537  29108
pg_toast_17578  26638
pg_toast_17673_index19984
pg_toast_17868  14911
pg_toast_17594_index2208
pg_toast_10722461922
pg_toast_17587_index1510
pg_toast_17972_index1399
pg_statistic911
pg_toast_18694  883
pg_toast_17578_index375
pg_attribute336
pg_toast_16475  332
pg_toast_18537_index321
pg_proc 233
pg_depend_depender_index176

From Secondary :

relname relpages
pg_toast_17673  1812819
pg_toast_17594  161660
pg_toast_17972  121902
pg_toast_17587  77190
pg_toast_18537  29108
pg_toast_17578  26638
pg_toast_17673_index19984
pg_toast_17868  14911
pg_toast_17594_index2208
pg_toast_10722461922
pg_toast_17587_index1510
pg_toast_17972_index1399
pg_statistic911
pg_toast_18694  883
pg_toast_17578_index375
pg_attribute336
pg_toast_16475  332
pg_toast_18537_index321
pg_proc 233
pg_depend_depender_index176

Yes, result looks same both on primary and standby.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Hot-Standby-performance-issue-tp5774673p5775526.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hot Standby performance issue

2013-10-22 Thread sparikh
Sorry, it was typo from my side. I meant strace only.

I will try to request both perf and strace to be installed. But I am not
quite sure as the VMs are managed by third party. Will keep you posted...

The main thing puzzling to me is Explain Plan with Analyze takes couple of
secs to execute the operation but in reality it runs for more than 20 mins.

Thanks.



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Hot-Standby-performance-issue-tp5774673p5775529.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hot Standby performance issue

2013-10-22 Thread sparikh
 I will try to request both perf and strace to be installed. But I am 
 not quite sure as the VMs are managed by third party. Will keep you 
 posted... 

What do you mean by VM? Is this a virtualized environment or bare hardware? 

Yes, they are virtualized environments. 

Sorry about the confusion. But I was just telling from based on the explain
plan report. e.g at the bottom of explain plan report it says Total
runtime: 1698.453 ms (This is with analyze option).

But from the client perspective (either run from pgadmin or directly from
the server command line) it takes more that 20 min to display the output.



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Hot-Standby-performance-issue-tp5774673p5775550.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hot Standby performance issue

2013-10-21 Thread sparikh
Stupid question - when you say that a query is fast on primary but slow 
on standby, are you referring to exactly the same query, including 
parameter values? 

Yes . It is exactly and exactly the same query with the same parameters. 
Yes, it sounds stupid  but that is what happening. Though plan says it is
18ms it runs for more than 15-20 mins and finally returns with conflict
error :  ERROR: canceling statement due to conflict with recovery 

Even the to run execute plan itself takes very long on standby. Just to get
the execute plan on standby is turning out big deal. 

Regarding IO spike, yes I can understand that if data is not available in
the memory then it has to get it from disk. But the thing is it remains
there as much time until query returns with Query conflict error.

Thanks again.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Hot-Standby-performance-issue-tp5774673p5775257.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hot Standby performance issue

2013-10-21 Thread sparikh
Yes, both Explain and Explain Analyse are taking time. As you suggested I set
the lock parameters, but no locks are observed. Also checked
pg_stat_activity and none of the sessions are either waiting are blocked.

I agree we must upgrade to latest version (9.1.10), but unfortunately kind
of resources (not only man power) we are having it is going to be extremely
challenging task for us. Of course all other options are not working then we
have to take the tough route. No choice.

I am also working with sys admin to rule any issues at the OS or VM level.

Thanks.  



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Hot-Standby-performance-issue-tp5774673p5775332.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hot Standby performance issue

2013-10-21 Thread sparikh
Yes, Expalin without Analyze is taking long. It is weird. In the
pg_stat_activity Explain was the only query running. So server was almost
idle. Using New relic interface I checked CPU was almost idle - around
10-20%. There were some IO activity - around 40-50%.

I forgot to mention before I could run perf on command line even with root
permission. It says command not found. May be utility is not installed or
not enabled.

I have attached the snapshot of vmstat while explain was running in
background. vmstat.txt
http://postgresql.1045698.n5.nabble.com/file/n5775349/vmstat.txt  

Do you suggest if I remove all the data files from /data/base folder of
standby and again rebuild using rsync from primary ? do you see any issues
there.? This is just to rule out any fragmentation on standby side.

Our sys admin is planning to run fsck sometime today or tomorrow.

Thanks.









--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Hot-Standby-performance-issue-tp5774673p5775349.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hot Standby performance issue

2013-10-21 Thread sparikh
 Do you suggest if I remove all the data files from /data/base folder 
 of standby and again rebuild using rsync from primary ? do you see 
 any issues there.? This is just to rule out any fragmentation on 
 standby side. 

The EXPLAIN really should not do much I/O. I doubt it has anything to do 
with fragmentation, so I doubt this is going to help. 

Actually I was referring to this in the context of addressing main
underlying performance issue, not EXPLAIN. Sorry, I may not have
communicated it correctly.

Even strance does not seem to be installed.

The filesytem type it shows to me ext3.

Thanks.



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Hot-Standby-performance-issue-tp5774673p5775361.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hot Standby performance issue

2013-10-20 Thread sparikh
Thanks so much Tomas and Kevin for your valuable inputs. I am getting very
good response from this forum and learning so many new stuffs. I will try
all those options and will let you update .


standby_performance_issue.rar
http://postgresql.1045698.n5.nabble.com/file/n5775181/standby_performance_issue.rar
  


On further digging I found from the new relic report that as soon as I
execute query IO spikes immediately (100%). But the same query on primary
executes very fast.

I am not sure if postgres has some utility like what oracle's tkprof or AWR
where I can exactly pin point where exactly the query spends time. I will
try Tomas' suggestions perf and strace.

Below is the query. I also tried to attached rar file  one more time,
hopefully it gets through this time.

SELECT xfer_id, transaction_type, evse_time, transmit_time, error_code,
detail, reported_card_account_number as reported_card_account_number,
event_id,
 event_code,evse_id, batch_id, port, charge_event_id as charge_event_id FROM
(SELECT t.transaction_id::text AS xfer_id, t.transaction_type, e.event_time
AS evse_time,
 t.create_date AS transmit_time, t.error_code::text, '' AS detail,
COALESCE(e.reported_rfid,'N/A') AS reported_card_account_number,
e.event_id::text, e.event_code::text,
 t.evse_id::text, t.batch_id, e.port, COALESCE(e.evse_charge_id,'N/A') AS
charge_event_id 
FROM evse_transaction t, evse_event e , evse_unit u 
WHERE e.transaction_id = t.transaction_id AND t.evse_id = u.evse_id  
AND e.event_code IN
('1','2','3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20','21','22','23','24','25','26','27','28','29','30','31','32','33','34','35','36','37','38','39','40','41','42','43','44','45','46','47','48','49'))
  
AND u.evse_id = 1
  AND t.create_date BETWEEN '2013-10-01'::date AND '2013-10-15'::date +
INTERVAL '1 day'
 UNION
 SELECT t.transaction_id::text AS xfer_id, t.transaction_type, t.log_time AS
evse_time, t.create_date AS transmit_time, t.error_code::text, '' AS detail,
 COALESCE(t.reported_card_account_number,'N/A') AS
reported_card_account_number, '' AS event_id, '' AS event_code,
t.evse_id::text,t.batch_id, '' AS port, 'N/A' AS charge_event_id 
FROM evse_transaction t, evse_unit u 
WHERE t.evse_id = u.evse_id AND t.api_error IS NULL 
AND t.transaction_type NOT IN
('DCFCTransactionService','L2TransactionService','EVSEUploadTransactionService','EVSEUploadTransactionService','UploadTransactionService')
  
AND t.transaction_type IN
('DCFCDownloadConfigService','L2DownloadConfigService','EVSEDownloadConfigService','DownloadConfigService','ConfigDownloadService','DCFCUploadConfigService','L2UploadConfigService','EVSEUploadConfigService','UploadConfigService','ConfigUploadService','L2GetAdPackageListService','AdPackageListService','L2GPSService','EVSEGPSService','GPSService','ReportErrorService','EVSEDownloadRevisionService','DCFCCommandService','L2CommandService','CommandService','DCFCErrorService','L2ErrorService','EVSEReportErrorService','ErrorService','DCFCHeartbeatService','L2HeartbeatService','HeartbeatService','DCFCAuthorizeService','L2AuthorizeService','AuthorizeService','DCFCGetAccessListService','L2GetAccessListService','GetAccessListService','DCFCSetAccessService','L2SetAccessService','SetAccessService','DCFCPackageDownloadService','L2PackageDownloadService','PackageDownloadService','DCFCReportInventoryService','L2ReportInventoryService','ReportInventoryService','DCFCTargetVersionService','L2TargetVersionService','TargetVersionService','DCFCPackageListService','L2PackageInfoService','PackageListService','DCFCPackageInfoService','L2PackageInfoService','PackageInfoService','DCFCRegisterService','L2AuthorizeCodeService',
'AuthorizeCodeService') 
 AND u.evse_id = 1  AND t.create_date BETWEEN '2013-10-01'::date AND
'2013-10-15'::date + INTERVAL '1 day' 
UNION
 SELECT ef.fee_id::text AS xfer_id, 'FEE' as transaction_type, ef.event_time
AS evse_time, ef.create_time AS transmit_time, 
'' AS error_code, 'Fee Event' AS detail, COALESCE(ef.card_account_number,
'N/A') AS reported_card_account_number, '' AS event_id, '' AS event_code,
ef.evse_id::text, '' AS batch_id, 
ef.port::text AS port, COALESCE(ef.client_charge_id, 'N/A') AS
charge_event_id 
FROM evse_fee ef LEFT OUTER JOIN evse_unit eu ON eu.evse_id = ef.evse_id 
WHERE ef.evse_id = 1  AND  ef.create_time BETWEEN '2013-10-01'::date AND
'2013-10-15'::date + INTERVAL '1 day'
) x 
ORDER BY transmit_time DESC LIMIT 500

==

Query plan:

Limit  (cost=101950.33..101950.40 rows=30 width=368) (actual
time=18.421..18.421 rows=0 loops=1)
  Output: ((t.transaction_id)::text), t.transaction_type, e.event_time,
t.create_date, t.error_code, (''::text), (COALESCE(e.reported_rfid,
'N/A'::text)), ((e.event_id)::text), ((e.event_code)::text),
((t.evse_id)::text), t.batch_id, e.port, (COALESCE(e.evse_charge_id,
'N/A'::text))
  Buffers: shared hit=5 read=7
  -  Sort  (cost=101950.33..101950.40 rows=30 width=368) (actual

Re: [PERFORM] Hot Standby performance issue

2013-10-18 Thread sparikh
Anybody has any idea, or pointer ? This is a high priority issue I have
resolve at work. Any help would be of great help.



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Hot-Standby-performance-issue-tp5774673p5775103.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hot Standby performance issue

2013-10-18 Thread sparikh
Hi Tomas,

Thanks so much for your response and sorry for not providing the enough
details.

I have attached the zip file which has query,explain plan and database
parameter settings for both primary and secondary.

Please note that query has multiple unions only the first query on top is
causing the performance issue.

Transaction search is one of the feature in our Admin user interface(web
portal) where user can search for the transactions against our OLTP
database. The attached query is generated dynamically by the application.

 (3) The load on standby does not seem to be issue, because with 
 absolutely no load the query takes long and most of the time 
 returned with the conflict error. 

Not suse I understand this. Are you saying that the standby is mostly 
idle, i.e. the query seems to be stuck, and then fails with conflict 
error most of the time? 

The standby is not idle all the time. What I meant was even with no user
activity or no active user sessions, if I issue the query directly from
pgadmin tool it takes for ever. 

Hardware settings both primary and secondary :
===

Red Hat Enterprise Linux Server release 5.5 (Tikanga)
Linux 2.6.18-194.26.1.el5 x86_64
4 CPUs
16 GB RAM
Intel Xeon

Postgresql Version:
= 
 PostgreSQL 9.1.1 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2
20080704 (Red Hat 4.1.2-51), 64-bit

6) After we recovered standby it was fine for few weeks and then 
 again started slowing down. 

Was it slowing down gradually, or did it start failing suddenly? 

Honestly speaking I do not exactly, when users started reporting the issue I
started looking into it. But the performance was good in September and
somewhere in October it started slowing down. I guess it was gradual. There
were no code change in the application or major change in the data volume. 

Hope this helps. Please let me know if you need any other details.

Thanks Again.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Hot-Standby-performance-issue-tp5774673p5775123.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Hot Standby performance issue

2013-10-15 Thread sparikh
Hi, This is my first post ever in the Postgres forum. I am relatively new to
Postgres and coming from oracle background. We have hot stand by setup to
serve mainly the read only queries. Past few days we have been facing a
performance issues on one of the transaction search. The search mainly
utilizes 3 of our biggest transaction tables. We had recently crash on both
primary and standby because of the space issues. Both servers were brought
up and running successfully after that incident. The standby is in almost in
sync with primary, far behind by less than a second. I also rebuilt all the
major indexes on the primary. I have done some research work to address the
issue as following. (1) I checked most of the database parameters settings
and they are same on both primary and standby, except some specific to the
individual server. (2) Checked the explain plan for the offending query and
they are exactly same on both the servers. Checked cpu usage on unix box and
found it was quite low. (3) The load on standby does not seem to be issue,
because with absolutely no load the query takes long and most of the time
returned with the conflict error. (4) The hardware settings are exactly same
on both primary and secondary. (5) The same query executes very fast on
primary (6) After we recovered standby it was fine for few weeks and then
again started slowing down. I believe autovacuum and analyze does not need
to be run on standby as it inherits that from primary. Please correct me if
I am wrong. Any help or suggestion would be greatly appreciated. Thanks, 



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Hot-Standby-performance-issue-tp5774673.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.