Re: [ADMIN] Delay in completion of aggregation inserts when run in a single commit - PG 9.1.2

2012-07-05 Thread Craig Ringer

On 07/05/2012 02:30 AM, Akash Kodibail wrote:
> - Staging table population happens in expected time. Anywhere from
>   10-15 minutes for each process.
>
> - The aggregation process (a set of 15 insert queries) almost never
>   completes. The entire process happens in a single commit (population
>   of staging tables and aggregation).



When you say it almost never completes - can you be more specific?

Is the query not progressing - say, waiting on a lock?

Is it doing work, but performing very poorly?

Does it terminate with a failure? If so, error message?

If the query just keeps running but you're not sure what it is or isn't 
doing, you should examine pg_stat_activity and pg_locks. From 
pg_stat_activity you can get the backend process ID; examine that 
backend using system tools (ps, top, etc) to see if it's using CPU and 
whether it's doing disk I/O. From pg_locks you can get an idea of 
whether a query is stuck waiting on a lock.
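
For example (a rough sketch against the 9.1 catalogs, where
pg_stat_activity's pid column is still called procpid):

  -- what each backend is doing, and whether it is blocked on a lock:
  SELECT procpid, waiting, current_query FROM pg_stat_activity;

  -- any lock requests that have not been granted:
  SELECT locktype, relation::regclass AS rel, mode, pid
  FROM pg_locks WHERE NOT granted;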
> Could you be kind enough to explain how differently queries run when
> in a “single commit” as opposed to running each of them individually?

There shouldn't be tons of difference. The first queries continue to
hold locks while the others run, but since you're in the same 
transaction it won't affect subsequent queries, only concurrent ones.


If you have other concurrent work going on it's possible you're running 
into locking issues that you don't encounter when you release locks with 
individual commits.


Also, some resources can only be released when a transaction commits. If
you're doing something like using huge numbers of savepoints or PL/pgSQL
BEGIN ... EXCEPTION blocks, it's possible you're running into resource
issues. The SAVEPOINT code has improved massively since 8.3, when I had
similar issues, but savepoints still aren't free.
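
To illustrate the cost (a contrived sketch; table t is hypothetical):
each pass through the EXCEPTION block sets and releases an implicit
savepoint, so a loop like this pays that cost once per row:

  DO $$
  BEGIN
      FOR i IN 1..100000 LOOP
          BEGIN
              INSERT INTO t VALUES (i);  -- t is a made-up table
          EXCEPTION WHEN unique_violation THEN
              NULL;  -- swallow duplicates; the savepoint churn is here
          END;
      END LOOP;
  END;
  $$;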


Start by looking at the locking situation, and by determining whether 
your query is making slow forward progress or is simply waiting.


> Postgresql.conf remains untouched, except for log writing which has
> been made to “all”.


That's going to be VERY sub-optimal if you're bulk-loading data. You'll
need to push checkpoint_segments right up. Check your logs; you're
probably seeing warnings about checkpoints being too frequent. Hell, my
checkpoint_segments is at 50 on my *laptop* - albeit a very fast
SSD-equipped laptop that does lots of bulk data unit testing.
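
As a hedged starting point in postgresql.conf (illustrative values, not
tuned for any particular hardware):

  checkpoint_segments = 32            # up from the default of 3
  checkpoint_completion_target = 0.9  # spread checkpoint I/O out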


You'll want to push effective_cache_size up to 4 or 5 GB given you have 
three instances. Hard to say re shared_buffers, I have little experience 
with running multiple instances on one machine.
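
Per instance, that might look like this in postgresql.conf (a sketch
only, assuming the 4-5 GB figure above; shared_buffers deliberately left
out per the caveat):

  effective_cache_size = 4GB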


--
Craig Ringer



Re: [ADMIN] Delay in completion of aggregation inserts when run in a single commit - PG 9.1.2

2012-07-05 Thread Craig Ringer

On 07/05/2012 04:00 PM, Albe Laurenz wrote:

> Akash Kodibail wrote:
>> Problem:
>> [...]
>> - The aggregation process (a set of 15 insert queries) almost never
>>   completes. The entire process happens in a single commit (population
>>   of staging tables and aggregation).
>> - When I run these 15 queries individually, it happens in no time.
>> Could you be kind enough to explain how differently queries run when
>> in a single commit as opposed to running each of them individually?
>
> Apart from minor differences (current_timestamp, ...) the
> main difference is that there is no COMMIT after each query.
> COMMIT is expensive since it requires a write to hard disk.
>
> Try to get the execution plans of the queries in both cases
> and see if you can spot a difference.
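
One way to capture those plans inside the batch transaction (a sketch;
the table and column names are made up). Note that EXPLAIN ANALYZE on an
INSERT really executes it, so wrap it in a transaction you can roll back
while testing:

  BEGIN;
  EXPLAIN (ANALYZE, BUFFERS)
      INSERT INTO agg_daily          -- hypothetical aggregation target
      SELECT day, sum(amount)
      FROM staging_events            -- hypothetical staging table
      GROUP BY day;
  ROLLBACK;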


Actually, that's a thought. Autovacuum can't see uncommitted work, and 
thus can't analyze it. Maybe they need an explicit ANALYZE or two after 
a bulk insert or update during their data load.
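
That could be as simple as the following (staging_events and raw_feed
are hypothetical names; ANALYZE works inside the open transaction, and
later statements in the same transaction see the fresh statistics):

  INSERT INTO staging_events (day, amount)
  SELECT day, amount FROM raw_feed;

  ANALYZE staging_events;  -- refresh stats before the aggregation inserts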


--
Craig Ringer



Re: [ADMIN] auto vacuum errors

2012-07-05 Thread Isabella Ghiurea

Hi Tom,
Here is my error; the missing files were removed by mistake. I have 
another similar db cfg with these tables (spherrtmp%) and no issues.
I'm thinking of restoring these tables' content from this second db; 
will this eliminate the autovacuum error?


2012-07-05 07:45:59.770 PDT,,,31112,,4ff5a8a7.7988,5,,2012-07-05 07:45:59 
PDT,19/2479,0,ERROR,58P01,could not open relation 26384/247389/248165: 
No such file or directory,automatic vacuum of table 
contrib_regression.public.spheretmp5,,,
2012-07-05 07:45:59.770 PDT,,,31112,,4ff5a8a7.7988,6,,2012-07-05 
07:45:59 PDT,19/2481,0,ERROR,58P01,could not open relation 
26384/247389/248174: No such file or directory,automatic vacuum of 
table contrib_regression.public.spheretmp6,,,



On 07/04/2012 09:18 PM, Tom Lane wrote:

> Isabella Ghiurea <isabella.ghiu...@nrc-cnrc.gc.ca> writes:
>> I'm seeing an issue with some temp tables in contrib_regression db; the
>> autovacuum job will not run for these tables due to some files/OIDs
>> missing. How can I fix this?
>
> Um ... in general, autovacuum just silently ignores temp tables, because
> it doesn't have any way to access their contents.  (One of the reasons
> that temp tables are fast is that the backend owning a temp table keeps
> its data in local buffers, not in shared buffers that another process
> could access.)  I wouldn't expect any missing file complaints though.
> What exactly were you seeing?
>
> regards, tom lane



--
---
Isabella A. Ghiurea

isabella.ghiu...@nrc-cnrc.gc.ca
Canadian Astronomy Data Centre | http://www.nrc-cnrc.gc.ca/eng/services/hia/data-centre.html
National Research Council of Canada, Herzberg Institute of Astrophysics
5071 West Saanich Road, Victoria BC V9E 2E7, Canada
Phone: 250 363-3446 fax: 250 363-0045




Re: [ADMIN] auto vacuum errors

2012-07-05 Thread Tom Lane
Isabella Ghiurea <isabella.ghiu...@nrc-cnrc.gc.ca> writes:
> Here is my error; the missing files were removed by mistake. I have
> another similar db cfg with these tables (spherrtmp%) and no issues.
> I'm thinking of restoring these tables' content from this second db;
> will this eliminate the autovacuum error?

What you ought to do is drop the tables that are reported as causing the
error.  Or the whole database --- are you actually doing something
useful in contrib_regression, or is that just left over from testing?
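
For example (table names taken from the log excerpt above; verify
carefully before dropping anything):

  DROP TABLE IF EXISTS spheretmp5, spheretmp6;
  -- or, if the whole database is just regression-test leftovers
  -- (run this while connected to a different database):
  DROP DATABASE contrib_regression;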

regards, tom lane



[ADMIN] Re: Delay in completion of aggregation inserts when run in a single commit - PG 9.1.2

2012-07-05 Thread Akash
@@Craig

I could only wait for 6 hours. I cancelled the process after that.

No locks on any of the tables when running.

Yes, it was progressing. Of the 15 tables, when I cancelled after 6 hours,
the 3rd table was getting populated (when checked in pg_stat_activity).

Below are some changes I have in postgresql.conf, some based on your
suggestions:

1) constraint_exclusion = partition, which I had already set, supported by
a trigger on every insert with around 100 checks for each table (date
constraints).
2) log_destination = 'stderr'
3) logging_collector = on
4) autovacuum = off; we are doing a daily vacuum analyze on all tables.
5) shared_buffers = 32MB
6) max_prepared_transactions = 100
7) work_mem = 16MB
8) maintenance_work_mem = 64MB
9) wal_writer_delay = 1000ms
10) checkpoint_segments = 16

After these changes the process is moving forward, but somehow I feel it
is only a matter of time before I reach the next choking point, since I
am seeing that process completion takes a little longer each time I run
it.

Another thing I noticed: when I run the query:

select relname, age(relfrozenxid) from pg_class order by age(relfrozenxid)
desc;

        relname        |    age
-----------------------+------------
 pg_toast_28344_index  | 2147483647
 pg_toast_28351_index  | 2147483647
 pg_toast_33106_index  | 2147483647
 pg_toast_33099_index  | 2147483647
 pg_toast_32128_index  | 2147483647
 pg_toast_28232_index  | 2147483647
 pg_toast_33092_index  | 2147483647
 promo_seq             | 2147483647
 pg_toast_33085_index  | 2147483647
 pg_toast_32135_index  | 2147483647
 pg_toast_33120_index  | 2147483647
... There are many more.
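
(Those 2147483647 ages are mostly a red herring: indexes and other
non-table relations carry no relfrozenxid, and age() of such an invalid
xid reports as 2147483647. A sketch of the query restricted to ordinary
tables is more telling:

  SELECT relname, age(relfrozenxid)
  FROM pg_class
  WHERE relkind = 'r'  -- plain tables only
  ORDER BY age(relfrozenxid) DESC;
)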

Regards,
Akash.




Re: [ADMIN] Re: Delay in completion of aggregation inserts when run in a single commit - PG 9.1.2

2012-07-05 Thread Craig Ringer

On 07/06/2012 01:06 PM, Akash wrote:

> @@Craig
>
> I could only wait for 6 hours. I cancelled the process after that.
>
> No locks on any of the tables when running.

That's a very ambiguous statement. I'm assuming you are saying your code 
does not take any explicit locks on those tables using LOCK TABLE or 
SELECT ... FOR SHARE / FOR UPDATE - because there /most certainly are/ 
locks on those tables whenever you're running a query against them. Even 
a simple SELECT takes out a lock to prevent the table from being dropped 
while the query runs.


It may well be that no locks were taken by transactions other than the 
one doing the run, though.
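
From a second session you can see exactly what the loading backend holds
(a sketch):

  SELECT locktype, relation::regclass AS rel, mode, granted, pid
  FROM pg_locks
  WHERE relation IS NOT NULL
  ORDER BY pid;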



> Yes, it was progressing. Of the 15 tables, when I cancelled after 6 hours,
> the 3rd table was getting populated (when checked in pg_stat_activity).

OK. If you turn auto_explain on and use it to get a plan for the slow 
queries, do those plans differ from the plans produced when running the 
same queries standalone?
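
A sketch of the relevant settings (in postgresql.conf; the module can
also be loaded per-session with LOAD 'auto_explain'):

  shared_preload_libraries = 'auto_explain'
  custom_variable_classes = 'auto_explain'  # needed on 9.1
  auto_explain.log_min_duration = '10s'     # illustrative threshold
  auto_explain.log_analyze = on             # log actual times and row counts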


Did you examine iostat? Did you look at what work the postgres process 
was doing, to see if it was busy with CPU or I/O (iowait), or if it was 
idle waiting for something else to happen?
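
For example, with the procpid value from pg_stat_activity (24601 is a
made-up PID):

  $ top -p 24601   # high %CPU => CPU-bound; high 'wa' => waiting on I/O
  $ iostat -x 5    # per-device utilisation and await, sampled every 5s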



> 4) autovacuum = off; we are doing a daily vacuum analyze on all tables.

No!

Unless your database has /no/ write activity other than these batch 
loads, you should have autovacuum on. The more frequently autovacuum 
runs, the better a job it will do of preventing bloat, maintaining table 
stats, etc.


If you have tables that are only ever modified by a bulk-load script 
that then immediately vacuums them, set the per-table autovacuum 
parameters so those tables are excluded; don't turn autovacuum off 
entirely.
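
Something like this, per table (staging_events is a hypothetical name):

  ALTER TABLE staging_events SET (autovacuum_enabled = false);
  -- the load script then takes over that table's maintenance itself:
  VACUUM ANALYZE staging_events;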




> 5) shared_buffers = 32MB

That's tiny and will severely constrain Pg's resources.

I notice you didn't set effective_cache_size either.

> After these changes the process is moving forward, but somehow I feel it
> is only a matter of time before I reach the next choking point, since I
> am seeing that process completion takes a little longer each time I run
> it.
Well, your tables and indexes are probably bloating horribly because 
autovacuum is turned off, so that's not surprising.


If you drop and recreate, or TRUNCATE, the tables between load runs you 
might be OK with autovac off for those tables, but what you're 
describing makes me think otherwise.


--
Craig Ringer