Thanks for that comprehensive response!

And you are right about practicing restores; I have never had to do one :-)

However, I use pg_dump on a regular basis (custom format), but I did not know 
the difference between a database and a database cluster (and pg_dumpall) until 
I had to move everything because the PGDATA drive started overheating.

Now that I better understand what is going on with the backup/restore 
processes, and considering...
- The time it is going to take to rebuild the whole cluster;
- That I am the only user of the database;
- That everything was just fine with the database, except the temperature of 
the drive;
- And that the initial concern of this thread was about bad copies of 
symbolic links on Windows;

I will make another attempt to copy everything to another drive from Windows, 
unless someone tells me it is not possible.
- I will move the contents of my external tablespaces back into PGDATA and 
drop the tablespaces while I copy the db to the new drive.
- Doing so, I will get rid of the symbolic links (from the tablespaces) from 
which the initial error message originated.
- Without symbolic links, I should be able to copy the db using standard 
Windows commands.
- Setting the new drive's letter to the old one before restarting the db is 
easy.
- The whole process should take 12 hours instead of a week.
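For what it's worth, the tablespace-consolidation step could look roughly like 
the sketch below. The tablespace, table and index names are made up for 
illustration; it needs a superuser session in each affected database, and 
moving a relation rewrites its files, so it takes time and disk space.

```sql
-- Move each table and index back into the default tablespace (inside PGDATA).
-- Repeat for every object living in the external tablespace.
ALTER TABLE my_big_table SET TABLESPACE pg_default;
ALTER INDEX my_big_index SET TABLESPACE pg_default;

-- Once no object references it, dropping the tablespace also removes its
-- symbolic link under $PGDATA/pg_tblspc/:
DROP TABLESPACE my_external_ts;
```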

Hoping this makes sense and that I have not missed something important (again).
Thanks for your patience :-)
Daniel


-----Original Message-----
From: pgsql-general-ow...@postgresql.org 
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Francisco Olarte
Sent: May-15-15 12:20
To: Daniel Begin
Cc: r...@iol.ie; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Restarting DB after moving to another drive

Hi Daniel:

On Fri, May 15, 2015 at 5:35 PM, Daniel Begin <jfd...@hotmail.com> wrote:
> Bonjour Francisco.
Buenos dias.

> Skimming the documentation sequentially is clever advice, especially
> since the doc is most of the time well done and exhaustive.
> Unfortunately, even though I actually did it about a year ago, it seems
> this specific item slipped out of my mind :-(

Bad luck; you managed to overlook one of the most important commands, but it's 
solved now.

> About dump/restore operation, restoring the database cluster is running for 
> 24hrs now (psql -f pgdumpallOutputfile  postgres). Since it took 13hrs to 
> dump the cluster, I begin to wonder how long it is going to take to restore 
> it...

Much longer than that, especially if, as you say below, you have a lot of 
indexes. It's one of the reasons many of us do not use pg_dumpall for anything 
but global objects, and instead use something like the crontab which John R. 
Pierce posted. (I use a similar thing, but with an intermediate script which 
dumps critical databases more frequently, skips recreatable databases (can 
someone confirm that word is right? I mean test databases which can be created 
from scratch, as they come from a script), and keeps several numbered copies.) 
Doing it this way ensures we can restore in order of criticality if it needs 
to be done (I even move unmodified partitions to a 'historic' schema, which 
gets dumped only after a change, which cuts my backup times to a tenth).
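This is not Francisco's actual script, but a minimal sketch of the 
per-database approach he describes (globals via pg_dumpall -g, one 
custom-format dump per database, simplistic dated copies; the backup path is 
hypothetical, and it assumes passwordless local access to the server):

```shell
#!/bin/sh
# Dump globals (roles, tablespaces) separately from the databases.
BACKUP_DIR=/backups                 # hypothetical location
STAMP=$(date +%Y%m%d%H%M)

pg_dumpall -g > "$BACKUP_DIR/globals_$STAMP.sql"

# One custom-format dump per non-template database.
for db in $(psql -At -c "SELECT datname FROM pg_database WHERE NOT datistemplate"); do
    pg_dump -Fc -f "$BACKUP_DIR/${db}_$STAMP.dump" "$db"
done
```

A real script would add rotation of the numbered copies and different 
schedules for critical vs. recreatable databases, as Francisco mentions.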

One thing: I do not know how you are restoring the database, but when doing 
these things we use a specially tuned postgresql.conf (fsync off, minimal 
logging, lots of work_mem and similar things), since we do not care about 
durability (you can just rerun initdb and redo the restore, and there is only 
one session connected, the restoring one).
This easily cuts the restore time to a tenth; after it finishes we restart 
the server with the normal config. It is a must when doing this sort of thing.
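As a rough sketch, restore-only settings of the kind described might look like 
this (exact values are guesses to size to your machine; every one of these 
must be reverted before normal operation, since fsync=off can corrupt the 
cluster on a crash):

```
# postgresql.conf overrides for a one-off restore ONLY -- not for production
fsync = off                     # no durability; just redo the restore on a crash
synchronous_commit = off
full_page_writes = off
wal_level = minimal             # minimal logging
autovacuum = off                # no background vacuuming during the load
maintenance_work_mem = 1GB      # faster index builds (size to available RAM)
checkpoint_segments = 64        # fewer checkpoints during the bulk load
```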


> My main concern is about how the indexes are managed in dump/restore 
> operations. I understand that pg_dumpall actually uses pg_dump where the doc 
> says "Post-data items include definitions of indexes, triggers..." I would 
> not worry if the doc said that indexes are simply copied but it says 
> "includes definition of indexes".
> Since some of the indexes took days to build... could someone confirm 
> that indexes are rebuilt instead of copied?
> If indexes are actually rebuilt, why should it be done that way? - There must 
> be good reason!

You are out of luck, and there is a reason for it. First, pg_dump does not 
copy, it dumps. Its simplest behaviour (text output) just outputs an SQL 
script which recreates everything and inserts all the data (normally using 
COPY for speed, but it is equivalent to inserting). It takes care of 
generating a fast script (meaning it creates the tables, then inserts the 
data, then creates indexes and reactivates constraints, which is faster than 
defining everything up front and inserting with indexes and constraints 
active).

The reason to do it in text mode is that you can dump between different 
versions and/or architectures, and the dump is also much smaller than the db, 
especially if you compress it (I always do; with a bit of testing you can 
always find a compressor which leads to faster backups, since the saved disk 
writes easily offset the compression time, especially on modern multi-CPU, 
memory-rich machines). Bear in mind that in many scenarios you back up a lot 
(we dump some critical things hourly, even though we use replication) and 
restore nearly never, and prefer to spend a couple more days on a restore 
rather than a couple of hours of degraded performance on every backup.
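For a text dump, the compression mentioned above is just a pipe; gzip here is 
only an example, and benchmarking other compressors may pay off (database and 
file names are made up):

```shell
# Text format: compress on the fly while dumping.
pg_dump mydb | gzip > mydb.sql.gz

# Custom format: compression is built in and tunable with -Z (0-9).
pg_dump -Fc -Z 6 -f mydb.dump mydb
```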

That being said, if you have an 820 GB db (I still do not know what that size 
refers to; I suppose it's the $PGDATA footprint) of important data (it does 
not seem critical in availability, as you are taking days and are still in 
business), and you are having this kind of problem dumping, restoring and 
moving directories in your OS, and do not know how long your backups take, 
you have a problem. You should practice backup AND restore more, because your 
question indicates you MAY be backing up your data, but you have never 
restored a backup.

Also, the text output format is really good for the global objects in 
pg_dumpall, but not so much for the normal databases. For those you should 
use the custom format, unless it is a really small db. Its drawbacks are that 
it can only dump one database per file and that it needs pg_restore to be 
read (I know those are minor). The advantage is that instead of generating a 
plain text dump it builds a kind of tar file with the definitions and data 
for every object clearly separated, so you can do partial restores or 
whatever you want (in fact, without options and without connecting to a 
database, pg_restore spits out the same text file that a text dump would 
generate). If you had used this technique you could have restored your tables 
in order, or restored only the data and then reindexed concurrently with some 
other (performance-degraded) work. You can do the same thing by editing the 
text dump, but it gets impractical and really complex beyond a few megabytes.
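For illustration, the partial-restore workflow with a custom-format dump looks 
roughly like this (file, database, index and table names are made up):

```shell
# Write the archive's table of contents to an editable list file:
pg_restore -l mydb.dump > mydb.list

# Edit mydb.list: prefix entries you want to skip for now with ';'
# (e.g. the post-data INDEX entries), then restore only what remains:
pg_restore -L mydb.list -d mydb mydb.dump

# Later, build the skipped indexes without blocking concurrent work:
psql -d mydb -c "CREATE INDEX CONCURRENTLY my_idx ON my_table (col);"
```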

regards.
   Francisco Olarte.


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make 
changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
