Re: [HACKERS] problem with archive_command as suggested by documentation

2009-01-26 Thread Peter Eisentraut

Simon Riggs wrote:

> I don't think that particular example is a good one since the whole
> point of the archive is that it should be off-server. If we're going to
> be exact about the example then we should give a more realistic one,
> like using scp. Unfortunately, there is no secure-remote-move command,
> so doing the above with scp would resend the whole file 3 times. I think
> it's better to write a script...


The problem is that most people do copy the stuff straight out of the 
documentation.  And for those that do write a separate script, there are 
still a lot of possibilities to get it wrong.


There are only about a handful of transportation protocols appearing in 
practice, so it would be very helpful to provide carefully reviewed 
scripts or recipes for these.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] problem with archive_command as suggested by documentation

2009-01-26 Thread Albe Laurenz
Peter Eisentraut wrote:
> I don't think that particular example is a good one since the whole
> point of the archive is that it should be off-server. If we're going to
> be exact about the example then we should give a more realistic one,
> like using scp. Unfortunately, there is no secure-remote-move command,
> so doing the above with scp would resend the whole file 3 times. I think
> it's better to write a script...
>
> The problem is that most people do copy the stuff straight out of the
> documentation.  And for those that do write a separate script, there are
> still a lot of possibilities to get it wrong.
>
> There are only about a handful of transportation protocols appearing in
> practice, so it would be very helpful to provide carefully reviewed
> scripts or recipes for these.

I'll try to come up with something, but it will take some time; an impending
addition to the family currently has priority...

Yours,
Laurenz Albe



Re: [HACKERS] problem with archive_command as suggested by documentation

2009-01-26 Thread David Fetter
On Mon, Jan 26, 2009 at 04:40:12PM +0100, Albe Laurenz wrote:
> Peter Eisentraut wrote:
> > I don't think that particular example is a good one since the
> > whole point of the archive is that it should be off-server. If
> > we're going to be exact about the example then we should give a
> > more realistic one, like using scp. Unfortunately, there is no
> > secure-remote-move command, so doing the above with scp would
> > resend the whole file 3 times. I think it's better to write a
> > script...
> >
> > The problem is that most people do copy the stuff straight out of
> > the documentation.  And for those that do write a separate script,
> > there are still a lot of possibilities to get it wrong.
> >
> > There are only about a handful of transportation protocols
> > appearing in practice, so it would be very helpful to provide
> > carefully reviewed scripts or recipes for these.
>
> I'll try to come up with something, but it will take some time; an
> impending addition to the family currently has priority...

You understand that this means you need to provide blue-elephant-themed
photos, right?

Congratulations! :)

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] problem with archive_command as suggested by documentation

2009-01-23 Thread Simon Riggs

On Fri, 2009-01-23 at 08:20 +0100, Albe Laurenz wrote:

> > Perhaps it should suggest
> > something like:
> >
> > test ! -f .../%f && cp %p .../%f.tmp && mv .../%f.tmp .../%f
> >
> > ie. copy under a different filename first, and rename the file in place
> > after it's completely written, assuming that mv is atomic. It gets a bit
> > complicated, though.
>
> That's a good idea (although it could lead to race conditions in the
> extremely rare case that two clusters want to archive equally named
> files at the same time).
>
> I'll write a patch for that and send it as basis for a discussion.

The example is to help you understand things, not to solve every case. I
think it should start simply and then have additional comments later.

I don't think that particular example is a good one since the whole
point of the archive is that it should be off-server. If we're going to
be exact about the example then we should give a more realistic one,
like using scp. Unfortunately, there is no secure-remote-move command,
so doing the above with scp would resend the whole file 3 times. I think
it's better to write a script...

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] problem with archive_command as suggested by documentation

2009-01-23 Thread Tom Lane
Albe Laurenz <laurenz.a...@wien.gv.at> writes:
> Heikki Linnakangas wrote:
>> Well, the documentation states the reason to do that:
>>
>> This is an important safety feature to preserve the
>> integrity of your archive in case of administrator error
>> (such as sending the output of two different servers to the
>> same archive directory)

> But isn't it true that the vast majority of people have only one
> PostgreSQL cluster per machine, and it is highly unlikely that
> somebody else creates a file with the same name as a WAL segment
> in the archive directory?

That's not the point.  You'd typically be sending the WAL archive to
another machine (via NFS or FTP or whatever), and it's not very hard at
all to imagine accidentally setting up two different machines to point
at the same archive directory on the same backup server.  For instance,
imagine that you're cloning your DB in preparation for an upgrade.
You'll probably start by copying your configuration file...

regards, tom lane



[HACKERS] problem with archive_command as suggested by documentation

2009-01-22 Thread Albe Laurenz
The documentation states in
http://www.postgresql.org/docs/current/static/continuous-archiving.html#BACKUP-ARCHIVING-WAL

The archive command should generally be designed to refuse to overwrite any 
pre-existing archive file.

and suggests an archive_command like test ! -f .../%f && cp %p .../%f.

We ran into (small) problems with an archive_command similar to this
as follows:

The server received a fast shutdown request while a WAL segment was being 
archived.
The archiver stopped and left behind a half-written archive file.

Now when the server was restarted, the archiver tried to archive the same
WAL segment again and got an error because the destination file already
existed.

That means that WAL archiving is stuck until somebody manually removes
the partial archived file.
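
To make the failure mode concrete, here is a minimal sketch that reproduces it with plain files. The directory layout and the segment name are made up for illustration; the guarded command is the documented one with %p and %f expanded by hand, and the interrupted copy is only simulated by writing a truncated file.

```shell
#!/bin/sh
# Reproduce the stuck-archiver condition with ordinary files.
# All paths and the segment name below are hypothetical.
workdir=$(mktemp -d)
mkdir "$workdir/pg_xlog" "$workdir/archive"
seg=000000010000000000000001

# A complete "WAL segment" on the server side.
printf 'complete segment contents\n' > "$workdir/pg_xlog/$seg"

# Simulate an interrupted cp: only part of the file reached the archive.
printf 'compl' > "$workdir/archive/$seg"

# The documented command, with %p and %f expanded by hand.
if test ! -f "$workdir/archive/$seg" &&
   cp "$workdir/pg_xlog/$seg" "$workdir/archive/$seg"
then
    status=0
else
    status=$?
fi

# A nonzero status makes the archiver retry the same segment later,
# and every retry fails the same way while the partial file exists.
echo "archive_command exit status: $status"
```

The partial file is never replaced, so the status stays nonzero until someone removes it by hand.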


I suggest that the documentation be changed so that it does not
recommend this setup. WAL segment names are unique anyway.

What is your opinion? Is the problem I encountered a corner case
that should be ignored?

Yours,
Laurenz Albe



Re: [HACKERS] problem with archive_command as suggested by documentation

2009-01-22 Thread Heikki Linnakangas

Albe Laurenz wrote:

> The documentation states in
> http://www.postgresql.org/docs/current/static/continuous-archiving.html#BACKUP-ARCHIVING-WAL
>
> The archive command should generally be designed to refuse to overwrite any
> pre-existing archive file.
>
> and suggests an archive_command like test ! -f .../%f && cp %p .../%f.
>
> We ran into (small) problems with an archive_command similar to this
> as follows:
>
> The server received a fast shutdown request while a WAL segment was being
> archived.
> The archiver stopped and left behind a half-written archive file.


Hmm, if I'm reading the code correctly, a fast shutdown request 
shouldn't kill an ongoing archive command.



> Now when the server was restarted, the archiver tried to archive the same
> WAL segment again and got an error because the destination file already
> existed.
>
> That means that WAL archiving is stuck until somebody manually removes
> the partial archived file.


Yeah, that's a good point. Even if it turns out that the reason for your
partial write wasn't the fast shutdown request, the archive_command
could be interrupted for some other reason and leave a partially
written file behind.



> I suggest that the documentation be changed so that it does not
> recommend this setup. WAL segment names are unique anyway.


Well, the documentation states the reason to do that:


This is an important safety feature to preserve the integrity of your archive 
in case of administrator error (such as sending the output of two different 
servers to the same archive directory)


which seems like a reasonable concern too. Perhaps it should suggest 
something like:


test ! -f .../%f && cp %p .../%f.tmp && mv .../%f.tmp .../%f

ie. copy under a different filename first, and rename the file in place 
after it's completely written, assuming that mv is atomic. It gets a bit 
complicated, though.
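
A minimal sketch of the copy-then-rename idea as a shell function, assuming the temporary file and the final file live in the same directory so that mv is a plain rename. The function name archive_wal and its three arguments are hypothetical; nothing like it ships with PostgreSQL.

```shell
#!/bin/sh
# archive_wal SRC DESTDIR NAME   (hypothetical helper, not part of PostgreSQL)
# Copies SRC to DESTDIR/NAME without ever exposing a half-written
# file under the final name.
archive_wal() {
    src=$1 destdir=$2 name=$3

    # Refuse to overwrite a completed archive file.
    test ! -f "$destdir/$name" || return 1

    # Copy under a temporary name. A stale .tmp left behind by an
    # interrupted run is simply overwritten here.
    cp "$src" "$destdir/$name.tmp" || return 1

    # Rename into place only after the copy has finished.
    mv "$destdir/$name.tmp" "$destdir/$name"
}
```

It would have to be wrapped in a script of your own and called from archive_command with %p and %f. The existence check and the rename are still two separate steps, so this does not protect against two servers archiving the same name at the same moment.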


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] problem with archive_command as suggested by documentation

2009-01-22 Thread decibel

On Jan 22, 2009, at 10:18 AM, Albe Laurenz wrote:
> The archive command should generally be designed to refuse to
> overwrite any pre-existing archive file.
...
> The server received a fast shutdown request while a WAL segment was
> being archived.
> The archiver stopped and left behind a half-written archive file.
>
> Now when the server was restarted, the archiver tried to archive the same
> WAL segment again and got an error because the destination file already
> existed.
>
> That means that WAL archiving is stuck until somebody manually removes
> the partial archived file.
>
> I suggest that the documentation be changed so that it does not
> recommend this setup. WAL segment names are unique anyway.
>
> What is your opinion? Is the problem I encountered a corner case
> that should be ignored?


The test is recommended because if you accidentally set two different  
clusters to archive to the same location you'll trash everything. I  
don't know of a good work-around; IIRC we used to leave the archive  
command to complete, but that could seriously delay shutdown so it  
was changed. I don't think we created an option to control that  
behavior.

--
Decibel!, aka Jim C. Nasby, Database Architect  deci...@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828





Re: [HACKERS] problem with archive_command as suggested by documentation

2009-01-22 Thread Albe Laurenz
Heikki Linnakangas wrote:
> > The documentation states
> >
> > The archive command should generally be designed to refuse to overwrite
> > any pre-existing archive file.
> >
> > and suggests an archive_command like test ! -f .../%f && cp %p .../%f.
> >
> > We ran into (small) problems with an archive_command similar to this
> > as follows:
> >
> > The server received a fast shutdown request while a WAL segment was being
> > archived.
> > The archiver stopped and left behind a half-written archive file.
>
> Hmm, if I'm reading the code correctly, a fast shutdown request
> shouldn't kill an ongoing archive command.

Maybe it died because of a signal 1, I don't know.
But it left behind a half-written file.

> > Now when the server was restarted, the archiver tried to archive the same
> > WAL segment again and got an error because the destination file already
> > existed.
> >
> > That means that WAL archiving is stuck until somebody manually removes
> > the partial archived file.
>
> Yeah, that's a good point. Even if it turns out that the reason for your
> partial write wasn't the fast shutdown request, the archive_command
> could be interrupted for some other reason and leave a partially
> written file behind.
>
> > I suggest that the documentation be changed so that it does not
> > recommend this setup. WAL segment names are unique anyway.
>
> Well, the documentation states the reason to do that:
>
> This is an important safety feature to preserve the
> integrity of your archive in case of administrator error
> (such as sending the output of two different servers to the
> same archive directory)
>
> which seems like a reasonable concern too.

Of course, that's why I did that at first.

But isn't it true that the vast majority of people have only one
PostgreSQL cluster per machine, and it is highly unlikely that
somebody else creates a file with the same name as a WAL segment
in the archive directory?

> Perhaps it should suggest
> something like:
>
> test ! -f .../%f && cp %p .../%f.tmp && mv .../%f.tmp .../%f
>
> ie. copy under a different filename first, and rename the file in place
> after it's completely written, assuming that mv is atomic. It gets a bit
> complicated, though.

That's a good idea (although it could lead to race conditions in the
extremely rare case that two clusters want to archive equally named
files at the same time).
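
One way to narrow that race, sketched under assumptions: give each attempt its own temporary name and publish it with ln, which refuses to replace an existing name, so the existence check and the publish happen in one step. The function name archive_wal_excl is hypothetical, and the sketch assumes that the archive filesystem supports hard links and that the shell's process ID is unique enough among the processes writing there.

```shell
#!/bin/sh
# archive_wal_excl SRC DESTDIR NAME   (hypothetical helper)
# Publishes SRC as DESTDIR/NAME only if that name does not exist yet.
archive_wal_excl() {
    src=$1 destdir=$2 name=$3
    tmp="$destdir/$name.$$.tmp"    # temporary name private to this process

    # Copy the whole file under the private name first.
    cp "$src" "$tmp" || { rm -f "$tmp"; return 1; }

    # ln fails if the final name already exists, so a second writer
    # cannot replace a file that another writer has already published.
    if ln "$tmp" "$destdir/$name" 2>/dev/null; then
        rm -f "$tmp"
        return 0
    fi

    rm -f "$tmp"
    return 1
}
```

With this, the loser of the race gets a nonzero status and the winner's file stays intact, which is the behaviour the documentation asks for.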

I'll write a patch for that and send it as basis for a discussion.

Yours,
Laurenz Albe
