Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Greg Stark
On Tue, Aug 19, 2014 at 10:42 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Is there a way to create a link to a file which only exists as an open
 file descriptor?   If there was, you could create a temp file, open an
 fd, then delete the file.  That would remove the issue with files being
 leaked due to failures of various kinds.


Sort of. On recent Linuxen you can create a file with open(O_TMPFILE)
then use linkat(2) to create a link for it only once it's fully
written.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Greg Stark
c.f.:

   O_TMPFILE (since Linux 3.11)
  Create an unnamed temporary file.  The pathname argument
  specifies a directory; an unnamed inode will be created in
  that directory's filesystem.  Anything written to the
  resulting file will be lost when the last file descriptor is
  closed, unless the file is given a name.

  O_TMPFILE must be specified with one of O_RDWR or O_WRONLY
  and, optionally, O_EXCL.  If O_EXCL is not specified, then
  linkat(2) can be used to link the temporary file into the
  filesystem, making it permanent, using code like the
  following:

  char path[PATH_MAX];
  fd = open(/path/to/dir, O_TMPFILE | O_RDWR,
  S_IRUSR | S_IWUSR);

  /* File I/O on 'fd'... */

  snprintf(path, PATH_MAX,  /proc/self/fd/%d, fd);
  linkat(AT_FDCWD, path, AT_FDCWD, /path/for/file,
  AT_SYMLINK_FOLLOW);

  In this case, the open() mode argument determines the file
  permission mode, as with O_CREAT.

  Specifying O_EXCL in conjunction with O_TMPFILE prevents a
  temporary file from being linked into the filesystem in the
  above manner.  (Note that the meaning of O_EXCL in this case
  is different from the meaning of O_EXCL otherwise.)

  There are two main use cases for O_TMPFILE:

  *  Improved tmpfile(3) functionality: race-free creation of
 temporary files that (1) are automatically deleted when
 closed; (2) can never be reached via any pathname; (3) are
 not subject to symlink attacks; and (4) do not require the
 caller to devise unique names.

  *  Creating a file that is initially invisible, which is then
 populated with data and adjusted to have appropriate
 filesystem attributes (chown(2), chmod(2), fsetxattr(2),
 etc.)  before being atomically linked into the filesystem
 in a fully formed state (using linkat(2) as described
 above).

  O_TMPFILE requires support by the underlying filesystem; only
  a subset of Linux filesystems provide that support.  In the
  initial implementation, support was provided in the ext2,
  ext3, ext4, UDF, Minix, and shmem filesystems.  XFS support
  was added in Linux 3.15.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Alvaro Herrera
Greg Stark wrote:

   char path[PATH_MAX];
   fd = open(/path/to/dir, O_TMPFILE | O_RDWR,
   S_IRUSR | S_IWUSR);
 
   /* File I/O on 'fd'... */
 
   snprintf(path, PATH_MAX,  /proc/self/fd/%d, fd);
   linkat(AT_FDCWD, path, AT_FDCWD, /path/for/file,
   AT_SYMLINK_FOLLOW);

Hmm, the real trick here is linkat(... /proc/self/foobar), not the
O_TMPFILE: you can have an open file descriptor to an invisible file
simply by creating a normal file and unlinking it.  I looked at linkat()
yesterday but the idea of using /proc/self didn't occur to me.  Nasty
trick :-(  It seems linkat() is quite a bit more portable than
O_TMPFILE, fortunately ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Andres Freund
On 2014-08-19 17:42:06 -0400, Alvaro Herrera wrote:
 MauMau wrote:
 
  With that said, copying to a temporary file like dest.tmp and
  renaming it to dest sounds worthwhile even as a basic copy
  utility.  I want to avoid copying to a temporary file with a fixed
  name like _copy.tmp, because some advanced utility may want to run
  multiple instances of pg_copy to copy several files into the same
  directory simultaneously.  However, I'm afraid multiple dest.tmp
  files might continue to occupy disk space after canceling copy or
  power failure in some use cases, where the copy of the same file
  won't be retried.  That's also the reason why I chose to not use a
  temporary file like cp/copy.
 
 Is there a way to create a link to a file which only exists as an open
 file descriptor?   If there was, you could create a temp file, open an
 fd, then delete the file.  That would remove the issue with files being
 leaked due to failures of various kinds.

Isn't this a solution looking for a problem? We're using tempfiles in
dozens of other places and I really don't see why this is the place to
stop doing so. Just copy to dest.tmp and move it into place. If things
crash during that, the command will be repeated shortly afterwards again
*anyway*. Let's not get into platform specific games here.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Alvaro Herrera
Andres Freund wrote:
 On 2014-08-19 17:42:06 -0400, Alvaro Herrera wrote:
  MauMau wrote:
  
   With that said, copying to a temporary file like dest.tmp and
   renaming it to dest sounds worthwhile even as a basic copy
   utility.  I want to avoid copying to a temporary file with a fixed
   name like _copy.tmp, because some advanced utility may want to run
   multiple instances of pg_copy to copy several files into the same
   directory simultaneously.  However, I'm afraid multiple dest.tmp
   files might continue to occupy disk space after canceling copy or
   power failure in some use cases, where the copy of the same file
   won't be retried.  That's also the reason why I chose to not use a
   temporary file like cp/copy.
  
  Is there a way to create a link to a file which only exists as an open
  file descriptor?   If there was, you could create a temp file, open an
  fd, then delete the file.  That would remove the issue with files being
  leaked due to failures of various kinds.
 
 Isn't this a solution looking for a problem? We're using tempfiles in
 dozens of other places and I really don't see why this is the place to
 stop doing so. Just copy to dest.tmp and move it into place. If things
 crash during that, the command will be repeated shortly afterwards again
 *anyway*. Let's not get into platform specific games here.

The issue is what happens if there's a crash while the temp file is in
the middle of being filled.  I agree there are bigger problems to solve
in this area though.  Yes, I agree that a fixed name such as _copy.tmp
is a really bad choice for several reasons.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Greg Stark
On Wed, Aug 20, 2014 at 2:27 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Hmm, the real trick here is linkat(... /proc/self/foobar), not the
 O_TMPFILE: you can have an open file descriptor to an invisible file
 simply by creating a normal file and unlinking it.  I looked at linkat()
 yesterday but the idea of using /proc/self didn't occur to me.  Nasty
 trick :-(  It seems linkat() is quite a bit more portable than
 O_TMPFILE, fortunately ...

Supposedly linkat(2) on Linux refuses to create a link to a file that
was opened normally and had its last link removed with unlink(2) due
to concerns that this would create security holes.


-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Andres Freund
On 2014-08-20 09:50:56 -0400, Alvaro Herrera wrote:
 Andres Freund wrote:
  On 2014-08-19 17:42:06 -0400, Alvaro Herrera wrote:
   MauMau wrote:
   
With that said, copying to a temporary file like dest.tmp and
renaming it to dest sounds worthwhile even as a basic copy
utility.  I want to avoid copying to a temporary file with a fixed
name like _copy.tmp, because some advanced utility may want to run
multiple instances of pg_copy to copy several files into the same
directory simultaneously.  However, I'm afraid multiple dest.tmp
files might continue to occupy disk space after canceling copy or
power failure in some use cases, where the copy of the same file
won't be retried.  That's also the reason why I chose to not use a
temporary file like cp/copy.
   
   Is there a way to create a link to a file which only exists as an open
   file descriptor?   If there was, you could create a temp file, open an
   fd, then delete the file.  That would remove the issue with files being
   leaked due to failures of various kinds.
  
  Isn't this a solution looking for a problem? We're using tempfiles in
  dozens of other places and I really don't see why this is the place to
  stop doing so. Just copy to dest.tmp and move it into place. If things
  crash during that, the command will be repeated shortly afterwards again
  *anyway*. Let's not get into platform specific games here.
 
 The issue is what happens if there's a crash while the temp file is in
 the middle of being filled.

The archive command will be be run again a couple seconds and remove the
half-filled temp file.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Alvaro Herrera
Greg Stark wrote:
 On Wed, Aug 20, 2014 at 2:27 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  Hmm, the real trick here is linkat(... /proc/self/foobar), not the
  O_TMPFILE: you can have an open file descriptor to an invisible file
  simply by creating a normal file and unlinking it.  I looked at linkat()
  yesterday but the idea of using /proc/self didn't occur to me.  Nasty
  trick :-(  It seems linkat() is quite a bit more portable than
  O_TMPFILE, fortunately ...
 
 Supposedly linkat(2) on Linux refuses to create a link to a file that
 was opened normally and had its last link removed with unlink(2) due
 to concerns that this would create security holes.

Sigh.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-08-20 09:50:56 -0400, Alvaro Herrera wrote:
 Andres Freund wrote:
 Isn't this a solution looking for a problem? We're using tempfiles in
 dozens of other places and I really don't see why this is the place to
 stop doing so. Just copy to dest.tmp and move it into place.

 The issue is what happens if there's a crash while the temp file is in
 the middle of being filled.

 The archive command will be be run again a couple seconds and remove the
 half-filled temp file.

Alternatively, you could use the process PID as part of the temp file
name; which is probably a good idea anyway.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Andres Freund
On 2014-08-20 10:19:33 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-08-20 09:50:56 -0400, Alvaro Herrera wrote:
  Andres Freund wrote:
  Isn't this a solution looking for a problem? We're using tempfiles in
  dozens of other places and I really don't see why this is the place to
  stop doing so. Just copy to dest.tmp and move it into place.
 
  The issue is what happens if there's a crash while the temp file is in
  the middle of being filled.
 
  The archive command will be be run again a couple seconds and remove the
  half-filled temp file.
 
 Alternatively, you could use the process PID as part of the temp file
 name; which is probably a good idea anyway.

I think that's actually worse, because nothing will clean up those
unless you explicitly scan for all whatever.$pid files, and somehow
kill them.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-08-20 10:19:33 -0400, Tom Lane wrote:
 Alternatively, you could use the process PID as part of the temp file
 name; which is probably a good idea anyway.

 I think that's actually worse, because nothing will clean up those
 unless you explicitly scan for all whatever.$pid files, and somehow
 kill them.

True.  As long as the copy command is prepared to get rid of a
pre-existing target file, using a fixed .tmp extension should be fine.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Bruce Momjian
On Wed, Aug 20, 2014 at 10:36:40AM -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-08-20 10:19:33 -0400, Tom Lane wrote:
  Alternatively, you could use the process PID as part of the temp file
  name; which is probably a good idea anyway.
 
  I think that's actually worse, because nothing will clean up those
  unless you explicitly scan for all whatever.$pid files, and somehow
  kill them.
 
 True.  As long as the copy command is prepared to get rid of a
 pre-existing target file, using a fixed .tmp extension should be fine.

Well, then we are back to this comment by MauMau:

 With that said, copying to a temporary file like dest.tmp and
 renaming it to dest sounds worthwhile even as a basic copy utility.
 I want to avoid copying to a temporary file with a fixed name like
 _copy.tmp, because some advanced utility may want to run multiple
 instances of pg_copy to copy several files into the same directory
 simultaneously.  However, I'm afraid multiple dest.tmp files might
 continue to occupy disk space after canceling copy or power failure in
 some use cases, where the copy of the same file won't be retried.
 That's also the reason why I chose to not use a temporary file like
 cp/copy.

Do we want cases where the same directory is used multiple pg_copy
processes?  I can't imagine how that setup would make sense.

I am thinking pg_copy should emit a warning message when it removes an
old temp file.  This might alert people that something odd is happening
if they see the message often.

The pid-extension idea would work as pg_copy can test all pid extension
files to see if the pid is still active.  However, that assumes that the
pid is running on the local machine and not on another machines that has
NFS-mounted this directory, so maybe this is a bad idea, but again, we
are back to the idea that only one process should be writing into this
directory.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Andres Freund
On 2014-08-20 18:58:05 -0400, Bruce Momjian wrote:
 On Wed, Aug 20, 2014 at 10:36:40AM -0400, Tom Lane wrote:
  Andres Freund and...@2ndquadrant.com writes:
   On 2014-08-20 10:19:33 -0400, Tom Lane wrote:
   Alternatively, you could use the process PID as part of the temp file
   name; which is probably a good idea anyway.
  
   I think that's actually worse, because nothing will clean up those
   unless you explicitly scan for all whatever.$pid files, and somehow
   kill them.
  
  True.  As long as the copy command is prepared to get rid of a
  pre-existing target file, using a fixed .tmp extension should be fine.
 
 Well, then we are back to this comment by MauMau:

  With that said, copying to a temporary file like dest.tmp and
  renaming it to dest sounds worthwhile even as a basic copy utility.
  I want to avoid copying to a temporary file with a fixed name like
  _copy.tmp, because some advanced utility may want to run multiple
  instances of pg_copy to copy several files into the same directory
  simultaneously.  However, I'm afraid multiple dest.tmp files might
  continue to occupy disk space after canceling copy or power failure in
  some use cases, where the copy of the same file won't be retried.
  That's also the reason why I chose to not use a temporary file like
  cp/copy.
 
 Do we want cases where the same directory is used multiple pg_copy
 processes?  I can't imagine how that setup would make sense.

I don't think anybody is proposing the _copy.tmp proposal. We've just
argued about the risk of dest.tmp. And I argued - and others seem to
agree - the space usage problem isn't really relevant because archive
commands and such are rerun after failure and can then clean up the temp
file again.

 I am thinking pg_copy should emit a warning message when it removes an
 old temp file.  This might alert people that something odd is happening
 if they see the message often.

Don't really see a point in this. If the archive command or such failed,
that will already have been logged. I'd expect this to be implemented by
passing O_CREAT | O_TRUNC to open(), nothing else.

 The pid-extension idea would work as pg_copy can test all pid extension
 files to see if the pid is still active.  However, that assumes that the
 pid is running on the local machine and not on another machines that has
 NFS-mounted this directory, so maybe this is a bad idea, but again, we
 are back to the idea that only one process should be writing into this
 directory.

I don't actually think we should assume that. There very well could be
one process running an archive command, using differently prefixed file
names or such.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-19 Thread MauMau

From: Fujii Masao masao.fu...@gmail.com
What's the main purpose of this tool? If it's for WAL archiving, the tool 
name

pg_copy sounds too generic. We already have pg_archivecleanup, so maybe
pg_archivecopy or something is better for the consistency?

pg_copy in the patch copies the file to the destination in a
straightforward way,
i.e., directly copies the file to the dest file with actual name. This can 
cause
the problem which some people reported. The problem is that, when the 
server
crashes while WAL file is being archived by cp command, its 
partially-filled
WAL file remains at the archival area. This half-baked archive file can 
cause
various troubles. To address this, WAL file needs to be copied to the 
temporary
file at first, then renamed to the actual name. I think that pg_copy 
should

copy the WAL file in that way.


I intended to make pg_copy a straightforward replacement of cp/copy, which 
complements the missing sync.  Direct I/O and posix_fadvice() feature may be 
convenient but not essential for this utility.  cp/copy doesn't copy to a 
temporary file, and the problem can be solved easily by mv/move.  I wanted 
to keep pg_copy as generic as cp/copy, so that it can be used by some 
advanced features in the future, e.g. comprehensive backup/recovery 
management like RMAN (this example may not be best) when it's integrated 
into the core.


With that said, copying to a temporary file like dest.tmp and renaming it 
to dest sounds worthwhile even as a basic copy utility.  I want to avoid 
copying to a temporary file with a fixed name like _copy.tmp, because some 
advanced utility may want to run multiple instances of pg_copy to copy 
several files into the same directory simultaneously.  However, I'm afraid 
multiple dest.tmp files might continue to occupy disk space after 
canceling copy or power failure in some use cases, where the copy of the 
same file won't be retried.  That's also the reason why I chose to not use a 
temporary file like cp/copy.


Currently pg_copy always syncs the archive file, and there is no way to 
disable
that. But I'm sure that not everyone want to sync the archive file. So I 
think

that it's better to add the option specifying whether to sync the file
or not, into
pg_copy.


pg_copy is for copying a file reliably by syncing.  If sync is not 
necessary, people can use cp/copy.



Some users might want to specify whether to call posix_fadvise or not 
because

they might need to re-read the archvied files just after the archiving.
For example, network copy of the archived files from the archive area to
remote site for disaster recovery.


This sounds reasonable.  Do you have an idea on the switch name and the 
default behavior?



Do you recommend to use pg_copy for restore_command? If yes, it also 
should
be documented. And in the WAL restore case, the restored WAL files are 
re-read

soon by recovery, so posix_fadvise is not good in that case.

Direct I/O and posix_fadvise are used only for destination file. But why 
not

source file? That might be useful especially for restore_command case.


No, I don't think it's necessary to use pg_copy for restore_command.



At last, the big question is, is there really no OS command which provides
the same functionality as pg_copy does? If there is, I'd like to avoid 
duplicate

work basically.


If there exists such a command available in the standard OS installation, I 
want to use it.


Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-19 Thread Peter Eisentraut
On 8/19/14 9:35 AM, MauMau wrote:
 pg_copy is for copying a file reliably by syncing.  If sync is not
 necessary, people can use cp/copy.

I'm getting mixed messages from this thread.

I think there could be a fair amount of support for a new tool that can
serve as a universal plug-and-play archive_command, with a variety of
options, such as fsync yes/no, fadvise yes/no, direct-io[*] yes/no,
atomic copy yes/no, allow overwrite yes/no, compression yes/no.  That
would reduce the need for users to compose adventurous shell commands,
and it would set out the various options more clearly.

This is not that.  This is cp+fsync with a hardcoded fadvise policy and
optional direct-io.  That is a valid problem to address, but it is one
of many.  On the other hand, I fear that the addition of this
single-and-a-half-purpose tool would make the overall landscape more
complicated than it already is.  Since it's in the examples, people will
probably use it, even if they don't need to or shouldn't.  And not
recommending it for the restore_command is also confusing.

Another example of how confusing all of this is: On Windows, the copy
command by default doesn't overwrite files, which is what we want
(usually).  The patch changes those instances of copy to pg_copy, but it
doesn't have that behavior.  Should the examples by changed to do a test
 pg_copy on Windows (what's the Windows shell syntax for that?), or
should pg_copy have an option to not overwrite a file?  How do you then
avoid inconsistencies with the Unix behavior?  Or what if you want fsync
but allow overwriting on Windows?

On the technical side, I think if you fsync a new file, you also need to
fsync the directory, to make sure the file is certain to be visible
after a crash.


[*] I keep reading directio as a typo of direction, so please
consider putting a hyphen or underscore in the option and variable
names. ;-)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-19 Thread Peter Eisentraut
On 8/15/14 10:46 AM, Fujii Masao wrote:
 At last, the big question is, is there really no OS command which provides
 the same functionality as pg_copy does? If there is, I'd like to avoid 
 duplicate
 work basically.

If you look hard enough, you can maybe find an OS command that can fsync
a file after it was copied.  Some versions of dd can do that, and some
systems have an fsync program.  But it's not clear whether all systems
have that, and it probably won't be simple and consistent.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-19 Thread Josh Berkus
On 08/19/2014 12:37 PM, Peter Eisentraut wrote:
 On 8/19/14 9:35 AM, MauMau wrote:
 pg_copy is for copying a file reliably by syncing.  If sync is not
 necessary, people can use cp/copy.
 
 I'm getting mixed messages from this thread.
 
 I think there could be a fair amount of support for a new tool that can
 serve as a universal plug-and-play archive_command, with a variety of
 options, such as fsync yes/no, fadvise yes/no, direct-io[*] yes/no,
 atomic copy yes/no, allow overwrite yes/no, compression yes/no.  That
 would reduce the need for users to compose adventurous shell commands,
 and it would set out the various options more clearly.

Yes please!

Although I'm not sold on the idea of using DirectIO for this.  Is there
really enough benefit to make it worth the trouble?

 
 This is not that.  This is cp+fsync with a hardcoded fadvise policy and
 optional direct-io.  That is a valid problem to address, but it is one
 of many.  On the other hand, I fear that the addition of this
 single-and-a-half-purpose tool would make the overall landscape more
 complicated than it already is.  Since it's in the examples, people will
 probably use it, even if they don't need to or shouldn't.  And not
 recommending it for the restore_command is also confusing.

I'm afraid that I agree with Peter here.  pg_copy looks like a nice
foundation for the eventual pg_copy utility we need, but it's not there yet.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-19 Thread Alvaro Herrera
MauMau wrote:

 With that said, copying to a temporary file like dest.tmp and
 renaming it to dest sounds worthwhile even as a basic copy
 utility.  I want to avoid copying to a temporary file with a fixed
 name like _copy.tmp, because some advanced utility may want to run
 multiple instances of pg_copy to copy several files into the same
 directory simultaneously.  However, I'm afraid multiple dest.tmp
 files might continue to occupy disk space after canceling copy or
 power failure in some use cases, where the copy of the same file
 won't be retried.  That's also the reason why I chose to not use a
 temporary file like cp/copy.

Is there a way to create a link to a file which only exists as an open
file descriptor?   If there was, you could create a temp file, open an
fd, then delete the file.  That would remove the issue with files being
leaked due to failures of various kinds.

Also, it's been mentioned that this utility might be useful for
restore_command.  That sounds good I guess, but need to keep the
RECOVERYXLOG trick in mind.  I remember a case of stalled replay because
the restore command was writing to RECOVERYXLOG.gz and ungzipping, and
the unlink(RECOVERYXLOG) call failed after a partial copy and so did
the copy from the archive.  (Removing the borked RECOVERYXLOG.gz fixed
it.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-15 Thread Fujii Masao
On Thu, Aug 14, 2014 at 1:52 PM, MauMau maumau...@gmail.com wrote:
 I fixed some minor mistakes.

What's the main purpose of this tool? If it's for WAL archiving, the tool name
pg_copy sounds too generic. We already have pg_archivecleanup, so maybe
pg_archivecopy or something is better for the consistency?

pg_copy in the patch copies the file to the destination in a
straightforward way,
i.e., directly copies the file to the dest file with actual name. This can cause
the problem which some people reported. The problem is that, when the server
crashes while WAL file is being archived by cp command, its partially-filled
WAL file remains at the archival area. This half-baked archive file can cause
various troubles. To address this, WAL file needs to be copied to the temporary
file at first, then renamed to the actual name. I think that pg_copy should
copy the WAL file in that way.

Currently pg_copy always syncs the archive file, and there is no way to disable
that. But I'm sure that not everyone want to sync the archive file. So I think
that it's better to add the option specifying whether to sync the file
or not, into
pg_copy.

Some users might want to specify whether to call posix_fadvise or not because
they might need to re-read the archvied files just after the archiving.
For example, network copy of the archived files from the archive area to
remote site for disaster recovery.

Do you recommend to use pg_copy for restore_command? If yes, it also should
be documented. And in the WAL restore case, the restored WAL files are re-read
soon by recovery, so posix_fadvise is not good in that case.

Direct I/O and posix_fadvise are used only for destination file. But why not
source file? That might be useful especially for restore_command case.

At last, the big question is, is there really no OS command which provides
the same functionality as pg_copy does? If there is, I'd like to avoid duplicate
work basically.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-15 Thread Kevin Grittner
Fujii Masao masao.fu...@gmail.com wrote:

 Direct I/O and posix_fadvise are used only for destination file.
 But why not source file? That might be useful especially for
 restore_command case.

That would prevent people from piping the file through a
compression utility.  We should support piped I/O for input (and if
we want to use it on the recovery side, for the output, too), at
least as an option.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-13 Thread MauMau

I fixed some minor mistakes.

Regards
MauMau


pg_copy_v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-06-17 Thread Abhijit Menon-Sen
At 2014-06-17 23:26:37 +0900, maumau...@gmail.com wrote:

 Hello,
 
 As I proposed before in the thread below, I've implemented a simple
 command for reliable WAL archiving.  I would appreciate it if you
 could review and test the patch.

Please add your patch to the next CF (i.e. 2014-08), so that it gets
review attention at the appropriate time.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-06-17 Thread Joe Conway
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 06/17/2014 07:26 AM, MauMau wrote:
 Hello,
 
 As I proposed before in the thread below, I've implemented a simple
 command for reliable WAL archiving.  I would appreciate it if you could
 review and test the patch.
 
 http://www.postgresql.org/message-id/9C1EB95CA1F34DAB93DF549A51E3E874@maumau

That first hunk refers to dblink -- I'm guessing it does not belong with
this patch.

Joe


- -- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.14 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJToFiiAAoJEDfy90M199hlseMP/Rpwneo5la7mJkHJA0BWUtj/
Hh8yNzzyBVPbKYuTEOZWi2yblFW6yA2dknrYD8RigS1qJFwLw+drFt5673Vi6jKW
Pf7Qn62Ck/U0lZTGXUU9akOhSx7BsKVwH8HvdARIp5DSV2n7/HFDtazi3hTtFq31
GHKA84lPwuynODN42eVez0YXdeRUX7K/s5rCRq154q3BrLPhEEvQwo/kfhet3F0A
utf0ymSPuX3RvpGDDPAZ1oQxTd/xA+qJhX0jrsRoJVaY40rufTgXPmAnNFOdWEds
fXs6ObRm+tHmeLhYVUXhODa1HPHBQMvElVeB3CGxhPUvhP2494sfoLOd7qs8Lblg
oUkYbIJee8Ir7NN34Gc69YW58sekPSVI4vXKu0PwF++Ubs4MYNNd7nP4Wu9N2ISw
p/GPIbx0uR3NbFGCoOLD0K3QHnX/b0FTWHzTTboZ+b69rNIDePpn3eGO2QEOLL/R
P/YkPma8SLyDvNnCzuSU3XDkFmWQsgK7xa7qpsBR1mbdF7zKPfOxCtoby/enSeuk
z7NJxtpHeUQkO7Pb3ZNk6gL+E8UAQihvdBgdBwBzj4qaZyAM5ec29aya3TtBbF+3
UoFX3m4tthR6mMWqizsdadSPvozDjMrhqSRFAjdBSX80Nfs2DVN1Hepp8NXvjRzV
b5RfV7x4yvtr92FFQboj
=+TWh
-END PGP SIGNATURE-


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-06-17 Thread Abhijit Menon-Sen
At 2014-06-17 08:02:59 -0700, m...@joeconway.com wrote:

 That first hunk refers to dblink -- I'm guessing it does not belong
 with this patch.

Yes, it's a leftover of the dblink memory leak patch that's in this CF.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-06-17 Thread MauMau

From: Joe Conway m...@joeconway.com

That first hunk refers to dblink -- I'm guessing it does not belong with
this patch.


Ouch, what a careless mistake.  The attached one is clean.  I'll update the 
CommitFest entry when I'm back home from work.


Regards
MauMau



pg_copy_v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers