On Tue, Dec 27, 2016 at 1:16 PM, Michael Paquier <michael.paqu...@gmail.com> wrote:
> On Tue, Dec 27, 2016 at 6:34 PM, Magnus Hagander <mag...@hagander.net> wrote:
> > On Tue, Dec 27, 2016 at 2:23 AM, Michael Paquier <michael.paqu...@gmail.com> wrote:
> >> Magnus, you have mentioned me as well that you had a couple of ideas
> >> on the matter, feel free to jump in and let's mix our thoughts!
> >
> > Yeah, I've been wondering what the actual usecase is here :)
>
> There is value to compress segments finishing with trailing zeros,
> even if they are not the species with the highest representation in
> the WAL archive.

Agreed on that part -- that's the value in compression though, and not
necessarily the TAR format itself. Is there any value of the TAR format
*without* compression in your scenario?

> > Though I was considering the case where all segments are streamed into the
> > same tarfile (and then some sort of configurable limit where we'd switch
> > tarfile after <n> segments, which rapidly started to feel too complicated).
> >
> > What's the actual advantage of having it wrapped inside a single tarfile?
>
> I am advocating for one tar file per segment to be honest. Grouping
> them makes the failure handling more complicated when connection to
> the server is killed, or the replication stream is cut. Well, not
> really complicated actually, because I think that you would need to
> drop in the segment folder a status file with enough information to
> let pg_receivexlog know from where in the tar file it needs to
> continue writing. If a new tarball is created for each segment,
> deciding from where to stream after a connection failure is just a
> matter of doing what is done today: having a look at the completed
> segments and begin streaming from the incomplete/absent one.

This pretty much matches up with the conclusion I got to myself as well.
We could create a new tarfile for each restart of pg_receivexlog, but
then it becomes unpredictable.
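To make the per-segment failure handling concrete, here is a rough sketch of "look at the completed segments and begin streaming from the incomplete/absent one" with one tar per segment. This is illustrative Python, not pg_receivexlog code (which is C); the .tar/.tar.partial layout and the simplified segment-number arithmetic (which ignores timelines and segment-number gaps) are my assumptions:

```python
import os
import re

# WAL segment names are 24 hex digits; with one tar per segment the
# archive directory would hold e.g. 000000010000000000000001.tar for
# each completed segment and <name>.tar.partial for the one in flight.
SEGMENT_RE = re.compile(r'^([0-9A-F]{24})\.tar(\.partial)?$')

def resume_segment(archive_dir):
    """Return the segment to resume streaming from: the .partial one
    if present, otherwise the segment after the newest completed one."""
    partial = None
    completed = []
    for name in os.listdir(archive_dir):
        m = SEGMENT_RE.match(name)
        if not m:
            continue
        if m.group(2):            # .tar.partial -> incomplete segment
            partial = m.group(1)
        else:
            completed.append(m.group(1))
    if partial:
        return partial
    if not completed:
        return None               # nothing archived yet, caller decides
    # Segment names sort lexicographically in WAL order; resume right
    # after the newest completed one (simplified: real names encode
    # timeline/log/segment separately and skip some values).
    newest = max(completed)
    return '%024X' % (int(newest, 16) + 1)
```

No status file is needed: the directory listing alone determines the restart point, which is the simplicity argument for one tarball per segment.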
> >> There are a couple of things that I have been considering as well for
> >> pg_receivexlog. Though they do not directly stick to this thread,
> >> here they are so I don't forget about them:
> >> - Removal of oldest WAL segments on a partition. When writing WAL
> >> segments to a dedicated partition, we could have an option that
> >> automatically removes the oldest WAL segment if the partition is full.
> >> This triggers once a segment is completed.
> >> - Compression of fully-written segments. When a segment is finished
> >> being written, pg_receivexlog could compress them further with gz for
> >> example. With --format=t this leads to segnum.tar.gz being generated.
> >> The advantage of doing those two things in pg_receivexlog is
> >> monitoring. One process to handle them all, and there is no need of
> >> cron jobs to handle any cleanup or compression.
> >
> > I was at one point thinking that would be a good idea as well, but recently
> > I've more been thinking that what we should do is implement a
> > "--post-segment-command", which would act similar to archive_command but
> > started by pg_receivexlog. This could handle things like compression, and
> > also integration with external backup tools like backrest or barman in a
> > cleaner way. We could also spawn this without waiting for it to finish
> > immediately, which would allow parallelization of the process. When doing
> > the compression inline that rapidly becomes the bottleneck. Unlike a
> > basebackup you're only dealing with the need to buffer 16Mb on disk before
> > compressing it, so it should be fairly cheap.
>
> I did not consider the case of barman and backrest to be honest,
> having the view of 2ndQ folks and David would be nice here. Still, the
> main idea behind doing those in pg_receivexlog's process would be to
> not spawn a new process. I have a class of users that care about
> things that could hang, they play a lot with network-mounted disks...
> And VMs of course.
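As a sketch of the "--post-segment-command" idea: the option name comes from the proposal above, but the %p substitution (mirroring archive_command's convention) and everything else here are my assumptions, shown in illustrative Python rather than C. The key point is spawning without waiting, so streaming the next segment is never blocked on compression or archiving:

```python
import shlex
import subprocess

def spawn_post_segment_command(template, segment_path):
    """Spawn the user's post-segment command without waiting for it,
    substituting %p with the path of the just-completed segment
    (hypothetical; mirrors archive_command's %p convention)."""
    argv = [tok.replace('%p', segment_path) for tok in shlex.split(template)]
    # Popen returns immediately; pg_receivexlog could keep streaming
    # the next segment while the command runs in the background.
    return subprocess.Popen(argv)
```

A caller would do e.g. `spawn_post_segment_command('gzip %p', path)` after each completed segment. Note that avoiding execution via a shell sidesteps some of the quoting pitfalls that make a good archive_command hard to write, though the hang-risk concern raised above still applies to whatever the command does.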
I have been talking to David about it a couple of times, and he agreed
that it'd be useful to have a post-segment command. We haven't discussed
it in much detail though. I'll add him to direct-cc here to see if he
has any further input :)

It could be that the best idea is to just notify some other process of
what's happening. But making it an external command would give that a
lot of flexibility. Of course, we need to be careful not to put
ourselves back in the position we are in with archive_command, in that
it's very difficult to write a good one.

I'm sure everybody cares about things that could hang. But everything
can hang...

> > Another thing I've been considering in the same area would be to add the
> > ability to write the segments to a pipe instead of a directory. Then you
> > could just pipe it into gzip without the need to buffer on disk. This would
> > kill the ability to know at which point we'd sync()ed to disk, but in most
> > cases so will doing direct gzip. Just means we couldn't support this in
> > sync mode.
>
> Users piping their data don't care about reliability anyway. So that
> is not a problem.

Good point. Same would be true about people who gzip it, wouldn't it?

> > I can see the point of being able to compress the individual segments
> > directly in pg_receivexlog in smaller systems though, without the need to
> > rely on an external compression program as well. But in that case, is there
> > any reason we need to wrap it in a tarfile, and can't just write it to
> > <segment>.gz natively?
>
> You mean having a --compress=0|9 option that creates individual gz
> files for each segment? Definitely we could just do that. It would be

Yes, that's what I meant.

> a shame though to not use the WAL methods you have introduced in
> src/bin/pg_basebackup, with having the whole set tar and tar.gz. A
> quick hack in pg_receivexlog has shown me that segments are saved in
> a single tarball, which is not cool.
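A minimal sketch of the <segment>.gz idea, i.e. what a --compress=0|9 style option writing plain .gz files (rather than .tar.gz) could do for a finished segment. Illustrative Python, not pg_receivexlog code; the write-to-.tmp-then-rename step is my addition for crash safety:

```python
import gzip
import os
import shutil

def compress_segment(segment_path, level=9):
    """Compress a fully-written WAL segment to <segment>.gz and remove
    the original (hypothetical behaviour, not an existing option)."""
    gz_path = segment_path + '.gz'
    with open(segment_path, 'rb') as src, \
            gzip.open(gz_path + '.tmp', 'wb', compresslevel=level) as dst:
        shutil.copyfileobj(src, dst)
    # Rename into place only once the whole file is written, so a crash
    # mid-compression never leaves a truncated .gz that looks complete.
    os.rename(gz_path + '.tmp', gz_path)
    os.remove(segment_path)
    return gz_path
```

Segments finishing with long runs of trailing zeros, as mentioned earlier in the thread, compress extremely well this way, and the result is a plain directory of .gz files with no tar wrapping to unpack.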
> My feeling is that using the
> existing infrastructure, but making it pluggable for individual files
> (in short I think that what is needed here is a way to tell the WAL
> method to switch to a new file when a segment completes) would really
> be the most simple one in terms of code lines and maintenance.

Much as I'd like to reuse that, I don't think that reuse in itself
should be the driver for how this should be decided. The end product
should be.

To me it seems silly to create a directory full of tarfiles with a
single file in each. I don't particularly care about the fact that we
added 512 bytes of wasted space to each, but we just created something
that's unnecessarily complicated for people to handle, didn't we? A
plain directory of .gz files is a lot easier to work with.

//Magnus