Re: [HACKERS] pg_rewind in contrib

Heikki Linnakangas Tue, 16 Dec 2014 01:39:07 -0800

On 12/16/2014 11:23 AM, Satoshi Nagayasu wrote:

Hi,


On 2014/12/12 23:13, Heikki Linnakangas wrote:
  > Hi,
  >
  > I'd like to include pg_rewind in contrib. I originally wrote it as an
  > external project so that I could quickly get it working with the
  > existing versions, and because I didn't feel it was quite ready for
  > production use yet. Now, with the WAL format changes in master, it is a
  > lot more maintainable than before. Many bugs have been fixed since the
  > first prototypes, and I think it's fairly robust now.
  >
  > I propose that we include pg_rewind in contrib/ now. Attached is a patch
  > for that. It just includes the latest sources from the current pg_rewind
  > repository at https://github.com/vmware/pg_rewind. It is released under
  > the PostgreSQL license.
  >
  > For those who are not familiar with pg_rewind, it's a tool that allows
  > repurposing an old master server as a new standby server, after
  > promotion, even if the old master was not shut down cleanly. That's a
  > very often requested feature.

I'm trying out pg_rewind with a first, simple scenario.
The test script is here:

https://github.com/snaga/pg_rewind_test/blob/master/pg_rewind_test.sh

At least, I think the file descriptor "srcfd" should be closed before
returning from copy_file_range(). I got a "can't open file" error with
"too many open files" while running pg_rewind.

------------------------------------------------
diff --git a/contrib/pg_rewind/copy_fetch.c b/contrib/pg_rewind/copy_fetch.c
index bea1b09..5a8cc8e 100644
--- a/contrib/pg_rewind/copy_fetch.c
+++ b/contrib/pg_rewind/copy_fetch.c
@@ -280,6 +280,8 @@ copy_file_range(const char *path, off_t begin, off_t end, bool trunc)
                  write_file_range(buf, begin, readlen);
                  begin += readlen;
          }
+
+       close(srcfd);
   }

   /*
------------------------------------------------


Yep, good catch. I pushed a fix to the pg_rewind repository at github.
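
For illustration only, here is a minimal sketch of the pattern the patch
restores: open the source file, copy a byte range in chunks, and close
the descriptor on every return path. This is not the pg_rewind code. The
names copy_range_sketch, range_sink and count_bytes_sink are
hypothetical, and the sink callback merely stands in for
write_file_range().

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical callback that receives each chunk; returns 0 on success. */
typedef int (*range_sink) (const char *buf, off_t offset, size_t len,
                           void *arg);

/* Example sink (assumption): just adds up the number of bytes seen. */
int
count_bytes_sink(const char *buf, off_t offset, size_t len, void *arg)
{
    (void) buf;
    (void) offset;
    *(size_t *) arg += len;
    return 0;
}

/*
 * Read the bytes in [begin, end) from "path" and pass them to "sink".
 * Returns 0 on success, -1 on any error. The source descriptor is
 * closed before returning, whatever happened in the loop.
 */
int
copy_range_sketch(const char *path, off_t begin, off_t end,
                  range_sink sink, void *arg)
{
    char    buf[8192];
    int     srcfd;
    int     rc = 0;

    assert(path != NULL && sink != NULL);

    srcfd = open(path, O_RDONLY);
    if (srcfd < 0)
        return -1;

    if (lseek(srcfd, begin, SEEK_SET) == (off_t) -1)
        rc = -1;

    while (rc == 0 && begin < end)
    {
        size_t  want = sizeof(buf);
        ssize_t readlen;

        /* Do not read past the end of the requested range. */
        if ((off_t) want > end - begin)
            want = (size_t) (end - begin);

        readlen = read(srcfd, buf, want);
        if (readlen <= 0)
        {
            rc = -1;            /* read error or unexpected end of file */
            break;
        }
        if (sink(buf, begin, (size_t) readlen, arg) != 0)
        {
            rc = -1;
            break;
        }
        begin += readlen;
    }

    /* Without this close(), every call leaks one file descriptor. */
    if (close(srcfd) != 0)
        rc = -1;

    return rc;
}
```

Funnelling every exit through a single close() means a caller that
copies thousands of ranges cannot run into the per-process descriptor
limit, which is the "too many open files" failure reported above.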

And I have one question here.

pg_rewind assumes that the source PostgreSQL has, at least, one
checkpoint after getting promoted. I think the target timeline id
in the pg_control file to be read is only available after the first
checkpoint. Right?

Yes, it does assume that the source server (= old standby, new master)
has had at least one checkpoint after promotion. It probably should be
more explicit about it: If there hasn't been a checkpoint, you will
currently get an error "source and target cluster are both on the same
timeline", which isn't very informative.

I assume that by "target timeline ID" you meant the timeline ID of the
source server, i.e. the timeline that the target server should be
rewound to.
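
As a rough sketch of what a more explicit check could look like (an
assumption, not what pg_rewind does today): compare the two timeline
IDs and, when they match, add a hint about the missing checkpoint. The
function name check_timelines_sketch and the wording of the hint are
hypothetical, and the sketch assumes both timeline IDs have already
been read from the respective control files.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Returns 0 if the timelines differ (a rewind can proceed) and -1 if
 * they are the same. In the latter case "msg" receives an error text
 * that also names the likely cause.
 */
int
check_timelines_sketch(uint32_t source_tli, uint32_t target_tli,
                       char *msg, size_t msglen)
{
    assert(msg != NULL && msglen > 0);

    if (source_tli != target_tli)
    {
        msg[0] = '\0';
        return 0;
    }

    /* Same timeline: keep the existing message, then add the hint. */
    snprintf(msg, msglen,
             "source and target cluster are both on the same timeline (%u); "
             "if the source was promoted recently, it may not have "
             "completed a checkpoint since promotion",
             (unsigned int) source_tli);
    return -1;
}
```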


- Heikki



