On Wed, Sep 27, 2017 at 1:59 PM, Igor Polishchuk <ora4...@gmail.com> wrote:

> Sorry, here are the missing details, if it helps:
> Postgres 9.6.5 on CentOS 7.2.1511
>
> > On Sep 27, 2017, at 10:56, Igor Polishchuk <ora4...@gmail.com> wrote:
> >
> > Hello,
> > I have a multi-terabyte streaming replica on a busy database. When I set
> it up, repeated rsyncs take at least 6 hours each.
> > So, when I start the replica, it begins streaming, but it is many hours
> behind right from the start. It is working for hours, and cannot reach a
> consistent state
> > so the database is not getting opened for queries. I have plenty of WAL
> files available in the master’s pg_xlog, so the replica never uses archived
> logs.
> > A question:
> > Should I be able to run one more rsync from the master to my replica
> while it is streaming?
> > The idea is to overcome the throughput limit imposed by a single
> recovery process on the replica and allow it to catch up more quickly.
> > I remember doing it many years ago on Pg 8.4, and also heard from other
> people doing it. In all cases, it seemed to work.
> > I’m just not sure whether there is a high risk of introducing some hidden
> data corruption, which I may not notice for a while on such a huge database.
> > Any educated opinions on the subject here?
>

It really comes down to the amount of I/O (network and disk) your system
can handle while under load.  I've used 2 methods to do this in the past:

- http://moo.nac.uci.edu/~hjm/parsync/

  parsync (parallel rsync) is nice: it does all the hard work of
parallelizing rsync for you.  It's just a pain to get all the prereqs installed.


- rsync --itemize-changes
  Essentially, use this to get a list of files, manually split them out and
fire up a number of rsyncs.  parsync does this for you, but, if you can't
get it going for any reason, this works.
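As a rough illustration of the manual split, here is a minimal Python sketch.
It assumes you have already collected a mapping of relative file paths to
sizes (for example by parsing the output of a dry run with
--itemize-changes).  The function names split_by_size and rsync_command,
and the idea of balancing the lists by size, are my own assumptions for the
example, not something parsync or rsync prescribes.  Each resulting list
would be written to a file and handed to its own rsync through the
--files-from option.

```python
import heapq
from typing import Dict, List, Tuple


def split_by_size(files: Dict[str, int], workers: int) -> List[List[str]]:
    """Split files into `workers` lists with roughly equal total size.

    Greedy approach: take files largest first and always give the next
    file to the list that currently holds the fewest bytes.
    """
    if workers < 1:
        raise ValueError("workers must be >= 1")
    # Heap of (bytes assigned so far, bucket index).
    heap: List[Tuple[int, int]] = [(0, i) for i in range(workers)]
    heapq.heapify(heap)
    buckets: List[List[str]] = [[] for _ in range(workers)]
    for path, size in sorted(files.items(), key=lambda kv: (-kv[1], kv[0])):
        total, idx = heapq.heappop(heap)
        buckets[idx].append(path)
        heapq.heappush(heap, (total + size, idx))
    return buckets


def rsync_command(list_file: str, src: str, dest: str) -> List[str]:
    """Build the argument list for one rsync worker (hypothetical paths)."""
    return ["rsync", "-a", f"--files-from={list_file}", src, dest]
```

The more lists you create, the more concurrent rsyncs you end up running, so
the number of workers is the knob that trades speed against load on the
master.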


The real trick: after you do your parallel rsync, make sure that you run
one final rsync to sync up any missed items.
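A minimal Python sketch of that sequence, under stated assumptions: the
per-worker file lists already exist on disk, src and dest are whatever you
would normally pass to rsync, and the function name parallel_then_final and
its run parameter are hypothetical, introduced only for this example.  It
starts one rsync per list, waits for all of them, then runs a single plain
rsync over the whole tree.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def parallel_then_final(
    list_files: Sequence[str],
    src: str,
    dest: str,
    run: Callable = subprocess.run,
) -> None:
    """Run one rsync per file list in parallel, then one final full rsync."""

    def one(list_file: str):
        # check=True makes a failed rsync raise instead of passing silently.
        return run(
            ["rsync", "-a", f"--files-from={list_file}", src, dest],
            check=True,
        )

    with ThreadPoolExecutor(max_workers=max(1, len(list_files))) as pool:
        # list() forces all results, so any worker error is raised here.
        list(pool.map(one, list_files))

    # Final single pass over the whole tree to pick up anything missed.
    run(["rsync", "-a", src, dest], check=True)
```

The final pass runs only after every parallel worker has finished, so it
should have little left to copy.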

Remember, it's all about I/O.  The more parallel threads you use, the
harder you'll beat up the disks / network on the master, which could impact
production.

Good luck

--Scott







> >
> > Thank you
> > Igor Polishchuk
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



-- 
--
Scott Mead
Sr. Architect
*OpenSCG <http://openscg.com>*
http://openscg.com
