Re: How to monitor logical replication initial sync?

2018-03-15 Thread bricklen
On Wed, Mar 7, 2018 at 3:23 PM, Doug Gorley  wrote:

> Good day,
>
> How does one monitor the status or progress of an initial sync under
> logical replication?  For example:
>
> * I create a publication in database db_pub
> * I create a subscription in database db_sub
> * In 15 minutes I want to check and see that the initial sync is N% complete
>
> Is it possible to tell when the initial sync is complete, or better yet,
> how complete it is?
>

This is a question I'm quite interested in as well (and one I do not have
an answer to).

Does anyone with more familiarity with logical replication have any
suggestions on how to determine the status of the initial sync?
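For what it's worth, one place to start poking (a rough sketch only, and
assuming the PG10 catalogs) is pg_subscription_rel on the subscriber, which
tracks per-table sync state but not a percentage:

# Run on the subscriber (db_sub); srsubstate is i=initialize, d=data copy,
# s=synchronized, r=ready
psql -d db_sub -c "SELECT srrelid::regclass AS table_name, srsubstate FROM pg_subscription_rel;"

That still doesn't answer the "N% complete" part, so other suggestions are
welcome.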


Re: upgrading from pg 9.3 to 10

2018-08-20 Thread bricklen
Hi Edmundo,

On Tue, Aug 14, 2018 at 10:45 AM Edmundo Robles 
wrote:

> Is it safe to upgrade from pg 9.3 to pg 10 directly using pg_upgrade, or
> is it better to upgrade, with pg_upgrade, from 9.3 -> 9.4 -> 9.5 -> 9.6 -> 10?
>

Using pg_upgrade, it is definitely possible to upgrade 9.3 to 10 in one
jump. We did this in production earlier this year for 1500 Postgres
clusters.
At https://bricklen.github.io/2018-03-27-Postgres-10-upgrade/ I documented
(in excruciating detail, perhaps) how we upgraded those clusters from 9.3
to 10 with sub-15 minute downtime per cluster. The only real issues we ran
into were some corrupted indexes that appeared to be related to 10.1 and
10.2. Upgrading to 10.3/10.4 and reindexing fixed those particular
corrupted indexes.
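For reference, the shape of the command we used looks roughly like this (the
binaries, paths, and job count below are hypothetical examples, not our exact
invocation):

# Dry-run the compatibility checks first, then repeat without --check
/usr/pgsql-10/bin/pg_upgrade \
  --old-bindir=/usr/pgsql-9.3/bin --new-bindir=/usr/pgsql-10/bin \
  --old-datadir=/var/lib/pgsql/9.3/data --new-datadir=/var/lib/pgsql/10/data \
  --link --jobs=4 --check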


Re: upgrading from pg 9.3 to 10

2018-08-22 Thread bricklen
Hi Stephen, thanks for the feedback, it is much appreciated!

On Mon, Aug 20, 2018 at 12:15 PM Stephen Frost  wrote:

> * bricklen (brick...@gmail.com) wrote:
> > Using pg_upgrade, it is definitely possible to upgrade 9.3 to 10 in one
> > jump. We did this in production earlier this year for 1500 Postgres
> > clusters.
> > At https://bricklen.github.io/2018-03-27-Postgres-10-upgrade/ I
> documented
> > (in excruciating detail, perhaps) how we upgraded those clusters from 9.3
> > to 10 with sub-15 minute downtime per cluster. The only real issues we
> ran
> > into were some corrupted indexes that appeared to be related to 10.1 and
> > 10.2. Upgrading to 10.3/10.4 and reindexing fixed those particular
> > corrupted indexes.
>
> Interesting write-up.  A few points:
>
> #1: I'd strongly recommend including something in the write-up about
> checking for unlogged tables.  Those can add quite a bit of time to the
> rsync if they aren't truncated before the primary cluster is shut down.
>

A note about UNLOGGED has been added to the page, thanks.
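For anyone who wants to find them ahead of time, a quick way (a sketch; run
it in each database of the old cluster, where "mydb" is a placeholder) is to
look for relpersistence = 'u':

# Lists unlogged tables, which should be truncated before the final shutdown
psql -d mydb -c "SELECT n.nspname, c.relname FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace WHERE c.relpersistence = 'u' AND c.relkind = 'r';"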



> #2: The issue with timelines leads me to suspect that you had a
> restore_command defined and that when PG started up, it found the
> timeline history files from the old cluster.  If you don't have a
> restore_command set for any of these then I'm very curious what
> happened.  The other possibility (though I wouldn't have expected a
> timeline complaint from PG...) is that the replica wasn't fully up to
> date for whatever reason.
>

Entirely plausible; unfortunately, I don't recall what all the settings were
in the replicas, nor what the actual errors were. One puzzling aspect was
that every one of those 1500 clusters had the same primary and replica
settings, and every one of them had been failed back and forth dozens of
times before we upgraded to PG10.
The replica not being fully up to date makes more sense than rsync copying
over history files only part of the time, since we were less focused on the
remote replica than on ensuring the primary was successfully upgraded.



> #3: There's a number of checks discussed in the upgrade documentation
> around using the rsync-based method, but it doesn't seem like you did
> those.  A mention of them might be good.


If you mean #8 from the docs, where it says "If you are upgrading standby
servers using methods outlined in section Step 10, verify that the old
standby servers are caught up by running pg_controldata against the old
primary and standby clusters. Verify that the “Latest checkpoint location”
values match in all clusters. (There will be a mismatch if old standby
servers were shut down before the old primary.)", we found that the
pg_controldata output from the hot standby never matched the output from
the primary. When I was investigating at the time, I read a note somewhere
that if the replica was not using Streaming Replication the pg_controldata
output was unlikely to match. It's possible I misunderstood (or I've
forgotten what I read), but the output did not match in the dozen-ish times
we tested that particular aspect. It's entirely possible I bungled
something, but I was careful to checkpoint, switch WAL files, and shut down
the primary cleanly, then confirm that all WALs from the primary had shipped
to the hot standby and been applied successfully before gracefully shutting
down the hot standby. In all that testing, the systems were quiescent, as
we'd already blocked access to the primary.
If you meant other checks, I'd be happy to add a note about them if you can
point out which ones they were. It's possible we overlooked something in
our planning and execution of the pg_upgrade steps.
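For anyone following along, the check in question amounts to something like
this (the binary and data directory paths are hypothetical), run against the
old primary and the old standby after clean shutdowns and compared by hand:

# "Latest checkpoint location" should match across primary and standby
/usr/pgsql-9.3/bin/pg_controldata /var/lib/pgsql/9.3/data | grep 'Latest checkpoint location'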



> Note that these are
> particularly important because the rsync will *not* copy over changes to
> the data files except in the relatively rare case of the relfilenode's
> size changing (given that most of them are 1G, that's not too likely).
> The note you have about the rsync taking more time due to "if the
> remote replica was fully caught up when it was shut down" isn't
> accurate; there is no WAL replay that happens on the replica using this
> method to 'catch up' and if WAL replay was required to have this process
> be correct then it simply wouldn't ever work.
>

No doubt the problem exists between my ears and I'm making invalid
assumptions about the problems we ran into.
What I meant to say (and please correct me if this is still inaccurate) was
that once the primary was stopped cleanly, all WALs generated needed to be
shipped to the remote replica, where those WALs needed to be applied to
bring the replica to a point where it was near-identical to the primary. At
that point, the rsync would only ship the changes created by the pg_upgrade
run itself.
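To make that concrete, the sequence I described is roughly this (a sketch
using the 9.3-era function names; in 10+ these are pg_switch_wal() and
pg_last_wal_replay_lsn()):

# On the primary, immediately before the clean shutdown
psql -c "CHECKPOINT;"
psql -c "SELECT pg_switch_xlog();"

# On the standby, confirm replay has caught up before shutting it down
psql -c "SELECT pg_last_xlog_replay_location();"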

Re: Replicate Tables from SAP (DB2/HANA) to PostgreSQL

2018-09-19 Thread bricklen
On Tue, Sep 18, 2018 at 11:31 PM Thomas Güttler <
guettl...@thomas-guettler.de> wrote:

> Hi,
>
> is it possible to replicate some tables from SAP to PostgreSQL?
> At the moment there are two underlying database systems.
> Some run DB2 and some run SAP-HANA.
> In my case it would be nice if it were possible to replicate only
> some rows, not all.
>
> The replication should be unidirectional. The data in PostgreSQL
> is only needed for reading, not for inserts/updates.
>

My team is about to start a proof-of-concept doing the same thing, except
it's a large MSSQL to PG11 replication process. The tool we are evaluating
is from https://www.hvr-software.com/solutions/database-replication/, and
the demos looked pretty good (at least, for our use-case). You didn't
mention if you're looking for free or paid software; HVR is not free so
that may or may not fit your needs.


Re: Pg_auto_failover

2019-09-24 Thread bricklen
On Tue, Sep 24, 2019 at 8:41 AM Sonam Sharma  wrote:

> I was setting up pg_auto_failover. I have successfully set up the monitor
> and the primary instance. While setting up the secondary, it's failing with
> the error below:
>
> ERROR Failed to ensure empty directory "//backup": Permission denied
> Error Failed to initialise standby server, see above for details.
>

If you're using the tool from Citus, per
https://github.com/citusdata/pg_auto_failover/blob/d8ba26f47a60eaf1e1dc63ae67279553984d84f5/src/bin/pg_autoctl/file_utils.c#L112
is the problem literally that the OS user executing the commands does not
have sufficient permission to work in that directory? If you are creating
your data directory manually in "/", you might want to create a directory
under that path that the user executing the commands has sufficient
permission to modify (delete, create, and write to).
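For example, assuming the commands run as the postgres OS user (the user and
path below are hypothetical), something like this gives pg_autoctl a
directory it can write to:

# Create a dedicated directory and hand it to the user running pg_autoctl
sudo mkdir -p /var/lib/pg_auto_failover/backup
sudo chown postgres:postgres /var/lib/pg_auto_failover/backup
sudo chmod 700 /var/lib/pg_auto_failover/backup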


Re: schema change tracking

2019-05-16 Thread bricklen
On Thu, May 16, 2019 at 9:41 AM Benedict Holland <
benedict.m.holl...@gmail.com> wrote:

> I need a tool that can track schema changes in a postgresql database, write
> scripts to alter the tables, and store those changes in git. Are there
> tools that exist that can do this?
>

For dev and QA, my company is using Migra (
https://github.com/djrobstep/migra) as part of the developer workflow to
track DDL changes. The generated changes are then put into git as patches.
Migra itself is just used for the diff'ing capabilities it provides, and
will require some glue code if you want to use it for any CI/CD work.
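The core of it is just diffing two database URLs and capturing the output
(the connection URLs and file name below are hypothetical, and if I recall
correctly destructive statements need an extra --unsafe flag):

# DDL needed to make prod_db's schema match dev_db's, committed to git
migra postgresql:///prod_db postgresql:///dev_db > schema_changes.sql
git add schema_changes.sql && git commit -m 'Schema changes from dev'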


Re: Flood Warning message : user=[unknown],db=[unknown],host= WARNING: pg_getnameinfo_all() failed: Temporary failure in name resolution

2019-06-05 Thread bricklen
On Wed, Jun 5, 2019 at 11:08 AM Steve Crawford <
scrawf...@pinpointresearch.com> wrote:

> On Wed, Jun 5, 2019 at 10:13 AM Perumal Raj  wrote:
>
>>
>> [64459]: [1-1] user=[unknown],db=[unknown],host= WARNING:
>>  pg_getnameinfo_all() failed: Temporary failure in name resolution...
>>
>
> First thing I'd check is that DNS is functioning correctly (including
> local resolution settings or caching name resolvers).
>

In addition to what Steve mentioned, I've a vague recollection from running
into this several years ago that this is triggered by having "log_hostname"
enabled in your postgresql.conf.
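If that is the cause, it's easy to confirm and, if you don't need hostnames
in the logs, to turn off; as far as I remember a reload is enough:

psql -c "SHOW log_hostname;"
psql -c "ALTER SYSTEM SET log_hostname = off;"   # or edit postgresql.conf
psql -c "SELECT pg_reload_conf();"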


Re: How clear the cache on postgresql?

2017-11-24 Thread bricklen
On Fri, Nov 24, 2017 at 6:54 AM, hmidi slim  wrote:

> I'm trying to analyze some queries using EXPLAIN with the ANALYZE and
> BUFFERS options. I realized that the query refers to the cache memory to
> return the results. Is there any solution to clear the cache in PostgreSQL
> in order to get the execution time of the query when it gets data from
> disk and not from cache memory?
>

If you are on a (non-production) *nix server you can use:
sync && echo 3 > /proc/sys/vm/drop_caches
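Note that only drops the OS page cache; PostgreSQL's own shared_buffers
survive it. To clear those as well you would restart the server around the
cache drop, e.g. (the data directory below is hypothetical):

pg_ctl -D /var/lib/pgsql/data stop -m fast
sync && echo 3 > /proc/sys/vm/drop_caches
pg_ctl -D /var/lib/pgsql/data start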


Re: Replica on low-bandwitdh network

2017-12-05 Thread bricklen
On Tue, Dec 5, 2017 at 1:12 AM, Job  wrote:

> we have some PostgreSQL 9.6.1 servers on two different continents.
> Latency is from 180 to 250 ms between the two sites.
>
> Currently we use Rubyrep for off-site replication, since the servers are
> completely independent in case of connectivity loss.
>
> We are experiencing some problems with table locks (Rubyrep works with
> triggers to commit changes to the other server), which we think is due to
> the unusually high latency.
>

You might get more mileage from pglogical (
https://www.2ndquadrant.com/en/resources/pglogical/), which you can use to
ship just the changes, though given my unfamiliarity with Rubyrep maybe
that's what it does too (?)
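If it helps, the pglogical setup is roughly this shape (the node names, DSNs,
and database below are hypothetical; check the pglogical docs for the exact
calls and options):

# On the provider (assumes the pglogical extension is already installed)
psql -d appdb -c "SELECT pglogical.create_node(node_name := 'provider1', dsn := 'host=site-a dbname=appdb');"
psql -d appdb -c "SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);"

# On the subscriber
psql -d appdb -c "SELECT pglogical.create_node(node_name := 'subscriber1', dsn := 'host=site-b dbname=appdb');"
psql -d appdb -c "SELECT pglogical.create_subscription(subscription_name := 'sub1', provider_dsn := 'host=site-a dbname=appdb');"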


Re: Dependency tree to tie type/function deps to a table

2017-12-15 Thread bricklen
On Fri, Dec 15, 2017 at 6:44 AM, Jeremy Finzel  wrote:

> Here is my current effort.  I would love feedback in case I've missed
> something.
>

Perhaps you'll find the version on the wiki useful as a reference; it's an
older version of the dependency display, and can be found at
https://wiki.postgresql.org/wiki/Pg_depend_display