Re: Please let me know which configuration setting we need to modify to speedup the pg_dump backup.

2018-10-10 Thread Laurenz Albe
Please bottom-post, otherwise the thread becomes hard to read.

On Thu, 2018-10-11 at 08:44 +0530, Raghavendra Rao J S V wrote:
> We are using postgresql 9.2. It doesn't contain the option. 
> 
> Please guide me any other approaches to improve the performance of pg_dump.
> 
> On Thu 11 Oct, 2018, 8:05 AM Pavan Teja,  wrote:
> > You can use -j jobs option to speed up the process.
> > 
> > Hope it works.
> > 
> > On Thu, Oct 11, 2018, 8:02 AM Raghavendra Rao J S V 
> >  wrote:
> > > pg_dump is taking more time. Please let me know which configuration 
> > > setting we
> > > need to modify to speedup the pg_dump backup.We are using 9.2 version on 
> > > Centos Box.

There you have your first configuration change to speed up pg_dump:
upgrade from 9.2 to a release that is still supported.
And please don't tell me that you cannot / may not.
If your data are important for you, and you really want that speed-up,
you can.

Anyway, to make pg_dump faster with the -j option, you need to use
the "directory format" (-F d).

Other options to speed up pg_dump are limited: you can get faster
storage so that sequential scans are faster or more memory so that
more of the data resides in the filesystem cache.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: how to identify the timeline of specified recovery_target_time when do multiple PITR

2018-10-10 Thread Laurenz Albe
magodo wrote:
> OK... Just take another example:
> 
>  A B
> BASE-+-+--o1 (recover to A)  1
>  | |   C
>  +.|...+---o2 (regret, recover to B) 2
>|   |
>+...|..--o3 (regret again, recover to C)  3
>| 
>+ 4
> 
> 
> Suppose I'm at "o3" and want to recover to point "C". Because I want to
> recover to the first timeline which covers this time point, it means I
> wish to recover to timeline 2.

Ah, I finally understand your question.

You assume that timelines 1 and 2 have ended (that's how you drew it),
and that consequently timeline 3 is the "earliest existing" timeline,
so why doesn't PostgreSQL choose that one automatically?

Even though you drew it that way, timelines 1 and 2 have not ended, in
a way.  There may be more on them.  How should PostgreSQL know what is
the last WAL entry on a certain timeline?  For that, it would have to
restore and examine *all* WAL segments on that timeline until that fails.

But that is unreasonable because of the potential amount of time
and work involved.  Rather, PostgreSQL has to decide at the point where
timeline 2 branches off which one it should follow.
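So if you want to end up on timeline 2, you have to say so explicitly;
a minimal recovery.conf sketch (archive path and timestamp are placeholders):

    restore_command = 'cp /path/to/archive/%f %p'
    recovery_target_time = '2018-10-10 11:05:04 +08:00'   # time of point C, illustrative
    recovery_target_timeline = 2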

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Please let me know which configuration setting we need to modify to speedup the pg_dump backup.

2018-10-10 Thread Raghavendra Rao J S V
Thank you very much for your prompt response Christopher.

On Thu 11 Oct, 2018, 8:41 AM Christopher Browne,  wrote:

> On Wed, Oct 10, 2018, 10:32 PM Raghavendra Rao J S V <
> raghavendra...@gmail.com> wrote:
>
>> Hi All,
>>
>> pg_dump is taking more time. Please let me know which configuration
>> setting we need to modify to speedup the pg_dump backup.We are using 9.2
>> version on Centos Box.
>>
>
> There certainly isn't a configuration parameter to say "make PG dump
> faster."
>
> - It is possible that it is taking longer to backup the database because
> the database has grown in size.  If you have a lot more data captured, this
> would be a natural consequence, that you need simply to accept.
>
> But there are additional possibilities...
>
> - Perhaps the database is bloated because an update pattern is leading to
> a lot of dead data being left behind.  In that case some tables need to be
> vacuumed much more often and you should look into the auto vacuum
> configuration.
>
> - Perhaps the database has some table that contains a lot of obsolete
> data.  This would depend heavily on the nature of your application.
>
> You should look to see what data you are collecting that is not of ongoing
> value.  That may represent data that you should trim out of the database.
> That should improve the amount of time it takes to do a backup of the
> database.
>
>>


Re: Please let me know which configuration setting we need to modify to speedup the pg_dump backup.

2018-10-10 Thread Raghavendra Rao J S V
We are using postgresql 9.2. It doesn't contain the option.

Please guide me any other approaches to improve the performance of pg_dump.

On Thu 11 Oct, 2018, 8:05 AM Pavan Teja, 
wrote:

> Hi,
>
> You can use -j jobs option to speed up the process.
>
> Hope it works.
>
> Regards,
> Pavan
>
> On Thu, Oct 11, 2018, 8:02 AM Raghavendra Rao J S V <
> raghavendra...@gmail.com> wrote:
>
>> Hi All,
>>
>> pg_dump is taking more time. Please let me know which configuration
>> setting we need to modify to speedup the pg_dump backup.We are using 9.2
>> version on Centos Box.
>>
>> --
>> Regards,
>> Raghavendra Rao J S V
>>
>>


Re: Please let me know which configuration setting we need to modify to speedup the pg_dump backup.

2018-10-10 Thread Christopher Browne
On Wed, Oct 10, 2018, 10:32 PM Raghavendra Rao J S V <
raghavendra...@gmail.com> wrote:

> Hi All,
>
> pg_dump is taking more time. Please let me know which configuration
> setting we need to modify to speedup the pg_dump backup.We are using 9.2
> version on Centos Box.
>

There certainly isn't a configuration parameter to say "make PG dump
faster."

- It is possible that it is taking longer to back up the database because
the database has grown in size.  If you have a lot more data captured, this
would be a natural consequence that you simply need to accept.

But there are additional possibilities...

- Perhaps the database is bloated because an update pattern is leading to a
lot of dead data being left behind.  In that case some tables need to be
vacuumed much more often and you should look into the auto vacuum
configuration.

- Perhaps the database has some table that contains a lot of obsolete
data.  This would depend heavily on the nature of your application.

You should look to see what data you are collecting that is not of ongoing
value.  That may represent data that you should trim out of the database.
That should improve the amount of time it takes to do a backup of the
database.
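A quick way to see where the bulk of the data (and hence the dump time)
lives is a catalog query like this (nothing specific to your schema):

    select relname, pg_size_pretty(pg_total_relation_size(oid)) as total_size
    from pg_class
    where relkind = 'r'
    order by pg_total_relation_size(oid) desc
    limit 10;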

>


Re: how to identify the timeline of specified recovery_target_time when do multiple PITR

2018-10-10 Thread magodo


On Wed, 2018-10-10 at 18:21 +0200, Laurenz Albe wrote:
> magodo wrote:
> > > By default, recovery will stay on the timeline where it started.
> > > If you want to go to timeline 2 or 3, you have to specify
> > > recovery_target_timeline.
> > > 
> > For me, the specified point corresponds to timeline 1, because at
> > that
> > time, the timeline is 1 indeed (when there is no timeline 2
> > branched
> > out yet). 
> > 
> > So in other word, my expectation is like I want to get the first
> > timeline which covers the specified time.
> > 
> > As you mentioned, there is no active timeline concept in postgres,
> > then
> > what is the best practice to meet my expectation? Do I have to
> > somehow
> > record the timestamp when archived wal's timeline has been changed,
> > then compare the specified recovery target time with those records
> > to
> > find the first timeline which covers that time?
> 
> As I wrote, that is the default behavior; PostgreSQL will stay on
> the timeline that was active when recovery was started.
> 
> Yours,
> Laurenz Albe

OK... Just take another example:

 A B
BASE-+-+--o1 (recover to A)  1
 | |   C
 +.|...+---o2 (regret, recover to B) 2
   |   |
   +...|..--o3 (regret again, recover to C)  3
   | 
   + 4


Suppose I'm at "o3" and want to recover to point "C". Because I want to
recover to the first timeline which covers this time point, it means I
wish to recover to timeline 2. Then the same question:

> > what is the best practice to meet my expectation? Do I have to
> > somehow
> > record the timestamp when archived wal's timeline has been changed,
> > then compare the specified recovery target time with those records
> > to
> > find the first timeline which covers that time?

---
Magodo
 





Re: Please let me know which configuration setting we need to modify to speedup the pg_dump backup.

2018-10-10 Thread Ron

On 10/10/2018 09:32 PM, Raghavendra Rao J S V wrote:

Hi All,

pg_dump is taking more time. Please let me know which configuration 
setting we need to modify to speedup the pg_dump backup.We are using 9.2 
version on Centos Box.


Is it taking "more time" because your database is bigger?


--
Angular momentum makes the world go 'round.


Re: Please let me know which configuration setting we need to modify to speedup the pg_dump backup.

2018-10-10 Thread Pavan Teja
Hi,

You can use -j jobs option to speed up the process.

Hope it works.

Regards,
Pavan

On Thu, Oct 11, 2018, 8:02 AM Raghavendra Rao J S V <
raghavendra...@gmail.com> wrote:

> Hi All,
>
> pg_dump is taking more time. Please let me know which configuration
> setting we need to modify to speedup the pg_dump backup.We are using 9.2
> version on Centos Box.
>
> --
> Regards,
> Raghavendra Rao J S V
>
>


Please let me know which configuration setting we need to modify to speedup the pg_dump backup.

2018-10-10 Thread Raghavendra Rao J S V
Hi All,

pg_dump is taking more time. Please let me know which configuration setting
we need to modify to speed up the pg_dump backup. We are using 9.2 version on
a CentOS box.

-- 
Regards,
Raghavendra Rao J S V


Re: tds_fdw binary column

2018-10-10 Thread Adrian Klaver

On 10/10/18 12:20 PM, Aleš Zelený wrote:

Hello, thanks for the testcase!




Test cases with added msg_handler

test_get_some_id)
Thanks to chosen small table, it fast enough:

=# select branch_id from ase.tds_tbl where branch_id::text = 
'\x00038500875c3d60';
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 2, Msg: Changed 
database context to 'customer'.

, Server: FMI0MA1, Process: , Line: 0, Level: 0
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 1, Msg: Changed 
database context to 'vendor'.

, Server: FMI0MA1, Process: , Line: 1, Level: 0
NOTICE:  tds_fdw: Query executed correctly
NOTICE:  tds_fdw: Getting results
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 2, Msg: Changed 
database context to 'customer'.

, Server: FMI0MA1, Process: , Line: 0, Level: 0
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 1, Msg: Changed 
database context to 'vendor'.

, Server: FMI0MA1, Process: , Line: 1, Level: 0
┌┐
│ branch_id  │
├┤
│ \x00038500875c3d60 │
└┘
(1 row)

Time: 38.673 ms

We get ID used in later tests:

test_bytea_predicate)
=# select branch_id from ase.tds_tbl where branch_id = 
E'\\x00038500875c3d60'::bytea;
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 2, Msg: Changed 
database context to 'customer'.

, Server: FMI0MA1, Process: , Line: 0, Level: 0
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 1, Msg: Changed 
database context to 'vendor'.

, Server: FMI0MA1, Process: , Line: 1, Level: 0
NOTICE:  DB-Library notice: Msg #: 102, Msg state: 181, Msg: Incorrect 
syntax near 'E'.

, Server: FMI0MA1, Process: , Line: 1, Level: 15
ERROR:  DB-Library error: DB #: 102, DB Msg: General SQL Server error: 
Check messages from the SQL Server, OS #: -1, OS Msg: , Level: 15

Time: 0.209 ms

Failed as expected after previous tests, but we have new message: 
Incorrect syntax near 'E'.

Might be some issue with cast handling???


No, the issue is with the escape syntax; E'...' is a Postgres extension. See:

https://www.postgresql.org/docs/10/static/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS


4.1.2.2. String Constants with C-style Escapes
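That is, these two literals are the same bytea value to Postgres, but only
the first form is plain SQL that the remote server has a chance of
understanding (a sketch, assuming standard_conforming_strings is on):

    '\x00038500875c3d60'::bytea      -- passed through as-is
    E'\\x00038500875c3d60'::bytea    -- PostgreSQL escape-string extension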


The text cast in your previous example seems to work. Is there an issue
with using that?


If there is I would file an issue here:

https://github.com/tds-fdw/tds_fdw/issues





test_bytea_predicate_to_bytea)
[local]:5432 postgres@postgres:13550
=# select branch_id from ase.tds_tbl where branch_id = (select branch_id 
from ase.tds_tbl where branch_id::text = '\x00038500875c3d60');
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 2, Msg: Changed 
database context to 'customer'.

, Server: FMI0MA1, Process: , Line: 0, Level: 0
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 1, Msg: Changed 
database context to 'vendor'.

, Server: FMI0MA1, Process: , Line: 1, Level: 0
NOTICE:  tds_fdw: Query executed correctly
NOTICE:  tds_fdw: Getting results
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 2, Msg: Changed 
database context to 'customer'.

, Server: FMI0MA1, Process: , Line: 0, Level: 0
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 1, Msg: Changed 
database context to 'vendor'.

, Server: FMI0MA1, Process: , Line: 1, Level: 0
NOTICE:  DB-Library notice: Msg #: 2715, Msg state: 1, Msg: Can't find 
type 'bytea'.

, Server: FMI0MA1, Process: , Line: 1, Level: 16
ERROR:  DB-Library error: DB #: 2715, DB Msg: General SQL Server error: 
Check messages from the SQL Server, OS #: -1, OS Msg: , Level: 16

Time: 0.249 ms

The error is different; it looks like tds_fdw is trying to use the bytea
data type in the ASE query (a guess).


That is what I was able to test.

Kind regards
Ales Zeleny

st 10. 10. 2018 v 15:30 odesílatel Adrian Klaver 
mailto:adrian.kla...@aklaver.com>> napsal:


On 10/10/18 1:31 AM, Aleš Zelený wrote:
 > Hello,
 >
 > my fault, I've forgot to mention that I have only DSN and database
 > user/password credentials with no access to the box with Sybase.
trying
 > to reach service vendor support,  but it might take some time and I
 > hoped I've done some mistake on my side...

Hmm, some digging found this:

https://github.com/tds-fdw/tds_fdw/issues/88

If your credentials allow it you might try the suggestion in the above
to see if you can get a more detailed error message.

 >
 > Kind Regards
 > Ales Zeleny
 >

-- 
Adrian Klaver

adrian.kla...@aklaver.com 




--
Adrian Klaver
adrian.kla...@aklaver.com



Re: COPY threads

2018-10-10 Thread Ravi Krishna
Thank you.  Let me test it and see the benefit. We have a use case for this.


> On Oct 10, 2018, at 17:18 , Andres Freund  wrote:
> 
> 
> 
> On October 10, 2018 2:15:19 PM PDT, Ravi Krishna  wrote:
>>> 
>>> pg_restore doesn't take locks on the table for the COPY, it does so
>>> because creating the table takes an exclusive lock.
>> 
>> 
>> Interesting.  I seem to recollect reading here that I can't have
>> concurrent COPY on the same table because of the lock.
>> To give an example:
>> 
>> If I have a large file with say 400 million rows, can I first split it
>> into 10 files of 40 million rows each and then fire up 10 different
>> COPY sessions , each reading from a split file, but copying into the
>> same table.  I thought not.  It will be great if we can do this.
> 
> Yes, you can.
> 
> Andres
> -- 
> Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: COPY threads

2018-10-10 Thread Andres Freund



On October 10, 2018 2:15:19 PM PDT, Ravi Krishna  wrote:
>> 
>> pg_restore doesn't take locks on the table for the COPY, it does so
>> because creating the table takes an exclusive lock.
>
>
>Interesting.  I seem to recollect reading here that I can't have
>concurrent COPY on the same table because of the lock.
>To give an example:
>
>If I have a large file with say 400 million rows, can I first split it
>into 10 files of 40 million rows each and then fire up 10 different
>COPY sessions , each reading from a split file, but copying into the
>same table.  I thought not.  It will be great if we can do this.

Yes, you can.
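A minimal client-side sketch of that (file name, table and chunk size are
made up):

    split -l 40000000 big.csv chunk_
    for f in chunk_*; do
        psql -d mydb -c "\copy big_table from '$f' csv" &
    done
    wait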

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: COPY threads

2018-10-10 Thread Ravi Krishna
> 
> pg_restore doesn't take locks on the table for the COPY, it does so
> because creating the table takes an exclusive lock.


Interesting.  I seem to recollect reading here that I can't have concurrent 
COPY on the same table because of the lock.
To give an example:

If I have a large file with say 400 million rows, can I first split it into 10
files of 40 million rows each and then fire up 10 different
COPY sessions, each reading from a split file, but copying into the same
table?  I thought not.  It will be great if we can do this.


Re: COPY threads

2018-10-10 Thread Andres Freund
Hi,

On 2018-10-10 17:02:59 -0400, Ravi Krishna wrote:
> I am talking about pg_restore which is COPY FROM, which takes exclusive lock 
> on the table
> while the data is loaded into the table.

pg_restore doesn't take locks on the table for the COPY, it does so
because creating the table takes an exclusive lock.

Greetings,

Andres Freund



Re: COPY threads

2018-10-10 Thread Ravi Krishna
> 
> No, why would that seem to be the case?  If it did so, then you could
> not run pg_dump to dump data while regular activity was going on.


Not sure. In fact I am now confused.
I am talking about pg_restore which is COPY FROM, which takes exclusive lock on 
the table
while the data is loaded into the table.
pg_dump does COPY TO which reads data out of the table to a file or something 
else.


Re: COPY threads

2018-10-10 Thread Christopher Browne
On Wed, 10 Oct 2018 at 16:22, Ravi Krishna  wrote:
> You obviously are referring to multiple connections running COPY on different 
> tables, right?  Like what pg_restore does with -j option.
> Doesn't copy take an exclusive lock on the table which makes it incompatible 
> with parallelization.

No, why would that seem to be the case?  If it did so, then you could
not run pg_dump to dump data while regular activity was going on.

That's decidedly not the case.

The challenge in parallelizing a dump via COPY TO is in ensuring that
the multiple requests are attached to the same serializable
transaction.  There's a function now that allows multiple connections
to attach to the same transaction context, I believe...  Also, there's
the challenge in actually splitting the data, so that both requests
are dumping different data; that might be fairly expensive whether
with or without indices.
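(The function is pg_export_snapshot(); roughly, with an illustrative
snapshot id:

    -- session 1
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    SELECT pg_export_snapshot();     -- say it returns '00000003-0000001B-1'

    -- session 2
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    SET TRANSACTION SNAPSHOT '00000003-0000001B-1';

Both sessions then see the same data, which is how parallel pg_dump keeps
its workers consistent.)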

-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"



Re: COPY threads

2018-10-10 Thread Ravi Krishna
> 
> You obviously can just copy the data into postgres over multiple
> connections if you need to speed COPY up. But that requires splitting up
> the data on the clientside.
> 

You obviously are referring to multiple connections running COPY on different 
tables, right?  Like what pg_restore does with -j option.
Doesn't copy take an exclusive lock on the table which makes it incompatible 
with parallelization.




Re: COPY threads

2018-10-10 Thread Rob Sargent



> On Oct 10, 2018, at 1:24 PM, Andres Freund  wrote:
> 
> On 2018-10-10 18:50:02 +0200, Laurenz Albe wrote:
>> Rob Sargent wrote:
>>> Can anyone here tell me whether or not the CopyManager facility in JDBC 
>>> via org.postgresql:postgresql:42.1.4 is internally multithreaded? 
>>> Running on CentOS 7 (all participants), java8, postgres 10.5
>> 
>> It isn't, and there would not be much reason for it to be, as COPY
>> in PostgreSQL cannot be parallelized.
> 
> s/cannot/is not/.
> 
> You obviously can just copy the data into postgres over multiple
> connections if you need to speed COPY up. But that requires splitting up
> the data on the clientside.
> 
> Greetings,
> 
> Andres Freund
Interesting note, deeply pocketed.  The COPY portion is /not/ my problem - 
that’s plenty fast enough and memory efficient enough straight from the box.  I 
had overlooked that I was requesting 33M records, in a map with a longish CSV 
string as key...then playing sort games with keys and ... well that’s as stupid 
as I got.




Re: COPY threads

2018-10-10 Thread Andres Freund
On 2018-10-10 18:50:02 +0200, Laurenz Albe wrote:
> Rob Sargent wrote:
> > Can anyone here tell me whether or not the CopyManager facility in JDBC 
> > via org.postgresql:postgresql:42.1.4 is internally multithreaded? 
> > Running on CentOS 7 (all participants), java8, postgres 10.5
> 
> It isn't, and there would not be much reason for it to be, as COPY
> in PostgreSQL cannot be parallelized.

s/cannot/is not/.

You obviously can just copy the data into postgres over multiple
connections if you need to speed COPY up. But that requires splitting up
the data on the clientside.

Greetings,

Andres Freund



Re: tds_fdw binary column

2018-10-10 Thread Aleš Zelený
Hello, thanks for the testcase!

First of all, some more environment information:

Foreign server:

[local]:5432 postgres@postgres:13713
=# \des+ ase

List of foreign servers
┌───┬──┬──┬──┬─┬┐
│   Name│  Owner   │ Foreign-data wrapper │ Type │ Version
│   FDW options
├───┼──┼──┼──┼─┼┤
│ ase   │ postgres │ tds_fdw  │  │ │
(servername '', port '', database 'vendor', tds_version
'5.0', msg_handler 'notice') │
└───┴──┴──┴──┴─┴┘
(1 row)

Foreign table:

=# \d ase.tds_tbl
   Foreign table "ase.tds_tbl"
┌─┬┬───┬──┬─┬─┐
│   Column│  Type  │ Collation │ Nullable │
Default │ FDW options │
├─┼┼───┼──┼─┼─┤
│ branch_id   │ bytea  │   │ not null
│ │ │
│ city│ character varying(60)  │   │
│ │ │
│ zip_code│ character varying(10)  │   │
│ │ │
└─┴┴───┴──┴─┴─┘
Server: ase
FDW options: (schema_name 'dbo', table_name 'branch')

How Sybase reports that:

use vendor
go

sp_columns branch
go

table_qualifier |table_owner |table_name |column_name |data_type
|type_name |precision |length |scale |radix |nullable |remarks
|ss_data_type |colid |column_def |sql_data_type |sql_datetime_sub
|char_octet_length |ordinal_position |is_nullable |
||---||--|--|--|---|--|--|-||-|--|---|--|-|--|-||
vednor  |dbo |branch |branch_id   |-2
|binary|8 |8  |  |  |0|
|45   |1 |   |-2|
|8 |1|NO  |
vednor  |dbo |branch |city|12
|varchar   |60|60 |  |  |1|
|39   |3 |   |12|
|60|3|YES |
vednor  |dbo |branch |zip_code|12
|varchar   |10|10 |  |  |1|
|39   |9 |   |12|
|10|9|YES |

Test cases with added msg_handler

test_get_some_id)
Thanks to chosen small table, it fast enough:

=# select branch_id from ase.tds_tbl where branch_id::text =
'\x00038500875c3d60';
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 2, Msg: Changed
database context to 'customer'.
, Server: FMI0MA1, Process: , Line: 0, Level: 0
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 1, Msg: Changed
database context to 'vendor'.
, Server: FMI0MA1, Process: , Line: 1, Level: 0
NOTICE:  tds_fdw: Query executed correctly
NOTICE:  tds_fdw: Getting results
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 2, Msg: Changed
database context to 'customer'.
, Server: FMI0MA1, Process: , Line: 0, Level: 0
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 1, Msg: Changed
database context to 'vendor'.
, Server: FMI0MA1, Process: , Line: 1, Level: 0
┌┐
│ branch_id  │
├┤
│ \x00038500875c3d60 │
└┘
(1 row)

Time: 38.673 ms

We get ID used in later tests:

test_bytea_predicate)
=# select branch_id from ase.tds_tbl where branch_id =
E'\\x00038500875c3d60'::bytea;
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 2, Msg: Changed
database context to 'customer'.
, Server: FMI0MA1, Process: , Line: 0, Level: 0
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 1, Msg: Changed
database context to 'vendor'.
, Server: FMI0MA1, Process: , Line: 1, Level: 0
NOTICE:  DB-Library notice: Msg #: 102, Msg state: 181, Msg: Incorrect
syntax near 'E'.
, Server: FMI0MA1, Process: , Line: 1, Level: 15
ERROR:  DB-Library error: DB #: 102, DB Msg: General SQL Server error:
Check messages from the SQL Server, OS #: -1, OS Msg: , Level: 15
Time: 0.209 ms

Failed as expected after previous tests, but we have new message: Incorrect
syntax near 'E'.
Might be some issue with cast handling???

test_bytea_predicate_to_bytea)
[local]:5432 postgres@postgres:13550
=# select branch_id from ase.tds_tbl where branch_id = (sel

Re: RHEL 7 (systemd) reboot

2018-10-10 Thread Adrian Klaver

On 10/10/18 7:37 AM, Bryce Pepper wrote:

Sorry, I wasn't clear in the prior posts.

The stop script is running during reboot. The problem is the database is not 
reachable when the stop script runs.  The ctmdist server shut down is as 
follows:
Stop control-m application
Stop control-m configuration agent
Stop database


Several things:

1) In your OP there was this:

Oct 05 14:18:56 kccontrolmt01 network[29310]: Shutting down interface 
eth0:  Device 'eth0' successfully disconnected.


Oct 05 14:18:56 kccontrolmt01 network[29310]: [  OK  ]

Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: 



Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: Shutting 
down CONTROL-M.


So is your Postgres instance running on the same machine as the CTM 
instance or does the eth0 need to be up to reach the database?


2) In the above there is:
"Shutting down CONTROL-M."

Yet in script below there is:
"Stopping CONTROL-M application"

Is this because there are sub-scripts involved or the "Stopping ..." is 
embedded in the script?


3) I am by no means a shell script expert and I will admit to not fully 
understanding what control-m_shutdown.sh does. Still here it goes:


a) Are there actually two shebangs in one file or are there two files 
involved?


b) What is:

# stop database
/data00/ctmlinux/ctm_server/scripts/dbversion
if [ $? -ne 0 ] ; then
  echo "SQL Server is already stopped "
else
  if [ -f /data00/ctmlinux/ctm_server/scripts/shutdb ]; then
echo "Stopping SQL server for CONTROL-M"
/data00/ctmlinux/ctm_server/scripts/shutdb
  fi

actually doing?

I ask because from what I can see there are a set of parallel processes 
initiated and it is possible that the database server is winning. It 
comes down to what 'if [ $? -ne 0 ]' is testing.
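(For reference, $? is the exit status of the immediately preceding command,
here the vendor's dbversion script.  Assuming dbversion fails when it cannot
reach the database, the logic reads roughly:

    /data00/ctmlinux/ctm_server/scripts/dbversion
    if [ $? -ne 0 ] ; then
        # dbversion failed => presumably the database is already down
        echo "SQL Server is already stopped "
    else
        # database still reachable => run shutdb
        ...
    fi
)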






As you can see the intent is for the database to be shut down after the product.

But as you noticed from /var/log/message the stop_ctmlinux_server.sh  script is 
running but unable to execute the update query.

I created the following Service definition and scripts that follow -- note 
there are 2 datacenters (ctmdist, ctmlinux) that have comparable scripts so I 
have only included one set:

[root@kccontrolmt01 ~]# cat ControlM_Shutdown.service
[Unit]
Description=Run mycommand at shutdown
Requires=network.target CTM_Postgre.service
DefaultDependencies=no
Before=shutdown.target reboot.target

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/root/scripts/control-m_shutdown.sh

[Install]
WantedBy=multi-user.target


[root@kccontrolmt01 ~]# cat /root/scripts/control-m_shutdown.sh
#!/bin/sh
   # Shutdown any running Control-M services
 STATUS=$(/usr/bin/systemctl is-active CTMLinux_Server.service)
 if [ ${STATUS} == "active" ]; then
   /usr/bin/systemctl stop CTMLinux_Server.service
 fi

 STATUS=$(/usr/bin/systemctl is-active CTMDist_Server.service)
 if [ ${STATUS} == "active" ]; then
   /usr/bin/systemctl stop CTMDist_Server.service
 fi

 STATUS=$(/usr/bin/systemctl is-active EnterpriseManager.service)
 if [ ${STATUS} == "active" ]; then
   /usr/bin/systemctl stop EnterpriseManager.service
 fi
exit 0


#!/bin/bash

# stop CONTROL-M
if [ -f /data00/ctmlinux/ctm_server/scripts/shut_ctm ]; then
   echo "Stopping CONTROL-M application"
   /data00/ctmlinux/ctm_server/scripts/shut_ctm
fi

# stop CONTROL-M Configuration Agent
if [ -f /data00/ctmlinux/ctm_server/scripts/shut_ca ]; then
   echo "Stopping CONTROL-M Server Configuration Agent"
   /data00/ctmlinux/ctm_server/scripts/shut_ca
fi

# stop database
/data00/ctmlinux/ctm_server/scripts/dbversion
if [ $? -ne 0 ] ; then
   echo "SQL Server is already stopped "
else
   if [ -f /data00/ctmlinux/ctm_server/scripts/shutdb ]; then
 echo "Stopping SQL server for CONTROL-M"
 /data00/ctmlinux/ctm_server/scripts/shutdb
   fi
fi

exit 0





--
Adrian Klaver
adrian.kla...@aklaver.com



Re: COPY threads

2018-10-10 Thread Rob Sargent


> On Oct 10, 2018, at 10:50 AM, Laurenz Albe  wrote:
> 
> Rob Sargent wrote:
>> Can anyone here tell me whether or not the CopyManager facility in JDBC 
>> via org.postgresql:postgresql:42.1.4 is internally multithreaded? 
>> Running on CentOS 7 (all participants), java8, postgres 10.5
> 
> It isn't, and there would not be much reason for it to be, as COPY
> in PostgreSQL cannot be parallelized.
Thank you for that confirmation.  I had all cores firing on my copyManager box 
and it turned out it was really just looking really hard for more bytes to free 
up.  I was getting a lot more out of the database than I could chew.  OOM 
ensued.

Cheers,
rjs

Re: COPY threads

2018-10-10 Thread Laurenz Albe
Rob Sargent wrote:
> Can anyone here tell me whether or not the CopyManager facility in JDBC 
> via org.postgresql:postgresql:42.1.4 is internally multithreaded? 
> Running on CentOS 7 (all participants), java8, postgres 10.5

It isn't, and there would not be much reason for it to be, as COPY
in PostgreSQL cannot be parallelized.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: how to identify the timeline of specified recovery_target_time when do multiple PITR

2018-10-10 Thread Laurenz Albe
magodo wrote:
> > By default, recovery will stay on the timeline where it started.
> > If you want to go to timeline 2 or 3, you have to specify
> > recovery_target_timeline.
> > 
> For me, the specified point corresponds to timeline 1, because at that
> time, the timeline is 1 indeed (when there is no timeline 2 branched
> out yet). 
> 
> So in other word, my expectation is like I want to get the first
> timeline which covers the specified time.
> 
> As you mentioned, there is no active timeline concept in postgres, then
> what is the best practice to meet my expectation? Do I have to somehow
> record the timestamp when archived wal's timeline has been changed,
> then compare the specified recovery target time with those records to
> find the first timeline which covers that time?

As I wrote, that is the default behavior; PostgreSQL will stay on
the timeline that was active when recovery was started.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Pgbouncer discard all

2018-10-10 Thread Fabio Pardi
Hi Nicola,

I am of the impression that the problem of having waiting clients does not
depend on server_idle_timeout.

How is the load on both ends of pgbouncer? High? Low? No activity? A lot of
context switching perhaps?

I think that a pool_size of 120 is very high, unless you have an XXL database
server. Numbers that are too high will lower performance.

Also I think it would be of interest to know the values you are using for:

default_pool_size, min_pool_size and reserve_pool_size

and, most importantly, to know what the active connections are actually doing.
Are they all active? Did somebody perhaps forget to close the connection to the
db in your application, so that they are only waiting to be timed out?
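One quick way to see that is pgbouncer's admin console plus pg_stat_activity,
e.g. (assuming the port and admin user from your config):

    psql -p 6543 -U admin pgbouncer -c 'SHOW POOLS;'
    psql -d dev -c 'select state, count(*) from pg_stat_activity group by 1;'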

About 'DISCARD ALL', that is the default 'server_reset_query'.

regards,

fabio pardi


On 10/10/18 17:03, Nicola Contu wrote:
> Hello,
> we are running pgbouncer 1.9.1 connected to postgres 10.5
>
> Sometimes we are seeing a lot of waiting connections with this query :
>
> DISCARD ALL
>
> This is our pgbouncer config :
>
> [databases]
> dev = host=10.10.10.1 port=5432 dbname=dev auth_user=pgbouncer pool_size=120
>
> [pgbouncer]
> listen_port = 6543
> listen_addr = *
> auth_type = md5
> auth_file = /etc/pgbouncer/users.txt
> auth_query = select uname,phash from user_lookup($1)
> logfile = /var/log/pgbouncer.log
> pidfile = /home/postgres/pgbouncer.pid
> admin_users = admin
> user = postgres
> max_db_connections = 220
> log_connections = 0
> log_disconnections = 0
>
>
> Do you think this can depend on the server_idle_timeout default config value?
>
> Thanks a lot,
> Nicola



Pgbouncer discard all

2018-10-10 Thread Nicola Contu
Hello,
we are running pgbouncer 1.9.1 connected to postgres 10.5

Sometimes we are seeing a lot of waiting connections with this query :

DISCARD ALL

This is our pgbouncer config :

[databases]
dev = host=10.10.10.1 port=5432 dbname=dev auth_user=pgbouncer pool_size=120

[pgbouncer]
listen_port = 6543
listen_addr = *
auth_type = md5
auth_file = /etc/pgbouncer/users.txt
auth_query = select uname,phash from user_lookup($1)
logfile = /var/log/pgbouncer.log
pidfile = /home/postgres/pgbouncer.pid
admin_users = admin
user = postgres
max_db_connections = 220
log_connections = 0
log_disconnections = 0


Do you think this can depend on the server_idle_timeout default config
value?

Thanks a lot,
Nicola


RE: RHEL 7 (systemd) reboot

2018-10-10 Thread Bryce Pepper
Sorry, I wasn't clear in the prior posts.   

The stop script is running during reboot. The problem is the database is not 
reachable when the stop script runs.  The ctmdist server shut down is as 
follows:
   Stop control-m application
   Stop control-m configuration agent
   Stop database

As you can see the intent is for the database to be shut down after the 
product. 

But as you noticed from /var/log/message the stop_ctmlinux_server.sh  script is 
running but unable to execute the update query.

I created the following Service definition and scripts that follow -- note 
there are 2 datacenters (ctmdist, ctmlinux) that have comparable scripts so I 
have only included one set:

[root@kccontrolmt01 ~]# cat ControlM_Shutdown.service
[Unit]
Description=Run mycommand at shutdown
Requires=network.target CTM_Postgre.service
DefaultDependencies=no
Before=shutdown.target reboot.target

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/root/scripts/control-m_shutdown.sh

[Install]
WantedBy=multi-user.target
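(Side note, in case it is relevant here: Requires= alone does not impose any
ordering, and units are stopped in the reverse of their start order, so a unit
whose ExecStop needs the database usually also wants an explicit ordering
dependency.  A sketch, assuming the Postgres unit really is called
CTM_Postgre.service:

    [Unit]
    Requires=network.target CTM_Postgre.service
    After=network.target CTM_Postgre.service

With After= in place, this service's ExecStop runs before CTM_Postgre.service
is stopped at shutdown.)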


[root@kccontrolmt01 ~]# cat /root/scripts/control-m_shutdown.sh
#!/bin/sh
  # Shutdown any running Control-M services
STATUS=$(/usr/bin/systemctl is-active CTMLinux_Server.service)
if [ ${STATUS} == "active" ]; then
  /usr/bin/systemctl stop CTMLinux_Server.service 
fi

STATUS=$(/usr/bin/systemctl is-active CTMDist_Server.service)
if [ ${STATUS} == "active" ]; then
  /usr/bin/systemctl stop CTMDist_Server.service 
fi

STATUS=$(/usr/bin/systemctl is-active EnterpriseManager.service)
if [ ${STATUS} == "active" ]; then
  /usr/bin/systemctl stop EnterpriseManager.service
fi
exit 0


#!/bin/bash

# stop CONTROL-M
if [ -f /data00/ctmlinux/ctm_server/scripts/shut_ctm ]; then
  echo "Stopping CONTROL-M application"
  /data00/ctmlinux/ctm_server/scripts/shut_ctm
fi

# stop CONTROL-M Configuration Agent
if [ -f /data00/ctmlinux/ctm_server/scripts/shut_ca ]; then
  echo "Stopping CONTROL-M Server Configuration Agent"
  /data00/ctmlinux/ctm_server/scripts/shut_ca
fi

# stop database
/data00/ctmlinux/ctm_server/scripts/dbversion
if [ $? -ne 0 ] ; then
  echo "SQL Server is already stopped "
else
  if [ -f /data00/ctmlinux/ctm_server/scripts/shutdb ]; then
echo "Stopping SQL server for CONTROL-M"
/data00/ctmlinux/ctm_server/scripts/shutdb
  fi
fi

exit 0

-Original Message-
From: Adrian Klaver  
Sent: Wednesday, October 10, 2018 8:25 AM
To: Bryce Pepper ; pgsql-general@lists.postgresql.org
Subject: Re: RHEL 7 (systemd) reboot



On 10/10/18 5:32 AM, Bryce Pepper wrote:
> Adrian,
> Thanks for the inquiry.  The function (db_execute_sql) is coming from a 
> vendor (BMC) product called Control-M. It is a scheduling product.
> The tmp file is deleted before I can see its contents but I believe it is 
> trying to update some columns in the CMS_SYSPRM table.
> I also think the postgresql instance is already stopped and hence why the 
> db_execute fails.  I will try to modify the vendor function to save off the 
> contents of the query.

Alright, I'm confused. In your earlier post you said the stop script is not 
running. Yet here it is, just not at the right time. I think a more detailed 
explanation is needed:

1) The stop script you are concerned about is a systemd  script, one that you 
created or system provided?

2) What is the shutdown service you refer to?

3) Is there a separate shutdown script for the Control-M product?

4) What do you expect to happen vs what is happening?

>
> Bryce
>
> p.s. Do you know of any verbose logging that could be turned on to catch when 
> pgsql is being terminated?
>
>
> -Original Message-
> From: Adrian Klaver 
> Sent: Tuesday, October 09, 2018 7:39 PM
> To: Bryce Pepper ; 
> pgsql-general@lists.postgresql.org
> Subject: Re: RHEL 7 (systemd) reboot
>
> 
>
> On 10/9/18 11:06 AM, Bryce Pepper wrote:
>> I am running three instances (under different users) on a RHEL 7 
>> server to support a vendor product.
>>
>> In the defined services, the start & stop scripts work fine when 
>> invoked with systemctl {start|stop} whatever.service  but we have 
>> automated monthly patching which does a reboot.
>>
>> Looking in /var/log/messages and the stop scripts do not get invoked 
>> on reboot, therefore I created a new shutdown service as described 
>> here 
>> .
>>
>> It appears that PostGreSQL is receiving a signal from somewhere prior 
>> to my script running.
>

Re: how to identify the timeline of specified recovery_target_time when do multiple PITR

2018-10-10 Thread magodo


On Wed, 2018-10-03 at 08:06 +0200, Laurenz Albe wrote:
> magodo wrote:
> > I read the document about recovery configuration, it says:
> 
> [...]
> 
> > Therefore, suppose I am going to recover to a point of some child
> > timeline, to identify the point, I have to specify either
> > recovery_target_name or recovery_target_time, and also specify the
> > recovery_target_timeline.
> > 
> > It is more like a tuple like (recovery_target_time,
> > recovery_target_timeline), that specify a real point among all
> > history
> > branches. Am I understand this correctly?
> 
> I think you understood the concept well enough.
> 
> > If yes, what I want to ask is that, though the timeline is
> > increasing
> > between different recovery, but each timestamp corresponds to a
> > timeline ID, one by one. So if I get a recovery_target_time, why
> > should
> > I still specify the recovery_target_timeline? 
> > 
> > Suppose following illustration:
> > 
> >  A B
> > BASE-+-+--o1 (recover to
> > A)  1
> >  | |   C
> >  +.|...+---o2 (regret, recover to
> > B) 2
> >|   |
> >+...|..--o3 (regret again, recover to
> > C)  3
> >| 
> >+--
> > -- 4
> 
> Consider this   ^   point in time.
> |
> 
> Suppose you specify this point in time as recovery_target_time.
> 
> Then it is not clear which of the timelines you want to follow.
> The point of time exists in timeline 1, 2 and 3.
> 
> In other words, should recovery use the WAL from
> 00010ABC0012, 00020ABC0012 or
> 00030ABC0012?
> 
> By default, recovery will stay on the timeline where it started.
> If you want to go to timeline 2 or 3, you have to specify
> recovery_target_timeline.
> 
> Yours,
> Laurenz Albe

For me, the specified point corresponds to timeline 1, because at that
time, the timeline is 1 indeed (when there is no timeline 2 branched
out yet). 

So in other words, my expectation is that I want to get the first
timeline which covers the specified time.

As you mentioned, there is no active timeline concept in postgres, then
what is the best practice to meet my expectation? Do I have to somehow
record the timestamp when archived wal's timeline has been changed,
then compare the specified recovery target time with those records to
find the first timeline which covers that time?

Thank you in anticipation!

---
Magodo





Re: how to cleanup archive based on datetime

2018-10-10 Thread Martin Marques
El 10/10/18 a las 04:00, magodo escribió:
> 
> Hello,
> 
> I know there is a tool `pg_archivecleanup`, but it only accepts wal
> filename. What if I want to clean up all archived wal before a
> specified datetime?
> 
> Though I could check the ctime fo each wal and figure out which to
> remove, but it looks like not a elegent way to do this.
> 
> Also I tried to parse each wal and compare with the specified datetime
> in a binary-search way, but it is based on the fact that every COMMIT
> time in wal is sequential, but it looks like not the case..
> 
> Hence I wonder what is the best practice for this task?

Don't do that manually. Install `barman` and let it take care of
removing old WALs which are not needed anymore.
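With barman that becomes a retention policy in the server's configuration,
for example (a sketch; server name and window are placeholders):

    [myserver]
    retention_policy = RECOVERY WINDOW OF 7 DAYS

barman's regular maintenance (barman cron) should then remove the backups and
WALs that fall outside that window.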

Regards,

-- 
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



how to cleanup archive based on datetime

2018-10-10 Thread magodo


Hello,

I know there is a tool `pg_archivecleanup`, but it only accepts wal
filename. What if I want to clean up all archived wal before a
specified datetime?

Though I could check the ctime of each WAL and figure out which to
remove, that does not look like an elegant way to do this.

Also I tried to parse each WAL and compare with the specified datetime
in a binary-search way, but that relies on every COMMIT
time in the WAL being sequential, which does not seem to be the case.

Hence I wonder what is the best practice for this task?

Thank you in advance!
---
Magodo





Re: how to cleanup archive based on datetime

2018-10-10 Thread Adrian Klaver

On 10/10/18 12:00 AM, magodo wrote:


Hello,

I know there is a tool `pg_archivecleanup`, but it only accepts wal
filename. What if I want to clean up all archived wal before a
specified datetime?



That is dangerous, given that a WAL is important not based on its age
but on whether its content is still needed.


It would help to know what it is you are trying to achieve?



Though I could check the ctime fo each wal and figure out which to
remove, but it looks like not a elegent way to do this.

Also I tried to parse each wal and compare with the specified datetime
in a binary-search way, but it is based on the fact that every COMMIT
time in wal is sequential, but it looks like not the case..

Hence I wonder what is the best practice for this task?

Thank you in advance!
---
Magodo







--
Adrian Klaver
adrian.kla...@aklaver.com



Re: pg9.6: PITR paused just after failover, need manual resume

2018-10-10 Thread magodo


On Wed, 2018-10-10 at 11:24 +0800, magodo wrote:
> Hello,
> 
> I'm figuring out how to do PITR on a HA setup for postgres 9.6. I
> almost manage to do it, while I faced following issue:
> 
> 
> [1. create basebackup]
> 
> [2. insert 2 rows]
> 
> on both primary and standby's pg_xlog
> 
> 00010005
> 
> rmgr: Transaction len (rec/tot): 34/34, tx:633, lsn:
> 0/05016120, prev 0/050160E0, desc: COMMIT 2018-10-10 03:04:58.459754
> UTC  - insert 1 (before: 2018-10-10 11:05:04 +08:00)
> 
> rmgr: Transaction len (rec/tot): 34/34, tx:634, lsn:
> 0/05016188, prev 0/05016148, desc: COMMIT 2018-10-10 03:05:07.010478
> UTC  - insert 2 (before: 2018-10-10 11:05:09 +08:00)
> 
> [3. failover and promote the standby]
> 
> 00010005 been archived as
> 00010005.partial, and a new wal is switched out
> named:
> 00020005, which contains above two commits.
> 
> [4. recover to "after insert 1st row"]
> 
> Stop new standby(the rewinded primary), and on the new
> primary(promoted
> standby), do following:
> 
> 1. stop DB
> 2. sync $PGDATA against basebackup
> 3. create a recovery.conf with following content:
> 
>restore_command = 'cp /var/lib/pgsql/9.6/data/archive/%f %p'
>recovery_target_timeline = 2
>recovery_target_time = '2018-10-10 11:05:04 +08:00'
> 
> 4. stat DB
> 
> Then I expect server is recovered to the time before insert 2nd row
> but
> after insert 1st row. The result is as I expected, but it is in a
> "read-only transaction". The log shows following:
> 
> < 2018-10-10 03:09:56.790 UTC > LOG:  database system was
> interrupted;
> last known up at 2018-10-10 02:57:59 UTC
> < 2018-10-10 03:09:56.931 UTC > DEBUG:  restore_command = 'cp
> /var/lib/pgsql/9.6/data/archive/%f %p'
> < 2018-10-10 03:09:56.931 UTC > DEBUG:  recovery_target_timeline = 2
> < 2018-10-10 03:09:56.931 UTC > DEBUG:  recovery_target_time = '2018-
> 10-10 03:05:04+00'
> < 2018-10-10 03:09:56.934 UTC > LOG:  restored log file
> "0002.history" from archive
> < 2018-10-10 03:09:56.934 UTC > LOG:  starting point-in-time recovery
> to 2018-10-10 03:05:04+00
> < 2018-10-10 03:09:56.936 UTC > LOG:  restored log file
> "0002.history" from archive
> cp: cannot stat
> '/var/lib/pgsql/9.6/data/archive/00020004': No such
> file or directory
> < 2018-10-10 03:09:56.941 UTC > DEBUG:  could not restore file
> "00020004" from archive: child process exited with
> exit
> code
> 1
>   
> 
> < 2018-10-10 03:09:56.952 UTC > LOG:  restored log file
> "00010004" from archive
> < 2018-10-10 03:09:57.012 UTC > DEBUG:  got WAL segment from archive
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  checkpoint record is at
> 0/460
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  redo record is at 0/428;
> shutdown FALSE
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  next transaction ID: 0:632;
> next OID: 16385
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  next MultiXactId: 1; next
> MultiXactOffset: 0
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  oldest unfrozen transaction
> ID:
> 623, in database 1
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  oldest MultiXactId: 1, in
> database 1
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  commit timestamp Xid
> oldest/newest: 0/0
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  transaction ID wrap limit is
> 2147484270, limited by database with OID 1
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  MultiXactId wrap limit is
> 2147483648, limited by database with OID 1
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  starting up replication slots
> < 2018-10-10 03:09:57.013 UTC > DEBUG:  restoring replication slot
> from
> "pg_replslot/repl_slot/state"
> < 2018-10-10 03:09:57.015 UTC > DEBUG:  starting up replication
> origin
> progress state
> < 2018-10-10 03:09:57.020 UTC > DEBUG:  resetting unlogged relations:
> cleanup 1 init 0
> < 2018-10-10 03:09:57.021 UTC > DEBUG:  initializing for hot standby
> < 2018-10-10 03:09:57.021 UTC > LOG:  redo starts at 0/428
> < 2018-10-10 03:09:57.021 UTC > DEBUG:  recovery snapshots are now
> enabled
> < 2018-10-10 03:09:57.021 UTC > CONTEXT:  xlog redo at 0/428 for
> Standby/RUNNING_XACTS: nextXid 632 latestCompletedXid 631
> oldestRunningXid
> 632  
>   
> 
> < 2018-10-10 03:09:57.021 UTC > DEBUG:  end of backup reached
> < 2018-10-10 03:09:57.021 UTC > CONTEXT:  xlog redo at 0/4D0 for
> XLOG/BACKUP_END: 0/428
> < 2018-10-10 03:09:57.022 UTC > LOG:  consistent recovery state
> reached
> at 0/4F8
> < 2018-10-10 03:09:57.023 UTC > DEBUG:  checkpointer updated shared
> memory configuration values
> < 2018-10-10 03:09:57.024 UTC > LOG:  database system is ready to
> accept read only connections
> < 2018-10-10 03:09:57.028 UTC > 

Re: Barman issue

2018-10-10 Thread Martin Marques
El 05/10/18 a las 06:50, Rijo Roy escribió:
> Hello Experts,
> 
> I have a server rhel 6.9 installed with barman 2.3 which is configured
> to take backup from Postgresql 10 installed in a remote host and the
> same is working.

You should upgrade to 2.4, although I can't see anything relevant to your
issue in the changelog. Anyway, upgrade.

> I am trying to do a poc on streaming backup and streaming wal method by
> making use of the pg_receivewal. I have already created a user with
> replication privilege the same is added in pg_hba. conf. And I am able
> to connect remotely using the user created.
> But when I am running barman check pg
> pg is the name given to the remote dB host. I am getting an error like this
> EXCEPTION: LooseVersion instance has no attribute 'version'

You should be getting more output than just that. Have you checked the
barman logs?

> Below I am pasting the conf files used:
> [pg] 
> conninfo
> streaming_conninfo
> streaming_archiver = on
> slot_name = barman
> backup_method = postgres 
> path_prefix = Postgresql bin path
> basebackups_directory
> errors_directory
> incoming_wals_directory
> streaming_wals_directory

Set log_level to DEBUG while trying to find the reason for that exception.
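For example, in barman.conf (a sketch of the global section):

    [barman]
    log_level = DEBUG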

> ---—---
> I have already created the replication slots and started the
> pg_receivexlog remotely from the batman server. Was are streaming and is
> coming to the incoming_wals_directory 

Did you start pg_receivexlog (which in PG10 is actually called
pg_receivewal) manually or using `barman receive-wal`? (also, `barman
cron` should automatically start streaming if you have everything
configured correctly)

Regards,

-- 
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: tds_fdw binary column

2018-10-10 Thread Adrian Klaver

On 10/10/18 1:31 AM, Aleš Zelený wrote:

Hello,

my fault, I've forgot to mention that I have only DSN and database 
user/password credentials with no access to the box with Sybase. trying 
to reach service vendor support,  but it might take some time and I 
hoped I've done some mistake on my side...


Hmm, some digging found this:

https://github.com/tds-fdw/tds_fdw/issues/88

If your credentials allow it you might try the suggestion in the above 
to see if you can get a more detailed error message.




Kind Regards
Ales Zeleny



--
Adrian Klaver
adrian.kla...@aklaver.com



pg9.6: PITR paused just after failover, need manual resume

2018-10-10 Thread magodo


Hello,

I'm figuring out how to do PITR on an HA setup for postgres 9.6. I
almost managed to do it, but I faced the following issue:


[1. create basebackup]

[2. insert 2 rows]

on both primary and standby's pg_xlog

00010005

rmgr: Transaction len (rec/tot): 34/34, tx:633, lsn:
0/05016120, prev 0/050160E0, desc: COMMIT 2018-10-10 03:04:58.459754
UTC  - insert 1 (before: 2018-10-10 11:05:04 +08:00)

rmgr: Transaction len (rec/tot): 34/34, tx:634, lsn:
0/05016188, prev 0/05016148, desc: COMMIT 2018-10-10 03:05:07.010478
UTC  - insert 2 (before: 2018-10-10 11:05:09 +08:00)

[3. failover and promote the standby]

00010005 been archived as
00010005.partial, and a new wal is switched out named:
00020005, which contains above two commits.

[4. recover to "after insert 1st row"]

Stop new standby(the rewinded primary), and on the new primary(promoted
standby), do following:

1. stop DB
2. sync $PGDATA against basebackup
3. create a recovery.conf with following content:

   restore_command = 'cp /var/lib/pgsql/9.6/data/archive/%f %p'
   recovery_target_timeline = 2
   recovery_target_time = '2018-10-10 11:05:04 +08:00'

4. start DB

Then I expect server is recovered to the time before insert 2nd row but
after insert 1st row. The result is as I expected, but it is in a
"read-only transaction". The log shows following:

< 2018-10-10 03:09:56.790 UTC > LOG:  database system was interrupted;
last known up at 2018-10-10 02:57:59 UTC
< 2018-10-10 03:09:56.931 UTC > DEBUG:  restore_command = 'cp
/var/lib/pgsql/9.6/data/archive/%f %p'
< 2018-10-10 03:09:56.931 UTC > DEBUG:  recovery_target_timeline = 2
< 2018-10-10 03:09:56.931 UTC > DEBUG:  recovery_target_time = '2018-
10-10 03:05:04+00'
< 2018-10-10 03:09:56.934 UTC > LOG:  restored log file
"0002.history" from archive
< 2018-10-10 03:09:56.934 UTC > LOG:  starting point-in-time recovery
to 2018-10-10 03:05:04+00
< 2018-10-10 03:09:56.936 UTC > LOG:  restored log file
"0002.history" from archive
cp: cannot stat
'/var/lib/pgsql/9.6/data/archive/00020004': No such
file or directory
< 2018-10-10 03:09:56.941 UTC > DEBUG:  could not restore file
"00020004" from archive: child process exited with exit
code
1  

< 2018-10-10 03:09:56.952 UTC > LOG:  restored log file
"00010004" from archive
< 2018-10-10 03:09:57.012 UTC > DEBUG:  got WAL segment from archive
< 2018-10-10 03:09:57.013 UTC > DEBUG:  checkpoint record is at
0/460
< 2018-10-10 03:09:57.013 UTC > DEBUG:  redo record is at 0/428;
shutdown FALSE
< 2018-10-10 03:09:57.013 UTC > DEBUG:  next transaction ID: 0:632;
next OID: 16385
< 2018-10-10 03:09:57.013 UTC > DEBUG:  next MultiXactId: 1; next
MultiXactOffset: 0
< 2018-10-10 03:09:57.013 UTC > DEBUG:  oldest unfrozen transaction ID:
623, in database 1
< 2018-10-10 03:09:57.013 UTC > DEBUG:  oldest MultiXactId: 1, in
database 1
< 2018-10-10 03:09:57.013 UTC > DEBUG:  commit timestamp Xid
oldest/newest: 0/0
< 2018-10-10 03:09:57.013 UTC > DEBUG:  transaction ID wrap limit is
2147484270, limited by database with OID 1
< 2018-10-10 03:09:57.013 UTC > DEBUG:  MultiXactId wrap limit is
2147483648, limited by database with OID 1
< 2018-10-10 03:09:57.013 UTC > DEBUG:  starting up replication slots
< 2018-10-10 03:09:57.013 UTC > DEBUG:  restoring replication slot from
"pg_replslot/repl_slot/state"
< 2018-10-10 03:09:57.015 UTC > DEBUG:  starting up replication origin
progress state
< 2018-10-10 03:09:57.020 UTC > DEBUG:  resetting unlogged relations:
cleanup 1 init 0
< 2018-10-10 03:09:57.021 UTC > DEBUG:  initializing for hot standby
< 2018-10-10 03:09:57.021 UTC > LOG:  redo starts at 0/428
< 2018-10-10 03:09:57.021 UTC > DEBUG:  recovery snapshots are now
enabled
< 2018-10-10 03:09:57.021 UTC > CONTEXT:  xlog redo at 0/428 for
Standby/RUNNING_XACTS: nextXid 632 latestCompletedXid 631
oldestRunningXid
632

< 2018-10-10 03:09:57.021 UTC > DEBUG:  end of backup reached
< 2018-10-10 03:09:57.021 UTC > CONTEXT:  xlog redo at 0/4D0 for
XLOG/BACKUP_END: 0/428
< 2018-10-10 03:09:57.022 UTC > LOG:  consistent recovery state reached
at 0/4F8
< 2018-10-10 03:09:57.023 UTC > DEBUG:  checkpointer updated shared
memory configuration values
< 2018-10-10 03:09:57.024 UTC > LOG:  database system is ready to
accept read only connections
< 2018-10-10 03:09:57.028 UTC > DEBUG:  archived transaction log file
"00010004"
< 2018-10-10 03:09:57.035 UTC > DEBUG:  archived transaction log file
"0002.history"
< 2018-10-10 03:09:57.056 UTC > LOG:  restored log file
"00020005" from archive
< 2018-10-10 03:09:57.116 UTC > DEBUG:  got WAL segme

Re: RHEL 7 (systemd) reboot

2018-10-10 Thread Adrian Klaver

On 10/10/18 5:32 AM, Bryce Pepper wrote:

Adrian,
Thanks for the inquiry.  The function (db_execute_sql) is coming from a vendor 
(BMC) product called Control-M. It is a scheduling product.
The tmp file is deleted before I can see its contents but I believe it is 
trying to update some columns in the CMS_SYSPRM table.
I also think the postgresql instance is already stopped and hence why the 
db_execute fails.  I will try to modify the vendor function to save off the 
contents of the query.


Alright, I'm confused. In your earlier post you said the stop script is 
not running. Yet here it is, just not at the right time. I think a more 
detailed explanation is needed:


1) The stop script you are concerned about is a systemd  script, one 
that you created or system provided?


2) What is the shutdown service you refer to?

3) Is there a separate shutdown script for the Control-M product?

4) What do you expect to happen vs what is happening?



Bryce

p.s. Do you know of any verbose logging that could be turned on to catch when 
pgsql is being terminated?


-Original Message-
From: Adrian Klaver 
Sent: Tuesday, October 09, 2018 7:39 PM
To: Bryce Pepper ; pgsql-general@lists.postgresql.org
Subject: Re: RHEL 7 (systemd) reboot



On 10/9/18 11:06 AM, Bryce Pepper wrote:

I am running three instances (under different users) on a RHEL 7
server to support a vendor product.

In the defined services, the start & stop scripts work fine when
invoked with systemctl {start|stop} whatever.service  but we have
automated monthly patching which does a reboot.

Looking in /var/log/messages and the stop scripts do not get invoked
on reboot, therefore I created a new shutdown service as described
here 
.

It appears that PostGreSQL is receiving a signal from somewhere prior
to my script running.





The database must be available for the product to shut down in a
consistent state.

I am open to suggestions.


What is the below doing or coming from?:

db_execute_sql failed while processing
/data00/ctmlinux/ctm_server/tmp/upd_CMS_SYSPRM_29448.



Thanks,

Bryce

*Bryce Pepper*

Sr. Unix Applications Systems Engineer

*The Kansas City Southern Railway Company *

114 West 11^th Street  |  Kansas City,  MO 64105

Office:  816.983.1512

Email: bpep...@kcsouthern.com 




--
Adrian Klaver
adrian.kla...@aklaver.com




--
Adrian Klaver
adrian.kla...@aklaver.com



RE: RHEL 7 (systemd) reboot

2018-10-10 Thread Bryce Pepper
Here is the contents of the query and error:
[root@kccontrolmt01 tmp]# cat ctm.Xf9pQkg2
update CMS_SYSPRM set CURRENT_STATE='STOPPING',DESIRED_STATE='Down' where 
DESIRED_STATE <> 'Ignored'
;
psql: could not connect to server: Connection refused
Is the server running on host "kccontrolmt01" (10.1.32.53) and accepting
TCP/IP connections on port 5433?
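
That "Connection refused" strongly suggests the PostgreSQL instance has
already been stopped by the time the Control-M stop step runs its query.
One way the ordering could be expressed so that the stop step still finds
the database up is a dedicated oneshot unit ordered After= the PostgreSQL
unit: systemd reverses start ordering at shutdown, so such a unit is stopped
(and its ExecStop runs) before PostgreSQL goes down. A minimal sketch, with
assumed unit and script names that would need to be adjusted to what is
actually defined on the box:

  # /etc/systemd/system/ctm-shutdown.service   (hypothetical names, sketch only)
  [Unit]
  Description=Stop Control-M while PostgreSQL is still running
  # Start ordering is reversed at shutdown: starting after the PostgreSQL
  # unit means this unit is stopped before it.
  After=postgresql-ctm.service
  Wants=postgresql-ctm.service

  [Service]
  Type=oneshot
  RemainAfterExit=yes
  # Nothing to do at boot; the real work happens in ExecStop at shutdown.
  ExecStart=/bin/true
  # Whatever script stops Control-M cleanly goes here.
  ExecStop=/path/to/controlm_stop_script

  [Install]
  WantedBy=multi-user.target

After "systemctl daemon-reload", "systemctl enable ctm-shutdown" and
"systemctl start ctm-shutdown", the stop script should run before PostgreSQL
is shut down on reboot.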

-Original Message-
From: Adrian Klaver  
Sent: Tuesday, October 09, 2018 7:39 PM
To: Bryce Pepper ; pgsql-general@lists.postgresql.org
Subject: Re: RHEL 7 (systemd) reboot


On 10/9/18 11:06 AM, Bryce Pepper wrote:
> I am running three instances (under different users) on a RHEL 7 
> server to support a vendor product.
>
> In the defined services, the start & stop scripts work fine when 
> invoked with systemctl {start|stop} whatever.service  but we have 
> automated monthly patching which does a reboot.
>
> Looking in /var/log/messages and the stop scripts do not get invoked 
> on reboot, therefore I created a new shutdown service as described 
> here 
> .
>
> It appears that PostgreSQL is receiving a signal from somewhere prior 
> to my script running.
>

>
> The database must be available for the product to shut down in a 
> consistent state.
>
> I am open to suggestions.

What is the below doing or coming from?:

db_execute_sql failed while processing
/data00/ctmlinux/ctm_server/tmp/upd_CMS_SYSPRM_29448.

>
> Thanks,
>
> Bryce
>
> *Bryce Pepper*
>
> Sr. Unix Applications Systems Engineer
>
> *The Kansas City Southern Railway Company *
>
> 114 West 11th Street  |  Kansas City,  MO 64105
>
> Office:  816.983.1512
>
> Email: bpep...@kcsouthern.com 
>


--
Adrian Klaver
adrian.kla...@aklaver.com



RE: RHEL 7 (systemd) reboot

2018-10-10 Thread Bryce Pepper
Adrian,
Thanks for the inquiry.  The function (db_execute_sql) is coming from a vendor 
(BMC) product called Control-M. It is a scheduling product.
The tmp file is deleted before I can see its contents, but I believe it is 
trying to update some columns in the CMS_SYSPRM table. 
I also think the PostgreSQL instance is already stopped, which is why the 
db_execute fails.  I will try to modify the vendor function to save off the 
contents of the query.

Bryce

p.s. Do you know of any verbose logging that could be turned on to catch when 
pgsql is being terminated?
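
For the P.S.: the postmaster already logs shutdown requests at LOG level
(messages like "received fast shutdown request"), so it is mostly a matter
of keeping a timestamped server log and comparing it with the system log.
A sketch of postgresql.conf settings that should be enough (values are only
suggestions):

  logging_collector = on
  log_directory = 'pg_log'              # relative to the data directory
  log_line_prefix = '%m [%p] '          # timestamp and PID on every line

Comparing those timestamps against /var/log/messages (or "journalctl -b -1"
where the journal is persistent) should show what was being stopped at that
moment.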


-Original Message-
From: Adrian Klaver  
Sent: Tuesday, October 09, 2018 7:39 PM
To: Bryce Pepper ; pgsql-general@lists.postgresql.org
Subject: Re: RHEL 7 (systemd) reboot


On 10/9/18 11:06 AM, Bryce Pepper wrote:
> I am running three instances (under different users) on a RHEL 7 
> server to support a vendor product.
>
> In the defined services, the start & stop scripts work fine when 
> invoked with systemctl {start|stop} whatever.service  but we have 
> automated monthly patching which does a reboot.
>
> Looking in /var/log/messages and the stop scripts do not get invoked 
> on reboot, therefore I created a new shutdown service as described 
> here 
> .
>
> It appears that PostgreSQL is receiving a signal from somewhere prior 
> to my script running.
>

>
> The database must be available for the product to shut down in a 
> consistent state.
>
> I am open to suggestions.

What is the below doing or coming from?:

db_execute_sql failed while processing
/data00/ctmlinux/ctm_server/tmp/upd_CMS_SYSPRM_29448.

>
> Thanks,
>
> Bryce
>
> *Bryce Pepper*
>
> Sr. Unix Applications Systems Engineer
>
> *The Kansas City Southern Railway Company *
>
> 114 West 11th Street  |  Kansas City,  MO 64105
>
> Office:  816.983.1512
>
> Email: bpep...@kcsouthern.com 
>


--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Pg_logical without subscription. Can we log the operations ?

2018-10-10 Thread Steve Atkins


> On Oct 9, 2018, at 11:45 PM, Dilshan  wrote:
> 
> Hi Team,
>  I am working on a requirement where I need to log each and every operation 
> on the master DB. So I was wondering whether I could do this with pg_logical by setting 
> my master DB as publisher, setting a new DB as subscriber, and dropping the 
> subscription afterwards. I am wondering how to get the operations that a 
> subscription would receive just into logs. Could you guide me on that? 
> Otherwise, is there a possibility to receive all the operations without 
> dropping the subscription, by logging the details and deleting the subscription 
> tables to save space? I am planning to have logs rotated and to purge logs 
> every month. Could you please guide me on the feasibility of this approach?

You can read the logical replication stream with clients other than PostgreSQL 
itself, and you can use plugins to format it in different ways.

https://wiki.postgresql.org/wiki/Logical_Decoding_Plugins has some of the 
third-party plugins to format the changes.

There are libraries for most languages to consume the logical decoding stream, 
or the included client "pg_recvlogical" can be used to write it to disk.

pg_recvlogical + wal2json might be the simplest way to do a basic audit trail.

Cheers,
  Steve
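
A rough sketch of that last combination, assuming the wal2json output plugin
is installed on the server and the database is called mydb (slot and file
names are made up):

  # create a logical replication slot that uses wal2json
  pg_recvlogical -d mydb --slot audit_slot --create-slot -P wal2json

  # stream every change into a file, which can then be rotated and purged
  # like any other log
  pg_recvlogical -d mydb --slot audit_slot --start -f /var/log/pg_audit.json

Keep in mind that the slot retains WAL on the server until it is consumed,
so the consumer should run continuously, and the slot should be dropped when
it is no longer needed.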


Re: tds_fdw binary column

2018-10-10 Thread Aleš Zelený
Hello,

My fault, I forgot to mention that I only have a DSN and database
user/password credentials, with no access to the box running Sybase. I am
trying to reach the service vendor's support, but it might take some time,
and I had hoped I'd made some mistake on my side...

Kind Regards
Ales Zeleny
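
In the meantime, one way to check whether it is the pushed-down bytea
predicate that Sybase rejects is to keep the qual from being shipped and
filter on the PostgreSQL side instead. A subquery with OFFSET 0 is not
flattened and quals are not pushed into it, so the comparison stays local.
It pulls the whole remote table, so treat it as a diagnostic sketch rather
than a fix:

  SELECT *
  FROM (SELECT * FROM ase.tds_tbl OFFSET 0) AS t
  WHERE branch_id = E'\\x000246000944242d'::bytea;

If that returns the expected row, the remote SQL that tds_fdw generates for
the bytea comparison is the likely culprit.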

On Wed 10 Oct 2018 at 0:08, Adrian Klaver 
wrote:

> On 10/9/18 12:07 PM, Aleš Zelený wrote:
> > Hello,
> >
> > I've an issue with foreign table using tds_fdw from PostgreSQL 10 to
> > Sybase ASE 15.7.
> >
> > The issue is, that primary key, which I have to use for my predicate is
> > in Sybase data type binary.
> >
> > Foreign table maps ID column from Sybase binary to PostgreSQL BYTEA.
> >
> > Simple select is smooth:
> > [local]:5432 postgres@postgres:7650
> > =# select branch_id from ase.tds_tbl limit 1;
> > NOTICE:  tds_fdw: Query executed correctly
> > NOTICE:  tds_fdw: Getting results
> > ┌┐
> > │ branch_id  │
> > ├┤
> > │ \x000246000944242d │
> > └┘
> > (1 row)
> >
> > Whereas select using the ID fails:
> >
> > [local]:5432 postgres@postgres:7650
> > =# select * from ase.tds_tbl where branch_id =
> > E'\\x000246000944242d'::bytea;
> > ERROR:  DB-Library error: DB #: 102, DB Msg: General SQL Server error:
> > Check messages from the SQL Server, OS #: -1, OS Msg: , Level: 15
>^^
> Have you done the above, look at the error log for the server?
>
> > Time: 0.213 ms
> >
> > Thanks for any hints.
>
>
> > Kind regards Ales Zeleny
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: SELECT UNION into a text[]

2018-10-10 Thread Andrew Gierth
> "David" == David G Johnston  writes:

 >> SELECT ARRAY_AGG(hashed)
 >> FROM words_nouns
 >> WHERE added > TO_TIMESTAMP(1539100913)
 >> UNION
 >> SELECT ARRAY_AGG(hashed)
 >> FROM words_verbs
 >> WHERE added > TO_TIMESTAMP(1539100913);

 David> SELECT array_agg(hashed) FROM (
 David> SELECT hashed FROM words_nouns
 David> UNION ALL
 David> SELECT hashed FROM words_verbs
 David> ) AS src

Or even better,

SELECT ARRAY(SELECT hashed FROM words_nouns
 UNION
 SELECT hashed FROM words_verbs);

-- 
Andrew (irc:RhodiumToad)
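
With the filters from the original query put back, the whole thing collapses
to a single statement returning one text[] (a sketch reusing the table and
column names quoted above):

  SELECT ARRAY(SELECT hashed FROM words_nouns WHERE added > TO_TIMESTAMP(1539100913)
               UNION
               SELECT hashed FROM words_verbs WHERE added > TO_TIMESTAMP(1539100913)) AS hashed;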



Question about advancing confirmed_flush_lsn using a logical replication slot when there is no modification.

2018-10-10 Thread mo jia
Hi:

I am not sure whether it is OK to ask this question on this list.

I ran into this problem using RDS PostgreSQL (I think plain PostgreSQL
may have the same problem).

I have posted the question on Stack Overflow here:
https://stackoverflow.com/questions/52589058/aws-rds-postgresql-transaction-logs-keep-going-when-there-is-no-data-change

I also tried to get some suggestions from the psycopg project (and got the
suggestion to ask here):
https://github.com/psycopg/psycopg2/issues/780

The problem, described in simple words:
When there are no modifications, the PostgreSQL WAL (or transaction logs
in RDS) keeps growing.
pg_current_wal_lsn keeps increasing, but the
confirmed_flush_lsn in (select * from pg_replication_slots;) is not
advanced.
In this case my consumer using psycopg2 cannot read any message from the slot.

Someone has given me an answer:
the RDS DMS service
https://docs.aws.amazon.com/dms/latest/userguide/CHAP_ReleaseNotes.html

There is a WAL heartbeat concept there:

Added a write-ahead log (WAL) heartbeat (that is, running dummy
queries) for replication from a PostgreSQL source. This feature was
added so that idle logical replication slots don't hold onto old WAL
logs, which can result in storage full situations on the source. This
heartbeat keeps restart_lsn moving and prevents storage full
scenarios.

Any more suggestions? I am more or less an application-level programmer.
I think this problem requires understanding more about the internals of
PostgreSQL and the replication protocol.

Thanks.
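
For a self-managed server, a minimal sketch of the same heartbeat idea (the
table name is made up, and it assumes the slot's consumer acknowledges what
it receives, e.g. via send_feedback in psycopg2): a tiny dummy write run
periodically, say from cron, gives an otherwise idle slot something to decode
and confirm, so confirmed_flush_lsn and restart_lsn can keep moving and old
WAL can be released.

  -- one-time setup
  CREATE TABLE IF NOT EXISTS replication_heartbeat (id int PRIMARY KEY, beat timestamptz);

  -- run every few minutes
  INSERT INTO replication_heartbeat (id, beat) VALUES (1, now())
  ON CONFLICT (id) DO UPDATE SET beat = EXCLUDED.beat;

The DMS "dummy queries" heartbeat described above is the same idea applied by
AWS's own replication tooling.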