Re: [GENERAL] BDR, wal sender, high system cpu, mutex_lock_common

2017-10-03 Thread milist ujang
Hi Craig,

Anyway, this OS is guess OS in vmware (vsphere).
Thank for your response and help.

On Tue, Oct 3, 2017 at 8:49 PM, Craig Ringer  wrote:
>
>
> Can you get stacks please?
>
> Use -g
>

# Events: 2K cpu-clock
#
# Overhead   Command  Shared ObjectSymbol
#     .  
#
86.96%  postgres  [kernel.kallsyms]  [k] __mutex_lock_common.isra.5
|
--- __mutex_lock_common.isra.5
read

 2.85%  postgres  [kernel.kallsyms]  [k] do_raw_spin_lock
|
--- do_raw_spin_lock
   |
   |--90.48%-- read
   |
   |--8.33%-- recv
   |
--1.19%-- write

 2.44%  postgres  [kernel.kallsyms]  [k] mutex_unlock
|
--- mutex_unlock
read

 2.03%  postgres  [kernel.kallsyms]  [k] arch_local_irq_restore
|
--- arch_local_irq_restore
read

 1.32%  postgres  postgres   [.] ValidXLogRecord
|
--- ValidXLogRecord

 1.25%  postgres  [kernel.kallsyms]  [k] mutex_lock
|
--- mutex_lock
read


-- 
regards

ujang jaenudin | DBA Consultant (Freelancer)
http://ora62.wordpress.com
http://id.linkedin.com/pub/ujang-jaenudin/12/64/bab


[GENERAL] BDR, wal sender, high system cpu, mutex_lock_common

2017-10-03 Thread milist ujang
Hi all,

I've an environment 9.4 + bdr:
PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (Debian
4.7.2-5) 4.7.2, 64-bit

kernel version:
3.2.0-4-amd64 #1 SMP Debian 3.2.65-1 x86_64 GNU/Linux

This is consolidation databases, in this machine there are around 250+ wal
sender processes.

top output revealed high system cpu:
%Cpu(s):  1.4 us, 49.7 sy,  0.0 ni, 48.8 id,  0.0 wa,  0.0 hi,  0.0 si,
0.0 st

profiling cpu with perf:

perf top -e cpu-clock

Events: 142K cpu-clock
 82.37%  [kernel][k] __mutex_lock_common.isra.5
  4.49%  [kernel][k] do_raw_spin_lock
  2.23%  [kernel][k] mutex_lock
  2.16%  [kernel][k] mutex_unlock
  2.12%  [kernel][k] arch_local_irq_restore
  1.73%  postgres[.] ValidXLogRecord
  0.87%  [kernel][k] __mutex_unlock_slowpath
  0.78%  [kernel][k] arch_local_irq_enable
  0.63%  [kernel][k] sys_recvfrom


finally get which processes (wal senders) that are using mutexes:

perf top -e task-clock -p 55382

Events: 697  task-clock
 88.08%  [kernel]  [k] __mutex_lock_common.isra.5
  3.27%  [kernel]  [k] do_raw_spin_lock
  2.34%  [kernel]  [k] arch_local_irq_restore
  2.10%  postgres  [.] ValidXLogRecord
  1.87%  [kernel]  [k] mutex_unlock
  1.87%  [kernel]  [k] mutex_lock
  0.47%  [kernel]  [k] sys_recvfrom

strace output of wal sender process:

% time seconds  usecs/call callserrors syscall
-- --- --- - - 
 98.301.030072   5213063201463 read
  1.690.017686   0201464201464 recvfrom
  0.010.000110   0   806   lseek
  0.000.43   0   474   468 rt_sigreturn
  0.000.00   0 6   open
  0.000.00   0 6   close
-- --- --- - - 
100.001.047911415819403395 total


strace detail, majority read from pipe and recvfrom from socket (but most
of them are EAGAIN):

read(15, 
"~\320\5\0\1\0\0\0\0@\235\1\360\16\0\0\334\26\0\0\0\0\0\0\365\27\0\0\0\0\0\0"...,
8192) = 8192 <0.25>
read(6, 0x7fffdd837b3f, 1) = -1 EAGAIN (Resource temporarily unavailable)
<0.000116>
recvfrom(10, 0x7fffdd837b17, 1, 0, 0, 0) = -1 EAGAIN (Resource temporarily
unavailable) <0.49>

ls -l /proc/62388/fd/15
lr-x-- 1 postgres postgres 64 Oct  1 08:39 /proc/62388/fd/15 ->
/data/pg_xlog/00010EF00061

ls -l /proc/62388/fd/6
lr-x-- 1 postgres postgres 64 Oct  1 08:39 /proc/62388/fd/6 ->
pipe:[1090892506]

ls -l /proc/62388/fd/10
lrwx-- 1 postgres postgres 64 Oct  1 08:39 /proc/62388/fd/10 ->
socket:[1096584060]

I wonder, is there kernel version has better handling mutexes?
Or is it the expected behavior?

Sorry for cross-posting, I have posted the same on pgsql-performance too


-- 
regards

ujang jaenudin | DBA Consultant (Freelancer)
http://ora62.wordpress.com
http://id.linkedin.com/pub/ujang-jaenudin/12/64/bab


Re: [GENERAL] BDR, near xid wraparound, a lot of files in pg_subtrans directory

2017-09-18 Thread milist ujang
Hi Craig,

So, is it safe to drop those list from this query output?

select riname from pg_replication_identifier where riname not in
(select external_id from pg_replication_identifier_progress);

I cannot read pg_get_replication_identifier_progress function, is it likely
c function?




On Fri, Sep 15, 2017 at 11:14 AM, Craig Ringer <cr...@2ndquadrant.com>
wrote:

> On 15 September 2017 at 11:46, milist ujang <ujang.mil...@gmail.com>
> wrote:
>
>> Hi Craig,
>>
>> Thanks again for pointing to inactive replication slot.
>> After inactive replication slot been dropped, the relfrozenxid now moving.
>>
>> I wonder if  replication identifier will have some issue if left
>> un-chained? since at other side there are inactive replication identifier.
>>
>
> No, that's harmless.
>
> However, if there's still an "other side" at all, you presumably have
> broken replication.
>
>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>



-- 
regards

ujang jaenudin | DBA Consultant (Freelancer)
http://ora62.wordpress.com
http://id.linkedin.com/pub/ujang-jaenudin/12/64/bab


Re: [GENERAL] BDR, near xid wraparound, a lot of files in pg_subtrans directory

2017-09-14 Thread milist ujang
Hi Craig,

Thanks again for pointing to inactive replication slot.
After inactive replication slot been dropped, the relfrozenxid now moving.

I wonder if  replication identifier will have some issue if left
un-chained? since at other side there are inactive replication identifier.



On Fri, Sep 15, 2017 at 9:36 AM, Craig Ringer  wrote:

>
>
> Do you have any idle/old replication slots, perhaps from failed node joins
> or abandoned nodes not properly parted?
>
> SELECT *
> FROM pg_replication_slots;
>
> Also check
>
> SELECT oid,*
> FROM pg_database;
>
>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>



-- 
regards

ujang jaenudin | DBA Consultant (Freelancer)
http://ora62.wordpress.com
http://id.linkedin.com/pub/ujang-jaenudin/12/64/bab


[GENERAL] BDR, near xid wraparound, a lot of files in pg_subtrans directory

2017-09-13 Thread milist ujang
HI list,

I have a database with bdr environment which keep alerting these messages
in log file:

HINT:  Close open transactions soon to avoid wraparound problems.
WARNING:  oldest xmin is far in the past

Querying pg_stat_activity where state='active';

datname  | template1
query| autovacuum: VACUUM pg_catalog.pg_depend (to prevent
wraparound)

datname  | template1
query| autovacuum: VACUUM pg_toast.pg_toast_1255 (to prevent
wraparound)

datname  | template1
query| autovacuum: VACUUM pg_catalog.pg_ts_parser (to prevent
wraparound)

SELECT
 pg_namespace.nspname
 ,c.relname AS relname
 ,greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age
 ,c.relfrozenxid
 ,t.relfrozenxid
FROM pg_class c
LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
LEFT JOIN
 pg_namespace
   ON pg_namespace.oid = c.relnamespace
WHERE c.relkind = 'r' and c.relname='pg_depend';

-[ RECORD 1 ]+---
nspname  | pg_catalog
relname  | pg_depend
age  | 1165907267
relfrozenxid | 102246720
relfrozenxid |


Trying to vacuum manual, but no luck:

postgres=# vacuum pg_catalog.pg_depend;
WARNING:  oldest xmin is far in the past
HINT:  Close open transactions soon to avoid wraparound problems.
VACUUM
postgres=# vacuum freeze pg_catalog.pg_depend;
WARNING:  oldest xmin is far in the past
HINT:  Close open transactions soon to avoid wraparound problems.
VACUUM
postgres=# vacuum full pg_catalog.pg_depend;
WARNING:  oldest xmin is far in the past
HINT:  Close open transactions soon to avoid wraparound problems.
VACUUM


Check fozenxid again after vacuum, not moving:

SELECT
pg_namespace.nspname
,c.relname AS relname
,greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age
,c.relfrozenxid
,t.relfrozenxid
FROM pg_class c
LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
LEFT JOIN
 pg_namespace
   ON pg_namespace.oid = c.relnamespace
WHERE c.relkind = 'r' and c.relname='pg_depend';
-[ RECORD 1 ]+---
nspname  | pg_catalog
relname  | pg_depend
age  | 1165908742
relfrozenxid | 102246720
relfrozenxid |


Searching the similar situation, found this url:
https://postgrespro.com/list/thread-id/1556972

but the above url was in streaming replication, not bdr; but symptom is
likely same:
there are a lot of files inside pg_subtrans directory, many files older
than latest cluster restarted.

query pg_prepared_xacts, pg_stat_activity, pg_locks with no indication
about long running queries or even open transactions.


-- 
regards

ujang jaenudin | DBA Consultant (Freelancer)
http://ora62.wordpress.com
http://id.linkedin.com/pub/ujang-jaenudin/12/64/bab


[GENERAL] BDR, limit number of groups in a cluster

2017-09-10 Thread milist ujang
Hi all,

Based on the docs and look at the processes, it seems 1 wal sender on each
node per group.

If there is scenario of consolidating many databases (say hundreds) into 1
database (in this central cluster there are hundreds wal sender), what is
the limit number of groups?

I wonder if anyone has runs such the scenario?


-- 
regards

ujang jaenudin | DBA Consultant (Freelancer)
http://ora62.wordpress.com
http://id.linkedin.com/pub/ujang-jaenudin/12/64/bab


Re: [GENERAL] BDR, wal segment has been removed, is it possible move forward?

2017-09-07 Thread milist ujang
Hi Craig,

On Wed, Sep 6, 2017 at 4:07 PM, Craig Ringer  wrote:

>
> You could drop and re-create the replication slot, I guess. But your nodes
> would be hopelessly out of sync and need manual resync (with data
> replication disabled) of one node vs another.
>

Thanks for pointing to replication slot.
I Simulate the similar situation in dev env by remove the wal segment on
node1, when  node2 keep inserting into a table, now it perfectly can move
forward to latest wal segment, but the difference situation is at
node_status.

In production node_status is  i  in node1 but  r  in node2, where on my dev
both nodes keep  r  , even I waited to let it may change quite long.

can I safely update the node_status directy on bdr.bdr_nodes?


-- 
regards

ujang jaenudin | DBA Consultant (Freelancer)
http://ora62.wordpress.com
http://id.linkedin.com/pub/ujang-jaenudin/12/64/bab


Re: [GENERAL] BDR, wal segment has been removed, is it possible move forward?

2017-09-05 Thread milist ujang
Hi Craig

On Wed, Sep 6, 2017 at 7:21 AM, Craig Ringer  wrote:
>
>
> BDR can, see bdr.skip_changes_upto .
>
> Unluckily my bdr is 0.9.3


> But PostgreSQL's logical decoding requires a contiguous WAL stream to
> maintain a valid catalog_xmin and restart_lsn, so it'll still fail to
> advance. So your replication from that node is broken, and you have to part
> the node then rejoin. You'll need to manually recover any diverged data.
>
> Yup, I willing to reconcile data manualy via dblink or else,
is it still possible to move wal segment in 0.9.3?

I've played these at dev env to simulate, but no luck:

# select * from pg_replication_identifier_progress;
 local_id |  external_id   | remote_lsn | local_lsn
--+++---
1 | bdr_6461744703437035137_1_34424_30406_ | 0/4288950  | 0/3007588
(1 row)

# SELECT
pg_replication_identifier_setup_replaying_from('bdr_6461744703437035137_1_34424_30406_');
 pg_replication_identifier_setup_replaying_from


(1 row)

# SELECT
pg_replication_identifier_setup_tx_origin('0/4288960','2017-09-06');
 pg_replication_identifier_setup_tx_origin
---

(1 row)

# select * from pg_replication_identifier_progress;
 local_id |  external_id   | remote_lsn | local_lsn
--+++---
1 | bdr_6461744703437035137_1_34424_30406_ | 0/4288950  | 0/3007588
(1 row)


-- 
regards

ujang jaenudin | DBA Consultant (Freelancer)
http://ora62.wordpress.com
http://id.linkedin.com/pub/ujang-jaenudin/12/64/bab


[GENERAL] BDR, wal segment has been removed, is it possible move forward?

2017-09-05 Thread milist ujang
Hi all,

due to space issue and high volume transaction, some wal segments removed
from pg_xlog on bdr environment.

warning log at node1 saying "requested WAL segment . has already been
removed" following Connection reset by peer.

log at node2 :
Sending replication command: START_REPLICATION SLOT
...
...
XX000: data stream ended


I had played streams and goldengate (oracle product) , that at capture side
we can move forward to certain sequence (archivedlog/redolog - wal segment
in postgres).

So, is it possible to move forward to read recent wal segment in bdr
environment? (assume data reconciliation will be done manually).


-- 
regards

ujang jaenudin | DBA Consultant (Freelancer)
http://ora62.wordpress.com
http://id.linkedin.com/pub/ujang-jaenudin/12/64/bab