Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-02 Thread Noah Misch
On Wed, Feb 01, 2017 at 02:39:25PM +0200, Heikki Linnakangas wrote:
> On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote:
> >Attached please find my patch for XLC/AIX.
> >The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
> >The comment in this file says that:
> >
> >   * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
> >   * providing sequential consistency.  This is undocumented.
> >
> >But it is not true any more (I checked the generated assembler code in the
> >debugger).
> >This is why I have added __sync() to this function. Now pgbench is working
> >normally.

Konstantin, does "make -C src/bin/pgbench check" fail >10% of the time in the
bad build?

> Seems like it was not so much undocumented, but an implementation detail
> that was not guaranteed after all..

Seems so.

> There was a long thread on these things the last time this was changed: 
> https://www.postgresql.org/message-id/20160425185204.jrvlghn3jxulsb7i%40alap3.anarazel.de.
> I couldn't find an explanation there of why we thought that fetch_and_add
> implicitly performs sync and isync.

It was in the generated code, for AIX xlc 12.1.0.0.

> >Also there is mysterious disappearance of assembler section function
> >with sync instruction from pg_atomic_compare_exchange_u32_impl.
> >I have fixed it by using __sync() built-in function instead.
> 
> __sync() seems more appropriate there, anyway. We're using intrinsics for
> all the other things in generic-xlc.h. But it sure is scary that the "asm"
> sections just disappeared.

That is a problem, but it's a stretch to conclude that asm sections are
generally prone to removal, while intrinsics are generally durable.

> @@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
>  static inline uint32
>  pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
>  {
> +	uint32		ret;
> +
>  	/*
> -	 * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
> -	 * providing sequential consistency.  This is undocumented.
> +	 * Use __sync() before and __isync() after, like in compare-exchange
> +	 * above.
>  	 */
> -	return __fetch_and_add((volatile int *)&ptr->value, add_);
> +	__sync();
> +
> +	ret = __fetch_and_add((volatile int *)&ptr->value, add_);
> +
> +	__isync();
> +
> +	return ret;
>  }

Since this emits double syncs with older xlc, I recommend instead replacing
the whole thing with inline asm.  As I opined in the last message of the
thread you linked above, the intrinsics provide little value as abstractions
if one checks the generated code to deduce how to use them.  Now that the
generated code is xlc-version-dependent, the port is better off with
compiler-independent asm like we have for ppc in s_lock.h.
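
For illustration only, the ordering contract at stake can be written with portable C11 atomics instead of xlc intrinsics or PowerPC asm. This is a sketch, not the code proposed for generic-xlc.h: the helper name fetch_add_u32_seq_cst is hypothetical, and it assumes a compiler that provides <stdatomic.h>. It shows the required behaviour: a fetch-and-add that is sequentially consistent and returns the previous value. How a given compiler lowers that to barriers is exactly the version-dependent detail discussed above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not PostgreSQL code: add add_ to *ptr with
 * sequentially consistent ordering and return the previous value.
 */
static inline uint32_t
fetch_add_u32_seq_cst(_Atomic uint32_t *ptr, int32_t add_)
{
	assert(ptr != NULL);

	/* The memory order is requested explicitly, not inferred from codegen. */
	return atomic_fetch_add_explicit(ptr, (uint32_t) add_,
									 memory_order_seq_cst);
}
```

Stating the memory order in the call removes the need to inspect generated code to learn what ordering an intrinsic happens to provide.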


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-02 Thread Konstantin Knizhnik

Hi Tony,

On 02.02.2017 17:10, REIX, Tony wrote:


> Hi Konstantin
>
> I've discussed the "zombie/exit" issue with our expert here.
>
> - He does not think that AIX has anything special here
>
> - If the process is marked <exiting> in ps, this is because the flag
> SEXIT is set, thus the process is blocked somewhere in the kexitx()
> syscall, waiting for something.
>
> - In order to know what it is waiting for, the best would be to have a
> look with kdb.

kdb shows the following stack:

pvthread+073000 STACK:
[005E1958]slock+000578 (005E1958, 80001032 [??])
[9558].simple_lock+58 ()
[00651DBC]vm_relalias+00019C (??, ??, ??, ??, ??)
[006544AC]vm_map_entry_delete+00074C (??, ??, ??)
[00659C30]vm_map_delete+000150 (??, ??, ??, ??)
[00659D88]vm_map_deallocate+48 (??, ??)
[0011C588]kexitx+001408 (??)
[000BB08C]kexit+8C ()
___ Recovery (FFF9290) ___
WARNING: Eyecatcher/version mismatch in RWA


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-02 Thread Konstantin Knizhnik

On 02.02.2017 18:20, REIX, Tony wrote:


> Hi Konstantin
>
> I have an issue with pgbench. Any idea ?




The pgbench -s option specifies the scale factor.
Scale 1000 corresponds to 100 million rows in pgbench_accounts and requires about 16GB on disk.
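
As a rough sketch of the arithmetic: pgbench creates 100,000 pgbench_accounts rows per unit of scale factor. The per-scale disk figure below is an assumption extrapolated from the "about 16GB at scale 1000" estimate above, not a measured value, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* pgbench creates 100,000 pgbench_accounts rows per unit of scale factor. */
#define ROWS_PER_SCALE INT64_C(100000)

/*
 * Assumption for illustration: about 16 MB of disk per unit of scale,
 * extrapolated from the "about 16GB at scale 1000" estimate.
 */
#define APPROX_MB_PER_SCALE INT64_C(16)

/* Number of pgbench_accounts rows created by "pgbench -i -s scale". */
static int64_t
pgbench_accounts_rows(int scale)
{
	assert(scale > 0);
	return (int64_t) scale * ROWS_PER_SCALE;
}

/* Rough on-disk size in MB for a given scale factor. */
static int64_t
pgbench_approx_size_mb(int scale)
{
	assert(scale > 0);
	return (int64_t) scale * APPROX_MB_PER_SCALE;
}

/* Largest scale factor whose estimated size fits in free_mb of disk. */
static int
pgbench_max_scale_for(int64_t free_mb)
{
	assert(free_mb >= 0);
	return (int) (free_mb / APPROX_MB_PER_SCALE);
}
```

Under that assumption, a filesystem with about 6GB free holds a scale factor of roughly 400 at most, and less once WAL is counted, so -s 1000 cannot fit there.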



Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-02 Thread REIX, Tony
Hi Konstantin

I have an issue with pgbench. Any idea ?


  # mkdir /tmp/PGS
 # chown pgstbf.staff /tmp/PGS

 # su pgstbf

 $ /opt/freeware/bin/initdb -D /tmp/PGS
 The files belonging to this database system will be owned by user "pgstbf".
 This user must also own the server process.

 The database cluster will be initialized with locale "C".
 The default database encoding has accordingly been set to "SQL_ASCII".
 The default text search configuration will be set to "english".

 Data page checksums are disabled.

 fixing permissions on existing directory /tmp/PGS ... ok
 creating subdirectories ... ok
 selecting default max_connections ... 100
 selecting default shared_buffers ... 128MB
 selecting dynamic shared memory implementation ... posix
 creating configuration files ... ok
 running bootstrap script ... ok
 performing post-bootstrap initialization ... ok
 syncing data to disk ... ok

 WARNING: enabling "trust" authentication for local connections
 You can change this by editing pg_hba.conf or using the option -A, or 
--auth-local and --auth-host, the next time you run initdb.

 Success. You can now start the database server using:


 $ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile start
 server starting


 $ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile status
  pg_ctl: server is running (PID: 11599920)
 /opt/freeware/bin/postgres_64 "-D" "/tmp/PGS"

 $ /usr/bin/createdb pgstbf
 $


 $ pgbench -i -s 1000
 creating tables...
 100000 of 100000000 tuples (0%) done (elapsed 0.29 s, remaining 288.09 s)
 ...
 100000000 of 100000000 tuples (100%) done (elapsed 42.60 s, remaining 0.00 s)
 ERROR:  could not extend file "base/16384/24614": wrote only 7680 of 8192 bytes at block 131071
 HINT:  Check free disk space.
 CONTEXT:  COPY pgbench_accounts, line 7995584
 PQendcopy failed


After cleaning all /tmp/PGS and symlinking it to /home, where I have 6GB free, 
I've retried and I got nearly the same:


 100000000 of 100000000 tuples (100%) done (elapsed 204.65 s, remaining 0.00 s)
 ERROR:  could not extend file "base/16384/16397.6": No space left on device
 HINT:  Check free disk space.
 CONTEXT:  COPY pgbench_accounts, line 51235802
PQendcopy failed


Do I need more than 6GB ???


Thanks

Tony


$ df -k .
Filesystem    1024-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd1         45088768   6719484   86%   946016    39% /home

bash-4.3$ pwd
/tmp/PGS

bash-4.3$ ll /tmp/PGS
lrwxrwxrwx    1 root     system           10 Feb  2 08:43 /tmp/PGS -> /home/PGS/

$ df -k
Filesystem    1024-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4           524288    277284   48%    10733    14% /
/dev/hd2          6684672    148896   98%    49303    48% /usr
/dev/hd9var       2097152    314696   85%    24934    18% /var
/dev/hd3          3145728   2527532   20%      418     1% /tmp
/dev/hd1         45088768   6719484   86%   946016    39% /home
/dev/hd11admin     131072    130692    1%        7     1% /admin
/proc                   -         -    -         -     -  /proc
/dev/hd10opt     65273856    829500   99%   938339    41% /opt
/dev/livedump      262144    261776    1%        4     1% /var/adm/ras/livedump
/aha                    -         -    -        18     1% /aha


$ cat logfile
LOG:  database system was shut down at 2017-02-02 09:08:31 CST
LOG:  MultiXact member wraparound protections are now enabled
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
ERROR:  could not extend file "base/16384/16397.6": No space left on device
HINT:  Check free disk space.
CONTEXT:  COPY pgbench_accounts, line 51235802
STATEMENT:  copy pgbench_accounts from stdin



$ ulimit -a
core file size  (blocks, -c) 1048575
data seg size   (kbytes, -d) 131072
file size   (blocks, -f) unlimited
max memory size (kbytes, -m) 32768
open files  (-n) 2000
pipe size(512 bytes, -p) 64
stack size  (kbytes, -s) 32768
cpu time   (seconds, -t) unlimited
max user processes  (-u) unlimited
virtual memory  (kbytes, -v) unlimited


bash-4.3$ ll /tmp/PGS
lrwxrwxrwx    1 root     system           10 Feb  2 08:43 /tmp/PGS -> /home/PGS/
bash-4.3$ ls -l
total 120
-rw-------    1 pgstbf   staff             4 Feb  2 09:08 PG_VERSION
drwx------    6 pgstbf   staff           256 Feb  2 09:09 base
drwx------    2 pgstbf   staff          4096 Feb  2 09:09 global
-rw-------    1 pgstbf   staff           410 Feb  2 09:13 logfile
drwx------    2 pgstbf   staff           256 Feb  2 09:08 pg_clog
drwx------    2 pgstbf   staff           256 Feb  2 09:08 pg_commit_ts
drwx------    2 pgstbf   staff           256 Feb  2 09:08 pg_dynshmem
-rw-------    1 pgstbf   staff          4462 Feb  2 09:08 pg_hba.conf
-rw-------    1 pgstbf   staff          1636 Feb  2 09:08 pg_ident.conf
drwx------    4 pgstbf   staff           256 Feb  2 09:08 pg_logical
drwx------    4 pgstbf   staff           256 Feb 

Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-02 Thread REIX, Tony
Hi Konstantin

I've discussed the "zombie/exit" issue with our expert here.

- He does not think that AIX has anything special here

- If the process is marked <exiting> in ps, this is because the flag SEXIT is 
set, thus the process is blocked somewhere in the kexitx() syscall, waiting for 
something.

- In order to know what it is waiting for, the best would be to have a look 
with kdb.

- Either it is waiting for an asynchronous I/O to end, or for a thread to end if 
the process is multi-threaded.

- Using the proctree command for analyzing the issue is not a good idea, since 
the process will block in kexitx() if there is an operation on /proc being done.

- If the process is marked <defunct>, that means that the parent process has 
not yet called waitpid() to get the child's status. Maybe the parent is blocked 
in non-interruptible code where the signal handler cannot be called.

- In short, that may be due to many causes... Using kdb is the best way.

- Instead of proctree (which makes use of /proc), use: "ps -faT <pid>".


I'll try to reproduce here.

Regards

Tony

On 01/02/2017 at 21:26, Konstantin Knizhnik wrote:
On 02/01/2017 08:30 PM, REIX, Tony wrote:



About the zombie issue, I've discussed with my colleagues. It looks like the 
process stays a zombie until the parent looks at its status. However, though I did 
that several times, I do not remember the details well. And that should not be 
specific to AIX. I'll discuss it tomorrow with another colleague, who should 
understand this better than me.

1. The process is not in zombie state (according to ps). It is in <exiting> state... 
It is something AIX specific, I have not seen processes in this state on Linux.
2. I have implemented a simple test - forkbomb. It creates 1000 children and then 
waits for them. It is about ten times slower than on Intel/Linux, but still much 
faster than 100 seconds. So there is some difference between a postgres backend 
and a dummy process doing nothing - just immediately terminating after return 
from fork()


Regards,

Tony

On 01/02/2017 at 16:59, Konstantin Knizhnik wrote:
Hi Tony,

On 01.02.2017 18:42, REIX, Tony wrote:

Hi Konstantin

XLC.

I'm on AIX 7.1 for now.

I'm using this version of XLC v13:

# xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0003

With this version, I have (at least, since I tested with "check" and not 
"check-world" at that time) 2 failing tests: create_aggregate , aggregates .


With the following XLC v12 version, I have NO test failure:

# /usr/vac/bin/xlc -qversion
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01..0016


So maybe you are not using XLC v13.1.3.3, rather another sub-version. Unless 
you are using more options for the configure ?


Configure.

What are the options that you give to the configure ?


export CC="/opt/IBM/xlc/13.1.3/bin/xlc"
export CFLAGS="-qarch=pwr8 -qtune=pwr8 -O2 -qalign=natural -q64 "
export LDFLAGS="-Wl,-bbigtoc,-b64"
export AR="/usr/bin/ar -X64"
export LD="/usr/bin/ld -b64 "
export NM="/usr/bin/nm -X64"
./configure --prefix="/opt/postgresql/xlc-debug/9.6"



Hard load & 64 cores ? OK. That clearly explains why I do not see this issue.


pgbench ? I wanted to run it. However, I'm still looking where to get it plus a 
guide for using it for testing.

pgbench is part of the Postgres distribution (src/bin/pgbench)



I would add such tests when building my PostgreSQL RPMs on AIX. So any help is 
welcome !


Performance.

- Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. Any 
idea how I should proceed ? Any PostgreSQL performance benchmark that I could 
find and use ? pgbench ?

pgbench is the most widely used tool for simulating an OLTP workload. Certainly it is 
quite primitive and its results are rather artificial. TPC-C seems to be a better 
choice.
But the best approach is to implement your own benchmark simulating the actual workload 
of your real application.


- I'm interested in any information for improving the performance & quality of 
my PostgreSQL RPMs on AIX. (As I already said, BullFreeware RPMs for AIX are 
free and can be used by anyone, like Perzl RPMs are. My company (ATOS/Bull) 
has been selling IBM Power machines under the Escala brand for ages (25 years this 
year)).


How to help ?

How could I help for improving the quality and performance of PostgreSQL on AIX 
?

We still have one open issue on AIX: see 
https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg303094.html
It would be great if you could somehow help to fix this problem.




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company







Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-02 Thread Konstantin Knizhnik

Latest update on the issue with the deadlock in XLogInsert.

After almost one day of running, pgbench is once again not working 
normally :(

There is no deadlock, there are no core files, and there are no error messages in the log.
But TPS is almost zero:

progress: 57446.0 s, 1.1 tps, lat 3840.265 ms stddev NaNQ
progress: 57447.3 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57448.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57449.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57450.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57451.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57452.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57453.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57454.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57455.1 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57456.5 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57457.1 s, 164.6 tps, lat 11504.085 ms stddev 5902.148
progress: 57458.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57459.0 s, 234.0 tps, lat 1597.573 ms stddev 3665.814
progress: 57460.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57461.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57462.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57463.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57464.0 s, 602.8 tps, lat 906.765 ms stddev 1940.256
progress: 57465.0 s, 7.2 tps, lat 38.052 ms stddev 12.302
progress: 57466.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57467.1 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57468.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57469.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57470.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57471.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57472.1 s, 147.8 tps, lat 4379.790 ms stddev 3431.477
progress: 57473.0 s, 1314.1 tps, lat 156.884 ms stddev 535.761
progress: 57474.0 s, 1272.2 tps, lat 31.548 ms stddev 59.538
progress: 57475.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57476.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57477.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57478.0 s, 1688.6 tps, lat 268.379 ms stddev 956.537
progress: 57479.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57480.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57481.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57482.1 s, 29.0 tps, lat 3500.432 ms stddev 54.177
progress: 57483.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57484.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57485.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57486.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57487.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57488.0 s, 66.0 tps, lat 9813.646 ms stddev 19.807
progress: 57489.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57490.0 s, 31.0 tps, lat 8368.125 ms stddev 933.997
progress: 57491.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57492.0 s, 1601.0 tps, lat 226.865 ms stddev 844.952
progress: 57493.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57494.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ


ps auwx shows the following picture:
[10:44:12]postgres@postgres:~/postgresql $ ps auwx | fgrep postgres
postgres 61802470  0.4  0.0   4064  4180  pts/6 A 18:54:58 976:56 pgbench -c 100 -j 20 -P 1 -T 10 -p 5436
postgres 15271518  0.0  0.0 138276 15024      - A 10:43:34   0:06 postgres: autovacuum worker process   postgres
postgres 13305354  0.0  0.0  22944 21356      - A 20:49:04  27:51 postgres: autovacuum worker process   postgres
postgres  5245902  0.0  0.0  14072 14020      - A 18:54:59  10:24 postgres: postgres postgres [local] COMMIT
postgres 44303278  0.0  0.0  15176 14036      - A 18:54:59  10:18 postgres: postgres postgres [local] COMMIT
postgres 38601340  0.0  0.0  11564 14008      - A 18:54:59  10:16 postgres: postgres postgres [local] COMMIT
postgres 53674890  0.0  0.0  12712 14004      - A 18:54:59   8:54 postgres: postgres postgres [local] COMMIT
postgres 27591640  0.0  0.0  15040 14028      - A 18:54:59   8:38 postgres: postgres postgres [local] COMMIT
postgres 40960422  0.0  0.0  12128 13996      - A 18:54:59   8:36 postgres: postgres postgres [local] COMMIT
postgres 41288514  0.0  0.0  10544 14012      - A 18:54:59   8:30 postgres: postgres postgres [local] idle
postgres 55771564  0.0  0.0  12844 14008      - A 18:54:59   8:24 postgres: postgres postgres [local] COMMIT
postgres 21760842  0.0  0.0  13164 14008      - A 18:54:59   8:17 postgres: postgres postgres [local] COMMIT
postgres 18810974  0.0  0.0  10416 14012      - A 18:54:59   8:13 postgres: postgres postgres [local] idle in transaction
postgres 17566474  0.0  0.0  10224 14012      - A 18:54:59   8:02 postgres: postgres postgres [local] COMMIT
postgres 63963402  0.0  0.0  11300 14000      - A 18:54:59   7:48 postgres: postgres postgres [local] COMMIT
postgres  9963962  0.0  0.0  15548 14024      - A 18:54:59   7:37 postgres: postgres postgres [local] idle
postgres 10094942  0.0  0.0  12192 13996      - A 18:54:59   7:33 

Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-01 Thread Konstantin Knizhnik
On 02/01/2017 08:28 PM, Heikki Linnakangas wrote:
>
> But if there's no pressing reason to change it, let's leave it alone. It's 
> not related to the problem at hand, right?
>

Yes, I agree with you: we should better leave it as it is.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-01 Thread Konstantin Knizhnik
On 02/01/2017 08:30 PM, REIX, Tony wrote:
>
> Hi Konstantin,
>
> Please run: /opt/IBM/xlc/13.1.3/bin/xlc -qversion  so that I know your exact 
> XLC v13 version.
>
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)

> I'm building on Power7 and not giving any architecture flag to XLC.
>
> I'm not using *-qalign=natural* . Thus, by default, XLC use -qalign=power, 
> which is close to natural, as explained at:
>  
> https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.0/com.ibm.xlc131.aix.doc/compiler_ref/opt_align.html
> Why are you using this flag ?
>

Because otherwise double type is aligned on 4 bytes.

> Thanks for info about *pgbench*. PostgreSQL web-site contains a lot of old 
> information...
>
> If you could share scripts or instructions about the tests you are doing with 
> pgbench, I would reproduce here.
>

You do not need any script.
Just two simple commands.
One to initialize the database:

pgbench -i -s 1000

And another to run the benchmark itself:

pgbench -c 100 -j 20 -P 1 -T 10


> I have no "real" application. My job consists in porting OpenSource packages 
> on AIX. Many packages. Erlang, Go, these days. I just want to make PostgreSQL 
> RPMs as good as possible... within the limited amount of time I can give to 
> this package, before
> moving to another one.
>
> About the *zombie* issue, I've discussed with my colleagues. It looks like the 
> process stays a zombie until the parent looks at its status. However, though I 
> did that several times, I do not remember the details well. And that should 
> not be specific to AIX.
> I'll discuss it tomorrow with another colleague, who should understand this 
> better than me.
>

1. The process is not in zombie state (according to ps). It is in <exiting> state... 
It is something AIX specific, I have not seen processes in this state on Linux.
2. I have implemented a simple test - forkbomb. It creates 1000 children and then 
waits for them. It is about ten times slower than on Intel/Linux, but still much 
faster than 100 seconds. So there is some difference between a postgres backend 
and a dummy process
doing nothing - just immediately terminating after return from fork()
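
The forkbomb test itself was not posted. A minimal sketch of such a test, assuming plain POSIX fork()/wait() and a hypothetical function name fork_and_reap, might look like this:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical reconstruction of the test described above: fork n children
 * that exit immediately, then wait for all of them.  Returns the number of
 * children that were reaped.
 */
static int
fork_and_reap(int n)
{
	int			started = 0;
	int			reaped = 0;

	assert(n > 0);

	for (int i = 0; i < n; i++)
	{
		pid_t		pid = fork();

		if (pid < 0)
			break;				/* stop forking; still reap what was started */
		if (pid == 0)
			_exit(0);			/* child: terminate right after fork() returns */
		started++;
	}

	/* Until the parent calls wait(), each exited child remains unreaped. */
	while (reaped < started && wait(NULL) > 0)
		reaped++;

	return reaped;
}
```

Calling fork_and_reap(1000) from main() and timing the run would give a figure comparable to the one quoted above.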
>
> *Patch for Large Files*: When building PostgreSQL, I found it necessary to use 
> the following patch so that PostgreSQL works with large files. I do not 
> remember the details. Do you agree with such a patch ? The 1st version (new-...) 
> shows the exact places where
>   #define _LARGE_FILES 1  is required.  The 2nd version (new2-...) is simpler.
>
> I'm now experimenting with your patch for dead lock. However, that should be 
> invisible with the  "check-world" tests I guess.
>
> Regards,
>
> Tony
>
>
> On 01/02/2017 at 16:59, Konstantin Knizhnik wrote:
>> Hi Tony,
>>
>> On 01.02.2017 18:42, REIX, Tony wrote:
>>>
>>> Hi Konstantin
>>>
>>> *XLC.*
>>>
>>> I'm on AIX 7.1 for now.
>>>
>>> I'm using this version of *XLC v13*:
>>>
>>> # xlc -qversion
>>> IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
>>> Version: 13.01.0003.0003
>>>
>>> With this version, I have (at least, since I tested with "check" and not 
>>> "check-world" at that time) 2 failing tests: create_aggregate , aggregates .
>>>
>>>
>>> With the following *XLC v12* version, I have NO test failure:
>>>
>>> # /usr/vac/bin/xlc -qversion
>>> IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
>>> Version: 12.01..0016
>>>
>>>
>>> So maybe you are not using XLC v13.1.3.3, rather another sub-version. 
>>> Unless you are using more options for the configure ?
>>>
>>>
>>> *Configure*.
>>>
>>> What are the options that you give to the configure ?
>>>
>>>
>> export CC="/opt/IBM/xlc/13.1.3/bin/xlc"
>> export CFLAGS="-qarch=pwr8 -qtune=pwr8 -O2 -qalign=natural -q64 "
>> export LDFLAGS="-Wl,-bbigtoc,-b64"
>> export AR="/usr/bin/ar -X64"
>> export LD="/usr/bin/ld -b64 "
>> export NM="/usr/bin/nm -X64"
>> ./configure --prefix="/opt/postgresql/xlc-debug/9.6"
>>
>>
>>> *Hard load & 64 cores ?* OK. That clearly explains why I do not see this 
>>> issue.
>>>
>>>
>>> *pgbench ?* I wanted to run it. However, I'm still looking where to get it 
>>> plus a guide for using it for testing.
>>>
>>
>> pgbench is part of the Postgres distribution (src/bin/pgbench)
>>
>>
>>> I would add such tests when building my PostgreSQL RPMs on AIX. So any help 
>>> is welcome !
>>>
>>>
>>> *Performance*.
>>>
>>> - Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. 
>>> Any idea how I should proceed ? Any PostgreSQL performance benchmark that I 
>>> could find and use ? pgbench ?
>>>
>> pgbench is the most widely used tool for simulating an OLTP workload. Certainly it is 
>> quite primitive and its results are rather artificial. TPC-C seems to be 
>> a better choice.
>> But the best approach is to implement your own benchmark simulating the actual 
>> workload of your real application.
>>
>>> - I'm interested in any information for improving the performance & quality 
>>> of my PostgreSQL RPMs on AIX. (As I already said, BullFreeware RPMs for AIX 
>>> are free and can be 

Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-01 Thread REIX, Tony
Hi Konstantin,

Please run: /opt/IBM/xlc/13.1.3/bin/xlc -qversion  so that I know your exact 
XLC v13 version.

I'm building on Power7 and not giving any architecture flag to XLC.

I'm not using -qalign=natural . Thus, by default, XLC use -qalign=power, which 
is close to natural, as explained at:
 
https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.0/com.ibm.xlc131.aix.doc/compiler_ref/opt_align.html
Why are you using this flag ?

Thanks for info about pgbench. PostgreSQL web-site contains a lot of old 
information...

If you could share scripts or instructions about the tests you are doing with 
pgbench, I would reproduce here.
I have no "real" application. My job consists in porting OpenSource packages on 
AIX. Many packages. Erlang, Go, these days. I just want to make PostgreSQL RPMs 
as good as possible... within the limited amount of time I can give to this 
package, before moving to another one.

About the zombie issue, I've discussed with my colleagues. It looks like the 
process stays a zombie until the parent looks at its status. However, though I did 
that several times, I do not remember the details well. And that should not be 
specific to AIX. I'll discuss it tomorrow with another colleague, who should 
understand this better than me.

Patch for Large Files: When building PostgreSQL, I found it necessary to use the 
following patch so that PostgreSQL works with large files. I do not remember 
the details. Do you agree with such a patch ? The 1st version (new-...) shows the 
exact places where   #define _LARGE_FILES 1  is required.  The 2nd version 
(new2-...) is simpler.
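
As a rough illustration of what the define is for: on AIX, defining _LARGE_FILES before the first system header is included switches off_t and the file APIs to their 64-bit variants in 32-bit builds. The helper names below are hypothetical and not part of the patch. They only report what a given build ended up with.

```c
/* Must come before the first system header to take effect on AIX. */
#ifdef _AIX
#define _LARGE_FILES 1
#endif

#include <assert.h>
#include <sys/types.h>

/* Width of off_t in bits for this build (hypothetical helper). */
static int
off_t_bits(void)
{
	return (int) (sizeof(off_t) * 8);
}

/* True if a file offset of nbytes survives a round trip through off_t. */
static int
offset_fits_off_t(long long nbytes)
{
	assert(nbytes >= 0);
	return (long long) (off_t) nbytes == nbytes;
}
```

In a build where off_t_bits() reports 64, offsets beyond 2GB are representable. In a 64-bit build (-q64) off_t is already 64 bits wide without the define.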

I'm now experimenting with your patch for dead lock. However, that should be 
invisible with the  "check-world" tests I guess.

Regards,

Tony

Le 01/02/2017 à 16:59, Konstantin Knizhnik a écrit :
Hi Tony,

On 01.02.2017 18:42, REIX, Tony wrote:

Hi Konstantin

XLC.

I'm on AIX 7.1 for now.

I'm using this version of XLC v13:

# xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0003

With this version, I have (at least, since I tested with "check" and not 
"check-world" at that time) 2 failing tests: create_aggregate , aggregates .


With the following XLC v12 version, I have NO test failure:

# /usr/vac/bin/xlc -qversion
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01..0016


So maybe you are not using XLC v13.1.3.3, rather another sub-version. Unless 
you are using more options for the configure ?


Configure.

What are the options that you give to the configure ?


export CC="/opt/IBM/xlc/13.1.3/bin/xlc"
export CFLAGS="-qarch=pwr8 -qtune=pwr8 -O2 -qalign=natural -q64 "
export LDFLAGS="-Wl,-bbigtoc,-b64"
export AR="/usr/bin/ar -X64"
export LD="/usr/bin/ld -b64 "
export NM="/usr/bin/nm -X64"
./configure --prefix="/opt/postgresql/xlc-debug/9.6"



Hard load & 64 cores ? OK. That clearly explains why I do not see this issue.


pgbench ? I wanted to run it. However, I'm still looking where to get it plus a 
guide for using it for testing.

pgbench is part of Postgres distributive (src/bin/pgbench)



I would add such tests when building my PostgreSQL RPMs on AIX. So any help is 
welcome !


Performance.

- Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. Any 
idea how I should proceed ? Any PostgreSQL performance benchmark that I could 
find and use ? pgbench ?

pgbench is the most widely used tool for simulating an OLTP workload. Admittedly 
it is quite primitive and its results are rather artificial. TPC-C seems to be 
a better choice.
But the best option is to implement your own benchmark simulating the actual 
workload of your real application.


- I'm interested in any information for improving the performance & quality of 
my PostgreSQL RPMs on AIX. (As I already said, BullFreeware RPMs for AIX are 
free and can be used by anyone, like Perzl RPMs are. My company (ATOS/Bull) 
has sold IBM Power machines under the Escala brand for ages (25 years this 
year)).


How to help ?

How could I help for improving the quality and performance of PostgreSQL on AIX 
?

We still have one open issue on AIX: see 
https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg303094.html
It would be great if you could somehow help to fix this problem.




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

--- src/include/postgres.h.ORIGIN	2017-02-01 07:32:04 -0600
+++ src/include/postgres.h	2017-02-01 07:32:29 -0600
@@ -44,6 +44,10 @@
 #ifndef POSTGRES_H
 #define POSTGRES_H
 
+#ifdef _AIX
+#define _LARGE_FILES 1
+#endif
+
 #include "c.h"
 #include "utils/elog.h"
 #include "utils/palloc.h"
--- src/pl/plpython/plpy_cursorobject.c.ORIGIN	2017-02-01 02:59:08 -0600
+++ src/pl/plpython/plpy_cursorobject.c	2017-02-01 03:00:20 -0600
@@ -4,6 +4,10 @@
  * src/pl/plpython/plpy_cursorobject.c
  */
 
+#ifdef _AIX
+#define _LARGE_FILES 1
+#endif
+
 #include "postgres.h"
 
 #include 
--- src/pl/plpython/plpy_elog.c.ORIGIN	2017-02-01 02:59:08 -0600
+++ 

Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-01 Thread Heikki Linnakangas

On 02/01/2017 04:12 PM, Konstantin Knizhnik wrote:

On 01.02.2017 15:39, Heikki Linnakangas wrote:

On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote:

Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:

   * __fetch_and_add() emits a leading "sync" and trailing "isync",
thereby
   * providing sequential consistency.  This is undocumented.

But it is not true any more (I checked generated assembler code in
debugger).
This is why I have added __sync() to this function. Now pgbench working
normally.


Seems like it was not so much undocumented, but an implementation
detail that was not guaranteed after all..

Does __fetch_and_add emit a trailing isync there either? Seems odd if
__compare_and_swap requires it, but __fetch_and_add does not. Unless
we can find conclusive documentation on that, I think we should assume
that an __isync() is required, too.
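
A minimal sketch of the conservative approach argued for above: bracket the fetch-and-add with explicit barriers instead of relying on whatever the intrinsic happens to emit. The names demo_atomic_uint32 and demo_fetch_add_u32 are hypothetical, and the non-xlc branch uses the GCC/Clang __atomic builtin only as an assumed stand-in so that the sketch compiles on other compilers; it is not what PostgreSQL does there.

```c
#include <stdint.h>

/* Hypothetical stand-in for pg_atomic_uint32; not the PostgreSQL definition. */
typedef struct demo_atomic_uint32
{
	volatile uint32_t value;
} demo_atomic_uint32;

/*
 * Fetch-and-add that does not depend on barriers implied by the intrinsic.
 * Returns the value that was stored before the addition.
 */
static inline uint32_t
demo_fetch_add_u32(demo_atomic_uint32 *ptr, int32_t add_)
{
#if defined(__IBMC__) || defined(__IBMCPP__)
	uint32_t	ret;

	__sync();		/* full barrier before the update */
	ret = (uint32_t) __fetch_and_add((volatile int *) &ptr->value, add_);
	__isync();		/* keep later loads from being performed early */
	return ret;
#else
	/* Assumption: sequentially consistent GCC/Clang builtin as a stand-in. */
	return __atomic_fetch_add(&ptr->value, (uint32_t) add_, __ATOMIC_SEQ_CST);
#endif
}
```

Whether the trailing __isync() is strictly needed is exactly the open question in this thread; the sketch simply takes the cautious side.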

There was a long thread on these things the last time this was
changed:
https://www.postgresql.org/message-id/20160425185204.jrvlghn3jxulsb7i%40alap3.anarazel.de.
I couldn't find an explanation there of why we thought that
fetch_and_add implicitly performs sync and isync.


Also there is mysterious disappearance of assembler section function
with sync instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using __sync() built-in function instead.


__sync() seems more appropriate there, anyway. We're using intrinsics
for all the other things in generic-xlc.h. But it sure is scary that
the "asm" sections just disappeared.

In arch-ppc.h, shouldn't we have #ifdef __IBMC__ guards for the
__sync() and __lwsync() intrinsics? Those are an xlc compiler-specific
thing, right? Or if they are expected to work on any ppc compiler,
then we should probably use them always, instead of the asm sections.

In summary, I came up with the attached. It's essentially your patch,
with tweaks for the above-mentioned things. I don't have a powerpc
system to test on, so there are probably some silly typos there.


Why do you prefer to use _check_lock instead of __check_lock_mp?
The first one is not even mentioned in the XLC compiler manual:
http://www-01.ibm.com/support/docview.wss?uid=swg27046906=7
or
http://scv.bu.edu/computation/bluegene/IBMdocs/compiler/xlc-8.0/html/compiler/ref/bif_sync.htm


Googling around, it seems that they do more or less the same thing. I 
would guess that they actually produce the same assembly code, but I 
have no machine to test on. If I understand correctly, the difference is 
that __check_lock_mp() is an xlc compiler intrinsic, while _check_lock() 
is a libc function. The libc function presumably does __check_lock_mp() 
or __check_lock_up() depending on whether the system is a multi- or 
uni-processor system.


So I think if we're going to change this, the use of __check_lock_mp() 
needs to be in an #ifdef block to check that you're on the XLC compiler, 
as it's a *compiler* intrinsic, while the current code that uses 
_check_lock() are in an "#ifdef _AIX" block, which is correct for 
_check_lock() because it's defined in libc, not by the compiler.
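
A rough sketch of the guard structure described above, assuming the intrinsics are selected only when the compiler is xlc and the libc functions are used on AIX otherwise. DEMO_TAS, DEMO_UNLOCK and demo_slock_t are hypothetical names, and the last branch (GCC/Clang __sync builtins) is an assumption added only so the sketch builds on other platforms.

```c
typedef int demo_slock_t;

#if defined(_AIX) && (defined(__IBMC__) || defined(__IBMCPP__))
/* xlc on AIX: compiler intrinsics, no libc call involved */
#define DEMO_TAS(lock)		__check_lock_mp((demo_slock_t *) (lock), 0, 1)
#define DEMO_UNLOCK(lock)	__clear_lock_mp((demo_slock_t *) (lock), 0)
#elif defined(_AIX)
/* any other compiler on AIX: libc functions */
#include <sys/atomic_op.h>
#define DEMO_TAS(lock)		_check_lock((demo_slock_t *) (lock), 0, 1)
#define DEMO_UNLOCK(lock)	_clear_lock((demo_slock_t *) (lock), 0)
#else
/* Assumption: GCC/Clang builtins as a stand-in on non-AIX systems */
#define DEMO_TAS(lock)		__sync_lock_test_and_set((lock), 1)
#define DEMO_UNLOCK(lock)	__sync_lock_release((lock))
#endif
```

In every branch DEMO_TAS() is meant to yield zero when the lock was free and has now been taken, and nonzero when it was already held, which is the convention TAS() follows in s_lock.h.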


But if there's no pressing reason to change it, let's leave it alone. 
It's not related to the problem at hand, right?


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-01 Thread REIX, Tony
Hi Konstantin

XLC.

I'm on AIX 7.1 for now.

I'm using this version of XLC v13:

# xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0003

With this version, I have (at least, since I tested with "check" and not 
"check-world" at that time) 2 failing tests: create_aggregate , aggregates .


With the following XLC v12 version, I have NO test failure:

# /usr/vac/bin/xlc -qversion
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01..0016


So maybe you are not using XLC v13.1.3.3, rather another sub-version. Unless 
you are using more options for the configure ?


Configure.

What are the options that you give to the configure ?


Hard load & 64 cores ? OK. That clearly explains why I do not see this issue.


pgbench ? I wanted to run it. However, I'm still looking where to get it plus a 
guide for using it for testing. I would add such tests when building my 
PostgreSQL RPMs on AIX. So any help is welcome !


Performance.

- Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. Any 
idea how I should proceed ? Any PostgreSQL performance benchmark that I could 
find and use ? pgbench ?

- I'm interested in any information for improving the performance & quality of 
my PostgreSQL RPMs on AIX. (As I already said, BullFreeware RPMs for AIX are 
free and can be used by anyone, like Perzl RPMs are. My company (ATOS/Bull) 
has sold IBM Power machines under the Escala brand for ages (25 years this 
year)).


How to help ?

How could I help for improving the quality and performance of PostgreSQL on AIX 
?
I may have access to very big machines for even more deeply testing of 
PostgreSQL. I just need to know how to run tests.


Thanks!

Regards,

Tony


On 01/02/2017 at 14:48, Konstantin Knizhnik wrote:
Hi,

We are using version 13.1.3 of XLC. All tests pass.
Please note that this is a synchronization bug which can be reproduced only under 
heavy load.
Our server has 64 cores, and it is necessary to run pgbench with 100 connections 
for several minutes to reproduce the problem.
So maybe you just didn't notice it ;)



On 01.02.2017 16:29, REIX, Tony wrote:

Hi,

I'm now working on the port of PostgreSQL on AIX.
(RPMs can be found, as free OpenSource work, at 
 http://bullfreeware.com/ .
 http://bullfreeware.com/search.php?package=postgresql )

I was not aware of any issue with XLC v12 on AIX for atomic operations.
(XLC v13 generates at least 2 tests failures)

For now, with version 9.6.1, all tests "check-world", plus numeric_big test, 
are OK, in both 32 & 64bit versions.

Am I missing something ?

I configure the build of PostgreSQL with (in 64bits):

 ./configure
--prefix=/opt/freeware
--libdir=/opt/freeware/lib64
--mandir=/opt/freeware/man
--with-perl
--with-tcl
--with-tclconfig=/opt/freeware/lib
--with-python
--with-ldap
--with-openssl
--with-libxml
--with-libxslt
--enable-nls
--enable-thread-safety
--sysconfdir=/etc/sysconfig/postgresql

Am I missing some option for more optimization on AIX ?

Thanks

Regards,

Tony

On 01/02/2017 at 12:07, Konstantin Knizhnik wrote:
Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:

  * __fetch_and_add() emits a leading "sync" and trailing "isync",
thereby
  * providing sequential consistency.  This is undocumented.

But it is not true any more (I checked generated assembler code in
debugger).
This is why I have added __sync() to this function. Now pgbench working
normally.

Also there is mysterious disappearance of assembler section function
with sync instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using __sync() built-in function instead.


Thanks to everybody who helped me to locate and fix this problem.

--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


ATOS WARNING !
This message contains attachments that could potentially harm your computer.
Please make sure you open ONLY attachments from senders you know, trust and is 
in an e-mail that you are expecting.


Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-01 Thread Konstantin Knizhnik



On 01.02.2017 15:39, Heikki Linnakangas wrote:


In summary, I came up with the attached. It's essentially your patch, 
with tweaks for the above-mentioned things. I don't have a powerpc 
system to test on, so there are probably some silly typos there.




Attached please find a fixed version of your patch.
I verified that it applies cleanly, builds, and that postgres works 
normally with it.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/include/port/atomics/arch-ppc.h b/src/include/port/atomics/arch-ppc.h
index ed1cd9d1b9..7cf8c8ef97 100644
--- a/src/include/port/atomics/arch-ppc.h
+++ b/src/include/port/atomics/arch-ppc.h
@@ -23,4 +23,11 @@
 #define pg_memory_barrier_impl()	__asm__ __volatile__ ("sync" : : : "memory")
 #define pg_read_barrier_impl()		__asm__ __volatile__ ("lwsync" : : : "memory")
 #define pg_write_barrier_impl()		__asm__ __volatile__ ("lwsync" : : : "memory")
+
+#elif defined(__IBMC__) || defined(__IBMCPP__)
+
+#define pg_memory_barrier_impl()	__sync()
+#define pg_read_barrier_impl()		__lwsync()
+#define pg_write_barrier_impl()		__lwsync()
+
 #endif
diff --git a/src/include/port/atomics/generic-xlc.h b/src/include/port/atomics/generic-xlc.h
index f854612d39..e1dd3310a5 100644
--- a/src/include/port/atomics/generic-xlc.h
+++ b/src/include/port/atomics/generic-xlc.h
@@ -48,7 +48,7 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
 	 * consistency only, do not use it here.  GCC atomics observe the same
 	 * restriction; see its rs6000_pre_atomic_barrier().
 	 */
-	__asm__ __volatile__ ("	sync \n" ::: "memory");
+	__sync();
 
 	/*
 	 * XXX: __compare_and_swap is defined to take signed parameters, but that
@@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
 static inline uint32
 pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
 {
+	uint32		ret;
+
 	/*
-	 * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
-	 * providing sequential consistency.  This is undocumented.
+	 * Use __sync() before and __isync() after, like in compare-exchange
+	 * above.
 	 */
-	return __fetch_and_add((volatile int *)&ptr->value, add_);
+	__sync();
+
+	ret = __fetch_and_add((volatile int *)&ptr->value, add_);
+
+	__isync();
+
+	return ret;
 }
 
 #ifdef PG_HAVE_ATOMIC_U64_SUPPORT
@@ -89,7 +97,7 @@ pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
 {
 	bool		ret;
 
-	__asm__ __volatile__ ("	sync \n" ::: "memory");
+	__sync();
 
 	ret = __compare_and_swaplp((volatile long*)&ptr->value,
 			   (long *)expected, (long)newval);
@@ -103,7 +111,15 @@ pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
 static inline uint64
 pg_atomic_fetch_add_u64_impl(volatile pg_atomic_uint64 *ptr, int64 add_)
 {
-	return __fetch_and_addlp((volatile long *)&ptr->value, add_);
+	uint64		ret;
+
+	__sync();
+
+	ret = __fetch_and_addlp((volatile long *)&ptr->value, add_);
+
+	__isync();
+
+	return ret;
 }
 
 #endif /* PG_HAVE_ATOMIC_U64_SUPPORT */
diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 7aad2de..c6ef114 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -832,9 +831,8 @@ typedef unsigned int slock_t;
 #include 
 
 typedef int slock_t;
-
-#define TAS(lock)			_check_lock((slock_t *) (lock), 0, 1)
-#define S_UNLOCK(lock)		_clear_lock((slock_t *) (lock), 0)
+#define TAS(lock)			__check_lock_mp((slock_t *) (lock), 0, 1)
+#define S_UNLOCK(lock)		__clear_lock_mp((slock_t *) (lock), 0)
 #endif	 /* _AIX */



Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-01 Thread Konstantin Knizhnik

Hi Tony,

On 01.02.2017 18:42, REIX, Tony wrote:


Hi Konstantin

*XLC.*

I'm on AIX 7.1 for now.

I'm using this version of *XLC v13*:

# xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0003

With this version, I have (at least, since I tested with "check" and 
not "check-world" at that time) 2 failing tests: create_aggregate , 
aggregates .



With the following *XLC v12* version, I have NO test failure:

# /usr/vac/bin/xlc -qversion
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01..0016


So maybe you are not using XLC v13.1.3.3, rather another sub-version. 
Unless you are using more options for the configure ?



*Configure*.

What are the options that you give to the configure ?



export CC="/opt/IBM/xlc/13.1.3/bin/xlc"
export CFLAGS="-qarch=pwr8 -qtune=pwr8 -O2 -qalign=natural -q64 "
export LDFLAGS="-Wl,-bbigtoc,-b64"
export AR="/usr/bin/ar -X64"
export LD="/usr/bin/ld -b64 "
export NM="/usr/bin/nm -X64"
./configure --prefix="/opt/postgresql/xlc-debug/9.6"


*Hard load & 64 cores ?* OK. That clearly explains why I do not see 
this issue.



*pgbench ?* I wanted to run it. However, I'm still looking where to 
get it plus a guide for using it for testing.




pgbench is part of the Postgres distribution (src/bin/pgbench).


I would add such tests when building my PostgreSQL RPMs on AIX. So any 
help is welcome !



*Performance*.

- Also, I'd like to compare PostgreSQL performance on AIX vs 
Linux/PPC64. Any idea how I should proceed ? Any PostgreSQL 
performance benchmark that I could find and use ? pgbench ?


pgbench is the most widely used tool for simulating an OLTP workload. Admittedly 
it is quite primitive and its results are rather artificial. TPC-C seems to 
be a better choice.
But the best option is to implement your own benchmark simulating the actual 
workload of your real application.


- I'm interested in any information for improving the performance & 
quality of my PostgreSQL RPMs on AIX. (As I already said, BullFreeware 
RPMs for AIX are free and can be used by anyone, like Perzl RPMs 
are. My company (ATOS/Bull) has sold IBM Power machines under the 
Escala brand for ages (25 years this year)).



*How to help ?*

How could I help for improving the quality and performance of 
PostgreSQL on AIX ?




We still have one open issue on AIX: see 
https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg303094.html

It would be great if you could somehow help to fix this problem.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-01 Thread REIX, Tony
Hi,

I'm now working on the port of PostgreSQL on AIX.
(RPMs can be found, as free OpenSource work, at http://bullfreeware.com/ .
 http://bullfreeware.com/search.php?package=postgresql )

I was not aware of any issue with XLC v12 on AIX for atomic operations.
(XLC v13 generates at least 2 tests failures)

For now, with version 9.6.1, all tests "check-world", plus numeric_big test, 
are OK, in both 32 & 64bit versions.

Am I missing something ?

I configure the build of PostgreSQL with (in 64bits):

 ./configure
--prefix=/opt/freeware
--libdir=/opt/freeware/lib64
--mandir=/opt/freeware/man
--with-perl
--with-tcl
--with-tclconfig=/opt/freeware/lib
--with-python
--with-ldap
--with-openssl
--with-libxml
--with-libxslt
--enable-nls
--enable-thread-safety
--sysconfdir=/etc/sysconfig/postgresql

Am I missing some option for more optimization on AIX ?

Thanks

Regards,

Tony

On 01/02/2017 at 12:07, Konstantin Knizhnik wrote:
Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:

  * __fetch_and_add() emits a leading "sync" and trailing "isync",
thereby
  * providing sequential consistency.  This is undocumented.

But it is not true any more (I checked generated assembler code in
debugger).
This is why I have added __sync() to this function. Now pgbench working
normally.

Also there is mysterious disappearance of assembler section function
with sync instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using __sync() built-in function instead.


Thanks to everybody who helped me to locate and fix this problem.

--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-01 Thread Konstantin Knizhnik

Hi,

We are using version 13.1.3 of XLC. All tests pass.
Please note that this is a synchronization bug which can be reproduced 
only under heavy load.
Our server has 64 cores, and it is necessary to run pgbench with 100 
connections for several minutes to reproduce the problem.

So maybe you just didn't notice it ;)



On 01.02.2017 16:29, REIX, Tony wrote:


Hi,

I'm now working on the port of PostgreSQL on AIX.
(RPMs can be found, as free OpenSource work, at 
http://bullfreeware.com/ .

http://bullfreeware.com/search.php?package=postgresql)

I was not aware of any issue with XLC v12 on AIX for atomic operations.
(XLC v13 generates at least 2 tests failures)

For now, with version 9.6.1, all tests "check-world", plus the 
numeric_big test, are OK, in both 32 & 64bit versions.


Am I missing something ?

I configure the build of PostgreSQL with (in 64bits):

 ./configure
--prefix=/opt/freeware
--libdir=/opt/freeware/lib64
--mandir=/opt/freeware/man
--with-perl
--with-tcl
--with-tclconfig=/opt/freeware/lib
--with-python
--with-ldap
--with-openssl
--with-libxml
--with-libxslt
--enable-nls
--enable-thread-safety
--sysconfdir=/etc/sysconfig/postgresql

Am I missing some option for more optimization on AIX ?

Thanks

Regards,

Tony


On 01/02/2017 at 12:07, Konstantin Knizhnik wrote:

Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:

  * __fetch_and_add() emits a leading "sync" and trailing "isync",
thereby
  * providing sequential consistency.  This is undocumented.

But it is not true any more (I checked generated assembler code in
debugger).
This is why I have added __sync() to this function. Now pgbench working
normally.

Also there is mysterious disappearance of assembler section function
with sync instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using __sync() built-in function instead.


Thanks to everybody who helped me to locate and fix this problem.

--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-01 Thread Konstantin Knizhnik

On 01.02.2017 15:39, Heikki Linnakangas wrote:

On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote:

Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:

   * __fetch_and_add() emits a leading "sync" and trailing "isync",
thereby
   * providing sequential consistency.  This is undocumented.

But it is not true any more (I checked generated assembler code in
debugger).
This is why I have added __sync() to this function. Now pgbench working
normally.


Seems like it was not so much undocumented, but an implementation 
detail that was not guaranteed after all..


Does __fetch_and_add emit a trailing isync there either? Seems odd if 
__compare_and_swap requires it, but __fetch_and_add does not. Unless 
we can find conclusive documentation on that, I think we should assume 
that an __isync() is required, too.


There was a long thread on these things the last time this was 
changed: 
https://www.postgresql.org/message-id/20160425185204.jrvlghn3jxulsb7i%40alap3.anarazel.de. 
I couldn't find an explanation there of why we thought that 
fetch_and_add implicitly performs sync and isync.



Also there is mysterious disappearance of assembler section function
with sync instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using __sync() built-in function instead.


__sync() seems more appropriate there, anyway. We're using intrinsics 
for all the other things in generic-xlc.h. But it sure is scary that 
the "asm" sections just disappeared.


In arch-ppc.h, shouldn't we have #ifdef __IBMC__ guards for the 
__sync() and __lwsync() intrinsics? Those are an xlc compiler-specific 
thing, right? Or if they are expected to work on any ppc compiler, 
then we should probably use them always, instead of the asm sections.


In summary, I came up with the attached. It's essentially your patch, 
with tweaks for the above-mentioned things. I don't have a powerpc 
system to test on, so there are probably some silly typos there.


Why do you prefer to use _check_lock instead of __check_lock_mp?
The first one is not even mentioned in the XLC compiler manual:
http://www-01.ibm.com/support/docview.wss?uid=swg27046906=7
or
http://scv.bu.edu/computation/bluegene/IBMdocs/compiler/xlc-8.0/html/compiler/ref/bif_sync.htm



- Heikki





--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-01 Thread Heikki Linnakangas

On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote:

Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:

   * __fetch_and_add() emits a leading "sync" and trailing "isync",
thereby
   * providing sequential consistency.  This is undocumented.

But it is not true any more (I checked generated assembler code in
debugger).
This is why I have added __sync() to this function. Now pgbench working
normally.


Seems like it was not so much undocumented, but an implementation detail 
that was not guaranteed after all..


Does __fetch_and_add emit a trailing isync there either? Seems odd if 
__compare_and_swap requires it, but __fetch_and_add does not. Unless we 
can find conclusive documentation on that, I think we should assume that 
an __isync() is required, too.


There was a long thread on these things the last time this was changed: 
https://www.postgresql.org/message-id/20160425185204.jrvlghn3jxulsb7i%40alap3.anarazel.de. 
I couldn't find an explanation there of why we thought that 
fetch_and_add implicitly performs sync and isync.



Also there is mysterious disappearance of assembler section function
with sync instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using __sync() built-in function instead.


__sync() seems more appropriate there, anyway. We're using intrinsics 
for all the other things in generic-xlc.h. But it sure is scary that the 
"asm" sections just disappeared.


In arch-ppc.h, shouldn't we have #ifdef __IBMC__ guards for the __sync() 
and __lwsync() intrinsics? Those are an xlc compiler-specific thing, 
right? Or if they are expected to work on any ppc compiler, then we 
should probably use them always, instead of the asm sections.
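
To make the guard question concrete, here is a small sketch of barrier macros selected per compiler. The demo_* names are hypothetical, this is not the layout of arch-ppc.h, and the final branch (the GCC/Clang __atomic_thread_fence builtin) is an assumption included only so the sketch compiles on non-PowerPC machines.

```c
#if defined(__IBMC__) || defined(__IBMCPP__)
/* xlc: use the compiler intrinsics */
#define demo_memory_barrier()	__sync()
#define demo_read_barrier()		__lwsync()
#define demo_write_barrier()	__lwsync()
#elif defined(__GNUC__) && (defined(__powerpc__) || defined(__ppc__) || defined(__PPC__))
/* GCC-compatible compiler on PowerPC: inline assembly with a memory clobber */
#define demo_memory_barrier()	__asm__ __volatile__ ("sync" : : : "memory")
#define demo_read_barrier()		__asm__ __volatile__ ("lwsync" : : : "memory")
#define demo_write_barrier()	__asm__ __volatile__ ("lwsync" : : : "memory")
#else
/* Assumption: a full fence everywhere else, stronger than strictly needed */
#define demo_memory_barrier()	__atomic_thread_fence(__ATOMIC_SEQ_CST)
#define demo_read_barrier()		__atomic_thread_fence(__ATOMIC_SEQ_CST)
#define demo_write_barrier()	__atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

/* Store the payload, then set the flag, with a write barrier in between. */
static inline void
demo_publish(int *data, volatile int *ready, int value)
{
	*data = value;
	demo_write_barrier();
	*ready = 1;
}
```

A reader of the flag would pair this with demo_read_barrier() between loading *ready and loading *data.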


In summary, I came up with the attached. It's essentially your patch, 
with tweaks for the above-mentioned things. I don't have a powerpc 
system to test on, so there are probably some silly typos there.


- Heikki

diff --git a/src/include/port/atomics/arch-ppc.h b/src/include/port/atomics/arch-ppc.h
index ed1cd9d1b9..7cf8c8ef97 100644
--- a/src/include/port/atomics/arch-ppc.h
+++ b/src/include/port/atomics/arch-ppc.h
@@ -23,4 +23,11 @@
 #define pg_memory_barrier_impl()	__asm__ __volatile__ ("sync" : : : "memory")
 #define pg_read_barrier_impl()		__asm__ __volatile__ ("lwsync" : : : "memory")
 #define pg_write_barrier_impl()		__asm__ __volatile__ ("lwsync" : : : "memory")
+
+#if defined(__IBMC__) || defined(__IBMCPP__)
+
+#define pg_memory_barrier_impl()	__sync()
+#define pg_read_barrier_impl()		__lwsync()
+#define pg_write_barrier_impl()		__lwsync()
+
 #endif
diff --git a/src/include/port/atomics/generic-xlc.h b/src/include/port/atomics/generic-xlc.h
index f854612d39..e1dd3310a5 100644
--- a/src/include/port/atomics/generic-xlc.h
+++ b/src/include/port/atomics/generic-xlc.h
@@ -48,7 +48,7 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
 	 * consistency only, do not use it here.  GCC atomics observe the same
 	 * restriction; see its rs6000_pre_atomic_barrier().
 	 */
-	__asm__ __volatile__ ("	sync \n" ::: "memory");
+	__sync();
 
 	/*
 	 * XXX: __compare_and_swap is defined to take signed parameters, but that
@@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
 static inline uint32
 pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
 {
+	uint32		ret;
+
 	/*
-	 * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
-	 * providing sequential consistency.  This is undocumented.
+	 * Use __sync() before and __isync() after, like in compare-exchange
+	 * above.
 	 */
-	return __fetch_and_add((volatile int *)&ptr->value, add_);
+	__sync();
+
+	ret = __fetch_and_add((volatile int *)&ptr->value, add_);
+
+	__isync();
+
+	return ret;
 }
 
 #ifdef PG_HAVE_ATOMIC_U64_SUPPORT
@@ -89,7 +97,7 @@ pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
 {
 	bool		ret;
 
-	__asm__ __volatile__ ("	sync \n" ::: "memory");
+	__sync();
 
 	ret = __compare_and_swaplp((volatile long*)&ptr->value,
 							   (long *)expected, (long)newval);
@@ -103,7 +111,15 @@ pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
 static inline uint64
 pg_atomic_fetch_add_u64_impl(volatile pg_atomic_uint64 *ptr, int64 add_)
 {
-	return __fetch_and_addlp((volatile long *)&ptr->value, add_);
+	uint64		ret;
+
+	__sync();
+
+	ret = __fetch_and_addlp((volatile long *)&ptr->value, add_);
+
+	__isync();
+
+	return ret;
 }
 
 #endif /* PG_HAVE_ATOMIC_U64_SUPPORT */


Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-01 Thread Heikki Linnakangas

On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote:

Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:

   * __fetch_and_add() emits a leading "sync" and trailing "isync",
thereby
   * providing sequential consistency.  This is undocumented.

But it is not true any more (I checked generated assembler code in
debugger).
This is why I have added __sync() to this function. Now pgbench working
normally.


Seems like it was not so much undocumented, but an implementation detail 
that was not guaranteed after all..


Does __fetch_and_add emit a trailing isync there either? Seems odd if 
__compare_and_swap requires it, but __fetch_and_add does not. Unless we 
can find conclusive documentation on that, I think we should assume that 
an __isync() is required, too.



Also there is mysterious disappearance of assembler section function
with sync instruction from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using __sync() built-in function instead.


__sync() seems more appropriate there, anyway. We're using intrinsics 
for all the other things in generic-xlc.h. But it sure is scary that the 
"asm" sections just disappeared.


In arch-ppc.h, shouldn't we have #ifdef __IBMC__ guards for the __sync() 
and __lwsync() intrinsics? Those are an xlc compiler-specific thing, 
right? Or if they are expected to work on any ppc compiler, then we 
should probably use them always, instead of the asm sections.


In summary, I came up with the attached. It's essentially your patch, 
with tweaks for the above-mentioned things. I don't have a powerpc 
system to test on, so there are probably some silly typos there.


- Heikki

diff --git a/src/include/port/atomics/arch-ppc.h b/src/include/port/atomics/arch-ppc.h
index ed1cd9d1b9..7cf8c8ef97 100644
--- a/src/include/port/atomics/arch-ppc.h
+++ b/src/include/port/atomics/arch-ppc.h
@@ -23,4 +23,11 @@
 #define pg_memory_barrier_impl()	__asm__ __volatile__ ("sync" : : : "memory")
 #define pg_read_barrier_impl()		__asm__ __volatile__ ("lwsync" : : : "memory")
 #define pg_write_barrier_impl()		__asm__ __volatile__ ("lwsync" : : : "memory")
+
+#if defined(__IBMC__) || defined(__IBMCPP__)
+
+#define pg_memory_barrier_impl()__sync()
+#define pg_read_barrier_impl()  __lwsync()
+#define pg_write_barrier_impl() __lwsync()
+
 #endif
diff --git a/src/include/port/atomics/generic-xlc.h b/src/include/port/atomics/generic-xlc.h
index f854612d39..e1dd3310a5 100644
--- a/src/include/port/atomics/generic-xlc.h
+++ b/src/include/port/atomics/generic-xlc.h
@@ -48,7 +48,7 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
 	 * consistency only, do not use it here.  GCC atomics observe the same
 	 * restriction; see its rs6000_pre_atomic_barrier().
 	 */
-	__asm__ __volatile__ ("	sync \n" ::: "memory");
+	__sync();
 
 	/*
 	 * XXX: __compare_and_swap is defined to take signed parameters, but that
@@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
 static inline uint32
 pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
 {
+	uint32		ret;
+
 	/*
-	 * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
-	 * providing sequential consistency.  This is undocumented.
+	 * Use __sync() before and __isync() after, like in compare-exchange
+	 * above.
 	 */
-	return __fetch_and_add((volatile int *)&ptr->value, add_);
+	__sync();
+
+	ret = __fetch_and_add((volatile int *)&ptr->value, add_);
+
+	__isync();
+
+	return ret;
 }
 
 #ifdef PG_HAVE_ATOMIC_U64_SUPPORT
@@ -89,7 +97,7 @@ pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
 {
 	bool		ret;
 
-	__asm__ __volatile__ ("	sync \n" ::: "memory");
+	__sync();
 
 	ret = __compare_and_swaplp((volatile long*)&ptr->value,
 							   (long *)expected, (long)newval);
@@ -103,7 +111,15 @@ pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
 static inline uint64
 pg_atomic_fetch_add_u64_impl(volatile pg_atomic_uint64 *ptr, int64 add_)
 {
-	return __fetch_and_addlp((volatile long *)&ptr->value, add_);
+	uint64		ret;
+
+	__sync();
+
+	ret = __fetch_and_addlp((volatile long *)&ptr->value, add_);
+
+	__isync();
+
+	return ret;
 }
 
 #endif /* PG_HAVE_ATOMIC_U64_SUPPORT */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-01 Thread Konstantin Knizhnik

Attached please find my patch for XLC/AIX.
The most critical fix is adding __sync() to pg_atomic_fetch_add_u32_impl.
The comment in this file says that:

  * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
  * providing sequential consistency.  This is undocumented.

But this is no longer true (I checked the generated assembler code in the
debugger).
This is why I have added __sync() to this function. Now pgbench works
normally.

Also, the assembler section with the sync instruction mysteriously
disappears from pg_atomic_compare_exchange_u32_impl.
I have fixed it by using the __sync() built-in function instead.

Thanks to everybody who helped me to locate and fix this problem.

--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/include/port/atomics/arch-ppc.h b/src/include/port/atomics/arch-ppc.h
index 2b54c42..5828f7e 100644
--- a/src/include/port/atomics/arch-ppc.h
+++ b/src/include/port/atomics/arch-ppc.h
@@ -23,4 +23,10 @@
 #define pg_memory_barrier_impl()	__asm__ __volatile__ ("sync" : : : "memory")
 #define pg_read_barrier_impl()		__asm__ __volatile__ ("lwsync" : : : "memory")
 #define pg_write_barrier_impl()		__asm__ __volatile__ ("lwsync" : : : "memory")
+
+#else
+#define pg_memory_barrier_impl()__sync()
+#define pg_read_barrier_impl()  __lwsync()
+#define pg_write_barrier_impl() __lwsync()
+
 #endif
diff --git a/src/include/port/atomics/generic-xlc.h b/src/include/port/atomics/generic-xlc.h
index f4fd2f3..531d17c 100644
--- a/src/include/port/atomics/generic-xlc.h
+++ b/src/include/port/atomics/generic-xlc.h
@@ -36,7 +36,8 @@ typedef struct pg_atomic_uint64
 #endif /* __64BIT__ */
 
 #define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U32
 static inline bool
 pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
 	uint32 *expected, uint32 newval)
 {
@@ -48,14 +49,14 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
 	 * consistency only, do not use it here.  GCC atomics observe the same
 	 * restriction; see its rs6000_pre_atomic_barrier().
 	 */
-	__asm__ __volatile__ ("	sync \n" ::: "memory");
+	__sync();
 
 	/*
 	 * XXX: __compare_and_swap is defined to take signed parameters, but that
 	 * shouldn't matter since we don't perform any arithmetic operations.
 	 */
 	ret = __compare_and_swap((volatile int*)&ptr->value,
@@ -77,6 +78,7 @@ pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
 	 * __fetch_and_add() emits a leading "sync" and trailing "isync", thereby
 	 * providing sequential consistency.  This is undocumented.
 	 */
+	__sync();
 	return __fetch_and_add((volatile int *)&ptr->value, add_);
 }
 
@@ -89,10 +91,10 @@ pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
 {
 	bool		ret;
 
-	__asm__ __volatile__ ("	sync \n" ::: "memory");
+	__sync();
 
 	ret = __compare_and_swaplp((volatile long*)&ptr->value,
 							   (long *)expected, (long)newval);
 
 	__isync();
 
@@ -103,7 +105,8 @@ pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
 static inline uint64
 pg_atomic_fetch_add_u64_impl(volatile pg_atomic_uint64 *ptr, int64 add_)
 {
-	return __fetch_and_addlp((volatile long *)&ptr->value, add_);
+	__sync();
+	return __fetch_and_addlp((volatile long *)&ptr->value, add_);
 }
 
 #endif /* PG_HAVE_ATOMIC_U64_SUPPORT */
diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 7aad2de..c6ef114 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -832,9 +831,8 @@ typedef unsigned int slock_t;
 #include <sys/atomic_op.h>
 
 typedef int slock_t;
-
-#define TAS(lock)			_check_lock((slock_t *) (lock), 0, 1)
-#define S_UNLOCK(lock)		_clear_lock((slock_t *) (lock), 0)
+#define TAS(lock)			__check_lock_mp((slock_t *) (lock), 0, 1)
+#define S_UNLOCK(lock)		__clear_lock_mp((slock_t *) (lock), 0)
 #endif	 /* _AIX */
 
 



Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-01 Thread Heikki Linnakangas
Oh, you were one step ahead of me, I didn't understand it on first read 
of your email. Need more coffee..


On 01/31/2017 05:03 PM, Konstantin Knizhnik wrote:

I inspected the code of pg_atomic_compare_exchange_u32_impl and found no
sync in the prologue:

(dbx) listi pg_atomic_compare_exchange_u32_impl

> [no sync instruction]


and if I compile this function standalone, I get the following assembler
code:

.pg_atomic_compare_exchange_u32_impl:   # 0x (H.4.NO_SYMBOL)
 stdu   SP,-128(SP)
 std    r3,176(SP)
 std    r4,184(SP)
 std    r5,192(SP)
 ld     r0,192(SP)
 stw    r0,192(SP)
 sync
 ld     r4,176(SP)
 ld     r3,184(SP)
 lwz    r0,192(SP)
 extsw  r0,r0
 lwa    r5,0(r3)

> ...


sync is here!


Ok, so, the 'sync' instruction gets lost somehow. That "standalone" 
assembly version looks slightly different in other ways too; perhaps you 
used different optimization levels, or it looks different when it's 
inlined into the caller. Not sure which version of the function gdb 
would show, when it's a "static inline" function. Would be good to check 
the disassembly of LWLockAttemptLock(), to see if the 'sync' is there.


Certainly seems like a compiler bug, though.

- Heikki





Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-02-01 Thread Heikki Linnakangas

On 01/31/2017 05:03 PM, Konstantin Knizhnik wrote:

One more assertion failure:


ExceptionalCondition(conditionName = "!(OldPageRqstPtr <=
XLogCtl->InitializedUpTo)", errorType = "FailedAssertion", fileName =
"xlog.c", lineNumber = 1887), line 54 in "assert.c"

(dbx) p OldPageRqstPtr
153551667200
(dbx) p XLogCtl->InitializedUpTo
153551667200
(dbx) p InitializedUpTo
153551659008

I slightly modified the xlog.c code to store the value of
XLogCtl->InitializedUpTo in a local variable:


  1870 LWLockAcquire(WALBufMappingLock, LW_EXCLUSIVE);
  1871
  1872 /*
  1873  * Now that we have the lock, check if someone initialized the page
  1874  * already.
  1875  */
  1876 while (upto >= XLogCtl->InitializedUpTo || opportunistic)
  1877 {
  1878     XLogRecPtr InitializedUpTo = XLogCtl->InitializedUpTo;
  1879     nextidx = XLogRecPtrToBufIdx(InitializedUpTo);
  1880
  1881     /*
  1882      * Get ending-offset of the buffer page we need to replace (this may
  1883      * be zero if the buffer hasn't been used yet).  Fall through if it's
  1884      * already written out.
  1885      */
  1886     OldPageRqstPtr = XLogCtl->xlblocks[nextidx];
  1887     Assert(OldPageRqstPtr <= XLogCtl->InitializedUpTo);


And, as you can see,  XLogCtl->InitializedUpTo is not equal to saved
value InitializedUpTo.
But we are under exclusive WALBufMappingLock and InitializedUpTo is
updated only under this lock.
So it means that LW-locks doesn't work!


Yeah, so it seems. XLogCtl->InitializedUpTo is quite clearly protected by 
the WALBufMappingLock. All references to it (after StartupXLOG) happen 
while holding the lock.


Can you get the assembly output of the AdvanceXLInsertBuffer() function? 
I wonder if the compiler is rearranging things so that 
XLogCtl->InitializedUpTo is fetched before the LWLockAcquire call. Or 
should there be a memory barrier instruction somewhere in LWLockAcquire?
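
To make concrete what kind of barrier a lock acquisition needs: the operation that takes the lock must have at least acquire semantics, so that loads of the protected data (here, XLogCtl->InitializedUpTo) cannot be satisfied before the lock is actually held, and the release must have release semantics. A minimal sketch with C11 atomics; demo_lock and its functions are hypothetical and far simpler than LWLockAcquire:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical test-and-set lock, for illustration only; not an LWLock. */
typedef struct demo_lock
{
	atomic_flag	held;
} demo_lock;

#define DEMO_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once to take the lock; returns true if we now hold it. */
static inline bool
demo_try_lock(demo_lock *lock)
{
	/*
	 * test_and_set returns the previous state.  The acquire ordering keeps
	 * later loads and stores from being moved ahead of a successful
	 * acquisition.
	 */
	return !atomic_flag_test_and_set_explicit(&lock->held,
											  memory_order_acquire);
}

/* Spin until the lock is taken. */
static inline void
demo_lock_acquire(demo_lock *lock)
{
	while (!demo_try_lock(lock))
		;						/* busy-wait */
}

/* The release ordering publishes all earlier stores to the next owner. */
static inline void
demo_lock_release(demo_lock *lock)
{
	atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

If either ordering is missing in the generated code, a reader can observe a stale value of a variable that is only ever changed under the lock, which is the kind of symptom reported above.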


- Heikki





Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-01-31 Thread Konstantin Knizhnik

One more assertion failure:


ExceptionalCondition(conditionName = "!(OldPageRqstPtr <= 
XLogCtl->InitializedUpTo)", errorType = "FailedAssertion", fileName = 
"xlog.c", lineNumber = 1887), line 54 in "assert.c"


(dbx) p OldPageRqstPtr
153551667200
(dbx) p XLogCtl->InitializedUpTo
153551667200
(dbx) p InitializedUpTo
153551659008

I slightly modified the xlog.c code to store the value of
XLogCtl->InitializedUpTo in a local variable:


 1870 LWLockAcquire(WALBufMappingLock, LW_EXCLUSIVE);
 1871
 1872 /*
 1873  * Now that we have the lock, check if someone initialized the page
 1874  * already.
 1875  */
 1876 while (upto >= XLogCtl->InitializedUpTo || opportunistic)
 1877 {
 1878     XLogRecPtr InitializedUpTo = XLogCtl->InitializedUpTo;
 1879     nextidx = XLogRecPtrToBufIdx(InitializedUpTo);
 1880
 1881     /*
 1882      * Get ending-offset of the buffer page we need to replace (this may
 1883      * be zero if the buffer hasn't been used yet).  Fall through if it's
 1884      * already written out.
 1885      */
 1886     OldPageRqstPtr = XLogCtl->xlblocks[nextidx];
 1887     Assert(OldPageRqstPtr <= XLogCtl->InitializedUpTo);


And, as you can see, XLogCtl->InitializedUpTo is not equal to the saved 
value InitializedUpTo.
But we are holding WALBufMappingLock exclusively, and InitializedUpTo is 
updated only under this lock.

So it means that LWLocks don't work!
I inspected the code of pg_atomic_compare_exchange_u32_impl and found no 
sync in the prologue:


(dbx) listi pg_atomic_compare_exchange_u32_impl
0x1000817bc (pg_atomic_compare_exchange_u32_impl+0x1c) e88100b0       ld   r4,0xb0(r1)
0x1000817c0 (pg_atomic_compare_exchange_u32_impl+0x20) e86100b8       ld   r3,0xb8(r1)
0x1000817c4 (pg_atomic_compare_exchange_u32_impl+0x24) 800100c0      lwz   r0,0xc0(r1)
0x1000817c8 (pg_atomic_compare_exchange_u32_impl+0x28) 7c0007b4    extsw   r0,r0
0x1000817cc (pg_atomic_compare_exchange_u32_impl+0x2c) e8a30002      lwa   r5,0x0(r3)
0x1000817d0 (pg_atomic_compare_exchange_u32_impl+0x30) 7cc02028    lwarx   r6,r0,r4,0x0
0x1000817d4 (pg_atomic_compare_exchange_u32_impl+0x34) 7c053040     cmpl   cr0,0x0,r5,r6
0x1000817d8 (pg_atomic_compare_exchange_u32_impl+0x38) 4082000c      bne   0x1000817e4 (pg_atomic_compare_exchange_u32_impl+0x44)
0x1000817dc (pg_atomic_compare_exchange_u32_impl+0x3c) 7c00212d   stwcx.   r0,r0,r4
0x1000817e0 (pg_atomic_compare_exchange_u32_impl+0x40) 40e2fff0     bne+   0x1000817d0 (pg_atomic_compare_exchange_u32_impl+0x30)
0x1000817e4 (pg_atomic_compare_exchange_u32_impl+0x44) 60c0      ori   r0,r6,0x0
0x1000817e8 (pg_atomic_compare_exchange_u32_impl+0x48) 9003      stw   r0,0x0(r3)
0x1000817ec (pg_atomic_compare_exchange_u32_impl+0x4c) 7c26     mfcr   r0
0x1000817f0 (pg_atomic_compare_exchange_u32_impl+0x50) 54001ffe   rlwinm   r0,r0,0x3,0x1f,0x1f
0x1000817f4 (pg_atomic_compare_exchange_u32_impl+0x54) 78000620   rldicl   r0,r0,0x0,0x19
0x1000817f8 (pg_atomic_compare_exchange_u32_impl+0x58) 98010070      stb   r0,0x70(r1)
0x1000817fc (pg_atomic_compare_exchange_u32_impl+0x5c) 4c00012c    isync
0x100081800 (pg_atomic_compare_exchange_u32_impl+0x60) 88610070      lbz   r3,0x70(r1)
0x100081804 (pg_atomic_compare_exchange_u32_impl+0x64) 4804        b   0x100081808 (pg_atomic_compare_exchange_u32_impl+0x68)
0x100081808 (pg_atomic_compare_exchange_u32_impl+0x68) 38210080     addi   r1,0x80(r1)
0x10008180c (pg_atomic_compare_exchange_u32_impl+0x6c) 4e800020      blr



Source code of pg_atomic_compare_exchange_u32_impl is the following:

static inline bool
pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
									uint32 *expected, uint32 newval)
{
	bool		ret;

	/*
	 * atomics.h specifies sequential consistency ("full barrier semantics")
	 * for this interface.  Since "lwsync" provides acquire/release
	 * consistency only, do not use it here.  GCC atomics observe the same
	 * restriction; see its rs6000_pre_atomic_barrier().
	 */
	__asm__ __volatile__ ("	sync \n" ::: "memory");

	/*
	 * XXX: __compare_and_swap is defined to take signed parameters, but that
	 * shouldn't matter since we don't perform any arithmetic operations.
	 */
	ret = __compare_and_swap((volatile int*)&ptr->value,
							 (int *)expected, (int)newval);

	/*
	 * xlc's documentation tells us:
	 * "If __compare_and_swap is used as a locking primitive, insert a call to
	 * the __isync built-in function at the start of any critical sections."
	 *
	 * The critical section begins immediately after __compare_and_swap().
	 */
	__isync();

	return ret;
}

and if I compile this function 

Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-01-31 Thread Konstantin Knizhnik


On 30.01.2017 19:21, Heikki Linnakangas wrote:

On 01/24/2017 04:47 PM, Konstantin Knizhnik wrote:
Interesting.. What should happen here is that for the backend's own 
insertion slot, the "insertingat" value should be greater than the 
requested flush point ('upto' variable). That's because before 
GetXLogBuffer() calls AdvanceXLInsertBuffer(), it updates the 
backend's insertingat value, to the position that it wants to insert 
to. And AdvanceXLInsertBuffer() only calls 
WaitXLogInsertionsToFinish() with value smaller than what was passed 
as the 'upto' argument.



The comment to WaitXLogInsertionsToFinish says:

  * Note: When you are about to write out WAL, you must call this 
function
  * *before* acquiring WALWriteLock, to avoid deadlocks. This 
function might

  * need to wait for an insertion to finish (or at least advance to next
  * uninitialized page), and the inserter might need to evict an old WAL
buffer
  * to make room for a new one, which in turn requires WALWriteLock.

Which contradicts to the observed stack trace.


Not AFAICS. In the stack trace you showed, the backend is not holding 
WALWriteLock. It would only acquire it after the 
WaitXLogInsertionsToFinish() call finished.





Hmm, maybe I missed something.
I am not talking about the WALBufMappingLock, which is acquired after 
returning from WaitXLogInsertionsToFinish, but about the lock obtained by 
WALInsertLockAcquire at line 946 in XLogInsertRecord.
It is released at line 1021 by WALInsertLockRelease(), but 
CopyXLogRecordToWAL is invoked while this lock is still held.




This line in the stack trace is suspicious:


WaitXLogInsertionsToFinish(upto = 102067874328), line 1583 in "xlog.c"


AdvanceXLInsertBuffer() should only ever call 
WaitXLogInsertionsToFinish() with an xlog position that points to a 
page boundary, but that upto value points to the middle of a page.


Perhaps the value stored in the stack trace is not what the caller 
passed, but it was updated because it was past the 'reserveUpto' 
value? That would explain the "request to flush past end
of generated WAL" notices you saw in the log. Now, why would that 
happen, I have no idea.


If you can and want to provide me access to the system, I could have a 
look myself. I'd also like to see if the attached additional 
Assertions will fire.


I do indeed get this assertion failure:

ExceptionalCondition(conditionName = "!(OldPageRqstPtr <= upto || 
opportunistic)", errorType = "FailedAssertion", fileName = "xlog.c", 
lineNumber = 1917), line 54 in "assert.c"

(dbx) up
unnamed block in AdvanceXLInsertBuffer(upto = 147439056632, 
opportunistic = '\0'), line 1917 in "xlog.c"

(dbx) p OldPageRqstPtr
147439058944
(dbx) p upto
147439056632
(dbx) p opportunistic
'\0'

Also, in another run, I encountered yet another assertion failure:

ExceptionalCondition(conditionName = "!((((NewPageBeginPtr) / 8192) % 
(XLogCtl->XLogCacheBlck + 1)) == nextidx)", errorType = 
"FailedAssertion", fileName = "xlog.c", lineNumber = 1950), line 54 in 
"assert.c"


nextidx equals 1456, while the expected value is 1457.
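
The index checked by this assertion is a pure function of the page address, so an index that is off by exactly one is consistent with a page address that is off by exactly one 8192-byte page. A minimal sketch of the expression shown in the assertion text; demo_buf_idx and its cache_blck parameter (standing in for XLogCtl->XLogCacheBlck) are hypothetical names:

```c
#include <stdint.h>

#define DEMO_XLOG_BLCKSZ 8192	/* block size as it appears in the assertion */

/*
 * Hypothetical stand-in for the expression in the failed assertion:
 * map a WAL position to the index of the buffer slot that holds its page.
 * cache_blck is the highest valid slot index (number of slots minus one).
 */
static inline uint64_t
demo_buf_idx(uint64_t recptr, uint64_t cache_blck)
{
	return (recptr / DEMO_XLOG_BLCKSZ) % (cache_blck + 1);
}
```

Every position inside one page maps to the same slot, and the next page maps to the next slot (wrapping around at the end of the buffer array), whatever the number of slots is.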

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-01-30 Thread Heikki Linnakangas

On 01/24/2017 04:47 PM, Konstantin Knizhnik wrote:

As I already mentioned, we built Postgres with LOCK_DEBUG , so we can
inspect lock owner. Backend is waiting for itself!
Now please look at two frames in this stack trace marked with red.
XLogInsertRecord is setting WALInsert locks at the beginning of the
function:

 if (isLogSwitch)
 WALInsertLockAcquireExclusive();
 else
 WALInsertLockAcquire();

WALInsertLockAcquire just selects random item from WALInsertLocks array
and exclusively locks:

 if (lockToTry == -1)
     lockToTry = MyProc->pgprocno % NUM_XLOGINSERT_LOCKS;
 MyLockNo = lockToTry;
 immed = LWLockAcquire(&WALInsertLocks[MyLockNo].l.lock, LW_EXCLUSIVE);

Then, following the stack trace, AdvanceXLInsertBuffer calls
WaitXLogInsertionsToFinish:

 /*
  * Now that we have an up-to-date LogwrtResult value, see if we
  * still need to write it or if someone else already did.
  */
 if (LogwrtResult.Write < OldPageRqstPtr)
 {
     /*
      * Must acquire write lock. Release WALBufMappingLock first,
      * to make sure that all insertions that we need to wait for
      * can finish (up to this same position). Otherwise we risk
      * deadlock.
      */
     LWLockRelease(WALBufMappingLock);

     WaitXLogInsertionsToFinish(OldPageRqstPtr);

     LWLockAcquire(WALWriteLock, LW_EXCLUSIVE);


It releases WALBufMappingLock but not WAL insert locks!
Finally in WaitXLogInsertionsToFinish tries to wait for all locks:

 for (i = 0; i < NUM_XLOGINSERT_LOCKS; i++)
 {
     XLogRecPtr insertingat = InvalidXLogRecPtr;

     do
     {
         /*
          * See if this insertion is in progress. LWLockWait will wait for
          * the lock to be released, or for the 'value' to be set by a
          * LWLockUpdateVar call.  When a lock is initially acquired, its
          * value is 0 (InvalidXLogRecPtr), which means that we don't know
          * where it's inserting yet.  We will have to wait for it.  If
          * it's a small insertion, the record will most likely fit on the
          * same page and the inserter will release the lock without ever
          * calling LWLockUpdateVar.  But if it has to sleep, it will
          * advertise the insertion point with LWLockUpdateVar before
          * sleeping.
          */
         if (LWLockWaitForVar(&WALInsertLocks[i].l.lock,
                              &WALInsertLocks[i].l.insertingAt,
                              insertingat, &insertingat))

And here we get stuck!


Interesting.. What should happen here is that for the backend's own 
insertion slot, the "insertingat" value should be greater than the 
requested flush point ('upto' variable). That's because before 
GetXLogBuffer() calls AdvanceXLInsertBuffer(), it updates the backend's 
insertingat value, to the position that it wants to insert to. And 
AdvanceXLInsertBuffer() only calls WaitXLogInsertionsToFinish() with 
value smaller than what was passed as the 'upto' argument.



The comment to WaitXLogInsertionsToFinish says:

  * Note: When you are about to write out WAL, you must call this function
  * *before* acquiring WALWriteLock, to avoid deadlocks. This function might
  * need to wait for an insertion to finish (or at least advance to next
  * uninitialized page), and the inserter might need to evict an old WAL
buffer
  * to make room for a new one, which in turn requires WALWriteLock.

Which contradicts to the observed stack trace.


Not AFAICS. In the stack trace you showed, the backend is not holding 
WALWriteLock. It would only acquire it after the 
WaitXLogInsertionsToFinish() call finished.



I wonder if it is really synchronization bug in xlog.c or there is
something wrong in this stack trace and it can not happen in case of
normal work?


Yeah, hard to tell. Something is clearly wrong..

This line in the stack trace is suspicious:


WaitXLogInsertionsToFinish(upto = 102067874328), line 1583 in "xlog.c"


AdvanceXLInsertBuffer() should only ever call 
WaitXLogInsertionsToFinish() with an xlog position that points to a page 
boundary, but that upto value points to the middle of a page.
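
Whether a position sits on a page boundary can be checked with modular arithmetic. A minimal sketch, assuming the 8192-byte WAL block size that appears in the assertion text elsewhere in this thread; the helper names are hypothetical and not taken from xlog.c:

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_XLOG_BLCKSZ 8192u	/* assumed WAL block size */

/* Byte offset of a WAL position within its page. */
static inline uint32_t
demo_offset_in_page(uint64_t lsn)
{
	return (uint32_t) (lsn % DEMO_XLOG_BLCKSZ);
}

/* True if the position is exactly at the start of a page. */
static inline bool
demo_on_page_boundary(uint64_t lsn)
{
	return demo_offset_in_page(lsn) == 0;
}
```

For the value in the stack trace, demo_offset_in_page(102067874328) is 2584, so the position is indeed in the middle of a page.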


Perhaps the value stored in the stack trace is not what the caller 
passed, but it was updated because it was past the 'reserveUpto' value? 
That would explain the "request to flush past end
of generated WAL" notices you saw in the log. Now, why would that 
happen, I have no idea.


If you can and want to provide me access to the system, I could have a 
look myself. I'd also like to see if the attached additional Assertions 
will fire.


- Heikki

diff --git a/src/backend/access/transam/xlog.c 
b/src/backend/access/transam/xlog.c
index 2f5d603066..a2ea03506a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1940,6 +1940,7 @@ AdvanceXLInsertBuffer(XLogRecPtr upto, bool 

Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-01-30 Thread Bernd Helmle
Hi Konstantin,

We had observed exactly the same issues on a customer system with the
same environment and PostgreSQL 9.5.5. Additionally, we've tested on
Linux with XL/C 12 and 13 with exactly the same deadlock behavior. 

So we assumed that this is somehow a compiler issue.

Am Dienstag, den 24.01.2017, 19:26 +0300 schrieb Konstantin Knizhnik:
> More information about the problem - Postgres log contains several
> records:
> 
> 2017-01-24 19:15:20.272 MSK [19270462] LOG:  request to flush past
> end 
> of generated WAL; request 6/AAEBE000, currpos 6/AAEBC2B0
> 
> and them correspond to the time when deadlock happen.

Yeah, the same logs here:

LOG:  request to flush past end of generated WAL; request 1/1F4C6000,
currpos 1/1F4C40E0
STATEMENT:  UPDATE pgbench_accounts SET abalance = abalance + -2653
WHERE aid = 3662494;


> There is the following comment in xlog.c concerning this message:
> 
>  /*
>   * No-one should request to flush a piece of WAL that hasn't
> even been
>   * reserved yet. However, it can happen if there is a block with
> a 
> bogus
>   * LSN on disk, for example. XLogFlush checks for that situation
> and
>   * complains, but only after the flush. Here we just assume that
> to 
> mean
>   * that all WAL that has been reserved needs to be finished. In
> this
>   * corner-case, the return value can be smaller than 'upto'
> argument.
>   */
> 
> So looks like it should not happen.
> The first thing to suspect is spinlock implementation which is
> different 
> for GCC and XLC.
> But ... if I rebuild Postgres without spinlocks, then the problem is 
> still reproduced.

Before we got the results from XLC on Linux (where Postgres shows the
same behavior) I had a look into the spinlock implementation. If I got
it right, XLC doesn't use the ppc64 specific ones, but the fallback
implementation (system monitoring on AIX also has shown massive calls
for signal(0)...). So I tried the following patch:

diff --git a/src/include/port/atomics/arch-ppc.h b/src/include/port/atomics/arch-ppc.h
new file mode 100644
index f901a0c..028cced
*** a/src/include/port/atomics/arch-ppc.h
--- b/src/include/port/atomics/arch-ppc.h
***************
*** 23,26 ****
--- 23,33 ----
  #define pg_memory_barrier_impl()	__asm__ __volatile__ ("sync" : : : "memory")
  #define pg_read_barrier_impl()		__asm__ __volatile__ ("lwsync" : : : "memory")
  #define pg_write_barrier_impl()		__asm__ __volatile__ ("lwsync" : : : "memory")
+
+ #elif defined(__IBMC__) || defined(__IBMCPP__)
+
+ #define pg_memory_barrier_impl()	__asm__ __volatile__ ("	sync \n" ::: "memory")
+ #define pg_read_barrier_impl()		__asm__ __volatile__ ("	lwsync \n" ::: "memory")
+ #define pg_write_barrier_impl()		__asm__ __volatile__ ("	lwsync \n" ::: "memory")
+
  #endif

This didn't change the picture, though.





Re: [HACKERS] Deadlock in XLogInsert at AIX

2017-01-24 Thread Konstantin Knizhnik

More information about the problem - the Postgres log contains several records:

2017-01-24 19:15:20.272 MSK [19270462] LOG:  request to flush past end of generated WAL; request 6/AAEBE000, currpos 6/AAEBC2B0

and they correspond to the time when the deadlock happens.
There is the following comment in xlog.c concerning this message:

/*
 * No-one should request to flush a piece of WAL that hasn't even been
 * reserved yet. However, it can happen if there is a block with a bogus
 * LSN on disk, for example. XLogFlush checks for that situation and
 * complains, but only after the flush. Here we just assume that to mean
 * that all WAL that has been reserved needs to be finished. In this
 * corner-case, the return value can be smaller than 'upto' argument.
 */

So it looks like this should not happen.
The first thing to suspect is the spinlock implementation, which is 
different for GCC and XLC.
But ... if I rebuild Postgres without spinlocks, the problem is 
still reproduced.
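
The two positions in the log record above are written in the usual hi/lo hexadecimal LSN notation. A minimal sketch for converting them to 64-bit byte positions, so that they can be compared with each other and with the decimal values a debugger prints; demo_parse_lsn is a hypothetical helper, not a PostgreSQL function:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse an LSN written as "hi/lo" (two hexadecimal
 * 32-bit halves) into a 64-bit byte position.
 * Returns 0 on success, -1 if the text does not have that form.
 */
static int
demo_parse_lsn(const char *text, uint64_t *lsn)
{
	unsigned int hi;
	unsigned int lo;

	if (sscanf(text, "%X/%X", &hi, &lo) != 2)
		return -1;

	*lsn = ((uint64_t) hi << 32) | lo;
	return 0;
}
```

With the values from the log, the requested position 6/AAEBE000 is 7504 bytes past currpos 6/AAEBC2B0 and lies exactly on an 8192-byte boundary.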



[HACKERS] Deadlock in XLogInsert at AIX

2017-01-24 Thread Konstantin Knizhnik

Hi Hackers,

We are running Postgres on AIX and encountered two strange problems: 
active zombie processes and a deadlock in the XLOG writer.
I will explain the first problem in a separate mail; here I am mostly 
concerned with the deadlock.
It is irregularly reproduced with standard pgbench launched with 100 
connections.


It sometimes happens with the 9.6 stable version of Postgres, but only 
when it is compiled with the xlc compiler.
We failed to reproduce the problem with GCC. So it looks like a bug in 
the compiler or in the xlc-specific atomics implementation...

But there are a few points which contradict this hypothesis:

1. The problem is reproduced with Postgres built without optimization. 
Usually compiler bugs affect only optimized code.

2. Disabling atomics doesn't help.

3. Without optimization and with LOCK_DEBUG defined, the time needed to 
reproduce the problem increases significantly. With optimized code it is 
almost always reproduced within a few minutes. With the debug version it 
usually takes much more time.

But the most confusing thing is the stack trace:

(dbx) where
semop(??, ??, ??) at 0x91f5790
PGSemaphoreLock(sema = 0x0a0044b95928), line 387 in "pg_sema.c"
unnamed block in LWLockWaitForVar(lock = 0x0a00d980, valptr = 
0x0a00d9a8, oldval = 102067874256, newval = 0x0fff9c10), 
line 1666 in "lwlock.c"
LWLockWaitForVar(lock = 0x0a00d980, valptr = 0x0a00d9a8, 
oldval = 102067874256, newval = 0x0fff9c10), line 1666 in "lwlock.c"
unnamed block in WaitXLogInsertionsToFinish(upto = 102067874328), line 
1583 in "xlog.c"

WaitXLogInsertionsToFinish(upto = 102067874328), line 1583 in "xlog.c"
AdvanceXLInsertBuffer(upto = 102067874256, opportunistic = '\0'), line 
1916 in "xlog.c"

unnamed block in GetXLogBuffer(ptr = 102067874256), line 1697 in "xlog.c"
GetXLogBuffer(ptr = 102067874256), line 1697 in "xlog.c"
CopyXLogRecordToWAL(write_len = 70, isLogSwitch = '\0', rdata = 
0x00011007ce10, StartPos = 102067874256, EndPos = 102067874328), 
line 1279 in "xlog.c"
XLogInsertRecord(rdata = 0x00011007ce10, fpw_lsn = 102067718328), 
line 1011 in "xlog.c"
unnamed block in XLogInsert(rmid = '\n', info = '@'), line 453 in 
"xloginsert.c"
XLogInsert(rmid = '\n', info = '@'), line 453 in "xloginsert.c"
log_heap_update(reln = 0x000110273540, oldbuf = 40544, newbuf = 
40544, oldtup = 0x0fffa2a0, newtup = 0x0001102bb958, 
old_key_tuple = (nil), all_visible_cleared = '\0', 
new_all_visible_cleared = '\0'), line 7708 in "heapam.c"
unnamed block in heap_update(relation = 0x000110273540, otid = 
0x0fffa6f8, newtup = 0x0001102bb958, cid = 1, crosscheck = 
(nil), wait = '^A', hufd = 0x0fffa5b0, lockmode = 
0x0fffa5c8), line 4212 in "heapam.c"
heap_update(relation = 0x000110273540, otid = 0x0fffa6f8, 
newtup = 0x0001102bb958, cid = 1, crosscheck = (nil), wait = '^A', 
hufd = 0x0fffa5b0, lockmode = 0x0fffa5c8), line 4212 in 
"heapam.c"
unnamed block in ExecUpdate(tupleid = 0x0fffa6f8, oldtuple = 
(nil), slot = 0x0001102bb308, planSlot = 0x0001102b4630, 
epqstate = 0x0001102b2cd8, estate = 0x0001102b29e0, canSetTag = 
'^A'), line 937 in "nodeModifyTable.c"
ExecUpdate(tupleid = 0x0fffa6f8, oldtuple = (nil), slot = 
0x0001102bb308, planSlot = 0x0001102b4630, epqstate = 
0x0001102b2cd8, estate = 0x0001102b29e0, canSetTag = '^A'), line 
937 in "nodeModifyTable.c"
ExecModifyTable(node = 0x0001102b2c30), line 1516 in "nodeModifyTable.c"
ExecProcNode(node = 0x0001102b2c30), line 396 in "execProcnode.c"
ExecutePlan(estate = 0x0001102b29e0, planstate = 0x0001102b2c30, 
use_parallel_mode = '\0', operation = CMD_UPDATE, sendTuples = '\0', 
numberTuples = 0, direction = ForwardScanDirection, dest = 
0x0001102b7520), line 1569 in "execMain.c"
standard_ExecutorRun(queryDesc = 0x0001102b25c0, direction = 
ForwardScanDirection, count = 0), line 338 in "execMain.c"
ExecutorRun(queryDesc = 0x0001102b25c0, direction = 
ForwardScanDirection, count = 0), line 286 in "execMain.c"
ProcessQuery(plan = 0x0001102b1510, sourceText = "UPDATE 
pgbench_tellers SET tbalance = tbalance + 4019 WHERE tid = 6409;", 
params = (nil), dest = 0x0001102b7520, completionTag = ""), line 187 
in "pquery.c"
unnamed block in PortalRunMulti(portal = 0x000110115e20, isTopLevel 
= '^A', setHoldSnapshot = '\0', dest = 0x0001102b7520, altdest = 
0x0001102b7520, completionTag = ""), line 1303 in "pquery.c"
unnamed block in PortalRunMulti(portal = 0x000110115e20, isTopLevel 
= '^A', setHoldSnapshot = '\0', dest = 0x0001102b7520, altdest = 
0x0001102b7520, completionTag = ""), line 1303 in "pquery.c"
PortalRunMulti(portal = 0x000110115e20, isTopLevel = '^A', 
setHoldSnapshot = '\0', dest = 0x0001102b7520, altdest = 
0x0001102b7520, completionTag = ""), line 1303 in "pquery.c"
unnamed block in PortalRun(portal = 0x000110115e20, count =