Re: [HACKERS] Postgres 9.6 Logical and Physical replication

2017-10-05 Thread Mario Fernando Guerrero Díaz

Thank you for the clarification.

On 5 Oct 2017, 9:27 AM, "Robert Haas" wrote:

> On Mon, Sep 18, 2017 at 5:30 PM, guedim  wrote:
> > I am working with Postgres9.6 with a Master/Slave cluster replication using
> > Streaming replication.
> > I would like to add a new Slave server database but this database with
> > logical replication .
> >
> > I tried with some configurations but it was not possible  :(
> >
> > https://github.com/guedim/postgres-streaming-replication
> >
> > Here is the image of what is in my mind:
> >  >
>
> This question is really off-topic for this list, which is probably why
> you haven't gotten any replies.  This list is for discussion of
> PostgreSQL development; there are other lists for user questions, like
> pgsql-general.  Logical replication is only supported beginning in
> PostgreSQL 10; if you are using some earlier version, you need an
> add-on tool like pglogical, slony, etc.
>
> Please also read https://wiki.postgresql.org/wiki/Guide_to_reporting_problems
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

Re: [HACKERS] Postgres 9.6 Logical and Physical replication

2017-10-05 Thread Robert Haas

On Mon, Sep 18, 2017 at 5:30 PM, guedim  wrote:
> I am working with Postgres9.6 with a Master/Slave cluster replication using
> Streaming replication.
> I would like to add a new Slave server database but this database with
> logical replication .
>
> I tried with some configurations but it was not possible  :(
>
> https://github.com/guedim/postgres-streaming-replication
>
> Here is the image of what is in my mind:
> 

This question is really off-topic for this list, which is probably why
you haven't gotten any replies.  This list is for discussion of
PostgreSQL development; there are other lists for user questions, like
pgsql-general.  Logical replication is only supported beginning in
PostgreSQL 10; if you are using some earlier version, you need an
add-on tool like pglogical, slony, etc.

Please also read https://wiki.postgresql.org/wiki/Guide_to_reporting_problems

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
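
To illustrate the version cutoff described above (built-in logical replication starts with PostgreSQL 10), here is a minimal sketch. It assumes the caller has already read the server's server_version_num setting, for example with SHOW server_version_num; the function and constant names are hypothetical and not part of PostgreSQL.

```python
# Hypothetical helper: decide whether a server can use built-in logical
# replication, based on the integer reported by server_version_num
# (for example 90605 for 9.6.5 and 100000 for 10.0).
PG10_VERSION_NUM = 100000


def has_builtin_logical_replication(server_version_num: int) -> bool:
    """Return True for PostgreSQL 10 and later, False for earlier releases."""
    return server_version_num >= PG10_VERSION_NUM


if __name__ == "__main__":
    for version_num in (90605, 100000):
        print(version_num, has_builtin_logical_replication(version_num))
```

On a 9.6 server the check is false, which is the case where an add-on such as pglogical or Slony would be needed.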

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Postgres 9.6 Logical and Physical replication

2017-09-18 Thread guedim

Hi guys

I am working with Postgres9.6 with a Master/Slave cluster replication using
Streaming replication.
I would like to add a new Slave server database but this database with
logical replication .


I tried with some configurations but it was not possible  :(

https://github.com/guedim/postgres-streaming-replication


Here is the image of what is in my mind: 
 

Thanks for any help!




Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

2017-07-18 Thread K S, Sandhya (Nokia - IN/Bangalore)

Hi Craig,

While testing for another scenario of continuous postgres server restart, we 
got many cores of sh-QUIT and along with that we got cores for rm-QUIT. It is 
pointing to rm of the archive file but we were not able to get the bt as the 
stack is corrupted.

We got below info from gdb:
Core was generated by `rm ./Archive_00020118'.

And also we were able to get this info:
4518 12490  0.0  0.0  11484  1356 ?  Ss   10:59   0:00 postgres: archiver process   archiving 00020118.0028.backup
4518 12704  2.0  0.0   7672  2932 ?  S    11:00   0:00   \_ sh -c rm ./Archive_*; touch ./Archive_"00020118.0028.backup"; exit 0
4518 12707  0.0  0.0    344     4 ?  S    11:00   0:00       \_ rm ./Archive_00020118

In the Postgres configuration file, we have this setting:
archive_command = 'rm ./Archive_*; touch ./Archive_"%f"; exit 0'

So while executing this archive command, core was generated.
You pointed out earlier that issue might be happening during archive command 
and also all evidence for this crash are pointing to this same command.
Are there any suggestions to recover from this situation or on ways to debug 
the issue ?

Regards,
Sandhya
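
As a side note on the archive_command shown above: it always ends with "exit 0", so the server is told that archiving succeeded even when a step failed, while the PostgreSQL documentation asks for a zero exit status only on success. The following is a minimal sketch of a helper that reports failure through its exit status. It is not a fix for the core dump itself, and the script name, function name and archive directory are assumptions made for illustration.

```python
import shutil
import sys
from pathlib import Path


def archive_segment(segment_path: str, archive_dir: str) -> int:
    """Copy one WAL segment into archive_dir; return 0 only on success."""
    source = Path(segment_path)
    target = Path(archive_dir) / source.name
    if target.exists():
        # Refuse to overwrite a file that is already in the archive.
        return 1
    partial = target.with_name(target.name + ".partial")
    try:
        shutil.copyfile(source, partial)
        partial.rename(target)  # publish the file only once fully written
    except OSError:
        return 1
    return 0


if __name__ == "__main__":
    # Assumed invocation (hypothetical script name and directory):
    #   archive_command = 'python3 archive_wal.py "%p" /path/to/archive'
    sys.exit(archive_segment(sys.argv[1], sys.argv[2]))
```

With a non-zero status on failure, the archiver keeps the segment and retries later instead of treating it as archived.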

From: K S, Sandhya (Nokia - IN/Bangalore)
Sent: Wednesday, July 12, 2017 4:51 PM
To: 'Craig Ringer' <cr...@2ndquadrant.com>
Cc: pgsql-bugs <pgsql-b...@postgresql.org>; PostgreSQL Hackers 
<pgsql-hackers@postgresql.org>; T, Rasna (Nokia - IN/Bangalore) 
<rasn...@nokia.com>; Itnal, Prakash (Nokia - IN/Bangalore) 
<prakash.it...@nokia.com>
Subject: RE: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

Hi Craig,

Here is bt after installing all the missing debuginfo packages.

(gdb) bt
#0  0x00fff7682f18 in do_lookup_x (undef_name=undef_name@entry=0xfff75cece5 "_Jv_RegisterClasses",
    new_hash=new_hash@entry=2681263574, old_hash=old_hash@entry=0xa159b8, ref=0xfff75ceac8,
    result=result@entry=0xa159a0, scope=, i=1, version=version@entry=0x0, flags=flags@entry=1,
    skip=skip@entry=0x0, type_class=type_class@entry=0, undef_map=undef_map@entry=0xfff76a9478)
    at dl-lookup.c:444
#1  0x00fff76839a0 in _dl_lookup_symbol_x (undef_name=0xfff75cece5 "_Jv_RegisterClasses",
    undef_map=0xfff76a9478, ref=0xa15a90, symbol_scope=0xfff76a9980, version=0x0, type_class=,
    flags=, skip_map=0x0) at dl-lookup.c:833
#2  0x00fff7685730 in elf_machine_got_rel (lazy=1, map=0xfff76a9478)
    at ../sysdeps/mips/dl-machine.h:870
#3  elf_machine_runtime_setup (profile=, lazy=1, l=0xfff76a9478)
    at ../sysdeps/mips/dl-machine.h:916
#4  _dl_relocate_object (scope=0xfff76a9980, reloc_mode=, consider_profiling=0)
    at dl-reloc.c:259
#5  0x00fff767ba10 in dl_main (phdr=, phdr@entry=0x12040, phnum=, phnum@entry=8,
    user_entry=user_entry@entry=0xa15cf0, auxv=) at rtld.c:2070
#6  0x00fff7692e3c in _dl_sysdep_start (start_argptr=, dl_main=0xfff7679a98 )
    at ../elf/dl-sysdep.c:249
#7  0x00fff767d0d8 in _dl_start_final (arg=arg@entry=0xa16410, info=info@entry=0xa15d80)
    at rtld.c:307
#8  0x00fff767d3d8 in _dl_start (arg=0xa16410) at rtld.c:415
#9  0x00fff7679380 in __start () from /lib64/ld.so.1

Please see if this could help in analysing the issue.

Regards,
Sandhya

From: Craig Ringer [mailto:cr...@2ndquadrant.com]
Sent: Friday, July 07, 2017 1:53 PM
To: K S, Sandhya (Nokia - IN/Bangalore) <sandhya@nokia.com>
Cc: pgsql-bugs <pgsql-b...@postgresql.org>; PostgreSQL Hackers 
<pgsql-hackers@postgresql.org>; T, Rasna (Nokia - IN/Bangalore) 
<rasn...@nokia.com>; Itnal, Prakash (Nokia - IN/Bangalore) 
<prakash.it...@nokia.com>
Subject: Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

On 7 July 2017 at 15:41, K S, Sandhya (Nokia - IN/Bangalore) 
<sandhya@nokia.com> wrote:
Hi Craig,

The scenario is lock and unlock of the system for 30 times. During this 
scenario 5 sh-QUIT core is generated. GDB of 5 core is pointing to different 
locations.
I have attached output for 2 such instance.


You seem to be missing debug symbols. Install appropriate debuginfo packages.


--
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

2017-07-12 Thread K S, Sandhya (Nokia - IN/Bangalore)

Hi Craig,

Here is bt after installing all the missing debuginfo packages.

(gdb) bt
#0  0x00fff7682f18 in do_lookup_x (undef_name=undef_name@entry=0xfff75cece5 "_Jv_RegisterClasses",
    new_hash=new_hash@entry=2681263574, old_hash=old_hash@entry=0xa159b8, ref=0xfff75ceac8,
    result=result@entry=0xa159a0, scope=, i=1, version=version@entry=0x0, flags=flags@entry=1,
    skip=skip@entry=0x0, type_class=type_class@entry=0, undef_map=undef_map@entry=0xfff76a9478)
    at dl-lookup.c:444
#1  0x00fff76839a0 in _dl_lookup_symbol_x (undef_name=0xfff75cece5 "_Jv_RegisterClasses",
    undef_map=0xfff76a9478, ref=0xa15a90, symbol_scope=0xfff76a9980, version=0x0, type_class=,
    flags=, skip_map=0x0) at dl-lookup.c:833
#2  0x00fff7685730 in elf_machine_got_rel (lazy=1, map=0xfff76a9478)
    at ../sysdeps/mips/dl-machine.h:870
#3  elf_machine_runtime_setup (profile=, lazy=1, l=0xfff76a9478)
    at ../sysdeps/mips/dl-machine.h:916
#4  _dl_relocate_object (scope=0xfff76a9980, reloc_mode=, consider_profiling=0)
    at dl-reloc.c:259
#5  0x00fff767ba10 in dl_main (phdr=, phdr@entry=0x12040, phnum=, phnum@entry=8,
    user_entry=user_entry@entry=0xa15cf0, auxv=) at rtld.c:2070
#6  0x00fff7692e3c in _dl_sysdep_start (start_argptr=, dl_main=0xfff7679a98 )
    at ../elf/dl-sysdep.c:249
#7  0x00fff767d0d8 in _dl_start_final (arg=arg@entry=0xa16410, info=info@entry=0xa15d80)
    at rtld.c:307
#8  0x00fff767d3d8 in _dl_start (arg=0xa16410) at rtld.c:415
#9  0x00fff7679380 in __start () from /lib64/ld.so.1

Please see if this could help in analysing the issue.

Regards,
Sandhya

From: Craig Ringer [mailto:cr...@2ndquadrant.com]
Sent: Friday, July 07, 2017 1:53 PM
To: K S, Sandhya (Nokia - IN/Bangalore) <sandhya@nokia.com>
Cc: pgsql-bugs <pgsql-b...@postgresql.org>; PostgreSQL Hackers 
<pgsql-hackers@postgresql.org>; T, Rasna (Nokia - IN/Bangalore) 
<rasn...@nokia.com>; Itnal, Prakash (Nokia - IN/Bangalore) 
<prakash.it...@nokia.com>
Subject: Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

On 7 July 2017 at 15:41, K S, Sandhya (Nokia - IN/Bangalore) 
<sandhya@nokia.com> wrote:
Hi Craig,

The scenario is lock and unlock of the system for 30 times. During this 
scenario 5 sh-QUIT core is generated. GDB of 5 core is pointing to different 
locations.
I have attached output for 2 such instance.


You seem to be missing debug symbols. Install appropriate debuginfo packages.


--
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

2017-07-07 Thread K S, Sandhya (Nokia - IN/Bangalore)

Hi Craig,

The scenario is lock and unlock of the system for 30 times. During this 
scenario 5 sh-QUIT core is generated. GDB of 5 core is pointing to different 
locations.
I have attached output for 2 such instance.

Regards,
Sandhya

From: Craig Ringer [mailto:cr...@2ndquadrant.com]
Sent: Friday, July 07, 2017 12:55 PM
To: K S, Sandhya (Nokia - IN/Bangalore) <sandhya@nokia.com>
Cc: pgsql-bugs <pgsql-b...@postgresql.org>; PostgreSQL Hackers 
<pgsql-hackers@postgresql.org>; T, Rasna (Nokia - IN/Bangalore) 
<rasn...@nokia.com>; Itnal, Prakash (Nokia - IN/Bangalore) 
<prakash.it...@nokia.com>
Subject: Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

On 7 July 2017 at 15:10, K S, Sandhya (Nokia - IN/Bangalore) 
<sandhya@nokia.com> wrote:
Hi Craig,

You were right about the restore_command.

This all makes sense then.

PostgreSQL sends SIGQUIT for immediate shutdown to its children. So the 
restore_command would get signalled too.

Can't immediately explain the exit code, and SIGQUIT should _not_ generate a 
core file. Can you show the result of attaching 'gdb' to the core file and 
running 'bt full' ?

--
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
GDB of first instance of corefile.
# gdb /bin/bash CFPU-1-7919-595e59a9-sh-QUIT.core
GNU gdb (Wind River Linux G++ 4.4a-470) 7.2.50.20100908-cvs
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "mips64-wrs-linux-gnu".
For bug reporting instructions, please see:
<supp...@windriver.com>...
Reading symbols from /bin/bash...(no debugging symbols found)...done.

warning: core file may not match specified executable file.
[New LWP 7919]
Reading symbols from /lib64/libreadline.so.5...(no debugging symbols found)...done.
Loaded symbols for /lib64/libreadline.so.5
Reading symbols from /lib64/libhistory.so.5...(no debugging symbols found)...done.
Loaded symbols for /lib64/libhistory.so.5
Reading symbols from /lib64/libncurses.so.5...(no debugging symbols found)...done.
Loaded symbols for /lib64/libncurses.so.5
Reading symbols from /lib64/libdl.so.2...Reading symbols from /mnt/sysimg/usr/lib/debug/lib64/libdl-2.11.1.so.debug...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libc.so.6...Reading symbols from /mnt/sysimg/usr/lib/debug/lib64/libc-2.11.1.so.debug...done.
done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/libtinfo.so.5...(no debugging symbols found)...done.
Loaded symbols for /lib64/libtinfo.so.5
Reading symbols from /lib64/ld.so.1...Reading symbols from /mnt/sysimg/usr/lib/debug/lib64/ld-2.11.1.so.debug...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Loaded symbols for /lib64/ld.so.1
Core was generated by `sh -c exit 1'.
Program terminated with signal 3, Quit.
#0  0x005558246a80 in _dl_lookup_symbol_x () from /lib64/ld.so.1
(gdb) bt full
#0  0x005558246a80 in _dl_lookup_symbol_x () from /lib64/ld.so.1
No symbol table info available.
#1  0x00555824816c in _dl_relocate_object () from /lib64/ld.so.1
No symbol table info available.
#2  0x00555823fb6c in dl_main () from /lib64/ld.so.1
No symbol table info available.
#3  0x005558254214 in _dl_sysdep_start () from /lib64/ld.so.1
No symbol table info available.
#4  0x00555823d1b0 in _dl_start_final () from /lib64/ld.so.1
No symbol table info available.
#5  0x00555823d3f0 in _dl_start () from /lib64/ld.so.1
No symbol table info available.
#6  0x00555823cc10 in __start () from /lib64/ld.so.1
No symbol table info available.
Backtrace stopped: frame did not save the PC






GDB of second instance of corefile.
# gdb /bin/bash CFPU-1-15638-595e5efb-sh-QUIT.core   
GNU gdb (Wind River Linux G++ 4.4a-470) 7.2.50.20100908-cvs
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "mips64-wrs-linux-gnu".
For bug reporting instructions, please see:
<supp...@windriver.com>...
Reading symbols from /bin/bash...(no debugging symbols found)...done.

warning: core file may not match specified executable file.
[New LWP 15638]
Reading symbols from /lib64/libreadline.so.5...(no debugging symbols found)...done.
Loaded symbols for /lib64/libreadline.so.5
Rea

Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

2017-07-07 Thread Craig Ringer

On 7 July 2017 at 15:41, K S, Sandhya (Nokia - IN/Bangalore) <
sandhya@nokia.com> wrote:

> Hi Craig,
>
>
>
> The scenario is lock and unlock of the system for 30 times. During this
> scenario 5 sh-QUIT core is generated. GDB of 5 core is pointing to
> different locations.
>
> I have attached output for 2 such instance.
>
>

You seem to be missing debug symbols. Install appropriate debuginfo
packages.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

2017-07-07 Thread Craig Ringer

On 7 July 2017 at 15:10, K S, Sandhya (Nokia - IN/Bangalore) <
sandhya@nokia.com> wrote:

> Hi Craig,
>
>
>
> You were right about the restore_command.
>

This all makes sense then.

PostgreSQL sends SIGQUIT for immediate shutdown to its children. So the
restore_command would get signalled too.

Can't immediately explain the exit code, and SIGQUIT should _not_ generate
a core file. Can you show the result of attaching 'gdb' to the core file
and running 'bt full' ?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
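
For reference, the default action for SIGQUIT on POSIX systems is to terminate the process and, if the core size limit allows it, write a core file; a child such as sh that installs no handler of its own gets that default. The sketch below shows how such a termination looks to a parent process. It assumes a Linux-like system with a sleep binary on PATH, the helper names are hypothetical, and it is not PostgreSQL code.

```python
import resource
import signal
import subprocess


def _child_setup() -> None:
    # Runs in the child before exec: use the default SIGQUIT action and
    # forbid core files, whatever the parent process inherited.
    signal.signal(signal.SIGQUIT, signal.SIG_DFL)
    resource.setrlimit(resource.RLIMIT_CORE, (0, 0))


def status_after_sigquit() -> int:
    """Start a child, send it SIGQUIT, and return its wait status."""
    child = subprocess.Popen(["sleep", "30"], preexec_fn=_child_setup)
    child.send_signal(signal.SIGQUIT)
    # subprocess reports "killed by signal N" as the negative number -N.
    return child.wait()


if __name__ == "__main__":
    print(status_after_sigquit())
```

Here the core size limit is set to zero, so no core file is written; with a non-zero limit the same signal would leave one behind, which matches the "terminated with signal 3, Quit" cores reported in this thread.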

Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

2017-07-05 Thread Craig Ringer

On 3 Jul. 2017 23:01, "K S, Sandhya (Nokia - IN/Bangalore)" <
sandhya@nokia.com> wrote:

> Hi Craig,
>
> Thanks for the response.
>
> Scenario tried here is restart of the system multiple times. sh-QUIT core
> is generated when Postgres is invoking the shell to exit and may not be due
> to kernel or file system issues. I will try to reproduce the issue with
> dmesg output being printed.
>
> However, is there any instance in Postgres where 'sh -c exit 1' will be
> invoked?

Most likely it's used directly or indirectly by an archive_command or
restore_command you have configured.

Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

2017-07-03 Thread K S, Sandhya (Nokia - IN/Bangalore)

Hi Craig,

Thanks for the response.

Scenario tried here is restart of the system multiple times. sh-QUIT core is 
generated when Postgres is invoking the shell to exit and may not be due to 
kernel or file system issues. I will try to reproduce the issue with dmesg 
output being printed.

However, is there any instance in Postgres where 'sh -c exit 1' will be invoked?

Regards,
Sandhya

-----Original Message-----
From: Craig Ringer [mailto:cr...@2ndquadrant.com] 
Sent: Friday, June 30, 2017 5:40 PM
To: K S, Sandhya (Nokia - IN/Bangalore) <sandhya@nokia.com>
Cc: pgsql-hackers@postgresql.org; pgsql-b...@postgresql.org; T, Rasna (Nokia - 
IN/Bangalore) <rasn...@nokia.com>; Itnal, Prakash (Nokia - IN/Bangalore) 
<prakash.it...@nokia.com>
Subject: Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

On 30 June 2017 at 17:41, K S, Sandhya (Nokia - IN/Bangalore)
<sandhya@nokia.com> wrote:

> When we checked the process listing during the time of core generation, we
> found Postgres startup process is invoking “sh -c exit 1”:
> 4518  9249  0.1  0.0 155964  2036 ?Ss   15:05   0:00 postgres:
> startup process   waiting for 000102EB

Looks like an archive_command or restore_command .

If 'sh' is dumping core, you probably have issues at a low level in
the kernel, file system, etc. Check dmesg.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

2017-06-30 Thread Craig Ringer

On 30 June 2017 at 17:41, K S, Sandhya (Nokia - IN/Bangalore)
 wrote:

> When we checked the process listing during the time of core generation, we
> found Postgres startup process is invoking “sh -c exit 1”:
> 4518  9249  0.1  0.0 155964  2036 ?Ss   15:05   0:00 postgres:
> startup process   waiting for 000102EB

Looks like an archive_command or restore_command .

If 'sh' is dumping core, you probably have issues at a low level in
the kernel, file system, etc. Check dmesg.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


[HACKERS] Postgres process invoking exit resulting in sh-QUIT core

2017-06-30 Thread K S, Sandhya (Nokia - IN/Bangalore)

Hi,

We are using Postgres version 9.3.14 over linux based OS and we are observing 
sh-QUIT core files randomly when we are restarting the system(occurrence seen 
once in 30 times).
Backtrace is showing as below:

Loaded symbols for /lib64/ld.so.1
Core was generated by `sh -c exit 1'.
Program terminated with signal 3, Quit.
#0  0x005559ed78f0 in do_lookup_x () from /lib64/ld.so.1
(gdb) bt
#0  0x005559ed78f0 in do_lookup_x () from /lib64/ld.so.1
#1  0x005559ed7b88 in _dl_lookup_symbol_x () from /lib64/ld.so.1
#2  0x005559ed916c in _dl_relocate_object () from /lib64/ld.so.1
#3  0x005559ed0b6c in dl_main () from /lib64/ld.so.1
#4  0x005559ee5214 in _dl_sysdep_start () from /lib64/ld.so.1
#5  0x005559ece1b0 in _dl_start_final () from /lib64/ld.so.1
#6  0x005559ece3f0 in _dl_start () from /lib64/ld.so.1
#7  0x005559ecdc10 in __start () from /lib64/ld.so.1
Backtrace stopped: frame did not save the PC

When we checked the process listing during the time of core generation, we 
found Postgres startup process is invoking "sh -c exit 1":
4518  9249  0.1  0.0 155964  2036 ?  Ss   15:05   0:00 postgres: startup process   waiting for 000102EB
4518 10288  0.0  0.0   3600   508 ?  S    15:11   0:00  \_ sh -c exit 1

We tried disabling DB and running the same testcase which didn't result in core 
being generated.
Also we are using immediate shutdown mode which uses SIGQUIT.

Can you please help us in debugging the issue ?

Regards,
Sandhya

Re: [HACKERS] postgres 9.6.2 update breakage

2017-05-15 Thread Peter Eisentraut

On 5/15/17 02:48, Roel Janssen wrote:
> Ah yes, I see the point.  The problem here is that when new features are
> added to PostgreSQL, and you rely upon them in your database schemas,
> downgrading will most likely cause loss of information.
> 
> Maybe we need a wrapper script that also makes a dump of all of the
> data?  Now that could become a security hole.
> 
> Or the wrapper script warns about this situation, and recommends making
> a (extra) back-up of the database before upgrading.
> 
> Or.. the upgrade is something a user should do explicitly, basically
> giving up on the "just works" concept.  Guix already provides a nice way
> to get the previous version of the exact binaries used before the
> upgrade.

The best way to manage this with PostgreSQL is to make separate packages
for each PostgreSQL major version.  I see for example that you have
packages gcc-4.9, gcc-5, gcc-6, etc.  You should do the same with
PostgreSQL, e.g., postgresql-9.5, postgresql-9.6, postgresql-10.  Then
you don't have to concern yourselves with how "upgrades" and
"downgrades" should look for the users of your packaging system.  Minor
version upgrades are just installing the new package and restarting.
Major version upgrades are figured out by the user.

Downgrades between minor versions of the same major versions should
mostly work.  They are not well tested, if at all, but I don't think
that's all that different from downgrading any other package.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
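
The packaging rule above depends on telling minor upgrades from major ones. A minimal sketch of that distinction follows, assuming plain numeric version strings such as "9.6.2" or "10.1" (no beta or rc suffixes); the function names are hypothetical.

```python
def major_version(version: str) -> str:
    """Return the major version part of a PostgreSQL version string."""
    parts = version.split(".")
    # Before PostgreSQL 10 the major version is the first two components
    # (9.5, 9.6); from 10 onward it is the first component alone.
    if int(parts[0]) >= 10:
        return parts[0]
    return ".".join(parts[:2])


def is_minor_upgrade(old: str, new: str) -> bool:
    """True when both versions belong to the same major version."""
    return major_version(old) == major_version(new)


if __name__ == "__main__":
    print(is_minor_upgrade("9.6.1", "9.6.2"))  # same major version 9.6
    print(is_minor_upgrade("9.6.2", "10.0"))   # crosses a major version
```

A packaging system could use a check like this to decide between "install and restart" and leaving the upgrade to the user.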

Re: [HACKERS] postgres 9.6.2 update breakage

2017-05-15 Thread Roel Janssen

Jan Nieuwenhuizen writes:

> Roel Janssen writes:
>
>> So, it would be something like:
>> postgres pg_upgrade \
>> ...
>
> It's great to have a recipe `that works', so thanks!
>
> However, whether or not we automate this, I cannot help to wonder if
> we should support downgrading -- at least to the previous version
> in this case?
>
> If I'm not mistaken, everything else in GuixSD will run if I select a
> previous system generation in Grub...except for this?
>
> Is involving postgres developers an option, I'm sure at least one of
> the postgresql hackers[cc] are already looking at Guix[SD]?
>
> Greetings,
> janneke

Ah yes, I see the point.  The problem here is that when new features are
added to PostgreSQL, and you rely upon them in your database schemas,
downgrading will most likely cause loss of information.

Maybe we need a wrapper script that also makes a dump of all of the
data?  Now that could become a security hole.

Or the wrapper script warns about this situation, and recommends making
a (extra) back-up of the database before upgrading.

Or.. the upgrade is something a user should do explicitly, basically
giving up on the "just works" concept.  Guix already provides a nice way
to get the previous version of the exact binaries used before the
upgrade.

Kind regards,
Roel Janssen

Re: [HACKERS] postgres 9.6.2 update breakage

2017-05-14 Thread Christopher Allan Webber

Jan Nieuwenhuizen writes:

> Roel Janssen writes:
>
>> So, it would be something like:
>> postgres pg_upgrade \
>> ...
>
> It's great to have a recipe `that works', so thanks!
>
> However, whether or not we automate this, I cannot help to wonder if
> we should support downgrading -- at least to the previous version
> in this case?
>
> If I'm not mistaken, everything else in GuixSD will run if I select a
> previous system generation in Grub...except for this?
>
> Is involving postgres developers an option, I'm sure at least one of
> the postgresql hackers[cc] are already looking at Guix[SD]?
>
> Greetings,
> janneke

There's a big difference in upgrading and downgrading between guix
revisions and doing so in highly stateful databases, unfortunately.

I can't speak for postgres specifically, but here's my experience with
migrations as the tech lead of MediaGoblin:

 - upgrades should be taken with extreme caution, and you should back up
   first.
 - downgrades should be taken with ten times the amount of caution of
   upgrades, a vat of coffee to work through the problems, and a barrel
   of whiskey for when it doesn't.  I say that as someone who's mostly
   given up coffee and doesn't drink alcohol.

State changes are bad enough when unidirectional.  Django, for instance,
provides an API that does both upgrades and downgrades.  Almost
everybody spends a bunch of time carefully crafting their upgrades, and
just leaves their downgrades as the stubs that come with it.  These are
stubs that drop columns entirely, possibly columns that data was moved
to in the migration.  Reverse course, and suddenly you don't have a lot
of data you used to.

What we really want to do is provide the option to snapshot things
*before* you do an upgrade, IMO...
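
One way to read the "snapshot before upgrade" idea is a wrapper that takes a full logical dump first. The sketch below only builds the command line for pg_dumpall with its -f output option and does not run it. The directory layout, file naming and function name are assumptions, and since the dump contains all data it would need restrictive file permissions.

```python
from pathlib import Path
from typing import List


def pre_upgrade_dump_command(backup_dir: str, label: str) -> List[str]:
    """Build the argv for a full dump taken before an upgrade."""
    # Hypothetical naming scheme: one SQL file per upgrade attempt.
    target = Path(backup_dir) / f"pre-upgrade-{label}.sql"
    return ["pg_dumpall", "-f", str(target)]


if __name__ == "__main__":
    print(pre_upgrade_dump_command("/var/backups/postgresql", "9.6.1"))
```

A wrapper script could pass the returned list to a process launcher and refuse to continue with the upgrade if the dump fails.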

Re: [HACKERS] postgres 9.6.2 update breakage

2017-05-14 Thread Jan Nieuwenhuizen

Roel Janssen writes:

> So, it would be something like:
> postgres pg_upgrade \
> ...

It's great to have a recipe `that works', so thanks!

However, whether or not we automate this, I cannot help to wonder if
we should support downgrading -- at least to the previous version
in this case?

If I'm not mistaken, everything else in GuixSD will run if I select a
previous system generation in Grub...except for this?

Is involving postgres developers an option, I'm sure at least one of
the postgresql hackers[cc] are already looking at Guix[SD]?

Greetings,
janneke

-- 
Jan Nieuwenhuizen  | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar®  http://AvatarAcademy.nl  

[HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] postgres 1 (of 2) can pg 9.6 vacuum freeze skip page on index?

2016-12-05 Thread Masahiko Sawada

On Fri, Dec 2, 2016 at 3:50 AM, Robert Haas  wrote:
> On Thu, Dec 1, 2016 at 1:39 PM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> I think that the indexes only need to be scanned if the VACUUM finds
>>> dead tuples.  But even 1 dead tuple will cause a complete scan of
>>> every index.  I've complained about this before and I think there's
>>> room for improvement here, but nobody's been motivated enough to
>>> pursue this yet.
>>
>> The thing that's been speculated about in the past is having some
>> threshold larger than 1 on the minimum number of dead tuples needed
>> to cause a cleanup pass.
>
> Agreed.
>
>> It wouldn't be hard to implement, if you
>> could get consensus on what the threshold should be.
>
> Also agreed.
>
>> I'd think
>> some algorithm similar to the autovacuum thresholds might be
>> appropriate.  It's not quite clear how this would interact with
>> HOT pruning, though.
>
> What's the relevance of HOT pruning here?
>
> I was thinking that the relevant metric might be how many pages
> contain dead tuples, because what we really want to do to reduce the
> cost of future vacuuming and future index-only scans is get pages
> marked all-visible.  Say, if less than 2% of the pages in the table
> contain dead tuples and the space required to store the TIDs is less
> than 50% of maintenance_work_mem, skip the index scans.  The first of
> those thresholds, at least, would probably need to be configurable,
> but that kind of idea.

I think that this idea is better. If the number of pages containing
dead tuples is less than a threshold (e.g.
vacuum_index_cleanup_scale_factor), we can skip the index cleanup
scans.
I will write the patch and submit it to the next CF.

> The alternative that's been proposed is to do something based on the
> number of dead tuples but, as somebody pointed out in a previous
> discussion of this topic, one dead tuple per page throughout the whole
> table is a LOT worse than same number of dead tuples all on the same
> pages.  You don't want to keep scanning large chunks of the heap
> because you're too lazy to visit the indexes.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


[HACKERS] Re: Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] postgres 1 (of 2) can pg 9.6 vacuum freeze skip page on index?

2016-12-02 Thread Robert Haas

On Thu, Dec 1, 2016 at 9:54 PM, xu jian  wrote:
> Thanks everyone for your help. I am not familiar with the internals of
> vacuum freeze; I am just curious: if there is no row change on the table (in
> other words, all pages are frozen), why could an index page have dead tuples?

It can't.   If the *entire* table is frozen, then I would think it
shouldn't be scanning the indexes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


[HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] postgres 1 (of 2) can pg 9.6 vacuum freeze skip page on index?

2016-12-01 Thread xu jian

Thanks everyone for your help. I am not familiar with the internals of vacuum 
freeze; I am just curious: if there is no row change on the table (in other 
words, all pages are frozen), why could an index page have dead tuples?

Is it possible to scan the data pages first and, if all data pages are frozen, 
skip the index page scan step? Perhaps there is some other reason that vacuum 
freeze scans the index pages first; if so, is it possible to provide an option 
to skip the index page scan step in the vacuum freeze command? Thanks

James

From: Robert Haas <robertmh...@gmail.com>
Sent: December 1, 2016 13:50:49
To: Tom Lane
Cc: xu jian; Masahiko Sawada; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] postgres 1 (of 2) can pg 9.6 
vacuum freeze skip page on index?

On Thu, Dec 1, 2016 at 1:39 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> I think that the indexes only need to be scanned if the VACUUM finds
>> dead tuples.  But even 1 dead tuple will cause a complete scan of
>> every index.  I've complained about this before and I think there's
>> room for improvement here, but nobody's been motivated enough to
>> pursue this yet.
>
> The thing that's been speculated about in the past is having some
> threshold larger than 1 on the minimum number of dead tuples needed
> to cause a cleanup pass.

Agreed.

> It wouldn't be hard to implement, if you
> could get consensus on what the threshold should be.

Also agreed.

> I'd think
> some algorithm similar to the autovacuum thresholds might be
> appropriate.  It's not quite clear how this would interact with
> HOT pruning, though.

What's the relevance of HOT pruning here?

I was thinking that the relevant metric might be how many pages
contain dead tuples, because what we really want to do to reduce the
cost of future vacuuming and future index-only scans is get pages
marked all-visible.  Say, if less than 2% of the pages in the table
contain dead tuples and the space required to store the TIDs is less
than 50% of maintenance_work_mem, skip the index scans.  The first of
those thresholds, at least, would probably need to be configurable,
but that kind of idea.

The alternative that's been proposed is to do something based on the
number of dead tuples but, as somebody pointed out in a previous
discussion of this topic, one dead tuple per page throughout the whole
table is a LOT worse than the same number of dead tuples all on the same
pages.  You don't want to keep scanning large chunks of the heap
because you're too lazy to visit the indexes.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] postgres 1 of 2 can pg 9.6 vacuum freeze skip page on index?

2016-12-01 Thread Robert Haas

On Thu, Dec 1, 2016 at 1:39 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I think that the indexes only need to be scanned if the VACUUM finds
>> dead tuples.  But even 1 dead tuple will cause a complete scan of
>> every index.  I've complained about this before and I think there's
>> room for improvement here, but nobody's been motivated enough to
>> pursue this yet.
>
> The thing that's been speculated about in the past is having some
> threshold larger than 1 on the minimum number of dead tuples needed
> to cause a cleanup pass.

Agreed.

> It wouldn't be hard to implement, if you
> could get consensus on what the threshold should be.

Also agreed.

> I'd think
> some algorithm similar to the autovacuum thresholds might be
> appropriate.  It's not quite clear how this would interact with
> HOT pruning, though.

What's the relevance of HOT pruning here?

I was thinking that the relevant metric might be how many pages
contain dead tuples, because what we really want to do to reduce the
cost of future vacuuming and future index-only scans is get pages
marked all-visible.  Say, if less than 2% of the pages in the table
contain dead tuples and the space required to store the TIDs is less
than 50% of maintenance_work_mem, skip the index scans.  The first of
those thresholds, at least, would probably need to be configurable,
but that kind of idea.

The alternative that's been proposed is to do something based on the
number of dead tuples but, as somebody pointed out in a previous
discussion of this topic, one dead tuple per page throughout the whole
table is a LOT worse than the same number of dead tuples all on the same
pages.  You don't want to keep scanning large chunks of the heap
because you're too lazy to visit the indexes.
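
The page-based heuristic described above can be sketched roughly as follows. This is only an illustration of the idea under stated assumptions, not PostgreSQL code: HeapScanSummary and should_skip_index_scan are hypothetical names, the 2% and 50% figures are the ones floated in this message, and the 6-byte TID size is the size of a heap item pointer.

```python
from dataclasses import dataclass

TID_BYTES = 6  # size of one heap tuple identifier (item pointer)


@dataclass
class HeapScanSummary:
    """Hypothetical summary of what a heap pass found (not a real PostgreSQL struct)."""
    total_pages: int
    pages_with_dead_tuples: int
    dead_tuples: int


def should_skip_index_scan(summary: HeapScanSummary,
                           maintenance_work_mem_bytes: int,
                           max_dirty_page_fraction: float = 0.02,
                           max_tid_mem_fraction: float = 0.5) -> bool:
    """Return True if, under the proposed heuristic, index scans could be skipped."""
    if summary.dead_tuples == 0 or summary.total_pages == 0:
        return True  # nothing to remove from the indexes
    dirty_fraction = summary.pages_with_dead_tuples / summary.total_pages
    tid_bytes = summary.dead_tuples * TID_BYTES
    return (dirty_fraction < max_dirty_page_fraction
            and tid_bytes < max_tid_mem_fraction * maintenance_work_mem_bytes)


MWM = 64 * 1024 * 1024  # assume maintenance_work_mem = 64MB

# One dead tuple per page across the whole table: every page is affected.
spread = HeapScanSummary(total_pages=1_000_000,
                         pages_with_dead_tuples=1_000_000,
                         dead_tuples=1_000_000)
# The same number of dead tuples packed onto 5,000 pages.
clustered = HeapScanSummary(total_pages=1_000_000,
                            pages_with_dead_tuples=5_000,
                            dead_tuples=1_000_000)

print(should_skip_index_scan(spread, MWM))
print(should_skip_index_scan(clustered, MWM))
```

With these made-up numbers the spread-out case does not qualify for skipping, while the clustered case does, even though both have the same dead tuple count. That is the reason for counting affected pages rather than tuples.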

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] postgres 1 of 2 can pg 9.6 vacuum freeze skip page on index?

2016-12-01 Thread Tom Lane

Robert Haas  writes:
> I think that the indexes only need to be scanned if the VACUUM finds
> dead tuples.  But even 1 dead tuple will cause a complete scan of
> every index.  I've complained about this before and I think there's
> room for improvement here, but nobody's been motivated enough to
> pursue this yet.

The thing that's been speculated about in the past is having some
threshold larger than 1 on the minimum number of dead tuples needed
to cause a cleanup pass.  It wouldn't be hard to implement, if you
could get consensus on what the threshold should be.  I'd think
some algorithm similar to the autovacuum thresholds might be
appropriate.  It's not quite clear how this would interact with
HOT pruning, though.
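
For illustration, a threshold modeled on the autovacuum formula (base threshold plus scale factor times the number of tuples) might look like the sketch below. The function names are hypothetical, and reusing the autovacuum defaults of 50 and 0.2 for a cleanup pass is purely an assumption; no consensus on the actual values exists in this thread.

```python
def cleanup_threshold(reltuples: float,
                      base_threshold: int = 50,
                      scale_factor: float = 0.2) -> float:
    """Autovacuum-style threshold: base + scale_factor * number of tuples.

    The defaults mirror autovacuum_vacuum_threshold and
    autovacuum_vacuum_scale_factor; applying them to index cleanup
    is an assumption made for this sketch.
    """
    return base_threshold + scale_factor * reltuples


def needs_index_cleanup(dead_tuples: int, reltuples: float) -> bool:
    """Hypothetical check: run the index cleanup pass only above the threshold."""
    return dead_tuples > cleanup_threshold(reltuples)


# A table with one million tuples: a single dead tuple would no longer
# force a pass over every index, but a quarter of the table being dead would.
print(needs_index_cleanup(1, 1_000_000))
print(needs_index_cleanup(250_000, 1_000_000))
```

A tuple-count threshold like this says nothing about how the dead tuples are distributed across heap pages.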

regards, tom lane


[HACKERS] Re: [HACKERS] Re: [HACKERS] postgres 1 of 2 can pg 9.6 vacuum freeze skip page on index?

2016-12-01 Thread Robert Haas

On Thu, Dec 1, 2016 at 9:45 AM, xu jian  wrote:
> Thanks for your reply. Is there any reason to update index
> statistics even if there are no changes to the table?
> Or is there any way to disable the index statistics update during vacuum
> freeze? Thanks.

I think that the indexes only need to be scanned if the VACUUM finds
dead tuples.  But even 1 dead tuple will cause a complete scan of
every index.  I've complained about this before and I think there's
room for improvement here, but nobody's been motivated enough to
pursue this yet.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


[HACKERS] Re: [HACKERS] postgres 1 of 2 can pg 9.6 vacuum freeze skip page on index?

2016-12-01 Thread xu jian

Hi Masahiko,

Thanks for your reply. Is there any reason to update index statistics
even if there are no changes to the table?
Or is there any way to disable the index statistics update during vacuum freeze?
Thanks.

James

From: Masahiko Sawada <sawada.m...@gmail.com>
Sent: December 1, 2016 9:06:15
To: xu jian
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] postgres 1 of 2 can pg 9.6 vacuum freeze skip page on
index?

On Thu, Dec 1, 2016 at 1:33 AM, xu jian <jame...@outlook.com> wrote:
> Hello,
>
> Please excuse me if I am using the wrong mailing list, but I asked this
> question on pgsql-admin and it looks like no one knows the answer.
>
>
> We upgraded our pg db to 9.6. As we know, pg 9.6 doesn't need a full table
> scan in vacuum freeze.
>
> http://rhaas.blogspot.com/2016/03/no-more-full-table-vacuums.html
>
>
> So we think that if we have run vacuum freeze on a table, and there has been
> no change to the table since then, the next run should finish very quickly.
>
>
> However, it doesn't behave as we expect. The next run of vacuum freeze still
> takes a long time. We then ran vacuum freeze with verbose and noticed that it
> spends a long time scanning the indexes.
>
> It seems that even if all rows are frozen on the data pages, vacuum freeze
> still needs to scan all the index pages. If we drop the index, vacuum freeze
> finishes immediately.
>
>
> Does anyone know if this is true?

Yeah, that's true. The vacuum on each index is required in order to
update index statistics even if there were no updates to the table.

> Btw, our table is large, and has about 40GB of index files. Is there any way
> to make vacuum freeze faster in this case?

I guess that there is no way.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

[HACKERS] Re: [HACKERS] postgres 1 of 2 can pg 9.6 vacuum freeze skip page on index?

2016-12-01 Thread Masahiko Sawada

On Thu, Dec 1, 2016 at 1:33 AM, xu jian  wrote:
> Hello,
>
> Please excuse me if I am using the wrong mailing list, but I asked this
> question on pgsql-admin and it looks like no one knows the answer.
>
>
> We upgraded our pg db to 9.6. As we know, pg 9.6 doesn't need a full table
> scan in vacuum freeze.
>
> http://rhaas.blogspot.com/2016/03/no-more-full-table-vacuums.html
>
>
> So we think that if we have run vacuum freeze on a table, and there has been
> no change to the table since then, the next run should finish very quickly.
>
>
> However, it doesn't behave as we expect. The next run of vacuum freeze still
> takes a long time. We then ran vacuum freeze with verbose and noticed that it
> spends a long time scanning the indexes.
>
> It seems that even if all rows are frozen on the data pages, vacuum freeze
> still needs to scan all the index pages. If we drop the index, vacuum freeze
> finishes immediately.
>
>
> Does anyone know if this is true?

Yeah, that's true. The vacuum on each index is required in order to
update index statistics even if there were no updates to the table.

> Btw, our table is large, and has about 40GB of index files. Is there any way
> to make vacuum freeze faster in this case?

I guess that there is no way.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



[HACKERS] postgres 1 of 2 can pg 9.6 vacuum freeze skip page on index?

2016-11-30 Thread xu jian

Hello,

   Please excuse me if I am using the wrong mailing list, but I asked this
question on pgsql-admin and it looks like no one knows the answer.


We upgraded our pg db to 9.6. As we know, pg 9.6 doesn't need a full table scan
in vacuum freeze.

http://rhaas.blogspot.com/2016/03/no-more-full-table-vacuums.html


So we think that if we have run vacuum freeze on a table, and there has been no
change to the table since then, the next run should finish very quickly.


However, it doesn't behave as we expect. The next run of vacuum freeze still
takes a long time. We then ran vacuum freeze with verbose and noticed that it
spends a long time scanning the indexes.

It seems that even if all rows are frozen on the data pages, vacuum freeze
still needs to scan all the index pages. If we drop the index, vacuum freeze
finishes immediately.


Does anyone know if this is true?


Btw, our table is large, and has about 40GB of index files. Is there any way to
make vacuum freeze faster in this case?


Thanks for the help.


James

Re: [HACKERS] postgres 9.3 postgres_fdw ::LOG: could not receive data from client: Connection reset by peer

2016-11-24 Thread Vladimir Svedov

The local server log has the line, and the remote log is empty (it is
configured for a minimum level of warning, and when I produce one it appears
in the log OK).
And I have new details: it happens on some additional environments, though not
constantly. For some hours it happens every time, then it just stops appearing:
postgres@hostname:~$ grep user 9.3/main/pg_log/postgresql-2016-11-22_00.log  | cat -n
     1  2016-11-22 06:00:02 UTC 127.0.0.1 user::LOG:  could not receive data from client: Connection reset by peer
     2  2016-11-22 06:00:03 UTC 127.0.0.1 user::LOG:  could not receive data from client: Connection reset by peer
     3  2016-11-22 06:00:03 UTC 127.0.0.1 user::LOG:  could not receive data from client: Connection reset by peer
     4  2016-11-22 10:06:08 UTC 127.0.0.1 user::LOG:  could not receive data from client: Connection reset by peer
     5  2016-11-22 10:06:08 UTC 127.0.0.1 user ::LOG:  could not receive data from client: Connection reset by peer
     6  2016-11-22 10:06:08 UTC 127.0.0.1 user::LOG:  could not receive data from client: Connection reset by peer
     7  2016-11-22 11:25:27 UTC 127.0.0.1  user::LOG:  could not receive data from client: Connection reset by peer
     8  2016-11-22 11:25:27 UTC 127.0.0.1 user::LOG:  could not receive data from client: Connection reset by peer
     9  2016-11-22 11:25:27 UTC 127.0.0.1 user::LOG:  could not receive data from client: Connection reset by peer
This is the log from the local server, where the user runs the same bunch of
three statements every minute.
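
To see how bursty the message is, the log lines can be grouped by minute. The following is a rough sketch, assuming each raw log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp as in the excerpt above (without the cat -n numbering); resets_per_minute is a name made up for this example.

```python
from collections import Counter
from datetime import datetime

MESSAGE = "could not receive data from client: Connection reset by peer"


def resets_per_minute(lines):
    """Count 'Connection reset by peer' log lines per calendar minute."""
    counts = Counter()
    for line in lines:
        if MESSAGE not in line:
            continue
        parts = line.split()
        try:
            # The first two whitespace-separated fields are date and time.
            stamp = datetime.strptime(" ".join(parts[:2]), "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # line does not start with a timestamp; ignore it
        counts[stamp.replace(second=0)] += 1
    return counts


sample = [
    "2016-11-22 06:00:02 UTC 127.0.0.1 user::LOG:  " + MESSAGE,
    "2016-11-22 06:00:03 UTC 127.0.0.1 user::LOG:  " + MESSAGE,
    "2016-11-22 06:00:03 UTC 127.0.0.1 user::LOG:  " + MESSAGE,
    "2016-11-22 10:06:08 UTC 127.0.0.1 user::LOG:  " + MESSAGE,
]

for minute, count in sorted(resets_per_minute(sample).items()):
    print(minute.strftime("%Y-%m-%d %H:%M"), count)
```

Run against the whole log file, the minutes missing from the output are the ones in which the three statements ran without producing the message.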

But while it is happening, if I connect to the db with psql, the same statement
does NOT produce this line in the log.

Here is the reason why you should not follow this post any more:
postgres=# select version();
   version
--
 PostgreSQL 9.3.9 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.7.2-5) 4.7.2, 64-bit
(1 row)

I will just upgrade it. And this is the only machine where I see this weird
behavior.
Sorry, guys, and thank you for your time.

2016-11-23 18:16 GMT+00:00 Jeff Janes :

> On Mon, Nov 21, 2016 at 6:32 AM, Vladimir Svedov 
> wrote:
>
>> Hi,
>> I have this question. Looked for a help on http://dba.stackexchange.com/
>> No success.
>> Maybe you can answer?..
>> Thank you in advance
>>
>>
>> "FOREIGN_TABLE" created with postgres_fdw. LOCAL_TABLE is just a local
>> table...
>>
>> Symptoms:
>>
>>1. I run in psql query SELECT * from FOREIGN_TABLE. No log generated
>>2. I run in bash psql -c "SELECT * from LOCAL_TABLE". No log generated
>>3. I run in bash psql -c "SELECT * from FOREIGN_TABLE". ::LOG: could
>>not receive data from client: Connection reset by peer generated in
>>postgres log
>>
>>
> Which server log file is this generated in, the local or the foreign?
> Whichever it is, is there an entry in the logfile for the other server
> which seems to match up to this one?  That may have more useful details.
>
> Cheers,
>
> Jeff
>

Re: [HACKERS] postgres 9.3 postgres_fdw ::LOG: could not receive data from client: Connection reset by peer

2016-11-23 Thread Jeff Janes

On Mon, Nov 21, 2016 at 6:32 AM, Vladimir Svedov  wrote:

> Hi,
> I have this question. Looked for a help on http://dba.stackexchange.com/
> No success.
> Maybe you can answer?..
> Thank you in advance
>
>
> "FOREIGN_TABLE" created with postgres_fdw. LOCAL_TABLE is just a local
> table...
>
> Symptoms:
>
>1. I run in psql query SELECT * from FOREIGN_TABLE. No log generated
>2. I run in bash psql -c "SELECT * from LOCAL_TABLE". No log generated
>3. I run in bash psql -c "SELECT * from FOREIGN_TABLE". ::LOG: could
>not receive data from client: Connection reset by peer generated in
>postgres log
>
>
Which server log file is this generated in, the local or the foreign?
Whichever it is, is there an entry in the logfile for the other server
which seems to match up to this one?  That may have more useful details.

Cheers,

Jeff

Re: [HACKERS] postgres 9.3 postgres_fdw ::LOG: could not receive data from client: Connection reset by peer

2016-11-23 Thread Robert Haas

On Wed, Nov 23, 2016 at 3:08 AM, Vladimir Svedov  wrote:
> No, I select from it OK.
> The bug(?) is that when I do it in an open psql session it produces no log,
> and when I run the same select as psql -c "SELECT..." it gives the above

OK, that's pretty weird.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] postgres 9.3 postgres_fdw ::LOG: could not receive data from client: Connection reset by peer

2016-11-23 Thread Vladimir Svedov

No, I select from it OK.
The bug(?) is that when I do it in an open psql session it produces no log,
and when I run the same select as psql -c "SELECT..." it gives the above

2016-11-22 20:18 GMT+00:00 Robert Haas :

> On Tue, Nov 22, 2016 at 5:05 AM, Vladimir Svedov 
> wrote:
> > Hi,
> > Sorry - tried to reproduce on other machine and gather all statements.
> And
> > failed
> > Installed 9.3 (which has those symptoms) and still can't reproduce.
> > Must be platform specific, not version
>
> Probably the foreign server isn't configured properly, and points to a
> host/port that resets the connection when you attempt to connect to
> it.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

Re: [HACKERS] postgres 9.3 postgres_fdw ::LOG: could not receive data from client: Connection reset by peer

2016-11-22 Thread Robert Haas

On Tue, Nov 22, 2016 at 5:05 AM, Vladimir Svedov  wrote:
> Hi,
> Sorry - tried to reproduce on other machine and gather all statements. And
> failed
> Installed 9.3 (which has those symptoms) and still can't reproduce.
> Must be platform specific, not version

Probably the foreign server isn't configured properly, and points to a
host/port that resets the connection when you attempt to connect to
it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Postgres abort found in 9.3.11

2016-11-22 Thread K S, Sandhya (Nokia - IN/Bangalore)

Hello,

The setup uses a hot-standby architecture, and the issue is seen during a
normal run under a normal load of 50% insert and 50% delete operations.
During startup of the standby node, we copy the data directory from the active
postgres using pg_basebackup.

Meanwhile, we are trying to create a test bed for people to try.

Regards,
Sandhya

-----Original Message-----
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: Tuesday, November 22, 2016 1:47 AM
To: K S, Sandhya (Nokia - IN/Bangalore) <sandhya@nokia.com>
Cc: pgsql-hackers@postgresql.org; Itnal, Prakash (Nokia - IN/Bangalore) 
<prakash.it...@nokia.com>
Subject: Re: [HACKERS] Postgres abort found in 9.3.11

"K S, Sandhya (Nokia - IN/Bangalore)" <sandhya@nokia.com> writes:
> As suggested by you, we upgraded the postgres to version 9.3.14. Also we 
> removed all the patches we had applied before. But the issue is still 
> observed in the latest version as well.

Can you make a test case for other people to try?

regards, tom lane


Re: [HACKERS] postgres 9.3 postgres_fdw ::LOG: could not receive data from client: Connection reset by peer

2016-11-22 Thread Vladimir Svedov

Hi,
Sorry - tried to reproduce on other machine and gather all statements. And
failed
Installed 9.3 (which has those symptoms) and still can't reproduce.
Must be platform specific, not version

2016-11-21 21:58 GMT+00:00 Kevin Grittner :

> On Mon, Nov 21, 2016 at 8:32 AM, Vladimir Svedov 
> wrote:
>
> > I have this question. Looked for a help on http://dba.stackexchange.com/
> > No success.
>
> A link to the actual question would be appreciated.
>
> > "FOREIGN_TABLE" created with postgres_fdw. LOCAL_TABLE is just a local
> table...
> >
> > Symptoms:
> >
> > I run in psql query SELECT * from FOREIGN_TABLE. No log generated
> > I run in bash psql -c "SELECT * from LOCAL_TABLE". No log generated
> > I run in bash psql -c "SELECT * from FOREIGN_TABLE". ::LOG: could not
> receive data from client: Connection reset by peer generated in postgres log
>
> Please provide more information, and preferably a self-contained
> test case (one that anyone can run on an empty database to see the
> problem).
>
> https://wiki.postgresql.org/wiki/Guide_to_reporting_problems
>
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

Re: [HACKERS] postgres 9.3 postgres_fdw ::LOG: could not receive data from client: Connection reset by peer

2016-11-21 Thread Kevin Grittner

On Mon, Nov 21, 2016 at 8:32 AM, Vladimir Svedov  wrote:

> I have this question. Looked for a help on http://dba.stackexchange.com/
> No success.

A link to the actual question would be appreciated.

> "FOREIGN_TABLE" created with postgres_fdw. LOCAL_TABLE is just a local 
> table...
>
> Symptoms:
>
> I run in psql query SELECT * from FOREIGN_TABLE. No log generated
> I run in bash psql -c "SELECT * from LOCAL_TABLE". No log generated
> I run in bash psql -c "SELECT * from FOREIGN_TABLE". ::LOG: could not receive 
> data from client: Connection reset by peer generated in postgres log

Please provide more information, and preferably a self-contained
test case (one that anyone can run on an empty database to see the
problem).

https://wiki.postgresql.org/wiki/Guide_to_reporting_problems

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Postgres abort found in 9.3.11

2016-11-21 Thread Tom Lane

"K S, Sandhya (Nokia - IN/Bangalore)"  writes:
> As suggested by you, we upgraded the postgres to version 9.3.14. Also we 
> removed all the patches we had applied before. But the issue is still 
> observed in the latest version as well.

Can you make a test case for other people to try?

regards, tom lane



Re: [HACKERS] Postgres abort found in 9.3.11

2016-11-21 Thread K S, Sandhya (Nokia - IN/Bangalore)

Hello,

As suggested by you, we upgraded the postgres to version 9.3.14. Also we 
removed all the patches we had applied before. But the issue is still observed 
in the latest version as well.

The issue is seen during normal run and only observed in the standby node. 

This time as well, the same error log is observed.
node-1 postgres[8743]: [18-1] PANIC:  btree_xlog_delete_get_latestRemovedXid: 
cannot operate with inconsistent data

Can you please share your inputs which would help us proceed further?

Regards,
Sandhya

-----Original Message-----
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: Friday, September 16, 2016 1:29 AM
To: K S, Sandhya (Nokia - IN/Bangalore) <sandhya@nokia.com>
Cc: pgsql-hackers@postgresql.org; Itnal, Prakash (Nokia - IN/Bangalore) 
<prakash.it...@nokia.com>
Subject: Re: [HACKERS] Postgres abort found in 9.3.11

"K S, Sandhya (Nokia - IN/Bangalore)" <sandhya@nokia.com> writes:
> We tried to replicate the scenario without our patch(exiting postmaster) and 
> still we were able to see the issue.

> Same error was seen this time as well.
> node-0 postgres[8243]: [1-2] HINT:  Is another postmaster already running on 
> port 5433? If not, wait a few seconds and retry.  
> node-1 postgres[8650]: [18-1] PANIC:  btree_xlog_delete_get_latestRemovedXid: 
> cannot operate with inconsistent data

> Crash was not seen in 9.3.9 without the patch but it was reproduced in 9.3.11.
> So something specifically changed between 9.3.9 and 9.3.11 is causing the 
> issue.

Well, I looked through the git history from 9.3.9 to 9.3.11 and I don't
see anything that seems likely to explain a problem here.

If you can reproduce this, which it sounds like you can, maybe you could
create a self-contained test case for other people to try?

Also worth noting is that the current 9.3.x release is 9.3.14.  You
might save yourself some time by updating and seeing if it still
reproduces in 9.3.14.

regards, tom lane


[HACKERS] postgres 9.3 postgres_fdw ::LOG: could not receive data from client: Connection reset by peer

2016-11-21 Thread Vladimir Svedov

Hi,
I have this question. Looked for a help on http://dba.stackexchange.com/
No success.
Maybe you can answer?..
Thank you in advance


"FOREIGN_TABLE" created with postgres_fdw. LOCAL_TABLE is just a local
table...

Symptoms:

   1. I run in psql query SELECT * from FOREIGN_TABLE. No log generated
   2. I run in bash psql -c "SELECT * from LOCAL_TABLE". No log generated
   3. I run in bash psql -c "SELECT * from FOREIGN_TABLE". ::LOG: could not
   receive data from client: Connection reset by peer generated in postgres
   log

I can't set logging lower and yet this message distracts. Please share any
idea how to set up env to avoid generating this message?.. I feel I'm
missing something obvious, but can't see what.

PS. I tried running -f file instead of -c "SQL". Of course it did not
change a thing. And of course I tried putting \q to file with same result...

Re: [HACKERS] Postgres abort found in 9.3.11

2016-09-15 Thread Tom Lane

"K S, Sandhya (Nokia - IN/Bangalore)"  writes:
> We tried to replicate the scenario without our patch(exiting postmaster) and 
> still we were able to see the issue.

> Same error was seen this time as well.
> node-0 postgres[8243]: [1-2] HINT:  Is another postmaster already running on 
> port 5433? If not, wait a few seconds and retry.  
> node-1 postgres[8650]: [18-1] PANIC:  btree_xlog_delete_get_latestRemovedXid: 
> cannot operate with inconsistent data

> Crash was not seen in 9.3.9 without the patch but it was reproduced in 9.3.11.
> So something specifically changed between 9.3.9 and 9.3.11 is causing the 
> issue.

Well, I looked through the git history from 9.3.9 to 9.3.11 and I don't
see anything that seems likely to explain a problem here.

If you can reproduce this, which it sounds like you can, maybe you could
create a self-contained test case for other people to try?

Also worth noting is that the current 9.3.x release is 9.3.14.  You
might save yourself some time by updating and seeing if it still
reproduces in 9.3.14.

regards, tom lane


Re: [HACKERS] Postgres abort found in 9.3.11

2016-09-15 Thread K S, Sandhya (Nokia - IN/Bangalore)

Hello,

We tried to replicate the scenario without our patch(exiting postmaster) and 
still we were able to see the issue.

Same error was seen this time as well.
node-0 postgres[8243]: [1-2] HINT:  Is another postmaster already running on 
port 5433? If not, wait a few seconds and retry.  
node-1 postgres[8650]: [18-1] PANIC:  btree_xlog_delete_get_latestRemovedXid: 
cannot operate with inconsistent data

Crash was not seen in 9.3.9 without the patch but it was reproduced in 9.3.11.
So something specifically changed between 9.3.9 and 9.3.11 is causing the issue.

Thanks in advance!!!

Sandhya

-----Original Message-----
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: Tuesday, September 06, 2016 5:04 PM
To: K S, Sandhya (Nokia - IN/Bangalore) <sandhya@nokia.com>
Cc: pgsql-hackers@postgresql.org; Itnal, Prakash (Nokia - IN/Bangalore) 
<prakash.it...@nokia.com>
Subject: Re: [HACKERS] Postgres abort found in 9.3.11

"K S, Sandhya (Nokia - IN/Bangalore)" <sandhya@nokia.com> writes:
> I was able to find a patch file where there is a call to ExitPostmaster() in 
> postmaster.c . 

> @@ -3081,6 +3081,11 @@
> shmem_exit(1);
> reset_shared(PostPortNumber);

> +   /* recovery termination */
> +   ereport(FATAL,
> +   (errmsg("recovery termination due to process crash")));
> +   ExitPostmaster(99);
> +
> StartupPID = StartupDataBase();
> Assert(StartupPID != 0); 
> pmState = PM_STARTUP;

There's no such code in the community sources, and I can't say that
such a patch looks like a bright idea to me.  It would disable any
restart after a crash (not only during recovery).

If you're running a version with assorted random non-community patches,
we can't really offer much support for that.

regards, tom lane


Re: [HACKERS] Postgres abort found in 9.3.11

2016-09-06 Thread K S, Sandhya (Nokia - IN/Bangalore)

Hello,

I was able to find a patch file where there is a call to ExitPostmaster() in 
postmaster.c . 

@@ -3081,6 +3081,11 @@
shmem_exit(1);
reset_shared(PostPortNumber);

+   /* recovery termination */
+   ereport(FATAL,
+   (errmsg("recovery termination due to process crash")));
+   ExitPostmaster(99);
+
StartupPID = StartupDataBase();
Assert(StartupPID != 0); 
pmState = PM_STARTUP;

But this patch is there from 2009 when Postgres was upgraded to 9.0. I am 
checking on why this patch was introduced in the first place.
Still the question exists of why the issue is not seen in version 9.3.9 but 
exists in 9.3.11.

Also the case of standalone recovery is taken care of with introduction of the 
patch file.

"err-3" is part of postgres source code(nbtxlog.c). Two different lines are 
combined probably leading to confusion.
Aug 22 11:44:52.065760 crit node-1 postgres[8629]: [18-1] err-3:  
btree_xlog_delete_get_latestRemovedXid: cannot operate with inconsistent data
Aug 22 11:44:52.065971 crit node-1 postgres[8629]: [18-2] CONTEXT:  xlog redo 
delete: index 1663/16386/17378; iblk 1, heap 1663/16386/16518;

Thanks in advance!!!
Sandhya

-----Original Message-----
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: Thursday, September 01, 2016 7:19 PM
To: K S, Sandhya (Nokia - IN/Bangalore) <sandhya@nokia.com>
Cc: pgsql-hackers@postgresql.org; Itnal, Prakash (Nokia - IN/Bangalore) 
<prakash.it...@nokia.com>
Subject: Re: [HACKERS] Postgres abort found in 9.3.11

"K S, Sandhya (Nokia - IN/Bangalore)" <sandhya@nokia.com> writes:
> Our setup is a hot-standby architecture. This crash is occurring only on 
> stand-by node. Postgres continues to run without any issues on active node.
> Postmaster is waiting for a start and is throwing this message.

> Aug 22 11:44:21.462555 info node-0 postgres[8222]: [1-2] HINT:  Is another 
> postmaster already running on port 5433? If not, wait a few seconds and 
> retry.  
> Aug 22 11:44:52.065760 crit node-1 postgres[8629]: [18-1] err-3:  
> btree_xlog_delete_get_latestRemovedXid: cannot operate with inconsistent 
> dataAug 22 11:44:52.065971 crit CFPU-1 postgres[8629]: [18-2] CONTEXT:  xlog 
> redo delete: index 1663/16386/17378; iblk 1, heap 1663/16386/16518;

Hmm, that HINT seems to be the tail end of a message indicating that the
postmaster is refusing to start because of an existing postmaster.  Why
is that appearing?  If you've got some script that's overeagerly launching
and killing postmasters, maybe that's the ultimate cause of problems.

The only method I've heard of for getting that get_latestRemovedXid
error is to try to launch a standalone backend (postgres --single)
in a standby server directory.  We don't support that, cf
https://www.postgresql.org/message-id/flat/00F0B2CEF6D0CEF8A90119D4%40eje.credativ.lan

BTW, I'm curious about the "err-3:" part.  That would not be expected
in any standard build of Postgres ... is this something custom modified?

regards, tom lane


Re: [HACKERS] Postgres abort found in 9.3.11

2016-09-06 Thread Tom Lane

"K S, Sandhya (Nokia - IN/Bangalore)"  writes:
> I was able to find a patch file where there is a call to ExitPostmaster() in 
> postmaster.c . 

> @@ -3081,6 +3081,11 @@
> shmem_exit(1);
> reset_shared(PostPortNumber);

> +   /* recovery termination */
> +   ereport(FATAL,
> +   (errmsg("recovery termination due to process crash")));
> +   ExitPostmaster(99);
> +
> StartupPID = StartupDataBase();
> Assert(StartupPID != 0); 
> pmState = PM_STARTUP;

There's no such code in the community sources, and I can't say that
such a patch looks like a bright idea to me.  It would disable any
restart after a crash (not only during recovery).

If you're running a version with assorted random non-community patches,
we can't really offer much support for that.

regards, tom lane


Re: [HACKERS] Postgres abort found in 9.3.11

2016-09-02 Thread K S, Sandhya (Nokia - IN/Bangalore)

Hello Tom,

Apologies for the delayed reply.

Our setup is a hot-standby architecture. This crash occurs only on the
standby node; Postgres continues to run without any issues on the active node.
The postmaster is waiting to start and logs this message:

Aug 22 11:44:21.462555 info node-0 postgres[8222]: [1-2] HINT:  Is another postmaster already running on port 5433? If not, wait a few seconds and retry.
Aug 22 11:44:52.065760 crit node-1 postgres[8629]: [18-1] err-3:  btree_xlog_delete_get_latestRemovedXid: cannot operate with inconsistent data
Aug 22 11:44:52.065971 crit CFPU-1 postgres[8629]: [18-2] CONTEXT:  xlog redo delete: index 1663/16386/17378; iblk 1, heap 1663/16386/16518;
Aug 22 11:44:52.085486 info node-1 coredumper: Generating core file

The standby Postgres recovers automatically on the next restart, because we
always copy the database afresh from the active node on restart.

We implemented a patch to force-kill the walsender on the active side. This
avoids a prolonged wait when the standby node is unreachable (e.g., forced
power-off or LAN cable removal). This implementation has existed for a long
time; however, the issue has been observed only recently, after upgrading to
9.3.11. Do you think this forced kill of the walsender might lead to such
issues in the latest Postgres?

Regards,
Sandhya

-----Original Message-----
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: Tuesday, August 30, 2016 5:09 PM
To: K S, Sandhya (Nokia - IN/Bangalore) <sandhya@nokia.com>
Cc: pgsql-hackers@postgresql.org; Itnal, Prakash (Nokia - IN/Bangalore) 
<prakash.it...@nokia.com>
Subject: Re: [HACKERS] Postgres abort found in 9.3.11

"K S, Sandhya (Nokia - IN/Bangalore)" <sandhya@nokia.com> writes:
> During the server restart, we are getting postgres crash with sigabrt. No 
> other operation being performed.
> Attached the backtrace.

What shows up in the postmaster log?

> The occurrence is occasional. The issue is seen once in 30~50 times.

Does it successfully restart if you try again?  If not, what are you
doing to recover?

regards, tom lane


Re: [HACKERS] Postgres abort found in 9.3.11

2016-09-01 Thread Tom Lane

"K S, Sandhya (Nokia - IN/Bangalore)"  writes:
> Our setup is a hot-standby architecture. This crash is occurring only on 
> stand-by node. Postgres continues to run without any issues on active node.
> Postmaster is waiting for a start and is throwing this message.

> Aug 22 11:44:21.462555 info node-0 postgres[8222]: [1-2] HINT:  Is another postmaster already running on port 5433? If not, wait a few seconds and retry.
> Aug 22 11:44:52.065760 crit node-1 postgres[8629]: [18-1] err-3:  btree_xlog_delete_get_latestRemovedXid: cannot operate with inconsistent data
> Aug 22 11:44:52.065971 crit CFPU-1 postgres[8629]: [18-2] CONTEXT:  xlog redo delete: index 1663/16386/17378; iblk 1, heap 1663/16386/16518;

Hmm, that HINT seems to be the tail end of a message indicating that the
postmaster is refusing to start because of an existing postmaster.  Why
is that appearing?  If you've got some script that's overeagerly launching
and killing postmasters, maybe that's the ultimate cause of problems.

The only method I've heard of for getting that get_latestRemovedXid
error is to try to launch a standalone backend (postgres --single)
in a standby server directory.  We don't support that, cf
https://www.postgresql.org/message-id/flat/00F0B2CEF6D0CEF8A90119D4%40eje.credativ.lan
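
As a rough illustration of the two points above, a wrapper script could refuse to start a standalone backend in a directory that looks like a standby or that may already have a postmaster. This is only a sketch under stated assumptions: the helper name is made up for this example, and it assumes the 9.x layout in which recovery is configured through recovery.conf in the data directory and a running postmaster holds postmaster.pid there.

```python
import os


def single_user_blockers(datadir):
    """Return reasons why `postgres --single` should not be run in datadir.

    Hypothetical helper for illustration; it is not part of PostgreSQL.
    """
    reasons = []
    # Assumption: 9.x layout, where recovery.conf marks a cluster in recovery.
    if os.path.exists(os.path.join(datadir, "recovery.conf")):
        reasons.append("recovery.conf found: directory is set up for recovery/standby")
    # postmaster.pid is the postmaster's lock file in the data directory.
    if os.path.exists(os.path.join(datadir, "postmaster.pid")):
        reasons.append("postmaster.pid found: a postmaster may already be running")
    return reasons
```

An empty list only means that neither marker file was found; it does not prove the directory is safe to use.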

BTW, I'm curious about the "err-3:" part.  That would not be expected
in any standard build of Postgres ... is this something custom modified?

regards, tom lane


Re: [HACKERS] Postgres abort found in 9.3.11

2016-08-30 Thread Tom Lane

"K S, Sandhya (Nokia - IN/Bangalore)"  writes:
> During the server restart, we are getting postgres crash with sigabrt. No 
> other operation being performed.
> Attached the backtrace.

What shows up in the postmaster log?

> The occurrence is occasional. The issue is seen once in 30~50 times.

Does it successfully restart if you try again?  If not, what are you
doing to recover?

regards, tom lane



[HACKERS] Postgres abort found in 9.3.11

2016-08-30 Thread K S, Sandhya (Nokia - IN/Bangalore)

Hello,

During server restart, Postgres crashes with SIGABRT. No other operation is
being performed.
The backtrace is attached.


The occurrence is occasional. The issue is seen once in 30~50 times.
We recently upgraded Postgres from 9.3.9 to 9.3.11. The issue is not seen in
9.3.9.

Postgres server version: 9.3.11
Target architecture: mips-64

We checked the differences between 9.3.9 and 9.3.11 and found nothing relevant
that seems to cause the crash.

Please help us resolve the issue, as we are not familiar with the Postgres
code.
Also, if you see any difference between 9.3.9 and 9.3.11 that might point to
this issue, please let us know.

Thanks in advance!!!
Sandhya



#0  0x005558e909c0 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x005558e909c0 in raise () from /lib64/libc.so.6
#1  0x005558e952bc in abort () from /lib64/libc.so.6
#2  0x00012039db88 in errfinish ()
#3  0x00012039e868 in elog_finish ()
#4  0x00012009ea08 in btree_redo ()
#5  0x0001200cb178 in StartupXLOG ()
#6  0x000120259838 in StartupProcessMain ()
#7  0x0001200d5b3c in AuxiliaryProcessMain ()
#8  0x000120253314 in ?? ()



Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-05-05 Thread Alvaro Herrera

This is raw, in case anyone wants to look more closely.

alvherre=# select level, count(*), patch, subject from scary left join commits 
on patch = sha1 group by level, patch, subject order by level asc, count(*) 
desc;
┌───────┬───────┬───────────────────────────────────────────────────────────────┬────────────────────────────────────────────────────────────────────────────┐
│ level │ count │ patch                                                         │ subject                                                                    │
├───────┼───────┼───────────────────────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────┤
│     1 │     3 │ fd31cd265138019d9b5fe53043670898bc9f                          │ Don't vacuum all-frozen pages.                                             │
│     1 │     2 │ 3fc6e2d7f5b652b417fa6937c34de2438d60fa9f                      │ Make the upper part of the planner work by generating and comparing Paths. │
│     1 │     1 │ Change the format of the VM fork to add a second bit per page │                                                                            │
│     1 │     1 │ Don't VACUUM all-frozen pages                                 │                                                                            │
│     1 │     1 │ 65578341af1ae50e52e0f45e691ce88ad5a1b9b1                      │ Add Generic WAL interface                                                  │
│     1 │     1 │ 9cd00c457e6a1ebb984167ac556a9961812a683c                      │ Checkpoint sorting and balancing.                                          │
│     1 │     1 │ 924bcf4f16d54c55310b28f77686608684734f42                      │ Create an infrastructure for parallel computation in PostgreSQL.           │
│     1 │     1 │ Freeze Map                                                    │                                                                            │
│     1 │     1 │ parallelism, gather node...                                   │                                                                            │
│     1 │     1 │ Parallel query                                                │                                                                            │
│     1 │     1 │ Snapshot too old                                              │                                                                            │
│     1 │     1 │ Snapshot Too Old                                              │                                                                            │
│     2 │     2 │ bb140506df605fab58f48926ee1db1f80bdafb59                      │ Phrase full text search.                                                   │
│     2 │     1 │ \crosstabview                                                 │                                                                            │
│     2 │     1 │ 1c1a7cbd6a1600c97dfcd9b5dc78a23b5db9bbf6                      │ Sync our copy of the timezone library with IANA release tzcode2016c.       │
│     2 │     1 │ 3fc6e2d7f5b652b417fa6937c34de2438d60fa9f                      │ Make the upper part of the planner work by generating and comparing Paths. │
│     2 │     1 │ 428b1d6b29ca599c5700d4bc4f4ce4c5880369bf                      │ Allow to trigger kernel writeback after a configurable number of writes.   │
│     2 │     1 │ 48354581a49c30f5757c203415aa8412d85b0f70                      │ Allow Pin/UnpinBuffer to operate in a lockfree manner.                     │
│     2 │     1 │ 848ef42bb8c7909c9d7baa38178d4a209906e7c1                      │ Add the "snapshot too old" feature                                         │
│     2 │     1 │ fd31cd265138019d9b5fe53043670898bc9f                          │ Don't vacuum all-frozen pages.                                             │
│     2 │     1 │ Freeze map                                                    │                                                                            │
│     2 │     1 │ LWLock tranches                                               │                                                                            │
│     2 │     1 │ Parallelism Stuff (many patches)                              │                                                                            │
│     2 │     1 │ Parallel Query                                                │                                                                            │
│     3 │     1 │ 013ebc0a7b7ea9c1b1ab7a3d4dd75ea121ea8ba7                      │ Microvacuum for GIST                                                       │
│     3 │     1 │ 0711803775a37e0bf39d7efdd1e34d9d7e640ea1                      │ Use quicksort, not replacement selection, for external sorting.            │
│     3 │     1 │ 45be99f8cd5d606086e0a458c9c72910ba8a613d                      │ Support parallel joins, and make related improvements.                     │
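
The empty subject cells in this output come from the left join: answers given as free text rather than as a commit hash match nothing in the commits table. A minimal sketch of the same query, assuming a hypothetical two-table schema (the real table definitions are not shown in this thread) and a few made-up sample rows with abbreviated hashes, run through Python's sqlite3 module:

```python
import sqlite3


def tally(votes, commits):
    """votes: (level, patch) pairs; commits: (sha1, subject) pairs."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema; the real tables behind the poll are not shown.
    conn.execute("create table scary (level integer, patch text)")
    conn.execute("create table commits (sha1 text, subject text)")
    conn.executemany("insert into scary values (?, ?)", votes)
    conn.executemany("insert into commits values (?, ?)", commits)
    rows = conn.execute(
        "select level, count(*), patch, subject "
        "from scary left join commits on patch = sha1 "
        "group by level, patch, subject "
        "order by level asc, count(*) desc"
    ).fetchall()
    conn.close()
    return rows


# Sample data for illustration only (abbreviated hashes, invented vote counts).
rows = tally(
    votes=[(1, "fd31cd2"), (1, "fd31cd2"), (1, "Freeze Map"), (2, "bb14050")],
    commits=[("fd31cd2", "Don't vacuum all-frozen pages."),
             ("bb14050", "Phrase full text search.")],
)
```

In this sample the "Freeze Map" row comes back with a NULL subject (None in Python), the same effect as the blank cells above.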

Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-05-05 Thread Alvaro Herrera

Alvaro Herrera wrote:

> "Parallel Query" got many mentions; some of them were specific commits
> (such as "parallel infrastructure", "parallel joins", "parallel
> aggregates") and others were more generic.  For the generic mentions I
> just chose a few of the most salient patches, but didn't include either
> parallel aggregates nor parallel joins in that list.  "LWLock tranches"
> and "Freeze Map" also resulted in several commits appearing in the list
> below.  This distorts the results somewhat.  I considered redoing the
> results once I noticed the problem, but didn't really want to invest
> *too* much time.

After looking at commits mentioning "parallel", I'm surprised that this
one didn't turn up specifically as a scary commit:

│ a1c1af2a1f6099c039f145c1edb52257f315be51 │ rh...@postgresql.org │ Introduce group locking to prevent parallel processes from deadlocking. │


-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-05-05 Thread Alvaro Herrera

Noah Misch wrote:
> On Mon, Apr 18, 2016 at 03:37:21PM -0300, Alvaro Herrera wrote:
> > The RMT will publish aggregate, unattributed results after the poll
> > closes.

Here are some more detailed results.  We got 15 valid replies.  One
person voted twice, mentioning the same patches both times in slightly
different order; I considered the second reply only.
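
The counting rule just described (keep only the latest reply from a voter, then count how many voters mentioned each patch) can be sketched as follows. The voter labels and ballots are invented for illustration, since the real replies were unattributed, and the function name is hypothetical.

```python
from collections import Counter


def count_mentions(replies):
    """replies: list of (voter, [patch, ...]) tuples in arrival order."""
    latest = {}
    for voter, patches in replies:
        latest[voter] = patches  # a later reply replaces an earlier one
    counts = Counter()
    for patches in latest.values():
        counts.update(set(patches))  # count each patch once per voter
    return counts


# Invented ballots: voter "b" replied twice with the same patches reordered.
replies = [
    ("a", ["fd31cd2", "Parallel Query"]),
    ("b", ["fd31cd2", "848ef42"]),
    ("b", ["848ef42", "fd31cd2"]),
]
counts = count_mentions(replies)
```

With these sample ballots, "fd31cd2" is counted twice (once per voter) even though it appears in three replies.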

"Parallel Query" got many mentions; some of them were specific commits
(such as "parallel infrastructure", "parallel joins", "parallel
aggregates") and others were more generic.  For the generic mentions I
just chose a few of the most salient patches, but didn't include either
parallel aggregates nor parallel joins in that list.  "LWLock tranches"
and "Freeze Map" also resulted in several commits appearing in the list
below.  This distorts the results somewhat.  I considered redoing the
results once I noticed the problem, but didn't really want to invest
*too* much time.

I chose to ignore the scariness rating in this tabular report.

┌───────┬──────────────────────────────────────────┬─────────────────────────┬─────────────────────────────────────────────────────────────────────────────┐
│ count │ sha1                                     │ committer               │ subject                                                                     │
├───────┼──────────────────────────────────────────┼─────────────────────────┼─────────────────────────────────────────────────────────────────────────────┤
│     8 │ fd31cd265138019d9b5fe53043670898bc9f     │ rh...@postgresql.org    │ Don't vacuum all-frozen pages.                                              │
│     6 │ 924bcf4f16d54c55310b28f77686608684734f42 │ rh...@postgresql.org    │ Create an infrastructure for parallel computation in PostgreSQL.            │
│     5 │ f0661c4e8c44c0ec7acd4ea7c82e85b265447398 │ rh...@postgresql.org    │ Make sequential scans parallel-aware.                                       │
│     5 │ ee7ca559fcf404f9a3bd99da85c8f4ea9fbc2e92 │ rh...@postgresql.org    │ Add a C API for parallel heap scans.                                        │
│     3 │ a892234f830e832110f63fc0a2afce2fb21d1584 │ rh...@postgresql.org    │ Change the format of the VM fork to add a second bit per page.              │
│     3 │ 3fc6e2d7f5b652b417fa6937c34de2438d60fa9f │ t...@sss.pgh.pa.us      │ Make the upper part of the planner work by generating and comparing Paths.  │
│     3 │ 848ef42bb8c7909c9d7baa38178d4a209906e7c1 │ kgri...@postgresql.org  │ Add the "snapshot too old" feature                                          │
│     2 │ bb140506df605fab58f48926ee1db1f80bdafb59 │ teo...@sigaev.ru        │ Phrase full text search.                                                    │
│     2 │ 9cd00c457e6a1ebb984167ac556a9961812a683c │ and...@anarazel.de      │ Checkpoint sorting and balancing.                                           │
│     1 │ 3bd909b220930f21d6e15833a17947be749e7fde │ rh...@postgresql.org    │ Add a Gather executor node.                                                 │
│     1 │ c319991bcad02a2e99ddac3f42762b0f6fa8d52a │ rh...@postgresql.org    │ Use separate lwlock tranches for buffer, lock, and predicate lock managers. │
│     1 │ 1c1a7cbd6a1600c97dfcd9b5dc78a23b5db9bbf6 │ t...@sss.pgh.pa.us      │ Sync our copy of the timezone library with IANA release tzcode2016c.        │
│     1 │ 65578341af1ae50e52e0f45e691ce88ad5a1b9b1 │ teo...@sigaev.ru        │ Add Generic WAL interface                                                   │
│     1 │ c09b18f21c52cbcf8718d6c267c84fcfea3739a9 │ alvhe...@alvh.no-ip.org │ Support crosstabview in psql                                                │
│     1 │ e4106b2528727c4b48639c0e12bf2f70a766b910 │ rh...@postgresql.org    │ postgres_fdw: Push down joins to remote servers.                            │
│     1 │ 428b1d6b29ca599c5700d4bc4f4ce4c5880369bf │ and...@anarazel.de      │ Allow to trigger kernel writeback after a configurable number of writes.    │
│     1 │ 6150a1b08a9fe7ead2b25240be46dddeae9d98e1 │ rh...@postgresql.org    │ Move buffer I/O and content LWLocks out of the main tranche.                │
│     1 │ 48354581a49c30f5757c203415aa8412d85b0f70 │ and...@anarazel.de      │ Allow Pin/UnpinBuffer to operate in a lockfree manner.                      │
│     1 │ 7a542700df25eaf97b794bff63606176433dcdda │ sfr...@snowman.net      │ Create default roles                                                        │
│     1 │ 013ebc0a7b7ea9c1b1ab7a3d4dd75ea121ea8ba7 │ teo...@sigaev.ru        │ Microvacuum for GIST                                                        │
│     1 │ 53be0b1add7064ca5db3cd884302dfc3268d884e │ rh...@postgresql.org    │ Provide much better wait information in pg_stat_activity.                   │
│     1 │ 0711803775a37e0bf39d7efdd1e34d9d7e640ea1 │ rh...@postgresql.org    │ Use quicksort, not replacement selection, for external sorting.             │
│     1 │

Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-05-04 Thread Josh berkus

On 05/04/2016 06:56 PM, Robert Haas wrote:
> On Wed, May 4, 2016 at 9:41 PM, Noah Misch  wrote:
>> On Mon, Apr 18, 2016 at 03:37:21PM -0300, Alvaro Herrera wrote:
>>> The RMT will publish aggregate, unattributed results after the poll
>>> closes.
>>
>> Thanks for voting.  Join me in congratulating our top finishers:
>>
>> 1. fd31cd2 Don't vacuum all-frozen pages.
>> 2. "Parallel Query"
>> 3(tie). 3fc6e2d Make the upper part of the planner work by generating and 
>> comparing Paths.
>> 3(tie). 848ef42 Add the "snapshot too old" feature
> 
> Congratulations Kevin, Tom, me, and me!
> 
> I feel like I went to the Olympics and won both the gold *and* silver
> medals in the same event.  Beat that!
> 

Maybe we *should* call this 10.0.  That way people will be ready for
lots of breakage. ;-b

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)



Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-05-04 Thread Robert Haas

On Wed, May 4, 2016 at 9:41 PM, Noah Misch  wrote:
> On Mon, Apr 18, 2016 at 03:37:21PM -0300, Alvaro Herrera wrote:
>> The RMT will publish aggregate, unattributed results after the poll
>> closes.
>
> Thanks for voting.  Join me in congratulating our top finishers:
>
> 1. fd31cd2 Don't vacuum all-frozen pages.
> 2. "Parallel Query"
> 3(tie). 3fc6e2d Make the upper part of the planner work by generating and 
> comparing Paths.
> 3(tie). 848ef42 Add the "snapshot too old" feature

Congratulations Kevin, Tom, me, and me!

I feel like I went to the Olympics and won both the gold *and* silver
medals in the same event.  Beat that!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-05-04 Thread Noah Misch

On Mon, Apr 18, 2016 at 03:37:21PM -0300, Alvaro Herrera wrote:
> The RMT will publish aggregate, unattributed results after the poll
> closes.

Thanks for voting.  Join me in congratulating our top finishers:

1. fd31cd2 Don't vacuum all-frozen pages.
2. "Parallel Query"
3(tie). 3fc6e2d Make the upper part of the planner work by generating and 
comparing Paths.
3(tie). 848ef42 Add the "snapshot too old" feature


Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-04-28 Thread Noah Misch

On Mon, Apr 18, 2016 at 03:37:21PM -0300, Alvaro Herrera wrote:
> The PostgreSQL Project needs you!
> 
> The Release Management Team would like your input regarding the patch or
> patches which, in your opinion, are the most likely sources of major
> bugs or instabilities in PostgreSQL 9.6.
> 
> Please submit your answers before May 1st using this form:
> https://docs.google.com/forms/d/1xNNqhXC116wCMnomqGz9RQ7OuVwZqAcEre7iiU6pT20/viewform

Reminder: the survey closes this weekend.  Identify those scary patches!



Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-04-19 Thread David Fetter

On Tue, Apr 19, 2016 at 04:06:55PM -0400, Tom Lane wrote:
> Alvaro Herrera  writes:
> > Peter Geoghegan wrote:
> >> I would have appreciated more scope to say how confident I am in
> >> my prediction, and how scary in absolute terms I consider the
> >> scariest patches to be.
> 
> > It was purposefully ambiguous.  Maybe it should have been stated
> > explicitly.
> 
> I was thinking about complaining that "scariest" and "most bugs" are
> not the same thing.  Features you can turn off are not very scary,
> even if they're full of bugs (cough ... parallel query ... cough),
> because we could just ship 'em disabled by default until there's
> more reason to trust them.  What I find scary is patches that can
> break existing use-cases with no simple workaround.  I'm not sure
> which one to vote for yet.

There's space on the ballot for up to three, and it appears to be a
ranked choice or similar preference system.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-04-19 Thread Peter Geoghegan

On Tue, Apr 19, 2016 at 12:37 PM, Alvaro Herrera
 wrote:
> This guy reads my mind.  Where's my tinfoil hat?

Heh. Well, I'm generally not in favor of communicating concerns
without an obligation to defend them, but it could work well in tiny
doses. Offering hackers a low-risk way to take a position greatly
reduces the "knew-it-all-along" effect. We may then be more accurate
in assessing our own ability to anticipate problems.

There is very simple Malthusian logic [1] that explains why we'll
usually be wrong, which is:

Why are hackers bad at anticipating where bugs will be? Because if
they weren't, then there wouldn't be any bugs.

Please don't interpret my remarks as showing flippancy about bugs.
(The same should be said about the whole "scary patches" poll,
actually.)

>> I would have appreciated more scope to say how confident I am in my
>> prediction, and how scary in absolute terms I consider the scariest
>> patches to be.
>
> It was purposefully ambiguous.  Maybe it should have been stated
> explicitly.

I voted, and my vote probably just slightly reinforced the
conventional wisdom about where to look for problems -- it was not a
vote for parallel query, since I agree with Tom's assessment of the
risks there. I think you can probably guess what I voted for.

I wouldn't have expressed a similar sentiment on this list, because
that would probably turn out to be just jumping on the bandwagon.
There is a good chance that the patch will be totally fine in the end,
anyway. It was probably very carefully reviewed, precisely because it
touches critical parts of the system. And, it works in a way that
generalizes from an existing well-tested mechanism.

My vote represented "I certainly hope this patch has no bugs in it"
this time around. Next time, it might be "this patch almost certainly
has lots of undiscovered bugs", which might well be an original
insight for the release team if it's in my area of expertise (chances
are good that those bugs are not critically important if it gets to
that). Rarely, the message will be "I'm deeply concerned about the
*lasting* repercussions of having merged this patch". And so, yes, I
think that we might want to be clearer about looking for nuances like
that.

[1] http://www.scottaaronson.com/blog/?p=418
-- 
Peter Geoghegan


Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-04-19 Thread Tom Lane

Alvaro Herrera  writes:
> Peter Geoghegan wrote:
>> I would have appreciated more scope to say how confident I am in my
>> prediction, and how scary in absolute terms I consider the scariest
>> patches to be.

> It was purposefully ambiguous.  Maybe it should have been stated
> explicitly.

I was thinking about complaining that "scariest" and "most bugs" are
not the same thing.  Features you can turn off are not very scary,
even if they're full of bugs (cough ... parallel query ... cough),
because we could just ship 'em disabled by default until there's more
reason to trust them.  What I find scary is patches that can break
existing use-cases with no simple workaround.  I'm not sure which one
to vote for yet.

regards, tom lane


Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-04-19 Thread Alvaro Herrera

Peter Geoghegan wrote:
> On Mon, Apr 18, 2016 at 1:22 PM, Josh berkus  wrote:
> > We should send the owner of the scariest patch something as a prize.
> > Maybe a plastic skeleton or something ...
> 
> I think it was a good idea to call it the scariest patch rather than
> something more severe sounding. Having the poll only be half-serious
> is a good way to avoid self-censorship, and emphasizes that we're
> concerned about bugs that cause serious instability to the system as a
> whole. We're less concerned about the overall number of bugs in any
> given patch.

This guy reads my mind.  Where's my tinfoil hat?

> I would have appreciated more scope to say how confident I am in my
> prediction, and how scary in absolute terms I consider the scariest
> patches to be.

It was purposefully ambiguous.  Maybe it should have been stated
explicitly.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-04-19 Thread Peter Geoghegan

On Mon, Apr 18, 2016 at 1:22 PM, Josh berkus  wrote:
> We should send the owner of the scariest patch something as a prize.
> Maybe a plastic skeleton or something ...

I think it was a good idea to call it the scariest patch rather than
something more severe sounding. Having the poll only be half-serious
is a good way to avoid self-censorship, and emphasizes that we're
concerned about bugs that cause serious instability to the system as a
whole. We're less concerned about the overall number of bugs in any
given patch.

I would have appreciated more scope to say how confident I am in my
prediction, and how scary in absolute terms I consider the scariest
patches to be.

-- 
Peter Geoghegan


Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-04-19 Thread Chapman Flack

On 04/18/2016 04:22 PM, Josh berkus wrote:
> 
> We should send the owner of the scariest patch something as a prize.
> Maybe a plastic skeleton or something ...

A mouse.

-Chap




Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-04-18 Thread Josh berkus

On 04/18/2016 11:37 AM, Alvaro Herrera wrote:
> Hackers, lurkers,
> 
> The PostgreSQL Project needs you!
> 
> The Release Management Team would like your input regarding the patch or
> patches which, in your opinion, are the most likely sources of major
> bugs or instabilities in PostgreSQL 9.6.
> 
> Please submit your answers before May 1st using this form:
> https://docs.google.com/forms/d/1xNNqhXC116wCMnomqGz9RQ7OuVwZqAcEre7iiU6pT20/viewform
> 
> If, for some reason, you prefer not to fill that form or have further
> input on the topic, you can correspond via private email to one or more
> members of the RMT,
> 
>   Robert Haas 
>   Alvaro Herrera 
>   Noah Misch 
> 
> The RMT will publish aggregate, unattributed results after the poll
> closes.

We should send the owner of the scariest patch something as a prize.
Maybe a plastic skeleton or something ...

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)



Re: [HACKERS] Postgres 9.6 scariest patch tournament

2016-04-18 Thread Bill Moran

On Mon, 18 Apr 2016 15:37:21 -0300
Alvaro Herrera  wrote:

> Hackers, lurkers,
> 
> The PostgreSQL Project needs you!
> 
> The Release Management Team would like your input regarding the patch or
> patches which, in your opinion, are the most likely sources of major
> bugs or instabilities in PostgreSQL 9.6.

Wow ... this may be the most unusual survey I've seen in a while.

-- 
Bill Moran



[HACKERS] Postgres 9.6 scariest patch tournament

2016-04-18 Thread Alvaro Herrera

Hackers, lurkers,

The PostgreSQL Project needs you!

The Release Management Team would like your input regarding the patch or
patches which, in your opinion, are the most likely sources of major
bugs or instabilities in PostgreSQL 9.6.

Please submit your answers before May 1st using this form:
https://docs.google.com/forms/d/1xNNqhXC116wCMnomqGz9RQ7OuVwZqAcEre7iiU6pT20/viewform

If, for some reason, you prefer not to fill that form or have further
input on the topic, you can correspond via private email to one or more
members of the RMT,

Robert Haas 
Alvaro Herrera 
Noah Misch 

The RMT will publish aggregate, unattributed results after the poll
closes.

Thanks,

-- 
Álvaro Herrera



Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-13 Thread Amit Kapila

On Tue, Oct 13, 2015 at 8:57 PM, Tom Lane  wrote:
>
> Michael Paquier  writes:
> > On Tue, Oct 13, 2015 at 5:35 AM, Tom Lane  wrote:
> >> After poking around a bit more, I propose the attached patch.  I've
> >> checked that this is happy with an EXEC_BACKEND Unix build, but I'm not
> >> able to test it on Windows ... would somebody do that?
>
> > Looking at the patch, clearly +1 for the additional routine in both
> > win32_shmem.c and sysv_shmem.c to clean up the shmem state at backend
> > level. I have played as well with the patch on Windows and it behaves
> > as expected: without the patch a process killed with taskkill /f stops
> > straight the server even if restart_on_crash is on. With the patch the
> > server restarts correctly.
>
> OK, pushed with some additional comment-smithing.
>
> I noticed while looking at this that for subprocesses that aren't supposed
> to be attached to shared memory, we do pgwin32_ReserveSharedMemoryRegion()
> anyway in internal_forkexec(), and then that's never undone anywhere,
> so that that segment of the subprocess's memory space remains reserved.
> I'm not sure if this is worth changing, but if it is, we could do so now
> by calling VirtualFree() in PGSharedMemoryNoReAttach().
>

I think it is worth doing, as it can save memory for processes that
don't attach to shared memory.  Another thing is that we allocate
handles (by duplicating them) in save_backend_variables(), which
I am not sure are required for all processes; anyway, this doesn't
seem worth the trouble.

> BTW, I am suspicious that the DSM stuff may have related issues --- do
> we use inheritable mapping handles for DSM segments on Windows?
>

Not by default.  There is an API, dsm_pin_segment(), which duplicates the
handle for the postmaster process to retain the shared memory segment
until postmaster shutdown.  In general, I don't see such issues for DSM,
but please point it out if you see anything problematic.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-13 Thread Michael Paquier

On Tue, Oct 13, 2015 at 5:35 AM, Tom Lane  wrote:
> I wrote:
>> This is kind of a mess :-(.  But it does look like what we want is
>> for SubPostmasterMain to do more than nothing when it chooses not to
>> reattach.  Probably that should include resetting UsedShmemSegAddr to
>> NULL, as well as closing the handle.
>
> After poking around a bit more, I propose the attached patch.  I've
> checked that this is happy with an EXEC_BACKEND Unix build, but I'm not
> able to test it on Windows ... would somebody do that?
>
> BTW, it appears from this that Cygwin builds have been broken right along
> in a different way: according to the code in sysv_shmem's
> PGSharedMemoryReAttach, Cygwin does cause a re-attach to occur, which we
> were not undoing for putatively-not-connected-to-shmem child processes.
> That's a robustness problem because it breaks the postmaster's expectation
> that it's safe to not reinitialize shmem after a crash of one of those
> processes.  I believe this patch fixes that problem as well, though if
> anyone can test it on Cygwin that wouldn't be a bad thing either.

I don't have a Cygwin environment at hand. That's unfortunate..

Looking at the patch, clearly +1 for the additional routine in both
win32_shmem.c and sysv_shmem.c to clean up the shmem state at backend
level. I have also played with the patch on Windows and it behaves
as expected: without the patch, killing a process with taskkill /f
stops the server outright even if restart_after_crash is on. With the
patch, the server restarts correctly.
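
For anyone wanting to repeat this check, the kill step can be scripted roughly as below. This is a sketch only: the helper names are made up, and it assumes the backend PID was obtained beforehand (for instance with SELECT pg_backend_pid() in the session to be killed). Whether the server restarted afterwards still has to be verified separately, for example by reconnecting.

```python
import subprocess
import sys


def force_kill_command(pid):
    """Build the taskkill invocation used to simulate a backend crash."""
    return ["taskkill", "/f", "/pid", str(pid)]


def force_kill(pid):
    """Run taskkill for the given PID; only meaningful on Windows."""
    if sys.platform != "win32":
        raise RuntimeError("taskkill is only available on Windows")
    return subprocess.run(force_kill_command(pid)).returncode
```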

(Sorry, I should have mentioned that my last patch was untested and
*surely broken*, that was the result of a 3-min guess to make the
cleanup more generic for child processes that do not need to be
attached to shmem).
Regards,
-- 
Michael



Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-13 Thread Andrew Dunstan




On 10/12/2015 04:35 PM, Tom Lane wrote:
> I wrote:
>> This is kind of a mess :-(.  But it does look like what we want is
>> for SubPostmasterMain to do more than nothing when it chooses not to
>> reattach.  Probably that should include resetting UsedShmemSegAddr to
>> NULL, as well as closing the handle.
>
> After poking around a bit more, I propose the attached patch.  I've
> checked that this is happy with an EXEC_BACKEND Unix build, but I'm not
> able to test it on Windows ... would somebody do that?
>
> BTW, it appears from this that Cygwin builds have been broken right along
> in a different way: according to the code in sysv_shmem's
> PGSharedMemoryReAttach, Cygwin does cause a re-attach to occur, which we
> were not undoing for putatively-not-connected-to-shmem child processes.
> That's a robustness problem because it breaks the postmaster's expectation
> that it's safe to not reinitialize shmem after a crash of one of those
> processes.  I believe this patch fixes that problem as well, though if
> anyone can test it on Cygwin that wouldn't be a bad thing either.

OK, I can test it. But it's not quite clear to me from your description
how I should test Cygwin.

cheers

andrew


Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-13 Thread Tom Lane

Andrew Dunstan  writes:
> On 10/12/2015 04:35 PM, Tom Lane wrote:
>> BTW, it appears from this that Cygwin builds have been broken right along
>> in a different way: according to the code in sysv_shmem's
>> PGSharedMemoryReAttach, Cygwin does cause a re-attach to occur, which we
>> were not undoing for putatively-not-connected-to-shmem child processes.
>> That's a robustness problem because it breaks the postmaster's expectation
>> that it's safe to not reinitialize shmem after a crash of one of those
>> processes.  I believe this patch fixes that problem as well, though if
>> anyone can test it on Cygwin that wouldn't be a bad thing either.

> OK, I can test it. But it's not quite clear to me from your description 
> how I should test Cygwin.

The point is that I think that right now, the logging collector subprocess
remains connected to shared memory, which it should not (and won't, if my
patch is doing the right thing).  I do not know if there's an easy way to
inspect the process state to verify that on Windows.

If nothing else, you could put a bogus access to some shared-memory data
structure into the syslogger loop, and check that it succeeds now and
crashes after applying the patch ...

regards, tom lane


Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-13 Thread Tom Lane

Michael Paquier  writes:
> On Tue, Oct 13, 2015 at 5:35 AM, Tom Lane  wrote:
>> After poking around a bit more, I propose the attached patch.  I've
>> checked that this is happy with an EXEC_BACKEND Unix build, but I'm not
>> able to test it on Windows ... would somebody do that?

> Looking at the patch, clearly +1 for the additional routine in both
> win32_shmem.c and sysv_shmem.c to clean up the shmem state at backend
> level. I have also tried the patch on Windows and it behaves as
> expected: without the patch, killing a process with taskkill /f stops
> the server outright even if restart_after_crash is on. With the patch
> the server restarts correctly.

OK, pushed with some additional comment-smithing.

I noticed while looking at this that for subprocesses that aren't supposed
to be attached to shared memory, we do pgwin32_ReserveSharedMemoryRegion()
anyway in internal_forkexec(), and then that's never undone anywhere,
so that that segment of the subprocess's memory space remains reserved.
I'm not sure if this is worth changing, but if it is, we could do so now
by calling VirtualFree() in PGSharedMemoryNoReAttach().

BTW, I am suspicious that the DSM stuff may have related issues --- do
we use inheritable mapping handles for DSM segments on Windows?

regards, tom lane


Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-12 Thread Magnus Hagander

On Mon, Oct 12, 2015 at 12:25 PM, Andres Freund  wrote:

> On 2015-10-12 11:25:35 +0530, Amit Kapila wrote:
> >   /*
> > +  * Close the shared memory handle as the syslogger doesn't need to
> > +  * attach to it.  For EXEC_BACKEND case, the shared memory handle
> > +  * is inherited by all postmaster child processes irrespective of
> > +  * whether they need it or not.
> > +  */
> > +#ifdef EXEC_BACKEND
> > + if (!CloseHandle(UsedShmemSegID))
> > +     elog(LOG, "could not close handle to shared memory: error code %lu", GetLastError());
> > +#endif
> > +
>
> It feels wrong to do this in syslogger.c - I mean it's not the only
> process that's not attached to shared memory. Sure, the others get
> killed, but nonetheless...
>

+1. It feels like we're setting ourselves up for repeating this mistake at
some later time :)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-12 Thread Andres Freund

On 2015-10-12 11:25:35 +0530, Amit Kapila wrote:
>   /*
> +  * Close the shared memory handle as the syslogger doesn't need to
> +  * attach to it.  For EXEC_BACKEND case, the shared memory handle
> +  * is inherited by all postmaster child processes irrespective of
> +  * whether they need it or not.
> +  */
> +#ifdef EXEC_BACKEND
> + if (!CloseHandle(UsedShmemSegID))
> +     elog(LOG, "could not close handle to shared memory: error code %lu", GetLastError());
> +#endif
> +

It feels wrong to do this in syslogger.c - I mean it's not the only
process that's not attached to shared memory. Sure, the others get
killed, but nonetheless...

Greetings,

Andres Freund



Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-12 Thread Amit Kapila

On Mon, Oct 12, 2015 at 3:45 PM, Michael Paquier wrote:
>
> On Mon, Oct 12, 2015 at 2:55 PM, Amit Kapila wrote:
> > On Sun, Oct 11, 2015 at 9:12 PM, Tom Lane wrote:
> > I could easily reproduce the issue if logging collector is on and even if
> > we try to increase the loop count or sleep time in PGSharedMemoryCreate(),
> > it doesn't change the situation as the syslogger has a valid handle to
> > shared memory.  One way to fix is to just close the shared memory handle
> > in sys logger as we are not going to need it and attached patch which does
> > this fixes the issue for me.  Another invasive fix in case we want to
> > retain shared memory handle for some purpose (future requirement) could
> > be to send some signal to syslogger in restart path so that it can release
> > the shared memory handle.
>
> +#ifdef EXEC_BACKEND
> +if (!CloseHandle(UsedShmemSegID))
> +    elog(LOG, "could not close handle to shared memory: error code %lu", GetLastError());
> +#endif
> I am pretty sure that you would want a WIN32 block here, not
> EXEC_BACKEND as the latter can be used on non-Windows platforms as
> well to emulate Windows behavior.
>

Agreed, I can change the patch to use WIN32, but it seems not everyone
wants to follow this approach.  So let's first figure out the best way
to fix this.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-12 Thread Michael Paquier

On Mon, Oct 12, 2015 at 2:55 PM, Amit Kapila  wrote:
> On Sun, Oct 11, 2015 at 9:12 PM, Tom Lane  wrote:
> I could easily reproduce the issue if logging collector is on and even if
> we try to increase the loop count or sleep time in PGSharedMemoryCreate(),
> it doesn't change the situation as the syslogger has a valid handle to
> shared memory.  One way to fix is to just close the shared memory handle
> in sys logger as we are not going to need it and attached patch which does
> this fixes the issue for me.  Another invasive fix in case we want to
> retain shared memory handle for some purpose (future requirement) could
> be to send some signal to syslogger in restart path so that it can release
> the shared memory handle.

+#ifdef EXEC_BACKEND
+if (!CloseHandle(UsedShmemSegID))
+    elog(LOG, "could not close handle to shared memory: error code %lu", GetLastError());
+#endif
I am pretty sure that you would want a WIN32 block here, not
EXEC_BACKEND as the latter can be used on non-Windows platforms as
well to emulate Windows behavior.
-- 
Michael



Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-12 Thread Michael Paquier

On Mon, Oct 12, 2015 at 7:26 PM, Magnus Hagander  wrote:
>
>
> On Mon, Oct 12, 2015 at 12:25 PM, Andres Freund  wrote:
>>
>> On 2015-10-12 11:25:35 +0530, Amit Kapila wrote:
>> >   /*
>> > +  * Close the shared memory handle as the syslogger doesn't need to
>> > +  * attach to it.  For EXEC_BACKEND case, the shared memory handle
>> > +  * is inherited by all postmaster child processes irrespective of
>> > +  * whether they need it or not.
>> > +  */
>> > +#ifdef EXEC_BACKEND
>> > + if (!CloseHandle(UsedShmemSegID))
>> > +     elog(LOG, "could not close handle to shared memory: error code %lu", GetLastError());
>> > +#endif
>> > +
>>
>> It feels wrong to do this in syslogger.c - I mean it's not the only
>> process that's not attached to shared memory. Sure, the others get
>> killed, but nonetheless...
>
>
> +1. It feels like we're setting ourselves up for repeating this mistake at
> some later time :)

Actually, doesn't this apply as well to the archiver and the pgstat
collector? So perhaps we may want to do that in SubPostmasterMain with
PGSharedMemoryDetach. See for example the attached as an idea (patch
completely untested).
-- 
Michael
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 24e8404..2076d96 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -4637,6 +4637,16 @@ SubPostmasterMain(int argc, char *argv[])
 		strncmp(argv[1], "--forkbgworker=", 15) == 0)
 		PGSharedMemoryReAttach();
 
+	/*
+	 * Close any existing shared memory segment as those processes do not
+	 * need to have an access to it. This state is inherited from the
+	 * postmaster whether they need it or not.
+	 */
+	if (strcmp(argv[1], "--forkarch") == 0 ||
+		strcmp(argv[1], "--forkcol") == 0 ||
+		strcmp(argv[1], "--forklog") == 0)
+		PGSharedMemoryDetach();
+
 	/* autovacuum needs this set before calling InitProcess */
 	if (strcmp(argv[1], "--forkavlauncher") == 0)
 		AutovacuumLauncherIAm();


Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-12 Thread Andres Freund

On 2015-10-12 21:38:12 +0900, Michael Paquier wrote:
> >> It feels wrong to do this in syslogger.c - I mean it's not the only
> >> process that's not attached to shared memory. Sure, the others get
> >> killed, but nonetheless...
> >
> >
> > +1. It feels like we're setting ourselves up for repeating this mistake at
> > some later time :)
> 
> Actually, doesn't this apply as well to the archiver and the pgstat
> collector?

As mentioned above? The difference is that the archiver et al get killed
by postmaster during a PANIC restart thus don't present the problem
discussed here.

> So perhaps we may want to do that in SubPostmasterMain with
> PGSharedMemoryDetach. See for example the attached as an idea (patch
> completely untested).

> + /*
> +  * Close any existing shared memory segment as those processes do not
> +  * need to have an access to it. This state is inherited from the
> +  * postmaster whether they need it or not.
> +  */
> + if (strcmp(argv[1], "--forkarch") == 0 ||
> + strcmp(argv[1], "--forkcol") == 0 ||
> + strcmp(argv[1], "--forklog") == 0)
> + PGSharedMemoryDetach();
> +

Well, in those cases we won't have attached to shared memory, so I'm not
convinced that this is the right solution. In fact, won't this lead to
hitting the elog in
    void
    PGSharedMemoryDetach(void)
    {
        if (UsedShmemSegAddr != NULL)
        {
            if (!UnmapViewOfFile(UsedShmemSegAddr))
                elog(LOG, "could not unmap view of shared memory: error code %lu", GetLastError());

            UsedShmemSegAddr = NULL;
        }
    }
UsedShmemSegAddr will have been setup by read_backend_variables(), but
the process won't have anything mapped at this point?

Greetings,

Andres Freund


Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-12 Thread Dmitry Vasilyev

Hello, Amit!
On Mon, 2015-10-12 at 11:25 +0530, Amit Kapila wrote:
> On Sun, Oct 11, 2015 at 9:12 PM, Tom Lane wrote:
> >
> > Magnus Hagander writes:
> > > On Sun, Oct 11, 2015 at 5:22 PM, Tom Lane wrote:
> > >> I'm a bit suspicious that we may have leaked a handle to the shared
> > >> memory block someplace, for example.  That would explain why this
> > >> symptom is visible now when it was not in 2009.  Or maybe it's dependent
> > >> on some feature that we didn't test back then --- for instance, if
> > >> the logging collector is in use, could it have inherited a handle and
> > >> not closed it?
> >
> > > Even if we leaked it, it should go away when the other processes died.
> >
> > I'm fairly certain that we do not kill/restart the logging collector
> > during a database restart (because it's impossible to reproduce the
> > original stderr destination if we do).
>
> True and it seems this is the reason for issue we are discussing here.
> The reason why this happens is that during creation of shared memory
> (PGSharedMemoryCreate()), we duplicate the handle such that it
> become inheritable by all child processes.  Then during fork
> (syslogger_forkexec()->postmaster_forkexec()->internal_forkexec) we
> always inherit the handles which causes syslogger to get a copy of
> shared memory handle which it neither uses and nor closes it.
>
> I could easily reproduce the issue if logging collector is on and even if
> we try to increase the loop count or sleep time in PGSharedMemoryCreate(),
> it doesn't change the situation as the syslogger has a valid handle to
> shared memory.  One way to fix is to just close the shared memory handle
> in sys logger as we are not going to need it and attached patch which does
> this fixes the issue for me.  Another invasive fix in case we want to
> retain shared memory handle for some purpose (future requirement) could
> be to send some signal to syslogger in restart path so that it can release
> the shared memory handle.
>
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
The specified patch with "#ifdef WIN32" works for me. Might it also be
necessary to check for open handles from replication, for example?
--
Dmitry Vasilyev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
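
The root cause described above can be modelled in portable C as a reference count on a named kernel object. This is only an illustration under stated assumptions: NamedSegment and the segment_* functions are hypothetical names, not PostgreSQL or Win32 APIs, and the model ignores mapped views, which also keep a real file mapping alive.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a named kernel object such as a file mapping. */
typedef struct
{
	int		open_handles;	/* handles held by any process */
	bool	exists;			/* object is still present under its name */
} NamedSegment;

/* The creator (think: postmaster) holds the first handle. */
void
segment_create(NamedSegment *seg)
{
	seg->open_handles = 1;
	seg->exists = true;
}

/* A child started with handle inheritance gets its own reference. */
void
segment_inherit(NamedSegment *seg)
{
	assert(seg->exists);
	seg->open_handles++;
}

/* Closing a handle destroys the object only if it was the last one. */
void
segment_close_handle(NamedSegment *seg)
{
	assert(seg->open_handles > 0);
	if (--seg->open_handles == 0)
		seg->exists = false;
}

/* A fresh segment with the same name is possible only once the old one is gone. */
bool
segment_can_recreate(const NamedSegment *seg)
{
	return !seg->exists;
}
```

In this model, after the creator closes its handle while a child that inherited a copy keeps its own, segment_can_recreate() stays false. That matches the behaviour reported in this thread, where the syslogger survives the restart and still holds the handle.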

Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-12 Thread Andres Freund

On 2015-10-12 10:04:55 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2015-10-12 21:38:12 +0900, Michael Paquier wrote:
> >> Actually, doesn't this apply as well to the archiver and the pgstat
> >> collector?
> 
> > As mentioned above? The difference is that the archiver et al get killed
> > by postmaster during a PANIC restart thus don't present the problem
> > discussed here.
> 
> I thought your objection to the original patch was exactly that we should
> not treat syslogger as a special case for this purpose.

Yes. The above was just about this not being actively broken - I'd
mentioned the other processes before and to me it sounded like Michael
thought there might be an active problem.

> > Well, in those cases we won't have attached to shared memory, so I'm not
> > convinced that this is the right solution.
> 
> No, you're missing the point.

Don't think so.

> In Windows builds, child processes inherit
> a "handle" reference to the shared memory mapping, whether or not they
> make any use of the handle to re-attach to that shared memory.  The point
> here is that we need to close that handle if we're not going to use it.

Right. But that doesn't mean it's right to call PGSharedMemoryDetach()
without other changes as done in Michael's proposed patch? That'll do an
UnmapViewOfFile() which'll fail because nothing is mapped, but still not
close UsedShmemSegID?

Greetings,

Andres Freund



Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-12 Thread Tom Lane

Andres Freund  writes:
> Right. But that doesn't mean it's right to call PGSharedMemoryDetach()
> without other changes as done in Michael's proposed patch? That'll do an
> UnmapViewOfFile() which'll fail because nothing is mapped, but still not
> close UsedShmemSegID?

Ah, right, I'd not noticed that he proposed changing
CloseHandle(UsedShmemSegID) to PGSharedMemoryDetach().  The latter is
clearly the wrong thing.

I'm not sure whether we should just put the CloseHandle call in
postmaster.c, or invent a function in win32_shmem.c to provide a
layer of abstraction.

regards, tom lane
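
The distinction drawn above, between unmapping a view and closing the inherited handle, can be sketched with a small portable model. Everything here is hypothetical and for illustration only: ChildShmemState, model_detach and model_no_reattach are invented names that merely mimic the roles of UsedShmemSegAddr, UsedShmemSegID and the two cleanup paths under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state a child inherits from the postmaster. */
typedef struct
{
	void   *seg_addr;		/* plays the role of UsedShmemSegAddr */
	bool	view_mapped;	/* did this process really map a view? */
	bool	handle_open;	/* plays the role of UsedShmemSegID */
	int		unmap_errors;	/* counts "could not unmap view" messages */
} ChildShmemState;

/* Models the detach path: it unmaps the view and never touches the handle. */
void
model_detach(ChildShmemState *st)
{
	assert(st != NULL);
	if (st->seg_addr != NULL)
	{
		if (!st->view_mapped)
			st->unmap_errors++;		/* unmapping fails: nothing is mapped */
		st->view_mapped = false;
		st->seg_addr = NULL;
	}
}

/* Models the cleanup for a child that chooses not to re-attach. */
void
model_no_reattach(ChildShmemState *st)
{
	assert(st != NULL);
	st->seg_addr = NULL;
	st->handle_open = false;	/* the step the detach path never performs */
}
```

For a child that restored the address but never mapped a view, the detach model records one unmap error and leaves the handle open, while the no-reattach model closes the handle without any error.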


Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-12 Thread Amit Kapila

On Mon, Oct 12, 2015 at 8:10 PM, Tom Lane wrote:
>
> I wrote:
> > Andres Freund writes:
> >> Right. But that doesn't mean it's right to call PGSharedMemoryDetach()
> >> without other changes as done in Michael's proposed patch? That'll do an
> >> UnmapViewOfFile() which'll fail because nothing is mapped, but still not
> >> close UsedShmemSegID?
>
> > Ah, right, I'd not noticed that he proposed changing
> > CloseHandle(UsedShmemSegID) to PGSharedMemoryDetach().  The latter is
> > clearly the wrong thing.
>
> Actually, now that I look at it, it's even more obvious that this is the
> wrong thing because *all the subprocess types in question already call
> PGSharedMemoryDetach*.  That's necessary on Unix, but I should think that
> on Windows all it will do is provoke the log message:
>
>     elog(LOG, "could not unmap view of shared memory: error code %lu", GetLastError());
>
> Could someone confirm whether syslogger, archiver, stats collector
> processes reliably produce that log message at startup on Windows?
>

I tried this approach of calling PGSharedMemoryDetach() for the
syslogger before I wrote the CloseHandle() patch; I saw that message
and concluded that it is not going to work.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-12 Thread Tom Lane

I wrote:
> Actually, now that I look at it, it's even more obvious that this is the
> wrong thing because *all the subprocess types in question already call
> PGSharedMemoryDetach*.

Ah, scratch that: in most of them, the call is in #ifndef EXEC_BACKEND
stanzas.  The exception is bgworker start for a non-attached-to-shmem
worker, and in that case there's no log message because in fact
SubPostmasterMain did reattach.

This is kind of a mess :-(.  But it does look like what we want is
for SubPostmasterMain to do more than nothing when it chooses not to
reattach.  Probably that should include resetting UsedShmemSegAddr to
NULL, as well as closing the handle.

regards, tom lane


Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-12 Thread Tom Lane

Oleg Bartunov  writes:
> Assuming the problem will be fixed, should we release Beta2 soon ?

This bug has existed since we had native Windows support.  It's entirely
immaterial for beta purposes, and I have a hard time thinking it's
critical enough to justify a short release cycle for the back branches
either.

regards, tom lane



Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-12 Thread Oleg Bartunov

On Mon, Oct 12, 2015 at 4:42 PM, Dmitry Vasilyev wrote:

> Hello, Amit!
>
> On Mon, 2015-10-12 at 11:25 +0530, Amit Kapila wrote:
>
> On Sun, Oct 11, 2015 at 9:12 PM, Tom Lane  wrote:
> >
> > Magnus Hagander  writes:
> > > On Sun, Oct 11, 2015 at 5:22 PM, Tom Lane  wrote:
> > >> I'm a bit suspicious that we may have leaked a handle to the shared
> > >> memory block someplace, for example.  That would explain why this
> > >> symptom is visible now when it was not in 2009.  Or maybe it's dependent
> > >> on some feature that we didn't test back then --- for instance, if
> > >> the logging collector is in use, could it have inherited a handle and
> > >> not closed it?
> >
> > > Even if we leaked it, it should go away when the other processes died.
> >
> > I'm fairly certain that we do not kill/restart the logging collector
> > during a database restart (because it's impossible to reproduce the
> > original stderr destination if we do).
>
> True and it seems this is the reason for issue we are discussing here.
> The reason why this happens is that during creation of shared memory
> (PGSharedMemoryCreate()), we duplicate the handle such that it
> become inheritable by all child processes.  Then during fork
> (syslogger_forkexec()->postmaster_forkexec()->internal_forkexec) we
> always inherit the handles which causes syslogger to get a copy of
> shared memory handle which it neither uses and nor closes it.
>
> I could easily reproduce the issue if logging collector is on and even if
> we try to increase the loop count or sleep time in PGSharedMemoryCreate(),
> it doesn't change the situation as the syslogger has a valid handle to
> shared memory.  One way to fix is to just close the shared memory handle
> in sys logger as we are not going to need it and attached patch which does
> this fixes the issue for me.  Another invasive fix in case we want to
> retain shared memory handle for some purpose (future requirement) could
> be to send some signal to syslogger in restart path so that it can release
> the shared memory handle.
>
>
>
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>
>
> Specified patch with "ifdef WIN32" is working for me. Maybe it’s necessary
> to check open handlers from replication for example?
>
>
Assuming the problem will be fixed, should we release Beta2 soon ?



>
> --
> Dmitry Vasilyev
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>

Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-12 Thread Tom Lane

Andres Freund  writes:
> On 2015-10-12 21:38:12 +0900, Michael Paquier wrote:
>> Actually, doesn't this apply as well to the archiver and the pgstat
>> collector?

> As mentioned above? The difference is that the archiver et al get killed
> by postmaster during a PANIC restart thus don't present the problem
> discussed here.

I thought your objection to the original patch was exactly that we should
not treat syslogger as a special case for this purpose.

> Well, in those cases we won't have attached to shared memory, so I'm not
> convinced that this is the right solution.

No, you're missing the point.  In Windows builds, child processes inherit
a "handle" reference to the shared memory mapping, whether or not they
make any use of the handle to re-attach to that shared memory.  The point
here is that we need to close that handle if we're not going to use it.

I think the right thing is something close to Michael's proposed patch,
though not duplicating and reversing the previous if-test like that.
In other words, something like this in SubPostmasterMain:

    /*
     * If appropriate, physically re-attach to shared memory segment. We want
     * to do this before going any further to ensure that we can attach at the
     * same address the postmaster used.
+    * If we're not re-attaching, close the inherited handle to avoid leaks.
     */
    if (strcmp(argv[1], "--forkbackend") == 0 ||
        strcmp(argv[1], "--forkavlauncher") == 0 ||
        strcmp(argv[1], "--forkavworker") == 0 ||
        strcmp(argv[1], "--forkboot") == 0 ||
        strncmp(argv[1], "--forkbgworker=", 15) == 0)
        PGSharedMemoryReAttach();
+#ifdef WIN32
+   else
+       close the handle;
+#endif

regards, tom lane
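
The if-test in the snippet above amounts to a yes/no decision per child type, which can be written as a small standalone helper. This is only a sketch: child_needs_shmem is a hypothetical name, not a function in the PostgreSQL tree, and the list of arguments is simply copied from the snippet.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Returns true if a child started with the given --fork... argument is one
 * of those that re-attach to shared memory.  Any other child would take the
 * "else" branch and should release the handle it inherited.
 */
bool
child_needs_shmem(const char *forkarg)
{
	assert(forkarg != NULL);

	return strcmp(forkarg, "--forkbackend") == 0 ||
		strcmp(forkarg, "--forkavlauncher") == 0 ||
		strcmp(forkarg, "--forkavworker") == 0 ||
		strcmp(forkarg, "--forkboot") == 0 ||
		/* bgworker arguments carry a suffix, so compare the 15-byte prefix */
		strncmp(forkarg, "--forkbgworker=", 15) == 0;
}
```

With this helper, "--forklog", "--forkarch" and "--forkcol" (the syslogger, archiver and stats collector arguments named earlier in the thread) all yield false, so a single else branch covers every child that does not re-attach instead of special-casing the syslogger.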


Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-12 Thread Tom Lane

I wrote:
> Andres Freund  writes:
>> Right. But that doesn't mean it's right to call PGSharedMemoryDetach()
>> without other changes as done in Michael's proposed patch? That'll do an
>> UnmapViewOfFile() which'll fail because nothing is mapped, but still not
>> close UsedShmemSegID?

> Ah, right, I'd not noticed that he proposed changing
> CloseHandle(UsedShmemSegID) to PGSharedMemoryDetach().  The latter is
> clearly the wrong thing.

Actually, now that I look at it, it's even more obvious that this is the
wrong thing because *all the subprocess types in question already call
PGSharedMemoryDetach*.  That's necessary on Unix, but I should think that
on Windows all it will do is provoke the log message:

    elog(LOG, "could not unmap view of shared memory: error code %lu", GetLastError());

Could someone confirm whether syslogger, archiver, stats collector
processes reliably produce that log message at startup on Windows?

regards, tom lane


Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-12 Thread Tom Lane

I wrote:
> This is kind of a mess :-(.  But it does look like what we want is
> for SubPostmasterMain to do more than nothing when it chooses not to
> reattach.  Probably that should include resetting UsedShmemSegAddr to
> NULL, as well as closing the handle.

After poking around a bit more, I propose the attached patch.  I've
checked that this is happy with an EXEC_BACKEND Unix build, but I'm not
able to test it on Windows ... would somebody do that?

BTW, it appears from this that Cygwin builds have been broken right along
in a different way: according to the code in sysv_shmem's
PGSharedMemoryReAttach, Cygwin does cause a re-attach to occur, which we
were not undoing for putatively-not-connected-to-shmem child processes.
That's a robustness problem because it breaks the postmaster's expectation
that it's safe to not reinitialize shmem after a crash of one of those
processes.  I believe this patch fixes that problem as well, though if
anyone can test it on Cygwin that wouldn't be a bad thing either.

regards, tom lane

diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 8be5bbe..c7a3a91 100644
*** a/src/backend/port/sysv_shmem.c
--- b/src/backend/port/sysv_shmem.c
*** PGSharedMemoryReAttach(void)
*** 619,624 
--- 619,652 
  
  	UsedShmemSegAddr = hdr;		/* probably redundant */
  }
+ 
+ /*
+  * PGSharedMemoryNoReAttach
+  *
+  * Clean up if we choose *not* to re-attach to an already existing shared
+  * memory segment.  This is not used in the non EXEC_BACKEND case, either.
+  *
+  * UsedShmemSegID and UsedShmemSegAddr are implicit parameters to this
+  * routine.  The caller must have already restored them to the postmaster's
+  * values.
+  */
+ void
+ PGSharedMemoryNoReAttach(void)
+ {
+ 	Assert(UsedShmemSegAddr != NULL);
+ 	Assert(IsUnderPostmaster);
+ 
+ #ifdef __CYGWIN__
+ 	/* cygipc (currently) appears to not detach on exec. */
+ 	PGSharedMemoryDetach();
+ #endif
+ 
+ 	/* For cleanliness, reset UsedShmemSegAddr to show we're not attached. */
+ 	UsedShmemSegAddr = NULL;
+ 	/* And the same for UsedShmemSegID. */
+ 	UsedShmemSegID = 0;
+ }
+ 
  #endif   /* EXEC_BACKEND */
  
  /*
*** PGSharedMemoryReAttach(void)
*** 629,634 
--- 657,665 
   * (it will have an on_shmem_exit callback registered to do that).  Rather,
   * this is for subprocesses that have inherited an attachment and want to
   * get rid of it.
+  *
+  * UsedShmemSegID and UsedShmemSegAddr are implicit parameters to this
+  * routine.
   */
  void
  PGSharedMemoryDetach(void)
diff --git a/src/backend/port/win32_shmem.c b/src/backend/port/win32_shmem.c
index db67627..8152522 100644
*** a/src/backend/port/win32_shmem.c
--- b/src/backend/port/win32_shmem.c
***
*** 17,23 
  #include "storage/ipc.h"
  #include "storage/pg_shmem.h"
  
! HANDLE		UsedShmemSegID = 0;
  void	   *UsedShmemSegAddr = NULL;
  static Size UsedShmemSegSize = 0;
  
--- 17,23 
  #include "storage/ipc.h"
  #include "storage/pg_shmem.h"
  
! HANDLE		UsedShmemSegID = INVALID_HANDLE_VALUE;
  void	   *UsedShmemSegAddr = NULL;
  static Size UsedShmemSegSize = 0;
  
*** PGSharedMemoryCreate(Size size, bool mak
*** 218,226 
  		elog(LOG, "could not close handle to shared memory: error code %lu", GetLastError());
  
  
- 	/* Register on-exit routine to delete the new segment */
- 	on_shmem_exit(pgwin32_SharedMemoryDelete, PointerGetDatum(hmap2));
- 
  	/*
  	 * Get a pointer to the new shared memory segment. Map the whole segment
  	 * at once, and let the system decide on the initial address.
--- 218,223 
*** PGSharedMemoryCreate(Size size, bool mak
*** 254,259 
--- 251,259 
  	UsedShmemSegSize = size;
  	UsedShmemSegID = hmap2;
  
+ 	/* Register on-exit routine to delete the new segment */
+ 	on_shmem_exit(pgwin32_SharedMemoryDelete, PointerGetDatum(hmap2));
+ 
  	*shim = hdr;
  	return hdr;
  }
*** PGSharedMemoryReAttach(void)
*** 299,321 
  }
  
  /*
   * PGSharedMemoryDetach
   *
   * Detach from the shared memory segment, if still attached.  This is not
!  * intended for use by the process that originally created the segment. Rather,
   * this is for subprocesses that have inherited an attachment and want to
   * get rid of it.
   */
  void
  PGSharedMemoryDetach(void)
  {
  	if (UsedShmemSegAddr != NULL)
  	{
  		if (!UnmapViewOfFile(UsedShmemSegAddr))
! 			elog(LOG, "could not unmap view of shared memory: error code %lu", GetLastError());
  
  		UsedShmemSegAddr = NULL;
  	}
  }
  
  
--- 299,368 
  }
  
  /*
+  * PGSharedMemoryNoReAttach
+  *
+  * Clean up if we choose *not* to re-attach to an already existing shared
+  * memory segment.
+  *
+  * UsedShmemSegID and UsedShmemSegAddr are implicit parameters to this
+  * routine.  The caller must have already restored them to the postmaster's
+  * values.
+  */
+ void
+ PGSharedMemoryNoReAttach(void)
+ {
+ 	Assert(UsedShmemSegAddr != NULL);
+

Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-12 Thread Dmitry Vasilyev

Hello Tom!
On Mon, 2015-10-12 at 16:35 -0400, Tom Lane wrote:
> I wrote:
> > This is kind of a mess :-(.  But it does look like what we want is
> > for SubPostmasterMain to do more than nothing when it chooses not to
> > reattach.  Probably that should include resetting UsedShmemSegAddr to
> > NULL, as well as closing the handle.
>
> After poking around a bit more, I propose the attached patch.  I've
> checked that this is happy with an EXEC_BACKEND Unix build, but I'm not
> able to test it on Windows ... would somebody do that?
>
> BTW, it appears from this that Cygwin builds have been broken right along
> in a different way: according to the code in sysv_shmem's
> PGSharedMemoryReAttach, Cygwin does cause a re-attach to occur, which we
> were not undoing for putatively-not-connected-to-shmem child processes.
> That's a robustness problem because it breaks the postmaster's expectation
> that it's safe to not reinitialize shmem after a crash of one of those
> processes.  I believe this patch fixes that problem as well, though if
> anyone can test it on Cygwin that wouldn't be a bad thing either.
>
>   regards, tom lane
> 

This patch works for me.
Binaries: https://goo.gl/32j7QE (MSVC 2010; build script here:
https://github.com/postgrespro/pgwininstall).


--
Dmitry Vasilyev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-11 Thread Magnus Hagander

On Sun, Oct 11, 2015 at 5:55 AM, Michael Paquier wrote:

> On Sun, Oct 11, 2015 at 8:54 AM, Ali Akbar  wrote:
> > C:\Windows\system32>taskkill /F /PID 2080
> > SUCCESS: The process with PID 2080 has been terminated.
>
> taskkill /f *forcefully* terminates the process targeted [1]. Isn't
> that equivalent to a kill -9? If you headshot a backend process on
> Linux with kill -9, an instance won't restart either.
> [1]:
> http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/taskkill.mspx?mfr=true



It does. If you want a "graceful kill" on Windows, you must use "pg_ctl
kill", which can send an "emulated term-signal".

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-11 Thread Amit Kapila

On Sun, Oct 11, 2015 at 10:09 AM, Tom Lane  wrote:
> Dmitry Vasilyev  writes:
> > The log you can see below:
> > ...
> > 2015-10-10 19:00:32 AST DEBUG:  cleaning up dynamic shared memory control segment with ID 851401618
> > 2015-10-10 19:00:32 AST DEBUG:  invoking IpcMemoryCreate(size=290095104)
> > 2015-10-10 19:00:42 AST FATAL:  pre-existing shared memory block is still in use
> > 2015-10-10 19:00:42 AST HINT:  Check if there are any old server processes still running, and terminate them.
>
..
>
> If I had to guess, on the basis of no evidence, I'd wonder whether the
> DSM code broke it; there is evidently at least one DSM segment in play
> in your use-case.  But that's only a guess.
>

The DEBUG messages above suggest that DSM could be involved, but I think
the last message (pre-existing shared memory block is still in use) is not
logged for DSM.  We create the new DSM segment in the code path
dsm_postmaster_startup()->dsm_impl_op()->dsm_impl_windows():

dsm_impl_windows()
{
..
if (op == DSM_OP_CREATE)
..
}

Basically, in this path we retry creating the DSM segment under a different
name if creation fails with an ALREADY_EXISTS error.

To diagnose the cause of the problem, I think we can write a diagnostic
patch that does the following two things:

1. Increase the loop count below from 10 to 50 or 100 in win32_shmem.c;
alternatively, instead of the loop count, we can increase the sleep time.
PGSharedMemoryCreate()
{
..
for (i = 0; i < 10; i++)
..
if (GetLastError() == ERROR_ALREADY_EXISTS)
{
..
Sleep(1000);
continue;
}
..
}

2. Add more log messages in both win32_shmem.c and the DSM-related
code, which can help us narrow down the problem.

If you find this a reasonable approach to diagnosing the root cause
of the problem, I can work on writing a diagnostic patch.
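
The bounded retry that point 1 proposes to lengthen can be sketched roughly as follows. This is only an illustration, not the code in win32_shmem.c: create_with_retry, SegResult, fake_create and no_sleep are hypothetical names, and fake_create merely pretends that an old segment stays visible for a fixed number of attempts.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical outcome codes; the real code inspects GetLastError(). */
typedef enum { SEG_CREATED, SEG_ALREADY_EXISTS, SEG_FAILED } SegResult;

typedef SegResult (*create_fn) (void *arg);
typedef void (*sleep_fn) (unsigned ms);

/*
 * Try to create the segment up to max_tries times, sleeping delay_ms
 * between attempts while the old segment is still reported as existing.
 * Returns the 1-based attempt that succeeded, 0 if every attempt saw the
 * old segment, or -1 on any other failure.
 */
int
create_with_retry(create_fn create, void *arg, sleep_fn nap,
                  int max_tries, unsigned delay_ms)
{
    assert(max_tries > 0);
    for (int i = 0; i < max_tries; i++)
    {
        SegResult r = create(arg);

        if (r == SEG_CREATED)
            return i + 1;
        if (r != SEG_ALREADY_EXISTS)
            return -1;
        fprintf(stderr, "attempt %d: old segment still exists\n", i + 1);
        if (i + 1 < max_tries)
            nap(delay_ms);
    }
    return 0;
}

/* Stand-in for the OS: the old segment stays visible for *arg more tries. */
SegResult
fake_create(void *arg)
{
    int *busy = arg;

    if (*busy > 0)
    {
        (*busy)--;
        return SEG_ALREADY_EXISTS;
    }
    return SEG_CREATED;
}

/* Stand-in for Sleep(); does nothing so the sketch runs instantly. */
void
no_sleep(unsigned ms)
{
    (void) ms;
}
```

With a holder that lets go after three attempts, create_with_retry(fake_create, &busy, no_sleep, 10, 1000) succeeds on the fourth try. If the holder never lets go, a larger max_tries or delay_ms only postpones the same failure, which is why extra logging is the more informative half of the proposal.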

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-11 Thread Tom Lane

Magnus Hagander  writes:
> On Sun, Oct 11, 2015 at 5:22 PM, Tom Lane  wrote:
>> I'm a bit suspicious that we may have leaked a handle to the shared
>> memory block someplace, for example.  That would explain why this
>> symptom is visible now when it was not in 2009.  Or maybe it's dependent
>> on some feature that we didn't test back then --- for instance, if
>> the logging collector is in use, could it have inherited a handle and
>> not closed it?

> Even if we leaked it, it should go away when the other processes died.

I'm fairly certain that we do not kill/restart the logging collector
during a database restart (because it's impossible to reproduce the
original stderr destination if we do).  Not sure if any other postmaster
children are allowed to survive.

regards, tom lane



Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-11 Thread Tom Lane

Andrew Dunstan  writes:
> Amit's proposals elsewhere to increase the shmem timeout and increase 
> logging seem reasonable.

I'm back to the position I had in the previous thread, which is that
we don't really understand why any delay is needed here at all, and
we ought to try to remedy that lack rather than just hoping that more
and more delay will fix it.  It may be that there's some proactive
measure we can take to improve matters.

I'm a bit suspicious that we may have leaked a handle to the shared
memory block someplace, for example.  That would explain why this
symptom is visible now when it was not in 2009.  Or maybe it's dependent
on some feature that we didn't test back then --- for instance, if
the logging collector is in use, could it have inherited a handle and
not closed it?

One thing I noticed in the CreateFileMapping docs is that Windows
apparently implements the sort of anonymous mapping we're doing as
a mapping of part of the "system paging file".  I wonder if it's too
dumb (perhaps in only some releases) to realize that it doesn't
really need to flush dirty pages to disk when the last reference
to the mapping is abandoned.  In that case maybe an explicit flush
request would move things along.

regards, tom lane


Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-11 Thread Andrew Dunstan

On 10/11/2015 05:58 AM, Magnus Hagander wrote:
> On Sun, Oct 11, 2015 at 5:55 AM, Michael Paquier wrote:
>> On Sun, Oct 11, 2015 at 8:54 AM, Ali Akbar wrote:
>>> C:\Windows\system32>taskkill /F /PID 2080
>>> SUCCESS: The process with PID 2080 has been terminated.
>>
>> taskkill /f *forcefully* terminates the process targeted [1]. Isn't
>> that equivalent to a kill -9? If you headshot a backend process on
>> Linux with kill -9, an instance won't restart either.
>> [1]:
>> http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/taskkill.mspx?mfr=true
>
> It does. If you want a "graceful kill" on Windows, you must use
> "pg_ctl kill", which can send an "emulated term-signal".

Nevertheless, we'd like a hard crash of a backend other than the
postmaster not to have worse effects than on *nix, where killing a
backend even with SIGKILL doesn't halt the server:

andrew=# select pg_backend_pid();
 pg_backend_pid
----------------
          24359
(1 row)

andrew=# \! kill -9 24359
andrew=# select 1;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
andrew=#
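
As a rough illustration of the Unix side of this comparison, here is a minimal POSIX sketch of how a parent process can tell that a child was killed by a signal, which is the information a supervisor needs before deciding to clean up and carry on. It is not PostgreSQL's postmaster code; reap_killed_child is a hypothetical helper.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that only waits, kill it with `sig`, and report how the
 * parent sees it die.  Returns the terminating signal number, 0 if the
 * child exited normally, or -1 on error.
 */
int
reap_killed_child(int sig)
{
    pid_t pid;
    int   status;

    assert(sig > 0);            /* signal 0 would only probe for existence */

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
    {
        for (;;)
            pause();            /* child: wait to be killed */
    }

    if (kill(pid, sig) != 0)
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling reap_killed_child(SIGKILL) returns SIGKILL in the parent, which keeps running; the child's death is an event the parent observes, not something that takes the parent down with it.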

Amit's proposals elsewhere to increase the shmem timeout and increase
logging seem reasonable.

cheers

andrew


Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-11 Thread Magnus Hagander

On Sun, Oct 11, 2015 at 4:32 PM, Andrew Dunstan  wrote:

>
>
> On 10/11/2015 05:58 AM, Magnus Hagander wrote:
>
>>
>>
>> On Sun, Oct 11, 2015 at 5:55 AM, Michael Paquier <
>> michael.paqu...@gmail.com > wrote:
>>
>> On Sun, Oct 11, 2015 at 8:54 AM, Ali Akbar wrote:
>> > C:\Windows\system32>taskkill /F /PID 2080
>> > SUCCESS: The process with PID 2080 has been terminated.
>>
>> taskkill /f *forcefully* terminates the process targeted [1]. Isn't
>> that equivalent to a kill -9? If you headshot a backend process on
>> Linux with kill -9, an instance won't restart either.
>> [1]:
>>
>> http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/taskkill.mspx?mfr=true
>>
>>
>>
>> It does. If you want a "graceful kill" on Windows, you must use "pg_ctl
>> kill", which can send an "emulated term-signal".
>>
>>
>>
> Nevertheless, we'd like a hard crash of a backend other than the
> postmaster not to have worse effects than on *nix, where killing a backend
> even with SIGKILL doesn't halt the server:
>

Oh, absolutely. I was just pointing out that something like taskkill
*should* result in a hard restart of *all* backends; if you want to
kill off just the one, you should never use it and should instead use
pg_ctl kill. But of course, neither of those should lead to the scenario
described here, where the server does not come back up again.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-11 Thread Magnus Hagander

On Sun, Oct 11, 2015 at 5:22 PM, Tom Lane  wrote:

> Andrew Dunstan  writes:
> > Amit's proposals elsewhere to increase the shmem timeout and increase
> > logging seem reasonable.
>
> I'm back to the position I had in the previous thread, which is that
> we don't really understand why any delay is needed here at all, and
> we ought to try to remedy that lack rather than just hoping that more
> and more delay will fix it.  It may be that there's some proactive
> measure we can take to improve matters.
>
> I'm a bit suspicious that we may have leaked a handle to the shared
> memory block someplace, for example.  That would explain why this
> symptom is visible now when it was not in 2009.  Or maybe it's dependent
> on some feature that we didn't test back then --- for instance, if
> the logging collector is in use, could it have inherited a handle and
> not closed it?
>

Even if we leaked it, it should go away when the other processes died.

What would be interesting to know is whether *any* postgres.exe process is
still running at this point, and if so, which one. It should
then be possible to use Process Explorer to figure out which process it is
(by looking at the "fake title"), and probably also which shared memory
handles it has open (even though they don't have a name, their info might
explain things).

So if someone with a reproducible case could check that as well, I think it
would be valuable information.

> One thing I noticed in the CreateFileMapping docs is that Windows
> apparently implements the sort of anonymous mapping we're doing as
> a mapping of part of the "system paging file".  I wonder if it's too
> dumb (perhaps in only some releases) to realize that it doesn't
> really need to flush dirty pages to disk when the last reference
> to the mapping is abandoned.  In that case maybe an explicit flush
> request would move things along.
>
>
First of all, note that the "system paging file" is exactly the same as a
"swap file" or "swap partition" on Unix, just in case there is any confusion
there.

And I'm pretty sure it doesn't do that. Surely we would've seen performance
issues from that before in that case. But I don't really have any facts to
back that up :)

We do get, AIUI, the SEC_COMMIT behaviour which commits the pages initially
to make sure there is actually space for them. I don't believe that one
specifically says anything about when you close it.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-11 Thread Michael Paquier

> On Sun, Oct 11, 2015 at 5:55 AM, Michael Paquier wrote:
>> On Sun, Oct 11, 2015 at 8:54 AM, Ali Akbar  wrote:
>> > C:\Windows\system32>taskkill /F /PID 2080
>> > SUCCESS: The process with PID 2080 has been terminated.
>>
>> taskkill /f *forcefully* terminates the process targeted [1]. Isn't
>> that equivalent to a kill -9? If you headshot a backend process on
>> Linux with kill -9, an instance won't restart either.
>> [1]:
>> http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/taskkill.mspx?mfr=true
> It does. If you want a "graceful kill" on Windows, you must use "pg_ctl
> kill", which can send an "emulated term-signal".

Ah, yes. Sure. I had restart_after_crash = off on this instance...
-- 
Michael



Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-11 Thread Amit Kapila

On Sun, Oct 11, 2015 at 9:12 PM, Tom Lane  wrote:
>
> Magnus Hagander  writes:
> > On Sun, Oct 11, 2015 at 5:22 PM, Tom Lane  wrote:
> >> I'm a bit suspicious that we may have leaked a handle to the shared
> >> memory block someplace, for example.  That would explain why this
> >> symptom is visible now when it was not in 2009.  Or maybe it's dependent
> >> on some feature that we didn't test back then --- for instance, if
> >> the logging collector is in use, could it have inherited a handle and
> >> not closed it?
>
> > Even if we leaked it, it should go away when the other processes died.
>
> I'm fairly certain that we do not kill/restart the logging collector
> during a database restart (because it's impossible to reproduce the
> original stderr destination if we do).

True, and this seems to be the cause of the issue we are discussing here.
It happens because, when creating the shared memory segment
(PGSharedMemoryCreate()), we duplicate the handle so that it
becomes inheritable by all child processes.  Then during fork
(syslogger_forkexec()->postmaster_forkexec()->internal_forkexec) we
always inherit the handles, so the syslogger gets a copy of the
shared memory handle, which it neither uses nor closes.

I could easily reproduce the issue with the logging collector on, and
increasing the loop count or sleep time in PGSharedMemoryCreate()
doesn't change the situation, because the syslogger still holds a valid
handle to shared memory.  One way to fix this is simply to close the shared
memory handle in the syslogger, since it is never going to need it; the
attached patch does this and fixes the issue for me.  A more invasive fix,
in case we want to retain the shared memory handle for some purpose (a
future requirement), would be to send a signal to the syslogger in the
restart path so that it can release the shared memory handle.
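
The mechanism described above can be modelled with a toy reference count. The sketch below does not use the Windows API and is not the attached patch; Segment, Proc and old_segment_lingers are hypothetical names, and the only assumption carried over from the thread is that a kernel object stays alive for as long as any process holds an open handle to it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy kernel object: it exists for as long as any handle to it is open. */
typedef struct { int open_handles; } Segment;

/* Toy process: holds at most one handle to the segment. */
typedef struct { Segment *seg; bool has_handle; } Proc;

static void
take_handle(Proc *p, Segment *seg)
{
    p->seg = seg;
    p->has_handle = true;
    seg->open_handles++;
}

static void
close_handle(Proc *p)
{
    assert(p->has_handle);
    p->has_handle = false;
    p->seg->open_handles--;
}

/* Process death: the OS closes whatever handles the process still holds. */
static void
proc_exit(Proc *p)
{
    if (p->has_handle)
        close_handle(p);
}

/*
 * Replay the restart sequence and report whether the old segment still
 * exists when the postmaster tries to create a new one.
 */
bool
old_segment_lingers(bool syslogger_closes_handle)
{
    Segment seg = {0};
    Proc    postmaster = {0}, syslogger = {0}, backend = {0};

    take_handle(&postmaster, &seg);   /* creates the segment */
    take_handle(&syslogger, &seg);    /* inherited at fork/exec */
    take_handle(&backend, &seg);      /* inherited at fork/exec */

    if (syslogger_closes_handle)
        close_handle(&syslogger);     /* the proposed fix */

    proc_exit(&backend);              /* backend is killed */
    close_handle(&postmaster);        /* postmaster lets go before re-creating */

    /* The syslogger is not restarted, so its handle (if any) stays open. */
    return seg.open_handles > 0;
}
```

In this model old_segment_lingers(false) is true no matter how long the postmaster waits, while old_segment_lingers(true) is false, which matches the observation that a longer retry loop does not help but closing the handle in the syslogger does.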

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

fix_syslogger_dangling_shmhandle_v1.patch
Description: Binary data


Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-10 Thread Tom Lane

Robert Haas  writes:
> On Fri, Oct 9, 2015 at 5:52 AM, Dmitry Vasilyev
>> postgres=# select 1;
>> server closed the connection unexpectedly
>> This probably means the server terminated abnormally
>> before or while processing the request.
>> The connection to the server was lost. Attempting reset: Failed.

> Hmm.  I'd expect that to cause a crash-and-restart cycle, just like a
> SIGQUIT would cause a crash-and-restart cycle on Linux.  But I would
> expect the server to end up running again at the end, not stopped.

It *is* a crash and restart cycle, or at least no evidence to the
contrary has been provided.

Whether psql's attempt to do an immediate reconnect succeeds or not is
very strongly timing-dependent, on both Linux and Windows.  It's easy
for it to attempt the reconnection before crash recovery is complete,
and then you get the above symptom.  Personally I get a "Failed" result
more often than not, regardless of platform.

regards, tom lane


Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-10 Thread Tom Lane

Dmitry Vasilyev  writes:
> As I wrote, the service stopped. This is repeatable.
> You can run the command 'psql -c "do $$ unpack p,1x8 $$ language plperlu;"',
> and after this the Windows service will stop.

Well, (a) that probably means that your plperl installation is broken,
and (b) you still haven't convinced me that you had an actual service
stop, and not just that the recovery time was longer than psql would
wait before retrying the connection.  Can you start a fresh psql
session after waiting a few seconds?

regards, tom lane


Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-10 Thread Dmitry Vasilyev

As I wrote, the service stopped. This is repeatable.
You can run the command 'psql -c "do $$ unpack p,1x8 $$ language plperlu;"',
and after this the Windows service will stop.

On Сб, 2015-10-10 at 10:23 -0500, Tom Lane wrote:
> Robert Haas  writes:
> > On Fri, Oct 9, 2015 at 5:52 AM, Dmitry Vasilyev
> > > postgres=# select 1;
> > > server closed the connection unexpectedly
> > > This probably means the server terminated abnormally
> > > before or while processing the request.
> > > The connection to the server was lost. Attempting reset: Failed.
> 
> > Hmm.  I'd expect that to cause a crash-and-restart cycle, just like
> > a
> > SIGQUIT would cause a crash-and-restart cycle on Linux.  But I
> > would
> > expect the server to end up running again at the end, not stopped.
> 
> It *is* a crash and restart cycle, or at least no evidence to the
> contrary has been provided.
> 
> Whether psql's attempt to do an immediate reconnect succeeds or not
> is
> very strongly timing-dependent, on both Linux and Windows.  It's easy
> for it to attempt the reconnection before crash recovery is complete,
> and then you get the above symptom.  Personally I get a "Failed"
> result
> more often than not, regardless of platform.
> 
>   regards, tom lane



Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-10 Thread Dmitry Vasilyev

Hello Tom!

On Сб, 2015-10-10 at 10:55 -0500, Tom Lane wrote:
> Dmitry Vasilyev  writes:
> > As I wrote, the service stopped. This is repeatable.
> > You can run the command 'psql -c "do $$ unpack p,1x8 $$ language
> > plperlu;"',
> > and after this the Windows service will stop.
> 
> Well, (a) that probably means that your plperl installation is
> broken,
> and (b) you still haven't convinced me that you had an actual service
> stop, and not just that the recovery time was longer than psql would
> wait before retrying the connection.  Can you start a fresh psql
> session after waiting a few seconds?
> 
>   regards, tom lane

This is a known bug in Perl:

perl -e ' unpack p,1x8'
Segmentation fault (core dumped)

The Postgres backend crashes, and the Windows service is stopped:

C:\Users\vadv>sc query postgresql-X64-9.4 | findstr /i "STATE"
        STATE              : 1  STOPPED


You can see the log below:

2015-10-10 19:00:13 AST LOG:  database system was interrupted; last
known up at 2015-10-10 18:54:47 AST
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 2 to 2
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 2 to 2
2015-10-10 19:00:13 AST DEBUG:  checkpoint record is at 0/16A01C8
2015-10-10 19:00:13 AST DEBUG:  redo record is at 0/16A01C8; shutdown
TRUE
2015-10-10 19:00:13 AST DEBUG:  next transaction ID: 0/678; next OID:
16393
2015-10-10 19:00:13 AST DEBUG:  next MultiXactId: 1; next
MultiXactOffset: 0
2015-10-10 19:00:13 AST DEBUG:  oldest unfrozen transaction ID: 667, in
database 1
2015-10-10 19:00:13 AST DEBUG:  oldest MultiXactId: 1, in database 1
2015-10-10 19:00:13 AST DEBUG:  transaction ID wrap limit is
2147484314, limited by database with OID 1
2015-10-10 19:00:13 AST DEBUG:  MultiXactId wrap limit is 2147483648,
limited by database with OID 1
2015-10-10 19:00:13 AST DEBUG:  starting up replication slots
2015-10-10 19:00:13 AST LOG:  database system was not properly shut
down; automatic recovery in progress
2015-10-10 19:00:13 AST DEBUG:  resetting unlogged relations: cleanup 1
init 0
2015-10-10 19:00:13 AST LOG:  redo starts at 0/16A0230
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/12057; tid 0/3
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/12059; tid 1/3
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/12060; tid 1/2
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/11979; tid 31/63
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/11984; tid 16/34
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/11889; tid 67/5
2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
1663/12135/11894; tid 9/132
2015-10-10

Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-10 Thread Pavel Stehule

2015-10-10 18:04 GMT+02:00 Dmitry Vasilyev :

> Hello Tom!
>
> On Сб, 2015-10-10 at 10:55 -0500, Tom Lane wrote:
> > Dmitry Vasilyev  writes:
> > > As I wrote, the service stopped. This is repeatable.
> > > You can run the command 'psql -c "do $$ unpack p,1x8 $$ language
> > > plperlu;"',
> > > and after this the Windows service will stop.
> >
> > Well, (a) that probably means that your plperl installation is
> > broken,
> > and (b) you still haven't convinced me that you had an actual service
> > stop, and not just that the recovery time was longer than psql would
> > wait before retrying the connection.  Can you start a fresh psql
> > session after waiting a few seconds?
> >
> >   regards, tom lane
>
> This is a known bug in Perl:
>
> perl -e ' unpack p,1x8'
> Segmentation fault (core dumped)
>

So this is expected behavior: after any unexpected client failure, the
server is restarted.

Regards

Pavel


>
> The Postgres backend crashes, and the Windows service is stopped:
>
> C:\Users\vadv>sc query postgresql-X64-9.4 | findstr /i "STATE"
>         STATE              : 1  STOPPED
>
>
> You can see the log below:
>
> 2015-10-10 19:00:13 AST LOG:  database system was interrupted; last
> known up at 2015-10-10 18:54:47 AST
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 5 to 13
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 2 to 2
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 2 to 2
> 2015-10-10 19:00:13 AST DEBUG:  checkpoint record is at 0/16A01C8
> 2015-10-10 19:00:13 AST DEBUG:  redo record is at 0/16A01C8; shutdown
> TRUE
> 2015-10-10 19:00:13 AST DEBUG:  next transaction ID: 0/678; next OID:
> 16393
> 2015-10-10 19:00:13 AST DEBUG:  next MultiXactId: 1; next
> MultiXactOffset: 0
> 2015-10-10 19:00:13 AST DEBUG:  oldest unfrozen transaction ID: 667, in
> database 1
> 2015-10-10 19:00:13 AST DEBUG:  oldest MultiXactId: 1, in database 1
> 2015-10-10 19:00:13 AST DEBUG:  transaction ID wrap limit is
> 2147484314, limited by database with OID 1
> 2015-10-10 19:00:13 AST DEBUG:  MultiXactId wrap limit is 2147483648,
> limited by database with OID 1
> 2015-10-10 19:00:13 AST DEBUG:  starting up replication slots
> 2015-10-10 19:00:13 AST LOG:  database system was not properly shut
> down; automatic recovery in progress
> 2015-10-10 19:00:13 AST DEBUG:  resetting unlogged relations: cleanup 1
> init 0
> 2015-10-10 19:00:13 AST LOG:  redo starts at 0/16A0230
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
> 2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
> 1663/12135/12057; tid 0/3
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
> 2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
> 1663/12135/12059; tid 1/3
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
> 2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
> 1663/12135/12060; tid 1/2
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
> 2015-10-10 19:00:13 AST CONTEXT:  xlog redo insert: rel
> 1663/12135/11979; tid 31/63
> 2015-10-10 19:00:13 AST DEBUG:  mapped win32 error code 80 to 17
> 2015-10-10 19:00:13 AST CONTEXT:

Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-10 Thread Tom Lane

Dmitry Vasilyev  writes:
> On Сб, 2015-10-10 at 10:55 -0500, Tom Lane wrote:
>> and (b) you still haven't convinced me that you had an actual service
>> stop, and not just that the recovery time was longer than psql would
>> wait before retrying the connection.

> The log you can see below:
> ...
> 2015-10-10 19:00:32 AST DEBUG:  cleaning up dynamic shared memory control segment with ID 851401618
> 2015-10-10 19:00:32 AST DEBUG:  invoking IpcMemoryCreate(size=290095104)
> 2015-10-10 19:00:42 AST FATAL:  pre-existing shared memory block is still in use
> 2015-10-10 19:00:42 AST HINT:  Check if there are any old server processes still running, and terminate them.

Thanks for providing some detail!  It's clear from the above log excerpt
that we're timing out after 10 seconds in win32_shmem.c's version of
PGSharedMemoryCreate, because CreateFileMapping is still reporting that
the old shared memory segment still exists.  When we last discussed this
sort of problem in
http://www.postgresql.org/message-id/flat/49fa3b6f.6080...@dunslane.net
there was no evidence that such a failure could persist for longer than a
second or two.  Now it seems that on your machine the failure state can
persist for at least 10 seconds, but I don't know why.

If I had to guess, on the basis of no evidence, I'd wonder whether the
DSM code broke it; there is evidently at least one DSM segment in play
in your use-case.  But that's only a guess.

regards, tom lane


Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-10 Thread Ali Akbar

Greetings,

2015-10-11 0:18 GMT+07:00 Pavel Stehule :

>
> 2015-10-10 18:04 GMT+02:00 Dmitry Vasilyev :
>
>>
>> On Сб, 2015-10-10 at 10:55 -0500, Tom Lane wrote:
>> > Dmitry Vasilyev  writes:
>> > > As I wrote, the service stopped. This is repeatable.
>> > > You can run the command 'psql -c "do $$ unpack p,1x8 $$ language
>> > > plperlu;"',
>> > > and after this the Windows service will stop.
>> >
>>
>
> So this is expected behavior: after any unexpected client failure, the
> server is restarted.
>

I can confirm this too. On Linux (I use Fedora 22), this is what happens
when a server process is killed:

=== 1. before:
$ sudo systemctl status postgresql.service
postgresql.service - PostgreSQL database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql.service; enabled)
   Active: active (running) since Jum 2015-10-09 16:25:43 WIB; 1 day 14h ago
  Process: 778 ExecStart=/usr/bin/pg_ctl start -D ${PGDATA} -s -o -p
${PGPORT} -w -t 300 (code=exited, status=0/SUCCESS)
  Process: 747 ExecStartPre=/usr/bin/postgresql-check-db-dir ${PGDATA}
(code=exited, status=0/SUCCESS)
 Main PID: 783 (postgres)
   CGroup: /system.slice/postgresql.service
   ├─  783 /usr/bin/postgres -D /var/lib/pgsql/data -p 5432
   ├─  812 postgres: logger process
   ├─  821 postgres: checkpointer process
   ├─  822 postgres: writer process
   ├─  823 postgres: wal writer process
   ├─  824 postgres: autovacuum launcher process
   ├─  825 postgres: stats collector process
   └─17181 postgres: postgres test [local] idle

=== 2. killing and attempt to reconnect:
$ sudo kill 17181

test=# select 1;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

=== 3. service status after:
$ sudo systemctl status postgresql.service
postgresql.service - PostgreSQL database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql.service; enabled)
   Active: active (running) since Jum 2015-10-09 16:25:43 WIB; 1 day 14h ago
  Process: 778 ExecStart=/usr/bin/pg_ctl start -D ${PGDATA} -s -o -p ${PGPORT} -w -t 300 (code=exited, status=0/SUCCESS)
  Process: 747 ExecStartPre=/usr/bin/postgresql-check-db-dir ${PGDATA} (code=exited, status=0/SUCCESS)
 Main PID: 783 (postgres)
   CGroup: /system.slice/postgresql.service
   ├─  783 /usr/bin/postgres -D /var/lib/pgsql/data -p 5432
   ├─  812 postgres: logger process
   ├─  821 postgres: checkpointer process
   ├─  822 postgres: writer process
   ├─  823 postgres: wal writer process
   ├─  824 postgres: autovacuum launcher process
   ├─  825 postgres: stats collector process
   └─17422 postgres: postgres test [local] idle

===

The service status is still active (running), and a new process, 17422,
handles the client.

But this is what happens on Windows (Windows 7 32-bit, PostgreSQL 9.4):

=== 1. before:
C:\Windows\system32>sc queryex postgresql-9.4

SERVICE_NAME: postgresql-9.4
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 4  RUNNING
                                (STOPPABLE, PAUSABLE, ACCEPTS_SHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x0
        PID                : 3716
        FLAGS              :

=== 2. killing & attempt to reconnect:
postgres=# select pg_backend_pid();
 pg_backend_pid
----------------
           2080
(1 row)

C:\Windows\system32>taskkill /F /PID 2080
SUCCESS: The process with PID 2080 has been terminated.

postgres=# select 1;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

=== 3. service status after:
C:\Windows\system32>sc query postgresql-9.4

SERVICE_NAME: postgresql-9.4
        TYPE               : 10  WIN32_OWN_PROCESS
        STATE              : 1  STOPPED
        WIN32_EXIT_CODE    : 0  (0x0)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x0

===

The client cannot reconnect. The service is dead. This is nasty, because
any client can exploit a segfault bug like the Perl one Dmitry
mentioned upthread and take the PostgreSQL service down.
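
The "Attempting reset" outcome can also be checked without psql by probing
the server's TCP port. What follows is only a minimal sketch: it assumes the
server accepts TCP connections on 127.0.0.1 at the default port 5432 (neither
value is taken from the sessions above), and the helper name
accepts_connections is hypothetical. It tells you whether anything is
listening, not whether PostgreSQL is healthy.

```python
import socket


def accepts_connections(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if something is listening on host:port, False otherwise."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumed address and port; adjust for the installation being tested.
    print("server reachable:", accepts_connections("127.0.0.1", 5432))
```

Running it before and after killing a backend should show whether the
postmaster is still accepting connections.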

Note: killing the server process with pg_terminate_backend does not cause
this behavior. The client reconnects normally, and the service is
still running.
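
The difference matters because a parent process learns how a child died from
its exit status. pg_terminate_backend sends SIGTERM, which a backend handles
before exiting on its own, whereas a forced kill gives the process no chance
to clean up. The following is a rough, POSIX-only sketch using a plain Python
child process, not PostgreSQL code; the names describe_exit and kill_child are
hypothetical. The child here installs no SIGTERM handler, so both signals are
reported as death by signal.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Classify a child's exit status as a supervising parent might."""
    if returncode == 0:
        return "clean exit"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as -N.
        return f"killed by signal {-returncode}"
    return f"exited with error code {returncode}"


def kill_child(sig: int) -> int:
    """Start a sleeping child, send it `sig`, and return its exit status."""
    child = subprocess.Popen(
        [sys.executable, "-c",
         "import time; print('ready', flush=True); time.sleep(60)"],
        stdout=subprocess.PIPE,
    )
    child.stdout.readline()  # wait until the child is actually running
    child.send_signal(sig)
    status = child.wait()
    child.stdout.close()
    return status


if __name__ == "__main__":
    for sig in (signal.SIGTERM, signal.SIGKILL):
        print(sig.name, "->", describe_exit(kill_child(sig)))
```

A supervisor can use this kind of classification to decide whether a child's
death should be treated as a crash.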

Regards,
Ali Akbar

Re: [HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-10 Thread Michael Paquier

On Sun, Oct 11, 2015 at 8:54 AM, Ali Akbar  wrote:
> C:\Windows\system32>taskkill /F /PID 2080
> SUCCESS: The process with PID 2080 has been terminated.

taskkill /f *forcefully* terminates the process targeted [1]. Isn't
that equivalent to a kill -9? If you headshot a backend process on
Linux with kill -9, an instance won't restart either.
[1]: http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/taskkill.mspx?mfr=true
-- 
Michael



[HACKERS] Postgres service stops when I kill client backend on Windows

2015-10-09 Thread Dmitry Vasilyev

I started a PostgreSQL server on Windows, and when I killed a client
backend process with taskkill, the service stopped:

postgres=# select pg_backend_pid();
 pg_backend_pid
----------------
           1976

postgres=# \! taskkill /pid 1976 /f
SUCCESS: The process with PID 1976 has been terminated.
postgres=# select 1;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>


If I kill a backend process on Linux, the service does not fail. So
what is the problem? Why is PostgreSQL so strange on Windows?


--
Dmitry Vasilyev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


