Hi Craig,

While testing another scenario, continuous restarts of the postgres server, we 
got many sh-QUIT cores and, along with them, rm-QUIT cores. The rm cores point 
to the rm of the archive file, but we could not get a bt because the stack is 
corrupted.

We got the following info from gdb:
Core was generated by `rm ./Archive_000000020000000000000118'.

We were also able to get this process listing:
4518     12490  0.0  0.0  11484  1356 ?        Ss   10:59   0:00 postgres: archiver process   archiving 000000020000000000000118.00000028.backup
4518     12704  2.0  0.0   7672  2932 ?        S    11:00   0:00       \_ sh -c rm ./Archive_*; touch ./Archive_"000000020000000000000118.00000028.backup"; exit 0
4518     12707  0.0  0.0    344     4 ?        S    11:00   0:00                 \_ rm ./Archive_000000020000000000000118

The Postgres configuration file has this setting:
archive_command             = 'rm ./Archive_*; touch ./Archive_"%f"; exit 0'
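
To illustrate how this setting turns into the command line seen in the process 
listing above, here is a minimal Python sketch of the placeholder substitution. 
It is only an approximation under stated assumptions: PostgreSQL documents %f 
(file name), %p (path) and %% (literal percent) for archive_command, but the 
function name expand_archive_command is hypothetical and this is not 
PostgreSQL's actual code.

```python
def expand_archive_command(template: str, file_name: str, path: str = "") -> str:
    """Simplified stand-in for archive_command placeholder expansion.

    Hypothetical helper for illustration only:
    %f -> file name, %p -> path, %% -> literal percent sign.
    """
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template):
            nxt = template[i + 1]
            if nxt == "f":
                out.append(file_name)
                i += 2
                continue
            if nxt == "p":
                out.append(path)
                i += 2
                continue
            if nxt == "%":
                out.append("%")
                i += 2
                continue
        # Any other character is copied through unchanged.
        out.append(ch)
        i += 1
    return "".join(out)


# The setting from our configuration file.
TEMPLATE = 'rm ./Archive_*; touch ./Archive_"%f"; exit 0'

# The file the archiver was working on when the core was generated.
expanded = expand_archive_command(
    TEMPLATE, "000000020000000000000118.00000028.backup"
)
print(expanded)
```

The expanded string is what gets handed to sh -c. The rm ./Archive_* part is a 
shell glob, so it removes whichever marker file the previous run left behind, 
which is consistent with the rm ./Archive_000000020000000000000118 child seen 
in the listing.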

So the core was generated while executing this archive command.
You pointed out earlier that the issue might be happening during the archive 
command, and all the evidence for this crash points to the same command.
Are there any suggestions on how to recover from this situation or how to 
debug the issue?
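
As one possible way to check how such a shell command ends, a small sketch 
like the one below runs a command through the shell and reports whether it 
exited normally or was killed by a signal such as SIGQUIT (whose default 
action is to terminate the process with a core dump). The helper names 
describe_exit and run_and_describe are hypothetical, and this is only a 
diagnostic idea on our side, not something taken from PostgreSQL.

```python
import signal
import subprocess


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a readable description.

    On POSIX, subprocess reports a negative return code -N when the
    child was terminated by signal N.
    """
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            # Signal number not known to the signal module on this platform.
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def run_and_describe(command: str) -> str:
    """Run a command through the shell (sh -c on POSIX) and describe its end."""
    result = subprocess.run(command, shell=True)
    return describe_exit(result.returncode)


# A harmless command, used here in place of the real archive command.
print(run_and_describe("exit 0"))

# What the report would look like for a child that received SIGQUIT.
print(describe_exit(-int(signal.SIGQUIT)))
```

Running the real command this way outside the server would at least show 
whether it dies on its own or only when something sends it a signal.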

Regards,
Sandhya

From: K S, Sandhya (Nokia - IN/Bangalore)
Sent: Wednesday, July 12, 2017 4:51 PM
To: 'Craig Ringer' <cr...@2ndquadrant.com>
Cc: pgsql-bugs <pgsql-b...@postgresql.org>; PostgreSQL Hackers 
<pgsql-hackers@postgresql.org>; T, Rasna (Nokia - IN/Bangalore) 
<rasn...@nokia.com>; Itnal, Prakash (Nokia - IN/Bangalore) 
<prakash.it...@nokia.com>
Subject: RE: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

Hi Craig,

Here is the bt after installing all the missing debuginfo packages.

(gdb) bt
#0  0x000000fff7682f18 in do_lookup_x (undef_name=undef_name@entry=0xfff75cece5 "_Jv_RegisterClasses", new_hash=new_hash@entry=2681263574,
    old_hash=old_hash@entry=0xffffa159b8, ref=0xfff75ceac8, result=result@entry=0xffffa159a0, scope=<optimized out>, i=1, version=version@entry=0x0,
    flags=flags@entry=1, skip=skip@entry=0x0, type_class=type_class@entry=0, undef_map=undef_map@entry=0xfff76a9478) at dl-lookup.c:444
#1  0x000000fff76839a0 in _dl_lookup_symbol_x (undef_name=0xfff75cece5 "_Jv_RegisterClasses", undef_map=0xfff76a9478, ref=0xffffa15a90,
    symbol_scope=0xfff76a9980, version=0x0, type_class=<optimized out>, flags=<optimized out>, skip_map=0x0) at dl-lookup.c:833
#2  0x000000fff7685730 in elf_machine_got_rel (lazy=1, map=0xfff76a9478) at ../sysdeps/mips/dl-machine.h:870
#3  elf_machine_runtime_setup (profile=<optimized out>, lazy=1, l=0xfff76a9478) at ../sysdeps/mips/dl-machine.h:916
#4  _dl_relocate_object (scope=0xfff76a9980, reloc_mode=<optimized out>, consider_profiling=0) at dl-reloc.c:259
#5  0x000000fff767ba10 in dl_main (phdr=<optimized out>, phdr@entry=0x120000040, phnum=<optimized out>, phnum@entry=8,
    user_entry=user_entry@entry=0xffffa15cf0, auxv=<optimized out>) at rtld.c:2070
#6  0x000000fff7692e3c in _dl_sysdep_start (start_argptr=<optimized out>, dl_main=0xfff7679a98 <dl_main>) at ../elf/dl-sysdep.c:249
#7  0x000000fff767d0d8 in _dl_start_final (arg=arg@entry=0xffffa16410, info=info@entry=0xffffa15d80) at rtld.c:307
#8  0x000000fff767d3d8 in _dl_start (arg=0xffffa16410) at rtld.c:415
#9  0x000000fff7679380 in __start () from /lib64/ld.so.1

Please see if this could help in analysing the issue.

Regards,
Sandhya

From: Craig Ringer [mailto:cr...@2ndquadrant.com]
Sent: Friday, July 07, 2017 1:53 PM
To: K S, Sandhya (Nokia - IN/Bangalore) <sandhya....@nokia.com>
Cc: pgsql-bugs <pgsql-b...@postgresql.org>; PostgreSQL Hackers 
<pgsql-hackers@postgresql.org>; T, Rasna (Nokia - IN/Bangalore) 
<rasn...@nokia.com>; Itnal, Prakash (Nokia - IN/Bangalore) 
<prakash.it...@nokia.com>
Subject: Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

On 7 July 2017 at 15:41, K S, Sandhya (Nokia - IN/Bangalore) 
<sandhya....@nokia.com> wrote:
Hi Craig,

The scenario is locking and unlocking the system 30 times. During this 
scenario, 5 sh-QUIT cores are generated. GDB on the 5 cores points to 
different locations.
I have attached the output for 2 such instances.


You seem to be missing debug symbols. Install appropriate debuginfo packages.


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
