[ https://issues.apache.org/jira/browse/HAWQ-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming LI reassigned HAWQ-1338:
-----------------------------

    Assignee: Ming LI  (was: Ed Espino)

> In some case writer process crashed when running 'hawq stop cluster'
> --------------------------------------------------------------------
>
>                 Key: HAWQ-1338
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1338
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Core
>            Reporter: Ming LI
>            Assignee: Ming LI
>
> On the master node of the test machine, some processes do not exit cleanly and dump core after a while.
> {code}
> ------------------- The running log -------------------------
> 2/12/17 11:33:59 PM PST: ----------------------------------------------------------------------
> 2/12/17 11:33:59 PM PST: Check if postgres/java processes are closed properly:
> 2/12/17 11:33:59 PM PST: ----------------------------------------------------------------------
> 2/12/17 11:33:59 PM PST: Check if postgres|java process is running on test1:
> 2/12/17 11:33:59 PM PST: gpadmin 5279 1 0 22:53 ? 00:00:03 postgres: port 31000, master logger process
> 2/12/17 11:33:59 PM PST: gpadmin 5283 1 0 22:53 ? 00:00:01 postgres: port 31000, writer process
> 2/12/17 11:33:59 PM PST: root 23864 24 1 23:37 ? 00:00:01 /usr/libexec/abrt-hook-ccpp 6 18446744073709551615 5283 501 501 1486971433 postgres
> 2/12/17 11:33:59 PM PST: -------------------------------------
> 2/12/17 11:33:59 PM PST: Check if postgres|java process is running on test2:
> 2/12/17 11:33:59 PM PST: -------------------------------------
> 2/12/17 11:33:59 PM PST: Check if postgres|java process is running on test3:
> 2/12/17 11:33:59 PM PST: -------------------------------------
> 2/12/17 11:33:59 PM PST: Check if postgres|java process is running on test4:
> 2/12/17 11:33:59 PM PST: -------------------------------------
> 2/12/17 11:33:59 PM PST: Check if postgres|java process is running on test5:
> 2/12/17 11:33:59 PM PST: -------------------------------------
> 2/12/17 11:33:59 PM PST: ERROR: Postgres process not closed on test1, please check.
> 2/12/17 11:33:59 PM PST: ----------------------------------------------------------------------
> ------------------- The call stack -------------------------
> (gdb) bt
> #0  0x00000032214325e5 in raise () from /lib64/libc.so.6
> #1  0x0000003221433dc5 in abort () from /lib64/libc.so.6
> #2  0x000000000096433a in errfinish (dummy=0) at elog.c:686
> #3  0x00000000009665bd in elog_finish (elevel=22, fmt=0xc53af0 "process is dying from critical section") at elog.c:1463
> #4  0x000000000086c11d in proc_exit_prepare (code=1) at ipc.c:153
> #5  0x000000000086c0a9 in proc_exit (code=1) at ipc.c:93
> #6  0x0000000000964300 in errfinish (dummy=0) at elog.c:670
> #7  0x0000000000825121 in ServiceClientRead (serviceClient=0xfc73f0, response=0x7fffb96842de, responseLen=1,
>     timeout=0x7fffb96842c0) at service.c:523
> #8  0x0000000000824f7b in ServiceClientReceiveResponse (serviceClient=0xfc73f0, response=0x7fffb96842de, responseLen=1,
>     timeout=0x7fffb96842c0) at service.c:480
> #9  0x000000000082bce1 in WalSendServerClientReceiveResponse (walSendResponse=0x7fffb96842de, timeout=0x7fffb96842c0)
>     at walsendserver.c:372
> #10 0x000000000051596d in XLogQDMirrorWaitForResponse (waitForever=0 '\000')
>     at xlog.c:1919
> #11 0x0000000000515c0c in XLogQDMirrorWrite (startidx=0, npages=1, timeLineID=1, logId=0, logSeg=1, logOff=13729792)
>     at xlog.c:2005
> #12 0x0000000000516615 in XLogWrite (WriteRqst=..., flexible=0 '\000', xlog_switch=0 '\000') at xlog.c:2354
> #13 0x0000000000516d68 in XLogFlush (record=...) at xlog.c:2572
> #14 0x0000000000522f88 in CreateCheckPoint (shutdown=1 '\001', force=1 '\001') at xlog.c:8136
> #15 0x000000000052277b in ShutdownXLOG (code=0, arg=0) at xlog.c:7865
> #16 0x0000000000821f42 in BackgroundWriterMain () at bgwriter.c:318
> #17 0x000000000059c9f1 in AuxiliaryProcessMain (argc=2, argv=0x7fffb9684b60) at bootstrap.c:467
> #18 0x000000000083c7b0 in StartChildProcess (type=BgWriterProcess) at postmaster.c:6836
> #19 0x0000000000838f39 in CommenceNormalOperations () at postmaster.c:3618
> #20 0x000000000083984a in do_reaper () at postmaster.c:4021
> #21 0x0000000000835e97 in ServerLoop () at postmaster.c:2136
> #22 0x000000000083500f in PostmasterMain (argc=9, argv=0x288bd10) at postmaster.c:1454
> #23 0x00000000007612af in main (argc=9, argv=0x288bd10) at main.c:226
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)