[ https://issues.apache.org/jira/browse/HAWQ-559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201205#comment-15201205 ]
ASF GitHub Bot commented on HAWQ-559:
-------------------------------------

Github user ztao1987 commented on the pull request:

    https://github.com/apache/incubator-hawq/pull/472#issuecomment-198264539

    +1

> QD hangs when QE is killed after connected to QD
> ------------------------------------------------
>
>                 Key: HAWQ-559
>                 URL: https://issues.apache.org/jira/browse/HAWQ-559
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Dispatcher
>    Affects Versions: 2.0.0
>         Environment: mac os X 10.10
>            Reporter: Chunling Wang
>            Assignee: Lili Ma
>
> When the first query finishes, the QE is still alive. We then run the second
> query. After the QD thread is created and bound to the QE, but before it sends
> any data to the QE, we kill this QE and find that the QD hangs.
> Here is the backtrace when the QD hangs:
> {code}
> * thread #1: tid = 0x1c4afd, 0x00007fff890355be libsystem_kernel.dylib`poll + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>   * frame #0: 0x00007fff890355be libsystem_kernel.dylib`poll + 10
>     frame #1: 0x000000010745692c postgres`receiveChunksUDP [inlined] udpSignalPoll + 42 at ic_udp.c:2882
>     frame #2: 0x0000000107456902 postgres`receiveChunksUDP + 26 at ic_udp.c:2715
>     frame #3: 0x00000001074568e8 postgres`receiveChunksUDP [inlined] waitOnCondition(timeout_us=250000) + 82 at ic_udp.c:1599
>     frame #4: 0x0000000107456896 postgres`receiveChunksUDP(pTransportStates=0x00007ff2a381ae48, pEntry=0x00007ff2a18f2230, motNodeID=<unavailable>, srcRoute=0x00007fff58c0ce96, conn=<unavailable>, inTeardown='\0') + 726 at ic_udp.c:4039
>     frame #5: 0x0000000107452a86 postgres`RecvTupleChunkFromAnyUDP [inlined] RecvTupleChunkFromAnyUDP_Internal + 498 at ic_udp.c:4146
>     frame #6: 0x0000000107452894 postgres`RecvTupleChunkFromAnyUDP(mlStates=<unavailable>, transportStates=<unavailable>, motNodeID=1, srcRoute=0x00007fff58c0ce96) + 100 at ic_udp.c:4167
>     frame #7: 0x0000000107442254 postgres`RecvTupleFrom [inlined] processIncomingChunks(mlStates=0x00007ff2a3812a30, transportStates=0x00007ff2a381ae48, motNodeID=1, srcRoute=<unavailable>) + 34 at cdbmotion.c:684
>     frame #8: 0x0000000107442232 postgres`RecvTupleFrom(mlStates=0x00007ff2a3812a30, transportStates=<unavailable>, motNodeID=1, tup_i=0x00007fff58c0cf00, srcRoute=-100) + 370 at cdbmotion.c:610
>     frame #9: 0x00000001071c8778 postgres`ExecMotion [inlined] execMotionUnsortedReceiver(node=<unavailable>) + 57 at nodeMotion.c:466
>     frame #10: 0x00000001071c873f postgres`ExecMotion(node=<unavailable>) + 1071 at nodeMotion.c:298
>     frame #11: 0x00000001071a4835 postgres`ExecProcNode(node=0x00007ff2a38164b8) + 613 at execProcnode.c:999
>     frame #12: 0x00000001071b9f82 postgres`ExecAgg + 104 at nodeAgg.c:1163
>     frame #13: 0x00000001071b9f1a postgres`ExecAgg + 316 at nodeAgg.c:1693
>     frame #14: 0x00000001071b9dde postgres`ExecAgg(node=0x00007ff2a3815348) + 126 at nodeAgg.c:1138
>     frame #15: 0x00000001071a4803 postgres`ExecProcNode(node=0x00007ff2a3815348) + 563 at execProcnode.c:979
>     frame #16: 0x000000010719ecfd postgres`ExecutePlan(estate=0x00007ff2a3814e30, planstate=0x00007ff2a3815348, operation=CMD_SELECT, numberTuples=0, direction=<unavailable>, dest=0x00007ff2a28db178) + 1181 at execMain.c:3218
>     frame #17: 0x000000010719e619 postgres`ExecutorRun(queryDesc=0x00007ff2a3811f00, direction=ForwardScanDirection, count=0) + 569 at execMain.c:1213
>     frame #18: 0x00000001072e7fc2 postgres`PortalRun + 14 at pquery.c:1649
>     frame #19: 0x00000001072e7fb4 postgres`PortalRun(portal=0x00007ff2a1893e30, count=<unavailable>, isTopLevel='\x01', dest=<unavailable>, altdest=0x00007ff2a28db178, completionTag=0x00007fff58c0d530) + 1124 at pquery.c:1471
>     frame #20: 0x00000001072e4a8e postgres`exec_simple_query(query_string=0x00007ff2a380fe30, seqServerHost=0x0000000000000000, seqServerPort=-1) + 2078 at postgres.c:1745
>     frame #21: 0x00000001072e0c4c postgres`PostgresMain(argc=<unavailable>, argv=<unavailable>, username=0x00007ff2a201bcf0) + 9404 at postgres.c:4754
>     frame #22: 0x000000010729a002 postgres`ServerLoop [inlined] BackendRun + 105 at postmaster.c:5889
>     frame #23: 0x0000000107299f99 postgres`ServerLoop at postmaster.c:5484
>     frame #24: 0x0000000107299f99 postgres`ServerLoop + 9593 at postmaster.c:2163
>     frame #25: 0x0000000107296f3b postgres`PostmasterMain(argc=<unavailable>, argv=<unavailable>) + 5019 at postmaster.c:1454
>     frame #26: 0x0000000107200ca9 postgres`main(argc=9, argv=0x00007ff2a141eef0) + 1433 at main.c:209
>     frame #27: 0x00007fff95e8c5c9 libdyld.dylib`start + 1
>   thread #2: tid = 0x1c4afe, 0x00007fff890355be libsystem_kernel.dylib`poll + 10
>     frame #0: 0x00007fff890355be libsystem_kernel.dylib`poll + 10
>     frame #1: 0x000000010744d8e3 postgres`rxThreadFunc(arg=<unavailable>) + 2163 at ic_udp.c:6251
>     frame #2: 0x00007fff95e822fc libsystem_pthread.dylib`_pthread_body + 131
>     frame #3: 0x00007fff95e82279 libsystem_pthread.dylib`_pthread_start + 176
>     frame #4: 0x00007fff95e804b1 libsystem_pthread.dylib`thread_start + 13
>   thread #3: tid = 0x1c4b02, 0x00007fff890343f6 libsystem_kernel.dylib`__select + 10
>     frame #0: 0x00007fff890343f6 libsystem_kernel.dylib`__select + 10
>     frame #1: 0x00000001074ec47e postgres`pg_usleep(microsec=<unavailable>) + 78 at pgsleep.c:43
>     frame #2: 0x0000000107400c26 postgres`generateResourceRefreshHeartBeat(arg=0x00007ff2a141ce90) + 166 at rmcomm_QD2RM.c:1519
>     frame #3: 0x00007fff95e822fc libsystem_pthread.dylib`_pthread_body + 131
>     frame #4: 0x00007fff95e82279 libsystem_pthread.dylib`_pthread_start + 176
>     frame #5: 0x00007fff95e804b1 libsystem_pthread.dylib`thread_start + 13
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)