Chunling Wang created HAWQ-559:
----------------------------------

             Summary: QD hangs when QE is killed after connected to QD
                 Key: HAWQ-559
                 URL: https://issues.apache.org/jira/browse/HAWQ-559
             Project: Apache HAWQ
          Issue Type: Bug
          Components: Dispatcher
            Reporter: Chunling Wang
            Assignee: Lei Chang


When the first query finishes, the QE is still alive. We then run a second 
query. After the QD thread has been created and bound to the QE, but before it 
has sent any data to the QE, we kill this QE and find that the QD hangs.
Here is the backtrace when the QD hangs:
* thread #1: tid = 0x1c4afd, 0x00007fff890355be libsystem_kernel.dylib`poll + 10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff890355be libsystem_kernel.dylib`poll + 10
    frame #1: 0x000000010745692c postgres`receiveChunksUDP [inlined] udpSignalPoll + 42 at ic_udp.c:2882
    frame #2: 0x0000000107456902 postgres`receiveChunksUDP + 26 at ic_udp.c:2715
    frame #3: 0x00000001074568e8 postgres`receiveChunksUDP [inlined] waitOnCondition(timeout_us=250000) + 82 at ic_udp.c:1599
    frame #4: 0x0000000107456896 postgres`receiveChunksUDP(pTransportStates=0x00007ff2a381ae48, pEntry=0x00007ff2a18f2230, motNodeID=<unavailable>, srcRoute=0x00007fff58c0ce96, conn=<unavailable>, inTeardown='\0') + 726 at ic_udp.c:4039
    frame #5: 0x0000000107452a86 postgres`RecvTupleChunkFromAnyUDP [inlined] RecvTupleChunkFromAnyUDP_Internal + 498 at ic_udp.c:4146
    frame #6: 0x0000000107452894 postgres`RecvTupleChunkFromAnyUDP(mlStates=<unavailable>, transportStates=<unavailable>, motNodeID=1, srcRoute=0x00007fff58c0ce96) + 100 at ic_udp.c:4167
    frame #7: 0x0000000107442254 postgres`RecvTupleFrom [inlined] processIncomingChunks(mlStates=0x00007ff2a3812a30, transportStates=0x00007ff2a381ae48, motNodeID=1, srcRoute=<unavailable>) + 34 at cdbmotion.c:684
    frame #8: 0x0000000107442232 postgres`RecvTupleFrom(mlStates=0x00007ff2a3812a30, transportStates=<unavailable>, motNodeID=1, tup_i=0x00007fff58c0cf00, srcRoute=-100) + 370 at cdbmotion.c:610
    frame #9: 0x00000001071c8778 postgres`ExecMotion [inlined] execMotionUnsortedReceiver(node=<unavailable>) + 57 at nodeMotion.c:466
    frame #10: 0x00000001071c873f postgres`ExecMotion(node=<unavailable>) + 1071 at nodeMotion.c:298
    frame #11: 0x00000001071a4835 postgres`ExecProcNode(node=0x00007ff2a38164b8) + 613 at execProcnode.c:999
    frame #12: 0x00000001071b9f82 postgres`ExecAgg + 104 at nodeAgg.c:1163
    frame #13: 0x00000001071b9f1a postgres`ExecAgg + 316 at nodeAgg.c:1693
    frame #14: 0x00000001071b9dde postgres`ExecAgg(node=0x00007ff2a3815348) + 126 at nodeAgg.c:1138
    frame #15: 0x00000001071a4803 postgres`ExecProcNode(node=0x00007ff2a3815348) + 563 at execProcnode.c:979
    frame #16: 0x000000010719ecfd postgres`ExecutePlan(estate=0x00007ff2a3814e30, planstate=0x00007ff2a3815348, operation=CMD_SELECT, numberTuples=0, direction=<unavailable>, dest=0x00007ff2a28db178) + 1181 at execMain.c:3218
    frame #17: 0x000000010719e619 postgres`ExecutorRun(queryDesc=0x00007ff2a3811f00, direction=ForwardScanDirection, count=0) + 569 at execMain.c:1213
    frame #18: 0x00000001072e7fc2 postgres`PortalRun + 14 at pquery.c:1649
    frame #19: 0x00000001072e7fb4 postgres`PortalRun(portal=0x00007ff2a1893e30, count=<unavailable>, isTopLevel='\x01', dest=<unavailable>, altdest=0x00007ff2a28db178, completionTag=0x00007fff58c0d530) + 1124 at pquery.c:1471
    frame #20: 0x00000001072e4a8e postgres`exec_simple_query(query_string=0x00007ff2a380fe30, seqServerHost=0x0000000000000000, seqServerPort=-1) + 2078 at postgres.c:1745
    frame #21: 0x00000001072e0c4c postgres`PostgresMain(argc=<unavailable>, argv=<unavailable>, username=0x00007ff2a201bcf0) + 9404 at postgres.c:4754
    frame #22: 0x000000010729a002 postgres`ServerLoop [inlined] BackendRun + 105 at postmaster.c:5889
    frame #23: 0x0000000107299f99 postgres`ServerLoop at postmaster.c:5484
    frame #24: 0x0000000107299f99 postgres`ServerLoop + 9593 at postmaster.c:2163
    frame #25: 0x0000000107296f3b postgres`PostmasterMain(argc=<unavailable>, argv=<unavailable>) + 5019 at postmaster.c:1454
    frame #26: 0x0000000107200ca9 postgres`main(argc=9, argv=0x00007ff2a141eef0) + 1433 at main.c:209
    frame #27: 0x00007fff95e8c5c9 libdyld.dylib`start + 1

  thread #2: tid = 0x1c4afe, 0x00007fff890355be libsystem_kernel.dylib`poll + 10
    frame #0: 0x00007fff890355be libsystem_kernel.dylib`poll + 10
    frame #1: 0x000000010744d8e3 postgres`rxThreadFunc(arg=<unavailable>) + 2163 at ic_udp.c:6251
    frame #2: 0x00007fff95e822fc libsystem_pthread.dylib`_pthread_body + 131
    frame #3: 0x00007fff95e82279 libsystem_pthread.dylib`_pthread_start + 176
    frame #4: 0x00007fff95e804b1 libsystem_pthread.dylib`thread_start + 13

  thread #3: tid = 0x1c4b02, 0x00007fff890343f6 libsystem_kernel.dylib`__select + 10
    frame #0: 0x00007fff890343f6 libsystem_kernel.dylib`__select + 10
    frame #1: 0x00000001074ec47e postgres`pg_usleep(microsec=<unavailable>) + 78 at pgsleep.c:43
    frame #2: 0x0000000107400c26 postgres`generateResourceRefreshHeartBeat(arg=0x00007ff2a141ce90) + 166 at rmcomm_QD2RM.c:1519
    frame #3: 0x00007fff95e822fc libsystem_pthread.dylib`_pthread_body + 131
    frame #4: 0x00007fff95e82279 libsystem_pthread.dylib`_pthread_start + 176
    frame #5: 0x00007fff95e804b1 libsystem_pthread.dylib`thread_start + 13
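
For illustration only: thread #1 is parked in poll() under waitOnCondition(timeout_us=250000), which is consistent with a receive loop that keeps retrying after each timeout while the killed QE never sends anything. The Python sketch below is not HAWQ code and does not claim to reproduce ic_udp.c. It only shows, under that assumption, how a timed wait can re-check the sender after every timeout instead of waiting indefinitely. The names receive_chunk, peer_alive, max_polls and PeerGoneError are hypothetical.

```python
import select
import socket

POLL_TIMEOUT_S = 0.25  # same length as timeout_us=250000 in frame #3


class PeerGoneError(RuntimeError):
    """Raised when the sender is found dead while we wait for data."""


def receive_chunk(sock, peer_alive, max_polls=None):
    """Wait for one datagram, re-checking the sender after every timeout.

    peer_alive: hypothetical callable that returns False once the sender
                process has exited.
    max_polls:  optional cap on timed-out polls, only to keep demos finite.
    """
    polls = 0
    while True:
        readable, _, _ = select.select([sock], [], [], POLL_TIMEOUT_S)
        if readable:
            data, _addr = sock.recvfrom(65535)
            return data
        # The wait timed out. Without a check like this one, the loop would
        # keep polling forever if the sender died before sending anything.
        if not peer_alive():
            raise PeerGoneError("sender exited before sending any data")
        polls += 1
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError("no data after %d polls" % polls)
```

With a live sender the function returns the first datagram. With a sender that is already gone and has sent nothing, it raises PeerGoneError after the first 250 ms timeout rather than blocking the caller.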



