[ https://issues.apache.org/jira/browse/TRAFODION-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hans Zeller reopened TRAFODION-1023:
------------------------------------

    Assignee: Hans Zeller  (was: Apache Trafodion)

Anu and Rao ran into this issue again this week and I am working on a fix, so I think this issue is not yet resolved.

> LP Bug: 1425661 - Hang with hive scan and [FIRST N]
> ---------------------------------------------------
>
>                 Key: TRAFODION-1023
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1023
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-exe
>            Reporter: Apache Trafodion
>            Assignee: Hans Zeller
>            Priority: Critical
>
> A SELECT statement using [FIRST n] can fail to clean up and hang when the main
> thread deallocates the statement. A reader thread can be observed to be
> waiting on a new buffer:
> (gdb) fr 2
> #2  0x00007ffff30bd7c7 in ExLobGlobals::performRequest (this=0x217f7e0,
>     request=<optimized out>) at ../exp/ExpLOBaccess.cpp:1859
> 1859                  cursor->lock_.wait();
> (gdb) list
> #0  0x0000003d60c0b43c in pthread_cond_wait@@GLIBC_2.3.2 ()
>    from /lib64/libpthread.so.0
> #1  0x00007ffff30b797d in ExLobLock::wait (this=0x217fef8)
>     at ../exp/ExpLOBaccess.cpp:2246
> #2  0x00007ffff30bd7c7 in ExLobGlobals::performRequest (this=0x217f7e0,
>     request=<optimized out>) at ../exp/ExpLOBaccess.cpp:1859
> #3  0x00007ffff30bd8b9 in ExLobGlobals::doWorkInThread (this=0x217f7e0)
>     at ../exp/ExpLOBaccess.cpp:2397
> #4  0x00007ffff30bd909 in workerThreadMain (arg=<optimized out>)
>     at ../exp/ExpLOBaccess.cpp:2048
> #5  0x0000003d60c07851 in start_thread () from /lib64/libpthread.so.0
> #6  0x0000003d608e890d in clone () from /lib64/libc.so.6
> 1854              // there are no empty buffers.
> 1855              // if prefetch list already has the max, wait for one to
> freeup.
> 1856              totalBufSize = cursor->prefetchBufList_.size() *
> cursor->bufMaxSize_;
> 1857              if (totalBufSize > LOB_CURSOR_PREFETCH_BYTES_MAX) {
> 1858                  traceMessage("wait on condition cursor",__LINE__);
> 1859                  cursor->lock_.wait();
> 1860                  continue;
> 1861              }
> The main thread's backtrace in the hang:
> #0  0x0000003d60c080ad in pthread_join () from /lib64/libpthread.so.0
> #1  0x00007ffff30b9d2c in ExLobGlobals::~ExLobGlobals (this=0x217f7e0,
>     __in_chrg=<optimized out>) at ../exp/ExpLOBaccess.cpp:1973
> #2  0x00007ffff30be711 in ExLobsOper (lobName=0x7fffffff1780 "/h/temp",
>     handleIn=0x0, handleInLen=0, hdfsServer=0x0, hdfsPort=0, handleOut=0x0,
>     handleOutLen=@0x7fffffff17f0: 0, descNumIn=0,
>     descNumOut=@0x7fffffff17f0: 0, retOperLen=@0x7fffffff17f0: 0,
>     requestTagIn=0, requestTagOut=@0x7fffffff17f0: 0,
>     requestStatus=@0x7fffffff17fc: 32767, cliError=@0x7fffffff17e8: -1,
>     dir=0x7fffffff1780 "/h/temp", storage=Lob_HDFS_File, source=0x0,
>     sourceLen=0, cursorBytes=0, cursorId=0x0, operation=Lob_Cleanup,
>     subOperation=Lob_None, waited=1, globPtr=@0x7fffd6fd6ab8: 0x217f7e0,
>     transId=0, blackBox=0x0, blackBoxLen=0, bufferSize=0, replication=0,
>     blockSize=0) at ../exp/ExpLOBaccess.cpp:2732
> #3  0x00007ffff30c0c86 in ExpLOBinterfaceCleanup (lobGlob=<optimized out>,
>     lobHeap=<optimized out>) at ../exp/ExpLOBinterface.cpp:100
> #4  0x00007ffff4cd0b8d in ExHdfsScanTcb::~ExHdfsScanTcb (this=0x7fffd6fd69b0,
>     __in_chrg=<optimized out>) at ../executor/ExHdfsScan.cpp:178
> #5  0x00007ffff4cd0ca1 in ExHdfsScanTcb::~ExHdfsScanTcb (this=0x7fffd6fd69b0,
>     __in_chrg=<optimized out>) at ../executor/ExHdfsScan.cpp:179
> #6  0x00007ffff4b5ba20 in ex_globals::cleanupTcbs (this=0x7fffe96b5700)
>     at ../executor/ex_globals.cpp:192
> #7  0x00007ffff4b5eefd in ex_globals::deleteMe (this=0x7fffe96b5700)
>     at ../executor/ex_globals.cpp:138
> #8  0x00007ffff4b448c9 in ExExeStmtGlobals::deleteMe (this=0x7fffe96b5700)
>     at ../executor/ex_exe_stmt_globals.cpp:303
> #9  0x00007ffff4b44bc9 in ExMasterStmtGlobals::deleteMe (this=0x7fffe96b5700)
>     at ../executor/ex_exe_stmt_globals.cpp:654
> #10 0x00007ffff4b84b46 in ex_root_tcb::deallocAndDelete (
>     this=<optimized out>, glob=0x7fffe96b5700, fragTable=<optimized out>)
>     at ../executor/ex_root.cpp:2430
> #11 0x00007ffff5fa86b7 in CliStatement::releaseTcbs (this=<optimized out>,
>     closeAllOpens=<optimized out>) at ../cli/Statement.cpp:6056
> #12 0x00007ffff5fa8843 in CliStatement::dealloc (this=0x7fffe96c1f90,
>     closeAllOpens=0) at ../cli/Statement.cpp:6104
> #13 0x00007ffff5fa8f24 in CliStatement::~CliStatement (this=0x7fffe96c1f90,
>     __in_chrg=<optimized out>) at ../cli/Statement.cpp:571
> #14 0x00007ffff5fa94f1 in CliStatement::~CliStatement (this=0x7fffe96c1f90,
>     __in_chrg=<optimized out>) at ../cli/Statement.cpp:741
> #15 0x00007ffff5f73a28 in ContextCli::deallocStmt (this=<optimized out>,
>     statement_id=0x1f925d0, deallocStaticStmt=0) at ../cli/Context.cpp:2328
> #16 0x00007ffff5f59ecf in SQLCLI_DeallocStmt (cliGlobals=<optimized out>,
>     statement_id=0x1f925d0) at ../cli/Cli.cpp:1649
> #17 0x00007ffff5fba706 in SQL_EXEC_DeallocStmt (statement_id=0x1f925d0)
>     at ../cli/CliExtern.cpp:1823
> I added a test case into regress/hive/TEST003 to demo this scenario; it will
> be available when I check in the fix.
> Assigned to LaunchPad User Mike Hanlon

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
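
The two backtraces show a familiar shutdown pattern: a worker blocked in a condition wait for buffer space while the main thread joins it from a destructor. The following is a minimal standalone sketch of that pattern and of one common way to avoid the hang (set a stop flag and signal the condition before joining). It is not the Trafodion code and not necessarily the fix that will be checked in; `Prefetcher`, `readFirstN`, `stop_` and the other names are hypothetical, and standard C++ threads stand in for `ExLobLock` and pthreads.

```cpp
// Hypothetical illustration only; none of these names exist in Trafodion.
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>

class Prefetcher {
public:
    explicit Prefetcher(std::size_t maxBuffers)
        : maxBuffers_(maxBuffers), worker_([this] { produce(); }) {}

    ~Prefetcher() {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            stop_ = true;      // ask the worker to give up
        }
        cv_.notify_all();      // wake it if it is waiting for a free slot
        worker_.join();        // without the two steps above, this can block forever
    }

    // Take one buffer, blocking until the worker has produced one.
    int next() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !queue_.empty(); });
        int value = queue_.front();
        queue_.pop_front();
        lock.unlock();
        cv_.notify_all();      // a slot is free again
        return value;
    }

private:
    void produce() {
        for (int i = 0;; ++i) {
            std::unique_lock<std::mutex> lock(mutex_);
            // Wait for space, but also wake up when shutdown is requested.
            cv_.wait(lock, [this] { return stop_ || queue_.size() < maxBuffers_; });
            if (stop_) return;
            queue_.push_back(i);
            lock.unlock();
            cv_.notify_all();  // a buffer is ready for the consumer
        }
    }

    std::size_t maxBuffers_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::deque<int> queue_;
    bool stop_ = false;
    std::thread worker_;       // declared last so the other members exist before it starts
};

// Reads only the first n buffers, like a [FIRST n] query, then lets the
// Prefetcher go out of scope while its worker is probably waiting for space.
int readFirstN(int n) {
    Prefetcher prefetcher(2);
    int rows = 0;
    while (rows < n && prefetcher.next() == rows) {
        ++rows;
    }
    return rows;
}
```

Calling `readFirstN(3)` consumes three buffers and returns once the destructor has joined the worker. If the destructor skipped the flag and the `notify_all()`, the worker would stay in its wait on a full queue and the join would never return, which is the shape of the hang reported above.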