[ 
https://issues.apache.org/jira/browse/TRAFODION-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15218296#comment-15218296
 ] 

Hans Zeller commented on TRAFODION-1910:
----------------------------------------

The crash in this bug was triggered when a JDBC client disconnected. During 
disconnect, we delete the HiveClient_JNI object, as this stack trace shows:

{noformat}
#1  0x00007ffff25ad180 in HiveClient_JNI::~HiveClient_JNI (this=0x7fffdf726900, 
__in_chrg=<value optimized out>) at ../executor/HBaseClient_JNI.cpp:4882
#2  0x00007ffff25ad0d3 in HiveClient_JNI::deleteInstance () at 
../executor/HBaseClient_JNI.cpp:4870
#3  0x00007ffff3eda59a in SQL_EXEC_DeleteHbaseJNI () at 
../cli/CliExtern.cpp:6319
#4  0x00007fffeba7466e in CmpStatement::process (this=0x7fffcd1d6420, es=...) 
at ../arkcmp/CmpStatement.cpp:1314
#5  0x00007fffeba61f86 in CmpContext::compileDirect (this=0x7fffdeb80090, 
data=0x7fffe0beca30 "\002", data_len=4, outHeap=0x7fffe0189138, charset=15, 
op=CmpMessageObj::END_SESSION, gen_code=@0x7fffe0beca18, 
gen_code_len=@0x7fffe0beca14, parserFlags=4194304, parentQid=0x0, 
parentQidLen=0, diagsArea=0x0) at ../arkcmp/CmpContext.cpp:894
#6  0x00007ffff3e7678f in ContextCli::endMxcmpSession (this=0x7fffe0189128, 
cleanupEsps=0, clearCmpCache=0) at ../cli/Context.cpp:3907
#7  0x00007ffff3e76c6e in ContextCli::endSession (this=0x7fffe0189128, 
cleanupEsps=0, cleanupEspsOnly=0, cleanupOpens=0) at ../cli/Context.cpp:4053
#8  0x00007ffff232cfbd in ExSetSessionDefaultTcb::work (this=0x7fffdf764a48) at 
../executor/ex_control.cpp:815
#9  0x00007ffff2357c2d in ex_tcb::sWork (tcb=0x7fffdf764a48) at 
../executor/ex_tcb.h:103
#10 0x00007ffff24e343f in ExSubtask::work (this=0x7fffdf764f80) at 
../executor/ExScheduler.cpp:754
#11 0x00007ffff24e2802 in ExScheduler::work (this=0x7fffdf7645b0, 
prevWaitTime=0) at ../executor/ExScheduler.cpp:331
#12 0x00007ffff23b2b2a in ex_root_tcb::execute (this=0x7fffdf765000, 
cliGlobals=0x1086dc0, glob=0x7fffdf76f2d8, input_desc=0x0, 
diagsArea=@0x7fffe0bee220, reExecute=0) at ../executor/ex_root.cpp:1058
#13 0x00007ffff3ebea2b in CliStatement::execute (this=0x7fffdf7827e0, 
cliGlobals=0x1086dc0, input_desc=0x0, diagsArea=..., 
execute_state=CliStatement::INITIAL_STATE_, fixupOnly=0, cliflags=0) at 
../cli/Statement.cpp:4525
#14 0x00007ffff3e407f4 in SQLCLI_PerformTasks(CliGlobals *, ULng32, SQLSTMT_ID 
*, SQLDESC_ID *, SQLDESC_ID *, Lng32, Lng32, typedef __va_list_tag 
__va_list_tag *, SQLCLI_PTR_PAIRS *, SQLCLI_PTR_PAIRS *) (cliGlobals=0x1086dc0, 
tasks=606, statement_id=0x30613e8, input_descriptor=0x0, output_descriptor=0x0, 
num_input_ptr_pairs=0, num_output_ptr_pairs=0, ap=0x7fffe0bee890, 
input_ptr_pairs=0x0, output_ptr_pairs=0x0) at ../cli/Cli.cpp:3297
#15 0x00007ffff3e418f6 in SQLCLI_ExecDirect2(CliGlobals *, SQLSTMT_ID *, 
SQLDESC_ID *, Int32, SQLDESC_ID *, Lng32, typedef __va_list_tag __va_list_tag 
*, SQLCLI_PTR_PAIRS *) (cliGlobals=0x1086dc0, statement_id=0x30613e8, 
sql_source=0x7fffe0beeae0, prepFlags=0, input_descriptor=0x0, num_ptr_pairs=0, 
ap=0x7fffe0bee890, ptr_pairs=0x0) at ../cli/Cli.cpp:3731
#16 0x00007ffff3ed4c17 in SQL_EXEC_ExecDirect2 (statement_id=0x30613e8, 
sql_source=0x7fffe0beeae0, prep_flags=0, input_descriptor=0x0, num_ptr_pairs=0) 
at ../cli/CliExtern.cpp:2329
#17 0x00007ffff69ceb3d in SRVR::WSQL_EXEC_ExecDirect (statement_id=0x30613e8, 
sql_source=0x7fffe0beeae0, input_descriptor=0x0, num_ptr_pairs=0) at 
SQLWrapper.cpp:363
#18 0x00007ffff69b594f in SRVR::EXECDIRECT (pSrvrStmt=0x3060dd0) at 
sqlinterface.cpp:4521
#19 0x00007ffff6941e73 in SRVR::ControlProc (pParam=0x3060dd0) at 
csrvrstmt.cpp:763
#20 0x00007ffff69414b1 in SRVR_STMT_HDL::ExecDirect (this=0x3060dd0, 
inCursorName=0x0, inSqlString=0x61c430 "SET SESSION DEFAULT SQL_SESSION 'END'", 
inStmtType=1, inSqlStmtType=0, inSqlAsyncEnable=0, inQueryTimeout=0) at 
csrvrstmt.cpp:445
#21 0x00007ffff69b614b in SRVR::EXECDIRECT (pSqlStr=0x61c430 "SET SESSION 
DEFAULT SQL_SESSION 'END'", WriteError=0) at sqlinterface.cpp:4702
#22 0x0000000000581162 in SRVR::SrvrSessionCleanup () at SrvrConnect.cpp:4080
#23 0x0000000000580d14 in odbc_SQLSvc_TerminateDialogue_ame_ 
(objtag_=0x10785e0, call_id_=0x1078638, dialogueId=903221211) at 
SrvrConnect.cpp:3950
#24 0x000000000051ce04 in SQLDISCONNECT_IOMessage (objtag_=0x10785e0, 
call_id_=0x1078638) at Interface/odbcs_srvr.cpp:653
#25 0x000000000051eff4 in DISPATCH_TCPIPRequest (objtag_=0x10785e0, 
call_id_=0x1078638, operation_id=3002) at Interface/odbcs_srvr.cpp:1775
#26 0x0000000000465928 in BUILD_TCPIP_REQUEST (pnode=0x10785e0) at 
../Common/TCPIPSystemSrvr.cpp:606
#27 0x000000000046586f in PROCESS_TCPIP_REQUEST (pnode=0x10785e0) at 
../Common/TCPIPSystemSrvr.cpp:584
#28 0x00000000004b32b0 in CNSKListenerSrvr::CheckTCPIPRequest (this=0xf2d850, 
ipnode=0x10785e0) at Interface/Listener_srvr.cpp:64
#29 0x00000000004c4939 in CNSKListenerSrvr::tcpip_listener (arg=0xf2d850) at 
Interface/linux/Listener_srvr_ps.cpp:403
#30 0x00007ffff43752f4 in sb_thread_sthr_disp (pp_arg=0x1073660) at 
threadl.cpp:256
#31 0x00007ffff4141a51 in start_thread () from /lib64/libpthread.so.0
#32 0x00007ffff467793d in clone () from /lib64/libc.so.6
{noformat}

My assumption is that the disconnect does not get rid of the CLI context 
itself. The problem is that we cache the pointer to the HiveClient_JNI object 
in the compiler (in class HiveMetaData, used by NATableDB), so the cached 
pointer is left dangling once the object is deleted.

If we get rid of the CLI context, that will still delete the HiveClient_JNI 
object, because that path does not go through the CLI call I changed.
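
As a rough illustration of the caching problem, here is a simplified C++ model. 
It is not Trafodion code: HiveClientModel, MetaDataCacheModel and 
cacheValidAfterDisconnect() are hypothetical names, and a std::weak_ptr stands 
in for the cached raw pointer only so that the stale state can be observed 
safely instead of crashing.

```cpp
#include <memory>

// Hypothetical stand-in for HiveClient_JNI (not the real class).
struct HiveClientModel {
    const char* getTableStr() const { return "table description"; }
};

// Hypothetical stand-in for the compiler-side cache (HiveMetaData/NATableDB).
// The real code caches a plain pointer; weak_ptr is an assumption made here
// so the example can detect the stale reference without undefined behavior.
struct MetaDataCacheModel {
    std::weak_ptr<HiveClientModel> client;
    bool clientStillValid() const { return !client.expired(); }
};

// Returns whether the cache refers to a live client right after caching it.
bool cacheValidBeforeDisconnect() {
    std::shared_ptr<HiveClientModel> client(new HiveClientModel());
    MetaDataCacheModel cache;
    cache.client = client;
    return cache.clientStillValid();
}

// Returns whether the cache still refers to a live client after the owner
// has deleted it, which models deleteInstance() running on disconnect.
bool cacheValidAfterDisconnect() {
    std::shared_ptr<HiveClientModel> client(new HiveClientModel());
    MetaDataCacheModel cache;
    cache.client = client;
    client.reset();  // the client object is destroyed here
    return cache.clientStillValid();
}
```

In this model, cacheValidAfterDisconnect() reports that the cache no longer 
points at a live object. With a raw pointer there is no such check, and the 
next use of the cached pointer is undefined behavior.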

My hope is that we will do the following:


* JDBC disconnect:
** Call SQL_EXEC_DeleteHbaseJNI() through a SET SESSION command
** Delete HBaseClient_JNI
** Keep HiveClient_JNI, which is also cached in NATableDB/HiveMetaData

* Destroy CLI context
** Delete both HBaseClient_JNI and HiveClient_JNI from ContextCli::deleteMe()
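
The two paths above could be sketched roughly as follows. This is a simplified 
model under stated assumptions, not the actual ContextCli code: ContextModel, 
HBaseClientModel, HiveClientModel and the has...() helpers are hypothetical, 
and ownership through std::unique_ptr is an assumption made for the example.

```cpp
#include <memory>

// Hypothetical stand-ins for HBaseClient_JNI and HiveClient_JNI.
struct HBaseClientModel {};
struct HiveClientModel {};

// Hypothetical model of a CLI context that owns both client objects.
class ContextModel {
public:
    ContextModel()
        : hbase_(new HBaseClientModel()), hive_(new HiveClientModel()) {}

    // Models the JDBC disconnect path (SET SESSION DEFAULT SQL_SESSION 'END'):
    // delete the HBase client, keep the Hive client that the compiler caches.
    void endSession() { hbase_.reset(); }

    // Models destroying the CLI context: both clients are deleted.
    void deleteMe() {
        hbase_.reset();
        hive_.reset();
    }

    bool hasHBaseClient() const { return hbase_ != nullptr; }
    bool hasHiveClient() const { return hive_ != nullptr; }

private:
    std::unique_ptr<HBaseClientModel> hbase_;
    std::unique_ptr<HiveClientModel> hive_;
};
```

After endSession() only the Hive client remains, so a pointer to it cached 
elsewhere stays valid across a reconnect. After deleteMe() neither client 
remains.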

> mxosrvr crashes on Hive query after reconnect
> ---------------------------------------------
>
>                 Key: TRAFODION-1910
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1910
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-exe
>    Affects Versions: 1.3-incubating
>            Reporter: Hans Zeller
>            Assignee: Hans Zeller
>
> This is a problem Wei-Shiun found when running tests with many connections 
> that use Hive queries. He sees intermittent core dumps with this stack trace:
> #0 0x00007f47cb0dd625 in raise () from /lib64/libc.so.6
> #1 0x00007f47cb0ded8d in abort () from /lib64/libc.so.6
> #2 0x00007f47cc613a55 in os::abort(bool) ()
>    from /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
> #3 0x00007f47cc793f87 in VMError::report_and_die() ()
>    from /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
> #4 0x00007f47cc61896f in JVM_handle_linux_signal ()
>    from /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
> #5 <signal handler called>
> #6 0x00007f47c92bd5ee in HiveMetaData::recordError (this=0x7f47a5e50088,
>     errCode=122, errMethodName=0x7f47c935aaa3 "HiveClient_JNI::getTableStr()")
>     at ../executor/hiveHook.cpp:228
> #7 0x00007f47c92bf613 in HiveMetaData::getTableDesc (this=0x7f47a5e50088,
>     schemaName=0x7f47b858e798 "mytest5", tblName=0x7f47b858e7c8 "mytable")
>     at ../executor/hiveHook.cpp:806
> #8 0x00007f47c4056307 in NATableDB::get (this=0x7f47b652d3c0, corrName=...,
>     bindWA=0x7f47b85912d0, inTableDescStruct=<value optimized out>)
>     at ../optimizer/NATable.cpp:8377
> #9 0x00007f47c3db0743 in BindWA::getNATable (this=0x7f47b85912d0,
>     corrName=..., catmanCollectTableUsages=1, inTableDescStruct=0x0)
>     at ../optimizer/BindRelExpr.cpp:1514
> #10 0x00007f47c3db3290 in Describe::bindNode (this=0x7f47a2aae440,
>     bindWA=0x7f47b85912d0) at ../optimizer/BindRelExpr.cpp:13565
> #11 0x00007f47c3d989f7 in RelExpr::bindChildren (this=0x7f47a2aaf5f8,
>     bindWA=0x7f47b85912d0) at ../optimizer/BindRelExpr.cpp:2258
> #12 0x00007f47c3dccbce in RelRoot::bindNode (this=0x7f47a2aaf5f8,
>     bindWA=0x7f47b85912d0) at ../optimizer/BindRelExpr.cpp:5204
> #13 0x00007f47c577e84e in CmpMain::compile (this=0x7f47b8593c40,
>     input_str=0x7f47a5e0b690 "showddl mytable", charset=15,
>     queryExpr=@0x7f47b8593b78, gen_code=0x7f47a5e0c1a8,
>     gen_code_len=0x7f47a5e0c1a0, heap=0x7f47b70bbc00, phase=CmpMain::END,
>     fragmentDir=0x7f47b8593d98, op=3004, useQueryCache=1,
>     cacheable=0x7f47b8593b88, begTime=0x7f47b8593b60, shouldLog=0)
>     at ../sqlcomp/CmpMain.cpp:2071
> #14 0x00007f47c578168c in CmpMain::sqlcomp (this=0x7f47b8593c40,
>     input_str=0x7f47a5e0b690 "showddl mytable", charset=15,
>     queryExpr=@0x7f47b8593b78, gen_code=0x7f47a5e0c1a8,
>     gen_code_len=0x7f47a5e0c1a0, heap=0x7f47b70bbc00, phase=CmpMain::END,
>     fragmentDir=0x7f47b8593d98, op=3004, useQueryCache=1,
>     cacheable=0x7f47b8593b88, begTime=0x7f47b8593b60, shouldLog=0)
>     at ../sqlcomp/CmpMain.cpp:1684
> #15 0x00007f47c5782998 in CmpMain::sqlcomp (this=0x7f47b8593c40, input=...,
>     gen_code=0x7f47a5e0c1a8, gen_code_len=0x7f47a5e0c1a0, heap=0x7f47b70bbc00,
>     phase=CmpMain::END, fragmentDir=0x7f47b8593d98, op=3004)
>     at ../sqlcomp/CmpMain.cpp:819
> #16 0x00007f47c33a8898 in CmpStatement::process (this=0x7f47a5e52f10,
>     sqltext=<value optimized out>) at ../arkcmp/CmpStatement.cpp:499
> #17 0x00007f47c339b48c in CmpContext::compileDirect (this=0x7f47b6525090,
>     data=0x7f47b7112db8 "\200", data_len=144, outHeap=0x7f47b7b2e128,
>     charset=15, op=CmpMessageObj::SQLTEXT_COMPILE, gen_code=@0x7f47b8594320,
>     gen_code_len=@0x7f47b8594328, parserFlags=4194304, parentQid=0x0,
>     parentQidLen=0, diagsArea=0x7f47b7112e50) at ../arkcmp/CmpContext.cpp:841
> #18 0x00007f47caa0dd38 in CliStatement::prepare2 (this=0x7f47b70d4028,
>     source=0x7f47b711ab18 "showddl mytable", diagsArea=...,
>     passed_gen_code=<value optimized out>, passed_gen_code_len=3081953576,
>     charset=15, unpackTdbs=1, cliFlags=129) at ../cli/Statement.cpp:1775
> #19 0x00007f47ca9bac94 in SQLCLI_Prepare2 (cliGlobals=0x27bcbb0,
>     statement_id=0x370a9c8, sql_source=0x7f47b8594610, gencode_ptr=0x0,
>     gencode_len=0, ret_gencode_len=0x0, query_cost_info=0x370abf8,
>     query_comp_stats_info=0x370ac48, uniqueStmtId=<value optimized out>,
>     uniqueStmtIdLen=0x370ab2c, flags=1) at ../cli/Cli.cpp:5927
> #20 0x00007f47caa1b1ae in SQL_EXEC_Prepare2 (statement_id=0x370a9c8,
>     sql_source=0x7f47b8594610, gencode_ptr=0x0, gencode_len=0,
>     ret_gencode_len=0x0, query_cost_info=0x370abf8, comp_stats_info=0x370ac48,
>     uniqueStmtId=0x370ab30 "", uniqueStmtIdLen=0x370ab2c, flags=1)
>     at ../cli/CliExtern.cpp:5034
> #21 0x00007f47cd4e31d9 in SRVR::WSQL_EXEC_Prepare2 (statement_id=0x370a9c8,
>     sql_source=<value optimized out>, gencode_ptr=<value optimized out>,
>     gencode_len=<value optimized out>, ret_gencode_len=<value optimized out>,
>     query_cost_info=<value optimized out>, comp_stats_info=0x370ac48,
>     uniqueQueryId=0x370ab30 "", uniqueQueryIdLen=0x370ab2c, flags=1)
>     at SQLWrapper.cpp:803
> #22 0x00007f47cd4d7b45 in SRVR::PREPARE2 (pSrvrStmt=0x370a3b0,
>     isFromExecDirect=248) at sqlinterface.cpp:5057
> #23 0x00007f47cd508370 in odbc_SQLSvc_Prepare2_sme_ (inputRowCnt=0,
>     sqlStmtType=1, stmtLabel=<value optimized out>,
>     sqlString=0x2ba7254 "showddl mytable", holdableCursor=0,
>     returnCode=0x7f47b8594b08, sqlWarningOrErrorLength=0x7f47b8594b04,
>     sqlWarningOrError=@0x7f47b8594ae0, sqlQueryType=0x7f47b8594afc,
>     stmtHandle=0x7f47b8594ac0, estimatedCost=0x7f47b8594af8,
>     inputDescLength=0x7f47b8594af0, inputDesc=@0x7f47b8594ad0,
>     outputDescLength=0x7f47b8594aec, outputDesc=@0x7f47b8594ac8,
>     isFromExecDirect=true) at srvrothers.cpp:939
> #24 0x00000000004c6ca2 in odbc_SQLSrvr_ExecDirect_ame_ (objtag_=0x55e6ec0,
>     call_id_=0x55e6f18, dialogueId=259570813, stmtLabel=0x2ba7270 "SQL_CUR_7",
>     cursorName=0x0, stmtExplainLabel=<value optimized out>, stmtType=0,
>     sqlStmtType=1, sqlString=0x2ba7254 "showddl mytable", sqlAsyncEnable=0,
>     queryTimeout=0, inputRowCnt=0, txnID=0, holdableCursor=0)
>     at SrvrConnect.cpp:7894
> #25 0x0000000000495886 in SQLEXECUTE_IOMessage (objtag_=0x55e6ec0,
>     call_id_=0x55e6f18, operation_id=3012) at Interface/odbcs_srvr.cpp:1731
> #26 0x0000000000495934 in DISPATCH_TCPIPRequest (objtag_=0x55e6ec0,
>     call_id_=0x55e6f18, operation_id=<value optimized out>)
>     at Interface/odbcs_srvr.cpp:1796
> #27 0x0000000000434532 in BUILD_TCPIP_REQUEST (pnode=0x55e6ec0)
>     at ../Common/TCPIPSystemSrvr.cpp:606
> #28 0x0000000000434ecd in PROCESS_TCPIP_REQUEST (pnode=0x55e6ec0)
>     at ../Common/TCPIPSystemSrvr.cpp:584
> #29 0x00000000004631a6 in CNSKListenerSrvr::tcpip_listener (arg=0x2663560)
>     at Interface/linux/Listener_srvr_ps.cpp:403
> #30 0x00007f47cae91314 in sb_thread_sthr_disp (pp_arg=0x27a94a0)
>     at threadl.cpp:256
> #31 0x00007f47cac5da51 in start_thread () from /lib64/libpthread.so.0
> #32 0x00007f47cb19393d in clone () from /lib64/libc.so.6
> The problem does not happen with sqlci.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
