[ 
https://issues.apache.org/jira/browse/TRAFODION-648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Wayne Birdsall updated TRAFODION-648:
-------------------------------------------
    Fix Version/s:     (was: 2.0-incubating)

> LP Bug: 1371670 - Use of bulk load for ustat causes a core in some cases
> ------------------------------------------------------------------------
>
>                 Key: TRAFODION-648
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-648
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>            Reporter: Apache Trafodion
>            Assignee: David Wayne Birdsall
>
> In some cases we noticed that using bulk load with update statistics 
> makes the compiler generate a core file. The update statistics operation 
> continues and the sample table is populated.
> The issue was seen on zircon2 and on Amethyst with 2 different tables. On 
> Zircon4 the issue did not happen when I ran the same update 
> statistics statement that produced the core on zircon2.
> The stack is below. My initial debugging showed that the issue happens 
> when we try to do cleanup in NATable.cpp:
>   for(i=0; i < tablesToDeleteAfterStatement_.entries(); i++)
>   {
>     if ( tablesToDeleteAfterStatement_[i]->getHeapType() == NATable::OTHER ) {
>       tableHeap = tablesToDeleteAfterStatement_[i]->heap_;
>       delete tableHeap;
>     }
>   } 
> In the case I debugged with Barry, it looks like deleting the 
> 3rd item in the list fails because it was already deleted. The 
> 1st and 3rd elements appear to point to the same object, so once we delete 
> the first one, the 3rd element points to a nonexistent object.
> [Thread debugging using libthread_db enabled]
> Core was generated by `tdm_arkcmp SQMON1.0 00000 00000 011902 $Z0009Q2 
> tag#0$port#52331$description#n0'.
> Program terminated with signal 6, Aborted.
> #0  0x00007fffee9a38a5 in raise () from /lib64/libc.so.6
> #0  0x00007fffee9a38a5 in raise () from /lib64/libc.so.6
> #1  0x00007fffee9a500d in abort () from /lib64/libc.so.6
> #2  0x00007ffff138f455 in os::abort(bool) () from 
> /usr/lib/jvm/jdk1.7.0_09_64/jre/lib/amd64/server/libjvm.so
> #3  0x00007ffff14ef717 in VMError::report_and_die() () from 
> /usr/lib/jvm/jdk1.7.0_09_64/jre/lib/amd64/server/libjvm.so
> #4  0x00007ffff1392f60 in JVM_handle_linux_signal () from 
> /usr/lib/jvm/jdk1.7.0_09_64/jre/lib/amd64/server/libjvm.so
> #5  <signal handler called>
> #6  NATableDB::resetAfterStatement (this=0x7fffe79683b0) at 
> ../optimizer/NATable.cpp:7559
> #7  0x00007ffff4f712df in SchemaDB::cleanupPerStatement (this=0x7fffe79683a0) 
> at ../optimizer/SchemaDB.cpp:186
> #8  0x00007ffff4127735 in CmpContext::cleanup (this=0x7fffe7963090, 
> exception=<value optimized out>) at ../arkcmp/CmpContext.cpp:489
> #9  0x00007ffff4129f63 in CmpContext::unsetStatement (this=0x7fffe7963090, 
> s=0x7fffe7990c10, exceptionRaised=0) at ../arkcmp/CmpContext.cpp:453
> #10 0x00007ffff4134e46 in CmpStatement::~CmpStatement (this=0x7fffe7990c10, 
> __in_chrg=<value optimized out>) at ../arkcmp/CmpStatement.cpp:224
> #11 0x00007ffff4134f11 in CmpStatement::~CmpStatement (this=0x7fffe7990c10, 
> __in_chrg=<value optimized out>) at ../arkcmp/CmpStatement.cpp:227
> #12 0x00007ffff41251b9 in ExCmpMessage::actOnReceive (this=0x7fffffffc250) at 
> ../arkcmp/CmpConnection.cpp:588
> #13 0x00007ffff6fdca56 in IpcMessageStream::internalActOnReceive 
> (this=0x7fffffffc250, buffer=<value optimized out>, connection=0xbaadb0) at 
> ../common/Ipc.cpp:3553
> #14 0x00007ffff6ff3aab in GuaConnectionToClient::acceptBuffer (this=0xbaadb0, 
> buffer=<value optimized out>, receivedDataLength=<value optimized out>) at 
> ../common/IpcGuardian.cpp:2467
> #15 0x00007ffff6ff47af in GuaReceiveControlConnection::wait (this=0xb9a5e0, 
> timeout=-1, eventConsumed=<value optimized out>, ipcAwaitiox=0x7fffffffbc00) 
> at ../common/IpcGuardian.cpp:3164
> #16 0x00007ffff6ff5b92 in GuaConnectionToClient::wait (this=0xbaadb0, 
> timeout=<value optimized out>, eventConsumed=0x0, ipcAwaitiox=0x0) at 
> ../common/IpcGuardian.cpp:2136
> #17 0x00007ffff6fe91aa in IpcSetOfConnections::waitOnSet 
> (this=0x7fffffffc3f0, timeout=-1, calledByESP=0, timedout=0x0) at 
> ../common/Ipc.cpp:1709
> #18 0x00007ffff6fe9ced in IpcMessageStream::waitOnMsgStream 
> (this=0x7fffffffc250, timeout=-1) at ../common/Ipc.cpp:3272
> #19 0x00007ffff6fea032 in IpcMessageStream::receive (this=0x7fffffffc250, 
> waited=1) at ../common/Ipc.cpp:3254
> #20 0x00000000004048ae in main (argc=2, argv=0x7fffffffc9c8) at 
> ../bin/arkcmp.cpp:303
> To reproduce, you can use zircon2 and the statement 
> update statistics for table trafodion.bench60.ycsb_table_20 on every key 
> generate 1 intervals sample 1000 rows
> or use amethyst 5 and do an update statistics on the ossdba.box table.
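
The double delete described in the report (1st and 3rd list entries pointing to the same object) can be illustrated with a minimal sketch. Heap, Table, and deleteHeapsOnce are hypothetical stand-ins invented for this example, not the actual NAHeap/NATable classes, and this is only one possible way to guard the cleanup loop, not necessarily the fix that was applied. The idea is to collect the distinct heaps while every entry is still valid, then delete each heap exactly once.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a per-table heap; counts live instances so the
// effect of the cleanup can be observed.
struct Heap {
  static int liveCount;
  Heap() { ++liveCount; }
  ~Heap() { --liveCount; }
};
int Heap::liveCount = 0;

// Hypothetical stand-in for a table entry that refers to a heap.
struct Table {
  enum HeapType { STATEMENT, CONTEXT, OTHER };
  HeapType heapType;
  Heap *heap;
};

// Deletes each distinct heap of type OTHER exactly once, even if the same
// table (or the same heap) appears several times in the list.
// Returns the number of heaps actually deleted.
std::size_t deleteHeapsOnce(const std::vector<Table *> &tables) {
  // Pass 1: gather distinct heaps before anything is freed, so no entry is
  // dereferenced after its heap has been deleted.
  std::unordered_set<Heap *> heaps;
  for (Table *t : tables) {
    if (t != nullptr && t->heapType == Table::OTHER && t->heap != nullptr) {
      heaps.insert(t->heap);
    }
  }
  // Pass 2: free each heap once.
  for (Heap *h : heaps) {
    delete h;
  }
  return heaps.size();
}
```

With a list such as {&a, &b, &a}, where a owns a heap of type OTHER, the heap is deleted once instead of twice. The two-pass shape matters if a table object is itself allocated on the heap being deleted, since a single-pass loop would then read freed memory when it reaches the duplicate entry.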



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
