[ https://issues.apache.org/jira/browse/TS-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15128211#comment-15128211 ]
ASF GitHub Bot commented on TS-4160: ------------------------------------ Github user sudheerv commented on the pull request: https://github.com/apache/trafficserver/pull/443#issuecomment-178564856 Does it not affect the plugins that use API such as Request::getUrl() or Request::getHeaders() etc which rely on the cached handles at the moment? OR Are you suggesting these API could be modified too without necessarily breaking compatibility? > MLocRelease for txn buffers called too late in CPP API resulting buffer > corruption > ---------------------------------------------------------------------------------- > > Key: TS-4160 > URL: https://issues.apache.org/jira/browse/TS-4160 > Project: Traffic Server > Issue Type: Bug > Components: CPP API > Affects Versions: 6.1.0 > Reporter: Sudheer Vinukonda > Assignee: Sudheer Vinukonda > Labels: yahoo > Fix For: 6.1.0 > > > CPP API currently acquires the buffer and heap handles for various kinds of > request/response objects (e.g cached_request_hdr_buf, cached_request_hdr_loc, > client_response_hdr_buf, client_response_hdr_loc etc) at the corresponding > hooks of a transaction, along with the url handles in the CPP API's > request/response objects, but keeps them until txn_close, before calling > MLocRelease on them finally. > However, this is buggy, since the core doesn't seem to guarantee the life of > these handles until the end of the transaction. In particular, the > cached_request_hdr_buf/loc are released at the end of > tunnel_handler_cache_read(), which causes it to be reassigned to a different > txn and results in a crash when the original txn finally tries to release > them triggering the asserts from the MLocRelease C API. Another case is > internal redirect follow in the core, which destroys and recreates some of > these buffers. > Note that the C API plugins typically call MLocRelease immediately after > acquiring/using them. > Additionally, the order in which CPP API releases the handles is incorrect. > Transaction object contains the Request objects and Transaction object's > destructor first calls MLocRelease on the request heap handles, before which > the Request object's d'tor is triggered which calls the url handle in that > released request handle. This may not be a problem right now, since > MLocRelease for HTTP_OBJ and URL_OBJ are basically no-ops, but, the order of > release still breaks the implicit contract of the API, which requires the > child to be released before releasing the parent (which means, if the API's > internal implementation changes tomorrow, that can cause a problem). > Here are the back traces for the issues mentioned above: > {code} > (gdb) bt > #0 0x0000003c1b232625 in raise (sig=6) at > ../nptl/sysdeps/unix/sysv/linux/raise.c:64 > #1 0x0000003c1b233e05 in abort () at abort.c:92 > #2 0x00002b149d517018 in ink_die_die_die (retval=1) at ink_error.cc:43 > #3 0x00002b149d5170e5 in ink_fatal_va(int, const char *, typedef > __va_list_tag __va_list_tag *) (return_code=1, message_format=0x2b149d525d78 > "%s:%d: failed assert `%s`", ap=0x2b14a4806570) at ink_error.cc:65 > #4 0x00002b149d5171ae in ink_fatal (return_code=1, > message_format=0x2b149d525d78 "%s:%d: failed assert `%s`") at ink_error.cc:73 > #5 0x00002b149d515d20 in _ink_assert (expression=0x771185 "!\"invalid > mloc\"", file=0x770f1d "InkAPI.cc", line=1886) at ink_assert.cc:37 > #6 0x000000000050f809 in TSHandleMLocRelease (bufp=0x2b15e82357f0, > parent=0x2b15e40fca98, mloc=0x2b15e40fcd18) at InkAPI.cc:1886 > #7 0x00002b15759066fc in atscppapi::Request::~Request (this=0x2b15e0003318, > __in_chrg=<value optimized out>) at Request.cc:179 > #8 0x00002b1575902131 in ~TransactionState (this=0x2b15e0001320, > __in_chrg=<value optimized out>) at Transaction.cc:40 > #9 atscppapi::Transaction::~Transaction (this=0x2b15e0001320, > __in_chrg=<value optimized out>) at Transaction.cc:111 > #10 0x00002b1575901031 in (anonymous namespace)::handleTransactionEvents > (cont=<value optimized out>, event=<value optimized out>, > edata=0x2b15e8235460) at utils_internal.cc:87 > #11 0x000000000050d7f0 in INKContInternal::handle_event (this=0x20399b0, > event=60012, edata=0x2b15e8235460) at InkAPI.cc:1000 > #12 0x00000000004f8dee in Continuation::handleEvent (this=0x20399b0, > event=60012, data=0x2b15e8235460) at > ../iocore/eventsystem/I_Continuation.h:146 > #13 0x000000000050e037 in APIHook::invoke (this=0x202b170, event=60012, > edata=0x2b15e8235460) at InkAPI.cc:1219 > #14 0x00000000005cfb5d in HttpSM::state_api_callout (this=0x2b15e8235460, > event=0, data=0x0) at HttpSM.cc:1384 > #15 0x00000000005dba75 in HttpSM::do_api_callout_internal > (this=0x2b15e8235460) at HttpSM.cc:4896 > #16 0x00000000005e8b2a in HttpSM::do_api_callout (this=0x2b15e8235460) at > HttpSM.cc:437 > #17 0x00000000005e0ed0 in HttpSM::kill_this (this=0x2b15e8235460) at > HttpSM.cc:6615 > #18 0x00000000005d358f in HttpSM::main_handler (this=0x2b15e8235460, > event=2301, data=0x2b15e82360a0) at HttpSM.cc:2556 > #19 0x00000000004f8dee in Continuation::handleEvent (this=0x2b15e8235460, > event=2301, data=0x2b15e82360a0) at ../iocore/eventsystem/I_Continuation.h:146 > #20 0x000000000061cebf in HttpTunnel::main_handler (this=0x2b15e82360a0, > event=103, data=0x2b15fc01e2e8) at HttpTunnel.cc:1537 > #21 0x00000000004f8dee in Continuation::handleEvent (this=0x2b15e82360a0, > event=103, data=0x2b15fc01e2e8) at ../iocore/eventsystem/I_Continuation.h:146 > #22 0x000000000070eafa in CacheVC::calluser (this=0x2b15fc01e0a0, event=103) > at P_CacheInternal.h:642 > #23 0x00000000007174fb in CacheVC::openWriteMain (this=0x2b15fc01e0a0) at > CacheWrite.cc:1367 > #24 0x00000000004f8dee in Continuation::handleEvent (this=0x2b15fc01e0a0, > event=1, data=0x2b15e4041090) at ../iocore/eventsystem/I_Continuation.h:146 > #25 0x0000000000760bee in EThread::process_event (this=0x2b149ef4a010, > e=0x2b15e4041090, calling_code=1) at UnixEThread.cc:145 > #26 0x0000000000760dbc in EThread::execute (this=0x2b149ef4a010) at > UnixEThread.cc:196 > #27 0x0000000000760160 in spawn_thread_internal (a=0x1b4c790) at Thread.cc:88 > #28 0x00002b149dded9d1 in start_thread (arg=0x0) at pthread_create.c:301 > #29 0x0000003c1b2e88fd in clone () at > ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 > {code} > {code} > (gdb) bt > #0 0x0000003c1b232625 in raise (sig=6) at > ../nptl/sysdeps/unix/sysv/linux/raise.c:64 > #1 0x0000003c1b233e05 in abort () at abort.c:92 > #2 0x00002b0af272e018 in ink_die_die_die (retval=1) at ink_error.cc:43 > #3 0x00002b0af272e0e5 in ink_fatal_va(int, const char *, typedef > __va_list_tag __va_list_tag *) (return_code=1, message_format=0x2b0af273cd78 > "%s:%d: failed assert `%s`", ap=0x2b0af9d1b420) at ink_error.cc:65 > #4 0x00002b0af272e1ae in ink_fatal (return_code=1, > message_format=0x2b0af273cd78 "%s:%d: failed assert `%s`") at ink_error.cc:73 > #5 0x00002b0af272cd20 in _ink_assert (expression=0x771158 > "sdk_sanity_check_mbuffer(bufp) == TS_SUCCESS", file=0x770f1d "InkAPI.cc", > line=1869) at ink_assert.cc:37 > #6 0x000000000050c948 in _TSReleaseAssert (text=0x771158 > "sdk_sanity_check_mbuffer(bufp) == TS_SUCCESS", file=0x770f1d "InkAPI.cc", > line=1869) at InkAPI.cc:405 > #7 0x000000000050f78e in TSHandleMLocRelease (bufp=0x2b0c4400d068, > parent=0x0, mloc=0x2b0c942811c8) at InkAPI.cc:1869 > #8 0x00002b0bc98bf091 in atscppapi::Transaction::~Transaction > (this=0x2b0c64059580, __in_chrg=<value optimized out>) at Transaction.cc:105 > #9 0x00002b0bc98be031 in (anonymous namespace)::handleTransactionEvents > (cont=<value optimized out>, event=<value optimized out>, > edata=0x2b0c401c7480) at utils_internal.cc:87 > #10 0x000000000050d7f0 in INKContInternal::handle_event (this=0x1ba23c0, > event=60012, edata=0x2b0c401c7480) at InkAPI.cc:1000 > #11 0x00000000004f8dee in Continuation::handleEvent (this=0x1ba23c0, > event=60012, data=0x2b0c401c7480) at > ../iocore/eventsystem/I_Continuation.h:146 > #12 0x000000000050e037 in APIHook::invoke (this=0x1b93b80, event=60012, > edata=0x2b0c401c7480) at InkAPI.cc:1219 > #13 0x00000000005cfb5d in HttpSM::state_api_callout (this=0x2b0c401c7480, > event=0, data=0x0) at HttpSM.cc:1384 > #14 0x00000000005dba75 in HttpSM::do_api_callout_internal > (this=0x2b0c401c7480) at HttpSM.cc:4896 > #15 0x00000000005e8b2a in HttpSM::do_api_callout (this=0x2b0c401c7480) at > HttpSM.cc:437 > #16 0x00000000005e0ed0 in HttpSM::kill_this (this=0x2b0c401c7480) at > HttpSM.cc:6615 > #17 0x00000000005d358f in HttpSM::main_handler (this=0x2b0c401c7480, > event=2301, data=0x2b0c401c80c0) at HttpSM.cc:2556 > #18 0x00000000004f8dee in Continuation::handleEvent (this=0x2b0c401c7480, > event=2301, data=0x2b0c401c80c0) at ../iocore/eventsystem/I_Continuation.h:146 > #19 0x000000000061cebf in HttpTunnel::main_handler (this=0x2b0c401c80c0, > event=103, data=0x2b0c2800b898) at HttpTunnel.cc:1537 > #20 0x00000000004f8dee in Continuation::handleEvent (this=0x2b0c401c80c0, > event=103, data=0x2b0c2800b898) at ../iocore/eventsystem/I_Continuation.h:146 > #21 0x000000000073f958 in write_signal_and_update (event=103, > vc=0x2b0c2800b720) at UnixNetVConnection.cc:174 > #22 0x000000000073fb3e in write_signal_done (event=103, nh=0x2b0af8508cb0, > vc=0x2b0c2800b720) at UnixNetVConnection.cc:216 > #23 0x0000000000740d12 in write_to_net_io (nh=0x2b0af8508cb0, > vc=0x2b0c2800b720, thread=0x2b0af8505010) at UnixNetVConnection.cc:559 > #24 0x0000000000740579 in write_to_net (nh=0x2b0af8508cb0, vc=0x2b0c2800b720, > thread=0x2b0af8505010) at UnixNetVConnection.cc:414 > #25 0x0000000000739536 in NetHandler::mainNetEvent (this=0x2b0af8508cb0, > event=5, e=0x1701b70) at UnixNet.cc:534 > #26 0x00000000004f8dee in Continuation::handleEvent (this=0x2b0af8508cb0, > event=5, data=0x1701b70) at ../iocore/eventsystem/I_Continuation.h:146 > #27 0x0000000000760bee in EThread::process_event (this=0x2b0af8505010, > e=0x1701b70, calling_code=5) at UnixEThread.cc:145 > #28 0x00000000007610f8 in EThread::execute (this=0x2b0af8505010) at > UnixEThread.cc:269 > #29 0x0000000000760160 in spawn_thread_internal (a=0x162e500) at Thread.cc:88 > #30 0x00002b0af30049d1 in start_thread (arg=0x2b0af9d1c700) at > pthread_create.c:301 > #31 0x0000003c1b2e88fd in clone () at > ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 > {code} > Below are debug logs from load tests that prove the problem: > {code} > Step 1: Transaction 0x2b8430214d70 allocates > cached_request_hdr_loc=0x2b846c026a58 > [Jan 28 18:28:19.652] Server {0x2b82ecc0b700} DIAG: > (atscppapi.utils_internal.cc:176) [getTransaction()] existing transaction > object at 0x2b8444014450 for ats pointer 0x2b8430214d70, event 60005 > [Jan 28 18:28:19.652] Server {0x2b82ecc0b700} DIAG: > (atscppapi.utils_internal.cc:53) [handleTransactionEvents()] Got event 60005 > on continuation 0x2d58950 for transaction (ats pointer 0x2b8430214d70, object > 0x2b8444014450) > [Jan 28 18:28:19.652] Server {0x2b82ecc0b700} DEBUG: (atscppapi) > TSHttpTxnCachedReqGet txnp 0x2b8430214d70, new handle bufp 0x2b8434020878, > heap 0x2b846c0269d0, obj 0x2b846c026a58, magic 0xabcdfeed > [Jan 28 18:28:19.652] Server {0x2b82ecc0b700} DIAG: > (atscppapi.Transaction.cc:494) [initCachedRequest()] Initializing cached > request, txn=0x2b8430214d70, state_=0x2b8444002130, > client_request_hdr_buf_=0x2b8430215080, > client_request_hdr_loc_=0x2b8468263f18, client_response_hdr_buf_=(nil), > client_response_hdr_loc_=(nil), server_request_hdr_buf_=(nil), > server_request_hdr_loc_=(nil), server_response_hdr_buf_=(nil), > server_response_hdr_loc_=(nil), cached_request_hdr_buf_=0x2b8434020878, > cached_request_hdr_loc_=0x2b846c026a58 > Step 2: Transaction 0x2b84301f8a30 allocates the same as > client_response_hdr_buf_=0x2b84301f8d80, while it's handle is still in the > previous transaction 0x2b8430214d70 in CPP API > [Jan 28 18:28:19.667] Server {0x2b82e7fdd700} DIAG: > (atscppapi.utils_internal.cc:53) [handleTransactionEvents()] Got event 60007 > on continuation 0x2d58950 for transaction (ats pointer 0x2b84301f8a30, object > 0x2b830c00a750) > [Jan 28 18:28:19.667] Server {0x2b82e7fdd700} DIAG: > (atscppapi.Transaction.cc:485) [initClientResponse()] Initializing client > response, txn=0x2b84301f8a30, state_=0x2b830c0017d0, > client_request_hdr_buf_=0x2b84301f8d40, > client_request_hdr_loc_=0x2b84641a73e8, > client_response_hdr_buf_=0x2b84301f8d80, > client_response_hdr_loc_=0x2b846c026a58, server_request_hdr_buf_=(nil), > server_request_hdr_loc_=(nil), server_response_hdr_buf_=(nil), > server_response_hdr_loc_=(nil), cached_request_hdr_buf_=0x2b8434010478, > cached_request_hdr_loc_=0x2b84181666e8 > Step 3: Transaction 0x2b8430214d70 now gets TXN_CLOSE event and tries to call > MLocRelease on 0x2b846c026a58's url handle which results in an assert failure > in MLocRelease API > [Jan 28 18:28:19.678] Server {0x2b82ecc0b700} DIAG: > (atscppapi.utils_internal.cc:53) [handleTransactionEvents()] Got event 60012 > on continuation 0x2d58950 for transaction (ats pointer 0x2b8430214d70, object > 0x2b8444014450) > [Jan 28 18:28:19.678] Server {0x2b82ecc0b700} DIAG: > (atscppapi.Transaction.cc:88) [~Transaction()] Transaction > tshttptxn=0x2b8430214d70 destroying Transaction object 0x2b8444014450, > state_=0x2b8444002130, state_->client_request_hdr_buf_=0x2b8430215080, > state_->client_request_hdr_loc_=0x2b8468263f18, > state_->server_request_hdr_buf_=(nil), > state_->server_response_hdr_buf_=(nil), > state_->client_response_hdr_buf_=0x2b84302150c0, > state_->cached_request_hdr_buf_=0x2b8434020878, > state_->cached_response_hdr_buf_=0x2b8434020880 > [Jan 28 18:28:19.678] Server {0x2b82ecc0b700} DEBUG: (atscppapi) > TSHandleMLocRelease bufp=0x2b8430215080, parent=(nil), mloc=0x2b8468263f18, > obj type 3 > [Jan 28 18:28:19.678] Server {0x2b82ecc0b700} DIAG: > (atscppapi.Transaction.cc:100) [~Transaction()] Releasing client response > [Jan 28 18:28:19.678] Server {0x2b82ecc0b700} DIAG: > (atscppapi.Transaction.cc:104) [~Transaction()] Releasing cached request > [Jan 28 18:28:19.678] Server {0x2b82ecc0b700} DIAG: > (atscppapi.Transaction.cc:108) [~Transaction()] Releasing cached response > [Jan 28 18:28:19.678] Server {0x2b82ecc0b700} DIAG: > (atscppapi.Request.cc:181) [~Request()] Destroying request object with no > destroy_buf 0x2b84440021d8 on hdr_buf=0x2b8434020878, hdr_loc=0x2b846c026a58, > url_loc=0x2b846c026cd8 > [Jan 28 18:28:19.678] Server {0x2b82ecc0b700} DEBUG: (atscppapi) > TSHandleMLocRelease bufp=0x2b8434020878, parent=0x2b846c026a58, > mloc=0x2b846c026cd8, obj type 5 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)