[ 
https://issues.apache.org/jira/browse/KUDU-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297107#comment-17297107
 ] 

ASF subversion and git services commented on KUDU-3254:
-------------------------------------------------------

Commit 7c8dca60d15b560017ef7e726a379788727502ba in kudu's branch 
refs/heads/branch-1.13.x from Alexey Serbin
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=7c8dca6 ]

KUDU-3254 fix bug in meta-cache exposed by KUDU-1802

This patch fixes an issue resulting in a SIGABRT crash in the Kudu client
when working with stale scan tokens that contain tablet location
information (see KUDU-1802) for a table whose range partition was
dropped.  The patch also adds a test scenario reproducing the crash;
the scenario now passes and can catch future regressions.

This patch is a follow-up to d23ee5d38ddc4317f431dd65df0c825c00cc968a.

Before the change in src/kudu/client/meta_cache.cc was back-ported from
Kudu 1.14 as part of this fix, the scenario crashed with SIGABRT,
producing a stack trace similar to the following (this one was captured
on macOS):

  * frame #0: 0x00007fff7035833a libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff70414e60 libsystem_pthread.dylib`pthread_kill + 430
    frame #2: 0x00007fff702df808 libsystem_c.dylib`abort + 120
    frame #3: 0x000000010ca1a259 libglog.0.dylib`google::logging_fail() at logging.cc:1474:3
    frame #4: 0x000000010ca19121 libglog.0.dylib`google::LogMessage::SendToLog() [inlined] google::LogMessage::Fail() at logging.cc:1488:3
    frame #5: 0x000000010ca1911b libglog.0.dylib`google::LogMessage::SendToLog() at logging.cc:1442
    frame #6: 0x000000010ca19815 libglog.0.dylib`google::LogMessage::Flush() at logging.cc:1311:5
    frame #7: 0x000000010ca1d76f libglog.0.dylib`google::LogMessageFatal::~LogMessageFatal() at logging.cc:2023:5
    frame #8: 0x000000010ca1a5f9 libglog.0.dylib`google::LogMessageFatal::~LogMessageFatal() at logging.cc:2022:37
    frame #9: 0x0000000103e365e3 libkudu_client.dylib`std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, kudu::client::internal::MetaCacheEntry, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, kudu::client::internal::MetaCacheEntry> > >::mapped_type& FindOrDie<std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, kudu::client::internal::MetaCacheEntry, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, kudu::client::internal::MetaCacheEntry> > > >() at map-util.h:109:3
    frame #10: 0x0000000103e34cbb libkudu_client.dylib`kudu::client::internal::MetaCache::ProcessGetTableLocationsResponse() at meta_cache.cc:943:23
    frame #11: 0x0000000103e86166 libkudu_client.dylib`kudu::client::KuduScanToken::Data::PBIntoScanner() at scan_token-internal.cc:192:35
    frame #12: 0x0000000103e88051 libkudu_client.dylib`kudu::client::KuduScanToken::Data::DeserializeIntoScanner() at scan_token-internal.cc:111:10
    frame #13: 0x0000000103d55d3c libkudu_client.dylib`kudu::client::KuduScanToken::DeserializeIntoScanner() at client.cc:1879:10

Change-Id: I5b8370290c13b1e496f461ed5bc2e0193bdf4b19
Reviewed-on: http://gerrit.cloudera.org:8080/17152
Tested-by: Alexey Serbin <aser...@cloudera.com>
Reviewed-by: Andrew Wong <aw...@cloudera.com>
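
For readers unfamiliar with the failure mode: frames #9 and #10 of the
trace show a FindOrDie() lookup on a std::map aborting because the
requested key was not present. The following is a minimal sketch of that
general pattern and of a non-aborting alternative. It is not the actual
Kudu patch. CacheEntry, FindOrAbort, FindOrNull and ResolveLiveTablets
are hypothetical local stand-ins written for this illustration, not the
helpers from map-util.h or the real meta-cache types.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a cached tablet-location entry.
struct CacheEntry {
  std::string tablet_id;
};

// Hypothetical cache keyed by the start of a range partition.
using EntryMap = std::map<std::string, CacheEntry>;

// Aborting lookup: a missing key is treated as a fatal error.
// Stale input (e.g. a key for a dropped range) terminates the process.
const CacheEntry& FindOrAbort(const EntryMap& m, const std::string& key) {
  auto it = m.find(key);
  if (it == m.end()) {
    std::abort();
  }
  return it->second;
}

// Non-aborting lookup: returns nullptr when the key is absent, so the
// caller decides how to handle stale input.
const CacheEntry* FindOrNull(const EntryMap& m, const std::string& key) {
  auto it = m.find(key);
  return it == m.end() ? nullptr : &it->second;
}

// Resolves the keys carried by a possibly stale token against the cache,
// skipping keys whose range is no longer present.
std::vector<std::string> ResolveLiveTablets(
    const EntryMap& cache, const std::vector<std::string>& token_keys) {
  std::vector<std::string> live;
  for (const auto& key : token_keys) {
    const CacheEntry* entry = FindOrNull(cache, key);
    if (entry == nullptr) {
      continue;  // Stale key: the range was dropped after the token was made.
    }
    live.push_back(entry->tablet_id);
  }
  return live;
}
```

With a cache that holds only the surviving range, calling
ResolveLiveTablets() with keys for both the dropped and the surviving
range returns just the surviving tablet, whereas FindOrAbort() on the
dropped key would terminate the process.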


> Crash in Kudu C++ client when working with stale scan tokens containing 
> tablet location info
> --------------------------------------------------------------------------------------------
>
>                 Key: KUDU-3254
>                 URL: https://issues.apache.org/jira/browse/KUDU-3254
>             Project: Kudu
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 1.13.0
>            Reporter: Alexey Serbin
>            Assignee: Alexey Serbin
>            Priority: Major
>             Fix For: 1.14.0
>
>
> With KUDU-1802 implemented, the meta-cache in the Kudu C++ client might crash 
> when using a scan token that carries tablet location information, in 
> scenarios like the one below:
> # Scan tokens were generated for a table with multiple ranges (e.g., with two 
> ranges: [-100, 0), [0, 100)).
> # The first range was dropped (e.g., range [-100, 0) is dropped).
> # A client was fed the set of tokens generated at step 1 to read from the 
> table (now with one stale token corresponding to the dropped range).
> # The same client instance was used to write into the table.
> # The same client instance was fed the original set of tokens once more to 
> read from the table again.
> The client would crash at step 5 of the sequence above.
> The stack trace on crash might look like this (captured on macOS):
> {noformat}
>       * frame #0: 0x00007fff7035833a libsystem_kernel.dylib`__pthread_kill + 10
>         frame #1: 0x00007fff70414e60 libsystem_pthread.dylib`pthread_kill + 430
>         frame #2: 0x00007fff702df808 libsystem_c.dylib`abort + 120
>         frame #3: 0x000000010ca1a259 libglog.0.dylib`google::logging_fail() at logging.cc:1474:3
>         frame #4: 0x000000010ca19121 libglog.0.dylib`google::LogMessage::SendToLog() [inlined] google::LogMessage::Fail() at logging.cc:1488:3
>         frame #5: 0x000000010ca1911b libglog.0.dylib`google::LogMessage::SendToLog() at logging.cc:1442
>         frame #6: 0x000000010ca19815 libglog.0.dylib`google::LogMessage::Flush() at logging.cc:1311:5
>         frame #7: 0x000000010ca1d76f libglog.0.dylib`google::LogMessageFatal::~LogMessageFatal() at logging.cc:2023:5
>         frame #8: 0x000000010ca1a5f9 libglog.0.dylib`google::LogMessageFatal::~LogMessageFatal() at logging.cc:2022:37
>         frame #9: 0x0000000103e365e3 libkudu_client.dylib`std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, kudu::client::internal::MetaCacheEntry, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, kudu::client::internal::MetaCacheEntry> > >::mapped_type& FindOrDie<std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, kudu::client::internal::MetaCacheEntry, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, kudu::client::internal::MetaCacheEntry> > > >() at map-util.h:109:3
>         frame #10: 0x0000000103e34cbb libkudu_client.dylib`kudu::client::internal::MetaCache::ProcessGetTableLocationsResponse() at meta_cache.cc:943:23
>         frame #11: 0x0000000103e86166 libkudu_client.dylib`kudu::client::KuduScanToken::Data::PBIntoScanner() at scan_token-internal.cc:192:35
>         frame #12: 0x0000000103e88051 libkudu_client.dylib`kudu::client::KuduScanToken::Data::DeserializeIntoScanner() at scan_token-internal.cc:111:10
>         frame #13: 0x0000000103d55d3c libkudu_client.dylib`kudu::client::KuduScanToken::DeserializeIntoScanner() at client.cc:1879:10
> {noformat}
> The issue is fixed in Kudu 1.14 with [this 
> changelist|https://github.com/apache/kudu/commit/2a558768f8aa00068e72ccd1327081f07ba46b03].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
