Alexey Serbin created KUDU-3254:
-----------------------------------

             Summary: Crash in Kudu C++ client when working with stale scan 
tokens containing tablet location info
                 Key: KUDU-3254
                 URL: https://issues.apache.org/jira/browse/KUDU-3254
             Project: Kudu
          Issue Type: Bug
          Components: client
    Affects Versions: 1.13.0
            Reporter: Alexey Serbin
             Fix For: 1.14.0


With KUDU-1802 implemented, the meta-cache in the Kudu C++ client might crash 
when processing a scan token that carries tablet location information, in 
scenarios like the one below:

# Scan tokens were generated for a table with multiple ranges (e.g., with two 
ranges: [-100, 0), [0, 100)).
# The first range was dropped (e.g., range [-100, 0)).
# A client was fed the set of tokens generated at step 1 to read from the table 
(the set now contains one stale token corresponding to the dropped range).
# The same client instance was used to write into the table.
# The same client instance was fed the original set of tokens once more to read 
from the table again.

The client would crash at step 5 of the sequence above.

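The stack trace on crash might look like this (captured on macOS). The abort 
comes from a FindOrDie() lookup in MetaCache::ProcessGetTableLocationsResponse(), 
reached while deserializing a scan token into a scanner: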
{noformat}
      * frame #0: 0x00007fff7035833a libsystem_kernel.dylib`__pthread_kill + 10
        frame #1: 0x00007fff70414e60 libsystem_pthread.dylib`pthread_kill + 430
        frame #2: 0x00007fff702df808 libsystem_c.dylib`abort + 120
        frame #3: 0x000000010ca1a259 libglog.0.dylib`google::logging_fail() at logging.cc:1474:3
        frame #4: 0x000000010ca19121 libglog.0.dylib`google::LogMessage::SendToLog() [inlined] google::LogMessage::Fail() at logging.cc:1488:3
        frame #5: 0x000000010ca1911b libglog.0.dylib`google::LogMessage::SendToLog() at logging.cc:1442
        frame #6: 0x000000010ca19815 libglog.0.dylib`google::LogMessage::Flush() at logging.cc:1311:5
        frame #7: 0x000000010ca1d76f libglog.0.dylib`google::LogMessageFatal::~LogMessageFatal() at logging.cc:2023:5
        frame #8: 0x000000010ca1a5f9 libglog.0.dylib`google::LogMessageFatal::~LogMessageFatal() at logging.cc:2022:37
        frame #9: 0x0000000103e365e3 libkudu_client.dylib`std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, kudu::client::internal::MetaCacheEntry, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, kudu::client::internal::MetaCacheEntry> > >::mapped_type& FindOrDie<std::__1::map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, kudu::client::internal::MetaCacheEntry, std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, kudu::client::internal::MetaCacheEntry> > > >() at map-util.h:109:3
        frame #10: 0x0000000103e34cbb libkudu_client.dylib`kudu::client::internal::MetaCache::ProcessGetTableLocationsResponse() at meta_cache.cc:943:23
        frame #11: 0x0000000103e86166 libkudu_client.dylib`kudu::client::KuduScanToken::Data::PBIntoScanner() at scan_token-internal.cc:192:35
        frame #12: 0x0000000103e88051 libkudu_client.dylib`kudu::client::KuduScanToken::Data::DeserializeIntoScanner() at scan_token-internal.cc:111:10
        frame #13: 0x0000000103d55d3c libkudu_client.dylib`kudu::client::KuduScanToken::DeserializeIntoScanner() at client.cc:1879:10
{noformat}
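
Frame #9 in the trace is a FindOrDie() lookup on a std::map keyed by string. 
The sketch below is a simplified, self-contained model of that failure mode, 
not Kudu's code and not the actual fix: it contrasts a die-on-miss lookup with 
a lookup that reports a missing key to the caller. The names CacheEntry, 
TabletCache, FindOrDieSketch, FindOrNullSketch, and TokenRangeIsCached are 
hypothetical, and keying the map by a range's lower bound is an assumption 
made for illustration only.

```cpp
#include <cstdlib>
#include <iostream>
#include <map>
#include <string>

// Hypothetical stand-in for a cached tablet entry (not Kudu's MetaCacheEntry).
struct CacheEntry {
  std::string tablet_id;
};

// Hypothetical cache: lower bound of a range -> cached entry (an assumption).
using TabletCache = std::map<std::string, CacheEntry>;

// Die-on-miss lookup in the spirit of FindOrDie(): aborts if the key is absent.
const CacheEntry& FindOrDieSketch(const TabletCache& cache,
                                  const std::string& key) {
  auto it = cache.find(key);
  if (it == cache.end()) {
    std::cerr << "key not found: " << key << std::endl;
    std::abort();
  }
  return it->second;
}

// Tolerant lookup: returns nullptr on a miss so the caller can handle it.
const CacheEntry* FindOrNullSketch(const TabletCache& cache,
                                   const std::string& key) {
  auto it = cache.find(key);
  return it == cache.end() ? nullptr : &it->second;
}

// Returns true if the range a token refers to is still present in the cache;
// a stale token for a dropped range yields false instead of aborting.
bool TokenRangeIsCached(const TabletCache& cache,
                        const std::string& token_lower_bound) {
  return FindOrNullSketch(cache, token_lower_bound) != nullptr;
}
```

In this model, a cache that holds only the surviving range (key "0") makes 
FindOrDieSketch(cache, "-100") abort the process, much like the trace above, 
while TokenRangeIsCached(cache, "-100") simply returns false.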

The issue is fixed in Kudu 1.14 with [this 
changelist|https://github.com/apache/kudu/commit/2a558768f8aa00068e72ccd1327081f07ba46b03].



