Alexey Serbin created KUDU-3688:
-----------------------------------

             Summary: Race between CatalogManager::InitCertAuthorityWith() and ServerNegotiation::HandleTlsHandshake() in follower Kudu master
                 Key: KUDU-3688
                 URL: https://issues.apache.org/jira/browse/KUDU-3688
             Project: Kudu
          Issue Type: Bug
          Components: master
            Reporter: Alexey Serbin
         Attachments: tsan-reports.txt.xz

With the blanket suppression of TSAN warnings for everything called from 
libcrypto.so removed, TSAN reports data races between in-progress RPC 
connection negotiations and the background thread that runs 
{{CatalogManager::PrepareFollowerCaInfo()}} in a follower Kudu master.

The problem boils down to {{TlsContext::AdoptSignedCert()}} invoking 
OpenSSL's {{SSL_CTX_use_certificate()}} while one of the threads in the 
connection negotiation pool is concurrently performing a TLS handshake.
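
For illustration only, the sketch below models the access pattern described above: one writer replaces a certificate while several readers use it. It uses neither OpenSSL nor Kudu code, and it is not the fix for this issue. All names ({{FakeTlsContext}}, {{AdoptCert}}, {{Snapshot}}, {{RunDemo}}) are hypothetical. It shows one common way to keep a writer from freeing data that a concurrent reader still uses: readers take a reference-counted snapshot under a short lock, so the old object is freed only after the last reader lets go of it.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a TLS context. NOT Kudu's TlsContext, no OpenSSL.
class FakeTlsContext {
 public:
  FakeTlsContext() : cert_(std::make_shared<std::string>("self-signed")) {}

  // Writer path (think: adopting a CA-signed certificate).
  void AdoptCert(std::string cert) {
    std::shared_ptr<const std::string> fresh =
        std::make_shared<std::string>(std::move(cert));
    std::lock_guard<std::mutex> l(lock_);
    cert_.swap(fresh);
    // 'fresh' now refers to the old certificate. The old string is freed only
    // when the last reader snapshot that references it goes away.
  }

  // Reader path (think: a handshake that needs the current certificate).
  std::shared_ptr<const std::string> Snapshot() const {
    std::lock_guard<std::mutex> l(lock_);
    assert(cert_ != nullptr);
    return cert_;
  }

 private:
  mutable std::mutex lock_;
  std::shared_ptr<const std::string> cert_;
};

// Runs several readers concurrently with one writer. Returns true if every
// reader saw a complete certificate value and the new one is installed at the
// end.
bool RunDemo() {
  FakeTlsContext ctx;
  std::atomic<int> bad_reads(0);
  std::vector<std::thread> readers;
  for (int i = 0; i < 4; ++i) {
    readers.emplace_back([&ctx, &bad_reads] {
      for (int j = 0; j < 1000; ++j) {
        const std::shared_ptr<const std::string> cert = ctx.Snapshot();
        if (*cert != "self-signed" && *cert != "ca-signed") ++bad_reads;
      }
    });
  }
  std::thread writer([&ctx] { ctx.AdoptCert("ca-signed"); });
  writer.join();
  for (std::thread& t : readers) t.join();
  return bad_reads.load() == 0 && *ctx.Snapshot() == "ca-signed";
}
```

Calling {{RunDemo()}} from a test built with ThreadSanitizer should produce no race report for this toy pattern. Whether a similar approach suits the actual OpenSSL-backed context is a separate question that this sketch does not answer.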

Below is a snippet from the attached TSAN log:
{noformat}
WARNING: ThreadSanitizer: data race (pid=884526)
  Write of size 8 at 0x7b1c000033a0 by thread T84 (mutexes: read M3638, write M3818):
    #0 free /root/Projects/kudu/thirdparty/src/llvm-11.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:708:3 (kudu+0x468f30)
    #1 ossl_asn1_string_embed_free /usr/src/debug/openssl-3.2.2-6.el9_5.1.x86_64/crypto/asn1/asn1_lib.c:367:9 (libcrypto.so.3+0xb937c)
    #2 ASN1_STRING_free /usr/src/debug/openssl-3.2.2-6.el9_5.1.x86_64/crypto/asn1/asn1_lib.c:376:5 (libcrypto.so.3+0xb937c)
    #3 ASN1_STRING_free /usr/src/debug/openssl-3.2.2-6.el9_5.1.x86_64/crypto/asn1/asn1_lib.c:372:6 (libcrypto.so.3+0xb937c)
    #4 kudu::master::CatalogManager::InitCertAuthorityWith(std::__1::unique_ptr<kudu::security::PrivateKey, std::__1::default_delete<kudu::security::PrivateKey> >, std::__1::unique_ptr<kudu::security::Cert, std::__1::default_delete<kudu::security::Cert> >) /root/Projects/kudu/src/kudu/master/catalog_manager.cc:1249:5 (libmaster.so+0x323366)
    #5 kudu::master::CatalogManager::PrepareFollowerCaInfo()::$_13::operator()() const /root/Projects/kudu/src/kudu/master/catalog_manager.cc:1573:12 (libmaster.so+0x35d1fb)
    #6 kudu::Status kudu::Status::AndThen<kudu::master::CatalogManager::PrepareFollowerCaInfo()::$_13>(kudu::master::CatalogManager::PrepareFollowerCaInfo()::$_13) /root/Projects/kudu/src/kudu/util/status.h:241:14 (libmaster.so+0x325e4d)
    #7 kudu::master::CatalogManager::PrepareFollowerCaInfo() /root/Projects/kudu/src/kudu/master/catalog_manager.cc:1572:49 (libmaster.so+0x325c59)
    #8 kudu::master::CatalogManager::PrepareFollower(kudu::MonoTime*) /root/Projects/kudu/src/kudu/master/catalog_manager.cc:1618:5 (libmaster.so+0x32091d)
    #9 kudu::master::CatalogManagerBgTasks::Run() /root/Projects/kudu/src/kudu/master/catalog_manager.cc:873:38 (libmaster.so+0x31e3d1)
    #10 kudu::master::CatalogManagerBgTasks::Init()::$_0::operator()() const /root/Projects/kudu/src/kudu/master/catalog_manager.cc:773:3 (libmaster.so+0x352211)
    ...

  Previous read of size 8 at 0x7b1c000033a0 by thread T89:
    #0 memcpy sanitizer_common/sanitizer_common_interceptors.inc:808:5 (kudu+0x48b836)
    #1 asn1_ex_i2c /usr/include/bits/string_fortified.h:29:10 (libcrypto.so.3+0xc6eca)
    #2 kudu::rpc::ServerNegotiation::HandleTlsHandshake(kudu::rpc::NegotiatePB const&) /root/Projects/kudu/src/kudu/rpc/server_negotiation.cc:633:35 (libkrpc.so+0x1e92d1)
    #3 kudu::rpc::ServerNegotiation::Negotiate() /root/Projects/kudu/src/kudu/rpc/server_negotiation.cc:244:18 (libkrpc.so+0x1e745a)
    #4 kudu::rpc::DoServerNegotiation(kudu::rpc::Connection*, kudu::TriStateFlag, kudu::TriStateFlag, bool, kudu::MonoTime const&) /root/Projects/kudu/src/kudu/rpc/negotiation.cc:293:3 (libkrpc.so+0x188180)
    #5 kudu::rpc::Negotiation::RunNegotiation(scoped_refptr<kudu::rpc::Connection> const&, kudu::TriStateFlag, kudu::TriStateFlag, bool, kudu::MonoTime) /root/Projects/kudu/src/kudu/rpc/negotiation.cc:315:9 (libkrpc.so+0x1879b5)
    #6 kudu::rpc::ReactorThread::StartConnectionNegotiation(scoped_refptr<kudu::rpc::Connection> const&)::$_1::operator()() const /root/Projects/kudu/src/kudu/rpc/reactor.cc:631:3 (libkrpc.so+0x1a6edc)
    ...
{noformat}

The log with several instances of the TSAN warning is attached.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
