[ https://issues.apache.org/jira/browse/KUDU-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mahesh Reddy updated KUDU-3532:
-------------------------------
    Fix Version/s: 1.17.0
                       (was: 1.18.0)

> Unable to place replicas using range aware logic with multiple locations
> ------------------------------------------------------------------------
>
>                 Key: KUDU-3532
>                 URL: https://issues.apache.org/jira/browse/KUDU-3532
>             Project: Kudu
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 1.17.0
>            Reporter: Mahesh Reddy
>            Assignee: Mahesh Reddy
>            Priority: Major
>             Fix For: 1.17.0
>
>
> When multiple locations exist, it's possible that a std::length_error will
> be thrown when ReservoirSample() is called within
> PlacementPolicy::SelectReplica().
> Look at this file for reference:
> https://github.com/apache/kudu/blob/master/src/kudu/master/placement_policy.cc
> There's an error in the logic of the code: it assumes an improper relation
> between two sets, one set being the tablet servers to choose from and the
> other set being the tablet servers not to choose from. This error manifests
> itself as an implicit conversion from unsigned long to int. If "choices_size"
> is negative, converting it to the unsigned size used to reserve a vector
> yields a value larger than the max size allowed, and an error will be thrown
> within ReservoirSample().
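
The failure mode described above can be reproduced in isolation. The sketch below is not the Kudu code: UncheckedChoices, ReserveThrows and CountEligible are hypothetical names, and integer ids stand in for tablet servers. It assumes only what the report states, namely that a size is computed by subtracting two unsigned counts, stored in an int, and later used to reserve a vector.

```cpp
#include <cstddef>
#include <set>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the size computation described in the report:
// the excluded set is assumed to be no larger than the candidate set.
int UncheckedChoices(std::size_t candidates, std::size_t excluded) {
  // If excluded > candidates, the unsigned subtraction wraps around, and the
  // conversion to int yields a negative value on common platforms.
  return static_cast<int>(candidates - excluded);
}

// Hypothetical sampler prologue: reserve room for k results, as a
// reservoir-sampling helper might do before filling its output.
bool ReserveThrows(int k) {
  std::vector<int> result;
  try {
    // A negative k becomes a huge unsigned value, exceeding max_size().
    result.reserve(static_cast<std::size_t>(k));
  } catch (const std::length_error&) {
    return true;
  }
  return false;
}

// One possible guard (an assumption, not necessarily the committed fix):
// count only candidates that are not excluded, so the result cannot
// underflow even when the excluded set is not a subset of the candidates.
std::size_t CountEligible(const std::set<int>& candidates,
                          const std::set<int>& excluded) {
  std::size_t eligible = 0;
  for (int id : candidates) {
    if (excluded.count(id) == 0) {
      ++eligible;
    }
  }
  return eligible;
}
```

With two candidates and five excluded servers, UncheckedChoices(2, 5) goes negative and ReserveThrows() reports the std::length_error, whereas CountEligible() stays within the range from zero to candidates.size() regardless of how the two sets overlap.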
> Below is a stack trace from a master crash due to this bug:
> SIGABRT (@0x1da00007b60) received by PID 31584 (TID 0x7fdf9644f700) from PID 31584; stack trace: ***
>     @           0xe48496 google::(anonymous namespace)::FailureSignalHandler()
>     @     0x7fdfb9a90630 (unknown)
>     @     0x7fdfb7c95387 __GI_raise
>     @     0x7fdfb7c96a78 __GI_abort
>     @     0x7fdfb85a5a95 __gnu_cxx::__verbose_terminate_handler()
>     @     0x7fdfb85a3a06 (unknown)
>     @     0x7fdfb85a3a33 std::terminate()
>     @     0x7fdfb85a3c53 __cxa_throw
>     @     0x7fdfb85f8a67 std::__throw_length_error()
>     @           0xe01fcf kudu::ReservoirSample<>()
>     @           0xdfce0f kudu::master::PlacementPolicy::SelectReplica()
>     @           0xdff386 kudu::master::PlacementPolicy::PlaceExtraTabletReplica()
>     @           0xd873bf kudu::master::AsyncAddReplicaTask::SendRequest()
>     @           0xd7912c kudu::master::RetryingTSRpcTask::Run()
>     @           0xda5412 kudu::master::CatalogManager::ProcessTabletReport()
>     @           0xdf7018 kudu::master::MasterServiceImpl::TSHeartbeat()
>     @          0x2fea455 kudu::rpc::GeneratedServiceIf::Handle()
>     @          0x2feb44a kudu::rpc::ServicePool::RunThread()
>     @          0x31d2e1e kudu::Thread::SuperviseThread()
>     @     0x7fdfb9a88ea5 start_thread
>     @     0x7fdfb7d5db0d __clone

--
This message was sent by Atlassian Jira
(v8.20.10#820010)