Mostafa Mokhtar created IMPALA-6425:
---------------------------------------

             Summary: Change Mempool memory allocation size to be <1MB to avoid allocating from CentralFreeList
                 Key: IMPALA-6425
                 URL: https://issues.apache.org/jira/browse/IMPALA-6425
             Project: IMPALA
          Issue Type: Improvement
            Reporter: Mostafa Mokhtar
            Assignee: Tim Armstrong

While [~tlipcon] was investigating KRPC contention he noticed that MemPool::Allocate is doing 1MB allocations, which is somewhat of an anti-pattern with tcmalloc.

During the tests MemPool was doing several thousand 1MB allocs per second, and those have to do a full scan of the tcmalloc span linked list, which is very slow and only gets slower. An allocation of 1040384 bytes, on the other hand, is constant time.

It is not clear whether a power-of-2 allocation size would help; it is worth experimenting with 512KB and 1040384 bytes.

{code}
/// The maximum size of chunk that should be allocated. Allocations larger than this
/// size will get their own individual chunk.
static const int MAX_CHUNK_SIZE = 8192*127;
{code}

{code}
#0  0x0000000002097407 in base::internal::SpinLockDelay(int volatile*, int, int) ()
#1  0x00000000020e2049 in SpinLock::SlowLock() ()
#2  0x0000000002124348 in tcmalloc::CentralFreeList::Populate() ()
#3  0x0000000002124458 in tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**) ()
#4  0x00000000021244e8 in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) ()
#5  0x0000000002131ee5 in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned int, int, void* (*)(unsigned long)) ()
#6  0x0000000000b2879a in impala::MemPool::FindChunk(long, bool) ()
#7  0x0000000000b364f6 in impala::MemPool::Allocate(long) ()
#8  0x0000000000b36674 in impala::FreePool::Allocate(long) ()
#9  0x0000000000b353db in impala::RowBatch::Deserialize(kudu::Slice const&, kudu::Slice const&, long, bool, impala::FreePool*) ()
#10 0x0000000000b35795 in impala::RowBatch::RowBatch(impala::RowDescriptor const*, impala::RowBatchHeaderPB const&, kudu::Slice const&, kudu::Slice const&, impala::FreePool*) ()
#11 0x0000000000b1644f in impala::KrpcDataStreamRecvr::SenderQueue::AddBatchWork(long, impala::RowBatchHeaderPB const&, kudu::Slice const&, kudu::Slice const&, boost::unique_lock<impala::SpinLock>*) ()
#12 0x0000000000b19135 in impala::KrpcDataStreamRecvr::SenderQueue::AddBatch(impala::TransmitDataRequestPB const*, kudu::rpc::RpcContext*) ()
#13 0x0000000000b0ee30 in impala::KrpcDataStreamMgr::AddData(impala::TransmitDataRequestPB const*, kudu::rpc::RpcContext*) ()
#14 0x0000000001187035 in kudu::rpc::GeneratedServiceIf::Handle(kudu::rpc::InboundCall*) ()
#15 0x00000000011bc1cd in impala::ImpalaServicePool::RunThread(long) ()
{code}

It also appears that the thread above was a victim of the thread below; even so, allocations <1MB should make MemPool::Allocate contend less for the CentralFreeList lock.

{code}
#0  0x0000003173ae5407 in madvise () from /lib64/libc.so.6
#1  0x0000000002131cca in TCMalloc_SystemRelease(void*, unsigned long) ()
#2  0x000000000212f26a in tcmalloc::PageHeap::DecommitSpan(tcmalloc::Span*) ()
#3  0x000000000212f505 in tcmalloc::PageHeap::MergeIntoFreeList(tcmalloc::Span*) ()
#4  0x000000000212f864 in tcmalloc::PageHeap::Delete(tcmalloc::Span*) ()
#5  0x0000000002123cf7 in tcmalloc::CentralFreeList::ReleaseToSpans(void*) ()
#6  0x0000000002123d9b in tcmalloc::CentralFreeList::ReleaseListToSpans(void*) ()
#7  0x0000000002124067 in tcmalloc::CentralFreeList::InsertRange(void*, void*, int) ()
#8  0x00000000021320a4 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned int, int) ()
#9  0x0000000002132575 in tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, unsigned int) ()
#10 0x0000000000b276e0 in impala::MemPool::FreeAll() ()
#11 0x0000000000b34655 in impala::RowBatch::Reset() ()
#12 0x0000000000fe882f in impala::PartitionedAggregationNode::GetRowsStreaming(impala::RuntimeState*, impala::RowBatch*) ()
#13 0x0000000000fe9771 in impala::PartitionedAggregationNode::GetNext(impala::RuntimeState*, impala::RowBatch*, bool*) ()
#14 0x0000000000b78352 in impala::FragmentInstanceState::ExecInternal() ()
#15 0x0000000000b7adc2 in impala::FragmentInstanceState::Exec() ()
#16 0x0000000000b6a0da in impala::QueryState::ExecFInstance(impala::FragmentInstanceState*) ()
{code}
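
To make the two candidate sizes concrete, here is a minimal sketch of how a cap on chunk size would bound what a pool requests from the allocator. It is not Impala's actual MemPool code: the names `ChunkSizeFor`, `NextChunkSize`, `IsPowerOfTwo` and the `k...` constants are hypothetical, and the doubling growth policy is an assumption made only for illustration. Only the value 8192*127 and the 512KB alternative come from the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical constants for the experiment. Only 8192 * 127 and 512KB are
// taken from the issue text; the names are made up for this sketch.
constexpr int64_t kOneMB = 1024 * 1024;
constexpr int64_t kCandidate127Pages = 8192 * 127;  // 1040384 bytes
constexpr int64_t kCandidate512KB = 512 * 1024;     // power-of-2 alternative

static_assert(kCandidate127Pages == 1040384, "matches the size in the issue");
static_assert(kOneMB - kCandidate127Pages == 8192, "exactly 8KB below 1MB");

// True if v is a positive power of two.
constexpr bool IsPowerOfTwo(int64_t v) { return v > 0 && (v & (v - 1)) == 0; }

// Size of the chunk to request for an allocation of min_size bytes, given the
// size the pool planned to use next. Requests larger than max_chunk_size get
// their own chunk of exactly min_size bytes; everything else is capped.
inline int64_t ChunkSizeFor(int64_t min_size, int64_t next_chunk_size,
                            int64_t max_chunk_size) {
  assert(min_size >= 0 && next_chunk_size > 0 && max_chunk_size > 0);
  if (min_size > max_chunk_size) return min_size;
  return std::max(min_size, std::min(next_chunk_size, max_chunk_size));
}

// Assumed growth policy: double the chunk size each time, up to the cap.
inline int64_t NextChunkSize(int64_t chunk_size, int64_t max_chunk_size) {
  return std::min(chunk_size * 2, max_chunk_size);
}
```

With a cap of `kCandidate127Pages`, a pool that would otherwise have grown to a 1MB chunk requests 1040384 bytes instead, while oversized requests still pass through unchanged. Swapping the cap for `kCandidate512KB` gives the power-of-2 variant for comparison; which of the two behaves better under tcmalloc would still need to be measured.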