[ https://issues.apache.org/jira/browse/IMPALA-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mostafa Mokhtar resolved IMPALA-2682.
-------------------------------------
    Resolution: Fixed

Fixed in 2.13

> Address exchange operator CPU bottlenecks in Thrift
> ---------------------------------------------------
>
>                 Key: IMPALA-2682
>                 URL: https://issues.apache.org/jira/browse/IMPALA-2682
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Distributed Exec
>    Affects Versions: Impala 2.2
>            Reporter: Mostafa Mokhtar
>            Assignee: Mostafa Mokhtar
>            Priority: Minor
>              Labels: performance
>         Attachments: Screen Shot 2015-11-25 at 4.19.28 PM.png
>
>
> Currently the exchange operator is a major bottleneck for partitioned table
> inserts, broadcast and shuffle joins, and aggregates. The exchange alone
> accounts for 60% of the slowdown of partitioned table inserts compared to
> un-partitioned inserts.
> The CPU cost of exchange operators varies between 10% and 50%, depending on
> the cardinality and the complexity of the other operators.
> Part of the bottleneck in Thrift is due to the 512-byte buffer used by
> TBufferedTransport, which forces reads through a "slow" path where the
> payload is mem-copied in N chunks of 512 bytes.
> These are the top contributing call stacks in the exchange operator.
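To illustrate the 512-byte buffer effect described above, here is a minimal, self-contained sketch. It is not Thrift's code: `CountingSource`, `BufferedReader`, and `SourceReadsForPayload` are hypothetical names, and the reader only approximates the refill-then-copy behaviour attributed to the "slow" read path. It counts how many reads hit the underlying source for a given payload and buffer size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a socket: serves bytes from memory and counts
// read() calls (each call models one read on the underlying transport).
class CountingSource {
 public:
  explicit CountingSource(std::size_t total_bytes) : data_(total_bytes, 'x') {}

  std::size_t read(char* dst, std::size_t len) {
    ++read_calls_;
    const std::size_t n = std::min(len, data_.size() - pos_);
    std::memcpy(dst, data_.data() + pos_, n);
    pos_ += n;
    return n;
  }

  std::size_t read_calls() const { return read_calls_; }

 private:
  std::vector<char> data_;
  std::size_t pos_ = 0;
  std::size_t read_calls_ = 0;
};

// Simplified buffered reader (an assumption, not Thrift's implementation):
// when its buffer is empty it refills with at most buf_size bytes from the
// source, then copies out whatever it has, up to the requested length.
class BufferedReader {
 public:
  BufferedReader(CountingSource* src, std::size_t buf_size)
      : src_(src), buf_(buf_size) {
    assert(buf_size > 0);
  }

  // May return fewer bytes than requested, like a partial transport read.
  std::size_t read(char* dst, std::size_t len) {
    if (begin_ == end_) {
      end_ = src_->read(buf_.data(), buf_.size());
      begin_ = 0;
      if (end_ == 0) return 0;  // source exhausted
    }
    const std::size_t n = std::min(len, end_ - begin_);
    std::memcpy(dst, buf_.data() + begin_, n);
    begin_ += n;
    return n;
  }

  // Loops over read() until len bytes are delivered or the source runs dry.
  std::size_t readAll(char* dst, std::size_t len) {
    std::size_t got = 0;
    while (got < len) {
      const std::size_t n = read(dst + got, len - got);
      if (n == 0) break;
      got += n;
    }
    return got;
  }

 private:
  CountingSource* src_;
  std::vector<char> buf_;
  std::size_t begin_ = 0;
  std::size_t end_ = 0;
};

// Returns how many source reads are needed to deliver the whole payload.
std::size_t SourceReadsForPayload(std::size_t payload_bytes,
                                  std::size_t buf_size) {
  CountingSource src(payload_bytes);
  BufferedReader reader(&src, buf_size);
  std::vector<char> out(payload_bytes);
  const std::size_t got = reader.readAll(out.data(), out.size());
  assert(got == payload_bytes);
  (void)got;
  return src.read_calls();
}
```

In this model, the number of source reads grows as the payload size divided by the buffer size, so a 1 MiB string body needs 2048 reads with a 512-byte buffer but only 8 with a 128 KiB buffer. The 128 KiB figure is only an illustrative value, not a setting taken from the issue.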
> Read path in thrift
> {code}
> Data Of Interest (CPU Metrics)
> 1 of 18: 58.1% (43.860s of 75.474s)
> impalad!apache::thrift::transport::TSocket::read - TSocket.cpp
> impalad!apache::thrift::transport::TTransport::read+0x5 - TTransport.h:109
> impalad!apache::thrift::transport::TBufferedTransport::readSlow+0x4b - TBufferTransports.cpp:52
> impalad!apache::thrift::transport::TBufferBase::read+0xb9 - TBufferTransports.h:69
> impalad!apache::thrift::transport::readAll<apache::thrift::transport::TBufferBase>+0x28 - TTransport.h:44
> impalad!apache::thrift::transport::TTransport::readAll+0xa - TTransport.h:126
> impalad!apache::thrift::protocol::TBinaryProtocolT<apache::thrift::transport::TTransport>::readStringBody<std::string>+0xae - TBinaryProtocol.tcc:458
> impalad!readString<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >+0x24 - TBinaryProtocol.tcc:412
> impalad!apache::thrift::protocol::TVirtualProtocol<apache::thrift::protocol::TBinaryProtocolT<apache::thrift::transport::TTransport>, apache::thrift::protocol::TProtocolDefaults>::readString_virt+0x11 - TVirtualProtocol.h:515
> impalad!apache::thrift::protocol::TProtocol::readString+0x10 - TProtocol.h:621
> impalad!impala::TRowBatch::read+0x17d - Results_types.cpp:89
> {code}
> Send path compressing the data
> {code}
> Data Of Interest (CPU Metrics)
> 1 of 1: 100.0% (65.480s of 65.480s)
> impalad!LZ4_compress64kCtx - [Unknown]
> impalad!impala::Lz4Compressor::ProcessBlock+0x32 - compress.cc:294
> impalad!impala::RowBatch::Serialize+0x20b - row-batch.cc:211
> impalad!impala::RowBatch::Serialize+0x34 - row-batch.cc:168
> impalad!impala::DataStreamSender::SerializeBatch+0x117 - data-stream-sender.cc:463
> impalad!impala::DataStreamSender::Channel::SendCurrentBatch+0x44 - data-stream-sender.cc:256
> impalad!impala::DataStreamSender::Channel::AddRow+0x48 - data-stream-sender.cc:232
> impalad!impala::DataStreamSender::Send+0x6d2 - data-stream-sender.cc:444
> {code}
> Memory copy in send path
> {code}
> Data Of Interest (CPU Metrics)
> 1 of 137: 24.6% (6.870s of 27.911s)
> libc.so.6!memcpy - [Unknown]
> impalad!impala::Tuple::DeepCopyVarlenData+0xfe - tuple.cc:89
> impalad!impala::Tuple::DeepCopy+0xc3 - tuple.cc:69
> impalad!impala::DataStreamSender::Channel::AddRow+0xf5 - data-stream-sender.cc:245
> impalad!impala::DataStreamSender::Send+0x6d2 - data-stream-sender.cc:444
> impalad!impala::PlanFragmentExecutor::OpenInternal+0x3e2 - plan-fragment-executor.cc:355
> {code}
> Un-compressing in read path
> {code}
> Data Of Interest (CPU Metrics)
> 1 of 1: 100.0% (25.100s of 25.100s)
> impalad!LZ4_uncompress - [Unknown]
> impalad!impala::Lz4Decompressor::ProcessBlock+0x47 - decompress.cc:454
> impalad!impala::RowBatch::RowBatch+0x3ec - row-batch.cc:108
> impalad!impala::DataStreamRecvr::SenderQueue::AddBatch+0x1a1 - data-stream-recvr.cc:207
> impalad!impala::DataStreamMgr::AddData+0x134 - data-stream-mgr.cc:103
> impalad!impala::ImpalaServer::TransmitData+0x175 - impala-server.cc:1077
> impalad!impala::ImpalaInternalService::TransmitData+0x43 - impala-internal-service.h:60
> {code}
> Write path in thrift
> {code}
> Data Of Interest (CPU Metrics)
> 1 of 14: 31.9% (7.010s of 21.980s)
> libpthread.so.0!__send - [Unknown]
> impalad!apache::thrift::transport::TSocket::write_partial+0x36 - TSocket.cpp:567
> impalad!apache::thrift::transport::TSocket::write+0x3c - TSocket.cpp:542
> impalad!apache::thrift::transport::TTransport::write+0xa - TTransport.h:158
> impalad!apache::thrift::transport::TBufferedTransport::writeSlow+0x79 - TBufferTransports.cpp:93
> impalad!apache::thrift::transport::TTransport::write+0xb - TTransport.h:158
> impalad!writeString<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >+0x39 - TBinaryProtocol.tcc:186
> impalad!apache::thrift::protocol::TVirtualProtocol<apache::thrift::protocol::TBinaryProtocolT<apache::thrift::transport::TTransport>, apache::thrift::protocol::TProtocolDefaults>::writeString_virt+0x1b - TVirtualProtocol.h:417
> impalad!apache::thrift::protocol::TProtocol::writeString+0xf - TProtocol.h:463
> impalad!impala::TRowBatch::write+0x18a - Results_types.cpp:169
> impalad!impala::TTransmitDataParams::write+0x128 - ImpalaInternalService_types.cpp:2914
> impalad!impala::ImpalaInternalService_TransmitData_pargs::write+0x59 - ImpalaInternalService.cpp:571
> impalad!impala::ImpalaInternalServiceClient::send_TransmitData+0x65 - ImpalaInternalService.cpp:862
> impalad!impala::ImpalaInternalServiceClient::TransmitData+0x1b - ImpalaInternalService.cpp:851
> impalad!impala::ClientConnection<impala::ImpalaInternalServiceClient>::DoRpc<void (impala::TTransmitDataResult&, impala::TTransmitDataParams const&) impala::ImpalaInternalServiceClient::*, impala::TTransmitDataParams, impala::TTransmitDataResult>+0x4a - client-cache.h:229
> impalad!impala::DataStreamSender::Channel::TransmitDataHelper+0x5b1 - data-stream-sender.cc:206
> {code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)