[ https://issues.apache.org/jira/browse/IMPALA-6932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pooja Nilangekar resolved IMPALA-6932.
--------------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 3.2.0

> Simple LIMIT 1 query can be really slow on many-filed sequence datasets
> -----------------------------------------------------------------------
>
>                 Key: IMPALA-6932
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6932
>             Project: IMPALA
>          Issue Type: Task
>          Components: Backend
>            Reporter: Philip Zeyliger
>            Assignee: Pooja Nilangekar
>            Priority: Critical
>             Fix For: Impala 3.2.0
>
>
> I recently ran across really slow behavior with the trivial
> {{SELECT * FROM table LIMIT 1}} query. The table used Avro as a file format
> and had about 45,000 files across about 250 partitions. An optimization
> kicked in to set NUM_NODES to 1.
> The query ran for about an hour, and the profile indicated that it was
> opening files:
>  - TotalRawHdfsOpenFileTime(*): 1.0h (3622833666032)
> I took a single minidump while this query was running, and I suspect the
> query was here:
> {code:java}
> 1 impalad!impala::ScannerContext::Stream::GetNextBuffer(long) [scanner-context.cc : 115 + 0x13]
> 2 impalad!impala::ScannerContext::Stream::GetBytesInternal(long, unsigned char**, bool, long*) [scanner-context.cc : 241 + 0x5]
> 3 impalad!impala::HdfsAvroScanner::ReadFileHeader() [scanner-context.inline.h : 54 + 0x1f]
> 4 impalad!impala::BaseSequenceScanner::GetNextInternal(impala::RowBatch*) [base-sequence-scanner.cc : 157 + 0x13]
> 5 impalad!impala::HdfsScanner::ProcessSplit() [hdfs-scanner.cc : 129 + 0xc]
> 6 impalad!impala::HdfsScanNode::ProcessSplit(std::vector<impala::FilterContext, std::allocator<impala::FilterContext> > const&, impala::MemPool*, impala::io::ScanRange*) [hdfs-scan-node.cc : 527 + 0x17]
> 7 impalad!impala::HdfsScanNode::ScannerThread() [hdfs-scan-node.cc : 437 + 0x1c]
> 8 impalad!impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::ThreadDebugInfo const*, impala::Promise<long>*) [function_template.hpp : 767 + 0x7]
> {code}
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
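
As a rough sanity check on the profile counter quoted in the report, the arithmetic below converts the raw value to hours and spreads it over the reported file count. This is only a sketch under two assumptions that are not stated in the report: that the raw counter is in nanoseconds (which is consistent with the pretty-printed 1.0h), and that each of the roughly 45,000 files was opened once. The variable names are illustrative.

```python
# Values copied from the report; the file count is approximate ("about 45,000").
total_open_ns = 3_622_833_666_032  # raw TotalRawHdfsOpenFileTime value
num_files = 45_000

# Assumption: the raw counter is in nanoseconds.
total_s = total_open_ns / 1e9
total_h = total_s / 3600

# Assumption: every file was opened exactly once during the query.
per_file_ms = total_s / num_files * 1000

print(f"total open time: {total_s:.1f} s ({total_h:.2f} h)")
print(f"average per file, if all files were opened: {per_file_ms:.1f} ms")
```

Under those assumptions the per-file open cost is modest, which would suggest that the hour comes from the number of files opened rather than from any single slow open.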