[ https://issues.apache.org/jira/browse/IMPALA-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17071101#comment-17071101 ]
Csaba Ringhofer commented on IMPALA-9572:
-----------------------------------------

I think there is a quite straightforward bug in the code that can occur with nested Parquet + page filtering:
https://github.com/apache/impala/blob/5ff7c6a7de76aac2623710b3e43ceed8ce7424c8/be/src/exec/parquet/parquet-column-readers.cc#L1315

pos_slot is only non-NULL if pos_slot_desc_ is non-NULL (that is, the query asks for the position of the collection), but we try to read the positions anyway, which leads to a DCHECK in StrideWriter.

Testing is a harder question. It would be surprising if this were a flaky test, as the same query on the same file should always hit this issue. My guess is that Hive created a slightly different file, which led to page filtering kicking in.

> Impalad crash when process decimal value
> ----------------------------------------
>
>                 Key: IMPALA-9572
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9572
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 4.0
>            Reporter: Yongzhi Chen
>            Assignee: Norbert Luksa
>            Priority: Blocker
>              Labels: broken-build, crash
>         Attachments: impalad.impala-ec2-centos74-m5-4xlarge-ondemand-1a73.vpc.cloudera.com.jenkins.log.ERROR.20200328-031659.18222.gz
>
>
> In impala-asf-master-exhaustive build, minidump shows impalad crashed:
> Crash reason: SIGABRT
> Crash address: 0x7d10000472e
> Process uptime: not available
> Thread 396 (crashed)
>  0  libc-2.17.so + 0x351f7
>     rax = 0x0000000000000000 rdx = 0x0000000000000006
>     rcx = 0xffffffffffffffff rbx = 0x0000000007246600
>     rsi = 0x0000000000004273 rdi = 0x000000000000472e
>     rbp = 0x00007f6e43ee7050 rsp = 0x00007f6e43ee6ce8
>     r8 = 0x0000000000000000 r9 = 0x00007f6e43ee6b60
>     r10 = 0x0000000000000008 r11 = 0x0000000000000206
>     r12 = 0x0000000007246680 r13 = 0x0000000000000044
>     r14 = 0x000000000724dfc4 r15 = 0x0000000007246600
>     rip = 0x00007f6f2b1da1f7
>     Found by: given as instruction pointer in context
>  1  libc-2.17.so + 0x368e8
>     rbp = 0x00007f6e43ee7050 rsp = 0x00007f6e43ee6cf0
>     rip = 0x00007f6f2b1db8e8
>     Found by: stack scanning
>  2  impalad!google_breakpad::ExceptionHandler::HandleSignal(int, siginfo_t*, void*) + 0x1e0
>     rbp = 0x00007f6e43ee7050 rsp = 0x00007f6e43ee6d78
>     rip = 0x0000000004ed0840
>     Found by: stack scanning
>  3  impalad!google::DumpStackTraceAndExit() + 0x24
>     rbp = 0x00007f6e43ee7050 rsp = 0x00007f6e43ee6e20
>     rip = 0x0000000004ea2554
>     Found by: stack scanning
>  4  impalad!google::LogMessage::Fail() + 0xd
>     rbx = 0x0000000007246600 rbp = 0x00007f6e43ee7050
>     rsp = 0x00007f6e43ee6ed0 rip = 0x0000000004e98fad
>     Found by: call frame info
>  5  impalad!google::LogMessage::SendToLog() + 0x2b2
>     rbx = 0x0000000007246600 rbp = 0x00007f6e43ee7050
>     rsp = 0x00007f6e43ee6ee0 rip = 0x0000000004e9a852
>     Found by: call frame info
>  6  impalad!google::LogMessage::Flush() + 0x157
>     rbx = 0x00007f6e43ee7090 rbp = 0x00007f6f2bd8a6a0
>     rsp = 0x00007f6e43ee7060 r12 = 0x00007f6e43ee707f
>     r13 = 0x00000000072554f8 r14 = 0x00007f6e43ee7120
>     r15 = 0x0000000018954c50 rip = 0x0000000004e98987
>     Found by: call frame info
>  7  impalad!google::LogMessageFatal::~LogMessageFatal() + 0xe
>     rbx = 0x00007f6e43ee7120 rbp = 0x00007f6e43ee7160
>     rsp = 0x00007f6e43ee70e0 r12 = 0x0000000000000001
>     r13 = 0x00000000072554f8 r14 = 0x0000000118ba0000
>     r15 = 0x0000000018954c50 rip = 0x0000000004e9bf4e
>     Found by: call frame info
>  8  impalad!impala::StrideWriter<long>::StrideWriter(long*, long) [mem-util.h : 40 + 0xd]
>     rbx = 0x0000000000000001 rbp = 0x00007f6e43ee7160
>     rsp = 0x00007f6e43ee7100 r12 = 0x0000000000000001
>     r13 = 0x00000000072554f8 r14 = 0x0000000118ba0000
>     r15 = 0x0000000018954c50 rip = 0x0000000002c07535
>     Found by: call frame info
>  9  impalad!impala::BaseScalarColumnReader::FillPositionsInCandidateRange(int, int, unsigned char*, int) [parquet-column-readers.cc : 1319 + 0x1f]
>     rbx = 0x00000000000017d4 rbp = 0x00007f6e43ee7270
>     rsp = 0x00007f6e43ee7170 r12 = 0x0000000000000000
>     r13 = 0x0000000000000039 r14 = 0x0000000118ba0000
>     r15 = 0x0000000018954c50 rip = 0x0000000002c008d3
>     Found by: call frame info
> 10  impalad!impala::ScalarColumnReader<impala::DecimalValue<long>, (parquet::Type::type)7, true>::MaterializeValueBatchRepeatedDefLevel(int, int, unsigned char*, int*) [parquet-column-readers.cc : 612 + 0x2b]
>     rbx = 0x0000000000000007 rbp = 0x00007f6e43ee73a0
>     rsp = 0x00007f6e43ee7280 r12 = 0x0000000000000000
>     r13 = 0x0000000000000039 r14 = 0x0000000118ba0000
>     r15 = 0x0000000018954c50 rip = 0x0000000002c731f7
>     Found by: call frame info
> 11  impalad!bool impala::ScalarColumnReader<impala::DecimalValue<long>, (parquet::Type::type)7, true>::ReadValueBatch<true>(int, int, unsigned char*, int*) [parquet-column-readers.cc : 468 + 0x26]
>     rbx = 0x0000000000000000 rbp = 0x00007f6e43ee7520
>     rsp = 0x00007f6e43ee73b0 r12 = 0x0000000000000400
>     r13 = 0x0000000000000039 r14 = 0x0000000118ba0000
>     r15 = 0x0000000018954c50 rip = 0x0000000002c54db1
>     Found by: call frame info
> 12  impalad!impala::ScalarColumnReader<impala::DecimalValue<long>, (parquet::Type::type)7, true>::ReadValueBatch(impala::MemPool*, int, int, unsigned char*, int*) [parquet-column-readers.cc : 77 + 0x1d]
>     rbx = 0x0000000002c1cdac rbp = 0x00007f6e43ee7560
>     rsp = 0x00007f6e43ee7530 r12 = 0x0000000000000400
>     r13 = 0x0000000000000039 r14 = 0x0000000118ba0000
>     r15 = 0x0000000018954c50 rip = 0x0000000002c1cde7
>     Found by: call frame info
> 13  impalad!impala::HdfsParquetScanner::AssembleRows(std::vector<impala::ParquetColumnReader*, std::allocator<impala::ParquetColumnReader*> > const&, impala::RowBatch*, bool*) [hdfs-parquet-scanner.cc : 1111 + 0x19]
>     rbx = 0x0000000002c1cdac rbp = 0x00007f6e43ee7880
>     rsp = 0x00007f6e43ee7570 r12 = 0x0000000000000400
>     r13 = 0x0000000000000039 r14 = 0x0000000118ba0000
>     r15 = 0x0000000018954c50 rip = 0x0000000002bb4145
>     Found by: call frame info
> 14  impalad!impala::HdfsParquetScanner::GetNextInternal(impala::RowBatch*) [hdfs-parquet-scanner.cc : 454 + 0x42]
>     rbx = 0x00007f6e43ee8080 rbp = 0x00007f6e43ee7f30
>     rsp = 0x00007f6e43ee7890 r12 = 0x0000000000000000
>     r13 = 0x0000000000000000 r14 = 0x00000000a6971fc8
>     r15 = 0x000000000000002c rip = 0x0000000002baefe5
>     Found by: call frame info
> 15  impalad!impala::HdfsParquetScanner::ProcessSplit() [hdfs-parquet-scanner.cc : 351 + 0x39]
>     rbx = 0x0000000002bad198 rbp = 0x00007f6e43ee8000
> Link: https://master-02.jenkins.cloudera.com/job/impala-asf-master-exhaustive/917/testReport/generate_junitxml/finalize/minidumps/
> There are several test failures after that with same error:
> Query aborted:Failed due to unreachable impalad(s): impala-ec2-centos74-m5-4xlarge-ondemand-1a73.vpc.cloudera.com:22001
> The first related test may be:
> query_test.test_spilling.TestSpillingDebugActionDimensions.test_spilling
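
The null pos_slot condition described in the comment can be illustrated with a small standalone sketch. This is not Impala code and not the actual patch: StrideWriterSketch and FillPositionsSketch are hypothetical names, a plain assert stands in for the DCHECK, and the early return is only one plausible way to guard the call, assuming positions can simply be skipped when the query does not ask for them.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a strided writer: it stores 64-bit values at a
// fixed byte stride and rejects a null destination, as the DCHECK in the
// stack trace does.
struct StrideWriterSketch {
  uint8_t* cur;
  int64_t stride_bytes;

  StrideWriterSketch(uint8_t* start, int64_t stride)
      : cur(start), stride_bytes(stride) {
    assert(start != nullptr);  // fires if the caller passes a null slot
  }

  void Write(int64_t value) {
    std::memcpy(cur, &value, sizeof(value));
    cur += stride_bytes;
  }
};

// Writes `count` consecutive positions starting at `first_pos`.
// Returns the number of positions written. When `pos_slot` is null (assumed
// here to mean that the query did not request the position), nothing is
// written and the writer is never constructed.
int FillPositionsSketch(uint8_t* pos_slot, int64_t stride_bytes,
                        int64_t first_pos, int count) {
  if (pos_slot == nullptr) return 0;  // the guard missing in the crashing path
  StrideWriterSketch writer(pos_slot, stride_bytes);
  for (int i = 0; i < count; ++i) writer.Write(first_pos + i);
  return count;
}
```

Without the early return, calling FillPositionsSketch with a null pos_slot would trip the assert in the constructor, which mirrors frames 8 and 9 of the trace above.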