[ https://issues.apache.org/jira/browse/IMPALA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Quanlong Huang updated IMPALA-10233:
------------------------------------
    Summary: Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned table with zorder  (was: Hit DCHECK in DmlExecState::AddPartition)

> Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned table with zorder
> ------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-10233
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10233
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Quanlong Huang
>            Priority: Major
>
> Hit the DCHECK when inserting into a partitioned parquet table with zorder. I'm on the master branch (commit=b8a2b75).
> {code:java}
> F1012 15:04:27.726274  3868 dml-exec-state.cc:432] a6479cc4725101fd:b86db2a100000003] Check failed: per_partition_status_.find(name) == per_partition_status_.end()
> *** Check failure stack trace: ***
>     @          0x51ff3cc  google::LogMessage::Fail()
>     @          0x5200cbc  google::LogMessage::SendToLog()
>     @          0x51fed2a  google::LogMessage::Flush()
>     @          0x5202928  google::LogMessageFatal::~LogMessageFatal()
>     @          0x234ba18  impala::DmlExecState::AddPartition()
>     @          0x2817786  impala::HdfsTableSink::GetOutputPartition()
>     @          0x2813151  impala::HdfsTableSink::WriteClusteredRowBatch()
>     @          0x28156c4  impala::HdfsTableSink::Send()
>     @          0x23139dd  impala::FragmentInstanceState::ExecInternal()
>     @          0x230fe10  impala::FragmentInstanceState::Exec()
>     @          0x227bb79  impala::QueryState::ExecFInstance()
>     @          0x2279f7b  _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
>     @          0x227e2c2  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
>     @          0x2137699  boost::function0<>::operator()()
>     @          0x2715d7d  impala::Thread::SuperviseThread()
>     @          0x271dd1a  boost::_bi::list5<>::operator()<>()
>     @          0x271dc3e  boost::_bi::bind_t<>::operator()()
>     @          0x271dbff  boost::detail::thread_data<>::run()
>     @          0x3f05f01  thread_proxy
>     @     0x7fb18bebb6b9  start_thread
>     @     0x7fb188a474dc  clone
> {code}
> It seems the zorder sort node doesn't keep the rows sorted by partition keys. This violates the assumption of HdfsTableSink::WriteClusteredRowBatch() that its input must be ordered by the partition key expressions. As a result, a partition key was deleted from the {{partition_keys_to_output_partitions_}} map and then inserted into it again.
> {code:c++}
>   /// Maps all rows in 'batch' to partitions and appends them to their temporary Hdfs
>   /// files. The input must be ordered by the partition key expressions.
>   Status WriteClusteredRowBatch(RuntimeState* state, RowBatch* batch) WARN_UNUSED_RESULT;
> {code}
> The key got removed here, when processing a new partition key:
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L334
> It got reinserted here, which hit the DCHECK:
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L590
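
The failure mode described in the report can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not Impala's code: `CountDuplicateRegistrations` is a hypothetical helper, the `registered` set stands in for `per_partition_status_`, and the single `current` key stands in for the one open partition that the clustered write path keeps. It counts how often a partition would be registered a second time, which is the condition the DCHECK in DmlExecState::AddPartition() rejects.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a clustered writer: only the most recent partition
// stays open, and it is closed and forgotten as soon as the key changes.
// Returns how many times a partition is registered again after having been
// closed, i.e. how often the AddPartition() DCHECK would fire.
int CountDuplicateRegistrations(const std::vector<std::string>& partition_keys) {
  std::set<std::string> registered;  // stand-in for per_partition_status_
  std::string current;               // key of the single open partition
  bool has_current = false;
  int duplicates = 0;

  for (const std::string& key : partition_keys) {
    // Same key as the open partition: keep appending, nothing to register.
    if (has_current && key == current) continue;

    // Key changed: the previous partition is dropped and a new one is opened.
    current = key;
    has_current = true;

    // Registering a key that was already seen is the duplicate insert.
    if (!registered.insert(key).second) ++duplicates;
  }
  return duplicates;
}
```

With input that is clustered by partition key (for example `p=1, p=1, p=2, p=2`) the function returns 0. With interleaved keys (for example `p=1, p=2, p=1`), the kind of ordering the report suspects the zorder sort produces, the third row reopens `p=1` and the function returns 1.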