Wenzhe Zhou created IMPALA-12376:
------------------------------------

             Summary: DataSourceScanNode drop some returned rows if FLAGS_data_source_batch_size is greater than default value
                 Key: IMPALA-12376
                 URL: https://issues.apache.org/jira/browse/IMPALA-12376
             Project: IMPALA
          Issue Type: Sub-task
          Components: Backend
            Reporter: Wenzhe Zhou
            Assignee: Wenzhe Zhou

The backend DataSourceScanNode (be/src/exec/data-source-scan-node.cc) does not handle eos properly in DataSourceScanNode::GetNext(). Rows returned from the external data source can be dropped if FLAGS_data_source_batch_size is set to a value greater than the default of 1024.

In the following code:

    if (row_batch->AtCapacity() || input_batch_->eos || ReachedLimit()) {
      *eos = input_batch_->eos || ReachedLimit();

eos can be set to true while some rows in the input batch are still unprocessed. This happens when the branch is entered because row_batch->AtCapacity() returns true and input_batch_->eos is already true: the caller sees eos and stops calling GetNext(), so the remaining rows of the input batch are never returned.
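
The failure mode can be reproduced in a small standalone model. The sketch below is not the Impala code and not the actual fix: InputBatch, ScanState, GetNext, DrainScan and the check_consumed switch are hypothetical names introduced only for illustration, the limit check (ReachedLimit()) is left out, and the output capacity is assumed to be greater than zero. It only shows why eos must also depend on whether the input batch has been fully consumed.

```cpp
#include <cstddef>

// Hypothetical stand-in for the batch returned by the external data source.
struct InputBatch {
  std::size_t num_rows = 0;  // rows returned in this batch
  bool eos = false;          // the source has no further batches
};

// Hypothetical stand-in for the scan node's per-query state.
struct ScanState {
  InputBatch input;
  std::size_t next_row_idx = 0;  // next unprocessed row in the input batch
};

// Moves rows from the input batch into an output batch that holds at most
// 'capacity' rows (assumed > 0). Returns the number of rows emitted.
// check_consumed == false models the reported behavior: eos mirrors the
// input batch's eos flag even if the output batch filled up first.
// check_consumed == true reports eos only once every input row was emitted.
std::size_t GetNext(ScanState& s, std::size_t capacity, bool check_consumed,
                    bool* eos) {
  std::size_t emitted = 0;
  while (emitted < capacity && s.next_row_idx < s.input.num_rows) {
    ++s.next_row_idx;
    ++emitted;
  }
  const bool consumed = s.next_row_idx >= s.input.num_rows;
  *eos = check_consumed ? (s.input.eos && consumed) : s.input.eos;
  return emitted;
}

// Calls GetNext() until it reports eos, as a parent node would, and returns
// the total number of rows received. The source is modeled as returning all
// of its rows in a single batch that is already marked eos.
std::size_t DrainScan(std::size_t source_rows, std::size_t capacity,
                      bool check_consumed) {
  ScanState s;
  s.input.num_rows = source_rows;
  s.input.eos = true;
  std::size_t total = 0;
  bool eos = false;
  while (!eos) {
    total += GetNext(s, capacity, check_consumed, &eos);
  }
  return total;
}
```

In this model, DrainScan(2000, 1024, false) returns 1024, so 976 rows are lost, while DrainScan(2000, 1024, true) returns all 2000. When the source batch is no larger than the output capacity, both variants return every row, which matches the report that the problem only appears once the batch size is raised above 1024.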