Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 )

Change subject: IMPALA-6636: Use async IO in ORC scanner
......................................................................


Patch Set 23:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/15370/21/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15370/21/be/src/exec/hdfs-orc-scanner.cc@252
PS21, Line 252:         col_range_local, split_range->mtime(), BufferOpts(split_range->cache_options()));
              :     RETURN_IF_ERROR(
              :         context_->AddAndStartStream(scan_range, range.io_reservation, &range.stream_));
              :   }
              :   return Status::OK();
              : }
              :
> Added one more case in HdfsOrcScanner::ScanRangeInputStream::read.
Done


http://gerrit.cloudera.org:8080/#/c/15370/21/be/src/exec/hdfs-orc-scanner.cc@484
PS21, Line 484: Status HdfsOrcScanner::ProcessFileTail() {
> Ah, yeah, that's a problem. Then we need to make sure 'input_stream_' won't
Done


http://gerrit.cloudera.org:8080/#/c/15370/22/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15370/22/be/src/exec/hdfs-orc-scanner.cc@114
PS22, Line 114: uint64_t
> Could you change this to warning or use VLOG_QUERY? Otherwise it's hard to
Done


http://gerrit.cloudera.org:8080/#/c/15370/22/be/src/exec/hdfs-orc-scanner.cc@128
PS22, Line 128:   if (!status.ok()) throw ResourceError(status);
> Can we report the offset and length here?
Done


http://gerrit.cloudera.org:8080/#/c/15370/22/be/src/exec/hdfs-orc-scanner.cc@137
PS22, Line 137:     return Status(msg);
              :   }
> Not related to this patch, but I think we need to revisit this as IMPALA-68
I suppose we can match it with how we expect locality in the case of ColumnRange::read?


http://gerrit.cloudera.org:8080/#/c/15370/22/be/src/exec/hdfs-orc-scanner.cc@238
PS22, Line 238: artition_id =
> Shouldn't this be "range.offset_ + range.length_"?
Thanks for catching this! Fixed it and moved the logic into an inline function, IsExpectedLocal.
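
For readers following the thread, here is a minimal sketch of the kind of containment check discussed above, where the end of a range is computed as offset plus length. It is not the code from the patch: the ByteRange struct, its field names, and the signature of IsExpectedLocal below are assumptions made for illustration only, and the real helper in hdfs-orc-scanner.cc may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a scan range; this is NOT Impala's ScanRange class.
struct ByteRange {
  uint64_t offset;       // first byte covered by the range
  uint64_t length;       // number of bytes covered
  bool expected_local;   // whether the enclosing split is expected to be local
};

// Assumed semantics: a read of [offset, offset + length) is expected to be
// local only if the split is expected local and the read lies entirely
// inside the split. The end of each range is offset + length.
inline bool IsExpectedLocal(const ByteRange& split, uint64_t offset, uint64_t length) {
  // The sketch assumes the sums below do not overflow uint64_t.
  assert(offset + length >= offset);
  const uint64_t split_end = split.offset + split.length;
  const uint64_t read_end = offset + length;
  return split.expected_local && offset >= split.offset && read_end <= split_end;
}
```

With a split covering bytes 100 to 149, a read of offset 100 and length 50 is inside the split, while a read of offset 120 and length 40 runs past the end and is not.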


http://gerrit.cloudera.org:8080/#/c/15370/22/be/src/exec/hdfs-orc-scanner.cc@261
PS22, Line 261:     string msg = Substitute("ORC read request out of range. offset: $0 length: $1 $2",
              :         offset, length, debug());
              :     return Status(msg);
              :   }
              :
              :   DCHECK(offset >= current_position_);
              :   S
> I think we can change this to DCHECK(offset >= current_position_) now, sinc
Done
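
As a rough illustration of the pattern in the quoted snippet (reject out-of-range requests with a message that reports offset and length, then assert that reads only move forward), a standalone sketch might look like the following. The ReadCheck struct, the CheckReadRequest name, and its parameters are hypothetical; plain assert stands in for DCHECK, and the debug() suffix from the quoted message is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical result type; the real code returns an impala::Status.
struct ReadCheck {
  bool ok;
  std::string msg;
};

// Validates a read of [offset, offset + length) against a range that starts
// at range_offset and spans range_length bytes. current_position is the
// stream position before the read.
inline ReadCheck CheckReadRequest(uint64_t range_offset, uint64_t range_length,
    uint64_t current_position, uint64_t offset, uint64_t length) {
  const uint64_t range_end = range_offset + range_length;
  if (offset < range_offset || offset + length > range_end) {
    std::ostringstream ss;
    ss << "ORC read request out of range. offset: " << offset
       << " length: " << length;
    return {false, ss.str()};
  }
  // Stand-in for DCHECK(offset >= current_position_): reads never go backwards.
  assert(offset >= current_position);
  return {true, ""};
}
```

Reporting the offset and length in the message makes a failing request identifiable from the error text alone.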


http://gerrit.cloudera.org:8080/#/c/15370/22/be/src/exec/hdfs-orc-scanner.cc@287
PS22, Line 287:
> Could you create a JIRA for this?
I found that ORC-262 already describes this problem. I added it to the comment.



--
To view, visit http://gerrit.cloudera.org:8080/15370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074
Gerrit-Change-Number: 15370
Gerrit-PatchSet: 23
Gerrit-Owner: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Comment-Date: Fri, 21 Jan 2022 17:41:58 +0000
Gerrit-HasComments: Yes
