Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/15370 )
Change subject: IMPALA-6636: Use async IO in ORC scanner
......................................................................


Patch Set 14:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-columnar-scanner.cc
File be/src/exec/hdfs-columnar-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-columnar-scanner.cc@239
PS13, Line 239: columnar_scanner_actual_reservation_counter_->UpdateCounter(
> line too long (93 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-orc-scanner.h
File be/src/exec/hdfs-orc-scanner.h:

http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-orc-scanner.h@123
PS13, Line 123: le_desc_->file_length;
> Can be removed?
Done


http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-orc-scanner.cc@208
PS13, Line 208: 
> line too long (95 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-orc-scanner.cc@300
PS13, Line 300: memcpy(buf, stream_buf, length); // TODO: extend Orc interface to avoid the copy
              : current_position_ += length;
> Calling 'ReleaseCompletedResources(true)' seems to be OK here?
Done


http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/15370/13/be/src/exec/hdfs-scan-node-base.cc@821
PS13, Line 821: DCHECK_LE(offset + len, GetFileDesc(metadata->partition_id, file)->file_length)
              : << "Scan range beyond end of file (offset=" << offset << ", len=" << len << ")";
> Can be removed?
This seems to be specific to ORC only, so I decided to restore this DCHECK and add the additional check in hdfs-orc-scanner.cc.

http://gerrit.cloudera.org:8080/#/c/15370/14/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/15370/14/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@2130
PS14, Line 2130: private List<Long> computeMinColumnMemReservations(boolean hasOrc) {
It is probably safe to set hasOrc=false here if ORC_ASYNC_READ=false.


-- 
To view, visit http://gerrit.cloudera.org:8080/15370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074
Gerrit-Change-Number: 15370
Gerrit-PatchSet: 14
Gerrit-Owner: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Comment-Date: Fri, 03 Dec 2021 00:01:16 +0000
Gerrit-HasComments: Yes