Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14348 )
Change subject: IMPALA-8742: Switch to ScanRange::bytes_to_read() instead of len()
......................................................................

IMPALA-8742: Switch to ScanRange::bytes_to_read() instead of len()

IMPALA-7543 introduced sub-ranges in scan ranges. These are the smaller
parts of a scan range that actually need to be read; the other parts of
the scan range can be skipped. Currently sub-ranges are only used in the
Parquet scanner during page filtering.

With sub-ranges, the scan range has a new field 'bytes_to_read_', which
is the sum of the lengths of the sub-ranges. If there are no sub-ranges,
'bytes_to_read_' equals the field 'len_', which is the length of the
whole scan range.

In some parts of Impala, ScanRange::len() is used where
ScanRange::bytes_to_read() is meant. This doesn't cause a bug today
because only the Parquet scanner uses sub-ranges, i.e. bytes_to_read()
usually equals len(). The Parquet scanner also doesn't hit the bug
because it tracks which pages it reads.

However, leaving the invocations of len() in place of bytes_to_read()
is a potential source of future bugs. Also, the scanners might allocate
more memory than needed.

In a couple of places we still need to invoke len(), e.g. when we test
scan-range containment (for local splits), or when we test whether a
scan range contains the mid-point of a Parquet row group.

Testing:
Added a scanner reservation test.
Ran the exhaustive tests.
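
The distinction between the two accessors can be illustrated with a minimal sketch. This is not Impala's actual ScanRange class: the names ScanRangeSketch, SubRangeSketch, Contains() and BufferBytesNeeded() are hypothetical and exist only for illustration. It assumes sub-ranges lie inside the enclosing range, and shows len() answering "what extent of the file does this range cover?" while bytes_to_read() answers "how much data will actually be read?".

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a sub-range: a slice of the file that must be read.
struct SubRangeSketch {
  int64_t offset;  // absolute file offset of the slice
  int64_t length;  // number of bytes to read starting at 'offset'
};

// Hypothetical, simplified model of a scan range (not Impala's real class).
class ScanRangeSketch {
 public:
  ScanRangeSketch(int64_t offset, int64_t len,
      std::vector<SubRangeSketch> sub_ranges = {})
    : offset_(offset), len_(len), sub_ranges_(std::move(sub_ranges)),
      bytes_to_read_(len) {
    // Without sub-ranges the whole range is read, so bytes_to_read_ == len_.
    if (sub_ranges_.empty()) return;
    // With sub-ranges, only the sum of their lengths is read.
    bytes_to_read_ = 0;
    for (const SubRangeSketch& sr : sub_ranges_) {
      // Assumption of this sketch: every sub-range lies inside the range.
      assert(sr.offset >= offset_ && sr.offset + sr.length <= offset_ + len_);
      bytes_to_read_ += sr.length;
    }
  }

  int64_t offset() const { return offset_; }
  int64_t len() const { return len_; }
  int64_t bytes_to_read() const { return bytes_to_read_; }

  // Containment concerns the range's extent in the file, so it uses len_.
  bool Contains(int64_t file_offset) const {
    return file_offset >= offset_ && file_offset < offset_ + len_;
  }

 private:
  int64_t offset_;
  int64_t len_;
  std::vector<SubRangeSketch> sub_ranges_;
  int64_t bytes_to_read_;
};

// Memory sizing concerns how much data will arrive, so it uses bytes_to_read().
inline int64_t BufferBytesNeeded(
    const ScanRangeSketch& range, int64_t max_buffer_size) {
  return std::min(range.bytes_to_read(), max_buffer_size);
}
```

In this sketch, a range of length 1000 with sub-ranges of 50 and 200 bytes reports len() == 1000 but bytes_to_read() == 250, so sizing a buffer from len() would reserve four times more than is read.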

Change-Id: Ie896db3f4b5f3e2272d81c2d360049af09c41d9c
Reviewed-on: http://gerrit.cloudera.org:8080/14348
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/scanner-context.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/scan-range.cc
M testdata/workloads/functional-query/queries/QueryTest/scanner-reservation.test
8 files changed, 24 insertions(+), 8 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/14348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie896db3f4b5f3e2272d81c2d360049af09c41d9c
Gerrit-Change-Number: 14348
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>