Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11520 )

Change subject: IMPALA-7543: Enhance scan ranges to support sub-ranges
......................................................................


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/11520/3/be/src/runtime/io/hdfs-file-reader.cc
File be/src/runtime/io/hdfs-file-reader.cc:

http://gerrit.cloudera.org:8080/#/c/11520/3/be/src/runtime/io/hdfs-file-reader.cc@155
PS3, Line 155:         *eof = true;
> I'm thinking that eof is the right way to go. The FileReader is a new abstr
I agree, it is cleaner to keep FileReader simple and keep higher level logic in 
ScanRange.


http://gerrit.cloudera.org:8080/#/c/11520/5/be/src/runtime/io/request-ranges.h
File be/src/runtime/io/request-ranges.h:

http://gerrit.cloudera.org:8080/#/c/11520/5/be/src/runtime/io/request-ranges.h@254
PS5, Line 254:       bool expected_local, const BufferOpts& buffer_opts,
> line too long (93 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/11520/4/be/src/runtime/io/request-ranges.h
File be/src/runtime/io/request-ranges.h:

http://gerrit.cloudera.org:8080/#/c/11520/4/be/src/runtime/io/request-ranges.h@251
PS4, Line 251:   /// Same as above, but it also adds sub-ranges. No need to 
merge contiguous sub-ranges
             :   /// in advance, as this method will do the merge.
             :   void Reset(hdfsFS fs, const char* file, int64_t len, int64_t 
offset, int disk_id,
             :       bool expected_local, const BufferOpts& buffer_opts,
             :       std::vector<SubRange>&& sub_ranges, void* meta_dat
> Will we know the SubRanges when we call Reset()? (BaseScalarColumnReader::R
Yes, I think we'll always know the SubRanges when we call Reset().

I overloaded Reset() to have another version that takes SubRanges. This way I 
didn't have to modify the current call sites that pass in metadata.


http://gerrit.cloudera.org:8080/#/c/11520/3/be/src/runtime/io/scan-range.cc
File be/src/runtime/io/scan-range.cc:

http://gerrit.cloudera.org:8080/#/c/11520/3/be/src/runtime/io/scan-range.cc@268
PS3, Line 268:  (!read_status.ok()) return read_status;
> I think it makes sense for us to view end-of-file or partial read as errors
Seems valid, now it returns SCANNER_INCOMPLETE_READ in this case as suggested.


http://gerrit.cloudera.org:8080/#/c/11520/4/be/src/runtime/io/scan-range.cc
File be/src/runtime/io/scan-range.cc:

http://gerrit.cloudera.org:8080/#/c/11520/4/be/src/runtime/io/scan-range.cc@236
PS4, Line 236: buffer_desc->eosr_ = eof || bytes_read_ == bytes_to_
> One thing we should try to do is make it clear what this calculation is doi
Introduced a new local boolean variable called 'eof'.
Also added a short comment.



--
To view, visit http://gerrit.cloudera.org:8080/11520
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iea26ba386713990f7671aab5a372cf449b8d51e4
Gerrit-Change-Number: 11520
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Thu, 11 Oct 2018 09:58:33 +0000
Gerrit-HasComments: Yes

Reply via email to