Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11444 )
Change subject: IMPALA-7556: part 1: handle different file systems via polymorphism ...................................................................... Patch Set 2: (9 comments) I think this makes sense. I think the documentation of invariants around some of the variables could be better and make it easier to work on the code in future. http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/file-reader.h File be/src/runtime/io/file-reader.h: http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/file-reader.h@41 PS2, Line 41: void SetRequestContext(RequestContext* request_context) { I think we always call SetRequestContext() and SetDiskIoMgr() at the same time as ResetState() so I think we can just combine them. Would make the intended use pattern of the API clearer. Or we could just expose request_context_ and io_mgr_ in ScanRange and use those directly. http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/file-reader.h@50 PS2, Line 50: BytesRead bytes_read() since it's just a trivial accessor http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/file-reader.h@81 PS2, Line 81: GetLock this is a plain accessor so prefer 'lock()' http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/file-reader.h@93 PS2, Line 93: ScanRange* scan_range_; const? I don't think this changes after initialization. http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/hdfs-file-reader.h File be/src/runtime/io/hdfs-file-reader.h: http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/hdfs-file-reader.h@47 PS2, Line 47: hdfsFS hdfs_fs_; const, since I don't think this is reassigned http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/hdfs-file-reader.cc File be/src/runtime/io/hdfs-file-reader.cc: http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/hdfs-file-reader.cc@108 PS2, Line 108: int64_t max_chunk_size = scan_range_->MaxReadChunkSize(); This part was just a mechanical move, right? Would be helpful to flag the parts that were just moved from one file to another (we can verify this ourselves but it's helpful to have a starting people). http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/local-file-reader.h File be/src/runtime/io/local-file-reader.h: http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/local-file-reader.h@40 PS2, Line 40: FILE* file_ = nullptr; Maybe briefly document how this fits into the lifecycle, e.g. set in Open() and set back to nullptr in Close(). http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/request-ranges.h File be/src/runtime/io/request-ranges.h: http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/request-ranges.h@274 PS2, Line 274: Status CancelStatus() const { return cancel_status_; } This needs a little more thought - we needed to hold lock_ or scan_range_lock_ to read this member so don't want to expose this publicly here. Maybe could move it to private: and document that the FileReader classes access it directly while holding their lock. trivial accessor, so cancel_status() if we keep it exposed. http://gerrit.cloudera.org:8080/#/c/11444/2/be/src/runtime/io/request-ranges.h@482 PS2, Line 482: hdfs_lock_ Comments referencing hdfs_lock_ need updating. -- To view, visit http://gerrit.cloudera.org:8080/11444 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia3d3d2d774075008285230606b992603d5be1a82 Gerrit-Change-Number: 11444 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Mon, 17 Sep 2018 17:03:51 +0000 Gerrit-HasComments: Yes