Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13329 )
Change subject: IMPALA-6433: Part 1: Extract page reading logic from ParquetColumnReader ...................................................................... Patch Set 19: (11 comments) Thanks for doing all this cleanup! This is a big improvement. I had some relatively minor suggestions. I'd suggest doing a pass over the patch and reducing the amount of vertical whitespace. I pointed out a couple of functions where it was out of line with the rest of the code, but I think generally there's other opportunities to fit more code on the screen without it becomign less readable. http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-column-readers.cc File be/src/exec/parquet/parquet-column-readers.cc: http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-column-readers.cc@1066 PS19, Line 1066: page_reader_.Close(row_batch->tuple_data_pool()); nit: maybe condense to: page_reader_.Close(row_batch == nullptr ? nullptr : row_batch->tuple_data_pool()); http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-column-readers.cc@1074 PS19, Line 1074: Status BaseScalarColumnReader::InitDictionary() { nit: I think there's a bit too much vertical whitespace in this function now - it doesn't really aid readability. Some parts are basically double-spaced now. I'd suggest removing all the blank lines from l1079-1097, l1099-1112 and the blank line before the final return. http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-column-readers.cc@1084 PS19, Line 1084: int data_size; We should switch to int64_t for the byte value while we're here, it's just generally safer to use in interfaces for memory values. http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-column-readers.cc@1133 PS19, Line 1133: Status BaseScalarColumnReader::ReadDataPage() { Same comment about vertical whitespace. http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-column-readers.cc@1404 PS19, Line 1404: // TODO for 2.3: node_.element->name isn't necessarily useful we can remove the 2.3 reference here :) http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader-low.h File be/src/exec/parquet/parquet-page-reader-low.h: http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader-low.h@27 PS19, Line 27: ParquetPageReaderLow I think the class name could be improved. Maybe something like ParquetPageReaderHelper? http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader-low.h@93 PS19, Line 93: /// Uninitialized Thank you for taking the time to add and document this, I like it. Maybe it's worth labeling the state transitions with the method names, if you can do it without cluttering it too much. http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader-low.cc File be/src/exec/parquet/parquet-page-reader-low.cc: http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader-low.cc@79 PS19, Line 79: // TODO: this will need to change when we have co-located files and the columns Can you remove this TODO, I don't think we're going to do what this is describing. http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader-low.cc@131 PS19, Line 131: *eos = false; nit: there's a bit too much vertical whitespace here, it's become double-spaced http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader.cc File be/src/exec/parquet/parquet-page-reader.cc: http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader.cc@97 PS19, Line 97: nit: don't need the blank lines in this function, would be perfectly readable and denser without them. http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader.cc@252 PS19, Line 252: // TODO: can't we call stream_->ReleaseCompletedResources(false); at this point? Quite possibly, this code has undergone a lot of iteration so stuff like this may have been missed. -- To view, visit http://gerrit.cloudera.org:8080/13329 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0289394adcb97a3529313030930c9c5b85aaa12 Gerrit-Change-Number: 13329 Gerrit-PatchSet: 19 Gerrit-Owner: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Wed, 18 Sep 2019 23:36:36 +0000 Gerrit-HasComments: Yes