Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13329 )

Change subject: IMPALA-6433: Part 1: Extract page reading logic from 
ParquetColumnReader
......................................................................


Patch Set 19:

(11 comments)

Thanks for doing all this cleanup! This is a big improvement. I had some 
relatively minor suggestions.

I'd suggest doing a pass over the patch and reducing the amount of vertical 
whitespace. I pointed out a couple of functions where it was out of line with 
the rest of the code, but I think generally there's other opportunities to fit 
more code on the screen without it becomign less readable.

http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-column-readers.cc
File be/src/exec/parquet/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-column-readers.cc@1066
PS19, Line 1066:     page_reader_.Close(row_batch->tuple_data_pool());
nit: maybe condense to:

  page_reader_.Close(row_batch == nullptr ? nullptr : 
row_batch->tuple_data_pool());


http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-column-readers.cc@1074
PS19, Line 1074: Status BaseScalarColumnReader::InitDictionary() {
nit: I think there's a bit too much vertical whitespace in this function now - 
it doesn't really aid readability. Some parts are basically double-spaced now.

I'd suggest removing all the blank lines from l1079-1097, l1099-1112 and the 
blank line before the final return.


http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-column-readers.cc@1084
PS19, Line 1084:   int data_size;
We should switch to int64_t for the byte value while we're here, it's just 
generally safer to use in interfaces for memory values.


http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-column-readers.cc@1133
PS19, Line 1133: Status BaseScalarColumnReader::ReadDataPage() {
Same comment about vertical whitespace.


http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-column-readers.cc@1404
PS19, Line 1404:   // TODO for 2.3: node_.element->name isn't necessarily useful
we can remove the 2.3 reference here :)


http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader-low.h
File be/src/exec/parquet/parquet-page-reader-low.h:

http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader-low.h@27
PS19, Line 27: ParquetPageReaderLow
I think the class name could be improved. Maybe something like 
ParquetPageReaderHelper?


http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader-low.h@93
PS19, Line 93:   ///                Uninitialized
Thank you for taking the time to add and document this, I like it.

Maybe it's worth labeling the state transitions with the method names, if you 
can do it without cluttering it too much.


http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader-low.cc
File be/src/exec/parquet/parquet-page-reader-low.cc:

http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader-low.cc@79
PS19, Line 79:   // TODO: this will need to change when we have co-located 
files and the columns
Can you remove this TODO, I don't think we're going to do what this is 
describing.


http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader-low.cc@131
PS19, Line 131:   *eos = false;
nit: there's a bit too much vertical whitespace here, it's become double-spaced


http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader.cc
File be/src/exec/parquet/parquet-page-reader.cc:

http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader.cc@97
PS19, Line 97:
nit: don't need the blank lines in this function, would be perfectly readable 
and denser without them.


http://gerrit.cloudera.org:8080/#/c/13329/19/be/src/exec/parquet/parquet-page-reader.cc@252
PS19, Line 252:     // TODO: can't we call 
stream_->ReleaseCompletedResources(false); at this point?
Quite possibly, this code has undergone a lot of iteration so stuff like this 
may have been missed.



--
To view, visit http://gerrit.cloudera.org:8080/13329
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0289394adcb97a3529313030930c9c5b85aaa12
Gerrit-Change-Number: 13329
Gerrit-PatchSet: 19
Gerrit-Owner: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Wed, 18 Sep 2019 23:36:36 +0000
Gerrit-HasComments: Yes

Reply via email to