Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3989: Display skew warning for poorly formatted Parquet 
files
......................................................................


Patch Set 4:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/5400/4/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

Line 75:   bool poorly_formatted_file_warned;
Presumably you did this to prevent excessive logging, but I think its 
unnecessary. Removing it won't result in excess logging because error messages 
are grouped by their error code (eg. in the shell you'll just see one instance 
of the message followed by '(1 of X similar)', and its better not to clutter up 
this struct.

If you're still worried about excessive logging, you could move the call to 
LogPoorlyFormattedParquetFileWarning to the end of NextRowGroup outside of the 
'while(true)'.

In fact, if you remove this, then LogPoorlyFormattedParquetFileWarning isn't 
really doing much and you could eliminate it and just do the logging inline in 
NextRowGroup.


http://gerrit.cloudera.org:8080/#/c/5400/4/common/thrift/generate_error_codes.py
File common/thrift/generate_error_codes.py:

PS4, Line 315:    
Remove spaces, here and below.

Sorry, I know we're inconsistent about our Python style, but we try to mostly 
follow PEP8:
https://www.python.org/dev/peps/pep-0008/#indentation


http://gerrit.cloudera.org:8080/#/c/5400/4/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

PS4, Line 319: 'NumScannersWithMisalignedRowGroups'
NumScannersWithNoReads


PS4, Line 321: """
Move """ to next line.

Another place where we're inconsistent with our Python style:
https://www.python.org/dev/peps/pep-0008/#documentation-strings


PS4, Line 347: num_scanners_misaligned_row_groups
num_scanners_with_no_reads


PS4, Line 349: excuting
executing


PS4, Line 351: 'num_scanners_misaligned_row_groups'
num_scanners_with_no_reads


PS4, Line 352: """
Move to next line.


PS4, Line 362: 'NumScannersWithMisalignedRowGroups:
NumScannersWithNoReads


-- 
To view, visit http://gerrit.cloudera.org:8080/5400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf48d978383d73efdade733a892e795ebd53c76a
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <atti...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <atti...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to