Dan Hecht has posted comments on this change.

Change subject: IMPALA-4363: Add Parquet timestamp validation
......................................................................


Patch Set 7:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4968/7/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

Line 473:     ErrorMsg msg;
this has a non-trivial destructor, so we should probably only construct it 
after checking NeedsValidation().  Also, why are NeedsValidation() and 
NeedsConversion() mutually exclusive?  Actually, doesn't this break 
--convert_legacy_hive_parquet_utc_timestamps since we'll never take the 
conversion path for timestamp now? (Do we not have tests for that?)  That is, I 
think both NeedsConversion() and NeedsValidation() can be true, and the code 
needs to be adjusted for that.

Also consider tucking lines 475 to 480 inside ValidateSlot() and that way the 
ErrorMsg doesn't have to be passed back and forth, and it'll naturally keep the 
construction after the NeedsValidation() check.


PS7, Line 474: val_ptr
you'll also have to be careful that we pass the right pointer here, depending 
on how you address the comment above.


http://gerrit.cloudera.org:8080/#/c/4968/7/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

Line 147:   /// Verifies that the timestamp date falls into a valid range 
(years 1400..9999).
update this comment to be clear that we also validate time.


PS7, Line 156: l
doesn't this have to be "ll" (long long)?


PS7, Line 160: time_.ticks() < NUMBER_OF_NANOSECONDS_IN_24H
I'm slightly worried about how this interacts with leap seconds.  Are you sure 
this is always illegal?  If it doesn't crash, what behavior did you see (i.e. 
how do you know this is the correct bounds)?


-- 
To view, visit http://gerrit.cloudera.org:8080/4968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <laszlo.g...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovyt...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to