[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, -12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
1-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 1<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Reviewed-on: http://gerrit.cloudera.org:8080/7954
Reviewed-by: Lars Volker 
Tested-by: Impala Public Jenkins
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
8 files changed, 146 insertions(+), 10 deletions(-)

Approvals:
  Lars Volker: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 18
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 17: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 17
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Sat, 21 Oct 2017 03:13:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-20 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 17: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 17
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Fri, 20 Oct 2017 23:13:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 17:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1360/


--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 17
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Fri, 20 Oct 2017 23:13:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-20 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 17:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7954/17/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/7954/17/be/src/exprs/expr-test.cc@5005
PS17, Line 5005:   // Timestamp conversions of "dateless" times should return 
null (and not crash,
This part is from different commit that was applied because of the rebase.



--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 17
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Fri, 20 Oct 2017 12:40:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-20 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Alex Behm, Dan Hecht, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#16).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, -12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
1-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 1<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
8 files changed, 146 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/16
--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 16
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-18 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 15:

> Csaba, are those failing tests specifically require a timestamp
 > that issues such a warning? If not, I suggest we just change the
 > timestamps of those tests in a way that preserves test coverage and
 > avoids the warnings problem.
Yes, those tests work with "non timestamp" strings by design.


--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Wed, 18 Oct 2017 14:28:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 15:

Csaba, are those failing tests specifically require a timestamp that issues 
such a warning? If not, I suggest we just change the timestamps of those tests 
in a way that preserves test coverage and avoids the warnings problem.


--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Fri, 13 Oct 2017 18:16:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-13 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@139
PS9, Line 139:   }
> Literal::Literal(ColumnType type, double v) also uses TimestampValue::FromS
I have found the cause of this issue, see 
https://issues.apache.org/jira/browse/IMPALA-5664?focusedCommentId=16203482=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16203482



--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Fri, 13 Oct 2017 13:05:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-12 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 15:

I have checked the output of jenkins. There are 3 problematic front end tests:
AnalyzeDDLTest.TestCreateManagedKuduTable error, (null pointer exception), side 
effect of adding warning to invalid string->timestamp casts

PlannerTest.testKuduSelectivity: failure, not 100% sure that it is related to 
the patch

PlannerTest.testPartitionPruning: failure, side effect of adding warning to 
invalid string->timestamp casts

As a short term solution, the warning can be removed from the string->timestamp 
conversion, as it is not related to the original issue.


--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 12 Oct 2017 15:59:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 15: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1323/


--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Tue, 10 Oct 2017 21:46:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-10 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 15: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Tue, 10 Oct 2017 17:48:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 15:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1323/


--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 15
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Tue, 10 Oct 2017 17:48:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-10 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 11:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc
File be/src/exprs/cast-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc@277
PS11, Line 277:   }
> I mean the warning messages that you're adding here have no verification.
I have added tests to exprs.test.
There are files called "out-of-range-timestamp-abort-on-error.test" and 
"out-of-range-timestamp-continue-on-error.test", but they seem to be Parquet 
specific.


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc@670
PS11, Line 670: *
> Yes, we should follow the style guide so we have a consistent style, which
Done


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc@116
PS11, Line 116: (
> still not fixed:
Done



--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Tue, 10 Oct 2017 13:55:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-10 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#14).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, -12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
1-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 1<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Also added logging to "cast to timestamp" functions.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
9 files changed, 183 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/14
--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 14
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-10 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#13).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, -12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
1-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 1<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Also added logging to "cast to timestamp" functions.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
8 files changed, 81 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/13
--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 13
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-06 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 11:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc
File be/src/exprs/cast-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc@277
PS11, Line 277:   }
> They are exercised by different parts of be/exprs/expr-test.cc .I have sear
I mean the warning messages that you're adding here have no verification.


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc@670
PS11, Line 670: *
> I can change this, but I prefer it this way actually. What is the rule, sho
Yes, we should follow the style guide so we have a consistent style, which 
overall makes the code more readable.  I'm sure everyone has something that 
they dislike about the style, but we need consistency.


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.h@43
PS11, Line 43: 
> This comment was just wrong, we state <= -12-31 or <1-01-01 everywh
Okay, thanks for confirming.


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc@116
PS11, Line 116: (
> Done
still not fixed:
if (HasDate()...



--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Fri, 06 Oct 2017 17:27:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-06 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 11:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc
File be/src/exprs/cast-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc@277
PS11, Line 277:   }
> all of these new if-stmts need end-to-end tests to cover. they aren't exerc
They are exercised by different parts of be/exprs/expr-test.cc .I have searched 
for "TestIsNull.*as timestamp", and the existing tests cover every code line, 
but not every type instantiation for CAST_TO_SUBSECOND_TIMESTAMP / 
CAST_TO_TIMESTAMP, only BIGINT/DECIMAL/DOUBLE. Actually most integer types can 
not represent values that are outside the valid range, because it needs >32 bit.

This design (BE unit tests call FE to parse expressions) seems strange to me, 
but there is probably a valid reason why it was not done in FE or E2E.


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/timestamp-functions-ir.cc@139
PS11, Line 139:   }
> also needs test
Done


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc@670
PS11, Line 670: *
> nit: missing space here and below
I can change this, but I prefer it this way actually. What is the rule, should 
we follow the style guide even if the contributor consciously ignored it?  It 
doesn`t matter in this case, but it would matter with short variable names, 
e.g. I find a*a + 2*a*b much more readable than a * a + 2 * a * b.


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.h@43
PS11, Line 43: 
> is that what the documentation says? and did this code change affect the up
This comment was just wrong, we state <= -12-31 or <1-01-01 everywhere 
else. This limitation comes from boost::gregorian, and this patch has no effect 
on it.


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc@116
PS11, Line 116: (
> style
Done



--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Fri, 06 Oct 2017 13:00:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-06 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#12).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, -12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
1-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 1<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Also added logging to "cast to timestamp" functions.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
8 files changed, 81 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/12
--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-05 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 11:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc
File be/src/exprs/cast-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/cast-functions-ir.cc@277
PS11, Line 277:   }
all of these new if-stmts need end-to-end tests to cover. they aren't exercised 
by the unit test since that exercises the layer below.


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/exprs/timestamp-functions-ir.cc@139
PS11, Line 139:   }
also needs test


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-test.cc@670
PS11, Line 670: *
nit: missing space here and below


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.h@43
PS11, Line 43: 
is that what the documentation says? and did this code change affect the upper 
bound or was this comment just wrong?


http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/7954/11/be/src/runtime/timestamp-value.cc@116
PS11, Line 116: (
style



--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 05 Oct 2017 17:34:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-04 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#11).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, -12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
1-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 1<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Also added logging to "cast to timestamp" functions.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
7 files changed, 66 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/11
--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 11
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-04 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 10:

The fe/ files should not be in the patch, I just needed them to fix a compile 
error.


--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 10
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Wed, 04 Oct 2017 15:15:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-04 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 9:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@136
PS9, Line 136: if(!tv.HasDateAndTime()){
> here an elsewhere - please follow the style used by the rest of impala.
Done


http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@137
PS9, Line 137: UnixMicrosToUtcTimestamp
> is that meaningful to the user?
I have changed it to its query name.


http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@139
PS9, Line 139:   }
> are the ones you added the only places where your new validation might prod
Literal::Literal(ColumnType type, double v) also uses 
TimestampValue::FromSubsecondUnixTime, but there is no function context there, 
and I do not know how to call that constructor. If I put a double to a place 
where a timestamp is expected, I get an analyses exception.

I have also noticed someting strange:
select timestamp_cmp(timstamp_col, cast(cast(-17987443200.1 as double) as 
timestamp)) from table;
The table has 3 rows and 4 "UDF WARNING: Could not convert -17987443200.1 to 
timestamp" are printed.

It would make more sense to me to have 1 call to cast (if the casts with 
constant arguments would be optimized away) or 3 (if no optimization takes 
place, and the cast is called for every row).

Does Impala try to optimize functions calls with constant parameters?


http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@751
PS9, Line 751: datetime.date().year();
> it still seems like this is doing the same validation. should we remove thi
It is the same check, but I would prefer to leave it as it is for now - it 
leads to a nice warning message, and the try-catch block is needed anyway 
because of AddInterval. It could be done in another commit + jira like clean 
up/speed up timestamp functions.



--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Wed, 04 Oct 2017 15:09:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-04 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#10).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, -12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
1-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 1<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Also added logging to "cast to timestamp" functions.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
9 files changed, 76 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/10
--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 10
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-02 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 9:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@136
PS9, Line 136: if(!tv.HasDateAndTime()){
here an elsewhere - please follow the style used by the rest of impala.


http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@137
PS9, Line 137: UnixMicrosToUtcTimestamp
is that meaningful to the user?


http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@139
PS9, Line 139:   }
are the ones you added the only places where your new validation might produce 
a !HasDateAndTime() TimestampValue?


http://gerrit.cloudera.org:8080/#/c/7954/9/be/src/exprs/timestamp-functions-ir.cc@751
PS9, Line 751: datetime.date().year();
it still seems like this is doing the same validation. should we remove this an 
instead do the result_value.HasDateAndTime() check here? Or is this really a 
different case?



--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Mon, 02 Oct 2017 21:51:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-02 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h@86
PS2, Line 86: }
> But what we need is to log to the warning log, like the code in timestamp-f
I have added some logging to warning. The string -> timestamp cast is not 
related to this issue, but I think it is very useful to warn the user about the 
invalid formatting.

I have left the logging in TimestampValue::Validate, maybe it will be useful 
for postmortem analysis one day.


http://gerrit.cloudera.org:8080/#/c/7954/8/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

http://gerrit.cloudera.org:8080/#/c/7954/8/be/src/runtime/timestamp-value.cc@116
PS8, Line 116: /// The time zone of the resulting ptime is local time. This is 
called by
> nit: for style consistency combine lines 115 & 116.
Done



--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Mon, 02 Oct 2017 18:00:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-10-02 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#9).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, -12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
1-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 1<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Also added logging to "cast to timestamp" functions.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
7 files changed, 66 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/9
--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 9
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-29 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7954/4/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/4/be/src/runtime/timestamp-value.h@173
PS4, Line 173:
> i think it would be best to avoid running the constructor given these thing
Done


http://gerrit.cloudera.org:8080/#/c/7954/4/be/src/runtime/timestamp-value.h@287
PS4, Line 287:
> nit: extraneous space
Done


http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h@86
PS2, Line 86: }
> Logging in the caller is fine but I don't see that added to your diff.
I have added some logging, but not on the caller side.



--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Fri, 29 Sep 2017 16:40:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-27 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 5:

Ping?


--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Wed, 27 Sep 2017 17:20:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-22 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/exprs/timestamp-functions-ir.cc@332
PS2, Line 332:   DCHECK(ts_value.IsValidDate());
> why does this DCHECK no longer make sense?  are we saying it's trivially tr
Yes, my idea was that range check must be done at construction, and it is 
enough to check HasDate() in other timestamp related functions.



--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Fri, 22 Sep 2017 13:35:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-21 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/exprs/timestamp-functions-ir.cc@332
PS2, Line 332:   StringVal result(context, 10);
why does this DCHECK no longer make sense?  are we saying it's trivially true 
now after check HasDate() and with the checks you've added in this commit?


http://gerrit.cloudera.org:8080/#/c/7954/4/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/4/be/src/runtime/timestamp-value.h@173
PS4, Line 173: boost::gregorian::date
i think it would be best to avoid running the constructor given these things 
can throw exceptions, so let's use 'const &' here.


http://gerrit.cloudera.org:8080/#/c/7954/4/be/src/runtime/timestamp-value.h@287
PS4, Line 287:
nit: extraneous space


http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h@86
PS2, Line 86:   /// Constructors that parse from a date/time string. See 
TimestampParser for details
> I think it would be better to do the logging on the caller side, because de
Logging in the caller is fine but I don't see that added to your diff.



--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 5
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 21 Sep 2017 20:50:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-21 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7954/5/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/5/be/src/runtime/timestamp-value.h@173
PS5, Line 173:
This became static to enable other functions to check whether their date is 
valid or not without actually creating a TimestampValue or calling year() to 
generate an exception. I think it would be better if there was a single 
function that does this check, so if 1400 will be changed for some reason ( 
IMPALA-2009 ), there will be no need to change many parts of the code.


http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h@86
PS2, Line 86: }
> One purpose of FunctionContext is to be the interface back into impala's ru
I think it would be better to do the logging on the caller side, because 
depending on the caller, the format for the log can be different, e.g. unix 
timestamp vs -MM-DD string.



--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 21 Sep 2017 17:29:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-21 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#4).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, -12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
1-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 1<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
5 files changed, 47 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/4
--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-21 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7954 )

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-test.cc@637
PS2, Line 637: Sub-second F
> The problem is only with the functions that take sub-second resolution, rig
Done


http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-test.cc@685
PS2, Line 685:
> nit: typo
Done



--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 21 Sep 2017 15:43:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-21 Thread Csaba Ringhofer (Code Review)
Hello Lars Volker, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/7954

to look at the new patch set (#3).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, -12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
1-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 1<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-stats.inline.h
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
6 files changed, 50 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/3
--
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-Change-Number: 7954
Gerrit-PatchSet: 3
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS2, Line 80: !date_.is_special()
> I looked at ScalarColumnReader::ValidateSlot(), and n
My comment was just that HasDate() == !date_.is_special(), and using HasDate() 
would make it easier to grep all the places we check for this condition.

I'm not sure I follow your comment about public functions.  Since the scanner 
does a cast, it explicitly checks IsValidDate(). It is kinda gross, though.


PS2, Line 80: !date_.is_special()
> I think that ScalarColumnReader::ValidateSlot() is ac
Thanks for finding that. Let's discuss on the jira.


Line 86: }
> About giving a warning to the user: is it ok to log in low level classes li
One purpose of FunctionContext is to be the interface back into impala's 
runtime for UDFs (and expressions in general). It'd be best to leak it outside 
of exprs, udf, and codegen.

So, I think it'd make sense to have a static wrapper around the constructor 
where we do this validation that lives in the exprs code, and that can have 
access to FunctionContext to do the logging, and construct the TimestampValue 
only for well formed values.  (Then this constructor could do a dcheck for 
that).


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-15 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS2, Line 80: !date_.is_special()
> I looked at ScalarColumnReader::ValidateSlot(), and n
I think that ScalarColumnReader::ValidateSlot() is 
actually a bit too strict, which lead to inconsistency between text and parquet 
files. I have created a ticket for this:
https://issues.apache.org/jira/browse/IMPALA-5942

We should decide the correct behavior in these cases, before continuing with 
this ticket.


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-15 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS2, Line 80: !date_.is_special()
> how about HasDate()?  But also, should IsValidDate() be checking HasDate()?
I looked at ScalarColumnReader::ValidateSlot(), and now I 
see that TimestampValue is created via reinterpret_cast on buffer pointers - 
this means that its constructor is skipped, so it is not possible to force 
validity in the public functions.

>That code looks bogus if date_.is_special(), 
Special values lie outside the valid time range, so IsValidDate can only return 
true if is_special() is false.

TimestampValue can have not_a_date_time in date_ or time_, it is only invalid, 
if both are not_a_date_time. I am not sure, what will/should happen in these 
cases.


Line 86: }
> why is that that the year() calls in timestamp-functions-ir.cc aren't suffi
About giving a warning to the user: is it ok to log in low level classes like 
TimestampValue? AddSub calls FunctionContext::AddWarning, while 
TimestampValue`s functions do not receive FunctionContext arguments.

If TimestampValue functions cannot log, then I will have to look for their 
callers and add checks + warnings there.

Note that I am not familiar with Impala`s logging system yet.


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

PS2, Line 637: FromUnixTime
The problem is only with the functions that take sub-second resolution, right? 
I think that's important to clarify -- otherwise it's not clear what we mean by 
rounding.


PS2, Line 685: n
nit: typo


http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS2, Line 80: !date_.is_special()
how about HasDate()?  But also, should IsValidDate() be checking HasDate()? 
That code looks bogus if date_.is_special(), so the call from 
ScalarColumnReader::ValidateSlot() seems dangerous (if 
the parquet file contained a corrupted value that corresponded to 
not_a_date_time).


PS2, Line 81: ptime
> There was no exception thrown at the time of creation, but much later, when
We don't want to use exceptions in impala. We limit their use to when 
interacting with other libraries. So not throwing an exception here is the 
right thing.


Line 86: }
why is that that the year() calls in timestamp-functions-ir.cc aren't 
sufficient, e.g.:

// Validate that the ptime is not "special" (ie not_a_date_time) and has a 
valid year.
// If validation fails, an exception is thrown.
datetime.date().year();

Centralizing where we do this validation seems like a good idea, but unlike the 
other code that attempts to do this kind of thing, e.g. 
TimestampFunctions::AddSub(), we don't get a warning. The user should probably 
get a warning in this case.
2) If we're going to do the validation in a central place, we should try to 
clean up the other code that does the year() calls (assuming it's trying to 
validate the same thing).


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-14 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 2: Code-Review+1

Thanks for fixing this. LGTM, let's see what others say.

-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-14 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS2, Line 81: ptime
> Have you considered catching the error when the invalid ptime object is cre
There was no exception thrown at the time of creation, but much later, when the 
TimestampValue::ToString() function was called. It would be possible to throw 
an exception here, and add try+catch to every function where this constructor 
is called, but it would be a bit more work.


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-12 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS2, Line 81: ptime
Have you considered catching the error when the invalid ptime object is 
created? What is the benefit of doing it here?


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-07 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..


Patch Set 2:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/7954/1//COMMIT_MSG
Commit Message:

PS1, Line 7: Impala
> nit: Capitalize Impala (it's a name)
Done


PS1, Line 9: FromSubsecondUnixTime
> We usually refer to functions as Function().
Done


PS1, Line 11: 1400-01-01 00:00:00
> You could write dates in the more standard format (1400-01-01 00:00:00).
Done


PS1, Line 13: The maximum
: case, -12-31 59:59:59 is a bit different, because as I 
understand,
: with nanosecond precision posix times, the maximum value is 
actually
: 1-01-01. 00:00:00 minus 1 nanosec.
> Can you add tests with sub-second deltas around the upper bound, too?
Done


http://gerrit.cloudera.org:8080/#/c/7954/1/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

Line 637:   // FromUnixTime functions incorrectly accepted the last second of 
1399 as valid,
> Please wrap lines at 90 characters. Please follow the surrounding code for 
Done


PS1, Line 637: he last second of 1399 as valid,
> Please outline in the comment briefly why the last second needs special att
Done


PS1, Line 637: ns i
> Extra word?
Done


Line 640:   MIN_DATE_AS_UNIX_TIME - 0.1).HasDate());
> While you are here, can you also add tests for the exact beginning of the l
Done


http://gerrit.cloudera.org:8080/#/c/7954/1/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS1, Line 80: s_special() && UNLI
> Can you add a comment explaining why this check is necessary?
Done


Line 80: if(!date_.is_special() && UNLIKELY(!IsValidDate())){
> Please use spaces instead of tabs.
Done


http://gerrit.cloudera.org:8080/#/c/7954/2/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

Line 84:   time_ = 
boost::posix_time::time_duration(boost::posix_time::not_a_date_time);
The first patch was incorrect - it did not set time_ to not_a_date_time, but 
00:00:00 instead. This caused the last second of year 1399 to be inconsistent 
with other dates before year 1400.


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash Impala

2017-09-07 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has uploaded a new patch set (#2).

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash Impala
..

IMPALA-5664: Unix time to timestamp conversions may crash Impala

TimestampValue::FromSubsecondUnixTime() and UtcFromUnixTimeMicros()
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400-01-01 00:00:00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, -12-31 59:59:59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
1-01-01. 00:00:00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos() can create problematic TimestampValues
both <1400 and 1<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

To fix this, the posix times are checked in the constructor of
TimestampValue, and if it is outside the valid interval,both time_
and date_ are set to not_a_date_time.

Test:
select cast(-17987443200-0.1 as timestamp);
This query no longer crashes, but returns NULL, similarly to other
< 1400 timestamps.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
2 files changed, 34 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash impala (boost exception)

2017-09-05 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash 
impala (boost exception)
..


Patch Set 1:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/7954/1//COMMIT_MSG
Commit Message:

PS1, Line 7: impala
nit: Capitalize Impala (it's a name)


PS1, Line 9: FromSubsecondUnixTime
We usually refer to functions as Function().


PS1, Line 11: 1400.01.01 00.00.00
You could write dates in the more standard format (1400-01-01 00:00:00).


PS1, Line 13: The maximum
: case, .12.31 59.59.59 is a bit different, because as I 
understand,
: with nanosecond precision posix times, the maximum value is 
actually
: 1.01.01. 00.00.00 minus 1 nanosec.
Can you add tests with sub-second deltas around the upper bound, too?


PS1, Line 18: TimestampValue::FromUnixTimeNanos can create problematic 
TimestampValues
: both <1400 and 1<=.
Doens't this change fix this?


Line 22: HasDate/HasTime is true, then it really is a valid timestamp.
Can you include in the commit message how you fixed it and how you test it? 
Generally speaking, it's usually good to structure commit messages for bug 
fixes as "Problem, Solution, Tests".


http://gerrit.cloudera.org:8080/#/c/7954/1/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

Line 637:   // last second of 1399 need can be special, see 
https://issues.apache.org/jira/browse/IMPALA-5664
Please wrap lines at 90 characters. Please follow the surrounding code for 
indents and capitalization of comments.


PS1, Line 637: need
Extra word?


PS1, Line 637: 1399
Please state of what.


PS1, Line 637: https://issues.apache.org/jira/browse/IMPALA-5664
Please outline in the comment briefly why the last second needs special 
attention.


Line 640:   
EXPECT_FALSE(TimestampValue::FromUnixTimeNanos(MIN_DATE_AS_UNIX_TIME, 
-NANOS_PER_MICRO*100).HasDate());
While you are here, can you also add tests for the exact beginning of the last 
second?

We also should have tests that add a subsecond interval for symmetry.

Then please also add similar tests for the maximum date.


http://gerrit.cloudera.org:8080/#/c/7954/1/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS1, Line 80: !date_.is_special()
Can you add a comment explaining why this check is necessary?


Line 80:  if(UNLIKELY(!date_.is_special() && !IsValidDate())) *this = 
TimestampValue();
Please use spaces instead of tabs.


-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5664: Unix time to timestamp conversions may crash impala (boost exception)

2017-09-04 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7954

Change subject: IMPALA-5664: Unix time to timestamp conversions may crash 
impala (boost exception)
..

IMPALA-5664: Unix time to timestamp conversions may crash impala (boost 
exception)

TimestampValue::FromSubsecondUnixTime and UtcFromUnixTimeMicros
are incorrect only in case of the last second of 1399, because these
sub-second values are rounded first towards 1400.01.01 00.00.00, which
is accepted as a valid date, and the sub-second part is subtracted
afterwards, leading to a date outside the valid interval. The maximum
case, .12.31 59.59.59 is a bit different, because as I understand,
with nanosecond precision posix times, the maximum value is actually
1.01.01. 00.00.00 minus 1 nanosec.

TimestampValue::FromUnixTimeNanos can create problematic TimestampValues
both <1400 and 1<=.

These timestamps can cause problems, because most code assumes that if
HasDate/HasTime is true, then it really is a valid timestamp.

Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
---
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
2 files changed, 10 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/7954/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I77b2f6284d3a597f57e61c17a67c959eff9e38ff
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Csaba Ringhofer