[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-08 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select cast('01:02:59' as timestamp);
This will come back as NULL value.

select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing errors.

Casting from a table will generate similar results.

Testing:
Modified the previous tests related to dateless timestamps.
Added test to read fromtables which are still containing dateless
timestamps and covered timestamp to string path when no date tokens
are requested in the output string.

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Reviewed-on: http://gerrit.cloudera.org:8080/15866
Tested-by: Impala Public Jenkins 
Reviewed-by: Gabor Kaszab 
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/date-parse-util.cc
M be/src/runtime/date-test.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M bin/rat_exclude_files.txt
M common/function-registry/impala_functions.py
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/README
A testdata/data/dateless_timestamps.parq
A testdata/data/dateless_timestamps.txt
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_parquet.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M 
testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
M tests/query_test/test_scanners.py
34 files changed, 278 insertions(+), 231 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Gabor Kaszab: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 14
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-08 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 13: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 13
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 08 Jul 2020 19:32:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-08 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 13: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 13
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 08 Jul 2020 17:09:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-08 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 13:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6108/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 13
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 08 Jul 2020 12:05:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-08 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6527/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 12
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 08 Jul 2020 09:12:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-08 Thread Adam Tamas (Code Review)
Adam Tamas has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select cast('01:02:59' as timestamp);
This will come back as NULL value.

select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing errors.

Casting from a table will generate similar results.

Testing:
Modified the previous tests related to dateless timestamps.
Added test to read fromtables which are still containing dateless
timestamps and covered timestamp to string path when no date tokens
are requested in the output string.

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/date-parse-util.cc
M be/src/runtime/date-test.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M bin/rat_exclude_files.txt
M common/function-registry/impala_functions.py
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/README
A testdata/data/dateless_timestamps.parq
A testdata/data/dateless_timestamps.txt
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_parquet.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M 
testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
M tests/query_test/test_scanners.py
34 files changed, 278 insertions(+), 231 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/12
--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 12
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-08 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15866/11/testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
File 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test:

http://gerrit.cloudera.org:8080/#/c/15866/11/testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test@8
PS11, Line 8: localhost:20500
Note, the dockerised tests fail because the hardcoded localhost.



--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 11
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 08 Jul 2020 06:23:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 11: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6100/


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 11
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 18:17:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-07 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 11: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 11
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 13:08:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6100/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 11
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 13:08:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6511/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 10
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 11:56:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-07 Thread Adam Tamas (Code Review)
Adam Tamas has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15866/7/be/src/exprs/timestamp-functions.cc
File be/src/exprs/timestamp-functions.cc:

http://gerrit.cloudera.org:8080/#/c/15866/7/be/src/exprs/timestamp-functions.cc@154
PS7, Line 154: // The purpose of making this .cc only is to avoid moving the 
code of
> I'd rather say something to indicate that the purpose of making this .cc on
Done


http://gerrit.cloudera.org:8080/#/c/15866/7/be/src/runtime/datetime-simple-date-format-parser.cc
File be/src/runtime/datetime-simple-date-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/15866/7/be/src/runtime/datetime-simple-date-format-parser.cc@56
PS7, Line 56: Tokenize(_DATE_TIME_CTX[i], PARSE);
> It's no longer needed to give the 3rd parameter in case it's true, right? L
Done



--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 10
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 11:27:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-07 Thread Adam Tamas (Code Review)
Adam Tamas has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select cast('01:02:59' as timestamp);
This will come back as NULL value.

select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing errors.

Casting from a table will generate similar results.

Testing:
Modified the previous tests related to dateless timestamps.
Added test to read fromtables which are still containing dateless
timestamps and covered timestamp to string path when no date tokens
are requested in the output string.

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/date-parse-util.cc
M be/src/runtime/date-test.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M bin/rat_exclude_files.txt
M common/function-registry/impala_functions.py
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/README
A testdata/data/dateless_timestamps.parq
A testdata/data/dateless_timestamps.txt
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_parquet.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M 
testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
M tests/query_test/test_scanners.py
34 files changed, 277 insertions(+), 231 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/10
--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 10
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-05 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 8: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 8
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 Jul 2020 05:28:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-06-30 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6462/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 8
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 30 Jun 2020 07:29:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-06-30 Thread Adam Tamas (Code Review)
Adam Tamas has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select cast('01:02:59' as timestamp);
This will come back as NULL value.

select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing errors.

Casting from a table will generate similar results.

Testing:
Modified the previous tests related to dateless timestamps.
Added test to read fromtables which are still containing dateless
timestamps and covered timestamp to string path when no date tokens
are requested in the output string.

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/date-parse-util.cc
M be/src/runtime/date-test.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M bin/rat_exclude_files.txt
M common/function-registry/impala_functions.py
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/README
A testdata/data/dateless_timestamps.parq
A testdata/data/dateless_timestamps.txt
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_parquet.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M 
testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
M tests/query_test/test_scanners.py
34 files changed, 278 insertions(+), 231 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/8
--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 8
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-06-26 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15866/6/be/src/exprs/timestamp-functions.cc
File be/src/exprs/timestamp-functions.cc:

http://gerrit.cloudera.org:8080/#/c/15866/6/be/src/exprs/timestamp-functions.cc@154
PS6, Line 154: void UnixAndFromUnixPrepare(FunctionContext* context,
This function is not included in the header because you didn't want to include 
the whole datatime-common for CastDirection, right? Could you mention it here?


http://gerrit.cloudera.org:8080/#/c/15866/6/be/src/runtime/datetime-simple-date-format-parser.h
File be/src/runtime/datetime-simple-date-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/15866/6/be/src/runtime/datetime-simple-date-format-parser.h@87
PS6, Line 87: CastDirection cast_mode = FORMAT
This still has a default value. (I saw that you actually added the parameters 
on the callsites.)



--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 6
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 26 Jun 2020 13:58:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-06-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6386/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 6
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 22 Jun 2020 09:02:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-06-22 Thread Adam Tamas (Code Review)
Adam Tamas has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 6:

There was a rebase between patch set 5 and 6. The only thing that is changed is 
bin/rat_exclude_files.txt.


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 6
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 22 Jun 2020 08:35:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-06-22 Thread Adam Tamas (Code Review)
Adam Tamas has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select cast('01:02:59' as timestamp);
This will come back as NULL value.

select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing errors.

Casting from a table will generate similar results.

Testing:
Modified the previous tests related to dateless timestamps.
Added test to read fromtables which are still containing dateless
timestamps and covered timestamp to string path when no date tokens
are requested in the output string.

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/date-parse-util.cc
M be/src/runtime/date-test.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M bin/rat_exclude_files.txt
M common/function-registry/impala_functions.py
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/README
A testdata/data/dateless_timestamps.parq
A testdata/data/dateless_timestamps.txt
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_parquet.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M 
testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
M tests/query_test/test_scanners.py
34 files changed, 275 insertions(+), 230 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/6
--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 6
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-06-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 5:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/6385/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 5
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Sun, 21 Jun 2020 11:44:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-06-21 Thread Adam Tamas (Code Review)
Adam Tamas has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 5:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/scalar-expr-evaluator.cc
File be/src/exprs/scalar-expr-evaluator.cc:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/scalar-expr-evaluator.cc@444
PS4, Line 444:   TimestampFunctions::FromUnixPrepare(nullptr, 
FunctionContext::FRAGMENT_LOCAL);
> line too long (93 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/timestamp-functions-ir.cc@167
PS3, Line 167: if (!context->IsArgConstant(1)) {
> Did you find out what this check is for? Don't you break anything with simp
Done


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions-ir.cc@168
PS4, Line 168:   dt_ctx->Reset(reinterpret_cast(fmt.ptr), fmt.len)
> Anyway, if you tokenize the format here then you do the tokenization for ea
Done


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.h
File be/src/exprs/timestamp-functions.h:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.h@104
PS4, Line 104:
> Please use datetime_parse_util::CastDirection
We can't reach datetime_parse_util::CastDirection from here that is why I used 
bool.

I spoke with Csaba Ringhofer about it and in the end we come to the conclusion 
that it is not necessary to be declared here, it is enough in the .cc because 
it is just there to avoid code duplication.


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.cc
File be/src/exprs/timestamp-functions.cc:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.cc@155
PS4, Line 155: CastDirection cast
> Why don't you use the parameter type that could be directly passed to Token
Done


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.cc@167
PS4, Line 167: if (!parse_result) {
 :   delete dt_ctx;
 :   ReportBadFormat(context, 
datetime_parse_util::GENERAL_ERROR, fmt_val, true);
 :   return;
> Please check how to format if-else in Impala
Done


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/runtime/datetime-simple-date-format-parser.h
File be/src/runtime/datetime-simple-date-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/runtime/datetime-simple-date-format-parser.h@88
PS4, Line 88:
> I wouldn't give this a default value unless you have any specific reason ma
Done


http://gerrit.cloudera.org:8080/#/c/15866/4/testdata/data/README
File testdata/data/README:

http://gerrit.cloudera.org:8080/#/c/15866/4/testdata/data/README@503
PS4, Line 503: dateless_timestamps.pa
> The file name is not that verbose. After reading it it doesn't give any hin
Done


http://gerrit.cloudera.org:8080/#/c/15866/4/testdata/data/timestamp_text_test.txt
File testdata/data/timestamp_text_test.txt:

http://gerrit.cloudera.org:8080/#/c/15866/4/testdata/data/timestamp_text_test.txt@1
PS4, Line 1:
> Same comment for the file name as for the parquet file: "dateless_timestamp
Done


http://gerrit.cloudera.org:8080/#/c/15866/4/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/15866/4/tests/query_test/test_scanners.py@391
PS4, Line 391:
> flake8: E501 line too long (98 > 90 characters)
Done



--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 5
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Sun, 21 Jun 2020 10:58:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-06-21 Thread Adam Tamas (Code Review)
Adam Tamas has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select cast('01:02:59' as timestamp);
This will come back as NULL value.

select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing errors.

Casting from a table will generate similar results.

Testing:
Modified the previous tests related to dateless timestamps.
Added test to read fromtables which are still containing dateless
timestamps and covered timestamp to string path when no date tokens
are requested in the output string.

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/date-parse-util.cc
M be/src/runtime/date-test.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M common/function-registry/impala_functions.py
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/README
A testdata/data/dateless_timestamps.parq
A testdata/data/dateless_timestamps.txt
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_parquet.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M 
testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
M tests/query_test/test_scanners.py
33 files changed, 274 insertions(+), 230 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/5
--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 5
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-06-16 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 4:

(8 comments)

Nice work! I feel that this patch is getting into shape. I have some follow-up 
comments but none of the seems super difficult to address.

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/timestamp-functions-ir.cc@167
PS3, Line 167: dt_ctx->Reset(reinterpret_casthttp://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions-ir.cc
File be/src/exprs/timestamp-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions-ir.cc@168
PS4, Line 168: if (!SimpleDateFormatTokenizer::Tokenize(dt_ctx, true, PARSE)) {
Anyway, if you tokenize the format here then you do the tokenization for each 
row of the table even though it would be enough to do once before Impala starts 
reading rows.


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.h
File be/src/exprs/timestamp-functions.h:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.h@104
PS4, Line 104: bool format
Please use datetime_parse_util::CastDirection


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.cc
File be/src/exprs/timestamp-functions.cc:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.cc@155
PS4, Line 155: bool format = true
Why don't you use the parameter type that could be directly passed to 
Tokenize()?


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/timestamp-functions.cc@167
PS4, Line 167: if (format)
 :   parse_result = SimpleDateFormatTokenizer::Tokenize(dt_ctx, 
true, FORMAT);
 : else
 :   parse_result = SimpleDateFormatTokenizer::Tokenize(dt_ctx, 
true, PARSE);
Please check how to format if-else in Impala


http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/runtime/datetime-simple-date-format-parser.h
File be/src/runtime/datetime-simple-date-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/runtime/datetime-simple-date-format-parser.h@88
PS4, Line 88: CastDirection cast_mode = FORMAT
I wouldn't give this a default value unless you have any specific reason making 
it this way.


http://gerrit.cloudera.org:8080/#/c/15866/4/testdata/data/README
File testdata/data/README:

http://gerrit.cloudera.org:8080/#/c/15866/4/testdata/data/README@503
PS4, Line 503: Timestamp_parquet_test
The file name is not that verbose. After reading it it doesn't give any hints 
what timestamp tests it is used for. "dateless_timestamps.parq" would be better 
for this purpose.

nit: No need to include "parquet" into the file name as the extension already 
expresses the format.
nit: The file name itself starts with a lowercase letter.


http://gerrit.cloudera.org:8080/#/c/15866/4/testdata/data/timestamp_text_test.txt
File testdata/data/timestamp_text_test.txt:

http://gerrit.cloudera.org:8080/#/c/15866/4/testdata/data/timestamp_text_test.txt@1
PS4, Line 1: 1996-04-22 10:00:00.43210
Same comment for the file name as for the parquet file: 
"dateless_timestamps.txt" would be more verbose.



--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 4
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 16 Jun 2020 15:12:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-06-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 4:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/6336/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 4
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 16 Jun 2020 14:40:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-06-16 Thread Adam Tamas (Code Review)
Adam Tamas has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 4:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@15
PS3, Line 15: select cast('01:02:59' as timestamp);
> This should return parse error because the format itself doesn't contain da
Done


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@17
PS3, Line 17:
> nit: dot at the end of a sentence.
Done


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@17
PS3, Line 17:
> nit: This should be plural in my opinion as it stands for 2 separate exampl
Done


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@28
PS3, Line 28: timesta
> I still don't see tests where you use a file populated by dateless timestam
Done


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@29
PS3, Line 29: are requ
> nit: start a sentence with capital letter.
Done


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@29
PS3, Line 29: utpu
> nit: tests
Done


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@30
PS3, Line 30:
> It's just a matter of your point of view which direction you call backwards
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/expr-test.cc@4507
PS3, Line 4507: 1.00
> Is this related to your change or does it come with a rebase you've made in
Sorry for this one, I really did a rebase between the commits.


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/expr-test.cc@7536
PS3, Line 7536: H:mm:ss'
> Actually, we should get an error when parsing the format and see that there
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.h
File be/src/runtime/datetime-simple-date-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.h@88
PS3, Line 88:
> Can you re-use the CastDirection type similarly as it is used in e.g. ISO S
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.cc
File be/src/runtime/datetime-simple-date-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.cc@179
PS3, Line 179:   if (cast_mode == PARSE) return (dt_ctx->has_date_toks);
> Please don't leave commented code
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.cc@180
PS3, Line 180:   return (dt_ctx->has_date_toks || dt_ctx->has_time_toks);
> This seems logically correct but not that readable. Additionally, this is n
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc@610
PS3, Line 610: (TimestampTC("-MM-dd HH:m:ss", "2020-05-11 1:24:34", 
false, true))
> Here the intention was to also cover the case when the minute part of the i
Here we are checking the format tokens not the given string themselves which 
has a short minute part("-MM-dd HH:m:ss")

We use the short form here because we expect it to fail.


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc@612
PS3, Line 612: (TimestampTC("-MM-dd HH:mm:s", "2020-05-11 14:24:34", false, 
true, true, 2020,
 : 05, 11, 14, 24, 34))
> Similarly to the other tests, please add another one where the second part
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc@679
PS3, Line 679: mestampFormatTC(1382337792, "yyy
> nit: This sentence sounds a bit weird for me.
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc@695
PS3, Line 695: Test padding on double di
> nit: same as above
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/15866/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test@2976
PS3, Line 2976: select to_timestamp('01:01:01', 'HH:mm:ss');
> As mentioned elsewhere, this should result a parse error as the format does
Done


http://gerrit.cloudera.org:8080/#/c/15866/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test@2981
PS3, Line 2981: select to_timestamp('01:01:01.123', 'HH:mm:ss.SSS');
> same as above
Done



--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 

[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-06-16 Thread Adam Tamas (Code Review)
Adam Tamas has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select cast('01:02:59' as timestamp);
This will come back as NULL value.

select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing errors.

Casting from a table will generate similar results.

Testing:
Modified the previous tests related to dateless timestamps.
Added test to read fromtables which are still containing dateless
timestamps and covered timestamp to string path when no date tokens
are requested in the output string.

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/date-parse-util.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M common/function-registry/impala_functions.py
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/README
M testdata/data/lazy_timestamp.csv
A testdata/data/timestamp_parquet_test.parq
A testdata/data/timestamp_text_test.txt
M testdata/workloads/functional-query/queries/QueryTest/date.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_parquet.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M 
testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
M tests/query_test/test_scanners.py
30 files changed, 263 insertions(+), 217 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/4
--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 4
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-06-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/scalar-expr-evaluator.cc
File be/src/exprs/scalar-expr-evaluator.cc:

http://gerrit.cloudera.org:8080/#/c/15866/4/be/src/exprs/scalar-expr-evaluator.cc@444
PS4, Line 444:   TimestampFunctions::UnixAndFromUnixPrepare(nullptr, 
FunctionContext::FRAGMENT_LOCAL, true);
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/15866/4/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/15866/4/tests/query_test/test_scanners.py@391
PS4, Line 391: a
flake8: E501 line too long (98 > 90 characters)



--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 4
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 16 Jun 2020 13:56:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-05-27 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 3:

(18 comments)

Thanks for taking care of my comments! In general the casting part of this 
change is getting into shape in my opinion. However, I miss the part where the 
timestamp is not a result of a cast() or to_timestamp() or such, and it comes 
from reading a table. e.g. I don't see now any changes that could prevent us 
reading dateless timestamp when we run e.g. "select ts_col from table_name".

We discussed offline that some of the file formats are unable to hold these 
values because only Hive can write them and there Hive doesn't allow writing 
dateless TSs but for the ones where Impala can write them it should at leas be 
covered by tests that a previously written file with dateless timestamps can be 
read after this changes and the dateless TSs will be nulls.

http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@15
PS3, Line 15: select to_timestamp('01:01:01', 'HH:mm:ss');
This should return parse error because the format itself doesn't contain date 
just time.


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@17
PS3, Line 17: values
nit: dot at the end of a sentence.


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@17
PS3, Line 17: This
nit: This should be plural in my opinion as it stands for 2 separate examples.


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@28
PS3, Line 28: Testing
I still don't see tests where you use a file populated by dateless timestamps 
(created by an Impala pre this change) and try to read it with Impala 
containing your changes.


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@29
PS3, Line 29: modified
nit: start a sentence with capital letter.


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@29
PS3, Line 29: test
nit: tests


http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@30
PS3, Line 30: backwards
It's just a matter of your point of view which direction you call backwards so 
I wouldn't use that word here. Simply "Added test to cover timestamp to string 
path when no date tokens are requested in the output string" or something like 
that.


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/expr-test.cc@4507
PS3, Line 4507: 1.00
Is this related to your change or does it come with a rebase you've made in the 
background?
If the second, then a general comment is to try to push rebase related changes 
to gerrit separately from your actual changes to make the reviewers life easier.


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/expr-test.cc@7536
PS3, Line 7536: HH:mm:ss
Actually, we should get an error when parsing the format and see that there is 
no date part.


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.h
File be/src/runtime/datetime-simple-date-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.h@88
PS3, Line 88: parse
Can you re-use the CastDirection type similarly as it is used in e.g. ISO SQL 
format tokenizer?


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.cc
File be/src/runtime/datetime-simple-date-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.cc@179
PS3, Line 179:   //return (dt_ctx->has_date_toks || dt_ctx->has_time_toks);
Please don't leave commented code


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.cc@180
PS3, Line 180:   return parse?(dt_ctx->has_date_toks):(dt_ctx->has_date_toks || 
dt_ctx->has_time_toks);
This seems logically correct but not that readable. Additionally, this is not 
how we format ternary operators in Impala:
 - Use space around the question marks
 - As the condition is simple no need to surround with brackets.
 - Use space around the colon

Anyway, I feel that rewriting to something like this would help a bit:

if (parse) return dt_ctx->has_date_toks;
return dt_ctx->has_date_toks || dt_ctx->has_time_toks;

Or something similar.


http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc@610
PS3, Line 610: (TimestampTC("-MM-dd HH:m:ss", "2020-05-11 1:24:34", 
false, true))
Here the intention was to also cover the case when the minute part of the input 
is 1 char long, right? Instead this tests that the hour part is 1 long.
Additionally, shouldn't you also give params to test the expected result?



[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-05-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6136/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 3
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 25 May 2020 09:53:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-05-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6135/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 2
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 25 May 2020 09:46:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-05-25 Thread Adam Tamas (Code Review)
Adam Tamas has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp);
This will come back as NULL values

select cast(string_col as timestamp) from table_name;
Casting from a table with a similar method like this
will also give back NULL values similar to the above example.

select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing error which is:
ERROR: PARSE ERROR: No date tokens provided.

Testing:
modified the previous test related to dateless timestamps and
added new tests to check if the conversions backwards(timestamp to
string) is still working

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/date-parse-util.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M 
testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
20 files changed, 131 insertions(+), 198 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/3
--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 3
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-05-25 Thread Adam Tamas (Code Review)
Adam Tamas has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15866


Change subject: IMPALA-9531: Dropped support for dateless timestamps
..

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp);
This will come back as NULL values

select cast(string_col as timestamp) from table_name;
Casting from a table with a similar method like this
will also give back NULL values similar to the above example.

select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing error which is:
ERROR: PARSE ERROR: No date tokens provided.

Testing:
modified the previous test related to dateless timestamps and
added new tests to check if the conversions backwards(timestamp to
string) is still working

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/runtime/date-parse-util.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M 
testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
20 files changed, 131 insertions(+), 198 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/2
--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 2
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-05-25 Thread Adam Tamas (Code Review)
Adam Tamas has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 2:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/15866/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15866/1//COMMIT_MSG@7
PS1, Line 7: d
> no need for quotation marks
Done


http://gerrit.cloudera.org:8080/#/c/15866/1//COMMIT_MSG@10
PS1, Line 10: During dateless timestamp casts if the format doesn't contain
: date part we get an error during tokenization of the format
> I'd rather mention that when a format is provided for a cast then if the fo
Done


http://gerrit.cloudera.org:8080/#/c/15866/1//COMMIT_MSG@14
PS1, Line 14: s:
> I'd use this example:
Done


http://gerrit.cloudera.org:8080/#/c/15866/1//COMMIT_MSG@15
PS1, Line 15: estam
> I'd use an example as '01:02:59'
Done


http://gerrit.cloudera.org:8080/#/c/15866/1//COMMIT_MSG@16
PS1, Line 16: select cast('01:02:59' as timestamp);
> I'd also add an example for to_timestamp(input_str, format_str)
Done


http://gerrit.cloudera.org:8080/#/c/15866/1//COMMIT_MSG@17
PS1, Line 17: This will come back as NULL values
> Another example:
Done


http://gerrit.cloudera.org:8080/#/c/15866/1//COMMIT_MSG@23
PS1, Line 23: select cast('01:02:59' as timestamp format 'HH12:MI:SS');
: select cast('12 AM' as timestamp FORMAT 'AM.HH12');
: These will come back with a parsing error which is:
: ERROR: PARSE ERROR: No date tokens provid
> No need to list the modified test files here. If there is something in part
Done


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/exprs/expr-test.cc@6777
PS1, Line 6777: TestValue("dayofweek(cast('2011-12-22' as timestamp))", 
TYPE_INT, 5);
  :   TestValue("dayofweek(cast('2011-12-24' as timestamp))", 
TYPE_INT, 7);
  :   TestStringValue(
  :   "to_date(cast('2011-12-22 09:10:11.12345678' as 
timestamp))", "2011-12-22");
  :
> these are the same tests as in L6759-6762. No need to duplicate them.
Done


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/exprs/expr-test.cc@7155
PS1, Line 7155:
  :   TestStringValue(
> These 2 tests lost their purpose with your change, A comment in L7151 refer
If we call trunc() on timestamp is gives back everything that stands before 
that value and the value that stands in the given place, as I see we can no 
longer create a valid test which will give back NULL value since we dropped the 
timeless timestamps.

I will remove these tests since they are indeed misleading and there are other 
test case where we trunc() a NULL value.


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/datetime-iso-sql-format-tokenizer.cc
File be/src/runtime/datetime-iso-sql-format-tokenizer.cc:

http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@195
PS1, Line 195:
> I'd move this check after L-197-202. We exit there if cast_mode_ == FORMAT
Done


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/datetime-simple-date-format-parser.h
File be/src/runtime/datetime-simple-date-format-parser.h:

http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/datetime-simple-date-format-parser.h@77
PS1, Line 77: DEFAULT_SHORT_DATE_T
> Is this needed anymore?
Done


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/datetime-simple-date-format-parser.cc
File be/src/runtime/datetime-simple-date-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/datetime-simple-date-format-parser.cc@38
PS1, Line 38: DEFAULT_SHORT_DATE_T
> Is this still needed?
Done


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/datetime-simple-date-format-parser.cc@101
PS1, Line 101: bool accept_time_toks, bool parse) {
> It would be nice to see tests for to_timestamp as well. I haven't seem any
I looked around a bit and as I seen there was no test for to_timestamp() with 
dateless conversions, but I tried it out with manual tests and it is no longer 
accepting dateless formats.
I will look around and add a failing test with dateless timestamp to the 
to_timestamp() tests.


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/datetime-simple-date-format-parser.cc@344
PS1, Line 344: // Check if this string starts with a date
> I don't think that this check is needed.
Done


http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/15866/1/be/src/runtime/timestamp-test.cc@a617
PS1, Line 617:
 :
 :
 :
 :
 :
 :
 :
 :
 :
 :
 :
 :
 :
>