Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 )
Change subject: IMPALA-9531: Dropped support for dateless timestamps ...................................................................... Patch Set 3: (18 comments) Thanks for taking care of my comments! In general the casting part of this change is getting into shape in my opinion. However, I miss the part where the timestamp is not a result of a cast() or to_timestamp() or such, and it comes from reading a table. e.g. I don't see now any changes that could prevent us reading dateless timestamp when we run e.g. "select ts_col from table_name". We discussed offline that some of the file formats are unable to hold these values because only Hive can write them and there Hive doesn't allow writing dateless TSs but for the ones where Impala can write them it should at leas be covered by tests that a previously written file with dateless timestamps can be read after this changes and the dateless TSs will be nulls. http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@15 PS3, Line 15: select to_timestamp('01:01:01', 'HH:mm:ss'); This should return parse error because the format itself doesn't contain date just time. http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@17 PS3, Line 17: values nit: dot at the end of a sentence. http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@17 PS3, Line 17: This nit: This should be plural in my opinion as it stands for 2 separate examples. http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@28 PS3, Line 28: Testing I still don't see tests where you use a file populated by dateless timestamps (created by an Impala pre this change) and try to read it with Impala containing your changes. http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@29 PS3, Line 29: modified nit: start a sentence with capital letter. http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@29 PS3, Line 29: test nit: tests http://gerrit.cloudera.org:8080/#/c/15866/3//COMMIT_MSG@30 PS3, Line 30: backwards It's just a matter of your point of view which direction you call backwards so I wouldn't use that word here. Simply "Added test to cover timestamp to string path when no date tokens are requested in the output string" or something like that. http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/expr-test.cc@4507 PS3, Line 4507: 1.00 Is this related to your change or does it come with a rebase you've made in the background? If the second, then a general comment is to try to push rebase related changes to gerrit separately from your actual changes to make the reviewers life easier. http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/exprs/expr-test.cc@7536 PS3, Line 7536: HH:mm:ss Actually, we should get an error when parsing the format and see that there is no date part. http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.h File be/src/runtime/datetime-simple-date-format-parser.h: http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.h@88 PS3, Line 88: parse Can you re-use the CastDirection type similarly as it is used in e.g. ISO SQL format tokenizer? http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.cc File be/src/runtime/datetime-simple-date-format-parser.cc: http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.cc@179 PS3, Line 179: //return (dt_ctx->has_date_toks || dt_ctx->has_time_toks); Please don't leave commented code http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/datetime-simple-date-format-parser.cc@180 PS3, Line 180: return parse?(dt_ctx->has_date_toks):(dt_ctx->has_date_toks || dt_ctx->has_time_toks); This seems logically correct but not that readable. Additionally, this is not how we format ternary operators in Impala: - Use space around the question marks - As the condition is simple no need to surround with brackets. - Use space around the colon Anyway, I feel that rewriting to something like this would help a bit: if (parse) return dt_ctx->has_date_toks; return dt_ctx->has_date_toks || dt_ctx->has_time_toks; Or something similar. http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc File be/src/runtime/timestamp-test.cc: http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc@610 PS3, Line 610: (TimestampTC("yyyy-MM-dd HH:m:ss", "2020-05-11 1:24:34", false, true)) Here the intention was to also cover the case when the minute part of the input is 1 char long, right? Instead this tests that the hour part is 1 long. Additionally, shouldn't you also give params to test the expected result? http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc@612 PS3, Line 612: (TimestampTC("yyyy-MM-dd HH:mm:s", "2020-05-11 14:24:34", false, true, true, 2020, : 05, 11, 14, 24, 34)) Similarly to the other tests, please add another one where the second part of the input is 1 long. http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc@679 PS3, Line 679: Test just formatting time tokens nit: This sentence sounds a bit weird for me. http://gerrit.cloudera.org:8080/#/c/15866/3/be/src/runtime/timestamp-test.cc@695 PS3, Line 695: Test just formatting time nit: same as above http://gerrit.cloudera.org:8080/#/c/15866/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/15866/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test@2976 PS3, Line 2976: select to_timestamp('01:01:01', 'HH:mm:ss'); As mentioned elsewhere, this should result a parse error as the format doesn't contain date. http://gerrit.cloudera.org:8080/#/c/15866/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test@2981 PS3, Line 2981: select to_timestamp('01:01:01.123', 'HH:mm:ss.SSS'); same as above -- To view, visit http://gerrit.cloudera.org:8080/15866 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322 Gerrit-Change-Number: 15866 Gerrit-PatchSet: 3 Gerrit-Owner: Adam Tamas <ta...@cloudera.com> Gerrit-Reviewer: Adam Tamas <ta...@cloudera.com> Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Comment-Date: Wed, 27 May 2020 13:49:55 +0000 Gerrit-HasComments: Yes