[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..

IMPALA-7492: Add support for DATE text parser/formatter

This change is the first step in implementing support for DATE type
(IMPALA-6169).

The DATE parser/formatter is implemented by the new DateParser class.
- The parser supports parsing both default and custom formatted DATE
values. CCTZ is used to validate the parsed dates.
- The formatter supports default and custom formatting of DATE values.

In the future, DateParser will be used in the text scanner/writer and
in the DATE <-> STRING cast functions.

The DateParser class reuses some of the functionality already
implemented in the TimestampParser class to minimize redundancy. To
make code reuse easier, a new namespace (datetime_parse_util) was
created and the common functionality was moved there.

This change also adds a new class (DateValue) to represent a DATE
value in-memory. The DateParser and DateValue classes are used only in
tests at the moment, therefore this patch doesn't change user facing
behavior.

Testing:
- Added BE-tests for DateParser and DateValue classes.
- Re-run parse-timestamp-benchmark to make sure that parser
  performance hasn't degraded.

Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Reviewed-on: http://gerrit.cloudera.org:8080/11450
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/common/init.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/CMakeLists.txt
A be/src/runtime/date-parse-util.cc
A be/src/runtime/date-parse-util.h
A be/src/runtime/date-test.cc
A be/src/runtime/date-value.cc
A be/src/runtime/date-value.h
A be/src/runtime/datetime-parse-util.cc
A be/src/runtime/datetime-parse-util.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
19 files changed, 1,896 insertions(+), 732 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 7
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 6
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 28 Sep 2018 18:14:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3245/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 6
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 28 Sep 2018 14:29:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 6
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 28 Sep 2018 14:29:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-28 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 5: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11450/5/be/src/runtime/date-value.cc
File be/src/runtime/date-value.cc:

http://gerrit.cloudera.org:8080/#/c/11450/5/be/src/runtime/date-value.cc@30
PS5, Line 30: namespace {
nit: Unnamed namespace is not necessary here since the variable is const and 
the function is inline => they are already implicitly static



--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 28 Sep 2018 10:43:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-26 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 5: Code-Review+2

Thanks for the changes!


--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 26 Sep 2018 14:28:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/808/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 26 Sep 2018 14:19:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-26 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..

IMPALA-7492: Add support for DATE text parser/formatter

This change is the first step in implementing support for DATE type
(IMPALA-6169).

The DATE parser/formatter is implemented by the new DateParser class.
- The parser supports parsing both default and custom formatted DATE
values. CCTZ is used to validate the parsed dates.
- The formatter supports default and custom formatting of DATE values.

In the future, DateParser will be used in the text scanner/writer and
in the DATE <-> STRING cast functions.

The DateParser class reuses some of the functionality already
implemented in the TimestampParser class to minimize redundancy. To
make code reuse easier, a new namespace (datetime_parse_util) was
created and the common functionality was moved there.

This change also adds a new class (DateValue) to represent a DATE
value in-memory. The DateParser and DateValue classes are used only in
tests at the moment, therefore this patch doesn't change user facing
behavior.

Testing:
- Added BE-tests for DateParser and DateValue classes.
- Re-run parse-timestamp-benchmark to make sure that parser
  performance hasn't degraded.

Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/common/init.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/CMakeLists.txt
A be/src/runtime/date-parse-util.cc
A be/src/runtime/date-parse-util.h
A be/src/runtime/date-test.cc
A be/src/runtime/date-value.cc
A be/src/runtime/date-value.h
A be/src/runtime/datetime-parse-util.cc
A be/src/runtime/datetime-parse-util.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
19 files changed, 1,896 insertions(+), 732 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/11450/5
--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-26 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11450/4/be/src/runtime/date-test.cc
File be/src/runtime/date-test.cc:

http://gerrit.cloudera.org:8080/#/c/11450/4/be/src/runtime/date-test.cc@76
PS4, Line 76: EXPECT_FALSE(dv.IsValid());
> It would be more informative if problematic date was added as message.
Done


http://gerrit.cloudera.org:8080/#/c/11450/4/be/src/runtime/date-value.h
File be/src/runtime/date-value.h:

http://gerrit.cloudera.org:8080/#/c/11450/4/be/src/runtime/date-value.h@61
PS4, Line 61: if
> The valid case could be marked as LIKELY.
Done


http://gerrit.cloudera.org:8080/#/c/11450/4/be/src/runtime/date-value.h@68
PS4, Line 68:   DateValue(int year, int month, int day) : 
days_since_epoch_(INVALID_DAYS_SINCE_EPOCH) {
: DCHECK(!IsValid());
: // Check year range and whether year-month-day is a valid 
date.
: if (LIKELY(year >= MIN_YEAR && year <= MAX_YEAR)) {
:   // Use CCTZ for validity check.
:   cctz::civil_day date(year, month, day);
:   if (LIKELY(year == date.year() && month == date.month() && 
day == date.day())) {
: days_since_epoch_ = CalcDaysSinceEpoch(date);
: DCHECK(IsValid());
:   }
: }
:   }
> Can you move this to the .cc? This would mean that CalcDaysSinceEpoch and E
Done



--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 26 Sep 2018 13:43:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-26 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 4: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11450/4/be/src/runtime/date-test.cc
File be/src/runtime/date-test.cc:

http://gerrit.cloudera.org:8080/#/c/11450/4/be/src/runtime/date-test.cc@76
PS4, Line 76: EXPECT_FALSE(dv.IsValid());
It would be more informative if problematic date was added as message.


http://gerrit.cloudera.org:8080/#/c/11450/4/be/src/runtime/date-value.h
File be/src/runtime/date-value.h:

http://gerrit.cloudera.org:8080/#/c/11450/4/be/src/runtime/date-value.h@61
PS4, Line 61: if
The valid case could be marked as LIKELY.


http://gerrit.cloudera.org:8080/#/c/11450/4/be/src/runtime/date-value.h@68
PS4, Line 68:   DateValue(int year, int month, int day) : 
days_since_epoch_(INVALID_DAYS_SINCE_EPOCH) {
: DCHECK(!IsValid());
: // Check year range and whether year-month-day is a valid 
date.
: if (LIKELY(year >= MIN_YEAR && year <= MAX_YEAR)) {
:   // Use CCTZ for validity check.
:   cctz::civil_day date(year, month, day);
:   if (LIKELY(year == date.year() && month == date.month() && 
day == date.day())) {
: days_since_epoch_ = CalcDaysSinceEpoch(date);
: DCHECK(IsValid());
:   }
: }
:   }
Can you move this to the .cc? This would mean that CalcDaysSinceEpoch and 
EPOCH_DATE could be also moved, so #include "cctz/civil_time.h" could be 
removed too.



--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 26 Sep 2018 12:43:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/807/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 26 Sep 2018 12:29:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-26 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..

IMPALA-7492: Add support for DATE text parser/formatter

This change is the first step in implementing support for DATE type
(IMPALA-6169).

The DATE parser/formatter is implemented by the new DateParser class.
- The parser supports parsing both default and custom formatted DATE
values. CCTZ is used to validate the parsed dates.
- The formatter supports default and custom formatting of DATE values.

In the future, DateParser will be used in the text scanner/writer and
in the DATE <-> STRING cast functions.

The DateParser class reuses some of the functionality already
implemented in the TimestampParser class to minimize redundancy. To
make code reuse easier, a new namespace (datetime_parse_util) was
created and the common functionality was moved there.

This change also adds a new class (DateValue) to represent a DATE
value in-memory. The DateParser and DateValue classes are used only in
tests at the moment, therefore this patch doesn't change user facing
behavior.

Testing:
- Added BE-tests for DateParser and DateValue classes.
- Re-run parse-timestamp-benchmark to make sure that parser
  performance hasn't degraded.

Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/common/init.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/CMakeLists.txt
A be/src/runtime/date-parse-util.cc
A be/src/runtime/date-parse-util.h
A be/src/runtime/date-test.cc
A be/src/runtime/date-value.cc
A be/src/runtime/date-value.h
A be/src/runtime/datetime-parse-util.cc
A be/src/runtime/datetime-parse-util.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
19 files changed, 1,895 insertions(+), 732 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/11450/4
--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-26 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/11450/3/be/src/runtime/date-test.cc
File be/src/runtime/date-test.cc:

http://gerrit.cloudera.org:8080/#/c/11450/3/be/src/runtime/date-test.cc@50
PS3, Line 50:   char s1[] = "2012-01-20";
:   char s2[] = "1990-10-20";
:   char s3[] = "  1990-10-20   ";
:
:   DateValue v1 = DateValue::Parse(s1, strlen(s1));
:   DateValue v2 = DateValue::Parse(s2, strlen(s2));
:   DateValue v3 = DateValue::Parse(s3, strlen(s3));
:
:   ValidateDate(v1, 2012, 1, 20, s1);
:   ValidateDate(v2, 1990, 10, 20, s2);
:   ValidateDate(v3, 1990, 10, 20, s3);
> It would be nice to merge these 3 line tests to 1, e.g. by creating an over
Done


http://gerrit.cloudera.org:8080/#/c/11450/3/be/src/runtime/date-test.cc@126
PS3, Line 126:   const char* str;
> This could be renamed to month_name to make its purpose clearer.
Done


http://gerrit.cloudera.org:8080/#/c/11450/3/be/src/runtime/date-test.cc@300
PS3, Line 300: char buff[buff_len];
> I would prefer to avoid using variable length arrays as it is not standard
Done


http://gerrit.cloudera.org:8080/#/c/11450/3/be/src/runtime/date-test.cc@356
PS3, Line 356:   const TimestampValue now(date(1980, 2, 28), time_duration(16, 
14, 24));
> Can you add some tests that parse y/yy with a different "now", e.g. 2018? I
Done


http://gerrit.cloudera.org:8080/#/c/11450/3/be/src/runtime/date-test.cc@366
PS3, Line 366: (DateTC("yy-MM-dd", "00-02-29", false, true))
> If I didn't miss something, then leap years are only tested in this functio
Added some tests to DateValueEdgeCases and ParseFormatEdgeCases


http://gerrit.cloudera.org:8080/#/c/11450/3/be/src/runtime/date-value.h
File be/src/runtime/date-value.h:

http://gerrit.cloudera.org:8080/#/c/11450/3/be/src/runtime/date-value.h@59
PS3, Line 59:   DateValue(int32_t days_since_epoch) : 
days_since_epoch_(days_since_epoch) {
:   }
> I am not totally sure here, but I would consider a similar logic to Timesta
Done



--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 3
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 26 Sep 2018 10:57:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-24 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 3: Code-Review+1

(6 comments)

I have some minor comments, lgtm otherwise.

http://gerrit.cloudera.org:8080/#/c/11450/3/be/src/runtime/date-test.cc
File be/src/runtime/date-test.cc:

http://gerrit.cloudera.org:8080/#/c/11450/3/be/src/runtime/date-test.cc@50
PS3, Line 50:   char s1[] = "2012-01-20";
:   char s2[] = "1990-10-20";
:   char s3[] = "  1990-10-20   ";
:
:   DateValue v1 = DateValue::Parse(s1, strlen(s1));
:   DateValue v2 = DateValue::Parse(s2, strlen(s2));
:   DateValue v3 = DateValue::Parse(s3, strlen(s3));
:
:   ValidateDate(v1, 2012, 1, 20, s1);
:   ValidateDate(v2, 1990, 10, 20, s2);
:   ValidateDate(v3, 1990, 10, 20, s3);
It would be nice to merge these 3 line tests to 1, e.g. by creating an overload 
for validate that expects const char* instead of DateValue.


http://gerrit.cloudera.org:8080/#/c/11450/3/be/src/runtime/date-test.cc@126
PS3, Line 126:   const char* str;
This could be renamed to month_name to make its purpose clearer.


http://gerrit.cloudera.org:8080/#/c/11450/3/be/src/runtime/date-test.cc@300
PS3, Line 300: char buff[buff_len];
I would prefer to avoid using variable length arrays as it is not standard C++.


http://gerrit.cloudera.org:8080/#/c/11450/3/be/src/runtime/date-test.cc@356
PS3, Line 356:   const TimestampValue now(date(1980, 2, 28), time_duration(16, 
14, 24));
Can you add some tests that parse y/yy with a different "now", e.g. 2018? It 
does not has to be exhaustive, it should just check that "now" actually matters.


http://gerrit.cloudera.org:8080/#/c/11450/3/be/src/runtime/date-test.cc@366
PS3, Line 366: (DateTC("yy-MM-dd", "00-02-29", false, true))
If I didn't miss something, then leap years are only tested in this function. 
Can you add some checks to the basic tests about 02-29's handling in 
leap/non-leap years?


http://gerrit.cloudera.org:8080/#/c/11450/3/be/src/runtime/date-value.h
File be/src/runtime/date-value.h:

http://gerrit.cloudera.org:8080/#/c/11450/3/be/src/runtime/date-value.h@59
PS3, Line 59:   DateValue(int32_t days_since_epoch) : 
days_since_epoch_(days_since_epoch) {
:   }
I am not totally sure here, but I would consider a similar logic to 
TimestampValue, where out of range values are always replaced with a fix 
invalid value (INVALID_DAYS_SINCE_EPOCH in this case). The benefit would be 
that IsValid() could be probably faster (test for equality with one value vs 
two comparisons).

Note that this was more important for TimestampValue, because out of range 
boost dates could lead to exceptions.



--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 3
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 24 Sep 2018 14:17:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/744/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 3
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 21 Sep 2018 17:18:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-21 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..

IMPALA-7492: Add support for DATE text parser/formatter

This change is the first step in implementing support for DATE type
(IMPALA-6169).

The DATE parser/formatter is implemented by the new DateParser class.
- The parser supports parsing both default and custom formatted DATE
values. CCTZ is used to validate the parsed dates.
- The formatter supports default and custom formatting of DATE values.

In the future, DateParser will be used in the text scanner/writer and
in the DATE <-> STRING cast functions.

The DateParser class reuses some of the functionality already
implemented in the TimestampParser class to minimize redundancy. To
make code reuse easier, a new namespace (datetime_parse_util) was
created and the common functionality was moved there.

This change also adds a new class (DateValue) to represent a DATE
value in-memory. The DateParser and DateValue classes are used only in
tests at the moment, therefore this patch doesn't change user facing
behavior.

Testing:
- Added BE-tests for DateParser and DateValue classes.
- Re-run parse-timestamp-benchmark to make sure that parser
  performance hasn't degraded.

Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/common/init.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/CMakeLists.txt
A be/src/runtime/date-parse-util.cc
A be/src/runtime/date-parse-util.h
A be/src/runtime/date-test.cc
A be/src/runtime/date-value.cc
A be/src/runtime/date-value.h
A be/src/runtime/datetime-parse-util.cc
A be/src/runtime/datetime-parse-util.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
19 files changed, 1,895 insertions(+), 732 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/11450/3
--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 3
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-17 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.cc
File be/src/runtime/date-value.cc:

http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.cc@89
PS2, Line 89:   }
Do you want to set an informative string if the value is not valid?



--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 2
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 Sep 2018 23:59:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-17 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc
File be/src/runtime/date-test.cc:

http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc@120
PS2, Line 120: ValidateDate
I think that it's worth mentioning both here and in the commit message that 
Impala's parsing logic is checked with against CCTZ's logic.


http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc@418
PS2, Line 418: {
I would prefer to move this block to a member function of DateTc.


http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc@424
PS2, Line 424: "TC: " << i;
Instead of printing the index it would be more informative print test_case.str 
or adding a << function for DateTc.


http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc@472
PS2, Line 472: {
Similarly to line 418, I would prefer to move this block to a member function + 
replace '<< "TC: " << i' with more detailed info.


http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc@490
PS2, Line 490: // Test min supported date.
Please mention here that MIN_DATE_DAYS_SINCE_EPOCH is calculated with Proleptic 
Gregorian calendar and is expected to be different than 0001-01-01 Hive.


http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h
File be/src/runtime/date-value.h:

http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h@44
PS2, Line 44: MIN_DAYS_SINCE_EPOCH - 1
I would prefer to choose another value for invalid dates - my concern is that 
this may be valid if written by Hive or other systems that do not use Proleptic 
Gregorian Calendar. This probably wouldn't cause an issue, but it would clearer 
to use a more extreme value.


http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h@46
PS2, Line 46: year >= MIN_YEAR && year <= MAX_YEAR
LIKELY could be added here.


http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h@67
PS2, Line 67: ToCivilDay
I would prefer to avoid including CCTZ in the interface (but not at all costs) 
and hide it behind the implementation of TimestampValue/DateValue/TimezoneDb. 
Maybe some utility functions could be added somewhere that do these conversions.

CCTZ will be useful for functions like addDay/addYear/dayOfWeek, but these 
could be also added to the interface of DateValue, so the CCTZ can do its work 
in the implementation.


http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h@101
PS2, Line 101:   int32_t days_since_epoch_;
Please add more explanation about the representation here or at the start of 
the class e.g.:
- the epoch is 1970.01.01
- this representation was chosen to be the same (bit-by-bit) as Parquet's date: 
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#date
- boost::gregorian::date could not be used as representation due to its limited 
range
- Proleptic Gregorian calendar which can lead to different historical 
representation compared to Hive 
(https://en.wikipedia.org/wiki/Proleptic_Gregorian_calendar)



--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 2
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 Sep 2018 17:22:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-17 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 1:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/benchmarks/convert-timestamp-benchmark.cc
File be/src/benchmarks/convert-timestamp-benchmark.cc:

http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/benchmarks/convert-timestamp-benchmark.cc@152
PS1, Line 152: 19
nit: shouldn't it be 20?

Maybe you could add a Reset function that only takes a const char*, and uses 
strlen() inside.


http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.h
File be/src/runtime/date-parse-util.h:

http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.h@26
PS1, Line 26: namespace dtp = datetime_parse_util;
nit: It is against the google style guide: "Do not use Namespace aliases at 
namespace scope in header files except in explicitly marked internal-only 
namespaces"

https://google.github.io/styleguide/cppguide.html#Namespaces


http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.h@59
PS1, Line 59:   /// Returns the realigned year value for dt_result.year. This 
function should be called
:   /// only if dt_result.realign_year is true.
Please explain what is a realigned year. You can re-use the original comment 
from timestamp-parse-util.h


http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.cc
File be/src/runtime/date-parse-util.cc:

http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.cc@61
PS1, Line 61: NULL
nit: nullptr


http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.cc@71
PS1, Line 71:   if (!date->IsValid()) return false;
:
:   return true;
nit: return date->IsValid();


http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.cc@132
PS1, Line 132:   DateTimeFormatContext lazy_ctx;
 :   lazy_ctx.Reset(str, trimmed_len);
nit: could be a one-liner with a constructor call.


http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.cc@147
PS1, Line 147: NULL
nit: nullptr


http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-parse-util.cc@155
PS1, Line 155: NULL
nit: nullptr


http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-test.cc
File be/src/runtime/date-test.cc:

http://gerrit.cloudera.org:8080/#/c/11450/1/be/src/runtime/date-test.cc@140
PS1, Line 140:   for (int i = 0; i < toks_len; ++i) {
 : fmt.append((*toks)[i].fmt);
 : if ((*toks)[i].str != nullptr) {
 :   val.append(string((*toks)[i].str));
 : } else {
 :   val.append(lexical_cast((*toks)[i].val));
 : }
 :   }
 :   string fmt_val = "Format: " + fmt + ", Val: " + val;
 :   DateTimeFormatContext dt_ctx(fmt.c_str(), fmt.length());
 :   ASSERT_TRUE(ParseFormatTokens(_ctx, false)) << fmt_val;
 :   DateValue dv = DateValue::Parse(val.c_str(), val.length(), 
dt_ctx);
 :   ValidateDate(dv, year, month, day, fmt_val);
 :   int buff_len = dt_ctx.fmt_out_len + 1;
 :   char buff[buff_len];
 :   int actual_len = dv.Format(dt_ctx, buff_len, buff);
 :   EXPECT_GT(actual_len, 0) << fmt_val;
 :   EXPECT_LE(actual_len, dt_ctx.fmt_len) << fmt_val;
 :   string buff_str(buff);
 :   EXPECT_EQ(buff_str, val) << fmt_val <<  " " << buff_str;
 :   fmt.clear();
 :   val.clear();
 : }
 : // Validate we can parse date with separators
 : {
 :   for (const char* separator = SEPARATORS; *separator != 0; 
++separator) {
 : for (int i = 0; i < toks_len; ++i) {
 :   fmt.append((*toks)[i].fmt);
 :   if (i + 1 < toks_len) fmt.push_back(*separator);
 :   if ((*toks)[i].str != nullptr) {
 : val.append(string((*toks)[i].str));
 :   } else {
 : val.append(lexical_cast((*toks)[i].val));
 :   }
 :   if (i + 1 < toks_len) val.push_back(*separator);
 : }
 : string fmt_val = "Format: " + fmt + ", Val: " + val;
 : DateTimeFormatContext dt_ctx(fmt.c_str(), fmt.length());
 : ASSERT_TRUE(ParseFormatTokens(_ctx, false)) << 
fmt_val;
 : DateValue dv = DateValue::Parse(val.c_str(), 
val.length(), dt_ctx);
 : ValidateDate(dv, year, month, day, fmt_val);
 : int 

[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-17 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 1:

(5 comments)

I have went through the change quickly, I will dig into details deeper in the 
future.

http://gerrit.cloudera.org:8080/#/c/11450/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11450/2//COMMIT_MSG@13
PS2, Line 13: The parser supports parsing both default and custom formatted DATE
: values. The formatter supports default and custom formatting of 
DATE
: values.
It took me some time to realize that these two sentences are not duplicates. 
Please add "also" or merge them somehow.


http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc
File be/src/runtime/date-test.cc:

http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc@195
PS2, Line 195: TEST
I would prefer to split this into more than one tests, e.g. DefaultParse, 
CustomParse, EdgeCases, Format.

timestamp-test.cc has a very long Basic test, but I do not think that it is a 
good idea - I found it quite hard to get an overview of the tests and checking 
whether something is already tested or not.


http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc@491
PS2, Line 491:   const int64_t MIN_DATE_DAYS_SINCE_EPOCH = -719528;
Does this have to be int64? Int32 should be generally always enough to 
represent day since epoch.


http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-test.cc@510
PS2, Line 510:   const int64_t MAX_DATE_DAYS_SINCE_EPOCH = 2932896;
Same as line 491.


http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h
File be/src/runtime/date-value.h:

http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h@35
PS2, Line 35: -01-01
Isn't it -12-31? That date can be represented with TimestampValue.



--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 1
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 Sep 2018 16:19:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 2:

(3 comments)

I was curious so I took a quick look over the changes to understand the 
high-level changes, I think this makes a lot of sense to me.

http://gerrit.cloudera.org:8080/#/c/11450/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11450/2//COMMIT_MSG@29
PS2, Line 29: - Added BE-tests for DateParser and DateValue classes.
I didn't see any reason in the code to expect that performance would change, 
but it would be good to do a sanity-test of timestamp parsing performance, e.g. 
measure the time taken to convert a largish table of timestamp strings. I'd 
suggest doing this with mt_dop=1 so that scanner threads don't affect 
measurements.


http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h
File be/src/runtime/date-value.h:

http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.h@54
PS2, Line 54:   bool IsValid() const {
We should document the invariant about validity. I.e. that it's possible for an 
"invalid" DateValue to exist.


http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/datetime-parse-util.h
File be/src/runtime/datetime-parse-util.h:

http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/datetime-parse-util.h@178
PS2, Line 178: : year(-1),
We'v generally been directly initializing the members when they're constants.



--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 2
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 Sep 2018 16:18:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/691/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 2
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 Sep 2018 15:31:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-17 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..

IMPALA-7492: Add support for DATE text parser/formatter

This change is the first step in implementing support for DATE type
(IMPALA-6169).

The DATE parser/formatter is implemented by the new DateParser class.
The parser supports parsing both default and custom formatted DATE
values. The formatter supports default and custom formatting of DATE
values. In the future, DateParser will be used in the text
scanner/writer and in the DATE <-> STRING cast functions.

The DateParser class reuses some of the functionality already
implemented in the TimestampParser class to minimize redundancy. To
make code reuse easier, a new namespace (datetime_parse_util) was
created and the common functionality was moved there.

This change also adds a new class (DateValue) to represent a DATE
value in-memory. The DateParser and DateValue classes are used only in
tests at the moment, therefore this patch doesn't change user facing
behavior.

Testing:
- Added BE-tests for DateParser and DateValue classes.

Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/common/init.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/CMakeLists.txt
A be/src/runtime/date-parse-util.cc
A be/src/runtime/date-parse-util.h
A be/src/runtime/date-test.cc
A be/src/runtime/date-value.cc
A be/src/runtime/date-value.h
A be/src/runtime/datetime-parse-util.cc
A be/src/runtime/datetime-parse-util.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
19 files changed, 1,830 insertions(+), 729 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/11450/2
--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 2
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/690/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 1
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 17 Sep 2018 13:05:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-17 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11450


Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..

IMPALA-7492: Add support for DATE text parser/formatter

This change is the first step in implementing support for DATE type
(IMPALA-6169).

The DATE parser/formatter is implemented by the new DateParser class.
The parser supports parsing both default and custom formatted DATE
values. The formatter supports default and custom formatting of DATE
values. In the future, DateParser will be used in the text
scanner/writer and in the DATE <-> STRING cast functions.

The DateParser class reuses some of the functionality already
implemented in the TimestampParser class to minimize redundancy. To
make code reuse easier, a new namespace (datetime_parse_util) was
created and the common functionality was moved there.

This change also adds a new class (DateValue) to represent a DATE
value in-memory. The DateParser and DateValue classes are used only in
tests at the moment, therefore this patch doesn't change user facing
behavior.

Testing:
- Added BE-tests for DateParser and DateValue classes.

Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/common/init.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/CMakeLists.txt
A be/src/runtime/date-parse-util.cc
A be/src/runtime/date-parse-util.h
A be/src/runtime/date-test.cc
A be/src/runtime/date-value.cc
A be/src/runtime/date-value.h
A be/src/runtime/datetime-parse-util.cc
A be/src/runtime/datetime-parse-util.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-parse-util.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/timestamp-value.h
19 files changed, 1,830 insertions(+), 729 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/11450/1
--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 1
Gerrit-Owner: Attila Jeges