[
https://issues.apache.org/jira/browse/HIVE-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840006#comment-13840006
]
Teddy Choi commented on HIVE-5761:
----------------------------------
Eric,
Okay. I will keep the numbers as they are.
I implemented a fast date utility that needs only a few arithmetic operations
and follows the ISO 8601 calendar. It does not require any additional library.
It is about 2,000,000,000 times faster than java.util.Date in my benchmarks. I
originally planned to use it for non-cached dates, but with this performance I
don't need a cache anymore. The following code implements the YEAR expression.
{code}
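// Day counts for 1-, 4-, 100- and 400-year spans of the proleptic Gregorian calendar.
private static final int Y1 = 365;
private static final int Y4 = Y1 * 4 + 1;      // 1,461
private static final int Y100 = Y4 * 25 - 1;   // 36,524
private static final int Y400 = Y100 * 4 + 1;  // 146,097
// Days from 0001-01-01 to 1970-01-01 (1,969 years = 4 * 400 + 3 * 100 + 17 * 4 + 1).
private static final long EPOCH_FROM_00010101 = 4 * Y400 + 3 * Y100 + 17 * Y4 + Y1;

public static int getYear(final int daysSinceEpoch) {
  long d = daysSinceEpoch + EPOCH_FROM_00010101;
  // For dates before 0001-01-01, shift forward by whole 400-year cycles so that d >= 0.
  int offset = 0;
  if (d < 0) {
    offset = (int) (-d / Y400) + 1;
    d += offset * Y400;
  }
  // Peel off 400-, 100-, 4- and 1-year blocks in turn.
  final int r400 = (int) (d % Y400);
  final int q400 = (int) (d / Y400);
  final int r100 = r400 % Y100;
  final int q100 = r400 / Y100;
  final int r4 = r100 % Y4;
  final int q4 = r100 / Y4;
  final int q1 = r4 / Y1;
  // The last day of a 4-year or 400-year cycle gives q1 == 4 or q100 == 4;
  // that day still belongs to the previous year, hence the -1 correction.
  return 1 + 400 * (q400 - offset) + 100 * q100 + 4 * q4 + q1
      + ((q1 == 4 || q100 == 4) ? -1 : 0);
}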
{code}
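As a rough cross-check (a sketch, not part of the patch), the getYear logic
above can be compared against java.time.LocalDate, which also uses the
proleptic ISO calendar with year 0 standing for 1 BC. The class name YearCheck,
the countMismatches helper, and the tested day range are assumptions made for
this illustration only; java.time requires Java 8 or later.

```java
import java.time.LocalDate;

final class YearCheck {
    private static final int Y1 = 365;
    private static final int Y4 = Y1 * 4 + 1;
    private static final int Y100 = Y4 * 25 - 1;
    private static final int Y400 = Y100 * 4 + 1;
    private static final long EPOCH_FROM_00010101 = 4 * Y400 + 3 * Y100 + 17 * Y4 + Y1;

    // Same arithmetic as the getYear method quoted in the comment.
    static int getYear(final int daysSinceEpoch) {
        long d = daysSinceEpoch + EPOCH_FROM_00010101;
        int offset = 0;
        if (d < 0) {
            offset = (int) (-d / Y400) + 1;
            d += offset * Y400;
        }
        final int r400 = (int) (d % Y400);
        final int q400 = (int) (d / Y400);
        final int r100 = r400 % Y100;
        final int q100 = r400 / Y100;
        final int r4 = r100 % Y4;
        final int q4 = r100 / Y4;
        final int q1 = r4 / Y1;
        return 1 + 400 * (q400 - offset) + 100 * q100 + 4 * q4 + q1
            + ((q1 == 4 || q100 == 4) ? -1 : 0);
    }

    // Hypothetical helper: counts the days in [from, to] for which getYear
    // disagrees with LocalDate.ofEpochDay(day).getYear().
    static int countMismatches(final int from, final int to) {
        int mismatches = 0;
        for (int day = from; day <= to; day++) {
            if (getYear(day) != LocalDate.ofEpochDay(day).getYear()) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // One million days on each side of the epoch, which includes BC dates.
        System.out.println("mismatches: " + countMismatches(-1_000_000, 1_000_000));
    }
}
```

The range is arbitrary; widening it only costs run time, as long as the upper
bound stays below Integer.MAX_VALUE so the loop counter does not overflow.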
I also implemented MONTH and DAY, and all of them return the expected results
in both the AD and BC eras. I'll start with this utility. :D
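The MONTH and DAY code is not shown in this comment. For readers who want to
see how such a decomposition can work, here is a minimal sketch that uses a
different, well-known technique: Howard Hinnant's civil_from_days algorithm,
which counts years from March 1 so that the leap day falls at the end of the
year. It is not the implementation from the patch; the class name
CivilDateSketch and the helper countMismatches are hypothetical, and the check
relies on java.time (Java 8 or later).

```java
import java.time.LocalDate;

final class CivilDateSketch {
    // Returns {year, month, day} for a day count relative to 1970-01-01.
    static int[] civilFromDays(final int daysSinceEpoch) {
        // 719468 is the number of days from 0000-03-01 to 1970-01-01.
        final long z = (long) daysSinceEpoch + 719468L;
        final long era = Math.floorDiv(z, 146097L);           // 400-year cycle index
        final int doe = (int) (z - era * 146097L);            // day of era, [0, 146096]
        final int yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365; // [0, 399]
        final int doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year
        final int mp = (5 * doy + 2) / 153;                   // March-based month, [0, 11]
        final int day = doy - (153 * mp + 2) / 5 + 1;         // [1, 31]
        final int month = mp < 10 ? mp + 3 : mp - 9;          // [1, 12]
        // January and February belong to the next calendar year.
        final int year = (int) (yoe + era * 400) + (month <= 2 ? 1 : 0);
        return new int[] {year, month, day};
    }

    // Hypothetical helper: counts days in [from, to] where any field
    // differs from java.time.LocalDate.
    static int countMismatches(final int from, final int to) {
        int mismatches = 0;
        for (int d = from; d <= to; d++) {
            final int[] ymd = civilFromDays(d);
            final LocalDate expected = LocalDate.ofEpochDay(d);
            if (ymd[0] != expected.getYear()
                    || ymd[1] != expected.getMonthValue()
                    || ymd[2] != expected.getDayOfMonth()) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        final int[] ymd = civilFromDays(0);
        System.out.println(ymd[0] + "-" + ymd[1] + "-" + ymd[2]);
        System.out.println("mismatches: " + countMismatches(-1_000_000, 1_000_000));
    }
}
```

Like the YEAR code, this uses only integer arithmetic and no lookup tables, so
it would fit the same no-cache approach.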
> Implement vectorized support for the DATE data type
> ---------------------------------------------------
>
> Key: HIVE-5761
> URL: https://issues.apache.org/jira/browse/HIVE-5761
> Project: Hive
> Issue Type: Sub-task
> Reporter: Eric Hanson
> Assignee: Teddy Choi
>
> Add support to allow queries referencing DATE columns and expression results
> to run efficiently in vectorized mode. This should re-use the code for the
> integer/timestamp types to the extent possible and beneficial. Include
> unit tests and end-to-end tests. Consider re-using or extending existing
> end-to-end tests for vectorized integer and/or timestamp operations.