[ 
https://issues.apache.org/jira/browse/HIVE-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840006#comment-13840006
 ] 

Teddy Choi commented on HIVE-5761:
----------------------------------

Eric,

Okay. I will keep the numbers as they are.

I implemented a fast date utility that needs only a few arithmetic operations 
and follows ISO 8601. It does not require any additional library. It is about 
2,000,000,000 times faster than java.util.Date in my benchmarks. I originally 
planned to use it for non-cached dates, but with this performance I don't need 
a cache anymore. The following code implements the YEAR expression.

{code}
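// Day counts for the building blocks of the proleptic Gregorian calendar.
private static final int Y1 = 365;             // one common year
private static final int Y4 = Y1 * 4 + 1;      // four years, one leap day
private static final int Y100 = Y4 * 25 - 1;   // a century skips one leap day
private static final int Y400 = Y100 * 4 + 1;  // 400 years restore one leap day
// Days from 0001-01-01 to 1970-01-01: 1969 years = 4*400 + 3*100 + 17*4 + 1.
private static final long EPOCH_FROM_00010101 =
    4 * Y400 + 3 * Y100 + 17 * Y4 + Y1;

public static int getYear(final int daysSinceEpoch) {
    long d = daysSinceEpoch + EPOCH_FROM_00010101;
    int offset = 0;
    if (d < 0) {
        // Shift dates before 0001-01-01 forward by whole 400-year cycles.
        offset = (int) (-d / Y400) + 1;
        d += offset * Y400;
    }
    final int r400 = (int) (d % Y400);
    final int q400 = (int) (d / Y400);
    final int r100 = r400 % Y100;
    final int q100 = r400 / Y100;
    final int r4 = r100 % Y4;
    final int q4 = r100 / Y4;
    final int q1 = r4 / Y1;

    // q100 == 4 or q1 == 4 occurs only on the last day of a leap year, which
    // would otherwise be counted as the first day of the following year.
    return 1 + 400 * (q400 - offset) + 100 * q100 + 4 * q4 + q1
        + ((q1 == 4 || q100 == 4) ? -1 : 0);
}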
{code}
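
One way to sanity-check the YEAR arithmetic is to compare it against 
java.time.LocalDate, which also uses the proleptic ISO calendar. This is only a 
sketch: it assumes Java 8 or later, and the class name YearCrossCheck and the 
helper countMismatches are made up for illustration and are not part of the 
patch.

```java
import java.time.LocalDate;

// Hypothetical harness: copies getYear from the comment above and compares it
// with java.time.LocalDate over a range of epoch days.
final class YearCrossCheck {
    private static final int Y1 = 365;
    private static final int Y4 = Y1 * 4 + 1;
    private static final int Y100 = Y4 * 25 - 1;
    private static final int Y400 = Y100 * 4 + 1;
    private static final long EPOCH_FROM_00010101 =
        4 * Y400 + 3 * Y100 + 17 * Y4 + Y1;

    static int getYear(final int daysSinceEpoch) {
        long d = daysSinceEpoch + EPOCH_FROM_00010101;
        int offset = 0;
        if (d < 0) {
            offset = (int) (-d / Y400) + 1;
            d += offset * Y400;
        }
        final int r400 = (int) (d % Y400);
        final int q400 = (int) (d / Y400);
        final int r100 = r400 % Y100;
        final int q100 = r400 / Y100;
        final int r4 = r100 % Y4;
        final int q4 = r100 / Y4;
        final int q1 = r4 / Y1;
        return 1 + 400 * (q400 - offset) + 100 * q100 + 4 * q4 + q1
            + ((q1 == 4 || q100 == 4) ? -1 : 0);
    }

    // Counts the epoch days in [from, to] where the two implementations
    // disagree on the year.
    static int countMismatches(final int from, final int to) {
        int mismatches = 0;
        for (int day = from; day <= to; day++) {
            if (getYear(day) != LocalDate.ofEpochDay(day).getYear()) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Roughly 4,100 years on either side of 1970-01-01.
        System.out.println("mismatches: " + countMismatches(-1_500_000, 1_500_000));
    }
}
```

Both sides number years proleptically, so the day before 0001-01-01 falls in 
year 0 (1 BC) and earlier dates get negative years.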

I also implemented MONTH and DAY, and all three expressions return correct 
results for both AD and BC dates. I'll start with this utility. :D
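
The MONTH and DAY code is not shown in this comment. As a rough idea of how the 
same decomposition could be extended, here is a minimal sketch. It is an 
assumption, not the actual implementation: the class MonthDaySketch, the method 
getMonthAndDay, and the CUMULATIVE_DAYS table are hypothetical names.

```java
// Hypothetical sketch: derives month and day-of-month from days since
// 1970-01-01 using the same 400/100/4/1-year decomposition as getYear.
final class MonthDaySketch {
    private static final int Y1 = 365;
    private static final int Y4 = Y1 * 4 + 1;
    private static final int Y100 = Y4 * 25 - 1;
    private static final int Y400 = Y100 * 4 + 1;
    private static final long EPOCH_FROM_00010101 =
        4 * Y400 + 3 * Y100 + 17 * Y4 + Y1;

    // Days elapsed before the first of each month in a common year.
    private static final int[] CUMULATIVE_DAYS =
        {0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365};

    // Returns {month (1-12), dayOfMonth (1-31)}.
    static int[] getMonthAndDay(final int daysSinceEpoch) {
        long d = daysSinceEpoch + EPOCH_FROM_00010101;
        if (d < 0) {
            // Month and day repeat every 400 years, so shifting by whole
            // cycles does not change the answer.
            d += ((-d / Y400) + 1) * Y400;
        }
        final int r400 = (int) (d % Y400);
        final int q100 = r400 / Y100;
        final int r100 = r400 % Y100;
        final int q4 = r100 / Y4;
        final int r4 = r100 % Y4;
        final int q1 = r4 / Y1;

        // The 366th day of a leap year overflows into q1 == 4 or q100 == 4.
        if (q1 == 4 || q100 == 4) {
            return new int[] {12, 31};
        }

        int dayOfYear = r4 % Y1; // zero-based
        // The last year of a 4-year block is a leap year, unless it ends a
        // century that does not also end the 400-year cycle.
        final boolean leap = q1 == 3 && (q4 != 24 || q100 == 3);
        if (leap && dayOfYear >= 59) {
            if (dayOfYear == 59) {
                return new int[] {2, 29};
            }
            dayOfYear--; // after Feb 29, fall back to the common-year table
        }

        int month = 1;
        while (CUMULATIVE_DAYS[month] <= dayOfYear) {
            month++;
        }
        return new int[] {month, dayOfYear - CUMULATIVE_DAYS[month - 1] + 1};
    }

    public static void main(String[] args) {
        int[] md = getMonthAndDay(11016); // epoch day of a leap day in 2000
        System.out.println(md[0] + "/" + md[1]);
    }
}
```

A table lookup keeps the sketch short; a branch-free formula for the month 
would also work and might suit a vectorized loop better.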

> Implement vectorized support for the DATE data type
> ---------------------------------------------------
>
>                 Key: HIVE-5761
>                 URL: https://issues.apache.org/jira/browse/HIVE-5761
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Eric Hanson
>            Assignee: Teddy Choi
>
> Add support to allow queries referencing DATE columns and expression results 
> to run efficiently in vectorized mode. This should re-use the code for the 
> integer/timestamp types to the extent possible and beneficial. Include 
> unit tests and end-to-end tests. Consider re-using or extending existing 
> end-to-end tests for vectorized integer and/or timestamp operations.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
