Maxim Gekk created SPARK-31359:
----------------------------------

             Summary: Speed up timestamps rebasing
                 Key: SPARK-31359
                 URL: https://issues.apache.org/jira/browse/SPARK-31359
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Maxim Gekk


Currently, timestamps are rebased by converting them to local timestamps and 
back to microseconds. This is a CPU-intensive operation that can be avoided by 
converting via pre-calculated tables, one per time zone. For example, below 
are the timestamps at which the diff changes in the America/Los_Angeles time 
zone for the range 0001-01-01...2100-01-01:
{code}
0001-01-01T00:00 diff = -2872 minutes
0100-03-01T00:00 diff = -1432 minutes
0200-03-01T00:00 diff = 7 minutes
0300-03-01T00:00 diff = 1447 minutes
0500-03-01T00:00 diff = 2887 minutes
0600-03-01T00:00 diff = 4327 minutes
0700-03-01T00:00 diff = 5767 minutes
0900-03-01T00:00 diff = 7207 minutes
1000-03-01T00:00 diff = 8647 minutes
1100-03-01T00:00 diff = 10087 minutes
1300-03-01T00:00 diff = 11527 minutes
1400-03-01T00:00 diff = 12967 minutes
1500-03-01T00:00 diff = 14407 minutes
1582-10-15T00:00 diff = 7 minutes
1883-11-18T12:22:58 diff = 0 minutes
{code}
It seems possible to build such rebasing maps in advance and to perform 
rebasing by looking up the diff in the map.
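
A minimal sketch of the lookup idea, in Python for brevity rather than in 
Spark's own code. The switch points and diffs are copied from the table above; 
the names `SWITCHES`, `lookup_diff` and `rebase` are hypothetical, and the 
direction in which the diff is applied (added to the input) is an assumption, 
since the description does not state it. A real implementation would 
presumably work on microseconds and keep one table per time zone.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

# (switch point as a local timestamp, diff in minutes), copied from the
# America/Los_Angeles table in this issue. Must stay sorted by switch point.
SWITCHES = [
    (datetime(1, 1, 1), -2872),
    (datetime(100, 3, 1), -1432),
    (datetime(200, 3, 1), 7),
    (datetime(300, 3, 1), 1447),
    (datetime(500, 3, 1), 2887),
    (datetime(600, 3, 1), 4327),
    (datetime(700, 3, 1), 5767),
    (datetime(900, 3, 1), 7207),
    (datetime(1000, 3, 1), 8647),
    (datetime(1100, 3, 1), 10087),
    (datetime(1300, 3, 1), 11527),
    (datetime(1400, 3, 1), 12967),
    (datetime(1500, 3, 1), 14407),
    (datetime(1582, 10, 15), 7),
    (datetime(1883, 11, 18, 12, 22, 58), 0),
]
SWITCH_POINTS = [point for point, _ in SWITCHES]
DIFFS = [timedelta(minutes=minutes) for _, minutes in SWITCHES]


def lookup_diff(local_ts):
    """Return the diff of the last switch point that is <= local_ts."""
    # Binary search; the first switch point equals datetime.min, so the
    # index is never negative for a valid datetime.
    i = bisect_right(SWITCH_POINTS, local_ts) - 1
    return DIFFS[i]


def rebase(local_ts):
    """Apply the tabulated diff (assumed direction: add it to the input)."""
    # Note: inputs close to datetime.min can overflow Python's datetime
    # range because the first diff is negative.
    return local_ts + lookup_diff(local_ts)
```

Each lookup is a binary search over 15 entries instead of a full calendar 
conversion, which is where the speed-up would be expected to come from.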



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

