waitingkuo commented on issue #4106:
URL: 
https://github.com/apache/arrow-datafusion/issues/4106#issuecomment-1304872305

   @avantgardnerio thank you. I didn't aware this before. never live in the 
area that has the timezone switch. 
   
   I did some research in Postgrseql and Chrono-tz
   
   MST timezone offset: -7
   MDT timezone offset: -6
   
   In 2022, MDT
   began from  Sunday, 13 March, 02:00 (changed from -7 to -6)
   ended at  Sunday, 6 November, 02:00 (changed from -6 to -7)
   
   ```bash
   willy=# set timezone to 'America/Denver';
   SET
   ```
   this is valid (right before the timezone shift):
   ```bash
   willy=# select to_timestamp('2022-03-13 01:00', 'YYYY-MM-DD HH24:MI');
         to_timestamp      
   ------------------------
    2022-03-13 01:00:00-07
   (1 row)
   
   willy=# select to_timestamp('2022-03-13 01:59', 'YYYY-MM-DD HH24:MI');
         to_timestamp      
   ------------------------
    2022-03-13 01:59:00-07
   (1 row)
   ```
   
   begin from 2:00, it switches to MDT (-6)
   i think the logic behind for this hour is:
   parse as it's MST(-7), and then switch to MDT (-6), that's why we have an 
one hour shift here
   ```bash
   willy=# select to_timestamp('2022-03-13 02:00', 'YYYY-MM-DD HH24:MI');
         to_timestamp      
   ------------------------
    2022-03-13 03:00:00-06
   (1 row)
   
   willy=# select to_timestamp('2022-03-13 02:30', 'YYYY-MM-DD HH24:MI');
         to_timestamp      
   ------------------------
    2022-03-13 03:30:00-06
   (1 row)
   
   willy=# select to_timestamp('2022-03-13 02:59', 'YYYY-MM-DD HH24:MI');
         to_timestamp      
   ------------------------
    2022-03-13 03:59:00-06
   (1 row)
   ```
   
   and then the next hour it's parsed as MDT
   ```bash
   willy=# select to_timestamp('2022-03-13 03:00', 'YYYY-MM-DD HH24:MI');
         to_timestamp      
   ------------------------
    2022-03-13 03:00:00-06
   (1 row)
   ```
   
   
   In November 6th 2am MDT, time zone is switched back to MST (1am MST)
   
   let's begin with something unambiguous
   ```
   willy=# select to_timestamp('2022-11-06 00:59', 'YYYY-MM-DD HH24:MI');
         to_timestamp      
   ------------------------
    2022-11-06 00:59:00-06
   (1 row)
   ```
   
   for the next 1 hour, it's ambiguous since both MST and MDT has 1 am. and 
this is what postgresql does, it uses MST even though 1am MDT is valid as well
   ```bash
   willy=# select to_timestamp('2022-11-06 01:00', 'YYYY-MM-DD HH24:MI');
         to_timestamp      
   ------------------------
    2022-11-06 01:00:00-07
   (1 row)
   willy=# select to_timestamp('2022-11-06 01:59', 'YYYY-MM-DD HH24:MI');
         to_timestamp      
   ------------------------
    2022-11-06 01:59:00-07
   (1 row)
   ```
   
   after that, there's no ambiguity
   ```bash
   willy=# select to_timestamp('2022-11-06 02:00', 'YYYY-MM-DD HH24:MI');
         to_timestamp      
   ------------------------
    2022-11-06 02:00:00-07
   (1 row)
   ```
   
   conclusion for Postgrseql: when it's ambiguous or invalid, it's parsed as 
MST, and then switches to MDT if needed
   
   
   now let's see the behavior for `chrono-tz`
   ```rust
        let tz: Tz = "America/Denver".parse().unwrap();
       let dt = tz.datetime_from_str("2022-03-13T01:59", "%Y-%m-%dT%H:%M");
       println!("2022-03-13T01:59 -> {:?} / {}", dt, dt.unwrap().to_rfc3339());
       let dt = tz.datetime_from_str("2022-03-13T02:00", "%Y-%m-%dT%H:%M");
       println!("2022-03-13T02:00 -> {:?}", dt);
       let dt = tz.datetime_from_str("2022-03-13T02:59", "%Y-%m-%dT%H:%M");
       println!("2022-03-13T02:59 -> {:?}", dt);
       let dt = tz.datetime_from_str("2022-03-13T03:00", "%Y-%m-%dT%H:%M");
       println!("2022-03-13T03:00 -> {:?} / {}", dt, dt.unwrap().to_rfc3339());
   ```
   ```bash
   2022-03-13T01:59 -> Ok(2022-03-13T01:59:00MST) / 2022-03-13T01:59:00-07:00
   2022-03-13T02:00 -> Err(ParseError(Impossible))
   2022-03-13T02:59 -> Err(ParseError(Impossible))
   2022-03-13T03:00 -> Ok(2022-03-13T03:00:00MDT) / 2022-03-13T03:00:00-06:00
   ```
   ```rust
       let dt = tz.datetime_from_str("2022-11-06T00:59", "%Y-%m-%dT%H:%M");
       println!("2022-11-06T00:59 -> {:?} / {}", dt, dt.unwrap().to_rfc3339());
       let dt = tz.datetime_from_str("2022-11-06T01:00", "%Y-%m-%dT%H:%M");
       println!("2022-11-06T01:00 -> {:?}", dt);
       let dt = tz.datetime_from_str("2022-11-06T01:59", "%Y-%m-%dT%H:%M");
       println!("2022-11-06T01:59 -> {:?}", dt);
       let dt = tz.datetime_from_str("2022-11-06T02:00", "%Y-%m-%dT%H:%M");
       println!("2022-11-06T02:00 -> {:?} / {}", dt, dt.unwrap().to_rfc3339());
   ```
   ```bash
   2022-11-06T00:59 -> Ok(2022-11-06T00:59:00MDT) / 2022-11-06T00:59:00-06:00
   2022-11-06T01:00 -> Err(ParseError(NotEnough))
   2022-11-06T01:59 -> Err(ParseError(NotEnough))
   2022-11-06T02:00 -> Ok(2022-11-06T02:00:00MST) / 2022-11-06T02:00:00-07:00
   ```
   I think `chrono-tz` did a good job here.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to