kosiew opened a new pull request, #18025:
URL: https://github.com/apache/datafusion/pull/18025

   ### **PR Title**
   
   **Respect execution timezone in `to_timestamp` and related functions**
   
   ---
   
   ## Which issue does this PR close?
   
   Closes #XXXX
   
   ---
   
   ## Rationale for this change
   
   Currently, `to_timestamp()` and its precision variants 
(`to_timestamp_seconds`, `to_timestamp_millis`, `to_timestamp_micros`, 
`to_timestamp_nanos`) **ignore the execution timezone** provided in 
`ScalarFunctionArgs.config_options`.
   As a result, timestamp strings without explicit timezone information are 
**always interpreted as UTC**, regardless of the session’s 
`datafusion.execution.time_zone` setting.
   
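   This behavior is incorrect when a non-UTC execution timezone is configured. 
For example, with `datafusion.execution.time_zone` set to `+05:00`, the naïve 
string `2020-09-08T13:42:29` is read as 13:42:29 UTC instead of 13:42:29 at 
`+05:00` (08:42:29 UTC), so the result is off by the zone's offset.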
   
   This PR ensures that all `to_timestamp*` functions correctly respect and 
apply the configured execution timezone for naïve (timezone-free) datetime 
strings.
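
To make the intended semantics concrete, here is a minimal, std-only sketch of how the same naïve wall-clock time maps to different instants depending on the offset applied. It is not code from this PR: `days_from_civil` and `naive_to_epoch_seconds` are hypothetical helpers written for illustration, and only fixed offsets are covered (named zones need a timezone database).

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date.
/// Hypothetical helper for this sketch (a standard civil-date algorithm).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // year of era, 0..=399
    let mp = (m + 9) % 12; // month index with March = 0
    let doy = (153 * mp + 2) / 5 + d - 1; // day of year, counted from March 1
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// Epoch seconds for a naive wall-clock time interpreted at a fixed offset
/// (`offset_seconds` is seconds east of UTC). Hypothetical helper.
fn naive_to_epoch_seconds(
    y: i64, m: i64, d: i64,
    hh: i64, mm: i64, ss: i64,
    offset_seconds: i64,
) -> i64 {
    let local = days_from_civil(y, m, d) * 86_400 + hh * 3_600 + mm * 60 + ss;
    // Wall-clock time minus the offset gives the UTC instant.
    local - offset_seconds
}

fn main() {
    // Old behavior: the naive string is always treated as UTC.
    let as_utc = naive_to_epoch_seconds(2020, 9, 8, 13, 42, 29, 0);
    // Intended behavior with an execution timezone of +05:00.
    let as_plus_five = naive_to_epoch_seconds(2020, 9, 8, 13, 42, 29, 5 * 3_600);

    assert_eq!(as_utc, 1_599_572_549);
    assert_eq!(as_utc - as_plus_five, 5 * 3_600);
    println!("UTC: {as_utc}, +05:00: {as_plus_five}");
}
```

The two results differ by exactly the configured offset, which is the shift that `to_timestamp_respects_execution_timezone` is described as checking.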
   
   ---
   
   ## What changes are included in this PR?
   
   * Introduced a new utility type:
   
     * `ConfiguredTimeZone`, encapsulating both named timezones (via 
`arrow::array::timezone::Tz`) and fixed offsets (`FixedOffset`).
   * Added timezone parsing helpers:
   
     * `ConfiguredTimeZone::parse()` to resolve IANA names or `±HH:MM` offsets.
     * `parse_fixed_offset()` to safely interpret offset strings.
   * Updated all `to_timestamp*` implementations to:
   
     * Extract the execution timezone from `config_options.execution.time_zone`.
     * Use `string_to_timestamp_nanos_with_timezone()` or
       `string_to_timestamp_nanos_formatted_with_timezone()` when parsing naïve 
datetime strings.
   * Added robust conversion helpers for naïve vs. localized datetimes:
   
     * `timestamp_to_naive`, `datetime_to_timestamp`, and
       `local_datetime_to_timestamp` (handling ambiguous/invalid times).
   * Added unit tests:
   
     * ✅ `to_timestamp_respects_execution_timezone` verifies that configured 
offsets shift timestamps as expected.
     * ✅ `to_timestamp_formats_respect_timezone` ensures format-based parsing 
respects named zones (e.g., `"Asia/Tokyo"`).
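
The parsing helpers listed above can be pictured roughly as follows. This is a std-only sketch under stated assumptions, not the PR's implementation: the real `ConfiguredTimeZone` is described as wrapping `arrow::array::timezone::Tz` and `FixedOffset`, whereas this stand-in stores a named zone as an unvalidated string and an offset as plain seconds, and the return types are guesses.

```rust
/// Hypothetical stand-in for the PR's `ConfiguredTimeZone`.
#[derive(Debug, PartialEq)]
enum ConfiguredTimeZone {
    /// IANA name such as "Asia/Tokyo"; not validated in this sketch.
    Named(String),
    /// Fixed offset in seconds east of UTC.
    Offset(i32),
}

/// Parse `+HH:MM` or `-HH:MM` into seconds east of UTC.
/// Returns `None` for anything malformed or out of range.
fn parse_fixed_offset(s: &str) -> Option<i32> {
    let (sign, rest) = if let Some(r) = s.strip_prefix('+') {
        (1, r)
    } else if let Some(r) = s.strip_prefix('-') {
        (-1, r)
    } else {
        return None;
    };
    let (hh, mm) = rest.split_once(':')?;
    // Require exactly two ASCII digits on each side of the colon.
    let two_digits = |p: &str| p.len() == 2 && p.bytes().all(|b| b.is_ascii_digit());
    if !two_digits(hh) || !two_digits(mm) {
        return None;
    }
    let hours: i32 = hh.parse().ok()?;
    let minutes: i32 = mm.parse().ok()?;
    if hours > 23 || minutes > 59 {
        return None;
    }
    Some(sign * (hours * 3_600 + minutes * 60))
}

impl ConfiguredTimeZone {
    /// Try a fixed offset first, then fall back to treating the input as a name.
    fn parse(s: &str) -> Option<Self> {
        let s = s.trim();
        if s.is_empty() {
            return None;
        }
        match parse_fixed_offset(s) {
            Some(secs) => Some(Self::Offset(secs)),
            // A leading sign that failed offset parsing is rejected, not treated as a name.
            None if s.starts_with('+') || s.starts_with('-') => None,
            None => Some(Self::Named(s.to_string())),
        }
    }
}

fn main() {
    assert_eq!(ConfiguredTimeZone::parse("+05:00"), Some(ConfiguredTimeZone::Offset(18_000)));
    assert_eq!(ConfiguredTimeZone::parse("-03:30"), Some(ConfiguredTimeZone::Offset(-12_600)));
    assert_eq!(
        ConfiguredTimeZone::parse("Asia/Tokyo"),
        Some(ConfiguredTimeZone::Named("Asia/Tokyo".to_string()))
    );
    assert_eq!(ConfiguredTimeZone::parse("+25:00"), None);
}
```

Keeping the two cases in one type lets every `to_timestamp*` variant resolve the configured zone once and then branch only where a naïve datetime has to be localized.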
   
   ---
   
   ## Are these changes tested?
   
   Yes.
   Two new unit tests were added:
   
   * `to_timestamp_respects_execution_timezone`
   * `to_timestamp_formats_respect_timezone`
   
   Existing `to_timestamp_*` tests also pass with the new timezone logic.
   
   ---
   
   ## Are there any user-facing changes?
   
   Yes — **intended behavioral improvement**:
   
   * Naïve datetime strings passed to `to_timestamp()` and its variants will 
now be interpreted relative to the configured execution timezone (from 
`datafusion.execution.time_zone`), rather than defaulting to UTC.
   
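   There are **no breaking API changes**. Results do change, however, for 
sessions that configure a non-UTC execution timezone and pass naïve datetime 
strings: those timestamps are now shifted by the zone's offset.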
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

