alamb opened a new issue, #5827: URL: https://github.com/apache/arrow-rs/issues/5827
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

This is in the context of implementing `date_bin` for timestamps with timezones: https://github.com/apache/datafusion/issues/10602

I made https://github.com/apache/arrow-rs/pull/5826 to document the behavior of casting timestamps, and I found it very confusing. Specifically, when you cast from `Timestamp(None)` to `Timestamp(Some(tz))` and then back to `Timestamp(None)`, the underlying timestamp values are changed, as shown in this example:

```rust
use arrow_array::Int64Array;
use arrow_array::types::{TimestampSecondType};
use arrow_cast::{cast, display};
use arrow_array::cast::AsArray;
use arrow_schema::{DataType, TimeUnit};

let data_type = DataType::Timestamp(TimeUnit::Second, None);
let data_type_tz = DataType::Timestamp(TimeUnit::Second, Some("-05:00".into()));
let a = Int64Array::from(vec![1_000_000_000, 2_000_000_000, 3_000_000_000]);

let b = cast(&a, &data_type).unwrap(); // cast to timestamp without timezone
let b = b.as_primitive::<TimestampSecondType>(); // downcast to result type
assert_eq!(2_000_000_000, b.value(1)); // values are still the same

// Convert timestamps without a timezone to timestamps with a timezone
let c = cast(&b, &data_type_tz).unwrap();
let c = c.as_primitive::<TimestampSecondType>(); // downcast to result type
assert_eq!(2_000_018_000, c.value(1)); // value has been adjusted by offset

// Convert from timestamp with timezone back to timestamp without timezone
let d = cast(&c, &data_type).unwrap();
let d = d.as_primitive::<TimestampSecondType>(); // downcast to result type
assert_eq!(2_000_018_000, d.value(1)); // <---- **** THIS VALUE IS DIFFERENT THAN IT WAS INITIALLY
assert_eq!("2033-05-18T08:33:20", display::array_value_to_string(&d, 1).unwrap());
```

Thus I wanted to discuss whether we should change the behavior to make it less surprising, or whether there is a reason to leave the current behavior.

**Describe the solution you'd like**

I propose making `casting timestamp with a timezone to timestamp without a timezone` do the inverse of `casting timestamp without a timezone to timestamp with a timezone`.

This would mean the final value of `d` in the above example is `2_000_000_000`, not `2_000_018_000`.

**Describe alternatives you've considered**

Leave existing behavior

**Additional context**
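To make the arithmetic behind the proposal concrete, here is a minimal sketch in plain Rust, assuming a fixed UTC offset (no DST) and second precision. The helper names `local_to_utc`, `utc_to_local`, and `round_trip` are hypothetical and are not part of arrow-rs; they only model the offset adjustment seen in the example above, not the actual `cast` kernel.

```rust
/// Hypothetical helper (not an arrow-rs API): treat `local_secs` as a
/// wall-clock time in a zone `offset_secs` east of UTC and return the
/// corresponding UTC epoch seconds. This models the adjustment observed
/// when casting `Timestamp(None)` to `Timestamp(Some(tz))`.
fn local_to_utc(local_secs: i64, offset_secs: i64) -> i64 {
    local_secs - offset_secs
}

/// Hypothetical inverse of `local_to_utc`: recover the wall-clock value
/// from UTC epoch seconds. This models what the issue proposes for
/// casting `Timestamp(Some(tz))` back to `Timestamp(None)`.
fn utc_to_local(utc_secs: i64, offset_secs: i64) -> i64 {
    utc_secs + offset_secs
}

/// Apply both conversions in sequence; under the proposal this returns
/// the input unchanged.
fn round_trip(local_secs: i64, offset_secs: i64) -> i64 {
    utc_to_local(local_to_utc(local_secs, offset_secs), offset_secs)
}
```

With the `-05:00` offset from the example (`-18_000` seconds), `local_to_utc(2_000_000_000, -18_000)` gives `2_000_018_000`, matching `c.value(1)`, and `round_trip(2_000_000_000, -18_000)` gives back `2_000_000_000`, which is the value proposed for `d.value(1)`.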
