rok commented on pull request #10457:
URL: https://github.com/apache/arrow/pull/10457#issuecomment-858470970


   > I think it would be good to write some tests in python as well, as 
currently the C++ tests are very hard to verify since we don't yet have the 
ability to parse strings localized in the timezone (I mean: the strings in the 
tests are interpreted as UTC and not the "Australia/Broken_Hill" timezone. And 
thus as a result, the expected values can't be read/verified from the strings).
   > (while in python we could create the localized input timestamps with 
pandas)
   
   Indeed! I've already used that approach, generating the data with pandas in 
[ARROW-11759](https://github.com/apache/arrow/pull/10176), and I'll just adopt 
that code here.
   It did lead me to what looks like an [odd issue in 
pandas](https://github.com/pandas-dev/pandas/issues/41834) that I can't quite 
explain. We'll either need to address it or avoid it in tests.
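
A rough sketch of that approach, for illustration only: build naive wall-clock timestamps, localize them with pandas, and read off the UTC instants to use as expected values. The sample dates and variable names are my own assumptions, not the data from ARROW-11759.

```python
import pandas as pd

# Naive wall-clock times, meant to be read as local time in Broken Hill.
# (Illustrative sample values, not the ones used in the actual tests.)
naive = pd.to_datetime(["2021-06-01 12:00:00", "2021-12-01 12:00:00"])

# Attach the timezone without shifting the wall-clock time.
localized = naive.tz_localize("Australia/Broken_Hill")

# The underlying instants in UTC, i.e. what a timezone-unaware parser
# would need to be given to represent the same moments.
as_utc = localized.tz_convert("UTC")
utc_strings = as_utc.strftime("%Y-%m-%d %H:%M:%S").tolist()

print(utc_strings)
```

The first value falls in standard time (UTC+09:30) and the second in daylight saving time (UTC+10:30), so the two UTC strings differ by an hour in their time of day.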
   
   > Given that, it might also make sense to first add a "localize" kernel for 
converting timestamps from naive to a certain timezone.
   
   A localize kernel would be great. I suppose we'd need a scalar and a vector 
one, depending on whether the timezone is shared between rows or not? The 
vector version would apply [here](https://issues.apache.org/jira/browse/ARROW-5912).
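
To make the two cases concrete, here is a minimal standard-library sketch of the semantics I have in mind, not a proposed kernel implementation. The function names `localize`, `localize_shared` and `localize_per_row` are hypothetical.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def localize(naive, tz_name):
    """Read a naive wall-clock datetime as local time in tz_name, return UTC."""
    if naive.tzinfo is not None:
        raise ValueError("expected a naive datetime")
    # zoneinfo does not raise on ambiguous or nonexistent local times;
    # it resolves them via the `fold` attribute (0 by default).
    return naive.replace(tzinfo=ZoneInfo(tz_name)).astimezone(timezone.utc)


def localize_shared(values, tz_name):
    """One timezone shared by every row."""
    return [localize(v, tz_name) for v in values]


def localize_per_row(values, tz_names):
    """A separate timezone for each row."""
    if len(values) != len(tz_names):
        raise ValueError("values and tz_names must have the same length")
    return [localize(v, tz) for v, tz in zip(values, tz_names)]


wall = datetime(2021, 6, 1, 12, 0)
shared = localize_shared([wall, wall], "Australia/Broken_Hill")
per_row = localize_per_row([wall, wall], ["Australia/Broken_Hill", "UTC"])
print(shared)
print(per_row)
```

With a shared timezone both rows map to the same instant, while in the per-row case the same wall-clock time maps to different instants.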
   
   Also, the strptime kernel ignores timezones at the moment.



