lidavidm commented on a change in pull request #11358:
URL: https://github.com/apache/arrow/pull/11358#discussion_r727420763
##########
File path: cpp/src/arrow/compute/kernels/scalar_string_test.cc
##########
@@ -1429,6 +1430,17 @@ TYPED_TEST(TestStringKernels, Strptime) {
   this->CheckUnary("strptime", input1, timestamp(TimeUnit::MICRO), output1,
                    &options);
 }
+TYPED_TEST(TestStringKernels, StrptimeZoneOffset) {
+  if (!arrow::internal::kStrptimeSupportsZone) {
+    GTEST_SKIP() << "strptime does not support %z on this platform";
+  }
+  std::string input1 = R"(["5/1/2020 +01", null, "12/11/1900 -01:30"])";
+  std::string output1 =
+      R"(["2020-04-30T23:00:00.000000", null, "1900-12-11T01:30:00.000000"])";
+  StrptimeOptions options("%m/%d/%Y %z", TimeUnit::MICRO);
+  this->CheckUnary("strptime", input1, timestamp(TimeUnit::MICRO), output1,
+                   &options);
Review comment:
Makes sense, thanks all for the comments.
I'm OOO this week, but when I next get a chance I'll update the CSV parser
to track the zone offset and return either "UTC", no timezone, or an error.
(Preserving a consistent non-UTC offset for users who want one can be tackled
later.) I think casting also needs to be updated, as does Python
`pyarrow.array` inference (though that may just use casting as well; I'm not
sure off the top of my head).
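
For concreteness, here is a rough sketch of the tracking rule I have in mind. It is not the actual CSV parser code: `ZoneTracker`, `ZoneKind`, and `Observe` are hypothetical names invented for illustration, and I'm assuming the parser can report, per non-null value, whether a `%z` offset was present.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical: what we have learned so far about a timestamp column.
enum class ZoneKind { kUnknown, kNaive, kUtc };

// Hypothetical helper, not part of Arrow: remembers whether the values seen
// so far carried a zone offset, and rejects columns that mix both kinds.
class ZoneTracker {
 public:
  // Record one parsed, non-null value. `had_offset` is true when the input
  // string contained a zone offset (so the value was normalized to UTC).
  void Observe(bool had_offset) {
    const ZoneKind seen = had_offset ? ZoneKind::kUtc : ZoneKind::kNaive;
    if (kind_ == ZoneKind::kUnknown) {
      kind_ = seen;  // the first value decides the column's kind
    } else if (kind_ != seen) {
      throw std::invalid_argument(
          "column mixes timestamps with and without a zone offset");
    }
  }

  // Timezone for the resulting timestamp type: "UTC" if every value had an
  // offset, empty (no timezone) otherwise.
  std::string timezone() const {
    return kind_ == ZoneKind::kUtc ? "UTC" : "";
  }

 private:
  ZoneKind kind_ = ZoneKind::kUnknown;
};
```

Nulls are simply never passed to `Observe`, so an all-null column ends up with no timezone. Throwing is only for brevity here; real Arrow code would presumably return a `Status` instead.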