pitrou commented on code in PR #48601:
URL: https://github.com/apache/arrow/pull/48601#discussion_r2667527253
##########
python/pyarrow/tests/test_compute.py:
##########
@@ -2317,9 +2317,18 @@ def test_strftime():
for fmt in formats:
options = pc.StrftimeOptions(fmt)
result = pc.strftime(tsa, options=options)
- # cast to the same type as result to ignore string vs
large_string
expected = pa.array(ts.strftime(fmt)).cast(result.type)
- assert result.equals(expected)
+ if sys.platform == "win32" and fmt == "%Z":
+ # TODO(GH-48743): On Windows, std::chrono returns GMT
+ # offset style (e.g. "GMT+1") instead of timezone
+ # abbreviations (e.g. "CET")
+ # https://github.com/apache/arrow/issues/48743
+ # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116110
+ for val in result:
+ assert val.as_py() is None or
val.as_py().startswith("GMT") \
+ or val.as_py() == "UTC"
Review Comment:
So the formatted timestamp can just be "UTC", is that right?
##########
python/pyarrow/config.pxi:
##########
@@ -96,21 +96,3 @@ build_info = _build_info()
cpp_build_info = build_info.cpp_build_info
cpp_version = build_info.cpp_build_info.version
cpp_version_info = build_info.cpp_build_info.version_info
-
-
-def set_timezone_db_path(path):
Review Comment:
Can we just deprecate this for now?
##########
python/pyarrow/tests/test_compute.py:
##########
@@ -2317,9 +2317,18 @@ def test_strftime():
for fmt in formats:
options = pc.StrftimeOptions(fmt)
result = pc.strftime(tsa, options=options)
- # cast to the same type as result to ignore string vs
large_string
expected = pa.array(ts.strftime(fmt)).cast(result.type)
- assert result.equals(expected)
+ if sys.platform == "win32" and fmt == "%Z":
+ # TODO(GH-48743): On Windows, std::chrono returns GMT
+ # offset style (e.g. "GMT+1") instead of timezone
+ # abbreviations (e.g. "CET")
Review Comment:
Ok, this means the compute function will have to introduce a workaround for
Windows if we want to expose consistent results? This can be done in a separate
issue IMHO.
##########
python/pyarrow/tests/test_compute.py:
##########
@@ -2317,9 +2317,18 @@ def test_strftime():
for fmt in formats:
options = pc.StrftimeOptions(fmt)
result = pc.strftime(tsa, options=options)
- # cast to the same type as result to ignore string vs
large_string
expected = pa.array(ts.strftime(fmt)).cast(result.type)
- assert result.equals(expected)
+ if sys.platform == "win32" and fmt == "%Z":
+ # TODO(GH-48743): On Windows, std::chrono returns GMT
+ # offset style (e.g. "GMT+1") instead of timezone
+ # abbreviations (e.g. "CET")
+ # https://github.com/apache/arrow/issues/48743
+ # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116110
Review Comment:
These issues do not seem like the issue, are they?
##########
cpp/src/arrow/public_api_test.cc:
##########
@@ -122,46 +122,4 @@ TEST(Misc, BuildInfo) {
ASSERT_THAT(info.full_so_version, ::testing::HasSubstr(info.so_version));
}
-TEST(Misc, SetTimezoneConfig) {
Review Comment:
Can we keep this test as long as the tested APIs still do something?
(it could run on MinGW or CLang64 builds I suppose)
##########
cpp/src/arrow/config.h:
##########
@@ -64,13 +64,6 @@ struct RuntimeInfo {
/// The SIMD level available on the OS and CPU
std::string detected_simd_level;
-
- /// Whether using the OS-based timezone database
- /// This is set at compile-time.
- bool using_os_timezone_db;
-
- /// The path to the timezone database; by default None.
- std::optional<std::string> timezone_db_path;
Review Comment:
Can we just deprecate these?
##########
cpp/src/arrow/config.cc:
##########
@@ -77,15 +75,11 @@ RuntimeInfo GetRuntimeInfo() {
MakeSimdLevelString([&](int64_t flags) { return
cpu_info->IsSupported(flags); });
info.detected_simd_level =
MakeSimdLevelString([&](int64_t flags) { return
cpu_info->IsDetected(flags); });
- info.using_os_timezone_db = USE_OS_TZDB;
-#if !USE_OS_TZDB
- info.timezone_db_path = timezone_db_path;
-#else
- info.timezone_db_path = std::optional<std::string>();
-#endif
return info;
}
+// TODO(GH-48743): Remove when RTools upgrades to GCC with std::chrono
timezone support
Review Comment:
Nit: can you replace references to "GCC" with "libstdc++"? This isn't really
about the compiler...
##########
cpp/src/arrow/config.cc:
##########
@@ -95,12 +89,7 @@ Status Initialize(const GlobalOptions& options) noexcept {
} catch (const std::runtime_error& e) {
return Status::IOError(e.what());
}
- timezone_db_path = options.timezone_db_path.value();
-#else
Review Comment:
Why remove this path?
##########
python/pyarrow/util.py:
##########
@@ -242,35 +242,3 @@ def _download_requests(url, out_path):
with requests.get(url) as response:
with open(out_path, 'wb') as f:
f.write(response.content)
-
-
-def download_tzdata_on_windows():
Review Comment:
If this is a public API, it should be deprecated first.
##########
cpp/src/arrow/testing/util.cc:
##########
@@ -122,25 +122,6 @@ Status GetTestResourceRoot(std::string* out) {
return Status::OK();
}
-std::optional<std::string> GetTestTimezoneDatabaseRoot() {
Review Comment:
CLang64 doesn't need this? It's not clear from the PR changes...
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]