nealrichardson commented on a change in pull request #10334:
URL: https://github.com/apache/arrow/pull/10334#discussion_r633148838
##########
File path: r/R/compute.R
##########
@@ -286,3 +286,8 @@ cast_options <- function(safe = TRUE, ...) {
)
modifyList(opts, list(...))
}
+
+strptime_arrow <- function(..., format, unit){
+ a <- collect_arrays_from_dots(list(...))
Review comment:
I don't think you want `collect_arrays_from_dots` here. This function
exists to support the base R behavior like:
```
> sum(1, 2)
[1] 3
```
But `strptime` doesn't take `...` like that.
##########
File path: r/R/compute.R
##########
@@ -286,3 +286,8 @@ cast_options <- function(safe = TRUE, ...) {
)
modifyList(opts, list(...))
}
+
+strptime_arrow <- function(..., format, unit){
Review comment:
I'm not sure how useful this function is since it is a thin wrapper
around `call_function()` and we can't set it as an S3 method.. More useful
would be to add a version of this in the `nse_funcs` in dplyr-functions.R.
In either case, we should match the `base::strptime()` signature: `function
(x, format, tz = "")` with the possible addition of `unit` if that's an Arrow
feature.
Also, should `format` and `unit` have default arguments?
##########
File path: r/tests/testthat/test-Array.R
##########
@@ -291,6 +291,17 @@ test_that("Timezone handling in Arrow roundtrip
(ARROW-3543)", {
expect_identical(read_feather(feather_file), df)
})
+test_that("strptime", {
+ # array of strings
+ time_strings <- Array$create(c("2018-10-07 19:04:05", NA))
+ # array of timestamps (doesn't work if tz="" is added!)
+ timestamps <- Array$create(c(as.POSIXct("2018-10-07 19:04:05"), NA))
+ # array of parsed timestamps
+ parsed_timestamps <- strptime_arrow(time_strings, format = "%Y-%m-%d
%H:%M:%S", unit = TimeUnit$MICRO)
Review comment:
`unit` should also take human-friendly strings ("s", "ms", etc.); see
how this is done in the `timestamp()` function in type.R.
##########
File path: r/tests/testthat/test-Array.R
##########
@@ -291,6 +291,17 @@ test_that("Timezone handling in Arrow roundtrip
(ARROW-3543)", {
expect_identical(read_feather(feather_file), df)
})
+test_that("strptime", {
+ # array of strings
+ time_strings <- Array$create(c("2018-10-07 19:04:05", NA))
+ # array of timestamps (doesn't work if tz="" is added!)
Review comment:
Why not?
##########
File path: r/src/compute.cpp
##########
@@ -233,6 +233,12 @@ std::shared_ptr<arrow::compute::FunctionOptions>
make_compute_options(
max_replacements);
}
+ if (func_name == "strptime") {
+ using Options = arrow::compute::StrptimeOptions;
Review comment:
Does `StrptimeOptions` have a `Defaults()` method in C++? If so, we
should call it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]