[ https://issues.apache.org/jira/browse/ARROW-16316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17527504#comment-17527504 ]
Rok Mihevc edited comment on ARROW-16316 at 4/25/22 1:46 PM:
-------------------------------------------------------------

Exactly as [~dragosmg] said. Additionally:
{code:java}
call_function("round_temporal", arrow_now, options = list(unit = arrow_unit)){code}
should currently work, but it behaves differently from lubridate. Namely, the rounding origin is set to 1970-01-01 instead of the start of the next unit larger than the one being rounded to. That means that if you round to 6 hours, it will round to a multiple of 6 hours since 1970-01-01 instead of a multiple of 6 hours since the beginning of the day the timestamp falls into. If you're rounding to a multiple of 1 unit, the two behaviours coincide and you get the same result.

was (Author: rokm):
Exactly as [~dragosmg] said. Additionally:
{code:java}
call_function("round_temporal", arrow_now, unit = arrow_unit) {code}
should currently work, but it behaves differently from lubridate. Namely, the rounding origin is set to 1970-01-01 instead of the start of the next unit larger than the one being rounded to. That means that if you round to 6 hours, it will round to a multiple of 6 hours since 1970-01-01 instead of a multiple of 6 hours since the beginning of the day the timestamp falls into. If you're rounding to a multiple of 1 unit, the two behaviours coincide and you get the same result.

> [R] How to round the timestamps in a mutate statement?
> ------------------------------------------------------
>
>                 Key: ARROW-16316
>                 URL: https://issues.apache.org/jira/browse/ARROW-16316
>             Project: Apache Arrow
>          Issue Type: Wish
>          Components: R
>    Affects Versions: 7.0.0
>            Reporter: Zsolt Kegyes-Brassai
>            Priority: Minor
>
> I was trying to aggregate over time using different granularities. Usually I
> would use {{lubridate::floor_date()}}, which is currently not supported
> for parquet datasets.
> Is there a comprehensive list of the currently supported {{lubridate}}
> (or {{dplyr}}) verbs? Maybe it's my own fault, but apart from the
> changelog I haven't found any relevant information.
>
> Later I found that the {{round_temporal()}} function is exposed to {{R}}.
> But I am struggling to find the right syntax inside a mutate statement to
> apply it to a {{timestamp[us, tz=UTC]}} type column.
> {code:java}
> new_dataset |>
>   mutate(time = arrow_round_temporal(time))
> #> Error: Invalid: Attempted to initialize KernelState from null FunctionOptions
> {code}
>
> Here are some other attempts:
> {code:java}
> library(arrow)
> arrow_now <- Scalar$create(lubridate::now())
> (arrow_now)
> #> Scalar
> #> 2022-04-25 11:44:33.805609
> call_function("round_temporal", arrow_now)
> #> Scalar
> #> 2022-04-25 00:00:00.000000
> call_function("round_temporal", arrow_now, unit = "day")
> #> Error: Argument 2 is of class character but it must be one of "Array", "ChunkedArray", "RecordBatch", "Table", or "Scalar"
> arrow_unit <- Scalar$create("day")
> (arrow_unit)
> #> Scalar
> #> day
> call_function("round_temporal", arrow_now, unit = arrow_unit)
> #> Error: Invalid: Function 'round_temporal' accepts 1 arguments but attempted to look up kernel(s) with 2
> {code}
>

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
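
To make the rounding-origin difference described in the comment above concrete, here is a minimal base-R sketch. It does not call Arrow or lubridate; the helpers {{floor_epoch_origin()}} and {{floor_day_origin()}} are hypothetical names that reimplement the two conventions with plain arithmetic on UTC timestamps, purely for illustration. Note that in UTC a 6-hour unit happens to give the same answer under both conventions, because 6 divides 24 evenly, so the sketch also uses a 7-hour unit to make the difference visible.

```r
# Hypothetical helpers (not Arrow's or lubridate's implementation).
# Both floor a UTC POSIXct timestamp to a multiple of `n_hours` hours;
# they differ only in where the multiples are counted from.

# Multiples counted from 1970-01-01 00:00:00 UTC.
floor_epoch_origin <- function(x, n_hours) {
  step <- n_hours * 3600
  secs <- as.numeric(x)
  as.POSIXct(floor(secs / step) * step, origin = "1970-01-01", tz = "UTC")
}

# Multiples counted from the start of the day the timestamp falls into.
floor_day_origin <- function(x, n_hours) {
  step <- n_hours * 3600
  secs <- as.numeric(x)
  day_start <- floor(secs / 86400) * 86400
  floored <- day_start + floor((secs - day_start) / step) * step
  as.POSIXct(floored, origin = "1970-01-01", tz = "UTC")
}

x <- as.POSIXct("2022-04-25 11:44:33", tz = "UTC")

# 6 hours divides the day evenly, so both origins agree in UTC.
floor_epoch_origin(x, 6)
floor_day_origin(x, 6)

# 7 hours does not divide the day evenly, so the results differ.
floor_epoch_origin(x, 7)
floor_day_origin(x, 7)
```

With a unit of 1 (for example 1 hour or 1 day) both helpers return the same value, which matches the remark in the comment that the two behaviours coincide when rounding to a multiple of 1 unit.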