ianmcook commented on a change in pull request #9875: URL: https://github.com/apache/arrow/pull/9875#discussion_r607349708
##########
File path: r/R/compute.R
##########
@@ -80,6 +80,36 @@ collect_arrays_from_dots <- function(dots) {
   ChunkedArray$create(!!!arrays)
 }
 
+#' @export
+quantile.ArrowDatum <- function(x,
+                                probs = seq(0, 1, 0.25),
+                                na.rm = FALSE,
+                                interpolation = c("linear", "lower", "higher", "nearest", "midpoint"),
+                                ...) {

Review comment:
In 188ceea3fbc037c3944ac936e9492f9f388f3c95, I added an error if the user specifies a non-default value for `type`.

I read through the `quantile` docs and the associated Rob Hyndman paper, but most of the sample quantile types described there are quite different from any of the options implemented in Arrow, so that did not help much. Doing quantitative comparisons was more fruitful. Here's what I found:
- The R default `type = 7` corresponds most closely to the Arrow default `interpolation = "linear"` (which is very good)
- R's `type = 1` seems to correspond closely to Arrow's `interpolation = "lower"`
- None of the other R `type` options and Arrow `interpolation` options exhibit any close correspondence

Based on these findings, and considering how few users are likely to compute quantiles with Arrow using anything but the default type, I don't think there's any immediate action to take here, and I don't think any follow-up is needed either. It seems like a long shot that we would ever implement any of these other quantile algorithms in the C++ library, but I'll open a Jira for that if you think it's worth it.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
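
For anyone who wants to reproduce the kind of quantitative comparison described in the review comment, here is a minimal base-R sketch. It does not call Arrow. `interp_quantile()` is a hypothetical helper written for this illustration, and it assumes the five `interpolation` names follow the usual definitions based on the fractional position `(n - 1) * p` in the sorted data. The tie-breaking for `"nearest"` is also an assumption. The sample data, seed, and `probs` are arbitrary.

```r
# Hypothetical helper, not part of arrow: sample quantiles computed from the
# fractional 0-based position (n - 1) * p in the sorted data. The definitions
# of the five modes are assumed, not taken from the Arrow C++ source.
interp_quantile <- function(x, probs, interpolation = "linear") {
  x <- sort(x)
  pos <- (length(x) - 1) * probs   # fractional 0-based position
  lo <- x[floor(pos) + 1]          # order statistic at or below the position
  hi <- x[ceiling(pos) + 1]        # order statistic at or above the position
  frac <- pos - floor(pos)
  switch(interpolation,
    linear   = lo + (hi - lo) * frac,
    lower    = lo,
    higher   = hi,
    nearest  = ifelse(frac < 0.5, lo, hi),  # ties go up (assumed)
    midpoint = (lo + hi) / 2,
    stop("unknown interpolation: ", interpolation)
  )
}

set.seed(42)
x <- rnorm(1000)
probs <- seq(0, 1, 0.25)
modes <- c("linear", "lower", "higher", "nearest", "midpoint")

# Largest absolute difference between each R `type` (rows) and each
# interpolation mode (columns); values near zero indicate a close match.
max_diff <- sapply(modes, function(m) {
  sapply(1:9, function(type) {
    max(abs(quantile(x, probs, type = type, names = FALSE) -
              interp_quantile(x, probs, m)))
  })
})
rownames(max_diff) <- paste0("type", 1:9)
print(round(max_diff, 4))
```

Under these assumptions, the `type7` row should be zero (up to floating-point error) in the `linear` column, because R's type 7 also interpolates linearly at position `(n - 1) * p`. To check against Arrow itself, the `interp_quantile()` calls would be replaced with calls to the new `quantile()` method on an Arrow array.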