[jira] [Commented] (ARROW-14778) [C++] mean on a decimal truncates and does not round

David Li (Jira) Fri, 19 Nov 2021 08:35:06 -0800


    [ 
https://issues.apache.org/jira/browse/ARROW-14778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17446568#comment-17446568
 ]


David Li commented on ARROW-14778:
----------------------------------

Ah, it's because we perform all the computations at the input decimal 
precision/scale (so only 1 decimal digit here). We could perhaps promote the 
input to the max precision/scale, then round the result back down? E.g. for 
decimal128(5, 1), do the computations at decimal128(38, 2) or something, then 
round back down to (5, 1). I haven't thought this through too much; this would 
also apply to many of the other decimal kernels.
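
To make the arithmetic concrete, here is a rough Python sketch of the two 
approaches on the values from the report. It is not the Arrow kernel code: the 
helper names are hypothetical, decimals are modelled as plain scaled integers, 
one extra digit of scale is an arbitrary choice, and round-half-away-from-zero 
is assumed for the final step.

```python
from decimal import Decimal


def to_unscaled(values, scale):
    """Convert decimal strings to integers scaled by 10**scale."""
    return [int(Decimal(v) * 10**scale) for v in values]


def div_trunc(a, b):
    """Integer division truncating toward zero (b > 0), like C++ '/'."""
    q = abs(a) // b
    return q if a >= 0 else -q


def div_round_half_away(a, b):
    """Integer division rounding half away from zero (b > 0)."""
    q = (2 * abs(a) + b) // (2 * b)
    return q if a >= 0 else -q


def mean_at_input_scale(values, scale):
    """Mean computed entirely at the input scale; the division truncates."""
    unscaled = to_unscaled(values, scale)
    mean = div_trunc(sum(unscaled), len(unscaled))
    return Decimal(mean).scaleb(-scale)


def mean_promoted(values, scale, extra_digits=1):
    """Mean computed at a wider scale, then rounded back to the input scale."""
    wide = to_unscaled(values, scale + extra_digits)
    mean_wide = div_trunc(sum(wide), len(wide))
    rounded = div_round_half_away(mean_wide, 10**extra_digits)
    return Decimal(rounded).scaleb(-scale)


values = ["0.1", "0.2", "0.2", "0.2", "0.2"]  # same data as the report
print(mean_at_input_scale(values, scale=1))  # 0.1 (9 / 5 truncates to 1)
print(mean_promoted(values, scale=1))        # 0.2 (90 / 5 = 18, rounds to 2)
```

Truncating at the wider scale and then rounding can still be off by one unit 
in edge cases, so a real fix would need to pick the intermediate scale and 
rounding mode more carefully.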

> [C++] mean on a decimal truncates and does not round
> ----------------------------------------------------
>
>                 Key: ARROW-14778
>                 URL: https://issues.apache.org/jira/browse/ARROW-14778
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Jonathan Keane
>            Priority: Major
>              Labels: query-engine
>
> {code}
> library(arrow, warn.conflicts = FALSE)
> library(dplyr, warn.conflicts = FALSE)
> df <- data.frame(
>   x = c(0.1, 0.2, 0.2, 0.2, 0.2)
> )
> tab <- Table$create(df)
> tab %>%
>   summarise(mean(x)) %>% 
>   collect()
> #> # A tibble: 1 × 1
> #>   `mean(x)`
> #>       <dbl>
> #> 1      0.18
> tab %>%
>   summarise(x = mean(x)) %>% 
>   mutate(x = cast(x, decimal(5, 1))) %>% 
>   collect()
> #> # A tibble: 1 × 1
> #>       x
> #>   <dbl>
> #> 1   0.2
> tab %>%
>   mutate(x = cast(x, decimal(5, 1))) %>% 
>   summarise(x = mean(x)) %>% 
>   collect()
> #> # A tibble: 1 × 1
> #>       x
> #>   <dbl>
> #> 1   0.1
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
