[ 
https://issues.apache.org/jira/browse/ARROW-14045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17579368#comment-17579368
 ] 

Neal Richardson commented on ARROW-14045:
-----------------------------------------

{{hash_one}} does not take options (yet), so a C++ change would be required. 
But maybe the path forward is to just implement this anyway, possibly with a 
warning that it's not guaranteed to take the first value like regular dplyr, or 
possibly without a warning. ({{head()}} is also not deterministic as to what 
rows it takes because of the async evaluation, so that's another difference, 
but we don't make noise about that.) Even with options to keep nulls, it's not 
obvious to me that {{hash_one}} is guaranteed to take the first row; it just 
takes one row, so seeking exact parity with R might not be feasible. 
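
To make the difference concrete, here is a minimal sketch in plain Python (not the Arrow API, and not how {{hash_one}} is implemented). The function names {{distinct_keep_first}} and {{distinct_keep_any}} and the {{pick}} parameter are hypothetical, introduced only for illustration. The first function models the dplyr contract (first row per group in input order). The second models the weaker "some one row per group" contract, where which row comes back is up to the implementation.

```python
def distinct_keep_first(rows, key):
    """dplyr-like contract: keep the first row, in input order, per distinct key."""
    seen = {}
    for row in rows:
        # setdefault only stores the row the first time a key value is seen
        seen.setdefault(row[key], row)
    return list(seen.values())


def distinct_keep_any(rows, key, pick):
    """Weaker contract: keep exactly one row per distinct key.

    `pick` stands in for whatever the engine happens to do; callers
    must not rely on which row of the group it returns.
    """
    groups = {}
    for row in rows:
        groups.setdefault(row[key], []).append(row)
    return [pick(group) for group in groups.values()]


rows = [
    {"g": "a", "x": 1},
    {"g": "b", "x": 2},
    {"g": "a", "x": 3},
]

first = distinct_keep_first(rows, "g")
# An equally valid "one row per group" result that is NOT the first row:
any_last = distinct_keep_any(rows, "g", pick=lambda group: group[-1])
```

Both results have one row per value of {{g}}, but the non-key column {{x}} differs for group {{a}}, which is the gap a warning would be describing.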

> [R] Support for .keep_all = TRUE with distinct() 
> -------------------------------------------------
>
>                 Key: ARROW-14045
>                 URL: https://issues.apache.org/jira/browse/ARROW-14045
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>            Reporter: Nicola Crane
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
