[ https://issues.apache.org/jira/browse/ARROW-12960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17458559#comment-17458559 ]

Antoine Pitrou commented on ARROW-12960:
----------------------------------------

What is the status on this?

> [C++][R] Option for is_nan(null) to evaluate to false or true
> -------------------------------------------------------------
>
>                 Key: ARROW-12960
>                 URL: https://issues.apache.org/jira/browse/ARROW-12960
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, R
>            Reporter: Ian Cook
>            Assignee: Christian Cordova
>            Priority: Major
>              Labels: good-first-issue, kernel
>             Fix For: 7.0.0
>
>
> (This is the flip side of ARROW-12959.)
> Currently the Arrow compute kernel {{is_nan}} always treats {{null}} as a
> missing value, returning {{null}} at positions of the input datum with
> {{null}} (missing) values.
> It would be helpful to be able to control this behavior with an option. The
> option could be named {{value_for_null}} or something similar and it would
> take a nullable boolean scalar. It would default to {{null}}, consistent
> with current behavior. When set to {{false}} or {{true}}, it would return
> {{false}} or {{true}} at positions of the input datum with {{null}} values.
> Among other things, this would enable the {{arrow}} R package to evaluate
> {{is.nan()}} consistently with the way base R does. In base R, {{is.nan()}}
> returns {{FALSE}} on {{NA}}. But in the {{arrow}} R package, it returns
> {{NA}}:
> {code:r}
> > is.nan(c(3.14, NA, NaN))
> ##[1] FALSE FALSE TRUE
> as.vector(is.nan(Array$create(c(3.14, NA, NaN))))
> ##[1] FALSE NA TRUE{code}
> I think solving this with an option in the C++ kernel is the best solution,
> because I suspect there are other cases in which users would want the ability
> to return all non-missing values in the output from {{is_nan}} without
> needing to call another kernel to fill the missing values in. However, it
> would also be possible to solve this just in the R package, by changing the
> mapping of {{is.nan}} in the R package. If we choose to go that route, we
> should change this Jira issue summary to "[R] Make is.nan(NA) consistent with
> base R".



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
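
For reference, the semantics proposed in the issue can be sketched in plain C++. This is only an illustration and does not use Arrow's kernel code: the names {{IsNanOptions}} and {{IsNanSketch}} are hypothetical, a nullable column is modeled as a vector of {{std::optional<double>}}, and the proposed {{value_for_null}} option is modeled as a {{std::optional<bool>}} whose empty state stands for {{null}}. It assumes a C++17 compiler.

```cpp
#include <cmath>
#include <optional>
#include <vector>

// Hypothetical stand-in for the proposed option; not part of the Arrow API.
struct IsNanOptions {
  // Empty (std::nullopt) models null and keeps the current behavior:
  // a null input produces a null output.
  std::optional<bool> value_for_null = std::nullopt;
};

// Element-wise is_nan over a nullable column modeled as optional<double>.
// Non-null inputs always yield a non-null boolean; null inputs yield
// whatever options.value_for_null holds (null, false, or true).
std::vector<std::optional<bool>> IsNanSketch(
    const std::vector<std::optional<double>>& input,
    const IsNanOptions& options = IsNanOptions{}) {
  std::vector<std::optional<bool>> out;
  out.reserve(input.size());
  for (const auto& value : input) {
    if (value.has_value()) {
      out.push_back(std::isnan(*value));
    } else {
      out.push_back(options.value_for_null);
    }
  }
  return out;
}
```

Under these assumptions, the input {{3.14, null, NaN}} gives {{false, null, true}} with the default options, and {{false, false, true}} when {{value_for_null}} is set to {{false}}, which matches what base R returns for {{is.nan(c(3.14, NA, NaN))}}.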