[jira] [Commented] (ARROW-18241) [R] as.integer can't handdle empty character cels (ex c(''))
[ https://issues.apache.org/jira/browse/ARROW-18241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17628905#comment-17628905 ]

Neal Richardson commented on ARROW-18241:
-----------------------------------------

> "I agree this would be a nice option to have." > sure. But should be the
> default behavior, as that is what happens in base R, no?

Sorry, that was ambiguous. We would need the C++ cast function to support an option to return nulls for values that can't be converted, rather than just error. If/when that option exists, then yes, we would make that default in R. I'll rename this ticket to be about that feature.

> [R] as.integer can't handdle empty character cels (ex c(''))
>
> Key: ARROW-18241
> URL: https://issues.apache.org/jira/browse/ARROW-18241
> Project: Apache Arrow
> Issue Type: Bug
> Components: R
> Reporter: Lucas Mation
> Priority: Major
>
> I am importing a dataset with arrow, and then converting variable types. But
> I got an error message because the `arrow` implementation of `as.integer`
> can't handle empty strings (which is legal in base R). Is this a bug?
> {code:r}
> #In R
> '' %>% as.integer()
> [1] NA
>
> #in arrow
> q <- data.table(x=c('','1','2'))
> q %>% write_dataset('q')
> q2 <- 'q' %>% open_dataset %>% mutate(x=as.integer(x)) %>% collect
> Error in `collect()`:
> ! Invalid: Failed to parse string: '' as a scalar of type int32
> Run `rlang::last_error()` to see where the error occurred.
> {code}
> Update: tried to preprocess x with `ifelse` but it also did not work.
> {code:r}
> 'q' %>% open_dataset %>% mutate(x= ifelse(x=='',NA,x)) %>%
> mutate(x=as.integer(x)) %>% collect
> Error in `collect()`:
> ! NotImplemented: Function 'if_else' has no kernel matching input types
> (bool, bool, string)
> Run `rlang::last_error()` to see where the error occurred.
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Commented] (ARROW-18241) [R] as.integer can't handdle empty character cels (ex c(''))
[ https://issues.apache.org/jira/browse/ARROW-18241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17628902#comment-17628902 ]

Lucas Mation commented on ARROW-18241:
--------------------------------------

[~npr], tks.

1) "I agree this would be a nice option to have." > sure. But should be the default behavior, as that is what happens in base R, no?

2) "it should work as you typed it on the development version" > tested. Works. Tks.

"On the released version, you can make it work by explicitly making the NA be a string so the types match" > tested, works.
[jira] [Commented] (ARROW-18241) [R] as.integer can't handdle empty character cels (ex c(''))
[ https://issues.apache.org/jira/browse/ARROW-18241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17628622#comment-17628622 ]

Neal Richardson commented on ARROW-18241:
-----------------------------------------

Two observations:

1. This isn't just about empty strings: cast string to int raises an error on any string that doesn't parse. I believe this was raised before but I can't seem to find an issue about it (that is, adding an option to return NA for values that don't parse instead of erroring). I agree this would be a nice option to have.

{code}
> arrow_table(x="a") %>% mutate(x = as.integer(x)) %>% collect()
Error in `compute.arrow_dplyr_query()`:
! Invalid: Failed to parse string: 'a' as a scalar of type int32
{code}

2. The ifelse workaround will work, and it should work as you typed it on the development version of the package. On the released version, you can make it work by explicitly making the NA be a string so the types match:

{code}
arrow_table(a=c("1", "", "3")) %>%
  mutate(x = as.integer(ifelse(a == "", NA_character_, a))) %>%
  collect()
# A tibble: 3 × 2
  a         x
  <chr> <int>
1 "1"       1
2 ""       NA
3 "3"       3
{code}
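
Editor's note: the `ifelse` workaround described in the thread only blanks out empty strings, while the cast fails on any string that does not parse. The sketch below extends the same pattern to every value that does not look like an integer. It assumes that the installed arrow version translates `grepl()` inside `mutate()`; the column name `a`, the sample values, and the regular expression are illustrative choices, not something proposed in the ticket.

```r
library(arrow)
library(dplyr)

# Hypothetical input: "" and "x" cannot be parsed as integers.
tbl <- arrow_table(a = c("1", "", "x", "3"))

result <- tbl %>%
  # Keep only strings made of an optional minus sign followed by digits.
  # Everything else becomes a string NA, so both ifelse() branches share
  # the string type, as in the workaround from the thread.
  mutate(x = as.integer(ifelse(grepl("^-?[0-9]+$", a), a, NA_character_))) %>%
  collect()

result
```

This only guards against values that are not digit strings. A value that matches the pattern but does not fit in a 32-bit integer would presumably still make the cast fail, so the sketch is not a substitute for the cast option discussed in the ticket.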