nealrichardson commented on code in PR #14679:
URL: https://github.com/apache/arrow/pull/14679#discussion_r1027095501


##########
r/R/csv.R:
##########
@@ -418,6 +418,8 @@ CsvTableReader$create <- function(file,
 #' The `CsvWriteOptions$create()` factory method takes the following arguments:
 #' - `include_header` Whether to write an initial header line with column names
 #' - `batch_size` Maximum number of rows processed at a time. Default is 1024.
+#' - `null_string` The string to be written for null values. Must not contain
+#'   quotation marks. Default is `NA`.

Review Comment:
   Is this the current default, or is this a behavior change?



##########
r/R/csv.R:
##########
@@ -456,24 +458,31 @@ CsvReadOptions$create <- function(use_threads = 
option_use_threads(),
 }
 
 readr_to_csv_write_options <- function(include_header,
-                                       batch_size = 1024L) {
+                                       batch_size = 1024L,
+                                       na = "NA") {
   assert_that(is_integerish(batch_size, n = 1, finite = TRUE), batch_size > 0)
   assert_that(is.logical(include_header))
+  assert_that(is.character(na))
+  assert_that(length(na) == 1)
+  assert_that(!grepl('"', na), msg = "na argument must not contain quote 
characters.")

Review Comment:
   Only `"` is invalid? What about `'`?



##########
r/tests/testthat/test-csv.R:
##########
@@ -422,6 +422,25 @@ test_that("Write a CSV file with invalid batch size", {
   )
 })
 
+test_that("Write a CSV with custom NA value", {
+  tbl_out1 <- write_csv_arrow(tbl_no_dates, csv_file, na = "NULL_VALUE")

Review Comment:
   Is it worth adding a test that `na = ""` works? I'd imagine that would be a 
common choice,. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to