[GitHub] [arrow] eitsupi commented on a diff in pull request #33968: GH-18487: [R] Read Text (CSV/JSON) from character vector

via GitHub Fri, 03 Feb 2023 04:25:17 -0800


eitsupi commented on code in PR #33968:
URL: https://github.com/apache/arrow/pull/33968#discussion_r1095744610



##########
r/R/csv.R:
##########
@@ -198,6 +205,14 @@ read_delim_arrow <- function(file,
     )
   }
 
+  if (inherits(file, "AsIs")) {
+    if (is.raw(file)) {
+      file <- unclass(file)

Review Comment:
   Without this `unclass()`, reading a raw vector wrapped in `I()` fails.
   The current version of arrow already behaves this way:
   
   ```r
   > "a\n1" |> charToRaw() |> arrow::read_csv_arrow()
     a
   1 1
   
   > "a\n1" |> charToRaw() |> I() |> arrow::read_csv_arrow()
   Error: file must be a "InputStream"
   ```
   
   By contrast, `readr::read_csv()` reads raw vectors with or without `I()`, so I thought it necessary to unclass `AsIs` here to keep the behavior consistent.
   
   ```r
   > "a\n1" |> charToRaw() |> readr::read_csv()
   Rows: 1 Columns: 1
   ── Column specification ────────────────────────────────────────────────────────────────────────────────────────────────
   Delimiter: ","
   dbl (1): a
   
   ℹ Use `spec()` to retrieve the full column specification for this data.
   ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
   # A tibble: 1 × 1
         a
     <dbl>
   1     1
   
   > "a\n1" |> charToRaw() |> I() |> readr::read_csv()
   Rows: 1 Columns: 1
   ── Column specification ────────────────────────────────────────────────────────────────────────────────────────────────
   Delimiter: ","
   dbl (1): a
   
   ℹ Use `spec()` to retrieve the full column specification for this data.
   ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
   # A tibble: 1 × 1
         a
     <dbl>
   1     1
   ```
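   
   For reference, here is a minimal base R sketch of what the `unclass()` call in the diff does to a raw vector wrapped in `I()`. It uses base R only and does not touch arrow or readr; `raw_csv` and `wrapped` are illustrative names, not from the PR.
   
   ```r
   # Illustrative names only; nothing here calls arrow or readr.
   raw_csv <- charToRaw("a\n1")
   wrapped <- I(raw_csv)
   
   # I() adds the "AsIs" class marker but keeps the raw storage type,
   # so both checks from the diff are TRUE for this object.
   stopifnot(inherits(wrapped, "AsIs"), is.raw(wrapped))
   
   # unclass() drops the marker and gives back the original raw vector.
   stopifnot(identical(unclass(wrapped), raw_csv))
   ```
   
   So after the `unclass()`, the reader would receive the same object it gets when the raw vector is passed without `I()`.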
   
   Perhaps we should mention this in a code comment?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
