thisisnic commented on a change in pull request #12083:
URL: https://github.com/apache/arrow/pull/12083#discussion_r785474474



##########
File path: r/R/dataset.R
##########
@@ -123,6 +123,7 @@
 #' or call [`$NewScan()`][Scanner] to construct a query directly.
 #' @export
 #' @seealso `vignette("dataset", package = "arrow")`
+#' See [read_csv_arrow()] on how to specify column names and types for "csv"/"text" and "tsv" -formats.

Review comment:
       This is great, though I think it might fit better after the content on 
line 121, along with links to `read_feather()` and `read_parquet()` and a more 
general note pointing readers to those pages for format-specific options.

##########
File path: r/R/dataset-format.R
##########
@@ -122,6 +122,18 @@ CsvFileFormat$create <- function(...,
                                  opts = csv_file_format_parse_options(...),
                                  convert_options = csv_file_format_convert_opts(...),
                                  read_options = csv_file_format_read_opts(...)) {
+
+  options <- list(...)
+  schema  <- options[["schema"]]
+
+  if (length(read_options$column_names) > 0 & !is.null(schema) & !identical(names(schema), read_options$column_names)) {
+    abort(c(
+        '"column_names" in read_options do not match the schema.',
+      i = "Set column_names in read_options to match the schema",
+      i = "Omit the read_options argument"
+    ))

Review comment:
       Looking good, just a couple of suggestions:
   - someone calling `open_dataset()` won't necessarily have used 
`read_options` directly, so we should remove the reference to it from the 
message to avoid confusion
   - to give the end user as much useful information as possible, could you 
print out the names that don't match between `column_names` and the schema?
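
Something along these lines might work as a starting point for the second suggestion. It is only a rough sketch: `describe_column_mismatch()` is a hypothetical helper name that does not exist in arrow, the wording of the message is an assumption, and it uses base R rather than `abort()` so that it runs on its own.

```r
# Hypothetical helper (not part of arrow): build a message listing the names
# that differ between the schema and the supplied column_names.
describe_column_mismatch <- function(schema_names, column_names) {
  # Names that appear in one set but not the other
  only_in_schema <- setdiff(schema_names, column_names)
  only_in_column_names <- setdiff(column_names, schema_names)

  msg <- "Values in `column_names` must match `schema` field names."
  if (length(only_in_schema) > 0) {
    msg <- c(msg, paste0(
      "In `schema` but not `column_names`: ",
      paste(only_in_schema, collapse = ", ")
    ))
  }
  if (length(only_in_column_names) > 0) {
    msg <- c(msg, paste0(
      "In `column_names` but not `schema`: ",
      paste(only_in_column_names, collapse = ", ")
    ))
  }
  if (length(msg) == 1) {
    # Neither setdiff() found anything, so the sets of names agree
    msg <- c(msg, "The same names appear in both, but not in the same order (or with duplicates).")
  }
  paste(msg, collapse = "\n")
}

# Example call with made-up column names
message(describe_column_mismatch(c("id", "value"), c("id", "val")))
```

The resulting string could then be passed on to `abort()` in place of the two `i =` bullets, so the user sees exactly which names need fixing.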




