westonpace commented on a change in pull request #9561:
URL: https://github.com/apache/arrow/pull/9561#discussion_r582252920
##########
File path: r/R/dataset-partition.R
##########
@@ -25,12 +25,17 @@
#' `DirectoryPartitioning` describes how to interpret raw path segments, in
#' order. For example, `schema(year = int16(), month = int8())` would define
#' partitions for file paths like "2019/01/file.parquet",
-#' "2019/02/file.parquet", etc.
+#' "2019/02/file.parquet", etc. In this scheme null values will be skipped.
+#' In the previous example, if the month was null, the files would be placed
+#' in 2019/file.parquet. An error will be raised if an outer directory is
Review comment:
`An error will be raised` is what I would say in Python. Does R
translate an invalid Status into some kind of "raised error", or would it be
more accurate to say "returned"? Or is the R terminology "thrown"?
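For context, here is a minimal base-R sketch of how an error reaches a caller. It does not call Arrow: `fail_like_invalid_status()` is a hypothetical stand-in and the message text is made up, on the assumption that the bindings surface an invalid Status as an ordinary R error. In base R an error is signalled with `stop()` and handled with `tryCatch()`, so nothing is "returned" to the caller.

```r
# Hypothetical stand-in for a binding that hits an invalid Status.
# Assumption: the failure is surfaced as a regular R error via stop().
fail_like_invalid_status <- function() {
  stop("Invalid: hypothetical partitioning error")
}

# The caller gets no return value from the failing call; control jumps
# to the error handler, which here extracts the condition message.
result <- tryCatch(
  fail_like_invalid_status(),
  error = function(e) conditionMessage(e)
)
```

If that assumption holds, "raised" or "thrown" both seem closer than "returned", though which word the R docs prefer is a separate question.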
##########
File path: r/R/dataset-partition.R
##########
@@ -72,19 +77,22 @@ HivePartitioning$create <- dataset___HivePartitioning
#' Because fields are named in the path segments, order of fields passed to
#' `hive_partition()` does not matter.
#' @param ... named list of [data types][data-type], passed to [schema()]
+#' @param null_fallback character to be used in place of `NA` and `NULL` values
Review comment:
Self nit: In this comment I say "`NA` and `NULL`" and in the next
comment I say `null`, so I wasn't being consistent. I'm not sure
what the proper R term for "a null array value" is.
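A small base-R sketch of why the two terms are not interchangeable, assuming plain character vectors rather than Arrow arrays (the values are made up): `NA` is a missing element that still occupies a slot in a vector, while `NULL` is the absence of an object and is dropped when combined with `c()`.

```r
# NA is a missing value inside a vector; NULL contributes nothing to c().
with_na   <- c("2019", NA, "01")
with_null <- c("2019", NULL, "01")

n_na   <- length(with_na)    # the NA keeps its position
n_null <- length(with_null)  # the NULL has vanished

# Element-wise missingness check versus whole-object check.
missing_flags <- is.na(with_na)
null_check    <- is.null(NULL)
```

On that basis `NA` may be the closer match for "a null array value", but I'd defer to the R maintainers on the wording.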