[GitHub] [arrow] jonkeane commented on a change in pull request #9969: ARROW-11589: [R] Add methods for modifying Schemas

GitBox Fri, 09 Apr 2021 14:10:49 -0700


jonkeane commented on a change in pull request #9969:
URL: https://github.com/apache/arrow/pull/9969#discussion_r610902318




##########
File path: r/tests/testthat/test-schema.R
##########
@@ -65,7 +69,51 @@ test_that("Schema slicing", {
   expect_equal(schm[c(FALSE, TRUE, TRUE)], schema(c = string(), d = int8()))
   expect_error(schm[c("c", "ZZZ")], 'Invalid field name: "ZZZ"')
   expect_error(schm[c("XXX", "c", "ZZZ")], 'Invalid field names: "XXX" and 
"ZZZ"')
+})
+
+test_that("Schema modification", {
+  schm <- schema(b = double(), c = string(), d = int8())
+  schm$c <- boolean()
+  expect_equal(schm, schema(b = double(), c = boolean(), d = int8()))
+  schm[["d"]] <- int16()
+  expect_equal(schm, schema(b = double(), c = boolean(), d = int16()))
+  schm$b <- NULL
+  expect_equal(schm, schema(c = boolean(), d = int16()))
+  # NULL assigning something that doesn't exist doesn't modify
+  schm$zzzz <- NULL
+  expect_equal(schm, schema(c = boolean(), d = int16()))
+  # Adding a field
+  schm$fff <- int32()
+  expect_equal(schm, schema(c = boolean(), d = int16(), fff = int32()))
+
+  # By index
+  schm <- schema(b = double(), c = string(), d = int8())
+  schm[[2]] <- int32()
+  expect_equal(schm, schema(b = double(), c = int32(), d = int8()))
 
+  # Adding actual Fields
+  # If assigning by name, note that this can modify the resulting name
+  schm <- schema(b = double(), c = string(), d = int8())
+  schm$c <- field("x", int32())
+  expect_equal(schm, schema(b = double(), x = int32(), d = int8()))

Review comment:
       I can see why this happens, but this seems really off to me. I suspect 
that most instances of something like `schm$c <- field("x", int32())` would be 
accidents where `c` or `x` is a typo. 
   
   I wonder if it would be better to warn/error in these cases that the name 
has changed.

##########
File path: r/tests/testthat/test-schema.R
##########
@@ -48,6 +47,11 @@ test_that("Schema $GetFieldByName", {
 })
 
 test_that("Schema extract (returns Field)", {
+  # TODO: should this return a Field or the Type?

Review comment:
       I think Field is right here, but I can see the other side too. But 
especially since names can be duplicated, I think it's easier to interact with 
them that way. FWIW, it appears that readr's `col_spec` doesn't seem to have an 
interaction model there at all:
   
   ```
   > foo <- readr::cols(a = readr::col_integer())
   > foo$a
   NULL
   > foo
   cols(
     a = col_integer()
   )
   > foo["a"]
   $<NA>
   NULL
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [arrow] jonkeane commented on a change in pull request #9969: ARROW-11589: [R] Add methods for modifying Schemas

Reply via email to