jorisvandenbossche commented on code in PR #19706:
URL: https://github.com/apache/arrow/pull/19706#discussion_r1068333553


##########
r/R/expression.R:
##########
@@ -89,6 +92,56 @@ Expression$create <- function(function_name,
   expr
 }
 
+
+#' @export
+`[[.Expression` <- function(x, i, ...) {
+  # TODO: integer (positional) field refs are supported in C++
+  assert_that(is.string(i))
+  get_nested_field(x, i)
+}
+
+#' @export
+`$.Expression` <- function(x, name, ...) {
+  assert_that(is.string(name))
+  if (name %in% ls(x)) {
+    get(name, x)
+  } else {
+    get_nested_field(x, name)
+  }
+}
+
+get_nested_field <- function(expr, name) {
+  if (expr$is_field_ref()) {
+    # Make a nested field ref
+    out <- compute___expr__nested_field_ref(expr, name)
+  } else {
+    # Use the struct_field kernel, but that only works if:
+    # * expr has a knowable type (has a schema set)
+    # * that type is struct
+    # * `name` exists in the struct (bc we have to map to an integer position)

Review Comment:
   FYI, nowadays it shouldn't be necessary to map it to an integer position: the 
"struct_field" kernel now also accepts a field ref given by string name.



##########
r/src/expression.cpp:
##########
@@ -46,13 +46,26 @@ std::shared_ptr<compute::Expression> compute___expr__call(std::string func_name,
       compute::call(std::move(func_name), std::move(arguments), std::move(options_ptr)));
 }
 
+// [[arrow::export]]
+bool compute___expr__is_field_ref(const std::shared_ptr<compute::Expression>& x) {
+  return x->field_ref() != nullptr;
+}
+
 // [[arrow::export]]
 std::vector<std::string> field_names_in_expression(
     const std::shared_ptr<compute::Expression>& x) {
   std::vector<std::string> out;
+  std::vector<arrow::FieldRef> nested;
+
   auto field_refs = FieldsInExpression(*x);
   for (auto f : field_refs) {
-    out.push_back(*f.name());
+    if (f.IsNested()) {
+      // We keep the top-level field name.

Review Comment:
   This might not be used in practice (in a `mutate` call where you select the 
field, you also directly specify the resulting column name), but otherwise it 
might also make sense to keep the innermost field name?
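
   To make the two naming options concrete, here is a minimal sketch. It 
deliberately does not use `arrow::FieldRef`: `NestedPath`, `top_level_name` 
and `innermost_name` are hypothetical names for illustration only, and a 
nested ref is modelled as the list of names from the top-level column down to 
the selected field.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a nested field reference: the names from the
// top-level column down to the selected field, e.g. {"a", "b", "c"} for
// a$b$c. This is NOT arrow::FieldRef; it only models the naming question.
using NestedPath = std::vector<std::string>;

// Option 1 (what the code comment in the diff describes): report the
// top-level column that the nested ref starts from.
std::string top_level_name(const NestedPath& path) {
  assert(!path.empty());
  return path.front();
}

// Option 2 (the alternative raised in this review): report the innermost
// field that the nested ref finally selects.
std::string innermost_name(const NestedPath& path) {
  assert(!path.empty());
  return path.back();
}
```

   For a ref like `a$b$c`, option 1 yields `a` and option 2 yields `c`. The 
two agree only for a non-nested ref.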
   


