Nicola Crane created ARROW-18403:
------------------------------------

             Summary: [C++] Error consuming Substrait plan which uses count function: "only unary aggregate functions are currently supported"
                 Key: ARROW-18403
                 URL: https://issues.apache.org/jira/browse/ARROW-18403
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++
            Reporter: Nicola Crane


ARROW-17523 added support for the Substrait extension function "count", but 
when I produce a Substrait plan that calls it and then try to run that plan 
in Acero, I get an error.

The plan:

{code:r}
message of type 'substrait.Plan' with 3 fields set
extension_uris {
  extension_uri_anchor: 1
  uri: "https://github.com/substrait-io/substrait/blob/main/extensions/functions_arithmetic.yaml"
}
extension_uris {
  extension_uri_anchor: 2
  uri: "https://github.com/substrait-io/substrait/blob/main/extensions/functions_comparison.yaml"
}
extension_uris {
  extension_uri_anchor: 3
  uri: "https://github.com/substrait-io/substrait/blob/main/extensions/functions_aggregate_generic.yaml"
}
extensions {
  extension_function {
    extension_uri_reference: 3
    function_anchor: 2
    name: "count"
  }
}
relations {
  rel {
    aggregate {
      input {
        project {
          common {
            emit {
              output_mapping: 9
              output_mapping: 10
              output_mapping: 11
              output_mapping: 12
              output_mapping: 13
              output_mapping: 14
              output_mapping: 15
              output_mapping: 16
              output_mapping: 17
            }
          }
          input {
            read {
              base_schema {
                names: "int"
                names: "dbl"
                names: "dbl2"
                names: "lgl"
                names: "false"
                names: "chr"
                names: "verses"
                names: "padded_strings"
                names: "some_negative"
                struct_ {
                  types {
                    i32 {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    fp64 {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    fp64 {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    bool_ {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    bool_ {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    string {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    string {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    string {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    fp64 {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                }
              }
              local_files {
                items {
                  uri_file: "file:///tmp/RtmpsBsoZJ/file1915f604cff4a"
                  parquet {
                  }
                }
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 1
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 2
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 3
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 4
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 5
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 6
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 7
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 8
                }
              }
              root_reference {
              }
            }
          }
        }
      }
      groupings {
        grouping_expressions {
          selection {
            direct_reference {
              struct_field {
                field: 3
              }
            }
            root_reference {
            }
          }
        }
      }
      measures {
        measure {
          function_reference: 2
          phase: AGGREGATION_PHASE_INITIAL_TO_RESULT
          output_type {
            i64 {
              nullability: NULLABILITY_NULLABLE
            }
          }
          invocation: AGGREGATION_INVOCATION_ALL
        }
      }
    }
  }
}
{code}

The error:


{code:java}
Error: NotImplemented: Only unary aggregate functions are currently supported
/home/nic2/arrow/cpp/src/arrow/engine/substrait/relation_internal.cc:587  converter(aggregate_call)
/home/nic2/arrow/cpp/src/arrow/engine/substrait/serde.cc:153  FromProto(plan_rel.has_root() ? plan_rel.root().input() : plan_rel.rel(), ext_set, conversion_options)
{code}

I don't know what the "phase" and "invocation" fields above do. Earlier 
attempts to get Acero to consume this plan failed when I left them at their 
default values (e.g. "Not Implemented: Unsupported aggregation phase 
'AGGREGATION_PHASE_UNSPECIFIED'"), so I set them to the values shown above 
to see whether that helped.
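
One possible reading of the error, offered as an assumption rather than a 
confirmed diagnosis: the measure in the plan above contains no "arguments" 
entries, so the call is nullary (a count of all rows) rather than unary. The 
Python sketch below is not part of Arrow. count_measure_arguments is a 
hypothetical helper that scans a text-format dump like the one above and 
reports how many "arguments" blocks each "measure" block contains. It assumes 
one field or brace per line, as in the printed plan.

```python
from typing import List

# The measures block copied from the plan in this report.
PLAN_MEASURES = """\
measures {
  measure {
    function_reference: 2
    phase: AGGREGATION_PHASE_INITIAL_TO_RESULT
    output_type {
      i64 {
        nullability: NULLABILITY_NULLABLE
      }
    }
    invocation: AGGREGATION_INVOCATION_ALL
  }
}
"""


def count_measure_arguments(plan_text: str) -> List[int]:
    """Return the number of top-level `arguments` blocks in each `measure` block.

    Hypothetical helper: a line-based scan, not a real protobuf text parser.
    """
    counts = []
    lines = plan_text.splitlines()
    i = 0
    while i < len(lines):
        if lines[i].strip() == "measure {":
            depth = 1  # we are now inside the measure block
            n_args = 0
            i += 1
            while i < len(lines) and depth > 0:
                stripped = lines[i].strip()
                # Only count arguments that belong directly to the measure.
                if stripped == "arguments {" and depth == 1:
                    n_args += 1
                if stripped.endswith("{"):
                    depth += 1
                elif stripped == "}":
                    depth -= 1
                i += 1
            counts.append(n_args)
        else:
            i += 1
    return counts


print(count_measure_arguments(PLAN_MEASURES))  # [0]
```

A result of 0 for the only measure would be consistent with a consumer that 
accepts exactly one argument per aggregate call, though whether that is the 
check behind this error is not confirmed here.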



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
