Neal Richardson created ARROW-17974:
---------------------------------------

             Summary: [C++] random function can't actually be used
                 Key: ARROW-17974
                 URL: https://issues.apache.org/jira/browse/ARROW-17974
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++
            Reporter: Neal Richardson


random() is currently implemented as a nullary function. It doesn't let you specify the number of values you want to generate because it's designed to generate however many the given ExecBatch has. The only option RandomOptions takes seems to be an optional seed value.

Unfortunately, the result is that the function is not usable, AFAICT. Calling the compute function directly, you get 0 values (all examples from R):

{code}
library(arrow)
call_function("random")
# Array
# <double>
# []
{code}

Calling it from within an ExecPlan, it errors because it is not a proper scalar function, despite what the filenames say (scalar_random.cc, etc.):

{code}
library(arrow)
library(dplyr)
mtcars %>%
  arrow_table() %>%
  mutate(x = arrow_random()) %>%
  collect()
# Error in `collect()`:
# ! Invalid: ExecuteScalarExpression cannot Execute non-scalar expression Array[double]
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)