jorgecarleitao commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r630285020

##########
File path: datafusion/src/physical_plan/functions.rs
##########
@@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr {
     }
 
     fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
-        // evaluate the arguments
-        let inputs = self
-            .args
-            .iter()
-            .map(|e| e.evaluate(batch))
-            .collect::<Result<Vec<_>>>()?;
+        // evaluate the arguments, if there are no arguments we'll instead pass in a null array of
+        // batch size (as a convention)
+        let inputs = match self.args.len() {
+            0 => vec![ColumnarValue::Array(Arc::new(NullArray::new(

Review comment:
   Note that `NullArray` is composed of zero buffers, zero children, no validity bitmap, and one datatype, so the cost of instantiating it is really small. The advantage over a `ScalarValue` is that the semantics of getting a length are preserved: use `array.len()` as with any other array.

   I am not married to either option; I was just trying to think about this from a documentation perspective:

   > We support zero-argument UDFs. They MUST be declared as accepting zero arguments, and the function itself MUST take a single argument. DataFusion will pass an `Array` to it, from which you can retrieve its length via `Array::len()`. The function MUST return an array whose number of rows equals the length of that array.

   If we pass a scalar of any type and the evaluation is distributed, I believe that we will have to serialize `Scalar -> Array` in Ballista.
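
   To make the proposed convention concrete, here is a minimal, dependency-free sketch. It does not use the real arrow or DataFusion types: the `Array` trait, `NullArray`, and `Float64Array` below are simplified stand-ins that model only a length, and `pi_udf` and `evaluate_zero_arg` are hypothetical names invented for illustration. The point is only that a zero-argument function can read the row count from the single array it receives and return an array of the same length.

```rust
use std::sync::Arc;

/// Simplified stand-in for arrow's `Array` trait (assumption: only `len` is modelled).
trait Array {
    fn len(&self) -> usize;
}

/// Stand-in for `NullArray`: it carries a length and no data.
struct NullArray {
    len: usize,
}

impl NullArray {
    fn new(len: usize) -> Self {
        NullArray { len }
    }
}

impl Array for NullArray {
    fn len(&self) -> usize {
        self.len
    }
}

/// Stand-in for a concrete output array.
struct Float64Array {
    values: Vec<f64>,
}

impl Array for Float64Array {
    fn len(&self) -> usize {
        self.values.len()
    }
}

/// Hypothetical zero-argument UDF: declared with zero arguments, but called
/// with exactly one array whose only purpose is to convey the number of rows.
fn pi_udf(args: &[Arc<dyn Array>]) -> Result<Float64Array, String> {
    if args.len() != 1 {
        return Err(format!("expected 1 placeholder array, got {}", args.len()));
    }
    // Same call as for any other array: no scalar special case is needed.
    let rows = args[0].len();
    Ok(Float64Array {
        values: vec![std::f64::consts::PI; rows],
    })
}

/// Hypothetical caller, loosely mirroring the zero-argument branch in the diff:
/// build a null array of batch size and hand it to the function.
fn evaluate_zero_arg(batch_rows: usize) -> Result<Float64Array, String> {
    let placeholder: Arc<dyn Array> = Arc::new(NullArray::new(batch_rows));
    let inputs = vec![placeholder];
    pi_udf(&inputs)
}
```

   Under these assumptions, `evaluate_zero_arg(4)` yields an array of four rows, and an empty batch yields an empty array, so the "output length equals input length" rule holds without the function ever seeing the batch itself.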