Re: [I] Deduplicate Spark function code with native/default datafusion function code [datafusion]

via GitHub Thu, 13 Nov 2025 22:53:36 -0800


jizezhang commented on issue #17964:
URL: https://github.com/apache/datafusion/issues/17964#issuecomment-3531180159


   @Jefffrey I would like to hear your opinion on how to merge the `make_array` 
and Spark `array` logic. The two are very similar but have subtle differences, 
such as:
   - function names
   - function aliases
   - field names: `make_array` builds its element field with 
`Field::new_list_field`, which takes no explicit field name, while Spark 
`array` uses the field name `ARRAY_FIELD_DEFAULT_NAME` (which is "element")
   
   The logic of `coerce_types` and `make_array_inner` is probably effectively 
identical in both. 
   
   To reuse the `make_array` logic in Spark `array`, I thought about using 
`MakeArray` directly when creating the UDF, e.g. 
https://github.com/apache/datafusion/blob/6685bbe5028609f676b27e435dbe095124fac99a/datafusion/spark/src/function/array/mod.rs#L25
   or making `SparkArray` an alias of `MakeArray` (by adding an `inner` field 
of type `MakeArray` to `SparkArray`), but neither worked because of the 
differences mentioned above. So I would like to know if you have any 
suggestions. Thank you.
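
One possible direction, as a rough sketch only: keep a single shared implementation and pass in the three points of difference (name, aliases, element field name) as configuration. Everything below is hypothetical and uses simplified stand-in types (`ArrayFnConfig`, `ArrayConstructor`, `ListField`), not DataFusion's actual `ScalarUDFImpl` API. The alias values are illustrative, and "item" as the native element field name is an assumption based on Arrow's default list field name.

```rust
// Hypothetical sketch: simplified stand-ins, not DataFusion's real types.

/// The three points on which the two functions differ.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ArrayFnConfig {
    name: &'static str,
    aliases: &'static [&'static str],
    element_field_name: &'static str,
}

/// Simplified stand-in for the list field a constructor returns.
#[derive(Debug, PartialEq)]
struct ListField {
    element_field_name: String,
    element_type: String,
}

/// One implementation shared by both functions; only the config varies.
struct ArrayConstructor {
    config: ArrayFnConfig,
}

impl ArrayConstructor {
    fn name(&self) -> &'static str {
        self.config.name
    }

    fn aliases(&self) -> &'static [&'static str] {
        self.config.aliases
    }

    /// Builds the return field using the configured element field name.
    fn return_field(&self, element_type: &str) -> ListField {
        ListField {
            element_field_name: self.config.element_field_name.to_string(),
            element_type: element_type.to_string(),
        }
    }
}

// Illustrative values; the alias and "item" are assumptions.
const NATIVE_MAKE_ARRAY: ArrayFnConfig = ArrayFnConfig {
    name: "make_array",
    aliases: &["make_list"],
    element_field_name: "item",
};

// "element" is the value of ARRAY_FIELD_DEFAULT_NAME quoted above.
const SPARK_ARRAY: ArrayFnConfig = ArrayFnConfig {
    name: "array",
    aliases: &[],
    element_field_name: "element",
};

fn main() {
    for config in [NATIVE_MAKE_ARRAY, SPARK_ARRAY] {
        let udf = ArrayConstructor { config };
        println!(
            "{} (aliases {:?}) -> {:?}",
            udf.name(),
            udf.aliases(),
            udf.return_field("Int32")
        );
    }
}
```

With this shape, the coercion and array-building code would live once in the shared type, and the Spark crate would only supply its own config. Whether the real UDF traits allow this cleanly is an open question.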


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

