GitHub user mn-mikke opened a pull request: https://github.com/apache/spark/pull/21215

    [SPARK-24148][SQL] Overloading array function to support typed empty arrays

    ## What changes were proposed in this pull request?

    The PR proposes to overload the `array` function so that users can specify the element type of empty arrays. Currently, empty arrays produced by the `array` function have the element type `StringType`, and there is no way to cast them to a different type. A typical use case is `when(cond, trueExp).otherwise(falseExp)`, which expects `trueExp` and `falseExp` to be of the same type. In a scenario where one of these branches should produce an empty array, there is no other way than creating a `UDF`.

    ## How was this patch tested?

    Added test cases to `DataFrameComplexTypeSuite`.

    ## Note

    Eventually, I will add a wrapper for PySpark, but I would like to discuss the idea first.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/AbsaOSS/spark feature/array-api-empty-array-to-master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21215.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21215

----
commit 44b18520dcf8e3e3639756cd8a12f75ea1080bee
Author: Marek Novotny <mn.mikke@...>
Date:   2018-05-02T13:42:42Z

    [SPARK-24148][SQL] Overloading array function to support typed empty arrays.

----

---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
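
For readers following the use case in the description above, here is a minimal sketch of the `UDF` workaround it mentions, in which one branch of `when(...).otherwise(...)` returns an empty integer array. It is not the overload proposed in the PR, whose signature does not appear in this message. The object name, the column `x`, and the helper `emptyIntArray` are hypothetical names chosen for illustration, and the sketch assumes a local Spark session.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{array, col, udf, when}

// Hypothetical example object; not part of the pull request.
object EmptyArrayWorkaroundSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("empty-array-sketch")
      .getOrCreate()
    import spark.implicits._

    // Illustrative input: a single integer column named "x".
    val df = Seq(1, 2, 3).toDF("x")

    // Workaround from the description: a zero-argument UDF whose
    // Scala return type (Seq[Int]) fixes the element type of the
    // empty array to integer.
    val emptyIntArray = udf(() => Seq.empty[Int])

    // Both branches are now integer arrays, so the
    // when/otherwise expression passes type checking.
    val result = df.select(
      when(col("x") > 1, array(col("x")))
        .otherwise(emptyIntArray())
        .as("xs"))

    result.printSchema()
    result.show()

    spark.stop()
  }
}
```

The typed overload discussed in the PR would aim to make the helper UDF unnecessary by letting the caller state the element type directly.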