Github user mn-mikke commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21103#discussion_r205806876

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
    @@ -3805,3 +3799,233 @@ object ArrayUnion {
         new GenericArrayData(arrayBuffer)
       }
     }
    +
    +/**
    + * Returns an array of the elements in the intersect of x and y, without duplicates
    + */
    +@ExpressionDescription(
    +  usage = """
    +  _FUNC_(array1, array2) - Returns an array of the elements in array1 but not in array2,
    +    without duplicates.
    +  """,
    +  examples = """
    +    Examples:
    +      > SELECT _FUNC_(array(1, 2, 3), array(1, 3, 5));
    +       array(2)
    +  """,
    +  since = "2.4.0")
    +case class ArrayExcept(left: Expression, right: Expression) extends ArraySetLike {
    --- End diff --

    Maybe I'm missing something, but why do we need to apply these checks if no ```null``` flag merging is performed? If ```left.dataType``` and ```right.dataType``` differ, they will be cast according to the ```ImplicitTypeCasts``` coercion rule. If they differ only in their ```null``` flags, ```left.dataType``` could be returned directly, since no array elements from ```right``` will be present in the result.
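To make the argument concrete, here is a minimal sketch in plain Scala. It is not Spark code: `SimpleArrayType`, `unionResultType`, `exceptResultType` and `except` are hypothetical names used only for illustration, `None` stands in for SQL `NULL`, and the sketch assumes that a union-style operation merges the two `containsNull` flags while an except-style operation can only emit elements of its left input.

```scala
// Simplified, hypothetical stand-in for an array type with a nullability flag.
// This is NOT Spark's ArrayType; it only models the containsNull flag.
final case class SimpleArrayType(elementType: String, containsNull: Boolean)

object ArrayExceptTypeSketch {
  // A union-style result may hold elements from either side,
  // so the nullability flags of both inputs have to be merged.
  def unionResultType(left: SimpleArrayType, right: SimpleArrayType): SimpleArrayType = {
    require(left.elementType == right.elementType, "element types must already be coerced")
    SimpleArrayType(left.elementType, left.containsNull || right.containsNull)
  }

  // An except-style result only ever holds elements of the left input,
  // so the left type describes the result without any merging.
  def exceptResultType(left: SimpleArrayType, right: SimpleArrayType): SimpleArrayType = left

  // Elements of `left` that are absent from `right`, without duplicates.
  def except[T](left: Seq[Option[T]], right: Seq[Option[T]]): Seq[Option[T]] = {
    val excluded = right.toSet
    left.distinct.filterNot(excluded.contains)
  }

  def main(args: Array[String]): Unit = {
    val leftType = SimpleArrayType("int", containsNull = false)
    val rightType = SimpleArrayType("int", containsNull = true)

    println(unionResultType(leftType, rightType))  // SimpleArrayType(int,true)
    println(exceptResultType(leftType, rightType)) // SimpleArrayType(int,false)

    // The null on the right side never reaches the result.
    println(except(Seq(Some(1), Some(2), Some(3)), Seq(Some(1), None, Some(3), Some(5)))) // List(Some(2))
  }
}
```

In this model the right-hand `containsNull` flag has no influence on the except result, which is the reason given above for returning `left.dataType` directly once the element types have been coerced.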