Github user michalsenkyr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16986#discussion_r120009967

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -652,6 +653,299 @@ case class MapObjects private(
       }
     }
    +object CollectObjectsToMap {
    +  private val curId = new java.util.concurrent.atomic.AtomicInteger()
    +
    +  /**
    +   * Construct an instance of CollectObjects case class.
    +   *
    +   * @param keyFunction The function applied on the key collection elements.
    +   * @param keyInputData An expression that when evaluated returns a key collection object.
    +   * @param keyElementType The data type of key elements in the collection.
    +   * @param valueFunction The function applied on the value collection elements.
    +   * @param valueInputData An expression that when evaluated returns a value collection object.
    +   * @param valueElementType The data type of value elements in the collection.
    +   * @param collClass The type of the resulting collection.
    +   */
    +  def apply(
    +      keyFunction: Expression => Expression,
    +      keyInputData: Expression,
    +      keyElementType: DataType,
    +      valueFunction: Expression => Expression,
    +      valueInputData: Expression,
    +      valueElementType: DataType,
    +      collClass: Class[_]): CollectObjectsToMap = {
    +    val id = curId.getAndIncrement()
    +    val keyLoopValue = s"CollectObjectsToMap_keyLoopValue$id"
    +    val keyLoopIsNull = s"CollectObjectsToMap_keyLoopIsNull$id"
    --- End diff --

Yes. A key in `MapData` cannot be null. However, since the function takes two `ArrayData`s as input, I figured that we shouldn't count on this requirement being necessarily fulfilled. As `CollectObjectsToMap` is a class separate from its usage in `ScalaReflection`, I tried to make it as generic and as similar to `MapObjects` as possible, so it can be used elsewhere without having to make sure additional preconditions are met. It also produces a generic `Map` which has implementations that can support null keys.
Right now, the only check that prevents null keys is [here](https://github.com/apache/spark/pull/16986/files/7af9b0625a245d34943b7192532c93f2aafac635#diff-e436c96ea839dfe446837ab2a3531f93R933). If there is ever a need to support `Map`s with null keys in the future, keeping the null tracking for keys should make that job easier.
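
To illustrate the reasoning, here is a minimal standalone sketch in plain Scala, not the Catalyst code from this PR. All names in it (`NullKeySketch`, `collectToMap`, `collectToMapNoNullKeys`) are hypothetical and exist only for this example. It assumes the standard immutable `Map` built by `toMap`, which accepts a null key, and it keeps the null-key rejection in a single separate wrapper so the generic builder needs no extra preconditions.

```scala
object NullKeySketch {
  // Hypothetical generic builder: pairs up two parallel sequences.
  // It deliberately makes no assumption about null keys.
  def collectToMap[K, V](keys: Seq[K], values: Seq[V]): Map[K, V] = {
    require(keys.length == values.length, "keys and values must have the same length")
    keys.zip(values).toMap
  }

  // Hypothetical stricter wrapper: the one place where null keys are rejected.
  // Dropping this wrapper is all it would take to allow null keys.
  def collectToMapNoNullKeys[K, V](keys: Seq[K], values: Seq[V]): Map[K, V] = {
    if (keys.exists(_ == null)) {
      throw new IllegalArgumentException("null is not allowed as a map key")
    }
    collectToMap(keys, values)
  }

  def main(args: Array[String]): Unit = {
    val keys = Seq("a", null, "b")
    val values = Seq(1, 2, 3)

    // The generic builder keeps the entry stored under the null key.
    val generic = collectToMap(keys, values)
    println(generic.contains(null))

    // The stricter wrapper refuses the same input.
    val strict = scala.util.Try(collectToMapNoNullKeys(keys, values))
    println(strict.isFailure)
  }
}
```

The point of the split is the same as in the comment above: the generic part stays reusable, and the restriction lives in exactly one place that can be lifted later.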