GideonPotok commented on PR #46597: URL: https://github.com/apache/spark/pull/46597#issuecomment-2121332325
@uros-db I agree that we should avoid auxiliary structures. However, I don't see a good way to move the changes into the implementations of `merge` and `update` without keeping an auxiliary map from each collation key to the actual values seen (e.g., from "aaaaaa" to "aaaAAa" and "AAaaaa" for a data frame containing the values "aaaAAa" and "AAaaaa"). That map would itself be an auxiliary structure, and a ton of scaffolding is needed just to make that OpenHashMap available throughout the expression being executed. So I strongly advise against pursuing this idea, at least for now, even though it is good in theory. Having said that, here is what a prototype of such an approach might look like: https://github.com/GideonPotok/spark/pull/1 . Thoughts? Also, I am done with adding the exception for unsupported complex types! Take a look.
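
To make the trade-off concrete, here is a minimal, language-neutral sketch in plain Python of what "an auxiliary map from the collation key to the actual values seen" could look like inside `update` and `merge`. It is not Spark's `Mode` implementation and not the code in the linked prototype. The class `CollationAwareModeBuffer`, the field names, and the `collation_key` helper are all hypothetical, and lowercasing is only an assumed stand-in for a real case-insensitive collation key.

```python
from collections import Counter, defaultdict


def collation_key(value: str) -> str:
    # Hypothetical stand-in for a real collation key: lowercasing only
    # approximates a case-insensitive collation on ASCII input.
    return value.lower()


class CollationAwareModeBuffer:
    """Toy aggregation buffer for illustration; not Spark's actual code."""

    def __init__(self):
        # Primary state: collation key -> number of occurrences.
        self.counts = Counter()
        # Auxiliary structure under discussion:
        # collation key -> distinct original values seen, in arrival order.
        self.originals = defaultdict(list)

    def update(self, value: str) -> None:
        key = collation_key(value)
        self.counts[key] += 1
        if value not in self.originals[key]:
            self.originals[key].append(value)

    def merge(self, other: "CollationAwareModeBuffer") -> None:
        # Both maps have to be merged, which is the extra cost of the approach.
        self.counts.update(other.counts)
        for key, values in other.originals.items():
            for value in values:
                if value not in self.originals[key]:
                    self.originals[key].append(value)

    def eval(self):
        if not self.counts:
            return None
        key, _ = self.counts.most_common(1)[0]
        # Return one of the original spellings rather than the key itself.
        return self.originals[key][0]


left = CollationAwareModeBuffer()
right = CollationAwareModeBuffer()
left.update("aaaAAa")
left.update("b")
right.update("AAaaaa")
left.merge(right)
result = left.eval()
```

Even in this toy form, every buffer carries a second map that has to be kept in sync through `update` and `merge`, which is the bookkeeping the comment above argues against.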