Is it possible with Spark SQL to merge columns whose types are Arrays or Sets?
My use case would be something like this. DataFrame schema:

  id: String
  words: Array[String]

I would want to do something like:

  df.groupBy('id).agg(merge_arrays('words))  // list of all words
  df.groupBy('id).agg(merge_sets('words))    // list of distinct words

Thanks,

--
Pedro Rodriguez
PhD Student in Distributed Machine Learning | CU Boulder
UC Berkeley AMPLab Alumni
ski.rodrig...@gmail.com | pedrorodriguez.io | 909-353-4423
Github: github.com/EntilZha | LinkedIn: https://www.linkedin.com/in/pedrorodriguezscience
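To make the intended semantics concrete, here is a sketch using plain Scala collections (no Spark involved). The names `mergeArrays` and `mergeSets` are stand-ins for the hypothetical `merge_arrays` / `merge_sets` aggregates, and the sample rows are assumed:

```scala
// Plain-Scala sketch of the desired per-id merge semantics.
object MergeSketch extends App {
  // Rows of (id, words), mirroring the schema above.
  val rows = Seq(
    ("a", Seq("x", "y")),
    ("a", Seq("y", "z")),
    ("b", Seq("w"))
  )

  // merge_arrays: concatenate every words array per id (duplicates kept).
  val mergeArrays: Map[String, Seq[String]] =
    rows.groupBy(_._1).map { case (id, group) => id -> group.flatMap(_._2) }

  // merge_sets: keep only the distinct words per id.
  val mergeSets: Map[String, Set[String]] =
    rows.groupBy(_._1).map { case (id, group) => id -> group.flatMap(_._2).toSet }

  println(mergeArrays)
  println(mergeSets)
}
```

Under these assumptions, id "a" would yield the concatenation ("x", "y", "y", "z") from merge_arrays and the distinct set {"x", "y", "z"} from merge_sets.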