we currently have 2 explode definitions in Dataset: def explode[A <: Product : TypeTag](input: Column*)(f: Row => TraversableOnce[A]): DataFrame
def explode[A, B : TypeTag](inputColumn: String, outputColumn: String)(f: A => TraversableOnce[B]): DataFrame 1) the separation of the functions into their own argument lists is nice, but unfortunately scala's type inference doesn't handle this well, meaning that the generic types always have to be explicitly provided. i assume this was done to allow the "input" to be a varargs in the first method, and then kept the same in the second for reasons of symmetry. 2) i am surprised the first definition returns a DataFrame. this seems to suggest DataFrame usage (so DataFrame to DataFrame), but there is no way to specify the output column names, which limits its usability for DataFrames. i frequently end up using the first definition for DataFrames anyhow because of the need to return more than 1 column (and the data has columns unknown at compile time that i need to carry along making flatMap on Dataset clumsy/unusable), but relying on the output columns being called _1 and _2 and renaming then afterwards seems like an anti-pattern. 3) using Row objects isn't very pretty. why not f: A => TraversableOnce[B] or something like that for the first definition? how about: def explode[A: TypeTag, B: TypeTag](input: Seq[Column], output: Seq[Column])(f: A => TraversableOnce[B]): DataFrame best, koert