Matthias,
You don't need a StructType for the result; dataType can be an ArrayType directly (only bufferSchema has to be a StructType):
def bufferSchema: StructType = StructType(StructField("vals",
DataTypes.createArrayType(StringType)) :: Nil)
def dataType: DataType = DataTypes.createArrayType(StringType)
def evaluate(buffer: Row): Any =
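
For reference, the pieces above could be put together roughly as in the sketch below. It assumes the UserDefinedAggregateFunction API from org.apache.spark.sql.expressions (so spark-sql must be on the classpath). The class name CollectValsUdaf and the choice to keep the values as a de-duplicated Seq are my assumptions, not taken from the original UDAF.

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.types._

// Hypothetical name: collects the distinct non-null values of one string column.
class CollectValsUdaf extends UserDefinedAggregateFunction {
  // One string input column.
  def inputSchema: StructType = StructType(StructField("val", StringType) :: Nil)

  // The buffer schema has to be a StructType, even with a single field.
  def bufferSchema: StructType =
    StructType(StructField("vals", DataTypes.createArrayType(StringType)) :: Nil)

  // The result type can be an ArrayType directly, no wrapping struct needed.
  def dataType: DataType = DataTypes.createArrayType(StringType)

  // Same set of values for the same input; element order may vary with partitioning.
  def deterministic: Boolean = true

  // Start every group with an empty sequence.
  def initialize(buffer: MutableAggregationBuffer): Unit =
    buffer(0) = Seq.empty[String]

  // Append the new value, skipping nulls and dropping duplicates.
  def update(buffer: MutableAggregationBuffer, input: Row): Unit =
    if (!input.isNullAt(0)) {
      buffer(0) = (buffer.getSeq[String](0) :+ input.getString(0)).distinct
    }

  // Combine two partial buffers for the same group.
  def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit =
    buffer1(0) = (buffer1.getSeq[String](0) ++ buffer2.getSeq[String](0)).distinct

  // Return the collected values as the array result.
  def evaluate(buffer: Row): Any = buffer.getSeq[String](0)
}
```

With this sketch, the call would look like df.groupBy($"id1", $"id2").agg(new CollectValsUdaf()($"val") as "vals"), and the vals column should come out as an array of strings rather than a struct.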
Hi,
I have a simple DF:
root
|-- id1: string (nullable = true)
|-- id2: string (nullable = true)
|-- val: string (nullable = true)
I run a UDAF on this DF with groupBy($"id1", $"id2").agg(udaf($"val") as
"valsStruct").
The aggregate simply stores all val values in a Set.
The result is:
root
|--