Hi folks, I'm having trouble expressing IN and COLLECT_SET on a DataFrame. In other words, I'd like to figure out how to write the following query:

    select collect_set(b), a from mytable where c in (1,2,3) group by a

I've started with:

    someDF
      .where( -- not sure what to do for c here --- )
      .groupBy($"a")
      .agg( -- collect_set is not part of sql functions as far as I see... -- )

I know I can register a table and do raw SQL, but I'm trying to figure out the DF route... Help much appreciated.
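
For reference, here is a rough sketch of the shape I'm after. It rests on assumptions: a Spark version where `Column.isin` (1.5+) and `collect_set` in `org.apache.spark.sql.functions` (1.6+) are available, and `SparkSession` (2.0+) for the local setup. The object name `CollectSetSketch`, the helper `aggregate`, the alias `bs`, and the sample rows are made up for illustration. On older versions where `collect_set` is not exposed, this would not compile as written.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, collect_set}

object CollectSetSketch {
  // Hypothetical helper: DataFrame form of
  //   select collect_set(b), a from mytable where c in (1,2,3) group by a
  // Uses col("...") instead of $"..." so no implicits import is needed here.
  def aggregate(df: DataFrame): DataFrame =
    df.where(col("c").isin(1, 2, 3))   // WHERE c IN (1,2,3)
      .groupBy(col("a"))               // GROUP BY a
      .agg(collect_set(col("b")).as("bs")) // collect_set(b), aliased for readability

  def main(args: Array[String]): Unit = {
    // Local session for the sketch only (assumes Spark 2.0+).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("collect-set-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data with columns a, b, c.
    val someDF = Seq(
      ("x", 10, 1), ("x", 20, 2), ("x", 10, 3),
      ("y", 30, 4), ("y", 40, 1)
    ).toDF("a", "b", "c")

    aggregate(someDF).show()
    spark.stop()
  }
}
```

The element order inside each collected set is not guaranteed, so anything comparing results should treat the `bs` column as a set rather than an ordered list.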