I describe some of the details here:
https://issues.apache.org/jira/browse/SPARK-27296

The short version of the story is that aggregating data structures (UDTs)
used by UDAFs are serialized to a Row object, and de-serialized, for every
row in a data frame.
Cheers,
Erik

Reply via email to