I want to divide a large dataset into groups (e.g., group by some ID) and build one model for each group. I am wondering whether I can parallelize the model building by implementing a UDAF (e.g., running a linear regression in its evaluate method). Is this good practice? Does anybody have experience with this? Thanks!
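
To make the question concrete, the idea could be sketched roughly as follows, in plain Python rather than against any real UDAF API. The names `LinRegBuffer`, `update`, `merge`, `evaluate` and `fit_per_group` are hypothetical and only for illustration, and the model is assumed to be a one-feature least-squares fit kept as running sums, which may not be the regression actually needed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LinRegBuffer:
    """Hypothetical aggregation buffer: running sums for fitting y = a + b*x."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0

    def update(self, x, y):
        # Fold one input row into the running sums.
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y

    def merge(self, other):
        # Combine two partial buffers, e.g. built on different partitions.
        self.n += other.n
        self.sx += other.sx
        self.sy += other.sy
        self.sxx += other.sxx
        self.sxy += other.sxy
        return self

    def evaluate(self):
        # Closed-form least squares; None if the fit is undetermined.
        denom = self.n * self.sxx - self.sx ** 2
        if self.n < 2 or denom == 0:
            return None
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return intercept, slope


def fit_per_group(rows):
    # rows: iterable of (group_id, x, y); returns {group_id: (intercept, slope)}.
    buffers = defaultdict(LinRegBuffer)
    for group_id, x, y in rows:
        buffers[group_id].update(x, y)
    return {g: buf.evaluate() for g, buf in buffers.items()}


# Made-up sample data, purely illustrative.
rows = [
    ("a", 0.0, 1.0), ("a", 1.0, 3.0), ("a", 2.0, 5.0),
    ("b", 0.0, 0.0), ("b", 2.0, -2.0),
]
models = fit_per_group(rows)
```

This only stays cheap because the fit reduces to sums that can be merged across partitions; a model that needs all rows of a group at once would have to buffer them inside the aggregate, and that is the case I am least sure about.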
Regards, Shawn