They use such functionality via pyspark, see https://www.youtube.com/watch?v=R-6nAwLyWCI
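For illustration only, here is a minimal sketch of the "one model per group" idea. It is not taken from the video and it is not a UDAF: it is a plain-Python stand-in, with a thread pool in place of the cluster. The names `fit_line` and `fit_per_group` and the sample rows are made up for this example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept on (x, y) pairs."""
    points = list(points)  # also accepts a one-shot iterable
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx  # assumes at least two distinct x values per group
    return slope, mean_y - slope * mean_x


def fit_per_group(rows):
    """rows: iterable of (group_id, x, y); returns {group_id: (slope, intercept)}."""
    groups = defaultdict(list)
    for group_id, x, y in rows:
        groups[group_id].append((x, y))
    # Each group is fitted independently, so the fits can run side by side.
    with ThreadPoolExecutor() as pool:
        fitted = pool.map(fit_line, groups.values())
        return dict(zip(groups.keys(), fitted))


# Hypothetical sample data: two groups, each lying exactly on a line.
rows = [("a", 0.0, 1.0), ("a", 1.0, 3.0), ("a", 2.0, 5.0),
        ("b", 0.0, 0.0), ("b", 2.0, -2.0)]
models = fit_per_group(rows)
```

In pyspark the same function could probably be applied per key with something like `rdd.map(lambda r: (r[0], (r[1], r[2]))).groupByKey().mapValues(fit_line)`. I have not run that here, and it assumes every group fits in the memory of a single executor.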
Xiaomeng Wan <shawn...@gmail.com> wrote on Tue, 29 Nov 2016 at 17:54:
> I want to divide big data into groups (e.g. group by some id), and build one
> model for each group. I am wondering whether I can parallelize the model
> building process by implementing a UDAF (e.g. running linear regression in its
> evaluate method). Is it good practice? Does anybody have experience? Thanks!
>
> Regards,
> Shawn