Hello everyone,

I have a DataFrame with 5040 rows that are split into 5 groups. A column called "Group_Id" marks every row with a value from 0 to 4, depending on which group the row belongs to. I am trying to split my DataFrame into 5 partitions, one per group, and apply k-means to each partition separately. I have tried:

    rdd = mydataframe.rdd.mapPartitions(function, True)
    test = Kmeans.train(rdd, num_of_centers, "random")

but I get an error. How can I apply k-means to every partition?

Thank you in advance,
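
For anyone landing here with the same problem, here is a minimal sketch of one possible approach, not a definitive solution. The MLlib KMeans.train call is itself a distributed job that takes a whole RDD, so it cannot be run from inside the function passed to mapPartitions, and mapPartitions alone does not group rows by "Group_Id". The idea below is to run a small local k-means inside each partition instead. The names lloyd_kmeans, kmeans_per_group and feature_cols are hypothetical, the (group_id, features) row layout is an assumption, and the plain-Python k-means is a stand-in for whatever local implementation you prefer.

```python
import random
from collections import defaultdict


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd_kmeans(points, k, iterations=20, seed=0):
    """Local k-means (Lloyd's algorithm) with random initial centers.

    Needs at least k points; returns a list of k center tuples.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iterations):
        # Assign every point to its nearest current center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers


def kmeans_per_group(rows, k=2):
    """Shaped for RDD.mapPartitions: consumes an iterator of
    (group_id, features) pairs and yields (group_id, centers)."""
    by_group = defaultdict(list)
    for group_id, features in rows:
        by_group[group_id].append(tuple(float(x) for x in features))
    for group_id, points in sorted(by_group.items()):
        yield group_id, lloyd_kmeans(points, k)


# Local demonstration with a plain iterator standing in for one partition.
rows = [
    (0, (0, 0)), (0, (0, 1)), (0, (10, 10)), (0, (10, 11)),
    (1, (5, 5)), (1, (5, 6)), (1, (-5, -5)), (1, (-5, -6)),
]
result = dict(kmeans_per_group(iter(rows), k=2))

# With Spark (assumed wiring, feature_cols is a hypothetical list of column names):
# keyed = mydataframe.rdd.map(lambda r: (r["Group_Id"], [r[c] for c in feature_cols]))
# centers = keyed.partitionBy(5, lambda gid: gid).mapPartitions(kmeans_per_group).collect()
```

The partitionBy step is what places each "Group_Id" in its own partition before mapPartitions runs. If you would rather keep MLlib's KMeans, another option is to loop over the five group ids on the driver, filter the DataFrame for each id, and call KMeans.train on each filtered RDD, passing the initialization mode by keyword (initializationMode="random") since the third positional argument is, as far as I know, maxIterations.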