Hi Dimitri,
What error are you getting? Please post the exact message.
Apostolos
On 30/1/19 16:30, dimitris plakas wrote:
Hello everyone,
I have a dataframe with 5040 rows, and these rows are split into
5 groups. A column called "Group_Id" marks every row with a value
from 0 to 4, depending on which group the row belongs to. I am
trying to split my dataframe into 5 partitions and apply KMeans to
every partition. I have tried:
rdd=mydataframe.rdd.mapPartitions(function, True)
test = Kmeans.train(rdd, num_of_centers, "random")
but I get an error.
How can I apply KMeans to every partition?
Thank you in advance,
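
Without the exact error this is only a guess, but two things in the
snippet above stand out. KMeans.train in pyspark.mllib is called on
the driver and expects an RDD of feature vectors, so it does not run
"per partition" on the output of mapPartitions. Also, its third
positional argument is maxIterations, so "random" has to be passed as
initializationMode. A minimal sketch of one alternative is to filter
the dataframe by Group_Id and train one model per group. The helper
names (to_features, kmeans_per_group) and the feature column names
are hypothetical, since the original message does not show them.

```python
def to_features(row, feature_cols):
    """Return the chosen columns of one row as a list of floats."""
    return [float(row[c]) for c in feature_cols]


def kmeans_per_group(df, feature_cols, k, group_col="Group_Id"):
    """Train one KMeans model per distinct value of group_col.

    Returns a dict mapping group id -> KMeansModel.
    """
    # Imported here so that to_features stays usable without a Spark install.
    from pyspark.mllib.clustering import KMeans

    # Collect the distinct group ids on the driver (5 values in this thread).
    group_ids = [r[0] for r in df.select(group_col).distinct().collect()]

    models = {}
    for g in group_ids:
        # Each group becomes its own RDD of feature lists; KMeans.train then
        # runs as an ordinary distributed job on that RDD.
        points = df.filter(df[group_col] == g).rdd.map(
            lambda row: to_features(row, feature_cols))
        points.cache()  # k-means makes several passes over the data
        # "random" must be given by keyword: the third positional
        # argument of KMeans.train is maxIterations.
        models[g] = KMeans.train(points, k, initializationMode="random")
        points.unpersist()
    return models
```

Assuming feature columns named "x" and "y" (placeholders), it could be
called as models = kmeans_per_group(mydataframe, ["x", "y"], num_of_centers),
after which models[0].clusterCenters holds the centers for group 0.
With only 5 groups, the loop over groups on the driver should be cheap.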
--
Apostolos N. Papadopoulos, Associate Professor
Department of Informatics
Aristotle University of Thessaloniki
Thessaloniki, GREECE
tel: ++0030312310991918
email: papad...@csd.auth.gr
twitter: @papadopoulos_ap
web: http://datalab.csd.auth.gr/~apostol