Hi, what is the best way to discretize a continuous variable in a Spark DataFrame?
I want to discretize a variable in two ways: 1) by equal frequency, 2) by k-means.

I usually use R for this purpose: http://www.inside-r.org/packages/cran/arules/docs/discretize

R code, for example:

    ### equal frequency
    table(discretize(data$some_column, "frequency", categories=10))

    # k-means
    table(discretize(data$some_column, "cluster", categories=10))

Thanks a lot!
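For reference, here is a minimal plain-Python sketch of what the two schemes in the question compute, to pin down the intended semantics. It is not Spark code and not the arules implementation. The function names (`equal_frequency_breaks`, `kmeans_breaks`, `assign_bin`), the nearest-rank quantile rule, and the deterministic k-means initialisation are all assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import mean


def equal_frequency_breaks(values, categories):
    """Interior cut points so that each bin holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: simple nearest-rank quantiles; other quantile rules give slightly different cuts.
    return [ordered[(i * n) // categories] for i in range(1, categories)]


def kmeans_breaks(values, categories, iterations=50):
    """1-D Lloyd's k-means; cut points are midpoints between adjacent cluster centers."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: deterministic start, centers spread evenly over the sorted data.
    centers = [ordered[((2 * i + 1) * n) // (2 * categories)] for i in range(categories)]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in ordered:
            nearest = min(range(len(centers)), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        # Keep the previous center when a cluster ends up empty.
        new_centers = [mean(c) if c else centers[j] for j, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    centers = sorted(centers)
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]


def assign_bin(value, breaks):
    """Zero-based index of the bin that value falls into, given sorted cut points."""
    return bisect_right(breaks, value)
```

Usage would look like `assign_bin(x, equal_frequency_breaks(column_values, 10))` for each value `x`. In Spark itself, `QuantileDiscretizer` and `KMeans` from the ML package look like the closest counterparts, though that mapping is an assumption and not something confirmed in this thread.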