Hi

What is the best way to discretize a continuous variable in a Spark
DataFrame?

I want to discretize some variables 1) by equal frequency and 2) by k-means.

I usually use R for this purpose:

http://www.inside-r.org/packages/cran/arules/docs/discretize

R code, for example:

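library(arules)  # provides discretize()

# equal frequency: 10 bins with roughly the same number of rows each
table(discretize(data$some_column, "frequency", categories=10))

# k-means: 10 bins from clustering the values
table(discretize(data$some_column, "cluster", categories=10))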
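
In case the two methods are unclear, here is a rough plain-Python sketch of the behaviour I am after. It is not Spark code and not necessarily what arules does internally. The function names (equal_frequency_breaks, kmeans_breaks, assign_bins) are made up for illustration, and the quantile rule and the k-means starting centers are my own simplifying assumptions.

```python
from bisect import bisect_right


def equal_frequency_breaks(values, k):
    """Return k-1 cut points so each bin holds roughly len(values)/k items."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: cut point i is the i/k empirical quantile, no interpolation.
    return [ordered[(i * n) // k] for i in range(1, k)]


def kmeans_breaks(values, k, iterations=50):
    """Return k-1 cut points from a simple 1-D k-means (Lloyd's algorithm)."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: deterministic start, centers spread across the sorted data.
    centers = [ordered[((2 * i + 1) * n) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in ordered:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        new_centers = [
            sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    centers = sorted(centers)
    # In one dimension, cluster boundaries are midpoints between adjacent centers.
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]


def assign_bins(values, breaks):
    """Map each value to a bin index in 0..len(breaks)."""
    return [bisect_right(breaks, v) for v in values]
```

With this sketch, equal-frequency binning puts about the same number of rows in each bin regardless of how the values are spaced, while the k-means version places the cut points between natural groups of values. I am looking for the DataFrame equivalent of both.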

Thanks a lot!
