example code:
// Discretize data in 16 equal bins since ChiSqSelector requires categorical features
val discretizedData = data.map { lp =>
  LabeledPoint(lp.label, Vectors.dense(lp.features.toArray.map { x => x / 16 }))
}
I'm sort of missing why x / 16 is considered a discretization approach here.
[https://spark.apache.org/docs/latest/mllib-feature-extraction.html#feature-selection]
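To make the question concrete, here is a minimal sketch in plain Scala (no Spark dependency). It assumes the feature values are pixel-like doubles in the range 0 to 255, which the snippet above does not state. The names binWidth, rescale and toBin are hypothetical and not part of MLlib. Dividing by 16 on its own only rescales the values. One possible way to get 16 equal-width bins is to take the floor of the quotient, which is a swapped-in step and not what the quoted code does.

```scala
object DiscretizeSketch {
  // Assumed bin width: 256 possible values / 16 bins (an assumption, see above).
  val binWidth = 16.0

  // What the quoted snippet does: a pure rescaling, distinct inputs stay distinct.
  def rescale(x: Double): Double = x / binWidth

  // Hypothetical fix: flooring collapses each width-16 interval into one bin index.
  def toBin(x: Double): Double = math.floor(x / binWidth)

  def main(args: Array[String]): Unit = {
    val pixels = Seq(0.0, 7.0, 15.0, 16.0, 200.0, 255.0)
    // Rescaling keeps all six values distinct.
    println(pixels.map(rescale).distinct.size)
    // Binning maps 0, 7 and 15 to the same bin, so fewer distinct values remain.
    println(pixels.map(toBin).distinct.size)
  }
}
```

With continuous inputs, the rescaled values would each count as their own category, which is why the division alone does not look like discretization.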
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Discretization-tp22811.html
Sent from the Apache Spark User List mailing
We have a pending PR (https://github.com/apache/spark/pull/216) for
discretization but it has performance issues. We will try to spend
more time to improve it. -Xiangrui
On Tue, Sep 2, 2014 at 2:56 AM, filipus floe...@gmail.com wrote:
i guess i found it
https://github.com/LIDIAgroup
https://github.com/apache/spark/pull/216 the code and then sbt package?
is it the same as https://github.com/LIDIAgroup/SparkFeatureSelection?
or something different?
filip
is there any news about discretization in Spark?
is there anything on git? i didn't find anything yet
i guess i found it
https://github.com/LIDIAgroup/SparkFeatureSelection