[ https://issues.apache.org/jira/browse/SPARK-5891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512361#comment-14512361 ]
Apache Spark commented on SPARK-5891:
-------------------------------------

User 'viirya' has created a pull request for this issue:
https://github.com/apache/spark/pull/5699

> Add Binarizer
> -------------
>
>                 Key: SPARK-5891
>                 URL: https://issues.apache.org/jira/browse/SPARK-5891
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>            Reporter: Xiangrui Meng
>
> `Binarizer` takes a column of continuous features and outputs a column of
> binary features, where nonzeros (or values above a threshold) become 1 in the
> output.
> {code}
> val binarizer = new Binarizer()
>   .setInputCol("numVisits")
>   .setOutputCol("visited")
> {code}
> The output column should be marked as binary. We need to discuss whether we
> should process multiple columns or a vector column.
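
For illustration only, the thresholding behaviour described in the issue could be sketched in plain Scala, without any Spark dependency. The object name `BinarizerSketch`, the `binarize` helper, the default threshold of 0.0, and the strict "greater than" comparison are all assumptions made for this sketch; none of them is taken from the pull request, and negative values are treated as below the threshold rather than as "nonzeros".

```scala
// Hypothetical stand-in for the proposed transformer; this is NOT the Spark ML API.
object BinarizerSketch {

  // Maps each value to 1.0 if it is strictly greater than `threshold`, else 0.0.
  // Both the strict comparison and the default threshold of 0.0 are assumptions.
  def binarize(values: Seq[Double], threshold: Double = 0.0): Seq[Double] =
    values.map(v => if (v > threshold) 1.0 else 0.0)

  def main(args: Array[String]): Unit = {
    // Mirrors the "numVisits" -> "visited" example from the issue description.
    val numVisits = Seq(0.0, 3.0, 1.0, 0.0)
    val visited = binarize(numVisits)
    println(visited.mkString(", "))
  }
}
```

A column-oriented version would apply the same per-value rule to every row of the input column. Whether it should also accept a vector column is the open question raised in the issue.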