[ https://issues.apache.org/jira/browse/SPARK-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537296#comment-14537296 ]
Xiangrui Meng commented on SPARK-5885:
--------------------------------------

Yes, this sounds good to me. Please create a new JIRA and then submit a PR for it. But if your data is sparse, using StandardScaler with `withMean` is not recommended.

> Add VectorAssembler
> -------------------
>
>                 Key: SPARK-5885
>                 URL: https://issues.apache.org/jira/browse/SPARK-5885
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>            Reporter: Xiangrui Meng
>            Assignee: Xiangrui Meng
>             Fix For: 1.4.0
>
>
> `VectorAssembler` takes a list of columns (of type double/int/vector) and
> merges them into a single vector column.
> {code}
> import org.apache.spark.ml.feature.VectorAssembler
>
> // setInputCols takes an Array[String] of column names
> val va = new VectorAssembler()
>   .setInputCols(Array("userFeatures", "dayOfWeek", "timeOfDay"))
>   .setOutputCol("features")
> {code}
> In the first version, it should be okay if it doesn't handle ML attributes
> (SPARK-4588).
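
For readers following the thread, here is a minimal plain-Scala sketch of the two ideas discussed above: merging double/int/vector columns into one vector, and why centering (what `withMean` does in StandardScaler) is a poor fit for sparse data. It does not use Spark and is not how Spark implements either feature. The names `VectorSketch`, `ColumnValue`, `Num`, `Vec`, `assemble` and `center` are hypothetical, chosen only for illustration.

```scala
object VectorSketch {
  // Hypothetical stand-in for one row's column values:
  // either a single number (double, or an int converted to double) or a vector.
  sealed trait ColumnValue
  final case class Num(value: Double) extends ColumnValue
  final case class Vec(values: Array[Double]) extends ColumnValue

  // Concatenate the column values of one row, in order, into a single flat vector.
  def assemble(cols: Seq[ColumnValue]): Array[Double] =
    cols.flatMap {
      case Num(v)  => Seq(v)
      case Vec(vs) => vs.toSeq
    }.toArray

  // Center one feature across rows by subtracting its mean from every entry.
  // Entries that were zero generally become non-zero, so sparsity is lost.
  def center(feature: Array[Double]): Array[Double] = {
    val mean = feature.sum / feature.length
    feature.map(_ - mean)
  }

  def main(args: Array[String]): Unit = {
    // userFeatures (vector), dayOfWeek (int), timeOfDay (int) for one row.
    val row = Seq(Vec(Array(0.5, 0.0)), Num(3.0), Num(14.0))
    println(assemble(row).mkString("[", ", ", "]")) // [0.5, 0.0, 3.0, 14.0]

    // A mostly-zero feature: after centering, no entry is zero any more.
    val sparseFeature = Array(0.0, 0.0, 0.0, 4.0)
    println(center(sparseFeature).mkString("[", ", ", "]")) // [-1.0, -1.0, -1.0, 3.0]
  }
}
```

In the second example three of the four entries start at zero and none are zero after centering, which is the densification the comment warns about for sparse input.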