[ https://issues.apache.org/jira/browse/SPARK-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537267#comment-14537267 ]
Herman van Hovell tot Westerflier commented on SPARK-5885:
----------------------------------------------------------

Is it possible to add a parameter to VectorAssembler that controls the type of vector it produces? It should be one of Dense, Sparse, or Compressed.

I tried to create a pipeline in which the output of a VectorAssembler was fed into a StandardScaler. This fails because the assembler produced a sparse vector, whereas the scaler expected a dense vector.

If needed, I'll create a PR for this.

> Add VectorAssembler
> -------------------
>
>                 Key: SPARK-5885
>                 URL: https://issues.apache.org/jira/browse/SPARK-5885
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>            Reporter: Xiangrui Meng
>            Assignee: Xiangrui Meng
>             Fix For: 1.4.0
>
>
> `VectorAssembler` takes a list of columns (of type double/int/vector) and
> merges them into a single vector column.
> {code}
> val va = new VectorAssembler()
>   .setInputCols("userFeatures", "dayOfWeek", "timeOfDay")
>   .setOutputCol("features")
> {code}
> In the first version, it should be okay if it doesn't handle ML attributes
> (SPARK-4588).
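
To make the request in the comment concrete, here is a minimal sketch of what a vector-type switch might look like. It is plain Scala and deliberately does not use Spark: `Vec`, `Dense`, `Sparse`, `VectorType`, and `Assemble.assemble` are all hypothetical names invented for illustration, and the size heuristic used for the compressed case is an assumption, not something taken from the issue.

```scala
// Hypothetical stand-ins for vector classes; this is NOT the Spark API.
sealed trait Vec {
  def size: Int
  def toArray: Array[Double]
}

final case class Dense(values: Array[Double]) extends Vec {
  def size: Int = values.length
  def toArray: Array[Double] = values.clone()
}

final case class Sparse(size: Int, indices: Array[Int], values: Array[Double]) extends Vec {
  // Expand to a full array, leaving unlisted positions at 0.0.
  def toArray: Array[Double] = {
    val out = new Array[Double](size)
    indices.zip(values).foreach { case (i, v) => out(i) = v }
    out
  }
}

// Hypothetical parameter values matching the three options named in the comment.
sealed trait VectorType
case object DenseType extends VectorType
case object SparseType extends VectorType
case object CompressedType extends VectorType

object Assemble {
  // Keep only the non-zero entries together with their positions.
  private def toSparse(values: Array[Double]): Sparse = {
    val nonZero = values.zipWithIndex.filter { case (v, _) => v != 0.0 }
    Sparse(values.length, nonZero.map(_._2), nonZero.map(_._1))
  }

  // Build the output vector in the representation the caller asked for.
  def assemble(values: Array[Double], vectorType: VectorType): Vec = vectorType match {
    case DenseType  => Dense(values.clone())
    case SparseType => toSparse(values)
    case CompressedType =>
      val nnz = values.count(_ != 0.0)
      // Assumed heuristic: choose sparse only when it needs clearly less storage.
      if (1.5 * (nnz + 1.0) < values.length) toSparse(values) else Dense(values.clone())
  }
}
```

With a switch like this, a downstream stage that only accepts dense input could be served by calling `Assemble.assemble(values, DenseType)`, while `CompressedType` leaves the choice to the size heuristic.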