Hi junjie,

As Nick said, spark.ml does have its own Vector, Vectors and VectorUDT; see:

mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala:36: sealed trait Vector extends Serializable

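
For reference, here is a minimal sketch of the stock VectorAssembler fed with ml.linalg vectors. It assumes Spark 2.x with spark-mllib on the classpath and a local SparkSession; the object name, column names and sample values are made up for illustration and are not taken from the original report.

```scala
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.linalg.Vectors // the ml package, not mllib
import org.apache.spark.sql.SparkSession

// Hypothetical object name, used only for this sketch.
object AssemblerSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("AssemblerSketch")
      .getOrCreate()

    // Made-up data: one numeric column "x" and one ml.linalg vector column "v".
    val df = spark.createDataFrame(Seq(
      (1.0, Vectors.dense(0.5, 2.0)),
      (3.0, Vectors.dense(1.5, 4.0))
    )).toDF("x", "v")

    // The unmodified VectorAssembler expects ml.linalg vectors in its input columns.
    val assembler = new VectorAssembler()
      .setInputCols(Array("x", "v"))
      .setOutputCol("features")

    // Adds a "features" column that concatenates "x" and "v".
    assembler.transform(df).show(false)

    spark.stop()
  }
}
```

If something along these lines fails in your environment, the exact error message and Spark version would help narrow it down.
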
So, which bug did you find with VectorAssembler? Could you give more details?

On Thu, Jul 13, 2017 at 5:15 PM, <xiongjun...@birdsh.com> wrote:

> Dear Developers:
>
> Here is a bug in org.apache.spark.ml.linalg.*:
> Class Vector, Vectors are not included in org.apache.spark.ml.linalg.*,
> but they are used in VectorAssembler.scala as follows:
>
> import org.apache.spark.ml.linalg.{Vector, Vectors, VectorUDT}
>
> Therefore, bug was reported when I was using VectorAssembler.
>
> Since org.apache.spark.mllib.linalg.* contains the class {Vector,
> Vectors, VectorUDT}, I rewrote VectorAssembler.scala as
> XVectorAssembler.scala by mainly changing
> "import org.apache.spark.ml.linalg.{Vector, Vectors, VectorUDT}" to
> "import org.apache.spark.mllib.linalg.{Vector, Vectors, VectorUDT}"
>
> But bug occurred as follows:
>
> "Column v must be of type org.apache.spark.ml.linalg.VectorUDT@3bfc3ba7
> but was actually org.apache.spark.mllib.linalg.VectorUDT@f71b0bce"
>
> Would you please help fix the bug?
>
> Thank you very much!
>
> Best regards
> --xiongjunjie

On Thu, Jul 13, 2017 at 6:08 PM, Nick Pentreath <nick.pentre...@gmail.com> wrote:

> There are Vector classes under the ml.linalg package, and VectorAssembler
> and other feature transformers all work with ml.linalg vectors.
>
> If you try to use mllib.linalg vectors instead, you will get an error, as
> the user-defined type for SQL is not correct.
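
If the data already holds mllib.linalg vectors, one way around the VectorUDT mismatch quoted above is to convert the column rather than rewrite VectorAssembler. The following is a sketch only, assuming Spark 2.x; the object name, column names and values are hypothetical.

```scala
import org.apache.spark.mllib.linalg.{Vectors => OldVectors}
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.sql.SparkSession

// Hypothetical object name, used only for this sketch.
object ConvertSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("ConvertSketch")
      .getOrCreate()

    // Made-up DataFrame whose "v" column uses the old mllib.linalg.VectorUDT.
    val oldDf = spark.createDataFrame(Seq(
      (1.0, OldVectors.dense(0.5, 2.0))
    )).toDF("x", "v")

    // Returns a new DataFrame in which "v" is typed as ml.linalg.VectorUDT,
    // which is what VectorAssembler and the other spark.ml transformers expect.
    val newDf = MLUtils.convertVectorColumnsToML(oldDf, "v")
    newDf.printSchema()

    // A single mllib vector can also be converted directly with asML.
    val mlVec = OldVectors.dense(1.0, 2.0).asML
    println(mlVec)

    spark.stop()
  }
}
```

After the conversion, the column can be passed to the unmodified VectorAssembler without changing its imports.
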