Hi Sameer,

MLlib uses Breeze's vector format under the hood. You can use that.
http://www.scalanlp.org/api/breeze/index.html#breeze.linalg.SparseVector
For example:

    import breeze.linalg.{DenseVector => BDV, SparseVector => BSV, Vector => BV}

    // Vector length. Every index must be smaller than this, so the distinct
    // count only works if class IDs have been remapped to 0 until numClasses.
    val numClasses = classes.distinct.count.toInt

    // One (userID, sparse vector) pair per row: the sorted class IDs are the
    // indices, and every stored value is 1.0.
    val userWithClassesAsSparseVector = rows.map(x =>
      (x.userID,
       new BSV[Double](x.classIDs.sortWith(_ < _),
                       Seq.fill(x.classIDs.length)(1.0).toArray,
                       numClasses).asInstanceOf[BV[Double]]))

Chris

On Sep 15, 2014, at 11:28 AM, Sameer Tilak <ssti...@live.com> wrote:

> Hi All,
> I have transformed the data into the following format: the first column is the
> user id, and all the other columns are class ids. For a user, only the class ids
> that appear in this row have value 1; the others are 0. I need to create a sparse
> vector from this. Is there an API for creating a sparse vector that directly
> supports this format?
>
> User id    Product class ids
>
> 2622572    145447 1620 13421 28565 285556 293 4553 67261
> 130        3646 1671 18806 183576 3286 51715 57671 57476
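
P.S. In case it helps, here is a rough sketch of the data-prep step in plain Scala (no Spark or Breeze needed). It assumes each row arrives as one whitespace-separated string, user id first, and that the raw class ids are used directly as vector indices. The names `SparseRowSketch`, `SparseRow`, `parseRow` and `requiredLength` are made up for illustration and are not part of any library.

```scala
// Hypothetical helper names; plain Scala, no Spark or Breeze required.
object SparseRowSketch {
  // One parsed row: the user id plus the index/value arrays that a
  // sparse-vector constructor typically expects.
  final case class SparseRow(userId: Long, indices: Array[Int], values: Array[Double])

  // Parse "userId classId classId ..." (whitespace-separated).
  def parseRow(line: String): SparseRow = {
    val tokens = line.trim.split("\\s+")
    val userId = tokens.head.toLong
    // Sort and de-duplicate so the indices are strictly increasing.
    val indices = tokens.tail.map(_.toInt).distinct.sorted
    SparseRow(userId, indices, Array.fill(indices.length)(1.0))
  }

  // Smallest vector length that can hold every raw class id as an index.
  def requiredLength(rows: Seq[SparseRow]): Int =
    rows.flatMap(_.indices.toSeq).max + 1

  def main(args: Array[String]): Unit = {
    val rows = Seq(
      "2622572 145447 1620 13421 28565 285556 293 4553 67261",
      "130 3646 1671 18806 183576 3286 51715 57671 57476"
    ).map(parseRow)
    rows.foreach(r => println(s"${r.userId}: ${r.indices.mkString(",")}"))
    println(s"vector length needed: ${requiredLength(rows)}")
  }
}
```

The `indices` and `values` arrays from each `SparseRow`, together with the length from `requiredLength`, are what you would pass to the sparse vector constructor. If the class ids are large and sparse, remapping them to a dense 0-based range first would keep the vector length down to the number of distinct classes.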