Hi Sameer,

MLlib uses Breeze's vector format under the hood, so you can build a Breeze SparseVector directly:
http://www.scalanlp.org/api/breeze/index.html#breeze.linalg.SparseVector

For example:

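import breeze.linalg.{DenseVector => BDV, SparseVector => BSV, Vector => BV}

// Vector length. This assumes every class ID is a valid index in [0, numClasses).
val numClasses = classes.distinct.count.toInt

// Breeze expects the indices in ascending order, with one value per index.
val userWithClassesAsSparseVector = rows.map { x =>
  val indices = x.classIDs.sorted
  val values = Array.fill(indices.length)(1.0)
  (x.userID, new BSV[Double](indices, values, numClasses).asInstanceOf[BV[Double]])
}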
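
If it helps to try this outside Spark first, here is a minimal sketch that parses one row of your format with plain Breeze. It assumes the fields are whitespace-separated and that the class IDs are used directly as vector indices, so the length has to be larger than the biggest ID. The names parseRow and length are made up for illustration and are not part of MLlib or Breeze.

```scala
import breeze.linalg.{SparseVector => BSV}

object SparseRowSketch {
  // Parses "userID classID classID ..." into (userID, 0/1 indicator vector).
  // Assumption: class IDs double as indices, so each must lie in [0, length).
  def parseRow(line: String, length: Int): (Long, BSV[Double]) = {
    val fields = line.trim.split("\\s+")
    // Breeze wants ascending, duplicate-free indices.
    val indices = fields.tail.map(_.toInt).distinct.sorted
    require(indices.isEmpty || (indices.head >= 0 && indices.last < length),
      "class ID out of range")
    val values = Array.fill(indices.length)(1.0)
    (fields.head.toLong, new BSV[Double](indices, values, length))
  }

  def main(args: Array[String]): Unit = {
    // Shortened version of the sample row; 285557 is an illustrative length.
    val (userID, vec) = parseRow("2622572 145447 1620 293 130", length = 285557)
    println(s"user $userID: ${vec.activeSize} non-zeros of ${vec.length}")
  }
}
```

Only the listed class IDs are stored, so the memory used per user grows with the number of classes in the row and not with the vector length.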

Chris

On Sep 15, 2014, at 11:28 AM, Sameer Tilak <ssti...@live.com> wrote:

> Hi All,
> I have transformed the data into the following format: the first column is the 
> user id, and all the other columns are class ids. For a user, only the class ids 
> that appear in this row have value 1; all others are 0. I need to create a sparse 
> vector from this. Does the API for creating a sparse vector directly support 
> this format?
> 
> User id    Product class ids
> 
> 2622572       145447  1620    13421   28565   285556  293     4553    67261
>               130     3646    1671    18806   183576  3286    51715   57671   57476
