Re: How to binarize data in spark

2015-08-07 Thread Adamantios Corais
I have ended up with the following piece of code, but it turns out to be really slow. Any other ideas, given that I can only use MLlib 1.2?

val data = test11.map(x => ((x(0), x(1)), x(2))).groupByKey().map(x => (x._1, x._2.toArray)).map { x => var lt : Array[Double] = new

How to binarize data in spark

2015-08-06 Thread Adamantios Corais
I have a set of data based on which I want to create a classification model. Each row has the following form:

user1,class1,product1
user1,class1,product2
user1,class1,product5
user2,class1,product2
user2,class1,product5
user3,class2,product1
etc.

There are about 1M users, 2 classes, and 1M
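
To make the goal concrete, here is a minimal sketch of what "binarizing" these rows could mean, written with plain Scala collections rather than Spark so that it runs on its own. The names `BinarizeSketch`, `productIndex`, and `binarize` are hypothetical and not from the thread. It assumes each distinct product becomes one column, and that each (user, class) pair is represented by the sorted column indices at which its binary feature vector holds a 1.

```scala
object BinarizeSketch {
  // One input row: (user, class, product), mirroring "user1,class1,product1".
  type Row = (String, String, String)

  // Assign each distinct product a column index (sorted so the result is deterministic).
  def productIndex(rows: Seq[Row]): Map[String, Int] =
    rows.map(_._3).distinct.sorted.zipWithIndex.toMap

  // For each (user, class) pair, collect the sorted indices of the products seen.
  // These indices are the positions of the 1s in that pair's binary feature vector.
  def binarize(rows: Seq[Row], index: Map[String, Int]): Map[(String, String), Array[Int]] =
    rows
      .groupBy(r => (r._1, r._2))
      .map { case (key, rs) => key -> rs.map(r => index(r._3)).distinct.sorted.toArray }

  def main(args: Array[String]): Unit = {
    // The sample rows quoted in the message above.
    val rows: Seq[Row] = Seq(
      ("user1", "class1", "product1"),
      ("user1", "class1", "product2"),
      ("user1", "class1", "product5"),
      ("user2", "class1", "product2"),
      ("user2", "class1", "product5"),
      ("user3", "class2", "product1")
    )
    val index = productIndex(rows)
    binarize(rows, index).toSeq.sortBy(_._1).foreach { case ((user, cls), idx) =>
      println(s"$user,$cls -> ${idx.mkString("[", ",", "]")}")
    }
  }
}
```

Storing only the indices of the nonzero entries keeps each row small even with a very large number of product columns. In MLlib, such an index array could plausibly be passed to `Vectors.sparse(size, indices, values)` with all values set to 1.0 and wrapped in a `LabeledPoint`; whether that fits the actual job depends on details not shown in the thread.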