Thank you! Would you happen to have this code in Java?
This is extremely helpful!

Iman

On Sun, Nov 6, 2016 at 3:35 AM -0800, "Robineast [via Apache Spark User List]" <ml-node+s1001560n28027...@n3.nabble.com> wrote:

Here’s a way of creating sparse vectors in MLLib:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.rdd.RDD

    val rdd = sc.textFile("A.txt").map(line => line.split(",")).
      map(ary => (ary(0).toInt, ary(1).toInt, ary(2).toDouble))

    val pairRdd: RDD[(Int, (Int, Int, Double))] = rdd.map(el => (el._1, el))

    val create = (first: (Int, Int, Double)) => (Array(first._2), Array(first._3))
    val combine = (head: (Array[Int], Array[Double]), tail: (Int, Int, Double)) =>
      (head._1 :+ tail._2, head._2 :+ tail._3)
    val merge = (a: (Array[Int], Array[Double]), b: (Array[Int], Array[Double])) =>
      (a._1 ++ b._1, a._2 ++ b._2)

    val A = pairRdd.combineByKey(create,combine,merge).map(el => Vectors.sparse(3,el._2._1,el._2._2))

If you have a separate file of b’s then you would need to manipulate this slightly to join the b’s to the A RDD and then create LabeledPoints. I guess there is a way of doing this using the newer ML interfaces but it’s not particularly obvious to me how.

One point: In the example you give the b’s are exactly the same as col 2 in the A matrix. I presume this is just a quick hacked together example because that would give a trivial result.

-------------------------------------------------------------------------------
Robin East
Spark GraphX in Action
Michael Malak and Robin East
Manning Publications Co.
http://www.manning.com/books/spark-graphx-in-action

On 3 Nov 2016, at 18:12, im281 [via Apache Spark User List] <[hidden email]> wrote:

I would like to use it. But how do I do the following
1) Read sparse data (from text or database)
2) pass the sparse data to the linearRegression class?

For example:

Sparse matrix A
row, column, value
0,0,.42
0,1,.28
0,2,.89
1,0,.83
1,1,.34
1,2,.42
2,0,.23
3,0,.42
3,1,.98
3,2,.88
4,0,.23
4,1,.36
4,2,.97

Sparse vector b
row, column, value
0,2,.89
1,2,.42
3,2,.88
4,2,.97

Solve Ax = b???

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/mLIb-solving-linear-regression-with-sparse-inputs-tp28006p28028.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
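
Regarding the Java request at the top of this thread: the following is only a rough sketch, not a port of the Scala code posted in the thread. It uses plain Java with no Spark dependency and shows the grouping idea behind the combineByKey call, collecting "row,column,value" triplets into one pair of index/value arrays per row. The names SparseRowsSketch, SparseRow and groupByRow are hypothetical, made up for this illustration. It assumes the input has no header line.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch (no Spark): groups "row,column,value" triplets by row.
// All class and method names here are hypothetical, chosen for illustration.
class SparseRowsSketch {

    // Holder for one sparse row: parallel arrays of column indices and values.
    static final class SparseRow {
        final int[] indices;
        final double[] values;

        SparseRow(int[] indices, double[] values) {
            this.indices = indices;
            this.values = values;
        }
    }

    // Parses triplet lines and returns one SparseRow per row id.
    // A TreeMap per row keeps the column indices in ascending order.
    static Map<Integer, SparseRow> groupByRow(List<String> lines) {
        Map<Integer, TreeMap<Integer, Double>> byRow = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split(",");
            int row = Integer.parseInt(parts[0].trim());
            int col = Integer.parseInt(parts[1].trim());
            double value = Double.parseDouble(parts[2].trim());
            byRow.computeIfAbsent(row, r -> new TreeMap<>()).put(col, value);
        }

        Map<Integer, SparseRow> result = new TreeMap<>();
        for (Map.Entry<Integer, TreeMap<Integer, Double>> e : byRow.entrySet()) {
            TreeMap<Integer, Double> cols = e.getValue();
            int[] indices = new int[cols.size()];
            double[] values = new double[cols.size()];
            int i = 0;
            for (Map.Entry<Integer, Double> c : cols.entrySet()) {
                indices[i] = c.getKey();
                values[i] = c.getValue();
                i++;
            }
            result.put(e.getKey(), new SparseRow(indices, values));
        }
        return result;
    }

    public static void main(String[] args) {
        // A few triplets from the sparse matrix A in the original question.
        List<String> lines = Arrays.asList("0,0,.42", "0,1,.28", "0,2,.89", "2,0,.23");
        for (Map.Entry<Integer, SparseRow> e : groupByRow(lines).entrySet()) {
            System.out.println("row " + e.getKey()
                + " indices=" + Arrays.toString(e.getValue().indices)
                + " values=" + Arrays.toString(e.getValue().values));
        }
    }
}
```

In a Spark job, each row's indices and values arrays would presumably then be passed to Vectors.sparse(3, indices, values), as the Scala code does. Wiring that into a JavaPairRDD is left out here, since it depends on the Spark version in use.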