And in Java as well, please. Thanks again!
Iman

On Sun, Nov 6, 2016 at 8:28 AM Iman Mohtashemi <iman.mohtash...@gmail.com>
wrote:

Hi Robin,
It looks like the linear regression model takes a Dataset rather than a matrix?
It would be helpful for this example if you could set up the whole problem
end to end, using one of the columns of the matrix as b, so that A is a sparse
matrix and b is a sparse vector.
Best regards,
Iman
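
A minimal end-to-end sketch of what is being asked for here, using the Dataset-based API in a Spark 2.x spark-shell (where `spark` is predefined). The choice of column 2 as b, the names `labelCol` and `numFeatures`, and the inline copy of the sample data are assumptions made for illustration; entries that are absent from the sparse input are treated as 0.0.

```scala
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.regression.LinearRegression
import spark.implicits._

// Sample "row,column,value" triples for A, copied from the original question.
val lines = Seq(
  "0,0,.42", "0,1,.28", "0,2,.89",
  "1,0,.83", "1,1,.34", "1,2,.42",
  "2,0,.23",
  "3,0,.42", "3,1,.98", "3,2,.88",
  "4,0,.23", "4,1,.36", "4,2,.97")

val labelCol = 2     // assumption: this column of A plays the role of b
val numFeatures = 2  // remaining columns (0 and 1) are the features

// Key each entry by its row index: (row, (column, value)).
val triples = spark.sparkContext.parallelize(lines).
  map(_.split(",")).
  map(a => (a(0).toInt, (a(1).toInt, a(2).toDouble)))

// Build one (label, features) pair per row. Because labelCol is the last
// column, the feature indices need no shifting; a missing label means 0.0.
val data = triples.groupByKey().map { case (_, entries) =>
  val label = entries.collectFirst { case (c, v) if c == labelCol => v }.getOrElse(0.0)
  val features = Vectors.sparse(numFeatures, entries.filter(_._1 != labelCol).toSeq)
  (label, features)
}.toDF("label", "features")

// Fit an ordinary least-squares model; "label" and "features" are the default column names.
val model = new LinearRegression().setMaxIter(50).fit(data)
println(s"coefficients: ${model.coefficients}, intercept: ${model.intercept}")
```

Note that LinearRegression also fits an intercept by default, so this solves a regression problem rather than an exact Ax = b; `setFitIntercept(false)` would drop the intercept term.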

On Sun, Nov 6, 2016 at 6:43 AM <iman.mohtash...@gmail.com> wrote:

Thank you! Would you happen to have this code in Java?
This is extremely helpful!


Iman




On Sun, Nov 6, 2016 at 3:35 AM -0800, "Robineast [via Apache Spark User
List]" <ml-node+s1001560n28027...@n3.nabble.com> wrote:

Here’s a way of creating sparse vectors in MLlib:

import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.rdd.RDD

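// Parse each "row,column,value" line of A.txt into a triple.
val rdd = sc.textFile("A.txt").map(line => line.split(",")).
  map(ary => (ary(0).toInt, ary(1).toInt, ary(2).toDouble))

// Key each entry by its row index.
val pairRdd: RDD[(Int, (Int, Int, Double))] = rdd.map(el => (el._1, el))

// combineByKey functions: collect the column indices and values of each row.
val create = (first: (Int, Int, Double)) =>
  (Array(first._2), Array(first._3))
val combine = (head: (Array[Int], Array[Double]), tail: (Int, Int, Double)) =>
  (head._1 :+ tail._2, head._2 :+ tail._3)
val merge = (a: (Array[Int], Array[Double]), b: (Array[Int], Array[Double])) =>
  (a._1 ++ b._1, a._2 ++ b._2)

// One sparse vector per row; 3 is the number of columns in A. The Seq overload
// of Vectors.sparse sorts entries by index, an order combineByKey does not guarantee.
val A = pairRdd.combineByKey(create, combine, merge).map(el =>
  Vectors.sparse(3, el._2._1.zip(el._2._2).toSeq))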

If you have a separate file of b’s then you would need to manipulate this
slightly to join the b’s to the A RDD and then create LabeledPoints. I
guess there is a way of doing this using the newer ML interfaces but it’s
not particularly obvious to me how.
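
The join described above might look roughly like the following sketch, written for spark-shell (where `sc` is predefined). The helper `triples`, the name `numCols`, and the inline copies of the sample data are assumptions made for illustration; with real input the two sequences would come from `sc.textFile`. Rows that have no entry in b are given the label 0.0, which is the usual reading of a sparse vector.

```scala
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Sample "row,column,value" triples for A and b, copied from the original question.
val aLines = Seq(
  "0,0,.42", "0,1,.28", "0,2,.89",
  "1,0,.83", "1,1,.34", "1,2,.42",
  "2,0,.23",
  "3,0,.42", "3,1,.98", "3,2,.88",
  "4,0,.23", "4,1,.36", "4,2,.97")
val bLines = Seq("0,2,.89", "1,2,.42", "3,2,.88", "4,2,.97")

val numCols = 3 // assumed number of columns in A

// Hypothetical helper: parse lines into (row, (column, value)) pairs.
def triples(lines: Seq[String]): RDD[(Int, (Int, Double))] =
  sc.parallelize(lines).map(_.split(",")).
    map(a => (a(0).toInt, (a(1).toInt, a(2).toDouble)))

// One sparse feature vector per row of A, keyed by row index.
val rows: RDD[(Int, Vector)] =
  triples(aLines).groupByKey().mapValues(es => Vectors.sparse(numCols, es.toSeq))

// One label per row of b; the column index stored in b is ignored.
val labels: RDD[(Int, Double)] = triples(bLines).mapValues(_._2)

// Left outer join keeps every row of A; a row missing from b gets label 0.0.
val points: RDD[LabeledPoint] =
  rows.leftOuterJoin(labels).map { case (_, (features, label)) =>
    LabeledPoint(label.getOrElse(0.0), features)
  }

points.collect().foreach(println)
```

The resulting `RDD[LabeledPoint]` is the input type expected by the RDD-based regression classes in `org.apache.spark.mllib.regression`.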

One point: in the example you give, the b’s are exactly the same as column 2
of the A matrix. I presume this is just a quickly hacked-together example,
because that would give a trivial result.

-------------------------------------------------------------------------------
Robin East
*Spark GraphX in Action* Michael Malak and Robin East
Manning Publications Co.
http://www.manning.com/books/spark-graphx-in-action





On 3 Nov 2016, at 18:12, im281 [via Apache Spark User List] wrote:

I would like to use it, but how do I do the following?
1) Read sparse data (from text or a database)
2) Pass the sparse data to the LinearRegression class

For example:

Sparse matrix A
row, column, value
0,0,.42
0,1,.28
0,2,.89
1,0,.83
1,1,.34
1,2,.42
2,0,.23
3,0,.42
3,1,.98
3,2,.88
4,0,.23
4,1,.36
4,2,.97

Sparse vector b
row, column, value
0,2,.89
1,2,.42
3,2,.88
4,2,.97

Solve Ax = b???











--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/mLIb-solving-linear-regression-with-sparse-inputs-tp28006p28030.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
