How to do sparse vector product in Spark?
Hi,

I have two RDD[Vector]. Both Vectors are sparse and of the form (id, value), where id indicates the position of the value in the vector space. I want to apply a dot product to two such RDD[Vector] and get a scalar value. Non-existent values are treated as zero. Is there any convenient tool to do this in Spark?

Thanks,
David
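The dot product David describes can be sketched without any Spark machinery: index one vector's (id, value) pairs and sum the products over the ids the two vectors share. A minimal pure-Python sketch (function and variable names are illustrative, not from any Spark API):

```python
# Dot product of two sparse vectors represented as (id, value) pairs,
# where any id missing from a vector is treated as zero.

def sparse_dot(a, b):
    """Dot product of two sparse vectors given as iterables of (id, value)."""
    lookup = dict(a)  # index the first vector by position id
    # only ids present in both vectors contribute; everything else is zero
    return sum(value * lookup.get(idx, 0.0) for idx, value in b)

v1 = [(0, 1.0), (3, 2.0), (7, 4.0)]
v2 = [(3, 5.0), (7, 0.5), (9, 8.0)]
print(sparse_dot(v1, v2))  # 2.0*5.0 + 4.0*0.5 = 12.0
```

Over RDDs keyed by id, the same idea is typically expressed as a join followed by a multiply and a sum, something like `rdd1.join(rdd2).mapValues(lambda p: p[0] * p[1]).values().sum()` (a sketch, untested).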
RE: How to do sparse vector product in Spark?
> Any convenient tool to do this [sparse vector product] in Spark?

Unfortunately, it seems that there are very few operations defined for sparse vectors. I needed to add some, and ended up converting them to (dense) numpy vectors and doing the addition on those.

Best regards,
Ron

From: Xi Shen [mailto:davidshe...@gmail.com]
Sent: Friday, March 13, 2015 1:50 AM
To: user@spark.apache.org
Subject: How to do sparse vector product in Spark?
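Ron's workaround can be sketched as follows. Plain Python lists stand in for numpy arrays here, and `size` (the dimensionality of the vector space) is an assumed parameter; once the vectors are dense, both addition and dot product become simple elementwise operations.

```python
# Densify the sparse (id, value) pairs into an ordinary array of length `size`,
# then use dense elementwise operations on the result.

def to_dense(pairs, size):
    dense = [0.0] * size
    for idx, value in pairs:
        dense[idx] = value
    return dense

a = to_dense([(0, 1.0), (3, 2.0)], size=5)
b = to_dense([(3, 5.0), (4, 8.0)], size=5)

added = [x + y for x, y in zip(a, b)]       # elementwise addition
dotted = sum(x * y for x, y in zip(a, b))   # dot product
print(added)   # [1.0, 0.0, 0.0, 7.0, 8.0]
print(dotted)  # 2.0 * 5.0 = 10.0
```

The trade-off is memory: densifying a high-dimensional but mostly-zero vector materializes all the zeros, which is exactly what a sparse representation exists to avoid.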
Re: How to do sparse vector product in Spark?
In Java/Scala land, the intent is to use Breeze for this. Vector in Spark is an opaque wrapper around the Breeze representation, which contains a number of methods like this.

On Fri, Mar 13, 2015 at 3:28 PM, Daniel, Ronald (ELS-SDG) r.dan...@elsevier.com wrote:
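For reference, sparse-vector libraries typically implement dot as a merge over the two sorted index arrays, so neither vector has to be densified. The following is a sketch of that common strategy in plain Python (illustrative only, not Breeze's actual code):

```python
# Sparse-sparse dot over sorted index arrays: walk both index lists with two
# pointers, multiplying only at shared positions. Indices must be sorted
# ascending and each (idx_*, val_*) pair must have equal length.

def sparse_dot_sorted(idx_a, val_a, idx_b, val_b):
    i = j = 0
    total = 0.0
    while i < len(idx_a) and j < len(idx_b):
        if idx_a[i] == idx_b[j]:      # shared position: multiply and advance both
            total += val_a[i] * val_b[j]
            i += 1
            j += 1
        elif idx_a[i] < idx_b[j]:     # position only in a: contributes zero
            i += 1
        else:                         # position only in b: contributes zero
            j += 1
    return total

print(sparse_dot_sorted([0, 3, 7], [1.0, 2.0, 4.0],
                        [3, 7, 9], [5.0, 0.5, 8.0]))  # 12.0
```

This runs in time proportional to the number of stored entries rather than the full dimensionality, which is the point of keeping the vectors sparse.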