[jira] [Updated] (SPARK-7194) Vectors factors method for sparse vectors should accept the output of zipWithIndex

2015-04-29 Thread Sean Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-7194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated SPARK-7194:
-
  Component/s: MLlib
 Priority: Minor  (was: Major)
Affects Version/s: 1.3.1

Go ahead and set the priority and component, and maybe the affects version,
for improvements.

You can write {{Vectors.dense(array).toSparse}} - that may be simpler still and
doesn't need a new method?

Or the existing code could also be a little simpler:
{{array.zipWithIndex.filter(_._1 != 0.0).map(_.swap)}}
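
A minimal sketch of the zipWithIndex variant suggested above, written in plain
Scala so it runs without Spark on the classpath. The object name SparseEntries
and the helper nonZeroEntries are hypothetical names used for illustration;
they are not part of Spark.

```scala
object SparseEntries {
  // Hypothetical helper (not a Spark API): returns the (index, value) pairs
  // of the nonzero elements, in the order a sparse-vector factory expects.
  def nonZeroEntries(array: Array[Double]): Seq[(Int, Double)] =
    array.zipWithIndex      // pairs of (value, index)
      .filter(_._1 != 0.0)  // keep only the nonzero values
      .map(_.swap)          // flip each pair to (index, value)
      .toSeq

  def main(args: Array[String]): Unit = {
    val array = Array(0.0, 0.0, 3.2, 0.0)
    val entries = nonZeroEntries(array)
    println(entries)
    // With spark-mllib available, the result could then be passed on as
    // Vectors.sparse(array.length, entries).
  }
}
```

The swap keeps the flipping step in one place, so callers only ever see
(index, value) pairs.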

 Vectors factors method for sparse vectors should accept the output of 
 zipWithIndex
 --

 Key: SPARK-7194
 URL: https://issues.apache.org/jira/browse/SPARK-7194
 Project: Spark
  Issue Type: Improvement
  Components: MLlib
Affects Versions: 1.3.1
Reporter: Juliet Hougland
Priority: Minor

 Let's say we have an RDD of Array[Double] where zero values are explicitly
 recorded, e.g. (0.0, 0.0, 3.2, 0.0...). If we want to transform this into an
 RDD of sparse vectors, we currently have to:
 arr_doubles.map { array =>
   val indexElem: Seq[(Int, Double)] = array.zipWithIndex.filter(tuple =>
     tuple._1 != 0.0).map(tuple => (tuple._2, tuple._1))
   Vectors.sparse(array.length, indexElem)
 }
 Notice that there is a map step at the end to switch the order of the index
 and the element value after .zipWithIndex. There should be a factory method
 on the Vectors class that allows you to avoid this flipping of tuple elements
 when using zipWithIndex.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-7194) Vectors factors method for sparse vectors should accept the output of zipWithIndex

2015-04-28 Thread Juliet Hougland (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-7194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juliet Hougland updated SPARK-7194:
---
Description: 
Let's say we have an RDD of Array[Double] where zero values are explicitly
recorded, e.g. (0.0, 0.0, 3.2, 0.0...). If we want to transform this into an
RDD of sparse vectors, we currently have to:

arr_doubles.map { array =>
  val indexElem: Seq[(Int, Double)] = array.zipWithIndex.filter(tuple =>
    tuple._1 != 0.0).map(tuple => (tuple._2, tuple._1))

  Vectors.sparse(array.length, indexElem)
}

Notice that there is a map step at the end to switch the order of the index and
the element value after .zipWithIndex. There should be a factory method on the
Vectors class that allows you to avoid this flipping of tuple elements when
using zipWithIndex.

  was:
Let's say we have an RDD of Array[Double] where zero values are explicitly
recorded, e.g. (0.0, 0.0, 3.2, 0.0...). If we want to transform this into an
RDD of sparse vectors, we currently have to:

arr_doubles.map { array =>
  val indexElem: Seq[(Int, Double)] = array.zipWithIndex.filter(tuple =>
    tuple._1 != 0.0).map(tuple => (tuple._2, tuple._1))
  Vectors.sparse(array.length, indexElem)
}

Notice that there is a map step at the end to switch the order of the index and
the element value after .zipWithIndex. There should be a factory method on the
Vectors class that allows you to avoid this flipping of tuple elements when
using zipWithIndex.


 Vectors factors method for sparse vectors should accept the output of 
 zipWithIndex
 --

 Key: SPARK-7194
 URL: https://issues.apache.org/jira/browse/SPARK-7194
 Project: Spark
  Issue Type: Improvement
Reporter: Juliet Hougland

 Let's say we have an RDD of Array[Double] where zero values are explicitly
 recorded, e.g. (0.0, 0.0, 3.2, 0.0...). If we want to transform this into an
 RDD of sparse vectors, we currently have to:
 arr_doubles.map { array =>
   val indexElem: Seq[(Int, Double)] = array.zipWithIndex.filter(tuple =>
     tuple._1 != 0.0).map(tuple => (tuple._2, tuple._1))
   Vectors.sparse(array.length, indexElem)
 }
 Notice that there is a map step at the end to switch the order of the index
 and the element value after .zipWithIndex. There should be a factory method
 on the Vectors class that allows you to avoid this flipping of tuple elements
 when using zipWithIndex.


