[ https://issues.apache.org/jira/browse/SPARK-7194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-7194:
-----------------------------------

    Assignee: Apache Spark

> Vectors factors method for sparse vectors should accept the output of zipWithIndex
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-7194
>                 URL: https://issues.apache.org/jira/browse/SPARK-7194
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: Juliet Hougland
>            Assignee: Apache Spark
>
> Let's say we have an RDD of Array[Double] where zero values are explicitly
> recorded, i.e. (0.0, 0.0, 3.2, 0.0...). If we want to transform this into an
> RDD of sparse vectors, we currently have to:
>
> arr_doubles.map { array =>
>   // zipWithIndex yields (value, index); Vectors.sparse expects (index, value)
>   val indexElem: Seq[(Int, Double)] = array.zipWithIndex
>     .filter(tuple => tuple._1 != 0.0)
>     .map(tuple => (tuple._2, tuple._1))
>   Vectors.sparse(array.length, indexElem)
> }
>
> Notice that there is a map step at the end to switch the order of the index
> and the element value after .zipWithIndex. There should be a factory method
> on the Vectors class that allows you to avoid this flipping of tuple elements
> when using zipWithIndex.
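
For illustration only, here is a minimal sketch of what such a factory method could look like. It is written as a standalone helper, not as the method this issue would eventually add to Vectors. The object name SparseFromZip and the method name sparseFromZipWithIndex are hypothetical, and the sketch assumes spark-mllib is on the classpath. It relies only on the existing Vectors.sparse(size, Seq[(Int, Double)]) overload.

```scala
import org.apache.spark.mllib.linalg.{Vector, Vectors}

// Hypothetical helper; not part of the Spark API.
object SparseFromZip {
  // Accepts (value, index) pairs in the order produced by zipWithIndex,
  // drops explicit zeros, and swaps each pair into the (index, value)
  // order that Vectors.sparse expects.
  def sparseFromZipWithIndex(size: Int, elements: Seq[(Double, Int)]): Vector = {
    val nonZero: Seq[(Int, Double)] = elements.collect {
      case (value, index) if value != 0.0 => (index, value)
    }
    Vectors.sparse(size, nonZero)
  }
}
```

With a helper like this, the transformation from the description would reduce to arr_doubles.map(array => SparseFromZip.sparseFromZipWithIndex(array.length, array.zipWithIndex.toSeq)), with the filtering and tuple flipping done in one place.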