[
https://issues.apache.org/jira/browse/SPARK-7194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Juliet Hougland updated SPARK-7194:
---
Description:
Let's say we have an RDD of Array[Double] where zero values are explicitly
recorded, i.e. (0.0, 0.0, 3.2, 0.0...). If we want to transform this into an RDD
of sparse vectors, we currently have to:
arr_doubles.map { array =>
  val indexElem: Seq[(Int, Double)] = array.zipWithIndex
    .filter(tuple => tuple._1 != 0.0)
    .map(tuple => (tuple._2, tuple._1))
  Vectors.sparse(array.length, indexElem)
}
Notice the map step at the end, which swaps the index and the element value
after .zipWithIndex: zipWithIndex yields (value, index) pairs, while
Vectors.sparse expects (index, value) pairs. There should be a factory method on
the Vectors class that lets you avoid this flipping of tuple elements when
using zipWithIndex.
Vectors factory method for sparse vectors should accept the output of
zipWithIndex
--
Key: SPARK-7194
URL: https://issues.apache.org/jira/browse/SPARK-7194
Project: Spark
Issue Type: Improvement
Reporter: Juliet Hougland
Let's say we have an RDD of Array[Double] where zero values are explicitly
recorded, i.e. (0.0, 0.0, 3.2, 0.0...). If we want to transform this into an RDD
of sparse vectors, we currently have to:
arr_doubles.map { array =>
  val indexElem: Seq[(Int, Double)] = array.zipWithIndex
    .filter(tuple => tuple._1 != 0.0)
    .map(tuple => (tuple._2, tuple._1))
  Vectors.sparse(array.length, indexElem)
}
Notice the map step at the end, which swaps the index and the element value
after .zipWithIndex: zipWithIndex yields (value, index) pairs, while
Vectors.sparse expects (index, value) pairs. There should be a factory method
on the Vectors class that lets you avoid this flipping of tuple elements
when using zipWithIndex.
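
For illustration only, here is a rough sketch of what such a helper might do. It is written in plain Scala with no Spark dependency; the object name SparseFromZipped and the method name nonZeroEntries are hypothetical and not part of the Vectors API. The helper takes the (value, index) pairs produced by zipWithIndex and returns the (index, value) pairs that Vectors.sparse(size, elements) accepts.

```scala
object SparseFromZipped {
  // Hypothetical helper, not an existing Spark method.
  // Input:  (value, index) pairs, as produced by zipWithIndex.
  // Output: (index, value) pairs for the non-zero entries only.
  def nonZeroEntries(zipped: Seq[(Double, Int)]): Seq[(Int, Double)] =
    zipped.collect { case (value, index) if value != 0.0 => (index, value) }

  def main(args: Array[String]): Unit = {
    val dense = Array(0.0, 0.0, 3.2, 0.0)
    val entries = nonZeroEntries(dense.zipWithIndex.toSeq)
    entries.foreach { case (index, value) => println(s"$index -> $value") }
    // With spark-mllib on the classpath, the sparse vector could then be
    // built with the existing factory:
    //   Vectors.sparse(dense.length, entries)
  }
}
```

A factory on Vectors along these lines would fold the filter and the tuple swap into a single call, so callers could pass the output of zipWithIndex directly.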