Hi Ankur,
I have the following questions on IndexedRDD.
1. Does IndexedRDD support String keys? According to the current
documentation, it looks like it supports only Long.
2. Is IndexedRDD efficient when joined with another RDD? Basically, my
use case is that I need to create an [...]
The latest version of IndexedRDD supports any key type with a defined
serializer
https://github.com/amplab/spark-indexedrdd/blob/master/src/main/scala/edu/berkeley/cs/amplab/spark/indexedrdd/KeySerializer.scala,
including Strings. It's not released yet, but you can use it from the
master branch if [...]
[...] as key/value pairs.
Thanks,
Swetha
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/creating-a-distributed-index-tp11204p23842.html
Sent from the Apache Spark User List mailing list archive at Nabble.com
This is very interesting. Do you know if this version will be backwards
compatible with older versions of Spark (1.2.0)?
Thanks,
Jem
On Wed, Jul 15, 2015 at 10:04 AM Ankur Dave ankurd...@gmail.com wrote:
The latest version of IndexedRDD supports any key type with a defined
serializer [...]
[...] this in Spark
Streaming to do lookups/updates/deletes in RDDs using keys by storing them
as key/value pairs.
Thanks,
Swetha
After playing around with mapPartitions, I think it does exactly what I
want. I can pass in a function to mapPartitions that looks like this:
def f1(iter: Iterator[String]): Iterator[MyIndex] = {
  val idx: MyIndex = new MyIndex()
  while (iter.hasNext) {
    [...]
Hey,
Some work has started on IndexedRDD (on master, I think).
In the meantime, it could be worthwhile to look at how GraphX handles the
vertex index within partitions.
HTH
Andy
On 1 August 2014 at 22:50, Philip Ogren philip.og...@oracle.com wrote:
Suppose I want to take my large [...]
At 2014-08-01 14:50:22 -0600, Philip Ogren philip.og...@oracle.com wrote:
It seems that I could do this with mapPartitions so that each element in a
partition gets added to an index for that partition.
[...]
Would it then be possible to take a string and query each partition's index
with it?
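
The per-partition indexing idea quoted above can be sketched roughly as
follows. This is only an illustration, not code from the thread: MyIndex
here is a hypothetical stand-in (a simple word-to-lines inverted index), and
its add and lookup methods, as well as the query helper, are assumptions made
for the example. Plain Scala iterators stand in for partitions so the sketch
runs without Spark; f1 has the Iterator[String] => Iterator[MyIndex] shape
that RDD.mapPartitions accepts.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the thread's MyIndex: maps each word to the
// lines (within one partition) that contain it.
final class MyIndex extends Serializable {
  private val postings = mutable.Map.empty[String, Vector[String]]

  // Add one line to the index under every distinct word it contains.
  def add(line: String): Unit =
    line.split("\\s+").filter(_.nonEmpty).distinct.foreach { word =>
      postings(word) = postings.getOrElse(word, Vector.empty) :+ line
    }

  // Return the indexed lines containing the word (empty if none).
  def lookup(word: String): Seq[String] =
    postings.getOrElse(word, Vector.empty)
}

object PartitionIndexSketch {
  // Consumes all elements of one partition and emits a single index for it.
  def f1(iter: Iterator[String]): Iterator[MyIndex] = {
    val idx = new MyIndex()
    iter.foreach(idx.add)
    Iterator.single(idx)
  }

  // Local stand-in for querying every partition's index with one string.
  def query(indexes: Seq[MyIndex], word: String): Seq[String] =
    indexes.flatMap(_.lookup(word))
}
```

With Spark, and assuming lines is an RDD[String], the same function could be
used as lines.mapPartitions(PartitionIndexSketch.f1).cache() to keep one
index per partition, and a call such as
indexed.flatMap(_.lookup("spark")).collect() would then query each
partition's index. Whether that is fast enough depends on the index
structure and how often it has to be rebuilt.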