I believe it is a generalization of some classes inside GraphX, where there
was/is a need to keep data indexed for random access within each RDD
partition.
On Thu, Apr 16, 2015 at 5:00 PM, Evo Eftimov evo.efti...@isecc.com wrote:
Can somebody from Databricks shed more light on this Indexed RDD library?
https://github.com/amplab/spark-indexedrdd
It seems to come from the AMPLab, and most of the Databricks guys are from
there.
What is especially interesting is whether the Point Lookup (and the other
primitives) can work
the context of
graphx
From: Koert Kuipers [mailto:ko...@tresata.com]
Sent: Thursday, April 16, 2015 10:31 PM
To: Evo Eftimov
Cc: user@spark.apache.org
Subject: Re: AMP Lab Indexed RDD - question for Data Bricks AMP Labs
/ quality
library which can be used for general purpose RDDs not just inside the
context of graphx
Hi,
I'm trying to implement a custom RDD that essentially works as a
distributed hash table, i.e. the key space is split up into partitions and
within a partition, an element can be looked up efficiently by the key.
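
To make the idea concrete, here is a minimal sketch in plain Python (no Spark involved) of a hash-partitioned table with a per-partition index. The class name PartitionedTable and its methods are hypothetical and only illustrate the layout described above; returning a list of matches is an assumption made to loosely mirror a lookup-by-key call, not the behaviour of any particular library.

```python
class PartitionedTable:
    """Toy hash-partitioned key/value table (hypothetical, not a Spark API).

    The key space is split across a fixed number of partitions, and each
    partition keeps its own dict so a key can be found without scanning.
    """

    def __init__(self, pairs, num_partitions):
        self.num_partitions = num_partitions
        # One independent index (dict) per partition.
        self.partitions = [dict() for _ in range(num_partitions)]
        for key, value in pairs:
            self.partitions[self._partition_of(key)][key] = value

    def _partition_of(self, key):
        # Hash partitioning: the key alone determines its partition.
        return hash(key) % self.num_partitions

    def lookup(self, key):
        # Only the single partition that can hold the key is consulted.
        part = self.partitions[self._partition_of(key)]
        return [part[key]] if key in part else []


table = PartitionedTable([(1, "a"), (2, "b"), (5, "c")], num_partitions=4)
print(table.lookup(2))
print(table.lookup(3))
```

The point of the sketch is that a lookup touches exactly one partition and then uses that partition's index, instead of scanning every element.
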
However, the RDD lookup() function (in PairRDDFunctions) is implemented in
a way