I have to clarify something…
In SparkSQL, we can query both existing RDDs, which are immutable, and
Hive/HBase/MapR-DB tables, which are mutable.
So we have to keep this in mind when we talk about secondary indexing.
(It’s not just RDDs.)
I think the only advantage to being immutable
I’m not sure where to post this, since it’s a bit of a philosophical question in
terms of design and vision for Spark.
If we look at SparkSQL and performance… where does secondary indexing fit in?
The reason this is a bit awkward is that if you view Spark as querying RDDs
which are temporary
On Tue, Dec 15, 2015 at 12:28 AM, Michael Segel <msegel_had...@hotmail.com>
wrote:
> Hi,
>
> This may be a silly question… couldn’t find the answer on my own…
>
> I’m trying to find out if anyone has implemented secondary indexing on
> Spark’s RDDs.
>
> If anyone cou