IMHO, you'll need to implement a custom RDD with your own locality settings
(i.e. a custom implementation of discovering where each partition is located),
plus tune the spark.locality.wait setting.
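Roughly, a minimal sketch of that idea. HostAwarePartition, HostAwareRDD, and
readFromHost() are made-up names for illustration, not Spark API; the read
path is a stand-in you'd replace with your own:

import org.apache.spark.{Partition, SparkConf, SparkContext, TaskContext}
import org.apache.spark.rdd.RDD

// One partition per server; each partition remembers which host owns its data.
class HostAwarePartition(val index: Int, val host: String) extends Partition

class HostAwareRDD(sc: SparkContext, hosts: Seq[String])
    extends RDD[String](sc, Nil) {

  override protected def getPartitions: Array[Partition] =
    Array.tabulate[Partition](hosts.length)(i => new HostAwarePartition(i, hosts(i)))

  // The locality hook: Spark will try to schedule each task on this host.
  override protected def getPreferredLocations(split: Partition): Seq[String] =
    Seq(split.asInstanceOf[HostAwarePartition].host)

  override def compute(split: Partition, context: TaskContext): Iterator[String] = {
    val part = split.asInstanceOf[HostAwarePartition]
    readFromHost(part.host) // hypothetical: read this partition's local data
  }

  private def readFromHost(host: String): Iterator[String] =
    Iterator.empty // stand-in for your server's actual read path
}

// Make the scheduler wait longer for a node-local slot before it falls
// back to running the task on another node (the default is 3s).
val conf = new SparkConf().set("spark.locality.wait", "30s")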
On 24 August 2016 at 03:48, Mohit Jaggi wrote:
It is a bit hacky but possible. A lot depends on what kind of queries etc. you
want to run. You could write a data source that reads your data and keeps it
partitioned the way you want, then use mapPartitions() to execute your code…
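Something along these lines, as a minimal sketch of the mapPartitions()
pattern (with a toy parallelized RDD standing in for a real data source):

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(
  new SparkConf().setAppName("sketch").setMaster("local[2]"))

// Toy stand-in for a data source that keeps your partitioning intact.
val rdd = sc.parallelize(1 to 100, numSlices = 4)

// Each task receives the full iterator for one partition, so your
// existing per-server code can run once per partition against local data.
val perPartitionSums = rdd.mapPartitions { records =>
  Iterator(records.sum) // stand-in for your own computation
}
perPartitionSums.collect() // one result per partition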
Mohit Jaggi
Founder,
Data Orchard LLC
www.dataorchardllc.com
(apologies if this appears twice. I sent it 24 hours ago and it hasn't hit
the list yet)
Hi,
I have a bit of an unusual use-case and would *greatly* *appreciate* some
feedback from experienced Sparklers as to whether it is a good fit for Spark.
I have a network of compute/data servers configured as a tree, as shown below:
- controller
  - server 1
  - server 2
  - server 3
  - etc.
There are