Re: Using spark to distribute jobs to standalone servers

2016-08-25 Thread Igor Berman
IMHO, you'll need to implement a custom RDD with your own locality settings (i.e. a custom implementation of discovering where each partition is located), plus a setting for spark.locality.wait. On 24 August 2016 at 03:48, Mohit Jaggi wrote: > It is a bit hacky but possible. A lot depends

Re: Using spark to distribute jobs to standalone servers

2016-08-23 Thread Mohit Jaggi
It is a bit hacky but possible. A lot depends on what kind of queries etc. you want to run. You could write a data source that reads your data and keeps it partitioned the way you want, then use mapPartitions() to execute your code… Mohit Jaggi Founder, Data Orchard LLC www.dataorchardllc.com

Retrying: Using spark to distribute jobs to standalone servers

2016-08-23 Thread Larry White
(apologies if this appears twice. I sent it 24 hours ago and it hasn't hit the list yet) Hi, I have a bit of an unusual use-case and would greatly appreciate some feedback from experienced Sparklers as to whether it is a good fit for spark. I have a network of compute/data servers configured as

Using spark to distribute jobs to standalone servers

2016-08-22 Thread Larry White
Hi, I have a bit of an unusual use-case and would *greatly* *appreciate* some feedback as to whether it is a good fit for spark. I have a network of compute/data servers configured as a tree as shown below - controller - server 1 - server 2 - server 3 - etc. There are