Hmm, interesting. I'm using standalone mode, but I could consider YARN. I'll have to simmer on that one. Thanks as always, Sean!
On Wed, Sep 17, 2014 at 12:40 AM, Sean Owen <so...@cloudera.com> wrote:
> I thought I answered this ... you can easily accomplish this with YARN
> by just telling YARN how much memory / CPU each machine has. This can
> be configured in groups too rather than per machine. I don't think you
> actually want differently-sized executors, and so don't need ratios.
> But you can have differently-sized containers which can fit different
> numbers of executors as appropriate.
>
> On Wed, Sep 17, 2014 at 8:35 AM, Victor Tso-Guillen <v...@paxata.com> wrote:
> > I'm supposing that there's no good solution to having heterogeneous
> > hardware in a cluster. What are the prospects of having something like
> > this in the future? Am I missing an architectural detail that precludes
> > this possibility?
> >
> > Thanks,
> > Victor
> >
> > On Fri, Sep 12, 2014 at 12:10 PM, Victor Tso-Guillen <v...@paxata.com> wrote:
> >>
> >> Ping...
> >>
> >> On Thu, Sep 11, 2014 at 5:44 PM, Victor Tso-Guillen <v...@paxata.com> wrote:
> >>>
> >>> So I have a bunch of hardware with different core and memory setups.
> >>> Is there a way to do one of the following:
> >>>
> >>> 1. Express a ratio of cores to memory to retain. The Spark worker
> >>> config would represent all of the cores and all of the memory usable
> >>> for any application, and the application would take a fraction that
> >>> sustains the ratio. Say I have 4 cores and 20G of RAM. I'd like the
> >>> worker to take 4/20 and the executor to take 5G for each of the 4
> >>> cores, thus maxing both out. If there were only 16G with the same
> >>> ratio requirement, it would only take 3 cores and 12G in a single
> >>> executor and leave the rest.
> >>>
> >>> 2. Have the executor take whole-number multiples of what it needs.
> >>> Say it is configured for 2/8G and the worker has 4/20. So we can give
> >>> the executor 2/8G (which is true now) or we can instead give it
> >>> 4/16G, maxing out one of the two parameters.
> >>>
> >>> Either way would allow me to get my heterogeneous hardware all
> >>> participating in the work of my Spark cluster, presumably without
> >>> endangering Spark's assumption of homogeneous execution environments
> >>> in the dimensions of memory and cores. If there's any way to do this,
> >>> please enlighten me.
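
[Editor's note: for reference, the per-node resource advertisement Sean describes maps to the standard YARN NodeManager properties in yarn-site.xml. The values below are illustrative, sketching a 4-core / 20 GB node; heterogeneous nodes simply carry different values.]

  <!-- yarn-site.xml on each NodeManager. Values are illustrative for a
       4-core / 20 GB node; smaller nodes advertise smaller numbers. -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>20480</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>4</value>
  </property>

  # The application then requests one fixed executor size, and YARN packs
  # as many containers onto each node as its advertised resources allow.
  spark-submit --master yarn-cluster \
    --executor-cores 1 \
    --executor-memory 4g \
    ...

With 1-core / 4g executors (plus YARN's per-container memory overhead), a 4-core / 20 GB node fits roughly four executors while a 4-core / 16 GB node fits about three, which achieves the proportional packing Victor describes without requiring differently sized executors.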