Hmm, interesting. I'm using standalone mode but I could consider YARN. I'll
have to simmer on that one. Thanks as always, Sean!

On Wed, Sep 17, 2014 at 12:40 AM, Sean Owen <so...@cloudera.com> wrote:

> I thought I answered this ... you can easily accomplish this with YARN
> by just telling YARN how much memory / CPU each machine has. This can
> be configured in groups too rather than per machine. I don't think you
> actually want differently-sized executors, and so don't need ratios.
> But you can have differently-sized containers which can fit different
> numbers of executors as appropriate.
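>
> For concreteness, a minimal sketch of the per-node settings involved
> (these are real yarn-site.xml properties; the values are just an
> assumption for a 4-core / 20G machine):
>
>   <!-- yarn-site.xml on that machine's NodeManager -->
>   <property>
>     <name>yarn.nodemanager.resource.memory-mb</name>
>     <value>20480</value>   <!-- RAM this node offers to containers -->
>   </property>
>   <property>
>     <name>yarn.nodemanager.resource.cpu-vcores</name>
>     <value>4</value>       <!-- vcores this node offers -->
>   </property>
>
> A smaller box just declares smaller values, and YARN packs however many
> executor containers fit onto each node.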
>
> On Wed, Sep 17, 2014 at 8:35 AM, Victor Tso-Guillen <v...@paxata.com>
> wrote:
> > I'm supposing that there's no good solution to having heterogeneous
> > hardware in a cluster. What are the prospects of having something like
> > this in the future? Am I missing an architectural detail that precludes
> > this possibility?
> >
> > Thanks,
> > Victor
> >
> > On Fri, Sep 12, 2014 at 12:10 PM, Victor Tso-Guillen <v...@paxata.com>
> > wrote:
> >>
> >> Ping...
> >>
> >> On Thu, Sep 11, 2014 at 5:44 PM, Victor Tso-Guillen <v...@paxata.com>
> >> wrote:
> >>>
> >>> So I have a bunch of hardware with different core and memory setups. Is
> >>> there a way to do one of the following:
> >>>
> >>> 1. Express a ratio of cores to memory to retain. The Spark worker
> >>> config would represent all of the cores and all of the memory usable
> >>> for any application, and the application would take a fraction that
> >>> sustains the ratio. Say I have 4 cores and 20G of RAM. I'd like the
> >>> worker to advertise 4 cores/20G and the executor to take 5G for each
> >>> of the 4 cores, thus maxing both out. If there were only 16G with the
> >>> same ratio requirement, it would take only 3 cores and 12G in a single
> >>> executor and leave the rest.
> >>>
> >>> 2. Have the executor take whole-number multiples of what it needs. Say
> >>> it is configured for 2 cores/8G and the worker has 4 cores/20G. We
> >>> could give the executor 2/8G (which is what happens today), or we
> >>> could instead give it 4/16G, maxing out one of the two parameters.
> >>>
> >>> Either way would allow me to get my heterogeneous hardware all
> >>> participating in the work of my Spark cluster, presumably without
> >>> endangering Spark's assumption of homogeneous execution environments
> >>> in the dimensions of memory and cores. If there's any way to do this,
> >>> please enlighten me.
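> >>>
> >>> (For what it's worth, the per-machine half of this is already
> >>> expressible in standalone mode; each worker can declare its own
> >>> capacity in conf/spark-env.sh. A minimal sketch with my 4-core / 20G
> >>> example:
> >>>
> >>>   # conf/spark-env.sh on that worker
> >>>   SPARK_WORKER_CORES=4
> >>>   SPARK_WORKER_MEMORY=20g
> >>>
> >>> What's missing is the application side: spark.executor.memory is one
> >>> cluster-wide size, which is exactly the one-size-fits-all assumption
> >>> I'm trying to get around.)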
> >>
> >>
> >
>
