Re: Modifying start-cluster scripts to efficiently spawn multiple TMs

Greg Hogan Mon, 11 Jul 2016 09:16:39 -0700

I'd definitely be interested to hear any insight into what failed when
starting the taskmanagers with pdsh. Did the command fail, did it fall back to
standard ssh, or was there a parse error on the slaves file?


I'm wondering if we need to quote
  PDSH_SSH_ARGS_APPEND=$FLINK_SSH_OPTS
as
  PDSH_SSH_ARGS_APPEND="${FLINK_SSH_OPTS}"
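
One way to check whether the quoting matters is a small experiment like the sketch below. It assumes a POSIX-style shell such as bash, and the option string is a made-up example rather than the real contents of FLINK_SSH_OPTS. In such shells a plain assignment is not word-split, so the two forms should store the same string; splitting only shows up when the variable is later expanded unquoted as a command argument.

```shell
#!/usr/bin/env bash
# Hypothetical value for illustration; the real FLINK_SSH_OPTS comes from the Flink setup.
FLINK_SSH_OPTS="-o StrictHostKeyChecking=no -p 2222"

# Plain assignments are not word-split, so both forms store the same string.
unquoted=$FLINK_SSH_OPTS
quoted="${FLINK_SSH_OPTS}"

# Word splitting happens when an unquoted variable is expanded as a command argument.
count_args() { echo "$#"; }
split_count=$(count_args $FLINK_SSH_OPTS)     # expanded unquoted: split on whitespace
whole_count=$(count_args "$FLINK_SSH_OPTS")   # expanded quoted: passed as one word

if [ "$unquoted" = "$quoted" ]; then
  echo "assignment forms are identical"
fi
echo "unquoted argument count: $split_count"
echo "quoted argument count: $whole_count"
```

If the two assignment forms compare equal on the affected machine as well, the quoting is probably not the cause and the failure lies elsewhere, for example in how pdsh is found or invoked on the node running the script.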

On Mon, Jul 11, 2016 at 12:02 AM, Saliya Ekanayake <esal...@gmail.com>
wrote:

> pdsh is available on the head node only, but when I tried to run *start-cluster*
> from the head node (note that the JobManager node is not the head node) it didn't work,
> which is why I modified the scripts.
>
> Yes, exactly, this is what I was trying to do. My research has focused
> on these NUMA-related issues, and binding a process to a socket (CPU) and
> then its threads to individual cores has shown a great advantage. I actually
> have Java code that automatically (and user-configurably) binds
> processes and threads. For Flink, I've done this manually using a shell
> script that scans the TMs on a node and pins them appropriately. This approach
> is OK, but it would be better if the support were integrated into Flink.
>
> On Sun, Jul 10, 2016 at 8:33 PM, Greg Hogan <c...@greghogan.com> wrote:
>
>> Hi Saliya,
>>
>> Would you happen to have pdsh (parallel distributed shell) installed? If
>> so, the TaskManager startup in start-cluster.sh will run in parallel.
>>
>> As to running 24 TaskManagers together, are these running across multiple
>> NUMA nodes? I had filed FLINK-3163 (
>> https://issues.apache.org/jira/browse/FLINK-3163) last year as I have
>> seen that even with only two NUMA nodes performance is improved by binding
>> TaskManagers, both memory and CPU. I think we can improve configuration of
>> task slots as we do with memory, where the latter can be a fixed measure or
>> a fraction relative to total memory.
>>
>> Greg
>>
>> On Sat, Jul 9, 2016 at 3:44 AM, Saliya Ekanayake <esal...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> The current start/stop scripts SSH into a worker node once for each time it
>>> appears in the slaves file. When spawning multiple TMs (like 24 per node),
>>> this is very inefficient.
>>>
>>> I've changed the scripts to do one SSH per node and then spawn a given
>>> number N of TMs. I can make a pull request if this seems usable to
>>> others. For now, I assume the slaves file will indicate the number of TMs
>>> per slave in "IP N" format.
>>>
>>> Thank you,
>>> Saliya
>>>
>>> --
>>> Saliya Ekanayake
>>> Ph.D. Candidate | Research Assistant
>>> School of Informatics and Computing | Digital Science Center
>>> Indiana University, Bloomington
>>>
>>>
>>
>
>
> --
> Saliya Ekanayake
> Ph.D. Candidate | Research Assistant
> School of Informatics and Computing | Digital Science Center
> Indiana University, Bloomington
>
>
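
For reference, the "IP N" slaves-file idea from the original message could be sketched roughly as below. This is not the modified script from the thread, which was not posted; it is a minimal illustration that only prints the one-SSH-per-host commands it would run. The TM_START value, the build_commands helper, and the sample addresses are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch: turn "IP N" lines into one ssh invocation per host.
# TM_START is an assumed placeholder for the per-TaskManager start command.
TM_START="bin/taskmanager.sh start"

build_commands() {
  # Reads "host count" lines on stdin; skips blank lines and comments.
  while read -r host count _; do
    case "$host" in ''|'#'*) continue ;; esac
    count=${count:-1}   # a bare host line means one TaskManager
    # Print (do not run) a single ssh command that loops N times on the remote side.
    echo "ssh $host 'for i in \$(seq 1 $count); do $TM_START; done'"
  done
}

# Sample input standing in for the slaves file (hypothetical addresses).
commands=$(printf '%s\n' "10.0.0.1 24" "# comment" "" "10.0.0.2" | build_commands)
echo "$commands"
```

Printing the commands first makes it easy to inspect them before piping the same lines through ssh or pdsh. A host line without a count falls back to a single TaskManager, so slaves files in the existing one-host-per-line format would still work under this sketch.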
