I think what Su has experienced is true in the case of Node Manager failure.
The same issue existed in old Hadoop (TaskTracker failures); this [1] paper
discusses its effects. I believe the same behavior applies to Node Manager
failures in YARN as well; that is what I found some time back (about a year
ago) by going through the YARN code. But I am not sure whether it is still
true now.

Thanks
Milinda

[1] http://www.cs.rice.edu/~fd2/pdf/hpdc106-dinu.pdf

On Sat, Jan 3, 2015 at 11:24 PM, Yi Su <[email protected]> wrote:

> Hi Fang,
>
> I have verified the failure detection issue. Recovery takes about 10 minutes
> if I kill the Node Manager process first. I will describe the experiment in
> detail, in case I have made any mistakes.
>
> The node arrangement is the same as before.
>
> Workload :
>         Every second, a Python program generates a record containing the
> current system time and sends it 10 times to the Kafka topic
> "suyi-test-input".
>
> Stream Task :
>         It reads tuples from the input stream and sends them to the output
> stream "suyi-test-output". Checkpointing is disabled.
>
> YARN Configuration :
>         <property>
>                 <description>How long to wait until a node manager is
> considered dead.</description>
>                 <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
>                 <value>600000</value>
>         </property>
>         <property>
>                 <description>How often to check that node managers are
> still alive.</description>
>                 <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
>                 <value>1000</value>
>         </property>
>
> Experiment Process :
>         (1) I submitted the stream task; the application master ran on node
> "a", and the other container ran on node "b".
>         (2) I killed the node manager process on node "b".
>         (3) At about 09:59:07, I killed the container process on node "b".
>
> Results :
>         (1) At 10:09:33 the application master tried to redeploy the lost
> container, according to application master logs.
>         (2) The output from 09:59:35 to 10:09:53 is lost.
>
> An excerpt from the application master logs:
>         2015-01-04 09:47:20 ContainerManagementProtocolProxy [INFO]
> Opening proxy : b:35889
>         2015-01-04 09:47:20 SamzaAppMasterTaskManager [INFO] Claimed task
> ID 0 for container container_1420335573203_0001_01_000002 on node b (
> http://b:8042/node/containerlogs/container_1420335573203_0001_01_000002).
>         2015-01-04 09:47:20 SamzaAppMasterTaskManager [INFO] Started task
> ID 0
>         2015-01-04 09:57:19 ClientUtils$ [INFO] Fetching metadata from
> broker id:0,host:192.168.3.141,port:9092 with correlation id 41 for 1
> topic(s) Set(metrics)
>         2015-01-04 09:57:19 SyncProducer [INFO] Connected to
> 192.168.3.141:9092 for producing
>         2015-01-04 09:57:19 SyncProducer [INFO] Disconnecting from
> 192.168.3.141:9092
>         2015-01-04 09:57:19 SyncProducer [INFO] Disconnecting from a:9092
>         2015-01-04 09:57:19 SyncProducer [INFO] Connected to a:9092 for
> producing
>         2015-01-04 10:07:19 ClientUtils$ [INFO] Fetching metadata from
> broker id:0,host:192.168.3.141,port:9092 with correlation id 82 for 1
> topic(s) Set(metrics)
>         2015-01-04 10:07:19 SyncProducer [INFO] Connected to
> 192.168.3.141:9092 for producing
>         2015-01-04 10:07:19 SyncProducer [INFO] Disconnecting from
> 192.168.3.141:9092
>         2015-01-04 10:07:19 SyncProducer [INFO] Disconnecting from a:9092
>         2015-01-04 10:07:19 SyncProducer [INFO] Connected to a:9092 for
> producing
>         2015-01-04 10:09:33 SamzaAppMasterTaskManager [INFO] Got an exit
> code of -100. This means that container container_1420335573203_0001_01_000002
> was killed by YARN, either due to being released by the application master
> or being 'lost' due to node failures etc.
>         2015-01-04 10:09:33 SamzaAppMasterTaskManager [INFO] Released
> container container_1420335573203_0001_01_000002 was assigned task ID 0.
> Requesting a new container for the task.
>         2015-01-04 10:09:33 SamzaAppMasterTaskManager [INFO] Requesting 1
> container(s) with 1024mb of memory
>
> An excerpt from the output, with my comments added:
>         {"time":"1420336774.0"}
>         {"time":"1420336774.0"}
>         {"time":"1420336774.0"}
>         {"time":"1420336775.0"}
>         {"time":"1420336775.0"}
>         {"time":"1420336775.0"}
>         {"time":"1420336775.0"}
>         {"time":"1420336775.0"}
>         {"time":"1420336775.0"}
>         {"time":"1420336775.0"}
>         {"time":"1420336775.0"}
>         {"time":"1420336775.0"}
>         {"time":"1420336775.0"} \\ 09:59:35
>         {"time":"1420337393.0"} \\ 10:09:53
>         {"time":"1420337393.0"}
>         {"time":"1420337393.0"}
>         {"time":"1420337393.0"}
>         {"time":"1420337393.0"}
>         {"time":"1420337393.0"}
>         {"time":"1420337393.0"}
>         {"time":"1420337393.0"}
>         {"time":"1420337393.0"}
>         {"time":"1420337393.0"}
>         {"time":"1420337394.0"}
>         {"time":"1420337394.0"}
>         {"time":"1420337394.0"}
>         {"time":"1420337394.0"}
>         {"time":"1420337394.0"}
>
> For the task redeployment issue, I worry that if the Resource Manager is
> busy or there are no containers available in the system, redeployment of the
> failed task might be delayed.
>
> Thank you for your help.
>
> Su Yi
>
>
> On Sat, 03 Jan 2015 06:04:20 +0800, Yan Fang <[email protected]> wrote:
>
>  Hi Su Yi,
>>
>> I think there may be a misunderstanding. For failure detection, if the
>> containers die (because of NM failure or any other reason), the AM will
>> bring up new containers on the same NM or a different NM, depending on
>> resource availability. It does not take as long as 10 minutes to recover.
>> One way you can test this is to run a Samza job and manually kill the NM or
>> the process to see how quickly it recovers. As for how
>> yarn.nm.liveness-monitor.expiry-interval-ms
>> plays a role here, I am not very sure. I hope a YARN expert in the community
>> can explain it a little.
>>
>> The goal of the standby container in SAMZA-406 is to recover quickly when
>> the task has a lot of local state and reading the changelog therefore takes
>> a long time, not to reduce the time spent *allocating* the container,
>> which, I believe, is taken care of by YARN.
>>
>> Hope this helps a little. Thanks.
>>
>> Cheers,
>>
>> Fang, Yan
>> [email protected]
>> +1 (206) 849-4108
>>
>> On Thu, Jan 1, 2015 at 4:20 AM, Su Yi <[email protected]> wrote:
>>
>>  Hi Timothy,
>>>
>>> There are 4 nodes in total : a,b,c,d
>>> Resource manager : a
>>> Node manager : a,b,c,d
>>> Kafka and zookeeper running on : a
>>>
>>> YARN configuration is :
>>>
>>> <property>
>>>     <description>How long to wait until a node manager is considered
>>> dead.</description>
>>>     <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
>>>     <value>1000</value>
>>> </property>
>>>
>>> <property>
>>>     <description>How often to check that node managers are still
>>> alive.</description>
>>>     <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
>>>     <value>100</value>
>>> </property>
>>>
>>> From the Samza web UI, I found that node 'a' repeatedly appeared and
>>> disappeared in the node list.
>>>
>>> Su Yi
>>>
>>> On 2015-01-01 02:54:48,"Timothy Chen" <[email protected]> wrote:
>>>
>>> >Hi Su Yi,
>>> >
>>> >Can you elaborate a bit more what you mean by unstable cluster when
>>> >you configured the heartbeat interval to be 1s?
>>> >
>>> >Tim
>>> >
>>> >On Wed, Dec 31, 2014 at 10:30 AM, Su Yi <[email protected]> wrote:
>>> >> Hello,
>>> >>
>>> >> Here are some thoughts about HA of Samza.
>>> >>
>>> >> 1. Failure detection
>>> >>
>>> >> The problem is that failure detection of containers in Samza depends
>>> entirely on YARN. YARN relies on the Node Manager to report container
>>> failures, but the Node Manager can fail too (for example, if the machine
>>> fails, the NM fails with it). Node Manager failures can be detected by the
>>> Resource Manager through heartbeats, but by default it takes 10 minutes to
>>> confirm a Node Manager failure. I think that is acceptable for batch
>>> processing, but not for stream processing.
>>> >>
>>> >> Configuring the YARN failure-confirmation interval to 1 s results in an
>>> unstable YARN cluster (4 nodes in total). With 2 s everything works fine,
>>> but it takes 10-20 s to get a lost container (machine shut down) back.
>>> Considering that the test stream task is very simple (stateless), the
>>> recovery time is relatively long.
>>> >>
>>> >> I am not an expert on YARN, so I do not know why it takes such a long
>>> time to confirm a node failure by default. To my understanding, YARN tries
>>> to be general-purpose, and that is not sufficient for a stream processing
>>> framework. Extra effort beyond YARN is needed for failure detection in
>>> stream processing.
>>> >>
>>> >> 2. Task redeployment
>>> >>
>>> >> After the Resource Manager informs Samza of a container failure, Samza
>>> has to request resources from YARN to redeploy the failed tasks, which
>>> consumes time during recovery. And recovery time is critical for HA in
>>> stream processing. I think maintaining a few standby containers could
>>> eliminate this overhead: Samza could deploy failed tasks on the standby
>>> containers rather than requesting new ones from YARN.
>>> >>
>>> >> Hot standby containers, described in SAMZA-406 (
>>> https://issues.apache.org/jira/browse/SAMZA-406), may help save recovery
>>> time; however, this approach is costly (it doubles the resources needed).
>>> >>
>>> >> I'm wondering what these ideas mean to you, and how feasible they are.
>>> By the way, I'm using Samza 0.7.
>>> >>
>>> >> Thank you for reading.
>>> >>
>>> >> Happy New Year!;-)
>>> >>
>>> >> Su Yi
>>>
>>
>


-- 
Milinda Pathirage

PhD Student | Research Assistant
School of Informatics and Computing | Data to Insight Center
Indiana University

twitter: milindalakmal
skype: milinda.pathirage
blog: http://milinda.pathirage.org
