[ 
https://issues.apache.org/jira/browse/APEXCORE-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15693077#comment-15693077
 ] 

Tushar Gosavi commented on APEXCORE-575:
----------------------------------------

I have implemented a new storage agent that wraps two storage agents: one 
for the old checkpoint directory and one for the new checkpoint directory. 
With this storage agent, reads during the initial start are directed to the 
old directory, while writes go to the new checkpoint directory. This avoids 
copying checkpoints from the old directory to the new one. With 2 GB of state, 
application restart time was brought down from 2 minutes to a few seconds.
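
To illustrate the idea, here is a minimal sketch of an agent that reads from 
the old location when a checkpoint is not yet in the new one, and writes only 
to the new one. It is not the actual patch: CheckpointStore, InMemoryStore and 
TwoDirectoryStore are hypothetical names, the interface is a simplified 
stand-in rather than the real Apex storage agent API, and treating the old 
directory as read-only on delete is an assumption.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.LongStream;

/** Hypothetical, simplified checkpoint-store interface (not the actual Apex API). */
interface CheckpointStore {
    void save(Object state, int operatorId, long windowId) throws IOException;
    Object load(int operatorId, long windowId) throws IOException;
    void delete(int operatorId, long windowId) throws IOException;
    long[] getWindowIds(int operatorId) throws IOException;
}

/** In-memory stand-in for a checkpoint directory, so the sketch runs without HDFS. */
class InMemoryStore implements CheckpointStore {
    private final Map<Integer, TreeMap<Long, Object>> data = new HashMap<>();

    public void save(Object state, int operatorId, long windowId) {
        data.computeIfAbsent(operatorId, k -> new TreeMap<>()).put(windowId, state);
    }

    public Object load(int operatorId, long windowId) throws IOException {
        TreeMap<Long, Object> windows = data.get(operatorId);
        if (windows == null || !windows.containsKey(windowId)) {
            throw new IOException("No checkpoint for operator " + operatorId
                + ", window " + windowId);
        }
        return windows.get(windowId);
    }

    public void delete(int operatorId, long windowId) {
        TreeMap<Long, Object> windows = data.get(operatorId);
        if (windows != null) {
            windows.remove(windowId);
        }
    }

    public long[] getWindowIds(int operatorId) {
        TreeMap<Long, Object> windows = data.get(operatorId);
        if (windows == null) {
            return new long[0];
        }
        return windows.keySet().stream().mapToLong(Long::longValue).toArray();
    }
}

/** Reads fall back to the old store; writes and deletes only touch the new store. */
class TwoDirectoryStore implements CheckpointStore {
    private final CheckpointStore oldStore;
    private final CheckpointStore newStore;

    TwoDirectoryStore(CheckpointStore oldStore, CheckpointStore newStore) {
        this.oldStore = oldStore;
        this.newStore = newStore;
    }

    public void save(Object state, int operatorId, long windowId) throws IOException {
        newStore.save(state, operatorId, windowId);
    }

    public Object load(int operatorId, long windowId) throws IOException {
        // Prefer the new store; otherwise read the checkpoint from the old one.
        boolean inNew = LongStream.of(newStore.getWindowIds(operatorId))
            .anyMatch(id -> id == windowId);
        return inNew ? newStore.load(operatorId, windowId)
                     : oldStore.load(operatorId, windowId);
    }

    public void delete(int operatorId, long windowId) throws IOException {
        // Assumption: the old directory is treated as read-only and left untouched.
        newStore.delete(operatorId, windowId);
    }

    public long[] getWindowIds(int operatorId) throws IOException {
        // Sorted union of the window IDs known to either store.
        return LongStream.concat(LongStream.of(oldStore.getWindowIds(operatorId)),
                                 LongStream.of(newStore.getWindowIds(operatorId)))
            .sorted().distinct().toArray();
    }
}

class TwoDirectoryStoreDemo {
    public static void main(String[] args) throws IOException {
        InMemoryStore oldDir = new InMemoryStore();
        InMemoryStore newDir = new InMemoryStore();
        oldDir.save("old-state", 1, 10L);          // checkpoint left by the old app

        CheckpointStore agent = new TwoDirectoryStore(oldDir, newDir);
        System.out.println(agent.load(1, 10L));    // served from the old store
        agent.save("new-state", 1, 20L);           // lands only in the new store
        System.out.println(Arrays.toString(agent.getWindowIds(1)));
    }
}
```

Because nothing is copied up front, the relaunch cost in this sketch does not 
depend on how much state sits in the old location.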

Other changes:
- Do not copy the stats and events directories, as they are overwritten anyway 
in the new application.
- Use the new storage agent to avoid copying the checkpoints directory.
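
The selective copy described in the list above could look roughly like the 
following sketch. It uses java.nio.file on a local file system purely for 
illustration (the application directories here live on HDFS), and SelectiveCopy 
and copyExcept are hypothetical names, not part of the patch.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Set;

class SelectiveCopy {
    /**
     * Copies the tree under source into target, skipping any top-level
     * directory whose name is in skip (for example checkpoints, stats, events).
     */
    static void copyExcept(Path source, Path target, Set<String> skip) throws IOException {
        Files.walkFileTree(source, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
                    throws IOException {
                Path relative = source.relativize(dir);
                // Only directories directly under the application directory are filtered.
                if (relative.getNameCount() == 1 && skip.contains(relative.toString())) {
                    return FileVisitResult.SKIP_SUBTREE;
                }
                Files.createDirectories(target.resolve(relative));
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.copy(file, target.resolve(source.relativize(file)));
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

Skipping a directory in preVisitDirectory means its contents are never 
listed or read, so the cost of the copy depends only on what remains.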

{code}
16/11/24 03:51:48 INFO stram.FSRecoveryHandler: Creating hdfs://node18.morado.com:8020/user/tushar/datatorrent/apps/application_1479889815831_0086/recovery/log
16/11/24 03:51:48 INFO stram.StramClient: Copying initial state took 1191 ms
16/11/24 03:51:48 INFO stram.StramClient: Set the environment for the application master
16/11/24 03:51:48 INFO stram.StramClient: Setting up app master command
{code}

Size of the old application's state after the app had been running for 10 minutes:
{code}
2.0 G    5.9 G    datatorrent/apps/application_1479889815831_0081/checkpoints
70.6 K   211.7 K  datatorrent/apps/application_1479889815831_0081/events
133.8 M  401.4 M  datatorrent/apps/application_1479889815831_0081/stats
{code}

> Improve application relaunch time.
> ----------------------------------
>
>                 Key: APEXCORE-575
>                 URL: https://issues.apache.org/jira/browse/APEXCORE-575
>             Project: Apache Apex Core
>          Issue Type: Improvement
>            Reporter: Tushar Gosavi
>            Assignee: Tushar Gosavi
>
> Improve application relaunch time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)