Relaunching from the same location can be one of the options.

On Tue, Sep 20, 2016, 10:17 PM Tushar Gosavi <tus...@datatorrent.com> wrote:
> In case of application failure, we would like the ability to quickly
> restart the application while keeping the old state for failure
> analysis. The problem also remains when we want to start from a
> savepoint, where we would need to copy state from the savepoint to the
> application.
>
> -Tushar.
>
> On Tue, Sep 20, 2016 at 8:34 PM, Sandesh Hegde <sand...@datatorrent.com>
> wrote:
> > How about re-launching the app from the same location?
> >
> > If they want to store the state at all, we can provide a savepoint
> > feature.
> >
> > On Tue, Sep 20, 2016 at 4:39 AM Tushar Gosavi <tus...@datatorrent.com>
> > wrote:
> >
> >> We have observed that application relaunch takes a long time. One
> >> major reason for the delay in application startup during relaunch is
> >> the time taken to copy the state of the existing application to the
> >> new application. This state can grow to GBs, and the copy is
> >> performed in a single thread before the new application is submitted
> >> to YARN.
> >>
> >> The state of the previous application consists of:
> >> - jars
> >> - stram checkpoint/recovery file
> >> - events
> >> - container file
> >> - stats recording, if enabled
> >> - operator checkpoints
> >> - operator data
> >>
> >> We could avoid copying debugging data such as stats recordings, which
> >> can run into TBs for a long-running application and are not required
> >> for the functioning of the new application.
> >>
> >> Similarly, operator checkpoints could be read in parallel when the
> >> operators are launched for the first time. This would also help in
> >> copying only the required checkpoints, and the copy would be done in
> >> parallel by multiple containers/threads.
> >>
> >> For operator data stored in the application directory, we could copy
> >> it completely for now, but in future we could provide a callback that
> >> allows an operator partition to read only the required state from the
> >> previous location.
> >>
> >> Let me know your thoughts on this.
> >>
> >> Regards,
> >> - Tushar.
> >>
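
For illustration only, the selective, parallel copy proposed in the quoted
message could look roughly like the sketch below. It is not Apex code: it
uses Python and the local file system rather than HDFS, and the directory
names ("recordings", "checkpoints") and the function names are hypothetical.
Stats recordings are skipped, operator checkpoints are left behind for the
operators to read on first launch, and everything else is copied by a thread
pool instead of a single thread.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Hypothetical directory names; the real application directory layout may differ.
SKIP_DIRS = {"recordings"}   # debugging data, not needed by the new application
LAZY_DIRS = {"checkpoints"}  # left in place; operators read them on first launch


def files_to_copy(old_app_dir: Path):
    """Yield the files the new application needs before it is submitted."""
    for path in sorted(old_app_dir.rglob("*")):
        if not path.is_file():
            continue
        top_level = path.relative_to(old_app_dir).parts[0]
        if top_level in SKIP_DIRS or top_level in LAZY_DIRS:
            continue
        yield path


def copy_state(old_app_dir: Path, new_app_dir: Path, workers: int = 8) -> int:
    """Copy the required state with a thread pool; return the file count."""

    def copy_one(src: Path) -> None:
        dst = new_app_dir / src.relative_to(old_app_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)

    todo = list(files_to_copy(old_app_dir))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all copies and re-raises the first copy error, if any.
        list(pool.map(copy_one, todo))
    return len(todo)
```

Calling copy_state(old_dir, new_dir) with two pathlib.Path arguments copies
jars, events and similar state, and leaves the skipped directories in the old
location, where they remain available for failure analysis.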