[ 
https://issues.apache.org/jira/browse/SAMZA-557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592312#comment-14592312
 ] 

Navina Ramesh commented on SAMZA-557:
-------------------------------------

[~guozhang] I don't think this patch is part of the 0.9.1 release. It is going 
out in 0.10.0 (currently on master).

> Reuse local state in SamzaContainer on clean shutdown
> -----------------------------------------------------
>
>                 Key: SAMZA-557
>                 URL: https://issues.apache.org/jira/browse/SAMZA-557
>             Project: Samza
>          Issue Type: Sub-task
>          Components: container
>    Affects Versions: 0.9.0
>            Reporter: Chris Riccomini
>            Assignee: Navina Ramesh
>         Attachments: SAMZA-557-0.patch, SAMZA-557-1.patch, SAMZA-557-2.patch
>
>
> Restoring state every time a SamzaContainer is restarted (due to failure, or 
> re-deploy) can be expensive. Samza currently always restores state when a 
> SamzaContainer starts. This could be avoided if the container is started on 
> the same machine that it was running on before shutdown. It could re-use 
> state that exists locally when it's restarted. There are two modes of state 
> re-use:
> # A clean shutdown of the container has occurred.
> # An unclean shutdown of the container has occurred.
> Re-using clean state (1) could be achieved by having the SamzaContainer write 
> an OFFSET file for every local state directory when the SamzaContainer is 
> shut down (after the state stores have been stopped). A clean directory would 
> look as follows:
>     $PWD/state/my-kv-store/Partition-0/OFFSET
> The offset file must contain the offset of the last message in the changelog 
> feed. This information is retrievable via the 
> SystemAdmin.getSystemStreamMetadata method. When a SamzaContainer starts up, 
> it can check if the OFFSET file exists for each store. If the OFFSET file 
> does exist, the SamzaContainer can:
> # Instruct the state store to open the on-disk DB, rather than creating a new 
> state store from scratch.
> # Read the OFFSET file.
> # Delete the OFFSET file.
> # Restore the state store from the OFFSET value, rather than from the oldest 
> offset in the changelog (what Samza currently does).
> The OFFSET file must be removed (3) before any restoration is executed on the 
> store. If this is not done, then a partial restoration might occur, followed 
> by a failure. In such a case, non-idempotent writes to a store could result 
> in inaccurate data being persisted to disk. The trade-off with deleting the 
> OFFSET optimistically is that a failure during restoration will result in the 
> whole state having to be restored, since the OFFSET file is gone. This is 
> tolerable, since a failure during restoration is equivalent to an unclean 
> shutdown, in which case you wouldn't expect the (possibly corrupted) local 
> state to be used anyway.
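
The OFFSET file handling described in the issue could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the attached patches: the class and method names (OffsetFileSketch, writeOffsetFile, readAndDeleteOffsetFile) are hypothetical, the offset is treated as an opaque string, and looking up the last changelog offset and restoring the store are left to the caller.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical helper, not part of Samza: illustrates the OFFSET file protocol only.
public class OffsetFileSketch {
    static final String OFFSET_FILE_NAME = "OFFSET";

    // Clean shutdown: call after the state stores have been stopped.
    // lastOffset is assumed to be the offset of the last message in the changelog.
    static void writeOffsetFile(Path storePartitionDir, String lastOffset) throws IOException {
        Files.createDirectories(storePartitionDir);
        Files.write(storePartitionDir.resolve(OFFSET_FILE_NAME),
                lastOffset.getBytes(StandardCharsets.UTF_8));
    }

    // Startup: read the OFFSET file and delete it BEFORE any restoration begins.
    // An empty result means no clean shutdown was recorded, so the caller
    // should fall back to restoring from the oldest changelog offset.
    static Optional<String> readAndDeleteOffsetFile(Path storePartitionDir) throws IOException {
        Path offsetFile = storePartitionDir.resolve(OFFSET_FILE_NAME);
        if (!Files.exists(offsetFile)) {
            return Optional.empty();
        }
        String offset = new String(Files.readAllBytes(offsetFile), StandardCharsets.UTF_8).trim();
        // Deleting first means a failure mid-restore is treated as an unclean shutdown.
        Files.delete(offsetFile);
        return Optional.of(offset);
    }
}
```

A caller would pass a directory such as $PWD/state/my-kv-store/Partition-0 and, when an offset is returned, open the existing on-disk store and replay the changelog from that offset.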



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
