[ 
https://issues.apache.org/jira/browse/GIRAPH-297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433426#comment-13433426
 ] 

Maja Kabiljo commented on GIRAPH-297:
-------------------------------------

The easiest fix here would be to switch the execution order so that vertices 
run first and the master second. That would make everything consistent, and to 
me it makes just as much sense as the current order. However, if we want to 
keep the current order, I think we should make some changes in the 
implementation. It's true that this code is not something a user should have 
to look at, but it could still lead us to make mistakes when changing other 
parts. I will think about the implementation a bit more and comment then.

In the meantime, the ordering question alone won't fix the problem described 
in this issue. We want the master to call finalizeCheckpoint after all the 
workers have written their checkpoint data, but we have no indication of when 
the workers have done that. We also call it after master.compute for the next 
superstep, which is wrong. I attached a patch which fixes this: it adds 
another barrier there, which isn't really a big deal since the master would be 
waiting for the workers to finish at that point anyway. Apart from fixing what 
was incorrect, I think this is better because we no longer have to wait for 
the computation of superstep X to finish before finalizing the checkpoint for 
superstep X-1.
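
To illustrate the idea of the extra barrier, here is a minimal in-process 
sketch. It is not the code from the attached patch and it does not use 
Giraph's actual coordination mechanism; the class CheckpointBarrierSketch, the 
method runCheckpoint and the event strings are hypothetical names made up for 
this example, and a java.util.concurrent.CountDownLatch stands in for the 
barrier. Each simulated worker signals as soon as its checkpoint data is 
written, and the master finalizes only after every worker has signalled, 
without waiting for the workers' compute() phase.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration only: an in-process stand-in for the
// master/worker checkpoint barrier, not Giraph code.
public class CheckpointBarrierSketch {

    /**
     * Simulates one checkpoint round and returns the events in the
     * order in which they were recorded.
     */
    static List<String> runCheckpoint(int numWorkers, long superstep)
            throws InterruptedException {
        List<String> events = new CopyOnWriteArrayList<>();
        // The barrier: one count per worker that must finish writing.
        CountDownLatch checkpointWritten = new CountDownLatch(numWorkers);
        ExecutorService workers = Executors.newFixedThreadPool(numWorkers);

        for (int i = 0; i < numWorkers; i++) {
            final int workerId = i;
            workers.submit(() -> {
                // Worker writes its checkpoint data first...
                events.add("worker " + workerId
                        + " wrote checkpoint " + superstep);
                // ...then signals the barrier. The compute() phase for this
                // superstep would follow here; the master does not wait for it.
                checkpointWritten.countDown();
            });
        }

        // Master blocks until every worker has written its checkpoint,
        // and only then finalizes it.
        checkpointWritten.await();
        events.add("master finalized checkpoint " + superstep);

        workers.shutdown();
        return events;
    }

    public static void main(String[] args) throws InterruptedException {
        runCheckpoint(3, 5).forEach(System.out::println);
    }
}
```

The order of the worker lines can vary between runs, but the master line is 
always recorded last, because await() returns only after all countDown() calls 
have happened.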
                
> Checkpointing on master is done one superstep later
> ---------------------------------------------------
>
>                 Key: GIRAPH-297
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-297
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Maja Kabiljo
>            Assignee: Maja Kabiljo
>         Attachments: GIRAPH-297.patch
>
>
> On workers we store checkpoint X before the compute() calls for superstep X 
> are executed. On master we do it after those compute() calls are executed and 
> after master.compute() for superstep X+1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
