Sergey Edunov created GIRAPH-950:
------------------------------------

             Summary: Auto-restart from checkpoint doesn't pick up latest checkpoint
                 Key: GIRAPH-950
                 URL: https://issues.apache.org/jira/browse/GIRAPH-950
             Project: Giraph
          Issue Type: Bug
            Reporter: Sergey Edunov

While running different jobs with checkpoints enabled, I noticed some issues:

1) The way we pick up the latest checkpoint is not correct. The current implementation just picks whatever is returned last from FileSystem.list(), which is not necessarily the last checkpoint.
2) If a job restarts from a checkpoint, it immediately creates another checkpoint.
3) We need more flexibility in GiraphJobRetryChecker to allow restarts after multiple failures.
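
To illustrate issue 1, here is a minimal sketch of choosing the latest checkpoint by parsing the superstep number out of each file name instead of trusting listing order. It works on plain file names so it runs without Hadoop. The class name LatestCheckpoint, the method latestSuperstep, and the "<superstep>.finalized" naming scheme are assumptions made for this example, not a description of the actual Giraph code or of the eventual fix.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

// Hypothetical helper for illustration only; not part of Giraph.
class LatestCheckpoint {
    // Assumed marker suffix for a completed checkpoint.
    static final String FINALIZED_SUFFIX = ".finalized";

    // Returns the highest superstep among names of the form
    // "<superstep>.finalized", regardless of the order of the listing.
    static OptionalLong latestSuperstep(List<String> fileNames) {
        boolean found = false;
        long best = 0L;
        for (String name : fileNames) {
            if (!name.endsWith(FINALIZED_SUFFIX)) {
                continue; // not a completed-checkpoint marker
            }
            String prefix =
                name.substring(0, name.length() - FINALIZED_SUFFIX.length());
            try {
                long superstep = Long.parseLong(prefix);
                if (!found || superstep > best) {
                    best = superstep;
                    found = true;
                }
            } catch (NumberFormatException e) {
                // Prefix is not a number; ignore this file.
            }
        }
        return found ? OptionalLong.of(best) : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // A listing in which the last entry is not the newest checkpoint.
        List<String> listing = Arrays.asList(
            "10.finalized", "2.finalized", "4.finalized", "12.metadata");
        System.out.println(latestSuperstep(listing));
    }
}
```

Comparing the parsed numbers also avoids the lexicographic trap where "10.finalized" sorts before "2.finalized", which is one way a listing can end on an older checkpoint.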