Github user enixon commented on a diff in the pull request:

    https://github.com/apache/zookeeper/pull/560#discussion_r203474532
  
    --- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
    @@ -399,8 +399,30 @@ public void save(DataTree dataTree,
             File snapshotFile = new File(snapDir, 
Util.makeSnapshotName(lastZxid));
             LOG.info("Snapshotting: 0x{} to {}", Long.toHexString(lastZxid),
                     snapshotFile);
    -        snapLog.serialize(dataTree, sessionsWithTimeouts, snapshotFile, 
syncSnap);
    -
    +        try {
    +            snapLog.serialize(dataTree, sessionsWithTimeouts, 
snapshotFile, syncSnap);
    +        } catch (IOException e) {
    +            if (snapshotFile.length() == 0) {
    --- End diff --
    
    I really like the idea! 
    
    Clearly this patch is aimed at closing one very particular kind of error 
case around snapshot writing and the idea generalizes to many more errors. My 
only concern with "always delete" is that I do not know how wide a net that 
casts and whether we might end up deleting a legitimately useful snapshot under 
some circumstances. Unfortunately, I don't have time to do the testing required 
to validate the safety of the more general approach.
    
    If that sounds reasonable, I'd propose that we commit on this patch and 
close the hole created from this particular issue (snapshot problems on no disk 
space) and I'll open a separate JIRA with details about extending the solution 
so someone can pick up the issue and do the necessary testing.


---

Reply via email to