[jira] [Commented] (GEODE-7989) Improve logging of exceptions that happen during execution of backup

ASF subversion and git services (Jira) Mon, 20 Apr 2020 05:54:34 -0700


    [ 
https://issues.apache.org/jira/browse/GEODE-7989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17087689#comment-17087689
 ]


ASF subversion and git services commented on GEODE-7989:
--------------------------------------------------------

Commit 509240f8deecb1361aaf2a3cd041b01d87d65540 in geode's branch 
refs/heads/develop from Jakov Varenina
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=509240f ]

GEODE-7989: Improve backup exceptions logging (#4967)

* GEODE-7989: Improve backup execeptions logging

Log as warning exception that causes backup execution to fail

* InterruptedException logged as warn

* empty commit to re-launch CI

> Improve logging of exceptions that happen during execution of backup
> --------------------------------------------------------------------
>
>                 Key: GEODE-7989
>                 URL: https://issues.apache.org/jira/browse/GEODE-7989
>             Project: Geode
>          Issue Type: Improvement
>            Reporter: Jakov Varenina
>            Assignee: Jakov Varenina
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When a backup runs on the servers and fails with an exception, e.g. 
> "IOException: Not enough space left on device", the exception is not 
> propagated to the caller of the DistributedSystemMXBean.backupAllMembers 
> API. The caller receives only the list of members and disk stores that were 
> backed up successfully. Nothing indicates what caused the backup to fail on 
> the other members, because the exception is not logged on the server when 
> the log level is less verbose than debug (config, warn, ...). It would be 
> good to at least have better logging for the following cases: 
> 1. The disk where oplogs are saved is too small for the new oplog created by 
> the Geode backup procedure. This step is executed in the Geode backup phase 
> startDiskStoreBackup. If there is not enough space left on the device, Geode 
> logs the exception at DEBUG level (see below). It would be good to have it 
> logged at info or warning level.
> 2. There is not enough space on the disk to which oplogs are copied for the 
> backup (this need not be the same disk as mentioned above, and it is not the 
> same disk in our case). This step in Geode is called completeBackup; it 
> logs nothing, not even at debug level, if a problem occurs, but the disk 
> stores are reported as offline (DiskBackupStatus.getOfflineDiskStores()). 
> It would be good to have this exception logged at info or warning level.
> Exception logged only at debug level:
> java.io.IOException: Not enough space left on device
>         at org.apache.geode.internal.shared.NativeCallsJNAImpl$POSIXNativeCalls.preBlow(NativeCallsJNAImpl.java:296)
>         at org.apache.geode.internal.cache.Oplog.preblow(Oplog.java:1007)
>         at org.apache.geode.internal.cache.Oplog.createCrf(Oplog.java:1073)
>         at org.apache.geode.internal.cache.Oplog.<init>(Oplog.java:646)
>         at org.apache.geode.internal.cache.Oplog.switchOpLog(Oplog.java:3723)
>         at org.apache.geode.internal.cache.Oplog.forceRolling(Oplog.java:3643)
>         at org.apache.geode.internal.cache.PersistentOplogSet.forceRoll(PersistentOplogSet.java:199)
>         at org.apache.geode.internal.cache.backup.BackupTask.startDiskStoreBackup(BackupTask.java:274)
>         at org.apache.geode.internal.cache.backup.BackupTask.startDiskStoreBackups(BackupTask.java:149)
>         at org.apache.geode.internal.cache.backup.BackupTask.doBackup(BackupTask.java:111)
>         at org.apache.geode.internal.cache.backup.BackupTask.backup(BackupTask.java:82)
>         at org.apache.geode.internal.cache.backup.BackupService.lambda$prepareBackup$0(BackupService.java:62)
>         at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base/java.lang.Thread.run(Thread.java:834)
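
For illustration only, the behaviour the issue asks for (an exception that 
makes a backup step fail is logged at warning level instead of debug) could be 
sketched roughly as follows. This is not the code from the commit: the class 
BackupLoggingSketch, the BackupStep interface and the runStep method are 
hypothetical names, and java.util.logging is used only to keep the sketch 
self-contained; it is assumed here that real Geode code would go through its 
own logging facilities.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch, not Geode code: shows a failing backup step being
// logged at WARNING so it is visible at default log levels.
public class BackupLoggingSketch {
  private static final Logger logger =
      Logger.getLogger(BackupLoggingSketch.class.getName());

  /** Hypothetical stand-in for a single backup phase that may fail with I/O errors. */
  interface BackupStep {
    void run() throws IOException;
  }

  /**
   * Runs one backup step. Returns true on success. On failure the exception,
   * including its stack trace, is logged at WARNING and false is returned.
   */
  static boolean runStep(String diskStoreName, BackupStep step) {
    try {
      step.run();
      return true;
    } catch (IOException e) {
      // Logged at WARNING rather than at a debug-like level such as FINE.
      logger.log(Level.WARNING, "Backup failed for disk store " + diskStoreName, e);
      return false;
    }
  }

  public static void main(String[] args) {
    // Simulate the failure quoted in the issue description.
    boolean ok = runStep("exampleDiskStore", () -> {
      throw new IOException("Not enough space left on device");
    });
    System.out.println("backup succeeded: " + ok);
  }
}
```

With a default java.util.logging configuration the warning and its stack trace 
go to the console, so an operator can see why a disk store is missing from the 
backup result without enabling debug logging.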



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
