[
https://issues.apache.org/jira/browse/HDDS-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wei-Chiu Chuang updated HDDS-14654:
-----------------------------------
Epic Link: HDDS-13747
> NoSuchFileException when orphan check runs after purged snapshot YAML is deleted
> --------------------------------------------------------------------------------
>
> Key: HDDS-14654
> URL: https://issues.apache.org/jira/browse/HDDS-14654
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Sadanand Shenoy
> Assignee: Sadanand Shenoy
> Priority: Major
> Labels: pull-request-available
>
> When a purged snapshot’s YAML is deleted (all versions removed), the orphan
> check can still schedule that snapshot again.
> The next run then tries to load the missing YAML and throws
> NoSuchFileException.
> {code:java}
> java.nio.file.NoSuchFileException: <snapshots_dir>/om.db-<snapshotId>.yaml
> at java.nio.file.Files.newInputStream(Files.java:...)
> at org.apache.hadoop.ozone.util.YamlSerializer.load(YamlSerializer.java:73)
> at org.apache.hadoop.ozone.om.snapshot.OmSnapshotLocalDataManager$ReadableOmSnapshotLocalDataProvider.lambda$0(OmSnapshotLocalDataManager.java:653)
> at org.apache.hadoop.ozone.om.snapshot.OmSnapshotLocalDataManager$ReadableOmSnapshotLocalDataProvider.initialize(OmSnapshotLocalDataManager.java:655)
> at org.apache.hadoop.ozone.om.snapshot.OmSnapshotLocalDataManager$ReadableOmSnapshotLocalDataProvider.<init>(OmSnapshotLocalDataManager.java:614)
> at org.apache.hadoop.ozone.om.snapshot.OmSnapshotLocalDataManager$WritableOmSnapshotLocalDataProvider.<init>(OmSnapshotLocalDataManager.java:829)
> at org.apache.hadoop.ozone.om.snapshot.OmSnapshotLocalDataManager.checkOrphanSnapshotVersions(OmSnapshotLocalDataManager.java:408)
> at org.apache.hadoop.ozone.om.snapshot.OmSnapshotLocalDataManager.checkOrphanSnapshotVersions(OmSnapshotLocalDataManager.java:399)
> at org.apache.hadoop.ozone.om.snapshot.OmSnapshotLocalDataManager.lambda$0(OmSnapshotLocalDataManager.java:383)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:...)
> at java.util.concurrent.FutureTask.run(FutureTask.java:...)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:...)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:...)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:...)
> at java.lang.Thread.run(Thread.java:...)
> {code}
> Root cause
> In checkForOphanVersionsAndIncrementCount, incrementOrphanCheckCount(snapshotId)
> is always called when isPurgeTransactionSet is true or the version changes,
> even if the YAML was just deleted (its version list is empty). That keeps the
> snapshot in snapshotToBeCheckedForOrphans even though there is nothing left to
> check, so the next scheduled run tries to load a file that no longer exists.
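
The root cause described above suggests one possible guard: skip the increment when no versions remain. The following is only a minimal sketch of that idea, not the actual patch. The class OrphanCheckSketch, the method recordOrphanCheckIfNeeded, its parameters, and isPending are hypothetical stand-ins; only the names incrementOrphanCheckCount and snapshotToBeCheckedForOrphans are taken from the issue text, and their real signatures and types in OmSnapshotLocalDataManager are assumed here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical stand-in for the orphan-check bookkeeping; not the real OmSnapshotLocalDataManager. */
class OrphanCheckSketch {
  // Plays the role of snapshotToBeCheckedForOrphans: snapshot ID -> pending check count.
  // The real field's type is an assumption.
  private final Map<UUID, Integer> snapshotToBeCheckedForOrphans = new HashMap<>();

  // Assumed behavior: bump the pending count so that a later run re-checks the snapshot.
  private void incrementOrphanCheckCount(UUID snapshotId) {
    snapshotToBeCheckedForOrphans.merge(snapshotId, 1, Integer::sum);
  }

  /**
   * Hypothetical guard: schedule another orphan check only if some version
   * of the snapshot's local data still exists.
   *
   * @param remainingVersions versions left after the update; empty means the YAML was deleted
   */
  void recordOrphanCheckIfNeeded(UUID snapshotId, List<Integer> remainingVersions,
      boolean isPurgeTransactionSet, boolean versionChanged) {
    if (remainingVersions.isEmpty()) {
      // Nothing left to check: scheduling another run would only try to
      // load a YAML file that no longer exists.
      return;
    }
    if (isPurgeTransactionSet || versionChanged) {
      incrementOrphanCheckCount(snapshotId);
    }
  }

  /** True if the snapshot is still queued for an orphan check. */
  boolean isPending(UUID snapshotId) {
    return snapshotToBeCheckedForOrphans.containsKey(snapshotId);
  }
}
```

With this guard, a purged snapshot whose version list is empty is never queued by this call, while a snapshot that still has versions is queued as before. The sketch does not remove an entry that was queued earlier; whether the real fix needs to do that is not stated in the issue.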