[ https://issues.apache.org/jira/browse/HDDS-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sadanand Shenoy updated HDDS-14654:
-----------------------------------
    Description: 
When a purged snapshot’s YAML is deleted (all versions removed), the orphan 
check can still schedule that snapshot again.

The next run then tries to load the missing YAML and throws NoSuchFileException.
{code:java}
java.nio.file.NoSuchFileException: <snapshots_dir>/om.db-<snapshotId>.yaml
    at java.nio.file.Files.newInputStream(Files.java:...)
    at org.apache.hadoop.ozone.util.YamlSerializer.load(YamlSerializer.java:73)
    at org.apache.hadoop.ozone.om.snapshot.OmSnapshotLocalDataManager$ReadableOmSnapshotLocalDataProvider.lambda$0(OmSnapshotLocalDataManager.java:653)
    at org.apache.hadoop.ozone.om.snapshot.OmSnapshotLocalDataManager$ReadableOmSnapshotLocalDataProvider.initialize(OmSnapshotLocalDataManager.java:655)
    at org.apache.hadoop.ozone.om.snapshot.OmSnapshotLocalDataManager$ReadableOmSnapshotLocalDataProvider.<init>(OmSnapshotLocalDataManager.java:614)
    at org.apache.hadoop.ozone.om.snapshot.OmSnapshotLocalDataManager$WritableOmSnapshotLocalDataProvider.<init>(OmSnapshotLocalDataManager.java:829)
    at org.apache.hadoop.ozone.om.snapshot.OmSnapshotLocalDataManager.checkOrphanSnapshotVersions(OmSnapshotLocalDataManager.java:408)
    at org.apache.hadoop.ozone.om.snapshot.OmSnapshotLocalDataManager.checkOrphanSnapshotVersions(OmSnapshotLocalDataManager.java:399)
    at org.apache.hadoop.ozone.om.snapshot.OmSnapshotLocalDataManager.lambda$0(OmSnapshotLocalDataManager.java:383)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:...)
    at java.util.concurrent.FutureTask.run(FutureTask.java:...)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:...)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:...)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:...)
    at java.lang.Thread.run(Thread.java:...)
{code}
Root cause

In checkForOphanVersionsAndIncrementCount, we always call 
incrementOrphanCheckCount(snapshotId) when isPurgeTransactionSet is true or the 
version changes, even if the YAML has just been deleted (no versions remain). 
That keeps the snapshot in snapshotToBeCheckedForOrphans even though there is 
nothing left to check, so the next scheduled run tries to open a file that no 
longer exists.
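
The guard implied by the root cause can be sketched as follows. This is only an illustration under stated assumptions, not the actual Ozone code or the patch for this issue: the class OrphanCheckSketch, the method rescheduleIfVersionsRemain, its parameters, and the use of a plain map for snapshotToBeCheckedForOrphans are all hypothetical stand-ins. The idea is to skip the increment when no versions remain, so a snapshot whose YAML is gone is not queued for another check.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical stand-in for the orphan-check bookkeeping; not the Ozone class. */
class OrphanCheckSketch {
  // Models snapshotToBeCheckedForOrphans: snapshot ID -> pending check count.
  private final Map<UUID, Integer> snapshotToBeCheckedForOrphans = new HashMap<>();

  void incrementOrphanCheckCount(UUID snapshotId) {
    snapshotToBeCheckedForOrphans.merge(snapshotId, 1, Integer::sum);
  }

  /**
   * Schedules another orphan check only if there is still something to check.
   * remainingVersions is an assumed representation of the versions left in the
   * snapshot's YAML; an empty list models "YAML was just deleted".
   */
  void rescheduleIfVersionsRemain(UUID snapshotId, List<Integer> remainingVersions,
      boolean isPurgeTransactionSet, boolean versionChanged) {
    if (remainingVersions.isEmpty()) {
      // Nothing left on disk to load, so do not queue this snapshot again.
      return;
    }
    if (isPurgeTransactionSet || versionChanged) {
      incrementOrphanCheckCount(snapshotId);
    }
  }

  boolean isScheduled(UUID snapshotId) {
    return snapshotToBeCheckedForOrphans.containsKey(snapshotId);
  }
}
```

With this guard, a purged snapshot with no remaining versions is never added to the map, while a snapshot that still has versions is scheduled as before.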

  was:
When a purged snapshot’s YAML is deleted (all versions removed), the orphan 
check can still schedule that snapshot again.

The next run then tries to load the missing YAML and throws NoSuchFileException.

Root cause

In checkForOphanVersionsAndIncrementCount, we always call 
incrementOrphanCheckCount(snapshotId) when isPurgeTransactionSet or the version 
changes, even if the YAML was just deleted (empty versions). That keeps the 
snapshot in snapshotToBeCheckedForOrphans even though there is nothing left to 
check.


> NoSuchFileException when orphan check runs after purged snapshot YAML is 
> deleted
> --------------------------------------------------------------------------------
>
>                 Key: HDDS-14654
>                 URL: https://issues.apache.org/jira/browse/HDDS-14654
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Sadanand Shenoy
>            Assignee: Sadanand Shenoy
>            Priority: Major
>              Labels: pull-request-available


