Wei-Chiu Chuang created HDDS-15650:
--------------------------------------

             Summary: Fix snapshotUsedNamespace underflow when FSO directory is 
deleted and purged
                 Key: HDDS-15650
                 URL: https://issues.apache.org/jira/browse/HDDS-15650
             Project: Apache Ozone
          Issue Type: Bug
            Reporter: Wei-Chiu Chuang


Ryan posted in HDDS-14435:

 

I found what look like some problems on the snapshotUsedNamespace side, in two 
areas:

1. In OMKeyDeleteRequestWithFSO.java, it looks like we put a tombstone on the 
directory, but the corresponding snapshotUsedNamespace update is skipped by 
the empty-key check:

{code:java}
      long quotaReleased = sumBlockLengths(omKeyInfo);
      // Empty entries won't be added to deleted table so this key shouldn't
      // get added to snapshotUsed space.
      boolean isKeyNonEmpty = !OmKeyInfo.isKeyEmpty(omKeyInfo);
      omBucketInfo.decrUsedBytes(quotaReleased, isKeyNonEmpty);
->    omBucketInfo.decrUsedNamespace(1L, isKeyNonEmpty);{code}
[https://github.com/apache/ozone/blob/cb29f193ea1d1945a8b908fbf8ee1a39bc00e5e4/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyDeleteRequestWithFSO.java#L161-L165]

This is the only namespace update in the file, so it looks to me like any 
time a user deletes a directory, the deletion is not reflected in 
snapshotUsedNamespace, because directories don't have blocks.

2. In OMDirectoriesPurgeRequestWithFSO.java, it looks like we always decrement 
snapshotUsedNamespace on purge, with no attendant checks of whether there even 
is a snapshot:
{code:java}
        if (path.hasDeletedDir()) {
          deletedDirNames.add(path.getDeletedDir());
          BucketNameInfo bucketNameInfo = volumeBucketIdMap.get(
              new VolumeBucketId(path.getVolumeId(), path.getBucketId()));
          OmBucketInfo omBucketInfo = getBucketInfo(omMetadataManager,
              bucketNameInfo.getVolumeName(), bucketNameInfo.getBucketName());
          if (omBucketInfo != null
              && omBucketInfo.getObjectID() == path.getBucketId()) {
-->         omBucketInfo.purgeSnapshotUsedNamespace(1);
            volBucketInfoMap.put(Pair.of(omBucketInfo.getVolumeName(),
                omBucketInfo.getBucketName()), omBucketInfo);
          }
          numDirsDeleted++;
        } {code}
[https://github.com/apache/ozone/blob/cb29f193ea1d1945a8b908fbf8ee1a39bc00e5e4/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMDirectoriesPurgeRequestWithFSO.java#L202]

As a consequence, snapshotUsedNamespace is commonly hugely negative on buckets 
with no snapshots at all. For example, I know of a large cluster (~200 DNs) 
where 68 buckets have negative snapshotUsedNamespace values, yet only 6 
buckets in the cluster have snapshots.
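
To make the interaction of the two code paths concrete, here is a minimal toy model. It is only a sketch: the class BucketCountersSketch and the helper deleteAndPurgeDirectories are hypothetical, and the method bodies are my assumption about what decrUsedNamespace and purgeSnapshotUsedNamespace do (the real OmBucketInfo may differ). Under those assumptions, each directory that is deleted and later purged leaves the counter one lower than it started:

```java
// Toy model only: field and method names mirror the snippets quoted above,
// but the method bodies are assumptions, not the real OmBucketInfo code.
final class BucketCountersSketch {
  long usedNamespace;
  long snapshotUsedNamespace;

  // Assumed behaviour: the entry always leaves the active namespace and is
  // counted as snapshot-used only when the flag is true.
  void decrUsedNamespace(long n, boolean countAsSnapshotUsed) {
    usedNamespace -= n;
    if (countAsSnapshotUsed) {
      snapshotUsedNamespace += n;
    }
  }

  // Assumed behaviour: unconditional decrement, as in the purge path.
  void purgeSnapshotUsedNamespace(long n) {
    snapshotUsedNamespace -= n;
  }

  // Hypothetical helper: deletes and then purges `dirs` directories on a
  // fresh bucket and returns the resulting snapshotUsedNamespace.
  static long deleteAndPurgeDirectories(int dirs) {
    BucketCountersSketch bucket = new BucketCountersSketch();
    bucket.usedNamespace = dirs;
    for (int i = 0; i < dirs; i++) {
      boolean isKeyNonEmpty = false; // directories have no blocks
      bucket.decrUsedNamespace(1L, isKeyNonEmpty);
    }
    for (int i = 0; i < dirs; i++) {
      bucket.purgeSnapshotUsedNamespace(1L);
    }
    return bucket.snapshotUsedNamespace;
  }

  public static void main(String[] args) {
    System.out.println(deleteAndPurgeDirectories(3)); // prints -3
  }
}
```

In this model the delete never increments the counter for a directory, but the purge always decrements it, so the value drifts negative whether or not the bucket has a snapshot.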

These wrong snapshotUsedNamespace counts are reflected in the bucket info 
output, because the value it displays is the sum of AOS usedNamespace + 
snapshotUsedNamespace. So this problem wrecks namespace quotas even on FSO 
buckets without snapshots:
{code:java}
public long getTotalBucketNamespace() {
  return usedNamespace + snapshotUsedNamespace;
} {code}
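
As a small worked example of the sum above, the sketch below plugs in made-up numbers. The class TotalNamespaceSketch and the values 100 and -40 are hypothetical and chosen only for illustration; only the addition itself comes from the snippet quoted above.

```java
// Illustration only: the values are invented, and the method is a static
// stand-in for the getTotalBucketNamespace() snippet quoted above.
final class TotalNamespaceSketch {
  static long getTotalBucketNamespace(long usedNamespace,
      long snapshotUsedNamespace) {
    return usedNamespace + snapshotUsedNamespace;
  }

  public static void main(String[] args) {
    long usedNamespace = 100L;         // hypothetical live entries in AOS
    long snapshotUsedNamespace = -40L; // hypothetical underflowed counter
    // Prints 60: the reported total is lower than the 100 live entries.
    System.out.println(
        getTotalBucketNamespace(usedNamespace, snapshotUsedNamespace));
  }
}
```

With a negative snapshotUsedNamespace the reported total under-counts what is actually in the bucket, so anything that compares this total against a namespace quota is working from a number that is too low.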


