Wei-Chiu Chuang created HDDS-15650:
--------------------------------------
Summary: Fix snapshotUsedNamespace underflow when FSO directory is
deleted and purged
Key: HDDS-15650
URL: https://issues.apache.org/jira/browse/HDDS-15650
Project: Apache Ozone
Issue Type: Bug
Reporter: Wei-Chiu Chuang
Ryan posted in HDDS-14435:
I found what look like two problems on the snapshotUsedNamespace side:
1. In OMKeyDeleteRequestWithFSO.java, it looks like we put a tombstone on the
directory, but the corresponding snapshotUsedNamespace update is skipped
by the empty-key check:
{code:java}
long quotaReleased = sumBlockLengths(omKeyInfo);
// Empty entries won't be added to deleted table so this key shouldn't
// get added to snapshotUsed space.
boolean isKeyNonEmpty = !OmKeyInfo.isKeyEmpty(omKeyInfo);
omBucketInfo.decrUsedBytes(quotaReleased, isKeyNonEmpty);
-> omBucketInfo.decrUsedNamespace(1L, isKeyNonEmpty);{code}
[https://github.com/apache/ozone/blob/cb29f193ea1d1945a8b908fbf8ee1a39bc00e5e4/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyDeleteRequestWithFSO.java#L161-L165]
This is the only namespace update in the file, so it looks to me like any
time a user deletes a directory, the deletion isn't reflected in
snapshotUsedNamespace, because directories don't have blocks.
2. In OMDirectoriesPurgeRequestWithFSO.java, it looks like we always decrement
snapshotUsedNamespace on purge, without checking whether the bucket even has
a snapshot:
{code:java}
if (path.hasDeletedDir()) {
  deletedDirNames.add(path.getDeletedDir());
  BucketNameInfo bucketNameInfo = volumeBucketIdMap.get(
      new VolumeBucketId(path.getVolumeId(), path.getBucketId()));
  OmBucketInfo omBucketInfo = getBucketInfo(omMetadataManager,
      bucketNameInfo.getVolumeName(), bucketNameInfo.getBucketName());
  if (omBucketInfo != null
      && omBucketInfo.getObjectID() == path.getBucketId()) {
--> omBucketInfo.purgeSnapshotUsedNamespace(1);
    volBucketInfoMap.put(Pair.of(omBucketInfo.getVolumeName(),
        omBucketInfo.getBucketName()), omBucketInfo);
  }
  numDirsDeleted++;
} {code}
[https://github.com/apache/ozone/blob/cb29f193ea1d1945a8b908fbf8ee1a39bc00e5e4/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMDirectoriesPurgeRequestWithFSO.java#L202]
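To make the interaction of the two points concrete, here is a minimal toy model of the counters. It is a sketch only: the method names mirror the snippets above, but the method bodies, the flag name trackAsPendingPurge and the class BucketCountersSketch are assumptions inferred from this report, not the actual OmBucketInfo code.

```java
// Toy model only. Names mirror the snippets above; the bodies are
// assumptions inferred from this report, not the Ozone source.
final class BucketCountersSketch {
  long usedNamespace;
  long snapshotUsedNamespace;

  // Assumed behavior: always release from usedNamespace, and move the entry
  // into snapshotUsedNamespace only when the flag is set.
  void decrUsedNamespace(long n, boolean trackAsPendingPurge) {
    usedNamespace -= n;
    if (trackAsPendingPurge) {
      snapshotUsedNamespace += n;
    }
  }

  // Assumed behavior: unconditional decrement on purge.
  void purgeSnapshotUsedNamespace(long n) {
    snapshotUsedNamespace -= n;
  }

  long getTotalBucketNamespace() {
    return usedNamespace + snapshotUsedNamespace;
  }

  // Delete one directory, then purge it, in a bucket without snapshots.
  static BucketCountersSketch deleteAndPurgeDirectory() {
    BucketCountersSketch b = new BucketCountersSketch();
    b.usedNamespace = 1;            // the bucket holds a single directory
    boolean isKeyNonEmpty = false;  // directories have no blocks
    b.decrUsedNamespace(1L, isKeyNonEmpty);  // delete: no increment happens
    b.purgeSnapshotUsedNamespace(1);         // purge: decrement happens anyway
    return b;
  }

  public static void main(String[] args) {
    BucketCountersSketch b = deleteAndPurgeDirectory();
    // Prints: usedNamespace=0 snapshotUsedNamespace=-1 total=-1
    System.out.println("usedNamespace=" + b.usedNamespace
        + " snapshotUsedNamespace=" + b.snapshotUsedNamespace
        + " total=" + b.getTotalBucketNamespace());
  }
}
```

Under these assumptions, every directory that is deleted and later purged leaves snapshotUsedNamespace one lower than before, which would explain the drift on buckets that never had a snapshot.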
As a consequence, it's common for snapshotUsedNamespace to be hugely negative
on buckets with no snapshots at all. For example, I know of a large cluster
with ~200 DNs where 68 buckets have negative snapshotUsedNamespace values,
yet only 6 buckets in the cluster have snapshots.
These wrong snapshotUsedNamespace counts are reflected in the bucket info
output, because the value it displays is the sum of AOS usedNamespace +
snapshotUsedNamespace. So this problem wrecks namespace quotas even on FSO
buckets without snapshots:
{code:java}
public long getTotalBucketNamespace() {
return usedNamespace + snapshotUsedNamespace;
} {code}
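As a worked example of the effect on quotas, the sketch below plugs a negative snapshotUsedNamespace into the same sum. The numbers, the class NamespaceQuotaSketch and the wouldExceedQuota check are all hypothetical and only illustrate the arithmetic; they are not taken from the Ozone quota code.

```java
// Illustration only: hypothetical numbers and a hypothetical quota check.
final class NamespaceQuotaSketch {
  // Same sum as getTotalBucketNamespace() above.
  static long totalNamespace(long usedNamespace, long snapshotUsedNamespace) {
    return usedNamespace + snapshotUsedNamespace;
  }

  // Hypothetical check: would adding toAdd entries go over the quota?
  static boolean wouldExceedQuota(long total, long toAdd, long quota) {
    return total + toAdd > quota;
  }

  public static void main(String[] args) {
    long quota = 1000;
    long usedNamespace = 1000;          // the bucket is actually full
    long snapshotUsedNamespace = -250;  // drifted negative, no snapshots exist
    long total = totalNamespace(usedNamespace, snapshotUsedNamespace);
    // Prints: reported total = 750
    System.out.println("reported total = " + total);
    // Prints: new entry rejected? false
    System.out.println("new entry rejected? "
        + wouldExceedQuota(total, 1, quota));
  }
}
```

With a correct snapshotUsedNamespace of 0 the same check would reject the new entry, so in this sketch the negative counter hides 250 entries of real usage.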