Copilot commented on code in PR #9388:
URL: https://github.com/apache/ozone/pull/9388#discussion_r2582758225
##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/RocksDatabase.java:
##########
@@ -665,6 +666,35 @@ public Collection<ColumnFamily> getExtraColumnFamilies() {
    return Collections.unmodifiableCollection(columnFamilies.values());
  }
+  public void dropColumnFamily(String tableName) throws RocksDatabaseException {
+    try (UncheckedAutoCloseable ignored = acquire()) {
+      ColumnFamily columnFamily = columnFamilies.get(tableName);
+      if (columnFamily != null) {
+        try {
+          getManagedRocksDb().get().dropColumnFamily(columnFamily.getHandle());
+          byte[] handleNameBytes = columnFamily.getHandle().getName();
+          ColumnFamilyDescriptor descriptor = null;
+          for (int i = 0; i < descriptors.size(); i++) {
+            ColumnFamilyDescriptor desc = descriptors.get(i);
+            if (Arrays.equals(desc.getName(), handleNameBytes)) {
+              descriptor = desc;
+              descriptors.remove(i);
+              break;
+            }
+          }
+          columnFamily.getHandle().close();
+          if (descriptor != null) {
+            RocksDatabase.close(descriptor);
+          }
+          columnFamilies.remove(tableName);
Review Comment:
After removing a column family from `columnFamilies` map, the cached
`columnFamilyNames` supplier (line 376) becomes stale. The `MemoizedSupplier`
caches the result of `toColumnFamilyNameMap(columnFamilies.values())` and will
continue returning outdated data that includes the dropped column family.
This should be addressed by either:
1. Making `columnFamilyNames` recompute after modifications (see the sketch
below), or
2. Directly updating the cached map if possible
Without this fix, `getColumnFamilyNames()` will return incorrect data after
a column family is dropped.
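For illustration, option 1 could look like the sketch below. This assumes the
supplier is Ratis's `MemoizedSupplier`, that the `columnFamilyNames` field can
be changed from `final` to `volatile`, and that `refreshColumnFamilyNames` is
a hypothetical helper name:
```java
// Hypothetical helper: rebuilding the MemoizedSupplier discards the cached
// result, so the next getColumnFamilyNames() call recomputes from the live
// columnFamilies map. Assumes the field is declared volatile, not final.
private void refreshColumnFamilyNames() {
  columnFamilyNames = MemoizedSupplier.valueOf(
      () -> toColumnFamilyNameMap(columnFamilies.values()));
}
```
`dropColumnFamily` would then call `refreshColumnFamilyNames()` immediately
after `columnFamilies.remove(tableName)`.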
##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/RocksDatabase.java:
##########
@@ -665,6 +666,35 @@ public Collection<ColumnFamily> getExtraColumnFamilies() {
    return Collections.unmodifiableCollection(columnFamilies.values());
  }
+  public void dropColumnFamily(String tableName) throws RocksDatabaseException {
+    try (UncheckedAutoCloseable ignored = acquire()) {
+      ColumnFamily columnFamily = columnFamilies.get(tableName);
+      if (columnFamily != null) {
+        try {
+          getManagedRocksDb().get().dropColumnFamily(columnFamily.getHandle());
+          byte[] handleNameBytes = columnFamily.getHandle().getName();
+          ColumnFamilyDescriptor descriptor = null;
+          for (int i = 0; i < descriptors.size(); i++) {
+            ColumnFamilyDescriptor desc = descriptors.get(i);
+            if (Arrays.equals(desc.getName(), handleNameBytes)) {
+              descriptor = desc;
+              descriptors.remove(i);
+              break;
+            }
+          }
+          columnFamily.getHandle().close();
+          if (descriptor != null) {
+            RocksDatabase.close(descriptor);
+          }
+          columnFamilies.remove(tableName);
+        } catch (RocksDBException e) {
+          closeOnError(e);
+          throw toRocksDatabaseException(this, "DropColumnFamily " + tableName, e);
+        }
+      }
+    }
+  }
Review Comment:
The `dropColumnFamily` method is not thread-safe. It modifies both
`columnFamilies` (HashMap) and `descriptors` (ArrayList) without
synchronization. Concurrent calls to `dropColumnFamily`, `getColumnFamily`, or
`getExtraColumnFamilies` could lead to race conditions and data corruption.
Since `columnFamilies` and `descriptors` are not thread-safe collections and
there's no synchronization mechanism in place, consider either:
1. Making this method `synchronized` (see the sketch below), or
2. Using thread-safe collections (e.g., `ConcurrentHashMap` and
`CopyOnWriteArrayList`), or
3. Documenting that this method must only be called when no other operations
are in progress
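For illustration, option 1 is the smallest change, assuming every other
accessor of these collections is guarded by the same monitor (a sketch, not a
tested fix):
```java
// Option 1 sketch: the intrinsic lock serializes concurrent drops. For full
// protection, readers such as getColumnFamily()/getExtraColumnFamilies()
// would need the same guard, or the collections must be made thread-safe.
public synchronized void dropColumnFamily(String tableName) throws RocksDatabaseException {
  // ... body unchanged ...
}
```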
##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/RocksDatabase.java:
##########
@@ -381,7 +382,7 @@ private Map<String, ColumnFamily> toColumnFamilyMap(List<ColumnFamilyHandle> han
      final ColumnFamily f = new ColumnFamily(h);
      map.put(f.getName(), f);
    }
-    return Collections.unmodifiableMap(map);
+    return map;
Review Comment:
Returning the mutable `map` instead of `Collections.unmodifiableMap(map)`
breaks the immutability contract that was previously enforced: callers can now
modify the `columnFamilies` map directly, which could lead to inconsistent
state.
While this change enables the `dropColumnFamily` method to modify the map,
it also exposes the internal mutable state. Consider:
1. Keeping the map immutable and using alternative approaches (e.g., creating
a new map instance on drops; see the copy-on-write sketch below), or
2. Using a `ConcurrentHashMap` and carefully managing concurrent
modifications, or
3. If mutability is intentional, document this behavior clearly
The previous immutability was a defensive programming practice that
prevented accidental modifications.
```suggestion
    return Collections.unmodifiableMap(map);
```
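If option 1 is preferred, here is a minimal sketch of a copy-on-write drop
that keeps the published map unmodifiable; it assumes `columnFamilies` can be
changed to a non-final `volatile` field, and `removeColumnFamilyEntry` is a
hypothetical helper:
```java
// Hypothetical copy-on-write helper: rebuild and republish an unmodifiable
// map rather than mutating the shared one. Requires changing the field to:
//   private volatile Map<String, ColumnFamily> columnFamilies;
private void removeColumnFamilyEntry(String tableName) {
  final Map<String, ColumnFamily> updated = new HashMap<>(columnFamilies);
  updated.remove(tableName);
  columnFamilies = Collections.unmodifiableMap(updated);
}
```
Readers then always observe a consistent, immutable snapshot, and
`toColumnFamilyMap` can keep returning an unmodifiable map.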
##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/RocksDatabase.java:
##########
@@ -665,6 +666,35 @@ public Collection<ColumnFamily> getExtraColumnFamilies() {
    return Collections.unmodifiableCollection(columnFamilies.values());
  }
+  public void dropColumnFamily(String tableName) throws RocksDatabaseException {
+    try (UncheckedAutoCloseable ignored = acquire()) {
+      ColumnFamily columnFamily = columnFamilies.get(tableName);
+      if (columnFamily != null) {
+        try {
+          getManagedRocksDb().get().dropColumnFamily(columnFamily.getHandle());
+          byte[] handleNameBytes = columnFamily.getHandle().getName();
+          ColumnFamilyDescriptor descriptor = null;
+          for (int i = 0; i < descriptors.size(); i++) {
+            ColumnFamilyDescriptor desc = descriptors.get(i);
+            if (Arrays.equals(desc.getName(), handleNameBytes)) {
+              descriptor = desc;
+              descriptors.remove(i);
+              break;
+            }
+          }
+          columnFamily.getHandle().close();
+          if (descriptor != null) {
+            RocksDatabase.close(descriptor);
+          }
+          columnFamilies.remove(tableName);
+        } catch (RocksDBException e) {
+          closeOnError(e);
+          throw toRocksDatabaseException(this, "DropColumnFamily " + tableName, e);
+        }
+      }
+    }
+  }
Review Comment:
The new `dropColumnFamily` method lacks direct unit test coverage in the
framework module. While the PR description mentions it's tested via "Defrag
Service Test", consider adding unit tests in `TestRDBStore.java` that cover:
1. Successfully dropping an existing column family
2. Attempting to drop a non-existent column family (should be a no-op)
3. Verifying that handles are properly closed and removed from internal data
structures
4. Error handling when RocksDB operations fail
This would improve test coverage and make it easier to catch regressions in
this critical cleanup logic.
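A rough sketch of cases 1-3 (JUnit 5; `newRocksDatabase()` and the
`testTable` column family are placeholder fixtures, and `getColumnFamily`
returning `null` for a missing table is an assumption):
```java
// Illustrative only; fixture helpers are hypothetical.
@Test
public void testDropColumnFamily() throws Exception {
  RocksDatabase db = newRocksDatabase();          // hypothetical fixture, opened with a "testTable" CF
  assertNotNull(db.getColumnFamily("testTable"));

  db.dropColumnFamily("testTable");               // case 1: drop an existing CF
  assertNull(db.getColumnFamily("testTable"));    // case 3: removed from internal bookkeeping

  assertDoesNotThrow(() -> db.dropColumnFamily("noSuchTable")); // case 2: no-op
}
```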
##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/RocksDatabase.java:
##########
@@ -665,6 +666,35 @@ public Collection<ColumnFamily> getExtraColumnFamilies() {
    return Collections.unmodifiableCollection(columnFamilies.values());
  }
+  public void dropColumnFamily(String tableName) throws RocksDatabaseException {
+    try (UncheckedAutoCloseable ignored = acquire()) {
+      ColumnFamily columnFamily = columnFamilies.get(tableName);
+      if (columnFamily != null) {
+        try {
+          getManagedRocksDb().get().dropColumnFamily(columnFamily.getHandle());
+          byte[] handleNameBytes = columnFamily.getHandle().getName();
+          ColumnFamilyDescriptor descriptor = null;
+          for (int i = 0; i < descriptors.size(); i++) {
+            ColumnFamilyDescriptor desc = descriptors.get(i);
+            if (Arrays.equals(desc.getName(), handleNameBytes)) {
+              descriptor = desc;
+              descriptors.remove(i);
+              break;
+            }
+          }
+          columnFamily.getHandle().close();
+          if (descriptor != null) {
+            RocksDatabase.close(descriptor);
+          }
+          columnFamilies.remove(tableName);
Review Comment:
If `columnFamily.getHandle().close()` throws an exception (line 685), the
subsequent cleanup operations (lines 686-689) won't execute, leaving the column
family in an inconsistent state:
- The descriptor won't be closed (line 687)
- The column family remains in the `columnFamilies` map (line 689)
Even though the column family was successfully dropped from RocksDB (line
674), the internal bookkeeping won't be updated. Consider wrapping each close
operation in its own try-catch block so that all cleanup steps still run
(`HashMap.remove` itself cannot throw), or using the try-with-resources
pattern where applicable.
```suggestion
          try {
            columnFamily.getHandle().close();
          } catch (Exception closeHandleEx) {
            LOG.warn("Failed to close ColumnFamilyHandle for table {}",
                tableName, closeHandleEx);
          }
          if (descriptor != null) {
            try {
              RocksDatabase.close(descriptor);
            } catch (Exception closeDescEx) {
              LOG.warn("Failed to close ColumnFamilyDescriptor for table {}",
                  tableName, closeDescEx);
            }
          }
          columnFamilies.remove(tableName);
```
##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/RocksDatabase.java:
##########
@@ -665,6 +666,35 @@ public Collection<ColumnFamily> getExtraColumnFamilies() {
    return Collections.unmodifiableCollection(columnFamilies.values());
  }
+  public void dropColumnFamily(String tableName) throws RocksDatabaseException {
+    try (UncheckedAutoCloseable ignored = acquire()) {
+      ColumnFamily columnFamily = columnFamilies.get(tableName);
+      if (columnFamily != null) {
+        try {
+          getManagedRocksDb().get().dropColumnFamily(columnFamily.getHandle());
+          byte[] handleNameBytes = columnFamily.getHandle().getName();
+          ColumnFamilyDescriptor descriptor = null;
+          for (int i = 0; i < descriptors.size(); i++) {
+            ColumnFamilyDescriptor desc = descriptors.get(i);
+            if (Arrays.equals(desc.getName(), handleNameBytes)) {
+              descriptor = desc;
+              descriptors.remove(i);
+              break;
+            }
+          }
+          columnFamily.getHandle().close();
+          if (descriptor != null) {
+            RocksDatabase.close(descriptor);
+          }
Review Comment:
The variable `descriptor` is initialized to `null` and only used to hold a
reference for cleanup. Consider narrowing its scope or using a different
pattern that avoids the null initialization:
```java
ColumnFamilyDescriptor descriptorToClose = descriptors.stream()
    .filter(desc -> Arrays.equals(desc.getName(), handleNameBytes))
    .findFirst()
    .orElse(null);
if (descriptorToClose != null) {
  descriptors.remove(descriptorToClose);
  // ... cleanup
}
```
This makes the code more readable and reduces the scope of the variable.
```suggestion
          ColumnFamilyDescriptor descriptorToClose = descriptors.stream()
              .filter(desc -> Arrays.equals(desc.getName(), handleNameBytes))
              .findFirst()
              .orElse(null);
          if (descriptorToClose != null) {
            descriptors.remove(descriptorToClose);
            columnFamily.getHandle().close();
            RocksDatabase.close(descriptorToClose);
          } else {
            columnFamily.getHandle().close();
          }
```
##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/RocksDatabase.java:
##########
@@ -665,6 +666,35 @@ public Collection<ColumnFamily> getExtraColumnFamilies() {
    return Collections.unmodifiableCollection(columnFamilies.values());
  }
+  public void dropColumnFamily(String tableName) throws RocksDatabaseException {
+    try (UncheckedAutoCloseable ignored = acquire()) {
+      ColumnFamily columnFamily = columnFamilies.get(tableName);
+      if (columnFamily != null) {
+        try {
+          getManagedRocksDb().get().dropColumnFamily(columnFamily.getHandle());
+          byte[] handleNameBytes = columnFamily.getHandle().getName();
+          ColumnFamilyDescriptor descriptor = null;
+          for (int i = 0; i < descriptors.size(); i++) {
+            ColumnFamilyDescriptor desc = descriptors.get(i);
+            if (Arrays.equals(desc.getName(), handleNameBytes)) {
+              descriptor = desc;
+              descriptors.remove(i);
+              break;
+            }
Review Comment:
This loop removes an element from the `descriptors` list by index while
iterating over it. Because it breaks immediately after the removal it won't
corrupt its own iteration, but the pattern is fragile: if multiple threads
call `dropColumnFamily` concurrently, or another thread reads `descriptors`
during the modification, it could lead to `ConcurrentModificationException`
or inconsistent state, and index-based removal invites subtle bugs if the
loop is ever changed to continue past the first match. Consider using an
iterator with its `remove()` method instead:
```java
Iterator<ColumnFamilyDescriptor> iterator = descriptors.iterator();
while (iterator.hasNext()) {
  ColumnFamilyDescriptor desc = iterator.next();
  if (Arrays.equals(desc.getName(), handleNameBytes)) {
    descriptor = desc;
    iterator.remove();
    break;
  }
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]