Copilot commented on code in PR #9574:
URL: https://github.com/apache/gravitino/pull/9574#discussion_r2647850140
##########
core/src/main/java/org/apache/gravitino/metalake/MetalakeManager.java:
##########
@@ -342,25 +344,43 @@ public void enableMetalake(NameIdentifier ident) throws
NoSuchMetalakeException
() -> {
try {
boolean inUse = metalakeInUse(store, ident);
- if (!inUse) {
- store.update(
- ident,
- BaseMetalake.class,
- EntityType.METALAKE,
- metalake -> {
- BaseMetalake.Builder builder =
newMetalakeBuilder(metalake);
-
- Map<String, String> newProps =
- metalake.properties() == null
- ? Maps.newHashMap()
- : Maps.newHashMap(metalake.properties());
- newProps.put(PROPERTY_IN_USE, "true");
- builder.withProperties(newProps);
-
- return builder.build();
- });
+ if (inUse) {
+ return null;
}
+ store.update(
+ ident,
+ BaseMetalake.class,
+ EntityType.METALAKE,
+ metalake -> {
+ BaseMetalake.Builder builder = newMetalakeBuilder(metalake);
+
+ Map<String, String> newProps =
+ metalake.properties() == null
+ ? Maps.newHashMap()
+ : Maps.newHashMap(metalake.properties());
+ newProps.put(PROPERTY_IN_USE, "true");
+ builder.withProperties(newProps);
+
+ return builder.build();
+ });
+
+ // The only problem is that we can't make sure we can change all
catalog properties
+ // in a transaction, if any of them fails, the metalake is already
enabled and the value
+ // in catalog is inconsistent.
+ store
+ .list(Namespace.of(ident.name()), CatalogEntity.class,
EntityType.CATALOG)
+ .forEach(
+ catalogEntity -> {
+ // update the properties metalake-in-use in catalog to
false
+ GravitinoEnv.getInstance()
+ .catalogManager()
+ .updateCatalogProperty(
+ catalogEntity.nameIdentifier(),
+ Catalog.PROPERTY_METALAKE_IN_USE,
+ "false");
+ });
Review Comment:
Error handling in the forEach loop is insufficient. If
`updateCatalogProperty` throws a RuntimeException (as seen in lines 600, 607,
and 610 of CatalogManager), it will abort the loop and leave some catalogs
updated while others are not. The metalake would be in enabled/disabled state
but only some catalogs would have the correct property values. Consider
collecting all errors and handling them appropriately, or implementing proper
transaction management.
##########
core/src/main/java/org/apache/gravitino/metalake/MetalakeManager.java:
##########
@@ -342,25 +344,43 @@ public void enableMetalake(NameIdentifier ident) throws
NoSuchMetalakeException
() -> {
try {
boolean inUse = metalakeInUse(store, ident);
- if (!inUse) {
- store.update(
- ident,
- BaseMetalake.class,
- EntityType.METALAKE,
- metalake -> {
- BaseMetalake.Builder builder =
newMetalakeBuilder(metalake);
-
- Map<String, String> newProps =
- metalake.properties() == null
- ? Maps.newHashMap()
- : Maps.newHashMap(metalake.properties());
- newProps.put(PROPERTY_IN_USE, "true");
- builder.withProperties(newProps);
-
- return builder.build();
- });
+ if (inUse) {
+ return null;
}
+ store.update(
+ ident,
+ BaseMetalake.class,
+ EntityType.METALAKE,
+ metalake -> {
+ BaseMetalake.Builder builder = newMetalakeBuilder(metalake);
+
+ Map<String, String> newProps =
+ metalake.properties() == null
+ ? Maps.newHashMap()
+ : Maps.newHashMap(metalake.properties());
+ newProps.put(PROPERTY_IN_USE, "true");
+ builder.withProperties(newProps);
+
+ return builder.build();
+ });
+
+ // The only problem is that we can't make sure we can change all
catalog properties
+ // in a transaction, if any of them fails, the metalake is already
enabled and the value
+ // in catalog is inconsistent.
Review Comment:
The comment on line 369 mentions "the metalake is already enabled" which is
correct for the enableMetalake method, but it should say "the metalake is
already enabled and the value in catalog is inconsistent" is slightly
misleading. The issue is that if updating catalog properties fails after the
metalake is already enabled, the state becomes inconsistent. Consider
clarifying this as "if any catalog property update fails, the metalake is
already enabled but catalog properties remain inconsistent."
```suggestion
// in a transaction. If any catalog property update fails, the
metalake is already
// enabled but catalog properties remain inconsistent.
```
##########
core/src/main/java/org/apache/gravitino/metalake/MetalakeManager.java:
##########
@@ -342,25 +344,43 @@ public void enableMetalake(NameIdentifier ident) throws
NoSuchMetalakeException
() -> {
try {
boolean inUse = metalakeInUse(store, ident);
- if (!inUse) {
- store.update(
- ident,
- BaseMetalake.class,
- EntityType.METALAKE,
- metalake -> {
- BaseMetalake.Builder builder =
newMetalakeBuilder(metalake);
-
- Map<String, String> newProps =
- metalake.properties() == null
- ? Maps.newHashMap()
- : Maps.newHashMap(metalake.properties());
- newProps.put(PROPERTY_IN_USE, "true");
- builder.withProperties(newProps);
-
- return builder.build();
- });
+ if (inUse) {
+ return null;
}
+ store.update(
+ ident,
+ BaseMetalake.class,
+ EntityType.METALAKE,
+ metalake -> {
+ BaseMetalake.Builder builder = newMetalakeBuilder(metalake);
+
+ Map<String, String> newProps =
+ metalake.properties() == null
+ ? Maps.newHashMap()
+ : Maps.newHashMap(metalake.properties());
+ newProps.put(PROPERTY_IN_USE, "true");
+ builder.withProperties(newProps);
+
+ return builder.build();
+ });
+
+ // The only problem is that we can't make sure we can change all
catalog properties
+ // in a transaction, if any of them fails, the metalake is already
enabled and the value
+ // in catalog is inconsistent.
+ store
+ .list(Namespace.of(ident.name()), CatalogEntity.class,
EntityType.CATALOG)
+ .forEach(
+ catalogEntity -> {
+ // update the properties metalake-in-use in catalog to
false
+ GravitinoEnv.getInstance()
+ .catalogManager()
+ .updateCatalogProperty(
+ catalogEntity.nameIdentifier(),
+ Catalog.PROPERTY_METALAKE_IN_USE,
+ "false");
Review Comment:
The comment on line 375 says "update the properties metalake-in-use in
catalog to false" but the actual code is setting it to "false" when enabling
the metalake. This is incorrect logic - when enabling a metalake, the
`PROPERTY_METALAKE_IN_USE` property for all its catalogs should be set to
"true", not "false". This appears to be a copy-paste error from the
disableMetalake method.
```suggestion
// update the properties metalake-in-use in catalog to
true
GravitinoEnv.getInstance()
.catalogManager()
.updateCatalogProperty(
catalogEntity.nameIdentifier(),
Catalog.PROPERTY_METALAKE_IN_USE,
"true");
```
##########
core/src/main/java/org/apache/gravitino/catalog/CatalogManager.java:
##########
@@ -591,6 +573,44 @@ public void enableCatalog(NameIdentifier ident)
});
}
+ public void updateCatalogProperty(
+ NameIdentifier nameIdentifier, String propertyKey, String propertyValue)
{
+ try {
+ store.update(
+ nameIdentifier,
+ CatalogEntity.class,
+ EntityType.CATALOG,
+ catalog -> {
+ CatalogEntity.Builder newCatalogBuilder =
+ newCatalogBuilder(nameIdentifier.namespace(), catalog);
+
+ Map<String, String> newProps =
+ catalog.getProperties() == null
+ ? new HashMap<>()
+ : new HashMap<>(catalog.getProperties());
+ newProps.put(propertyKey, propertyValue);
+ newCatalogBuilder.withProperties(newProps);
+
+ return newCatalogBuilder.build();
+ });
+ catalogCache.invalidate(nameIdentifier);
+
+ } catch (NoSuchCatalogException e) {
+ LOG.error("Catalog {} does not exist", nameIdentifier, e);
+ throw new RuntimeException(e);
+ } catch (IllegalArgumentException e) {
+ LOG.error(
+ "Failed to update catalog {} property {} with unknown change",
+ nameIdentifier,
+ propertyKey,
+ e);
+ throw e;
+ } catch (IOException ioe) {
+ LOG.error("Failed to update catalog {} property {}", nameIdentifier,
propertyKey, ioe);
+ throw new RuntimeException(ioe);
+ }
+ }
Review Comment:
The new `updateCatalogProperty` method lacks test coverage. Given that this
is a critical method that modifies catalog state and is called during metalake
enable/disable operations, it should have comprehensive unit tests covering:
successful updates, catalog not found scenarios, IO exceptions, and cache
invalidation behavior.
##########
core/src/main/java/org/apache/gravitino/catalog/OperationDispatcherInterceptor.java:
##########
@@ -87,6 +91,7 @@ public Object invoke(Object proxy, Method method, Object[]
args) throws Throwabl
}
if (catalogIdent != null) {
+ @SuppressWarnings("unused")
final NameIdentifier finalCatalogIdent = catalogIdent;
Review Comment:
The variable `finalCatalogIdent` is declared but never used after the
removal of the `checkCatalogInUse` call on line 108. While suppressing the
unused warning is correct, consider removing this unused variable declaration
entirely to keep the code clean.
```suggestion
```
##########
core/src/main/java/org/apache/gravitino/connector/BaseCatalogPropertiesMetadata.java:
##########
@@ -88,7 +88,12 @@ protected Map<String, PropertyEntry<?>>
specificPropertyEntries() {
PROPERTY_IN_USE,
"The property indicating the catalog is in use",
true /* default value */,
- false /* hidden */)),
+ false /* hidden */),
+ PropertyEntry.booleanReservedPropertyEntry(
+ PROPERTY_IN_USE,
+ "The property indicating the metalake that holds the catalog
is in use",
+ true /* default value */,
+ true /* hidden */)),
Review Comment:
There is a duplicate property entry with the same key `PROPERTY_IN_USE`.
Lines 87-91 define one entry for "The property indicating the catalog is in
use", and lines 92-96 define another entry with the same key `PROPERTY_IN_USE`
but different description "The property indicating the metalake that holds the
catalog is in use".
The second entry on lines 92-96 should use `PROPERTY_METALAKE_IN_USE` (which
was just added to the Catalog interface) instead of `PROPERTY_IN_USE`. When
using `Maps.uniqueIndex()` with `PropertyEntry::getName`, duplicate keys will
cause only one entry to be retained, making this a critical bug.
##########
core/src/main/java/org/apache/gravitino/connector/BaseCatalog.java:
##########
@@ -376,4 +379,21 @@ public Audit auditInfo() {
private CatalogOperations asProxyOps(CatalogOperations ops, ProxyPlugin
plugin) {
return OperationsProxy.createProxy(ops, plugin);
}
+
+ private void checkMetalakeAndCatalogInUse(CatalogEntity catalogEntity) {
+ boolean metalakeInuse =
+ Boolean.parseBoolean(
+
catalogEntity.getProperties().getOrDefault(Catalog.PROPERTY_METALAKE_IN_USE,
"true"));
+ if (!metalakeInuse) {
+ throw new MetalakeNotInUseException(
+ String.format("The metalake that holds catalog %s is not in use",
catalogEntity.name()));
+ }
+
+ boolean catalogInuse =
+
Boolean.parseBoolean(catalogEntity.getProperties().getOrDefault(PROPERTY_IN_USE,
"true"));
+ if (!catalogInuse) {
+ throw new MetalakeNotInUseException(
+ String.format("The catalog %s is not in use", catalogEntity.name()));
+ }
Review Comment:
The exception type is incorrect. When a catalog is not in use, the code
throws `MetalakeNotInUseException` with message "The catalog %s is not in use".
This should throw `CatalogNotInUseException` instead, as it's specifically
about the catalog being disabled, not the metalake. Using the wrong exception
type will confuse API consumers and make error handling inconsistent.
##########
core/src/main/java/org/apache/gravitino/connector/BaseCatalog.java:
##########
@@ -376,4 +379,21 @@ public Audit auditInfo() {
private CatalogOperations asProxyOps(CatalogOperations ops, ProxyPlugin
plugin) {
return OperationsProxy.createProxy(ops, plugin);
}
+
+ private void checkMetalakeAndCatalogInUse(CatalogEntity catalogEntity) {
+ boolean metalakeInuse =
+ Boolean.parseBoolean(
+
catalogEntity.getProperties().getOrDefault(Catalog.PROPERTY_METALAKE_IN_USE,
"true"));
+ if (!metalakeInuse) {
+ throw new MetalakeNotInUseException(
+ String.format("The metalake that holds catalog %s is not in use",
catalogEntity.name()));
+ }
+
+ boolean catalogInuse =
+
Boolean.parseBoolean(catalogEntity.getProperties().getOrDefault(PROPERTY_IN_USE,
"true"));
+ if (!catalogInuse) {
+ throw new MetalakeNotInUseException(
+ String.format("The catalog %s is not in use", catalogEntity.name()));
+ }
+ }
Review Comment:
The new `checkMetalakeAndCatalogInUse` method in BaseCatalog lacks test
coverage. This critical validation logic that throws exceptions when catalogs
or metalakes are not in use should be thoroughly tested to ensure it correctly
validates both properties and throws the appropriate exceptions.
##########
core/src/main/java/org/apache/gravitino/catalog/CatalogManager.java:
##########
@@ -591,6 +573,44 @@ public void enableCatalog(NameIdentifier ident)
});
}
+ public void updateCatalogProperty(
+ NameIdentifier nameIdentifier, String propertyKey, String propertyValue)
{
+ try {
+ store.update(
+ nameIdentifier,
+ CatalogEntity.class,
+ EntityType.CATALOG,
+ catalog -> {
+ CatalogEntity.Builder newCatalogBuilder =
+ newCatalogBuilder(nameIdentifier.namespace(), catalog);
+
+ Map<String, String> newProps =
+ catalog.getProperties() == null
+ ? new HashMap<>()
+ : new HashMap<>(catalog.getProperties());
+ newProps.put(propertyKey, propertyValue);
+ newCatalogBuilder.withProperties(newProps);
+
+ return newCatalogBuilder.build();
+ });
+ catalogCache.invalidate(nameIdentifier);
+
+ } catch (NoSuchCatalogException e) {
+ LOG.error("Catalog {} does not exist", nameIdentifier, e);
+ throw new RuntimeException(e);
+ } catch (IllegalArgumentException e) {
+ LOG.error(
+ "Failed to update catalog {} property {} with unknown change",
+ nameIdentifier,
+ propertyKey,
+ e);
+ throw e;
+ } catch (IOException ioe) {
+ LOG.error("Failed to update catalog {} property {}", nameIdentifier,
propertyKey, ioe);
+ throw new RuntimeException(ioe);
+ }
Review Comment:
The `updateCatalogProperty` method lacks proper locking mechanism. Unlike
`enableCatalog` and `disableCatalog` which use
`TreeLockUtils.doWithTreeLock()`, this method directly updates the catalog
without acquiring locks. This can lead to race conditions when multiple threads
try to update the same catalog's properties, especially during metalake
enable/disable operations. The method should acquire appropriate tree locks
before performing the update.
```suggestion
NameIdentifier metalakeIdent =
NameIdentifier.of(nameIdentifier.namespace().levels());
TreeLockUtils.doWithTreeLock(
metalakeIdent,
LockType.WRITE,
() -> {
try {
store.update(
nameIdentifier,
CatalogEntity.class,
EntityType.CATALOG,
catalog -> {
CatalogEntity.Builder newCatalogBuilder =
newCatalogBuilder(nameIdentifier.namespace(), catalog);
Map<String, String> newProps =
catalog.getProperties() == null
? new HashMap<>()
: new HashMap<>(catalog.getProperties());
newProps.put(propertyKey, propertyValue);
newCatalogBuilder.withProperties(newProps);
return newCatalogBuilder.build();
});
catalogCache.invalidate(nameIdentifier);
return null;
} catch (NoSuchCatalogException e) {
LOG.error("Catalog {} does not exist", nameIdentifier, e);
throw new RuntimeException(e);
} catch (IllegalArgumentException e) {
LOG.error(
"Failed to update catalog {} property {} with unknown
change",
nameIdentifier,
propertyKey,
e);
throw e;
} catch (IOException ioe) {
LOG.error("Failed to update catalog {} property {}",
nameIdentifier, propertyKey, ioe);
throw new RuntimeException(ioe);
}
});
```
##########
core/src/main/java/org/apache/gravitino/metalake/MetalakeManager.java:
##########
@@ -376,24 +396,42 @@ public void disableMetalake(NameIdentifier ident) throws
NoSuchMetalakeException
() -> {
try {
boolean inUse = metalakeInUse(store, ident);
- if (inUse) {
- store.update(
- ident,
- BaseMetalake.class,
- EntityType.METALAKE,
- metalake -> {
- BaseMetalake.Builder builder =
newMetalakeBuilder(metalake);
-
- Map<String, String> newProps =
- metalake.properties() == null
- ? Maps.newHashMap()
- : Maps.newHashMap(metalake.properties());
- newProps.put(PROPERTY_IN_USE, "false");
- builder.withProperties(newProps);
-
- return builder.build();
- });
+ if (!inUse) {
+ return null;
}
+
+ store.update(
+ ident,
+ BaseMetalake.class,
+ EntityType.METALAKE,
+ metalake -> {
+ BaseMetalake.Builder builder = newMetalakeBuilder(metalake);
+
+ Map<String, String> newProps =
+ metalake.properties() == null
+ ? Maps.newHashMap()
+ : Maps.newHashMap(metalake.properties());
+ newProps.put(PROPERTY_IN_USE, "false");
+ builder.withProperties(newProps);
+
+ return builder.build();
+ });
+
+ // The only problem is that we can't make sure we can change all
catalog properties
+ // in a transaction, if any of them fails, the metalake is already
enabled and the value
+ // in catalog is inconsistent.
+ store
+ .list(Namespace.of(ident.name()), CatalogEntity.class,
EntityType.CATALOG)
+ .forEach(
+ catalogEntity -> {
+ // update the properties metalake-in-use in catalog to
false
+ GravitinoEnv.getInstance()
+ .catalogManager()
+ .updateCatalogProperty(
+ catalogEntity.nameIdentifier(),
+ Catalog.PROPERTY_METALAKE_IN_USE,
+ "false");
+ });
Review Comment:
Similar to the enableMetalake method, error handling in the forEach loop for
disableMetalake is insufficient. If `updateCatalogProperty` throws an
exception, it will leave the system in an inconsistent state with the metalake
disabled but only some catalogs having updated properties.
```suggestion
// in a transaction. To avoid leaving the system in an
inconsistent state when a
// catalog update fails, we attempt to roll back the metalake
state.
List<CatalogEntity> catalogs =
store.list(Namespace.of(ident.name()), CatalogEntity.class,
EntityType.CATALOG);
try {
for (CatalogEntity catalogEntity : catalogs) {
// update the properties metalake-in-use in catalog to false
GravitinoEnv.getInstance()
.catalogManager()
.updateCatalogProperty(
catalogEntity.nameIdentifier(),
Catalog.PROPERTY_METALAKE_IN_USE,
"false");
}
} catch (RuntimeException e) {
// Best-effort rollback: re-enable the metalake if catalog
updates fail
try {
store.update(
ident,
BaseMetalake.class,
EntityType.METALAKE,
metalake -> {
BaseMetalake.Builder builder =
newMetalakeBuilder(metalake);
Map<String, String> rollbackProps =
metalake.properties() == null
? Maps.newHashMap()
: Maps.newHashMap(metalake.properties());
rollbackProps.put(PROPERTY_IN_USE, "true");
builder.withProperties(rollbackProps);
return builder.build();
});
} catch (IOException | RuntimeException rollbackEx) {
e.addSuppressed(rollbackEx);
}
throw e;
}
```
##########
core/src/main/java/org/apache/gravitino/metalake/MetalakeManager.java:
##########
@@ -376,24 +396,42 @@ public void disableMetalake(NameIdentifier ident) throws
NoSuchMetalakeException
() -> {
try {
boolean inUse = metalakeInUse(store, ident);
- if (inUse) {
- store.update(
- ident,
- BaseMetalake.class,
- EntityType.METALAKE,
- metalake -> {
- BaseMetalake.Builder builder =
newMetalakeBuilder(metalake);
-
- Map<String, String> newProps =
- metalake.properties() == null
- ? Maps.newHashMap()
- : Maps.newHashMap(metalake.properties());
- newProps.put(PROPERTY_IN_USE, "false");
- builder.withProperties(newProps);
-
- return builder.build();
- });
+ if (!inUse) {
+ return null;
}
+
+ store.update(
+ ident,
+ BaseMetalake.class,
+ EntityType.METALAKE,
+ metalake -> {
+ BaseMetalake.Builder builder = newMetalakeBuilder(metalake);
+
+ Map<String, String> newProps =
+ metalake.properties() == null
+ ? Maps.newHashMap()
+ : Maps.newHashMap(metalake.properties());
+ newProps.put(PROPERTY_IN_USE, "false");
+ builder.withProperties(newProps);
+
+ return builder.build();
+ });
+
+ // The only problem is that we can't make sure we can change all
catalog properties
+ // in a transaction, if any of them fails, the metalake is already
enabled and the value
Review Comment:
The comment on line 421 mentions "the metalake is already enabled" but this
is in the `disableMetalake` method, so it should say "the metalake is already
disabled". This makes the comment misleading about the transaction consistency
issue being described.
```suggestion
// in a transaction, if any of them fails, the metalake is
already disabled and the value
```
##########
core/src/main/java/org/apache/gravitino/metalake/MetalakeManager.java:
##########
@@ -342,25 +344,43 @@ public void enableMetalake(NameIdentifier ident) throws
NoSuchMetalakeException
() -> {
try {
boolean inUse = metalakeInUse(store, ident);
- if (!inUse) {
- store.update(
- ident,
- BaseMetalake.class,
- EntityType.METALAKE,
- metalake -> {
- BaseMetalake.Builder builder =
newMetalakeBuilder(metalake);
-
- Map<String, String> newProps =
- metalake.properties() == null
- ? Maps.newHashMap()
- : Maps.newHashMap(metalake.properties());
- newProps.put(PROPERTY_IN_USE, "true");
- builder.withProperties(newProps);
-
- return builder.build();
- });
+ if (inUse) {
+ return null;
}
+ store.update(
+ ident,
+ BaseMetalake.class,
+ EntityType.METALAKE,
+ metalake -> {
+ BaseMetalake.Builder builder = newMetalakeBuilder(metalake);
+
+ Map<String, String> newProps =
+ metalake.properties() == null
+ ? Maps.newHashMap()
+ : Maps.newHashMap(metalake.properties());
+ newProps.put(PROPERTY_IN_USE, "true");
+ builder.withProperties(newProps);
+
+ return builder.build();
+ });
+
+ // The only problem is that we can't make sure we can change all
catalog properties
+ // in a transaction, if any of them fails, the metalake is already
enabled and the value
+ // in catalog is inconsistent.
+ store
+ .list(Namespace.of(ident.name()), CatalogEntity.class,
EntityType.CATALOG)
+ .forEach(
+ catalogEntity -> {
+ // update the properties metalake-in-use in catalog to
false
+ GravitinoEnv.getInstance()
+ .catalogManager()
+ .updateCatalogProperty(
+ catalogEntity.nameIdentifier(),
+ Catalog.PROPERTY_METALAKE_IN_USE,
+ "false");
+ });
Review Comment:
The transaction consistency issue mentioned in the comments (lines 368-370
and 420-422) represents a real data integrity problem. If any catalog property
update fails after the metalake state has been updated, the system will be in
an inconsistent state. Consider wrapping the metalake update and all catalog
updates in a single transaction, or implementing a rollback mechanism to revert
the metalake state if catalog updates fail. Alternatively, update catalog
properties first before changing the metalake state, and document this ordering
decision.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]