HonahX commented on code in PR #2223:
URL: https://github.com/apache/polaris/pull/2223#discussion_r2369414815
##########
runtime/service/src/main/java/org/apache/polaris/service/admin/PolarisAdminService.java:
##########
@@ -1737,6 +1761,80 @@ public boolean revokePrivilegeOnNamespaceFromRole(
.isSuccess();
}
+ /**
+ * Creates and persists the missing synthetic namespace entities for
external catalogs.
+ *
+ * @param catalogEntity the external passthrough facade catalog entity.
+ * @param namespace the expected fully resolved namespace to be created.
+ * @param existingPath the partially resolved path currently stored in the
metastore.
+ * @return the fully resolved path wrapper.
+ */
+ private PolarisResolvedPathWrapper createSyntheticNamespaceEntities(
+ CatalogEntity catalogEntity, Namespace namespace,
PolarisResolvedPathWrapper existingPath) {
+
+ if (existingPath == null) {
+ throw new IllegalStateException(
+ String.format("Catalog entity %s does not exist.",
catalogEntity.getName()));
+ }
+
+ List<PolarisEntity> completePath = new
ArrayList<>(existingPath.getRawFullPath());
+ PolarisEntity currentParent = existingPath.getRawLeafEntity();
+
+ String[] allNamespaceLevels = namespace.levels();
+ int numMatchingLevels = 0;
+ // Find parts of the complete path that match the namespace levels.
+ // We skip index 0 because it is the CatalogEntity.
+ for (PolarisEntity entity : completePath.subList(1, completePath.size())) {
+ if (!entity.getName().equals(allNamespaceLevels[numMatchingLevels])) {
+ break;
+ }
+ numMatchingLevels++;
+ }
+
+ for (int i = numMatchingLevels; i < allNamespaceLevels.length; i++) {
+ String[] namespacePart = Arrays.copyOfRange(allNamespaceLevels, 0, i +
1);
+ String leafNamespace = namespacePart[namespacePart.length - 1];
+ Namespace currentNamespace = Namespace.of(namespacePart);
+
+ // TODO: Instead of creating synthetic entitties, rely on external
catalog mediated backfill.
+ PolarisEntity syntheticNamespace =
+ new NamespaceEntity.Builder(currentNamespace)
+
.setId(metaStoreManager.generateNewEntityId(getCurrentPolarisContext()).getId())
+ .setCatalogId(catalogEntity.getId())
+ .setParentId(currentParent.getId())
+ .setCreateTimestamp(System.currentTimeMillis())
+ .build();
+
+ EntityResult result =
+ metaStoreManager.createEntityIfNotExists(
+ getCurrentPolarisContext(),
+ PolarisEntity.toCoreList(completePath),
+ syntheticNamespace);
+
+ if (result.isSuccess()) {
+ syntheticNamespace = PolarisEntity.of(result.getEntity());
+ } else {
+ Namespace partialNamespace =
Namespace.of(Arrays.copyOf(allNamespaceLevels, i + 1));
+ PolarisResolvedPathWrapper partialPath =
+ resolutionManifest.getResolvedPath(partialNamespace);
+ PolarisEntity partialLeafEntity = partialPath.getRawLeafEntity();
+ if (partialLeafEntity == null
+ || !(partialLeafEntity.getName().equals(leafNamespace)
+ && partialLeafEntity.getType() ==
PolarisEntityType.NAMESPACE)) {
+ throw new RuntimeException(
+ String.format(
+ "Failed to create or find namespace entity '%s' in federated
catalog '%s'",
+ leafNamespace, catalogEntity.getName()));
+ }
Review Comment:
Thanks @dennishuo for the detailed explanation!
> I believe it's a bug - line 1819 should be calling
getPassthroughResolvedPath
+1, IMHO, the overall order might be
- resolutionManifest tries to resolve the path
- some or all JIT entities not found, backfill triggered.
- resolutionManifest getPassthroughPath to resolve the path again
- Proceed if the whole path resolved, fail with
`UnkownStateException`/`ConcurrentModification`... if not
We could try to see if it worth to add some option to make the batch
creation api, `createEntitiesIfNotExist` not fail on
`EntityAlreadyExistException` but simply return the entity if it is already
there, this way we could push down the actual creation into the persistence
layer instead of in the for-loop. But this may need broader discussion as this
is persistence API change.
##########
runtime/service/src/main/java/org/apache/polaris/service/admin/PolarisAdminService.java:
##########
@@ -1737,6 +1761,80 @@ public boolean revokePrivilegeOnNamespaceFromRole(
.isSuccess();
}
+ /**
+ * Creates and persists the missing synthetic namespace entities for
external catalogs.
+ *
+ * @param catalogEntity the external passthrough facade catalog entity.
+ * @param namespace the expected fully resolved namespace to be created.
+ * @param existingPath the partially resolved path currently stored in the
metastore.
+ * @return the fully resolved path wrapper.
+ */
+ private PolarisResolvedPathWrapper createSyntheticNamespaceEntities(
+ CatalogEntity catalogEntity, Namespace namespace,
PolarisResolvedPathWrapper existingPath) {
+
+ if (existingPath == null) {
+ throw new IllegalStateException(
+ String.format("Catalog entity %s does not exist.",
catalogEntity.getName()));
+ }
+
+ List<PolarisEntity> completePath = new
ArrayList<>(existingPath.getRawFullPath());
+ PolarisEntity currentParent = existingPath.getRawLeafEntity();
+
+ String[] allNamespaceLevels = namespace.levels();
+ int numMatchingLevels = 0;
+ // Find parts of the complete path that match the namespace levels.
+ // We skip index 0 because it is the CatalogEntity.
+ for (PolarisEntity entity : completePath.subList(1, completePath.size())) {
+ if (!entity.getName().equals(allNamespaceLevels[numMatchingLevels])) {
+ break;
+ }
+ numMatchingLevels++;
+ }
+
+ for (int i = numMatchingLevels; i < allNamespaceLevels.length; i++) {
+ String[] namespacePart = Arrays.copyOfRange(allNamespaceLevels, 0, i +
1);
+ String leafNamespace = namespacePart[namespacePart.length - 1];
+ Namespace currentNamespace = Namespace.of(namespacePart);
+
+ // TODO: Instead of creating synthetic entitties, rely on external
catalog mediated backfill.
+ PolarisEntity syntheticNamespace =
+ new NamespaceEntity.Builder(currentNamespace)
+
.setId(metaStoreManager.generateNewEntityId(getCurrentPolarisContext()).getId())
+ .setCatalogId(catalogEntity.getId())
+ .setParentId(currentParent.getId())
+ .setCreateTimestamp(System.currentTimeMillis())
+ .build();
+
+ EntityResult result =
+ metaStoreManager.createEntityIfNotExists(
+ getCurrentPolarisContext(),
+ PolarisEntity.toCoreList(completePath),
+ syntheticNamespace);
+
+ if (result.isSuccess()) {
+ syntheticNamespace = PolarisEntity.of(result.getEntity());
+ } else {
+ Namespace partialNamespace =
Namespace.of(Arrays.copyOf(allNamespaceLevels, i + 1));
+ PolarisResolvedPathWrapper partialPath =
+ resolutionManifest.getResolvedPath(partialNamespace);
+ PolarisEntity partialLeafEntity = partialPath.getRawLeafEntity();
+ if (partialLeafEntity == null
+ || !(partialLeafEntity.getName().equals(leafNamespace)
+ && partialLeafEntity.getType() ==
PolarisEntityType.NAMESPACE)) {
+ throw new RuntimeException(
+ String.format(
+ "Failed to create or find namespace entity '%s' in federated
catalog '%s'",
+ leafNamespace, catalogEntity.getName()));
+ }
Review Comment:
Thanks @dennishuo for the detailed explanation!
> I believe it's a bug - line 1819 should be calling
getPassthroughResolvedPath
+1, IMHO, the overall order might be
- resolutionManifest tries to resolve the path
- some or all JIT entities not found, backfill triggered.
- resolutionManifest getPassthroughPath to resolve the path again
- Proceed if the whole path resolved, fail with
`UnkownStateException`/`ConcurrentModification`... if not
We could try to see if it worth to add some option to make the batch
creation api, `createEntitiesIfNotExist` not fail on
`EntityAlreadyExistException` but simply return the entity if it is already
there, this way we could push down the actual creation into the persistence
layer instead of in the for-loop. But this may need broader discussion as this
is persistence API change.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]