dennishuo commented on code in PR #2223:
URL: https://github.com/apache/polaris/pull/2223#discussion_r2366658329


##########
runtime/service/src/main/java/org/apache/polaris/service/admin/PolarisAdminService.java:
##########
@@ -1737,6 +1761,80 @@ public boolean revokePrivilegeOnNamespaceFromRole(
         .isSuccess();
   }
 
+  /**
+   * Creates and persists the missing synthetic namespace entities for 
external catalogs.
+   *
+   * @param catalogEntity the external passthrough facade catalog entity.
+   * @param namespace the expected fully resolved namespace to be created.
+   * @param existingPath the partially resolved path currently stored in the 
metastore.
+   * @return the fully resolved path wrapper.
+   */
+  private PolarisResolvedPathWrapper createSyntheticNamespaceEntities(
+      CatalogEntity catalogEntity, Namespace namespace, 
PolarisResolvedPathWrapper existingPath) {
+
+    if (existingPath == null) {
+      throw new IllegalStateException(
+          String.format("Catalog entity %s does not exist.", 
catalogEntity.getName()));
+    }
+
+    List<PolarisEntity> completePath = new 
ArrayList<>(existingPath.getRawFullPath());
+    PolarisEntity currentParent = existingPath.getRawLeafEntity();
+
+    String[] allNamespaceLevels = namespace.levels();
+    int numMatchingLevels = 0;
+    // Find parts of the complete path that match the namespace levels.
+    // We skip index 0 because it is the CatalogEntity.
+    for (PolarisEntity entity : completePath.subList(1, completePath.size())) {
+      if (!entity.getName().equals(allNamespaceLevels[numMatchingLevels])) {
+        break;
+      }
+      numMatchingLevels++;
+    }
+
+    for (int i = numMatchingLevels; i < allNamespaceLevels.length; i++) {
+      String[] namespacePart = Arrays.copyOfRange(allNamespaceLevels, 0, i + 
1);
+      String leafNamespace = namespacePart[namespacePart.length - 1];
+      Namespace currentNamespace = Namespace.of(namespacePart);
+
+      // TODO: Instead of creating synthetic entitties, rely on external 
catalog mediated backfill.
+      PolarisEntity syntheticNamespace =
+          new NamespaceEntity.Builder(currentNamespace)
+              
.setId(metaStoreManager.generateNewEntityId(getCurrentPolarisContext()).getId())
+              .setCatalogId(catalogEntity.getId())
+              .setParentId(currentParent.getId())
+              .setCreateTimestamp(System.currentTimeMillis())
+              .build();
+
+      EntityResult result =
+          metaStoreManager.createEntityIfNotExists(
+              getCurrentPolarisContext(),
+              PolarisEntity.toCoreList(completePath),
+              syntheticNamespace);
+
+      if (result.isSuccess()) {
+        syntheticNamespace = PolarisEntity.of(result.getEntity());
+      } else {
+        Namespace partialNamespace = 
Namespace.of(Arrays.copyOf(allNamespaceLevels, i + 1));
+        PolarisResolvedPathWrapper partialPath =
+            resolutionManifest.getResolvedPath(partialNamespace);
+        PolarisEntity partialLeafEntity = partialPath.getRawLeafEntity();
+        if (partialLeafEntity == null
+            || !(partialLeafEntity.getName().equals(leafNamespace)
+                && partialLeafEntity.getType() == 
PolarisEntityType.NAMESPACE)) {
+          throw new RuntimeException(
+              String.format(
+                  "Failed to create or find namespace entity '%s' in federated 
catalog '%s'",
+                  leafNamespace, catalogEntity.getName()));
+        }

Review Comment:
   Yeah good point about the resolution manifest. I believe it's a bug - line 
1819 should be calling `getPassthroughResolvedPath` instead of 
`getResolvedPath`.
   
   Re: serializable isolation, given the fact that the remote catalog itself is 
fundamentally non-atomic with anything we see on the local Polaris (in 
particular, there's no atomic multi-table+multi-namespace GET in Iceberg REST 
anyways), we're fundamentally shooting for more of an "eventual consistency" 
semantic in federated catalogs.
   
   While concurrent jit-creation might be semi-common due to different 
workloads trying to hit the same not-yet-JIT-created entity at the same time, 
it'd be more rare that a JIT-created entity also then gets dropped/unlinked 
locally within the race condition followed by the attempt to re-read the 
conflicting entity.
   
   So if I understand the race condition identified here correctly, it seems 
fine to throw some kind of UnknownStateException in this case where the re-read 
is `null` and have the caller figure out if they just want to retry or figure 
out the state manually first.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to