dennishuo commented on code in PR #2223:
URL: https://github.com/apache/polaris/pull/2223#discussion_r2366654356


##########
runtime/service/src/main/java/org/apache/polaris/service/admin/PolarisAdminService.java:
##########
@@ -1737,6 +1761,80 @@ public boolean revokePrivilegeOnNamespaceFromRole(
         .isSuccess();
   }
 
+  /**
+   * Creates and persists the missing synthetic namespace entities for external catalogs.
+   *
+   * @param catalogEntity the external passthrough facade catalog entity.
+   * @param namespace the expected fully resolved namespace to be created.
+   * @param existingPath the partially resolved path currently stored in the metastore.
+   * @return the fully resolved path wrapper.
+   */
+  private PolarisResolvedPathWrapper createSyntheticNamespaceEntities(
+      CatalogEntity catalogEntity, Namespace namespace, PolarisResolvedPathWrapper existingPath) {
+
+    if (existingPath == null) {
+      throw new IllegalStateException(
+          String.format("Catalog entity %s does not exist.", catalogEntity.getName()));
+    }
+
+    List<PolarisEntity> completePath = new ArrayList<>(existingPath.getRawFullPath());
+    PolarisEntity currentParent = existingPath.getRawLeafEntity();
+
+    String[] allNamespaceLevels = namespace.levels();
+    int numMatchingLevels = 0;
+    // Find parts of the complete path that match the namespace levels.
+    // We skip index 0 because it is the CatalogEntity.
+    for (PolarisEntity entity : completePath.subList(1, completePath.size())) {
+      if (!entity.getName().equals(allNamespaceLevels[numMatchingLevels])) {
+        break;
+      }
+      numMatchingLevels++;
+    }
+
+    for (int i = numMatchingLevels; i < allNamespaceLevels.length; i++) {
+      String[] namespacePart = Arrays.copyOfRange(allNamespaceLevels, 0, i + 1);
+      String leafNamespace = namespacePart[namespacePart.length - 1];
+      Namespace currentNamespace = Namespace.of(namespacePart);
+
+      // TODO: Instead of creating synthetic entities, rely on external catalog mediated backfill.
+      PolarisEntity syntheticNamespace =
+          new NamespaceEntity.Builder(currentNamespace)
+              .setId(metaStoreManager.generateNewEntityId(getCurrentPolarisContext()).getId())
+              .setCatalogId(catalogEntity.getId())
+              .setParentId(currentParent.getId())
+              .setCreateTimestamp(System.currentTimeMillis())
+              .build();

Review Comment:
   We should probably sort out how the term "synthetic" relates to
"passthrough facade". "Synthetic" only describes the process by which the
entity is created; the "type" we want to identify later is that the entity
represents a "passthrough facade", even if in the future its origin may or may
not be considered "synthetic" in the same way.
   
   To answer your question: right now the code effectively defines *all*
entities that exist within an ExternalCatalog to be passthrough facades, so an
entity inherits that identity from its parent catalog. We could also set an
explicit flag that marks an entity as a passthrough facade, but that probably
needs more design discussion first. Identifying entities individually would
mostly matter in two potential scenarios:
   
   1. *If* we support "mixed-mode" catalogs, where some entities are natively
owned by Polaris while others are essentially "symlinks" to remote tables. Then
we would presumably need to know whether any given table is local or remote. By
default, we *probably* don't want to support this use case unless there is a
really compelling reason.
   2. A variation of (1): supporting gradual "migration" scenarios, where we
want to incrementally transfer individual tables so that the local Polaris
server becomes their source of truth. This is similar to a "mixed-mode"
catalog; the key difference is that the intermediate states would be much more
constrained, and all the tables "originally" came from the same remote catalog.
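   
   To make the distinction concrete, here is a rough sketch of the two ways an
entity's identity could be determined. Everything in it (`FacadeIdentitySketch`,
`Origin`, both methods) is hypothetical and not existing Polaris code; it only
assumes that we can tell whether the parent catalog is external.

```java
/** Hypothetical sketch; none of these names exist in Polaris. */
final class FacadeIdentitySketch {

  enum Origin {
    LOCAL,
    PASSTHROUGH_FACADE
  }

  /** Behavior as described above: identity is inherited from the parent catalog. */
  static Origin inheritedOrigin(boolean catalogIsExternal) {
    return catalogIsExternal ? Origin.PASSTHROUGH_FACADE : Origin.LOCAL;
  }

  /**
   * Assumed mixed-mode / migration variant: an explicit per-entity marker, when
   * present, overrides the catalog-level default. A null marker means "not set".
   */
  static Origin explicitOrigin(boolean catalogIsExternal, Origin entityMarker) {
    return entityMarker != null ? entityMarker : inheritedOrigin(catalogIsExternal);
  }
}
```

   In the migration scenario, a table that has already been moved over would
carry an explicit `LOCAL` marker even though its catalog is still external.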
   
   The basic intent was for a Catalog's ConnectionConfig to be resolved
hierarchically, the same way the StorageConfig is technically resolved
hierarchically: if it is present on a parent namespace or on the table itself,
the "nearest" definition wins. So one possibility is to set the
ConnectionConfig directly on the table to indicate that it's a passthrough
facade.
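   
   As a minimal sketch of what "nearest definition wins" could look like,
assuming a resolved path ordered from the catalog down to the leaf: the `Node`
type and `resolveConnectionConfig` below are made up for illustration and are
not the actual Polaris resolution code, and the config is reduced to a plain
string.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch; none of these types are Polaris classes. */
final class NearestConfigSketch {

  /** Simplified entity: a name plus a connection config optionally set directly on it. */
  static final class Node {
    final String name;
    final String connectionConfig; // null when nothing is set at this level

    Node(String name, String connectionConfig) {
      this.name = name;
      this.connectionConfig = connectionConfig;
    }
  }

  /**
   * Walks the path from the leaf back toward the catalog and returns the first
   * config found, so the definition nearest to the leaf wins.
   */
  static Optional<String> resolveConnectionConfig(List<Node> pathFromCatalogToLeaf) {
    for (int i = pathFromCatalogToLeaf.size() - 1; i >= 0; i--) {
      String config = pathFromCatalogToLeaf.get(i).connectionConfig;
      if (config != null) {
        return Optional.of(config);
      }
    }
    return Optional.empty();
  }
}
```

   Under this scheme, a table with its own ConnectionConfig resolves to that
config regardless of what its namespaces or catalog define, which is what would
let a table-level ConnectionConfig double as the passthrough facade marker.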



