dennishuo commented on code in PR #2223:
URL: https://github.com/apache/polaris/pull/2223#discussion_r2366654356
##########
runtime/service/src/main/java/org/apache/polaris/service/admin/PolarisAdminService.java:
##########
@@ -1737,6 +1761,80 @@ public boolean revokePrivilegeOnNamespaceFromRole(
.isSuccess();
}
+ /**
+ * Creates and persists the missing synthetic namespace entities for external catalogs.
+ *
+ * @param catalogEntity the external passthrough facade catalog entity.
+ * @param namespace the expected fully resolved namespace to be created.
+ * @param existingPath the partially resolved path currently stored in the metastore.
+ * @return the fully resolved path wrapper.
+ */
+ private PolarisResolvedPathWrapper createSyntheticNamespaceEntities(
+ CatalogEntity catalogEntity, Namespace namespace, PolarisResolvedPathWrapper existingPath) {
+
+ if (existingPath == null) {
+ throw new IllegalStateException(
+ String.format("Catalog entity %s does not exist.", catalogEntity.getName()));
+ }
+
+ List<PolarisEntity> completePath = new ArrayList<>(existingPath.getRawFullPath());
+ PolarisEntity currentParent = existingPath.getRawLeafEntity();
+
+ String[] allNamespaceLevels = namespace.levels();
+ int numMatchingLevels = 0;
+ // Find parts of the complete path that match the namespace levels.
+ // We skip index 0 because it is the CatalogEntity.
+ for (PolarisEntity entity : completePath.subList(1, completePath.size())) {
+ if (!entity.getName().equals(allNamespaceLevels[numMatchingLevels])) {
+ break;
+ }
+ numMatchingLevels++;
+ }
+
+ for (int i = numMatchingLevels; i < allNamespaceLevels.length; i++) {
+ String[] namespacePart = Arrays.copyOfRange(allNamespaceLevels, 0, i + 1);
+ String leafNamespace = namespacePart[namespacePart.length - 1];
+ Namespace currentNamespace = Namespace.of(namespacePart);
+
+ // TODO: Instead of creating synthetic entities, rely on external catalog mediated backfill.
+ PolarisEntity syntheticNamespace =
+ new NamespaceEntity.Builder(currentNamespace)
+ .setId(metaStoreManager.generateNewEntityId(getCurrentPolarisContext()).getId())
+ .setCatalogId(catalogEntity.getId())
+ .setParentId(currentParent.getId())
+ .setCreateTimestamp(System.currentTimeMillis())
+ .build();
Review Comment:
We should probably sort out where the term "synthetic" fits relative to
"passthrough facade". "Synthetic" only describes the process by which the
entity is created; the "type" we ultimately want to identify later is that the
entity represents a "passthrough facade", even if its origin in the future may
or may not count as "synthetic" in the same way.
To answer your question: right now the code treats *all* entities that exist
within an ExternalCatalog as passthrough facades, so each entity inherits that
identity from its parent catalog. We could also set an explicit flag that
identifies an entity as a passthrough facade, but that probably deserves more
design discussion first.
Identifying entities individually would mostly matter in a couple of
potential scenarios:
1. *If* we support "mixed-mode" catalogs, where some entities are natively
owned in Polaris while others are basically "symlinks" to remote tables. In
that case we presumably need to know whether any given table is local or
remote. By default, we *probably* don't want to support this use case unless we
have a really compelling reason.
2. A variation of (1): supporting gradual "migration" scenarios. Here we want
the ability to incrementally transfer individual tables so that the local
Polaris server becomes their source of truth. This resembles a "mixed-mode"
catalog, with the key differences that the intermediate states are much more
constrained and that all the tables "originally" came from the same remote
catalog.
The basic intent was for a Catalog's ConnectionConfig to be resolved
hierarchically, the same way the StorageConfig is technically resolved
hierarchically: if a definition is present on a parent namespace or on the
table itself, the "nearest" definition wins. So one possibility is to set the
ConnectionConfig directly on the table to indicate that it's a passthrough
facade.
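To make the "nearest definition wins" idea concrete, here is a minimal sketch, not actual Polaris code. `Node`, `resolveNearest`, and `isPassthroughFacade` are hypothetical names invented for illustration, and the config is reduced to a plain string. It assumes the resolved path is available as a root-to-leaf list.

```java
import java.util.List;
import java.util.Optional;

public class NearestConnectionConfigSketch {

  /** Hypothetical stand-in for a catalog, namespace, or table entity; not a Polaris class. */
  record Node(String name, Optional<String> connectionConfig) {}

  /** Walks the path from leaf to root and returns the first ("nearest") config found. */
  static Optional<String> resolveNearest(List<Node> rootToLeaf) {
    for (int i = rootToLeaf.size() - 1; i >= 0; i--) {
      Optional<String> config = rootToLeaf.get(i).connectionConfig();
      if (config.isPresent()) {
        return config;
      }
    }
    return Optional.empty();
  }

  /** Assumption for this sketch: a leaf is a passthrough facade iff some config resolves for it. */
  static boolean isPassthroughFacade(List<Node> rootToLeaf) {
    return resolveNearest(rootToLeaf).isPresent();
  }

  public static void main(String[] args) {
    Node catalog = new Node("ext_catalog", Optional.of("catalog-level-config"));
    Node ns = new Node("ns1", Optional.empty());
    Node inherited = new Node("t1", Optional.empty());
    Node overridden = new Node("t2", Optional.of("table-level-config"));

    // t1 has no config of its own, so it inherits the catalog-level one.
    System.out.println(resolveNearest(List.of(catalog, ns, inherited)));
    // t2 carries its own config, which is nearer than the catalog's and wins.
    System.out.println(resolveNearest(List.of(catalog, ns, overridden)));
  }
}
```

With this shape, today's behavior (identity inherited from the parent catalog) and a per-table marker are the same lookup. The only difference is the level at which the config is attached.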