singhpk234 commented on code in PR #1686:
URL: https://github.com/apache/polaris/pull/1686#discussion_r2157976040


##########
persistence/relational-jdbc/src/main/java/org/apache/polaris/persistence/relational/jdbc/QueryGenerator.java:
##########
@@ -208,6 +209,59 @@ static QueryFragment generateWhereClause(
     return new QueryFragment(clause, parameters);
   }
 
+  @VisibleForTesting
+  static PreparedQuery generateVersionQuery() {
+    return new PreparedQuery("SELECT version_value FROM 
POLARIS_SCHEMA.VERSION", List.of());
+  }
+
+  /**
+   * Generate a SELECT query to find any entities that have a given realm & 
parent and that may with
+   * a given location. The check is performed without consideration for the 
scheme, so a path on one
+   * storage type may give a false positive for overlapping with another 
storage type. This should
+   * be combined with a check using `StorageLocation`.
+   *
+   * @param realmId A realm to search within
+   * @param parentId A parent entity to search within
+   * @param baseLocation The base location to look for overlap with, with or 
without a scheme
+   * @return The list of possibly overlapping entities that meet the criteria
+   */
+  @VisibleForTesting
+  public static PreparedQuery generateOverlapQuery(
+      String realmId, long parentId, String baseLocation) {
+    StorageLocation baseStorageLocation = StorageLocation.of(baseLocation);
+    String locationWithoutScheme = baseStorageLocation.withoutScheme();
+
+    List<String> conditions = new ArrayList<>();
+    List<Object> parameters = new ArrayList<>();
+
+    String[] components = locationWithoutScheme.split("/");
+    StringBuilder pathBuilder = new StringBuilder();
+
+    for (String component : components) {
+      pathBuilder.append(component).append("/");
+      conditions.add("location_without_scheme = ?");
+      parameters.add(pathBuilder.toString());
+    }

Review Comment:
   [doubt] does it needs to opiniated based on depth of the directory ? a user 
recently pointed out this 
https://apache-polaris.slack.com/archives/C084XDM50CB/p1750346497095709 hence 
wanted to ask 



##########
persistence/relational-jdbc/src/main/resources/h2/schema-v2.sql:
##########
@@ -0,0 +1,128 @@
+--
+-- Licensed to the Apache Software Foundation (ASF) under one
+-- or more contributor license agreements.  See the NOTICE file--
+--  distributed with this work for additional information
+-- regarding copyright ownership.  The ASF licenses this file
+-- to you under the Apache License, Version 2.0 (the
+-- "License"). You may not use this file except in compliance
+-- with the License.  You may obtain a copy of the License at
+--
+--  http://www.apache.org/licenses/LICENSE-2.0
+--
+-- Unless required by applicable law or agreed to in writing,
+-- software distributed under the License is distributed on an
+-- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+-- KIND, either express or implied.  See the License for the
+-- specific language governing permissions and limitations
+-- under the License.
+--
+
+-- Changes from v1:
+--  * Added a `location` column to entities
+--  * Added an index `idx_locations` over (realm_id, parent_id, location) in 
entities
+
+CREATE SCHEMA IF NOT EXISTS POLARIS_SCHEMA;
+SET SCHEMA POLARIS_SCHEMA;
+
+CREATE TABLE IF NOT EXISTS version (
+    version_key VARCHAR PRIMARY KEY,
+    version_value INTEGER NOT NULL
+);
+
+MERGE INTO version (version_key, version_value)
+    KEY (version_key)
+    VALUES ('version', 2);
+
+-- H2 supports COMMENT, but some modes may ignore it
+COMMENT ON TABLE version IS 'the version of the JDBC schema in use';
+
+DROP TABLE IF EXISTS entities;
+CREATE TABLE IF NOT EXISTS entities (
+    realm_id TEXT NOT NULL,
+    catalog_id BIGINT NOT NULL,
+    id BIGINT NOT NULL,
+    parent_id BIGINT NOT NULL,
+    name TEXT NOT NULL,
+    entity_version INT NOT NULL,
+    type_code INT NOT NULL,
+    sub_type_code INT NOT NULL,
+    create_timestamp BIGINT NOT NULL,
+    drop_timestamp BIGINT NOT NULL,
+    purge_timestamp BIGINT NOT NULL,
+    to_purge_timestamp BIGINT NOT NULL,
+    last_update_timestamp BIGINT NOT NULL,
+    properties TEXT NOT NULL DEFAULT '{}',
+    internal_properties TEXT NOT NULL DEFAULT '{}',
+    grant_records_version INT NOT NULL,
+    location_without_scheme TEXT,
+    PRIMARY KEY (realm_id, id),
+    CONSTRAINT constraint_name UNIQUE (realm_id, catalog_id, parent_id, 
type_code, name)
+);
+
+DROP INDEX IF EXISTS idx_locations;
+CREATE INDEX idx_locations ON entities(realm_id, parent_id, 
location_without_scheme);

Review Comment:
   are we skipping this ?



##########
persistence/relational-jdbc/src/main/java/org/apache/polaris/persistence/relational/jdbc/JdbcBasePersistenceImpl.java:
##########
@@ -622,6 +632,64 @@ public boolean hasChildren(
     }
   }
 
+  private int loadVersion() {
+    PreparedQuery query = QueryGenerator.generateVersionQuery();
+    try {
+      List<SchemaVersion> schemaVersion =
+          datasourceOperations.executeSelect(query, new SchemaVersion());
+      if (schemaVersion == null || schemaVersion.size() != 1) {
+        throw new RuntimeException("Failed to retrieve schema version");
+      }
+      return schemaVersion.getFirst().getValue();
+    } catch (SQLException e) {
+      LOGGER.error("Failed to load schema version due to {}", e.getMessage(), 
e);
+      throw new RuntimeException("Failed to retrieve schema version", e);
+    }
+  }
+
+  /** {@inheritDoc} */
+  @Override
+  public <T extends PolarisEntity & LocationBasedEntity>
+      Optional<Optional<String>> hasOverlappingSiblings(
+          @Nonnull PolarisCallContext callContext, T entity) {
+    if (this.version < 2) {
+      return Optional.empty();
+    }
+    if (entity.getBaseLocation().chars().filter(ch -> ch == '/').count()
+        > MAX_LOCATION_COMPONENTS) {
+      return Optional.empty();
+    }
+
+    PreparedQuery query =
+        QueryGenerator.generateOverlapQuery(
+            realmId, entity.getParentId(), entity.getBaseLocation());
+    try {
+      var results = datasourceOperations.executeSelect(query, new 
ModelEntity());
+      if (!results.isEmpty()) {
+        StorageLocation entityLocation = 
StorageLocation.of(entity.getBaseLocation());
+        for (PolarisBaseEntity result : results) {
+          StorageLocation potentialSiblingLocation =
+              StorageLocation.of(((LocationBasedEntity) 
result).getBaseLocation());
+          if (entityLocation.isChildOf(potentialSiblingLocation)
+              || potentialSiblingLocation.isChildOf(entityLocation)) {
+            return 
Optional.of(Optional.of(potentialSiblingLocation.toString()));

Review Comment:
   IIUC, we might still miss the s3 / s3a equivalence check ? wondering if we 
need to define something like 
https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/io/ResolvingFileIO.java#L60
 , I know this is part of a very wide debate, please feel free to skip it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to