kfaraz commented on code in PR #18515:
URL: https://github.com/apache/druid/pull/18515#discussion_r2361771892


##########
server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java:
##########
@@ -1106,41 +1168,47 @@ public ResultSet getIndexInfo(DatabaseMetaData 
databaseMetaData, String tableNam
   }
 
   /**
-   * create index on the table with retry if not already exist, to be called 
after createTable
+   * Create index on the table with retry if not already exist, to be called 
after createTable
+   * Format of index name is either specified via legacy {@param 
indexTemplate} or {@code generateSHABasedIndexIdentifier}.
    *
    * @param tableName       Name of the table to create index on
-   * @param indexName       case-insensitive string index name, it helps to 
check the existing index on table
+   * @param indexTemplate   Template to create index ID (nullable for forwards 
compatibility with SHA-based indices)
    * @param indexCols       List of columns to be indexed on
    * @param createdIndexSet
    */
   public void createIndex(
       final String tableName,
-      final String indexName,
+      final String indexTemplate,

Review Comment:
   ```suggestion
         final String legacyIndexNameFormat,
   ```



##########
processing/src/main/java/org/apache/druid/metadata/MetadataStorageConnector.java:
##########
@@ -95,4 +95,12 @@ default void exportTable(
    * SegmentSchema table is created only when CentralizedDatasourceSchema 
feature is enabled.
    */
   void createSegmentSchemasTable();
+
+  /**
+   * Returns the maximum identifier length for table indices in this 
MetadataStorageConnector implementation.

Review Comment:
   Nit:
   ```suggestion
      * Returns the maximum allowed length of table index names in this 
MetadataStorageConnector implementation.
   ```



##########
server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java:
##########
@@ -1106,41 +1168,47 @@ public ResultSet getIndexInfo(DatabaseMetaData 
databaseMetaData, String tableNam
   }
 
   /**
-   * create index on the table with retry if not already exist, to be called 
after createTable
+   * Create index on the table with retry if not already exist, to be called 
after createTable
+   * Format of index name is either specified via legacy {@param 
indexTemplate} or {@code generateSHABasedIndexIdentifier}.
    *
    * @param tableName       Name of the table to create index on
-   * @param indexName       case-insensitive string index name, it helps to 
check the existing index on table
+   * @param indexTemplate   Template to create index ID (nullable for forwards 
compatibility with SHA-based indices)
    * @param indexCols       List of columns to be indexed on
    * @param createdIndexSet
    */
   public void createIndex(
       final String tableName,
-      final String indexName,
+      final String indexTemplate,
       final List<String> indexCols,
       final Set<String> createdIndexSet
   )
   {
+    final String shaIndexName = 
StringUtils.toUpperCase(generateSHABasedIndexIdentifier(tableName, indexCols));
+    final String legacyIndexName = indexTemplate != null ? 
StringUtils.toUpperCase(StringUtils.format(
+        indexTemplate,
+        tableName
+    )) : null;

Review Comment:
   ```suggestion
       final String legacyIndexName = indexTemplate != null
              ? StringUtils.toUpperCase(StringUtils.format(indexTemplate, 
tableName))
              : null;
   ```



##########
server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java:
##########
@@ -572,10 +603,15 @@ public void createSupervisorsTable(final String tableName)
                 + "  PRIMARY KEY (id)\n"
                 + ")",
                 tableName, getSerialType(), getPayloadType()
-            ),
-            StringUtils.format("CREATE INDEX idx_%1$s_spec_id ON 
%1$s(spec_id)", tableName)
+            )
         )
     );
+    createIndex(
+        tableName,
+        "idx_%1$s_spec_id ",

Review Comment:
   Nit: Since the index name has only one occurrence of the table name, can we 
just use `%s` instead of `%1$s` across all the invocations of the `createIndex` 
method?



##########
server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java:
##########
@@ -1206,4 +1274,27 @@ public static boolean isStatementException(Throwable e)
     return e instanceof StatementException ||
            (e instanceof CallbackFailedException && e.getCause() instanceof 
StatementException);
   }
+
+  /**
+   * Used to ensure full index coverage in the presence of large index names.
+   * Returned indentifiers are of the format: `idx_{tableName}_{SHA of column 
list}`.
+   *
+   * @param tableName the table name
+   * @param columns   the set of columns to create the index on 
(case-insensitive)
+   * @return unique index identifier
+   */
+  @VisibleForTesting
+  protected String generateSHABasedIndexIdentifier(String tableName, 
List<String> columns)
+  {
+    final String prefix = "idx_" + tableName + "_";
+    final String columnDigest = DigestUtils.sha1Hex(
+        columns.stream()
+               .map(StringUtils::toLowerCase)
+               .collect(Collectors.joining("_"))
+    );
+    return prefix + columnDigest.substring(

Review Comment:
   We should include the table name in generating the digest,
   and rather than truncating the digest, we should truncate the table name 
itself.



##########
server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java:
##########
@@ -371,15 +378,28 @@ public void createSegmentTable(final String tableName)
             StringUtils.format(
                 createStatementBuilder.toString(),
                 tableName, getPayloadType(), getQuoteString(), getCollation()
-            ),
-            StringUtils.format("CREATE INDEX idx_%1$s_used ON %1$s(used)", 
tableName),
-            StringUtils.format(
-                "CREATE INDEX idx_%1$s_datasource_used_end_start ON 
%1$s(dataSource, used, %2$send%2$s, start)",
-                tableName,
-                getQuoteString()
             )
         )
     );
+
+    final Set<String> createdIndexSet = getIndexOnTable(tableName);
+    createIndex(
+        tableName,
+        "idx_%1$s_used",
+        ImmutableList.of("used"),
+        createdIndexSet
+    );
+    createIndex(
+        tableName,
+        "idx_%1$s_datasource_used_end_start",
+        ImmutableList.of(

Review Comment:
   Nit: reads better
   ```suggestion
           List.of(
   ```



##########
server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java:
##########
@@ -1206,4 +1274,27 @@ public static boolean isStatementException(Throwable e)
     return e instanceof StatementException ||
            (e instanceof CallbackFailedException && e.getCause() instanceof 
StatementException);
   }
+
+  /**
+   * Used to ensure full index coverage in the presence of large index names.
+   * Returned indentifiers are of the format: `idx_{tableName}_{SHA of column 
list}`.
+   *
+   * @param tableName the table name
+   * @param columns   the set of columns to create the index on 
(case-insensitive)
+   * @return unique index identifier
+   */
+  @VisibleForTesting

Review Comment:
   We can skip this annotation.



##########
server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java:
##########
@@ -1106,41 +1168,47 @@ public ResultSet getIndexInfo(DatabaseMetaData 
databaseMetaData, String tableNam
   }
 
   /**
-   * create index on the table with retry if not already exist, to be called 
after createTable
+   * Create index on the table with retry if not already exist, to be called 
after createTable
+   * Format of index name is either specified via legacy {@param 
indexTemplate} or {@code generateSHABasedIndexIdentifier}.
    *
    * @param tableName       Name of the table to create index on
-   * @param indexName       case-insensitive string index name, it helps to 
check the existing index on table
+   * @param indexTemplate   Template to create index ID (nullable for forwards 
compatibility with SHA-based indices)
    * @param indexCols       List of columns to be indexed on
    * @param createdIndexSet
    */
   public void createIndex(
       final String tableName,
-      final String indexName,
+      final String indexTemplate,
       final List<String> indexCols,
       final Set<String> createdIndexSet
   )
   {
+    final String shaIndexName = 
StringUtils.toUpperCase(generateSHABasedIndexIdentifier(tableName, indexCols));

Review Comment:
   Should using the new index name be an opt-in feature based on some config?



##########
server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java:
##########
@@ -1206,4 +1274,27 @@ public static boolean isStatementException(Throwable e)
     return e instanceof StatementException ||
            (e instanceof CallbackFailedException && e.getCause() instanceof 
StatementException);
   }
+
+  /**
+   * Used to ensure full index coverage in the presence of large index names.
+   * Returned indentifiers are of the format: `idx_{tableName}_{SHA of column 
list}`.

Review Comment:
   ```suggestion
      * Creates a unique index name of the format {@code idx_<tableName>_<SHA 
of column list>}
      * while honoring the {@link #getMaxLengthOfIndexName}.
   ```



##########
server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java:
##########
@@ -1106,41 +1168,47 @@ public ResultSet getIndexInfo(DatabaseMetaData 
databaseMetaData, String tableNam
   }
 
   /**
-   * create index on the table with retry if not already exist, to be called 
after createTable
+   * Create index on the table with retry if not already exist, to be called 
after createTable
+   * Format of index name is either specified via legacy {@param 
indexTemplate} or {@code generateSHABasedIndexIdentifier}.
    *
    * @param tableName       Name of the table to create index on
-   * @param indexName       case-insensitive string index name, it helps to 
check the existing index on table
+   * @param indexTemplate   Template to create index ID (nullable for forwards 
compatibility with SHA-based indices)
    * @param indexCols       List of columns to be indexed on
    * @param createdIndexSet
    */
   public void createIndex(
       final String tableName,
-      final String indexName,
+      final String indexTemplate,
       final List<String> indexCols,
       final Set<String> createdIndexSet
   )
   {
+    final String shaIndexName = 
StringUtils.toUpperCase(generateSHABasedIndexIdentifier(tableName, indexCols));
+    final String legacyIndexName = indexTemplate != null ? 
StringUtils.toUpperCase(StringUtils.format(
+        indexTemplate,
+        tableName
+    )) : null;
+
+    // Avoid creating duplicate indices if an index with either naming 
convention already exists
+    if (legacyIndexName != null && createdIndexSet.contains(legacyIndexName)) {
+      log.info("Legacy index[%s] on Table[%s] already exists", 
legacyIndexName, tableName);
+      return;
+    } else if (createdIndexSet.contains(shaIndexName)) {
+      log.info("Index[%s] on Table[%s] already exists", shaIndexName, 
tableName);
+      return;
+    }
     try {
       retryWithHandle(
-          new HandleCallback<Void>()
-          {
-            @Override
-            public Void withHandle(Handle handle)
-            {
-              if 
(!createdIndexSet.contains(StringUtils.toUpperCase(indexName))) {
-                String indexSQL = StringUtils.format(
-                    "CREATE INDEX %1$s ON %2$s(%3$s)",
-                    indexName,
-                    tableName,
-                    Joiner.on(",").join(indexCols)
-                );
-                log.info("Creating Index on Table [%s], sql: [%s] ", 
tableName, indexSQL);
-                handle.execute(indexSQL);
-              } else {
-                log.info("Index [%s] on Table [%s] already exists", indexName, 
tableName);
-              }
-              return null;
-            }
+          (HandleCallback<Void>) handle -> {
+            String indexSQL = StringUtils.format(
+                "CREATE INDEX %1$s ON %2$s(%3$s)",
+                shaIndexName,
+                tableName,
+                Joiner.on(",").join(indexCols)
+            );
+            log.info("Creating Index on Table [%s], sql: [%s] ", tableName, 
indexSQL);

Review Comment:
   Let's log the index name as it might have been truncated.
   ```suggestion
               log.info("Creating index[%s] on table[%s] using SQL[%s].", 
indexName, tableName, indexSQL);
   ```



##########
server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java:
##########
@@ -371,15 +378,28 @@ public void createSegmentTable(final String tableName)
             StringUtils.format(
                 createStatementBuilder.toString(),
                 tableName, getPayloadType(), getQuoteString(), getCollation()
-            ),
-            StringUtils.format("CREATE INDEX idx_%1$s_used ON %1$s(used)", 
tableName),
-            StringUtils.format(
-                "CREATE INDEX idx_%1$s_datasource_used_end_start ON 
%1$s(dataSource, used, %2$send%2$s, start)",
-                tableName,
-                getQuoteString()
             )
         )
     );
+
+    final Set<String> createdIndexSet = getIndexOnTable(tableName);
+    createIndex(
+        tableName,
+        "idx_%1$s_used",
+        ImmutableList.of("used"),
+        createdIndexSet
+    );
+    createIndex(
+        tableName,
+        "idx_%1$s_datasource_used_end_start",
+        ImmutableList.of(
+            "dataSource",
+            "used",
+            StringUtils.format("%send%S", getQuoteString(), getQuoteString()),

Review Comment:
   can we delegate the quoting to the `createIndex` method itself? It could 
just quote all the methods by default.



##########
server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java:
##########
@@ -1206,4 +1274,27 @@ public static boolean isStatementException(Throwable e)
     return e instanceof StatementException ||
            (e instanceof CallbackFailedException && e.getCause() instanceof 
StatementException);
   }
+
+  /**
+   * Used to ensure full index coverage in the presence of large index names.
+   * Returned indentifiers are of the format: `idx_{tableName}_{SHA of column 
list}`.
+   *
+   * @param tableName the table name
+   * @param columns   the set of columns to create the index on 
(case-insensitive)
+   * @return unique index identifier
+   */
+  @VisibleForTesting
+  protected String generateSHABasedIndexIdentifier(String tableName, 
List<String> columns)

Review Comment:
   ```suggestion
     protected String generateUniqueIndexName(String tableName, List<String> 
columns)
   ```



##########
server/src/main/java/org/apache/druid/metadata/SQLMetadataConnector.java:
##########
@@ -1106,41 +1168,47 @@ public ResultSet getIndexInfo(DatabaseMetaData 
databaseMetaData, String tableNam
   }
 
   /**
-   * create index on the table with retry if not already exist, to be called 
after createTable
+   * Create index on the table with retry if not already exist, to be called 
after createTable
+   * Format of index name is either specified via legacy {@param 
indexTemplate} or {@code generateSHABasedIndexIdentifier}.
    *
    * @param tableName       Name of the table to create index on
-   * @param indexName       case-insensitive string index name, it helps to 
check the existing index on table
+   * @param indexTemplate   Template to create index ID (nullable for forwards 
compatibility with SHA-based indices)
    * @param indexCols       List of columns to be indexed on
    * @param createdIndexSet
    */
   public void createIndex(
       final String tableName,
-      final String indexName,
+      final String indexTemplate,
       final List<String> indexCols,
       final Set<String> createdIndexSet
   )
   {
+    final String shaIndexName = 
StringUtils.toUpperCase(generateSHABasedIndexIdentifier(tableName, indexCols));
+    final String legacyIndexName = indexTemplate != null ? 
StringUtils.toUpperCase(StringUtils.format(
+        indexTemplate,
+        tableName
+    )) : null;
+
+    // Avoid creating duplicate indices if an index with either naming 
convention already exists
+    if (legacyIndexName != null && createdIndexSet.contains(legacyIndexName)) {
+      log.info("Legacy index[%s] on Table[%s] already exists", 
legacyIndexName, tableName);
+      return;
+    } else if (createdIndexSet.contains(shaIndexName)) {
+      log.info("Index[%s] on Table[%s] already exists", shaIndexName, 
tableName);
+      return;
+    }
     try {
       retryWithHandle(
-          new HandleCallback<Void>()
-          {
-            @Override
-            public Void withHandle(Handle handle)
-            {
-              if 
(!createdIndexSet.contains(StringUtils.toUpperCase(indexName))) {
-                String indexSQL = StringUtils.format(
-                    "CREATE INDEX %1$s ON %2$s(%3$s)",
-                    indexName,
-                    tableName,
-                    Joiner.on(",").join(indexCols)
-                );
-                log.info("Creating Index on Table [%s], sql: [%s] ", 
tableName, indexSQL);
-                handle.execute(indexSQL);
-              } else {
-                log.info("Index [%s] on Table [%s] already exists", indexName, 
tableName);
-              }
-              return null;
-            }
+          (HandleCallback<Void>) handle -> {

Review Comment:
   Nit: not sure, but I think the casting can be avoided since there are no 
other methods which expect a similar lambda signature.
   ```suggestion
             handle -> {
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to