Copilot commented on code in PR #17800:
URL: https://github.com/apache/pinot/pull/17800#discussion_r2876673586


##########
pinot-spi/src/main/java/org/apache/pinot/spi/utils/PinotMd5Mode.java:
##########
@@ -0,0 +1,46 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.spi.utils;
+
+/**
+ * Global runtime switch for disabling MD5-dependent code paths in Pinot.
+ */
+public class PinotMd5Mode {
+  private static volatile boolean _pinotMd5Disabled = readFromSystemProperty();
+
+  private PinotMd5Mode() {
+  }
+
+  public static synchronized void setPinotMd5Disabled(boolean 
pinotMd5Disabled) {
+    _pinotMd5Disabled = pinotMd5Disabled;
+  }
+
+  public static synchronized boolean isPinotMd5Disabled() {
+    return _pinotMd5Disabled;
+  }
+
+  // Visible for tests in the same package.
+  static synchronized void resetFromSystemProperty() {

Review Comment:
   `PinotMd5Mode.isPinotMd5Disabled()` is `synchronized` even though the 
backing field is `volatile`. This method is called in hot paths (e.g., 
`HashUtils.hashPrimaryKey()` used during upsert/dedup), so the synchronization 
can introduce avoidable lock contention. Consider removing `synchronized` from 
the getter (and likely the setter) and rely on `volatile` or an `AtomicBoolean` 
instead.
   ```suggestion
     public static void setPinotMd5Disabled(boolean pinotMd5Disabled) {
       _pinotMd5Disabled = pinotMd5Disabled;
     }
   
     public static boolean isPinotMd5Disabled() {
       return _pinotMd5Disabled;
     }
   
     // Visible for tests in the same package.
     static void resetFromSystemProperty() {
   ```



##########
pinot-plugins/pinot-file-system/pinot-adls/src/test/java/org/apache/pinot/plugin/filesystem/test/ADLSGen2PinotFSTest.java:
##########
@@ -141,6 +143,35 @@ public DataLakeFileSystemClient 
getOrCreateClientWithFileSystem(DataLakeServiceC
     assertTrue(sasTokenFS != null);
   }
 
+  @Test
+  public void testChecksumDisabledWhenMd5Disabled()
+      throws Exception {
+    PinotConfiguration pinotConfiguration = new PinotConfiguration();
+    pinotConfiguration.setProperty("authenticationType", "SAS_TOKEN");
+    pinotConfiguration.setProperty("sasToken", 
"sp=rwdl&se=2025-12-31T23:59:59Z&sv=2022-11-02&sr=c&sig=test");
+    pinotConfiguration.setProperty("accountName", "testaccount");
+    pinotConfiguration.setProperty("fileSystemName", "testcontainer");
+    pinotConfiguration.setProperty("enableChecksum", "true");
+
+    ADLSGen2PinotFS adlsGen2PinotFs = new ADLSGen2PinotFS() {
+      @Override
+      public DataLakeFileSystemClient 
getOrCreateClientWithFileSystem(DataLakeServiceClient serviceClient,
+          String fileSystemName) {
+        return _mockFileSystemClient;
+      }
+    };
+
+    try {
+      PinotMd5Mode.setPinotMd5Disabled(true);
+      adlsGen2PinotFs.init(pinotConfiguration);
+      Field enableChecksumField = 
ADLSGen2PinotFS.class.getDeclaredField("_enableChecksum");
+      enableChecksumField.setAccessible(true);
+      assertFalse(enableChecksumField.getBoolean(adlsGen2PinotFs));

Review Comment:
   This test uses reflection to access the private `_enableChecksum` field, 
which is brittle (it will break on refactors/renames without changing 
behavior). Prefer asserting behavior via the public API (e.g., verifying 
checksum headers/hashes are not computed/sent when MD5 is disabled) or adding a 
small `@VisibleForTesting` accessor instead of reflective field access.
   ```suggestion
         // Initialization should succeed even when checksum is requested in 
the configuration
         // but MD5 computation is globally disabled.
         adlsGen2PinotFs.init(pinotConfiguration);
   ```



##########
pinot-plugins/pinot-file-system/pinot-adls/src/main/java/org/apache/pinot/plugin/filesystem/ADLSGen2PinotFS.java:
##########
@@ -123,7 +125,13 @@ public ADLSGen2PinotFS(DataLakeFileSystemClient 
fileSystemClient) {
 
   @Override
   public void init(PinotConfiguration config) {
-    _enableChecksum = config.getProperty(ENABLE_CHECKSUM, false);
+    boolean checksumEnabled = config.getProperty(ENABLE_CHECKSUM, false);
+    if (checksumEnabled && PinotMd5Mode.isPinotMd5Disabled()) {
+      LOGGER.warn("Checksum is enabled for ADLSGen2PinotFS but MD5 is globally 
disabled via '{}=true'; "
+              + "running with checksum disabled",
+          CommonConstants.CONFIG_OF_PINOT_MD5_DISABLED);
+    }
+    _enableChecksum = checksumEnabled && !PinotMd5Mode.isPinotMd5Disabled();

Review Comment:
   `_enableChecksum` is correctly forced off when MD5 is globally disabled, but 
`copySrcToDst()` still forwards `pathProperties.getContentMd5()` into 
`copyInputStreamToDst(...)`, which will set MD5 metadata whenever it exists 
(even when `_enableChecksum` is false). To fully honor the global MD5 disable 
switch, gate that metadata propagation as well (e.g., pass `null` when MD5 is 
disabled).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to