jihoonson commented on a change in pull request #10677:
URL: https://github.com/apache/druid/pull/10677#discussion_r554239833



##########
File path: 
indexing-service/src/main/java/org/apache/druid/indexing/common/task/CompactionTask.java
##########
@@ -153,6 +154,9 @@
   @JsonIgnore
   private final RetryPolicyFactory retryPolicyFactory;
 
+  @JsonIgnore
+  private final InputSourceSecurityConfig securityConfig;

Review comment:
       Why not `InputSourceSecurityConfig.ALLOW_ALL`? The system-wide 
securityConfig is ignored in DruidInputSource anyway.

##########
File path: 
indexing-service/src/main/java/org/apache/druid/indexing/common/task/IndexTask.java
##########
@@ -1044,17 +1048,21 @@ public IndexIOConfig(
       if (firehoseFactory != null && inputFormat != null) {
         throw new IAE("Cannot use firehose and inputFormat together. Try using 
inputSource instead of firehose.");
       }
+      if (inputSource != null) {
+        inputSource.validateAllowDenyPrefixList(securityConfig);

Review comment:
       I guess you wanted to validate inputSource in the constructor to fail 
early. However, this seems possible to cause some issues because it will throw 
exception on validation failure, which in turn making creating an ioConfig 
instance failed. For example, suppose you had an ingestion spec stored in the 
metadata store. Then, for some reason, you wanted to update the security config 
to forbid some URIs which are in the ingestion spec in the metadata store. The 
update will break reading ingestion specs from metadata store which happens in 
the Overlord. It will also unnecessarily perform the duplicate check multiple 
times during ingestion. I think this should be called somewhere in the code 
path of ingestion, such as `Task.isReady()`.

##########
File path: 
extensions-core/hdfs-storage/src/main/java/org/apache/druid/inputsource/hdfs/HdfsInputSource.java
##########
@@ -214,10 +215,23 @@ public boolean needsFormat()
   private void cachePathsIfNeeded() throws IOException
   {
     if (cachedPaths == null) {
-      cachedPaths = 
ImmutableList.copyOf(Preconditions.checkNotNull(getPaths(inputPaths, 
configuration), "paths"));
+      Collection<Path> paths = Preconditions.checkNotNull(getPaths(inputPaths, 
configuration), "paths");
+      cachedPaths = ImmutableList.copyOf(paths);
     }
   }
 
+  @Override
+  public void validateAllowDenyPrefixList(InputSourceSecurityConfig 
securityConfig)
+  {
+    try {
+      cachePathsIfNeeded();

Review comment:
       I think you won't probably need to cache paths since this will 
unnecessarily populate the cache in the Overlord.

##########
File path: 
core/src/main/java/org/apache/druid/data/input/AbstractInputSource.java
##########
@@ -31,13 +31,16 @@
  */
 public abstract class AbstractInputSource implements InputSource
 {
+  private transient boolean validated = false;

Review comment:
       Why `transient`? We don't use `Serializable`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org

Reply via email to