rkhachatryan commented on a change in pull request #15322: URL: https://github.com/apache/flink/pull/15322#discussion_r672604163
########## File path: flink-dstl/flink-dstl-dfs/src/main/java/org/apache/flink/changelog/fs/FsStateChangelogOptions.java ########## @@ -57,4 +87,18 @@ "Number of threads to use to perform cleanup in case an upload is discarded " + "(and not cleaned up by JM). " + "If the cleanup doesn't keep up then task might be back-pressured."); + + public static final ConfigOption<Integer> NUM_UPLOAD_THREADS = + ConfigOptions.key("dstl.dfs.upload.num-threads") + .intType() + .defaultValue(5) + .withDescription("Number of threads to use for upload."); Review comment: > This sounds like RocksDB issue I guess it was a design decision to (easily) throttle the number of request to DFS (S3). This is yet another reason for a separate thread pool here. > I can see two reasons against using the same thread pool: ...2. having actions with different priority It's not only about the priority, but also about isolation, as I wrote above. If there is an issue with uploading (e.g. bug or request throttling) then it will consume all IO threads - regardless of the priority. This makes it harder to debug because it will manifest itself in many different ways, giving false signals. > I don't believe anyone will want to bother how much he should bump individual thread pools I'd say users shouldn't bother at all and the defaults should be enough. But once there **is** some issue it might be helpful to see what are threads in each pool doing - and maybe adjust accordingly. > Besides I don't see this being an issue in this case either? I looked at the code, and it seems can block releasing partitions. Maybe there are or will be some other effects? > In that case it should be solved probably in a better way by still sharing thread pool, but with more sophisticated actions scheduler I think it's better only because of the other point - ease of configuration. Otherwise, separate thread pools are simpler in code and more robust. Besides that, async checkpointing phase (regardless of the backend) currently uses its own pool created in `StreamTask`, not the common IO pool. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org