stream2000 commented on code in PR #9395:
URL: https://github.com/apache/hudi/pull/9395#discussion_r1328657021


##########
hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieEngineContext.java:
##########
@@ -105,4 +105,12 @@ public abstract <I, K, V> List<V> reduceByKey(
   public abstract void cancelJob(String jobId);
 
   public abstract void cancelAllJobs();
+
+  public <T> Stream<T> stream(List<T> data, Integer parallelism) {
+    return stream(stream(data.stream(), data.size()), parallelism);
+  }
+
+  public <T> Stream<T> stream(Stream<T> data, Integer parallelism) {
+    return parallelism == null || parallelism > 
Runtime.getRuntime().availableProcessors() ? data : data.parallel();

Review Comment:
   @yihua I agree that reducing the parallelism can avoid OOM in some 
situations.  I'm just wondering should we reduce the parallelism automatically. 
`HoodieFlinkEngineContext.map` provides a param `parallelism` and we can 
configure the parallelism as we want. In this scenario (Clean OOM in flink 
Engine) we can just set `hoodie.cleaner.parallelism` as 1 to reduce the 
parallelism. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to