[ https://issues.apache.org/jira/browse/HUDI-6669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nicholas Jiang updated HUDI-6669:
---------------------------------
    Description:
HoodieEngineContext should not use a parallel stream with parallelism greater than the number of CPU cores, to avoid an {{OutOfMemoryError}} raised through {{ForkJoinTask}}. The stack trace is as follows:
{code:java}
Caused by: java.lang.OutOfMemoryError
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:598)
    at java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:677)
    at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:735)
    at java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:714)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
    at org.apache.hudi.client.common.HoodieFlinkEngineContext.map(HoodieFlinkEngineContext.java:101)
    at org.apache.hudi.table.action.clean.CleanPlanActionExecutor.requestClean(CleanPlanActionExecutor.java:117)
    at org.apache.hudi.table.action.clean.CleanPlanActionExecutor.requestClean(CleanPlanActionExecutor.java:145)
    at org.apache.hudi.table.action.clean.CleanPlanActionExecutor.execute(CleanPlanActionExecutor.java:170)
    at org.apache.hudi.table.HoodieFlinkCopyOnWriteTable.scheduleCleaning(HoodieFlinkCopyOnWriteTable.java:353)
    at org.apache.hudi.client.BaseHoodieWriteClient.scheduleTableServiceInternal(BaseHoodieWriteClient.java:1434)
    at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:891)
    at org.apache.hudi.async.AsyncCleanerService.lambda$startService$0(AsyncCleanerService.java:68)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
{code}

  was:
HoodieEngineContext should not use a parallel stream with parallelism greater than the number of CPU cores, to avoid an {{OutOfMemoryError}} raised through {{ForkJoinTask}}. The stack trace is as follows:

Caused by: java.lang.OutOfMemoryError
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:598)
    at java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:677)
    at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:735)
    at java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:714)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
    at org.apache.hudi.client.common.HoodieFlinkEngineContext.map(HoodieFlinkEngineContext.java:101)
    at org.apache.hudi.table.action.clean.CleanPlanActionExecutor.requestClean(CleanPlanActionExecutor.java:117)
    at org.apache.hudi.table.action.clean.CleanPlanActionExecutor.requestClean(CleanPlanActionExecutor.java:145)
    at org.apache.hudi.table.action.clean.CleanPlanActionExecutor.execute(CleanPlanActionExecutor.java:170)
    at org.apache.hudi.table.HoodieFlinkCopyOnWriteTable.scheduleCleaning(HoodieFlinkCopyOnWriteTable.java:353)
    at org.apache.hudi.client.BaseHoodieWriteClient.scheduleTableServiceInternal(BaseHoodieWriteClient.java:1434)
    at org.apache.hudi.client.BaseHoodieWriteClient.clean(BaseHoodieWriteClient.java:891)
    at org.apache.hudi.async.AsyncCleanerService.lambda$startService$0(AsyncCleanerService.java:68)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)


> HoodieEngineContext should not use parallel stream with parallelism greater than CPU cores
> ------------------------------------------------------------------------------------------
>
>                 Key: HUDI-6669
>                 URL: https://issues.apache.org/jira/browse/HUDI-6669
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: core
>            Reporter: Nicholas Jiang
>            Assignee: Nicholas Jiang
>            Priority: Major
>             Fix For: 0.14.0
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
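
The idea stated in the issue, keeping the parallelism of a parallel stream at or below the number of CPU cores, can be sketched as follows. This is only an illustration and not the change made for HUDI-6669: the class `BoundedParallelMap` and its methods `boundedParallelism` and `map` are hypothetical names. It also relies on the commonly used technique of starting a parallel stream from inside a dedicated `ForkJoinPool` so that the stream's tasks run in that pool, which is observed JDK behavior rather than a guarantee of the Stream API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.function.Function;
import java.util.stream.Collectors;

public class BoundedParallelMap {

  // Hypothetical helper: clamp the requested parallelism to [1, available CPU cores].
  public static int boundedParallelism(int requested) {
    int cores = Runtime.getRuntime().availableProcessors();
    return Math.max(1, Math.min(requested, cores));
  }

  // Hypothetical helper: map over the data with a parallel stream that is started
  // from a dedicated pool whose size never exceeds the number of CPU cores.
  public static <I, O> List<O> map(List<I> data, Function<I, O> func, int parallelism) {
    ForkJoinPool pool = new ForkJoinPool(boundedParallelism(parallelism));
    try {
      // Assumption: a parallel stream started inside a ForkJoinPool task
      // executes its subtasks in that pool instead of the common pool.
      Callable<List<O>> task =
          () -> data.parallelStream().map(func).collect(Collectors.toList());
      return pool.submit(task).get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    } catch (ExecutionException e) {
      throw new RuntimeException(e.getCause());
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    // A requested parallelism of 1000 is clamped to the core count.
    List<Integer> doubled = map(Arrays.asList(1, 2, 3), x -> x * 2, 1000);
    System.out.println(doubled); // prints [2, 4, 6]
  }
}
```

Clamping in one helper keeps callers free to pass any requested parallelism while the number of worker threads stays bounded by the hardware.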