davseitsev opened a new issue, #12086:
URL: https://github.com/apache/iceberg/issues/12086

   ### Apache Iceberg version
   
   1.7.0
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   We have a Spark job that performs all the maintenance actions on our data lake. We noticed high CPU usage on the driver caused by `Committer-Service` threads.
   
   Here is the output of `top`:
   ```
   51128 yarn      20   0  120.4g  29.7g  23732 R  99.9  24.1  66:34.41 Committer-Servi
   53929 yarn      20   0  120.4g  29.7g  23732 R  99.9  24.1  16:08.09 Committer-Servi
   11001 yarn      20   0  120.4g  29.7g  23732 R  99.7  24.1  90:47.03 dispatcher-Coar
   49957 yarn      20   0  120.4g  29.7g  23732 R  99.7  24.1  65:28.45 Committer-Servi
   50738 yarn      20   0  120.4g  29.7g  23732 R  99.7  24.1  66:45.21 Committer-Servi
   11052 yarn      20   0  120.4g  29.7g  23732 S   1.0  24.1   1:53.65 spark-listener-
   83359 yarn      20   0  120.4g  29.7g  23732 S   0.6  24.1   0:00.10 Committer-Servi
   ```
   
   The stack trace of the CPU-consuming threads looks like this:
   ```
   "Committer-Service" #28702 prio=5 os_prio=0 cpu=3958800.03ms elapsed=4094.93s tid=0x0000ffe544acc4e0 nid=0xc325 runnable  [0x0000ffe492d55000]
      java.lang.Thread.State: RUNNABLE
           at org.apache.iceberg.actions.BaseCommitService.lambda$start$0(BaseCommitService.java:133)
           at org.apache.iceberg.actions.BaseCommitService$$Lambda$4975/0x00000030026fec58.run(Unknown Source)
           at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1136)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:635)
           at java.lang.Thread.run([email protected]/Thread.java:840)
   ```
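
   The `nid` field in a thread dump is the native thread ID in hexadecimal, so it can be matched against the decimal thread ID in the first column of the `top` output (assuming `top` was showing individual threads). A small sketch of that conversion follows; the class and method names are hypothetical helpers for illustration only:

   ```java
   /** Hypothetical helper: converts a decimal thread ID from top to the jstack nid format. */
   final class ThreadIdHex {
     private ThreadIdHex() {}

     /** Returns the thread ID as lowercase hex with a 0x prefix, as printed in the nid field. */
     static String toNid(int decimalThreadId) {
       return "0x" + Integer.toHexString(decimalThreadId);
     }

     public static void main(String[] args) {
       // 49957 is one of the Committer-Servi rows in the top output above.
       System.out.println(ThreadIdHex.toNid(49957)); // prints 0xc325
     }
   }
   ```

   Thread 49957 from `top` therefore corresponds to `nid=0xc325` in the stack trace above.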
   
   Here is the flame graph:
   
   <img width="997" alt="Image" src="https://github.com/user-attachments/assets/db1dba31-ff7a-4b27-9742-20933e40c452" />
   
   And the call tree:
   
   <img width="1000" alt="Image" src="https://github.com/user-attachments/assets/9ceeffc1-2539-4193-81eb-f6f02fe738e4" />
   
   And the corresponding place in the code:
   
   <img width="1008" alt="Image" src="https://github.com/user-attachments/assets/0b4f5f6d-3bec-4625-915e-4af138247189" />
   
   It looks like the threads are wasting CPU time when there is actually nothing to do.
   It doesn't affect us significantly because we use a big instance for the driver, but it is worth optimizing for smaller instances.
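
   To illustrate the kind of change that could avoid the spinning, here is a minimal sketch of a consumer loop that parks on a `BlockingQueue` with a timed `poll` instead of re-checking its state in a tight loop. This is not the Iceberg `BaseCommitService` code: the class `BlockingCommitLoop`, its methods, the `String` items, and the 100 ms timeout are all assumptions made up for this example.

   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.concurrent.BlockingQueue;
   import java.util.concurrent.LinkedBlockingQueue;
   import java.util.concurrent.TimeUnit;
   import java.util.concurrent.atomic.AtomicBoolean;

   /** Hypothetical illustration only; not the Iceberg implementation. */
   final class BlockingCommitLoop {
     private final BlockingQueue<String> completed = new LinkedBlockingQueue<>();
     private final AtomicBoolean running = new AtomicBoolean(true);
     private final List<List<String>> commits = new ArrayList<>();
     private final int batchSize;

     BlockingCommitLoop(int batchSize) {
       this.batchSize = batchSize;
     }

     /** Called by producers when a unit of work is ready to be committed. */
     void offer(String item) {
       completed.add(item);
     }

     /** Signals that no more items will be offered. */
     void stop() {
       running.set(false);
     }

     /** Batches committed so far; read this only after the loop thread has finished. */
     List<List<String>> commits() {
       return commits;
     }

     /** Runs until stop() has been called and the queue is drained. */
     void run() throws InterruptedException {
       List<String> batch = new ArrayList<>();
       while (running.get() || !completed.isEmpty()) {
         // A timed poll parks the thread while the queue is empty,
         // so an idle loop wakes up about ten times per second instead of spinning.
         String item = completed.poll(100, TimeUnit.MILLISECONDS);
         if (item != null) {
           batch.add(item);
         }
         if (batch.size() >= batchSize) {
           commits.add(batch); // stand-in for a real commit
           batch = new ArrayList<>();
         }
       }
       if (!batch.isEmpty()) {
         commits.add(batch); // commit whatever is left after shutdown
       }
     }
   }
   ```

   The timeout trades a small delay in noticing `stop()` for near-zero CPU use while idle; the actual fix in Iceberg may look quite different.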
   
   ### Willingness to contribute
   
   - [ ] I can contribute a fix for this bug independently
   - [ ] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
   - [x] I cannot contribute a fix for this bug at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

