andygrove commented on PR #3534:
URL: https://github.com/apache/datafusion-comet/pull/3534#issuecomment-3909407007

   ## TPC-H q21 Memory Profile: Comet vs Gluten (SF100, Docker, 1 executor / 8 cores / 16 GiB)

   Ran TPC-H query 21 with JVM memory profiling enabled (`--profile --profile-interval 1.0`) for both engines on the same Docker cluster.
               
   ### Configuration
   - **Spark:** 3.5.2 (docker image `comet-bench`)
   - **Comet:** Java 17, `comet-baseline.jar`
   - **Gluten:** Java 8, 
`gluten-velox-bundle-spark3.5_2.12-linux_amd64-1.4.0.jar`
   - **Cluster:** 1 executor, 8 cores, 16 GiB executor memory, 16 GiB off-heap
   
   ### Results (Executor 0)
   
   | Engine | Wall-clock time |
   |--------|-----------------|
   | Comet  | 80.03s |
   | Gluten | 43.55s |
   | Delta  | +36.48s (Comet 1.84x slower) |
   
   | Metric | Comet | Gluten | Delta |
   |--------|-------|--------|-------|
   | Peak memoryUsed | 1.31 MB | 1.91 MB | -605.79 KB (0.69x) |
   | Peak JVMHeapMemory | 5.90 GB | 2.43 GB | +3.47 GB (2.43x) |
   | Peak JVMOffHeapMemory | 123.64 MB | 116.36 MB | +7.27 MB (1.06x) |
   | Peak OnHeapExecutionMemory | 0.00 B | 0.00 B | 0 |
   | Peak OffHeapExecutionMemory | 4.17 GB | 2.79 GB | +1.38 GB (1.50x) |
   | Peak OnHeapUnifiedMemory | 9.49 MB | 5.32 MB | +4.17 MB (1.78x) |
   | Peak OffHeapUnifiedMemory | 4.17 GB | 2.79 GB | +1.38 GB (1.50x) |
   | Peak ProcessTreeJVMRSSMemory | 0.00 B | 0.00 B | 0 |
   | memoryUsed % of maxMemory | 0.01% | 0.01% | -0.00 pp |
   
   maxMemory (executor 0):  Comet = 25.42 GB  |  Gluten = 24.36 GB
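
For anyone re-checking the Delta column, here is a minimal sketch of the arithmetic, assuming Delta means Comet minus Gluten and the multiplier is Comet divided by Gluten. The dictionary and the `compare` helper are illustrative names, not part of the benchmark scripts, and the inputs are the rounded values from the table (1 GB taken as 1024 MB), so the last digit can differ slightly from the reported figures.

```python
# Peak values copied from the results table, normalized to MB.
# Assumption: 1 GB = 1024 MB; the profiler's own unit convention is not stated.
PEAKS_MB = {
    "JVMHeapMemory": (5.90 * 1024, 2.43 * 1024),
    "JVMOffHeapMemory": (123.64, 116.36),
    "OffHeapExecutionMemory": (4.17 * 1024, 2.79 * 1024),
    "OnHeapUnifiedMemory": (9.49, 5.32),
}


def compare(comet, gluten):
    """Return (delta, ratio) of the Comet value relative to the Gluten value."""
    return comet - gluten, comet / gluten


if __name__ == "__main__":
    for name, (comet, gluten) in PEAKS_MB.items():
        delta, ratio = compare(comet, gluten)
        print(f"{name:<24} {delta:+10.2f} MB  ({ratio:.2f}x)")
```

The same helper applies to the wall-clock row: `compare(80.03, 43.55)` gives the time delta in seconds and the slowdown factor.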
   
   ### Key Takeaways
   
   - **Speed:** Gluten finished in 43.55s vs Comet's 80.03s (1.84x faster on 
q21)
   - **JVM Heap:** Comet peaked at 5.90 GB on-heap — **2.43x higher** than 
Gluten's 2.43 GB. This is the single largest memory difference.
   - **Off-Heap Execution Memory:** Comet used 4.17 GB vs Gluten's 2.79 GB 
(1.50x more), reflecting native engine working memory during q21's multi-way 
join + anti-join + aggregation.
   - **JVM Off-Heap:** Roughly comparable (~120 MB each, only 6% difference).
   - **On-Heap Execution:** Both 0 — expected since both engines execute 
natively off-heap.
   - **ProcessTreeJVMRSSMemory:** 0 for both (RSS tracking not enabled).
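
If RSS numbers are wanted in a follow-up run, Spark only collects the ProcessTree* metrics when `spark.executor.processTreeMetrics.enabled` is set to true (it defaults to false and reads from `/proc`). A rough sketch of adding that flag to a spark-submit argument list follows; the helper name and the example arguments are hypothetical and not taken from `run.py`.

```python
KEY = "spark.executor.processTreeMetrics.enabled"


def with_process_tree_metrics(submit_args):
    """Return a copy of the spark-submit args with process tree metrics enabled.

    An existing explicit setting for KEY is left untouched.
    """
    args = list(submit_args)
    if not any(arg.startswith(KEY + "=") for arg in args):
        # Options must precede the application jar, so insert right after
        # the executable name.
        args[1:1] = ["--conf", f"{KEY}=true"]
    return args


if __name__ == "__main__":
    # Hypothetical command line, for illustration only.
    print(with_process_tree_metrics(["spark-submit", "--master", "local[8]", "app.jar"]))
```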
   
   ### Notes
   - This is a single query (q21) on a single run — results may vary across 
queries and iterations.
   - The Docker image was updated to include both Java 8 (Gluten) and Java 17 
(Comet), with a `--name` arg injection bugfix in `run.py`.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

