anishshri-db opened a new pull request, #46491:
URL: https://github.com/apache/spark/pull/46491

   ### What changes were proposed in this pull request?
   Skip providing memory usage metrics from RocksDB if bounded memory usage is enabled.
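
   Conceptually, the change amounts to the guard sketched below. This is a minimal standalone sketch with hypothetical names (`Conf`, `nativeDbMemoryUsage`), not the actual patch: under bounded memory usage, all RocksDB instances on a node draw from one shared cache, so the per-store memory stat reflects the whole node and is skipped.

   ```scala
   // Hypothetical sketch of the guard, not the actual Spark change.
   object MemoryMetricSketch {
     // Stand-ins for the state store conf and the native RocksDB stats call.
     final case class Conf(boundedMemoryUsage: Boolean)
     def nativeDbMemoryUsage(): Long = 77L * 1024 * 1024 * 1024 // node-wide usage

     def reportedMemoryUsedBytes(conf: Conf): Long =
       if (conf.boundedMemoryUsage) 0L // skip: value is node-wide, not per store
       else nativeDbMemoryUsage()      // unbounded: usage is truly per instance

     def main(args: Array[String]): Unit = {
       println(reportedMemoryUsedBytes(Conf(boundedMemoryUsage = true)))  // 0
       println(reportedMemoryUsedBytes(Conf(boundedMemoryUsage = false))) // ~77 GB
     }
   }
   ```

   Bounded memory usage here refers to the mode enabled via the `spark.sql.streaming.stateStore.rocksdb.boundedMemoryUsage` conf, where RocksDB instances on an executor cap their total memory with a shared block cache.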
   
   
   ### Why are the changes needed?
   Without this change, the value we report as memory usage at the partition / state store level is actually the maximum usage per node. For example, if we report this:
   ```
       "allRemovalsTimeMs" : 93,
       "commitTimeMs" : 32240,
       "memoryUsedBytes" : 15956211724278,
       "numRowsDroppedByWatermark" : 0,
       "numShufflePartitions" : 200,
       "numStateStoreInstances" : 200,
   ```
   
   We have 200 partitions in this case, so dividing the reported `memoryUsedBytes` by 200 implies ~78GB of memory usage per partition / state store. However, each node has only 256GB of memory in total, and we have 2 such nodes. We have configured the cluster to let RocksDB use 30% of the available memory on each node, which is ~77GB. So the value being reported is actually per node rather than per partition, which could be confusing for users.
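
   As a quick sanity check of the arithmetic above (illustrative code, not part of the patch):

   ```scala
   // Back-of-the-envelope check of the reported numbers.
   object ReportedMemoryCheck {
     def main(args: Array[String]): Unit = {
       val totalReported = 15956211724278L            // memoryUsedBytes from the progress
       val numStores     = 200L                       // numStateStoreInstances
       val perStore      = totalReported / numStores  // ~79.8e9 bytes, the ~78GB above
       val nodeMemory    = 256L * 1024 * 1024 * 1024  // 256GB per node
       val rocksDbCap    = (nodeMemory * 0.30).toLong // ~77GB RocksDB cap per node
       println(s"per store: $perStore, per-node RocksDB cap: $rocksDbCap")
       // perStore lands right around rocksDbCap: every store reports the shared
       // node-level usage, and summing across 200 stores inflates the metric ~200x.
     }
   }
   ```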
   
   ### Does this PR introduce _any_ user-facing change?
   No - only a metrics reporting change
   
   ### How was this patch tested?
   Added unit tests
   
   ```
   [info] Run completed in 10 seconds, 878 milliseconds.
   [info] Total number of tests run: 24
   [info] Suites: completed 1, aborted 0
   [info] Tests: succeeded 24, failed 0, canceled 0, ignored 0, pending 0
   [info] All tests passed.
   ```
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No
   



