Hi Faiz,

We find that G1GC works well for some of our Parquet-read-intensive 
workloads, and we have been using G1GC with Spark on Java 8 already 
(setting spark.driver.extraJavaOptions and spark.executor.extraJavaOptions 
to "-XX:+UseG1GC", e.g. via spark-submit as sketched below), while 
currently we are mostly running Spark (3.3 and higher) on Java 11.
That said, the best approach is always to measure your specific workloads; 
let me know if you find something different.
BTW, besides the Web UI, I typically also measure GC time with a couple of 
custom tools: https://github.com/cerndb/spark-dashboard and 
https://github.com/LucaCanali/sparkMeasure
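With sparkMeasure, for example, you can collect aggregated stage metrics, 
including JVM GC time, around a piece of workload. A minimal PySpark 
sketch, assuming the sparkmeasure Python package and its companion jar are 
installed and that spark is an active SparkSession (the query is just a 
placeholder workload):

  from sparkmeasure import StageMetrics

  stagemetrics = StageMetrics(spark)
  stagemetrics.begin()
  # placeholder workload to measure
  spark.range(1000 * 1000).selectExpr("sum(id)").show()
  stagemetrics.end()
  # the printed report aggregates stage-level metrics, jvmGCTime among them
  stagemetrics.print_report()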
You can find a few microbenchmark tests of Spark reading Parquet with a 
few different JDKs at: https://db-blog.web.cern.ch/node/192

Best,
Luca


From: Faiz Halde <haldef...@gmail.com>
Sent: Thursday, December 7, 2023 23:25
To: user@spark.apache.org
Subject: Spark on Java 17

Hello,

We are planning to switch to Java 17 for Spark and were wondering if there 
are any obvious learnings from anybody related to JVM tuning?

We've been running on Java 8 for a while now and used to use the parallel 
GC, as that used to be a general recommendation for high-throughput 
systems. How has the default G1GC worked out with Spark?

Thanks
Faiz
