pan3793 commented on issue #6481:
URL: https://github.com/apache/kyuubi/issues/6481#issuecomment-2175972882
An interesting diagnosis. TBH, the direct memory usage of the JVM has always been a mystery to me, and I agree with you that we should find a way to analyze the direct memory used by the Spark driver launched by Kyuubi.

> ... the engine started but without executing any SQL.

Have you unset `kyuubi.engine.initialize.sql=SHOW DATABASES`? If not, it will initialize a HiveClient and create a bunch of Hive objects.

I also find that the default value of `spark.[driver|executor].memoryOverheadFactor`, 0.1, is too small for most production Spark jobs. To tackle the YARN OOM kill issues, we have done the following (see the config sketch after this list):

- SPARK-47208 introduces `spark.[driver|executor].minMemoryOverhead` to make the hardcoded 384m configurable; in practice, we set it to 2g to gain stability.
- Disable direct memory usage of Netty by setting `spark.network.io.preferDirectBufs=false` and `spark.shuffle.io.preferDirectBufs=false`.
- Lower the concurrency of shuffle block fetch network requests by setting `spark.reducer.maxReqsInFlight=256`.
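For reference, a minimal `spark-defaults.conf` sketch combining the mitigations above. The values (2g, 256) are just what worked in our environment, not universal recommendations, and `spark.[driver|executor].minMemoryOverhead` requires a Spark build that includes SPARK-47208:

```properties
# Raise the minimum memory overhead from the hardcoded 384m (needs SPARK-47208)
spark.driver.minMemoryOverhead      2g
spark.executor.minMemoryOverhead    2g

# Have Netty allocate heap buffers instead of direct memory
spark.network.io.preferDirectBufs   false
spark.shuffle.io.preferDirectBufs   false

# Cap concurrent shuffle block fetch requests to bound network buffer usage
spark.reducer.maxReqsInFlight       256
```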

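On the diagnosis side, here is a minimal sketch of one way to inspect NIO direct buffer usage from inside the driver JVM via the standard `BufferPoolMXBean` (the `DirectMemoryProbe` name is hypothetical, not a Kyuubi or Spark API). Note this only accounts for buffers allocated through `java.nio`; Netty can allocate direct memory that bypasses these pools, so running the JVM with `-XX:NativeMemoryTracking=summary` and checking `jcmd <pid> VM.native_memory summary` is a useful complement:

```scala
import java.lang.management.{BufferPoolMXBean, ManagementFactory}
import scala.jdk.CollectionConverters._ // Scala 2.13; use scala.collection.JavaConverters on 2.12

// Hypothetical helper, not part of Kyuubi/Spark: dumps the JVM's NIO buffer
// pools ("direct" and "mapped") for a rough view of direct memory consumption
// inside the driver process.
object DirectMemoryProbe {
  def dump(): Unit = {
    ManagementFactory
      .getPlatformMXBeans(classOf[BufferPoolMXBean])
      .asScala
      .foreach { pool =>
        println(
          s"pool=${pool.getName} count=${pool.getCount} " +
            s"used=${pool.getMemoryUsed}B capacity=${pool.getTotalCapacity}B")
      }
  }

  def main(args: Array[String]): Unit = dump()
}
```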