Spark on YARN supports custom log4j configuration by default; a RollingFileAppender can be used to avoid disk overflow, as documented below.
If you need a reference to the proper location for log files so that YARN can properly display and aggregate them, use spark.yarn.app.container.log.dir in your log4j.properties. For example: log4j.appender.file_appender.File=${spark.yarn.app.container.log.dir}/spark.log.

For streaming applications, configuring RollingFileAppender and pointing the file location at YARN's log directory avoids disk overflow caused by large log files, and the logs remain accessible through YARN's log utility.

You can get more information here:
https://spark.apache.org/docs/latest/running-on-yarn.html#configuration

At 2016-08-23 18:44:29, "Pradeep" <pradeep.mi...@mail.com> wrote:
>Hi All,
>
>I am running Java Spark Streaming jobs in yarn-client mode. Is there a way I
>can manage log rollover on the edge node? I have a 10-second batch and the
>log file volume is huge.
>
>Thanks,
>Pradeep
>
>---------------------------------------------------------------------
>To unsubscribe e-mail: user-unsubscr...@spark.apache.org
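As a rough sketch, a log4j.properties along these lines would do it (the appender name `file_appender` and the size/backup limits here are illustrative, not prescribed values; adjust them to your log volume):

```properties
# Route the root logger to a rolling file appender
log4j.rootLogger=INFO, file_appender

# RollingFileAppender caps the file size and keeps a bounded number of backups,
# so logs cannot grow without limit on long-running streaming jobs
log4j.appender.file_appender=org.apache.log4j.RollingFileAppender

# Write under YARN's container log dir so YARN can display and aggregate the logs
log4j.appender.file_appender.File=${spark.yarn.app.container.log.dir}/spark.log

# Illustrative limits: roll at 50MB, keep at most 10 rolled files
log4j.appender.file_appender.MaxFileSize=50MB
log4j.appender.file_appender.MaxBackupIndex=10

log4j.appender.file_appender.layout=org.apache.log4j.PatternLayout
log4j.appender.file_appender.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

The file then needs to reach the containers; per the running-on-yarn docs this is typically done by shipping it with --files log4j.properties and referencing it via -Dlog4j.configuration=log4j.properties in spark.driver.extraJavaOptions / spark.executor.extraJavaOptions (verify the exact mechanism against your Spark version).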