I have a quick question about this configuration, particularly this line:

 

log4j.appender.rolling.file=/var/log/spark/<logfilename>

 

Where does that path live? On the driver, or on each executor individually?

 

Thank you

 

From: Jain, Abhishek 3. (Nokia - IN/Bangalore) <abhishek.3.j...@nokia.com> 
Sent: Thursday, February 14, 2019 7:48 AM
To: Deepak Sharma <deepakmc...@gmail.com>
Cc: spark users <user@spark.apache.org>
Subject: RE: Spark streaming filling the disk with logs

 

++

If you can afford losing a few old logs, you can also make use of a rolling 
file appender.

 

log4j.rootLogger=INFO, rolling
log4j.appender.rolling=org.apache.log4j.RollingFileAppender
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.maxFileSize=50MB
log4j.appender.rolling.maxBackupIndex=5
log4j.appender.rolling.file=/var/log/spark/<logfilename>
log4j.logger.org.apache.spark=<LogLevel>

 

This means log4j will roll the log file at 50 MB and keep only the 5 most 
recent backups. The files are saved in the /var/log/spark directory, under the 
filename given.
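Note that log4j resolves that file path locally on whichever JVM loads the 
configuration, so with the settings above the driver and each executor would 
write to /var/log/spark on their own node. To have a single job's driver and 
executors pick up a custom file, one common approach is to ship it with 
spark-submit. This is only a sketch: the class name and jar are hypothetical 
placeholders, and the exact flags can vary by cluster manager (on YARN, 
--files places the file in each executor's working directory, which is on its 
classpath):

spark-submit \
  --files log4j.properties \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --class com.example.StreamingApp \
  streaming-app.jar

This way the logging change applies only to the submitted job, not the whole 
cluster.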

 

Regards,
Abhishek Jain

 

From: Jain, Abhishek 3. (Nokia - IN/Bangalore) 
Sent: Thursday, February 14, 2019 5:58 PM
To: Deepak Sharma <deepakmc...@gmail.com>
Cc: spark users <user@spark.apache.org>
Subject: RE: Spark streaming filling the disk with logs

 

Hi Deepak,

 

Spark logging can be configured for different purposes. For example, if you 
want to control the spark-submit (REPL) log, you can set 
“log4j.logger.org.apache.spark.repl.Main=WARN” (or INFO/ERROR).

 

Similarly, to control third party logs:

log4j.logger.org.spark_project.jetty=<LEVEL>, 
log4j.logger.org.apache.parquet=<LEVEL>, etc.

 

These properties can be set in the conf/log4j.properties file.
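Putting these together, a minimal conf/log4j.properties might look like the 
following. This is a sketch only: the WARN/ERROR levels and the console 
appender are illustrative choices, not part of the original message:

log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
# Quiet noisy third-party libraries
log4j.logger.org.spark_project.jetty=WARN
log4j.logger.org.apache.parquet=ERROR
log4j.logger.org.apache.spark.repl.Main=WARN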

 

Hope this helps! 😊

 

Regards,

Abhishek Jain

 

From: Deepak Sharma <deepakmc...@gmail.com> 
Sent: Thursday, February 14, 2019 12:10 PM
To: spark users <user@spark.apache.org>
Subject: Spark streaming filling the disk with logs

 

Hi All

I am running a spark streaming job with below configuration :

 

--conf "spark.executor.extraJavaOptions=-Droot.logger=WARN,console"

 

But it is still filling the disk with INFO logs.

If the logging level is set to WARN at the cluster level, only WARN logs are 
written, but that affects all jobs.

 

Is there any way to get rid of INFO-level logging for an individual Spark 
streaming job?

 

Thanks

Deepak 

 

-- 

Thanks
Deepak
www.bigdatabig.com
www.keosha.net
