Re: spark streaming: stderr does not roll
So, to summarize (I had a similar question): Spark's log4j is configured by default to log to the console, those messages end up in the executors' stderr files, and that mechanism does not support rolling. Is that correct?

If I configure log4j to log to files instead, how can I keep the folder structure? Should I use relative paths and assume the log files end up in the same folders the stderr files do?

Regards,
Jeff

2015-02-25 9:35 GMT+01:00 Sean Owen so...@cloudera.com:

These settings don't control what happens to stderr, right? stderr is up to the process that invoked the driver to control. You may wish to configure log4j to log to files instead.

[...]
Re: spark streaming: stderr does not roll
I think that's up to you. You can make it log wherever you want, and have some control over how log4j names the rolled log files by configuring its file-based rolling appender.

On Thu, Feb 26, 2015 at 10:05 AM, Jeffrey Jedele jeffrey.jed...@gmail.com wrote:

[...]
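To make that concrete, here is a minimal log4j.properties sketch; the appender name, file name, and size limits are just placeholders. It assumes the executor JVM actually picks this file up, and that a relative File path resolves against the executor's working directory, which, with the standalone worker at least, should be the same per-executor folder the stderr file sits in:

log4j.rootCategory=INFO, rolling
log4j.appender.rolling=org.apache.log4j.RollingFileAppender
log4j.appender.rolling.File=app.log
log4j.appender.rolling.MaxFileSize=50MB
log4j.appender.rolling.MaxBackupIndex=5
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

log4j names the rolled files app.log.1, app.log.2, and so on, and MaxFileSize times (MaxBackupIndex + 1) bounds the disk space the appender can use per executor. Since nothing goes to the console appender anymore, the stderr file should then stop growing except for output that bypasses log4j.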
Re: spark streaming: stderr does not roll
These settings don't control what happens to stderr, right? stderr is up to the process that invoked the driver to control. You may wish to configure log4j to log to files instead.

On Wed, Nov 12, 2014 at 8:15 PM, Nguyen, Duc duc.ngu...@pearson.com wrote:

[...]
Re: spark streaming: stderr does not roll
I'm also facing the same issue. I tried those configuration settings, but the executors' own log4j.properties seems to override the values passed in, so you have to change /etc/spark/conf/log4j.properties. Let me know if any of you have managed to fix this programmatically. In the meantime I am planning to use logrotate to rotate these logs.

On Thu, Nov 13, 2014 at 1:45 AM, Nguyen, Duc duc.ngu...@pearson.com wrote:

[...]

--
Thanks and Regards,
Mukesh Jha me.mukesh@gmail.com
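For what it's worth, the pattern I have seen for overriding the executors' log4j.properties without editing /etc/spark/conf on every node is to ship a custom file with the job and point the executor JVMs at it. A sketch, where the file, class, and jar names are placeholders; the relative -Dlog4j.configuration form relies on the file being localized into the executor's working directory/classpath the way YARN does with --files, while on a standalone cluster it may be more reliable to point it at a file: URL that already exists on every worker:

spark-submit \
  --files /local/path/my-log4j.properties \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=my-log4j.properties" \
  --class com.example.StreamingJob streaming-job.jar

And if you do fall back to logrotate on the stderr files themselves, copytruncate is probably needed, because the worker keeps the file handle open and a plain rename would leave it writing to the already-rotated file. A rough sketch, with the work directory path as a placeholder:

/opt/spark/work/*/*/stderr {
  size 100M
  rotate 3
  copytruncate
  compress
  missingok
}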
Re: spark streaming: stderr does not roll
I've also tried setting the aforementioned properties using System.setProperty() as well as on the command line when submitting the job with --conf key=value, all to no avail. When I go to the Spark UI, click on that particular streaming job, and open the Environment tab, I can see the properties are correctly set. But regardless of what I've tried, the stderr log file on the worker nodes does not roll and continues to grow, eventually crashing the cluster once it claims 100% of the disk. Has anyone else encountered this? Anyone?

On Fri, Nov 7, 2014 at 3:35 PM, Nguyen, Duc duc.ngu...@pearson.com wrote:

We are running Spark Streaming jobs (version 1.1.0). After a sufficient amount of time, the stderr file grows until the disk is 100% full and crashes the cluster. I've read this pull request:

https://github.com/apache/spark/pull/895

and also this documentation page:

http://spark.apache.org/docs/latest/configuration.html#spark-streaming

So I've tried the following in an attempt to get the stderr log file to roll:

sparkConf.set("spark.executor.logs.rolling.strategy", "size")
  .set("spark.executor.logs.rolling.size.maxBytes", "1024")
  .set("spark.executor.logs.rolling.maxRetainedFiles", "3")

Yet it does not roll and continues to grow. Am I missing something obvious?

thanks,
Duc
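One thing that may be worth checking, assuming this is a standalone cluster: the stderr and stdout files under each worker's work/ directory are written by the Worker process that launches the executor, so the spark.executor.logs.rolling.* settings may need to be visible to the Worker itself rather than only to the application's SparkConf, which would explain why they show up in the Environment tab yet have no effect. A sketch of that idea, mirroring the values above, in conf/spark-env.sh on each worker node:

export SPARK_WORKER_OPTS="-Dspark.executor.logs.rolling.strategy=size \
  -Dspark.executor.logs.rolling.size.maxBytes=1024 \
  -Dspark.executor.logs.rolling.maxRetainedFiles=3"

The workers would need to be restarted for this to take effect, and it would only apply to executors launched afterwards.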