Re: Persisting driver logs in yarn client mode (SPARK-25118)

2018-08-27 Thread Henry Robinson
On Mon, 27 Aug 2018 at 13:04, Ankur Gupta wrote:
> Thanks all for your responses.
>
> So I believe a solution that accomplishes the following will be a good
> solution:
>
> 1. Writes logs to HDFS asynchronously

In the limit, this could perform just as slowly at shutdown time as synchronous

Re: Persisting driver logs in yarn client mode (SPARK-25118)

2018-08-27 Thread Ankur Gupta
Thanks all for your responses. So I believe a solution that accomplishes the following will be a good solution:
1. Writes logs to HDFS asynchronously
2. Writes logs at INFO level while ensuring that console logs are written at WARN level by default (in shell mode)
3. Optionally, moves this file
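
Points 1 and 2 in the list above (asynchronous writes, INFO to the file but WARN to the console) can be illustrated with a minimal sketch. This is only an analogy built on Python's standard `logging` module, not Spark, log4j, or HDFS code; the names `build_logger`, `shutdown`, and `driver_sketch` are hypothetical and a local file stands in for the remote log destination.

```python
import logging
import logging.handlers
import queue
import sys


def build_logger(log_path, name="driver_sketch"):
    """Return (logger, listener). Hypothetical helper, for illustration only."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()

    # Console handler: only WARNING and above reach the terminal.
    console = logging.StreamHandler(sys.stderr)
    console.setLevel(logging.WARNING)
    logger.addHandler(console)

    # File handler: INFO and above. It is attached to the listener, so the
    # actual file writes happen on the listener's background thread.
    file_handler = logging.FileHandler(log_path, encoding="utf-8")
    file_handler.setLevel(logging.INFO)
    file_handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

    # The application thread only enqueues records; it does not touch the file.
    log_queue = queue.Queue()
    logger.addHandler(logging.handlers.QueueHandler(log_queue))

    listener = logging.handlers.QueueListener(
        log_queue, file_handler, respect_handler_level=True
    )
    listener.start()
    return logger, listener


def shutdown(listener):
    """Drain the queue and close the file. This call blocks until done."""
    listener.stop()
    for handler in listener.handlers:
        handler.close()
```

Note that `shutdown` still has to wait for whatever is left in the queue, so a large backlog at exit would delay shutdown in the same way a synchronous write would. Writing incrementally while the application runs keeps that backlog small.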

Re: Persisting driver logs in yarn client mode (SPARK-25118)

2018-08-24 Thread Marcelo Vanzin
I think this would be useful, but I also share Saisai's and Marco's concern about the extra step when shutting down the application. If that could be minimized, this would be a much more interesting feature. For example, you could upload logs incrementally to HDFS, asynchronously, while the app is

Re: Persisting driver logs in yarn client mode (SPARK-25118)

2018-08-22 Thread Ankur Gupta
Thanks for your responses, Saisai and Marco. I agree that the "rename" operation can be time-consuming on object storage, which can potentially delay the shutdown. I also agree that customers/users have a way to use log appenders to write log files and then send them along with YARN application logs

Re: Persisting driver logs in yarn client mode (SPARK-25118)

2018-08-22 Thread Marco Gaido
I agree with Saisai. You can also configure log4j to append to somewhere other than the console. Many companies have their own systems for collecting and monitoring logs, and they just customize the log4j configuration. I am not sure how necessary this change would be. Thanks, Marco On Wed, 22 Aug

Re: Persisting driver logs in yarn client mode (SPARK-25118)

2018-08-21 Thread Saisai Shao
One issue I can think of is that this "moving the driver log" step at application end is quite time-consuming, which will significantly delay the shutdown. We have already suffered from this "rename" problem for event logs on object stores, and moving the driver log will make the problem more severe. For a vanilla

Persisting driver logs in yarn client mode (SPARK-25118)

2018-08-21 Thread Ankur Gupta
Hi all, I want to highlight a problem that we face here at Cloudera and start a discussion on how to go about solving it. *Problem Statement:* Our customers reach out to us when they face problems in their Spark Applications. Those problems can be related to Spark, environment issues, their own