I think this would be useful, but I also share Saisai's and Marco's
concern about the extra step when shutting down the application. If
that could be minimized this would be a much more interesting feature.

For example, you could upload logs incrementally to HDFS, asynchronously,
while the app is running. Or you could pipe them to the YARN AM over
Spark's RPC (losing some logs at the beginning and end of the driver's
execution). Or maybe something else.
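
As a rough illustration of the first idea (just a sketch, assuming
log4j 1.x and the Hadoop FileSystem API; the class name, flush interval
and path handling are made up), a custom appender could stream log
lines straight to an HDFS file and flush them from a background thread,
so there is nothing left to move at shutdown:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path
    import org.apache.log4j.AppenderSkeleton
    import org.apache.log4j.spi.LoggingEvent

    // Sketch: write driver log lines directly to a file on HDFS and
    // flush periodically, so shutdown only has to close the stream.
    class HdfsStreamAppender(hdfsPath: String) extends AppenderSkeleton {
      private val path = new Path(hdfsPath)
      private val out =
        path.getFileSystem(new Configuration()).create(path, true)
      @volatile private var stopped = false

      // Push buffered bytes out every few seconds, off the logging hot path.
      private val flusher = new Thread(new Runnable {
        override def run(): Unit = {
          while (!stopped) {
            Thread.sleep(5000)
            HdfsStreamAppender.this.synchronized {
              if (!stopped) out.hflush()
            }
          }
        }
      })
      flusher.setDaemon(true)
      flusher.start()

      // Relies on a layout being configured for this appender.
      override def append(event: LoggingEvent): Unit = synchronized {
        out.write(layout.format(event).getBytes("UTF-8"))
      }

      override def close(): Unit = synchronized {
        stopped = true
        out.hflush()
        out.close()
      }

      override def requiresLayout(): Boolean = true
    }

The same shape would also work for shipping the lines to the AM over
RPC instead of writing to HDFS directly.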

There is also the issue of shell logs being at "warn" level by
default, so even if you write these to a file, they're not really that
useful for debugging. So a solution that keeps that behavior, but
writes INFO logs to this new sink, would be great.
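
For example (again only a sketch against log4j 1.x; it assumes Spark's
default configuration, where the console appender is named "console",
and the file path is a placeholder), the shell could keep the console
at WARN while the new sink collects INFO:

    import org.apache.log4j.{AppenderSkeleton, FileAppender, Level,
      Logger, PatternLayout}

    // Let INFO through at the root, but keep the console as quiet as today.
    val root = Logger.getRootLogger
    root.setLevel(Level.INFO)

    // Spark's default log4j.properties names its console appender
    // "console"; adjust if your configuration differs.
    root.getAppender("console") match {
      case console: AppenderSkeleton => console.setThreshold(Level.WARN)
      case _ => // no console appender found; nothing to do
    }

    // New sink: INFO and above land in a file that can be collected later.
    val fileSink = new FileAppender(
      new PatternLayout("%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n"),
      "/tmp/driver-info.log") // placeholder path
    fileSink.setThreshold(Level.INFO)
    root.addAppender(fileSink)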

If you can come up with a solution to those problems I think this
could be a good feature.


On Wed, Aug 22, 2018 at 10:01 AM, Ankur Gupta
<ankur.gu...@cloudera.com.invalid> wrote:
> Thanks for your responses Saisai and Marco.
>
> I agree that the "rename" operation can be time-consuming on object storage,
> which can potentially delay the shutdown.
>
> I also agree that customers/users can use log appenders to write log
> files and then send them along with the Yarn application logs, but I
> still think it is a cumbersome process. There is also the issue that
> customers cannot easily identify which logs belong to which application
> without reading the log file itself. And if users run multiple
> applications with the default log4j configuration on the same host,
> they can end up writing to the same log file.
>
> Because of these issues, we could maybe treat this as an optional
> feature, disabled by default and explicitly turned on by customers.
> That would solve the problems above and reduce the burden on
> users/customers, while adding only a bit of overhead during the
> shutdown phase of the Spark application.
>
> Thanks,
> Ankur
>
> On Wed, Aug 22, 2018 at 1:36 AM Marco Gaido <marcogaid...@gmail.com> wrote:
>>
>> I agree with Saisai. You can also configure log4j to append anywhere
>> other than the console. Many companies have their own systems for
>> collecting and monitoring logs, and they just customize the log4j
>> configuration. I am not sure how necessary this change would be.
>>
>> Thanks,
>> Marco
>>
>> On Wed, Aug 22, 2018 at 4:31 AM Saisai Shao
>> <sai.sai.s...@gmail.com> wrote:
>>>
>>> One issue I can think of is that "moving the driver log" at
>>> application end is quite time-consuming, which will significantly
>>> delay the shutdown. We already suffer from such a "rename" problem
>>> for the event log on object stores; moving the driver log will make
>>> the problem more severe.
>>>
>>> For a vanilla Spark on YARN client application, I think the user
>>> could redirect the console output to a log file and provide both the
>>> driver log and the YARN application log to the customers; this does
>>> not seem like a big overhead.
>>>
>>> Just my two cents.
>>>
>>> Thanks
>>> Saisai
>>>
>>> Ankur Gupta <ankur.gu...@cloudera.com.invalid> wrote on Wed, Aug 22, 2018 at 5:19 AM:
>>>>
>>>> Hi all,
>>>>
>>>> I want to highlight a problem that we face here at Cloudera and start a
>>>> discussion on how to go about solving it.
>>>>
>>>> Problem Statement:
>>>> Our customers reach out to us when they face problems in their Spark
>>>> applications. Those problems can be related to Spark, environment
>>>> issues, their own code or something else altogether. A lot of the
>>>> time these customers run their Spark applications in Yarn client
>>>> mode, which, as we all know, uses a ConsoleAppender to print logs to
>>>> the console. These customers usually send us their Yarn logs to
>>>> troubleshoot. As you may have figured, these logs do not contain the
>>>> driver logs, which makes it difficult for us to troubleshoot the
>>>> issue. In that scenario our customers end up running the application
>>>> again, piping the output to a log file or using a local log appender,
>>>> and then sending over that file.
>>>>
>>>> I believe there are other users in the community who face a similar
>>>> problem, where the central team managing Spark clusters has
>>>> difficulty helping the end users because they ran their application
>>>> in the shell or in yarn client mode (I am not sure what the
>>>> equivalent is in Mesos).
>>>>
>>>> Additionally, there may be teams who want to capture all these logs
>>>> so that they can be analyzed at some later point in time. The fact
>>>> that driver logs are not a part of the Yarn logs means they either
>>>> capture only partial logs or find it difficult to capture all of
>>>> them.
>>>>
>>>> Proposed Solution:
>>>> One "low touch" approach will be to create an ApplicationListener which
>>>> listens for Application Start and Application End events. On Application
>>>> Start, this listener will append a Log Appender which writes to a local or
>>>> remote (eg:hdfs) log file in an application specific directory and moves
>>>> this to Yarn's Remote Application Dir (or equivalent Mesos Dir) on
>>>> application end. This way the logs will be available as part of Yarn Logs.
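>>>>
>>>> A rough sketch of what I have in mind is below (illustrative only,
>>>> not existing Spark code; the config key, local path and layout
>>>> pattern are placeholders), using a SparkListener with log4j 1.x and
>>>> the Hadoop FileSystem API:
>>>>
>>>>     import org.apache.hadoop.conf.Configuration
>>>>     import org.apache.hadoop.fs.Path
>>>>     import org.apache.log4j.{FileAppender, Level, Logger, PatternLayout}
>>>>     import org.apache.spark.SparkConf
>>>>     import org.apache.spark.scheduler.{SparkListener,
>>>>       SparkListenerApplicationEnd, SparkListenerApplicationStart}
>>>>
>>>>     class DriverLogListener(conf: SparkConf) extends SparkListener {
>>>>       // Hypothetical config key, for illustration only.
>>>>       private val remoteLogDir =
>>>>         conf.get("spark.driver.log.remoteDir", "/tmp/driver-logs")
>>>>       private var localFile: String = _
>>>>       private var appender: FileAppender = _
>>>>
>>>>       override def onApplicationStart(
>>>>           start: SparkListenerApplicationStart): Unit = {
>>>>         // Per-application file name, so concurrent apps on one host
>>>>         // never share a log file.
>>>>         val appId = start.appId.getOrElse("unknown-app")
>>>>         localFile = s"/tmp/$appId-driver.log"
>>>>         appender = new FileAppender(
>>>>           new PatternLayout("%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n"),
>>>>           localFile)
>>>>         appender.setThreshold(Level.INFO)
>>>>         Logger.getRootLogger.addAppender(appender)
>>>>       }
>>>>
>>>>       override def onApplicationEnd(
>>>>           end: SparkListenerApplicationEnd): Unit = {
>>>>         Logger.getRootLogger.removeAppender(appender)
>>>>         appender.close()
>>>>         // Copy the finished log next to the app's other Yarn logs.
>>>>         val dest = new Path(remoteLogDir)
>>>>         dest.getFileSystem(new Configuration())
>>>>           .copyFromLocalFile(new Path(localFile), dest)
>>>>       }
>>>>     }
>>>>
>>>> The listener could then be registered through spark.extraListeners,
>>>> so that no user code changes are needed.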
>>>>
>>>> I am also interested in hearing other ideas that the community may
>>>> have about this. Or, if someone has already solved this problem, I
>>>> would encourage them to contribute their solution to the community.
>>>>
>>>> Thanks,
>>>> Ankur



-- 
Marcelo
