I think that's a reasonable argument: it provides links to
potentially several logs of interest. It reduces the UI clutter a
little at the cost of one more hop to get to the logs.
I don't feel strongly about it, but I think that's a reasonable thing to do.

On Fri, Feb 8, 2019 at 4:57 PM Jungtaek Lim <kabh...@gmail.com> wrote:
>
> Let me quote some voices here, since it seems they aren't participating in this
> thread. This still doesn't show that the majority are using this pattern, so I'm
> also OK with making it optional (I might just work on SPARK-26792 to address
> that) and leaving the default as it is if others aren't interested in this.
>
> https://github.com/apache/spark/pull/23260#issuecomment-456827963
>
> Sorry, I haven't had time to look through all the code, so this might be a
> separate JIRA, but one thing I thought of here is that it would be really nice
> not to single out stderr/stdout specifically. Users can specify any
> log4j.properties, and some tools like Oozie end up using the Hadoop log4j by
> default rather than the Spark log4j, so the files aren't necessarily the same.
> Users can also write other log files, so it would be nice to have links to
> those from the UI. It seems simpler if we just had a link to the directory and
> it read the files within it. Other things in Hadoop do it this way, but I'm not
> sure if that works well for other resource managers; any thoughts on that? As
> long as this doesn't prevent the above, I can file a separate JIRA for it.
>
> https://github.com/apache/spark/pull/23260#issuecomment-456904716
>
> Hi Tom, +1: singling out stdout and stderr is definitely an annoyance. We
> typically configure Spark jobs to write the GC log and dump the heap on OOM
> using <LOG_DIR>, and/or we use the rolling file appender to deal with
> large logs during debugging. So linking the YARN container log overview
> page would make much more sense for us. We work around it with a custom
> submit process that logs all the important URLs on the submit-side log.
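>
> For context, the kind of setup being described is roughly the following
> (a sketch of the pattern only, not a quote of an actual configuration;
> <LOG_DIR> is the placeholder that YARN expands to the container's log
> directory when it launches the JVM):
>
>   # spark-defaults.conf: send GC logs and OOM heap dumps to the container log dir
>   spark.executor.extraJavaOptions  -Xloggc:<LOG_DIR>/gc.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<LOG_DIR>/heap.hprof
>
> Files written this way land next to stdout and stderr in the container log
> directory, but the current UI only links the two fixed names.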
>
>
>
> On Sat, Feb 9, 2019 at 5:42 AM Ryan Blue <rb...@netflix.com> wrote:
>>
>> Here's what I see from a running job on our cluster. Both of these are links
>> to the stderr and stdout pages that Spark produces today.
>>
>> stderr : Total file length is 18557 bytes.
>> stdout : Total file length is 0 bytes.
>>
>> While it is nice to see that stderr or stdout has content, I don't think 
>> that this is worth the extra click or changes to Spark.
>>
>> However, we have configured our logs to go to stderr and stdout so these 
>> links work for us. I think some YARN applications send logs to a separate 
>> log endpoint, which would be useful when listed here. Does anyone have logs 
>> going to locations other than stderr and stdout?
>>
>> If there are logs going to other files, then I think making this an option 
>> is reasonable. Otherwise, I think we should leave links as they are.
>>
>> rb
>>
>> On Thu, Feb 7, 2019 at 12:31 PM Jungtaek Lim <kabh...@gmail.com> wrote:
>>>
>>> The new URL shows all of the local logs, including stdout and stderr, as a list.
>>>
>>> The change would help when end users modify their log4j configuration to add
>>> other log files, as well as GC logs. Currently Spark only shows the two static
>>> files (stdout, stderr) as individual links, which makes it easy to see their
>>> content (one click), but users have to manually remove the file part from the
>>> URL to reach the listing page. Instead, we could change the default URL to show
>>> all of the local logs and let users choose which file to read (though it would
>>> take two clicks to reach the actual file).
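>>>
>>> To make that concrete, a user-supplied log4j.properties can add an extra
>>> appender that writes into the same container log directory (a sketch
>>> following the pattern shown in the running-on-YARN docs; the appender and
>>> file names are only illustrative):
>>>
>>>   # roll an application log alongside stdout/stderr in the container log dir
>>>   log4j.rootLogger=INFO, file_appender
>>>   log4j.appender.file_appender=org.apache.log4j.RollingFileAppender
>>>   log4j.appender.file_appender.File=${spark.yarn.app.container.log.dir}/spark.log
>>>   log4j.appender.file_appender.MaxFileSize=50MB
>>>   log4j.appender.file_appender.MaxBackupIndex=10
>>>   log4j.appender.file_appender.layout=org.apache.log4j.PatternLayout
>>>   log4j.appender.file_appender.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
>>>
>>> Such a file shows up in the directory listing, but it gets no direct link
>>> when the UI only exposes stdout and stderr.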
>>>
>>> -Jungtaek Lim (HeartSaVioR)
>>>
>>> On Fri, Feb 8, 2019 at 1:33 AM Ryan Blue <rb...@netflix.com> wrote:
>>>>
>>>> Jungtaek,
>>>>
>>>> What is shown at the new URL and how would this improve usability?
>>>>
>>>> On Thu, Feb 7, 2019 at 12:45 AM Jungtaek Lim <kabh...@gmail.com> wrote:
>>>>>
>>>>> Hi devs,
>>>>>
>>>>> Based on the suggestion Tom Graves gave me in SPARK-26792, I'd like to hear
>>>>> voices on changing the default executor log URLs for YARN, specifically
>>>>> removing "stdout" and "stderr" and providing a link which shows the log
>>>>> file(s). For example, instead of referring to the two links below:
>>>>>
>>>>> http://<NM_HOST>:<NM_PORT>/node/containerlogs/<CONTAINER_ID>/<USER>/<stdout|stderr>?start=-4096
>>>>>
>>>>> we would refer to only the one link below:
>>>>>
>>>>> http://<NM_HOST>:<NM_PORT>/node/containerlogs/<CONTAINER_ID>/<USER>
>>>>>
>>>>> I've checked that the new URL works with the redirection from the NM to the
>>>>> job history, so it won't break what we currently support. Reaching the actual
>>>>> log file would require two clicks instead of one, though.
>>>>>
>>>>> Given that it introduces a change to the UX, I'd like to hear voices on this
>>>>> before submitting a patch. If we'd rather keep this as it is, I would instead
>>>>> just open up the option to apply a custom log URL to the Spark UI as well.
>>>>>
>>>>> Thanks in advance!
>>>>>
>>>>> FYI, below is the rationale from the earlier discussion:
>>>>>
>>>>> While I was working on SPARK-23155, I got some input about linking the "log
>>>>> directory" instead of per-file log URLs for "stdout" and "stderr", because in
>>>>> real cases end users put more files there than just stdout and stderr (like GC
>>>>> logs).
>>>>>
>>>>> SPARK-23155 provides a way to modify the log URL, but it only applies to the
>>>>> SHS; the Spark UI for running apps still shows only "stdout" and "stderr".
>>>>> SPARK-26792 is for applying this to the Spark UI as well, but I got the
>>>>> suggestion to just change the default log URL.
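>>>>>
>>>>> As a rough illustration of what SPARK-23155 enables on the SHS side (the
>>>>> property name and tokens below are written from memory, so treat this as
>>>>> a sketch rather than exact syntax):
>>>>>
>>>>>   # reproduce the default YARN NM log URL via the SHS custom log URL pattern
>>>>>   spark.history.custom.executor.log.url={{HTTP_SCHEME}}{{NM_HOST}}:{{NM_HTTP_PORT}}/node/containerlogs/{{CONTAINER_ID}}/{{USER}}/{{FILE_NAME}}?start=-4096
>>>>>
>>>>> SPARK-26792 would let the live Spark UI accept the same kind of pattern.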
>>>>>
>>>>> Thanks again,
>>>>> Jungtaek Lim (HeartSaVioR)
>>>>
>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>> Software Engineer
>>>> Netflix
>>
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
