Ryan,

Actually, I'm not clear about your suggestion. I see three possible options
here:

1. If we want to let users completely rewrite log URLs, that's
SPARK-26792 <https://issues.apache.org/jira/browse/SPARK-26792>. For SHS we
have already addressed it.
2. We could let users toggle a flag option to get either a single URL or
the default two stdout/stderr URLs.
3. We could let users enumerate the file names they want to link, and create
a log link for each file.
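
To make options 2 and 3 concrete, here is a rough sketch in Python. It is only an illustration: the function name, its parameters, and the example host, container, and user values are hypothetical and not part of Spark. Only the URL pattern is taken from the proposal quoted below.

```python
from typing import Dict, Optional, Sequence


def container_log_urls(
    nm_host: str,
    nm_port: int,
    container_id: str,
    user: str,
    files: Optional[Sequence[str]] = ("stdout", "stderr"),
    start: int = -4096,
) -> Dict[str, str]:
    """Build executor log links for a YARN container (hypothetical helper).

    files=None       -> one link to the container log list page (option 2, flag on)
    files=(a, b, ..) -> one link per enumerated file name (option 3);
                        the default reproduces today's stdout/stderr links
    """
    # URL pattern as quoted in the thread.
    base = f"http://{nm_host}:{nm_port}/node/containerlogs/{container_id}/{user}"
    if files is None:
        # Single link; the user picks a file on the list page (two clicks).
        return {"logs": base}
    # One direct link per file (one click each).
    return {name: f"{base}/{name}?start={start}" for name in files}


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for label, url in container_log_urls("nm-host", 8042, "container_01", "alice").items():
        print(label, url)
```

Passing `files=None` corresponds to the single-link behavior, while passing a list such as `["stdout", "stderr", "gc.log"]` (where `gc.log` is just an example name) corresponds to option 3.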

Which one do you suggest?

On Sat, Feb 9, 2019 at 8:24 AM, Ryan Blue <rb...@netflix.com> wrote:

> Jungtaek,
>
> Thanks for the extra context. Those quotes are the confirmation that I was
> looking for to expose the link you suggest instead of going directly to
> stderr and stdout.
>
> What do you think about my suggestion to change this with a config option?
> I would prefer that since we use the supported pattern. But I would support
> moving forward on this either way.
>
> rb
>
> On Fri, Feb 8, 2019 at 3:03 PM Sean Owen <sro...@gmail.com> wrote:
>
>> I think that's a reasonable argument, that it provides links to
>> potentially several logs of interest. It reduces the UI clutter a
>> little at the cost of one more hop to get to logs.
>> I don't feel strongly about it but think that's a reasonable thing to do.
>>
>> On Fri, Feb 8, 2019 at 4:57 PM Jungtaek Lim <kabh...@gmail.com> wrote:
>> >
>> > Let me quote some voices here, since it seems they aren't participating in this
>> thread. This still doesn't show that the majority are using this pattern,
>> so I'm also OK with making it optional (I might just work on SPARK-26792 to
>> address it) and leaving the default as it is if others aren't interested in this.
>> >
>> > https://github.com/apache/spark/pull/23260#issuecomment-456827963
>> >
>> > Sorry I haven't had time to look through all the code so this might be
>> a separate jira, but one thing I thought of here is it would be really nice
>> not to have specifically stderr/stdout. Users can specify any
>> log4j.properties and some tools like oozie by default end up using hadoop
>> log4j rather than spark log4j, so files aren't necessarily the same. Also
>> users can put in other log files so it would be nice to have links to
>> those from the UI. It seems simpler if we just had a link to the directory
>> and it read the files within there. Other things in Hadoop do it this way,
>> but I'm not sure if that works well for other resource managers, any
>> thoughts on that? As long as this doesn't prevent the above I can file a
>> separate jira for it.
>> >
>> > https://github.com/apache/spark/pull/23260#issuecomment-456904716
>> >
>> > Hi Tom, +1: singling out stdout and stderr is definitely an annoyance. We
>> > typically configure Spark jobs to write the GC log and dump heap on OOM
>> > using <LOG_DIR>, and/or we use the rolling file appender to deal with
>> > large logs during debugging. So linking the YARN container log overview
>> > page would make much more sense for us. We work around it with a custom
>> > submit process that logs all important URLs on the submit side log.
>> >
>> >
>> >
>> > On Sat, Feb 9, 2019 at 5:42 AM, Ryan Blue <rb...@netflix.com> wrote:
>> >>
>> >> Here's what I see from a running job on our cluster. Both of these are
>> links that go to the stderr and stdout links that Spark produces today.
>> >>
>> >> stderr : Total file length is 18557 bytes.
>> >> stdout : Total file length is 0 bytes.
>> >>
>> >> While it is nice to see that stderr or stdout has content, I don't
>> think that this is worth the extra click or changes to Spark.
>> >>
>> >> However, we have configured our logs to go to stderr and stdout so
>> these links work for us. I think some YARN applications send logs to a
>> separate log endpoint, which would be useful when listed here. Does anyone
>> have logs going to locations other than stderr and stdout?
>> >>
>> >> If there are logs going to other files, then I think making this an
>> option is reasonable. Otherwise, I think we should leave links as they are.
>> >>
>> >> rb
>> >>
>> >> On Thu, Feb 7, 2019 at 12:31 PM Jungtaek Lim <kabh...@gmail.com>
>> wrote:
>> >>>
>> >>> The new URL shows all of the local logs, including stdout and stderr, as a
>> list.
>> >>>
>> >>> The change would help when end users modify their log4j configuration
>> to write additional log files, or when they produce GC logs. Currently Spark only shows
>> two static files (stdout, stderr) as individual links, so it is easy to see the
>> content (one click), but users have to manually remove the file part from the URL to
>> access the list page. Instead, we may be able to change the default URL to
>> show all of the local logs and let users choose which file to read (though it
>> would take two clicks to reach the actual file).
>> >>>
>> >>> -Jungtaek Lim (HeartSaVioR)
>> >>>
>> >>> On Fri, Feb 8, 2019 at 1:33 AM, Ryan Blue <rb...@netflix.com> wrote:
>> >>>>
>> >>>> Jungtaek,
>> >>>>
>> >>>> What is shown at the new URL and how would this improve usability?
>> >>>>
>> >>>> On Thu, Feb 7, 2019 at 12:45 AM Jungtaek Lim <kabh...@gmail.com>
>> wrote:
>> >>>>>
>> >>>>> Hi devs,
>> >>>>>
>> >>>>> Based on the suggestion Tom Graves gave me in SPARK-26792, I'd like
>> to hear opinions on changing the default executor log URLs for YARN, specifically
>> removing "stdout" and "stderr" and providing a single link which shows all log files.
>> For example, instead of referring to the two links below:
>> >>>>>
>> >>>>> http://<NM_HOST>:<NM_PORT>/node/containerlogs/<CONTAINER_ID>/<USER>/<stdout|stderr>?start=-4096
>> >>>>>
>> >>>>> we would refer to only the one link below:
>> >>>>>
>> >>>>> http://<NM_HOST>:<NM_PORT>/node/containerlogs/<CONTAINER_ID>/<USER>
>> >>>>>
>> >>>>> I've checked that the new URL works with the redirection from NM to jobhistory, so
>> it won't break what we currently support. Reaching the actual log
>> file would require two clicks instead of one, though.
>> >>>>>
>> >>>>> Given that it changes the UX, I'd like to hear opinions on
>> this before submitting a patch. If we'd rather keep this as it is, I would
>> just make it possible to apply a custom log URL to the Spark UI as well.
>> >>>>>
>> >>>>> Thanks in advance!
>> >>>>>
>> >>>>> FYI, below is the rationale for this discussion:
>> >>>>>
>> >>>>> While I worked on SPARK-23155, I got some input about
>> linking the "log directory" instead of separate log URLs for "stdout" and "stderr",
>> because in real cases end users would write more files than only stdout and
>> stderr (like GC logs).
>> >>>>>
>> >>>>> SPARK-23155 provides a way to modify the log URL, but it's only
>> applied to SHS; the Spark UI for running apps still only shows
>> "stdout" and "stderr". SPARK-26792 is for applying this to the Spark UI as
>> well, but I got a suggestion to just change the default log URL.
>> >>>>>
>> >>>>> Thanks again,
>> >>>>> Jungtaek Lim (HeartSaVioR)
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Ryan Blue
>> >>>> Software Engineer
>> >>>> Netflix
>> >>
>> >>
>> >>
>> >> --
>> >> Ryan Blue
>> >> Software Engineer
>> >> Netflix
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>
