Jungtaek,

Thanks for the extra context. Those quotes are the confirmation I was looking for to expose the link you suggest instead of going directly to stderr and stdout.
What do you think about my suggestion to change this with a config option? I
would prefer that since it uses the supported pattern. But I would support
moving forward on this either way.

rb

On Fri, Feb 8, 2019 at 3:03 PM Sean Owen <sro...@gmail.com> wrote:
> I think that's a reasonable argument, that it provides links to
> potentially several logs of interest. It reduces the UI clutter a
> little at the cost of one more hop to get to logs.
> I don't feel strongly about it but think that's a reasonable thing to do.
>
> On Fri, Feb 8, 2019 at 4:57 PM Jungtaek Lim <kabh...@gmail.com> wrote:
> >
> > Let me quote some voices here: it seems they aren't participating in
> > this thread. This still doesn't mean the majority are using this
> > pattern, so I'm also OK to make it optional (I might just work on
> > SPARK-26792 to address it) and leave the default as it is if others
> > aren't interested in this.
> >
> > https://github.com/apache/spark/pull/23260#issuecomment-456827963
> >
> > Sorry I haven't had time to look through all the code so this might be
> > a separate jira, but one thing I thought of here is it would be really
> > nice not to have specifically stderr/stdout. Users can specify any
> > log4j.properties, and some tools like oozie by default end up using
> > hadoop log4j rather than spark log4j, so the files aren't necessarily
> > the same. Also users can put in other log files, so it would be nice
> > to have links to those from the UI. It seems simpler if we just had a
> > link to the directory and it read the files within there. Other things
> > in Hadoop do it this way, but I'm not sure if that works well for
> > other resource managers, any thoughts on that? As long as this doesn't
> > prevent the above I can file a separate jira for it.
> >
> > https://github.com/apache/spark/pull/23260#issuecomment-456904716
> >
> > Hi Tom, +1: singling out stdout and stderr is definitely an annoyance.
> > We typically configure Spark jobs to write the GC log and dump heap on
> > OOM using <LOG_DIR>, and/or we use the rolling file appender to deal
> > with large logs during debugging. So linking the YARN container log
> > overview page would make much more sense for us. We work around it
> > with a custom submit process that logs all important URLs in the
> > submit-side log.
> >
> >
> > On Sat, Feb 9, 2019 at 5:42 AM, Ryan Blue <rb...@netflix.com> wrote:
> >>
> >> Here's what I see from a running job on our cluster. Both of these
> >> are links that go to the stderr and stdout links that Spark produces
> >> today.
> >>
> >> stderr : Total file length is 18557 bytes.
> >> stdout : Total file length is 0 bytes.
> >>
> >> While it is nice to see that stderr or stdout has content, I don't
> >> think that this is worth the extra click or changes to Spark.
> >>
> >> However, we have configured our logs to go to stderr and stdout, so
> >> these links work for us. I think some YARN applications send logs to
> >> a separate log endpoint, which would be useful when listed here. Does
> >> anyone have logs going to locations other than stderr and stdout?
> >>
> >> If there are logs going to other files, then I think making this an
> >> option is reasonable. Otherwise, I think we should leave the links as
> >> they are.
> >>
> >> rb
> >>
> >> On Thu, Feb 7, 2019 at 12:31 PM Jungtaek Lim <kabh...@gmail.com> wrote:
> >>>
> >>> The new URL shows all of the local logs, including stdout and
> >>> stderr, as a list.
> >>>
> >>> The change would help when end users modify their log4j
> >>> configuration to add other log files, as well as GC logs. Currently
> >>> Spark only shows two static files (stdout, stderr) as individual
> >>> links, so it's easier to see the content (one click), but users have
> >>> to remove the file part manually from the URL to access the list
> >>> page. Instead of this we may be able to change the default URL to
> >>> show all of the local logs and let users choose which file to read.
> >>> (though it would be two clicks to access the actual file)
> >>>
> >>> -Jungtaek Lim (HeartSaVioR)
> >>>
> >>> On Fri, Feb 8, 2019 at 1:33 AM, Ryan Blue <rb...@netflix.com> wrote:
> >>>>
> >>>> Jungtaek,
> >>>>
> >>>> What is shown at the new URL and how would this improve usability?
> >>>>
> >>>> On Thu, Feb 7, 2019 at 12:45 AM Jungtaek Lim <kabh...@gmail.com>
> >>>> wrote:
> >>>>>
> >>>>> Hi devs,
> >>>>>
> >>>>> Based on the suggestion Tom Graves gave me in SPARK-26792, I'd
> >>>>> like to hear voices on changing the default executor log URLs for
> >>>>> YARN, specifically removing "stdout" and "stderr" and providing a
> >>>>> link which shows the log file(s). For example, instead of
> >>>>> referring to the two links below:
> >>>>>
> >>>>> http://<NM_HOST>:<NM_PORT>/node/containerlogs/<CONTAINER_ID>/<USER>/<stdout|stderr>?start=-4096
> >>>>>
> >>>>> we would refer to only the one link below:
> >>>>>
> >>>>> http://<NM_HOST>:<NM_PORT>/node/containerlogs/<CONTAINER_ID>/<USER>
> >>>>>
> >>>>> I've checked that the new URL works with redirection on the NM to
> >>>>> jobhistory, so it won't break what we currently support. Going
> >>>>> through to the actual log file would require two clicks instead of
> >>>>> one, though.
> >>>>>
> >>>>> Given it introduces a change in UX, I'd like to hear voices on
> >>>>> this before submitting a patch. If we'd rather keep this as it is,
> >>>>> I would just open the chance to apply custom log URLs to the Spark
> >>>>> UI as well.
> >>>>>
> >>>>> Thanks in advance!
> >>>>>
> >>>>> FYI, below is the rationale from the discussion:
> >>>>>
> >>>>> While I worked on SPARK-23155, I got some input around linking the
> >>>>> "log directory" instead of log URLs for each of "stdout" and
> >>>>> "stderr", because in real cases end users would put more files
> >>>>> than only stdout and stderr (like GC logs).
> >>>>>
> >>>>> SPARK-23155 provides a way to modify the log URL, but it's only
> >>>>> applied to the SHS; the Spark UI for running apps still only shows
> >>>>> "stdout" and "stderr". SPARK-26792 is for applying this to the
> >>>>> Spark UI as well, but I've got a suggestion to just change the
> >>>>> default log URL.
> >>>>>
> >>>>> Thanks again,
> >>>>> Jungtaek Lim (HeartSaVioR)
> >>>>
> >>>>
> >>>> --
> >>>> Ryan Blue
> >>>> Software Engineer
> >>>> Netflix
> >>
> >>
> >> --
> >> Ryan Blue
> >> Software Engineer
> >> Netflix

--
Ryan Blue
Software Engineer
Netflix
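[Editor's note] The change under discussion amounts to stripping the trailing file segment (and the `?start=-4096` query) from the per-file container log URL to obtain the container log listing URL. A minimal Python sketch of that transformation, for illustration only — the hostname, container ID, user, and the `container_log_dir_url` helper below are made-up placeholders, not Spark code:

```python
from urllib.parse import urlsplit

def container_log_dir_url(file_url: str) -> str:
    """Hypothetical helper: given a YARN per-file container log URL
    (ending in .../stdout or .../stderr plus a query string), return
    the container log directory listing URL discussed in the thread."""
    parts = urlsplit(file_url)
    # Drop the query string ('?start=-4096') and the final path
    # segment ('stdout' or 'stderr').
    path = parts.path.rstrip("/").rsplit("/", 1)[0]
    return f"{parts.scheme}://{parts.netloc}{path}"

# Placeholder values standing in for <NM_HOST>, <CONTAINER_ID>, <USER>.
url = ("http://nm-host:8042/node/containerlogs/"
       "container_1_0001_01_000002/alice/stderr?start=-4096")
print(container_log_dir_url(url))
# http://nm-host:8042/node/containerlogs/container_1_0001_01_000002/alice
```

The directory URL then lists all files in the container log directory (stdout, stderr, GC logs, rolled log4j files), which is the one-link behavior proposed above.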