I think that's a reasonable argument: it provides links to potentially several logs of interest. It reduces the UI clutter a little at the cost of one more hop to get to the logs. I don't feel strongly about it, but I think it's a reasonable thing to do.
On Fri, Feb 8, 2019 at 4:57 PM Jungtaek Lim <kabh...@gmail.com> wrote:
>
> Let me quote some voices here: it seems they didn't participate in this
> thread. This still doesn't show that the majority are using this pattern,
> so I'm also OK to make it optional (I might just work on SPARK-26792 to
> address it) and leave the default as it is if others aren't interested in
> this.
>
> https://github.com/apache/spark/pull/23260#issuecomment-456827963
>
> Sorry, I haven't had time to look through all the code, so this might be a
> separate jira, but one thing I thought of here is that it would be really
> nice not to have specifically stderr/stdout. Users can specify any
> log4j.properties, and some tools like Oozie by default end up using the
> Hadoop log4j rather than the Spark log4j, so the files aren't necessarily
> the same. Also, users can put in other log files, so it would be nice to
> have links to those from the UI. It seems simpler if we just had a link to
> the directory and it read the files within it. Other things in Hadoop do
> it this way, but I'm not sure whether that works well for other resource
> managers; any thoughts on that? As long as this doesn't prevent the above,
> I can file a separate jira for it.
>
> https://github.com/apache/spark/pull/23260#issuecomment-456904716
>
> Hi Tom, +1: singling out stdout and stderr is definitely an annoyance. We
> typically configure Spark jobs to write the GC log and dump heap on OOM
> using <LOG_DIR>, and/or we use the rolling file appender to deal with
> large logs during debugging. So linking the YARN container log overview
> page would make much more sense for us. We work around it with a custom
> submit process that logs all important URLs in the submit-side log.
>
>
> On Sat, Feb 9, 2019 at 5:42 AM, Ryan Blue <rb...@netflix.com> wrote:
>>
>> Here's what I see from a running job on our cluster. Both of these are
>> links that go to the stderr and stdout links that Spark produces today.
>>
>> stderr : Total file length is 18557 bytes.
>> stdout : Total file length is 0 bytes.
>>
>> While it is nice to see that stderr or stdout has content, I don't think
>> that this is worth the extra click or changes to Spark.
>>
>> However, we have configured our logs to go to stderr and stdout, so these
>> links work for us. I think some YARN applications send logs to a separate
>> log endpoint, which would be useful when listed here. Does anyone have
>> logs going to locations other than stderr and stdout?
>>
>> If there are logs going to other files, then I think making this an
>> option is reasonable. Otherwise, I think we should leave the links as
>> they are.
>>
>> rb
>>
>> On Thu, Feb 7, 2019 at 12:31 PM Jungtaek Lim <kabh...@gmail.com> wrote:
>>>
>>> The new URL shows all of the local logs, including stdout and stderr, as
>>> a list.
>>>
>>> The change would help when end users modify their log4j configuration to
>>> add other log files, as well as GC logs. Currently Spark only shows the
>>> two static files (stdout, stderr) as individual links, which makes it
>>> easier to see the content (one click), but users have to manually remove
>>> the file part from the URL to access the list page. Instead, we may be
>>> able to change the default URL to show all of the local logs and let
>>> users choose which file to read (though it would be two clicks to access
>>> the actual file).
>>>
>>> -Jungtaek Lim (HeartSaVioR)
>>>
>>> On Fri, Feb 8, 2019 at 1:33 AM, Ryan Blue <rb...@netflix.com> wrote:
>>>>
>>>> Jungtaek,
>>>>
>>>> What is shown at the new URL, and how would this improve usability?
>>>>
>>>> On Thu, Feb 7, 2019 at 12:45 AM Jungtaek Lim <kabh...@gmail.com> wrote:
>>>>>
>>>>> Hi devs,
>>>>>
>>>>> Based on the suggestion Tom Graves gave me in SPARK-26792, I'd like to
>>>>> hear voices on changing the default executor log URLs for YARN,
>>>>> specifically removing "stdout" and "stderr" and providing a link which
>>>>> shows all log files.
>>>>> For example, instead of referring to the two links below:
>>>>>
>>>>> http://<NM_HOST>:<NM_PORT>/node/containerlogs/<CONTAINER_ID>/<USER>/<stdout|stderr>?start=-4096
>>>>>
>>>>> we would refer to only the one link below:
>>>>>
>>>>> http://<NM_HOST>:<NM_PORT>/node/containerlogs/<CONTAINER_ID>/<USER>
>>>>>
>>>>> I've checked that the new URL works with redirection from the NM to
>>>>> the job history server, so it won't break what we currently support.
>>>>> Getting to the actual log file would require two clicks instead of one
>>>>> click, though.
>>>>>
>>>>> Given that this introduces a change in UX, I'd like to hear voices on
>>>>> it before submitting a patch. If we'd rather keep this as it is, I
>>>>> would just open up the option to apply a custom log URL for the Spark
>>>>> UI as well.
>>>>>
>>>>> Thanks in advance!
>>>>>
>>>>> FYI, below is the rationale from the discussion:
>>>>>
>>>>> While I worked on SPARK-23155, I got some input around linking the
>>>>> "log directory" instead of log URLs for each of "stdout" and "stderr",
>>>>> because in real cases end users put in more files than only stdout and
>>>>> stderr (like GC logs).
>>>>>
>>>>> SPARK-23155 provides the way to modify the log URL, but it's only
>>>>> applied to the SHS; the Spark UI for running apps still only shows
>>>>> "stdout" and "stderr". SPARK-26792 is for applying this to the Spark
>>>>> UI as well, but I got the suggestion to just change the default log
>>>>> URL.
>>>>>
>>>>> Thanks again,
>>>>> Jungtaek Lim (HeartSaVioR)
>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>> Software Engineer
>>>> Netflix
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
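[Editor's note] As a concrete illustration of the "other log files" case raised in the thread: a log4j.properties along these lines, a sketch adapted from the pattern shown in Spark's running-on-yarn documentation (the appender name, sizes, and pattern are arbitrary), sends executor logging to a rolled file in the YARN container log directory, next to stdout and stderr:

```properties
# Hypothetical executor log4j.properties sketch (names and sizes arbitrary).
# ${spark.yarn.app.container.log.dir} resolves to the container's log
# directory, so spark.log lands alongside stdout/stderr and appears in the
# directory listing, but gets no link under the current per-file defaults.
log4j.rootCategory=INFO, rolling
log4j.appender.rolling=org.apache.log4j.RollingFileAppender
log4j.appender.rolling.File=${spark.yarn.app.container.log.dir}/spark.log
log4j.appender.rolling.MaxFileSize=50MB
log4j.appender.rolling.MaxBackupIndex=5
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

The GC-log case quoted above appears to follow the same idea, e.g. passing something like `-verbose:gc -Xloggc:<LOG_DIR>/gc.log` in `spark.executor.extraJavaOptions`, where YARN expands `<LOG_DIR>` to the container log directory; that flag combination is an assumption about the setup described, not something stated in the thread.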
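[Editor's note] To make the URL change under discussion concrete, here is a small sketch, not from the thread itself: it builds the current per-file executor log URLs and the proposed directory-level URL from the same NodeManager pattern variables. The helper names and the host/container/user values are hypothetical; only the URL shapes come from the thread.

```python
# Sketch of the proposed default change: same NM host/port, container id,
# and user; the only difference is whether a file name (plus the tail
# offset ?start=-4096) is appended. Values below are made up.

def file_log_url(nm_host, nm_port, container_id, user, file_name):
    """Current default: one link per log file, tailing the last 4096 bytes."""
    return (f"http://{nm_host}:{nm_port}/node/containerlogs/"
            f"{container_id}/{user}/{file_name}?start=-4096")

def dir_log_url(nm_host, nm_port, container_id, user):
    """Proposed default: one link to the container log directory listing."""
    return f"http://{nm_host}:{nm_port}/node/containerlogs/{container_id}/{user}"

container = "container_1549600000000_0001_01_000002"

# Today's UI shows the two per-file links; the proposal collapses them into
# the directory link, which also lists any extra files (gc.log, rolled
# spark.log, ...) that the per-file links would never surface.
print(dir_log_url("nm1.example.com", 8042, container, "alice"))
print(file_log_url("nm1.example.com", 8042, container, "alice", "stderr"))
```

The trade-off debated above falls out directly: the directory link covers arbitrary file names at the cost of one extra click to reach a specific file.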