[
https://issues.apache.org/jira/browse/SAMZA-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039468#comment-14039468
]
Yan Fang commented on SAMZA-215:
--------------------------------
RB: https://reviews.apache.org/r/22845/
1. Use an environment variable SAMZA_LOG4J_FILE to hold the location of the
log4j configuration file. Each script sets this variable, and run-class.sh
just reads it instead of referring to the log4j file location directly.
2. The interactive tools like run-job.sh, kill-yarn-job.sh and
checkpoint-tool.sh use the log4j-default.xml while run-container.sh and
run-am.sh use the log4j.xml.
3. log4j-default.xml is a new configuration that logs to the console; it is
added in samza-shell/resource.
4. There is some duplicated code across the scripts, but I do not think we
can get rid of it.
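To make items 1 and 2 concrete, here is a rough sketch of the idea, not the
code in the RB. Only SAMZA_LOG4J_FILE and the script and file names come from
the points above; base_dir, the lib/ layout and example.Main are made up for
illustration, and I am assuming log4j 1.x picks up the file through the
log4j.configuration system property. The scripts are modelled as functions so
the sketch runs on its own.

```shell
#!/bin/bash
# Hypothetical sketch, not the code from the review request.
# base_dir, the lib/ layout and example.Main are illustrative assumptions.

base_dir="/opt/samza"

# Stand-in for run-class.sh: reads the variable instead of naming a file itself.
run_class() {
  if [ -z "$SAMZA_LOG4J_FILE" ]; then
    echo "SAMZA_LOG4J_FILE is not set" >&2
    return 1
  fi
  # Echo the command rather than starting a JVM, so the sketch runs anywhere.
  echo "java -Dlog4j.configuration=file:$SAMZA_LOG4J_FILE $*"
}

# Stand-in for an interactive tool such as run-job.sh: console logging.
run_job() {
  ( export SAMZA_LOG4J_FILE="$base_dir/lib/log4j-default.xml"; run_class "$@" )
}

# Stand-in for run-container.sh: the job-supplied configuration.
run_container() {
  ( export SAMZA_LOG4J_FILE="$base_dir/lib/log4j.xml"; run_class "$@" )
}

run_job example.Main
run_container example.Main
```

The point of the split is that run_class never decides which file to use; each
caller does, which is also where the duplication mentioned in item 4 comes from.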
BTW, I cannot find where samza-shell/src/main/assembly/src.xml is used, since
we are using gradle to compile and assemble. If it is not used, we may
want to remove it.
Thank you.
> Better logging for interactive command-line tools
> -------------------------------------------------
>
> Key: SAMZA-215
> URL: https://issues.apache.org/jira/browse/SAMZA-215
> Project: Samza
> Issue Type: Improvement
> Reporter: Martin Kleppmann
> Assignee: Yan Fang
>
> At the moment, if you use run-job.sh, it prints out a very long JVM
> invocation (which is arguably not very useful for most users) but no
> information about what has actually happened (e.g. connecting to YARN RM,
> etc). Where the progress messages get logged to depends on the configuration
> of the user project using Samza.
> For example, hello-samza supplies
> {{samza-job-package/src/main/resources/log4j.xml}} which sends the logs to a
> file called {{deploy/samza/undefined-samza-container-name.log}} by default.
> That is not a great experience for new users — if the job won't start up,
> users need to know to look in an obscurely-named log file to see any errors
> that occurred in run-job.sh (e.g. could not connect to YARN RM).
> It's good that jobs can supply their own configuration for logging within a
> container. However, for interactive tools like run-job.sh, kill-yarn-job.sh
> and checkpoint-tool.sh (SAMZA-180) it would be much better if the logs just
> went to the console (stdout or stderr).
> Suggested solution: we include a default log4j configuration that sends logs
> to the console, and use it in the interactive shell scripts (e.g.
> run-job.sh). We don't use it in run-container.sh and run-am.sh, as those
> should be configured by the job.
> This will be especially relevant when we make binary releases of Samza. A
> user should be able to download the tgz of a release and immediately use the
> shell scripts for managing jobs, without having to worry about configuring
> log4j.