[ 
https://issues.apache.org/jira/browse/SAMZA-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039468#comment-14039468
 ] 

Yan Fang commented on SAMZA-215:
--------------------------------

RB: https://reviews.apache.org/r/22845/

1. The environment variable SAMZA_LOG4J_FILE holds the location of the log4j 
configuration file. Each script sets this variable, and run-class.sh simply 
reads it instead of using the path to the log4j file directly.
2. The interactive tools (run-job.sh, kill-yarn-job.sh and checkpoint-tool.sh) 
use log4j-default.xml, while run-container.sh and run-am.sh use log4j.xml.
3. log4j-default.xml is a new configuration that appends to the console; it is 
added in samza-shell/resource.
4. There is some duplicated code across the scripts, but I do not think we can 
get rid of it.
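
For readers who have not opened the review, the scheme in points 1 to 3 could 
look roughly like the sketch below. It is not the patch itself. The function 
names choose_log4j_file and log4j_jvm_opt and the /opt/samza/lib directory are 
made up for illustration, and the patch sets the variable in each script 
rather than through a shared function. The only outside fact relied on is that 
log4j 1.x reads its configuration location from the log4j.configuration system 
property.

```shell
#!/bin/bash
# Illustrative sketch only. choose_log4j_file, log4j_jvm_opt and the
# /opt/samza/lib path are hypothetical; they are not part of the patch.

# Map a tool name to the log4j configuration it should use: the console
# configuration for interactive tools, the job-supplied log4j.xml otherwise
# (e.g. run-container.sh and run-am.sh).
choose_log4j_file() {
  local tool="$1" conf_dir="$2"
  case "$tool" in
    run-job.sh|kill-yarn-job.sh|checkpoint-tool.sh)
      echo "$conf_dir/log4j-default.xml" ;;
    *)
      echo "$conf_dir/log4j.xml" ;;
  esac
}

# What run-class.sh might do with the variable: turn it into the JVM option
# that log4j 1.x reads at startup.
log4j_jvm_opt() {
  echo "-Dlog4j.configuration=file:$1"
}

# An interactive script exports the variable before delegating to run-class.sh.
SAMZA_LOG4J_FILE="$(choose_log4j_file run-job.sh /opt/samza/lib)"
export SAMZA_LOG4J_FILE

# Prints: -Dlog4j.configuration=file:/opt/samza/lib/log4j-default.xml
log4j_jvm_opt "$SAMZA_LOG4J_FILE"
```

Keeping the choice in an environment variable means run-class.sh needs no 
knowledge of which tool invoked it; the cost is the per-script duplication 
mentioned in point 4.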

BTW, I cannot find where samza-shell/src/main/assembly/src.xml is used, since 
we use Gradle to compile and assemble. If it is not used, we may want to 
remove it. 

Thank you.

> Better logging for interactive command-line tools
> -------------------------------------------------
>
>                 Key: SAMZA-215
>                 URL: https://issues.apache.org/jira/browse/SAMZA-215
>             Project: Samza
>          Issue Type: Improvement
>            Reporter: Martin Kleppmann
>            Assignee: Yan Fang
>
> At the moment, if you use run-job.sh, it prints out a very long JVM 
> invocation (which is arguably not very useful for most users) but no 
> information about what has actually happened (e.g. connecting to YARN RM, 
> etc). Where the progress messages get logged to depends on the configuration 
> of the user project using Samza.
> For example, hello-samza supplies 
> {{samza-job-package/src/main/resources/log4j.xml}} which sends the logs to a 
> file called {{deploy/samza/undefined-samza-container-name.log}} by default. 
> That is not a great experience for new users — if the job won't start up, 
> users need to know to look in an obscurely-named log file to see any errors 
> that occurred in run-job.sh (e.g. could not connect to YARN RM).
> It's good that jobs can supply their own configuration for logging within a 
> container. However, for interactive tools like run-job.sh, kill-yarn-job.sh 
> and checkpoint-tool.sh (SAMZA-180) it would be much better if the logs just 
> went to the console (stdout or stderr).
> Suggested solution: we include a default log4j configuration that sends logs 
> to the console, and use it in the interactive shell scripts (e.g. 
> run-job.sh). We don't use it in run-container.sh and run-am.sh, as those 
> should be configured by the job.
> This will be especially relevant when we make binary releases of Samza. A 
> user should be able to download the tgz of a release and immediately use the 
> shell scripts for managing jobs, without having to worry about configuring 
> log4j.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
