So the only way to have proper logging in separate files is to use YARN to
run jobs?
I guess I'll have to live with it as we can't justify running the jobs in
YARN now because of the high memory consumption.
Thanks for the clarification.
Lukas
-----Original Message-----
From: Chris Riccomini
Sent: Tuesday, December 23, 2014 3:02 PM
To: [email protected]
Subject: Re: Changes to logging in Samza 0.8
Hey Lukas,
I see this log in your container logs:
2014-12-23 22:37:26 SamzaContainer$ [INFO] Setting up Samza container:
local-process-container
This suggests that it's receiving an environment variable of:
SAMZA_CONTAINER_NAME=local-process-container
Looking at ProcessJob.scala shows that this is indeed the case:
class ProcessJobFactory extends StreamJobFactory with Logging {
  def getJob(config: Config): StreamJob = {
    val jobName = "local-process-container"
    ...
    commandBuilder
      .setConfig(config)
      .setName(jobName)
    ...
In 0.8.0 there is no way to override this. In <= 0.8.0, the definition of
a "container name" was somewhat ill-defined. In 0.9.0, we have converged
on containers having IDs, and for every container "samza.container.name"
always evaluates to "samza-container-<id>".
Sorry for the confusion.
Cheers,
Chris
On 12/23/14 2:40 PM, "Lukas Steiblys" <[email protected]> wrote:
run-job.sh log: http://paste.ofcode.org/3bP8yLVzknxxXP5qR7M8LqJ
logs/local-process-container.log:
http://paste.ofcode.org/gH894xgBNWRYCSdiETsau8
Lukas
-----Original Message-----
From: Chris Riccomini
Sent: Tuesday, December 23, 2014 2:34 PM
To: [email protected]
Subject: Re: Changes to logging in Samza 0.8
Hey Lukas,
Could you paste the logs for both the run-job.sh and the .log file that's
being produced by the container? I see no mention of
"local-process-container" in either 0.7.0 or 0.8.0. By default, the
ShellCommandBuilder should set SAMZA_CONTAINER_NAME to your job.name, and
run-class.sh should set samza.container.name to SAMZA_CONTAINER_NAME.
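To illustrate the chain described above, here is a minimal sketch of how a log file name follows from the container name. It is not Samza code: the object name, the helper, and both fallback values are hypothetical, and it only assumes that the log file is named after the container inside SAMZA_LOG_DIR.

```scala
object ContainerLogPathSketch {
  // Hypothetical fallback values, used only so the sketch runs without the
  // environment variables set; they are not taken from the Samza scripts.
  val DefaultLogDir = "logs"
  val DefaultContainerName = "unnamed-container"

  // Builds "<log dir>/<container name>.log" from an environment map, the way
  // a file appender keyed on the container name would name its file.
  def logFile(env: Map[String, String]): String = {
    val dir = env.getOrElse("SAMZA_LOG_DIR", DefaultLogDir)
    val name = env.getOrElse("SAMZA_CONTAINER_NAME", DefaultContainerName)
    s"$dir/$name.log"
  }

  def main(args: Array[String]): Unit =
    println(logFile(sys.env))
}
```

Under these assumptions, a container started with SAMZA_CONTAINER_NAME=local-process-container and SAMZA_LOG_DIR=logs would write to logs/local-process-container.log, whatever job.name is set to.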
Cheers,
Chris
On 12/23/14 2:24 PM, "Lukas Steiblys" <[email protected]> wrote:
Thanks! I made the deploy script switch to the package root directory before
running the job. The only problem now is that the logs are written to a
local-process-container.log file instead of a JOB-NAME.log file, where
JOB-NAME is the value specified in the SAMZA_CONTAINER_NAME environment
variable.
Lukas
-----Original Message-----
From: Chris Riccomini
Sent: Tuesday, December 23, 2014 1:58 PM
To: [email protected]
Subject: Re: Changes to logging in Samza 0.8
Hey Lukas,
It looks like you are starting run-job.sh from outside the package root
(what you get when you un-tar your package tarball). By default, the
ProcessJob (via ShellCommandBuilder) uses this:
def getCommand =
  getOption(ShellCommandConfig.COMMAND_SHELL_EXECUTE).getOrElse("bin/run-container.sh")
If this is run from outside the package root, you'll get the exception you
see. I think this might work:
task.execute=./deploy/samza/bin/run-container.sh
This specifies the location of the run-container.sh script.
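As a rough sketch of why the working directory matters, the snippet below resolves a relative command against a directory. It is an illustration only, assuming the command is resolved against the directory run-job.sh was started from; the object name, the helper, and the /opt/job directory are hypothetical, and the paths shown assume a Unix-like file system.

```scala
import java.nio.file.Paths

object RelativeCommandSketch {
  // Resolves a possibly relative command against a working directory and
  // removes redundant "./" segments from the result.
  def resolve(workingDir: String, command: String): String =
    Paths.get(workingDir).resolve(command).normalize().toString

  def main(args: Array[String]): Unit = {
    // Directory the JVM was started from.
    val cwd = Paths.get("").toAbsolutePath.toString
    // Default command: only exists if cwd is the package root.
    println(resolve(cwd, "bin/run-container.sh"))
    // Overridden command: points into the package from one level above it.
    println(resolve(cwd, "./deploy/samza/bin/run-container.sh"))
  }
}
```

Started from a hypothetical /opt/job, the default resolves to /opt/job/bin/run-container.sh, while the overridden value resolves to /opt/job/deploy/samza/bin/run-container.sh.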
Note: as expected, the logs show that the run-job.sh script is picking up
-Dlog4j.configuration=file:./deploy/samza/bin/log4j-console.xml. When the
ProcessJob works, I'd expect that process to pick up the lib/log4j.xml.
I've never run the run-job.sh script with ProcessJob from outside of the
package root, though. If it doesn't work, please post issues, so we can
open the appropriate JIRAs.
Cheers,
Chris
On 12/23/14 1:49 PM, "Lukas Steiblys" <[email protected]> wrote:
Here's the log: http://paste.ofcode.org/HLCvT2j8BY6nLhpqQ6Ld3b
Lukas
-----Original Message-----
From: Chris Riccomini
Sent: Tuesday, December 23, 2014 1:22 PM
To: [email protected]
Subject: Re: Changes to logging in Samza 0.8
Hey Lukas,
Log attachments seem to be filtered out. Could you try posting on a public
paste, or github gist?
Cheers,
Chris
On 12/23/14 1:20 PM, "Lukas Steiblys" <[email protected]> wrote:
Unfortunately, that didn't help. Not only did the log show up in STDOUT, the
job also failed to start (but the process didn't stop). Log attached.
Lukas
-----Original Message-----
From: Chris Riccomini
Sent: Tuesday, December 23, 2014 1:08 PM
To: [email protected]
Subject: Re: Changes to logging in Samza 0.8
Hey Lukas,
I believe this is because you're using:
job.factory.class=org.apache.samza.job.local.ThreadJobFactory
Config settings you have that need to be set at JVM start time can't be
applied using the ThreadJobFactory, since the JVM has already started. As a
result, you get whatever JVM settings your run-job.sh script uses. For log4j,
I believe this means it'll pick up the log4j-console.xml in your bin
directory.
Can you try using:
job.factory.class=org.apache.samza.job.local.ProcessJobFactory
Cheers,
Chris
On 12/23/14 1:00 PM, "Lukas Steiblys" <[email protected]> wrote:
I do not have a custom task.opts.
Here's the full package we deploy:
http://imbusy.org/temp/samza-package-0.1-SNAPSHOT-dist.tar.gz . I have also
attached one of the deploy scripts we use for one of the five jobs
available. They are all run locally.
Lukas
-----Original Message-----
From: Chris Riccomini
Sent: Tuesday, December 23, 2014 12:32 PM
To: [email protected]
Subject: Re: Changes to logging in Samza 0.8
Hey Lukas,
The changes are probably from this ticket:
https://issues.apache.org/jira/browse/SAMZA-109
The behavior you're observing does not sound correct, though. By default,
if you have a log4j.xml in your lib directory, and don't have a custom
task.opts, then you should get proper .log files. Do you have a custom
task.opts? If so, could you paste it?
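For reference, a lib/log4j.xml that writes one file per container might look roughly like the sketch below. It is an assumption-laden example, not a copy of any shipped file: it assumes log4j 1.x and that the samza.log.dir and samza.container.name system properties are set when the JVM starts. The appender name and date pattern are arbitrary choices.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
  <!-- One rolling file per container, named after the container. -->
  <appender name="RollingAppender" class="org.apache.log4j.DailyRollingFileAppender">
    <param name="File" value="${samza.log.dir}/${samza.container.name}.log" />
    <param name="DatePattern" value="'.'yyyy-MM-dd" />
    <layout class="org.apache.log4j.PatternLayout">
      <param name="ConversionPattern" value="%d{yyyy-MM-dd HH:mm:ss} %c{1} [%p] %m%n" />
    </layout>
  </appender>
  <root>
    <priority value="info" />
    <appender-ref ref="RollingAppender" />
  </root>
</log4j:configuration>
```

With a file like this, the log file name follows directly from whatever value samza.container.name has at JVM start.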
Cheers,
Chris
On 12/23/14 11:23 AM, "Lukas Steiblys" <[email protected]> wrote:
I have recently upgraded from Samza 0.7 to 0.8 and noticed that, instead of
logging to a file using log4j to the log directory specified in the
environment variable SAMZA_LOG_DIR, all the logs are dumped to STDOUT.
What changed in 0.8 and what's the path to upgrading to get the old
functionality back?
Lukas