So the only way to have proper logging in separate files is to use YARN to run jobs?

I guess I'll have to live with it as we can't justify running the jobs in YARN now because of the high memory consumption.

Thanks for the clarification.

Lukas

-----Original Message-----
From: Chris Riccomini
Sent: Tuesday, December 23, 2014 3:02 PM
To: [email protected]
Subject: Re: Changes to logging in Samza 0.8

Hey Lukas,

I see this log in your container logs:

 2014-12-23 22:37:26 SamzaContainer$ [INFO] Setting up Samza container: local-process-container


This suggests that it's receiving an environment variable of:

 SAMZA_CONTAINER_NAME=local-process-container

Looking at ProcessJob.scala shows that this is indeed the case:

 class ProcessJobFactory extends StreamJobFactory with Logging {
   def getJob(config: Config): StreamJob = {
     val jobName = "local-process-container"
     ...
     commandBuilder
       .setConfig(config)
       .setName(jobName)

     ...

In 0.8.0 there is no way to override this. Up to and including 0.8.0, the
notion of a "container name" was somewhat ill-defined. In 0.9.0, we have
converged on containers having IDs, and every container's
"samza.container.name" always evaluates to "samza-container-<id>".
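
To illustrate the naming chain described above, here is a minimal Scala sketch. It is not Samza's actual code. The object ContainerLogName, the method logFileFor, and the fallback name "unknown-container" are hypothetical. Only the SAMZA_CONTAINER_NAME variable and the <name>.log pattern come from this thread.

```scala
// Hypothetical helper, not part of Samza: shows how a container name taken
// from the environment ends up as the name of the log file.
object ContainerLogName {
  // Assumed fallback, for illustration only.
  val FallbackName = "unknown-container"

  // Builds "<logDir>/<container name>.log" from an environment map.
  def logFileFor(env: Map[String, String], logDir: String): String = {
    val name = env.getOrElse("SAMZA_CONTAINER_NAME", FallbackName)
    s"$logDir/$name.log"
  }
}
```

With ProcessJobFactory in 0.8.0 the variable is always "local-process-container", so passing that value with a log directory of "logs" yields logs/local-process-container.log, which matches the file reported earlier in the thread.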

Sorry for the confusion.

Cheers,
Chris

On 12/23/14 2:40 PM, "Lukas Steiblys" <[email protected]> wrote:

run-job.sh log: http://paste.ofcode.org/3bP8yLVzknxxXP5qR7M8LqJ
logs/local-process-container.log:
http://paste.ofcode.org/gH894xgBNWRYCSdiETsau8

Lukas


-----Original Message-----
From: Chris Riccomini
Sent: Tuesday, December 23, 2014 2:34 PM
To: [email protected]
Subject: Re: Changes to logging in Samza 0.8

Hey Lukas,

Could you paste the logs for both the run-job.sh and the .log file that's
being produced by the container? I see no mention of
"local-process-container" in either 0.7.0 or 0.8.0. By default, the
ShellCommandBuilder should set SAMZA_CONTAINER_NAME to your job.name, and
run-class.sh should set samza.container.name to SAMZA_CONTAINER_NAME.

Cheers,
Chris

On 12/23/14 2:24 PM, "Lukas Steiblys" <[email protected]> wrote:

Thanks! I made the deploy script switch to the package root directory
before running the job. The only problem now is that the logs are written
to a local-process-container.log file instead of a JOB-NAME.log file, where
JOB-NAME is what is specified in the SAMZA_CONTAINER_NAME environment
variable.

Lukas

-----Original Message-----
From: Chris Riccomini
Sent: Tuesday, December 23, 2014 1:58 PM
To: [email protected]
Subject: Re: Changes to logging in Samza 0.8

Hey Lukas,

It looks like you are starting run-job.sh from outside the package root
(what you get when you un-tar your package tarball). By default, the
ProcessJob (via ShellCommandBuilder) uses this:

 def getCommand =
   getOption(ShellCommandConfig.COMMAND_SHELL_EXECUTE).getOrElse("bin/run-container.sh")


If this is run from outside the package root, you'll get the exception you
see. I think setting this to specify the location of the run-container.sh
script might work:

 task.execute=./deploy/samza/bin/run-container.sh
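
A minimal sketch of why the working directory matters, assuming the command is simply resolved against the directory the job is started from. The object CommandResolution and the method resolveCommand are hypothetical names for illustration. Only the default "bin/run-container.sh" and the task.execute override come from this thread.

```scala
import java.nio.file.{Path, Paths}

// Hypothetical illustration, not Samza code: a relative command is looked up
// relative to the directory the job was started from.
object CommandResolution {
  // Default taken from the getCommand snippet quoted in this thread.
  val DefaultCommand = "bin/run-container.sh"

  // configured models an optional task.execute value.
  def resolveCommand(configured: Option[String], workingDir: Path): Path =
    workingDir.resolve(configured.getOrElse(DefaultCommand)).normalize()
}
```

Started from the package root, the default resolves to <package root>/bin/run-container.sh. Started from anywhere else, the default points at a path that most likely does not exist, which is why either changing directory first or setting task.execute is needed.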


Note: as expected, the logs show that the run-job.sh script is picking up
-Dlog4j.configuration=file:./deploy/samza/bin/log4j-console.xml. When the
ProcessJob works, I'd expect that its process will pick up lib/log4j.xml.
I've never run the run-job.sh script with ProcessJob from outside of the
package root, though. If it doesn't work, please post the issues you hit, so
we can open the appropriate JIRAs.

Cheers,
Chris

On 12/23/14 1:49 PM, "Lukas Steiblys" <[email protected]> wrote:

Here's the log: http://paste.ofcode.org/HLCvT2j8BY6nLhpqQ6Ld3b

Lukas

-----Original Message-----
From: Chris Riccomini
Sent: Tuesday, December 23, 2014 1:22 PM
To: [email protected]
Subject: Re: Changes to logging in Samza 0.8

Hey Lukas,

Log attachments seem to be filtered out. Could you try posting on a public
paste, or github gist?

Cheers,
Chris

On 12/23/14 1:20 PM, "Lukas Steiblys" <[email protected]> wrote:

Unfortunately, that didn't help. Not only did the log show up in STDOUT,
the job also failed to start (but the process didn't stop). Log attached.

Lukas

-----Original Message-----
From: Chris Riccomini
Sent: Tuesday, December 23, 2014 1:08 PM
To: [email protected]
Subject: Re: Changes to logging in Samza 0.8

Hey Lukas,

I believe this is because you're using:

 job.factory.class=org.apache.samza.job.local.ThreadJobFactory

Config settings that need to be set at JVM start time can't be applied
using the ThreadJobFactory, since the JVM has already started. As a result,
you get whatever JVM settings your run-job.sh script uses. For log4j, I
believe this means it'll pick up the log4j-console.xml in your bin
directory.

Can you try using:

 job.factory.class=org.apache.samza.job.local.ProcessJobFactory
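
A rough sketch of the difference, assuming a process-based job hands its settings to a child JVM through environment variables. The object ChildJvmSketch and the method build are hypothetical, and the use of a JAVA_OPTS variable is an assumption for illustration. Only SAMZA_CONTAINER_NAME is named in this thread.

```scala
// Hypothetical illustration, not Samza code. A child process can receive
// JVM-start settings before its JVM boots. A thread inside the already
// running JVM cannot.
object ChildJvmSketch {
  // Describes the child process without starting it.
  def build(command: String, containerName: String, jvmOpts: Seq[String]): ProcessBuilder = {
    val pb = new ProcessBuilder(command)
    pb.environment().put("SAMZA_CONTAINER_NAME", containerName)
    // Assumed variable name for passing JVM options to the launch script.
    pb.environment().put("JAVA_OPTS", jvmOpts.mkString(" "))
    pb
  }
}
```

With a thread-based job there is no such launch step, so options like a log4j configuration path stay at whatever the parent JVM was started with.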


Cheers,
Chris

On 12/23/14 1:00 PM, "Lukas Steiblys" <[email protected]> wrote:

I do not have a custom task.opts.

Here's the full package we deploy:
http://imbusy.org/temp/samza-package-0.1-SNAPSHOT-dist.tar.gz . I have also
attached one of the deploy scripts we use for one of the five jobs
available. They are all run locally.

Lukas

-----Original Message-----
From: Chris Riccomini
Sent: Tuesday, December 23, 2014 12:32 PM
To: [email protected]
Subject: Re: Changes to logging in Samza 0.8

Hey Lukas,

The changes are probably from this ticket:

 https://issues.apache.org/jira/browse/SAMZA-109

The behavior you're observing does not sound correct, though. By default,
if you have a log4j.xml in your lib directory, and don't have a custom
task.opts, then you should get proper .log files. Do you have a custom
task.opts? If so, could you paste it?

Cheers,
Chris

On 12/23/14 11:23 AM, "Lukas Steiblys" <[email protected]> wrote:

I have recently upgraded from Samza 0.7 to 0.8 and noticed that, instead
of logging to a file using log4j to the log directory specified in the
environment variable SAMZA_LOG_DIR, all the logs are dumped to STDOUT.

What changed in 0.8 and what's the path to upgrading to get the old
functionality back?

Lukas



