[jira] [Commented] (SAMZA-310) Publish container logs to a SystemStream

Yan Fang (JIRA) Thu, 31 Jul 2014 17:38:31 -0700

    [ 
https://issues.apache.org/jira/browse/SAMZA-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081745#comment-14081745
 ]


Yan Fang commented on SAMZA-310:
--------------------------------

Thanks for pointing out the MDC. 

{quote}
Ideally, I'd like to have this work without depending on log4j in samza-core.
{quote}

slf4j provides an [MDC|http://www.slf4j.org/api/org/slf4j/MDC.html], but 
grizzled.slf4j does not expose one...

{quote}
The only injection point that I can think of right now to manage things like 
setting the MDC to update the taskName when we process a message is via the 
TaskLifecycleListener. 
{quote}

Can we just set up the MDC when the containers start, instead of via 
TaskLifecycleListener? Since the goal of assigning the AM/ContainerID 
information is to have a key to the logs, this information can be retrieved 
when the container starts.
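
A rough sketch of the idea, to make the question concrete. This is not Samza 
code: ContainerLogContext is a hypothetical, self-contained stand-in for a 
per-thread diagnostic map such as org.slf4j.MDC, and the key names 
("containerId", "jobName") and the example values are assumptions. With real 
slf4j the put/get calls would be MDC.put(key, value) and MDC.get(key).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a mapped diagnostic context (e.g. org.slf4j.MDC).
// It keeps one key/value map per thread, which is how an MDC behaves.
final class ContainerLogContext {
    private static final ThreadLocal<Map<String, String>> CONTEXT =
        ThreadLocal.withInitial(HashMap::new);

    private ContainerLogContext() {}

    static void put(String key, String value) {
        CONTEXT.get().put(key, value);
    }

    static String get(String key) {
        return CONTEXT.get().get(key);
    }

    // Called once when the container starts: the container ID and job name
    // are already known at that point, so no per-message hook is needed
    // for them. Key names here are assumptions, not Samza conventions.
    static void initAtContainerStart(String containerId, String jobName) {
        put("containerId", containerId);
        put("jobName", jobName);
    }

    // Roughly what an appender or layout would do: prefix each log line
    // with the values stored in the context of the calling thread.
    static String format(String message) {
        return "[" + get("jobName") + "/" + get("containerId") + "] " + message;
    }
}
```

One caveat with this approach: the context is per-thread, so values set at 
start-up are only visible on the thread that set them. Whether threads 
spawned later inherit them depends on the logging backend, and a value that 
changes per message (such as the taskName) would still need a hook like 
TaskLifecycleListener.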

> Publish container logs to a SystemStream
> ----------------------------------------
>
>                 Key: SAMZA-310
>                 URL: https://issues.apache.org/jira/browse/SAMZA-310
>             Project: Samza
>          Issue Type: New Feature
>          Components: container
>    Affects Versions: 0.7.0
>            Reporter: Martin Kleppmann
>
> At the moment, it's a bit awkward to get to a Samza job's logs: assuming 
> you're running on YARN, you have to navigate around the YARN web interface, 
> and you can only see one container's logs at a time.
> Given that Samza is all about streams, it would make sense for the logs 
> generated by Samza jobs to also be sent to a stream. There, they could be 
> indexed with [Kibana|http://www.elasticsearch.org/overview/kibana/], consumed 
> by an exception-tracking system, etc.
> Notes:
> - The serde for encoding logs into a suitable wire format should be 
> pluggable. There can be a default implementation that uses JSON, analogous to 
> MetricsSnapshotSerdeFactory for metrics, but organisations that already have 
> a standardised in-house encoding for logs should be able to use it.
> - Should this be at the level of Slf4j or Log4j? Currently the log 
> configuration for YARN jobs uses Log4j, which has the advantage that any 
> frameworks/libraries that use Log4j but not Slf4j appear in the logs. 
> However, Samza itself currently only depends on Slf4j. If we tie this feature 
> to Log4j, it would somewhat defeat the purpose of using Slf4j.
> - Do we need to consider partitioning? Perhaps we can use the container name 
> as partitioning key, so that the ordering of logs from each container is 
> preserved.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
