[ 
https://issues.apache.org/jira/browse/SAMZA-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223459#comment-14223459
 ] 

Yan Fang commented on SAMZA-310:
--------------------------------

Updated the RB based on Chris' and Chinmay's comments. 
https://reviews.apache.org/r/28035/

{quote}
It looks like we're appending newlines for every log entry.
{quote}

Yes.

{quote}
We'll have to allow the OutgoingMessageEnvelope send to be overrideable so we 
don't force people to use String logging (we should give them raw LoggingEvent, 
so they can encode to JSON, Protobuf, etc). This, again, can be a separate 
ticket.
{quote}

Yeah, Martin mentioned the same issue in his comment. I will open a separate 
ticket for this.
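
As a rough illustration of what an overridable encoding step could look like, here is a minimal sketch. It is not the code in the patch: RawLogEvent is a hypothetical stand-in for log4j's LoggingEvent, and LogEventEncoder, PlainTextEncoder and JsonEncoder are names made up for this example. The idea is only that the appender hands the raw event to a pluggable encoder and sends whatever bytes come back.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for log4j's LoggingEvent; holds only the fields this sketch needs.
final class RawLogEvent {
    final String loggerName;
    final String level;
    final String message;

    RawLogEvent(String loggerName, String level, String message) {
        this.loggerName = loggerName;
        this.level = level;
        this.message = message;
    }
}

// Hypothetical extension point: the appender passes the raw event to an encoder
// instead of always formatting it as a String.
interface LogEventEncoder {
    byte[] encode(RawLogEvent event);
}

// Plain-text encoding, roughly what String-only logging would produce.
final class PlainTextEncoder implements LogEventEncoder {
    @Override
    public byte[] encode(RawLogEvent event) {
        String line = event.loggerName + " [" + event.level + "] " + event.message;
        return line.getBytes(StandardCharsets.UTF_8);
    }
}

// Minimal JSON encoding, written by hand to keep the sketch dependency-free.
final class JsonEncoder implements LogEventEncoder {
    @Override
    public byte[] encode(RawLogEvent event) {
        String json = "{\"logger\":\"" + escape(event.loggerName)
            + "\",\"level\":\"" + escape(event.level)
            + "\",\"message\":\"" + escape(event.message) + "\"}";
        return json.getBytes(StandardCharsets.UTF_8);
    }

    // Escapes only backslashes and double quotes; a real encoder would use a JSON library.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

With something along these lines, a Protobuf or in-house format would just be another LogEventEncoder implementation selected by configuration.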

{quote}
It would be nice if you set MDC for container name for both SamzaContainer and 
the AM, so we can log the container in the log4j format.
{quote}

Done.

{quote}
 things appeared to be in order
{quote}

Interesting. Across all my test runs (20+), I saw exactly one case where the 
order in the stream appender differed from the order in the file appender. 
The following two log lines were reversed.

{code}
samza-application-master 2014-11-24 12:06:54 SamzaAppMaster$ [INFO] got node 
manager http port: 8042

samza-application-master 2014-11-24 12:06:54 SamzaAppMaster$ [INFO] got node 
manager port: 57744
{code}

> Publish container logs to a SystemStream
> ----------------------------------------
>
>                 Key: SAMZA-310
>                 URL: https://issues.apache.org/jira/browse/SAMZA-310
>             Project: Samza
>          Issue Type: New Feature
>          Components: container
>    Affects Versions: 0.7.0
>            Reporter: Martin Kleppmann
>            Assignee: Yan Fang
>             Fix For: 0.9.0
>
>         Attachments: SAMZA-310.1.patch, SAMZA-310.2.patch, SAMZA-310.patch
>
>
> At the moment, it's a bit awkward to get to a Samza job's logs: assuming 
> you're running on YARN, you have to navigate around the YARN web interface, 
> and you can only see one container's logs at a time.
> Given that Samza is all about streams, it would make sense for the logs 
> generated by Samza jobs to also be sent to a stream. There, they could be 
> indexed with [Kibana|http://www.elasticsearch.org/overview/kibana/], consumed 
> by an exception-tracking system, etc.
> Notes:
> - The serde for encoding logs into a suitable wire format should be 
> pluggable. There can be a default implementation that uses JSON, analogous to 
> MetricsSnapshotSerdeFactory for metrics, but organisations that already have 
> a standardised in-house encoding for logs should be able to use it.
> - Should this be at the level of Slf4j or Log4j? Currently the log 
> configuration for YARN jobs uses Log4j, which has the advantage that any 
> frameworks/libraries that use Log4j but not Slf4j appear in the logs. 
> However, Samza itself currently only depends on Slf4j. If we tie this feature 
> to Log4j, it would somewhat defeat the purpose of using Slf4j.
> - Do we need to consider partitioning? Perhaps we can use the container name 
> as partitioning key, so that the ordering of logs from each container is 
> preserved.
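
The partitioning note above can be illustrated with a small sketch. It assumes a simple hash-mod scheme and a hypothetical ContainerKeyPartitioner class; the real partition choice would be made by the underlying system producer, so this only shows why keying by container name keeps one container's logs together.

```java
// Hypothetical helper: maps a container name to a partition index.
final class ContainerKeyPartitioner {
    private final int partitionCount;

    ContainerKeyPartitioner(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        this.partitionCount = partitionCount;
    }

    // The same container name always yields the same partition, so all log
    // messages from one container land in a single partition, in send order.
    int partitionFor(String containerName) {
        // floorMod keeps the result in [0, partitionCount) even for negative hash codes.
        return Math.floorMod(containerName.hashCode(), partitionCount);
    }
}
```

Ordering is then preserved per container only; nothing is implied about the relative order of messages from different containers.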



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
