Hi Sergey,

I think you are talking about TWILL-147 (https://issues.apache.org/jira/browse/TWILL-147), right? The idea there is that we don't need to start EmbeddedKafkaServer in the AM at all. Instead, the AM just takes a configuration (set via TwillPreparer, with a default value coming from the Configuration object passed to YarnTwillRunnerService) that specifies the Kafka broker list and the topic the AM will publish to.
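
To make the configuration idea concrete, here is a minimal Java sketch of resolving the broker list and topic from runner-wide defaults plus per-application overrides. Everything in it is an assumption for illustration only: the property keys twill.log.kafka.brokers and twill.log.kafka.topic, the class name ExternalKafkaLogConfig, and the use of java.util.Properties as a stand-in for the Configuration object and TwillPreparer are hypothetical and are not part of Twill's API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical holder for the external Kafka settings the AM would publish logs to.
final class ExternalKafkaLogConfig {
    // Hypothetical property keys; Twill does not define these.
    static final String BROKERS_KEY = "twill.log.kafka.brokers";
    static final String TOPIC_KEY = "twill.log.kafka.topic";

    final List<String> brokers;
    final String topic;

    ExternalKafkaLogConfig(List<String> brokers, String topic) {
        this.brokers = brokers;
        this.topic = topic;
    }

    // Merge runner-wide defaults with per-application overrides; overrides win.
    static ExternalKafkaLogConfig resolve(Properties defaults, Properties overrides) {
        Properties merged = new Properties();
        merged.putAll(defaults);
        merged.putAll(overrides);

        String brokerList = merged.getProperty(BROKERS_KEY);
        String topic = merged.getProperty(TOPIC_KEY);
        if (brokerList == null || topic == null) {
            throw new IllegalArgumentException(
                "Both " + BROKERS_KEY + " and " + TOPIC_KEY + " must be set");
        }
        // The broker list is a comma-separated string of host:port pairs.
        List<String> brokers = Arrays.asList(brokerList.trim().split("\\s*,\\s*"));
        return new ExternalKafkaLogConfig(brokers, topic);
    }
}
```

With defaults that set both keys and an override that sets only the topic, resolve() keeps the default broker list and uses the overridden topic.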

Since under this model application logs from different applications may be sent to the same Kafka topic (depending on the configuration), the LogEntry needs to be modified to carry the application and run id, so that the TwillController can filter on them on the client side.

Terence

On Sat, Jul 1, 2017 at 1:46 AM, Сергей Филиппов <role...@gmail.com> wrote:

> Hello,
>
> I would like to implement the possibility of using an external Kafka server
> for log aggregation.
> Currently Twill uses EmbeddedKafkaServer for that. I think the implementation
> would look like this:
> 1. Add a ZK path where the Kafka ZK connection string will be stored. There
>    should be only one such path per ApplicationMaster.
> 2. Use this path in ApplicationKafkaService when creating
>    EmbeddedKafkaService, if there are no brokers right now.
> 3. For log aggregation there should be additional nodes in ZK for each
>    instance, containing the Kafka topic's name. Something like
>    "test-app-{UUID}-log". The publisher will send to this topic and the
>    consumer will consume log messages on the job submission machine.
>
> What would you say? Does this sound OK?
>
> Sergey
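
As a rough illustration of the client-side filtering described in the reply above, the following Java sketch tags each log entry with an application and a run id and keeps only the entries for one run. The class names TaggedLogEntry and RunLogFilter and their fields are hypothetical; this is not Twill's LogEntry or TwillController, just one way the idea could look when several applications share a topic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical log entry that carries the identifiers needed for filtering.
final class TaggedLogEntry {
    final String application;
    final String runId;
    final String message;

    TaggedLogEntry(String application, String runId, String message) {
        this.application = application;
        this.runId = runId;
        this.message = message;
    }
}

// Hypothetical client-side filter: accepts only entries from one application run.
final class RunLogFilter {
    private final String application;
    private final String runId;

    RunLogFilter(String application, String runId) {
        this.application = application;
        this.runId = runId;
    }

    boolean accepts(TaggedLogEntry entry) {
        return application.equals(entry.application) && runId.equals(entry.runId);
    }

    // Returns the matching entries in their original order.
    List<TaggedLogEntry> filter(Iterable<TaggedLogEntry> entries) {
        List<TaggedLogEntry> matching = new ArrayList<>();
        for (TaggedLogEntry entry : entries) {
            if (accepts(entry)) {
                matching.add(entry);
            }
        }
        return matching;
    }
}
```

A consumer reading the shared topic would pass every decoded entry through accepts() and drop the rest, so entries from other applications or earlier runs never reach the caller's log handlers.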