[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user asfgit closed the pull request at: https://github.com/apache/twill/pull/40 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user hsaputra commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107788346 --- Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillController.java --- @@ -77,7 +77,7 @@ */ YarnTwillController(String appName, RunId runId, ZKClient zkClient, final ApplicationMasterLiveNodeData amLiveNodeData, final YarnAppClient yarnAppClient) { -super(runId, zkClient, Collections.emptyList()); +super(appName, runId, zkClient, amLiveNodeData.getKafkaZKConnect() != null, Collections.emptyList()); --- End diff -- At this point is more about style than correctness. I personally prefer to not make the logic of determining to enable or disable logging in YarnTwillController but rather on object that have current state of the union. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user hsaputra commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107781058 --- Diff: twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java --- @@ -105,4 +114,17 @@ public String getRmSchedulerAddr() { public Map getMaxRetries() { return maxRetries; } + + /** + * Returns the ZK connection string for the Kafka used for log collections, + * or {@code null} if log collection is disabled. + */ + @Nullable + public String getKafkaZKConnect() { +if (!isLogCollectionEnabled()) { --- End diff -- Ah sorry, that will work, I was looking at snippet of change. Yeah, should work --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user chtyim commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107773444 --- Diff: twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java --- @@ -105,4 +114,17 @@ public String getRmSchedulerAddr() { public Map getMaxRetries() { return maxRetries; } + + /** + * Returns the ZK connection string for the Kafka used for log collections, + * or {@code null} if log collection is disabled. + */ + @Nullable + public String getKafkaZKConnect() { +if (!isLogCollectionEnabled()) { --- End diff -- Do you mean this javadoc and the `@Nullable` annotation not enough? ```java /** * Returns the ZK connection string for the Kafka used for log collections, * or {@code null} if log collection is disabled. */ @Nullable public String getKafkaZKConnect() ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user hsaputra commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107771740 --- Diff: twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java --- @@ -105,4 +114,17 @@ public String getRmSchedulerAddr() { public Map getMaxRetries() { return maxRetries; } + + /** + * Returns the ZK connection string for the Kafka used for log collections, + * or {@code null} if log collection is disabled. + */ + @Nullable + public String getKafkaZKConnect() { +if (!isLogCollectionEnabled()) { --- End diff -- Same thing of no way that caller could have guessed without looking at the code that there is a check of isLogCollectionEnabled inside getKafkaZKConnect that have inferred meaning when it is returning null. The compromise I think at least added JavaDoc header for getKafkaZKConnect to explain why it could return null. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user chtyim commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107767933 --- Diff: twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java --- @@ -105,4 +114,17 @@ public String getRmSchedulerAddr() { public Map getMaxRetries() { return maxRetries; } + + /** + * Returns the ZK connection string for the Kafka used for log collections, + * or {@code null} if log collection is disabled. + */ + @Nullable + public String getKafkaZKConnect() { +if (!isLogCollectionEnabled()) { --- End diff -- This is because there is no way to enforce someone have to call `isLogCollectionEnabled` first before calling the `getKafkaZKConnect` method, meaning the `getKafkaZKConnect` method needs to handle the case when log collection disabled anyway (either by returning `null` or returning `Optional` or throwing exception). From the caller perspective, it seems more straight forward to call the method and validate against the contract then calling two methods in the right order. However, having `getKafkaZKConnect` returning `null` (I prefer `Optional`, but since we are on Java 7, and we are on the road on reducing reliance on guava, so that's not really an option), can make cases like this simpler in future when adding support for TWILL-147. E.g. ```java YarnTwillController(@Nullable String kafkaZKConnect, ...) { this.kafkaClient = createKafkaClient(kafkaZKConnect); } @Nullable private KafkaClient createKafkaClient(@Nnullable String kafkaZKConnect) { if (kafkaZKConnect == null) { return null; } return new KafkaClient(); } ... // For app launch with the current twill runner new YarnTwillController(spec.getKafkaZKConnect(), ...); // For running app recovered from ZK. new YarnTwillController(amLiveNode.getKafkaZKConnect(), ...); ``` Without the `nullable` support, there would be more overloaded constructors as well as code branches to achieve the same. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user hsaputra commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107739693 --- Diff: twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java --- @@ -105,4 +114,17 @@ public String getRmSchedulerAddr() { public Map getMaxRetries() { return maxRetries; } + + /** + * Returns the ZK connection string for the Kafka used for log collections, + * or {@code null} if log collection is disabled. + */ + @Nullable + public String getKafkaZKConnect() { +if (!isLogCollectionEnabled()) { --- End diff -- The call to `getZkConnectStr` and `getTwillAppName` and `getTwillAppRunId` will never return null based on the contract (no @Nullable) , hence call to `getKafkaZKConnect` will never be null. I was saying that either way, someone need to check whether call to getKafkaZKConnect will return null if log collection is disabled, so why not ask to isLogCollectionEnabled to check first. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user chtyim commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107717653 --- Diff: twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java --- @@ -105,4 +114,17 @@ public String getRmSchedulerAddr() { public Map getMaxRetries() { return maxRetries; } + + /** + * Returns the ZK connection string for the Kafka used for log collections, + * or {@code null} if log collection is disabled. + */ + @Nullable + public String getKafkaZKConnect() { +if (!isLogCollectionEnabled()) { --- End diff -- How to avoid returning null? You mean by throwing exception so that the caller has to call isLogCollectionEnabled before calling this method? I am not a friend of method coupling like this. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user chtyim commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107717203 --- Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillController.java --- @@ -77,7 +77,7 @@ */ YarnTwillController(String appName, RunId runId, ZKClient zkClient, final ApplicationMasterLiveNodeData amLiveNodeData, final YarnAppClient yarnAppClient) { -super(runId, zkClient, Collections.emptyList()); +super(appName, runId, zkClient, amLiveNodeData.getKafkaZKConnect() != null, Collections.emptyList()); --- End diff -- The contract of the getKafkaZKConnect won't change though, since it has to return something (or throw exception). From the caller point of view, there is almost no differences. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user hsaputra commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107690293 --- Diff: twill-core/src/main/java/org/apache/twill/internal/AbstractTwillController.java --- @@ -71,21 +71,37 @@ private static final Logger LOG = LoggerFactory.getLogger(AbstractTwillController.class); private static final Gson GSON = new Gson(); + private final String appName; + private final RunId runId; private final Queue logHandlers; private final KafkaClientService kafkaClient; private ZKDiscoveryService discoveryServiceClient; private Cancellable logCancellable; - public AbstractTwillController(RunId runId, ZKClient zkClient, Iterable logHandlers) { + public AbstractTwillController(String appName, RunId runId, ZKClient zkClient, boolean logCollectionEnabled, + Iterable logHandlers) { super(runId, zkClient); +this.appName = appName; +this.runId = runId; this.logHandlers = new ConcurrentLinkedQueue<>(); -this.kafkaClient = new ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + "/kafka")); -Iterables.addAll(this.logHandlers, logHandlers); + +// When addressing TWILL-147, need to check if the given ZKClient is +// actually used by the Kafka used for log collection +if (logCollectionEnabled) { + this.kafkaClient = new ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + "/kafka")); + Iterables.addAll(this.logHandlers, logHandlers); +} else { + this.kafkaClient = null; + if (!Iterables.isEmpty(logHandlers)) { +LOG.warn("Log collection is disabled for application {} with runId {}. " + + "Adding log handler won't get any logs.", appName, runId); + } +} } @Override protected synchronized void doStartUp() { -if (!logHandlers.isEmpty()) { +if (kafkaClient != null && !logHandlers.isEmpty()) { --- End diff -- You are right here, there is a chance that even log is enabled but caller can pass empty `logHandlers`. +1 for current check --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user hsaputra commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107689017 --- Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillController.java --- @@ -77,7 +77,7 @@ */ YarnTwillController(String appName, RunId runId, ZKClient zkClient, final ApplicationMasterLiveNodeData amLiveNodeData, final YarnAppClient yarnAppClient) { -super(runId, zkClient, Collections.emptyList()); +super(appName, runId, zkClient, amLiveNodeData.getKafkaZKConnect() != null, Collections.emptyList()); --- End diff -- Yeah, thinking about having `ApplicationMasterLiveNode.isLogCollectionEnabled` to make the contract more clear. Otherwise later we will forget why checking for getKafkaZKConnect != null here --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user hsaputra commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107688594 --- Diff: twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java --- @@ -105,4 +114,17 @@ public String getRmSchedulerAddr() { public Map getMaxRetries() { return maxRetries; } + + /** + * Returns the ZK connection string for the Kafka used for log collections, + * or {@code null} if log collection is disabled. + */ + @Nullable + public String getKafkaZKConnect() { +if (!isLogCollectionEnabled()) { --- End diff -- Thats actually what I meant. I am wondering if we could avoid returning null here because the extra check of if for `isLogCollectionEnabled`. The idea is the code that need to get KafkaZkconnect can get it when it is needed and dont call it when there is no need to do Kafka connect bc the caller can call `isLogCollectionEnabled` to determine whether to get the connect info or not. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user chtyim commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107347828 --- Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillController.java --- @@ -77,7 +77,7 @@ */ YarnTwillController(String appName, RunId runId, ZKClient zkClient, final ApplicationMasterLiveNodeData amLiveNodeData, final YarnAppClient yarnAppClient) { -super(runId, zkClient, Collections.emptyList()); +super(appName, runId, zkClient, amLiveNodeData.getKafkaZKConnect() != null, Collections.emptyList()); --- End diff -- Or are you suggesting adding a new method `boolean isLogCollectionEnabled` to the `ApplicationMasterLiveNode` class? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user chtyim commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107347590 --- Diff: twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java --- @@ -105,4 +114,17 @@ public String getRmSchedulerAddr() { public Map getMaxRetries() { return maxRetries; } + + /** + * Returns the ZK connection string for the Kafka used for log collections, + * or {@code null} if log collection is disabled. + */ + @Nullable + public String getKafkaZKConnect() { +if (!isLogCollectionEnabled()) { --- End diff -- What do mean by side effect check? The contract of this method is to return `null` if log collection is disabled. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user chtyim commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107347432 --- Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillController.java --- @@ -77,7 +77,7 @@ */ YarnTwillController(String appName, RunId runId, ZKClient zkClient, final ApplicationMasterLiveNodeData amLiveNodeData, final YarnAppClient yarnAppClient) { -super(runId, zkClient, Collections.emptyList()); +super(appName, runId, zkClient, amLiveNodeData.getKafkaZKConnect() != null, Collections.emptyList()); --- End diff -- ? I don't understand, can you give me an example? The current logic is, from an existing app discovered via ZK, if the AM live node data doesn't have the KafkaZKConnect string, that means the app was launched with log collection disabled, hence the logic here. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user hsaputra commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107347379 --- Diff: twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java --- @@ -105,4 +114,17 @@ public String getRmSchedulerAddr() { public Map getMaxRetries() { return maxRetries; } + + /** + * Returns the ZK connection string for the Kafka used for log collections, + * or {@code null} if log collection is disabled. + */ + @Nullable + public String getKafkaZKConnect() { +if (!isLogCollectionEnabled()) { --- End diff -- I am not too keen about having this "side effect" check for log collection enabled here to return KafkaZKConnect. Caller can always check `isLogCollectionEnabled()` to figure out whether log collection is disabled, right? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user hsaputra commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107347185 --- Diff: twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillController.java --- @@ -77,7 +77,7 @@ */ YarnTwillController(String appName, RunId runId, ZKClient zkClient, final ApplicationMasterLiveNodeData amLiveNodeData, final YarnAppClient yarnAppClient) { -super(runId, zkClient, Collections.emptyList()); +super(appName, runId, zkClient, amLiveNodeData.getKafkaZKConnect() != null, Collections.emptyList()); --- End diff -- Can the constructor of `YarnTwillController` take new input of `logCollectionEnabled` ? I am seeing different ways to check whether to disable log or not --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user chtyim commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107346435 --- Diff: twill-core/src/main/java/org/apache/twill/internal/AbstractTwillController.java --- @@ -71,21 +71,37 @@ private static final Logger LOG = LoggerFactory.getLogger(AbstractTwillController.class); private static final Gson GSON = new Gson(); + private final String appName; + private final RunId runId; private final Queue logHandlers; private final KafkaClientService kafkaClient; private ZKDiscoveryService discoveryServiceClient; private Cancellable logCancellable; - public AbstractTwillController(RunId runId, ZKClient zkClient, Iterable logHandlers) { + public AbstractTwillController(String appName, RunId runId, ZKClient zkClient, boolean logCollectionEnabled, + Iterable logHandlers) { super(runId, zkClient); +this.appName = appName; +this.runId = runId; this.logHandlers = new ConcurrentLinkedQueue<>(); -this.kafkaClient = new ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + "/kafka")); -Iterables.addAll(this.logHandlers, logHandlers); + +// When addressing TWILL-147, need to check if the given ZKClient is +// actually used by the Kafka used for log collection +if (logCollectionEnabled) { + this.kafkaClient = new ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + "/kafka")); + Iterables.addAll(this.logHandlers, logHandlers); +} else { + this.kafkaClient = null; + if (!Iterables.isEmpty(logHandlers)) { +LOG.warn("Log collection is disabled for application {} with runId {}. " + + "Adding log handler won't get any logs.", appName, runId); + } +} } @Override protected synchronized void doStartUp() { -if (!logHandlers.isEmpty()) { +if (kafkaClient != null && !logHandlers.isEmpty()) { --- End diff -- you mean have the logic as ```java if (kafkaClient == null) { return; } if (!logHandlers.isEmpty()) { kafkaClient.startAndWait(); logCancellable = kafkaClient.getConsumer(). } ``` ?? Doesn't look simpler than the existing code to me. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user hsaputra commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107346167 --- Diff: twill-core/src/main/java/org/apache/twill/internal/AbstractTwillController.java --- @@ -71,21 +71,37 @@ private static final Logger LOG = LoggerFactory.getLogger(AbstractTwillController.class); private static final Gson GSON = new Gson(); + private final String appName; + private final RunId runId; private final Queue logHandlers; private final KafkaClientService kafkaClient; private ZKDiscoveryService discoveryServiceClient; private Cancellable logCancellable; - public AbstractTwillController(RunId runId, ZKClient zkClient, Iterable logHandlers) { + public AbstractTwillController(String appName, RunId runId, ZKClient zkClient, boolean logCollectionEnabled, + Iterable logHandlers) { super(runId, zkClient); +this.appName = appName; +this.runId = runId; this.logHandlers = new ConcurrentLinkedQueue<>(); -this.kafkaClient = new ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + "/kafka")); -Iterables.addAll(this.logHandlers, logHandlers); + +// When addressing TWILL-147, need to check if the given ZKClient is +// actually used by the Kafka used for log collection +if (logCollectionEnabled) { + this.kafkaClient = new ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + "/kafka")); + Iterables.addAll(this.logHandlers, logHandlers); +} else { + this.kafkaClient = null; + if (!Iterables.isEmpty(logHandlers)) { +LOG.warn("Log collection is disabled for application {} with runId {}. " + + "Adding log handler won't get any logs.", appName, runId); + } +} } @Override protected synchronized void doStartUp() { -if (!logHandlers.isEmpty()) { +if (kafkaClient != null && !logHandlers.isEmpty()) { --- End diff -- But `this.logHandlers` will always be empty if `if (kafkaClient == null)` is true? So could we just rely on check on kafkaClient as null ? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user chtyim commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107344859 --- Diff: twill-core/src/main/java/org/apache/twill/internal/AbstractTwillController.java --- @@ -71,21 +71,37 @@ private static final Logger LOG = LoggerFactory.getLogger(AbstractTwillController.class); private static final Gson GSON = new Gson(); + private final String appName; + private final RunId runId; private final Queue logHandlers; private final KafkaClientService kafkaClient; private ZKDiscoveryService discoveryServiceClient; private Cancellable logCancellable; - public AbstractTwillController(RunId runId, ZKClient zkClient, Iterable logHandlers) { + public AbstractTwillController(String appName, RunId runId, ZKClient zkClient, boolean logCollectionEnabled, + Iterable logHandlers) { super(runId, zkClient); +this.appName = appName; +this.runId = runId; this.logHandlers = new ConcurrentLinkedQueue<>(); -this.kafkaClient = new ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + "/kafka")); -Iterables.addAll(this.logHandlers, logHandlers); + +// When addressing TWILL-147, need to check if the given ZKClient is +// actually used by the Kafka used for log collection +if (logCollectionEnabled) { + this.kafkaClient = new ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + "/kafka")); + Iterables.addAll(this.logHandlers, logHandlers); +} else { + this.kafkaClient = null; + if (!Iterables.isEmpty(logHandlers)) { +LOG.warn("Log collection is disabled for application {} with runId {}. " + + "Adding log handler won't get any logs.", appName, runId); + } +} } @Override protected synchronized void doStartUp() { -if (!logHandlers.isEmpty()) { +if (kafkaClient != null && !logHandlers.isEmpty()) { --- End diff -- `logCollectionEnabled` is not a field. I used `kafkaClient == null` to indicate disabling of log collection. Also, the `logHandlers` collection can be empty during startup of the controller, even log collection is enabled, hence the check. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
Github user hsaputra commented on a diff in the pull request: https://github.com/apache/twill/pull/40#discussion_r107344378 --- Diff: twill-core/src/main/java/org/apache/twill/internal/AbstractTwillController.java --- @@ -71,21 +71,37 @@ private static final Logger LOG = LoggerFactory.getLogger(AbstractTwillController.class); private static final Gson GSON = new Gson(); + private final String appName; + private final RunId runId; private final Queue logHandlers; private final KafkaClientService kafkaClient; private ZKDiscoveryService discoveryServiceClient; private Cancellable logCancellable; - public AbstractTwillController(RunId runId, ZKClient zkClient, Iterable logHandlers) { + public AbstractTwillController(String appName, RunId runId, ZKClient zkClient, boolean logCollectionEnabled, + Iterable logHandlers) { super(runId, zkClient); +this.appName = appName; +this.runId = runId; this.logHandlers = new ConcurrentLinkedQueue<>(); -this.kafkaClient = new ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + "/kafka")); -Iterables.addAll(this.logHandlers, logHandlers); + +// When addressing TWILL-147, need to check if the given ZKClient is +// actually used by the Kafka used for log collection +if (logCollectionEnabled) { + this.kafkaClient = new ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + "/kafka")); + Iterables.addAll(this.logHandlers, logHandlers); +} else { + this.kafkaClient = null; + if (!Iterables.isEmpty(logHandlers)) { +LOG.warn("Log collection is disabled for application {} with runId {}. " + + "Adding log handler won't get any logs.", appName, runId); + } +} } @Override protected synchronized void doStartUp() { -if (!logHandlers.isEmpty()) { +if (kafkaClient != null && !logHandlers.isEmpty()) { --- End diff -- Should this just check `if (logCollectionEnabled)` instead since technically both conditions will be met when logCollectionEnabled is true --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection
GitHub user chtyim opened a pull request: https://github.com/apache/twill/pull/40 (TWILL-122) Allow disabling log collection - Introduced a new configuration twill.log.collection.enabled for turning off log collection - Refactor YarnTwillController and related class hierarchy to not starting Kafka client when log collection is disabled - Added Kafka zk connection string information in AM live node data - Refactor KafkaAppender and ServiceMain configureLogger - Log to StatusManager instead of Logger to avoid recursive logging - Instead of resetting logback configuration, directly instantiate and add the Kafka log appender to the logging context. - Refactor ServiceMain, ApplicationMasterMain and TwillContainerMain to simplify ZK Connection string construction You can merge this pull request into a Git repository by running: $ git pull https://github.com/chtyim/twill feature/TWILL-122 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/twill/pull/40.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #40 commit 617280ee72b2e7fc7a3cc473d6dfe35576cf5e57 Author: Terence Yim Date: 2017-03-20T21:42:46Z (TWILL-122) Allow disabling log collection - Introduced a new configuration twill.log.collection.enabled for turning off log collection - Refactor YarnTwillController and related class hierarchy to not starting Kafka client when log collection is disabled - Added Kafka zk connection string information in AM live node data - Refactor KafkaAppender and ServiceMain configureLogger - Log to StatusManager instead of Logger to avoid recursive logging - Instead of resetting logback configuration, directly instantiate and add the Kafka log appender to the logging context. - Refactor ServiceMain, ApplicationMasterMain and TwillContainerMain to simplify ZK Connection string construction --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---