[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/twill/pull/40


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-23 Thread hsaputra
Github user hsaputra commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107788346
  
--- Diff: 
twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillController.java ---
@@ -77,7 +77,7 @@
*/
   YarnTwillController(String appName, RunId runId, ZKClient zkClient,
   final ApplicationMasterLiveNodeData amLiveNodeData, 
final YarnAppClient yarnAppClient) {
-super(runId, zkClient, Collections.emptyList());
+super(appName, runId, zkClient, amLiveNodeData.getKafkaZKConnect() != 
null, Collections.emptyList());
--- End diff --

At this point is more about style than correctness. I personally prefer to 
not make the logic of determining to enable or disable logging in 
YarnTwillController but rather on object that have current state of the union.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-23 Thread hsaputra
Github user hsaputra commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107781058
  
--- Diff: 
twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java
 ---
@@ -105,4 +114,17 @@ public String getRmSchedulerAddr() {
   public Map getMaxRetries() {
 return maxRetries;
   }
+
+  /**
+   * Returns the ZK connection string for the Kafka used for log 
collections,
+   * or {@code null} if log collection is disabled.
+   */
+  @Nullable
+  public String getKafkaZKConnect() {
+if (!isLogCollectionEnabled()) {
--- End diff --

Ah sorry, that will work, I was looking at snippet of change. Yeah, should 
work


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-23 Thread chtyim
Github user chtyim commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107773444
  
--- Diff: 
twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java
 ---
@@ -105,4 +114,17 @@ public String getRmSchedulerAddr() {
   public Map getMaxRetries() {
 return maxRetries;
   }
+
+  /**
+   * Returns the ZK connection string for the Kafka used for log 
collections,
+   * or {@code null} if log collection is disabled.
+   */
+  @Nullable
+  public String getKafkaZKConnect() {
+if (!isLogCollectionEnabled()) {
--- End diff --

Do you mean this javadoc and the `@Nullable` annotation not enough?

```java
/**
  * Returns the ZK connection string for the Kafka used for log collections,
  * or {@code null} if log collection is disabled.
  */
@Nullable
public String getKafkaZKConnect()
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-23 Thread hsaputra
Github user hsaputra commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107771740
  
--- Diff: 
twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java
 ---
@@ -105,4 +114,17 @@ public String getRmSchedulerAddr() {
   public Map getMaxRetries() {
 return maxRetries;
   }
+
+  /**
+   * Returns the ZK connection string for the Kafka used for log 
collections,
+   * or {@code null} if log collection is disabled.
+   */
+  @Nullable
+  public String getKafkaZKConnect() {
+if (!isLogCollectionEnabled()) {
--- End diff --

Same thing of no way that caller could have guessed without looking at the 
code that there is a check of isLogCollectionEnabled inside getKafkaZKConnect 
that have inferred meaning when it is returning null.

The compromise I think at least added JavaDoc header for getKafkaZKConnect 
to explain why it could return null.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-23 Thread chtyim
Github user chtyim commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107767933
  
--- Diff: 
twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java
 ---
@@ -105,4 +114,17 @@ public String getRmSchedulerAddr() {
   public Map getMaxRetries() {
 return maxRetries;
   }
+
+  /**
+   * Returns the ZK connection string for the Kafka used for log 
collections,
+   * or {@code null} if log collection is disabled.
+   */
+  @Nullable
+  public String getKafkaZKConnect() {
+if (!isLogCollectionEnabled()) {
--- End diff --

This is because there is no way to enforce someone have to call 
`isLogCollectionEnabled` first before calling the `getKafkaZKConnect` method, 
meaning the `getKafkaZKConnect` method needs to handle the case when log 
collection disabled anyway (either by returning `null` or returning `Optional` 
or throwing exception). From the caller perspective, it seems more straight 
forward to call the method and validate against the contract then calling two 
methods in the right order.

However, having `getKafkaZKConnect` returning `null` (I prefer `Optional`, 
but since we are on Java 7, and we are on the road on reducing reliance on 
guava, so that's not really an option), can make cases like this simpler in 
future when adding support for TWILL-147. E.g.

```java
YarnTwillController(@Nullable String kafkaZKConnect, ...) {
  this.kafkaClient = createKafkaClient(kafkaZKConnect);
}

@Nullable
private KafkaClient createKafkaClient(@Nnullable String kafkaZKConnect) {
  if (kafkaZKConnect == null) {
return null;
  }
  return new KafkaClient();
}

...
// For app launch with the current twill runner
new YarnTwillController(spec.getKafkaZKConnect(), ...);

// For running app recovered from ZK.
new YarnTwillController(amLiveNode.getKafkaZKConnect(), ...);
```

Without the `nullable` support, there would be more overloaded constructors 
as well as code branches to achieve the same.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-23 Thread hsaputra
Github user hsaputra commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107739693
  
--- Diff: 
twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java
 ---
@@ -105,4 +114,17 @@ public String getRmSchedulerAddr() {
   public Map getMaxRetries() {
 return maxRetries;
   }
+
+  /**
+   * Returns the ZK connection string for the Kafka used for log 
collections,
+   * or {@code null} if log collection is disabled.
+   */
+  @Nullable
+  public String getKafkaZKConnect() {
+if (!isLogCollectionEnabled()) {
--- End diff --

The call to `getZkConnectStr` and `getTwillAppName` and `getTwillAppRunId` 
will never return null based on the contract (no @Nullable) , hence call to 
`getKafkaZKConnect` will never be null. 

I was saying that either way, someone need to check whether call to 
getKafkaZKConnect will return null if log collection is disabled, so why not 
ask to isLogCollectionEnabled to check first.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-23 Thread chtyim
Github user chtyim commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107717653
  
--- Diff: 
twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java
 ---
@@ -105,4 +114,17 @@ public String getRmSchedulerAddr() {
   public Map getMaxRetries() {
 return maxRetries;
   }
+
+  /**
+   * Returns the ZK connection string for the Kafka used for log 
collections,
+   * or {@code null} if log collection is disabled.
+   */
+  @Nullable
+  public String getKafkaZKConnect() {
+if (!isLogCollectionEnabled()) {
--- End diff --

How to avoid returning null? You mean by throwing exception so that the 
caller has to call isLogCollectionEnabled before calling this method? I am not 
a friend of method coupling like this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-23 Thread chtyim
Github user chtyim commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107717203
  
--- Diff: 
twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillController.java ---
@@ -77,7 +77,7 @@
*/
   YarnTwillController(String appName, RunId runId, ZKClient zkClient,
   final ApplicationMasterLiveNodeData amLiveNodeData, 
final YarnAppClient yarnAppClient) {
-super(runId, zkClient, Collections.emptyList());
+super(appName, runId, zkClient, amLiveNodeData.getKafkaZKConnect() != 
null, Collections.emptyList());
--- End diff --

The contract of the getKafkaZKConnect won't change though, since it has to 
return something (or throw exception). From the caller point of view, there is 
almost no differences.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-23 Thread hsaputra
Github user hsaputra commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107690293
  
--- Diff: 
twill-core/src/main/java/org/apache/twill/internal/AbstractTwillController.java 
---
@@ -71,21 +71,37 @@
   private static final Logger LOG = 
LoggerFactory.getLogger(AbstractTwillController.class);
   private static final Gson GSON = new Gson();
 
+  private final String appName;
+  private final RunId runId;
   private final Queue logHandlers;
   private final KafkaClientService kafkaClient;
   private ZKDiscoveryService discoveryServiceClient;
   private Cancellable logCancellable;
 
-  public AbstractTwillController(RunId runId, ZKClient zkClient, 
Iterable logHandlers) {
+  public AbstractTwillController(String appName, RunId runId, ZKClient 
zkClient, boolean logCollectionEnabled,
+ Iterable logHandlers) {
 super(runId, zkClient);
+this.appName = appName;
+this.runId = runId;
 this.logHandlers = new ConcurrentLinkedQueue<>();
-this.kafkaClient = new 
ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + 
"/kafka"));
-Iterables.addAll(this.logHandlers, logHandlers);
+
+// When addressing TWILL-147, need to check if the given ZKClient is
+// actually used by the Kafka used for log collection
+if (logCollectionEnabled) {
+  this.kafkaClient = new 
ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + 
"/kafka"));
+  Iterables.addAll(this.logHandlers, logHandlers);
+} else {
+  this.kafkaClient = null;
+  if (!Iterables.isEmpty(logHandlers)) {
+LOG.warn("Log collection is disabled for application {} with runId 
{}. " +
+   "Adding log handler won't get any logs.", appName, 
runId);
+  }
+}
   }
 
   @Override
   protected synchronized void doStartUp() {
-if (!logHandlers.isEmpty()) {
+if (kafkaClient != null && !logHandlers.isEmpty()) {
--- End diff --

You are right here, there  is a chance that even log is enabled but caller 
can pass empty `logHandlers`.
+1 for current check


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-23 Thread hsaputra
Github user hsaputra commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107689017
  
--- Diff: 
twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillController.java ---
@@ -77,7 +77,7 @@
*/
   YarnTwillController(String appName, RunId runId, ZKClient zkClient,
   final ApplicationMasterLiveNodeData amLiveNodeData, 
final YarnAppClient yarnAppClient) {
-super(runId, zkClient, Collections.emptyList());
+super(appName, runId, zkClient, amLiveNodeData.getKafkaZKConnect() != 
null, Collections.emptyList());
--- End diff --

Yeah, thinking about having 
`ApplicationMasterLiveNode.isLogCollectionEnabled` to make the contract more 
clear. Otherwise later we will forget why checking for getKafkaZKConnect != 
null here


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-23 Thread hsaputra
Github user hsaputra commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107688594
  
--- Diff: 
twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java
 ---
@@ -105,4 +114,17 @@ public String getRmSchedulerAddr() {
   public Map getMaxRetries() {
 return maxRetries;
   }
+
+  /**
+   * Returns the ZK connection string for the Kafka used for log 
collections,
+   * or {@code null} if log collection is disabled.
+   */
+  @Nullable
+  public String getKafkaZKConnect() {
+if (!isLogCollectionEnabled()) {
--- End diff --

Thats actually what I meant. I am wondering if we could avoid returning 
null here because the extra check of if for `isLogCollectionEnabled`. The idea 
is the code that need to get KafkaZkconnect can get it when it is needed and 
dont call it when there is no need to do Kafka connect bc the caller can call 
`isLogCollectionEnabled` to determine whether to get the connect info or not.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-22 Thread chtyim
Github user chtyim commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107347828
  
--- Diff: 
twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillController.java ---
@@ -77,7 +77,7 @@
*/
   YarnTwillController(String appName, RunId runId, ZKClient zkClient,
   final ApplicationMasterLiveNodeData amLiveNodeData, 
final YarnAppClient yarnAppClient) {
-super(runId, zkClient, Collections.emptyList());
+super(appName, runId, zkClient, amLiveNodeData.getKafkaZKConnect() != 
null, Collections.emptyList());
--- End diff --

Or are you suggesting adding a new method `boolean isLogCollectionEnabled` 
to the `ApplicationMasterLiveNode` class?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-22 Thread chtyim
Github user chtyim commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107347590
  
--- Diff: 
twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java
 ---
@@ -105,4 +114,17 @@ public String getRmSchedulerAddr() {
   public Map getMaxRetries() {
 return maxRetries;
   }
+
+  /**
+   * Returns the ZK connection string for the Kafka used for log 
collections,
+   * or {@code null} if log collection is disabled.
+   */
+  @Nullable
+  public String getKafkaZKConnect() {
+if (!isLogCollectionEnabled()) {
--- End diff --

What do mean by side effect check? The contract of this method is to return 
`null` if log collection is disabled.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-22 Thread chtyim
Github user chtyim commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107347432
  
--- Diff: 
twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillController.java ---
@@ -77,7 +77,7 @@
*/
   YarnTwillController(String appName, RunId runId, ZKClient zkClient,
   final ApplicationMasterLiveNodeData amLiveNodeData, 
final YarnAppClient yarnAppClient) {
-super(runId, zkClient, Collections.emptyList());
+super(appName, runId, zkClient, amLiveNodeData.getKafkaZKConnect() != 
null, Collections.emptyList());
--- End diff --

? I don't understand, can you give me an example? The current logic is, 
from an existing app discovered via ZK, if the AM live node data doesn't have 
the KafkaZKConnect string, that means the app was launched with log collection 
disabled, hence the logic here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-22 Thread hsaputra
Github user hsaputra commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107347379
  
--- Diff: 
twill-core/src/main/java/org/apache/twill/internal/TwillRuntimeSpecification.java
 ---
@@ -105,4 +114,17 @@ public String getRmSchedulerAddr() {
   public Map getMaxRetries() {
 return maxRetries;
   }
+
+  /**
+   * Returns the ZK connection string for the Kafka used for log 
collections,
+   * or {@code null} if log collection is disabled.
+   */
+  @Nullable
+  public String getKafkaZKConnect() {
+if (!isLogCollectionEnabled()) {
--- End diff --

I am not too keen about having this "side effect" check for log collection 
enabled here to return KafkaZKConnect.
Caller can always check `isLogCollectionEnabled()` to figure out whether 
log collection is disabled, right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-22 Thread hsaputra
Github user hsaputra commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107347185
  
--- Diff: 
twill-yarn/src/main/java/org/apache/twill/yarn/YarnTwillController.java ---
@@ -77,7 +77,7 @@
*/
   YarnTwillController(String appName, RunId runId, ZKClient zkClient,
   final ApplicationMasterLiveNodeData amLiveNodeData, 
final YarnAppClient yarnAppClient) {
-super(runId, zkClient, Collections.emptyList());
+super(appName, runId, zkClient, amLiveNodeData.getKafkaZKConnect() != 
null, Collections.emptyList());
--- End diff --

Can the constructor of `YarnTwillController` take new input of 
`logCollectionEnabled` ? I am seeing different ways to check whether to disable 
log or not


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-22 Thread chtyim
Github user chtyim commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107346435
  
--- Diff: 
twill-core/src/main/java/org/apache/twill/internal/AbstractTwillController.java 
---
@@ -71,21 +71,37 @@
   private static final Logger LOG = 
LoggerFactory.getLogger(AbstractTwillController.class);
   private static final Gson GSON = new Gson();
 
+  private final String appName;
+  private final RunId runId;
   private final Queue logHandlers;
   private final KafkaClientService kafkaClient;
   private ZKDiscoveryService discoveryServiceClient;
   private Cancellable logCancellable;
 
-  public AbstractTwillController(RunId runId, ZKClient zkClient, 
Iterable logHandlers) {
+  public AbstractTwillController(String appName, RunId runId, ZKClient 
zkClient, boolean logCollectionEnabled,
+ Iterable logHandlers) {
 super(runId, zkClient);
+this.appName = appName;
+this.runId = runId;
 this.logHandlers = new ConcurrentLinkedQueue<>();
-this.kafkaClient = new 
ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + 
"/kafka"));
-Iterables.addAll(this.logHandlers, logHandlers);
+
+// When addressing TWILL-147, need to check if the given ZKClient is
+// actually used by the Kafka used for log collection
+if (logCollectionEnabled) {
+  this.kafkaClient = new 
ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + 
"/kafka"));
+  Iterables.addAll(this.logHandlers, logHandlers);
+} else {
+  this.kafkaClient = null;
+  if (!Iterables.isEmpty(logHandlers)) {
+LOG.warn("Log collection is disabled for application {} with runId 
{}. " +
+   "Adding log handler won't get any logs.", appName, 
runId);
+  }
+}
   }
 
   @Override
   protected synchronized void doStartUp() {
-if (!logHandlers.isEmpty()) {
+if (kafkaClient != null && !logHandlers.isEmpty()) {
--- End diff --

you mean have the logic as

```java
if (kafkaClient == null) {
  return;
}
if (!logHandlers.isEmpty()) {
  kafkaClient.startAndWait();
  logCancellable = kafkaClient.getConsumer().
}
```

??
Doesn't look simpler than the existing code to me.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-22 Thread hsaputra
Github user hsaputra commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107346167
  
--- Diff: 
twill-core/src/main/java/org/apache/twill/internal/AbstractTwillController.java 
---
@@ -71,21 +71,37 @@
   private static final Logger LOG = 
LoggerFactory.getLogger(AbstractTwillController.class);
   private static final Gson GSON = new Gson();
 
+  private final String appName;
+  private final RunId runId;
   private final Queue logHandlers;
   private final KafkaClientService kafkaClient;
   private ZKDiscoveryService discoveryServiceClient;
   private Cancellable logCancellable;
 
-  public AbstractTwillController(RunId runId, ZKClient zkClient, 
Iterable logHandlers) {
+  public AbstractTwillController(String appName, RunId runId, ZKClient 
zkClient, boolean logCollectionEnabled,
+ Iterable logHandlers) {
 super(runId, zkClient);
+this.appName = appName;
+this.runId = runId;
 this.logHandlers = new ConcurrentLinkedQueue<>();
-this.kafkaClient = new 
ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + 
"/kafka"));
-Iterables.addAll(this.logHandlers, logHandlers);
+
+// When addressing TWILL-147, need to check if the given ZKClient is
+// actually used by the Kafka used for log collection
+if (logCollectionEnabled) {
+  this.kafkaClient = new 
ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + 
"/kafka"));
+  Iterables.addAll(this.logHandlers, logHandlers);
+} else {
+  this.kafkaClient = null;
+  if (!Iterables.isEmpty(logHandlers)) {
+LOG.warn("Log collection is disabled for application {} with runId 
{}. " +
+   "Adding log handler won't get any logs.", appName, 
runId);
+  }
+}
   }
 
   @Override
   protected synchronized void doStartUp() {
-if (!logHandlers.isEmpty()) {
+if (kafkaClient != null && !logHandlers.isEmpty()) {
--- End diff --

But `this.logHandlers` will always be empty if `if (kafkaClient == null)` 
is true? So could we just rely on check on kafkaClient as null ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-22 Thread chtyim
Github user chtyim commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107344859
  
--- Diff: 
twill-core/src/main/java/org/apache/twill/internal/AbstractTwillController.java 
---
@@ -71,21 +71,37 @@
   private static final Logger LOG = 
LoggerFactory.getLogger(AbstractTwillController.class);
   private static final Gson GSON = new Gson();
 
+  private final String appName;
+  private final RunId runId;
   private final Queue logHandlers;
   private final KafkaClientService kafkaClient;
   private ZKDiscoveryService discoveryServiceClient;
   private Cancellable logCancellable;
 
-  public AbstractTwillController(RunId runId, ZKClient zkClient, 
Iterable logHandlers) {
+  public AbstractTwillController(String appName, RunId runId, ZKClient 
zkClient, boolean logCollectionEnabled,
+ Iterable logHandlers) {
 super(runId, zkClient);
+this.appName = appName;
+this.runId = runId;
 this.logHandlers = new ConcurrentLinkedQueue<>();
-this.kafkaClient = new 
ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + 
"/kafka"));
-Iterables.addAll(this.logHandlers, logHandlers);
+
+// When addressing TWILL-147, need to check if the given ZKClient is
+// actually used by the Kafka used for log collection
+if (logCollectionEnabled) {
+  this.kafkaClient = new 
ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + 
"/kafka"));
+  Iterables.addAll(this.logHandlers, logHandlers);
+} else {
+  this.kafkaClient = null;
+  if (!Iterables.isEmpty(logHandlers)) {
+LOG.warn("Log collection is disabled for application {} with runId 
{}. " +
+   "Adding log handler won't get any logs.", appName, 
runId);
+  }
+}
   }
 
   @Override
   protected synchronized void doStartUp() {
-if (!logHandlers.isEmpty()) {
+if (kafkaClient != null && !logHandlers.isEmpty()) {
--- End diff --

`logCollectionEnabled` is not a field. I used `kafkaClient == null` to 
indicate disabling of log collection.
Also, the `logHandlers` collection can be empty during startup of the 
controller, even log collection is enabled, hence the check.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-22 Thread hsaputra
Github user hsaputra commented on a diff in the pull request:

https://github.com/apache/twill/pull/40#discussion_r107344378
  
--- Diff: 
twill-core/src/main/java/org/apache/twill/internal/AbstractTwillController.java 
---
@@ -71,21 +71,37 @@
   private static final Logger LOG = 
LoggerFactory.getLogger(AbstractTwillController.class);
   private static final Gson GSON = new Gson();
 
+  private final String appName;
+  private final RunId runId;
   private final Queue logHandlers;
   private final KafkaClientService kafkaClient;
   private ZKDiscoveryService discoveryServiceClient;
   private Cancellable logCancellable;
 
-  public AbstractTwillController(RunId runId, ZKClient zkClient, 
Iterable logHandlers) {
+  public AbstractTwillController(String appName, RunId runId, ZKClient 
zkClient, boolean logCollectionEnabled,
+ Iterable logHandlers) {
 super(runId, zkClient);
+this.appName = appName;
+this.runId = runId;
 this.logHandlers = new ConcurrentLinkedQueue<>();
-this.kafkaClient = new 
ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + 
"/kafka"));
-Iterables.addAll(this.logHandlers, logHandlers);
+
+// When addressing TWILL-147, need to check if the given ZKClient is
+// actually used by the Kafka used for log collection
+if (logCollectionEnabled) {
+  this.kafkaClient = new 
ZKKafkaClientService(ZKClients.namespace(zkClient, "/" + runId.getId() + 
"/kafka"));
+  Iterables.addAll(this.logHandlers, logHandlers);
+} else {
+  this.kafkaClient = null;
+  if (!Iterables.isEmpty(logHandlers)) {
+LOG.warn("Log collection is disabled for application {} with runId 
{}. " +
+   "Adding log handler won't get any logs.", appName, 
runId);
+  }
+}
   }
 
   @Override
   protected synchronized void doStartUp() {
-if (!logHandlers.isEmpty()) {
+if (kafkaClient != null && !logHandlers.isEmpty()) {
--- End diff --

Should this just check `if (logCollectionEnabled)` instead since 
technically both conditions will be met when logCollectionEnabled is true


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] twill pull request #40: (TWILL-122) Allow disabling log collection

2017-03-21 Thread chtyim
GitHub user chtyim opened a pull request:

https://github.com/apache/twill/pull/40

(TWILL-122) Allow disabling log collection

- Introduced a new configuration twill.log.collection.enabled for
  turning off log collection
- Refactor YarnTwillController and related class hierarchy to not
  starting Kafka client when log collection is disabled
- Added Kafka zk connection string information in AM live node data
- Refactor KafkaAppender and ServiceMain configureLogger
  - Log to StatusManager instead of Logger to avoid recursive logging
  - Instead of resetting logback configuration, directly instantiate and
add the Kafka log appender to the logging context.
- Refactor ServiceMain, ApplicationMasterMain and TwillContainerMain to
  simplify ZK Connection string construction

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chtyim/twill feature/TWILL-122

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/twill/pull/40.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #40


commit 617280ee72b2e7fc7a3cc473d6dfe35576cf5e57
Author: Terence Yim 
Date:   2017-03-20T21:42:46Z

(TWILL-122) Allow disabling log collection

- Introduced a new configuration twill.log.collection.enabled for
  turning off log collection
- Refactor YarnTwillController and related class hierarchy to not
  starting Kafka client when log collection is disabled
- Added Kafka zk connection string information in AM live node data
- Refactor KafkaAppender and ServiceMain configureLogger
  - Log to StatusManager instead of Logger to avoid recursive logging
  - Instead of resetting logback configuration, directly instantiate and
add the Kafka log appender to the logging context.
- Refactor ServiceMain, ApplicationMasterMain and TwillContainerMain to
  simplify ZK Connection string construction




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---