umustafi commented on code in PR #3602:
URL: https://github.com/apache/gobblin/pull/3602#discussion_r1026979650


##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -544,23 +544,25 @@ void connectHelixManagerWithRetry() {
   private void addInstanceTags() {
     HelixManager receiverManager = getReceiverManager();
     if (receiverManager.isConnected()) {
-      // The helix instance associated with this container should be 
consistent on helix tag
-      List<String> existedTags = receiverManager.getClusterManagmentTool()
-          .getInstanceConfig(this.clusterName, 
this.helixInstanceName).getTags();
-      Set<String> desiredTags = new HashSet<>(
-          ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
-      if (!desiredTags.isEmpty()) {
-        // Remove tag assignments for the current Helix instance from a 
previous run
-        for (String tag : existedTags) {
-          if (!desiredTags.contains(tag))
-            
receiverManager.getClusterManagmentTool().removeInstanceTag(this.clusterName, 
this.helixInstanceName, tag);
-          logger.info("Removed unrelated helix tag {} for instance {}", tag, 
this.helixInstanceName);
+      try {
+        Set<String> desiredTags = new 
HashSet<>(ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
+        if (!desiredTags.isEmpty()) {
+          // The helix instance associated with this container should be 
consistent on helix tag
+          List<String> existedTags =
+              
receiverManager.getClusterManagmentTool().getInstanceConfig(this.clusterName, 
this.helixInstanceName).getTags();

Review Comment:
   I believe there are some yarn related assumptions (on container start 
registering the instance config or something) and basically we are not setting 
that up for all use cases. I saw a `setInstanceConfig` method in `ZKHelixAdmin` 
that should be called explicitly in the non-Yarn case perhaps if Helix is not 
automatically creating one if taskrunner joins. From testing the code, 
previously we never entered and tested this code block since `desiredTags` is 
empty for our use case. This shouldn't change the streaming logic because the 
`existedTags` definition is only used inside the code block so I've moved the 
declaration to where it's used and caught the exception that can be thrown. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to