homatthew commented on code in PR #3602:
URL: https://github.com/apache/gobblin/pull/3602#discussion_r1025937082


##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -544,23 +544,25 @@ void connectHelixManagerWithRetry() {
   private void addInstanceTags() {
     HelixManager receiverManager = getReceiverManager();
     if (receiverManager.isConnected()) {
-      // The helix instance associated with this container should be 
consistent on helix tag
-      List<String> existedTags = receiverManager.getClusterManagmentTool()
-          .getInstanceConfig(this.clusterName, 
this.helixInstanceName).getTags();
-      Set<String> desiredTags = new HashSet<>(
-          ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
-      if (!desiredTags.isEmpty()) {
-        // Remove tag assignments for the current Helix instance from a 
previous run
-        for (String tag : existedTags) {
-          if (!desiredTags.contains(tag))
-            
receiverManager.getClusterManagmentTool().removeInstanceTag(this.clusterName, 
this.helixInstanceName, tag);
-          logger.info("Removed unrelated helix tag {} for instance {}", tag, 
this.helixInstanceName);
+      try {
+        Set<String> desiredTags = new 
HashSet<>(ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));

Review Comment:
   Nit: you removed the new line after new `HashSet<>(`. If possible, can we 
keep the whitespace the same to keep the git history cleaner?



##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -544,23 +544,25 @@ void connectHelixManagerWithRetry() {
   private void addInstanceTags() {
     HelixManager receiverManager = getReceiverManager();
     if (receiverManager.isConnected()) {
-      // The helix instance associated with this container should be 
consistent on helix tag
-      List<String> existedTags = receiverManager.getClusterManagmentTool()
-          .getInstanceConfig(this.clusterName, 
this.helixInstanceName).getTags();
-      Set<String> desiredTags = new HashSet<>(
-          ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
-      if (!desiredTags.isEmpty()) {
-        // Remove tag assignments for the current Helix instance from a 
previous run
-        for (String tag : existedTags) {
-          if (!desiredTags.contains(tag))
-            
receiverManager.getClusterManagmentTool().removeInstanceTag(this.clusterName, 
this.helixInstanceName, tag);
-          logger.info("Removed unrelated helix tag {} for instance {}", tag, 
this.helixInstanceName);
+      try {
+        Set<String> desiredTags = new 
HashSet<>(ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
+        if (!desiredTags.isEmpty()) {

Review Comment:
   From what I can tell, the reason this works is because the 
`GobblinYarnTaskRunner` is the only one to set the config key for 
`HELIX_INSTANCE_TAGS_KEY`. And so by proxy we know that if this desired tags is 
empty, then we can skip this code because this is not being run by the 
`GobblinYarnTaskRunner`. But this all feels pretty messy and a bad experience 
for anyone in the future working on this.
   
   Can we find a better approach with an override method? Isn't the real code 
smell that this code is inherently coupled with the YarnService / 
YarnAutoScalingManager? But since this code is being used outside the yarn 
context, it should instead live inside the `GobblinYarnTaskRunner` somehow.
   
   Admittedly, I am not that big of a fan of inheritance, but since we are 
already using it in this class it makes sense to use that approach 
   
   ```
   public class GobblinYarnTaskRunner extends GobblinTaskRunner
   ```



##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -544,23 +544,25 @@ void connectHelixManagerWithRetry() {
   private void addInstanceTags() {
     HelixManager receiverManager = getReceiverManager();
     if (receiverManager.isConnected()) {
-      // The helix instance associated with this container should be 
consistent on helix tag
-      List<String> existedTags = receiverManager.getClusterManagmentTool()
-          .getInstanceConfig(this.clusterName, 
this.helixInstanceName).getTags();
-      Set<String> desiredTags = new HashSet<>(
-          ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
-      if (!desiredTags.isEmpty()) {
-        // Remove tag assignments for the current Helix instance from a 
previous run
-        for (String tag : existedTags) {
-          if (!desiredTags.contains(tag))
-            
receiverManager.getClusterManagmentTool().removeInstanceTag(this.clusterName, 
this.helixInstanceName, tag);
-          logger.info("Removed unrelated helix tag {} for instance {}", tag, 
this.helixInstanceName);
+      try {
+        Set<String> desiredTags = new 
HashSet<>(ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
+        if (!desiredTags.isEmpty()) {
+          // The helix instance associated with this container should be 
consistent on helix tag
+          List<String> existedTags =
+              
receiverManager.getClusterManagmentTool().getInstanceConfig(this.clusterName, 
this.helixInstanceName).getTags();

Review Comment:
   It seems to me that this would return empty (but clearly it doesn't since it 
causes an exception).



##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -544,23 +544,25 @@ void connectHelixManagerWithRetry() {
   private void addInstanceTags() {
     HelixManager receiverManager = getReceiverManager();
     if (receiverManager.isConnected()) {
-      // The helix instance associated with this container should be 
consistent on helix tag
-      List<String> existedTags = receiverManager.getClusterManagmentTool()
-          .getInstanceConfig(this.clusterName, 
this.helixInstanceName).getTags();
-      Set<String> desiredTags = new HashSet<>(
-          ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
-      if (!desiredTags.isEmpty()) {
-        // Remove tag assignments for the current Helix instance from a 
previous run
-        for (String tag : existedTags) {
-          if (!desiredTags.contains(tag))
-            
receiverManager.getClusterManagmentTool().removeInstanceTag(this.clusterName, 
this.helixInstanceName, tag);
-          logger.info("Removed unrelated helix tag {} for instance {}", tag, 
this.helixInstanceName);
+      try {
+        Set<String> desiredTags = new 
HashSet<>(ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));

Review Comment:
   Ditto for the other whitespace type changes



##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -599,6 +601,7 @@ public void run() {
   private Optional<ContainerMetrics> buildContainerMetrics() {
     Properties properties = ConfigUtils.configToProperties(this.clusterConfig);
     if (GobblinMetrics.isEnabled(properties)) {
+      logger.info("Container metrics are enabled");

Review Comment:
   Nice log addition! Do you know where these container metrics are collected?



##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -544,23 +544,25 @@ void connectHelixManagerWithRetry() {
   private void addInstanceTags() {
     HelixManager receiverManager = getReceiverManager();
     if (receiverManager.isConnected()) {
-      // The helix instance associated with this container should be 
consistent on helix tag
-      List<String> existedTags = receiverManager.getClusterManagmentTool()
-          .getInstanceConfig(this.clusterName, 
this.helixInstanceName).getTags();
-      Set<String> desiredTags = new HashSet<>(
-          ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
-      if (!desiredTags.isEmpty()) {
-        // Remove tag assignments for the current Helix instance from a 
previous run
-        for (String tag : existedTags) {
-          if (!desiredTags.contains(tag))
-            
receiverManager.getClusterManagmentTool().removeInstanceTag(this.clusterName, 
this.helixInstanceName, tag);
-          logger.info("Removed unrelated helix tag {} for instance {}", tag, 
this.helixInstanceName);
+      try {
+        Set<String> desiredTags = new 
HashSet<>(ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
+        if (!desiredTags.isEmpty()) {
+          // The helix instance associated with this container should be 
consistent on helix tag
+          List<String> existedTags =
+              
receiverManager.getClusterManagmentTool().getInstanceConfig(this.clusterName, 
this.helixInstanceName).getTags();

Review Comment:
   Why does moving this line here address the issue? 
   
   In what situations does a config not exist for an instance? Seems to me like 
the sequence of events is:
   ```
   on start(), connectHelixManager() makes taskrunner join as a participant
   Helix should make an instance config for the taskrunner
   ```
   
   In what scenario does the above not happen?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to