[ 
https://issues.apache.org/jira/browse/GOBBLIN-1743?focusedWorklogId=827028&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-827028
 ]

ASF GitHub Bot logged work on GOBBLIN-1743:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Nov/22 03:39
            Start Date: 18/Nov/22 03:39
    Worklog Time Spent: 10m 
      Work Description: homatthew commented on code in PR #3602:
URL: https://github.com/apache/gobblin/pull/3602#discussion_r1025937082


##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -544,23 +544,25 @@ void connectHelixManagerWithRetry() {
   private void addInstanceTags() {
     HelixManager receiverManager = getReceiverManager();
     if (receiverManager.isConnected()) {
-      // The helix instance associated with this container should be 
consistent on helix tag
-      List<String> existedTags = receiverManager.getClusterManagmentTool()
-          .getInstanceConfig(this.clusterName, 
this.helixInstanceName).getTags();
-      Set<String> desiredTags = new HashSet<>(
-          ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
-      if (!desiredTags.isEmpty()) {
-        // Remove tag assignments for the current Helix instance from a 
previous run
-        for (String tag : existedTags) {
-          if (!desiredTags.contains(tag))
-            
receiverManager.getClusterManagmentTool().removeInstanceTag(this.clusterName, 
this.helixInstanceName, tag);
-          logger.info("Removed unrelated helix tag {} for instance {}", tag, 
this.helixInstanceName);
+      try {
+        Set<String> desiredTags = new 
HashSet<>(ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));

Review Comment:
   Nit: you removed the new line after new `HashSet<>(`. If possible, can we 
keep the whitespace the same to keep the git history cleaner?



##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -544,23 +544,25 @@ void connectHelixManagerWithRetry() {
   private void addInstanceTags() {
     HelixManager receiverManager = getReceiverManager();
     if (receiverManager.isConnected()) {
-      // The helix instance associated with this container should be 
consistent on helix tag
-      List<String> existedTags = receiverManager.getClusterManagmentTool()
-          .getInstanceConfig(this.clusterName, 
this.helixInstanceName).getTags();
-      Set<String> desiredTags = new HashSet<>(
-          ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
-      if (!desiredTags.isEmpty()) {
-        // Remove tag assignments for the current Helix instance from a 
previous run
-        for (String tag : existedTags) {
-          if (!desiredTags.contains(tag))
-            
receiverManager.getClusterManagmentTool().removeInstanceTag(this.clusterName, 
this.helixInstanceName, tag);
-          logger.info("Removed unrelated helix tag {} for instance {}", tag, 
this.helixInstanceName);
+      try {
+        Set<String> desiredTags = new 
HashSet<>(ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
+        if (!desiredTags.isEmpty()) {

Review Comment:
   From what I can tell, the reason this works is because the 
`GobblinYarnTaskRunner` is the only one to set the config key for 
`HELIX_INSTANCE_TAGS_KEY`. And so by proxy we know that if this desired tags is 
empty, then we can skip this code because this is not being run by the 
`GobblinYarnTaskRunner`. But this all feels pretty messy and a bad experience 
for anyone in the future working on this.
   
   Can we find a better approach with an override method? Isn't the real code 
smell that this code is inherently coupled with the YarnService / 
YarnAutoScalingManager? But since this code is being used outside the yarn 
context, it should instead live inside the `GobblinYarnTaskRunner` somehow.
   
   Admittedly, I am not that big of a fan of inheritance, but since we are 
already using it in this class it makes sense to use that approach 
   
   ```
   public class GobblinYarnTaskRunner extends GobblinTaskRunner
   ```



##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -544,23 +544,25 @@ void connectHelixManagerWithRetry() {
   private void addInstanceTags() {
     HelixManager receiverManager = getReceiverManager();
     if (receiverManager.isConnected()) {
-      // The helix instance associated with this container should be 
consistent on helix tag
-      List<String> existedTags = receiverManager.getClusterManagmentTool()
-          .getInstanceConfig(this.clusterName, 
this.helixInstanceName).getTags();
-      Set<String> desiredTags = new HashSet<>(
-          ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
-      if (!desiredTags.isEmpty()) {
-        // Remove tag assignments for the current Helix instance from a 
previous run
-        for (String tag : existedTags) {
-          if (!desiredTags.contains(tag))
-            
receiverManager.getClusterManagmentTool().removeInstanceTag(this.clusterName, 
this.helixInstanceName, tag);
-          logger.info("Removed unrelated helix tag {} for instance {}", tag, 
this.helixInstanceName);
+      try {
+        Set<String> desiredTags = new 
HashSet<>(ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
+        if (!desiredTags.isEmpty()) {
+          // The helix instance associated with this container should be 
consistent on helix tag
+          List<String> existedTags =
+              
receiverManager.getClusterManagmentTool().getInstanceConfig(this.clusterName, 
this.helixInstanceName).getTags();

Review Comment:
   It seems to me that this would return empty (but clearly it doesn't since it 
causes an exception).



##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -544,23 +544,25 @@ void connectHelixManagerWithRetry() {
   private void addInstanceTags() {
     HelixManager receiverManager = getReceiverManager();
     if (receiverManager.isConnected()) {
-      // The helix instance associated with this container should be 
consistent on helix tag
-      List<String> existedTags = receiverManager.getClusterManagmentTool()
-          .getInstanceConfig(this.clusterName, 
this.helixInstanceName).getTags();
-      Set<String> desiredTags = new HashSet<>(
-          ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
-      if (!desiredTags.isEmpty()) {
-        // Remove tag assignments for the current Helix instance from a 
previous run
-        for (String tag : existedTags) {
-          if (!desiredTags.contains(tag))
-            
receiverManager.getClusterManagmentTool().removeInstanceTag(this.clusterName, 
this.helixInstanceName, tag);
-          logger.info("Removed unrelated helix tag {} for instance {}", tag, 
this.helixInstanceName);
+      try {
+        Set<String> desiredTags = new 
HashSet<>(ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));

Review Comment:
   Ditto for the other whitespace type changes



##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -599,6 +601,7 @@ public void run() {
   private Optional<ContainerMetrics> buildContainerMetrics() {
     Properties properties = ConfigUtils.configToProperties(this.clusterConfig);
     if (GobblinMetrics.isEnabled(properties)) {
+      logger.info("Container metrics are enabled");

Review Comment:
   Nice log addition! Do you know where these container metrics are collected?



##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -544,23 +544,25 @@ void connectHelixManagerWithRetry() {
   private void addInstanceTags() {
     HelixManager receiverManager = getReceiverManager();
     if (receiverManager.isConnected()) {
-      // The helix instance associated with this container should be 
consistent on helix tag
-      List<String> existedTags = receiverManager.getClusterManagmentTool()
-          .getInstanceConfig(this.clusterName, 
this.helixInstanceName).getTags();
-      Set<String> desiredTags = new HashSet<>(
-          ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
-      if (!desiredTags.isEmpty()) {
-        // Remove tag assignments for the current Helix instance from a 
previous run
-        for (String tag : existedTags) {
-          if (!desiredTags.contains(tag))
-            
receiverManager.getClusterManagmentTool().removeInstanceTag(this.clusterName, 
this.helixInstanceName, tag);
-          logger.info("Removed unrelated helix tag {} for instance {}", tag, 
this.helixInstanceName);
+      try {
+        Set<String> desiredTags = new 
HashSet<>(ConfigUtils.getStringList(this.clusterConfig, 
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
+        if (!desiredTags.isEmpty()) {
+          // The helix instance associated with this container should be 
consistent on helix tag
+          List<String> existedTags =
+              
receiverManager.getClusterManagmentTool().getInstanceConfig(this.clusterName, 
this.helixInstanceName).getTags();

Review Comment:
   Why does moving this line here address the issue? 
   
   In what situations does a config not exist for an instance? Seems to me like 
the sequence of events is:
   ```
   on start(), connectHelixManager() makes taskrunner join as a participant
   Helix should make an instance config for the taskrunner
   ```
   
   In what scenario does the above not happen?





Issue Time Tracking
-------------------

    Worklog Id:     (was: 827028)
    Time Spent: 1.5h  (was: 1h 20m)

> Ensure GobblinTaskRunner works when used without Yarn
> -----------------------------------------------------
>
>                 Key: GOBBLIN-1743
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1743
>             Project: Apache Gobblin
>          Issue Type: Bug
>          Components: gobblin-core
>            Reporter: Urmi Mustafi
>            Assignee: Abhishek Tiwari
>            Priority: Major
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> this PR [https://github.com/apache/gobblin/pull/3519/files]  has a call to 
> Helix cluster contained outside of a conditional that was previously never 
> encountered for Fliptop use case. I moved the call back inside the 
> conditional to ensure we don't attempt to retrieve an instanceConfig that is 
> not registered with Helix as we don't have Yarn mode enabled. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to