[
https://issues.apache.org/jira/browse/GOBBLIN-1743?focusedWorklogId=827028&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-827028
]
ASF GitHub Bot logged work on GOBBLIN-1743:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 18/Nov/22 03:39
Start Date: 18/Nov/22 03:39
Worklog Time Spent: 10m
Work Description: homatthew commented on code in PR #3602:
URL: https://github.com/apache/gobblin/pull/3602#discussion_r1025937082
##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -544,23 +544,25 @@ void connectHelixManagerWithRetry() {
private void addInstanceTags() {
HelixManager receiverManager = getReceiverManager();
if (receiverManager.isConnected()) {
- // The helix instance associated with this container should be
consistent on helix tag
- List<String> existedTags = receiverManager.getClusterManagmentTool()
- .getInstanceConfig(this.clusterName,
this.helixInstanceName).getTags();
- Set<String> desiredTags = new HashSet<>(
- ConfigUtils.getStringList(this.clusterConfig,
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
- if (!desiredTags.isEmpty()) {
- // Remove tag assignments for the current Helix instance from a
previous run
- for (String tag : existedTags) {
- if (!desiredTags.contains(tag))
-
receiverManager.getClusterManagmentTool().removeInstanceTag(this.clusterName,
this.helixInstanceName, tag);
- logger.info("Removed unrelated helix tag {} for instance {}", tag,
this.helixInstanceName);
+ try {
+ Set<String> desiredTags = new
HashSet<>(ConfigUtils.getStringList(this.clusterConfig,
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
Review Comment:
Nit: you removed the new line after new `HashSet<>(`. If possible, can we
keep the whitespace the same to keep the git history cleaner?
##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -544,23 +544,25 @@ void connectHelixManagerWithRetry() {
private void addInstanceTags() {
HelixManager receiverManager = getReceiverManager();
if (receiverManager.isConnected()) {
- // The helix instance associated with this container should be
consistent on helix tag
- List<String> existedTags = receiverManager.getClusterManagmentTool()
- .getInstanceConfig(this.clusterName,
this.helixInstanceName).getTags();
- Set<String> desiredTags = new HashSet<>(
- ConfigUtils.getStringList(this.clusterConfig,
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
- if (!desiredTags.isEmpty()) {
- // Remove tag assignments for the current Helix instance from a
previous run
- for (String tag : existedTags) {
- if (!desiredTags.contains(tag))
-
receiverManager.getClusterManagmentTool().removeInstanceTag(this.clusterName,
this.helixInstanceName, tag);
- logger.info("Removed unrelated helix tag {} for instance {}", tag,
this.helixInstanceName);
+ try {
+ Set<String> desiredTags = new
HashSet<>(ConfigUtils.getStringList(this.clusterConfig,
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
+ if (!desiredTags.isEmpty()) {
Review Comment:
From what I can tell, the reason this works is because the
`GobblinYarnTaskRunner` is the only one to set the config key for
`HELIX_INSTANCE_TAGS_KEY`. And so by proxy we know that if this desired tags is
empty, then we can skip this code because this is not being run by the
`GobblinYarnTaskRunner`. But this all feels pretty messy and a bad experience
for anyone in the future working on this.
Can we find a better approach with an override method? Isn't the real code
smell that this code is inherently coupled with the YarnService /
YarnAutoScalingManager? But since this code is being used outside the yarn
context, it should instead live inside the `GobblinYarnTaskRunner` somehow.
Admittedly, I am not that big of a fan of inheritance, but since we are
already using it in this class it makes sense to use that approach
```
public class GobblinYarnTaskRunner extends GobblinTaskRunner
```
##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -544,23 +544,25 @@ void connectHelixManagerWithRetry() {
private void addInstanceTags() {
HelixManager receiverManager = getReceiverManager();
if (receiverManager.isConnected()) {
- // The helix instance associated with this container should be
consistent on helix tag
- List<String> existedTags = receiverManager.getClusterManagmentTool()
- .getInstanceConfig(this.clusterName,
this.helixInstanceName).getTags();
- Set<String> desiredTags = new HashSet<>(
- ConfigUtils.getStringList(this.clusterConfig,
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
- if (!desiredTags.isEmpty()) {
- // Remove tag assignments for the current Helix instance from a
previous run
- for (String tag : existedTags) {
- if (!desiredTags.contains(tag))
-
receiverManager.getClusterManagmentTool().removeInstanceTag(this.clusterName,
this.helixInstanceName, tag);
- logger.info("Removed unrelated helix tag {} for instance {}", tag,
this.helixInstanceName);
+ try {
+ Set<String> desiredTags = new
HashSet<>(ConfigUtils.getStringList(this.clusterConfig,
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
+ if (!desiredTags.isEmpty()) {
+ // The helix instance associated with this container should be
consistent on helix tag
+ List<String> existedTags =
+
receiverManager.getClusterManagmentTool().getInstanceConfig(this.clusterName,
this.helixInstanceName).getTags();
Review Comment:
It seems to me that this would return empty (but clearly it doesn't since it
causes an exception).
##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -544,23 +544,25 @@ void connectHelixManagerWithRetry() {
private void addInstanceTags() {
HelixManager receiverManager = getReceiverManager();
if (receiverManager.isConnected()) {
- // The helix instance associated with this container should be
consistent on helix tag
- List<String> existedTags = receiverManager.getClusterManagmentTool()
- .getInstanceConfig(this.clusterName,
this.helixInstanceName).getTags();
- Set<String> desiredTags = new HashSet<>(
- ConfigUtils.getStringList(this.clusterConfig,
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
- if (!desiredTags.isEmpty()) {
- // Remove tag assignments for the current Helix instance from a
previous run
- for (String tag : existedTags) {
- if (!desiredTags.contains(tag))
-
receiverManager.getClusterManagmentTool().removeInstanceTag(this.clusterName,
this.helixInstanceName, tag);
- logger.info("Removed unrelated helix tag {} for instance {}", tag,
this.helixInstanceName);
+ try {
+ Set<String> desiredTags = new
HashSet<>(ConfigUtils.getStringList(this.clusterConfig,
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
Review Comment:
Ditto for the other whitespace type changes
##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -599,6 +601,7 @@ public void run() {
private Optional<ContainerMetrics> buildContainerMetrics() {
Properties properties = ConfigUtils.configToProperties(this.clusterConfig);
if (GobblinMetrics.isEnabled(properties)) {
+ logger.info("Container metrics are enabled");
Review Comment:
Nice log addition! Do you know where these container metrics are collected?
##########
gobblin-cluster/src/main/java/org/apache/gobblin/cluster/GobblinTaskRunner.java:
##########
@@ -544,23 +544,25 @@ void connectHelixManagerWithRetry() {
private void addInstanceTags() {
HelixManager receiverManager = getReceiverManager();
if (receiverManager.isConnected()) {
- // The helix instance associated with this container should be
consistent on helix tag
- List<String> existedTags = receiverManager.getClusterManagmentTool()
- .getInstanceConfig(this.clusterName,
this.helixInstanceName).getTags();
- Set<String> desiredTags = new HashSet<>(
- ConfigUtils.getStringList(this.clusterConfig,
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
- if (!desiredTags.isEmpty()) {
- // Remove tag assignments for the current Helix instance from a
previous run
- for (String tag : existedTags) {
- if (!desiredTags.contains(tag))
-
receiverManager.getClusterManagmentTool().removeInstanceTag(this.clusterName,
this.helixInstanceName, tag);
- logger.info("Removed unrelated helix tag {} for instance {}", tag,
this.helixInstanceName);
+ try {
+ Set<String> desiredTags = new
HashSet<>(ConfigUtils.getStringList(this.clusterConfig,
GobblinClusterConfigurationKeys.HELIX_INSTANCE_TAGS_KEY));
+ if (!desiredTags.isEmpty()) {
+ // The helix instance associated with this container should be
consistent on helix tag
+ List<String> existedTags =
+
receiverManager.getClusterManagmentTool().getInstanceConfig(this.clusterName,
this.helixInstanceName).getTags();
Review Comment:
Why does moving this line here address the issue?
In what situations does a config not exist for an instance? Seems to me like
the sequence of events is:
```
on start(), connectHelixManager() makes taskrunner join as a participant
Helix should make an instance config for the taskrunner
```
In what scenario does the above not happen?
Issue Time Tracking
-------------------
Worklog Id: (was: 827028)
Time Spent: 1.5h (was: 1h 20m)
> Ensure GobblinTaskRunner works when used without Yarn
> -----------------------------------------------------
>
> Key: GOBBLIN-1743
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1743
> Project: Apache Gobblin
> Issue Type: Bug
> Components: gobblin-core
> Reporter: Urmi Mustafi
> Assignee: Abhishek Tiwari
> Priority: Major
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> this PR [https://github.com/apache/gobblin/pull/3519/files] has a call to
> Helix cluster contained outside of a conditional that was previously never
> encountered for Fliptop use case. I moved the call back inside the
> conditional to ensure we don't attempt to retrieve an instanceConfig that is
> not registered with Helix as we don't have Yarn mode enabled.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)