[ https://issues.apache.org/jira/browse/GOBBLIN-1910?focusedWorklogId=908509&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-908509 ]
ASF GitHub Bot logged work on GOBBLIN-1910:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 06/Mar/24 08:11
Start Date: 06/Mar/24 08:11
Worklog Time Spent: 10m
Work Description: phet commented on code in PR #3858:
URL: https://github.com/apache/gobblin/pull/3858#discussion_r1513977080
##########
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/DagProcEngineEnabledDagActionStoreChangeMonitor.java:
##########
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.service.monitoring;
+
+import java.io.IOException;
+
+import com.typesafe.config.Config;
+
+import lombok.extern.slf4j.Slf4j;
+
+import org.apache.gobblin.runtime.api.DagActionStore;
+import org.apache.gobblin.runtime.spec_catalog.FlowCatalog;
+import org.apache.gobblin.service.modules.orchestration.DagManagement;
+import org.apache.gobblin.service.modules.orchestration.DagManager;
+import org.apache.gobblin.service.modules.orchestration.Orchestrator;
+
+
+/**
+ * A DagActionStore change monitor that uses {@link DagActionStoreChangeEvent} schema to process Kafka messages received
+ * from its corresponding consumer client. This monitor responds to requests to resume or delete a flow and acts as a
+ * connector between the API and execution layers of GaaS.
+ */
+@Slf4j
+public class DagProcEngineEnabledDagActionStoreChangeMonitor extends DagActionStoreChangeMonitor {
+ private final DagManagement dagManagement;
+
+ // Note that the topic is an empty string (rather than null to avoid NPE) because this monitor relies on the consumer
+ // client itself to determine all Kafka related information dynamically rather than through the config.
+ public DagProcEngineEnabledDagActionStoreChangeMonitor(String topic, Config config, DagManager dagManager, int numThreads,
+ FlowCatalog flowCatalog, Orchestrator orchestrator, DagActionStore dagActionStore,
+ boolean isMultiActiveSchedulerEnabled, DagManagement dagManagement) {
+ // Differentiate group id for each host
+ super(topic, config, dagManager, numThreads, flowCatalog, orchestrator, dagActionStore, isMultiActiveSchedulerEnabled);
+ this.dagManagement = dagManagement;
+ }
+
+ /**
+ * This implementation passes on the {@link org.apache.gobblin.runtime.api.DagActionStore.DagAction} to the
+ * {@link DagManagement} instead of finding a {@link org.apache.gobblin.runtime.api.FlowSpec} passing the spec to {@link Orchestrator}.
+ */
+ @Override
+ protected void handleDagAction(DagActionStore.DagAction dagAction, boolean isStartup) {
+ log.info("(" + (isStartup ? "on-startup" : "post-startup") + ") DagAction change ({}) received for flow: {}",
+ dagAction.getFlowActionType(), dagAction);
+ LaunchSubmissionMetricProxy launchSubmissionMetricProxy = isStartup ? ON_STARTUP : POST_STARTUP;
+ try {
+ // todo - add actions for other other type of dag actions
+ if (dagAction.getFlowActionType().equals(DagActionStore.FlowActionType.LAUNCH)) {
+ // If multi-active scheduler is NOT turned on we should not receive these type of events
+ if (!this.isMultiActiveSchedulerEnabled) {
+ this.unexpectedErrors.mark();
Review Comment:
hopefully this is the only kind of "unexpected error"; otherwise, how would we
tell the different kinds apart? even if it is the only one, I suggest renaming
it, e.g. to `unexpectedLaunchEventErrors`
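A minimal sketch of what the rename buys, under stated assumptions: `MonitorErrorCounters`, `onLaunchEvent` and `otherUnexpectedErrors` are hypothetical names that do not appear in the PR, and `LongAdder` stands in for whatever metric type the monitor really uses, so that the example needs nothing beyond the JDK.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for the monitor's error metrics. LongAdder is used only
// so the sketch needs nothing beyond the JDK; it is not the PR's metric type.
class MonitorErrorCounters {
  // Counts only LAUNCH events that arrive while multi-active scheduling is off.
  final LongAdder unexpectedLaunchEventErrors = new LongAdder();
  // Hypothetical second counter, so that other failures stay distinguishable.
  final LongAdder otherUnexpectedErrors = new LongAdder();

  void onLaunchEvent(boolean isMultiActiveSchedulerEnabled) {
    if (!isMultiActiveSchedulerEnabled) {
      unexpectedLaunchEventErrors.increment();
      throw new IllegalStateException(
          "Received LAUNCH dagAction while not in multi-active scheduler mode");
    }
    // Otherwise the action would be handed on for processing (omitted here).
  }
}
```

With one counter per cause, a dashboard reading `unexpectedLaunchEventErrors` can only mean one thing, which seems to be the point of the question above.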
##########
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/DagProcEngineEnabledDagActionStoreChangeMonitor.java:
##########
@@ -0,0 +1,80 @@
+package org.apache.gobblin.service.monitoring;
+
+import java.io.IOException;
+
+import com.typesafe.config.Config;
+
+import lombok.extern.slf4j.Slf4j;
+
+import org.apache.gobblin.runtime.api.DagActionStore;
+import org.apache.gobblin.runtime.spec_catalog.FlowCatalog;
+import org.apache.gobblin.service.modules.orchestration.DagManagement;
+import org.apache.gobblin.service.modules.orchestration.DagManager;
+import org.apache.gobblin.service.modules.orchestration.Orchestrator;
+
+
+/**
+ * A DagActionStore change monitor that uses {@link DagActionStoreChangeEvent} schema to process Kafka messages received
+ * from its corresponding consumer client. This monitor responds to requests to resume or delete a flow and acts as a
+ * connector between the API and execution layers of GaaS.
+ */
+@Slf4j
+public class DagProcEngineEnabledDagActionStoreChangeMonitor extends DagActionStoreChangeMonitor {
+ private final DagManagement dagManagement;
+
+ // Note that the topic is an empty string (rather than null to avoid NPE) because this monitor relies on the consumer
+ // client itself to determine all Kafka related information dynamically rather than through the config.
+ public DagProcEngineEnabledDagActionStoreChangeMonitor(String topic, Config config, DagManager dagManager, int numThreads,
+ FlowCatalog flowCatalog, Orchestrator orchestrator, DagActionStore dagActionStore,
+ boolean isMultiActiveSchedulerEnabled, DagManagement dagManagement) {
+ // Differentiate group id for each host
+ super(topic, config, dagManager, numThreads, flowCatalog, orchestrator, dagActionStore, isMultiActiveSchedulerEnabled);
Review Comment:
`dagManager` doesn't seem to be used within this class... is it safe to pass
`null` on to `super`?
if so, let's not take a `DagManager` param here, and let's also document that
in the javadoc of our super class
##########
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/DagProcEngineEnabledDagActionStoreChangeMonitor.java:
##########
@@ -0,0 +1,80 @@
+package org.apache.gobblin.service.monitoring;
+
+import java.io.IOException;
+
+import com.typesafe.config.Config;
+
+import lombok.extern.slf4j.Slf4j;
+
+import org.apache.gobblin.runtime.api.DagActionStore;
+import org.apache.gobblin.runtime.spec_catalog.FlowCatalog;
+import org.apache.gobblin.service.modules.orchestration.DagManagement;
+import org.apache.gobblin.service.modules.orchestration.DagManager;
+import org.apache.gobblin.service.modules.orchestration.Orchestrator;
+
+
+/**
+ * A DagActionStore change monitor that uses {@link DagActionStoreChangeEvent} schema to process Kafka messages received
+ * from its corresponding consumer client. This monitor responds to requests to resume or delete a flow and acts as a
+ * connector between the API and execution layers of GaaS.
+ */
+@Slf4j
+public class DagProcEngineEnabledDagActionStoreChangeMonitor extends DagActionStoreChangeMonitor {
+ private final DagManagement dagManagement;
+
+ // Note that the topic is an empty string (rather than null to avoid NPE) because this monitor relies on the consumer
Review Comment:
I don't fully understand this comment, since I don't see `""` being used
anywhere. do you want, for example, to eliminate the first param `String topic`
and hard-code `""` in the `super` call?
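A rough sketch of that alternative, with simplified stand-ins rather than the PR's classes: `BaseMonitorSketch`, `DagProcMonitorSketch` and `NO_TOPIC` are hypothetical names, and the base constructor's null check is an assumption made only to illustrate why an empty string would be preferred over `null`.

```java
import java.util.Objects;

// Hypothetical, simplified stand-in for the super class; only topic handling is modeled.
class BaseMonitorSketch {
  protected final String topic;

  BaseMonitorSketch(String topic, int numThreads) {
    // Assumption for illustration: a null topic would fail fast here.
    this.topic = Objects.requireNonNull(topic, "topic");
  }
}

// Hypothetical subclass: callers no longer supply a topic at all.
class DagProcMonitorSketch extends BaseMonitorSketch {
  // Empty rather than null, since the consumer client is expected to resolve
  // all Kafka-related information itself.
  private static final String NO_TOPIC = "";

  DagProcMonitorSketch(int numThreads) {
    super(NO_TOPIC, numThreads);
  }

  String getTopic() {
    return topic;
  }
}
```

Hard-coding the value inside the subclass keeps the explanatory comment next to the only place the empty string appears, instead of relying on every caller to pass it.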
##########
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/DagProcEngineEnabledDagActionStoreChangeMonitor.java:
##########
@@ -0,0 +1,80 @@
+package org.apache.gobblin.service.monitoring;
+
+import java.io.IOException;
+
+import com.typesafe.config.Config;
+
+import lombok.extern.slf4j.Slf4j;
+
+import org.apache.gobblin.runtime.api.DagActionStore;
+import org.apache.gobblin.runtime.spec_catalog.FlowCatalog;
+import org.apache.gobblin.service.modules.orchestration.DagManagement;
+import org.apache.gobblin.service.modules.orchestration.DagManager;
+import org.apache.gobblin.service.modules.orchestration.Orchestrator;
+
+
+/**
+ * A DagActionStore change monitor that uses {@link DagActionStoreChangeEvent} schema to process Kafka messages received
+ * from its corresponding consumer client. This monitor responds to requests to resume or delete a flow and acts as a
+ * connector between the API and execution layers of GaaS.
+ */
+@Slf4j
+public class DagProcEngineEnabledDagActionStoreChangeMonitor extends DagActionStoreChangeMonitor {
Review Comment:
suggestion: `DagManagementDagActionStoreChangeMonitor` (current name is not
a problem either)
##########
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/DagProcEngineEnabledDagActionStoreChangeMonitor.java:
##########
@@ -0,0 +1,80 @@
+package org.apache.gobblin.service.monitoring;
+
+import java.io.IOException;
+
+import com.typesafe.config.Config;
+
+import lombok.extern.slf4j.Slf4j;
+
+import org.apache.gobblin.runtime.api.DagActionStore;
+import org.apache.gobblin.runtime.spec_catalog.FlowCatalog;
+import org.apache.gobblin.service.modules.orchestration.DagManagement;
+import org.apache.gobblin.service.modules.orchestration.DagManager;
+import org.apache.gobblin.service.modules.orchestration.Orchestrator;
+
+
+/**
+ * A DagActionStore change monitor that uses {@link DagActionStoreChangeEvent} schema to process Kafka messages received
+ * from its corresponding consumer client. This monitor responds to requests to resume or delete a flow and acts as a
+ * connector between the API and execution layers of GaaS.
+ */
+@Slf4j
+public class DagProcEngineEnabledDagActionStoreChangeMonitor extends DagActionStoreChangeMonitor {
+ private final DagManagement dagManagement;
+
+ // Note that the topic is an empty string (rather than null to avoid NPE) because this monitor relies on the consumer
+ // client itself to determine all Kafka related information dynamically rather than through the config.
+ public DagProcEngineEnabledDagActionStoreChangeMonitor(String topic, Config config, DagManager dagManager, int numThreads,
+ FlowCatalog flowCatalog, Orchestrator orchestrator, DagActionStore dagActionStore,
+ boolean isMultiActiveSchedulerEnabled, DagManagement dagManagement) {
+ // Differentiate group id for each host
+ super(topic, config, dagManager, numThreads, flowCatalog, orchestrator, dagActionStore, isMultiActiveSchedulerEnabled);
+ this.dagManagement = dagManagement;
+ }
+
+ /**
+ * This implementation passes on the {@link org.apache.gobblin.runtime.api.DagActionStore.DagAction} to the
+ * {@link DagManagement} instead of finding a {@link org.apache.gobblin.runtime.api.FlowSpec} passing the spec to {@link Orchestrator}.
+ */
+ @Override
+ protected void handleDagAction(DagActionStore.DagAction dagAction, boolean isStartup) {
+ log.info("(" + (isStartup ? "on-startup" : "post-startup") + ") DagAction change ({}) received for flow: {}",
+ dagAction.getFlowActionType(), dagAction);
+ LaunchSubmissionMetricProxy launchSubmissionMetricProxy = isStartup ? ON_STARTUP : POST_STARTUP;
+ try {
+ // todo - add actions for other other type of dag actions
+ if (dagAction.getFlowActionType().equals(DagActionStore.FlowActionType.LAUNCH)) {
+ // If multi-active scheduler is NOT turned on we should not receive these type of events
+ if (!this.isMultiActiveSchedulerEnabled) {
+ this.unexpectedErrors.mark();
+ throw new RuntimeException(String.format("Received LAUNCH dagAction while not in multi-active scheduler "
+ + "mode for flowAction: %s", dagAction));
+ }
+ dagManagement.addDagAction(dagAction);
+ } else {
+ log.warn("Received unsupported dagAction {}. Expected to be a KILL, RESUME, or LAUNCH", dagAction.getFlowActionType());
Review Comment:
and a TODO added!
##########
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/DagProcEngineEnabledDagActionStoreChangeMonitorFactory.java:
##########
@@ -0,0 +1,88 @@
+package org.apache.gobblin.service.monitoring;
+
+import java.util.Objects;
+
+import com.typesafe.config.Config;
+
+import javax.inject.Inject;
+import javax.inject.Named;
+import javax.inject.Provider;
+import lombok.extern.slf4j.Slf4j;
+
+import org.apache.gobblin.runtime.api.DagActionStore;
+import org.apache.gobblin.runtime.spec_catalog.FlowCatalog;
+import org.apache.gobblin.runtime.util.InjectionNames;
+import org.apache.gobblin.service.modules.orchestration.DagManagement;
+import org.apache.gobblin.service.modules.orchestration.DagManager;
+import org.apache.gobblin.service.modules.orchestration.Orchestrator;
+import org.apache.gobblin.util.ConfigUtils;
+
+
+/**
+ * A factory implementation that returns a {@link DagProcEngineEnabledDagActionStoreChangeMonitor} instance.
+ */
+@Slf4j
+public class DagProcEngineEnabledDagActionStoreChangeMonitorFactory implements Provider<DagActionStoreChangeMonitor> {
+ static final String DAG_ACTION_STORE_CHANGE_MONITOR_NUM_THREADS_KEY = "numThreads";
+
+ private final Config config;
+ private final DagManager dagManager;
Review Comment:
we probably won't have one of these around when using the `DagProcEngine`...
will we?!?
##########
gobblin-service/src/test/java/org/apache/gobblin/service/modules/orchestration/DagProcessingEngineTest.java:
##########
@@ -0,0 +1,98 @@
+package org.apache.gobblin.service.modules.orchestration;
+
+import java.io.IOException;
+import java.net.URI;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+
+import org.junit.Assert;
+import org.testng.annotations.BeforeClass;
+import org.testng.annotations.Test;
+
+import com.typesafe.config.Config;
+
+import org.apache.gobblin.config.ConfigBuilder;
+import org.apache.gobblin.configuration.ConfigurationKeys;
+import org.apache.gobblin.metastore.testing.ITestMetastoreDatabase;
+import org.apache.gobblin.metastore.testing.TestMetastoreDatabaseFactory;
+import org.apache.gobblin.runtime.api.DagActionStore;
+import org.apache.gobblin.runtime.api.TopologySpec;
+import org.apache.gobblin.service.modules.orchestration.task.DagTask;
+import org.apache.gobblin.service.modules.orchestration.task.LaunchDagTask;
+
+
+public class DagProcessingEngineTest {
+ private MostlyMySqlDagManagementStateStore dagManagementStateStore;
+ private static final String TEST_USER = "testUser";
+ private static final String TEST_PASSWORD = "testPassword";
+ private static final String TEST_DAG_STATE_STORE = "TestDagStateStore";
+ private static final String TEST_TABLE = "quotas";
+ static ITestMetastoreDatabase testMetastoreDatabase;
+ DagProcessingEngine.DagProcEngineThread dagProcEngineThread;
+ DagManagementTaskStreamImpl dagManagementTaskStream;
+ DagProcFactory dagProcFactory;
+
+ @BeforeClass
+ public void setUp() throws Exception {
+ // Setting up mock DB
+ testMetastoreDatabase = TestMetastoreDatabaseFactory.get();
+
+ Config config;
+ ConfigBuilder configBuilder = ConfigBuilder.create();
+ configBuilder.addPrimitive(MostlyMySqlDagManagementStateStore.DAG_STATESTORE_CLASS_KEY, MostlyMySqlDagManagementStateStoreTest.TestMysqlDagStateStore.class.getName())
+ .addPrimitive(MysqlUserQuotaManager.qualify(ConfigurationKeys.STATE_STORE_DB_URL_KEY), testMetastoreDatabase.getJdbcUrl())
+ .addPrimitive(MysqlUserQuotaManager.qualify(ConfigurationKeys.STATE_STORE_DB_USER_KEY), TEST_USER)
+ .addPrimitive(MysqlUserQuotaManager.qualify(ConfigurationKeys.STATE_STORE_DB_PASSWORD_KEY), TEST_PASSWORD)
+ .addPrimitive(MysqlUserQuotaManager.qualify(ConfigurationKeys.STATE_STORE_DB_TABLE_KEY), TEST_TABLE);
+ config = configBuilder.build();
+
+ // Constructing TopologySpecMap.
+ Map<URI, TopologySpec> topologySpecMap = new HashMap<>();
+ String specExecInstance = "mySpecExecutor";
+ TopologySpec topologySpec = DagTestUtils.buildNaiveTopologySpec(specExecInstance);
+ URI specExecURI = new URI(specExecInstance);
+ topologySpecMap.put(specExecURI, topologySpec);
+ this.dagManagementStateStore = new MostlyMySqlDagManagementStateStore(config, null);
+ this.dagManagementStateStore.setTopologySpecMap(topologySpecMap);
+ this.dagManagementStateStore.start();
+ this.dagManagementTaskStream =
+ new DagManagementTaskStreamImpl(config, Optional.empty(), this.dagManagementStateStore);
+ this.dagManagementTaskStream.setActive(true);
+ this.dagProcFactory = new DagProcFactory();
+ this.dagProcEngineThread = new DagProcessingEngine.DagProcEngineThread(
+ this.dagManagementTaskStream, this.dagProcFactory, this.dagManagementStateStore);
+ }
+
+ // This tests adding and removal of dag actions from dag task stream
+ // when we have different dag procs in future, we can test dag processing and exception handling
+ @Test
+ public void addRemoveDagActions() throws IOException {
Review Comment:
this method only seems to be testing `DagManagementTaskStreamImpl`, not
actually `DagProcessingEngine`.
let's move it into the appropriate test class. after that, let's decide what
kind of `DagProcEngine` tests to devise (against a `DagTaskStream` mock)
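One possible shape for such a test, sketched with JDK types only. Everything here is hypothetical: `StubTaskStream` stands in for a `DagTaskStream` mock, `drain` stands in for the engine's processing loop, and plain strings replace real tasks. It assumes the stream can be consumed like an `Iterator`, which the real interface may or may not allow.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

class EngineAgainstStubStreamSketch {
  /** Hypothetical stub stream: yields the queued tasks, then reports exhaustion. */
  static class StubTaskStream implements Iterator<String> {
    private final Queue<String> tasks = new ArrayDeque<>();

    void add(String task) {
      tasks.add(task);
    }

    @Override
    public boolean hasNext() {
      return !tasks.isEmpty();
    }

    @Override
    public String next() {
      // Throws NoSuchElementException when empty, as the Iterator contract expects.
      return tasks.remove();
    }
  }

  /** Hypothetical engine loop: pulls tasks one at a time and records what it handled. */
  static List<String> drain(Iterator<String> stream) {
    List<String> processed = new ArrayList<>();
    while (stream.hasNext()) {
      processed.add("processed:" + stream.next());
    }
    return processed;
  }
}
```

Because the stream is stubbed, assertions target only what the engine does with each task (ordering, and later error handling), leaving the add/remove behaviour to the stream's own test class.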
Issue Time Tracking
-------------------
Worklog Id: (was: 908509)
Time Spent: 28h (was: 27h 50m)
> Refactor code to move current in-memory references to new design for REST
> calls: Launch, Resume and Kill
> --------------------------------------------------------------------------------------------------------
>
> Key: GOBBLIN-1910
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1910
> Project: Apache Gobblin
> Issue Type: New Feature
> Reporter: Meeth Gala
> Priority: Major
> Time Spent: 28h
> Remaining Estimate: 0h
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)