[ https://issues.apache.org/jira/browse/GOBBLIN-1678?focusedWorklogId=801177&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-801177 ]
ASF GitHub Bot logged work on GOBBLIN-1678:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 17/Aug/22 01:10
Start Date: 17/Aug/22 01:10
Worklog Time Spent: 10m
Work Description: umustafi commented on code in PR #3536:
URL: https://github.com/apache/gobblin/pull/3536#discussion_r947365065
##########
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/GitMonitoringService.java:
##########
@@ -234,20 +219,22 @@ void processGitConfigChanges() throws GitAPIException, IOException {
    */
   void processGitConfigChangesHelper(List<DiffEntry> changes) throws IOException {
     for (DiffEntry change : changes) {
-      switch (change.getChangeType()) {
-        case ADD:
-        case MODIFY:
-          addChange(change);
-          break;
-        case DELETE:
-          removeChange(change);
-          break;
-        case RENAME:
-          removeChange(change);
-          addChange(change);
-          break;
-        default:
-          throw new RuntimeException("Unsupported change type " + change.getChangeType());
+      for (GitDiffListener listener: this.listeners) {
+        switch (change.getChangeType()) {
+          case ADD:
Review Comment:
How does an ADD change get processed, and where is it picked up?
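The hunk is cut off at `case ADD:`, so the new handling is not visible here. One plausible shape, mirroring the removed switch, is that each registered listener receives the same add/remove calls the monitor used to make on itself. The sketch below is only an illustration under that assumption: `SketchChangeType`, `SketchChange`, `SketchDiffListener`, `SketchDispatcher` and `RecordingListener` are hypothetical stand-ins, not the PR's `GitDiffListener` or JGit's `DiffEntry`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for JGit's DiffEntry.ChangeType.
enum SketchChangeType { ADD, MODIFY, DELETE, RENAME }

// Hypothetical stand-in for a diff entry; real code would use JGit's DiffEntry.
final class SketchChange {
  final SketchChangeType type;
  final String oldPath;
  final String newPath;

  SketchChange(SketchChangeType type, String oldPath, String newPath) {
    this.type = type;
    this.oldPath = oldPath;
    this.newPath = newPath;
  }
}

// Assumed listener contract; the PR's GitDiffListener may look different.
interface SketchDiffListener {
  void addChange(SketchChange change);
  void removeChange(SketchChange change);
}

final class SketchDispatcher {
  private final List<SketchDiffListener> listeners = new ArrayList<>();

  void register(SketchDiffListener listener) {
    listeners.add(listener);
  }

  // Fans each change out to every listener, using the same mapping as the removed switch.
  void process(List<SketchChange> changes) {
    for (SketchChange change : changes) {
      for (SketchDiffListener listener : listeners) {
        switch (change.type) {
          case ADD:
          case MODIFY:
            listener.addChange(change);
            break;
          case DELETE:
            listener.removeChange(change);
            break;
          case RENAME:
            // A rename is treated as a removal of the old path plus an add of the new one.
            listener.removeChange(change);
            listener.addChange(change);
            break;
          default:
            throw new IllegalStateException("Unsupported change type " + change.type);
        }
      }
    }
  }
}

// Test helper that records which callbacks fired, in order.
final class RecordingListener implements SketchDiffListener {
  final List<String> events = new ArrayList<>();

  @Override
  public void addChange(SketchChange change) {
    events.add("add:" + change.newPath);
  }

  @Override
  public void removeChange(SketchChange change) {
    events.add("remove:" + change.oldPath);
  }
}
```

Under this assumption an ADD is "picked up" by whichever listeners are registered, so a listener that ignores a path simply returns from `addChange` without doing anything.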
##########
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/GitFlowGraphMonitor.java:
##########
@@ -0,0 +1,134 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.service.monitoring;
+
+import com.google.common.collect.ImmutableMap;
+import java.io.IOException;
+import java.net.URI;
+import java.util.Collections;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.CountDownLatch;
+
+import org.apache.gobblin.configuration.ConfigurationKeys;
+import org.apache.gobblin.service.modules.flowgraph.FlowGraphMonitor;
+import org.apache.hadoop.fs.Path;
+import org.eclipse.jgit.api.errors.GitAPIException;
+import org.eclipse.jgit.diff.DiffEntry;
+
+import com.google.common.base.Optional;
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigFactory;
+
+import lombok.extern.slf4j.Slf4j;
+
+import org.apache.gobblin.runtime.api.TopologySpec;
+import org.apache.gobblin.service.modules.flowgraph.DataNode;
+import org.apache.gobblin.service.modules.flowgraph.FlowEdge;
+import org.apache.gobblin.service.modules.flowgraph.FlowGraph;
+import org.apache.gobblin.service.modules.template_catalog.FSFlowTemplateCatalog;
+
+
+/**
+ * Service that monitors for changes to {@link org.apache.gobblin.service.modules.flowgraph.FlowGraph} from a git repository.
+ * The git repository must have an initial commit that has no files since that is used as a base for getting
+ * the change list.
+ * The {@link DataNode}s and {@link FlowEdge}s in FlowGraph need to be organized with the following directory structure on git:
+ * <root_flowGraph_dir>/<nodeName>/<nodeName>.properties
+ * <root_flowGraph_dir>/<nodeName1>/<nodeName2>/<edgeName>.properties
Review Comment:
This is the layout I was expecting for the flowgraph.
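To make the expected layout concrete: a node file sits one directory below the root and an edge file sits two directories below it. A minimal sketch of telling the two apart by path depth follows. `FlowGraphPathSketch`, `classify` and the root directory name `flowgraph` are hypothetical names for illustration, and the depth-only check is an assumption rather than what `GitFlowGraphMonitor` actually does.

```java
final class FlowGraphPathSketch {
  enum Kind { NODE, EDGE, UNKNOWN }

  private static final String SUFFIX = ".properties";

  // Classifies a repo-relative path (git always uses '/' separators) by its depth:
  //   <root>/<nodeName>/<nodeName>.properties               -> NODE
  //   <root>/<nodeName1>/<nodeName2>/<edgeName>.properties  -> EDGE
  static Kind classify(String rootDirName, String repoRelativePath) {
    String[] parts = repoRelativePath.split("/");
    if (parts.length < 3 || !parts[0].equals(rootDirName)) {
      return Kind.UNKNOWN;
    }
    String fileName = parts[parts.length - 1];
    if (!fileName.endsWith(SUFFIX) || fileName.length() == SUFFIX.length()) {
      return Kind.UNKNOWN;
    }
    if (parts.length == 3) {
      return Kind.NODE;
    }
    if (parts.length == 4) {
      return Kind.EDGE;
    }
    return Kind.UNKNOWN;
  }
}
```

For example, `classify("flowgraph", "flowgraph/nodeA/nodeB/edge1.properties")` falls into the edge case, while any path outside the root or without a `.properties` file is left unclassified.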
##########
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/GitConfigMonitor.java:
##########
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gobblin.service.monitoring;
+
+import com.google.common.collect.ImmutableMap;
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigFactory;
+
+import javax.inject.Inject;
+import javax.inject.Singleton;
+import lombok.extern.slf4j.Slf4j;
+
+import org.apache.gobblin.configuration.ConfigurationKeys;
+import org.apache.gobblin.runtime.spec_catalog.FlowCatalog;
+
+/**
+ * Service that monitors for jobs from a git repository.
+ * The git repository must have an initial commit that has no config files since that is used as a base for getting
+ * the change list.
+ * The config needs to be organized with the following structure:
+ * <root_config_dir>/<flowGroup>/<flowName>.(pull|job|json|conf)
Review Comment:
Why are we storing flow-specific configs here? The git flowgraph consists of
nodes and edges, so I would expect configs to be stored per edge, not per
flow. Per-flow information looks like what we already keep in the spec store.
Or are these edge-specific properties?
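For reference, the Javadoc layout `<root_config_dir>/<flowGroup>/<flowName>.(pull|job|json|conf)` keys each file by flow group and flow name, not by edge. A minimal sketch of reading a path that way is below. `FlowConfigPathSketch`, `parse` and the root name `gobblin-config` are hypothetical, and the regex is an assumption derived only from the Javadoc line, not from the `GitConfigMonitor` code.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FlowConfigPathSketch {
  // <root>/<flowGroup>/<flowName>.(pull|job|json|conf), with '/' as the separator.
  private static final Pattern FLOW_CONFIG =
      Pattern.compile("([^/]+)/([^/]+)/([^/]+)\\.(pull|job|json|conf)");

  // Returns {flowGroup, flowName} when the path matches the layout under the given root.
  static Optional<String[]> parse(String rootDirName, String repoRelativePath) {
    Matcher m = FLOW_CONFIG.matcher(repoRelativePath);
    if (!m.matches() || !m.group(1).equals(rootDirName)) {
      return Optional.empty();
    }
    return Optional.of(new String[] {m.group(2), m.group(3)});
  }
}
```

Read this way, the directory encodes a flow identity (group and name), which is why it looks closer to spec-store content than to per-edge flowgraph properties.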
Issue Time Tracking
-------------------
Worklog Id: (was: 801177)
Time Spent: 1h (was: 50m)
> Refactor GaaS Flowgraph Monitor to be extensible
> ------------------------------------------------
>
> Key: GOBBLIN-1678
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1678
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: William Lo
> Priority: Major
> Time Spent: 1h
> Remaining Estimate: 0h
>
> To support new implementations of a flowgraph monitor, which allows live
> updates to a flowgraph, we should reuse as much of the existing git flowgraph
> monitor's implementation as possible.
> The current flowgraph monitor tightly couples much of the logic for adding
> nodes and edges, which could be reused by a file-based flowgraph.