[
https://issues.apache.org/jira/browse/GOBBLIN-1678?focusedWorklogId=801159&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-801159
]
ASF GitHub Bot logged work on GOBBLIN-1678:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 16/Aug/22 23:53
Start Date: 16/Aug/22 23:53
Worklog Time Spent: 10m
Work Description: phet commented on code in PR #3536:
URL: https://github.com/apache/gobblin/pull/3536#discussion_r947326733
##########
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/GitConfigListener.java:
##########
@@ -14,83 +14,64 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
-package org.apache.gobblin.service.modules.core;
-import java.io.IOException;
-
-import org.apache.hadoop.fs.Path;
-import org.eclipse.jgit.diff.DiffEntry;
+package org.apache.gobblin.service.monitoring;
-import com.google.common.collect.ImmutableMap;
+import com.google.common.collect.Sets;
import com.google.common.io.Files;
import com.typesafe.config.Config;
import com.typesafe.config.ConfigFactory;
import com.typesafe.config.ConfigValueFactory;
-
-import javax.inject.Inject;
-import javax.inject.Singleton;
+import java.io.IOException;
+import java.net.URI;
+import java.util.Set;
import lombok.extern.slf4j.Slf4j;
-
import org.apache.gobblin.config.ConfigBuilder;
import org.apache.gobblin.configuration.ConfigurationKeys;
import org.apache.gobblin.runtime.api.FlowSpec;
import org.apache.gobblin.runtime.spec_catalog.FlowCatalog;
import org.apache.gobblin.runtime.spec_store.FSSpecStore;
import org.apache.gobblin.util.PullFileLoader;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.eclipse.jgit.diff.DiffEntry;
+
/**
- * Service that monitors for jobs from a git repository.
- * The git repository must have an initial commit that has no config files
since that is used as a base for getting
- * the change list.
- * The config needs to be organized with the following structure:
- * <root_config_dir>/<flowGroup>/<flowName>.(pull|job|json|conf)
- * The <flowGroup> and <flowName> is used to generate the URI used to store
the config in the {@link FlowCatalog}
+ * Listener for {@link GitConfigMonitor} to apply changes in Git to a {@link
FlowCatalog} for adding and removing jobs
Review Comment:
probably *changes from Git* ... so it doesn't sound like the changes are
applied in/to git.
##########
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/GitConfigListener.java:
##########
@@ -137,23 +118,23 @@ public void removeChange(DiffEntry change) {
.withDescription(SPEC_DESCRIPTION)
.build();
- this.flowCatalog.remove(spec.getUri());
+ this.flowCatalog.remove(spec.getUri());
}
}
- /**
- * check whether the file has the proper naming and hierarchy
- * @param configFilePath the relative path from the repo root
- * @return false if the file does not conform
- */
+ /**
+ * check whether the file has the proper naming and hierarchy
+ * @param configFilePath the relative path from the repo root
+ * @return false if the file does not conform
+ */
private boolean checkConfigFilePath(String configFilePath) {
// The config needs to stored at
configDir/flowGroup/flowName.(pull|job|json|conf)
Path configFile = new Path(configFilePath);
String fileExtension = Files.getFileExtension(configFile.getName());
if (configFile.depth() != CONFIG_FILE_DEPTH
- || !configFile.getParent().getParent().getName().equals(folderName)
+ ||
!configFile.getParent().getParent().getName().equals(configBaseFolderName)
||
!(PullFileLoader.DEFAULT_JAVA_PROPS_PULL_FILE_EXTENSIONS.contains(fileExtension)
||
PullFileLoader.DEFAULT_JAVA_PROPS_PULL_FILE_EXTENSIONS.contains(fileExtension)))
{
log.warn("Changed file does not conform to directory structure and file
name format, skipping: "
Review Comment:
I get not failing the entire service--but seems a particularly soft error to
merely log a warning... how often do people notice this, I wonder...
##########
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/GitConfigMonitor.java:
##########
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gobblin.service.monitoring;
+
+import com.google.common.collect.ImmutableMap;
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigFactory;
+
+import javax.inject.Inject;
+import javax.inject.Singleton;
+import lombok.extern.slf4j.Slf4j;
+
+import org.apache.gobblin.configuration.ConfigurationKeys;
+import org.apache.gobblin.runtime.spec_catalog.FlowCatalog;
+
+/**
+ * Service that monitors for jobs from a git repository.
Review Comment:
are these jobs, as described, or flows?
##########
gobblin-service/src/main/java/org/apache/gobblin/service/modules/core/GobblinServiceManager.java:
##########
@@ -32,6 +32,7 @@
import org.apache.commons.cli.ParseException;
import org.apache.commons.lang3.ObjectUtils;
import org.apache.gobblin.service.modules.orchestration.UserQuotaManager;
+import org.apache.gobblin.service.monitoring.GitConfigMonitor;
Review Comment:
I see only this one change in the file. why is it necessary to import if
it's not used anywhere?
Issue Time Tracking
-------------------
Worklog Id: (was: 801159)
Time Spent: 50m (was: 40m)
> Refactor GaaS Flowgraph Monitor to be extensible
> ------------------------------------------------
>
> Key: GOBBLIN-1678
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1678
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: William Lo
> Priority: Major
> Time Spent: 50m
> Remaining Estimate: 0h
>
> To support new implementations of a flow graph monitor, which allows for live
> updating of a flowgraph, we should reuse as much implementation from the
> existing git flowgraph monitor as possible.
> The current flowgraph monitor has coupled logic to perform a lot of the
> adding node/edges which can be reused for a file based flowgraph.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)