autumnust commented on a change in pull request #2701: GOBBLIN-846: Enhance
LogCopier service to handle continuous YARN log …
URL: https://github.com/apache/incubator-gobblin/pull/2701#discussion_r312572083
##########
File path:
gobblin-yarn/src/main/java/org/apache/gobblin/yarn/GobblinYarnLogSource.java
##########
@@ -69,21 +78,31 @@ protected boolean isLogSourcePresent() {
protected LogCopier buildLogCopier(Config config, ContainerId containerId,
FileSystem destFs, Path appWorkDir)
throws IOException {
LogCopier.Builder builder = LogCopier.newBuilder()
- .useSrcFileSystem(FileSystem.getLocal(new Configuration()))
- .useDestFileSystem(destFs)
+ .useSrcFileSystem(buildFileSystem(config, true))
+ .useDestFileSystem(buildFileSystem(config, false))
.readFrom(getLocalLogDirs())
.writeTo(getHdfsLogDir(containerId, destFs, appWorkDir))
-
.acceptsLogFileExtensions(ImmutableSet.of(ApplicationConstants.STDOUT,
ApplicationConstants.STDERR))
- .useLogFileNamePrefix(containerId.toString());
- if (config.hasPath(GobblinYarnConfigurationKeys.LOG_COPIER_MAX_FILE_SIZE))
{
-
builder.useMaxBytesPerLogFile(config.getBytes(GobblinYarnConfigurationKeys.LOG_COPIER_MAX_FILE_SIZE));
- }
- if (config.hasPath(GobblinYarnConfigurationKeys.LOG_COPIER_SCHEDULER)) {
-
builder.useScheduler(config.getString(GobblinYarnConfigurationKeys.LOG_COPIER_SCHEDULER));
- }
+
.useCurrentLogFileName(Files.getNameWithoutExtension(System.getProperty(GobblinYarnConfigurationKeys.GOBBLIN_YARN_CONTAINER_LOG_FILE_NAME)));
+
+
builder.acceptsLogFileExtensions(config.hasPath(GobblinYarnConfigurationKeys.LOG_FILE_EXTENSIONS)
? ImmutableSet
+
.copyOf(Splitter.on(",").splitToList(config.getString(GobblinYarnConfigurationKeys.LOG_FILE_EXTENSIONS)))
+ : ImmutableSet.of());
+
return builder.build();
}
+ /**
+ * Return a new (non-cached) {@link FileSystem} instance. The {@link
FileSystem} instance
+ * returned by the method has automatic closing disabled. The user of the
instance needs to handle closing of the
+ * instance, typically as part of its shutdown sequence.
+ */
+ private FileSystem buildFileSystem(Config config, boolean isLocal) throws
IOException {
+ return isLocal ? FileSystem.newInstanceLocal(AUTO_CLOSE_CONFIG)
Review comment:
This config is new to me and educated me a lot after reading :
http://johnjianfang.blogspot.com/2015/03/hadoop-filesystem-internal-cache.html
Just curious, are you seeing any issues from previous implementation or just
want to ensure the lifecycle of `fs` object is fully controlled ? If the
former, what kind of symptom is that, an exception while doing fs operation or
silent failure of copy ?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services