[ 
https://issues.apache.org/jira/browse/GOBBLIN-846?focusedWorklogId=292177&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-292177
 ]

ASF GitHub Bot logged work on GOBBLIN-846:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/Aug/19 17:15
            Start Date: 09/Aug/19 17:15
    Worklog Time Spent: 10m 
      Work Description: autumnust commented on pull request #2701: GOBBLIN-846: 
Enhance LogCopier service to handle continuous YARN log …
URL: https://github.com/apache/incubator-gobblin/pull/2701#discussion_r312572083
 
 

 ##########
 File path: 
gobblin-yarn/src/main/java/org/apache/gobblin/yarn/GobblinYarnLogSource.java
 ##########
 @@ -69,21 +78,31 @@ protected boolean isLogSourcePresent() {
   protected LogCopier buildLogCopier(Config config, ContainerId containerId, 
FileSystem destFs, Path appWorkDir)
       throws IOException {
     LogCopier.Builder builder = LogCopier.newBuilder()
-            .useSrcFileSystem(FileSystem.getLocal(new Configuration()))
-            .useDestFileSystem(destFs)
+            .useSrcFileSystem(buildFileSystem(config, true))
+            .useDestFileSystem(buildFileSystem(config, false))
             .readFrom(getLocalLogDirs())
             .writeTo(getHdfsLogDir(containerId, destFs, appWorkDir))
-            
.acceptsLogFileExtensions(ImmutableSet.of(ApplicationConstants.STDOUT, 
ApplicationConstants.STDERR))
-            .useLogFileNamePrefix(containerId.toString());
-    if (config.hasPath(GobblinYarnConfigurationKeys.LOG_COPIER_MAX_FILE_SIZE)) 
{
-      
builder.useMaxBytesPerLogFile(config.getBytes(GobblinYarnConfigurationKeys.LOG_COPIER_MAX_FILE_SIZE));
-    }
-    if (config.hasPath(GobblinYarnConfigurationKeys.LOG_COPIER_SCHEDULER)) {
-      
builder.useScheduler(config.getString(GobblinYarnConfigurationKeys.LOG_COPIER_SCHEDULER));
-    }
+            
.useCurrentLogFileName(Files.getNameWithoutExtension(System.getProperty(GobblinYarnConfigurationKeys.GOBBLIN_YARN_CONTAINER_LOG_FILE_NAME)));
+
+    
builder.acceptsLogFileExtensions(config.hasPath(GobblinYarnConfigurationKeys.LOG_FILE_EXTENSIONS)
 ? ImmutableSet
+        
.copyOf(Splitter.on(",").splitToList(config.getString(GobblinYarnConfigurationKeys.LOG_FILE_EXTENSIONS)))
+        : ImmutableSet.of());
+
     return builder.build();
   }
 
+  /**
+   * Return a new (non-cached) {@link FileSystem} instance. The {@link 
FileSystem} instance
+   * returned by the method has automatic closing disabled. The user of the 
instance needs to handle closing of the
+   * instance, typically as part of its shutdown sequence.
+   */
+  private FileSystem buildFileSystem(Config config, boolean isLocal) throws 
IOException {
+    return isLocal ? FileSystem.newInstanceLocal(AUTO_CLOSE_CONFIG)
 
 Review comment:
   This config is new to me and educated me a lot after reading : 
http://johnjianfang.blogspot.com/2015/03/hadoop-filesystem-internal-cache.html 
   
   Just curious, are you seeing any issues from previous implementation or just 
want to ensure the lifecycle of `fs` object is fully controlled ?  If the 
former, what kind of symptom is that, an exception while doing fs operation or 
silent failure of copy ? 
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 292177)
    Time Spent: 1.5h  (was: 1h 20m)

> Enhance LogCopier service to handle continuous YARN log aggregation
> -------------------------------------------------------------------
>
>                 Key: GOBBLIN-846
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-846
>             Project: Apache Gobblin
>          Issue Type: Improvement
>          Components: gobblin-yarn
>    Affects Versions: 0.15.0
>            Reporter: Sudarshan Vasudevan
>            Assignee: Abhishek Tiwari
>            Priority: Major
>             Fix For: 0.15.0
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> This task enhances LogCopier service to perform continuous log aggregation 
> for long-running Gobblin-on-Yarn applications. It periodically copies rolled 
> container logs to a specified location on HDFS. The "current" log file is 
> copied in the end as part of the service shutdown sequence. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

Reply via email to