[ https://issues.apache.org/jira/browse/YARN-6876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16135926#comment-16135926 ]
Junping Du commented on YARN-6876: ---------------------------------- Thanks [~xgong] for uploading the patch. Sorry for coming late on this. I assume we have sort out all design issues in umbrella JIRA, so I quickly walk through the refactor patch and have a few comments: It sounds like we are always adding TFile format no matter what we particular setting in yarn configuration. I can understand we want to do this for compatibility for older version of log files especially for rolling upgrades, but that means no ways to get rid of TFile format in future because we bind it in code level. I think it may better to make TFile format as default one (put it value in yarn-default.xml), but allow user to remove it explicitly? Also, if there are multiple values specified in "log-aggregation.file-formats", what format will take effective first in writing? I think it should be the first non-TFile format, isn't it? If so, we should document it somewhere. It sounds like for each file format, we need to add additional 4 configs in yarn-site.xml: log-aggregation.file-formats, log-aggregation.file-format.%s.class, log-aggregation.%s.remote-app-log-dir, log-aggregation.%s.remote-app-log-dir-suffix. Can we figure out some way to simplify it? - at least last two configurations are not so necessary. LogAggregationFileFormat.java is actually doing handy action for log files, like: write, createDir, checkExists, etc. May be we can rename it to LogAggregationFileController.java instead? Also *FormatContext and *FormatFactory can be renamed to *ControllerContext and *ControllerFactory. I found we are replacing close or closeStream to closeQuietly. Any specially reason for this change? Does that related to this refactor effort here? {noformat} - IOUtils.closeStream(this.fsDataOStream); + IOUtils.closeQuietly(this.fsDataOStream); {noformat} {code} + @VisibleForTesting + public void verifyRemoteLogDir(LogAggregationFileFormat fileFormat) { + fileFormat.verifyAndCreateRemoteLogDir(); + } + + @VisibleForTesting + public void createAppDir(LogAggregationFileFormat fileFormat, + String user, ApplicationId appId, UserGroupInformation userUgi) { + fileFormat.createAppDir(user, appId, userUgi); + } {code} These methods seems to be unnecessary. Even for test, we can get fileFormat (fileController) and use to do these operations directly. It sounds like UT test failures are not related. Can you confirm that and check if we have existing JIRAs to track all these failures? > Create an abstract log writer for extendability > ----------------------------------------------- > > Key: YARN-6876 > URL: https://issues.apache.org/jira/browse/YARN-6876 > Project: Hadoop YARN > Issue Type: Sub-task > Reporter: Xuan Gong > Assignee: Xuan Gong > Attachments: YARN-6876-branch-2.001.patch, YARN-6876-trunk.001.patch, > YARN-6876-trunk.002.patch, YARN-6876-trunk.003.patch, > YARN-6876-trunk.004.patch > > > Currently, TFile log writer is used to aggregate log in YARN. We need to add > an abstract layer, and pick up the correct log writer based on the > configuration. -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org