[ https://issues.apache.org/jira/browse/YARN-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538165#comment-14538165 ]
Junping Du commented on YARN-3505: ---------------------------------- bq. If this happens, that means the log aggregation still happens in some of NMs. I see. Agree that we don't need to do any cleanup in this case. Some minor comments on updated patch: In aggregateLogReport.java, {code} - if (report.getDiagnosticMessage() != null - && !report.getDiagnosticMessage().isEmpty()) { - curReport - .setDiagnosticMessage(curReport.getDiagnosticMessage() == null - ? report.getDiagnosticMessage() : curReport - .getDiagnosticMessage() + report.getDiagnosticMessage()); + if (!curReport.getLogAggregationStatus().equals( + LogAggregationStatus.SUCCEEDED) + && !curReport.getLogAggregationStatus().equals( + LogAggregationStatus.FAILED) + && (report.getLogAggregationStatus().equals( + LogAggregationStatus.SUCCEEDED) + || report.getLogAggregationStatus().equals( + LogAggregationStatus.FAILED))) { + statusChanged = true; // anchor 1 for comments + } + if (report.getLogAggregationStatus() != LogAggregationStatus.RUNNING + || curReport.getLogAggregationStatus() != + LogAggregationStatus.RUNNING_WITH_FAILURE) { + curReport.setLogAggregationStatus(report + .getLogAggregationStatus()); // anchor 2 for comments + } {code} Are we missing curReport.setLogAggregationStatus() (in above anchor 1 place)? We should set SUCCEEDED or FAILED to curReport. Isn't it? In addition, why we don't put statusChanged in above anchor 2 place? If we think statusChanged only hint status move to final state (SUCCEEDED or FAILED), then we should rename statusChanged to something like stateChangedToFinal which sounds more obviously. BTW, can we make logic here to be something simpler to make status get updated except only two cases?: 1. curReport.getLogAggregationStatus() = report.getLogAggregationStatus(); 2. curReport.getLogAggregationStatus() = RUNNING_WITH_FAILURE && report.getLogAggregationStatus() = RUNNING In updateLogAggregationDiagnosticMessages(), {code} if (report.getLogAggregationStatus() + == LogAggregationStatus.RUNNING || report.getLogAggregationStatus() + == LogAggregationStatus.SUCCEEDED || report.getLogAggregationStatus() + == LogAggregationStatus.FAILED) { {code} Why case of "report.getLogAggregationStatus() == LogAggregationStatus.FAILED" doesn't go to the other branch like: LogAggregationStatus.RUNNING_WITH_FAILURE? {code} + LogAggregationDiagnosticsForNMs.put(nodeId, diagnostics); {code} Move this into block of "diagnostics == null", right after " diagnostics = new ArrayList<String>();", because we only need to call this the first time we put diagnostics info. The same problem for failureMessages too. > Node's Log Aggregation Report with SUCCEED should not cached in RMApps > ---------------------------------------------------------------------- > > Key: YARN-3505 > URL: https://issues.apache.org/jira/browse/YARN-3505 > Project: Hadoop YARN > Issue Type: Sub-task > Components: log-aggregation > Affects Versions: 2.8.0 > Reporter: Junping Du > Assignee: Xuan Gong > Priority: Critical > Attachments: YARN-3505.1.patch, YARN-3505.2.patch, > YARN-3505.2.rebase.patch, YARN-3505.3.patch > > > Per discussions in YARN-1402, we shouldn't cache all node's log aggregation > reports in RMApps for always, especially for those finished with SUCCEED. -- This message was sent by Atlassian JIRA (v6.3.4#6332)