[jira] [Commented] (YARN-4953) Delete completed container log folder when rolling log aggregation is enabled
[ https://issues.apache.org/jira/browse/YARN-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17802751#comment-17802751 ]

Shilun Fan commented on YARN-4953:
----------------------------------

Bulk update: moved all 3.4.0 non-blocker issues, please move back if it is a blocker. Retarget 3.5.0.

> Delete completed container log folder when rolling log aggregation is enabled
> ------------------------------------------------------------------------------
>
>                 Key: YARN-4953
>                 URL: https://issues.apache.org/jira/browse/YARN-4953
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>            Priority: Major
>
> There is a potential bottleneck when a cluster runs a very large number of
> containers on the same NodeManager for a single application. Linux limits
> the subfolder count to 32K. If the number of containers for an application
> exceeds 32K, container launch fails, and from that point no more
> containers can be launched on this node.
> Currently, log folders are deleted after the app has finished. Rolling log
> aggregation aggregates logs to HDFS periodically.
> I think that once aggregation is complete for finished containers, cleanup
> can be done, i.e. the log folders of finished containers can be deleted.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4953) Delete completed container log folder when rolling log aggregation is enabled
[ https://issues.apache.org/jira/browse/YARN-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15419790#comment-15419790 ]

Rohith Sharma K S commented on YARN-4953:
-----------------------------------------

Thanks, Jason, for your inputs. Let me create an initial version-0 patch.
[jira] [Commented] (YARN-4953) Delete completed container log folder when rolling log aggregation is enabled
[ https://issues.apache.org/jira/browse/YARN-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417388#comment-15417388 ]

Jason Lowe commented on YARN-4953:
----------------------------------

Sorry for the delay in responding. I think we can delete the aggregated logs of completed containers when log rolling is enabled. The only issue I can think of is around NM restart: we need to make sure we don't accidentally clobber an existing aggregated log when recovering with the (now incorrect) assumption that we can simply re-aggregate the completed container's logs. [~xgong] and [~vinodkv] may have other insights and comments on this proposal as well.
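One way to read the restart concern, sketched in Python rather than the NodeManager's Java and with hypothetical names throughout (`AggregationState`, `mark_uploaded`, `needs_reaggregation`): if the NM durably records which containers were already uploaded before their local logs are deleted, recovery can skip them instead of re-aggregating from a folder that no longer exists. The JSON file below merely stands in for whatever state store the NM would actually use; this is an illustration under those assumptions, not the NM recovery code.

```python
import json
import tempfile
from pathlib import Path


class AggregationState:
    """Persists the ids of containers whose logs were already uploaded.

    Hypothetical helper for illustration only.
    """

    def __init__(self, state_file):
        self.state_file = Path(state_file)
        if self.state_file.exists():
            # Recovery path: reload what was recorded before the restart.
            self.uploaded = set(json.loads(self.state_file.read_text()))
        else:
            self.uploaded = set()

    def mark_uploaded(self, container_id):
        # Record the upload durably BEFORE the local log folder is deleted.
        self.uploaded.add(container_id)
        self.state_file.write_text(json.dumps(sorted(self.uploaded)))

    def needs_reaggregation(self, container_id):
        # After a restart, only containers without a record are re-aggregated.
        return container_id not in self.uploaded


def demo():
    """Simulate a restart by reloading the state from the same file."""
    with tempfile.TemporaryDirectory() as tmp:
        state_file = Path(tmp) / "aggregation_state.json"
        before_restart = AggregationState(state_file)
        before_restart.mark_uploaded("container_01")
        after_restart = AggregationState(state_file)
        return (
            after_restart.needs_reaggregation("container_01"),
            after_restart.needs_reaggregation("container_02"),
        )
```

Here `demo()` reports that `container_01` is skipped after the simulated restart while `container_02`, which was never recorded, would still be aggregated.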
[jira] [Commented] (YARN-4953) Delete completed container log folder when rolling log aggregation is enabled
[ https://issues.apache.org/jira/browse/YARN-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313775#comment-15313775 ]

Rohith Sharma K S commented on YARN-4953:
-----------------------------------------

bq. The main issue with aggregating as containers complete is the additional load on the namenode

Right, this is a major issue in large clusters. Since log rolling is supported, I think it is worth deleting the log folders of completed containers once they have been aggregated, when log rolling is enabled. Are there any potential issues with this? Thoughts?
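A minimal sketch of the cleanup being proposed, written in Python rather than the NodeManager's Java and using hypothetical names (`cleanup_finished_containers`, `finished`, `aggregated`): a container's local log folder is removed only when the container has finished and its logs are already covered by a rolling aggregation cycle. It illustrates the idea under those assumptions and is not NodeManager code.

```python
import shutil
import tempfile
from pathlib import Path


def cleanup_finished_containers(app_log_dir, finished, aggregated):
    """Delete log folders of containers that are finished AND aggregated.

    app_log_dir: the application's local log directory, one subfolder
                 per container.
    finished:    set of container ids that have completed (hypothetical).
    aggregated:  set of container ids whose logs were fully uploaded
                 (hypothetical).
    Returns the sorted list of container ids whose folders were removed.
    """
    removed = []
    for container_id in sorted(finished & aggregated):
        container_dir = Path(app_log_dir) / container_id
        if container_dir.is_dir():
            shutil.rmtree(container_dir)
            removed.append(container_id)
    return removed


def demo():
    """Create three container folders and clean up the only eligible one."""
    with tempfile.TemporaryDirectory() as tmp:
        app_dir = Path(tmp) / "application_0001"
        for cid in ("container_01", "container_02", "container_03"):
            (app_dir / cid).mkdir(parents=True)
            (app_dir / cid / "stdout").write_text("log line\n")
        removed = cleanup_finished_containers(
            app_dir,
            finished={"container_01", "container_02"},
            aggregated={"container_01"},  # container_02 is not uploaded yet
        )
        remaining = sorted(p.name for p in app_dir.iterdir())
        return removed, remaining
```

In `demo()`, `container_02` has finished but is not yet aggregated and `container_03` is still running, so both folders are kept. Freeing folders this way as the application runs is what would keep the per-application subfolder count below the limit described in the issue.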
[jira] [Commented] (YARN-4953) Delete completed container log folder when rolling log aggregation is enabled
[ https://issues.apache.org/jira/browse/YARN-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15312317#comment-15312317 ]

Jason Lowe commented on YARN-4953:
----------------------------------

Sorry for missing this earlier. As I mentioned on YARN-5193, log aggregation originally aggregated logs for containers as they finished. The main issue with aggregating as containers complete is the additional load on the namenode. See YARN-219. Our large clusters were getting swamped with lease renewal load until that was changed. We might be able to work around it with append operations, but it can be very problematic to simply have the NM hold the aggregated log file open until the app completes.
[jira] [Commented] (YARN-4953) Delete completed container log folder when rolling log aggregation is enabled
[ https://issues.apache.org/jira/browse/YARN-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238616#comment-15238616 ]

Rohith Sharma K S commented on YARN-4953:
-----------------------------------------

Even though this scenario is very rare, I am thinking one step ahead, since it would affect applications with a large number of containers. Thoughts?
[jira] [Commented] (YARN-4953) Delete completed container log folder when rolling log aggregation is enabled
[ https://issues.apache.org/jira/browse/YARN-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238596#comment-15238596 ]

Rohith Sharma K S commented on YARN-4953:
-----------------------------------------

cc: [~jlowe]