[ https://issues.apache.org/jira/browse/YARN-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305953#comment-17305953 ]
Andras Gyori commented on YARN-9618:
------------------------------------

Thank you [~zhuqi] for the patch. I have analysed the code a bit, and I think the main performance gain here comes from eliminating the unnecessary back reference to rmDispatcher on RMAppNodeUpdateEvent. Is using another async dispatcher justified here? My position on this issue is:
* The rmDispatcher will still have its eventQueue filled with NodeListManagerEvents.
* The new async dispatcher is another layer of abstraction, and its sole purpose is to copy the events from the rmDispatcher to its own event queue and then handle them just as rmDispatcher would.
* NodeListManager#handle will block on getting RMApp instances, because they are stored in a ConcurrentMap.

I think the new async dispatcher only makes sense if NodeListManager#sendRMAppNodeUpdateEventToNonFinalizedApps blocks the rmDispatcher thread for longer than it takes to copy an event from rmDispatcher#eventQueue to nodeListManagerDispatcher#eventQueue. Measuring the performance gain with and without the async dispatcher would be a really helpful metric here.

> NodeListManager event improvement
> ---------------------------------
>
>          Key: YARN-9618
>          URL: https://issues.apache.org/jira/browse/YARN-9618
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>     Reporter: Bibin Chundatt
>     Assignee: Qi Zhu
>     Priority: Critical
>  Attachments: YARN-9618.001.patch, YARN-9618.002.patch,
>               YARN-9618.003.patch, YARN-9618.004.patch, YARN-9618.005.patch
>
>
> In the current implementation, the NodeListManager event blocks the async dispatcher,
> which can cause an RM crash and slow down event processing.
> # Cluster restart with 1K running apps. Each usable event will create 1K
> events; overall there could be 5K*1K events for a 5K cluster.
> # Event processing is blocked till new events are added to the queue.
> Solution:
> # Add another async event handler, similar to the scheduler.
> # Instead of adding events to the dispatcher, call the RMApp event handler directly.
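
The comparison requested in the comment above (with and without the extra async dispatcher) could be prototyped outside of YARN along the following lines. This is only a minimal sketch under stated assumptions: MiniDispatcher, fanOut and run are hypothetical names invented for illustration, not Hadoop classes, and the per-app work is reduced to a counter increment, so the numbers it prints say nothing about real ResourceManager behaviour. It only shows the shape of the measurement: how long the main dispatcher thread is occupied per node event when it does the per-app fan-out itself, versus when it merely copies the event into a second queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

class DispatcherHandoffSketch {

    /** Hypothetical single-threaded event loop; a stand-in for an async dispatcher. */
    static final class MiniDispatcher implements AutoCloseable {
        // Sentinel that tells the worker thread to exit once the queue is drained.
        private static final Runnable STOP = new Runnable() {
            @Override public void run() { }
        };
        private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        private final Thread worker;

        MiniDispatcher(String name) {
            worker = new Thread(() -> {
                try {
                    for (Runnable r = queue.take(); r != STOP; r = queue.take()) {
                        r.run();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, name);
            worker.start();
        }

        void dispatch(Runnable event) {
            queue.add(event);
        }

        /** Processes everything queued so far, then stops the worker. */
        @Override
        public void close() throws InterruptedException {
            queue.add(STOP);
            worker.join();
        }
    }

    /** Simulated fan-out: one node event notifies every running app directly. */
    static void fanOut(int apps, AtomicInteger delivered) {
        for (int i = 0; i < apps; i++) {
            delivered.incrementAndGet(); // stands in for calling the app's handler
        }
    }

    /**
     * Returns {updates delivered, nanoseconds the main dispatcher thread spent
     * inside node-event handlers}.
     */
    static long[] run(int nodeEvents, int apps, boolean handOff) throws InterruptedException {
        AtomicInteger delivered = new AtomicInteger();
        AtomicLong mainBusyNanos = new AtomicLong();
        // Resources close in reverse order: main drains first, then secondary.
        try (MiniDispatcher secondary = new MiniDispatcher("secondary");
             MiniDispatcher main = new MiniDispatcher("main")) {
            for (int n = 0; n < nodeEvents; n++) {
                main.dispatch(() -> {
                    long start = System.nanoTime();
                    if (handOff) {
                        // Main thread only copies the event to the second queue.
                        secondary.dispatch(() -> fanOut(apps, delivered));
                    } else {
                        // Main thread performs the whole fan-out itself.
                        fanOut(apps, delivered);
                    }
                    mainBusyNanos.addAndGet(System.nanoTime() - start);
                });
            }
        }
        return new long[] { delivered.get(), mainBusyNanos.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        for (boolean handOff : new boolean[] { false, true }) {
            long[] r = run(5_000, 1_000, handOff);
            System.out.printf("handOff=%b delivered=%d mainBusyMicros=%d%n",
                    handOff, r[0], r[1] / 1_000);
        }
    }
}
```

Both modes deliver the same number of app updates; only the time charged to the main thread differs. If the fan-out per event is cheap, the hand-off saves little and mostly adds a queue copy, which is the concern raised in the comment. A meaningful answer would need the same measurement against the real RM code paths.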