[ 
https://issues.apache.org/jira/browse/YARN-9826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16929749#comment-16929749
 ] 

Prabhu Joseph commented on YARN-9826:
-------------------------------------

[~hdaikoku] If getAndSetAppLogs is moved outside the synchronized block, 
multiple threads may end up performing it for the same applicationId.
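
One way to avoid that duplicate work, sketched here only as an illustration: keep one future per key, so the first thread to ask for a key runs the slow lookup and later threads for the same key wait on its result, while threads for other keys are not blocked. PerKeyLoader and slowLoader are hypothetical names, not part of EntityGroupFSTimelineStore; slowLoader stands in for getAndSetAppLogs.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical helper (not Hadoop code): runs a slow lookup at most once
// per key at a time, without a single global lock.
class PerKeyLoader<K, V> {
  private final ConcurrentMap<K, CompletableFuture<V>> entries =
      new ConcurrentHashMap<>();
  // Assumption: stands in for getAndSetAppLogs, which may block on HDFS.
  private final Function<K, V> slowLoader;

  PerKeyLoader(Function<K, V> slowLoader) {
    this.slowLoader = slowLoader;
  }

  V get(K key) {
    CompletableFuture<V> mine = new CompletableFuture<>();
    CompletableFuture<V> existing = entries.putIfAbsent(key, mine);
    if (existing != null) {
      // Another thread owns this key; wait for its result only.
      return existing.join();
    }
    try {
      // This thread won the race, so it alone performs the slow lookup.
      V value = slowLoader.apply(key);
      if (value == null) {
        // Like the original code, do not cache a missing result.
        entries.remove(key, mine);
      }
      mine.complete(value);
      return value;
    } catch (RuntimeException | Error e) {
      // Drop the failed entry so a later call can retry, then wake waiters.
      entries.remove(key, mine);
      mine.completeExceptionally(e);
      throw e;
    }
  }
}
```

A caller would construct it as `new PerKeyLoader<>(id -> lookup(id))`, where `lookup` is whatever slow call needs guarding. Waiters on a failed lookup see the failure wrapped in a CompletionException from join().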

> Blocked threads at EntityGroupFSTimelineStore#getCachedStore
> ------------------------------------------------------------
>
>                 Key: YARN-9826
>                 URL: https://issues.apache.org/jira/browse/YARN-9826
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: timelineserver
>    Affects Versions: 2.7.3
>            Reporter: Harunobu Daikoku
>            Priority: Minor
>
> Several times on our production cluster, we have observed hundreds of 
> TimelineServer threads blocked at the following synchronized block in 
> EntityGroupFSTimelineStore#getCachedStore when our HDFS NameNode is under 
> high load.
> {code:java}
>     synchronized (this.cachedLogs) {
>       // Note that the content in the cache log storage may be stale.
>       cacheItem = this.cachedLogs.get(groupId);
>       if (cacheItem == null) {
>         LOG.debug("Set up new cache item for id {}", groupId);
>         cacheItem = new EntityCacheItem(groupId, getConfig());
>         AppLogs appLogs = getAndSetAppLogs(groupId.getApplicationId());
>         if (appLogs != null) {
>           LOG.debug("Set applogs {} for group id {}", appLogs, groupId);
>           cacheItem.setAppLogs(appLogs);
>           this.cachedLogs.put(groupId, cacheItem);
>         } else {
>           LOG.warn("AppLogs for groupId {} is set to null!", groupId);
>         }
>       }
>     }
> {code}
> One thread inside the synchronized block performs multiple fs operations 
> (fs.exists) inside getAndSetAppLogs, which could block other threads when, 
> for instance, the NameNode RPC queue is full.
> One possible solution is to move getAndSetAppLogs outside the synchronized 
> block.
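
The suggestion of moving getAndSetAppLogs outside the synchronized block could look roughly like the sketch below. It is a simplified stand-in, not the actual patch: CachedStoreSketch and slowLookup are hypothetical names, a plain map value replaces EntityCacheItem, and a missing result is returned as null instead of an empty cache item. The lock is held only for the map read and the map write, so a slow file system call no longer blocks lookups for other group ids. Two threads asking for the same id may both run the lookup; the second result is discarded.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, simplified stand-in for the cache in getCachedStore.
class CachedStoreSketch<K, V> {
  private final Map<K, V> cachedLogs = new HashMap<>();
  // Assumption: represents the slow getAndSetAppLogs call (fs.exists etc.).
  private final Function<K, V> slowLookup;

  CachedStoreSketch(Function<K, V> slowLookup) {
    this.slowLookup = slowLookup;
  }

  V getCachedStore(K groupId) {
    synchronized (cachedLogs) {
      // Fast path: only the map read happens under the lock.
      V item = cachedLogs.get(groupId);
      if (item != null) {
        return item;
      }
    }
    // Slow path: runs without the lock, so other threads are not blocked.
    V loaded = slowLookup.apply(groupId);
    if (loaded == null) {
      // Nothing found; as in the original code, nothing is cached.
      return null;
    }
    synchronized (cachedLogs) {
      // Another thread may have stored an item meanwhile; keep the first one.
      V raced = cachedLogs.putIfAbsent(groupId, loaded);
      return raced != null ? raced : loaded;
    }
  }
}
```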



