[ 
https://issues.apache.org/jira/browse/HADOOP-13189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15325015#comment-15325015
 ] 

Konstantin Shvachko commented on HADOOP-13189:
----------------------------------------------

Hey [~mingma] good to hear from you. I should have given a bit more context in 
the description. Doing it now.
So we are trying to solve two inter-related issues here:
# Inconsistency between the config value of {{ipc.server.handler.queue.size}}, 
which is defined as _"How many calls per handler are allowed in the queue."_, 
and the actual queue size when {{FairCallQueue}} is used, which admits 4 times 
as many calls.
# A need to reconfigure the cluster when switching from one CallQueue 
implementation to another.
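
To put numbers on issue 1, here is a minimal sketch of the arithmetic. It assumes the server computes its call queue capacity as the handler count times {{ipc.server.handler.queue.size}}, and that {{FairCallQueue}} (before the fix) gives each of its priority levels that full capacity. The class name, method names, and the handler count of 10 are made up for illustration and are not Hadoop code.

```java
// Illustration only: hypothetical names, not part of Hadoop.
final class CallQueueCapacityMath {

  // Capacity implied by the configuration: calls per handler times handlers.
  static int configuredCapacity(int handlerCount, int queueSizePerHandler) {
    return handlerCount * queueSizePerHandler;
  }

  // Capacity when every priority level gets a sub-queue of the full
  // configured size, which is the behavior reported in this issue.
  static int fairCallQueueCapacityBeforeFix(int configuredCapacity,
                                            int priorityLevels) {
    return configuredCapacity * priorityLevels;
  }

  public static void main(String[] args) {
    int handlers = 10;     // assumed handler count, for illustration
    int perHandler = 100;  // assumed ipc.server.handler.queue.size
    int levels = 4;        // default number of FairCallQueue priority levels

    int configured = configuredCapacity(handlers, perHandler);
    int actual = fairCallQueueCapacityBeforeFix(configured, levels);

    System.out.println("configured capacity: " + configured); // 1000
    System.out.println("FairCallQueue total: " + actual);     // 4000
  }
}
```

With these assumed values the NN would queue up to 4000 calls instead of the 1000 the configuration suggests.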

The latter is important to prevent NN meltdown. Before we learned about this 
issue, we saw too many clients connecting to the NameNode and making it 
unresponsive. This was related to read requests like {{listStatus}} and 
{{getFileInfo}}. It worked fine with the default queue, but not with 
FairCallQueue. Looking deeper, we noticed that the NN admitted more requests 
with FairCallQueue than with the default. We had to reconfigure the NN and 
adjust the queue size, but that leaves us with problem 1. Hope this makes sense.

In your example, where all requests go only into the high-priority queue, one 
should probably switch to the default queue, since the FairCallQueue 
functionality is not being used.

> FairCallQueue makes callQueue larger than the configured capacity.
> ------------------------------------------------------------------
>
>                 Key: HADOOP-13189
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13189
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ipc
>    Affects Versions: 2.6.0
>            Reporter: Konstantin Shvachko
>            Assignee: Vinitha Reddy Gankidi
>         Attachments: HADOOP-13189.001.patch, HADOOP-13189.002.patch, 
> HADOOP-13189.003.patch
>
>
> {{FairCallQueue}} divides {{callQueue}} into multiple (4 by default) 
> sub-queues, with each sub-queue corresponding to a different level of 
> priority. The constructor for {{FairCallQueue}} takes the same parameter 
> {{capacity}} as the default CallQueue implementation, and allocates all its 
> sub-queues of size {{capacity}}. With 4 levels of priority (sub-queues) by 
> default it results in the total callQueue size 4 times larger than it should 
> be based on the configuration.
> {{capacity}} should be divided by the number of sub-queues at some place.
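
One possible way to divide {{capacity}} among the sub-queues is sketched below. This is only an illustration under stated assumptions, not the attached patch: the class and method names are hypothetical, {{LinkedBlockingQueue}} stands in for whatever the sub-queues actually are, and spreading the remainder over the first sub-queues is just one choice (handing it all to a single sub-queue would also keep the total equal to {{capacity}}).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustration only: hypothetical names, not the actual FairCallQueue code.
final class SplitCapacitySketch {

  // Split the total capacity across numQueues sub-queues so that the
  // sizes add up to exactly the configured capacity.
  static int[] splitCapacity(int capacity, int numQueues) {
    if (numQueues < 1) {
      throw new IllegalArgumentException("numQueues must be at least 1");
    }
    if (capacity < numQueues) {
      // Every sub-queue needs room for at least one call.
      throw new IllegalArgumentException("capacity must be >= numQueues");
    }
    int base = capacity / numQueues;
    int remainder = capacity % numQueues;
    int[] sizes = new int[numQueues];
    for (int i = 0; i < numQueues; i++) {
      // The first 'remainder' sub-queues take one extra slot each.
      sizes[i] = base + (i < remainder ? 1 : 0);
    }
    return sizes;
  }

  // Build bounded sub-queues whose combined capacity equals 'capacity'.
  static <E> List<BlockingQueue<E>> createSubQueues(int capacity,
                                                    int numQueues) {
    List<BlockingQueue<E>> queues = new ArrayList<BlockingQueue<E>>(numQueues);
    for (int size : splitCapacity(capacity, numQueues)) {
      queues.add(new LinkedBlockingQueue<E>(size));
    }
    return queues;
  }
}
```

For example, {{splitCapacity(1000, 4)}} yields four sub-queues of 250 each, so the total matches the configured value instead of being 4 times larger.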



