[ 
https://issues.apache.org/jira/browse/HADOOP-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14035080#comment-14035080
 ] 

Arpit Agarwal commented on HADOOP-10281:
----------------------------------------

Hi Chris, I assume this is also optional like FairCallQueue.

High level concerns:
# Every call results in history mutation which seems to be overkill. I think we 
can get away with a sampling approach, where a given call is fed as input to 
scheduler with a small probability.
# Do you guys have any numbers from performance analysis at scale? It seems 
like the combined changes are adding a few interlocked operations in the call 
queue path, and indirect locks via use of ConcurrentHashMap. IIUC there are 4 
synchronized state updates per call in HistoryRpcScheduler itself. two updates 
to {{callHistoryCounts}} and two updates to {{callHistory}}.
# There should be a way to age state after a period of inactivity.
# I also think we are providing too many configuration knobs with this feature. 
Hadoop performance tuning is quite complex already and additional settings add 
administrator and debugging overhead. Perhaps it is better to leave some of 
these settings non-configurable.

What do you think?

If this feature is optional, we can proceed and tune the implementation later 
if we file a Jira to address the high level issues later.

# This comment confused me _If there is not enough traffic to flush the 
callHistory, sporadic users could end up with counts greater than the 
heartbeats_. Which heartbeats does it refer to?
# {{historyLength}}, {{serviceUsernames}}, {{thresholds}}, {{numQueues}} can be 
final.
# Why do we support multiple identity providers if we discard all but the first?
# I agree that service user seems like an unnecessary complication for now. 
Perhaps we can add it later?
# {{#getSchedulingDecisions}} - curious why you make a copy of the 
{{callHistoryCounts}} reference.
# {{getDefaultThresholds}} - is there any analysis behind choosing this 
bisection approach?
# {{getAndIncrement}} - I think we do care about the return value of 
{{putIfAbsent}}, because we want to update the local variable {{value}}.
# {{popLast}} - check for return value of poll being null?
# Nitpick: Spaces around equals sign in {{i=retval.length-1}}. Could you also 
fix other instances to be inline with our coding convention e.g. 
{{(aNumQueues-1)}}

> Create a scheduler, which assigns schedulables a priority level
> ---------------------------------------------------------------
>
>                 Key: HADOOP-10281
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10281
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Chris Li
>            Assignee: Chris Li
>         Attachments: HADOOP-10281.patch, HADOOP-10281.patch
>
>
> The Scheduler decides which sub-queue to assign a given Call. It implements a 
> single method getPriorityLevel(Schedulable call) which returns an integer 
> corresponding to the subqueue the FairCallQueue should place the call in.
> The HistoryRpcScheduler is one such implementation which uses the username of 
> each call and determines what % of calls in recent history were made by this 
> user.
> It is configured with a historyLength (how many calls to track) and a list of 
> integer thresholds which determine the boundaries between priority levels.
> For instance, if the scheduler has a historyLength of 8; and priority 
> thresholds of 4,2,1; and saw calls made by these users in order:
> Alice, Bob, Alice, Alice, Bob, Jerry, Alice, Alice
> * Another call by Alice would be placed in queue 3, since she has already 
> made >= 4 calls
> * Another call by Bob would be placed in queue 2, since he has >= 2 but less 
> than 4 calls
> * A call by Carlos would be placed in queue 0, since he has no calls in the 
> history
> Also, some versions of this patch include the concept of a 'service user', 
> which is a user that is always scheduled high-priority. Currently this seems 
> redundant and will probably be removed in later patches, since its not too 
> useful.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to