[jira] [Comment Edited] (TAJO-540) (Umbrella) Implement Tajo Query Scheduler

Min Zhou (JIRA) Tue, 04 Feb 2014 07:09:14 -0800

    [ https://issues.apache.org/jira/browse/TAJO-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890328#comment-13890328 ]


Min Zhou edited comment on TAJO-540 at 2/4/14 3:03 PM:
-------------------------------------------------------

Continuing from my previous two comments. Sparrow improves on the "power of 
two choices" algorithm in two respects: 1) queue-based assignment cannot 
accurately measure how long a task will really take, and 2) the concurrent 
scheduling problem. See the Sparrow paper for details.
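
For reference, the baseline "power of two choices" idea can be sketched as 
below. This is only an illustration, not Sparrow's or Tajo's code: the names 
pick_worker and assign_tasks are made up for this example, and queue length is 
used as the load estimate, which is exactly the proxy that issue 1) above says 
is inaccurate.

```python
import random


def pick_worker(queue_lengths, d=2, rng=random):
    """Probe d distinct random workers; return the index of the shortest queue."""
    if not queue_lengths:
        raise ValueError("no workers available")
    d = min(d, len(queue_lengths))
    probed = rng.sample(range(len(queue_lengths)), d)
    # Queue length is only a proxy for the real waiting time of a task.
    return min(probed, key=lambda w: queue_lengths[w])


def assign_tasks(num_tasks, num_workers, d=2, seed=0):
    """Place tasks one at a time and return the resulting queue lengths."""
    rng = random.Random(seed)
    queues = [0] * num_workers
    for _ in range(num_tasks):
        queues[pick_worker(queues, d, rng)] += 1
    return queues


if __name__ == "__main__":
    print(assign_tasks(num_tasks=100, num_workers=10))
```

With d=1 this degenerates to purely random placement; probing two workers 
instead of one is what keeps the queues balanced.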

As I mentioned, if we want a low-latency scheduler for an interactive or 
real-time system, we need to radically change the current design of Tajo's 
scheduling.

Firstly, the way we use YARN is quite different from Spark and Impala. In 
Tajo, resource requests are issued by Tajo workers, with one container per 
task/queryunit attempt. Spark and Impala, by contrast, use YARN as a 
higher-layer scheduler for resource management, and use a Sparrow(-like) 
scheduler as their own internal scheduler in a lower layer to achieve low 
latency. YARN allocates resources for a whole Spark/Impala cluster, not for 
an individual task. For example, suppose a Spark cluster has 1 master and 10 
slaves, the master needs 10 GB of memory, and each slave needs 20 GB. YARN 
allocates a 10 GB container for the master daemon and a 20 GB container for 
each slave daemon. Because those daemons are long-lived processes, the 
resources stay occupied by the Spark cluster for a long time. YARN reclaims 
the resources only when a slave is decommissioned from the cluster.

Here is my thought on the Tajo query scheduler: we can use YARN as the 
higher-layer resource manager, so that YARN allocates CPU/memory resources to 
the Tajo master/querymaster/worker daemons, and a Sparrow-like scheduler 
coordinates queries on top of those resources in a lower layer.
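
To make the example above concrete, here is a small sketch of the coarse, 
daemon-level requests that the higher layer would see. ContainerRequest and 
cluster_requests are hypothetical names for this illustration; they are not 
YARN or Tajo APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerRequest:
    role: str        # which long-lived daemon the container hosts
    memory_gb: int   # memory requested for that daemon


def cluster_requests(num_slaves=10, master_gb=10, slave_gb=20):
    """One request per long-lived daemon, not one per task attempt."""
    requests = [ContainerRequest("master", master_gb)]
    requests += [ContainerRequest("slave", slave_gb) for _ in range(num_slaves)]
    return requests


requests = cluster_requests()
total_gb = sum(r.memory_gb for r in requests)

if __name__ == "__main__":
    # 1 master at 10 GB plus 10 slaves at 20 GB each.
    print(len(requests), "containers,", total_gb, "GB in total")
```

The number of requests depends only on the number of daemons, so task 
placement inside those containers can be handled by the lower-layer scheduler 
without going back to YARN for every task.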

Secondly, using Sparrow directly is not appropriate, for three reasons: 1) 
Sparrow needs a scheduler daemon started on each machine, which is 
inconvenient to operate; 2) Sparrow supports multitenancy, in other words it 
has user authentication, which Tajo does not support yet; 3) Sparrow 
currently cannot kill a job. But the algorithm behind Sparrow is quite 
suitable for Tajo.



> (Umbrella) Implement Tajo Query Scheduler
> -----------------------------------------
>
>                 Key: TAJO-540
>                 URL: https://issues.apache.org/jira/browse/TAJO-540
>             Project: Tajo
>          Issue Type: New Feature
>            Reporter: Hyunsik Choi
>
> Currently, there is no Tajo query scheduler, so all queries launched 
> simultaneously compete for cluster resources, which are managed by 
> TajoResourceManager.
> In this issue, we will investigate, design, and implement a Tajo query 
> scheduler. This is an umbrella issue for that work. We will create subtasks 
> for it.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
