[
https://issues.apache.org/jira/browse/TAJO-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890328#comment-13890328
]
Min Zhou edited comment on TAJO-540 at 2/4/14 3:03 PM:
-------------------------------------------------------
Continuing from my previous two comments. Sparrow improves on "the power of two
choices" algorithm by addressing two issues: 1) queue-based assignment can't
accurately measure the real time cost of a task, and 2) the concurrent
scheduling problem. See the Sparrow paper for details.
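For readers who have not seen it, here is a minimal sketch of the baseline
"power of two choices" placement that Sparrow starts from. It is only an
illustration, not Sparrow or Tajo code: the function names, the seeded random
generator, and the use of queue length as the load signal are all assumptions
made for this example.

```python
import random


def pick_worker(queue_lengths, rng, d=2):
    """Probe d distinct random workers and return the index of the one
    with the shortest queue (ties go to the first worker probed)."""
    probed = rng.sample(range(len(queue_lengths)), d)
    return min(probed, key=lambda w: queue_lengths[w])


def place_tasks(num_tasks, num_workers, seed=0, d=2):
    """Place tasks one at a time; return the resulting queue lengths."""
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    queues = [0] * num_workers
    for _ in range(num_tasks):
        w = pick_worker(queues, rng, d)
        # Queue length is the only load signal here. It says nothing about
        # how long the queued tasks will actually run, which is issue 1 above.
        queues[w] += 1
    return queues
```

Each task is placed independently, and a queue length is only a rough proxy
for waiting time, so this baseline shows why the two issues listed above
matter.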
As I mentioned, if we want to use a low-latency scheduler for an interactive or
real-time system, we need to radically change the current design of Tajo's
scheduling.
First, the way we use YARN is quite different from Spark and Impala. In Tajo,
resource requests are issued by Tajo workers, one container per task/queryunit
attempt. Spark and Impala, in contrast, use YARN as a higher-layer scheduler
for resource management, and use a Sparrow(-like) scheduler of their own in a
lower layer for low latency. YARN is used to allocate resources for a whole
Spark/Impala cluster, not for a single task. For example, suppose a Spark
cluster has 1 master and 10 slaves, where the master needs 10GB of memory and
each slave needs 20GB. YARN allocates a 10GB container for the master daemon
and a 20GB container for each slave daemon. Because those daemons are
long-lived processes, the resources stay occupied by the Spark cluster for a
long time. YARN reclaims the resources only when a slave is decommissioned
from the cluster. My thought for the Tajo query scheduler is this: we can use
YARN as the higher-layer resource manager, with YARN allocating CPU/memory
resources to the Tajo master/querymaster/worker daemons, while a Sparrow-like
scheduler coordinates queries across those resources in a lower layer.
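To make the difference concrete, the sketch below counts the YARN container
requests in the two models, using the numbers from the Spark example above.
It does not call any YARN or Tajo API; DaemonSpec and both functions are
hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DaemonSpec:
    role: str
    memory_gb: int
    count: int


def daemon_container_requests(specs):
    """Coarse-grained model: one long-lived container per daemon instance."""
    return [(s.role, s.memory_gb) for s in specs for _ in range(s.count)]


def per_task_container_requests(num_tasks):
    """Per-task model: one container request for every task attempt."""
    return num_tasks


# Numbers taken from the example in the text: 1 master (10GB), 10 slaves (20GB).
spark_example = [DaemonSpec("master", 10, 1), DaemonSpec("slave", 20, 10)]
requests = daemon_container_requests(spark_example)
total_gb = sum(mem for _, mem in requests)  # 10 + 10 * 20 = 210
```

With these numbers the cluster asks YARN for 11 containers (210GB in total)
once and holds them, however many tasks later run inside them. In the per-task
model the number of YARN requests grows with the number of task attempts.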
Second, using Sparrow directly is not appropriate, for three reasons: 1) Sparrow
needs a scheduler daemon running on each machine, which is inconvenient to
operate; 2) Sparrow supports multitenancy, in other words it has user
authentication, which Tajo doesn't support yet; 3) Sparrow currently can't
kill a job. But the algorithm behind Sparrow is quite suitable for Tajo.
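As a rough illustration of that algorithm, here is a minimal single-process
sketch of batch sampling with late binding as described in the Sparrow paper.
It is a toy simulation, not Sparrow's implementation and not a proposed Tajo
API: Worker, JobScheduler, probe and run are hypothetical names, and the
round-based loop stands in for real task execution.

```python
import random
from collections import deque

BUSY = "busy"  # placeholder for work already queued on a worker


class Worker:
    def __init__(self, wid, backlog=0):
        self.wid = wid
        self.queue = deque([BUSY] * backlog)


class JobScheduler:
    """Holds a job's tasks until some worker is actually ready to run one."""

    def __init__(self, tasks):
        self.pending = deque(tasks)
        self.assigned = {}  # task -> id of the worker that took it

    def get_task(self, wid):
        # Late binding: a task is handed out only when a reservation reaches
        # the head of a worker's queue. Late reservations get None (a no-op).
        if not self.pending:
            return None
        task = self.pending.popleft()
        self.assigned[task] = wid
        return task


def probe(job, workers, rng, d=2):
    """Batch sampling: reserve a slot on d * m random workers for m tasks."""
    n = min(len(workers), d * len(job.pending))
    for w in rng.sample(workers, n):
        w.queue.append(job)


def run(workers):
    """Each round, every worker handles the entry at the head of its queue."""
    while any(w.queue for w in workers):
        for w in workers:
            if w.queue:
                entry = w.queue.popleft()
                if isinstance(entry, JobScheduler):
                    entry.get_task(w.wid)
```

For example, with four workers whose backlogs are 0, 5, 0 and 5 and a job of
two tasks, the two idle workers take both tasks in the first round, and the
reservations on the busy workers later resolve to no-ops. Because the
scheduler is just an object inside one process, a sketch like this needs no
separate daemon per machine.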
> (Umbrella) Implement Tajo Query Scheduler
> -----------------------------------------
>
> Key: TAJO-540
> URL: https://issues.apache.org/jira/browse/TAJO-540
> Project: Tajo
> Issue Type: New Feature
> Reporter: Hyunsik Choi
>
> Currently, there is no Tajo query scheduler. So, all queries launched
> simultaneously compete for cluster resources, which are managed by
> TajoResourceManager.
> In this issue, we will investigate, design, and implement a Tajo query
> scheduler. This is an umbrella issue for that. We will create subtasks for
> them.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)