[
https://issues.apache.org/jira/browse/TAJO-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13912333#comment-13912333
]
Hyunsik Choi commented on TAJO-602:
-----------------------------------
I've attached the patch for this issue. First of all, I'm sorry that this patch
also separates the state part from TajoWorkerResourceManager.
I tried to split that work into another jira issue, but it overlapped heavily
with this issue (TAJO-602). This is because TAJO-602 is about separating the
heartbeat service from WorkerResourceManager, and the state machine mostly
serves the heartbeat service. Instead, I made an effort to narrow the scope of
changes and to keep the existing logic as much as possible, although I found
some parts which should be improved. This patch mostly changes
WorkerResourceManager and TajoWorkerResourceManager, so I believe that it will
mostly not affect Min's work (TAJO-603).
During this work, I found some issues to be improved. Later, I'll create
additional issues for them.
In detail, this patch does the following:
* Separate the heartbeat service from TajoWorkerResourceManager into
TajoResourceTracker class
* Separate the worker information from WorkerResource into Worker class
* Separate the ping expired node checker into WorkerLivelinessMonitor which
extends AbstractLivelinessMonitor
* Separate the resource heartbeat from TajoHeartbeat into NodeHeartbeat
** Add one more heartbeat protocol and its service for that
* Add TajoRMContext which contains active or inactive worker list managed by
TajoWorkerResourceManager.
* Extract the part to choose and start QueryMaster from WorkerResourceManager
into QueryInProgress.
** I still keep allocateQueryMaster method in order to avoid radical API
changes.
** But, allocateQueryMaster is changed to internally use
allocateWorkerResources to allocate resources.
** In other words, unlike before, TajoWorkerResourceManager manages resources
for both QueryMaster and worker in one resource pool management.
* Separate the heartbeat report service from Worker into WorkerHeartbeatService
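
To illustrate the single-pool idea in the list above, here is a minimal sketch,
not the code in the patch. The class name ResourcePoolSketch, the slot-counting
model, and the method signatures are all assumptions made for illustration;
only the method names allocateQueryMaster and allocateWorkerResources come from
the description above.

```java
// Hypothetical sketch: a slot-counting pool, not Tajo's actual resource model.
class ResourcePoolSketch {
    private final int totalSlots;
    private int freeSlots;

    ResourcePoolSketch(int totalSlots) {
        this.totalSlots = totalSlots;
        this.freeSlots = totalSlots;
    }

    // General allocation path used for worker (task runner) resources.
    synchronized boolean allocateWorkerResources(int slots) {
        if (slots <= 0 || slots > freeSlots) {
            return false; // invalid request or not enough capacity left
        }
        freeSlots -= slots;
        return true;
    }

    // Kept as a separate entry point to avoid API changes, but it draws
    // from the same pool by delegating to allocateWorkerResources.
    // Assumption: a QueryMaster costs exactly one slot.
    synchronized boolean allocateQueryMaster() {
        return allocateWorkerResources(1);
    }

    // Return slots to the pool, never exceeding the configured total.
    synchronized void release(int slots) {
        if (slots > 0) {
            freeSlots = Math.min(totalSlots, freeSlots + slots);
        }
    }

    synchronized int freeSlots() {
        return freeSlots;
    }
}
```

Because both entry points share one counter, a QueryMaster request and a worker
request compete for the same capacity, which is the behavior change described
above.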
The unit tests pass successfully. I've tested membership management of
active/inactive workers, resource heartbeat, and some queries in a local
cluster.
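
The active/inactive membership handling can be pictured roughly as below. This
is a plain-Java stand-in, not the WorkerLivelinessMonitor in the patch (which
extends AbstractLivelinessMonitor); the class name LivelinessSketch, its
methods, and the explicit timestamp parameters are assumptions chosen to keep
the example self-contained and deterministic.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of ping-expiry tracking; all names are illustrative.
class LivelinessSketch {
    private final long expiryMillis;
    // Active workers mapped to the time of their last heartbeat.
    private final Map<String, Long> lastHeartbeat = new HashMap<>();
    // Workers whose heartbeat expired.
    private final Set<String> inactive = new HashSet<>();

    LivelinessSketch(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    // A heartbeat makes the worker active again, even if it had expired.
    synchronized void receivedPing(String workerId, long nowMillis) {
        inactive.remove(workerId);
        lastHeartbeat.put(workerId, nowMillis);
    }

    // Move every worker whose last heartbeat is older than the expiry
    // interval from the active map to the inactive set.
    synchronized void expireStale(long nowMillis) {
        Iterator<Map.Entry<String, Long>> it = lastHeartbeat.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > expiryMillis) {
                inactive.add(entry.getKey());
                it.remove();
            }
        }
    }

    synchronized boolean isActive(String workerId) {
        return lastHeartbeat.containsKey(workerId);
    }

    synchronized boolean isInactive(String workerId) {
        return inactive.contains(workerId);
    }
}
```

In a real service the expiry check would run periodically on its own thread;
here the caller passes the current time so the behavior is easy to follow.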
In addition, I don't think I have a good sense for naming. If you can suggest
better names for the newly added classes, I'd appreciate it.
Thanks!
> WorkerResourceManager should be broke down into 3 parts
> --------------------------------------------------------
>
> Key: TAJO-602
> URL: https://issues.apache.org/jira/browse/TAJO-602
> Project: Tajo
> Issue Type: Sub-task
> Affects Versions: 1.0-incubating
> Reporter: Min Zhou
> Assignee: Hyunsik Choi
> Fix For: 1.0-incubating
>
> Attachments: TAJO-602.patch
>
>
> Before implementing a scheduler, I think we should do some refactoring
> first. There are 2 interfaces and 4 classes related to resource management:
> WorkerResourceManager, YarnTajoResourceManager, and
> TajoWorkerResourceManager, which reside in TajoMaster, and ResourceAllocator,
> YarnResourceAllocator, and TajoResourceAllocator, which reside in QueryMasters.
> WorkerResourceManager actually plays 3 roles:
> 1. Choose or start a QueryMaster for a query, and manage it
> 2. Allocate resources for query tasks / task runners (only for standalone mode)
> 3. Handle worker's heartbeat (only for standalone mode)
> If the scheduler is a decentralized one, like sparrow, we can allocate
> resources for a QueryMaster in the same way as for a TaskRunner. So 1. and 2.
> can use the same interface, but be called by 2 different callers. 3. is
> different from the others; we can create another service, let's say
> HeartbeatService, to deal with workers' heartbeats.
> Any suggestion?