[
https://issues.apache.org/jira/browse/OOZIE-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13931400#comment-13931400
]
Srikanth Sundarrajan commented on OOZIE-1533:
---------------------------------------------
[~rohini], unless all coord actions are done, StatusTransitService shouldn't
be updating the coord job, correct? Perhaps we should allow updates to the coord
job only via three routes (1. user action, 2. all coord actions reaching a
completed state, 3. materialization) to prevent StatusTransitService from
playing god.
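A rough sketch of what such a gate could look like. All names here
(CoordUpdateGuard, Route, ActionState, mayUpdateJob) are made up for
illustration and are not existing Oozie classes; the state list is simplified.

```java
import java.util.List;

// Hypothetical sketch: none of these types are Oozie classes.
final class CoordUpdateGuard {
    /** Who is asking to update the coord job (illustrative names). */
    enum Route { USER_ACTION, MATERIALIZATION, STATUS_TRANSIT }

    /** Simplified stand-in for coord action states. */
    enum ActionState {
        WAITING(false), RUNNING(false), SUCCEEDED(true), FAILED(true), KILLED(true);

        final boolean terminal;

        ActionState(boolean terminal) {
            this.terminal = terminal;
        }
    }

    /**
     * User actions and materialization may always update the job.
     * The status transit route may update it only once every
     * materialized action has reached a terminal state.
     */
    static boolean mayUpdateJob(Route route, List<ActionState> actions) {
        if (route != Route.STATUS_TRANSIT) {
            return true;
        }
        if (actions.isEmpty()) {
            return false; // nothing materialized yet, so nothing to conclude
        }
        for (ActionState state : actions) {
            if (!state.terminal) {
                return false;
            }
        }
        return true;
    }
}
```

With this shape, a status transit pass over a job that still has a RUNNING
action becomes a no-op instead of a write to the coord job.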
{quote}
One problem that needs to be addressed before this was that there are lot of
places in code where coord job is updated
{quote}
Regarding CoordActionInputCheckXCommand, you bring up a really important
concern, but throttling it through a coord lock would generally bring down
throughput, and it might be useful to keep it free of this lock. We should
look at options for performing bulk input checks to improve the scalability of
this operation without hurting the NN / DB.
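One possible shape for a bulk check, as a sketch only: group the input paths
by parent directory and list each directory once, rather than issuing one
existence call per path. BulkInputCheck, groupByParent, findMissing and the
listDir callback are hypothetical names, not Oozie or Hadoop APIs, and the
sketch assumes absolute paths.

```java
import java.util.Collection;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical sketch: not an Oozie or Hadoop API. Assumes absolute paths.
final class BulkInputCheck {
    /** Groups absolute paths by parent directory: parent -> child names. */
    static Map<String, Set<String>> groupByParent(Collection<String> paths) {
        Map<String, Set<String>> byParent = new TreeMap<>();
        for (String path : paths) {
            int slash = path.lastIndexOf('/');
            String parent = slash <= 0 ? "/" : path.substring(0, slash);
            String name = path.substring(slash + 1);
            byParent.computeIfAbsent(parent, k -> new TreeSet<>()).add(name);
        }
        return byParent;
    }

    /**
     * Returns the paths that are absent. listDir is called once per distinct
     * parent directory and must return the names present in that directory.
     */
    static Set<String> findMissing(Collection<String> paths,
                                   Function<String, Set<String>> listDir) {
        Set<String> missing = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : groupByParent(paths).entrySet()) {
            String parent = entry.getKey();
            Set<String> present = listDir.apply(parent);
            for (String name : entry.getValue()) {
                if (!present.contains(name)) {
                    missing.add("/".equals(parent) ? "/" + name : parent + "/" + name);
                }
            }
        }
        return missing;
    }
}
```

Whether a directory listing is actually cheaper than individual checks on the
NN depends on directory size, so this would need measuring before adopting.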
In practice I found that most commands check the coord status in
verifyPrecondition(), so the odds of a coord action running while the coord
job is in the killed state due to a user interrupt are negligible; however, the
possibility does exist.
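To illustrate the pattern and the remaining window, here is a minimal sketch.
The types below only mimic the shape of a command with a precondition hook;
PreconditionSketch, CoordJob and ActionStartCommand are hypothetical and are
not Oozie's actual command classes.

```java
// Hypothetical sketch: these types only mimic a command with a precondition
// hook; they are not Oozie's command classes.
final class PreconditionSketch {
    enum JobStatus { RUNNING, SUSPENDED, KILLED }

    static final class PreconditionException extends Exception {
        PreconditionException(String message) {
            super(message);
        }
    }

    /** Stand-in for the coord job record; status may change concurrently. */
    static final class CoordJob {
        volatile JobStatus status = JobStatus.RUNNING;
    }

    static final class ActionStartCommand {
        private final CoordJob job;
        private final String actionId;

        ActionStartCommand(CoordJob job, String actionId) {
            this.job = job;
            this.actionId = actionId;
        }

        /** Rejects the command if the parent coord job was killed. */
        void verifyPrecondition() throws PreconditionException {
            if (job.status == JobStatus.KILLED) {
                throw new PreconditionException(
                        "coord job is KILLED, skipping " + actionId);
            }
        }

        String call() throws PreconditionException {
            verifyPrecondition();
            // A kill that lands here, after the check and without a job
            // level lock, is the small window that still remains.
            return "started " + actionId;
        }
    }
}
```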
{quote}
Another thing is interrupt commands like coord kill, etc will not be processed
earlier if the lock is changed to the action id.
{quote}
> Coordinator action materialization is too slow due to coarse job level locks
> ----------------------------------------------------------------------------
>
> Key: OOZIE-1533
> URL: https://issues.apache.org/jira/browse/OOZIE-1533
> Project: Oozie
> Issue Type: Improvement
> Reporter: Srikanth Sundarrajan
> Assignee: Srikanth Sundarrajan
> Labels: locking
> Attachments: OOZIE-1533.patch
>
>
> Coord job level lock introduces high contention. Instead, introduce coord
> action level locking whenever appropriate.