[
https://issues.apache.org/jira/browse/OOZIE-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906783#comment-13906783
]
Srikanth Sundarrajan commented on OOZIE-1533:
---------------------------------------------
Currently, locks are held for the various coord-action commands as follows:
||Command||Lock (entity-key)||
|CoordActionCheckXCommand|coord-action-id|
|CoordActionInfoXCommand|no-locks|
|CoordActionInputCheckXCommand|coord-job-id|
|CoordActionMaterializeCommand|RANDOM("coord_action_mater" + UUID())|
|CoordActionNotificationXCommand|RANDOM("coord_action_notification" + UUID())|
|CoordActionReadyXCommand|coord-job-id|
|CoordActionsKillXCommand|coord-job-id|
|CoordActionStartXCommand|coord-job-id|
|CoordActionTimeOutXCommand|coord-action-id|
|CoordActionUpdatePushMissingDependency|coord-action-id|
|CoordActionUpdateXCommand|coord-job-id|
I intend to put up a patch changing the locks for the following commands:
||Command||Lock (entity-key)||
|CoordActionInputCheckXCommand|coord-action-id|
|CoordActionReadyXCommand|coord-action-id|
|CoordActionStartXCommand|coord-action-id|
|CoordActionUpdateXCommand|coord-action-id|
It seems these commands were using coord-job-id level locks to prevent an
action from starting while the parent coord is in the killed or paused state.
But from a correctness standpoint, running these commands while the coord is
killed / paused has no impact, except perhaps in CoordActionStartXCommand.
Meanwhile, holding the lock at the coord-job-id level isn't all that helpful:
it unnecessarily forces serial execution of independent coord-action commands
that each work only on their own specific action.
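To illustrate the contention argument, here is a minimal Java sketch that counts how many commands can hold their lock at the same time under each keying scheme. It is not Oozie code: the class `LockGranularitySketch`, the semaphore-backed lock map, and the job id are hypothetical stand-ins, and action ids are assumed to have the form job-id@N.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Hypothetical stand-in for a lock service keyed by entity key; not Oozie's implementation.
class LockGranularitySketch {
    // One binary semaphore per entity key, so a second holder of the same key is refused.
    private final Map<String, Semaphore> locks = new ConcurrentHashMap<>();

    // Tries to take the lock for the given key without blocking.
    boolean tryLock(String entityKey) {
        return locks.computeIfAbsent(entityKey, k -> new Semaphore(1)).tryAcquire();
    }

    // Issues one command per action of a single coord job and counts how many
    // of them hold their lock simultaneously (no lock is released in between).
    public static int concurrentHolders(String jobId, int actionCount, boolean lockPerAction) {
        LockGranularitySketch service = new LockGranularitySketch();
        int holders = 0;
        for (int i = 1; i <= actionCount; i++) {
            String actionId = jobId + "@" + i;               // assumed action-id format
            String key = lockPerAction ? actionId : jobId;   // proposed vs. current entity key
            if (service.tryLock(key)) {
                holders++;
            }
        }
        return holders;
    }

    public static void main(String[] args) {
        String jobId = "0000001-C"; // hypothetical coord job id
        System.out.println("coord-job-id lock:    " + concurrentHolders(jobId, 5, false)); // prints 1
        System.out.println("coord-action-id lock: " + concurrentHolders(jobId, 5, true));  // prints 5
    }
}
```

With five actions of the same coord job, the job-level key lets only one command proceed at a time, while the action-level key lets all five proceed independently.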
Are there any concerns?
> Coordinator action materialization is too slow due to coarse job level locks
> ----------------------------------------------------------------------------
>
> Key: OOZIE-1533
> URL: https://issues.apache.org/jira/browse/OOZIE-1533
> Project: Oozie
> Issue Type: Improvement
> Reporter: Srikanth Sundarrajan
>
> Coord job level lock introduces high contention. Instead introduce coord
> action level locking whenever appropriate