[ https://issues.apache.org/jira/browse/OOZIE-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Purshotam Shah updated OOZIE-1844:
----------------------------------
    Attachment: OOZIE-1844-V2.patch

> HA - Lock mechanism for CoordMaterializeTriggerService ( may be for other services as well)
> --------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-1844
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1844
>             Project: Oozie
>          Issue Type: Bug
>          Components: HA
>            Reporter: Purshotam Shah
>            Assignee: Purshotam Shah
>         Attachments: OOZIE-1844-V2.patch
>
>
> Currently we check whether a job id belongs to this server by using a modulus operation.
> This may not be the optimal approach:
> 1. We are not processing MATERIALIZATION_SYSTEM_LIMIT jobs; each server does only
> half of the processing (in the case of two servers). We can always double the
> limit, but whenever we add a new server we need to restart the whole cluster to
> increase the limit.
> 2. The job sequence id is shared among wf, coord, and bundle jobs, so we could have a
> case where more coords have odd (or even) ids. In that case the load is not
> distributed evenly, and one server will always do more processing.
> 3. Different coord jobs also have different frequencies. A job with a 1 min
> or 5 min frequency puts more load on the system. With this approach a
> particular job always runs on one server, eventually putting more load
> on that server.
> A simple way to optimize this may be a lock mechanism: each
> CoordMaterializeTriggerService obtains a lock and materializes coords. If the
> lock is held by another server, it waits for that server to release the
> lock. In this way coord jobs get distributed among the servers.
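
To make the contrast in the description concrete, here is a minimal Java sketch of the two selection strategies. It is not taken from the attached patch or from the Oozie code base: the class and method names are hypothetical, `LIMIT` is an illustrative stand-in for MATERIALIZATION_SYSTEM_LIMIT, the "look at the first LIMIT jobs, then filter by modulus" behaviour is an assumption about how the current check is applied, and a `ReentrantLock` stands in for whatever cluster-wide lock an HA deployment would actually use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; none of these names come from the Oozie code base.
class MaterializeSketch {
    // Stand-in for MATERIALIZATION_SYSTEM_LIMIT (value chosen for illustration).
    static final int LIMIT = 4;

    // In-process stand-in for a cluster-wide lock shared by all servers.
    static final ReentrantLock LOCK = new ReentrantLock();

    // Modulus approach (assumed): look at the first LIMIT pending jobs and
    // keep only the ids that map to this server.
    static List<Integer> pickByModulus(List<Integer> pending, int serverIndex, int numServers) {
        List<Integer> picked = new ArrayList<>();
        for (int id : pending.subList(0, Math.min(LIMIT, pending.size()))) {
            if (id % numServers == serverIndex) {
                picked.add(id);
            }
        }
        return picked;
    }

    // Lock approach: whichever server holds the lock claims up to LIMIT jobs,
    // removing them from the shared pending list so no other server repeats them.
    static List<Integer> pickUnderLock(List<Integer> pending) {
        LOCK.lock(); // blocks while another server holds the lock
        try {
            List<Integer> picked = new ArrayList<>();
            while (!pending.isEmpty() && picked.size() < LIMIT) {
                picked.add(pending.remove(0));
            }
            return picked;
        } finally {
            LOCK.unlock();
        }
    }

    public static void main(String[] args) {
        // Coord ids skewed toward even numbers because the sequence is shared.
        List<Integer> ids = Arrays.asList(2, 4, 6, 7, 8, 10);
        System.out.println("modulus, server 0: " + pickByModulus(ids, 0, 2));
        System.out.println("modulus, server 1: " + pickByModulus(ids, 1, 2));

        List<Integer> pending = new ArrayList<>(ids);
        System.out.println("lock, first holder:  " + pickUnderLock(pending));
        System.out.println("lock, second holder: " + pickUnderLock(pending));
    }
}
```

With the sample ids, the modulus check gives server 0 three jobs (2, 4, 6) and server 1 a single job (7), while under the lock the first holder claims a full batch of four (2, 4, 6, 7) and the next holder takes the remaining two (8, 10). In a real HA setup the lock would have to be visible to every server, and the claimed jobs would be recorded in the shared database rather than removed from an in-memory list.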