David Wannemacher created OOZIE-1399:
----------------------------------------

             Summary: Reliability and retry for metastore calls
                 Key: OOZIE-1399
                 URL: https://issues.apache.org/jira/browse/OOZIE-1399
             Project: Oozie
          Issue Type: Improvement
          Components: core
    Affects Versions: 3.2.0
            Reporter: David Wannemacher


Oozie needs a mechanism to retry failed metastore queries, along with config 
parameters to govern it. One example is in ActionStartXCommand. An action can 
be submitted and complete successfully, which creates an output directory. If 
the subsequent metastore call to update the job/action then fails, the action 
is retried repeatedly, and every retry fails because the output directory 
already exists.

These problems are exacerbated when using a metastore (SQL Azure in this case) 
that occasionally closes connections due to idleness, throttling, or regular 
server maintenance. 

Hive has a retry mechanism for metastore calls which can be configured with the 
hive.metastore.ds.retry.attempts and hive.metastore.ds.retry.interval 
properties. Oozie should have a similar retry mechanism.
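
As a rough illustration of the kind of mechanism being requested, the sketch 
below retries only the metastore call itself, so a dropped connection does not 
force the whole action to be rerun. It is not existing Oozie code: the class 
name MetastoreRetry, the method callWithRetry, and the maxAttempts and 
intervalMs parameters are hypothetical, loosely modeled on the two Hive 
properties above. Treating SQLException as the transient failure is also an 
assumption; a real implementation would need to decide which exceptions are 
safe to retry.

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Oozie: retries a metastore call a bounded
// number of times with a fixed pause between attempts.
public class MetastoreRetry {

    // maxAttempts and intervalMs are assumed to come from new config
    // parameters, analogous to hive.metastore.ds.retry.attempts and
    // hive.metastore.ds.retry.interval.
    public static <T> T callWithRetry(Callable<T> call, int maxAttempts, long intervalMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (SQLException e) {
                // Assumed transient (for example, a closed connection).
                // Any other exception type propagates immediately.
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(intervalMs);
                }
            }
        }
        // Every attempt failed; surface the most recent failure.
        throw last;
    }
}
```

With a helper along these lines, the job/action update in a command such as 
ActionStartXCommand could be wrapped in callWithRetry, so that only the failed 
update is repeated rather than the already-completed action.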
