Gilad Wolff created OOZIE-1828:
----------------------------------

             Summary: Introduce counters JobStatus terminal states metrics
                 Key: OOZIE-1828
                 URL: https://issues.apache.org/jira/browse/OOZIE-1828
             Project: Oozie
          Issue Type: Bug
          Components: monitoring
    Affects Versions: 4.0.0
            Reporter: Gilad Wolff


Currently the Oozie server exposes job status metrics from the 'variables' 
group. These include metrics for jobs in terminal states: 'SUCCEEDED', 
'FAILED', 'KILLED'. Oozie computes these metrics by querying the database 
for the number of jobs in each state. This means that when a purge happens, 
the values of these "apparent" counters change (if anything was purged), 
which makes them of little use as counters.

It would be better if the Oozie server exposed real counters for jobs in 
terminal states. One way to do this would be to initialize in-memory counters 
and a timestamp, then periodically count the jobs that finished between the 
timestamp and 'now' and add them to the counters (advancing the timestamp each 
time so that it never falls out of the retention period). This means that each 
Oozie server may have its own counter values, but that is okay: the count 
itself is not very important; what is important is the rate of change.
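
A minimal sketch of the idea, not a proposed patch. All names here 
(TerminalStateCounters, FinishedJobSource, countFinished, poll) are 
hypothetical and are not existing Oozie APIs; FinishedJobSource stands in for 
whatever database query would count the jobs that reached a given status 
within a time window.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: none of these types exist in Oozie.
class TerminalStateCounters {

    /** Stand-in for a DB query: jobs that reached `status` in (fromExclusive, toInclusive]. */
    interface FinishedJobSource {
        long countFinished(String status, long fromExclusive, long toInclusive);
    }

    private final Map<String, AtomicLong> counters = new LinkedHashMap<>();
    private final FinishedJobSource source;
    private long lastPolled; // timestamp up to which jobs have already been counted

    TerminalStateCounters(FinishedJobSource source, long startMillis) {
        this.source = source;
        this.lastPolled = startMillis;
        for (String status : new String[] {"SUCCEEDED", "FAILED", "KILLED"}) {
            counters.put(status, new AtomicLong());
        }
    }

    /**
     * Adds the jobs that finished since the last poll to the in-memory counters,
     * then advances the timestamp. Polling more often than the purge retention
     * period keeps the window inside the data that is still in the database.
     */
    synchronized void poll(long nowMillis) {
        if (nowMillis <= lastPolled) {
            return; // nothing new to count
        }
        for (Map.Entry<String, AtomicLong> e : counters.entrySet()) {
            e.getValue().addAndGet(source.countFinished(e.getKey(), lastPolled, nowMillis));
        }
        lastPolled = nowMillis;
    }

    /** Current counter value; it only grows, so a purge cannot decrease it. */
    long get(String status) {
        return counters.get(status).get();
    }
}
```

Because each poll only adds the jobs from the new window, rows purged later 
do not affect what has already been counted. The counters start from zero on 
each server, which is acceptable under the reasoning above since only the 
rate of change is of interest.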



--
This message was sent by Atlassian JIRA
(v6.2#6252)
