[ https://issues.apache.org/jira/browse/SLING-5965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426473#comment-15426473 ]
Stefan Egli commented on SLING-5965:
------------------------------------

[~chetanm], I can go via JMX, yes. However, it would be good to release commons.metrics 1.0.2 (see [mail on list|http://markmail.org/message/gm6bxavnxumhz4pj]), since in 1.0.0 metrics are not exposed via JMX.

> Metrics and a Health-Check for Scheduler to detect long-running Quartz-Jobs
> ---------------------------------------------------------------------------
>
>                 Key: SLING-5965
>                 URL: https://issues.apache.org/jira/browse/SLING-5965
>             Project: Sling
>          Issue Type: New Feature
>          Components: Commons
>    Affects Versions: Commons Scheduler 2.5.0
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>             Fix For: Commons Scheduler 2.5.2
>
>         Attachments: SLING-5965.patch
>
>
> Sling Scheduler jobs (aka Quartz-Jobs) should typically be fast-running jobs.
> They are served from a thread-pool and should occupy a thread only for a
> short amount of time.
> If there are 'misbehaving' Quartz-Jobs that run for a very long time, they
> occupy threads from that thread-pool and thus degrade the performance of
> other scheduled/Quartz-Jobs.
> We should have metrics (using
> [sling.commons.metrics|https://sling.apache.org/documentation/bundles/metrics.html])
> that provide information about the internals of Sling Scheduler, such as the
> average and max duration of scheduled jobs, how many jobs are currently
> running, and how long the oldest running job has been running.
> Based on this, a Health-Check can monitor the 'oldest job running' metric and
> flag {{critical}} when e.g. the oldest job is older than {{60'000ms}}
> (configurable, default).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
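
The 'oldest job running' check described in the issue could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in SLING-5965.patch: the class `RunningJobTracker` and all of its methods are hypothetical names, and the sketch uses plain Java rather than the sling.commons.metrics, Health-Check or Quartz APIs. It records a start timestamp per running job, derives the age of the oldest one, and reports {{critical}} once that age exceeds a configurable threshold.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; not part of Sling Commons Scheduler or Quartz. */
class RunningJobTracker {

    // Job id -> start timestamp in milliseconds (assumed to be supplied by the caller).
    private final Map<String, Long> startTimes = new ConcurrentHashMap<>();

    /** Record that a job started at the given time. */
    void jobStarted(String jobId, long nowMillis) {
        startTimes.put(jobId, nowMillis);
    }

    /** Forget a job once it has finished. */
    void jobFinished(String jobId) {
        startTimes.remove(jobId);
    }

    /** Number of jobs currently running. */
    int runningCount() {
        return startTimes.size();
    }

    /** Age in milliseconds of the oldest running job, or 0 if nothing is running. */
    long oldestRunningAgeMillis(long nowMillis) {
        long oldestStart = nowMillis;
        for (long start : startTimes.values()) {
            oldestStart = Math.min(oldestStart, start);
        }
        return nowMillis - oldestStart;
    }

    /** Returns "critical" if the oldest running job is older than the threshold, else "ok". */
    String check(long nowMillis, long thresholdMillis) {
        return oldestRunningAgeMillis(nowMillis) > thresholdMillis ? "critical" : "ok";
    }
}
```

In a real implementation the start and finish callbacks would presumably be driven by the scheduler itself, and the threshold would come from configuration with the 60'000ms default mentioned in the issue.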