Maxim Kolesnikov created OOZIE-3566:
---------------------------------------
Summary: Wrong numbers provided by Oozie monitoring
webservices.requests sampler
Key: OOZIE-3566
URL: https://issues.apache.org/jira/browse/OOZIE-3566
Project: Oozie
Issue Type: Bug
Reporter: Maxim Kolesnikov
Attachments: unnamed.png
webservices.requests sampler based on
{code:java}
org.apache.oozie.servlet.JsonRestServlet#TOTAL_REQUESTS_SAMPLER_COUNTER{code}
may produce wrong results if
{code:java}
org.apache.oozie.servlet.JsonRestServlet#service{code}
rises an Exception during request URL validation.
TOTAL_REQUESTS_SAMPLER_COUNTER is incremented in try block and decremented in
finally block right after that. But as validateRestUrl comes before the counter
is incremented it may lead to situation when counter is decremented without
being incremented before.
Attached image is an example graph of webservices.requests being drawn in
Grafana via Graphite. Pay attention to Y-axis. While most nodes publish
positive numbers close to 0, some others seems to be shifted off by 1, with
results close to -1, util two of these nodes jump to numbers close to -2 around
14:57. At this exact time both servers have URL validation errors in their
logs, like this:
{code:java}
2019-11-29 14:57:29,394 WARN V2JobServlet:523 - USER[-] GROUP[-] TOKEN[-]
APP[-] JOB[-] ACTION[-] URL[GET <SANITIZED>] error[E0301], E0301: Invalid
resource [<SANITIZED>]{code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)