Hi,

I have an SLA metric that needs to be 99.95% or above. I am using the expression

100-(((30*24*60*60) - increase(process_uptime_seconds{job="Interop-InboundApi"}[30d]))/(30*24*60*60))*100

which runs for 15 minutes. In other words, the difference between the total number of seconds in 30 days and the number of seconds the server was up in the last 30 days should be less than 0.05% of the total.

I am having difficulty writing a test for this, because the alert rules test doesn't accept '1d' as the interval. Should I instead use an interval of 1m with values: '0+60x43200', so that the number of entries equals the number of minutes in 30 days? Also, what eval_time should I use in this case? I am using 15m, but that doesn't yield the required result.
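
To make the numbers above concrete, here is a small Python sketch of the arithmetic only. It is not part of the alert rule or the test file, and the function and constant names are made up for illustration. It assumes the 'a+bxn' notation expands to the initial value plus n increments, and it uses integer basis points for the 99.95% target to avoid floating-point rounding.

```python
# Illustration of the arithmetic behind the 30-day uptime SLA.
# All names here are hypothetical; nothing below is a Prometheus API.

SECONDS_PER_DAY = 24 * 60 * 60
WINDOW_SECONDS = 30 * SECONDS_PER_DAY   # 30*24*60*60 = 2,592,000
SLA_TARGET_BASIS_POINTS = 9995          # 99.95% expressed in 1/100 of a percent


def availability_percent(uptime_increase_seconds, window_seconds=WINDOW_SECONDS):
    """Mirror of the expression 100 - ((window - uptime) / window) * 100."""
    missing = window_seconds - uptime_increase_seconds
    return 100 - (missing / window_seconds) * 100


def expand_series(start, step, count):
    """Expand 'start+stepxcount' into a list of values.

    Assumption: the notation yields the initial value plus `count`
    increments, i.e. count + 1 samples in total.
    """
    return [start + step * i for i in range(count + 1)]


# Downtime budget: 0.05% of the 30-day window, in seconds.
allowed_downtime_seconds = WINDOW_SECONDS * (10000 - SLA_TARGET_BASIS_POINTS) // 10000

# '0+60x43200' at a 1m interval: the counter grows by 60 each minute.
samples = expand_series(0, 60, 43200)

if __name__ == "__main__":
    print("window seconds:", WINDOW_SECONDS)
    print("allowed downtime (s):", allowed_downtime_seconds)
    print("number of samples:", len(samples))
    print("last sample value:", samples[-1])
    print("availability with full uptime:", availability_percent(samples[-1]))
```

With these inputs the last sample equals the full window, so the series covers minutes 0 through 43200, and the downtime budget works out to 1296 seconds (21.6 minutes) per 30 days.
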
I have a similar problem with the latency SLA. I am using a histogram for that and am trying to get the percentage of messages that fall within the 1-second bucket. I am using the expression below:

sum(rate(http_server_requests_seconds_bucket{le="1.0",uri="/inboundapi/message/v2"}[30d])) by (job) / sum(rate(http_server_requests_seconds_count{uri="/inboundapi/message/v2"}[30d])) by (job) * 100

To test this as well, I need something similar to the case above.

Thanks,
Debashish