Re: [prometheus-users] collect non-metrics data

2023-02-13 Thread Christoph Anton Mitterer
Hey Ben.

On Saturday, February 11, 2023 at 11:18:44 AM UTC+1 Ben Kochie wrote:

> You combine this with an "info" metric that tells you about the rest of the
> device.

Ah, and I assume that one could also export such info metrics alongside,
e.g., node_md_state?

Thanks :-)
Chris.

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/d3abe98c-6b2b-4471-a981-8b99936b7dd4n%40googlegroups.com.


Re: [prometheus-users] collect non-metrics data

2023-02-11 Thread Ben Kochie
Typically those values are exposed as booleans/states.

For example, the mdadm collector in node_exporter exposes metrics like this:

# HELP node_md_state Indicates the state of md-device.
# TYPE node_md_state gauge
node_md_state{device="md0",state="active"} 1
node_md_state{device="md0",state="check"} 0
node_md_state{device="md0",state="inactive"} 0
node_md_state{device="md0",state="recovering"} 0
node_md_state{device="md0",state="resync"} 0

You combine this with an "info" metric that tells you about the rest of the
device.

For example, there is `node_os_info`, which reads from os-release data
(/etc/os-release).

# HELP node_os_info A metric with a constant '1' value labeled by build_id, id, id_like, image_id, image_version, name, pretty_name, variant, variant_id, version, version_codename, version_id.
# TYPE node_os_info gauge
node_os_info{build_id="",id="ubuntu",id_like="debian",image_id="",image_version="",name="Ubuntu",pretty_name="Ubuntu 20.04.2 LTS",variant="",variant_id="",version="20.04.2 LTS (Focal Fossa)",version_codename="focal",version_id="20.04"} 1

PromQL allows you to do joins, kinda like SQL, in order to match this
information onto an alert.
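
As a rough sketch of such a join (not taken from the thread): assuming both
series are scraped from the same target and therefore share the `instance`
and `job` labels, and assuming `state="inactive"` is the condition worth
alerting on, the OS description could be copied onto the RAID state series
roughly like this. The choice of `pretty_name` as the label to copy is only
for illustration.

```promql
# Sketch under the assumptions above: select md devices whose "inactive"
# state is set, then copy pretty_name from node_os_info onto the result.
# node_os_info has a constant value of 1, so the multiplication leaves
# the left-hand value unchanged.
(node_md_state{state="inactive"} == 1)
  * on (instance, job) group_left (pretty_name)
node_os_info
```

Used as an alert expression, each resulting series would keep the `device`
and `state` labels from node_md_state and gain `pretty_name`, so the extra
text is available as a label on the alert.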



[prometheus-users] collect non-metrics data

2023-02-11 Thread Christoph Anton Mitterer
Hey.

I wonder whether the following is possible with Prometheus. I'm thinking
about possibly phasing out Icinga and doing all alerting in Prometheus.

For checks that are clearly metrics-based (like load or free disk space),
this seems rather easy.

But what about checks that are not really based on metrics?
Take check_raid, for example, which reports an error if any RAID has lost a
disk or similar.

Of course, one could always try to make a metric out of it. In the example
above, one could use the number of inconsistent RAIDs as the metric.

But what one actually wants from such checks is additional (typically
purely textual) information, such as, in the example above, which HDD
(enclosure, bay number, ... or the serial number) has failed.
I also have numerous other checks that test for things which are not
really numeric and whose output is a string.

Is there any (good) way to get that done with Prometheus, or is it simply
not meant for that specific use case?

Thanks,
Chris.
