On 08/08/2023 20:31, Matt Doughty wrote:
So you are trying to get discrete metrics for every run of the batch
job? That sounds like an unbounded cardinality problem, as you would
end up with a time series for every run of the batch job.
Am I misunderstanding or is this accurate?
You're right, I don't need the exact time when the metric is fetched. I only
need it to differentiate between iterations within the batch job. In that
case, is creating a separate metric the best way to go?
If that is the case then Prometheus isn't the right tool. Having
distinctly detectable groups of data for a particular job run indicates
you are talking about events which are quite different to metrics. For
events you'd want to be looking at tools such as Elasticsearch, Loki or
a standard SQL database.
Events and metrics can be (and often are) used in parallel. For example
Prometheus would tell you that the average job runtime is 5 minutes over
the past 3 hours, but you'd then use the events system to find the exact
durations for each run (or the number of events processed, or the error
message returned, etc.).
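As a rough illustration of that split, here is a minimal sketch of the events side using SQLite from the Python standard library. The table name, column names and helper functions are all made up for the example, and SQLite is only standing in for whichever event store you choose. Each job run is stored as one row, so the per-run details stay queryable without creating a time series per run.

```python
import sqlite3

# Hypothetical schema: one row per batch job run (an "event").
SCHEMA = """
CREATE TABLE IF NOT EXISTS job_runs (
    run_id            TEXT PRIMARY KEY,
    duration_seconds  REAL NOT NULL,
    items_processed   INTEGER NOT NULL,
    error_message     TEXT
)
"""


def open_store(path=":memory:"):
    """Open (or create) the event store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_run(conn, run_id, duration_seconds, items_processed, error_message=None):
    """Store the details of a single job run as one event row."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO job_runs VALUES (?, ?, ?, ?)",
            (run_id, duration_seconds, items_processed, error_message),
        )


def average_duration(conn):
    """Aggregate view, similar in spirit to what a metrics system reports."""
    return conn.execute("SELECT AVG(duration_seconds) FROM job_runs").fetchone()[0]


def run_details(conn, run_id):
    """Exact details for one run, which is what the event store is for."""
    return conn.execute(
        "SELECT duration_seconds, items_processed, error_message "
        "FROM job_runs WHERE run_id = ?",
        (run_id,),
    ).fetchone()


conn = open_store()
record_run(conn, "run-001", 290.0, 1200)
record_run(conn, "run-002", 310.0, 1150, "timeout talking to upstream")

print(average_duration(conn))        # aggregate across all runs
print(run_details(conn, "run-002"))  # exact values for one run
```

In practice Prometheus would hold only the aggregate (for example a duration summary or histogram with no per-run label), and you would look up a specific run in the event store when you need the exact numbers.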
--
Stuart Clark