Hi folks.
We have configured remote write to a VictoriaMetrics cluster and, for a 
reason we have not been able to identify, we started getting this error 
message: *Dropped sample for series that was not explicitly dropped via 
relabelling*

Our Prometheus instances run in a Kubernetes cluster.

At first we guessed the cause was CPU throttling, so we set GOMAXPROCS to 
the same value as the CPU limit (in our configuration the CPU limit and 
request have the same value).
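
To be clear about what I mean by matching GOMAXPROCS to the CPU limit, here 
is a minimal Go sketch of the mapping. The millicore input, the value 2000 
and the helper name procsForCPULimit are illustrative assumptions, not our 
actual configuration or anything taken from Prometheus.

```go
package main

import (
	"fmt"
	"runtime"
)

// procsForCPULimit converts a CPU limit given in millicores (hypothetical
// input, e.g. 2000 for a limit of 2 CPUs) into a GOMAXPROCS value.
// Fractional CPUs are rounded down, with a minimum of 1.
func procsForCPULimit(milliCPU int64) int {
	n := int(milliCPU / 1000)
	if n < 1 {
		return 1
	}
	return n
}

func main() {
	limit := int64(2000) // assumed CPU limit, for illustration only
	want := procsForCPULimit(limit)

	// runtime.GOMAXPROCS sets the new value and returns the previous one;
	// calling it with 0 only reads the current setting.
	prev := runtime.GOMAXPROCS(want)
	fmt.Printf("GOMAXPROCS changed from %d to %d\n", prev, runtime.GOMAXPROCS(0))
}
```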

CPU throttling decreased, and we then enabled the auto-gomaxprocs feature, 
but neither change resolved the problem.

The next step was to delete the WAL files from the Prometheus instances, 
and something curious happened.
When I delete the WAL files and then delete the pods one after another, the 
dropped-sample messages disappear. But if I delete the WAL files and run a 
rollout (delete the first pod, wait for it to become ready, and only then 
delete the other pod), the last pod deleted (the one that kept running 
until pod 1 was deleted) still logs the dropped-sample message. I then 
tried deleting the WAL again and deleting the pods manually in sequence, 
and the message did not appear after the processes started.

Looking at the source code 
https://github.com/prometheus/prometheus/blob/main/storage/remote/queue_manager.go#L579

this message can happen in two situations:
1. When the labels for the series cannot be recovered, and
2. When the reference ID for the sample is not found.
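
To make my reading of that code path concrete, here is a simplified, 
stand-alone Go sketch. It is not the real Prometheus code: the types sample 
and queue, their fields, and the classify method are hypothetical names I 
chose for illustration. As I understand it, labels are looked up by the 
sample's series reference, and the message is logged when that lookup fails 
and the reference is also not recorded as intentionally dropped by 
relabelling.

```go
package main

import "fmt"

// sample is a hypothetical, simplified stand-in for a sample read from the WAL.
type sample struct {
	ref   uint64
	value float64
}

// queue is a simplified model of the remote-write queue state,
// not the real QueueManager.
type queue struct {
	seriesLabels  map[uint64]string   // ref -> labels of series the queue knows about
	droppedSeries map[uint64]struct{} // refs dropped on purpose by relabelling
}

// classify reports what happens to a sample in this model.
func (q *queue) classify(s sample) string {
	if _, ok := q.seriesLabels[s.ref]; ok {
		return "sent"
	}
	if _, ok := q.droppedSeries[s.ref]; ok {
		return "dropped by relabelling"
	}
	// Neither map knows the reference: this is the case the log message describes.
	return "dropped: series not explicitly dropped via relabelling"
}

func main() {
	q := &queue{
		seriesLabels:  map[uint64]string{1: `{__name__="up"}`},
		droppedSeries: map[uint64]struct{}{2: {}},
	}
	for _, s := range []sample{{ref: 1}, {ref: 2}, {ref: 3}} {
		fmt.Printf("ref=%d -> %s\n", s.ref, q.classify(s))
	}
}
```

In this model, only the sample with ref 3 ends up in the unexpected-drop 
branch, because its reference appears in neither map.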

My questions are:
What causes these two situations?
How can we avoid them?
How can we resolve them?


