On 22/12/2020 04:52, Mohan Nagandlla wrote:
Hi team, I am running a Prometheus instance with more than 1 replica. With 1 replica there is no WAL corruption in the data directory, but when I raise the replica count to 2 for the sake of zero-downtime updates, the instance comes up and WAL corruption starts happening. The logs are below:

level=error ts=2020-12-22T04:36:00.860Z caller=scrape.go:1076 component="scrape manager" scrape_pool=depl/node-exporter/0 target=http://x.x.x.x:9100/metrics msg="Scrape commit failed" err="write to WAL: log samples: write /prometheus/wal/00000003: stale NFS file handle"
level=error ts=2020-12-22T04:36:00.862Z caller=scrape.go:1076 component="scrape manager" scrape_pool=depl/prometheus-kubelet/0 target=https://x.x.x.x:10250/metrics msg="Scrape commit failed" err="write to WAL: log samples: write /prometheus/wal/00000003: stale NFS file handle"
level=error ts=2020-12-22T04:36:00.881Z caller=scrape.go:1076 component="scrape manager" scrape_pool=depl/prometheus-kubelet/0 target=https://x.x.x.x:10250/metrics msg="Scrape commit failed" err="write to WAL: log samples: write /prometheus/wal/00000003: stale NFS file handle"
level=error ts=2020-12-22T04:36:00.898Z caller=scrape.go:1076 component="scrape manager" scrape_pool=depl/prometheus-kubelet/0 target=https://x.x.x.x:10250/metrics msg="Scrape commit failed" err="write to WAL: log samples: write /prometheus/wal/00000003: stale NFS file handle"
level=error ts=2020-12-22T04:36:00.970Z caller=scrape.go:1076 component="scrape manager" scrape_pool=depl/node-exporter/0 target=http://x.x.x.x:9100/metrics msg="Scrape commit failed" err="write to WAL: log samples: write /prometheus/wal/00000003: stale NFS file handle"

I keep getting more logs like this. With one replica there are no errors, but as soon as I use more than 1 replica I get the errors above.

Is there any other way to achieve zero downtime for Prometheus? And why am I getting these errors? With 1 replica there are no errors in the data directory; this only happens with more than 1 replica.

Prometheus must not share a data directory with another running instance, as you will see data corruption. Each Prometheus instance must have a unique data directory. Additionally, NFS isn't supported, so you should use a local hard drive or an EBS volume.
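
One way to give each replica its own directory on Kubernetes is to run Prometheus as a StatefulSet with volumeClaimTemplates, so every pod gets its own block-storage PersistentVolumeClaim instead of all replicas mounting the same NFS share. A minimal sketch, assuming hypothetical names, image tag, and storage class:

# Sketch only: names, image tag, storage class and sizes are assumptions.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
spec:
  serviceName: prometheus
  replicas: 2
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus:v2.23.0      # assumed version
          args:
            - --storage.tsdb.path=/prometheus
          volumeMounts:
            - name: data
              mountPath: /prometheus
  volumeClaimTemplates:                        # one PVC per pod: data-prometheus-0, data-prometheus-1, ...
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]         # block storage (e.g. EBS), not NFS/ReadWriteMany
        resources:
          requests:
            storage: 50Gi

Each pod then writes to its own volume, so the WALs never collide. Note that the two replicas are independent servers scraping the same targets; they do not share or replicate data between themselves.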
