[prometheus-users] Checking if NFS is hanged or not using node_exporter.

2020-03-03 Thread Yagyansh S. Kumar
I want to check whether NFS is hung (i.e. whether it is accessible from the 
server, and if so, what response time it is getting). I know the mountstats 
and nfs collectors provide a lot of NFS metrics, but I haven't found any that 
reliably tell me every time NFS hangs.
Thanks in advance.

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/06929518-d3b5-4c2f-9490-b08cc664d26b%40googlegroups.com.


Re: [prometheus-users] Checking if NFS is hanged or not using node_exporter.

2020-03-03 Thread Murali Krishna Kanagala
Try enabling the NFS collectors in node_exporter. They will expose some
metrics about the NFS status.

Also look at the disk IO metrics from node_exporter: if you see no
activity, that suggests the NFS is not doing anything.


Re: [prometheus-users] Checking if NFS is hanged or not using node_exporter.

2020-03-03 Thread Yagyansh S. Kumar
I have already enabled the nfs and nfsd collectors. So far I haven't found 
anything that can accurately tell me about an NFS hang.
Correct me if I am wrong, but I don't think disk IO activity is a good 
indicator of an NFS hang: there may be times when no activity is happening 
on the NFS, and that does not mean it is hung. (E.g. I have 25 NFS mounts on 
one of my servers; some of them are rarely used, so we won't find any 
substantial IO on those mounts, but I still need to know whether they are 
accessible.) Still, thanks for the suggestion; I will try it out.



Re: [prometheus-users] Checking if NFS is hanged or not using node_exporter.

2020-03-03 Thread Murali Krishna Kanagala
I would write a small shell script that tries to write to the NFS mount
path and writes the status to a file that the textfile collector can read,
then schedule that script with cron. I think this is the easiest
solution.
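
A minimal sketch of such a probe, written in Python rather than shell so that
the timeout handling is explicit. Several things here are assumptions and not
from this thread: the metric names nfs_probe_success and
nfs_probe_duration_seconds, the 5 second timeout, the output file name
nfs_probe.prom, and the use of a read-only `stat` in place of the write test
described above (a stat may be answered from the client's attribute cache, so
a write is the stricter check). The probe runs in a child process because a
call on a hung mount can block, and the script itself should not block with it.

```python
#!/usr/bin/env python3
"""Probe NFS mounts and write the results for the textfile collector.

Metric names, timeout and output file name are illustrative assumptions.
"""
import os
import subprocess
import sys
import tempfile
import time


def nfs_mounts(mounts_text):
    """Return mount points of type nfs/nfs4 from /proc/mounts content."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()  # device, mountpoint, fstype, options, ...
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            points.append(fields[1])
    return points


def probe_mount(path, timeout_s=5.0, cmd=None):
    """Run the probe command in a child process; return (ok, seconds)."""
    argv = cmd if cmd is not None else ["stat", "--", path]
    start = time.monotonic()
    try:
        proc = subprocess.Popen(argv, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
    except OSError:
        return False, time.monotonic() - start
    try:
        ok = proc.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        proc.kill()  # best effort; deliberately no second wait()
        ok = False
    return ok, time.monotonic() - start


def _escape(value):
    """Escape a label value for the Prometheus text format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(results):
    """Format {mountpoint: (ok, seconds)} as Prometheus text exposition."""
    lines = ["# HELP nfs_probe_success 1 if the probe finished in time, else 0.",
             "# TYPE nfs_probe_success gauge"]
    for mp, (ok, _) in sorted(results.items()):
        lines.append('nfs_probe_success{mountpoint="%s"} %d'
                     % (_escape(mp), int(ok)))
    lines += ["# HELP nfs_probe_duration_seconds Time the probe took.",
              "# TYPE nfs_probe_duration_seconds gauge"]
    for mp, (_, secs) in sorted(results.items()):
        lines.append('nfs_probe_duration_seconds{mountpoint="%s"} %.3f'
                     % (_escape(mp), secs))
    return "\n".join(lines) + "\n"


def main(textfile_dir):
    with open("/proc/mounts") as fh:
        mounts = nfs_mounts(fh.read())
    results = {mp: probe_mount(mp) for mp in mounts}
    # Write to a temp file and rename it, so a partial file is never read.
    fd, tmp = tempfile.mkstemp(dir=textfile_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as out:
        out.write(render_metrics(results))
    os.chmod(tmp, 0o644)  # mkstemp creates 0600; the exporter may run as another user
    os.replace(tmp, os.path.join(textfile_dir, "nfs_probe.prom"))


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Run from cron with the directory passed to node_exporter's
--collector.textfile.directory flag as the only argument. Mounts are probed
one after another, so with many hung mounts a run can take up to the timeout
multiplied by the number of mounts.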


Re: [prometheus-users] Checking if NFS is hanged or not using node_exporter.

2020-03-03 Thread Yagyansh S. Kumar
I also thought about that, but I am keeping it as a last resort, since it 
would require me to push a script to all my 2500+ servers.


Re: [prometheus-users] Checking if NFS is hanged or not using node_exporter.

2020-03-03 Thread Yagyansh S. Kumar
I also thought about doing the same, but I am keeping that as a last resort 
because that would require me to push the script to all my 2500+ servers.


Re: [prometheus-users] Checking if NFS is hanged or not using node_exporter.

2020-03-03 Thread Serkan Çoban
If I remember correctly, node_exporter will hang too when an NFS share
hangs. Maybe you can test it.


Re: [prometheus-users] Checking if NFS is hanged or not using node_exporter.

2020-03-03 Thread Ben Kochie
We added some mitigation for filesystem hangs. The node_exporter will
notice a stuck filesystem and stop attempting to gather metrics from it
until it gets un-stuck. Although, I don't think we have any metrics for
when that happens, only log errors.


Re: [prometheus-users] Checking if NFS is hanged or not using node_exporter.

2020-03-03 Thread Julien Pivotto
Hi,

We have a dedicated job that collects disk metrics:

- job_name: node_disks
  params:
    collect[]:
      - diskstats
      - filefd
      - filesystem
      - mdadm
      - mountstats
      - nfs
      - nfsd
- job_name: node
  params:
    collect[]:
      - arp
      - bonding
      - conntrack
      - cpu
      - entropy
      - hwmon
      - infiniband
      - loadavg
      - meminfo
      - netclass
      - netdev
      - netstat
      - ntp
      - processes
      - sockstat
      - stat
      - textfile
      - time
      - timex
      - uname
      - vmstat
      - xfs

A stale NFS mount will usually be noticed by:
  up{job="node_disks"}==0 and label_replace(up{job="node"}==1,"job","node_disks","","")
and by a second rule:
  node_filesystem_avail_bytes offset 8h unless node_filesystem_avail_bytes and on(job, instance) up == 1


Those two expressions seem to have worked fine for us in the past.
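
A minimal sketch of what the two expressions appear to select, modelled as
plain Python set logic, not as an evaluation of PromQL. The function names
and the data shapes are hypothetical. The first rule seems to pick instances
where the scrape of the disk collectors fails while the scrape of the other
collectors on the same exporter succeeds. The second seems to pick filesystem
series that existed 8 hours ago and are missing now, on targets that are
still up.

```python
def exporter_up_but_disks_down(up):
    """up maps (job, instance) -> 0 or 1, like the `up` metric.

    Mirrors: up{job="node_disks"}==0
             and label_replace(up{job="node"}==1, "job", "node_disks", "", "")
    The label_replace only rewrites job so both sides can match on instance.
    """
    return sorted(
        inst for (job, inst), value in up.items()
        if job == "node_disks" and value == 0 and up.get(("node", inst)) == 1
    )


def vanished_filesystems(before, now, up):
    """before/now are sets of (job, instance, mountpoint) that had a
    node_filesystem_avail_bytes sample 8h ago / have one currently.

    Mirrors: (X offset 8h unless X) and on(job, instance) up == 1
    """
    return sorted(
        key for key in before - now
        if up.get((key[0], key[1])) == 1
    )


if __name__ == "__main__":
    up = {("node", "a:9100"): 1, ("node_disks", "a:9100"): 0,
          ("node", "b:9100"): 1, ("node_disks", "b:9100"): 1}
    print(exporter_up_but_disks_down(up))
    before = {("node_disks", "b:9100", "/nfs/x"), ("node_disks", "b:9100", "/")}
    now = {("node_disks", "b:9100", "/")}
    print(vanished_filesystems(before, now, up))
```

Splitting the collectors into two jobs is what makes the first check possible:
if the filesystem-related collectors stall, only the node_disks scrape fails,
and the node job still shows that the exporter itself is reachable.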



Re: [prometheus-users] Checking if NFS is hanged or not using node_exporter.

2020-03-03 Thread Yagyansh S. Kumar
If it stops scraping the metrics altogether, can we safely say that 
whenever we have no metrics for an NFS mount, it is because the mount is 
stuck?


Re: [prometheus-users] Checking if NFS is hanged or not using node_exporter.

2020-03-03 Thread sayf eddine Hammemi
If node_exporter logs errors when an NFS share hangs, then you can use
mtail, for example, to read the node_exporter log files and export the NFS
errors as metrics. That would be better than using a hand-made script.


Re: [prometheus-users] Checking if NFS is hanged or not using node_exporter.

2020-03-03 Thread Yagyansh S. Kumar
@Julien - Can you please explain a bit about what exactly you are checking,
and how you conclude that the NFS is in fact in a hung state?

On Tuesday, March 3, 2020 at 10:48:02 PM UTC+5:30, Julien Pivotto wrote:
>
> Hi, 
>
> We have a dedicated job that collects disk metrics: 
>
> - job_name: node_disks
>   params:
>     collect[]:
>       - diskstats
>       - filefd
>       - filesystem
>       - mdadm
>       - mountstats
>       - nfs
>       - nfsd
> - job_name: node
>   params:
>     collect[]:
>       - arp
>       - bonding
>       - conntrack
>       - cpu
>       - entropy
>       - hwmon
>       - infiniband
>       - loadavg
>       - meminfo
>       - netclass
>       - netdev
>       - netstat
>       - ntp
>       - processes
>       - sockstat
>       - stat
>       - textfile
>       - time
>       - timex
>       - uname
>       - vmstat
>       - xfs
>
> A stale NFS mount will usually be noticed by this expression:
>   up{job="node_disks"} == 0 and label_replace(up{job="node"} == 1, "job", "node_disks", "", "")
> and by a second rule:
>   node_filesystem_avail_bytes offset 8h unless node_filesystem_avail_bytes and on(job, instance) up == 1
>
>
> Those two expressions seem to have worked fine for us in the past.
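
One way to read the two expressions above, as a rough Python model rather
than anything Prometheus actually runs: the first fires for an instance whose
node_disks scrape fails while its node scrape still succeeds (presumably
because the disk collectors are blocked on a stuck mount); the second fires
for filesystem series that existed 8 hours ago, are missing now, and belong
to a target that is still up. The dictionaries, sets and function names below
are made up for illustration; only the job names and the mountpoint label
come from the config and metrics above.

```python
def disk_job_down_node_job_up(up):
    """Model of: up{job="node_disks"} == 0 and (up{job="node"} == 1, relabelled).

    `up` maps (job, instance) -> 0 or 1 (illustrative structure, not a real API).
    Returns the instances where only the disk-metrics scrape is failing.
    """
    return sorted(
        instance
        for (job, instance), value in up.items()
        if job == "node_disks" and value == 0 and up.get(("node", instance)) == 1
    )


def vanished_filesystems(series_8h_ago, series_now, up):
    """Model of: X offset 8h unless X and on(job, instance) up == 1.

    Each series is a (job, instance, mountpoint) tuple. A series counts as
    vanished if it existed 8 hours ago, is absent now, and its target is up.
    """
    return sorted(
        series
        for series in series_8h_ago - series_now
        if up.get((series[0], series[1])) == 1
    )


# Hypothetical sample data: host "a" has a failing disk scrape only.
up = {
    ("node_disks", "a:9100"): 0, ("node", "a:9100"): 1,
    ("node_disks", "b:9100"): 0, ("node", "b:9100"): 0,
    ("node_disks", "c:9100"): 1, ("node", "c:9100"): 1,
}
before = {("node_disks", "c:9100", "/"), ("node_disks", "c:9100", "/mnt/nfs1")}
now = {("node_disks", "c:9100", "/")}

print(disk_job_down_node_job_up(up))
print(vanished_filesystems(before, now, up))
```

In this model host "b" is not reported by the first check because it is down
entirely, which is the point of splitting the collectors into two jobs.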
>
>
> On 03 Mar 18:11, Ben Kochie wrote: 
> > We added some mitigation for filesystem hangs. The node_exporter will 
> > notice a stuck filesystem and stop attempting to gather metrics from it 
> > until it gets un-stuck. Although, I don't think we have any metrics for 
> > when that happens, only log errors. 
> > 
> > On Tue, Mar 3, 2020 at 6:03 PM Serkan Çoban wrote:
> > 
> > > if I remember correctly node exporter will hang too when an nfs share 
> > > hangs. maybe you can test it... 

Re: [prometheus-users] Checking if NFS is hanged or not using node_exporter.

2020-03-03 Thread Yagyansh S. Kumar
This seems like a lot of work, especially when I have to monitor more than
2500 servers. :P

On Tuesday, March 3, 2020 at 10:49:57 PM UTC+5:30, sayf eddine Hammemi 
wrote:
>
> If node_exporter logs errors when the NFS share hangs, then you can use
> mtail, for example, to scrape the node_exporter log files and export NFS
> errors. That would be better than using a hand-made script.

Re: [prometheus-users] Checking if NFS is hanged or not using node_exporter.

2020-03-03 Thread Yagyansh S. Kumar
@sayf eddine Hammemi - 
This seems like a very tedious task, given that I have to monitor 2500+ servers.

On Tuesday, March 3, 2020 at 10:49:57 PM UTC+5:30, sayf eddine Hammemi 
wrote:
>
> If node_exporter logs errors when the NFS share hangs, then you can use
> mtail, for example, to scrape the node_exporter log files and export NFS
> errors. That would be better than using a hand-made script.

Re: [prometheus-users] Checking if NFS is hanged or not using node_exporter.

2020-03-03 Thread Murali Krishna Kanagala
The other option would be to run a custom exporter on one box that can
ssh to the rest and run the necessary commands.
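
A minimal sketch of that idea in Python, assuming key-based ssh access from
the central box and that a plain `stat` of the mount point is an acceptable
health check. The function names, the metric name nfs_mount_accessible and
the worker count are hypothetical. Note that a 0 here cannot tell a hung
mount from an unreachable host, and label values are not escaped.

```python
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_command(host, mount, connect_timeout=5):
    """Build the ssh invocation; BatchMode avoids hanging on a password prompt."""
    return [
        "ssh",
        "-o", "BatchMode=yes",
        "-o", f"ConnectTimeout={connect_timeout}",
        host,
        "stat -- " + shlex.quote(mount),  # quoted for the remote shell
    ]


def check_over_ssh(host, mount, timeout=10.0):
    """Return 1 if the remote stat succeeds within `timeout` seconds, else 0."""
    try:
        done = subprocess.run(
            ssh_command(host, mount),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,  # kills the local ssh client; the remote stat may linger
        )
    except (subprocess.TimeoutExpired, OSError):
        return 0
    return 1 if done.returncode == 0 else 0


def probe_all(targets, check=check_over_ssh, max_workers=50):
    """Probe (host, mount) pairs concurrently and return {(host, mount): 0 or 1}."""
    targets = list(targets)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(targets, pool.map(lambda t: check(*t), targets)))


def render(results):
    """Render the results in the Prometheus text exposition format."""
    lines = ["# TYPE nfs_mount_accessible gauge"]
    for (host, mount), ok in sorted(results.items()):
        lines.append(f'nfs_mount_accessible{{host="{host}",mountpoint="{mount}"}} {ok}')
    return "\n".join(lines) + "\n"
```

The output of render() could be served over HTTP by the custom exporter or
written to a file for the textfile collector on the central box. With 2500+
servers the thread pool size and the timeouts would need tuning so that one
full pass finishes within the scrape interval.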

On Tue, Mar 3, 2020, 9:25 AM Yagyansh S. Kumar 
wrote:

> I also thought about that, but I am keeping it as a last resort because
> it would require me to push a script to all of my 2500+ servers.
>
> On Tuesday, March 3, 2020 at 8:46:27 PM UTC+5:30, Murali Krishna Kanagala
> wrote:
>>
>> I would write a small shell script that tries to write to the NFS mount
>> path and writes the status to a file which can be read by the textfile
>> collector. Then schedule that shell script with cron. I think this is the
>> easiest solution.
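
A rough sketch of such a probe, written in Python rather than shell and using
`stat` instead of a write so that nothing on the share is modified (both are
substitutions, not what was proposed above). The metric name
nfs_mount_accessible and the function names are hypothetical. The probe does
not wait for a timed-out child, because a process blocked on a hung hard
mount may not react to SIGKILL.

```python
import os
import subprocess
import sys
import tempfile

METRIC = "nfs_mount_accessible"  # hypothetical metric name


def probe_mount(path, timeout=5.0, command=("stat",)):
    """Return 1 if `command path` exits with status 0 within `timeout` seconds, else 0."""
    try:
        proc = subprocess.Popen(
            [*command, path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        return 0
    try:
        return 1 if proc.wait(timeout=timeout) == 0 else 0
    except subprocess.TimeoutExpired:
        # Send SIGKILL but never block on the child: it may be stuck in
        # uninterruptible I/O until the NFS server comes back.
        proc.kill()
        proc.poll()
        return 0


def escape(value):
    """Escape a label value for the Prometheus text format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render(results):
    """Render {mount path: 0 or 1} as textfile-collector input."""
    lines = [
        f"# HELP {METRIC} 1 if the mount answered within the timeout, else 0.",
        f"# TYPE {METRIC} gauge",
    ]
    for path, ok in sorted(results.items()):
        lines.append(f'{METRIC}{{mountpoint="{escape(path)}"}} {ok}')
    return "\n".join(lines) + "\n"


def write_atomically(text, dest):
    """Write via a temp file and rename so a half-written file is never read."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(dest) or ".", suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.chmod(tmp, 0o644)  # mkstemp creates 0600; the exporter may run as another user
    os.replace(tmp, dest)


def main(dest, mounts):
    write_atomically(render({m: probe_mount(m) for m in mounts}), dest)


# Usage: probe.py OUTPUT.prom MOUNT [MOUNT ...]
if __name__ == "__main__" and len(sys.argv) > 2:
    main(sys.argv[1], sys.argv[2:])
```

The output file would need to end in .prom and live in the directory given to
node_exporter via --collector.textfile.directory. Mounts are probed one after
another, so with 25 mounts and a 5 second timeout a run can take about two
minutes in the worst case.
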
>>
>> On Tue, Mar 3, 2020, 9:12 AM Yagyansh S. Kumar 
>> wrote:
>>
>>> I have already enabled the nfs and nfsd collectors. So far I haven't
>>> found anything that can accurately tell me about an NFS hang.
>>> Correct me if I am wrong, but I don't think a lack of disk IO is a good
>>> indicator of an NFS hang, as there may be times when no activity is
>>> happening on the NFS, and that does not mean the NFS is hung. (E.g. I
>>> have 25 NFS mounts on one of my servers and some of them are rarely used,
>>> so we won't find any substantial IO on those mounts, but I still need to
>>> know whether they are accessible or not.) Still, thanks for the
>>> suggestion, I will try it out.
>>>
>>>
>>> On Tuesday, March 3, 2020 at 8:35:03 PM UTC+5:30, Murali Krishna
>>> Kanagala wrote:

>>>> Try enabling the nfs options in the node exporter config. It will spit
>>>> out some metrics about the nfs status.
>>>>
>>>> Also look at the disk IO metrics from node exporter and if you see no
>>>> activity which indicates the nfs is not doing anything.
>>>>
>>>> On Tue, Mar 3, 2020, 7:10 AM Yagyansh S. Kumar
>>>> wrote:
>>>>
>>>>> I want to check if the NFS is hanged(i.e whether it is accessible from
>>>>> the server or not, and if yes then what is the response time it is
>>>>> getting). I know using the mountstats and nfs collector we have a lot of
>>>>> metrics for NFS, but haven't found any that can tell me every time the NFS
>>>>> hangs correctly.
>>>>> Thanks in advance.
