So far I've found that the mgr daemon works fine when I move it to
an OSD node. All nodes run the same OS version, so I can conclude that the
problem is limited to the nodes that normally run mgr. I'm still
investigating what's happening, but at least monitoring is back.
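
For anyone who wants to compare mgr hosts the same way, here is a minimal
sketch of a probe for the mgr prometheus endpoint. It is not part of Ceph:
the helper names (probe_metrics, describe) are made up for illustration, and
it assumes the module is reachable over plain HTTP at /metrics on port 9283
(the port from the config shown below).

```python
import urllib.error
import urllib.request

# Error text quoted in the original report below.
NO_CACHE_MSG = "No cached data available yet"


def probe_metrics(url, timeout=5.0):
    """GET url and return (status_code, body_text)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status, body = resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx, but the exception still carries the body.
        status, body = err.code, err.read()
    return status, body.decode("utf-8", errors="replace")


def describe(status, body):
    """Turn a probe result into a short verdict (hypothetical helper)."""
    if status == 200:
        return "ok"
    if status == 503 and NO_CACHE_MSG in body:
        return "no cached data"
    return "http %d" % status
```

A call such as describe(*probe_metrics("http://192.168.97.51:9283/metrics"))
should then separate a healthy mgr from one stuck on the 503 below; the
address is the one from the netstat output and would need adjusting per
host. Connection failures are not caught and surface as urllib.error.URLError.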

Regards.

On Tue, Jun 4, 2024 at 4:01 PM Dario Graña <dgr...@pic.es> wrote:

> Hi all!
>
> I'm running Ceph Quincy 17.2.7 in a cluster. On Monday I updated the OS
> from AlmaLinux 9.3 to 9.4. Since then, Grafana shows a "No Data" message in
> all Ceph-related fields, although the node information, for example, is
> still fine (Host Detail Dashboard).
> I have redeployed the mgr service with cephadm and disabled and re-enabled
> the mgr prometheus module, but nothing changed. Digging into the problem, I
> accessed the Prometheus interface and found this error:
> [image: Screen Shot 2024-06-04 at 15.22.37.png]
> When I access the node shown as down, it reports:
> 503 Service Unavailable
>
> No cached data available yet
>
> Traceback (most recent call last):
>   File "/lib/python3.6/site-packages/cherrypy/_cprequest.py", line 638, in respond
>     self._do_respond(path_info)
>   File "/lib/python3.6/site-packages/cherrypy/_cprequest.py", line 697, in _do_respond
>     response.body = self.handler()
>   File "/lib/python3.6/site-packages/cherrypy/lib/encoding.py", line 219, in __call__
>     self.body = self.oldhandler(*args, **kwargs)
>   File "/lib/python3.6/site-packages/cherrypy/_cpdispatch.py", line 54, in __call__
>     return self.callable(*self.args, **self.kwargs)
>   File "/usr/share/ceph/mgr/prometheus/module.py", line 1751, in metrics
>     return self._metrics(_global_instance)
>   File "/usr/share/ceph/mgr/prometheus/module.py", line 1762, in _metrics
>     raise cherrypy.HTTPError(503, 'No cached data available yet')
> cherrypy._cperror.HTTPError: (503, 'No cached data available yet')
>
> I checked the mgr prometheus address and port:
> [ceph: root@ceph-admin01 /]# ceph config get mgr mgr/prometheus/server_addr
> ::
> [ceph: root@ceph-admin01 /]# ceph config get mgr mgr/prometheus/server_port
> 9283
>
> It seems to be ok.
>
> When I checked the master manager node for the port, I found:
> [root@ceph-hn01 ~]# netstat -natup | grep 9283
> tcp6       0      0 :::9283                 :::*                    LISTEN      2453/ceph-mgr
> tcp6       0      0 192.168.97.51:9283      192.168.97.60:36130     ESTABLISHED 2453/ceph-mgr
>
> I don't understand why it is showing as IPv6; the node doesn't have a dual
> stack.
>
> I also tried a newer version of the Prometheus container image, 1.6.0,
> but it kept reporting the same error, so I rolled back to the original
> one.
>
> Has anyone experienced an issue like this?
> Where can I look for more information about it?
>
> Thanks in advance.
>
> Regards.
> --
> Dario Graña
> PIC (Port d'Informació Científica)
> Campus UAB, Edificio D
> E-08193 Bellaterra, Barcelona
> http://www.pic.es
> Avis - Aviso - Legal Notice: http://legal.ifae.es
>
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
