Thanks for writing, Ernesto.

  1. output of ceph mgr services:
    ceph mgr services
    {
        "dashboard": "https://144.92.190.200:8443/",
        "prometheus": "http://144.92.190.200:9283/"
    }
  2. Network tab in dev tools, doing a reload just results in a GET -> DOMAIN: ceph01.ssc.wisc.edu:8443, file /
    1. Nothing else comes up as the throbber throbs.
    2. No assets list as being downloaded.
  3. Similar result with curl: curl -k https://144.92.190.200:8443 just results in a blinking cursor. No errors, just hanging. If I try any other random port, curl (as expected) says "connection refused" and quits instantly.
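
The difference described in item 3, between a port that hangs and one that refuses instantly, can also be checked from a script. The following is only a minimal sketch, assuming Python 3 with the standard library; `probe` and its return labels are hypothetical names for illustration, not part of Ceph. Like `curl -k`, it skips certificate verification.

```python
import socket
import ssl


def probe(host, port, timeout=5.0):
    """Classify an HTTPS endpoint as 'refused', 'hang', 'responding', 'closed' or 'error'.

    Illustrative helper only: the name and labels are assumptions, not a Ceph API.
    """
    # Accept self-signed certificates, mirroring `curl -k`.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    try:
        raw = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"  # nothing is listening on this port
    except (socket.timeout, TimeoutError):
        return "hang"  # the TCP connect itself never completed

    try:
        # The timeout set above also applies to the TLS handshake and reads.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(b"GET / HTTP/1.0\r\n\r\n")
            return "responding" if tls.recv(1) else "closed"
    except (socket.timeout, TimeoutError):
        return "hang"  # port accepted the connection but never answered
    except (ssl.SSLError, OSError):
        return "error"


# Example call (address taken from the `ceph mgr services` output above):
# print(probe("144.92.190.200", 8443))
```

A "hang" result would match the blinking-cursor behaviour of curl, while "refused" would match what happens on a random unused port.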
Zach


On 2021-11-19 10:17 AM, Ernesto Puerta wrote:
Hi Zach,

Thanks for the thorough description. We haven't noticed this issue so far and have some long-running clusters, but let's try to debug it:
  • First of all, as Kai suggested, let's ensure we're hitting the active manager address (there's a redirection mechanism, but let's ensure it anyway): a "ceph mgr services" should give you the active Dashboard URL.
  • After that, my suggestion is to open the browser's Dev Tools (built into both Chrome and Firefox) and visit the Network tab. There, you should be able to see a few network requests on hard reload (remember to keep CTRL+SHIFT pressed while clicking on the reload icon). You should see a few HTML, CSS and JS assets downloading.
  • Let's try to perform a "curl" from the CLI: "curl -k https://<hostname>:<port>". That should return the index HTML file.
Are you using a reverse proxy/cache that might be interfering with this?

Kind Regards,
Ernesto


On Fri, Nov 19, 2021 at 12:04 AM Zach Heise (SSCC) <he...@ssc.wisc.edu> wrote:

Hello!

 

Our test cluster is a few months old. It was initially set up from scratch with Pacific and has since had two separate small patches applied to it: 16.2.5 and then, a couple of weeks ago, 16.2.6. The issue I'm describing has been present since the beginning.

 

We have an active and standby mgr daemon, and the dashboard module is installed with SSL turned on. Self-signed certificates only, not trusted by browsers, but I always just click "okay" through Chrome and Firefox's warnings about that.

 

I have noticed that every 2-3 days, in the morning when I start work, our ceph dashboard page does not respond in the browser. It works fine throughout the day, but it seems that after some unknown number of hours without anyone accessing it (I'm the only one using the dashboard now since it's just a test), something goes wrong with the dashboard module or the mgr daemon. When I try to load the ceph dashboard site (or refresh it when it's already loaded), the browser just shows the "throbber": no content on the page ever appears, no errors or anything. None of the buttons on the page load content, nor do they time out and show a 404. For example, Block\Images or Cluster\Hosts in the left sidebar will load, but show empty. And the throbber never stops.

 

Confirmed that this happens in all browsers too.

 

I can easily fix it with ceph mgr module disable dashboard, waiting 10 seconds, and then ceph mgr module enable dashboard. This makes it start working again until the next time I go a few days without using the dashboard, at which point I need to repeat the same process.
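
The disable, wait, enable sequence above could be wrapped in a small helper so it is run the same way each time. This is a minimal sketch, assuming Python 3 and that the `ceph` CLI is on the PATH of the host where it runs; `restart_dashboard` and its parameters are hypothetical names, and only the two `ceph mgr module` commands come from the message itself.

```python
import subprocess
import time


def restart_dashboard(run=subprocess.run, wait_seconds=10, sleep=time.sleep):
    """Disable the dashboard mgr module, wait, then enable it again.

    `run` and `sleep` are injectable (an assumption made for this sketch)
    so the sequence can be dry-run without touching a real cluster.
    """
    # Same commands as in the manual workaround described above.
    run(["ceph", "mgr", "module", "disable", "dashboard"], check=True)
    sleep(wait_seconds)
    run(["ceph", "mgr", "module", "enable", "dashboard"], check=True)


# Example call on a host with the ceph CLI available:
# restart_dashboard()
```

This only automates the workaround; it does not address whatever is causing the module to stop responding.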

 

Any ideas as to what could be causing this? I have already turned on debug mode. When I'm in this hanging state, I check the cephadm logs with cephadm logs --name mgr.ceph01.fblojp -- -f but there's nothing obvious (to my untrained eyes at least). When the dashboard is functional, I can see my own navigation around the dashboard in the logs, so I know that logging is working:

 

Nov 01 15:46:32 ceph01.domain conmon[5814]: debug 2021-11-01T20:46:32.601+0000 7f7cbb42e700  0 [dashboard INFO request] [10.130.50.252:52267] [GET] [200] [0.013s] [admin] [1.0K] /api/summary

 

I already confirmed that the same thing happens regardless of whether I'm using the default ports, http://ceph01.domain:8080 or https://ceph01.domain:8443 (although, as mentioned, I usually use self-signed SSL).

 

The dashboard is currently in this hanging state, so I am happy to try to get logs.

 

Thanks,

-Zach

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io