On Tue, 15 Aug 2017, Sean Purdy said:
> Luminous 12.1.1 rc1
> 
> Hi,
> 
> 
> I have a three node cluster with 6 OSDs and 1 mon per node.
> 
> I had to turn off one node for rack reasons.  While the node was down, the
> cluster was still running and accepting files via radosgw.  However, when I
> turned the machine back on, radosgw uploads stopped working and things like
> "ceph status" started timing out.  It took 20 minutes for "ceph status" to be
> OK.

Well, I've figured out why "ceph status" was hanging (and possibly radosgw).  It
seems that the ceph utility looks at ceph.conf to find a monitor to connect to
(or at least that's what strace implied), but our ceph.conf listed only one of
the three monitors, and that was the node I turned off.  Updating
mon_initial_members and mon_host to include the other two monitors worked.
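
For anyone wanting to check their own setup, here is a rough sketch of the kind
of sanity check I mean.  It is not a Ceph tool, just plain Python that parses a
ceph.conf-style file and counts the addresses under mon_host.  The hostnames
(node1..node3), the 192.0.2.x addresses and the listed_monitors helper are all
made up for illustration, and I'm assuming the monitors are listed in [global].

```python
import configparser

# Hypothetical ceph.conf contents; hostnames and addresses are placeholders.
SAMPLE_CONF = """
[global]
mon_initial_members = node1, node2, node3
mon_host = 192.0.2.11, 192.0.2.12, 192.0.2.13
"""


def listed_monitors(conf_text):
    """Return the monitor addresses listed under mon_host in [global]."""
    # interpolation=None so a literal '%' in a value is not treated specially;
    # strict=False so repeated keys or sections do not raise an error.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    if not parser.has_section("global"):
        return []
    # ceph.conf allows "mon host" as well as "mon_host", so normalise the keys.
    options = {k.replace(" ", "_"): v for k, v in parser.items("global")}
    raw = options.get("mon_host", "")
    # Addresses may be separated by commas and/or whitespace.
    return [addr for addr in raw.replace(",", " ").split() if addr]


mons = listed_monitors(SAMPLE_CONF)
print(len(mons), "monitor(s) listed:", mons)
if len(mons) < 2:
    print("warning: clients have no fallback if this monitor is down")
```

With only one address under mon_host (which is what we had), the warning fires,
and that was exactly our situation when the listed node was powered off.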

TBF,
https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/1.3/html/administration_guide/managing_cluster_size
does mention that you should add your second and third monitors to ceph.conf.
But I hadn't read that, and elsewhere I read that on boot the monitors will
discover the other monitors, so I thought you didn't need to list them all, e.g.
http://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/#changing-a-monitor-s-ip-address
(which also says clients use ceph.conf to find monitors; I missed that part).

Anyway, I'll do a few more tests with a better ceph.conf.


Sean Purdy
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
