Hi! I'm having a problem with RBD mirroring on a new Ceph deployment, and I'm posting in case someone can help me out or point me in the right direction.
I have a Ceph Jewel install with 2 clusters (zone1, zone2). RBD itself is working fine, but RBD mirroring between the sites is not working correctly. I have configured pool-mode mirroring on the default rbd pool, set up the peers, and created 2 test images:

    [root@mon3 ceph]# rbd --user zone1 --cluster zone1 mirror pool info
    Mode: pool
    Peers:
      UUID                                 NAME  CLIENT
      397b37ef-8300-4dd3-a637-2a03c3b9289c zone2 client.zone2

    [root@mon3 ceph]# rbd --user zone2 --cluster zone2 mirror pool info
    Mode: pool
    Peers:
      UUID                                 NAME  CLIENT
      2c11f1dc-67a4-43f1-be33-b785f1f6b366 zone1 client.zone1

The primary is OK:

    [root@mon3 ceph]# rbd --user zone1 --cluster zone1 mirror pool status --verbose
    health: OK
    images: 2 total
        2 stopped

    test-2:
      global_id:   511e3aa4-0e24-42b4-9c2e-8d84fc9f48f4
      state:       up+stopped
      description: remote image is non-primary or local image is primary
      last_update: 2017-03-16 17:38:08

And the secondary is always in this state:

    [root@mon3 ceph]# rbd --user zone2 --cluster zone2 mirror pool status --verbose
    health: WARN
    images: 2 total
        1 syncing

    test-2:
      global_id:   511e3aa4-0e24-42b4-9c2e-8d84fc9f48f4
      state:       up+syncing
      description: bootstrapping, OPEN_LOCAL_IMAGE
      last_update: 2017-03-16 17:41:02

Sometimes it goes into the replay state with health OK for a couple of seconds, but then it goes back to "bootstrapping, OPEN_LOCAL_IMAGE". What does this state mean?
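Because the secondary only reaches the replay state for a couple of seconds at a time, it may help to check the status output from a script rather than by eye. The following is a minimal sketch, not an official tool: it parses text in the layout shown above and lists images that are not replaying. The function names are made up for illustration, the sample text is copied from the zone2 output, and the healthy state string "up+replaying" is an assumption about what the secondary reports when mirroring works.

```python
# Sample text copied from the zone2 status output in this post.
SAMPLE = """\
health: WARN
images: 2 total
    1 syncing

test-2:
  global_id:   511e3aa4-0e24-42b4-9c2e-8d84fc9f48f4
  state:       up+syncing
  description: bootstrapping, OPEN_LOCAL_IMAGE
  last_update: 2017-03-16 17:41:02
"""


def parse_mirror_status(text):
    """Parse `rbd mirror pool status --verbose` text into a dict.

    Hypothetical helper: it relies only on the layout shown in this post.
    """
    summary = {"health": None, "images": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("health:"):
            summary["health"] = line.split(":", 1)[1].strip()
        elif line.endswith(":") and " " not in line:
            # A bare "name:" line starts a per-image section.
            current = line[:-1]
            summary["images"][current] = {}
        elif current is not None and ":" in line:
            # Split on the first colon only, so timestamps stay intact.
            key, value = line.split(":", 1)
            summary["images"][current][key.strip()] = value.strip()
    return summary


def not_replaying(summary, healthy_state="up+replaying"):
    """Return names of images whose state differs from healthy_state.

    The default healthy_state is an assumption, not taken from the post.
    """
    return sorted(
        name
        for name, fields in summary["images"].items()
        if fields.get("state") != healthy_state
    )


if __name__ == "__main__":
    status = parse_mirror_status(SAMPLE)
    print(status["health"], not_replaying(status))
```

In practice the text would come from running the same `rbd ... mirror pool status --verbose` command shown above at intervals and feeding its output to `parse_mirror_status`.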
In the log files I have this error:

    2017-03-16 17:43:02.404372 7ff6262e7700 -1 librbd::ImageWatcher: 0x7ff654003190 error requesting lock: (30) Read-only file system
    2017-03-16 17:43:03.411327 7ff6262e7700 -1 librbd::ImageWatcher: 0x7ff654003190 error requesting lock: (30) Read-only file system
    2017-03-16 17:43:04.420074 7ff6262e7700 -1 librbd::ImageWatcher: 0x7ff654003190 error requesting lock: (30) Read-only file system
    2017-03-16 17:43:05.422253 7ff6262e7700 -1 librbd::ImageWatcher: 0x7ff654003190 error requesting lock: (30) Read-only file system
    2017-03-16 17:43:06.428447 7ff6262e7700 -1 librbd::ImageWatcher: 0x7ff654003190 error requesting lock: (30) Read-only file system

I'm not sure which file it refers to as read-only. I have tried to strace the process, but couldn't find it. I have disabled SELinux just in case, but the result is the same. The OS is RHEL 7.2, by the way. If I do a demote/promote of the image, I get the same state and errors on the other cluster.

If someone could help, it would be great. Thanks in advance.

Regards
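The "(30) Read-only file system" part of these lines has the form of a standard errno value followed by its message text. The sketch below pulls the fields out of one such line and looks the number up in Python's errno table. It assumes Linux errno numbering and the exact log layout shown above; the regular expression and function name are illustrative only. Whether the code refers to an actual file on disk or is simply an error value returned inside the library is not something this establishes.

```python
import errno
import re

# One log line copied from this post.
LINE = (
    "2017-03-16 17:43:02.404372 7ff6262e7700 -1 librbd::ImageWatcher: "
    "0x7ff654003190 error requesting lock: (30) Read-only file system"
)

# Illustrative pattern for the layout above:
# timestamp, thread id, level, subsystem, free text, "(code) message".
ERROR_RE = re.compile(
    r"^(\S+ \S+) \S+\s+-?\d+ (\S+): .*?: \((\d+)\) (.+)$"
)


def decode_error_line(line):
    """Return timestamp, subsystem, errno number, errno name and message.

    Hypothetical helper; returns None if the line does not match.
    """
    match = ERROR_RE.match(line)
    if match is None:
        return None
    timestamp, subsystem, code, message = match.groups()
    code = int(code)
    return {
        "timestamp": timestamp,
        "subsystem": subsystem,
        "code": code,
        # Symbolic name from the platform's errno table, if known.
        "errno_name": errno.errorcode.get(code, "unknown"),
        "message": message,
    }


if __name__ == "__main__":
    print(decode_error_line(LINE))
```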
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com