Re: [ceph-users] rbd map issues: no such file or directory (ENOENT) AND map wrong image
Transferring this back to ceph-users. Sorry, I can't help with rbd issues. One thing I will say is that if you are mounting an rbd device with a filesystem on a machine to export ftp, you can't also export the same device via iSCSI.

David Zafman
Senior Developer
http://www.inktank.com

On Aug 19, 2013, at 8:39 PM, PJ linalin1...@gmail.com wrote:

> 2013/8/14 David Zafman david.zaf...@inktank.com:
>> On Aug 12, 2013, at 7:41 PM, Josh Durgin josh.dur...@inktank.com wrote:
>>> On 08/12/2013 07:18 PM, PJ wrote:
>>>> If the target rbd device is only mapped on one virtual machine, format it as ext4 and mount it in two places:
>>>>
>>>>     mount /dev/rbd0 /nfs   -- for nfs server usage
>>>>     mount /dev/rbd0 /ftp   -- for ftp server usage
>>>>
>>>> The nfs and ftp servers run on the same virtual machine. Will the file system (ext4) help to handle the simultaneous access from nfs and ftp?
>>>
>>> I doubt that'll work perfectly on a normal disk, although rbd should behave the same in this case. I'd expect there to be some issues when the same files are modified at once by the ftp and nfs servers. You could run ftp on an nfs client on a different machine safely.
>>
>> Modern Linux kernels will do a bind mount when a block device is mounted on 2 different directories. Think directory hard links. Simultaneous access will NOT corrupt ext4, but as Josh said, modifying the same file at once via ftp and nfs isn't going to produce good results. With file locking, 2 nfs clients could coordinate using advisory locking.
>>
>> David Zafman
>> Senior Developer
>> http://www.inktank.com
>
> The first issue is reproduced, but there are changes to the system configuration. Due to a hardware shortage, we only have one physical machine with one OSD installed, and it runs 6 virtual machines. There is only one monitor (wistor-003) and one FTP server (wistor-004); the other virtual machines are iSCSI servers.
>
> The log size is big because when we enable the FTP service for an rbd device, we run an rbd map retry loop in case the map fails (retry rbd map every 10 seconds, for up to 3 minutes). Please download the monitor log from the link below:
> https://www.dropbox.com/s/88cb9q91cjszuug/ceph-mon.wistor-003.log.zip
>
> Here are the operation steps:
> 1. The pool rex is created
>    Around 2013-08-20 09:16:38~09:16:39
> 2. The first attempt to map the rbd device on wistor-004 fails (all retries failed)
>    Around 2013-08-20 09:17:43~09:20:46 (180 sec)
> 3. The second attempt works, but still hits 9 failures in the retry loop
>    Around 2013-08-20 09:20:48~09:22:10 (82 sec)
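For reference, a retry loop of the kind PJ describes might look roughly like the sketch below. This is only an illustration; the pool and image names are placeholders, not values taken from the log.

    #!/bin/sh
    # Retry "rbd map" every 10 seconds, for up to 3 minutes (18 attempts).
    POOL=rex          # placeholder pool name
    IMAGE=ftp-image   # placeholder image name
    for attempt in $(seq 1 18); do
        if rbd map -p "$POOL" "$IMAGE"; then
            echo "rbd map succeeded on attempt $attempt"
            break
        fi
        echo "rbd map failed (attempt $attempt), retrying in 10 seconds..."
        sleep 10
    done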
Re: [ceph-users] rbd map issues: no such file or directory (ENOENT) AND map wrong image
On Aug 12, 2013, at 7:41 PM, Josh Durgin josh.dur...@inktank.com wrote:

> On 08/12/2013 07:18 PM, PJ wrote:
>> If the target rbd device is only mapped on one virtual machine, format it as ext4 and mount it in two places:
>>
>>     mount /dev/rbd0 /nfs   -- for nfs server usage
>>     mount /dev/rbd0 /ftp   -- for ftp server usage
>>
>> The nfs and ftp servers run on the same virtual machine. Will the file system (ext4) help to handle the simultaneous access from nfs and ftp?
>
> I doubt that'll work perfectly on a normal disk, although rbd should behave the same in this case. I'd expect there to be some issues when the same files are modified at once by the ftp and nfs servers. You could run ftp on an nfs client on a different machine safely.

Modern Linux kernels will do a bind mount when a block device is mounted on 2 different directories. Think directory hard links. Simultaneous access will NOT corrupt ext4, but as Josh said, modifying the same file at once via ftp and nfs isn't going to produce good results. With file locking, 2 nfs clients could coordinate using advisory locking.

David Zafman
Senior Developer
http://www.inktank.com
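To make the bind-mount and advisory-locking points concrete, here is a minimal sketch. The mountpoints are the ones from the example above; the data file path is hypothetical, and whether flock is visible across separate NFS clients depends on the kernel and NFS versions in use.

    # Mounting the same block device twice: the second mount behaves like a bind
    # mount of the first, so both directories see the same ext4 instance.
    mount /dev/rbd0 /nfs
    mount /dev/rbd0 /ftp
    findmnt /dev/rbd0        # lists both mountpoints, backed by the same filesystem

    # Advisory locking: writers that agree to take the lock serialize their updates;
    # a writer that skips flock is not blocked, which is why the lock is "advisory".
    flock /nfs/data.txt -c 'echo "update from writer 1" >> /nfs/data.txt'
    flock /nfs/data.txt -c 'echo "update from writer 2" >> /nfs/data.txt'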
[ceph-users] rbd map issues: no such file or directory (ENOENT) AND map wrong image
Hi All,

Before going into the issue description, here is our hardware configuration:

- Physical machine * 3: each has quad-core CPU * 2, 64+ GB RAM, HDD * 12 (500 GB ~ 1 TB per drive; 1 for the system, 11 for OSDs). The ceph OSDs run on the physical machines.
- Each physical machine runs 5 virtual machines. One VM is a ceph MON (i.e. 3 MONs in total); the other 4 VMs provide either iSCSI or FTP/NFS service.
- Physical machines and virtual machines run the same software: Ubuntu 12.04 + kernel 3.6.11, ceph v0.61.7.

The issues we met are:

1. Right after ceph installation, creating a pool, then creating an image and mapping it works fine. But if we leave the whole environment idle for more than half a day, the same process (create pool - create image - map image) returns the error: no such file or directory (ENOENT). Once the issue occurs, it is easy to reproduce with the same process, but it may disappear if we wait 10+ minutes after pool creation. Rebooting the system also avoids it. I have straces of a successful and a failed run logged on the same virtual machine (the one that provides FTP/NFS):
   success: https://www.dropbox.com/s/u8jc4umak24kr1y/rbd_done.txt
   failed: https://www.dropbox.com/s/ycuupmmrlc4d0ht/rbd_failed.txt

2. The second issue: create two images (AAA and BBB) under one pool (xxx). If we map AAA (rbd map -p xxx AAA), the map succeeds but BBB shows up under /dev/rbd/xxx/. "rbd showmapped" shows AAA of pool xxx is mapped. I am not sure which one is really mapped because both images are empty. This issue is hard to reproduce, but once it happens /dev/rbd/ is messed up.

One more question, not about the rbd map issues: our usage is to map one rbd device and mount it in several places (in one virtual machine) for iSCSI, FTP and NFS. Does that cause any problem for ceph operation?
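For reference, the create pool - create image - map image sequence described above corresponds roughly to the commands below. The pool name, image name, placement-group count and size here are placeholders, not the actual values used.

    # Create a pool (128 placement groups picked arbitrarily for this sketch).
    ceph osd pool create xxx 128

    # Create an empty image in that pool (size is given in MB here).
    rbd create AAA --pool xxx --size 1024

    # Map the image with the kernel rbd client; this is the step that
    # intermittently fails with ENOENT.
    rbd map -p xxx AAA

    # Check what got mapped and what the udev symlinks point at.
    rbd showmapped
    ls -l /dev/rbd/xxx/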
Re: [ceph-users] rbd map issues: no such file or directory (ENOENT) AND map wrong image
On 08/12/2013 10:19 AM, PJ wrote:

> Hi All,
>
> Before going into the issue description, here is our hardware configuration:
>
> - Physical machine * 3: each has quad-core CPU * 2, 64+ GB RAM, HDD * 12 (500 GB ~ 1 TB per drive; 1 for the system, 11 for OSDs). The ceph OSDs run on the physical machines.
> - Each physical machine runs 5 virtual machines. One VM is a ceph MON (i.e. 3 MONs in total); the other 4 VMs provide either iSCSI or FTP/NFS service.
> - Physical machines and virtual machines run the same software: Ubuntu 12.04 + kernel 3.6.11, ceph v0.61.7.
>
> The issues we met are:
>
> 1. Right after ceph installation, creating a pool, then creating an image and mapping it works fine. But if we leave the whole environment idle for more than half a day, the same process (create pool - create image - map image) returns the error: no such file or directory (ENOENT). Once the issue occurs, it is easy to reproduce with the same process, but it may disappear if we wait 10+ minutes after pool creation. Rebooting the system also avoids it.

This sounds similar to http://tracker.ceph.com/issues/5925 - and your case suggests it may be a monitor bug, since that test is userspace and you're using the kernel client.

Could you reproduce this with logs from your monitors from the time of pool creation to after the map fails with ENOENT, and these log settings on all mons:

    debug ms = 1
    debug mon = 20
    debug paxos = 10

If you could attach those logs to the bug or otherwise make them available that'd be great.

> I have straces of a successful and a failed run logged on the same virtual machine (the one that provides FTP/NFS):
>    success: https://www.dropbox.com/s/u8jc4umak24kr1y/rbd_done.txt
>    failed: https://www.dropbox.com/s/ycuupmmrlc4d0ht/rbd_failed.txt

Unfortunately these won't tell us much since the kernel is doing all the work with rbd map.

> 2. The second issue: create two images (AAA and BBB) under one pool (xxx). If we map AAA (rbd map -p xxx AAA), the map succeeds but BBB shows up under /dev/rbd/xxx/. "rbd showmapped" shows AAA of pool xxx is mapped. I am not sure which one is really mapped because both images are empty. This issue is hard to reproduce, but once it happens /dev/rbd/ is messed up.

That sounds very strange, since 'rbd showmapped' and the udev rule that creates the /dev/rbd/pool/image symlinks use the same data source - /sys/bus/rbd/N/name. It doesn't seem possible with the current code. This sounds like a race condition where sysfs is being read (and reading stale memory) before the kernel finishes populating it. Could you file this in the tracker? Checking whether it still occurs in Linux 3.10 would be great too.

> One more question, not about the rbd map issues: our usage is to map one rbd device and mount it in several places (in one virtual machine) for iSCSI, FTP and NFS. Does that cause any problem for ceph operation?

If it's read-only everywhere, it's fine, but otherwise you'll run into problems unless you've got something on top of rbd managing access to it, like ocfs2. You could use nfs on top of one rbd device, but having multiple nfs servers on top of the same rbd device won't work unless they can coordinate with each other. The same applies to iscsi and ftp.

Josh
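One way to apply those debug settings, as a sketch, is to add them to the [mon] section of ceph.conf on each monitor host and restart the mons:

    [mon]
        debug ms = 1
        debug mon = 20
        debug paxos = 10

Alternatively, something along the lines of the following usually changes the values at runtime without a restart (mon.a is a placeholder monitor id; verify the exact behaviour on this ceph version):

    ceph tell mon.a injectargs '--debug-ms 1 --debug-mon 20 --debug-paxos 10'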