Re: [ceph-users] rbd map issues: no such file or directory (ENOENT) AND map wrong image

2013-08-19 Thread David Zafman

Transferring this back to ceph-users.  Sorry, I can't help with rbd issues.  
One thing I will say is that if you are mounting an rbd device with a 
filesystem on a machine to export over FTP, you can't also export the same 
device via iSCSI.

David Zafman
Senior Developer
http://www.inktank.com

On Aug 19, 2013, at 8:39 PM, PJ linalin1...@gmail.com wrote:

 2013/8/14 David Zafman david.zaf...@inktank.com
 
 On Aug 12, 2013, at 7:41 PM, Josh Durgin josh.dur...@inktank.com wrote:
 
  On 08/12/2013 07:18 PM, PJ wrote:
 
  If the target rbd device only map on one virtual machine, format it as
  ext4 and mount to two places
mount /dev/rbd0 /nfs -- for nfs server usage
mount /dev/rbd0 /ftp  -- for ftp server usage
  nfs and ftp servers run on the same virtual machine. Will file system
  (ext4) help to handle the simultaneous access from nfs and ftp?
  
  I doubt that'll work perfectly on a normal disk, although rbd should
  behave the same in this case. There are likely to be some issues when
  the same files are modified at once by the ftp and nfs servers. You
  could run ftp on an nfs client on a different machine safely.
 
 
 
 Modern Linux kernels will do a bind mount when a block device is mounted on 2 
 different directories.   Think of it like directory hard links.  Simultaneous 
 access will NOT corrupt ext4, but as Josh said, modifying the same file at once 
 by ftp and nfs isn't going to produce good results.  With file locking, 2 nfs 
 clients could coordinate using advisory locking.
 
 David Zafman
 Senior Developer
 http://www.inktank.com
 
 
 The first issue is reproduced, but with changes to the system configuration. 
 Due to a hardware shortage, we only have one physical machine with one OSD 
 installed, running 6 virtual machines. There is only one monitor (wistor-003) 
 and one FTP server (wistor-004); the other virtual machines are iSCSI servers.
 
 The log size is big because when we enable FTP service for an rbd device, we 
 have an rbd map retry loop in case it fails (retry rbd map every 10 seconds, 
 for up to 3 minutes; a sketch of the loop is below). Please download the 
 monitor log from the link below:
 https://www.dropbox.com/s/88cb9q91cjszuug/ceph-mon.wistor-003.log.zip
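
 For reference, a minimal sketch of such a retry loop (the image name here is a
 placeholder, not the actual one used):

    # retry "rbd map" every 10 seconds, give up after 3 minutes (18 attempts)
    for i in $(seq 1 18); do
        rbd map -p rex myimage && break   # stop as soon as the map succeeds
        sleep 10
    done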
 
 Here are the operation steps:
 1. The pool rex is created
Around 2013-08-20 09:16:38~09:16:39
 2. The first attempt to map the rbd device on wistor-004 fails (all retries 
 failed)
   Around 2013-08-20 09:17:43~09:20:46 (180 sec)
 3. The second attempt works, but there were still 9 failures in the retry loop
   Around 2013-08-20 09:20:48~09:22:10 (82 sec)
 
 
 
 



Re: [ceph-users] rbd map issues: no such file or directory (ENOENT) AND map wrong image

2013-08-13 Thread David Zafman

On Aug 12, 2013, at 7:41 PM, Josh Durgin josh.dur...@inktank.com wrote:

 On 08/12/2013 07:18 PM, PJ wrote:
 
 If the target rbd device only map on one virtual machine, format it as
 ext4 and mount to two places
   mount /dev/rbd0 /nfs -- for nfs server usage
   mount /dev/rbd0 /ftp  -- for ftp server usage
 nfs and ftp servers run on the same virtual machine. Will file system
 (ext4) help to handle the simultaneous access from nfs and ftp?
 
 I doubt that'll work perfectly on a normal disk, although rbd should
 behave the same in this case. There are likely to be some issues when
 the same files are modified at once by the ftp and nfs servers. You
 could run ftp on an nfs client on a different machine safely.
 


Modern Linux kernels will do a bind mount when a block device is mounted on 2 
different directories.   Think of it like directory hard links.  Simultaneous access 
will NOT corrupt ext4, but as Josh said, modifying the same file at once by ftp and 
nfs isn't going to produce good results.  With file locking, 2 nfs clients could 
coordinate using advisory locking.  
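
For example, something along these lines (paths are made up) could serialize
writers on two nfs clients, assuming advisory locking is actually propagated
between the clients (lockd/NLM working):

    # take an exclusive advisory lock before touching the shared file;
    # the other nfs client runs the same wrapper and blocks until it is released
    (
        flock -x 9
        echo "update" >> /mnt/share/data.txt
    ) 9>/mnt/share/data.txt.lock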

David Zafman
Senior Developer
http://www.inktank.com



[ceph-users] rbd map issues: no such file or directory (ENOENT) AND map wrong image

2013-08-12 Thread PJ
Hi All,

Before going into the issue description, here is our hardware configuration:
- Physical machine * 3: each has quad-core CPU * 2, 64+ GB RAM, HDD * 12
(500GB ~ 1TB per drive; 1 for the system, 11 for OSDs). The ceph OSDs run on
the physical machines.
- Each physical machine runs 5 virtual machines. One VM acts as a ceph MON
(i.e. 3 MONs in total); the other 4 VMs provide either iSCSI or FTP/NFS service.
- Physical machines and virtual machines share the same software stack:
Ubuntu 12.04 + kernel 3.6.11, ceph v0.61.7


The issues we ran into are:

1. Right after ceph installation, creating a pool, then creating an image and
mapping it is no problem. But if we leave the whole environment unused for
more than half a day, the same process (create pool - create image - map
image) returns the error: no such file or directory (ENOENT). Once the issue
occurs, it can easily be reproduced with the same process. The issue may
disappear if we wait 10+ minutes after pool creation. Rebooting the system
also avoids it.
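
For reference, the sequence is essentially the following (pool/image names and
sizes are placeholders):

    ceph osd pool create testpool 128            # pool name / pg count are examples
    rbd create -p testpool testimg --size 1024   # 1024 MB test image
    rbd map -p testpool testimg                  # this is the step that returns ENOENT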

I have straces of a successful and a failed attempt logged on the same virtual
machine (the one providing FTP/NFS):
success: https://www.dropbox.com/s/u8jc4umak24kr1y/rbd_done.txt
failed: https://www.dropbox.com/s/ycuupmmrlc4d0ht/rbd_failed.txt


2. The second issue: after creating two images (AAA and BBB) under one pool
(xxx), if we map image AAA of pool xxx, the map reports success but BBB shows
up under /dev/rbd/xxx/. "rbd showmapped" shows that AAA of pool xxx is mapped.
I am not sure which one is really mapped because both images are empty. This
issue is hard to reproduce, but once it happens the entries under /dev/rbd/
are messed up.

One more question, not about the rbd map issues. Our usage is to map one rbd
device and mount it in several places (within one virtual machine) for iSCSI,
FTP and NFS; does that cause any problem for ceph operation?


Re: [ceph-users] rbd map issues: no such file or directory (ENOENT) AND map wrong image

2013-08-12 Thread Josh Durgin

On 08/12/2013 10:19 AM, PJ wrote:

Hi All,

Before going into the issue description, here is our hardware configuration:
- Physical machine * 3: each has quad-core CPU * 2, 64+ GB RAM, HDD * 12
(500GB ~ 1TB per drive; 1 for the system, 11 for OSDs). The ceph OSDs run on
the physical machines.
- Each physical machine runs 5 virtual machines. One VM acts as a ceph MON
(i.e. 3 MONs in total); the other 4 VMs provide either iSCSI or FTP/NFS
service.
- Physical machines and virtual machines share the same software stack:
Ubuntu 12.04 + kernel 3.6.11, ceph v0.61.7


The issues we ran into are:

1. Right after ceph installation, creating a pool, then creating an image and
mapping it is no problem. But if we leave the whole environment unused for
more than half a day, the same process (create pool - create image - map
image) returns the error: no such file or directory (ENOENT). Once the issue
occurs, it can easily be reproduced with the same process. The issue may
disappear if we wait 10+ minutes after pool creation. Rebooting the system
also avoids it.


This sounds similar to http://tracker.ceph.com/issues/5925 - and
your case suggests it may be a monitor bug, since that test is userspace
and you're using the kernel client. Could you reproduce
this with logs from your monitors from the time of pool creation to
after the map fails with ENOENT, and these log settings on all mons:

debug ms = 1
debug mon = 20
debug paxos = 10
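
For example, these can go in the [mon] section of ceph.conf on each monitor
host (restart the mons afterwards); shown here only to illustrate placement:

    [mon]
        debug ms = 1
        debug mon = 20
        debug paxos = 10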

If you could attach those logs to the bug or otherwise make them
available that'd be great.


I have straces of a successful and a failed attempt logged on the same virtual
machine (the one providing FTP/NFS):
success: https://www.dropbox.com/s/u8jc4umak24kr1y/rbd_done.txt
failed: https://www.dropbox.com/s/ycuupmmrlc4d0ht/rbd_failed.txt


Unfortunately these won't tell us much since the kernel is doing all the
work with rbd map.


2. The second issue: after creating two images (AAA and BBB) under one pool
(xxx), if we map image AAA of pool xxx, the map reports success but BBB
shows up under /dev/rbd/xxx/. "rbd showmapped" shows that AAA of pool xxx
is mapped. I am not sure which one is really mapped because both images are
empty. This issue is hard to reproduce, but once it happens the entries
under /dev/rbd/ are messed up.


That sounds very strange, since 'rbd showmapped' and the udev rule that
creates the /dev/rbd/pool/image symlinks use the same data source -
/sys/bus/rbd/N/name. This sounds like a race condition where sysfs is
being read (and reading stale memory) before the kernel finishes
populating it. Could you file this in the tracker? Checking whether
it still occurs in linux 3.10 would be great too. It doesn't seem
possible with the current code.
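
If it happens again, one way to cross-check the two views (a sketch; device
numbers and the pool name will vary) is:

    # symlinks created by the udev rule
    ls -l /dev/rbd/xxx/
    # pool/image recorded by the kernel for each mapped device
    for d in /sys/bus/rbd/devices/*; do
        echo "rbd$(basename $d): pool=$(cat $d/pool) image=$(cat $d/name)"
    done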


One more question, not about the rbd map issues. Our usage is to map one
rbd device and mount it in several places (within one virtual machine) for
iSCSI, FTP and NFS; does that cause any problem for ceph operation?


If it's read-only everywhere, it's fine, but otherwise you'll run into
problems unless you've got something on top of rbd managing access to
it, like ocfs2. You could use nfs on top of one rbd device, but having
multiple nfs servers on top of the same rbd device won't work unless
they can coordinate with each other. The same applies to iscsi and ftp.
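
A sketch of that layering (hostnames and paths are made up): one machine maps
the rbd device and runs the NFS server; the FTP machine mounts the NFS export
as a client instead of touching the rbd device itself.

    # on the single NFS server that owns the rbd device:
    mount /dev/rbd0 /export/data
    echo '/export/data 192.168.0.0/24(rw,sync,no_subtree_check)' >> /etc/exports
    exportfs -ra

    # on the FTP machine:
    mount -t nfs nfs-server:/export/data /srv/ftp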

Josh